fitclinear appears to use sgd solver even when sparsa is specified

Question

Jonah Pearl on 5 Mar 2023

Direct link to this question

https://jp.mathworks.com/matlabcentral/answers/1923425-fitclinear-appears-to-use-sgd-solver-even-when-sparsa-is-specified

Answered: Jonah Pearl on 31 May 2024

Hi there,

I'm training some SVMs on moderate-dimensional data (a few thousand observations by at most a few hundred features) using the following core function call:

model = fitclinear(X_subset(train_idx,:),...
    Y_subset(train_idx),...
    'Regularization', 'lasso',...
    'Solver', {'sparsa'});

To assess the degree to which performance depends on the features jointly, rather than individually, I add them in one by one and re-run the classifier. The accuracy curve that I get out looks like this (red and blue are two different experimental replicates):

As you can see, once the number of dimensions passes 100, the solver's accuracy changes dramatically, getting less accurate and also getting more stochastic. I presume this is because on the backend, MATLAB is changing the solver from sparsa to sgd, as implied by the docs but not explicitly stated. For a second set of data that I have, where the overall accuracy is higher (~80%), the effect is still present but not as dramatic.

Is there a way to prevent MATLAB from switching to sgd? I will try passing in the coefficients from the nfeatures=100 model as a warm start to the subsequent models, but even if that fixes my specific problem, this feels like a bug more generally worth reporting.
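The warm-start idea could be sketched roughly as follows. This is a minimal illustration on synthetic data, assuming that the 'Beta' and 'Bias' name-value arguments of fitclinear are used to supply initial estimates and that the newly added feature's coefficient starts at zero. X_demo and Y_demo are hypothetical placeholders, not the data from this question.

```matlab
% Synthetic stand-in data (hypothetical; not the data from this question)
rng(1);
n = 500;  p = 101;
X_demo = randn(n, p);
Y_demo = sign(X_demo(:,1) - X_demo(:,2) + 0.5*randn(n,1));

% Fit on the first 100 features with the solver named explicitly
mdl100 = fitclinear(X_demo(:,1:100), Y_demo, ...
    'Regularization', 'lasso', ...
    'Solver', 'sparsa');

% Warm start the 101-feature fit: reuse the previous coefficients and
% start the newly added feature at zero (an assumption, not a requirement)
beta0 = [mdl100.Beta; 0];
mdl101 = fitclinear(X_demo, Y_demo, ...
    'Regularization', 'lasso', ...
    'Solver', 'sparsa', ...
    'Beta', beta0, ...
    'Bias', mdl100.Bias);
```

Whether this removes the accuracy drop would still need to be checked on the real data.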

Thanks.

Answer 1

Jonah Pearl on 31 May 2024

Direct link to this answer

https://jp.mathworks.com/matlabcentral/answers/1923425-fitclinear-appears-to-use-sgd-solver-even-when-sparsa-is-specified#answer_1466101

MATLAB support gave the following response, which appears to resolve the issue:

To start troubleshooting, I noticed that you are passing in 'sparsa' as a cell array containing a single string to "fitclinear". If possible, try replacing "{'sparsa'}" with just 'sparsa' as the solver. It may be that passing a cell array with just one solver is confusing the function. If this does not work to resolve the issue, then reach back out to me and I will continue to investigate.
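Applied to the call in the question, the suggested change would look roughly like the sketch below. The data are synthetic placeholders (X_demo and Y_demo are hypothetical names), and reading the solver back from the second output, FitInfo, is an assumption about how one might confirm which solver actually ran.

```matlab
% Synthetic stand-in data with more than 100 features (hypothetical)
rng(0);
n = 1000;  p = 150;
X_demo = randn(n, p);
Y_demo = sign(X_demo(:,1) + X_demo(:,2) + 0.5*randn(n,1));

% Pass the solver as a character vector rather than a one-element cell array
[model, fitInfo] = fitclinear(X_demo, Y_demo, ...
    'Regularization', 'lasso', ...
    'Solver', 'sparsa');   % was: {'sparsa'}

% Show the solver recorded in the fit information
disp(fitInfo.Solver)
```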

Answer 2

the cyclist on 6 Mar 2023

Direct link to this answer

https://jp.mathworks.com/matlabcentral/answers/1923425-fitclinear-appears-to-use-sgd-solver-even-when-sparsa-is-specified#answer_1185815

This is interesting. I'd be surprised if MATLAB makes that transition when you have explicitly specified the Solver, but I agree with you that the mention of 100 features in the documentation is a tantalizing hint that it might be happening.

Can you upload the data? I'd be pretty interested to investigate. (I could also create a simulated dataset. This probably doesn't depend on the exact data.)

Here would be my approach to trying to confirm your hypothesis. You can use the debugger to pause execution inside fitclinear, then step through the program to see where the Solver is actually set. You could then see whether MATLAB is actively ignoring the Name-Value input.
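A lighter-weight check to try before stepping through the debugger is sketched below. It is only a rough illustration on simulated data, assuming that the Solver field of the second output (FitInfo) reports the solver that was actually used. The variable names and the feature counts around 100 are hypothetical choices.

```matlab
% Simulated stand-in data (hypothetical)
rng(2);
n = 1000;  pMax = 110;
X_demo = randn(n, pMax);
Y_demo = sign(X_demo(:,1) - X_demo(:,3) + 0.5*randn(n,1));

solverArg = {'sparsa'};          % same form as in the question; try 'sparsa' too
featureCounts = [90 100 101 110];
solversUsed = cell(size(featureCounts));

for k = 1:numel(featureCounts)
    p = featureCounts(k);
    [~, fitInfo] = fitclinear(X_demo(:,1:p), Y_demo, ...
        'Regularization', 'lasso', ...
        'Solver', solverArg);
    % Record whatever solver name(s) the fit information reports
    solversUsed{k} = strjoin(cellstr(fitInfo.Solver), ',');
    fprintf('%3d features: solver = %s\n', p, solversUsed{k});
end
```

If the reported solver changes once the feature count passes 100, that would support the hypothesis. If it does not, the debugger route is the next step.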

You might be able to make a copy of the MATLAB code (and put it in your path), then adapt that code to do what you want.

I will mention that I think it is also possible that you are just seeing some phenomenon where the lasso is failing to regularize (or finding some local minimum instead of a global one), but that seems too extraordinarily coincidental.

the cyclist on 10 Mar 2023

@Bruno Luong, I think that when @Jonah Pearl used the word "sparse" in a prior comment, he just meant that one class of observations has significantly fewer instances than the other. I don't think this has anything to do with sparse matrices.

Also not to be confused with the sparsa (Sparse Reconstruction by Separable Approximation) solver.

Jonah Pearl on 10 Mar 2023

Edited: Jonah Pearl on 10 Mar 2023

Sorry, I didn't mean to introduce jargon! By sparse, I meant that a large fraction of all observations are 0s. However, I'm not using any sparse matrices (i.e., the MATLAB class) in this analysis. I'll give it a try later.
