how to run thousands of regressions at the same time

Question

Lingfei Kong, 23 January 2017

Direct link to this question: https://jp.mathworks.com/matlabcentral/answers/321503-how-to-run-thousands-of-regressions-at-the-same-time

Commented: Junqi Wu, 5 January 2019

I simulated two variables, a and b. Each variable is a 66-by-1000 matrix (1000 simulated samples, each with 66 observations). Now I want to regress each column of matrix a on the corresponding column of matrix b; that is, I need to run 1000 regressions and save the coefficients, t-statistics, and R-squared values. How can I do this?

1 comment

Junqi Wu, 5 January 2019

Hi Lingfei Kong, have you solved this problem? Which method did you use?


Answer 1

Soumya Saxena, 27 January 2017

Direct link to this answer: https://jp.mathworks.com/matlabcentral/answers/321503-how-to-run-thousands-of-regressions-at-the-same-time#answer_252300

Edited: Soumya Saxena, 27 January 2017


To perform a regression on every column of b, there are two workarounds:

1. You may loop through the columns of b and treat each one as the column vector of responses for its own regression.

2. You may refer to the following link: https://www.mathworks.com/help/stats/multivariate-regression-2.html

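You could try multivariate regression using "mvregress" as follows. Be aware that mvregress(X,Y) treats all columns of X as predictors shared by every column of Y, so this call fits one multivariate model rather than 1000 separate column-by-column regressions: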

beta = mvregress(a,b)
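The first workaround (the plain loop) can be sketched as follows. This is a minimal illustration, not the answerer's own code: it assumes, as in the question, that each column of a is the response and the matching column of b is the predictor, that every model includes an intercept, and it uses simulated stand-in data. The names coef, tstat and R2 are made up for the example. Only base MATLAB is used, so the t-statistics are computed by hand from the residual variance.

```matlab
% Simulated stand-ins for the 66-by-1000 matrices in the question (assumption).
rng(0);
n = 66;  m = 1000;
b = randn(n, m);                          % predictor columns
a = 0.5 + 2*b + randn(n, m);              % response columns

coef  = zeros(2, m);                      % row 1: intercept, row 2: slope
tstat = zeros(2, m);                      % t-statistics, same layout as coef
R2    = zeros(1, m);                      % R-squared of each regression

for k = 1:m
    X   = [ones(n, 1), b(:, k)];          % design matrix with an intercept
    y   = a(:, k);
    bk  = X \ y;                          % least-squares coefficients
    res = y - X*bk;                       % residuals
    s2  = (res' * res) / (n - 2);         % residual variance, n - 2 degrees of freedom
    se  = sqrt(s2 * diag(inv(X' * X)));   % coefficient standard errors
    coef(:, k)  = bk;
    tstat(:, k) = bk ./ se;
    R2(k) = 1 - (res' * res) / sum((y - mean(y)).^2);
end
```

After the loop, coef(:, k), tstat(:, k) and R2(k) hold the results for the k-th pair of columns.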


Answer 2

Image Analyst, 28 January 2017

Direct link to this answer: https://jp.mathworks.com/matlabcentral/answers/321503-how-to-run-thousands-of-regressions-at-the-same-time#answer_252323

I don't think you can do thousands of regressions at the same time. If you have the Parallel Computing Toolbox, you can run as many regressions simultaneously as you have cores in your CPU. So you could extract, say, 4 or 8 columns and regress those 4 or 8 simultaneously, but not thousands. You can still do thousands, but you'll have to extract your columns one at a time and regress each one, or, if you have multiple cores, pass one column off to one core, another column off to the next core, and so on.
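Handing columns off to different cores could look roughly like the parfor sketch below. It assumes the Parallel Computing Toolbox is available (as far as I know, parfor otherwise just runs as an ordinary serial loop), reuses simulated stand-in data, and treats a as the response and b as the predictor, as in the question. The variable names are illustrative only.

```matlab
% Simulated stand-ins for the matrices in the question (assumption).
rng(0);
n = 66;  m = 1000;
b = randn(n, m);                          % predictor columns
a = 0.5 + 2*b + randn(n, m);              % response columns

coef = zeros(2, m);                       % row 1: intercept, row 2: slope
R2   = zeros(1, m);

% Each iteration is independent, so columns can be distributed over workers.
parfor k = 1:m
    X   = [ones(n, 1), b(:, k)];          % b and a are sliced by column
    y   = a(:, k);
    bk  = X \ y;                          % least-squares fit for this column
    res = y - X*bk;
    coef(:, k) = bk;                      % sliced output variables
    R2(k) = 1 - sum(res.^2) / sum((y - mean(y)).^2);
end
```

For regressions this small, the overhead of starting a pool and moving data to the workers may well outweigh the gain, so it is worth timing against the serial loop first.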


Answer 3

John D'Errico, 28 January 2017

Direct link to this answer: https://jp.mathworks.com/matlabcentral/answers/321503-how-to-run-thousands-of-regressions-at-the-same-time#answer_252335

Edited: John D'Errico, 28 January 2017

Let me see if I can point you in the right direction.

If your goal is to regress the columns using a model like y = a + b*x, then there are two parameters for each model. CAN you do this in a way that lets you compute everything you want (coefficients, t-stats, R-squared, etc.)? Well, yes, you can. Is it worth the effort? Probably not.

The trick is that you will need to create a SPARSE block diagonal matrix with 1000 blocks down the diagonal, each 66x2 (one row per observation, one column per parameter). So each regression would be given by a distinct block. regress would then provide t-statistics and coefficients. Note that the R^2 coefficient from this would be WRONG, flat out wrong. But you could compute R^2 easily enough for each model.

Note that I said the matrix X needs to be a SPARSE block diagonal matrix. If X is not sparse, then the computation will be immensely time consuming compared to a simple loop. So you will need to learn to create a sparse block diagonal matrix. blkdiag is able to do this, if used properly.
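Here is one way the sparse block diagonal idea might be sketched. It is an illustration under assumptions, not necessarily what the answer had in mind: it builds the matrix directly with sparse() instead of blkdiag, solves with backslash instead of regress, uses simulated stand-in data, and takes a as the response and b as the predictor as in the question. R^2 is then computed separately for each model from that model's own residuals.

```matlab
% Simulated stand-ins for the matrices in the question (assumption).
rng(0);
n = 66;  m = 1000;
b = randn(n, m);                          % predictor columns
a = 0.5 + 2*b + randn(n, m);              % response columns

N    = n*m;
rows = (1:N)';                            % one row per stacked observation
blk  = ceil(rows / n);                    % which regression each row belongs to

% Block k occupies columns 2k-1 (intercept) and 2k (slope).
X = sparse([rows; rows], [2*blk - 1; 2*blk], [ones(N, 1); b(:)], N, 2*m);

beta = X \ a(:);                          % all m fits in one sparse least-squares solve
coef = reshape(beta, 2, m);               % row 1: intercepts, row 2: slopes

% Per-model R^2 from each model's own residuals.
res = reshape(a(:) - X*beta, n, m);
dev = bsxfun(@minus, a, mean(a, 1));      % responses centered column by column
R2  = 1 - sum(res.^2, 1) ./ sum(dev.^2, 1);
```

The resulting X is 66000-by-2000 with only 132000 nonzeros, which is why keeping it sparse matters.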

Another approach would be to use a tool like arrayfun to compute all of the models at once. I think that arrayfun will not then be able to accumulate all of the parameters that you will need. (I'd want to check that claim though.) So then you would need to do extra work to get things like t-statistics. And while R^2 is easy to compute afterwards, a t-stat is not. In effect, it would force you to loop over all of the models, solving the regression again anyway.
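As a rough check of the arrayfun idea, the sketch below collects several outputs of regress per column by using 'UniformOutput', false. It assumes the Statistics and Machine Learning Toolbox (for regress and tinv), simulated stand-in data, and a as the response with b as the predictor. regress does not return t-statistics directly, so here they are backed out from the default 95% confidence intervals, which is my own workaround rather than anything stated in the answer. Keep in mind that arrayfun is still a loop internally, so no speed-up over an explicit for loop should be expected.

```matlab
% Simulated stand-ins for the matrices in the question (assumption).
rng(0);
n = 66;  m = 1000;
b = randn(n, m);                          % predictor columns
a = 0.5 + 2*b + randn(n, m);              % response columns

% One regress call per column; each output comes back as a 1-by-m cell array.
[B, BINT, ~, ~, STATS] = arrayfun( ...
    @(k) regress(a(:, k), [ones(n, 1), b(:, k)]), ...
    1:m, 'UniformOutput', false);

coef  = [B{:}];                           % 2-by-m: intercepts and slopes
stats = vertcat(STATS{:});                % m-by-4: R^2, F, p-value, error variance
R2    = stats(:, 1)';

% The 95% interval is coef +/- tinv(0.975, n-2) * se, so recover se from its half-width.
halfw = cellfun(@(ci) (ci(:, 2) - ci(:, 1)) / 2, BINT, 'UniformOutput', false);
se    = [halfw{:}] / tinv(0.975, n - 2);
tstat = coef ./ se;                       % 2-by-m t-statistics
```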

Option 3 is to use parallel processing. This problem is inherently well-suited to such an approach, IF you have the parallel processing tools, and know how to use them. You would see a decent throughput gain here.

In the end, unless you can go parallel (option 3) it won't be worth the hassle. Just do it in a loop. Loops are not the end of the world.

1 comment

Lingfei Kong, 28 January 2017

Oh, I think it is better to just use a simple loop.
