How can I efficiently save a linear model?

I have a large dataset (8000 x 287) that I want to use for linear regression. I am using the "fitlm()" function to train linear models with interactions. I need to train multiple models with datasets of the same size, and I want to save them for later usage. I saved the resulting linear model objects with save.
mdl = fitlm(...)
save('mdl.mat', 'mdl', '-v7.3')
Everything works but each model has a size of about 5.1 GB! Is there a more efficient way to save trained models? I only need the fitted polynomial for my predictions, so I don't understand why so much space is required to save a single model.

Answers (1)

RAJA SEKHAR BATTU on 27 Oct 2022

Votes: 0

Hi Bastian,
I think you are already familiar with regression, but here is an idea. Regression uses the input variables to fit a polynomial.
You can reduce the dataset to fewer variables, say from 287 to 20. PCA is one way to do this.
Look into optimal feature selection for more details.
If you reduce the dataset, the fitted model automatically needs less memory.
If you want to save multiple models for future use, you can store them in a cell array and fill it in a loop over the parameter changes.
I hope that is clear.
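
As a rough illustration of the cell-array suggestion, the sketch below trains a few models in a loop and saves them in one file. The synthetic data, the model count `nModels`, and the file name `models.mat` are placeholders, not part of the original question. Note that this only organizes the models; it does not by itself make each one smaller.

```matlab
% Sketch: train several models in a loop and keep them in one cell array.
% The data are synthetic placeholders, not the asker's dataset.
rng(0);                                  % reproducible random data
nModels = 3;                             % hypothetical number of models
models  = cell(nModels, 1);
for k = 1:nModels
    X = randn(200, 4);                   % placeholder predictors
    y = X * [1; -2; 0.5; 0] + 0.1 * randn(200, 1);
    models{k} = fitlm(X, y, 'interactions');
end
save('models.mat', 'models', '-v7.3');   % one file holding all models
```

A single model can later be retrieved with `models{k}` after `load('models.mat')`.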

4 Comments

Bastian on 27 Oct 2022
Thanks for your answer. I know that reducing the dataset would solve the problem, but I haven't yet found a subset with which I can achieve the same accuracy. So I'm stuck with the dataset as it is for now.
MATLAB stores lots of (in my opinion unnecessary) read-only properties in the resulting LinearModel object, e.g., the input data and summary statistics. Do you know a way to remove those properties?
RAJA SEKHAR BATTU on 27 Oct 2022
Edited: RAJA SEKHAR BATTU on 27 Oct 2022
Yes, I know a way.
mdl = fitlm(...);
% After this step, inspect which properties the model contains
clear mdl.input; % example
% This might work because the fitted model is usually stored as a structure
save('mdl.mat', 'mdl', '-v7.3');
I hope this works in your case too.
Type mdl in the Command Window to see the substructure after fitting.
Thanks
Bastian on 27 Oct 2022
Unfortunately, the clear command doesn't work. I assume it is because the properties are read-only.
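
Because the properties are read-only, one option that may help is the `compact` function of the Statistics and Machine Learning Toolbox, which returns a `CompactLinearModel` without the stored training data. The sketch below uses synthetic placeholder data and a hypothetical file name, and it has not been checked against the 8000 x 287 dataset. The compact model still keeps per-coefficient quantities such as the coefficient covariance matrix, so the saving may be limited for a model with many interaction terms.

```matlab
% Sketch: drop the stored training data by converting to a compact model.
% Data and file name are placeholders.
rng(0);
X = randn(500, 5);                       % placeholder predictors
y = X(:,1) - 2*X(:,2) + X(:,1).*X(:,3) + 0.1*randn(500, 1);

mdl  = fitlm(X, y, 'interactions');      % full LinearModel (keeps the data)
cmdl = compact(mdl);                     % CompactLinearModel (no training data)

% Predictions from the compact model should match those of the full model.
Xnew    = randn(10, 5);
yFull   = predict(mdl, Xnew);
yComp   = predict(cmdl, Xnew);
maxDiff = max(abs(yFull - yComp));

save('cmdl.mat', 'cmdl', '-v7.3');       % save only the compact model
```

Comparing `whos mdl cmdl` on the real model would show how much of the size comes from the stored data.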
RAJA SEKHAR BATTU on 27 Oct 2022
@Bastian In that case, I think the only way to reduce the memory is to use principal component analysis (PCA) or another feature reduction technique to reduce the number of features. The number of features to keep can be found by writing a loop and checking the error after PCA.
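
The loop described above could look roughly like the following sketch. Everything in it is an assumption made for illustration: the synthetic data, the 300/100 hold-out split, the range `kValues`, and the use of hold-out RMSE as the error measure.

```matlab
% Sketch: choose the number of principal components by checking the error.
rng(0);
n = 400; p = 12;                              % placeholder sizes
X = randn(n, p);
y = X(:, 1:3) * [2; -1; 0.5] + 0.1 * randn(n, 1);

% Hold out part of the data so the error is not measured on the training set.
idxTrain = 1:300;
idxTest  = 301:n;

% Fit PCA on the training predictors only; reuse mu and coeff for new data.
[coeff, scoreTrain, ~, ~, ~, mu] = pca(X(idxTrain, :));
scoreTest = (X(idxTest, :) - mu) * coeff;

kValues = 2:p;                                % candidate numbers of components
rmse = zeros(size(kValues));
for i = 1:numel(kValues)
    k = kValues(i);
    mdlK = fitlm(scoreTrain(:, 1:k), y(idxTrain), 'interactions');
    yhat = predict(mdlK, scoreTest(:, 1:k));
    rmse(i) = sqrt(mean((y(idxTest) - yhat).^2));
end
[~, best] = min(rmse);
bestK = kValues(best);                        % component count with lowest hold-out RMSE
```

To predict on new data later, `mu` and the first `bestK` columns of `coeff` would have to be saved together with the model.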
Please read up on feature engineering for regression; you may find more information there.
The read-only properties cannot be changed, because the model 'mdl' can be used for many other purposes.
You can find more information about linear regression models at the following link:
https://uk.mathworks.com/help/stats/fitlm.html
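
If only predictions are needed, another possibility is to keep just the coefficient estimates and the terms matrix, and to rebuild the design matrix with `x2fx` at prediction time. With 287 predictors and all pairwise interactions there would be 1 + 287 + 287*286/2 = 41,329 terms, so anything the model stores per pair of coefficients grows quickly. The following is a minimal sketch on synthetic data. It assumes numeric predictors passed to `fitlm` as a matrix, so that the response is the last column of `mdl.Formula.Terms`; the struct `slim` and the file name are hypothetical.

```matlab
% Sketch: keep only what is needed to predict.
% Assumes numeric predictors given to fitlm as a matrix (response is last).
rng(0);
X = randn(500, 5);                            % placeholder predictors
y = X(:,1) - 2*X(:,2) + X(:,1).*X(:,3) + 0.1*randn(500, 1);
mdl = fitlm(X, y, 'interactions');

slim.beta  = mdl.Coefficients.Estimate;       % one coefficient per term
slim.terms = mdl.Formula.Terms(:, 1:end-1);   % exponents per term, response column dropped
save('slim.mat', 'slim');

% Prediction: rebuild the design matrix from the terms and multiply.
Xnew    = randn(10, 5);
ySlim   = x2fx(Xnew, slim.terms) * slim.beta;
maxDiff = max(abs(ySlim - predict(mdl, Xnew)));   % should be close to zero
```

This gives up confidence intervals and the other diagnostics of the model object, so it only fits the case where point predictions are enough.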

Asked: 26 Oct 2022

Commented: 27 Oct 2022