Issues with reproducibility in multistart with parallelization
12 ビュー (過去 30 日間)
古いコメントを表示
I am running several model fits (1139) using multistart with parallelization. I first ran with 50 start points (including my initial guess). I then wanted to re-run with 150 start points, to compare the reduction in fval. So, when running the first 50, I batched my fits manually across a handful of nodes, and saved the rng state to a mat file. When running the 150, I batched the fits in the same way, except for a set of about 100 where that node was unavailable, and loaded the rng state.
To test the effect of the node (I think I read the node influences the number generator, and was curious what would happen with reproducibility) I ran that set of 100 fits across two nodes, loading the same state each time. In this case, I got identical outputs, identical function values.
However, I did not get good reproducibility from the 50 start point run to the 150 start point run. 27% of the 1139 fits were worse (had higher function values in a minimization problem) than the 50 start point fits. I also found that of the 1139 fits, 17% had greater than 1% higher fval, and 3% had 10% greater fval - I thought maybe its rounding or something, but this seems pretty high.
What am I missing? How can I make these fits reproducible?
0 件のコメント
回答 (1 件)
Matt J
2025 年 9 月 2 日 20:53
編集済み: Matt J
2025 年 9 月 2 日 20:56
I don't see why you would expect agreement between a 50-point multistart and a 150-point multistart. Only if both versions succeed in finding the global minimum would the results be guaranteed to agree.
4 件のコメント
Matt J
2025 年 9 月 3 日 1:33
You are quite welcome, but when/if you are convinced this is the correct answer please accept-click it.
参考
カテゴリ
Help Center および File Exchange で Global or Multiple Starting Point Search についてさらに検索
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!