matrix pre-allocation

1 回表示 (過去 30 日間)
Chien-Chia Huang
Chien-Chia Huang 2011 年 4 月 13 日
What's actually the difference between the following commands
sparse(zeros(10000));
sparse([],[],[],10000,10000,0);
a = sparse([]); for i =1:10000,for j = 1:10000, a(i,j) = 0,end,end
Oftentimes the first line leads to out-of-memory problem while the rest do not when the dimension acretes.
I consulted to some tutorial about memory allocation that says "loop" is at times useful to speed up and avoid memory outflow.
When should I use loop to generate a really "big" matrix? or the second line just dominates?
Thanks

採用された回答

Andrew Newell
Andrew Newell 2011 年 4 月 13 日
The first command creates the full array zeros(10000) before turning it into a sparse array. The simplest way of creating the matrix you want is
sparse(10000,10000)
This is not a problem where using a loop is a good idea! You can time the commands using tic and toc:
tic
a = sparse([]); for i =1:100,for j = 1:100, a(i,j) = 0; end,end
toc
tic
b = sparse(100,100);
toc
On my computer, the output is
Elapsed time is 0.010886 seconds.
Elapsed time is 0.000298 seconds.
In other words, for an array that is only 100 x 100, the loops are 35 times slower. If you increase the array size, the loops increase in time as the total number of elements, while the second command always takes about the same time. So with your array, the loops would to it 350,000 times slower! And that's assuming that you change the comma after a(i,j) = 0 to a semicolon so the output isn't displayed after each element is changed.
  1 件のコメント
Chien-Chia Huang
Chien-Chia Huang 2011 年 4 月 13 日
Thanks, Andrew.
I did time the commands and got similar result like yours.
But I was always confounded by my first command which led to out-of-memory problem, while using loop could solve the problem, even though the matrix size is the same.

サインインしてコメントする。

その他の回答 (1 件)

Matt Fig
Matt Fig 2011 年 4 月 13 日
There are cases where loops are much faster than other alternatives. This can sometimes be difficult to predict ahead of time, however, when a loop is not a good idea is often easier to identify. In general:
1. Growing an array in a loop will always be your slowest option. The case you show above where a starts out empty and grows in a loop is going to be the slowest option.
2. The more complicated the indexing maneuvers in a loop, the slower it will be. Simple looping over consecutive elements in existing arrays will be fast.
3. A loop which makes many M-file function calls will be slow compared to a vectorized code. For speed, look for only a minimum of built-in calls within a loop.
4. These guidelines stand out more as the size of the arrays involved grows. For example, not pre-allocating a 1-by-5 array is no big deal, but a 1-by-5e7 array will slow down the code.
In your three examples above, the first example is a good demonstration of how not to pre-allocate a sparse array, as you found out. Your second example generates the ultimate sparse array - 0% density. I don't know enough about sparse arrays to know if this actually pre-allocates any memory at all. The third example is bad whether dealing with sparse arrays or not because you are growing and array in a loop.
  4 件のコメント
Chien-Chia Huang
Chien-Chia Huang 2011 年 4 月 13 日
Thanks, Matt.
So, if my desired matrix is sparse, is it still a bad idea to use the second example? or sparse(n,n)?
Wolfgang Schwanghart
Wolfgang Schwanghart 2011 年 4 月 13 日
You can preallocate sparse arrays using the sixth input argument of sparse or the function spalloc.

サインインしてコメントする。

カテゴリ

Help Center および File ExchangeSparse Matrices についてさらに検索

タグ

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by