performance when copying data from object array to cell
Hi, I am running into performance issues when copying data from an object property into a cell array. My test class looks as follows:
classdef Class < handle
    %CLASS Summary of this class goes here
    %   Detailed explanation goes here
    properties
        data = [1 2 3 4 5];
    end
    methods
    end
end
My test script is the following:
% settings
N = 20000; % try 1000/10000
% create N objects
cls(N) = Class();
% collect data
tic;
data = {cls.data};
toc;
% result:
% N = 1000: T = 0.008s
% N = 10000: T = 0.6s
% N = 20000: T = 2.4s
I'd expect linear scaling of computation time with array size. This however does not hold. Can someone give a hint about how to increase copy performance in this example? Is there a reason why it does not scale linearly?
Thanks, Daniel
Accepted Answer
Kirby Fears
9 Jan 2017
Edited: Kirby Fears
9 Jan 2017
Darim,
Since you are not pre-allocating the data cell, Matlab is probably expanding the size of data iteratively. With pre-allocation the timing is linear. Try for yourself.
% settings
N = 20000; % try 1000/10000
% create N objects
cls(N) = Class();
% collect data
tic;
data = cell(1,N);
for i = 1:numel(cls)
    data{i} = cls(i).data;
end
toc;
8 Comments
Kirby,
this solves my problem, thanks. I've assumed that avoiding for-loops would speed up Matlab in most cases, but in this case it is obviously not advised.
Is there any solution without the for-loop?
Thanks again, Daniel
Kirby,
I have an extended version of the problem: I'd like to access the property dynamically. This is my test script, with results from the machine I am currently working on:
%% initialize
clear all
%% create peers
% settings
N = 30000; % try 1000/10000/20000/30000
% create N objects
cls(N) = Class();
%% Original solution
clear('data')
% collect data
tic;
data = {cls.data};
toc;
% result:
% N = 1000: T = 0.003s
% N = 10000: T = 0.32s
% N = 20000: T = 1.11s
% N = 30000: T = 2.44s
%% Kirby solution
clear('data');
% collect data
tic;
data = cell(1,N);
for i = 1:numel(cls)
    data{i} = cls(i).data;
end
toc;
% result:
% N = 1000: T = 0.011s
% N = 10000: T = 0.063s
% N = 20000: T = 0.12s
% N = 30000: T = 0.18s
%% Enhanced problem
clear('data');
% Now we'd like to access the property dynamically
pName = 'data';
tic;
data = cell(1,N);
for i = 1:numel(cls)
    data{i} = cls(i).(pName);
end
toc;
% result:
% N = 1000: T = 0.031s
% N = 10000: T = 0.33s
% N = 20000: T = 0.52s
% N = 30000: T = 0.77s
It is slower than your solution because the property data is accessed dynamically via the string pName. The MATLAB JIT is very likely not able to optimize this.
Can you suggest a solution for the enhanced problem, which has performance similar to your previous solution?
Thanks again, Daniel
data = { cls.( pName ) };
runs fast on my PC. I imagine the slowdown in your loop comes from resolving the dynamic string 30,000 times, whereas the single-line approach resolves it only once.
Your original solution is fastest of all in my run, using R2016b, even if I increase N to 100,000. This option is not much slower, though tic-toc is not a reliable way to measure speed in general.
e.g. for N = 100,000 I get:
Elapsed time is 0.076481 seconds.
Elapsed time is 0.185008 seconds.
Elapsed time is 1.158645 seconds.
Elapsed time is 0.097908 seconds.
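Since tic/toc is noisy for measurements this short, the comparison could be repeated with timeit, which calls the function several times and returns the median. This is only a sketch: it assumes Class.m from the question is on the MATLAB path, and N, tStatic and tDynamic are illustrative names, not something from the thread.

```matlab
% Assumption: Class.m from the question is on the MATLAB path.
N = 20000;
cls(N) = Class();          % N handle objects with the default data
pName = 'data';            % property name resolved at run time

% timeit takes a function handle and returns a median time in seconds
tStatic  = timeit(@() {cls.data});      % static property access
tDynamic = timeit(@() {cls.(pName)});   % dynamic access, resolved once per call

fprintf('static:  %.6f s\n', tStatic);
fprintf('dynamic: %.6f s\n', tDynamic);
```

The absolute numbers will depend on the release and the machine, so they are best compared only against each other within one run.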
Adam,
your solution corresponds to my original approach. I consider this the neatest version, as we don't have to think about pre-allocating memory.
I am using R2015b, though, and I think some major JIT compiler improvements were made in the R2016 releases.
I'd like to stay compatible with earlier MATLAB versions, so I am looking for the best solution for R2015b.
Can you suggest a workaround, which provides similar performance?
Thanks, Daniel
Being compatible with earlier versions of Matlab from a performance point of view is very difficult given the continuous improvement in performance of many areas of Matlab. Backward compatibility of function names is relatively easy, but as you know, with different versions and improvements different solutions become more performant.
What time does
data = { cls.( pName ) };
give on your machine as this wasn't included in your timings, esp. compared to your other dynamic string approach?
It may be that there isn't a fast way to do this in R2015 which is why it was improved for R2016
Timing is the same as for
data = {cls.data}
i.e.
% result:
% N = 1000: T = 0.003s
% N = 10000: T = 0.32s
% N = 20000: T = 1.11s
% N = 30000: T = 2.44s
Which makes sense since the dynamic property access has to be resolved only once.
In addition, I've quickly installed R2016b and can confirm your numbers:
%% Adam solution (using R2016b)
tic;
data = { cls.( pName ) };
toc;
% result:
% N = 1000: T = 0.0009s
% N = 10000: T = 0.007s
% N = 20000: T = 0.014s
% N = 30000: T = 0.018s
So, from this perspective, the neat solution looks great.
Thanks, Daniel
Kirby Fears
10 Jan 2017
Edited: Kirby Fears
10 Jan 2017
Adam's approach is definitely best for the latest Matlab release. As for 2015b, I don't see a direct way to dynamically access a single property from the class without suffering the string resolution time.
If you know all the properties you might want to extract from your class, and if the contents of those properties are not too large, you could extract all properties during the loop (using hard coded names) into a MxN cell array with corresponding collection of property name strings like {'prop1','prop2',...,'propN'}.
After the loop, you can extract a specific property from the MxN cell array as needed. The performance of this approach relies entirely on the class you're working with. It might be faster than dynamic property access in your case.
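A rough sketch of that idea follows. To keep it runnable without a class file, it uses a struct array as a stand-in for the object array (dot indexing looks the same); the second property name 'label' and the variable names are made up for illustration.

```matlab
% Stand-in for the object array: a 1-by-N struct array.
% 'data' is from the thread; 'label' is a hypothetical second property.
N = 1000;
objs = struct('data',  repmat({[1 2 3 4 5]}, 1, N), ...
              'label', repmat({'x'}, 1, N));

propNames = {'data', 'label'};   % row k of allProps holds propNames{k}
M = numel(propNames);
allProps = cell(M, N);           % pre-allocate the M-by-N cell array

for i = 1:N
    % hard-coded property names, so no dynamic lookup inside the loop
    allProps{1, i} = objs(i).data;
    allProps{2, i} = objs(i).label;
end

% After the loop, pick the row for the property requested by name
pName = 'data';
row  = strcmp(propNames, pName); % logical index into the rows
data = allProps(row, :);         % 1-by-N cell with that property's values
```

Whether this beats dynamic access in R2015b depends on how many properties are copied and how large they are, so it would need to be timed on the real class.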
Kirby, Adam,
since you were so supportive regarding my problem, I'd like to share the information I've received from Matlab support.
[Quote:] "The test case creates N number of objects with data that are assigned to exactly the same array. This causes MATLAB to create a long list of arrays sharing the same data to avoid making data copies. However, when creating the cell array, MATLAB ends up traversing this long list for each element. An optimization to handle this case better was introduced in 2016a.
In a real-world scenario, would thousands of the objects really still have the default value? You can see that the timing is linear as expected when using random data in each object instance:"
They enhanced the class:
%File: TestClass1.m
classdef TestClass1
    %CLASS Summary of this class goes here
    %   Detailed explanation goes here
    properties
        data %= [1 2 3 4 5];
    end
    methods
        function obj = TestClass1
            obj.data = rand(1,5);
        end
    end
end
They tested:
%File: runTestClass1.m
function runTestClass1
    runOneTest(1000);
    runOneTest(2000);
    runOneTest(10000);
end
function runOneTest(N)
    % create N objects
    for k = N:-1:1
        cls(k) = TestClass1;
    end
    % collect data
    tic;
    data = {cls.data};
    toc;
end
And they received:
>> runTestClass1
Elapsed time is 0.000493 seconds.
Elapsed time is 0.000863 seconds.
Elapsed time is 0.004480 seconds.
I ran it as well and can confirm these results.
Cool!
Thanks again, Daniel
More Answers (0)