Usage of structure as input of a function. Is it efficient ?
107 ビュー (過去 30 日間)
古いコメントを表示
Hello Mathworks community,
I have a general questioning about the efficiency / relevance of using structures as function inputs. I'll try to be clear...
I'm working on a code that has around 300 variables. I find very convenient to gather / sort these variables as fields of a few (~5) structures to sort them by "type". That clears my workspace and allow me to make shorter function calls. For example, I have a structure Model that contains fields .param1 .param2 ... param40 (all different, can be scalars, small vectors or small 3D matrices).
So, when I want to make some computation, I just do :
[output] = some_function(Model, other_input) ;
Furthermore, I use in my code a lot of functions / sub-functions / sub-sub-functions... I know that this might not be the most relevant in MatLab, but I think I have a good reason to do so. First that simplifies the structure of my code since I have a lot of sequential actions, so having small divided tasks is way more easy to manage. And secondly I intend to translate the code in C using MatLab Coder to run on an embedded platform.
That scheme was fine for me, until I started working on optimization / computing time and realized that there was a big difference between the 2 solutions below:
function [output] = some_function(Model, other_input)
% [...]
% Solution 1
% (tic)
% (for i = 1:1e6)
output = some_sub_function(Model.param1, Model.param2, Model.param3, other_input) ;
% (end)
% (toc) ==> ~4 seconds
% Solution 2
P1 = Model.param1 ;
P2 = Model.param2 ;
P3 = Model.param3 ;
% (tic)
% (for i = 1:1e6)
output = some_sub_function(P1, P2, P3, other_input) ;
% (end)
% (toc) ==> ~0.4 seconds, 10 times faster !
% [...]
end
Since computing time is critical for my work, I start to doubt about the choices I made. The thing is that if I stop using structures, I'll have gigantic function calls to write. And if I stop using functions, I'll have a gigantic code to naviguate into...
I've struggled to find ressources online about that matter (maybe I didn't find the right keywords), so I am curious about advices or feedback you could have ?
Thanks !
1 件のコメント
Rik
2020 年 8 月 10 日
Do you have that many iterations in your real code? It looks to me like simply retrieving the fields and storing them in temporary variables for you function calls might already be the most important factor here.
採用された回答
Bruno Luong
2020 年 8 月 10 日
編集済み: Bruno Luong
2020 年 8 月 10 日
"I've struggled to find ressources online about that matter (maybe I didn't find the right keywords"
No, it's not your fault, TMW rarely makes any comment about such thing. They thing's it's not programmer business to understand how things work. I just dislike the way they hide the details for programmers.
I worked with MATLAB for about 20 years, and I did a lot of timings trying to understand how things work. Mex programming also helps since it exposes me to mxArray data structure and that gives the hints what behind MATLAB statements.
In your case, accessing the structure fields, MATLAB store the fieldnames as char-arrays and everytime you call
s.param1
it must goes throught all the list of fielnames of the structure, doing string comparison until it matches the string 'param1', then it returns the corresponding mxArray (structure field). That explains why it is not fast to use it within the for loop. If you move the out
p1 = s.param1
the fieldname matching is done outside the loop and p1 strores the mxArray ready to be used.
Personally I do a lot of function arguments like you, meaning few structures as input arguments (usulay one nested structure) and single output as output. Just do need to be careful with for-loop and prepare the body of for-loop to have very simple operations, and avoid calling functions, accessing structure fields, table, multiple-level indexing, etc.... and you'll be fine. You need also to pay attention to function/statement that make memory-copies of your data, this also kill the performance of MATLAB. Using object programming is also slow, so avoid them especially if they don't deal with large/complex data and comes with array. Most of the time that comes in the purest form as "vectorization" and the for-loop disappears entirely.
21 件のコメント
Bruno Luong
2020 年 8 月 12 日
Before the data sharing is at the data pointer under mxArray structure.
Now it seems it go up one level and the sharing is at the mxArray pointer itself.
May be they move up the "super-cross-link" on top root level and localed now somewhere as a master table handled by MATLAB engine and no loger accessible to users?
James Tursa
2020 年 8 月 12 日
編集済み: James Tursa
2020 年 8 月 12 日
That's what I suspect. I haven't done exhaustive testing, but here is what I am seeing in the mxArray header:
R2019a
Reverse Cross Llink pointer spot, points to "previous" variable in shared data copy linked list
Cross Link pointer spot, points to "next" variable in shared data copy linked list
R2020a
Reverse Cross Link pointer spot, points to an integer that contains the number of shared data copies in existence.
Cross Link pointer spot, NULL
Not sure if the change happened in R2019b or R2020a ... will have to check this later.
You can still detect that it is a shared data copy by looking at that integer, but you can't find the copies because that linked list is no longer part of the mxArray header that I can see. I guess it is good that you can still detect data sharing in a mex routine (e.g., as a check prior to doing any inplace operation), but I prefer the old way where you can actually find the copies through the linked list.
As a bit of history, many years ago it used to be that shared data copies was always the method used for individual variable sharing and reference copies was the method used for cell and struct element sharing. Then they started using reference copies for individual variable sharing a few years ago. Now it seems they have taken it a step farther and are using reference copies for function arguments. Makes sense because reference copies are more efficient, but functions like reshape( ) and typecast( ) must still use the shared data copy method because stuff in the mxArray header changes.
その他の回答 (2 件)
Matt J
2020 年 8 月 10 日
編集済み: Matt J
2020 年 8 月 12 日
The thing is that if I stop using structures, I'll have gigantic function calls to write. And if I stop using functions, I'll have a gigantic code to naviguate into...
But if your code is doing such intricate things as to be "gigantic", then surely the struct access timings that you have shown us, whether 4 sec. or 0.4 sec, are going to be small overhead compared to the rest of your actual computation. Why care about a 4 sec. difference if the whole compuation takes 20 minutes or so?
If you really are in a situation where the main work of the function is 10 times faster than the work of unpacking the input from a struct, then I am suspicious if encapsulating those particular commands inside a function is really necessary and worthwhile. It means the function must be doing almost nothing. The overhead of even calling the function would be significant compared to the actual work that it does.
4 件のコメント
Matt J
2020 年 8 月 11 日
Another thing to consider, is that the overhead you see in Matlab may not be there in the final C/C++ version that you obtain from the Matlab Coder. I would expect struct indexing in C/C++ to be less encumbered, because the compiler has the opportunity to sort out in advance which fields are being accessed, and where the relevant memory is located.
Walter Roberson
2020 年 8 月 12 日
Earlier I mentioned that I had done some timing tests. I was not able to find the code for that, so I created some new code.
In the below tests, on my R2020a Mac system, this is a summary of the results:
- accessing a parameter and accessing a local variable are fastest, and indistinguishable in this test; legends for these are 'parameter' and 'local'. These do not access the structure passed in, and are there to give baselines to compare struct access against
- Surprisingly, using a constant value is a bit slower and irregular timing. This does not make sense to me at the moment. (This is the first line created below, a blue line in the plot, legend 'constant')
- Accessing a fixed field name of an input structure is roughly 25% slower than accessing a plain local or parameter, still quite fast. The timings for first (of 1000) fields or last of them in the struct are slightly different but quite close; legend entries 'Fstat' and 'Lstat'. Considering the other timing tests, we have to rule out the possibility that for repeated calls to get the same field that it has to search through the fieldname list every time. We cannot, however, currently rule out the possibility that the JIT is remembering the location after the first iteration
- Accessing a struct field name dynamically takes roughly twice as long as accessing a static field name. The timings for first field in the struct, legend 'Fdyn', and last field in the struct, legend 'Ldyn', and middle field in the struct, legend 'MDyn', and random field in the struct, legend 'Rdyn', are pretty irregular and not at all well separated. The timings do not rule out the possibility that first (Fdyn) is slightly faster than the others, but the uncertainty is high.
- If, hypothetically, structure reference required searching the fieldname list linearly each time, then Fstat (first static) and Fdyn (first dynamic) would be nearly indistuinguishable, and Lstat (last static) and Ldyn (last dynamic) would be nearly indistinguishable, and there would be a clear difference between the First and Last cases, but neither of these are true. If, hypothetically, the JIT is merely caching the last fieldname accessed under the hypothesis that it might be used again, with the same hypothetical linear search mechanism being used for both cases, then the large random variation for the dynamic access should be no more than the random variation for the static accesses, but the random variations are clearly quite different. Static fieldname access appears to be implemented through a different mechanism than dynamic fieldname access -- or at least is JIT'd quite differently.
- In the functions, the assignments to local that appear in all functions are there in all tests in order to eliminate the possibility that creating the local variable was "significantly more work" for the test (legend 'local') that would only be done for that test. On the other hand, it could be the case that the JIT is discarding the initialization if the variable is not used, so a more careful test would make sure to force a "use" of the variable in a way that could not be JIT'd away. Similar logic to force use of the structure input and auxillary input should also be undertaken in a more thorough test
- An analysis weakness of the current test is that it passes in the same struct each time, so we cannot at present speak to Bruno's point that it cannot know the offset of a particular static field name.
Anyhow: Yes, using a field from a struct that is passed in is pretty efficient compared to many other possibilities. Not as efficient as using a parameter. But a more throughout test would require passing in 1000 parameters and comparing timing to access "first" or "last" or "random" parameter, because hypothetically parameter names are not hard-wired to parameter number references. (The Mathworks documentation on speed of variable access does imply that named parameter access is hard-wired to a stack offset by the JIT, and does imply that parameter access could be faster than local variables.)
N = 100;
NFs = 1000;
S.walla = 1;
for K = 1 : NFs-2
name = char(randi(0+['a', 'z'], 1, 5));
S.(name) = K + 1;
end
S.bingo = NFs;
FN = fieldnames(S);
middle = FN{floor(end/2)};
Fs = {@() time_constant(S, 7), ...
@()time_parameter(S, 7), ...
@()time_local(S,7), ...
@()time_first_static(S, 'walla'), ...
@()time_last_static(S,'bingo'), ...
@()time_dynamic(S, 'walla'), ...
@()time_dynamic(S, 'bingo'), ...
@()time_dynamic(S, middle) };
Labels = {'constant', 'parameter', 'local', 'Fstat', 'Lstat', 'Fdyn', 'Ldyn', 'Mdyn', 'Rdyn'};
NFs = length(Fs);
times = zeros(N, NFs+1);
for J = 1 : NFs; F = Fs{J}; for K = 1 : N; times(K,J) = timeit(F, 0); end; end
for J = NFs+1; for K = 1 : N; lab = FN{randi(NFs)}; F = @()time_dynamic(S,lab); times(K,J) = timeit(F,0); end; end
plot(times);
legend(Labels);
function r = time_constant(S, const) %#ok<INUSD>
local = -123; %#ok<NASGU>
r = -1;
end
function r = time_parameter(S, const) %#ok<INUSL>
local = -123; %#ok<NASGU>
r = const;
end
function r = time_local(S, const) %#ok<INUSD>
local = -123; %#ok<NASGU>
r = local;
end
function r = time_first_static(S, label) %#ok<INUSD>
local = -123; %#ok<NASGU>
r = S.walla;
end
function r = time_last_static(S, label) %#ok<INUSD>
local = -123; %#ok<NASGU>
r = S.bingo;
end
function r = time_dynamic(S, label)
local = -123; %#ok<NASGU>
r = S.(label);
end
0 件のコメント
参考
カテゴリ
Help Center および File Exchange で Fixed-Point Conversion についてさらに検索
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!