フィルターのクリア

what is the efficient way to search for value in a large struct

32 ビュー (過去 30 日間)
shlomo odem
shlomo odem 2022 年 3 月 2 日
編集済み: Bruno Luong 2022 年 3 月 2 日
hello,
i want to serch for value in spacific field ("Config") of very larg struct array (1e6 rows).
i do this serch all the time Therefore, I need the fastest way...
this is the way i try:
tree(1e6) = struct("Index",[],"parent", [],"Config",[],"Movment",[])
tree = 1×1000000 struct array with fields:
Index parent Config Movment
just for the exemple i feel the field Config with value...
[tree.Config] = deal(1);
becose of the size of this struc the serch is very slow.
i try diffrent metode like:
value2find = 15;
1)
config = [tree.Config];
loc = ismember(config,value2find);
in this metod extruct all the value from the struct take long time:
timeit(@() [tree.Config])
ans = 0.2089
2)
this way is the worst:
Loc = arrayfun(@(x) ismember(value2find,x.Config),tree);
timeit(@() arrayfun(@(x) ismember(value2find,x.Config),tree))
ans = 2.7032
I thought of a few ideas to deal with the problem, first I will detail the algorithm(RRT*) in a slightly abstract way:
I have a loop that produces random values ​​according to a certain regularity, each value is checked to see if it exists in the struct array. If the value exists, an update operation is performed for the same row in the struct array, if the value does not exist then it is added to the struct array in the first available space.
My thoughts:
  • Should I save the field I am looking for as a numeric array separately from the structure? So the search will be more efficient but will it require me to update both the array and the structure each time?
  • Maybe instead of a structure, I should use a cell array? Will it be significantly faster?
  • I will emphasize that the struct array is empty of values ​​at the beginning and fills with time, it may be worthwhile to limit the search to only the part that has values, of course as time goes by there will be more values ​​and therefore the search time will increase accordingly.
I would love for suggestions and better ways than I was able to do
  2 件のコメント
Rik
Rik 2022 年 3 月 2 日
I think there isn't an objective answer to this question. I don't think a cell array would result in much of a speed up (if anything), so your choice is between extracting the parameter to an array every time, or storing it in a separate array.
Stephen23
Stephen23 2022 年 3 月 2 日
編集済み: Stephen23 2022 年 3 月 2 日
"Should I save the field I am looking for as a numeric array separately from the structure?"
That is probably the best option for your data.
But rather than inefficiently storing lots of separate scalar values in a large structure array, why not just use numeric arrays anyway (possibly in a scalar structure)? See Johan Pelloux-Prayer's answer.

サインインしてコメントする。

採用された回答

Bruno Luong
Bruno Luong 2022 年 3 月 2 日
編集済み: Bruno Luong 2022 年 3 月 2 日
You structure array seems to be linear, I would store the data in a structure of (cell) arrays rather than array of structures. This organization avoid to go back and forth to form the array of Config when searching.
  1 件のコメント
shlomo odem
shlomo odem 2022 年 3 月 2 日
Thank you very much, it solves the problem

サインインしてコメントする。

その他の回答 (1 件)

Johan
Johan 2022 年 3 月 2 日
I'm not sure that fits what you want but you could try making large arrays in a small structure instead of the opposite:
tree(1) = struct("Index",[],"parent", [],"Config",[],"Movment",[]);
tree.Config = rand(1e6,1);
tree2(1e6) = struct("Index",[],"parent", [],"Config",[],"Movment",[]);
value2find = 15;
for i = 1:100
tic
loc = ismember(tree.Config,value2find);
timer(i,1) = toc;
tic
loc = sum([tree2.Config]==value2find);
timer(i,2) = toc;
end
mean(timer)
ans = 1×2
0.0008 0.2176
  1 件のコメント
shlomo odem
shlomo odem 2022 年 3 月 2 日
Thank you very much, it solves the problem

サインインしてコメントする。

カテゴリ

Help Center および File ExchangeStructures についてさらに検索

製品


リリース

R2021b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by