Improving Efficiency of Find Algorithm

10 ビュー (過去 30 日間)
Matt C
Matt C 2021 年 8 月 19 日
回答済み: Cris LaPierre 2021 年 8 月 19 日
Hello,
I am aware that logical indexing is much faster than the usage of the find function, in specific instances. I'm wondering if there is a way to improve the following algorithm - I'm not quite sure how to use indexing (if possible) in this situation.
What I have is a matrix of ascending values, though some of those values may be repeated (specifically, I have millions of ascending timestamps with many repeated). I am then seeking the start and end indices of a window that is between time X and Y.
Here is an example of the algorithm that I currently have implemented:
myDataTimestamps = [10 20 30 30 30 40 50 60 60 60 70 70 80 90];
window_start_time = 30;
window_end_time = 80;
start_index = find(myDataTimestamps >= window_start_time,1,'first');
end_index = find(myDataTimestamps <= window_end_time,1,'last');
Is there a way to improve the speed of this code and still return the same start_index of 3 and end_index of 13?
Much appreciated!

回答 (2 件)

Cris LaPierre
Cris LaPierre 2021 年 8 月 19 日
This approach may only work for this simple case, but here's a way to do it using max/min.
myDataTimestamps = [10 20 30 30 30 40 50 60 60 60 70 70 80 90];
window_start_time = 30;
window_end_time = 60;
% find start/end index
ind = 1:length(myDataTimestamps);
wind = myDataTimestamps==window_start_time | myDataTimestamps==window_end_time;
start_index = min(ind(wind))
start_index = 3
end_index = max(ind(wind))
end_index = 10
  1 件のコメント
Matt C
Matt C 2021 年 8 月 19 日
編集済み: Matt C 2021 年 8 月 19 日
I haven't been able to do a comparison, but that wasn't the massive improvement that I was hoping for. I had cancelled the script early without grabbing a total runtimes for comparison, but it looked like the proposed algorithm was going to take just as long (if not longer) than the find function. I implemented the recommendation as:
myDataTimestamps = [10 20 30 30 30 40 50 60 60 60 70 70 80 90];
window_start_time = 30;
window_end_time = 60;
% find start/end index
ind = 1:length(myDataTimestamps);
start_index = min(ind(myDataTimestamps>=window_start_time));
end_index = max(ind(myDataTimestamps<=window_end_time));
Have I blown anything in my above implementation? Note that my processor loading was ~50%, and only ~1 GB of my 24 GB of RAM was being used.
Edit: I can confirm that it took much longer using the min/max method. My code took ~45 minutes to fully execute using 'find', whereas it had only completed ~25% after about 2 hours using the min/max method.

サインインしてコメントする。


Cris LaPierre
Cris LaPierre 2021 年 8 月 19 日
I wonder if this is a scenario where using tall data may help. See this page.

カテゴリ

Help Center および File ExchangeLoops and Conditional Statements についてさらに検索

タグ

製品


リリース

R2015b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by