Pick out rows from an array where the 1st column is closest to a multiple of a number

1 回表示 (過去 30 日間)
Jason
Jason 2024 年 10 月 2 日
コメント済み: Arjun 2024 年 10 月 9 日
Hello, Im trying to downsample a 100k lines of excel and want to pick out only those rows where the column 1 values are closest to multiples of a number, say 50. So in the pic below, just the green highlighted rows.
I've thought about using mod but this doesn't work:
%For example if I choose these numbers (and consider just a column vector)
n=[1;5;7;9;12;15;29;33;39]
and want to pick out those numbers closest to multiples of 10
mod(n,10)
ans =
1
5
7
9
2
5
9
3
9
If I sort these and pick out the lowest ones, the value of 29 & 39 wouldn't be picked out.
I also considered some kind of interp, but I need to pick out the actual discrete value from the table below not an interpolated value. Normally I show my arttempt, but Im actually stuck here in where to actually start with this.

採用された回答

dpb
dpb 2024 年 10 月 2 日
編集済み: dpb 2024 年 10 月 2 日
Try
%data=readmatrix('yourfile.xlsx');
data=[0 18 38 56 75 94 112].'; % sample data
DELT=50; % the interval desired
v=[0:DELT:max(data(:,1))]; % the lookup values at delta intervals
ix=interp1(data(:,1),1:height(data),v,'nearest').'; % the magic--return index nearest to each v
reduced=data(ix,:); % and return the data rows corresponding
% show results
disp([ix reduced])
1 0 4 56 6 94
Applied to your array, reduced will have all columns, of course.
  2 件のコメント
Jason
Jason 2024 年 10 月 2 日
Awesome, thankyou!
dpb
dpb 2024 年 10 月 2 日
You were right in thinking you needed interpolation; just takes "time in grade" to know that interp1 has options other than just linear interpolate betweeen points--and then the recognition to make the location in the array be the "interpolated" variable returned to use as the index into the original array...the other solution basically implements what interp1 does internally here; albeit with much less effort coding-wise; the power of MATLAB in having all this stuff like this already implemented for you -- you just have to be able to find it... :)

サインインしてコメントする。

その他の回答 (1 件)

Arjun
Arjun 2024 年 10 月 2 日
Hi @Jason,
I see that you want to pick out rows from the array based on proximity to multiple of a number.
To downsample your data by picking rows where the first column values are closest to specific multiples (like 50), you can start by listing the multiples you care about, such as 0, 50, 100, and so on, up to the largest number in your data. For each of these multiples, you look through your data to find the number that is closest to it. This can be done by calculating how far each number is from the current multiple and then picking the one with the smallest difference. This way you can gather all the row’s which are closest to the multiple of the number you chose.
You can refer to the code below for better understanding:
n = [1; 5; 7; 9; 12; 15; 29; 33; 39; 50; 51; 97; 102];
multiple = 50;
% Pick the maximum number present in the data till there generate multiples
maxi = max(n);
multiples = 0:multiple:maxi;
final_selection = zeros(length(multiples), 1);
% Find the closest value to each multiple
for i = 1:length(multiples)
[~, idx] = min(abs(n - multiples(i)));
final_selection(i) = idx;
end
% Take only unique values
final_selection = unique(final_selection);
selected_rows = n(final_selection);
disp(selected_rows);
1 50 102
I hope this explanation helps!
  2 件のコメント
Jason
Jason 2024 年 10 月 2 日
Thankyou AJ, sorry I didn't accept your answer although it works and I like it, the answer above did come first
Arjun
Arjun 2024 年 10 月 9 日
No issues @Jason, I am happy that it helped you!

サインインしてコメントする。

カテゴリ

Help Center および File ExchangeMatrix Indexing についてさらに検索

タグ

製品


リリース

R2023b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by