How to split a data matrix conditionally?

I have a data matrix with 59 columns and a variable number of rows. I need to extract a new matrix that includes only the values that fall within a specific bound.
As a result, each column is left with a different number of observations. How can I build the new matrix under this condition?
Below is an example with random data and my approach, which did not give the required result.
p = rand(10, 10)
for i=1:10
q = p((p(:,i) > .2) & (p(:,i) < .4) , :)
end

Accepted Answer

Chunru on 14 Apr 2022
Edited: Chunru on 14 Apr 2022


You will not get a matrix as the output, because the number of entries satisfying the condition differs from column to column.
The per-column results can be stored in a cell array instead.
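n = 50;
p = rand(n, n);
q = cell(1, n); % one cell per column
for i=1:n
col = p(:, i); % current column (named col so the built-in constant pi is not shadowed)
q{i} = col(col > .2 & col < .4); % keep values strictly between 0.2 and 0.4
end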
q
q = 1×50 cell array
{13×1 double} {13×1 double} {9×1 double} {12×1 double} {11×1 double} {6×1 double} {9×1 double} {14×1 double} {8×1 double} {13×1 double} {10×1 double} {9×1 double} {9×1 double} {10×1 double} {8×1 double} {10×1 double} {17×1 double} {10×1 double} {14×1 double} {11×1 double} {9×1 double} {7×1 double} {5×1 double} {9×1 double} {8×1 double} {12×1 double} {12×1 double} {14×1 double} {9×1 double} {7×1 double} {5×1 double} {13×1 double} {9×1 double} {10×1 double} {10×1 double} {15×1 double} {9×1 double} {13×1 double} {12×1 double} {9×1 double} {11×1 double} {13×1 double} {10×1 double} {6×1 double} {12×1 double} {11×1 double} {12×1 double} {10×1 double} {11×1 double} {10×1 double}
%% Counting based on q (you can actually do this on p directly)
count = cellfun(@numel, q)
count = 1×50
13 13 9 12 11 6 9 14 8 13 10 9 9 10 8 10 17 10 14 11 9 7 5 9 8 12 12 14 9 7
% number of cells with >=10 elements
nc = sum(count>=10)
nc = 31
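
As the comment in the code hints, the counts can also be computed directly from p without going through the cell array. A minimal sketch, assuming the same bounds 0.2 and 0.4; the names mask and countDirect are illustrative and not part of the answer above, and the fixed seed is only there to make the run repeatable:

```matlab
rng(0);                          % fixed seed, only for repeatability
n = 50;
p = rand(n, n);

% Logical mask of entries strictly inside the bounds (same size as p)
mask = p > 0.2 & p < 0.4;

% Column-wise count of selected entries (sum along dimension 1)
countDirect = sum(mask, 1);

% Same selection gathered per column into a cell array, for comparison
q = cell(1, n);
for i = 1:n
    q{i} = p(mask(:, i), i);
end

% Both routes should give identical counts
same = isequal(countDirect, cellfun(@numel, q));
```

The mask route avoids the loop when only the counts are needed; the cell array is still required if the selected values themselves must be kept.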

12 Comments

Andi on 14 Apr 2022
Thanks. In the next stage, I need to count the number of cells that have, for example, more than 5, 10, 15, 200, etc. observations. How can I proceed?
Chunru on 14 Apr 2022
See the update above.
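
For several thresholds at once, the same counting idea can be applied per threshold. A small sketch, assuming count is the vector returned by cellfun(@numel, q); the example counts and the name nAbove are made up for illustration:

```matlab
count = [13 13 9 12 11 6 9 14 8 13];   % example counts, e.g. from cellfun(@numel, q)
thresholds = [5 10 15 200];            % thresholds mentioned in the question

% For each threshold, count how many cells hold more than that many observations
nAbove = arrayfun(@(t) sum(count > t), thresholds);
```

Use >= instead of > if a cell with exactly the threshold number of observations should also be counted.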
Andi on 14 Apr 2022
Hi, thanks for the update.
I applied your suggestion to my original data, but I don't know why it counts NaN as a value.
Here is my script; the data is also attached:
clear all
clc
data1=readmatrix('data1.csv'); % selected candidate earthquake
ev_time=datenum(data1(:,1),data1(:,2),data1(:,3),data1(:,4),data1(:,5),data1(:,6));
cand_ev=ev_time';
for jj=1:194
b=cand_ev(:,jj);
aa(jj)= addtodate(b, 30, 'day');
bb(jj)= addtodate(b, -30, 'day');
end
U_lim=aa;
L_lim=bb;
b=load('selected_0.01.csv');
%a = b(~isnan(b));
keep = sum(~isnan(b), 1) >= 100;
a = b(:, keep);
U=U_lim(:,8);
L=L_lim(:,8);
for kk=1:59
e{kk}=a(a(:, kk)>L & a(:,kk)<U);
%q = a((a(:,kk) > L) & (a(:,kk) < U) , : );
end
count = cellfun(@numel, e)
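
Serial date numbers returned by datenum are expressed in days, so a fixed window of 30 days on each side can also be built with plain arithmetic on the whole vector instead of a loop. A minimal sketch under that assumption; the three event times and the name halfWindow are made up for illustration and do not come from data1.csv:

```matlab
% Hypothetical event times: year, month, day, hour, minute, second
ev = [2015 3 10 12 0 0;
      2015 6  1  0 0 0;
      2016 1 20  6 0 0];
ev_time = datenum(ev(:,1), ev(:,2), ev(:,3), ev(:,4), ev(:,5), ev(:,6));
cand_ev = ev_time';               % row vector, one entry per event

halfWindow = 30;                  % days on each side of an event
U_lim = cand_ev + halfWindow;     % upper limits, one per event
L_lim = cand_ev - halfWindow;     % lower limits, one per event
```

This gives one (L_lim, U_lim) pair per event without hard-coding the number of events.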
Chunru on 14 Apr 2022
Edited: Chunru on 14 Apr 2022
It seems that you need to change your U and L so that some data are selected.
data1=readmatrix('https://www.mathworks.com/matlabcentral/answers/uploaded_files/963815/data1.csv'); % selected candidate earthquake
ev_time=datenum(data1(:,1),data1(:,2),data1(:,3),data1(:,4),data1(:,5),data1(:,6));
cand_ev=ev_time';
for jj=1:194
b=cand_ev(:,jj);
aa(jj)= addtodate(b, 30, 'day');
bb(jj)= addtodate(b, -30, 'day');
end
U_lim=aa
U_lim = 1×194
1.0e+05 * 7.3604 7.3604 7.3608 7.3611 7.3611 7.3611 7.3611 7.3612 7.3612 7.3612 7.3613 7.3613 7.3613 7.3613 7.3614 7.3614 7.3614 7.3615 7.3615 7.3616 7.3619 7.3619 7.3620 7.3620 7.3625 7.3625 7.3625 7.3625 7.3631 7.3631
L_lim=bb
L_lim = 1×194
1.0e+05 * 7.3598 7.3598 7.3602 7.3605 7.3605 7.3605 7.3605 7.3606 7.3606 7.3606 7.3607 7.3607 7.3607 7.3607 7.3608 7.3608 7.3608 7.3609 7.3609 7.3610 7.3613 7.3613 7.3614 7.3614 7.3619 7.3619 7.3619 7.3619 7.3625 7.3625
b=readmatrix('https://www.mathworks.com/matlabcentral/answers/uploaded_files/963820/selected_0.01.csv');
%a = b(~isnan(b));
keep = sum(~isnan(b), 1) >= 100
keep = 1×10 logical array
1 1 1 1 1 1 1 1 1 1
a = b(:, keep);
any(isnan(a))
ans = 1×10 logical array
1 1 1 1 1 1 1 1 1 1
U=U_lim(:,8)
U = 7.3612e+05
L=L_lim(:,1)
L = 7.3598e+05
whos
Name         Size            Bytes  Class      Attributes

L            1x1                 8  double
L_lim        1x194            1552  double
U            1x1                 8  double
U_lim        1x194            1552  double
a            45000x10      3600000  double
aa           1x194            1552  double
ans          1x10               10  logical
b            45000x10      3600000  double
bb           1x194            1552  double
cand_ev      1x194            1552  double
data1        194x6            9312  double
ev_time      194x1            1552  double
jj           1x1                 8  double
keep         1x10               10  logical
for kk=1:size(a,2) % 59 is too big for your data
ai = a(:, kk); % get the column
ai(isnan(ai)) = []; % remove nan
e{kk}=ai(ai>L & ai<U); % index the NaN-free column ai, not the full matrix a
%q = a((a(:,kk) > L) & (a(:,kk) < U) , : );
end
count = cellfun(@numel, e)
count = 1×10
91 129 175 110 276 775 660 1218 2144 424
Andi on 14 Apr 2022
I need to find the number of columns that have non-zero values within the bounds U and L, but the code also reads zero/NaN as an entry. The limits U and L vary for every entry of dataset 1.
Chunru on 14 Apr 2022
See above. I have changed U and L. The code also removes NaN.
Andi on 14 Apr 2022
But as per the condition, we can't change the upper and lower limits, so this makes me a bit confused.
Chunru on 14 Apr 2022
You should check whether your U and L are set properly. With your original bounds, no data are selected.
I understand the bounds are fixed, but they need to be correct in the first place.
Andi on 14 Apr 2022
I just double-checked: U and L span a two-month period, which is long enough to check whether we have a sufficient number of data points in dataset 2. Even if the window between U and L were short, the code still should not count NaN values as numbers, because U and L are much larger than 0, and NaN is not within them either.
Chunru on 14 Apr 2022
NaNs are removed. You just need to ensure that U and L are appropriate values so that some data fall into the interval.
ai = a(:, kk); % get the column
ai(isnan(ai)) = []; % remove nan
e{kk}=ai(ai>L & ai<U); % index the NaN-free column ai, not the full matrix a
%q = a((a(:,kk) > L) & (a(:,kk) < U) , : )
Make sure the following is correct:
U=U_lim(:,8);
L=L_lim(:,8);
You have an array of limits. Why is only a single pair used?
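
One detail worth checking when NaNs are stripped from a column first: the logical mask must index that same shortened vector. Applied to the full matrix, a shorter mask selects elements by linear index from the start of the first column, which can bring NaNs back in. A toy sketch, with made-up data and bounds:

```matlab
a  = [NaN 4; 5 NaN; 1 7; 6 3];   % toy data, NaN marks missing entries
L  = 2;  U = 8;                  % illustrative bounds
kk = 2;

ai = a(:, kk);                   % column 2: [4; NaN; 7; 3]
ai(isnan(ai)) = [];              % now [4; 7; 3]
mask = ai > L & ai < U;          % all three values lie inside the bounds

wrong = a(mask);                 % takes a(1), a(2), a(3) from column 1: NaN, 5, 1
right = ai(mask);                % takes from the cleaned column: 4, 7, 3
```

Both results have the same number of elements, so a count alone does not reveal the problem; only the values do.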
Andi on 14 Apr 2022
For each point in dataset 1, I have an upper and a lower limit, and I need to use those limits to search for observations in each column of dataset 2. So technically each column should either have some values or no values. There is no other possible outcome, such as NaN.
Andi on 14 Apr 2022
We made a mistake here, which is why we got NaN:
e{ii, kk}=a(a(:, ii)>L_lim(:,kk) & a(:,ii)<U_lim(:, kk), ii);
Thank you for the help.
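
The indexing pattern in the line above, one cell per (column, event) pair with the row mask and the column index taken from the same column, can be sketched on toy data as follows. The data, the limits, and the names inBound and nColsWithData are assumptions made for illustration only:

```matlab
a     = [1 NaN; 5 7; NaN 3; 6 NaN];  % toy data: rows = observations, columns = series
L_lim = [0 4];                       % hypothetical lower limits, one per event
U_lim = [6 8];                       % hypothetical upper limits, one per event

nCol = size(a, 2);
nEv  = numel(L_lim);
e = cell(nCol, nEv);
for ii = 1:nCol
    for kk = 1:nEv
        % Comparisons with NaN are false, so NaN entries are never selected
        inBound = a(:, ii) > L_lim(kk) & a(:, ii) < U_lim(kk);
        e{ii, kk} = a(inBound, ii);
    end
end

count = cellfun(@numel, e);          % count(ii, kk): hits of column ii in window kk
nColsWithData = sum(count > 0, 1);   % per event: columns with at least one hit
```

Because the mask has the same length as the column it indexes, no separate NaN removal step is needed here.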

Asked: 14 Apr 2022

Commented: 14 Apr 2022
