Find variable name of categorical values repeated a number of times

6 ビュー (過去 30 日間)
Marcus Glover
Marcus Glover 2019 年 12 月 21 日
コメント済み: dpb 2019 年 12 月 23 日
I am trying to find the variable names of columns in a table that match a value, in my case a string. The values in my table are categorical and have values of 'p' or 'f'. I can do this for one row.
row=categorical({'p' 'f' 'p' 'f'});
t=array2table(row)
% t =
%
% 1×4 table
%
% row1 row2 row3 row4
% ____ ____ ____ ____
%
% p f p f
t.Properties.VariableNames(table2array(t(1,:))=='p')
% ans =
%
% 1×2 cell array
%
% {'row1'} {'row3'}
But I want to find when the values have been repeated a few times, three in this case.
rows=[row;row;row];
ttt=array2table(rows)
% ttt =
%
% 3×4 table
%
% rows1 rows2 rows3 rows4
% _____ _____ _____ _____
%
% p f p f
% p f p f
% p f p f
t.Properties.VariableNames(table2array(ttt(1:3,:))=={'p';'p';'p'}) %does not work
%Variable index exceeds table dimensions.
isequal(table2array(ttt(1:3,3)),{'p';'p';'p'}) %does work for one column
% ans =
%
% logical
%
% 1
Not sure what to do, also not sure if there is a better way without converting my table to an array.
  4 件のコメント
Image Analyst
Image Analyst 2019 年 12 月 21 日
Make it easy for us to help you, not hard. Attach the table in a .mat file with the paper clip icon so we can try things with your actual data.
Marcus Glover
Marcus Glover 2019 年 12 月 21 日
編集済み: Marcus Glover 2019 年 12 月 21 日
I have attached the table as a .mat file here.
The actual data is longer, but I am only interested in the last three rows. I am looking for the column variable name when the last three rows all match the string 'p'.
Or for easy cut and paste:
row=categorical({'p' 'f' 'p' 'f'});
rows=[row;row;row];
ttt=array2table(rows)

サインインしてコメントする。

回答 (2 件)

dpb
dpb 2019 年 12 月 21 日
This is one of two ways...convert to a string represenation so can use string matching functions or use logical addressing and then find locations of runs. The string function may be simplest here. Brief outline of idea...
s=reshape(char(tt.Var1),1,[]); % convert column to char() row string
fn=@(v,c) ~isempty(regexp(reshape(char(v),1,[]),[c '{3,}'])); % anon function to find 3 or more consecutive values in s
For your sample
>> fn(tt.Var1,'p')
ans =
logical
1
>> fn(tt.Var2,'p')
ans =
logical
0
>> fn(tt.Var1,'f')
ans =
logical
0
>>
To use, apply with varfun or a loop over the desired columns building logic vector by column. Then the tt.PropertiesVariableNames property with that addressing vector will return the column names.

Image Analyst
Image Analyst 2019 年 12 月 21 日
Try this:
s = load('ttt.mat')
ttt = s.ttt
% Get a value to match, let's say from ttt.rows1(1)
thingToMatch = ttt.rows1(1)
for row = 1 : size(ttt, 1)
% Extract this row, just for simplicity in debugging.
thisRow = ttt{row, :}
% See how many match p
ia = ismember(thisRow, thingToMatch)
numMatches(row) = sum(ia);
end
numMatches % Echo to command window.
You'll see the number of matches:
thingToMatch =
categorical
p
numMatches =
2 2 2
  7 件のコメント
Image Analyst
Image Analyst 2019 年 12 月 23 日
Your "correction" does not check the very bottom row to see if they're p.
dpb
dpb 2019 年 12 月 23 日
MG, IA is correct...his test is 3==1(3-2) & and 3==2(3-1) which tests 1,2,3 are same. Your test is only 1(3-2)=='p' & and2(3-1) =='p'
It appears you're back to the three consecutive elements...I submit the string solution I posted above is far simpler for the purpose; besides being scalable to any value m for the number instead of being specifically coded for 3 only and requiring recoding to include specific conditions if the number ever does change.

サインインしてコメントする。

カテゴリ

Help Center および File ExchangeCategorical Arrays についてさらに検索

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by