flatten structure array if values are identical

Ida on 28 Aug 2020
Commented: Ida on 29 Aug 2020
Dear MATLAB users,
I have a structure array, where the majority of fields have identical values. Some differ. The values can be numbers, strings, or cells. See minimal example below for a struct array of size 100, and fields A through F:
length(mystruct) = 100
%field A identical number
mystruct(1).A = 5
mystruct(2).A = 5
...
mystruct(100).A = 5
%field B identical string
mystruct(1).B = 'hello'
mystruct(2).B = 'hello'
..
mystruct(100).B = 'hello'
% field C not identical number
mystruct(1).C = 1
mystruct(2).C = 2
..
mystruct(100).C = 100
% field D not identical string
mystruct(1).D = 'x'
mystruct(2).D = 'y'
..
mystruct(100).D = 'z'
% field E identical cell
mystruct(1).E = {'a','b'}
mystruct(2).E = {'a','b'}
..
mystruct(100).E = {'a','b'}
% field F not identical cell
mystruct(1).F = {'a','b'}
mystruct(2).F = {'a','c','d'}
..
mystruct(100).F = {'b'}
I would like to "flatten" the common values of the structure array (which is actually very large, with thousands of fields) into a single value each, and create a cell or vector for the fields whose values differ:
length(mystruct) = 1
mystruct.A = 5
mystruct.B = 'hello'
mystruct.C = [1,2,..,100] %this can also be a cell if easier
mystruct.D = {'x','y',..,'z'}
mystruct.E = {'a','b'}
mystruct.F = {{'a','b'}, {'a','c','d'}, .., {'b'}}
Is there a straightforward way to do this?
Many thanks,
Ida

Accepted Answer

Bruno Luong on 28 Aug 2020
Edited: Bruno Luong on 28 Aug 2020
You can SERIALIZE the MATLAB objects (using the File Exchange serialize/deserialize submission linked in the code below) before taking UNIQUE.
The output will be slightly different from yours, since everything is stored in a cell (easy to fix, but it introduces special-case handling).
clear s
s(1).A = 5;
s(2).A = 5;
s(3).A = 5;
%field B identical string
s(1).B = 'hello';
s(2).B = 'hello';
s(3).B = 'hello';
% field C not identical number
s(1).C = 1;
s(2).C = 2;
s(3).C = 100;
% field D not identical string
s(1).D = 'x';
s(2).D = 'y';
s(3).D = 'z';
% field E identical cell
s(1).E = {'a','b'};
s(2).E = {'a','b'};
s(3).E = {'a','b'};
% field F not identical cell
s(1).F = {'a','b'};
s(2).F = {'a','c','d'};
s(3).F = {'b'};
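clear flats
% Collect each field's values across the array and keep the unique ones.
for fname = fieldnames(s)'
    flats.(fname{1}) = genunique({s.(fname{1})});
end
flats
%%
function c = genunique(c)
% Keep the first occurrence of each distinct cell element.
% hlp_serialize comes from the File Exchange submission:
% https://www.mathworks.com/matlabcentral/fileexchange/34564-fast-serialize-deserialize
str = cellfun(@(x) char(hlp_serialize(x)'), c, 'UniformOutput', false);
[~, i] = unique(str, 'stable');
c = c(i);
end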
Result:
A: {[5]}
B: {'hello'}
C: {[1] [2] [100]}
D: {'x' 'y' 'z'}
E: {{1×2 cell}} % {{'a','b'}}
F: {{1×2 cell} {1×3 cell} {1×1 cell}} % {{'a','b'}, {'a','c','d'} {'b'}}
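The "easy to fix" step mentioned above could look roughly like the following. This is only a sketch and not part of the original answer: `flats` is a hand-built stand-in for the struct produced by the loop, and the rule "unwrap single values, turn numeric scalars into a vector" is an assumption about what the question asks for.

```matlab
% Hand-built stand-in for the all-cell output shown in the result above.
flats = struct();
flats.A = {5};
flats.C = {1, 2, 100};
flats.D = {'x', 'y', 'z'};
flats.E = {{'a', 'b'}};

names = fieldnames(flats);
for k = 1:numel(names)
    v = flats.(names{k});
    if isscalar(v)
        % Only one unique value: store it directly instead of in a cell.
        flats.(names{k}) = v{1};
    elseif all(cellfun(@(x) isnumeric(x) && isscalar(x), v))
        % Several numeric scalars: store them as a numeric row vector.
        flats.(names{k}) = [v{:}];
    end
    % Anything else (character vectors, nested cells) stays a cell array.
end
```

Note that UNIQUE drops repeated values, so for a field that varies but contains some repeats, the stored values no longer line up one-to-one with the original struct elements.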
1 Comment
Ida on 29 Aug 2020
Thank you so much Bruno, this worked like a charm!


More Answers (2)

Jon on 28 Aug 2020
As a very general suggestion, you should be able to make use of the MATLAB unique command for this purpose. Obviously there are a lot more details to how you would implement it.
2 Comments
Jon on 28 Aug 2020
Edited: Jon on 28 Aug 2020
Here is some code that I think does what you want. I just did it with the three elements you provided, as an illustration, but you could turn the script into a more general function.
It gets a little complicated, because unfortunately the MATLAB unique function is not quite as general as you might hope. So first you have to turn things into elements that unique will work with and then change them back.
%field A identical number
mystruct(1).A = 5;
mystruct(2).A = 5;
mystruct(3).A = 5;
%field B identical string
mystruct(1).B = 'hello';
mystruct(2).B = 'hello';
mystruct(3).B = 'hello';
% field C not identical number
mystruct(1).C = 1;
mystruct(2).C = 2;
mystruct(3).C = 100;
% field D not identical string
mystruct(1).D = 'x';
mystruct(2).D = 'y';
mystruct(3).D = 'z';
% field E identical cell
mystruct(1).E = {'a','b'};
mystruct(2).E = {'a','b'};
mystruct(3).E = {'a','b'};
% field F not identical cell
mystruct(1).F = {'a','b'};
mystruct(2).F = {'a','c','d'};
mystruct(3).F = {'b'};
% find all of the field names
fieldNames = fieldnames(mystruct);
% loop through field names, just keeping unique elements
for j = 1:length(fieldNames)
    if isnumeric(mystruct(1).(fieldNames{j}))
        flatStruct.(fieldNames{j}) = unique([mystruct(:).(fieldNames{j})]);
    else
        % unique does not work on cell arrays whose elements are themselves
        % cells, so convert each element to a string, then map back
        A = {mystruct(:).(fieldNames{j})}; % cell array of field values
        Astr = strings(length(A),1);
        for k = 1:length(A)
            % if elements themselves are cells then turn each element into
            % a string by joining elements into a single string
            if iscell(A{k})
                Astr(k,1) = string(strjoin(A{k})); % string array
            else
                Astr(k,1) = string(A{k});
            end
        end
        [~,ia] = unique(Astr);
        % form cell array to return data
        outArray = A(ia);
        flatStruct.(fieldNames{j}) = outArray;
    end
end
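One caveat with joining cell contents via strjoin is that different cells can map to the same string; for example, {'a b'} and {'a','b'} both become 'a b'. A dependency-free alternative, sketched here as a swapped-in technique rather than what the code above does, is to compare values pairwise with isequal. The variable names and sample values are hypothetical.

```matlab
% Sketch: keep the first occurrence of each distinct value, comparing with
% isequal so that numbers, character vectors and nested cells all work.
vals = {{'a','b'}, {'a','c','d'}, {'a','b'}, {'b'}};  % hypothetical field values
keep = true(size(vals));
for i = 2:numel(vals)
    for j = 1:i-1
        if keep(j) && isequal(vals{i}, vals{j})
            keep(i) = false;   % duplicate of an earlier kept value
            break
        end
    end
end
uniqueVals = vals(keep);       % original order is preserved
```

This does a quadratic number of comparisons in the worst case, so it may be slow for very large struct arrays with many distinct values.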
Ida on 29 Aug 2020
Thank you Jon for your suggestion. I used Bruno's suggestion, which worked perfectly.



Jeremy Perez on 28 Aug 2020
Hi Ida,
Extract, Convert, Sort unique
You can extract all the values of a field with:
mystruct.A
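On a struct array, the expression above produces a comma-separated list, so in practice it is usually wrapped in brackets or braces. A minimal sketch, using a hypothetical three-element array with two of the question's fields:

```matlab
% Hypothetical three-element struct array mirroring fields A and D.
mystruct = struct('A', {5, 5, 5}, 'D', {'x', 'y', 'z'});

numsA  = [mystruct.A];   % numeric scalars concatenated into a 1x3 double
cellsD = {mystruct.D};   % any value type collected into a 1x3 cell
```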
You can use this function to sort the unique values:
help unique
However, it will not work for all data types, so each time you will have to use functions to convert your values to something unique can work with. Alternatively, you can dig into the alternative unique functions made by the community.
You can also come up with your own function that converts your values into strings that you can compare with unique. It can be complex depending on your values. Fields A, B, C and D are trivial. For fields E and F, use the links above.
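As one possible way to convert values into comparable strings without extra downloads, the built-in jsonencode function can produce a text key per value. This is a sketch under the assumption that JSON text is distinctive enough for your data; some type and shape information may be lost in the encoding, so it is not a full serializer. The sample values are hypothetical.

```matlab
% Sketch: build a text key for each value with jsonencode, then let
% unique compare the keys and index back into the original values.
vals = {{'a','b'}, {'a b'}, {'a','b'}, 'hello', 5};   % hypothetical values
keys = cellfun(@jsonencode, vals, 'UniformOutput', false);
[~, idx] = unique(keys, 'stable');   % 'stable' keeps first-seen order
uniqueVals = vals(idx);
```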
You will end up with.
allA = {mystruct.A};
uniqueA = allA(uniqueLocationArray)
If you have to consider tolerances, that's a whole new problem.
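For purely numeric fields, the built-in uniquetol function is one starting point for the tolerance case. A minimal sketch with made-up values; note that the tolerance is scaled by the largest absolute value in the input, so 1e-6 here is relative rather than absolute:

```matlab
% Values that differ only by floating-point noise are treated as equal.
x = [1, 1 + 1e-10, 2, 2 + 1e-9, 3];   % hypothetical field values
u = uniquetol(x, 1e-6);               % one representative per cluster
```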
1 Comment
Ida on 29 Aug 2020
Thank you Jeremy for your suggestion. I used Bruno's suggestion, which worked perfectly.

