Formating a table with unequal rows

28 ビュー (過去 30 日間)
Danielle Leblance
Danielle Leblance 2020 年 4 月 13 日
コメント済み: Star Strider 2020 年 4 月 15 日
I generated the number of employees in 4 countries and i was expecting my sample to be distributed in each country across all categories. However, i ended up with less categories in countries than others. Conseuqntly, matlab gave me an error when i tried to create a table that includes the 4 countries since i have unequal rows (look at the attached mat file). Hence, I exported each country's table to excel and combined them manually by replacing the missing rows with zeros. I also had to format the percentage manually to two decimal points.
I am wondering of there is a way to generate a formatted table that combines all categories in the first column and replaces the missing rpws of each country by zeros when this country doesn't have data for category (excel output is provoded as well). it is taking ages to do the work manually. Your help is greatly appreciated.

回答 (3 件)

Star Strider
Star Strider 2020 年 4 月 14 日
Another option is to use the outerjoin function:
D = load('matlab.mat');
OE1 = D.OutnbEmployees1;
OE2 = D.OutnbEmployees2;
OE3 = D.OutnbEmployees3;
OE4 = D.OutnbEmployees4;
Out = outerjoin(OE1,OE2, 'LeftKeys',1, 'RightKeys',1);
Out = outerjoin(Out,OE3, 'LeftKeys',1, 'RightKeys',1);
Out = outerjoin(Out,OE4, 'LeftKeys',1, 'RightKeys',1)
The result appears a bit unwieldy, however it will do what you want. (Instead of replacing the entries that do not all have entries with 0, it replaces them with NaN, however you can replace those with 0 later if you want.)
.
  4 件のコメント
Danielle Leblance
Danielle Leblance 2020 年 4 月 14 日
Thanks. it works without error now but it hasn't produced the desired output. I am attaching two tables:one produced by the codes and one is the desired output after manupaling the produced file , filling gaps with zeros, and adjusting the % to two decimal points.
Star Strider
Star Strider 2020 年 4 月 15 日
Try this:
D = load('matlab.mat');
matfileContents = struct2cell(D);
Outjoin = outerjoin(matfileContents{1},matfileContents{2}, 'LeftKeys',1, 'RightKeys',1);
Phrase = union(matfileContents{1}{:,1}, matfileContents{2}{:,1});
for k = 1:numel(matfileContents)-2
Outjoin = outerjoin(Outjoin, matfileContents{k+2}, 'LeftKeys',1, 'RightKeys',1);
Phrase = union(Phrase, matfileContents{k+2}{:,1});
end
VarNms = Outjoin.Properties.VariableNames;
VarNms(1:3:end) = [];
Out = table(Phrase);
for k = 1:fix(size(Outjoin,2)/3)
ix = ismember(Phrase,Outjoin{:,3*k-2});
ixn = find(ix);
T(ixn,2*(k-1)+[1 2]) = Outjoin(ixn,(3*k-2)+[1 2]); % Has ‘NaN’ Values
end
A = table2array(T);
A(isnan(A)) = 0;
T = array2table(A);
Out = [Out T];
Out.Properties.VariableNames(2:end) = VarNms
producing:
Out =
17×9 table
Phrase AlgFrequency Alg% IrqFrequency Irq% TunFrequency Tun% LbyFrequency Lby%
______________________________________ ____________ _______ ____________ _______ ____________ ______ ____________ _______
{'100,000 employees and above' } 0 0 1 0.66667 0 0 0 0
{'Between 1,000 and 2,999 employees'} 5 3.3333 4 2.6667 0 0 4 3.5398
{'Between 100 and 149 employees' } 11 7.3333 33 22 22 14.865 23 20.354
{'Between 15 and 24 employees' } 1 0.66667 1 0.66667 5 3.3784 2 1.7699
{'Between 150 and 199 employees' } 1 0.66667 0 0 3 2.027 1 0.88496
{'Between 200 and 249 employees' } 9 6 0 0 23 15.541 21 18.584
{'Between 25 and 49 employees' } 1 0.66667 2 1.3333 14 9.4595 5 4.4248
{'Between 250 and 499 employees' } 0 0 0 0 0 0 1 0.88496
{'Between 3,000 and 4,999 employees' } 0 0 0 0 0 0 20 17.699
{'Between 5 and 14 employees' } 9 6 9 6 9 6.0811 7 6.1947
{'Between 5,000 and 9,999 employees' } 0 0 0 0 2 1.3514 0 0
{'Between 50 and 74 employees' } 4 2.6667 2 1.3333 10 6.7568 9 7.9646
{'Between 500 and 749 employees' } 0 0 0 0 0 0 1 0.88496
{'Between 75 and 99 employees' } 3 2 1 0.66667 2 1.3514 4 3.5398
{'Between 750 and 999 employees' } 0 0 0 0 15 10.135 13 11.504
{'I don't know' } 0 0 0 0 2 1.3514 0 0
{'Under 5 employees' } 0 0 0 0 2 1.3514 0 0
The somewhat challenging part was getting all the rows to line up correctly considering that not all the tables had the same number of rows. The ismember and find calls solved that problem, since it was then simply a matter of indexing. I did the table → array → table coinversions because I could not get isnan to work with tables. There may be a simple way that I just overlooked, however just now I’m too tired to explore that.
This code should also be relatively robust, since it does not care about the contents of the .mat file. The only absolute requirement is that the tables all have three columns, and that they all have essentially the same configurations that those in this .mat file have.
.

サインインしてコメントする。


Tommy
Tommy 2020 年 4 月 14 日
How about this?
load('matlab.mat');
C = unique([OutnbEmployees1.AlgPhrase; OutnbEmployees2.IrqPhrase; OutnbEmployees3.TunPhrase; OutnbEmployees4.LbyPhrase]);
i1 = ~cellfun(@(c) any(strcmp(OutnbEmployees1.AlgPhrase, c)), C);
i2 = ~cellfun(@(c) any(strcmp(OutnbEmployees2.IrqPhrase, c)), C);
i3 = ~cellfun(@(c) any(strcmp(OutnbEmployees3.TunPhrase, c)), C);
i4 = ~cellfun(@(c) any(strcmp(OutnbEmployees4.LbyPhrase, c)), C);
T = table;
T.Phrase = C;
T.AlgFrequency(~i1) = OutnbEmployees1.AlgFrequency;
T.Alg_(~i1) = OutnbEmployees1.Alg_;
T.IrqFrequency(~i2) = OutnbEmployees2.IrqFrequency;
T.Irq_(~i2) = OutnbEmployees2.Irq_;
T.TunFrequency(~i3) = OutnbEmployees3.TunFrequency;
T.Tun_(~i3) = OutnbEmployees3.Tun_;
T.LbyFrequency(~i4) = OutnbEmployees4.LbyFrequency;
T.Lby_(~i4) = OutnbEmployees4.Lby_;

Danielle Leblance
Danielle Leblance 2020 年 4 月 14 日
Thanks. I am receiving an error for line 10:
Unrecognized table variable name 'Alg_'.

カテゴリ

Help Center および File ExchangeMatrices and Arrays についてさらに検索

タグ

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by