Merging/Superimposing two tables without using loops

16 ビュー (過去 30 日間)
Nikhil Tawakley
Nikhil Tawakley 2017 年 3 月 16 日
コメント済み: Nikhil Tawakley 2017 年 4 月 10 日
I have two very large (200000x200) tables, Table_1 & Table_2. They have columns/variables that store different data types- strings, doubles, integers etc.
I want to create a third table, Table_3, that superimposes Table 2 over Table 1 if Table 1 has a missing value. i.e.
if Table_1(row,column)<>{''},
Table_3(row,column)=Table_1(row,column)
else
Table_3(row,column)=Table_2(row,column)
For example,
Table 1 Table 2 Table 3
Record Name Age Record Name Age Record Name Age
1 John 10 1 John 1 John 10
2 Daniela 2 Daniela 5 2 Daniela 5
3 Austin 30 3 Austin 30 3 Austin 30
4 45 4 Mark 45 4 Mark 45
I want to do this without using for loops because the time to traverse them is very very slow.
To consider:
  • Is there a way to do this using the fillmissing() function? This would be the most natural approach however, Table_3=fillmissing(Table_1,'Constant',Table_2) doesn't work. (The 2nd argument (Table_2) cannot be multidimensional.)
  • If this were a simple matrix, Table_3(Table_1=='')=Table_2 would've worked; but I can't seem to do this with tables.
Thanks, NT
  1 件のコメント
Nikhil Tawakley
Nikhil Tawakley 2017 年 3 月 17 日
Thanks Peter- it's "like" a join but not quite- one typically does a join if there are extra variables to capture; these tables are identical in structure/size. Data in Table_1 is just more reliable than the other (but is comparatively sparsely populated).
I wan't to "superimpose" the two tables such that data from Table 1 is given priority over Table 2.
There could be situation where Mark's age in Table_1 is 85- in that case I would like the output to read:
Table 1 Table 2 Table 3
Record Name Age Record Name Age Record Name Age
4 85 4 Mark 45 4 Mark 85
There's gotta be a way to do this! Thanks again for looking into this.

サインインしてコメントする。

採用された回答

Peter Perkins
Peter Perkins 2017 年 3 月 16 日
This isn't exactly a join, but it's close. It's not entirely clear to me what you want to do, for example, Mark is missing from reconrd 4 in table 1, but Age == 45 is not. However, in tables 2 and 3, where Mark is present, it's the same Age: 45. Is that always true? If so, I would think you could delete records from table 1 that had missing values, then outerjoin with table 2, then repeat for table 3.
Otherwise, maybe you're looking for some kind of setdiff operation.
  4 件のコメント
Peter Perkins
Peter Perkins 2017 年 3 月 20 日
編集済み: Peter Perkins 2017 年 3 月 20 日
OK, I've been misinterpreting your example, and I didn't realize that "table 3" was you expected output. So after re-reading, I see what you want.
You say, "no loops", but I think you mean "no loops over rows." Here's a version that loops over variables, making use of ismissing. With only 200 vars, I don't think that will be an issue. I'm gonna assume that the two tables contain the same record numbers, if that's not true you would have to do some kind of join before this:
>> t1 = table( ...
[1;2;3;4],{'John';'Daniela';'Austin';''},[10;NaN;30;45], ...
'VariableNames',{'Record' 'Name' 'Age'})
t1 =
4×3 table
Record Name Age
______ _________ ___
1 'John' 10
2 'Daniela' NaN
3 'Austin' 30
4 '' 45
>> t2 = table( ...
[1;2;3;4],{'John';'Daniela';'Austin';'Mark'},[NaN;5;30;45], ...
'VariableNames',{'Record' 'Name' 'Age'})
t2 =
4×3 table
Record Name Age
______ _________ ___
1 'John' NaN
2 'Daniela' 5
3 'Austin' 30
4 'Mark' 45
>> isempty(setxor(t1.Record,t2.Record)) % check the assumption
ans =
logical
1
>> t1 = sortrows(t1,'Record'); % just in case
>> t2 = sortrows(t2,'Record');
>> t3 = t1;
>> missingValues = ismissing(t1);
>> for varIndex = 2:width(t1) % index by var number, not name
i = missingValues(:,varIndex);
t3.(varIndex)(i) = t2.(varIndex)(i);
end
>> t3
t3 =
4×3 table
Record Name Age
______ _________ ___
1 'John' 10
2 'Daniela' 5
3 'Austin' 30
4 'Mark' 45
Nikhil Tawakley
Nikhil Tawakley 2017 年 4 月 10 日
Thanks for the detailed response, Peter; this was very helpful. I ended up "solving" the issue by looping over variables. It's still slow but fortunately it's acceptable :)

サインインしてコメントする。

その他の回答 (0 件)

カテゴリ

Help Center および File ExchangeMatrix Indexing についてさらに検索

タグ

製品

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by