Merging data from different matrices with different row numbers by matching date

3 ビュー (過去 30 日間)
John
John 2017 年 9 月 26 日
回答済み: Peter Perkins 2017 年 9 月 27 日
I have 1600 matrices (1600 companies info) each containing data similar to the following (first column is company ID, second is a date, and third is return)
% code
10001 73500 0.05
10001 73440 0.06
10001 73320 0.04
10001 73280 0.03
..... ..... ....
The issue is each of these 1600 matrices have different row number depending on the availability of the data from the database (i.e return is available for different companies on different dates and different number of periods).
Now the task at hand is to merge all these matrices into one.
So I created a vector of all possible dates like in the following:
% code
73500
73480
73440
73380
73320
73300
73280
.....
I am looking to extract the data so that each company return is listed in a column at the right date (row), the rest of the rows should show Nan. Something like the following:
% code
73500 0.04 nan nan ........ for the rest of companies until column 1601
73480 nan nan 0.02
73440 0.05 0.04 nan
73380 0.03 0.05 0.02
73320 nan 0.02 nan
73300 0.05 nan nan
73280 nan nan 0.05
..... .... .... ....
Appreciate any help. Thanks
  3 件のコメント
OCDER
OCDER 2017 年 9 月 26 日
I sense intersect will be involved to find matching indices of two matrices based on dates.
Stephen23
Stephen23 2017 年 9 月 26 日
編集済み: Stephen23 2017 年 9 月 26 日
"I have 1600 matrices..."
Where? In MATLAB memory, or in files? You might like to read this:

サインインしてコメントする。

採用された回答

Jan
Jan 2017 年 9 月 26 日
編集済み: Jan 2017 年 9 月 26 日
result = nan(nDates, nCompanies + 1);
result(:, 1) = Dates;
for iComp = 1:nCompanies
aMatrix = <how to get your iComp.th matrix>
[iR, iM] = ismember(Dates, aMatrix(:, 2));
result(iR, iComp + 1) = aMatrix(iM, 3);
end
  2 件のコメント
John
John 2017 年 9 月 26 日
Thank you Jan The code seem to be working except the last line
aMatrix(iM, 3)
throw the following error:
Subscript indices must either be real positive integers or logicals.
Any thoughts on that?
John
John 2017 年 9 月 26 日
Ok Jan. I figured it out. Just add
iM(iM==0)=[]
before that last line and it will work.
Thanks mate.

サインインしてコメントする。

その他の回答 (2 件)

KL
KL 2017 年 9 月 26 日
If you have R2016b or later, timetable does this with synchronize command,
timetable_summary = synchronize(timetable_company1,timetable_company2);

Peter Perkins
Peter Perkins 2017 年 9 月 27 日
If you are using R2016b or later, timetable provides two really good ways to approach this. The second approach also works with tables, which have been in MATLAB since R2013b.
Leaving aside the question of what exactly are those "dates" (days since??? datenums divided by 10?), try these.
1) Create separate timetables and synchronize
>> x1 = [10001 73500 0.05; ...
10001 73440 0.06; ...
10001 73320 0.04; ...
10001 73280 0.03];
>> x2 = [10002 73380 0.5; ...
10002 73320 0.6; ...
10002 73300 0.4; ...
10002 73280 0.3];
>> t1 = datetime(x1(:,2),'ConvertFrom','datenum');
>> CompanyA = timetable(x1(:,3),'RowTimes',t1,'VariableNames',{'Return'})
CompanyA =
4×1 timetable
Time Return
____________________ ______
27-Mar-0201 00:00:00 0.05
26-Jan-0201 00:00:00 0.06
28-Sep-0200 00:00:00 0.04
19-Aug-0200 00:00:00 0.03
>> t2 = datetime(x2(:,2),'ConvertFrom','datenum');
>> CompanyB = timetable(x2(:,3),'RowTimes',t2,'VariableNames',{'Return'})
CompanyB =
4×1 timetable
Time Return
____________________ ______
27-Nov-0200 00:00:00 0.5
28-Sep-0200 00:00:00 0.6
08-Sep-0200 00:00:00 0.4
19-Aug-0200 00:00:00 0.3
>> returns = synchronize(CompanyA,CompanyB,'union')
returns =
6×2 timetable
Time Return_CompanyA Return_CompanyB
____________________ _______________ _______________
19-Aug-0200 00:00:00 0.03 0.3
08-Sep-0200 00:00:00 NaN 0.4
28-Sep-0200 00:00:00 0.04 0.6
27-Nov-0200 00:00:00 NaN 0.5
26-Jan-0201 00:00:00 0.06 NaN
27-Mar-0201 00:00:00 0.05 NaN
2) Create one timetable (amd maybe stop there) and unstack
>> x = [x1; x2];
>> t = datetime(x(:,2),'ConvertFrom','datenum');
>> tt = timetable(x(:,1),x(:,3),'RowTimes',t,'VariableNames',{'Company' 'Return'});
>> tt.Company = categorical(tt.Company,[10001 10002],{'CompanyA' 'CompanyB'})
tt =
8×2 timetable
Time Company Return
____________________ ________ ______
27-Mar-0201 00:00:00 CompanyA 0.05
26-Jan-0201 00:00:00 CompanyA 0.06
28-Sep-0200 00:00:00 CompanyA 0.04
19-Aug-0200 00:00:00 CompanyA 0.03
27-Nov-0200 00:00:00 CompanyB 0.5
28-Sep-0200 00:00:00 CompanyB 0.6
08-Sep-0200 00:00:00 CompanyB 0.4
19-Aug-0200 00:00:00 CompanyB 0.3
>> returns = unstack(tt,'Return','Company')
returns =
6×2 timetable
Time CompanyA CompanyB
____________________ ________ ________
27-Mar-0201 00:00:00 0.05 NaN
26-Jan-0201 00:00:00 0.06 NaN
28-Sep-0200 00:00:00 0.04 0.6
19-Aug-0200 00:00:00 0.03 0.3
27-Nov-0200 00:00:00 NaN 0.5
08-Sep-0200 00:00:00 NaN 0.4

カテゴリ

Help Center および File ExchangeDates and Time についてさらに検索

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by