Hello,
I have a large dataset of patents with the columns year, region, type of patent, and regional share:
2000 FR01 0 0.137
2000 FR01 1 0.135
2000 FR01 1 1
2000 FR02 0 0.144
2000 FR02 1 0.135
2000 FR02 1 1
2001 FR01 0 0.143
2001 FR01 1 0.135
2001 FR01 1 1
2001 FR02 0 0.155
2001 FR02 1 0.175
2001 FR02 1 1
.........................................................................................
I want to compute the total regional share for each combination of region, year, and type of patent.
I would appreciate any help.
Thank you.

Accepted Answer

Chunru on 30 Jun 2021
Edited: Chunru on 30 Jun 2021

It could be something like this:
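% Assume a table T with these variables: year, region, type_of_patents, regional_share
% Find the total of regional share for each region by year by type of patents.
u_type_of_patents = unique(T.type_of_patents);
u_year = unique(T.year);
u_region = unique(T.region);
% Preallocate one total per (type, year, region) combination so that
% each result is stored instead of being overwritten.
totalshare = zeros(length(u_type_of_patents), length(u_year), length(u_region));
for ip = 1:length(u_type_of_patents)
    for iy = 1:length(u_year)
        for ir = 1:length(u_region)
            % Element-wise & is required because the operands are vectors.
            % The == on region assumes a string or categorical column; for a
            % cell array of character vectors use strcmp(T.region, u_region{ir}).
            totalshare(ip, iy, ir) = sum(T.regional_share( ...
                T.type_of_patents == u_type_of_patents(ip) & ...
                T.year == u_year(iy) & ...
                T.region == u_region(ir)));
            %fprintf(...)
        end
    end
end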

5 Comments

Saptorshee Chakraborty on 30 Jun 2021
Edited: Saptorshee Chakraborty on 30 Jun 2021
Sorry, there is an error:
Operands to the logical and (&&) and or (||) operators must be convertible to logical scalar values.
Error in untitled (line 10)
T.type_of_patents==u_type_of_patents(ip) && ...
Apparently there are ndef entries in the year column, which I was not aware of before.
I will have to take care of them first.
Thank you.
Chunru on 30 Jun 2021
Use "&" instead. If a table column contains strings, use an appropriate string comparison. (I have edited the && above.)
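To illustrate the difference between "&" and "&&" and the string comparison mentioned above, here is a minimal sketch. The small sample vectors are hypothetical stand-ins for the table columns, and it assumes the region column is a cell array of character vectors, which may not match the actual data.

```matlab
% Hypothetical sample data mirroring three of the question's columns.
year   = [2000; 2000; 2001];
region = {'FR01'; 'FR02'; 'FR01'};   % assumed: cell array of character vectors
share  = [0.137; 0.144; 0.143];

% Element-wise & combines two logical vectors into one row mask.
% strcmp compares every cell of region against 'FR01'.
mask  = (year == 2000) & strcmp(region, 'FR01');
total = sum(share(mask));

% Writing && in place of & above would raise the error quoted earlier,
% because the operands of && must be logical scalars.
```

If region were stored as a string array or a categorical, `region == "FR01"` would work in place of `strcmp`.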
Saptorshee Chakraborty on 1 Jul 2021
Hello,
I corrected the data and ran the code, but it had not finished even after 6 hours. There are more than 13 million rows to process, so I guess the loop is what makes it so slow. Is there any way to perform the task without a loop?
Thank you.
Lei Hou on 1 Jul 2021
Hi Saptorshee,
Try the following and see whether the performance is better.
>> rowfun(@sum,t,"InputVariables","regional share",'GroupingVariables',["year" "type of patents" "region"],"OutputVariableNames","total region share")
ans =
  8×5 table
    year    type of patents     region     GroupCount    total region share
    ____    _______________    ________    __________    __________________
    2000           0           {'FR01'}        1               0.137
    2000           0           {'FR02'}        1               0.144
    2000           1           {'FR01'}        2               1.135
    2000           1           {'FR02'}        2               1.135
    2001           0           {'FR01'}        1               0.143
    2001           0           {'FR02'}        1               0.155
    2001           1           {'FR01'}        2               1.135
    2001           1           {'FR02'}        2               1.175
Thanks,
Lei
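As a possible alternative to rowfun, the same grouped totals can be sketched with groupsummary, or with findgroups plus accumarray, which also avoid an explicit loop. The table below is rebuilt from the sample rows in the question; the variable names year, region, type_of_patents and regional_share, and the choice of a string array for region, are assumptions and may differ from the real data. Whether this is faster on 13 million rows has not been measured here.

```matlab
% Sample rows from the question, with assumed variable names.
year            = [repmat(2000, 6, 1); repmat(2001, 6, 1)];
region          = repmat(["FR01"; "FR01"; "FR01"; "FR02"; "FR02"; "FR02"], 2, 1);
type_of_patents = repmat([0; 1; 1], 4, 1);
regional_share  = [0.137; 0.135; 1; 0.144; 0.135; 1; ...
                   0.143; 0.135; 1; 0.155; 0.175; 1];
T = table(year, region, type_of_patents, regional_share);

% Option 1: groupsummary returns one row per (year, type, region) group,
% with the total in the variable sum_regional_share.
G = groupsummary(T, ["year", "type_of_patents", "region"], "sum", "regional_share");

% Option 2: findgroups assigns a group number to every row, and
% accumarray sums the shares that fall under each group number.
[g, gYear, gType, gRegion] = findgroups(T.year, T.type_of_patents, T.region);
totals = accumarray(g, T.regional_share);
result = table(gYear, gType, gRegion, totals);
```

Both options give eight groups for the sample rows. For example, year 2000, type 1, region FR01 totals 0.135 + 1 = 1.135.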
Saptorshee Chakraborty on 1 Jul 2021
Hello Lei,
Thank you very much indeed.


Release

R2021a
