Analyzing a Large Amount of Data in a CSV file

Nathan Lawira on 27 Oct 2021
Commented: Nathan Lawira on 29 Oct 2021
Hello. For one of my projects, I have been tasked with analyzing over a hundred thousand lines of data in a .csv file, and I decided to use MATLAB to help. The data is COVID-19 data for over a hundred countries; I need to group it by country and plot deaths per million versus time. I've written some code so far, but I get an error at the end. I am not sure how to fix the problem and I really need help. I am also new to MATLAB, so my code might look very inefficient and primitive, and I would very much appreciate feedback on how to make it better.
T=readtable("data.csv")
countries=unique(T.location);
numcount=numel(T.location);
for i=1:height(T);
for j=cellstr(countries);
A=[];
count=0;
if ismember(T.location(i),j);
count=count+1;
A(count,1)=string(T.date(i));
A(count,2)=T.total_deaths_per_million(i);
A(count,3)=T.new_deaths_per_million(i);
A(count,4)=T.hospital_beds_per_thousand(i);
A(count,5)=T.life_expectancy(i);
A(count,6)=T.human_development_index(i);
end
eval('cntry' string(j) '=A');
end
end
After running this code, I receive this error (I'm not sure how to change the text to red, sorry).
File: test1.m Line: 17 Column: 22
Invalid expression. Check for missing multiplication operator, missing or unbalanced delimiters, or
other syntax error. To construct matrices, use brackets instead of parentheses.
Could anyone please help me to solve this error? Also, I am planning to plot deaths per million versus time but I am not sure how to convert dates into a number that can be plotted in a graph. Could anyone give me some tips on how to do so? Thank you very much!
  3 Comments
Nathan Lawira on 28 Oct 2021
So do you mean that I should instead store country names into a cell array and pull the data from that cell?
Stephen23 on 28 Oct 2021
"So do you mean that I should instead store country names into a cell array and pull the data from that cell?"
That would be one of many approaches that would be easier to work with than dynamic variable names.
Other approaches would be to use a table (see READTABLE) or a non-scalar structure.
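The cell-array approach mentioned above could look roughly like the following. This is only a minimal sketch: the sample values and the variable names location, deaths, names and data are made up for illustration and stand in for columns of data.csv.

```matlab
% Hypothetical sample data standing in for two columns of data.csv
location = {'Australia'; 'Bolivia'; 'Australia'; 'Bolivia'};
deaths   = [1.0; 2.0; 1.5; 2.5];

names = unique(location);        % one entry per country, sorted
data  = cell(size(names));       % data{k} holds the values for names{k}
for k = 1:numel(names)
    data{k} = deaths(strcmp(location, names{k}));
end

% Look a country up by name rather than through a dynamically named variable
idx = find(strcmp(names, 'Bolivia'));
boliviaDeaths = data{idx};
```

The country name stays ordinary data in names, so no eval() or generated variable names are needed.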

Accepted Answer

Yongjian Feng on 27 Oct 2021
The error is about your line 17. What do you want to do there?
Do you mean
eval(['cntry' num2str(j) '=A']);
  8 Comments
Yongjian Feng on 29 Oct 2021
Edited: Yongjian Feng on 29 Oct 2021
You can use a struct to store the data. Loop over the country names, select the rows for each country, and store them under a field named after that country:
T = readtable("data.csv");
countries = unique(T.location);      % assumes location is imported as a cell array of char vectors
cntry = struct();
for k = 1:numel(countries)
    countryName = countries{k};      % the name comes straight from T.location
    rows = strcmp(T.location, countryName);   % logical mask of this country's rows
    % Keep the columns of interest as a sub-table so the dates stay dates
    A = T(rows, {'date', 'total_deaths_per_million', 'new_deaths_per_million', ...
        'hospital_beds_per_thousand', 'life_expectancy', 'human_development_index'});
    % Field names must be valid identifiers, so names containing spaces
    % or punctuation are sanitized first
    cntry.(matlab.lang.makeValidName(countryName)) = A;
end
% Access one country's data by field name, e.g. cntry.Australia
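Once the data sits in a struct with one field per country, the fields can be traversed with fieldnames and dynamic field references. A minimal sketch follows; the field values here are made-up numbers, not data from the original file.

```matlab
% Hypothetical struct with one field per country
cntry = struct();
cntry.Australia = [1.0; 1.5];
cntry.Bolivia   = [2.0; 2.5];

names  = fieldnames(cntry);          % cell array of field names
latest = zeros(numel(names), 1);
for k = 1:numel(names)
    values = cntry.(names{k});       % dynamic field reference
    latest(k) = values(end);
    fprintf('%s: latest value %.1f\n', names{k}, latest(k));
end

% isfield guards against asking for a country that was never stored
hasChile = isfield(cntry, 'Chile');
```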
Nathan Lawira on 29 Oct 2021
Thank you! I've found another way though by using struct() and creating another cell array. Thank you so much all for your help.

More Answers (1)

Walter Roberson on 27 Oct 2021
The line
eval('cntry' string(j) '=A');
is the problem. You have a character vector, then a space, then a string scalar, then a space, then a character vector. You do not have operators between the parts, so that is an invalid expression.
If you were to change it to
eval(['cntry' string(j) '=A']);
then you would have the problem that with the string scalar there, the result of the [] would be a 1 x 3 string object, but eval() is not designed to be able to accept a vector of string objects.
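The concatenation behaviour described above can be checked directly. In this small sketch the value assigned to j is a hypothetical country name; the point is only how [] treats a string scalar mixed with character vectors, compared with joining the pieces into a single piece of text.

```matlab
j = "Australia";                     % hypothetical country name as a string scalar

% Mixing char vectors with a string scalar inside [] converts each char
% vector to a string, giving a 1-by-3 string array rather than one text
parts = ['cntry' j '=A'];
disp(class(parts))
disp(size(parts))

% The + operator on strings joins the pieces into one string scalar
joined = "cntry" + j + "=A";

% Converting to char first keeps ordinary char concatenation
joinedChar = ['cntry' char(j) '=A'];
```

This only illustrates why the eval() call fails; it is not a recommendation to build variable names this way.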
Don't do that. Creating variable names dynamically with eval() is best avoided.
  6 Comments
Stephen23 on 28 Oct 2021
Edited: Stephen23 on 28 Oct 2021
"I would also very much appreciate feedback on how I can make it better."
Rather than nesting the data in fields named for each country, your data would be better (simpler, more efficient, much easier to access) if you created a flat non-scalar structure, which would look something like this:
S(1).country = 'Afghanistan';
S(1).population = 39e6;
S(1).whatever = 123;
S(2).country = 'Bolivia';
S(2).population = 12e6;
S(2).whatever = 456;
% ... etc
Then you can simply loop over all countries using indexing or easily generate comma-separated lists:
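A minimal sketch of both ideas, reusing the example fields country and population from above (the values are the same illustrative numbers, not real data):

```matlab
S(1).country = 'Afghanistan';
S(1).population = 39e6;
S(2).country = 'Bolivia';
S(2).population = 12e6;

% Loop over all countries using indexing
for k = 1:numel(S)
    fprintf('%s: %g\n', S(k).country, S(k).population);
end

% Comma-separated lists: S.country expands to S(1).country, S(2).country, ...
names       = {S.country};           % collect into a cell array
populations = [S.population];        % collect into a numeric vector

% Vectorised operations then work on the collected values
[~, idx] = max(populations);
largest  = S(idx).country;
```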
As an alternative, a table might be suitable for your data (and also makes processing data much simpler):
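One possible table-based sketch is shown here. It assumes the column names location, date and total_deaths_per_million from the question and builds a tiny made-up table instead of reading data.csv. Dates are held as datetime values, which plot accepts directly, so no conversion to plain numbers is needed; readtable often imports date columns as datetime already, and datetime() can convert text if it does not.

```matlab
% Hypothetical rows standing in for data.csv
location = {'Australia'; 'Australia'; 'Bolivia'; 'Bolivia'};
dates = datetime({'2021-10-01'; '2021-10-02'; '2021-10-01'; '2021-10-02'}, ...
    'InputFormat', 'yyyy-MM-dd');
deaths = [50.1; 50.4; 1600.2; 1601.0];
T = table(location, dates, deaths, ...
    'VariableNames', {'location', 'date', 'total_deaths_per_million'});

% Assign a group number to each row, one group per country
[G, names] = findgroups(T.location);

% Per-country summary, here the maximum of the column
peak = splitapply(@max, T.total_deaths_per_million, G);

% One line per country, with datetime values on the x-axis
figure; hold on
for k = 1:numel(names)
    rows = (G == k);
    plot(T.date(rows), T.total_deaths_per_million(rows), 'DisplayName', names{k});
end
hold off
legend('show'); xlabel('Date'); ylabel('Total deaths per million');
```

Keeping everything in one table avoids copying data into per-country variables at all; the grouping vector G selects the rows when they are needed.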
Nathan Lawira on 28 Oct 2021
Okay, I'll try it out. Thank you!

Release

R2021a
