Please help me load data into MATLAB

Hi all,
I am having trouble loading the file Adult.data into MATLAB. When I try:
>> load adult.data
MATLAB displays:
??? Error using ==> load
Unknown text on line number 1 of ASCII file C:\Users\Documents\MATLAB\adult.data
"Self-emp-not-inc".
Here is a line from the file:
50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K
I don't understand why this happens. I have also tried fopen and scan, but I still cannot load the file. Please help me. Thank you so much.

Accepted Answer

Matt Tearle on 22 Feb 2011

2 votes

Similar to what Andrew said, but I'd go with
fid = fopen('Adult.data');
A = textscan(fid,'%f%s%f%s%f%s%s%s%s%s%f%f%f%s%s','delimiter',',');
fclose(fid);
It looks uglier, but it will import the 15 columns as separate cells, and the numeric values will be imported as numeric arrays.
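To work with the result, each column can be pulled out of the cell array by position. The sketch below is only an illustration: it parses two sample rows from this thread out of a string instead of the file (textscan also accepts text as its first input), and the names age, sex, country, and assets are assumed labels for columns 1, 10, 14, and 15, not names stored in the file.

```matlab
% Two sample rows from this thread, joined by a newline (stand-in for Adult.data).
txt = sprintf('%s\n%s', ...
    '39, State_gov,77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K', ...
    '58, Private, 215646, HS-grad, 9, Divorced, Handlers-cleaners, Not-in-family, White, Male, 0, 0, 40, United-States, <=50K');

% Same 15-field format as above: %f for numeric columns, %s for text columns.
fmt = '%f%s%f%s%f%s%s%s%s%s%f%f%f%s%s';
A = textscan(txt, fmt, 'Delimiter', ',');

% Unpack the columns of interest (variable names are assumptions).
age     = A{1};    % numeric column vector
sex     = A{10};   % cell array of strings
country = A{14};   % cell array of strings
assets  = A{15};   % cell array of strings such as '<=50K'
```

The blanks after each comma are skipped by textscan's default whitespace handling, so the strings should come back without a leading space.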
That said, if you're doing any kind of statistical analysis on this kind of data, you probably want (and may already have) Statistics Toolbox. In that case, use dataset to import the data directly into a dataset array, which will make your life easier. In particular, you can use nominal to turn values like "Bachelors", "White", and "Male" into nominal arrays.

1 Comment

Andrew Newell on 22 Feb 2011
Your approach gets my vote - present pain for future gain!


More Answers (6)

Andrew Newell on 22 Feb 2011

2 votes

You can use textscan to read in the data as a comma-delimited set of strings:
fid = fopen('Adult.data');
A = textscan(fid,'%s','delimiter',',');
fclose(fid);
This reads the data into a cell containing a one-dimensional cell array. The next command will change it to a format that is probably more useful to you (there are 15 fields on each line):
A = reshape(A{:},15,[]);
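As a rough sketch of what the reshaped array looks like: each column of the 15-by-N cell array is one record, so row 1 holds the ages as text and can be converted with str2double. The example parses two sample rows from this thread out of a string rather than the file, and the names B and ages are only illustrative. Note that the reshape assumes every line has exactly 15 fields.

```matlab
% Two sample rows from this thread, joined by a newline (stand-in for Adult.data).
txt = sprintf('%s\n%s', ...
    '39, State_gov,77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K', ...
    '58, Private, 215646, HS-grad, 9, Divorced, Handlers-cleaners, Not-in-family, White, Male, 0, 0, 40, United-States, <=50K');

% Read every field as a string, then fold into 15 rows (one column per record).
A = textscan(txt, '%s', 'Delimiter', ',');
B = reshape(A{:}, 15, []);

% Row 1 holds the ages as strings; convert them to numbers.
ages = str2double(B(1,:));
```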
Matt Tearle on 23 Feb 2011

1 vote

BTW, best practice is to accept the answer that solved your initial question, then start a new question for the follow-up. That way, others can follow what's happening (for the benefit of others who might have similar questions). Anyway...
It depends: are you using Statistics Toolbox? If not, the base MATLAB way is
nnz((age > 50) & strcmpi(sex,'male'))
I'm assuming age is a numeric array, and sex is a cell array of strings. The > and strcmpi (case-insensitive string comparison) both create logical variables, which are combined using &. Applying the nnz function returns the number of true values.
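As a quick check, here is the same expression applied to the age and sex values from the four sample rows in the follow-up question. The arrays are typed in by hand for illustration; in practice they would come from the imported data.

```matlab
% Age and sex values from the four sample rows in the question.
age = [39; 50; 58; 53];
sex = {'Male'; 'Female'; 'Male'; 'Male'};

% Logical masks combined element-wise, then counted.
isOver50 = age > 50;               % [0; 0; 1; 1]
isMale   = strcmpi(sex, 'male');   % [1; 0; 1; 1]
count    = nnz(isOver50 & isMale); % 2
```

Note that 50 itself is not counted, because the comparison is strictly greater than.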
If you have Stats TB and have the data in a dataset array, with sex as a nominal variable,
nnz((data.age > 50) & (data.sex == 'male'))
Matt Tearle on 23 Feb 2011

1 vote

Would you be surprised if I suggested that what you really need is Statistics Toolbox?
But, the brute-force way in MATLAB could be done something like this:
clist = unique(country);
Nctry = length(clist);
num_richbuggers = zeros(Nctry,1);
for k = 1:Nctry
    num_richbuggers(k) = nnz(strcmpi(assets,'>50K') & ...
        strcmpi(country,clist{k}));
end
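To see what the loop produces, here is a small run on made-up values. The country and assets arrays below are toy data invented for illustration, not rows from Adult.data. Because unique returns the country names sorted, the counts line up with clist in alphabetical order.

```matlab
% Toy data (made up for illustration): one entry per person.
country = {'United-States'; 'Cuba'; 'United-States'; 'Cuba'; 'India'};
assets  = {'>50K'; '<=50K'; '>50K'; '>50K'; '<=50K'};

clist = unique(country);           % sorted list of distinct countries
Nctry = length(clist);
num_richbuggers = zeros(Nctry,1);
for k = 1:Nctry
    % Count people who are both above 50K and from country k.
    num_richbuggers(k) = nnz(strcmpi(assets,'>50K') & ...
        strcmpi(country,clist{k}));
end

% Print each country next to its count.
for k = 1:Nctry
    fprintf('%s: %d\n', clist{k}, num_richbuggers(k));
end
```

With this toy data the counts are 1 for Cuba, 0 for India, and 2 for United-States.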
love on 23 Feb 2011

0 votes

Thank you so much for your quick response. Now I need to count the number of males who are over 50. How can I do that in MATLAB?
For example:
39, State_gov,77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K
50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Female, 0, 0, 13, United-States, <=50K
58, Private, 215646, HS-grad, 9, Divorced, Handlers-cleaners, Not-in-family, White, Male, 0, 0, 40, United-States, <=50K
53, Private, 234721, 11th, 7, Married-civ-spouse, Handlers-cleaners, Husband, Black, Male, 0, 0, 40, United-States, <=50K
There are two males who are over 50, so the result should be 2. Please help me. I am a beginner in MATLAB and don't know how to do this. Thank you.
love on 23 Feb 2011

0 votes

Hi Matt, thanks for your answer. It works perfectly.
What about grouping and counting in MATLAB? For example, I need to group the records by country and count the people who earn more than 50K in each country. Thank you so much.

1 Comment

Andrew Newell on 23 Feb 2011
See above for Matt's comment about best practice.


love on 23 Feb 2011

0 votes

Fantastic, it works, Matt. MATLAB is great, you are great. Thank you very much.
