How does one create a dataset from a large array and subsequently name the variables?

I have a body data array read from an Excel file: 9000-by-130. I also have the header line read from the same file: 1-by-130. I am trying to create a 9000-by-130 dataset from the data array while using the elements of the header line array as variable names.
data = <9000x130 dataset>
headers = <1x130 cell>
The issue I'm having is...
D = dataset(array(:));
...just creates a 9000-by-1 dataset!? So, for example,
for i = 1:size(headers,2)
dataSet.Properties.VarNames{i} = headers{i};
end
...cannot find variables to match elements in headers{i}.
The only solution is to enter the variables and the names manually 130 times!
dataSet = dataset(data(:,1),data(:,2)...data(:,130),'VarNames',{'Var1','Var2',...,'KillMeNow'});
Surely there's a simpler way. Is there?

Accepted Answer

Peter Perkins on 18 Sep 2012
Tolulope, your question is not entirely clear. You show
data = <9000x130 dataset>
headers = <1x130 cell>
and it's not clear if that's what you have, or if that's what you want.
If that's what you have, and headers is a 1x130 cell array of strings, why does
data.Properties.VarNames = headers
not work? What does, "cannot find variables to match elements in headers{i}" mean?
You have an Excel file. You have said that you cannot use the dataset constructor to read the file and get the dataset array you want. It's not clear why that would be. Perhaps the header line is not in the obvious place. The above should solve that.
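One possible reason for the assignment to fail, offered only as a guess, is that the header strings are not valid MATLAB identifiers (for example, they contain spaces or parentheses), since dataset variable names must be valid and unique. A minimal sketch, assuming that is the cause; rawHeaders and validNames are hypothetical names, and the sample values are made up for illustration:

```matlab
% Hypothetical header strings as they might come out of a spreadsheet.
rawHeaders = {'Body Mass','Height (cm)','Body Mass'};

% genvarname converts each string into a valid, unique MATLAB identifier.
validNames = genvarname(rawHeaders);

% Stand-in numeric data with one column per header.
data = rand(4,3);
ds = dataset({data,validNames{:}});
```

On releases from R2014a onward, matlab.lang.makeValidName and matlab.lang.makeUniqueStrings are the documented replacements for genvarname.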
This
D = dataset(array(:));
passes a column vector to the dataset constructor, so it's not unexpected that it creates a (something)x1 dataset array. It's not clear to me what "array" is, so it's impossible to know exactly what "something" is, or what you intended to do with that line.
If you have a 9000x130 numeric array, and a 1x130 cell array of strings, then as described in the help for dataset, you can do this:
X = rand(10,5);
names = {'A','B','C','D','E'};   % one name per column of X
d = dataset({X,names{:}})
Or, you can use num2cell to split your numeric array up columnwise:
x = rand(10,5);
names = {'A','B','C','D','E'};   % one name per column of x
y = num2cell(x,1);               % 1x5 cell, each cell holding a 10x1 column
d = dataset(y{:},'VarNames',names)
If you have a very recent version of MATLAB, you can use the mat2dataset function (I can't recall if this exists in R2012a, but it definitely does in R2012b).
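For reference, a minimal sketch of the mat2dataset route, assuming R2012b or later with the Statistics Toolbox installed; X and names are illustrative stand-ins for the numeric array and the header cell array:

```matlab
X = rand(10,5);                  % stand-in for the 9000x130 numeric array
names = {'A','B','C','D','E'};   % stand-in for the 1x130 header cell array

% Each column of X becomes one dataset variable, named from names.
d = mat2dataset(X,'VarNames',names);
```

This scales to any number of columns, provided names has one valid, unique identifier per column.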
Hope this helps.
  3 Comments
Peter Perkins on 19 Sep 2012
Just to be a bit more clear:
If you have a numeric array, then dataset({data,names{:}}) is the appropriate thing (unless you have the new R2012b, in which case use mat2dataset). But if you have a dataset already (and that's kind of what your original post indicated) and just want to add names to it, then you shouldn't call the dataset constructor again; you should just assign the names to data.Properties.VarNames.
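The two cases can be sketched in one snippet. This is only an illustration of the distinction drawn above; data and headers here are small made-up stand-ins for the poster's variables:

```matlab
% Illustrative stand-ins for the poster's variables.
data = rand(6,3);
headers = {'Age','Height','Weight'};

if isa(data,'dataset')
    % Already a dataset array: just rename its variables.
    data.Properties.VarNames = headers;
else
    % Numeric array: build the dataset, one variable per column.
    data = dataset({data,headers{:}});
end
```

Either branch leaves data as a dataset array whose variable names come from headers, with no need to list columns by hand.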
David on 13 Jun 2013
mat2dataset does not exist in 2012a. This is why I look for information here.


More Answers (2)

Wayne King on 18 Sep 2012
Why don't you organize the Excel file and then create the dataset directly from the Excel file with the
A = dataset('XLSFile',filename,...
syntax?
  1 Comment
Tolulope on 18 Sep 2012
In principle, yes. But unfortunately, the Excel files are far too large and too numerous, which makes it highly impractical to do that for each file :(



Javier on 18 Sep 2012
Hello Tolulope
This is what you have to do. In this case, data has 5 columns.
Step 1 (define the names)
for i=1:size(data,2)
VarNames{1,i}=['Var',num2str(i)];   % no space: names must be valid MATLAB identifiers
end
Step 2 (introduce in the data set)
dataSet = dataset(data(:,1),data(:,2),data(:,3),data(:,4),data(:,5),'VarNames',VarNames);
If this solves your question, please grade or post a comment.
Best regards
Javier
  1 Comment
Tolulope on 18 Sep 2012
Step 2 is the part that I'm hoping to avoid, as this would mean entering 130 distinct (non-generic) variable names. For example, the next file contains 450 columns!
Ideally, I would need a script in MATLAB that automatically takes each column in data and creates it in dataSet as a distinct variable.
