Reading multiple sheets from an excel file

811 ビュー (過去 30 日間)
JMG
JMG 2021 年 7 月 24 日
コメント済み: dpb 2021 年 7 月 25 日
I have an excel spreadsheet and the data is seperated into a different sheet for each year but I want to have all of the years worth of data in one table so I was wondering what the best way to approach this is? Right now my code only gets the data from 2007 and thats it.
opt=detectImportOptions('BC1 2007-2020.xlsx');
opt.VariableTypes(3)={'double'};
BC1=readtable('BC1 2007-2020.xlsx',opt);
BC1.Date=datetime(BC1{:,1},'InputFormat','dd/MM/yy HH:mm:ss');
BC1=table2timetable(BC1);
plot(BC1.Date,BC1.WaterLevel_Meters);

採用された回答

dpb
dpb 2021 年 7 月 24 日
編集済み: dpb 2021 年 7 月 24 日
You'll just have to loop over the sheets; use sheetnames to return the list of sheets in the given file and iterate over it--presuming the sheets are in the desired order in the workbook, they'll be returned in chronological order; if not you can always sort() it first.
Presuming the sheets are also in the same format, you can use the same import object for each, changing only the sheet name/number although naming the sheet explicitly in the readtable call overrides the import options value.
I'll note that you could also incorporate the datetime variable type and input format into the options object thus saving the explicit conversion code.
Since it's not known a priori (I presume) how many records are in each sheet, the simplest coding is to simply dynamically catentate the new table onto the existing one...
fn='BC1 2007-2020.xlsx'; % using a fully-qualified filename here would be good practice
tBC=[];
opt=detectImportOptions(fn);
% wanted/need options code/fixup here...
shts=sheetnames(fn);
for i=1:numel(shts)
tBC=[tBC;readtable(fn,opt,'Sheet',shts(i))];
end
...
Do the conversion to timetable from all only once at the end as well as the time conversion if choose to not use the import options route for import for it...
While dynamic allocation is not the most elegant solution, it is by far the simplest and as long as the sizes and numbers of the sheets to be imported isn't too large, the overhead will be not too painful -- particularly if only doing the reading once or a few times and mostly using the resulting data. Of course, once read, one can save the table as a .mat file and retrieve the whole thing very quickly; only when new data are added would it be necessary to reread the raw file again.
  5 件のコメント
JMG
JMG 2021 年 7 月 24 日
Thank you I actually just found the document with the data all formatted together so I no longer need to do any of this! I appreciate your help though and now I know for future reference.
dpb
dpb 2021 年 7 月 24 日
I downloaded and looked at the file --
IMO, you should fix the broken file -- the date,time are sometimes in column A, sometimes A+B and there's other meta-data in the water level header line in some not in others but either way, it'll break if read the names.
That you can work around by not reading the variable names at all and just using the column numbers/generic Var1...VarN names then set the final variable names in the table properties struct.
You can write code to read either date, time or datetime as one string, but I still think it would be better to fix these data and then ensure create future in the same format going forward instead.
But, the other way would be to use detectImportOptions in turn on each file inside the loop and see how many variables it returns and then handle the different cases.
That also means have to convert to datetime for each file in turn using the appropriate code to convert the datetime and add the duration for those with both (or join the two strings) first. I didn't notice if the time/date formats were the same across sheets or not -- that could be another sticky wicket if somebody did those differently, too.

サインインしてコメントする。

その他の回答 (1 件)

Image Analyst
Image Analyst 2021 年 7 月 24 日
Try this:
baseFileName = 'BC1 2007-2020.xlsx'; % using a fully-qualified filename here would be good practice
tblAllSheetData = [];
% opt=detectImportOptions(baseFileName);
% wanted/need options code/fixup here...
sheetNames = sheetnames(baseFileName)
for k = 1 : numel(sheetNames)
thisTable = readtable(baseFileName,'Sheet',sheetNames{k});
fprintf('Read a table of %d rows and %d columns from a sheet names %s.\n',...
height(thisTable), width(thisTable), sheetNames{k});
% Stitch onto the end
tblAllSheetData = vertcat(tblAllSheetData, thisTable);
end
but it will fail when it gets to sheet 2008, which has 5 columns instead of only 3. How do you want to handle the situation where some sheets don't have the same number of columns as other sheets? Getting to the root of the problem . . . why DON'T all sheets have the same format?
  2 件のコメント
Image Analyst
Image Analyst 2021 年 7 月 25 日
@JMG, are you going to answer these questions?
dpb
dpb 2021 年 7 月 25 日
https://www.mathworks.com/matlabcentral/answers/885144-reading-multiple-sheets-from-an-excel-file#comment_1654434 comment above to my response of essentially same content that he located an existing file containing all the data so the Q? became moot...

サインインしてコメントする。

カテゴリ

Help Center および File ExchangeSpreadsheets についてさらに検索

タグ

製品


リリース

R2021a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by