Performance issues with readtable vs. xlsread

15 ビュー (過去 30 日間)
Mike
Mike 2020 年 12 月 2 日
編集済み: Mike 2020 年 12 月 3 日
I am working in R2020a and have converted the Excel file read in one of my MATLAB scripts from xlsread to readtable, since xlsread is now deprecated.
However, I am having performance issues with readtable compared to xlsread.
The Excel file is a 15MB .xlsb file with six sheets, and I am using named ranges to locate the data on the approrpriate sheet.
tic; [perforations, perforations_header] = xlsread(xls_file,sheet,'PERFORATIONS'); toc
Elapsed time is 4.354259 seconds.
tic; T=readtable(xls_file,'Sheet',sheet,'Range','PERFORATIONS'); toc
Elapsed time is 15.777517 seconds.
perforations = table2array(T); perforations_header=T.Properties.VariableNames; toc
Elapsed time is 15.803216 seconds.
So, two questions:
(1) Any suggestions on how to improve performance for readtable?
(2) Despite xlsread being deprecated, it clearly performs better than readtable. Is there any downside with staying with xlsread?
My thanks!
Mike
P.s. Attached is a cut down version of the Excel .xlsb file. It contains two named ranges: 'PERFORATIONS' and 'CELLS'. The full file was too large to attach.
  2 件のコメント
Walter Roberson
Walter Roberson 2020 年 12 月 3 日
Consider experimenting with readtable() with the 'UseExcel', true flag.
Mike
Mike 2020 年 12 月 3 日
Very interesting!
Rough figures for the 'PERFORATIONS' range:
xlsread => 4 seconds
readtable => 16 seconds
readtable('UseExcel',true) => 14 seconds
So, some improvement but still hasn't reached the performance of xlsread.
I'm starting with a very small example despite the use of an .xlsb file. Most of my "in anger" files will be x100 larger...
Thanks for the suggestion!
Mike

サインインしてコメントする。

採用された回答

Star Strider
Star Strider 2020 年 12 月 2 日
If you need to use xlsread, use it as long as you have it available.
Since you obviously want the header information as well, one option could be to use readtable once to read everything in once (likely using 'VariableNamingRule','preserve' in your call to it), perhaps extracting the variable names separately as <tablename>.Properties.VariableNames to a cell, then using save to put all the information you want in a .mat file. Then load the .mat file as necessary to retrieve the necessary information into your workspace. Either save the headers and data separately as different variables, or save it as a table.
I have no idea how efficient this would be, since I have no idea what your Excel file contains, or how large it is. It is simply an option to consider.
  2 件のコメント
Mike
Mike 2020 年 12 月 2 日
編集済み: Mike 2020 年 12 月 2 日
Thank you for the rapid reply. I have some control over the workflow for the project I'm working on, so I'll follow your lead and look into replacing the Excel files with a .mat save/load instead.
I don't have much experience with MATHWORKS/MATLAB. What does it mean for a function to be deprecated? Is there a management of change process to review possible transition issues before a function is removed?
Mike
Star Strider
Star Strider 2020 年 12 月 3 日
My pleasure!
My idea is to read them in once with readtable, save them to .mat files, then load them as necessary when you need them later.
If a function is deprecated, it will not be supported (and will likely not appear) beginning in some future release, although code using it may still work. (It’s complicated!)

サインインしてコメントする。

その他の回答 (0 件)

カテゴリ

Help Center および File ExchangeSpreadsheets についてさらに検索

製品


リリース

R2020a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by