How to know the range automatically identified by readtable when importing an Excel file?

15 ビュー (過去 30 日間)
Stefano
Stefano 2023 年 4 月 13 日
編集済み: dpb 2023 年 4 月 15 日
I am using readtable in an app to import data from an Excel spreadsheet, which contains several lines of heading and then the actual table with information: something like the example below
readtable successfully manages to skip the headers and to pull out the data. My problem however is that I also need to copy all the information in the header section because then I need to create a second file with the same header and only some of the elements of the table below. I cannot simply copy a given range (say A1:G6 in the example above) because the number of rows/columns of the header changes depending on the file.
Is there a way to know which range was automatically identified as a table by readtable, so that I could then deduct the range of the header that I need?

採用された回答

dpb
dpb 2023 年 4 月 13 日
It will be only as good as the ability of the detectImportOptions bot's ability to recognize the start of the data, but
opt=detectImportOptions(yourfile);
disp(opt.DataRange)
disp(opt.VariableNamesRange)
disp(opt.VariableNames)
will show you what it determined automagically...the two ranges are the starting cell (upper LH corner of the range); the number of elements in the VariableNames array will let you deduce the number of columns. It does not return the Excel UsedRange range address; that could be a useful adjunct to ask for as an enhancement; as is, all you can do is then read the table and see how many rows were returned. That size, of course, is dependent upon the setting of the 'ImportError' and 'Missing' rules and the data content of the file itself.
IF (the proverbial "big if") the format is always similar to the shown table, it's probably going to be pretty reliable; less well formatted or more eclectic data collections may confuse the scanning and not always get the same answer for what appear to be very similar worksheets. In particular, missing or nonnumeric data in columns that should be recognized as numeric often will cause the whole column to be treated as a cell or may change the 'DataRange' value returned.
  7 件のコメント
dpb
dpb 2023 年 4 月 14 日
編集済み: dpb 2023 年 4 月 14 日
I don't recall ever seeing it documented in the doc itself; I remember it being discussed here in the forum when behavior was changing fairly frequently and questions arose by poster's prior code not acting the same...but, I don't download/install every release and don't visit the release notes on other releases very often, either, so well could have missed what there might have been.
dpb
dpb 2023 年 4 月 15 日
編集済み: dpb 2023 年 4 月 15 日
function usedRange=GetUsedRange(Excel, sheetNameOrNumber)
% Returns used range address string for input sheet.
% Can pass in the number of the sheet (1,2,3,etc.) or name ('Results').
try
usedRange=Excel.Worksheets.Item(sheetNameOrNumber).UsedRange; % get range reference
usedRange=usedRange.Address; % return the address string
catch ME
errorMessage = sprintf('Error in function GetUsedRange.\n\nError Message:\n%s', ME.message);
fprintf('%s\n', errorMessage);
end
return;
end
To use, first create the Excel actx engine with actxserver and open the file; these were designed to be used to do various things to a given workbook, opening the workbook only once, do the work, and then closing and cleaning up the object rather than as complete standalone, create a new instance every time.
>> excel = actxserver('Excel.Application');
>> d=dir('TestAnnualReportScholarshipData.xls');
>> fn=fullfile(d.folder,d.name);
>> Workbook = excel.Workbooks.Open(fn)
>> GetUsedRange(excel,'Endowed Funds')
>> ans =
'$A$1:$P$74'
>> excel.ActiveWorkbook.Close(false);
>> excel.Quit
>> clear excel

サインインしてコメントする。

その他の回答 (1 件)

Walter Roberson
Walter Roberson 2023 年 4 月 13 日
No. However you should first do a detectImportOptions on the file, and then pass the options to readtable() . The options object will indicate the range over which data was imported.
  1 件のコメント
dpb
dpb 2023 年 4 月 13 日
" The options object will indicate the range over which data was imported."
Not exactly the range, Walter. It will return the ULH corner of the data and you can deduce the number of columns, but it does not return a length indication; the 'UsedRange' from Excel could be a useful addition to the returned output; it's got to be used internally but isn't reported to the user.

サインインしてコメントする。

カテゴリ

Help Center および File ExchangeSpreadsheets についてさらに検索

製品


リリース

R2023a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by