How to figure out the number of rows for headers?

20 ビュー (過去 30 日間)
Leon
Leon 2020 年 3 月 30 日
コメント済み: Leon 2020 年 4 月 2 日
I have Excel files with unknown number of headers. For exmple, some files only have one row of header info. Another one could have 12 header lines.
I would need to know the accurate count of the number of rows of headerlines, so that my program would know the Excel Row# for a cetain data value. Right now, it only knows the relative Row # of the numerical portion of the Excel file.
What makes things more complicated is that my data also routinely contain columns that are made of text-strings.
Right now, I use readtable to read the Excel file. It basically assume the first Row is the headerline, and the rest of them are data. Here is my question. Is there a straightforward way to identify the number of headerlines in my case?
Thanks!
  1 件のコメント
Leon
Leon 2020 年 4 月 2 日
The below code works for flat csv or txt files. It does a good job of reading all of the headers. Is there an easy way to make it work for xlsx files as well?
header_end = false;
header_count = 0;
% preallocate 50 rows of header text
header = cell(50, 1);
%% Loop through header lines
while ~header_end
%% Get a whole line from the file
aline = fgetl(FID);
%% Use a regular expression to look for the data block header
tokens = regexp(aline, ...
['(\d+)' ... % capture a number
'\s+!\s+' ... % followed by an !
'(\w+\s+)+'],... % and capture a series of strings
'tokens', 'warnings');
%% Parse data block header if found
if ~isempty(tokens) && numel(tokens{:}) == 2
% parse numeric token using string to double
num_datum = str2double(tokens{1}(1));
% parse variable string
varNameStr = deblank(tokens{1}(2));
names = strsplit(varNameStr{:});
%names = names{1};
% stop looping
header_end = true;
else
% Increment header line count and add new line
header_count = header_count+1;
header{header_count} = aline;
end
end

サインインしてコメントする。

回答 (2 件)

Fangjun Jiang
Fangjun Jiang 2020 年 3 月 30 日
If you use [Num, Txt, Raw]=xlsread(), can you figure it out based on the numerical, text and raw data?
  1 件のコメント
Leon
Leon 2020 年 3 月 30 日
Thanks. Unfortunately, I can't.

サインインしてコメントする。


Josh Zagorski
Josh Zagorski 2020 年 3 月 30 日
[A DELIM NHEADERLINES] = importdata(...) returns the detected number of header
lines in the input ASCII file.
So, [Matrix_converted,'delimiter',No_headerlines = importdata('file.xlsx');
  5 件のコメント
Josh Zagorski
Josh Zagorski 2020 年 3 月 31 日
Do you have the correct filepath?
Otherwise, I glossed over your "text-strings" data comment - wondering if data needs to be double/numeric "data contains a double array."
Leon
Leon 2020 年 4 月 2 日
Yes, I do.
Just tried a different file, below is what I get:
>> [A,B,C]= importdata('G01.xlsx')
A =
struct with fields:
data: [1×1 struct]
textdata: [1×1 struct]
colheaders: [1×1 struct]
B =
NaN
C =
0

サインインしてコメントする。

カテゴリ

Help Center および File ExchangeSpreadsheets についてさらに検索

製品


リリース

R2019b

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by