How to read variables from multiple '.tab' files?

Question

Tez, 18 July 2017

https://jp.mathworks.com/matlabcentral/answers/349182-how-to-read-variables-from-multiple-tab-file

Commented: Yuanzheng Wen, 18 June 2021

mvn_kp_insitu_20150202_v10_r01.zip

I have a sequence of data files ('.tab' files), each with 11900 rows and 236 columns. I need to read some of the variables from each file. To do that I opened some of the files from the folder, but I cannot read the variables. The variable columns contain both NaN and numeric values, yet only NaN values are shown in place of the numeric values.

clear all;
clc;
files=dir(fullfile('C:\Users\Documents\2015\02\*.tab'));
for i=1:2
  fid(i)=fopen(files(i).name);
  files(i).values=textscan(fid(i), '%s','delimiter','','HeaderLines',296,'MultipleDelimsAsOne',1);
  formatSpec = '%19s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%16s%s%[^\n\r]';
  dataArray = textscan(files(i).name,formatSpec, 'Delimiter', '', 'WhiteSpace', '',  'ReturnOnError', false);
  raw = repmat({''},length(dataArray{1}),length(dataArray)-1);
  for k=1:11900;l=1:236;
      raw(k,l) = raw(length(dataArray{1}),length(dataArray)-1);
      n(k,l) = str2double(raw(:, 2));
      h(k,l) = str2double(raw(:, 197));
  end
      fclose('all');
end

How can I read multiple files, and selected variables from each file, in MATLAB? I am attaching one of the files here.

Data files: https://lasp.colorado.edu/maven/sdc/public/data/sci/kp/insitu/2015/02

Comments

Stephen23, 18 July 2017

Edited: Stephen23, 18 July 2017

@Tez: please edit your question and upload a sample file by clicking the paperclip button.

Based on such a vague description the best advice you could expect is something like "try the file import tools, e.g. dlmread or importdata, reading their help carefully". Once you give us a sample file then we can test it ourselves.

You should also read these:

https://www.mathworks.com/help/matlab/import_export/process-a-sequence-of-files.html

https://www.mathworks.com/matlabcentral/answers/57446-faq-how-can-i-process-a-sequence-of-files

Tez, 18 July 2017

I tried to read the data using the Import Tool, but the whole file, including the header lines, is read as a single variable in a single cell instead of as 236 columns.

Answer 1

Stephen23, 18 July 2017

https://jp.mathworks.com/matlabcentral/answers/349182-how-to-read-variables-from-multiple-tab-file#answer_274582

Edited: Stephen23, 24 October 2019

This code automatically locates the header lines by checking for the # character at the start of each line. The variable mywant specifies which columns you want from the matrix; all other numeric columns are skipped and are never read into MATLAB memory. The code also ignores all string columns, although you could easily extend it to import them too.

The variable mywant lets you request any numeric column/s (e.g. cX, cY, etc.) by entering between one and six rows of that column's header text (starting from the top, i.e. row 1, row 2, etc.):

mywant = {{cXrow1text,cXrow2text,...}, {cYrow1text,cYrow2text,...}, ...}
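To illustrate how mywant is matched against the header text, here is a minimal sketch that uses the same matching function as the code in this answer. The three header columns in hdr are invented for illustration (an assumption, not copied from the file); in the real code hdr is built from the six header lines of the file.

```matlab
% Hypothetical header text for three columns (six header rows each).
% ASSUMPTION: invented for illustration, not copied from the .tab file.
hdr = { {'Electron';'Density';'';'';'';''}, ...
        {'Spacecraft';'Altitude';'Aeroid';'';'';''}, ...
        {'Electron';'Temperature';'';'';'';''} };
% Request two columns by their first three header rows:
mywant = {{'Electron','Density',''},{'Spacecraft','Altitude','Aeroid'}};
% A column matches if its leading header rows equal any one request:
fun = @(h)any(cellfun(@(w)all(strcmp(w(:),h(1:numel(w)))),mywant));
idx = cellfun(fun,hdr); % logical index of the requested columns
```

Here idx marks the first two columns. The third column shares its first header row with a request but differs in the second row, so it is not selected.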

If you are going to call this in a loop I would strongly suggest that you put this code into a function and call the function in the loop.

mywant = {{'Electron','Density',''},{'Spacecraft','Altitude','Aeroid'}};
mypath = '';
myname = 'mvn_kp_insitu_20150202_v10_r01.tab';
%myname = 'mvn_kp_insitu_20150512_v15_r01.tab';
myfull = fullfile(mypath,myname);
%
[fid,msg] = fopen(myfull,'rt');
assert(fid>=3,msg);
%
% Read lines until last '#':
vec = NaN(1,8);
str = '#';
while ~feof(fid) && strncmp(str,'#',1)
  vec([2:end,1]) = vec;
  vec(1) = ftell(fid);
  str = fgetl(fid);
end
fseek(fid,vec(end),'bof');
str = fgetl(fid);
soh = ftell(fid); % start of header
vec = regexp(str,'\d+','end');
num = numel(vec); % number of columns
vec = diff([0,vec]); % column widths
%
% Read header lines:
fmt = sprintf('%%%dc',vec); % fixed-width columns
hdr = textscan(fid,fmt,6,'Whitespace','');
hdr = cellfun(@cellstr,hdr,'uni',0);
hdr = cellfun(@strtrim,hdr,'uni',0);
%
% Locate the requested headers:
fun = @(h)any(cellfun(@(w)all(strcmp(w(:),h(1:numel(w)))),mywant));
idx = cellfun(fun,hdr);
% Identify any string columns:
opt = {'MultipleDelimsAsOne',true, 'CollectOutput',true, 'HeaderLines',6};
fmt = repmat('%s',1,num);
fseek(fid,soh,'bof');
dat = textscan(fid,fmt,1,opt{:});
dat = dat{1};
ids = isnan(str2double(dat)) & ~strcmpi('NaN',dat);
%
% Generate format string:
fmt = repmat({'%*f'},1,num);
fmt(idx) = {'%f'};  % requested columns
fmt(ids) = {'%*s'}; % string columns all ignored, but this could be changed...
fmt = horzcat(fmt{:});
% Read requested data:
fseek(fid,soh,'bof');
dat = textscan(fid,fmt,opt{:});
dat = dat{1};
%
fclose(fid);

It produces this output:

>> size(dat)
ans =
    2
>> dat
dat =
      NaN    222.850
000    224.600
000    226.370
000    228.170
000    229.990
300    231.830
100    233.700
000    235.580
000    237.490
000    239.420
000    241.380
000    243.350
000    245.350
000    247.370
000    249.420
000    251.480
000    253.570
000    255.670
000    257.800
000    259.950
000    262.130
000    264.320
000    266.540
000    268.780
000    271.030
000    273.310
000    275.610
000    277.940
000    280.280
000    282.640
000    285.030
  etc
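As a small illustration of how the code above derives the fixed column widths, the sketch below applies the same regexp and diff steps to a made-up column-numbering line. The string str is an assumption for illustration only; the real line in the file is much longer, with one number per column.

```matlab
% Hypothetical column-numbering line with right-aligned column numbers.
% ASSUMPTION: invented for illustration, the real line has many more columns.
str = '#   1     2    3';
vec = regexp(str,'\d+','end'); % index of the last digit of each number
num = numel(vec);              % number of columns
wid = diff([0,vec]);           % width of each column in characters
fmt = sprintf('%%%dc',wid);    % fixed-width format, one %Nc per column
```

Because each number is right-aligned in its column, the position of its last digit marks where the column ends, and the differences between consecutive end positions give the widths.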

Comments

Stephen23, 18 June 2021

"How can I rewrite your code to read a sequence of the tab files at a time instead of one at a time? "

Run the code inside a loop, just as the documentation shows:

https://www.mathworks.com/help/matlab/import_export/process-a-sequence-of-files.html
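A minimal sketch of such a loop might look like the following. The folder is the one from the question, and readfun is only a placeholder (an assumption) standing in for a function of your own that wraps the import code from the answer; here it simply returns the raw file text so that the sketch runs by itself.

```matlab
% ASSUMPTIONS: mypath is the folder from the question, and readfun is a
% placeholder for your own function that wraps the import code.
mypath  = 'C:\Users\Documents\2015\02';
readfun = @(fnm) fileread(fnm); % placeholder: replace with your import function
S = dir(fullfile(mypath,'*.tab'));
out = cell(1,numel(S)); % one cell per file, so nothing is overwritten
for k = 1:numel(S)
    fnm = fullfile(mypath,S(k).name); % full path, independent of the current folder
    out{k} = readfun(fnm);
end
```

Storing each result in its own cell keeps the data from every file, and the file names remain available in S(k).name for later reference.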

Yuanzheng Wen, 18 June 2021

Thanks for your reply, I will try it!

Answer 2

Jan, 18 July 2017

https://jp.mathworks.com/matlabcentral/answers/349182-how-to-read-variables-from-multiple-tab-file#answer_274568

Notes: There is no need for fullfile if you have only one argument. Storing the file IDs in a vector with fid(i) is not useful if you close all files with fclose('all') in each iteration. Better to use fid=fopen(...) and fclose(fid).

fid(i)=fopen(files(i).name) does not consider the folder. Better:

fid(i)=fopen(fullfile(files(i).folder, files(i).name))
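Put together, the open and close pattern suggested in these notes could be sketched as follows. The folder is the one from the question, the counter nread is added only for illustration, and the import step itself is left as a comment because it depends on what you want to read.

```matlab
files = dir('C:\Users\Documents\2015\02\*.tab'); % folder from the question
nread = 0; % illustrative counter of files opened and closed
for i = 1:numel(files)
    % Build the full path; the folder field is available in recent MATLAB releases:
    fnm = fullfile(files(i).folder, files(i).name);
    [fid,msg] = fopen(fnm,'rt');
    assert(fid>=3,msg) % stop with the system message if the file cannot be opened
    % ... import the data here, e.g. with textscan(fid,...) ...
    fclose(fid); % close exactly this file, not all open files
    nread = nread+1;
end
```

Using a scalar fid that is opened and closed inside each iteration avoids keeping a vector of stale file IDs.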

You first import the file with textscan(fid(i)) and then again with textscan(files(i).name). Why do you do this? I'd expect this to fail.

I do not understand the purpose of

      raw(k,l) = raw(length(dataArray{1}),length(dataArray)-1);
      n(k,l) = str2double(raw(:, 2));
      h(k,l) = str2double(raw(:, 197));

All three assignments access the same elements of raw in each iteration. The expression does not depend on k, so the loop is a waste of time. As long as it is not clear what you want to achieve, suggesting a modification would be based on guessing only.

Finally, the outputs raw, n, and h are overwritten in each iteration of the for i loop.

Comments

Tez, 18 July 2017

The folder has 30 files. What would the code be to read the second and the 197th columns of each file?
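One possible sketch for reading only selected columns by position is shown below. It rests on several assumptions that are not confirmed by the thread: that the columns are separated by whitespace, that no field contains embedded spaces, and that every header line starts with #. To keep the sketch runnable by itself it first writes a tiny stand-in file with five columns; for the real files you would set ncol = 236 and want = [2,197] and use the real file name.

```matlab
% Write a tiny stand-in file (ASSUMPTION: invented data, not the real format).
fnm = [tempname,'.tab'];
fid = fopen(fnm,'wt');
fprintf(fid,'# header line\n');
fprintf(fid,'2015-02-02T00:00:00 NaN 1.5 I 222.85\n');
fprintf(fid,'2015-02-02T00:00:08 3.2 2.5 I 224.60\n');
fclose(fid);
% Columns to keep (for the real files: ncol = 236; want = [2,197];)
ncol = 5;
want = [2,5];
% Skip every column with %*s, except the requested ones which are read as %f:
fmt = repmat({'%*s'},1,ncol);
fmt(want) = {'%f'};
fmt = [fmt{:}];
% Lines starting with # are treated as comments and ignored:
fid = fopen(fnm,'rt');
dat = textscan(fid,fmt,'CommentStyle','#','CollectOutput',true);
fclose(fid);
dat = dat{1}; % numeric matrix, one column per requested column
delete(fnm);  % remove the stand-in file
```

The text NaN in a requested column is read as a numeric NaN, so such rows stay aligned with the numeric values. Calling this inside a loop over the files returned by dir would then give one matrix per file.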
