MATLAB Answers

How to import Text File with 2 different Delimiters (how to organize header data and numeric data)

Asked by Ulrich Bretz on 1 Nov 2017
Latest activity: edited by Cedric Wannaz on 3 Nov 2017
I want to import a text file. This contains a header (with space as delimiter) and data (tab delimited).
The txt-file looks like this:
FORMAT TAB_DELIMITED
NUM_HEADER_BLOCKS 162
NUM_PARAMS 646
PT_COUNT.CND_1 3895
FRAMES.CND_1 16
FILE_TYPE TIME_HISTORY
OPERATION RSP_TO_TAB
DATA_TYPE ASCII_FLOATING_POINT
DATE Fri Jun 23 11:20:24 2017
DELTA_T 9.765625e-02
TOTAL_T 3.803711e+02
PTS_PER_FRAME 256
PTS_PER_GROUP 256
CHANNELS 120
.
.
NUM_ZEROS 5 %end of header with line index 646
RfLongPositionFbk RfLatPositionFbk ...... %start of tab delimited area with the data (120 channels)
mm mm
-12.6182 -4.071238
-12.6192 -4.070237
-12.6182 -4.069237
  1. I want to find the line that contains "NUM_PARAMS" and read its numeric value, which tells me the size of the header section.
  2. After that, I want to read the file up to line 646 in 2 rows (1st row -> parameter name, 2nd row -> value).
  3. Then I want to read the data (tab delimited, 120 channels). It would be nice if I could name the channels using the names shown in the line above the units of measurement.
I started by reading the full text file with the following code, to import the header and search for NUM_PARAMS:
s = textscan(fid, '%s%s', 'delimiter', ' ');
idx_NUM_PARAMS = find(strcmp(s{1}, 'NUM_PARAMS'), 1, 'first');
NUM_PARAMSdbl = str2double(s{1,2}{idx_NUM_PARAMS,1});
But this also imports the data as strings, which is unusable because of the different delimiter.
So I read the data in a second step:
dataTable = readtable(fileName, 'Delimiter', '\t', 'headerLines',NUM_PARAMSdbl+4,'ReadVariableNames',true);
But I cannot name the rows with the channel names, only with the line right above the data (the units of measurement).
Thanks for any hint on how I can solve my problem.

  0 Comments


1 Answer

Answer by Cedric Wannaz on 1 Nov 2017
Edited by Cedric Wannaz on 1 Nov 2017

You may not need to use header information for parsing your file. Look at this example (applied to data.txt attached):
content = fileread( 'data.txt' ) ;
% - Split header/data.
pos = strfind( content, 'RfLongPositionFbk' ) ;
header = strtrim( content(1:pos-1) ) ;
data = content(pos:end) ;
% - Header -> struct with numeric values when possible.
header = regexp( header, '^(\S+)\s+([^\r\n]+)', 'tokens', 'lineanchors' ) ;
header = vertcat( header{:} ) ;
fNames = regexprep( header(:,1), '\W', '_' ) ;
values = strtrim( header(:,2) ) ;
buffer = str2double( values ) ;
isNum = ~isnan( buffer ) ;
values(isNum) = num2cell( buffer(isNum) ) ;
header = cell2struct( values,fNames ) ;
% - Data -> num array.
data = cell2mat( textscan( data, '%f %f', 'headerlines', 2 )) ;
Running this, you get:
>> header
header =
struct with fields:
FORMAT: 'TAB_DELIMITED'
NUM_HEADER_BLOCKS: 162
NUM_PARAMS: 646
PT_COUNT_CND_1: 3895
FRAMES_CND_1: 16
FILE_TYPE: 'TIME_HISTORY'
OPERATION: 'RSP_TO_TAB'
DATA_TYPE: 'ASCII_FLOATING_POINT'
DATE: 'Fri Jun 23 11:20:24 2017'
DELTA_T: 0.0977
TOTAL_T: 380.3711
PTS_PER_FRAME: 256
PTS_PER_GROUP: 256
CHANNELS: 120
NUM_ZEROS: 5
>> data
data =
-12.6182 -4.0712
-12.6192 -4.0702
-12.6182 -4.0692

  7 Comments

I still don't understand if you really need the information in the header or not (if it was just for getting the number of lines in the header and the number of channels). Assuming that you just want the data and the channel names and units, the following works:
content = fileread( '012f1ri(Forum).txt' ) ;
% - Extract # parameters and channels.
nParams = str2double( regexp( content, '(?<=NUM_PARAMS\s+)\S+', 'match', 'once' )) ;
nChannels = str2double( regexp( content, '(?<=CHANNELS\s+)\S+', 'match', 'once' )) ;
% - Read channel names and units, after skipping nParams+2 and nParams+3 lines respectively.
fmtSpecNames = repmat( '%s', 1, nChannels ) ;
channelNames = textscan( content, fmtSpecNames, 1, 'HeaderLines', nParams+2 ) ;
channelNames = horzcat( channelNames{:} ) ;
channelUnits = textscan( content, fmtSpecNames, 1, 'HeaderLines', nParams+3 ) ;
channelUnits = horzcat( channelUnits{:} ) ;
% - Read channel data, after skipping nParams+4 lines.
fmtSpecData = repmat( '%f', 1, nChannels ) ;
channelData = textscan( content, fmtSpecData, 'HeaderLines', nParams+4 ) ;
channelData = cell2mat( channelData ) ;
After running this, variables channelNames, channelUnits, and channelData contain names, units and data respectively.
Then we can convert to struct array, table, or whatever is best for you, and extract data from the header as well if needed.
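For the table option, here is a minimal sketch, assuming channelNames, channelUnits, and channelData have the shapes produced by the snippet above. The small literal values below are hypothetical stand-ins, not the content of the real file. It relies on matlab.lang.makeValidName and matlab.lang.makeUniqueStrings, which as far as I know first shipped in R2014a; on older releases genvarname would be the fallback.

```matlab
% Hypothetical stand-ins for the outputs of the snippet above.
channelNames = { 'RfLongPositionFbk', 'RfLatPositionFbk' } ;
channelUnits = { 'mm', 'mm' } ;
channelData  = [ -12.6182, -4.071238 ; -12.6192, -4.070237 ] ;

% Table variable names must be valid, unique MATLAB identifiers, so the
% raw channel names are sanitized first (a no-op for names that are
% already valid and unique).
varNames = matlab.lang.makeValidName( channelNames ) ;
varNames = matlab.lang.makeUniqueStrings( varNames ) ;

% One table variable per channel; units are stored as table metadata.
T = array2table( channelData, 'VariableNames', varNames ) ;
T.Properties.VariableUnits = channelUnits ;
```

A channel can then be accessed by name, e.g. T.RfLongPositionFbk, and the units remain available in T.Properties.VariableUnits.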
Ulrich Bretz's "Answer" moved here:
This is my current status:
content = fileread(fileName);
lineStarts = [0, strfind( content, sprintf('\n') )] + 1 ;
numParams_header = str2double( regexp( content, '(?<=NUM_PARAMS\s+)\S+', 'match', 'once' ));
header = content(lineStarts(1):(lineStarts(numParams_header+1)-1));
channels = content(lineStarts(numParams_header +3):(lineStarts(numParams_header +4)-1));
units = content(lineStarts(numParams_header +4):(lineStarts(numParams_header +5)-1));
data = content(lineStarts(numParams_header +6):end);
How can I convert the channels and units from a sequence of characters to a char array?
I use MATLAB R2014a.
The answer in my comment above does this already. But if you want to follow your current approach, you can use STRSPLIT to get cell arrays of channel names and units (and possibly STRTRIM before, to get rid of \r if STRSPLIT outputs a 121st empty cell).
For the data, I would do it this way:
data = sscanf( data, '%f' ) ; % Long vector of all data.
data = reshape( data, numel(channels), [] ).' ; % Reshape into array.
where channels is a cell array of channel names (output of STRSPLIT).
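Put together, the STRTRIM/STRSPLIT step and the SSCANF/RESHAPE step could look like the sketch below. The three input strings are hypothetical two-channel stand-ins for the substrings extracted from content, and the tab delimiter is assumed from the FORMAT TAB_DELIMITED header line.

```matlab
% Hypothetical stand-ins for the channel, unit, and data substrings.
tab      = sprintf( '\t' ) ;
channels = [ 'RfLongPositionFbk', tab, 'RfLatPositionFbk', char(13) ] ;  % trailing \r
units    = [ 'mm', tab, 'mm', char(13) ] ;
data     = sprintf( '-12.6182\t-4.071238\r\n-12.6192\t-4.070237\r\n' ) ;

% STRTRIM removes trailing whitespace (\r, \n, or a trailing tab), so
% STRSPLIT returns exactly one cell per channel.
channels = strsplit( strtrim( channels ), tab ) ;
units    = strsplit( strtrim( units ), tab ) ;

% SSCANF reads every number into one long column vector; RESHAPE plus
% transpose restores one row per sample and one column per channel.
data = sscanf( data, '%f' ) ;
data = reshape( data, numel( channels ), [] ).' ;
```

Note that STRSPLIT collapses consecutive delimiters by default, so a channel with an empty unit would shift the following units; passing 'CollapseDelimiters', false avoids that.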
