MATLAB Answers

How to properly extract data from text file.

30 ビュー (過去 30 日間)
Sharah
Sharah 2017 年 5 月 11 日
コメント済み: Guillaume 2017 年 5 月 12 日
I have a data in a text file that looks basically like this:
LineType: 1
PlayMode: Single
GameType: OneBalloon
LineType: SumR3
TranslationSpeed: 0
SensivityBalloon1: 0.09
SensivityBalloon2: 0
LevelLength: 20
Season: Summer
Backgrounddifficulty: Easy
StarScore[1] DistanceScore[1] StabilityScore[1] ScoreFrames[1] Frame[1] Time[1] ForcePlayer1[1] BalloonPath_X[1] BalloonPath_Y[1] CharacterPath_X[1] CharacterPath_Y[1] IsInactive[1]
0 0 0 0 0 0 30653 0 4.225888 0 2.150741 0
1 0 0 0 1 0 30641 0 -2.579402 0 -4.643577 0
And I am using this to extract data starting from the StarScore:
file = fullfile('file.txt');
Subject(1).T = readtable(file,'Delimiter',' ', ...
'ReadVariableNames',true, 'HeaderLines', 10);
Subject(1).T(:, 13) = [];
Two questions I have:
1) The problem with this is that, the headerline should be at 11, but MATLAB extracted the first data as the header if I put HeaderLines to 11. It skips the first line. Why?
2) How to extract the first few information from the text file on a different cell and stop before it reaches starScore?

採用された回答

dpb
dpb 2017 年 5 月 11 日
編集済み: dpb 2017 年 5 月 12 日
The first one is (relatively) easy -- you set 'ReadVariableNames',true whose meaning per documentation is "the first row of the region to read contains the variable names for the table." Hence, the count of lines to skip is based on all the information to be parsed, not just the data portion; if you want the header line for names it becomes one of the data lines. So in that case 'HeaderLines' is just 10; you want the 11 th line.
I don't understand the second request, sorry...
ADDENDUM
OK, the second came to me (with some help from seeing Guillaume's Answer :) ). Another approach to same end result...
hdrdata=regexp(textread(file,'%s','headerlines', 10, ...
'delimiter','\n','whitespace',''),'split');
which will leave you a 10x1 cell array each of which contains the text/value pair.
While TMW has deprecated the venerable textread over its uptown cousin textscan, it has some advantages for certain uses including that it accepts filename instead of needing file handle and that it doesn't encapsulate everything in a cell array. The above does return a cellstr array, but the textscan version returns that inside another cell that has to be dereferenced before being passed to regexp; another step.
  6 件のコメント
Guillaume
Guillaume 2017 年 5 月 12 日
Alternatively,
fmt = strjoin(repmat({'%f'}, 1, nc), ' ');

サインインしてコメントする。

その他の回答 (1 件)

Guillaume
Guillaume 2017 年 5 月 12 日
dpb answered your first question.
For your second question, unfortunately it cannot be done with readtable. You have no option but to read the file a second time. This can be done many ways. A fairly simple way would be
fid = fopen(file, 'rt');
headerlines = 10;
headers = cell(headerlines, 2);
for row = 1 : headerlines
headers(row, :) = strsplit(fgetl(fid), ': ');
end
fclose(fid);

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by