How do I read in fixed-width data whose values run into each other horizontally, so that they can't be recognized and separated correctly?

I have a file 'test_file.txt' which contains:
some matrix x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21 x22 x23 x24 x25 x26 x27 x28 x29 x30 x31 x32 x33 x34 x35 x36 x37 x38 x39 x40
var1 99 88 55 23 27 88 79 99 52 66 71 81 15 5 6 7 2811562569 999 2593699 759 8561245 999 123 58 62 12 9
Each column in the matrix (i.e. x10, x11, ...) can hold a value of up to 4 digits, and the format the file is written in does not have a clean delimiter throughout. How can I read in the data and have my code specify a maximum width of 4 characters for each column?
Here is what I've got:
readfile = 'test_file.txt';
writefile = 'my_file.txt';
summaryFile = fopen(writefile,'wt');
fid = fopen(readfile,'r');
format_str = repmat('%4.0d',1,31);
header = fgetl(fid);
fprintf(summaryFile, '%s\n', header);
tline = fgetl(fid);
tmp2 = textscan(tline, ['%q ' format_str '\n']);
var = cell2mat(tmp2{1});
vals = cell2mat(tmp2(:,2:end));
fprintf(summaryFile, [' %s ' format_str], var, vals);
And this is what I get from the above code:
some matrix x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21 x22 x23 x24 x25 x26 x27 x28 x29 x30 x31 x32 x33 x34 x35 x36 x37 x38 x39 x40
var1 99 88 55 23 27 88 79 99 52 66 71 81 15 5 6 728115625 69 9992593 699 7598561 245 999 123 58 62 12 9
The output I want is something like this:
some matrix x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21 x22 x23 x24 x25 x26 x27 x28 x29 x30 x31 x32 x33 x34 x35 x36 x37 x38 x39 x40
var1 99 88 55 23 27 88 79 99 52 66 71 81 15 5 6 7 28 1156 2569 999 259 3699 759 856 1245 999 123 58 62 12 9
Thanks in advance.

Accepted Answer

dpb on 18 Jul 2019
Edited: dpb on 19 Jul 2019
Use
opts=detectImportOptions(readfile);
to create a base import options object for fixed-width text, then fix up the VariableNames, VariableWidths, VariableTypes and DataLine properties as needed and use it with readtable.
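As a minimal sketch of the import-options route, the options object can also be built by hand with fixedWidthImportOptions rather than detected. Everything about the layout here is an assumption made for illustration, not taken from the real file: a 4-character label followed by 31 numeric fields of 4 characters each, one header line, and a temporary demo file with made-up values. The sketch uses the DataLines property name; whether that or DataLine applies may depend on the release.

```matlab
% Write a small demo file (hypothetical content, assumed layout).
nNum  = 31;                                   % number of numeric fields
vals  = [7 2811 5625 69 999 1:26];            % made-up values, 31 in total
fname = [tempname '.txt'];                    % hypothetical temporary file
fid   = fopen(fname, 'wt');
fprintf(fid, '%s\n', 'header line');          % line 1: header, skipped below
fprintf(fid, '%s%s\n', 'var1', sprintf('%4d', vals));   % line 2: data
fclose(fid);

% Describe the fixed-width layout explicitly.
opts = fixedWidthImportOptions('NumVariables', nNum + 1);
opts.VariableWidths = repmat(4, 1, nNum + 1);             % every field 4 wide
opts.VariableTypes  = [{'char'} repmat({'double'}, 1, nNum)];
opts.DataLines      = [2 Inf];                            % data starts on line 2

T = readtable(fname, opts);    % one row: label plus 31 numeric columns
nums = T{1, 2:end};            % numeric part as a row vector
delete(fname);
```

With real data, the widths, types and first data line would have to be set from the actual file layout.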
Alternatively, read the input file in as a character array and then reshape the numeric subsection into fields of width four, which can then be converted.
ADDENDUM:
To illustrate the latter, since you indicate it isn't a regular file and so the import object is probably not usable: if I take your var1 line of text after parsing off the leading text,
txt =
'  99  88  55  23  27  88  79  99  52  66  71  81  15   5   6   7  2811562569 999 2593699 759 8561245 999 123  58  62  12   9'
>> str2num(reshape(txt,4,[]).')
ans =
99
88
55
23
27
88
79
99
52
66
71
81
15
5
6
7
28
1156
2569
999
259
3699
759
856
1245
999
123
58
62
12
9
>>
Or, of course, if you can isolate the section of the file, then the import object can come into play again by identifying the first data row and the fields.
3 Comments
dpb on 19 Jul 2019
Edited: dpb on 20 Jul 2019
This can't be done with C format strings because, on input, the field width means "read not more than" N characters, except that leading whitespace is skipped and not included in the count. That's why your example goes out of step after the first fields that are filled to their full width.
C simply does not understand the concept of fixed-width fields--as far as it is concerned they don't exist.
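A small sketch of that behaviour, using a made-up string of two 4-character fields ('   7' and '2811') rather than the real file: the C-style width skips the leading blanks before it starts counting, while slicing the text into 4-character pieces first keeps the fields aligned.

```matlab
% Hypothetical input: two 4-character fields with no delimiter between them.
txt = '   72811';

% C-style width: leading blanks are skipped and not counted, so the first
% %4d is expected to run on into the digits of the second field.
cStyle = sscanf(txt, '%4d');

% Fixed-width slicing: cut the text into 4-character fields, then convert.
% cellstr makes each row of the char matrix its own field for str2double.
fields = reshape(txt, 4, []).';          % 2-by-4 char: '   7' and '2811'
fixedW = str2double(cellstr(fields));    % [7; 2811]
```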
As noted, if you can't use the import options object you'll have to resort to loading the array into memory and parsing it individually.
Also as noted, if you separate the numeric subset into an M-by-(4*N) character array, you can reshape() it by 4, transpose, then use str2double() and reshape() the output.
It would help to attach a pertinent section of the file...
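The reshape-and-convert steps described above might look roughly like this for several rows at once. The data, the variable names and the 2-by-5 size are hypothetical, chosen only to keep the sketch short, and it assumes every row has already had its leading label removed and is exactly 4*N characters long.

```matlab
% Hypothetical numeric block: M = 2 rows of N = 5 fields, 4 characters each,
% written without delimiters so that full-width values touch.
rows = [ 7 2811 5625   69 999;
        12    9 3699 1245  58];
C = [sprintf('%4d', rows(1,:)); sprintf('%4d', rows(2,:))];   % 2-by-20 char

w = 4;                        % field width
[M, nChar] = size(C);
N = nChar / w;                % fields per row

% Transpose first so each row's characters are contiguous in memory,
% then cut into 4-character pieces, one field per row of F.
F = reshape(C.', w, []).';    % (M*N)-by-4 char
v = str2double(cellstr(F));   % cellstr: convert each field separately
A = reshape(v, N, M).';       % back to an M-by-N numeric matrix
```

Here A should come back equal to rows; a field that is entirely blank would convert to NaN, which may or may not be what is wanted.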
Shahab Yafai on 22 Jul 2019
That will work,
Thanks for your time and patience.
