How do I scan in data whose values run into each other horizontally, so that they can't be recognized and separated correctly?

Question

Shahab Yafai, 18 July 2019

Direct link to this question:

https://jp.mathworks.com/matlabcentral/answers/472391-how-do-i-scan-in-data-that-is-flowing-on-each-other-horizontally-the-values-running-into-each-other

Commented: Shahab Yafai, 22 July 2019

I have a file 'test_file.txt' which contains:

some matrix            x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21 x22 x23 x24 x25 x26 x27 x28 x29 x30 x31 x32 x33 x34 x35 x36 x37 x38 x39 x40
                var1    99  88  55  23  27  88  79  99  52  66  71  81  15   5   6   7  2811562569 999 2593699 759 8561245 999 123  58  62  12   9

Each column in the matrix (i.e. x10, x11, ...) can hold a value of up to 4 digits, and the format the file is written in does not have a clean delimiter throughout. How can I read in the data and have my code specify a maximum width of 4 characters for each column?

Here is what I've got:

readfile = 'test_file.txt';
writefile = 'my_file.txt';
summaryFile = fopen(writefile,'wt');
fid = fopen(readfile,'r');
format_str = repmat('%4.0d',1,31);
header = fgetl(fid);
fprintf(summaryFile, '%s\n', header);
tline = fgetl(fid);
tmp2 = textscan(tline, ['%q ' format_str '\n']);
var = cell2mat(tmp2{1});
vals = cell2mat(tmp2(:,2:end));
fprintf(summaryFile, ['              %s    ' format_str], var, vals);

and this is what I get from the above code:

some matrix            x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21 x22 x23 x24 x25 x26 x27 x28 x29 x30 x31 x32 x33 x34 x35 x36 x37 x38 x39 x40   
              var1      99  88  55  23  27  88  79  99  52  66  71  81  15   5   6   728115625  69 9992593 699 7598561 245 999 123  58  62  12   9

The output I want is something like this:

some matrix            x10 x11 x12 x13 x14 x15 x16 x17 x18 x19 x20 x21 x22 x23 x24 x25 x26   x27   x28 x29 x30   x31 x32 x33   x34 x35 x36 x37 x38 x39 x40
                var1    99  88  55  23  27  88  79  99  52  66  71  81  15   5   6   7  28  1156  2569 999 259  3699 759 856  1245 999 123  58  62  12   9

Thanks in advance.


Answer 1

dpb, 18 July 2019

Direct link to this answer:

https://jp.mathworks.com/matlabcentral/answers/472391-how-do-i-scan-in-data-that-is-flowing-on-each-other-horizontally-the-values-running-into-each-other#answer_383850

Edited: dpb, 19 July 2019

Use

opts=detectImportOptions(readfile);

to create a base import options object for fixed-width text, then fix up the VariableNames, VariableWidths, VariableTypes and DataLines properties as needed, and use it with readtable.
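As a rough sketch of that route, the options object can also be built by hand with fixedWidthImportOptions rather than detected. Everything specific here is an assumption read off the posted sample, not something confirmed in the thread: a 22-character label field, 31 numeric fields of width 4, a single header line, and the variable name 'label'. The temporary file only stands in for the real input.

```matlab
% Sketch only: the field widths below are assumptions taken from the posted sample.
vals  = [99 88 55 23 27 88 79 99 52 66 71 81 15 5 6 7 28 1156 2569 999 ...
         259 3699 759 856 1245 999 123 58 62 12 9];
names = compose('x%d', 10:40);             % {'x10', ..., 'x40'}

% Recreate the two-line sample in a temporary file (hypothetical stand-in
% for the real input file).
readfile = [tempname '.txt'];
fid = fopen(readfile, 'wt');
fprintf(fid, '%-22s%s\n', 'some matrix', sprintf('%4s', names{:}));
fprintf(fid, '%-22s%s\n', [blanks(16) 'var1'], sprintf('%4d', vals));
fclose(fid);

% Build the options by hand: 1 label field + 31 numeric fields.
opts = fixedWidthImportOptions('NumVariables', 32);
opts.VariableNames  = [{'label'}, names];
opts.VariableWidths = [22, repmat(4, 1, 31)];
opts.VariableTypes  = [{'char'}, repmat({'double'}, 1, 31)];
opts.DataLines      = [2 Inf];             % skip the header line

T = readtable(readfile, opts);             % one row, 32 variables
delete(readfile);
```

If the real file has more lines around this block, DataLines would need to point at the actual data rows, and the label width would need checking against the real file.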

Alternatively, read the input file in as a character array and then reshape the numeric subsection into rows four characters wide, which can then be converted to numbers.

ADDENDUM:

To illustrate the latter, since you indicate this isn't a regular file and so the import object is probably not usable: if I take your var1 line of text after parsing off the leading text,

txt =
    '  99  88  55  23  27  88  79  99  52  66  71  81  15   5   6   7  2811562569 999 2593699 759 8561245 999 123  58  62  12   9'
>> str2num(reshape(txt,4,[]).')
ans =
          99
          88
          55
          23
          27
          88
          79
          99
          52
          66
          71
          81
          15
           5
           6
           7
          28
        1156
        2569
         999
         259
        3699
         759
         856
        1245
         999
         123
          58
          62
          12
           9
>>

Or, of course, if you can isolate the section of the file, then the import object can come into play again by identifying the first data row and the fields.

Comments

dpb, 19 July 2019

Edited: dpb, 20 July 2019

This can't be done with C format strings, because on input the field width means "read no more than N characters", except that leading whitespace is skipped and not counted. That's why your example goes wrong after the first run of fields that are filled to their full width.

C simply does not understand the concept of fixed-width fields; as far as it is concerned they don't exist.
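A small sketch of the difference, using five 4-character fields cut from the var1 line in the question. The variable names are only illustrative, and cellstr is used here so that str2double receives one piece of text per field.

```matlab
txt = '   7  2811562569 999';        % fields: '   7' '  28' '1156' '2569' ' 999'

% C-style width: skip leading blanks, then read at most 4 digits
cStyle = sscanf(txt, '%4d').';       % 7 2811 5625 69 999

% Fixed-width split: every 4 characters is one field, blanks included
fixedWidth = str2double(cellstr(reshape(txt, 4, []).')).';   % 7 28 1156 2569 999

disp(cStyle)
disp(fixedWidth)
```

The first result reproduces the 2811 / 5625 / 69 misparse shown in the question; the second gives the intended values.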

As noted, if you can't use the import options object you'll have to resort to loading the array into memory and parsing it individually.

As also noted, if you separate the numeric subset into an M-by-(4*N) character array, you can reshape() it by 4, transpose, then use str2double() and reshape() the output.
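A minimal sketch of those steps for more than one row, assuming the numeric block has already been cut out as a character matrix. The two rows below are made up from values in the question, and wrapping the fields in cellstr before str2double is my choice for feeding the conversion one field at a time, not something stated above.

```matlab
% Hypothetical M-by-(4*N) block: two rows of five width-4 fields (N = 5)
C = ['   7  2811562569 999';
     ' 2593699 759 8561245'];
w = 4;                               % field width
[M, nChars] = size(C);
N = nChars / w;                      % fields per row

% Transpose first so the column-major reshape walks along each row of C
fields = reshape(C.', w, []).';      % (M*N)-by-4 char, one field per row
v = str2double(cellstr(fields));     % (M*N)-by-1 numeric column
A = reshape(v, N, M).';              % back to M-by-N
```

A field that is entirely blank would come back as NaN with this approach, which may or may not be what you want.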

It would help to attach a pertinent section of the file...

Shahab Yafai, 22 July 2019

That will work,

Thanks for your time and patience.
