How to extract specific data from a .txt file

Question

Abubakar Rashid, 31 January 2023

Direct link to this question: https://jp.mathworks.com/matlabcentral/answers/1904160-how-to-extract-specific-data-from-a-txt-file

Commented: Voss, 1 February 2023

FILE.TXT

Hi Everyone,

I am trying to extract location-specific data from a .txt file. The text file is not uniformly arranged and contains all kinds of characters.

In my case, I want to extract the values of Fx and Fy in row 17 of FILE.TXT. I have attached the .txt file.

Looking forward to getting this resolved.

Thank you and regards,


Answer 1 (accepted answer)

Voss, 31 January 2023

Direct link to this answer: https://jp.mathworks.com/matlabcentral/answers/1904160-how-to-extract-specific-data-from-a-txt-file#answer_1160820

type FILE.TXT % show the contents of FILE.TXT, for reference
*******************************************************************************
 BCASE=1, RPM: 0.1000E+05 FREQ: 0.1667    [Hz] PS,PA:  2.758       2.758     [bar]
*******************************************************************************
 -------------------------------------------------
 Option (32): Ex,Ey: Centered or Off-centered seal
 -------------------------------------------------
 +-----------------------------------------------------------------------------+
 |HsealH=> SOLN.FOUND IN   86 Iter                                             |
 |    Ex=0.050,Ey=-.050; Ec=0.071; Ax= 0.0000E+00,Ay= 0.0000E+00 ZO= 0.1511E-01|
 |    rpm= 0.10000E+05  PS= 0.27579E+06  PA= 0.27579E+06 Cseal= 0.00000E+00    |
 +-----------------------------------------------------------------------------+
 |    Exo=0.05000, Eyo=-.05000  Preload d/C=0.00000                            |
 |    Mass  Flow= 0.22167E-06 Kg/s =  0.13300E-04 Kg/min                       |
 |    Mass Flow/Rhos= 0.16729E-04 lt/min                                       |
 +-----------------------------------------------------------------------------+
 |    FX= 0.21993E+03 N ; FY= 0.21822E+03 N  LOAD= 0.30982E+03 N;Angle= 224.78D|
 +-----------------------------------------------------------------------------+
 |    MX=-0.11444E-01 Nm; MY= 0.11704E-01 Nm                                   |
 +-----------------------------------------------------------------------------+
 |    Torque on Film Lands= 0.42679E+00 N-m                                    |
 +-----------------------------------------------------------------------------+
 ...............................................................................
data = readlines('FILE.TXT'); % read the file

Here's one way to get the values of FX and FY from line 17:

line_to_get = 17;
result = regexp(data{line_to_get},'(\w+)= ?([\.\dE+-]+)','tokens');
result = vertcat(result{:})
result = 4×2 cell array
    {'FX'   }    {'0.21993E+03'}
    {'FY'   }    {'0.21822E+03'}
    {'LOAD' }    {'0.30982E+03'}
    {'Angle'}    {'224.78'     }
result = str2double(result(ismember(result(:,1),{'FX','FY'}),2))
result = 2×1
  219.9300
  218.2200

The same regular expression can be used to get other stuff out of the file too:

line_to_get = 10;
result = regexp(data{line_to_get},'(\w+)= ?([\.\dE+-]+)','tokens');
result = vertcat(result{:})
result = 6×2 cell array
    {'Ex'}    {'0.050'     }
    {'Ey'}    {'-.050'     }
    {'Ec'}    {'0.071'     }
    {'Ax'}    {'0.0000E+00'}
    {'Ay'}    {'0.0000E+00'}
    {'ZO'}    {'0.1511E-01'}
line_to_get = 11;
result = regexp(data{line_to_get},'(\w+)= ?([\.\dE+-]+)','tokens');
result = vertcat(result{:})
result = 4×2 cell array
    {'rpm'  }    {'0.10000E+05'}
    {'PS'   }    {'0.27579E+06'}
    {'PA'   }    {'0.27579E+06'}
    {'Cseal'}    {'0.00000E+00'}

You would use str2double to convert what you need from the second columns of those cell arrays into numbers.
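As a minimal sketch of that conversion step, the snippet below retypes by hand the token list shown above for line 11, so it runs without FILE.TXT. The variable names tokens, values and rpm are my own, not from the answer.

```matlab
% Name/value tokens as returned for line 11 above (typed in by hand here
% so the snippet runs without FILE.TXT).
tokens = {'rpm',   '0.10000E+05'; ...
          'PS',    '0.27579E+06'; ...
          'PA',    '0.27579E+06'; ...
          'Cseal', '0.00000E+00'};

% str2double accepts a cell array of character vectors and returns a
% numeric array of the same size.
values = str2double(tokens(:,2));

% Pick out one value by its parameter name.
rpm = values(strcmp(tokens(:,1), 'rpm'));
```

Looking a value up by name this way avoids depending on the order in which the parameters appear on the line.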


Voss, 31 January 2023


Or use the same regular expression to parse what you can from the file all at once:

fid = fopen('FILE.TXT');
data = fread(fid,'*char').';
fclose(fid);
result = regexp(data,'(\w+)= ?([\.\dE+-]+)','tokens');
result = vertcat(result{:})
result = 23×2 cell array
    {'BCASE'}    {'1'           }
    {'Ex'   }    {'0.050'       }
    {'Ey'   }    {'-.050'       }
    {'Ec'   }    {'0.071'       }
    {'Ax'   }    {'0.0000E+00'  }
    {'Ay'   }    {'0.0000E+00'  }
    {'ZO'   }    {'0.1511E-01'  }
    {'rpm'  }    {'0.10000E+05' }
    {'PS'   }    {'0.27579E+06' }
    {'PA'   }    {'0.27579E+06' }
    {'Cseal'}    {'0.00000E+00' }
    {'Exo'  }    {'0.05000'     }
    {'Eyo'  }    {'-.05000'     }
    {'C'    }    {'0.00000'     }
    {'Flow' }    {'0.22167E-06' }
    {'Rhos' }    {'0.16729E-04' }
    {'FX'   }    {'0.21993E+03' }
    {'FY'   }    {'0.21822E+03' }
    {'LOAD' }    {'0.30982E+03' }
    {'Angle'}    {'224.78'      }
    {'MX'   }    {'-0.11444E-01'}
    {'MY'   }    {'0.11704E-01' }
    {'Lands'}    {'0.42679E+00' }

And then put everything into a struct of parameter values:

result(:,2) = num2cell(str2double(result(:,2)));
result = result.';
parameters = struct(result{:})
parameters = struct with fields:
    BCASE: 1
       Ex: 0.0500
       Ey: -0.0500
       Ec: 0.0710
       Ax: 0
       Ay: 0
       ZO: 0.0151
      rpm: 10000
       PS: 275790
       PA: 275790
    Cseal: 0
      Exo: 0.0500
      Eyo: -0.0500
        C: 0
     Flow: 2.2167e-07
     Rhos: 1.6729e-05
       FX: 219.9300
       FY: 218.2200
     LOAD: 309.8200
    Angle: 224.7800
       MX: -0.0114
       MY: 0.0117
    Lands: 0.4268
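Once the values are in a struct, they can be read by field name. One caveat, as far as I know: struct errors on repeated field names, so if a file held several result blocks (an assumption; the attached file shows only one) the call above would fail. The sketch below uses a hand-typed token list with a deliberately repeated name, and my own variable names, to keep the first occurrence of each name before building the struct.

```matlab
% Hand-typed name/value tokens with a deliberately repeated name (FX),
% standing in for a file that holds more than one result block.
result = {'FX', '0.21993E+03'; ...
          'FY', '0.21822E+03'; ...
          'FX', '0.30000E+03'};

% Keep the first occurrence of each name, preserving the original order.
[~, first_idx] = unique(result(:,1), 'stable');
result = result(first_idx, :);

% Convert the values and build the struct the same way as above.
result(:,2) = num2cell(str2double(result(:,2)));
result = result.';
parameters = struct(result{:});

% Read values by field name.
F = [parameters.FX, parameters.FY];
```

Keeping the first occurrence is only one possible policy; keeping the last one, or collecting repeated values into an array per name, may suit other files better.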

Abubakar Rashid, 1 February 2023

Hi Voss,

Your method seems really helpful. Can you explain this line?

result = regexp(data{line_to_get},'(\w+)= ?([\.\dE+-]+)','tokens');

Also, I tried it on line 18, but it didn't work. Can you please explain? Thank you!

Voss, 1 February 2023


result = regexp(data{line_to_get},'(\w+)= ?([\.\dE+-]+)','tokens');

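This line uses regexp to find every match of the regular expression '(\w+)= ?([\.\dE+-]+)' in the chosen line and, because of the 'tokens' option, to return the captured pieces of each match.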

Here's a breakdown of that regular expression:

  (\w+)= ?([\.\dE+-]+)
%  ^^^                     match 1 or more alphabetic, numeric, or underscore character(s)
%      ^                   followed by an equal sign
%       ^^                 followed by an optional space
%          ^^^^^^^^^^      followed by one or more: periods (\.), decimal digits (\d), capital "E"s, plus signs (+), and/or minus signs (-)
% ^   ^                    use parentheses to capture and return the parameter name
%         ^          ^     use parentheses to capture and return the parameter value
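To see the breakdown in action, here is a small sketch that applies the same expression to the FX/FY line copied from the file listing. The variable names txt, tok, names and values are mine.

```matlab
% One line copied from the FILE.TXT listing.
txt = ' |    FX= 0.21993E+03 N ; FY= 0.21822E+03 N  LOAD= 0.30982E+03 N;Angle= 224.78D|';

% With 'tokens', regexp returns one cell per match; each of those cells
% holds the two captured pieces {name, value}.
tok = regexp(txt, '(\w+)= ?([\.\dE+-]+)', 'tokens');

% Split the matches into a list of names and a numeric vector of values.
names  = cellfun(@(t) t{1}, tok, 'UniformOutput', false);
values = cellfun(@(t) str2double(t{2}), tok);
```

The trailing unit letters (N, D) are left out of the captured values because they are not in the character class [\.\dE+-].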

Line 18 looks like this:

+-----------------------------------------------------------------------------+

so it should not match that regular expression. In other words, it's not surprising that regexp returned no matches with that line.
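If the line number might shift between files, one option is to look the line up by its content instead of hard-coding 17. This is only a sketch: the three-line string array stands in for the output of readlines, and searching for the text "FX=" is my assumption about what identifies the line.

```matlab
% Stand-in for data = readlines('FILE.TXT'): three lines copied from the
% listing, held in a string array.
data = [" +-----------------------------------------------------------------------------+"; ...
        " |    FX= 0.21993E+03 N ; FY= 0.21822E+03 N  LOAD= 0.30982E+03 N;Angle= 224.78D|"; ...
        " +-----------------------------------------------------------------------------+"];

pattern = '(\w+)= ?([\.\dE+-]+)';

% A separator line gives an empty result: there is nothing to extract.
no_match = isempty(regexp(data{3}, pattern, 'tokens'));

% Locate the first line that contains "FX=" instead of assuming line 17.
line_to_get = find(contains(data, "FX="), 1);
result = regexp(data{line_to_get}, pattern, 'tokens');
result = vertcat(result{:});
```

Checking isempty on the result before indexing into it is a cheap way to catch lines that hold no name/value pairs.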

Note that you can use that same regular expression on the contents of the entire file to capture all the parameter names and values, as I showed in the first comment to my answer.


Answer 2

Walter Roberson, 31 January 2023

Direct link to this answer: https://jp.mathworks.com/matlabcentral/answers/1904160-how-to-extract-specific-data-from-a-txt-file#answer_1160810

filename = 'FILE.TXT';
S = fileread(filename);
XY = str2double(regexp(S, '(?<=F[XY]= )\S+', 'match'))
XY = 1×2
  219.9300  218.2200


Walter Roberson, 31 January 2023


[XY] matches a single character that is either X or Y, so F[XY]= followed by a space matches F, then either X or Y, then =, then a space. The (?<= ) construct is a lookbehind: "FX= " or "FY= " must be present in the input immediately before the current position, but that text is not included in what the function returns. It positions the match without extracting anything itself.

The \S+ matches one or more non-whitespace characters.

So the code looks for FX= or FY= in the input and extracts the run of non-whitespace characters that follows the space. The search is repeated as long as there are more occurrences in the input, so both FX and FY are extracted.

The return from regexp() with 'match' is going to be a cell array of character vectors. Those character vectors are passed to str2double() to be converted to numeric form.

Consider, for example, this input:

A B C FX= 1325.3 Q= -5 FY= -23.2 P ZFY= 83

The expression as coded does not require whitespace before the variable name, so ZFY= would be matched as an occurrence of FY= . The regexp() call would therefore return {'1325.3', '-23.2', '83'} in this particular case. That could be adjusted if it mattered (but it doesn't matter for your file).
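One possible adjustment, sketched here as my own variant rather than as part of the answer: require start-of-text or whitespace in front of the F, and capture the value as a token instead of relying on the lookbehind. The variable names str, loose, tok and strict are hypothetical.

```matlab
str = 'A B C FX= 1325.3 Q= -5 FY= -23.2 P ZFY= 83';

% Original pattern: also picks up the value that follows ZFY=.
loose = regexp(str, '(?<=F[XY]= )\S+', 'match');

% Stricter variant: the F must sit at the start of the text or right after
% whitespace. (?: ) groups without capturing, so the only token is the value.
tok = regexp(str, '(?:^|\s)F[XY]= (\S+)', 'tokens');

% Each match is a 1x1 cell; flatten them into one cell array and convert.
strict = str2double([tok{:}]);
```

With this input the stricter pattern skips ZFY= because the F there is preceded by Z rather than whitespace.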

Abubakar Rashid, 31 January 2023

Okay thank you!

I highly appreciate your efforts. It makes a lot of sense now.
