How to use regexp to read numbers preceded by text out of a data file

Question

Jean Volkmar on 5 Sep 2021


Edited: Jean Volkmar on 6 Sep 2021

Accepted answer: Walter Roberson


Hello, I tried to extract numbers from a data file with regexp, but I just can't get it to work.

The file is structured like this:

Time = 10001
smoothSolver:  Solving for Ux, Initial residual = 1.660195e-07, Final residual = 3.2635415e-09, No Iterations 2
smoothSolver:  Solving for Uy, Initial residual = 0.0010739333, Final residual = 1.1651965e-07, No Iterations 1
smoothSolver:  Solving for Uz, Initial residual = 1.1728104e-06, Final residual = 2.3091287e-08, No Iterations 2
GAMG:  Solving for p, Initial residual = 1.0923774e-05, Final residual = 7.2745974e-07, No Iterations 3
time step continuity errors : sum local = 0.34757443, global = 0.0029670501, cumulative = 0.0029670501
ExecutionTime = 0.92 s  ClockTime = 3 s

I want to extract the values that follow "Solving for Ux, Initial residual =", "Solving for Uy, Initial residual =", "Solving for Uz, Initial residual =", "Solving for p, Initial residual =" and "ClockTime =".

I tried testing this with the following code:

S = fileread(path);
residuals = regexp(S,"(?<=ClockTime\s=\s)\d*");

This should find any run of consecutive digits following the string "ClockTime = ", but instead it gives me the numbers 595 and 1193, which the file does not even contain as consecutive digits.

I read this already but it is not really helping me.

Thanks for any help :)


Answer 1

Walter Roberson on 5 Sep 2021


residuals = regexp(S,"(?<=ClockTime\s=\s)\d+", 'match');

The numbers you are getting back are the default output of regexp, which is the starting position of each match relative to the start of the string. Asking for the 'match' output returns the matched text instead.
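To make the difference concrete, here is a minimal sketch on two lines shaped like the ExecutionTime/ClockTime line from the question. The second line's values (1.85 and 4) are invented for illustration, and the variable names are my own.

```matlab
% Two sample lines; the second one uses made-up values for illustration.
S = sprintf('%s\n', ...
    'ExecutionTime = 0.92 s  ClockTime = 3 s', ...
    'ExecutionTime = 1.85 s  ClockTime = 4 s');

pat = '(?<=ClockTime\s=\s)\d+';      % digits preceded by "ClockTime = "

startIdx  = regexp(S, pat);          % default output: start index of each match
matched   = regexp(S, pat, 'match'); % matched text, as a cell array of char vectors
clockTime = str2double(matched);     % convert the text to a numeric row vector
```

Here startIdx holds character positions within S, while clockTime holds the actual values, which is usually what you want.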

Note: for your work you might want to use named tokens.

residuals = regexp(S,"(?<=Initial\sresidual\s=\s)(?<XR>\S+).*(?<=Initial\sresidual\s=\s)(?<YR>\S+).*(?<=Initial\sresidual\s=\s)(?<ZR>\S+).*(?<=Initial\sresidual\s=\s)(?<PR>\S+)(?<=ClockTime\s=\s)(?<CT>\d+)", 'names');

If all went well and I did not make mistakes in the pattern, this would return a struct array with fields XR, YR, ZR, PR and CT, each of which holds the matched text, with XR corresponding to 'Solving for Ux', YR to the next line, ZR to the third line, PR to the GAMG line, and CT to the ClockTime.

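You would then convert the captured text to numbers:

% str2double parses the text; double() applied to text would return character codes instead.
Ux = str2double(string({residuals.XR})); Uy = str2double(string({residuals.YR})); Uz = str2double(string({residuals.ZR})); p = str2double(string({residuals.PR}));
clocks = str2double(string({residuals.CT}));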

Comments

Walter Roberson on 5 Sep 2021

The search is done from the beginning. The first residual encountered is saved in XR, the second in YR, and so on. Ah, I guess I should modify to

residuals = regexp(S,"(?<=Initial\sresidual\s=\s)(?<XR>\S+).*?(?<=Initial\sresidual\s=\s)(?<YR>\S+).*?(?<=Initial\sresidual\s=\s)(?<ZR>\S+).*?(?<=Initial\sresidual\s=\s)(?<PR>\S+).*?(?<=ClockTime\s=\s)(?<CT>\d+)", 'names');

so that each .*? advances only to the nearest following "residual" instead of running to the end of the file, which is what a greedy .* does. I was also missing a .* before the ClockTime part.

\S+ is "all non-space". Ummm, that's going to grab the commas too... okay, modify that to end at comma...

residuals = regexp(S,"(?<=Initial\sresidual\s=\s)(?<XR>[^,]+).*?(?<=Initial\sresidual\s=\s)(?<YR>[^,]+).*?(?<=Initial\sresidual\s=\s)(?<ZR>[^,]+).*?(?<=Initial\sresidual\s=\s)(?<PR>[^,]+).*?(?<=ClockTime\s=\s)(?<CT>\S+)", 'names');

Non-comma should be able to handle reading 1.660195e-07 (exponential notation) and 0.0010739333 (not exponential) and even possibly plain integers (I can tell that a %g format is being used, so if the value just happened to be an integer it would not emit any decimal places at all).

The last match I changed from \d+ to \S+ in case at some point there is a non-integer time in the file. It does not need to check for comma as only the residuals have comma.
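Putting the pieces together, here is a self-contained sketch that can be pasted into MATLAB. It rebuilds the excerpt from the question as text, duplicates it as a second time step with a hypothetical ClockTime of 4, and applies a named-token pattern. The pattern anchors on "Initial residual" so that the "Final residual" values on the same lines are skipped. All variable names are illustrative only.

```matlab
% One time step, copied from the excerpt in the question.
step1 = sprintf('%s\n', ...
    'Time = 10001', ...
    'smoothSolver:  Solving for Ux, Initial residual = 1.660195e-07, Final residual = 3.2635415e-09, No Iterations 2', ...
    'smoothSolver:  Solving for Uy, Initial residual = 0.0010739333, Final residual = 1.1651965e-07, No Iterations 1', ...
    'smoothSolver:  Solving for Uz, Initial residual = 1.1728104e-06, Final residual = 2.3091287e-08, No Iterations 2', ...
    'GAMG:  Solving for p, Initial residual = 1.0923774e-05, Final residual = 7.2745974e-07, No Iterations 3', ...
    'time step continuity errors : sum local = 0.34757443, global = 0.0029670501, cumulative = 0.0029670501', ...
    'ExecutionTime = 0.92 s  ClockTime = 3 s');

% Hypothetical second time step: same residuals, ClockTime changed to 4.
step2 = strrep(step1, 'ClockTime = 3 s', 'ClockTime = 4 s');
S = [step1 step2];

% Four initial residuals in order of appearance, then the clock time.
% .*? is lazy, so it stops at the nearest following label.
pat = ['(?<=Initial\sresidual\s=\s)(?<XR>[^,]+).*?' ...
       '(?<=Initial\sresidual\s=\s)(?<YR>[^,]+).*?' ...
       '(?<=Initial\sresidual\s=\s)(?<ZR>[^,]+).*?' ...
       '(?<=Initial\sresidual\s=\s)(?<PR>[^,]+).*?' ...
       '(?<=ClockTime\s=\s)(?<CT>\S+)'];

res = regexp(S, pat, 'names');       % 1-by-n struct array, one element per time step

% Collect each field across time steps and convert the text to numbers.
Ux     = str2double({res.XR});
Uy     = str2double({res.YR});
Uz     = str2double({res.ZR});
p      = str2double({res.PR});
clocks = str2double({res.CT});
```

Each output is a row vector with one entry per time step, so plotting Ux against clocks is a one-liner afterwards. If a time step in the real file is missing one of the five lines, this single-pattern approach will pair values from different steps, so it is worth checking numel(res) against the expected number of steps.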

Walter Roberson on 5 Sep 2021

I will experiment later after I have had some sleep.

Jean Volkmar on 6 Sep 2021 (edited 6 Sep 2021)

Okay, so I also had a go at experimenting. In the end I did not get it to work as nicely, but it is better than nothing.

Here is what I did:

path(1) = ""; % Note: a variable named path shadows MATLAB's built-in path function
residuals = cell(length(path),5); % Preallocate: one row per file, one column per quantity
for j = 1:1:length(path)

    S = fileread(path(j)); % Read the whole file as text
    % In each pattern the lookbehind matches the label text, and [^,]+ then takes one or more characters up to, but not including, the next comma.
    residuals{j,1} = regexp(S,"(?<=Ux,\sInitial\sresidual\s=\s)[^,]+","match"); % Values after "Ux, Initial residual = "
    residuals{j,2} = regexp(S,"(?<=Uy,\sInitial\sresidual\s=\s)[^,]+","match"); % Values after "Uy, Initial residual = "
    residuals{j,3} = regexp(S,"(?<=Uz,\sInitial\sresidual\s=\s)[^,]+","match"); % Values after "Uz, Initial residual = "
    residuals{j,4} = regexp(S,"(?<=p,\sInitial\sresidual\s=\s)[^,]+","match"); % Values after "p, Initial residual = "
    residuals{j,5} = regexp(S,"(?<=ClockTime\s=\s)\d+","match"); % One or more consecutive digits after "ClockTime = "
end

After that I saved the numbers into an array like this:

ux = zeros(length(path),length(residuals{1})); %Preallocate
for k = 1:1:length(path)
    for j = 1:1:length(residuals{1})
        ux(k,j) = str2double(residuals{k,1}(1,j)); %Convert string inside the cell to double
    end
end
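As a side note, the inner loop is not strictly needed, because str2double accepts a whole cell array (or string array) of matches at once. A minimal sketch, using the Ux line from the question repeated three times as stand-in file content (the variable names are illustrative):

```matlab
% Stand-in for the file content: the Ux line from the question, three times.
uxLine = sprintf('%s\n', ...
    'smoothSolver:  Solving for Ux, Initial residual = 1.660195e-07, Final residual = 3.2635415e-09, No Iterations 2');
S = repmat(uxLine, 1, 3);

% All Ux initial residuals as text, then converted in a single call.
matches = regexp(S, '(?<=Ux,\sInitial\sresidual\s=\s)[^,]+', 'match');
ux = str2double(matches);            % 1-by-3 numeric row vector
```

Inside the loop over files this would become one assignment per file, provided every file yields the same number of matches.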
