looking for regular expression to parse sparse data

Question

Hi,

I have a sparse mass matrix exported from ANSYS, and the data looks as follows:

[ 1, 1]: 1.157e-07 [ 1, 4]: 2.332e-08 [ 1, 7]: 2.146e-08 [ 1, 10]: 5.835e-08 [ 1, 13]: 4.043e-08 [ 1, 16]: 1.011e-08 [ 1, 19]: 8.211e-09 [ 1, 22]: 2.590e-08 [ 1, 25]:-3.475e-08 [ 1, 28]:-2.854e-08 [ 1, 31]:-2.987e-08 [ 1, 34]:-8.897e-08 [ 1, 37]:-1.351e-08 [ 1, 40]:-8.564e-09 [ 1, 43]:-9.072e-09 [ 1, 46]:-3.556e-08 [ 1, 49]:-6.093e-08 [ 1, 52]:-1.343e-08 [ 1, 55]:-8.914e-09 [ 1, 58]:-3.609e-08 [ 1, 61]:-3.609e-08 [ 1, 64]:-6.093e-08 [ 1, 67]:-1.343e-08 [ 1, 70]:-8.914e-09 [ 1, 118]: 5.625e-08 [ 1, 121]: 2.883e-08 [ 1, 130]: 2.507e-08 [ 1, 133]: 1.102e-08 [ 1, 142]:-3.891e-08 [ 1, 154]:-1.175e-08 [ 1, 166]:-3.459e-08 [ 1, 169]:-1.171e-08 [ 1, 181]:-1.171e-08 [ 1, 184]:-3.459e-08 [ 1, 187]:-8.513e-08 [ 1, 190]:-3.947e-08 [ 1, 193]:-3.466e-08 [ 1, 196]:-1.196e-08 [ 1, 958]: 1.944e-08 [ 1, 964]: 7.516e-09 [ 1, 970]:-2.705e-08 [ 1, 979]:-8.340e-09 [ 1, 988]:-7.965e-09 [ 1, 994]:-7.965e-09 [ 1, 1021]: 2.166e-08 [ 1, 1024]: 9.467e-09 [ 1, 1027]:-2.557e-08 [ 1, 1030]:-3.156e-08 [ 1, 1033]:-7.830e-09 [ 1, 1036]:-1.295e-08 [ 1, 1039]:-1.246e-08 [ 1, 1042]:-1.246e-08

I'm looking to put this into a dense matrix, but it would be good enough to store all the items in an N-by-3 cell array with columns x, y, and data, where the regular expression reads to the end of the file.

I would then search the cell array for the largest indices (x, y), initialize an array of that size, and copy the data from the cell array into the matrix.

Is this possible?

Answer 1

Star Strider, 13 Nov 2020

This uses one regexp call to split the text at each '[' into cells, reads the rejoined cells with sscanf, and then uses the reshape function in the ‘Out’ assignment to arrange the values into three rows. It may not be exactly what you intended (I doubt that is possible); however, it has the virtue of producing the desired result:

M = '[     1,     1]: 1.157e-07 [     1,     4]: 2.332e-08 [     1,     7]: 2.146e-08 [     1,    10]: 5.835e-08 [     1,    13]: 4.043e-08 [     1,    16]: 1.011e-08 [     1,    19]: 8.211e-09 [     1,    22]: 2.590e-08 [     1,    25]:-3.475e-08 [     1,    28]:-2.854e-08 [     1,    31]:-2.987e-08 [     1,    34]:-8.897e-08 [     1,    37]:-1.351e-08 [     1,    40]:-8.564e-09 [     1,    43]:-9.072e-09 [     1,    46]:-3.556e-08 [     1,    49]:-6.093e-08 [     1,    52]:-1.343e-08 [     1,    55]:-8.914e-09 [     1,    58]:-3.609e-08 [     1,    61]:-3.609e-08 [     1,    64]:-6.093e-08 [     1,    67]:-1.343e-08 [     1,    70]:-8.914e-09 [     1,   118]: 5.625e-08 [     1,   121]: 2.883e-08 [     1,   130]: 2.507e-08 [     1,   133]: 1.102e-08 [     1,   142]:-3.891e-08 [     1,   154]:-1.175e-08 [     1,   166]:-3.459e-08 [     1,   169]:-1.171e-08 [     1,   181]:-1.171e-08 [     1,   184]:-3.459e-08 [     1,   187]:-8.513e-08 [     1,   190]:-3.947e-08 [     1,   193]:-3.466e-08 [     1,   196]:-1.196e-08 [     1,   958]: 1.944e-08 [     1,   964]: 7.516e-09 [     1,   970]:-2.705e-08 [     1,   979]:-8.340e-09 [     1,   988]:-7.965e-09 [     1,   994]:-7.965e-09 [     1,  1021]: 2.166e-08 [     1,  1024]: 9.467e-09 [     1,  1027]:-2.557e-08 [     1,  1030]:-3.156e-08 [     1,  1033]:-7.830e-09 [     1,  1036]:-1.295e-08 [     1,  1039]:-1.246e-08 [     1,  1042]:-1.246e-08';
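V = regexp(M, '\[', 'split');        % split the text at every '['
R = sscanf([V{:}], '%d,%d]: %f');    % read x, y, value from the rejoined pieces
Out = reshape(R, 3, []);             % one column per entry: [x; y; value]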

with:

FirstFiveColumns = Out(:,1:5)

producing:

FirstFiveColumns =
            1            1            1            1            1
            1            4            7           10           13
    1.157e-07    2.332e-08    2.146e-08    5.835e-08    4.043e-08

with ‘x’ being the first row, ‘y’ being the second row, and the floating-point variables (I have no idea what they represent) the third row.
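For the next step the question asks about (turning the triplets into a matrix), here is a minimal sketch. It assumes the first row of Out holds row indices, the second holds column indices, and each (row, column) pair appears only once, since sparse sums the values of repeated pairs. The short Out below is a stand-in for the full parsed result:

```matlab
% Stand-in for the parsed 3-by-N result: rows are [row index; column index; value].
Out = [1 1 1; 1 4 7; 1.157e-07 2.332e-08 2.146e-08];

% sparse sizes the matrix from the largest row and column index it is given.
S = sparse(Out(1,:), Out(2,:), Out(3,:));

% Convert to dense only if the full matrix fits in memory.
A = full(S);
```

Keeping the result sparse avoids allocating storage for all the zero entries, which may matter for a large export.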

Stephen23, 14 Nov 2020 (edited 14 Nov 2020)

Without regexp or reshape, sscanf can parse it directly:

format long
str = '[     1,     1]: 1.157e-07 [     1,     4]: 2.332e-08 [     1,     7]: 2.146e-08 [     1,    10]: 5.835e-08 [     1,    13]: 4.043e-08 [     1,    16]: 1.011e-08 [     1,    19]: 8.211e-09 [     1,    22]: 2.590e-08 [     1,    25]:-3.475e-08 [     1,    28]:-2.854e-08 [     1,    31]:-2.987e-08 [     1,    34]:-8.897e-08 [     1,    37]:-1.351e-08 [     1,    40]:-8.564e-09 [     1,    43]:-9.072e-09 [     1,    46]:-3.556e-08 [     1,    49]:-6.093e-08 [     1,    52]:-1.343e-08 [     1,    55]:-8.914e-09 [     1,    58]:-3.609e-08 [     1,    61]:-3.609e-08 [     1,    64]:-6.093e-08 [     1,    67]:-1.343e-08 [     1,    70]:-8.914e-09 [     1,   118]: 5.625e-08 [     1,   121]: 2.883e-08 [     1,   130]: 2.507e-08 [     1,   133]: 1.102e-08 [     1,   142]:-3.891e-08 [     1,   154]:-1.175e-08 [     1,   166]:-3.459e-08 [     1,   169]:-1.171e-08 [     1,   181]:-1.171e-08 [     1,   184]:-3.459e-08 [     1,   187]:-8.513e-08 [     1,   190]:-3.947e-08 [     1,   193]:-3.466e-08 [     1,   196]:-1.196e-08 [     1,   958]: 1.944e-08 [     1,   964]: 7.516e-09 [     1,   970]:-2.705e-08 [     1,   979]:-8.340e-09 [     1,   988]:-7.965e-09 [     1,   994]:-7.965e-09 [     1,  1021]: 2.166e-08 [     1,  1024]: 9.467e-09 [     1,  1027]:-2.557e-08 [     1,  1030]:-3.156e-08 [     1,  1033]:-7.830e-09 [     1,  1036]:-1.295e-08 [     1,  1039]:-1.246e-08 [     1,  1042]:-1.246e-08';
mat = sscanf(str,'[%d,%d]:%f ',[3,Inf]).'
mat = 52×3
   1.000000000000000   1.000000000000000   0.000000115700000
   1.000000000000000   4.000000000000000   0.000000023320000
   1.000000000000000   7.000000000000000   0.000000021460000
   1.000000000000000  10.000000000000000   0.000000058350000
   1.000000000000000  13.000000000000000   0.000000040430000
   1.000000000000000  16.000000000000000   0.000000010110000
   1.000000000000000  19.000000000000000   0.000000008211000
   1.000000000000000  22.000000000000000   0.000000025900000
   1.000000000000000  25.000000000000000  -0.000000034750000
   1.000000000000000  28.000000000000000  -0.000000028540000
   ...
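The plan described in the question (find the largest indices, preallocate, then copy) could be sketched as follows from an N-by-3 result like mat. The three-row mat here is a stand-in, and the variable names nRows, nCols and idx are illustrative:

```matlab
% Stand-in for the first rows of the N-by-3 result: [row, column, value].
mat = [1 1 1.157e-07; 1 4 2.332e-08; 1 7 2.146e-08];

% Size the dense matrix from the largest row and column index.
nRows = max(mat(:,1));
nCols = max(mat(:,2));
A = zeros(nRows, nCols);

% Convert (row, column) pairs to linear indices and copy the values across.
idx = sub2ind([nRows, nCols], mat(:,1), mat(:,2));
A(idx) = mat(:,3);
```

No cell array is needed, because sscanf already returns a numeric matrix.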

Stephen23, 3 Jan 2021

"both of the answers above work if i have the data in a 'string'. However... it comes in as a 1x270000000 character vector. ... it still wont work."

I very much doubt that it would make any difference.

The code in my comment already uses a character vector, not a string. The equivalent string would give exactly the same output, because sscanf accepts either a character vector or a string scalar; it makes zero difference. Let's try it:

Character vector:

str = '[     1,     1]: 1.157e-07 [     1,     4]: 2.332e-08'; % char vector
mat = sscanf(str,'[%d,%d]:%f ',[3,Inf]).'
mat = 2×3
    1.0000    1.0000    0.0000
    1.0000    4.0000    0.0000

String:

str = "[     1,     1]: 1.157e-07 [     1,     4]: 2.332e-08"; % string
mat = sscanf(str,'[%d,%d]:%f ',[3,Inf]).'
mat = 2×3
    1.0000    1.0000    0.0000
    1.0000    4.0000    0.0000

Most likely your character vector does not have the exact format that you showed us in your original question, e.g. it contains some leading characters, non-printing characters, or some other difference. Both Star Strider's and my code rely on the input having the exact format that you showed in your question.
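One way to check for this kind of mismatch is to ask sscanf for its additional outputs and to look at the first few character codes. This is only a sketch: the text 'HEADER' and the variable names are made up for illustration, not taken from the actual file.

```matlab
% Hypothetical input: a short header line followed by data in the expected format.
str = sprintf('HEADER\n[     1,     1]: 1.157e-07 [     1,     4]: 2.332e-08');

% The extra outputs report how many values were read, any error message,
% and the position at which scanning stopped.
[vals, count, errmsg, nextindex] = sscanf(str, '[%d,%d]:%f ', [3, Inf]);

% Show the leading characters as numeric codes to spot stray or
% non-printing characters in front of the first '['.
leadingCodes = double(str(1:min(20, numel(str))));
```

If count is far smaller than expected, the characters around nextindex are a good place to look for the difference.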

Tyler, 3 Jan 2021

Thank you, this is correct. There was one line of header in the file.

Thanks so much
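For a file with a single header line like this, one possible fix is to discard everything before the first '[' and then parse as before. The sketch below assumes the header itself contains no '['. The header text is invented, and in practice txt would come from reading the file (for example with fileread) rather than from sprintf.

```matlab
% Hypothetical file contents: one header line, then data in the posted format.
txt = sprintf('MASS MATRIX HEADER\n[     1,     1]: 1.157e-07 [     1,     4]: 2.332e-08');

% Drop everything before the first '[' so the text starts with a triplet.
% This assumes the header contains no '[' character.
first = find(txt == '[', 1);
body = txt(first:end);

% Parse the remaining text into an N-by-3 matrix of [row, column, value].
mat = sscanf(body, '[%d,%d]:%f ', [3, Inf]).';
```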

Star Strider, 4 Jan 2021

As always, my pleasure!
