Efficient way to standardize large amounts of text

Question

André Kucharzewski, 19 October 2021

https://jp.mathworks.com/matlabcentral/answers/1567448-efficient-way-to-standardize-large-amounts-of-text

Commented: André Kucharzewski, 24 October 2021

Accepted Answer: Duncan Po

Hello,

I have a table with around 1 million rows. One column contains strings in several different formats, mixing letters and numbers, for example:

abc_123

cdf_123

123_cdf

123 (abc)

There are around 120 different text formats, which repeat. Most of them can be converted to a standard format like aa_11. Any string that cannot be converted should get a standard "undef" value.

Any suggestions on how I can handle such a large dataset without a for loop that goes over 1 million rows and checks each cell?

Thanks in advance :)

Answer 1

Duncan Po, 19 October 2021

https://jp.mathworks.com/matlabcentral/answers/1567448-efficient-way-to-standardize-large-amounts-of-text#answer_812253

You may be able to use patterns. For example, if the standard format is letters followed by an underscore followed by digits, you can detect it like this:

>> x = ["abc_123", "cdf_123", "123_cdf", "123 (abc)"]; % create an example string array

>> matches(x, lettersPattern + "_" + digitsPattern) % check if the strings match the standard pattern

ans =

1×4 logical array

1 1 0 0
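The same logical masks can also drive the conversion itself, with no loop over rows. Below is a minimal sketch, not a definitive implementation. The three formats, the rule that "123_cdf" and "123 (abc)" should become letters_digits, and the variable names are all assumptions made for illustration. As far as I know, lettersPattern and digitsPattern need R2020b or later.

```matlab
% Example column; the last entry stands in for a format that cannot be converted.
x = ["abc_123"; "cdf_123"; "123_cdf"; "123 (abc)"; "??"];

% Start with every row marked "undef", then fill in the rows that are recognised.
out = repmat("undef", size(x));

% Already in the standard letters_digits layout: copy unchanged.
isStd = matches(x, lettersPattern + "_" + digitsPattern);
out(isStd) = x(isStd);

% digits_letters (assumed rule): swap the two halves around the underscore.
isRev = matches(x, digitsPattern + "_" + lettersPattern);
out(isRev) = extractAfter(x(isRev), "_") + "_" + extractBefore(x(isRev), "_");

% "digits (letters)" (assumed rule): pull out both parts and rejoin them.
isPar = matches(x, digitsPattern + " (" + lettersPattern + ")");
out(isPar) = extractBetween(x(isPar), "(", ")") + "_" + extractBefore(x(isPar), " ");
```

Each additional format would need its own mask and rewrite rule, so with about 120 formats it may be worth keeping the patterns and rules in a list rather than writing them out one by one.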

André Kucharzewski, 24 October 2021

That should do the job, but it's a function introduced in R2019b and I only have R2019a.

Kinda sad :(

But thank you for your input :)
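For releases without matches, such as R2019a, regular expressions can probably give a similar result, since regexprep and regexp accept whole arrays. The sketch below is only an illustration. The rewrite rules, the ASCII-only letter class [A-Za-z], and the variable names are assumptions, not something stated in this thread. It works on the unique values first, which only pays off if the same strings occur many times in the column.

```matlab
% Example column with one repeated value and one entry that cannot be converted.
x = ["abc_123"; "cdf_123"; "123_cdf"; "123 (abc)"; "??"; "abc_123"];

% Work on the distinct values only; idx maps each row back to its unique value.
[u, ~, idx] = unique(x);

% Rewrite the alternative layouts (assumed rules) into letters_digits using tokens.
uOut = regexprep(u, '^(\d+)_([A-Za-z]+)$', '$2_$1');            % 123_cdf   -> cdf_123
uOut = regexprep(uOut, '^(\d+) \(([A-Za-z]+)\)$', '$2_$1');     % 123 (abc) -> abc_123

% Anything that still does not look like letters_digits becomes "undef".
hit = regexp(cellstr(uOut), '^[A-Za-z]+_\d+$', 'once');         % cell of start indices
isStd = ~cellfun(@isempty, hit);
uOut(~isStd) = "undef";

% Expand the converted unique values back to the full column.
out = uOut(idx);
```

For a table T with a text column named Var (both names hypothetical), x would be string(T.Var) and the result could be written back with T.Var = out.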
