Reading .txt file in MATLAB with issue in formatting

Tulkkas on 23 Feb 2022
Answered: Jeremy Hughes on 24 Feb 2022
I am using the MATLAB 2021b function readtable to read the following text file:
ISSUERID|FISCAL_YEAR|FIELD_ID|VALUE|PUBLISHED_DATE|SOURCE|DATA_TYPE|ADDITIONAL_INFO
IID000000002137286||DIVERSITY_DISCLOSURE_ETHNICITY_SOURCE|"https://www.cubesmart.com/about-us/corporate-responsibility/\""||{}||{}
The separator is the | (bar) character. As you can see, the field value "https://www.cubesmart.com/about-us/corporate-responsibility/\"" ends with the sequence \", which breaks the import. I am trying to use the 'Whitespace' option to ignore it, but for some reason it does not work. The code I am running is:
T_equ = readtable(file_name, 'FileType', 'text', 'Delimiter', {'|'}, 'Whitespace', '\"');
where file_name is just the path to the .txt file.
The result of the import is an empty table. I understand this could happen if \" were read as a special character, but from my understanding the 'Whitespace', '\"' name-value pair should force readtable to ignore it. What am I missing here?
  3 Comments
Tulkkas on 23 Feb 2022
I did try with a double slash, but that does not work either. How would you read the text without interpreting the formatting, and then do the parsing?
Rik on 23 Feb 2022
For example with my readfile function (which you can get from the FEX or with the AddOn manager), or with the readlines function.
You could use the split function to split based on the | character (or even use regexp).
The result will not be a table yet, but it should be easy to convert it to what you need.
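A minimal sketch of the readlines and split route described above, assuming R2020b or later (readlines is not available before that). The temporary file, the variable names raw, parts and T, and the final clean-up of the VALUE column are illustrative additions, not part of the original comment:

```matlab
% Recreate the two sample lines from the question in a temporary file
% (the temporary path is only for illustration).
file_name = [tempname '.txt'];
fid = fopen(file_name, 'w');
fprintf(fid, '%s\n', 'ISSUERID|FISCAL_YEAR|FIELD_ID|VALUE|PUBLISHED_DATE|SOURCE|DATA_TYPE|ADDITIONAL_INFO');
fprintf(fid, '%s\n', 'IID000000002137286||DIVERSITY_DISCLOSURE_ETHNICITY_SOURCE|"https://www.cubesmart.com/about-us/corporate-responsibility/\""||{}||{}');
fclose(fid);

% Read the raw lines; no quote or escape handling is applied here.
raw = readlines(file_name);
raw = raw(strlength(raw) > 0);   % drop empty lines, such as a trailing one

% Split every line on the bar character. This requires that all lines
% contain the same number of fields, and that no field contains a bar.
parts = split(raw, "|");         % one row per line, one column per field

% The first row holds the column names, the remaining rows are data.
T = array2table(parts(2:end, :), 'VariableNames', cellstr(parts(1, :)));

% Optional clean-up (an assumption about what is wanted): remove the \"
% sequence and the enclosing double quotes from the VALUE column.
T.VALUE = strip(erase(T.VALUE, '\"'), '"');
```

Every column of T is a string at this point, so numeric or date columns would still need converting (for example with str2double or datetime) before use.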


Answers (1)

Jeremy Hughes on 24 Feb 2022
The issue is that \" is not how CSV files (and thus readtable) escape double quotes. To escape a quote inside a quoted field, the file should use "".
Like this:
X|Y|Z|"And something in ""quotes""."
Otherwise, readtable will keep reading after \"" until it finds a lone double-quote character. I would guess that's what you're seeing.
The only way I can think of to resolve this is to read the file, replace \" with "", and then write the data back out. There's no way to get readtable to treat \" as an escaped quote.
text = fileread(fn);              % fn is the path to the .txt file
text = replace(text,'\"','""');   % turn every \" into the CSV-style escape ""
fid = fopen(fn,'w'); % or use a new file name if you don't want to overwrite it.
fwrite(fid,text);
fclose(fid);
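As a rough end-to-end sketch of this approach, the snippet below writes the corrected text to a new file instead of overwriting the original and then calls readtable on it. The temporary file names src and dst, and the 'ReadVariableNames' and 'TextType' options, are assumptions added for illustration and do not come from the answer itself:

```matlab
% Recreate the two sample lines from the question in a temporary file
% (src and dst are hypothetical paths used only for this sketch).
src = [tempname '.txt'];
fid = fopen(src, 'w');
fprintf(fid, '%s\n', 'ISSUERID|FISCAL_YEAR|FIELD_ID|VALUE|PUBLISHED_DATE|SOURCE|DATA_TYPE|ADDITIONAL_INFO');
fprintf(fid, '%s\n', 'IID000000002137286||DIVERSITY_DISCLOSURE_ETHNICITY_SOURCE|"https://www.cubesmart.com/about-us/corporate-responsibility/\""||{}||{}');
fclose(fid);

% Replace every \" with "" and write the result to a separate file,
% so the original file stays untouched.
txt = fileread(src);
txt = replace(txt, '\"', '""');
dst = [tempname '.txt'];
fid = fopen(dst, 'w');
fwrite(fid, txt);
fclose(fid);

% Read the corrected file. The 'Whitespace' option from the question is
% no longer needed. 'ReadVariableNames' is set explicitly because a file
% with a single, mostly textual data row gives header detection little
% to work with.
T_equ = readtable(dst, 'FileType', 'text', 'Delimiter', '|', ...
    'ReadVariableNames', true, 'TextType', 'string');
```

Writing to a new file costs some disk space but keeps the source data intact, which helps if the replacement ever needs to be adjusted.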
