Finding Lines in a Large Text File with a Specific Text

Sonoma Rich on 12 Jul 2019
Answered: Sonoma Rich on 13 Jul 2019
I am trying to read a large text file (>1 GB), but I only need the lines that contain a specific string. For example, I want every line that contains <field name="data". Currently I read every line with fgetl and check whether the text is in the line, but this takes too long. Any suggestions?

Accepted Answer

Sonoma Rich on 13 Jul 2019
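I found the following code, which works well:
% Read the whole file into one character vector
filetext = fileread('fileread.m');
% Match every complete line that contains the text 'fileread'
expr = '[^\n]*fileread[^\n]*';
matches = regexp(filetext,expr,'match');
disp(matches')
However, the regexp function was slower than I expected. I ended up using the following method, which is significantly faster:
fid = fopen('fileread.m','r');
% Read each line of the file into one cell of ftext{1}
ftext = textscan(fid,'%s','Delimiter','\n');
fclose(fid);
% Keep only the lines that contain the text (contains requires R2016b or later)
matches = ftext{1}(contains(ftext{1},'fileread'));
disp(matches)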

More Answers (2)

KSSV on 12 Jul 2019
Read about textscan. This function lets you read the file in chunks of lines inside a loop, so the whole file never has to be in memory at once. Within each chunk, you can pick out the lines you need.
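A minimal sketch of that chunked approach, assuming the search text from the question. The sample file, the variable names, and the chunk size are made up for illustration (a real chunk size would be far larger, tuned to available memory), and this is not necessarily the loop KSSV had in mind:

```matlab
% Build a small sample file so the sketch runs as-is (hypothetical data).
fileName = [tempname '.txt'];
fid = fopen(fileName, 'w');
fprintf(fid, '<record>\n<field name="data">1</field>\n');
fprintf(fid, '<field name="id">7</field>\n<field name="data">2</field>\n</record>\n');
fclose(fid);

searchText = '<field name="data"';   % text each kept line must contain
chunkSize  = 2;                      % lines per chunk; tiny here for demonstration

fid = fopen(fileName, 'r');
if fid == -1
    error('Could not open %s', fileName);
end

matches = {};
while ~feof(fid)
    % Apply '%s' at most chunkSize times; with a newline delimiter,
    % each application reads one line into chunk{1}.
    chunk = textscan(fid, '%s', chunkSize, 'Delimiter', '\n', 'Whitespace', '');
    lines = chunk{1};
    if isempty(lines)
        break;                       % nothing more to read
    end
    % Keep only the lines of this chunk that contain the search text.
    matches = [matches; lines(contains(lines, searchText))]; %#ok<AGROW>
end
fclose(fid);
delete(fileName);                    % remove the sample file

disp(matches)
```

Only one chunk of lines is held in memory at a time, so peak memory depends on chunkSize and the number of matching lines rather than on the file size.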

Walter Roberson on 12 Jul 2019
If you have enough memory:
S = fileread('YourFileNameHere.txt');
selected = regexp(S, '^.*<field\s*name\s*=.*$', 'match', 'dotexceptnewline', 'lineanchors');
And in the case where you do not care what is at the beginning or end of the line and just want to know what the "data" field content is, then
S = fileread('YourFileNameHere.txt');
datas = regexp(S, '(?<=field\s*name\s*=")(?<data>[^"]*)', 'names');
That should get you a struct array with a field named 'data' that holds the content inside the quotes.
