Can REGEXP or TEXTSCAN be used to split 2 distinct data sets from a single text file?

I’ve got several text files containing data blocks that look like this:
MSN_JET (0:31) Observation #1 Rx'd at: (58560.000) Msg. Time: (58561.000)
Send to SCS: yes Rcv Date: 2014030 Synch: ffff Test Mode: nominal
State Time: 12:00:00.000 (58561.000)
State Position: -1100.0000, -5100.0000, 4100.0000
MSN_SENSUM (0:32) Observation #20 Rx'd at: (58560.000) Msg. Time: (58561.000)
Send to SCS: yes Rcv Date: 2010121 Synch: ffff Test Mode: nominal
Con: 10 (Mobil_Tran) Length: 5678 Remote Num: 1 Number of Observations: 1
Type: 1 Track ID: 12345 Time Tag: 58563.00000000
Band ID: 1 RAD ID: 11 Scan ID: 0 LRT/HRT: 1 Valid Flag: 0
MSN_JET (0:31) Observation #1 Rx'd at: (58570.000) Msg. Time: (58571.000)
Send to SCS: yes Rcv Date: 2014030 Synch: ffff Test Mode: nominal
State Time: 12:00:00.000 (58571.000)
State Position: -1200.0000, -5200.0000, 4200.0000
MSN_SENSUM (0:32) Observation #20 Rx'd at: (58570.000) Msg. Time: (58571.000)
Send to SCS: yes Rcv Date: 2014030 Synch: ffff Test Mode: nominal
Con: 10 (Mobil_Tran) Length: 5678 Remote Num: 1 Number of Observations: 2
Type: 1 Track ID: 12345 Time Tag: 58573.00000000
Band ID: 1 RAD ID: 25 Scan ID: 0 LRT/HRT: 1 Valid Flag: 0
Type: 1 Track ID: 12345 Time Tag: 58575.00000000
Band ID: 1 RAD ID: 6 Scan ID: 0 LRT/HRT: 1 Valid Flag: 0
MSN_SENSUM (0:32) Observation #30 Rx'd at: (58580.000) Msg. Time: (58581.000)
Send to SCS: yes Rcv Date: 2014030 Synch: ffff Test Mode: nominal
Con: 10 (Mobil_Tran) Length: 5678 Remote Num: 1 Number of Observations: 3
Type: 1 Track ID: 12345 Time Tag: 58583.00000000
Band ID: 1 RAD ID: 3 Scan ID: 0 LRT/HRT: 1 Valid Flag: 0
Type: 1 Track ID: 12345 Time Tag: 58585.00000000
Band ID: 1 RAD ID: 14 Scan ID: 0 LRT/HRT: 1 Valid Flag: 0
Type: 1 Track ID: 12345 Time Tag: 58587.00000000
Band ID: 1 RAD ID: 33 Scan ID: 0 LRT/HRT: 1 Valid Flag: 0
MSN_SENSUM (0:32) Observation #20 Rx'd at: (58590.000) Msg. Time: (58591.000)
Send to SCS: yes Rcv Date: 2014030 Synch: ffff Test Mode: nominal
Con: 10 (Mobil_Tran) Length: 5678 Remote Num: 1 Number of Observations: 4
Type: 1 Track ID: 12345 Time Tag: 58593.00000000
Band ID: 1 RAD ID: 7 Scan ID: 0 LRT/HRT: 1 Valid Flag: 0
Type: 1 Track ID: 12345 Time Tag: 58595.00000000
Band ID: 1 RAD ID: 8 Scan ID: 0 LRT/HRT: 1 Valid Flag: 0
Type: 1 Track ID: 12345 Time Tag: 58597.00000000
Band ID: 1 RAD ID: 20 Scan ID: 0 LRT/HRT: 1 Valid Flag: 0
Type: 1 Track ID: 12345 Time Tag: 58599.00000000
Band ID: 1 RAD ID: 29 Scan ID: 0 LRT/HRT: 1 Valid Flag: 0
MSN_JET (0:31) Observation #1 Rx'd at: (58590.000) Msg. Time: (58591.000)
Send to SCS: yes Rcv Date: 2014030 Synch: ffff Test Mode: nominal
State Time: 12:00:00.000 (58591.000)
State Position: -1400.0000, -5400.0000, 4400.0000
The first data block is MSN_JET, and contains 4 lines of text.
The 2nd data block is MSN_SENSUM. It contains 3 lines of text, followed by a variable number of lines based on the Number of Observations (located in MSN_SENSUM, line 3).
The 2 data blocks (MSN_JET and MSN_SENSUM) are repeated numerous times throughout the text file, and the two block types do not always occur the same number of times.
In the past year, I’ve used the REGEXP function to parse data from text files similar to these. However, I’m not sure if I can take the same approach given the fact that I want to parse entire data blocks.
The goal is to create 2 separate text files for processing. One will contain the MSN_JET data. The other will contain the MSN_SENSUM data.
Any ideas are greatly appreciated. Thanks.
2 Comments
per isakson on 6 Feb 2014
Does the total file fit in memory?
Brad on 7 Feb 2014
Yes, it does.


Accepted Answer

per isakson on 6 Feb 2014
Edited: per isakson on 7 Feb 2014
Try this:
str = fileread('your_file.txt');
ca1 = regexp( str, 'MSN_JET.+?(?=(MSN_SENSUM)|($))', 'match' );
ca2 = regexp( str, 'MSN_SENSUM.+?(?=(MSN_JET)|($))', 'match' );
What remains is to write the two files. This approach does not remove any newline characters.
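A minimal sketch of that remaining step, assuming the matches in ca1 and ca2 should simply be written back to back. The sample string stands in for fileread('your_file.txt'), and the output file names are hypothetical.

```matlab
% Small stand-in for str = fileread('your_file.txt'); layout follows the question.
str = sprintf('%s\n', ...
    'MSN_JET (0:31) Observation #1', ...
    'State Position: -1100.0000, -5100.0000, 4100.0000', ...
    'MSN_SENSUM (0:32) Observation #20', ...
    'Type: 1 Track ID: 12345 Time Tag: 58563.00000000', ...
    'MSN_JET (0:31) Observation #1', ...
    'State Position: -1200.0000, -5200.0000, 4200.0000');

% Same expressions as in the answer above.
ca1 = regexp(str, 'MSN_JET.+?(?=(MSN_SENSUM)|($))', 'match');
ca2 = regexp(str, 'MSN_SENSUM.+?(?=(MSN_JET)|($))', 'match');

% Hypothetical output locations; replace with real paths.
jetFile = [tempname '_jet.txt'];
senFile = [tempname '_sensum.txt'];

outNames = {jetFile, senFile};
blocks   = {ca1, ca2};
for k = 1:numel(blocks)
    fid = fopen(outNames{k}, 'w');
    assert(fid ~= -1, 'Could not open %s for writing.', outNames{k});
    % '%s' writes each match verbatim, so any '%' or '\' in the data is not
    % interpreted as a format specifier or escape sequence.
    fprintf(fid, '%s', blocks{k}{:});
    fclose(fid);
end
```

Because the matches keep their newline characters, no separator is added between them here.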

More Answers (1)

Kelly Kearney on 6 Feb 2014
Assuming the answer to per's question is yes, then here's an example:
% Read text
fid = fopen('test.txt');
data = textscan(fid, '%s', 'delimiter', '\n');
fclose(fid);
data = data{1};
% Split based on flags
tmp = zeros(size(data));
flag = {'MSN_JET', 'MSN_SENSUM'};
for ii = 1:length(flag)
    tmp(strncmp(data, flag{ii}, length(flag{ii}))) = ii; % mark header lines
end
idx = tmp > 0;
tmp1 = tmp(idx);
tmp = tmp1(cumsum(idx)); % Trick to fill zeros: each line inherits the flag of the header above it
% Collect the lines belonging to each block type
datasep = cell(size(flag));
for ii = 1:length(flag)
    datasep{ii} = data(tmp == ii);
end
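The cumsum step in the code above may be easier to follow on a tiny vector. This is only an illustration with made-up values: 1 and 2 mark header lines and 0 marks the lines that follow a header.

```matlab
% Made-up flag vector: headers are nonzero, continuation lines are zero.
tmp  = [1; 0; 0; 2; 0; 1; 0];

idx  = tmp > 0;              % true at header lines
tmp1 = tmp(idx);             % header flags in order: [1; 2; 1]

% cumsum(idx) counts how many headers have been seen so far,
% here [1; 1; 1; 2; 2; 3; 3], so it indexes the most recent header's flag.
filled = tmp1(cumsum(idx));  % [1; 1; 1; 2; 2; 1; 1]
```

Note that this relies on the first line being a header. If the file starts with any other line, cumsum(idx) begins with 0, which is not a valid index.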
Parsing data values out of that text will likely require some regular expressions.
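As a rough sketch of what that parsing could look like, the example below pulls a few numeric fields out with regexp tokens. The variables jetLines and senLines are stand-ins for datasep{1} and datasep{2}, filled with lines copied from the question. Which fields matter is an assumption made for illustration.

```matlab
% Stand-ins for datasep{1} and datasep{2} (cell arrays of lines).
jetLines = {'State Position: -1100.0000, -5100.0000, 4100.0000'
            'State Position: -1200.0000, -5200.0000, 4200.0000'};
senLines = {'Type: 1 Track ID: 12345 Time Tag: 58573.00000000'
            'Band ID: 1 RAD ID: 25 Scan ID: 0 LRT/HRT: 1 Valid Flag: 0'
            'Type: 1 Track ID: 12345 Time Tag: 58575.00000000'
            'Band ID: 1 RAD ID: 6 Scan ID: 0 LRT/HRT: 1 Valid Flag: 0'};

% Join the lines so that one regexp call can scan the whole block.
jetText = sprintf('%s\n', jetLines{:});
senText = sprintf('%s\n', senLines{:});

% Three numbers after 'State Position:', one row per match.
num = '([-+0-9.]+)';
tok = regexp(jetText, ['State Position:\s*' num ',\s*' num ',\s*' num], 'tokens');
pos = str2double(vertcat(tok{:}));   % N-by-3 numeric array

% Single-value fields from the MSN_SENSUM observation lines.
tt      = regexp(senText, 'Time Tag:\s*([0-9.]+)', 'tokens');
timeTag = str2double([tt{:}]);       % 1-by-N

rid   = regexp(senText, 'RAD ID:\s*([0-9]+)', 'tokens');
radId = str2double([rid{:}]);        % 1-by-N
```

Each 'tokens' result is a cell array with one cell per match, so it is flattened before the conversion with str2double.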
