How to parse text data

Life is Wonderful on 17 Jul 2019
Commented: Life is Wonderful on 2 Aug 2019
Hi
I have data in the formats below and need a mechanism to parse it into the expected output.
Input data format:
07/16 12:55:22.012 INFO | test_runner_utils:0812| Began logging to /tmp/test_that_results_hatch_deL3lZ
07/16 12:55:27.477 INFO | test_runner_utils:0259| autoserv| Processing control file
Expected Output format:
Define level of message extraction based on the marker sign ==> |
-Step 1: Extract Timestamp in mm/dd HH:MM:sec.millisec <string>|
-Step 2: Extract Timestamp in mm/dd HH:MM:sec.millisec <string>| extract full text in a variable, option to grab variable if associated with value
-Step 3: Extract Timestamp in mm/dd HH:MM:sec.millisec <string>|<string>:%1.3f|
-Step 4: Extract Timestamp in mm/dd HH:MM:sec.millisec <string>|<string>:%1.3f|extract full text in a variable, option to grab variable if associated with value
-Step 5: Extract Timestamp in mm/dd HH:MM:sec.millisec <string>|<string>:%1.3f|<string>|
-Step 6: Extract Timestamp in mm/dd HH:MM:sec.millisec <string>|<string>:%1.3f|<string>|extract full text in a variable, option to grab variable if associated with value
Input data format:
07/16 12:55:27.620 DEBUG| utils:0287| [stdout] CHROMEOS_RELEASE_BOARD=hatch
07/16 13:28:58.330 INFO | mode_switcher:0673| -[FAFT]-[ start wait_for_client ]---
Expected Output format:
-Step 1: Extract Timestamp in mm/dd HH:MM:sec.millisec <string>|<string>:%1.3f|<[string]>
-Step 2: Extract Timestamp in mm/dd HH:MM:sec.millisec <string>|<string>:%1.3f|<[string]> extract full text in a variable, option to grab variable if associated with value
-Step 3: Extract Timestamp in mm/dd HH:MM:sec.millisec <string>|<string>:%1.3f|<[string]>
-Step 4: Extract Timestamp in mm/dd HH:MM:sec.millisec <string>|<string>:%1.3f|<[string]> [string] extract full text in a variable, option to grab variable if associated with value
Input data format:
2019-07-16 12:55:30 > string
2019-07-16 12:55:30 powerbtn: released
Expected Output format:
Note the marker >
-Step 1: Extract Timestamp in YYYY-MM-DD HH:mm:sec > <string>
-Step 2: Extract Timestamp in YYYY-MM-DD HH:mm:sec <full string>
Input data format:
2019-07-16 12:55:31 > [12074.734997 HC 0x121 err 1]
Expected Output format:
-Step 1: Extract Timestamp in YYYY-MM-DD HH:mm:sec > [<%1.3f string, extract full text in a variable, option to grab variable if associated with value>]
Thanks a lot
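A minimal sketch of one way to handle the last two formats (the ones with the > marker), assuming each log line is already available as a character vector. The variable names and the regular expression are illustrative assumptions, not taken from any reply in this thread.

```matlab
% Sample lines copied from the question (the second log format).
lines = {
    '2019-07-16 12:55:30 > string'
    '2019-07-16 12:55:30 powerbtn: released'
    '2019-07-16 12:55:31 > [12074.734997 HC 0x121 err 1]'
    };

% Named tokens: timestamp, the '>' marker (may be empty), remaining text.
expr = '^(?<stamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(?<marker>>?)\s*(?<text>.*)$';
parsed = regexp(lines, expr, 'names', 'once');   % cell array of scalar structs
parsed = [parsed{:}];                            % struct array, one per line

% Convert the timestamp text to datetime values.
stamps = datetime({parsed.stamp}, 'InputFormat', 'yyyy-MM-dd HH:mm:ss');

% For bracketed payloads, capture the leading floating-point number, if any.
num = regexp({parsed.text}, '^\[(\d+\.\d+)', 'tokens', 'once');
```

Here parsed(k).marker is empty for lines without the > sign, and num{k} is empty for lines whose text does not start with a bracketed number, so both cases can be told apart without separate passes over the file.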
  5 Comments
Life is Wonderful on 19 Jul 2019
Edited: Life is Wonderful on 19 Jul 2019
The previous timestamp was in UTC format and this one is in mm/dd format.
Can you please support me?
Thanks
Life is Wonderful on 23 Jul 2019
Any feedback?


Accepted Answer

Guillaume on 23 Jul 2019
Edited: Guillaume on 23 Jul 2019
Are you still on a very old version (please fill in the release field next to the question)? On a modern version, the file can easily be read with:
% Fixed-width layout of each line: timestamp, level, '|', module:line, '| ', message
VariableNames = {'Date', 'Level', 'delim1', 'PID', 'delim2', 'Message'};
VariableWidths = [19, 5, 1, 23, 2, 5000];
VariableTypes = {'datetime', 'char', 'char', 'char', 'char', 'char'};
opts = fixedWidthImportOptions('VariableNames', VariableNames, 'VariableWidths', VariableWidths, 'VariableTypes', VariableTypes, 'SelectedVariableNames', [1, 2, 4, 6]);
% MM/dd is month/day and HH is the 24-hour clock (the log carries no year)
opts = setvaropts(opts, 'Date', 'InputFormat', 'MM/dd HH:mm:ss.SSS');
content = readtable('test_that.txt', opts);
This results in a table with the variables Date, Level, PID and Message.
If on a version of MATLAB that doesn't have tables, use textscan with fixed-width fields:
fid = fopen('test_that.txt', 'rt');
% 18-char timestamp, skip 1, 5-char level, skip 1, 23-char module:line, skip 2, then the message
content = textscan(fid, '%18c%*c%5c%*c%23c%*2c%s', 'Delimiter', '', 'Whitespace', '');
fclose(fid);
% Combine the four columns into one N-by-4 cell array
content = [cellstr(content{1}), cellstr(content{2}), cellstr(content{3}), content{4}]
  23 Comments
Life is Wonderful on 2 Aug 2019
Edited: Life is Wonderful on 2 Aug 2019
"Yes, you can put text on a figure. How do you determine the position of said text? You need at least an x (time?) and a y (????)."
Sure.
x time = content.Date
y time = content.Level
x time = content.Date
y time = content.PID
x time = content.Date
y time = content.Message % maybe a trimmed one for viewing purposes
Thanks a ton. A highly useful tip, and timely help too.
Life is Wonderful on 2 Aug 2019
Perhaps you can suggest how to plot the text information as text annotations at the time data points.
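As a rough sketch of the idea discussed in this comment thread, text can be placed at datetime x positions with text(). The table below is a hypothetical stand-in for content, and using the log level as the y position is an assumption, one of several possible choices.

```matlab
% Hypothetical stand-in for the table returned by readtable in the answer.
Date = datetime(2019, 7, 16, 12, 55, [22; 27; 30]);
Level = {'INFO'; 'INFO'; 'DEBUG'};
Message = {'Began logging'; 'Processing control file'; '[stdout] CHROMEOS_RELEASE_BOARD=hatch'};
content = table(Date, Level, Message);

% Map each distinct level to an integer so every message gets an (x, y) point.
[levels, ~, y] = unique(content.Level);

fig = figure('Visible', 'off');      % remove 'Visible','off' to see the plot
plot(content.Date, y, 'o');
yticks(1:numel(levels));
yticklabels(levels);
ylim([0, numel(levels) + 1]);

% Annotate each data point with its message; 'none' keeps underscores literal.
h = text(content.Date, y, content.Message, ...
    'Interpreter', 'none', 'VerticalAlignment', 'bottom');
```

Long messages will overlap on a dense log, so trimming content.Message before the call to text, as suggested above, is probably worthwhile.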


More Answers (2)

Bob Thompson on 18 Jul 2019
I need next steps
◾Convert the data content into cells, such as timestamp, message data-1, message data-2
◾Put the cells in the proper format
◾Create MATLAB variables
◾Display the MATLAB variables for analysis
1) regexp automatically returns all results in a cell array, each cell containing a character vector.
2) You can convert the text to date-time values using datetime. To do this 'quickly' I suggest looping through your regexp results, or using cellfun (which is really still a loop).
3) What exactly do you mean by this? I personally do not know of a way to dynamically create variables within MATLAB, and I think you would be better served by keeping the information in a cell array, or by making a table out of it. It is certainly possible to create new variables in a table from a string captured by regexp.
4) Displaying MATLAB variables is simply a matter of not suppressing their output. If you specifically want to print them, you can use fprintf with no target so that it defaults to the command window.
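The four points above can be sketched roughly as follows, using two sample lines from the question. The regular expression and the column names Module and Line are assumptions made for illustration only.

```matlab
% Two sample lines from the question, joined as they would appear in a file.
filecontent = sprintf('%s\n%s\n', ...
    '07/16 12:55:22.012 INFO | test_runner_utils:0812| Began logging to /tmp/test_that_results_hatch_deL3lZ', ...
    '07/16 12:55:27.477 INFO | test_runner_utils:0259| autoserv| Processing control file');

% 1) regexp returns one cell per match, each holding the captured tokens.
expr = '(\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+(\w+)\s*\|\s*(\w+):(\d+)\|\s*([^\n]*)';
tok = regexp(filecontent, expr, 'tokens');
tok = vertcat(tok{:});               % N-by-5 cell array of character vectors

% 2) Convert the timestamp column. The log has no year, so the year of the
%    result is left to datetime's default.
stamps = datetime(tok(:, 1), 'InputFormat', 'MM/dd HH:mm:ss.SSS');

% 3) Keep everything in one table instead of creating separate variables.
T = table(stamps, tok(:, 2), tok(:, 3), str2double(tok(:, 4)), tok(:, 5), ...
    'VariableNames', {'Date', 'Level', 'Module', 'Line', 'Message'});

% 4) Display the result.
disp(T)
```

Everything after the second | is kept as one message, so a line such as the second sample keeps its extra | inside the Message column.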
  5 Comments
Bob Thompson on 19 Jul 2019
Are you only looking to capture the timestamp? It seems like the issue is more in the initial regexp processing than in the date time conversion.
If you are only looking to capture the timestamp I would suggest doing a regexp call like this:
% Capture the timestamp, then skip the level, module, 4-digit line number and rest of the line
filedata = regexp(filecontent, '(\d\d.\d\d\s\d\d.\d\d.\d\d.\d\d\d)\D+\d\d\d\d[^\n]*', 'tokens');
dates = datetime([filedata{:}], 'InputFormat', 'MM/dd HH:mm:ss.SSS');
If you are looking to capture more than the timestamps then please explain more. I know you outline some more in your OP, but I'm not entirely sure what you're referring to.
Life is Wonderful on 19 Jul 2019
Edited: Life is Wonderful on 19 Jul 2019
I am looking not only for the timestamp but also for the associated data. I described my requirement/algorithm at the beginning.
My text source file contains data as below
07/18 11:27:02.968 DEBUG| autoserv:0729| autoserv is running in drone lab_chrome-debug.
07/18 11:27:02.968 DEBUG| autoserv:0730| autoserv command was: /build/hatch/usr/local/build/autotest/server/autoserv -p -r /tmp/test_that_results_hatch_iGVg61/results-1-firmware_UpdateKernelSubkeyVersion -m 10.223.131.106 --no_console_prefix -u autotest_system -l ad_hoc_build/ad_hoc_suite/firmware_UpdateKernelSubkeyVersion -s --no_use_packaging /tmp/tmphhqbvd --args servo_host=localhost servo_port=9999
07/18 11:27:02.968 INFO | pidfile:0016| Logged pid 23629 to /tmp/test_that_results_hatch_iGVg61/results-1-firmware_UpdateKernelSubkeyVersion/.autoserv_execute
07/18 11:27:02.969 DEBUG| host_info:0263| Committing HostInfo to store InMemoryHostInfoStore[HostInfo[Labels: [], Attributes: {}]]
07/18 11:27:02.969 DEBUG| host_info:0267| HostInfo updated to: HostInfo[Labels: [], Attributes: {}]
07/18 11:27:02.970 DEBUG| base_job:0357| Persistent state global_properties.tag now set to ''
07/18 11:27:02.972 DEBUG| base_job:0357| Persistent state global_properties.fast now set to False
07/18 11:28:16.561 DEBUG| servo:0666| Setting power_state to 'rec'
07/18 11:28:23.419 WARNI| test:0606| The test failed with the following exception
Traceback (most recent call last):
File "/build/hatch/usr/local/build/autotest/client/common_lib/test.py", line 567, in _exec
_cherry_pick_call(self.initialize, *args, **dargs)
File "/build/hatch/usr/local/build/autotest/client/common_lib/test.py", line 715, in _cherry_pick_call
return func(*p_args, **p_dargs)
File "/build/hatch/usr/local/build/autotest/server/site_tests/firmware_UpdateKernelSubkeyVersion/firmware_UpdateKernelSubkeyVersion.py", line 60, in initialize
self.switcher.setup_mode('dev' if dev_mode else 'normal')
File "/build/hatch/usr/local/build/autotest/server/cros/faft/utils/mode_switcher.py", line 427, in setup_mode
self.reboot_to_mode(mode)
File "/build/hatch/usr/local/build/autotest/server/cros/faft/utils/mode_switcher.py", line 474, in reboot_to_mode
self._enable_dev_mode_and_reboot()
File "/build/hatch/usr/local/build/autotest/server/cros/faft/utils/mode_switcher.py", line 717, in _enable_dev_mode_and_reboot
self._enable_rec_mode_and_reboot(usb_state='host')
File "/build/hatch/usr/local/build/autotest/server/cros/faft/utils/mode_switcher.py", line 590, in _enable_rec_mode_and_reboot
psc.power_on(psc.REC_ON)
File "/build/hatch/usr/local/build/autotest/server/cros/servo/servo.py", line 134, in power_on
self._servo.set_nocheck('power_state', rec_mode)
File "/build/hatch/usr/local/build/autotest/server/cros/servo/servo.py", line 672, in set_nocheck
raise error.TestFail(err_msg)
TestFail: Setting 'power_state' to 'rec' :: Timeout waiting for response.
07/18 11:28:23.420 DEBUG| test:0611| Running cleanup for test.
07/18 11:44:07.043 DEBUG| ssh_host:0310| Running (ssh) 'true' from '_install|wait_up|is_up|ssh_ping|run|run_very_slowly'
I have to parse the data as follows
Example
fileData.timestamp = 07/18 11:28:23.420
fileData.timestamp.Msglib = DEBUG
fileData.timestamp.MsgSublib = test
fileData.timestamp.MsgSublib.idx = 0611
fileData.timestamp.MsgSublib.FullContent = Running cleanup for test.
If an error is seen, skip those lines of the input text file and continue parsing.
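A hedged sketch of that parsing step. A struct field cannot hold both a value and sub-fields, so fileData.timestamp cannot be a timestamp and also carry a Msglib field; the sketch assumes a flat struct array that reuses the field names from the example (with the spelling FullContent). The regular expression is illustrative. Lines that do not start with a timestamp, such as the traceback, are skipped.

```matlab
% A few lines from the log above, including traceback lines to be skipped.
lines = {
    '07/18 11:28:23.419 WARNI| test:0606| The test failed with the following exception'
    'Traceback (most recent call last):'
    'raise error.TestFail(err_msg)'
    '07/18 11:28:23.420 DEBUG| test:0611| Running cleanup for test.'
    '07/18 11:44:07.043 DEBUG| ssh_host:0310| Running (ssh) ''true'' from ''_install|wait_up|is_up|ssh_ping|run|run_very_slowly'''
    };

% Tokens: timestamp, level, module, line index, full message text.
expr = ['^(\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+' ...
        '(\w+)\s*\|\s*(\w+):(\d+)\|\s*(.*)$'];

tok = regexp(lines, expr, 'tokens', 'once');   % empty cell where no match
tok = tok(~cellfun(@isempty, tok));            % drop non-log lines
tok = vertcat(tok{:});                         % N-by-5 cell array

% One struct element per parsed log line.
fields = {'timestamp', 'Msglib', 'MsgSublib', 'idx', 'FullContent'};
fileData = cell2struct(tok, fields, 2);
```

With this layout, fileData(2).FullContent is 'Running cleanup for test.' and the | characters inside the last message stay part of FullContent.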



Life is Wonderful on 18 Jul 2019
Adding the input file

Release

R2019a
