Trying to extract different numbers out of structured text
Hi guys,
I am trying to extract numbers from a set of strings. The numbers differ from line to line, but the lines follow a few recurring patterns, like this:
phrases = ["Analyst Actions: Stifel Nicolaus Cuts Apache Price Target to $37 From $40, Maintains Buy Rating"; % to $NEWPT From $OLDPT
"Analyst Actions: Citigroup Initiates Coverage on Expedia With Buy Rating, $130 Price Target"; % $NEWPT Price Target
"Johnson & Johnson's PT cut by Credit Suisse Group AG to $159.00. outperform rating. (NYSE:JNJ)"; % to $NEWPT
"Kroger's equal weight rating reiterated at Stephens. $35.00 PT. (NYSE:KR)"; % $NEWPT PT
"Analyst Actions: Citigroup Initiates Coverage on Booking Holdings With Buy Rating" % this row has no value and should be "None" "None"
% more similar lines in the same structure
]
% Extract NewPT from each line
% Extract OldPT from each line
% Write "None" where NewPT or OldPT values are null
I want to create two columns, NewPT and OldPT, extract the values as commented above, and assign "None" wherever a value doesn't exist.
I'd be thankful to anyone who can help me with this.
Thank you!
2 Comments
Rik
15 March 2020
There are only 3 cases, so it shouldn't be too difficult to write a parser. Did you try locating the values by searching for the dollar signs?
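A minimal sketch of that dollar-sign approach, for illustration only. It assumes the first dollar amount in a phrase is always the new price target and a second one, if present, is the old target. That holds for the sample rows in the question but may not hold for every headline. The variable names NewPT and OldPT follow the question, and the three phrases are copied from it.

```matlab
% Two sample rows from the question plus one with no price target.
phrases = ["Analyst Actions: Stifel Nicolaus Cuts Apache Price Target to $37 From $40, Maintains Buy Rating";
           "Kroger's equal weight rating reiterated at Stephens. $35.00 PT. (NYSE:KR)";
           "Analyst Actions: Citigroup Initiates Coverage on Booking Holdings With Buy Rating"];

n = numel(phrases);
NewPT = repmat("None", n, 1);   % default when no amount is found
OldPT = repmat("None", n, 1);

for k = 1:n
    % All numbers that directly follow a dollar sign, in order of appearance.
    % char input makes regexp return a cell array, which string() converts.
    amounts = string(regexp(char(phrases(k)), '(?<=\$)\d+(\.\d+)?', 'match'));
    if numel(amounts) >= 1
        NewPT(k) = amounts(1);  % assumption: first amount is the new target
    end
    if numel(amounts) >= 2
        OldPT(k) = amounts(2);  % assumption: second amount is the old target
    end
end
```

Because the values stay as text, "35.00" keeps its trailing zeros. Convert with str2double afterwards if numbers are needed.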
Accepted Answer
Akira Agata
15 March 2020
I believe a regular expression can extract the target part of each string. The following is an example.
% Extract target part of string
newpt = regexp(phrases,'((?<=to \$)\d+\.?\d*|\d+\.?\d*(?= (Price Target|PT)))','match','once');
oldpt = regexp(phrases,'(?<=From \$)\d+\.?\d*','match','once');
% Convert to numerical array
newpt = str2double(newpt);
oldpt = str2double(oldpt);
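After str2double, phrases without a match end up as NaN rather than "None". One possible way to get the two-column layout asked for in the question is sketched below. The numeric vectors are stand-in values (an assumption, used only to keep the snippet self-contained). In practice newpt and oldpt would come from the code above.

```matlab
% Stand-in values (assumption): in practice newpt and oldpt come from the
% str2double calls above, where NaN marks a phrase with no match.
newpt = [37; 130; 159; 35; NaN];
oldpt = [40; NaN; NaN; NaN; NaN];

% Convert to strings, then overwrite the NaN positions with "None".
NewPT = string(newpt);
OldPT = string(oldpt);
NewPT(isnan(newpt)) = "None";
OldPT(isnan(oldpt)) = "None";

% Two-column table, one row per phrase.
result = table(NewPT, OldPT);
```

Note that going through numeric values drops trailing zeros, so 159.00 is shown as "159". If the original formatting matters, keep the text returned by regexp instead of converting it.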