Problem only reading in select data

Hello all,
I am working on reading this data file into MATLAB, but I am having trouble grabbing only the data I want. The file is formatted as follows:
*Sale Item Price Profit
1200 00213 12.21 3.26*
Date Salesperson Cost Sold At Net Money
1/10/11 12 13.45 16.45 3
1/14/11 14 3.98 3.48 -0.5
1/24/11 03 4.60 14.60 10
*Sale Item Price Profit
65 01452 13.78 6.12*
Date Salesperson Cost Sold At Net Money
1/04/11 11 20.10 40.10 20
1/06/11 11 20.11 16.11 4
*Sale Item Price Profit*
...
And so on.
I only want MATLAB to read in the data within the asterisks. Any thoughts on how to do this?
Thanks

4 Comments

Matt Tearle on 6 Apr 2011
Just to clarify: the asterisks are actually in the file?
Zach on 6 Apr 2011
The asterisks are not in the file. I put them in simply to show you exactly which pieces of data I need to read in.
Matt Tearle on 6 Apr 2011
(To clarify the clarification: or are you looking to read data in any block with a certain headerline, i.e., "Sale Item Price Profit"?)
Zach on 6 Apr 2011
If I'm following you correctly, my answer is that I wish to read only the data associated with Sale, Item, Price, Profit.

Accepted Answer

Matt Tearle on 6 Apr 2011

1 vote

On the off-chance Walter's approach doesn't work (e.g., there are more than two block formats in the file), here's a more brute-force approach:
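fid = fopen('asterisk.txt','rt');
data = [];
while ~feof(fid)
    thisline = fgetl(fid);
    % Look for a block header that starts with "sale" (case-insensitive)
    if strncmpi('sale',thisline,4)
        % Read numeric rows until textscan reaches a line it cannot parse
        thisdata = textscan(fid,'%f %f %f %f','collectoutput',true);
        data = [data;thisdata{1}];
    end
end
fclose(fid);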
You can modify the if statement to match whatever specific pattern you want.
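As one illustration of such a modification (a sketch, not part of the original answer): to accept only the exact header line, rather than any line beginning with "sale", the whole trimmed line can be compared instead. The sample string stands in for what fgetl returns, and isSaleHeader is a made-up variable name. This assumes a MATLAB release that provides strtrim.

```matlab
% Stand-in for a line returned by fgetl (which returns -1 at end of file)
thisline = '  Sale Item Price Profit ';

% The ischar guard skips the -1 that fgetl returns at end of file;
% strtrim removes leading and trailing whitespace before the comparison.
isSaleHeader = ischar(thisline) && ...
    strcmpi(strtrim(thisline), 'Sale Item Price Profit');
```

The value of isSaleHeader would then take the place of the strncmpi call in the if statement.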

8 Comments

Walter Roberson on 6 Apr 2011
Nasty. That relies on the property of textscan() that it stops reading when the next available data does not match the first format element. With the information given, specifying that you only wanted to repeat the format once would avoid that problem -- but then you might as well use fscanf() instead of textscan().
Matt Tearle on 6 Apr 2011
I don't understand the objection. What do you mean by "specifying that you only wanted to repeat the format once"? I agree that you could parse line-by-line, but I'm assuming
1) you want to read all blocks that start with a headerline "Sale Item Price Profit"
2) you don't know a priori how many lines are in each of those blocks
3) every block in the file starts with a headerline
4) as I said above, there are multiple block formats, not just the two shown
Under those assumptions, I don't see why you shouldn't read each "Sale Item Price Profit" block with textscan, knowing that it will stop at the next headerline.
Zach on 6 Apr 2011
Well, I also learned that MATLAB 6.5 doesn't have textscan as a built-in function.
Walter Roberson on 6 Apr 2011
Matt, we weren't shown any examples of there being more than one line of data in a Sale block, so to match what was shown, a textscan() repeat count of 1 could be used without depending upon textscan to "back up" when it figures out something is unparsable.
But that doesn't help Zach, who doesn't have textscan() and thus should probably be using fscanf().
Zach on 6 Apr 2011
Is it even possible to parse data with varying blocks using fscanf? Also, I know that the way to ignore a field is to put an asterisk in its format specifier, but will that be able to handle the header string that we were reading in earlier?
Walter Roberson on 6 Apr 2011
In Matt's code example, replace the lines
thisdata = textscan(fid,'%f %f %f %f','collectoutput',true);
data = [data;thisdata{1}];
with
thisdata = fscanf(fid, '%f%f%f%f');
data = [data;thisdata];
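Put together, the loop with that substitution might look like the sketch below (not verified on MATLAB 6.5). One difference from the textscan version is worth noting: called without a size argument, fscanf returns everything it reads as a single column vector, so this variation passes [4 Inf] and transposes the result to keep one row per sale. The file name is the one from Matt's example, and the error check on fopen is an addition.

```matlab
fid = fopen('asterisk.txt','rt');
if fid == -1
    error('Could not open asterisk.txt');
end
data = [];
while ~feof(fid)
    thisline = fgetl(fid);
    if strncmpi('sale',thisline,4)
        % fscanf reads numbers until it meets non-numeric text (the next
        % header). With size [4 Inf] it fills a 4-row matrix column by
        % column, so each column holds one line of the file.
        thisdata = fscanf(fid,'%f',[4 Inf]);
        % Transpose so that each sale becomes one row of data
        data = [data; thisdata'];
    end
end
fclose(fid);
```

This assumes every sale row has exactly four numeric fields; if the count of numbers in a block is not a multiple of four, fscanf pads the last column with zeros.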
Zach on 6 Apr 2011
Thank you all for your help. If it isn't too much trouble, I have one final question for my own understanding: what exactly does the thisline portion do, and what does the 4 represent in the strncmpi function?
Matt Tearle on 6 Apr 2011
Walter, that makes sense. Thanks for the non-textscan version.
Zach, fgetl reads a single line of text from the file, and thisline is the variable that holds it. Then strncmpi compares the first 4 characters of that string with the string 'sale', ignoring case (that's what the 4 does). You can adapt this if, for example, you had other blocks that also started with "sale" (but then had something else after).
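A few quick comparisons show what the length argument changes. The 'Salesperson totals' header is invented for illustration; it does not appear in the sample file.

```matlab
% strncmpi(s1, s2, n) compares the first n characters, ignoring case.
strncmpi('sale', 'Sale Item Price Profit', 4)                    % returns 1 (true)
strncmpi('sale', 'Date Salesperson Cost Sold At Net Money', 4)   % returns 0 (false)

% A hypothetical header that also starts with "sale" matches on 4 characters...
strncmpi('sale', 'Salesperson totals', 4)                        % returns 1 (true)

% ...but not when more of the header is compared.
strncmpi('sale item', 'Salesperson totals', 9)                   % returns 0 (false)
```

Comparing more characters is one way to tell such blocks apart.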

More Answers (1)

Walter Roberson on 6 Apr 2011

1 vote

textread() with 'CommentStyle', {'Date', 'Profit'}

5 Comments

Matt Tearle on 6 Apr 2011
Grah! Scooped by Walter Quickdraw Roberson while I was fiddling about with clarifications. Anyway, yes:
fid = fopen('asterisk.txt','rt');
data = textscan(fid,'%f %f %f %f','CommentStyle', {'Date', 'Profit'},'headerlines',1);
fclose(fid);
Zach on 6 Apr 2011
I just tried applying this solution, and unfortunately I got an error telling me that Comment style must be a string. I am confused because I thought that is what {'Date','Profit'} was.
Matt Tearle on 6 Apr 2011
Can you cut/paste the exact code you used?
Walter Roberson on 6 Apr 2011
Zach: Which version of MATLAB are you using? Using a cell array of a pair of strings has been supported since at least 2007b, but there was probably a time when it wasn't supported.
Matt: You snooze, you loze! ;-)
Zach on 6 Apr 2011
Sorry, I went out to lunch. I am using MATLAB 6.5, so it probably wasn't supported in this version. I will try to use Matt's code listed below.
