How do I read a text file as fixed-width columns, as in Excel?

Hello,
I have a hydrologic model that outputs hourly data in a strange but fixed-width format. I can use Excel to get the format I need, but it would be useful to skip that step and do it in MATLAB.
Below is an example of the data. For some reason this MATLAB question window collapses the spacing, so I will give the widths of the 27 desired columns. Note: the widths include the leading white space of each field.
Column widths: 4,2,3,2,5,2,5,2,5,2,5,2,5,2,5,2,5,2,5,2,5,2,5,2,5,2,5
1989 1 2318 119 120 121 122 123 124 1
1989 1 24 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 110 111 112 1
1989 1 2413 114 115 116 117 118 119 120 121 122 123 124 1
Here is how the data should be separated, shown with vertical lines on the second example line:
1989| 1| 24| 1| 1| 2| 1| 3| 1| 4| 1| 5| 1| 6| 1| 7| 1| 8| 1| 9| 1|10| 1|11| 1|12| 1|
However, when I try fscanf with %c as the specifier, I just get a mess.
fid =fopen('Webster_Soil.SRO','r');
a = fscanf(fid,'%4c %2c %3c %2c %5c %2c %5c %2c %5c %2c %5c %2c %5c %2c %5c %2c %5c %2c %5c %2c %5c %2c %5c %2c %5c %2c %5c\n');
fclose all;
Does anyone have any solutions? Thanks in advance for the help,
Brandon Sloan

Answers (2)

Leah on 19 Mar 2013

0 votes

The sum of all of your column widths is 93, so each line would need 93 characters. The lines above have 37, 57, and 57 characters. I think you might be missing some white space, or your spacing array is wrong. I did something like this, using widths that match the lines as they appear above:
S={'1989 1 2318 119 120 121 122 123 124 1';...
   '1989 1 24 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 110 111 112 1';...
   '1989 1 2413 114 115 116 117 118 119 120 121 122 123 124 1'};
% Widths that fit the lines as displayed above (57 characters in total)
spacing=[4 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2];
Snew=zeros(length(S),length(spacing));
for i=2:length(S)        % line 1 is shorter than sum(spacing), so it is skipped
    Schar=char(S(i));
    last=1;              % index of the first character of the current field
    for j=1:length(spacing)
        Snew(i,j)=str2double(Schar(last:last+spacing(j)-1));
        last=last+spacing(j);
    end
end
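The running `last` counter in the loop above can also be replaced by start and stop indices computed once with `cumsum`. The following is only a minimal sketch of that idea, not Leah's code: the widths are the first seven from the question, the sample line is synthetic (built with `sprintf` so that the spacing survives), and the names `txt`, `starts`, `stops`, and `fields` are illustrative.

```matlab
% Sketch: slice one fixed-width line using indices derived from the widths.
widths = [4 2 3 2 5 2 5];                          % first seven widths from the question
txt    = sprintf('%4d%2d%3d%2d%5d%2d%5d', 1989, 1, 24, 1, 1, 2, 1);  % synthetic line

stops  = cumsum(widths);                           % last character of each field
starts = stops - widths + 1;                       % first character of each field

fields = zeros(1, numel(widths));
for j = 1:numel(widths)
    % str2double ignores the leading blanks inside each field
    fields(j) = str2double(txt(starts(j):stops(j)));
end
disp(fields)
```

Because `starts` and `stops` depend only on the widths, they can be computed once and reused for every line of the file.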

3 Comments

Brandon Sloan on 22 Mar 2013
Leah,
Thanks for your quick response. Yes, this text box on the MATLAB website will not let me display the correct spacing for whatever reason, but your code works well once I modify it for my spacing. There is just one problem: it will not read incomplete lines like the first one in my data example. When I define the loop as for j = 1:length(spacing), it gives an error because it tries to access an index beyond the length of the character array for the first line. I am trying to create a workaround, but was wondering if you had any ideas.
Additionally, I was wondering what the best way would be to read a large text file of this data line by line so that it matches what the code does. Thanks a lot for your help.
Brandon
Brandon Sloan on 22 Mar 2013
Never mind, I got it to work after playing with the numbering. Thank you very much for the help! The code that works is:
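fid=fopen('Webster_Soil10m.SRO');
g = textscan(fid,'%s','Delimiter','\n');   % read each line of the file as one string
fclose(fid);
S = g{1,1};
Snew = zeros(length(S),27);                % fields missing from short lines stay 0
spacing=[4 2 3 2 5 2 5 2 5 2 5 2 5 2 5 2 5 2 5 2 5 2 5 2 5 2 5];
for i=1:length(S)
    Schar=S{i,1};
    last=1;                                % index of the first character of the current field
    % Field count from the line length; exact when the line ends on a
    % 5-wide value field (a length of 9+7k characters gives 3+2k fields)
    x = (length(Schar)+1.5)/3.5;
    for j=1:x
        Snew(i,j)=str2double(Schar(last:last+spacing(j)-1));
        last=last+spacing(j);
    end
end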
Cedric, I tried your code, but it kept giving me an error at the reshape call because the total number of elements was not divisible by the number of elements in a row. Thanks anyway for the help.
Brandon
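The expression `(length(Schar)+1.5)/3.5` in the code above depends on this particular width pattern. A more general way to count the fields that fit in a short line is to compare the cumulative widths against the line length. The sketch below is an illustration under the assumption that a line is only ever cut at a field boundary; the short line is synthetic (built with `sprintf` to mimic the first example line), and the names `shortLine`, `nFields`, and `row` are hypothetical.

```matlab
% Sketch: count the complete fields in a short line from the cumulative widths.
spacing = [4 2 3 2 5 2 5 2 5 2 5 2 5 2 5 2 5 2 5 2 5 2 5 2 5 2 5];
stops   = cumsum(spacing);                 % last character of each field
starts  = stops - spacing + 1;             % first character of each field

% Synthetic line: year, month, day, then hours 18..24, each followed by the value 1
shortLine = [sprintf('%4d%2d%3d', 1989, 1, 23), ...
             sprintf('%2d%5d', [18:24; ones(1,7)])];

nFields = find(stops <= length(shortLine), 1, 'last');   % fields that fit completely

row = nan(1, numel(spacing));              % fields that are absent stay NaN
for j = 1:nFields
    row(j) = str2double(shortLine(starts(j):stops(j)));
end
```

For this line `nFields` agrees with the formula in the code above. The sketch marks absent fields as NaN rather than 0, which keeps them distinguishable from real zero values; whether that is preferable depends on how the data is used afterwards.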
Cedric on 22 Mar 2013
Hi Brandon, sorry, I made a mistake; please see the EDIT in my post.


Cedric on 22 Mar 2013
Edited: Cedric on 22 Mar 2013

0 votes

You could go for something like the following:
colw = [4,2,3,2,5,2,5,2,5,2,5,2,5,2,5,2,5,2,5,2,5,2,5,2,5,2,5] ;
buffer = fileread('Webster_Soil.SRO') ;
buffer(buffer<' ') = [] ; % Remove \n,\r, etc. [EDITED]
buffer = reshape(buffer, sum(colw), []).' ;
data = str2double(mat2cell(buffer, ones(size(buffer,1),1), colw)) ;

2 Comments

Brandon Sloan on 22 Mar 2013
Hey Cedric, when I do that I keep getting this error:
??? Error using ==> reshape
Product of known dimensions, 93, not divisible into total number of elements, 244.
Error in ==> fd at 4
buffer = reshape(buffer, sum(colw), []).' ;
It is the same error that I got before, and I am not sure why. My other code does work, so it is not too big of a deal. Thanks though.
Brandon
Cedric on 22 Mar 2013
Edited: Cedric on 22 Mar 2013
Well, it means that not all the lines in the file are 93 characters long (93 being the sum of the widths that you gave).
This code assumes that the whole file is padded according to the widths that you defined. It reads the whole content in one shot and eliminates all line feeds, carriage returns, etc., which leaves a vector of valid characters (whose length is assumed to be 93 times the number of lines, if the file has the structure defined above). The vector is then reshaped into an array of size n x 93, where n is the number of lines (an operation that fails if the length of the vector is not a multiple of 93). This array is then split into strings according to your widths (which outputs a cell array) and then converted to an array of doubles.
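If some lines are shorter than 93 characters, one possible workaround (not part of the original answer) is to split the text into lines first and pad each one with trailing blanks before splitting it into columns. The sketch below illustrates the idea on in-memory text with a shortened, hypothetical width vector instead of the real file; `txt`, `recs`, and `padded` are illustrative names.

```matlab
% Sketch: pad ragged lines to the full record width, then split by column widths.
colw = [4 2 3 2 5];                                  % hypothetical, shortened widths
txt  = sprintf('1989 1 24 1    7\n1989 1 25\n');     % second line is incomplete

recs   = regexp(txt, '[^\r\n]+', 'match');           % one cell per non-empty line
padded = char([recs, {blanks(sum(colw))}]);          % char() pads rows to equal length
padded = padded(1:end-1, 1:sum(colw));               % drop helper row, trim any excess

% Blank fields convert to NaN
data = str2double(mat2cell(padded, ones(size(padded,1),1), colw));
```

The extra row of blanks only guarantees that every row is at least `sum(colw)` characters wide, even when no line in the input is complete. With the real file, `txt` would come from `fileread` and `colw` would be the full 27-element vector.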



Asked: 19 Mar 2013

Edited: 8 May 2015
