textscan: problem with TreatAsEmpty when the empty-value marker is the minus sign (-)

Ricardo MF on 30 Oct 2013
Edited: Cedric on 30 Oct 2013
I am having trouble importing the .tsv file shown below. The Economic Institute that publishes it uses the minus sign (-) as its empty-value marker. When I call textscan with 'TreatAsEmpty' set to '-', the entries with negative values are read incorrectly.
My textscan call is:
a=textscan(fid,'%s%s%s%n%n%n%n%n%n%n%n%n%n%n%n','delimiter',',','headerlines',4,'ReturnOnError',1,'treatAsEmpty',{'-'});
Can anybody help me?
headline1
headline2
headline3
headline4
Índice base fixa com ajuste sazonal (2011=100) (Número índice),"Tecidos, vestuário e calçados",103.17,-99.69,99.64,101.25,102.79,-104.80,105.25,104.64,104.85,102.74,104.85,105.47
Índice base fixa com ajuste sazonal (2011=100) (Número índice),"Móveis e eletrodomésticos",108.69,109.91,110.48,111.66,108.12,113.13,112.44,113.57,111.46,113.69,113.32,116.62
Índice base fixa com ajuste sazonal (2011=100) (Número índice),"Móveis",-,-,-,-,-,-,-,-,-,-,-,-
1 Comment
Cedric on 30 Oct 2013
Edited: Cedric on 30 Oct 2013
Please attach a typical file to the question.
EDIT: well, disregard this comment if my solution below works. If it doesn't, please attach a file, or send it to me by email if it's too confidential to go fully public.

Accepted Answer

Cedric on 30 Oct 2013
Edited: Cedric on 30 Oct 2013
Here is a proposal that I can refine when/if you attach a file to your question:
content = fileread( 'myFile.tsv' ) ;
content = regexprep( content, '-(,|$)', ',' ) ;
data = textscan( content, '%s%s%s%n%n%n%n%n%n%n%n%n%n%n%n', ...
'Delimiter', ',', 'HeaderLines', 4, 'ReturnOnError', 1 ) ;
The idea is to eliminate minus signs which are before a comma or at the end of the file, before using TEXTSCAN.
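As a quick sanity check of the idea, the pattern from the snippet above can be tried on a single in-memory line. The sample string below is made up for illustration and is not taken from the real file.

% Hypothetical one-line sample: two empty markers and one negative value.
sampleLine = '"Móveis",-,-5.2,-,101.25' ;
% Same pattern as above: a minus immediately followed by a comma
% (or by the end of the text) is replaced by a bare comma.
cleaned = regexprep( sampleLine, '-(,|$)', ',' ) ;
disp( cleaned )   % displays: "Móveis",,-5.2,,101.25

The negative value -5.2 survives because its minus sign is followed by a digit rather than by a comma.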

More Answers (1)

Ricardo MF on 30 Oct 2013
Dear Cedric, thanks so much for your answer. This is actually my first post (question) here at www.mathworks.com. I tried your suggestion, but the function did not remove the minus at the end of each line. Maybe the end of line is not marked by that character? Thanks a lot.
1 Comment
Cedric on 30 Oct 2013
Edited: Cedric on 30 Oct 2013
Dear Ricardo, please perform the following update: replace the call to REGEXPREP with
content = regexprep( content, '-(?=[,\r\n]|$)', '' ) ;
The lookahead only checks what follows the minus sign and leaves the comma or line break in place, so the minus is replaced with an empty string. As I didn't have your file I tried on a one-line string, and, as you pointed out, I forgot to implement the pattern for the end of line. Let me know if it is too slow; there are other patterns which could apply, and it's difficult to know in advance which are faster. If it still doesn't work, feel free to send me the file or a chunk of it so I can perform tests.
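For reference, here is a minimal sketch of the lookahead approach on a small in-memory sample. The two records, the shortened format string, and the LF line endings are assumptions made for illustration. The sketch also assumes the minus should be replaced with an empty string, so that the existing comma stays the only delimiter.

% Hypothetical two-record sample with LF line endings (not the real file).
content = sprintf( '%s\n%s', ...
    'Tecidos,103.17,-99.69,-', ...
    'Moveis,-,-,-104.80' ) ;
% Delete every minus that is followed by a comma, a line break,
% or the end of the text. The lookahead does not consume what follows.
content = regexprep( content, '-(?=[,\r\n]|$)', '' ) ;
% Parse the cleaned text: one label column and three numeric columns.
data = textscan( content, '%s%n%n%n', 'Delimiter', ',' ) ;
disp( data{3} )   % second numeric column

Empty numeric fields should come back as NaN, which is the default 'EmptyValue' of TEXTSCAN, while -99.69 and -104.80 keep their sign.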
