map first field to second field in txt file

Hi,
I have this text file:
1::Toy Story (1995) ::Animation|Children's|Comedy
2::Jumanji (1995) ::Adventure|Children's|Fantasy
8::Tom and Huck (1995) ::Adventure|Children's
I want to map, for example, 1 to Animation, 2 to Adventure, and 8 to Adventure. That is, I need to create a text file with two columns: the first column contains 1, 2, 8 and the second column contains Animation, Adventure, Adventure.
Please, how do I do that? Thanks in advance.

Accepted Answer

per isakson on 31 Jul 2012
Edited: per isakson on 5 Aug 2012

1 vote

A slight modification of the textscan command I provided to your question the other day will read the file. (You never explained how "::" should be interpreted.) What do you mean by "I read each filed alone of a one row, textscan do not work with it."? If you don't need a column add "*" after "%", e.g. "%*d" to suppress the first column.
Thus
>> cac = txt2m
cac =
[3x1 int32] {3x1 cell} {3x1 cell}
>> cac{:}
ans =
1
2
8
ans =
'Toy Story (1995) '
'Jumanji (1995) '
'Tom and Huck (1995) '
ans =
'Animation|Children's|Comedy'
'Adventure|Children's|Fantasy'
'Adventure|Children's'
>>
where the function, txt2m, is given by
function cac = txt2m()
    fid = fopen( 'cssm.txt' );
    cac = textscan( fid, '%d%s%s'           ...
        , 'Delimiter'           , ':'       ...
        , 'CollectOutput'       , false     ...
    ... , 'EmptyValue'          , -999      ...
    ... , 'ExpChars'            , ''        ...
        , 'MultipleDelimsAsOne' , true      ...
        , 'Whitespace'          , ''        );
    fclose( fid );
end
Then use regexp (and str2num, if you need numeric values) to extract the year:
>> regexp( cac{2}, '\d{4}', 'match' )
ans =
{1x1 cell}
{1x1 cell}
{1x1 cell}
>> ans{:}
ans =
'1995'
ans =
'1995'
ans =
'1995'
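As a follow-up to the regexp output above, here is a minimal sketch of turning the matched years into numbers. It swaps str2num for str2double, which accepts a cell array of strings directly. The variable names (titles, yearStr, years) are illustrative, and the pattern assumes the year always appears in parentheses, so that a four-digit number elsewhere in a title is not picked up.

```matlab
% Titles as returned in cac{2} by txt2m (sample values from the question)
titles = {'Toy Story (1995) '; 'Jumanji (1995) '; 'Tom and Huck (1995) '};

% Match four digits enclosed in parentheses; 'once' returns one string per
% title instead of a nested cell. The lookarounds keep the parentheses out
% of the match (assumption: the year is always written as "(yyyy)").
yearStr = regexp( titles, '(?<=\()\d{4}(?=\))', 'match', 'once' );

% Convert the cell array of strings to a numeric column vector
years = str2double( yearStr );
```

A title without a parenthesized year gives an empty match, which str2double turns into NaN, so missing years are easy to spot.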
--- In response to the answer below ---
This modified function, txt2m, reads and parses your file. It reads the file into a string with the function fileread (thanks Walter, I didn't know of that one) and replaces "::" with "¤" (knock on wood). I just picked a character on the keyboard that I hope does not occur in your data.
Try
>> cac = txt2m()
cac =
[13x1 int32] {13x1 cell} {13x1 cell}
>>
where
cssm.txt contains your 13 rows
and where
function cac = txt2m()
    str = fileread( 'cssm.txt' );
    str = strrep( str, '::', '¤' );
    cac = textscan( str, '%d%s%s'           ...
        , 'Delimiter'           , '¤'       ...
        , 'CollectOutput'       , false     ...
    ... , 'EmptyValue'          , -999      ...
    ... , 'ExpChars'            , ''        ...
        , 'MultipleDelimsAsOne' , true      ...
        , 'Whitespace'          , ''        );
end
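The original question asked for a two-column file that maps each ID to the first genre. A minimal sketch of that last step follows, assuming cac has the layout returned by txt2m (IDs in cac{1}, genre strings in cac{3}). The sample cac is built inline so the snippet runs on its own, and the output file name id_genre.txt is a hypothetical choice.

```matlab
% Stand-in for cac = txt2m(), using three rows from the question
cac = { int32( [1; 2; 8] ), ...
        {'Toy Story (1995)'; 'Jumanji (1995)'; 'Tom and Huck (1995)'}, ...
        {'Animation|Children''s|Comedy'; 'Adventure|Children''s|Fantasy'; 'Adventure|Children''s'} };

ids = cac{1};

% Keep everything before the first "|" in each genre string
firstGenre = regexp( cac{3}, '^[^|]+', 'match', 'once' );

% Write one "id<TAB>genre" line per movie (file name is an assumption)
outFile = fullfile( tempdir, 'id_genre.txt' );
fid = fopen( outFile, 'w' );
if fid == -1
    error( 'Could not open %s for writing', outFile );
end
for k = 1:numel( ids )
    fprintf( fid, '%d\t%s\n', ids(k), firstGenre{k} );
end
fclose( fid );
```

If lower-case genre names are wanted, as written in the question, lower(firstGenre) can be applied before writing.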

13 Comments

huda nawaf on 31 Jul 2012
Thanks. When I write c1=cac{1} I get the data of all three columns. Can we get c1=cac{1}; c2=cac{2}; c3=cac{3}?
Many thanks.
per isakson on 31 Jul 2012
Edited: per isakson on 31 Jul 2012
If so, your textscan command is not identical to mine. Or, less likely, it is the MATLAB version. I use R2012a. See above. Try
cac2 = cac{1};
Use the Variable Editor to inspect the content of variables. Try double-click "cac" in the Workspace window.
huda nawaf on 31 Jul 2012
Edited: Walter Roberson on 31 Jul 2012
Now I get c1=cac{1}
1
2
8
but when I write c2=cac{2}
I get the other two fields together, not separated.
Why?
per isakson on 31 Jul 2012
Hard for me to guess. I don't know what exactly you are doing. Do you use this "line"?
'CollectOutput' , false ...
huda nawaf on 31 Jul 2012
this is what I do
f1=fopen('d:\matlab\r2011a\bin\movielens\1m_mov\movies.txt');
cac = textscan( f1, '%d %s %s' ...
, 'Delimiter' , ':' ...
, 'CollectOutput' , true ...
... , 'EmptyValue' , -999 ...
, 'ExpChars' , '' ...
, 'MultipleDelimsAsOne' , true ...
, 'Whitespace' , '' );
fclose( f1 )
c1=cac{1};c2=cac{2};c3=cac{3}
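The difference between 'CollectOutput', true and false can be seen on a small in-memory sample. This is only a sketch: the sample string is tab-delimited to keep it short, and the variable names are illustrative. With true, textscan concatenates consecutive columns of the same class, so the two %s columns end up in a single N-by-2 cell array and cac{3} does not exist.

```matlab
% Two tab-delimited sample rows (stand-in for the real file)
str = sprintf( '1\tToy Story (1995)\tAnimation\n2\tJumanji (1995)\tAdventure\n' );

% One output cell per format specifier: {int32, cell, cell}
separate = textscan( str, '%d%s%s', 'Delimiter', '\t', ...
                     'Whitespace', '', 'CollectOutput', false );

% Same-class neighbours are merged: {int32, 2-column cell}
merged   = textscan( str, '%d%s%s', 'Delimiter', '\t', ...
                     'Whitespace', '', 'CollectOutput', true );

numel( separate )     % three cells: id, title, genres
numel( merged )       % two cells: id, [title genres]
size( merged{2} )     % rows-by-2 cell array holding both string columns
```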
huda nawaf on 31 Jul 2012
Now I get the three fields when I use false instead of true.
But I get just 12 values for each field, while I have 3000 values in each column. Why?
per isakson on 1 Aug 2012
Edited: per isakson on 1 Aug 2012
I cannot guess
  1. Any error message?
  2. What do the data lines 12 to 14 look like?
huda nawaf on 2 Aug 2012
Thanks. There is no error message. These are the lines:
12::Dracula: Dead and Loving It (1995)::Comedy|Horror
13::Balto (1995)::Animation|Children's
14::Nixon (1995)::Drama
I will check my code with another text file, then tell you.
huda nawaf on 2 Aug 2012
I tried with another text file and got the same thing. I used a file with 7 rows,
but when I run the code it gives me the values of only 2 rows.
per isakson on 2 Aug 2012
Edited: per isakson on 2 Aug 2012
The row
12::Dracula: Dead and Loving It (1995)::Comedy|Horror
contains four values not three as assumed when designing the format string. The ":" after "Dracula" is interpreted as a delimiter.
There is no simple way AFAIK to prescribe "::" as delimiter.
One way is to read the complete file as one string or one string per line and replace "::" by another character and use that as delimiter. Which character would be safe to use as delimiter?
You should have inspected the result, cac, of the read operation.
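One way to make the replacement character safe is to check that it does not occur in the data before substituting it. The sketch below assumes a tab (char(9)) as the stand-in delimiter; that choice, the inline sample string, and the variable names are illustrative and not part of the original answer.

```matlab
% Sample rows, including the one with a single ":" inside the title
str = sprintf( ['12::Dracula: Dead and Loving It (1995)::Comedy|Horror\n' ...
                '13::Balto (1995)::Animation|Children''s\n'] );

% Candidate delimiter (assumption: a tab never appears in the file)
delim = char( 9 );
assert( ~any( str == delim ), 'Delimiter already occurs in the data' );

% Replace the two-character separator, then parse on the single character
str = strrep( str, '::', delim );
cac = textscan( str, '%d%s%s', 'Delimiter', '\t', 'Whitespace', '' );

cac{2}{1}   % the title keeps its single ":" because only "::" was replaced
```

For a real file, str would come from fileread, and the assert stops the run early if the chosen character turns out to be unsafe.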
huda nawaf on 3 Aug 2012
Thanks. I corrected the error and did what you suggested, but the problem is not solved.
per isakson on 3 Aug 2012
Edited: per isakson on 3 Aug 2012
  1. What did you do? What does your new code look like?
  2. How does it behave? What output? What error message?
Why do you expect me to guess?
per isakson on 4 Aug 2012
Edited: per isakson on 4 Aug 2012
Why don't you care to respond?


More Answers (1)

huda nawaf on 4 Aug 2012
Edited: Walter Roberson on 4 Aug 2012

0 votes

I just need to read a text file with this format:
1::Toy Story (1995) ::Animation|Children's|Comedy
2::Jumanji (1995) ::Adventure|Children's|Fantasy
8::Tom and Huck (1995) ::Adventure|Children's
There is no error message, but I have 3000 rows, and when I read the file with the code you sent earlier I get just the first 12 rows.
I want to map the first field to the first word of the third field,
e.g. 1 Animation, 2 Adventure, 8 Adventure.
This is what I need. The first 13 rows of my file:
1::Toy Story (1995)::Animation|Children's|Comedy
2::Jumanji (1995)::Adventure|Children's|Fantasy
3::Grumpier Old Men (1995)::Comedy|Romance
4::Waiting to Exhale (1995)::Comedy|Drama
5::Father of the Bride Part II (1995)::Comedy
6::Heat (1995)::Action|Crime|Thriller
7::Sabrina (1995)::Comedy|Romance
8::Tom and Huck (1995)::Adventure|Children's
9::Sudden Death (1995)::Action
10::GoldenEye (1995)::Action|Adventure|Thriller
11::American President, The (1995)::Comedy|Drama|Romance
12::Dracula: Dead and Loving It (1995)::Comedy|Horror
13::Balto (1995)::Animation|Children's
thanks

3 Comments

Walter Roberson on 4 Aug 2012
textscan() will not work for this, at least not as-is. You can read the file (such as by using fileread() ) and then use regexp() to parse it.
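A minimal sketch of the fileread() plus regexp() route suggested here, under the assumption that every line has the form id::title::genre1|genre2|... The pattern and variable names are illustrative. The sample string stands in for the result of fileread so the snippet runs on its own.

```matlab
% Stand-in for str = fileread('cssm.txt')
str = sprintf( ['12::Dracula: Dead and Loving It (1995)::Comedy|Horror\n' ...
                '13::Balto (1995)::Animation|Children''s\n'] );

% Per line: digits, "::", title (lazy, so it stops at the next "::"),
% "::", then the first genre (everything up to "|" or the line end).
% 'lineanchors' lets ^ match at the start of every line.
tok = regexp( str, '^(\d+)::([^\r\n]*?)::([^|\r\n]*)', ...
              'tokens', 'lineanchors' );

% tok is 1-by-N, each element a 1-by-3 cell; stack into an N-by-3 cell
tok = vertcat( tok{:} );

ids        = str2double( tok(:,1) );   % numeric IDs
firstGenre = tok(:,3);                 % first genre of each movie
```

Because the separator is matched as the literal two-character sequence "::", the single ":" in "Dracula: Dead and Loving It" stays inside the title.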
per isakson on 4 Aug 2012
Edited: per isakson on 4 Aug 2012
See my answer above. I hope the lines you don't show don't contain "¤".
huda nawaf on 5 Aug 2012
Thanks to both Walter and per. At last, I got what I need thanks to your efforts.
