How to discard all punctuation from a text file

Fateme Jalali, 18 January 2016
Commented: Image Analyst, 16 September 2017
Hello, I need MATLAB code to discard all punctuation and signs from a text file. I want to keep only letters and numbers. Thanks.

Accepted Answer

John BG, 19 January 2016
Star Strider is right, but perhaps Fateme wants to keep the spaces: a space is not punctuation, and spaces will help when reading the resulting string. The initial string is
str1 = 'Hello, I need 1 MATLAB code to discard all punctuation, and signs from 9 text files.'
Lstr1=length(str1)
The characters that have to remain are
str_space='\s'
str_caps='[A-Z]'
str_ch='[a-z]'
str_nums='[0-9]'
Their respective positions within str1 are
ind_space=regexp(str1,str_space)
ind_caps=regexp(str1,str_caps)
ind_chrs=regexp(str1,str_ch)
ind_nums=regexp(str1,str_nums)
mask=[ind_space ind_caps ind_chrs ind_nums]
Now let's find the positions of all punctuation characters: start from every index in str1 and delete the indices of the characters to keep.
num_str2=1:1:Lstr1
num_str2(mask)=[]
Now num_str2 contains the positions of the punctuation characters to remove.
str3=str1
str3(num_str2)=[]
str3 contains the resulting string, with nothing left but letters, digits, and spaces.
str3 =
Hello I need 1 MATLAB code to discard all punctuation and signs from 9 text files
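The index bookkeeping above can also be collapsed into one call. This is a minimal sketch, not part of the original answer, assuming that "keep" means ASCII letters, digits, and whitespace only; it uses regexprep to delete everything outside that set.

```matlab
str1 = 'Hello, I need 1 MATLAB code to discard all punctuation, and signs from 9 text files.';

% [^...] is a negated character class: match any character that is NOT
% an ASCII letter, a digit, or whitespace, and replace it with nothing.
str3 = regexprep(str1, '[^A-Za-z0-9\s]', '');

disp(str3)
```

Because the class is written as explicit ASCII ranges, accented or other non-ASCII letters would be removed as well; widen the class if the text contains them.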
Hope it helps in your reading.
John
1 Comment
Fateme Jalali, 19 January 2016
Thank you. After discarding all punctuation, I want to put a white space before the first character and after the last one. How can I do it? Thanks.


More Answers (3)

Star Strider, 18 January 2016
Edited: Star Strider, 19 January 2016
One possible approach:
str = 'Hello, I need 1 MATLAB code to discard all punctuation, and signs from 9 text files.';
Idx = regexp(str, '[^. , !]');
Result = str(Idx)
EDIT: To keep the spaces, just remove them from the regexp pattern string (in the first version, I was telling it to exclude spaces as well as the punctuation):
str = 'Hello, I need 1 MATLAB code to discard all punctuation, and signs from 9 text files.';
Idx = regexp(str, '[^.,!]');
Result = str(Idx)
Result =
Hello I need 1 MATLAB code to discard all punctuation and signs from 9 text files
You can add as many other punctuation marks or other characters as necessary between the square brackets ‘[]’, depending on what appears in your strings that you do not want in the result.
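Since the question is about a text file rather than a string literal, here is a hedged sketch of the same idea applied to a file's contents. The file names are hypothetical (temporary files are created so the example is self-contained), and the pattern keeps ASCII letters, digits, and whitespace instead of listing each punctuation mark, which is an assumption about what should survive.

```matlab
% Create a small sample file so the sketch runs on its own (hypothetical content).
inFile = [tempname '.txt'];
fid = fopen(inFile, 'w');
fprintf(fid, '%s\n', 'Hello, world! It costs 9 dollars.');
fclose(fid);

% Read the whole file into one character vector, newlines included.
txt = fileread(inFile);

% Remove everything except ASCII letters, digits, and whitespace.
cleaned = regexprep(txt, '[^A-Za-z0-9\s]', '');

% Write the cleaned text to a second (hypothetical) file.
outFile = [tempname '.txt'];
fid = fopen(outFile, 'w');
fprintf(fid, '%s', cleaned);
fclose(fid);

disp(cleaned)
```

Keeping `\s` in the class preserves the line breaks, so the output file has the same line structure as the input.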
2 Comments
Mubashir, 16 September 2017
I want to use this on a table. I can't remove the punctuation from a whole column.
Image Analyst, 16 September 2017
So then go down the table's column one row at a time. What's wrong with that? It's easy.
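A minimal sketch of that row-by-row loop, assuming the column is a cell array of character vectors. The table T and the column name Comment are made up for illustration, and the pattern keeps only ASCII letters, digits, and whitespace.

```matlab
% Hypothetical table with one text column stored as a cell array of char vectors.
T = table({'Hello, world!'; 'No punctuation here'; 'Wait... what?'}, ...
          'VariableNames', {'Comment'});

pattern = '[^A-Za-z0-9\s]';   % anything that is not a letter, digit, or whitespace

% Go down the column one row at a time.
for k = 1:height(T)
    T.Comment{k} = regexprep(T.Comment{k}, pattern, '');
end

disp(T)
```

regexprep also accepts a cell array of character vectors directly, so `T.Comment = regexprep(T.Comment, pattern, '');` should give the same result without the loop.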



Walter Roberson, 19 January 2016
https://www.mathworks.com/matlabcentral/newsreader/view_thread/127125
2 Comments
Fateme Jalali, 19 January 2016
Thank you. After discarding all punctuation, I want to put a white space before the first character and after the last one. How can I do it? Thanks.
Walter Roberson, 19 January 2016
YourString = [' ' YourString ' '];
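Putting the two steps together, as a small sketch: the sample text is made up, and the regexprep pattern is one assumed way of discarding the punctuation first.

```matlab
YourString = 'Hello, world!';

% Step 1: discard everything except ASCII letters, digits, and whitespace.
YourString = regexprep(YourString, '[^A-Za-z0-9\s]', '');

% Step 2: concatenate one space in front and one behind.
YourString = [' ' YourString ' '];

% Brackets are printed only to make the added spaces visible.
disp(['[' YourString ']'])
```

Square brackets concatenate character vectors horizontally, so the result is two characters longer than the cleaned string.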



Image Analyst, 19 January 2016
Yet another way:
str = 'Hello, ~!@#$^&*()_+.,<>;"?I need 1 MATLAB code to discard all punctuation, and signs from 9 text files.'
% Get logical index to keep space, numbers, and upper and lower case letters.
keeperIndexes = str == ' ' | (str>='0' & str<='9') | ...
(str>='a' & str<='z') | (str>='A' & str<='Z');
strOut = str(keeperIndexes) % Extract only those elements
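The same logical-index idea can be sketched with isstrprop instead of explicit range comparisons. This is a variant, not the code above: isstrprop classifies characters by category, so it may also keep non-ASCII letters and digits, and 'wspace' keeps tabs and newlines as well as plain spaces. The sample string is shortened for illustration.

```matlab
str = 'Hello, ~!@#$^&*()_+.,<>;"?I need 1 MATLAB code.';

% True where a character is a letter or digit, or is whitespace.
keeperIndexes = isstrprop(str, 'alphanum') | isstrprop(str, 'wspace');

strOut = str(keeperIndexes);   % extract only those elements
disp(strOut)
```

Use the explicit comparisons from the original answer when the result must be restricted to ASCII.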
