storing words

NUR KHAIRUNNISA rahimi on 22 Nov 2011
I have made a few changes; however, my program is not able to store the processed words in a cell array, which leads to an error when the next lines of code execute:
??? Subscript indices must either be real positive integers or logicals.
Error in ==> WordSearchtryout2>loadWordBank at 170
if strcmp(wordbank{iword}(end), 's')
Error in ==> WordSearchtryout2 at 59
loadWordBank();
I understand that this code might be super long, and you might not have any time to check any of it, but if you do, I would totally appreciate it.
function loadWordBank
    clc;
    [FILENAME, pathname] = uigetfile('*.wsb','Read Matlab Code File');
    if isequal(FILENAME,0) || isequal(pathname,0)
        fprintf('User pressed cancel');
    else
        fprintf('User selected %s \n', FILENAME);
    end
    fid = fopen(FILENAME,'rt');
    if fid<0
        %error could not find the file
        return,
    end
    total_no_words=0;
    lineNUM=1;
    wordbank=cell(250000,1);
    no_plurals=0;
    while ~feof(fid)
        tline = fgetl(fid);
        if(isempty(tline))
            %line is empty, skip it
            continue;
        end
        if(~ischar(tline))
            %if line does not contain character
            fclose(fid);
            break;
        end
        %we finally have a string
        tline=strtrim(tline);
        if(sum(isspace(tline)))==length(tline)
            continue;
        end
        is_letter=isletter(tline);
        is_space=isspace(tline);
        checkcount=sum(is_letter+is_space);
        if checkcount~=length(tline)
            % invalid line in puzzle
            return;
        end
        %tline only contain spaces and letters
        if strcmp(tline,lower(tline))==1 || strcmp(tline,upper(tline))==1
            total_no_words=total_no_words+1;
            wordbank(total_no_words,1)={tline};
        end
        lineNUM=lineNUM+1;
    end
    wordbank2=wordbank;
    for iword=1:length(wordbank)
        if strcmp(wordbank{iword}(end), 's')
            no_plurals=no_plurals+1;
            wordbank2{iword} = wordbank{iword}(1:(end-1));
        end
    end
    inv_words=line_NUM-length(wordbank);
    % this include empty lines, lower upper case letters, as required
    no_dup=length(wordbank2)-length(unique(wordbank2));
    fprintf('LOAD WORD BANK \n');
    fprintf('Loading word bank: none....started\n');
    fprintf('Loading word bank: %s\b\b\b\n',FILENAME);
    fprintf('Successfully loaded %d words from the word bank file\n',total_no_words);
    fprintf('Removing invalid words..%d words were successfully removed. \n',inv_words); %not complete
    fprintf('Removing duplicate words and sorting...done\n');
    fprintf('Removed %d duplicate words\n',no_dup);
    fprintf('Searching for and removing any plural forms of words ending in S:%%\n');
    fprintf('Removed %d plural word \n',no_plurals);
    fprintf('Building word indices and calculating beginning letter counts...done\n');
    fprintf('Calculating word length counts...done\n');
    fprintf('Final word count: %d \n');
end
1 Comment
Walter Roberson on 22 Nov 2011
Time for the debugger. At the command prompt,
dbstop if error
then rerun. When it stops, look carefully at the values, make sure you are calling the functions you think you are, and so on.
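For readers without the original .wsb file, the failure can be reproduced in miniature. The sketch below is only an illustration: the five-cell array and the variable failed_at are made up for the example, and try/catch stands in for what you would inspect by hand once dbstop if error has paused execution.

```matlab
% Hypothetical miniature of the question's situation: five preallocated
% cells, only the first one filled.
wordbank = cell(5, 1);
wordbank{1} = 'cats';

failed_at = 0;   % index at which the loop stopped (0 means no failure)
try
    for iword = 1:length(wordbank)
        last_char = wordbank{iword}(end);   % same expression as in the error
    end
catch err
    failed_at = iword;                      % loop variable keeps its value
    fprintf('Failed at iword = %d: %s\n', iword, err.message);
    fprintf('Cell is empty: %d\n', isempty(wordbank{iword}));
end
```

The loop gets past the filled cell and stops at the first cell that was never assigned, which is the kind of value worth checking in the debugger.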


Accepted Answer

Walter Roberson on 23 Nov 2011
Your line
for iword=1:length(wordbank)
should be
for iword=1:total_no_words
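as you do not want to be trying to de-pluralize words that were never stored. length(wordbank) is fixed at 250000 because you preallocated it as a cell array with that many entries, so every cell past total_no_words is still empty. For an empty cell, end evaluates to 0, and indexing with 0 raises the subscript error.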
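A minimal sketch of the corrected loop, assuming a much smaller preallocation and three made-up sample words in place of the file contents. Trimming wordbank2 to the filled cells at the end is an extra step not in the original code; it is added here on the assumption that later calls such as unique should only see stored words.

```matlab
% Preallocated as in the question, but smaller for illustration.
wordbank = cell(10, 1);
words = {'cats', 'dog', 'trees'};          % hypothetical sample data
total_no_words = numel(words);
wordbank(1:total_no_words) = words(:);

wordbank2 = wordbank;
no_plurals = 0;
for iword = 1:total_no_words               % only the cells that were filled
    if strcmp(wordbank{iword}(end), 's')
        no_plurals = no_plurals + 1;
        wordbank2{iword} = wordbank{iword}(1:end-1);   % drop the trailing 's'
    end
end

% Assumption: discard the unused cells before any further processing.
wordbank2 = wordbank2(1:total_no_words);
```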
As an optimization, you can replace your checkcount lines that currently say
checkcount=sum(is_letter+is_space);
if checkcount~=length(tline)
with
if ~all(is_letter | is_space)
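The two forms of the validity test can be compared side by side. This is a small sketch on made-up input strings; invalid_sum and invalid_all are names introduced here only for the comparison.

```matlab
% Hypothetical sample lines: two valid, two containing other characters.
lines_to_check = {'hello world', 'abc123', 'UPPER', 'semi;colon'};
results = false(numel(lines_to_check), 2);

for k = 1:numel(lines_to_check)
    tline = lines_to_check{k};
    is_letter = isletter(tline);
    is_space  = isspace(tline);

    % Original test: count letters plus spaces and compare to the length.
    invalid_sum = sum(is_letter + is_space) ~= length(tline);

    % Suggested test: every character must be a letter or a space.
    invalid_all = ~all(is_letter | is_space);

    results(k, :) = [invalid_sum, invalid_all];
end

disp(results);   % the two columns should agree row by row
```

Both tests flag the same lines because no character is a letter and a space at once, so the sum counts each valid character exactly once.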
And after that your line
if strcmp(tline,lower(tline))==1 || strcmp(tline,upper(tline))==1
could be optimized to
if strcmp(tline,lower(tline)) || strcmp(tline,upper(tline))
However! This line checks that tline is all in upper-case or all in lower-case and will not store any line that is in mixed-case. Your all-upper-case lines will be stored in upper-case. I suspect you do not intend either of these behaviours, and I cannot tell what you are wanting to check with that "if" statement. I suspect you should be using
tline = lower(tline);
and then storing that unconditionally.
1 Comment
NUR KHAIRUNNISA rahimi on 23 Nov 2011
Yes, I do intend to store the words that are either only lower case or only upper case, and not store mixed cases.
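To make that intended behaviour concrete, here is a small sketch of which strings pass the case check. The candidate list is made up for illustration, and keep and stored are names introduced only for this example.

```matlab
% Hypothetical candidates covering lower, upper, and mixed case.
candidates = {'apple', 'APPLE', 'Apple', 'two words', 'MiXeD'};
keep = false(size(candidates));

for k = 1:numel(candidates)
    t = candidates{k};
    % Same condition as in the question: all lower-case or all upper-case.
    keep(k) = strcmp(t, lower(t)) || strcmp(t, upper(t));
end

stored = candidates(keep);
disp(stored);
```

With this filter, 'apple' and 'APPLE' are both stored as distinct strings, so a later call to unique will not count them as duplicates, since it compares strings case-sensitively.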


More Answers (1)

Fangjun Jiang on 22 Nov 2011
I am not sure if you follow your own question. There is a much easier way to do this.
1 Comment
NUR KHAIRUNNISA rahimi on 23 Nov 2011
I decided to use a cell array because it is easier for me to manipulate in a way I understand. However, your suggestion has helped me understand more functions in MATLAB, thank you. I will try to use it in the next part of the program.
