In the following code, I check to see if the first letter is in the dictionary of words and if the length of the word matches. If it does, return the word. Otherwise, return an error statement.
words = {'apple', 'banana', 'bee', 'salad', 'corn', 'elephant', 'pterodactyl'};
user_letter_input = input('Please enter the first letter of a word: ', 's');
user_num_input = input('Please enter how long you would like the word to be: ');
for i = words
if ((i{1}(1) == user_letter_input) && (length(i{1}) == user_num_input))
result = i;
else
result = 0;
end
end
if (result == 0)
disp('There are no matching words');
else
disp(['Your new word is: ' result]);
end
The comparison returns i being 'apple' if I type a for the first input and 5 for the second input - as it should.
However, at the end when I try to see if (result == 0), it does not display the new word, even though result is not 0.
Could someone help me fix this please?
You are overwriting result on every pass through your for loop. After the loop, result is nonzero only if the last word in words matches your criteria; in every other case it ends up as 0.
I would recommend storing the matching words in a separate cell array, or have a boolean array to indicate which words match. In my opinion, using a boolean is better as it takes less memory and doesn't duplicate data.
words = {'apple', 'banana', 'bee', 'salad', 'corn', 'elephant', 'pterodactyl'};
user_letter_input = input('Please enter the first letter of a word: ', 's');
user_num_input = input('Please enter how long you would like the word to be: ');
isMatch = false(size(words));
for k = 1:numel(words)
word = words{k};
isMatch(k) = word(1) == lower(user_letter_input) && ...
numel(word) == user_num_input;
end
if ~any(isMatch)
disp('There are no matching words');
else
disp(['Your matching words are:', sprintf(' %s', words{isMatch})]);
end
Also, as a side note, don't iterate over the cell array directly in the for loop like that; it leads to a lot of confusion. Also avoid using i as a loop variable, since it shadows the imaginary unit.
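To illustrate the side note, here is a minimal sketch (the variable names loopClass and indexedClass are only for illustration) showing what the loop variable holds in each style. Looping over a 1-by-N cell array hands you a 1-by-1 cell on each pass, which is why the original code needed i{1}; indexing by position gives the char array directly.

```matlab
words = {'apple', 'bee'};

% Looping over the cell array itself: w is a 1x1 cell on every pass.
loopClass = '';
for w = words
    loopClass = class(w);      % 'cell', so the text is w{1}, not w
end

% Looping over positions: words{k} is the char array itself.
indexedClass = '';
for k = 1:numel(words)
    indexedClass = class(words{k});   % 'char'
end
```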
You're overwriting result each time the word in your dictionary doesn't match. The only time this will work is if the last word matches. You need to change both your initialization of result and your loop:
result = 0; %// assume that no words match
for i = words
if (....
result = 1; %// we found a match... record it
end
%// no else! If we get no match, result will already be 0
end
You can use a flag to detect whether a match was found:
breakflag = 0
for i = words
if ((i{1}(1) == user_letter_input) && (length(i{1}) == user_num_input))
breakflag = 1;
break;
end
end
if (breakflag == 0)
disp('There are no matching words');
else
disp(['Your new word is: ' i{1}]); % i is a 1x1 cell, so unwrap it
end
I have some rows in a text file that contain NA, and I want to delete them.
When I used isempty(strfind(l,'NA')), it also deleted rows whose strings merely contain NA, such as 'RNASE' and 'GNAS'.
example
0.552353744371678 NA
0.0121476193502138 ANG;RNASE
0.189489997218949 GNAS
0.0911820441646675 MYCL1
output:
0.0911820441646675 MYCL1
output expected:
0.0121476193502138 ANG;RNASE
0.189489997218949 GNAS
0.0911820441646675 MYCL1
Using single regexp I do not know how to find
"NA that does not have any alphanumeric character before or after".
I mean, it is easy if you know there will be at least one other character before and after:
ind = regexp(str, '[^A-Za-z_]NA[^A-Za-z_]'); %Or something similar, depending what exactly can and cannot be there.
However, this pattern requires a character before and after, so it will not match a lone 'NA' at the very start or end of the string.
That is to say, I am nearly certain suitable regexp exists, I just don't know it :)
What I would do is (assuming strl = single line with text you are deciding to keep or remove, that might have multiple NA).
ind = regexp(strl, 'NA'); % Start index of every NA in the string.
removestr = ~isempty(ind); % A line without any NA is always kept.
for i = 1 : length(ind)
% NA spans ind(i) and ind(i)+1; check the characters on either side of it.
if (ind(i) == 1 || any(regexp(strl(ind(i)-1), '[^A-Za-z_]'))) ...
&& (ind(i)+1 == length(strl) || any(regexp(strl(ind(i)+2), '[^A-Za-z_]')))
disp('This is maybe the string to remove - if there are no wrong NAs later')
else
removestr = false;
break; % stop checking in this loop, this string is to keep.
end
end
if (removestr)
disp('Remove string')
end
Conditions in if are a bit overkill and quite slow, but should work. If you don't require checking for multiple NA in a single line, simply omit for loop.
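A single pattern may also be possible with lookaround assertions, which match a position without consuming a character and therefore also work at the start and end of a line. The sketch below is an assumption on my part rather than part of the approach above; the variable names rows, pat, hasStandaloneNA and kept are made up, and the sample rows are the ones from the question.

```matlab
rows = {'0.552353744371678 NA', ...
        '0.0121476193502138 ANG;RNASE', ...
        '0.189489997218949 GNAS', ...
        '0.0911820441646675 MYCL1'};

% NA that is neither preceded nor followed by a letter or underscore.
pat = '(?<![A-Za-z_])NA(?![A-Za-z_])';

% With a cell input and 'once', regexp returns one start index (or []) per row.
hasStandaloneNA = ~cellfun(@isempty, regexp(rows, pat, 'once'));

kept = rows(~hasStandaloneNA);   % rows without a standalone NA
```

With the sample rows, only the first one is dropped; the RNASE and GNAS rows survive because their NA is surrounded by letters.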
I would like to see if the letter inputted by the user matches any of the words in a dictionary.
Could someone please help me do this? Thank you!
words = {'apple', 'banana', 'bee', 'salad', 'corn', 'elephant', 'pterodactyl'};
user_letter_input = input('Please enter the first letter of a word: ');
for i = words
if (i starts with user_letter_input)
disp(['Your new word is: ' i]);
end
end
You can use:
if(i{1}(1) == user_letter_input)
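If you prefer to avoid the loop, a possible alternative (my assumption, not part of the line above) is strncmp, which compares the first n characters and accepts a cell array directly. The letter is hard-coded here instead of read with input so the sketch runs on its own; matches is an illustrative name.

```matlab
words = {'apple', 'banana', 'bee', 'salad', 'corn', 'elephant', 'pterodactyl'};
user_letter_input = 'b';   % fixed for illustration; normally read with input(..., 's')

% Compare only the first character of every word with the letter.
isMatch = strncmp(words, user_letter_input, 1);
matches = words(isMatch);

disp(['Your matching words are:', sprintf(' %s', matches{:})]);
```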
Here's a different, admittedly more hackish approach:
w = char(words); %// convert to 2D char array, padding with spaces
result = find(w(:,1)==user_letter_input); %// test equality with first column
result will be a vector with the indices of all matching words. For example,
words = {'apple', 'banana', 'bee', 'salad', 'corn', 'elephant', 'pterodactyl'};
user_letter_input = 'b'
will give
result =
2
3
I want to check for spelling errors within an article. I have 100 articles to check; if an article contains an error, the result should be 1, otherwise 0. I have to split each article into words before checking. I have done all of this here, but the problem is that I cannot check the spelling of the split words. However, I can check a single word with
deliberate_mistake = 'tabel';
suggestion = checkSpelling(deliberate_mistake)
output:
suggestion =
'table'
checkSpelling.m file
function suggestion = checkSpelling(word)
h = actxserver('word.application');
h.Document.Add;
correct = h.CheckSpelling(word);
if correct
suggestion = []; %return empty if spelled correctly
else
%If incorrect and there are suggestions, return them in a cell array
if h.GetSpellingSuggestions(word).count > 0
count = h.GetSpellingSuggestions(word).count;
for i = 1:count
suggestion{i} = h.GetSpellingSuggestions(word).Item(i).get('name');
end
else
%If incorrect but there are no suggestions, return this:
suggestion = 'no suggestions';
end
end
%Quit Word to release the server
h.Quit
f20.m file
for i = 1:1
data2=fopen(strcat('DATA\',int2str(i),''),'r')
CharData = fread(data2, '*char')'; %read text file and store data in CharData
fclose(data2);
word =regexp(CharData,' ','split')
[sizeData b] = size(word);
suggestion = checkSpelling(word)
Your input is a cell array; try giving your function a single string input instead. That works for me.
eliminate punctuation
words split when meeting new line and space, then store in array
check the text file got error or not with the function of checkSpelling.m file
sum up the total number of error in that article
no suggestion is assumed to be no error, then return -1
sum of error>20, return 1
sum of error<=20, return -1
I would like to check a paragraph for spelling errors, but I am having trouble getting rid of the punctuation. The problem may have another cause as well; it returns the error below:
My data2 file is :
checkSpelling.m
function suggestion = checkSpelling(word)
h = actxserver('word.application');
h.Document.Add;
correct = h.CheckSpelling(word);
if correct
suggestion = []; %return empty if spelled correctly
else
%If incorrect and there are suggestions, return them in a cell array
if h.GetSpellingSuggestions(word).count > 0
count = h.GetSpellingSuggestions(word).count;
for i = 1:count
suggestion{i} = h.GetSpellingSuggestions(word).Item(i).get('name');
end
else
%If incorrect but there are no suggestions, return this:
suggestion = 'no suggestion';
end
end
%Quit Word to release the server
h.Quit
f19.m
for i = 1:1
data2=fopen(strcat('DATA\PRE-PROCESS_DATA\F19\',int2str(i),'.txt'),'r')
CharData = fread(data2, '*char')'; %read text file and store data in CharData
fclose(data2);
word_punctuation=regexprep(CharData,'[`~!@#$%^&*()-_=+[{]}\|;:\''<,>.?/','')
word_newLine = regexp(word_punctuation, '\n', 'split')
word = regexp(word_newLine, ' ', 'split')
[sizeData b] = size(word)
suggestion = cellfun(@checkSpelling, word, 'UniformOutput', 0)
A19(i)=sum(~cellfun(@isempty,suggestion))
feature19(A19(i)>=20)=1
feature19(A19(i)<20)=-1
end
Replace your regexprep call with
word_punctuation=regexprep(CharData,'\W','\n');
Here \W matches every non-alphanumeric character (including spaces), and each match is replaced with a newline.
Then
word = regexp(word_punctuation, '\n', 'split');
As you can see you don't need to split by space (see above). But you can remove the empty cells:
word(cellfun(@isempty,word)) = [];
Everything worked for me. However, I have to say that your checkSpelling function is very slow. On every call it has to create an ActiveX server object, add a new document, and delete the object after the check is done. Consider rewriting the function to accept a cell array of strings.
UPDATE
The only problem I see is that this removes the quote character ' (I'm, don't, etc.). You can temporarily substitute it with an underscore (yes, it's considered alphanumeric) or with any sequence of unused characters. Or, instead of \W, you can list all the non-alphanumeric characters to be removed in square brackets.
UPDATE 2
Another solution to the 1st UPDATE:
word_punctuation=regexprep(CharData,'[^A-Za-z0-9''_]','\n');
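Putting the pieces together, the whole tokenizing step might look like the sketch below. The sample text in CharData is made up for illustration; in the real script it comes from fread.

```matlab
% Made-up sample text with punctuation, apostrophes and a line break.
CharData = sprintf('I''m here, don''t go!\nNew line: ok.');

% Turn everything except letters, digits, apostrophes and underscores into newlines.
word_punctuation = regexprep(CharData, '[^A-Za-z0-9''_]', '\n');

% Split on newlines, then drop the empty cells left by consecutive separators.
word = regexp(word_punctuation, '\n', 'split');
word(cellfun(@isempty, word)) = [];
```

Each remaining cell holds one word, so the cell array can be passed to cellfun one word at a time.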
I have a text file that looks like this:
(a (bee (cold down)))
if I load it using
c=textscan(fid,'%s');
I get this:
'(a'
'(bee'
'(cold'
'down)))'
What I would like to get is:
'('
'a'
'('
'bee'
'('
'cold'
'down'
')'
')'
')'
I know I can delimit with '(' and ')' by specifying 'Delimiter' in textscan, but then I will lose these characters, which I want to keep.
Thank you in Advance.
The %s specifier indicates that you want strings; what you want is individual characters. Use %c instead.
c=textscan(fid,'%c');
Update: if you want to keep your words intact, then you'll want to load your text using the %s specifier. After the text is loaded, you can either solve this problem with regular expressions (not my forte) or write your own parser that parses each word individually and saves the parentheses and words to a new cell array.
AFAIK, there is no canned routine capable of preserving arbitrary delimiters.
You'd have to do it yourself:
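str = '(a (bee (cold down)))';
bo = str == '(';
bc = str == ')';
sp = str == ' ';
output = cell(numel(str),1); % upper bound: at most one token per character
j = 1;
for ii = 1:numel(str)
if bo(ii) || bc(ii)
if ~isempty(output{j}) % a word is in progress: finish it before storing the paren
j = j + 1;
end
output{j} = str(ii);
j = j + 1;
elseif sp(ii)
if ~isempty(output{j}) % a space only ends a word that is in progress
j = j + 1;
end
else
output{j} = [output{j} str(ii)];
end
end
output = output(~cellfun(@isempty, output)); % drop the unused preallocated cells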
Which can probably be improved -- the growing character array will prevent the loop from being JIT'ed. The array bc | bo | sp holds all the information to vectorize this thing, I just don't see how at this hour...
Nevertheless, it should give you a place to start.
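For comparison, a regular expression can probably do the same job in one call. This is a sketch of an alternative and not a replacement for the loop above: the pattern treats a single parenthesis as one token and any run of characters that are neither parentheses nor spaces as another. The name tokens is only for illustration.

```matlab
str = '(a (bee (cold down)))';

% Either one parenthesis, or a run of characters that are not parens or spaces.
tokens = regexp(str, '[()]|[^() ]+', 'match');

% Show one token per line.
fprintf('%s\n', tokens{:});
```

The 'match' option returns the matched substrings in order as a 1-by-N cell array, so the parentheses are kept rather than consumed as delimiters.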
Matlab has a strtok function similar to C. Its format is:
token = strtok(str)
token = strtok(str, delimiter)
[token, remain] = strtok('str', ...)
there is also a string replace function strrep:
modifiedStr = strrep(origStr, oldSubstr, newSubstr)
What I would do is modify the original string with strrep to add in delimiters, then use strtok. Since you already scanned the string into c:
% Assumes c is a char row vector holding the whole line.
c = strrep(c,'(','( '); % add a space after each open paren
c = strrep(c,')',' ) '); % add a space before and after each close paren
token = {}; % tokens are collected in a cell array
[token{1}, remain] = strtok(c, ' ');
i = 2;
while ~isempty(strtrim(remain)) % stop once only spaces are left
[token{i}, remain] = strtok(remain, ' ');
i = i + 1;
end
This gives you a linear cell array containing each of the tokens you requested.
strtok reference: http://www.mathworks.com/help/techdoc/ref/strtok.html
strrep reference: http://www.mathworks.com/help/techdoc/ref/strrep.html