Using strmatch in a loop - matlab

Can I do what is coded below with or without a loop?
I have a character array of more than 5000 unique words and another array of approximately 3000 words. I want to search for each word of my array, named word, in the other array, named uniques, and create a feature vector, i.e. a value of 1 if the word exists and 0 if it doesn't.
I am doing the following:
load 'uniques' %uniques={'alpha','ok','abc'};
fid=fopen(myfilename);
words=textscan(fid,'%s');
fclose(fid);
word=words{1,1}; %word={'good','bad','anywhere','countries','ok','done','abc'}
for i=1:size(uniques,2)
ind=strmatch(word(i), uniques, 'exact');
end
Given the example uniques and word arrays above, the result must be 0 for good, because good is not in uniques, and likewise 0 for the other missing words, but 1 for ok because it does exist in uniques. In the end I must have {0,0,0,0,1,0,1}.
After I run it, I get ind=[].
Please guide.

You have described the exact functionality of the ismember function:
ismember(word, uniques);
As an aside, this is what @nkjt was saying about fixing your loop:
for i=1:numel(word) % loop over the words to look up, not over uniques
ind(i)=~isempty(strmatch(word{i}, uniques, 'exact')); % 1 if found, 0 if not
end
But this loop is unnecessary, since MATLAB has this as a built-in function.
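As a minimal sketch of the ismember approach, using the small sample arrays from the question in place of the real files (the variable name featureVec is only illustrative):

```matlab
% Sample data taken from the comments in the question.
uniques = {'alpha','ok','abc'};
word    = {'good','bad','anywhere','countries','ok','done','abc'};

% One logical entry per element of word: true where the word occurs in uniques.
featureVec = ismember(word, uniques);

disp(double(featureVec))
```

For a cell array of words, ismember returns a logical array of the same size as word, so it can be used directly as the feature vector or converted with double().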

Related

MATLAB fwrite\fread issue: two variables are being concatenated

I am reading in a binary EDF file and I have to split it into multiple smaller EDF files at specific points and then adjust some of the values inside. Overall it works quite well, but when I read in the file it combines two character arrays with each other. Obviously everything afterwards gets corrupted as well. I am at a dead end and have no idea what I'm doing wrong.
The part of the code (writing) that has to contain the problem:
byt=fread(fid,8,'*char');
fwrite(tfid,byt,'*char');
fwrite(tfid,fread(fid,44));
%new number of records
s = records;
fwrite(tfid,s,'*char');
fseek(fid,8,0);
%test
fwrite(tfid,fread(fid,8,'*char'),'*char');
When I use the reader, it combines the records (fwrite(tfid,s,'*char')) with the value of the next variable. All variables before this are displayed correctly. The relevant code of the reader:
hdr.bytes = str2double(fread(fid,8,'*char')');
reserved = fread(fid,44);%#ok
hdr.records = str2double(fread(fid,8,'*char')');
if hdr.records == -1
beep
disp('There appears to be a problem with this file; it returns an out-of-spec value of -1 for ''numberOfRecords.''')
disp('Attempting to read the file with ''edfReadUntilDone'' instead....');
[hdr, record] = edfreadUntilDone(fname, varargin);
return
end
hdr.duration = str2double(fread(fid,8,'*char')');
The likely problem is that your character array s does not have 8 characters in it, but you expect there to be 8 when you read it from the file. Whatever the number of characters in the array is, that's how many values fwrite will write out to the file. Anything less than 8 characters and you'll end up reading part of the next piece of data when you read from the file.
One fix would be to pad s with blanks before writing it:
s = [blanks(8-numel(records)) records];
In addition, the syntax '*char' is only valid when using fread: the * indicates that the output class should be 'char' as well. It's unnecessary when using fwrite.
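The following is a small round-trip sketch of the fixed-width idea, not the asker's actual EDF code. The record count, the 'NEXTDATA' placeholder field and the temporary file are all assumptions made for illustration; sprintf('%-8d', ...) is used here as an alternative way to pad to 8 characters (with trailing rather than leading blanks).

```matlab
% Hypothetical record count; in the question this comes from the variable records.
nRecords = 42;

% Pad to exactly 8 characters so the reader's fread(fid,8,'*char') stays aligned.
field = sprintf('%-8d', nRecords);

% Write the padded field followed by a placeholder 8-character field.
tmpFile = [tempname '.bin'];
fid = fopen(tmpFile, 'w');
fwrite(fid, field, 'char');        % plain 'char', no '*' when writing
fwrite(fid, 'NEXTDATA', 'char');
fclose(fid);

% Read both fields back the same way the reader in the question does.
fid = fopen(tmpFile, 'r');
readBack  = str2double(fread(fid, 8, '*char')');
nextField = fread(fid, 8, '*char')';
fclose(fid);
delete(tmpFile);
```

If the number needs more than 8 digits, sprintf produces a longer string and the alignment breaks again, so it may be worth checking numel(field) before writing.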

How to read a number from text file via Matlab

I have 1000 text files and want to read a number from each file.
The format of each text file is:
af;laskjdf;lkasjda123241234123
$sakdfja;lskfj12352135qadsfasfa
falskdfjqwr1351
##alskgja;lksjgklajs23523,
asdfa#####1217653asl123654fjaksj
asdkjf23s#q23asjfklj
asko3
I need to read the number ("1217653") that follows "#####" in each text file.
The number immediately follows "#####" in every text file.
"#####" and the number following it appear only once in each file.
clc
clear
MyFolderInfo = dir('yourpath/folder');
fidin = fopen(file_name,'r','n','utf-8');
while ~feof(fidin)
tline=fgetl(fidin);
disp(tline)
end
fclose(fidin);
It is not finished yet. I am stuck on the problem that it cannot read past the blank line.
This is another approach, using the function regexp. It provides a more flexible way of reading files and does not require reading the full file in one go. The difference from the example already given is basically that I read the file line by line, but since the example uses this approach I believe it is worth answering. This will return all occurrences of "#####NUMBER".
function test()
h = fopen('myfile.txt');
str = fgetl(h);
k = 1;
res = {};
while ischar(str) % An empty line returns '' (still char); EOF returns the number -1
res{k} = regexp(str,'#####\d+','match');
k = k+1;
str = fgetl(h);
end
fclose(h);
for k=1:length(res)
disp(res{k});
end
end
EDIT
Using the expression '#####(\d+)' and the argument 'tokens' instead of 'match' will return only the digits after the "#####", as a string. The intent of this post was also, apart from showing another way to read the file, to show how to use regexp with a simple example. Both alternatives can be used with suitable conversion.
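To illustrate the difference between 'match' and 'tokens', here is a short sketch applied to one of the sample lines from the question (the variable names are only illustrative):

```matlab
% One line copied from the sample file in the question.
txt = 'asdfa#####1217653asl123654fjaksj';

% 'match' returns the whole matched text, including the ##### prefix.
whole = regexp(txt, '#####\d+', 'match', 'once');

% 'tokens' returns only the part captured by the parentheses.
tok = regexp(txt, '#####(\d+)', 'tokens', 'once');

% Convert the captured digits to a number, guarding against lines with no match.
if ~isempty(tok)
    num = str2double(tok{1});
else
    num = NaN;
end
```

With the 'once' option, regexp returns the first match only, which fits the statement that "#####" appears once per file.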
Assuming the following:
All files are ASCII files.
The number you are looking to extract is directly following #####.
The number you are looking for is a natural number.
##### followed by a number only occurs once per file.
You can use this code snippet inside a for loop to extract each number:
regx='#####(\d+)';
str=fileread(fileName);
num=str2double(regexp(str,regx,'tokens','once'));
Example of for loop
This code will iterate through ALL files in yourpath/folder and save the numbers into num.
regx='#####(\d+)'; % Create regex
folderDir='yourpath/folder';
files=dir(folderDir); % List everything in folderDir
files=files(~[files.isdir]); % Keep files only; this also drops . and ..
num=zeros(1,length(files)); % Preallocate
for i=1:length(files) % Iterate through files
str=fileread(fullfile(folderDir,files(i).name)); % Read the file contents into str
num(i)=str2double(regexp(str,regx,'tokens','once')); % Extract number using regex
end
If you want to extract more "advanced" numbers, e.g. signed integers or real numbers, or handle several occurrences of #####NUMBER in a file, you will need to update your question with a better representation of your text files.

How to Have Multiple or Conditions for While Loop

I'm trying to make a basic while loop to get back into the swing of things with matlab. All I'm trying to do is create a prompt to ask the user if today is their birthday and if they say yes it'll wish them happy birthday and if they say no it'll say "that's too bad". I can make the prompts appear but what I want to do is unless the user inputs 'yes' or 'no' they will continually be asked if today is their birthday. My question is how I create the loop to prompt my question over and over until the user inputs 'yes' or 'no'.
Try this:
while 1
b = input('Is today your birthday? ','s');
if any(strcmpi(b,{'yes','no'}))
break
end
end
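Since input waits for the keyboard, the check itself can be tried out separately. The sketch below wraps the condition in a hypothetical helper, isValidAnswer, written with an explicit || to show the "multiple or conditions" form, and applies it to a few made-up replies:

```matlab
% Hypothetical helper: true when the reply is 'yes' or 'no' in any letter case.
isValidAnswer = @(reply) strcmpi(reply, 'yes') || strcmpi(reply, 'no');

% Made-up replies standing in for what a user might type.
replies = {'maybe', 'YES', 'no'};
valid = cellfun(isValidAnswer, replies);

disp(valid)
```

In the loop above, the same test could be used as the exit condition, e.g. if isValidAnswer(b), break, end.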
Here is a way (there are many others):
Use a while loop containing the prompt (here I use inputdlg); once the user enters an answer, check whether the string matches yes or no, ignoring case. If it does not, the dialog box pops up again. If it does, a message appears.
In order to compare against multiple strings at once, you can call strcmpi with the answer provided by the user and a cell array containing the strings you are looking for (i.e. yes/no). If the answer corresponds to any of the strings, the array (called CheckAns) contains a 1 and the sum is different from 0; otherwise the sum equals 0, so the loop continues.
That's a lot of words so here is the code:
%// Initialize the lookup array. All 0 to start so that we enter the loop
CheckAns = [0 0];
while ~sum(CheckAns)
Ans = inputdlg('Is this your birthday?');
CheckAns = strcmpi(Ans,{'yes';'no'});
if strcmpi(Ans,'yes')
disp('Happy birthday')
elseif strcmpi(Ans,'no')
disp('Haha loser')
end
end

Getting Variables from a data structure and creating a matrix from those variables

I have a data structure which has data points named Vel1 to Vel1520. However, when I apply Uorder = orderfields(MeanU_Velocity); the variables put in the order Vel1 Vel10 Vel100 Vel1000 Vel1001 Vel1002 etc. Is there any way to sort the data structure such that it lists the variables from 1 to 1520 in ascending order? Regards, Jer
An easy fix is to always use the same number of digits: 0001, 0002, ..., 0010, ..., 1520.
Instead of num2str(42), try sprintf('Vel%04d', 42). This prints formatted text to a string. %04d is a format specifier that says: pad with zeros, reserve 4 places, print an integer. Have a look at the documentation and at MATLAB's formatted strings tutorial for more comprehensive examples.
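A small sketch of how zero-padded names interact with orderfields, assuming the fields are created with dynamic field names (the struct S and the three sample indices are made up for illustration):

```matlab
% Build a struct with zero-padded field names, deliberately out of order.
S = struct();
for k = [10 2 1]
    S.(sprintf('Vel%04d', k)) = k;   % e.g. Vel0010, Vel0002, Vel0001
end

% orderfields sorts the names alphabetically; with equal-width numbers
% this coincides with numeric order.
S = orderfields(S);
names = fieldnames(S);

disp(names)
```

This only helps if the names can be changed where the struct is built; existing fields such as Vel1 or Vel10 would have to be renamed first.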

How to randomly select from a list of 47 names that are entered from a data file?

I have managed to input a number data file into a matrix but have been unable to do so for any data that is not a number.
I have a list of 47 names and am supposed to generate a random name from the list. I have tried to use the function textscan but was not getting anywhere. Also, how do I generate a random name from the list? All I have been able to do is generate a random number between 1 and 47.
Appreciate the replies. I should have said I need it in MATLAB sorry.
Here is a sample list of data in my data file
name01
name02
name03
and the code to read it:
fid = fopen('names.dat','rt');
headerChars = fgetl(fid);
data = fscanf(fid,'%f,%f,%f,%f',[4 47]).';
fclose(fid);
The above is what I have to read the data file into a matrix, but it is only reading the first line. (Yes, it was modified from a previous post on this forum :/)
Edit: As per the helpful comments from mtrw, and the fixed formatting of the sample data file, I've updated my answer with more detail.
With a single name (i.e. "Bob", "Bob Smith", or "Smith, Bob") on each line of the file, you can use the function TEXTSCAN by specifying '%s' as the format argument (to denote reading a string) and the newline character '\n' as the 'Delimiter' (the character that separates the strings in the file):
fid = fopen('namefile.txt','r');
names = textscan(fid,'%s','Delimiter','\n');
fclose(fid);
Then it's a matter of randomly picking one of the names. You can use the function RANDI to generate a random integer in the range from 1 to the number of names read from the file (found using the NUMEL function):
names = names{1}; %# Get the contents from the cell returned by TEXTSCAN
selectedName = names{randi(numel(names))};
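As a quick way to try the selection step without a data file, here is a sketch that uses a small in-memory list in place of the textscan result. The three names come from the sample data in the question; the call to rng is an addition, used only to make the draw repeatable.

```matlab
% Stand-in for names{1} as returned by textscan.
names = {'name01'; 'name02'; 'name03'};

% Fix the random seed so repeated runs pick the same name (optional).
rng(0);

% Draw a random index between 1 and the number of names, then look it up.
idx = randi(numel(names));
selectedName = names{idx};

disp(selectedName)
```

Dropping the rng call gives a different draw on each run.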
Sounds like you're halfway home. Take that random number and use it as an index for the list.
For example, if you randomly generate the number 23 then fetch the 23rd entry in the list which gives you a random name draw.
In a spreadsheet, use the RANDBETWEEN function to get a random number within your range. Use INDEX to get the actual cell value. For instance:
=INDEX(A1:A47, RANDBETWEEN(1, 47))
The above will work for your specific case of 47 names, assuming they're in column A. In general, you'd want something like:
=INDEX(MyNames, RANDBETWEEN(1, ROWS(MyNames)))
This assumes you've named your range of cells "MyNames" (for example, by selecting all the cells in your range and setting a name in the naming box). The above formula works because INDEX counts rows relative to the top of the MyNames range, and the ROWS function returns the total number of rows in MyNames.