Finding the number of occurrences of a pattern within a cell in MATLAB?

I have a cell array like this:
x = {'3D'
'B4'
'EF'
'D8'
'E7'
'6C'
'33'
'37'}
Let's assume that the cell is 1000x1. I want to count the occurrences of pattern = [30;30;64;63] within this cell, in the order shown. In other words, it should first check x{1,1},x{2,1},x{3,1},x{4,1},
then check x{2,1},x{3,1},x{4,1},x{5,1}, and so on until the end of the cell, and return the number of occurrences.
Here is my code, but it doesn't work:
while (size (pattern)< size(x))
count = 0;
for i=1:size(x)-length(pattern)+1
if size(abs(x(i:i+size(pattern)-1)-x))==0
count = count+1;
end
end
end

Your example code has a couple of issues. First, I don't believe it performs any comparison, which is necessary to identify occurrences of the pattern within the search data (x). Second, there is a type mismatch between x and pattern: one is a cell array of strings and the other is a numeric array.
One way to approach this problem is to restructure x and pattern as strings and then use strfind to find occurrences of pattern. This method only works if there is no missing data in either variable, since it relies on every entry being exactly two characters long.
x = {'3D';'B4';'EF';'D8';'E7';'6C';'33';'37';'xE';'FD';'8y'};
pattern = {'EF','D8'};
collated_x=[x{:}];
collated_pattern = [pattern{:}];
found_locations = strfind(collated_x, collated_pattern);
% Remove 'offset' matches that start at even locations
found_locations = found_locations(mod(found_locations,2)==1);
count = length(found_locations)
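If the entries are not guaranteed to be exactly two characters long, the concatenation trick no longer lines up. A minimal sketch of an alternative, assuming the pattern is also stored as a cell array of strings, slides a window over x and compares the cells directly with isequal (the names n and window are illustrative and not from the question):

```matlab
x = {'3D';'B4';'EF';'D8';'E7';'6C';'33';'37';'xE';'FD';'8y'};
pattern = {'EF';'D8'};

x = x(:);                 % force column shape so windows match pattern(:)
pattern = pattern(:);
n = numel(pattern);

count = 0;
for i = 1:numel(x) - n + 1
    window = x(i:i+n-1);  % n consecutive cells starting at position i
    if isequal(window, pattern)
        count = count + 1;
    end
end
disp(count)
```

This is slower than strfind on long inputs, but it makes no assumption about the length of each entry.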

Use the string find function, strfind.
This is a fast and simple solution:
clear
str_pattern=['B4','EF']; %same as str_pattern=['B4EF'];
x = {'3D'
'B4'
'EF'
'D8'
'EB'
'4E'
'F3'
'B4'
'EF'
'37'} ;
str_x=horzcat(x{:});
inds0=strfind(str_x,str_pattern); %including in-middle
inds1=inds0(bitand(inds0,1)==1); %exclude all in-middle results
disp(str_x);
disp(str_pattern);
disp(inds0);
disp(inds1);
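As a sketch of how the odd-position filter relates back to the cell array, and assuming every cell holds exactly two characters, the character positions can be converted to cell indices (cell_inds and count are names introduced here for illustration):

```matlab
x = {'3D';'B4';'EF';'D8';'EB';'4E';'F3';'B4';'EF';'37'};
str_x = horzcat(x{:});                 % '3DB4EFD8EB4EF3B4EF37'
inds0 = strfind(str_x, 'B4EF');        % every match, including ones straddling cells
inds1 = inds0(mod(inds0, 2) == 1);     % keep matches that start on a cell boundary
cell_inds = (inds1 + 1) / 2;           % index of the cell where each match starts
count = numel(inds1);                  % number of aligned occurrences
disp(cell_inds);
disp(count);
```

The match at character 10 comes from 'EB','4E','F3' running together, which is why it is discarded.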

Related

How to shuffle such that two same elements are not together?

I have a string containing several elements, some identical and some unique. I want my code to check every 2 following elements in my string and if they're equal, it should call a function ShuffleString, where the input variable (randomize) is the string itself, that will re-shuffle the string in a new position. Then, the script should re-check every 2 following elements in the string again until no two identical elements appear next to each other.
I have done the following:
My function file ShuffleString works fine. The input variable randomize, as stated earlier, contains the same elements as MyString but in a different order, as this was needed on an unrelated matter earlier in the script.
function [MyString] = ShuffleString(randomize)
MyString = [];
while length(randomize) > 0
S = randi(length(randomize), 1);
MyString = [MyString, randomize(S)];
randomize(S) = [];
end
The script doesn't work as intended. Right now it looks like this:
MyString = ["Cat" "Dog" "Mouse" "Mouse" "Dog" "Hamster" "Zebra" "Obama"...
"Dog" "Fish" "Salmon" "Turkey"];
randomize = MyString;
while(1)
for Z = 1:length(MyString)
if Z < length(MyString)
Q = Z+1;
end
if isequal(MyString{Z},MyString{Q})
[MyString]=ShuffleString(randomize)
continue;
end
end
end
It just seems to reshuffle the string an infinite number of times. What's wrong with this and how can I make it work?

You are using an infinite while loop that has no way to break and hence it keeps iterating.
Here is a simpler way:
Use the third output argument of the unique function to get the elements in numeric form for easier processing. Apply diff to it to check whether consecutive elements are the same. If any two consecutive elements are equal, the output of diff contains at least one zero, so ~all(diff(c)) is true and the loop continues; otherwise the loop stops. At the end, use the shuffled numeric representation obtained after the loop to index the first output argument of unique (calculated earlier). So the script will be:
MyString = ["Cat" "Dog" "Mouse" "Mouse" "Dog" "Hamster" "Zebra" "Obama"...
"Dog" "Fish" "Salmon" "Turkey"]; %Given string array
[a,~,c] = unique(MyString);%finding unique elements and their indices
while ~all(diff(c)) %looping until there are no same strings together
c = ShuffleString(c); %shuffling the unique indices
end
MyString = a(c); %using the shuffled indices to get the required string array
For the function ShuffleString, a better way would be to use randperm. Your version of the function works, but it resizes the arrays MyString and randomize on every iteration, which hurts performance and memory usage. Here is a simpler way:
function MyString = ShuffleString(MyString)
MyString = MyString(randperm(numel(MyString)));
end
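To see why the loop condition ~all(diff(c)) detects neighbouring duplicates, here is a small worked sketch on a shorter string array (the variable names s, d and has_adjacent_duplicates are made up for this illustration):

```matlab
s = ["Cat" "Dog" "Dog" "Fish"];
[a, ~, c] = unique(s);             % a holds the sorted unique strings, c maps s onto a
d = diff(c);                       % zero wherever two neighbours are the same string
has_adjacent_duplicates = ~all(d); % true here because "Dog" appears twice in a row
disp(c(:).');
disp(d(:).');
disp(has_adjacent_duplicates);
```

Here c is 1 2 2 3 and d is 1 0 1, so the condition is true and the while loop would shuffle again.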

Counting occurrences of a character in a string within a cell

I'm having trouble figuring out how to count the occurrences of a character in a string within a cell. For example, I have a file that contains information like so:
type
m
mmNs
SmNm
and I'm trying to determine how many m's are in each line. To do this, I've tried this code:
sampleddata = dataset('file','sample.txt','Delimiter','\t');
muts = sampleddata.type;
fileID = fopen('number_occur.txt','w');
for j = 1:3
mutations = muts(j)
M = length(find(mutations == 'm'));
fprintf(fileID, '%1f\n',M)
end
fclose(fileID)
However, I get an error that informs me: "Undefined operator '==' for input arguments of type 'cell'." Does anyone know how to overcome this problem?

Posting an answer here in case you have not found a way to do it yet. There are many ways to do this; here is one of them.
Basically, you want a regex to do the string matching:
a = {'type';
'm';
'mmNs';
'SmNm';
'mmmmM'} %//Load in Data,
pattern = 'm'; %//The pattern you are looking for is 'm', it could be anything really, a number of specific word or a specific pattern
lines = regexp(a, pattern, 'tokens'); %// look for this pattern in each line
result = cellfun('length',lines); %//count the size of matched patterns, so each time it matches, the size should increase by 1.
This gives the result in a matrix form:
result =
0
1
2
2
4
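The error in the question arises because muts(j) is still a cell; indexing with curly braces, muts{j}, gives the char array that == can compare. As a minimal sketch of the same count without a regex, assuming every entry is a char row vector, cellfun can apply the comparison to each cell (counts is a name introduced here):

```matlab
a = {'type'; 'm'; 'mmNs'; 'SmNm'; 'mmmmM'};
% For each cell, compare every character with 'm' and count the hits.
counts = cellfun(@(s) sum(s == 'm'), a);
disp(counts);
```

The comparison is case-sensitive, so the trailing 'M' in the last entry is not counted.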

Assigning values to a field of a structure array in MATLAB

I want to replace the value of the fields in a structure array. For example, I want to replace all 1's with 3's in the following construction.
a(1).b = 1;
a(2).b = 2;
a(3).b = 1;
a([a.b] == 1).b = 3; % This doesn't work and spits out:
% "Insufficient outputs from right hand side to satisfy comma separated
% list expansion on left hand side. Missing [] are the most likely cause."
Is there an easy syntax for this? I want to avoid ugly for loops for such simple operation.

Credits go to @Slayton, but you actually can do the same thing for assigning values too, using deal:
[a([a.b]==1).b]=deal(3)
So breakdown:
[a.b]
retrieves all b fields of the array a and puts this comma-separated-list in an array.
a([a.b]==1)
uses logical indexing to index only the elements of a that satisfy the constraint. Subsequently the full command above assigns the value 3 to all elements of the resulting comma-separated-list according to this.
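Putting the pieces together, a short runnable sketch of the deal approach on the data from the question (mask is a name introduced here for readability):

```matlab
a = struct('b', {1, 2, 1});   % 1x3 struct array with b = 1, 2, 1
mask = [a.b] == 1;            % logical index of the elements to change
[a(mask).b] = deal(3);        % deal copies 3 into every selected b field
disp([a.b]);
```

The square brackets on the left-hand side are what collect the comma-separated list a(mask).b into a set of assignment targets.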

You can retrieve the value of a field for each struct in an array using cell notation.
bVals = {a.b};
bVals = cell2mat( bVals );
AFAIK, you can't do the same thing for inserting values into an array of structs. You'll have to use a loop.

Normalize length of cell array

I have a cell array of length 3 and I want to loop over it together with another cell array of length 6. How can I add 3 extra cells to the first array so that the two cell arrays are the same length and I can use my for loop in MATLAB?
For example, with 2 inputs:
type = { '12' '62' '5' };
colour = {'re' 'green' 'yellow' 'brown' 'blue' 'black'};
for i = 1:length(colour)
if isequal(colour(i), type(:))
result(i) = type(i);
else
end
end
I need to make the type cell array the same size as the colour cell array (I think I have to add 3 extra empty cells inside the type cell array).

I have to address several issues in your code first:
If you use a cell array, you must use curly braces ({}) to extract elements from it. Instead of writing colour(i) you should be writing colour{i}.
This is not a problem, but it's a matter of good practice. If you don't need to handle the else part of the if statement, don't write it at all.
Preallocate memory so that arrays don't grow inside the loop (it slows down the program). Specifically, add the line result = cell(size(colour)); before the for loop.
Your isequal logic is flawed. Practically, it would always return false because colour{1} is one element and type{:} is many.
According to your example, types contain numbers and colours letters, although they are both strings. Does it make sense to compare the two?
Now, regarding your question, it's up to you to decide how the for loop runs. Since you don't mention what you want to achieve (you rather ask how you want to achieve something without saying what exactly), I cannot say what your for loop should look like, if necessary at all. Maybe you meant to use ismember instead of isequal? If so, the fixed code can look like this:
result = cell(size(colour));
for i = 1:length(colour)
[found, idx] = ismember(colour{i}, type);
if found
result{i} = type{idx}; % idx is the position of the match within type
end
end
or shorter, like this:
result = cell(size(colour));
[found, idx] = ismember(colour, type);
result(found) = type(idx(found))
If you provide more details, maybe I can refine my answer so that it helps you more.
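If the goal is only to make type as long as colour, as the question literally asks, one possible sketch is to pad it with empty strings. The variable padded is introduced here for illustration, and using '' as the filler is an assumption about what "empty cells" should contain:

```matlab
type = {'12' '62' '5'};
colour = {'re' 'green' 'yellow' 'brown' 'blue' 'black'};

padded = type;
% Append empty char arrays until padded is as long as colour.
padded(end+1:numel(colour)) = {''};
disp(numel(padded));
```

If type is already at least as long as colour, the index range is empty and the assignment does nothing.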

Find first substring not in map

I'm using containers.Map class in Matlab as dictionary and I want to find the first substring (from left to right) that is not in my map.
For example, suppose I have the string 'math' and my map is something like this
key value
m 1
ma 2
. .
. .
. .
So if I start reading from left to right the first substring not in map would be 'mat'.
The obvious answer that comes to my mind is to loop over every char and concatenate in order to find the substring that is not in my map, using the method isKey(map, key), where key is the substring in each iteration.
Is there a more efficient way to do this? Maybe some predefined function in MATLAB, or at least more elegant code.
Thanks

How about this.
map = containers.Map;
% Initialise map
map('m') = 1;
map('ma') = 2;
map('burt') = 3;
% Define search string
m = 'math';
% Create cell array element for first 1,2,3... letters of search
ma = repmat(m,length(m),1);
ma = cellstr(char(ma .* tril(ones(length(m)))));
% Find first substring that isn't in map
index = find(~map.isKey(ma),1,'first')
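An alternative sketch of the same idea builds the prefixes explicitly with arrayfun instead of masking a character matrix. The names s, prefixes and first_missing are illustrative, and returning '' when every prefix is present is an assumption about the desired behaviour:

```matlab
map = containers.Map;
map('m') = 1;
map('ma') = 2;

s = 'math';
% Cell array of all prefixes: {'m', 'ma', 'mat', 'math'}
prefixes = arrayfun(@(k) s(1:k), 1:numel(s), 'UniformOutput', false);
% isKey accepts a cell array of keys and returns one logical per key.
index = find(~isKey(map, prefixes), 1, 'first');

if isempty(index)
    first_missing = '';   % every prefix is already a key
else
    first_missing = prefixes{index};
end
disp(first_missing);
```

For the map above this yields index 3, so the first missing substring is 'mat', matching the example in the question.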