Vectorize filtering on matlab cell structures - matlab

I have a huge cell vector cc (size: 1xN) of the form:
cc{1} = {'indexString1', 'str_row1col1', 'str_row1col2' }
cc{2} = {'indexString2', 'str_row2col1', 'bighello', 'str_row1col3' }
cc{3} = {'indexString3','str_row3col1'}
cc{4} = {'indexString4','str_row3col1', 'helloWorld'}
I want to traverse each cell and remove specific cells that contain the word "hello", e.g c{4}{2}. Can we do that without for loops keeping the final structure of cc?
Best,
Thoth.
EDIT: From the answers and comments I have seen that the structure of the cell impose some limitations. So any other suggestion to store my data are welcome. I just want to keep together all the cells (e.g. 'str_row1col1', 'str_row1col2') that correspond to the same indexString*n* (e.g. indexString1). I made this edit in case it helps some final reshape.

Using regular expressions, you can obtain a logical array in which zeros represent occurences of the word 'hello' somewhere in the nested cell. As #LuisMendo pointed out, this would be much easier to delete the unwanted cells if they were not nested:
clc
clear
cc{1} = {'str_row1col1', 'str_row1col2' };
cc{2} = {'str_row2col1', 'bighello', 'str_row1col3' };
cc{3} = {'str_row3col1'};
cc{4} = {'str_row3col1', 'helloWorld'};
A = (cellfun(#isempty,regexp([cc{:}],'(\w*hello|hello\w*)','match')))
Gives the following array:
A =
1 1 1 0 1 1 1 0
For the rest I think you would need a loop since the nested cells are not all of the same size. Anyhow I hope it helps you a bit.
EDIT Here is what you can do using a for loop. In order to identify words of interest (earth and water as in your comment below), simply add them to the argument in the call to regexp. This character: | is used to make some sort of list so that Matlab checks all the expressions in the brackets.
Please refer to this page for more infos on regular expressions. There is also a possibility to look for regular expressions with case-sensitivity.
Sample code, in which I added strings containing earth and water:
cc{1} = {'str_row1col1', 'earth!superman' 'str_row1col2' 'DummyString'};
cc{2} = {'str_row2col1', 'bighello', 'str_row1col3' };
cc{3} = {'str_row3col1' 'str_row3col3' 'water_batman'};
cc{4} = {'str_row3col1' 'str_row4col2' 'helloWorld'};
cc{5} = {'str_row5_LegoMan' 'str_row5col2' 'AnotherDummyString' 'Useless String' 'BonjourWorld'};
% With a for loop, for example:
FinalCell = cell(size(cc,2),1);
for k = 1:size(cc,2)
DummyCell = cc{k}; % Use dummy cell for easier indexing
% This is where you tell Matlab what words/expressions you are looking for
A = cellfun(#isempty,regexp(cc{k},'(\w*hello|hello\w*|earth|water)','match'));
DummyCell(~A) = []; % Remove the cells containing the strings/words of interest
FinalCell{k} = DummyCell;
end
Then you're good to go. Hope that helps!

The closest thing possible I found is:
clear all
cc{1} = {'str_row1col1', 'str_row1col2' };
cc{2} = {'str_row2col1', 'bighello', 'str_row1col3' };
cc{3} = {'str_row3col1'};
cc{4} = {'str_row3col1', 'helloWorld'};
cc1 = [cc{:}];
cc1 = cc1(~strcmp('bighello',cc1));
This reorganize your array into a one dimensional array and it cannot match regular expression, but only whole words.
For a better job I am afraid you have to use for loops.

Related

How to create a new matrix with a for loop in MATLAB?

I am a newbie. I have problem.
I have 20 (1x100) different named vectors. I want to combine these vectors to create a 20x100 matrix with a for loop.
There are the examples of vectors.
namelist=["First","B","New"]
First = [1:100]
B = [1:2:200]
New = [4:4:400]
for i = 1: length(namelist)
new_database(i,1:end) = namelist{i}
end
But, when I want to try this I saw "The end operator must be used within an array index expression." error.
I know I can do same thing with this:
"new_database= [First;B;New]"
but i want to do this with a for loop.
Would you help me how can fix this error? or Would you explain me how can do this?
Your problem is with this line:
new_database(i,1:end) = namelist{i}
the curly braces are used with cells, exclusively and there is no need to use range indexing as you do (i, 1:end)
Generally, it is better practice to assign character arrays or strings to cells.
One question, what are you doing with the 'First', 'New' and 'B' ranges arrays?
Something like:
namelist=["First","B","New"]
First = [1:100];
B = [1:2:200];
New = [4:4:400];
new_database = cell(1, length(namelist));
for i = 1: length(namelist) % or length(new_database)
new_database{i} = namelist(i)
end
which generates this output:
EDIT: My apologies, now I see what you are trying to accomplish. You are building a database from a series of arrays, correct?
Following my previous response, you must consider some points:
1 Your new_database should be square. Regardless of the dimensions of the arrays you are passing to it, if you form a cell from them, you will invariably have empty cells if no data is passed to those rows or columns
2 In some cases, you don't need to use for-loops, where simple indexing might suffice your case problem. Consider the following example using cellstr:
titles = ["Position", "Fruits", "Mythical creatures"]
A = ["One", "Two", "Three"];
B = ["Apple", "Banana", "Durian"];
C = ["Dragon", "Cat", "Hamster"];
db = cell(4, 3);
db(1,:) = cellstr(titles)
db(2:end,1) = cellstr(A)
db(2:end,2) = cellstr(B)
db(2:end,3) = cellstr(C)
which generates this output:

How to load a sequence of image files using a for loop in MATLAB?

I am beginner in MATLAB. I would like to load 200 image files (size 192x192) in a specific folder by using a for loop.
The image names are '1.png', '2.png', '3.png' and so on.
My code is as below.
list = dir('C:/preds/*.png');
N = size(list,1);
sum_image = zeros(192,192,200);
for i = 1:N
sum_image(:,:,i) = imread('C:/preds/i.png');
end
Which part should I change ?
I would probably do it like the code below:
You are currently getting the list of filenames then not really doing much with it. Iterating over the list is safer otherwise if there is a missing number you could have issues. Also, the sort maybe unnecessary depending if you image numbering is zero-padded so they come out in the correct order ... but better safe than sorry. One other small change initializing the array to size N instead of hard-coding 200. This will make it more flexible.
searchDir = 'C:\preds\';
list = dir([searchDir '*.png']);
nameList = {list.name}; %Get array of names
imNum = str2double(strrep(nameList,'.png','')); %Get image number
[~,idx] = sort(imNum); %sort it
nameList = nameList(idx);
N = numel(nameList);
sum_image = zeros(192,192,N);
for i=1:N
sum_image(:,:,i) = imread(fullfile(searchDir,nameList{i}));
end
I would suggest changing the line within the loop to the following:
sum_image(:,:,i) = imread(['C:/preds/', num2str(i), '.png']);
MATLAB treats the i in your string as a character and not the variable i. The above line of code builds your string piece by piece.
If this isn't a homework problem, the right answer to this question is don't write this as a for loop. Use an imageDatastore:
https://www.mathworks.com/help/matlab/ref/imagedatastore.html
ds = imageDatastore('C:/preds/');
sumImageCellArray = readall(ds);
sumImage = cat(3,sumImageCellArray{:});

How to use an entry in an array to call a different array entry

I've looked at a few other questions but I couldn't come up with an answer, I'm relatively new to MatLab (but not programming) so I apologize if it's a duplicate.
I'm sure the title isn't very clear so here's an example:
I have an array, say name = ['Jack';'Jill'];. The elements in this array reference other arrays, such as:
Jack.income = 31000;
Jack.car = 1;
Jill.income = 55000;
Jill.car = 0;
Now, I would like to use name to pull data from the other arrays, such as:
data = name(1).income, which should return 31000, or data = name(2).car, which should return 0.
Any help on this would be greatly appreciated.
Thank you very much,
Eric
You'd be better off using an array of structs (or making your own person object perhaps):
people(1).name = 'Jack';
people(1).income = 31000;
people(1).car = 1;
people(2).name = 'Jill';
people(2).income = 55000;
people(2).car = 0;
Now you can generate a list of names like this (see cell arrays and comma separated lists):
names = {people.name};
which you can convert into an index like this (see logical indexing and ismember):
ind = ismember(names, 'Jack');
and then finally extract someone's income:
people(ind).income

Using a string to refer to a structure array - matlab

I am trying to take the averages of a pretty large set of data, so i have created a function to do exactly that.
The data is stored in some struct1.struct2.data(:,column)
there are 4 struct1 and each of these have between 20 and 30 sub-struct2
the data that I want to average is always stored in column 7 and I want to output the average of each struct2.data(:,column) into a 2xN array/double (column 1 of this output is a reference to each sub-struct2 column 2 is the average)
The omly problem is, I can't find a way (lots and lots of reading) to point at each structure properly. I am using a string to refer to the structures, but I get error Attempt to reference field of non-structure array. So clearly it doesn't like this. Here is what I used. (excuse the inelegence)
function [avrg] = Takemean(prefix,numslits)
% place holder arrays
avs = [];
slits = [];
% iterate over the sub-struct (struct2)
for currslit=1:numslits
dataname = sprintf('%s_slit_%02d',prefix,currslit);
% slap the average and slit ID on the end
avs(end+1) = mean(prefix.dataname.data(:,7));
slits(end+1) = currslit;
end
% transpose the arrays
avs = avs';
slits = slits';
avrg = cat(2,slits,avs); % slap them together
It falls over at this line avs(end+1) = mean(prefix.dataname.data,7); because as you can see, prefix and dataname are strings. So, after hunting around I tried making these strings variables with genvarname() still no luck!
I have spent hours on what should have been 5min of coding. :'(
Edit: Oh prefix is a string e.g. 'Hs' and the structure of the structures (lol) is e.g. Hs.Hs_slit_XX.data() where XX is e.g. 01,02,...27
Edit: If I just run mean(Hs.Hs_slit_01.data(:,7)) it works fine... but then I cant iterate over all of the _slit_XX
If you simply want to iterate over the fields with the name pattern <something>_slit_<something>, you need neither the prefix string nor numslits for this. Pass the actual structure to your function, extract the desired fields and then itereate them:
function avrg = Takemean(s)
%// Extract only the "_slit_" fields
names = fieldnames(s);
names = names(~cellfun('isempty', strfind(names, '_slit_')));
%// Iterate over fields and calculate means
avrg = zeros(numel(names), 2);
for k = 1:numel(names)
avrg(k, :) = [k, mean(s.(names{k}).data(:, 7))];
end
This method uses dynamic field referencing to access fields in structs using strings.
First of all, think twice before you use string construction to access variables.
If you really really need it, here is how it can be used:
a.b=123;
s1 = 'a';
s2 = 'b';
eval([s1 '.' s2])
In your case probably something like:
Hs.Hs_slit_01.data= rand(3,7);
avs = [];
dataname = 'Hs_slit_01';
prefix = 'Hs';
eval(['avs(end+1) = mean(' prefix '.' dataname '.data(:,7))'])

Outputting data from for loop to .mat file using numbers in title MATLAB

I need to output .mat files for the below data. I need one file to have cell (1,1) to be Mean_RPM_list1, cell (2,1) to be Mean_RPM_list2 etc. And then I need another file to have cell(1,1) to be Mean_Torque_list1 to have cell(1,1).....and so on.
Can anybody shed any light on this for me?
Also if someone knows how to automate me calling the matrices A and B so I could have A = [Mean_rpm1:Mean_rpmMAX], that would also be very helpful.
TIA for any help.
A = [Mean_rpm1 Mean_rpm2 Mean_rpm3 Mean_rpm4 Mean_rpm5 Mean_rpm6 Mean_rpm7 Mean_rpm8 Mean_rpm9 Mean_rpm10 Mean_rpm11 Mean_rpm12];
B = [Mean_torque1 Mean_torque2 Mean_torque3 Mean_torque4 Mean_torque5 Mean_torque6 Mean_torque7 Mean_torque8 Mean_torque9 Mean_torque10 Mean_torque11 Mean_torque12];
plot(A,B,'*')
for i = 1:num_bins;
bin = first + ((i-1)/10);
eval(sprintf('Mean_RPM_list%0.f = A;',bin*10));
eval(sprintf('Mean_Torque_list%0.f = B;',bin*10));
end
First of all this is really bad idea to create a set of variables with names different by numbers. As you can see it's very difficult to deal with such variables, you always have to use eval (or other related) statements.
It's much easier to create a cell array Mean_rpm and access its elements as Mean_rpm{1}, etc.
If the vectors are numeric and have the same size you can also make a 2D/3D array. Then access as Mean_rpm(:,:,1) etc.
Next, to store a cell array to a mat-file you have to create this array in MATLAB. No options (at least for now) to do it by parts in a loop. (But you can do it for numeric vectors and matrices using matfile object.) So why do you need this intermediate Mean_RPM_list variable? Just do Mean_RPM_list{bin*10} = A in your loop.
For your first question, if you already have those variables you have to use eval in a loop. Something like
A = [];
for k=1:K
eval(sprintf('A{k} = [A, Mean_rpm%d];',k));
end
You can also get names for all similar variables and combine them.
varlist = who('Mean_rpm*');
A = cell(1,numel(varlist);
for k = 1:numel(varlist)
eval('A{k} = varlist{k};');
end
Here is one without loop using CELL2FUN:
A=cellfun(#(x)evalin('base',x),varlist,'UniformOutput',0);
You should avoid having all these individual variables around in the first place. Data types like arrays, cell arrays and structure arrays exist to help you with this. If you want each variable to be associated with a name, you can use a structure array. I've made an example below. Instead of assigning a value to Mean_rpm1 like you are doing now, assign it to meanStruct.Mean_rpm1 then save the entire structure.
% as you generate values for each variable, assign them to the
% appropriate field.
meanStruct.Mean_rpm1 = [10:10];
meanStruct.Mean_rpm2 = [12:15];
meanStruct.Mean_rpm3 = [13:20];
meanStruct.Mean_rpm4 = [14];
meanStruct.Mean_rpm5 = [15:18];
meanStruct.Mean_rpm6 = [16:20];
meanStruct.Mean_rpm7 = [17:22];
meanStruct.Mean_rpm8 = [18:22];
meanStruct.Mean_rpm9 = [19:22];
meanStruct.Mean_rpm10 = [20:22];
meanStruct.Mean_rpm11 = [21:22];
meanStruct.Mean_rpm12 = [22:23];
% save the structure array
save('meanValues.mat','meanStruct')
% load and access the structure array
clear all
load('meanValues.mat')
temp = meanStruct.Mean_rpm3