Matching cell arrays in MATLAB - matlab

I am trying to find indices of the elements in one cell array in another cell array in MATLAB. For example:
a = {'Hello', 'Good', 'Sun', 'Moon'};
b = {'Well', 'I', 'You', 'Hello', 'Alone', 'Party', 'Long', 'Moon'};
I expect to get the following result which shows the index of elements of $a$ in array $b$:
index=[4, NaN, NaN, 8];
I know it is possible to implement this with loops, but I think there is a simpler way that I don't know about.
Thanks.

With ismember -
[matches,index] = ismember(a,b)
index(~matches) = nan
With intersect -
[~,pos,idx] = intersect(a,b)
index = nan(1,numel(a))
index(pos) = idx

You could use ismember
[flag,index] = ismember ( a, b )

You can use the second output argument of ismember:
[ida,idb]=ismember(a, b)
ida = 1 0 0 1
idb= 4 0 0 8
If you really need NaN just do:
idb( idb == 0 ) = NaN
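As a minimal end-to-end sketch of the ismember approach from the answers above, using the question's own data (the variable name found is only illustrative):

```matlab
a = {'Hello', 'Good', 'Sun', 'Moon'};
b = {'Well', 'I', 'You', 'Hello', 'Alone', 'Party', 'Long', 'Moon'};

% Second output holds the position in b of each element of a (0 if absent)
[found, index] = ismember(a, b);

% Replace the zeros for unmatched elements with NaN
index(~found) = NaN;

disp(index)   % 4 NaN NaN 8
```

Note that if b contains repeated strings, ismember reports the lowest matching index for each element of a.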

Related

assign new matrix values based on row and column index vectors

New to MATLAB here (R2015a, Mac OS 10.10.5), and hoping to find a solution to this indexing problem.
I want to change the values of a large 2D matrix, based on one vector of row indices and one of column indices. For a very simple example, if I have a 3 x 2 matrix of zeros:
A = zeros(3, 2)
0 0
0 0
0 0
I want to change A(1, 1) = 1, and A(2, 2) = 1, and A(3, 1) = 1, such that A is now
1 0
0 1
1 0
And I want to do this using vectors to indicate the row and column indices:
rows = [1 2 3];
cols = [1 2 1];
Is there a way to do this without looping? Remember, this is a toy example that needs to work on a very large 2D matrix. For extra credit, can I also include a vector that indicates which value to insert, instead of fixing it at 1?
My looping approach is easy, but slow:
for i = 1:length(rows)
A(rows(i), cols(i)) = 1;
end
sub2ind can help here,
A = zeros(3,2)
rows = [1 2 3];
cols = [1 2 1];
A(sub2ind(size(A),rows,cols))=1
A =
1 0
0 1
1 0
with a vector to 'insert'
b = [1,2,3];
A(sub2ind(size(A),rows,cols))=b
A =
1 0
0 2
3 0
I found this answer online when checking on the speed of sub2ind.
idx = rows + (cols - 1) * size(A, 1);
therefore
A(idx) = 1 % or b
5 tests on a big matrix (~ 5 second operations) show it's 20% faster than sub2ind.
There is code for an n-dimensional problem here too.
What you have is basically a sparse definition of a matrix. Thus, an alternative to sub2ind is sparse. It creates a sparse matrix; use full to convert it to a full matrix.
A=full(sparse(rows,cols,1,3,2))
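The three approaches above can be checked against each other with a short sketch. The names vals, idxSub, idxManual, S and dup are illustrative, not from the answers. One difference worth knowing: sparse adds up values that share the same (row, column) pair, so it is only interchangeable with indexed assignment when the index pairs are unique.

```matlab
rows = [1 2 3];
cols = [1 2 1];
vals = [1 2 3];          % values to insert (illustrative)
A = zeros(3, 2);

% Linear indices two ways: built-in conversion and the column-major formula
idxSub    = sub2ind(size(A), rows, cols);
idxManual = rows + (cols - 1) * size(A, 1);

% Vectorized assignment through linear indices
A(idxSub) = vals;

% Same matrix built through a sparse definition
S = full(sparse(rows, cols, vals, 3, 2));

% Repeated (row, col) pairs are summed by sparse
dup = full(sparse([1 1], [1 1], [2 3], 1, 1));   % gives 5
```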

Find corresponding array in a cell

Suppose I have a cell of arrays of the same size, for example
arr = {[1 NaN 2 ], ...
[NaN 4 7 ], ...
[3 4 NaN] };
and I also have a vector, for example
vec = [1 2 2];
How do I find the cell entry that matches the vector vec? Matching means the entries in the same location are the same, except for NaNs.
For this particular vector vec I would like to have 1 returned, since it matches the first row.
Another vector [5 4 7] would return 2.
Vectors that don't match like [7 7 7] and vectors that match more than one entry like [3 4 7] should throw an error.
Note that the vector [3 7 4] does not match the second entry, because the order is important.
For each cell element, just check if
all(isnan(cellElement) | cellElement == vec)
is true, which means, you found a match. If you convert your cell to a matrix checkMatrix with multiple rows and each row corresponding to one cellElement, you can even do it without implementing a loop by repeating vec vertically and comparing the whole matrix in a single step. You will have to tell all() to check along dimension 2 rather than dimension 1 and have find() detect all the matches, like so:
find( all( ...
isnan(checkMatrix) | checkMatrix == repmat(vec,size(checkMatrix, 1),1) ...
, 2)); % all() along dimension 2
So I thought about it and came up with this:
matching_ind = @(x, arr) find(...
cellfun(@(y) max(abs(not(x-y==0).*not(isnan(x-y)))),...
arr) == 0);
inds = matching_ind(vec, arr);
if length(inds) ~= 1
error('42');
end
See if this bsxfun based approach works for you -
A = vertcat(arr{:});
matching_ind = find(all(bsxfun(@eq,A,vec(:).') | isnan(A),2))
if numel(matching_ind)~=1
error('Error ID : 42.')
else
out = matching_ind(1);
end
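For reference, here is a sketch that runs the test vectors from the question through the same NaN-as-wildcard comparison. It assumes MATLAB R2016b or later, where M == v expands a row vector against a matrix implicitly (on older releases, use bsxfun or repmat as in the answers above). The names M and matchRows are illustrative.

```matlab
arr = {[1 NaN 2], ...
       [NaN 4 7], ...
       [3 4 NaN]};

% Stack the cell entries into one matrix, one row per entry
M = vertcat(arr{:});

% A row matches when every position is either NaN in M or equal to v
matchRows = @(v) find(all(isnan(M) | M == v(:).', 2)).';

r1 = matchRows([1 2 2]);   % 1
r2 = matchRows([5 4 7]);   % 2
r3 = matchRows([7 7 7]);   % empty: no match
r4 = matchRows([3 4 7]);   % [2 3]: ambiguous match
r5 = matchRows([3 7 4]);   % empty: order matters
```

The caller can then raise an error whenever numel of the result is not exactly 1, as the answers above do.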

How to check if the beginning elements of array in matlab is the same

I would like to see if an array starts with the same elements as another array without having to write a bunch of for loops going through each element individually.
For example if I had the arrays below
Array1 = [1 2 3 4]
Array2 = [1 2 3 4 5 3 2 5 7]
Array3 = [1 2 3 5]
Then comparing Array1 with Array2 would return true.
and comparing Array3 with Array2 would return false.
Is there any quick and easy way of doing this. I would not know the lengths of the arrays I would be comparing. The number of elements I want to compare equals the length of the shortest vector.
Thanks!
You can check if all elements in two vectors are the same using isequal. To check only the first n elements, you can do Array(1:n), thus the entire function will be like this:
Array1 = [1 2 3 4]
Array2 = [1 2 3 4 5 3 2 5 7]
Array3 = [1 2 3 5]
n = 4; % Compare the first n elements
isequal(Array1(1:n), Array2(1:n))
ans = 1
isequal(Array2(1:n), Array3(1:n))
ans = 0
If you use Array1(1:n) == Array2(1:n) you will get an element-wise comparison resulting in 1 1 1 1. Of course, this means you could also do:
all(Array1(1:n) == Array2(1:n))
ans = 1
all(Array2(1:n) == Array3(1:n))
ans = 0
If you want n to be the number of elements in the smallest vector (per your comment), as Chris and Ben interpret the question, you can solve it this way:
isequal(Array1(1:min([numel(Array1) numel(Array2)])), Array2(1:min([numel(Array1) numel(Array2)])))
or a bit cleaner:
n = min([numel(Array1) numel(Array2)])
isequal(Array1(1:n), Array2(1:n))
Here's a function that will compare the initial segments of any two vectors, up to the length of the shortest vector. It returns true if they are identical, and false if they are not identical.
Note that
It only works correctly with vectors, not with matrices (although you could extend it to deal with matrices)
If any entries are NaN then it will always return false, since NaN == NaN is false.
Here it is -
function result = equal_initial_segment(x, y)
N = min(length(x), length(y));
result = isequal(x(1:N), y(1:N));
end
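If NaN entries should count as equal to each other, one possible variation (a sketch, not part of the answer above) is to swap isequal for isequaln, which treats NaN values as equal. The variable names are illustrative.

```matlab
x = [1 NaN 3];
y = [1 NaN 3 4 5];

% Compare only up to the length of the shorter vector
N = min(numel(x), numel(y));

strictResult  = isequal(x(1:N), y(1:N));    % false, because NaN ~= NaN
lenientResult = isequaln(x(1:N), y(1:N));   % true, NaNs treated as equal
```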
It seems like you are just comparing all of the elements of the shorter list to the first elements of the longer list, in which case you can just do this:
function same = compareLists(list1, list2)
if length(list1) > length(list2)
same = isequal(list2, list1(1:length(list2)));
else
same = isequal(list1, list2(1:length(list1)));
end
end
You can use strmatch for that:
~(isempty(strmatch(Array1, Array2)) && isempty(strmatch(Array2, Array1)))

MATLAB - Return only the rows of a matrix 'A' that do not contain values from another matrix 'B'

How do I return only the rows of a matrix 'A' that do not contain certain values
(these values are given in an array 'B')?
A = {'A1', 5 'P01,P02,P03,P04';
'A2' 7, 'P08,P09';
'A3' 8, 'P07';
'A4' 8, 'P10,P11'};
B = { 'P07'; 'P10'; 'P11'};
I need to return only:
'A1'
'A2'
Thanks in advance for your help
To remove rows of A which contain at least one of the strings in B
Fancy one-liner with two nested cellfuns and strfind at its core:
A(all(cell2mat(cellfun(@(b) cellfun(@isempty, strfind(A(:,end),b)).',B, 'uni', 0))),1)
Perhaps the logical indices computed as intermediate result are of interest:
ind = cell2mat(cellfun(@(b) cellfun(@isempty, strfind(A(:,end),b)).',B, 'uni', 0));
A(all(ind),1)
Namely, ~ind tells you which strings of B are contained in which rows of A . In the example,
>> ~ind
ans =
0 0 1 0
0 0 0 1
0 0 0 1
How it works: strfind tests if each string of B is in A, and returns a vector with the corresponding positions. So an empty vector means the string is not present. If that vector is empty for all strings of B, that row of A should be selected.
Variations on Luis' theme:
ind = A( all(cellfun('isempty', ...
cellfun(@strfind, ...
repmat(A(:,end), 1,size(B,1)), ...
repmat(B', size(A,1),1), 'UniformOutput', false)), 2), 1)
Somewhat against my own expectations, this is a LOT faster than Luis' solution. I think it is primarily due to the string function vs. anonymous function (cellfun is a lot faster with string functions than with anonymous functions). cell2mat not being built-in is also a factor.
I suggest you change the way you store the data in A as follows:
A = {'A1', 5, {'P01','P02','P03','P04'};
'A2', 7, {'P08','P09'};
'A3', 8, {'P07'};
'A4', 8, {'P10','P11'}};
B = {'P07'; 'P10'; 'P11'};
Then you can do:
for n = 1:size(A,1)
ind(n) = ~sum(ismember(B,A{n,3}));
end
A(ind,1)
Or if you prefer a one liner then:
A(cellfun(@(x)(~sum(ismember(B,x))), A(:,3)),1)
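If changing how A is stored is not an option, a possible middle ground (a sketch, assuming the codes are always separated by commas with no surrounding spaces) is to split each string on the fly with strsplit and test the tokens with ismember. Unlike strfind, this compares whole tokens, so a code such as 'P1' would not accidentally match 'P10'. The names hasAny and result are illustrative.

```matlab
A = {'A1', 5, 'P01,P02,P03,P04';
     'A2', 7, 'P08,P09';
     'A3', 8, 'P07';
     'A4', 8, 'P10,P11'};
B = {'P07'; 'P10'; 'P11'};

% For each row, split the third column on commas and check whether
% any resulting token appears in B
hasAny = cellfun(@(s) any(ismember(strsplit(s, ','), B)), A(:, 3));

% Keep the first column of the rows that contain none of the codes in B
result = A(~hasAny, 1);
```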

Find NaN values in a cell array

Let's assume I have the following array:
a = {1; 'abc'; NaN}
Now I want to find out in which indices this contains NaN, so that I can replace these with '' (empty string).
If I use cellfun with isnan I get a useless output
cellfun(@isnan, a, 'UniformOutput', false)
ans =
[ 0]
[1x3 logical]
[ 1]
So how would I do this correctly?
Indeed, as you found yourself, this can be done by
a(cellfun(@(x) any(isnan(x)),a)) = {''}
Breakdown:
Fx = @(x) any(isnan(x))
will return a logical scalar, irrespective of whether x is a scalar or vector.
Using this function inside cellfun then eliminates the need for 'UniformOutput', false:
>> inds = cellfun(Fx,a)
inds =
0
0
1
These can be used as indices to the original array:
>> a(inds)
ans =
[NaN]
which in turn allows assignment to these indices:
>> a(inds) = {''}
a =
[1]
'abc'
''
Note that the assignment must be done to a cell array itself. If you don't understand this, read up on the differences between a(inds) and a{inds}.
I found the answer on http://www.mathworks.com/matlabcentral/answers/42273
a(cellfun(@(x) any(isnan(x)),a)) = {''}
However, I do not understand it...
a(ind) = [] will remove the entries from the array
a(ind)= {''} will replace the NaN with an empty string.
If you want to delete the entry use = [] instead of = {''}.
If you wanted to replace the NaNs with a different value just set it equal to that value using curly braces:
a(ind) = {value}
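One caveat with any(isnan(x)) is that it also flags cells holding a numeric vector that merely contains a NaN. If only cells that are exactly a scalar NaN should be replaced, a stricter test could look like the sketch below. The name isNaNScalar and the extra fourth entry are illustrative additions, not part of the question.

```matlab
a = {1; 'abc'; NaN; [1 NaN 2]};

% True only for cells that hold a single numeric NaN
isNaNScalar = cellfun(@(x) isnumeric(x) && isscalar(x) && isnan(x), a);

% Replace those cells with an empty string; the vector in a{4} is untouched
a(isNaNScalar) = {''};
```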