Does matrix contain a vector? - matlab

I'm looking for a fast / concise way to check whether some matrix contains a given vector, e.g.:
bigMatrix = [1 1 1; 2 2 2; 4 4 4; 5 5 5];
someFunction(bigMatrix, [1 1 1]) % = true
someFunction(bigMatrix, [3 3 3]) % = false
Is there such a function or operator, or do I need a loop?

I would suggest the following solution:
bigMatrix = [1 1 1; 2 2 2; 4 4 4; 5 5 5];
Vec = [2 2 2];
Index = ismember(bigMatrix, Vec, 'rows');
The result?
Index =
0
1
0
0
ismember is an incredibly useful function that checks whether the elements of one set are in another set. Here, I exploit the rows option to force the function to compare rows, rather than individual elements.
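Since the question asks for a single true/false rather than a per-row mask, the mask can be collapsed with any, or the arguments can be swapped so that the single row is tested against the matrix. A minimal sketch (containsRow is a hypothetical helper name used only for illustration, not a built-in):

```matlab
bigMatrix = [1 1 1; 2 2 2; 4 4 4; 5 5 5];

% Hypothetical helper: true if row vector v occurs as a row of M
containsRow = @(M, v) any(ismember(M, v, 'rows'));

found1 = containsRow(bigMatrix, [1 1 1]);   % row is present
found2 = containsRow(bigMatrix, [3 3 3]);   % row is absent

% Equivalent form: test the single row against the matrix directly
found3 = ismember([2 2 2], bigMatrix, 'rows');
```

Both forms compare whole rows, so a matrix that merely contains the values 1, 1, 1 scattered across different rows would not count as a match.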
UPDATE: On the other hand, it is always worth doing a few speed tests! I just compared the ismember approach to the following alternative method:
N = size(bigMatrix, 1);
Index2 = zeros(N, 1);
for n = 1:N
    if all(bigMatrix(n, :) == Vec)
        Index2(n) = 1;
    end
end
My findings? The size of bigMatrix matters! In particular, if bigMatrix is on the small side (somewhat of a misnomer), then the loop is much faster. The first approach is preferable only when bigMatrix becomes big. Further, the results are also dependent on how many columns bigMatrix has, as well as rows! I suggest you test both approaches for your application and then go with whichever is faster. (EDIT: This was on R2011a)
General Note: I am continually surprised by how much faster Matlab's loops have gotten in the last few years. Methinks vectorized code is no longer the holy grail that it once was.
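To repeat this kind of speed comparison on your own data, something along these lines should do. It is only a sketch: the matrix size, the value range and the repetition count (nRows, nCols, nRep) are arbitrary assumptions, and the timings will differ between MATLAB releases and machines.

```matlab
% Assumed test sizes; change these to match your application
nRows = 1000; nCols = 3; nRep = 100;
bigMatrix = randi(5, nRows, nCols);
Vec = bigMatrix(end, :);              % a row that is known to be present

% Approach 1: ismember with the 'rows' option
tic
for r = 1:nRep
    Index = ismember(bigMatrix, Vec, 'rows');
end
tIsmember = toc / nRep;

% Approach 2: explicit loop over the rows
tic
for r = 1:nRep
    Index2 = false(nRows, 1);
    for n = 1:nRows
        Index2(n) = all(bigMatrix(n, :) == Vec);
    end
end
tLoop = toc / nRep;

fprintf('ismember: %.3g s per call, loop: %.3g s per call\n', tIsmember, tLoop);
```

Try a few different values of nRows and nCols, since the answer above found that both dimensions affect which approach wins.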

Related

Efficient method to product subvectors of a matrix

Suppose that we have a matrix
A = [1 2 ; 3 4 ; 5 6];
and a logical matrix
B = [false true ; true true ; true false ];
I would like to obtain the row product of elements in A such that the corresponding element in B is true. In the example above, the answer is
C = [2 ; 3*4 ; 5] = [2 ; 12 ; 5];
One method would be to: 1) Take the power of A with respect to B; 2) Take the row product of the power matrix:
C = prod(A.^B,2);
The above command seems to perform unnecessary computation. Is there a faster way of computing C above?
Your method seems quite fast to me. If you really have a bottleneck there, you can maybe try with cheaper operators, like addition and multiplication:
C = prod(A.*B + ~B, 2);
I only tested it with Octave, but it's about twice as fast.
Another less compact way, also fast in Octave:
C=A; C(~B)=1; C=prod(C,2);
Here's another way, using accumarray. I doubt it's faster:
[ii, ~] = find(B); % create grouping variable
C = accumarray(ii, A(B), [], @prod); % compute product of each group
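Before timing the variants it is worth confirming that they agree. A small check on the example from the question (the names C1 to C4 are just labels for this sketch):

```matlab
A = [1 2; 3 4; 5 6];
B = [false true; true true; true false];

C1 = prod(A.^B, 2);                      % power-based version from the question
C2 = prod(A.*B + ~B, 2);                 % multiply-and-add version
C3 = A; C3(~B) = 1; C3 = prod(C3, 2);    % overwrite unselected entries with 1
[ii, ~] = find(B);                       % row index of each selected element
C4 = accumarray(ii, A(B), [], @prod);    % product per row group

allAgree = isequal(C1, C2, C3, C4);
```

One difference to keep in mind: for a row where B is entirely false, the first three variants return 1 (an empty product), whereas accumarray fills that row with its default value 0, and omits trailing rows altogether unless an output size is passed explicitly.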

Comparing only nonzero elements

The specific task I'm trying to achieve is hard to describe, so here's an example: given A and x
A = [1 2;
3 0;
3 5;
4 0];
x = [1 2 3];
I want the algorithm to output
output: [1 2]
meaning that all of the nonzero elements in rows 1 and 2 in A are in x.
I have done this using cell arrays and loops; however, A and x are very large and my approach is not at all efficient. Also, I can't seem to figure out how to rework ismember to give me what I want. What is the fastest/least memory intensive method?
EDIT: Apologies, my original example was too simplistic. It is corrected now.
The first answer is good, but I would recommend not using arrayfun. There are more elegant ways to do what you ask. Use ismember combined with all, then index into the matrix A when you're done. Basically, your problem is to determine whether every value in a row is found in x, ignoring the zeros. To handle that, we can locate the elements of A that are zero and use them to augment the ismember result.
Calling ismember with A as the first input and x as the second returns a logical matrix the same size as A that marks which elements of A are found in x. To ignore the zeros, combine this with a mask of the zero elements of A using a logical OR, so that zeros always count as matches. Then apply all along the second dimension (second argument set to 2) to check each row independently. The result is true for exactly those rows of A in which every nonzero element is found in x, which is what you're looking for:
A = [1 2; 3 0; 4 0];
x = [1 2 3];
mask = ismember(A, x);
ind = all(mask | A == 0, 2);
I'm also in favour of one-liners. We can consolidate this into one line of code:
ind = all(ismember(A, x) | A == 0, 2);
Even shorter is to logically negate A: zero elements become true and everything else becomes false:
ind = all(ismember(A, x) | ~A, 2);
ind would thus be:
>> ind
ind =
3×1 logical array
1
1
0
Since you want the actual row indices, you can just use find on top of this:
>> find(ind)
ans =
1
2
To verify, let's use your second example in your comments:
>> A = [1 2;3 5;4 0];
>> x = [1 2 3];
>> ind = all(ismember(A, x) | ~A, 2)
ind =
3×1 logical array
1
0
0
>> find(ind)
ans =
1
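For completeness, here is the same one-liner applied to the four-row example from the edited question. The variable name rows is an assumption of this sketch:

```matlab
A = [1 2;
     3 0;
     3 5;
     4 0];
x = [1 2 3];

% A row qualifies when every element is either zero or a member of x
rows = find(all(ismember(A, x) | A == 0, 2));
```

Rows 1 and 2 qualify; row 3 fails because of the 5 and row 4 because of the 4, which matches the output requested in the question.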
I think the best way to rework ismember is to make sure there are no "no members" by just checking for the nonzero elements in A.
arrayfun can do the work in a fast way. It uses the most efficient parallel computing for your specific machine. The following line should return the correct output:
find(arrayfun(@(a) sum(~ismember(A(a,A(a,:)>0),x)),1:size(A,1))==0)
Is this what you were looking for?
However, if your problem is related to memory, then you may have to break the arrayfun operation into pieces (1:floor(size(A,1)/2) and floor(size(A,1)/2)+1:size(A,1), or smaller chunks), since MATLAB puts a bunch of workers to do the task, and may use all your available RAM...

Compare an array to several numbers at once

How do you compare an array of numbers to several given numbers? More precisely, I have an array given like so
inputArray = [1 2 2 3 4 6]
and I want to compare inputArray to the numbers 1:7 to ultimately count how many times a "1" is in inputArray, a "2", a "3" and so on.
Obviously I can do something like
res = zeros(7,1);
for i = 1:7
    res(i) = sum(inputArray == i);
end
or more generally when I also might be interested in the locations of occurrences
res = zeros(7,length(inputArray));
for i = 1:7
    res(i,:) = inputArray == i;
end
res2 = sum(res,1);
Out of curiosity and/or speed improvements I am wondering if this is possible without a for loop in a single statement?
It seems like you are looking for a histogram count, which histc provides:
x = [1 3 10 1 8]
b = [1 2 3]
histc(x,b)
Will produce
[2 0 1]
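Applied to the array from the question, this might look as follows. The second call uses histcounts, which newer MATLAB releases (R2014b onward) recommend over histc; note that it takes bin edges, so counting the values 1 to 7 needs the eight edges 1:8. The variable names are assumptions of this sketch.

```matlab
inputArray = [1 2 2 3 4 6];

% histc: each entry of the second argument starts a bin; the last one
% counts exact matches only
countsHistc = histc(inputArray, 1:7);

% histcounts: edges 1..8 define 7 unit-wide bins [1,2), [2,3), ..., [7,8]
countsHistcounts = histcounts(inputArray, 1:8);
```

Both calls assume the data are integers in the range 1 to 7; a value of 8 would be counted in the last histcounts bin, for example.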
Yet another possibility: use accumarray:
count = accumarray(inputArray(:), 1, [7 1]); % change 7 as needed
When you want more dimensions of vectorization than is built in to the functions you're working with, or want to collapse a simple loop into a function call, you can use bsxfun ("Binary Singleton eXpansion FUNction"). It's pretty general, reasonably fast, and produces concise code.
In this case, you could use it to construct that equality grid, and then sum them up.
a = [1 2 2 3 4 5];
i = [1:7]'; % Flip it so it's oriented perpendicular to a
res = bsxfun(@eq, a, i);
counts = sum(res,2)';
% One-liner version
counts = sum(bsxfun(@eq, a, [1:7]'), 2)';
Though in the particular case you're working with, since you're doing simple arithmetic operations on primitive arrays, the for loops might actually be fastest with JIT optimizations, as long as you're careful to isolate the work in its own function so the JIT can do "in-place" optimizations.
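In MATLAB R2016b and later, the same equality grid can be built without bsxfun, because element-wise operators expand singleton dimensions implicitly. A sketch comparing the two forms (vals and countsBsx are names introduced here for illustration):

```matlab
a = [1 2 2 3 4 5];
vals = (1:7)';                 % column vector, perpendicular to a

% Implicit expansion (R2016b+): 7-by-6 logical grid, row k marks a == k
res = (a == vals);
counts = sum(res, 2)';

% Equivalent bsxfun form for older releases
countsBsx = sum(bsxfun(@eq, a, vals), 2)';
```

The rows of res also give the locations of each value, which covers the "where do they occur" variant from the question.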

Writing a multiplication MATLAB style

MATLAB's syntax differs somewhat from the traditional DO-loop "logic", which iterates over indexes one at a time. With that in mind, what would be a more proper way to write the following, so that it runs a bit faster but is still relatively clear to someone not too familiar with MATLAB?
KT = 0.;
for i=1:37
    dKT = KTc(i,1) *const2^KTc(i,2) *const3^KTc(i,3) *const4^KTc(i,4) *const5^KTc(i,5);
    KT = KT + dKT;
end
sprintf('KT = %10.8f', KT);
KTc is a matrix 37x5
(if it helps, only the (i,1) values are REAL values, the rest are INTEGERs)
All constants are REAL scalars.
Your lines (from the original question), written correctly:
KT = 0.;
for i=1:37
    dKT = KTc(i,1) *const2^KTc(i,2) *const3^KTc(i,3) *const4^KTc(i,4) *const5^KTc(i,5);
    KT = KT + dKT;
end
fprintf('KT = %10.8f\n', KT);
On the other hand I would suggest
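n = size(KTc, 1); % number of terms (37 here)
KT = repmat([1 const2 const3 const4 const5], n, 1) .^ KTc; % n-by-5 matrix of powers
KT(:,1) = KTc(:,1); % the first column holds the plain coefficients
KT = sum(prod(KT, 2)); % multiply across each row, then add the rows up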
Loops are rarely used in real MATLAB-style programs. The reason is that although my second solution performs more operations, in practice it is quicker thanks to better use of the processor cache, parallelization, and other optimizations that are done silently in the background.
UPDATE: (explanation on repmat)
I think repmat is short for "replicate matrix". What it does really is best explained with two typical examples:
v_row=[1 2 3];
repmat(v_row, 2, 1);
%result:
[1 2 3
1 2 3]
v_col=[1;2;3]; % I could also write v_col=v_row';
repmat(v_col, 1, 2);
[1 1
2 2
3 3]
In general repmat does this:
repmat(m, 2, 3);
[m m m
m m m]
% if m=[1 2; 3 4] was the value of m, then
[1 2 1 2 1 2
3 4 3 4 3 4
1 2 1 2 1 2
3 4 3 4 3 4]
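A repmat-based rewrite of the loop can be checked against the loop itself. The sketch below uses made-up data: the contents of KTc and the values of const2 to const5 are placeholders, not values from the question, and KTloop, KTvec, bases and powers are names introduced here.

```matlab
rng(0);                                     % reproducible placeholder data
KTc = [rand(37, 1), randi([0 3], 37, 4)];   % 37-by-5: real column, integer exponents
const2 = 1.1; const3 = 0.9; const4 = 1.3; const5 = 0.7;

% Reference: the loop from the question
KTloop = 0;
for i = 1:37
    KTloop = KTloop + KTc(i,1) * const2^KTc(i,2) * const3^KTc(i,3) ...
                               * const4^KTc(i,4) * const5^KTc(i,5);
end

% Vectorized: replicate the constants down the rows, raise each to its
% exponent, multiply across each row, scale by column 1 and sum
bases  = [const2 const3 const4 const5];
powers = repmat(bases, 37, 1) .^ KTc(:, 2:5);   % 37-by-4
KTvec  = sum(KTc(:,1) .* prod(powers, 2));
```

The two results should agree up to floating-point rounding, so compare them with a small tolerance rather than with isequal.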
Try ...
KT = KT_coeff(1,1:37) .* const1.^KT_coeff(2,1:37) .* const2.^KT_coeff(3,1:37) .* const3.^KT_coeff(4,1:37) .* const4.^KT_coeff(5,1:37);
Given that KT_coeff has 37 columns, you could simplify this a bit further by replacing 1:37 with just : above. The expression yields the 37 individual terms, so finish with sum(KT) to get the total.
I would avoid all these exponentials, taking a log first and then an exp.
% // way cheaper to evaluate
log_KT = log([c1 c2 c3 c4])*KT_coeff(2:end,:);
% // Final exp
KT = KT_coeff(1,:) .* exp(log_KT);
KT = sum(KT);
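The log/exp version relies on the identity c1^e1 * c2^e2 * ... = exp(e1*log(c1) + e2*log(c2) + ...), which turns the products of powers into a single matrix multiplication. It only stays real-valued when all constants are positive. A sketch with placeholder data (the 5-by-37 layout of KT_coeff follows this answer; the numbers themselves are made up):

```matlab
rng(1);                                          % reproducible placeholder data
KT_coeff = [rand(1, 37); randi([0 3], 4, 37)];   % row 1: coefficients, rows 2-5: exponents
c = [1.1 0.9 1.3 0.7];                           % constants, assumed positive

% Direct evaluation with element-wise powers
direct = KT_coeff(1,:) .* c(1).^KT_coeff(2,:) .* c(2).^KT_coeff(3,:) ...
                       .* c(3).^KT_coeff(4,:) .* c(4).^KT_coeff(5,:);

% Log/exp evaluation: (1-by-4) * (4-by-37) gives the summed log terms
viaLog = KT_coeff(1,:) .* exp(log(c) * KT_coeff(2:end,:));

KT = sum(viaLog);
```

Whether this is actually cheaper depends on the sizes involved, so it is worth timing both forms on real data.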

What is the quickest way to keep the non dominated elements and omit the rest in MATLAB?

For example, [2 , 5] dominates [3 , 8] because (2 < 3) and (5 < 8),
but [2 , 5] does not dominate [3 , 1] because although (2 < 3), (5 > 1), so these two vectors are non-dominated.
Now assume that I have a matrix like this:
a =[ 1 8;
2 6;
3 5;
4 6];
Here the first three rows are non-dominated, but the last one is dominated by (3,5). I need code that omits it and gives me this output:
ans =
[ 1 8;
2 6;
3 5]
Note that there may be lots of non-dominated elements in an Nx2 matrix.
1) Compare one row with the other rows using bsxfun.
2) Do this for every row using arrayfun (or a loop if you prefer that) and transform the output back to a matrix with cell2mat.
3) Use any and all to check which rows are dominated.
4) Remove these rows.
code:
a=[1 8;2 6;3 5;4 6];
dominated_idxs = any(cell2mat(arrayfun(@(ii) all(bsxfun(@(x,y) x>y,a,a(ii,:)),2),1:size(a,1),'uni',false)),2);
a(dominated_idxs,:) = [];
edit
If you want to use >= instead of > comparison, each row will dominate itself and will be removed, so you'll end up with an empty matrix. Filter these false-positives out by adjusting the code as follows:
a=[1 8;2 6;3 5;4 6];
N = size(a,1);
compare_matrix = cell2mat(arrayfun(@(ii) all(bsxfun(@(x,y) x>=y,a,a(ii,:)),2),1:N,'uni',false));
compare_matrix(1:N+1:N^2)=false; % set diagonal to false
dominated_idxs = any(compare_matrix,2);
a(dominated_idxs ,:) = [];
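The same strict comparison can also be written without arrayfun and cell2mat by moving the columns into a third dimension and letting bsxfun expand the rows against each other. This is only a sketch of an alternative formulation (dom, keep and front are names introduced here), and it builds an N-by-N-by-2 logical array, so memory grows quadratically with the number of rows.

```matlab
a = [1 8; 2 6; 3 5; 4 6];

rowsAsDim1 = permute(a, [1 3 2]);   % N-by-1-by-2: row j along dimension 1
rowsAsDim2 = permute(a, [3 1 2]);   % 1-by-N-by-2: row i along dimension 2

% dom(j,i) is true when row j is larger than row i in every column,
% i.e. row j is dominated by row i
dom = all(bsxfun(@gt, rowsAsDim1, rowsAsDim2), 3);

keep  = ~any(dom, 2);               % rows that no other row dominates
front = a(keep, :);
```

Because the comparison is strict, a row never dominates itself, so the diagonal needs no special handling here.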
This problem is identical to identifying the so-called Pareto front.
If the number of elements N grows large and/or you need to carry out this sort of operation often (as I suspect you do), you might want to consider a fully optimized MEX file for this purpose (one is available on the MathWorks File Exchange).
Compiling this, putting the mex in your Matlab path, and then using something like
a = a(paretofront(a));
will accomplish your task much quicker than any combination of Matlab-builtins is able to.