Compare an array to several numbers at once - MATLAB

How do you compare an array of numbers to several given numbers? More precisely, I have an array given like so
inputArray = [1 2 2 3 4 6]
and I want to compare inputArray to the numbers 1:7 to ultimately count how many times a "1" is in inputArray, a "2", a "3" and so on.
Obviously I can do something like
res = zeros(7,1);
for i = 1:7
res(i) = sum(inputArray == i);
end
or, more generally, when I might also be interested in the locations of the occurrences
res = zeros(7,length(inputArray));
for i = 1:7
res(i,:) = inputArray == i;
end
res2 = sum(res,1);
Out of curiosity, and/or for speed, I am wondering whether this is possible without a for loop, in a single statement?

It seems like you are looking for a histogram count; see histc:
x = [1 3 10 1 8]
b = [1 2 3]
histc(x,b)
Will produce
[2 0 1]
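Applied to the data from the question (counting how many times each of 1:7 appears), this would be:
inputArray = [1 2 2 3 4 6];
counts = histc(inputArray, 1:7)   % returns [1 2 1 1 0 1 0]
On newer releases (R2014b and later), histcounts(inputArray, 1:8) should give the same counts and is the recommended replacement for histc.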

Yet another possibility: use accumarray:
count = accumarray(inputArray(:), 1, [7 1]); %// Change "7" as needed
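With the data from the question this gives the same counts as the loop; note that accumarray uses the values as subscripts, so this assumes inputArray contains positive integers:
inputArray = [1 2 2 3 4 6];
count = accumarray(inputArray(:), 1, [7 1])   % [1; 2; 1; 1; 0; 1; 0]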

When you want more dimensions of vectorization than is built in to the functions you're working with, or want to collapse a simple loop into a function call, you can use bsxfun ("Binary Singleton eXpansion FUNction"). It's pretty general, reasonably fast, and produces concise code.
In this case, you could use it to construct that equality grid and then sum it up.
a = [1 2 2 3 4 6];
i = [1:7]'; % Flip it so it's oriented perpendicular to a
res = bsxfun(@eq, a, i);
counts = sum(res,2)';
% One-liner version
counts = sum(bsxfun(@eq, a, [1:7]'), 2)';
Though in the particular case you're working with, since you're doing simple arithmetic operations on primitive arrays, the for loops might actually be fastest with JIT optimizations, as long as you're careful to isolate the work in its own function so the JIT can do "in-place" optimizations.
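As a purely illustrative sketch of that last point (the function name is made up), isolating the loop in its own small function could look like this:
function counts = countOccurrences(inputArray, maxVal)
% Count how many times each of 1:maxVal occurs in inputArray (plain loop).
counts = zeros(maxVal, 1);
for k = 1:maxVal
    counts(k) = sum(inputArray == k);
end
end
which you would then call as counts = countOccurrences(inputArray, 7).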


Calculate the set of autocorrelation functions and then sum them

Good evening! I have a 3D array whose first dimension is 1. For clarity, I set it up exactly the same way it is used in my program. "c" plays the role of the number of experiments; in this case there are three, so I calculate the correlation function three times and then add them up.
In reality the number of experiments is 100, so I have to calculate 100 correlation functions and add them.
How can this be done automatically, and if possible without loops? Thank you.
Also, at the beginning I build the 3D array using a loop. Is it possible to build it without a loop as well? This is certainly not my main question, but I would also like to know the answer to it.
d = [1 2 3];
c = [4 2 6];
for i = 1: length(c)
D(1,:,i) = d.*c(i);
end
D
X1 = xcorr(D(:,:,1));
X2 = xcorr(D(:,:,2));
X3 = xcorr(D(:,:,3));
X = X1+X2+X3;
With the help of a loop, my solution looks like this:
d = [1 2 3];
c = [4 2 6];
for i = 1: length(c)
D(1,:,i) = d.*c(i);
x(:,:,i) = xcorr(D(:,:,i));
end
X = sum(x,3)
It seems to be correct. Is it possible to do this without a loop?
You can easily set your first array D without any loop, even though I don't know why you want to keep the first singleton dimension...
D(1, :, :) = d'.*c;
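Note that d'.*c relies on implicit expansion (R2016b and newer); on older releases the bsxfun equivalent would be:
D(1, :, :) = bsxfun(@times, d', c);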
As for the sum of the autocorrelations, I'm not sure you can do it without a loop. The only thing that you can perhaps do is to not use an array to store the correlation for each index (if memory consumption is a problem for you) and just update the sum:
X = zeros(1, 2*length(d)-1); % initialize the sum array
for i = 1:length(c)
X = X + xcorr(D(:, :, i)); % update the sum
end
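If you really want to avoid the loop, one possible sketch (not necessarily faster) uses the fact that xcorr applied to a matrix returns the cross-correlations of all pairs of columns. Arranging the experiments as columns, the autocorrelations sit in columns (i-1)*N+i of the output, which you can pick out and sum. This assumes the default (unscaled) xcorr and implicit expansion (R2016b+):
d = [1 2 3];
c = [4 2 6];
M = d'.*c;                              % 3-by-N, column i is experiment i
N = size(M, 2);
XC = xcorr(M);                          % (2*3-1)-by-N^2: all column pairs
X = sum(XC(:, (0:N-1)*N + (1:N)), 2)'   % sum of the N autocorrelations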

Efficient method to compute products of subvectors of a matrix

Suppose that we have a matrix
A = [1 2 ; 3 4 ; 5 6];
and a logical matrix
B = [false true ; true true ; true false ];
I would like to obtain the row product of elements in A such that the corresponding element in B is true. In the example above, the answer is
C = [2 ; 3*4 ; 5] = [2 ; 12 ; 5];
One method would be to: 1) Take the power of A with respect to B; 2) Take the row product of the power matrix:
C = prod(A.^B,2);
The above command seems to perform unnecessary computation. Is there a faster way of computing C above?
Your method seems quite fast to me. If you really have a bottleneck there, you can maybe try with cheaper operators, like addition and multiplication:
C = prod(A.*B + ~B, 2);
I only tested it with Octave, but it's about twice as fast.
Another, less compact way, also fast in Octave:
C=A; C(~B)=1; C=prod(C,2);
Here's another way, using accumarray. I doubt it's faster:
[ii, ~] = find(B); % create grouping variable
C = accumarray(ii, A(B), [], @prod); % compute product of each group
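If you do want to compare the variants on your own data, a rough timing sketch with timeit (R2013b and newer; the sizes here are made up) might look like this. Results will depend on matrix size, the density of B, and your MATLAB/Octave version:
A = rand(1e5, 10);
B = rand(1e5, 10) > 0.5;
t_pow  = timeit(@() prod(A.^B, 2))        % power-based version
t_mult = timeit(@() prod(A.*B + ~B, 2))   % multiply/add version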

Multiple sampling with different sizes on Matlab

I am trying to implement this code so it works as quickly as possible.
Say I have a population of 100 different values; you can think of it as pop = 1:100 or pop = randn(1,100) to keep things simple. I have a vector n which gives me the size of the samples I want to draw. Say, for example, that n = [1 3 10 6 2]. What I want to do is to take 5 (which in reality is length(n)) different samples of pop, each consisting of n(i) elements drawn without replacement. This means that for my first sample I want 1 element out of pop, for the second sample I want 3, for the third I want 10, and so on.
To be honest, I am not really interested in which elements are sampled. What I want is the sum of the elements present in the i-th sample. This would be trivial to implement with a loop, but I am trying to avoid loops to keep my code as fast as possible. I have to do this for many different populations, and with length(n) being very large.
If I had to do it with a loop, this would be how:
pop = randn(1,100);
n = [1 3 10 6 2];
sum_sample = zeros(length(n),1);
for i = 1:length(n)
sum_sample(i,1) = sum(randsample(pop,n(i)));
end
Is there a way to do this?
The only way to figure out what is fastest for you is to do a comparison of the different methods.
In fact the loop appears to be very fast in this case!
pop = randn(1,100);
n = [1 3 10 6 2];
tic
sr = @(n) sum(randsample(pop,n));
sum_sample = arrayfun(sr,n);
toc %% Returns about 0.004
clear su
tic
for t=numel(n):-1:1
su(t)=sum(randsample(pop,n(t)));
end
toc %% Returns about 0.003
You can create a function handle which chooses the random samples and sums them up. Then you can use arrayfun to execute this function for all values of n:
pop = randn(1,100);
n = [1 3 10 6 2];
sr = @(n) sum(randsample(pop,n));
sum_sample = arrayfun(sr,n);
You can do something like this:
pop = randn(1,100);
n = [1 3 10 6 2];
sampled_data_index = randi(length(pop),1,sum(n));
sampled_data = pop(sampled_data_index);
The randi function randomly selects integer values in a specified range, which makes them suitable for indexing. Once you have the indices, you can use them all at once to pull the sampled data out of pop.
If you want to have unique indices you can replace the randi function with randperm:
sampled_data_index = randperm(length(pop),sum(n));
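A side note, and an assumption about what you need: with randperm the length(n) samples come from a single draw of sum(n) distinct indices, so they are mutually disjoint, unlike the independent randsample draws in the loop. If that is acceptable, you can also get the per-sample sums from the flat vector without arrayfun, using repelem (R2015a and newer) and accumarray:
pop = randn(1,100);
n = [1 3 10 6 2];
sampled_data = pop(randperm(length(pop), sum(n)));
grp = repelem(1:numel(n), n);                     % sample label for each drawn element
sum_sample = accumarray(grp(:), sampled_data(:))  % one sum per sample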
Finally, you can get all the sampled values as a cell array using the following code:
pop = randn(1,100);
n = [1 3 10 6 2];
fun = @(m) pop(randperm(length(pop),m));
C = arrayfun(fun,n,'UniformOutput',0)
And to get the sum of each sample:
funs = @(m) sum(pop(randperm(length(pop),m)));
sumC = arrayfun(funs,n)

Does matrix contain a vector?

I'm looking for a fast / concise way to check whether some matrix contains a given vector, e.g.:
bigMatrix = [1 1 1; 2 2 2; 4 4 4; 5 5 5];
someFunction(bigMatrix, [1 1 1]) % = true
someFunction(bigMatrix, [3 3 3]) % = false
Is there such a function/operator, or do I need a loop?
I would suggest the following solution:
bigMatrix = [1 1 1; 2 2 2; 4 4 4; 5 5 5];
Vec = [2 2 2];
Index = ismember(bigMatrix, Vec, 'rows');
The result?
Index =
0
1
0
0
ismember is an incredibly useful function that checks whether the elements of one set are in another set. Here, I exploit the 'rows' option to force the function to compare rows rather than individual elements.
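If all you need is a single true/false (as in the someFunction calls in the question), you can wrap the result in any:
someFunction = @(M, v) any(ismember(M, v, 'rows'));
someFunction(bigMatrix, [1 1 1])   % true
someFunction(bigMatrix, [3 3 3])   % false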
UPDATE: On the other hand, it is always worth doing a few speed tests! I just compared the ismember approach to the following alternative method:
N = size(bigMatrix, 1);
Index2 = zeros(N, 1);
for n = 1:N
if all(bigMatrix(n, :) == Vec)
Index2(n) = 1;
end
end
My findings? The size of bigMatrix matters! In particular, if bigMatrix is on the small side (making the name somewhat of a misnomer), then the loop is much faster. The first approach is preferable only when bigMatrix becomes big. Further, the results also depend on how many columns bigMatrix has, not just how many rows! I suggest you test both approaches for your application and then go with whichever is faster. (EDIT: This was on R2011a.)
General Note: I am continually surprised by how much faster Matlab's loops have gotten in the last few years. Methinks vectorized code is no longer the holy grail that it once was.

What is the quickest way to keep the non dominated elements and omit the rest in MATLAB?

For example, [2, 5] dominates [3, 8] because (2 < 3) and (5 < 8),
but [2, 5] does not dominate [3, 1]: although (2 < 3), we have (5 > 1), so these two vectors are mutually non-dominated.
Now, for example, assume that I have a matrix like this:
a =[ 1 8;
2 6;
3 5;
4 6];
Here the first three rows are non-dominated, but the last one is dominated by (3, 5). I need code that can omit it and give me this output:
ans =
[ 1 8;
2 6;
3 5]
Note that there may be lots of non-dominated rows in an Nx2 matrix.
Compare one row with the other rows using bsxfun.
Do this for every row using arrayfun (or a loop if you prefer that) and transform the output back to a matrix with cell2mat.
Use any and all to check which rows are dominated.
Remove these rows.
Code:
a=[1 8;2 6;3 5;4 6];
dominated_idxs = any(cell2mat(arrayfun(@(ii) all(bsxfun(@(x,y) x>y,a,a(ii,:)),2),1:size(a,1),'uni',false)),2);
a(dominated_idxs,:) = [];
Edit: If you want to use >= instead of > for the comparison, each row will dominate itself and be removed, so you'll end up with an empty matrix. Filter out these false positives by adjusting the code as follows:
a=[1 8;2 6;3 5;4 6];
N = size(a,1);
compare_matrix = cell2mat(arrayfun(@(ii) all(bsxfun(@(x,y) x>=y,a,a(ii,:)),2),1:N,'uni',false));
compare_matrix(1:N+1:N^2) = false; % set diagonal to false
dominated_idxs = any(compare_matrix,2);
a(dominated_idxs,:) = [];
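For the example a, both variants return [1 8; 2 6; 3 5], i.e. only the dominated row [4 6] is removed, matching the expected output in the question.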
This problem is identical to identifying the so-called Pareto front.
If the number of rows N grows large and/or you need to carry out this sort of operation often (as I suspect you do), you might want to consider a fully optimized MEX implementation such as paretofront, available on the MathWorks File Exchange.
Compiling it, putting the MEX file on your MATLAB path, and then using something like
a = a(paretofront(a), :);
will accomplish your task much more quickly than any combination of MATLAB built-ins.