Matlab index to logic indexing - matlab

I am given a list of indices, e.g. i = [3 5], and a vector v = 1:6. I need a function f which returns the logical mask for the vector v given the indices i, e.g.:
f(i, length(v)) = [0 0 1 0 1 0]
Since I will call this function several million times, I would like to make it as fast as possible. Is there a builtin function which performs this task?

I know I'm late in the game, but I really wanted to find a faster solution which is just as elegant as ismember. And indeed there is one, that employs the undocumented ismembc function:
ismembc(v, i)
Benchmark
N = 7;
i = [3 5];
%// slayton's solution
tic
for ii = 1:1e5
clear idx;
idx(N) = false;
idx(i) = true;
end
toc
%// H.Muster's solution
tic
for ii = 1:1e5
v = 1:N;
idx = ismember(v, i);
end
toc
%// Jonas' solution
tic
for ii = 1:1e5
idx = sparse(i, 1, true, N, 1);
end
toc
%// ismembc solution
tic
for ii = 1:1e5
v = 1:N;
idx = ismembc(v, i);
end
toc
Here's what I got:
Elapsed time is 1.482971 seconds.
Elapsed time is 6.369626 seconds.
Elapsed time is 2.039481 seconds.
Elapsed time is 0.776234 seconds.
Amazingly, ismembc is indeed the fastest!
Edit:
For very large values of N (i.e. when v is a large array), the faster solution is actually slayton's (and HebeleHododo's, for that matter). You have quite a variety of strategies to choose from, pick carefully :)
Edit by H.Muster:
Here are benchmark results including _ismemberoneoutput:
Slayton's solution:
Elapsed time is 1.075650 seconds.
ismember:
Elapsed time is 3.163412 seconds.
ismembc:
Elapsed time is 0.390953 seconds.
_ismemberoneoutput:
Elapsed time is 0.477098 seconds.
Interestingly, Jonas' solution does not run for me, as I get an Index exceeds matrix dimensions. error...
Edit by hoogamaphone:
It's worth noting that ismembc requires both inputs to be numerical, sorted, non-sparse, non-NaN values, which is a detail that could be easily missed in the source documentation.
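A minimal sketch of that caveat, assuming the undocumented ismembc is still present in your release (it is not part of the documented API, so treat this as illustrative only): sort the indices before the call and cross-check the result against ismember.

```matlab
% Sketch only: ismembc is undocumented and may change between releases.
idx = [5 3];                      % deliberately unsorted indices
N   = 6;                          % length of the target vector
sortedIdx = sort(idx);            % ismembc expects its second input sorted ascending
mask = ismembc(1:N, sortedIdx);   % logical mask, true at positions 3 and 5
% Cross-check against the documented (slower) ismember
assert(isequal(mask, ismember(1:N, idx)));
```

If the indices are already known to be sorted, the sort call can be dropped, which matters when the function is called millions of times.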

You can use ismember
i = [3 5];
v = 1:6;
ismember(v,i)
will return
ans =
0 0 1 0 1 0
For a probably faster version, you can try
builtin('_ismemberoneoutput', v, i)
Note that I tested this only for row vectors like specified by you.

Simply create a vector of logical indices and set the desired locations to true/false
idx = false( size( v) );
idx( i ) = true;
This can be wrapped in a function like so:
function idx = getLogicalIdx(sz, i)
idx = false(sz); % sz is a size vector such as size(v); named sz so it does not shadow the built-in size
idx(i) = true;
end
If you need an indexing vector of the same size for each of your million operations, allocate the vector once and then operate on it each iteration:
idx = false(size(v)); % allocate the vector
while( keepGoing)
idx(i) = true; % set the desired values to true for this iteration
doSomethingWithIndecies(idx);
idx(i) = false; % set indices back to false for next iteration
end
If you really need performance then you can write a mex function to do this for you. Here is a very basic, untested function that I wrote that is about 2x faster than the other methods:
#include <math.h>
#include <matrix.h>
#include <mex.h>
void mexFunction(int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[])
{
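double M;
double *in;
size_t N, k;
mxLogical *out;
M = mxGetScalar(prhs[0]); /* length of the output mask */
in = mxGetPr(prhs[1]); /* 1-based indices, passed as doubles */
N = mxGetNumberOfElements(prhs[1]);
plhs[0] = mxCreateLogicalMatrix( (mwSize)M, 1 ); /* all elements start as false */
out = mxGetLogicals( plhs[0] );
for (k=0; k<N; k++){
out[ (size_t)in[k] - 1 ] = 1; /* MATLAB indices are 1-based, C arrays are 0-based */
}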
}
There are several different ways to allocate a vector in MATLAB. Some are faster than others; see the Undocumented Matlab post on this topic for a good summary.
Here are some quick benchmarks comparing the different methods. The last method is by far the fastest but it requires you to use the same size logical indexing vector for each operation.
N = 1000;
ITER = 1e5;
i = sort(randi(N,100,1)); % indices must lie within 1..N; sorted because ismembc expects sorted input
sz = [N, 1];
fprintf('Create using false()\n');
tic;
for j = 1:ITER
clear idx;
idx = false( N, 1 );
idx(i) = true;
end
toc;
fprintf('Create using indexing\n');
tic;
for j = 1:ITER
clear idx;
idx(N) = false;
idx(i) = true;
end
toc;
fprintf('Create once, update as needed\n');
tic;
idx = false(N,1);
for j = 1:ITER
idx(i) = true;
idx(i) = false;
end
toc;
fprintf('Create using ismembc\n');
a = ones(N,1);
tic;
for j = 1:ITER
idx = ismembc(1:N, i);
end
toc;

I expect that @slayton's solution is fastest. However, here's a one-liner alternative, that may at least save you some memory if the vectors are large.
vecLen = 6;
logicalIdx = sparse(idx,1,true,vecLen,1);
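As a small illustration of the one-liner above (the variable names sparseMask and rowMask are made up for this sketch), the sparse result is a logical column vector; if a dense row mask is needed it can be converted with full and a transpose:

```matlab
idx    = [3 5];
vecLen = 6;
% vecLen-by-1 sparse logical column with true at the given rows
sparseMask = sparse(idx, 1, true, vecLen, 1);
% Dense 1-by-vecLen logical row, in the shape the question asks for
rowMask = full(sparseMask).';
```

The memory saving only holds while the mask stays sparse, so the conversion to full is worth avoiding for very long vectors.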

Just index into a new variable with the idx vector; it will fill in the zeros for you:
idx = [3 5];
a(idx) = true
No need for a function, nor for passing the length in unless you want trailing zeros too.
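To make the trailing-zeros remark concrete, here is a short sketch comparing the two cases (the name b is only for illustration): assigning into a non-existent variable grows it to the largest index, whereas preallocating fixes the length.

```matlab
idx = [3 5];
clear a                 % make sure a does not exist yet
a(idx) = true;          % grows to the largest index: 1-by-5 logical
b = false(1, 6);        % preallocated to the full length 6
b(idx) = true;          % 1-by-6 logical, trailing zero kept
```

Here a has five elements and b has six, so a can only index the first five elements of a six-element vector.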

You can write a function like this:
function logicalIdx = getLogicalIdx(idx, v)
logicalIdx = false(1,size(v,2)); % false(...) yields a logical array; zeros(...) would yield doubles
logicalIdx(idx) = true;
end
When you call the function:
v = 1:6;
idx = [3 5];
getLogicalIdx(idx,v)
The output will be:
ans =
0 0 1 0 1 0

Can you simply do v(i) = 1?
For example, if you say x = zeros(1,10);
and a = [1 3 4];
then x(a) = 1 will return
1 0 1 1 0 0 0 0 0 0

Related

Applying a function to a matrix, which depends on the indices?

Suppose I have a matrix A and I want to apply a function f to each of its elements. I can then use f(A), if f is vectorized or arrayfun(f,A) if it's not.
But what if I had a function that depends on the entry and its indices: f = @(i,j,x) something. How do I apply this function to the matrix A without using a for loop like the following?
for j=1:size(A,2)
for i=1:size(A,1)
fA(i,j) = f(i,j,A(i,j));
end
end
I'd like to consider the function f to be vectorized. Hints on shorter notation for non-vectorized functions are welcome, though.
I have read your answers and I came up with another idea using indexing, which is the fastest way. Here is my test script:
%// Test function
f = @(i,j,x) i.*x + j.*x.^2;
%// Initialize times
tfor = 0;
tnd = 0;
tsub = 0;
tmy = 0;
%// Do the calculation 100 times
for it = 1:100
%// Random input data
A = rand(100);
%// Clear all variables
clear fA1 fA2 fA3 fA4;
%// Use the for loop
tic;
fA1(size(A,1),size(A,2)) = 0;
for j=1:size(A,2)
for i=1:size(A,1)
fA1(i,j) = f(i,j,A(i,j));
end
end
tfor = tfor + toc;
%// Use ndgrid, like @Divakar suggested
clear I J;
tic;
[I,J] = ndgrid(1:size(A,1),1:size(A,2));
fA2 = f(I,J,A);
tnd = tnd + toc;
%// Test if the calculation is correct
if max(max(abs(fA2-fA1))) > 0
max(max(abs(fA2-fA1)))
end
%// Use ind2sub, like @DennisKlopfer suggested
clear I J;
tic;
[I,J] = ind2sub(size(A),1:numel(A));
fA3 = arrayfun(f,reshape(I,size(A)),reshape(J,size(A)),A);
tsub = tsub + toc;
%// Test if the calculation is correct
if max(max(abs(fA3-fA1))) > 0
max(max(abs(fA3-fA1)))
end
%// My suggestion using indexing
clear sA1 sA2 ssA1 ssA2;
tic;
sA1=size(A,1);
ssA1=1:sA1;
sA2=size(A,2);
ssA2=1:sA2;
fA4 = f(ssA1(ones(1,sA2),:)', ssA2(ones(1,sA1,1),:), A); %'
tmy = tmy + toc;
%// Test if the calculation is correct
if max(max(abs(fA4-fA1))) > 0
max(max(abs(fA4-fA1)))
end
end
%// Print times
tfor
tnd
tsub
tmy
I get the result
tfor =
0.6813
tnd =
0.0341
tsub =
10.7477
tmy =
0.0171
Assuming that the function is vectorized ( no dependency or recursions involved), as mentioned in the comments earlier, you could use ndgrid to create 2D meshes corresponding to the two nested loop iterators i and j and of the same size as A. When these are fed to the particular function f, it would operate on the input 2D arrays in a vectorized manner. Thus, the implementation would look something like this -
[I,J] = ndgrid(1:size(A,1),1:size(A,2));
out = f(I,J,A);
Sample run -
>> f = @(i,j,k) i.^2+j.^2+sin(k);
A = rand(4,5);
for j=1:size(A,2)
for i=1:size(A,1)
fA(i,j) = f(i,j,A(i,j));
end
end
>> fA
fA =
2.3445 5.7939 10.371 17.506 26.539
5.7385 8.282 13.538 20.703 29.452
10.552 13.687 18.076 25.804 34.012
17.522 20.684 25.054 32.13 41.331
>> [I,J] = ndgrid(1:size(A,1),1:size(A,2)); out = f(I,J,A);
>> out
out =
2.3445 5.7939 10.371 17.506 26.539
5.7385 8.282 13.538 20.703 29.452
10.552 13.687 18.076 25.804 34.012
17.522 20.684 25.054 32.13 41.331
Using arrayfun(), ind2sub() and reshape() you can create the indexes matching the form of A. This way arrayfun() is applicable. There might be a better version as this feels a little bit like a hack, it should work on vectorized and unvectorized functions though.
[I,J] = ind2sub(size(A),1:numel(A));
fA = arrayfun(f,reshape(I,size(A)),reshape(J,size(A)),A)
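As a further sketch, not taken from the answers above: on releases with implicit expansion (R2016b and later, which is an assumption about your setup), a function built purely from element-wise operators can often be fed a column of row indices and a row of column indices directly, without building full grids. The names rowIdx, colIdx, fA_implicit and fA_grid are illustrative.

```matlab
% Assumes R2016b or later and an f made of element-wise operators only
f = @(i,j,x) i.*x + j.*x.^2;        % same form as the test function in the benchmark
A = rand(4,5);
rowIdx = (1:size(A,1)).';           % column vector of row indices
colIdx = 1:size(A,2);               % row vector of column indices
fA_implicit = f(rowIdx, colIdx, A); % indices expand against A automatically
% Reference result using explicit ndgrid meshes
[I,J] = ndgrid(1:size(A,1), 1:size(A,2));
fA_grid = f(I, J, A);
```

This only works when every term of f combines the indices with each other or with x in a way that expands to the size of A; a function that ignores x or one of the indices may return an array of the wrong shape, in which case the ndgrid version is the safer choice.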

Storing non-zero integers from one matrix into another

I'm attempting to create a loop that reads through a matrix (A) and stores the non-zero values into a new matrix (w). I'm not sure what is wrong with my code.
function [d,w] = matrix_check(A)
[nrow ncol] = size(A);
total = 0;
for i = 1:nrow
for j = 1:ncol
if A(i,j) ~= 0
total = total + 1;
end
end
end
d = total;
w = [];
for i = 1:nrow
for j = 1:ncol
if A(i,j) ~= 0
w = [A(i,j);w];
end
end
end
The second loop is not working (or at least it is not printing out the results of w).
You can use nonzeros and nnz:
w = flipud(nonzeros(A.')); %// transpose + flipud to reproduce the row-by-row, prepending order of your code
d = nnz(A);
The second loop is working. I'm guessing you're doing:
>> matrix_check(A)
And not:
>> [d, w] = matrix_check(A)
MATLAB will only return the first output unless otherwise specified.
As an aside, you can accomplish your task utilizing MATLAB's logical indexing and take advantage of the (much faster, usually) array operations rather than loops.
d = sum(sum(A ~= 0));
w = A(A ~= 0);
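The ordering of w differs between these approaches, which a small worked example may make clearer. The matrix and variable names below are made up for illustration: logical indexing and nonzeros walk the matrix column by column, while the loop in the question walks row by row and prepends each hit.

```matlab
A = [1 0 2; 0 3 0];
% Column-major order, as returned by logical indexing or nonzeros(A)
w_colmajor = A(A ~= 0);
% Row-by-row order reversed, i.e. what the prepending double loop builds
w_loop = flipud(nonzeros(A.'));
% Number of non-zero entries
d = nnz(A);
```

For this A, w_colmajor is [1; 3; 2] and w_loop is [3; 2; 1]; both contain the same values, so the difference only matters if later code depends on the order.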

MATLAB sum series function

I am very new to Matlab. I am trying to implement the sum of the series 1+x+x^2/2!+x^3/3!+..., but I could not figure out how to do it. So far I have only managed a sum of numbers. Help please.
for ii = 1:length(a)
sum_a = sum_a + a(ii)
sum_a
end
n = 0 : 10; % elements of the series
x = 2; % value of x
s = sum(x .^ n ./ factorial(n)); % sum
The second part of your answer is:
n = 0:input('variable?')
Cheery's approach is perfectly valid when the number of terms of the series is small. For large values, a faster approach is as follows. This is more efficient because it avoids repeating multiplications:
m = 10;
x = 2;
result = 1+sum(cumprod(x./[1:m]));
Example running time for m = 1000; x = 1;
tic
for k = 1:1e4
result = 1+sum(cumprod(x./[1:m]));
end
toc
tic
for k = 1:1e4
result = sum(x.^(0:m)./factorial(0:m));
end
toc
gives
Elapsed time is 1.572464 seconds.
Elapsed time is 2.999566 seconds.
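A short sketch of why the cumprod form works, with a check against exp: each term x^k/k! equals the previous term times x/k, so a running product of x./(1:m) produces all terms in one pass. The second half illustrates a numerical side effect that, as far as I can tell, follows from factorial returning Inf above 170 in double precision; the values of xBig and mBig are arbitrary choices for the demonstration.

```matlab
x = 2; m = 30;
terms  = cumprod(x ./ (1:m));     % terms(k) = x^k / k!, each built from the previous one
result = 1 + sum(terms);          % add the k = 0 term
err    = abs(result - exp(x));    % truncation plus rounding error

% With many terms the direct formula can hit Inf/Inf = NaN,
% while the running product stays finite
xBig = 3; mBig = 1000;
direct  = sum(xBig.^(0:mBig) ./ factorial(0:mBig));
running = 1 + sum(cumprod(xBig ./ (1:mBig)));
```

So besides being faster, the running-product form may also be the more robust of the two when the number of terms is large.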

Subtracting each elements of a row vector , size (1 x n) from a matrix of size (m x n)

I have two matrices of big sizes, which are something similar to the following matrices.
m; with size 1000 by 10
n; with size 1 by 10.
I would like to subtract each element of n from all elements of m to get ten different matrices, each has size of 1000 by 10.
I started as follows
clc;clear;
nrow = 10000;
ncol = 10;
t = length(n)
for i = 1:nrow;
for j = 1:ncol;
for t = 1:length(n);
m1(i,j) = m(i,j)-n(1);
m2(i,j) = m(i,j)-n(2);
m3(i,j) = m(i,j)-n(3);
m4(i,j) = m(i,j)-n(4);
m5(i,j) = m(i,j)-n(5);
m6(i,j) = m(i,j)-n(6);
m7(i,j) = m(i,j)-n(7);
m8(i,j) = m(i,j)-n(8);
m9(i,j) = m(i,j)-n(9);
m10(i,j) = m(i,j)-n(10);
end
end
end
Can anyone help me do this without writing the ten equations inside the loop, or suggest a more convenient way, especially when the two matrices have many columns?
Why can't you just do this:
m01 = m - n(1);
...
m10 = m - n(10);
What do you need the loop for?
Even better:
N = length(n);
m2 = cell(N, 1);
for k = 1:N
m2{k} = m - n(k);
end
Here we go loopless:
nrow = 10000;
ncol = 10;
%example data
m = ones(nrow,ncol);
n = 1:ncol;
M = repmat(m,1,1,ncol);
N = permute( repmat(n,nrow,1,ncol) , [1 3 2] );
result = bsxfun(@minus, M, N );
%or just
result = M-N;
Elapsed time is 0.018499 seconds.
or as recommended by Luis Mendo:
result = bsxfun(@minus, m, permute(n, [1 3 2]) ); %no repmat needed, bsxfun expands the singleton dimensions
Elapsed time is 0.000094 seconds.
Please make sure that your input vectors have the same orientation as in my example, otherwise you could run into trouble. You should be able to achieve that by transposing them, or you have to modify this line:
permute( repmat(n,nrow,1,ncol) , [1 3 2] )
according to your needs.
You mentioned in a comment that you want to count the negative elements in each of the obtained columns:
A = result; %backup results
A(A > 0) = 0; %set positive elements to zero, so only negative elements remain non-zero
D = sum( logical(A),3 );
which will return the desired 10000x10 matrix with the counts of negative elements. (Please verify it, I may have gotten a little confused with the dimensions ;))
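For releases with implicit expansion (R2016b and later, an assumption about your installation), the same subtraction can be sketched without bsxfun or repmat. The tiny matrices below are hypothetical data chosen so the result can be checked by hand, and negPerElement and negPerColumn are illustrative names.

```matlab
% Assumes R2016b or later; small made-up data for easy inspection
m = [1 5; 3 2; 4 6];                          % 3-by-2
n = [2 4];                                    % 1-by-2
result = m - reshape(n, 1, 1, []);            % 3-by-2-by-2, result(:,:,k) = m - n(k)
negPerElement = sum(result < 0, 3);           % for each m(i,j): how many slices are negative there
negPerColumn  = squeeze(sum(result < 0, 1));  % entry (j,k): negatives in column j of slice k
```

Counting with result < 0 avoids modifying a copy of the result first; which of the two counts is wanted depends on whether "per column" refers to each element position or to each column of each slice.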
Create the three dimensional result matrix. Store your results, for example, in third dimension.
% m (nrow-by-ncol) and n must already exist in the workspace, so do not call clear here
nrow = 10000;
ncol = 10;
N = length(n);
resultMatrix = zeros(nrow, ncol, N);
neg = zeros(ncol, N); % amount of negative values
for j = 1:ncol
for i = 1:nrow
for t = 1:N
resultMatrix(i,j,t) = m(i,j) - n(t);
end
end
for t = 1:N
neg(j,t) = length( find(resultMatrix(:,j,t) < 0) );
end
end

Deleting matrix elements by = [] vs reassigning matrix

Is there any difference between these two methods for deleting elements in Matlab:
ElementsToDelete = logical([0 0 1 0 1 0 0 1 1 0]) % must be logical; numeric zeros are not valid indices
A = 1:10
A(ElementsToDelete) = []
%Versus
A = 1:10
A = A(~ElementsToDelete)
Are there times when one method is more appropriate than the other? Is there a difference in efficiency? Or are they completely interchangeable?
Try this:
A = rand(1e3, 1);
b = A<0.5;
tic;
for ii = 1:1e5
a = A;
a(b) = [];
end
toc
tic;
for ii = 1:1e5
a = A;
a = a(~b);
end
toc
Results:
Elapsed time is 1.654146 seconds
Elapsed time is 1.126325 seconds
So the difference is a speed factor of 1.5 in favour of re-assigning. This however, is worse:
A = rand(1e4, 1);
stop = 0;
for jj = 1:10
a = A;
start = tic;
for ii = 1:1e5
a(a < rand) = [];
end
stop = stop + toc(start);
end
avg1 = stop/10
stop = 0;
for jj = 1:10
a = A;
start = tic;
for ii = 1:1e5
a = a(a > rand);
end
stop = stop + toc(start);
end
avg2 = stop/10
avg1/avg2
Results:
avg1 = 1.1740235 seconds
avg2 = 0.1850463 seconds
avg1/avg2 = 6.344485136963019
So, the factor's increased to well over 6.
My guess is that deletion (i.e., assigning with []) re-writes the entire array on each and every occurrence of a true in the internal loop through the logical indices. This is hopelessly inefficient, as becomes apparent when testing it like this. Re-assigning on the other hand can determine the size of the new array beforehand and initialize it accordingly; no re-writes needed.
Why the JIT does not compile the one into the other is a mystery to me, because deletion is a far more intuitive notation IMHO. But, as you see, it is inefficient compared to alternatives, and should thus be used sparingly. Never use it inside loops!
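As a quick sanity check that the two notations are interchangeable in their result, even if not in speed, here is a minimal sketch with a logical mask (the names byDeletion and byReassign are illustrative):

```matlab
A = 1:10;
mask = logical([0 0 1 0 1 0 0 1 1 0]);  % must be logical, not numeric 0/1
byDeletion = A;
byDeletion(mask) = [];                  % delete the masked elements in place
byReassign = A(~mask);                  % keep the complement instead
```

Both variables end up as [1 2 4 6 7 10], so the choice between them comes down to readability and the timing differences measured above.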