In matlab how to sum rows of a matrix according to specified bins/subscripts in a non-iteration manner? - matlab

This question generalizes the previous one Any way for matlab to sum an array according to specified bins NOT by for iteration? Best if there is buildin function for this. I am not sure, but I tried and the answers in previous post seem not to work with matrices.
For example, if
A = [7,8,1,1,2,2,2]; % the bins or subscripts
B = [2,1; ...
1,1; ...
1,1; ...
2,0; ...
3,1; ...
0,2; ...
2,4]; % the matrix
then the desired function "binsum" has two outputs, one is the bins, and the other is the accumulated row vectors. It is adding rows in B according to subscripts in A. For example, for 2, the sum is [3,1] + [0,2] + [2,4] = [5,6], for 1 it is [1,1] + [2,0] = [3,1].
[bins, sums] = binsum(A,B);
bins = [1,2,7,8]
sums = [2,1;
1,1;
3,1;
5,6]
The first method accumarray says its "val" argument can only be a scalar or vector. The second method spare seems not to accept a vector as the value "v" for each tuple (i,j) neither. So I have to post for help again, and it is still not desired to use iterations to go over the columns of B to do this.
I am using 2017a. Many thanks again!

A way to do that is using matrix multiplication:
bins = unique(A);
sums = (A==bins.')*B;
The above is memory-expensive, as it builds an intermediate logical matrix of size M×N, where M is the the number of bins and N is the length of A. Alternatively, you can build that matrix as sparse logical to save memory:
[bins, ~, labels] = unique(A);
sums = sparse(labels, 1:numel(A), true)*B;

A method base on sort and cumsum:
[s,I]=sort(A);
c=cumsum(B(I,:));
k= [s(1:end-1)~=s(2:end) true];
sums = diff([zeros(1,size(B,2)); c(k,:)])
bins=s(k)

Related

Comparing matrices of different size in matlab and storing values that are close

I have two matrices A and B. A(:,1) corresponds to an x-coordinate, A(:,2) corresponds to a y-coordinate, and A(:,3) corresponds to a certain radius. All three values in a row describe the same circle. Now let's say...
A =
[1,4,3]
[8,8,7]
[3,6,3]
B =
[1,3,3]
[1, 92,3]
[4,57,8]
[5,62,1]
[3,4,6]
[9,8,7]
What I need is to be able to loop through matrix A and determine if there are any rows in matrix B that are "similar" as in the x value is within a range (-2,2) of the x value of A (Likewise with the y-coordinate and radius).If it satisfies all three of these conditions, it will be added to a new matrix with the values that were in A. So for example I would need the above data to return...
ans =
[1,4,3]
[8,8,7]
Please help and thank you in advance to anyone willing to take the time!
You can use ismembertol.
result = A(ismembertol(A,B,2,'ByRows',1,'DataScale',1),:)
Manual method
A = [1,4,3;
8,8,7;
3,6,3];
B = [1,3,3;
1,92,3;
4,57,8;
5,62,1;
3,4,6;
9,8,7]; % example matrices
t = 2; % desired threshold
m = any(all(abs(bsxfun(#minus, A, permute(B, [3 2 1])))<=t, 2), 3);
result = A(m,:);
The key is using permute to move the first dimension of B to the third dimension. Then bsxfun computes the element-wise differences for all pairs of rows in the original matrices. A row of A should be selected if all the absolute differences with respect to any column of B are less than the desired threshold t. The resulting variable m is a logical index which is used for selecting those rows.
Using pdist2 (Statistics and Machine Learning Toolbox)
m = any(pdist2(A, B, 'chebychev')<=t, 2);
result = A(m,:);
Ths pdist2 function with the chebychev option computes the maximum coordinate difference (Chebychev distance, or L∞ metric) between pairs of rows.
With for loop
It should work:
A = [1,4,3;
8,8,7;
3,6,3]
B = [1,3,3;
1,92,3;
4,57,8;
5,62,1;
3,4,6;
9,8,7]
index = 1;
for i = 1:size(A,1)
C = abs(B - A(i,:));
if any(max(C,[],2)<=2)
out(index,:) = A(i,:);
index = index + 1
end
end
For each row of A, computes the absolute difference between B and that row, then checks if there exists a row in which the maximum is less than 2.
Without for loop
ind = any(max(abs(B - permute(A,[3 2 1])),[],2)<=2);
out = A(ind(:),:);

Index a vector by a matrix of conditions to obtain multiple selections of the target?

I have a vector T of length n and m other vectors of the same length with 0 or 1 used as condition to select elements of T. The condition vectors are combined into a matrix I of size n x m.
Is there a one liner to extract a matrix M of values from Tsuch that the i-th column of M are those elements in T that are selected by the condition elements of the i-th column in I?
Example:
T = (1:10)'
I = mod(T,2) == 0
T(I)'
yields
2 4 6 8 10
However
I = mod(T,2:4) == 0
T(I)'
yields an error in the last statement. I see that the columns might select a different number of elements which results in vectors of different lengths (as in the example). However, even this example doesn't work:
I = zeros(10,2)
I(:,1) = mod(T,2)==0
I(:,2) = mod(T,2)==1
Is there any way to achieve the solution in a one liner?
The easiest way I can think of to do something like this is to take advantage of the element-wise multiplication operator .* with your matrix I. Take this as an example:
% these lines are just setup of your problem
m = 10;
n = 10;
T = [1:m]';
I = randi([0 1], m, n);
% 1 liner to create M
M = repmat(T, 1, n) .* I;
What this does is expand T to be the same size as I using repmat and then multiplies all the elements together using .*.
Here is a one linear solution
mat2cell(T(nonzeros(bsxfun(#times,I,(1:numel(T)).'))),sum(I))
First logical index should be converted to numeric index for it we multiply T by each column of I
idx = bsxfun(#times,I,(1:numel(T)).');
But that index contain zeros we should extract those values that correspond to 1s in matrix I:
idx = nonzeros(idx);
Then we extract repeated elements of T :
T2 = T(idx);
so we need to split T2 to 3 parts size of each part is equal to sum of elements of corresponding column of I and mat2cell is very helpful
result = mat2cell(T2,sum(I));
result
ans =
{
[1,1] =
2
4
6
8
10
[2,1] =
3
6
9
[3,1] =
4
8
}
One line solution using cellfun and mat2cell
nColumns = size(I,2); nRows = size(T,1); % Take the liberty of a line to write cleaner code
cellfun(#(i)T(i),mat2cell(I,nRows,ones(nColumns,1)),'uni',0)
What is going on:
#(i)T(i) % defines a function handle that takes a logical index and returns elements from T for those indexes
mat2cell(I,nRows,ones(nColumns,1)) % Split I such that every column is a cell
'uni',0 % Tell cellfun that the function returns non uniform output

Multiple constant to a matrix and convert them into block diagonal matrix in matlab

I have a1 a2 a3. They are constants. I have a matrix A. What I want to do is to get a1*A, a2*A, a3*A three matrices. Then I want transfer them into a diagonal block matrix. For three constants case, this is easy. I can let b1 = a1*A, b2=a2*A, b3=a3*A, then use blkdiag(b1, b2, b3) in matlab.
What if I have n constants, a1 ... an. How could I do this without any looping?I know this can be done by kronecker product but this is very time-consuming and you need do a lot of unnecessary 0 * constant.
Thank you.
Discussion and code
This could be one approach with bsxfun(#plus that facilitates in linear indexing as coded in a function format -
function out = bsxfun_linidx(A,a)
%// Get sizes
[A_nrows,A_ncols] = size(A);
N_a = numel(a);
%// Linear indexing offsets between 2 columns in a block & between 2 blocks
off1 = A_nrows*N_a;
off2 = off1*A_ncols+A_nrows;
%// Get the matrix multiplication results
vals = bsxfun(#times,A,permute(a,[1 3 2])); %// OR vals = A(:)*a_arr;
%// Get linear indices for the first block
block1_idx = bsxfun(#plus,[1:A_nrows]',[0:A_ncols-1]*off1); %//'
%// Initialize output array base on fast pre-allocation inspired by -
%// http://undocumentedmatlab.com/blog/preallocation-performance
out(A_nrows*N_a,A_ncols*N_a) = 0;
%// Get linear indices for all blocks and place vals in out indexed by them
out(bsxfun(#plus,block1_idx(:),(0:N_a-1)*off2)) = vals;
return;
How to use: To use the above listed function code, let's suppose you have the a1, a2, a3, ...., an stored in a vector a, then do something like this out = bsxfun_linidx(A,a) to have the desired output in out.
Benchmarking
This section compares or benchmarks the approach listed in this answer against the other two approaches listed in the other answers for runtime performances.
Other answers were converted to function forms, like so -
function B = bsxfun_blkdiag(A,a)
B = bsxfun(#times, A, reshape(a,1,1,[])); %// step 1: compute products as a 3D array
B = mat2cell(B,size(A,1),size(A,2),ones(1,numel(a))); %// step 2: convert to cell array
B = blkdiag(B{:}); %// step 3: call blkdiag with comma-separated list from cell array
and,
function out = kron_diag(A,a_arr)
out = kron(diag(a_arr),A);
For the comparison, four combinations of sizes of A and a were tested, which are -
A as 500 x 500 and a as 1 x 10
A as 200 x 200 and a as 1 x 50
A as 100 x 100 and a as 1 x 100
A as 50 x 50 and a as 1 x 200
The benchmarking code used is listed next -
%// Datasizes
N_a = [10 50 100 200];
N_A = [500 200 100 50];
timeall = zeros(3,numel(N_a)); %// Array to store runtimes
for iter = 1:numel(N_a)
%// Create random inputs
a = randi(9,1,N_a(iter));
A = rand(N_A(iter),N_A(iter));
%// Time the approaches
func1 = #() kron_diag(A,a);
timeall(1,iter) = timeit(func1); clear func1
func2 = #() bsxfun_blkdiag(A,a);
timeall(2,iter) = timeit(func2); clear func2
func3 = #() bsxfun_linidx(A,a);
timeall(3,iter) = timeit(func3); clear func3
end
%// Plot runtimes against size of A
figure,hold on,grid on
plot(N_A,timeall(1,:),'-ro'),
plot(N_A,timeall(2,:),'-kx'),
plot(N_A,timeall(3,:),'-b+'),
legend('KRON + DIAG','BSXFUN + BLKDIAG','BSXFUN + LINEAR INDEXING'),
xlabel('Datasize (Size of A) ->'),ylabel('Runtimes (sec)'),title('Runtime Plot')
%// Plot runtimes against size of a
figure,hold on,grid on
plot(N_a,timeall(1,:),'-ro'),
plot(N_a,timeall(2,:),'-kx'),
plot(N_a,timeall(3,:),'-b+'),
legend('KRON + DIAG','BSXFUN + BLKDIAG','BSXFUN + LINEAR INDEXING'),
xlabel('Datasize (Size of a) ->'),ylabel('Runtimes (sec)'),title('Runtime Plot')
Runtime plots thus obtained at my end were -
Conclusions: As you can see, either one of the bsxfun based methods could be looked into, depending on what kind of datasizes you are dealing with!
Here's another approach:
Compute the products as a 3D array using bsxfun;
Convert into a cell array with one product (matrix) in each cell;
Call blkdiag with a comma-separated list generated from the cell array.
Let A denote your matrix, and a denote a vector with your constants. Then the desired result B is obtained as
B = bsxfun(#times, A, reshape(a,1,1,[])); %// step 1: compute products as a 3D array
B = mat2cell(B,size(A,1),size(A,2),ones(1,numel(a))); %// step 2: convert to cell array
B = blkdiag(B{:}); %// step 3: call blkdiag with comma-separated list from cell array
Here's a method using kron which seems to be faster and more memory efficient than Divakar's bsxfun based solution. I'm not sure if this is different to your method, but the timing seems pretty good. It might be worth doing some testing between the different methods to work out which is more efficient for you problem.
A=magic(4);
a1=1;
a2=2;
a3=3;
kron(diag([a1 a2 a3]),A)

Partial sum of a vector

For a vector v (e.g. v=[1,2,3,4,5]), and two index vectors (e.g. a=[1,1,1,2,3] and b=[3,4,5,5,5], with all a(i)<b(i)), I would like to construct w=sum(v(a:b)), which gives the values
w = zeros(length(a),1);
for i = 1:length(a)
w(i)=sum(v(a(i):b(i)));
end
It is slow when length(a) is large. Can I compute w without the for loop?
Yes! The nth element of cumsum(v) is the sum of the first n elements in v, so just take that and subtract the sum of the elements that you don't want to include:
v=[1,2,3,4,5]
a=[1,1,1,2,3]
b=[3,4,5,5,5]
C=cumsum(v)
C(b)-C(a)+v(a)
%// or alternatively
C=cumsum([0 v])
C(b+1)-C(a)
The following code works, but it is of course much less readable:
% assume v is a column vector
units = 1:length(v); units = units'; %units is a column vector
units_matrix = repmat(units, [1 length(a)]);
a_matrix = repmat(a, [length(v) 1]); % assuming a is is a row vector
b_matrix = repmat(b, [length(v) 1]);
weights = (units_matrix>=a_matrix) & (units_matrix<=b_matrix);
v_matrix = repmat(v, [1 length(a)]);
w = sum(v_matrix.*weights);
Explanation:
v_matrix contains copies of v. The summation will be done along
the column, so we need to prepare the other needed information in
vectorized form. units_matrix contains the indexes in v along the
columns. The columns are identical. a_matrix and b_matrix, in
each of their column, contains the indexes that are relevant for each
partial summation. All rows are identical. weights is a logical
matrix where, for each column, the indexes contained in
units_matrix between the corresponding a and b are 1 (true),
and the rest is 0. The element-wise multiplication thus filters the
"right" values, and all the indexes outside the range (again, for
each different column) is multiplied by zero. w is then he result
of the sum function, i.e. a row vector (every column of the
"filtered" matrix is summed).

Matlab Mean over same-indexed elements across cells

I have a cell array of 53 different (40,000 x 2000) sparse matrices. I need to take the mean over the third dimension, so that for example element (2,5) is averaged across the 53 cells. This should yield a single (33,000 x 2016) output. I think there ought to be a way to do this with cellfun(), but I am not able to write a function that works across cells on the same within-cell indices.
You can convert from sparse matrix to indices and values of nonzeros entries, and then use sparse to automatically obtain the sum in sparse form:
myCell = {sparse([0 1; 2 0]), sparse([3 0; 4 0])}; %// example
C = numel(myCell);
M = cell(1,C); %// preallocate
N = cell(1,C);
V = cell(1,C);
for c = 1:C
[m n v] = find(myCell{c}); %// rows, columns and values of nonzero entries
M{c} = m.';
N{c} = n.';
V{c} = v.';
end
result = sparse([M{:}],[N{:}],[V{:}])/C; %'// "sparse" sums over repeated indices
This should do the trick, just initialize an empty array and sum over each element of the cell array. I don't see any way around using a for loop without concatenating it into one giant 3D array (which will almost definitely run out of memory)
running_sum=zeros(size(cell_arr{1}))
for i=1:length(cell_arr)
running_sum=running_sum+cell_arr{i};
end
means = running_sum./length(cell_arr);