Subtotal Calculation in Matlab - matlab

I would like to take subtotal of table in matlab. If the values of two columns are equal, take the value and add if there is an entry.
If we give an example, source matrix is as follows:
A = [1 2 3;
1 2 2;
1 4 1;
2 2 1;
2 2 3];
The output would look like this:
B = [1 2 5;
1 4 1;
2 2 4];
If the first two columns are equal, sum the third column. Is there a simple way of doing, without having to loop several times?

You can do this with a combination of unique and accumarray:
%# find unique rows and their corresponding indices in A
[uniqueRows,~,rowIdx]=unique(A(:,1:2),'rows');
%# for each group of unique rows, sum the values of the third column of A
subtotal = accumarray(rowIdx,A(:,3),[],#sum);
B = [uniqueRows,subtotal];

You can use unique to get all of the groups, then splitapply to sum them
[u, ~, iu] = unique( A(:,1:2), 'rows' ); % Get unique rows & their indices
sums = splitapply( #sum, A(:,3), iu ); % Sum all values according to unique indices
output = [u, sums]
% >> output =
% output =
% 26 7 124
% 26 8 785
% 27 7 800
This is a late answer because a duplicate question has just been asked so I posted here instead. Note that splitapply was introduced in R2015b, so wasn't around when the accumarray solution was posted.

Related

Any way for matlab to sum an array according to specified bins NOT by for iteration? Best if there is buildin function for this

For example, if
A = [7,8,1,1,2,2,2]; % the bins (or subscripts)
B = [2,1,1,1,1,1,2]; % the array
then the desired function "binsum" has two outputs, one is the bins, and the other is the sum. It is just adding values in B according to subscripts in A. For example, for 2, the sum is 1 + 1 + 2 = 4, for 1 it is 1 + 1 = 2.
[bins, sums] = binsum(A,B);
bins = [1,2,7,8]
sums = [2,4,2,1]
The elements in "bins" need not be ordered but must correspond to elements in "sums". This can surely be done by "for" iterations, but "for" iteration is not desired, because there is a performance concern. It is best if there is a build in function for this.
Thanks a lot!
This is another job for accumarray
A = [7,8,1,1,2,2,2]; % the bins (or subscripts)
B = [2,1,1,1,1,1,2]; % the array
sums = accumarray(A.', B.').';
bins = unique(A);
Results:
>> bins
bins =
1 2 7 8
sums =
2 4 0 0 0 0 2 1
The index in sums corresponds to the bin value, so sums(2) = 4. You can use nonzeros to remove the unused bins so that bins(n) corresponds to sums(n)
sums = nonzeros(sums).';
sums =
2 4 2 1
or, to generate this form of sums in one line:
sums = nonzeros(accumarray(A.', B.')).';
Another possibility is to use sparse and then find.
Assuming A contains positive integers,
[bins, ~, sums] = find(sparse(A, 1, B));
This works because sparse automatically adds values (third input) for matching positions (as defined by the first two inputs).
If A can contain arbitrary values, you also need a call to unique, and find can be replaced by nonzeros:
[bins, ~, labels]= unique(A);
sums = nonzeros(sparse(labels, 1, B));
Here is a solution using sort and cumsum:
[s,I]=sort(A);
c=cumsum(B(I));
k= [s(1:end-1)~=s(2:end) true];
sums = diff([0 c(k)])
bins = s(k)

Count rows of a matrix and give back an array

I would like to know how to count rows in an matrix in such a way that gives an output for each colum. for example:
X=[1 1 1;
5 5 5]
I would like to find a command that when I input the matrix X the answers is [2 2 2], so that it counts the number of rows per column.
I have already found nunel(X) but the answer is a scalar numel(X)=6, whereas I need per column.
size(X,1) will give you the number of rows in the matrix (a scalar). a matrix has only one number of rows, i.e. each column has the same number of rows.
however if you still want the number of rows per each column you can use:
X = [1 1 1;
5 5 5];
nrows = size(X,1);
ncols = size(X,2);
nrowsPerCol = repmat(nrows, [1 ncols]) % [2 2 2]
Each matrix object in MATLAB has height and width property.
In other words: each column has the same number of rows.
To get this value, use MATLAB's size function:
[numOfRows, numOfCols] = size(X);

MATLAB: Applying vectors of row and column indices without looping

I have a situation analogous to the following
z = magic(3) % Data matrix
y = [1 2 2]' % Column indices
So,
z =
8 1 6
3 5 7
4 9 2
y represents the column index I want for each row. It's saying I should take row 1 column 1, row 2 column 2, and row 3 column 2. The correct output is therefore 8 5 9.
I worked out I can get the correct output with the following
x = 1:3;
for i = 1:3
result(i) = z(x(i),y(i));
end
However, is it possible to do this without looping?
Two other possible ways I can suggest is to use sub2ind to find the linear indices that you can use to sample the matrix directly:
z = magic(3);
y = [1 2 2];
ind = sub2ind(size(z), 1:size(z,1), y);
result = z(ind);
We get:
>> result
result =
8 5 9
Another way is to use sparse to create a sparse matrix which you can turn into a logical matrix and then sample from the matrix with this logical matrix.
s = sparse(1:size(z,1), y, 1, size(z,1), size(z,2)) == 1; % Turn into logical
result = z(s);
We also get:
>> result
result =
8
5
9
Be advised that this only works provided that each row index linearly increases from 1 up to the end of the rows. This conveniently allows you to read the elements in the right order taking advantage of the column-major readout that MATLAB is based on. Also note that the output is also a column vector as opposed to a row vector.
The link posted by Adriaan is a great read for the next steps in accessing elements in a vectorized way: Linear indexing, logical indexing, and all that.
there are many ways to do this, one interesting way is to directly work out the indexes you want:
v = 0:size(y,2)-1; %generates a number from 0 to the size of your y vector -1
ind = y+v*size(z,2); %generates the indices you are looking for in each row
zinv = z';
zinv(ind)
>> ans =
8 5 9

find row indices of different values in matrix

Having matrix A (n*2) as the source and B as a vector containing a subset of elements A, I'd like to find the row index of items.
A=[1 2;1 3; 4 5];
B=[1 5];
F=arrayfun(#(x)(find(B(x)==A)),1:numel(B),'UniformOutput',false)
gives the following outputs in a cell according to this help page
[2x1 double] [6]
indicating the indices of all occurrence in column-wise. But I'd like to have the indices of rows. i.e. I'd like to know that element 1 happens in row 1 and row 2 and element 5 happens just in row 3. If the indices were row-wise I could use ceil(F{x}/2) to have the desired output. Now with the variable number of rows, what's your suggested solution? As it may happens that there's no complete inclusion tag 'rows' in ismember function does not work. Besides, I'd like to know all indices of specified elements.
Thanks in advance for any help.
Approach 1
To convert F from its current linear-index form into row indices, use mod:
rows = cellfun(#(x) mod(x-1,size(A,1))+1, F, 'UniformOutput', false);
You can combine this with your code into a single line. Note also that you can directly use B as an input to arrayfun, and you avoid one stage of indexing:
rows = arrayfun(#(x) mod(find(x==A)-1,size(A,1))+1, B(:), 'UniformOutput', false);
How this works:
F as given by your code is a linear index in column-major form. This means the index runs down the first column of B, the begins at the top of the second column and runs down again, etc. So the row number can be obtained with just a modulo (mod) operation.
Approach 2
Using bsxfun and accumarray:
t = any(bsxfun(#eq, B(:), reshape(A, 1, size(A,1), size(A,2))), 3); %// occurrence pattern
[ii, jj] = find(t); %// ii indicates an element of B, and jj is row of A where it occurs
rows = accumarray(ii, jj, [], #(x) {x}); %// group results according to ii
How this works:
Assuming A and B as in your example, t is the 2x3 matrix
t =
1 1 0
0 0 1
The m-th row of t contains 1 at column n if the m-th element of B occurs at the n-th row of B. These values are converted into row and column form with find:
ii =
1
1
2
jj =
1
2
3
This means the first element of B ocurrs at rows 1 and 2 of A; and the second occurs at row 3 of B.
Lastly, the values of jj are grouped (with accumarray) according to their corresponding value of ii to generate the desired result.
One approach with bsxfun & accumarray -
%// Create a match of B's in A's with each column of matches representing the
%// rows in A where there is at least one match for each element in B
matches = squeeze(any(bsxfun(#eq,A,permute(B(:),[3 2 1])),2))
%// Get the indices values and the corresponding IDs of B
[indices,B_id] = find(matches)
%// Or directly for performance:
%// [indices,B_id] = find(any(bsxfun(#eq,A,permute(B(:),[3 2 1])),2))
%// Accumulate the indices values using B_id as subscripts
out = accumarray(B_id(:),indices(:),[],#(x) {x})
Sample run -
>> A
A =
1 2
1 3
4 5
>> B
B =
1 5
>> celldisp(out) %// To display the output, out
out{1} =
1
2
out{2} =
3
With arrayfun,ismember and find
[r,c] = arrayfun(#(x) find(ismember(A,x)) , B, 'uni',0);
Where r gives your desired results, you could also use the c variable to get the column of each number in B
Results for the sample input:
>> celldisp(r)
r{1} =
1
2
r{2} =
3
>> celldisp(c)
c{1} =
1
1
c{2} =
2

Generating all combinations with repetition using MATLAB

How do I create all k-combinations with repetitions of a given set (also called k-multicombinations or multisubsets) using MATLAB?
This is similar to the cartesian product, but two rows that only differ by their sorting should be considered the same (e.g. the vectors [1,1,2]=~=[1,2,1] are considered to be the same), so generating the cartesian product and then applying unique(sort(cartesianProduct,2),'rows') should yield the same results.
Example:
The call nmultichoosek(1:n,k) should generate the following matrix:
nmultichoosek(1:3,3)
ans =
1 1 1
1 1 2
1 1 3
1 2 2
1 2 3
1 3 3
2 2 2
2 2 3
2 3 3
3 3 3
We can use the bijection mentioned in the wikipedia article, which maps combinations without repetition of type n+k-1 choose k to k-multicombinations of size n. We generate the combinations without repetition and map them using bsxfun(#minus, nchoosek(1:n+k-1,k), 0:k-1);. This results in the following function:
function combs = nmultichoosek(values, k)
%// Return number of multisubsets or actual multisubsets.
if numel(values)==1
n = values;
combs = nchoosek(n+k-1,k);
else
n = numel(values);
combs = bsxfun(#minus, nchoosek(1:n+k-1,k), 0:k-1);
combs = reshape(values(combs),[],k);
end
Thanks to Hans Hirse for a correction.
Brute-force approach: generate all tuples and then keep only those that are sorted. Not suitable for large values of n or k.
values = 1:3; %// data
k = 3; %// data
n = numel(values); %// number of values
combs = values(dec2base(0:n^k-1,n)-'0'+1); %// generate all tuples
combs = combs(all(diff(combs.')>=0, 1),:); %'// keep only those that are sorted
This is probably an even more brutal (memory intensive) method than previous posts, but a tidy readable one-liner:
combs = unique(sort(nchoosek(repmat(values,1,k),k),2),'rows');