Matlab: command for counting occurrences in ascending order not cumulatively? - matlab

This must be asked before but I cannot find it now. It calculates the amount of zeros, add the count of zeros to vector, then calculate the amount of ones, append the count of ones to the vector and so on. If zero count, make it as zero.
Is there some zero command to do this counting in Matlab?
Input ---> Output
0 1 1 1 2 3 3 4 7 ---> [1,3,1,2,1,0,0,1]
0 1 1 1 ---> 1 3
2 7 ----> 0 0 1 0 0 0 0 1

To get the total count of occurrences of each number, use histc:
x = [0 1 1 1 2 3 3 4 7]; %// example data
histc(x, 0:max(x))

Related

Is there a fast way to count occurrences of items in a matrix and save them in another matrix without using loops?

I have a time-series matrix X whose first column contains user ID and second column contains the item ID they used at different times:
X=[1 4
2 1
4 2
2 3
3 4
1 1
4 2
5 3
2 1
4 2
5 4];
I want to find out which user used which item how many times, and save it in a matrix Y. The rows of Y represent users in ascending order of ID, and the columns represent items in ascending order of ID:
Y=[1 0 0 1
2 0 1 0
0 0 0 1
0 3 0 0
0 0 1 1]
The code I use to find matrix Y uses 2 for loops which is unwieldy for my large data:
no_of_users = size(unique(X(:,1)),1);
no_of_items = size(unique(X(:,2)),1);
users=unique(X(:,1));
Y=zeros(no_of_users,no_of_items);
for a=1:size(A,1)
for b=1:no_of_users
if X(a,1)==users(b,1)
Y(b,X(a,2)) = Y(b,X(a,2)) + 1;
end
end
end
Is there a more time efficient way to do it?
sparse creates a sparse matrix from row/column indices, conveniently accumulating the number of occurrences if you give a scalar value of 1. Just convert to a full matrix.
Y = full(sparse(X(:,1), X(:,2), 1))
Y =
1 0 0 1
2 0 1 0
0 0 0 1
0 3 0 0
0 0 1 1
But it's probably quicker to just use accumarray as suggested in the comments:
>> Y2 = accumarray(X, 1)
Y2 =
1 0 0 1
2 0 1 0
0 0 0 1
0 3 0 0
0 0 1 1
(In Octave, sparse seems to take about 50% longer than accumarray.)

Calculate mean of all values below the diagnonal line in an adjacency matrix

I have an adjacency matrix M, something like this:
[1 2 0 2 4
2 1 2 0 -1
0 3 1 2 3
2 0 2 1 0
4 -1 3 0 1]
I want to calculate the mean of all values below (but not including) the diagonal. The final output should be 1.5.
To get those values, I thought I'd use N = tril(M,-1). The issue is that I now have zeros in upper and lower part of the matrix N and therefore mean(sum(N)./sum(N~=0)) wouldn't work. Since I also have negative values, I can't just do the mean of values >=0 either. How can I do this?
In one line using logical indexing to extract just the values below the diagonal:
M = [ 1 2 0 2 4;
2 1 2 0 -1;
0 3 1 2 3;
2 0 2 1 0;
4 -1 3 0 1];
mean(M(tril(true(size(M)),-1)))
This returns 1.5 as #excaza indicated.

MATLAB generate all ways that n items can be put into m bins?

I want to find all ways that n items can be split among m bins. For example, for n=3 and m=3 the output would be (the order doesn't matter):
[3 0 0
0 3 0
0 0 3
2 1 0
1 2 0
0 1 2
0 2 1
1 0 2
2 0 1
1 1 1]
The algorithm should be as efficient as possible, preferrably vectorized/using inbuilt functions rather than for loops. Thank you!
This should be pretty efficient.
It works by generating all posible splitings of the real interval [0, n] at m−1 integer-valued, possibly coincident split points. The lengths of the resulting subintervals give the solution.
For example, for n=4 and m=3, some of the possible ways to split the interval [0, 4] at m−1 points are:
Split at 0, 0: this gives subintervals of lenghts 0, 0, 4.
Split at 0, 1: this gives subintervals of lenghts 0, 1, 3.
...
Split at 4, 4: this gives subintervals of lenghts 4, 0, 0.
Code:
n = 4; % number of items
m = 3; % number of bins
x = bsxfun(#minus, nchoosek(0:n+m-2,m-1), 0:m-2); % split points
x = [zeros(size(x,1),1) x n*ones(size(x,1),1)]; % add start and end of interval [0, n]
result = diff(x.').'; % compute subinterval lengths
The result is in lexicographical order.
As an example, for n = 4 items in m = 3 bins the output is
result =
0 0 4
0 1 3
0 2 2
0 3 1
0 4 0
1 0 3
1 1 2
1 2 1
1 3 0
2 0 2
2 1 1
2 2 0
3 0 1
3 1 0
4 0 0
I'd like to suggest a solution based on an external function and accumarray (it should work starting R2015a because of repelem):
n = uint8(4); % number of items
m = uint8(3); % number of bins
whichBin = VChooseKR(1:m,n).'; % see FEX link below. Transpose saves us a `reshape()` later.
result = accumarray([repelem(1:size(whichBin,2),n).' whichBin(:)],1);
Where VChooseKR(V,K) creates a matrix whose rows are all combinations created by choosing K elements of the vector V with repetitions.
Explanation:
The output of VChooseKR(1:m,n) for m=3 and n=4 is:
1 1 1 1
1 1 1 2
1 1 1 3
1 1 2 2
1 1 2 3
1 1 3 3
1 2 2 2
1 2 2 3
1 2 3 3
1 3 3 3
2 2 2 2
2 2 2 3
2 2 3 3
2 3 3 3
3 3 3 3
All we need to do now is "histcount" the numbers on each row using positive integer bins to get the desired result. The first output row would be [4 0 0] because all 4 elements go in the 1st bin. The second row would be [3 1 0] because 3 elements go in the 1st bin and 1 in the 2nd, etc.

Transform a matrix to a stacked vector where all zeroes after the last non-zero value per row are removed

I have a matrix with some zero values I want to erase.
a=[ 1 2 3 0 0; 1 0 1 3 2; 0 1 2 5 0]
>>a =
1 2 3 0 0
1 0 1 3 2
0 1 2 5 0
However, I want to erase only the ones after the last non-zero value of each line.
This means that I want to retain 1 2 3 from the first line, 1 0 1 3 2 from the second and 0 1 2 5 from the third.
I want to then store the remaining values in a vector. In the case of the example this would result in the vector
b=[1 2 3 1 0 1 3 2 0 1 2 5]
The only way I figured out involves a for loop that I would like to avoid:
b=[];
for ii=1:size(a,1)
l=max(find(a(ii,:)));
b=[b a(ii,1:l)];
end
Is there a way to vectorize this code?
There are many possible ways to do this, here is my approach:
arotate = a' %//rotate the matrix a by 90 degrees
b=flipud(arotate) %//flips the matrix up and down
c= flipud(cumsum(b,1)) %//cumulative sum the matrix rows -and then flip it back.
arotate(c==0)=[]
arotate =
1 2 3 1 0 1 3 2 0 1 2 5
=========================EDIT=====================
just realized cumsum can have direction parameter so this should do:
arotate = a'
b = cumsum(arotate,1,'reverse')
arotate(b==0)=[]
This direction parameter was not available on my 2010b version, but should be there for you if you are using 2013a or above.
Here's an approach using bsxfun's masking capability -
M = size(a,2); %// Save size parameter
at = a.'; %// Transpose input array, to be used for masked extraction
%// Index IDs of last non-zero for each row when looking from right side
[~,idx] = max(fliplr(a~=0),[],2);
%// Create a mask of elements that are to be picked up in a
%// transposed version of the input array using BSXFUN's broadcasting
out = at(bsxfun(#le,(1:M)',M+1-idx'))
Sample run (to showcase mask usage) -
>> a
a =
1 2 3 0 0
1 0 1 3 2
0 1 2 5 0
>> M = size(a,2);
>> at = a.';
>> [~,idx] = max(fliplr(a~=0),[],2);
>> bsxfun(#le,(1:M)',M+1-idx') %// mask to be used on transposed version
ans =
1 1 1
1 1 1
1 1 1
0 1 1
0 1 0
>> at(bsxfun(#le,(1:M)',M+1-idx')).'
ans =
1 2 3 1 0 1 3 2 0 1 2 5

Count non-zero entries in each column of a matrix

If I have a matrix:
A = [1 2 3 4 5; 1 1 6 1 2; 0 0 9 0 1]
A =
1 2 3 4 5
1 1 6 1 2
0 0 9 0 1
How can I count the number of non-zero entries for each column? For example the desired output for this matrix would be:
2, 2, 3, 2, 3
I am not sure how to do this as size, length or numel do not appear to meet the requirements. Perhaps it would be best to remove zero entries first?
It's simply
> A ~= 0
ans =
1 1 1 1 1
1 1 1 1 1
0 0 1 0 1
> sum(A ~= 0, 1)
ans =
2 2 3 2 3
Here's another solution I can suggest that isn't very speed worthy for dense matrices but quite fast for sparse matrices (thanks #user1877862!). This also would mimic how one might do this in a compiled language, like C or Java, and perhaps for research purposes too. First find the row and column locations that are non zero, then do a histogram on just the column locations to count the frequency of how often you see a non-zero in each column. In other words:
[~,col] = find(A ~= 0);
counts = histc(col, 1:size(A,2));
find outputs the row and column locations of where a matrix satisfies some Boolean condition inside the argument of the function. We ignore the first output as we aren't concerned with the row locations.
The output we get is:
counts =
2
2
3
2
3