Calculate mean of all values below the diagnonal line in an adjacency matrix - matlab

I have an adjacency matrix M, something like this:
[1 2 0 2 4
2 1 2 0 -1
0 3 1 2 3
2 0 2 1 0
4 -1 3 0 1]
I want to calculate the mean of all values below (but not including) the diagonal. The final output should be 1.5.
To get those values, I thought I'd use N = tril(M,-1). The issue is that I now have zeros in upper and lower part of the matrix N and therefore mean(sum(N)./sum(N~=0)) wouldn't work. Since I also have negative values, I can't just do the mean of values >=0 either. How can I do this?

In one line using logical indexing to extract just the values below the diagonal:
M = [ 1 2 0 2 4;
2 1 2 0 -1;
0 3 1 2 3;
2 0 2 1 0;
4 -1 3 0 1];
mean(M(tril(true(size(M)),-1)))
This returns 1.5 as #excaza indicated.

Related

Generate all possible column vectors in matlab

I am essentially trying to figure out how to generate code for basis vectors of different configurations of M objects into N different states (for example, if I had 2 snacks between 2 kids, I could have (2,0) (0,2) or (1,1), terrible example, but thats the idea)
I am struggling to figure out how to do this without going into many different loops (I want this to be automatic). The idea would be to create a Matrix where each row is a vector of length M. I would start with vec(1) = N then an if loop where if sum(vec) == N, Matrix(1,:)=vec; Then I could take vec(1)=N-i and do the same.
My only issue is I do not see how to use the if and forget it so that if I had maybe 2 objects in 5 locations, how would I do this to get (1 0 0 0 1).
I am not seeing how to do this.
You could use a recursive function:
function out = combos(M,N)
if N == 1
out = M;
else
out = [];
for i = 0:M
subout = combos(M-i,N-1);
subout(:,end+1) = i;
out = [out;subout];
end
end
I think this does what you want.
The key idea is to generate not the number of elements in each group, but the split points between groups. This can be done via combinations with repetition. Matlab's nchoosek generates combinations without repetition, but these are easily converted into what we need.
M = 5; % number of objects
N = 3; % number of groups
t = nchoosek(1:M+N-1, N-1); % combinations without repetition...
t = bsxfun(#minus, t, 1:N-1); % ...convert into combinations with repetition
t = diff([zeros(size(t,1), 1) t repmat(M, size(t,1), 1) ], [], 2); % the size of each
% group is the distance between split points
In this example, the result is
t =
0 0 5
0 1 4
0 2 3
0 3 2
0 4 1
0 5 0
1 0 4
1 1 3
1 2 2
1 3 1
1 4 0
2 0 3
2 1 2
2 2 1
2 3 0
3 0 2
3 1 1
3 2 0
4 0 1
4 1 0
5 0 0
This is a similar approach to Luis' without bsxfun. Because we don't like fun.
n = 5;
k = 3;
c = nchoosek(n+k-1, k-1);
result = diff([zeros(c, 1) nchoosek(1:(n+k-1), k-1) ones(c, 1)*(n+k)], [], 2) - 1;
This creates the partitions of the integer n with length k. Given an array of length n + (k-1), we find all combinations of (k-1) places to place partitions between the (unary) integers. For 5 items and 3 locations, we have 7 choices of where to put the partitions:
[ 0 0 0 0 0 0 0 ]
If our chosen combination is [2 4], we replace positions 2 and 4 with partitions to look like this:
[ 0 | 0 | 0 0 0 ]
The O's give the value in unary, so this combination is 1 1 3. To recover the values easily, we just augment the combinations with imaginary partitions at the next values to the left and right of the array (0 and n+k) and take the difference and subtract 1 (because the partitions themselves don't contribute to the value):
diff([0 2 4 8]) - 1
ans =
1 1 3
By sliding the partitions in to each possible combination of positions, we get all of the partitions of n.
Output:
result =
0 0 5
0 1 4
0 2 3
0 3 2
0 4 1
0 5 0
1 0 4
1 1 3
1 2 2
1 3 1
1 4 0
2 0 3
2 1 2
2 2 1
2 3 0
3 0 2
3 1 1
3 2 0
4 0 1
4 1 0
5 0 0

Is there a fast way to count occurrences of items in a matrix and save them in another matrix without using loops?

I have a time-series matrix X whose first column contains user ID and second column contains the item ID they used at different times:
X=[1 4
2 1
4 2
2 3
3 4
1 1
4 2
5 3
2 1
4 2
5 4];
I want to find out which user used which item how many times, and save it in a matrix Y. The rows of Y represent users in ascending order of ID, and the columns represent items in ascending order of ID:
Y=[1 0 0 1
2 0 1 0
0 0 0 1
0 3 0 0
0 0 1 1]
The code I use to find matrix Y uses 2 for loops which is unwieldy for my large data:
no_of_users = size(unique(X(:,1)),1);
no_of_items = size(unique(X(:,2)),1);
users=unique(X(:,1));
Y=zeros(no_of_users,no_of_items);
for a=1:size(A,1)
for b=1:no_of_users
if X(a,1)==users(b,1)
Y(b,X(a,2)) = Y(b,X(a,2)) + 1;
end
end
end
Is there a more time efficient way to do it?
sparse creates a sparse matrix from row/column indices, conveniently accumulating the number of occurrences if you give a scalar value of 1. Just convert to a full matrix.
Y = full(sparse(X(:,1), X(:,2), 1))
Y =
1 0 0 1
2 0 1 0
0 0 0 1
0 3 0 0
0 0 1 1
But it's probably quicker to just use accumarray as suggested in the comments:
>> Y2 = accumarray(X, 1)
Y2 =
1 0 0 1
2 0 1 0
0 0 0 1
0 3 0 0
0 0 1 1
(In Octave, sparse seems to take about 50% longer than accumarray.)

MATLAB generate all ways that n items can be put into m bins?

I want to find all ways that n items can be split among m bins. For example, for n=3 and m=3 the output would be (the order doesn't matter):
[3 0 0
0 3 0
0 0 3
2 1 0
1 2 0
0 1 2
0 2 1
1 0 2
2 0 1
1 1 1]
The algorithm should be as efficient as possible, preferrably vectorized/using inbuilt functions rather than for loops. Thank you!
This should be pretty efficient.
It works by generating all posible splitings of the real interval [0, n] at m−1 integer-valued, possibly coincident split points. The lengths of the resulting subintervals give the solution.
For example, for n=4 and m=3, some of the possible ways to split the interval [0, 4] at m−1 points are:
Split at 0, 0: this gives subintervals of lenghts 0, 0, 4.
Split at 0, 1: this gives subintervals of lenghts 0, 1, 3.
...
Split at 4, 4: this gives subintervals of lenghts 4, 0, 0.
Code:
n = 4; % number of items
m = 3; % number of bins
x = bsxfun(#minus, nchoosek(0:n+m-2,m-1), 0:m-2); % split points
x = [zeros(size(x,1),1) x n*ones(size(x,1),1)]; % add start and end of interval [0, n]
result = diff(x.').'; % compute subinterval lengths
The result is in lexicographical order.
As an example, for n = 4 items in m = 3 bins the output is
result =
0 0 4
0 1 3
0 2 2
0 3 1
0 4 0
1 0 3
1 1 2
1 2 1
1 3 0
2 0 2
2 1 1
2 2 0
3 0 1
3 1 0
4 0 0
I'd like to suggest a solution based on an external function and accumarray (it should work starting R2015a because of repelem):
n = uint8(4); % number of items
m = uint8(3); % number of bins
whichBin = VChooseKR(1:m,n).'; % see FEX link below. Transpose saves us a `reshape()` later.
result = accumarray([repelem(1:size(whichBin,2),n).' whichBin(:)],1);
Where VChooseKR(V,K) creates a matrix whose rows are all combinations created by choosing K elements of the vector V with repetitions.
Explanation:
The output of VChooseKR(1:m,n) for m=3 and n=4 is:
1 1 1 1
1 1 1 2
1 1 1 3
1 1 2 2
1 1 2 3
1 1 3 3
1 2 2 2
1 2 2 3
1 2 3 3
1 3 3 3
2 2 2 2
2 2 2 3
2 2 3 3
2 3 3 3
3 3 3 3
All we need to do now is "histcount" the numbers on each row using positive integer bins to get the desired result. The first output row would be [4 0 0] because all 4 elements go in the 1st bin. The second row would be [3 1 0] because 3 elements go in the 1st bin and 1 in the 2nd, etc.

Transform a matrix to a stacked vector where all zeroes after the last non-zero value per row are removed

I have a matrix with some zero values I want to erase.
a=[ 1 2 3 0 0; 1 0 1 3 2; 0 1 2 5 0]
>>a =
1 2 3 0 0
1 0 1 3 2
0 1 2 5 0
However, I want to erase only the ones after the last non-zero value of each line.
This means that I want to retain 1 2 3 from the first line, 1 0 1 3 2 from the second and 0 1 2 5 from the third.
I want to then store the remaining values in a vector. In the case of the example this would result in the vector
b=[1 2 3 1 0 1 3 2 0 1 2 5]
The only way I figured out involves a for loop that I would like to avoid:
b=[];
for ii=1:size(a,1)
l=max(find(a(ii,:)));
b=[b a(ii,1:l)];
end
Is there a way to vectorize this code?
There are many possible ways to do this, here is my approach:
arotate = a' %//rotate the matrix a by 90 degrees
b=flipud(arotate) %//flips the matrix up and down
c= flipud(cumsum(b,1)) %//cumulative sum the matrix rows -and then flip it back.
arotate(c==0)=[]
arotate =
1 2 3 1 0 1 3 2 0 1 2 5
=========================EDIT=====================
just realized cumsum can have direction parameter so this should do:
arotate = a'
b = cumsum(arotate,1,'reverse')
arotate(b==0)=[]
This direction parameter was not available on my 2010b version, but should be there for you if you are using 2013a or above.
Here's an approach using bsxfun's masking capability -
M = size(a,2); %// Save size parameter
at = a.'; %// Transpose input array, to be used for masked extraction
%// Index IDs of last non-zero for each row when looking from right side
[~,idx] = max(fliplr(a~=0),[],2);
%// Create a mask of elements that are to be picked up in a
%// transposed version of the input array using BSXFUN's broadcasting
out = at(bsxfun(#le,(1:M)',M+1-idx'))
Sample run (to showcase mask usage) -
>> a
a =
1 2 3 0 0
1 0 1 3 2
0 1 2 5 0
>> M = size(a,2);
>> at = a.';
>> [~,idx] = max(fliplr(a~=0),[],2);
>> bsxfun(#le,(1:M)',M+1-idx') %// mask to be used on transposed version
ans =
1 1 1
1 1 1
1 1 1
0 1 1
0 1 0
>> at(bsxfun(#le,(1:M)',M+1-idx')).'
ans =
1 2 3 1 0 1 3 2 0 1 2 5

Matlab: Search rows in matrix with fixed first and last element with vectorization

I have a matrix like the following (arbitrary cols/rows):
1 0 0 0 0
1 2 0 0 0
1 2 3 0 0
1 2 3 4 0
1 2 3 4 5
1 2 5 0 0
1 2 5 3 0
1 2 5 3 4
1 4 0 0 0
1 4 2 0 0
1 4 2 3 0
1 4 2 5 0
1 4 2 5 3
1 4 5 0 0
1 4 5 3 0
2 0 0 0 0
2 3 0 0 0
2 3 4 0 0
2 3 4 5 0
2 5 0 0 0
2 5 3 0 0
2 5 3 4 0
3 0 0 0 0
3 4 0 0 0
3 4 2 0 0
3 4 2 5 0
3 4 5 0 0
and now I want to get all rows where the first element is a certain value X and the last element (that is the last element != 0) is a certain value Y, OR turned around: the first is Y and the last is X.
Can't see any speedful code which does NOT use a for-loop :(
Thanks!
EDIT: To filter all rows with a certain first element is really easy, you don't need to help me here. So let's assume I only want to do the following: Filter all rows where the last element (i.e. the last element != 0 in each row) is either X or Y.
EDIT
Thanks a lot for your posts. I benchmarked the three possible solutions with a matrix of 473408*10 elements. Here's the benchmarkscript:
http://pastebin.com/9hEAWw9a
The results were:
t1 = 2.9425 Jonas
t2 = 0.0999 Brendan
t3 = 0.0951 Oli
So thanks a lot you guys, I'm sticking with Oli's solution and thus accept it. Thanks though for all the other solutions!
All you need to do is to find the linear indices of the last non-zero element of every row. The rest is easy:
[nRows,nCols] = size(A);
[u,v] = find(A); %# find all non-zero elements in A
%# for each row, find the highest column index with accumarray
%# and convert to linear index with sub2ind
lastIdx = sub2ind([nRows,nCols],(1:nRows)',accumarray(u,v,[nRows,1],#max,NaN));
To filter rows, you can then write
goodRows = A(:,1) == X & A(lastIdx) == Y
Here is a trick: Look for numbers with a 0 on the right, and sum them all:
H=[1 2 0 0 0;
2 3 1 0 0;
4 5 8 0 0;
8 5 4 2 2];
lastNumber=sum(H.*[H(:,2:end)==0 true(size(H,1),1)],2)
ans =
2
1
8
2
The rest is easy:
firstNumber=H(:,1);
find( (firstNumber==f) & (lastNumber==l) )
This works only if the numbers in each row are x number of non-zeros followed by a series of zeros. i.e. it will not work if the following is possible 1 0 3 4 0 0, I assume that isn't possible based on the sample input you gave ...
% 'a' is your array
[nx, ny] = size(a);
inds = [0:ny:ny*(nx-1)]' + sum(a ~= 0, 2);
% Needs to transpose so that the indexing reads left-to-right
aT = a';
valid1 = aT(inds) == Y;
valid2 = a(:,1) == X;
valid = valid1 & valid2;
valid_rows = a(valid,:);
This is messy I know ...