Find row-wise minima in sparse matrix - matlab

I would like to get the minimum nonzero values per row in a sparse matrix. Solutions I found for dense matrices suggested masking out the zero values by setting them to NaN or Inf. However, this obviously doesn't work for sparse matrices.
Ideally, I should get a column vector of all the row-wise minima, as I would get with
minValues = min( A, [], 2);
Except, obviously, using min leaves me with an all-zeros column vector due to the sparsity. Is there a solution using find?

This is perfect for accumarray. Consider the following sparse matrix,
vals = [3 1 1 9 7 4 10 1]; % got this from randi(10,1,8)
S = sparse([1 3 4 4 5 5 7 9],[2 2 3 6 7 8 8 11],vals);
Get the minimum value for each row, assuming 0 for empty elements:
[ii,jj] = find(S);
rowMinVals = accumarray(ii,nonzeros(S),[],@min)
Note that rows 4 and 5 of rowMinVals, which correspond to the only two rows of S with multiple nonzero values, are equal to the min of those rows:
rowMinVals =
3
0
1
1 % min([1 9])
4 % min([7 4])
0
10
0
1
If the last row(s) of your sparse matrix do not contain any nonzeros, but you want the output to have one entry per row (numRows entries, for example), change the accumarray command as follows:
rowMinVals = accumarray(ii,nonzeros(S),[numRows 1],@min)
You may also want to avoid the default 0 in the output for empty rows. One way to handle that is to set the fillval input argument to NaN:
rowMinVals = accumarray(ii,nonzeros(S),[numRows 1],@min,NaN)
rowMinVals =
3
NaN
1
1
4
NaN
10
NaN
1
NaN
NaN
NaN
Or you can keep the result sparse by setting the sixth input argument, issparse, to true:
>> rowMinVals = accumarray(ii,nonzeros(S),[],@min,[],true)
rowMinVals =
(1,1) 3
(3,1) 1
(4,1) 1
(5,1) 4
(7,1) 10
(9,1) 1
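The pieces above can be combined into one short sketch. It assumes the three-output form of find (row, column, value) in place of nonzeros, and takes the number of output rows from size(S,1); both are choices made here for illustration, not part of the answer above.

```matlab
% Example sparse matrix from the answer above (9-by-11, 8 nonzeros)
vals = [3 1 1 9 7 4 10 1];
S = sparse([1 3 4 4 5 5 7 9], [2 2 3 6 7 8 8 11], vals);

% Three-output find returns the row index and the value of each nonzero
[ii, ~, vv] = find(S);

% One output entry per row of S; rows without nonzeros are filled with NaN
rowMinVals = accumarray(ii, vv, [size(S,1) 1], @min, NaN);
```

Using size(S,1) for the output size keeps trailing empty rows in the result without having to track numRows separately.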

Related

Creating cumulative matrix which accounts for column start points

I have a simple example matrix as follows: (The actual matrix I'm working on is 674x11 and is not simply all '1' elements).
a =
1 1 1 NaN NaN
1 1 1 NaN NaN
1 1 1 1 NaN
1 1 1 1 1
1 1 1 1 1
I want to create a cumulative matrix which accounts for the fact that numeric elements start in each column at different rows. I want to achieve this by replacing the NaN value above the first numeric element in each column with the mean of that row.
So instead of:
cumsum(a)=
1 1 1 NaN NaN
2 2 2 NaN NaN
3 3 3 1 NaN
4 4 4 2 1
5 5 5 3 2
what I want to achieve is:
cumsum(a) =
1 1 1 NaN NaN
2 2 2 2 NaN
3 3 3 3 3
4 4 4 4 4
5 5 5 5 5
where element (2,4) is the mean of a(2,1:3) and element (3,5) is the mean of a(3,1:4).
You can compute the mean of each row (ignoring the NaN values) by using nanmean, which is part of the Statistics Toolbox. We can then use find to identify the row of each NaN and replace it with the mean of that row. Then we can follow that up with the cumsum operation:
% Get the rows of each NaN value
bool = isnan(a);
[row,col] = find(bool);
% Compute the mean value of each row
rowmeans = nanmean(a, 2);
% Replace the NaN values with their row means
a(bool) = rowmeans(row);
% Perform the cumulative sum
result = cumsum(a);
If you want to leave the initial NaN values as NaN values afterwards, then you can follow it up with
result(bool) = NaN;
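If nanmean is not available, the same steps can be sketched with the 'omitnan' flag of mean, assuming a MATLAB release that supports that flag. The variable name filled is introduced here only to keep the original a untouched.

```matlab
% Example matrix from the question
a = [1 1 1 NaN NaN;
     1 1 1 NaN NaN;
     1 1 1 1   NaN;
     1 1 1 1   1;
     1 1 1 1   1];

% Locate the NaN entries and the row each one sits in
bool = isnan(a);
[row, ~] = find(bool);

% Row means that skip NaN values (assumes mean supports 'omitnan')
rowmeans = mean(a, 2, 'omitnan');

% Fill each NaN with the mean of its row, then accumulate down the columns
filled = a;
filled(bool) = rowmeans(row);
result = cumsum(filled);
```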

Find two MAXIMUM values' position in 3D matrix (MATLAB)

I have been having problem with identifying two maximum values' position in 3D matrix (MATLAB). Say I have matrix A output as follows:
A(:,:,1) =
5 3 5
0 1 0
A(:,:,2) =
0 2 0
8 0 8
A(:,:,3) =
3 0 0
0 7 7
A(:,:,4) =
6 6 0
4 0 0
For the first slice, A(:,:,1), I want to identify that the first row has the highest value (5). But I need the two index positions where it occurs, which in this case are 1 and 3. The same goes for the other slices of A.
I have searched through SO, but since I am not good at MATLAB, I couldn't find a way to work this out.
Please help me with this. It would be better if I don't need a for loop to get the desired output.
Shot #1 Finding the indices for maximum values across each 3D slice -
%// Reshape A into a 2D matrix
A_2d = reshape(A,[],size(A,3))
%// Find linear indices of maximum numbers for each 3D slice
idx = find(reshape(bsxfun(@eq,A_2d,max(A_2d,[],1)),size(A)))
%// Convert those linear indices to dim1, dim2,dim3 indices and
%// present the final output as a Nx3 array
[dim1_idx,dim2_idx,dim3_idx] = ind2sub(size(A),idx)
out_idx_triplet = [dim1_idx dim2_idx dim3_idx]
Sample run -
>> A
A(:,:,1) =
5 3 5
0 1 0
A(:,:,2) =
0 2 0
8 0 8
A(:,:,3) =
3 0 0
0 7 7
A(:,:,4) =
6 6 0
4 0 0
out_idx_triplet =
1 1 1
1 3 1
2 1 2
2 3 2
2 2 3
2 3 3
1 1 4
1 2 4
out_idx_triplet(:,2) is what you are looking for!
Shot #2 Finding the indices for highest two numbers across each 3D slice -
%// Get size of A
[m,n,r] = size(A)
%// Reshape A into a 2D matrix
A_2d = reshape(A,[],r)
%// Find linear indices of highest two numbers for each 3D slice
[~,sorted_idx] = sort(A_2d,1,'descend')
idx = bsxfun(@plus,sorted_idx(1:2,:),[0:r-1]*m*n)
%// Convert those linear indices to dim1, dim2,dim3 indices
[dim1_idx,dim2_idx,dim3_idx] = ind2sub(size(A),idx(:))
%// Present the final output as a Nx3 array
out_idx_triplet = [dim1_idx dim2_idx dim3_idx]
out_idx_triplet(:,2) is what you are looking for!
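As a quick check, Shot #2 can be run on the matrix from the question. The sketch below rebuilds A with cat (an assumption about how the data is constructed) and relies on sort keeping tied values in their original order.

```matlab
% Rebuild the 2-by-3-by-4 example from the question
A = cat(3, [5 3 5; 0 1 0], [0 2 0; 8 0 8], [3 0 0; 0 7 7], [6 6 0; 4 0 0]);
[m, n, r] = size(A);

% One column per 3D slice
A_2d = reshape(A, [], r);

% Positions of the two largest values within each slice
[~, sorted_idx] = sort(A_2d, 1, 'descend');

% Offset the per-slice positions into linear indices of the full array
idx = bsxfun(@plus, sorted_idx(1:2,:), (0:r-1)*m*n);

% Convert to (row, column, slice) triplets
[dim1_idx, dim2_idx, dim3_idx] = ind2sub(size(A), idx(:));
out_idx_triplet = [dim1_idx dim2_idx dim3_idx];
```

For this input the triplets match the sample run of Shot #1, because each slice contains its maximum exactly twice.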
The following code gives you the column and row of the respective maximum.
The first step will obtain the maximum of each sub-matrix containing the first and second dimension. Since max works per default with the first dimension, the matrix is reshaped to combine the original first and second dimension.
max_vals = max(reshape(A,size(A,1)*size(A,2),size(A,3)));
max_vals =
5 8 7 6
In the second step, the indices of the elements equal to the respective max_vals of each sub-matrix are obtained using arrayfun over the third dimension. Since arrayfun returns a cell array here, cell2mat is used to transform the output into a matrix. As a last step, the linear indices from find are transformed into subscripts by ind2sub.
[i,j] = ind2sub(size(A(:,:,1)),cell2mat(arrayfun(@(i)find(A(:,:,i)==max_vals(i)),1:size(A,3),'UniformOutput',false)))
i =
1 2 2 1
1 2 2 1
j =
1 1 2 1
3 3 3 2
Hence, the values in j are the ones you want to have.

Repmat function in matlab

I have been through a bunch of questions about the repmat function in MATLAB, but I can't figure out how this process works.
I am trying to translate it into R, but my problem is that I do not know how the function manipulates the data.
The code is part of a process to make a pairs trading strategy, where the code takes in a vector of FALSE/TRUE expressions.
The code is:
% initialize positions array
positions=NaN(length(tday), 2);
% long entries
positions(shorts, :)=repmat([-1 1], [length(find(shorts)) 1]);
where shorts is the vector of TRUE/FALSE expressions.
Hope you can help.
repmat repeats the matrix you give it [dim1 dim2 dim3 ...] times. What your code does is:
1. length(find(shorts)) gets the number of true entries in shorts.
e.g.:
shorts = logical([1 0 0 0 1 0])
length(find(shorts))
ans = 2
2. repmat([-1 1], [length(find(shorts)) 1]) repeats [-1 1] length(find(shorts)) times along the rows and once along the columns.
continuation of e.g.:
repmat([-1 1], [length(find(shorts)) 1]);
ans=[-1 1
-1 1];
3. positions(shorts, :) = ... stores the given matrix at the given indices. (NOTE: this only works if shorts is of type logical.)
continuation of e.g.:
At this point, if you haven't omitted anything, positions should be a 6x2 NaN matrix. The indexing fills the rows where shorts is true with the [-1 1] matrix, so after this, positions will be:
positions=[-1 1
NaN NaN
NaN NaN
NaN NaN
-1 1
NaN NaN]
Hope it helps
The MATLAB repmat function replicates and tiles the array. The syntax is
B = repmat(A,n)
where A is the input array and n specifies how to tile the array. If n is a vector [n1,n2] - as in your case - then A is replicated n1 times in rows and n2 times in columns. E.g.
A = [ 1 2 ; 3 4]
B = repmat(A,[2,3])
B =
1 2 | 1 2 | 1 2
3 4 | 3 4 | 3 4
----+-----+----
1 2 | 1 2 | 1 2
3 4 | 3 4 | 3 4
(the lines are only to illustrate how A gets tiled)
In your case, repmat replicates the vector [-1, 1] once for each nonzero element of shorts. You thus set each row of positions whose corresponding entry in shorts is true to [-1,1]. All other rows stay NaN.
For example, if
shorts = logical([1; 0; 1; 1; 0]);
then your code will create
positions =
-1 1
NaN NaN
-1 1
-1 1
NaN NaN
I hope this helps you to clarify the effect of repmat. If not, feel free to ask.
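The example above can be run end to end as a short sketch. It uses nnz to count the true entries, which is a substitution made here for length(find(shorts)), and it assumes shorts is a logical column vector.

```matlab
% Logical mask marking the rows that should receive [-1 1]
shorts = logical([1; 0; 1; 1; 0]);

% Start with every row set to NaN
positions = NaN(numel(shorts), 2);

% Stack one copy of [-1 1] per true entry and assign the copies to those rows
positions(shorts, :) = repmat([-1 1], nnz(shorts), 1);
```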

Getting row and column numbers of valid elements in a matrix

I have a 3x3 matrix, populated with NaN and values of a variable:
NaN 7 NaN
5 NaN 0
NaN NaN 4
matrix = [NaN 7 NaN; 5 NaN 0; NaN NaN 4]
I would like to get the row and column numbers of non-NaN cells and put them in a matrix together with the value of the variable. That is, I would like to obtain the following matrix:
row col value
1 2 7
2 1 5
2 3 0
3 3 4
want = [1 2 7; 2 1 5; 2 3 0; 3 3 4]
Any help would be highly appreciated.
This can be done without loops:
[jj, ii, kk] = find((~isnan(matrix).*(reshape(1:numel(matrix), size(matrix)))).');
result = [ii jj matrix(kk)];
The trick is to multiply ~isnan(matrix) by a matrix of indices so that the third output of find gives the linear index of non-NaN entries. The transpose is needed to have the same order as in the question.
The following should work!
[p,q] = find(~isnan(matrix)); % row and column subscripts of the non-NaN entries
want = zeros(numel(p),3); % three columns, one row per non-NaN entry
for i=1:numel(p)
want(i,:) = [p(i) q(i) matrix(p(i), q(i))];
end
Should give you the correct result which is:
2 1 5
1 2 7
2 3 0
3 3 4
If you don't mind the ordering of the rows, you can use a simplified version of Luis Mendo's answer:
[row, col] = find(~isnan(matrix));
result = [row(:), col(:), matrix(~isnan(matrix))];
Which will result in:
2 1 5
1 2 7
2 3 0
3 3 4
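If the row ordering from the question is needed, one option is to sort the result afterwards. The sketch below uses sortrows for that, which is an addition made here and not part of the answers above.

```matlab
matrix = [NaN 7 NaN; 5 NaN 0; NaN NaN 4];

% Mask of valid entries; find and logical indexing both walk it column by column
mask = ~isnan(matrix);
[row, col] = find(mask);

% Sort by row number first, then by column number
result = sortrows([row, col, matrix(mask)]);
```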

How can I use values within a MATLAB matrix as indices to determine the location of data in a new matrix?

I have a matrix that looks like the following.
I want to take the column 3 values and put them in another matrix, according to the following rule.
The value in Column 5 is the row index for the new matrix, and the value in Column 6 is the column index. Therefore 20 (taken from 29,3) should be in Row 1, Column 57 of the new matrix, 30 (from 30,3) should be in Row 1, Column 4 of the new matrix, and so on.
If the value in column 3 is NaN then I want NaN to be copied over to the new matrix.
Example:
% matrix of values and row/column subscripts
A = [
20 1 57
30 1 4
25 1 16
nan 1 26
nan 1 28
25 1 36
nan 1 53
50 1 56
nan 2 1
nan 2 2
nan 2 3
80 2 5
];
% fill new matrix
B = zeros(5,60);
idx = sub2ind(size(B), A(:,2), A(:,3));
B(idx) = A(:,1);
There are a couple other ways to do this, but I think the above code is easy to understand. It is using linear indexing.
Assuming you don't have duplicate subscripts, you could also use:
B = full(sparse(A(:,2), A(:,3), A(:,1), m, n));
(where m and n are the output matrix size)
Another one:
B = accumarray(A(:,[2 3]), A(:,1), [m,n]);
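A small self-contained run of the linear-indexing and sparse variants is sketched below. The values m = 5 and n = 60 and the shortened A are assumptions chosen to match the example above, and the subscripts are assumed to be unique.

```matlab
% Columns of A: value, row subscript, column subscript
A = [ 20 1 57;
      30 1  4;
     NaN 1 26;
      80 2  5];
m = 5;  n = 60;   % assumed size of the output matrix

% Variant 1: convert the subscripts to linear indices and assign
B1 = zeros(m, n);
B1(sub2ind([m n], A(:,2), A(:,3))) = A(:,1);

% Variant 2: build a sparse matrix from the triplets and expand it
B2 = full(sparse(A(:,2), A(:,3), A(:,1), m, n));
```

Both variants leave unassigned positions at 0 and copy NaN values over unchanged.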
I am not sure if I understood your question clearly but this might help:
(Assuming your main matrix is A)
nRows = max(A(:,5));
nColumns = max(A(:,6));
FinalMatrix = zeros(nRows,nColumns);
for i=1:size(A,1)
FinalMatrix(A(i,5),A(i,6))=A(i,3);
end
Note that the above code sets the rest of the elements to zero.