Matlab:Efficient assignment of values in a sparse matrix - matlab

I'm working in Matlab and I have the next problem:
I have a B matrix of nx2 elements, which contains indexes for the assignment of a big sparse matrix A (almost 500,000x80,000). For each row of B, the first column is the column index of A that has to contain a 1, and the second column is the column index of A that has to contain -1.
For example:
B= 1 3
2 5
1 5
4 1
5 2
For this B matrix, The Corresponding A matrix has to be like this:
A= 1 0 -1 0 0
0 1 0 0 -1
1 0 0 0 -1
-1 0 0 1 0
0 -1 0 0 1
So, for the row i of B, the corresponding row i of A must be full of zeros except on A(i,B(i,1))=1 and A(i,B(i,2))=-1
This is very easy with a for loop over all the rows of B, but it's extremely slow. I also tried the next formulation:
A(:,B(:,1))=1
A(:,B(:,2))=-1
But matlab gave me an "Out of Memory Error". If anybody knows a more efficient way to achieve this, please let me know.
Thanks in advance!

You can use the sparse function:
m = size(B,1); %// number of rows of A. Or choose larger if needed
n = max(B(:)); %// number of columns of A. Or choose larger if needed
s = size(B,1);
A = sparse(1:s, B(:,1), 1, m, n) + sparse(1:s, B(:,2), -1, m, n);

I think you should be able to do this using the sub2ind function. This function converts matrix subscripts to linear indices. You should be able to do it like so:
pind = sub2ind(size(A),1:n,B(:,1)); % positive indices
nind = sub2ind(size(A),1:n,B(:,2)); % negative indices
A(pind) = 1;
A(nind) = -1;
EDIT: I (wrongly, I think) assumed the sparse matrix A already existed. If it doesn't exist, then this method wouldn't be the best option.

Related

How to get indexes of logical matrix without using find in matlab?

Let's assume my matrix A is the output of comparison function i.e. logical matrix having values 0 and 1's only. For a small matrix of size 3*4, we might have something like:
A =
1 1 0 0
0 0 1 0
0 0 1 1
Now, I am generating another matrix B which is of the same size as A, but its rows are filled with indexes of A and any leftover values in each row are set to zero.
B =
1 2 0 0
3 0 0 0
3 4 0 0
Currently, I am using find function on each row of A to get matrix B. Complete code can be written as:
A=[1,1,0,0;0,0,1,0;0,0,1,1];
[rows,columns]=size(A);
B=zeros(rows,columns);
for i=1:rows
currRow=find(A(i,:));
B(i,1:length(currRow))=currRow;
end
For large martixes, "find" function is taking time in the calculation as per Matlab Profiler. Is there any way to generate matrix B faster?
Note:
Matrix A is having more than 1000 columns in each row but non-zero elements are never more than 50. Here, I am taking Matrix B as the same size as A but Matrix B can be of much smaller size column-wise.
I would suggest using parfor, but the overhead is too much here, and there are more issues with it, so it is not a good solution.
rows = 5e5;
cols = 1000;
A = rand(rows, cols) < 0.050;
I = uint16(1:cols);
B = zeros(size(A), 'uint16');
% [r,c] = find(A);
tic
for i=1:rows
% currRow = find(A(i,:));
currRow = I(A(i,:));
B(i,1:length(currRow)) = currRow;
end
toc
#Cris suggests replacing find with an indexing operation. It increases the performance by about 10%.
Apparently, there is not a better optimization unless B is required to be in that specific form you tell. I suggest using [r,c] = find(A); if the indexes are not required in a matrix form.

Assigning values to a matrix through vector addition from an adjacency matrix

Very new to Matlab, I usually use STATA.
I want to use the nchoosek fuction to get the sum of vectors in one matrix.
I have a 21x21 adjacency matrix, with either 0 or 1 as the inputs. I want to create a new matrix, that will give me a sum of inputs between all possible triads from the adjacency matrix.
The new matrix should look have four variables, indexes (i, j, k) - corresponding to each combination from the 21x21. And a final variable which is a sum of the inputs.
The code I have so far is:
C = nchoosek(21,3)
B = zeros(nchoosek(21,3), 4)
for i=1:C
for j=i+1:C
for k=j+1:C
B(?)=B(i, j, k, A(i)+A(j)+A(k)) #A is the 21x21 adj mat
end
end
end
I know my assignment statement is incorrect as I don't completed understand the indexing role of the ":" operator. Any help will be appreciated.
Thanks!
This might be what you want:
clear all
close all
clc
A = rand(21,21); % Replace this with actual A
rowNum = 0;
for i=1:21
for j=i+1:21
for k=j+1:21
rowNum = rowNum+1;
B(rowNum,:) = [i, j, k, sum(A(:,i)+A(:,j)+A(:,k))];
end
end
end
There are some points:
You loop for different combinations. the total number of combination is nchoosek(21,3) which you can check after 3 nested loop. Your code with for i=1:C was the first error since you're actually looping for different values of i and different values of j and k. So these just 21 values not more.
To avoid repeated combinations, it's enough to start new index after the previous one, which you've realized in your code.
There are other possible approaches such as vectorized format, but to stick to your approach, I used a counter: rowNum which is the loop counter and updated along the loop.
B(rowNum,:) means all element of rowNum'th row of the matrix B.
Below is an algorithm to find the triads in an adjacency matrix. It checks all possible triads and sums the values.
%basic adjacency matrix with two triads (1-2-5) (2-3-5)
A=[];
A(1,:) = [0 1 0 0 1];
A(2,:) = [1 0 1 0 1];
A(3,:) = [0 1 0 0 1];
A(4,:) = [0 0 0 0 1];
A(5,:) = [1 1 1 1 0];
A=A==1; %logical matrix
triads=nchoosek(1:5,3);
S=nan(size(triads,1),4);
for ct = 1:size(triads,1)
S(ct,1:3)=[A(triads(ct,1),triads(ct,2)),A(triads(ct,1),triads(ct,3)),A(triads(ct,2),triads(ct,3))];
S(ct,4)=sum(S(ct,1:3));
end
triads(find(S(:,4)==3),:)
ans =
1 2 5
2 3 5

How to get the logical matrix corresponding to "scatter" plot?

If I have a two column matrix A like below, I can plot the scatter plot using scatter/plot command. I would like to get the matrix corresponding to such outputs as in hist command. hist command gives the vector output too.
A=[7 1;3 2; 4 3]
For example out=scatter(A(:,1),A(:,2)) must give something like below:
[0 0 0;
0 0 0;
0 1 0;
0 0 1;
0 0 0;
0 0 0;
1 0 0]
Only the indices (7,1), (3,2) and (4,3) are only ones. Or Can someone give me a snippet code to realize this without using loops?
You can use a combination of sparse and full where you can specify the non-zero row and column locations, and the rest of the matrix would be zero:
A = [7 1; 3 2; 4 3];
B = full(sparse(A(:,1), A(:,2), 1, max(A(:,1)), max(A(:,2)))) == 1;
The sparse command takes in the row and column locations of what is non-zero for the first two inputs, the third input is what the non-zero location would be for each row and column location. We can specify a constant to mean that every non-zero location gets the same coefficient, which is 1. We can also specify the size of the matrix, where in this case the rows and columns of the output correspond to the largest number in the first and second columns respectively. Because this is a sparse matrix, you will want to convert this to a full matrix and because you want it to be logical, you will want to compare all elements with the number 1.
We thus get for the output, which is B:
B =
7×3 logical array
0 0 0
0 0 0
0 1 0
0 0 1
0 0 0
0 0 0
1 0 0
Alternatively, we can use sub2ind to create linear indices to index into a pre-allocated matrix of logical false and set only those non-zero row locations to true:
A = [7 1; 3 2; 4 3];
B = false(max(A(:,1)), max(A(:,2)));
ind = sub2ind(size(B), A(:,1), A(:,2));
B(ind) = true;
We first allocate the matrix, then calculate the linear indices to index into the matrix, then finally set the right locations to true. The output here would be the same as the sparse approach.
Just to add: rayryeng's solution is fine if you really want your result to be logical in the sense that it is equal to one if there is anything at the coordinate and zero otherwise. Still, since you added a note on hist, I was wondering if you actually want to count the number of times a specific coordinate is hit. In this case, consider using
S = histcounts2(A(:,2),A(:,1));
if you have access to R2015b+. If not, there is a hist2 function on fileexchange you can use for the purpose.
Here is my solution. Matlab provides a command called accumarray.
S = logical(accumarray(A, 1) )
will give the result too.

assign new matrix values based on row and column index vectors

New to MatLab here (R2015a, Mac OS 10.10.5), and hoping to find a solution to this indexing problem.
I want to change the values of a large 2D matrix, based on one vector of row indices and one of column indices. For a very simple example, if I have a 3 x 2 matrix of zeros:
A = zeros(3, 2)
0 0
0 0
0 0
I want to change A(1, 1) = 1, and A(2, 2) = 1, and A(3, 1) = 1, such that A is now
1 0
0 1
1 0
And I want to do this using vectors to indicate the row and column indices:
rows = [1 2 3];
cols = [1 2 1];
Is there a way to do this without looping? Remember, this is a toy example that needs to work on a very large 2D matrix. For extra credit, can I also include a vector that indicates which value to insert, instead of fixing it at 1?
My looping approach is easy, but slow:
for i = 1:length(rows)
A(rows(i), cols(i)) = 1;
end
sub2ind can help here,
A = zeros(3,2)
rows = [1 2 3];
cols = [1 2 1];
A(sub2ind(size(A),rows,cols))=1
A =
1 0
0 1
1 0
with a vector to 'insert'
b = [1,2,3];
A(sub2ind(size(A),rows,cols))=b
A =
1 0
0 2
3 0
I found this answer online when checking on the speed of sub2ind.
idx = rows + (cols - 1) * size(A, 1);
therefore
A(idx) = 1 % or b
5 tests on a big matrix (~ 5 second operations) shows it's 20% faster than sub2ind.
There is code for an n-dimensional problem here too.
What you have is basically a sparse definition of a matrix. Thus, an alternative to sub2ind is sparse. It will create a sparse matrix, use full to convert it to a full matrix.
A=full(sparse(rows,cols,1,3,2))

Eliminating zeros in a matrix - Matlab

Hi I have the following matrix:
A= 1 2 3;
0 4 0;
1 0 9
I want matrix A to be:
A= 1 2 3;
1 4 9
PS - semicolon represents the end of each column and new column starts.
How can I do that in Matlab 2014a? Any help?
Thanks
The problem you run into with your problem statement is the fact that you don't know the shape of the "squeezed" matrix ahead of time - and in particular, you cannot know whether the number of nonzero elements is a multiple of either the rows or columns of the original matrix.
As was pointed out, there is a simple function, nonzeros, that returns the nonzero elements of the input, ordered by columns. In your case,
A = [1 2 3;
0 4 0;
1 0 9];
B = nonzeros(A)
produces
1
1
2
4
3
9
What you wanted was
1 2 3
1 4 9
which happens to be what you get when you "squeeze out" the zeros by column. This would be obtained (when the number of zeros in each column is the same) with
reshape(B, 2, 3);
I think it would be better to assume that the number of elements may not be the same in each column - then you need to create a sparse array. That is actually very easy:
S = sparse(A);
The resulting object S is a sparse array - that is, it contains only the non-zero elements. It is very efficient (both for storage and computation) when lots of elements are zero: once more than 1/3 of the elements are nonzero it quickly becomes slower / bigger. But it has the advantage of maintaining the shape of your matrix regardless of the distribution of zeros.
A more robust solution would have to check the number of nonzero elements in each column and decide what the shape of the final matrix will be:
cc = sum(A~=0);
will count the number of nonzero elements in each column of the matrix.
nmin = min(cc);
nmax = max(cc);
finds the smallest and largest number of nonzero elements in any column
[i j s] = find(A); % the i, j coordinates and value of nonzero elements of A
nc = size(A, 2); % number of columns
B = zeros(nmax, nc);
for k = 1:nc
B(1:cc(k), k) = s(j == k);
end
Now B has all the nonzero elements: for columns with fewer nonzero elements, there will be zero padding at the end. Finally you can decide if / how much you want to trim your matrix B - if you want to have no zeros at all, you will need to trim some values from the longer columns. For example:
B = B(1:nmin, :);
Simple solution:
A = [1 2 3;0 4 0;1 0 9]
A =
1 2 3
0 4 0
1 0 9
A(A==0) = [];
A =
1 1 2 4 3 9
reshape(A,2,3)
ans =
1 2 3
1 4 9
It's very simple though and might be slow. Do you need to perform this operation on very large/many matrices?
From your question it's not clear what you want (how to arrange the non-zero values, specially if the number of zeros in each column is not the same). Maybe this:
A = reshape(nonzeros(A),[],size(A,2));
Matlab's logical indexing is extremely powerful. The best way to do this is create a logical array:
>> lZeros = A==0
then use this logical array to index into A and delete these zeros
>> A(lZeros) = []
Finally, reshape the array to your desired size using the built in reshape command
>> A = reshape(A, 2, 3)