Counting non-zero elements in a sparse matrix using breeze - scala

I'm new to Breeze. I can't figure out how to count the number of non-zero elements per row in a CSCMatrix.
Let's take the following example :
val matrix = CSCMatrix.tabulate(3, 2)((x,y) => x*y)
This is a sparse matrix of dimension 3x2:
|0|0|
|0|1|
|0|2|
What I want is to compute the number of non-zero elements per row, so the output should be a vector of the form:
|0|
|1|
|1|
It would be easy to do it with numpy, but I can't seem to be able to do it with breeze.

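In case someone still needs this, a matrix-vector multiplication gives the per-row sums. These equal the non-zero counts only when every non-zero entry is 1 (for the example above the result is 0, 1, 2 rather than 0, 1, 1), so convert the non-zero entries to 1 before multiplying if you need true counts: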
matrix * DenseVector.ones[Int](matrix.cols)

Related

Scale every row in a sparse matrix by an element of a vector in MATLAB

I have a sparse matrix
obj.resOp = sparse(row,col,val);
and a vector containing the sums of each row in the matrix
sums = sparse(sum(obj.resOp,2));
Now what I want to do is
obj.resOp = obj.resOp ./ sums;
which would scale every row in the matrix so that the rowsum in each row is 1.
However, in this last line MATLAB seems to construct a full matrix internally from obj.resOp, so for sufficiently large matrices I get this error:
Error using ./ Requested 38849x231827 (17.5GB) array exceeds maximum
array size preference. Creation of arrays greater than this limit may
take a long time and cause MATLAB to become unresponsive. See array size limit or preference
panel for more information.
In theory I think that expanding to a full matrix is not necessary. Is there any MATLAB formulation of what I want to achieve while keeping the sparsity of obj.resOp?
You can do this with a method similar to the one described in this answer.
Start with some sparse matrix
% Random sparse matrix: 10 rows, 4 cols, density 20%
S = sprand(10,4, 0.2);
Get the row sums. Note that sum returns a sparse matrix from sparse inputs, so there is no need for your additional conversion (docs).
rowsums = sum(S,2);
Find all non-zero indices and their values
[rowidx, colidx, vals] = find(S);
Now create a sparse matrix from the element-wise division
out = sparse(rowidx, colidx, vals./rowsums(rowidx), size(S,1), size(S,2));
The equivalent calculation
obj.resOp = inv(diag(sums)) * obj.resOp;
works smoothly, provided every row sum is non-zero (otherwise diag(sums) is singular).
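A variant of the same idea that avoids inv is to build the diagonal scaling matrix directly with spdiags. This is only a sketch: the small matrix S and the guard for empty rows are assumptions added for illustration and are not part of the original posts.

```matlab
% Small sparse test matrix (illustrative values); row 3 is deliberately empty
S = sparse([1 1 2 4], [1 3 2 3], [2 6 5 4], 4, 3);

rowsums = full(sum(S, 2));        % 4x1 vector of row sums
scale = zeros(size(rowsums));     % empty rows keep a scale factor of 0
nz = rowsums ~= 0;
scale(nz) = 1 ./ rowsums(nz);     % reciprocal only where the row sum is non-zero

D = spdiags(scale, 0, size(S, 1), size(S, 1));  % sparse diagonal matrix
out = D * S;                      % sparse * sparse stays sparse

issparse(out)                     % true
full(sum(out, 2))                 % 1 for every non-empty row, 0 for the empty one
```

Because D is diagonal and sparse, the product only touches the stored entries of S, so no full matrix is created along the way.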

Matlab efficient sparse matrix multiplication

I have a sparse matrix which only has elements in three diagonals.
I also have a column vector where I wish to multiply every element in each row of the sparse matrix by the corresponding element in each row of the column vector. Is there an efficient way to do this in MATLAB? If the sparse matrix is called A and the column vector B, I've only tried
A.*repmat(B,[1,9])
which is obviously inefficient.
Here's one way:
C = bsxfun(@times, A, B)
According to docs, the resulting matrix C is sparse:
Binary operators yield sparse results if both operands are sparse, and full results if both are full. For mixed operands, the result is full unless the operation preserves sparsity. If S is sparse and F is full, then S+F, S*F, and F\S are full, while S.*F and S&F are sparse. In some cases, the result might be sparse even though the matrix has few zero elements.
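For a quick check that the row scaling behaves as expected, here is a minimal sketch on a small tridiagonal matrix. The sizes and values are made up for illustration, and the spdiags line is an alternative formulation (left-multiplying by a sparse diagonal matrix) rather than something taken from the answer above.

```matlab
n = 5;
e = ones(n, 1);
A = spdiags([e 2*e 3*e], -1:1, n, n);   % sparse tridiagonal matrix
B = (1:n)';                             % column vector of row weights

C1 = bsxfun(@times, A, B);              % scale row i of A by B(i)
C2 = spdiags(B, 0, n, n) * A;           % same scaling via a sparse diagonal matrix

issparse(C2)                            % true
isequal(full(C1), full(C2))             % true
```

Either form avoids the repmat call, which would materialise a full copy of B for every column of A.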

Matlab: How to convert a matrix into a Toeplitz matrix

Consider a discrete dynamical system where x[0]=rand() denotes the initial condition of the system.
I have generated an m by n matrix by the following step: generate m vectors with m different initial conditions, each of dimension N (N is the number of samples or elements). This matrix is called R. Using R, how do I create a Toeplitz matrix T?
Mathematically,
R = [ x_0[0], ..., x_0[n-1];
      ...;
      x_m[0], ..., x_m[n-1] ]
The Toeplitz matrix T =
x[n-1], x[n-2], ..., x[0];
x[0],   x[n-1], ..., x[1];
  :       :            :
x[m-2], x[m-3], ..., x[m-1]
I tried working with toeplitz(R), but the dimension changes. The dimension should not change, as seen mathematically.
According to the paper provided (Toeplitz structured chaotic sensing matrix for compressive sensing by Yu et al.) there are two Chaotic Sensing Matrices involved. Let's explore them separately.
The Chaotic Sensing Matrix (Section A)
It is clearly stated that to create such a matrix you have to build m independent signals (sequences) with m different initial conditions (in the range ]0;1[) and then concatenate these signals by rows (that is, one signal = one row). Each of these signals must have length N. This is exactly your matrix R, which is correctly evaluated as it is. I would suggest one code improvement, though: instead of building columns and then transposing the matrix, you can build the matrix directly by rows:
R=zeros(m,N);
R(:,1)=rand(m,1); %build the first column with m initial conditions
Please note: randn() draws values from a Gaussian (normal) distribution, so they might not lie in the range ]0;1[ required by the paper (right below equation 9). rand(), by contrast, draws uniformly distributed values in that range.
After that, you can build every row separately according to the for-loop:
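for i=1:m
    for j=2:N %skip first column
        R(i,j)=4*R(i,j-1)*(1-R(i,j-1)); %logistic map on the unshifted values
    end
end
R=R-0.5; %shift once at the end; shifting inside the loop makes the recursion diverge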
The Toeplitz Chaotic Sensing Matrix (Section B)
It is clearly stated at the beginning of Section B that to build the Toeplitz matrix you should consider a single sequence x with a given, single, initial condition. So let's build such sequence:
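x=zeros(1,N);
x(1)=rand(); %single initial condition in ]0;1[
for j=2:N %skip first element
    x(j)=4*x(j-1)*(1-x(j-1)); %logistic map on the unshifted values
end
x=x-0.5; %shift once at the end; shifting inside the loop makes the recursion diverge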
Now, to build the matrix you can consider:
What does the first row look like? It is the sequence itself, but flipped (i.e. instead of going from 0 to n-1, it goes from n-1 to 0).
What does the first column look like? It is the last item of x followed by the elements in the range 0 to m-2.
Let's then build the first row (r) and the first column (c):
r=fliplr(x);
c=[x(end) x(1:m-1)];
Please note: in Matlab the indices start from 1, not from 0 (so instead of going from 0 to m-2, we go from 1 to m-1). Also end means the last element from a given array.
Now, the help for the toeplitz() function states that you can build a non-square Toeplitz matrix by specifying the first column and the first row. Therefore, finally, you can build the matrix as:
T=toeplitz(c,r);
The resulting matrix does indeed have dimensions m-by-N, as reported in the paper.
Even though the Authors call both of them \Phi, they actually are two separate matrices.
They do not take the Toeplitz of the Beta-Like Matrix (a Toeplitz matrix is not a function or operator of some kind), nor do they transform the Beta-Like Matrix into a Toeplitz matrix.
You have the Beta-Like Matrix (i.e. the Chaotic Sensing Matrix) at first, and then the Toeplitz-structured Chaotic Sensing Matrix: such structure is typical for Toeplitz matrices, that is a diagonal-constant structure (all elements along a diagonal have the same value).
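The Section B construction can be checked on a small case. In this sketch the sizes m and N are arbitrary and x = 1:N is only a stand-in for the chaotic sequence, chosen so that the position of each sample in T is easy to read off.

```matlab
m = 3;  N = 5;                 % small illustrative sizes (assumed, not from the paper)
x = 1:N;                       % stand-in for the chaotic sequence x[0..N-1]

r = fliplr(x);                 % first row:    x[N-1], ..., x[0]
c = [x(end) x(1:m-1)];         % first column: x[N-1], x[0], ..., x[m-2]
T = toeplitz(c, r);

size(T)                        % 3 5
disp(T)
%  5 4 3 2 1
%  1 5 4 3 2
%  2 1 5 4 3
```

Every diagonal holds a single value, and the last row ends in x[m-1], which matches the layout written out in the question.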

Drawing a random non-zero element from a sparse matrix

I have a sparse logical matrix, which is quite large. I would like to draw random non-zero elements from it without storing all of its non-zero elements in a separate vector (e.g. by using the find command). Is there an easy way to do this?
Currently I am implementing rejection sampling, which is drawing a random element and checking whether that is non-zero or not. But it is not efficient when the ratio of non-zero elements is small.
A sparse logical matrix is not a very practical representation of your data if you want to pick random locations. Rejection sampling and find are the only two ways that make sense to me. Here's how you can do them efficiently (assuming you want to get 4 random locations):
%# using find
idx = find(S);
%# draw 4 without replacement
fourRandomIdx = idx(randperm(length(idx),4));
%# draw 4 with replacement
fourRandomIdx = idx(randi(length(idx),4,1));
%# get row, column values
[row,col] = ind2sub(size(S),fourRandomIdx);
%# using rejection sampling
density = nnz(S)/numel(S);
%# on average 4/density draws are needed for 4 hits;
%# multiply by 2 (or 3) for a safety margin
n = ceil(4/density) * 2;
%# random linear indices w/ replacement
randIdx = randi(numel(S),n,1);
%# keep the first four draws that hit a non-zero element
%# (fewer than four may be found; draw again in that case)
hits = randIdx(find(S(randIdx),4,'first'));
[row,col] = ind2sub(size(S),hits);
An n x m matrix with nnz non-zero elements requires nnz + m + 1 integers to store the locations of its non-zero entries (nnz row indices plus m + 1 column pointers in MATLAB's compressed-column format). For a logical matrix there is no need to store the values of the non-zero entries: these are all true. Correspondingly, you would do best to convert your logical sparse matrix into a list of the linear indices of its non-zero entries, together with n and m, which requires only nnz + 2 integers of storage. From these (and ind2sub) you can readily reconstruct the subscripts corresponding to any non-zero entry that you choose randomly using randi over the range 1..nnz.
find is the standard interface to get the non-zero elements in a sparse matrix. Have a look here http://www.mathworks.se/help/techdoc/math/f6-9182.html#f6-13040
[i,j,s] = find(S)
find returns the row indices of nonzero values in vector i, the column indices in vector j, and the nonzero values themselves in the vector s.
No need to get s. Just pick a random index in i,j.
By representing the entries in a 3 column format, aka a coordinate list (i, j, value), you can simply select the items from the list. To get this, you can either use your original method for creating the sparse matrix (i.e. the precursor to sparse()), or use the find command, a la [i,j,s] = find(S);
If you don't need the entries, and it seems you don't, you can just extract i and j.
If, for some reason, your matrix is massive and your RAM limitations are severe, you can simply divide the matrix into regions, and let the probability of selecting a given sub-matrix be proportional to the number of non-zero elements (using nnz) in that sub-matrix. You could go so far as to divide the matrix into individual columns, and the rest of the calculation is trivial. NB: by applying sum to the matrix, you can get the per-column counts (assuming your entries are just 1s).
In this way, you need not even bother with rejection sampling (which seems pointless to me in this case, since Matlab knows where all of the non-zero entries are).
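The column-wise idea above can be sketched as follows. The test matrix and variable names are assumptions for illustration. The point is that only the per-column counts and the row indices of one column are ever held in memory, never the full list returned by find(S).

```matlab
% Illustrative sparse logical matrix (assumed test data)
S = sparse([1 3 2 5 4], [1 1 3 4 4], true, 5, 4);

colCounts = full(sum(S, 1));            % non-zeros per column
edges = cumsum(colCounts);              % cumulative counts, last entry equals nnz(S)

k = randi(edges(end));                  % pick the k-th non-zero uniformly at random
col = find(edges >= k, 1, 'first');     % column that contains the k-th non-zero
rowsInCol = find(S(:, col));            % row indices of that single column only
row = rowsInCol(k - (edges(col) - colCounts(col)));  % position within the column
```

Since k is uniform over 1..nnz(S) and each non-zero corresponds to exactly one k, every non-zero entry is selected with equal probability, and empty columns are never chosen.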

Weighted sum of elements in matrix - Matlab?

I have two 50 x 6 matrices, say A and B. I want to assign weights to each element of columns in matrix - more weight to elements occurring earlier in a column and less weight to elements occurring later in the same column...likewise for all 6 columns. Something like this:
cumsum(weight(row)*(A(row,col)-B(row,col))); % cumsum is for cumulative sum of matrix
How can we do it efficiently without using loops?
If you have your weight vector w as a 50x1 vector, then you can rewrite your code as
cumsum(repmat(w,1,6).*(A-B))
BTW, I don't know why you have the cumsum operating on a scalar in a loop... it has no effect. I'm assuming that you meant that's what you wanted to do with the entire matrix. Calling cumsum on a matrix will operate along each column by default. If you need to operate along the rows, you should call it with the optional dimension argument as cumsum(x,2), where x is whatever matrix you have.
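A small worked case makes the column-wise behaviour concrete. The 3x2 matrices and weights below are made-up stand-ins for the 50x6 data, and the bsxfun line is an alternative that avoids repmat rather than part of the answer above.

```matlab
% Small stand-ins for the 50x6 matrices (illustrative values)
A = [4 2; 6 8; 1 3];
B = [1 1; 2 2; 1 1];
w = [3; 2; 1];                         % larger weights for earlier rows

D1 = cumsum(repmat(w, 1, size(A, 2)) .* (A - B));  % as in the answer
D2 = cumsum(bsxfun(@times, w, A - B));             % same result without repmat

isequal(D1, D2)                        % true
disp(D1)
%   9   3
%  17  15
%  17  17
```

Each column of the result is the running total of the weighted differences down that column, so the last row holds the full weighted sum per column.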