Conversion from Matlab CSC to CSR format

I am using the MEX bridge to perform some operations on sparse matrices from Matlab.
For that I need to convert the input matrix into CSR (compressed row storage) format, since Matlab stores sparse matrices in CSC (compressed column storage) format.
I was able to get the value array and the column_indices array. However, I am struggling to get the row_pointer array for the CSR format. Is there any C library that can help with the conversion from CSC to CSR?
Further, when writing a CUDA kernel, is it more efficient to use the CSR format for sparse operations, or should I just use the following arrays: row indices, column indices and values?
Which one would give me more control over the data while minimizing the number of for-loops in the custom kernel?

Compressed row storage is similar to compressed column storage, just transposed. So the simplest thing is to use MATLAB to transpose the matrix before you pass it to your MEX file. Then, use the functions
Ap = mxGetJc(spA);
Ai = mxGetIr(spA);
Ax = mxGetPr(spA);
to get the internal pointers and treat them as row storage: Ap is the row pointer array, Ai holds the column indices of the non-zero entries, and Ax holds the non-zero values (the indices are zero-based). Note that for symmetric matrices you do not have to do anything at all, because CSC and CSR are then the same.
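If transposing in MATLAB is not convenient, the conversion can also be done by hand inside the MEX file. The following is a minimal sketch in plain C++, not a library routine: the Csr struct and the csc_to_csr function are hypothetical names, indices are assumed to be zero-based int values, and values are assumed to be double (the arrays returned by mxGetJc/mxGetIr would have to be copied or cast accordingly).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for a CSR matrix with zero-based indices.
struct Csr {
    std::vector<int> row_ptr;   // n_rows + 1 entries
    std::vector<int> col_idx;   // one entry per non-zero
    std::vector<double> val;    // one entry per non-zero
};

// Convert CSC arrays (col_ptr, row_idx, val) to CSR with a counting pass.
Csr csc_to_csr(int n_rows, int n_cols,
               const std::vector<int>& col_ptr,
               const std::vector<int>& row_idx,
               const std::vector<double>& val) {
    assert(col_ptr.size() == static_cast<std::size_t>(n_cols) + 1);
    const int nnz = col_ptr[n_cols];
    assert(row_idx.size() == static_cast<std::size_t>(nnz));
    assert(val.size() == static_cast<std::size_t>(nnz));

    Csr out;
    out.row_ptr.assign(n_rows + 1, 0);
    out.col_idx.resize(nnz);
    out.val.resize(nnz);

    // 1) Count the non-zeros of each row, stored one slot ahead.
    for (int k = 0; k < nnz; ++k) ++out.row_ptr[row_idx[k] + 1];
    // 2) Prefix sum turns the counts into row start offsets.
    for (int r = 0; r < n_rows; ++r) out.row_ptr[r + 1] += out.row_ptr[r];
    // 3) Scatter every CSC entry into the next free slot of its row.
    std::vector<int> next(out.row_ptr.begin(), out.row_ptr.end() - 1);
    for (int c = 0; c < n_cols; ++c) {
        for (int k = col_ptr[c]; k < col_ptr[c + 1]; ++k) {
            const int dst = next[row_idx[k]]++;
            out.col_idx[dst] = c;
            out.val[dst] = val[k];
        }
    }
    return out;
}
```

Because the columns are visited in increasing order, the column indices within each CSR row come out sorted as well.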
Which format to use depends heavily on what you want to do with the matrix later. For example, have a look at matrix formats for sparse matrix-vector multiplication. That is one of the classic papers; research has moved on since then, so you can look around further.
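To illustrate why CSR is a common choice for matrix-vector products, here is a minimal CPU sketch of y = A*x over CSR arrays. The function name csr_spmv is hypothetical and zero-based indices are assumed. In a simple CUDA kernel the outer loop over rows is typically what gets mapped to threads (one row per thread), leaving only the inner loop in the kernel body; whether that is the fastest layout depends on the sparsity pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// y = A * x for a matrix A given in CSR form (zero-based indices).
std::vector<double> csr_spmv(const std::vector<int>& row_ptr,
                             const std::vector<int>& col_idx,
                             const std::vector<double>& val,
                             const std::vector<double>& x) {
    assert(!row_ptr.empty());
    const std::size_t n_rows = row_ptr.size() - 1;
    std::vector<double> y(n_rows, 0.0);
    for (std::size_t r = 0; r < n_rows; ++r) {
        double sum = 0.0;
        // All non-zeros of row r are contiguous in col_idx and val.
        for (int k = row_ptr[r]; k < row_ptr[r + 1]; ++k) {
            sum += val[k] * x[col_idx[k]];
        }
        y[r] = sum;
    }
    return y;
}
```

With plain row/column/value triplets (COO), each non-zero would instead add into y[row] independently, which on a GPU usually calls for atomic additions or a segmented reduction.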

I ended up converting the CSC format from Matlab to CSR using the CUSP library as follows.
After getting the matrix A from Matlab, I extracted its row, column and value vectors and copied each of them into its own thrust::host_vector.
After that I defined two cusp::array1d types, Indices and Values, and created the arrays as follows.
typedef cusp::array1d<int,cusp::host_memory> Indices;
typedef cusp::array1d<float,cusp::host_memory> Values;
Indices row_indices(rows.begin(),rows.end());
Indices col_indices(cols.begin(),cols.end());
Values Vals(Val.begin(),Val.end());
where rows, cols and Val are the thrust::host_vector objects that I filled from Matlab.
After that I created a cusp::coo_matrix_view as given below.
typedef cusp::coo_matrix_view<Indices,Indices,Values>HostView;
HostView Ah(m,n,NNZa,row_indices,col_indices,Vals);
where m, n and NNZa are the number of rows, the number of columns and the number of non-zeros, which I get from the mex functions for sparse matrices.
I copied this view into a cusp::csr_matrix in device memory, with the proper dimensions set, as given below.
cusp::csr_matrix<int,float,cusp::device_memory>CSR(m,n,NNZa);
CSR = Ah;
After that I copied the three content arrays of this CSR matrix back to the host with cudaMemcpy, using thrust::raw_pointer_cast to obtain the device pointers. The destination arrays had already been allocated with mxCalloc at the proper sizes.
cudaMemcpy(Acol,thrust::raw_pointer_cast(&CSR.column_indices[0]),sizeof(int)*(NNZa),cudaMemcpyDeviceToHost);
cudaMemcpy(Aptr,thrust::raw_pointer_cast(&CSR.row_offsets[0]),sizeof(int)*(m+1),cudaMemcpyDeviceToHost); // row_offsets has one entry per row plus one
cudaMemcpy(Aval,thrust::raw_pointer_cast(&CSR.values[0]),sizeof(float)*(NNZa),cudaMemcpyDeviceToHost);
Hope this is useful to anyone who is using CUSP with Matlab.

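You can do something like this:
n = size(M,1);
nz_num = nnz(M);
[col,rowi,vals] = find(M');   % non-zeros of M, ordered row by row
row = zeros(n+1,1);
ll = 1; row(1) = 1;
for l = 2:nz_num              % walk over all non-zeros, not only the first n
    if rowi(l)~=rowi(l-1)     % a new row starts at entry l
        ll = ll + 1;
        row(ll) = l;
    end
end
row(n+1) = nz_num+1;          % assumes every row has at least one non-zero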
It works for me, hope it can help somebody else!
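For use inside a MEX file, the same row-pointer construction can be sketched in C++ with a count followed by a prefix sum, which also copes with empty rows. This is only an illustration: build_row_ptr is a hypothetical name, and the indices are assumed to be zero-based (unlike the one-based MATLAB code above).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build the CSR row pointer array from the row index of every non-zero.
// Rows without any non-zero simply get start == end.
std::vector<int> build_row_ptr(int n_rows, const std::vector<int>& row_idx) {
    std::vector<int> row_ptr(static_cast<std::size_t>(n_rows) + 1, 0);
    for (int r : row_idx) {
        assert(r >= 0 && r < n_rows);
        ++row_ptr[r + 1];                 // count entries of row r
    }
    for (int r = 0; r < n_rows; ++r) {
        row_ptr[r + 1] += row_ptr[r];     // running total gives start offsets
    }
    return row_ptr;
}
```

The column index and value arrays still have to be ordered row by row for the resulting offsets to be meaningful, which is what find(M') provides in the MATLAB version.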

Related

How can I randomly sample from a distribution already fitted with allfitdist in MATLAB?

I've found the best-fitting distribution for a variable (D(:,2)) using the function "allfitdist". Now I want to save this result in a structure and then randomly sample 10000 times from it. I'm using this code:
[Ddg2 PDdg2] = allfitdist(D(:,2),'cdf')
My(2).result = PDdg2{1,1} %generalized pareto
output = random(My(2).result,10000)
Something is weird, because the output is a really big matrix. Maybe the third line of the code, where I randomly sample from this distribution, is wrong.
Can someone help me?
The documentation of random says:
R = random(___,sz1,...,szN) or R = random(___,[sz1,...,szN]) generates a sz1-by-⋯-by-szN array of random numbers from the specified probability distribution using input arguments...
...
If you specify a single value sz1, then R is a square matrix of size sz1.
You have specified sz1 as 10000, which is a single value, and hence your output matrix is 10000×10000.
So the solution is to pass both dimensions explicitly, which gives a 1-by-10000 row vector:
output = random(My(2).result,1,10000);

Logical Indexing Failing when Matrix is loaded by matfile

I have a matrix that was stored in a .mat file, and was then reloaded in matlab via the function matfile. I also have a logical index, like logical([1 0 1 0]), that I want to apply to the loaded matrix:
results = matfile('results.mat');
% id is my logical vector of the appropriate size
% IV is a matrix stored in results.mat
newIV = results.IV(:,id);
However, I am running into a problem and getting this error:
'IV' cannot be indexed with class 'logical'. Indices must be numeric.
I do not understand what is causing this issue. The same code worked before; the only difference is that I did not have to load the struct results from a file, because I already had it in memory.
It gets weirder; this works:
IV = results.IV;
newIV = IV(:,id); % this works somehow
This also works:
results_raw = matfile('results.mat');
results = struct('IV',results_raw.IV);
newIV = results.IV(:,id); % this also works!!! why matlab, why???
I also tried resaving the results.mat file using the -v7.3 flag, but it did not solve the problem. The issue seems to be with loading the .mat file, because I created a struct with a matrix and used logical indexing and it worked fine.
Question: why does indexing work when I assign results.IV to IV? How can I make it work with results.IV directly?
Thanks for helping!!! :D
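As @Adiel said in the question's comments, you can't use logical indices on a variable accessed through a matfile object; as the error message says, the indices must be numeric.
So, convert the logical indices to numeric indices using find.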
results = matfile('results.mat');
% id is my logical vector of the appropriate size
% IV is a matrix stored in results.mat
newIV = results.IV(:,find(id));

Encoding a binary vector in a suitable way in Matlab

The context and the problem below are only examples that can help to visualize the question.
Context: Let's say that I'm continuously generating random binary vectors G of size 1x64 (whose values are either 0 or 1).
Problem: I don't want to check vectors that I've already checked, so I want to create a kind of table that can identify which vectors have already been generated.
So, how can I identify each vector in an optimized way?
My first idea was to convert the binary vectors into decimal numbers. Due to the maximum length of the vectors, I would need 2^64 = 1.8447e+19 numbers to encode them. That's huge, so I need an alternative.
I thought about using hexadecimal coding. In that case, if I'm not wrong, I would need nchoosek(16+16-1,16) = 300540195 elements, which is also huge.
So, are there better alternatives? For example, a kind of hash function that can identify those vectors without repeating values?
So you have 64 bit values (or vectors) and you need a data structure in order to efficiently check if a new value is already existing?
Hash sets or binary trees come to mind, depending on if ordering is important or not.
Matlab has a hash table in containers.Map.
Here is an example:
tic;
n = 1e5; % number of random elements
keys = uint64(rand(n, 1) * 2^64); % random uint64
% check and add key if not already existing (using a containers.Map)
map = containers.Map('KeyType', 'uint64', 'ValueType', 'logical');
for i = 1 : n
key = keys(i);
if ~isKey(map, key)
map(key) = true;
end
end
toc;
However, depending on why you really need that and when you really need to check, the Matlab function unique might also be something for you.
Just throwing out duplicates once at the end like:
tic;
unique_keys = unique(keys);
toc;
is in this example 300 times faster than checking every time.
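The same idea can be sketched outside MATLAB. The minimal C++ example below packs each 64-element binary vector into one 64-bit integer and keeps the keys in a hash set, so memory grows with the number of vectors actually seen, not with the 2^64 possible values. The names pack_bits and SeenVectors are hypothetical, and the bit order (element i goes to bit i) is an arbitrary choice made for this illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>

// Pack 64 binary values into one 64-bit key; element i becomes bit i.
std::uint64_t pack_bits(const std::array<int, 64>& bits) {
    std::uint64_t key = 0;
    for (int i = 0; i < 64; ++i) {
        assert(bits[i] == 0 || bits[i] == 1);
        if (bits[i] != 0) key |= (std::uint64_t{1} << i);
    }
    return key;
}

// Remembers which vectors have been seen so far.
class SeenVectors {
public:
    // Returns true if the vector is new (and records it), false otherwise.
    bool insert_if_new(const std::array<int, 64>& bits) {
        return seen_.insert(pack_bits(bits)).second;
    }
    std::size_t size() const { return seen_.size(); }

private:
    std::unordered_set<std::uint64_t> seen_;
};
```

Since the packing is a one-to-one mapping, two different vectors can never share a key, so there are no false "already seen" answers.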

Indexing data in matlab

I have imported a lot of data from an excel spreadsheet so that I have a 1x27 matrix.
I have imported data from excel using this
filename = 'for_matlab.xlsx';
sheet = 27;
xlRange = 'A1:G6';
all_data = {};
for i=1:sheet,
all_data{i} = xlsread(filename, i, xlRange);
end
However each element of this all_data matrix (which is 1x27) contains my data but I'm having trouble accessing individual elements.
i.e.
all_data{1}
Will give me the entire matrix but I need to perform multiplications on individual elements of this data
also
all_data(1)
just gives '5x6 double', i.e. the matrix dimensions.
Does anybody know how I can divide all elements of each row by the third element in that row, and do this for all of my 'sub-matrices' (for want of a better word)?
Assuming that all_data is a cell array and that each cell contains a matrix (with at least three columns):
result = cellfun(@(x) bsxfun(@rdivide, x, x(:,3)), all_data, 'uniformoutput', 0);
You are mixing terminology in Matlab. What you have is a 1x27 cell array, with each cell containing a matrix.
If you access all_data{1}, it will give you the whole matrix stored in the first cell.
If you want to access the elements of that matrix, then you need to do all_data{1}(2,4). This example accesses the (2,4) element of the matrix in the first cell.
Luis Mendo has definitely solved your problem, but be aware of the differences between cells and matrices in Matlab!
Okay I have found the answer now.
Basically you have to use both types of brackets because the data types are different
i.e. all_data{1}(1:4) or something like that anyway.
Cheers

access sub-matrix of a multidimensional Mat in OpenCV

According to this post and the OpenCV documentation, I can initialize and access each element of a multidimensional Mat.
I first coded in MATLAB and now need to convert to OpenCV. A MATLAB matrix supports sub-matrix access like a(:,:,3) or b(:,:,3:5).
Can this be done in OpenCV? As far as I know, it can be done with a 2D Mat. How about more than 2D?
Edit01:
Moreover, with a multidimensional Mat, the properties cols and rows are not enough to describe the three sizes of the matrix, and there are cases with more than 3 dimensions. How are these sizes stored?
Edit02:
// create a 100x100x100 8-bit array
int sz[] = {100, 100, 100};
Mat bigCube(3, sz, CV_8U, Scalar::all(0));
I gave up on the idea of sub-matrix access with an OpenCV Mat; perhaps it's not supported. But in this sample code, the constructor receives the third dimension from 'sz'. Which property of Mat is this third dimension passed to? Presumably in this case rows = 100 and cols = 100, but which member holds the other 100?
I'm lost in the OpenCV documentation.
Edit03: tracking Mat class from OpenCV source
I've found the definition of the constructor in Edit02 from mat.hpp:
inline Mat::Mat(int _dims, const int* _sz, int _type, const Scalar& _s)
: flags(0), dims(0), rows(0), cols(0), data(0), refcount(0),
datastart(0), dataend(0), datalimit(0), allocator(0), size(&rows)
{
create(_dims, _sz, _type);
*this = _s;
}
The next question is where and how the "create" function is defined.
=> Tracing this Mat definition in OpenCV will probably help me add my own features to Mat.
PS: Excuse me if my post is too messy! I'm a novice programmer trying to solve my programming problem. Please feel free to correct me if my approach is not good or right enough. Thank you!
You can easily access a sub-matrix of a 2D cv::Mat using the functions rowRange and colRange, or even
cv::Mat subMat = originalMat(cv::Rect(x,y,width,height));
Also, the number of channels of a matrix, which you can set in the matrix constructor, can be used as a third dimension (but it is limited to 256 or 512, I think).
There is also the templated cv::Mat_ class, which you can adapt to fit your purpose.
[edit]
I have checked the constructor for matrices with more than 2 dimensions. When you run it, the rows and cols fields of Mat are set to -1. The actual matrix size is stored in Mat::size as an array of int.
For matrices with more than 2 dimensions you cannot use the sub-matrix constructors that take a cv::Rect, or rowRange/colRange.
I'm afraid you have to do a bit of work to extract sub-matrices for dims > 2, working directly with the raw data. But you can use the information stored in Mat::step, which tells you the layout of the array. This is explained in the official documentation.
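To make the step-based addressing concrete, here is a minimal sketch in plain C++ that does not use OpenCV at all. The Cube struct is a hypothetical stand-in for a 3-D array of bytes stored with the last index varying fastest, which is the layout assumed here, and its steps are counted in elements, whereas Mat::step is expressed in bytes. Under that layout a MATLAB-style slice a(:,:,k) is strided and has to be gathered, while a plane with a fixed first index is contiguous and can be addressed directly.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense d0 x d1 x d2 byte array, last index varying fastest.
struct Cube {
    std::size_t d0, d1, d2;
    std::vector<unsigned char> data;

    Cube(std::size_t a, std::size_t b, std::size_t c)
        : d0(a), d1(b), d2(c), data(a * b * c, 0) {}

    // Distance in elements between neighbours along dimension 0 and 1.
    std::size_t step0() const { return d1 * d2; }
    std::size_t step1() const { return d2; }

    unsigned char& at(std::size_t i, std::size_t j, std::size_t k) {
        return data[i * step0() + j * step1() + k];
    }
};

// Copy the strided slice a(:,:,k) into a d0 x d1 row-major matrix.
std::vector<unsigned char> slice_last(const Cube& c, std::size_t k) {
    assert(k < c.d2);
    std::vector<unsigned char> out(c.d0 * c.d1);
    for (std::size_t i = 0; i < c.d0; ++i) {
        for (std::size_t j = 0; j < c.d1; ++j) {
            out[i * c.d1 + j] = c.data[i * c.step0() + j * c.step1() + k];
        }
    }
    return out;
}

// The plane a(i,:,:) is contiguous, so a pointer to its start is enough.
const unsigned char* plane_first(const Cube& c, std::size_t i) {
    assert(i < c.d0);
    return c.data.data() + i * c.step0();
}
```

If slices along one particular dimension are needed often, putting that dimension first makes every such slice a contiguous plane that needs no copy.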
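You can create a 2D sub-matrix like this:
cv::Mat subMat(100,100,CV_8U, bigCube.ptr(0));
subMat is a 2-D matrix header over the first 100x100 plane of bigCube (index 0 along the first dimension) and shares its data, so you can use it like any other 2D Mat.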