Best conversion between LuaJIT ffi cdata and torch Tensor

What's the best way to convert between LuaJIT ffi cdata [1] and a Torch Tensor [2]?
According to Mike's reply on the lua-l mailing list [3], converting cdata to a plain Lua table requires a loop that copies each item into a newly created table. Torch Tensors do provide an interface for LuaJIT ffi access [4]. My current solution is therefore to loop over the cdata, copy it into a plain Lua table, and then call the tensor constructor that creates a tensor from a table [5].
In my case, however, I need to convert between LuaJIT ffi cdata and Torch Tensors very frequently. Is there a better approach than a copy loop?
[1] http://luajit.org/ext_ffi_api.html (Creating cdata Objects Section)
[2] https://github.com/torch/torch7/blob/master/doc/tensor.md
[3] http://lua-users.org/lists/lua-l/2011-03/msg00584.html
[4] https://github.com/torch/torch7/blob/master/doc/tensor.md#luajit-ffi-access
[5] https://github.com/torch/torch7/blob/master/doc/tensor.md#torchtensortable

If your cdata represents a contiguous array of data then you can use ffi.copy. Here is a toy example:
require 'torch'
ffi = require 'ffi'
-- create a random float array
n = 3
x = torch.rand(n):float()
cdata = x:data()
assert(type(cdata) == 'cdata')
-- copy this cdata into a destination tensor
y = torch.FloatTensor(n)
ffi.copy(y:data(), cdata, n*ffi.sizeof('float'))
assert(x:equal(y))

Related

How to read a complex 3D matrix (binary file) in Matlab without using interleaved/reshaping method?

I have a very large 3D matrix whose data was written to disk for future use. Writing the matrix to a binary file is easy; reading it back, however, has some issues.
Write to bin:
z=repmat(complex(rand(5),rand(5)),[1 1 5])
z_imag = imag(z);
z_real = real(z);
adjacent = [z_real z_imag];
fileID = fopen('complex.bin','w');
fwrite(fileID,adjacent,'double')
And now, I try to read it back using memmapfile:
m = memmapfile('complex.bin', 'Offset', 0, 'Format', {'double' [5,5,5] 'x'});
complexValues = complex(m.Data(:).x(1,:), m.Data(:).x(2,:)); %this line doesn't work though, just for explanation's sake
It gave me an error saying that
Error using memmapfile/subsref (line 764) A subscripting operation on
the Data field attempted to create a comma-separated list. The
memmapfile class does not support the use of comma-separated lists
when subscripting.
I was referring to the solution here; the suggested solution uses reshape to shape the matrix beforehand (in contrast to my method above). I am trying to avoid reshape in my code because I'm dealing with very large data, and it might be computationally expensive and take a long time. Is there an alternative or better way to do this?
Thanks in advance!

Work with binary numbers as scalars in Matlab

I am working with a MATLAB function that uses numbers in base 2. To do so, it uses the function dec2bin to transform an integer into a char array containing the binary information. The issue is that I plan to use HDL Coder to generate an HDL version of the function. One step of the process is to convert the variables to fixed point. This can be done automatically when the data is a scalar, so is there any way to manage binary numbers without using vectors?
dec2bin is just for display purposes. Numbers are always stored in the computer using binary representation. You can use the functions
bitand,
bitor,
bitxor,
bitcmp,
bitshift,
bitget, and
bitset
to do bit-wise manipulation of integer numbers:
>> a = uint32(7);
>> b = uint32(12);
>> bitand(a, b)
ans =
uint32
4
(See the documentation of the functions listed above for details. In MATLAB, help bitand shows a shorter version of the documentation and doc bitand shows the full documentation.)
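For readers who also prototype outside MATLAB, here is a minimal Python sketch of the same idea: the number stays a plain integer scalar and the bits are manipulated with operators instead of a character array. The helper names get_bit and set_bit are hypothetical, chosen only to mirror MATLAB's bitget and bitset (which count bit positions from 1 at the least significant bit); this is an illustration, not something HDL Coder consumes.

```python
# Bit-wise work on plain integer scalars, no character arrays involved.
a = 7    # binary 0111
b = 12   # binary 1100

and_ab = a & b     # counterpart of bitand(a, b)
or_ab = a | b      # counterpart of bitor(a, b)
xor_ab = a ^ b     # counterpart of bitxor(a, b)
shifted = a << 2   # counterpart of bitshift(a, 2)


def get_bit(value, position):
    """Return the bit at `position`, counting from 1 at the least significant bit."""
    return (value >> (position - 1)) & 1


def set_bit(value, position):
    """Return `value` with the bit at `position` (1-based from the LSB) set to 1."""
    return value | (1 << (position - 1))


print(and_ab, or_ab, xor_ab, shifted)
print(get_bit(b, 3), set_bit(b, 1))
```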

error in array of struct matlab

I want to train on data with various labels using an SVM and store the SVM models as an array of structs. I am doing it like this, but I get the error:
Subscripted assignment between dissimilar structures.
Please help me out.
model = repmat(struct(),size);
for i=1:size
model(i) = svmtrain(train_data,labels(:,i),'Options', options);
end
A structure array in MATLAB can only contain structures with identical fields. By first creating an array of empty structures [1] and then trying to fill it with SVMStruct structures, you try to create a mixed array with some empty structures and some SVMStruct structures. This is not possible, so MATLAB complains about "dissimilar structures" (not-equal structures: empty vs. SVMStruct).
To allocate an array of structs, you will have to specify all fields during initialization and fill them all with initial values, which is rather inconvenient in this case. A very simple alternative is to drop this initialization and run your loop the other way around [2], [3]:
for ii=sizeOfLabels:-1:1
model(ii) = svmtrain(train_data,labels(:,ii),'Options', options);
end
That way, for ii=sizeOfLabels, e.g. ii=100, MATLAB will call model(100)=..., while model doesn't exist yet. It will then allocate all space needed for 100 SVMStructs and fill the first 99 instances with empty values. That way you pre-allocate the memory without having to worry about initializing the values.
[1] Note: if e.g. size=5, calling repmat(struct(),size) will create a 5-by-5 matrix of empty structs. To create a 1-by-5 array of structs, call repmat(struct(),1,size).
[2] Don't use size as a variable name, as this is a function. If you do that, you can't use the size function anymore.
[3] i and j denote the imaginary unit in MATLAB. Using them as a variable slows the code down and is error-prone. Use e.g. k or ii for loops instead.

(De)serialize MATLAB graph objects from and to Python

MATLAB has a representation of a directed/undirected graph. I would like to deserialize a graph with many node and edge attributes serialized via MATLAB's save function into Python. I know about scipy.io.loadmat and h5py's File (for MATLAB v7.3 saved files), but neither seems to produce a representation in Python that actually holds intelligible vertex/edge data.
How do I do this? I'm concerned with this and the inverse operation, i.e. writing an object from Python to a format that MATLAB's load can read. Is there a byte-level description somewhere of what a serialized MATLAB object and/or a graph looks like?
For example, in MATLAB I could:
s = [1 1 2 2 3];
t = [2 4 3 4 4];
G = digraph(s,t);
G.Edges.Rand = rand(size(G.Edges)); % Add an edge attribute
G.Nodes.Val = rand(size(G.Nodes)); % Add a node attribute
save('loadmat.mat', 'G'); % Readable by scipy.io.loadmat
save('h5py.mat', 'G', '-v7.3'); % Readable by h5py.File
then, in Python I could read these
from scipy.io import loadmat
G0 = loadmat('loadmat.mat')
from h5py import File
G1 = File('h5py.mat')
Neither seems to give me the vertex/edge data or am I just missing it?
Thanks
The lengthy but certain way to do this is to define a schema for G in something like Google Protocol Buffers. With that schema you can automatically generate serialiser source code for Python and Java (which you could use in Matlab). This would allow you to exchange G between Matlab and Python, or indeed between anything else.
You would probably have to hand write code to translate between how G is stored in Matlab and however protoc chose to represent in Java the message you had defined in the schema. You won't (I'm fairly certain) be able to do Ggpb = G. If both Java and Matlab supported type reflection then you could probably write something neat to do it automatically...
The digraph object is a user-defined object type, not one of the fundamental MATLAB types, so it is no wonder that Python doesn't understand it. Note that even MATLAB will not understand the saved object layout if it doesn't have a working definition of the saved object's class accessible on the path.
You might want to save the adjacency matrix:
A = full(adjacency(G));
save('adjacency.mat', 'A');
or the incidence matrix:
I = full(incidence(G));
save('incidence.mat', 'I');
whatever suits you better.
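On the Python side, a saved adjacency matrix can be turned back into an edge list with a few lines. The following is only a sketch: the function name edges_from_adjacency is hypothetical, and the matrix is written out by hand (it is the adjacency matrix of the digraph(s,t) example from the question) so that the snippet runs without any .mat file.

```python
def edges_from_adjacency(adjacency):
    """Return 1-based (source, target) pairs for every non-zero entry.

    adjacency[i][j] != 0 is read as an edge from node i+1 to node j+1,
    the row-to-column convention used by MATLAB's adjacency(G).
    """
    edges = []
    for i, row in enumerate(adjacency):
        for j, weight in enumerate(row):
            if weight != 0:
                edges.append((i + 1, j + 1))
    return edges


# Adjacency matrix for s = [1 1 2 2 3], t = [2 4 3 4 4], typed in by hand;
# in practice it would be read from the saved file instead.
A = [
    [0, 1, 0, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
edges = edges_from_adjacency(A)
print(edges)
```

With SciPy installed, loadmat('adjacency.mat')['A'] should supply the matrix as a NumPy array, which can be passed to the same function. Node and edge attributes such as G.Edges.Rand are not part of the adjacency matrix, so they would have to be saved as separate variables.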
Late edit
Another way is to force the object to become a POD (plain old data) that has better chances to be understood by loadmat:
S = struct(G);
save('pod_digraph.mat', 'S');
Bear in mind that this exposes all of the object's internal information, and the dependent properties are saved as they are; you will need to recreate the class's interface yourself in order to maintain consistency (e.g. the adjacency matrix and the incidence matrix can both be constructed on the fly from the same internal information, which may look like neither of them). Also, one cannot convert a POD back to the original object unless the constructor knows how to do this.

Conversion from Matlab CSC to CSR format

I am using mex bridge to perform some operations on Sparse matrices from Matlab.
For that I need to convert input matrix into CSR (compressed row storage) format, since Matlab stores the sparse matrices in CSC (compressed column storage).
I was able to get the value array and the column_indices array. However, I am struggling to get the row_pointer array for the CSR format. Is there any C library that can help with the conversion from CSC to CSR?
Further, when writing a CUDA kernel, is it efficient to use the CSR format for sparse operations, or should I just use the following arrays: row indices, column indices and values?
Which one would give me more control over the data, minimizing the number of for-loops in the custom kernel?
Compressed row storage is similar to compressed column storage, just transposed. So the simplest thing is to use MATLAB to transpose the matrix before you pass it to your MEX file. Then, use the functions
Ap = mxGetJc(spA);
Ai = mxGetIr(spA);
Ax = mxGetPr(spA);
to get the internal pointers and treat them as row storage. Ap is row pointer, Ai is column indices of the non-zero entries, Ax are the non-zero values. Note that for symmetric matrices you do not have to do anything at all! CSC and CSR are the same.
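If transposing in MATLAB is not an option, the conversion itself is a counting sort over the row indices. The sketch below is written in Python for readability and is not taken from any library; the function name csc_to_csr and its argument names are hypothetical. It assumes 0-based indices, which is what mxGetJc and mxGetIr return.

```python
def csc_to_csr(n_rows, n_cols, col_ptr, row_idx, values):
    """Convert 0-based CSC arrays into 0-based CSR arrays."""
    nnz = len(values)

    # Count the entries in each row, then prefix-sum into row pointers.
    row_ptr = [0] * (n_rows + 1)
    for r in row_idx:
        row_ptr[r + 1] += 1
    for r in range(n_rows):
        row_ptr[r + 1] += row_ptr[r]

    # Scatter each entry into the next free slot of its row. Walking the
    # columns in increasing order keeps column indices sorted within a row.
    col_idx = [0] * nnz
    csr_values = [0] * nnz
    next_slot = row_ptr[:n_rows]  # copy: next free position per row
    for c in range(n_cols):
        for k in range(col_ptr[c], col_ptr[c + 1]):
            r = row_idx[k]
            dest = next_slot[r]
            col_idx[dest] = c
            csr_values[dest] = values[k]
            next_slot[r] += 1
    return row_ptr, col_idx, csr_values


# 2-by-3 example matrix [[1, 0, 2], [0, 3, 0]] stored column by column.
row_ptr, col_idx, csr_values = csc_to_csr(
    2, 3, [0, 1, 2, 3], [0, 1, 0], [1.0, 3.0, 2.0]
)
print(row_ptr, col_idx, csr_values)
```

The row pointer array has one entry per row plus one, so rows without any non-zero entries are handled naturally: they simply repeat the previous offset.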
Which format to use heavily depends on what you want to do with the matrix later. For example, have a look at matrix formats for Sparse matrix vector multiplication. That is one of the classic papers, research has moved since then so you can look around further.
I ended up converting CSC format from Matlab to CSR using CUSP library as follows.
After getting the matrix A from Matlab, I extracted its row, col and values vectors and copied them into a thrust::host_vector created for each of them.
After that I created two cusp::array1d of type Indices and Values as follows.
typedef typename cusp::array1d<int,cusp::host_memory>Indices;
typedef typename cusp::array1d<float,cusp::host_memory>Values;
Indices row_indices(rows.begin(),rows.end());
Indices col_indices(cols.begin(),cols.end());
Values Vals(Val.begin(),Val.end());
where rows, cols and Val are thrust::host_vector that I got from Matlab.
After that I created a cusp::coo_matrix_view as given below.
typedef cusp::coo_matrix_view<Indices,Indices,Values>HostView;
HostView Ah(m,n,NNZa,row_indices,col_indices,Vals);
where m,n and NNZa are the parameters that I get from mex functions of sparse matrices.
I copied this view matrix to a cusp::csr_matrix in device memory with the proper dimensions set, as given below.
cusp::csr_matrix<int,float,cusp::device_memory>CSR(m,n,NNZa);
CSR = Ah;
After that I just copied the three individual content arrays of this CSR matrix back to the host using thrust::raw_pointer_cast where arrays with proper dimension are already mxCalloced as given below.
cudaMemcpy(Acol,thrust::raw_pointer_cast(&CSR.column_indices[0]),sizeof(int)*(NNZa),cudaMemcpyDeviceToHost);
// row_offsets holds one entry per row plus one, i.e. m+1 values (m is the number of rows)
cudaMemcpy(Aptr,thrust::raw_pointer_cast(&CSR.row_offsets[0]),sizeof(int)*(m+1),cudaMemcpyDeviceToHost);
cudaMemcpy(Aval,thrust::raw_pointer_cast(&CSR.values[0]),sizeof(float)*(NNZa),cudaMemcpyDeviceToHost);
Hope this is useful to anyone who is using CUSP with Matlab
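You can do something like this:
n = size(M,1);
nz_num = nnz(M);
% find on the transpose returns the entries sorted by row of M
[col,rowi,vals] = find(M');
row = zeros(n+1,1);
ll = 1; row(1) = 1;
% loop over the non-zero entries (not the rows) and record where each new row starts;
% this assumes every row of M has at least one non-zero entry
for l = 2:nz_num
if rowi(l)~=rowi(l-1)
ll = ll + 1;
row(ll) = l;
end
end
row(n+1) = nz_num+1;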
It works for me, hope it can help somebody else!