Matlab's sparse function explanation

This is my first time posting anything, so please be nice!
I'm studying some code for a random walker algorithm and I got lost at the point where sparse is used to build the sparse Laplacian matrix of a point and edge set. I'm planning to write my own version of the sparse function, but I'm having trouble understanding how it works and what its output means, so any help would be perfect.
Thank you all!

A sparse matrix is a special type of "matrix" in matlab, which is conceptually equivalent to a normal matrix, but works differently 'under the hood'.
They are called "sparse" because they are usually used in situations where one would expect most elements of the matrix to contain zeros, with only a few non-zero elements.
The advantage of using this type of special object is that the memory it takes to create such an object depends primarily on the number of nonzero elements contained, rather than the size of the "actual" matrix.
By contrast, a normal (full) matrix needs memory allocated relative to its size. So, for instance, a 1000x1000 matrix of numbers (so-called 'doubles') will take roughly 8 MB to store (1 million elements at 8 bytes per 'double'), even if all of the elements are zero. Observe:
>> a = zeros(1000,1000);
>> b = sparse(1000,1000);
>> whos
Name      Size                 Bytes  Class     Attributes
a         1000x1000          8000000  double
b         1000x1000             8024  double    sparse
Now, assign a value to each of them at subscripts (1,1) and see what happens:
>> a(1,1) = 1 % without a semicolon, this will flood your screen with zeros
>> b(1,1) = 1
b =
(1,1) 1
As you can see, the sparse matrix only keeps track of nonzero values, and the zeros are 'implied'.
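If you want to see exactly what is being stored, nnz and the three-output form of find expose the underlying data. For example, at this point:
>> nnz(b)                % number of stored nonzeros (1 so far)
>> [i,j,v] = find(b)     % row indices, column indices and values of the nonzeros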
Now let's add some more elements:
>> a(1:100,1:100) = 1;
>> b(1:100,1:100) = 1;
>> whos
Name      Size                 Bytes  Class     Attributes
a         1000x1000          8000000  double
b         1000x1000           168008  double    sparse
As you can see, the allocated memory for a hasn't changed, because the size of the overall array hasn't changed. Whereas for b, because it now contains more nonzero values, it takes up more space in memory.
In general, sparse matrices work with most of the same operations as normal matrices. The reason for this is that most 'normal' functions are explicitly defined to also accept sparse matrices, but treat them differently under the hood (i.e. they try to arrive at the same result, but use a different internal approach that is better suited to sparse storage). E.g.:
>> c = sum(a(:))
c =
10000
>> d = sum(b(:))
d =
(1,1) 10000
You can 'convert' a full matrix directly to a sparse one with the sparse command, and a sparse matrix back to a "full" matrix with the full command:
>> sparse(c)
ans =
(1,1) 10000
>> full(d)
ans =
10000
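Finally, since the original question is about building a sparse graph Laplacian, it is worth knowing the triplet form sparse(i,j,v,m,n), which is most likely what the random walker code uses. Here is a minimal sketch on a made-up 4-node edge list (the variable names are purely illustrative):
edges = [1 2; 2 3; 3 4; 4 1];     % assumed edge list: one row per undirected edge
n = 4;                            % number of nodes
i = [edges(:,1); edges(:,2)];     % row indices of the off-diagonal entries
j = [edges(:,2); edges(:,1)];     % column indices (both directions, so W is symmetric)
W = sparse(i, j, 1, n, n);        % sparse adjacency matrix with unit weights
L = diag(sum(W,2)) - W;           % graph Laplacian: degree matrix minus adjacency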

Related

What's the maximum length of matrix I can store in Matlab

I am trying to store a matrix of size 4 x 10^6, but Matlab can't seem to do it when I run the code; it's as if it can't store a matrix of that size, or I should use another way to store it. The code is as below:
matrix = [];
for j = 1 : 10^6
x = randn(4,1);
matrix = [matrix x];
end
The problem is that it keeps running for a long time and never finishes; however, when I remove the line matrix = [matrix x]; it finishes the loop very quickly. What I ultimately need is to have the matrix in a file so that I can use it wherever I need it.
It is determined by your amount of available RAM. If you store double values, like here, you require 64 bits per number. Thus, storing 4M values requires 4*10^6*64 = 256M bits, which in turn is 32MB RAM.
A = rand(4,1e6);
whos A
Name      Size                 Bytes  Class     Attributes
A         4x1000000         32000000  double
So you can only fail to store this if you have less than 32 MB of RAM free.
The reason your code takes so long is that you grow your matrix in place. The orange wiggles on the line matrix = [matrix x]; are not because the festive season is almost here, but because it is very bad practice to do this. As the warning tells you: preallocate your matrix. You know how large it will be, so just initialise it as matrix = zeros(4,1e6); instead of growing it.
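For the loop as written, a minimal sketch of the preallocated version looks like this:
matrix = zeros(4, 1e6);           % allocate the full 4-by-1e6 array once, up front
for j = 1 : 10^6
    matrix(:,j) = randn(4,1);     % fill column j in place instead of growing the array
end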
Of course in this case you can simply do matrix = rand(4,1e6), which is even faster than looping.
For more information about preallocation see the official MATLAB documentation, this question (which I answered), or this one.
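As for keeping the matrix in a file for later reuse (the last part of the question), a minimal sketch using save and load, with 'matrix.mat' as a purely illustrative filename:
matrix = rand(4, 1e6);
save('matrix.mat', 'matrix');     % write the variable to a MAT-file
data = load('matrix.mat');        % later, in another script or session
matrix = data.matrix;             % recover the stored array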

Alternate method for bin2dec function [duplicate]

I have a 12-bit binary that I need to convert to a decimal. For example:
A = [0,1,1,0,0,0,0,0,1,1,0,0];
Bit 1 is the most significant bit, Bit 12 is the least significant bit.
Note: This answer applies primarily to unsigned data types. For converting to signed types, a few extra steps are necessary, discussed here.
The bin2dec function is one option, but requires you to change the vector to a string first. bin2dec can also be slow compared to computing the number yourself. Here's a solution that's about 75 times faster:
>> A = [0,1,1,0,0,0,0,0,1,1,0,0];
>> B = sum(A.*2.^(numel(A)-1:-1:0))
B =
1548
To explain, A is multiplied element-wise by a vector of powers of 2, with the exponents ranging from numel(A)-1 down to 0. The resulting vector is then summed to give the integer represented by the binary pattern of zeroes and ones, with the first element in the array being considered the most significant bit. If you want the first element to be considered the least significant bit, you can do the following:
>> B = sum(A.*2.^(0:numel(A)-1))
B =
774
Update: You may be able to squeeze even a little more speed out of MATLAB by using find to get the indices of the ones (avoiding the element-wise multiplication and potentially reducing the number of exponent calculations needed) and using the pow2 function instead of 2.^...:
B = sum(pow2(find(flip(A))-1)); % Most significant bit first
B = sum(pow2(find(A)-1)); % Least significant bit first
Extending the solution to matrices...
If you have a lot of binary vectors you want to convert to integers, the above solution can easily be modified to convert all the values with one matrix operation. Suppose A is an N-by-12 matrix, with one binary vector per row. The following will convert them all to an N-by-1 vector of integer values:
B = A*(2.^(size(A, 2)-1:-1:0)).'; % Most significant bit first
B = A*(2.^(0:size(A, 2)-1)).'; % Least significant bit first
Also note that all of the above solutions automatically determine the number of bits in your vector by looking at the number of columns in A.
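As a quick illustrative check of the matrix form (the second row is made up for this example):
A = [0,1,1,0,0,0,0,0,1,1,0,0; 1,0,0,0,0,0,0,0,0,0,0,1];
B = A*(2.^(size(A, 2)-1:-1:0)).'   % gives [1548; 2049] with the MSB first in each row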
Dominic's answer assumes you have access to the Data Acquisition toolbox. If not, use bin2dec:
A = [0,1,1,0,0,0,0,0,1,1,0,0];
bin2dec( sprintf('%d',A) )
or (in reverse)
A = [0,1,1,0,0,0,0,0,1,1,0,0];
bin2dec( sprintf('%d',A(end:-1:1)) )
depending on what you intend to be bit 1 and 12!
If the MSB is right-most (I'm not sure what you mean by Bit 1, sorry if that seems stupid):
Try:
binvec2dec(A)
Output should be:
ans =
774
If the MSB is left-most, use fliplr(A) first.

Matlab csvread: 450k+ columns

I am attempting to read a file with Matlab's csvread. It has 10 rows and ~900,000 columns. In the case of such a large number of columns, the function returns a vector, as opposed to a matrix of proper dimensions.
In order to test this, I truncated it to two sizes using cut, and when there are 457,000 columns we have the same behavior:
>> A = csvread( 'test.csv' );
>> size(A)
ans =
4570000 1
But when cut down to 45,700 columns, we have the desired behavior:
>> A = csvread( 'test.csv' );
>> size(A)
ans =
10 45700
Of course, Matlab is capable of handling matrices of size 10x457,000, and I suppose I could use fscanf in a loop (though I suspect that would be less efficient), but I was wondering if anyone had any insight.
EDIT: I suppose I could also just reshape the vector into a matrix of the proper dimensions - but I would still like to understand this seemingly strange behavior.
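For reference, a hedged sketch of that reshape workaround, assuming the returned vector holds the file's values row by row (worth verifying against a small sample of the data first):
v = csvread('test.csv');          % comes back as 4570000-by-1 instead of 10-by-457000
A = reshape(v, 457000, 10).';     % rebuild the intended 10-by-457000 matrix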

Matlab fast neighborhood operation

I have a problem. I have a matrix A with integer values between 0 and 5,
for example:
x = randi(5,10,10)
Now I want to apply a filter, size 3x3, which gives me the most common value in each neighborhood.
I have tried 2 solutions:
fun = @(z) mode(z(:));
y1 = nlfilter(x,[3 3],fun);
which takes very long...
and
y2 = colfilt(x,[3 3],'sliding',@mode);
which also takes long.
I have some really big matrices and both solutions take a long time.
Is there any faster way?
+1 to @Floris for the excellent suggestion to use hist. It's very fast. You can do a bit better though: hist is based on histc, which can be used instead. histc is a compiled function, i.e. not written in Matlab, which is why the solution is much faster.
Here's a small function that attempts to generalize what @Floris did (his solution also returns a vector rather than the desired matrix) and achieve what you're doing with nlfilter and colfilt. It doesn't require that the input have particular dimensions and uses im2col to efficiently rearrange the data. In fact, the first three lines and the call to im2col are virtually identical to what colfilt does in your case.
function a = intmodefilt(a,nhood)
[ma,na] = size(a);
aa(ma+nhood(1)-1,na+nhood(2)-1) = 0;                                % allocate a zero-padded working array
aa(floor((nhood(1)-1)/2)+(1:ma),floor((nhood(2)-1)/2)+(1:na)) = a;  % place the input at its centre
% Rearrange every sliding neighborhood into a column, count the occurrences of each
% value per column, and take the bin with the highest count (the mode of that window)
[~,a(:)] = max(histc(im2col(aa,nhood,'sliding'),min(a(:))-1:max(a(:))));
a = a-1;                          % map bin indices back to values (the extra leading bin absorbs the padding zeros)
Usage:
x = randi(5,10,10);
y3 = intmodefilt(x,[3 3]);
For large arrays, this is over 75 times faster than colfilt on my machine. Replacing hist with histc is responsible for a factor of two speedup. There is of course no input checking so the function assumes that a is all integers, etc.
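A rough timing sketch of that comparison (the exact numbers will vary with machine, MATLAB version and array size):
x = randi(5, 500, 500);
t1 = timeit(@() colfilt(x, [3 3], 'sliding', @mode));   % the original colfilt approach
t2 = timeit(@() intmodefilt(x, [3 3]));                 % the histc-based function above
fprintf('colfilt: %.3f s, intmodefilt: %.3f s\n', t1, t2);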
Lastly, note that randi(IMAX,N,N) returns values in the range 1:IMAX, not 0:IMAX as you seem to state.
One suggestion would be to reshape your array so each 3x3 block becomes a column vector. If your initial array dimensions are divisible by 3, this is simple; if they aren't, you need to work a little bit harder. And you need to repeat this nine times, starting at different offsets into the matrix - I will leave that as an exercise.
Here is some code that shows the basic idea (using only functions available in FreeMat - I don't have Matlab on my machine at home...):
N = 100;
A = randi(0,5*ones(3*N,3*N));
B = reshape(permute(reshape(A,[3 N 3 N]),[1 3 2 4]), [ 9 N*N]);
hh = hist(B, 0:5); % histogram of each 3x3 block: bin with largest value is the mode
[mm mi] = max(hh); % mi will contain bin with largest value
figure; hist(B(:),0:5); title 'histogram of B'; % flat, as expected
figure; hist(mi-1, 0:5); title 'histogram of mi' % not flat?...
The strange thing, when you run this code, is that the distribution of mi is not flat, but skewed towards smaller values. When you inspect the histograms, you will see why: you will frequently have more than one bin containing the "max" count, and in that case max returns the first such bin. This is obviously going to skew your results badly. A much better filter might be a median filter - one that has equal numbers of neighboring pixels above and below. The median has a unique solution, whereas the mode can be ambiguous for up to four values out of nine pixels (namely, four bins with two values each).
Something to think about.
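If a median filter is acceptable, a minimal sketch (assuming the Image Processing Toolbox is available):
x = randi(5,10,10);
y = medfilt2(x, [3 3]);   % 3x3 sliding median instead of the mode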
Can't show you a mex example today (wrong computer); but there are ample good examples on the Mathworks website (and all over the web) that are quite easy to follow. See for example http://www.shawnlankton.com/2008/03/getting-started-with-mex-a-short-tutorial/
