Hash function for binary matrices (Matlab)

I am working with big binary 2D matrices that are stored in a vector; every time a new matrix is obtained it is appended to this vector, which can reach about 500 to 1000 elements. Is there a more efficient way to store these matrices, maybe with a hash function? When two elements of the vector coincide, what I need is their positions in the vector, not the matrix itself. I am working with Matlab.
This is executed after a new matrix is obtained:
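states = cat(3, states, new_state); % stack along the third dimension, which the loop below indexes
for i = 1:size(states,3)-1
    if isequal(states(:,:,end), states(:,:,i))
        found = 1;
        num = size(states,3) - i; % how many steps back the matching matrix is
        break
    end
end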
The matrices are binary, for example:
new_state = [1 0 0 0; 0 0 0 1; 1 1 0 1; 1 1 0 0];
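One way to sketch the hashing idea from the question is to turn each matrix into a character key and store the position of its first occurrence in a containers.Map. This is only a minimal sketch: it assumes all matrices have the same size, and the names history, seen and key are hypothetical, not part of the original code.

```matlab
% Sketch: detect a repeated binary matrix through a string key.
seen = containers.Map('KeyType', 'char', 'ValueType', 'double');
history = {[1 0; 0 1], [1 1; 0 0], [1 0; 0 1]};  % hypothetical sequence of states
found = 0;
num = 0;
for k = 1:numel(history)
    new_state = history{k};
    key = char(new_state(:).' + '0');  % flatten to a string of '0' and '1' characters
    if isKey(seen, key)
        found = 1;
        num = k - seen(key);           % distance back to the earlier occurrence
        break
    end
    seen(key) = k;                     % remember where this state first appeared
end
```

Each lookup then costs one map query instead of a loop of isequal calls over all stored matrices, and only the keys need to be kept rather than the matrices themselves.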

Related

How to get indexes of logical matrix without using find in matlab?

Let's assume my matrix A is the output of a comparison function, i.e. a logical matrix containing only 0s and 1s. For a small 3x4 matrix, we might have something like:
A =
1 1 0 0
0 0 1 0
0 0 1 1
Now, I am generating another matrix B of the same size as A, where each row holds the column indexes of the non-zero entries in the corresponding row of A, and any leftover positions in the row are set to zero.
B =
1 2 0 0
3 0 0 0
3 4 0 0
Currently, I am using the find function on each row of A to get matrix B. The complete code is:
A=[1,1,0,0;0,0,1,0;0,0,1,1];
[rows,columns]=size(A);
B=zeros(rows,columns);
for i=1:rows
currRow=find(A(i,:));
B(i,1:length(currRow))=currRow;
end
For large matrices, the find function takes most of the calculation time according to the Matlab Profiler. Is there any way to generate matrix B faster?
Note:
Matrix A has more than 1000 columns, but no row ever has more than 50 non-zero elements. Here I make B the same size as A, but B could have far fewer columns.
I would suggest using parfor, but the overhead is too much here, and there are more issues with it, so it is not a good solution.
rows = 5e5;
cols = 1000;
A = rand(rows, cols) < 0.050;
I = uint16(1:cols);
B = zeros(size(A), 'uint16');
% [r,c] = find(A);
tic
for i=1:rows
% currRow = find(A(i,:));
currRow = I(A(i,:));
B(i,1:length(currRow)) = currRow;
end
toc
@Cris suggests replacing find with an indexing operation, which improves performance by about 10%.
There is apparently no better optimization as long as B has to be in the specific form you describe. If the indexes are not required in matrix form, I suggest using [r,c] = find(A); instead.
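If B is needed in the row-packed form after all, the output of [r,c] = find(A) can be combined with cumsum to fill B without a loop. The following is a minimal sketch on the small example from the question; the variable names pos and packedCol are hypothetical, and whether it beats the loop on large inputs has not been measured here.

```matlab
A = logical([1 1 0 0; 0 0 1 0; 0 0 1 1]);
[r, c] = find(A);        % row and column subscripts of the non-zeros (column-major order)
pos = cumsum(A, 2);      % running count of non-zeros along each row
packedCol = pos(A);      % rank of each non-zero within its row, same order as find
B = zeros(size(A));
B(sub2ind(size(A), r, packedCol)) = c;  % write column index c at its packed position
```

Both find(A) and the logical index pos(A) traverse A in column-major order, so r, c and packedCol line up element by element.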

Finding equal rows in Matlab

I have a matrix suppX in Matlab with size GxN and a matrix A with size MxN. I would like your help to construct a matrix Xresponse with size GxM with Xresponse(g,m)=1 if the row A(m,:) is equal to the row suppX(g,:) and zero otherwise.
Let me explain better with an example.
suppX=[1 2 3 4;
5 6 7 8;
9 10 11 12]; %GxN
A=[1 2 3 4;
1 2 3 4;
9 10 11 12;
1 2 3 4]; %MxN
Xresponse=[1 1 0 1;
0 0 0 0;
0 0 1 0]; %GxM
I have written code that does what I want.
Xresponsemy=zeros(size(suppX,1), size(A,1));
for x=1:size(suppX,1)
Xresponsemy(x,:)=ismember(A, suppX(x,:), 'rows').';
end
My code uses a loop. I would like to avoid this because in my real case this piece of code is part of another big loop. Do you have suggestions without looping?
One way to do this would be to treat the rows of each matrix as vectors in N-dimensional space and compute the Euclidean (L2) distance between pairs of rows. If the distance is 0, you have a match. Specifically, you can create a matrix whose element (i,j) is the distance between row i of one matrix and row j of the other.
You can then solve your problem by modifying this distance matrix so that 1 means the two vectors are identical and 0 means they are not.
This post should be of interest: Efficiently compute pairwise squared Euclidean distance in Matlab.
I would specifically look at the answer by Shai Bagon that uses matrix multiplication and broadcasting. You would then modify it so that you find distances that would be equal to 0:
nA = sum(A.^2, 2); % squared norm of each row of A
nB = sum(suppX.^2, 2); % squared norm of each row of suppX
Xresponse = bsxfun(@plus, nB, nA.') - 2 * suppX * A.'; % pairwise squared distances, GxM
Xresponse = Xresponse == 0;
We get:
Xresponse =
3×4 logical array
1 1 0 1
0 0 0 0
0 0 1 0
Note on floating-point efficiency
Because you are using ismember in your implementation, it's implicit to me that you expect all values to be integer. In this case, you can very much compare directly with the zero distance without loss of accuracy. If you intend to move to floating-point, you should always compare with some small threshold instead of 0, like Xresponse = Xresponse <= 1e-10; or something to that effect. I don't believe that is needed for your scenario.
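For the floating-point case, a minimal sketch of the thresholded comparison could look as follows. The data and the tolerance tol = 1e-10 are assumptions chosen for illustration; a suitable tolerance depends on the scale of the values.

```matlab
suppX = [0.1 0.2; 0.3 0.4];                      % hypothetical GxN data
A = [0.1 + 1e-15, 0.2; 0.5 0.6; 0.3 0.4];        % hypothetical MxN data, first row slightly perturbed
nA = sum(A.^2, 2);                               % squared norm of each row of A
nS = sum(suppX.^2, 2);                           % squared norm of each row of suppX
D = bsxfun(@plus, nS, nA.') - 2 * suppX * A.';   % pairwise squared distances, GxM
tol = 1e-10;                                     % assumed tolerance
Xresponse = abs(D) <= tol;                       % abs guards against tiny negative round-off
```

Here the perturbed first row of A still counts as equal to the first row of suppX, which an exact comparison with 0 would not guarantee.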
Here's an alternative to @rayryeng's answer: reduce each row of the two matrices to a unique identifier using the third output of unique with the 'rows' input flag, and then compare the identifiers with singleton expansion (broadcast) using bsxfun:
[~, ~, w] = unique([A; suppX], 'rows');
Xresponse = bsxfun(@eq, w(1:size(A,1)).', w(size(A,1)+1:end));
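To make the role of the identifier vector concrete, here is the same approach run on the example data from the question, with the intermediate values spelled out in comments. The names idA and idS are hypothetical helpers introduced only for readability.

```matlab
suppX = [1 2 3 4; 5 6 7 8; 9 10 11 12];            % GxN
A = [1 2 3 4; 1 2 3 4; 9 10 11 12; 1 2 3 4];       % MxN
[~, ~, w] = unique([A; suppX], 'rows');            % w = [1 1 3 1 1 2 3].'
idA = w(1:size(A,1)).';                            % identifiers of the rows of A, 1xM
idS = w(size(A,1)+1:end);                          % identifiers of the rows of suppX, Gx1
Xresponse = bsxfun(@eq, idA, idS);                 % GxM logical comparison
```

Equal rows receive equal identifiers, so comparing the identifiers is equivalent to comparing the rows themselves.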

How to get the logical matrix corresponding to "scatter" plot?

If I have a two-column matrix A like the one below, I can draw a scatter plot using the scatter or plot command. I would like to get the matrix corresponding to such a plot as an output, in the same way that the hist command can return its result as a vector.
A=[7 1;3 2; 4 3]
For example out=scatter(A(:,1),A(:,2)) must give something like below:
[0 0 0;
0 0 0;
0 1 0;
0 0 1;
0 0 0;
0 0 0;
1 0 0]
Only the entries at indices (7,1), (3,2) and (4,3) are ones. Can someone give me a code snippet that does this without using loops?
You can use a combination of sparse and full where you can specify the non-zero row and column locations, and the rest of the matrix would be zero:
A = [7 1; 3 2; 4 3];
B = full(sparse(A(:,1), A(:,2), 1, max(A(:,1)), max(A(:,2)))) == 1;
The sparse command takes the row and column locations of the non-zero entries as its first two inputs; the third input is the value to place at each of those locations. We can specify a constant to mean that every non-zero location gets the same value, which is 1 here. We can also specify the size of the matrix, where in this case the numbers of rows and columns of the output correspond to the largest number in the first and second columns of A respectively. Because the result is a sparse matrix, you will want to convert it to a full matrix, and because you want it to be logical, you will want to compare all elements with the number 1.
We thus get for the output, which is B:
B =
7×3 logical array
0 0 0
0 0 0
0 1 0
0 0 1
0 0 0
0 0 0
1 0 0
Alternatively, we can use sub2ind to create linear indices to index into a pre-allocated matrix of logical false and set only those non-zero row locations to true:
A = [7 1; 3 2; 4 3];
B = false(max(A(:,1)), max(A(:,2)));
ind = sub2ind(size(B), A(:,1), A(:,2));
B(ind) = true;
We first allocate the matrix, then calculate the linear indices to index into the matrix, then finally set the right locations to true. The output here would be the same as the sparse approach.
Just to add: rayryeng's solution is fine if you really want your result to be logical in the sense that it is equal to one if there is anything at the coordinate and zero otherwise. Still, since you added a note on hist, I was wondering if you actually want to count the number of times a specific coordinate is hit. In this case, consider using
S = histcounts2(A(:,2),A(:,1));
if you have access to R2015b+. If not, there is a hist2 function on fileexchange you can use for the purpose.
Here is my solution. Matlab provides a command called accumarray.
S = logical(accumarray(A, 1) )
will give the result too.
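As a small sketch of how accumarray covers both the logical and the counting interpretation, consider the case where a coordinate appears more than once. The duplicated point (3,2) below is a hypothetical addition to the example data.

```matlab
A = [7 1; 3 2; 4 3; 3 2];     % (3,2) appears twice (hypothetical duplicate)
counts = accumarray(A, 1);    % 7x3 matrix with the number of hits per coordinate
S = logical(counts);          % 7x3 logical matrix: true wherever at least one point falls
```

Here counts(3,2) is 2, while S only records that the coordinate is occupied.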

Creating a weight adjacency matrix

I need to assign weights to edges of a graph, from the following papers:
"Fast linear iterations for distributed averaging" by L. Xiao and S. Boyd
"Convex Optimization of Graph Laplacian Eigenvalues" by S. Boyd
I have the adjacency matrix for my graph (a 50 by 50 matrix), with 512 non-zero values.
I also have a 256 by 1 vector with the optimal weights.
For the software I'm using, I need a 50 by 50 matrix with the weight of edge (i,j) in the relevant position of the adjacency matrix (and with the opposite sign for edge (j,i)).
My attempt is below, but I can't get it working.
function weights = construct_weight_mtx(weight_list, Adj)
weights = zeros(size(Adj));
positions = find(Adj);
for i=1:length(positions)/2
if Adj(i) == 1
weights(i) = weight_list(i);
end
end
weights = weights - weights';
find(Adj) == find(weights);
end
You're finding the nonzero positions in the original adjacency matrix, but you're finding all of them. To get around this, you then take only the first half of those positions.
for i=1:length(positions)/2 ...
Unfortunately, this takes the indices from complete columns rather than just the positions below the diagonal. So if your matrix was all 1's, you'd be taking:
1 1 1 0 0 ...
1 1 1 0 0 ...
1 1 1 0 0 ...
...
instead of:
1 0 0 0 0 ...
1 1 0 0 0 ...
1 1 1 0 0 ...
...
To take the correct values, we just take the lower triangular portion of Adj and then find the nonzero positions of that:
positions = find(tril(Adj));
Now we have only the 256 positions below the diagonal and we can loop over all of the positions. Next, we need to fix the assignment in the loop:
for i=1:length(positions)
if Adj(i) == 1 %// we already know Adj(i) == 1 for all indices in positions
weights(i) = weight_list(i); %// we need to update weights(positions(i))
end
end
So this becomes:
for i=1:length(positions)
weights(positions(i)) = weight_list(i);
end
But if all we're doing is assigning 256 values to 256 positions, we can do that without a for loop:
weights(positions) = weight_list;
Note that the elements of weight_list must be in the proper order with the nonzero elements of the lower-triangular portion ordered by columns.
Completed code:
function weights = construct_weight_mtx(weight_list, Adj)
weights = zeros(size(Adj));
positions = find(tril(Adj));
weights(positions) = weight_list;
weights = weights - weights.'; %// .' is the plain transpose; ' is the complex conjugate transpose. Not a big deal here, but something to know
find(Adj) == find(weights); %// Not sure what this is meant to do; maybe an assert?
end
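To illustrate the ordering requirement, here is a minimal sketch of the same steps on a tiny graph, written inline rather than as a function call. The 3-node adjacency matrix and the two weights are hypothetical values chosen for illustration.

```matlab
Adj = [0 1 1; 1 0 0; 1 0 0];      % hypothetical 3-node graph with edges (2,1) and (3,1)
weight_list = [0.5; 0.25];        % hypothetical weights, one per edge
positions = find(tril(Adj));      % linear indices 2 and 3, ordered by columns
weights = zeros(size(Adj));
weights(positions) = weight_list; % 0.5 goes to (2,1), 0.25 goes to (3,1)
weights = weights - weights.';    % opposite sign for the mirrored edges
```

The first entry of weight_list lands on the first non-zero found when scanning the lower triangle column by column, so the weight vector has to follow that order.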

Matlab: Efficient assignment of values in a sparse matrix

I'm working in Matlab and I have the following problem:
I have a B matrix of nx2 elements, which contains indexes for the assignment of a big sparse matrix A (almost 500,000x80,000). For each row of B, the first column is the column index of A that has to contain a 1, and the second column is the column index of A that has to contain -1.
For example:
B= 1 3
2 5
1 5
4 1
5 2
For this B matrix, the corresponding A matrix has to look like this:
A= 1 0 -1 0 0
0 1 0 0 -1
1 0 0 0 -1
-1 0 0 1 0
0 -1 0 0 1
So, for the row i of B, the corresponding row i of A must be full of zeros except on A(i,B(i,1))=1 and A(i,B(i,2))=-1
This is very easy with a for loop over all the rows of B, but it's extremely slow. I also tried the following formulation:
A(:,B(:,1))=1
A(:,B(:,2))=-1
But Matlab gave me an "Out of Memory" error. If anybody knows a more efficient way to achieve this, please let me know.
Thanks in advance!
You can use the sparse function:
m = size(B,1); %// number of rows of A. Or choose larger if needed
n = max(B(:)); %// number of columns of A. Or choose larger if needed
s = size(B,1);
A = sparse(1:s, B(:,1), 1, m, n) + sparse(1:s, B(:,2), -1, m, n);
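As a sketch, the two sparse calls can also be merged into a single call by stacking the subscripts and values. The snippet below runs this on the example B from the question; the names rowIdx, colIdx and vals are hypothetical.

```matlab
B = [1 3; 2 5; 1 5; 4 1; 5 2];
s = size(B,1);                          % number of rows of A
n = max(B(:));                          % number of columns of A
rowIdx = [(1:s).'; (1:s).'];            % each row of A is used twice: once for +1, once for -1
colIdx = [B(:,1); B(:,2)];              % columns receiving +1, followed by columns receiving -1
vals = [ones(s,1); -ones(s,1)];
A = sparse(rowIdx, colIdx, vals, s, n); % builds the whole matrix in one call
```

Calling full(A) on this small case reproduces the 5x5 matrix shown in the question; for the real 500,000x80,000 problem the matrix should stay sparse.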
I think you should be able to do this using the sub2ind function. This function converts matrix subscripts to linear indices. You should be able to do it like so:
n = size(B,1); % number of rows of B
pind = sub2ind(size(A), (1:n).', B(:,1)); % positive indices
nind = sub2ind(size(A), (1:n).', B(:,2)); % negative indices
A(pind) = 1;
A(nind) = -1;
EDIT: I (wrongly, I think) assumed the sparse matrix A already existed. If it doesn't exist, then this method wouldn't be the best option.