Finding equal rows in Matlab

Finding equal rows in Matlab - matlab

I have a matrix suppX in Matlab with size GxN and a matrix A with size MxN. I would like your help to construct a matrix Xresponse with size GxM with Xresponse(g,m)=1 if the row A(m,:) is equal to the row suppX(g,:) and zero otherwise.
Let me explain better with an example.
suppX=[1 2 3 4;
5 6 7 8;
9 10 11 12]; %GxN
A=[1 2 3 4;
1 2 3 4;
9 10 11 12;
1 2 3 4]; %MxN
Xresponse=[1 1 0 1;
0 0 0 0;
0 0 1 0]; %GxM
I have written a code that does what I want.
Xresponsemy=zeros(size(suppX,1), size(A,1));
for x=1:size(suppX,1)
Xresponsemy(x,:)=ismember(A, suppX(x,:), 'rows').';
end
My code uses a loop. I would like to avoid this because in my real case this piece of code is part of another big loop. Do you have suggestions without looping?

One way to do this would be to treat each matrix as vectors in N dimensional space and you can find the L2 norm (or the Euclidean distance) of each vector. After, check if the distance is 0. If it is, then you have a match. Specifically, you can create a matrix such that element (i,j) in this matrix calculates the distance between row i in one matrix to row j in the other matrix.
You can treat your problem by modifying the distance matrix that results from this problem such that 1 means the two vectors completely similar and 0 otherwise.
This post should be of interest: Efficiently compute pairwise squared Euclidean distance in Matlab.
I would specifically look at the answer by Shai Bagon that uses matrix multiplication and broadcasting. You would then modify it so that you find distances that would be equal to 0:
nA = sum(A.^2, 2); % norm of A's elements
nB = sum(suppX.^2, 2); % norm of B's elements
Xresponse = bsxfun(#plus, nB, nA.') - 2 * suppX * A.';
Xresponse = Xresponse == 0;
We get:
Xresponse =
3×4 logical array
1 1 0 1
0 0 0 0
0 0 1 0
Note on floating-point efficiency
Because you are using ismember in your implementation, it's implicit to me that you expect all values to be integer. In this case, you can very much compare directly with the zero distance without loss of accuracy. If you intend to move to floating-point, you should always compare with some small threshold instead of 0, like Xresponse = Xresponse <= 1e-10; or something to that effect. I don't believe that is needed for your scenario.

Here's an alternative to #rayryeng's answer: reduce each row of the two matrices to a unique identifier using the third output of unique with the 'rows' input flag, and then compare the identifiers with singleton expansion (broadcast) using bsxfun:
[~, ~, w] = unique([A; suppX], 'rows');
Xresponse = bsxfun(#eq, w(1:size(A,1)).', w(size(A,1)+1:end));

Related

Find equal rows between two Matlab matrices

I have a matrix index in Matlab with size GxN and a matrix A with size MxN.
Let me provide an example before presenting my question.
clear
N=3;
G=2;
M=5;
index=[1 2 3;
13 14 15]; %GxN
A=[1 2 3;
5 6 7;
21 22 23;
1 2 3;
13 14 15]; %MxN
I would like your help to construct a matrix Response with size GxM with Response(g,m)=1 if the row A(m,:) is equal to index(g,:) and zero otherwise.
Continuing the example above
Response= [1 0 0 1 0;
0 0 0 0 1]; %GxM
This code does what I want (taken from a previous question of mine - just to clarify: the current question is different)
Response=permute(any(all(bsxfun(#eq, reshape(index.', N, [], G), permute(A, [2 3 4 1])), 1), 2), [3 4 1 2]);
However, the command is extremely slow for my real matrix sizes (N=19, M=500, G=524288). I understand that I will not be able to get huge speed but anything that can improve on this is welcome.

Aproach 1: computing distances
If you have the Statistics Toolbox:
Response = ~(pdist2(index, A));
or:
Response = ~(pdist2(index, A, 'hamming'));
This works because pdist2 computes the distance between each pair of rows. Equal rows have distance 0. The logical negation ~ gives 1 for those pairs of rows, and 0 otherwise.
Approach 2: reducing rows to unique integer labels
This approach is faster on my machine:
[~,~,u] = unique([index; A], 'rows');
Response = bsxfun(#eq, u(1:G), u(G+1:end).');
It works by reducing rows to unique integer labels (using the third output of unique), and comparing the latter instead of the former.
For your size values this takes approximately 1 second on my computer:
clear
N = 19; M = 500; G = 524288;
index = randi(5,G,N); A = randi(5,M,N);
tic
[~,~,u] = unique([index; A], 'rows');
Response = bsxfun(#eq, u(1:G), u(G+1:end).');
toc
gives
Elapsed time is 1.081043 seconds.

MATLAB has a multitude of functions for working with sets, including setdiff, intersect, union etc. In this case, you can use the ismember function:
[~, Loc] = ismember(A,index,'rows');
Which gives:
Loc =
1
0
0
1
2
And Response would be constructed as follows:
Response = (1:size(index,1) == Loc).';
Response =
2×5 logical array
1 0 0 1 0
0 0 0 0 1

You could reshape the matrices so that each row instead lies along the 3rd dimension. Then we can use implicit expansion (see bsxfun for R2016b or earlier) for equality of all elements, and all to aggregate on the rows (i.e. false if not all equal for a given row).
Response = all( reshape( index, [], 1, size(index,2) ) == reshape( A, 1, [], size(A,2) ), 3 );
You might even be able to avoid some reshaping by using all in another dimension, but it's easier for me to visualise it this way.

Assigning values to a matrix through vector addition from an adjacency matrix

Very new to Matlab, I usually use STATA.
I want to use the nchoosek fuction to get the sum of vectors in one matrix.
I have a 21x21 adjacency matrix, with either 0 or 1 as the inputs. I want to create a new matrix, that will give me a sum of inputs between all possible triads from the adjacency matrix.
The new matrix should look have four variables, indexes (i, j, k) - corresponding to each combination from the 21x21. And a final variable which is a sum of the inputs.
The code I have so far is:
C = nchoosek(21,3)
B = zeros(nchoosek(21,3), 4)
for i=1:C
for j=i+1:C
for k=j+1:C
B(?)=B(i, j, k, A(i)+A(j)+A(k)) #A is the 21x21 adj mat
end
end
end
I know my assignment statement is incorrect as I don't completed understand the indexing role of the ":" operator. Any help will be appreciated.
Thanks!

This might be what you want:
clear all
close all
clc
A = rand(21,21); % Replace this with actual A
rowNum = 0;
for i=1:21
for j=i+1:21
for k=j+1:21
rowNum = rowNum+1;
B(rowNum,:) = [i, j, k, sum(A(:,i)+A(:,j)+A(:,k))];
end
end
end
There are some points:
You loop for different combinations. the total number of combination is nchoosek(21,3) which you can check after 3 nested loop. Your code with for i=1:C was the first error since you're actually looping for different values of i and different values of j and k. So these just 21 values not more.
To avoid repeated combinations, it's enough to start new index after the previous one, which you've realized in your code.
There are other possible approaches such as vectorized format, but to stick to your approach, I used a counter: rowNum which is the loop counter and updated along the loop.
B(rowNum,:) means all element of rowNum'th row of the matrix B.

Below is an algorithm to find the triads in an adjacency matrix. It checks all possible triads and sums the values.
%basic adjacency matrix with two triads (1-2-5) (2-3-5)
A=[];
A(1,:) = [0 1 0 0 1];
A(2,:) = [1 0 1 0 1];
A(3,:) = [0 1 0 0 1];
A(4,:) = [0 0 0 0 1];
A(5,:) = [1 1 1 1 0];
A=A==1; %logical matrix
triads=nchoosek(1:5,3);
S=nan(size(triads,1),4);
for ct = 1:size(triads,1)
S(ct,1:3)=[A(triads(ct,1),triads(ct,2)),A(triads(ct,1),triads(ct,3)),A(triads(ct,2),triads(ct,3))];
S(ct,4)=sum(S(ct,1:3));
end
triads(find(S(:,4)==3),:)
ans =
1 2 5
2 3 5

sum matrix using logical matrix - index exceeds matrix dimensions

I have two matrices.
mcaps which is a double 1698 x 2
index_g which is a logical 1698 x 2
When using the line of code below I get the error message that Index exceeds matrix dimensions. I don't see how this is the case though?
tsp = nansum(mcaps(index_g==1, :));
Update
Sorry I should have mentioned that I need the sum of each column in the mcaps vector
** Example of data **
mcaps index_g
5 6 0 0
4 3 0 0
6 5 1 1
4 6 0 1
8 7 0 0

There are two problems here. I missed one. Original answer is below.
What I missed is that when you use the logical index in this way, you are picking out elements of the matrix that may have different numbers of elements in each column, so MATLAB can't return a well formed matrix back to nansum, and so returns a vector. To get around this, use the fact that 0 + anything = 0
% create a mask of values you don't want to sum. Note that since
% index_g is already logical, you don't have to test equal to 1.
mask = ~index_g & isnan(mcaps)
% create a temporary variable
mcaps_to_sum = mcaps;
% change all of the values that you don't want to sum to zero
mcaps_to_sum(mask) = 0;
% do the sum
sum(mcaps_to_sum,1);
This is basically all that the nansum function does internally, is to set all of the NaN values to zero and then call the sum function.
index_g == 1 returns a 1698 x 2 logical matrix, but then you add in an extra dimension with the colon. To sum the columns, use the optional dim input. You want:
tsp = nansum(mcaps(index_g == 1),1);

Eliminating zeros in a matrix - Matlab

Hi I have the following matrix:
A= 1 2 3;
0 4 0;
1 0 9
I want matrix A to be:
A= 1 2 3;
1 4 9
PS - semicolon represents the end of each column and new column starts.
How can I do that in Matlab 2014a? Any help?
Thanks

The problem you run into with your problem statement is the fact that you don't know the shape of the "squeezed" matrix ahead of time - and in particular, you cannot know whether the number of nonzero elements is a multiple of either the rows or columns of the original matrix.
As was pointed out, there is a simple function, nonzeros, that returns the nonzero elements of the input, ordered by columns. In your case,
A = [1 2 3;
0 4 0;
1 0 9];
B = nonzeros(A)
produces
1
1
2
4
3
9
What you wanted was
1 2 3
1 4 9
which happens to be what you get when you "squeeze out" the zeros by column. This would be obtained (when the number of zeros in each column is the same) with
reshape(B, 2, 3);
I think it would be better to assume that the number of elements may not be the same in each column - then you need to create a sparse array. That is actually very easy:
S = sparse(A);
The resulting object S is a sparse array - that is, it contains only the non-zero elements. It is very efficient (both for storage and computation) when lots of elements are zero: once more than 1/3 of the elements are nonzero it quickly becomes slower / bigger. But it has the advantage of maintaining the shape of your matrix regardless of the distribution of zeros.
A more robust solution would have to check the number of nonzero elements in each column and decide what the shape of the final matrix will be:
cc = sum(A~=0);
will count the number of nonzero elements in each column of the matrix.
nmin = min(cc);
nmax = max(cc);
finds the smallest and largest number of nonzero elements in any column
[i j s] = find(A); % the i, j coordinates and value of nonzero elements of A
nc = size(A, 2); % number of columns
B = zeros(nmax, nc);
for k = 1:nc
B(1:cc(k), k) = s(j == k);
end
Now B has all the nonzero elements: for columns with fewer nonzero elements, there will be zero padding at the end. Finally you can decide if / how much you want to trim your matrix B - if you want to have no zeros at all, you will need to trim some values from the longer columns. For example:
B = B(1:nmin, :);

Simple solution:
A = [1 2 3;0 4 0;1 0 9]
A =
1 2 3
0 4 0
1 0 9
A(A==0) = [];
A =
1 1 2 4 3 9
reshape(A,2,3)
ans =
1 2 3
1 4 9
It's very simple though and might be slow. Do you need to perform this operation on very large/many matrices?

From your question it's not clear what you want (how to arrange the non-zero values, specially if the number of zeros in each column is not the same). Maybe this:
A = reshape(nonzeros(A),[],size(A,2));

Matlab's logical indexing is extremely powerful. The best way to do this is create a logical array:
>> lZeros = A==0
then use this logical array to index into A and delete these zeros
>> A(lZeros) = []
Finally, reshape the array to your desired size using the built in reshape command
>> A = reshape(A, 2, 3)

How to add one to each entry of the diagonal [Matlab]

I have a matrix such as
[1 1 1
1 1 1
1 1 1]
I want it to be
[2 1 1
1 2 1
1 1 2]
How do I do that?

Use eye function to get an identity matrix and add to original matrix
result = A+eye(3,3) ; % A the original matrix

Another possibility, which requires less operations (may be better for large matrices):
A(1:size(A,1)+1:end) = A(1:size(A,1)+1:end) + 1;
This uses the concept of linear indexing to address the diagonal elements.

Yet another way via logical indexing:
idx = eye(size(A))>0;
A(idx)= A(idx)+1;
This can easily be used for other things as well:
A(~idx)=2*A(~idx); %Multiply all non diagonal elements by two
A(eye(size(A))>0)=1:min(size(A)); %Set the diagonal to 1:n

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

Finding equal rows in Matlab - matlab

Related

Find equal rows between two Matlab matrices

Assigning values to a matrix through vector addition from an adjacency matrix

sum matrix using logical matrix - index exceeds matrix dimensions

Eliminating zeros in a matrix - Matlab

How to add one to each entry of the diagonal [Matlab]

Categories

Resources