find common values of all rows of a matrix - matlab

I have a randomly generated matrix
A =[ 0.7015 -1.577 -1.333 0.022 -0.5 -2.00 -0.034 -0.714
-2.05 -0.5 1.12 -0.26 -0.97 0.96 -0.79 1.35
-0.353 0.28 -0.5 -1.75 -1.15 0.52 1.018 -0.22
-0.8 0.033 -0.29 -0.28 -0.5 -0.02 -0.13 -0.58 ]
I want to find the values common to all rows. No row contains duplicate elements. Can anyone help?

Get a vector of unique values with unique, and then compare each element of A with each unique value using bsxfun:
u = unique(A);
m = squeeze(all(any(bsxfun(@eq, A, permute(u, [2 3 1])),2),1));
result = u(m);
This should be fast, but may be memory-hungry, as it generates a 3D array of size mxnxp, where A is mxn and p is the number of unique values of A. It works even if a row contains duplicate elements.
Exploiting the fact that no row has duplicate elements, you can use a possibly more memory-efficient approach with accumarray: a value is common to all rows exactly when its total count equals the number of rows.
[u, ~, w] = unique(A);
m = accumarray(w,1)==size(A,1);
result = u(m);
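As a quick check, here is a minimal sketch that runs both approaches on the matrix from the question. The names result_bsx and result_acc are mine, not from the answer. For this A, the only value present in every row is -0.5.

```matlab
A = [ 0.7015 -1.577 -1.333  0.022 -0.5  -2.00 -0.034 -0.714
     -2.05   -0.5    1.12  -0.26  -0.97  0.96 -0.79   1.35
     -0.353   0.28  -0.5   -1.75  -1.15  0.52  1.018 -0.22
     -0.8     0.033 -0.29  -0.28  -0.5  -0.02 -0.13  -0.58 ];

% Approach 1: compare every element with every unique value
% (builds an m-by-n-by-p logical array)
u = unique(A);
m = squeeze(all(any(bsxfun(@eq, A, permute(u, [2 3 1])), 2), 1));
result_bsx = u(m);

% Approach 2: count how often each unique value occurs. This is valid only
% because no row contains duplicates, so a count equal to the number of
% rows means the value appears exactly once in every row.
[u, ~, w] = unique(A);
result_acc = u(accumarray(w(:), 1) == size(A, 1));

disp(result_bsx)   % -0.5000
disp(result_acc)   % -0.5000
```

Both approaches rely on exact equality, so they suit values that were copied or typed in, not values that differ by rounding error.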

Related

Finding equal rows in Matlab

I have a matrix suppX in Matlab with size GxN and a matrix A with size MxN. I would like your help to construct a matrix Xresponse with size GxM with Xresponse(g,m)=1 if the row A(m,:) is equal to the row suppX(g,:) and zero otherwise.
Let me explain better with an example.
suppX=[1 2 3 4;
5 6 7 8;
9 10 11 12]; %GxN
A=[1 2 3 4;
1 2 3 4;
9 10 11 12;
1 2 3 4]; %MxN
Xresponse=[1 1 0 1;
0 0 0 0;
0 0 1 0]; %GxM
I have written code that does what I want.
Xresponsemy=zeros(size(suppX,1), size(A,1));
for x=1:size(suppX,1)
Xresponsemy(x,:)=ismember(A, suppX(x,:), 'rows').';
end
My code uses a loop. I would like to avoid this because in my real case this piece of code is part of another big loop. Do you have suggestions without looping?
One way to do this is to treat each row of the two matrices as a vector in N-dimensional space and compute the Euclidean (L2) distance between every pair of rows. If the distance is 0, the two rows match. Specifically, you can create a matrix whose element (i,j) is the distance between row i of one matrix and row j of the other.
You can then turn this distance matrix into the result you want: 1 where the two vectors are identical and 0 otherwise.
This post should be of interest: Efficiently compute pairwise squared Euclidean distance in Matlab.
I would specifically look at the answer by Shai Bagon that uses matrix multiplication and broadcasting. You would then modify it so that you find distances that would be equal to 0:
nA = sum(A.^2, 2); % squared norm of each row of A
nB = sum(suppX.^2, 2); % squared norm of each row of suppX
Xresponse = bsxfun(@plus, nB, nA.') - 2 * suppX * A.'; % GxM squared distances
Xresponse = Xresponse == 0;
We get:
Xresponse =
3×4 logical array
1 1 0 1
0 0 0 0
0 0 1 0
Note on floating-point accuracy
Because you are using ismember in your implementation, I take it that you expect all values to be integers. In that case, you can compare directly with a zero distance without loss of accuracy. If you intend to move to floating-point data, you should always compare with some small threshold instead of 0, like Xresponse = Xresponse <= 1e-10; or something to that effect. I don't believe that is needed for your scenario.
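To illustrate the floating-point case, here is a small sketch with made-up data in which one row is equal only up to rounding (0.1+0.2 versus 0.3). The threshold tol = 1e-10 is an assumption taken from the note above; in practice it would need to scale with the magnitude of the data.

```matlab
suppX = [0.3 1.5; 2 4];          % G-by-N, illustrative values
A     = [0.1+0.2 1.5; 2 4.5];    % M-by-N, first row equals suppX(1,:) up to rounding

nA = sum(A.^2, 2);               % squared norm of each row of A
nB = sum(suppX.^2, 2);           % squared norm of each row of suppX
D  = bsxfun(@plus, nB, nA.') - 2 * suppX * A.';   % G-by-M squared distances

tol = 1e-10;                     % assumed threshold, not a universal constant
Xresponse = abs(D) <= tol;       % abs guards against tiny negative values from rounding
disp(Xresponse)
```

Only the pair (suppX row 1, A row 1) is flagged as a match; the other squared distances are 0.25 or larger.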
Here's an alternative to @rayryeng's answer: reduce each row of the two matrices to a unique identifier using the third output of unique with the 'rows' input flag, and then compare the identifiers with singleton expansion (broadcast) using bsxfun:
[~, ~, w] = unique([A; suppX], 'rows');
Xresponse = bsxfun(@eq, w(1:size(A,1)).', w(size(A,1)+1:end));
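As a sanity check, this sketch runs the identifier approach on the example from the question and compares it with the original loop. The variable names M and Xloop are mine.

```matlab
suppX = [1 2 3 4; 5 6 7 8; 9 10 11 12];              % G-by-N
A     = [1 2 3 4; 1 2 3 4; 9 10 11 12; 1 2 3 4];     % M-by-N

% One integer identifier per distinct row of the stacked matrix
[~, ~, w] = unique([A; suppX], 'rows');
M = size(A, 1);
Xresponse = bsxfun(@eq, w(1:M).', w(M+1:end));       % G-by-M logical

% Reference result using the loop from the question
Xloop = false(size(suppX, 1), M);
for g = 1:size(suppX, 1)
    Xloop(g, :) = ismember(A, suppX(g, :), 'rows').';
end

disp(isequal(Xresponse, Xloop))   % 1
```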

Merge different sized arrays into a table

Overview
I am currently working with a series of .txt files I am importing into MATLAB. For simplicity, I'll show my problem conceptually. Obligatory disclaimer: I'm new to MATLAB (and programming in general).
These .txt files contain data from tracking a ROI in a video (frame-by-frame) with time ('t') in the first column and velocity ('v') in the second, as shown below:
T1 =            T2 =            etc.
t     v         t     v
0     NaN       0     NaN
0.1   100       0.1   200
0.2   200       0.2   500
0.3   400       0.3   NaN
0.4   150
0.5   NaN
Problem
The files differ in size: the columns remain fixed, but the number of rows varies from trial to trial, as shown in T1 and T2.
The time column is the same for each of these files, so I wanted to organise the data in a table as follows:
time  v1    v2    etc.
0     NaN   NaN
0.1   100   200
0.2   200   500
0.3   400   NaN
0.4   150   0
0.5   NaN   0
Note that I want to add 0s (or NaN) to end of shorter trials to fix the issue of size differences.
Edit
Both solutions worked well for my dataset. I appreciate all the help!
You could import each file into a table using readtable and then use outerjoin to combine the tables in the way that you would expect. This works whether or not all the data start at t = 0.
To create a table from a file:
T1 = readtable('filename1.dat');
T2 = readtable('filename2.dat');
Then perform the outerjoin (dummy data created for demonstration purposes):
t1 = table((1:4)', (5:8)', 'VariableNames', {'t', 'v'});
%// t v
%// _ _
%// 1 5
%// 2 6
%// 3 7
%// 4 8
%// t2 is missing the row with t = 2
t2 = table([1;3;4], [1;3;4], 'VariableNames', {'t', 'v'});
%// t v
%// _ _
%// 1 1
%// 3 3
%// 4 4
%// Now perform an outer join and merge the key column
t3 = outerjoin(t1, t2, 'Keys', 't', 'MergeKeys', true)
%// t v_t1 v_t2
%// _ ____ ____
%// 1 5 1
%// 2 6 NaN
%// 3 7 3
%// 4 8 4
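Applied to the data from the question, the same idea might look like the sketch below. Renaming the velocity columns to v1 and v2 before joining is my own choice, made so that the output column names do not depend on the suffixes outerjoin adds. The time values are typed as literals here; if they were computed (for example with 0:0.1:0.5), tiny rounding differences between files could stop keys from matching, so rounding the key column first may be worth considering.

```matlab
% Data from the question, typed as literals so the time keys match exactly
T1 = table([0; 0.1; 0.2; 0.3; 0.4; 0.5], [NaN; 100; 200; 400; 150; NaN], ...
           'VariableNames', {'t', 'v'});
T2 = table([0; 0.1; 0.2; 0.3], [NaN; 200; 500; NaN], ...
           'VariableNames', {'t', 'v'});

% Give each velocity column a distinct name before joining
T1.Properties.VariableNames{2} = 'v1';
T2.Properties.VariableNames{2} = 'v2';

% Rows of T1 with no partner in T2 get NaN in v2
T3 = outerjoin(T1, T2, 'Keys', 't', 'MergeKeys', true);
disp(T3)
```

The shorter trial is padded with NaN, not 0; if zeros are preferred, the NaN entries added by the join would have to be replaced afterwards, taking care not to overwrite NaNs that were already in the data.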
I would suggest using the padarray and horzcat functions. Respectively, they:
Pad a matrix or vector with extra data, effectively adding extra 0's or any specified value (NaNs work too).
Concatenate matrices or vectors horizontally.
First, obtain the length of the longest vector you have to concatenate. Let's call this value max_len. Once you have that, you can pad each (column) vector by doing:
v1 = padarray(v1, max_len - length(v1), 0, 'post');
% You can replace the '0' by any value you want !
Finally, once you have vectors of the same size, you can concatenate them using horzcat:
big_table = horzcat(v1, v2, ... , vn);
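Note that padarray ships with the Image Processing Toolbox. If that toolbox is not available, the same padding can be sketched with core functions only. The cell array vs and the choice of NaN as the fill value are assumptions for illustration; the velocity columns are those from the question.

```matlab
% Velocity columns of different lengths (values from the question)
vs = {[NaN; 100; 200; 400; 150; NaN], [NaN; 200; 500; NaN]};

max_len = max(cellfun(@numel, vs));        % length of the longest vector
big_table = nan(max_len, numel(vs));       % pre-fill with the padding value
for k = 1:numel(vs)
    big_table(1:numel(vs{k}), k) = vs{k};  % copy each trial into its own column
end
disp(big_table)
```

Using zeros(max_len, numel(vs)) in place of nan(...) pads with 0 instead.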

How to find the index of vector that corresponds with highest value of other vector?

I have a vector such as
A=[0.2 0.5 0.4 0.6]
whose elements are labeled
A_labels=[1 2 3 4]
and another vector B:
B=[30 10 20]
I assume that the highest value of B is assigned to the highest label in A, the next highest value to the next label, and so on in decreasing order. That means
30 is assigned to 4
10 is assigned to 2
20 is assigned to 3
I will scan all elements of B and want to find the label that corresponds to each one according to the above rule. Could you help me implement this scheme in MATLAB? Thanks
A=[0.2 0.5 0.4 0.6];
A_labels=1:1:size(A,2);
B=[30 10 20];
for i=1:size(B,2)
% Find the label in A_labels that corresponds to B(i)
% Result will be [4 2 3]
end
Not sure I've fully understood, but can't you just sort B and A_labels in descending order and assign the ordered labels to the positions given by B's sort order?
So
[~,idx] = sort(B,'descend');
A_labels_ordered = sort(A_labels, 'descend');
result(idx) = A_labels_ordered(1:numel(B))
I think this does what you want. I'm assuming A_labels is sorted, as in your example.
[~, ind] = sort(B); %// sort B and get *indices* of the sorting
[~, ind] = sort(ind); %// get *rank* of each element of B
result = A_labels(end-numel(ind)+ind);
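To see why the second sort is needed, here is a short trace with a different B (my own example, chosen so that the sort indices and the ranks differ). The second output of the first sort says where each sorted value came from; sorting those indices again gives the rank of each original element. The name rnk is mine.

```matlab
A_labels = [1 2 3 4];
B = [20 30 10];

[~, ind] = sort(B);      % ind = [3 1 2]: positions of B's elements in ascending order
[~, rnk] = sort(ind);    % rnk = [2 3 1]: rank of each element of B (1 = smallest)

% The numel(B) highest labels are A_labels(end-numel(B)+1:end); pick by rank
result = A_labels(end - numel(rnk) + rnk);
disp(result)             % 3 4 2
```

Here 30 gets label 4, 20 gets label 3 and 10 gets label 2, as the rule in the question requires.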

Construct a 3D matrix with linear index

I want to construct a 3D matrix of size = 80 * 80 * 2 based on a set of data:
1 4532 1257.0
1 4556 1257.0
1 4622 257.0
1 4633 257.0
2 7723 31.0
2 8024 31.0
2 8099 31.0
2 9800 31.0
2 8524 34.0
2 8525 34.0
2 8700 734.0
2 8701 734.0
The first column denotes the slice of matrix.
The second column denotes the linear index of the matrix.
The third column denotes the values of the elements.
What I'm doing now is: I first obtain two 80 * 80 2D matrices A and B and then concatenate them using cat(3, A, B):
Let the above data be denoted by M.
for i = 1 : size(M,1)
if (M(:,1)==1)
[r c]=ind2sub(M(:,2));
A = accumarray([r c], M(:,3));
elseif (M(:,1)==2)
[r c]=ind2sub(M(:,2));
B = accumarray([r c], M(:,3));
end
end
cat(3, A, B)
I am curious whether there is any solution that can build the 80*80*2 matrix merely from the linear index (the second column of my data), or any other simpler solution that works for the purpose.
I appreciate your help.
So, I'm assuming your example data is incorrect, and that all values in column 2 are no greater than n*n, where nxn is the size of each slice (80x80 in your case).
If that's the case, the following two lines should do the trick.
n = 80; %// each slice is n-by-n
out = zeros(n,n,2);
out((M(:,1)-1).*n^2+M(:,2)) = M(:,3);
If the second column contains values up to 2*n*n, and these are thus linear indices into the whole 3D array, then:
out = zeros(n,n,2);
out(M(:,2)) = M(:,3)
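Both cases can be tried on a tiny example. The sketch below uses n = 3 and made-up data (M1 and M2 are hypothetical) so the result is easy to inspect; the two variants are set up to produce the same array.

```matlab
n = 3;

% Case 1: column 2 is a linear index within each n-by-n slice
M1 = [1 2 10; 1 5 20; 2 4 30];
out1 = zeros(n, n, 2);
out1((M1(:,1) - 1) .* n^2 + M1(:,2)) = M1(:,3);   % shift by n^2 per slice

% Case 2: column 2 is already a linear index into the whole n-by-n-by-2 array
M2 = [1 2 10; 1 5 20; 2 13 30];                   % 13 = n^2 + 4
out2 = zeros(n, n, 2);
out2(M2(:,2)) = M2(:,3);

disp(isequal(out1, out2))   % 1
disp(out1(1, 2, 2))         % 30
```

Unlike accumarray, plain indexed assignment does not sum values that share an index; a repeated index simply overwrites the earlier value.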

Matlab: arithmetic operation on columns inside a for loop (simple yet devious!)

I'm trying to represent a simple matrix m*n (let's assume it has only one row!) such that m1n1 = m1n1^1, m1n2 = m1n1^2, m1n3 = m1n1^3, m1n4 = m1n1^4, ... m1ni = m1n1^i.
In other words, I am trying to iterate over a matrix's columns n times, each time adding a new vector (column) at the end such that each of its entries has the same value as in the first vector but raised to the power of its column number n.
This is the original vector:
v =
1.2421
2.3348
0.1326
2.3470
6.7389
and this is v after the third iteration:
v =
1.2421 1.5429 1.9165
2.3348 5.4513 12.7277
0.1326 0.0176 0.0023
2.3470 5.5084 12.9282
6.7389 45.4128 306.0329
Now, given that I'm a total noob in Matlab, I really underestimated the difficulty of such a seemingly easy task, which took me almost a day of debugging and surfing the web to find any clue. Here's what I have come up with:
rows = 5;
columns = 3;
v = x(1:rows,1);
k = v;
Ncol = ones(rows,1);
extraK = ones(rows,1);
disp(v)
for c = 1:columns
Ncol = k(:,length(k(1,:))).^c; % a verbose way of selecting the last column only.
extraK = cat(2,extraK,Ncol);
end
k = cat(2,k,extraK);
disp(extraK(:,2:columns+1)) % to cut off the first column
Now this code (for some weird reason) works only if rows = 6 or less, and columns = 3 or less.
When rows = 7, this is the output:
v = 1.0e+03 *
0.0012 0.0015 0.0019
0.0023 0.0055 0.0127
0.0001 0.0000 0.0000
0.0023 0.0055 0.0129
0.0067 0.0454 0.3060
0.0037 0.0138 0.0510
0.0119 0.1405 1.6654
How could I get it to run on any number of rows and columns?
Thanks!
I have found a couple of things wrong with your code:
I'm not sure as to why you are defining d = 3;. This is just nitpicking, but you can remove that from your code safely.
You are not doing the power operation properly. Specifically, look at this statement:
Ncol = k(:,length(k(1,:))).^c; % a verbose way of selecting the last column only.
You are selectively choosing the last column, which is great, but you are not applying the power operation properly. If I understand your statement, you wish to take the original vector, and perform a power operation to the power of n, where n is the current iteration. Therefore, you really just need to do this:
Ncol = k.^c;
Once you replace Ncol with the above line, the code should now work. I also noticed that you crop out the first column of your result. The reason why you are getting duplicate columns is because your for loop starts from c = 1. Since you have already computed v.^1 = v, you can just start your loop at c = 2. Change your loop starting point to c = 2, and you can get rid of the removal of the first column.
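Putting those two changes together, a condensed version of the loop could look like the sketch below. It is not the asker's exact code: it drops the helper variables Ncol and extraK and appends each new column directly to k.

```matlab
v = [1.2421; 2.3348; 0.1326; 2.3470; 6.7389];   % vector from the question
columns = 3;

k = v;                        % column 1 is v.^1, so the loop starts at c = 2
for c = 2:columns
    k = cat(2, k, v.^c);      % append v raised to the c-th power
end
disp(k)
```

Because every column is computed from v itself, the loop works for any number of rows and columns.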
However, I'm going to do this in an alternative way in one line of code. Before we do this, let's go through the theory of what you're trying to do.
Given a vector v that is m elements long, stored as an m x 1 vector, what you want is a matrix of size m x n, where n is the desired number of columns, and where column j (counting from left to right) holds v raised to the jth power.
Therefore, given your example from your third "iteration", the first column represents v, the second column represents v.^2, and the third column represents v.^3.
I'm going to introduce you to the power of bsxfun. bsxfun stands for Binary Singleton eXpansion FUNction. Given two inputs where either or both have a singleton dimension (a dimension of size 1), bsxfun replicates each input along its singleton dimensions to match the size of the other input, and then applies an element-wise operation to the expanded inputs to produce your output.
For example, if we had two vectors like so:
A = [1 2 3]
B = [1
2
3]
Note that one of them is a row vector, and the other is a column vector. bsxfun would see that A and B both have singleton dimensions: A has only 1 row, and B has only 1 column. Therefore, B is duplicated across as many columns as there are in A, and A is duplicated down as many rows as there are in B, and we actually get:
A = [1 2 3
1 2 3
1 2 3]
B = [1 1 1
2 2 2
3 3 3]
Once we have these two matrices, you can apply any element wise operations to these matrices to get your output. For example, you could add, subtract, take the power or do an element wise multiplication or division.
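For instance, with the two vectors above, a short sketch of two such element-wise operations (the output names S and P are mine):

```matlab
A = [1 2 3];      % row vector, 1-by-3
B = [1; 2; 3];    % column vector, 3-by-1

% Both inputs are expanded to 3-by-3 before the operation is applied
S = bsxfun(@plus, A, B);     % S(i,j) = A(j) + B(i)
P = bsxfun(@power, B, A);    % P(i,j) = B(i) ^ A(j)

disp(S)   % [2 3 4; 3 4 5; 4 5 6]
disp(P)   % [1 1 1; 2 4 8; 3 9 27]
```

In MATLAB R2016b and later, the same expansion happens implicitly for the built-in operators, so A + B and B .^ A give the same results without bsxfun.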
Now, how this scenario applies to your problem is the following. What you are doing is you have a vector v, and you will have a matrix of powers like so:
M = [1 2 3 ... n
1 2 3 ... n
...........
...........
1 2 3 ... n]
Essentially, we will have a column of 1s, followed by a column of 2s, up to as many columns as you want n. We would apply bsxfun on the vector v which is a column vector, and another vector that is only a single row of values from 1 up to n. You would apply the power operation to achieve your result. Therefore, you can conveniently calculate your output by doing:
columns = 3;
out = bsxfun(@power, v, 1:columns);
Let's try a few examples given your vector v:
>> v = [1.2421; 2.3348; 0.1326; 2.3470; 6.7389];
>> columns = 3;
>> out = bsxfun(@power, v, 1:columns)
out =
1.2421 1.5428 1.9163
2.3348 5.4513 12.7277
0.1326 0.0176 0.0023
2.3470 5.5084 12.9282
6.7389 45.4128 306.0321
>> columns = 7;
>> format bank
>> out = bsxfun(@power, v, 1:columns)
out =
Columns 1 through 5
1.24 1.54 1.92 2.38 2.96
2.33 5.45 12.73 29.72 69.38
0.13 0.02 0.00 0.00 0.00
2.35 5.51 12.93 30.34 71.21
6.74 45.41 306.03 2062.32 13897.77
Columns 6 through 7
3.67 4.56
161.99 378.22
0.00 0.00
167.14 392.28
93655.67 631136.19
Note that with columns set to 3, we get what we see in your post. With columns pushed up to 7, I had to change the way the numbers are displayed (format bank) so you can see them clearly. Without this, the output would be shown in exponential form, with a lot of zeroes around the significant digits.
Good luck!
When computing cumulative powers you can reuse previous results: for scalar x and n, x.^n is equal to x * x.^(n-1), where x.^(n-1) has been already obtained. This may be more efficient than computing each power independently, because multiplication is faster than power.
Let N be the maximum exponent desired. To use the described approach, the column vector v is repeated N times horizontally (repmat), and then a cumulative product is applied along each row (cumprod):
v =[1.2421; 2.3348; 0.1326; 2.3470; 6.7389]; %// data. Column vector
N = 3; %// maximum exponent
result = cumprod(repmat(v, 1, N), 2);
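The two ways of building the matrix should agree up to rounding, since repeated multiplication and direct exponentiation accumulate floating-point error differently. A small sketch to check this (the names by_cumprod, by_power and rel_err, and the tolerance 1e-12, are my own choices):

```matlab
v = [1.2421; 2.3348; 0.1326; 2.3470; 6.7389];   % data, column vector
N = 7;                                          % maximum exponent

by_cumprod = cumprod(repmat(v, 1, N), 2);       % running product along each row
by_power   = bsxfun(@power, v, 1:N);            % each power computed independently

% Largest relative difference between the two results
rel_err = max(max(abs(by_cumprod - by_power) ./ abs(by_power)));
disp(rel_err < 1e-12)
```

Whether the cumprod version is actually faster depends on the sizes involved, so it is worth timing both on realistic data.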