find the nearest point pairs between two sets of of matrix - matlab

Assume I have two sets of matrix (A and B), inside each matrix contains few point coordinates, I want to find out point in B nearest to A and output a cell array C listed the nearest point pair coordinates accordingly and one cell array D register the unpaired spot, how should I do it?
To be more specific, here is what I want
Two sets of matrix contain spot xy coordinates;
A=[ 1 2; 3 4]; B=[1 3; 5 6; 2 1];
want to get C{1,1}=[1 2; 1 3]; C{2,1}= [3 4; 5 6]; D{1,1}=[2 1];
Thanks for the help.

There is not exactly one solution to this problem, take for example the (one-dimensional, but expandable to N-D) case:
A= [1; 3];
B= [2];
Then either A(1) or A(2) can be the leftover point. Which one your algorithm spits out, will depend on how it works, ie which point you take first to find the nearest point.
Such algorithm consists imo of
Finding distances between each combination of A(i) and B(j). If you have the statistics toolbox, pdist2 does this for you:
A=[ 1 2; 3 4];
B=[1 3; 5 6; 2 1];
dist = pdist2(A,B);
Looping over the smallest of A or B (I'll take A, cause it is smallest in your example) and finding for each point in A the closest point in the remaining set of B:
N = size(A,1);
matchAtoB=NaN(N,1);
for ii=1:N
dist(:,matchAtoB(1:ii-1))=Inf; % make sure that already picked points of B are not eligible to be new closest point
[~,matchAtoB(ii)]=min(dist(ii,:));
end
matchBtoA = NaN(size(B,1),1);
matchBtoA(matchAtoB)=1:N;
remaining_indices = find(isnan(matchBtoA));
Combine result to your desired output matrices C and D:
C=arrayfun(#(ii) [A(ii,:) ; B(matchAtoB(ii),:)],1:N,'uni',false);
D=mat2cell(B(remaining_indices,:),ones(numel(remaining_indices),1),size(B,2));
Note that this code will also work with 1D points or higher (N-D), the pdist2 flattens everything out to scalar distances.

Here's my take on the problem:
A=[1 2
3 4];
B=[1 3
5 6
2 1];
dists = pdist2(A,B);
[dists, I] = sort(dists,2);
c = NaN(size(A,1),1);
for ii = 1:size(A,1)
newC = find(~any(bsxfun(#eq, I(ii,:), c), 1));
c(ii) = I(ii,newC(1));
end
C = cellfun(#(x)reshape(x,2,2).',...
mat2cell([A B(c,:)], ones(size(A,1),1), 4), 'uni', false);
D = {B(setdiff(1:size(B,1),c), :)}
This solution assumes
all your vectors are 2D
stacked in rows of A and B
and A is always the source (i.e., everything is compared to A)
If these assumptions do not (always) hold, you'll have to take a more general approach, like the one suggested by #GuntherStruyf.

Related

Using vector as indices not working as expected, matlab

Given a nXm matrix A and a mX2 matrix B and a matrix C of size mX1 containing 1s and 2s C=[1 2 1 2 1...], depending on which column, I want every row of A to be multiplied with. How can this be done? Or equivalently, given D = A*B how can I access only the values dictated by C. I tried D(:,C), but the result is not the expected.
Example a =[1 2; 3 4; 5 6] . c = [1 2 1] . a(?) = [1 4 5]
Any idea?
%example data
n=10;m=20;
A=rand(n,m)
B=rand(m,2)
C=round(rand(m,1))+1;
%solution:
B2=B(:,1); %multiplication vector
B2(C==2)=B(C==2,2) %change the ones where C==2
A*B2
You can run the following command for the last example:
a(sub2ind([3,2],1:3,c))'
In general case you can do like the following:
% n is the length of the D which is nx2 matrix
D(sub2ind([n,2],1:n,C))'

Shifting repeating rows to a new column in a matrix

I am working with a n x 1 matrix, A, that has repeating values inside it:
A = [0;1;2;3;4; 0;1;2;3;4; 0;1;2;3;4; 0;1;2;3;4]
which correspond to an n x 1 matrix of B values:
B = [2;4;6;8;10; 3;5;7;9;11; 4;6;8;10;12; 5;7;9;11;13]
I am attempting to produce a generalised code to place each repetition into a separate column and store it into Aa and Bb, e.g.:
Aa = [0 0 0 0 Bb = [2 3 4 5
1 1 1 1 4 5 6 7
2 2 2 2 6 7 8 9
3 3 3 3 8 9 10 11
4 4 4 4] 10 11 12 13]
Essentially, each repetition from A and B needs to be copied into the next column and then deleted from the first column
So far I have managed to identify how many repetitions there are and copy the entire column over to the next column and then the next for the amount of repetitions there are but my method doesn't shift the matrix rows to columns as such.
clc;clf;close all
A = [0;1;2;3;4;0;1;2;3;4;0;1;2;3;4;0;1;2;3;4];
B = [2;4;6;8;10;3;5;7;9;11;4;6;8;10;12;5;7;9;11;13];
desiredCol = 1; %next column to go to
destinationCol = 0; %column to start on
n = length(A);
for i = 2:1:n-1
if A == 0;
A = [ A(:, 1:destinationCol)...
A(:, desiredCol+1:destinationCol)...
A(:, desiredCol)...
A(:, destinationCol+1:end) ];
end
end
A = [...] retrieved from Move a set of N-rows to another column in MATLAB
Any hints would be much appreciated. If you need further explanation, let me know!
Thanks!
Given our discussion in the comments, all you need is to use reshape which converts a matrix of known dimensions into an output matrix with specified dimensions provided that the number of elements match. You wish to transform a vector which has a set amount of repeating patterns into a matrix where each column has one of these repeating instances. reshape creates a matrix in column-major order where values are sampled column-wise and the matrix is populated this way. This is perfect for your situation.
Assuming that you already know how many "repeats" you're expecting, we call this An, you simply need to reshape your vector so that it has T = n / An rows where n is the length of the vector. Something like this will work.
n = numel(A); T = n / An;
Aa = reshape(A, T, []);
Bb = reshape(B, T, []);
The third parameter has empty braces and this tells MATLAB to infer how many columns there will be given that there are T rows. Technically, this would simply be An columns but it's nice to show you how flexible MATLAB can be.
If you say you already know the repeated subvector, and the number of times it repeats then it is relatively straight forward:
First make your new A matrix with the repmat function.
Then remap your B vector to the same size as you new A matrix
% Given that you already have the repeated subvector Asub, and the number
% of times it repeats; An:
Asub = [0;1;2;3;4];
An = 4;
lengthAsub = length(Asub);
Anew = repmat(Asub, [1,An]);
% If you can assume that the number of elements in B is equal to the number
% of elements in A:
numberColumns = size(Anew, 2);
newB = zeros(size(Anew));
for i = 1:numberColumns
indexStart = (i-1) * lengthAsub + 1;
indexEnd = indexStart + An;
newB(:,i) = B(indexStart:indexEnd);
end
If you don't know what is in your original A vector, but you do know it is repetitive, if you assume that the pattern has no repeats you can use the find function to find when the first element is repeated:
lengthAsub = find(A(2:end) == A(1), 1);
Asub = A(1:lengthAsub);
An = length(A) / lengthAsub
Hopefully this fits in with your data: the only reason it would not is if your subvector within A is a pattern which does not have unique numbers, such as:
A = [0;1;2;3;2;1;0; 0;1;2;3;2;1;0; 0;1;2;3;2;1;0; 0;1;2;3;2;1;0;]
It is worth noting that from the above intuitively you would have lengthAsub = find(A(2:end) == A(1), 1) - 1;, But this is not necessary because you are already effectively taking the one off by only looking in the matrix A(2:end).

Filter elements from a 3D matrix without loop

I have a 3d matrix H(i,j,k) with dimensions (i=1:m,j=1:n,k=1:o). I will use a simple case with m=n=o = 2:
H(:,:,1) =[1 2; 3 4];
H(:,:,2) =[5 6; 7 8];
I want to filter this matrix and project it to an (m,n) matrix by selecting for each j in 1:n a different k in 1:0.
For instance, I would like to retrieve (j,k) = {(1,2), (2,1)}, resulting in matrix G:
G = [5 2; 7 4];
This can be achieved with a for loop:
filter = [2 1]; % meaning filter (j,k) = {(1,2), (2,1)}
for i = 1:length(filter)
G(:,i) = squeeze(H(:,i,filter(i)));
end
But I'm wondering if it is possible to avoid the for loop via some smart indexing.
You can create all the linear indices to get such an output with the expansion needed for the first dimension with bsxfun. The implementation would look like this -
szH = size(H)
offset = (filter-1)*szH(1)*szH(2) + (0:numel(filter)-1)*szH(1)
out = H(bsxfun(#plus,[1:szH(1)].',offset))
How does it work
(filter-1)*szH(1)*szH(2) and (0:numel(filter)-1)*szH(1) gets the linear indices considering only the third and second dimension elements respectively. Adding these two gives us the offset linear indices.
Add the first dimenion linear indices 1:szH(1) to offset array in elementwise fashion with bsxfun to give us the actual linear indices, which when indexed into H would be the output.
Sample run -
H(:,:,1) =
1 2
3 4
H(:,:,2) =
5 6
7 8
filter =
2 1
out =
5 2
7 4

Finding closest (not equal) rows in two matrices matlab

I have two matrices A(m X 3) and B(n X 3); where m >> n.
Numbers in B have close or equal values to numbers in A.
I want to search closest possible values from A to the values present in B in a way that at the end of search, A will reduced to (n X 3).
There are two main issues:
Only a complete row from A can be compared to a complete row in B, where numbers in each column of A and B are varying independently.
Numbers in A and B may be as close as third place of decimal (e.g. 20.101 and 20.103)
I hope I am clear in asking my question.
Does anybody know about any function already present in matlab for this thing?
Depending on how you look at the task, here are two different approaches
Minimum Distance to Each Row in Second Matrix
Two ways to look at this: (1) closest point in A for each point in B, or (2) closest point in B for each point in A.
Closest point in A
For each point in B you can find the closest point in A (e.g. Euclidean distance), as requested in comments:
% Calculate all MxN high-dimensional (3D space) distances at once
distances = squeeze(sum(bsxfun(#minus,B,permute(A,[3 2 1])).^2,2));
% Find closest row in A for each point in B
[~,ik] = min(distances,[],2)
Make an array the size of B containing these closest points in A:
Anew = A(ik,:)
This will implicitly throw out any points in A that are too far from points in B, as long as each point in B does have a match in A. If each point in B does not necessarily have a "match" (point at an acceptable distance) in A, then it is necessary to actively reject points based on distances, resulting in an output that would be shorter than B. This solution seems out of scope.
Closest point in B
Compute the Euclidenan distance from each point (row) in A to each point in B and identify the closest point in B:
distances = squeeze(sum(bsxfun(#minus,A,permute(B,[3 2 1])).^2,2));
[~,ik] = min(distances,[],2)
Make an array the size of A containing these closest points in B:
Anew = B(ik,:)
The size of Anew in this approach is the same as A.
Merging Similar Points in First Matrix
Another approach is to use the undocumented _mergesimpts function.
Consider this test data:
>> B = randi(5,4,3)
B =
1 4 4
2 3 4
1 3 4
3 4 5
>> tol = 0.001;
>> A = repmat(B,3,1) + tol * rand(size(B,1)*3,3)
A =
1.0004 4.0005 4.0000
2.0004 3.0005 4.0008
1.0004 3.0009 4.0002
3.0008 4.0005 5.0004
1.0006 4.0004 4.0007
2.0008 3.0007 4.0004
1.0009 3.0007 4.0007
3.0010 4.0005 5.0004
1.0002 4.0003 4.0007
2.0001 3.0001 4.0007
1.0007 3.0006 4.0004
3.0001 4.0003 5.0000
Merge similar rows in A according to a specified tolerance, tol:
>> builtin('_mergesimpts',A,tol,'average')
ans =
1.0004 4.0004 4.0005
1.0007 3.0007 4.0005
2.0005 3.0005 4.0006
3.0006 4.0004 5.0003
Merge similar rows, using B to get expected numbers
>> builtin('_mergesimpts',[A; B],tol,'first')
ans =
1 3 4
1 4 4
2 3 4
3 4 5
To replace each row of A by the closest row of B
You can use pdist2 to compute distance between rows, and then the second output of min to find the index of the minimum-distance row:
[~, ind] = min(pdist2(B,A,'euclidean')); %// or specify some other distance
result = B(ind,:);
The advantage of this approach is that pdist2 lets you specify other distance functions, or even define your own. For example, to use L1 distance change first line to
[~, ind] = min(pdist2(B,A,'cityblock'));
To retain rows of A which are closest to rows of B
Use pdist2 as above. For each row of A compute the minimum distance to rows of B. Retain the n rows of A with lowest value of that minimum distance:
[~, ii] = sort(min(pdist2(B,A,'euclidean'))); %// or use some other distance
result = A(ii(1:n),:);
Try this code -
%% Create data
m=7;
n=4;
TOL = 0.0005;
A = rand(m,3)/100;
B = rand(n,3)/100;
B(2,:) = A(5,:); % For testing that the matching part of the second row from B must be the fifth row from A
%% Interesting part
B2 = repmat(reshape(B',1,3,n),[m 1]);
closeness_matrix = abs(bsxfun(#minus, A, B2));
closeness_matrix(closeness_matrix<TOL)=0;
closeness_matrix_mean = mean(closeness_matrix,2); % Assuming that the "closeness" for the triplets on each row can be measured by the mean value of them
X1 = squeeze(closeness_matrix_mean);
[minval,minind] = min(X1,[],1);
close_indices = minind';
A_final = A(close_indices,:)

Given a cell array of vectors, delete vectors that are subsets of any other vector

I wanna delete all the subsets of cell c, suppose I have 6 cell vectors: c{1}=[1 2 3]; c{2}=[2 3 4];
c{3}=[1 2 3 4 5 6]; c{4}=[2 3 4 7]; c{5}=[2 3 7]; c{6}=[4 5 6]; then I wanna delete [1 2 3], [2 3 4] and [4 5 6]. I used two for loops to find all these subsets, but it's too slow for large datasets, is there any simple way can do this?
The following code removes a vector if it's a subset of any other vector. The approach is very similar to my answer to this other question:
n = numel(c);
[i1 i2] = meshgrid(1:n); %// generate all pairs of cells (their indices, really)
issubset = arrayfun(#(k) all(ismember(c{i1(k)},c{i2(k)})), 1:n^2); %// subset?
issubset = reshape(issubset,n,n) - eye(n); %// remove diagonal
c = c(~any(issubset)); %// remove subsets
Note that, in your example, [2 3 7] should also be removed.
You could find the cells that are exact matches for a particular input vector s1 using the following approach:
indx = find(cell2mat(cellfun(#(x)strcmp(num2str(x),num2str(s1)),c,'un', 0)));
You can then loop over matches (which should now be a much smaller set), and remove them by setting their contents to an empty set:
for ii=1:length(indx)
c{:,ii} = [];
end