Finding the common segments of two noncontinuous vectors - matlab

I'm looking for a quick and elegant manner to solve this problem:
I have two noncontinuous lines, like the black ones in this image:
For each, I have two vectors - one defining the starting points of each segment and the other defining the ending points.
I am looking for a MATLAB script that will give me the start and end points for the blue line, which is the intersection of the two lines.
I could, of course, create two vectors, each containing all the elements in the black lines, and then use "intersect". However, since the numbers here are in billions, the size of these vectors will be huge and the intersection will take long.
Any ideas?

Nice question!
This is a solution without loops for combining n discontinuous lines (n is 2 in the original post).
Consider n discontinuous lines, each defined by its start and stop points. Consider also an arbitrary test point P. Let S denote the solution, that is, a discontinuous line defined as the intersection of all the input lines. The key idea is: P is in S if and only if the number of start points to the left of P minus the number of stop points to the left of P equals n (considering all points from all lines).
This idea can be applied compactly with vectorized operations:
start = {[1 11 21], [2 10 15 24]}; %// start points
stop = {[3 14 25], [3 12 18 27]}; %// stop points
%// start and stop are cell arrays containing n vectors, with n arbitrary
n = numel(start);
start_cat = horzcat(start{:}); %// concat all start points
stop_cat = horzcat(stop{:}); %// concat all stop points
m = [ start_cat stop_cat; ones(1,numel(start_cat)) -ones(1,numel(stop_cat)) ].';
%'// column 1 contains all start and stop points.
%// column 2 indicates if each point is a start or a stop point
m = sortrows(m,1); %// sort all start and stop points (column 1),
%// keeping track of whether each point is a start or a stop point (column 2)
ind = find(cumsum(m(:,2))==n); %// test the indicated condition
result_start = m(ind,1).'; %'// start points of the solution
result_stop = m(ind+1,1).'; %'// stop points of the solution
With the above data, the result is
result_start =
2 11 24
result_stop =
3 12 25
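A brute-force cross-check of this result can be sketched as follows. It assumes integer end points and small data, so it is only a sanity test and not a replacement for the sorted approach above (it is exactly the kind of enumeration the question wants to avoid). The names lo, hi, common, check_start and check_stop are made up for this illustration.

```matlab
% Brute-force check on small integer data (sketch only).
% Flag k stands for the unit interval [k+lo-1, k+lo].
start = {[1 11 21], [2 10 15 24]};
stop  = {[3 14 25], [3 12 18 27]};
lo = 1; hi = 27;                       % overall range of the data
common = true(1, hi - lo);             % one flag per unit interval
for i = 1:numel(start)
    covered = false(1, hi - lo);
    for j = 1:numel(start{i})
        covered(start{i}(j)-lo+1 : stop{i}(j)-lo) = true;
    end
    common = common & covered;         % keep intervals covered by every line
end
d = diff([0 common 0]);                % +1 where a run begins, -1 just after it ends
check_start = find(d == 1) + lo - 1    % 2 11 24
check_stop  = find(d == -1) + lo - 1   % 3 12 25
```

The output matches result_start and result_stop above.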

Your idea of discretizing is fine, but instead of using a fixed step size I reduced it to the relevant points: every start or end point of the intersection is a start or end point of one of the inputs.
%first input
v{1}=[1,3,5,7;2,4,6,8];
%second input
v{2}=[2.5,6.5;4,8];
%solution can only contain these values:
relevantPoints=union(v{1}(:),v{2}(:));
%logical matrix: row i column j is true if input i contains segment j
%numel(relevantPoints) Points = numel(relevantPoints)-1 Segments
bn=false(size(v,2),numel(relevantPoints)-1);
for vector=1:numel(v)
    c=v{vector};
    for segment=1:size(c,2)
        thissegment=c(:,segment);
        %set all segments of bn to true, which are covered by the input segment
        bn(vector,find(relevantPoints==thissegment(1)):find(relevantPoints==thissegment(2))-1)=true;
    end
end
%finally the logic we want to apply
resultingSegments=and(bn(1,:),bn(2,:));
seg=[relevantPoints(find(resultingSegments))';relevantPoints(find(resultingSegments)+1)'];
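One thing to be aware of: seg lists elementary pieces between relevant points, so two pieces that touch (the end of one equals the start of the next) are reported separately. If merged segments are wanted, a possible post-processing step looks like this. The seg value here is a hypothetical output chosen to show the effect, and keep_start, keep_stop and merged are names introduced for this sketch.

```matlab
% Hypothetical output with two touching pieces, [3,4] and [4,5]
seg = [3 4 7; 4 5 8];                       % row 1: starts, row 2: ends
touch = seg(1,2:end) == seg(2,1:end-1);     % piece k+1 starts where piece k ends
keep_start = [true, ~touch];                % a start survives unless it touches the previous end
keep_stop  = [~touch, true];                % an end survives unless it touches the next start
merged = [seg(1,keep_start); seg(2,keep_stop)]   % [3 7; 5 8]
```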

Related

Faster way of putting Matrix elements into vectors other than a for loop

Hopefully this will come across correctly. I have 4 clouds/groups of points in a grid array (imagine a 2D space with 4 separate clusters of for example 3x3 grid of points) with each point having an X and Y coordinate. I'd like to write a vector of four points in the form of (X1, Y1, X2, Y2, X3, Y3, X4, Y4) where the number represents each cloud/group. Now I would actually like to write a matrix of all the combinations of the above vector covering all the points, so upper left points in all four groups in the first line, same for the second line, but the top middle point for group 4, etc.
One way to do it is to for-loop over all the variable, which would mean 8 nested for loops (4 for each X coordinate of 4 groups, 4 for each Y coordinate of 4 groups).
Is there a faster way maybe? 4 3x3 groups means 6561 combinations. Going to a larger array in each group, 11x11 for example, would mean 214 million combinations.
I'm trying to parallelize some calculations using these point coordinates, but writing the results in a parfor loop presents its own set of issues if I were to do it on the points themselves. With a matrix of combinations I could just write the results in another matrix with the same number of rows and write the result of the nth row of point coordinates to the nth row of results.
As I understand, you have 4 groups of 3x3=9 coordinate pairs. You need to draw one pair from each group as one result, and produce all possible such results.
Thus, reducing each of the groups of 9 coordinate pairs to a lookup-table that is indexed with a number from 1 to 9, your problem can be reduced to drawing, with replacement, 4 values from the set 1:9.
It is fairly easy to produce all such combinations. permn is one function that does this (from the File Exchange). But you can actually get these even easier using ndgrid:
ind = 1:9;
[a1,a2,a3,a4] = ndgrid(ind,ind,ind,ind);
ind = [a1(:),a2(:),a3(:),a4(:)];
Each row in ind is the indices into one of your 3x3 grids.
For example, if grid 1 is:
x1 = [0.5,0.7,0.8];
y1 = [4.2,5.7,7.1];
then you can generate the coordinate pairs as follows:
[x1,y1] = meshgrid(x1,y1); % meshgrid is nearly the same as ndgrid...
xy1 = [x1(:),y1(:)];
Now your combination k is:
k = 563;
[xy1(ind(k,1),:), xy2(ind(k,2),:), xy3(ind(k,3),:), xy4(ind(k,4),:)]
You probably want to implement the above using multidimensional arrays rather than x1, x2, x3, etc. Adding indices to variables makes for confusing code that is difficult to extend. For example, for n groups you could write:
n = 4;
ind = 1:9;
ind = repmat({ind},n,1);
[ind{:}] = ndgrid(ind{:});
ind = cellfun(@(m)reshape(m,[],1), ind, 'UniformOutput',false);
ind = [ind{:}]; % ind is now the same as in the block of code above
etc.
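To show how the index table turns into the matrix of coordinate combinations asked for, here is a small sketch. It assumes a hypothetical case of n = 2 groups with 2x2 points each (so 16 rows instead of 6561); the cell array xy and the matrix combos are names introduced here, not part of the answer above.

```matlab
% Hypothetical small case: 2 groups, each a 2x2 grid (4 points per group)
xy = cell(1,2);
[gx, gy] = meshgrid([0 1],   [0 1]);  xy{1} = [gx(:), gy(:)];
[gx, gy] = meshgrid([10 11], [5 6]);  xy{2} = [gx(:), gy(:)];
n = numel(xy);

% Index table, built as in the answer
ind = repmat({1:4}, n, 1);
[ind{:}] = ndgrid(ind{:});
ind = cellfun(@(m) reshape(m,[],1), ind, 'UniformOutput', false);
ind = [ind{:}];                        % 16-by-2, one row per combination

% One row per combination: [X1 Y1 X2 Y2]
combos = zeros(size(ind,1), 2*n);
for g = 1:n
    combos(:, 2*g-1:2*g) = xy{g}(ind(:,g), :);
end
size(combos)                           % 16 4
```

Row k of combos can then be matched with row k of a results matrix, which is the layout described in the question for use with parfor.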

How do I track when multiple objects touch in MATLAB?

I have x,y pixel coordinates of multiple objects that have been tracked from an image (3744x5616). The coordinates are stored in a structure called objects, e.g.
objects(1).centre = [1868 1236]
The objects are each uniquely identified by a numerical code, e.g.
objects(i).code = 33
I want to be able to record each time any two objects come within a radius of 300 pixels of each other. What would be the best way to check whether any objects are touching and then record the identity of both objects involved in the interaction, like object 33 interacts with object 34?
Thanks!
Best thing I can think of right now is a brute force approach. Simply check the distances from one object's centre with the rest of the other objects and manually check if the distances are < 300 pixels.
If you want this fast, we should probably do this without any toolboxes. You can intelligently do this with vanilla MATLAB using bsxfun. First, create separate arrays for the X and Y coordinates of each object:
points = reshape([objects.centre], 2, []);
X = points(1,:);
Y = points(2,:);
[objects.centre] accesses the individual coordinates of each centre field in your structure and unpacks them into a comma-separated list. I reshape this array so that it is 2 rows where the first row is the X coordinate and the second row is the Y coordinate. I extract out the rows and place them into separate arrays.
Next, create two difference matrices for each X and Y where the rows denote one unique coordinate and the columns denote another unique coordinate. The values inside this matrix are the differences between the point i at row i and point j at column j:
Xdiff = bsxfun(@minus, X.', X);
Ydiff = bsxfun(@minus, Y.', Y);
bsxfun stands for Binary Singleton EXpansion FUNction. If you're familiar with the repmat function, it essentially replicates matrices and vectors under the hood so that both inputs you're operating on have the same size. In this case, what I'm doing is specifying X or Y as both of the inputs. One is the transposed version of the other. By doing this bsxfun automatically broadcasts each input so that the inputs match in dimension. Specifically, the first input is a column vector of X and so this gets repeated and stacked horizontally for as many times as there are values in X.
Similarly this is done for the Y value. After you do this, you perform an element-wise subtraction for both outputs and you get the component wise subtraction between one point and another point for X and Y where the row gives you the first point, and the column gives you the second point. As a toy example, imagine we had X = [1 2 3]. Doing a bsxfun call using the above code gives:
>> Xdiff = bsxfun(@minus, [1 2 3].', [1 2 3])
Xdiff =
## | 1 2 3
----------------------
1 | 0 -1 -2
2 | 1 0 -1
3 | 2 1 0
There are some additional characters I placed in the output, but these are used solely for illustration and to give you a point of reference. Taking a row value from the ## column and subtracting a column value from the ## row gives you the desired difference. For example, the first row, second column illustrates 1 - 2 = -1. The second row, third column illustrates 2 - 3 = -1. If you do this for both the X and Y points, you get the component-wise differences for one point against all of the other points in a square matrix.
You'll notice that this is an anti-symmetric matrix where the diagonal is all 0, which makes sense since the difference between a coordinate of a point and itself is 0. The lower-left triangular portion of the matrix has the opposite sign of the upper-right portion because of the order of subtraction: if you subtract point 2 from point 1, doing the opposite subtraction gives you the opposite sign. However, let's assume that the rows denote the first object and the columns denote the second object, so you'd want to concentrate on the upper half.
Now, compute the distance, and make sure you set either the upper or lower triangular half to NaN because when computing the distance, the sign gets ignored. If you don't ignore this, we'd find duplicate objects that interact, so object 3 and object 1 would be a different interaction than object 1 and object 3. You obviously don't care about the order, so set either the upper or lower triangular half to NaN for the next step. Assuming Euclidean distance:
dists = sqrt(Xdiff.^2 + Ydiff.^2);
dists(tril(ones(numel(objects))==1)) = NaN;
The first line computes the Euclidean distance for all pairs of points. In the second line, tril extracts the lower triangular portion (diagonal included) of a matrix that consists of all logical 1, and we use this mask to set the lower half of dists to NaN. This allows us to skip entries we're not interested in. Note that this also sets the diagonal to NaN, because we're not interested in the distance of an object to itself.
Now that you're finally here, search for those objects that are < 300 pixels:
[I,J] = find(dists < 300);
I and J are row/column pairs that determine which rows and columns in the matrix have values < 300, so in our case, each pair of I and J in the array gives you the object locations that are close to each other.
To finally figure out the right object codes, you can do:
codes = [[objects(I).code].' [objects(J).code].'];
This uses I and J to access the corresponding codes of those objects that were similar in a comma-separated list and places them side by side into a N x 2 matrix. As such, each row of codes gives you unique pairs of objects that satisfied the distance requirements.
For copying and pasting:
points = reshape([objects.centre], 2, []);
X = points(1,:);
Y = points(2,:);
Xdiff = bsxfun(@minus, X.', X);
Ydiff = bsxfun(@minus, Y.', Y);
dists = sqrt(Xdiff.^2 + Ydiff.^2);
dists(tril(ones(numel(objects))==1)) = NaN;
[I,J] = find(dists < 300);
codes = [[objects(I).code].' [objects(J).code].'];
Toy Example
Here's an example that we can use to verify if what we have is correct:
objects(1).centre = [1868 1236];
objects(2).centre = [2000 1000];
objects(3).centre = [1900 1300];
objects(4).centre = [3000 2000];
objects(1).code = 33;
objects(2).code = 34;
objects(3).code = 35;
objects(4).code = 99;
I initialized 4 objects with different centroids and different codes. Let's see what the dists array gives us after we compute it:
>> format long g
>> dists
dists =
NaN 270.407100498489 71.5541752799933 1365.69396278961
NaN NaN 316.227766016838 1414.2135623731
NaN NaN NaN 1303.84048104053
NaN NaN NaN NaN
I intentionally made the last point farther than any of the other three points to ensure that we can show cases where there are points not near other ones.
As you can see, the pairs (1,2) and (1,3) are within 300 pixels of each other, which is what we get when we complete the rest of the code. This corresponds to objects 33, 34 and 35 with pairings of (33,34) and (33,35). I placed the objects with codes 34 and 35 fairly close together, but their distance (about 316 pixels) is still greater than the 300 pixel threshold, so that pair doesn't count either:
>> codes
codes =
33 34
33 35
Now, if you want to display this in a prettified format, perhaps use a for loop:
for vec = codes.'
    fprintf('Object with code %d interacted with object with code %d\n', vec(1), vec(2));
end
This for loop is a bit tricky. It's a little known fact that for loops can also accept matrices and the index variable gives you one column of each matrix at a time from left to right. Therefore, I transposed the codes array so that each pair of unique codes becomes a column. I just access the first and second element of each column and print it out.
We get:
Object with code 33 interacted with object with code 34
Object with code 33 interacted with object with code 35
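For readers on newer releases, the same computation can be sketched without bsxfun. This variant assumes MATLAB R2016b or later (implicit expansion), and it filters the pairs with I < J instead of writing NaN into the matrix; that filtering choice is mine and not part of the answer above. It rebuilds the toy data so that it runs on its own.

```matlab
% Toy data from the example above
objects = struct('centre', {[1868 1236], [2000 1000], [1900 1300], [3000 2000]}, ...
                 'code',   {33, 34, 35, 99});

points = reshape([objects.centre], 2, []);
dx = points(1,:).' - points(1,:);      % implicit expansion (R2016b+)
dy = points(2,:).' - points(2,:);
dists = hypot(dx, dy);                 % full symmetric distance matrix

[I, J] = find(dists < 300);
keep = I < J;                          % drop self-pairs and mirrored duplicates
I = I(keep);  J = J(keep);
codes = [[objects(I).code].' [objects(J).code].']   % [33 34; 33 35]
```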

How to plot a 3D object from line segments in Matlab

I have a problem with fast plotting of a simple 3D model which I read from a .dxf file. The whole object is defined by points and the lines between them.
I have a matrix with coordinates. Every row is a unique point and every column is one coordinate.
Then I have an index matrix of size Nx2, where N is the number of lines in the model, and every row holds the indices of the 2 points in the coordinate matrix which should be connected by a line.
So the structure of the data is very similar to that of the data after triangulation and I need a function similar to trimesh or trisurf, though not for triangles, but for lines.
I can do that by letting a for loop cycle through the index matrix and plot every row separately, but it is very slow compared with built-in functions like trimesh.
Brief example:
%Coordinate matrix
NODES=[
-12.76747 -13.63075 -6.41142
-12.76747 -8.63075 -6.41142
-8.76747 -13.63075 -6.41142
-16.76747 -13.63075 -6.41142
-11.76747 -7.63075 -2.41142
];
%index matrix
LINES=[
1 2
3 4
1 4
3 5
1 5
];
%The slow way of creating the figure
figure(1)
hold on
for k=1:length(LINES)
    plot3(NODES(LINES(k,:), 1), NODES(LINES(k,:), 2), NODES(LINES(k,:), 3), '.-')
end
view(20, 20)
hold off
I want to find a better and faster way to produce this figure
I think the code is self-explanatory (it assumes that NODES and LINES are already defined):
%'Calculated: edge coordinates and line specs'
TI = transpose(LINES);
DI = 2*ones(1,size(TI,2));
X = mat2cell(NODES(TI,1), DI);
Y = mat2cell(NODES(TI,2), DI);
Z = mat2cell(NODES(TI,3), DI);
L = repmat({'.-'}, size(X));
%'Output: plot'
ARGS = transpose([X,Y,Z,L]);
plot3(ARGS{:});
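Another option, not taken from the answer above, is to draw everything as a single line object and separate the segments with NaN, which plot3 treats as a break in the line. A minimal sketch using the data from the question follows; note that all segments then share one colour, because there is only one line object.

```matlab
NODES = [-12.76747 -13.63075 -6.41142
         -12.76747  -8.63075 -6.41142
          -8.76747 -13.63075 -6.41142
         -16.76747 -13.63075 -6.41142
         -11.76747  -7.63075 -2.41142];
LINES = [1 2; 3 4; 1 4; 3 5; 1 5];

n   = size(LINES, 1);
idx = LINES.';                                   % 2-by-n, one column per segment
% Each column becomes [start; end; NaN]; the NaN breaks the line between segments
X = [reshape(NODES(idx,1), 2, n); nan(1,n)];
Y = [reshape(NODES(idx,2), 2, n); nan(1,n)];
Z = [reshape(NODES(idx,3), 2, n); nan(1,n)];

figure(1)
plot3(X(:), Y(:), Z(:), '.-')
view(20, 20)
```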

Plot of function in Matlab

I need to plot this function in Matlab:
Lines must be connected, I mean at end of decreasing line, increasing one must start etc. It looks like this:
Any idea? I need it on some wide interval, for example t goes from zero to 10
The reason why it isn't working as expected is because for each curve you are drawing between multiples of 0.1 seconds, the y-intercept is not being properly calculated and so the curves are not placed in the right location. For the first part of your curve, y = -57.5t, the y-intercept is at the origin and so your curve is y = -57.5t as expected. However, when you reach 0.1 seconds, you need to solve for the y-intercept for this new line with the new slope, as it has shifted over. Specifically:
y = 42.5t + b
We know that at t = 0.1 seconds, y = -5.75 given the previous curve. Solving for the y-intercept gives us:
-5.75 = (42.5)(0.1) + b
b = -10
As such, between 0.1s <= t <= 0.2s, your equation of the line is actually:
y = 42.5t - 10
Now, repeating the same procedure at t = 0.2s, we have a new equation of the line, even though it has the same slope as the origin:
y = -57.5t + b
From the previous curve, we know that at t = 0.2 seconds, y = (42.5)(0.2) - 10 = -1.5. Therefore, the intercept for this new curve is:
-1.5 = -(57.5)(0.2) + b
b = 10
Therefore, y = -57.5t + 10 is the curve between 0.2s <= t <= 0.3s. If you keep repeating these calculations, you'll see that the next y-intercept is -20, then for the next one it's 20, then the next one after that is -30 and so on. You see a nice multiple of 10 pattern for these calculations: the curve with the positive slope always has a negative y-intercept that is a multiple of 10, and the curve with the negative slope always has a positive y-intercept that is a multiple of 10.
This is the pattern we need to keep in mind when plotting this curve. Because when you're plotting in MATLAB, we have to plot points discretely, you'll want to define a sampling time that defines the time resolution between each point. Because these are linear curves, you don't need that small of a sampling time, but let's choose 0.01 seconds for the sake of simplicity. This means that we will have 10 points between each new curve.
Therefore, for every 10 points in our plot, we will draw a different curve with a different y-intercept for each curve. Because you want to draw points between 0 to 10 seconds, this means we will need (100)(10) = 1000 points. However, this does not include the origin, so you actually need 1001 points. As such, you'd define your t vector like this:
t = linspace(0,10,1001);
Now, for every 10 points, we need to keep changing our y intercept. At the first segment, the y intercept is 0, at the second segment the y intercept is -10, and so on. Now, a lot of MATLAB purists are going to tell you that for loops are taboo, but when it comes to indexing operations, for loops are amongst the fastest in timing in comparison to other more vectorized solutions. As an example, take a look at this post, where I implement a solution with a for loop and it was the fastest amongst the other proposed solutions.
First let's define an array of slopes where each element tells us the slope per segment. Because we have 10 seconds worth of segments, and each segment is 0.1 seconds in length, we have 100 segments; the arrays below hold 101 entries, one more than the loop further down actually uses. The first segment has a slope of -57.5. After this, our slopes alternate between 42.5 and -57.5. Actually, this alternates 50 times. To create this array, we do:
m = [-57.5 repmat([42.5 -57.5], 1, 50)];
I use repmat to repeat the [42.5 -57.5] array 50 times for a total of 100 values, plus the -57.5 at the origin.
Now, let's define a y-intercept vector that tells us what the y intercept is at each segment.
y = zeros(1,101);
y(2:2:101) = 1;
y = 10*cumsum(y);
y(2:2:101) = -y(2:2:101);
The above code will generate a y-intercept vector such that it starts at 0, then has coefficients of -10, 10, then -20, 20, etc. The trick with this code is that I first generate a sequence of [0 1 0 1 0 1 0 1 0 1...]. After, I use cumsum, which does a cumulative summation where for each point in your array, it adds values from the beginning up until that point. Therefore, if we did cumsum on this binary sequence, it would give us [0 1 1 2 2 3 3 4 4...]. When we multiply this by 10, we get [0 10 10 20 20 30 30 40 40...]. Finally, to complete the intercepts, we just negate every even location in this array, and so we finally get [0 -10 10 -20 20 -30 30 -40 40...].
Now, here's the code we're going to use to generate our curve. We are going to iterate through each segment, and generate our output values with the y-intercept taken into account. We first need to allocate an output array that will store our values, then we will populate the values per segment. We also need to keep track of which time values we are going to access to compute our output values.
As such:
%// Define time vector
t = linspace(0,10,1001);
%// Define slopes
m = [-57.5 repmat([42.5 -57.5], 1, 50)];
%// Define y-intercepts
y = zeros(1,101);
y(2:2:101) = 1;
y = 10*cumsum(y);
y(2:2:101) = -y(2:2:101);
%// Calculate the output curves for each segment
out = zeros(1, numel(t));
for idx = 1 : numel(y)-1
    %// Compute where in the time array and output array
    %// we need to write to
    vals_to_access = (idx - 1)*10 + 1 : idx*10;
    %// Create the curve for this segment
    out(vals_to_access) = m(idx)*t(vals_to_access) + y(idx);
end
%// Copy second last value over to last value
out(end) = out(end-1);
%// Plot the curve
plot(t,out);
axis tight;
The trick with the for loop is to know where to access the time values for each segment, and where to write these values to. That's the purpose of vals_to_access. Also, note that the for loop only populates values in the array from the first index up to the 1000th index, but does not compute the 1001st element. To make things simple, we'll just copy the element from the second last point to the last point, which is why out(end) = out(end-1); is there. The above code will also plot the curve and makes sure that the axes are tightly bound. As such, this is what I get:
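The intercept pattern derived earlier (0, -10, 10, -20, 20, ...) can also be written in closed form, which gives a loop-free sketch of the same curve. This is my own restatement of that pattern, not code from the answer: with a 0-based segment index k, even segments have slope -57.5 and intercept 5k, odd segments have slope 42.5 and intercept -5(k+1). Because the curve is continuous, a sample that floating-point rounding assigns to the neighbouring segment at a boundary still gets the right value.

```matlab
t = linspace(0, 10, 1001);
k = floor(t / 0.1);                    % 0-based segment index of each sample
odd = mod(k, 2) == 1;

m = -57.5 * ones(size(t));             % even segments: falling line
m(odd) = 42.5;                         % odd segments: rising line
b = 5 * k;                             % even segments: 0, 10, 20, ...
b(odd) = -5 * (k(odd) + 1);            % odd segments: -10, -20, ...

out = m .* t + b;
plot(t, out);
axis tight;
```

With this version the last sample is computed directly (the curve reaches -75 at t = 10), so no value needs to be copied over.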

How do I generate pair of random points in a circle using Matlab?

Let a circle of known radius be plotted in MATLAB.
Assume a pair of random points whose location has to be determined in terms of coordinates (x1,y1) (x2,y2)..(xn,yn). Pairs should be close to each other. For example T1 and R1 should be near.
As shown in figure, there are four random pairs (T1,R1)..(T4,R4).
Their coordinates need to be determined with respect to the center (0,0).
How can I generate this in MATLAB?
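The simplest approach to pick a point from a uniform distribution over a circle with radius R is rejection sampling. Here is the code:
function [x, y] = circular_uniform(R)
% Draw candidates uniformly from the bounding square until one lands in the circle
while true
    x = 2*R*rand() - R;
    y = 2*R*rand() - R;
    if (x*x + y*y) <= R*R
        return
    end
end
The loop runs 4/π ≈ 1.27 times on average, because each candidate lands inside the circle with probability π/4 (the ratio of the circle's area to the area of the bounding square).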
(Complete edit after the question was edited).
To complete this task, I think that you need to combine the different approaches that have been mentioned before your edit:
To generate the centers T1,T2,T3,... in the green torus, use the polar coordinates. (Edit: this turned out to be wrong, rejection sampling must also be used here, otherwise, the distribution is not uniform!)
To generate the points R1,R2,R3,... in the circle around T1,T2,T3,... but still in the torus, use the rejection sampling.
With these ingredients, you should be able to do everything you need. Here is the code I wrote:
d=859.23;
D=1432.05;
R=100;
N=400;
% Generate the angle
theta = 2*pi*rand(N,1);
% Generate the radius
r = d + (D-d)*rand(N,1);
% Get the centers of the circles
Tx = r.*cos(theta);
Ty = r.*sin(theta);
% Generate the R points
Rx=zeros(N,1);
Ry=zeros(N,1);
for i=1:N
    while true
        % Try
        alpha = 2*pi*rand();
        rr = R*rand();
        Rx(i) = Tx(i) + rr*cos(alpha);
        Ry(i) = Ty(i) + rr*sin(alpha);
        % Check if in the correct zone
        if ( (Rx(i)*Rx(i) + Ry(i)*Ry(i) > d*d) && (Rx(i)*Rx(i) + Ry(i)*Ry(i) < D*D) )
            break
        end
    end
end
% Display
figure(1);
clf;
angle=linspace(0,2*pi,1000);
plot( d*cos(angle), d*sin(angle),'-b');
hold on;
plot( D*cos(angle), D*sin(angle),'-b');
for i=1:N
    plot(Tx(i),Ty(i),'gs');
    plot(Rx(i),Ry(i),'rx');
    plot([Tx(i) Rx(i)],[Ty(i) Ry(i)],'-k');
end
hold off;
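As the edit note in this answer says, drawing the radius uniformly between d and D does not give a uniform distribution over the ring. Besides rejection sampling, one standard fix is to draw the squared radius uniformly, since the area inside radius r grows with r^2. The sketch below is not part of the original code; it reuses the values of d, D and N from above and could replace the "Generate the radius" line.

```matlab
d = 859.23;
D = 1432.05;
N = 400;

theta = 2*pi*rand(N,1);
% Inverse-CDF sampling: the fraction of ring area inside radius r is
% (r^2 - d^2)/(D^2 - d^2), so r^2 is drawn uniformly between d^2 and D^2
r = sqrt(d^2 + (D^2 - d^2)*rand(N,1));

Tx = r.*cos(theta);
Ty = r.*sin(theta);
```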
Let R be the radius of the circle centered at (0;0).
(x,y) : x^2+y^2<=R^2 (LE) to be inside the circle
x = rand()*2*R - R;
y should be in interval (-sqrt(R^2 - x^2);+sqrt(R^2 - x^2))
so, let it be
y = rand()*sqrt(R^2 - x^2)*2-sqrt(R^2 - x^2);
Hope that's right, I have no MATLAB to test.
Hope you'll manage to find close pairs yourself.
OK, I'll spend a bit more time on a hint.
To find a random number k in interval [a,b] use
k = rand()*(b-a)+a
Now it should really help if I still remember the MATLAB syntax. Good luck.
Here is a low quality solution that is very easy to use with uniformly distributed points. Assuming the number of points is small, efficiency should not be a concern; if you want better quality you can use something more powerful than nearest neighbor:
1. While you have fewer than n points: generate a random point
2. If it is in the circle, store it, else go to step 1
3. While there are unpaired points: check which point is nearest to the first unpaired point, make them a pair
As a result most pairs should be good, but some can be really really bad. I would recommend you to try it and perhaps add a step 4 with k-opt or some other local search if required. And if you really have few points (e.g. less than 20) you can of course just calculate all distances and find the optimum matching.
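The greedy pairing step (nearest neighbour of the first unpaired point) could be sketched like this. The points in pts are hypothetical values picked so the result is easy to follow, and the variable names are introduced here for illustration.

```matlab
pts = [0 0; 1 0; 0.1 0; 0.9 0.1];      % hypothetical points already inside the circle
unpaired = 1:size(pts,1);
pairs = zeros(0,2);
while numel(unpaired) >= 2
    a = unpaired(1);                   % first unpaired point
    rest = unpaired(2:end);
    d2 = sum(bsxfun(@minus, pts(rest,:), pts(a,:)).^2, 2);   % squared distances
    [~, k] = min(d2);                  % nearest remaining point
    pairs(end+1,:) = [a, rest(k)];
    unpaired(unpaired == a | unpaired == rest(k)) = [];
end
pairs                                  % [1 3; 2 4]
```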
If you don't really care about the uniform distribution, here is an even easier solution:
1. While you have fewer than n points: generate a random point
2. If it is in the circle, store it, else go to step 1
3. For each of these points, generate a point near it
4. If it is in the circle, store it, else go to step 3
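The four steps above can be sketched as follows. R, n and delta (how far a partner may be from its point along each axis) are hypothetical values chosen for this illustration, and "near" is taken to mean inside a small square around the first point, which is only one possible reading.

```matlab
R = 1;              % circle radius (hypothetical)
n = 4;              % number of pairs (hypothetical)
delta = 0.1 * R;    % maximum offset per axis for the partner point (hypothetical)

T = zeros(n,2);     % first point of each pair
P = zeros(n,2);     % partner point of each pair
for i = 1:n
    % Steps 1-2: draw from the bounding square until the point is in the circle
    while true
        c = (2*rand(1,2) - 1) * R;
        if sum(c.^2) <= R^2, break; end
    end
    T(i,:) = c;
    % Steps 3-4: draw a nearby point until it is in the circle as well
    while true
        p = T(i,:) + (2*rand(1,2) - 1) * delta;
        if sum(p.^2) <= R^2, break; end
    end
    P(i,:) = p;
end
```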
Generate the random points as @PheuVerg suggested (with a slight vectorized tweak)
R = 1; %circle radius (assumed here; the example data below fits R = 1)
n = 8; %must be even!
x = rand(n, 1)*2*R - R;
y = rand(n, 1).*sqrt(R^2 - x.^2).*2-sqrt(R^2 - x.^2);
Then use kmeans clustering to get n/2 centers
[~ c] = kmeans([x y], n/2);
Now you have to loop through each center and find its distance to each point
dists = zeros(n, n/2);
for cc = 1:n/2
    for pp = 1:n
        dists(pp, cc) = sqrt((c(cc,1) - x(pp))^2 + (c(cc,2) - y(pp))^2);
    end
end
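The double loop can also be written with bsxfun. The sketch below uses hypothetical points and centers (not the kmeans output from this answer) just to show the shapes involved: x and y are n-by-1 columns, c is (n/2)-by-2, and the result is n-by-(n/2) as in the loop version.

```matlab
x = [0; 1; 0; 1];                      % hypothetical points
y = [0; 0; 1; 1];
c = [0 0.5; 1 0.5];                    % hypothetical centers, one per row

% Column of points minus row of centers expands to an n-by-(n/2) matrix
dists = sqrt(bsxfun(@minus, x, c(:,1).').^2 + bsxfun(@minus, y, c(:,2).').^2);
```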
Now you must find the smallest 2 values for each column of dists
[sorted, idx] = sort(dists);
So now the top two rows of each column are the two nearest points. But there could be clashes, i.e. points that are nearest to two different centers. So for repeated values you have to loop through and choose the swap that gives you the smallest extra distance.
Example data:
x =
0.7894
-0.7176
-0.5814
0.0708
0.5198
-0.2299
0.2245
-0.8941
y =
-0.0800
-0.3339
0.0012
0.9765
-0.4135
0.5733
-0.1867
0.2094
sorted =
0.1870 0 0 0.1555
0.2895 0.5030 0.5030 0.2931
0.3145 1.1733 0.6715 0.2989
1.0905 1.1733 0.7574 0.7929
1.1161 1.2326 0.8854 0.9666
1.2335 1.2778 1.0300 1.2955
1.2814 1.4608 1.2106 1.3051
1.4715 1.5293 1.2393 1.5209
idx =
5 4 6 3
7 6 4 2
1 3 3 8
6 7 8 6
3 8 7 7
2 1 2 4
4 5 1 5
8 2 5 1
So now it's clear that 5 and 7 are pairs, and that 3 and 2 are pairs. But 4 and 6 are both repeated (in this case it is clear that they are pairs too, I guess!), but what I would suggest is to leave point 4 with center 2 and point 6 with center 3. Then we start at column 2 and see the next available point is 8 with a distance of 1.2326. This would leave point 1 paired with point 6, but then its distance from the center is 1.2106. Had we paired point 6 with 8 and point 4 with 1 we would have got distances of 0.7574 and 1.2778 respectively, which is actually less total distance. So finding 'close' pairs is easy but finding the set of pairs with the globally smallest total distance is hard! This solution gets you something decent quite easily but if you need the global best then I'm afraid you have quite a bit of work to do still :(
Finally let me add some visualisation. First let's (manually) create a vector that shows which points are paired:
I = [1 2 2 1 3 4 3 4];
Remember that this will depend on your data! Now you can plot it nicely like this:
gscatter(x, y, I)
Hope this gets you close and that you can eliminate the manual pairing of mine at the end by yourself. It shouldn't be too hard to get a crude solution.