Calculate Euclidean distance for every row with every other row in a NxM matrix? - matlab

I have a matrix that I generate from a CSV file as follows:
X = xlsread('filename.csv');
I am looping through the matrix based on the number of records and I need to find the Euclidean distance for each of the rows of this matrix :
for i = 1:length(X)
j = X(:, [2:5])
end
The resulting matrix is of 150 X 4. What would be the best way to calculate the Euclidean distance of each row (with 4 columns as the data points) with every row and getting an average of the same?

In order to find the Euclidean distance between any pair of rows, you could use the function pdist.
X = randn(6, 4);
D = pdist(X,'euclidean');
res=mean(D);
The average is stored in res.

Related

Distance between any combination of two points

I have 100 coordinates in a variable x in MATLAB . How can I make sure that distance between all combinations of two points is greater than 1?
You can do this in just one simple line, with the functions all and pdist:
if all(pdist(x)>1)
...
end
Best,
First you'll need to generate a matrix that gives you all possible pairs of coordinates. This post can serve as inspiration:
Generate a matrix containing all combinations of elements taken from n vectors
I'm going to assume that your coordinates are stored such that the columns denote the dimensionality and the rows denote how many points you have. As such, for 2D, you would have a 100 x 2 matrix, and in 3D you would have a 100 x 3 matrix and so on.
Once you generate all possible combinations, you simply compute the distance... which I will assume it to be Euclidean here... of all points and ensure that all of them are greater than 1.
As such:
%// Taken from the linked post
vectors = { 1:100, 1:100 }; %// input data: cell array of vectors
n = numel(vectors); %// number of vectors
combs = cell(1,n); %// pre-define to generate comma-separated list
[combs{end:-1:1}] = ndgrid(vectors{end:-1:1}); %// the reverse order in these two
%// comma-separated lists is needed to produce the rows of the result matrix in
%// lexicographical order
combs = cat(n+1, combs{:}); %// concat the n n-dim arrays along dimension n+1
combs = reshape(combs,[],n); %// reshape to obtain desired matrix
%// Index into your coordinates array
source_points = x(combs(:,1), :);
end_points = x(combs(:,2), :);
%// Checks to see if all distances are greater than 1
is_separated = all(sqrt(sum((source_points - end_points).^2, 2)) > 1);
is_separated will contain either 1 if all points are separated by a distance of 1 or greater and 0 otherwise. If we dissect the last line of code, it's a three step procedure:
sum((source_points - end_points).^2, 2) computes the pairwise differences between each component for each pair of points, squares the differences and then sums all of the values together.
sqrt(...(1)...) computes the square root which gives us the Euclidean distance.
all(...(2)... > 1) then checks to see if all of the distances computed in Step #2 were greater than 1 and our result thus follows.

How to create matrix of nearest neighbours from dataset using matrix of indices - matlab

I have an Nx2 matrix of data points where each row is a data point. I also have an NxK matrix of indices of the K nearest neighbours from the knnsearch function. I am trying to create a matrix that contains in each row the data point followed by the K neighbouring data points, i.e. for K = 2 we would have something like [data1, neighbour1, neighbour2] for each row.
I have been messing round with loops and attempting to index with matrices but to no avail, the fact that each datapoint is 1x2 is confusing me.
My ultimate aim is to calculate gradients to train an RBF network in a similar manner to:
D = (x_dist - y_dist)./(y_dist+(y_dist==0));
temp = y';
neg_gradient = -2.*sum(kron(D, ones(1,2)) .* ...
(repmat(y, 1, ndata) - repmat((temp(:))', ndata, 1)), 1);
neg_gradient = (reshape(neg_gradient, net.nout, ndata))';
You could use something along those lines:
K = 2;
nearest = knnsearch(data, data, 'K', K+1);%// Gets point itself and K nearest ones
mat = reshape(data(nearest.',:).',[],N).'; %// Extracts the coordinates
We generate data(nearest.',:) to get a 3*N-by-2 matrix, where every 3 consecutive rows are the points that correspond to each other. We transpose this to get the xy-coordinates into the same column. (MATLAB is column major, i.e. values in a column are stored consecutively). Then we reshape the data, so every column contains the xy-coordinates of the rows of nearest. So we only need to transpose once more in the end.

Finding maximum/minimum distance of two rows in a matrix using MATLAB

Say we have a matrix m x n where the number of rows of the matrix is very big. If we assume each row is a vector, then how could one find the maximum/minimum distance between vectors in this matrix?
My suggestion would be to use pdist. This computes pairs of Euclidean distances between unique combinations of observations like #seb has suggested, but this is already built into MATLAB. Your matrix is already formatted nicely for pdist where each row is an observation and each column is a variable.
Once you do apply pdist, apply squareform so that you can display the distance between pairwise entries in a more pleasant matrix form. The (i,j) entry for each value in this matrix tells you the distance between the ith and jth row. Also note that this matrix will be symmetric and the distances along the diagonal will inevitably equal to 0, as any vector's distance to itself must be zero. If your minimum distance between two different vectors were zero, if we were to search this matrix, then it may possibly report a self-distance instead of the actual distance between two different vectors. As such, in this matrix, you should set the diagonals of this matrix to NaN to avoid outputting these.
As such, assuming your matrix is A, all you have to do is this:
distValues = pdist(A); %// Compute pairwise distances
minDist = min(distValues); %// Find minimum distance
maxDist = max(distValues); %// Find maximum distance
distMatrix = squareform(distValues); %// Prettify
distMatrix(logical(eye(size(distMatrix)))) = NaN; %// Ignore self-distances
[minI,minJ] = find(distMatrix == minDist, 1); %// Find the two vectors with min. distance
[maxI,maxJ] = find(distMatrix == maxDist, 1); %// Find the two vectors with max. distance
minI, minJ, maxI, maxJ will return the two rows of A that produced the smallest distance and the largest distance respectively. Note that with the find statement, I have made the second parameter 1 so that it only returns one pair of vectors that have this minimum / maximum distance between each other. However, if you omit this parameter, then it will return all possible pairs of rows that share this same distance, but you will get duplicate entries as the squareform is symmetric. If you want to escape the duplication, set either the upper triangular half, or lower triangular half of your squareform matrix to NaN to tell MATLAB to skip searching in these duplicated areas. You can use MATLAB's tril or triu commands to do that. Take note that either of these methods by default will include the diagonal of the matrix and so there won't be any extra work here. As such, try something like:
distValues = pdist(A); %// Compute pairwise distances
minDist = min(distValues); %// Find minimum distance
maxDist = max(distValues); %// Find maximum distance
distMatrix = squareform(distValues); %// Prettify
distMatrix(triu(true(size(distMatrix)))) = NaN; %// To avoid searching for duplicates
[minI,minJ] = find(distMatrix == minDist); %// Find pairs of vectors with min. distance
[maxI,maxJ] = find(distMatrix == maxDist); %// Find pairs of vectors with max. distance
Judging from your application, you just want to find one such occurrence only, so let's leave it at that, but I'll put that here for you in case you need it.
You mean the max/min distance between any 2 rows? If so, you can try that:
numRows = 6;
A = randn(numRows, 100); %// Example of input matrix
%// Compute distances between each combination of 2 rows
T = nchoosek(1:numRows,2); %// pairs of indexes for all combinations of 2 rows
for k=1:length(T)
d(k) = norm(A(T(k,1),:)-A(T(k,2),:));
end
%// Find min/max distance
[~, minIndex] = min(d);
[~, maxIndex] = max(d);
T(minIndex,:) %// Displays indexes of the 2 rows with minimum distance
T(maxIndex,:) %// Displays indexes of the 2 rows with maximum distance

Euclidean distance between two columns of two vector Matlab

I have two vectors A & B of size 250x4. The first column in each vector has the X values and the second column has the Y values. I want to calculate the euclidean distance between each the X & Y of each row in the two vectors and save the result in a new vector C of size 250x1 which holds the result of the euclidean distance. For example, if the first row in A is A1x, A1y, A1n, A1m and the first row in B is B1x, B1y, B1n, B1m so I want to get the eucledian distance which will be [(A1x-B1x)^2 + (A1y-B1y)^2]^0.5 and the result will be saved in C1 and same will be done for the rest of the 250 rows. So if anyone could please advise how to do this in Matlab.
Like this:
%// First extract on x-y data from A and B
Axy = A(:,1:2);
Bxy = B(:,1:2);
%// Find all euclidean distances (row-wise)
C1 = sqrt(sum((Axy-Bxy).^2,2));
plus it handles higher dimension too
use pdist2:
C1=diag(pdist2(A(:,1:2),B(:,1:2)));
Actually, pdist2 will give you a 250x250 matrix, because it calculate all the distances. You need only the main diagonal, so calling diag on the result (as in the code above) will produce the wanted result.

Adding Specific Coordinates of a Matrix in Matlab

I am trying to find the sum of certain coordinates in a matrix.
I have a N x M matrix. I have a vector which contains 2xM values. Every pair of values in the vector is a coordinate in the matrix. Hence their are M number of coordinates. I want to find the sum of all of the coordinates without using a for loop.
Is there a matrix operation I can use to get this?
Thanks
As I understand it the vector contains the (row,column) coordinates to the matrix elements. You can just transform them into matrix element number indices. This example shows how to do it. I'm assuming that your coordinate vector looks like this:
[n-coordinate1 m-coordinate1 n-coordinate2 m-coordinate2 ...]
n = 5; % number of rows
m = 5; % number of columns
matrix = round(10*rand(n,m)); % An n by m example matrix
% A vector with 2*m elements. Element 1 is the n coordinate,
% Element 2 the m coordinate, and so on. Indexes into the matrix:
vector = ceil(rand(1,2*m)*5);
% turn the (n,m) coordinates into the element number index:
matrixIndices = vector(1:2:end) + (vector(2:2:end)-1)*n);
sumOfMatrixElements = sum(matrix(matrixIndices)); % sums the values of the indexed matrix elements
If you want to find the centroid of your 2xM array coords, then you can simply write
centroid = mean(coords,2)
If you want to find the weighted centroid, where each coordinate pair is weighted by the corresponding entry in the MxN array A, you can use sub2ind like this:
idx = sub2ind(size(A),coords(1,:)',coords(2,:)');
weights = A(idx);
weightedCentroid = sum( bsxfun( #times, coords', weights), 1 ) / sum(weights);
If all you want is the sum of all the entries to which the coordinates point, you can do the above and simply sum the weights:
idx = sub2ind(size(A),coords(1,:)',coords(2,:)');
weights = A(idx);
sumOfValues = sum(weights);