MATLAB calculating distances in a loop - matlab

I'm loading a .csv file to do a few calculations in matlab. The file itself has ~1600 lines, but I'm interested in only a subset.
load file.csv; %load file
for i = 400:1200 %rows I am interested in
rh_x= file(i,60); % columns interested, in column 60 for the x, 61 for y
rh_y= file(i,61);
rh_x2 = file(i+1, 60); % next point (x,y)
rh_y2 = file(i+1, 61);
p1 = [rh_x, rh_y];
p2 = [rh_x2, rh_y2];
coord = [p1, p2];
Distan = pdist(coord, 'euclidean'); % <-- problem line
disp(Distan);
end
Nothing is being stored in my Distan variable (the distance formula), where I tried to input two points. Why is that the case? I just want to calculate the distance between each pair of consecutive points, using columns 60 and 61, for frames 400-1200.
Thank you.

Change your coord assignment to the following:
coord = [p1; p2];
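The way you have it, [p1, p2] concatenates horizontally, so both x, y pairs are stored on the same row as a 1x4 matrix. pdist treats each row as one point, so with a single row there are no pairs to compare and the result is empty. The above method stores the points as a 2x2 matrix, and pdist returns the distance between its two rows.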
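If only the step-by-step distances are needed, the loop can be avoided altogether. The following is a minimal sketch, not part of the original answer. It assumes the data is already in a numeric matrix (random numbers stand in for the loaded CSV here) and uses base MATLAB functions instead of pdist:

```matlab
% Stand-in for the loaded CSV data (assumption: ~1600 rows, at least 61 columns)
file = rand(1600, 61);

rows = 400:1200;                          % rows of interest
xy = file([rows, rows(end) + 1], 60:61);  % rows 400..1201, columns x and y

step = diff(xy, 1, 1);                    % differences between consecutive points
Distan = sqrt(sum(step.^2, 2));           % 801x1 vector of Euclidean distances
```

Here Distan(k) is the distance between rows 399+k and 400+k, so every result is kept instead of only being displayed.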

Related

MATLAB: 2D DATA TO 3D ! How to?

I have a 34200 x 4 table. It contains 30 years of monthly precipitation (pr) values at a set of latitudes and longitudes, so the columns are lat, lon, date, and pr. I want to convert it to a 3D matrix with dimensions longitude x latitude x month.
I have attached my table.
Please tell me how to do it; I'm a beginner. I could manage the conversion if the month were not involved, but with it this problem is too complicated for me.
Please look at my table; it's only 235 KB. I uploaded it to my Dropbox, so please click Open at the top right and then click Download.
Here is my image
Inspecting your data, your latitude and longitude values actually represent 95 unique locations, which are scattered seemingly randomly. You can see that in the figure below.
length(unique(C.lat)) % 95
length(unique(C.lon)) % 95
scatter(C.lat, C.lon)
If the locations were spaced in a grid, it would make sense to use lat and lon as the axes of a data matrix. But instead, it is better to use only one axis representing the unique locations. This then leaves you with a second axis representing the date.
length(unique(C.date)) % 360
360 * 95 % 34200 - the number of values we have
Reformatting the data
Therefore, I would store the data in a 2D matrix as follows.
locations_lat = unique(C.lat, 'stable');
locations_lon = unique(C.lon, 'stable');
dates = unique(C.date, 'stable');
data = reshape(C.pr, length(dates), length(locations_lat)); % size 360 x 95
Then, to check that this has worked, choose a random example.
location_num = 27;
date_num = 242;
lat = locations_lat(location_num) % 14.68055556
lon = locations_lon(location_num) % 65.23111111
date = dates(date_num) % 2/1/2009
precipitation = data(date_num, location_num) % 16.7179
Searching for that position and date in the original table, we have:
9602| 14.6805555600000 65.2311111100000 '2/1/2009' 16.7179000000000
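The reshape above relies on the rows of C being ordered location by location, with the dates in the same order for each location. If that ordering is not guaranteed, one possible alternative is to build the matrix from explicit indices. This is only a sketch on a small made-up table (the values and sizes below are assumptions for illustration):

```matlab
% Made-up stand-in for C: 3 locations x 4 dates
lat  = repelem([10; 20; 30], 4);
lon  = repelem([50; 60; 70], 4);
date = repmat({'1/1/2009'; '2/1/2009'; '3/1/2009'; '4/1/2009'}, 3, 1);
pr   = (1:12)';
C = table(lat, lon, date, pr);

% Index of each row's location and date among the unique values
[locations, ~, loc_idx] = unique([C.lat, C.lon], 'rows', 'stable');
[dates, ~, date_idx]    = unique(C.date, 'stable');

% data(d, l) holds the precipitation for date d at location l
data = accumarray([date_idx, loc_idx], C.pr);   % size 4 x 3 here
```

Note that accumarray sums values that share the same (date, location) pair, so this assumes each pair occurs only once.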
If A is your data with the 4 columns [34200x4], you can copy the first three columns into a new matrix like this (note that the result is a 34200x3 matrix, not a 3-dimensional array):
B = zeros(size(A,1),3);
B(:, 1) = A(:, 1);
B(:, 2) = A(:, 2);
B(:, 3) = A(:, 3);
Possibly even:
B(:,1:3) = A(:, 1:3);
Depending on how your data is set up, you may need to transpose, which is:
B = B';
You can map data over as long as the dimensions match.
You can implement a for loop if you have even more columns or for dynamic data entries.

Calculating the root-mean-square-error between two matrices one of which contains NaN values

This is a part of a larger project so I will try to keep only the relevant parts (The variables and my attempt at the calculations)
I want to calculate the root mean squared error between Zi_cubic and Z_actual
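RMSE formula: RMSE = sqrt( (1/N) * sum_i (Zi_cubic(i) - Z_actual(i))^2 ), with the sum taken over the N grid points where the difference is finite.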
Given/already established variables
rng('default');
% Set up 2,000 random numbers between -1 & +1 as our x & y values
n=2000;
x = 2*(rand(n,1)-0.5);
y = 2*(rand(n,1)-0.5);
z = x.^5+y.^3;
% Interpolate to a regular grid
d = -1:0.01:1;
[Xi,Yi] = meshgrid(d,d);
Zi_cubic = griddata(x,y,z,Xi,Yi,'cubic');
Z_actual = Xi.^5+Yi.^3;
My attempt at a calculation
My approach is to:
1. Arrange Zi_cubic and Z_actual as column vectors (D1, D2)
2. Take the difference (D3)
3. Square each element in the difference (D4)
4. Sum up all the elements in D4 using nansum (D5)
5. Divide by the number of finite elements in D4 (D6)
6. Take the square root (D7)
D1 = reshape(Zi_cubic,[numel(Zi_cubic),1]);
D2 = reshape(Z_actual,[numel(Z_actual),1]);
D3 = D1 - D2;
D4 = D3.^2;
D5 = nansum(D4)
d6 = sum(isfinite(D4))
D6 = D5/d6
D7 = sqrt(D6)
Apparently this is wrong. I'm either mis-applying the RMSE formula or I don't understand what I'm telling matlab to do.
Any help would be appreciated. Thanks in advance.
Your RMSE is fine (in my book). The only thing that seems possibly off is the meshgrid and griddata. Your inputs to griddata are vectors and you are asking for a matrix output. That is fine, but you're potentially undersampling your input space. In other words, you are giving n samples as inputs, but perhaps you are expected to give n^2 samples as inputs? Here's some sample code for a smaller n to demonstrate this effect more clearly:
rng('default');
% Set up n random numbers between -1 & +1 as our x & y values
n=100; %Reduced because scatter is slow to plot
x = 2*(rand(n,1)-0.5);
y = 2*(rand(n,1)-0.5);
z = x.^5+y.^3;
S = 100;
subplot(1,2,1)
scatter(x,y,S,z)
%More data, more accurate ...
[x2,y2] = meshgrid(x,y);
z2 = x2.^5+y2.^3;
subplot(1,2,2)
scatter(x2(:),y2(:),S,z2(:))
The second plot should be a lot cleaner and thus will likely provide a more accurate estimate of Z_actual later on.
I also thought you might be running into some issues with floating point numbers and calculating RMSE but that appears not to be the case. Here's some alternative code which is how I would write RMSE.
d = Zi_cubic(:) - Z_actual(:);
mask = ~isnan(d);
n_valid = sum(mask);
rmse = sqrt(sum(d(mask).^2)/n_valid);
Notice that (:) linearizes the matrix. Also it is useful to try and use better variable names than D1-D7.
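The same calculation can also be written compactly with the 'omitnan' option of mean (available since R2015a). This is a sketch with a hypothetical helper name, rmse_nan, and made-up input values. Unlike the isfinite count in the question, it only skips NaN, not Inf:

```matlab
% Hypothetical helper: RMSE over the elements where the difference is not NaN
rmse_nan = @(a, b) sqrt(mean((a(:) - b(:)).^2, 'omitnan'));

% Small check with one NaN entry
a = [1 2 NaN 4];
b = [1 4 3 2];
r = rmse_nan(a, b);   % squared differences 0, 4, NaN, 4 -> sqrt(8/3)
```

With the variables from the question, this would be called as rmse_nan(Zi_cubic, Z_actual).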
In the end though these are just suggestions and your code looks fine.
PS - I'm assuming that you are supposed to be using cubic interpolation as that is another place you could perhaps deviate from what's expected ...

Mahalanobis distance in Matlab

I would like to calculate the Mahalanobis distance from an input feature vector Y (1x14) to all feature vectors in matrix X (18x14). Every 6 vectors of X represent one class (so I have 3 classes). Then, based on the Mahalanobis distances, I will choose the vector that is nearest to the input and classify the input into one of the three classes.
My problem is that when I use the following code I get only one value. How can I get the Mahalanobis distance between the input Y and every vector in X, so that at the end I have 18 values and can choose the smallest one? Any help will be appreciated. Thank you.
Note: I know that the Mahalanobis distance is a measure of the distance between a point P and a distribution D, but I don't know how this could be applied in my situation.
Y = test1; % Y: 1x14 vector
S = cov(X); % X: 18x14 matrix
mu = mean(X,1);
d = ((Y-mu)/S)*(Y-mu)'
I also tried to separate the matrix X into 3; so each one represent the feature vectors of one class. This is the code, but it doesn't work properly and I got 3 distances and some have negative value!
Y = test1;
X1 = Action1;
S1 = cov(X1);
mu1 = mean(X1,1);
d1 = ((Y-mu1)/S1)*(Y-mu1)'
X2 = Action2;
S2 = cov(X2);
mu2 = mean(X2,1);
d2 = ((Y-mu2)/S2)*(Y-mu2)'
X3= Action3;
S3 = cov(X3);
mu3 = mean(X3,1);
d3 = ((Y-mu3)/S3)*(Y-mu3)'
d= [d1,d2,d3];
MahalanobisDist= min(d)
One last thing, when I used mahal function provided by Matlab I got this error:
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate.
If you have to implement the distance yourself (a school assignment, for instance) this is of no use to you, but if you just need the distance as an intermediate step for other calculations, I highly recommend d = pdist2(a, b, distance_measure). The documentation is on MATLAB's site.
It computes the pairwise distances between the rows of b (a single vector or a matrix) and the rows of a, and stores them in d, where the columns correspond to entries in b and the rows to entries in a. So d(i,j) is the distance between row i in a and row j in b (hope that made sense). If you want, it can even take parameters to find the k nearest neighbors. It's a great function.
In your case you would use the following code, and you'd end up with the smallest distance as well as the index of the matching row:
%number of neighbors
K = 1;
% X=18x14, Y=1x14; with 'smallest',K both dist and iidx are Kx1
[dist, iidx] = pdist2(X,Y,'mahalanobis','smallest',K);
%to find the class, you can do something like this
num_samples_per_class = 6;
matching_class = ceil(iidx/ num_samples_per_class);
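pdist2 is part of the Statistics and Machine Learning Toolbox. Without it, the 18 distances can be computed directly by applying the formula from the question to every row of X at once. The sketch below uses random data as a stand-in for X and Y and assumes R2016b or later for the implicit expansion in X - Y. It also illustrates a likely reason the per-class attempt misbehaves: a covariance matrix estimated from only 6 vectors in 14 dimensions has rank at most 5, so it is singular and dividing by it is unreliable.

```matlab
rng(0);
X = rand(18, 14);            % stand-in for the 18 training vectors
Y = rand(1, 14);             % stand-in for the input vector

S  = cov(X);                 % 14x14 covariance of all training vectors
D  = X - Y;                  % 18x14, Y subtracted from every row of X
d2 = sum((D / S) .* D, 2);   % 18x1 squared Mahalanobis distances

[~, idx] = min(d2);          % index of the nearest training vector
matching_class = ceil(idx / 6);

% Covariance estimated from a single class of 6 vectors is rank deficient
r = rank(cov(X(1:6, :)));    % at most 5, far below the 14 needed to invert it
```

Taking sqrt(d2) gives the distances themselves. The ordering, and therefore the nearest vector, is the same either way.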

How can I make 3d plots of planes by using spreadsheet in matlab

pointA=[9.62579 15.7309 3.3291];
pointB=[13.546 25.6869 3.3291];
pointC=[23.502 21.7667 -3.3291];
pointD=[19.5818 11.8107 -3.3291];
points=[pointA' pointB' pointC' pointD'];
fill3(points(1,:),points(2,:),points(3,:),'r')
grid on
alpha(0.3)
This code will show a filled plane. (Can't add images yet T.T)
Now here is my problem. On a spreadsheet, I have x,y,z coordinates of thousands of points. Every 4 consecutive points form a plane like the one shown. How do I write code such that every 4 consecutive points make a filled plane?
Basically, if I have 400 points, I want the code to plot 100 planes.
Assuming your data are a 400x3 matrix m:
m = rand(400,3);
m2 = m'; % Transpose to 3x400 so that each column is one point
Create a 3-D matrix in which the third index 'j' selects each set of four points:
m3=[];
% Start column of each group of four points: 1, 5, 9, ...
z1 = 1:4:size(m2,2);
for j = 1:length(z1)
m3(:,:,j) = m2(:,z1(j):(z1(j)+3));
end
The third dimension of m3 now has length 100, one slice per plane. Plot the first plane:
fill3(m3(1,:,1),m3(2,:,1),m3(3,:,1),'r');
% Cycle through the planes, keeping them all in the same axes
hold on
for j = 1:length(z1)
fill3(m3(1,:,j),m3(2,:,j),m3(3,:,j),'r');
end
clear all, close all, clc
pointA=rand(99,1);
pointB=rand(99,1);
pointC=rand(99,1);
pointD=rand(99,1);
pointAmat = reshape(pointA,3,1,[]);
pointBmat = reshape(pointB,3,1,[]);
pointCmat = reshape(pointC,3,1,[]);
pointDmat = reshape(pointD,3,1,[]);
points=[pointAmat pointBmat pointCmat pointDmat];
for i = 1:size(points,3)
fill3(points(1,:,i),points(2,:,i),points(3,:,i),'r')
hold all
end
grid on
alpha(0.3)
Hope this helps.
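A loop-free variant is also possible with a single patch call, where each face lists the row indices of its four points. This is a sketch under assumptions: random data stands in for the spreadsheet, and the file name in the comment is hypothetical.

```matlab
% Stand-in for the spreadsheet data; with a real file this could be
% P = readmatrix('points.xlsx');   % hypothetical file name, R2019a or later
P = rand(400, 3);

% Each row of faces lists the 4 vertex indices of one plane: 1-4, 5-8, ...
faces = reshape(1:size(P, 1), 4, []).';   % 100x4

patch('Vertices', P, 'Faces', faces, 'FaceColor', 'r', 'FaceAlpha', 0.3);
view(3)
grid on
```

This assumes the number of points is a multiple of 4. Otherwise the reshape fails and the leftover rows have to be dropped first.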

Interpolation with matlab

I have a vector with different values.
Some of the values are zeros and sometimes they even come one after another.
I need to plot this vector against another vector with the same size but I can't have zeros in it.
What is the best way I can do some kind of interpolation to my vector and how do I do it?
I tried to read about interpolation in MATLAB but I didn't understand it well enough to implement it.
If it's possible to explain it to me step by step I will be grateful since I'm new with this program.
Thanks
Starting from a dataset consisting of two equal length vectors x,y, where y values equal to zero are to be excluded, first pick the subset excluding zeros:
incld = y~=0;
Then you interpolate over that subset:
yn = interp1(x(incld),y(incld),x);
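As a small worked example of these two lines (the data below is made up for illustration), the zeros are replaced by values on the straight line between their nonzero neighbors:

```matlab
x = 1:8;
y = [2 0 0 8 10 0 14 16];      % zeros mark the values to be replaced

incld = y ~= 0;                % logical mask of the samples to keep
yn = interp1(x(incld), y(incld), x);
% yn is [2 4 6 8 10 12 14 16]
```

The default method of interp1 is linear. Other methods such as 'pchip' or 'spline' can be passed as a fourth argument.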
Example result, plotting x against y (green) and x against yn (red):
edit
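Notice that interpolation only fills in values between known points. If the first or last value is zero, it has no nonzero neighbor on one side and interp1 returns NaN there, so you will have to take care of that separately, for instance by running the following before the lines above: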
if y(1)==0, y(1) = y(find(y~=0,1,'first'))/2; end
if y(end)==0, y(end) = y(find(y~=0,1,'last'))/2; end
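Another option for zero end points, instead of overwriting them first, is to let interp1 extrapolate beyond the first and last nonzero samples. This is a sketch with made-up data. Whether linear extrapolation is acceptable depends on the data:

```matlab
x = 1:5;
y = [0 3 0 7 0];               % zeros at both ends and in the middle

incld = y ~= 0;
yn = interp1(x(incld), y(incld), x, 'linear', 'extrap');
% yn is [1 3 5 7 9]
```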
edit #2
And this is the 2D version of the above, where arrays X and Y are coordinates corresponding to the entries in 2D array Z:
[nr nc]=size(Z);
[X Y] = meshgrid([1:nc],[1:nr]);
X2 = X;
Y2 = Y;
Z2 = Z;
excld = Z==0;
X2(excld) = [];
Y2(excld) = [];
Z2(excld) = [];
ZN = griddata(X2,Y2,Z2,X,Y);
ZN contains the interpolated points.
In the figure below, zeros are shown by dark blue patches. Left is before interpolation, right is after: