Plot distances between points in MATLAB

I've made a plot of 10 points
10 10
248,628959661970 66,9462583977501
451,638770451973 939,398361884535
227,712826026548 18,1775336366957
804,449583613070 683,838613746355
986,104241895970 783,736480083219
29,9919502693899 534,137567882728
535,664190667238 885,359450931142
87,0772199008924 899,004898906140
990 990
The first column holds the x-coordinates and the second column the y-coordinates.
This leads to the following plot:
Using the following code: scatter(Problem.Points(:,1),Problem.Points(:,2),'.b')
I then also calculated the euclidean distances using Problem.DistanceMatrix = pdist(Problem.Points);
Problem.DistanceMatrix = squareform(Problem.DistanceMatrix);
I replaced the distances with 1*10^6 wherever they are larger than a certain value.
This led to the following table:
Then, I would like to plot the lines between the corresponding points, preferably labelled with their distances, but only where the distance is < 1*10^6.
Specifically, I want to plot the lines [1,2], [1,4], [1,7], [2,4], etc.
My question is, can this be done and how?

Assuming one set of your data is in something called xdata and the other in ydata and then the distances in distances, the following code should accomplish what you want.
hold on
for k = 1:length(xdata)
    for j = 1:length(ydata)
        if (distances(k,j) < 1e6)
            plot([xdata(k) xdata(j)], [ydata(k) ydata(j)]);
        end
    end
end
You just need to iterate through your matrix and then if the value is less than 1e6, then plot the line between the kth and jth index points. This will however double plot lines, so it will plot from k to j, and also from j to k, but it is quick to code and easy to understand. I got the following plot with this.

This should do the trick:
P = [
10.0000000000000 10.0000000000000;
248.6289596619700 66.9462583977501;
451.6387704519730 939.3983618845350;
227.7128260265480 18.1775336366957;
804.4495836130700 683.8386137463550;
986.1042418959700 783.7364800832190;
29.9919502693899 534.1375678827280;
535.6641906672380 885.3594509311420;
87.0772199008924 899.0048989061400;
990.0000000000000 990.0000000000000
];
P_len = size(P,1);
D = squareform(pdist(P));
D(D > 600) = 1e6;
scatter(P(:,1),P(:,2),'*b');
hold on;
for i = 1:P_len
    pi = P(i,:);
    for j = 1:P_len
        pj = P(j,:);
        d = D(i,j);
        if ((d > 0) && (d < 1e6))
            plot([pi(1) pj(1)],[pi(2) pj(2)],'-r');
        end
    end
end
hold off;
Final output:
On a side note, the step in which you replace the distance values exceeding a certain threshold (it looks like 600, judging by your distance matrix) with 1e6 can be avoided by putting that threshold directly into the loop that plots the lines. It's not wrong, but I think it's an unnecessary step.
D = squareform(pdist(P));
% ...
if ((d > 0) && (d < 600))
plot([pi(1) pj(1)],[pi(2) pj(2)],'-r');
end
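Neither loop above labels the segments with their lengths, which the question also asks for. Here is a minimal sketch of one way to do that. It assumes a small made-up point set P and a hypothetical cutoff called threshold (neither is from the question), and it computes the distance matrix with base functions instead of pdist so that it runs without the Statistics Toolbox:

```matlab
% Hypothetical data: one point per row (x, y); not the points from the question
P = [0 0; 3 4; 6 8; 100 100];
threshold = 6;                      % hypothetical distance cutoff

% Pairwise Euclidean distances using base functions only
dx = bsxfun(@minus, P(:,1), P(:,1)');
dy = bsxfun(@minus, P(:,2), P(:,2)');
D  = sqrt(dx.^2 + dy.^2);

% Strict upper triangle: every unordered pair appears exactly once
[ii, jj] = find(triu(D < threshold, 1));
edges = [ii, jj];

scatter(P(:,1), P(:,2), '*b');
hold on
for e = 1:size(edges, 1)
    a = edges(e, 1);
    b = edges(e, 2);
    plot(P([a b], 1), P([a b], 2), '-r');
    % Put the segment length at the midpoint of the segment
    midpt = (P(a,:) + P(b,:)) / 2;
    text(midpt(1), midpt(2), sprintf('%.1f', D(a, b)));
end
hold off
```

Taking only the strict upper triangle draws each pair once, so the k-to-j and j-to-k duplicates never arise and the d > 0 check is not needed.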

A friend of mine suggested using gplot
gplot(Problem.AdjM, Problem.Points(:,:), '-o')
with Problem.Points as the coordinates and Problem.AdjM as the adjacency matrix. The adjacency matrix was generated like this:
Problem.AdjM=Problem.DistanceMatrix;
Problem.AdjM(Problem.AdjM==1000000)=0;
Problem.AdjM(Problem.AdjM>0)=1;
Since a distance of 1*10^6 stands in for a distance that is too large, I set the adjacency to 0 there and to 1 everywhere else.
This led to the following plot, which was more or less what I wanted:
Since you people have been helping me in such a wonderful way, I just wanted to add this:
I added J. Mel's solution to my code, which produced two identical figures:
Since the figures get the same outcome, both methods should be all right. Furthermore, since Tommasso's and J Mel's outcomes were equal earlier, Tommasso's code must also be correct.
Many thanks to both of you and all other people contributing!
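The adjacency matrix can also be built straight from the raw distances, without going through the 1e6 placeholder. A small sketch, assuming made-up coordinates P and a hypothetical threshold (not the values from the question):

```matlab
% Hypothetical coordinates, one point per row
P = [0 0; 3 4; 6 8; 100 100];
threshold = 6;                      % hypothetical distance cutoff

% Pairwise Euclidean distances using base functions only
dx = bsxfun(@minus, P(:,1), P(:,1)');
dy = bsxfun(@minus, P(:,2), P(:,2)');
D  = sqrt(dx.^2 + dy.^2);

% 1 where two distinct points are closer than the threshold, 0 elsewhere
A = double((D > 0) & (D < threshold));

gplot(A, P, '-o');
```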

Related

Which Bins are occupied in a 3D histogram in MatLab

I have 3D data from which I need to calculate properties.
To reduce computation, I want to discretize the space and calculate the properties per bin instead of per data point, and then reassign the property calculated for each bin back to its data points.
Furthermore, I only want to calculate the bins that have points in them.
Since there is no 3D binning function in MATLAB, I use histcounts over each dimension and then search for the unique bins that have been assigned to the data points.
a5pre=compositions(:,1);
a7pre=compositions(:,2);
a8pre=compositions(:,3);
%% BINNING
a5pre_edges=[0,linspace(0.005,0.995,19),1];
a5pre_val=(a5pre_edges(1:end-1) + a5pre_edges(2:end))/2;
a5pre_val(1)=0;
a5pre_val(end)=1;
a7pre_edges=[0,linspace(0.005,0.995,49),1];
a7pre_val=(a7pre_edges(1:end-1) + a7pre_edges(2:end))/2;
a7pre_val(1)=0;
a7pre_val(end)=1;
a8pre_edges=a7pre_edges;
a8pre_val=a7pre_val;
[~,~,bin1]=histcounts(a5pre,a5pre_edges);
[~,~,bin2]=histcounts(a7pre,a7pre_edges);
[~,~,bin3]=histcounts(a8pre,a8pre_edges);
bins=[bin1,bin2,bin3];
[A,~,C]=unique(bins,'rows','stable');
a5pre=a5pre_val(A(:,1));
a7pre=a7pre_val(A(:,2));
a8pre=a8pre_val(A(:,3));
It seems that the unique function is pretty time-consuming, so I was wondering whether there is a faster way to do it, given that the rows can only contain integers, or whether there is a totally different approach.
Best regards
function [comps,C]=compo_binner(x,y,z,e1,e2,e3,v1,v2,v3)
C=NaN(length(x),1);
comps=NaN(length(x),3);
id=1;
for i=1:numel(x)
B_temp(1,1)=v1(sum(x(i)>e1));
B_temp(1,2)=v2(sum(y(i)>e2));
B_temp(1,3)=v3(sum(z(i)>e3));
C_id=sum(ismember(comps,B_temp),2)==3;
if sum(C_id)>0
C(i)=find(C_id);
else
comps(id,:)=B_temp;
id=id+1;
C_id=sum(ismember(comps,B_temp),2)==3;
C(i)=find(C_id>0);
end
end
comps(any(isnan(comps), 2), :) = [];
end
But it's way slower than the histcounts/unique version. I can't avoid the find function, and that's a function you surely want to avoid in a loop when speed matters.
If I understand correctly you want to compute a 3D histogram. If there's no built-in tool to compute one, it is simple to write one:
function [H, lindices] = histogram3d(data, n)
% histogram3d 3D histogram
% H = histogram3d(data, n) computes a 3D histogram from (x,y,z) values
% in the Nx3 array `data`. `n` is the number of bins between 0 and 1.
% It is assumed all values in `data` are between 0 and 1.
assert(size(data,2) == 3, 'data must be Nx3');
H = zeros(n, n, n);
indices = floor(data * n) + 1;
indices(indices > n) = n;
lindices = sub2ind(size(H), indices(:,1), indices(:,2), indices(:,3));
for ii = 1:size(data,1)
H(lindices(ii)) = H(lindices(ii)) + 1;
end
end
Now, given your compositions array, and binning each dimension into 20 bins, we get:
[H, indices] = histogram3d(compositions, 20);
idx = find(H);
[x,y,z] = ind2sub(size(H), idx);
reduced_compositions = ([x,y,z] - 0.5) / 20;
The bin centers for H are at ((1:20)-0.5)/20.
On my machine this runs in a fraction of a second for 5 million input points.
Now, for each compositions(ii,:), you have a number indices(ii), which matches another number idx(jj), corresponding to reduced_compositions(jj,:). One easy way to make the assignment of results is as follows:
H(H > 0) = 1:numel(idx);
indices = H(indices);
Now for each compositions(ii,:), your closest match in the reduced set is reduced_compositions(indices(ii),:).
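If unique(...,'rows') is the bottleneck, one option that stays close to the histcounts approach in the question is to collapse the three bin subscripts into a single linear index first, so that unique works on a plain integer vector. This is only a sketch: the sample data, n, and all variable names are made up, and whether it is actually faster should be measured on the real data.

```matlab
% Hypothetical data: N-by-3 values in [0, 1]
data = [0.01 0.02 0.03;
        0.02 0.01 0.04;
        0.99 0.98 0.97;
        1.00 0.50 0.00];
n = 20;                                   % bins per dimension

% Bin subscripts in 1..n; a value of exactly 1 is clamped into the last bin
sub = min(floor(data * n) + 1, n);

% One integer per point instead of a row of three
lin = sub2ind([n n n], sub(:,1), sub(:,2), sub(:,3));

% occupied: linear indices of bins that contain at least one point
% pointToBin: for every point, its position in 'occupied'
[occupied, ~, pointToBin] = unique(lin);
pointToBin = pointToBin(:);

% Number of points per occupied bin
counts = accumarray(pointToBin, 1);

% Bin centres of the occupied bins, and the centre assigned to each point
[bx, by, bz] = ind2sub([n n n], occupied);
centers  = ([bx, by, bz] - 0.5) / n;
perPoint = centers(pointToBin, :);
```

Any property computed once per row of centers can be mapped back to the data points in the same way, by indexing it with pointToBin.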

How to interpolate random non monotonic increasing data

So I am working on my thesis and I need to calculate geometric characteristics of an airfoil.
To do this, I need to interpolate the horizontal and vertical coordinates of an airfoil. This is for a tool that will automatically calculate the geometric characteristics from arbitrary airfoil geometry files.
Sometimes the Y values of the airfoil are non-monotonic. Hence, the interp1 command gives an error, since some values in the Y vector are repeated.
Therefore, my question is: how do I recognize and subsequently interpolate non-monotonically increasing data automatically in MATLAB?
Here is a sample data set:
0.999974 0.002176
0.994846 0.002555
0.984945 0.003283
0.973279 0.004131
0.960914 0.005022
0.948350 0.005919
0.935739 0.006810
0.923111 0.007691
0.910478 0.008564
0.897850 0.009428
0.885229 0.010282
0.872617 0.011125
0.860009 0.011960
0.847406 0.012783
0.834807 0.013598
0.822210 0.014402
0.809614 0.015199
0.797021 0.015985
0.784426 0.016764
0.771830 0.017536
0.759236 0.018297
0.746639 0.019053
0.734038 0.019797
0.721440 0.020531
0.708839 0.021256
0.696240 0.021971
0.683641 0.022674
0.671048 0.023367
0.658455 0.024048
0.645865 0.024721
0.633280 0.025378
0.620699 0.026029
0.608123 0.026670
0.595552 0.027299
0.582988 0.027919
0.570436 0.028523
0.557889 0.029115
0.545349 0.029697
0.532818 0.030265
0.520296 0.030820
0.507781 0.031365
0.495276 0.031894
0.482780 0.032414
0.470292 0.032920
0.457812 0.033415
0.445340 0.033898
0.432874 0.034369
0.420416 0.034829
0.407964 0.035275
0.395519 0.035708
0.383083 0.036126
0.370651 0.036530
0.358228 0.036916
0.345814 0.037284
0.333403 0.037629
0.320995 0.037950
0.308592 0.038244
0.296191 0.038506
0.283793 0.038733
0.271398 0.038920
0.259004 0.039061
0.246612 0.039153
0.234221 0.039188
0.221833 0.039162
0.209446 0.039064
0.197067 0.038889
0.184693 0.038628
0.172330 0.038271
0.159986 0.037809
0.147685 0.037231
0.135454 0.036526
0.123360 0.035684
0.111394 0.034690
0.099596 0.033528
0.088011 0.032181
0.076685 0.030635
0.065663 0.028864
0.055015 0.026849
0.044865 0.024579
0.035426 0.022076
0.027030 0.019427
0.019970 0.016771
0.014377 0.014268
0.010159 0.012029
0.007009 0.010051
0.004650 0.008292
0.002879 0.006696
0.001578 0.005207
0.000698 0.003785
0.000198 0.002434
0.000000 0.001190
0.000000 0.000000
0.000258 -0.001992
0.000832 -0.003348
0.001858 -0.004711
0.003426 -0.005982
0.005568 -0.007173
0.008409 -0.008303
0.012185 -0.009379
0.017243 -0.010404
0.023929 -0.011326
0.032338 -0.012056
0.042155 -0.012532
0.052898 -0.012742
0.064198 -0.012720
0.075846 -0.012533
0.087736 -0.012223
0.099803 -0.011837
0.111997 -0.011398
0.124285 -0.010925
0.136634 -0.010429
0.149040 -0.009918
0.161493 -0.009400
0.173985 -0.008878
0.186517 -0.008359
0.199087 -0.007845
0.211686 -0.007340
0.224315 -0.006846
0.236968 -0.006364
0.249641 -0.005898
0.262329 -0.005451
0.275030 -0.005022
0.287738 -0.004615
0.300450 -0.004231
0.313158 -0.003870
0.325864 -0.003534
0.338565 -0.003224
0.351261 -0.002939
0.363955 -0.002680
0.376646 -0.002447
0.389333 -0.002239
0.402018 -0.002057
0.414702 -0.001899
0.427381 -0.001766
0.440057 -0.001656
0.452730 -0.001566
0.465409 -0.001496
0.478092 -0.001443
0.490780 -0.001407
0.503470 -0.001381
0.516157 -0.001369
0.528844 -0.001364
0.541527 -0.001368
0.554213 -0.001376
0.566894 -0.001386
0.579575 -0.001398
0.592254 -0.001410
0.604934 -0.001424
0.617614 -0.001434
0.630291 -0.001437
0.642967 -0.001443
0.655644 -0.001442
0.668323 -0.001439
0.681003 -0.001437
0.693683 -0.001440
0.706365 -0.001442
0.719048 -0.001444
0.731731 -0.001446
0.744416 -0.001443
0.757102 -0.001445
0.769790 -0.001444
0.782480 -0.001445
0.795173 -0.001446
0.807870 -0.001446
0.820569 -0.001446
0.833273 -0.001446
0.845984 -0.001448
0.858698 -0.001448
0.871422 -0.001451
0.884148 -0.001448
0.896868 -0.001446
0.909585 -0.001443
0.922302 -0.001445
0.935019 -0.001446
0.947730 -0.001446
0.960405 -0.001439
0.972917 -0.001437
0.984788 -0.001441
0.994843 -0.001441
1.000019 -0.001441
First column is X and the second column is Y. Notice how the last values of Y are repeated.
Maybe someone can provide me with a piece of code to do this? Or any suggestions are welcome as well.
Remember I need to automate this process.
Thanks for your time and effort I really appreciate it!
There is a quick and dirty method if you do not know the exact function defining the foil profile. Split your data into 2 sets, top and bottom planes, so that the 'x' data are monotonic within each set.
First I imported your data table in the variable A, then:
%// just reorganise your input in individual vectors. (this is optional but
%// if you do not do it you'll have to adjust the code below)
x = A(:,1) ;
y = A(:,2) ;
ipos = y > 0 ; %// indices of the top plane
ineg = y <= 0 ; %// indices of the bottom plane
xi = linspace(0,1,500) ; %// new Xi for interpolation
ypos = interp1( x(ipos) , y(ipos) , xi ) ; %// re-interp the top plane
yneg = interp1( x(ineg) , y(ineg) , xi ) ; %// re-interp the bottom plane
y_new = [fliplr(yneg) ypos] ; %// stitches the two half data sets together
x_new = [fliplr(xi) xi] ;
%% // display
figure
plot(x,y,'o')
hold on
plot(x_new,y_new,'.r')
axis equal
As said at the top, it is quick and dirty. As you can see from the detail figure, this greatly improves the x resolution where the profile is close to horizontal, but you lose a bit of resolution at the nose of the foil, where the profile is close to vertical.
If that's acceptable, then you're all set. If you really need the resolution at the nose, you could interpolate on x as above but use a very fine x grid near the nose (instead of the regular x grid I provided as an example).
If you replace the xi definition above with:
xi = [linspace(0,0.01,50) linspace(0.01,1,500)] ;
You get the following near the nose:
Adjust that to your needs.
To interpolate any function, there must be a function defined. When you define y=f(x), you cannot have the same x for two different values of y, because then we are not talking about a function. In your example data, neither x nor y is monotonic, so any way you slice it, you'll have two (or more) "y"s for the same "x". If you wish to interpolate, you need to divide this into two separate problems, top/bottom, and define proper functions for interp1/2/n to work with, for example by slicing it horizontally where x==0. In any case, you would have to provide more information than just x or y alone, e.g.: x=0.5 and y is on top.
On the other hand, if all you want to do is to insert a few values between each x and y in your array, you can do this using finite differences:
%// transform your original xy into 3d array where x is in first slice and y in second
xy = permute(xy(85:95,:), [3,1,2]); %// 85:95 is near x=0 in your data
%// lets say you want to insert three additional points along each line between every two points on given airfoil
h = [0, 0.25, 0.5, 0.75].'; %// steps along each line - column vector
%// every interpolated h along the way between f(x(n)) and f(x(n+1)) can
%// be defined as: f(x(n) + h) = f(x(n)) + h*( f(x(n+1)) - f(x(n)) )
%// this is first order finite differences approximation in 1D. 2D is very
%// similar only with gradient (this should be common knowledge, look it up)
%// from here it's just fancy matrix play
%// 2D gradient of xy curve
gradxy = diff(xy, 1, 2); %// diff xy, first order, along the 2nd dimension, where x and y now run
h_times_gradxy = bsxfun(@times, h, gradxy); %// gradient times step size
xy_in_3d_array = bsxfun(@plus, xy(:,1:end-1,:), h_times_gradxy); %// addition of "f(x)" and there we have it, the new x and y for every step h
[x,y] = deal(xy_in_3d_array(:,:,1), xy_in_3d_array(:,:,2)); %// extract x and y from 3d matrix
xy_interp = [x(:), y(:)]; %// use Matlab's linear indexing to flatten x and y into columns
%// plot to check results
figure; ax = newplot; hold on;
plot(ax, xy(:,:,1), xy(:,:,2),'o-');
plot(ax, xy_interp(:,1), xy_interp(:,2),'+')
legend('Original','Interpolated','Location','best');
axis tight;
grid;
%// The End
And these are the results, near x=0 for clarity of presentation:
Hope that helps.
Cheers.
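For the "automate this" part of the question, a different route that avoids deciding what is top and what is bottom is to interpolate x and y separately against the cumulative chord length along the point sequence, which is strictly increasing as long as no two consecutive points coincide. This is a swapped-in technique, not what either answer above does, and the handful of coordinates below are made up for illustration:

```matlab
% Hypothetical airfoil-like outline: upper surface first, then lower surface
x = [1.0; 0.5; 0.1; 0.0;  0.1;  0.5; 1.0];
y = [0.0; 0.03; 0.02; 0.0; -0.01; -0.01; 0.0];

% Drop points that repeat their predecessor so the parameter stays strictly increasing
keep = [true; hypot(diff(x), diff(y)) > 0];
x = x(keep);
y = y(keep);

% Cumulative chord length along the polyline
s = [0; cumsum(hypot(diff(x), diff(y)))];

% Resample at 50 equally spaced positions along the outline
si = linspace(0, s(end), 50)';
si(end) = s(end);                 % guard against round-off at the endpoint
xi = interp1(s, x, si);
yi = interp1(s, y, si);

plot(x, y, 'o', xi, yi, '.r');
axis equal
```

Because s only depends on the order of the points, repeated or non-monotonic x or y values are not a problem; only exact consecutive duplicates have to be removed.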

Random sample of points from a rectangular box with a spherical obstacle

The workspace is given as:
limits=[-1 4; -1 4; -1 4];
And in this workspace, there is a spherical obstacle which is defined as:
obstacle.origin_x=1.6;
obstacle.origin_y=0.8;
obstacle.origin_z=0.2;
obstacle.radius_obs=0.2;
save('obstacle.mat', 'obstacle');
I would like to create random points within limits. I created random points using the code below:
function a=rndmpnt(lim, numofpoint)
x=lim(1,1)+(lim(1,2)-lim(1,1))*rand(1,numofpoint);
y=lim(2,1)+(lim(2,2)-lim(2,1))*rand(1,numofpoint);
z=lim(3,1)+(lim(3,2)-lim(3,1))*rand(1,numofpoint);
a=[x; y; z]; % one column per point
Now I would like to eliminate the points that fall inside the obstacle. How can I do that?
You want to reject the points within the obstacle. Naturally, after rejection you will probably end up with fewer points than numofpoint. So the process will need to be repeated until enough points are generated. A while loop is appropriate here.
Rejection is done by finding ix (indices of acceptable points) and appending only those points to matrix a. The loop repeats until there are enough of those, and returns exactly the number requested.
function a = rndmpnt(lim, numofpoint, obstacle)
% obstacle is passed in explicitly, e.g. the struct saved in obstacle.mat
    a = zeros(3,0); % begin with empty matrix
    while size(a,2) < numofpoint % not enough points yet
        x=lim(1,1)+(lim(1,2)-lim(1,1))*rand(1,numofpoint);
        y=lim(2,1)+(lim(2,2)-lim(2,1))*rand(1,numofpoint);
        z=lim(3,1)+(lim(3,2)-lim(3,1))*rand(1,numofpoint);
        ix = (x - obstacle.origin_x).^2 + (y - obstacle.origin_y).^2 + (z - obstacle.origin_z).^2 > obstacle.radius_obs^2;
        a = [a, [x(ix); y(ix); z(ix)]];
    end
    a = a(:, 1:numofpoint);
end
You may want to add a safeguard against infinite loop (some limit on the number of cycles) in case the user passes in the values such that there are no acceptable points.
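A minimal sketch of such a safeguard, written as a script rather than a function so that it can be pasted and run directly. maxIter and numPoints are made-up values, and the obstacle is given as plain center and radius variables instead of the struct:

```matlab
lim    = [-1 4; -1 4; -1 4];   % workspace limits from the question
center = [1.6; 0.8; 0.2];      % obstacle centre from the question
radius = 0.2;                  % obstacle radius from the question
numPoints = 1000;              % hypothetical number of points wanted
maxIter   = 100;               % hypothetical cap on resampling rounds

pts  = zeros(3, 0);
iter = 0;
while size(pts, 2) < numPoints && iter < maxIter
    iter = iter + 1;
    % Uniform candidates inside the box, one column per point
    span = lim(:,2) - lim(:,1);
    cand = bsxfun(@plus, lim(:,1), bsxfun(@times, span, rand(3, numPoints)));
    % Squared distance of every candidate to the obstacle centre
    d2 = sum(bsxfun(@minus, cand, center).^2, 1);
    % Keep only candidates strictly outside the sphere
    pts = [pts, cand(:, d2 > radius^2)];
end

if size(pts, 2) < numPoints
    error('Only %d of %d points found after %d rounds.', size(pts, 2), numPoints, maxIter);
end
pts = pts(:, 1:numPoints);
```

With the limits and obstacle above the sphere takes up a tiny fraction of the box, so the loop normally ends after one round; the cap only matters when the obstacle covers (almost) the whole workspace.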

Optimizing code, removing "for loop"

I'm trying to remove outliers from a tick data series, following Brownlees & Gallo 2006 (if you may be interested).
The code works fine but given that I'm working on really long vectors (the biggest has 20m observations and after 20h it was not done computing) I was wondering how to speed it up.
What I did until now is:
I changed the time and date format to numeric double and I saw that it saves quite some time in processing and A LOT OF MEMORY.
I allocated memory for the vectors:
[n] = size(price);
x = price;
score = nan(n,'double'); %using tic and toc I saw that nan requires less time than zeros
trimmed_mean = nan(n,'double');
sd = nan(n,'double');
out_mat = nan(n,'double');
Here is the loop I'd love to remove. I read that vectorizing would speed up a lot, especially using long vectors.
for i = k+1:n(1)-k %n is the size vector returned by size(price); stop k short of the end so that i+k stays in range
    trimmed_mean(i) = trimmean(x([i-k:i-1, i+1:i+k]),10,'round'); %trimmed mean computed on the 'k' closest observations on each side of 'i' (i is excluded)
    score(i) = x(i) - trimmed_mean(i);
    sd(i) = std(x([i-k:i-1, i+1:i+k])); %same neighbours as the mean
    tmp = abs(score(i)) > (alpha .* sd(i) + gamma);
    out_mat(i) = tmp*1;
end
Here is what I was trying to do
trimmed_mean=trimmean(regroup_matrix,10,'round',2);
score=bsxfun(@minus,x,trimmed_mean);
sd=std(regroup_matrix,0,2); %second argument is the normalisation flag, third is the dimension
temp = abs(score) > (alpha .* sd + gamma);
out_mat = temp*1;
But given that I'm totally new to Matlab, I don't know how to properly construct the matrix of neighbouring observations. I just think it should be shaped like: regroup_matrix= nan (n,2*k).
EDIT: To be specific, what I am trying to do (and I am not able to) is:
Given a column vector "x" (n,1) for each observation "i" in "x" I want to take the "k" neighbouring observations to "i" (from i-k to i-1 and from i+1 to i+k) and put these observations as rows of a matrix (n, 2*k).
EDIT 2: I made a few changes to the code and I think I am getting closer to the solution. I posted another question specific to what I think is the problem now:
Matlab: Filling up matrix rows using moving intervals from a column vector without a for loop
What I am trying to do now is:
[n] = size(price,1);
x = price;
[j1]=find(x);
matrix_left=zeros(n, k,'double');
matrix_right=zeros(n, k,'double');
toc
matrix_left(j1(k+1:end),:)=x(j1-k:j1-1);
matrix_right(j1(1:end-k),:)=x(j1+1:j1+k);
matrix_group=[matrix_left matrix_right];
trimmed_mean=trimmean(matrix_group,10,'round',2);
score=bsxfun(@minus,x,trimmed_mean);
sd=std(matrix_group,0,2); %second argument is the normalisation flag, third is the dimension
temp = abs(score) > (alpha .* sd + gamma);
outmat = temp*1;
I have problems with the matrix_left and matrix_right creation.
j1, that I am using for indexing is a column vector with the indices of price's observations. The output is simply
j1=[1:1:n]
price is a column vector of double with size(n,1)
For your reshape, you can do the following:
idxArray = bsxfun(@plus,(k+1:n-k)',[-k:-1,1:k]); %rows k+1..n-k, so that every index stays within 1..n
reshapedArray = x(idxArray);
Thanks to Jonas, who showed me the way to go, I came up with this:
idxArray_left=bsxfun(@plus,(k+1:n)',[-k:-1]); %matrix with index of left neighbours observations
idxArray_fill_left=bsxfun(@plus,(1:k)',[1:k]); %for observations from 1:k I take the right neighbouring observations, this way when computing mean and standard deviations there will be no problems.
matrix_left=[idxArray_fill_left; idxArray_left]; %Just join the two matrices and I have the complete matrix of left neighbours
idxArray_right=bsxfun(@plus,(1:n-k)',[1:k]); %same thing as left but opposite.
idxArray_fill_right=bsxfun(@plus,(n-k+1:n)',[-k:-1]);
matrix_right=[idxArray_right; idxArray_fill_right];
idx_matrix=[matrix_left matrix_right]; %complete index matrix, joining left and right indices
neigh_matrix=x(idx_matrix); %exactly as proposed by Jonas, I fill up a matrix of observations from 'x', following idx_matrix indexing
trimmed_mean=trimmean(neigh_matrix,10,'round',2);
score=bsxfun(@minus,x,trimmed_mean);
sd=std(neigh_matrix,0,2); %second argument is the normalisation flag, third is the dimension
temp = abs(score) > (alpha .* sd + gamma);
outmat = temp*1;
Again, thanks a lot to Jonas. You really made my day!
Thanks also to everyone who had a look at the question and tried to help!
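To see what the index matrix looks like, here is a tiny sketch restricted to interior observations (those with k neighbours on both sides). The data, k, alpha and gamma are made up, and plain mean stands in for trimmean so that it runs without the Statistics Toolbox:

```matlab
x = (1:10)'.^2;              % hypothetical series, column vector
k = 2;                       % neighbours taken on each side
alpha = 3;                   % hypothetical threshold parameters
gamma = 0.05;
n = numel(x);

centers = (k+1:n-k)';        % observations with k neighbours on both sides
offsets = [-k:-1, 1:k];      % relative positions of the neighbours

% Row r holds the indices of the 2k neighbours of observation centers(r)
idx   = bsxfun(@plus, centers, offsets);
neigh = x(idx);              % same shape as idx

mu = mean(neigh, 2);         % stand-in for trimmean(neigh, 10, 'round', 2)
sd = std(neigh, 0, 2);       % 0 = default normalisation, 2 = along rows

score     = x(centers) - mu;
isOutlier = abs(score) > alpha .* sd + gamma;
```

For the first interior observation (index 3) the neighbour row is [1 2 4 5], i.e. the values 1, 4, 16 and 25. Note that neigh needs n-by-2k values, so for 20 million observations the choice of k decides whether this fits in memory.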

Binning in matlab

I have been unable to find a function in matlab or octave to do what I want.
I have a matrix m of two columns (x and y values). I know that I can extract the columns by doing m(:,1) or m(:,2). I want to split it into smaller matrices of [potentially] equal size and plot the means of these matrices. In other words, I want to put the values into bins based on the x values, then find the means of the bins. I feel like the hist function should help me, but it doesn't seem to.
Does anyone know of a built-in function to do something like this?
edit
I had intended to mention that I looked at hist and couldn't get it to do what I wanted, but it must have slipped my mind.
Example: Let's say I have the following (I'm trying this in octave, but afaik it works in matlab):
x=1:20;
y=[1:10,10:-1:1];
m=[x', y'];
If I want 10 bins, I would like m to be split into:
m1=[(1:2)', (1:2)']
...
m5=[(9:10)', (9:10)']
m6=[(11:12)', (10:-1:9)']
...
m10=[(19:20)', (2:-1:1)']
and then get the mean of each bin.
Update: I have posted a follow-up question here. I would greatly appreciate responses.
I have answered this in video form on my blog:
http://blogs.mathworks.com/videos/2009/01/07/binning-data-in-matlab/
Here is the code:
m = rand(10,2); %Generate data
x = m(:,1); %split into x and y
y = m(:,2);
topEdge = 1; % define limits
botEdge = 0; % define limits
numBins = 2; % define number of bins
binEdges = linspace(botEdge, topEdge, numBins+1);
[h,whichBin] = histc(x, binEdges);
for i = 1:numBins
    flagBinMembers = (whichBin == i);
    binMembers = y(flagBinMembers);
    binMean(i) = mean(binMembers);
end
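Applied to the example from the question (x = 1:20 with ten bins of two points each), the same idea can be written without the loop using accumarray. This is a sketch with equal-width bins spanning min(x) to max(x); the variable names are made up:

```matlab
x = (1:20)';
y = [1:10, 10:-1:1]';
numBins = 10;

% Equal-width bin index in 1..numBins; the maximum of x is clamped into the last bin
lo = min(x);
hi = max(x);
whichBin = min(floor((x - lo) / (hi - lo) * numBins) + 1, numBins);

% Mean of y per bin; bins that receive no points come out as NaN
binMean = accumarray(whichBin, y, [numBins 1], @mean, NaN);
```

For this data every bin holds two consecutive x values, and binMean comes out as 1.5, 3.5, 5.5, 7.5, 9.5, 9.5, 7.5, 5.5, 3.5, 1.5.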