I have two vectors of paired values:
size(X) = 1e4 x 1; size(Y) = 1e4 x 1
Is it possible to draw a contour plot of some sort in which the contours follow the density of points, i.e. the highest clustering is red, with a colour gradient elsewhere?
If you need more clarification please ask.
Regards,
EXAMPLE DATA:
X=[53 58 62 56 72 63 65 57 52 56 52 70 54 54 59 58 71 66 55 56];
Y=[40 33 35 37 33 36 32 36 35 33 41 35 37 31 40 41 34 33 34 37 ];
scatter(X,Y,'ro');
Thank you, everyone, for your help. I also remembered that we can use hist3:
x={0:0.38/4:0.38}; % # How many bins in x direction
y={0:0.65/7:0.65}; % # How many bins in y direction
ncount=hist3([X Y],'Edges',[x y]);
pcolor(ncount./sum(sum(ncount)));
colorbar
Anyone know why edges in hist3 have to be cells?
This is basically a question about estimating the probability density function that generated your data and then visualizing it in a meaningful way. To that end, I would recommend a smoother estimate than the histogram, for instance Parzen windowing (a generalization of the histogram method).
In my code below, I have used your example dataset and estimated the probability density on a grid spanning the range of your data. There are 3 variables you need to adjust for your original data: Border, Sigma and stepSize.
Border = 5;
Sigma = 5;
stepSize = 1;
X=[53 58 62 56 72 63 65 57 52 56 52 70 54 54 59 58 71 66 55 56];
Y=[40 33 35 37 33 36 32 36 35 33 41 35 37 31 40 41 34 33 34 37 ];
D = [X' Y'];
N = length(X);
Xrange = [min(X)-Border max(X)+Border];
Yrange = [min(Y)-Border max(Y)+Border];
%Setup coordinate grid
[XX YY] = meshgrid(Xrange(1):stepSize:Xrange(2), Yrange(1):stepSize:Yrange(2));
YY = flipud(YY);
%Parzen parameters and function handle
pf1 = @(C1,C2) (1/N)*(1/((2*pi)*Sigma^2)).*...
exp(-( (C1(1)-C2(1))^2+ (C1(2)-C2(2))^2)/(2*Sigma^2));
PPDF1 = zeros(size(XX));
%Populate coordinate surface
[R C] = size(PPDF1);
NN = length(D);
for c=1:C
for r=1:R
for d=1:N
PPDF1(r,c) = PPDF1(r,c) + ...
pf1([XX(1,c) YY(r,1)],[D(d,1) D(d,2)]);
end
end
end
%Normalize data
m1 = max(PPDF1(:));
PPDF1 = PPDF1 / m1;
%Set up visualization
set(0,'defaulttextinterpreter','latex','DefaultAxesFontSize',20)
fig = figure(1);clf
stem3(D(:,1),D(:,2),zeros(N,1),'b.');
hold on;
%Add PDF estimates to figure
s1 = surfc(XX,YY,PPDF1);shading interp;alpha(s1,'color');
sub1=gca;
view(2)
axis([Xrange(1) Xrange(2) Yrange(1) Yrange(2)])
Note, this visualization is actually 3-dimensional:
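The same Gaussian Parzen estimate can also be written with a single loop over the data points, which may be quicker on the full 1e4-point dataset. This is only a sketch under the same Border, Sigma and stepSize settings as above; the names gx, gy and P are my own, and contourf is used here instead of surfc simply because the question asked for contours.

```matlab
% Example data from the question
X = [53 58 62 56 72 63 65 57 52 56 52 70 54 54 59 58 71 66 55 56];
Y = [40 33 35 37 33 36 32 36 35 33 41 35 37 31 40 41 34 33 34 37];

Border = 5;      % margin added around the data range
Sigma = 5;       % Gaussian kernel width
stepSize = 1;    % grid spacing

% Evaluation grid (gx, gy are illustrative names)
gx = (min(X)-Border):stepSize:(max(X)+Border);
gy = (min(Y)-Border):stepSize:(max(Y)+Border);
[XX, YY] = meshgrid(gx, gy);

% Add one Gaussian bump per data point, evaluated on the whole grid at once
N = numel(X);
P = zeros(size(XX));
for d = 1:N
    P = P + exp(-((XX - X(d)).^2 + (YY - Y(d)).^2) / (2*Sigma^2));
end
P = P / (N * 2*pi*Sigma^2);   % same normalisation as pf1 above

% Filled contours of the density with the raw points on top
contourf(XX, YY, P, 10);
colorbar
hold on
plot(X, Y, 'k.');
hold off
```

A smaller Sigma gives a spikier estimate and a larger one a smoother estimate, so it is worth trying a few values on the real data.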
See this 4 minute video on the mathworks site:
http://blogs.mathworks.com/videos/2010/01/22/advanced-making-a-2d-or-3d-histogram-to-visualize-data-density/
I believe this should provide very close to exactly the functionality you require.
I would divide the area the plot covers into a grid and then count the number of points in each square of the grid. Here's an example of how that could be done.
% Get random data with high density
X=randn(1e4,1);
Y=randn(1e4,1);
Xmin=min(X);
Xmax=max(X);
Ymin=min(Y);
Ymax=max(Y);
% guess of grid size, could be divided into nx and ny
n=floor((length(X))^0.25);
% Create x and y-axis
x=linspace(Xmin,Xmax,n);
y=linspace(Ymin,Ymax,n);
dx=x(2)-x(1);
dy=y(2)-y(1);
griddata=zeros(n);
for i=1:length(X)
% Calculate which bin the point is positioned in
indexX=floor((X(i)-Xmin)/dx)+1;
indexY=floor((Y(i)-Ymin)/dy)+1;
griddata(indexY,indexX)=griddata(indexY,indexX)+1; % rows are y, columns are x, as contourf expects
end
contourf(x,y,griddata)
Edit: The video in the answer by Marm0t uses the same technique but probably explains it in a better way.
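For what it's worth, the counting loop can probably be replaced by accumarray. The following is a sketch rather than a drop-in replacement: it uses n bins between n+1 edges and plots against bin centres, and the names xEdges, yEdges, ix, iy, counts, xc and yc are my own.

```matlab
% Random data as in the example above
X = randn(1e4,1);
Y = randn(1e4,1);

n = 10;                                   % bins per axis (assumed value)
xEdges = linspace(min(X), max(X), n+1);   % n bins need n+1 edges
yEdges = linspace(min(Y), max(Y), n+1);
dx = xEdges(2) - xEdges(1);
dy = yEdges(2) - yEdges(1);

% Bin index of every point; the maximum value is clamped into the last bin
ix = min(floor((X - xEdges(1)) / dx) + 1, n);
iy = min(floor((Y - yEdges(1)) / dy) + 1, n);

% Count points per cell; rows correspond to y and columns to x
counts = accumarray([iy ix], 1, [n n]);

% Plot against the bin centres
xc = (xEdges(1:end-1) + xEdges(2:end)) / 2;
yc = (yEdges(1:end-1) + yEdges(2:end)) / 2;
contourf(xc, yc, counts);
colorbar
```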
How to improve the distance calculation on the 2 separated datasets?
This is the code:
X = [ 3.6 79
1.8 54
3.333 74
2.283 62
4.533 85
2.883 55
4.7 88
3.6 85
1.95 51
4.35 85
1.833 54
3.917 84
4.2 78
1.75 47
4.7 83
2.167 52
1.75 62
4.8 84
1.6 52
4.25 79
1.8 51
1.75 47
3.45 78
3.067 69
4.533 74
3.6 83
1.967 55
4.083 76
3.85 78
4.433 79
4.3 73
4.467 77
3.367 66
4.033 80
3.833 74
2.017 52
1.867 48
4.833 80
1.833 59
4.783 90 ]
clc;
close all;
figure;
h(1) = plot(X(:,1),X(:,2),'bx');
hold on;
X1 = X(1:3,:);
X2 = X(4:40,:);
h(2) = plot(X1(1:3,1), X1(1:3,2),'rs','MarkerSize',10);
k=5;
[D2 ind] = sort(squeeze(sqrt(sum(bsxfun(@minus,X2,permute(X1,[3 2 1])).^2,2))))
ind_closest = ind(1:k,:)
x_closest = X(ind_closest,:)
for j = 1:length(x_closest);
h(3) =plot(x_closest(j,1),x_closest(j,2),'ko','MarkerSize',10);
end
The output is shown as in the picture below:
The problem is that the code does not pick the data points closest to the red squares. I also tried the pdist2 function from the Statistics Toolbox; it gives results similar to the bsxfun approach in my code.
I'm not sure which part of the code needs to change so that I can pick the data points closest to the targets.
I would really appreciate it if anyone could help me improve my code.
If "closest" means closest among all points in X, line 19 & line 20 should be replaced with
[D2 ind] = sort(squeeze(sqrt(sum(bsxfun(@minus,X,permute(X1,[3 2 1])).^2,2))))
ind_closest = ind(2:k+1,:)
If "closest" means closest among the points in X2, then keep your sort and index into X2 instead of X:
x_closest = X2(ind_closest,:)
In the meanwhile, I modified your code a little bit, since your h(3) could be optimized.
clc; clear; close all;
%load fisheriris
%X=meas(:,3:4);
load X
X=unique(X,'rows');
figure;
h(1) = plot(X(:,1),X(:,2),'bx');
hold on;
X1 = X([5 15 30],:);
h(2) = plot(X1(:,1), X1(:,2),'rs','MarkerSize',10);
[D2,ind] = sort(squeeze(sqrt(sum(bsxfun(@minus,X,permute(X1,[3 2 1])).^2,2))));
k=3;
ind_closest = unique(ind(2:k+1,:));
x_closest = X(ind_closest,:);
h(3) =plot(x_closest(:,1),x_closest(:,2),'ko','MarkerSize',10);
axis equal
It seems to be working fine.
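To make the index bookkeeping explicit, here is a small sketch on the first 8 rows of the question's data. It measures distances from every row of X to each target, so the sorted indices refer to X directly, and it masks the targets themselves with Inf instead of skipping the first sorted row. The names targetIdx, T, D, order and nearest are my own.

```matlab
% First 8 rows of the data from the question (a subset, for illustration)
X = [3.6 79; 1.8 54; 3.333 74; 2.283 62; 4.533 85; 2.883 55; 4.7 88; 3.6 85];

targetIdx = [1 2 3];      % rows of X used as targets
k = 2;                    % neighbours wanted per target
T = X(targetIdx,:);

% D(i,j) = Euclidean distance from X(i,:) to T(j,:)
D = sqrt(bsxfun(@minus, X(:,1), T(:,1)').^2 + ...
         bsxfun(@minus, X(:,2), T(:,2)').^2);

% Never report a target as a neighbour
D(targetIdx,:) = Inf;

% Column j of 'nearest' holds the rows of X closest to target j
[~, order] = sort(D, 1);
nearest = order(1:k,:);
```

Note that the second column spans a much larger range than the first, so it dominates the Euclidean distance. Without axis equal the chosen points can look wrong on the plot even when they are numerically the closest.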
I have a 1788x3 double matrix.
My goal is to use the values in the first and second columns as coordinates and create a 256*256 matrix whose entries are the values in the third column. Missing values should be zero.
That is the part of my matrix:
For example in 256*256 matrix (161,37) coordinates value will be 0.347365914411139
161 37 0.347365914411139
162 38 0.414350944291199
160 38 -0.904597803215328
165 35 -0.853613950415835
163 38 -0.926329070526244
166 35 -1.37361928823183
168 37 0.661707825299905
Looking forward to your answers.
Regards;
The easiest, but not necessarily the most efficient, way to do this is with a loop, i.e.
% if m = your 1788x3 data
x = sparse(256,256); %// x = zeros(256); % //use either of these
for nn = 1:size(m,1)
x(m(nn,1),m(nn,2)) = m(nn,3);
end
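If you'd rather avoid the loop, linear indexing with sub2ind should give the same result. A sketch using the seven rows shown in the question (A is my own name for the output matrix):

```matlab
% The sample rows from the question: [row, column, value]
m = [161 37  0.347365914411139
     162 38  0.414350944291199
     160 38 -0.904597803215328
     165 35 -0.853613950415835
     163 38 -0.926329070526244
     166 35 -1.37361928823183
     168 37  0.661707825299905];

A = zeros(256);                                % missing entries stay zero
idx = sub2ind(size(A), m(:,1), m(:,2));        % (row, col) -> linear index
A(idx) = m(:,3);
```

This assumes every coordinate pair appears only once. If pairs can repeat, you would need to decide how to combine them; for example, sparse(m(:,1), m(:,2), m(:,3), 256, 256) sums the values of repeated pairs.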
The plot in MATLAB looks like this:
The code to generate this is very simple:
y = [0 18 450];
x = [0 5.3 6.575];
plot(x,y);
How could I know the values of 119 equally spaced discrete points on this plot?
In simple MATLAB plots, the points are connected by linear interpolation: a straight line is drawn between each pair of neighbouring points. The only values you can read back from the graph itself are the ones you used to plot it (at least not easily...).
If, however, you want 119 points at equally spaced intervals along the lines defined by the above set of 3 points, you can use the interp1 function to do so:
y = [0 18 450];
x = [0 5.3 6.575];
yy = interp1(x, y, linspace(min(x),max(x),119), 'linear');
interp1 performs linear interpolation (note the 'linear' flag at the end...). It takes the key points defined by x and y, plus a set of query x values, and returns the interpolated y values, stored here in yy. linspace generates the query values: 119 equally spaced points from the smallest value in x to the largest.
Here's a running example with your data:
>> format compact;
>> y = [0 18 450];
>> x = [0 5.3 6.575];
>> yy = interp1(x, y, linspace(min(x),max(x),119), 'linear');
>> yy
yy =
Columns 1 through 8
0 0.1892 0.3785 0.5677 0.7570 0.9462 1.1354 1.3247
Columns 9 through 16
1.5139 1.7031 1.8924 2.0816 2.2709 2.4601 2.6493 2.8386
Columns 17 through 24
3.0278 3.2171 3.4063 3.5955 3.7848 3.9740 4.1633 4.3525
Columns 25 through 32
4.5417 4.7310 4.9202 5.1094 5.2987 5.4879 5.6772 5.8664
Columns 33 through 40
6.0556 6.2449 6.4341 6.6234 6.8126 7.0018 7.1911 7.3803
Columns 41 through 48
7.5696 7.7588 7.9480 8.1373 8.3265 8.5157 8.7050 8.8942
Columns 49 through 56
9.0835 9.2727 9.4619 9.6512 9.8404 10.0297 10.2189 10.4081
Columns 57 through 64
10.5974 10.7866 10.9759 11.1651 11.3543 11.5436 11.7328 11.9220
Columns 65 through 72
12.1113 12.3005 12.4898 12.6790 12.8682 13.0575 13.2467 13.4360
Columns 73 through 80
13.6252 13.8144 14.0037 14.1929 14.3822 14.5714 14.7606 14.9499
Columns 81 through 88
15.1391 15.3283 15.5176 15.7068 15.8961 16.0853 16.2745 16.4638
Columns 89 through 96
16.6530 16.8423 17.0315 17.2207 17.4100 17.5992 17.7885 17.9777
Columns 97 through 104
34.6540 53.5334 72.4128 91.2921 110.1715 129.0508 147.9302 166.8096
Columns 105 through 112
185.6889 204.5683 223.4477 242.3270 261.2064 280.0857 298.9651 317.8445
Columns 113 through 119
336.7238 355.6032 374.4826 393.3619 412.2413 431.1206 450.0000
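If you also need the x coordinate of each interpolated value, keep the query vector around. A small sketch (xx and points are my own names):

```matlab
x = [0 5.3 6.575];
y = [0 18 450];

xx = linspace(min(x), max(x), 119);   % 119 equally spaced x values
yy = interp1(x, y, xx, 'linear');     % matching y values on the plotted lines
points = [xx(:) yy(:)];               % one (x, y) pair per row

% Overlay the interpolated points on the original plot
plot(x, y, '-', xx, yy, 'o');
```

Keep in mind that the points are equally spaced along the x axis, not along the curve itself.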
I have a histogram that I want conditional coloring in it with this rule :
Values above 50 should have red bars and values below 50 should have blue bars.
Suppose that we have this input matrix:
X = [32 64 32 12 56 76 65 44 89 87 78 56 96 90 86 95 100 65];
I want to keep MATLAB's default bins and apply this coloring along the X axis (by bin). I'm using GUIDE to design my GUI, and this histogram is an axes in my GUI.
This is our normal graph. Bars with values above 50 should be red and bars with values below 50 should be green (along the X axis).
I think this does what you want (as per comments). The bar around 50 is split into the two colors. This is done by using a patch to change the color of part of that bar.
%// Data:
X = [32 64 32 12 56 76 65 44 89 87 78 56 96 90 86 95 100 65]; %// data values
D = 50; %// where to divide into two colors
%// Histogram plot:
[y n] = hist(X); %// y: values; n: bin centers
ind = n>D; %// bin centers: greater or smaller than D?
bar(n(ind), y(ind), 1, 'r'); %// for greater: use red
hold on %// keep graph, Or use hold(your_axis_handle, 'on')
bar(n(~ind), y(~ind), 1, 'b'); %// for smaller: use blue
[~, nd] = min(abs(n-D)); %// locate bar around D: it needs the two colors
patch([(n(nd-1)+n(nd))/2 D D (n(nd-1)+n(nd))/2], [0 0 y(nd) y(nd)], 'b');
%// take care of that bar with a suitable patch
X = [32 64 32 12 56 76 65 44 89 87 78 56 96 90 86 95 100 65];
Then you create a histogram, but only to get the number of bins, the number of elements in each bin and the bin positions:
[N,XX]=hist(X);
close all
Finally, here is the code that uses the number of elements (N) and the positions (XX) from the previous hist call to draw and color the bars:
figure;
hold on;
width=8;
for i=1:length(N)
h = bar(XX(i), N(i), width);
if XX(i)>50
col = 'r';
else
col = 'b';
end
set(h, 'FaceColor', col)
end
Here you could use more than one if branch to set multiple colors.
cheers
First sort X:
X = [32 64 32 12 56 76 65 44 89 87 78 56 96 90 86 95 100 65];
sorted_X = sort(X)
sorted_X :
sorted_X =
Columns 1 through 14
12 32 32 44 56 56 64 65 65 76 78 86 87 89
Columns 15 through 18
90 95 96 100
Then split the data based on 50:
idx1 = find(sorted_X<=50,1,'last');
A = sorted_X(1:idx1);
B = sorted_X(idx1+1:end);
Display it as two different histograms.
hist(A);
hold on;
hist(B);
h = findobj(gca,'Type','patch');
display(h)
set(h(1),'FaceColor','g','EdgeColor','k');
set(h(2),'FaceColor','r','EdgeColor','k');
Assume we have the following data:
H_T = [36 66 21 65 52 67 73; 31 23 19 33 36 39 42]
P = [40 38 39 40 35 32 37]
Using MATLAB 7.0, I want to create three new matrices that have the following properties:
The matrix H (the first row of matrix H_T) will be divided into 3 intervals:
Matrix 1: the 1st interval contains the H values between 20 to 40
Matrix 2: the 2nd interval contains the H values between 40 to 60
Matrix 3: the 3rd interval contains the H values between 60 to 80
The important thing is that the corresponding T and P will also be included in their new matrices meaning that H will control the new matrices depending on the specifications defined above.
So, the resultant matrices will be:
H_T_1 = [36 21; 31 19]
P_1 = [40 39]
H_T_2 = [52; 36]
P_2 = [35]
H_T_3 = [66 65 67 73; 23 33 39 42]
P_3 = [38 40 32 37]
This is a simple example, and it is easy to build the new matrices by eye here, BUT my real data has thousands of values, which makes that impractical.
Here's a quick solution
[counts,bins] = histc(H_T(1,:), [20 40 60 80]); % bins(k) is the interval index of H(k); the [~,bins] form needs R2009b or newer
outHT = cell(3,1);
outP = cell(3,1);
for i=1:3
idx = (bins == i);
outHT{i} = H_T(:,idx);
outP{i} = P(idx);
end
then you access the matrices as:
>> outHT{3}
ans =
66 65 67 73
23 33 39 42
>> outP{3}
ans =
38 40 32 37
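If histc is not an option, the same split can be done with plain logical masks, which should also work on older releases such as MATLAB 7.0. A sketch, assuming each interval includes its lower edge and excludes its upper edge (edges and nBins are my own names):

```matlab
H_T = [36 66 21 65 52 67 73; 31 23 19 33 36 39 42];
P = [40 38 39 40 35 32 37];

edges = [20 40 60 80];          % interval boundaries for H
nBins = numel(edges) - 1;
outHT = cell(nBins,1);
outP = cell(nBins,1);

for i = 1:nBins
    % Columns whose H value falls in [edges(i), edges(i+1))
    idx = H_T(1,:) >= edges(i) & H_T(1,:) < edges(i+1);
    outHT{i} = H_T(:,idx);      % H and T travel together
    outP{i} = P(idx);           % matching P values
end
```

A value exactly equal to 80 would be left out of every interval with these masks, so adjust the last comparison if that case can occur in your data.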