Splitting data into two classes visually in matlab - matlab

I have two clusters of data. Each point has x,y coordinates and a value giving its type (1 for class 1, 2 for class 2). I have plotted the data, but I would like to separate the two classes visually with a boundary. What function does this? I tried contour, but it did not help.

Consider this classification problem (using the Iris dataset):
As you can see, except for easily separable clusters for which you know the equation of the boundary beforehand, finding the boundary is not a trivial task...
One idea is to use the discriminant analysis function classify to find the boundary (you have a choice between linear and quadratic boundary).
The following is a complete example to illustrate the procedure. The code requires the Statistics Toolbox:
%# load Iris dataset (make it binary-class with 2 features)
load fisheriris
data = meas(:,1:2);
labels = species;
labels(~strcmp(labels,'versicolor')) = {'non-versicolor'};
NUM_K = numel(unique(labels)); %# number of classes
numInst = size(data,1); %# number of instances
%# visualize data
figure(1)
gscatter(data(:,1), data(:,2), labels, 'rb', '*o', ...
10, 'on', 'sepal length', 'sepal width')
title('Iris dataset'), box on, axis tight
%# params
classifierType = 'quadratic'; %# 'quadratic', 'linear'
npoints = 100;
clrLite = [1 0.6 0.6 ; 0.6 1 0.6 ; 0.6 0.6 1];
clrDark = [0.7 0 0 ; 0 0.7 0 ; 0 0 0.7];
%# discriminant analysis
%# classify the grid space of these two dimensions
mn = min(data); mx = max(data);
[X,Y] = meshgrid( linspace(mn(1),mx(1),npoints) , linspace(mn(2),mx(2),npoints) );
X = X(:); Y = Y(:);
[C,err,P,logp,coeff] = classify([X Y], data, labels, classifierType);
%# find incorrectly classified training data
[CPred,err] = classify(data, data, labels, classifierType);
bad = ~strcmp(CPred,labels);
%# plot grid classification color-coded
figure(2), hold on
image(X, Y, reshape(grp2idx(C),npoints,npoints))
axis xy, colormap(clrLite)
%# plot data points (correctly and incorrectly classified)
gscatter(data(:,1), data(:,2), labels, clrDark, '.', 20, 'on');
%# mark incorrectly classified data
plot(data(bad,1), data(bad,2), 'kx', 'MarkerSize',10)
axis([mn(1) mx(1) mn(2) mx(2)])
%# draw decision boundaries between pairs of clusters
for i=1:NUM_K
    for j=i+1:NUM_K
        if strcmp(coeff(i,j).type, 'quadratic')
            K = coeff(i,j).const;
            L = coeff(i,j).linear;
            Q = coeff(i,j).quadratic;
            f = sprintf('0 = %g + %g*x + %g*y + %g*x^2 + %g*x.*y + %g*y.^2',...
                K,L,Q(1,1),Q(1,2)+Q(2,1),Q(2,2));
        else
            K = coeff(i,j).const;
            L = coeff(i,j).linear;
            f = sprintf('0 = %g + %g*x + %g*y', K,L(1),L(2));
        end
        h2 = ezplot(f, [mn(1) mx(1) mn(2) mx(2)]);
        set(h2, 'Color','k', 'LineWidth',2)
    end
end
xlabel('sepal length'), ylabel('sepal width')
title( sprintf('accuracy = %.2f%%', 100*(1-sum(bad)/numInst)) )
hold off
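Since the question mentions contour: one way it can draw the boundary is to run it over a grid of predicted class indices at level 1.5, halfway between labels 1 and 2. The sketch below is self-contained and uses a nearest-centroid rule on made-up data as a stand-in classifier (both are assumptions, not part of the example above). With classify, the grid would be reshape(grp2idx(C),npoints,npoints) instead of Z.

```matlab
% Hypothetical data: two clusters with class labels 1 and 2
[gx, gy] = meshgrid(-1:0.5:1);
xy1 = [gx(:) gy(:)];             % class 1, centred at (0,0)
xy2 = [gx(:)+3, gy(:)+2];        % class 2, centred at (3,2)
data = [xy1; xy2];
cls = [ones(size(xy1,1),1); 2*ones(size(xy2,1),1)];

% Nearest-centroid rule, used here only as a stand-in classifier
c1 = mean(data(cls==1,:), 1);
c2 = mean(data(cls==2,:), 1);

% Classify every node of a grid covering the data
[X, Y] = meshgrid(linspace(min(data(:,1)), max(data(:,1)), 100), ...
                  linspace(min(data(:,2)), max(data(:,2)), 100));
d1 = (X - c1(1)).^2 + (Y - c1(2)).^2;
d2 = (X - c2(1)).^2 + (Y - c2(2)).^2;
Z = 1 + (d2 < d1);               % 1 where class 1 is closer, 2 otherwise

% The boundary is the level 1.5 between the label values 1 and 2
figure, hold on
plot(xy1(:,1), xy1(:,2), 'r*', xy2(:,1), xy2(:,2), 'bo')
contour(X, Y, Z, [1.5 1.5], 'k', 'LineWidth', 2)
hold off
```

The level is passed as [1.5 1.5] because a scalar fourth argument would be read as the number of contour levels rather than a level value.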

Related

Gaussian mixture model - get the contour of given probability value : Matlab

I need to identify the 99% probability contour of a GMM fitted to data. Following this example, I'd like to be able to specify which contours to plot, and the x,y, of them.
mu1 = [1 2]; Sigma1 = [2 0; 0 0.5];
mu2 = [-3 -5]; Sigma2 = [1 0;0 1];
X = [mvnrnd(mu1,Sigma1,1000); mvnrnd(mu2,Sigma2,1000)];
GMModel = fitgmdist(X,2);
figure
y = [zeros(1000,1);ones(1000,1)];
h = gscatter(X(:,1),X(:,2),y);
hold on
gmPDF = @(x,y) arrayfun(@(x0,y0) pdf(GMModel,[x0 y0]),x,y);
g = gca;
fcontour(gmPDF,[g.XLim g.YLim])
title('{\bf Scatter Plot and Fitted Gaussian Mixture Contours}')
legend(h,'Model 0','Model1')
hold off
So, in the following figure, I'd like to be able to plot the 99% contour as a dashed black line ("k"). Any idea how to accomplish this?
You can display and get the coordinates of the given contour line specifying the LevelList property of fcontour, and then reading the ContourMatrix property of the contour handle:
% Random function, insert here yours
f = @(x,y) arrayfun(@(x0,y0) x0.^2 + y0.^2 - 0.1,x,y);
% The function value you want to get the contour for
lvl = 0.99;
% Plot the contour line
cHandle = fcontour(f, '--k', 'LevelList', [lvl]);
hold on
% Get the coordinates
lvlX = cHandle.ContourMatrix(1, 2:end);
lvlY = cHandle.ContourMatrix(2, 2:end);
% For a check:
plot(lvlX, lvlY, '--r')
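Note that lvl here is a value of the function, not a probability. If "99% contour" is taken to mean the density level whose enclosed region holds 99% of the probability mass (an assumption about what the question wants), that level can be estimated numerically from a grid of density values. A minimal sketch, with a standard bivariate normal standing in for pdf(GMModel,...):

```matlab
% Density on a grid (stand-in for evaluating the fitted GMM pdf)
x = linspace(-6, 6, 401);
y = linspace(-6, 6, 401);
[X, Y] = meshgrid(x, y);
P = exp(-0.5*(X.^2 + Y.^2)) / (2*pi);
cellArea = (x(2)-x(1)) * (y(2)-y(1));

% Sort density values from high to low and accumulate probability mass
pSorted = sort(P(:), 'descend');
mass = cumsum(pSorted) * cellArea;
idx = find(mass >= 0.99, 1, 'first');
lvl = pSorted(idx);     % density level whose region holds about 99% of the mass

% Draw only that level as a dashed black line
contour(X, Y, P, [lvl lvl], '--k')
```

The grid has to be wide enough to capture essentially all of the mass and fine enough for the sum to approximate the integral. The resulting lvl could then be passed to 'LevelList' as above.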

2 dim histogram binning in matlab [duplicate]

I have written a 2D histogram algorithm for 2 matlab vectors. Unfortunately, I cannot figure out how to vectorize it, and it is about an order of magnitude too slow for my needs. Here is what I have:
function [ result ] = Hist2D( vec0, vec1 )
%Hist2D takes two vectors, and computes the two dimensional histogram
% of those images. It assumes vectors are non-negative, and bins
% are the integers.
%
% OUTPUTS
% result -
% size(result) = 1 + [max(vec0) max(vec1)]
% result(i,j) = number of pixels that have value
% i-1 in vec0 and value j-1 in vec1.
result = zeros(max(vec0)+1, max(vec1)+1);
fvec0 = floor(vec0)+1; % row index comes from vec0
fvec1 = floor(vec1)+1; % column index comes from vec1
% UGH, This is gross, there has to be a better way...
for i = 1 : numel(fvec0)
    result(fvec0(i), fvec1(i)) = 1 + result(fvec0(i), fvec1(i));
end
end
Thoughts?
Thanks!!
John
Here is my version for a 2D histogram:
%# some random data
X = randn(2500,1);
Y = randn(2500,1)*2;
%# bin centers (integers)
xbins = floor(min(X)):1:ceil(max(X));
ybins = floor(min(Y)):1:ceil(max(Y));
xNumBins = numel(xbins); yNumBins = numel(ybins);
%# map X/Y values to bin indices
Xi = round( interp1(xbins, 1:xNumBins, X, 'linear', 'extrap') );
Yi = round( interp1(ybins, 1:yNumBins, Y, 'linear', 'extrap') );
%# limit indices to the range [1,numBins]
Xi = max( min(Xi,xNumBins), 1);
Yi = max( min(Yi,yNumBins), 1);
%# count number of elements in each bin
H = accumarray([Yi(:) Xi(:)], 1, [yNumBins xNumBins]);
%# plot 2D histogram
imagesc(xbins, ybins, H), axis on %# axis image
colormap hot; colorbar
hold on, plot(X, Y, 'b.', 'MarkerSize',1), hold off
Note that I removed the "non-negative" restriction but kept integer bin centers (this could easily be changed to divide the range into a specified number of equally sized bins instead).
This was mainly inspired by @SteveEddins blog post.
You could do something like:
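max0 = max(fvec0) + 1;
max1 = max(fvec1) + 1;
% Combine the vectors into a single integer code per pair
combined = fvec0 + fvec1 * max0;
% Generate a 1D histogram with one bin per integer code 0..max0*max1-1
hist_1d = hist(combined, 0:max0*max1-1);
% Convert back to a 2D histogram: hist_2d(a+1,b+1) counts fvec0==a, fvec1==b
hist_2d = reshape(hist_1d, [max0 max1]);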
(Note: untested)
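For the integer-bin case in the question, accumarray can also replace the loop directly. A minimal sketch with made-up input vectors (the sample values are assumptions), following the convention documented in Hist2D that result(i,j) counts value i-1 in vec0 and value j-1 in vec1:

```matlab
vec0 = [0 1 1 2 2 2];   % hypothetical non-negative sample data
vec1 = [0 0 3 1 1 1];

% One row of subscripts per sample: [row col] = [floor(vec0)+1, floor(vec1)+1]
subs = [floor(vec0(:))+1, floor(vec1(:))+1];

% Output size, so that trailing empty bins are kept
sz = [max(subs(:,1)) max(subs(:,2))];

% Add 1 to result(row,col) for every sample
result = accumarray(subs, 1, sz);
```

Here the pair (2,1) occurs three times, so result(3,2) is 3.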

MATLAB Multiple(parallel) box plots in single figure

I'm using the boxplot function in MATLAB. I need to plot boxplots for 6 different datasets at 6 'XTicks', i.e. each tick on the x axis should carry 6 boxes with their whiskers, median lines and outliers within its domain. I tried manipulating the 'XTick' property by setting offsets for each variable, but that doesn't work for boxplot() as it would for a normal plot(). I'm also not able to add legends.
A 3-variable equivalent of my problem would look like the following:
Edit:
The following is the code snippet that needs to be modified
TreadmillData = randi([20,200],69,6);
Speeds = {'1.5mph' '2.5mph' '3.5mph' '4.5mph' '5.5mph' '6.5mph'};
DeviceColors = {'r' 'g' 'c' [0.5 0 0.5] 'b' [1 0.5 0]};
Pedometer1 = TreadmillData(1:7:end,:);
Pedometer2 = TreadmillData(2:7:end,:);
Pedometer3 = TreadmillData(3:7:end,:);
Pedometer4 = TreadmillData(4:7:end,:);
Pedometer5 = TreadmillData(5:7:end,:);
Pedometer6 = TreadmillData(6:7:end,:);
GroupedData = {Pedometer1 Pedometer2 Pedometer3 Pedometer4 Pedometer5 Pedometer6};
legendEntries = {'dev1' 'dev2' 'dev3' 'dev4' 'dev5' 'dev6'};
figure;
Xt = 20:20:120;
Xt_Offset = [-15,-10,-5,5,10,15];
for i=1:6
    boxplot(GroupedData{i},'Color',DeviceColors{i});
    set(gca,'XTick',Xt+Xt_Offset(i));
    if i==3
        set(gca,'XTickLabel',Speeds);
    end
    hold on;
end
xlabel('Speed');ylabel('Step Count'); grid on;
legend(legendEntries);
Any help would be appreciated!
I've made some modifications to your code. I've tested this in R2014b.
TreadmillData = randi([20,200],69,6);
Speeds = {'1.5mph' '2.5mph' '3.5mph' '4.5mph' '5.5mph' '6.5mph'};
DeviceColors = {'r' 'g' 'c' [0.5 0 0.5] 'b' [1 0.5 0]};
Pedometer1 = TreadmillData(1:7:end,:);
Pedometer2 = TreadmillData(2:7:end,:);
Pedometer3 = TreadmillData(3:7:end,:);
Pedometer4 = TreadmillData(4:7:end,:);
Pedometer5 = TreadmillData(5:7:end,:);
Pedometer6 = TreadmillData(6:7:end,:);
GroupedData = {Pedometer1 Pedometer2 Pedometer3 Pedometer4 Pedometer5 Pedometer6};
legendEntries = {'dev1' 'dev2' 'dev3' 'dev4' 'dev5' 'dev6'};
N = numel(GroupedData);
delta = linspace(-.3,.3,N); %// define offsets to distinguish plots
width = .2; %// small width to avoid overlap
cmap = hsv(N); %// colormap
legWidth = 1.8; %// make room for legend
figure;
hold on;
for ii=1:N %// better not to shadow i (imaginary unit)
    %if ii~=ceil(N/2)
    %    labels = repmat({''},1,N); %// empty labels
    %else
    labels = Speeds; %// center plot: use real labels
    %end
    boxplot(GroupedData{ii},'Color', DeviceColors{ii}, 'boxstyle','filled', ...
        'position',(1:numel(labels))+delta(ii), 'widths',width, 'labels',labels)
    %// plot filled boxes with specified positions, widths, labels
    plot(NaN,1,'color',DeviceColors{ii}); %// dummy plot for legend
end
xlabel('Speed'); ylabel('Step Count'); grid on;
xlim([1+2*delta(1) numel(labels)+legWidth+2*delta(N)]) %// adjust x limits, with room for legend
legend(legendEntries);
Here is a solution for plotting several boxplots. You have to group all the data in a single matrix, with the groups separated by a column of NaN. After that, you can plot a single regular boxplot with ad hoc options such as colors and labels.
The following example uses 2 groups of 3, so 7 columns. The first 4 rows of data:
0.6993 0.0207 -0.7485 NaN 0.5836 -0.1763 -1.8468
-0.0494 -1.5411 0.8022 NaN 2.7124 -0.0636 -2.3639
0.9134 0.7106 -0.1375 NaN -0.2200 -0.2528 -0.8350
-0.5655 1.3820 0.6038 NaN -0.7563 -0.9779 0.3789
And the code:
figure('Color', 'w');
c = colormap(lines(3));
A = randn(60,7); % some data
A(:,4) = NaN; % this is the trick for boxplot
C = [c; ones(1,3); c]; % this is the trick for coloring the boxes
% regular plot
boxplot(A, 'colors', C, 'plotstyle', 'compact', ...
'labels', {'','ASIA','','','','USA',''}); % label only two categories
hold on;
for ii = 1:3
    plot(NaN,1,'color', c(ii,:), 'LineWidth', 4);
end
title('BOXPLOT');
ylabel('MPG');
xlabel('ORIGIN');
legend({'SUV', 'SEDAN', 'SPORT'});
set(gca, 'XLim', [0 8], 'YLim', [-5 5]);

Gaussian Probabilities plot around a trajectory

I am trying to write some code to generate a plot similar to the one below on matlab (taken from here):
I have a set of points on a curve (x_i,y_i,z_i). Each point generates a Gaussian distribution (of mean (x_i,y_i,z_i) and covariance matrix I_3).
What I did was mesh the space into npoints x npoints x npoints and compute, at each grid point (x,y,z), the sum of the probability densities of all the 'sources' (x_i,y_i,z_i). Then, if the value I get is big enough (say 95% of the maximum density), I keep the point; otherwise I discard it.
The problem with my code is that it is too slow (many for loops) and the graph I get doesn't look like the one below:
Does anyone know whether there is a package to get a similar plot as the one below?
Using isosurface we can do reasonably well. (Although I'm honestly not sure what you want, I think this is close.)
% Create a path
points = zeros(10,3);
for ii = 2:10
    points(ii, :) = points(ii-1,:) + [0.8 0.04 0] + 0.5 * randn(1,3);
end
% Create the box we're interested in
x = linspace(-10,10);
y = x;
z = x;
[X,Y,Z] = meshgrid(x,y,z);
% Calculate the sum of the probability densities(ish)
V = zeros(size(X));
for ii = 1:10
    V = V + 1/(2*pi)^(3/2) * exp(-0.5 * (((X-points(ii,1)).^2 + (Y-points(ii,2)).^2 + (Z-points(ii,3)).^2)));
end
fv = isosurface(X,Y,Z,V, 1e-4 * 1/(2*pi)^(3/2), 'noshare');
fv2 = isosurface(X,Y,Z,V, 1e-5 * 1/(2*pi)^(3/2), 'noshare');
p = patch('vertices', fv.vertices, 'faces', fv.faces);
set(p,'facecolor', 'none', 'edgecolor', 'blue', 'FaceAlpha', 0.05)
hold on;
p2 = patch('vertices', fv2.vertices, 'faces', fv2.faces);
set(p2,'facecolor', 'none', 'edgecolor', 'red', 'FaceAlpha', 0.1)
scatter3(points(:,1), points(:,2), points(:,3));
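The question picks the threshold relative to the maximum density, and the same idea works with isosurface by passing a fraction of max(V(:)) as the level. A minimal sketch on a coarser grid, assuming a straight-line trajectory and a fraction of 0.5 (both arbitrary choices for illustration):

```matlab
% Hypothetical trajectory: 9 points along the x axis
points = [linspace(-4,4,9)' zeros(9,2)];

% Coarse grid to keep the example light
g = linspace(-8, 8, 41);
[X, Y, Z] = meshgrid(g, g, g);

% Sum of unit-covariance Gaussian densities, vectorized over the grid
V = zeros(size(X));
for k = 1:size(points,1)
    V = V + exp(-0.5*((X-points(k,1)).^2 + (Y-points(k,2)).^2 + ...
        (Z-points(k,3)).^2)) / (2*pi)^(3/2);
end

% Level chosen as a fraction of the peak density
lvl = 0.5 * max(V(:));
fv = isosurface(X, Y, Z, V, lvl);

% Draw the surface as a translucent patch
p = patch(fv);
set(p, 'FaceColor','blue', 'EdgeColor','none', 'FaceAlpha',0.3)
view(3), axis equal
```

A fraction close to 1, such as the 0.95 mentioned in the question, leaves only small blobs around the densest part of the path. Lower fractions give a wider tube around the whole trajectory.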

Using Coordinate System in Matlab

I've been working on a project involving an inverse source problem from the field of electromagnetic waves. The problem I have is that I need to define 3 points in a 2D space. Each point should have x,y coordinates, of course, and a value that defines its current. Like this:
A1(2,3)=1
A2(2,-2)=2
and so on.
I also have to define a circle around these points and divide it into 200 points. For example, with R=2, the points would be B1(2,0), B50(0,2), B100(-2,0) and so on.
I am really having a hard time defining a space in MATLAB and drawing a circle in it. So I am asking for help defining a 2D space in the way I described. Thanks for any help, guys!
Code along these lines may be useful. Look at grid in the Variable editor.
grid = zeros(50, 50);
R = 10;
angles = (1:200)/200*2*pi; % 200 equally spaced angles around the circle
x = cos(angles)*R;
y = sin(angles)*R;
center = [25 20];
for n=1:length(angles)
    grid(center(1)+1+round(x(n)), center(2)+1+round(y(n))) = 1;
end
You have to define a grid large enough for your needs.
Here is a complete example that might be of help:
%# points
num = 3;
P = [2 3; 2 -2; -1 1]; %# 2D points coords
R = [2.5 3 3]; %# radii of circles around points
%# compute circle points
theta = linspace(0,2*pi,20)'; %'
unitCircle = [cos(theta) sin(theta)];
C = zeros(numel(theta),2,num);
for i=1:num
    C(:,:,i) = bsxfun(@plus, R(i).*unitCircle, P(i,:));
end
%# prepare plot
xlims = [-6 6]; ylims = [-6 6];
line([xlims nan 0 0],[0 0 nan ylims], ...
'LineWidth',2, 'Color',[.2 .2 .2])
axis square, grid on
set(gca, 'XLim',xlims, 'YLim',ylims, ...
'XTick',xlims(1):xlims(2), 'YTick',ylims(1):ylims(2))
title('Cartesian Coordinate System')
xlabel('x-coords'), ylabel('y-coords')
hold on
%# plot centers
plot(P(:,1), P(:,2), ...
'LineStyle','none', 'Marker','o', 'Color','m')
str = num2str((1:num)','A%d'); %'
text(P(:,1), P(:,2), str, ...
'HorizontalAlignment','left', 'VerticalAlignment','bottom')
%# plot circles
clr = lines(num);
h = zeros(num,1);
for i=1:num
    h(i) = plot(C(:,1,i), C(:,2,i), ...
        'LineStyle','-', 'Marker','.', 'Color',clr(i,:));
end
str = num2str((1:num)','Circle %d'); %'
legend(h, str, 'Location','SW')
hold off
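For the specific layout in the question (source points with a current value, plus one circle split into 200 points), plain coordinate arrays may be enough. A minimal sketch, assuming the circle is centred at the origin and the points start at angle 0. The names A, I, B, R and N are chosen here for illustration, and only the two source points given in the question are filled in:

```matlab
% Source points: coordinates and current values (taken from the question)
A = [2 3; 2 -2];      % one row per point: [x y]
I = [1; 2];           % current assigned to each row of A

% Observation points: N equally spaced samples on a circle of radius R
R = 2;
N = 200;
theta = (0:N-1)' * 2*pi/N;           % starts at angle 0, end point not repeated
B = R * [cos(theta) sin(theta)];     % B(k,:) is the k-th point on the circle

% Plot circle points and sources
plot(B(:,1), B(:,2), 'b.', A(:,1), A(:,2), 'ro'), axis equal
```

With this indexing B(1,:) is (2,0), B(51,:) is (0,2) and B(101,:) is (-2,0) up to rounding, so a quarter turn lands on index 51 rather than 50.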