Plot data with MATLAB biplot with more than 1 color - matlab

I have 3 groups of data that had PCA performed on them as one group. I want to highlight each variable group with a different color. Prior to this I overlaid 3 biplots. This gives different colors but creates a distortion in the data as each biplot function skews the data. This caused the groups to all be skewed by different amounts, making the plot not a correct representation.
How do I take a PCA scores matrix (30x3) and split it so the first 10x3 is one color, the next 10x3 is another and the third 10x3 is another, without the data being skewed?

"Skewing" is happening because biplot is renormalizing the scores so the farthest score is distance 1 . axis equal isn't going to fix this. You should use scatter3 instead of biplot
data = rand(30,3);
group = scores(1:10,:)
scatter3(group(:,1), group(:,2), group(:,3), '.b')
hold all
group = scores(11:20,:)
scatter3(group(:,1), group(:,2), group(:,3), '.r')
group = scores(21:30,:)
scatter3(group(:,1), group(:,2), group(:,3), '.g')
hold off
title('Data')
xlabel('X')
ylabel('Y')
zlabel('Z')
Or modify your code's scatter3 lines so that the markers are different colors. The parameter after 'marker' tells what symbol and what symbol and color to plot. E.g. '.r' is a red dot. See Linespec for marker and color parameters.
scatter3(plotdataholder(1:14,1),plotdataholder(1:14,2),plotdataholder(1:14,3),35,[1 0 0],'marker', '.b');
hold on;
scatter3(plotdataholder(15:28,1),plotdataholder(15:28,2),plotdataholder(15:28,3),35,[0 0 1],'marker', '.r') ;
scatter3(plotdataholder(29:42,1),plotdataholder(29:42,2),plotdataholder(29:42,3),35,[0 1 0],'marker', '.g');

This is the method I used to plot biplot data with different colors. The lines of code prior to plot are taken from the biplot.m file. The way biplot manipulates data is kept intact and stops skewing of data when using overlaid biplots.
This coding is not the most efficient, one can see parts that can be cut. I wanted to keep the code intact so one can see how biplot works in it's entirety.
%%%%%%%%%%%%%%%%%%%%%
xxx = coeff(:,1:3);
yyy= score(:,1:3);
**%Taken from biplot.m; This is alter the data the same way biplot alters data - having the %data fit on grid axes no larger than 1.**
[n,d2] = size(yyy);
[p,d] = size(xxx); %7 by 3
[dum,maxind] = max(abs(xxx),[],1);
colsign = sign(xxx(maxind + (0:p:(d-1)*p)));
xxx = xxx .* repmat(colsign, p, 1);
yyy= (yyy ./ max(abs(yyy(:)))) .* repmat(colsign, 42, 1);
nans = NaN(n,1);
ptx = [yyy(:,1) nans]';
pty = [yyy(:,2) nans]';
ptz = [yyy(:,3) nans]';
**%I grouped the pt matrices for my benefit**
plotdataholder(:,1) = ptx(1,:);
plotdataholder(:,2) = pty(1,:);
plotdataholder(:,3) = ptz(1,:);
**%my original score matrix is 42x3 - wanted each 14x3 to be a different color**
scatter3(plotdataholder(1:14,1),plotdataholder(1:14,2),plotdataholder(1:14,3),35,[1 0 0],'marker', '.');
hold on;
scatter3(plotdataholder(15:28,1),plotdataholder(15:28,2),plotdataholder(15:28,3),35,[0 0 1],'marker', '.') ;
scatter3(plotdataholder(29:42,1),plotdataholder(29:42,2),plotdataholder(29:42,3),35,[0 1 0],'marker', '.');
xlabel('Principal Component 1');
ylabel('Principal Component 2');
zlabel('Principal Component 3');

I am not sure if it will help, but try axis equal after you have overlaid the plots.

Related

MATLAB: combining and normalizing histograms with different sample sizes

I have four sets of data, the distribution of which I would like to represent in MATLAB in one figure. Current code is:
[n1,x1]=hist([dataset1{:}]);
[n2,x2]=hist([dataset2{:}]);
[n3,x3]=hist([dataset3{:}]);
[n4,x4]=hist([dataset4{:}]);
bar(x1,n1,'hist');
hold on; h1=bar(x1,n1,'hist'); set(h1,'facecolor','g')
hold on; h2=bar(x2,n2,'hist'); set(h2,'facecolor','g')
hold on; h3=bar(x3,n3,'hist'); set(h3,'facecolor','g')
hold on; h4=bar(x4,n4,'hist'); set(h4,'facecolor','g')
hold off
My issue is that I have different sampling sizes for each group, dataset1 has an n of 69, dataset2 has an n of 23, dataset3 and dataset4 have n's of 10. So how do I normalize the distributions when representing these three groups together?
Is there some way to..for example..divide the instances in each bin by the sampling for that group?
You can normalize your histograms by dividing by the total number of elements:
[n1,x1] = histcounts(randn(69,1));
[n2,x2] = histcounts(randn(23,1));
[n3,x3] = histcounts(randn(10,1));
[n4,x4] = histcounts(randn(10,1));
hold on
bar(x4(1:end-1),n4./sum(n4),'histc');
bar(x3(1:end-1),n3./sum(n3),'histc');
bar(x2(1:end-1),n2./sum(n2),'histc');
bar(x1(1:end-1),n1./sum(n1),'histc');
hold off
ax = gca;
set(ax.Children,{'FaceColor'},mat2cell(lines(4),ones(4,1),3))
set(ax.Children,{'FaceAlpha'},repmat({0.7},4,1))
However, as you can see above, you can do some more things to make your code more simple and short:
You only need to hold on once.
Instead of collecting all the bar handles, use the axes handle.
Plot the bar in ascending order of the number of elements in the dataset, so all histograms will be clearly visible.
With the axes handle set all properties at one command.
and as a side note - it's better to use histcounts.
Here is the result:
EDIT:
If you want to also plot the pdf line from histfit, then you can save it first, and then plot it normalized:
dataset = {randn(69,1),randn(23,1),randn(10,1),randn(10,1)};
fits = zeros(100,2,numel(dataset));
hold on
for k = numel(dataset):-1:1
total = numel(dataset{k}); % for normalizing
f = histfit(dataset{k}); % draw the histogram and fit
% collect the curve data and normalize it:
fits(:,:,k) = [f(2).XData; f(2).YData./total].';
x = f(1).XData; % collect the bar positions
n = f(1).YData; % collect the bar counts
f.delete % delete the histogram and the fit
bar(x,n./total,'histc'); % plot the bar
end
ax = gca; % get the axis handle
% set all color and transparency for the bars:
set(ax.Children,{'FaceColor'},mat2cell(lines(4),ones(4,1),3))
set(ax.Children,{'FaceAlpha'},repmat({0.7},4,1))
% plot all the curves:
plot(squeeze(fits(:,1,:)),squeeze(fits(:,2,:)),'LineWidth',3)
hold off
Again, there are some other improvements you can introduce to your code:
Put everything in a loop to make thigs more easily changed later.
Collect all the curves data to one variable so you can plot them all together very easily.
The new result is:

How to draw differently spaced grid lines in Matlab

On mathworks, I have found a code which is suppossed to draw grid lines in a plot:
g_x = -25:1.25:0;
g_y = -35:2.5:-5;
for i = 1:length(g_x)
plot([g_x(i) g_x(i)],[g_y(1) g_y(end)],'k:')% y grid lines
hold on
end
for i=1:length(g_y)
plot([g_x(1) g_x(end)],[g_y(i) g_y(i)],'k:') % x grid lines
hold on
end
here's a link
I don't understand the plot command: e.g. the y grid lines - one of the inputs is a vector containing all the spacing points of the x-axis, where I want to have a grid. These points are given in two columns and they are assigned to the second vector, which only contains the first and the last point shown on the y-axis. As I understand this command, it will for example take the first element g_x(1) and g_y(1) and plot a : , it will then take g_x(2) and g_y(1) and plot :, and so on. But How does it keep on plotting : from g_y(1) continuously until g-y(end) ?
To directly answer your question, it simply plots the two endpoints of each grid line and since the default LineStyle used by plot is a solid line, they will automatically be connected. What that code is doing is creating all permutations of endpoints and plotting those to form the grid.
Rather than creating custom plot objects (if you're using R2015b or later), you can simply use the minor grid lines and modify the locations of the minor tick marks of the axes.
g_x = -25:1.25:0;
g_y = -35:2.5:-5;
ax = axes('xlim', [-25 0], 'ylim', [-35 -5]);
% Turn on the minor grid lines
grid(ax, 'minor')
% Modify the location of the x and y minor tick marks
ax.XAxis.MinorTickValues = g_x;
ax.YAxis.MinorTickValues = g_y;
Basically the code does this:
g_x = -25:1.25:0;
generates a array with values -25.0000 -23.7500 -22.5000 -21.2500 ... 0
These are the positions where vertical lines are drawn
The same holds for g_y, but of course this determines where horizontal lines are drawn.
The option 'k' determined that it is a dashed line.
And the loops just loo through the arrays.
So in the first iteration the plot function draws a line from the position
[-25, -35]
to the position
[-25, -5]
So if you want to change the grid, just change the values stored in g_x
g_x = -25:3.0:0;
would for example draw vertical lines with the width 3.0
I hope this makes sense for you.

Matlab plot of several digital signals

I'm trying to find a way to nicely plot my measurement data of digital signals.
So I have my data available as csv and mat file, exported from an Agilent Oscilloscope. The reason I'm not just taking a screen shot of the Oscilloscope screen is that I need to be more flexible (make several plots with one set of data, only showing some of the lines). Also I need to be able to change the plot in a month or two so my only option is creating a plot from the data with a computer.
What I'm trying to achieve is something similar to this picture:
The only thing missing on that pic is a yaxis with 0 and 1 lines.
My first try was to make a similar plot with Matlab. Here's what I got:
What's definitely missing is that the signal names are right next to the actual line and also 0 and 1 ticks on the y-axis.
I'm not even sure if Matlab is the right tool for this and I hope you guys can give me some hints/a solution on how to make my plots :-)
Here's my Matlab code:
clear;
close all;
clc;
MD.RAW = load('Daten/UVLOT1 debounced 0.mat'); % get MeasurementData
MD.N(1) = {'INIT\_DONE'};
MD.N(2) = {'CONF\_DONE'};
MD.N(3) = {'NSDN'};
MD.N(4) = {'NRST'};
MD.N(5) = {'1V2GD'};
MD.N(6) = {'2V5GD'};
MD.N(7) = {'3V3GD'};
MD.N(8) = {'5VGD'};
MD.N(9) = {'NERR'};
MD.N(10) = {'PGD'};
MD.N(11) = {'FGD'};
MD.N(12) = {'IGAGD'};
MD.N(13) = {'GT1'};
MD.N(14) = {'NERRA'};
MD.N(15) = {'GT1D'};
MD.N(16) = {'GB1D'};
% concat vectors into one matrix
MD.D = [MD.RAW.Trace_D0, MD.RAW.Trace_D1(:,2), MD.RAW.Trace_D2(:,2), MD.RAW.Trace_D3(:,2), ...
MD.RAW.Trace_D4(:,2), MD.RAW.Trace_D5(:,2), MD.RAW.Trace_D6(:,2), MD.RAW.Trace_D7(:,2), ...
MD.RAW.Trace_D8(:,2), MD.RAW.Trace_D9(:,2), MD.RAW.Trace_D10(:,2), MD.RAW.Trace_D11(:,2), ...
MD.RAW.Trace_D12(:,2), MD.RAW.Trace_D13(:,2), MD.RAW.Trace_D14(:,2), MD.RAW.Trace_D15(:,2)];
cm = hsv(size(MD.D,2)); % make colormap for plot
figure;
hold on;
% change timebase to ns
MD.D(:,1) = MD.D(:,1) * 1e9;
% plot lines
for i=2:1:size(MD.D,2)
plot(MD.D(:,1), MD.D(:,i)+(i-2)*1.5, 'color', cm(i-1,:));
end
hold off;
legend(MD.N, 'Location', 'EastOutside');
xlabel('Zeit [ns]'); % x axis label
title('Messwerte'); % title
set(gca, 'ytick', []); % hide y axis
Thank you guys for your help!
Dan
EDIT:
Here's a pic what I basically want. I added the signal names via text now the only thing that's missing are the 0, 1 ticks. They are correct for the init done signal. Now I just need them repeated instead of the other numbers on the y axis (sorry, kinda hard to explain :-)
So as written in my comment to the question. For appending Names to each signal I would recommend searching the documentation of how to append text to graph. There you get many different ways how to do it. You can change the position (above, below) and the exact point of data. As an example you could use:
text(x_data, y_data, Var_Name,'VerticalAlignment','top');
Here (x_data, y_data) is the data point where you want to append the text and Var_Name is the name you want to append.
For the second question of how to get a y-data which contains 0 and 1 values for each signal. I would do it by creating your signal the way, that your first signal has values of 0 and 1. The next signal is drawn about 2 higher. Thus it changes from 2 to 3 and so on. That way when you turn on y-axis (grid on) you get values at each integer (obviously you can change that to other values if you prefer less distance between 2 signals). Then you can relabel the y-axis using the documentation of axes (check the last part, because the documentation is quite long) and the set() function:
set(gca, 'YTick',0:1:last_entry, 'YTickLabel',new_y_label(0:1:last_entry))
Here last_entry is 2*No_Signals-1 and new_y_label is an array which is constructed of 0,1,0,1,0,....
For viewing y axis, you can turn the grid('on') option. However, you cannot chage the way the legends appear unless you resize it in the matlab figure. If you really want you can insert separate textboxes below each of the signal plots by using the insert ->Textbox option and then change the property (linestyle) of the textbox to none to get the exact same plot as above.
This is the end result and all my code, in case anybody else wants to use the good old ctrl-v ;-)
Code:
clear;
close all;
clc;
MD.RAW = load('Daten/UVLOT1 debounced 0.mat'); % get MeasurementData
MD.N(1) = {'INIT\_DONE'};
MD.N(2) = {'CONF\_DONE'};
MD.N(3) = {'NSDN'};
MD.N(4) = {'NRST'};
MD.N(5) = {'1V2GD'};
MD.N(6) = {'2V5GD'};
MD.N(7) = {'3V3GD'};
MD.N(8) = {'5VGD'};
MD.N(9) = {'NERR'};
MD.N(10) = {'PGD'};
MD.N(11) = {'FGD'};
MD.N(12) = {'IGAGD'};
MD.N(13) = {'GT1'};
MD.N(14) = {'NERRA'};
MD.N(15) = {'GT1D'};
MD.N(16) = {'GB1D'};
% concat vectors into one matrix
MD.D = [MD.RAW.Trace_D0, MD.RAW.Trace_D1(:,2), MD.RAW.Trace_D2(:,2), MD.RAW.Trace_D3(:,2), ...
MD.RAW.Trace_D4(:,2), MD.RAW.Trace_D5(:,2), MD.RAW.Trace_D6(:,2), MD.RAW.Trace_D7(:,2), ...
MD.RAW.Trace_D8(:,2), MD.RAW.Trace_D9(:,2), MD.RAW.Trace_D10(:,2), MD.RAW.Trace_D11(:,2), ...
MD.RAW.Trace_D12(:,2), MD.RAW.Trace_D13(:,2), MD.RAW.Trace_D14(:,2), MD.RAW.Trace_D15(:,2)];
cm = hsv(size(MD.D,2)); % make colormap for plot
figure;
hold on;
% change timebase to ns
MD.D(:,1) = MD.D(:,1) * 1e9;
% plot lines
for i=2:1:size(MD.D,2)
plot(MD.D(:,1), MD.D(:,i)+(i-2)*2, 'color', cm(i-1,:));
text(MD.D(2,1), (i-2)*2+.5, MD.N(i-1));
end
hold off;
%legend(MD.N, 'Location', 'EastOutside');
xlabel('Zeit [ns]'); % x axis label
title('Messwerte'); % title
% make y axis and grid the way I want it
set(gca, 'ytick', 0:size(MD.D,2)*2-3);
grid off;
set(gca,'ygrid','on');
set(gca, 'YTickLabel', {'0'; '1'});
ylim([-1,(size(MD.D,2)-1)*2]);

How to mark points in a plot that are over a specified value in Matlab?

Say I have a data set with x values of longitude and Y values of 1 to 100. How can I plot the whole data set and represent all Y values over 90 with a different symbol?
Thanks for the help!
The easiest way would be to plot the sets separately, and specify a different symbol for each set i.e.
plot(x(Y<=90),Y(Y<=90),'bx',x(Y>90),Y(Y>90),'bo');
You could also do different colors. The scatter function has the ability to specify a different color for each point with the syntax scatter(x,y,s,c). For your example, you could do:
% make data
rng(0,'twister'); theta = linspace(0,2*pi,150);
x = sin(theta) + 0.75*rand(1,150); x = x*100;
y = cos(theta) + 0.75*rand(1,150); y = y*100;
mask = y>90;
% plot with custom colors for each point
c = zeros(numel(x),3); % matrix of RGB colorspecs
c(mask,:) = repmat([1 0 0],nnz(mask),1); % red
c(~mask,:) = repmat([0 0 1],nnz(~mask),1); % blue
scatter(x,y,10,c,'+');
Or instead of and RGB colorspec matrix, you can index into the current colormap. This allows you to get a nice smooth variation with some value:
scatter(x,y,10,y+x,'o') % x+y is mapped to indexes into default colormap, jet(64)
You can combine this color mapping with the approach of separating the data into two sets to also get different markers. Split the data, plot the first set with scatter as above, hold on, and plot the second set with a different marker. For example,
cv = x+y; % or just y, but this is an interesting example
scatter(x(mask),y(mask),10,cv(mask),'+');
hold on
scatter(x(~mask),y(~mask),10,cv(~mask),'o');
The result is different marker styles, where '+' is used where y>90 and '+' elsewhere, and different colors, where color is determined by mapping the values of cv=x+y onto the current colormap. The idea here is to look at 2 different modes of variations, but you could just use cv=y.

Configuring biplot in Matlab to distinguish in scatter

My original data is a 195x22 record set containing vocal measurements of people having Parkinson's disease or not. In a vector, 195x1, I have a status which is either 1/0.
Now, I have performed a PCA and I do a biplot, which turns out well. The problem is that I can't tell which dots from my scatter plot origin of a sick or a healthy person (I can't link it with status). I would like for my scatter plot to have a red dot if healthy (status=0) and green if sick (status=1).
How would I do that? My biplot code is:
biplot(coeff(:,1:2), ...
'Scores', score(:,1:2), ...
'VarLabels', Labels, ...
'markersize', 15 ...
);
xlabel('Bi-Plot: Standardized Data');
xlabel('PCA1');
ylabel('PCA2');
Click to view image
UPDATE (Solution):
Solution is inspired by #Magla and code can be seen here: http://pastebin.com/KHUj3DnA
With this beautiful graph:
The principal component scores (red points) in a biplot are not the ones returned by the pca function. As the help states,
biplot scales the scores so that they fit on the plot: It divides each score by the maximum absolute value of all scores, and multiplies
by the maximum coefficient length of coefs. Then biplot changes the
sign of score coordinates according to the sign convention for the
coefs.
You therefore can't easily use the (X,Y) information to find out which point belong to a category.
Here is a workaround using the ObsLabels option of biplot. ObsLabels assigns some user-defined data to each observation: for each point, we will assign the index corresponding to a status variable (a simple incrementing value). With this, you can easily modify the red points of a biplot - here marker set to square and red/green color.
The following figure
is produced by this code
%some data
load carsmall
x = [Acceleration Displacement Horsepower MPG Weight]; x = x(all(~isnan(x),2),:);
[coefs,score] = pca(zscore(x));
%the status vector (here zero or one)
class_pt = round(rand(size(score,1),1));
vbls = {'Accel','Disp','HP','MPG','Wgt'};
figure('Color', 'w');
hbi = biplot(coefs(:,1:2),'scores',score(:,1:2),'varlabels',vbls,...
'ObsLabels',num2str((1:size(score,1))'));
for ii = 1:length(hbi)
userdata = get(hbi(ii), 'UserData');
if ~isempty(userdata)
if class_pt(userdata) == 0
set(hbi(ii), 'Color', 'g', 'Marker', 's');
elseif class_pt(userdata) == 1
set(hbi(ii), 'Color', 'r', 'Marker', 's');
end
end
end