Scatterplot matlab - matlab

I have some problems with a scatter plot.
I am plotting a matrix containing grades per assignment for students e.g. [assignments x grades], but if more than one student gets the same grade in the same assignment, the points will be on top of each other. I want to add a small random number (between -0.1 and 0.1) to the x- and y-coordinates of each dot.
The x-axis should show the assignment number and the y-axis should show the grades.
The grades matrix is defined as a 12x4 matrix.
My code looks like this:
n_assignments = size(grades,2); % Total number of assignments.
n_students = size(grades,1); % Total number of student.
hold on; % Retain current plot when adding new plots.
for i = 1:n_assignments % Loop through every assignment.
% Scatter plot of assignment vs grades for that assignment.
% One assignment on every iteration.
scatter(i*ones(1, n_students), grades(i, :), 'jitter', 'on', 'jitterAmount', 0.1);
end
hold off; % Set the hold state to off.
set(gca, 'XTick', 1:n_assignments); % Display only integer values in x-axis.
xlabel('assignment'); % Label for x-axis.
ylabel('grades'); % Label for y-axis.
grid on; % Display grid lines.
But I keep getting the error message:
X and Y must be vectors of the same length.

Please note that the scatter plot jitter is an undocumented feature. You can also have semi-transparent markers in line and scatter plots, which could be another alternative to solve your current problem.
I will cover the scatter 'jitter' feature in this answer.
Note that 'jitter' only affects the x-axis but not the y-axis (more info on Undocumented Matlab).
Have a look at this example I made based on your description:
Suppose you have a class with 20 students and they have completed 5 assignments. The grades for the assignments are stored in a matrix (grades) where the rows are the assignments and the columns are the students.
Then I simply generate a scatter plot of the data in the grades matrix, one row at a time, in a for loop and using hold on to keep all the graphics on the same figure.
n_assignments = 5; % Total number of assignments.
n_students = 20; % Total number of students.
grades = randi(10, n_assignments, n_students); % Random matrix of grades.
hold on; % Retain current plot when adding new plots.
for i = 1:n_assignments % Loop through every assignment.
% Scatter plot of assignment vs grades for that assignment.
% One assignment on every iteration.
scatter(i*ones(1, n_students), grades(i, :), 'jitter', 'on', 'jitterAmount', 0.1);
end
hold off; % Set the hold state to off.
set(gca, 'XTick', 1:n_assignments); % Display only integer values in x-axis.
xlabel('assignment'); % Label for x-axis.
ylabel('grades'); % Label for y-axis.
grid on; % Display grid lines.
This is the result:
If you still want to add jitter in the y-axis, you would have to do that manually by adding random noise to your grades data, which is something I personally wouldn't recommend, because the grades in the scatter plot could get mixed, thus rendering the plot completely unreliable.
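If you do decide to jitter both axes by hand, a minimal sketch might look like the following. It assumes the 12x4 students-by-assignments layout described in the question, so column i (not row i) holds the grades for assignment i, which is also what keeps the two vectors the same length. The random data and the helper name jitter are made up for illustration.

```matlab
% Hypothetical data: 12 students (rows) by 4 assignments (columns).
grades = randi(10, 12, 4);
[n_students, n_assignments] = size(grades);

% Hypothetical helper: uniform noise in [-0.1, 0.1] of the requested size.
jitter = @(sz) 0.2*rand(sz) - 0.1;

figure; hold on;                       % Keep all assignments in one figure.
for i = 1:n_assignments
    % One x value per student, all centered on assignment i.
    x = i*ones(n_students, 1) + jitter([n_students, 1]);
    % Column i holds the grades for assignment i in this layout.
    y = grades(:, i) + jitter([n_students, 1]);
    scatter(x, y);
end
hold off;
set(gca, 'XTick', 1:n_assignments);    % Integer ticks only.
xlabel('assignment'); ylabel('grades'); grid on;
```

Because the noise never exceeds 0.1 in either direction, a jittered dot stays closer to its true grade than to any neighboring integer grade. Recent MATLAB releases also document XJitter and YJitter properties on scatter objects; whether they are available depends on your version, so check the documentation for your release.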

Related

Matlab - plotting function with a for loop over a matrix

I have an assignment which sounds like:
"“Grades per assignment”: A plot with the assignments on the x-axis and the grades on the y-axis. The x-axis must show all assignments from 1 to M, and the y-axis must show all grade −3 to 12. The plot must contain:
Each of the given grades marked by a dot. You must add a small random number (between -0.1 and 0.1) to the x- and y-coordinates of each dot, to be able tell apart the different dots which otherwise would be on top of each other when more than one student has received the same grade in the same assignment.
The average grade of each of the assignments plotted as a line"
So far I have created this function:
function gradesPlot(grades)
figure(2);
n_assignments=size(grades,2);
hold on; % Retain current plot when adding new plots.
for i = 1:n_assignments % Loop through every assignment.
% Scatter plot of assignment vs grades for that assignment.
% One assignment on every iteration.
n_assignments2=([1:size(grades,2)]);
scatter(n_assignments2,grades(:,i)'jitter', 'on', 'jitterAmount', 0.1)
hold off; % Set the hold state to off.
end
%Titles to the plot
title('Grades per assignment');
xlabel('Assignment');
ylabel('Given grades');
break;
end
When I run the code, it says that the vectors must be the same length.
It also looks like it doesn't loop over the matrix more than once.
The test grades i am using as input is looking like this:
grades=[[-3,4,10];[7,4,12];[7,10,12];[0,4,4];[2,2,2];[2,2,2]]
I hope some of you guys can help me get this function to work - maybe in an easier way?
Thank you in advance
You should not turn hold off inside the loop: hold on is what tells MATLAB to add each new plot to the current axes instead of replacing what is already there. A possible solution is below, with explanations in the comments.
function gradesPlot(grades)
figure(2);
% Extract the relevant information: number of assignments, number of grades.
% Note: this treats each row as one assignment; transpose the input first
% if your matrix is students-by-assignments.
[n_assignments,n_grades] = size(grades);
hold on; % Retain current plot when adding new plots.
for i = 1:n_assignments % Loop through every assignment.
% Scatter plot of assignment vs grades for that assignment.
% One assignment on every iteration.
% scatter needs two vectors of the same size: this places all the dots
% for the grades of the i-th assignment at the i-th coordinate on the x
% axis. The handle h is kept temporarily so the dot color can be
% retrieved below.
h = scatter(i*ones(n_grades,1),grades(i,:),'jitter', 'on', 'jitterAmount', 0.1);
% Plot a horizontal line at the average of all the grades for the i-th
% assignment.
l = line([i-0.5 i+0.5],[1,1]*mean(grades(i,:)));
% Use the same color as the dots.
l.Color = h.CData;
end
%Titles to the plot
title('Grades per assignment');
xlabel('Assignment');
ylabel('Given grades');
axis([0 n_assignments+1 -4 13])
end
Remember that the break command can only be used inside a loop; it does not exit a function. Use return for that if you need it.
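For comparison, here is a loop-free sketch of the same plot that adds the noise by hand on both axes, as the assignment text asks. It assumes the test matrix from the question is students-by-assignments (rows are students, columns are assignments), which is what size(grades,2) being used as the assignment count suggests. The variable names xj, yj and avg are hypothetical, and the body could be wrapped in a function like gradesPlot.

```matlab
% Test matrix from the question; rows are assumed to be students and
% columns assignments.
grades = [-3 4 10; 7 4 12; 7 10 12; 0 4 4; 2 2 2; 2 2 2];
[n_students, n_assignments] = size(grades);

% x-coordinate of every grade: its assignment (column) number.
x = repmat(1:n_assignments, n_students, 1);

% Uniform noise in [-0.1, 0.1] on both coordinates.
xj = x + 0.2*rand(size(x)) - 0.1;
yj = grades + 0.2*rand(size(grades)) - 0.1;

% Average grade per assignment (column-wise mean).
avg = mean(grades, 1);

figure; hold on;
scatter(xj(:), yj(:));              % All dots in a single call.
plot(1:n_assignments, avg, '-');    % Averages as one connected line.
hold off;
set(gca, 'XTick', 1:n_assignments); % Show every assignment from 1 to M.
axis([0 n_assignments+1 -4 13]);    % Leave room around grades -3 to 12.
title('Grades per assignment');
xlabel('Assignment'); ylabel('Given grades');
```

Drawing the averages with a single plot call gives one line across all assignments rather than a separate segment per assignment; which of the two the assignment intends is a judgment call.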

Setting axis limit for plot

I want to set the limits of the x-axis in this plot to 0 and 325. When I use xlim to set them (commented out in the code), it doesn't work properly: the entire structure of the plot changes. Any help will be appreciated.
figure
imagesc(transpose(all_area_for_visual));
colormap("jet")
colorbar('Ticks',0:3,'TickLabels',{'Home ','Field','Bad house','Good house'})
xlabel('Time (min)')
tickLocs = round(linspace(1,length(final_plot_mat_missing_part(2:end,1)),8));
timeVector = final_plot_mat_missing_part(2:end,1);
timeForTicks = (timeVector(tickLocs))./60;
xticks(tickLocs);
xticklabels(timeForTicks);
%xlim([0 325]);
ylabel('Car identity')
yticks(1:length(Ucolumnnames_fpm))
yticklabels([Ucolumnnames_fpm(1,:)])
If I get you right, you want to plot only part of the data in all_area_for_visual, given by a condition on tickLocs. So you should first condition the data, and then plot it:
% generate the vector of x values:
tickLocs = round(linspace(1,length(final_plot_mat_missing_part(2:end,1)),8));
% create an index vector (of logicals) that marks the ticks inside the range to plot:
validX = tickLocs<=325;
% plot only the relevant part of the data (the transpose puts the rows of
% all_area_for_visual along the x-axis, so this selects rows 1 to 325 and
% assumes there are at least that many):
imagesc(transpose(all_area_for_visual(1:325,:)));
% generate the correct ticks for the data that was plotted:
timeVector = final_plot_mat_missing_part(2:end,1);
timeForTicks = (timeVector(tickLocs(validX)))./60;
xticks(tickLocs(validX));
% here you continue with setting the labels, colormap and so on...
imagesc puts the data in little rectangles centered around the integers 1:width and 1:height by default. You can specify the x and y locations of the data points by adding two vectors to the call:
imagesc(x,y,transpose(all_area_for_visual));
where x and y are vectors with the locations along the x and y axes you want to place the data.
Note that xlim and xticks don’t change the location of the data, only the region of the axis shown, and the location of tick marks along the axis. With xticklabels you can change what is shown at each tick mark, so you can use that to “fake” the data locations, but the xlim setting still applies to the actual locations, not to the labels assigned to the ticks.
I think it is easier to plot the data in the right locations to start with. Here is an example:
% Fake your data, I'm making a small matrix here for illustration purposes
all_area_for_visual = min(floor(cumsum(rand(20,5)/2)),3);
times = linspace(0,500,20); % These are the locations along the time axis for each matrix element
car_id_names = [4,5,8,15,18]; % These are the labels to put along the y-axis
car_ids = 1:numel(car_id_names); % These are the locations to use along the y-axis
% Replicate your plot
figure
imagesc(times,car_ids,transpose(all_area_for_visual));
% ^^^ ^^^ NOTE! specifying locations
colormap("jet")
colorbar('Ticks',0:3,'TickLabels',{'Home ','Field','Bad house','Good house'})
xlabel('Time (min)')
ylabel('Car identity')
set(gca,'YTick',car_ids,'YTickLabel',car_id_names) % Combine YTICK and YTICKLABEL calls
% Now you can specify your limit, in actual time units (min)
xlim([0 325]);
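To connect this back to the variables in the question, the time stamps can be converted to minutes first and then handed to imagesc. One caveat: imagesc only uses the first and last elements of the x and y vectors and spaces the pixels evenly between them, so this sketch assumes the time stamps are uniformly sampled. The data below are hypothetical stand-ins, not the asker's actual matrices.

```matlab
% Hypothetical stand-ins for the question's data (assumed uniformly sampled).
timeVector = (0:60:36000)';                     % Time stamps in seconds.
nCars = 5;                                      % Assumed number of cars.
data = randi([0 3], numel(timeVector), nCars);  % One row per time stamp.

timeMin = timeVector ./ 60;                     % Convert seconds to minutes.

figure
% Only the end points of the location vectors matter to imagesc.
imagesc([timeMin(1) timeMin(end)], [1 nCars], transpose(data));
xlabel('Time (min)')
ylabel('Car identity')
xlim([0 325]);                                  % The limit is now in minutes.
```

With the data placed at real time coordinates, the default ticks already read in minutes, so the xticks and xticklabels calls are only needed for cosmetic changes.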

Highlight specific section of graph in MATLAB

I wish to highlight/mark some parts of an array in a MATLAB plot. After some research (like here) I tried to hold the first plot, find the indexes to highlight, and then draw a new plot with only those points. The points are drawn, but they are all shifted to the beginning of the axis:
I'm currently using this code:
load consumer; % the main array to plot (157628x10 double) - data on column 9
load errors; % a array containing the error indexes (1x5590 double)
x = 1:size(consumer,1)'; % returns a (157628x1 double)
idx = (ismember(x,errors)); % returns a (157628x1 logical)
fig = plot(consumer(:,9));
hold on, plot(consumer(idx,9),'r.');
hold off
Another thing I would like to do is highlight the whole section of the graph, like a "patch" over the same sections. Any ideas?
The trouble is that you are only providing the y-axis data to the plot function. By default, this means all data is plotted on the 1:numel(y) x locations of your plot, where y is your y-axis data.
You have 2 options...
Option 1: Also provide x-axis data. You've already got the array x anyway!
figure; hold on;
plot(x, consumer(:,9));
plot(x(idx), consumer(idx,9), 'r.');
Aside: I'm slightly confused why you create idx. If errors is as you describe it (indexes of the array) then you should just be able to use consumer(errors,9).
Option 2: Make all data which you don't want to appear equal to NaN. Because of the way you're loading your error indices in, this is less quick and easy. Basically you'd copy consumer(:,9) into a new variable, and index all undesirable points to set them equal to NaN.
This method has the benefit of breaking up discontinuous sections too.
y = consumer(:,9); % copy your y data before changes
idx = ~ismember(x, errors); % get the indices you *don't* want to re-plot
y(idx) = NaN; % Set equal to NaN so they aren't plotted
figure; hold on;
plot(x, consumer(:,9));
plot(x, y, 'r'); % Plot all points, NaNs won't show
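For the "patch" idea mentioned in the question, one possible sketch is to group the error indices into contiguous runs and shade each run as a semi-transparent rectangle behind the line. The signal and the error indices below are invented for illustration, and the names mask, runStart and runEnd are hypothetical.

```matlab
% Hypothetical signal and error indices (stand-ins for consumer(:,9) and errors).
y = cumsum(randn(200, 1));
errors = [20:35, 90:110, 150];

% Logical mask of the flagged samples.
mask = false(size(y));
mask(errors) = true;

% A +1 step marks the start of a run, a -1 step marks one past its end.
d = diff([0; double(mask); 0]);
runStart = find(d == 1);
runEnd = find(d == -1) - 1;

figure; hold on;
yl = [min(y) max(y)];                   % Vertical extent of the shading.
for k = 1:numel(runStart)
    xs = [runStart(k)-0.5, runEnd(k)+0.5];
    patch([xs(1) xs(2) xs(2) xs(1)], [yl(1) yl(1) yl(2) yl(2)], 'r', ...
        'FaceAlpha', 0.2, 'EdgeColor', 'none');
end
plot(y, 'b');                           % Draw the data on top of the shading.
hold off;
```

A run that contains a single sample (index 150 here) still gets a visible strip, whereas in the NaN approach an isolated point between NaNs draws no line segment unless a marker is used.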

MATLAB: combining and normalizing histograms with different sample sizes

I have four sets of data, the distribution of which I would like to represent in MATLAB in one figure. Current code is:
[n1,x1]=hist([dataset1{:}]);
[n2,x2]=hist([dataset2{:}]);
[n3,x3]=hist([dataset3{:}]);
[n4,x4]=hist([dataset4{:}]);
bar(x1,n1,'hist');
hold on; h1=bar(x1,n1,'hist'); set(h1,'facecolor','g')
hold on; h2=bar(x2,n2,'hist'); set(h2,'facecolor','g')
hold on; h3=bar(x3,n3,'hist'); set(h3,'facecolor','g')
hold on; h4=bar(x4,n4,'hist'); set(h4,'facecolor','g')
hold off
My issue is that each group has a different sample size: dataset1 has an n of 69, dataset2 has an n of 23, and dataset3 and dataset4 have n's of 10. So how do I normalize the distributions when representing these four groups together?
Is there some way to, for example, divide the counts in each bin by the sample size of that group?
You can normalize your histograms by dividing by the total number of elements:
[n1,x1] = histcounts(randn(69,1));
[n2,x2] = histcounts(randn(23,1));
[n3,x3] = histcounts(randn(10,1));
[n4,x4] = histcounts(randn(10,1));
hold on
bar(x4(1:end-1),n4./sum(n4),'histc');
bar(x3(1:end-1),n3./sum(n3),'histc');
bar(x2(1:end-1),n2./sum(n2),'histc');
bar(x1(1:end-1),n1./sum(n1),'histc');
hold off
ax = gca;
set(ax.Children,{'FaceColor'},mat2cell(lines(4),ones(4,1),3))
set(ax.Children,{'FaceAlpha'},repmat({0.7},4,1))
However, as you can see above, there are a few more things you can do to make your code simpler and shorter:
You only need to call hold on once.
Instead of collecting all the bar handles, use the axes handle.
Plot the bars in ascending order of the number of elements in each dataset, so all histograms will be clearly visible.
With the axes handle, set all properties in one command.
As a side note, it's better to use histcounts than hist.
Here is the result:
EDIT:
If you want to also plot the pdf line from histfit, then you can save it first, and then plot it normalized:
dataset = {randn(69,1),randn(23,1),randn(10,1),randn(10,1)};
fits = zeros(100,2,numel(dataset));
hold on
for k = numel(dataset):-1:1
total = numel(dataset{k}); % for normalizing
f = histfit(dataset{k}); % draw the histogram and fit
% collect the curve data and normalize it:
fits(:,:,k) = [f(2).XData; f(2).YData./total].';
x = f(1).XData; % collect the bar positions
n = f(1).YData; % collect the bar counts
f.delete % delete the histogram and the fit
bar(x,n./total,'histc'); % plot the bar
end
ax = gca; % get the axis handle
% set all color and transparency for the bars:
set(ax.Children,{'FaceColor'},mat2cell(lines(4),ones(4,1),3))
set(ax.Children,{'FaceAlpha'},repmat({0.7},4,1))
% plot all the curves:
plot(squeeze(fits(:,1,:)),squeeze(fits(:,2,:)),'LineWidth',3)
hold off
Again, there are some other improvements you can introduce to your code:
Put everything in a loop to make things easier to change later.
Collect all the curves data to one variable so you can plot them all together very easily.
The new result is:
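As a side note on the normalization step itself: histcounts can also do the division for you through its 'Normalization' option. The sketch below compares that with dividing the counts by hand, using a made-up sample and assumed bin edges that are wide enough to contain every sample.

```matlab
data = randn(69, 1);     % Hypothetical sample, same size as dataset1.
edges = -10:0.5:10;      % Assumed bin edges covering all samples.

% Manual normalization: counts divided by the number of samples.
counts = histcounts(data, edges);
pManual = counts ./ numel(data);

% Same result from the built-in option.
pBuiltin = histcounts(data, edges, 'Normalization', 'probability');

figure
bar(edges(1:end-1), pBuiltin, 'histc');
ylabel('Fraction of samples');
```

Using the same explicit edges for every dataset also makes the bars of the four groups line up, which the automatic binning does not guarantee.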

ploting 3d graph in matlab?

I am currently a beginner, and I am using MATLAB for data analysis. I have a text file whose first row is formatted as follows:
time;wave height 1;wave height 2;.......
The columns go up to wave height 19, and there are 4000 rows in total.
The first column is time in seconds. From the 2nd column onwards, the data are wave elevations in meters. I would like MATLAB to plot a 3D graph with time on the x-axis, wave elevation on the y-axis, and the wave elevation that corresponds to each wave height number from 1 to 19. For example, if column 2, row 10 holds, say, 8 m, that value corresponds to wave height 1 at the time given in column 1, row 10.
I have tried the following:
clear;
filename='abc.daf';
path='C:\D';
a=dlmread([path '\' filename],' ', 2, 1);
[nrows,ncols]=size(a);
t=a(1:nrows,1);%define t from text file
for i=(1:20),
j=(2:21);
end
wi=a(:,j);
for k=(2:4000),
l=k;
end
r=a(l,:);
But every time I try to plot them, the for loop for wi works fine, but with r=a(l,:); the plot only gives me the data for the last time step, whereas I want all the data in the file to be plotted.
Is there a way I can do that? I am sorry if this is a bit confusing, but I will be very thankful if anyone can help me out.
Thank you!!!!!!!!!!
Once you load your data as you do in your code above, your variable a should be a 4000-by-20 array. You could then create a 3-D plot in a couple of different ways. One is a 3-D line plot using the function PLOT3, plotting one line for each column of wave elevation data:
t = a(:,1); %# Your time vector
for i = 2:20 %# Loop over remaining columns
plot3(t,(i-1).*ones(4000,1),a(:,i)); %# Plot one column
hold on; %# Continue plotting to the same axes
end
xlabel('Time'); %# Time on the x-axis
ylabel('Wave number'); %# Wave number (1-19) on y-axis
zlabel('Wave elevation'); %# Elevation on z-axis
Another way to plot your data in 3-D is to make a mesh or surface plot, using the functions MESH or SURF, respectively. Here's an example:
h = surf(a(:,1),1:19,a(:,2:20)'); %# Plot a colored surface
set(h,'EdgeColor','none'); %# Turn off edge coloring (easier to see surface)
xlabel('Time'); %# Time on the x-axis
ylabel('Wave number'); %# Wave number (1-19) on y-axis
zlabel('Wave elevation'); %# Elevation on z-axis
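Both plots above rely on a being a numeric matrix with time in its first column. Since the file is described as semicolon-separated with one header row, loading it might look like the sketch below. It writes a tiny made-up file first so that it runs on its own; the file contents and the names fname and waves are hypothetical. Note that the row and column offsets of dlmread are zero-based, so the (2, 1) used in the question starts reading at the third row and the second column, which would skip the time column.

```matlab
% Write a tiny hypothetical file in the described layout (header row, ';' separated).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'time;wave height 1;wave height 2\n');
fprintf(fid, '%g;%g;%g\n', [0 0.1 0.5; 1 0.3 0.4; 2 0.2 0.6]');  % One row per time step.
fclose(fid);

% Offsets are zero-based: skip 1 header row, start at the first column.
a = dlmread(fname, ';', 1, 0);
delete(fname);                      % Clean up the temporary file.

t = a(:, 1);                        % Time vector.
waves = a(:, 2:end);                % One column per wave height.

figure
surf(t, 1:size(waves, 2), waves');  % Time on x, wave number on y, elevation on z.
xlabel('Time'); ylabel('Wave number'); zlabel('Wave elevation');
```

In newer MATLAB releases the documentation steers users toward readmatrix instead of dlmread; which one is available depends on your version.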
I don't quite understand what your function does, for example, I do not see any plot command.
Here's how I'd try to make a 3D plot according to your specs:
%# Create some data - time from 0 to 2pi, ten sets of data with frequency 1 through 10.
%# You would just load A instead (I use uppercase just so I know that A is a 2D array,
%# rather than a vector)
x = linspace(0,2*pi,100)'; %# linspace makes equally spaced points
w = 1:10;
[xx,ww]=ndgrid(x,w); %# prepare data for easy calculation of matrix A
y = ww.*sin(xx.*ww);
A = [x,y]; %# A is [time,data]
%# find size of A
[nRows,nCols] = size(A);
%# create a figure, loop through the columns 2:end of A to plot
colors = hsv(10);
figure,
hold on,
for i=1:nCols-1,
%# plot time vs waveIdx vs wave height
plot3(A(:,1),i*ones(nRows,1),A(:,1+i),'Color',colors(i,:)),
end
%# set a reasonable 3D view
view(45,60)
%# for clarity, label axes
xlabel('time')
ylabel('wave index')
zlabel('wave height')
Or, you could try gnuplot. Fast, free and relatively easy to use. I use it to generate heat maps for datasets in the millions of rows.