MATLAB: combining and normalizing histograms with different sample sizes

I have four sets of data, the distribution of which I would like to represent in MATLAB in one figure. Current code is:
[n1,x1]=hist([dataset1{:}]);
[n2,x2]=hist([dataset2{:}]);
[n3,x3]=hist([dataset3{:}]);
[n4,x4]=hist([dataset4{:}]);
bar(x1,n1,'hist');
hold on; h1=bar(x1,n1,'hist'); set(h1,'facecolor','g')
hold on; h2=bar(x2,n2,'hist'); set(h2,'facecolor','g')
hold on; h3=bar(x3,n3,'hist'); set(h3,'facecolor','g')
hold on; h4=bar(x4,n4,'hist'); set(h4,'facecolor','g')
hold off
My issue is that each group has a different sample size: dataset1 has an n of 69, dataset2 has an n of 23, and dataset3 and dataset4 have n's of 10. So how do I normalize the distributions when representing these four groups together?
Is there some way to, for example, divide the counts in each bin by the sample size of that group?

You can normalize your histograms by dividing by the total number of elements:
[n1,x1] = histcounts(randn(69,1));
[n2,x2] = histcounts(randn(23,1));
[n3,x3] = histcounts(randn(10,1));
[n4,x4] = histcounts(randn(10,1));
hold on
bar(x4(1:end-1),n4./sum(n4),'histc');
bar(x3(1:end-1),n3./sum(n3),'histc');
bar(x2(1:end-1),n2./sum(n2),'histc');
bar(x1(1:end-1),n1./sum(n1),'histc');
hold off
ax = gca;
set(ax.Children,{'FaceColor'},mat2cell(lines(4),ones(4,1),3))
set(ax.Children,{'FaceAlpha'},repmat({0.7},4,1))
As you can see above, there are also a few things you can do to make your code shorter and simpler:
You only need to call hold on once.
Instead of collecting all the bar handles, use the axes handle.
Plot the bars in ascending order of the number of elements in the dataset, so all histograms will be clearly visible.
With the axes handle, set each property for all bars in one command.
As a side note, it's better to use histcounts rather than hist.
Here is the result:
EDIT:
If you want to also plot the pdf line from histfit, then you can save it first, and then plot it normalized:
dataset = {randn(69,1),randn(23,1),randn(10,1),randn(10,1)};
fits = zeros(100,2,numel(dataset));
hold on
for k = numel(dataset):-1:1
total = numel(dataset{k}); % for normalizing
f = histfit(dataset{k}); % draw the histogram and fit
% collect the curve data and normalize it:
fits(:,:,k) = [f(2).XData; f(2).YData./total].';
x = f(1).XData; % collect the bar positions
n = f(1).YData; % collect the bar counts
f.delete % delete the histogram and the fit
bar(x,n./total,'histc'); % plot the bar
end
ax = gca; % get the axis handle
% set all color and transparency for the bars:
set(ax.Children,{'FaceColor'},mat2cell(lines(4),ones(4,1),3))
set(ax.Children,{'FaceAlpha'},repmat({0.7},4,1))
% plot all the curves:
plot(squeeze(fits(:,1,:)),squeeze(fits(:,2,:)),'LineWidth',3)
hold off
Again, there are some other improvements you can introduce to your code:
Put everything in a loop to make things easier to change later.
Collect the data of all the curves in one variable so you can plot them all together very easily.
The new result is:
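Separately from the approach above, a minimal sketch of the same normalization using the built-in 'Normalization','probability' option of histogram and histcounts, assuming a release that provides both functions. The random data, the fixed seed and the choice of 12 shared bins are illustrative assumptions, not part of the original question:

```matlab
rng(0);  % fixed seed so the sketch is repeatable
dataset = {randn(69,1), randn(23,1), randn(10,1), randn(10,1)};  % stand-in data
allData = vertcat(dataset{:});
edges = linspace(min(allData), max(allData), 13);  % 12 shared bins (arbitrary choice)

figure; hold on
for k = 1:numel(dataset)
    % each bar height is (count in bin) / (number of samples in this dataset)
    histogram(dataset{k}, edges, 'Normalization', 'probability', 'FaceAlpha', 0.7);
end
hold off
ylabel('Fraction of samples in bin')

% The same numbers without plotting; each vector sums to 1
p = cellfun(@(d) histcounts(d, edges, 'Normalization', 'probability'), ...
            dataset, 'UniformOutput', false);
```

Using one set of edges for all groups keeps the bars aligned, so the normalized heights can be compared bin by bin.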

Related

How to color multiple lines based on their value?

I produced a plot that contains 50 curves and each of them corresponds to a specific value of a parameter called "Jacobi constant", so I have 50 values of jacobi constant stored in array called jacobi_cst_L1:
3.000900891023230
3.000894276927840
3.000887643313580
3.000881028967010
3.000874419173230
3.000867791975870
3.000861196034850
3.000854592397690
3.000847948043080
3.000841330136040
3.000834723697250
3.000828099771820
3.000821489088600
3.000814922863360
3.000808265737810
3.000801695858850
3.000795067776960
3.000788475204760
3.000781845363950
3.000775192199620
3.000768609354090
3.000761928862980
3.000755335851910
3.000748750854930
3.000742084743060
3.000735532899990
3.000728906460450
3.000722309400740
3.000715644446600
3.000709016645110
3.000702431180730
3.000695791284050
3.000689196186970
3.000682547292110
3.000675958537960
3.000669315388860
3.000662738391370
3.000656116141060
3.000649560630930
3.000642857256680
3.000636330415510
3.000629657944820
3.000623060310100
3.000616425935580
3.000609870077710
3.000603171772120
3.000596554947660
3.000590018845460
3.000583342259840
3.000576748353570
I want to use a colormap to color my curves and then show, in a color bar at the side, the numerical values corresponding to the color of each orbit.
Taking my example image, I would like to add the array of constants to the color bar and then color each curve according to it.
% Family of 50 planar Lyapunov orbits around L1 in dimensionless unit
fig = figure;
for k1 = 1:(numel(files_L1_L2_Ly_prop)-2)
plot([Ly_orb_filt(1).prop(k1).orbits.x],[Ly_orb_filt(1).prop(k1).orbits.y],...
"Color",my_green*1.1); hold on %"Color",my_green*1.1
colorbar()
end
axis equal
% Plot L1 point
plot(Ly_orb_filt_sys_data(1).x,Ly_orb_filt_sys_data(1).y,'.',...
'color',[0,0,0],'MarkerFaceColor',my_green,'MarkerSize',10);
text(Ly_orb_filt_sys_data(1).x-0.00015,Ly_orb_filt_sys_data(1).y-0.0008,'L_{1}');
%Primary bodies plots
plot(AstroData.mu_SEM_sys -1,0,'.',...
'color',my_blue,'MarkerFaceColor',my_blue,'MarkerSize',20);
text(AstroData.mu_SEM_sys-1,0-0.001,'$Earth + Moon$','Interpreter',"latex");
grid on;
xlabel('$x$','interpreter','latex','fontsize',12);
ylabel('$y$','interpreter','latex','FontSize',12);
How can I color each line based on its Jacobi constant value?
You can use any colour map to produce a series of RGB triplets for the plotting routines to read (or create an m-by-3 matrix with elements between 0 and 1 yourself):
n = 10; % Plot 10 lines
x = 1:15;
colour_map = jet(n); % Get colours. parula, hsv, hot etc.
figure;
hold on
for ii = 1:n
% Plot each line individually
plot(x, x+ii, 'Color', colour_map(ii, :))
end
colorbar % Show the colour bar.
Which on R2007b produces:
Note that indexing into a colour map will produce linearly spaced colours, thus you'll need to either interpolate or calculate a lot to get the specific ones you need. Then you can (need to?) modify the resulting colour bar's labels by hand to reflect your input values. I'd simply use parula(50), treat its indices as linspace(jacobi(1), jacobi(end), 50) and then my_colour = interp1(linspace(jacobi(1), jacobi(end), 50), parula(50), jacobi).
So in your code, rather than using "Color",my_green*1.1 for each line, use "Color",my_colour(k1,:), where my_colour is whatever series of RGB triplets you have defined and k1 is your loop index.
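A minimal sketch of the interpolation idea, under some assumptions: jacobi is a made-up stand-in for jacobi_cst_L1, the circles stand in for the orbits, and the colour grid runs from the smallest to the largest value so that the line colours agree with the colour bar limits set through caxis (newer releases also offer clim for this):

```matlab
% Hypothetical stand-in for jacobi_cst_L1: 50 decreasing values
jacobi = linspace(3.00090, 3.00058, 50).';

cmap = parula(50);
% Values that the rows of cmap represent, from smallest to largest
grid_vals = linspace(min(jacobi), max(jacobi), size(cmap, 1));
% One RGB triplet per curve, interpolated column by column
my_colour = interp1(grid_vals, cmap, jacobi);

figure; hold on
t = linspace(0, 2*pi, 200);
for k1 = 1:numel(jacobi)
    % placeholder curves; replace with the orbit x/y data
    plot(k1*cos(t), k1*sin(t), 'Color', my_colour(k1, :));
end
axis equal

colormap(cmap);
caxis([min(jacobi) max(jacobi)]);  % make the colour bar span the data values
cb = colorbar;
ylabel(cb, 'Jacobi constant');
```

With the limits set this way, the colour bar tick labels show Jacobi constant values directly, so they do not have to be relabelled by hand.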

Scatter with line segments

I really like scatter()'s ability to automatically color points based on some vector of values, I just want to add colored lines between the points.
The plot in question has time on x-axis, monte-carlo number on y-axis, and then some measured value as the color vector (e.g. number of cars seen in a video frame).
Basically, each point is an update in the system. So calling scatter(time,monte_carlo_number,[],color_vec) plots the points at which there is an update in the system, with color representing some value. This is great, but I would like to add line segments that connect these points, each segment matching the color specified by color_vec.
Basic working example
% Create example data
data = table();
data.time = randsample(1:100, 1000, true)';
data.mc = randsample(1:50, 1000, true)'; % actual monte-carlo run number labels are sorted
data.color_value = randsample(1:10, 1000, true)';
% Create the scatter plot
scatter(data.time, data.mc, [] , data.color_value, 'filled')
colorbar('Ticks', unique(data.color_value))
% Always label your axes
xlabel('Time (s)')
ylabel('Monte-Carlo Run Number')
Below is a screenshot of what this code might produce. If color_value is the number of cars seen in a video frame, we can see each time this value is updated via the points. However, the plot would be easier for humans to read if there were lines connecting each point to the next with the correct color. This would show the viewer that the value persists in time until the next update.
Something like this? I changed the number of samples to 100, and it is already quite a mess, so I don't think this is going to help the viewer understand what's plotted.
% Create example data
data = table();
np = 100;
data.time = randsample(1:100, np, true)';
data.mc = randsample(1:50, np, true)'; % actual monte-carlo run number labels are sorted
data.color_value = randsample(1:10, np, true)';
vals = unique(data.color_value).';
cmap = parula(numel(vals));
colors = [];
for k = 1:numel(vals)
ind = find(data.color_value == vals(k));
data_sel{k} = sortrows(data(ind,:));
colors(k,:) = cmap(k,:);
end
figure(1); clf;
% Create the scatter plot
scatter(data.time, data.mc, [] , data.color_value, 'filled')
hold on
for k = 1:numel(vals)
plot(data_sel{k}.time, data_sel{k}.mc, 'Color',colors(k,:))
end
colorbar('Ticks', unique(data.color_value))
% Always label your axes
xlabel('Time (s)')
ylabel('Monte-Carlo Run Number')
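Another possible reading of the question is that, within each Monte-Carlo run, every update should be joined to the next update of the same run by a segment in the colour of the earlier point. A rough sketch under that assumption follows; the data are made up with randi (only 5 runs, to keep the picture readable) and the variable names are illustrative:

```matlab
rng(1);                       % repeatable example data
np   = 60;
time = randi(100, np, 1);     % update times
mc   = randi(5,   np, 1);     % Monte-Carlo run number
cval = randi(10,  np, 1);     % value shown as colour (integers 1..10)

cmap = parula(10);            % one colour per possible value
figure; hold on
scatter(time, mc, [], cval, 'filled')

runs = unique(mc);
nseg = 0;                     % number of segments drawn
for r = runs.'
    idx = find(mc == r);
    [t, order] = sort(time(idx));   % updates of this run in time order
    c = cval(idx(order));
    for s = 1:numel(t)-1
        % the value set at update s holds until update s+1
        plot(t(s:s+1), [r r], 'Color', cmap(c(s), :), 'LineWidth', 1.5);
        nseg = nseg + 1;
    end
end
hold off

colormap(cmap);
caxis([0.5 10.5]);            % value v falls exactly on colour v
colorbar('Ticks', 1:10)
xlabel('Time (s)')
ylabel('Monte-Carlo Run Number')
```

Drawing one line object per segment is simple but slow for many points; it is only meant to show the idea.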

Setting axis limit for plot

I want to set the limits of the X axis in this plot from 0 to 325. When I use xlim to set the limits (commented out in the code), it doesn't work properly: the entire structure of the plot changes. Any help will be appreciated.
figure
imagesc(transpose(all_area_for_visual));
colormap("jet")
colorbar('Ticks',0:3,'TickLabels',{'Home ','Field','Bad house','Good house'})
xlabel('Time (min)')
tickLocs = round(linspace(1,length(final_plot_mat_missing_part(2:end,1)),8));
timeVector = final_plot_mat_missing_part(2:end,1);
timeForTicks = (timeVector(tickLocs))./60;
xticks(tickLocs);
xticklabels(timeForTicks);
%xlim([0 325]);
ylabel('Car identity')
yticks(1:length(Ucolumnnames_fpm))
yticklabels([Ucolumnnames_fpm(1,:)])
If I understand you correctly, you want to plot only the part of the data in all_area_for_visual that lies within the first 325 positions along the x-axis. So you should first select the data, and then plot it:
% generate the vector of tick locations:
tickLocs = round(linspace(1,length(final_plot_mat_missing_part(2:end,1)),8));
% create an index vector (of logicals) that marks the ticks inside the plotted range:
validX = tickLocs<=325;
% plot only the relevant part of the data (assuming time runs along the rows of
% all_area_for_visual, which become the x-axis after the transpose):
lastX = min(325,size(all_area_for_visual,1));
imagesc(transpose(all_area_for_visual(1:lastX,:)));
% generate the correct ticks for the data that was plotted:
timeVector = final_plot_mat_missing_part(2:end,1);
timeForTicks = (timeVector(tickLocs(validX)))./60;
xticks(tickLocs(validX));
% here you continue with setting the labels, colormap and so on...
imagesc puts the data in little rectangles centered around the integers 1:width and 1:height by default. You can specify the x and y locations of each data point by adding two vectors to the call:
imagesc(x,y,transpose(all_area_for_visual));
where x and y are vectors with the locations along the x and y axes you want to place the data.
Note that xlim and xticks don’t change the location of the data, only the region of the axis shown, and the location of tick marks along the axis. With xticklabels you can change what is shown at each tick mark, so you can use that to “fake” the data locations, but the xlim setting still applies to the actual locations, not to the labels assigned to the ticks.
I think it is easier to plot the data in the right locations to start with. Here is an example:
% Fake your data, I'm making a small matrix here for illustration purposes
all_area_for_visual = min(floor(cumsum(rand(20,5)/2)),3);
times = linspace(0,500,20); % These are the locations along the time axis for each matrix element
car_id_names = [4,5,8,15,18]; % These are the labels to put along the y-axis
car_ids = 1:numel(car_id_names); % These are the locations to use along the y-axis
% Replicate your plot
figure
imagesc(times,car_ids,transpose(all_area_for_visual));
% ^^^ ^^^ NOTE! specifying locations
colormap("jet")
colorbar('Ticks',0:3,'TickLabels',{'Home ','Field','Bad house','Good house'})
xlabel('Time (min)')
ylabel('Car identity')
set(gca,'YTick',car_ids,'YTickLabel',car_id_names) % Combine YTICK and YTICKLABEL calls
% Now you can specify your limit, in actual time units (min)
xlim([0 325]);

Highlight specific section of graph in MATLAB

I wish to highlight/mark some parts of an array in a MATLAB plot. After some research (like here) I tried to hold the first plot, find the indexes to highlight, and then draw a new plot with only those points. The points are drawn, but they are all shifted to the beginning of the axis:
I'm currently using this code:
load consumer; % the main array to plot (157628x10 double) - data on column 9
load errors; % an array containing the error indexes (1x5590 double)
x = (1:size(consumer,1))'; % returns a (157628x1 double)
idx = (ismember(x,errors)); % returns a (157628x1 logical)
fig = plot(consumer(:,9));
hold on, plot(consumer(idx,9),'r.');
hold off
Another thing I would like to do is highlight whole sections of the graph, like a "patch" over the same sections. Any ideas?
The trouble is that you are only providing the y-axis data to the plot function. By default, this means all data is plotted on the 1:numel(y) x locations of your plot, where y is your y-axis data.
You have 2 options:
Option 1: Also provide x-axis data. You've already got the array x anyway!
figure; hold on;
plot(x, consumer(:,9));
plot(x(idx), consumer(idx,9), 'r.');
Aside: I'm slightly confused why you create idx. If errors is as you describe it (indexes of the array) then you should just be able to use consumer(errors,9).
Option 2: Make all data which you don't want to appear equal to NaN. Because of the way you're loading your error indices in, this is less quick and easy. Basically you'd copy consumer(:,9) into a new variable, and set all undesirable points equal to NaN.
This method has the benefit of breaking up discontinuous sections too.
y = consumer(:,9); % copy your y data before changes
idx = ~ismember(x, errors); % get the indices you *don't* want to re-plot
y(idx) = NaN; % Set equal to NaN so they aren't plotted
figure; hold on;
plot(x, consumer(:,9));
plot(x, y, 'r'); % Plot all points, NaNs won't show
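For the second part of the question (shading whole sections), one possible sketch is to group consecutive error indices into runs and draw a semi-transparent patch behind each run. The signal y and the index list errors below are made-up stand-ins for consumer(:,9) and the loaded error indices:

```matlab
y = cumsum(randn(200, 1));          % hypothetical signal
errors = [20:35, 90:110, 150];      % hypothetical error indices
x = (1:numel(y)).';

% Group consecutive indices into runs
e = sort(errors(:));
breaks = [true; diff(e) > 1];              % true where a new run starts
run_start = e(breaks);
run_end   = e([breaks(2:end); true]);      % last index of each run

figure; hold on
yl = [min(y) max(y)];
for k = 1:numel(run_start)
    xs = [run_start(k) - 0.5, run_end(k) + 0.5];
    % rectangle spanning the run, drawn before the line so the line stays on top
    patch(xs([1 2 2 1]), yl([1 1 2 2]), 'r', ...
          'FaceAlpha', 0.2, 'EdgeColor', 'none');
end
plot(x, y, 'b');
hold off
```

The half-sample padding only makes single-point runs visible; adjust or drop it as needed.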

How do I reach first and second plots from bode()

I know how to create Bode plots with the bode() function. If I want to overlay the frequency responses of two or more systems, I use
bode(sys1,sys2,...)
or
hold on
When I want to reach one of the plots, for instance to add a label with text(), it is easy to reach the second plot. It seems the figure's current axes always return to the second plot (the phase graph).
For example, if I try these lines:
G = tf([1],[1 6]); figure(1); bode(G); text(10,-20,'text');
G = tf([1],[1 6]); figure(2); bode(G); text(10,-20,'text');
when I return to the first figure, with figure(1), and try
figure(1); text(10,-20,'text')
the text is displayed in the second plot (the phase plot).
I also tried these lines:
P = bodeoptions; % Set phase visibility to off
P.PhaseVisible = 'off';
G = tf([1],[1 6]);
figure(1); bode(G,P); text(10,-20,'text');
figure(1); text(10,-20,'text');
As you can see, even if I turn off the phase plot visibility, the text is not displayed.
Essentially, my question is: how do I reach the first and second plots, one by one? I tried subplot(), but it is pretty clear this is not how MATLAB tracks these plots.
Thanks in advance.
It all comes down to getting to the upper plot, since after the bodeplot command the lower one is active. Intuitively one would want to call subplot(2,1,1), but this just creates a new blank plot on top of it. Therefore we should do something like this:
% First, define arbitrary transfer function G(s), domain ww
% and function we want to plot on magnitude plot.
s = tf('s');
G = 50 / ( s*(1.6*s+1)*(0.23*s+1) );
ww = logspace(0,3,5000);
y = 10.^(-2*log10(ww)+log10(150));
hand = figure; % create a handle to new figure
h = bodeplot(G,ww);
hold on;
children = get(hand, 'Children') % use this handle to obtain list of figure's children
% We see that children has 3 objects:
% 1) Context Menu 2) Axis object to Phase Plot 3) Axis object to Magnitude Plot
magChild = children(3); % Pick a handle to axes of magnitude in bode diagram.
% magChild = children(2) % This way you can add data to Phase Plot.
axes(magChild) % Make those axes current
loglog(ww,y,'r');
legend('transfer function','added curve')
You can get the magnitude and phase data separately for each system using:
[mag,phase] = bode(sys,w)
Now you can use subplot or plot to draw the diagram you want.
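A short sketch of that route, assuming the Control System Toolbox is available (as in the question). The frequency grid w and the handle names ax_mag and ax_ph are illustrative choices. Because the axes are created with subplot, each one has its own handle and can be addressed directly:

```matlab
G = tf(1, [1 6]);
w = logspace(-1, 3, 200);          % frequency grid in rad/s (arbitrary choice)
[mag, phase] = bode(G, w);         % returned as 1-by-1-by-N arrays
mag   = squeeze(mag);
phase = squeeze(phase);

figure
ax_mag = subplot(2, 1, 1);
semilogx(w, 20*log10(mag)); grid on
ylabel('Magnitude (dB)')
text(ax_mag, 10, -20, 'text')      % lands in the magnitude plot

ax_ph = subplot(2, 1, 2);
semilogx(w, phase); grid on
ylabel('Phase (deg)')
xlabel('Frequency (rad/s)')
text(ax_ph, 10, -20, 'text')       % lands in the phase plot
```

The trade-off is that the result is an ordinary pair of axes, without the interactive features of the plot produced by bode.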
The only solution I got to work takes the axes positions into account. It is not very clean, but it works.
Here is the code to select the magnitude plot:
ejes=findobj(get(gcf,'children'),'Type','axes','visible','on');
posicion=get(ejes,'pos');
tam=length(posicion);
for ii=1:tam
a=posicion{ii}(2);
vectorPos(ii)=a;
end
[valorMax,ind]=max(vectorPos); % min for choosing the phase plot
axes(ejes(ind))