How to normalize a histogram - MATLAB

I am creating an exponential distribution using the inverse method, and I want to normalize the histogram. How can I do it?
This is my code:
N=100;
Lambda=2;
r=rand(N,1);
X=-log(1-r)/Lambda;
hist(X), colormap(bone);
t = 0:0.01:5;
pdf=Lambda*exp(-Lambda*t);
hold on, plot(t,pdf,'LineWidth',2)

The histogram should be normalized to unit area so that it can be compared with the theoretical pdf. To normalize to unit area you need to divide by the number of samples and by the bin width:
N = 100;
Lambda=2;
r = rand(N,1);
X = -log(1-r)/Lambda;
[hy, hx] = hist(X); %// get histogram values
hy = hy/numel(X)/(hx(2)-hx(1)); %//normalize histogram
bar(hx, hy) %// plot histogram
t = 0:0.01:5;
pdf = Lambda*exp(-Lambda*t);
hold on, plot(t,pdf,'LineWidth',2) %// plot pdf
Or use the new histogram function (introduced in R2014b), which automatically normalizes according to the specified normalization option:
N = 100;
Lambda=2;
r = rand(N,1);
X = -log(1-r)/Lambda;
histogram(X, 'Normalization', 'pdf') %// plot normalized histogram
t = 0:0.01:5;
pdf = Lambda*exp(-Lambda*t);
hold on, plot(t,pdf,'LineWidth',2) %// plot pdf
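A quick way to confirm either normalization is to check that the bar areas sum to one. The following is only a sketch: the fixed seed and the variable names areaHist and areaPdf are assumptions added for illustration, and histcounts requires R2014b or later.

```matlab
% Sketch: check numerically that the normalized histogram has unit area.
rng(0);                            % fixed seed (assumption, for repeatability)
N = 100;
Lambda = 2;
X = -log(1 - rand(N,1))/Lambda;    % inverse-method sampling, as above

% Manual normalization: divide by sample count and by bin width
[hy, hx] = hist(X);                % counts and centers of equal-width bins
binWidth = hx(2) - hx(1);
hyNorm = hy/numel(X)/binWidth;
areaHist = sum(hyNorm)*binWidth;   % total area of the bars

% histcounts returns pdf-normalized bin values directly
[p, edges] = histcounts(X, 'Normalization', 'pdf');
areaPdf = sum(p .* diff(edges));

fprintf('area (hist) = %.6f, area (histcounts) = %.6f\n', areaHist, areaPdf);
```

Both areas should come out as 1 up to rounding error, which is what makes the bars comparable with the theoretical pdf.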

Related

Producing a histogram in Matlab without using hist

I am using histograms in Matlab to look at the distribution of some data from my experiments. I want to find the mean distribution (mean height of the bars) from a group of tests then produce an average histogram.
By using this code:
data = zeros(26,31);
for i = 1:length(files6)
x = csvread(files6(i).name);
x = x(1:end,:);
time = x(:,1);
variable = x(:,3);
thing(:,1) = x(:,1);
thing(:,2) = x(:,3);
figure()
binCenter = {0:tbinstep:tbinend 0:varbinstep:varbinend};
hist3(thing, 'Ctrs', binCenter, 'CDataMode','auto','FaceColor','interp');
colorbar
[N,C] = hist3(thing, 'Ctrs', binCenter);
data = data + N;
clearvars x time variable
end
avedata = data / i;
I can find the mean of N, which will be the Z value for the plot (histogram) I want, and I have X,Y (which are the same for all tests) from:
x = 0:tbinstep:tbinend;
y = 0:varbinstep:varbinend;
But how do I bring these together to make the graphical output that shows the average height of the bars? I can't use hist3 again, as that will just calculate the distribution of avedata.
AT THE RISK OF STARTING AN XY PROBLEM using bar3 has been suggested, but that asks the question "how do I go from 2 vectors and a matrix to 1 matrix bar3 can handle? I.e. how do I plot x(1), y(1), avedata(1,1) and so on for all the data points in avedata?"
TIA
Looking at the hist3 source code in MATLAB R2014b, it has its own plotting code that prepares the data and plots it using surf. Here is a function, heavily inspired by hist3, that reproduces the same output with your options ('CDataMode','auto','FaceColor','interp'). You can put this in a new file called hist3plot.m:
function [ h ] = hist3plot( N, C )
%HIST3PLOT Plot bivariate histogram bars from counts N and bin centers C.
% N and C have the same form as the outputs of [N,C] = hist3(...).
xBins = C{1};
yBins = C{2};
% Computing edges and width
nbins = [length(xBins), length(yBins)];
xEdges = [0.5*(3*xBins(1)-xBins(2)), 0.5*(xBins(2:end)+xBins(1:end-1)), 0.5*(3*xBins(end)-xBins(end-1))];
yEdges = [0.5*(3*yBins(1)-yBins(2)), 0.5*(yBins(2:end)+yBins(1:end-1)), 0.5*(3*yBins(end)-yBins(end-1))];
xWidth = xEdges(2:end)-xEdges(1:end-1);
yWidth = yEdges(2:end)-yEdges(1:end-1);
del = .001; % space between bars, relative to bar size
% Build x-coords for the eight corners of each bar.
xx = xEdges;
xx = [xx(1:nbins(1))+del*xWidth; xx(2:nbins(1)+1)-del*xWidth];
xx = [reshape(repmat(xx(:)',2,1),4,nbins(1)); NaN(1,nbins(1))];
xx = [repmat(xx(:),1,4) NaN(5*nbins(1),1)];
xx = repmat(xx,1,nbins(2));
% Build y-coords for the eight corners of each bar.
yy = yEdges;
yy = [yy(1:nbins(2))+del*yWidth; yy(2:nbins(2)+1)-del*yWidth];
yy = [reshape(repmat(yy(:)',2,1),4,nbins(2)); NaN(1,nbins(2))];
yy = [repmat(yy(:),1,4) NaN(5*nbins(2),1)];
yy = repmat(yy',nbins(1),1);
% Build z-coords for the eight corners of each bar.
zz = zeros(5*nbins(1), 5*nbins(2));
zz(5*(1:nbins(1))-3, 5*(1:nbins(2))-3) = N;
zz(5*(1:nbins(1))-3, 5*(1:nbins(2))-2) = N;
zz(5*(1:nbins(1))-2, 5*(1:nbins(2))-3) = N;
zz(5*(1:nbins(1))-2, 5*(1:nbins(2))-2) = N;
% Plot the bars in a light steel blue.
cc = repmat(cat(3,.75,.85,.95), [size(zz) 1]);
% Plot the surface
h = surf(xx, yy, zz, cc, 'CDataMode','auto','FaceColor','interp');
% Setting x-axis and y-axis limits
xlim([xBins(1)-xWidth(1) xBins(end)+xWidth(end)]) % x-axis limit
ylim([yBins(1)-yWidth(1) yBins(end)+yWidth(end)]) % y-axis limit
end
You can then call this function whenever you want to plot the outputs of Matlab's hist3 function. Note that it can handle non-uniform positioning of bins:
close all; clear all;
data = rand(10000,2);
xBins = [0,0.1,0.3,0.5,0.6,0.8,1];
yBins = [0,0.1,0.3,0.5,0.6,0.8,1];
figure()
hist3(data, {xBins yBins}, 'CDataMode','auto','FaceColor','interp')
title('Using hist3')
figure()
[N,C] = hist3(data, {xBins yBins});
hist3plot(N, C); % The function is called here
title('Using hist3plot')
Here is a comparison of the two outputs: [image not shown]
So if I understand your question and code correctly, you are plotting the distribution of multiple experiments' data as histograms, then you want to calculate the average shape of all the previous histograms.
I usually avoid giving approaches the asker isn't explicitly asking for, but for this one I must comment that it is a very strange thing to do. I've never heard of calculating the average shape of multiple histograms before. So just in case, you could simply append all your experiments' data into a single variable and plot a normalized histogram of that using histogram2. This code outputs a relative frequency histogram (other normalization methods are available).
% Append all data in a single matrix
x = [];
for i = 1:length(files6)
    x = [x; csvread(files6(i).name)];
end
% Plot normalized bivariate histogram
xEdges = 0:tbinstep:tbinend;
yEdges = 0:varbinstep:varbinend;
histogram2(x(:,1), x(:,3), xEdges, yEdges, 'Normalization', 'probability')
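Now, if you really are looking to draw the average shape of multiple histograms, then yes, use bar3. Since bar3 doesn't accept (x,y) coordinates as arguments, you can follow the other answer, or set the XTickLabel and YTickLabel properties of the axes afterwards to match your bin range.
... % data = yourAverageData;
% bar3 puts the rows of its input along the y-axis, so transpose
% to get the first dimension of data (the x bins) on the x-axis
bar3(data');
% bar3 returns surface handles, so set the labels on the axes instead
ax = gca;
xt = 0:tbinstep:tbinend;
yt = 0:varbinstep:varbinend;
set(ax, 'XTick', 1:numel(xt), 'XTickLabel', arrayfun(@num2str, xt, 'UniformOutput', false));
set(ax, 'YTick', 1:numel(yt), 'YTickLabel', arrayfun(@num2str, yt, 'UniformOutput', false));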
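The averaged and the pooled histograms are closely related: counts over identical bins simply add up, so the mean of the per-file count matrices equals the count matrix of the pooled data divided by the number of files. A minimal sketch with synthetic data follows; the three random data sets, the bin edges and all variable names are illustrative assumptions, and histcounts2 requires R2015b or later.

```matlab
% Sketch: mean of per-set bin counts vs. counts of the pooled data.
rng(1);                                  % fixed seed (assumption)
nSets = 3;                               % stand-in for the number of files
edges = 0:0.25:1;                        % same bin edges for x and y
sets = cell(1, nSets);
sumCounts = zeros(numel(edges) - 1);     % 4-by-4 accumulator

for k = 1:nSets
    sets{k} = rand(200, 2);              % synthetic stand-in for one file
    sumCounts = sumCounts + ...
        histcounts2(sets{k}(:,1), sets{k}(:,2), edges, edges);
end
aveCounts = sumCounts / nSets;           % "average histogram"

% Pool all sets and bin once
pooled = vertcat(sets{:});
pooledCounts = histcounts2(pooled(:,1), pooled(:,2), edges, edges);

% Largest difference between the two results
maxDiff = max(abs(aveCounts(:) - pooledCounts(:)/nSets));
disp(maxDiff)
```

So the two plots differ only by a constant scale factor, as long as every file is binned with the same edges.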

Convert large xyz file into gridded data (Matlab)

I have a large XYZ file (300276x3, this file includes x and y coordinates (not lat/lon, but polar stereographic) and elevation z) and I'm wondering if it would be possible to convert this into a gridded dataset (n x m matrix). The xyz file can be downloaded from:
https://wetransfer.com/downloads/4ae4ce51072dceef93486314d161509920191021213532/48e4ee68c17269bd6f7a72c1384b3c9a20191021213532/60b04d
and imported in matlab by:
XYZ = importdata('AIS_SEC.xyz');
I tried:
X= XYZ(:,1);
Y= XYZ(:,2);
Z= XYZ(:,3);
xr = sort(unique(X));
yr = sort(unique(Y));
gRho = zeros(length(yr),length(xr));
gRho = griddata(X,Y,Z,xr,yr')
imagesc(gRho)
Requested 300276x300276 (671.8GB) array exceeds maximum array size preference. Creation of arrays
greater than this limit may take a long time and cause MATLAB to become unresponsive. See array size
limit or preference panel for more information.
I tried:
% Get coordinate vectors
x = unique(XYZ(:,1)) ;
y = unique(XYZ(:,2)) ;
% dimensions of the data
nx = length(x) ;
ny = length(y) ;
% Frame matrix of grid
H = reshape(XYZ(:,3),[ny,nx]) ;
% flip matrix to adjust for plot
H = flipud(H) ;
% Transpose the matrix
H = H' ; % Check if is required
surf(x,y,H) ;
Error using reshape
To RESHAPE the number of elements must not change.
I can now plot the nx3 file with scatter3 (see image)
scatter3(XYZ(:,1),XYZ(:,2),XYZ(:,3),2,XYZ(:,3)) ;
colorbar
But I'd like to do it with imagesc. Hence, I would like to convert the n x 3 file into an n x m matrix (in raster/gridded format), and as an extra I would like it as a GeoTIFF file for use in QGIS.
Thanks!
You were almost there. Judging by the array-size message, unique(X) and unique(Y) each seem to return 300276 values, probably because the coordinates are slightly noisy, so griddata tries to build a 300276x300276 grid.
So instead of calling griddata with these large X and Y vectors, you can define new, smaller ones on the domain you need:
% make some sample data
N = 1000;
xv = linspace(-10,10,N);
yv = linspace(-10,10,N);
[XV,YV] = meshgrid(xv,yv);
ZV = XV.^2 + YV.^2;
% make into long vectors:
X = XV(:);
Y = YV(:);
Z = ZV(:);
% make x and y vector to interpolate z
N = 50; % size of new grid
xv = linspace(min(X), max(X), N);
yv = linspace(min(Y), max(Y), N);
[XV,YV] = meshgrid(xv,yv);
% use griddata to find right Z for each x,y pair
ZV_grid = griddata(X,Y,Z,XV,YV);
% look at result
figure();
subplot(211)
imagesc(ZV);
subplot(212);
imagesc(ZV_grid)
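If the points do sit on a regular grid and the reshape attempt only failed because some grid nodes are missing, interpolation is not needed at all: each point can be placed directly by index. This is a different technique from the griddata approach above, and it rests on an assumption the question does not confirm (exactly repeated coordinate values). The synthetic data and variable names below are illustrative only.

```matlab
% Sketch: place scattered points from an incomplete regular grid into a matrix.
% Synthetic stand-in for the XYZ file: a 4-by-6 grid with three nodes removed
[XG, YG] = meshgrid(0:10:50, 0:10:30);
ZG = XG + 2*YG;
keep = true(size(ZG));
keep([3 10 17]) = false;                 % simulate missing grid nodes
XYZ = [XG(keep), YG(keep), ZG(keep)];

% Map every coordinate to its column/row index on the grid
[xu, ~, ix] = unique(XYZ(:,1));          % sorted unique x values
[yu, ~, iy] = unique(XYZ(:,2));          % sorted unique y values

% Fill a NaN matrix so that missing nodes stay empty
G = NaN(numel(yu), numel(xu));
G(sub2ind(size(G), iy, ix)) = XYZ(:,3);

imagesc(xu, yu, G); axis xy; colorbar
```

With noisy coordinates, where unique returns one value per point, this does not apply and the interpolation above is the way to go.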

Fitting a density (distribution) to a histogram in Matlab

I have a matrix M (m x 2). The first column holds my bins and the second one the frequency associated with each bin. I want to fit a smooth curve to this histogram in matlab, but most of what I have tried (like ksdensity) needs the underlying data, which I don't have.
Are there any functions that can take the bins and their frequencies and give me a smooth curve of frequency against bin?
Here's a hack that should work for you: generate a sample from your histogram, then run ksdensity on the sample.
rng(42) % seed RNG to make reproducible
% make example histogram
N = 1e3;
bins = -5:5;
counts = round(rand(size(bins))*N);
M = [bins' counts'];
figure
hold on
bar(M(:,1), M(:,2));
% draw a sample from it
sampleCell = arrayfun( @(i) repmat(M(i,1), M(i,2), 1), 1:size(M,1), 'uniformoutput', false )';
sample = cat(1, sampleCell{:});
[f, x] = ksdensity(sample);
plot(x, f*sum(M(:,2)));
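The replication step can also be skipped by placing a kernel at each bin center and weighting it by the bin's relative frequency. Below is a minimal sketch with a Gaussian kernel written out by hand; the bandwidth h = 1, the example counts and the evaluation grid are assumptions, and since ksdensity picks its own bandwidth the two curves will not match exactly.

```matlab
% Sketch: weighted Gaussian kernel density estimate from bins and counts.
bins   = (-5:5)';                                   % bin centers
counts = [3 10 25 60 120 150 110 70 30 12 4]';      % example frequencies
M = [bins counts];

h = 1;                                   % kernel bandwidth (assumption)
x = linspace(-10, 10, 400);              % evaluation grid
w = M(:,2) / sum(M(:,2));                % relative frequency of each bin

% Sum of Gaussian kernels, one per bin, weighted by relative frequency
f = zeros(size(x));
for k = 1:size(M,1)
    f = f + w(k) * exp(-0.5*((x - M(k,1))/h).^2) / (h*sqrt(2*pi));
end

figure
bar(M(:,1), M(:,2));
hold on
plot(x, f*sum(M(:,2)), 'LineWidth', 2);  % rescale density to counts (bin width 1)
hold off

area = trapz(x, f);                      % should be close to 1
```

A smaller h follows the bars more closely, while a larger h gives a smoother curve.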

Fitting a probability distribution function to the histogram of each bin's data using Matlab

I have x data, which I binned, and I created a histogram for each bin. Now I want to fit a probability distribution function in each bin so that I can see the histogram and the probability distribution function on the same graph. Here 'X' is the horizontal-axis data and 'Y' is the vertical-axis data of the same dataset. I have written the code below:
X = load('data1'); Y = load('data2');
topEdge = 10; % upper limit
botEdge = 0;
numBins = 20;
binEdges = linspace(botEdge, topEdge, numBins+1);
[h,whichBin] = histc(X, binEdges);
% Histogram plot of each bin
for i = 1:numBins
flagBinMembers = (whichBin == i); %Creates vector of the indices of the data entries that are in bin i
BinMean(i) = mean(power_ref(flagBinMembers)); %Calculate mean value in this bin
BinStd(i) = std(power_ref(flagBinMembers));
x = power_ref(flagBinMembers) - BinMean(i);
mu = power_ref(flagBinMembers)- BinMean(i); sigma =power_ref(flagBinMembers)- BinStd(i);
figure();
histogram(x, 'Normalization', 'pdf')
hold on;
Y = normpdf(x,mu,sigma);
plot(x,Y);
hold off
end
After running this code, the fitted pdf and the histogram do not line up on the same graph. The histogram is correct, but the pdf is not. Can anyone suggest a fix? Thank you for your patience in reading this!
This is the output I get: [images not shown: output of the first bin, output of the second bin]
As you can see, the histogram and the pdf do not match. Where am I making a mistake?
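One likely cause, judging only from the code above, is that mu and sigma are computed as vectors (the data minus a constant) rather than as the scalar mean and standard deviation of the data in the bin, and that the pdf is evaluated at the unsorted data points. The sketch below shows the intended construction for a single bin. The synthetic data stand in for power_ref(flagBinMembers), and the normal density is written out by hand so that no toolbox is assumed.

```matlab
% Sketch: overlay a fitted normal pdf on a pdf-normalized histogram (one bin).
rng(2);                                  % fixed seed (assumption)
x = 3 + 0.5*randn(500, 1);               % synthetic stand-in for one bin's data
x = x - mean(x);                         % centered, as in the question

mu    = mean(x);                         % scalar location parameter
sigma = std(x);                          % scalar scale parameter

% Evaluate the density on a sorted, regular grid so the line plots cleanly
xs = linspace(min(x), max(x), 200);
ys = exp(-0.5*((xs - mu)/sigma).^2) / (sigma*sqrt(2*pi));

figure
histogram(x, 'Normalization', 'pdf')
hold on
plot(xs, ys, 'LineWidth', 2)
hold off
```

Inside the loop from the question, the same three steps (scalar mu, scalar sigma, sorted evaluation grid) would replace the lines that define mu, sigma and Y.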

Non-uniform axis of imagesc() in Matlab

Question: is it possible to illustrate an image on non-uniform axis?
Details:
I need to illustrate a multidimensional timeseries as an image. But the time grid of this timeseries is very non-uniform. Here is an example:
m = 10;
n = 3;
t = sort(rand(m, 1)); % non-uniform time
values = randn(m, n); % some random values
The figure, plot(t, values); handles it well.
But imagesc() converts t into uniform time between t(1) and t(end) according to documentation:
imagesc(x,y,C) displays C as an image and specifies the bounds of the
x- and y-axis with vectors x and y.
Therefore, the command:
figure, imagesc(t, 1 : n, values'); colorbar;
illustrates the image on uniform time grid.
Edit: It's possible to re-sample the timeseries with higher uniform resolution. But my timeseries is already very large.
MATLAB has a pcolor function, which accepts non-uniform coordinate vectors:
m = 10;
n = 3;
t = sort(rand(m, 1)); % non-uniform time
values = randn(m, n); % some random values
figure
plot(t, values);
figure
pcolor(t, 1 : n, values');
colorbar;
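One thing to keep in mind is that pcolor treats the coordinates as cell corners, so the last row and column of the color matrix are not drawn. A sketch of one workaround is to pass bin edges instead of sample times and pad the matrix by one row and one column. Placing the edges midway between samples is an assumption, as are the names tEdges, yEdges and C.

```matlab
% Sketch: draw every sample as its own cell with pcolor.
m = 10;
n = 3;
t = sort(rand(m, 1));                    % non-uniform time
values = randn(m, n);                    % some random values

% Cell edges midway between samples, extrapolated at both ends
tEdges = [t(1) - (t(2) - t(1))/2; ...
          (t(1:end-1) + t(2:end))/2; ...
          t(end) + (t(end) - t(end-1))/2];
yEdges = 0.5:1:(n + 0.5);

% Pad with one extra row and column, which pcolor does not display
C = [values', NaN(n, 1); NaN(1, m + 1)];

figure
h = pcolor(tEdges, yEdges, C);
set(h, 'EdgeColor', 'none');
colorbar;
```

Each of the m-by-n values then occupies one rectangle whose width reflects the local time spacing.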
Try uimagesc from the File Exchange.
Solution
Try using surface for non-uniform spacing.
First, create a 3D xyz surface of the same size as your input data:
m = 10;
n = 3;
t = sort(rand(m, 1)); % non-uniform time
values = randn(m, n); % some random values
x = repmat(t,1,n);
y = repmat(1:n,m,1);
z = zeros(size(y));
Then, map your values to colors. There is a nice tool posted on the MathWorks File Exchange, real2rgb, that can do this for you:
cdata = real2rgb(values); % Where size(cdata) = [m n 3]
Lastly, plot the surface. You can even get fancy and set the transparency.
surface(x,y,z,cdata,'EdgeColor','none','FaceColor','texturemap',...
'CDataMapping','direct');
alpha(0.3)
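If real2rgb is not available, a truecolor array of the same size can be built with base MATLAB functions. This is a sketch only: it assumes a linear mapping onto the jet colormap, which is an illustrative choice and not necessarily what real2rgb does internally.

```matlab
% Sketch: convert a real-valued matrix to an m-by-n-by-3 truecolor array.
rng(3);                                  % fixed seed (assumption)
m = 10;
n = 3;
values = randn(m, n);                    % some random values

cmap = jet(256);                         % colormap choice is an assumption
vmin = min(values(:));
vmax = max(values(:));

% Scale linearly to integer colormap indices 1..256
idx = 1 + round((values - vmin) / (vmax - vmin) * (size(cmap, 1) - 1));

% Look up the RGB triplet for every element
cdata = ind2rgb(idx, cmap);              % size(cdata) = [m n 3]
```

The resulting cdata can be passed to surface in place of the real2rgb output.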