Matlab Plotting Normal Distribution Probability Density Function - matlab

I am new to statistics. I have a discriminant function:
 
g(x) = ln p(x| w)+ lnP(w)
I know it has a normal distribution, and I know the mu and sigma values. How can I plot its pdf in MATLAB?
There is a related discussion: How to draw probability density function in MatLab? However, I don't want to use any MATLAB toolbox.

Use normpdf, or mvnpdf for a multivariate Normal distribution:
mu = 0;
sigma = 1;
xs = [-5:.1:5];
ys = normpdf(xs, mu, sigma);
clf;
plot(xs, ys);

MATLAB plots vectors of data, so you'll need to make an X-vector, and a Y-vector.
If I had a function, say, x^2, I might do:
x = -1:.01:1; %make the x-vector
y = x.^2; %square x
plot(x,y);
You know the function of the PDF (y = exp(-(x-mu).^2./(2*sigma^2)).*1/sqrt(2*pi*sigma^2) ), so all you have to do is make the x-vector, and plot away!

Based upon a comment from @kamaci, this answer gives a complete solution using @Pete's answer.
To avoid using a MATLAB toolbox, just plot the Normal probability density function (PDF) directly.
mu = 12.5; % mean
sigma = 3.75; % standard deviation
fh=@(x) exp(-((x-mu).^2)./(2*(sigma^2)))*(1/sqrt(2*pi*(sigma^2))); % PDF function
X = 0:.01:25;
p = plot(X,fh(X),'b-')

Related

Curve fitting using for loop for polynomial up to degree i?

I have this hard coded version which fits data to a curve for linear, quadratic and cubic polynomials:
for some data x and a function y
M=[x.^0 x.^1];
L=[x.^0 x.^1 x.^2];
linear = (M'*M)\(M'*y);
plot(x, linear(1)+linear(2)*x, ';linear;r');
deg2 = (L'*L)\(L'*y);
plot(x, deg2(1)+deg2(2)*x+deg2(3)*(x.*x), ';quadratic plot;b');
How can I turn this into a for loop that plots curves for polynomials of degree n? The part I'm stuck on is the plotting: how do I handle the growing number of coefficients inside the for loop?
what I have:
for i = 1:5 % say we're trying to plot curves up to degree 5 polynomials...
curr=x.^(0:i);
degI = (curr'*curr)\(curr'*y);
plot(x, ???) % what goes in here<-
end
If it is only the plotting, you can use the polyval function to evaluate polynomials of the desired degree by supplying a vector of coefficients (polyval expects them ordered from the highest power to the lowest):
% For example, some random coefficients for a 5th order polynomial
% degI = (curr'*curr)\(curr'*y) % Your case
degI = [3.2755 0.8131 0.5950 2.4918 4.7987 1.5464]; % for 5th order polynomial
x = linspace(-2, 2, 10000);
hold on
% Using polyval to loop over the degree of the polynomials
for i = 1:length(degI)
plot(x, polyval(degI(1:i), x))
end
This gives all the polynomials in one plot.
I believe this should answer your question exactly. You just have to be careful with the matrix dimensions.
for i = 1:5
curr=x.^(0:i);
degI = (curr'*curr)\(curr'*y);
plot(x, x.^(0:i)*degI)
end
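The two approaches above can be combined in one loop. The sketch below is only an illustration on synthetic data (the noisy cubic and the names maxDeg and coeffs are assumptions, not part of the question). The least-squares solve returns coefficients in ascending order of power, while polyval expects descending order, so the vector is flipped before it is evaluated.

```matlab
% Synthetic data (assumed for illustration): a noisy cubic, as column vectors
x = linspace(-2, 2, 50)';
y = 1 - 2*x + 0.5*x.^3 + 0.1*randn(size(x));

maxDeg = 5;
coeffs = cell(maxDeg, 1);          % fitted coefficients for each degree
plot(x, y, 'k.'); hold on
for n = 1:maxDeg
    A = bsxfun(@power, x, 0:n);    % design matrix with columns x.^0 ... x.^n
    w = A \ y;                     % least-squares fit, ascending powers
    coeffs{n} = w;
    plot(x, polyval(flipud(w), x));   % polyval wants descending powers
end
hold off
```

Using A \ y instead of (A'*A)\(A'*y) gives the same least-squares solution but avoids forming A'*A, which tends to be better conditioned for higher degrees.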

Multivariate Normal Distribution Matlab, probability area

I have 2 arrays: one with x-coordinates, the other with y-coordinates.
Both are normally distributed, as the result of a Monte Carlo simulation. I know how to find the sigma and mu for both arrays, and get a 95% confidence interval:
[mu,sigma]=normfit(x_array);
hist(x_array);
x=norminv([0.025 0.975],mu,sigma)
However, the two arrays are correlated with each other. To plot the probability distribution of the combined arrays, I use the multivariate normal distribution. In MATLAB this gives me:
[MuX,SigmaX]=normfit(x_array);
[MuY,SigmaY]=normfit(y_array);
mu = [MuX MuY];
Sigma=cov(x_array,y_array);
x1 = MuX-4*SigmaX:5:MuX+4*SigmaX; x2 = MuY-4*SigmaY:5:MuY+4*SigmaY;
[X1,X2] = meshgrid(x1,x2);
F = mvnpdf([X1(:) X2(:)],mu,Sigma);
F = reshape(F,length(x2),length(x1));
surf(x1,x2,F);
caxis([min(F(:))-.5*range(F(:)),max(F(:))]);
set(gca,'Ydir','reverse')
xlabel('x0-as'); ylabel('y0-as'); zlabel('Probability Density');
So far so good. Now I want to calculate the 95% probability area. I'm looking for a function like mndinv, analogous to norminv. However, such a function doesn't exist in MATLAB, which makes sense because there are endless possibilities... Does somebody have a tip about how to get a 95% probability area? Thanks in advance.
For the bivariate case you can add the ellipse whose area corresponds to NORMINV(95%). This ellipse is uniquely identified; for a proof, see the first source in the list of sources.
% Suppose you know the distribution params, or you got them from normfit()
mu = [3, 7];
sigma = [1, 2.5
2.5 9];
% X/Y values for plotting grid
x = linspace(mu(1)-3*sqrt(sigma(1)), mu(1)+3*sqrt(sigma(1)),100);
y = linspace(mu(2)-3*sqrt(sigma(end)), mu(2)+3*sqrt(sigma(end)),100);
% Z values
[X1,X2] = meshgrid(x,y);
Z = mvnpdf([X1(:) X2(:)],mu,sigma);
Z = reshape(Z,length(y),length(x));
% Plot
h = pcolor(x,y,Z);
set(h,'LineStyle','none')
hold on
% Add level set
alpha = 0.05;
r = sqrt(-2*log(alpha));
rho = sigma(2)/sqrt(sigma(1)*sigma(end));
M = [sqrt(sigma(1)) rho*sqrt(sigma(end))
0 sqrt(sigma(end)-sigma(end)*rho^2)];
theta = 0:0.1:2*pi;
f = bsxfun(@plus, r*[cos(theta)', sin(theta)']*M, mu);
plot(f(:,1), f(:,2),'--r')
Sources
https://upload.wikimedia.org/wikipedia/commons/a/a2/Cumulative_function_n_dimensional_Gaussians_12.2013.pdf
https://en.wikipedia.org/wiki/Multivariate_normal_distribution
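The radius r = sqrt(-2*log(alpha)) used for the level set can be checked by simulation. The following is a minimal sketch, assuming the same mu and covariance as in the ellipse code; the sample size N and the variable names are illustrative choices, and only base MATLAB functions are used. It draws samples via a Cholesky factor and counts how many fall inside the ellipse, using the squared Mahalanobis distance.

```matlab
% Distribution parameters (same values as in the ellipse example)
mu    = [3, 7];
Sigma = [1 2.5; 2.5 9];
alpha = 0.05;
r2    = -2*log(alpha);             % squared radius of the 95% ellipse

% Draw samples from N(mu, Sigma) without any toolbox
N = 200000;                        % sample size (assumed for illustration)
R = chol(Sigma);                   % upper triangular, R'*R = Sigma
X = bsxfun(@plus, randn(N, 2)*R, mu);

% Squared Mahalanobis distance of every sample from the mean
D  = bsxfun(@minus, X, mu);
d2 = sum((D / Sigma) .* D, 2);

% Fraction of samples inside the ellipse; expected to be close to 1 - alpha
coverage = mean(d2 <= r2);
```

If coverage comes out near 0.95, the dashed ellipse is the region the question asks for. The same test, d2 <= r2, can also be used to mask grid points of the surface plot.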
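To get the numerical value of F where the top part lies, you can use top5=prctile(F(:),95). This returns the value of F that separates the bottom 95% of the gridded density values from the top 5%. Note that this is a percentile of the values on the grid, not a region containing 95% of the probability mass.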
Then you can get just the top 5% with
Ftop=zeros(size(F));
Ftop=F>top5;
Ftop=Ftop.*F;
%// optional: Ftop(Ftop==0)=NaN;
surf(x1,x2,Ftop,'LineStyle','none');

calculating sum of two triangular random variables (Matlab)

I would like to calculate the sum of two triangular random variables,
P(x1+x2 < y)
Is there a faster way to implement the sum of two triangular random variables in Matlab?
EDIT: It seems there's possibly a much easier way, as shown in this minitab demonstration. So it's not impossible. It doesn't explain how the PDF was calculated, sadly. Still looking into how I can do this in matlab.
EDIT2: Following advice, I'm using conv function in Matlab to develop the PDF of the sum of two random variables:
clear all;
clc;
pd1 = makedist('Triangular','a',85,'b',90,'c',100);
pd2 = makedist('Triangular','a',90,'b',100,'c',110);
x = linspace(85,290,200);
x1 = linspace(85,100,200);
x2 = linspace(90,110,200);
pdf1 = pdf(pd1,x1);
pdf2 = pdf(pd2,x2);
z = median(diff(x))*conv(pdf1,pdf2,'same');
p1 = trapz(x1,pdf1) %probability P(x1<y)
p2 = trapz(x2,pdf2) %probability P(x2<y)
p12 = trapz(x,z) %probability P(x1+x2 <y)
hold on;
plot(x1,pdf1) %plot pdf of dist. x1
plot(x2,pdf2) %plot pdf of dist. x2
plot(x,z) %plot pdf of x1+x2
hold off;
However this code has two problems:
PDF of X1+X2 integrates to much higher than 1.
PDF of X1+X2 varies widely depending on the range of x. Intuitively, if the X1+X2 is larger than 210 (the sum of upper limits "c" of two individual triangular distributions, 100 + 110), shouldn't P(X1+X2 <210) equal to 1? Also, since the lower limits "a" is 85 and 90, P(X1+X2 <85) = 0?
The pdf of the sum of independent variables is the convolution of the pdfs of the variables. So you need to compute the convolution of two variables with triangular pdfs. A triangle is piecewise linear, so the convolution will be piecewise quadratic.
There are a few ways to go about it. If a numerical result is acceptable: discretize the pdfs and compute the convolution of the discretized pdfs. I believe there is a function conv in Matlab for that. If not, you can take the fast Fourier transform (via fft), compute the product point by point, then take the inverse transform (ifft if I remember correctly) since fft(convolution(f, g)) = fft(f) fft(g). You will need to be careful about normalization if you use either conv or fft.
If you must have an exact result, the convolution is just an integral, and if you're careful with the limits of integration, you can figure it out by hand. I don't know if the Matlab symbolic toolbox is available to you, and if so, I don't know if it can handle integrals of functions defined piecewise.
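The FFT route mentioned above might look like the following. This is a rough sketch, not a definitive implementation: the triangular pdf is written out by hand as an anonymous function (tri is a hypothetical helper, so no toolbox is needed), the grid spacing dx is an arbitrary choice, and the parameters are the ones from the question. Both inputs are zero-padded to the length of the full linear convolution, and the result is multiplied by dx so that it is a density.

```matlab
% Hand-written triangular pdf: a = lower limit, b = peak, c = upper limit
tri = @(x,a,b,c) 2*(x-a)/((c-a)*(b-a)) .* (x>=a & x<=b) + ...
                 2*(c-x)/((c-a)*(c-b)) .* (x>b  & x<=c);

dx = 0.01;                         % common grid spacing for both pdfs
x1 = 85:dx:100;
x2 = 90:dx:110;
f1 = tri(x1, 85, 90, 100);
f2 = tri(x2, 90, 100, 110);

% Zero-pad to the full convolution length, multiply spectra, transform back
n   = numel(x1) + numel(x2) - 1;
f12 = real(ifft(fft(f1, n) .* fft(f2, n))) * dx;   % pdf of X1 + X2
x12 = (x1(1) + x2(1)) + (0:n-1)*dx;                % support of the sum

total = trapz(x12, f12);           % integral of the pdf, expected close to 1
F12   = cumtrapz(x12, f12);        % running integral, approximates P(X1+X2 < y)
```

The factor dx is the normalization that is easy to forget: without it the result integrates to roughly 1/dx instead of 1. For vectors of this size conv(f1,f2)*dx gives the same values, so the FFT version is mainly of interest for much longer grids.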
Below is the proper implementation for future users. Many thanks to Robert Dodier for guidance.
clear all;
clc;
min1 = 85;
max1 = 100;
min2 = 90;
max2 = 110;
y = 210;
pd1 = makedist('Triangular','a',min1,'b',90,'c',max1);
pd2 = makedist('Triangular','a',min2,'b',100,'c',max2);
dx = 0.01; % to ensure constant spacing
x1 = min1:dx:max1; % Could include some of the region where
x2 = min2:dx:max2; % the pdf is 0, but we don't have to.
x12 = linspace(...
x1(1) + x2(1) , ...
x1(end) + x2(end) , ...
length(x1)+length(x2)-1);
[c,index] = min(abs(x12-y));
x_short = linspace(min1+min2,x12(index),index);
pdf1 = pdf(pd1,x1);
pdf2 = pdf(pd2,x2);
pdf12 = conv(pdf1,pdf2)*dx;
zz = pdf12(1:index);
zz(index) = 0;
p1 = trapz(x1,pdf1)
p2 = trapz(x2,pdf2)
p12 = trapz(x_short,zz)
plot(x1,pdf1,x2,pdf2,x12,pdf12)
hold on;
fill(x_short,zz,'blue') % plot x1+x2
hold off;

Given a covariance matrix, generate a Gaussian random variable in Matlab

Given a desired M x M covariance matrix, R, and a desired number of sample vectors, N, calculate an N x M matrix of Gaussian random vectors, X, in vanilla MATLAB (i.e. can't use r = mvnrnd(MU,SIGMA,cases)).
Not really sure how to tackle this, usually you need a covariance AND mean to generate a Gaussian random variable. I think sqrtm and chol could be useful.
If you have access to the MATLAB statistics toolbox you can type edit mvnrnd in MATLAB to see their solution.
[T p] = chol(sigma);
if m1 == c
mu = mu';
end
mu = mu(ones(cases,1),:);
r = randn(cases,c) * T + mu;
It feels almost like cheating to point this out, but editing MATLAB's source is very useful to understand things in general. You can also search for mvnrnd.m on google if you don't have the toolbox.
Example:
% Gaussian mean and covariance
d = 2; % number of dimensions
mu = rand(1,d);
sigma = rand(d,d); sigma = sigma*sigma';
% generate 100 samples from above distribution
num = 100;
X = mvnrnd(mu, sigma, num);
% plot samples (only for 2D case)
scatter(X(:,1), X(:,2), 'filled'), hold on
ezcontour(@(x,y) mvnpdf([x y], mu, sigma), xlim(), ylim())
title('X~N(\mu,\sigma)')
xlabel('X_1'), ylabel('X_2')
The above code uses functions from the Statistics toolbox (mvnrnd and mvnpdf). If you don't have access to it, consider these replacements (using the same concepts mentioned by others):
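mvnrnd = @(mu,S,num) bsxfun(@plus, randn(num,numel(mu))*chol(S), mu); % chol is base MATLAB (cholcov needs the Statistics toolbox); S must be positive definite
mvnpdf = @(x,mu,S) exp(-0.5*(x-mu)*(S\(x-mu)')) / sqrt((2*pi)^d*det(S)); % valid for a single row vector x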
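For the original question (only a covariance R and a sample count N are given), a minimal toolbox-free sketch could look like this. The matrix R and the value of N below are made-up example values, and a zero mean is assumed because the question does not specify one.

```matlab
% Example target covariance (assumed values; must be symmetric positive definite)
R = [4.0  1.2  0.0
     1.2  2.0 -0.5
     0.0 -0.5  1.0];
N = 100000;                % number of sample vectors (assumed)
M = size(R, 1);            % dimension of each sample

T = chol(R);               % upper triangular factor with T'*T = R
X = randn(N, M) * T;       % N x M samples, zero mean, covariance close to R

C = cov(X);                % sample covariance, for comparison with R
```

This works because the rows of randn(N,M) have identity covariance, so the rows of randn(N,M)*T have covariance T'*T = R. Replacing chol(R) with sqrtm(R) also works, since sqrtm(R) is symmetric and sqrtm(R)*sqrtm(R) = R. A nonzero mean can be added afterwards with bsxfun(@plus, X, mu).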

regression in matlab

I have this MATLAB code for regression with one independent variable, but what if I have two independent variables (x1 and x2)? How should I modify this polynomial regression code?
x = linspace(0,10,200)'; % independent variable
y = x + 1.5*sin(x) + randn(size(x,1),1); % dependent variable
A = [x.^0, x]; % construct a matrix of permutations
w = (A'*A)\(A'*y); % solve the normal equation
y2 = A*w; % restore the dependent variable
r = y-y2; % find the vector of regression residuals
plot(x, [y y2]);
MATLAB provides the polyfit function for polynomial regression. Have you tried that?
http://www.mathworks.com/help/techdoc/data_analysis/f1-8450.html
http://www.mathworks.com/help/toolbox/stats/bq_676m-2.html#bq_676m-3
But if you want to work out your own formulation, you should probably look at a textbook or some online resources on regression, e.g.
http://www.edwardtufte.com/tufte/dapp/DAPP3a.pdf
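For two independent variables, the same normal-equation idea carries over: only the design matrix changes. The following is a sketch under assumptions, not a definitive answer. The data are synthetic, and a full quadratic model in x1 and x2 (including the cross term) is chosen purely for illustration; columns can be dropped or added to match the model you actually want.

```matlab
% Synthetic data (assumed for illustration)
n  = 200;
x1 = linspace(0, 10, n)';          % first independent variable
x2 = 5*rand(n, 1);                 % second independent variable
y  = 2 + 0.5*x1 - 1.5*x2 + 0.3*x1.*x2 + 0.2*randn(n, 1);

% One column per term: 1, x1, x2, x1^2, x1*x2, x2^2
A  = [ones(n,1), x1, x2, x1.^2, x1.*x2, x2.^2];
w  = A \ y;                        % least-squares solution of A*w = y
y2 = A*w;                          % fitted values
r  = y - y2;                       % regression residuals
```

The entries of w line up with the columns of A, so w(5) is the estimated coefficient of the x1.*x2 term. Because there are two inputs, a scatter of y against y2, or plot3(x1, x2, y2, '.'), is a more natural check than plot(x, [y y2]).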