Empirical mean and variance plot in MATLAB with the normal distribution

I'm new to MATLAB programming. I need to write a script that generates a random sequence (x1, ..., xN) of size N following the normal distribution N(0, 1) and calculates the empirical mean mN and variance σN^2.
Then I should plot them.
This is my attempt:
function f = normal_distribution(n)
x =randn(n);
muem = 1./n .* (sum(x));
muem
%mean(muem)
vaem = 1./n .* (sum((x).^2));
vaem
hold on
plot(x,muem,'-')
grid on
plot(x,vaem,'*')
NB: these are the formulas I used: mN = (1/N) * sum(xi) and σN^2 = (1/N) * sum(xi^2).
I obtained a figure, but I don't know whether it is correct or not. Thanks for the help.

From your question, it seems what you want to do is calculate the mean and variance from a sample of size N (not an NxN matrix) drawn from a standard normal distribution. So you may want to use randn(n, 1) instead of randn(n). Also, as @ThP pointed out, it does not make sense to plot mean and variance vs. x. What you could do is calculate means and variances for increasing sample sizes n1, n2, ..., nm, and then plot sample size vs. mean or variance, to see them converge to 0 and 1. See the code below:
function [] = plotMnV(nIter)
means = zeros(nIter, 1);
vars = zeros(nIter, 1);
for pow = 1:nIter
n = 2^pow;
x =randn(n, 1);
means(pow) = 1./n * sum(x);
vars(pow) = 1./n * sum(x.^2);
end
plot(1:nIter, means, 'o-');
hold on;
plot(1:nIter, vars, '*-');
end
For example, plotMnV(20) gave me the plot below.

Related

How to do circular convolution between 2 functions with cconv?

I was asked to do circular convolution between two functions by sampling them, using the function cconv. A known result of this sort of convolution is: cconv(sin(x), sin(x)) == -pi*cos(x)
To test the above I did:
w = linspace(0,2*pi,1000);
l = linspace(0,2*pi,1999);
stem(l,cconv(sin(w),sin(w)))
but the result I got was:
which is absolutely not -pi*cos(x).
Can anybody please explain what is wrong with my code and how to fix it?
In the documentation of cconv it says that:
c = cconv(a,b,n) circularly convolves vectors a and b. n is the length of the resulting vector. If you omit n, it defaults to length(a)+length(b)-1. When n = length(a)+length(b)-1, the circular convolution is equivalent to the linear convolution computed with conv.
I believe that the reason for your problem is that you do not specify the 3rd input to cconv, which then selects the default value, which is not the right one for you. I have made an animation showing what happens when different values of n are chosen.
If you compare my result for n=200 to your plot you will see that the amplitude of your data is 10 times larger whereas the length of your linspace is 10 times bigger. This means that some normalization is needed, likely a multiplication by the linspace step.
Indeed, after proper scaling and choice of n we get the right result:
res = 100; % resolution
w = linspace(0,2*pi,res);
dx = diff(w(1:2)); % grid step
stem( linspace(0,2*pi,res), dx * cconv(sin(w),sin(w),res) );
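As a quick visual check (an addition to the original answer, reusing the w defined above), you can overlay the expected analytic result on the same axes:
hold on
plot(w, -pi*cos(w), 'r--') % expected analytic result: cconv(sin, sin) -> -pi*cos(x)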
This is the code I used for the animation:
hF = figure();
subplot(1,2,1); hS(1) = stem(1,cconv(1,1,1)); title('Autoscaling');
subplot(1,2,2); hS(2) = stem(1,cconv(1,1,1)); xlim([0,7]); ylim(50*[-1,1]); title('Constant limits');
w = linspace(0,2*pi,100);
for ind1 = 1:200
set(hS,'XData',linspace(0,2*pi,ind1));
set(hS,'YData',cconv(sin(w),sin(w),ind1));
suptitle("n = " + ind1);
drawnow
% export_fig(char("D:\BLABLA\F" + ind1 + ".png"),'-nocrop');
end

Matlab - Plot normal distribution with unknown mean that is normally distributed with known parameters

How can I use Matlab to plot a univariate normal distribution when its mean is unknown, but the mean is itself normally distributed with known mean and variance?
E.g. x ~ N(mean, 4) and mean ~ N(2, 8)
Using the law of total probability, one can write
pdf(x) = int(pdf(x | mean) * pdf(mean) dmean)
So, we can calculate it in Matlab as follows:
% define the constants
sigma_x = 4;
mu_mu = 2;
sigma_mu = 8;
% define the pdf of a normal distribution using the Symbolic Toolbox
% to be able to calculate the integral
syms x mu sigma
pdf(x, mu, sigma) = 1./sqrt(2*pi*sigma.^2) * exp(-(x-mu).^2/(2*sigma.^2));
% calculate the desired pdf
pdf_x(x) = int(pdf(x, mu, sigma_x) * pdf(mu, mu_mu, sigma_mu), mu, -Inf, Inf);
pdfCheck = int(pdf_x, x, -Inf, Inf) % should be one
% plot the desired pdf (green) and N(2, 4) as reference (red)
xs = -40:0.1:40;
figure
plot(xs, pdf(xs, mu_mu, sigma_x), 'r')
hold on
plot(xs, pdf_x(xs), 'g')
Note that I also checked that the integral of the calculated pdf is indeed equal to one, which is a necessary condition for being a pdf.
The green plot is the requested pdf. The red plot is added as reference and represents the pdf for a constant mean (equal to the average mean).
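As an extra cross-check (an added note, not part of the original answer): mixing a normal over a normally distributed mean gives another normal, N(mu_mu, sigma_x^2 + sigma_mu^2), so the result can be compared against the closed form (assuming the Statistics Toolbox's normpdf is available):
% Closed-form comparison: x ~ N(mu_mu, sigma_x^2 + sigma_mu^2)
sigma_total = sqrt(sigma_x^2 + sigma_mu^2); % sqrt(16 + 64) = sqrt(80)
plot(xs, normpdf(xs, mu_mu, sigma_total), 'b--')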

Transform random draws from a bivariate normal into the unit square in Matlab

I have an nx2 matrix r in Matlab reporting n draws from a bivariate normal distribution
m1=0.3;
m2=-m1;
v1=0.2;
v2=2;
n=10000;
rho=0.5;
mu = [m1, m2];
sigma = [v1,rho*sqrt(v1)*sqrt(v2);rho*sqrt(v1)*sqrt(v2),v2];
r = mvnrnd(mu,sigma,n);
I want to normalise these draws to the unit square [0,1]^2
First option
rmax1=max(r(:,1));
rmin1=min(r(:,1));
rmax2=max(r(:,2));
rmin2=min(r(:,2));
rnew=zeros(n,2);
for i=1:n
rnew(i,1)=(r(i,1)-rmin1)/(rmax1-rmin1);
rnew(i,2)=(r(i,2)-rmin2)/(rmax2-rmin2);
end
Second option
rmin1, rmax1, rmin2 and rmax2 may be quite variable due to the sampling process. An alternative is applying the 68–95–99.7 rule, and I am asking for some help on how to generalise it to a bivariate normal (in particular Step 1 below). Here's my idea:
%Step 1: transform the draws in r into draws from a bivariate normal
%with variance-covariance matrix equal to the 2x2 identity matrix
%and mean equal to mu
%How?
%Let t be the transformed vector
%Step 2: apply the 68–95–99.7 rule to each column of t
tmax1=mu(1)+3*1;
tmin1=mu(1)-3*1;
tmax2=mu(2)+3*1;
tmin2=mu(2)-3*1;
tnew=zeros(n,2);
for i=1:n
tnew(i,1)=(t(i,1)-tmin1)/(tmax1-tmin1);
tnew(i,2)=(t(i,2)-tmin2)/(tmax2-tmin2);
end
%Step 3: discard potential values (very few) outside [0,1]
In your case the x and y coordinates of the random vector are correlated, so it's not just a transformation in x and in y independently. You first need to rotate your samples so that x and y become uncorrelated (then the covariance matrix will be diagonal; you don't need it to be the identity, since you normalize later anyway). Then you can apply the transformation you call "second option" to the new x and y independently. In short, you need to diagonalize the covariance matrix.
As a side note, your code adds/subtracts 3 times 1, instead of 3 times the standard deviation. Also, you can avoid the for loop by using (e.g.) Matlab's bsxfun, which applies an operation between a matrix and a vector:
t = bsxfun(@minus,r,mean(r,1)); % center the data
[v, d] = eig(sigma); % find the directions for projection
t = t * v; % the projected data is uncorrelated
sigma_new = sqrt(diag(d)); % that's the std in the new coordinates
% now transform each coordinate independently
tmax1 = 3*sigma_new(1);
tmin1 = -3*sigma_new(1);
tmax2 = 3*sigma_new(2);
tmin2 = -3*sigma_new(2);
tnew = bsxfun(@minus, t, [tmin1, tmin2]);
tnew = bsxfun(@rdivide, tnew, [tmax1-tmin1, tmax2-tmin2]);
You still need to discard the few samples which are out of [0,1], as you wrote.
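For completeness, a minimal sketch (added here, using the tnew variable from the snippet above) of that final discard step:
inSquare = all(tnew >= 0 & tnew <= 1, 2); % rows that fall inside [0,1]^2
tnew = tnew(inSquare, :); % drop the few samples outside the unit square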

MATLAB: 3d reconstruction using eight point algorithm

I am trying to achieve 3D reconstruction from 2 images. The steps I followed are:
1. Found corresponding points between the 2 images using SURF.
2. Implemented the eight-point algorithm to find the fundamental matrix.
3. Implemented triangulation.
So far I have the fundamental matrix and the triangulation results. How do I proceed further to get the 3D reconstruction? I'm confused by all the material available on the internet.
Also, this is my code. Let me know if it is correct or not.
Ia=imread('1.jpg');
Ib=imread('2.jpg');
Ia=rgb2gray(Ia);
Ib=rgb2gray(Ib);
% My surf addition
% collect Interest Points from Each Image
blobs1 = detectSURFFeatures(Ia);
blobs2 = detectSURFFeatures(Ib);
figure;
imshow(Ia);
hold on;
plot(selectStrongest(blobs1, 36));
figure;
imshow(Ib);
hold on;
plot(selectStrongest(blobs2, 36));
title('Thirty strongest SURF features in I2');
[features1, validBlobs1] = extractFeatures(Ia, blobs1);
[features2, validBlobs2] = extractFeatures(Ib, blobs2);
indexPairs = matchFeatures(features1, features2);
matchedPoints1 = validBlobs1(indexPairs(:,1),:);
matchedPoints2 = validBlobs2(indexPairs(:,2),:);
figure;
showMatchedFeatures(Ia, Ib, matchedPoints1, matchedPoints2);
legend('Putatively matched points in I1', 'Putatively matched points in I2');
for i=1:matchedPoints1.Count
xa(i,:)=matchedPoints1.Location(i);
ya(i,:)=matchedPoints1.Location(i,2);
xb(i,:)=matchedPoints2.Location(i);
yb(i,:)=matchedPoints2.Location(i,2);
end
matchedPoints1.Count
figure(1) ; clf ;
imshow(cat(2, Ia, Ib)) ;
axis image off ;
hold on ;
xbb=xb+size(Ia,2);
set=[1:matchedPoints1.Count];
h = line([xa(set)' ; xbb(set)'], [ya(set)' ; yb(set)']) ;
pts1=[xa,ya];
pts2=[xb,yb];
pts11=pts1;pts11(:,3)=1;
pts11=pts11';
pts22=pts2;pts22(:,3)=1;pts22=pts22';
width=size(Ia,2);
height=size(Ib,1);
F=eightpoint(pts1,pts2,width,height);
[P1new,P2new]=compute2Pmatrix(F);
XP = triangulate(pts11, pts22,P2new);
eightpoint()
function [ F ] = eightpoint( pts1, pts2,width,height)
X = 1:width;
Y = 1:height;
[X, Y] = meshgrid(X, Y);
x0 = [mean(X(:)); mean(Y(:))];
X = X - x0(1);
Y = Y - x0(2);
denom = sqrt(mean(mean(X.^2+Y.^2)));
N = size(pts1, 1);
%Normalized data
T = sqrt(2)/denom*[1 0 -x0(1); 0 1 -x0(2); 0 0 denom/sqrt(2)];
norm_x = T*[pts1(:,1)'; pts1(:,2)'; ones(1, N)];
norm_x_ = T*[pts2(:,1)';pts2(:,2)'; ones(1, N)];
x1 = norm_x(1, :)';
y1= norm_x(2, :)';
x2 = norm_x_(1, :)';
y2 = norm_x_(2, :)';
A = [x1.*x2, y1.*x2, x2, ...
x1.*y2, y1.*y2, y2, ...
x1, y1, ones(N,1)];
% compute the SVD
[~, ~, V] = svd(A);
F = reshape(V(:,9), 3, 3)';
[FU, FS, FV] = svd(F);
FS(3,3) = 0; %rank 2 constrains
F = FU*FS*FV';
% rescale fundamental matrix
F = T' * F * T;
end
triangulate()
function [ XP ] = triangulate( pts1,pts2,P2 )
n=size(pts1,2);
X=zeros(4,n);
for i=1:n
A=[-1,0,pts1(1,i),0;
0,-1,pts1(2,i),0;
pts2(1,i)*P2(3,:)-P2(1,:);
pts2(2,i)*P2(3,:)-P2(2,:)];
[~,~,va] = svd(A);
X(:,i) = va(:,4);
end
XP(:,:,1) = [X(1,:)./X(4,:);X(2,:)./X(4,:);X(3,:)./X(4,:); X(4,:)./X(4,:)];
end
function [ P1,P2 ] = compute2Pmatrix( F )
P1=[1,0,0,0;0,1,0,0;0,0,1,0];
[~, ~, V] = svd(F');
ep = V(:,3)/V(3,3);
P2 = [skew(ep)*F,ep];
end
From a quick look, it looks correct. Some notes are as follows:
Your normalization code in eightpoint() is not ideal.
It is best done on the points involved. Each set of points gets its own scaling matrix. That is:
[pts1_n, T1] = normalize_pts(pts1);
[pts2_n, T2] = normalize_pts(pts2);
% ... code
% solution
F = T2' * F * T1;
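For illustration, here is a minimal sketch of such a per-point-set normalization (Hartley-style: translate to the centroid and scale so the mean distance from the origin is sqrt(2)); normalize_pts is the hypothetical helper named above, not an existing MATLAB function:
function [pts_n, T] = normalize_pts(pts)
% pts is Nx2; returns 3xN normalized homogeneous points and the 3x3 transform T
c = mean(pts, 1); % centroid
d = mean(sqrt(sum(bsxfun(@minus, pts, c).^2, 2))); % mean distance to centroid
s = sqrt(2)/d; % scale so the mean distance becomes sqrt(2)
T = [s, 0, -s*c(1); 0, s, -s*c(2); 0, 0, 1];
pts_n = T * [pts'; ones(1, size(pts, 1))];
end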
As a side note (for efficiency) you should do
[~,~,V] = svd(A, 0);
You also want to enforce the constraint that the fundamental matrix has rank-2. After you compute F, you can do:
[U, D, V] = svd(F);
F = U * diag([D(1,1), D(2,2), 0]) * V';
In either case, normalization is not the only key to make the algorithm work. You'll want to wrap the estimation of the fundamental matrix in a robust estimation scheme like RANSAC.
Estimation problems like this are very sensitive to non-Gaussian noise and outliers. If you have a small number of wrong correspondences, or points with high error, the algorithm will break.
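A minimal RANSAC wrapper sketch (an added illustration; it reuses the question's eightpoint() and pts1/pts2 arrays, and the threshold and trial count are illustrative, not tuned):
bestF = []; bestInliers = false(size(pts1,1),1);
for iter = 1:500
sample = randperm(size(pts1,1), 8); % minimal 8-point sample
Fs = eightpoint(pts1(sample,:), pts2(sample,:), width, height);
x1h = [pts1, ones(size(pts1,1),1)]'; % homogeneous points, 3xN
x2h = [pts2, ones(size(pts2,1),1)]';
err = abs(sum(x2h .* (Fs * x1h), 1))'; % algebraic error x2'*F*x1
inliers = err < 1e-2; % illustrative threshold
if nnz(inliers) > nnz(bestInliers)
bestInliers = inliers; bestF = Fs;
end
end
F = eightpoint(pts1(bestInliers,:), pts2(bestInliers,:), width, height); % refit on inliers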
Finally, in triangulate() you want to make sure that the points are not at infinity prior to the homogeneous division.
I'd recommend testing the code with 'synthetic' data. That is, generate your own camera matrices and correspondences. Feed them to the estimate routine with varying levels of noise. With zero noise, you should get an exact solution up to floating point accuracy. As you increase the noise, your estimation error increases.
In its current form, running this on real data will probably not do well unless you 'robustify' the algorithm with RANSAC, or some other robust estimator.
Good luck.
Which version of MATLAB do you have?
There is a function called estimateFundamentalMatrix in the Computer Vision System Toolbox, which will give you the fundamental matrix. It may give you better results than your code, because it is using RANSAC under the hood, which makes it robust to spurious matches. There is also a triangulate function, as of version R2014b.
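For illustration, a minimal sketch of how those toolbox functions could be wired together (hedged: it assumes the matchedPoints1/matchedPoints2 variables from the question and camera projection matrices camMatrix1/camMatrix2, e.g. obtained from cameraMatrix() after calibration):
[F, inliers] = estimateFundamentalMatrix(matchedPoints1, matchedPoints2, ...
'Method', 'RANSAC', 'NumTrials', 2000, 'DistanceThreshold', 1e-4);
inlierPts1 = matchedPoints1(inliers);
inlierPts2 = matchedPoints2(inliers);
worldPoints = triangulate(inlierPts1, inlierPts2, camMatrix1, camMatrix2); % R2014b+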
What you are getting is sparse 3D reconstruction. You can plot the resulting 3D points, and you can map the color of the corresponding pixel to each one. However, for what you want, you would have to fit a surface or a triangular mesh to the points. Unfortunately, I can't help you there.
If what you're asking is how to proceed from the fundamental matrix plus corresponding points to a dense model, then you still have a lot of work ahead of you.
Relative camera locations (R, T) can be calculated from a fundamental matrix, assuming you know the internal camera parameters (up to scale, rotation, translation). To get a full dense reconstruction there are a few ways to go. You can try using an existing library (PMVS, for example). I'd look into OpenMVG, but I'm not sure about its Matlab interface.
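As a hedged sketch of the pose-recovery step (assuming a recent Computer Vision System Toolbox, a fundamental matrix F, intrinsics in a cameraParameters object camParams, and inlier matches pts1In/pts2In; all of these names are placeholders):
[relOrientation, relLocation] = relativeCameraPose(F, camParams, pts1In, pts2In);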
Another way to go: you can compute a dense optical flow (many implementations are available for Matlab). Look for an epipolar OF (it takes a fundamental matrix and restricts the solution to lie on the epipolar lines). Then you can triangulate every pixel to get a depth map.
Finally, you will have to play with format conversions to get from a depth map to VRML (you can look at MeshLab).
Sorry my answer isn't more Matlab oriented.

How to normalize a histogram in MATLAB?

How to normalize a histogram such that the area under the probability density function is equal to 1?
My answer to this is the same as in an answer to your earlier question. For a probability density function, the integral over the entire space is 1. Dividing by the sum will not give you the correct density. To get the right density, you must divide by the area. To illustrate my point, try the following example.
[f, x] = hist(randn(10000, 1), 50); % Create histogram from a normal distribution.
g = 1 / sqrt(2 * pi) * exp(-0.5 * x .^ 2); % pdf of the normal distribution
% METHOD 1: DIVIDE BY SUM
figure(1)
bar(x, f / sum(f)); hold on
plot(x, g, 'r'); hold off
% METHOD 2: DIVIDE BY AREA
figure(2)
bar(x, f / trapz(x, f)); hold on
plot(x, g, 'r'); hold off
You can see for yourself which method agrees with the correct answer (red curve).
Another method (more straightforward than method 2) to normalize the histogram is to divide by sum(f * dx) which expresses the integral of the probability density function, i.e.
% METHOD 3: DIVIDE BY AREA USING sum()
figure(3)
dx = diff(x(1:2))
bar(x, f / sum(f * dx)); hold on
plot(x, g, 'r'); hold off
Since R2014b, Matlab has these normalization routines embedded natively in the histogram function (see the help file for the six normalizations this function offers). Here is an example using the 'pdf' normalization (the bin heights times the bin widths sum to 1).
data = 2*randn(5000,1) + 5; % generate normal random (m=5, std=2)
h = histogram(data,'Normalization','pdf') % PDF normalization
The corresponding PDF is
Nbins = h.NumBins;
edges = h.BinEdges;
x = zeros(1,Nbins);
for counter=1:Nbins
midPointShift = abs(edges(counter)-edges(counter+1))/2;
x(counter) = edges(counter)+midPointShift;
end
mu = mean(data);
sigma = std(data);
f = exp(-(x-mu).^2./(2*sigma^2))./(sigma*sqrt(2*pi));
The two together give:
hold on;
plot(x,f,'LineWidth',1.5)
An improvement that might very well be due to the success of the actual question and accepted answer!
EDIT - The use of hist and histc is not recommended now, and histogram should be used instead. Beware that none of the six ways of creating bins with this new function will produce the bins hist and histc produce. There is a Matlab script to update former code to fit the way histogram is called (bin edges instead of bin centers). By doing so, one can compare the pdf normalization methods of @abcd (trapz and sum) and Matlab (pdf).
The 3 pdf normalization methods give nearly identical results (within the range of eps).
TEST:
A = randn(10000,1);
centers = -6:0.5:6;
d = diff(centers)/2;
edges = [centers(1)-d(1), centers(1:end-1)+d, centers(end)+d(end)];
edges(2:end) = edges(2:end)+eps(edges(2:end));
figure;
subplot(2,2,1);
hist(A,centers);
title('HIST not normalized');
subplot(2,2,2);
h = histogram(A,edges);
title('HISTOGRAM not normalized');
subplot(2,2,3)
[counts, centers] = hist(A,centers); %get the count with hist
bar(centers,counts/trapz(centers,counts))
title('HIST with PDF normalization');
subplot(2,2,4)
h = histogram(A,edges,'Normalization','pdf')
title('HISTOGRAM with PDF normalization');
dx = diff(centers(1:2))
normalization_difference_trapz = abs(counts/trapz(centers,counts) - h.Values);
normalization_difference_sum = abs(counts/sum(counts*dx) - h.Values);
max(normalization_difference_trapz)
max(normalization_difference_sum)
The maximum difference between the new PDF normalization and the former one is 5.5511e-17.
hist can not only plot a histogram but also return the count of elements in each bin, so you can get that count, normalize it by dividing each bin by the total, and plot the result using bar. Example:
Y = rand(10,1);
C = hist(Y);
C = C ./ sum(C);
bar(C)
or if you want a one-liner:
bar(hist(Y) ./ sum(hist(Y)))
Documentation:
hist
bar
Edit: This solution answers the question of how to have the sum of all bins equal to 1. This approximation is valid only if your bin size is small relative to the variance of your data. The sum used here corresponds to a simple quadrature formula; more complex ones can be used, like trapz as proposed by R. M.
[f,x]=hist(data)
The area of each individual bar is height*width. Since MATLAB chooses equidistant points for the bars, the width is:
delta_x = x(2) - x(1)
Now if we sum up all the individual bars the total area will come out as
A=sum(f)*delta_x
So the correctly scaled plot is obtained by
bar(x, f/sum(f)/(x(2)-x(1)))
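A quick sanity check (added): with this scaling, the bar heights times the bin width sum to one, i.e. the plotted histogram has unit area.
delta_x = x(2) - x(1);
area_check = sum(f/sum(f)/delta_x) * delta_x % should display 1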
The area of @abcd's PDF is not one, which is impossible, as pointed out in many comments.
Assumptions made in many answers here:
Assume constant distance between consecutive edges.
The probability under the pdf should be 1. The normalization should be done as Normalization with probability, not as Normalization with pdf, in histogram() and hist().
Fig. 1 Output of hist() approach, Fig. 2 Output of histogram() approach
The maximum amplitude differs between the two approaches, which suggests that there is some mistake in the hist() approach, because the histogram() approach uses the standard normalization. I assume the mistake with the hist() approach here is that its normalization is only partially a pdf, not fully a probability.
Code with hist() [deprecated]
Some remarks
First check: sum(f)/N gives 1 if Nbins is set manually.
The pdf requires the bin width (dx) in the graph g.
Code
%http://stackoverflow.com/a/5321546/54964
N=10000;
Nbins=50;
[f,x]=hist(randn(N,1),Nbins); % create histogram from ND
%METHOD 4: Count Densities, not Sums!
figure(3)
dx=diff(x(1:2)); % width of bin
g=1/sqrt(2*pi)*exp(-0.5*x.^2) .* dx; % pdf of ND with dx
% 1.0000
bar(x, f/sum(f));hold on
plot(x,g,'r');hold off
Output is in Fig. 1.
Code with histogram()
Some remarks
First check: a) sum(f) is 1 if Nbins is adjusted with histogram()'s Normalization set to probability; b) sum(f)/N is 1 if Nbins is manually set without normalization.
The pdf requires the bin width (dx) in the graph g.
Code
%%METHOD 5: with histogram()
% http://stackoverflow.com/a/38809232/54964
N=10000;
figure(4);
h = histogram(randn(N,1), 'Normalization', 'probability') % hist() deprecated!
Nbins=h.NumBins;
edges=h.BinEdges;
x=zeros(1,Nbins);
f=h.Values;
for counter=1:Nbins
midPointShift=abs(edges(counter)-edges(counter+1))/2; % same constant for all
x(counter)=edges(counter)+midPointShift;
end
dx=diff(x(1:2)); % constant for all bins
g=1/sqrt(2*pi)*exp(-0.5*x.^2) .* dx; % pdf of ND
% Use if Nbins manually set
%new_area=sum(f)/N % diff of consecutive edges constant
% Use if histogram() Normalization is probability
new_area=sum(f)
% 1.0000
% No bar() needed here with histogram() Normalization probability
hold on;
plot(x,g,'r');hold off
The output is in Fig. 2 and the expected output is met: area 1.0000.
Matlab: 2016a
System: Linux Ubuntu 16.04 64 bit
Linux kernel 4.6
For some distributions (Cauchy, I think) I have found that trapz will overestimate the area, and so the pdf will change depending on the number of bins you select. In that case I do:
[N,h]=hist(q_f./theta,30000); % there is a large range but most of the bins will be empty
plot(h,N/(sum(N)*mean(diff(h))),'+r')
There is an excellent three-part guide to Histogram Adjustments in MATLAB (the original link is broken; an archive.org copy exists); the first part is on Histogram Stretching.