How to do MCMC simulation using Metropolis hasting algorithm in Matlab? - matlab

I am trying to simulate a distribution for parameter theta f= theta ^(z_f+n+alpha-1)*(1-theta)^(n+1-z_f-k+ beta-1), where all the parameter except for theta is know. I am using Metro polish hasting algorithm to do the MCMC simulation . My proposal density is a beta distribution with parameter alpha and beta. My code for the simulation are as follows. I am using a buitlin Matlab code called mhsample() for this purpose, How do I know if my code is working properly?
clear
clc
alpha=2;
beta=2;
z_f=1;
n=6;
k=5;
nsamples = 3000;
pdf= #(x) x^(z_f+n+alpha-1)*(1-x)^(n+1-z_f-k+beta-1); % here x acts as theta
proppdf= #(x,y) betapdf(x, alpha, beta);
proprnd =#(x) betarnd(alpha,beta,1);
smpl = mhsample(0.1,nsamples,'pdf',pdf,'proprnd',proprnd,'proppdf',proppdf);

I'm unsure of what you're asking when you say "how do I know if my code is working properly" -- I'm assuming it executes? But for a visual comparison of your function vs. the simulation, you can plot both the PDF and the data you got from mhsample as follows:
% i'm assuming you ran the code above so that smpl and #pdf are both defined...
fplot(pdf,[0 1]); % fplot takes your function and plots it between x-limit [0,1]
figure % new figure
hist(smpl,30); % 30 here is bin size, change it to your preference
Figure below:
the histogram of smpl's output on left, i.e., your simulation
the function pdf bounded in [0,1] on right for comparison to your simulation
This was just a wild guess because those two figures resemble each other and are also beta-distribution-esque.
If you want a more complex analysis than that, I'm afraid I'm not yet proficient in MCMC :)

Related

Curve fitting in MATLAB, for a Sinusoidal function with more than 8 terms?

I'm trying to fit some data to a sum of sines function in MATLAB, however, the number of terms of sine function in MATLAB is limited,i.e. to 1 ≤ n ≤ 8. However, I want more terms in my fit functions, i.e. over 50 term. Is there anyway to make MATLAB to fit my data to a sum of sine function with over 8 sinusoidal terms? Why there is such constraint in MATLAB (is it technically or arbitrary)? Is there any toolbox to fit sinusoidal function (especially something that is capable of supporting wieghted data)?
>f = fit(X,Y, 'sin10')
>Error using fittype>iCreateFromLibrary (line 412)
>Library function sin10 not found.
It is o.k up to 'sin8' or 'sin9' parameters.
I appreciate any answer.
I'v found a solution to my question accidentally, while browsing MATLAB help. I post this answer in hope of helping people who have the same problem.
As the first shot to solve this , I tried 'fit' instruction. For some reasons, customized 'fit' based fitting code like below, didn't workout:
FitOptions = fitoptions('Method','NonlinearLeastSquares', 'Algorithm', 'Trust-Region', 'MaxIter');
FitType = fittype('a*sin(1*f) + b*sin(2*f) + c*sin(3*f) + d*sin(4*f) + e*sin(5*f) + g*sin(6*f) + h*sin(7*f) + k*sin(8*f) + l*sin(9*f) + m*sin(10*f) + n*sin(11*f)', 'independent', 'f');
[FittedModel, GOF] = fit(freq, data, FitType)
% `In above code, phase parameters are not included, they might be added.
What I found is that using 'lsqcurvefit' instruction from Optimization Toolbox, customized function fitting is more feasible and easier than 'fit' function. I tested it to fit my data to sum of 12 (>8) sines in below code:
clear;clc
xdata=1:0.1:10; % X or Independant Data
ydata=sin(xdata+0.2)+0.5*sin(0.3*xdata+0.3)+ 2*sin( 0.2*xdata+23 )+...
0.7*sin( 0.34*xdata+12 )+.76*sin( .23*xdata+.3 )+.98*sin(.76 *xdata+.56 )+...
+.34*sin( .87*xdata+.123 )+.234*sin(.234 *xdata+23 ); % Y or Dependant data
x0 = randn(36,1); % Initial Guess
fun = #(x,xdata)x(1)*sin(x(2)*xdata+x(3))+...
x(4)*sin(x(5)*xdata+x(6))+...
x(7)*sin(x(8)*xdata+x(9))+...
x(10)*sin(x(11)*xdata+x(12))+...
x(13)*sin(x(14)*xdata+x(15))+...
x(16)*sin(x(17)*xdata+x(18))+...
x(19)*sin(x(20)*xdata+x(21))+...
x(22)*sin(x(23)*xdata+x(24))+...
x(25)*sin(x(26)*xdata+x(27))+...
x(28)*sin(x(29)*xdata+x(30))+...
x(31)*sin(x(32)*xdata+x(33))+...
x(34)*sin(x(35)*xdata+x(36)); % Goal function which is Sum of 12 sines
options = optimoptions('lsqcurvefit','Algorithm','trust-region-reflective');% Options for fitting
x=lsqcurvefit(fun,x0,xdata,ydata) % the main instruction
times = linspace(xdata(1),xdata(end));
plot(xdata,ydata,'ko',times,fun(x,times),'r-')
legend('Data','Fitted Sum of 12 Sines')
title('Data and Fitted Curve')
The results is satisfactory (till now), it is shown in below:
The above problem is that when I use matlab fit function, with specified argument for Sum of Sines fitting (e.g fit(xdata,ydata,'sin6')), it easily converges to an optimum solution and fitting results are acceptable as below:
but when I tried to fit same data using a customarily defined function, it results are not satisfactory at all as you see in figure below:
fun=#(x,xdata)a1*sin(b1*xdata+c1)+...+a6*sin(b6*xdata+c6); %Sum if Six Sines
f=fit(xdata,ydata,fun);
First, I felt it is the fit instruction so I tried other instructions like lsqcurvefit , it worked well for some data but as soon as other data were ued it started to ill-behave.
From Maltab documentations, I figured out Sum of Sine fitting and Fourier fitting are extremely sensitive to Starting points or initial points, or values that fitting algorithm assumes for fitting parameters (amplitudes, frequencies and phases) for its first iteration. Through inspection of Matlab fitting toolbox .m files , I noticed matlab does some clever trick to obtain starting point when you use predefined function fitting (e.g. fit(x,y,'sin1'), or fit(x,y,'sin2'),... but when you chose ti enter your custom function the initial points are generated randomly! This is why Matlab build functions work and my custom function fitting does not (even though I enter the same function).
By the way, Matlab computes FFT of the ydata and through some (seems greedy) method extracts initial points for amplitudes, frequencies and phases (a function called startpt.m does this).

MATLAB: Generate sample from a known density function

I want to write a function that generates samples from a distribution. To do that, I know its continuous density function. The first idea that came into mind was to write a code like this one:
% a and b are know coefficients.
% besselj is a MATLAB function.
x = 0:0.0001:5;
% PDF=density function (PDF=a (1 x 50.000) vector)
PDF = (x/b).*exp(-(x.^2+a^2)./(2*b)).*besselj(1,(x.*a)./b);
% density function has to be normalized
PDF = PDF./sum(PDF)
% y=the vector containing the sample
[temp , y] = histc(rand(1 , N), [0 cumsum(PDF)]);
For the curious ones, this density function represents a Rice distribution ;-).
However, on second thoughts, this algorithm seems bad. Indeed, from a 100%-known density function, I get a partially-known discretized density function (the vector PDF) and I rely on this new density function to obtain samples. I think that some information is lost because of the discretization.
So, my question is the following: Is there a way I can generate sample from a continuous density function without loosing information ?
Moreover, the code above doesn't seem good to me because the magnitudes of the plot(x,PDF) before and after the normalization are absolutely not the same.

MATLAB Fitting Function

I am trying to fit a line to some data without using polyfit and polyval. I got some good help already on how to implement this and I have gotten it to work with a simple sin function. However, when applied to the function I am trying to fit, it does not work. Here is my code:
clear all
clc
lb=0.001; %lowerbound of data
ub=10; %upperbound of data
step=.1; %step-size through data
a=.03;
la=1482/120000; %1482 is speed of sound in water and 120kHz
ep1=.02;
ep2=.1;
x=lb:step:ub;
r_sq_des=0.90; %desired value of r^2 for the fit of data without noise present
i=1;
for x=lb:step:ub
G(i,1)= abs(sin((a/la)*pi*x*(sqrt(1+(1/x)^2)-1)));
N(i,1)=2*rand()-1;
Ghat(i,1)=(1+ep1*N(i,1))*G(i,1)+ep2*N(i,1);
r(i,1)=x;
i=i+1;
end
x=r;
y=G;
V=[x.^0];
Vfit=[x.^0];
for i=1:1:1000
V = [x.^i V];
c = V \ y;
Vfit = [x.^i Vfit];
yFit=Vfit*c;
plot(x,y,'o',x,yFit,'--')
drawnow
pause
end
The first two sections are just defining variables and the function. The second for loop is where I am making the fit. As you can see, I have it pause after every nth order in order to see the fit.
I changed your fit formula a bit, I got the same answers but quickly got
a warning that the matrix was singular. No sense in continuing past
the point that the inversion is singular.
Depending on what you are doing you can usually change out variables or change domains.
This doesn't do a lot better, but it seemed to help a little bit.
I increased the number of samples by a factor of 10 since the initial part of the curve
didn't look sampled highly enough.
I added a weighting variable but it is set to equal weight in the code below. Attempts
to deweight the tail didn't help as much as I hoped.
Probably not really a solution, but perhaps will help with a few more knobs/variables.
...
step=.01; %step-size through data
...
x=r;
y=G;
t=x.*sqrt(1+x.^(-2));
t=log(t);
V=[ t.^0];
w=ones(size(t));
for i=1:1:1000
% Trying to solve for value of c
% c that
% yhat = V*c approximates y
% or y = V*c
% V'*y = V'*V * c
% c = (V'*V) \ V'*y
V = [t.^i V];
c = (V'*diag(w.^2)*V ) \ (V'*diag(w.^2)*y) ;
yFit=V*c;
subplot(211)
plot(t,y,'o',t,yFit,'--')
subplot(212)
plot(x,y,'o',x,yFit,'--')
drawnow
pause
end
It looks like more of a frequency estimation problem, and trying to fit a unknown frequency
with polynomial tends to be touch and go. Replacing the polynomial basis with a quick
sin/cos basis didn't seem to do to bad.
V = [sin(t*i) cos(t*i) V];
Unless you specifically need a polynomial basis, you can apply your knowledge of the problem domain to find other potential basis functions for your fit, or to attempt to make the domain in which you are performing the fit more linear.
As dennis mentioned, a different set of basis functions might do better. However you can improve the polynomial fit with QR factorisation, rather than just \ to solve the matrix equation. It is a badly conditioned problem no matter what you do however, and using smooth basis functions wont allow you to accurately reproduce the sharp corners in the actual function.
clear all
close all
clc
lb=0.001; %lowerbound of data
ub=10; %upperbound of data
step=.1; %step-size through data
a=.03;
la=1482/120000; %1482 is speed of sound in water and 120kHz
ep1=.02;
ep2=.1;
x=logspace(log10(lb),log10(ub),100)';
r_sq_des=0.90; %desired value of r^2 for the fit of data without noise present
y=abs(sin(a/la*pi*x.*(sqrt(1+(1./x).^2)-1)));
N=2*rand(size(x))-1;
Ghat=(1+ep1*N).*y+ep2*N;
V=[x.^0];
xs=(lb:.01:ub)';
Vfit=[xs.^0];
for i=1:1:20%length(x)-1
V = [x.^i V];
Vfit = [xs.^i Vfit];
[Q,R]=qr(V,0);
c = R\(Q'*y);
yFit=Vfit*c;
plot(x,y,'o',xs,yFit)
axis([0 10 0 1])
drawnow
pause
end

Matlab: Rejection sampling

I want to sample from only the tails ([-5sigma,-3sigma] and [3sigma,5sigma]) of a Normal Distribution when I run a Monte-Carlo Simulation and therefore Rejection Sampling comes to mind. I am however, struggling to implement this in Matlab. Up till now I have been using something similar to the code below (which I know isn't rejection sampling), but would Rejection Sampling be a better way to solve this issue?
function [new_E11] = elasticmodulusrng()
new_E11 = normrnd(136e9,9.067e9,[1 1]);
while new_E11>=136e9-3*9.067e9 && new_E11<=136e9+3*9.067e9
new_E11 = normrnd(136e9,9.067e9,[1 1]);
end
Thanks
Edit: Using code in the Answer
mu=0
sigma=1
%get scaling factor
scale=(normcdf(5*sigma+mu,mu,sigma)-normcdf(3*sigma+mu,mu,sigma))*2
%define pdf
cpdf=#(x)((1/scale)*normpdf(x,mu,sigma).*((abs(x-mu)<5.*sigma)&(abs(x-mu.*sigma)>3)))
%get cdf via integral
ccdf=#(x)integral(cpdf,mu-5.*sigma,x)
%allow vector inputs
ccdf=#(x)arrayfun(ccdf,x)
%inverse cdf
icdf=#(y)(fzero(#(x)(ccdf(x)-y),.5));
%allow vector inputs
icdf=#(x)arrayfun(icdf,x);
%icdf is very slow, thus evaluate some numbers and use the cached and interpolated version:
cachedicdf=nan(n+1,1);
x=0:0.01:1;
y=icdf(x);
icdf=#(uni)interp1(x,y,uni);
%plot some example data
hist(icdf(rand(10000000,1)),1000);
The accuracy is not what I expected, but I'll leave it here. Maybe someone is able to improve the code.

MATLAB program about sine and cosine functions

So I plot sine(w*time) vs cosine(w*time)
w being angular frequency.
Hope I'm not wasting anyone's time if I ask:
Would this look like a circle?
I've researched a whole bunch but most websites only graph sine and cosine side-by-side and show comparisons.
I got it to look like a circle and I was just wondering if this is correct.
Also, What can I call this plot? I just gave it a title "plot of a circle". But I am wondering if that is professional enough since I am doing it for class.
Thanks for your time and answers. Greatly appreciated.
My MATLAB code for anyone interested:
clear all; clc; % clear the Workspace and the Command Window
f = 2; w = 2*pi*f; % specify a frequency in Hz and convert to rad/sec
T = 0.01; % specify a time increment
time = 0 : T : 0.5; % specify a vector of time points
x = sin(w*time); % evaluate the sine function for each element of the vector time
y = cos(w*time);
plot(x,y)
axis equal
grid on
xlabel('sin(w*time)');ylabel('cos(w*time)');title('Plot of a Circle');
axis([-1.1 1.1 -1.1 1.1]);
print
Here is a link to a Wolfram Alpha query I just did:
http://www.wolframalpha.com/input/?i=x%3Dsin%28t%29%2C+y%3Dcos%28t%29
I am not sure if it what you want to see, but that site (WolframAlpha.com) is a great place to explore and challenge mathematical concepts that are new to you.
Also, I would call it a plot of a circle since that is what the output looks like.
You are making a Lissajous curve. Keep in mind that a cosine is just a sine offset by pi/2 radians, and so plotting a sine against a cosine will indeed result in a circle. Changing the frequency and/or relative phase between x(t) and y(t) will result in many different interesting patterns.