How do you program the Monte Carlo Integration method in Matlab?

I am trying to figure out how to write a math-based app with Matlab, but I cannot seem to figure out how to get the Monte Carlo method of integration to work. I feel that I do not have the algorithm thought out correctly either. As of now, I have something like:
% For the function: integral of cos(x^3)*exp(x^(1/2))+x dx
% from x = 0 to x = 10
ans = 0;
for i = 1:100000000
x = 10*rand;
ans = ans + cos(x^3)*exp(x^(1/2))+x;
end
I feel that this is completely wrong because my outputs are hardly even close to what is expected. How should I correctly write this? Or, how should the algorithm for setting this up look?

Two issues:
1) If you look at what you're calculating, "ans" grows as i increases. By summing a huge number of samples, you're just increasing your output value. How could you normalize this value so that it stays roughly the same, regardless of the number of samples?
2) Think about what you're trying to calculate here. Your current "ans" gives you the sum of 100000000 independent random evaluations of your function. What does this number represent if you divide it by the number of samples you've taken? How could you combine that knowledge with the range of integration in order to get the expected area under the curve?

I managed to solve this with the formula I found here. I ended up using:
ans = 0;
n = 0;
for i = 1:100000000
x = 10*rand;
n = n + cos(x^3)*exp(x^(1/2))+x;
end
ans = ((10-0)/100000000)*n
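As a side note, the same estimator can be written without the loop, since MATLAB functions operate on whole vectors at once; this is just a minimal sketch of that idea (the names N, fx and estimate are mine, not from the post):
N = 1e6; % number of samples
x = 10*rand(N,1); % uniform samples on [0, 10]
fx = cos(x.^3).*exp(sqrt(x)) + x; % evaluate the integrand at every sample
estimate = (10-0)*mean(fx) % (b-a) * average value of the integrand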

Related

MATLAB how to make nlinfit() not quit out of loop and how to test distribution of results

Given some data (from the function 1/x), I solved for powers of a model function made of the sum of exponentials. I did this for several different samples of data and I want to plot the powers that I find and test to see what type of distribution they have.
My main problems are:
My loops are quitting when the nlinfit() function says the function gives -Inf values. I've tried playing around with initial conditions but can't see how to fix it. I know that I am working with the function 1/x, so there may not be a way around this problem. Is there at least a way to force the loop to keep going after getting this error? It only happens for some iterations. If I manually regenerate my sample data and run it again it works. So if I could force the loop to keep going it would be ideal. I have tried switching the for loop to a while loop and putting a try catch statement inside but this just seems to run endlessly.
I have code set up to test whether the distribution of the resulting power is close to the exponential distribution. Is this the correct way to do this? When I add in the Monte Carlo tolerance ('MCTol') it has a drastic effect on p in some cases (going from 0.5 to 0.9). Do I have it set up correctly? Is there a way to also see empirically how far the curve is from the exponential distribution?
Are there better ways to fit this curve or do any of the above? I'm just learning so any suggestions are appreciated.
Here is my current code :
%Function that I use to fit curve and get error for each iteration
function[beta,error] = Fit_and_Calc_Error(T,X,modelfun, beta0)
opts = statset('nlinfit');
opts.RobustWgtFun = 'bisquare';
beta = nlinfit(T,X,modelfun,beta0,opts);
error = immse(X,modelfun(beta,T));
end
%Starting actual loop stuff
tracking_a = zeros(60,2); %preallocate: 60 iterations, 2 fitted powers
tracking_error = zeros(60,1);
modelfun = @(a,x)(exp(-a(1)*x)+exp(-a(2)*x));
for i = 1:60
%make new random data
x = 20*rand(300,1)+1; %between 1 and 21 (Avoiding x=0)
X = sort(x);
Y = 1./(X);
%fit random data
beta0 = [max(X(:,1))/10;max(X(:,1))/10];
[beta,error]= Fit_and_Calc_Error(X,Y,modelfun, beta0);
%storing data found
tracking_a(i,1:length(beta)) = beta;
tracking_error(i) = error;
end
%Testing if first power found has exponential dist
pdN = fitdist(tracking_a(:,1),'Exponential');
histogram(tracking_a(:,1),10)
hold on
x_values2 = 0:0.001:5;
y2 = pdf(pdN,x_values2);
plot(x_values2,y2)
hold off
[h,p] = lillietest(tracking_a(:,1),'Distribution','exponential','MCTol',1e-4)
%Testing if second power found has exponential dist
pdN2 = fitdist(tracking_a(:,2),'Exponential');
histogram(tracking_a(:,2),10)
hold on
y3 = pdf(pdN2,x_values2);
plot(x_values2,y3)
hold off
[h2,p2] = lillietest(tracking_a(:,2),'Distribution','exponential','MCTol',1e-4)
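One hedged way to address the first problem above (keeping the loop going when nlinfit errors out) is to wrap the fit in try/catch, regenerate the sample on failure, and cap the number of retries so the loop cannot run endlessly. This is only a sketch of that idea, reusing modelfun and Fit_and_Calc_Error from above; maxRetries and the NaN fallback are my own assumptions:
maxRetries = 10; %assumed retry cap, tune as needed
for i = 1:60
    beta = nan(1,2); error = NaN; %fallback values if every attempt fails
    for attempt = 1:maxRetries
        x = 20*rand(300,1)+1; %regenerate the sample on each attempt
        X = sort(x);
        Y = 1./X;
        beta0 = [max(X)/10; max(X)/10];
        try
            [beta,error] = Fit_and_Calc_Error(X,Y,modelfun,beta0);
            break %fit succeeded, stop retrying
        catch
            %nlinfit failed on this sample; loop around and try a new one
        end
    end
    tracking_a(i,1:length(beta)) = beta;
    tracking_error(i) = error;
end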

Approximating an integral in MATLAB

I've been trying to implement the following integral in MATLAB
Given a number n, I wrote the code that returns an array with n elements, containing approximations of each integral.
First, I tried this using a 'for' loop and the recurrence relationship on the first line. But from the 20th integral and above the values are completely wrong (correct to 0 significant figures and wrong sign).
The same goes if I use the explicit formula on the second line and two 'for' loops.
As n grows larger, so does the error on the approximations.
So the main issue here is that I haven't found a way to minimize the error as much as possible.
Any ideas? Thanks in advance.
Here is an example of the code and the resulting values, using the second formula:
This integral, for positive values of n, cannot have values >1 or <0
First attempt:
I tried the iterative method and found an interesting thing. The approximation may not be true for all n. In fact, if I keep track of (n-1)*I(n-1) in each loop I can see:
I = zeros(20,3);
I(1,1) = 1-1/exp(1);
for ii = 2:20
I(ii,2) = ii-1;
I(ii,3) = (ii-1)*I(ii-1,1);
I(ii,1) = 1-I(ii,3);
end
There is some problem around n=18. In fact, I18 = 0.05719 and 18*I18 = 1.029 which is larger than 1. I don't think there is any numerical error or number overflow in this procedure.
Second attempt:
To make sure the maths is correct (I verified twice on paper) I used trapz to numerically evaluate the integral, and n=18 didn't cause any problem.
>> x = linspace(0,1,1+1e4);
>> f = @(n) exp(-1)*exp(x).*x.^(n-1);
>> f = @(n) exp(-1)*exp(x).*x.^(n-1)*1e-4;
>> trapz(f(5))
ans =
1.708934160520510e-01
>> trapz(f(17))
ans =
5.571936009790170e-02
>> trapz(f(18))
ans =
5.277113416899408e-02
>>
A closer look is as follows. I18 is slightly different (to the 4th significant digit) between the (stable) numerical method and (unstable) iterative method. 18*I18 is therefore possible to exceed 1.
I = zeros(20,3);
I(1,1) = 1-1/exp(1);
for ii = 2:20
I(ii,2) = ii-1;
I(ii,3) = (ii-1)*I(ii-1,1);
I(ii,1) = 1-I(ii,3);
end
J = zeros(20,3);
x = linspace(0,1,1+1e4);
f = @(n) exp(-1)*exp(x).*x.^(n-1)*1e-4;
J(1,1) = trapz(f(1));
for jj = 2:20
J(jj,1) = trapz(f(jj));
J(jj,2) = jj-1;
J(jj,3) = (jj-1)*J(jj-1,1);
end
I suspect there is an error in each iterative step due to the nature of numerical computations. If the iteration is long, the error propagates and, unfortunately in this case, amplifies rapidly. In order to verify this, I combined the above two methods into a hybrid algo. For most of the time the iterative way is used, and once in a while a numerical integral is evaluated from scratch without relying on previous iterations.
K = zeros(40,4);
K(1,1) = 1-1/exp(1);
for kk = 2:40
K(kk,2) = trapz(f(kk));
K(kk,3) = (kk-1)*K(kk-1,1);
K(kk,4) = 1-K(kk,3);
if mod(kk,5) == 0
K(kk,1) = K(kk,2);
else
K(kk,1) = K(kk,4);
end
end
If the iteration lasts more than 4 steps, the error amplification is large enough to flip the sign and start a nonrecoverable oscillation.
The code should explain all the data structures. Anyway, let me put some focus here. The second column is the result of trapz, the numerical integral computed directly from the (non-iterative) definition of I(n). The third column is (n-1)*I(n-1) and should always be positive and less than 1. The fourth column is 1-(n-1)*I(n-1) and should always be positive. The first column is the choice I have made between the trapz result and the iterative result, to be the "true" value of I(n).
As can be seen here, in each iteration there is a small error compared to the independent numerical way. The error grows in the 3rd and 4th iterations and finally breaks the computation on the 5th. This is observed around n=25, in the case where I take the numerical result every 5 loops from the beginning.
Conclusion: There is nothing wrong with any definition of this integral. However, the numerical error when evaluating the expressions unfortunately accumulates, which limits the way you can perform the computation.
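For completeness, a common remedy for exactly this kind of forward-recurrence instability, offered only as a hedged sketch (it is not part of the answer above), is to run the recurrence backwards: solving I(n) = 1-(n-1)*I(n-1) for I(n-1) divides the error by roughly n at every downward step instead of multiplying it.
nmax = 20; % largest index actually wanted
pad = 20; % assumed extra starting distance
I = zeros(nmax+pad,1);
I(end) = 0; % crude starting guess; since 0 <= I(n) <= 1/n, the initial error is at most 1/(nmax+pad)
for n = nmax+pad:-1:2
    I(n-1) = (1-I(n))/(n-1); % inverted form of I(n) = 1-(n-1)*I(n-1)
end
I = I(1:nmax); % keep only the indices of interest; I(1) should be close to 1-1/exp(1)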

Sampling and DTFT in Matlab

I need to produce a signal x=-2*cos(100*pi*n)+2*cos(140*pi*n)+cos(200*pi*n)
So I put it like this :
N=1024;
for n=1:N
x=-2*cos(100*pi*n)+2*cos(140*pi*n)+cos(200*pi*n);
end
But what I get is that the result keeps coming out as 1.
I tried to test each value for each n, and I get the same results for any n.
For example, -2*cos(100*pi*n) with n=1 has to be -1.393310473. Instead of that, Matlab gave the result -2 for it, and it always gave -2 for any n.
I don't know how to fix it, so I hope someone could help me out! Thank you!
Not sure where you get the idea that -2*cos(100*pi) should be anything other than -2. Maybe you are not aware that Matlab works in radians?
Look at your expression. Each term can be factored to contain 2*pi*(an integer). And you should know that cos(2*pi*(an integer)) = 1.
So the results are exactly as expected.
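A quick check in MATLAB (my own illustration of this point):
n = 1:5;
cos(100*pi*n) % 100*pi*n = 2*pi*(50*n), so every entry is 1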
What you are seeing is basically what happens when you under-sample a waveform. You may know that the Nyquist criterion says that you need to have a sampling rate that is at least two times greater than the highest frequency component present; but in your case, you are sampling one point every 50, 70, 100 complete cycles. So you are "far beyond Nyquist". And that can only be solved by sampling more closely.
For example, you could do:
t = linspace(0, 1, 1024); % sample the waveform 1024 times between 0 and 1
f1 = 50;
f2 = 70;
f3 = 100;
signal = -2*cos(2*pi*f1*t) + 2*cos(2*pi*f2*t) + cos(2*pi*f3*t);
figure; plot(t, signal)
I think you are using degrees when you are doing your calculations, so do this:
n = 1:1024;
x=-2*cosd(100*pi*n)+2*cosd(140*pi*n)+cosd(200*pi*n);
cosd uses degrees instead of radians. Radians are the default for cos, so MATLAB has a separate function for when degree input is used. For me this gave:
-2*cosd(100*pi*1) = -1.3933
The first term that I got using:
x=-2*cosd(100*pi*1)+2*cosd(140*pi*1)+cosd(200*pi*1)
x = -1.0693
Also notice that I defined n as n = 1:1024; this gives all the integers 1, 2, ..., 1024.
There is no need to use a for loop, since many of MATLAB's built-in functions are vectorized, meaning you can pass in a vector and the function is calculated for every element of that vector.

Why is the vector coming out of the 'trapz' function as NaN?

I am trying to calculate the inverse Fourier transform of the vector XRECW. For some reason I get a vector of NaNs.
Please help!
t = -2:1/100:2;
x = ((2/5)*sin(5*pi*t))./((1/25)-t.^2);
w = -20*pi:0.01*pi:20*pi;
Hw = (exp(j*pi.*(w./(10*pi)))./(sinc(w./(10*pi)))).*(heaviside(w+5*pi)-heaviside(w-5*pi));%low pass filter
xzohw = 0;
for q=1:20:400
xzohw = xzohw + x(q).*(2./w).*sin(0.1.*w).*exp(-j.*w*0.2*((q-1)/20)+0.5);%calculating fourier transform of xzoh
end
xzohw = abs(xzohw);
xrecw = abs(xzohw.*Hw);%filtering the fourier transform high frequencies
xrect=0;
for q=1:401
xrect(q) = (1/(2*pi)).*trapz(xrecw.*exp(j*w*t(q))); %inverse fourier transform
end
xrect = abs(xrect);
plot(t,xrect)
Here's a direct answer to your question of "why" there is a NaN. If you run your code, the NaN comes from dividing by zero on line 7, where xzohw is computed. Notice that w contains zero:
>> find(w==0)
ans =
2001
and you can see in line 7 that you divide by the elements of w with the (2./w) factor.
A quick fix (although it is not a guarantee that your code will do what you want) is to avoid including 0 in w by using a step that never lands exactly on zero. Since 20*pi is not an integer multiple of 0.01, you can try taking steps in 0.01 increments:
w = -20*pi:0.01:20*pi;
Using this, your code produces a plot which might resemble what you're looking for. In order to do better, we might need more details on exactly what you're trying to do, or what these variables represent.
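An alternative sketch, under the assumption that you want to keep the original w grid: patch the single 0/0 sample with its analytic limit, since 2*sin(0.1*w)/w tends to 2*0.1 = 0.2 as w goes to 0 (the name kernel is mine):
w = -20*pi:0.01*pi:20*pi; % same grid as before; find(w==0) gives 2001
kernel = (2./w).*sin(0.1.*w); % NaN at w == 0 because of 0/0
kernel(w == 0) = 0.2; % analytic limit of 2*sin(0.1*w)/w at w = 0
% then use kernel in place of (2./w).*sin(0.1.*w) inside the loop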
Hope this helps!

Matlab Code To Approximate The Exponential Function

Does anyone know how to make the following Matlab code approximate the exponential function more accurately when dealing with large and negative real numbers?
For example when x = 1, the code works well, when x = -100, it returns an answer of 8.7364e+31 when it should be closer to 3.7201e-44.
The code is as follows:
s=1
a=1;
y=1;
for k=1:40
a=a/k;
y=y*x;
s=s+a*y;
end
s
Any assistance is appreciated, cheers.
EDIT:
Ok so the question is as follows:
Which mathematical function does this code approximate? (I say the exponential function.) Does it work when x = 1? (Yes.) Unfortunately, using this when x = -100 produces the answer s = 8.7364e+31. Your colleague believes that there is a silly bug in the program, and asks for your assistance. Explain the behaviour carefully and give a simple fix which produces a better result. [You must suggest a modification to the above code, or its use. You must also check that your simple fix works.]
So I somewhat understand that the problem surrounds large numbers: when there are 16 (or more) orders of magnitude between terms, precision is lost. But the solution eludes me.
Thanks
EDIT:
So in the end I went with this:
s = 1;
x = -100;
a = 1;
y = 1;
x1 = 1;
for k=1:40
x1 = x/10;
a = a/k;
y = y*x1;
s = s + a*y;
end
s = s^10;
s
Not sure if it's completely correct but it returns some good approximations.
exp(-100) = 3.720075976020836e-044
s = 3.722053303838800e-044
After further analysis (and unfortunately submitting the assignment), I realised that increasing the number of iterations, and thus the number of terms, further improves the accuracy. In fact the following was even more accurate:
s = 1;
x = -100;
a = 1;
y = 1;
x1 = 1;
for k=1:200
x1 = x/200;
a = a/k;
y = y*x1;
s = s + a*y;
end
s = s^200;
s
Which gives:
exp(-100) = 3.720075976020836e-044
s = 3.720075976020701e-044
As John points out in a comment, you have an error inside the loop. The y = y*k line does not do what you need. Look more carefully at the terms in the series for exp(x).
Anyway, I assume this is why you have been given this homework assignment, to learn that series like this don't converge very well for large values. Instead, you should consider how to do range reduction.
For example, can you use the identity
exp(x+y) = exp(x)*exp(y)
to your advantage? Suppose you store the value of exp(1) = 2.7182818284590452353...
Now, if I were to ask you to compute the value of exp(1.3), how would you use the above information?
exp(1.3) = exp(1)*exp(0.3)
But we KNOW the value of exp(1) already. In fact, with a little thought, this will let you reduce the range for an exponential down to needing the series to converge rapidly only for abs(x) <= 0.5.
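A minimal sketch of that idea (my own code, not the answerer's; it assumes x is split into an integer part m and a remainder r with |r| <= 0.5):
x = -12.3; % example input
m = round(x); % integer part
r = x - m; % remainder, |r| <= 0.5, so the series converges quickly
s = 1; a = 1; y = 1;
for k = 1:20
    a = a/k;
    y = y*r;
    s = s + a*y; % partial sum of the series for exp(r)
end
approx = s*exp(1)^m % exp(x) = exp(m)*exp(r); compare against exp(x)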
Edit: There is a second way one can do range reduction using a variation of the same identity.
exp(x) = exp(x/2)*exp(x/2) = exp(x/2)^2
Thus, suppose you wish to compute the exponential of large number, perhaps 12.8. Getting this to converge acceptably fast will take many terms in the simple series, and there will be a great deal of subtractive cancellation happening, so you won't get good accuracy anyway. However, if we recognize that
12.8 = 2*6.4 = 2*2*3.2 = ... = 16*0.8
then IF you could efficiently compute the exponential of 0.8, then the desired value is easy to recover, perhaps by repeated squaring.
exp(12.8)
ans =
362217.449611248
a = exp(0.8)
a =
2.22554092849247
a = a*a;
a = a*a;
a = a*a;
a = a*a
a =
362217.449611249
exp(0.8)^16
ans =
362217.449611249
Note that WHENEVER you do range reduction using methods like this, while you may incur numerical problems due to the additional computations necessary, you will usually come out way ahead due to the greatly enhanced convergence of your series.
Why do you think that's the wrong answer? Look at the last term of that sequence and its size, and tell me why you expect you should have an answer that's close to 0.
My original answer stated that roundoff error was the problem. That will be a problem with this basic approach, but why do you think 40 terms is enough for the appropriate mathematical (as opposed to computer floating point arithmetic) answer?
100^40 / 40! ~= 10^32.
woodchips has the right idea with range reduction. That's the typical approach people use to implement these kinds of functions very quickly. Once you get that all figured out, you deal with roundoff errors of alternating sequences by summing adjacent terms within the loop, and stepping with k = 1:2:40 (for instance). That doesn't work here until you use woodchips' idea, because for x = -100 the summands grow for a very long time. You need |x| < 1 to guarantee the intermediate terms are shrinking, and thus a rewrite will work.
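For illustration, a hedged sketch of that adjacent-term pairing (my own code, assuming |x| is already small after range reduction):
x = -0.4; % example input after range reduction
s = 1; term = 1;
for k = 1:2:40
    term = term*x/k; % term is now x^k/k!
    s = s + term*(1 + x/(k+1)); % add the k-th and (k+1)-th terms together
    term = term*x/(k+1); % advance to x^(k+1)/(k+1)! for the next pair
end
s % compare with exp(x)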