TASK: Using the Monte Carlo method, compute the approximation of PI for N=100,200,500,1000,2000,5000,10000,100000 and plot the Error of Approximation against N on a LogLog plot, where Error=abs(Actual Value - Approximate Value). In addition, compute PI using two other infinite series methods for N=10,20,50,100,200,500,1000,2000,5000,10000. Plot the Error of Approximation for both series formulae and the Monte Carlo method on the same LogLog graph.
Estimating PI using M_C Method.
clear all
clc
close all
%n = linspace(0, 100000, 100)
n = [100, 200, 500, 1000, 2000, 5000, 10000, 100000]
c = 0;
t = 0;
for q=1:length(n)
x = rand([1 n(q)]);
y = rand([1 n(q)]);
for i= 1:n(q)
t = t+1;
if x(i)^2 + y(i)^2 <= 1
c = c+1;
figure(2)
k(i) = x(i);
r(i) = y(i);
hold on;
else
figure(2)
p(i) = x(i);
j(i) = y(i);
end
end
figure(1)
hold on
if n(q) == 1000
plot(k, r, 'b.');
plot(p, j, 'r.');
end
ratio = c/t;
PI= 4*ratio
error = abs(pi - PI)
figure(2)
loglog(n(q), error, '-b');
end
loglog(n, error, 's-')
grid on
%% Calculating PI using the James Gregory
%n = 10000;
n = [10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000]
d = 1;
c = 4;
for j = 1:n
d = d + 2;
c = c + ((-1)^(j))*(4)*(1/d);
end
PI_1 = c;
error = abs(n - PI_1);
loglog(n,error, '-s')
display(c);
%% Calculating PI using the Leibniz Formula for PI
%n = 10000;
n = [10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000]
d = 1;
c = 1;
for k = 1:n
d = d + 2;
c = c + ((-1)^k)*(1/d);
end
PI_2 = c*4;
error = abs(n - PI_2);
figure(3)
loglog(n, error, '-s')
The problem I'm having is that the loglog plots don't display the expected information.
Plotting Discussion
For the Monte Carlo plotting, the line
loglog(n, error, 's-')
following the for-loop overwrites all of the plotting done by
loglog(n(q), error, '-b');
because a hold('on') was never issued for figure(2).
Also, in both cases, the plots will look odd due both to the style options and the fact that error is not a vector:
The call loglog(n, error, 's-') will generate a series of disconnected boxes since n is a vector and error is a scalar; Matlab interprets the elements of n as separated data sets each associated with the same scalar value error (the error from the last iteration of the for-loop; plot([1,2],0,'s-') is another example).
The call loglog(n(q), error, '-b'); has a similar problem. Since the style calls for a "solid blue line" but the data passed to loglog is a scalar-scalar pair each iteration, nothing will appear. Matlab cannot form a line from scalar-scalar input (compare the line plot plot(1,1,'-b') with the circle plot plot(1,1,'ob') as another example).
These problems can be fixed by changing error to a vector of length(n):
error = zeros(1,length(n)); % before for-loop
...
error(q) = abs(pi - PI); % inside the q-for-loop
and performing the loglog plot after the for-loop only (this is also a performance increase since plotting calls are heavy relative to computation).
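As a minimal sketch of that fix, assuming the hit counter is reset for every sample size (the question keeps running totals in c and t instead) and using the illustrative name errMC so that the built-in error function is not shadowed:

```matlab
% Sample sizes from the task statement
n = [100, 200, 500, 1000, 2000, 5000, 10000, 100000];

% One error entry per sample size (errMC is an illustrative name)
errMC = zeros(1, length(n));

for q = 1:length(n)
    x = rand(1, n(q));
    y = rand(1, n(q));

    % Count the points that fall inside the quarter circle
    c = 0;
    for i = 1:n(q)
        if x(i)^2 + y(i)^2 <= 1
            c = c + 1;
        end
    end

    PI = 4 * c / n(q);
    errMC(q) = abs(pi - PI);   % store the error, do not plot yet
end

% A single plotting call after the loop: vector against vector
figure
loglog(n, errMC, 's-')
grid on
xlabel('N')
ylabel('Error')
```

Because both arguments to loglog are now vectors of the same length, the markers are joined by a line and nothing has to be held between iterations.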
Performance Discussion
Speaking of performance (to speed up your Monte Carlo), the crowning virtue of Monte Carlo integration, besides not succumbing to the curse of dimensionality, is its ridiculously parallelizable (i.e., vectorizable) nature.
And this is a great thing since vanilla Monte Carlo requires a lot of samples to get accurate results.
Also, Matlab's logical indexing allows for a nice semantic way to pull values satisfying a number of criteria.
With that said, your i-for-loop for the Monte Carlo method can be vectorized with the following code:
% % ----- i-for-loop replacement
% Determine location of points
inCircle = (x.^2 + y.^2) <= 1;
% k = xIn, r = yIn
xIn = x(inCircle);
yIn = y(inCircle);
% p = xOut, j = yOut
xOut = x(~inCircle); % or x(not(inCircle));
yOut = y(~inCircle); % or y(not(inCircle));
% % ----- end of i-for-loop replacement
% Calculate MC pi and error
ratio = nnz(inCircle)/n(q);
PI = 4*ratio;
error(q) = abs(pi - PI);
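The series part of the task can be handled in the same vector-first way. Two things in the question's series code are worth noting: for j = 1:n with a vector n only runs up to n(1), and the error is computed against n rather than pi. The two series sections also appear to implement the same Gregory-Leibniz series. The following is only a sketch, assuming N means the number of terms summed; the names terms, partialSums and errSeries are illustrative:

```matlab
n = [10, 20, 50, 100, 200, 500, 1000, 2000, 5000, 10000];

% Terms of the Gregory-Leibniz series: 4*(1 - 1/3 + 1/5 - ...)
k = 0:(max(n) - 1);
terms = 4 * (-1).^k ./ (2*k + 1);

% partialSums(m) is the sum of the first m terms
partialSums = cumsum(terms);

% Pick out the partial sums for the requested term counts
errSeries = abs(pi - partialSums(n));

figure
loglog(n, errSeries, 's-')
grid on
xlabel('N (number of terms)')
ylabel('Error')
```

Calling hold on before a second loglog would let the Monte Carlo errors share the same axes.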
I have attempted to implement an iterative (loop-based) version of the gradient descent algorithm, but it is not working correctly. The vectorized implementation of the same algorithm, however, works correctly.
Here is the iterative implementation :
function [theta] = gradientDescent_i(X, y, theta, alpha, iterations)
% get the number of rows and columns
nrows = size(X, 1);
ncols = size(X, 2);
% initialize the hypothesis vector
h = zeros(nrows, 1);
% initialize the temporary theta vector
theta_temp = zeros(ncols, 1);
% run gradient descent for the specified number of iterations
count = 1;
while count <= iterations
% calculate the hypothesis values and fill into the vector
for i = 1 : nrows
for j = 1 : ncols
term = theta(j) * X(i, j);
h(i) = h(i) + term;
end
end
% calculate the gradient
for j = 1 : ncols
for i = 1 : nrows
term = (h(i) - y(i)) * X(i, j);
theta_temp(j) = theta_temp(j) + term;
end
end
% update the gradient with the factor
fact = alpha / nrows;
for i = 1 : ncols
theta_temp(i) = fact * theta_temp(i);
end
% update the theta
for i = 1 : ncols
theta(i) = theta(i) - theta_temp(i);
end
% update the count
count += 1;
end
end
And below is the vectorized implementation of the same algorithm :
function [theta, theta_all, J_cost] = gradientDescent(X, y, theta, alpha)
% set the learning rate
learn_rate = alpha;
% set the number of iterations
n = 1500;
% number of training examples
m = length(y);
% initialize the theta_new vector
l = length(theta);
theta_new = zeros(l,1);
% initialize the cost vector
J_cost = zeros(n,1);
% initialize the vector to store all the calculated theta values
theta_all = zeros(n,2);
% perform gradient descent for the specified number of iterations
for i = 1 : n
% calculate the hypothesis
hypothesis = X * theta;
% calculate the error
err = hypothesis - y;
% calculate the gradient
grad = X' * err;
% calculate the new theta
theta_new = (learn_rate/m) .* grad;
% update the old theta
theta = theta - theta_new;
% update the cost
J_cost(i) = computeCost(X, y, theta);
% store the calculated theta value
if i < n
index = i + 1;
theta_all(index,:) = theta';
end
end
Link to the dataset can be found here
The filename is ex1data1.txt
ISSUES
For initial theta = [0, 0] (this is a vector!), learning rate of 0.01 and running this for 1500 iterations I get the optimal theta as :
theta0 = -3.6303
theta1 = 1.1664
The above is the output for the vectorized implementation which I know I have implemented correctly (it passed all the test cases on Coursera).
However, when I implemented the same algorithm using the iterative method (1st code I mentioned) the theta values I get are (alpha = 0.01, iterations = 1500):
theta0 = -0.20720
theta1 = -0.77392
This implementation fails to pass the test cases and I know therefore that the implementation is incorrect.
However, I am unable to understand where I am going wrong, as the iterative code does the same job and the same multiplications as the vectorized one. When I traced one iteration of both codes on pen and paper the values came out the same, but they differed when I ran them on Octave.
Any help regarding this would be of great help especially if you could point out where I went wrong and what exactly was the cause of failure.
Points to consider
The implementation of hypothesis is correct as I tested it out and both the codes gave the same results, so no issues here.
I printed the output of the gradient vector in both the codes and realised that the error lies here because the outputs here were very different!
Additionally, here is the code for pre-processing the data :
function[X, y] = fileReader(filename)
% load the dataset
dataset = load(filename);
% get the dimensions of the dataset
nrows = size(dataset, 1);
ncols = size(dataset, 2);
% generate the X matrix from the dataset
X = dataset(:, 1 : ncols - 1);
% generate the y vector
y = dataset(:, ncols);
% append 1's to the X matrix
X = [ones(nrows, 1), X];
end
The problem with the first code is that the theta_temp and h vectors are not being re-initialised properly. In the very first iteration (when count equals 1) your code runs correctly, because h and theta_temp have just been initialised to zero. However, these are temporary accumulators for each iteration of gradient descent, and they are never reset to zero vectors for the subsequent iterations. That is, in iteration 2 the new terms for h(i) and theta_temp(j) are simply added on top of the old values, so the code does not work properly. You need to reset both vectors to zeros at the beginning of each iteration; then it works correctly. Here is my version of your code (the first one; observe the changes):
function [theta] = gradientDescent_i(X, y, theta, alpha, iterations)
% get the number of rows and columns
nrows = size(X, 1);
ncols = size(X, 2);
% run gradient descent for the specified number of iterations
count = 1;
while count <= iterations
% initialize the hypothesis vector
h = zeros(nrows, 1);
% initialize the temporary theta vector
theta_temp = zeros(ncols, 1);
% calculate the hypothesis values and fill into the vector
for i = 1 : nrows
for j = 1 : ncols
term = theta(j) * X(i, j);
h(i) = h(i) + term;
end
end
% calculate the gradient
for j = 1 : ncols
for i = 1 : nrows
term = (h(i) - y(i)) * X(i, j);
theta_temp(j) = theta_temp(j) + term;
end
end
% update the gradient with the factor
fact = alpha / nrows;
for i = 1 : ncols
theta_temp(i) = fact * theta_temp(i);
end
% update the theta
for i = 1 : ncols
theta(i) = theta(i) - theta_temp(i);
end
% update the count
count += 1;
end
end
I ran the code and it gave the same theta values that you mentioned. However, what I wonder is how you could state that the hypothesis vector was the same in both cases when, clearly, it was one of the reasons the first code failed!
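The effect of a stale accumulator can be seen in isolation with a small sketch. The data below are made-up illustrative values (not ex1data1.txt), and theta is deliberately held fixed over two passes so that the only difference is whether h is reset:

```matlab
% Small synthetic data set (illustrative values only)
X = [ones(5, 1), (1:5)'];
theta = [0.5; 0.5];

nrows = size(X, 1);
ncols = size(X, 2);

h_stale = zeros(nrows, 1);          % initialised once, never reset
for pass = 1:2
    h_fresh = zeros(nrows, 1);      % reset at the start of every pass
    for i = 1:nrows
        for j = 1:ncols
            h_fresh(i) = h_fresh(i) + theta(j) * X(i, j);
            h_stale(i) = h_stale(i) + theta(j) * X(i, j);
        end
    end
end

h_vec = X * theta;                  % vectorized hypothesis

disp([h_vec, h_fresh, h_stale])     % columns: reference, reset, not reset
```

After the second pass h_fresh still matches X * theta, while h_stale has doubled. A check made only on the first pass would therefore show no difference between the two implementations.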
The formula for the discrete double Fourier series that I'm attempting to code in MATLAB is:
The coefficient in front of the trigonometric sum (the Fourier amplitude) is what I'm trying to extract by fitting the data with the double Fourier series seen above. With my current code the original function is not reconstructed, so my coefficients cannot be correct. I'm not certain whether this is of any significance, but the second term for the A coefficients (Akn(1)) is 13 orders of magnitude larger than any other coefficient.
Any suggestions, modifications, or comments about my program would be greatly appreciated.
%data = csvread('digitized_plot_data.csv',1);
%xdata = data(:,1);
%ydata = data(:,2);
%x0 = xdata(1);
lambda = 20; %km
tau = 20; %s
vs = 7.6; %k/s (velocity of CHAMP satellite)
L = 4; %S
% Number of terms to use:
N = 100;
% set up matrices:
M = zeros(length(xdata),1+2*N);
M(:,1) = 1;
for k=1:N
for n=1:N %error using *, inner matrix dimensions must agree...
M(:,2*n) = cos(2*pi/lambda*k*vs*xdata).*cos(2*pi/tau*n*xdata);
M(:,2*n+1) = sin(2*pi/lambda*k*vs*xdata).*sin(2*pi/tau*n*xdata);
end
end
C = M\ydata;
%least squares coefficients:
A0 = C(1);
Akn = C(2:2:end);
Bkn = C(3:2:end);
% reconstruct original function values (verification check):
y = A0;
for k=1:length(Akn)
y = y + Akn(k)*cos(2*pi/lambda*k*vs*xdata).*cos(2*pi/tau*n*xdata) + Bkn(k)*sin(2*pi/lambda*k*vs*xdata).*sin(2*pi/tau*n*xdata);
end
% plotting
hold on
plot(xdata,ydata,'ko')
plot(xdata,yk,'b--')
legend('Data','Least Squares','location','northeast')
xlabel('Centered Time Event [s]'); ylabel('J[\muA/m^2]'); title('Single FAC Event (50 Hz)')
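One thing that stands out in the code above is that the columns of M are indexed by n only, so each pass of the outer k loop overwrites the columns written by the previous pass and only k = N survives. The reconstruction loop also reuses whatever value n had at the end of the fit, and the plot refers to yk rather than y. The following sketch gives every (k,n) pair its own pair of columns. It uses synthetic stand-in data because the CSV file is not available here, and a small N chosen purely for illustration; col, C_true and yfit are illustrative names:

```matlab
% Synthetic stand-in for the digitized data
lambda = 20; tau = 20; vs = 7.6;
xdata = linspace(0, 40, 400)';
N = 2;                                  % N^2 (k,n) pairs in total

% Constant column plus one cosine and one sine column per (k,n) pair
M = ones(length(xdata), 1 + 2*N^2);
col = 2;
for k = 1:N
    for n = 1:N
        M(:, col)     = cos(2*pi/lambda*k*vs*xdata) .* cos(2*pi/tau*n*xdata);
        M(:, col + 1) = sin(2*pi/lambda*k*vs*xdata) .* sin(2*pi/tau*n*xdata);
        col = col + 2;
    end
end

% Known coefficients, used only to check that the fit recovers them
C_true = (1:size(M, 2))';
ydata = M * C_true;

C = M \ ydata;      % least-squares coefficients
yfit = M * C;       % reconstruction reuses exactly the same columns
```

Reconstructing with M * C avoids rebuilding the trigonometric terms by hand, so the fit and the verification cannot drift apart. Note that the number of columns grows as 2*N^2 + 1, so N = 100 would need far more data points than unknowns to stay well posed.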
I want to replicate a figure from this article. More specifically, I want to replicate Figure number 4, which I believe is the representation of Equation 9.
So far I have come up with this code:
% implementing equation 9 and figure 4
step = 0.01; t = 1:step:3600;
d = 3; % dimension
N = 8000; % number of molecules
H = 0.01; % H = [0.01,0.1,1] is in mol/micrometer^3
H = H*6.02214078^5; % hence I scaled the Avogadro's number (right or wrong?)
D = 10; % diffusion coefficient in micrometer^2/sec
u(1) = 1./(1.^(d/2)); % inner function in equation 9; first pulse
for i = 2:numel(t)/1000
u(i) = u(i-1)+(1./(i.^(d/2))); % u-> the pulse number
lmda(i) = (1/(4*pi*D))*((N/(H)).*sum(u)).^(2/d);
end
figure;plot(lmda)
But I am not able to replicate it.
Equation 9
For details on the parameters, refer to the above code. The authors did mention that the summation in equation 9 is a Riemann zeta series. I wonder if that has anything to do with the result?
Figure 4, which I am trying to replicate:
Could someone kindly tell me the mistake I am making?
P.s: This is not a homework.
Problem 1: You think you are scaling by Avogadro's number on this line
H = H*6.02214078^5;
In fact, you're scaling by approximately 7920=6.022^5. If you wanted to scale by the Avogadro number then you should do:
H = H * 6.02214078e23 % = 6.02214078 * 10^23 : the Avogadro number
Problem 2: You aren't plotting against t, you're plotting against the sample number which doesn't really make sense (unless your t happened to be in integer seconds). Remove the /1000 from your loop
for i = 2:numel(t)
% ...
end
% Then plot
plot(t, lmda)
At this stage we can see something is really wrong. Now that we're scaling by the correct Avogadro number, the orders of magnitude are way out. I suggest that you trust that the H in figure 4 and the H in equation 9 are the same H; it would be very confusing if the author intended anything different!
On that basis, I would suggest you are using the wrong D, N, or time between pulses. I've set up the pulse timing a bit clearer in my code below. I've also streamlined your loop somewhat using vectorisation, and removed the H scaling.
If you tweak it so dtPulses=100 as well as D=100, then the plots are almost identical. You perhaps need to consider how these two numbers affect the result...
% implementing equation 9 and figure 4
d = 3; % dimension
N = 8000; % number of molecules
D = 100; % diffusion coefficient in micrometer^2/sec
dtPulses = 10; % Seconds between pulses
tPulses = 1:dtPulses:3600; % Time array to plot against
nt = numel(tPulses);
i = 1:nt; % pulse numbers
u = 1 ./ (i.^(d/2)); % inner function in equation 9: individual pulse
for k = 2:nt % Running sum
u(k) = u(k-1)+u(k);
end
% Now plot for different H (mol/micrometer^3)
H = [0.01, 0.1, 1];
figure; hold on; linestyles = {':k', '--k', '-k'};
for nH = 1:3
lmda = ((1/(4*pi*D))*(N/H(nH)).*u).^(2/d);
plot(tPulses, lmda, linestyles{nH}, 'linewidth', 2)
end
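The running-sum loop in the code above can also be written with the built-in cumsum. A small sketch comparing the two forms, with nt = 360 matching dtPulses = 10 and the names terms, u_loop and u_cumsum chosen for illustration:

```matlab
d = 3;                       % dimension, as above
nt = 360;                    % number of pulses for dtPulses = 10
i = 1:nt;                    % pulse numbers
terms = 1 ./ (i.^(d/2));     % individual pulse contributions

% Loop form of the running sum
u_loop = terms;
for k = 2:nt
    u_loop(k) = u_loop(k-1) + u_loop(k);
end

% Built-in cumulative sum
u_cumsum = cumsum(terms);

disp(max(abs(u_loop - u_cumsum)))   % difference between the two forms
disp(u_cumsum(end))                 % last partial sum
```

For d = 3 these are partial sums of the series for the Riemann zeta function at 3/2, which converges to roughly 2.612. The running sum therefore levels off as the pulse number grows, which may be part of why the curves flatten.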
I'm trying to plot the following equation (let's call it "Equation 1"):
This is the code I'm testing:
clear all;
xl=0; xr=1; % x domain [xl,xr]
J = 10; % J: number of division for x
dx = (xr-xl) / J; % dx: mesh size
tf = 0.1; % final simulation time
Nt = 60; % Nt: number of time steps
dt = tf/Nt/4;
x = xl : dx : xr; % generate the grid point
u_ex = zeros(J+1,Nt);
for n = 1:Nt
t = n*dt; % current time
for j=1:J+1
xj = xl + (j-1)*dx;
suma = zeros(100 , 1);
for k= 1:100
suma(k) = 4/(((2*k-1)^2) *pi*pi);
suma(k) = suma(k) * exp(-((2*k-1)^2) *pi*pi*t) * cos(2*k-1)*pi*xj;
end
m = sum(suma);
u_ex(j, n)= 0.5 - m;
end
end
tt = dt : dt : Nt*dt;
figure(1)
surf(x,tt, u_ex'); % 3-D surface plot
xlabel('x')
ylabel('t')
zlabel('u')
The problem is that all I get is a flat surface:
Equation 1 is supposed to be the solution of the following parabolic partial differential equation with boundary values:
And after getting the numerical solution, it should look like this:
This plot gets the right values at the boundaries x = 0 and x = 1. The plot of Equation 1 doesn't have those values at the boundaries.
My complete .m code (that plots both the numerical solution and Equation 1) is:
clear all; % clear all variables in memory
xl=0; xr=1; % x domain [xl,xr]
J = 10; % J: number of division for x
dx = (xr-xl) / J; % dx: mesh size
tf = 0.1; % final simulation time
Nt = 60; % Nt: number of time steps
dt = tf/Nt/4;
mu = dt/(dx)^2;
if mu > 0.5 % make sure dt satisy stability condition
error('mu should < 0.5!')
end
% Evaluate the initial conditions
x = xl : dx : xr; % generate the grid point
% store the solution at all grid points for all time steps
u = zeros(J+1,Nt);
u_ex = zeros(J+1,Nt);
% Find the approximate solution at each time step
for n = 1:Nt
t = n*dt; % current time
% boundary condition at left side
gl = 0;
% boundary condition at right side
gr = 0;
for j=2:J
if n==1 % first time step
u(j,n) = j;
else % interior nodes
u(j,n)=u(j,n-1) + mu*(u(j+1,n-1) - 2*u(j,n-1) + u(j-1,n-1));
end
end
u(1,n) = gl; % the left-end point
u(J+1,n) = gr; % the right-end point
% calculate the analytic solution
for j=1:J+1
xj = xl + (j-1)*dx;
suma = zeros(100 , 1);
for k= 1:100
suma(k) = 4/(((2*k-1)^2) *pi*pi);
suma(k) = suma(k) * exp(-((2*k-1)^2) *pi*pi*t) * cos(2*k-1)*pi*xj;
end
m = sum(suma);
u_ex(j, n)= 0.5 - m;
end
end
% Plot the results
tt = dt : dt : Nt*dt;
figure(1)
colormap(gray); % draw gray figure
surf(x,tt, u'); % 3-D surface plot
xlabel('x')
ylabel('t')
zlabel('u')
title('Numerical solution of 1-D parabolic equation')
figure(2)
surf(x,tt, u_ex'); % 3-D surface plot
xlabel('x')
ylabel('t')
zlabel('u')
title('Analytic solution of 1-D parabolic equation')
maxerr=max(max(abs(u-u_ex))),
The code is taken from the book "Computational Partial Differential Equations Using MATLAB" by Yi-Tung Chen, Jichun Li, chapter 2, exercise 3.
In short: I'm not asking about the differential equation or the boundary problem. What I want to know is: why am I getting a flat surface when plotting Equation 1? Am I missing a parenthesis?
I do not want to use the symsum function because the script never finishes executing with it, and I want to learn how to plot Equation 1 without using symsum.
I've tested this code with Matlab R2008b and Octave 4.2.1. I got the same results (even with sums of 1000, 10000 and 50000 terms in the for loop with the k variable).
Edit!
Thanks, Steve!
I was missing a pair of parentheses around the cosine argument; the corrected code is:
clear all; % clear all variables in memory
xl=0; xr=1; % x domain [xl,xr]
J = 10; % J: number of division for x
dx = (xr-xl) / J; % dx: mesh size
tf = 0.1; % final simulation time
Nt = 60; % Nt: number of time steps
dt = tf/Nt/4;
mu = dt/(dx)^2;
if mu > 0.5 % make sure dt satisy stability condition
error('mu should < 0.5!')
end
% Evaluate the initial conditions
x = xl : dx : xr; % generate the grid point
% store the solution at all grid points for all time steps
u = zeros(J+1,Nt);
u_ex = zeros(J+1,Nt);
% Find the approximate solution at each time step
for n = 1:Nt
t = n*dt; % current time
% boundary condition at left side
gl = 0;
% boundary condition at right side
gr = 0;
for j=2:J
if n==1 % first time step
u(j,n) = j;
else % interior nodes
u(j,n)=u(j,n-1) + mu*(u(j+1,n-1) - 2*u(j,n-1) + u(j-1,n-1));
end
end
u(1,n) = gl; % the left-end point
u(J+1,n) = gr; % the right-end point
% calculate the analytic solution
for j=1:J+1
xj = xl + (j-1)*dx;
suma = zeros(1000 , 1);
for k= 1:1000
suma(k) = 4/(((2*k-1)^2) *pi*pi);
suma(k) *= exp(-((2*k-1)^2) *pi*pi*t) * cos((2*k-1)*pi*xj);
end
m = sum(suma);
u_ex(j, n)= 0.5 - m;
end
end
% Plot the results
tt = dt : dt : Nt*dt;
figure(1)
colormap(gray); % draw gray figure
surf(x,tt, u'); % 3-D surface plot
xlabel('x')
ylabel('t')
zlabel('u')
title('Numerical solution of 1-D parabolic equation')
figure(2)
surf(x,tt, u_ex'); % 3-D surface plot
xlabel('x')
ylabel('t')
zlabel('u')
title('Analytic solution of 1-D parabolic equation')
Now my Equation 1 looks much better:
Also Steve was right when pointing out that my numerical solution may be wrong. I didn't notice that the boundary values are for the derivatives of my function, not the actual values of the function. I'll ask my teacher about this.
Edit2!
Ok, I got it. To calculate the derivatives at the boundaries you have to use hint 2.21 in the same book:
% hint 2.21 given by the book
% it is better to calculate the boundary values after calculating the inner points inside the for j = 1:m loop because you will need them:
u(1, n) = u(2, n) - dx * gl; % the left-end point
u(J+1,n) = u(J, n) + dx * gr; % the right-end point
Now my numerical solution looks like my analytic solution :D
Matlab R2008b doesn't recognize the *= operator that Octave does. I haven't tested this operator in other versions of Matlab because I'm too poor.
Yvon: I think the analytical solution formula comes from the real part of a Fourier expansion, but authors don't tell how they got it.
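To see what the missing parentheses did, the series from the loop code above can be evaluated for a single time value without the inner loops. This is only a sketch: t = 0.01 is an arbitrary illustrative time, the names m, coef, u_good and u_bad are made up here, and repmat is used so that it should also run on older versions without implicit expansion.

```matlab
xl = 0; xr = 1; J = 10;
x = linspace(xl, xr, J + 1);     % grid points as a row vector
t = 0.01;                        % one sample time (illustrative)
k = (1:1000)';                   % series indices as a column
m = 2*k - 1;                     % odd integers 1, 3, 5, ...

% Coefficient and time decay of each term (1000-by-1)
coef = 4 ./ (m.^2 * pi^2) .* exp(-m.^2 * pi^2 * t);

% Correct grouping: the cosine argument is (2k-1)*pi*x
u_good = 0.5 - sum(repmat(coef, 1, J + 1) .* cos(m * pi * x), 1);

% Original grouping: cos(2k-1) is a constant that multiplies pi*x
u_bad = 0.5 - pi * x * sum(coef .* cos(m));

disp([x; u_good; u_bad])
```

With the original grouping every term is just a constant times x, so at each time the profile is a straight line in x instead of a cosine series. The corrected profile satisfies u(x) + u(1 - x) = 1, which is a quick sanity check on the grouping.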
I'm trying to plot the following function
% k-nn density estimation
% localSearcher is a handle class responsible for finding the k closest points to x
function z = k_nearest_neighbor(x, localSearcher)
% Total point count
n = localSearcher.getPointCount();
% Get the indexes of the k closest points to x
% (the k parameter is contained inside the localSearcher class)
idx = localSearcher.search(x);
% k is constant
k = length(idx);
% Compute the volume (i.e. the hypersphere centered in x that encloses every
% sample in idx)
V = localSearcher.computeVolume(x, idx);
% The estimate for the density function is p(x) = k / (n * V)
z = k / (n * V);
end
I know for sure that the above algorithm is correct, because I get a reasonable plot using the following function
% Plot the values of k_nearest_neighbor(x, searcher) by sampling it manually
function manualPlot(samples, searcher)
a = -2;
b = 2;
n = 1000;
h = (b - a) / n;
sp = linspace(a, b, n);
pt = zeros(n);
areas = zeros(n);
estimated_pdf = @(x)k_nearest_neighbor(x, searcher);
area = 0;
for i = 1 : length(sp)
x = sp(i);
pt(i) = estimated_pdf(x);
area = area + h * pt(i);
areas(i) = area;
end
figure, hold on
title('k-nn density estimation');
plot(sp, pt, sp, areas, ':r');
legend({'$p_n(x)$', '$\int_{-2}^{x} p_n(x)\, \, dx$'}, 'Interpreter', 'latex', 'FontSize', 14);
plot(samples,zeros(length(samples)),'ro','markerfacecolor', [1, 0, 0]);
axis auto
end
called by
function test2()
clear all
close all
% Pattern Classification (page 175)
samples = [71 / 800; 128 / 800; 223 / 800; 444 / 800; 475 / 800; 546 / 800; 641 / 800; 780 / 800];
% 3-nn density estimation
searcher = NaiveNearestSearcher(samples, 3);
manualPlot(samples, searcher);
end
which outputs
However, if I try to do the same thing with ezplot
% Plot the values of k_nearest_neighbor using ezplot
function autoPlot(samples, searcher)
estimated_pdf = @(x)k_nearest_neighbor(x, searcher);
figure, hold on
ezplot(estimated_pdf, [-2,2]);
title('k-nn density estimation');
legend({'$p_n(x)$', '$\int_{-2}^{x} p_n(x)\, \, dx$'}, 'Interpreter', 'latex', 'FontSize', 14);
plot(samples,zeros(length(samples)),'ro','markerfacecolor',[1,0,0]);
axis auto
end
I get the following incorrect result
No warnings are issued from the console.
It's like the searcher parameter passed to the anonymous function
estimated_pdf = @(x)k_nearest_neighbor(x, searcher);
ezplot(estimated_pdf, [-2,2]);
goes "out of scope" (or something) before ezplot terminates.
The really weird thing is that adding
function z = k_nearest_neighbor(x, localSearcher)
[... same identical code ... ]
global becauseWhyNotVector;
% becauseWhyNotVector(end + 1) = 1; NOT WORKING, I must use the x variable for some reason
becauseWhyNotVector(end + 1) = x;
end
apparently fixes the problem (!).
Here's the full source code, I'm using MATLAB R2011a.
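One possible explanation, offered as an assumption rather than a confirmed diagnosis, is that ezplot evaluates the handle on a whole vector of x values at once, whereas k_nearest_neighbor is written for a scalar x. Assigning a non-scalar x to becauseWhyNotVector(end + 1) would then raise an error, and that error may be what pushes ezplot back to point-by-point evaluation. If so, wrapping the scalar function with arrayfun would be a more direct remedy. The sketch below uses a hypothetical stand-in function, scalar_only, because the searcher class is not shown here:

```matlab
% Stand-in for an estimate that is only valid for scalar x
% (hypothetical; x^2 fails for a non-scalar row vector)
scalar_only = @(x) 1 / (1 + x^2);

% Wrapper that applies the scalar function to each element of its input
vector_safe = @(xs) arrayfun(scalar_only, xs);

xs = linspace(-2, 2, 5);     % -2, -1, 0, 1, 2
ys = vector_safe(xs);        % same size as xs

disp([xs; ys])
```

Applied to the question's code, the same pattern would read ezplot(@(xs) arrayfun(@(x) k_nearest_neighbor(x, searcher), xs), [-2, 2]);, which keeps the density function itself unchanged.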