I'm trying to sample 1000 numbers from a distribution with the probability density function f(x) = (1/3)x^2 , -1 < x < 2 using the rejection method. I also want to plot a histogram based on the data.
My textbook gives the following rules for using the rejection method:
1. Find such numbers a, b, and c that 0 ≤ f(x) ≤c for a ≤ x ≤ b. The bounding box
stretches along the x-axis from a to b and along the y-axis from 0 to c.
2. Obtain Standard Uniform random variables U and V from a random number generator
or a table of random numbers.
3. Define X = a+(b−a)U and Y = cV. Then X has Uniform(a,b) distribution, Y is
Uniform(0, c), and the point (X, Y ) is Uniformly distributed in the bounding box.
Based on those rules I wrote the following code, but I believe I'm really far off from a proper solution and could use some guidance.
a=-1; b=2; c=2;
while p < 1000
U = rand; V = rand;
X = a+U*(b-a); Y = c*V; f = (1/3)*X^2;
if Y<=f
x(p)=X;
p = p+1;
end
end
histogram(x);
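Where exactly is p defined? It has to be initialized before the loop. Since MATLAB indexes from 1, set p = 1 and loop while p <= 1000; starting at 0 would make x(p) fail on the first accepted sample.
As for the algorithm, it looks OK, except that it could be made more efficient: f(x) reaches its maximum value at x = 2, so you could set c to 4/3.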
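A minimal sketch of the corrected loop, assuming the acceptance test Y <= f(X) from the question's code and the tighter bound c = 4/3. The fixed seed, the preallocation, and the choice to count accepted samples from 0 and increment before storing are my own additions for illustration.

```matlab
rng(1);                       % fixed seed so the run is repeatable (assumption)
a = -1; b = 2; c = 4/3;       % bounding box; c = f(2) is the maximum of f on [a, b]
N = 1000;                     % number of samples wanted
x = zeros(N, 1);              % preallocate storage for the accepted samples
p = 0;                        % number of samples accepted so far
while p < N
    X = a + (b - a)*rand;     % candidate, Uniform(a, b)
    Y = c*rand;               % height, Uniform(0, c)
    if Y <= X^2/3             % accept when the point (X, Y) lies under f
        p = p + 1;
        x(p) = X;
    end
end
histogram(x);                 % histogram of the 1000 accepted samples
```

With c = 4/3 the box has area 4, so roughly one candidate in four is accepted, compared with one in six for c = 2.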
I am looking to calculate an array from a formula in the variables x and y, where the domain of x is (0,50) and the domain of y is (0,30). I am asked to discretise the domains of x and y with 0.01 separation between points and then compute L(x,y), for which I have a formula. (These will be points of a graph; ultimately I'm looking for the minimum length between points.)
I'm not sure what I need to define in my script: if I define x and y as arrays with 0.01 separation, they end up with different lengths, so the calculation fails.
%change these values for A, B and C positions
Ax=10;
Ay=5;
Bx=15;
By=25;
Cx=40;
Cy=10;
x = 0:0.01:50; % Array of values for x from 0-50 spaced at 0.01
y = 0:0.01:30; % Array of values for y from 0-30 spaced at 0.01
%length of point P from A, B and C and display
Lpa=sqrt((Ax-x).^2+(Ay-y).^2);
Lpb=sqrt((Bx-x).^2+(By-y).^2);
Lpc=sqrt((Cx-x).^2+(Cy-y).^2);
L=Lpa+Lpb+Lpc
I am getting an error telling me the two matrices are not the same size, which makes sense, but I'm not sure how to define a matrix that will give me the minimum x and y values I am after.
Any help would be greatly appreciated.
You want to calculate L for each possible pair of x and y. In other words, for the first value of x = 0, you will calculate L for all y values from 0 to 30, then for next value of x = 0.01, you will do the same and so on.
MATLAB has a really cool function called meshgrid that creates matrices covering every pair of x and y. So after generating x and y, change your code to the following to get a 2D matrix for L:
[X, Y] = meshgrid(x, y); % semicolon suppresses printing of the 3001-by-5001 matrices
%length of point P from A, B and C and display
Lpa = sqrt((Ax - X).^2 + (Ay - Y).^2);
Lpb = sqrt((Bx - X).^2 + (By - Y).^2);
Lpc = sqrt((Cx - X).^2 + (Cy - Y).^2);
L = Lpa + Lpb + Lpc; % total length for every (x, y) pair; semicolon suppresses the output
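Since the goal stated in the question is the minimum, here is a rough sketch of how the grid point with the smallest L could be located. The coarser step of 0.1 is an assumption to keep the example quick (set it to 0.01 for the resolution asked for), and the names step, Lmin, idx, xMin and yMin are made up for illustration.

```matlab
Ax = 10; Ay = 5; Bx = 15; By = 25; Cx = 40; Cy = 10;   % positions from the question
step = 0.1;                         % assumed coarse spacing; use 0.01 for the real run
x = 0:step:50;
y = 0:step:30;
[X, Y] = meshgrid(x, y);            % X and Y are both numel(y)-by-numel(x)

% Total distance from each grid point P = (X, Y) to A, B and C
L = hypot(Ax - X, Ay - Y) + hypot(Bx - X, By - Y) + hypot(Cx - X, Cy - Y);

[Lmin, idx] = min(L(:));            % smallest value and its linear index
xMin = X(idx);                      % coordinates of the minimising grid point
yMin = Y(idx);
fprintf('min L = %.4f at x = %.2f, y = %.2f\n', Lmin, xMin, yMin);
```

Linear indexing with idx works on X and Y directly because they have the same size as L.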
I want to integrate
f(x) = exp(-x^2/2)
from x=-infinity to x=+infinity
by using the Monte Carlo method. I use the function randn() to generate all the x_i, evaluate f(x_i) = exp(-x_i^2/2) for each of them, and then calculate the mean value of f([x_1,..x_n]). My problem is that the result depends on the values I choose for my borders x1 and x2 (see below). The result moves further away from the real value as I increase x1 and x2, whereas it should actually get better and better.
Does someone see my mistake?
Here is my Matlab code
clear all;
b=10; % border
x1 = -b; % left border
x2 = b; % right border
n = 10^6; % number of random numbers
x = randn(n,1);
f = ones(n,1);
g = exp(-(x.^2)/2);
F = ((x2-x1)/n)*f'*g;
The right value should be ~2.5066.
Thanks
Try this:
clear all;
b=10; % border
x1 = -b; % left border
x2 = b; % right border
n = 10^6; % number of random numbers
x = sort(abs(x1 - x2) * rand(n,1) + x1);
f = exp(-x.^2/2);
F = trapz(x,f)
F =
2.5066
OK, let's start by writing out the general case of MC integration:
I = S f(x) * p(x) dx, x in [a...b]
S here is the integral sign.
Usually, p(x) is a normalized probability density function and f(x) is the function you want to integrate, and the algorithm is a very simple one:
set accumulator s to zero
start loop of N events
sample x randomly from p(x)
given x, compute f(x) and add to accumulator
back to start loop if not done
if done, divide accumulator by N and return it
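The loop above could be sketched in MATLAB roughly as follows. The integrand f(x) = x^2 and the standard normal p(x) are my own choices for illustration, picked only because the exact answer (1) is known; N and the seed are assumptions too.

```matlab
rng(0);                 % fixed seed for repeatability (assumption)
N = 1e6;                % number of events
f = @(x) x.^2;          % hypothetical integrand; E[X^2] = 1 for a standard normal X
s = 0;                  % set accumulator to zero
for k = 1:N
    x = randn;          % sample x randomly from p(x), here the standard normal
    s = s + f(x);       % compute f(x) and add it to the accumulator
end
I = s / N;              % divide the accumulator by N
fprintf('MC estimate = %.4f (exact value 1)\n', I);
```

The vectorized form mean(f(randn(N,1))) computes the same estimate without the explicit loop.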
In the simplest textbook case you have
I = S f(x) dx, x in [a...b]
which means the PDF is the uniform one
p(x) = 1/(b-a)
but what you have to sum is actually (b-a)*f(x), because your integral now looks like
I = S (b-a)*f(x) 1/(b-a) dx, x in [a...b]
In general, if both f(x) and p(x) could serve as a PDF, then it is a matter of choice whether you integrate f(x) over p(x), or p(x) over f(x). No difference! (Well, except maybe computation time.)
So, back to the particular integral (which is equal to \sqrt{2\pi}, I believe)
I = S exp(-x^2/2) dx, x in [-infinity...infinity]
You could use the more traditional approach, like @Agriculturist, and write it as
I = S exp(-x^2/2)*(2a) 1/(2a) dx, x in [-a...a]
and sample x uniformly on the [-a...a] interval (by rescaling U(0,1) draws), compute exp(-x^2/2)*(2a) for each x, and average to get the result.
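A minimal sketch of that uniform-sampling estimate, assuming a = 10 and N = 10^6 (both picked to match the question's b and n) and a fixed seed. The factor 2a is applied once after averaging, which is equivalent to multiplying every term by it.

```matlab
rng(0);                              % fixed seed for repeatability (assumption)
a = 10;                              % half-width of the sampling interval
N = 1e6;                             % number of samples
x = -a + 2*a*rand(N, 1);             % x uniform on [-a, a], rescaled from U(0,1)
F = 2*a * mean(exp(-x.^2/2));        % (b-a) times the average of f(x)
fprintf('F = %.4f, sqrt(2*pi) = %.4f\n', F, sqrt(2*pi));
```

The tails beyond |x| = 10 contribute practically nothing, but widening a wastes more samples where f is tiny, so the statistical error grows with a for a fixed N.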
From what I understand, you want to use exp() as the PDF, so your integral looks like
I = S D * exp(-x^2/2)/D dx, x in [-infinity...infinity]
The PDF has to be normalized, so it must include the normalization factor D, which is exactly equal to \sqrt{2 \pi} from the Gaussian integral.
Now f(x) is just a constant equal to D. It doesn't depend on x. This means that for each sampled x you add the CONSTANT value D to the accumulator. After running N samples, the accumulator holds exactly N*D. To find the mean you divide by N, and the result is exactly D, which is \sqrt{2 \pi}, which in turn is 2.5066.
Too rusty to write any matlab, and Happy New Year anyway
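For what it's worth, here is a rough MATLAB sketch of the randn-based version described above, next to what the code in the question computes. The variable names p, g, F and Fquestion are mine, and the seed is an assumption.

```matlab
rng(0);                              % fixed seed for repeatability (assumption)
N = 1e6;
x = randn(N, 1);                     % samples from the standard normal PDF
p = exp(-x.^2/2) / sqrt(2*pi);       % value of that PDF at each sample
g = exp(-x.^2/2);                    % the function to integrate

F = mean(g ./ p);                    % each term is the constant D = sqrt(2*pi)

b = 10;                              % border from the question
Fquestion = 2*b * mean(g);           % what the question's formula evaluates
fprintf('F = %.4f, question''s formula = %.4f\n', F, Fquestion);
```

Averaging g over normal samples estimates S g(x) p(x) dx = 1/\sqrt{2}, not the integral of g, so the question's formula tends to 2b/\sqrt{2} (about 14.14 for b = 10) and grows with the borders, which matches the behaviour described in the question.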
Problem: Create the function mylinecheck(a,b,c,d,e,f) which takes six inputs:
a,b,c,d,e,f which are real numbers, and a,c,e are not equal. The function must check if the three points (a,b), (c,d), and (e,f) all lie on the same line. If so, return a 1. If not, return a 0.
I think what I want to do is tell MATLAB to check if coordinates (c,d) and (e,f) are multiples of (a,b), and then if not I will return a 0. If so, I will return a 1. If this is the right thought process, I'm not sure how to command MATLAB to do so. Any advice would be greatly appreciated.
The points (x1,y1), (x2,y2), and (x3,y3) lie on the same line if and only if they satisfy
a x + b y + c = 0
for fixed values of a, b, and c (I cannot get over your notation; sorry for the "confusion"), where a and b are not both zero. Hence they lie on the same line if and only if
$$
\begin{aligned}
a x_1 + b y_1 + c &= 0 \\
a x_2 + b y_2 + c &= 0 \\
a x_3 + b y_3 + c &= 0
\end{aligned}
\quad\Longleftrightarrow\quad
\begin{bmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{bmatrix}
\begin{bmatrix} a \\ b \\ c \end{bmatrix}
=
\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},
$$
that is, the homogeneous linear system with the matrix
$$
X = \begin{bmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{bmatrix}
$$
has a nonzero solution. This is possible only if X is singular. By eliminating the last column of X you can find that X is singular if and only if the matrix
$$
Y = \begin{bmatrix} x_2 - x_1 & y_2 - y_1 \\ x_3 - x_1 & y_3 - y_1 \end{bmatrix}
$$
is singular.
To reliably check for the singularity of a matrix in Matlab, you can use SVD or, equivalently, the function rank. Hence your function could be implemented as follows:
function [result] = mylinecheck(x1,y1,x2,y2,x3,y3)
result = rank([x2-x1, y2-y1; x3-x1, y3-y1]) < 2;
If you want to check if points all fall on the same line (or are collinear), one of the classic methods would be to assume that each point forms a vertex in a triangle. If the three points make the triangle such that the area is equal to 0, then the points would be collinear or form a line. This can be done by checking the determinant of the following matrix:
[a b 1]
[c d 1]
[e f 1]
You can read the article on collinearity on Wolfram Mathworld here: http://mathworld.wolfram.com/Collinear.html (I also linked it above).
As such, your function simply needs to be:
function [out] = mylinecheck(a,b,c,d,e,f)
D = [a b 1; c d 1; e f 1];
out = det(D) == 0;
However, due to numerical imprecision, you may provide floating-point coordinates for points that are indeed collinear and still get a determinant that isn't exactly 0 (a very small number instead). As such, one thing I can suggest is to check whether the magnitude of the determinant is less than a small threshold. Something like:
function [out] = mylinecheck(a,b,c,d,e,f)
D = [a b 1; c d 1; e f 1];
out = abs(det(D)) < 1e-10;
1e-10 is 10^{-10}, a small number. We take the abs to account for both positive and negative determinants, so the collinearity check is true if:
-10^{-10} < det(D) < 10^{-10}
However, as Pavel points out in the comments, the determinant depends on the scale of the coordinates: if points fall along the same line and we scale the coordinates up, the determinant value (and any rounding error in it) increases as well. One suggestion I have is to be more liberal with the threshold and make it larger, perhaps something like 0.1.
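One alternative to guessing a fixed threshold is to make the tolerance relative to the size of the coordinates. The sketch below is my own variation, not something from the answers above: it compares the 2-by-2 determinant of the difference vectors with the product of their lengths, so the test does not change when all coordinates are scaled. The name iscollinear and the relative tolerance 1e-10 are assumptions.

```matlab
relTol = 1e-10;   % assumed relative tolerance; tune for your data

% (c-a)*(f-b) - (d-b)*(e-a) is the determinant of [c-a d-b; e-a f-b],
% i.e. twice the signed area of the triangle with the three points as vertices.
iscollinear = @(a, b, c, d, e, f) ...
    abs((c - a)*(f - b) - (d - b)*(e - a)) <= ...
    relTol * hypot(c - a, d - b) * hypot(e - a, f - b);

r1 = iscollinear(0, 0, 1, 1, 2, 2);           % points on y = x
r2 = iscollinear(0, 0, 1, 1, 2, 3);           % third point off the line
r3 = iscollinear(0, 0, 1e6, 1e6, 2e6, 2e6);   % same line as r1, scaled up
fprintf('%d %d %d\n', r1, r2, r3);
```

The ratio being tested is the absolute sine of the angle between the two difference vectors, which is why scaling the points leaves the outcome unchanged.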
I have:
x = [1970:1:2000]
y = [data]
size(x) = [30,1]
size(y) = [30,1]
I want:
% Yl = kx + m, where
[k,m] = polyfit(x,y,1)
For some reason I have to use "regress" for this.
Using k = regress(x,y) gives some seemingly random value, and I have no idea where it comes from. How do I do it?
The number of coefficients you get in "k" depends on the number of columns of the input X, so you will not get both m and k just by putting in your x and y straight. Note also that the response comes first in the call: regress(y,X), not regress(x,y). From the docs:
b = regress(y,X) returns a p-by-1 vector b of coefficient estimates for a multilinear regression of the responses in y on the predictors in X. X is an n-by-p matrix of p predictors at each of n observations. y is an n-by-1 vector of observed responses.
It is not exactly stated, but the example in the help docs using the carsmall inbuilt dataset shows you how to set this up. For your case, you'd want:
X = [ones(size(x)) x]; % make sure this is 30 x 2
b = regress(y,X); % y should be 30 x 1, b should be 2 x 1
b(1) should then be your m, and b(2) your k.
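A quick way to convince yourself of the ordering is to compare against polyfit on made-up data. Everything about the data below (the 30 years, the slope 0.8, the intercept -1500, the noise, the seed) is synthetic and only for illustration; regress itself needs the Statistics and Machine Learning Toolbox.

```matlab
rng(0);                                 % fixed seed for repeatability (assumption)
x = (1970:1999)';                       % 30-by-1 column of years
y = 0.8*x - 1500 + randn(size(x));      % synthetic responses, not real data

X = [ones(size(x)) x];                  % 30-by-2: column of ones, then x
b = regress(y, X);                      % 2-by-1 coefficient vector
m = b(1);                               % intercept (matches the column of ones)
k = b(2);                               % slope (matches the x column)

pfit = polyfit(x, y, 1);                % pfit(1) is the slope, pfit(2) the intercept
fprintf('regress: k = %.4f, m = %.2f\n', k, m);
fprintf('polyfit: k = %.4f, m = %.2f\n', pfit(1), pfit(2));
```

Both calls solve the same least-squares problem, so the two printed lines should agree up to rounding.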
regress can also provide additional outputs, such as confidence intervals, residuals, statistics such as r-squared, etc. The input remains the same, you'd just change the outputs:
[b,bint,r,rint,stats] = regress(y,X);
I'm pretty confused about how I would go about summing an infinite number of matrices in MATLAB. Let's say I have this function (a Gaussian):
%Set up grid/coordinate system
Ngrid=400;
w=Ngrid;
h=Ngrid;
%Create Gaussian Distribution
G = zeros ([w, h]);
Sig = 7.3; %I want the end/resultant G to be a summation over Sig from 7.3 to 10 with dx
for x = 1 : w
for y = 1 : h
G (x, y) = exp (-((Sig^-2)*((x-w/2+1)^2 + (y-h/2+1)^2)) / (2));
end
end
I essentially want the end/resultant function G to be a summation over Sig from 7.3 to 10 with a step dx that is infinitesimally small, i.e. an integration. How would I go about doing this? I am pretty confused. Can it even be done?
You don't appear to actually be summing G over a range of Sig values. You never change the value of Sig. In any case, assuming that dx isn't too small and that you have the memory this can be done without any loops, let alone two.
Ngrid = 400;
w = Ngrid;
h = Ngrid;
% Create range for Sig
dx = 0.1;
Sig = 7.3:dx:10;
% Build mesh of x and y points
x = 1:w;
y = 1:h;
[X,Y] = meshgrid(x,y);
% Evaluate columnized mesh points at each value of Sig, sum up, reshape to matrix
G = reshape(sum(exp(bsxfun(@rdivide,-((X(:)-w/2+1).^2+(Y(:)-h/2+1).^2),2*Sig.^2)),2),[h w]);
figure
imagesc(G)
axis equal
This results in a figure like this
The long complicated line above can be replaced by this (uses less memory, but may be slower):
G = exp(-((X-w/2+1).^2+(Y-h/2+1).^2)/(2*Sig(1)^2));
for i = 2:length(Sig)
G = G+exp(-((X-w/2+1).^2+(Y-h/2+1).^2)/(2*Sig(i)^2));
end
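If what is wanted is the integral over Sig rather than a plain sum, the sum has to be weighted by the step size. A sketch of two ways to do that, assuming the same grid and centre as above: MATLAB's integral with the 'ArrayValued' option, and a trapezoidal rule on a finite Sig grid for comparison. The names R2, Gint, Gtrap and the 271-point Sig grid are my own choices.

```matlab
Ngrid = 400; w = Ngrid; h = Ngrid;
[X, Y] = meshgrid(1:w, 1:h);
R2 = (X - w/2 + 1).^2 + (Y - h/2 + 1).^2;   % squared distance from the centre used above

% Adaptive quadrature over Sig; the integrand returns a whole h-by-w matrix
Gint = integral(@(s) exp(-R2 ./ (2*s^2)), 7.3, 10, 'ArrayValued', true);

% Trapezoidal rule on a finite grid of Sig values (assumed 271 points, step 0.01)
Sig = linspace(7.3, 10, 271);
dSig = Sig(2) - Sig(1);
Gtrap = zeros(h, w);
for k = 1:numel(Sig)
    wk = dSig;
    if k == 1 || k == numel(Sig)
        wk = dSig/2;                        % end points get half weight
    end
    Gtrap = Gtrap + wk * exp(-R2 ./ (2*Sig(k)^2));
end

maxDiff = max(abs(Gint(:) - Gtrap(:)));     % how far the two results are apart
fprintf('largest difference between the two results: %g\n', maxDiff);
```

At the centre point the integrand equals 1 for every Sig, so the integral there is simply 10 - 7.3 = 2.7, which makes a handy sanity check.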