Simulating a random walk in MATLAB

I have a variable x that undergoes a random walk according to the following rules:
x(t+1)=x(t)-1; probability p=0.3
x(t+1)=x(t)-2; probability q=0.2
x(t+1)=x(t)+1; probability s=0.5
a) I have to create this variable, initialized at zero, and write a for loop of 100 steps that runs 10000 times, storing each final value in xfinal.
b) I have to plot a probability distribution of xfinal (a histogram), choosing a bin size and normalization, and report the mean and variance of xfinal.
c) I have to recreate the distribution by applying the central limit theorem and plot that probability distribution on the same plot.
I would appreciate help with how to choose the bin size and normalize the histogram, and with how to attempt part c).
p=0.3;
q=0.2;
s=0.5;
numberOfSteps = 100;
maxCount = 10000;
for count=1:maxCount
x=0;
for i = 1:numberOfSteps
random = rand(1, 1);
if random <=p
x=x-1;
elseif random<=(p+q)
x=x-2;
else
x=x+1;
end
end
xfinal(count) = x;
end
[f,x]=hist(xfinal,30);
figure(1)
bar(x,f/sum(f));
xlabel('xfinal')
ylabel('frequency')
m = mean(xfinal) % a variable named "mean" would shadow the mean function
variance = var(xfinal)

For the first question, check the help for hist on the MathWorks homepage:
[nelements,centers] = hist(data,nbins);
You do not select the bin size, but the number of bins. nelements gives the number of elements per bin and centers holds the bin centers. In other words, calling
hist(data,nbins);
is the same as
[nelements,centers] = hist(data,nbins);
plot(centers,nelements);
except that the representation is different (a line instead of bars). To normalize, simply divide nelements by sum(nelements).
For c), note that for i.i.d. variables it matters whether they are real or complex. For real variables, the central limit theorem says, in short, that the sum of a large number of samples tends to a normal distribution. So if the samples are real, you simply assume a normal distribution, calculate the mean and variance, and plot the normal distribution with those parameters. If the variables are complex, the real and imaginary parts are each normally distributed, which means the magnitude follows a Rayleigh distribution instead.
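For part c), the normal parameters can also be derived from the step rule itself rather than estimated from the simulated data. The following is a minimal sketch, assuming the 100 steps are independent and use the probabilities given in the question; the names steps, probs, muN, varN, xg and pdfN are chosen here for illustration only.

```matlab
% Assumed step sizes and probabilities, taken from the question
steps  = [-1 -2  1];
probs  = [0.3 0.2 0.5];
n_step = 100;

% Mean and variance of a single step
mu1  = sum(steps .* probs);
var1 = sum(steps.^2 .* probs) - mu1^2;

% For a sum of n_step independent steps, means and variances add
muN  = n_step * mu1;
varN = n_step * var1;

% Normal density predicted by the central limit theorem
xg   = linspace(muN - 4*sqrt(varN), muN + 4*sqrt(varN), 400);
pdfN = exp(-(xg - muN).^2 / (2*varN)) / sqrt(2*pi*varN);
```

Calling plot(xg, pdfN) on top of a histogram that has been normalized to unit area puts both curves on the same scale.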

MathWorks is phasing out hist in favor of histogram; the histogram documentation has more details.
You are not applying the PDF function as expected; the expression for Y doesn't work.
For instance, Y does not have the right x-axis start and stop points, and you are using x as the input to Y while x is already used as the walker position inside the double for loop.
When I ran your code, Y came out as a single value: a scalar, not a vector.
As for
bar(x,f/sum(f));
there is no need to scale all the values down by dividing by sum(f).
When overlaying the ideal probability density function, one often has to do additional scaling so that the measured and ideal curves visually overlap.
MATLAB can do that scaling for us with a dual plot using yyaxis, so there is no need to divide the data by sum(f).
You also mixed up variance and standard deviation.
Instead try something like this:
y2=1 / sqrt(2*pi*var1)*exp(-(x2-m1).^2 / (2*var1))
ok, the following solves your question(s)
clear all;
close all;
clc
p=0.3; % thresholds
q=0.2;
s=0.5;
n_step=100;
max_cnt=10000;
n_bin=30; % histogram amount bins
xf=zeros(1,max_cnt);
for cnt=1:max_cnt % runs loop
x=0;
for i = 1:n_step % steps loop
t_rand1 = rand(1, 1);
if t_rand1 <=p
x=x-1;
elseif t_rand1<=(p+q)
x=x-2;
else
x=x+1;
end
end
xf(cnt) = x;
end
% [f,x]=hist(xf,n_bin);
hf1=figure(1)
ax1=gca
yyaxis left
hp1=histogram(xf,n_bin);
% bar(x,f/sum(f));
grid on
xlabel('xf')
ylabel('frequency')
m1 = mean(xf)
var1 = var(xf)
s1=var1^.5 % sigma
%applying central limit theorem %finding the mean
n_x2=1e3; % just enough points
min_x2=hp1.BinLimits(1); % same as min(hp1.BinEdges)
max_x2=hp1.BinLimits(2); % same as max(hp1.BinEdges)
x2=linspace(min_x2,max_x2,n_x2);
y2=1/sqrt(2*pi*var1)*exp(-(x2-m1).^2/(2*var1));
% hold(ax1,'on')
yyaxis right
plot(ax1,x2,y2,'r','LineWidth',2)
Note that I have not used these lines:
% Xp=-1; Xq=-2; Xs=1; mu=Xp.*p+Xq.*q+Xs.*s;
% muN=n_step.*mu;
%
% sigma=(Xp).^2.*p+(Xq).^2.*q+(Xs).^2.*s; % second moment, E[X^2]
% sigmaN=n_step.*(sigma-(mu).^2); % variance after n_step steps
People usually reserve sigma for the standard deviation, variance^.5.
The supplied script is a good starting point that you can now take wherever you need it to go.
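As an alternative to the dual-axis plot, the histogram can be converted to a density so that it shares one axis with the normal curve. The sketch below is one possible way to do that, not the answer's method: it assumes a bin width of 3 (an arbitrary choice), uses a vectorized version of the walk, and divides the bin counts by the number of samples times the bin width. The names r, inc, w, edges, dens and centers are introduced here for illustration.

```matlab
rng(0);                                % fixed seed, only so the sketch is repeatable
p = 0.3; q = 0.2;                      % P(step = -1), P(step = -2); otherwise +1
n_step = 100; max_cnt = 10000;

% Vectorized walk: one column per run, one row per step
r = rand(n_step, max_cnt);
inc = ones(n_step, max_cnt);           % default step is +1
inc(r <= p + q) = -2;                  % first mark everything up to p+q as -2
inc(r <= p) = -1;                      % then overwrite the part up to p with -1
xf = sum(inc, 1);

% Density normalization: counts / (number of samples * bin width)
w = 3;                                 % assumed bin width
edges = (min(xf) - 0.5):w:(max(xf) + w);
counts = histcounts(xf, edges);
dens = counts / (numel(xf) * w);
centers = edges(1:end-1) + w/2;

% Normal curve from the sample mean and variance
m1 = mean(xf); var1 = var(xf);
y2 = exp(-(centers - m1).^2 / (2*var1)) / sqrt(2*pi*var1);

bar(centers, dens, 1); hold on
plot(centers, y2, 'r', 'LineWidth', 2); hold off
```

With this normalization the bar areas sum to 1, which is the same scale as the density y2, so no second y-axis is needed.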

Related

Where did I go wrong when I tried to approximate this data using a polynomial?

I am starting to learn numerical analysis using MATLAB in my course. So far we have covered polynomial interpolation (spline, polyfit, constrained spline, etc.). I was doing this practice question and I cannot get the correct answer. I have uploaded the code I used and the question. Where did I go wrong? Thanks in advance!
close all; clear all; clc;
format long e
x = linspace(0,1,8);
xplot = linspace(0,1);
f = @(x) atan(x.*(x+1));
y_val = f(xplot);
c = polyfit(x,f(x),7);
p = polyval(c,0.7);
err = abs(f(0.7)-p)/(f(0.7))
The question I encountered is seen in the picture
After some playing around, it seems to be a matter of computing the absolute error instead of the relative absolute error.
The code below yields the desired answer. And yes, it is pretty unclear from the question which error is intended.
% Definitions
format long e
x = linspace(0,1,8)';
xplot= linspace(0,1);
f = @(x) atan(x.*(x+1));
y_val = f(xplot);
% Degree of polynomial
n = 7;
% Points to evaluate function
point1 = 0.5;
point2 = 0.7;
% Fit
c= polyfit(x,f(x),n);
% Evaluate
approxPoint1 = polyval(c, point1);
approxPoint2 = polyval(c, point2);
% Absolute errors
errPoint1 = abs( f(point1) - approxPoint1)
errPoint2 = abs( f(point2) - approxPoint2)
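To make the difference between the two error definitions concrete, here is a small sketch that computes both at the two evaluation points. It assumes the same function, nodes and degree as in the question; the names pts, absErr and relErr are introduced here for illustration.

```matlab
f = @(x) atan(x.*(x+1));           % function from the question
x = linspace(0, 1, 8);             % 8 equally spaced nodes
c = polyfit(x, f(x), 7);           % degree-7 polynomial through the 8 nodes
pts = [0.5 0.7];                   % evaluation points

absErr = abs(f(pts) - polyval(c, pts));   % absolute error
relErr = absErr ./ abs(f(pts));           % relative error, abs() on the denominator too

% One row of output per evaluation point
fprintf('x = %.1f: abs = %.3e, rel = %.3e\n', [pts; absErr; relErr]);
```

Because f is positive on [0,1], the abs() in the denominator does not change the numbers here, but it keeps the formula valid for functions that take negative values.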
What you did wrong was:
Mixing absolute and relative values when calculating the errors stored in err.
Incorrectly placing the abs() parentheses when calculating err: your abs() only covers the numerator, not the denominator. To obtain |f(0.7)| you also need another abs(f(0.7)).
For the point x=0.7, instead of
err = abs(f(0.7)-p)/(f(0.7))
it could well simply be
err = abs(f(.7)-p);
You only calculate err for one assessment point, 0.7. In order to choose among the possible candidates of what seems to be a MATLAB Associate test multiple-choice answer, one needs err at 0.5 and err at 0.7, and then has to match the pair, in the correct order, against all the offered answers.
Although it's common practice to approximate N points with a polynomial of degree N-1, it's often possible to get below a satisfactory error with polynomials of lower degree.
Lower-degree polynomials mean fewer calculations and less time spent approximating points. If one obtains a fair enough approximation with, for instance, a degree 4 polynomial, why waste time calculating an approximation with a higher degree all the way up to N-1?
Since the question does not say what degree the approximating polynomial should have, you have to find it by sweeping all polynomial degrees from 1 up to a reasonable order.
The last error I found is that you used linspace without specifying the number of points; MATLAB then defaults to 100 points.
In your question 100 points is far too few, as I am going to show after the following lines, which, as mentioned in the previous point, sweep all candidate approximating polynomials, and do so on a 1000-point grid rather than the default 100 points you tacitly chose.
np=8; % numel(x) amount supplied points
N=1e3; % amount points x grid we build to measure f(x)
x= linspace(0,1,np);
xplot = linspace(0,1,N);
f = @(x) atan(x.*(x+1)); % function to approximate
y_val = f(xplot); % f(x)
xm=[.5 .7]; % points where to assess error
% finding poly coeffs
err1=zeros(2,np+2); % 1st row are errors on x=.5, 2nd row errors on x=.7
figure(1);
ax1=gca
hp1=plot(xplot,y_val)
grid on;
hp1.LineWidth=3;
hp1.Color='r';
hold on
for k=1:1:np+2
c = polyfit(x,f(x),k);
p_01 = polyval(c,xm(1));
err1(1,k) = abs(f(xm(1))-p_01);
% err(1,k) = abs((f(0.5)-p_05)/(f(0.5)))
p_02 = polyval(c,xm(2));
err1(2,k) = abs(f(xm(2))-p_02);
% err(2,k) = abs((f(0.7)-p_07)/(f(0.7)))
plot(x,polyval(c,x),'LineWidth',1.5); %'Color','b');
end
err1
The only pair of errors matching in the correct order are indeed those of polynomial order 7, but the total smallest error corresponds to the approximating polynomial of order 6.
What happens when linspace is called without a large enough number of points? Let's have a look:
np=8; % numel(x) amount supplied points
% N=1e3; % amount points x grid we build to measure f(x)
x= linspace(0,1,np);
xplot = linspace(0,1); %,N);
f = @(x) atan(x.*(x+1)); % function to approximate
y_val = f(xplot); % f(x)
xm=[.5 .7]; % points where to assess error
% finding poly coeffs
err1=zeros(2,np+2); % 1st row are errors on x=.5, 2nd row errors on x=.7
figure(1);
ax1=gca
hp1=plot(xplot,y_val)
grid on;
hp1.LineWidth=3;
hp1.Color='r';
hold on
for k=1:1:np+2
c = polyfit(x,f(x),k);
p_01 = polyval(c,xm(1));
err1(1,k) = abs(f(xm(1))-p_01);
% err(1,k) = abs(f(0.5)-p_05)/abs(f(0.5))
p_02 = polyval(c,xm(2));
err1(2,k) = abs(f(xm(2))-p_02);
% err(2,k) = abs(f(0.7)-p_07)/abs(f(0.7))
plot(x,polyval(c,x),'LineWidth',1.5); %'Color','b');
end
err1
With only 100 points all errors come out far too large, with not a single error anywhere near 1e-5 or 1e-6.
This is why one couldn't tell which pair to go for: all obtained values were at least 5 orders of magnitude away from the target.
I was about to include a plot with a legend, but in my opinion the visualization of this particular approach is at best misleading: in both plots, for 100 and for 1000 points, it looks at first glance as if the errors should be similar regardless of the number of grid points used.
But as shown above, 1e2 points cannot approximate the function.
It's like looking for something in pitch dark while pointing the torch 180 degrees away from where we should be aiming: not a chance to spot it.
Yet 1e3 grid points produce a pair of errors matching one of the possible answers, option D.
I hope it helps. Thanks for reading my answer.

How to replace ode45 with a Runge-Kutta method in this MATLAB code?

I tried everything and looked everywhere but can't find any solution for my question.
clc
clear all
%% Solving the Ordinary Differential Equation
G = 6.67408e-11; %Gravitational constant
M = 10; %Mass of the fixed object
r = 1; %Distance between the objects
tspan = [0 100000]; %Time Progression from 0 to 100000s
conditions = [1;0]; %y0= 1m apart, v0=0 m/s
F=@(t,y)var_r(y,G,M,r);
[t,y]=ode45(F,tspan,conditions); %ODE solver algorithm
%%part1: Plotting the Graph
% plot(t,y(:,1)); %Plotting the Graph
% xlabel('time (s)')
% ylabel('distance (m)')
%% part2: Animation of Results
plot(0,0,'b.','MarkerSize', 40);
hold on %to keep the first graph
for i=1:length(t)
k = plot(y(i,1),0,'r.','MarkerSize', 12);
pause(0.05);
axis([-1 2 -2 2]) %Defining the Axis
xlabel('X-axis') %X-Axis Label
ylabel('Y-axis') %Y-Axis Label
delete(k)
end
function yd=var_r(y,G,M,r) %function of variable r
g = (G*M)/(r + y(1))^2;
yd = [y(2); -g];
end
This is the code where I'm trying to replace ode45 with the Runge-Kutta method, but it's giving me errors. My Runge-Kutta function:
function y = Runge_Kutta(f,x0,xf,y0,h)
n= (xf-x0)/h;
y=zeros(n+1,1);
x=(x0:h:xf);
y(1) = y0;
for i=1:n
k1 = f(x(i),y(i));
k2= f(x(i)+ h/2 , y(i) +h*(k1)/2);
y(i+1) = y(i)+(h*k2);
end
plot(x,y,'-.M')
legend('RKM')
title ('solution of y(x)');
xlabel('x');
ylabel('y(x)')
hold on
end
Before converting your ode45( ) solution to a manually written RK scheme: it doesn't even look like your ode45( ) solution is correct. It appears you have a gravitational problem set up where the initial velocity is 0, so a small object will simply fall into a large mass M along a line (rectilinear motion), which is why you have a scalar position and velocity.
Going with this assumption, r is something you should be calculating on the fly, not using as a fixed input to the derivative function. E.g., I would have expected something like this:
F=@(t,y)var_r(y,G,M); % get rid of r
:
function yd=var_r(y,G,M) % function of current position y(1) and velocity y(2)
g = (G*M)/y(1)^2; % gravity accel based on current position
yd = [y(2); -g]; % assumes y(1) is positive, so acceleration is negative
end
The small object must start with a positive initial position for the derivative code to be valid as you have it written. As the small object falls into the large mass M, the above will only hold until it hits the surface or atmosphere of M. Or if you model M as a point mass, then this scheme will become increasingly difficult to integrate correctly because the acceleration becomes large without bound as the small mass gets very close to the point mass M. You would definitely need a variable step size approach in this case. The solution becomes invalid if it goes "through" mass M. In fact, once the speed gets too large the whole setup becomes invalid because of relativistic effects.
Maybe you could explain in more detail if your system is supposed to be set up this way, and what the purpose of the integration is. If it is really supposed to be a 2D or 3D problem, then more states need to be added.
For your manual Runge-Kutta code, you completely forgot to integrate the velocity so this is going to fail miserably. You need to carry a 2-element state from step to step, not a scalar as you are currently doing. E.g., something like this:
y=zeros(2,n+1); % 2-element state as columns of the y variable
x=(x0:h:xf);
y(:,1) = y0; % initial state is the first 2-element column
% change all the scalar y(i) to column y(:,i)
for i=1:n
k1 = f(x(i),y(:,i));
k2= f(x(i)+ h/2 , y(:,i) +h*(k1)/2);
y(:,i+1) = y(:,i)+(h*k2);
end
plot(x,y(1,:),'-.M') % plot the position part of the solution
This is all assuming the f that gets passed in is the same F you have in your original code.
y(1) is the first scalar element in the data structure of y (counting in column-first order). You want y to hold a list of column vectors, as your ODE is a system with state dimension 2. Thus you need to create y in that format, y=zeros(length(y0),n+1); and then address the list entries as matrix columns, y(:,1)=y0, with the same modification in every place where you extract or assign a list entry.
Matlab introduces various shortcuts that, if used consistently, lead to contradictions (I think the script-hater rant (German) is still valid in large parts). Essentially, unlike in other systems, Matlab gives direct access to the underlying data structure of matrices. y(k) is the k-th element of the underlying flat array (which is interpreted column-first in Matlab, as in Fortran, and unlike, e.g., NumPy, where it is row-first).
Only the two-index access is to the matrix with its dimensions. So y(:,k) is the k-th matrix column and y(k,:) the k-th matrix row. The single-index access is nice for row or column vectors, but leads immediately to problems when collecting such vectors in lists, as these lists are automatically matrices.
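Putting the points above together, a fixed-step solver with a 2-element state stored column by column could look like the sketch below. Note that it swaps in the classical fourth-order Runge-Kutta scheme instead of the two-stage midpoint update from the question, and that it assumes the corrected derivative (acceleration from the current position). The end time tf and step h are assumed values chosen so that the object has not yet reached the mass; E is an energy check added here for illustration.

```matlab
G = 6.67408e-11; M = 10;                  % constants from the question
f = @(t, y) [y(2); -G*M / y(1)^2];        % state y = [position; velocity]

t0 = 0; tf = 20000; h = 10;               % assumed time span and step size
n = round((tf - t0) / h);                 % number of steps
t = t0 + (0:n) * h;

y = zeros(2, n+1);                        % one state column per time point
y(:,1) = [1; 0];                          % 1 m apart, initially at rest

for i = 1:n
    k1 = f(t(i),       y(:,i));
    k2 = f(t(i) + h/2, y(:,i) + h*k1/2);
    k3 = f(t(i) + h/2, y(:,i) + h*k2/2);
    k4 = f(t(i) + h,   y(:,i) + h*k3);
    y(:,i+1) = y(:,i) + h*(k1 + 2*k2 + 2*k3 + k4)/6;
end

% Specific mechanical energy, which should stay nearly constant
E = 0.5*y(2,:).^2 - G*M ./ y(1,:);
```

A fixed step is only reasonable while the object is far from the point mass; close to it the acceleration grows without bound, which is where a variable-step solver such as ode45 is the safer choice.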

Statistical test gives result different than reality?

Let's define X as:
and the objects related to it:
I want to calculate the values of the following function and plot it on the same graph as the histogram of the eigenvalues of Y.
After that I want to perform a chi2gof test to judge whether those two distributions agree with each other or not. I want to use the Expected parameter, which is designed to compare the distribution given by a function with a histogram.
My work so far
clf;
m=8000;
n=10000;
X=randn(m,n);
Y=X*X'/n;
sd=std(X(:));
l=m/n;
eigs=eig(Y);
lp=sd^2*(1+sqrt(l))^2;
lm=sd^2*(1-sqrt(l))^2;
x=linspace(0.00000001,lp,n);
for i = 1:length(x)
if (and(x(i) <= lp, x(i) >= lm))
dv(i) = sqrt((lp-x(i)).*(x(i)-lm))./(2*pi*sd^2*l.*x(i));
else
dv(i) = 0;
end
end
This code fully calculates my function dv. To plot it over the histogram in a meaningful way, I normalized the histogram to have unit area.
hold on;
[h, centres] = hist(eigs, 50);
% normalise to unit area
norm_h = h / (numel(eigs) * (centres(2)-centres(1)));
bar(centres, norm_h);
plot(x, dv, "r");
hold off;
The result from this code is image following:
As we can see, the dv line fits the histogram really nicely. We can be almost sure that a chi-square test for equality of distributions should output a p-value very close to 1 (meaning that the samples are from the same distribution). However, the code
[h,p,stats] = chi2gof(dv,'Expected',norm_h);
outputs
p =
0
It means that the null hypothesis of identical distributions was rejected. My question is: how? Am I using something incorrectly, or is this p-value really 0?

Matlab: generate two random numbers which are always different

I want to plot two grey squares, one of which is always darker than the other (for a test participant to decide in each trial which is darker).
I've set up the trials using
for N=1:20
w=rand(1)
z=rand(1)
% ...
end
And when I plot my squares I've set the color of one square using
'markerfacecolor', [w w w]
and set the second square similarly, using z.
The problem is: The two random numbers shouldn't be the same, because when this is the case the participant can't really decide which square is darker.
Can anyone help me to figure out how I can prevent the two random numbers being the same as each other within each loop?
"Shouldn't be the same" seems like a loose tolerance definition in this context, for instance would you allow w = 0.5, z = 0.50001? They would be pretty similar greys!
Let's define a tolerance, and find a random w and z
tol = 0.01; % forced difference between w and z
w = rand; % rand returns one value by default, don't have to use rand(1)
z = rand;
Now loop until z is either less than w-tol or greater than w+tol,
while abs(w-z) < tol % z is still too close to w, so draw again
z = rand;
end
Note: You may want to add an iteration counter, so the while loop is ended after, say, 1000 attempts! Be aware that setting tol too large could cause this to take a lot longer.
Full example in the form of your example:
tol = 0.1; % larger tolerance
for N = 1:20
w = rand; z = rand;
while abs(w-z) < tol; z = rand; end % redraw while the greys are too similar
% >> w = 0.8045
% >> z = 0.8169
% z = 0.7322
% z = 0.1895 % tolerance satisfied, stop while loop and continue
% Do your trial here...
end
Repeatability
If you wanted to do the exact same trials on multiple subjects, you would want the greys to be the same! Either run the randomising function once and store the output values for re-use, or reset the seed of the random number generator using rng
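The iteration counter and the fixed seed mentioned above can be combined as in the following sketch. The seed value 42, the cap maxTries and the storage vectors W and Z are assumptions made for illustration, not part of the original answer.

```matlab
rng(42);                  % assumed seed: every participant gets the same greys
tol = 0.1;                % minimum allowed difference between the two greys
nTrials = 20;
maxTries = 1000;          % assumed cap, so the loop cannot run forever

W = zeros(nTrials, 1);    % stored grey levels, one pair per trial
Z = zeros(nTrials, 1);

for N = 1:nTrials
    w = rand;
    z = rand;
    tries = 1;
    % Redraw z while it is too close to w, up to maxTries attempts
    while abs(w - z) < tol && tries < maxTries
        z = rand;
        tries = tries + 1;
    end
    W(N) = w;
    Z(N) = z;
    % Do your trial here, e.g. with [w w w] and [z z z] as face colors
end
```

Storing W and Z also makes it possible to record which square was darker in each trial, for example with W < Z.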
Since by definition the variable is "random", you cannot naturally guarantee that.
Given a large enough space from which you pick the random value uniformly, it is computationally cheap to simply add a while loop after obtaining w: keep looping and choosing a new z until z is not equal to w. For your application the cost is negligible.

Measure similar information using Kullback-Leibler (KL) distance matlab code

I am trying to implement a distance measure between two distributions. The details are described here, along with the input image. Let me briefly summarize the idea of the paper:
1. The input image is divided into an inside region and an outside region using the Heaviside function H.
2. Calculate the distributions of the inside region and the outside region, where phi is the boundary between inside and outside.
3. Calculate the Kullback-Leibler distance.
I implemented that scheme, but I have three problems:
1. Is the log function in the paper log or log2 in MATLAB?
2. log(0) is infinite, but we know the distribution will contain many 0 values. How can I ignore them? In my case I add an eps value; some people set h1(h1==0)=1. Which is correct?
3. Could you look at my code? Is it correct? I am not sure about my implementation.
This is my code to implement that scheme:
function main()
Img=imread('1.bmp');%please download at above link
Img=double(Img(:,:,1));
%% Initial boundary
c0=2; %const value
phi = ones(size(Img(:,:,1))).*c0;
phi(26:32,28:34) = -c0;
%% Heaviside function
epsilon=1
Hu=0.5*(1+(2/pi)*atan(phi./epsilon));
%% Inside and outside image
inImg=Img.*(1-Hu);
outImg=Img.*(Hu);
%% Let caclulate KL distance
h1 = histogram(inImg, 256, 0, 255); %Histogram of inside
h2 = histogram(outImg, 256, 0, 255);%Histogram of outside
lamda1=KLdist(h1,h2) % distance from h1 to h2
lamda2=KLdist(h2,h1) % distance from h2 to h1
end
%%%%%%%%%% function for KL distance%%%%%%%%%%%%%%%
function [d1,d2]=KLdist(h1,h2)
d1=sum(h1.*log2(h1+eps)-h1.*log2(h2+eps))
d2=sum(h2.*log2(h2+eps)-h2.*log2(h1+eps))
end
%%%%%%%%%%function for histogram calculation%%%%%%
function [h,bins] = histogram(I, n, min, max)
I = I(:);
range = max - min;
drdb = range / double(n); % dr/db - change in range per bin
h = zeros(n,1);
bins = zeros(n,1);
for i=1:n
% note: while the instructions say "within integer round off" I'm leaving
% this as float bin edges, to handle the potential float input
% ie - say the input was a probability image.
low = min + (i-1)*drdb;
high = min + i*drdb;
h(i) = sum( (I>=low) .* (I<high) );
bins(i) = low;
end
h(n) = h(n) + sum( (I>=(n*drdb)) .* (I<=max) ); % include anything we may have missed in the last bin.
h = h ./ sum(h); % "relative frequency"
end
Let me answer one by one.
Is the log function in the paper log or log2 in MATLAB?
Ans: It is the natural log. In MATLAB, you just call log().
log(0) is infinite, but we know the distribution will contain many 0 values. How can I ignore them?
Ans: To avoid the log of zero, add a small value, as in log(x+eps) or log(x+(x==0)*eps), where x holds your values.
Could you look at my code? Is it correct? I am not sure about my implementation.
Ans: Your code looks fine. You can use my suggestions to improve it. Good luck.
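As a sketch of how the zero handling could look in practice, the snippet below uses the natural log and the common convention that terms with p = 0 contribute nothing, while q is floored at eps to avoid dividing by zero. This is one possible convention rather than the paper's definition, and the helper kl and the histograms h1 and h2 are made up for illustration.

```matlab
% Discrete KL divergence in nats, with the convention 0*log(0) = 0.
% Bins where p is zero are skipped; q is floored at eps to avoid division by zero.
kl = @(p, q) sum(p(p > 0) .* log(p(p > 0) ./ max(q(p > 0), eps)));

% Made-up normalized histograms (each sums to 1)
h1 = [0.5  0.5  0.0];
h2 = [0.25 0.25 0.5];

d12 = kl(h1, h2);   % distance from h1 to h2
d21 = kl(h2, h1);   % distance from h2 to h1, large because h1 has an empty bin
```

The two directions generally differ, which is why the question computes both lamda1 and lamda2.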