how to implement gradient descent from scratch? - linear-regression

How do you find the derivative of the cost function for a linear regression problem?
There are plenty of posts on the internet, but oddly none of them show how to compute the derivative.
For example, I picked this up from one website:
def gradient_descent(x, y, theta, learning_rate=0.1, num_epochs=10):
    m = x.shape[0]
    J_all = []
    for _ in range(num_epochs):
        h_x = h(x, theta)
        cost_ = (1/m)*(x.T@(h_x - y))
        theta = theta - (learning_rate)*cost_
        J_all.append(cost_function(x, y, theta))
    return theta, J_all
But I don't see any derivative function being used; it's just the learning rate multiplied by the cost function.
Can you please show me how to compute the derivative?

You can calculate the derivative manually if the expression is simple enough and it will not change dynamically based on some input. However, if the mathematical function is complex, you might want to use the finite difference gradient approximation (central difference approximation).
def finite_diff_grad(f, x, delta=0.1):
    return (f(x+delta) - f(x-delta))/(2*delta)
For multivariable functions, you would have something like this:
def finite_diff_grad(f, x, y, delta=10e-4):
    der_part_x = (f(x+delta, y) - f(x-delta, y))/(2*delta)
    der_part_y = (f(x, y+delta) - f(x, y-delta))/(2*delta)
    return [der_part_x, der_part_y]
Where f is the function you want to find the derivative/gradient of.
Based on your problem's context, you might want to give different values to delta.
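For the linear regression case in the question, the derivative can also be worked out by hand. Assuming the usual cost J(theta) = 1/(2m) * sum((x_i . theta - y_i)^2) (the question does not show its cost_function, so this is an assumption), the gradient is (1/m) * X^T (X theta - y). That is the quantity stored in cost_ in the quoted snippet, so despite its name that variable holds the derivative, not the cost. A minimal pure-Python sketch that checks the hand-derived gradient against a central-difference approximation (all function names here are illustrative):

```python
def predict(X, theta):
    """Linear hypothesis h(x) = x . theta for every row of X."""
    return [sum(xj * tj for xj, tj in zip(row, theta)) for row in X]


def cost(X, y, theta):
    """Assumed cost: J = 1/(2m) * sum((h - y)^2)."""
    m = len(X)
    return sum((h - yi) ** 2 for h, yi in zip(predict(X, theta), y)) / (2 * m)


def analytic_grad(X, y, theta):
    """Hand-derived gradient: (1/m) * X^T (X theta - y)."""
    m = len(X)
    residuals = [h - yi for h, yi in zip(predict(X, theta), y)]
    return [sum(row[j] * r for row, r in zip(X, residuals)) / m
            for j in range(len(theta))]


def numeric_grad(f, theta, delta=1e-5):
    """Central-difference approximation, one coordinate at a time."""
    grad = []
    for j in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[j] += delta
        down[j] -= delta
        grad.append((f(up) - f(down)) / (2 * delta))
    return grad


X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]  # first column is the bias term
y = [2.0, 2.5, 3.5]
theta = [0.1, 0.2]

g_exact = analytic_grad(X, y, theta)
g_approx = numeric_grad(lambda t: cost(X, y, t), theta)
print(g_exact, g_approx)
```

The two results should agree to many decimal places, which is a handy sanity check whenever you derive a gradient by hand.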

Related

How to numerically solve for the upper bound of a definite integral in Matlab?

Given the equation
$$\gamma = \int_{-\infty}^{u} f(x)\,dx$$
For some given function f(x) where gamma is also given, how can you numerically solve for upper bound u in Matlab?
f(x) can be a placeholder for any model.
This is a root-finding and integration problem but with my lack of knowledge in Matlab, I'm still trying to figure out how it is done.
My initial solution is a brute force approach. Let's say we have
$$f(x) = \frac{1}{\sqrt{6\pi}}\, e^{-x^2/6}$$
and gamma = 0.8. We can find the definite integral from -inf to u by evaluating it at some very small value of u and working our way up until the result reaches gamma = 0.8.
syms f(x)
f(x) = (1/(sqrt(6*pi)))*exp(-(x^2/6));
gamma = 0.8;
u = -10;
res = int(f,x,-Inf,u);
while double(res) <= gamma
u = u+0.1;
res = int(f,x,-Inf,u);
end
fprintf("u is %f", u);
This solution is pretty slow and will definitely not work all the time.
I set u = -10 because, looking at the graph of the function, there is essentially nothing outside the interval [-5, 5].
You can use the MATLAB Symbolic Math Toolbox (an add-on you might need to install).
That way you can define a "true" unknown variable x (not an array of x-values) and later integrate from negative infinity:
syms f(x)
f(x) = exp(2*x) % an example function
gamma = int(f,x,-Inf,u)
This yields gamma as the integral from -Inf to u, once f(x) is defined as a symbolic function and u as a scalar.
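Because the integral from -Inf to u grows monotonically with u whenever f is non-negative, the step-by-step scan in the question can be replaced by a bracketing root-finder. Below is a rough sketch of that idea in Python rather than MATLAB, using bisection on the example density from the question, whose integral has the closed form 0.5*(1 + erf(u/sqrt(6))). The names cdf and solve_upper_bound are made up for this illustration; for a general f you would swap cdf for a numerical quadrature. In MATLAB the same idea could presumably be written with fzero wrapped around integral.

```python
import math


def cdf(u):
    """Integral of f(x) = exp(-x^2/6) / sqrt(6*pi) from -inf to u."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(6.0)))


def solve_upper_bound(target, lo=-10.0, hi=10.0, tol=1e-10):
    """Bisection: find u in [lo, hi] such that cdf(u) equals target."""
    if not (cdf(lo) <= target <= cdf(hi)):
        raise ValueError("target is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < target:
            lo = mid  # root lies in the upper half
        else:
            hi = mid  # root lies in the lower half
    return 0.5 * (lo + hi)


u = solve_upper_bound(0.8)
print(f"u is {u:.6f}")
```

Each iteration halves the bracket, so the 20-unit starting interval shrinks below 1e-10 in under 40 evaluations, compared with hundreds of symbolic integrations for a 0.1-step scan at much lower accuracy.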

Incorrect design matrix H while implementing the Kalman Filter

I am trying to implement an extended Kalman filter to estimate position from GPS pseudoranges. I used a Taylor series to linearize the measurement equation; the code for this is given below.
function [H, R, Po] = getHandRMat(currentdata,estimatedRecPos,sigmaNot)
% Calculate R and H for computing the Kalman gain
R = getErrorCovMatObs(sigmaNot, currentdata);
Po = sqrt((currentdata(:,3)-estimatedRecPos(1)).^2 + ...
(currentdata(:,4)-estimatedRecPos(2)).^2 +...
(currentdata(:,5)-estimatedRecPos(3)).^2)+estimatedRecPos(4);
H = [(estimatedRecPos(1) - currentdata(:,3))./Po, ...
(estimatedRecPos(2) - currentdata(:,4))./Po, ...
(estimatedRecPos(3) - currentdata(:,5))./Po, ones(size(Po))];
end
The state vector consists of [x, y, z, cdt] of the receiver. currentdata has the satellite X, Y, Z coordinates and pseudoranges. estimatedRecPos is the state vector.
Problem
When I perform the update step stateVec = stateVec + K*(Z - (H*stateVec));, the value of Z - (H*stateVec) is extremely high, which isn't right, so I'm not sure where I am going wrong. I believe my design matrix H is correct, as I am using the same one for least-squares estimation and that seems to work.
Difference between actual measurement and predicted measurement:
1.0e+07 *
2.5856
2.6063
2.6019
2.6899
2.6016
2.7021
2.6136
2.5983
2.5863
Any help is greatly appreciated.
Thanks
PS. the algorithm is implemented in Matlab
Update 1: Here is the algorithm presented in the Probabilistic Robotics textbook by Sebastian Thrun and Wolfram Burgard.
Does this mean that instead of using the design matrix H to convert the state vector to the predicted measurement, I should use the nonlinear equation directly?
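In the standard EKF formulation the innovation is indeed computed with the nonlinear measurement function, z - h(x), while H (the Jacobian of h) is used only in the gain and covariance equations. The product H*stateVec is not a prediction of the pseudorange, which would explain residuals of order 1e7. Here is a small Python sketch of the difference, with made-up satellite and receiver coordinates and hypothetical function names:

```python
import math


def predicted_pseudorange(state, sat):
    """Nonlinear measurement model h(x): geometric range plus clock term."""
    x, y, z, cdt = state
    return math.dist((x, y, z), sat) + cdt


def jacobian_row(state, sat):
    """Row of H: partial derivatives of h with respect to [x, y, z, cdt]."""
    x, y, z, _ = state
    rng = math.dist((x, y, z), sat)  # geometric range only
    return [(x - sat[0]) / rng, (y - sat[1]) / rng, (z - sat[2]) / rng, 1.0]


# Made-up numbers for illustration (metres).
true_state = [1113000.0, -4843000.0, 3984000.0, 30.0]
est_state = [1113100.0, -4842900.0, 3984050.0, 0.0]
sat = (15600000.0, 7540000.0, 20140000.0)

z_meas = predicted_pseudorange(true_state, sat)  # noise-free "measurement"

# EKF innovation: measurement minus the nonlinear prediction.
innovation = z_meas - predicted_pseudorange(est_state, sat)

# What the question computes: measurement minus H * state.
h_row = jacobian_row(est_state, sat)
linear_residual = z_meas - sum(h * s for h, s in zip(h_row, est_state))

print(innovation, linear_residual)
```

With these numbers the innovation is on the order of the position and clock error, while the linear residual is on the order of the satellite range itself. Note also that this sketch divides by the geometric range alone when building H, whereas the code in the question divides by Po, which includes the clock term; that may be worth double-checking.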

How to perform indefinite integration of this function in MATLAB?

I need to perform the operations shown in the image: I need to calculate the value of the function H for different inputs (x) using MATLAB.
I am using the following commands from the Symbolic Math Toolbox:
syms y t x;
f1=(1-exp(-y))/y;
f2=-t+3*int(f1,[0,t]);
f3=exp(f2);
H=int(f3,[0,x]);
but the second integral, i.e. the integral in the function H, can't be evaluated, and my output has the form
H =
int(exp(3*eulergamma - t - 3*ei(-t) + 3*log(t)), t, 0, x)
If any of you guys know how to evaluate this or have a different idea about this, please share it with me.
Directly Finding the Numerical Solution using integral:
Since you want to calculate H for different values of x, you can go for a numerical solution instead of an analytical one.
Code:
syms y t;
f1=(1-exp(-y))/y; f2=-t+3*int(f1,[0,t]); f3=exp(f2);
H=integral(matlabFunction(f3),0,100) % Result of integration when x=100
Output:
H =
37.9044
Finding the Approximate Analytical Solution using Monte-Carlo Integration:
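The integral most likely cannot be expressed in terms of elementary functions (the symbolic result above still contains the exponential integral ei). However, you can find an approximate analytical solution using "Monte-Carlo Integration", according to which:
$$\int_a^b f(x)\,dx \approx (b-a)\,f(c), \qquad f(c) = \frac{1}{n}\sum_{i=1}^{n} f(x_i)$$
where the x_i are n random points drawn uniformly from (a, b).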
Code:
syms x y t;
f1=(1-exp(-y))/y; f2=-t+3*int(f1,[0,t]); f3=exp(f2);
f3a= matlabFunction(f3); % Converting to function handle
n = 1000;
t = x*rand(n,1); % Generating random numbers within the limits (0,x)
MCint(x) = x * mean(f3a(t)); % Integration
H= double(MCint(100)) % Result of integration when x=100
Output:
H =
35.2900
% Output will be different each time you execute it since it is based
% on generation of random numbers
Drawbacks of this approach:
The solution is approximate, not exact.
The greater the value of n, the better the result, but the slower the code runs.
Read the documentation of matlabFunction, integral, Random Numbers Within a Specific Range, mean and double for further understanding of the code(s).
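To make the Monte-Carlo estimator concrete, here is a small Python sketch of the same "interval length times sample mean" idea. It uses a stand-in integrand, t^3 * exp(-t), chosen only because its integral is known in closed form; it is not the integrand from the question, and mc_integrate is a made-up name.

```python
import math
import random


def mc_integrate(f, a, b, n=100_000, seed=0):
    """Estimate the integral of f over [a, b] as (b - a) * mean(f(x_i))."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = sum(f(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n


def integrand(t):
    """Stand-in integrand with a known antiderivative."""
    return t ** 3 * math.exp(-t)


# Closed form: integral of t^3 e^-t on [0, x] = 6 - e^-x (x^3 + 3x^2 + 6x + 6).
exact = 6.0 - math.exp(-10.0) * (10 ** 3 + 3 * 10 ** 2 + 6 * 10 + 6)
estimate = mc_integrate(integrand, 0.0, 10.0)
print(estimate, exact)
```

The error of this kind of estimate shrinks roughly like 1/sqrt(n), so each extra digit of accuracy costs about 100 times more samples. That is why a deterministic quadrature is usually the better choice for a smooth one-dimensional integrand.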

How to integrate over a discrete 2D surface in MATLAB?

I have a function z = f(x, y), where z is the value at point (x, y). How may I integrate z over the x-y plane in MATLAB?
By function above, I actually mean I have something similar to a hash table. That is, given a (x, y) pair, I can look up the table to find the corresponding z value.
The problem would be rather simple, if the points were uniformly distributed over x-y plane, in which case I can simply sum up all the z values, multiply it with the bottom area, and finally divide it by the number of points I have. However, the distribution is not uniform as shown below. So I am actually asking for the computation method that minimises the error.
The currently accepted answer will only work for gridded data. If your data is scattered you can use the following approach instead:
scatteredInterpolant + integral2:
f = scatteredInterpolant(x(:), y(:), z(:), 'linear');
int = integral2(@(x,y) f(x,y), xmin, xmax, ymin, ymax);
This defines the linear interpolant f of the data z(i) = f(x(i),y(i)) and uses it as an argument to integral2. Note that ymin and ymax, instead of doubles, can be function handles depending on x. So usually you will be integrating rectangles, but this could be used for integration regions a bit more complicated.
If your integration area is rather complicated or has holes, you should consider triangulating your data.
DIY using triangulation:
Let's say your integration area is given by the triangulation trep, which for example could be obtained by trep = delaunayTriangulation(x(:), y(:)). If you have your values z corresponding to z(i) = f(trep.Points(i,1), trep.Points(i,2)), you can use the following integration routine. It computes the exact integral of the linear interpolant. This is done by evaluating the areas of all the triangles and then using these areas as weights for the midpoint(mean)-value on each triangle.
function int = integrateTriangulation(trep, z)
P = trep.Points; T = trep.ConnectivityList;
d21 = P(T(:,2),:)-P(T(:,1),:);
d31 = P(T(:,3),:)-P(T(:,1),:);
areas = abs(1/2*(d21(:,1).*d31(:,2)-d21(:,2).*d31(:,1)));
int = areas'*mean(z(T),2);
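For readers who want to see the arithmetic spelled out, the same area-weighted mean can be sketched in plain Python. The function name and the tiny two-triangle mesh are made up for illustration; in practice the triangle list would come from a Delaunay routine.

```python
def integrate_triangulation(points, triangles, z):
    """Exact integral of the piecewise-linear interpolant over a triangulation.

    points:    list of (x, y) vertex coordinates
    triangles: list of (i, j, k) vertex indices
    z:         list of values at the vertices
    """
    total = 0.0
    for i, j, k in triangles:
        (x1, y1), (x2, y2), (x3, y3) = points[i], points[j], points[k]
        # Triangle area from the 2D cross product of two edge vectors.
        area = abs((x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1)) / 2.0
        # A linear function integrates to area * mean of the vertex values.
        total += area * (z[i] + z[j] + z[k]) / 3.0
    return total


# Unit square split into two triangles, sampled from z = x + y.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
triangles = [(0, 1, 2), (0, 2, 3)]
z = [x + y for x, y in points]
result = integrate_triangulation(points, triangles, z)
print(result)
```

Since z = x + y is itself linear, the interpolant reproduces it exactly and the result matches the true integral over the unit square, which is 1.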
If you have a discrete dataset for which you have all the x and y values over which z is defined, then just obtain the Zdata matrix corresponding to those (x,y) pairs. Save this matrix, and then you can make it a continuous function using interp2:
function z_interp = fun(x,y)
z_interp = interp2(Xdata,Ydata,Zdata,x,y);
end
Then you can use integral2 to find the integral:
q = integral2(@fun,xmin,xmax,ymin,ymax)
where @fun is your function handle that takes in two inputs.
I had to integrate a bivariate normal distribution recently in MATLAB. The idea is very simple. MATLAB defines a surface through a meshgrid, so from x, y you need to do this:
x = -10:0.05:10;
y = x;
[X,Y] = meshgrid(x',y');
...for example. Then, let's call FX the function that defines the value at each point of the surface. To calculate the integral you just need to do this:
surfint = zeros(length(X),1);
for a = 1:length(X)
surfint(a,1) = trapz(x,FX(:,a));
end
trapz(x, surfint)
For me, this is the simplest way.
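The nested-trapz idea (integrate along one axis, then integrate those results along the other) looks like this as a Python sketch. The helper names are made up, and the grid layout assumes the meshgrid convention where rows correspond to y and columns to x.

```python
def trapz(xs, ys):
    """Composite trapezoidal rule for samples ys taken at points xs."""
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
               for i in range(len(xs) - 1))


def trapz2d(xs, ys, values):
    """Integrate values[r][c] = f(xs[c], ys[r]) over the grid.

    Integrates along x for each row, then integrates the row results along y.
    """
    row_integrals = [trapz(xs, row) for row in values]
    return trapz(ys, row_integrals)


# Check on f(x, y) = x * y over [0, 1] x [0, 2]; the exact integral is 1.
xs = [i * 0.05 for i in range(21)]  # 0, 0.05, ..., 1
ys = [j * 0.05 for j in range(41)]  # 0, 0.05, ..., 2
values = [[x * y for x in xs] for y in ys]
approx = trapz2d(xs, ys, values)
print(approx)
```

The trapezoidal rule is exact for functions that are linear along each axis, so this test case comes out right up to rounding. For curved surfaces the error depends on the grid spacing, so it is worth halving the step and checking that the result stops changing.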

Numerically integrate a function f(x) over x using MATLAB where f(x) has another argument y which is a vector

I would like to numerically integrate a vector which represents a function f(x) over the range of x specified by bounds x0 and x1 in Matlab. I would like to check that the output of the integration is correct and that it converges.
There are the quad and quadl functions that serve well in identifying the required error tolerance, but they need the input argument to be a function and not the resulting vector of the function. There is also the trapz function where we can enter the two vectors x and f(x), but then it computes the integral of f(x) with respect to x depending on the spacing used by vector x. However, there is no given way using trapz to adjust the tolerance as in quad and quadl and make sure the answer is converging.
The main problem why I can't use quad and quadl functions is that f(x) is the following equation:
f(x) = sum(exp(-1/2 *(x-y))), the summation is over y, where y is a vector of length n and x is an element that is given each time to the function f(x). Therefore, all elements in vector y are subtracted from element x and then the summation over y is calculated to give us the value f(x). This is done for m values of x, where m is not equal to n.
When I use quadl as explained in the Matlab manual, where f(x) is defined in a separate function .m file and then in the main calling file, I use Q = quadl(@f,x0,x1,tolerance,X,Y); here X is a vector of length m and Y is a vector of length L. Matlab gives an error: "??? Error using ==> minus
Matrix dimensions must agree." at the line where I define the function f(x) in the .m function file. f(x) = sum(exp(-1/2 *(x-y)))
I assume the problem is that Matlab treats x and y as vectors that should be of the same length when they are subtracted from each other, whereas what's needed is to subtract the vector Y each time from a single element from the vector X.
Would you please recommend a way to solve this problem and successfully numerically integrate f(x) versus x with a method to control the tolerance?
The documentation on quad says:
The function y = fun(x) should accept a vector argument x and return a vector result y, the integrand evaluated at each element of x.
So every time we call the function, we need to evaluate the integrand at each given x.
Also, to parameterize the function call with the constant vector Y, I recommend an anonymous function call. There's a reasonable demo here. Here's how I implemented your problem in Matlab:
function Q = test_num_int(x0,x1,Y)
Q = quad(@(x) myFun(x,Y),x0,x1);
end
function fx = myFun(x,Y)
fy = zeros(size(Y));
fx = zeros(size(x));
for jj=1:length(fx)
for ii=1:length(Y)
fy(ii) = exp(-1/2 *(x(jj)-Y(ii)));
end
fx(jj) = sum(fy);
end
end
Then I called the function and got the following output:
Y = 0:0.1:1;
x0 = 0;
x1 = 1;
Q = test_num_int(x0,x1,Y)
Q =
11.2544
The inputs for the lower and upper bound and the constant array are obviously just dummy values, but the integral converges very quickly, almost immediately. Hope this helps!
I believe the following would also work:
y = randn(10,1);
func = @(x) sum(exp(-1/2 *(x-y)));
integral(func,0,1,'ArrayValued',true)
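If you want to see how tolerance control works under the hood, here is a rough Python sketch of adaptive Simpson quadrature applied to the same integrand. This is a generic textbook scheme with made-up function names, not necessarily what quad or integral do internally. The vector Y is captured by a closure, which plays the same role as the anonymous function above.

```python
import math


def f(x, ys):
    """Integrand from the question: sum over y of exp(-(x - y) / 2)."""
    return sum(math.exp(-0.5 * (x - y)) for y in ys)


def adaptive_simpson(g, a, b, tol=1e-8):
    """Recursive adaptive Simpson quadrature with an absolute tolerance."""
    def simpson(fa, fm, fb, width):
        return width * (fa + 4.0 * fm + fb) / 6.0

    def recurse(a, fa, m, fm, b, fb, whole, tol):
        lm, rm = 0.5 * (a + m), 0.5 * (m + b)
        flm, frm = g(lm), g(rm)
        left = simpson(fa, flm, fm, m - a)
        right = simpson(fm, frm, fb, b - m)
        # Usual error estimate: |left + right - whole| / 15.
        if abs(left + right - whole) <= 15.0 * tol:
            return left + right + (left + right - whole) / 15.0
        # Not accurate enough: split the interval and the tolerance.
        return (recurse(a, fa, lm, flm, m, fm, left, tol / 2.0)
                + recurse(m, fm, rm, frm, b, fb, right, tol / 2.0))

    m = 0.5 * (a + b)
    fa, fm, fb = g(a), g(m), g(b)
    return recurse(a, fa, m, fm, b, fb, simpson(fa, fm, fb, b - a), tol)


Y = [0.1 * k for k in range(11)]  # same values as Y = 0:0.1:1
Q = adaptive_simpson(lambda x: f(x, Y), 0.0, 1.0)
print(round(Q, 4))
```

Tightening tol forces more subdivisions, and comparing results at two tolerances gives a simple convergence check. For this integrand the answer can also be verified by hand, since it equals sum(exp(Y/2)) * 2 * (1 - exp(-1/2)).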