Numerical derivative of a vector - matlab

I have a problem computing the numerical derivative of a vector x (Nx1) with respect to another vector t (time) of the same size.
I do the following (x is chosen to be sine function as an example):
t=t0:ts:tf;
x=sin(t);
xd=diff(x)/ts;
but the answer xd is (N-1)x1, and I figured out that it does not compute the derivative corresponding to the first element of x.
Is there any other way to compute this derivative?

You are looking for the numerical gradient I assume.
t0 = 0;
ts = pi/10;
tf = 2*pi;
t = t0:ts:tf;
x = sin(t);
dx = gradient(x)/ts
The purpose of this function is a different one (vector fields), but it offers what diff doesn't: input and output vector of equal length.
gradient calculates the central difference between data points. For an
array, matrix, or vector with N values in each row, the ith value is
defined by

dx(i) = ( x(i+1) - x(i-1) ) / 2
The gradient at the end points, where i=1 and i=N, is calculated with
a single-sided difference between the endpoint value and the next
adjacent value within the row. If two or more outputs are specified,
gradient also calculates central differences along other dimensions.
Unlike the diff function, gradient returns an array with the same
number of elements as the input.
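For instance, a quick check against the analytic derivative cos(t); note that gradient also accepts the sample spacing as a second argument, which is equivalent to dividing by ts:
ts = pi/10;
t = 0:ts:2*pi;
x = sin(t);
xd_diff = diff(x)/ts;        % N-1 values: forward differences
xd_grad = gradient(x, ts);   % N values: central differences, one-sided at the ends
max(abs(xd_grad - cos(t)))   % error is O(ts^2) in the interior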

I know I'm a little late to the game here, but you can also get an approximation of the numerical derivative by taking the derivative of the (cubic) polynomial spline that runs through your data:
function dy = splineDerivative(x,y)
% the spline has continuous first and second derivatives
pp = spline(x,y); % could also use pp = pchip(x,y);
[breaks,coefs,K,r,d] = unmkpp(pp);
% pre-allocate the coefficient vector
dCoeff = zeros(K,r-1);
% Columns are ordered from highest to lowest power. Both spline and pchip
% return 4xn matrices, ordered from 3rd to zeroth power. (Thanks to the
% anonymous person who suggested this edit).
dCoeff(:, 1) = 3 * coefs(:, 1); % d(ax^3)/dx = 3ax^2;
dCoeff(:, 2) = 2 * coefs(:, 2); % d(ax^2)/dx = 2ax;
dCoeff(:, 3) = 1 * coefs(:, 3); % d(ax^1)/dx = a;
dpp = mkpp(breaks,dCoeff,d);
dy = ppval(dpp,x);
The spline polynomial is always guaranteed to have continuous first and second derivatives at each point. I have not tested and compared this against using pchip instead of spline, but that might be another option, as it too has continuous first derivatives (but not second derivatives) at every point.
The advantage of this is that there is no requirement that the step size be uniform.
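For instance, a quick check of splineDerivative against the analytic derivative, on deliberately non-uniform sample points:
t = sort(2*pi*rand(1, 50));   % 50 non-uniformly spaced points in [0, 2*pi]
x = sin(t);
dx = splineDerivative(t, x);
max(abs(dx - cos(t)))         % small, though typically largest near the interval ends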

There are some options to work around your issue.
First: you can make your domain larger. Instead of N, use N+1 grid points.
Second: depending on the end point of interest, you can use a one-sided difference:
Forward difference: (F(x + dx) - F(x)) / dx
Backward difference: (F(x) - F(x - dx)) / dx
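For instance, a sketch of the second option: keep diff's N-1 forward differences and pad with a single backward difference at the end, so the result has N samples again:
t0 = 0; ts = pi/10; tf = 2*pi;
t = t0:ts:tf;
x = sin(t);
xd = [diff(x)/ts, (x(end) - x(end-1))/ts];   % forward differences + one backward difference
numel(xd) == numel(x)                        % true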

How exactly does this simple calculation of an ML gradient descent cost function work in Octave/MATLAB?

I am following a machine learning course on Coursera and I am doing the following exercise using Octave (MATLAB should be the same).
The exercise is related to the calculation of the cost function for a gradient descent algorithm.
In the course slide I have that this is the cost function that I have to implement using Octave. This is the formula from the course slide:

J(theta) = (1/(2*m)) * sum over i = 1..m of ( h_theta(x^(i)) - y^(i) )^2

where the hypothesis is

h_theta(x) = theta' * x

So J is a function of the THETA variables, represented by the theta vector (in the previous second equation).
This is the correct MATLAB/Octave implementation for the J(theta) computation:
function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
% J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
% parameter for linear regression to fit the data points in X and y
% Initialize some useful values
m = length(y); % number of training examples
% You need to return the following variables correctly
J = 0;
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
% You should set J to the cost.
J = (1/(2*m))*sum(((X*theta) - y).^2)
% =========================================================================
end
where:
X is a 2-column matrix of m rows, with all the elements of the first column set to the value 1:
X =
1.0000 6.1101
1.0000 5.5277
1.0000 8.5186
...... ......
...... ......
...... ......
y is a vector of m elements (the same m as for X):
y =
17.59200
9.13020
13.66200
........
........
........
Finally, theta is a 2x1 column vector initialized with zeros, like this:
theta = zeros(2, 1); % initialize fitting parameters
theta
theta =
0
0
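Just to double check my understanding: with this all-zero theta, X*theta is the zero vector, so the cost should reduce to sum(y.^2)/(2*m). For example, with the first three rows shown above:
X = [1 6.1101; 1 5.5277; 1 8.5186];
y = [17.592; 9.1302; 13.662];
theta = zeros(2, 1);
J = computeCost(X, y, theta)   % equals sum(y.^2)/(2*length(y))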
Ok, coming back to my working solution:
J = (1/(2*m))*sum(((X*theta) - y).^2)
specifically to this matrix multiplication (multiplication between the matrix X and the vector theta): I know that it is a valid matrix multiplication, because the number of columns of X (2) is equal to the number of rows of theta (2).
The doubt that is driving me crazy (probably a trivial one) is related to the previous course slide context:
As you can see in the second equation, the one used to calculate the current h_theta(x) value, the slide uses the transposed theta vector and not the theta vector as done in the code.
Why?!
I suspect that it depends only on how the theta vector was created. It was built in this way:
theta = zeros(2, 1); % initialize fitting parameters
which generates a vector with 2 rows and 1 column instead of a classic vector with 1 row and 2 columns. So maybe I don't have to transpose it. But I am absolutely not sure about this assertion.
Is my intuition correct or what am I missing?
Your intuition is correct. Effectively it does not matter whether you perform the multiplication as theta.' * X or as X.' * theta, since this either generates a horizontal vector or a vertical vector of the hypothesis representing all observations, and what you're expected to do next is subtract the y vector from the hypothesis vector at each observation, and sum the results. So as long as y has the same orientation as your hypothesis and you subtract at each equivalent point, then the scalar end-result of the summation will be the same.
Often enough, you'll see the X.' * theta version preferred over theta.' * X purely for convenience, to avoid transposing over and over again just to be consistent with the mathematical notation. But this is fine, since the underlying math doesn't really change, only the order of equivalent operations.
I agree it's confusing though, both because it makes it harder to follow the formula when the code effectively looks like it's doing something else, and also because it messes with the usual convention that a vertical vector represents 'coordinates' and a horizontal vector represents observations. In such cases, especially in languages like MATLAB/Octave where the orientation of a vector isn't explicitly defined in the variable's type, it is doubly important to document what you expect the inputs to represent, and preferably there should have been assert statements in the code confirming the input has been passed in the correct orientation. Clearly they felt it wasn't necessary here because this code is acting under controlled conditions in a predefined exercise environment, but it would have been good practice to do so from a software engineering point of view.
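For illustration, a tiny sketch (with made-up numbers) showing that the two orientations bookkeep the same values:
X = [1 6.1101; 1 5.5277; 1 8.5186];   % m x 2: rows are observations
theta = [-1; 2];                       % 2 x 1 parameter vector (made-up values)
h_all = X * theta;                     % m x 1 hypothesis vector, as in the code
x1 = X(1, :).';                        % first observation as a 2 x 1 column
h1 = theta.' * x1;                     % slide notation: theta' * x, a scalar
h1 - h_all(1)                          % zero: the same value either way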

How can I use the Second Derivative test to find Maxima and Minima in MATLAB

I'm looking to create a function that takes an equation and marks the maxima and/or minima, and asymptotes on a graph.
From Calc 1, I remember using the Second Derivative Test.
I started by solving for the roots of the first derivative, but am not sure how I can plot where the points in this vector intersect my original equation.
syms x;
f = sin(x)     % define equation as a function
df = diff(f)   % first derivative
ddf = diff(df) % second derivative
I found the roots for the x-values of these points with dfRoots = solve(df)
and then created a vector of them called dfRoots_realDouble
dfRoots_double = double(dfRoots);
dfRoots_realDouble = real(dfRoots_double);
dfRoots_realDouble represents the x-values I need, but I don't know how to plot them as points and separate minima from maxima, etc.
I'd like to be able to put in any function, like sin(6*(x^5) + 7*(x^3)+8*(x^2)) on [-1,1], and have it highlight all the maxima with one marker and all the minima with another.
If you sort the roots, you will have an alternating vector of maxima and minima (assuming the critical points are non-degenerate, i.e. the second derivative is nonzero there). Use the second derivative to determine whether your vector of roots starts with a maximum or a minimum, and then split the vector.
crit = sort([-3 4 2 -5]);   % example critical points; "crit" avoids shadowing the builtin roots()
is_max = (double(subs(ddf, x, crit(1))) < 0)
if is_max
    max_points = crit(1:2:end)
    min_points = crit(2:2:end)
else
    max_points = crit(2:2:end)
    min_points = crit(1:2:end)
end
% plot with two different symbols (double() converts the symbolic values for plotting)
plot(max_points, double(subs(f, x, max_points)), 'or',...
     min_points, double(subs(f, x, min_points)), '*k')
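Putting it together for the example function from the question, here is a minimal end-to-end sketch (it assumes the Symbolic Math Toolbox; note that solve may not find every root of a transcendental derivative, in which case vpasolve over subintervals is a possible fallback):
syms x
f = sin(6*x^5 + 7*x^3 + 8*x^2);             % example from the question
df = diff(f);
ddf = diff(df);
r = double(solve(df, x));                   % critical points (possibly incomplete)
r = real(r(abs(imag(r)) < 1e-10));          % keep the numerically real roots
r = r(r >= -1 & r <= 1);                    % restrict to the interval [-1, 1]
is_max = double(subs(ddf, x, r)) < 0;       % second derivative test
fplot(matlabFunction(f), [-1 1]); hold on
plot(r(is_max),  double(subs(f, x, r(is_max))),  'or')   % maxima
plot(r(~is_max), double(subs(f, x, r(~is_max))), '*k')   % minima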

numerical integration for Gaussian function - indefinite integral

My approach
fun = @(y) (1/sqrt(pi))*exp(-(y-1).^2).*log(1 + exp(-4*y))
integral(fun,-Inf,Inf)
This gives NaN.
So I tried plotting it.
y= -10:0.1:10;
plot(y,exp(-(y-1).^2).*log(1 + exp(-4*y)))
Then I understood that the significant part of the domain is from -4 to +4.
So I changed the limits to
integral(fun,-10,10)
However, I do not want to always have to plot the graph first to find its limits. So is there any way to compute the integral directly from -Inf to Inf?
Discussion
If your integrals are always of the form

I = (1/sqrt(pi)) * Integral from -Inf to Inf of exp(-(y-c)^2) * g(y) dy

I would use a high-order Gauss–Hermite quadrature rule.
It's similar to the Gauss–Kronrod rule that forms the basis for quadgk, but it is specifically tailored for integrals over the real line with a standard Gaussian multiplier.
Rewriting your equation with the substitution x = y - 1, we get

(1/sqrt(pi)) * Integral from -Inf to Inf of exp(-x^2) * log(1 + exp(-4*(x+1))) dx
The integral can then be computed using the Gauss-Hermite rule of arbitrary order (within reason):
>> order = 10;
>> [nodes,weights] = GaussHermiteRule(order);
>> f = @(x) log(1 + exp(-4*(x+1)))/sqrt(pi);
>> sum(f(nodes).*weights)
ans =
0.1933
I'd note that the function below builds a full order x order matrix to compute nodes, so it shouldn't be made too large.
There is a way to avoid this by explicitly computing the weights, but I decided to be lazy.
Besides, even at order 100, the Gaussian multiplier is about 2E-98, so the integrand's contribution is extremely minimal.
And while this isn't inherently adaptive, a high-order rule should be sufficient in most cases ... I hope.
Code
function [nodes,weights] = GaussHermiteRule(n)
% ------------------------------------------------------------------------------
% Find the nodes and weights for a Gauss-Hermite quadrature integration.
%
    if (n < 1) || (abs(n - round(n)) > eps())
        error('Given order ''n'' must be a strictly positive integer.');
    else
        n = round(n);
    end
    % Three-term recurrence coefficients for the Hermite polynomials
    n = (0:n)';
    b = n*0;
    a = b + 0.5;
    c = n;
    % Get the nodes and weights from the Golub-Welsch function
    [nodes,weights] = GolubWelsch(a,b,c,sqrt(pi));
end
function [xk,wk] = GolubWelsch(ak,bk,ck,mu0)
%GolubWelsch
% Calculate the approximate* nodes and weights (normalized to 1) of an orthogonal
% polynomial family defined by a three-term recurrence relation of the form
%   x pk(x) = ak pkp1(x) + bk pk(x) + ck pkm1(x)
%
% The weight scale factor mu0 is the integral of the weight function over the
% orthogonal domain.
%
    % Calculate the terms for the orthonormal version of the polynomials
    alpha = sqrt(ak(1:end-1) .* ck(2:end));
    % Build the symmetric tridiagonal Jacobi matrix
    T = full(spdiags([[alpha;0],bk,[0;alpha]],[-1,0,+1],length(alpha),length(alpha)));
    % Calculate the eigenvectors and values of the matrix
    [V,xk] = eig(T,'vector');
    % Calculate the weights from the eigenvectors - technically, Golub-Welsch requires
    % a normalization, but since MATLAB returns unit eigenvectors, it is omitted.
    wk = mu0*(V(1,:).^2)';
end
I've had success with transforming such integrals over infinite bounds using a numerical variable transformation, as explained in Numerical Recipes 3e, section 4.5.3. Basically, you substitute y = c*tan(t) + b and then numerically integrate over t in (-pi/2, pi/2), which sweeps y from -infinity to infinity. You can tune the values of c and b to optimize the process. This approach largely dodges the question of trying to determine cutoffs in the domain, but for this to work reliably using quadrature you have to know that the integrand does not have features far from y = b.
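For example, here is a minimal sketch of that substitution applied to the integrand from the question (b = 1 and c = 1 are illustrative choices, not values prescribed by the book; the softplus rewrite of log(1 + exp(-4*y)) guards against the overflow to Inf that likely caused the original NaN for very negative y):
softplus = @(z) max(z, 0) + log1p(exp(-abs(z)));          % stable log(1 + exp(z))
fun = @(y) (1/sqrt(pi))*exp(-(y-1).^2).*softplus(-4*y);
b = 1;                                                     % center near the integrand's peak
c = 1;                                                     % scale; tune as needed
g = @(t) fun(c*tan(t) + b) .* c .* sec(t).^2;              % dy = c*sec(t)^2 dt
integral(g, -pi/2, pi/2)                                   % should agree with ~0.1933 above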
A quick and dirty solution would be to look for a position where your function is sufficiently small, and then take that as the limits. This assumes that for x > 0 the function fun decreases monotonically, and that fun(x) is roughly the same size as fun(-x) for all x.
%// A small number
epsilon = eps;
%// Stepsize for searching bound
stepTest = 1;
%// Starting position for searching bound
position = 0;
%// Not yet small enough
smallEnough = false;
%// Search bound
while ~smallEnough
smallEnough = (fun(position) < epsilon);
position = position + stepTest;
end
%// Calculate integral
integral(fun, -position, position)
If you were happy with plotting the function and deciding by eye where you can cut, then this code should suffice, I guess.

Limit cycle in Matlab for 2nd order autonomous system

I wish to create a limit cycle in Matlab. A limit cycle looks something like this:
I have no idea how to do it though, I've never done anything like this in Matlab.
The equations to describe the limit cycle are the following:
x_1d = x_2
x_2d = -x_1 + x_2 - 2*(x_1 + 2*x_2)*x_2^2
It is to be centered around the equilibrium which is (0,0)
Can any of you help me?
If you use the partial derivatives of your function to make a vector field, you can then use streamlines to visualize the cycle that you are describing.
For example, take the function f = x^2 + y^2, which gives the partial derivatives dx = 2x and dy = 2y. For the visualization, I sample from the partial derivatives over a grid.
[x,y] = meshgrid(0:0.1:1,0:0.1:1);
dx = 2*x;
dy = 2*y;
To visualize the vector field, I use quiver;
figure;
quiver(x, y, dx, dy);
Using streamline, I can visualize the path a particle injected into the vector field would take. In my example, I inject the particle at (0.1, 0.1):
streamline(x,y, dx, dy, 0.1, 0.1);
This produces the following visualization
In your case, you can omit the quiver step to remove the hedgehog arrows at every grid point.
Here's another example that shows the particle converging to an orbit.
Edit: Your function specifically.
So, as knedlsepp points out, the function you are interested in is a bit ambiguously stated. In MATLAB, * represents the matrix product while .* represents element-wise multiplication between matrices. Similarly, ^2 represents M*M for a square matrix M, while .^2 takes the element-wise power.
So,
[x_1,x_2] = meshgrid(-4:0.1:4,-4:0.1:4);
dx_1 = x_2;
dx_2 = -x_1+x_2-2*(x_1+2*x_2).*x_2.^2;
figure; streamline(x_1,x_2, dx_1, dx_2, 0:0.1:4, 0:0.1:4);
Looks like
This function will not show convergence because it doesn't converge.
knedlsepp suggests that the function you are actually interested in is
dx_1 = -1 * x_2;
dx_2 = -1 * (-x_1+x_2-2*(x_1+2*x_2).*(x_2).^2);
His post has a nice description of the rest.
This post shows the code to produce the integral lines of your vector field defined by:
dx/dt = y
dy/dt = -x+y-2*(x+2*y)*y^2.
It is important to properly vectorize this function (i.e. introduce dots at all the important places):
dxdt = @(x,y) y;
dydt = @(x,y) -x+y-2*(x+2*y).*y.^2;
[X,Y] = meshgrid(linspace(-4,4,100));
[sx,sy] = meshgrid(linspace(-3,3,20));
streamline(stream2(X, Y, ... % Points
dxdt(X,Y), dydt(X,Y),... % Derivatives
sx, sy)); % Starting points
axis equal tight
To get a picture more similar to yours, change the grid size and starting points:
[X,Y] = meshgrid(linspace(-1,1,100));
[sx,sy] = meshgrid(linspace(0,0.75,20),0.2);

Matlab Vectorization of Multivariate Gaussian Basis Functions

I have the following code for calculating the result of a linear combination of Gaussian functions. What I'd really like to do is to vectorize this somehow so that it's far more performant in Matlab.
Note that y is a column vector (the output), x is a matrix where each column corresponds to a data point and each row corresponds to a dimension (i.e. 2 rows = 2D), variance is a double, gaussians is a matrix where each column is a vector corresponding to the mean point of a Gaussian, and weights is a row vector of the weights in front of each Gaussian. Note that the length of weights is one greater than the number of Gaussians, since weights(1) is the 0th-order weight.
function [ y ] = CalcPrediction( gaussians, variance, weights, x )

basisFunctions = size(gaussians, 2);
xvalues = size(x, 2);

if length(weights) ~= basisFunctions + 1
    ME = MException('TRAIN:CALC', 'The number of weights should be equal to the number of basis functions plus one');
    throw(ME);
end

y = weights(1) * ones(xvalues, 1);

for xIdx = 1:xvalues
    for i = 1:basisFunctions
        diff = x(:, xIdx) - gaussians(:, i);
        y(xIdx) = y(xIdx) + weights(i+1) * exp(-(diff')*diff/(2*variance));
    end
end
end
You can see that at the moment I simply iterate over the x vectors and then the gaussians inside 2 for loops. I'm hoping that this can be improved; I've looked at meshgrid, but that seems to apply only to vectors (and I have matrices).
Thanks.
Try this
diffx = bsxfun(@minus,x,permute(gaussians,[1,3,2])); % binary operation with singleton expansion
diffx2 = squeeze(sum(diffx.^2,1)); % squared distances; shape is now [XVALUES,BASISFUNCTIONS]
weight_col = weights(:); % make sure weights is a column vector
y = weight_col(1) + exp(-diffx2/2/variance)*weight_col(2:end); % column vector of length XVALUES
Note, I changed diff to diffx since diff is a builtin, and added the weight_col(1) term so the 0th-order weight from the loop version is included. I'm not sure this will improve performance, as allocating the large intermediate arrays may offset the gains from vectorization.
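A quick way to sanity-check the vectorized version against the original loop, with made-up sizes and random data:
d = 2; B = 5; N = 100;                     % made-up dimensions for the check
x = randn(d, N);                           % N data points in d dimensions
gaussians = randn(d, B);                   % B Gaussian means
weights = randn(1, B + 1);                 % weights(1) is the 0th-order weight
variance = 0.5;
y_loop = CalcPrediction(gaussians, variance, weights, x);
diffx  = bsxfun(@minus, x, permute(gaussians, [1,3,2]));
diffx2 = squeeze(sum(diffx.^2, 1));
wc = weights(:);
y_vec = wc(1) + exp(-diffx2/2/variance)*wc(2:end);
max(abs(y_loop - y_vec))                   % should be ~1e-15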