How does one compute a single finite difference in Matlab efficiently? - matlab

I wanted to compute a finite difference with respect to the change of the function in Matlab. In other words
f(x+e_i) - f(x)
is what I want to compute. Note that it's very similar to first-order numerical partial differentiation (forward differentiation in this case):
(f(x+e_i) - f(x)) / (e_i)
Currently I am using for loops to compute it but it seems that Matlab is much slower than I thought. I am doing it as follows:
function [ dU ] = numerical_gradient(W,f,eps)
%compute gradient or finite difference update numerically
[D1, D2] = size(W);
dU = zeros(D1, D2);
for d1=1:D1
for d2=1:D2
e = zeros([D1,D2]);
e(d1,d2) = eps;
f_e1 = f(W+e);
f_e2 = f(W-e);
%numerical_derivative = (f_e1 - f_e2)/(2*eps);
%dU(d1,d2) = numerical_derivative
numerical_difference = f_e1 - f_e2;
dU(d1,d2) = numerical_difference;
end
end
It seems that it's really difficult to vectorize the above code, because numerical differences follow the definition of the gradient and partial derivatives, which is:
df_dW = [ ..., df_dWi, ...]
where df_dWi assumes the other coordinates are fixed and it only worries about the change of the variable Wi. Thus, I can't just change all the coordinates at once.
Is there a better way to do this? My intuition tells me that the best way to do this is to implement this not in matlab but in some other language, say C and then have matlab call that library. Is that true? Does it mean that the best solution is some Matlab library that does this for me?
I did see:
https://www.mathworks.com/matlabcentral/answers/332414-what-is-the-quickest-way-to-find-a-gradient-or-finite-difference-in-matlab-of-a-real-function-in-hig
but unfortunately, it computes exact derivatives, which isn't what I am looking for. I am explicitly looking for differences or "bad approximation" to the gradient.
Since it seems this code is not easy to vectorize (in fact my intuition tells me it's not possible to do so), my only other idea is to implement this finite difference function in C and then have Matlab call it. Is this a good idea? Does anyone know how to do this?
I did try reading the following:
https://www.mathworks.com/help/matlab/matlab_external/standalone-example.html
but it was too difficult for me to understand because I have no idea what a mex file is, whether I need an arrayProduct.c file as well as a mex.h file, whether I also need a matlab file, etc. If there were a way to simply download a working example with all the functions they suggest there and some instructions to compile it, that would be super helpful. But just from reading the html article like that, it's impossible for me to infer what they want me to do.
For the sake of completeness, it seems reddit has some comments in its discussion of this:
https://www.reddit.com/r/matlab/comments/623m7i/how_does_one_compute_a_single_finite_differences/

Here is a more efficient way of doing so:
function [ vNumericalGrad ] = CalcNumericalGradient( hInputFunc, vInputPoint, epsVal )
numElmnts = size(vInputPoint, 1);
vNumericalGrad = zeros([numElmnts, 1]);
refVal = hInputFunc(vInputPoint);
for ii = 1:numElmnts
% Set the perturbation vector
refInVal = vInputPoint(ii);
vInputPoint(ii) = refInVal + epsVal;
% Compute Numerical Gradient
vNumericalGrad(ii) = (hInputFunc(vInputPoint) - refVal) / epsVal;
% Reset the perturbation vector
vInputPoint(ii) = refInVal;
end
end
This code allocates less memory.
Its performance is dominated by the speed of hInputFunc.
The small tricks compared to the original code are:
No memory reallocation of e in each iteration.
Instead of the vector addition W + e, there are two scalar assignments to the array.
The number of calls to hInputFunc is halved by computing the reference value once outside the loop (this only works for forward / backward differences).
This will probably be very close to C code in speed, unless you can implement the function that computes the value (hInputFunc) more efficiently in C.
A full implementation can be found in StackOverflow Q44984132 Repository (It was Posted in StackOverflow Q44984132).
See CalcFunGrad( vX, hObjFun, difMode, epsVal ).
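The same in-place perturbation trick also works for a matrix input and for the plain difference f(W + e_i) - f(W) asked about in the question, by using linear indexing. The following is a minimal sketch, not the repository's implementation; the test function f and the values of W and epsVal are made up for illustration.

```matlab
% Hypothetical test function and input (assumptions for illustration only)
f = @(W) sum(W(:).^2);
W = [1 2; 3 4];
epsVal = 1e-6;

refVal = f(W);                 % reference value, computed once
dU = zeros(size(W));           % plain forward differences f(W + e_i) - f(W)
for idx = 1:numel(W)           % linear indexing covers every entry of W
    oldVal = W(idx);
    W(idx) = oldVal + epsVal;  % perturb one entry in place
    dU(idx) = f(W) - refVal;
    W(idx) = oldVal;           % restore the entry
end
gradApprox = dU / epsVal;      % divide only if the derivative estimate is wanted
```

For this f the exact gradient is 2*W, so gradApprox should agree with it up to an error on the order of epsVal.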

A much better approach (numerically more stable, no perturbation hyperparameter to choose, accurate up to machine precision) is to use algorithmic/automatic differentiation. For this you need the Matlab Deep Learning Toolbox. Then you can use dlgradient to compute the gradient. The source code corresponding to your example is attached below.
Most importantly, you can examine the error and observe that the deviation of the automatic approach from the analytical solution is indeed at the level of machine precision, while for the finite difference approach (I chose second-order central differences) the error is orders of magnitude higher. For 100 points and a range of $[-10, 10]$ these errors are somewhat tolerable, but if you play a bit with Rand_Max and n_points you will observe that the errors become larger and larger.
Error of algorithmic / automatic diff. is: 1.4755528111219851e-14
Error of finite difference diff. is: 1.9999999999348703e-01 for perturbation 1.0000000000000001e-01
Error of finite difference diff. is: 1.9999999632850161e-03 for perturbation 1.0000000000000000e-02
Error of finite difference diff. is: 1.9999905867860374e-05 for perturbation 1.0000000000000000e-03
Error of finite difference diff. is: 1.9664569947425062e-07 for perturbation 1.0000000000000000e-04
Error of finite difference diff. is: 1.0537897883625319e-07 for perturbation 1.0000000000000001e-05
Error of finite difference diff. is: 1.5469326944467290e-06 for perturbation 9.9999999999999995e-07
Error of finite difference diff. is: 1.3322061696937969e-05 for perturbation 9.9999999999999995e-08
Error of finite difference diff. is: 1.7059535957436630e-04 for perturbation 1.0000000000000000e-08
Error of finite difference diff. is: 4.9702408787320664e-04 for perturbation 1.0000000000000001e-09
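The first rows of this list shrink by a factor of 100 for each tenfold reduction of the perturbation, which is the expected h^2 behaviour of central differences; for smaller perturbations round-off error takes over and the error grows again. The truncation part can be reproduced in isolation on the cubic term 2*x^3 of f2, the only term for which central differences are not exact. This is a minimal sketch; the evaluation point x0 and the step sizes are arbitrary choices for illustration.

```matlab
% Central difference error for the cubic term of f2 (illustrative values)
f = @(x) 2*x.^3;
x0 = 5;                               % arbitrary evaluation point
exact = 6*x0^2;                       % analytical derivative of 2*x^3
h = [1e-1 1e-2 1e-3];                 % perturbations
approx = (f(x0 + h) - f(x0 - h)) ./ (2*h);
err = abs(approx - exact);            % truncation error, equals 2*h^2 here
fprintf("h = %.0e, error = %.3e\n", [h; err]);
```

For this function the central difference equals 6*x0^2 + 2*h^2 exactly in exact arithmetic, so the printed errors should follow 2*h^2 closely for these step sizes.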
Source Code:
f2.m:
function y = f2(x)
x1 = x(:, 1);
x2 = x(:, 2);
x3 = x(:, 3);
y = x1.^2 + 2*x2.^2 + 2*x3.^3 + 2*x1.*x2 + 2*x2.*x3;
f2_grad_analytic.m:
function grad = f2_grad_analytic(x)
x1 = x(:, 1);
x2 = x(:, 2);
x3 = x(:, 3);
grad(:, 1) = 2*x1 + 2*x2;
grad(:, 2) = 4*x2 + 2*x1 + 2 * x3;
grad(:, 3) = 6*x3.^2 + 2*x2;
f2_grad_AD.m:
function grad = f2_grad_AD(x)
x1 = x(:, 1);
x2 = x(:, 2);
x3 = x(:, 3);
y = x1.^2 + 2*x2.^2 + 2*x3.^3 + 2*x1.*x2 + 2*x2.*x3;
grad = dlgradient(y, x);
CalcNumericalGradient.m:
function NumericalGrad = CalcNumericalGradient(InputPoints, eps)
% (Central, second order accurate FD)
NumericalGrad = zeros(size(InputPoints) );
for i = 1:size(InputPoints, 2)
perturb = zeros(size(InputPoints));
perturb(:, i) = eps;
NumericalGrad(:, i) = (f2(InputPoints + perturb) - f2(InputPoints - perturb)) / (2 * eps);
end
main.m:
clear;
close all;
clc;
n_points = 100;
Rand_Max = 20;
x_test_FD = rand(n_points, 3) * Rand_Max - Rand_Max/2;
% Calculate analytical solution
grad_analytic = f2_grad_analytic(x_test_FD);
grad_AD = zeros(n_points, 3);
for i = 1:n_points
x_test_dl = dlarray(x_test_FD(i,:) );
grad_AD(i,:) = dlfeval(@f2_grad_AD, x_test_dl);
end
Err_AD = norm(grad_AD - grad_analytic);
fprintf("Error of algorithmic / automatic diff. is: %.16e\n", Err_AD);
eps_range = [1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6, 1e-7, 1e-8, 1e-9];
for i = 1:length(eps_range)
eps = eps_range(i);
grad_FD = CalcNumericalGradient(x_test_FD, eps);
Err_FD = norm(grad_FD - grad_analytic);
fprintf("Error of finite difference diff. is: %.16e for perturbation %.16e\n", Err_FD, eps);
end

Related

Function fzero in Matlab is not converging

I am solving a problem in my Macroeconomics class. Consider the following equation:
Here, k is fixed and c(k) was defined through the interp1 function in Matlab. Here is my code:
beta = 0.98;
delta = 0.13;
A = 2;
alpha = 1/3;
n_grid = 1000; % Number of points for capital
k_grid = linspace(5, 15, n_grid)';
tol = 1e-5;
max_it = 1000;
c0 = ones(n_grid, 1);
new_k = zeros(n_grid, 1);
dist_c = tol + 1;
it_c = 0;
while dist_c > tol && it_c < max_it
c_handle = @(k_tomorrow) interp1(k_grid, c0, k_tomorrow, 'linear', 'extrap');
for i=1:n_grid
% Solve for k'
euler = @(k_tomorrow) (1/((1-delta)* k_grid(i) + A * k_grid(i)^alpha - k_tomorrow)) - beta*(1-delta + alpha*A*k_tomorrow^(alpha - 1))/c_handle(k_tomorrow);
new_k(i) = fzero(euler, k_grid(i)); % What's a good guess for fzero?
end
% Compute new values for consumption
new_c = A*k_grid.^alpha + (1-delta)*k_grid - new_k;
% Check convergence
dist_c = norm(new_c - c0);
c0 = new_c;
it_c = it_c + 1;
end
When I run this code, for some indexes $i$ it runs fine and fzero finds the solution. But for other indexes it just returns NaN and exits without finding the root. This is a somewhat well-behaved problem in Economics: the solution we are looking for indeed exists, and the algorithm I tried to implement is guaranteed to work. But I don't have much experience with solving this in MATLAB and I guess I have a silly mistake somewhere. Any ideas on how to proceed?
This is the typical error message:
Exiting fzero: aborting search for an interval containing a sign change
because complex function value encountered during search.
(Function value at -2.61092 is 0.74278-0.30449i.)
Check function or try again with a different starting value.
Thanks a lot in advance!
The only term that can produce complex numbers is
k'^(alpha - 1) = k'^(-2/3)
You probably want the result according to the real variant of the cube root, which you could get as
sign(k') * abs(k')^(-2/3)
or more generally and avoiding divisions by zero
k' * (1e-16+abs(k'))^(alpha - 2)
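The effect can be checked at the point reported in the error message. The sketch below is only an illustration: the variable names are made up (k' is not a valid MATLAB identifier, so k is used instead), and the two real-valued variants are the ones proposed above.

```matlab
alpha = 1/3;
k = -2.61092;                                % value fzero reported in the error message

v_complex = k^(alpha - 1);                   % negative base, fractional power: complex result
v_real = sign(k) * abs(k)^(alpha - 1);       % real-valued variant
v_safe = k * (1e-16 + abs(k))^(alpha - 2);   % same value, but finite at k = 0

disp(isreal(v_complex));                     % 0, i.e. the plain power is complex
disp([v_real, v_safe]);
```

Inside the euler handle this would mean replacing k_tomorrow^(alpha - 1) with one of the real-valued expressions, so that fzero never sees a complex function value while it searches for a sign change.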

Input equations into Matlab for Simulink Function

I am currently working on an assignment where I need to create two different controllers in Matlab/Simulink for a robotic exoskeleton leg. The idea behind this is to compare both of them and see which controller is better at assisting a human wearing it. I am having a lot of trouble putting specific equations into a Matlab function block to then run in Simulink to get results for an AFO (adaptive frequency oscillator). The link has the equations I'm trying to put in and the following is the code I have so far:
function [pos_AFO, vel_AFO, acc_AFO, offset, omega, phi, ampl, phi1] = LHip(theta, eps, nu, dt, AFO_on)
t = 0;
% syms j
% M = 6;
% j = sym('j', [1 M]);
if t == 0
omega = 3*pi/2;
theta = 0;
phi = pi/2;
ampl = 0;
else
omega = omega*(t-1) + dt*(eps*offset*cos(phi1));
theta = theta*(t-1) + dt*(nu*offset);
phi = phi*(t-1) + dt*(omega + eps*offset*cos(phi*core(t-1)));
phi1 = phi*(t-1) + dt*(omega + eps*offset*cos(phi*core(t-1)));
ampl = ampl*(t-1) + dt*(nu*offset*sin(phi));
offset = theta - theta*(t-1) - sym(ampl*sin(phi), [1 M]);
end
pos_AFO = (theta*(t-1) + symsum(ampl*(t-1)*sin(phi* (t-1))))*AFO_on; %symsum needs input argument for index M and range
vel_AFO = diff(pos_AFO)*AFO_on;
acc_AFO = diff(vel_AFO)*AFO_on;
end
https://www.pastepic.xyz/image/pg4mP
Essentially, I don't know how to do the subscripts, sigma, or the (t+1) function. Any help is appreciated as this is due next week
You are looking to find the result of an adaptive process therefore your algorithm needs to consider time as it progresses. There is no (t-1) operator as such. It is just a mathematical notation telling you that you need to reuse an old value to calculate a new value.
N = 100; % number of time steps (assumed value, adjust to your simulation)
omega = zeros(1, N);
theta = zeros(1, N);
omega_old = 0;
theta_old = 0;
% initialize the rest of your variables
for t = 1:N
omega(t) = omega_old + 0; % replace 0 with the rest of your omega calculation
theta(t) = theta_old + 0; % ...
% more code .....
% remember your old values for next iteration
omega_old = omega(t);
theta_old = theta(t);
end
I think you forgot to apply the modulo operation to phi judging by the original formula you linked. As a general rule, design your code in small pieces, make sure the output of each piece makes sense and then combine all pieces and make sure the overall result is correct.

Using MATLAB plots to find linear equation constants

Finding m and c for an equation y = mx + c, with the help of math and plots.
y is data_model_1, x is time.
Avoid other MATLAB functions like fitlm as it defeats the purpose.
I am having trouble finding the constants m and c. I am trying to find both m and c by limiting them to a range (based on smart guess) and I need to deduce the m and c values based on the mean error range. The point where mean error range is closest to 0 should be my m and c values.
load(file)
figure
plot(time,data_model_1,'bo')
hold on
for a = 0.11:0.01:0.13
c = -13:0.1:-10
data_a = a * time + c ;
plot(time,data_a,'r');
end
figure
hold on
for a = 0.11:0.01:0.13
c = -13:0.1:-10
data_a = a * time + c ;
mean_range = mean(abs(data_a - data_model_1));
plot(a,mean_range,'b.')
end
A quick & dirty approach
You can quickly get m and c using fminsearch(). In the first example below, the error function is the sum of squared error (SSE). The second example uses the sum of absolute error. The key here is ensuring the error function is convex.
Note that c = Beta(1) and m = Beta(2).
Reproducible example (MATLAB code[1]):
% Generate some example data
N = 50;
X = 2 + 13*random(makedist('Beta',.7,.8),N,1);
Y = 5 + 1.5.*X + randn(N,1);
% Example 1
SSEh = @(Beta) sum((Y - (Beta(1) + (Beta(2).*X))).^2);
Beta0 = [0.5 0.5]; % Initial Guess
[Beta SSE] = fminsearch(SSEh,Beta0)
% Example 2
SAEh = @(Beta) sum(abs(Y-(Beta(1) + Beta(2).*X)));
[Beta SumAbsErr] = fminsearch(SAEh,Beta0)
This is a quick & dirty approach that can work for many applications.
@Wolfie's comment directs you to the analytical approach of solving a system of linear equations with the \ operator or mldivide(). This is the more correct approach (though it will give a similar answer). One caveat is that this approach yields the SSE (least-squares) answer.
[1] Tested with MATLAB R2018a
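For comparison, the analytical least-squares approach with the \ operator can be sketched as follows. The data here are made up and noise-free (an assumption for illustration), so the recovered constants should match the true ones up to round-off.

```matlab
% Hypothetical noise-free data on the line y = 5 + 1.5*x
X = (1:10)';
Y = 5 + 1.5*X;

A = [ones(numel(X), 1), X];   % design matrix: intercept column and slope column
Beta = A \ Y;                 % least-squares solution via mldivide
c = Beta(1);                  % intercept
m = Beta(2);                  % slope
fprintf("c = %.4f, m = %.4f\n", c, m);
```

As in the fminsearch examples, Beta(1) is the intercept c and Beta(2) is the slope m.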

My example shows SVD is less numerically stable than QR decomposition

I asked this question in Math Stackexchange, but it seems it didn't get enough attention there so I am asking it here. https://math.stackexchange.com/questions/1729946/why-do-we-say-svd-can-handle-singular-matrx-when-doing-least-square-comparison?noredirect=1#comment3530971_1729946
I learned from some tutorials that SVD should be more stable than QR decomposition when solving Least Square problem, and it is able to handle singular matrix. But the following example I wrote in matlab seems to support the opposite conclusion. I don't have a deep understanding of SVD, so if you could look at my questions in the old post in Math StackExchange and explain it to me, I would appreciate a lot.
I use a matrix that has a large condition number (e+13). The result shows that SVD gets a much larger error (0.8) than QR (e-27).
% we do a linear regression between Y and X
data= [
47.667483331 -122.1070832;
47.667483331001 -122.1070832
];
X = data(:,1);
Y = data(:,2);
X_1 = [ones(length(X),1),X];
%%
%SVD method
[U,D,V] = svd(X_1,'econ');
beta_svd = V*diag(1./diag(D))*U'*Y;
%% QR method(here one can also use "\" operator, which will get the same result as I tested. I just wrote down backward substitution to educate myself)
[Q,R] = qr(X_1)
%now do backward substitution
[nr nc] = size(R)
beta_qr=[]
Y_1 = Q'*Y
for i = nc:-1:1
s = Y_1(i)
for j = nc:-1:i+1
s = s - R(i,j)*beta_qr(j)
end
beta_qr(i) = s/R(i,i)
end
svd_error = 0;
qr_error = 0;
for i=1:length(X)
svd_error = svd_error + (Y(i) - beta_svd(1) - beta_svd(2) * X(i))^2;
qr_error = qr_error + (Y(i) - beta_qr(1) - beta_qr(2) * X(i))^2;
end
Your SVD-based approach is basically the same as the pinv function in MATLAB (see Pseudo-inverse and SVD). What you are missing though (for numerical reasons) is a tolerance value, such that any singular values less than this tolerance are treated as zero.
If you refer to edit pinv.m, you can see something like the following (I won't post the exact code here because the file is copyrighted to MathWorks):
[U,S,V] = svd(A,'econ');
s = diag(S);
tol = max(size(A)) * eps(norm(s,inf));
% .. use above tolerance to truncate singular values
invS = diag(1./s);
out = V*invS*U';
In fact pinv has a second syntax where you can explicitly specify the tolerance value pinv(A,tol) if the default one is not suitable...
So when solving a least-squares problem of the form minimize norm(A*x-b), you should understand that the pinv and mldivide solutions have different properties:
x = pinv(A)*b is characterized by the fact that norm(x) is smaller than the norm of any other solution.
x = A\b has the fewest possible nonzero components (i.e sparse).
Using your example (note that rcond(A) is very small near machine epsilon):
data = [
47.667483331 -122.1070832;
47.667483331001 -122.1070832
];
A = [ones(size(data,1),1), data(:,1)];
b = data(:,2);
Let's compare the two solutions:
x1 = A\b;
x2 = pinv(A)*b;
First you can see how mldivide returns a solution x1 with one zero component (this is obviously a valid solution because you can solve both equations by multiplying by zero as in b + a*0 = b):
>> sol = [x1 x2]
sol =
-122.1071 -0.0537
0 -2.5605
Next you see how pinv returns a solution x2 with a smaller norm:
>> nrm = [norm(x1) norm(x2)]
nrm =
122.1071 2.5611
Here is the error of both solutions which is acceptably very small:
>> err = [norm(A*x1-b) norm(A*x2-b)]
err =
1.0e-11 *
0 0.1819
Note that using mldivide, linsolve, or qr will give pretty much the same results:
>> x3 = linsolve(A,b)
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 2.159326e-16.
x3 =
-122.1071
0
>> [Q,R] = qr(A); x4 = R\(Q'*b)
x4 =
-122.1071
0
SVD can handle rank deficiency. The diagonal matrix D has a near-zero element in your code, and you need to use the pseudoinverse for SVD, i.e. set the 2nd element of 1./diag(D) to 0 instead of the huge value (10^14). You should find that SVD and QR have equally good accuracy in your example. For more information, see this document http://www.cs.princeton.edu/courses/archive/fall11/cos323/notes/cos323_f11_lecture09_svd.pdf
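The truncation described above can be sketched as follows. This is an illustration and not the pinv source: the matrix is a small, exactly rank-deficient example made up for this purpose, and the tolerance formula is the one quoted earlier in this thread.

```matlab
% Hypothetical rank-1 system (second column is twice the first)
A = [1 2; 2 4; 3 6];
b = [1; 2; 3];

[U, S, V] = svd(A, 'econ');
s = diag(S);
tol = max(size(A)) * eps(norm(s, inf));  % tolerance as quoted above
r = sum(s > tol);                        % numerical rank: singular values kept

% Invert only the kept singular values; the rest are treated as zero
x = V(:, 1:r) * ((U(:, 1:r)' * b) ./ s(1:r));
disp(x');                                % minimum-norm least-squares solution
```

Dropping the near-zero singular value is what keeps the solution from being blown up by a 1/s term of order 10^14 or larger; for this example the result is the minimum-norm solution [0.2; 0.4].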
Try this SVD version called block SVD - you just set the iterations equal to the accuracy you want - usually 1 is enough. If you want all the factors (this has a default # selected for factor reduction) then edit the line k= to the size(matrix) if I recall my MATLAB correctly
A= randn(100,5000);
A=corr(A);
% A is your correlation matrix
tic
k = 1000; % number of factors to extract
bsize = k +50;
block = randn(size(A,2),bsize);
iter = 2; % could set via tolerance
[block,R] = qr(A*block,0);
for i=1:iter
[block,R] = qr(A*(A'*block),0);
end
M = block'*A;
% Economy size dense SVD.
[U,S] = svd(M,0);
U = block*U(:,1:k);
S = S(1:k,1:k);
% Note SVD of a symmetric matrix is:
% A = U*S*U' since V=U in this case, S=eigenvalues, U=eigenvectors
V=real(U*sqrt(S)); %scaling matrix for simulation
toc
% reduced randomized matrix for simulation
sims = 2000;
randnums = randn(k,sims);
corrrandnums = V*randnums;
est_corr_matrix = corr(corrrandnums');
total_corrmatrix_difference =sum(sum(est_corr_matrix-A))

Simple Harmonic Motion - Verlet - External force - Matlab

I ran through the algebra which I had previously done for the Verlet method without the force - this led to the same code as you see below, but the "+(2*F/D)" term was missing when I ignored the external force. The algorithm worked accurately, as expected, however for the following parameters:
m = 7 ; k = 8 ; b = 0.1 ;
params = [m,k,b];
(and step size h = 0.001)
a force far above something like 0.00001 is much too big. I suspect I've missed a trick with the algebra.
My question is whether someone can spot the flaw in my addition of a force term in my Verlet method
% verlet.m
% uses the verlet step algorithm to integrate the simple harmonic
% oscillator.
% stepsize h, for a second-order ODE
function vout = verlet(vinverletx,h,params,F)
% vin is the particle vector (xn,yn)
x0 = vinverletx(1);
x1 = vinverletx(2);
% find the verlet coefficients
D = (2*params(1))+(params(3)*h);
A = (2/D)*((2*params(1))-(params(2)*h^2));
B=(1/D)*((params(3)*h)-(2*params(1)));
x2 = (A*x1)+(B*x0)+(2*F/D);
vout = x2;
% vout is the particle vector (xn+1,yn+1)
end
As written in the answer to the previous question, the moment friction enters the equation, the system is no longer conservative and the name "Verlet" no longer applies. It is still a valid discretization of
m*x''+b*x'+k*x = F
(with some slight error with large consequences).
The discretization employs the central difference quotients of first and second order
x'[k] = (x[k+1]-x[k-1])/(2*h) + O(h^2)
x''[k] = (x[k+1]-2*x[k]+x[k-1])/(h^2) + O(h^2)
resulting in
(2*m+b*h)*x[k+1] - 2*(2*m-h^2*k) * x[k] + (2*m-b*h)*x[k-1] = 2*h^2 * F[k] + O(h^4)
Error: As you can see, you are missing a factor h^2 in the term with F.
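With that factor restored, the update step can be sketched as below. The coefficients D, A and B are the ones from the question's verlet.m; the constant force value F = 1 is an assumption for illustration. A quick sanity check is that the static equilibrium x = F/k must be a fixed point of the step.

```matlab
% Parameters from the question; F is a hypothetical constant force
m = 7; k = 8; b = 0.1;
h = 0.001;
F = 1;

D = 2*m + b*h;
A = (2/D) * (2*m - k*h^2);
B = (1/D) * (b*h - 2*m);

% One step of the scheme, with the force term scaled by h^2
step = @(x0, x1) A*x1 + B*x0 + (2*h^2*F)/D;

x_eq = F/k;                 % static equilibrium of m*x'' + b*x' + k*x = F
x_next = step(x_eq, x_eq);  % should stay at x_eq
fprintf("x_eq = %.6f, x_next = %.6f\n", x_eq, x_next);
```

With the original term 2*F/D the same check fails: the force is then effectively too large by a factor of 1/h^2, which matches the observation that only tiny forces gave sensible results.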