ode45 for Langevin equation - matlab

I have a question about the use of Matlab to compute solution of stochastic differentials equations. The equations are the 2.2a,b, page 3, in this paper (PDF).
My professor suggested using ode45 with a small time step, but the results do not match with those in the article. In particular the time series and the pdf. I also have a doubt about the definition of the white noise in the function.
Here the code for the integration function:
function dVdt = R_Lang( t,V )
global sigma lambda alpha
W1=sigma*randn(1,1);
W2=sigma*randn(1,1);
dVdt=[alpha*V(1)+lambda*V(1)^3+1/V(1)*0.5*sigma^2+W1;
sigma/V(1)*W2];
end
Main script:
clear variables
close all
global sigma lambda alpha
sigma=sqrt(2*0.0028);
alpha=3.81;
lambda=-5604;
tspan=[0,10];
options = odeset('RelTol',1E-6,'AbsTol',1E-6,'MaxStep',0.05);
A0=random('norm',0,0.5,[2,1]);
[t,L]=ode45(#(t,L) R_Lang(t,L),tspan,A0,options);
If you have any suggestions I'd be grateful.
Here the new code to confront my EM method and 'sde_euler'.
lambda = -5604;
sigma=sqrt(2*0.0028) ;
Rzero = 0.03; % problem parameters
phizero=-1;
dt=1e-5;
T = 0:dt:10;
N=length(T);
Xi1 = sigma*randn(1,N); % Gaussian Noise with variance=sigma^2
Xi2 = sigma*randn(1,N);
alpha=3.81;
Rem = zeros(1,N); % preallocate for efficiency
Rtemp = Rzero;
phiem = zeros(1,N); % preallocate for efficiency
phitemp = phizero;
for j = 1:N
Rtemp = Rtemp + dt*(alpha*Rtemp+lambda*Rtemp^3+sigma^2/(2*Rtemp)) + sigma*Xi1(j);
phitemp=phitemp+sigma/Rtemp*Xi2(j);
phiem(j)=phitemp;
Rem(j) = Rtemp;
end
f = #(t,V)[alpha*V(1)+lambda*V(1)^3+0.5*sigma^2/V(1)/2;
0]; % Drift function
g = #(t,V)[sigma;
sigma/V(1)]; % Diffusion function
A0 = [0.03;0]; % 2-by-1 initial condition
opts = sdeset('RandSeed',1,'SDEType','Ito'); % Set random seed, use Ito formulation
L = sde_euler(f,g,T,A0,opts);
plot(T,Rem,'r')
hold on
plot(T,L(:,1),'b')
Thanks again for the help !

ODEs and SDEs are very different and one should not use tools for ODEs, like ode45, to try to solve SDEs. Looking at the paper you linked to, they used a basic Euler-Maruyama scheme to integrate the system. This a very simple solver to implement yourself.
Before proceeding, you (and your professor!) should take some time to read up on SDEs and how to solve them numerically. I recommend this paper, which includes many Matlab examples:
Desmond J. Higham, 2001, An Algorithmic Introduction to Numerical Simulation of Stochastic Differential Equations, SIAM Rev. (Educ. Sect.), 43 525–46. http://dx.doi.org/10.1137/S0036144500378302
The URL to the Matlab files in the paper won't work; use this one. Note, that as this a 15-year old paper, some of the code related to random number generation is out of date (use rng(1) instead of randn('state',1) to seed the generator).
If you are familiar with ode45 you might look at my SDETools Matlab toolbox on GitHub. It was designed to be fast and has an interface that works very similarly to Matlab's ODE suite. Here is how you might code up your example using the Euler-Maruyma solver:
sigma = 1e-1*sqrt(2*0.0028);
lambda = -5604;
alpha = 3.81;
f = #(t,V)[alpha*V(1)+lambda*V(1)^3+0.5*sigma^2/V(1);
0]; % Drift function
g = #(t,V)[sigma;
sigma/V(1)]; % Diffusion function
dt = 1e-3; % Time step
t = 0:dt:10; % Time vector
A0 = [0.03;-2]; % 2-by-1 initial condition
opts = sdeset('RandSeed',1,'SDEType','Ito'); % Set random seed, use Ito formulation
L = sde_euler(f,g,t,A0,opts); % Integrate
figure;
subplot(211);
plot(t,L(:,2));
ylabel('\phi');
subplot(212);
plot(t,L(:,1));
ylabel('r');
xlabel('t');
I had to reduce the size of sigma or the noise was so large that it could cause the radius variable to go negative. I'm not sure if the paper discusses how they handle this singularity. You can try the 'NonNegative' option within sdeset to try to handle this or you may need to construct your own solver. I also couldn't find what integration time step the paper used. You should also consider contacting the authors of the paper directly.
UPDATE
Here's an Euler-Maruyama implementation that matches the sde_euler code above:
sigma = 1e-1*sqrt(2*0.0028);
lambda = -5604;
alpha = 3.81;
f = #(t,V)[alpha*V(1)+lambda*V(1)^3+0.5*sigma^2/V(1);
0]; % Drift function
g = #(t,V)[sigma;
sigma/V(1)]; % Diffusion function
dt = 1e-3; % Time step
t = 0:dt:10; % Time vector
A0 = [0.03;-2]; % 2-by-1 initial condition
% Create and initialize state vector (L here is transposed relative to sde_euler output)
lt = length(t);
n = length(A0);
L = zeros(n,lt);
L(:,1) = A0;
% Set seed and pre-calculate Wiener increments with order matching sde_euler
rng(1);
r = sqrt(dt)*randn(lt-1,n).';
% General Euler-Maruyama integration loop
for i = 1:lt-1
L(:,i+1) = L(:,i)+f(t(i),L(:,i))*dt+r(:,i).*g(t(i),L(:,i));
end
figure;
subplot(211);
plot(t,L(2,:));
ylabel('\phi');
subplot(212);
plot(t,L(1,:));
ylabel('r');
xlabel('t');

Related

How to miss out matrix elements in pcolor plot?

I'm solving a set of nonlinear simultaneous equations using Matlab's fsolve to find unknown parameters x1 and x2. The simultaneous equations have two independent parameters a and b, as defined in the root2d function:
function F = root2d(x,a,b)
F(1) = exp(-exp(-(x(1)+x(2)))) - x(2)*(a+x(1)^2);
F(2) = x(1)*cos(x(2)) + x(2)*sin(x(1)) - b;
end
I use the following code to solve the simultaneous equations, and plot the results as a 2d figure using pcolor.
alist = linspace(0.6,1.2,10);
blist = linspace(0.4,0.8,5);
% results
x1list = zeros(length(blist),length(alist));
x2list = zeros(length(blist),length(alist));
% solver options
options = optimoptions('fsolve','Display','None');
for ii = 1:length(blist)
b = blist(ii);
for jj = 1:length(alist)
a = alist(jj);
x0 = [0 0]; % init guess
[xopt,yopt,exitflag] = fsolve(#(x0)root2d(x0,a,b),x0,options);
% optimised values
x1list(ii,jj) = xopt(1);
x2list(ii,jj) = xopt(2);
success(ii,jj) = exitflag; % did solver succeed?
end
end
% plotting
figure
s = pcolor(alist(success>0),blist(success>0),x1list(success>0));
xlabel('a')
ylabel('b')
title('my data x_1')
figure
s = pcolor(alist(success>0),blist(success>0),x2list(success>0));
xlabel('a')
ylabel('b')
title('my data x_2')
However I only want to plot the x1 and x2 where the solver has successfully converged to a solution. This is where the success matrix element (or exitflag) has a value greater than 0. Usually you just write x1list(success>0) when using the plot function and Matlab omits any solutions where (success<=0), but pcolor doesn't have that functionality.
Is there a way around this? For example, displaying all (success<=0) solutions as a black area.
Yes there is!
The easiest way is to make them NaN, so they are simply not drawn.
just do
x1list(~success)=NaN;
pcolor(alist,blist,x1list)

Obtaining steady state solution for spring mass dashpot system

I'm trying to solve the following problem using MATLAB but I faced multiple issues. The plot I obtained doesn't seem right even though I tried to obtain the steady-state solution, I got a plot that doesn't look steady.
The problem I'm trying to solve
The incorrect plot I got.
and here is the code
% system parameters
m=1; k=1; c=.1; wn=sqrt(k/m); z=c/2/sqrt(m*k); wd=wn*sqrt(1-z^2);
% initial conditions
x0=0; v0=0;
%% time
dt=.001; tMax=8*pi; t=0:(tMax-0)/999:tMax;
% input
A=1
omega=(2*pi)/10
F=A/2-(4*A/pi^2)*cos(omega*t); Fw=fft(F);
F=k*A*cos(omega*t); Fw=fft(F);
% normalize
y = F/m;
% compute coefficients proportional to the Fourier series coefficients
Yw = fft(y);
% setup the equations to solve the particular solution of the differential equation
% by the method of undetermined coefficients
N=1000
T=10
k = [0:N/2];
w = 2*pi*k/T;
A = wn*wn-w.*w;
B = 2*z*wn*w;
% solve the equation [A B;-B A][real(Xw); imag(Xw)] = [real(Yw); imag(Yw)] equation
% Note that solution can be obtained by writing [A B;-B A] as a scaling + rotation
% of a 2D vector, which we solve using complex number algebra
C = sqrt(A.*A+B.*B);
theta = acos(A./C);
Ywp = exp(j*theta)./C.*Yw([1:N/2+1]);
% build a hermitian-symmetric spectrum
Xw = [Ywp conj(fliplr(Ywp(2:end-1)))];
% bring back to time-domain (function synthesis from Fourier Series coefficients)
x = ifft(Xw);
figure()
plot(t,x)
Your forcing function doesn't look like the triangle wave in the problem. I edited the %% time section of your code into the following and appeared to give a steady state response.
%% time
TP = 10; % forcing time period (10 s)
dt=.001;
tMax= 3*TP; % needs to be multiple of the time period
t=0:(tMax-0)/999:tMax;
% input
A=1; % Forcing amplitude
omega=(2*pi)/TP;
% forcing is a triangle wave
% generate a triangle wave with min/max values of 0/1.
F = 0*t;
for i = 1:length(t)
if mod(t(i), TP) <= TP/2
F(i) = mod(t(i), TP)/(TP/2);
else
F(i) = 2 - mod(t(i), TP)/(TP/2);
end
end
F = F*A; % scale triangle wave by amplitude
% you can also use MATLAB's sawtooth() function if you have the signal
% processing toolbox

Obtaining theoretical semivariogram from parametrized covariance function (STK)

I have been using the STK toolbox for a few days, for kriging of environmental parameter fields, i.e. in a geostatistical context.
I find the toolbox very well implemented and useful (big thanks to the authors!), and the kriging predictions I am getting through STK actually seem fine; however, I am finding myself unable to visualize a semivariogram model based on the STK output (i.e. estimated parameters for gaussian process / covariance functions).
I am attaching an example figure, showing the empirical semivariogram for a simple 1D test case and a Gaussian semivariogram model (as typically used in geostatistics, see also figure) fitted directly to that data. The figure further shows a semivariogram model based on STK output, i.e. using previously estimated model parameters (model.param from stk_param_estim) to get covariance K on a target grid of lag distances and then converting K to semivariance (according to the well-known relation semivar = K0-K where K0 is the covariance at zero lag). I am attaching a simple script to reproduce the figure and detailing the attempted conversion.
As you can see in the figure, this doesn’t do the trick. I have tried several other simple examples and STK datasets, but models obtained through STK vs direct fitting never agree, and in fact usually look much more different than in the example (i.e. the range often seems very different, in addition to the sill/sigma2; uncomment line 12 in the script to see another example). I have also attempted to input the converted STK parameters into the geostatistical model (also in the script), however, the output is identical to the result based on converting K above.
I’d be very thankful for your help!
Figure illustrating the lack of agreement between semivariograms based on direct fit vs conversion of STK output
% Code to reproduce the figure illustrating my problem of getting
% variograms from STK output. The only external functions needed are those
% included with STK.
% TEST DATA - This is simply a monotonic part of the normal pdf
nugget = 0;
X = [0:20]'; % coordinates
% X = [0:50]'; % uncomment this line to see how strongly the models can deviate for different test cases
V = normpdf(X./10+nugget,0,1); % observed values
covmodel = 'stk_gausscov_iso'; % covar model, part of STK toolbox
variomodel = 'stk_gausscov_iso_vario'; % variogram model, nested function
% GET STRUCTURE FOR THE SELECTED KRIGING (GAUSSIAN PROCESS) MODEL
nDim = size(X,2);
model = stk_model (covmodel, nDim);
model.lognoisevariance = NaN; % This makes STK fit nugget
% ESTIMATE THE PARAMETERS OF THE COVARIANCE FUNCTION
[param0, model.lognoisevariance] = stk_param_init (model, X, V); % Compute an initial guess for the parameters of the covariance function (param0)
model.param = stk_param_estim (model, X, V, param0); % Now model the covariance function
% EMPIRICAL SEMIVARIOGRAM (raw, binning removed for simplicity)
D = pdist(X)';
semivar_emp = 0.5.*(pdist(V)').^2;
% THEORETICAL SEMIVARIOGRAM FROM STK
% Target grid of lag distances
DT = [0:1:100]';
DT_zero = zeros(size(DT));
% Get covariance matrix on target grid using STK estimated pars
pairwise = true;
K = feval(model.covariance_type, model.param, DT, DT_zero, -1, pairwise);
% convert covariance to semivariance, i.e. G = C(0) - C(h)
sill = exp(model.param(1));
nugget = exp(model.lognoisevariance);
semivar_stk = sill - K + nugget; % --> this variable is then plotted
% TEST: FIT A GAUSSIAN VARIOGRAM MODEL DIRECTLY TO THE EMPIRICAL SEMIVARIOGRAM
f = #(par)mseval(par,D,semivar_emp,variomodel);
par0 = [10 10 0.1]; % initial guess for pars
[par,mse] = fminsearch(f, par0); % optimize
semivar_directfit = feval(variomodel, par, DT); % evaluate
% TEST 2: USE PARS FROM STK AS INPUT TO GAUSSIAN VARIOGRAM MODEL
par(1) = exp(model.param(1)); % sill, PARAM(1) = log (SIGMA ^ 2), where SIGMA is the standard deviation,
par(2) = sqrt(3)./exp(model.param(2)); % range, PARAM(2) = - log (RHO), where RHO is the range parameter. --- > RHO = exp(-PARAM(2))
par(3) = exp(model.lognoisevariance); % nugget
semivar_stkparswithvariomodel = feval(variomodel, par, DT);
% PLOT SEMIVARIOGRAM
figure(); hold on;
plot(D(:), semivar_emp(:),'.k'); % Observed variogram, raw
plot(DT, semivar_stk,'-b','LineWidth',2); % Theoretical variogram, on a grid
plot(DT, semivar_directfit,'--r','LineWidth',2); % Test direct fit variogram
plot(DT,semivar_stkparswithvariomodel,'--g','LineWidth',2); % Test direct fit variogram using pars from stk
legend('raw empirical semivariance (no binned data here for simplicity) ',...
'Gaussian cov model from STK, i.e. exp(Sigma2) - K + exp(lognoisevar)',...
'Gaussian semivariogram model (fitted directly to semivariance)',...
'Gaussian semivariogram model (using transformed params from STK)');
xlabel('Lag distance','Fontweight','b');
ylabel('Semivariance','Fontweight','b');
% NESTED FUNCTIONS
% Objective function for direct fit
function [mse] = mseval(par,D,Graw,variomodel)
Gmod = feval(variomodel, par, D);
mse = mean((Gmod-Graw).^2);
end
% Gaussian semivariogram model.
function [semivar] = stk_gausscov_iso_vario(par, D) %#ok<DEFNU>
% D : lag distance, c : sill, a : range, n : nugget
c = par(1); % sill
a = par(2); % range
if length(par) > 2, n = par(3); % nugget optional
else, n = 0; end
semivar = n + c .* (1 - exp( -3.*D.^2./a.^2 )); % Model
end
There is nothing wrong with the way you compute the semivariogram.
To understand the figure that you obtain, consider that:
The parameters of the model are estimated in STK using the (restricted) maximum likelihood method, not by least-squares fitting on the semi-variogram.
For very smooth stationary random fields observed over short intervals, you should not expect that the theoretical semivariogram will agree with the empirical semivariogram, with or without binning. The reason for this is that the observations, and thus the squared differences, are very correlated in this case.
To convince yourself of the second point, you can run the following script repeatedly:
% a smooth GP
model = stk_model (#stk_gausscov_iso, 1);
model.param = log ([1.0, 0.2]); % unit variance
x_max = 20; x_obs = x_max * rand (50, 1);
% Simulate data
z_obs = stk_generate_samplepaths (model, x_obs);
% Empirical semivariogram (raw, no binning)
h = (pdist (double (x_obs)))';
semivar_emp = 0.5 * (pdist (z_obs)') .^ 2;
% Model-based semivariogram
x1 = (0:0.01:x_max)';
x0 = zeros (size (x1));
K = feval (model.covariance_type, model.param, x0, x1, -1, true);
semivar_th = 1 - K;
% Figure
figure; subplot (1, 2, 1); plot (x_obs, z_obs, '.');
subplot (1, 2, 2); plot (h(:), semivar_emp(:),'.k'); hold on;
plot (x1, semivar_th,'-b','LineWidth',2);
legend ('empirical', 'model'); xlabel ('lag'); ylabel ('semivar');
Further questions on parameter estimation for Gaussian process models should probably be asked on Cross-Validated rather than Stack Overflow.

Extract the values for some parameters from stochastic differential equation solution

I am solving stochastic differential equation in matlab.
For example:
consider the stochastic differential equation
dx=k A(x,t)dt+ B(x,t)dW(t)
where k is constants, A and B are functions, and dW(t) is Wiener process.
I plot the solution for all t in [0,20]. We know that dW(t) is randomly generated. My question is: I want to know the value of A(x,t), B(x,t), dW(t) for a particular value of t and for particular sub-interval, say [3,6]. What command in Matlab I can use?
Here is the code I used based on a paper by D.Higham:
clear all
close all
t0 = 0; % start time of simulation
tend = 20; % end time
m=2^9; %number of steps in each Brownian path
deltat= tend/m; % time increment for each Brownian path
D=0.1; %diffsuion
R=4;
dt = R*deltat;
dW=sqrt( deltat)*randn(2,m);
theta0=pi*rand(1);
phi0=2*pi*rand(1);
P_initial=[ theta0; phi0];
L = m/ R;
pem=zeros(2,L);
EM_rescale=zeros(2,L);
ptemp=P_initial;
for j=1:L
Winc = sum(dW(:,[ R*(j-1)+1: R*j]),2);
theta=ptemp(1);% updating theta
phi=ptemp(2); % updating phi
%psi=ptemp(3); % updating psi
A=[ D.*cot(theta);...
0];% updating the drift
B=[sqrt(D) 0 ;...
0 sqrt(D)./sin(theta) ]; %% updating the diffusion function
ptemp=ptemp+ dt*A+B*Winc;
pem(1,j)=ptemp(1);%store theta
pem(2,j)=ptemp(2);%store phi
EM_rescale(1,j)=mod(pem(1,j),pi); % re-scale theta
EM_rescale(2,j)=mod(pem(2,j),2*pi); % re-scale phi
end
plot([0:dt:tend],[P_initial,EM_rescale],'--*')
Suppose I want to know all parameters (including random: Brownian) at each specific time point or for any time interval. How to do that?
I'm doing my best to understand your question here, but it's still a bit unclear to me.
Change the loop to:
for ii=1:L
Winc = sum(dW(:,[ R*(ii-1)+1: R*ii]),2);
theta=ptemp(1);% updating theta
phi=ptemp(2); % updating phi
A{ii}=[ D.*cot(theta);...
0];% updating the drift
B{ii}=[sqrt(D) 0 ;...
0 sqrt(D)./sin(theta) ]; %% updating the diffusion function
ptemp = ptemp + dt*A{ii}+B{ii}*Winc;
pem(:,ii) = ptemp;
EM_rescale(1,ii) = mod(pem(1,ii),pi); % re-scale theta
EM_rescale(2,ii) = mod(pem(2,ii),2*pi); % re-scale phi
end
Now, you can get the values of A and B this way:
t = 3;
t_num = round(m/tend*t);
A{t_num}
B{t_num}
ans =
0.0690031455719538
0
ans =
0.316227766016838 0
0 0.38420611784333

Finding the best monotonic curve fit

Edit: Some time after I asked this question, an R package called MonoPoly (available here) came out that does exactly what I want. I highly recommend it.
I have a set of points I want to fit a curve to. The curve must be monotonic (never decreasing in value) i.e. the curve can only go upward or stay flat.
I originally had been polyfitting my results and this had been working great until I found a particular dataset. The polyfit for data in this dataset was non-monotonic.
I did some research and found a possible solution in this post:
Use lsqlin. Constrain the first derivative to be non-negative at both
ends of the domain of interest.
I'm coming from a programming rather than math background so this is a little beyond me. I don't know how to constrain the first derivative to be non-negative as he said. Also, I think in my case I need a curve so I should use lsqcurvefit but I don't know how to constrain it to produce monotonic curves.
Further research turned up this post recommending lsqcurvefit but I can't figure out how to use the important part:
Try this non-linear function F(x) also. You use it together with
lsqcurvefit but it require a start guess on the parameters. But it is
a nice analytic expression to give as a semi-empirical formula in a
paper or a report.
%Monotone function F(x), with c0,c1,c2,c3 varitional constants F(x)=
c3 + exp(c0 - c1^2/(4*c2))sqrt(pi)...
Erfi((c1 + 2*c2*x)/(2*sqrt(c2))))/(2*sqrt(c2))
%Erfi(x)=erf(i*x) (look mathematica) but the function %looks much like
x^3 %derivative f(x), probability density f(x)>=0
f(x)=dF/dx=exp(c0+c1*x+c2*x.^2)
I must have a monotonic curve but I'm not sure how to do it, even with all of this information. Would a random number be enough for a "start guess". Is lsqcurvefit best? How can I use it to produce a best fitting monotonic curve?
Thanks
Here is a simple solution using lsqlin. The derivative constrain is enforced in each data point, this could be easily modified if needed.
Two coefficient matrices are needed, one (C) for least square error calculation and one (A) for derivatives in the data points.
% Following lsqlin's notations
%--------------------------------------------------------------------------
% PRE-PROCESSING
%--------------------------------------------------------------------------
% for reproducibility
rng(125)
degree = 3;
n_data = 10;
% dummy data
x = rand(n_data,1);
d = rand(n_data,1) + linspace(0,1,n_data).';
% limit on derivative - in each data point
b = zeros(n_data,1);
% coefficient matrix
C = nan(n_data, degree+1);
% derivative coefficient matrix
A = nan(n_data, degree);
% loop over polynomial terms
for ii = 1:degree+1
C(:,ii) = x.^(ii-1);
A(:,ii) = (ii-1)*x.^(ii-2);
end
%--------------------------------------------------------------------------
% FIT - LSQ
%--------------------------------------------------------------------------
% Unconstrained
% p1 = pinv(C)*y
p1 = fliplr((C\d).')
p2 = polyfit(x,d,degree)
% Constrained
p3 = fliplr(lsqlin(C,d,-A,b).')
%--------------------------------------------------------------------------
% PLOT
%--------------------------------------------------------------------------
xx = linspace(0,1,100);
plot(x, d, 'x')
hold on
plot(xx, polyval(p1, xx))
plot(xx, polyval(p2, xx),'--')
plot(xx, polyval(p3, xx))
legend('data', 'lsq-pseudo-inv', 'lsq-polyfit', 'lsq-constrained', 'Location', 'southoutside')
xlabel('X')
ylabel('Y')
For the specified input the fitted curves:
Actually this code is more general than what you requested, since the degree of polynomial can be changed as well.
EDIT: enforce derivative constrain in additional points
The issue pointed out in the comments is due to that the derivative checks are enforced only in the data points. Between those no checks are performed. Below is a solution to alleviate this problem. The idea: convert the problem to an unconstrained optimization by using a penalty term.
Note that it is using a term pen to penalize the violation of the derivative check, thus the result is not a true least square error solution. Additionally, the result is dependent on the penalty function.
function lsqfit_constr
% Following lsqlin's notations
%--------------------------------------------------------------------------
% PRE-PROCESSING
%--------------------------------------------------------------------------
% for reproducibility
rng(125)
degree = 3;
% data from comment
x = [0.2096 -3.5761 -0.6252 -3.7951 -3.3525 -3.7001 -3.7086 -3.5907].';
d = [95.7750 94.9917 90.8417 62.6917 95.4250 89.2417 89.4333 82.0250].';
n_data = length(d);
% number of equally spaced points to enforce the derivative
n_deriv = 20;
xd = linspace(min(x), max(x), n_deriv);
% limit on derivative - in each data point
b = zeros(n_deriv,1);
% coefficient matrix
C = nan(n_data, degree+1);
% derivative coefficient matrix
A = nan(n_deriv, degree);
% loop over polynom terms
for ii = 1:degree+1
C(:,ii) = x.^(ii-1);
A(:,ii) = (ii-1)*xd.^(ii-2);
end
%--------------------------------------------------------------------------
% FIT - LSQ
%--------------------------------------------------------------------------
% Unconstrained
% p1 = pinv(C)*y
p1 = (C\d);
lsqe = sum((C*p1 - d).^2);
p2 = polyfit(x,d,degree);
% Constrained
[p3, fval] = fminunc(#error_fun, p1);
% correct format for polyval
p1 = fliplr(p1.')
p2
p3 = fliplr(p3.')
fval
%--------------------------------------------------------------------------
% PLOT
%--------------------------------------------------------------------------
xx = linspace(-4,1,100);
plot(x, d, 'x')
hold on
plot(xx, polyval(p1, xx))
plot(xx, polyval(p2, xx),'--')
plot(xx, polyval(p3, xx))
% legend('data', 'lsq-pseudo-inv', 'lsq-polyfit', 'lsq-constrained', 'Location', 'southoutside')
xlabel('X')
ylabel('Y')
%--------------------------------------------------------------------------
% NESTED FUNCTION
%--------------------------------------------------------------------------
function e = error_fun(p)
% squared error
sqe = sum((C*p - d).^2);
der = A*p;
% penalty term - it is crucial to fine tune it
pen = -sum(der(der<0))*10*lsqe;
e = sqe + pen;
end
end
Gradient free methods might be used to solve the problem by exactly enforcing the derivative constrain, for example:
[p3, fval] = fminsearch(#error_fun, p_ini);
%--------------------------------------------------------------------------
% NESTED FUNCTION
%--------------------------------------------------------------------------
function e = error_fun(p)
% squared error
sqe = sum((C*p - d).^2);
der = A*p;
if any(der<0)
pen = Inf;
else
pen = 0;
end
e = sqe + pen;
end
fmincon with non-linear constraint might be a better choice.
I let you to work out the details and to tune the algorithms. I hope that it is sufficient.