I have a function with three parameters and some data that I want to fit. How can I do this optimally? I am not even sure of the range of the three parameters in the equation.
The function has free parameters alpha, beta and gamma and is given by
y = (1 - alpha + alpha./sqrt(1 + 2*beta*(gamma*x).^2./alpha)).^(-1) - 1;
I have arrays of x and y data points (around 50 points in each set) to which I want to find the best fit (defined as minimizing least squares) using any alpha, beta and gamma.
The solutions online recommend the curve fitting toolbox, which I do not have on my machine and am unable to install. I only have the barebones MATLAB 2015b version.
You need an optimization algorithm for smooth R^n -> R functions. Since you only have access to barebones MATLAB, a good idea is to take an algorithm from the File Exchange. For illustration I picked LMFnlsq, which should suffice since your problem is small, although it is more general and a little bit of overkill here.
Download LMFnlsq and add it to your MATLAB path.
Example
For convenience make a function called regr_fun:
function y = regr_fun(par, x)
alpha = par(1);
beta = par(2);
gamma = par(3);
y = (1 - alpha + alpha./sqrt(1 + 2*beta*(gamma*x).^2./alpha)).^(-1) - 1;
end
Curve fitting (in the same folder as regr_fun):
%---------------------------------------------------------------------
% DUMMY DATA
%---------------------------------------------------------------------
% Generate data from known model contaminated with random noise
rng(333) % for reproducibility
alpha = 2;
beta = 0.1;
gamma = 0.1;
par = [alpha, beta, gamma];
xx = 1:50;
y_true = regr_fun(par, xx);
yy = y_true + randn(1,50); % randn instead of normrnd: no Statistics Toolbox needed
%---------------------------------------------------------------------
% FIT MODEL
%---------------------------------------------------------------------
% initial point of the solver
p0 = [1,1,1];
obj_fun = @(p) sum((regr_fun(p, xx) - yy).^2);
% optimization
p_fit = LMFnlsq(obj_fun, p0);
y_fit = regr_fun(p_fit, xx);
%---------------------------------------------------------------------
% PLOT
%---------------------------------------------------------------------
plot(xx, yy, 'o')
hold on
plot(xx, y_true)
plot(xx, y_fit, '--')
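If you cannot use the File Exchange at all, a base-MATLAB fallback (a sketch, not part of the original example) is fminsearch, which minimizes the same scalar objective with a derivative-free simplex method and may simply need more iterations:
% Derivative-free alternative using only base MATLAB
% (reuses obj_fun and p0 from above)
p_fit_fs = fminsearch(obj_fun, p0);
y_fit_fs = regr_fun(p_fit_fs, xx);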
Note
Although matlab.codetools.requiredFilesAndProducts lists the Symbolic Math Toolbox as well, it is not needed for this problem and the function should run without it.
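For reference, the dependency check can be run like this (a sketch; 'LMFnlsq.m' stands for whatever file name you downloaded):
% List the files and toolboxes the function depends on
[fList, pList] = matlab.codetools.requiredFilesAndProducts('LMFnlsq.m');
{pList.Name} % names of required toolboxes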
I have been using the STK toolbox for a few days, for kriging of environmental parameter fields, i.e. in a geostatistical context.
I find the toolbox very well implemented and useful (big thanks to the authors!), and the kriging predictions I am getting through STK actually seem fine; however, I am finding myself unable to visualize a semivariogram model based on the STK output (i.e. estimated parameters for gaussian process / covariance functions).
I am attaching an example figure showing the empirical semivariogram for a simple 1D test case, together with a Gaussian semivariogram model (as typically used in geostatistics; see also the figure) fitted directly to that data. The figure also shows a semivariogram model based on the STK output: the previously estimated model parameters (model.param from stk_param_estim) are used to compute the covariance K on a target grid of lag distances, and K is then converted to semivariance according to the well-known relation semivar = K0 - K, where K0 is the covariance at zero lag. I am attaching a simple script that reproduces the figure and details the attempted conversion.
As you can see in the figure, this does not do the trick. I have tried several other simple examples and STK datasets, but the models obtained through STK and by direct fitting never agree, and usually look even more different than in this example (the range often seems very different, in addition to the sill/sigma2; uncomment line 12 in the script to see another example). I have also attempted to feed the converted STK parameters into the geostatistical model (also in the script); however, the output is identical to the result based on converting K above.
I’d be very thankful for your help!
Figure illustrating the lack of agreement between semivariograms based on direct fit vs conversion of STK output
% Code to reproduce the figure illustrating my problem of getting
% variograms from STK output. The only external functions needed are those
% included with STK.
% TEST DATA - This is simply a monotonic part of the normal pdf
nugget = 0;
X = [0:20]'; % coordinates
% X = [0:50]'; % uncomment this line to see how strongly the models can deviate for different test cases
V = normpdf(X./10+nugget,0,1); % observed values
covmodel = 'stk_gausscov_iso'; % covar model, part of STK toolbox
variomodel = 'stk_gausscov_iso_vario'; % variogram model, nested function
% GET STRUCTURE FOR THE SELECTED KRIGING (GAUSSIAN PROCESS) MODEL
nDim = size(X,2);
model = stk_model (covmodel, nDim);
model.lognoisevariance = NaN; % This makes STK fit nugget
% ESTIMATE THE PARAMETERS OF THE COVARIANCE FUNCTION
[param0, model.lognoisevariance] = stk_param_init (model, X, V); % Compute an initial guess for the parameters of the covariance function (param0)
model.param = stk_param_estim (model, X, V, param0); % Now model the covariance function
% EMPIRICAL SEMIVARIOGRAM (raw, binning removed for simplicity)
D = pdist(X)';
semivar_emp = 0.5.*(pdist(V)').^2;
% THEORETICAL SEMIVARIOGRAM FROM STK
% Target grid of lag distances
DT = [0:1:100]';
DT_zero = zeros(size(DT));
% Get covariance matrix on target grid using STK estimated pars
pairwise = true;
K = feval(model.covariance_type, model.param, DT, DT_zero, -1, pairwise);
% convert covariance to semivariance, i.e. G = C(0) - C(h)
sill = exp(model.param(1));
nugget = exp(model.lognoisevariance);
semivar_stk = sill - K + nugget; % --> this variable is then plotted
% TEST: FIT A GAUSSIAN VARIOGRAM MODEL DIRECTLY TO THE EMPIRICAL SEMIVARIOGRAM
f = @(par) mseval(par,D,semivar_emp,variomodel);
par0 = [10 10 0.1]; % initial guess for pars
[par,mse] = fminsearch(f, par0); % optimize
semivar_directfit = feval(variomodel, par, DT); % evaluate
% TEST 2: USE PARS FROM STK AS INPUT TO GAUSSIAN VARIOGRAM MODEL
par(1) = exp(model.param(1)); % sill, PARAM(1) = log (SIGMA ^ 2), where SIGMA is the standard deviation,
par(2) = sqrt(3)./exp(model.param(2)); % range, PARAM(2) = - log (RHO), where RHO is the range parameter. --- > RHO = exp(-PARAM(2))
par(3) = exp(model.lognoisevariance); % nugget
semivar_stkparswithvariomodel = feval(variomodel, par, DT);
% PLOT SEMIVARIOGRAM
figure(); hold on;
plot(D(:), semivar_emp(:),'.k'); % Observed variogram, raw
plot(DT, semivar_stk,'-b','LineWidth',2); % Theoretical variogram, on a grid
plot(DT, semivar_directfit,'--r','LineWidth',2); % Test direct fit variogram
plot(DT,semivar_stkparswithvariomodel,'--g','LineWidth',2); % Test direct fit variogram using pars from stk
legend('raw empirical semivariance (no binned data here for simplicity) ',...
'Gaussian cov model from STK, i.e. exp(Sigma2) - K + exp(lognoisevar)',...
'Gaussian semivariogram model (fitted directly to semivariance)',...
'Gaussian semivariogram model (using transformed params from STK)');
xlabel('Lag distance','Fontweight','b');
ylabel('Semivariance','Fontweight','b');
% NESTED FUNCTIONS
% Objective function for direct fit
function [mse] = mseval(par,D,Graw,variomodel)
Gmod = feval(variomodel, par, D);
mse = mean((Gmod-Graw).^2);
end
% Gaussian semivariogram model.
function [semivar] = stk_gausscov_iso_vario(par, D) %#ok<DEFNU>
% D : lag distance, c : sill, a : range, n : nugget
c = par(1); % sill
a = par(2); % range
if length(par) > 2, n = par(3); % nugget optional
else, n = 0; end
semivar = n + c .* (1 - exp( -3.*D.^2./a.^2 )); % Model
end
There is nothing wrong with the way you compute the semivariogram.
To understand the figure that you obtain, consider that:
The parameters of the model are estimated in STK using the (restricted) maximum likelihood method, not by least-squares fitting on the semi-variogram.
For very smooth stationary random fields observed over short intervals, you should not expect the theoretical semivariogram to agree with the empirical semivariogram, with or without binning. The reason is that the observations, and hence the squared differences, are strongly correlated in this case.
To convince yourself of the second point, you can run the following script repeatedly:
% a smooth GP
model = stk_model (@stk_gausscov_iso, 1);
model.param = log ([1.0, 0.2]); % unit variance
x_max = 20; x_obs = x_max * rand (50, 1);
% Simulate data
z_obs = stk_generate_samplepaths (model, x_obs);
% Empirical semivariogram (raw, no binning)
h = (pdist (double (x_obs)))';
semivar_emp = 0.5 * (pdist (z_obs)') .^ 2;
% Model-based semivariogram
x1 = (0:0.01:x_max)';
x0 = zeros (size (x1));
K = feval (model.covariance_type, model.param, x0, x1, -1, true);
semivar_th = 1 - K;
% Figure
figure; subplot (1, 2, 1); plot (x_obs, z_obs, '.');
subplot (1, 2, 2); plot (h(:), semivar_emp(:),'.k'); hold on;
plot (x1, semivar_th,'-b','LineWidth',2);
legend ('empirical', 'model'); xlabel ('lag'); ylabel ('semivar');
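To reduce the scatter of the raw empirical semivariogram in such experiments, you can optionally bin it by lag distance before comparing; a minimal sketch (the bin count is arbitrary, and empty bins come out as zeros):
% Optional: bin the empirical semivariogram by lag distance
edges = linspace(0, max(h), 21);
[~, ~, bin] = histcounts(h, edges);
ok = bin > 0; % drop any out-of-range lags
h_binned = accumarray(bin(ok), h(ok), [], @mean);
semivar_binned = accumarray(bin(ok), semivar_emp(ok), [], @mean);
plot(h_binned, semivar_binned, 'or');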
Further questions on parameter estimation for Gaussian process models should probably be asked on Cross-Validated rather than Stack Overflow.
I have a question about using MATLAB to compute solutions of stochastic differential equations. The equations are 2.2a,b on page 3 of this paper (PDF).
My professor suggested using ode45 with a small time step, but the results do not match those in the article, in particular the time series and the pdf. I also have doubts about how the white noise is defined in the function.
Here is the code for the integration function:
function dVdt = R_Lang( t,V )
global sigma lambda alpha
W1=sigma*randn(1,1);
W2=sigma*randn(1,1);
dVdt=[alpha*V(1)+lambda*V(1)^3+1/V(1)*0.5*sigma^2+W1;
sigma/V(1)*W2];
end
Main script:
clear variables
close all
global sigma lambda alpha
sigma=sqrt(2*0.0028);
alpha=3.81;
lambda=-5604;
tspan=[0,10];
options = odeset('RelTol',1E-6,'AbsTol',1E-6,'MaxStep',0.05);
A0=random('norm',0,0.5,[2,1]);
[t,L]=ode45(@(t,L) R_Lang(t,L),tspan,A0,options);
If you have any suggestions I'd be grateful.
Here is the new code comparing my Euler-Maruyama (EM) method with 'sde_euler'.
lambda = -5604;
sigma=sqrt(2*0.0028) ;
Rzero = 0.03; % problem parameters
phizero=-1;
dt=1e-5;
T = 0:dt:10;
N=length(T);
Xi1 = sigma*randn(1,N); % Gaussian Noise with variance=sigma^2
Xi2 = sigma*randn(1,N);
alpha=3.81;
Rem = zeros(1,N); % preallocate for efficiency
Rtemp = Rzero;
phiem = zeros(1,N); % preallocate for efficiency
phitemp = phizero;
for j = 1:N
Rtemp = Rtemp + dt*(alpha*Rtemp+lambda*Rtemp^3+sigma^2/(2*Rtemp)) + sigma*Xi1(j);
phitemp=phitemp+sigma/Rtemp*Xi2(j);
phiem(j)=phitemp;
Rem(j) = Rtemp;
end
f = @(t,V)[alpha*V(1)+lambda*V(1)^3+0.5*sigma^2/V(1);
0]; % Drift function
g = @(t,V)[sigma;
sigma/V(1)]; % Diffusion function
A0 = [0.03;0]; % 2-by-1 initial condition
opts = sdeset('RandSeed',1,'SDEType','Ito'); % Set random seed, use Ito formulation
L = sde_euler(f,g,T,A0,opts);
plot(T,Rem,'r')
hold on
plot(T,L(:,1),'b')
Thanks again for the help!
ODEs and SDEs are very different, and one should not use tools for ODEs, like ode45, to try to solve SDEs. Looking at the paper you linked to, they used a basic Euler-Maruyama scheme to integrate the system. This is a very simple solver to implement yourself.
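For reference, Euler-Maruyama for dX = f(X) dt + g(X) dW advances the solution as X(n+1) = X(n) + f(X(n))*dt + g(X(n))*sqrt(dt)*xi(n), with xi(n) ~ N(0,1). The sqrt(dt) scaling of the noise term is exactly what an ode45-based approach cannot reproduce.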
Before proceeding, you (and your professor!) should take some time to read up on SDEs and how to solve them numerically. I recommend this paper, which includes many Matlab examples:
Desmond J. Higham, 2001, An Algorithmic Introduction to Numerical Simulation of Stochastic Differential Equations, SIAM Rev. (Educ. Sect.), 43, 525-546. http://dx.doi.org/10.1137/S0036144500378302
The URL to the Matlab files in the paper won't work; use this one. Note that, as this is a 15-year-old paper, some of the code related to random number generation is out of date (use rng(1) instead of randn('state',1) to seed the generator).
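In code, the update is simply:
% randn('state',1) % deprecated seeding syntax used in the paper's files
rng(1) % modern equivalent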
If you are familiar with ode45 you might look at my SDETools Matlab toolbox on GitHub. It was designed to be fast and has an interface that works very similarly to Matlab's ODE suite. Here is how you might code up your example using the Euler-Maruyama solver:
sigma = 1e-1*sqrt(2*0.0028);
lambda = -5604;
alpha = 3.81;
f = @(t,V)[alpha*V(1)+lambda*V(1)^3+0.5*sigma^2/V(1);
0]; % Drift function
g = @(t,V)[sigma;
sigma/V(1)]; % Diffusion function
dt = 1e-3; % Time step
t = 0:dt:10; % Time vector
A0 = [0.03;-2]; % 2-by-1 initial condition
opts = sdeset('RandSeed',1,'SDEType','Ito'); % Set random seed, use Ito formulation
L = sde_euler(f,g,t,A0,opts); % Integrate
figure;
subplot(211);
plot(t,L(:,2));
ylabel('\phi');
subplot(212);
plot(t,L(:,1));
ylabel('r');
xlabel('t');
I had to reduce the size of sigma because otherwise the noise was so large that it could cause the radius variable to go negative. I'm not sure if the paper discusses how they handle this singularity. You can try the 'NonNegative' option within sdeset to handle this, or you may need to construct your own solver. I also couldn't find what integration time step the paper used. You should also consider contacting the authors of the paper directly.
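A sketch of what that option might look like (the exact value format of 'NonNegative' is an assumption here; check the SDETools documentation):
% Assumed usage: mark the first state (the radius) as non-negative
opts = sdeset('RandSeed',1,'SDEType','Ito','NonNegative',1);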
UPDATE
Here's an Euler-Maruyama implementation that matches the sde_euler code above:
sigma = 1e-1*sqrt(2*0.0028);
lambda = -5604;
alpha = 3.81;
f = @(t,V)[alpha*V(1)+lambda*V(1)^3+0.5*sigma^2/V(1);
0]; % Drift function
g = @(t,V)[sigma;
sigma/V(1)]; % Diffusion function
dt = 1e-3; % Time step
t = 0:dt:10; % Time vector
A0 = [0.03;-2]; % 2-by-1 initial condition
% Create and initialize state vector (L here is transposed relative to sde_euler output)
lt = length(t);
n = length(A0);
L = zeros(n,lt);
L(:,1) = A0;
% Set seed and pre-calculate Wiener increments with order matching sde_euler
rng(1);
r = sqrt(dt)*randn(lt-1,n).';
% General Euler-Maruyama integration loop
for i = 1:lt-1
L(:,i+1) = L(:,i)+f(t(i),L(:,i))*dt+r(:,i).*g(t(i),L(:,i));
end
figure;
subplot(211);
plot(t,L(2,:));
ylabel('\phi');
subplot(212);
plot(t,L(1,:));
ylabel('r');
xlabel('t');
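As a quick sanity check (the variable names here are hypothetical, since both snippets above call their output L): if the sde_euler result is saved as L_sde and the loop result as L, the two trajectories should match:
% max(max(abs(L - L_sde.'))) % should be ~0, up to floating point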
Edit: Some time after I asked this question, an R package called MonoPoly (available here) came out that does exactly what I want. I highly recommend it.
I have a set of points I want to fit a curve to. The curve must be monotonic (never decreasing in value) i.e. the curve can only go upward or stay flat.
I originally had been polyfitting my results and this had been working great until I found a particular dataset. The polyfit for data in this dataset was non-monotonic.
I did some research and found a possible solution in this post:
Use lsqlin. Constrain the first derivative to be non-negative at both
ends of the domain of interest.
I'm coming from a programming rather than a math background, so this is a little beyond me. I don't know how to constrain the first derivative to be non-negative as he said. Also, I think in my case I need a curve, so I should use lsqcurvefit, but I don't know how to constrain it to produce monotonic curves.
Further research turned up this post recommending lsqcurvefit, but I can't figure out how to use the important part:
Try this non-linear function F(x) also. You use it together with
lsqcurvefit but it requires a start guess on the parameters. But it is
a nice analytic expression to give as a semi-empirical formula in a
paper or a report.
%Monotone function F(x), with variational constants c0, c1, c2, c3:
%  F(x) = c3 + exp(c0 - c1^2/(4*c2)) * sqrt(pi) ...
%         * Erfi((c1 + 2*c2*x)/(2*sqrt(c2))) / (2*sqrt(c2))
%where Erfi(x) = erf(i*x)/i (see Mathematica); the function looks much like x^3.
%Its derivative f(x) is a probability density, f(x) >= 0:
%  f(x) = dF/dx = exp(c0 + c1*x + c2*x.^2)
I must have a monotonic curve, but I'm not sure how to achieve it, even with all of this information. Would a random number be enough for a "start guess"? Is lsqcurvefit best? How can I use it to produce a best-fitting monotonic curve?
Thanks
Here is a simple solution using lsqlin. The derivative constraint is enforced at each data point; this can easily be modified if needed.
Two coefficient matrices are needed: one (C) for the least-squares error calculation and one (A) for the derivatives at the data points. Since lsqlin takes inequality constraints in the form A_ineq*p <= b, the monotonicity condition A*p >= 0 is passed as -A*p <= 0 (hence the -A below).
% Following lsqlin's notations
%--------------------------------------------------------------------------
% PRE-PROCESSING
%--------------------------------------------------------------------------
% for reproducibility
rng(125)
degree = 3;
n_data = 10;
% dummy data
x = rand(n_data,1);
d = rand(n_data,1) + linspace(0,1,n_data).';
% limit on derivative - in each data point
b = zeros(n_data,1);
% coefficient matrix
C = nan(n_data, degree+1);
% derivative coefficient matrix
A = nan(n_data, degree+1); % the loop below fills degree+1 columns
% loop over polynomial terms
for ii = 1:degree+1
C(:,ii) = x.^(ii-1);
A(:,ii) = (ii-1)*x.^(ii-2);
end
%--------------------------------------------------------------------------
% FIT - LSQ
%--------------------------------------------------------------------------
% Unconstrained
% p1 = pinv(C)*d
p1 = fliplr((C\d).')
p2 = polyfit(x,d,degree)
% Constrained
p3 = fliplr(lsqlin(C,d,-A,b).')
%--------------------------------------------------------------------------
% PLOT
%--------------------------------------------------------------------------
xx = linspace(0,1,100);
plot(x, d, 'x')
hold on
plot(xx, polyval(p1, xx))
plot(xx, polyval(p2, xx),'--')
plot(xx, polyval(p3, xx))
legend('data', 'lsq-pseudo-inv', 'lsq-polyfit', 'lsq-constrained', 'Location', 'southoutside')
xlabel('X')
ylabel('Y')
For the specified input the fitted curves:
Actually this code is more general than what you requested, since the degree of polynomial can be changed as well.
EDIT: enforce the derivative constraint at additional points
The issue pointed out in the comments is that the derivative constraint is enforced only at the data points; between them, no checks are performed. Below is a solution that alleviates this problem. The idea: convert the problem to an unconstrained optimization by using a penalty term.
Note that it uses a term pen to penalize violations of the derivative constraint, so the result is not a true least-squares solution. Additionally, the result depends on the penalty function.
function lsqfit_constr
% Following lsqlin's notations
%--------------------------------------------------------------------------
% PRE-PROCESSING
%--------------------------------------------------------------------------
% for reproducibility
rng(125)
degree = 3;
% data from comment
x = [0.2096 -3.5761 -0.6252 -3.7951 -3.3525 -3.7001 -3.7086 -3.5907].';
d = [95.7750 94.9917 90.8417 62.6917 95.4250 89.2417 89.4333 82.0250].';
n_data = length(d);
% number of equally spaced points to enforce the derivative
n_deriv = 20;
xd = linspace(min(x), max(x), n_deriv).';
% limit on derivative - at each derivative-check point
b = zeros(n_deriv,1);
% coefficient matrix
C = nan(n_data, degree+1);
% derivative coefficient matrix
A = nan(n_deriv, degree+1); % the loop below fills degree+1 columns
% loop over polynomial terms
for ii = 1:degree+1
C(:,ii) = x.^(ii-1);
A(:,ii) = (ii-1)*xd.^(ii-2);
end
%--------------------------------------------------------------------------
% FIT - LSQ
%--------------------------------------------------------------------------
% Unconstrained
% p1 = pinv(C)*y
p1 = (C\d);
lsqe = sum((C*p1 - d).^2);
p2 = polyfit(x,d,degree);
% Constrained
[p3, fval] = fminunc(@error_fun, p1);
% correct format for polyval
p1 = fliplr(p1.')
p2
p3 = fliplr(p3.')
fval
%--------------------------------------------------------------------------
% PLOT
%--------------------------------------------------------------------------
xx = linspace(-4,1,100);
plot(x, d, 'x')
hold on
plot(xx, polyval(p1, xx))
plot(xx, polyval(p2, xx),'--')
plot(xx, polyval(p3, xx))
% legend('data', 'lsq-pseudo-inv', 'lsq-polyfit', 'lsq-constrained', 'Location', 'southoutside')
xlabel('X')
ylabel('Y')
%--------------------------------------------------------------------------
% NESTED FUNCTION
%--------------------------------------------------------------------------
function e = error_fun(p)
% squared error
sqe = sum((C*p - d).^2);
der = A*p;
% penalty term - it is crucial to fine tune it
pen = -sum(der(der<0))*10*lsqe;
e = sqe + pen;
end
end
Gradient-free methods can be used to solve the problem while enforcing the derivative constraint exactly, for example:
[p3, fval] = fminsearch(@error_fun, p_ini);
%--------------------------------------------------------------------------
% NESTED FUNCTION
%--------------------------------------------------------------------------
function e = error_fun(p)
% squared error
sqe = sum((C*p - d).^2);
der = A*p;
if any(der<0)
pen = Inf;
else
pen = 0;
end
e = sqe + pen;
end
fmincon with a non-linear constraint might be a better choice.
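In fact, since the derivative constraint A*p >= 0 is linear in the coefficients p, fmincon can take it directly as a linear inequality, with no nonlinear constraint function at all. A minimal sketch using C, d, A and p1 from lsqfit_constr above (requires the Optimization Toolbox):
% Exact, penalty-free version of the constrained fit
sqe_fun = @(p) sum((C*p - d).^2); % plain squared error
p_con = fmincon(sqe_fun, p1, -A, zeros(size(A,1),1)); % -A*p <= 0, i.e. A*p >= 0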
I leave it to you to work out the details and to tune the algorithms. I hope this is sufficient.
I have a set of 3D points (x,y,z), and I would like to fit a straight line to them using the least absolute deviation method.
I found a function on the internet which works pretty well with 2D data. How could I modify it to handle 3D data points?
function B = L1LinearRegression(X,Y)
% Determine size of predictor data
[n m] = size(X);
% Initialize with least-squares fit
B = [ones(n,1) X] \ Y;
% Perturb the old coefficients to force at least one pass through the loop
BOld = B;
BOld(1) = BOld(1) + 1e-5;
% Repeat until convergence
while (max(abs(B - BOld)) > 1e-6)
% Move old coefficients
BOld = B;
% Calculate new observation weights (based on residuals from old coefficients)
W = sqrt(1 ./ max(abs((BOld(1) + (X * BOld(2:end))) - Y), 1e-6)); % floor to avoid division by zero
% Calculate new coefficients
B = (repmat(W,[1 m+1]) .* [ones(n,1) X]) \ (W .* Y);
end
Thank you very much!
I know that this is not an answer to the question itself, but rather to a different problem that leads to it.
We can use the fit function (from the Curve Fitting Toolbox) several times, once per coordinate.
% XYZ=[x(:),y(:),z(:)]; % suppose we have data in this format
M=size(XYZ,1); % read size of our data
t=((0:M-1)/(M-1))'; % create arbitrary parameter t
% fit all coordinates as function x_i=a_i*t+b_i
fitX=fit(t,XYZ(:,1),'poly1');
fitY=fit(t,XYZ(:,2),'poly1');
fitZ=fit(t,XYZ(:,3),'poly1');
temp=[0;1]; % define the interval where the line shall be plotted
% Evaluate and plot the line coordinates
Line=[fitX(temp),fitY(temp),fitZ(temp)]; % cfit objects can be evaluated directly
plot3(Line(:,1),Line(:,2),Line(:,3))
The advantage is that this works for any point cloud, even one parallel to an axis. Another advantage is that you are not limited to polynomials of first order: you can choose any function for each axis and fit any 3D curve.
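Note that fit with 'poly1' minimizes least squares per coordinate; if least absolute deviation is required, one hedged option is to reuse the 2D L1LinearRegression function from the question with the parameter t as the predictor:
% LAD fit of each coordinate against the parameter t (sketch,
% reusing t and XYZ from above and L1LinearRegression from the question)
Bx = L1LinearRegression(t, XYZ(:,1)); % x(t) = Bx(1) + Bx(2)*t
By = L1LinearRegression(t, XYZ(:,2));
Bz = L1LinearRegression(t, XYZ(:,3));
tt = [0; 1]; % endpoints of the parameter interval
LineL1 = [Bx(1)+Bx(2)*tt, By(1)+By(2)*tt, Bz(1)+Bz(2)*tt];
plot3(LineL1(:,1), LineL1(:,2), LineL1(:,3), '--')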
Is there any way to fit a function of n variables in Matlab? Any example would be very useful.
Until now I have used the Curve Fitting Toolbox, which provides the solution I need for functions with 2 arguments. But now I need to fit a function with many more variables.
The worst thing is that the dependence is non-linear (probably something like a/x + b/y + c/z + ..., but that is only a hypothesis). If it were linear, the '\' operator would do the trick.
lsqnonlin will do, e.g.
%% generate noisy points for fitting
a = 1; b = 2; c = 3;
x = rand(100,3);
y = a./x(:,1) + b./x(:,2) + c./x(:,3) + 0.1*rand(100,1); % per-point noise
%% fitting
% define residual vector
minRes = @(p) (p(1) ./ x(:,1) + p(2) ./ x(:,2) + p(3) ./ x(:,3) - y);
% start values
par0 = [1,1,1];
% optimize
par = lsqnonlin(minRes, par0);
Note that lsqnonlin is part of the Optimization Toolbox.
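One more note: although a/x + b/y + c/z is non-linear in x, y and z, it is linear in the parameters a, b and c, so for this particular hypothesis the '\' operator does the trick after all:
% Linear-in-parameters solution with backslash (base MATLAB,
% reusing x and y from the example above)
par_lin = [1./x(:,1), 1./x(:,2), 1./x(:,3)] \ y;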