Using fminsearch for parameter estimation - matlab

I am trying to do maximum likelihood estimation for a Gaussian distribution by maximizing the log-likelihood, in order to estimate the parameters. I know that Matlab has a built-in function that does this by fitting a Gaussian distribution, but I need to work with the log-likelihood directly so I can extend the method later to other distributions.
So here is the log-likelihood function for the Gaussian distribution:
log L(mu, sigma) = sum_i [ -0.5*log(2*pi*sigma^2) - (r_i - mu)^2 / (2*sigma^2) ]
And I used this code to estimate the parameters for a set of observations (r) with fminsearch, but the search does not converge and I don't fully understand where the problem is:
clear
clc
close all
%make random numbers with gaussian dist
r=[2.39587291079469
1.57478022109723
-0.442284350603745
4.39661178526569
7.94034385633171
7.52208574723178
5.80673144943155
-3.11338531920164
6.64267230284774
-2.02996003947964];
% mu=2 sigma=3
%introduce f
f=@(x,r)-(sum((-0.5.*log(2*3.14.*(x(2))))-(((r-(x(2))).^2)./(2.*(x(1))))))
fun = @(x)f(x,r);
% starting point
x0 = [0,0];
[y,fval,exitflag,output] = fminsearch(fun,x0)
f =
@(x,r)-(sum((-0.5.*log(2*3.14.*(x(2))))-(((r-(x(2))).^2)./(2.*(x(1))))))
Exiting: Maximum number of function evaluations has been exceeded
- increase MaxFunEvals option.
Current function value: 477814.233176
y = 1×2
1.0e-03 *
0.2501 -0.0000
fval = 4.7781e+05 + 1.5708e+01i
exitflag = 0
output =
iterations: 183
funcCount: 400
algorithm: 'Nelder-Mead simplex direct search'
message: 'Exiting: Maximum number of function evaluations has been exceeded↵ - increase MaxFunEvals option.↵ Current function value: 477814.233176 ↵'

Two things go wrong in f: x(1) and x(2) are used inconsistently (the variance appears as x(2) inside the log but as x(1) in the denominator, and the residual subtracts x(2) instead of the mean), and nothing stops the search from making the variance negative, which is why fval comes back complex. Rewrite f as follows:
function y = g(x, r)
% negative Gaussian log-likelihood; x(1) = mu, x(2) = sigma
% (the constant n/2*log(2*pi) is dropped; it does not change the minimizer)
n = length(r);
log_part = 0.5.*n.*log(x(2).^2);
% note: square each residual before summing, otherwise the sigma estimate collapses to zero
sum_part = sum((r-x(1)).^2)./(2.*x(2).^2);
y = log_part + sum_part;
end
Use fmincon instead of fminsearch, because the standard deviation is always a positive number.
Set the lower bound of the standard deviation to zero.
The entire code is as follows:
%make random numbers with gaussian dist
r=[2.39587291079469
1.57478022109723
-0.442284350603745
4.39661178526569
7.94034385633171
7.52208574723178
5.80673144943155
-3.11338531920164
6.64267230284774
-2.02996003947964];
% mu=2 sigma=3
fun = @(x)g(x, r);
% starting point (sigma must start strictly positive, otherwise g is undefined at x0)
x0 = [0,1];
% bounds
lb = [-inf, 0];
ub = [inf, inf];
[y, fval] = fmincon(fun,x0,[],[],[],[],lb,ub, []);
function y = g(x, r)
% negative Gaussian log-likelihood; x(1) = mu, x(2) = sigma
% (the constant n/2*log(2*pi) is dropped; it does not change the minimizer)
n = length(r);
log_part = 0.5.*n.*log(x(2).^2);
% note: square each residual before summing, otherwise the sigma estimate collapses to zero
sum_part = sum((r-x(1)).^2)./(2.*x(2).^2);
y = log_part + sum_part;
end
Solution
y = [3.0693 3.8056]
As a check, mle() returns the same estimates directly.
The code is quite simple:
y = mle(r,'distribution','normal')
Solution
y = [3.0693 3.8056]
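Since the point of hand-coding the negative log-likelihood is to reuse the pattern for other distributions, here is a minimal sketch of the same fmincon approach for an exponential distribution. The names r_exp, nll and lam_hat and the synthetic sample are illustrative assumptions, not part of the original answer (exprnd, like mle above, needs the Statistics Toolbox):
% negative log-likelihood of an exponential distribution with rate lambda:
% -log L(lambda) = -n*log(lambda) + lambda*sum(r)
r_exp = exprnd(2, 100, 1); % synthetic nonnegative sample
nll = @(lam) -numel(r_exp)*log(lam) + lam*sum(r_exp);
lam0 = 1; % strictly positive starting point
lam_hat = fmincon(nll, lam0, [], [], [], [], eps, inf)
% closed-form MLE for comparison: 1/mean(r_exp)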

Related

[Octave] Using fminunc is not always giving a consistent solution

I am trying to find the coefficients in an equation to model the step response of a motor, which is of the form 1-e^x. The equation I'm using to model it is of the form
a(1)*t^2 + a(2)*t^3 + a(3)*t^4 + ...
(It is derived in a research paper used to solve for motor parameters.)
Sometimes using fminunc to find the coefficients works out okay and I get a good result that matches the training data fairly well. Other times the returned coefficients are horrible (far higher than the output should be, off by orders of magnitude). This especially happens once I start using higher-order terms: any model that uses t^8 or higher (t^9, t^10, t^11, etc.) always produces bad results.
Since it works sometimes, I can't see why my implementation would be wrong. I have tried fminunc both with and without supplying the gradients, and there is no difference. I've looked into other functions to solve for the coefficients, like polyfit, but polyfit requires terms of every power from 1 up to the highest-order term, while the model I'm using has its lowest power at 2.
Here is the main code:
clear;
%Overall Constants
max_power = 7;
%Loads in data
%data = load('TestData.txt');
load testdata.mat
%Sets data into variables
indep_x = data(:,1); Y = data(:,2);
%number of data points
m = length(Y);
%X is a matrix with the independent variable
exps = [2:max_power];
X_prime = repmat(indep_x, 1, max_power-1); %Repeats columns of the indep var
X = bsxfun(@power, X_prime, exps);
%Initializes theta to rand vals
init_theta = rand(max_power-1,1);
%Sets up options for fminunc
options = optimset( 'MaxIter', 400, 'Algorithm', 'quasi-newton');
%fminunc minimizes the output of the cost function by changing the theta parameters
[theta, cost] = fminunc(@(t)(costFunction(t, X, Y)), init_theta, options)
%
Y_line = X * theta;
figure;
hold on; plot(indep_x, Y, 'or');
hold on; plot(indep_x, Y_line, 'bx');
And here is costFunction:
function [J, Grad] = costFunction (theta, X, Y)
%# of training examples
m = length(Y);
%Initialize Cost and Grad-Vector
J = 0;
Grad = zeros(size(theta));
%Produces an output based off the current values of theta
model_output = X * theta;
%Computes the squared error for each example then adds them to get the total error
squared_error = (model_output - Y).^2;
J = (1/(2*m)) * sum(squared_error);
%Computes the gradients for each theta t
for t = 1:size(theta, 1)
Grad(t) = (1/m) * sum((model_output-Y) .* X(:, t));
end
endfunction
Any help or advice would be appreciated.
Try adding regularization to your costFunction:
function [J, Grad] = costFunction (theta, X, Y, lambda)
m = length(Y);
%Initialize Cost and Grad-Vector
J = 0;
Grad = zeros(size(theta));
%Produces an output based off the current values of theta
model_output = X * theta;
%Computes the squared error for each example then adds them to get the total error
squared_error = (model_output - Y).^2;
J = (1/(2*m)) * sum(squared_error);
% Regularization
J = J + lambda*sum(theta(2:end).^2)/(2*m);
%Computes the gradients for each theta t
regularizator = lambda*theta/m;
% overwrite the 1st element, i.e. the one corresponding to theta zero, which is left unregularized
regularizator(1) = 0;
for t = 1:size(theta, 1)
Grad(t) = (1/m) * sum((model_output-Y) .* X(:, t)) + regularizator(t);
end
endfunction
The regularization parameter lambda controls how strongly large coefficients are penalized. Start with lambda=1; the greater the value of lambda, the more the coefficients are shrunk toward zero, which damps the wild high-order fits. Increase lambda if the behavior you describe persists. You may need to increase the number of iterations if lambda gets high.
You may also consider normalizing your data, and using some heuristic for initializing theta; setting all theta to 0.1 may be better than random values. If nothing else, it will provide better reproducibility from training to training. A sketch of both follows.
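Here is a minimal sketch of both suggestions, assuming X, Y, options and the regularized costFunction are as above (mu_X, sigma_X and X_norm are illustrative names, and lambda = 1 is passed as the fourth argument):
mu_X = mean(X); % column means, 1 x (max_power-1)
sigma_X = std(X); % column standard deviations
X_norm = bsxfun(@rdivide, bsxfun(@minus, X, mu_X), sigma_X); % z-score each column
init_theta = 0.1 * ones(max_power-1, 1); % deterministic starting point
[theta, cost] = fminunc(@(t) costFunction(t, X_norm, Y, 1), init_theta, options);
% note: theta now lives in normalized units; undo the scaling before plotting against X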

ODE45 gives different results from `expm` for a sparse matrix

K is a large sparse matrix and y is a vector. At a particular timestep dt from t1 to t1+dt:
Method1:
The expm leads to:
K = ...
y = ...
y = expm(-1i*dt*K)*y; %new y
Method2:
The ode45 gives:
K = ...
y = ...
y0 = y;
[T, Y] = ode45(@(t,y)dy(y,K),[t1 t1+dt],y0);
y = Y(end,:).'; %new y
where:
function ydot = dy(y,K)
ydot = -1i*K*y;
The two methods give different results for a large sparse matrix. Which one is correct?
As I mentioned above, there is no way to 100% guarantee the correctness of ODE solver results, but you can:
- manually set the upper bound for the integration step size;
- try using a stiff solver (ode15s, ode23t, etc.);
- supply the Jacobian matrix or the Jacobian pattern for dy(y,K) to improve the solver accuracy (see the sketch below).
Here is an example of manually setting the maximum step size:
options= odeset('MaxStep',1e-3); % some experimentally obtained value here
[T, Y] = ode45(@(t,y)dy(y,K),[t1 t1+dt],y0,options);
Here is the description of the Jacobian and JPattern options. Note that you can't use them with ode45; you should use one of the stiff solvers instead.
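For this particular right-hand side the Jacobian is available in closed form: ydot = -1i*K*y is linear in y, so the Jacobian is the constant matrix -1i*K. A minimal sketch combining it with one of the stiff solvers mentioned above (ode15s here; the MaxStep value is still just an experimentally obtained guess):
options = odeset('Jacobian', -1i*K, 'MaxStep', 1e-3); % constant Jacobian of the linear system
[T, Y] = ode15s(@(t,y) dy(y,K), [t1 t1+dt], y0, options);
y = Y(end,:).'; % new y, comparable against expm(-1i*dt*K)*y0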

Matlab: converting symbolic to function handle, values returned not double

I have some obscurely long symbolic expression of 3 variables (g_eq), which I would like to minimize with some constraints. I am trying to do so by converting it to a function handle and using fmincon:
cfun = matlabFunction(g_eq,'vars',[kappa;theta;sigma]);
A = [-1 0 0; 0 -1 0; 0 0 -1];
b = [0; 0; 0];
[x,fval] = fmincon(@(kappa, theta, sigma) cfun, x0, A, b)
Which Matlab doesn't like:
FMINCON requires all values returned by user functions to be of
data type double.
I suspect that the problem is with cfun, as it is full of numbers with symbolic precision. Can I somehow convert it so that they're double? Or better (computation-time wise), while creating my objective function cfun (a complicated process of transformations of the data and a parametric model), can I use symbols or some other "proxy for the variables" with doubles for the numeric part of the expressions?
Thanks,
J.
Edit - MCVE:
My aim is to find the parameters of a model by minimizing the difference between the model-implied and data-implied Laplace transforms over some weighted regions. Here I provide the problem over one small region, without the use of weights over the regions and with some further simplifications. In part 0 I provide the functions used for the transformation, in part II I make the parametric transformation, in part III the data transformation, and in part IV I attempt to minimize the difference.
%% 0. Functions used
%% 0.1 L_V1 - transform of parametric
function lv = L_V1(u,sigma,kappa,theta)
lv = (1/(1+u.*sigma^2/(2*kappa))).^(2*kappa*theta/sigma^2);
end
%% 0.2 LV_2 - transform of data
function lv = L_hat1(u,D,n,T)
A_u = cos(sqrt(2 .*u) *sqrt(n) .*D);
Z_u = 1/n * sum(A_u);
lv = 1/T * sum(Z_u);
end
%% I. Pre-estimation
ulng1=100; %select number of points on the evaluated interval
u1 = linspace(.8, 1.6,ulng1); % create region of interest
%% II. Parametric part
par_mat1 = sym(zeros(ulng1,1)); % create interval for parametric
syms sigma kappa theta LV_par1;
for i = 1:ulng1
par_mat1(i) = L_V1(u1(i),sigma,kappa,theta); %transformations of parametric
end
LV_par1 = sum(par_mat1); %sum of parametric over the region
%% III. Data part
n = 100; %choose number of days
T = 20; %choose number of obs over a day
D = rand([n-1, T]); %vector of observations, here just random numbers
for i = 1:ulng1
hat_mat1(i) = L_hat1(u1(i),D,n,T); %transformations of data
end
hat_1 = sum(hat_mat1); %sum of transforms over the region
%% IV. Minimize
W = 1; %weighting matrix, here just one region, hence 1
MC = hat_1 - LV_par1 ; %moment condition
g_eq = (MC) * (W) *(MC.'); %objective function (symbolic)
cfun = matlabFunction(g_eq,'vars',[kappa;theta;sigma]); %create a function handle from the symbolic
x0 = [.5; 1; .5];
A = [-1 0 0; 0 -1 0; 0 0 -1]; %constrains
b = [0; 0; 0]; %constrains
[x,fval] = fmincon(@(kappa, theta, sigma) cfun, x0, A, b) %minimize
The optimization parameters are always passed as a single vector:
[x,fval] = fmincon(@(x) cfun(x(1),x(2),x(3)), x0, A, b)
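Alternatively, matlabFunction can generate a handle that takes one vector argument directly, by wrapping the variable list in a cell array (this uses the documented 'Vars' option; everything else is as in the question):
cfun = matlabFunction(g_eq, 'Vars', {[kappa; theta; sigma]}); % cfun(x) now expects one 3-element vector
[x, fval] = fmincon(cfun, x0, A, b)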

Approximating stochastic integral with Wiener process (MATLAB)

Is there a toolbox or available MATLAB function that will allow me to solve the following approximation of stochastic integrals, where z is a Wiener process:
%Lets say n is 100 and dt is 1/252 and k = .1
n = 100;
dt = 1/252;
k = 0.1;
dz = randn(n,1); %get random increments: normal
%dz = 2*(randi(2,n,1)-1.5); % or plus/minus ones : bernoulli
fnt = exp(-k*(n*dt - [0:n-1]*dt))*sqrt(dt)*dz;
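For reference, the row vector times column vector in the last line collapses to a scalar, which is exactly the Riemann-Ito sum being approximated. An equivalent explicit loop, assuming t_i = i*dt and terminal time T = n*dt as in the code above (fnt_loop is an illustrative name):
fnt_loop = 0;
for i = 1:n
% integrand exp(-k*(T - t_{i-1})) times the scaled increment sqrt(dt)*dz(i)
fnt_loop = fnt_loop + exp(-k*(n*dt - (i-1)*dt)) * sqrt(dt) * dz(i);
end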

Error using fmincon (line 300) A must have 'n' column(s)

I'm getting the error "Error using fmincon (line 300) A must have 'n' column(s)." when trying to solve the following optimisation code. I think there is an error in the definition of the constraints. Someone had the same problem http://goo.gl/35MeC but unfortunately I don't read Chinese!
The objective is to find the optimal values of the array Y subject to constraints on Y and on its integral IntY. To understand the nature of the problem better: each value of Y represents the value of a variable at a successive time step, and the objective function to minimise is basically a cost of interactions.
function [Y, IntY] = optTest()
% inputs of the problem
TS = 10; % number of time steps
YMin = -0.1; % minimum value of Y
YMax = 0.2; % maximum value of Y
IntYMin = 0.1; % min value of the integral of Y
IntYMax = 0.9; % max value of the integral of Y
IntYInit = 0.2; % initial value of the integral of Y
Prices = PricesFun(TS);
% use of function 'fmincon', preparation of the inputs
% x = fmincon(fun,x0,A,b,Aeq,beq,lb,ub)
A = [tril(ones(TS))*1;tril(ones(TS))*-1];
b = [ones(TS,1)*(IntYInit-IntYMin);ones(TS,1)*(IntYMax-IntYInit)];
lb = ones(TS,1)*YMax;
ub = ones(TS,1)*YMin;
Y0 = ones(TS)*IntYInit;
Y = fmincon(@(x) costFun(x, Prices),Y0,A,b,[],[],lb,ub);
IntY = cumsum(Y);
function cost = costFun(x, Prices)
cost = sum(x*Prices);
function P = PricesFun(TS)
x = linspace(1,TS,TS);
pi = 3.1415;
P = 2 + sin(x/TS*4*pi);
The code above is self contained, if you want to try, you have just to paste it in matlab and call the function:
[Y, IntY] = optTest();
Your initial guess Y0 is what defines the number of variables in the optimization. You are passing a TS x TS square matrix for Y0, which would require TS*TS linear constraints. Given that you're using column vectors for lb and ub, I assume you meant to create Y0 as a column vector as well: Y0 = ones(TS,1)*IntYInit;
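A minimal sketch of the corrected setup, assuming the bounds were intended as YMin <= Y <= YMax (lb and ub appear swapped in the original code) and that the cost was meant to be the scalar dot product of Y with Prices:
Y0 = ones(TS,1)*IntYInit; % column vector: one variable per time step
lb = ones(TS,1)*YMin; % lower bound uses YMin (swapped in the original)
ub = ones(TS,1)*YMax; % upper bound uses YMax
Y = fmincon(@(x) sum(x.*Prices(:)), Y0, A, b, [], [], lb, ub);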