I am taking an Econometrics course and have been trying to use Python rather than the proprietary Stata and EViews that the assignments are set in.
In one of the questions, I have consumption data over time. I am asked to compute it in two ways.
The first way is calculating a model of the form consumption = Aexp(Bt), and the second way is to log both sides and do ordinary OLS on log(consumption) = alpha + Bt
I know how to do the second way. However, when I try to do the first way it goes wrong. Using statsmodels, I can exponentiate the time data (after normalising), but this fits a regression of the form consumption = A exp(t) + B, which is not what I want (I want to specify where the parameters go). In sklearn I could find polynomial regression, but not exponential.
Then I found scipy.optimize.curve_fit.
However this seems to have two problems:
(1) It seems to rely on initial guesses for the parameters, which means my output will end up being different from the proprietary software (whereas output for things like OLS is the same). I assume initial guesses mean some iterative solution is used, which is helpful for very weird and wonderful functions, but I would expect fairly standard results to hold for exponential regression.
(2) every time I try to implement it, it just returns the guess parameters.
Here is my code
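import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
consumption_data = pd.read_csv("......\consumption.csv")
def func(x, a, b):
    # exponential model: a * exp(b * x)
    return a * np.exp(b * x)
xdata = consumption_data.YEAR
ydata = consumption_data.CONSUMPTION
ydata = (ydata - 1948)/100
popt, pcov = curve_fit(func, xdata, ydata, (1,1))
plt.plot(xdata, func(xdata, *popt), 'g--',)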
The scipy.optimize code is basically just copy-pasted from their tutorial
Short answer: use statsmodels GLM.
statsmodels does not have nonlinear least squares. The best python library for that is lmfit https://pypi.org/project/lmfit/
curve_fit, lmfit and nonlinear least squares algorithms in general find an iterative solution to the optimization problem. Even when we have to provide starting values, the solution is in many cases the same across packages up to convergence tolerance, e.g. 1e-5 or 1e-6.
Many standard models in statistics and econometrics have a single global maximum with well behaved data. However, in other cases like mixture models, there might be many local optima and the estimation might converge to one of them.
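As a rough illustration of the point about starting values, here is a minimal sketch on synthetic data. The data, the seed and the names exp_model, t and y are made up for the example and are not taken from the question. It rescales the time variable so that exp(b*t) stays in a sensible range, takes starting values from the log-linear OLS fit, and then refits from a cruder starting point to compare the two results.

```python
import numpy as np
from scipy.optimize import curve_fit

def exp_model(t, a, b):
    """Exponential mean function a * exp(b * t)."""
    return a * np.exp(b * t)

# Synthetic data (an assumption for this sketch): true a = 50, b = 3.
rng = np.random.default_rng(0)
years = np.arange(1948, 1998)
t = (years - years[0]) / 100.0          # rescaled time: 0.00 ... 0.49
y = 50.0 * np.exp(3.0 * t) + rng.normal(scale=2.0, size=t.size)

# Starting values from the log-linear OLS fit log(y) = alpha + b * t.
slope, intercept = np.polyfit(t, np.log(y), 1)
p0_ols = (np.exp(intercept), slope)

# Nonlinear least squares from two different starting points.
popt_ols, _ = curve_fit(exp_model, t, y, p0=p0_ols)
popt_crude, _ = curve_fit(exp_model, t, y, p0=(y[0], 1.0))

print("start from log-OLS:", popt_ols)
print("start from crude guess:", popt_crude)
```

On well-behaved data like this, both starting points should end up at the same estimates up to the convergence tolerance.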
To the specific case:
consumption = A exp(B t)
can be rewritten as
consumption = exp(a + B t)
So this is just a single index model or a generalized linear model with an exponential mean function.
The general version has the expectation of the dependent variable as a nonlinear function of a linear combination of the explanatory variables:
E(y | x) = g(x b)
This can be estimated with statsmodels with GLM with family Gaussian and the log-link.
Aside: In econometrics, there is a literature to use Poisson quasi-likelihood as an estimator for exp models instead of taking the log of the dependent variable.
Poisson usually uses the log-link function as in the above.
However, using GLM allows us to use log-link, i.e. exponential mean function, with any of the supported distribution families. The main difference is in the underlying variance assumption. Gaussian assumes constant variance, Poisson assumes that the variance is proportional to the mean and Gamma assumes that the variance is quadratic in the mean.
If we use a robust sandwich covariance estimator for parameter inference, then standard errors and inference are correct even if the variance function is misspecified.
I am working on an MR physics simulation written in Matlab which simulates the Bloch equations on a defined object. The magnetisation in the object is updated every time step with the following functions.
function Mt = evolveMtrans(gamma, delta_B, G, T2, Mt0, delta_t)
% this function calculates precession and relaxation of the
% transversal component, Mt, of M
delta_phi = gamma*(delta_B + G)*delta_t;
Mt = Mt0 .* exp(-delta_t*1./T2 - 1i*delta_phi);
This function is a very small part of the entire code, but it is called up to 250,000 times and thus slows down the entire simulation. I have thought about how I can speed up the calculation but haven't come up with a good solution. There is one line that is VERY time consuming and accounts for approximately 50% - 60% of the overall simulation time. This is the line,
Mt = Mt0 .* exp(-delta_t*1./T2 - 1i*delta_phi);
Mt0 = 512x512 matrix
delta_t = a scalar
T2 = 512x512 matrix
delta_phi = 512x512 matrix
I would be very grateful for any suggestion to speed up this calculation.
More info below,
The function evolveMtrans is called every timestep during the simulation.
The parameters that are used for calling the function are,
gamma = a constant (the gyromagnetic constant)
delta_B = the magnetic field value
G = gradient strength
T2 = a 512x512 matrix with T2-values for the object
Mstart.r = a 512x512 matrix with the values M.r had the last timestep
delta_t = a scalar with the difference in time since the last calculated M.r
The only parameters of these that changed during the simulation are,
G, Mstart.r and delta_t. The rest do not change their values during the simulation.
The part below is the part in the main code that calls the function.
% update phase and relaxation to calcTime
delta_t = calcTime - Mstart_t;
delta_B = (d-d0)*B0;
G = Sq.Gx*Sq.xGxref + Sq.Gz*Sq.zGzref;
% Precession around B0 (z-axis) and B1 (+-x-axis or +-y-axis)
% is defined clock-wise in a right hand system x, y, z and
% x', y', z (see the Bloch equation, Bloch 1946 and Levitt
% 1997). The x-axis has angle zero and the y-axis has angle 90.
% For flipping/precession around B1 in the xy-plane, z-axis has
% angle zero.
% For testing of precession direction:
% delta_phi = gamma*((ones(size(d)))*1e-6*B0)*delta_t;
M.r = evolveMtrans(gamma, delta_B, G, T2, Mstart.r, delta_t);
M.l = evolveMlong(T1, M0.l, Mstart.l, delta_t);
This is not a surprise.
That "single line" is a matrix equation. With 512x512 matrices it is really 262,144 element-wise equations evaluated at once.
Per Jannick, that first term means element-wise division, so "delta_t/T2(i,j)". For N x N matrices, multiplying by a scalar is O(N^2), matrix addition is O(N^2), and evaluating the element-wise exponential is O(N^2).
I'm not sure if I saw a complex argument in there as well. Does that mean complex matrices with real and imaginary entries? Does your equation simplify to real and imaginary parts? That means twice the number of computations.
Your best hope is to exploit symmetry as much as possible. If all your matrices are symmetric, you cut your calculations roughly in half.
Use parallelization if you can.
Algorithm choice can make a big difference, too. If you're using explicit Euler integration, you may have time step limitations due to stability concerns. Is that why you have 250,000 steps? Maybe a larger time step is possible with a more stable integration scheme. Think about a higher order adaptive scheme with error correction, like 5th order Runge-Kutta.
There are several possibilities to improve the speed of the code but all that I see come with a caveat.
Numerical ode integration
The first possibility would be to replace your analytical solution with a numerical differential equation solver. This has several advantages:
The analytical solution includes the complex exponential function, which is costly to calculate, while the differential equation contains only multiplication and addition. (d/dt u = -a u => u=exp(-at))
There are plenty of built-in solvers for matlab available and they are typically pretty fast (e.g. ode45). The built-ins however all use a variable step size. This improves speed and accuracy but would be a problem if you really need a fixed equally spaced grid of time points. Here are unofficial fixed step solvers.
As a start you could also try just an Euler step by replacing
M.r = evolveMtrans(gamma, delta_B, G, T2, Mstart.r, delta_t);
with
delta_phi = gamma*(delta_B + G)*t_step;
M.r = M.r .* (1 - t_step./T2 - 1i*delta_phi);
You can then further improve that by precalculating all constant values (e.g. one_over_T2 = 1./T2) and moving the parts of delta_phi that do not change, such as gamma*delta_B, out of the loop.
You are bound to a small step size, or the accuracy suffers. Therefore this is only a good idea if your time spacing is quite fine.
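To make the accuracy caveat concrete, here is a small sketch in Python/NumPy rather than Matlab. All values are hypothetical: a 64x64 grid stands in for the 512x512 matrices, and omega stands in for gamma*(delta_B + G). It compares repeated first-order Euler updates against the analytical exponential over the same total time, for a coarse and a fine step.

```python
import numpy as np

def exact_step(m, dt, t2, omega):
    """Analytical update: m * exp(-dt/T2 - 1i*omega*dt)."""
    return m * np.exp(-dt / t2 - 1j * omega * dt)

def euler_step(m, dt, inv_t2, omega):
    """First-order update: m * (1 - dt/T2 - 1i*omega*dt), no exp needed."""
    return m * (1.0 - dt * inv_t2 - 1j * omega * dt)

# Hypothetical object: small grid, T2 in seconds, omega in rad/s.
rng = np.random.default_rng(0)
shape = (64, 64)
t2 = rng.uniform(0.05, 0.2, shape)
omega = rng.uniform(-50.0, 50.0, shape)
m0 = np.ones(shape, dtype=complex)
inv_t2 = 1.0 / t2                      # constant, precomputed once

def max_error(n_steps, total_time=0.01):
    """Largest absolute deviation of n Euler steps from the exact result."""
    dt = total_time / n_steps
    m = m0.copy()
    for _ in range(n_steps):
        m = euler_step(m, dt, inv_t2, omega)
    return np.abs(m - exact_step(m0, total_time, t2, omega)).max()

err_coarse = max_error(10)      # dt = 1e-3 s
err_fine = max_error(1000)      # dt = 1e-5 s
print(err_coarse, err_fine)
```

The error shrinks roughly in proportion to the step size, so the Euler variant only pays off when the time grid is already fine enough for the accuracy you need.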
Fewer points in time
You should carefully analyze whether you really need so many points in time. It seems somewhat puzzling to me that you need so many points. As you know the full analytical solution, you can freely choose how to sample the time and maybe use this to your advantage.
Going fortran
This might seem like a grand step but in my experience basic (simple loops, matrix operations etc.) matlab code can be relatively easily translated to fortran line-by-line. This would be especially helpful in addition to my first point. If you still want to use the full analytical solution probably there is not much to gain here because exp is already pretty fast in matlab.
I have the following equation:
y = exp(A u^2 + B a^3)
I want to do a exponential curve fitting using MATLAB for the above equation, where y = f(u,a). y is my output while (u,a) are my inputs. I want to find the coefficients A,B for a set of provided data.
I know how to do this for simple polynomials by defining states. As an example, if states = [ones(size(u)), u, u.^2], this will give me L + M u + N u^2, with L, M and N being regression coefficients.
However, this is not the case for the above equation. How could I do this in MATLAB?
Building on what @eigenchris said, simply take the natural logarithm (log in MATLAB) of both sides of the equation. If we do this, we would in fact be linearizing the equation in log space. In other words, given your original equation:
y = exp(A u^2 + B a^3)
We get:
log(y) = A u^2 + B a^3
However, this isn't exactly polynomial regression. This is more of a least squares fitting of your points. Specifically, given a set of y values and a set of (u,a) pairs, you would build a system of equations and solve this system via least squares. In other words, given the set y = (y_0, y_1, y_2,...y_N), and (u,a) = ((u_0, a_0), (u_1, a_1), ..., (u_N, a_N)), where N is the number of points that you have, you would build your system of equations like so:
log(y_0) = A u_0^2 + B a_0^3
log(y_1) = A u_1^2 + B a_1^3
...
log(y_N) = A u_N^2 + B a_N^3
This can be written in matrix form:
[log(y_0); log(y_1); ...; log(y_N)] = [u_0^2 a_0^3; u_1^2 a_1^3; ...; u_N^2 a_N^3] * [A; B]
To solve for A and B, you simply need to find the least-squares solution. You can see that it's in the form of:
Y = AX
To solve for X, we use what is called the pseudoinverse. As such:
X = A^{*} * Y
A^{*} is the pseudoinverse of the matrix A (the matrix of u^2 and a^3 values, not to be confused with the coefficient A). This can elegantly be done in MATLAB using the \ or mldivide operator. All you have to do is build a vector of y values with the log taken, as well as the matrix of u and a values. Therefore, if your points (u,a) are stored in the column vectors u and a respectively, and the values of y are stored in y, you would simply do this:
x = [u.^2 a.^3] \ log(y);
x(1) will contain the coefficient A, while x(2) will contain the coefficient B. As A. Donda has noted in his answer (which I embarrassingly forgot about), the values of A and B are obtained assuming that the errors in log space with respect to the exact curve you are trying to fit are normally (Gaussian) distributed with a constant variance. The errors also need to be additive. If this is not the case, then the parameters you obtain may not represent the best possible fit.
See this Wikipedia page for more details on what assumptions least-squares fitting takes:
One approach is to use a linear regression of log(y) with respect to u² and a³:
Assuming that u, a, and y are column vectors of the same length:
AB = [u .^ 2, a .^ 3] \ log(y)
After this, AB(1) is the fit value for A and AB(2) is the fit value for B. The computation uses Matlab's mldivide operator; an alternative would be to use the pseudo-inverse.
The fit values found this way are Maximum Likelihood estimates of the parameters under the assumption that deviations from the exact equation are constant-variance normally distributed errors additive to A u² + B a³. If the actual source of deviations differs from this, these estimates may not be optimal.
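For readers who want to check the idea outside MATLAB, here is a small sketch in Python/NumPy. The synthetic data and the "true" values 0.7 and -0.4 are assumptions for the example. It fits log(y) against u^2 and a^3 by least squares, once with lstsq (the counterpart of mldivide here) and once with an explicit pseudo-inverse.

```python
import numpy as np

# Synthetic data (an assumption for this sketch).
rng = np.random.default_rng(0)
n = 200
u = rng.uniform(0.0, 2.0, n)
a = rng.uniform(0.0, 1.5, n)
A_true, B_true = 0.7, -0.4

# Errors are additive in log space, as the fit assumes.
log_y = A_true * u**2 + B_true * a**3 + rng.normal(scale=0.05, size=n)
y = np.exp(log_y)

# Design matrix with columns u^2 and a^3 (no intercept, as in the answers).
design = np.column_stack([u**2, a**3])

# Least-squares solution, analogous to [u.^2, a.^3] \ log(y) in MATLAB.
coef, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)
A_hat, B_hat = coef

# The same solution via the pseudo-inverse.
coef_pinv = np.linalg.pinv(design) @ np.log(y)
print(A_hat, B_hat)
```

Both routes give the same coefficients for a full-rank design matrix; lstsq is usually preferred because it avoids forming the pseudo-inverse explicitly.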
I am using the Global Optimization Toolbox to optimize the following function:
function x = NameOfFunction (w1, w2, w3, a, b, c, Structure1, Structure2, Structure3)
where I am minimizing x by changing the values of w1, w2, and w3. The remaining parameters are constants and structures containing data. The value of x, as well as the three w variables depend on the data that is fed into the function via the structures.
The function returns x which is the mean of 180 values that are calculated in the process of running NameOfFunction.
I am wondering how I could add a constraint on the standard deviation of the 180 values. I am not interested in minimizing both the mean and the standard deviation, but rather to minimize x(the mean), while allowing standard deviation to be no greater than some specific value. I know how to add constraints to the decision variables (ie. w1, w2, w3), but have no idea how to do so for a value like the standard deviation.
EDIT: More detail, per Werner's suggestion:
% the function is f(w) rather than f(x)
Aeq = [1 1 1];
beq = 1;
lb = .10 * [1 1 1];
ub = .8 * [1 1 1];
w = [weight1, weight2, weight3];
wstart = randn(3,1);
options = optimset('Algorithm','interior-point');
% function handle for the objective function (note that variables
% aa through hh are additional parameters that the solver does not modify):
h = @(w)NameOfFunction(w(1),w(2),w(3), aa, bb, cc, dd, ee, ff, gg, hh);
% problem structure:
problem = createOptimProblem('fmincon','x0',wstart,'objective',h,...'
gs = GlobalSearch;
I'm running a GlobalSearch using fmincon.
7/16/2013, After implementing nonlcon I was able to achieve what I tried to do. (I have a follow-up question, which I put on the bottom of this post). Here's what I did:
I added another function (StdConstraintFunction) as discussed. So now I have the following:
stdMax = 0.01;
h = @(w)NameOfFunction(w(1),w(2),w(3),aa, bb, cc, dd, ee, ff, gg);
StdConstraint = @(w)StdConstraintFunction(w(1),w(2),w(3),aa, bb, cc, dd, ee, ff, gg,stdMax);
where StdConstraintFunction is a modified version of NameOfFunction that calculates the standard deviation rather than the mean.
The last line in the two functions is the only thing that is different in the body of the functions.
In NameOfFunction, the last line is:
ReturnVariable = -1 * (nanmedian([vect1]));
%note: I added the -1 multiplication to search for the maximum rather than minimum
The last line in StdConstraintFunction is:
ReturnVariable = (std([vect1]) - stdMax);
ceq = [];
%ceq is a required variable that is supposed to return the equality non-linear
%constraint; here it is blank because I don't have one. The optimization
%would produce an error if I exclude it
and my problem setup is:
problem = createOptimProblem('fmincon','x0',xstart,'objective',h,'Aeq',Aeq,'beq',beq,'options',options,'lb',lb,'ub',ub,'nonlcon',StdConstraint);
@Werner: If you want to post this as the answer to the question, I will gladly accept it as the official answer. Thanks so much for all your help!
Solving optimization problems with non-linear conditions depending only on variables being optimized
Using the MATLAB fmincon documentation:
x = fmincon(fun,x0,A,b,Aeq,beq,lb,ub,nonlcon,options)
nonlcon: The function that computes the nonlinear inequality constraints c(x) ≤ 0 and the nonlinear equality constraints ceq(x) = 0. nonlcon accepts a vector x and returns the two vectors c and ceq. c is a vector that contains the nonlinear inequalities evaluated at x, and ceq is a vector that contains the nonlinear equalities evaluated at x. nonlcon should be specified as a function handle to a file or to an anonymous function, such as mycon:
x = fmincon(@myfun,x0,A,b,Aeq,beq,lb,ub,@mycon)
where mycon is a MATLAB function such as
function [c,ceq] = mycon(x)
c = ... % Compute nonlinear inequalities at x.
ceq = ... % Compute nonlinear equalities at x.
If the gradients of the constraints can also be computed and the GradConstr option is 'on', as set by
options = optimoptions('fmincon','GradConstr','on')
then nonlcon must also return, in the third and fourth output arguments, GC, the gradient of c(x), and GCeq, the gradient of ceq(x). GC and GCeq can be sparse or dense. If GC or GCeq is large, with relatively few nonzero entries, save running time and memory in the interior-point algorithm by representing them as sparse matrices. For more information, see Nonlinear Constraints.
So, what you will need to do in order to add the nonlinear constraint is to use a nonlcon function that returns c with the standard deviation computed from the w values. This may be accomplished using an anonymous function. Since nonlcon must return both c and ceq, deal is used here to supply an empty ceq:
nonlcon = @(x) deal(std(x) - std_lim, []);
which means std(x) <= std_lim, where x are the variables passed to MATLAB to be optimized, in this case starting from wstart and updated at each iteration. Of course, instead of std you may use whatever you want, e.g. x(1)^2 + x(2)^3 - sin(x(3)), supposing you have three variables being optimized.
And then change your code to:
problem = createOptimProblem('fmincon','x0',wstart,'objective',h,...
Note: If you don't have one or more of the above extra variables, i.e. linear lower boundary lb, just don't add it to createOptimProblem.
Solving optimization problems with non-linear conditions also depending on variables that are not being optimized
In this particular problem, the variables to be optimized (w) are not the only variables needed to calculate the standard deviation, as noticed by @Mr. Kinn, so the nonlinear constraint function must also be fed extra variables that are not being optimized. In order to do so, we alter our anonymous function handle to:
StdConstraint = @(w)StdConstraintFunction(w(1),w(2),w(3),aa, bb, cc, dd, ee, ff, gg,stdMax);
which is a function handle with one input variable, called w, fed by MATLAB's internal code with the variables being optimized. In this solution, w is passed as three arguments to a MATLAB file function called StdConstraintFunction, which also receives the variables aa,bb,cc,dd,ee,ff,gg,stdMax from the environment where the StdConstraint handle was created, so they will not be modified by MATLAB's internal routines during optimization.
There, the variables are used to calculate the nonlinear condition to be respected, remembering that, as stated in the MATLAB documentation, the nonlinear constraint function must return two values: c and ceq. The first output, c, holds the inequality conditions, which are less than or equal to zero when they are respected and positive when they are violated. The second output, ceq, holds the nonlinear equality conditions, which must equal zero.
You may adapt this particular solution to your problem just by changing the arguments passed to the function handle used as the nonlinear constraint.
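The same pattern (an objective and a constraint function that both close over fixed data, with only w being optimized) can be sketched in Python with scipy.optimize.minimize. This is not fmincon or GlobalSearch; it is a single local SLSQP run on made-up data, and the names data, values, neg_mean and std_slack are hypothetical. Note the sign convention differs: fmincon expects c(x) <= 0, while SLSQP 'ineq' constraints expect a function that is >= 0 when satisfied.

```python
import numpy as np
from scipy.optimize import minimize

# Made-up stand-in for the data held in the structures: 180 rows, 3 columns.
rng = np.random.default_rng(0)
data = rng.normal(loc=[1.0, 2.0, 4.0], scale=[1.0, 2.0, 4.0], size=(180, 3))
std_max = 1.5

def values(w, data):
    """The 180 values produced for a given weight vector."""
    return data @ w

def neg_mean(w, data):
    """Objective: negative mean, so that minimizing maximizes the mean."""
    return -values(w, data).mean()

def std_slack(w, data, std_max):
    """Inequality constraint for SLSQP: >= 0 means std <= std_max."""
    return std_max - values(w, data).std(ddof=1)

constraints = [
    {"type": "eq", "fun": lambda w: w.sum() - 1.0},                  # Aeq*w = beq
    {"type": "ineq", "fun": lambda w: std_slack(w, data, std_max)},  # std limit
]
w0 = np.array([0.6, 0.3, 0.1])      # feasible starting weights
res = minimize(lambda w: neg_mean(w, data), w0, method="SLSQP",
               bounds=[(0.1, 0.8)] * 3, constraints=constraints)

print(res.x, values(res.x, data).mean(), values(res.x, data).std(ddof=1))
```

The lambdas play the role of the anonymous function handles: the solver only ever sees w, while data and std_max are captured from the surrounding scope.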
Consider also seeing this question.