IPython non-linear least squares with constraint equations - scipy

I am new to IPython and need to solve a specific curve-fitting problem. I understand the concept, but my programming knowledge is still too limited. I have experimental data (x, y) to fit to an equation with four coefficients (a, b, c, d). I would like to fix one of these coefficients (e.g. a) to a specific value and refit my experimental data (non-linear least squares). Coefficients b, c and d are not independent of one another: they are related by a system of equations.
Is it more adequate to use curve_fit or lmfit?
I started this with curve_fit:
import numpy as np
from scipy.optimize import curve_fit
def fitfunc(x, a, b, c, d):
    return a + b*x + c/x + d*np.log10(x)
popt, fitcov = curve_fit(fitfunc, x, y)
or a code like this with lmfit:
import numpy as np
from lmfit import minimize, Parameters, Parameter, report_fit
def fctmin(params, x, y):
    a = params['a'].value
    b = params['b'].value
    c = params['c'].value
    d = params['d'].value
    model = a + b*x + c/x + d*np.log10(x)
    return model - y
#create parameters
params = Parameters()
params.add('a', value = -89)
params.add('b', value =b)
params.add('c', value = c)
params.add('d', value = d)
#fit leastsq model
result = minimize(fctmin, params, args=(x, y))
#calculate results
final = y + result.residual
report_fit(result)

I'll admit to being biased. curve_fit() is designed to simplify scipy.optimize.leastsq() by assuming that you are fitting y(x) data to a model for y(x, parameters), so the function you pass to curve_fit() is one that calculates the model for the values to be fit. lmfit is a bit more general and flexible: your objective function returns the array to be minimized in the least-squares sense, which means it has to return "model - data" instead of just "model".
But, lmfit has features that appear to do exactly what you want: fix one of the parameters in the model without having to rewrite the objective function.
That is, you could say
params.add('a', value = -89, vary=False)
and the parameter 'a' will stay fixed. To do that with curve_fit() you have to rewrite your model function.
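As a rough sketch of what rewriting the model function for curve_fit() could look like, a thin wrapper can pin a and expose only b, c and d. The wrapper name fitfunc_fixed_a, the constant A_FIXED, and the synthetic data are assumptions made for illustration; they are not from the question.

```python
import numpy as np
from scipy.optimize import curve_fit

A_FIXED = -89.0  # value taken from the question's params.add('a', value=-89)

def fitfunc(x, a, b, c, d):
    # Full four-coefficient model from the question.
    return a + b * x + c / x + d * np.log10(x)

def fitfunc_fixed_a(x, b, c, d):
    # Same model with a pinned to A_FIXED, so curve_fit only varies b, c, d.
    return fitfunc(x, A_FIXED, b, c, d)

# Synthetic, noise-free data (hypothetical; stands in for the experimental x, y).
x = np.linspace(1.0, 10.0, 50)
b_true, c_true, d_true = 2.0, 5.0, -3.0
y = fitfunc(x, A_FIXED, b_true, c_true, d_true)

popt, pcov = curve_fit(fitfunc_fixed_a, x, y, p0=[1.0, 1.0, 1.0])
print(popt)  # fitted b, c, d
```

Changing which coefficient is fixed means writing another wrapper, which is the inconvenience that vary=False avoids.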
In addition, you say that b, c and d are related by equations, but don't give details. With lmfit, you might be able to include these equations as constraints. You have
params.add('b', value =b)
params.add('c', value = c)
params.add('d', value = d)
though I don't see a value for b. Even assuming there is a value, this creates three independent variables with the same starting value. You might mean "vary b, and force c and d to have the same value". lmfit can do this with:
params.add('b', value = 10)
params.add('c', expr = 'b')
params.add('d', expr = 'c')
That will have one independent variable, and the value for c will be forced to the value of b (and d to c). You can use (approximately) any valid Python expression as a constraint expression, for example:
params.add('b', value = 10)
params.add('c', expr = 'sqrt(b/10)')
params.add('d', expr = '1-c')
I think that might be the sort of thing you're looking for.
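The same kind of constraint can also be sketched without lmfit by substituting the expressions into the model, so that only b is fitted. This is an illustration with scipy's curve_fit under assumed constraints (c = sqrt(b/10), d = 1 - c, a fixed at -89) and synthetic data, not a replacement for the expr mechanism above; the name constrained_model is hypothetical.

```python
import numpy as np
from scipy.optimize import curve_fit

A_FIXED = -89.0  # a is held fixed, as in the question

def constrained_model(x, b):
    # c and d are derived from b, so b is the only free parameter.
    c = np.sqrt(b / 10.0)   # mirrors expr='sqrt(b/10)'
    d = 1.0 - c             # mirrors expr='1-c'
    return A_FIXED + b * x + c / x + d * np.log10(x)

# Synthetic, noise-free data generated with b = 10 (hypothetical).
x = np.linspace(1.0, 10.0, 50)
y = constrained_model(x, 10.0)

# The lower bound keeps b non-negative so that sqrt(b/10) stays real.
popt, pcov = curve_fit(constrained_model, x, y, p0=[5.0], bounds=(0.0, np.inf))
b_fit = popt[0]
c_fit = np.sqrt(b_fit / 10.0)
d_fit = 1.0 - c_fit
print(b_fit, c_fit, d_fit)
```

The trade-off is that the constraints are now hard-coded in the model function, whereas lmfit keeps them in the parameter definitions.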

Related

Using MATLAB's chi2gof with non-standard user-specified PDFs

I would like to use MATLAB's chi2gof to perform a chi-square goodness-of-fit test. My problem is that my assumed (i.e., theoretical) distribution is not one of the standard built-in probability distributions in MATLAB. The specific form of my desired distribution is:
p = x^a*exp(-b*x^2)
where a and b are constants. There must be a way to use chi2gof for arbitrary PDFs? I have done an exhaustive Google search, but have come up empty-handed.
You can pass chi2gof a handle to a function that takes a single argument (note that the 'cdf' option expects the cumulative distribution function, not the density):
a = ...
b = ...
c = ...
F = @(x)a*exp(-b*x-c*x.^2); % Technically this is an anonymous function
[H,P,STATS] = chi2gof(data,'cdf',F)
Or in special cases:
a = ...
b = ...
c = ...
F = @(x,a,b,c)a*exp(-b*x-c*x.^2);
[H,P,STATS] = chi2gof(data,'cdf',{F,a,b,c})
the last line of which is equivalent to
[H,P,STATS] = chi2gof(data,'cdf',@(x)F(x,a,b,c))
If the parameters a, b, and c are estimated (e.g., using some fitting process), then you should specify the number of estimated parameters to chi2gof. In this case:
[H,P, STATS] = chi2gof(data,'cdf',F,'nparams',3)
Please read the documentation to learn about the other options.

Symbolic integration vs numeric integration in MATLAB

I have an expression with three variables x,y and v. I want to first integrate over v, and so I use int function in MATLAB.
The command that I use is the following:
g = int((1-fxyv)*pv, v, y, +inf)
PS: I haven't given the function fxyv because it is very complicated, so int is taking a long time, and I am afraid that even after waiting it might not find a solution.
I know one option is to integrate numerically, for example with integral. However, the second part of this problem requires me to integrate exp[g(x,y)] over x from 0 to infinity and over y from x to infinity. So I think I can't plug in numerical values of x and y when integrating over v, or can I?
Thanks
Since the question does not contain sufficient detail to attempt analytic integration, this answer focuses on numeric integration.
It is possible to solve these equations numerically. However, because of complex dependencies between the three integrals, it is not possible to simply use integral3. Instead, one has to define functions that compute parts of the expressions using a simple integral, and are themselves fed into other calls of integral. Whether this approach leads to useful results in terms of computation time and precision cannot be answered generally, but depends on the concrete choice of the functions f and p. Fiddling around with precision parameters to the different calls of integral may be necessary.
I assume that the functions f(x, y, v) and p(v) are defined in the form of Matlab functions:
function val = f(x, y, v)
    val = ...
end
function val = p(v)
    val = ...
end
Because of the way they are used later, they have to accept multiple values for v in parallel (as an array) and return as many function values (again as an array, of the same size). x and y can be assumed to always be scalars. A simple example implementation would be val = ones(size(v)) in both cases.
First, let's define a Matlab function g that implements the first equation:
function val = g(x, y)
    val = integral(@gIntegrand, y, inf);
    function val = gIntegrand(v)
        % output must be of the same dimensions as parameter v
        val = (1 - f(x, y, v)) .* p(v);
    end
end
The nested function gIntegrand defines the object of integration, the outer performs the numeric integration that gives the value of g(x, y). Integration is over v, parameters x and y are shared between the outer and the nested function. gIntegrand is written in such a way that it deals with multiple values of v in the form of arrays, provided f and p do so already.
Next, we define the integrand of the outer integral in the second equation. To do so, we need to compute the inner integral, and therefore also have a function for the integrand of the inner integral:
function val = TIntegrandOuter(x)
    val = nan(size(x));
    for i = 1 : numel(x)
        val(i) = integral(@TIntegrandInner, x(i), inf);
    end
    function val = TIntegrandInner(y)
        val = nan(size(y));
        for j = 1 : numel(y)
            val(j) = exp(g(x(i), y(j)));
        end
    end
end
Because both functions are meant to be fed as an argument into integral, they need to be able to deal with multiple values. In this case, this is implemented via an explicit for loop. TIntegrandInner computes exp(g(x, y)) for multiple values of y, but for the fixed value of x that is current in the loop in TIntegrandOuter. This value x(i) plays both the role of a parameter into g(x, y) and of an integration limit. Variables x and i are shared between the outer and the nested function.
Almost there! We have the integrand, only the outermost integration needs to be performed:
T = integral(@TIntegrandOuter, 0, inf);
This is a very convoluted implementation, which is not very elegant, and probably not very efficient. Again, whether results of this approach prove to be useful needs to be tested in practice. However, I don't see any other way to implement these numeric integrations in Matlab in a better way in general. For specific choices of f(x, y, v) and p(v), there might be possible improvements.

Finding an estimated solution of the equation

I have a truncation function defined as:
function f = phi_b(x, b)
    if b == 0
        f = sign(x);
    else
        f = -1 * (x<-b) + 1*(x>b) + (1/b) * x .* ((x>=-b & x<=b));
    end;
It is used to truncate the observations which in my particular case corresponds to white noise:
model = arima('Constant',0,'AR',{0},'Variance',1);
y = simulate(model, 100);
The function I need in the end is:
r = @(b) (1/100) * sum((phi_b(y,b)).^2);
The problem is in finding the solution of the equation r(b)==0.1. Usual procedures like the one below will not work:
solve(r(b)==0.1, b)
Is there any way to solve such types of equations?
If you evaluate r over a vector b of candidate values, you can invoke the min function and see where in the resulting vector the closest value to 0.1 is. Because r as written expects a scalar b, arrayfun is used here to evaluate it element by element. You can do something like:
result = arrayfun(r, b);
[val,index] = min(abs(result - 0.1));
val will contain how close the best-matching element of the vector is to 0.1, and index will tell you where in the result vector this element is. For example, if val = 0.00001 and index = 7, the best value in result is 0.00001 away from 0.1 and is located at index 7. To see the actual value, use result(7) or result(index); the corresponding value of b is b(index).
Interestingly enough, you can use val as a way of measuring the resolution of your data. In other words, if val is very large, this could mean that you need to introduce more values in your vector at a smaller step size. If val is quite small, this could mean that what you originally specified as your b vector is adequate enough. I'm not familiar with the function so I have not considered whether or not there could be no solutions to the data you have provided to your r function.
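For illustration, here is a small NumPy sketch of the grid-search idea (in Python rather than MATLAB). The grid limits, the random seed, and the use of standard normal samples in place of the simulated series are assumptions.

```python
import numpy as np

def phi_b(x, b):
    # Truncation function from the question: sign(x) for b == 0,
    # otherwise x/b clipped to the interval [-1, 1].
    if b == 0:
        return np.sign(x)
    return np.clip(x / b, -1.0, 1.0)

rng = np.random.default_rng(0)
y = rng.standard_normal(100)  # stands in for the simulated white noise

def r(b):
    # Mean of the squared truncated observations, as in the question.
    return np.mean(phi_b(y, b) ** 2)

# Evaluate r on a grid of candidate b values (assumed range and step).
b_grid = np.linspace(0.01, 10.0, 2000)
result = np.array([r(b) for b in b_grid])

index = np.argmin(np.abs(result - 0.1))
val = np.abs(result[index] - 0.1)  # distance of the best grid value from 0.1
b_best = b_grid[index]             # grid point where r(b) is closest to 0.1
print(b_best, result[index], val)
```

If val comes out too large for your purposes, a finer grid around b_best is the natural next step.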

Matlab minimization with fminsearch and parametrized function

I am writing a program in Matlab and I have a function defined this way.
sum (i=1...100) (a*x(i) + b*y(i) + c)
x and y are known, while a, b and c are not: I need to find values for them such that the total value of the function is minimized. There is no additional constraint for the problem.
I thought of using fminsearch to solve this minimization problem, but from Mathworks I get that functions which are suitable inputs for fminsearch are defined like this (an example):
square = @(x) x.^2
So in my case I could use a vector p=[a, b, c] as the value to minimize, but then I don't know how to define the remaining part of the function. As you can see the number of possible values for the index i is huge, so I cannot simply sum everything together explicitly, but I need to represent the summation in some way. If I write the function somewhere else then I am forced to use symbolic calculus for a, b and c (declaring them with syms) and I'm not sure fminsearch would accept that.
What can I do? Of course if fminsearch turns out to be unfeasible for my situation I accept links to use something else.
The most general solution is to use x and y in the definition of the objective function:
>> objfun = @(p) sum( p(1).*x + p(2).*y + p(3) );
>> optp = fminsearch( objfun, po, ... );

how to store symbolically the derivatives in matlab

my question relates to the Symbolic Math Toolbox from Matlab. I have the following code:
syms x x_0 u delta sigma_1
mu = sym ('mu(x)');
sigma_u = sym ('sigma(u)');
sigma = sym ('sigma(x)');
f = int (1/sigma_u, u, x_0, x);
df = subs(diff(f,x))
df_2 = subs(diff (f,x,2))
L = subs(mu*df+1/2*sigma^2*df_2)
The result of L is correct:
L =
mu(x)/sigma(x) - diff(sigma(x), x)/2
However, for further derivations and for simplicity, I would like to define
sigma_1 = sym('diff(sigma,x)');
or in a similar way such as to get as result for
L =
mu(x)/sigma(x) - sigma_1(x)/2
Basically, I would like to store the symbolic expression diff(sigma(x),x) under a name, so that whenever Matlab obtains this result in an expression, it displays the name sigma_1(x) instead of diff(sigma(x),x).
Yes, it is possible: you can use subs(L, 'diff(sigma(x),x)', 'sigma_1(x)'). Note that for the substitution to work, the second input of subs must match exactly what you want to replace; hence it cannot be 'diff(sigma, x)', which lacks the (x) after sigma.
Also note that here is a similar question for which I provided a more complete solution (they asked the question after yours, but I read theirs first).