How to change the step size used to calculate the derivative? - scipy

I am using scipy to optimize with the SLSQP method.
My objective function is computed numerically and is not smooth enough at small scales, so I want to increase the step used to calculate the derivative. I have found this option in the documentation:
finite_diff_rel_step
Without this option, the derivative is calculated like this:
pressure=[370.00000001 420.], energy=417.7700822243337
pressure=[370. 420.00000001], energy=417.7700822283879
The step is 0.00000001.
When I pass an array through this option:
options_opt = {'disp': True,
               'maxiter': 1000,
               'finite_diff_rel_step': [0.1, 0.1]}
res = minimize(self.optimize_solver_c3,
               x0,
               method='SLSQP',
               options=options_opt,
               bounds=bnds,
               constraints=nonliner_dP,
               callback=self.callback_c3)
I get the same step in the output:
pressure=[370.00000001 420. ], energy=417.77008222351986
pressure=[370. 420.00000001], energy=417.77008221408164
Could you show me an example of how to use this option?
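For context, here is a minimal sketch of how these step options are passed to SLSQP, using a toy objective rather than the solver above; finite_diff_rel_step is only honoured by SLSQP in newer SciPy releases, while eps (an absolute step) is the longer-standing option:

# Minimal sketch with a toy objective (assumes a recent SciPy for
# finite_diff_rel_step; eps is the older absolute-step alternative).
import numpy as np
from scipy.optimize import minimize

def energy(p):
    # stand-in for the numerically computed objective
    return (p[0] - 400.0) ** 2 + (p[1] - 410.0) ** 2

x0 = np.array([370.0, 420.0])

# Relative step: the actual step is roughly finite_diff_rel_step * |x|,
# so 0.1 at x = 370 corresponds to a step of about 37.
res_rel = minimize(energy, x0, method='SLSQP',
                   options={'finite_diff_rel_step': [0.1, 0.1], 'maxiter': 1000})

# Absolute step: eps sets the finite-difference step directly.
res_abs = minimize(energy, x0, method='SLSQP',
                   options={'eps': 1e-2, 'maxiter': 1000})

print(res_rel.x, res_abs.x)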

Related

How to use integration in Nonlinear Data-Fitting: Fsumsquares and then fminunc for parameter optimization

I am able to run the code for other equations, but when I introduce an integral the command won't run:
t=dataset_TK1(:,1);
dataset_TK4=xlsread('Akis','Sheet1','AG491:AR725');
y_4=dataset_TK4(:,12);
Kg=1.76717865712934;
N0=1.08E+05;
fun1=@(Z) Z^(-1+(X(1)-Kg)/X(3))*exp(Z);
Ntotal=@(X,t)integral(fun1,X(2)*exp(-X(3)*t),X(2));
X0=[10,10,10];
Fsumsquares=@(X)sum((Ntotal(X,t)-y_4).^2);
opts = optimoptions('fminunc','Algorithm','quasi-newton');
[xunc,ressquared,eflag,outputu] = fminunc(Fsumsquares,X0,opts)
Any suggestions?
Thank you
The MWE (minimal working example) below works, in the sense that it does not throw errors; to obtain it, I substituted your Excel data with dummy arrays.
I limited myself to removing unnecessary handles from the functions in your snippet and adjusting the operators in fun1 to the element-wise .^ and .*.
Also, in the Ntotal integral the limits should be scalars, not vectors. That is why I took only one element out of t:
% dummies, put your data back
t=ones(1,10);
y_4=[1,2,3,4,5,6,7,8,9,10];
X=[11,3,4,6,2,3,55,22,89,6];
Kg=1.76717865712934;
N0=1.08E+05;
fun1=@(Z) Z.^(-1+(X(1)-Kg)/X(3)).*exp(Z); % changed
Ntotal=integral(fun1,X(2)*exp(-X(3)*t(1)),X(2)); % changed
X0=[10,10,10];
Fsumsquares=@(X) sum((Ntotal-y_4).^2); % changed
opts = optimoptions('fminunc','Algorithm','quasi-newton');
[xunc,ressquared,eflag,outputu] = fminunc(Fsumsquares,X0,opts)

Small bug in MATLAB R2017B LogLikelihood after fitnlm?

Background: I am working on a problem similar to the nonlinear logistic regression described in link [1] (my problem is more complicated, but link [1] is enough for the next sections of this post). Comparing my results with those obtained in parallel with an R package, I got similar results for the coefficients, but (very approximately) an opposite logLikelihood.
Hypothesis: The logLikelihood given by fitnlm in Matlab is in fact the negative LogLikelihood. (Note that this consequently impairs the BIC and AIC computed by Matlab.)
Reasoning: in [1], the same problem is solved through two different approaches. ML approach: by defining the negative LogLikelihood and optimizing with fminsearch. GLS approach: by using fitnlm.
The negative LogLikelihood after the ML approach is: 380
The negative LogLikelihood after the GLS approach is: -406
I imagine the second one should at least be multiplied by (-1)?
Questions: Did I miss something? Is the (-1) coefficient enough, or would this simple correction not be enough?
Self-contained code:
%copy-pasting code from [1]
myf = @(beta,x) beta(1)*x./(beta(2) + x);
mymodelfun = @(beta,x) 1./(1 + exp(-myf(beta,x)));
rng(300,'twister');
x = linspace(-1,1,200)';
beta = [10;2];
beta0=[3;3];
mu = mymodelfun(beta,x);
n = 50;
z = binornd(n,mu);
y = z./n;
%ML Approach
mynegloglik = @(beta) -sum(log(binopdf(z,n,mymodelfun(beta,x))));
opts = optimset('fminsearch');
opts.MaxFunEvals = Inf;
opts.MaxIter = 10000;
betaHatML = fminsearch(mynegloglik,beta0,opts)
neglogLH_MLApproach = mynegloglik(betaHatML);
%GLS Approach
wfun = @(xx) n./(xx.*(1-xx));
nlm = fitnlm(x,y,mymodelfun,beta0,'Weights',wfun)
neglogLH_GLSApproach = - nlm.LogLikelihood;
Source:
[1] https://uk.mathworks.com/help/stats/examples/nonlinear-logistic-regression.html
This answer (now) only details which code is used. Please see Tom Lane's answer below for a substantive answer.
Basically, fitnlm.m is a call to NonLinearModel.fit.
When opening NonLinearModel.m, one finds at line 1209:
model.LogLikelihood = getlogLikelihood(model);
getlogLikelihood is itself defined between lines 1234 and 1251.
For instance:
function L = getlogLikelihood(model)
(...)
L = -(model.DFE + model.NumObservations*log(2*pi) + (...) )/2;
(...)
Please also note that this notably impacts ModelCriterion.AIC and ModelCriterion.BIC, as they are computed using model.LogLikelihood ("thinking" it is the logLikelihood).
To get the corresponding formula for BIC/AIC/..., type:
edit classreg.regr.modelutils.modelcriterion
This is Tom from MathWorks. Take another look at the formula quoted:
L = -(model.DFE + model.NumObservations*log(2*pi) + (...) )/2;
Remember the normal distribution has a factor (1/sqrt(2*pi)), so taking logs of that gives us -log(2*pi)/2. So the minus sign comes from that and it is part of the log likelihood. The property value is not the negative log likelihood.
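For reference, this is just the textbook Gaussian log-likelihood (stated here for clarity, not quoted from the MATLAB source); for n observations with residuals e_i and estimated variance \hat{\sigma}^2,

\log L = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log\hat{\sigma}^2 - \frac{1}{2\hat{\sigma}^2}\sum_{i=1}^{n} e_i^2,

so a -\log(2\pi)/2 term appears for every observation even though the quantity is the log-likelihood itself, not its negative.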
One reason for the difference in the two log likelihood values is that the "ML approach" value is computing something based on the discrete probabilities from the binomial distribution. Those are all between 0 and 1, and they add up to 1. The "GLS approach" is computing something based on the probability density of the continuous normal distribution. In this example, the standard deviation of the residuals is about 0.0462. That leads to density values that are much higher than 1 at the peak. So the two things are not really comparable. You would need to convert the normal values to probabilities on the same discrete intervals that correspond to individual outcomes from the binomial distribution.

Bitcoin price prediction using Spark and Scala [duplicate]

I am new to Apache Spark and trying to use the machine learning library to predict some data. My dataset right now is only about 350 points. Here are 7 of those points:
"365","4",41401.387,5330569
"364","3",51517.886,5946290
"363","2",55059.838,6097388
"362","1",43780.977,5304694
"361","7",46447.196,5471836
"360","6",50656.121,5849862
"359","5",44494.476,5460289
Here's my code:
def parsePoint(line):
    split = map(sanitize, line.split(','))
    rev = split.pop(-2)
    return LabeledPoint(rev, split)

def sanitize(value):
    return float(value.strip('"'))
parsedData = textFile.map(parsePoint)
model = LinearRegressionWithSGD.train(parsedData, iterations=10)
print model.predict(parsedData.first().features)
The prediction is something totally crazy, like -6.92840330273e+136. If I don't set iterations in train(), then I get nan as a result. What am I doing wrong? Is it my data set (the size of it, maybe?) or my configuration?
The problem is that LinearRegressionWithSGD uses stochastic gradient descent (SGD) to optimize the weight vector of your linear model. SGD is really sensitive to the provided step size, which is used to update the intermediate solution.
What SGD does is calculate the gradient g of the cost function given a sample of the input points and the current weights w. To update the weights w, you move a certain distance in the direction opposite to g; that distance is your step size s.
w(i+1) = w(i) - s * g
Since you're not providing an explicit step size, MLlib assumes stepSize = 1, which does not seem to work for your use case. I'd recommend trying different step sizes, usually lower values, to see how LinearRegressionWithSGD behaves:
LinearRegressionWithSGD.train(parsedData, iterations=10, step=0.001)
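As a sketch of that recommendation (plain PySpark MLlib; it assumes parsedData is the RDD of LabeledPoint built in the question, and the MSE is only used to compare step sizes against each other):

# Sketch: scan a few step sizes and compare the training MSE.
from pyspark.mllib.regression import LinearRegressionWithSGD

for step in [1.0, 0.1, 0.01, 0.001]:
    model = LinearRegressionWithSGD.train(parsedData, iterations=10, step=step)
    preds = parsedData.map(lambda p: (p.label, model.predict(p.features)))
    mse = preds.map(lambda lp: (lp[0] - lp[1]) ** 2).mean()
    print(step, mse)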

Using matlab fit object as a function

MATLAB's fit is no doubt useful, but it is not clear how to use the resulting fit object as a function,
apart from the trivial integration and differentiation shown on the official website:
http://uk.mathworks.com/help/curvefit/example-differentiating-and-integrating-a-fit.html
For example, given a fit stored in the object 'curve', one can evaluate
curve(x) to get a number. But how would one, e.g., integrate |curve(x)|^2 (apart from clumsily creating a new fit)? Trying naively
curve = fit(x_vals,y_vals,'smoothingspline');
integral(curve(x)*curve(x), 0, 1)
gives an error:
Output of the function must be the same size as the input. If FUN is an array-valued integrand, set the 'ArrayValued' option to true.
I have also tried a workaround by defining a normal function and an implicit function for the integrand (below), but both give the same error.
func=@(x)(curve(x))...; % trial solution 1
function func_val=func(curve, x)...; % trial solution 2
Defining a function for the integrand and then integrating with the option 'ArrayValued' set to true works:
func=@(x)(curve(x)*curve(x));
integral(func,0,1,'ArrayValued',true)
You need to have the function vectorized, i.e use element-wise operations like curve(x).*curve(x) or curve(x).^2.
Also make sure that the shape of the output matches the input, i.e a row input gives a row output, similarly a column comes out as a column. It seems that evaluating the fit object always returns a column vector (e.g. f(1:10) returns a 10x1 vector not 1x10).
With that said, here is an example:
x = linspace(0,4*pi,100)';
y = sin(x);
y = y + 0.5*y.*randn(size(y));
f = fit(x, y, 'smoothingspline');
now you can integrate as:
integral(@(x) reshape(f(x).^2,size(x)), 0, 1)
in this case, it can be simplified as a simple transpose:
integral(@(x) (f(x).^2)', 0, 1)

Multi-class regression in nolearn?

I'm trying to build a Neural Network using nolearn that can do regression on multiple classes.
For example:
net = NeuralNet(layers=layers_s,
                input_shape=(None, 2048),
                l1_num_units=8000,
                l2_num_units=4000,
                l3_num_units=2000,
                l4_num_units=1000,
                d1_p=0.25,
                d2_p=0.25,
                d3_p=0.25,
                d4_p=0.1,
                output_num_units=noutput,
                output_nonlinearity=None,
                regression=True,
                objective_loss_function=lasagne.objectives.squared_error,
                update_learning_rate=theano.shared(float32(0.1)),
                update_momentum=theano.shared(float32(0.8)),
                on_epoch_finished=[
                    AdjustVariable('update_learning_rate', start=0.1, stop=0.001),
                    AdjustVariable('update_momentum', start=0.8, stop=0.999),
                    EarlyStopping(patience=200),
                ],
                verbose=1,
                max_epochs=1000)
noutput is the number of classes for which I want to do regression; if I set it to 1, everything works. When I use 26 (the number of classes here) as output_num_units, I get a Theano dimension error: (dimension mismatch in args to gemm (128,1000)x(1000,26)->(128,1))
The Y labels are continuous variables, each corresponding to a class. I tried to reshape the Y labels to (rows, classes), but that means I have to give many of the Y labels a value of 0 (because the value for that class is unknown). Is there any way to do this without setting some Y labels to 0?
If you want to do multiclass (or multilabel) regression with 26 classes, your output must not have shape (1082,) but (1082, 26). To preprocess your output, you can use sklearn.preprocessing.label_binarize,
which will transform your 1D output into a 2D output.
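A minimal sketch of that preprocessing step (the 26 class ids and the example targets below are placeholders, not the asker's data):

# Sketch: turn 1-D class ids into an (n_samples, 26) indicator matrix.
import numpy as np
from sklearn.preprocessing import label_binarize

y = np.array([3, 0, 25, 7])                   # placeholder 1-D targets
Y = label_binarize(y, classes=np.arange(26))  # shape (4, 26), one column per class
print(Y.shape)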
Also, your output nonlinearity should be a softmax function, so that the rows of your output sum to 1.