trying to understand scipy's optimize.minimize function, getting indexerror - scipy

I'm trying to write code that will optimize a multivariate function using scipy's optimize.minimize function, but it keeps returning an IndexError, and I'm not sure where to go from here.
The code is this:
revcoeff = coefficients[::-1]
xdot = np.zeros(0)
normfeat1 = normfeat1.reshape(-1,1)
xdot = np.append(normfeat1, normfeat2.reshape(-1,1), axis=1)
a = revcoeff[1:3]
b = xdot[0, :]
seed = np.zeros(5) #does seed need to be the coefficients? not sure
fun = lambda x: np.multiply((1/666), np.power(np.sum(np.dot(a, xdot[x, :])-medianv[x]),2)) #costfunction
optsol = optimize.minimize(fun, seed)
where there are two features I'm using in my nearest neighbors algorithm. Coefficients for the fitted regression model are given into the array "coefficients".
What I'm having trouble understanding is 1) why my code is throwing "IndexError: arrays used as indices must be of integer or boolean type", and 2) the optimize.minimize function itself, which partially confuses me. It takes two inputs, the function and x0 (an ndarray with initial guesses). What should x0 be, the coefficient values? Or do I pick random values, and how many are necessary?

np.zeros() does not return integers by default. Try, for example, np.zeros(5, dtype=int) instead. It won't solve all of the problems with your code, though. You'll see some other error message.
Also, note that under Python 2 integer division 1/666 returns 0 instead of roughly 0.0015015. You probably want 1/666.0.
It would also help if you cleaned up your code, since half of it serves no purpose.

Related

Solve a matrix-valued differential equation in Matlab

I am trying to solve a particular system of ODEs, dF/dt = A*F, F_initial = eye(9). Being a Matlab novice, I am trying to somehow use the built-in ode45 function, and I found useful advice online. However, all of it assumes A to be constant, while in my case the matrix A is a function of t; in other words, A changes with each time step.
I have solved for A separately and stored it in a 9x9xN array (because my grid is t = 0:dt:2, N=2/dt is the number of time steps, and A(:,:,i) corresponds to its value at the i-th time step). But I can't feed this array into ode45 to eventually solve my ODE.
Any help is welcome, and please tell me if I have missed anything important while explaining my problem. Thank you
First of all, F must be a column vector when using ode45. You won't ever get a result by setting F_initial = eye(9); you'd need something like F_initial = ones(9,1).
Also, ode45 (documentation here, check tspan section) doesn't necessarily evaluate your function at the timesteps that you give it, so you can't pre-compute the A matrix. Here I'm going to assume that F is a column vector and A is a matrix which acts on it, which can be computed each timestep. If this is the case, then we can just include A in the function passed to ode45, like so:
F_initial = ones(9,1);
dt = 0.01;
tspan = 0:dt:2;
[t, F] = ode45(@(t,F) foo(t, F, Ainput), tspan, F_initial);
function f = foo(t, F, Ainput)
A = calculate_A(t, Ainput);
f = A*F;
end
function A = calculate_A(t, Ainput)
%some logic, calculate A based on inputs and timestep
A = ones(9,9)*sqrt(t)*Ainput;
end
The @(x) f(x,y) syntax basically creates a new anonymous function which allows you to treat y as a constant in the calculation.
Hope this is helpful, let me know if I've misunderstood something or if you have other questions.
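If you do want to reuse the A values you have already computed, one option (a sketch, not from the original answer; it assumes your 9x9xN array, here called Astored, was sampled on the grid tgrid = 0:dt:2 from the question) is to interpolate between the stored samples inside calculate_A:
function A = calculate_A(t, Astored, tgrid)
% Linearly interpolate the precomputed A samples at the time ode45 requests.
idx = find(tgrid <= t, 1, 'last');
idx = min(idx, numel(tgrid) - 1);   % stay inside the array when t == tgrid(end)
w = (t - tgrid(idx)) / (tgrid(idx+1) - tgrid(idx));
A = (1 - w) * Astored(:,:,idx) + w * Astored(:,:,idx+1);
end
and call it as ode45(@(t,F) calculate_A(t, Astored, tgrid)*F, tspan, F_initial). Linear interpolation is only an approximation of the true A(t), so check that dt is small enough for your problem.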

Speeding up matlab for loop

I have a system of 5 ODEs with nonlinear terms involved. I am trying to vary 3 parameters over some ranges to see which parameter combinations produce the behaviour I am looking for.
The issue is that I have written the code with 3 for loops and it takes a very long time to get the output.
I am also storing the parameter values within the loops whenever a parameter set satisfies an ODE event.
This is how I have implemented it in MATLAB.
function [m,cVal,x,y]=parameters()
b=5000;
q=0;
r=10^4;
s=0;
n=10^-8;
time=3000;
m=[];
cVal=[];
x=[];
y=[];
val1=0.1:0.01:5;
val2=0.1:0.2:8;
val3=10^-13:10^-14:10^-11;
for i=1:length(val1)
for j=1:length(val2)
for k=1:length(val3)
options = odeset('AbsTol',1e-15,'RelTol',1e-13,'Events',@eventfunction);
[t,y,te,ye]=ode45(@(t,y)systemFunc(t,y,[val1(i),val2(j),val3(k)]),0:time,[b,q,s,r,n],options);
if length(te)==1
m=[m;val1(i)];
cVal=[cVal;val2(j)];
x=[x;val3(k)];
y=[y;ye(1)];
end
end
end
end
Is there any other way that I can use to speed up this process?
Profile viewer results
I have written the system of ODEs simply, with a format like
function s=systemFunc(t,y,p)
s= zeros(2,1);
s(1)=f*y(1)*(1-(y(1)/k))-p(1)*y(2)*y(1)/(p(2)*y(2)+y(1));
s(2)=p(3)*y(1)-d*y(2);
end
f,d,k are constant parameters.
The equations are more complicated than what's here, as it's a system of 5 ODEs with lots of nonlinear terms interacting with each other.
Tommaso is right. Preallocating will save some time.
But I would guess that there is fundamentally not a lot you can do since you are running ode45 in a loop. ode45 itself may be the bottleneck.
I would suggest you profile your code to see where the bottleneck is:
profile on
parameters(... )
profile viewer
I would guess that ode45 is the problem. Probably you will find that you should actually focus your time on optimizing the systemFunc code for performance. But you won't know that until you run the profiler.
EDIT
Based on the profiler output and the additional code, I see some things that will help.
It seems like packing your parameter values into a vector is hurting you. Instead of
@(t,y)systemFunc(t,y,[val1(i),val2(j),val3(k)])
try
@(t,y)systemFunc(t,y,val1(i),val2(j),val3(k))
where your system function is defined as
function s=systemFunc(t,y,p1,p2,p3)
s= zeros(2,1);
s(1)=f*y(1)*(1-(y(1)/k))-p1*y(2)*y(1)/(p2*y(2)+y(1));
s(2)=p3*y(1)-d*y(2);
end
Next, note that you don't have to preallocate space in systemFunc; just build the output directly:
function s=systemFunc(t,y,p1,p2,p3)
s = [ f*y(1)*(1-(y(1)/k))-p1*y(2)*y(1)/(p2*y(2)+y(1));
p3*y(1)-d*y(2) ];
end
Finally, note that ode45 is internally taking about 1/3 of your runtime. There is not much you will be able to do about that. If you can live with it, I would suggest increasing your 'AbsTol' and 'RelTol' to more reasonable numbers. Those values are really small, and are making ode45 run for a really long time. If you can live with it, try increasing them to something like 1e-6 or 1e-8 and see how much the performance increases. Alternatively, depending on how smooth your function is, you might be able to do better with a different integrator (like ode23). But your mileage will vary based on how smooth your problem is.
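For example, a minimal sketch of relaxed tolerances (the values here are purely illustrative, reusing the event function from the question):
options = odeset('AbsTol', 1e-8, 'RelTol', 1e-6, 'Events', @eventfunction);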
I have two suggestions for you.
Preallocate the vectors in which you store your results, and use an increasing index to populate them at each iteration.
Since the options you use are always the same, instantiate them outside the loop, only once.
Final code:
function [m,cVal,x,y] = parameters()
b = 5000;
q = 0;
r = 10^4;
s = 0;
n = 10^-8;
time = 3000;
options = odeset('AbsTol',1e-15,'RelTol',1e-13,'Events',@eventfunction);
val1 = 0.1:0.01:5;
val1_len = numel(val1);
val2 = 0.1:0.2:8;
val2_len = numel(val2);
val3 = 10^-13:10^-14:10^-11;
val3_len = numel(val3);
total_len = val1_len * val2_len * val3_len;
m = NaN(total_len,1);
cVal = NaN(total_len,1);
x = NaN(total_len,1);
y = NaN(total_len,1);
res_offset = 1;
for i = 1:val1_len
for j = 1:val2_len
for k = 1:val3_len
[~, ~, te, ye] = ode45(@(t,y)systemFunc(t,y,[val1(i),val2(j),val3(k)]),0:time,[b,q,s,r,n],options); % discard the solver outputs t and y so the preallocated result vector y is not overwritten
if (length(te) == 1)
m(res_offset) = val1(i);
cVal(res_offset) = val2(j);
x(res_offset) = val3(k);
y(res_offset) = ye(1);
end
res_offset = res_offset + 1;
end
end
end
end
If you only want to preserve result values that have been correctly computed, you can remove the rows containing NaNs at the bottom of your function. Indexing on one of the vectors will be enough to clear everything:
rows_ok = ~isnan(y);
m = m(rows_ok);
cVal = cVal(rows_ok);
x = x(rows_ok);
y = y(rows_ok);
Following on from the other suggestions, I have two more for you:
You might want to try a different solver: ode45 is for non-stiff problems, but from the looks of it your problem could be stiff (the parameters have very different orders of magnitude). Try, for instance, the ode23s method.
Secondly, without knowing which event you are looking for, it may be possible to use a logarithmic search rather than a linear one, e.g. the bisection method (see the sketch below). This will severely cut down on the number of times you have to solve the equations.
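A rough sketch of what a bisection over a single parameter could look like. This is not from the original answer: eventFired(p) is a hypothetical helper that runs ode45 for one parameter value p and returns true if the event of interest occurred, and the sketch assumes the event behaviour changes monotonically with p so that a single threshold exists.
lo = 0.1;  hi = 5;               % search interval for the parameter
for iter = 1:30                  % ~30 halvings narrow the interval by a factor of 2^30
    mid = (lo + hi) / 2;
    if eventFired(mid)           % hypothetical helper, see above
        hi = mid;                % event occurs: threshold is at or below mid
    else
        lo = mid;                % event does not occur: threshold is above mid
    end
end
threshold = (lo + hi) / 2;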

Using matlab fit object as a function

Matlab's fit is no doubt useful, but it is not clear how to use the resulting fit object as a function,
apart from the trivial integration and differentiation shown on the official website:
http://uk.mathworks.com/help/curvefit/example-differentiating-and-integrating-a-fit.html
For example given a fit stored in the object 'curve' one can evaluate
curve(x) to get a number. But how would one, e.g., integrate |curve(x)|^2 (apart from clumsily creating a new fit)? Trying naively
curve = fit(x_vals,y_vals,'smoothingspline');
integral(curve(x)*curve(x), 0, 1)
gives an error:
Output of the function must be the same size as the input. If FUN is an array-valued integrand, set the 'ArrayValued' option to true.
I have also tried a workaround by defining a normal function and an anonymous function for the integrand (below), but both give the same error.
func=@(x)(curve(x))...; % trial solution 1
function func_val=func(curve, x)...; % trial solution 2
Defining a function for the integrand and then integrating with the option 'ArrayValued' set to true works:
func=@(x)(curve(x)*curve(x));
integral(func,0,1,'ArrayValued',true)
You need to have the function vectorized, i.e. use element-wise operations like curve(x).*curve(x) or curve(x).^2.
Also make sure that the shape of the output matches the input, i.e. a row input gives a row output and a column input gives a column output. It seems that evaluating the fit object always returns a column vector (e.g. f(1:10) returns a 10x1 vector, not 1x10).
With that said, here is an example:
x = linspace(0,4*pi,100)';
y = sin(x);
y = y + 0.5*y.*randn(size(y));
f = fit(x, y, 'smoothingspline');
Now you can integrate as:
integral(@(x) reshape(f(x).^2,size(x)), 0, 1)
In this case it can be simplified to a simple transpose:
integral(@(x) (f(x).^2)', 0, 1)
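Alternatively, the 'ArrayValued' route from the question also works with this fit; integral then calls the integrand one scalar x at a time, so no reshaping or transposing is needed, at the cost of more function evaluations:
integral(@(x) f(x).^2, 0, 1, 'ArrayValued', true)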

Make this matlab code run without a for loop

Let's say that I have an array x with some data values.
I performed a clustering algorithm that has produced a label map with the label names - labelMap. Each point in the data now has a unique cluster label associated to it.
I then perform the function foo(subset, secondArg) over each subset. The function foo returns a new array of the same size as the argument it is given (it's a map() function which also receives a second argument).
What follows is the current implementation:
x = rand(1,1000);
numClusters = 3; % specified in advance by the user, for example using a clustering algorithm such as K-Means, this is a given.
fooSecondArg = [1,2,3]; % second argument for foo().
labelMap = kmeans(x,numClusters);
res = zeros(size(x));
%% make me run without a for loop :)
for ind = 1:numClusters
res(labelMap == ind) = foo(x(labelMap == ind), fooSecondArg(ind));
end
My question is as follows:
As there is no overlap between the indices in x that foo() acts upon, is there a way to perform foo over x without using a for or a parfor loop? (I don't want to use a parfor loop, as it takes quite a while to start the additional MATLAB processes and I cannot later deploy it as a standalone application.)
Many thanks!
Edit:
I was asked for an example of foo() because I was told the solution may depend on it. A good viable example you may use in your answer:
function out = foo(x,secondArg)
out = x.^2/secondArg;
Whether the loop can be removed or not depends on the function you have. In your case the function is
function out = foo(x,secondArg)
out = x.^2/secondArg;
where secondArg depends on the cluster.
For this function (and for similar ones) you can indeed eliminate the loop as follows:
res = x.^2 ./ fooSecondArg(labelMap);
Here
labelMap is the same size as x (it indicates the cluster label for each entry of x);
fooSecondArg is a vector of length equal to the number of clusters (it indicates the second argument for each cluster).
Thus fooSecondArg(labelMap) is an array the same size as x, which for each point gives the second argument corresponding to that point's cluster. The operator ./ performs element-wise division to produce the desired result.
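The same trick generalizes (a sketch, assuming foo uses element-wise operators such as .* and ./ throughout its body, and that labelMap has the same shape as x):
secondArgs = fooSecondArg(labelMap);   % per-element second argument, same size as x
res = foo(x, secondArgs);              % e.g. out = x.^2 ./ secondArg inside foo (note ./ rather than /)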

Octave fminsearch: Problems with minimization and options

I am trying to use Octave's fminsearch function, which I have used in MATLAB before. The function seems insufficiently documented (for me at least), and I have no idea how to set the options so that it actually minimizes.
I tried fitting a very simple exponential function using the code at the end of this message. I want the following:
I want the function to take as input the x- and y-values, just like MATLAB would do. Furthermore, I want some control over the options, to make sure that it actually minimizes (i.e. to a minimum!).
Of course, in the end I want to fit functions that are more complicated than exponential, but I want to be able to fit exponentials at least.
I have several problems with fminsearch:
I tried handing the x- and y-values over to the function, but a MATLAB-style call like this:
[xx,fval]=fminsearch(@exponential,[1000 1],x,y);
or
[xx,fval]=fminsearch(@exponential,[33000 1],options,x,y)
produces errors:
error: options(6) does not correspond to known algorithm
error: called from:
error: /opt/local/share/octave/packages/optim-1.0.6/fmins.m at line 72, column 16
error: /opt/local/share/octave/packages/optim-1.0.6/fminsearch.m at line 29, column 4
Or, respectively (for the second case above):
error: `x' undefined near line 4 column 3
error: called from:
error: /Users/paul/exponential.m at line 4, column 2
error: /opt/local/share/octave/packages/optim-1.0.6/nmsmax.m at line 63, column 6
error: /opt/local/share/octave/packages/optim-1.0.6/fmins.m at line 77, column 9
error: /opt/local/share/octave/packages/optim-1.0.6/fminsearch.m at line 29, column 4
Apparently, the order of arguments that fminsearch takes is different from the one in MATLAB. So what is the correct order?
How can I make fminsearch take values and options?
I found a workaround to the problem that the function would not take values: I defined the x- and y values as global. Not elegant, but at least then the values are available in the function.
Nonetheless, fminsearch does not minimize properly.
This is shown below:
Here is the function:
function f=exponential(coeff)
global x
global y
X=x;
Y=y;
a= coeff(1);
b= coeff(2);
Y_fun = a .* exp(-X.*b);
DIFF = Y_fun - Y;
SQ_DIFF = DIFF.^2;
f=sum(SQ_DIFF);
end
Here is the code:
global x
global y
x=[0:1:200];
y=4930*exp(-0.0454*x);
options(10)=10000000;
[cc,fval]=fminsearch(@exponential,[5000 0.01])
This is the output:
cc =
4930.0 5184.6
fval = 2.5571e+08
Why does fminsearch not find the solution?
There is an fminsearch implementation in the octave-forge package "optim".
You can see in its implementation file that the third parameter is always an options vector, the fourth is always a grad vector, so your ,x,y invocations will not work.
You can also see in the implementation that it calls an fmins implementation.
The documentation of that fmins implementation states:
if options(6)==0 && options(5)==0 - regular simplex
if options(6)==0 && options(5)==1 - right-angled simplex
Comment: the default is set to "right-angled simplex".
this works better for me on a broad range of problems,
although the default in nmsmax is "regular simplex"
A recent problem of mine would solve fine with MATLAB's fminsearch, but not with this octave-forge implementation. I had to specify an options vector [0 1e-3 0 0 0 0] to have it use a regular simplex instead of a 'right-angled simplex'. The Octave default makes no sense if your coefficients differ vastly in scale.
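A sketch of that call for the question's problem, based on the argument order described above (the options vector is the third argument); the exact layout of the options vector may differ between optim versions, so check the fmins documentation of your install:
opts = [0 1e-3 0 0 0 0];   % options(5) = 0 and options(6) = 0 -> regular simplex
[cc, fval] = fminsearch(@exponential, [5000 0.01], opts)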
The optimization function fminsearch will always try to find a minimum, no matter what the options are. So if it is not finding a minimum, that is because it failed to converge, not because of the options.
From the code you provide, I cannot determine what goes wrong. The solution with the globals should work, and indeed does work over here, so something else on your side must be going awry. (NOTE: I do use MATLAB, not Octave, so those two functions could be slightly different...)
Anyway, why not do it like this?
function f = exponential(coeff)
x = 0:1:200;
y = 4930*exp(-0.0454*x);
a = coeff(1);
b = coeff(2);
Y_fun = a .* exp(-x.*b);
f = sum((Y_fun-y).^2);
end
Or, if you must pass x and y as external parameters,
x = [0:1:200];
y = 4930*exp(-0.0454*x);
[cc,fval] = fminsearch(@(c)exponential(c,x,y),[5000 0.01])
function f = exponential(coeff,x,y)
a = coeff(1);
b = coeff(2);
Y_fun = a .* exp(-x.*b);
f = sum((Y_fun-y).^2);
end