What does 'Constrviolation' physically mean in Matlab's fmincon output? - matlab

I am using the fmincon function in Matlab. I have been trying to figure out what 'constrviolation' means when you run the function and call output. When you get infeasible solution or the solver end prematurely, you get a non-zero (& non-integer) constrviolation.
I put in a screen shot for reference.
I have searched the documentation and it says it means "Maximum of constraint functions" and I have no idea what that means. It's not an integer number so my first guess was that it is the percentage of constraints violated (or satisfied).
Any help would be appreciated.

Just interpreting the docs given some optimization-background:
constrviolation
Maximum of constraint functions
This is just the maximum of all absolute constraint-function errors
Example:
x0 + x1 = 1
x0 + x1 + x2 = 2
Somehow the solution is:
x = [0.6, 0.5, 0.9]
constrviolation is:
max( abs( 0.6 + 0.5 - 1 ), abs( 0.6 + 0.5 + 0.9 - 2 ) ) = max( 0.1, 0 ) = 0.1
This is bad and technically means: your solution is infeasible! (should converge to zero; e.g. 1e-8)
As the solver did not end very gracefully, it can't give you a real status about the problem (feasible vs. infeasible).
It might be valuable to add: Interior-point algorithms (like used here) might (some do, some don't) iterate through infeasible solutions and only finally converge to a feasible one (if existing)!
Also bad:
firstorderopt
Measure of first-order optimality
Should converge to zero too (e.g. 1e-8)! Not achieved in your example!
Now there are many possible reasons why this is happening. As you did not provide any code, we only can guess (and won't be happy about it).
You probably hit some iteration-limit like MaxFunctionEvaluations or MaxIterations. The ratio of funcCount and iterations look like numerical-differentiation, which can push the number of functions calls a lot!

Related

While loop does not stop - calculating armstrong number [duplicate]

Obviously, float comparison is always tricky. I have a lot of assert-check in my (scientific) code, so very often I have to check for equality of sums to one, and similar issues.
Is there a quick-and easy / best-practice way of performing those checks?
The easiest way I can think of is to build a custom function for fixed tolerance float comparison, but that seems quite ugly to me. I'd prefer a built-in solution, or at least something that is extremely clear and straightforward to understand.
I think it's most likely going to have to be a function you write yourself. I use three things pretty constantly for running computational vector tests so to speak:
Maximum absolute error
return max(abs(result(:) - expected(:))) < tolerance
This calculates maximum absolute error point-wise and tells you whether that's less than some tolerance.
Maximum excessive error count
return sum( (abs(result(:) - expected(:))) < tolerance )
This returns the number of points that fall outside your tolerance range. It's also easy to modify to return percentage.
Root mean squared error
return norm(result(:) - expected(:)) < rmsTolerance
Since these and many other criteria exist for comparing arrays of floats, I would suggest writing a function which would accept the calculation result, the expected result, the tolerance and the comparison method. This way you can make your checks very compact, and it's going to be much less ugly than trying to explain what it is that you're doing in comments.
Any fixed tolerance will fail if you put in very large or very small numbers, simplest solution is to use eps to get the double precision:
abs(A-B)<eps(A)*4
The 4 is a totally arbitrary number, which is sufficient in most cases.
Don't know any special build in solution. Maybe something with using eps function?
For example as you probably know this will give False (i.e. 0) as a result:
>> 0.1 + 0.1 + 0.1 == 0.3
ans =
0
But with eps you could do the following and the result is as expected:
>> (0.1+0.1+0.1) - 0.3 < eps
ans =
1
I have had good experience with xUnit, a unit test framework for Matlab. After installing it, you can use:
assertVectorsAlmostEqual(a,b) (checks for normwise closeness between vectors; configurable absolute/relative tolerance and sane defaults)
assertElementsAlmostEqual(a,b) (same check, but elementwise on every single entry -- so [1 1e-12] and [1 -1e-9] will compare equal with the former but not with the latter).
They are well-tested, fast to use and clear enough to read. The function names are quite long, but with any decent editor (or the Matlab one) you can write them as assertV<tab>.
For those who understand both MATLAB and Python (NumPy), it would maybe be useful to check the code of the following Python functions, which do the job:
numpy.allclose(a, b, rtol=1e-05, atol=1e-08)
numpy.isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)

Efficient size choice for SciPy Discrete Sine Transform

I noticed that SciPy has an implementation of the Discrete Sine Transform, and I was comparing it to the one that's in MATLAB. The MATLAB documentation notes that for best performance, the size of the inputs should be 2^p -1, presumably for a divide and conquer strategy. Is this also true for the SciPy implementation?
Although this question is old, I happen to have just ran some tests and then stumbled upon this question.
The answer is yes. Internally, scipy seems to converts the array to size M = 2*(N+1).
Ideally, M = 2^i, for some integer i. Therefore, N should follow N = 2^i - 1. The following picture shows how timings scale with fft-size. Note that the orange line is much smoother, indicating no unexpected memory overhead.
Green line: N = 2^i
Blue line: N = 2^i + 1
Orange line: N = 2^i - 1
UPDATE
After digging some more into the documentation of scipy.fftpack, I found that the above answer is only partly true. According to the documentation, "SciPy’s FFTPACK has efficient functions for radix {2, 3, 4, 5}". This means that instead of efficiently doing arrays of size M = 2^i, it can handle any M = 2^i * 3^j * 5^k (4 is not a prime). The optimum for scipy.fftpack.dst (or dct) is then M - 1. Finding those numbers can be a little awkward, but luckily there's a function for that, too!
Please note that the above graph is log-log scale, so speedups of 40 or so are not uncommon. Thus, choosing a fast size can make you calculations orders of magnitudes faster! (I found this out the hard way).

Trying to find the point when two functions differ from each other a 5% with Matlab

As I've written in the title, I'm trying to find the exact distance (dimensionless distance in this case) when two functions start to differ from each other a 5% of the Y-axis. The two functions intersect at the value of 1 in the X-axis and I need to find the described distance before the intersection, not after (i.e., it must be less than 1). I've written a Matlab code for you to see the shape of the functions and the following calculations which I'm trying to make them work but they don't, I don't know why. "Explicit solution could not be found".
I don't know if I explained it clearly. Please let me know if you need a more detailed explanation.
I hope you can throw some light in this issue.
Thank you so much in advanced.
r=0:0.001:1.2;
ro=0.335;
rt=r./ro;
De=0.3534;
k=2.8552;
B=(2*k/De)^0.5;
Fm=2.*De.*B.*ro.*[1-exp(B.*ro.*(1-rt))].*exp(B.*ro.*(1-rt));
A=5;
b=2.2347;
C=167.4692;
Ftt=(C.*(exp(-b.*rt).*((b.^6.*rt.^5)./120 + (b.^5.*rt.^4)./24 + (b.^4.*rt.^3)./6 + (b.^3.*rt.^2)./2 + b.^2.*rt + b) - b.*exp(-b.*rt).*((b.^6.*rt.^6)./720 + (b.^5.*rt.^5)./120 + (b.^4.*rt.^4)./24 + (b.^3.*rt.^3)./6 + (b.^2.*rt.^2)./2 + b.*rt + 1)))./rt.^6 - (6.*C.*(exp(-b.*rt).*((b.^6.*rt.^6)./720 + (b.^5.*rt.^5)./120 + (b.^4.*rt.^4)./24 + (b.^3.*rt.^3)./6 + (b.^2.*rt.^2)./2 + b.*rt + 1) - 1))./rt.^7 - A.*b.*exp(-b.*rt);
plot(rt,-Fm,'red')
axis([0 2 -1 3])
xlabel('Dimensionless distance')
ylabel('Force, -dU/dr')
hold on
plot(rt,-Ftt,'green')
clear rt
syms rt
%assume(0<rt<1)
r1=solve((Fm-Ftt)/Ftt==0.05,rt)
r2=solve((Ftt-Fm)/Fm==0.05,rt)
Welcome to the crux of floating point data. The reason why is because for the values of r that you are providing, the exact solution of 0.05 may be in between two of the values in your r array and so you won't be able to get an exact solution. Also, FWIW, your equation may never generate a solution of 0.05, which is why you're getting that error too. Either way, doing that explicit solve on floating point data is never recommended, unless you know very well how your data are shaped and what values you expect for the output of the function you're applying the data to.
As such, it's always recommended that you find the nearest value that satisfies your condition. As such, you should do something like this:
[~,ind] = min(abs((Fm-Ftt)./Ftt - 0.05));
r1 = r(ind);
The first line will find the nearest location in your r array that satisfies the 5% criterion. The next line of code will then give you the value that is in your r array that satisfies this. You can do the same with r2 by:
[~,ind2] = min(abs((Ftt-Fm)./Fm - 0.05));
r2 = r(ind2);
What the above code is basically doing is that it is trying to find at what point in your array would the difference between your data and 5% be 0. In other words, which point in your r array would be close enough to make the above relation equal to 0, or essentially when it is as close to 5% as possible.
If you want to improve this, you can always change the step size of r... perhaps make it 0.00001 or something. However, the smaller the step size, the larger your array and you'll eventually run out of memory!

Optimization with discrete parameters in Matlab

I have 12 sets of vectors (about 10-20 vectors each) and i want to pick one vector of each set so that a function f that takes the sum of these vectors as argument is maximized. In addition i have constraints for some components of that sum.
Example:
a_1 = [3 2 0 5], a_2 = [3 0 0 2], a_3 = [6 0 1 1], ... , a_20 = [2 12 4 3]
b_1 = [4 0 4 -2], b_2 = [0 0 1 0], b_3 = [2 0 0 4], ... , b_16 = [0 9 2 3]
...
l_1 = [4 0 2 0], l_2 = [0 1 -2 0], l_3 = [4 4 0 1], ... , l_19 = [3 0 9 0]
s = [s_1 s_2 s_3 s_4] = a_x + b_y + ... + l_z
Constraints:
s_1 > 40
s_2 < 100
s_4 > -20
Target: Chose x, y, ... , z to maximize f(s):
f(s) -> max
Where f is a nonlinear function that takes the vector s and returns a scalar.
Bruteforcing takes too long because there are about 5.9 trillion combinations, and since i need the maximum (or even better the top 10 combinations) i can not use any of the greedy algorithms that came to my mind.
The vectors are quite sparse, about 70-90% are zeros. If that is helping somehow ...?
The Matlab Optimization toolbox didnt help either since it doesnt much support for discrete optimization.
Basically this is a lock-picking problem, where the lock's pins have 20 distinct positions, and there are 12 pins. Also:
some of the pin's positions will be blocked, depending on the positions of all the other pins.
Depending on the specifics of the lock, there may be multiple keys that fit
...interesting!
Based on Rasman's approach and Phpdna's comment, and the assumption that you are using int8 as data type, under the given constraints there are
>> d = double(intmax('int8'));
>> (d-40) * (d+100) * (d+20) * 2*d
ans =
737388162
possible vectors s (give or take a few, haven't thought about +1's etc.). ~740 million evaluations of your relatively simple f(s) shouldn't take more than 2 seconds, and having found all s that maximize f(s), you are left with the problem of finding linear combinations in your vector set that add up to one of those solutions s.
Of course, this finding of combinations is no easy feat, and the whole method breaks down anyway if you are dealing with
int16: ans = 2.311325368800510e+018
int32: ans = 4.253529737045237e+037
int64: ans = 1.447401115466452e+076
So, I'll discuss a more direct and more general approach here.
Since we're talking integers and a fairly large search space, I'd suggest using a branch-and-bound algorithm. But unlike the bintprog algorithm, you'd have to use different branching strategies, and of course, these should be based on a non-linear objective function.
Unfortunately, there is nothing like this in the optimization toolbox (or the File Exchange as far as I could find). fmincon is a no-go, since it uses gradient and Hessian information (which will usually be all-zero for integers), and fminsearch is a no-go, since you'll need a really good initial estimate, and the rate of convergence is (roughly) O(N), meaning, for this 20-dimensional problem you'll have to wait quite long before convergence, without the guarantee of having found the global solution.
An interval method could be a possibility, however, I personally have very little experience with this. There is no native interval-related stuff in MATLAB or any of its toolboxes, but there's the freely available INTLAB.
So, if you're not feeling like implementing your own non-linear binary integer programming algorithm, or are not in the mood for an adventure with INTLAB, there's really only one thing left: heuristic methods. In this link there is a similar situation, with an outline of the solution: use the genetic algorithm (ga) from the Global Optimization toolbox.
I would implement the problem roughly like so:
function [sol, fval, exitflag] = bintprog_nonlinear()
%// insert your data here
%// Any sparsity you may have here will only make this more
%// *memory* efficient, not *computationally*
data = [...
... %// this will be an array with size 4-by-20-by-12
... %// (or some permutation of that you find more intuitive)
];
%// offsets into the 3D array to facilitate indexing a bit
offsets = bsxfun(#plus, ...
repmat(1:size(data,1), size(data,3),1), ...
(0:size(data,3)-1)' * size(data,1)*size(data,2)); %//'
%// your objective function
function val = obj(X)
%// limit "X" to integers in [1 20]
X = min(max(round(X),1),size(data,3));
%// "X" will be a collection of 12 integers between 0 and 20, which are
%// indices into the data matrix
%// form "s" from "X"
s = sum(bsxfun(#plus, offsets, X*size(data,1) - size(data,1)));
%// XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxX
%// Compute the NEGATIVE VALUE of your function here
%// XxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxXxX
end
%// your "non-linear" constraint function
function [C, Ceq] = nonlcon(X)
%// limit "X" to integers in [1 20]
X = min(max(round(X),1),size(data,3));
%// form "s" from "X"
s = sum(bsxfun(#plus, offsets, X(:)*size(data,1) - size(data,1)));
%// we have no equality constraints
Ceq = [];
%// Compute inequality constraints
%// NOTE: solver is trying to solve C <= 0, so:
C = [...
40 - s(1)
s(2) - 100
-20 - s(4)
];
end
%// useful GA options
options = gaoptimset(...
'UseParallel', 'always'...
...
);
%// The rest really depends on the specifics of the problem.
%// Useful to look at will be at least 'TolCon', 'Vectorized', and of course,
%// 'PopulationType', 'Generations', etc.
%// THE OPTIMZIATION
[sol, fval, exitflag] = ga(...
#obj, size(data,3), ... %// objective function, taking a vector of 20 values
[],[], [],[], ... %// no linear (in)equality constraints
1,size(data,2), ... %// lower and upper limits
#nonlcon, options); %// your "nonlinear" constraints
end
Note that even though your constraints are essentially linear, the way by which you must compute the value for your s necessitates the use of a custom constraint function (nonlcon).
Especially note that this is currently (probably) a sub-optimal way to use ga -- I don't know the specifics of your objective function, so a lot more may be possible. For instance, I currently use a simple round() to convert the input X to integers, but using 'PopulationType', 'custom' (with a custom 'CreationFcn', 'MutationFcn' etc.) might produce better results. Also, 'Vectorized' will likely speed things up a lot, but I don't know whether your function is easily vectorized.
And yes, I use nested functions (I just love those things!); it prevents these huge, usually identical lists of input arguments if you use sub-functions or stand-alone functions, and they can really be a performance boost because there is little copying of data. But, I realize that their scoping rules make them somewhat akin to goto constructs, and so they are -ahum- "not everyone's cup of tea"...you might want to convert them to sub-functions to prevent long and useless discussions with your co-workers :)
Anyway, this should be a good place to start. Let me know if this is useful at all.
Unless you define some intelligence on how the vector sets are organized, there will be no intelligent way of solving your problem other then pure brute force.
Say you find s s.t. f(s) is max given constraints of s, you still need to figure out how to build s with twelve 4-element vectors (an overdetermined system if there ever was one), where each vector has 20 possible values. Sparsity may help, although I'm not sure how it is possible to have a vector with four elements be 70-90% zero, and sparsity would only be useful if there was some yet to be described methodology in how the vector are organized
So I'm not saying you can't solve the problem, I'm saying you need to rethink how the problem is set-up.
I know, this answer is reaching you really late.
Unfortunately, the problem, as is, show not many patterns to be exploited, besides of brute force -Branch&Bound, Master& Slave, etc.- Trying a Master Slave approach -i.e. solving first the function continuous nonlinear problem as master, and solving the discrete selection as slave could help, but with as many combinations, and without any more information over the vectors, there is not too much space for work.
But based on the given continuous almost everywhere functions, based on combinations of sums and multiplication operators and their inverses, the sparsity is a clear point to be exploited here. If 70-90% of vectors are zero, almost a good part of the solution space will be close to zero, or close to infinite. Hence a 80-20 pseudo solution would discard easily the 'zero' combinations, and use only the 'infinite' ones.
This way, the brute-force could be guided.

How to generate random matlab vector with these constraints

I'm having trouble creating a random vector V in Matlab subject to the following set of constraints: (given parameters N,D, L, and theta)
The vector V must be N units long
The elements must have an average of theta
No 2 successive elements may differ by more than +/-10
D == sum(L*cosd(V-theta))
I'm having the most problems with the last one. Any ideas?
Edit
Solutions in other languages or equation form are equally acceptable. Matlab is just a convenient prototyping tool for me, but the final algorithm will be in java.
Edit
From the comments and initial answers I want to add some clarifications and initial thoughts.
I am not seeking a 'truly random' solution from any standard distribution. I want a pseudo randomly generated sequence of values that satisfy the constraints given a parameter set.
The system I'm trying to approximate is a chain of N links of link length L where the end of the chain is D away from the other end in the direction of theta.
My initial insight here is that theta can be removed from consideration until the end, since (2) in essence adds theta to every element of a 0 mean vector V (shifting the mean to theta) and (4) simply removes that mean again. So, if you can find a solution for theta=0, the problem is solved for all theta.
As requested, here is a reasonable range of parameters (not hard constraints, but typical values):
5<N<200
3<D<150
L==1
0 < theta < 360
I would start by creating a "valid" vector. That should be possible - say calculate it for every entry to have the same value.
Once you got that vector I would apply some transformations to "shuffle" it. "Rejection sampling" is the keyword - if the shuffle would violate one of your rules you just don't do it.
As transformations I come up with:
switch two entries
modify the value of one entry and modify a second one to keep the 4th condition (Theoretically you could just shuffle two till the condition is fulfilled - but the chance that happens is quite low)
But maybe you can find some more.
Do this reasonable often and you get a "valid" random vector. Theoretically you should be able to get all valid vectors - practically you could try to construct several "start" vectors so it won't take that long.
Here's a way of doing it. It is clear that not all combinations of theta, N, L and D are valid. It is also clear that you're trying to simulate random objects that are quite complex. You will probably have a hard time showing anything useful with respect to these vectors.
The series you're trying to simulate seems similar to the Wiener process. So I started with that, you can start with anything that is random yet reasonable. I then use that as a starting point for an optimization that tries to satisfy 2,3 and 4. The closer your initial value to a valid vector (satisfying all your conditions) the better the convergence.
function series = generate_series(D, L, N,theta)
s(1) = theta;
for i=2:N,
s(i) = s(i-1) + randn(1,1);
end
f = #(x)objective(x,D,L,N,theta)
q = optimset('Display','iter','TolFun',1e-10,'MaxFunEvals',Inf,'MaxIter',Inf)
[sf,val] = fminunc(f,s,q);
val
series = sf;
function value= objective(s,D,L,N,theta)
a = abs(mean(s)-theta);
b = abs(D-sum(L*cos(s-theta)));
c = 0;
for i=2:N,
u =abs(s(i)-s(i-1)) ;
if u>10,
c = c + u;
end
end
value = a^2 + b^2+ c^2;
It seems like you're trying to simulate something very complex/strange (a path of a given curvature?), see questions by other commenters. Still you will have to use your domain knowledge to connect D and L with a reasonable mu and sigma for the Wiener to act as initialization.
So based on your new requirements, it seems like what you're actually looking for is an ordered list of random angles, with a maximum change in angle of 10 degrees (which I first convert to radians), such that the distance and direction from start to end and link length and number of links are specified?
Simulate an initial guess. It will not hold with the D and theta constraints (i.e. specified D and specified theta)
angles = zeros(N, 1)
for link = 2:N
angles (link) = theta(link - 1) + (rand() - 0.5)*(10*pi/180)
end
Use genetic algorithm (or another optimization) to adjust the angles based on the following cost function:
dx = sum(L*cos(angle));
dy = sum(L*sin(angle));
D = sqrt(dx^2 + dy^2);
theta = atan2(dy/dx);
the cost is now just the difference between the vector given by my D and theta above and the vector given by the specified D and theta (i.e. the inputs).
You will still have to enforce the max change of 10 degrees rule, perhaps that should just make the cost function enormous if it is violated? Perhaps there is a cleaner way to specify sequence constraints in optimization algorithms (I don't know how).
I feel like if you can find the right optimization with the right parameters this should be able to simulate your problem.
You don't give us a lot of detail to work with, so I'll assume the following:
random numbers are to be drawn from [-127+theta +127-theta]
all random numbers will be drawn from a uniform distribution
all random numbers will be of type int8
Then, for the first 3 requirements, you can use this:
N = 1e4;
theta = 40;
diffVal = 10;
g = #() randi([intmin('int8')+theta intmax('int8')-theta], 'int8') + theta;
V = [g(); zeros(N-1,1, 'int8')];
for ii = 2:N
V(ii) = g();
while abs(V(ii)-V(ii-1)) >= diffVal
V(ii) = g();
end
end
inline the anonymous function for more speed.
Now, the last requirement,
D == sum(L*cos(V-theta))
is a bit of a strange one...cos(V-theta) is a specific way to re-scale the data to the [-1 +1] interval, which the multiplication with L will then scale to [-L +L]. On first sight, you'd expect the sum to average out to 0.
However, the expected value of cos(x) when x is a random variable from a uniform distribution in [0 2*pi] is 2/pi (see here for example). Ignoring for the moment the fact that our limits are different from [0 2*pi], the expected value of sum(L*cos(V-theta)) would simply reduce to the constant value of 2*N*L/pi.
How you can force this to equal some other constant D is beyond me...can you perhaps elaborate on that a bit more?