How do I solve a linear model in KDB using BFGS? - kdb

I've got a toy linear model:
\l ml/ml.q
.ml.loadfile`:optimize/init.q
xx: 9h$til 10
yy: ((xx)*3) + 4
x0: 1 1
error:{sum xexp[(yy - (xx*x) + y);2]}
q).ml.optimize.BFGS[error;x0;();::]
\
'type
[4] /home/chris/anaconda3/q/ml/optimize/utils.q:467: .ml.i.gradEval:
// Evaluate the gradient
(i.funcEval[func;xk;args]-fk)%eps
^
}
I'm hoping it will minimize the error function, and recover 3;4 from the model.
It doesn't seem to go through, despite having followed the docs as best I can:
https://code.kx.com/q/ml/toolkit/optimize/
What am I doing wrong?

The problem was the error function: it must be unary and take the parameters as a single list. The original version used both x and y, which makes it a binary function.
error:{sum xexp[(yy - (xx*x[0]) + x[1]);2]}
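As an illustration of why the objective has to accept one parameter vector, here is a minimal Python sketch of a forward-difference gradient, in the spirit of the expression shown in the traceback. It is not the toolkit's code: the names `error` and `forward_difference_gradient` and the step size `eps` are assumptions made for this example.

```python
# Toy data matching the question: y = 3x + 4 for x = 0..9.
xx = [float(i) for i in range(10)]
yy = [3.0 * x + 4.0 for x in xx]


def error(params):
    """Sum of squared residuals; params is ONE list [slope, intercept]."""
    slope, intercept = params
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xx, yy))


def forward_difference_gradient(func, params, eps=1e-6):
    """Hypothetical helper: perturb one coordinate at a time.

    This only works if func takes the whole parameter vector as a
    single argument, which is why a two-argument objective fails.
    """
    base = func(params)
    grad = []
    for i in range(len(params)):
        shifted = list(params)
        shifted[i] += eps
        grad.append((func(shifted) - base) / eps)
    return grad


start_gradient = forward_difference_gradient(error, [1.0, 1.0])
error_at_solution = error([3.0, 4.0])
```

The error is zero at the true parameters (3, 4), which is the point an optimizer should converge to.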

Julia (JuMP): Indicator constraints with multiple conditional values (is a boolean expression possible?)

I want to implement a constraint depending on the change of values in my binary decision variable, x, over "time".
I am trying to implement a minimum operating time constraint for a unit commitment optimization problem for power systems. x is representing the unit activation where 0 and 1 show that a power unit, n, at a certain time, t, respectively is shut off or turned on.
For this, indicator constraints seem to be a promising solution and with the inspiration of a similar problem the implementation seemed quite straightforward.
So, since boolean operators are introduced (! and ¬), I prematurely wanted to express the change in a boolean way:
@constraint(m, xx1[n=1:N,t=2:T], (!x[n,t-1] && x[n,t]) => {next(t, 1) + next(t, 2) == 2})
Saying: if unit was deactivated before but now is on, then demand the unit to be active for the next 2 times.
Where next(t, i) = x[((t - 1 + i) % T) + 1].
I got the following error:
LoadError: MethodError: no method matching !(::VariableRef)
Closest candidates are:
!(!Matched::Missing) at missing.jl:100
!(!Matched::Bool) at bool.jl:33
!(!Matched::Function) at operators.jl:896
I checked that the indicator constraint is working properly with a single term only.
Question: Is this possible or is there another obvious solution?
Troubleshooting and workarounds: I have tried the following (please correct me if my diagnosis is wrong):
Implement change as an expression: indicator constraints only work with binary integer variables.
Implement change as another variable relating to x. I have found a solution, documented in a Julia Discourse thread, but it is quite sketchy. The immediate problem with that solution is that indicator constraints do not work as a bi-implication but only one way, LHS->RHS. Please see the proper approach given by @Oscar Dowson.
You can get the working code from github.
The trick is to find constraint(s) that have an equivalent truth-table:
# Like
(!x[1] && x[2]) => {z == 1}
# Is equivalent to:
z >= -x[1] + x[2]
# Proof
-x[1] + x[2] = sum <= z
-----------------------
 -0   +  0   =   0 <= 0
 -1   +  0   =  -1 <= 0
 -0   +  1   =   1 <= 1
 -1   +  1   =   0 <= 0
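The truth table can also be checked by brute force. The following Python sketch (not JuMP code; the function names are made up for this example) enumerates every binary combination of x[1], x[2] and z and compares the logical implication with the linear inequality.

```python
from itertools import product


def implication_holds(x1, x2, z):
    """Logical form: (not x1 and x2) implies z == 1."""
    return (not (x1 == 0 and x2 == 1)) or z == 1


def linear_holds(x1, x2, z):
    """Linear form: z >= -x1 + x2."""
    return z >= -x1 + x2


# One row per binary assignment: (x1, x2, z, logical result, linear result).
rows = [
    (x1, x2, z, implication_holds(x1, x2, z), linear_holds(x1, x2, z))
    for x1, x2, z in product((0, 1), repeat=3)
]
all_agree = all(logical == linear for _, _, _, logical, linear in rows)
print(all_agree)
```

Both forms accept and reject exactly the same assignments, so the inequality can stand in for the one-way implication when all three variables are binary.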
I was recommended MOSEK Modeling Cookbook to help working out the correct formulation of constraints.
For further details, see the thread here, from which I got the answer.

How is this MATLAB code (involving colon operator) resolved?

Recently, I wanted to calculate the next multiple of 5 of several values.
I was very confused by the output of this code, which should have done the trick:
7:11 - mod(7:11, 5) + 5
ans =
7 8 9 10 11 12 13 14
While the actual working solution was this:
(7:11) - mod(7:11, 5) + 5
ans =
10 10 10 15 15
So this seems to be related to operator precedence! But what exactly does the first command do, and why does it output a (1,8) vector?
Addendum: I have found that the first command can also be written as:
7:(11 - mod(7:11, 5) + 5)
Which already hints towards the explanation of the observed result, but I am still curious about the whole explanation.
Here's the list of MATLAB operator precedence
As you can see, parentheses, (), are evaluated first, meaning that mod(7:11,5) is computed first. Then comes point 6): addition and subtraction are handled from left to right, i.e. first 11-mod(7:11,5) and then 11-mod(7:11,5)+5. Finally comes point 7): the colon, :, is evaluated, giving 7:(11-mod(7:11,5)+5).
As you noted correctly 7:11 - mod(7:11, 5) + 5 is the same as 7:(11 - mod(7:11, 5) + 5), as seen above using operator precedence.
Now to the second part: why do you obtain 8 values, rather than 5? The problem here is "making an array with an array". Basically:
1:3
ans =
1 2 3
1:(3:5)
ans =
1 2 3
This shows what's going on. If you initialise an array with the colon, but have the end point as an array, MATLAB uses only the first value. As odd as it may sound, it's documented behaviour.
mod(7:11,5) generates an array, [2 3 4 0 1]. This array is then subtracted from 11 and 5 is added, giving [14 13 12 16 15]. Now, as we see in the documentation, only the first element is then considered. 7:[14 13 12 16 15] gets parsed as 7:14 and will result in 8 values, as you've shown.
Doing (7:11) - mod(7:11, 5) + 5 first creates two arrays: 7:11 and mod(7:11,5). It then subtracts the two arrays elementwise and adds 5 to each of the elements. Interesting to note here would be that 7:12 - mod(7:11, 5) + 5 would work, whereas (7:12) - mod(7:11, 5) + 5 would result in an error due to incompatible array sizes.
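The two evaluation orders can be mimicked in Python to make the arithmetic concrete. This is only an emulation of the behaviour described above, not MATLAB semantics in general; the variable names are illustrative.

```python
values = list(range(7, 12))  # MATLAB 7:11

# Parenthesised form, (7:11) - mod(7:11, 5) + 5:
# elementwise arithmetic on the whole range.
next_multiples = [v - v % 5 + 5 for v in values]

# Unparenthesised form, 7:(11 - mod(7:11, 5) + 5):
# the arithmetic is applied to the end point first, producing a vector,
# and the colon operator then uses only its first element.
end_points = [11 - v % 5 + 5 for v in values]
misparsed = list(range(7, end_points[0] + 1))  # MATLAB 7:14

print(next_multiples)
print(end_points)
print(misparsed)
```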

Applying adverb to colon operator

Please help me with the colon : operator, I'm stuck on how it works. It works as an assignment, assignment through x+:1, global assignment/view ::, I/O 0:, 1:, to return a value from the middle of a function :r, and to get a unary form of an operator #:.
But what happens if one applies an adverb to it? I tried this way:
$ q
KDB+ 3.6 2019.04.02 Copyright (C) 1993-2019 Kx Systems
q)(+')[100;2 3 4]
102 103 104
q)(:')[x;2 3 4]
'x
[0] (:')[x;2 3 4]
^
q)(:')[100;2 3 4]
2 3 4
I expected the evaluations to run in order (x:2, then x:3, then x:4), leaving x:4 as the result. But I got an error. Also, :' works with the number 100 for some unknown reason.
What is :' actually doing?
q)parse "(:')[100;2 3 4]"
(';:)
100
2 3 4
Parsing didn't shed much light on it for me, so I'm asking for your help.
When modified by an iterator (also known as an adverb in q speak), : behaves just like any other binary operator. In your example
q)(:')[100;2 3 4]
2 3 4
an atom 100 is extended to a conformant list 100 100 100 and then : is applied to elements of the two lists pairwise. The final result is returned. It might look confusing (: tries to modify a constant value, really?) but if you compare this to any other binary operator and notice that they never modify their operands but return a result of expression everything should click into place.
For example, compare
q)+'[100; 2 3 4]
102 103 104
and
q)(:')[100;2 3 4]
2 3 4
In both cases a temporary vector 100 100 100 is created implicitly and an operator is applied to it and 2 3 4. So the former is semantically equivalent to
(t[0]+2;t[1]+3;t[2]+4)
and the latter to
(t[0]:2;t[1]:3;t[2]:4)
where t is that temporary vector.
This explains why (:')[x;2 3 4] gives an error -- if x doesn't exist kdb can't extend it to a list.
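As a rough analogy only, the pairwise application can be sketched in Python. The helper `each` and the stand-in `take_right` are hypothetical names; `take_right` simply models the output shown above, where the result is the right-hand list, and is not a claim about how q implements `:`.

```python
from operator import add


def each(func, left, right):
    """Apply func pairwise, extending an atom on the left to match the list."""
    if not isinstance(left, list):
        left = [left] * len(right)  # the temporary vector, e.g. 100 100 100
    return [func(a, b) for a, b in zip(left, right)]


def take_right(_left, right):
    """Stand-in for `:` in this context: ignore the left value, return the right."""
    return right


added = each(add, 100, [2, 3, 4])
assigned = each(take_right, 100, [2, 3, 4])
print(added)
print(assigned)
```

Neither call changes the value 100; both just return a new list, which mirrors the point that binary operators return a result rather than modifying their operands.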

scipy dblquad providing the wrong result in simple double integral

I am trying to calculate a straightforward double definite integral in Python: function Max(0, (4-12x) + (6-12y)) in the square [0,1] x [0,1].
We can do it with Mathematica and get the exact result:
Integrate[Max[0, 4-12*u1 + 6-12*u2], {u1, 0, 1}, {u2, 0,1}] = 125/108.
With a simple Monte Carlo simulation I can confirm this result. However, using scipy.integrate.dblquad I am getting a value of 0.0005772072907971, with error 0.0000000000031299
from scipy.integrate import dblquad
def integ(u1, u2):
    return max(0, (4 - 12*u1) + (6 - 12*u2))
sol_int, err = dblquad(integ, 0, 1, lambda _:0, lambda _:1, epsabs=1E-12, epsrel=1E-12)
print("dblquad: %0.16f. Error: %0.16f" % (sol_int, err) )
Agreed, the function is not differentiable everywhere, but it is continuous, so I see no reason for this particular integral to be problematic.
I thought maybe dblquad has an 'options' argument where I can try different numerical methods, but I found nothing like that.
So, what am I doing wrong?
"try different numerical methods"
That's what I would suggest, given the trouble that iterated quad has on Windows. After changing it to an explicit two-step process, you can replace one of quad with another method, romberg seems the best alternative to me.
from scipy.integrate import quad, romberg
def integ(u1, u2):
    return max(0, (4 - 12*u1) + (6 - 12*u2))
sol_int = romberg(lambda u1: quad(lambda u2: integ(u1, u2), 0, 1)[0], 0, 1)
print("romberg-quad: %0.16f " % sol_int)
This prints 1.1574073959987758 on my computer, and hopefully you will get the same.
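For an independent sanity check that needs no SciPy at all, a plain midpoint rule over the unit square gets close to 125/108. This is only a sketch for cross-checking the expected value; the function name `midpoint_double_integral` and the grid size of 300 are arbitrary choices, not part of any library.

```python
def integ(u1, u2):
    return max(0.0, (4 - 12 * u1) + (6 - 12 * u2))


def midpoint_double_integral(func, n):
    """Midpoint rule on an n-by-n grid over the unit square [0,1] x [0,1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        u1 = (i + 0.5) * h
        for j in range(n):
            total += func(u1, (j + 0.5) * h)
    return total * h * h


estimate = midpoint_double_integral(integ, 300)
exact = 125 / 108
print("midpoint: %0.6f, exact: %0.6f" % (estimate, exact))
```

The midpoint rule is exact on cells where the integrand is linear, so the only error comes from the cells crossed by the kink along u1 + u2 = 5/6.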

GCD test - to test dependency between loop statements

I understand how the GCD works on a trivial example as below:
for(i=1; i<=100; i++)
{
    X[2*i+3] = X[2*i] + 50;
}
we first transform it into the following form:
X[a*i + b] and X[c*i + d]
a=2, b=3, c=2, d=0 and GCD(a,c)=2 and (d-b) is -3. Since 2 does not divide -3, no dependence is possible.
But how can we do this GCD test on a doubly nested loop?
For example:
for (i=0; i<10; i++){
    for (j=0; j<10; j++){
        A[1+2*i + 20*j] = A[2+20*i + 2*j];
    }
}
While the subscripts can be delinearized, the GCD test is simple to apply directly. In your example, the subscript pair is [1+2*i + 20*j] and [2+20*i + 2*j], so we're looking for an integer solution to the equation
1 + 2*i + 20*j = 2 + 20*i' + 2*j'
Rearranging, we get
2*i - 20*i' + 20*j - 2*j' = 1
Compute the GCD of all the coefficients, 2, -20, 20, and -2, and see if it divides the constant. In this case, the GCD is 2. Since 2 doesn't divide 1, there's no dependence.
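The check described above is short enough to sketch in Python. The function name `gcd_test` is made up for this example, and the test deliberately ignores loop bounds, so a True result only means a dependence cannot be ruled out.

```python
from functools import reduce
from math import gcd


def gcd_test(coefficients, constant):
    """Return True if sum(coefficient * variable) == constant can have an
    integer solution, i.e. a dependence is possible; False rules it out."""
    g = reduce(gcd, coefficients, 0)  # math.gcd ignores signs
    if g == 0:
        return constant == 0
    return constant % g == 0


# Single loop from the question: 2*i + 3 = 2*i'  ->  2*i - 2*i' = -3
single_loop = gcd_test([2, -2], -3)

# Nested loop: 2*i - 20*i' + 20*j - 2*j' = 1
nested_loop = gcd_test([2, -20, 20, -2], 1)

print(single_loop, nested_loop)
```

Both calls return False, matching the conclusion that neither loop carries a dependence.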
The "easy" way to apply GCD in the nested loop case is to apply it only in cases where the arrays themselves are multidimensional; i.e., the original source code uses multiple subscripts rather than already linearized expressions. Of course, if you can "back transform" these linearized subscripts then you'll have the equivalent.
Once you've cast the problem as a multidimensional problem, you may simply apply the GCD test "dimension by dimension". If any dimension shows no dependence then you can stop and declare there is no dependence for the entire multidimensional subscripting sequence.
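A dimension-by-dimension version might look like the following Python sketch. The array reference in the comment is a hypothetical example, not taken from the question, and the helper names are illustrative.

```python
from functools import reduce
from math import gcd


def dimension_may_depend(coefficients, constant):
    """GCD test for one subscript position."""
    g = reduce(gcd, coefficients, 0)
    return constant == 0 if g == 0 else constant % g == 0


def may_depend(dimensions):
    """dimensions: list of (coefficients, constant), one pair per subscript.

    all() stops at the first dimension that rules out a dependence.
    """
    return all(dimension_may_depend(c, k) for c, k in dimensions)


# Hypothetical pair: A[2*i][j + 1] is written, A[2*i' + 1][j'] is read.
# Dimension 1: 2*i - 2*i' = 1     Dimension 2: j - j' = -1
example = [([2, -2], 1), ([1, -1], -1)]
print(may_depend(example))
```

Here the first dimension already fails the test (2 does not divide 1), so the whole reference pair is independent even though the second dimension alone would allow a dependence.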
The key of course is that casting as a multidimensional indexing problem gives you the nice property that there's a one-to-one mapping between individual index values and the corresponding index expression tuples. Without this the problem is harder.
This is the approach I took in the ASC Fortran vectorizing compiler back in the 70's and it worked pretty well, particularly when used in conjunction with directional subscript analysis for the non-disjoint case. The GCD test by itself is really not sufficient, but it does give you a relatively inexpensive way of making an early decision in your analysis, so that in those cases you can avoid the more expensive dependence analysis.