constraint on sum of abs(w) in scipy optimizer

I would like to put an upper limit on the sum of abs(w) in a scipy optimization problem. This can be done in a linear program by using dummy variables, e.g. y > w, y > -w, sum(y) < K, but I cannot figure out how to formulate it in the scipy optimize framework.
Code example is below. This runs, but the total portfolio gross is not fixed. This is a long/short portfolio optimization where the w's sum to zero, and I want abs(w) to sum to 1.0. Is there a way to add this second constraint in scipy's framework?
import numpy as np
import scipy.optimize as sco

def optimize(alphas, cov, maxRisk):
    def _calcRisk(w):
        var = np.dot(np.dot(w.T, cov), w)
        return var
    def _calcAlpha(w):
        alpha = np.dot(alphas, w)
        return -alpha
    constraints = (
        {'type': 'eq', 'fun': lambda w: np.sum(w)},
        {'type': 'ineq', 'fun': lambda w: maxRisk*maxRisk - _calcRisk(w)} )
    n = len(alphas)
    bounds = tuple((-1, 1) for x in range(n))
    initw = n * [0.00001 / n]
    result = sco.minimize(_calcAlpha, initw, method='SLSQP',
                          bounds=bounds, constraints=constraints)
    return result

A simple algebraic trick will do. Since equality constraints tacitly mean that the constraint function's result is to be zero, you just shift the function's output by 1.0: np.sum(w) - 1.0 == 0.0 is equivalent to np.sum(w) == 1.0. See the documentation on scipy.optimize.minimize. Concretely, just change the line
{'type': 'eq', 'fun': lambda w: np.sum(w)},
to
{'type': 'eq', 'fun': lambda w: np.sum(w) - 1.0}

Thanks to folks who responded. The answer is to make the free variable vector bigger, and slice from it to get the variables as needed (obvious I guess :-). The following works (use at your own risk of course):
import numpy as np
import scipy.optimize as sco

# make the returned lambda "final" so it does not change when param i (or n) changes
def makeFinalLambda(i, n, op):
    if op == '+':
        return lambda w: w[n+i] + w[i]
    else:
        return lambda w: w[n+i] - w[i]

def optimize(alphas, cov, maxRisk):
    n = len(alphas)
    def _calcRisk(x):
        w = x[:n]
        var = np.dot(np.dot(w.T, cov), w)
        return var
    def _calcAlpha(x):
        w = x[:n]
        alpha = np.dot(alphas, w)
        return -alpha
    constraints = []
    # make the constraints that define the absolute-value variables
    for i in range(n):
        # note that this doesn't work; all the lambdas would refer to the current i value
        # constraints.append({'type': 'ineq', 'fun': lambda w: w[n+i] - w[i] })
        # constraints.append({'type': 'ineq', 'fun': lambda w: w[n+i] + w[i] })
        constraints.append({'type': 'ineq', 'fun': makeFinalLambda(i, n, '-')})
        constraints.append({'type': 'ineq', 'fun': makeFinalLambda(i, n, '+')})
    # add neutrality, gross value, and risk constraints
    constraints = constraints + \
        [{'type': 'eq', 'fun': lambda w: np.sum(w[:n])},
         {'type': 'eq', 'fun': lambda w: np.sum(w[n:]) - 1.0},
         {'type': 'ineq', 'fun': lambda w: maxRisk*maxRisk - _calcRisk(w)}]
    bounds = tuple((-1, 1) for x in range(n))
    bounds = bounds + tuple((0, 1) for x in range(n))
    # try to choose a nice, feasible starting vector
    initw = n * [0.001 / n]
    initw = initw + [abs(w) + 0.001 for w in initw]
    result = sco.minimize(_calcAlpha, initw, method='SLSQP',
                          bounds=bounds, constraints=constraints)
    return result
This iteratively creates two constraints for each weight variable to define the absolute-value variables. It's nicer to do this as a vector (per-element) constraint, as follows:
def optimize(alphas, cov, maxRisk):
    n = len(alphas)
    def _calcRisk(x):
        w = x[:n]
        var = np.dot(np.dot(w.T, cov), w)
        return var
    def _calcAlpha(x):
        w = x[:n]
        alpha = np.dot(alphas, w)
        return -alpha
    absfunpos = lambda x: [x[n+i] - x[i] for i in range(n)]
    absfunneg = lambda x: [x[n+i] + x[i] for i in range(n)]
    constraints = (
        sco.NonlinearConstraint(absfunpos, [0.0]*n, [2.0]*n),
        sco.NonlinearConstraint(absfunneg, [0.0]*n, [2.0]*n),
        {'type': 'eq', 'fun': lambda w: np.sum(w[:n])},
        {'type': 'eq', 'fun': lambda w: np.sum(w[n:]) - 1.0},
        {'type': 'ineq', 'fun': lambda w: maxRisk*maxRisk - _calcRisk(w)} )
    bounds = tuple((-1, 1) for x in range(n))
    bounds = bounds + tuple((0, 3) for x in range(n))
    initw = n * [0.01 / n]
    initw = initw + [abs(w) for w in initw]
    result = sco.minimize(_calcAlpha, initw, method='SLSQP',
                          bounds=bounds, constraints=constraints)
    return result
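As a quick sanity check, one can confirm the two equality constraints on the returned solution. The alphas, covariance matrix, and risk limit below are made-up example values, not data from the question:
alphas = np.array([0.02, -0.01, 0.015, -0.005])
cov = np.diag([0.04, 0.09, 0.01, 0.16])  # toy diagonal covariance
res = optimize(alphas, cov, maxRisk=0.15)
w = res.x[:len(alphas)]                  # first n entries are the weights
print(np.sum(w))                         # should be ~0.0 (market neutral)
print(np.sum(np.abs(w)))                 # should be ~1.0 (unit gross exposure)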

Related

Is this the correct way to calculate the covariance matrix for 2D feature map?

Within a neural network, I have some 2D feature maps with values between 0 and 1. For these maps, I want to calculate the covariance matrix based on the values at each coordinate. Unfortunately, pytorch has no .cov() function like in numpy. So I wrote the following function instead:
def get_covariance(tensor):
    bn, nk, w, h = tensor.shape
    tensor_reshape = tensor.reshape(bn, nk, 2, -1)
    x = tensor_reshape[:, :, 0, :]
    y = tensor_reshape[:, :, 1, :]
    mean_x = torch.mean(x, dim=2).unsqueeze(-1)
    mean_y = torch.mean(y, dim=2).unsqueeze(-1)
    xx = torch.sum((x - mean_x) * (x - mean_x), dim=2).unsqueeze(-1) / (h * w - 1)
    xy = torch.sum((x - mean_x) * (y - mean_y), dim=2).unsqueeze(-1) / (h * w - 1)
    yx = xy
    yy = torch.sum((y - mean_y) * (y - mean_y), dim=2).unsqueeze(-1) / (h * w - 1)
    cov = torch.cat((xx, xy, yx, yy), dim=2)
    cov = cov.reshape(bn, nk, 2, 2)
    return cov
Is that the correct way to do it?
Edit:
Here is a comparison with the numpy function:
a = torch.randn(1, 1, 64, 64)
a_numpy = a.reshape(1, 1, 2, -1).numpy()
torch_cov = get_covariance(a)
numpy_cov = np.cov(a_numpy[0][0])
torch_cov
tensor([[[[ 0.4964, -0.0053],
          [-0.0053,  0.4926]]]])
numpy_cov
array([[ 0.99295635, -0.01069122],
       [-0.01069122,  0.98539236]])
Apparently, my values are too small by a factor of 2. Why could that be?
Edit2: Ahhh I figured it out. It has to be divided by (h*w/2 - 1) :) Then the values match.
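For reference, a sketch of the function with that fix applied (the name get_covariance_fixed is mine, not from the question; the only change is the denominator, since each of the two rows holds h*w/2 samples):
import torch

def get_covariance_fixed(tensor):
    bn, nk, w, h = tensor.shape
    n = h * w // 2  # samples per row after splitting the map in two
    tensor_reshape = tensor.reshape(bn, nk, 2, -1)
    x = tensor_reshape[:, :, 0, :]
    y = tensor_reshape[:, :, 1, :]
    mean_x = torch.mean(x, dim=2).unsqueeze(-1)
    mean_y = torch.mean(y, dim=2).unsqueeze(-1)
    xx = torch.sum((x - mean_x) * (x - mean_x), dim=2).unsqueeze(-1) / (n - 1)
    xy = torch.sum((x - mean_x) * (y - mean_y), dim=2).unsqueeze(-1) / (n - 1)
    yy = torch.sum((y - mean_y) * (y - mean_y), dim=2).unsqueeze(-1) / (n - 1)
    cov = torch.cat((xx, xy, xy, yy), dim=2)
    return cov.reshape(bn, nk, 2, 2)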

Error while evaluating the function convolution

This is my first attempt to write anything in MATLAB, so please be patient.
I am trying to evaluate the solution of the following ODE: w'' + N(w, w') = f(t) with the Cauchy conditions w(0) = w'(0) = 0. Here N is a given nonlinear function and f is a given source. I also need the convolution
wG(t) = int_0^t G(t - s) * f(s) ds,
where G is the solution of the following ODE:
G'' + N(G, G') = delta(t - s),
where G(0) = G'(0) = 0, s is a constant, and delta is the Dirac delta function, which I approximate by the narrow Gaussian used in the code below.
My try is as follows: I define N, f, w and G:
k = 1000;
N = @(g1, g2) g1^2 + sin(g2);
f = @(t) 0.5 * (1 + tanh(k * t));
t = linspace(0, 10, 100);
w = nonlinearnonhom(N, f);
G = nonlinearGreen(N);
This part is OK. I can plot both w and G, and both seem to be correct. Now, I want to evaluate wG. For that purpose, I use the direct and inverse Laplace transforms as follows:
wG = ilaplace(laplace(G, t, s) * laplace(f, t, s), s, t);
but it says
Undefined function 'laplace' for input arguments of type 'double'.
Error in main (line 13)
wG = ilaplace(laplace(G, t, s) * laplace(f, t, s), s, t);
Now, I am not sure whether this definition of wG is correct at all, or whether there are other ways to define it.
Appendix: nonlinearGreen(N) is defined as follows:
function G = nonlinearGreen(N)
    eps = .0001;
    del = @(t) 1/(eps * pi) * exp(-t^2/eps^2);
    eqGreen = @(t, g) [g(2); -N(g(1), g(2)) + del(t)];
    tspan = [0, 100];
    Cc = [0, 0];
    solGreen = ode45(eqGreen, tspan, Cc);
    t = linspace(0, 10, 1000);
    G = deval(solGreen, t, 1);
end
and nonlinearnonhom is defined as follows:
function w = nonlinearnonhom(N, f)
    eqnonhom = @(t, g) [g(2); -N(g(1), g(2)) + f(t)];
    tspan = [0, 100];
    Cc = [0, 0];
    solnonhom = ode45(eqnonhom, tspan, Cc);
    t = linspace(0, 10, 100);
    w = deval(solnonhom, t, 1);
end
You keep mixing different kinds of types, and that's not a good idea. I suggest you stay symbolic all the way if you want to use the laplace function. When you define N and f with @, you create function handles, not the symbolic expressions you might want. I suggest you have a look at the Symbolic Math Toolbox documentation and rewrite your functions as symbolic expressions.
Then, the error message is pretty clear.
Undefined function 'laplace' for input arguments of type 'double'.
Error in main (line 13)
wG = ilaplace(laplace(G, t, s) * laplace(f, t, s), s, t);
It means that the function laplace can't have arguments of type double.
The problem is that your t is a vector of double. Another mistake is that s is not defined in your code.
According to Matlab documentation of laplace, all arguments are of type symbolic.
You can try to manually specify symbolic s and t.
% t = linspace(0, 10, 100); % This is wrong
syms s t
wG = ilaplace(laplace(G, t, s) * laplace(f, t, s), s, t);
I have no error after that.

How to accumulate and apply gradients for Async n-step DQNetwork update in Tensorflow?

I am trying to implement Asynchronous Methods for Deep Reinforcement Learning and one of the steps requires to accumulate the gradient over different steps and then apply it.
What is the best way to achieve this in tensorflow?
I got as far as accumulating the gradient, but I don't think this is the fastest way to achieve it (lots of transfers from TensorFlow to Python and back).
Any suggestions are welcome.
This is my code for a toy NN. It does not model or compute anything; it just exercises the operations that I want to use.
import numpy as np
import tensorflow as tf
from model import *

graph = tf.Graph()

with graph.as_default():
    state = tf.placeholder(tf.float32, shape=[None, 80, 80, 1])
    with tf.variable_scope('layer1'):
        W = weight_variable([8, 8, 1, 32])
        variable_summaries(W, "layer1/W")
        b = bias_variable([32])
        variable_summaries(b, "layer1/b")
        h = conv2d(state, W, 4) + b
        activation = tf.nn.relu(h)
        pool1 = max_pool_2x2(activation)
    print(pool1.get_shape())
    pool1 = tf.reshape(pool1, [-1, 3200])
    with tf.variable_scope('readout'):
        W = weight_variable([3200, 3])
        b = bias_variable([3])
        logits = tf.matmul(pool1, W) + b
        variable_summaries(h, "y")
    action_indexes = tf.placeholder(tf.int32, shape=[None], name="action_indexes")
    loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits, action_indexes)
    starter_learning_rate = 1e-6
    global_step = tf.Variable(0, trainable=False)
    # decay every 10000 steps with a base of 0.96:
    learning_rate = tf.train.exponential_decay(starter_learning_rate,
                                               global_step,
                                               10000, 0.96, staircase=True)
    optimizer = tf.train.RMSPropOptimizer(learning_rate)
    gradients_and_variables = optimizer.compute_gradients(loss, tf.trainable_variables())
    discounted_values = tf.placeholder(tf.float32, shape=[None, 1])

with tf.Session(graph=graph) as s:
    for v in tf.trainable_variables():
        print(v.name, v.dtype, v.get_shape())
    s.run(tf.initialize_all_variables())
    feed_dict = {
        state: np.zeros([1, 80, 80, 1]),
        action_indexes: [1],
    }
    var_to_grad = dict((var.name, grad) for grad, var in gradients_and_variables)
    keys = sorted(var_to_grad.keys())
    print(keys)
    name_to_var = dict((var.name, var) for _, var in gradients_and_variables)
    for i in range(10):
        gradients = s.run([var_to_grad[k] for k in keys], feed_dict=feed_dict)
        for k, v in zip(keys, gradients):
            var_to_grad[k] += v
    for k in keys:
        print(var_to_grad[k])
    s.run(optimizer.apply_gradients((g, name_to_var[v]) for v, g in var_to_grad.iteritems()),
          feed_dict=feed_dict)
Updated code after @yaroslave's suggestion:
import numpy as np
import tensorflow as tf
from model import *

graph = tf.Graph()

with graph.as_default():
    minibatch = 32
    state = tf.placeholder(tf.float32, shape=[minibatch, 80, 80, 1], name="input")
    with tf.variable_scope('layer1'):
        W = weight_variable([8, 8, 1, 32])
        variable_summaries(W, "layer1/W")
        b = bias_variable([32])
        variable_summaries(b, "layer1/b")
        h = conv2d(state, W, 4) + b
        activation = tf.nn.relu(h)
        pool1 = max_pool_2x2(activation)
    print(pool1.get_shape())
    pool1 = tf.reshape(pool1, [-1, 3200])
    with tf.variable_scope('readout'):
        W = weight_variable([3200, 3])
        b = bias_variable([3])
        logits = tf.matmul(pool1, W) + b
        variable_summaries(h, "y")
    action_indexes = tf.placeholder(tf.int32, shape=[minibatch], name="action_indexes")
    loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits, action_indexes)
    starter_learning_rate = 1e-6
    global_step = tf.Variable(0, trainable=False)
    # decay every 10000 steps with a base of 0.96:
    learning_rate = tf.train.exponential_decay(starter_learning_rate,
                                               global_step,
                                               10000, 0.96, staircase=True)
    optimizer = tf.train.RMSPropOptimizer(learning_rate)
    trainable_variables = tf.trainable_variables()
    varname_to_var = dict((v.name, v) for v in trainable_variables)
    keys = sorted(varname_to_var.keys())
    gradients_and_variables = optimizer.compute_gradients(
        loss, [varname_to_var[k] for k in keys])
    var_to_grad = dict((var.name, grad) for grad, var in gradients_and_variables)
    name_to_var = dict((var.name, var) for _, var in gradients_and_variables)
    # save the gradients in memory
    var_to_ref_grad = {}
    for k in keys:
        grad = var_to_grad[k]
        print(k, grad.get_shape())
        ref = tf.Variable(tf.zeros_like(grad))
        ref = ref.assign_add(grad)
        var_to_ref_grad[k] = ref
    discounted_values = tf.placeholder(tf.float32, shape=[None, 1], name='discounted_values')
    # control when to apply gradients
    compute_gradients_flag = tf.placeholder(tf.int32, name="compute_gradients")
    def fn1():
        var_grad_list = []
        for k in keys:
            grad = var_to_ref_grad[k]
            var = varname_to_var[k]
            var_grad_list.append((grad, var))
        optimizer.apply_gradients(var_grad_list)
        return tf.no_op()
    fn2 = lambda: tf.no_op()
    last_op = tf.cond(tf.equal(compute_gradients_flag, 1), fn1, fn2)

with tf.Session(graph=graph) as s:
    feed_dict = {
        state: np.zeros([minibatch, 80, 80, 1]),
        action_indexes: [1],
        compute_gradients_flag: False,
    }
    s.run(tf.initialize_all_variables())
    for i in range(10):
        # accumulate gradients
        s.run(last_op, feed_dict=feed_dict)
You don't really have to manually accumulate gradients. You can have Tensorflow accumulate them for you by applying the rollout update as a batch.
s_list = list_of_states_visited
a_list = list_of_actions_taken
R_list = list_of_value_targets

sess.run(local_net.update, feed_dict={
    local_net.input: s_list,
    local_net.a: a_list,
    local_net.R: R_list
})
Something like this might work to create ops for accumulating gradients, resetting the accumulated gradients, and applying the accumulated gradients (untested!):
def build_gradient_accumulators(optimizer, gradients_and_variables):
    accum_grads_and_vars = []
    accumulators = []
    resetters = []
    for grad, var in gradients_and_variables:
        accum = tf.Variable(tf.zeros_like(grad))
        accum = accum.assign_add(grad)
        accumulators.append(accum)
        accum_grads_and_vars.append((accum, var))
        resetters.append(tf.assign(accum, tf.zeros_like(accum)))
    reset_op = tf.group(*resetters)
    accum_op = tf.group(*accumulators)
    apply_op = optimizer.apply_gradients(accum_grads_and_vars)
    return reset_op, accum_op, apply_op
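A sketch of how such ops might be wired into an n-step loop. Note that in the function above accum is rebound to the assign_add op, so running apply_op would re-run the accumulation; the variant below keeps the variable and its update op separate (make_accumulator_ops and n_step_feeds are hypothetical names, same old graph-mode API, equally untested):
def make_accumulator_ops(optimizer, gradients_and_variables):
    accum_ops, reset_ops, accum_grads_and_vars = [], [], []
    for grad, var in gradients_and_variables:
        # one non-trainable buffer per gradient, plus its update and reset ops
        accum_var = tf.Variable(tf.zeros_like(grad), trainable=False)
        accum_ops.append(accum_var.assign_add(grad))
        reset_ops.append(accum_var.assign(tf.zeros_like(accum_var)))
        accum_grads_and_vars.append((accum_var, var))
    return (tf.group(*reset_ops), tf.group(*accum_ops),
            optimizer.apply_gradients(accum_grads_and_vars))

with graph.as_default():
    reset_op, accum_op, apply_op = make_accumulator_ops(
        optimizer, optimizer.compute_gradients(loss))
    init_op = tf.initialize_all_variables()

with tf.Session(graph=graph) as sess:
    sess.run(init_op)
    sess.run(reset_op)                           # zero the accumulators
    for step_feed in n_step_feeds:               # one feed_dict per rollout step
        sess.run(accum_op, feed_dict=step_feed)  # add this step's gradients
    sess.run(apply_op)                           # apply the summed gradients once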

Symbolic int results in deletion of main variable (MATLAB)

I am trying to do something like this:
syms x h4 t4 c13;
t = 0.6*sin(pi*x);
h1x = 0.5*(1 - t);
h0 = h1x;
h14x = -h4 -t4*(x - 0.5);
h24x = h4 + t4*(x - 0.5);
symvar(h14x)
which returns
ans =
[ h4, t4, x]
Then
u13x = (-4*int(h14x, x, 0, x) + c13)/h0
symvar(u13x)
returns
u13x =
-(c13 + 4*x*(h4 - t4/2) + 2*t4*x^2)/((3*sin(pi*x))/10 - 1/2)
ans =
[ c13, h4, t4, x]
and
p12x = -3*int(u13x, x, 0, x)
symvar(p12x)
which is
p12x =
-3*int(-(c13 + 4*x*(h4 - t4/2) + 2*t4*x^2)/((3*sin(pi*x))/10 - 1/2), x, 0, x)
ans =
[ c13, h4, t4 ]
As you can see, symvar on u13x returns the variables [c13, h4, t4, x], but after integrating to get p12x the list is reduced to [c13, h4, t4], even though the integration limits depend on x. Is it a bug? I can't seem to get past this weird behaviour. Is there a workaround?
Here are three possible workarounds (tested in R2015a).
1. Use a symbolic function
One option is to make the input that will be passed to sym/symvar a symfun in terms of x and then use the optional second argument to specify a finite number of variables to look for:
syms x h4 t4 c13;
t = 0.6*sin(pi*x);
h1x = 0.5*(1 - t);
h0 = h1x;
h14x = -h4 -t4*(x - 0.5);
u13x = (-4*int(h14x, x, 0, x) + c13)/h0
p12x(x) = -3*int(u13x, x, 0, x) % Make symfun, function of x
n = realmax; % 4 or greater to get all variables in this case
symvar(p12x, n) % Second argument must be finite integer
which returns the expected [ x, t4, h4, c13]. Just setting the second argument to a very large integer value seems to work.
2. Convert expression to a string
There are actually two versions of symvar. There is symvar for string inputs and sym/symvar, in the Symbolic Math toolbox, for symbolic expressions. The two forms apparently behave differently in this case. So, another workaround is to convert the equation with int to a character string with sym/char before passing it to symvar and then converting the output back to a vector of symbolic variables:
syms x h4 t4 c13;
t = 0.6*sin(pi*x);
h1x = 0.5*(1 - t);
h0 = h1x;
h14x = -h4 -t4*(x - 0.5);
u13x = (-4*int(h14x, x, 0, x) + c13)/h0
p12x = -3*int(u13x, x, 0, x)
sym(symvar(char(p12x))).'
which also returns the expected [ c13, h4, t4, x] (note that order appears to be opposite of the first workaround above).
3. Call MuPAD function from Matlab
Lastly, you can call the MuPAD function indets that finds indeterminates in an expression.
syms x h4 t4 c13;
t = 0.6*sin(pi*x);
h1x = 0.5*(1 - t);
h0 = h1x;
h14x = -h4 -t4*(x - 0.5);
u13x = (-4*int(h14x, x, 0, x) + c13)/h0
p12x = -3*int(u13x, x, 0, x)
feval(symengine, 'x->indets(x) minus Type::ConstantIdents', p12x)
which returns [ x, c13, h4, t4]. This will work if the input p12x is class sym or symfun. You can also use:
evalin(symengine, ['indets(hold(' char(p12x) ')) minus Type::ConstantIdents'])
The reason that sym/symvar doesn't work in your case is because it is based on freeIndets under the hood, which explicitly ignores free variables in functions like int.

scipy - why isn't COBYLA respecting constraint?

I'm using COBYLA to do a cost minimization on a linear objective function with constraints. I'm implementing lower and upper bounds by including a constraint for each.
import numpy as np
import scipy.optimize

def linear_cost(factor_prices):
    def cost_fn(x):
        return np.dot(factor_prices, x)
    return cost_fn

def cobb_douglas(factor_elasticities):
    def tech_fn(x):
        return np.product(np.power(x, factor_elasticities), axis=1)
    return tech_fn

def mincost(targets, cost_fn, tech_fn, bounds):
    n = len(bounds)
    m = len(targets)
    x0 = np.ones(n)  # Do not use np.zeros.
    cons = []
    for factor in range(n):
        lower, upper = bounds[factor]
        l = {'type': 'ineq',
             'fun': lambda x: x[factor] - lower}
        u = {'type': 'ineq',
             'fun': lambda x: upper - x[factor]}
        cons.append(l)
        cons.append(u)
    for output in range(m):
        t = {'type': 'ineq',
             'fun': lambda x: tech_fn(x)[output] - targets[output]}
        cons.append(t)
    res = scipy.optimize.minimize(cost_fn, x0,
                                  constraints=cons,
                                  method='COBYLA')
    return res
COBYLA doesn't respect the upper or lower-bound constraints, but it does respect the technology constraint.
>>> p = np.array([5., 20.])
>>> cost_fn = linear_cost(p)
>>> fe = np.array([[0.5, 0.5]])
>>> tech_fn = cobb_douglas(fe)
>>> bounds = [[0.0, 15.0], [0.0, float('inf')]]
>>> mincost(np.array([12.0]), cost_fn, tech_fn, bounds)
       x: array([ 24.00010147,   5.99997463])
 message: 'Optimization terminated successfully.'
   maxcv: 1.9607782064667845e-10
    nfev: 75
  status: 1
 success: True
     fun: 239.99999999822359
Why wouldn't COBYLA respect the upper-bound constraint on the first factor (i.e., x[0] <= 15)?
COBYLA is in fact respecting all the bounds you give.
The problem lies in the construction of the cons list.
Namely, binding of variables in lambdas and other inner-scoped functions in Python (and JavaScript) is lexical, and does not behave the way you assume: http://eev.ee/blog/2011/04/24/gotcha-python-scoping-closures/ After the loop has finished, the variables lower and upper have the values 0 and inf, the variable factor has the value 1, and those values are what all of the lambda functions then use.
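The gotcha is easy to reproduce in isolation:
fs = [lambda: i for i in range(3)]
print([f() for f in fs])  # prints [2, 2, 2], not [0, 1, 2]
All three lambdas close over the same variable i, so they all see its final value.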
One workaround is to explicitly bind the specific values of variables to dummy keyword arguments:
for factor in range(n):
    lower, upper = bounds[factor]
    l = {'type': 'ineq',
         'fun': lambda x, a=lower, i=factor: x[i] - a}
    u = {'type': 'ineq',
         'fun': lambda x, b=upper, i=factor: b - x[i]}
    cons.append(l)
    cons.append(u)

for output in range(m):
    t = {'type': 'ineq',
         'fun': lambda x, i=output: tech_fn(x)[i] - targets[i]}
    cons.append(t)
A second way is to add a factory function generating the lambdas.
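For illustration, a sketch of that factory approach (make_bound_cons is a hypothetical helper name, analogous to makeFinalLambda in the first question above):
def make_bound_cons(i, lower, upper):
    # i, lower, and upper are evaluated at call time, so each pair of
    # constraint functions keeps its own copies
    return ({'type': 'ineq', 'fun': lambda x: x[i] - lower},
            {'type': 'ineq', 'fun': lambda x: upper - x[i]})

for factor in range(n):
    lower, upper = bounds[factor]
    cons.extend(make_bound_cons(factor, lower, upper))
The output-target constraints can be generated the same way.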