Scaling gradients when the loss comes from sparse_softmax_cross_entropy_with_logits

How can I scale gradients when the loss comes from sparse_softmax_cross_entropy_with_logits? For example, I was trying to divide each gradient by 128 as below, but I got this error:
new_gradients = [(grad/128, var) for (grad, var) in gradients]
TypeError: unsupported operand type(s) for /: 'IndexedSlices' and 'int'
The code I was using is below:
loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=labels)
gradients = opt.compute_gradients(loss)
new_gradients = [(grad/128, var) for (grad, var) in gradients]
train_step = opt.apply_gradients(new_gradients)

I found a way to solve the problem as follows:
new_gradients = [(grad/128, var) for (grad, var) in gradients]
should be
new_gradients = [(tf.div(grad, 128), var) for (grad, var) in gradients]
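For reference, here is a minimal sketch of the fixed pipeline (TF1-style API; opt, logits, and labels are assumed to be defined as in the question, and I wrap the per-example losses in tf.reduce_mean, which is the conventional scalar loss for compute_gradients):

import tensorflow as tf

# reduce the per-example losses to a scalar for the optimizer
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=labels))
gradients = opt.compute_gradients(loss)
# tf.div accepts IndexedSlices (implicitly converting them to dense tensors),
# unlike the bare / operator
new_gradients = [(tf.div(grad, 128), var) for (grad, var) in gradients]
train_step = opt.apply_gradients(new_gradients)

If you would rather keep sparse gradients sparse (a suggestion of mine, not from the question), you can scale the .values of each IndexedSlices directly instead:

new_gradients = [
    (tf.IndexedSlices(g.values / 128, g.indices, g.dense_shape), v)
    if isinstance(g, tf.IndexedSlices) else (g / 128, v)
    for (g, v) in gradients
]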


Using the GPU with Lux and NeuralPDE Julia

I am trying to run a model on the GPU; it runs without problems on the CPU. I think using measured boundary conditions is somehow causing the issue, but I am not sure. I am following this example for the GPU: https://docs.sciml.ai/dev/modules/NeuralPDE/tutorials/gpu/, and this example for using measured boundary conditions: https://docs.sciml.ai/dev/modules/MethodOfLines/tutorials/icbc_sampled/
using NeuralPDE, Lux, CUDA, Random
using Optimization
using OptimizationOptimisers
using NNlib
import ModelingToolkit: Interval
using Interpolations
# Measured Boundary Conditions (Arbitrary For Example)
bc1 = 1.0:1:1001.0 .|> Float32
bc2 = 1.0:1:1001.0 .|> Float32
ic1 = zeros(101) .|> Float32
ic2 = zeros(101) .|> Float32;
# Interpolation Functions Registered as Symbolic
itp1 = interpolate(bc1, BSpline(Cubic(Line(OnGrid()))))
up_cond_1_f(t::Float32) = itp1(t)
@register_symbolic up_cond_1_f(t)
itp2 = interpolate(bc2, BSpline(Cubic(Line(OnGrid()))))
up_cond_2_f(t::Float32) = itp2(t)
@register_symbolic up_cond_2_f(t)
itp3 = interpolate(ic1, BSpline(Cubic(Line(OnGrid()))))
init_cond_1_f(x::Float32) = itp3(x)
@register_symbolic init_cond_1_f(x)
itp4 = interpolate(ic2, BSpline(Cubic(Line(OnGrid()))))
init_cond_2_f(x::Float32) = itp4(x)
@register_symbolic init_cond_2_f(x);
# Parameters and differentials
@parameters t, x
@variables u1(..), u2(..)
Dt = Differential(t)
Dx = Differential(x);
# Arbitrary Equations
eqs = [Dt(u1(t, x)) + Dx(u2(t, x)) ~ 0.,
       Dt(u1(t, x)) * u1(t, x) + Dx(u2(t, x)) + 9.81 ~ 0.]
# Boundary Conditions with Measured Data
bcs = [
    u1(t, 1) ~ up_cond_1_f(t),
    u2(t, 1) ~ up_cond_2_f(t),
    u1(1, x) ~ init_cond_1_f(x),
    u2(1, x) ~ init_cond_2_f(x)
]
# Space and time domains
domains = [t ∈ Interval(1.0, 1001.0),
           x ∈ Interval(1.0, 101.0)];
# Neural network
input_ = length(domains)
n = 10
chain = Chain(Dense(input_,n,NNlib.tanh_fast),Dense(n,n,NNlib.tanh_fast),Dense(n,4))
strategy = GridTraining(.25)
ps = Lux.setup(Random.default_rng(), chain)[1]
ps = ps |> Lux.ComponentArray |> gpu .|> Float32
discretization = PhysicsInformedNN(chain,
                                   strategy,
                                   init_params = ps)
# Model Setup
@named pdesystem = PDESystem(eqs, bcs, domains, [t, x], [u1(t, x), u2(t, x)])
prob = discretize(pdesystem,discretization);
sym_prob = symbolic_discretize(pdesystem,discretization);
# Losses and Callbacks
pde_inner_loss_functions = sym_prob.loss_functions.pde_loss_functions
bcs_inner_loss_functions = sym_prob.loss_functions.bc_loss_functions
callback = function (p, l)
    println("loss: ", l)
    println("pde_losses: ", map(l_ -> l_(p), pde_inner_loss_functions))
    println("bcs_losses: ", map(l_ -> l_(p), bcs_inner_loss_functions))
    return false
end;
# Train Model (Throws Error)
res = Optimization.solve(prob,Adam(0.01); callback = callback, maxiters=5000)
phi = discretization.phi;
I get the following error:
GPU broadcast resulted in non-concrete element type Union{}.
This probably means that the function you are broadcasting contains an error or type instability.
Please advise.

Callback in Benders decomposition

I am learning the Benders decomposition method and I want to use it on a small instance. I started from the "bendersatsp.py" example in CPLEX. When I run this example with my problem, I get the following error. Could you please let me know what the problem is and how I can fix it? Below you can see the modifications in the lazy-constraints function. I have two decision variables in the master problem, "z_{ik}" and "u_{k}", that are incorporated as constants in the workerLP.
class BendersLazyConsCallback(LazyConstraintCallback):

    def __call__(self):
        print("shoma")
        v = self.v
        u = self.u
        z = self.z
        print("u:", u)
        print("z:", z)
        workerLP = self.workerLP
        boxty = len(u)
        # scenarios = self.scenarios2
        ite = len(z)
        print("ite:", ite)
        print("boxty:", boxty)
        # Get the current x solution
        sol1 = []
        sol2 = []
        sol3 = []
        print("okkkk")
        for k in range(1, boxty + 1):
            sol1.append([])
            sol1[k - 1] = [self.get_values(u[k - 1])]
            print("sol1:", sol1[k - 1])
        for i in range(1, ite + 1):
            sol2.append([])
            sol2[i - 1] = self.get_values(z[i - 1])
            print("sol2:", sol2[i - 1])
        for i in range(1, ite + 1):
            sol3.append([])
            sol3[i - 1] = self.get_values(v[i - 1])
            # print("sol3:", sol3[i - 1])
        # Benders' cut separation
        if workerLP.separate(sol3, sol1, sol2, v, u, z):
            self.add(cut=workerLP.cutLhs, sense="G", rhs=workerLP.cutRhs)
CPLEX Error 1006: Error during callback.
benders(sys.argv[1][0], datafile)
cpx.solve()
_proc.mipopt(self._env._e, self._lp)
check_status(env, status)
raise callback_exception
TypeError: unsupported operand type(s) for +: 'int' and 'list'
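A likely source of the TypeError (a guess from the code shown, since workerLP.separate is not shown): get_values returns a bare float for a single variable, but sol1 wraps each value in a one-element list while sol2 and sol3 do not, so any arithmetic inside separate that mixes sol1 entries with numbers fails. A minimal reproduction of the error (the names sol1_entry and sol2_entry are mine):

sol1_entry = [0.5]   # list-wrapped, as in sol1[k - 1] = [self.get_values(u[k - 1])]
sol2_entry = 0.5     # bare float, as in sol2[i - 1] = self.get_values(z[i - 1])

total = 0
total += sol2_entry  # fine: int + float
total += sol1_entry  # TypeError: unsupported operand type(s) for +: 'int' and 'list'

Dropping the extra brackets, i.e. sol1[k - 1] = self.get_values(u[k - 1]), would make the three solution lists consistent.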

Julia: Inexact error when trying to get the integer part of a BigFloat

I am interested in getting the digits of a BigFloat in the form of bytes. I get a very strange error that I cannot debug. I provide a minimal example where the error appears.
function floatToBytes(x::BigFloat)
    ret = zeros(UInt8, 4)
    xs = significand(x) / 2
    b = UInt8(0)
    for i = 1:4
        xs *= 256
        b = trunc(UInt8, xs)
        ret[i] = b
        xs -= b
    end
    return ret
end
println( floatToBytes(BigFloat(0.9921875001164153)) )
println( floatToBytes(BigFloat(0.9960937501164153)) )
What I get when running this is
UInt8[0xfe, 0x00, 0x00, 0x00]
ERROR: LoadError: InexactError()
Stacktrace:
[1] trunc(::Type{UInt8}, ::BigFloat) at ./mpfr.jl:201
etc.
It seems that it doesn't want to turn 255 into a UInt8. I can circumvent the problem by defining the function as
function floatToBytes(x::BigFloat)
    ret = zeros(UInt8, 4)
    xs = significand(x) / 2
    b = UInt8(0)
    for i = 1:4
        xs *= 256
        try
            b = trunc(UInt8, xs)
        catch
            b = trunc(UInt8, xs - 1) + UInt8(1)
        end
        ret[i] = b
        xs -= b
    end
    return ret
end
But this is highly unsatisfactory. What is going on here?
This looks like a bug in trunc for BigFloat. The current code does
(typemin(T) <= x <= typemax(T)) || throw(InexactError(:trunc, T, x))
which throws because x is slightly larger than 255, the typemax of UInt8, even though truncating it would give exactly 255.
It actually needs to do the trunc in the BigFloat domain first and only then convert to T (with the conversion checking against typemax).
I've opened an issue regarding this at: https://github.com/JuliaLang/julia/issues/24041
In the meantime, a solution could be to do:
UInt8(trunc(xs))
i.e. trunc first and cast later. For example:
julia> UInt8(trunc(BigFloat(0.9960937501164153)*256))
0xff

PARI/GP polynomial operator

I am having trouble with PARI/GP. Does anyone know the right function/command in PARI/GP for finding the minimal polynomial of
y = x^2 - x + 1 (mod x^6 + x^5 + x^4 + x^3 + x^2 + x + 1)?
PARI/GP gives this error:
gp > minpoly(x^6+x^5+x^4+x^3+x^2+x+1,{v=x^2-x+1})
*** at top-level: ...(x^6+x^5+x^4+x^3+x^2+x+1,v=x^2-x+1)
*** ^----------
*** incorrect type in evaluator [variable name expected] (t_INT).
Thanks for helping.
I also tried:
(11:36) gp > elt = Mod(x^2-x+1, x^6+x^5+x^4+x^3+x^2+x+1)
%52 = Mod(43, 39991)
(11:36) gp > poly = minpoly(elt, v='y)
%53 = Mod(1, 39991)*y + Mod(39948, 39991)
(11:36) gp > subst(poly, variable(poly), elt)
%54 = Mod(0, 39991)
Is this supposed to be a script?
In fact, you want the following call:
elt = Mod('x^2 - 'x + 1, 'x^6 + 'x^5 + 'x^4 + 'x^3 + 'x^2 + 'x + 1)
poly = minpoly(elt, v = 'y)
which returns
y^6 - 6*y^5 + 15*y^4 - 20*y^3 + 22*y^2 - 6*y + 1
Just to verify:
subst(poly, variable(poly), elt)
returns 0, as expected.
The parameter v of minpoly just names the variable of the resulting polynomial; it is not the modulus. The quotes ('x) force the formal variable x: in your session x had evidently already been assigned a number, which is why your Mod(x^2-x+1, ...) collapsed to Mod(43, 39991).

Cython and Scipy

I'm cythonizing a script that contains the scipy.stats.norm() function for calculating implied volatility.
Instead of scipy.stats.norm() I use scipy.special.ndtr(), since it is somewhat faster. However, when profiling my script, most of the time (50 of 125 sec) is still spent within this function, in particular within _distn_infrastructure.py:1610(cdf).
That's the function:
def cdf(self, x, *args, **kwds):
    """
    in class rv_continuous(rv_generic):

    Cumulative distribution function of the given RV.

    Parameters
    ----------
    x : array_like
        quantiles
    arg1, arg2, arg3,... : array_like
        The shape parameter(s) for the distribution (see docstring of the
        instance object for more information)
    loc : array_like, optional
        location parameter (default=0)
    scale : array_like, optional
        scale parameter (default=1)

    Returns
    -------
    cdf : ndarray
        Cumulative distribution function evaluated at `x`
    """
    args, loc, scale = self._parse_args(*args, **kwds)
    x, loc, scale = map(asarray, (x, loc, scale))
    args = tuple(map(asarray, args))
    x = (x - loc) * 1.0 / scale
    cond0 = self._argcheck(*args) & (scale > 0)
    cond1 = (scale > 0) & (x > self.a) & (x < self.b)
    cond2 = (x >= self.b) & cond0
    cond = cond0 & cond1
    output = zeros(shape(cond), 'd')
    place(output, (1 - cond0) + np.isnan(x), self.badvalue)
    place(output, cond2, 1.0)
    if any(cond):  # call only if at least 1 entry
        goodargs = argsreduce(cond, *((x,) + args))
        place(output, cond, self._cdf(*goodargs))
    if output.ndim == 0:
        return output[()]
    return output
However, I neither see any function that does the actual CDF calculation nor a call to another function that does it. I tried to print the output of this function by inserting
print output
before
return output
however, the print statement is highlighted as a syntax error, and when running the script there is no printed output. How do I go on from here? I somehow need to speed up the norm-CDF calculation.
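For what it's worth, the actual computation in the wrapper above is the self._cdf(*goodargs) call, which dispatches to the distribution-specific implementation; for the normal distribution this ends up in scipy.special.ndtr. Everything else in cdf is argument parsing, broadcasting, and edge-case handling, which is pure overhead here. A minimal sketch of bypassing that overhead (the helper name fast_norm_cdf is mine):

from scipy.special import ndtr

def fast_norm_cdf(x, loc=0.0, scale=1.0):
    # standardize once, then evaluate the raw standard-normal CDF kernel directly
    return ndtr((x - loc) / scale)

As for the print problem: print output is Python 2 syntax; under Python 3 it has to be print(output), which would explain the syntax highlight.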