I have a pytorch custom layer defined as:
class MyCustomLayer(nn.Module):
    def __init__(self):
        super(MyCustomLayer, self).__init__()
        self.my_parameter = torch.rand(1, requires_grad=True)
        # the following allows the previously defined parameter to be recognized as a network parameter when instantiating the model
        self.my_registered_parameter = nn.ParameterList([nn.Parameter(self.my_parameter)])

    def forward(self, x):
        return x * self.my_parameter
I then define my network where the custom layer is used:
class MyNet(nn.Module):
    def __init__(self):
        super(MyNet, self).__init__()
        self.layer1 = MyCustomLayer()

    def forward(self, x):
        x = self.layer1(x)
        return x
Now let's instantiate MyNet and observe the issue:
# instantiate MyNet and run it over one input value
model = MyNet()
x = torch.rand(1)
output = model(x)

criterion = nn.MSELoss()
loss = criterion(output, torch.ones(1))
loss.backward()
Iterating through the model parameters shows None for the custom layer's parameter:
for p in model.parameters():
    print(p.grad)

None
while directly accessing that parameter shows the correct grad value:
print(model.layer1.my_parameter.grad)
tensor([-1.4370])
This, in turn, prevents the optim step from updating the inner parameters automatically and leaves me with the hassle of having to update them manually. Does anyone know how I can address this issue?
What you did, i.e. return x*self.my_registered_parameter[0], worked because you used the registered parameter for calculating the gradient.
When you call nn.Parameter it returns a new object, hence the self.my_parameter that you use for the operation and the one that is registered are not the same.
You can fix this by declaring my_parameter as an nn.Parameter:

self.my_parameter = nn.Parameter(torch.rand(1))
self.my_registered_parameter = nn.ParameterList([self.my_parameter])

Alternatively, you don't need to create the my_registered_parameter variable at all. When you declare self.my_parameter as an nn.Parameter it gets registered as a parameter automatically.
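With that change, a minimal sketch of the corrected layer (same names as above) looks like this:

import torch
import torch.nn as nn

class MyCustomLayer(nn.Module):
    def __init__(self):
        super(MyCustomLayer, self).__init__()
        # assigning an nn.Parameter to a module attribute registers it,
        # so it shows up in model.parameters() and receives gradients
        self.my_parameter = nn.Parameter(torch.rand(1))

    def forward(self, x):
        return x * self.my_parameter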
Alright!
I had to switch the parameter variable calls within the custom layer to the nn.ParameterList object (i.e. return x*self.my_registered_parameter[0] instead of x*self.my_parameter). In this example that meant changing the custom layer's parameter call in the forward method to:

def forward(self, x):
    return x * self.my_registered_parameter[0]
This is where it would've been nice to have pass by reference!
Now optim updates all the parameters as expected!
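For completeness, a short sketch of a full update step with the fixed layer (optimizer choice, learning rate, and target value are illustrative):

model = MyNet()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.MSELoss()

output = model(torch.rand(1))
loss = criterion(output, torch.ones(1))

optimizer.zero_grad()
loss.backward()
optimizer.step()  # the registered parameter is updated automatically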
I am new to object orientation, and I am having trouble understanding the following:
import torch.nn as nn

class mynet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(20, 64)

    def forward(self, x):
        x = self.fc1(x)
        return x
The line self.fc1 = nn.Linear(20, 64) is supposed to create a member variable fc1 for my class, right? But what is the return value of nn.Linear(20, 64)?
According to the documentation, nn.Linear is defined as
class torch.nn.Linear(in_features: int, out_features: int, bias: bool = True).
However, in my basic OOP tutorial I have only seen something like class CLASSNAME(BASECLASS) so that the class CLASSNAME inherits from BASECLASS. What does the documentation mean with its way of writing all that stuff in between the brackets?
Also, the line x = self.fc1(x) somehow makes it look as if fc1 were a function now.
I seem to lack OOP knowledge here... Any help appreciated!
First, let's take a look at this:
self.fc1 = nn.Linear(20, 64)
This part is probably familiar to anyone with a basic understanding of Python and OOP. Here we are simply creating a new instance of the nn.Linear class and initializing it using the positional arguments 20 and 64, corresponding to in_features and out_features respectively. The arguments in the documentation are the expected arguments to be passed to nn.Linear's __init__ method; that's what "all that stuff in between the brackets" means there. It is the constructor's signature, not an inheritance list.
Now for the part that's probably a little more confusing
x = self.fc1(x)
The nn.Linear class is callable since its parent class, nn.Module, implements a special method named __call__. That means you can treat self.fc1 like a function and do things like x = self.fc1(x), which is equivalent to x = self.fc1.__call__(x).
You can run a little experiment:

import torch
import torch.nn as nn

fc1 = nn.Linear(20, 64)
print(fc1, type(fc1))

ret = fc1(torch.randn(20))
print(ret, type(ret), ret.shape)
Out:
Linear(in_features=20, out_features=64, bias=True) <class 'torch.nn.modules.linear.Linear'>
tensor([-0.2795, 0.8476, -0.8207, 0.3943, 0.1464, -0.2174, 0.6605, 0.6072,
-0.6881, -0.1118, 0.8226, 0.1515, 1.3658, 0.0814, -0.8751, -0.9587,
0.1310, 0.2539, -0.3072, -0.0225, 0.4663, -0.0019, 0.0404, 0.9279,
0.4948, -0.3420, 0.9061, 0.1752, 0.1809, 0.5917, -0.1010, -0.3210,
1.1910, 0.5145, 0.2254, 0.2077, -0.0040, -0.6406, -0.1885, 0.5270,
0.0824, -0.0787, 1.5140, -0.7958, 1.1727, 0.1862, -1.0700, 0.0431,
0.6849, 0.1393, 0.7547, 0.0917, -0.3264, -0.2152, -0.0728, -0.6441,
-0.1162, 0.4154, 0.3486, -0.1693, 0.6697, 0.0229, 0.0311, 0.1433],
grad_fn=<AddBackward0>) <class 'torch.Tensor'> torch.Size([64])
fc1 is of type torch.nn.modules.linear.Linear.
It needs some "juice" to work. In your case it needs the input tensor torch.randn(20) to return an output of torch.Size([64]).
So fc1 is a class instance that you can call with (), in which case the forward() method of nn.Linear will be run.
In most cases when working with your modules (like mynet in your case) you will list the modules in __init__, and then in the forward method of your module you will define what happens (the behavior).
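To see why calling the instance runs forward, here is a stripped-down illustration of the callable mechanism (this toy Module is not the real nn.Module, which does much more in __call__):

class Module:
    def __call__(self, *args, **kwargs):
        # delegating __call__ to forward is what makes instances callable
        return self.forward(*args, **kwargs)

class Doubler(Module):
    def forward(self, x):
        return 2 * x

d = Doubler()
print(d(21))  # 42 -- d(21) goes through Module.__call__, which calls forward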
The three kinds of modules in PyTorch are:
Functional modules
Default modules
Custom modules
Custom modules like the mynet you created typically use default modules such as:
nn.Identity()
nn.Embedding()
nn.Linear()
nn.Conv2d()
nn.BatchNorm2d() (normalizes over B×H×W, per channel)
nn.LayerNorm() (normalizes over C×H×W, per sample)
nn.Dropout()
nn.ReLU()
And many, many other modules that I haven't listed. But of course, you can create custom modules without any default modules, just by using nn.Parameter(); see the last example.
The third kind, functional modules, are defined in torch.nn.functional.
Also check the nn.Linear implementation. You may note that the functional F.linear() is used.
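As a quick sketch of that functional form (the shapes mirror the fc1 example above):

import torch
import torch.nn.functional as F

weight = torch.randn(64, 20)  # (out_features, in_features), as nn.Linear stores it
bias = torch.zeros(64)
x = torch.randn(20)

out = F.linear(x, weight, bias)  # computes x @ weight.T + bias
print(out.shape)  # torch.Size([64])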
You may test the naive implementation of Linear from the Fastai book:

import torch
import torch.nn as nn
import math

class Linear(nn.Module):
    def __init__(self, n_in, n_out):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_out, n_in) * math.sqrt(2 / n_in))
        self.bias = nn.Parameter(torch.zeros(n_out))

    def forward(self, x):
        return x @ self.weight.T + self.bias

fc = Linear(20, 64)
ret = fc(torch.randn(20))
print(ret.shape)  # torch.Size([64])
You may try to understand the difference between this naive implementation and the one provided inside PyTorch.
I'm unsure if this is possible, but I was wondering if there is a way to get a variable from an outer scope without passing it as an argument.
I've played around with globals() and inspect, but I'm having issues trying to get the attribute.
Here's what I'm trying to do:
class Example:
    @staticmethod
    def query(**kwargs):
        print(f.read())

with open(__file__) as f:
    Example.query(foo='bar')
So after a while of back and forth I have finally found a solution to my issue.
As Matthias suggested, I use globals to find the object; I decided to use inspect to add it myself, like so:
Assigning
def __enter__(self):
    inspect.stack()[1][0].f_globals["_ExampleName"] = self
Retrieving (fixed)

@staticmethod
def _find_example():
    stack = inspect.stack()
    for stack_index in range(2, len(stack)):
        stack_globals = stack[stack_index][0].f_globals
        if "_ExampleName" in stack_globals:
            return stack_globals["_ExampleName"]
This is a bit 'dodgy' as inspect is not meant to be used in a production environment; however, it works and solves my issue.
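Put together, a minimal self-contained sketch of this pattern (the __exit__ method and the demo usage are additions for illustration):

import inspect

class Example:
    def __enter__(self):
        # register this instance in the caller's module globals
        inspect.stack()[1][0].f_globals["_ExampleName"] = self
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        pass

    @staticmethod
    def _find_example():
        # walk up the stack until a frame's globals contain the instance
        stack = inspect.stack()
        for stack_index in range(2, len(stack)):
            stack_globals = stack[stack_index][0].f_globals
            if "_ExampleName" in stack_globals:
                return stack_globals["_ExampleName"]

    @staticmethod
    def query(**kwargs):
        example = Example._find_example()
        print(example, kwargs)

with Example():
    Example.query(foo='bar')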
Here is an MCVE of what you're trying to do.

class Example:
    @staticmethod
    def query(**kwargs):
        print(f.read())

with open(__file__) as f:
    Example.query(foo='bar')

Works as expected, because f is a module-level (global) name by the time query reads it.
What you should do is have the Client class set a class variable to the current session:

class Client:
    last_session = None

    def Session(self):
        # code that creates a new session, in variable s
        Client.last_session = s
        return s

client = Client()
with client.Session() as s:
    Example.query(foo='bar')

class Example:
    @staticmethod
    def query(**kwargs):
        s = Client.last_session
        s.magic_stuff()
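A runnable toy version of this idea (Session and magic_stuff are placeholders from the sketch above, not a real API):

class Session:
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        pass

    def magic_stuff(self):
        print("doing magic")

class Client:
    last_session = None

    def Session(self):
        # create a new session and remember it on the class
        s = Session()
        Client.last_session = s
        return s

class Example:
    @staticmethod
    def query(**kwargs):
        # pick up the most recent session without it being passed in
        s = Client.last_session
        s.magic_stuff()

client = Client()
with client.Session() as s:
    Example.query(foo='bar')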
I'm using a generator to make sequential training data for a hierarchical recurrent model, which needs the outputs of the previous batch to generate the inputs for the next batch. This is similar to the Keras argument stateful=True, which saves the hidden states for the next batch, except my situation is more complicated, so I can't just use that as-is.
So far I tried putting a hack in the loss function:
def custom_loss(y_true, y_pred):
    global output_ref
    output_ref[0] = y_pred[0].eval(session=K.get_session())
    output_ref[1] = y_pred[1].eval(session=K.get_session())
but that didn't compile and I hope there's a better way. Will Keras callbacks be of any help?
Learned from here:

model.compile(optimizer='adam')

# hack after compile
output_layers = ['gru']
s_name = 's'
model.metrics_names += [s_name]
model.metrics_tensors += [layer.output for layer in model.layers if layer.name in output_layers]

class my_callback(Callback):
    def on_batch_end(self, batch, logs=None):
        s_pred = logs[s_name]
        print('s_pred:', s_pred)
        return

model.fit(..., callbacks=[my_callback()])
I use this in the TensorFlow version of Keras (tf.keras), but it should work in standalone Keras as well.
import tensorflow as tf

class ModelOutput:
    '''Class wrapper for a metric that stores the output passed to it'''
    def __init__(self, name):
        self.name = name
        self.y_true = None
        self.y_pred = None

    def save_output(self, y_true, y_pred):
        self.y_true = y_true
        self.y_pred = y_pred
        return tf.constant(True)

class ModelOutputCallback(tf.keras.callbacks.Callback):
    def __init__(self, model_outputs):
        tf.keras.callbacks.Callback.__init__(self)
        self.model_outputs = model_outputs

    def on_train_batch_end(self, batch, logs=None):
        # use self.model_outputs to get the outputs here
        ...

model_outputs = [
    ModelOutput('rbox_score_map'),
    ModelOutput('rbox_shapes'),
    ModelOutput('rbox_angles')
]

# Note the extra [] around m.save_output; this example is for a model with
# 3 outputs, and metrics must be a list of lists if you type it out
model.compile(..., metrics=[[m.save_output] for m in model_outputs])
model.fit(..., callbacks=[ModelOutputCallback(model_outputs)])
I was wondering how to create a number of buttons of the same type (the number taken from user input) that can be controlled individually. I tried to use classes to do this, but it only creates a single button.
class GridBtn(QMainWindow):
    def __init__(self, self_global, x, y):
        super(GridBtn, self).__init__()
        self.button = QPushButton("0", self_global)
        self.move(x, y)

    def change_val(self, val):
        self.button = QPushButton(val, self_global)

    def returnx(self, x):
        return x

    def returny(self, y):
        return y
That is the GridBtn class that the grid generator references.
self.grid_x = 3
self.grid_y = 3  # later changed to user input, just for testing

for x in range(self.grid_x):
    for y in range(self.grid_y):
        for grid_btn in range(self.grid_y):
            print("test")  # testing if it works
            # need to fix this to make more efficient
            grid_btn = GridBtn(self, x * 10, y * 10)
            self.button_grid_layout.addWidget(grid_btn.button, x, y)
This is trying to create a specific number of buttons, but it just creates one button.
The problem was that you were calling a subfunction above the one you needed.
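For reference, a minimal sketch (assuming PyQt5) of one way to build a user-sized grid of buttons that can each be controlled individually:

import sys
from PyQt5.QtWidgets import QApplication, QWidget, QGridLayout, QPushButton

class ButtonGrid(QWidget):
    def __init__(self, grid_x, grid_y):
        super().__init__()
        layout = QGridLayout(self)
        self.buttons = {}  # each button stays addressable by its (x, y) key
        for x in range(grid_x):
            for y in range(grid_y):
                btn = QPushButton("0", self)
                # default arguments capture the current x, y for each button
                btn.clicked.connect(lambda _, x=x, y=y: self.change_val(x, y))
                layout.addWidget(btn, x, y)
                self.buttons[(x, y)] = btn

    def change_val(self, x, y):
        self.buttons[(x, y)].setText("clicked")

if __name__ == "__main__":
    app = QApplication(sys.argv)
    grid = ButtonGrid(3, 3)  # sizes would come from user input
    grid.show()
    sys.exit(app.exec_())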
I am using an external library in Scala, which uses a set of traits to pass around complex configuration options to different methods. This is Highcharts Scala API, but the problem seems to be more general.
The library defines a trait (HighchartsOptions in the actual usage), which is just a data transfer object that stores a number of fields and allows them to be passed around. The code, simplified and generalized for clarity, looks like this:
trait Opts {
  def option1: Int = 3
  def option2: String = "abc"
  // Many more follow, often of more complex types
}
As long as the complete set of options can be generated in one place, this allows for a neat syntax:
val opts = new Opts() {
  override val option1 = 5
  // And so on for more fields
}
doSomething(opts)
However, there are a few situations where one piece of code prepares such a configuration but another piece of code needs to adjust just one extra option. It would be nice to be able to pass some Opts instance to a method and let the method modify a value or two.
Since the original trait is based on defs rather than vars, it's easy to override an option's value only if the type of the object is known, like in the example above. If a method receives only an instance of some anonymous subclass of Opts, how can it create another instance or modify the received one so that a call to e.g. option2 could return a different value? The desired operation is similar to what Mockito's spy does, however I feel there should be some less contrived way than using a mocking framework to achieve this effect.
PS: Actually I am a bit surprised by the use of such an interface by the library's authors, so perhaps I'm missing something and there is some completely different way of achieving my goal of building a single set of options from several different places in the code (e.g. some builder object that is mutable and that I can pass around instead of the finished HighchartsOptions)?
I would first check if using the Opts trait (solely) is an absolute necessity. Hopefully it's not, and then you can just extend the trait, overriding defs with vars, like you said.
When Opts is mandatory and you have its instance that you want to copy modifying some fields, here's what you could do:
Write a wrapper for Opts which extends Opts but delegates every call to the wrapped Opts, excluding the fields that you want to modify. Set those fields to the values you want.
Writing a wrapper for a broad-interface trait can be a boring task, therefore you may consider using http://www.warski.org/blog/2013/09/automatic-generation-of-delegate-methods-with-macro-annotations/ to let macros generate most of it automatically.
The shortest, simplest way.
Define a case class:
case class Options(
  option1: Int,
  option2: String
  /* ... */
) extends Opts
and an implicit conversion from Opts to your Options:
object OptsConverter {
  implicit def toOptions(opts: Opts) = Options(
    option1 = opts.option1,
    option2 = opts.option2
    /* ... */
  )
}
That way you get the copy method (generated by the compiler) for free.
You can use it like this:

import OptsConverter.toOptions

def usage(opts: Opts) = {
  val improvedOpts = opts.copy(option2 = "improved")
  /* ... */
}
Note, that Options extends Opts, so you can use it whenever Opts is required. You'll be able to call copy to obtain a modified instance of Opts in every place where you import the implicit conversion.
The simplest solution is to allow the trait to define its own "copy" method and let its subclasses (or even the base class) work with that. However, the parameters can really only match the base class unless you recast it later. Incidentally this doesn't work as "mixed in", so your root might as well be an abstract class, but it works the same way. The point is that the subclass type keeps getting passed along as it's copied.
(Sorry I typed this without a compiler so it may need some work)
trait A {
  def option1: Int
  def option2: String

  def copyA(option1_ : Int = option1, option2_ : String = option2): A = new A {
    def option1 = option1_
    def option2 = option2_
  }
}

trait B extends A { me =>
  def option3: Double

  // callable from A but properly returns B (covariant return type)
  override def copyA(option1_ : Int = option1, option2_ : String = option2): B = new B {
    def option1 = option1_
    def option2 = option2_
    def option3 = me.option3
  }

  // this is only callable if you've cast to type B
  def copyB(option1_ : Int = option1, option2_ : String = option2, option3_ : Double = option3): B = new B {
    def option1 = option1_
    def option2 = option2_
    def option3 = option3_
  }
}