How to get the input and final output of a PyTorch network

Is there an API for getting the input and output nodes of a PyTorch network?
I tried model.features(), but that doesn't help.
Example: I have a PyTorch network, and here is its structure as shown in Netron:
[Netron screenshot of the network structure]
The Conv2d, MaxPool2d and Linear layers can be parsed easily. What I have trouble with is getting information such as the name and size of the input node and the output node.

The input to a particular layer is the output of the previous layer.
So, to get any information during the forward and backward passes (such as the output of a specific layer, or a gradient), or to modify any of them, PyTorch has a concept known as hooks.
Find more information here: https://pytorch.org/tutorials/beginner/former_torchies/nnft_tutorial.html#forward-and-backward-function-hooks
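For example, here is a minimal sketch that uses forward hooks to record the input and output shapes of every layer (torchvision's resnet18 and the 1x3x224x224 dummy input are assumptions for illustration):

import torch
import torchvision.models as models

model = models.resnet18(pretrained=False)

# Record each layer's input and output shapes during one forward pass.
shapes = {}

def make_hook(name):
    def hook(module, inputs, output):
        shapes[name] = (tuple(inputs[0].shape), tuple(output.shape))
    return hook

handles = [m.register_forward_hook(make_hook(n))
           for n, m in model.named_modules() if n]  # skip the root module

with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))  # the dummy input defines the input node's size

for h in handles:
    h.remove()  # detach the hooks when done

print(shapes["conv1"])  # ((1, 3, 224, 224), (1, 64, 112, 112))
print(shapes["fc"])     # ((1, 512), (1, 1000))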

Related

Why: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

I'm writing neural networks using torch. Here's a little problem I can't solve.
I've put both the network and the network inputs onto the GPU, but there was an error in training:
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
I have solved the problem.
After reviewing the training function, I confirmed that the model and the input data had been loaded onto the GPU, so the problem had to be in the network model. When I checked the network model, I found that I wasn't using structures like a plain list, so the likely cause was that a custom model was not loaded onto the GPU where it was used.
So I checked each layer of the network model, forcing each layer onto the GPU with .cuda(). Guided by the location of the error message, trying each layer in turn identified the problem: the model was not loaded onto the GPU at the point where a custom model was called.
Finally, the fix was simply to find the place where the custom model is called in the network's forward function and force it onto the GPU with .cuda().
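A minimal sketch of this pitfall and two possible fixes (the module names here are illustrative, not from the original post):

import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, 3, padding=1)

    def forward(self, x):
        return self.conv(x)

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Pitfall: a submodule kept in a container that nn.Module does not
        # track (a plain Python list) is NOT moved by net.cuda().
        self.blocks = [Block()]  # weights stay on the CPU
        # Fix A: register it properly so .cuda()/.to() recurse into it:
        # self.blocks = nn.ModuleList([Block()])

    def forward(self, x):
        for b in self.blocks:
            b.cuda()  # Fix B (as in the answer above): force it onto the GPU where it is called
            x = b(x)
        return x

net = Net().cuda()
out = net(torch.randn(1, 3, 32, 32).cuda())  # without a fix: the RuntimeError above
print(out.shape)  # torch.Size([1, 8, 32, 32])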

Calculate number of parameters in neural network

I am wondering whether the number of parameters in models like ResNet18, VGG16, and DenseNet201 would change if we changed the input size to the model.
I measured the number of parameters with the following command:
pytorch_total_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
Also, I have tried this snippet, and the number of parameters did not change for different input sizes:
import torchvision.models as models
from torchsummary import summary  # summary() comes from the torchsummary package

model = models.resnet18(pretrained=False)
model.cuda()
summary(model, (3, 64, 64))  # input shape is (channels, H, W); resnet18 expects 3 channels
No, it would not. The parameters of a model serve to process the input as it propagates through the network pipeline.
The parameters are trained to serve that purpose, which is defined by the training task. Consider what an increase in the number of parameters based on the input would mean: what would their values be? Would they be random? How would these new parameters with new values affect the model's inference?
Such a sudden, random change to the fine-tuned, well-trained parameters of the model would be impractical. There may be other algorithms, which I am unaware of, that change their parameter collection based on the input, but the architectures mentioned in the question do not support such functionality.
Trainable parameters do not change with a change in input size. If you look at the weights in the first layer of the model with the command list(model.parameters())[0].shape, you can see that it does not depend on the height and width of the input, but on the number of channels (e.g. grayscale, RGB, hyperspectral), which is usually insignificant in bigger models. For further information about getting the input shape, you can see this toy example.
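A quick check of both points (a sketch; the printed values assume torchvision's resnet18):

import torch
import torchvision.models as models

model = models.resnet18(pretrained=False).eval()

# The first conv weight has shape (out_channels, in_channels, kH, kW);
# the input's height and width do not appear in it.
print(list(model.parameters())[0].shape)  # torch.Size([64, 3, 7, 7])

pytorch_total_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

# The same fixed set of weights processes inputs of different sizes.
with torch.no_grad():
    for size in (64, 224):
        model(torch.randn(1, 3, size, size))

print(pytorch_total_params)  # 11689512, regardless of input size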

Predictions using Convolutional Neural Networks and DL4J

This is my first time working with DL4J (Deep Learning for Java) and also my first convolutional neural network. My goal is to use the convolutional neural network to give me some predicted values about an image. I gathered and labelled my images myself. The labels, or expected outputs, consist of two numbers between 0 and 1 (I just wrote them into the file name, like 0.01x0.87.jpg).
Now I can't find any way to use the DataSetIterator class that DL4J uses which would also let me set my label values.
Is there a simple way to tell DL4J that I want to train my network to recognize that the image 0.01x0.01.jpg should spit out the values 0.01 and 0.01?
What you want to do is usually known as regression. In contrast to classification, where you want either a 0 or a 1 as the output, in regression any value can be the target.
In your case, you will likely want to use a network architecture that uses either a sigmoid (which forces your values to be between 0 and 1) or an identity (which keeps the values as is, i.e. allows for them to be outside of the 0 to 1 range) activation function.
As you have two values that you are trying to predict, you will also have to define that you are using two outputs.
So much for your model architecture.
For data loading, you can use the ImageRecordReader, and also pass it a PathMultiLabelGenerator of your own. When you implement the PathMultiLabelGenerator interface, you get the full path of the image as a string, and you can do whatever you want with it: for example, remove the file ending, split on x, and parse your filename into a list of DoubleWritable values. DoubleWritable is just a simple wrapper class for double, so creating one is as easy as passing the actual value to its constructor. A sketch is shown below.
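For instance, a sketch of such a generator (assuming DataVec's PathMultiLabelGenerator interface and filenames like 0.01x0.87.jpg; the class name is made up):

import org.datavec.api.io.labels.PathMultiLabelGenerator;
import org.datavec.api.writable.DoubleWritable;
import org.datavec.api.writable.Writable;
import java.util.Arrays;
import java.util.List;

public class CoordinateLabelGenerator implements PathMultiLabelGenerator {
    @Override
    public List<Writable> getLabels(String uriPath) {
        // e.g. ".../0.01x0.87.jpg" -> labels [0.01, 0.87]
        String name = uriPath.substring(uriPath.lastIndexOf('/') + 1);
        name = name.substring(0, name.lastIndexOf('.'));  // strip the ".jpg" file ending
        String[] parts = name.split("x");                 // split on the "x" separator
        return Arrays.asList(
                new DoubleWritable(Double.parseDouble(parts[0])),
                new DoubleWritable(Double.parseDouble(parts[1])));
    }
}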
To create a dataset iterator you can now follow the documentation on RecordReaderDataSetIterator.

On Keras and batch ground truth.

I'm trying to implement a loss function in Keras with the TensorFlow backend. The function in question being kind of tricky, I would like to come up with the following scheme:
Throughout the whole network, I would like to keep some information I(B) depending on the batch B passed as input. I know this can be done by using a multi-input/output network.
More precisely, for each batch B, I would like to retrieve the associated labels B_true, because my function I is a function of this ground truth, i.e. I(B_true).
Is this possible?
Thanks a lot!
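For what it's worth, a Keras loss function already receives the batch ground truth: a custom loss is called with (y_true, y_pred) for every batch, so a quantity I(B_true) can be computed there. A minimal sketch (custom_loss and the statistic I are hypothetical; here I is simply the batch label mean):

import tensorflow as tf

def custom_loss(y_true, y_pred):
    i_b = tf.reduce_mean(y_true)  # I(B_true), recomputed for every batch B
    mse = tf.reduce_mean(tf.square(y_pred - y_true))
    return mse * i_b  # combine the batch statistic with an ordinary loss

# model.compile(optimizer="adam", loss=custom_loss)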

Neural network to figure out the meaning of user input

I'm trying to make a neural network figure out the meaning of input (keyboard keys in this case) according to the user.
I have multiple possible output "commands" that the NN can interpret the inputs to mean, and at each state certain outputs count as beneficial while others are a detriment.
When the NN starts up for the first time, no input should have any particular meaning to it, but as time goes on I want the NN to be able to figure out what the user most likely meant.
I've tried a multilayer perceptron NN that has as many input nodes as there are physical inputs, as many output nodes as there are commands, and a single hidden layer with a number of nodes equal to the sum of the other two layers; in this case it is a 5-15-10 network.
The NN assumes that the user will only make moves that are in the NN's best interest.
So far it seems the NN is just figuring out which command will most likely result in a beneficial move, regardless of the input key, rather than trying to figure out which key should result in which move according to the user.
Because of this I'm wondering (most likely wrongly) whether I should build a separate NN for each input, to try to figure out the current output according to the user.
Is there a different type of NN I should look into that will work better, and is there a recommended configuration for this problem?
I'll be happy with some recommendations of reading material that would help in this particular problem.
I'm at best an amateur in NNs and would like to learn a lot more about the whole field, but I'm trying to focus my efforts on this problem for now.
As I understand it, you want the output to follow the behaviour of the player, since the number of possible inputs is larger than in the actual case. In my view there should therefore be some kind of memory of the actions taken by the player, in order to find the patterns. This can be done using Long Short-Term Memory (LSTM) networks.
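A minimal sketch of that idea in PyTorch (the 5 keys and 10 commands come from the question's 5-15-10 layout; the hidden size and the one-hot key encoding are assumptions):

import torch
import torch.nn as nn

class KeyIntentLSTM(nn.Module):
    def __init__(self, n_keys=5, n_commands=10, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_keys, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_commands)

    def forward(self, key_seq):  # key_seq: (batch, time, n_keys), one-hot key presses
        out, _ = self.lstm(key_seq)
        return self.head(out[:, -1])  # predict the intended command from the whole history

model = KeyIntentLSTM()
logits = model(torch.randn(2, 8, 5))  # a batch of 2 sequences of 8 key presses
print(logits.shape)  # torch.Size([2, 10])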