Why: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same - neural-network

I'm writing neural networks using torch. Here's a small problem I couldn't solve.
I've put both the network and the network inputs onto the GPU, but there was an error during training:
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

I have solved the problem.
After reviewing the training function, I confirmed that the model and the input data had both been loaded onto the GPU, so the problem had to be in the network model itself. When I checked the network model, I found that I wasn't using container structures like nn.ModuleList; custom sub-networks held in plain Python structures are not registered with the parent module, so they are likely not loaded onto the GPU along with it.
So I checked each layer of the network model, forcing each layer onto the GPU (.cuda()). By watching where the error was raised after each attempt, I identified the problem: the custom model was not loaded onto the GPU at the point where it was called.
Finally, I simply found the place where the custom model is called in the network's forward function and forced it onto the GPU (.cuda()).
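For reference, here is a minimal sketch of this pitfall and its fix (the class and variable names are illustrative, not from the original code): submodules held in a plain Python list are invisible to .cuda(), while an nn.ModuleList is moved along with the parent model.

import torch
import torch.nn as nn

class CustomBlock(nn.Module):          # illustrative custom sub-network
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 10)

    def forward(self, x):
        return self.fc(x)

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Pitfall: submodules kept in a plain Python list are not
        # registered, so net.cuda() never moves their weights:
        #   self.blocks = [CustomBlock(), CustomBlock()]
        # Fix: use nn.ModuleList so .cuda()/.to(device) reaches them.
        self.blocks = nn.ModuleList([CustomBlock(), CustomBlock()])

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return x

net = Net().cuda()
x = torch.randn(4, 10).cuda()
out = net(x)   # no device mismatch: every weight lives on the GPU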

Related

Large Neural Network Pruning

I have done some experiments on neural network pruning, but only on small models. I used to prune the relevant weights as follows (similarly to how it is explained in the official tutorial https://pytorch.org/tutorials/intermediate/pruning_tutorial.html):
import torch.nn.utils.prune as prune

parameters_to_prune = []
for name, module in model.named_modules():
    if 'layer' in name:  # the layers I want to prune carry "layer" in their name
        parameters_to_prune.append((module, 'weight'))

prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=sparsity_constant,
)
The main problem with doing this is that I have to define a list (or tuple) of layers to prune. This works when I define my model by hand and know the names of the different layers (for example, in the code provided, I was aware of the fact that all the fully connected layers had the string "layer" in their name).
How can I avoid this process, and define a pruning method that prunes all the parameters of a given model, without having to call the layers by name?
All in all, I'm looking for a function that, given a model and a sparsity constant, globally prunes the given model (by masking it):
model = models.resnet18()
function_that_prunes(model, sparsity_constant)
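One way to achieve this (a sketch, not from the original thread, assuming the standard torch.nn.utils.prune API) is to collect every prunable module via model.modules(), filtering on module type instead of matching names:

import torch.nn as nn
import torch.nn.utils.prune as prune
from torchvision import models

def function_that_prunes(model, sparsity_constant):
    # Gather (module, parameter_name) pairs for every module type
    # that owns a prunable 'weight' tensor, without using names.
    parameters_to_prune = [
        (m, 'weight')
        for m in model.modules()
        if isinstance(m, (nn.Linear, nn.Conv2d))
    ]
    prune.global_unstructured(
        parameters_to_prune,
        pruning_method=prune.L1Unstructured,
        amount=sparsity_constant,
    )

model = models.resnet18()
function_that_prunes(model, 0.5)  # globally mask 50% of the collected weights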

How to get the input and final output of a PyTorch network

Is there an API to get the input and output nodes of a PyTorch network?
I tried model.features(), but this doesn't help.
Example: I have a PyTorch network, and here is its structure rendered in Netron:
[image: network graph]
The Conv2d, MaxPool2d and Linear layers can be parsed easily. The trouble is getting information such as the name and size of the input node and the output node.
The input to a particular layer is simply the output of the previous layer.
So, to get any information during the forward and backward passes, such as the output of a specific layer or a gradient, or to modify any of them, PyTorch has a concept known as hooks.
Find more information here: https://pytorch.org/tutorials/beginner/former_torchies/nnft_tutorial.html#forward-and-backward-function-hooks
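As a minimal sketch (using torchvision's resnet18 as a stand-in model; the attribute names conv1 and fc are specific to that model), a forward hook can record the input and output shapes of the first and last layers:

import torch
from torchvision import models

model = models.resnet18()
shapes = {}

def make_hook(name):
    def hook(module, inputs, output):
        # 'inputs' is a tuple of tensors passed to the module,
        # 'output' is the tensor it returned.
        shapes[name] = {'in': tuple(inputs[0].shape),
                        'out': tuple(output.shape)}
    return hook

model.conv1.register_forward_hook(make_hook('conv1'))  # first layer
model.fc.register_forward_hook(make_hook('fc'))        # last layer

x = torch.randn(1, 3, 224, 224)
model(x)
print(shapes)
# e.g. {'conv1': {'in': (1, 3, 224, 224), 'out': (1, 64, 112, 112)},
#       'fc':    {'in': (1, 512), 'out': (1, 1000)}}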

Object detection for a single object only

I have been working with object detection. But these methods consist of very deep neural networks and require lots of memory to store the trained models. For example, I once tried to train a Mask R-CNN model, and the weights took 200 MB.
However, my focus is on detecting a single object only, so I guess these methods are not suitable. Are there any object detection methods that can do this job with a low memory requirement?
You can try SSD or Faster R-CNN; they are easily available in the TensorFlow Object Detection API:
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md
Here you can get pre-trained models and config files.
You can select your model by looking at the speed and mAP (accuracy) columns as per your requirement.
Following mukul's answer, I specifically recommend you check out SSDLite-MobileNetV2.
It's a lightweight model which is still expressive enough for good results.
This is especially true when you restrict yourself to a single class, as you can see in the example of FaceSSD-MobileNetV2 here (note, however, that this is vanilla SSD).
So you can simply take the pre-trained SSDLite-MobileNetV2 model with the corresponding config file and modify it for a single class.
This means changing num_classes to 1, modifying the label_map.pbtxt, and of course preparing the dataset with the single class you want.
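For concreteness, these edits look roughly like this (a sketch of the Object Detection API's pipeline.config and label_map.pbtxt formats; 'my_object' is a hypothetical class name):

In pipeline.config (excerpt):
model {
  ssd {
    num_classes: 1    # was 90 for the COCO-trained checkpoint
  }
}

In label_map.pbtxt:
item {
  id: 1               # class ids start at 1; 0 is reserved for background
  name: 'my_object'   # hypothetical single class
}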
If you want a more robust model, but one which has no pre-trained model, you can use an FPN version.
Check out this config file, which is with MobileNetV1, and modify it for your needs (e.g. switching to MobileNetV2, switching to use_depthwise, etc.).
On one hand there's no detection pre-trained model, but on the other hand the detection head is shared over all (relevant) scales, so it's somewhat easier to train.
So simply fine-tune it from the corresponding classification checkpoint from here.

Is it possible to set initial state to a simulink model to do simulations?

Consider that I have built an electrical circuit, or any other system, in Simulink. To run simulations, Simulink works by building a state-space model of the system, right? If that is the case, is it possible to set an initial condition for this model? And furthermore, is it possible to know what the state variables of the model built by Simulink are?
The Simulink.BlockDiagram.getInitialState method can be used to interrogate the model and return an appropriate structure giving the current initial values of the states.
The values in the structure can then be changed, and the (new) values used, via the model configuration parameters, to start from a different initial state. See the doc for a usage example.

Why does my Simulink model rebuild at every iteration?

I'm trying to speed up my Simulink project and want to use the Accelerator simulation mode.
The aim of my project is to control a cyclic process; it is structured as follows:
- a MATLAB script, where all parameters and a feedforward control with parameter estimation are implemented; it also starts a simulation of the Simulink model for each iteration.
- a Simulink model, where the dynamic system and the feedforward control (basically a lookup table) are implemented together with a feedback control. The parameters of all blocks are set by workspace variables/structs generated by the script.
After every simulation pass, the feedforward control variable is calculated and parameters are estimated from the simulated data; then the model is simulated again. The model does not change between iterations, but it still compiles on every cycle. First of all: is this setup appropriate for using the Accelerator mode?
I tried to follow these proposed steps to determine why it is rebuilt at every iteration: mathworks
If I run it in Accelerator mode (referring to the documentation of this function, it now compiles for simulation), I still cannot tell why it is compiled at every iteration.
csdet1.ContentsChecksum.Value ~= csdet2.ContentsChecksum.Value
is true, but the proposed code does not find any details.
csdet1.InterfaceChecksum.Value ~= csdet2.InterfaceChecksum.Value
is also true; here the proposed code reports that
UserDefinedTypesChecksum
is different. What does that mean, and how can I resolve it?
Side fact: when I run Simulink.BlockDiagram.getChecksum() with the model opened in Simulink and Normal mode chosen, I get this error:
"Continuous update specified for this chart chartname. This is not supported for RTW."
But this chart is a MATLAB Function block, not a Stateflow chart?!