Image Processing MLP - Detecting Classes - matlab

I've implemented an MLP that is able to detect handwritten digits. So far the algorithm can identify the digits 0 and 1, but when I add a new class, e.g. 2, the algorithm is unable to learn it. At first I thought I had made a mistake in the implementation of the new class, so I decided to swap the new class with a previous one that worked; in other words, if class0 was 0 and the new class was 2, now class0 is 2 and the new class is 0. Surprisingly, the new class is then detected with almost no error, but class0 has a huge error, which means the new class is implemented properly.
The MLP has two layers with 20 hidden units each, both nonlinear with sigmoid activations.

If I understand your question correctly: when you add a new class to a trained model such as the neural network here, the final layer has to change, i.e. the number of neurons in the output layer must grow to match the new number of classes (and the training targets must be re-encoded accordingly).
This can be one of the reasons the new class is not being detected.
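A rough sketch of what that means (Python/NumPy is used purely for illustration, and the sizes are placeholders, not the asker's actual MATLAB dimensions):
import numpy as np

num_classes = 3    # was 2 when only digits 0 and 1 were trained
hidden_units = 20  # size of the last hidden layer

# The output layer needs one unit per class, and the targets must be
# re-encoded to match, e.g. digit 2 -> one-hot vector [0, 0, 1].
W_out = 0.01 * np.random.randn(hidden_units, num_classes)
b_out = np.zeros(num_classes)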

Related

MiniBatches there are no samples for class label exception

I was following the first example given in the Accord.Net framework's documentation here to train a multi-class SVM classifier with my own dataset, but during the training loop I got an error that says:
There are no samples for class label 3. Please make sure that class
labels are contiguous and there is at least one training sample for
each label.
The data I'm using has 27 classes, and 10 of the classes have fewer than 1000 samples, so the batches returned by the MiniBatches object might not contain samples from all of the available classes. Is there a way to resolve this issue without writing a custom object that samples from each class for each mini-batch?

How to combine anomaly detection model with Object Detection Model

Newbie here in deep learning. My question is:
I have an already trained object detection model (YOLOv5) for 3 classes [0, 1, 2]. Now, my next step is to classify one class, e.g. class [0], as anomalous or not. In other words, I need an additional classifier to further classify it into two sub-classes, i.e., anomalous or non-anomalous, through the use of a classifier or anomaly detection model. Can you give me advice on how I can proceed with this? I will use GANs as the anomaly detection model. This would be a great help. Thank you in advance.
One of the ways to solve your problem is through One-Class Learning (OCL). In OCL, the algorithm learns from only one user-defined class of interest (in your case, non-anomalous objects of class 0) and classifies a new example as belonging to this class of interest or not. Thus, you can adapt OCL algorithms to your problem: use labeled non-anomalous examples of class 0 to train the algorithm. Then your algorithm will answer whether new instances of class 0 are non-anomalous (class of interest) or anomalous (non-interest class). Examples of OCL algorithms can be found at: https://scikit-learn.org/stable/modules/outlier_detection.html. Note that these are traditional OCL algorithms; there are also versions of OCL algorithms based on deep learning.
In addition, I'll provide an example of anomaly detection using the One-Class Support Vector Machine (OCSVM), one of the most traditional and well-known OCL algorithms.
from sklearn.svm import OneClassSVM as OCSVM

normal_data_of_class_0 = [[], [], ..., []]  # feature vectors of normal (non-anomalous) class-0 examples
ocsvm = OCSVM()
ocsvm.fit(normal_data_of_class_0)

data_of_class_0 = [[], [], ..., []]  # feature vectors of new class-0 examples
y_pred = ocsvm.predict(data_of_class_0)  # +1 == normal data (interest class) || -1 == abnormal data (outliers)

Large Neural Network Pruning

I have done some experiments on neural network pruning, but only on small models. I used to prune the relevant weights as follows (similar to what is explained in the official tutorial https://pytorch.org/tutorials/intermediate/pruning_tutorial.html):
import torch.nn.utils.prune as prune

parameters_to_prune = []
for name, module in model.named_modules():
    if 'layer' in name:
        parameters_to_prune.append((module, 'weight'))

prune.global_unstructured(
    parameters_to_prune,
    pruning_method=prune.L1Unstructured,
    amount=sparsity_constant,
)
The main problem in doing this is that I have to define a list (or tuple) of layers to prune. This works when I define my model by hand and I know the names of the different layers (for example, in the code provided, I was aware that all the fully connected layers had the string "layer" in their name).
How can I avoid this process, and define a pruning method that prunes all the parameters of a given model, without having to call the layers by name?
All in all, I'm looking for a function that, given a model and a sparsity constant, globally prunes the given model (by masking it):
model = models.resnet18()
function_that_prunes(model, sparsity_constant)
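A minimal sketch of such a function (my own illustration, not an official PyTorch utility; it assumes the goal is to globally prune the weights of every Conv2d and Linear module, so no layer names are needed):
import torch.nn as nn
import torch.nn.utils.prune as prune

def function_that_prunes(model, sparsity_constant):
    # Collect every Conv2d/Linear weight in the model, regardless of layer names.
    parameters_to_prune = [
        (module, 'weight')
        for module in model.modules()
        if isinstance(module, (nn.Conv2d, nn.Linear))
    ]
    prune.global_unstructured(
        parameters_to_prune,
        pruning_method=prune.L1Unstructured,
        amount=sparsity_constant,
    )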

Character Recognition Using Back Propagation Algorithm Testing

Recently I've been working on character recognition using the Back Propagation algorithm. I've taken the images and reduced them to 5x7 size, so I get 35 pixels, and I trained the network on those pixels with 35 input neurons, 35 hidden nodes, and 10 output nodes. I completed the training successfully and got the weights I needed. And here I've got stuck. I have my test set and I know I should feed it forward through the network, but I don't know what to do exactly. My test set will be 4 samples of 1x35, and my output layer has 10 neurons. How exactly do I distinguish the characters from the output that I will get? I want to know how this testing works. Please guide me through this stage. Thanks in advance.
One vs All
A common approach for testing these types of neural networks is the "one-vs-all" approach. We view each of the output nodes as its own classifier that gives the probability of the sample being that class vs not being that class.
For instance, if your network outputs [1, 0, ..., 0], then the sample has a high probability of being class 1 vs not being class 1, a low probability of being class 2 vs not being class 2, etc.
Ties
In the case of a tie, it is common (in research) to have a random function break the tie. If you get [1, 1, 1, ..., 1], then the function would pick a number from 1-10 and that is your prediction. In practice, sometimes an expert system is used to break ties: perhaps predicting class 1 is more expensive than predicting class 2, so we break the tie in favor of class 2.
Steps
So the steps are (a short sketch of steps 3-5 follows the list):
Split dataset into test/train set
Train weights on train set
Pass test set forward through the neural network
For each sample, choose the argmax (the output with highest value) as your prediction
In case of tie, choose randomly between all tying classes
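A minimal sketch of steps 3-5 (Python/NumPy purely for illustration; the weight and bias names W1, b1, W2, b2 are placeholders for whatever your training produced):
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(X, W1, b1, W2, b2):
    # Forward pass: 35 inputs -> 35 hidden units -> 10 output units.
    hidden = sigmoid(X @ W1 + b1)        # shape (n_samples, 35)
    outputs = sigmoid(hidden @ W2 + b2)  # shape (n_samples, 10)
    # One-vs-all decision: the most active output node is the predicted class.
    return np.argmax(outputs, axis=1)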
Aside
In your particular case, I imagine implementing this strategy will result in a network that barely beats random performance (10% accuracy).
I would suggest some reconsidering of the network architecture.
If you look at your 5x7 images, can you tell what number each image was originally? It seems likely that scaling the images down to this size loses so much information that the network cannot distinguish between classes.
Debugging
From what you've described I would look at the following when debugging your network.
Is your data preprocessing (down-scaling) stripping out too much information? Check this by manually inspecting a few of the images and seeing if you can tell what each image should be.
Does your one-hot algorithm work? When you convert your targets for training, does it successfully convert 1 -> [1, 0, 0, ..., 0]?
Is your back-prop / gradient descent algorithm correct? You should see a (roughly) monotonic decrease in your loss function while training. Try printing the loss you are optimizing at every step (or every few steps). Or, for a very simple gut check, print the mean squared error, mean((P - Y)^2).
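As a quick illustration of those last two checks (assuming the digits are labeled 1-10 and, as above, label 1 maps to [1, 0, ..., 0]; Python/NumPy used only as a sketch):
import numpy as np

def one_hot(labels, num_classes=10):
    # e.g. label 1 -> [1, 0, 0, ..., 0] when labels run from 1 to 10
    encoded = np.zeros((len(labels), num_classes))
    encoded[np.arange(len(labels)), np.asarray(labels) - 1] = 1.0
    return encoded

def mse(P, Y):
    # Simple gut-check loss to print every few training steps
    return np.mean((P - Y) ** 2)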

How to fine tune an FCN-32s for interactive object segmentation

I'm trying to implement the proposed model in a CVPR paper (Deep Interactive Object Selection) in which the data set contains 5 channels for each input sample:
1. Red
2. Blue
3. Green
4. Euclidean distance map associated with positive clicks
5. Euclidean distance map associated with negative clicks
To do so, I should fine-tune the FCN-32s network using "object binary masks" as labels.
As you can see, in the first conv layer I have 2 extra channels, so I did net surgery to use the pretrained parameters for the first 3 channels and Xavier initialization for the 2 extra ones.
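The original uses Caffe net surgery; purely as an illustration of the same idea, here is a hypothetical PyTorch-style sketch (the layer sizes are placeholders, not the actual FCN-32s dimensions):
import torch
import torch.nn as nn

# Hypothetical sizes: the original 3-channel first conv of the pretrained net.
pretrained_conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3)

# New first conv accepting 5 channels (RGB + two click distance maps).
new_conv1 = nn.Conv2d(5, 64, kernel_size=7, stride=2, padding=3)

with torch.no_grad():
    new_conv1.weight[:, :3] = pretrained_conv1.weight  # reuse pretrained weights
    nn.init.xavier_normal_(new_conv1.weight[:, 3:])    # Xavier init for the 2 extra channels
    new_conv1.bias.copy_(pretrained_conv1.bias)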
For the rest of the FCN architecture, I have these questions:
Should I freeze all the layers before "fc6" (except the first conv layer)? If yes, how will the extra channels of the first conv layer be learned? Are the gradients strong enough to reach the first conv layer during the training process?
What should the kernel size of "fc6" be? Should I keep 7? I saw in the "Caffe net_surgery" notebook that it depends on the output size of the last layer ("pool5").
The main problem is the number of outputs of the "score_fr" and "upscore" layers. Since I'm not doing class segmentation (which uses 21 outputs for 20 classes plus background), how should I change it? What about 2 (one for the object and the other for the non-object (background) area)?
Should I change "crop" layer "offset" to 32 to have center crops?
In case of changing each of these layers, what is the best initialization strategy for them? "bilinear" for "upscore" and "Xavier" for the rest?
Should I convert my binary label matrix values to a zero-centered form ({-0.5, 0.5}), or is it OK to use them with values in {0, 1}?
Any useful idea will be appreciated.
PS:
I'm using Euclidean loss, and I'm using "1" as the number of outputs for the "score_fr" and "upscore" layers. If I use 2 instead, I guess the loss should be softmax.
I can answer some of your questions.
The gradients will reach the first layer, so it should be possible to learn the weights of the extra channels even if you freeze the other layers.
Change the num_output to 2 and fine-tune. You should get a good output.
I think you'll need to experiment with each of the options and see how the accuracy turns out.
You can use the values {0, 1}.