create deep network in matlab with logsig layer instead of softmax layer

I want to create a deep classification network, but my classes aren't mutually exclusive (mutual exclusivity is what softmaxLayer assumes).
Is it possible to define a non-mutually-exclusive classification layer (i.e., a sample can belong to more than one class)?
One way to do it would be with a logsig function in the classification layer instead of a softmax, but I have no idea how to accomplish that.

In a CNN you can have multiple classes in the last layer, as you know. But if I understand correctly, you need the last layer to output a value in a range of numbers instead of a 1 or 0 for each class. That means you need regression. If your labels support this, that's fine, and you can do it with regression, just like what happens in bounding-box regression for localization. You don't need softmax in the last layer; just use another activation function that produces suitable outputs for your task.
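The original question is about MATLAB, but the difference between logsig (sigmoid) and softmax outputs is language-agnostic. A minimal NumPy sketch (function names here are illustrative, not any MATLAB API): sigmoids score each class independently, so several classes can be active at once, whereas softmax forces the classes to compete.

```python
import numpy as np

def sigmoid(z):
    # logsig: squashes each logit independently into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # softmax: outputs compete and sum to 1 (mutually exclusive)
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([2.0, 1.5, -3.0])

p_sig = sigmoid(logits)    # each class scored on its own
p_soft = softmax(logits)   # a single probability distribution

# With sigmoids, more than one class can clear a 0.5 threshold,
# which is exactly the multi-label behaviour the question asks for
labels = p_sig > 0.5
```

For training such a network, the usual companion to sigmoid outputs is a per-class binary cross-entropy loss rather than the categorical cross-entropy that goes with softmax.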

Related

How to add a custom layer and loss function into a pretrained CNN model by matconvnet?

I'm new to matconvnet. Recently, I've wanted to try a new loss function instead of the existing one in a pretrained model, e.g., VGG-16, which usually uses a softmax loss layer. What's more, I want to use a new feature-extractor layer instead of the pooling layer or max layer. I know there are two CNN wrappers in matconvnet, SimpleNN and DagNN; since VGG-16 is a linear model with a linear sequence of building blocks, I'm using the SimpleNN wrapper. So, in the SimpleNN wrapper, how do I create a custom layer in detail, especially the procedure and the relevant concepts? E.g., do I need to remove the layers behind the new feature-extractor layer, or just leave them? I know how to compute the derivative of the loss function, so the details of the computation inside the layer are not that important in this question; I just want to see the procedure expressed in code. Could someone help me? I'd appreciate it a lot!
You can remove the old error/objective layer (in SimpleNN, net.layers is a cell array, so removing the last layer looks like):
net.layers(end) = [];
and you can add your new error code in the vl_nnloss() file.
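A matconvnet loss layer boils down to a forward pass that computes the loss value and a backward pass that returns its derivative with respect to the inputs (the same contract vl_nnloss follows). A NumPy sketch of that shape, using a squared-error loss purely as an illustration (CustomLoss is a made-up name, not a matconvnet class):

```python
import numpy as np

class CustomLoss:
    """Sketch of the forward/backward contract a loss layer implements."""

    def forward(self, x, c):
        # x: network predictions, c: targets; here a simple squared error
        self.diff = x - c
        return 0.5 * np.sum(self.diff ** 2)

    def backward(self, dzdy=1.0):
        # derivative of the loss w.r.t. the predictions,
        # scaled by the incoming derivative dzdy
        return dzdy * self.diff

loss = CustomLoss()
x = np.array([0.2, 0.7, 0.1])
c = np.array([0.0, 1.0, 0.0])
value = loss.forward(x, c)
grad = loss.backward()
```

In matconvnet the same two computations are the two call modes of the layer function: vl_nnloss(x, c) for forward and vl_nnloss(x, c, dzdy) for backward.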

How to jointly learn two tasks at prediction level?

I have trained a network on two different modalities of the same image. I pass the data together into one layer, but after that it is pretty much two networks in parallel: they don't share a layer, and the two tasks have different sets of labels, so I have two different loss and accuracy layers (I use Caffe, btw). I would like to learn these tasks jointly. For example, the prediction of a class of task 1 should be higher when task 2 predicts a certain class label. I don't want to join them at the feature level but at the prediction level. How do I do this?
Why don't you want to join the prediction at feature level?
If you really want to stick to your idea of not joining any layers of the network, you can apply a CRF or SVM on top of the overall prediction pipeline to learn cross-correlations between the predictions. For any other method you will need to combine features inside the network, one way or another. However, I would strongly recommend that you consider doing this. It is a general theme in deep learning that doing things inside the network works better than doing them outside.
From what I have learned by experimenting with joint prediction, you will get the most performance gain, if you share weights between all convolutional layers of the network. You can then apply independent fc-layers, followed by a softmax regression and separate loss functions on top of the jointly predicted features. This will allow the network to learn cross-correlation between features while it is still able to make separate predictions.
Have a look at my MultiNet paper as a good starting point. All our training code is on github.
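The shared-trunk idea from the answer above can be sketched in a few lines of NumPy: one set of weights is used by both tasks, while each task keeps its own head and softmax. This is only a forward-pass illustration (all shapes and names are made up); in practice the framework's autodiff would send both losses' gradients into the shared weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared "trunk": one weight matrix both tasks use
W_shared = rng.normal(size=(8, 16))

# Independent heads: a separate fc layer and softmax per task
W_task1 = rng.normal(size=(16, 3))   # task 1 has 3 labels
W_task2 = rng.normal(size=(16, 5))   # task 2 has 5 labels

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

x = rng.normal(size=8)                # one input sample
features = np.tanh(x @ W_shared)      # jointly learned features

p1 = softmax(features @ W_task1)      # task-1 prediction
p2 = softmax(features @ W_task2)      # task-2 prediction
```

Because both losses backpropagate into W_shared, the trunk learns cross-correlations between the tasks even though the predictions stay separate.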

How to Combine two classification model in matlab?

I am trying to detect faces using MATLAB's built-in Viola-Jones face detection. Is there any way to combine two classification models, like "FrontalFaceCART" and "ProfileFace", into one in order to get a better result?
Thank you.
You can't combine the models themselves. That doesn't make sense in a classification task like this, since every classifier is different: it works differently (a different algorithm is behind it), and it may also be trained differently.
According to the classification model(s) help (which can be found here), your two classifiers work as follows:
FrontalFaceCART is a model composed of weak classifiers, based on classification and regression tree analysis
ProfileFace is composed of weak classifiers, based on a decision stump
More information can be found in the link provided, but you can easily see that their inner behaviour is rather different, so you can't mix or combine them.
It's like (in Machine Learning) mixing a Support Vector Machine with a K-Nearest Neighbour: the first one uses separating hyperplanes whereas the latter is simply based on distance(s).
You can, however, train several models in parallel (i.e., independently) and choose the model that best suits you (e.g., smaller error rate / higher accuracy): you create as many different classifiers as you like, give them the same training set, evaluate each one's accuracy (and/or other metrics), and choose the best model.
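The select-the-best-model idea above is just an argmax over validation scores. A toy Python sketch (the two lambda "classifiers" are stand-ins for independently trained models, not real detectors):

```python
def accuracy(model, X, y):
    # fraction of validation samples the model labels correctly
    return sum(model(x) == t for x, t in zip(X, y)) / len(y)

# Two toy "classifiers" standing in for independently trained models
clf_a = lambda x: int(x > 0.5)
clf_b = lambda x: int(x > 0.8)

X_val = [0.1, 0.6, 0.9, 0.3]
y_val = [0, 1, 1, 0]

candidates = {"A": clf_a, "B": clf_b}
scores = {name: accuracy(m, X_val, y_val) for name, m in candidates.items()}
best = max(scores, key=scores.get)   # keep the highest-accuracy model
```

In practice you would of course score real trained models on a held-out set rather than thresholds on a scalar.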
One option is to make a hierarchical classifier. So in a first step you use the frontal face classifier (assuming that most pictures are frontal faces). If the classifier fails, you try with the profile classifier.
I did that with a dataset of faces and it improved my overall classification accuracy. Furthermore, if you have some a priori information, you can use it. In my case the faces were usually in the middle up part of the picture.
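The fallback scheme described above is easy to express as a two-step pipeline. A Python sketch where detect_frontal and detect_profile are hypothetical callables standing in for the two detector models, each returning a list of bounding boxes (empty when nothing is found):

```python
def detect_hierarchical(image, detect_frontal, detect_profile):
    """Try the frontal-face detector first; fall back to the
    profile detector only when the frontal one finds nothing."""
    boxes = detect_frontal(image)
    if boxes:
        return boxes, "frontal"
    return detect_profile(image), "profile"

# Toy stand-ins for the two detectors
frontal = lambda img: [] if img == "side_view" else [(10, 10, 50, 50)]
profile = lambda img: [(20, 5, 40, 60)]

boxes, used = detect_hierarchical("side_view", frontal, profile)
```

This keeps the two models intact (no attempt to merge them) while letting the cheap, common case run first.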
To improve performance beyond the two MATLAB classifiers you are using, you would need to change your technique (and probably your programming language). This is the best method so far: FaceNet.

Neural networks: classification using Encog

I'm trying to get started using neural networks for a classification problem. I chose to use the Encog 3.x library as I'm working on the JVM (in Scala). Please let me know if this problem is better handled by another library.
I've been using resilient backpropagation. I have 1 hidden layer, and e.g. 3 output neurons, one for each of the 3 target categories. So ideal outputs are either 1/0/0, 0/1/0 or 0/0/1. Now, the problem is that the training tries to minimize the error, e.g. turn 0.6/0.2/0.2 into 0.8/0.1/0.1 if the ideal output is 1/0/0. But since I'm picking the highest value as the predicted category, this doesn't matter for me, and I'd want the training to spend more effort in actually reducing the number of wrong predictions.
So I learnt that I should use a softmax function as the output (although it is unclear to me whether this becomes a 4th layer or I should just replace the activation function of the 3rd layer with softmax), and then have the training reduce the cross entropy. Now I think that this cross entropy needs to be calculated either over the entire network or over the entire output layer, but the ErrorFunction that one can customize calculates the error on a neuron-by-neuron basis (it reads arrays of ideal and actual outputs and writes an array of error values). So how does one actually do cross-entropy minimization using Encog (or which other JVM-based library should I choose)?
I'm also working with Encog, but in Java, though I don't think that makes a real difference. I have a similar problem, and as far as I know you have to write your own function that minimizes cross entropy.
And as I understand it, softmax should just replace the activation of your 3rd layer, not add a 4th.
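One reason softmax and cross-entropy are usually treated as a single unit (rather than a per-neuron error function like Encog's ErrorFunction) is that the loss is defined over the whole output layer, and its gradient with respect to the pre-softmax activations collapses to a very simple form. A NumPy sketch of that fact (not Encog code):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def cross_entropy(p, t):
    # defined over the whole output layer, not neuron by neuron
    return -np.sum(t * np.log(p))

z = np.array([2.0, 0.5, -1.0])   # pre-softmax activations
t = np.array([1.0, 0.0, 0.0])    # one-hot ideal output

p = softmax(z)
loss = cross_entropy(p, t)

# Gradient of the combined softmax + cross-entropy w.r.t. z is
# simply p - t, which is why the two are fused in most libraries
grad = p - t
```

So a JVM implementation would replace the output activation with softmax and backpropagate (actual - ideal) from the output layer, rather than trying to express cross entropy through a neuron-wise error callback.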

Neural Network bias

I'm building a feed forward neural network, and I'm trying to decide how to implement the bias. I'm not sure about two things:
1) Is there any downside to implementing the bias as a trait of the node as opposed to a dummy input+weight?
2) If I implement it as a dummy input, would it be input just in the first layer (from the input to the hidden layer), or would I need a dummy input in every layer?
Thanks!
P.S. I'm currently using 2d arrays to represent weights between layers. Any ideas for other implementation structures? This isn't my main question, just looking for food for thought.
The implementation doesn't matter as long as the behaviour is right.
Yes, the bias is needed in every layer.
A 2D array is a fine way to go.
I'd suggest including the bias as another neuron with constant input 1. This will make it easier to implement: you don't need a special variable for it or anything like that.
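The constant-1 trick fits the 2D-array representation directly: append a column of ones to the layer's input, and the bias weights become one extra row of the ordinary weight matrix. A small NumPy sketch (all values are illustrative):

```python
import numpy as np

# Two samples, two real inputs each
X = np.array([[0.5, -1.0],
              [2.0,  0.3]])

# Weights for one output neuron: two input weights plus a bias weight,
# stored as one extra row that multiplies the constant 1
W = np.array([[0.4],    # weight for input 1
              [0.1],    # weight for input 2
              [0.7]])   # bias weight

# Augment the input with the dummy bias column of ones
X_aug = np.hstack([X, np.ones((X.shape[0], 1))])
out = X_aug @ W          # identical to X @ W[:2] + W[2]
```

The same augmentation is applied at every layer, which is why the bias "input" is needed per layer rather than only at the input layer.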