I am trying to write Keras code for the MATLAB model in this example: https://in.mathworks.com/help/deeplearning/examples/denoise-speech-using-deep-learning-networks.html
At the end they define a layer called regressionLayer. I want to know what to use as the equivalent of this in Keras or PyTorch.
In Keras I have simply added a sigmoid activation instead of this regressionLayer. But I doubt this is correct, because I don't seem to get the desired output, and this seems to be one of the reasons.
model.add(Conv2D(1, (129, 1), strides=(1, 100), padding='same',
                 input_shape=(129, 8, 18), activation='sigmoid'))
In MATLAB the regression layer just computes the mean squared error loss. Losses-as-layers is the way Caffe works, but not the way Keras works, so the equivalent is not a layer but simply setting the loss:
model.compile(loss='mse', optimizer=...)
Note that you should not include an accuracy metric when doing regression, as accuracy is a classification-only metric.
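Putting it together with the layer from the question, a minimal sketch might look like this (switching the final activation to 'linear' is my assumption, since a sigmoid squashes the output into (0, 1), which is usually not what you want for plain regression):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D

model = Sequential()
# Final conv layer from the question; 'linear' leaves the output unbounded.
model.add(Conv2D(1, (129, 1), strides=(1, 100), padding='same',
                 input_shape=(129, 8, 18), activation='linear'))

# The regressionLayer equivalent: just set the loss at compile time.
model.compile(loss='mse', optimizer='adam')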
I am trying to use the following CNN architecture for semantic pixel classification. The code I am using is here.
However, from my understanding, this type of semantic segmentation network should typically have a softmax output layer to produce the classification result.
I could not find softmax used anywhere within the script. Here is the paper I am reading on this segmentation architecture; in Figure 2 I see softmax being used. Hence I would like to find out why it is missing from the script. Any insight is welcome.
You are using quite complex code for training/inference. But if you dig a little you'll see that the loss functions are implemented here, and that your model is actually trained using the cross_entropy loss. Looking at the doc:
This criterion combines log_softmax and nll_loss in a single function.
For numerical stability it is better to "absorb" the softmax into the loss function rather than have the model compute it explicitly.
It is quite common practice to have the model output "raw" predictions (aka "logits") and then let the loss (aka criterion) apply the softmax internally.
If you really need the probabilities you can add a softmax on top when deploying your model.
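A minimal PyTorch sketch of this pattern (the toy model and shapes are made up for illustration):

import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy segmentation head: outputs raw, unnormalized scores ("logits"),
# one channel per class, with no softmax inside the model.
model = nn.Conv2d(in_channels=16, out_channels=5, kernel_size=1)

x = torch.randn(2, 16, 8, 8)             # (batch, channels, H, W)
target = torch.randint(0, 5, (2, 8, 8))  # per-pixel class indices

logits = model(x)                        # (2, 5, 8, 8), still raw scores
loss = F.cross_entropy(logits, target)   # log_softmax + NLL happen inside the loss

# Only at deployment, if actual probabilities are needed:
probs = F.softmax(logits, dim=1)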
I've used the MATLAB Classification Learner app to train my SVM classifier, and I get 99.9% accuracy in prediction (I tested it with the predict function in MATLAB). What I wanted to do now was to predict without using this function, using the hyperplane directly. I exported the trained classifier, so I have all the weights and the bias that define the hyperplane. Which formula should I use to predict new data? I tried computing the sign of w'x, but it works only in a few cases. Can you help me understand what I should do?
Thanks a lot!
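For what it's worth, for a linear SVM the decision rule is the sign of w'x + b, i.e. the bias must be included, and if the training data was standardized during training, the same scaling has to be applied to new data before using the hyperplane. A minimal numpy sketch of the idea (w, b, mu, sigma stand for the values exported from the trained classifier):

import numpy as np

def linear_svm_predict(X, w, b, mu=None, sigma=None):
    """Predict +1/-1 labels from a linear SVM hyperplane.

    X: (n_samples, n_features) raw input data.
    w: (n_features,) weight vector; b: scalar bias.
    mu, sigma: per-feature mean/std used to standardize the training
    data, if standardization was enabled; new data must get the same
    transform, or the hyperplane won't match.
    """
    if mu is not None and sigma is not None:
        X = (X - mu) / sigma      # same scaling as at training time
    scores = X @ w + b            # signed distance (up to ||w||)
    return np.sign(scores)

Note that this only applies to a linear kernel: with a nonlinear kernel there is no explicit weight vector w, and the score is a kernel expansion over the support vectors instead.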
I have a train dataset and a test dataset, and I train an SVM with fitcsvm in MATLAB. Then I test the trained model with predict. I'm always using the same datasets, but I keep getting different AUCs for the same model, which makes me wonder where in the process there is a random component. Note that
I'm aware of the fact that formally there isn't such a thing as a ROC curve or an AUC for an SVM, and
I'm not asking for the statistical background of the SVM problem. My question is about the MATLAB implementation of the training/testing algorithm. I expected to get the same results because the training algorithm is, afaik, a deterministic process.
When I use libsvm in MATLAB for multiclass classification, the svmpredict command also takes the testing labels. As I don't have the labels for the test set, is it possible to predict them somehow using libsvm in MATLAB?
Yes, just provide a meaningless label vector. The only use of the labels is so that the prediction function can report some statistics; they are not actually required for prediction in any way.
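For illustration, here is the same trick with libsvm's Python interface (in MATLAB, passing e.g. zeros(size(test_data, 1), 1) to svmpredict works the same way; depending on how libsvm is installed, the import may be "from svmutil import *" instead):

from libsvm.svmutil import svm_train, svm_predict

# Train on labeled data.
train_y = [0, 1, 2, 1]
train_x = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5], [0.9, 0.1]]
model = svm_train(train_y, train_x, '-q')

# No true labels for the test set: pass a dummy all-zeros vector.
# Only the reported accuracy is meaningless; the predictions are not.
test_x = [[0.2, 0.8], [0.7, 0.3]]
dummy_y = [0] * len(test_x)
predicted_labels, _, _ = svm_predict(dummy_y, test_x, model)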
I am trying to make a simple radial basis function network (RBFN) for regression. I have a 20-dimensional (feature) dataset with over 600 samples. I need the final network to output one scalar value for each 20-dimensional sample.
Note: I'm new to machine learning... and I feel like I am missing an important concept here.
With the perceptron we can train (and I have trained) a linear network until the prediction error is at a minimum, using a small subset of the initial samples.
Is there a similar process with the RBFN?
Yes, there is.
The two main differences between a multi-layer perceptron and an RBFN are that an RBFN usually has just one hidden layer, and that its activation function is a Gaussian of the distance to a center instead of a sigmoid of a weighted sum.
The training phase can be done using gradient descent on the error loss function, so it is relatively simple to implement.
Keep in mind that the RBFN output is a linear combination of RBF units; each Gaussian unit is only active near its center, so far from all centers the output decays toward the bias term, and you may need to scale or transform your targets accordingly.
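A minimal numpy sketch of this kind of RBFN for regression (picking centers by random sampling and fitting the output weights by least squares are my choices here; gradient descent on all parameters works too):

import numpy as np

rng = np.random.default_rng(0)

# Toy data standing in for the 600 x 20 dataset from the question.
X = rng.normal(size=(600, 20))
y = X[:, 0] ** 2 + 0.1 * rng.normal(size=600)   # one scalar target per sample

n_centers, width = 40, 2.0
centers = X[rng.choice(len(X), n_centers, replace=False)]  # centers from the data

def rbf_features(X, centers, width):
    # Gaussian activation of the distance to each center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2 * width ** 2))

# The output is a linear combination of RBF units (+ bias),
# so the output weights have a closed-form least-squares solution.
Phi = np.hstack([rbf_features(X, centers, width), np.ones((len(X), 1))])
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)

# Predict a scalar for each new 20-dimensional sample.
X_new = rng.normal(size=(5, 20))
Phi_new = np.hstack([rbf_features(X_new, centers, width), np.ones((5, 1))])
y_pred = Phi_new @ w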
Here are a few resources that you could consult as reference:
[PDF](http://scholar.lib.vt.edu/theses/available/etd-6197-223641/unrestricted/Ch3.pdf)
[Wikipedia](http://en.wikipedia.org/wiki/Radial_basis_function_network)
[Wolfram](http://reference.wolfram.com/applications/neuralnetworks/NeuralNetworkTheory/2.5.2.html)
Hope it helps,