How to do regularization in Matlab's NN toolbox - matlab

My data set has 150 independent variables and 10 predictors or response. The problem is to find a mapping between input and output variables. There are 1000 data points out of which 70% I have used for training and 30% for testing. I am using a feedforward neural network with 10 hidden neurons as explained in this Matlab document . I am evaluating the performance using the command
perf_Train = perform(net,TrainedData',lblTrain')
YPred = net(XTest);
perf_Test = perform(net,YPred,lblTest')
which basically gives the mean square error between the actual and the predicted (estimated) response for training and testing. My testing data is not able to fit properly to the trained model, however the training data fits quite well.
Problem1: My training performance is always lesser than test performance measure i.e., perf_Train = 0.0867 and perf_Test = 0.567
Is this overfitting or underfitting?
Problem2: How do I make the test data fit accurately? Theory say that to overcome overfitting and underfitting, we need to do regularization. Is there any parameter that needs to be input into the function such as regularization to overcome this?

It is overfitting since training error is lower than test error.
I would recommend to set less epochs(iteration) for your training or use less training data.
I would also recommend to check that the training data and test data are picked up randomly.
For regulation, it can be set like this:
net.performParam.regularization = 0.5;
The performance ratio depends on the model, 0.5 is just an example.
For more details, you can refer to the documentation below.
https://www.mathworks.com/help/deeplearning/ug/improve-neural-network-generalization-and-avoid-overfitting.html#bss4gz0-38

Related

Confusion regarding Preparation of data for the task of data fitting using NN

I am using a multi layer perceptron for fitting a model to a data given input-output pair following the tutorial https://www.mathworks.com/help/deeplearning/gs/fit-data-with-a-neural-network.html.
Confusion 1) I am having a tough time understanding where the test set which has been created using the command net.divideParam.testRatio used? In general, we split the data set into train, validation and an unseen test set that is used for performanace evaluation and reporting the confusion matrix. This approach is usually done for classification task. But for the problem of regression and model fitting es. using NN should we not explicitly have a test set that is unseen during training? Is this command net.divideParam.testRatio creating that unseen test set but it is never used in testing the network? The program code uses all of the inputs in the testing. It is unclear if after training I should use an unseen dataset for testing and then reporting the performance or not.
% Create a Fitting Network
hiddenLayerSize = 10;
net = fitnet(hiddenLayerSize);
inputs = houseInputs;
targets = houseTargets;
% Set up Division of Data for Training, Validation, Testing
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
% Train the Network
[net,tr] = train(net,inputs,targets);
% Test the Network
outputs = net(inputs);
errors = gsubtract(outputs,targets);
performance = perform(net,targets,outputs)
Confusion 2) When using regression model mvregress do we follow the same approach as the answer for confusion 1)
Please help. I am unable to find the correct practices and approach for these initial steps and I believe that the proper use makes a great impact in the result.
I can help you mostly with confusion 1).
When you train a neural network, you are separating the dataset in 3 sets:
Training set, used to train the network (the only dataset which actually allows the update of the Network weights);
Validation set, used to stop the training (this is the parameter Validation checks in the GUI);
Test set, which influences the performance plots and the overall performance of the fitter;
Therefore, of these 3, only the training set is seen by the network and influences the weights update; while the validation set allows to stop the training if the network is overfitting the training data (an improvement in training data fitting does not improve the validation data fitting/classification). Finally, test set is useful for a first check of the fitter performance.
If you check the value of net.divideParam, you can see that the network stores the percentage of values for each set; during the training, the inputs and targets will be randomly divided according to these 3 values. This is why if you use the toolbox to plot the performance of the network. You can also avoid this to be done randomly by setting the net.divideFcn to 'divideind'. This is mostly useful if you know well your dataset.
When you train the network using
[net,tr] = train(net,inputs,targets);
tr stores the results of the training, including the indexes of the training (tr.trainInd), validation (tr.valInd) and test set (tr.testInd). To retrieve each of the sets it is possible to index the input with those inputs, while other parameters, such as the accuracy or the performance of the network can be retrieved through tr.
Regarding confusion 2, I think that regression model mvregress works with a different approach: it should just evaluate the parameters for the fitting without splitting the dataset in three slices. It should be up to you to evaluate the regression by adding some points or removing them from the inputs.

Neural network parameter selection

I am looking at (two-layer) feed-forward Neural Networks in Matlab. I am investigating parameters that can minimise the classification error.
A google search reveals that these are some of them:
Number of neurons in the hidden layer
Learning Rate
Momentum
Training type
Epoch
Minimum Error
Any other suggestions?
I've varied the number of hidden neurons in Matlab, varying it from 1 to 10. I found that the classification error is close to 0% with 1 hidden neuron and then grows very slightly as the number of neurons increases. My question is: shouldn't a larger number of hidden neurons guarantee an equal or better answer, i.e. why might the classification error go up with more hidden neurons?
Also, how might I vary the Learning Rate, Momentum, Training type, Epoch and Minimum Error in Matlab?
Many thanks
Since you are considering a simple two layer feed forward network and have already pointed out 6 different things you need to consider to reduce classification errors, I just want to add one thing only and that is amount of training data. If you train a neural network with more data, it will work better. Note that, training with large amount of data is a key to get good outcome from neural networks, specially from deep neural networks.
Why the classification error goes up with more hidden neurons?
Answer is simple. Your model has over-fitted the training data and thus resulting in poor performance. Note that, if you increase the number of neurons in hidden layers, it would decrease training errors but increase testing errors.
In the following figure, see what happens with increased hidden layer size!
How may I vary the Learning Rate, Momentum, Training type, Epoch and Minimum Error in Matlab?
I am expecting you have already seen feed forward neural net in Matlab. You just need to manipulate the second parameter of the function feedforwardnet(hiddenSizes,trainFcn) which is trainFcn - a training function.
For example, if you want to use gradient descent with momentum and adaptive learning rate backpropagation, then use traingdx as the training function. You can also use traingda if you want to use gradient descent with adaptive learning rate backpropagation.
You can change all the required parameters of the function as you want. For example, if you want to use traingda, then you just need to follow the following two steps.
Set net.trainFcn to traingda. This sets net.trainParam to traingda's default parameters.
Set net.trainParam properties to desired values.
Example
net = feedforwardnet(3,'traingda');
net.trainParam.lr = 0.05; % setting the learning rate to 5%
net.trainParam.epochs = 2000 % setting number of epochs
Please see this - gradient descent with adaptive learning rate backpropagation and gradient descent with momentum and adaptive learning rate backpropagation.

Accuracy of Neural network Output-Matlab ANN Toolbox

I'm relatively new to Matlab ANN Toolbox. I am training the NN with pattern recognition and target matrix of 3x8670 containing 1s and 0s, using one hidden layer, 40 neurons and the rest with default settings. When I get the simulated output for new set of inputs, then the values are around 0 and 1. I then arrange them in descending order and choose a fixed number(which is known to me) out of 8670 observations to be 1 and rest to be zero.
Every time I run the program, the first row of the simulated output always has close to 100% accuracy and the following rows dont exhibit the same kind of accuracy.
Is there a logical explanation in general? I understand that answering this query conclusively might require the understanding of program and problem, but its made of of several functions to clearly explain. Can I make some changes in the training to get consistence output?
If you have any suggestions please share it with me.
Thanks,
Nishant
Your problem statement is not clear for me. For example, what you mean by: "I then arrange them in descending order and choose a fixed number ..."
As I understand, you did not get appropriate output from your NN as compared to the real target. I mean, your output from NN is difference than target. If so, there are different possibilities which should be considered:
How do you divide training/test/validation sets for training phase? The most division should be assigned to training (around 75%) and rest for test/validation.
How is your training data set? Can it support most scenarios as you expected? If your trained data set is not somewhat similar to your test data sets (e.g., you have some new records/samples in the test data set which had not (near) appear in the training phase, it explains as 'outlier' and NN cannot work efficiently with these types of samples, so you need clustering approach not NN classification approach), your results from NN is out-of-range and NN cannot provide ideal accuracy as you need. NN is good for those data set training, where there is no very difference between training and test data sets. Otherwise, NN is not appropriate.
Sometimes you have an appropriate training data set, but the problem is training itself. In this condition, you need other types of NN, because feed-forward NNs such as MLP cannot work with compacted and not well-separated regions of data very well. You need strong function approximation such as RBF and SVM.

ANN different results for same train-test sets

I'm implementing a neural network for a supervised classification task in MATLAB.
I have a training set and a test set to evaluate the results.
The problem is that every time I train the network for the same training set I get very different results (sometimes I get a 95% classification accuracy and sometimes like 60%) for the same test set.
Now I know this is because I get different initial weights and I know that I can use 'seed' to set the same initial weights but the question is what does this say about my data and what is the right way to look at this? How do I define the accuracy I'm getting using my designed ANN? Is there a protocol for this (like running the ANN 50 times and get an average accuracy or something)?
Thanks
Make sure your test set is large enough compared to the training set (e.g. 10% of the overall data) and check it regarding diversity. If your test set only covers very specific cases, this could be a reason. Also make sure you always use the same test set. Alternatively you should google the term cross-validation.
Furthermore, observing good training set accuracy while observing bad test set accuracy is a sign for overfitting. Try to apply regularization like a simple L2 weight decay (simply multiply your weight matrices with e.g. 0.999 after each weight update). Depending on your data, Dropout or L1 regularization could also help (especially if you have a lot of redundancies in your input data). Also try to choose a smaller network topology (fewer layers and/or fewer neurons per layer).
To speed up training, you could also try alternative learning algorithms like RPROP+, RPROP- or RMSProp instead of plain backpropagation.
Looks like your ANN is not converging to the optimal set of weights. Without further details of the ANN model, I cannot pinpoint the problem, but I would try increasing the number of iterations.

Matlab neural network testing

I have created a neural network and the performance is good. By using nprtool, we are allow to test the network with an input data and target data. Here is my question, what is the purpose of testing a neural network with target data provided? Isn't it testing should not hav e target data so that we can know how well can the trained neural network perform without target data is given? Hope someone will respond to this, thanks =)
I'm not familiar with nprtool, but I suspect it would give the input data to your neural network, and then compare your NN's output data with the target data (and compute some kind of success rate based on that).
So your NN will never see the target data, it's just used to measure the performance.
It's like the "teacher's edition" of the exercise books in school. The student (i.e. the NN) doesn't have the solutions, but her/his answers will be compared against them by the teacher (i.e. nprtool). (Okay, the teacher probably/hopefully knows the subject, but you get the idea.)
The "target" data t is the desired y of y=net(x) used as example to train the network.
What nprtool do is to divide the training set into three groups: the training set, the validation set and the test set.
The first one is used to actually update the network.
The second one is used to determine the performances of the net (note: this set is NOT used in any way to update the network): as the NN "learns" the error (as difference between the t and net(x)) over the validation set decreases. The trend will eventually stop or even reverse: this phenomena is called "overfitting", which means the NN is now chasing the training set, "memorizing" it at the cost of the ability to generalize (meaning: to perform well with unseen data). So the purpose of this validation set is to determine when to stop the training before the NN starts overfitting. This should answer your question.
Finally third set is for external testing, to leave you a set of data untouched by the training procedure.
Even though the total data set [training, validation and testing] are inputs to the training algorithm, the testing data is in no way used to design (i.e., train and validate) the net
total = design + test
design = train + validate
The training data is used to estimate weights and biases
The validation data is used to monitor the design performance on nontraining data. REGARDLESS OF THE PERFORMANCE ON TRAINING DATA, if validation performance degrades continuously for 6 (default) epochs, training is terminated (VALIDATION STOPPING).
This mitigates the dreaded phenomenon of OVERTRAINING AN OVERFIT NET where performance on nontraining data degrades even if the training set performance is improving.
An overfit net has more unknown weights and biases than training equations, thereby allowing an infinite number of solutions. A simple example of overfitting with two unknowns but only one equation:
KNOWN: a, b, c
FIND: unique x1 and x2
USING: a * x1 + b * x2 = c
Hope this helps.
Greg