Problems with outputs in neural networks (in MATLAB's neural networks toolbox) - matlab

I trained my artificial neural network (ANN) in MATLAB with 652,500 data points, and on a blind test (652,100 data points of completely new input data) the output is excellent (as I want). But the problem occurs when I feed it a very small amount of data (for example, below 50 data points). The output is quite unexpected, and I have checked it many times.
To be more precise, the training phase used 10% of the data for training, 45% for validation and 45% for testing. The training is quite successful, and for large amounts of new input data it works very well. The problem is that when very limited data (compared to the number of training points) are fed into the network, it produces quite unrealistic output, beyond the range it was trained on.
Why is this so? Could anyone shed some light on this, please?
Also, please mention whether there are any strict (hard and fast) rules on training and final testing data points, e.g. what percentage of the training data should / must be present in the new input data sets. I guess the problem is that my network overestimates or underestimates the output because it receives very few data points compared to the training phase.

Your problem is over-fitting of the dataset during training. Data division is a very important task when training a neural network. In general, the training set should be around 70-80% of the data, and the validation and test sets should each be around 10-15%. For instance:
net.divideParam.trainRatio = 70/100;  % 70% of the data for training
net.divideParam.valRatio = 15/100;    % 15% for validation
net.divideParam.testRatio = 15/100;   % 15% for testing
Imagine a student in a class. trainRatio is the material/lectures the student has to learn. valRatio is the percentage of the material examined in a mid-term examination, and testRatio is the percentage of the material examined in the final examination. So if there is not enough material for training, the student cannot succeed in the mid-term and final examinations. Is it clear? A neural network learns/trains much like such a student. So your network faces an over-fitting problem.
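As a minimal sketch of how those ratios are applied in practice (the network size, the use of feedforwardnet, and the variable names x and t are illustrative assumptions, not from the question):

net = feedforwardnet(10);                 % example network; the size is an assumption
net.divideFcn = 'dividerand';             % split the samples at random
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
[net, tr] = train(net, x, t);             % tr.trainInd/valInd/testInd record which samples went where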

Related

How to do regularization in Matlab's NN toolbox

My data set has 150 independent variables and 10 response variables. The problem is to find a mapping between the input and output variables. There are 1000 data points, of which I have used 70% for training and 30% for testing. I am using a feedforward neural network with 10 hidden neurons as explained in this Matlab document. I am evaluating the performance using the commands
perf_Train = perform(net,TrainedData',lblTrain')
YPred = net(XTest);
perf_Test = perform(net,YPred,lblTest')
which basically give the mean square error between the actual and the predicted (estimated) response for training and testing. My test data does not fit the trained model well, although the training data fits quite well.
Problem 1: My training performance measure is always lower than my test performance measure, i.e., perf_Train = 0.0867 and perf_Test = 0.567.
Is this overfitting or underfitting?
Problem 2: How do I make the test data fit accurately? Theory says that to overcome overfitting and underfitting we need regularization. Is there a parameter, such as a regularization term, that needs to be passed to the function to overcome this?
It is overfitting, since the training error is lower than the test error.
I would recommend setting fewer epochs (iterations) for your training or using less training data.
I would also recommend checking that the training data and test data are picked randomly.
For regularization, it can be set like this:
net.performParam.regularization = 0.5;
The performance ratio depends on the model; 0.5 is just an example.
For more details, you can refer to the documentation below.
https://www.mathworks.com/help/deeplearning/ug/improve-neural-network-generalization-and-avoid-overfitting.html#bss4gz0-38
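As a hedged sketch (feedforwardnet, the variable names XTrain and lblTrain, and the trainbr alternative are assumptions for illustration, not the poster's actual code), the parameter can be used like this:

net = feedforwardnet(10);
net.performParam.regularization = 0.5;     % penalize large weights in the performance function
[net, tr] = train(net, XTrain', lblTrain');

% Alternatively, Bayesian regularization picks the penalty automatically:
net2 = feedforwardnet(10, 'trainbr');
[net2, tr2] = train(net2, XTrain', lblTrain');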

Neural network parameter selection

I am looking at (two-layer) feed-forward Neural Networks in Matlab. I am investigating parameters that can minimise the classification error.
A Google search reveals that these are some of them:
Number of neurons in the hidden layer
Learning Rate
Momentum
Training type
Epoch
Minimum Error
Any other suggestions?
I've varied the number of hidden neurons in Matlab from 1 to 10. I found that the classification error is close to 0% with 1 hidden neuron and then grows very slightly as the number of neurons increases. My question is: shouldn't a larger number of hidden neurons guarantee an equal or better answer, i.e. why might the classification error go up with more hidden neurons?
Also, how might I vary the Learning Rate, Momentum, Training type, Epoch and Minimum Error in Matlab?
Many thanks
Since you are considering a simple two-layer feed-forward network and have already pointed out 6 different things to consider to reduce classification error, I just want to add one more thing: the amount of training data. If you train a neural network with more data, it will work better. Note that training with a large amount of data is key to getting good results from neural networks, especially deep neural networks.
Why does the classification error go up with more hidden neurons?
The answer is simple: your model has over-fitted the training data, resulting in poor performance. Note that if you increase the number of neurons in the hidden layers, the training error will decrease but the test error will increase.
[Figure from the original answer: training and test error as a function of hidden layer size.]
How may I vary the Learning Rate, Momentum, Training type, Epoch and Minimum Error in Matlab?
I expect you have already seen the feed-forward neural net in Matlab. You just need to set the second parameter of the function feedforwardnet(hiddenSizes,trainFcn), which is trainFcn - a training function.
For example, if you want to use gradient descent with momentum and adaptive learning rate backpropagation, then use traingdx as the training function. You can also use traingda if you want to use gradient descent with adaptive learning rate backpropagation.
You can change all the required parameters of the function as you want. For example, if you want to use traingda, you just need to follow these two steps.
Set net.trainFcn to traingda. This sets net.trainParam to traingda's default parameters.
Set net.trainParam properties to desired values.
Example
net = feedforwardnet(3,'traingda');
net.trainParam.lr = 0.05;      % set the learning rate
net.trainParam.epochs = 2000;  % set the maximum number of epochs
Please see these: gradient descent with adaptive learning rate backpropagation and gradient descent with momentum and adaptive learning rate backpropagation.
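To cover the remaining parameters from the question (momentum, epochs and minimum error), here is a hedged sketch using traingdx; the hidden layer size and the variable names x and t are illustrative assumptions:

net = feedforwardnet(10, 'traingdx');   % gradient descent with momentum and adaptive learning rate
net.trainParam.lr = 0.01;               % initial learning rate
net.trainParam.mc = 0.9;                % momentum constant
net.trainParam.epochs = 1000;           % maximum number of epochs
net.trainParam.goal = 1e-3;             % performance goal ("minimum error"); training stops once reached
[net, tr] = train(net, x, t);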

What is the usefulness of the mean file with AlexNet neural network?

When using an AlexNet neural network, be it with Caffe or CNTK, it needs a mean file as input. What is this mean file for? How does it affect the training? How is it generated - only from the training samples?
Mean subtraction removes the DC component from the images. It has the geometric interpretation of centering the cloud of data around the origin along every dimension, and it reduces the correlation between images, which improves training. From my experience I can say that it improves the training accuracy significantly. It is computed from the training data; computing the mean from the test data makes no sense.
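As a minimal MATLAB sketch of the idea (this is not the Caffe/CNTK mean-file tooling itself; trainImages and testImages are hypothetical H-by-W-by-C-by-N arrays):

meanImage = mean(double(trainImages), 4);           % average over the training samples only
trainCentered = double(trainImages) - meanImage;    % center the training set
testCentered  = double(testImages) - meanImage;     % reuse the SAME training mean at test time (implicit expansion, R2016b+)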

Continuously train MATLAB ANN, i.e. online training?

I would like to ask for ideas on what options there are for training a MATLAB ANN (artificial neural network) continuously, i.e. without a pre-prepared training set. The idea is to have an "online" data stream so that, when first created, the network is completely untrained, but as samples flow in, the ANN is trained and converges.
The ANN will be used to classify a set of values, and the implementation would visualize how the training of the ANN improves as samples flow through the system, i.e. each sample is used for training and is then also evaluated by the ANN, and the response is visualized.
The effect I expect is that for the very first samples the response of the ANN will be more or less random, but as training progresses the accuracy improves.
Any ideas are most welcome.
Regards, Ola
In MATLAB you can use the adapt function instead of train. You can do this incrementally (change weights every time you get a new piece of information) or you can do it every N-samples, batch-style.
This document gives an in-depth run-down on the different styles of training from the perspective of a time-series problem.
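A minimal sketch of the incremental style (the network size and the stream variables xStream and tStream are assumptions for illustration; the exact update rule depends on the network's adaptFcn and learning functions):

net = feedforwardnet(5);
net = configure(net, xStream(:,1), tStream(:,1));   % fix input/output sizes before adapting
for k = 1:size(xStream, 2)
    % one incremental weight update per incoming sample; yk and ek can drive a live plot
    [net, yk, ek] = adapt(net, xStream(:,k), tStream(:,k));
end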
I'd really think about what you're trying to do here, because adaptive learning strategies can be difficult. I found that they like to flail all over compared to their batch counterparts. This was especially true in my case where I work with very noisy signals.
Are you sure that you need adaptive learning? You can't periodically re-train your NN? Or build one that generalizes well enough?

Matlab neural network testing

I have created a neural network and the performance is good. By using nprtool, we are allowed to test the network with input data and target data. Here is my question: what is the purpose of testing a neural network with target data provided? Shouldn't testing be done without target data, so that we can see how well the trained neural network performs when no target data is given? Hope someone will respond to this, thanks =)
I'm not familiar with nprtool, but I suspect it would give the input data to your neural network, and then compare your NN's output data with the target data (and compute some kind of success rate based on that).
So your NN will never see the target data, it's just used to measure the performance.
It's like the "teacher's edition" of the exercise books in school. The student (i.e. the NN) doesn't have the solutions, but her/his answers will be compared against them by the teacher (i.e. nprtool). (Okay, the teacher probably/hopefully knows the subject, but you get the idea.)
The "target" data t is the desired y of y=net(x) used as example to train the network.
What nprtool do is to divide the training set into three groups: the training set, the validation set and the test set.
The first one is used to actually update the network.
The second one is used to determine the performance of the net (note: this set is NOT used in any way to update the network). As the NN "learns", the error (the difference between t and net(x)) over the validation set decreases. The trend will eventually stop or even reverse: this phenomenon is called "overfitting", and it means the NN is now chasing the training set, "memorizing" it at the cost of the ability to generalize (meaning: to perform well on unseen data). The purpose of the validation set is therefore to determine when to stop training before the NN starts overfitting. This should answer your question.
Finally, the third set is for external testing, leaving you a set of data untouched by the training procedure.
Even though the total data set [training, validation and testing] is an input to the training algorithm, the testing data is in no way used to design (i.e., train and validate) the net:
total = design + test
design = train + validate
The training data is used to estimate the weights and biases.
The validation data is used to monitor the design performance on nontraining data. REGARDLESS OF THE PERFORMANCE ON TRAINING DATA, if validation performance degrades continuously for 6 (default) epochs, training is terminated (VALIDATION STOPPING).
This mitigates the dreaded phenomenon of OVERTRAINING AN OVERFIT NET where performance on nontraining data degrades even if the training set performance is improving.
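As a hedged sketch of the validation-stopping default mentioned above (patternnet, the hidden layer size, and the variable names x and t are assumptions; nprtool's generated script looks similar):

net = patternnet(10);
net.trainParam.max_fail = 6;   % stop training after 6 consecutive increases in validation error
[net, tr] = train(net, x, t);  % tr.testInd lists the samples left untouched for final testing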
An overfit net has more unknown weights and biases than training equations, thereby allowing an infinite number of solutions. A simple example of overfitting with two unknowns but only one equation:
KNOWN: a, b, c
FIND: unique x1 and x2
USING: a * x1 + b * x2 = c
Hope this helps.
Greg