Increasing the number of epochs to reach the performance goal while training a neural network - MATLAB

I am training a neural network with an input matrix of 85*650 and a target matrix of 26*650. Here is the list of parameters I have used:
net.trainParam.max_fail = 6;
net.trainParam.min_grad = 1e-5;
net.trainParam.show = 10;
net.trainParam.lr = 0.9;
net.trainParam.epochs = 13500;
net.trainParam.goal = 0.001;
Number of hidden nodes = 76
As you can see, I have set the number of epochs to 13500. Is it OK to set the number of epochs to such a large value? The performance goal is not reached if the number of epochs is decreased, and I get poor classification results when testing.

Try not to focus on the number of epochs. Instead, you should have at least two sets of data: one for training and another for testing. Use the testing set to get a feel for how well your ANN is performing and how many epochs are needed to get a decent ANN.
For example, you want to stop training when performance on your testing set has levelled off or has begun to decrease (get worse). This would be evidence of over-learning, which is why more epochs are not always better.
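In MATLAB you can let the toolbox hold out validation and test portions and stop training automatically when validation performance stops improving. A minimal sketch, assuming the 85x650 inputs and 26x650 targets from the question and a pattern-recognition network (patternnet); the split ratios are just common defaults:
net = patternnet(76);                  % 76 hidden nodes, as in the question
net.divideFcn = 'dividerand';          % split the data randomly
net.divideParam.trainRatio = 0.70;     % 70% for training
net.divideParam.valRatio   = 0.15;     % 15% validation, drives early stopping
net.divideParam.testRatio  = 0.15;     % 15% held-out test data
net.trainParam.max_fail = 6;           % stop after 6 validation failures
[net, tr] = train(net, inputs, targets);

% Evaluate on the held-out test portion only:
testOutputs = net(inputs(:, tr.testInd));
testPerf = perform(net, targets(:, tr.testInd), testOutputs);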

Related

Is it possible for accuracy to decrease while increasing the number of epochs?

I am training a DNN in MATLAB and, while optimizing my network, I am observing a decrease in accuracy as I increase the number of epochs. Is that possible?
The loss value, on the other hand, keeps decreasing during training as the epochs increase. Please advise.
tl;dr: absolutely.
When the entire training dataset has been seen by the model once (fed forward once), that is termed one epoch.
Accuracy plotted against the number of epochs typically behaves as follows: training for more epochs can result in low validation accuracy, even though the training loss continues to decrease (and training accuracy stays high). This is termed overfitting.
The number of epochs to train is itself a hyperparameter that needs tuning.
It is absolutely possible, especially:
when you are training in batches
when your learning rate is too high
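In MATLAB's deep-learning workflow, one way to watch for this is to train with a validation set and early stopping. A sketch, assuming the trainNetwork/trainingOptions API; XTrain, YTrain, XVal, YVal, and layers are placeholders for your own data and architecture:
options = trainingOptions('sgdm', ...
    'InitialLearnRate', 0.01, ...      % too high a rate can hurt accuracy
    'MaxEpochs', 50, ...
    'MiniBatchSize', 128, ...          % batch training, as mentioned above
    'ValidationData', {XVal, YVal}, ...
    'ValidationPatience', 5, ...       % stop once validation stops improving
    'Plots', 'training-progress');     % plot accuracy/loss vs. epochs live
net = trainNetwork(XTrain, YTrain, layers, options);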

Neural network parameter selection

I am looking at (two-layer) feed-forward neural networks in MATLAB, and I am investigating parameters that can minimise the classification error.
A Google search suggests these are some of them:
Number of neurons in the hidden layer
Learning Rate
Momentum
Training type
Epoch
Minimum Error
Any other suggestions?
I've varied the number of hidden neurons in MATLAB from 1 to 10. I found that the classification error is close to 0% with 1 hidden neuron and then grows very slightly as the number of neurons increases. My question is: shouldn't a larger number of hidden neurons guarantee an equal or better answer? In other words, why might the classification error go up with more hidden neurons?
Also, how might I vary the Learning Rate, Momentum, Training type, Epoch and Minimum Error in Matlab?
Many thanks
Since you are considering a simple two-layer feed-forward network and have already pointed out six different things to consider to reduce classification error, I just want to add one more: the amount of training data. If you train a neural network with more data, it will work better. Training with a large amount of data is key to getting good results from neural networks, especially deep neural networks.
Why does the classification error go up with more hidden neurons?
The answer is simple: your model has over-fitted the training data, which results in poor performance on unseen data. Note that if you increase the number of neurons in the hidden layer, training error will decrease but testing error will increase.
A typical plot of training and test error against hidden layer size shows exactly this: training error keeps falling as the layer grows, while test error bottoms out and then rises again.
How may I vary the Learning Rate, Momentum, Training type, Epoch and Minimum Error in Matlab?
I assume you have already seen the feed-forward neural net in MATLAB. You just need to set the second parameter of the function feedforwardnet(hiddenSizes,trainFcn), which is trainFcn, the training function.
For example, if you want to use gradient descent with momentum and adaptive learning rate backpropagation, use traingdx as the training function. Use traingda instead if you want gradient descent with adaptive learning rate backpropagation without momentum.
You can change all the required parameters of the chosen function as you want. For example, to use traingda, you just need to follow these two steps.
Set net.trainFcn to 'traingda'. This sets net.trainParam to traingda's default parameters.
Set the net.trainParam properties to the desired values.
Example
net = feedforwardnet(3,'traingda');
net.trainParam.lr = 0.05;     % set the learning rate to 0.05
net.trainParam.epochs = 2000; % set the maximum number of epochs
Please see the documentation pages for traingda (gradient descent with adaptive learning rate backpropagation) and traingdx (gradient descent with momentum and adaptive learning rate backpropagation).
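As a companion example, traingdx exposes a momentum constant in addition to the learning rate (a sketch; the parameter names are as documented for that function, and inputs/targets stand in for your own data):
net = feedforwardnet(3, 'traingdx');
net.trainParam.lr = 0.05;      % learning rate
net.trainParam.mc = 0.9;       % momentum constant
net.trainParam.epochs = 2000;  % maximum number of epochs
[net, tr] = train(net, inputs, targets);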

Is there any standard rule for choosing the best result, or for deciding at which stage to stop training?

I have a dataset of 1100 samples, of which I have used 70% for training, 15% for testing, and 15% for validation. The problem is that every time I train the network on the same training set I get very different results. Is there any standard rule for choosing the best result, or for deciding at which stage to stop training to minimize the error?
Normally, if you are using a neural network, you should not get wildly different results between runs on the same training set. So, first of all, check that your algorithm is working correctly on some standard benchmark problems (such as the iris or Wisconsin breast cancer datasets from the UCI repository).
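For instance, a quick sanity check on the iris data might look like this (a sketch; fisheriris ships with the Statistics and Machine Learning Toolbox, and patternnet with the Deep Learning Toolbox):
load fisheriris                              % meas: 150x4, species: 150x1 cell
inputs  = meas';                             % 4x150, one column per sample
targets = full(ind2vec(grp2idx(species)'));  % 3x150 one-hot class targets
net = patternnet(10);                        % a small hidden layer suffices here
[net, tr] = train(net, inputs, targets);     % should reach high accuracy easily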
Regarding when to stop the training, there are two options:
1. When the training set error falls below a threshold
2. When the validation set error starts increasing
Case (1) is clear, as the training error always decreases. For case (2), however, there is no absolute criterion, as the validation error may fluctuate during training. So plot it to see how it behaves, and then set a threshold based on your observations (for example, stop when its value becomes 10% larger than the minimum it reached during training).
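The training record returned by train makes case (2) easy to inspect. A sketch (tr.epoch, tr.perf, and tr.vperf are the standard training-record fields; the 10% threshold is just the example above):
[net, tr] = train(net, inputs, targets);      % inputs/targets: your own data
plot(tr.epoch, tr.perf, tr.epoch, tr.vperf);  % training vs. validation error
legend('training error', 'validation error');
% Flag the first epoch where validation error exceeds its minimum by 10%:
stopEpoch = find(tr.vperf > 1.1 * min(tr.vperf), 1);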

Matlab neural network training

What is the difference between the following two pieces of code? Is it better to modify the number of epochs in the training structure, or to put the training function in a loop?
Thank you
First code:
for i = 1:10
    % Train the network
    [net,tr] = train(net,inputs,targets);
end
Second code:
net.trainParam.epochs = 200;
[net,tr] = train(net,inputs,targets);
If the inputs and targets you provide describe a model that is very hard to train, then there is theoretically no difference between the first and the second piece of code. This assumes the network hits the maximum number of epochs on every pass through the for loop.
In that case, the first piece of code simply takes the network trained in the previous iteration and continues training it in the next one: training did not converge, so each pass picks up where the last one left off. In the second piece of code, you set the total number of epochs up front and train only once.
However, depending on your inputs and targets, training may finish in fewer epochs than the maximum you set. For example, if you set the maximum number of epochs to, say, 100, and training converges after only 35 epochs in the first loop iteration, the subsequent iterations will not change the network at all, so they are wasted computation.
In short: if your network is fairly easy to train, use the second piece of code. If it is hard to train, setting one huge epoch count and training in a single shot may not be enough; instead, reduce the number of epochs and place the call to train in a for loop, so the network improves incrementally, as sketched below.
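A sketch of that incremental approach (the convergence check via tr.num_epochs is an illustration, not a standard recipe):
net.trainParam.epochs = 50;                   % small budget per pass
for i = 1:10
    [net, tr] = train(net, inputs, targets);  % resumes from the current weights
    if tr.num_epochs < net.trainParam.epochs  % converged before using the budget
        break;                                % further passes would be wasted
    end
end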

ANN different results for same train-test sets

I'm implementing a neural network for a supervised classification task in MATLAB.
I have a training set and a test set to evaluate the results.
The problem is that every time I train the network on the same training set I get very different results (sometimes 95% classification accuracy, sometimes around 60%) on the same test set.
Now, I know this is because the initial weights differ between runs, and I know I can set a seed to fix the initial weights. But the question is: what does this say about my data, and what is the right way to look at it? How do I define the accuracy of my designed ANN? Is there a protocol for this (like running the ANN 50 times and taking the average accuracy)?
Thanks
Make sure your test set is large enough compared to the training set (e.g. 10% of the overall data) and check that it is diverse. If your test set only covers very specific cases, that could be the reason. Also make sure you always use the same test set. Alternatively, look up the term cross-validation.
Furthermore, good training-set accuracy combined with bad test-set accuracy is a sign of overfitting. Try applying regularization, such as simple L2 weight decay (multiply your weight matrices by e.g. 0.999 after each weight update). Depending on your data, dropout or L1 regularization could also help (especially if your input data contains a lot of redundancy). Also try a smaller network topology (fewer layers and/or fewer neurons per layer).
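The manual weight-decay trick above is not directly exposed by MATLAB's network objects; the closest built-in is performance-function regularization (a sketch; the 0.1 ratio is an arbitrary starting point):
net.performFcn = 'mse';                 % mean squared error performance
net.performParam.regularization = 0.1;  % blend mse with mean squared weights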
To speed up training, you could also try alternative learning algorithms like RPROP+, RPROP- or RMSProp instead of plain backpropagation.
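As for the protocol asked about in the question: running the network several times with different random initializations and reporting the mean and standard deviation of test accuracy is common practice. A sketch (patternnet, the 20-neuron topology, and the variable names trainX/trainT/testX/testT are assumptions):
nRuns = 50;
acc = zeros(nRuns, 1);
for r = 1:nRuns
    rng(r);                            % different random seed per run
    net = patternnet(20);              % hypothetical topology
    net = train(net, trainX, trainT);
    y = net(testX);
    [~, pred]  = max(y, [], 1);        % predicted class per column
    [~, truth] = max(testT, [], 1);    % true class per column
    acc(r) = mean(pred == truth);
end
fprintf('accuracy: %.1f%% +/- %.1f%%\n', 100*mean(acc), 100*std(acc));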
It looks like your ANN is not converging to an optimal set of weights. Without further details of the ANN model I cannot pinpoint the problem, but I would try increasing the number of training iterations.