Maximum number of iterations to train a neural network

I have a data set of size N. How can I determine whether the network has been sufficiently trained on this data set?
Training would run forever if the data I feed it were random, so I need a maximum number of iterations after which a neural network can be considered trained, to avoid iterating indefinitely.
What is the maximum number of iterations for which I can consider the neural network as trained?

You will need to define a stopping criterion, such as a confidence interval or error tolerance that you are willing to accept. Please read this article for further information: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=00478409
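As a rough illustration (my addition, not part of the answer above), here is a minimal MATLAB sketch that bounds training with an explicit epoch cap and an error goal; the network size, thresholds, and random stand-in data are all assumptions:

% Random (unlearnable) data, used only to show that training still halts;
% tr.stop reports whether the cap, the goal, or another built-in criterion ended it.
inputs  = rand(5, 200);
targets = rand(1, 200);

net = feedforwardnet(10);            % example architecture, an assumption
net.divideFcn = 'dividetrain';       % no validation split, so only the cap or the goal stops training
net.trainParam.epochs = 500;         % hard cap on the number of iterations
net.trainParam.goal   = 1e-3;        % accept the result once the error is this small

[net, tr] = train(net, inputs, targets);
fprintf('Stopped after %d epochs, reason: %s\n', tr.num_epochs, tr.stop);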

Related

Large values of weights in neural network

I use Q-learning with a neural network as the function approximator. After several training iterations, the weights take values in the range from 0 to 10. Can the weights take such values, or does this indicate bad network parameters?
Weights can take such values, especially after a large number of training iterations: the connections that need to be 'heavy' get 'heavier'.
There are plenty of examples of trained networks with weights larger than 1.
Also, as the following figure legend illustrates, there is no such thing as a weight limit:
[figure: weight legend]
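To see this concretely, here is a small MATLAB sketch (an illustration I am adding, with made-up data and an arbitrary architecture) that trains a network on a target with a large scale factor and then inspects the trained weight magnitudes:

% Made-up regression data whose target has a large scale, so large weights are expected
X = rand(4, 300);
T = 10 * X(1, :) + 0.1 * randn(1, 300);

net = feedforwardnet(5);             % arbitrary small network
net = train(net, X, T);

inputWeights = net.IW{1, 1};         % input -> hidden weights
layerWeights = net.LW{2, 1};         % hidden -> output weights
fprintf('Largest |weight|: %.2f\n', max(abs([inputWeights(:); layerWeights(:)])));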

How to improve digit recognition prediction in Neural Networks in Matlab?

I've built a digit recognizer (56x56 digits) using neural networks, but I'm getting 89.5% accuracy on the test set and 100% on the training set. I know that it's possible to get >95% on the test set with this training set. Is there any way to improve my training so I can get better predictions? Changing the number of iterations from 300 to 1000 gave me +0.12% accuracy. I'm also limited by file size, so increasing the number of nodes may be impossible; if that's the case, maybe I could cut some pixels/nodes from the input layer.
To train I'm using:
input layer: 3136 nodes
hidden layer: 220 nodes
labels: 36
regularized cost function with lambda=0.1
fmincg to calculate weights (1000 iterations)
As mentioned in the comments, the easiest and most promising way is to switch to a Convolutional Neural Network. But with your current model you can do the following (a rough MATLAB sketch of a few of these changes appears after the list):
Add more layers with fewer neurons each, which increases learning capacity and should increase accuracy a bit. The problem is that you might start overfitting; use regularization to counter this.
Use Batch Normalization (BN). While you are already using regularization, BN accelerates training, has a regularizing effect of its own, and is an NN-specific technique that might work better.
Make an ensemble. Train several NNs on the same dataset but with different initializations. This produces slightly different classifiers, and you can combine their outputs to get a small increase in accuracy.
Use a cross-entropy loss. You don't mention what loss function you are using; if it's not cross-entropy, you should start using it. All the high-accuracy classifiers use a cross-entropy loss.
Switch to stochastic gradient descent with backpropagation. I do not know the effect of using a different optimization algorithm here, but it might outperform the one you are currently using, and you can combine it with adaptive optimizers such as Adagrad or Adam.
Other small changes that might increase accuracy are changing the activation function (e.g. to ReLU), shuffling the training samples after every epoch, and applying data augmentation.
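As a hedged illustration of a few of the points above (this sketch is my addition; the layer sizes, activation choice, sample count, and stand-in data are assumptions, not tuned values), in MATLAB this could look roughly like:

% Stand-in data: 3136 input pixels, 36 classes, made-up sample count
N = 1000;
X = rand(3136, N);
labels = randi(36, 1, N);
T = full(sparse(labels, 1:N, 1, 36, N));       % one-hot targets

net = patternnet([128 64]);                    % two smaller hidden layers
net.performFcn = 'crossentropy';               % cross-entropy loss
net.layers{1}.transferFcn = 'poslin';          % ReLU activations in the hidden layers
net.layers{2}.transferFcn = 'poslin';

perm = randperm(size(X, 2));                   % shuffle the samples before training
[net, tr] = train(net, X(:, perm), T(:, perm));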

Matlab neural network training

What is the difference between running the following two pieces of code? Is it better to modify the number of epochs in the training structure or to put the training function in a loop?
Thank you
First code:
for i = 1:10
    % Train the network (each call starts from the weights left by the previous call)
    [net, tr] = train(net, inputs, targets);
end
Second code:
net.trainParam.epochs = 200;
[net,tr] = train(net,inputs,targets);
If the inputs and targets you provide describe a model that is very hard to train, then there is theoretically no difference between the first and the second piece of code, assuming that the network hits the maximum number of iterations / epochs on every pass through the for loop.
In that case, the first piece of code simply takes the network trained in the previous iteration and uses it as the starting point for the next one: because training did not converge, each call picks up "where it left off". In the second piece of code, you set the total number of iterations up front and let training happen only once.
So as long as the network is hard enough to train that every call in the for loop exhausts its epoch budget, the two approaches end up equivalent.
However, depending on your inputs and targets, training your neural network may take fewer than the maximum number of epochs you set. For example, if you set the maximum number of epochs to, say, 100 and training converged after only 35 epochs at the first iteration of your loop, the subsequent iterations will not change the network at all, and the extra passes are just unnecessary computation.
In short, if your network is easy to train, just use the second piece of code. If your network is very difficult to train, setting one huge number of epochs and training in a single go may not be enough; instead of waiting a long time for one call to converge, it can be wise to reduce the number of epochs per call and place the training inside a for loop so the network improves incrementally.
So, to take something away from this: use the second piece of code when the network is fairly simple to train, and use the first piece of code, with a reduced number of epochs per call inside a for loop, for networks that are harder to train.
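A minimal sketch of that incremental pattern (my addition, reusing the question's inputs and targets, with an arbitrary architecture and goal; tr.best_perf is the toolbox's record of the best performance reached so far) might look like this:

net = feedforwardnet(20);               % example architecture, not from the question
net.trainParam.epochs = 50;             % small epoch budget per call
net.trainParam.goal   = 1e-3;           % target performance, an assumption

for k = 1:10
    [net, tr] = train(net, inputs, targets);   % continues from the current weights
    if tr.best_perf <= net.trainParam.goal     % stop looping once the goal is met
        break;
    end
end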

Increasing the number of epochs to reach the performance goal while training neural network

I am training a neural network with an input matrix of size 85*650 and a target matrix of size 26*650. Here is the list of parameters that I have used:
net.trainParam.max_fail = 6;
net.trainParam.min_grad=1e-5;
net.trainParam.show=10;
net.trainParam.lr=0.9;
net.trainParam.epochs=13500;
net.trainParam.goal=0.001;
Number of hidden nodes=76
As you can see, I have set the number of epochs to 13500. Is it OK to set the number of epochs to such a large value? The performance goal is not reached if the number of epochs is decreased, and I get bad classification results when testing.
Try not to focus on the number of epochs. Instead, you should have at least two sets of data: one for training and another for testing. Use the testing set to get a feel for how well your ANN is performing and how many epochs are needed to get a decent ANN.
For example, you want to stop training when performance on your testing set has levelled off or has begun to decrease (get worse). That would be evidence of over-learning, which is the reason why more epochs are not always better.
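In MATLAB's toolbox this kind of stopping can be expressed with a validation split and the max_fail parameter the question already sets, instead of a huge fixed epoch count. The sketch below is my own illustration (the split ratios, epoch cap, and stand-in data matching the 85*650 / 26*650 sizes are assumptions):

X = rand(85, 650);                                        % stand-in for the 85*650 inputs
T = full(sparse(randi(26, 1, 650), 1:650, 1, 26, 650));   % stand-in for the 26*650 targets

net = patternnet(76);                         % 76 hidden nodes, as in the question
net.divideFcn              = 'dividerand';    % split the samples randomly
net.divideParam.trainRatio = 0.70;
net.divideParam.valRatio   = 0.15;            % validation set drives early stopping
net.divideParam.testRatio  = 0.15;
net.trainParam.max_fail    = 6;               % stop after 6 consecutive validation failures
net.trainParam.epochs      = 1000;            % generous cap; early stopping usually ends sooner

[net, tr] = train(net, X, T);
fprintf('Stopped after %d epochs (%s)\n', tr.num_epochs, tr.stop);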

Issues with neural network

I am having some issues with a neural network. I am using a non-linear activation function for the hidden layer and a linear function for the output layer. Adding more neurons to the hidden layer should have increased the capacity of the NN and made it fit the training data better / have less error on the training data.
However, I am seeing a different phenomena. Adding more neurons is decreasing the accuracy of the neural network even on the training set.
Here is the graph of the mean absolute error as the number of neurons increases: the accuracy on the training data is decreasing. What could be the cause of this?
Is it because the MATLAB nntool I am using splits the data randomly into training, test, and validation sets to check generalization, instead of using cross-validation?
Also, as I add neurons I see lots of negative output values, while my targets are supposed to be positive. Could this be another issue?
I am not able to explain the behavior of the NN here. Any suggestions? Here is the link to my data, consisting of the covariates and targets:
https://www.dropbox.com/s/0wcj2y6x6jd2vzm/data.mat
I am unfamiliar with nntool but I would suspect that your problem is related to the selection of your initial weights. Poor initial weight selection can lead to very slow convergence or failure to converge at all.
For instance, notice that as the number of neurons in the hidden layer increases, the number of inputs to each neuron in the output (visible) layer also increases (one for each hidden unit). Say you are using a logistic sigmoid in your hidden layer (its output is always positive) and pick your initial weights from a uniform distribution over a fixed interval. Then as the number of hidden units increases, the summed input to each neuron in the output layer also increases, because there are more incoming connections. With a very large number of hidden units, your initial solution may become very large and result in poor convergence.
Of course, how this all behaves depends on your activation functions, the distribution of the data, and how it is normalized. I would recommend looking at Efficient BackProp by Yann LeCun for some excellent advice on normalizing your data and selecting initial weights and activation functions.
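As a small illustration of fan-in-aware initialization in the spirit of that advice (this sketch is my addition; the layer sizes are arbitrary and this is not nntool's internal scheme):

nIn     = 64;                            % fan-in of the hidden layer (example value)
nHidden = 200;                           % number of hidden units (example value)

% Scale the initial weights down as the fan-in grows, so the summed input to
% each downstream neuron stays moderate even for wide layers.
W1 = randn(nHidden, nIn) / sqrt(nIn);        % input -> hidden
W2 = randn(1, nHidden)   / sqrt(nHidden);    % hidden -> output

% For comparison, a fixed-interval uniform initialization ignores fan-in, so the
% pre-activations at the output grow as nHidden grows:
% W2_fixed = 2 * rand(1, nHidden) - 1;        % uniform in [-1, 1]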