Suppose I have a very large training set, so large that Matlab hangs while training or there is insufficient memory to hold it.
Is it possible to split the training set into parts and train the network part by part?
Is it possible to train the network with one sample at a time (one by one)?
You can just manually divide the dataset into batches and train on them one after another. A sketch, assuming the full dataset is stored with samples as the columns of all_inputs and all_targets:
batch_size = 1000;                                % tune this to fit in memory
num_samples = size(all_inputs, 2);
num_batches = ceil(num_samples / batch_size);
for bn = 1:num_batches
    idx = (bn-1)*batch_size + 1 : min(bn*batch_size, num_samples);
    net = train(net, all_inputs(:, idx), all_targets(:, idx));
end
The batch size should be greater than 1, but this approach should still reduce memory consumption during training.
If you are using the trainlm training algorithm, the net.efficiency.memoryReduction option can also help.
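For example, the value sets how many chunks the Jacobian computation is split into; larger values use less memory at the cost of extra computation time:
net.efficiency.memoryReduction = 2;   % compute the trainlm Jacobian in 2 chunks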
Also, instead of the default trainlm algorithm you can try less memory-hungry ones such as trainrp.
For details on the training algorithms, check the Matlab documentation.
I assumed above that you are using the corresponding Matlab toolbox for neural networks.
Regarding training one sample at a time: you could try googling for the stochastic gradient descent algorithm, but it does not seem to be in the toolbox's default set of training algorithms.
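If you still want to experiment with per-sample updates, the toolbox's adapt function applies the network's incremental learning rules one presentation at a time. A minimal sketch, assuming samples are stored as columns; not every network configuration supports adaptation, so treat this as an experiment:
for i = 1:size(inputs, 2)
    [net, y, e] = adapt(net, inputs(:, i), targets(:, i));   % one update per sample
end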
I am using a multi-layer perceptron to fit a model to data given input-output pairs, following the tutorial https://www.mathworks.com/help/deeplearning/gs/fit-data-with-a-neural-network.html.
Confusion 1) I am having a tough time understanding where the test set created by the command net.divideParam.testRatio is used. In general we split the data into a training set, a validation set, and an unseen test set that is used for performance evaluation and for reporting the confusion matrix. That approach is common for classification tasks, but for regression and model fitting, e.g. with a NN, shouldn't we also explicitly have a test set that is unseen during training? Does net.divideParam.testRatio create that unseen test set, and is it then never actually used to test the network? The code below evaluates the network on all of the inputs. It is unclear whether, after training, I should use an unseen dataset for testing and then report the performance, or not.
% Create a Fitting Network
hiddenLayerSize = 10;
net = fitnet(hiddenLayerSize);
inputs = houseInputs;
targets = houseTargets;
% Set up Division of Data for Training, Validation, Testing
net.divideParam.trainRatio = 70/100;
net.divideParam.valRatio = 15/100;
net.divideParam.testRatio = 15/100;
% Train the Network
[net,tr] = train(net,inputs,targets);
% Test the Network
outputs = net(inputs);
errors = gsubtract(outputs,targets);
performance = perform(net,targets,outputs)
Confusion 2) When using the regression model mvregress, do we follow the same approach as in the answer to confusion 1)?
Please help. I am unable to find the correct practices for these initial steps, and I believe getting them right makes a great impact on the result.
I can help you mostly with confusion 1).
When you train a neural network, you are separating the dataset into 3 sets:
Training set, used to train the network (the only set that actually drives the update of the network weights);
Validation set, used to stop the training (this is the Validation checks parameter in the GUI);
Test set, which feeds the performance plots and the reported overall performance of the fitter.
Therefore, of these 3, only the training set is seen by the network and influences the weight updates, while the validation set allows you to stop the training if the network is overfitting the training data (that is, when an improvement in the fit on the training data no longer improves the fit on the validation data). Finally, the test set is useful for a first check of the fitter's performance.
If you check the value of net.divideParam, you can see that the network stores the percentage of samples for each set; during training, the inputs and targets are randomly divided according to these 3 ratios. This is also why the toolbox can plot the performance of the network separately for each set. You can avoid the random division by setting net.divideFcn to 'divideind'; this is mostly useful when you know your dataset well.
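For example, a fixed (reproducible) division by index; the ranges below are illustrative for a dataset of 1000 samples:
net.divideFcn = 'divideind';
net.divideParam.trainInd = 1:700;      % first 700 samples for training
net.divideParam.valInd   = 701:850;    % next 150 for validation
net.divideParam.testInd  = 851:1000;   % last 150 for testing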
When you train the network using
[net,tr] = train(net,inputs,targets);
tr stores the results of the training, including the indexes of the training (tr.trainInd), validation (tr.valInd) and test (tr.testInd) sets. To retrieve each set, index the inputs and targets with those indexes; other quantities, such as the accuracy or the performance of the network on each set, can also be retrieved from tr.
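For example, a minimal sketch (assuming samples are stored as columns) that evaluates the trained network only on the unseen test split, which addresses confusion 1):
testInputs  = inputs(:, tr.testInd);
testTargets = targets(:, tr.testInd);
testOutputs = net(testInputs);
testPerformance = perform(net, testTargets, testOutputs)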
Regarding confusion 2, I think the regression model mvregress works with a different approach: it just estimates the parameters of the fit, without splitting the dataset into three slices. It is up to you to evaluate the regression, for example by holding some points out of the inputs.
My data set has 150 independent variables and 10 response variables. The problem is to find a mapping between the input and output variables. There are 1000 data points, of which I have used 70% for training and 30% for testing. I am using a feedforward neural network with 10 hidden neurons, as explained in this Matlab document. I am evaluating the performance using the commands
perf_Train = perform(net, lblTrain', TrainedData')   % perform(net, targets, outputs)
YPred = net(XTest);
perf_Test = perform(net, lblTest', YPred)
which basically give the mean square error between the actual and the predicted (estimated) responses for training and testing. The trained model fits the training data quite well, but it does not fit the test data properly.
Problem 1: My training error is always lower than the test error, e.g. perf_Train = 0.0867 and perf_Test = 0.567.
Is this overfitting or underfitting?
Problem 2: How do I make the model fit the test data accurately? Theory says that to overcome overfitting and underfitting we need regularization. Is there a parameter, such as a regularization term, that I can pass to the function to overcome this?
It is overfitting, since the training error is lower than the test error.
I would recommend setting fewer epochs (iterations) for your training, or using less training data.
I would also recommend checking that the training data and test data are picked randomly, for example as in the sketch below.
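A minimal sketch of such a random 70/30 split, assuming X and Y hold the full dataset with samples as columns (transpose accordingly if your samples are stored as rows):
n = size(X, 2);
idx = randperm(n);                      % random permutation of sample indices
nTrain = round(0.7 * n);
XTrain = X(:, idx(1:nTrain));           lblTrain = Y(:, idx(1:nTrain));
XTest  = X(:, idx(nTrain+1:end));       lblTest  = Y(:, idx(nTrain+1:end));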
For regularization, it can be set like this:
net.performParam.regularization = 0.5;
The best regularization ratio depends on the model; 0.5 is just an example.
For more details, you can refer to the documentation below.
https://www.mathworks.com/help/deeplearning/ug/improve-neural-network-generalization-and-avoid-overfitting.html#bss4gz0-38
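The same page also covers Bayesian regularization (trainbr), which tunes the regularization term automatically; a sketch, where trainInputs and trainTargets stand in for your training matrices:
net = feedforwardnet(10, 'trainbr');                 % same hidden size as in the question
[net, tr] = train(net, trainInputs, trainTargets);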
I trained a Neural Network with a GA and with Backpropagation. The GA finds suitable weights for the training data but performs poorly on the test data. If I train the NN with BackPropagation, it performs much better on the test data even though the training error isn't much smaller than for the GA trained version. Even when I use the weights obtained by the GA as initial weights for Backpropagation, the NN performs worse on the test data than using only Backpropagation for training. Can anyone tell me where I could have made a mistake?
I suggest you read about overfitting. In short, the network becomes excellent on the training set but poor on the test set, because it follows the anomalies and noise in the data. The task of a NN is to generalize, but the GA only minimizes the error on the training set (to be fair, that is the GA's task).
There are some methods for dealing with overfitting. I suggest you use a validation set. The first step is to divide your data into three sets: training, testing and validation. The method is simple: you train your NN with the GA to minimize the error on the training set, but you also run your NN on the validation set (run only, do not train on it). The error of the network decreases on the training set, and it should also decrease on the validation set. If the error keeps decreasing on the training set but starts increasing on the validation set, you must stop the learning (just don't stop in the first iterations).
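A hypothetical sketch of that loop; gaStep, bestIndividual and netError are placeholder helpers for your own GA update and (run-only) evaluation code, not toolbox functions:
bestValErr = Inf;  patience = 20;  stall = 0;
for gen = 1:maxGenerations
    population = gaStep(population, trainX, trainT);  % one GA generation on the training set
    candidate  = bestIndividual(population);
    valErr     = netError(candidate, valX, valT);     % run only, no training
    if valErr < bestValErr
        bestValErr = valErr;  bestNet = candidate;  stall = 0;
    else
        stall = stall + 1;
        if stall >= patience, break; end              % validation error keeps rising: stop
    end
end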
Hope it will be helpful.
I have encountered a similar problem: the choice of the initial values of the neural network does not seem to affect the final classification accuracy. I used the feedforwardnet() function in Matlab to compare the two cases. In one, I trained directly, letting the program pick random initial weights and bias values. In the other, I found suitable initial weights and bias values with the GA algorithm, assigned them to the neural network, and then started training. However, the latter approach did not improve the classification accuracy of the neural network.
I am looking at (two-layer) feed-forward Neural Networks in Matlab. I am investigating parameters that can minimise the classification error.
A google search reveals that these are some of them:
Number of neurons in the hidden layer
Learning Rate
Momentum
Training type
Epoch
Minimum Error
Any other suggestions?
I've varied the number of hidden neurons in Matlab from 1 to 10. I found that the classification error is close to 0% with 1 hidden neuron and then grows very slightly as the number of neurons increases. My question is: shouldn't a larger number of hidden neurons guarantee an equal or better answer? Why might the classification error go up with more hidden neurons?
Also, how might I vary the Learning Rate, Momentum, Training type, Epoch and Minimum Error in Matlab?
Many thanks
Since you are considering a simple two-layer feed-forward network and have already listed 6 different things to consider for reducing the classification error, I just want to add one more: the amount of training data. If you train a neural network with more data, it will work better. Training with a large amount of data is key to getting good results from neural networks, especially deep neural networks.
Why does the classification error go up with more hidden neurons?
The answer is simple: your model has overfitted the training data, which results in poor test performance. Note that if you increase the number of neurons in the hidden layers, the training error decreases but the test error increases.
A plot of training and test error against hidden layer size shows exactly this: the training error keeps decreasing while the test error starts to rise.
How may I vary the Learning Rate, Momentum, Training type, Epoch and Minimum Error in Matlab?
I expect you have already seen the feed-forward neural net in Matlab. You just need to set the second parameter of the function feedforwardnet(hiddenSizes,trainFcn), which is trainFcn, the training function.
For example, if you want to use gradient descent with momentum and adaptive learning rate backpropagation, then use traingdx as the training function. You can also use traingda if you want to use gradient descent with adaptive learning rate backpropagation.
You can change all the required parameters of the function as you want. For example, to use traingda you just need to follow these two steps.
Set net.trainFcn to 'traingda'. This sets net.trainParam to traingda's default parameters.
Set net.trainParam properties to desired values.
Example
net = feedforwardnet(3, 'traingda');
net.trainParam.lr = 0.05;      % set the learning rate to 0.05
net.trainParam.epochs = 2000;  % set the maximum number of epochs
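To cover the remaining knobs from the question, a similar sketch for traingdx, whose parameters include the momentum constant and the performance goal (the "Minimum Error"):
net = feedforwardnet(3, 'traingdx');
net.trainParam.lr     = 0.05;    % learning rate
net.trainParam.mc     = 0.9;     % momentum constant
net.trainParam.epochs = 2000;    % maximum number of epochs
net.trainParam.goal   = 1e-5;    % performance goal, i.e. the minimum error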
Please see these pages: gradient descent with adaptive learning rate backpropagation (traingda) and gradient descent with momentum and adaptive learning rate backpropagation (traingdx).
I've built digit recognition (56x56 digits) using Neural Networks, but I'm getting 89.5% accuracy on the test set and 100% on the training set. I know that it's possible to get >95% on the test set using this training set. Is there any way to improve my training so I can get better predictions? Changing the number of iterations from 300 to 1000 gave me +0.12% accuracy. I'm also limited by file size, so increasing the number of nodes may be impossible; if that's the case, maybe I could cut some pixels/nodes from the input layer.
To train I'm using:
input layer: 3136 nodes
hidden layer: 220 nodes
labels: 36
regularized cost function with lambda=0.1
fmincg to calculate weights (1000 iterations)
As mentioned in the comments, the easiest and most promising way is to switch to a Convolutional Neural Network. But with your current model you can:
Add more layers with fewer neurons each, which increases learning capacity and should increase accuracy a bit. The problem is that you might start overfitting; use regularization to counter this.
Use Batch Normalization (BN). While you are already using regularization, BN also acts as a regularizer, accelerates training, and is an NN-specific technique that might work better.
Make an ensemble. Train several NNs on the same dataset but with different initializations. This produces slightly different classifiers, and you can combine their outputs to get a small increase in accuracy (see the sketch after this list).
Use a cross-entropy loss. You don't mention which loss function you are using; if it's not cross-entropy, you should start using it. All the high-accuracy classifiers use cross-entropy loss.
Switch to backpropagation with Stochastic Gradient Descent. I do not know the exact effect of using a different optimization algorithm, but it might outperform the one you are currently using, and you could combine it with other optimizers such as Adagrad or Adam.
Other small changes that might increase accuracy are changing the activation function (e.g. to ReLU), shuffling the training samples after every epoch, and doing data augmentation.
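For the ensemble suggestion above, a hypothetical sketch using the toolbox's patternnet; the original setup uses custom code, so the names and sizes here are purely illustrative:
numNets = 5;
nets = cell(numNets, 1);
outputsSum = 0;
for k = 1:numNets
    nets{k} = patternnet(220);                   % same hidden size as in the question
    nets{k} = train(nets{k}, inputs, targets);   % each run starts from a fresh random init
    outputsSum = outputsSum + nets{k}(inputs);
end
ensembleOutputs = outputsSum / numNets;          % soft-voting average
[~, predictedClass] = max(ensembleOutputs);      % winning class per sample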