Artificial Neural Network Back Propagation testing

I have developed code for an ANN with back propagation to classify snore segments. I have 10 input features, 1 hidden layer with 10 neurons, and one output neuron. I denoted 1 as no snore and 0 as snore. I have 3000 segments; 2500 of them are no-snore segments (marked as 1) and 500 are snore segments (marked as 0). I have already divided the data set into three sets (70% training, 15% validation and 15% testing).
Now, while training the network, I first shuffled the training set so the snore and no-snore segments were mixed together. After I trained the network and validated it (feed-forward only), I found that it could only classify one of the two classes. To clarify: suppose the last element in the training set is no snore (1). Then in the validation phase, the network always gives output close to 1, even for snore segments (0). The same thing happens if the last element is snore (0): the network then gives output close to 0 all the time in the validation phase.
How can I solve this problem? Why does my network not retain what it learned from the earlier segments, and only keep what it learned from the last one? What should I change in the network to fix this?

The problem I see is that there are not enough neurons and synapses in the hidden layer. Remember that there is still no exact way to calculate the number of neurons in a hidden layer, so we must use a trial-and-error methodology. There are many empirical formulas that you can check at the following link:
https://stats.stackexchange.com/questions/181/how-to-choose-the-number-of-hidden-layers-and-nodes-in-a-feedforward-neural-netw

This is a classification problem, so I would recommend that you have two output neurons: one outputs 1 if the segment is a snore segment while the other outputs -1, and vice versa for segments without a snore. This should help the network classify both of them. You should also normalize your input features to the range between -1 and 1; this will help the neural network make better use of your inputs. You may also want to look at using a softmax layer as your output.
You may also need to add another hidden layer or more neurons to your current hidden layer (thanks to @YuryEuceda for this suggestion), and a bias input if you do not already have one.
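As a rough illustration of the normalization and two-neuron softmax output suggested above, here is a minimal NumPy sketch; the placeholder data, labels, and names are purely illustrative, not taken from the question.

```python
import numpy as np

# Placeholder feature matrix standing in for the 3000 segments x 10 features
# described in the question; real data would be loaded here instead.
rng = np.random.default_rng(0)
X = rng.random((3000, 10))

# Scale each feature column to the range [-1, 1] (min-max normalization).
X_min, X_max = X.min(axis=0), X.max(axis=0)
X_scaled = 2 * (X - X_min) / (X_max - X_min) - 1

# Two output neurons via one-hot targets: [1, 0] = snore, [0, 1] = no snore.
labels = rng.integers(0, 2, size=3000)   # placeholder labels
T = np.eye(2)[labels]

def softmax(z):
    # Row-wise softmax for a two-neuron output layer.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)
```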

The problem is in the number of hidden neurons.
In this paper you will find different methods to choose it:
http://www.ijettjournal.com/volume-3/issue-6/IJETT-V3I6P206.pdf
I propose:
number of hidden neurons = (number of inputs + number of outputs) * 2/3
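For the asker's network (10 inputs, 1 output), this rule would suggest roughly (10 + 1) * 2/3 ≈ 7 hidden neurons, which is in the same range as the 10 currently used.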

Related

How does one calculate the target outputs of neurons in hidden layers of a neural network?

In a simple single-layer network, it is easy to calculate the target outputs of neurons, as they are identical to the target outputs of the network itself. However, in a multiple-layer network, I am not quite sure how to calculate the targets for each individual neuron in the hidden layers, because they do not necessarily have a direct connection to the final output and are most likely not given in the training data. How would one find these values?
I would not be surprised if I am missing something and am going about this incorrectly, but I would like to know nonetheless. Thanks in advance for any and all input.
Taken from this great guide on pg. 18:
Calculate the Errors for the hidden layer neurons. Unlike the output layer we can't calculate these directly (because we don't have a Target), so we Back Propagate them from the output layer (hence the name of the algorithm). This is done by taking the Errors from the output neurons and running them back through the weights to get the hidden layer errors.
Or in other words, you don't. You propagate the activations from the input to the output, calculate the error of the output, then backpropagate the error from the output back to the input (thus the name of the algorithm).
In the unfortunate case that the link I posted goes down, it can be found by Googling "backpropagation algorithm 3".
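For concreteness, here is a minimal NumPy sketch of that step, assuming a single sigmoid hidden layer and a squared-error output; all shapes, names, and data are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative shapes: 10 inputs, 5 hidden units, 1 output.
rng = np.random.default_rng(0)
x = rng.random(10)                 # one input vector
t = np.array([1.0])                # its target output
W1 = rng.normal(0, 0.1, (5, 10))   # input -> hidden weights
W2 = rng.normal(0, 0.1, (1, 5))    # hidden -> output weights

# Forward pass: propagate activations from input to output.
h = sigmoid(W1 @ x)                # hidden activations
y = sigmoid(W2 @ h)                # network output

# Output error: here we *do* have a target.
delta_out = (y - t) * y * (1 - y)

# Hidden errors: no target exists, so run the output errors back
# through the weights (the back-propagation step the quote describes).
delta_hidden = (W2.T @ delta_out) * h * (1 - h)
```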

How is the number of hidden and output neurons calculated for a neural network?

I'm very new to neural networks, but I am trying to create one for optical character recognition. I have 100 images of every digit from 0-9, each of size 24x14. The number of inputs for the neural network is 336, but I don't know how to get the number of hidden neurons and output neurons.
How do I calculate it?
While the number of output neurons should equal the number of classes you want to discriminate, the size of the hidden layer is not so straightforward to set: it mainly depends on the trade-off between model complexity and generalization capability (see https://en.wikipedia.org/wiki/Artificial_neural_network#Computational_power).
The answers to this question can help:
training feedforward neural network for OCR
The number of output neurons is simply your number of classes (unless you only have 2 classes and are not using the one-hot representation, in which case you can make do with just 1 output neuron).
The number of hidden layers, and subsequently the number of hidden neurons, is not as straightforward to choose as you might think as a beginner. Every problem has a different configuration that works for it. You have to try multiple things out. Just keep this in mind though:
The more layers you add, the more complex your calculations become, and hence the slower your network will train.
One of the best and easiest practices is to keep the number of hidden neurons fixed in each layer.
Keep in mind what hidden neurons in each layer mean. The input layer is your starting features and each subsequent hidden layer is what you do with those features.
Think about your problem and the features you are using. If you are dealing with images, you might want a large number of neurons in your first hidden layer to break apart your features into smaller units.
Usually your results will not vary much once you increase the number of neurons past a certain point, and you'll get a feel for this as you practice more. Just keep in mind the trade-offs you are making.
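If it helps, here is one way to run that trial-and-error loop in Python, assuming scikit-learn's MLPClassifier and placeholder data shaped like the 24x14 images in the question; the candidate sizes are arbitrary.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Placeholder data: 1000 flattened 24x14 images (336 features), 10 digit classes.
rng = np.random.default_rng(0)
X = rng.random((1000, 336))
y = rng.integers(0, 10, size=1000)

X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

# Try a few hidden layer sizes and compare validation accuracy.
for n_hidden in (50, 100, 200, 400):
    clf = MLPClassifier(hidden_layer_sizes=(n_hidden,), max_iter=300, random_state=0)
    clf.fit(X_tr, y_tr)
    print(n_hidden, clf.score(X_val, y_val))
```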
Good luck :)

Artificial neural network presented with unclassified inputs

I am trying to classify portions of time series data using a feed forward neural network using 20 neurons in a single hidden layer, and 3 outputs corresponding to the 3 events I would like to be able to recognize. There are many other things that I could classify in the data (obviously), but I don't really care about them for the time being. Neural network creation and training has been performed using Matlab's neural network toolbox for pattern recognition, as this is a classification problem.
In order to do this I am sequentially populating a moving window, then inputting the window into the neural network. The issue I have is that I am obviously not able to classify and train every possible shape the time series takes on. Due to this, I typically get windows filled with data that look very different from the windows I used to train the neural network, but still get outputs near 1.
Essentially, the 3 things I trained the ANN with are windows from 20 different data sets corresponding to these shapes: steady state; a curve that starts with a negative slope and levels off to 0 slope (essentially the left half of a parabola that opens upwards); and a curve with 0 slope that quickly declines (the right half of a parabola that opens downwards).
Am I incorrect in thinking that if I input data that doesn't correspond to any of the items I trained the ANN with it should output values near 0 for all outputs?
Or is it likely due to the fact that these basically cover all the bases of steady state, increasing and decreasing, despite large differences in slope, and therefore something is always classified?
I guess I just need a nudge in the right direction.
A neural network cannot guarantee specific output values for inputs unless those input values / expected output values were presented during the training period.
A neural network will not consistently output 0 for untrained input values.
A solution is to simply also present the network with training examples of input values that should result in the network outputting 0.
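As a sketch of that fix, assuming NumPy arrays shaped like the 20-sample windows and 3 outputs described in the question (all data here is placeholder):

```python
import numpy as np

# Placeholder training set: 500 windows of 20 samples, 3 event classes.
rng = np.random.default_rng(0)
X_train = rng.random((500, 20))
T_train = np.eye(3)[rng.integers(0, 3, size=500)]   # one-hot event targets

# Windows that match none of the three events, paired with all-zero targets,
# so the network learns to output ~0 for "none of the above" shapes.
X_none = rng.random((200, 20))
T_none = np.zeros((200, 3))

X_train = np.vstack([X_train, X_none])
T_train = np.vstack([T_train, T_none])
```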

Artificial Neural Network layers

I have decided to try and make a recognition system, and I want to start with pictures of, say, 16x16 pixels. That will be 256 INPUT NEURONS.
Now, the number of output neurons is essentially how many results I want, so say I want to distinguish the letters A, B and C.
Then I need 3 OUTPUT NEURONS, right?
My question is, how can I know how many neurons I need in the hidden layer? And what was the purpose of them again? Is it how many character classes I want? Say, O and Q are quite similar, so they both would lead to one hidden layer neuron that later tells them apart?
You're right about the input and output layers.
How can I know how many neurons I need in the hidden layer?
There's no concrete rule that says exactly how many units you need in the hidden layers of a neural network. There are some general guidelines though, which I'll quote from one of my answers on Cross Validated.
Number of input units: Dimension of features x(i)
Number of output units: Number of classes
Reasonable default is one hidden layer, or if > 1 hidden layer, have the same number of hidden units in every layer (usually the more the better, anywhere from about 1X to 4X the number of input units).
You also asked:
And what was the purpose of them again?
The hidden layer units just transform the inputs into values (using coefficients selected during training) that can be used by the output layer.
Is it how many character classes I want? Say, O and Q are quite similar, so they both would lead to one hidden layer neuron that later tells them apart?
No, that's not right. The number of output units will be the same as the number of classes you want. Each output unit will correspond to one letter, and will say whether or not the input image is that letter (with some probability). The output unit with the highest probability is the one you select as the right letter.
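To illustrate that last step, here is a small NumPy sketch of picking the most probable letter from softmax-normalized outputs; the raw output values are made up.

```python
import numpy as np

def softmax(z):
    # Convert raw output-unit values into probabilities that sum to 1.
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical raw outputs for the classes A, B, C.
raw = np.array([1.2, 0.3, -0.8])
probs = softmax(raw)
letters = ["A", "B", "C"]
print(letters[int(np.argmax(probs))], probs)   # select the most probable letter
```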

Issues with neural network

I am having some issues with a neural network. I am using a non-linear activation function for the hidden layer and a linear function for the output layer. Adding more neurons to the hidden layer should increase the capability of the NN and make it fit the training data better / have less error on the training data.
However, I am seeing a different phenomena. Adding more neurons is decreasing the accuracy of the neural network even on the training set.
Here is the graph of the mean absolute error with increasing number of neurons. The accuracy on the training data is decreasing. What could be the cause of this?
Could it be that Matlab's nntool, which I am using, splits the data randomly into training, test, and validation sets for checking generalization instead of using cross-validation?
Also, I see lots of negative output values when I add neurons, while my targets are supposed to be positive. Could this be another issue?
I am not able to explain the behavior of the NN here. Any suggestions? Here is the link to my data, consisting of the covariates and targets:
https://www.dropbox.com/s/0wcj2y6x6jd2vzm/data.mat
I am unfamiliar with nntool but I would suspect that your problem is related to the selection of your initial weights. Poor initial weight selection can lead to very slow convergence or failure to converge at all.
For instance, notice that as the number of neurons in the hidden layer increases, the number of inputs to each neuron in the output (visible) layer also increases (one for each hidden unit). Say you are using a logistic (sigmoid) activation in your hidden layer (always positive) and pick your initial weights from a uniform distribution on a fixed interval. Then as the number of hidden units increases, the net input to each neuron in the output layer will also increase, because there are more incoming connections. With a very large number of hidden units, your initial outputs may become very large, which results in poor convergence.
Of course, how this all behaves depends on your activation functions, the distribution of the data, and how it is normalized. I would recommend looking at Efficient BackProp by Yann LeCun for some excellent advice on normalizing your data and selecting initial weights and activation functions.
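As a sketch of the kind of fan-in-dependent initialization that advice points to (in the spirit of Efficient BackProp; the exact scheme and sizes here are assumptions, not the paper's prescription):

```python
import numpy as np

def init_weights(fan_in, fan_out, rng):
    # Scale the initial weights by 1/sqrt(fan_in) so the summed input to
    # each unit stays roughly constant as the number of incoming
    # connections grows.
    bound = 1.0 / np.sqrt(fan_in)
    return rng.uniform(-bound, bound, size=(fan_out, fan_in))

rng = np.random.default_rng(0)
n_inputs, n_hidden, n_outputs = 10, 100, 1

W1 = init_weights(n_inputs, n_hidden, rng)    # input -> hidden
# The 1/sqrt(n_hidden) scaling keeps the output neuron's net input from
# blowing up as more hidden units are added.
W2 = init_weights(n_hidden, n_outputs, rng)   # hidden -> output
```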