Backpropagation with a simple ANN

I watched a lecture and derived the equations for backpropagation, but it was for a simple example with 3 neurons: an input neuron, one hidden neuron, and an output neuron. That was easy to derive, but how would I do the same with more neurons? I'm not talking about adding more layers; I'm just talking about adding more neurons to the three existing layers: the input, hidden, and output layers.
My first guess would be to use the equations I've derived for the network with just 3 neurons and 3 layers and iterate across all possible paths to each of the output neurons in the larger network, updating each weight. However, this would cause certain weights to be updated more than once. Can I just do this or is there a better method?

The short answer is that you do not iterate over paths: standard backpropagation computes one error term (a "delta") per neuron, working backwards from the output layer, and each weight is then updated exactly once; the contributions of all paths through that weight are summed automatically by the chain rule.
If you want to learn more about backpropagation, I recommend reading these notes from Stanford University: http://cs231n.github.io/optimization-2/. They will really help you understand backprop and all the math underneath.
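To make that concrete, here is a minimal NumPy sketch of one training step for a network with several neurons per layer (the 3-4-2 sizes, the squared-error loss, and the sigmoid activations are my own choices for illustration; biases are omitted for brevity):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy sizes: 3 inputs, 4 hidden neurons, 2 outputs (arbitrary choices).
rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))   # input -> hidden weights
W2 = rng.standard_normal((2, 4))   # hidden -> output weights

x = rng.standard_normal(3)         # one input sample
t = np.array([0.0, 1.0])           # its target

# Forward pass.
h = sigmoid(W1 @ x)                # hidden activations
y = sigmoid(W2 @ h)                # output activations

# Backward pass (squared-error loss, sigmoid units).
delta_out = (y - t) * y * (1 - y)             # one delta per output neuron
delta_hid = (W2.T @ delta_out) * h * (1 - h)  # one delta per hidden neuron

# Each weight's gradient is computed exactly once.
grad_W2 = np.outer(delta_out, h)
grad_W1 = np.outer(delta_hid, x)

lr = 0.1
W2 -= lr * grad_W2
W1 -= lr * grad_W1
```

Note that grad_W1 and grad_W2 are each computed once; the "sum over all paths" you were worried about is hidden inside the matrix product W2.T @ delta_out.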

Related

How are the bias neurons created in NEAT?

I am trying to implement a simple version of NEAT. I have read from various sources that there are 4 types of "nodes": input neurons, hidden neurons, output neurons and so-called bias neurons. I don't see which process could create the bias neurons that are depicted in this paper on page 16.
I understand that new neurons may be created while mutating, but that requires an existing connection between two neurons, which is then split by the new neuron (based on the paper already mentioned, page 10). However, a bias neuron has no "input" connection, so it clearly can't be created in that way. How, then, in detail, does NEAT create bias neurons?
A bias neuron (node) in the NEAT context is simply a special input neuron that is always active. It is always included by construction, since it seems to help evolution in many cases.
So, in short, you do not create bias neurons, just as you do not create new input or output nodes; these are defined by your problem.
You are correct that the standard NEAT implementation introduces new hidden nodes by splitting existing connections. Hidden nodes are the only neurons you will create or destroy in NEAT (and, to my knowledge, in neuroevolution in general).
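To make the "bias is just an always-on input" point concrete, here is a rough sketch of genome initialization (the dictionary-based representation is my own simplification for illustration, not code from any NEAT library):

```python
# Minimal sketch of NEAT-style genome initialization with a bias node.
def make_initial_genome(num_inputs, num_outputs):
    nodes = []
    # Bias node: created once at construction time, never evolved away.
    nodes.append({"id": 0, "type": "bias"})  # its activation is always 1.0
    for i in range(num_inputs):
        nodes.append({"id": 1 + i, "type": "input"})
    for j in range(num_outputs):
        nodes.append({"id": 1 + num_inputs + j, "type": "output"})
    # Fully connect bias + inputs to outputs; hidden nodes are only added
    # later by the split-a-connection mutation.
    connections = [
        {"src": n["id"], "dst": o["id"], "weight": 0.0, "enabled": True}
        for n in nodes if n["type"] in ("bias", "input")
        for o in nodes if o["type"] == "output"
    ]
    return {"nodes": nodes, "connections": connections}

genome = make_initial_genome(num_inputs=3, num_outputs=1)
```

The bias node exists from the very first generation; mutation only ever adds hidden nodes by splitting one of the connections.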

How is the number of hidden and output neurons calculated for a neural network?

I'm very new to neural networks, but I am trying to create one for optical character recognition. I have 100 images of every digit from 0-9, each of size 24x14. The number of inputs for the neural network is therefore 336, but I don't know how to choose the number of hidden neurons and output neurons.
How do I calculate it?
For the output layer, the number of neurons should equal the number of classes you want to discriminate. For the hidden layer, the size is not so straightforward to set; it mainly depends on the trade-off between the complexity of the model and its generalization capability (see https://en.wikipedia.org/wiki/Artificial_neural_network#Computational_power).
The answers to this question can help:
training feedforward neural network for OCR
The number of output neurons is simply your number of classes (unless you have only 2 classes and are not using the one-hot representation, in which case you can make do with just one output neuron).
The number of hidden layers, and subsequently the number of hidden neurons, is not as straightforward as you might think as a beginner. Every problem will have a different configuration that works for it. You have to try multiple things out. Just keep the following in mind:
The more layers you add, the more complex your calculations become and hence, the slower your network will train.
One of the best and easiest practices is to keep the number of hidden neurons fixed in each layer.
Keep in mind what hidden neurons in each layer mean. The input layer is your starting features and each subsequent hidden layer is what you do with those features.
Think about your problem and the features you are using. If you are dealing with images, you might want a large number of neurons in your first hidden layer to break apart your features into smaller units.
Usually your results will not vary much once you increase the number of neurons past a certain point, and you will get a feel for this as you practice more. Just keep in mind the trade-offs you are making.
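As a concrete starting point, here is a hypothetical Keras sketch for your 24x14 digit images (the hidden size of 64 is an arbitrary first guess to tune by experiment, not a recommendation):

```python
# Hypothetical starting point; the hidden size of 64 is a guess to be tuned.
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(336,)),                     # 24x14 pixels, flattened
    keras.layers.Dense(64, activation="relu"),     # hidden layer: size to tune
    keras.layers.Dense(10, activation="softmax"),  # one output per digit 0-9
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```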
Good luck :)

classify the units in Deep learning for image classification

Suppose we have a database with 10 classes, and we run a classification test on it with a Deep Belief Network or a Convolutional Neural Network. The question is: how can we tell which neurons in the last layer are related to which object?
In one post, someone wrote: "to understand which neurons are for an object like shoes and which ones are not, you will put all the units in the last layer into another supervised classifier (this can be anything, like a multi-class SVM or a softmax layer)". I do not understand how this should be done; I would like a more detailed explanation.
If you have 10 classes, make your last layer have 10 neurons and use the softmax activation function. This will make sure that they all lie between 0 and 1 and add up to 1. Then, simply use the index of the neuron with the largest value as your output class.
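That read-out step is tiny in code. A minimal sketch, assuming the last layer produces raw scores (logits) for 10 classes:

```python
import numpy as np

def softmax(z):
    z = z - z.max()                # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

logits = np.array([1.2, 0.3, 4.1, -0.5, 0.0, 2.2, 1.1, 0.4, -1.3, 0.9])
probs = softmax(logits)            # all in (0, 1) and summing to 1
predicted_class = int(np.argmax(probs))  # neuron with the largest value wins
```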
You can also look into class activation maps, which do something similar to what you are asking for. Here is an insightful blog post explaining CAMs.

ANN bypassing hidden layer for an input

I have just been set an assignment to calculate some ANN outputs and write an ANN. Simple stuff, I've done it before, so I don't need any help with general ANN stuff. However, there is something that is puzzling me. In the assignment, the topology is as follows (I won't upload the diagram, as it is his intellectual property):
2 layers: 3 hidden nodes and one output node.
Input x1 goes to 2 hidden nodes and to the output node.
Input x2 goes to 2 hidden nodes.
The problem is the ever so usual XOR. He has NOT mentioned anything about this kind of topology before, and I have definitely attended each lecture and listened intently. I am a good student like that :)
I don't think this counts as homework as I need no help with the actual tasks in hand.
Any insight as to why one would use a network with a topology like this would be brilliant.
Regards
It looks like a common XOR topology with one hidden layer and a bias neuron. The bias neuron basically helps you shift the values of the activation function to the left or the right.
For more information on the role of the bias neuron, take a look at the following answers:
Role of Bias in Neural Networks
XOR problem solvable with 2x2x1 neural network without bias?
Why is a bias neuron necessary for a backpropagating neural network that recognizes the XOR operator?
Update
I was able to find some literature about this. Apparently it is possible for an input to skip the hidden layer and connect directly to the output layer. This is called a skip layer and is used to model traditional linear regression within a neural network. This page from the book Neural Network Modeling Using SAS Enterprise Miner describes the concept, and this page from the same book goes into a little more detail.
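For intuition, here is a minimal forward-pass sketch of a skip-layer network (the weights are random placeholders, not a trained XOR solution, and both inputs get a skip connection here for simplicity, unlike the assignment's diagram where only x1 does):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W_hid = rng.standard_normal((2, 2))  # input -> hidden
b_hid = rng.standard_normal(2)
w_out = rng.standard_normal(2)       # hidden -> output
w_skip = rng.standard_normal(2)      # input -> output, bypassing the hidden layer
b_out = rng.standard_normal()

def forward(x):
    h = sigmoid(W_hid @ x + b_hid)
    # The w_skip @ x term is the "linear regression" part of the model.
    return sigmoid(w_out @ h + w_skip @ x + b_out)

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, forward(np.array(x, dtype=float)))
```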

Is there a rule/good advice on how big an artificial neural network should be?

My last lecture on ANNs was a while ago, but I'm currently facing a project where I want to use one.
So the basics are set: the type (a multi-layer feedforward network), the training method (an evolutionary algorithm; that's a given for the project), the number of input neurons (8), and the number of output neurons (7).
But I'm currently trying to figure out how many hidden layers I should use and how many neurons there should be in each of those layers (the EA doesn't modify the network itself, only the weights).
Is there a general rule or maybe a guideline on how to figure this out?
The best approach for this problem is to implement the cascade correlation algorithm, in which hidden nodes are sequentially added as necessary to reduce the error rate of the network. This has been demonstrated to be very useful in practice.
An alternative, of course, is a brute-force test of various values. I don't think simple answers such as "10 or 20 is good" are meaningful because you are directly addressing the separability of the data in high-dimensional space by the basis function.
A typical neural net relies on its hidden layers in order to converge on a solution to a particular problem. A hidden layer of about 10 neurons is standard for networks with few input and output neurons, but a trial-and-error approach often works best. Since your neural net will be trained by a genetic algorithm, the number of hidden neurons may not play as significant a role: it is the weights and biases of the neurons that the algorithm modifies, just as backpropagation would.
As rcarter suggests, trial and error might do fine, but there's another thing you could try.
You could use a genetic algorithm to determine the number of hidden layers and the number of neurons in them.
I did similar things with a bunch of random forests, to try and find the best number of trees, branches, and parameters given to each tree, etc.
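For completeness, here is a rough sketch of what that search could look like (the fitness function is a stand-in; you would replace it with "train a network with these hidden sizes and return its validation score"):

```python
import random

def evaluate(hidden_sizes):
    # Stand-in fitness: replace with real training + validation accuracy.
    # This placeholder just prefers smaller networks, with some noise.
    return -sum(hidden_sizes) - 10 * len(hidden_sizes) + random.gauss(0, 1)

def mutate(hidden_sizes):
    sizes = list(hidden_sizes)
    op = random.choice(["grow", "shrink", "resize"])
    if op == "grow" and len(sizes) < 4:
        sizes.append(random.randint(2, 32))        # add a hidden layer
    elif op == "shrink" and len(sizes) > 1:
        sizes.pop(random.randrange(len(sizes)))    # drop a hidden layer
    else:
        i = random.randrange(len(sizes))
        sizes[i] = max(1, sizes[i] + random.randint(-4, 4))  # resize a layer
    return sizes

# Evolve a population of hidden-layer configurations.
population = [[random.randint(2, 32)] for _ in range(10)]
for generation in range(20):
    survivors = sorted(population, key=evaluate, reverse=True)[:5]
    population = survivors + [mutate(random.choice(survivors)) for _ in range(5)]

print("best configuration found:", max(population, key=evaluate))
```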