Neural network value calculation? - neural-network

I have 3 input neurons x1, x2, x3. I know the output value differs from the desired result (the answer is wrong), so my weights need new values, but how much should the weight for each neuron change? How do I calculate that?
One way would be to divide (desired value - output value) by 3 and assign that amount to each neuron, but that won't generalize to new inputs, since no proper learning takes place.

From your question it seems you do not yet understand how neural networks really work.
First of all, neural networks are a class of algorithms that fall under machine learning techniques. They therefore learn, whether in an unsupervised, supervised, or reinforcement style of training, and this of course requires a learning paradigm. For neural networks, the most well-studied supervised training method is backpropagation. However, to understand how it works you first need to understand how a network is built.
A description of what is a neural network and its foundations can be seen here: http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html
One practical explanation how you can implement a functional network through backpropagation can be seen here: http://galaxy.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html
If you read these you will probably know enough to answer your question.
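To make the core idea concrete: instead of splitting the error evenly among the neurons, gradient-based learning nudges each weight in proportion to its own contribution to the error. Below is a minimal sketch of the delta rule, the single-neuron special case of the gradient-based learning the links above describe; all numbers and names are invented for the illustration:

```python
import numpy as np

# Toy setup: one linear output neuron fed by 3 inputs (x1, x2, x3).
# All values here are invented for the illustration.
rng = np.random.default_rng(0)
weights = rng.normal(size=3)       # one weight per input neuron
learning_rate = 0.1

x = np.array([0.5, -1.0, 2.0])     # an example input
desired = 1.0                      # the target output

for step in range(20):
    output = weights @ x           # forward pass
    error = desired - output       # signed error at the output
    # Delta rule: each weight is adjusted in proportion to the error
    # AND to its own input -- not by an equal 1/3 share.
    weights += learning_rate * error * x
    print(step, round(error, 4))
```

Notice how the error shrinks on every step, which is exactly the "proper learning" the question says the equal-split approach lacks.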

Related

Confusion in Backpropagation

I've started working on forward and back propagation of neural networks. I've coded it as well and it works properly, but I'm confused about the algorithm itself. I'm new to neural networks.
So forward propagation of a neural network is finding the right label with the given weights?
And back-propagation uses forward propagation to find the most error-free parameters by minimizing a cost function, and then uses those parameters to help classify other training examples? And this is called a trained neural network?
I feel like there is a big blunder in my concept; if there is, please let me know where I'm wrong and why.
I will try my best to explain forward and back propagation in a detailed yet simple-to-understand manner, although it's not an easy topic to explain simply.
Forward Propagation
Forward propagation is the process whereby, during the runtime of the network, values are fed into the front of the neural network (the inputs). You can imagine that these values then travel across the weights, which multiply the original input values by themselves. They then arrive at the hidden layer (neurons). Neurons vary quite a lot between different types of networks, but here is one way of explaining it: when the values reach a neuron, every value being fed into it is summed up, and that sum is fed into an activation function. The activation function can be very different depending on the use case; take, for example, a simple threshold (step) activation, which maps the summed value to a 0 or a 1. The result is then fed through more weights and finally spat out at the outputs, which is the last step of the network.
You can picture this network as a chain: inputs → weights → hidden neurons → weights → outputs.
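As an illustration only (the layer sizes, weights, and activation here are invented, not from the question), a forward pass through one hidden layer can be written in a few lines:

```python
import numpy as np

def activation(v):
    # Threshold activation, as in the text: each sum maps to a 0 or a 1.
    return (v > 0).astype(float)

# Invented sizes for this sketch: 2 inputs, 3 hidden neurons, 1 output.
rng = np.random.default_rng(1)
W_hidden = rng.normal(size=(3, 2))   # weights: inputs -> hidden layer
W_output = rng.normal(size=(1, 3))   # weights: hidden layer -> output

x = np.array([1.0, 0.0])             # values fed into the front (inputs)
hidden = activation(W_hidden @ x)    # each neuron sums its inputs, then activates
output = W_output @ hidden           # travels through more weights, spat out
print(output)
```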
Back Propagation
Back propagation is just like forward propagation, except we work backwards from the output toward the inputs.
The aim of back propagation is to reduce the error in the training phase (to get the neural network as accurate as possible). The way this is done is by going backwards through the weights and layers. At each weight, its contribution to the error is calculated, and each weight is individually adjusted using an optimization algorithm; an optimization algorithm is exactly what it sounds like: it optimizes the weights and adjusts their values to make the neural network more accurate.
Some optimization algorithms include gradient descent and stochastic gradient descent. I will not go through the details in this answer as I have already explained them in some of my other answers (linked below).
The process of calculating the error in the weights and adjusting them accordingly is the back-propagation process and it is usually repeated many times to get the network as accurate as possible. The number of times you do this is called the epoch count. It is good to learn the importance of how you should manage epochs and batch sizes (another topic), as these can severely impact the efficiency and accuracy of your network.
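As a hedged sketch of what one such training run looks like (a single linear neuron, squared error, and an invented dataset are assumed purely to keep the example short):

```python
import numpy as np

# Tiny invented training set: 4 examples, 2 features each.
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]])
y = np.array([1.0, 1.0, 2.0, 0.0])   # targets (here: the sum of the inputs)

w = np.zeros(2)                      # the weights to be adjusted
learning_rate = 0.1
epochs = 50                          # the "epoch count" mentioned above

for epoch in range(epochs):
    for xi, ti in zip(X, y):         # batch size 1, i.e. stochastic updates
        pred = w @ xi                # forward pass
        grad = (pred - ti) * xi      # gradient of the squared error w.r.t. w
        w -= learning_rate * grad    # one gradient descent step
print(w)                             # approaches [1. 1.]
```

Repeating the loop for more epochs drives the error down further, which is the "as accurate as possible" part of the text above.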
I understand that this answer may be hard to follow, but unfortunately this is the best way I can explain it. You might not understand it the first time you read it; remember, this is a complicated topic. I have linked a few more resources down below, including a video (not mine) that explains these processes even better than a simple text explanation can. I hope my answer has resolved your question. Have a good day!
Further resources:
Link 1 - Detailed explanation of back-propagation.
Link 2 - Detailed explanation of stochastic/gradient-descent.
Youtube Video 1 - Detailed explanation of types of propagation.
Credits go to Sebastian Lague

Supervised neural network

I wanted to ask the opinion of SO experts about the type of neural network I should use to teach it to make yes/no answers from a combination of over fifty parameters. Essentially I have an evaluation that may produce up to fifty different warnings or errors present in what's being evaluated. So far I've been using a weighted average with coefficients to produce a yes/no threshold, but I wanted to learn more about doing this with a supervised neural network, which I could feed different results and teach to give the final verdict. Which neural network can I use for such an undertaking? There are quite a few out there, and as I'm just entering the field of machine learning, I wanted to know which direction I should start looking in.
EDIT
What I'm starting to lean towards is employing some kind of back-propagation to adjust the coefficients for each of the rules, where the decision of whether the barcode data is correct or not will influence those coefficients. I'm pretty sure this can be achieved using a NN, but I'm not exactly sure which one to use.
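What the EDIT describes is close to the simplest supervised network: a single-layer perceptron with one logistic output, whose per-rule coefficients are adjusted by the gradient of the loss. A minimal sketch, with all data, sizes, and names invented for the illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
n_rules = 50                          # one input per warning/error flag

# Invented training data: each row is 0/1 flags, the label is the verdict.
X = rng.integers(0, 2, size=(200, n_rules)).astype(float)
true_w = rng.normal(size=n_rules)
labels = (X @ true_w > 0).astype(float)

w = np.zeros(n_rules)                 # the "coefficients for each rule"
b = 0.0
lr = 0.1

for epoch in range(100):
    z = X @ w + b
    p = 1.0 / (1.0 + np.exp(-z))      # sigmoid: yes/no probability
    grad_w = X.T @ (p - labels) / len(X)   # log-loss gradient per coefficient
    grad_b = np.mean(p - labels)
    w -= lr * grad_w                  # back-propagation adjusts the coefficients
    b -= lr * grad_b

verdict = (1 / (1 + np.exp(-(X @ w + b)))) > 0.5   # the final yes/no
print((verdict == labels.astype(bool)).mean())      # training accuracy
```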

How to derive a model equation from an artificial neural network?

I have used neural network software for predicting continuous data, and the prediction was clearly better than the results obtained through regression analysis. Now I would like to derive a model expression from the trained weights obtained by training on the continuous data through the software. Following the suggestions of many researchers on how to interpret trained weights and biases to derive a model equation, I tried to derive one along similar lines.
After deriving the equation I found that it was not able to replicate the results given by the neural network software, so I am exploring new methods to derive the equation. I want to know where I am going wrong, and if anyone can provide the steps for deriving one it would be helpful.
I read some time ago about what you're talking about, though with some differences; it would probably be useful to you. It's called 'knowledge distilling', if I remember well, and it is a way of extracting the knowledge inside the black box that a neural network is. Roughly speaking, it consists of training a simpler model that is easier to interpret but preserves all the predictive power of the original neural network. I'm speaking from memory, so I'm sorry about the lack of detail. A search on Google will provide the exact references for it.
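On the original question of why a hand-derived equation doesn't reproduce the software's output: assuming a standard one-hidden-layer network with sigmoid hidden units and a linear output (an assumption, since the question doesn't name the architecture), the equation implied by the trained weights looks like the sketch below. A very common cause of a mismatch is forgetting the input/output scaling the software applied before training; the scaling ranges and weights here are invented placeholders.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

# Invented "trained" parameters for a 1-input, 2-hidden-unit network.
W1 = np.array([[1.5], [-0.7]])   # input -> hidden weights
b1 = np.array([0.2, 0.1])        # hidden biases
W2 = np.array([0.9, 1.1])        # hidden -> output weights
b2 = 0.3                         # output bias

# Scaling the software may have applied (it must appear in the equation!).
x_min, x_max = 0.0, 10.0         # assumed input normalization range
y_min, y_max = 0.0, 100.0        # assumed output denormalization range

def model_equation(x):
    # y = y_min + (y_max - y_min) * (b2 + sum_j W2_j * sigmoid(b1_j + W1_j * x_scaled))
    x_scaled = (x - x_min) / (x_max - x_min)
    hidden = sigmoid(W1 @ np.atleast_1d(x_scaled) + b1)
    y_scaled = W2 @ hidden + b2
    return y_min + (y_max - y_min) * y_scaled

print(model_equation(5.0))
```

If the derived equation includes the same activations and the same scaling as the software, it must reproduce the network's predictions exactly, since it is the network.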
Hope to have helped.

ANN bypassing hidden layer for an input

I have just been set an assignment to calculate some ANN outputs and write an ANN. Simple stuff, I've done it before, so I don't need any help with general ANN material. However, there is something puzzling me. In the assignment, the topology is as follows (won't upload the diagram as it is his intellectual property):
Two layers: three hidden nodes and one output.
Input x1 goes to 2 hidden nodes and the output node.
Input x2 goes to 2 hidden nodes.
The problem is the ever so usual XOR. He has NOT mentioned anything about this kind of topology before, and I have definitely attended each lecture and listened intently. I am a good student like that :)
I don't think this counts as homework as I need no help with the actual tasks in hand.
Any insight as to why one would use a network with a topology like this would be brilliant.
Regards
It sounds like a common XOR topology with one hidden layer and a bias neuron. The bias neuron basically helps you shift the values of the activation function to the left or the right.
For more information on the role of the bias neuron, take a look at the following answers:
Role of Bias in Neural Networks
XOR problem solvable with 2x2x1 neural network without bias?
Why is a bias neuron necessary for a backpropagating neural network that recognizes the XOR operator?
Update
I was able to find some literature about this. Apparently it is possible for an input to skip the hidden layer and connect directly to the output layer. This is called a skip layer and is used to model traditional linear regression in a neural network. This page from the book Neural Network Modeling Using SAS Enterprise Miner describes the concept, and this page from the same book goes into a little more detail about it as well.
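To make the wiring concrete, here is a sketch of a forward pass with such a skip connection (the weights are invented, not from the assignment; as in the question, x1 reaches the output both through the hidden layer and directly):

```python
import numpy as np

def step(v):
    # Simple threshold activation.
    return (np.asarray(v) > 0).astype(float)

# Invented weights. Hidden node 1 acts as OR, hidden node 2 as AND.
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])
b_hidden = np.array([-0.5, -1.5])
w_out_hidden = np.array([1.0, -2.0])  # hidden -> output weights
w_out_skip = 0.5                      # x1 -> output (the skip connection)

def forward(x1, x2):
    h = step(W_hidden @ np.array([x1, x2]) + b_hidden)
    # The output sums the hidden activations AND the raw x1 input.
    return step(w_out_hidden @ h + w_out_skip * x1)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, int(forward(a, b)))   # reproduces XOR
```

The skip weight adds a term that is linear in x1 directly to the output sum, which is exactly the linear-regression-style behavior the book describes.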

Is there a rule/good advice on how big an artificial neural network should be?

My last lecture on ANNs was a while ago, but I'm currently facing a project where I want to use one.
So the basics, like what type (a multi-layer feedforward network), trained by an evolutionary algorithm (that's a given by the project), how many input neurons (8) and how many output neurons (7), are set.
But I'm currently trying to figure out how many hidden layers I should use and how many neurons in each of these layers (the EA doesn't modify the network itself, only the weights).
Is there a general rule or maybe a guideline on how to figure this out?
The best approach for this problem is to implement the cascade correlation algorithm, in which hidden nodes are sequentially added as necessary to reduce the error rate of the network. This has been demonstrated to be very useful in practice.
An alternative, of course, is a brute-force test of various values. I don't think simple answers such as "10 or 20 is good" are meaningful because you are directly addressing the separability of the data in high-dimensional space by the basis function.
A typical neural net relies on hidden layers in order to converge on a particular problem solution. A hidden layer of about 10 neurons is standard for networks with few input and output neurons. However, a trial and error approach often works best. Since the neural net will be trained by a genetic algorithm, the number of hidden neurons may not play a significant role, especially in training, since it is the weights and biases on the neurons that get modified, just as they would be by an algorithm like backpropagation.
As rcarter suggests, trial and error might do fine, but there's another thing you could try.
You could use genetic algorithms to determine the number of hidden layers and/or the number of neurons in them.
I did similar things with a bunch of random forests, to try and find the best number of trees, branches, and parameters given to each tree, etc.
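A bare-bones version of that kind of search might look like the sketch below. The fitness function is a stand-in (in the actual project it would build the 8-input/7-output network with the candidate shape, train its weights with the EA, and return the resulting error); everything here is invented for illustration:

```python
import random

def fitness(num_layers, neurons_per_layer):
    # Stand-in for: build the 8-input/7-output network with this shape,
    # train its weights with the evolutionary algorithm, return the error.
    # This dummy merely prefers moderate sizes, for illustration only.
    size = num_layers * neurons_per_layer
    return abs(size - 24) + random.random()

random.seed(0)
best = None
for trial in range(30):                       # random search over shapes
    layers = random.randint(1, 3)
    neurons = random.randint(4, 32)
    score = fitness(layers, neurons)
    if best is None or score < best[0]:
        best = (score, layers, neurons)

print(f"best shape: {best[1]} hidden layer(s) x {best[2]} neurons")
```

A genetic algorithm over shapes works the same way, except the candidate shapes are mutated and recombined instead of being drawn at random.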