I want to build a neural network with Encog that has 1 input (0/1, or true/false) and 1 output (a double value), and that produces the mean value when the criterion is specified (1 as input) and 0 when the criterion isn't specified (0 as input).
For example, if I have the following training dataset
input | ideal
1 | 0.6
0 | 0
1 | 0.2
1 | 0.4
Then I expect something around 0.0 if the input was 0 and around 0.4 if the input was 1.
I have simplified my problem a lot. But the 2 principal questions are:
Can I use such a dataset as it is, or should I average the ideal values for duplicate inputs and train the network on unique inputs only?
What is the best network structure (network type, activation function, propagation) for the problem described above?
Yes, you can make an averager from a neural network, but you need to modify your training set. Right now you have three ones and only one zero, which means your NN will pick up some kind of bias. If that bias is something you actually want, just train your NN as it is. With regard to network structure, I recommend a multilayer neural network with a sigmoid activation function and backpropagation as the training method; you can also try quickprop.
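For a concrete starting point, here is a minimal Encog (Java) sketch of such a network; the layer sizes, learning rate, momentum and stopping condition are illustrative guesses, not tuned values:

import org.encog.engine.network.activation.ActivationSigmoid;
import org.encog.ml.data.MLData;
import org.encog.ml.data.MLDataSet;
import org.encog.ml.data.basic.BasicMLData;
import org.encog.ml.data.basic.BasicMLDataSet;
import org.encog.neural.networks.BasicNetwork;
import org.encog.neural.networks.layers.BasicLayer;
import org.encog.neural.networks.training.propagation.back.Backpropagation;

public class AveragerNet {
    public static void main(String[] args) {
        // The training set from the question.
        double[][] input = { {1}, {0}, {1}, {1} };
        double[][] ideal = { {0.6}, {0}, {0.2}, {0.4} };
        MLDataSet trainingSet = new BasicMLDataSet(input, ideal);

        // 1 input -> small sigmoid hidden layer -> 1 sigmoid output.
        BasicNetwork network = new BasicNetwork();
        network.addLayer(new BasicLayer(null, true, 1));
        network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 3));
        network.addLayer(new BasicLayer(new ActivationSigmoid(), false, 1));
        network.getStructure().finalizeStructure();
        network.reset();

        // Plain backpropagation with guessed learning rate and momentum.
        Backpropagation train = new Backpropagation(network, trainingSet, 0.7, 0.3);
        int epoch = 0;
        do {
            train.iteration();
            epoch++;
        } while (train.getError() > 0.0001 && epoch < 100000);

        MLData out = network.compute(new BasicMLData(new double[] { 1 }));
        System.out.println("output for input 1: " + out.getData(0));
    }
}

Because the error being minimized is a squared error over the whole set, the output for input 1 should settle near the mean of its ideal values (around 0.4).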
I understand and have made a simple neural network which solves the XOR problem. I now want to make a neural network for digit recognition. I know that using MNIST data I would need 784 input neurons, 15 hidden neurons and 10 output neurons (0-9).
However, I don't understand how the network would be trained and how the feedforward pass would work with multiple output neurons.
For example, if the input was the pixels for the digit 3, how would the network determine which output neuron is picked, and during training, how would the network know which neuron should be associated with the target value?
Any help would be appreciated.
So you have a classification problem with multiple outputs. I'm supposing that you are using a softmax activation function for the output layer.
How the network determines which output neuron is picked: simple, it picks the output neuron with the greatest probability of being the target class.
The network would be trained with standard backpropagation, same algorithm that you would have with only one output.
There is only one difference: the activation function.
For binary classification you need only one output (for example with digits 0 and 1, if probability < 0.5 then class is 0, else 1).
For multi-class classification you need an output node for each class; then the network will pick the node with the greatest probability of being the target class.
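To make this concrete, here is a small plain-Java sketch (the class and method names are mine, not from any particular library) showing that the "picked" neuron is just the arg-max of the softmax probabilities, and that the training target for a digit is a one-hot vector:

import java.util.Arrays;

public class SoftmaxPick {
    // Softmax: turns the raw output-layer activations into probabilities.
    static double[] softmax(double[] z) {
        double max = Arrays.stream(z).max().orElse(0.0); // subtract max for numerical stability
        double[] p = new double[z.length];
        double sum = 0.0;
        for (int i = 0; i < z.length; i++) { p[i] = Math.exp(z[i] - max); sum += p[i]; }
        for (int i = 0; i < z.length; i++) p[i] /= sum;
        return p;
    }

    // The predicted digit is the index of the most probable output neuron.
    static int predict(double[] probabilities) {
        int best = 0;
        for (int i = 1; i < probabilities.length; i++)
            if (probabilities[i] > probabilities[best]) best = i;
        return best;
    }

    public static void main(String[] args) {
        double[] raw = {0.1, 0.3, 2.0, 0.2, 0.1, 0.0, 0.4, 0.1, 0.2, 0.3}; // 10 output neurons
        System.out.println("predicted digit: " + predict(softmax(raw))); // prints 2

        // When training on an image of the digit 3, the target is one-hot:
        // neuron 3 should output 1, all others 0, and backpropagation
        // pushes the ten outputs toward that vector.
        double[] target = new double[10];
        target[3] = 1.0;
    }
}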
I have tried to implement a neural network in Java by myself to act as an XOR gate, and it has kind of worked. About 20% of the time when I try to train it, the weights converge to produce a good enough output (RMS < 0.05), but the other 80% of the time they don't.
The neural network (can be seen here) is composed of 2 inputs (+ 1 bias), 2 hidden (+ 1 bias) and 1 output unit. The activation function I used was the sigmoid function
1 / (1 + e^-x)
which maps the input values to between 0 and 1. The learning algorithm used is stochastic gradient descent with RMS as the cost function. The bias neurons have a constant output of 1. I have tried changing the learning rate between 0.1 and 0.01, but that doesn't seem to fix the problem.
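For reference, a minimal Java sketch of the pieces just described (illustrative only, not the actual implementation):

// The sigmoid activation, mapping any input into (0, 1).
static double sigmoid(double x) {
    return 1.0 / (1.0 + Math.exp(-x));
}

// Its derivative, expressed in terms of the activation a = sigmoid(x).
static double sigmoidPrime(double a) {
    return a * (1.0 - a);
}

// One stochastic-gradient-descent step for a single weight:
// w <- w - learningRate * dError/dw
static double sgdStep(double w, double gradient, double learningRate) {
    return w - learningRate * gradient;
}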
I had the network track the weights and the RMS error and plotted them on a graph. There are basically three different behaviours the weights can show; I can only post one of the three:
two (or more) weights diverging in different directions
Of the other two behaviours, one is the weights converging to good values and the other is a random wiggle of a single weight.
I don't know if this is just something that happens or if there is some way to fix it, so please tell me if you know anything.
I am new to neural networks and I want to create a feedforward neural network for multi-class classification. I am allowed to use any code that is publicly available, but not any MATLAB toolbox, as I don't have access to them (so no Neural Network Toolbox). The goal is to classify the data into one of 10 classes. Here is the data set; the class is defined by the three-letter code in the last column.
When creating a neural network, do you simply define the number of nodes and have each node in layer i connect to every single node in layer i+1? And then simply have them learn the weights themselves?
Also, is there a source I could follow that has MATLAB code for creating a feedforward neural network with any number of inputs and any number of nodes that does multi-class classification?
A general introduction to neural networks (it seems you still need to learn a bit about what they are):
http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html
Read this document, which explains how feedforward networks with backpropagation work (the maths are important):
http://galaxy.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html
Here you have an implementation in MATLAB with comments:
http://anoopacademia.wordpress.com/2013/09/29/back-propagation-algorithm-using-matlab/
Regarding your questions:
1) "When creating a neural network, do you simply define the amount of nodes and have each node in layer i connect to every single node in layer i+1?" Depend of the network you use. In simple fully-connected feedforward neural nets, yes.
2)"And then simply have them learn the weights themselves?" That's the general idea. You have some data of which you know their classes (supervised learning), which you will give to the neural network to learn the pattern, and after learning is finished you use this updated weights to classify new, unseen data.
One thing that should help is to use cross-entropy error instead of classification error or mean squared error (MSE) for such a multi-class problem (especially for evaluation). This is a nice article that explains the idea; I will quote its example here:
Suppose we are predicting a person’s political party affiliation (democrat,
republican, other) from independent data such as age, sex, annual
income, and so on.
...
Now suppose you have just three training data items. Your neural
network uses softmax activation for the output neurons so that there
are three output values that can be interpreted as probabilities. For
example suppose the neural network’s computed outputs, and the target
(aka desired) values are as follows:
computed | targets | correct?
-----------------------------------------------
0.3 0.3 0.4 | 0 0 1 (democrat) | yes
0.3 0.4 0.3 | 0 1 0 (republican) | yes
0.1 0.2 0.7 | 1 0 0 (other) | no
This neural network has a classification error of 1/3 = 0.33. Notice that the NN just barely gets the first two training items correct and is way off on the third training item. Now consider another output, shown below:
computed | targets | correct?
-----------------------------------------------
0.1 0.2 0.7 | 0 0 1 (democrat) | yes
0.1 0.7 0.2 | 0 1 0 (republican) | yes
0.3 0.4 0.3 | 1 0 0 (other) | no
This NN also has a classification error of 1/3 = 0.33. But this second NN is much better than the first because it nails the first two training items and only just misses the third. To summarize, classification error is a very crude measure of error. See below for a comparison of the classification error and the average cross-entropy error in the two cases:
Neural Network | classification error | Average cross-entropy error
--------------------------------------------------------------------
NN1 | 0.33 | 1.38
NN2 | 0.33 | 0.64
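You can verify those numbers: the average cross-entropy error is just the mean of -ln(p) over the probabilities p that each network assigned to the true class. A quick Java check:

public class CrossEntropyCheck {
    static double avgCrossEntropy(double[] pTrueClass) {
        double sum = 0.0;
        for (double p : pTrueClass) sum += -Math.log(p);
        return sum / pTrueClass.length;
    }

    public static void main(String[] args) {
        // Probability each NN assigned to the correct class, per training item.
        double[] nn1 = {0.4, 0.4, 0.1};
        double[] nn2 = {0.7, 0.7, 0.3};
        System.out.printf("NN1: %.2f%n", avgCrossEntropy(nn1)); // 1.38
        System.out.printf("NN2: %.2f%n", avgCrossEntropy(nn2)); // 0.64
    }
}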
To use cross-entropy error in training, you need to use a different cost function. See details here. The standard form for softmax outputs is
J(theta) = -(1/m) * sum_{i=1..m} sum_{j=1..k} 1{y_i = j} * log( exp(theta_j' * x_i) / sum_{l=1..k} exp(theta_l' * x_i) )
where m is the number of training examples and k is the number of classes; y is the label, x is the feature vector, theta is the weight parameter, and 1{.} is the indicator function.
Will there ever come a point during training where my weights become greater than 1 if I use the logistic function as my sigmoid? I just want to check that I'm coding my feedforward implementation the proper way. Thanks.
That's possible. E.g. if you have a 1-input, 1-output feedforward network with no bias, your only training input is 0.1 and the corresponding target output is 1, then the higher the weight, the better. The logistic function only ensures that the output is between 0 and 1.
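A quick numeric illustration of that example:

public class WeightGrowth {
    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    public static void main(String[] args) {
        // With input 0.1 and target 1, larger weights always get closer
        // to the target, so training can push w far beyond 1.
        for (double w : new double[] {1, 10, 50, 100}) {
            System.out.printf("w = %5.1f -> output = %.4f%n", w, sigmoid(w * 0.1));
        }
        // w =   1.0 -> 0.5250, w =  10.0 -> 0.7311,
        // w =  50.0 -> 0.9933, w = 100.0 -> 1.0000
    }
}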
I have trained a feedforward NN using the MATLAB Neural Network Toolbox on a dataset containing speech features and accelerometer measurements. The target set contains two target classes for the dataset: 0 and 1. The training, validation and performance are all fine, and I have generated code for this network.
Now I need to use this neural network in real time to recognize patterns as they occur and to generate 0 or 1 when I test a new dataset against the previously trained NN. But when I issue the command:
c = sim(net, j)
Where "j" is a new dataset[24x11]; instead 0 or 1 i get this as an output (I assume I get percent of correct classification but there is no classification result itself):
c =
Columns 1 through 9
0.6274 0.6248 0.9993 0.9991 0.9994 0.9999 0.9998 0.9934 0.9996
Columns 10 through 11
0.9966 0.9963
So is there any command or a way that I can actually see classification results? Any help highly appreciated! Thanks
I'm no MATLAB user, but from a logical point of view, you are missing an important point:
The input to a neural network is a single vector, and you are passing a matrix. MATLAB therefore thinks that you want to classify a bunch of vectors (11 in your case). So the vector that you get back is the output activation for each of these 11 input vectors.
The output activation is a value between 0 and 1 (I guess you are using the sigmoid), so this is perfectly normal. Your job is to find a threshold that fits your data best. You can get this threshold with cross-validation on your training/test data, or by just choosing one (0.5?), seeing whether the results are good, and modifying it if needed.
NNs normally convert their output to a value within (0,1), for example using the logistic function. It's not a percentage or probability, just a relative measure of certainty. In any case, this means that you have to manually apply a threshold (such as 0.5) to discriminate between the two classes. Which threshold is best is tough to find, because you must select the optimum trade-off between precision and recall.
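A minimal sketch of the thresholding both answers describe (in Java; in MATLAB the equivalent is simply c >= 0.5):

// Maps each output activation in (0,1) to a hard 0/1 class label.
static int[] classify(double[] activations, double threshold) {
    int[] labels = new int[activations.length];
    for (int i = 0; i < activations.length; i++) {
        labels[i] = activations[i] >= threshold ? 1 : 0;
    }
    return labels;
}

With threshold 0.5, the 11 activations shown above would all be labelled 1; tune the threshold on held-out data to trade off precision against recall.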