I am building a single layer NN to play chess.
The input is a matrix[n,m] of n training set, 64 features each where each feature represents a square.
The training output is a matrix[n,1] where each output is a centipawn score obtained from stockfish. For example, the score can be 100, -200, 1405, etc etc.
I want to build this NN that has 64 input nodes, 500 hidden nodes, and 1 output node for outputting a score.
I know that NN is used for classification, but I was wondering if I can build a NN that can output arbitrary integers as well?
Thanks
Related
I currently understand and made a simple neural network which solves the XOR problem. I want to make a neural network for digit recognition. I know using MNIST data I would need 784 input neurons, 15 hidden neurons and 10 output neurons (0-9).
However, I don’t understand how the network would be trained and how feed forward would work with multiple output neurons.
For example, if the input was the pixels for the digit 3, how would the network determine which output neuron is picked and when training, how would the network know which neuron should be associated with the target value.
Any help would be appreciated.
So you have a classification problem with multiple outputs. I'm supposing that you are using a softmax activation function for the output layer.
How the network determines which output neuron is picked: simple, the output neuron with the greatest probability of being the target class.
The network would be trained with standard backpropagation, same algorithm that you would have with only one output.
There is only one difference: the activation function.
For binary classification you need only one output (for example with digits 0 and 1, if probability < 0.5 then class is 0, else 1).
For multi-class classification you need an output node for each class; then the network will pick the node with the greatest probability of being the target class.
I am trying to perform an experiment in Caffe with a very simple single hidden layer NN. I am using the MNIST dataset trained with a single hidden layer (of 128 nodes). I have all the weights from the fully trained network already.
However, during the feed forward stage I would like to use only a smaller subset of these nodes i.e 32 or 64. So for example, I would like to calculate the activations of 64 nodes during the feed forward pass and save them. then during the next run, calculate the activations of the other 64 nodes and combine them with the activations of the first 64 so i get the activations of all 128 nodes. Thus calculating the activations of all 128 nodes but in two 'passes'.
Is there a way to achieve this in Caffe? Please excuse me as I am very new to Caffe ( just started using it this week! ).
I am trying to find the optimum number of neurons to use to run the Neural Net Fitting tool in Neural Networks Matlab app.
I am currently using 62000 samples of 64 elements as input and 62000 samples of 1 element as target. I tried to obtain similar results as in data obtained through other means, but the results are not even similar when trying to run the tool with 1-12 neurons. I tried running it with 64 neurons and the results were closer to what it was expected.
Is there any kind of way to know how many neurons to use based on the number of elements/samples?
Any suggestions on how to select the number of neurons when running the tests?
Thanks.
Even for simple datasets like MNIST I will at minimum use 128 neurons. Possible values to check are 128, 256, 512, and maybe 1024. These numbers are just easy to remember and are not magical nor the consequence of a known formula. Alternatively, choose a few random samples from [100, 500] and see which number of neurons worked best. Harder tasks tend to require more neurons, and when you have many neurons you need to consider regularizing your network with L_2 regularization or dropout.
I am designing an algorithm for OCR using a neural network. I have 100 images([40x20] matrix) of each character so my input should be 2600x800. I have some question regarding the inputs and target.
1) is my input correct? and can all the 2600 images used in random order?
2) what should be the target? do I have to define the target for all 2600 inputs?
3) as the target for the same character is single, what is the final target vector?
(26x800) or (2600x800)?
Your input should be correct. You have (I am guessing) 26 characters and 100 images of size 800 for each, therefore the matrix looks good. As a side note, that looks pretty big input size, you may want to consider doing PCA and using the eigenvalues for training or just reduce the size of the images. I have been able to train NN with 10x10 images, but bigger== more difficult. Try, and if it doesn't work try doing PCA.
(and 3) Of course, if you want to train a NN you need to give it inputs with outputs, how else re you going to train it? You ourput should be of size 26x1 for each of the images, therefore the output for training should be 2600x26. In each of the outputs you should have 1 for the character index it belongs and zero in the rest.
I am trying to classify portions of time series data using a feed forward neural network using 20 neurons in a single hidden layer, and 3 outputs corresponding to the 3 events I would like to be able to recognize. There are many other things that I could classify in the data (obviously), but I don't really care about them for the time being. Neural network creation and training has been performed using Matlab's neural network toolbox for pattern recognition, as this is a classification problem.
In order to do this I am sequentially populating a moving window, then inputting the window into the neural network. The issue I have is that I am obviously not able to classify and train every possible shape the time series takes on. Due to this, I typically get windows filled with data that look very different from the windows I used to train the neural network, but still get outputs near 1.
Essentially, the 3 things I trained the ANN with are windows of 20 different data sets that correspond to shapes that would correspond to steady state, a curve that starts with a negative slope and levels off to 0 slope (essentially the left half side of a parabola that opens upwards), and a curve corresponding to 0 slope that quickly declines (right half side of a parabola that opens downwards).
Am I incorrect in thinking that if I input data that doesn't correspond to any of the items I trained the ANN with it should output values near 0 for all outputs?
Or is it likely due to the fact that these basically cover all the bases of steady state, increasing and decreasing, despite large differences in slope, and therefore something is always classified?
I guess I just need a nudge in the right direction.
Neural network output values
A neural network may not guarantee specific output values if these input values / expected output values were presented during the training period.
A neural network will not consistently output 0 for untrained input values.
A solution is to simply present the network with an array of input values that should result in the network outputting 0.