Neural Network for a Robot - neural-network

I need to implement a Robot Brain, I used feedforward neural network as a Controller. The robot has 24 sonar sonsor, and only one ouput which is R=Right, L=Left, F=Forward, B=Back. I also have a large dataset which contain sonar data and the desired output. The FNN is trained using backpropagation algorithm.
I used neuroph Studio to construct the FNN and to do the trainnig. Here the network params:
Input layer: 24
Hidden Layer: 10
Output Layer: 1
LearnningRate: 0.5
Momentum: 0.7
GlobalError: 0.1
My problem is that during iteration the error drop slightly and seems to be static. I tried to change the parameter but I'm not getting any useful result!!
Thanks for your help

Use 1 of n encoding for the output. Use 4 output neurons, and set up your target (output) data like this:
1 0 0 0 = right
0 1 0 0 = left
0 0 1 0 = forward
0 0 0 1 = back
Reduce the number of input sensors (and corresponding input neurons) to begin with, down to 3 or 5. This will simplify things so you can understand what's going on. Later you can build back up to 24 inputs.
Neural networks often get stuck in local minima during training, that could be why your error is static. Increasing the momentum can help avoid this.
Your learning rate looks quite high. Try 0.1, but play around with these values. Every problem is different and there are no values guaranteed to work.

Related

Choosing which variables to normalize while applying logistic regression

Suppose a dataset comprises independent variables that are continuous and binary variables. Usually the label/outcome column is converted to a one hot vector, whereas continuous variables can be normalized. But what needs to be applied for binary variables.
AGE RACE GENDER NEURO EMOT
15.95346 0 0 3 1
14.57084 1 1 0 0
15.8193 1 0 0 0
15.59754 0 1 0 0
How does this apply for logistic regression and neural networks?
If the range of continuous value is small, encode it into a binary form and use each bit of that binary form as a predictor.
For example, number 2 = 10 in binary.
Therefore
predictor_bit_0 = 0
predictor_bit_1 = 1
Try and see if it works. Just to warn you, this method is very subjective and may or may not yield good results for your data. I'll keep you posted if I find a better solution

MSE in neuralnet results and roc curve of the results

Hi my question is a bit long please bare and read it till the end.
I am working on a project with 30 participants. We have two type of data set (first data set has 30 rows and 160 columns , and second data set has the same 30 rows and 200 columns as outputs=y and these outputs are independent), what i want to do is to use the first data set and predict the second data set outputs.As first data set was rectangular type and had high dimension i have used factor analysis and now have 19 factors that cover up to 98% of the variance. Now i want to use these 19 factors for predicting the outputs of the second data set.
I am using neuralnet and backpropogation and everything goes well and my results are really close to outputs.
My questions :
1- as my inputs are the factors ( they are between -1 and 1 ) and my outputs scale are between 4 to 10000 and integer , should i still scaled them before running neural network ?
2-I scaled the data ( both input and outputs ) and then predicted with neuralnet , then i check the MSE error it was so high like 6000 while my prediction and real output are so close to each other. But if i rescale the prediction and outputs then check The MSE its near zero. Is it unbiased to rescale and then check the MSE ?
3- I read that it is better to not scale the output from the beginning but if i just scale the inputs all my prediction are 1. Is it correct to not to scale the outputs ?
4- If i want to plot the ROC curve how can i do it. Because my results are never equal to real outputs ?
Thank you for reading my question
[edit#1]: There is a publication on how to produce ROC curves using neural network results
http://www.lcc.uma.es/~jja/recidiva/048.pdf
1) You can scale your values (using minmax, for example). But only scale your training data set. Save the parameters used in the scaling process (in minmax they would be the min and max values by which the data is scaled). Only then, you can scale your test data set WITH the min and max values you got from the training data set. Remember, with the test data set you are trying to mimic the process of classifying unseen data. Unseen data is scaled with your scaling parameters from the testing data set.
2) When talking about errors, do mention which data set the error was computed on. You can compute an error function (in fact, there are different error functions, one of them, the mean squared error, or MSE) on the training data set, and one for your test data set.
4) Think about this: Let's say you train a network with the testing data set,and it only has 1 neuron in the output layer . Then, you present it with the test data set. Depending on which transfer function (activation function) you use in the output layer, you will get a value for each exemplar. Let's assume you use a sigmoid transfer function, where the max and min values are 1 and 0. That means the predictions will be limited to values between 1 and 0.
Let's also say that your target labels ("truth") only contains discrete values of 0 and 1 (indicating which class the exemplar belongs to).
targetLabels=[0 1 0 0 0 1 0 ];
NNprediction=[0.2 0.8 0.1 0.3 0.4 0.7 0.2];
How do you interpret this?
You can apply a hard-limiting function such that the NNprediction vector only contains the discreet values 0 and 1. Let's say you use a threshold of 0.5:
NNprediction_thresh_0.5 = [0 1 0 0 0 1 0];
vs.
targetLabels =[0 1 0 0 0 1 0];
With this information you can compute your False Positives, FN, TP, and TN (and a bunch of additional derived metrics such as True Positive Rate = TP/(TP+FN) ).
If you had a ROC curve showing the False Negative Rate vs. True Positive Rate, this would be a single point in the plot. However, if you vary the threshold in the hard-limit function, you can get all the values you need for a complete curve.
Makes sense? See the dependencies of one process on the others?

How to properly model ANN to find relationship between real value input-output data?

I'm trying to make an ANN which could tell me if there is causality between my input and output data. Data is following:
My input are measured values of pesticides (19 total) in an area eg:
-1.031413662 -0.156086316 -1.079232918 -0.659174849 -0.734577317 -0.944137546 -0.596917991 -0.282641072 -0.023508282 3.405638835 -1.008434997 -0.102330305 -0.65961995 -0.687140701 -0.167400684 -0.4387984 -0.855708613 -0.775964435 1.283238514
And the output is the measured value of plant-somthing in the same area (55 total) eg:
0.00 0.00 0.00 13.56 0 13.56 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 13.56 0 0 0 1.69 0 0 0 0 0 0 0 0 0 0 1.69 0 0 0 0 13.56 0 0 0 0 13.56 0 0 0 0 0 0
Values for input are in range from -2.5 to 10, and for output from 0 to 100.
So the question I'm trying to answer is: in what measure does pesticide A affect plant-somthings.
What are good ways to model (represent) input/output neurons to be able to process the mentioned input/output data? And how to scale/convert input/output data to be useful for NN?
Is there a book/paper that I should look at?
First, a neural network cannot find the causality between output and input, but only the correlation (just like every other probabilistic methods). Causality can only be derived logically from reasoning (and even then, it's not always clear, it all depends on your axioms).
Secondly, about how to design a neural network to model your data, here is a pretty simple rule that can be generally applied to make a first working draft:
set the number of input neurons = the number of input variables for one sample
set the number of output neurons = the number of output variables for one sample
then play with the number of hidden layers and the number of hidden neurons per hidden layer. In practice, you want to use the fewest number of hidden layers/neurons to model your data correctly, but enough so that the function approximated by your neural network fits correctly the data (else the error in output will be huge compared to the real output dataset).
Why do you need to use just enough neurons but not too much? This is because if you use a lot of hidden neurons, you are sure to overfit your data, and thus you will make a perfect prediction on your training dataset, but not in the general case when you will use real datasets. Theoretically, this is because a neural network is a function approximator, thus it can approximate any function, but using a too high order function will lead to overfitting. See PAC learning for more info on this.
So, in your precise case, the first thing to do is to clarify how many variables you have in input and in output for each sample. If it's 19 in input, then create 19 input nodes, and if you have 55 output variables, then create 55 output neurons.
About scaling and pre-processing, yes you should normalize your data between the range 0 and 1 (or -1 and 1 it's up to you and it depends on the activation function). A very good place to start is to watch the videos at the machine learning course by Andrew Ng at Coursera, this should get you kickstarted quickly and correctly (you'll be taught the tools to check that your neural network is working correctly, and this is immensely important and useful).
Note: you should check your output variables, from the sample you gave it seems they use discrete values: if the values are discrete, then you can use discrete output variables which will be a lot more precise and predictive than using real, floating values (eg, instead of having [0, 1.69, 13.56] as the possible output values, you'll have [0,1,2], this is called "binning" or multi-class categorization). In practice, this means you have to change the way your network works, by using a classification neural network (using activation functions such as sigmoid) instead of a regressive neural network (using activation functions such as logistic regression or rectified linear unit).

Neural Network feature priority

there is a kind of NN that can give importance for some inputs ?
I have a problem like (actualy solved by 2 different NNs):
SITUATION 1)
inputs: 1 0 1 0 1 0 1 : target: 23
SITUATION 2)
inputs: 1 0 1 0 1 0 1 : target: 29
can I use the same NN for the both inputs, using the SITUATION as another INPUT for a single NN ?
One problem of this approach is that I have 50 different SITUATIONS.
Anyone with a good idea ?
Andre
I think your best bet would be adding another 50 input neurons and lighting one of them, signalizing your situation.
To make it smaller, you could use just 6 input neurons and light them up in binary code (situation 13 = 101100 as input for input neurons)
Other solution would be training neural network for each situation and save its weights+biases. Then for solving you would first apply weights+biases corresponding to situation you want to do and then calculate outputs.
Last option i can think of would be creating 50 different neural networks and use one you need.
I think that solution of having additional 6 neurons in binary is the riht way to go.
You can have up to 64 different situations. Adding 7th neuron can extend your situation count to 124 and every next neuron will double that

My neural network forgets the last training when I try to teach next set of training inputs

Im learning(started today) neural networks and could finish a 2x2x1 network(forward data feeding and backward error propagated) that can learn AND operation for one set of inputs. It also dodges any local minimums using randomized parameters. My first source for this is: http://www.codeproject.com/Articles/14342/Designing-And-Implementing-A-Neural-Network-Librar
The problem is: it learns 0 AND 0 using inputs (0,0) but when I give (0,1) it forgets 0 AND 0 then learns 0 AND 1. Is this a general newbie bug?
What I tried:
loop for 10000 times
learn 0 and 0
end loop
loop for 10000 times
learn 0 and 1 (forgets 0 and 0)
end loop
loop for 10000 times
learn 1 and 0 (forgets 0 and 1)
end loop
loop for 10000 times
learn 1 and 1 (forgets 1 and 0)
end loop
only one set is learned
fail
Trial 2:
loop for 10000 times
learn 0 and 0
learn 0 and 1
learn 1 and 0
learn 1 and 1
end loop
gives same result for all input combinations.
fail.
Activation function for each neuron: hyperbolic tangent
2x2 structure: all-pairs
2x1 structure: all-pairs
Randomized learning rate: yes, small enough to keep far from explosive iteration (per iteration)
Randomized bias per neuron: yes, between -0.5 and +0.5 (just at start)
Randomized weighting: yes, between -0.5 and +0.5 (just at start)
Edit: Bias and weight updates are done for all-pairs of hidden and output layers.
Edit: All neurons(hidden+output) use same activation function.
Without specific code it is hard to say for sure, but I think the issue is that you are only giving it one case to learn at a time. You should give it a matrix of your different learning examples, with an expected result vector. Then, when you update your weights and biases, you are finding the values that minimize the error between your network output for all cases, and the expected output for all cases.
For an AND gate, your input would be (in MATLAB code, not sure what language you are using but that syntax is easy to understand):
input = [0, 0;
0, 1;
1, 0;
1, 1];
And your expected output would be:
output = [0;
0;
0;
1];
I think what you are doing now is basically finding the weights and biases that minimize the error between the network output and the expected output for just one input case, and then re-training those weights and biases to minimize the error for the second case, then the third, then the fourth. If you put them in arrays like this it should minimize the overall error for all cases. This is just my best guess though without any code to go on.