Problems in reinforcement learning: bug, parameters tuning, and training period

I am currently training a reinforcement learning agent, using a simple neural network with 100 hidden units, to play the game 2048. I am using the DQN reinforcement learning algorithm (i.e. Q-learning with replay memory), but with a 2-layer neural network instead of a deep neural network.
However, I left it training on my laptop overnight (~7 hours, ~1000 games played, > 100000 steps) and the score does not seem to increase. I suspect three possible sources of error: a bug in my code, badly tuned parameters, or maybe I just didn't wait long enough.
Is there any method to figure out what is wrong with the code?
And what is the best practice to improve the training results?

I'll talk about all three of your hypotheses.
If you are using a standard DL framework like Caffe or TensorFlow, the chance of it being a bug is small.
Try decreasing the learning rate. Maybe you set it too high for the network to converge.
A training time of 100000 steps is not that long. Even for a simple game like Pong, you need to train for around 500000 steps to get good results. So you can try training it for longer.
Also, 2048 is a fairly complicated game, so maybe your network is not deep enough to learn how to play it. Two layers is not much for such a complicated game. Try increasing the number of hidden layers. Perhaps you can use the network provided here
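
To help rule out the bug hypothesis, it can be useful to check the Q-learning logic separately from the network. Below is a minimal sketch of the two pieces named in the question (a replay memory and the Q-learning target), assuming NumPy is available. The names `ReplayMemory` and `q_targets` are illustrative and not taken from the asker's code. One thing worth checking against a sketch like this is that the bootstrap term is dropped for terminal states.

```python
import random
from collections import deque

import numpy as np


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random minibatch, without replacement.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def q_targets(rewards, next_q_values, dones, gamma=0.99):
    """Q-learning targets: r + gamma * max_a' Q(s', a').

    The bootstrap term is zeroed where done is True, because there is
    no next state to bootstrap from at the end of a game.
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    max_next = np.max(np.asarray(next_q_values, dtype=float), axis=1)
    return rewards + gamma * (1.0 - dones) * max_next
```

Feeding `q_targets` a couple of hand-computed transitions and comparing the result with what your own code produces is a cheap way to test this part in isolation.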

Related

Learning Rate in Neural Networks

I am new to the field of neural networks, and I recently learned that a very low learning rate affects accuracy negatively. I understand that a very low learning rate makes training inefficient, but how is accuracy affected by a low learning rate?

When the learning rate is too large, it can cause the model to converge too quickly to a solution that is not optimal; think of it as overshooting. Conversely, when the learning rate is too small, the model can get stuck in a local minimum (because the steps it takes are tiny), which again is not the optimal solution.
Therefore accuracy is affected, and the goal is to find the best learning rate for that model.
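
A toy illustration of both failure modes, using plain gradient descent on f(x) = x**2. This is only a sketch: the function, starting point, step count, and the three learning rates are arbitrary choices for illustration. The problem is convex, so it does not show local minima. It only shows overshooting with a large rate and very slow progress with a tiny one under a fixed step budget.

```python
def gradient_descent(lr, steps=50, x0=5.0):
    """Minimize f(x) = x**2 (gradient 2*x) with a fixed learning rate."""
    x = x0
    for _ in range(steps):
        x = x - lr * 2.0 * x
    return x


# Too large: every step overshoots the minimum at 0 and |x| grows.
# Moderate: x shrinks quickly toward 0.
# Too small: x has barely moved from the starting point after 50 steps.
for lr in (1.1, 0.1, 0.0001):
    print(f"lr={lr}: x after 50 steps = {gradient_descent(lr):.6g}")
```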

Artificial life simulator not producing any results

I have been experimenting with evolving artificial creatures, but so far all creatures just die. To initialize the creatures that do not result from asexual reproduction, I create around 8 random neurons, each of which has both an incoming and an outgoing connection. I'm using mutation to get a set of weights which are used in a small neural network that can form recurrent connections. I have 15 inputs and 5 outputs. There is a maximum of 25 neurons in the hidden layer. The mutation chance is 25%. The different mutations are: add a connection, disable a connection, make a small change to a weight, add a neuron, and disable a neuron. Is there something off with my mutation chances?

Real evolution is a massively parallel computation. Even so it took eons to get the basics of life. And then most of them died. Only a small sliver of all possible genes are ok.
To get your simulation to work in a reasonable time frame, you're gonna have to take some shortcuts.
Also, you should make sure your "small neural net" is capable of creating the kind of lifeforms that are successful. Your architecture may not be powerful enough to produce viable life.

Is Deep Q Learning appropriate for solving the Cartpole task?

I'm new to Reinforcement Learning. Recently, I've been trying to train a Deep Q Network to solve OpenAI gym's CartPole-v0, where solving means achieving an average score of at least 195.0 over 100 consecutive episodes.
I am using a 2 layer neural network, experience replay with the memory containing 1 million experiences, epsilon greedy policy, RMSProp optimizer and Huber loss function.
With this setting, solving the task is taking several thousand episodes (> 30k). Learning is also quite unstable at times. So, is it normal for Deep Q Networks to oscillate and take this long for learning a task like this? What other alternatives (or improvements on my DQN) can give better results?

What other alternatives (or improvements on my DQN) can give better results?
In my experience, policy gradients work well with CartPole. Also, they are fairly easy to implement (if you squint, policy gradients almost look like supervised learning).
A good place to start: http://kvfrans.com/simple-algoritms-for-solving-cartpole/
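
The resemblance to supervised learning can be seen by putting the two gradients side by side. The notation below is the standard one for the basic REINFORCE estimator and is not taken from the linked post: \pi_\theta is the policy, \tau a sampled episode, and R(\tau) its total reward.

```latex
% Supervised learning: maximize the log-likelihood of labelled pairs (x, y)
\nabla_\theta L(\theta) = \mathbb{E}_{(x, y)}\big[\nabla_\theta \log p_\theta(y \mid x)\big]

% Policy gradient (REINFORCE): same log-likelihood gradient, with the
% sampled actions as "labels", weighted by the return of the episode
\nabla_\theta J(\theta) = \mathbb{E}_{\tau \sim \pi_\theta}\Big[\sum_{t} \nabla_\theta \log \pi_\theta(a_t \mid s_t)\, R(\tau)\Big]
```

In other words, actions from episodes that scored well are treated like correct labels with a large weight, and actions from poor episodes get a small weight.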

What do I mutate and crossover in a genetic neural network?

I wrote a neural network and made a small application with things eating other things.
But I don't really know how to make it genetic.
Currently I'm recording all the inputs and outputs of every individual on every frame.
At the end of a generation, I then teach every new individual the data from the top 10 fittest individuals of previous generations.
But the problem is that the recorded data from a pool of the top 10 individuals at 100 generations is about 50MB. When I start a new generation with 20 individuals, I have to teach them 20x50MB.
This process takes longer than 3 minutes, and I am not sure if this is what I am supposed to do in genetic neural networks.
My approach actually works fairly well. Only the inefficiency bugs me. (Of course I know I could just reduce the population.)
And I couldn't find a solution for what I have to cross over and what to mutate.
Crossing over and mutating biases and weights is nonsense, isn't it? It would only break the network, wouldn't it? I saw examples doing just this: mutating the weight vector. But I just can't see how this would make the network progress toward its desired outputs.
Can somebody show me how the network would become better at what it is doing by randomly switching and mutating weights and connections?
Wouldn't it be the same as just randomly generating networks and hoping they start doing what they are supposed to do?
Are there other algorithms for genetic neural networks?
Thank you.

Typically, genetic algorithms for neural networks are used as an alternative to training with back-propagation. So there is no training phase (trying to combine various kinds of supervised training with evolution is an interesting idea, but isn't done commonly enough for there to be any standard methods that I know of).
In this context, crossover and mutation of weights and biases makes sense. It provides variation in the population. A lot of the resulting neural networks (especially early on) won't do much of anything interesting, but some will be better. As you keep selecting these better networks, you will continue to get better offspring. Eventually (assuming your task is reasonable and such) you'll have neural networks that are really good at what you want them to do. This is substantially better than random search, because evolution will explore the search space of potential neural networks in a much more intelligent manner.
So yes, just about any genetic neural network algorithm will involve mutating the weights, and perhaps crossing them over as well. Some, such as NEAT, also evolve the topology of the neural network and so allow mutations and crossovers that add or remove nodes and connections between nodes.
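
To make the select, cross over, mutate loop concrete, here is a minimal sketch that evolves the weights and biases of a tiny fixed-topology network with no back-propagation at all. Everything specific in it is an assumption chosen for illustration: the XOR toy task, the 2-3-1 network, uniform crossover, Gaussian mutation, and the population sizes. It is not NEAT (the topology never changes) and it is not tuned.

```python
import numpy as np

# Toy task (assumption for illustration): learn XOR.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([0, 1, 1, 0], dtype=float)

N_HIDDEN = 3
GENOME_SIZE = 2 * N_HIDDEN + N_HIDDEN + N_HIDDEN + 1  # W1, b1, W2, b2


def predict(genome, x):
    """Decode a flat genome into a 2-N_HIDDEN-1 network and run it on x."""
    i = 0
    w1 = genome[i:i + 2 * N_HIDDEN].reshape(2, N_HIDDEN)
    i += 2 * N_HIDDEN
    b1 = genome[i:i + N_HIDDEN]
    i += N_HIDDEN
    w2 = genome[i:i + N_HIDDEN]
    i += N_HIDDEN
    b2 = genome[i]
    hidden = np.tanh(x @ w1 + b1)
    z = np.clip(hidden @ w2 + b2, -50.0, 50.0)  # avoid overflow in exp
    return 1.0 / (1.0 + np.exp(-z))


def fitness(genome):
    """Higher is better: negative mean squared error on the toy task."""
    return -np.mean((predict(genome, X) - Y) ** 2)


def evolve(generations=200, pop_size=50, n_elite=5, sigma=0.3, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.normal(0.0, 1.0, size=(pop_size, GENOME_SIZE))
    history = []  # best fitness seen in each generation
    for _ in range(generations):
        scores = np.array([fitness(g) for g in pop])
        order = np.argsort(scores)[::-1]           # best first
        history.append(scores[order[0]])
        elite = pop[order[:n_elite]]               # selection
        children = []
        while len(children) < pop_size - n_elite:
            pa, pb = elite[rng.integers(n_elite, size=2)]
            mask = rng.random(GENOME_SIZE) < 0.5   # uniform crossover
            child = np.where(mask, pa, pb)
            child = child + rng.normal(0.0, sigma, GENOME_SIZE)  # mutation
            children.append(child)
        # The elite survive unchanged, so the best fitness never gets worse.
        pop = np.vstack([elite, np.array(children)])
    best = max(pop, key=fitness)
    return best, history
```

Calling `best, history = evolve()` and comparing `history[0]` with `history[-1]` shows the difference from pure random search: new candidates are sampled around networks that already did well rather than from scratch.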

Neural Net Optimize w/ Genetic Algorithm

Is a genetic algorithm the most efficient way to optimize the number of hidden nodes and the amount of training done on an artificial neural network?
I am coding neural networks using the NNToolbox in Matlab. I am open to any other suggestions of optimization techniques, but I'm most familiar with GA's.

Actually, there are multiple things that you can optimize using GA regarding NN.
You can optimize the structure (number of nodes, layers, activation function etc.).
You can also train using GA, that means setting the weights.
Genetic algorithms will never be the most efficient, but they are usually used when you have little clue as to what numbers to use.
For training, you can use other algorithms, including backpropagation, Nelder-Mead, etc.
You said you wanted to optimize the number of hidden nodes. For this, a genetic algorithm may be sufficient, although far from "optimal". The space you are searching is probably too small to need genetic algorithms, but they can still work and, afaik, they are already implemented in MATLAB, so no biggie.
What do you mean by optimizing the amount of training done? If you mean the number of epochs, then that's fine. Just remember that the result of training depends to some extent on the starting weights, which are usually random, so the fitness function used for the GA won't really be a function: the same candidate can get a different score each time it is evaluated.
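
One common way to cope with a noisy fitness is to evaluate each candidate several times with different random seeds and use the average. The sketch below assumes this approach. `noisy_train_and_score` is a placeholder with made-up numbers that stands in for "train a network with this many hidden nodes from random starting weights and return its score", and should be replaced with real training.

```python
import random
import statistics


def noisy_train_and_score(n_hidden, seed):
    """Placeholder for a real training run (hypothetical, synthetic numbers).

    The score depends on n_hidden plus seed-dependent noise, mimicking the
    effect of random starting weights.
    """
    rng = random.Random(seed * 1000003 + n_hidden)
    return 1.0 - 1.0 / (1 + n_hidden) + rng.gauss(0.0, 0.05)


def averaged_fitness(n_hidden, n_repeats=5):
    """Mean and standard deviation of the score over several seeds."""
    scores = [noisy_train_and_score(n_hidden, seed) for seed in range(n_repeats)]
    return statistics.mean(scores), statistics.stdev(scores)
```

The standard deviation gives a rough idea of whether a difference between two candidates is real or just noise from the starting weights. The cost is `n_repeats` training runs per fitness evaluation.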

A good example of neural networks and genetic programming is the NEAT architecture (Neuro-Evolution of Augmenting Topologies). This is a genetic algorithm that finds an optimal topology. It's also known to be good at keeping the number of hidden nodes down.
They also made a game using this called Nero. Quite unique and very amazing tangible results.
Dr. Stanley's homepage:
http://www.cs.ucf.edu/~kstanley/
Here you'll find just about everything NEAT related as he is the one who invented it.

Genetic algorithms can be usefully applied to optimising neural networks, but you have to think a little about what you want to do.
Most "classic" NN training algorithms, such as Back-Propagation, only optimise the weights of the neurons. Genetic algorithms can optimise the weights, but this will typically be inefficient. However, as you were asking, they can optimise the topology of the network and also the parameters for your training algorithm. You'll have to be especially wary of creating networks that are "over-trained" though.
One further technique, using a modified genetic algorithm, can be useful for overcoming a problem with Back-Propagation. Back-Propagation usually finds local minima, but it finds them accurately and rapidly. Combining a Genetic Algorithm with Back-Propagation, e.g., in a Lamarckian GA, gives the advantages of both. This technique is briefly described in the GAUL tutorial.

It is sometimes useful to use a genetic algorithm to train a neural network when your objective function isn't continuous.

I'm not sure whether you should use a genetic algorithm for this.
I suppose the initial solution population for your genetic algorithm would consist of training sets for your neural network (given a specific training method). Usually the initial solution population consists of random solutions to your problem. However, random training sets would not really train your neural network.
The evaluation function for your genetic algorithm would be a weighted average of the amount of training needed, the quality of the neural network in solving a specific problem, and the number of hidden nodes.
So, if you run this, you would get the training set that delivered the best result in terms of neural network quality (= training time, number of hidden nodes, problem-solving capabilities of the network).
Or are you considering an entirely different approach?

I'm not entirely sure what kind of problem you're working with, but GA sounds like a little bit of overkill here. Depending on the range of parameters you're working with, an exhaustive (or otherwise unintelligent) search may work. Try plotting your NN's performance against the number of hidden nodes for the first few values, starting small and jumping by larger and larger increments. In my experience, many NNs plateau in performance surprisingly early; you may be able to get a good picture of what range of hidden node numbers makes the most sense.
The same is often true for NNs' training iterations. More training helps networks up to a point, but soon ceases to have much effect.
In the majority of cases, these NN parameters don't affect performance in a very complex way. Generally, increasing them increases performance for a while but then diminishing returns kick in. GA is not really necessary to find a good value on this kind of simple curve; if the number of hidden nodes (or training iterations) really does cause the performance to fluctuate in a complicated way, then metaheuristics like GA may be apt. But give the brute-force approach a try before taking that route.
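
The brute-force approach described above can be sketched as a sweep that doubles the number of hidden nodes until the score stops improving. The doubling schedule, the `min_gain` threshold, and `placeholder_score` are assumptions for illustration. `placeholder_score` is a synthetic saturating curve standing in for "train a network with this many hidden nodes and return its validation score".

```python
import math


def placeholder_score(n_hidden):
    """Stand-in for a real train-and-validate run (synthetic, hypothetical)."""
    return 1.0 - math.exp(-n_hidden / 8.0)


def sweep_hidden_nodes(evaluate, start=1, max_nodes=256, min_gain=0.01):
    """Try 1, 2, 4, ... hidden nodes; stop once the gain drops below min_gain."""
    results = []
    n = start
    while n <= max_nodes:
        score = evaluate(n)
        results.append((n, score))
        # Stop at the plateau: the last doubling barely helped.
        if len(results) > 1 and score - results[-2][1] < min_gain:
            break
        n *= 2
    return results


for n_hidden, score in sweep_hidden_nodes(placeholder_score):
    print(f"{n_hidden:4d} hidden nodes -> score {score:.4f}")
```

With a real evaluation function the curve will be noisier than this placeholder, so it may be worth looking at the whole table (or a plot of it) rather than relying on the stopping rule alone.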

I would tend to say that a genetic algorithm is a good idea, since you can start with a minimal solution and grow the number of neurons. It is very likely that the "quality function" for which you want to find the optimal point is smooth and has only a few bumps.
If you have to find this optimal NN frequently, I would recommend using optimization algorithms, in your case a quasi-Newton method as described in Numerical Recipes, which is well suited to problems where the function is expensive to evaluate.
