Can a neural network actually learn?

I'm creating a 2D evolution/artificial-life simulation game (purely for fun). It combines neural networks (for behaviour control) and a genetic algorithm (for breeding and mutations).
As input I give them the X,Y position of the nearest food (normalized) and the X,Y components of the "look at" vector.
Currently they fly around, and when they collide with food (let's call it "eating apples") their fitness index is increased by one and the apple's position is randomized; after 2000 turns the GA steps in and does its magic.
After about 100 generations they learn that eating apples is good and try to fly to the nearest ones.
But my question, as a neural-network newbie, is: if I created a room where apples spawn much more frequently than on the rest of the map, would they learn and understand that? Would they fly to that room more often? And is it possible to tell how many generations it would take for them to learn?

What they can learn and how fast depends a lot on the information you give them access to. For instance, if they have no way of knowing that they are in the room where food generates more frequently, then there is no way for them to evolve to go there more frequently.
It's not entirely clear from your question what the "look at" vector is. If it, for instance, shows them what's directly in front of them, then it might be enough information for them to figure out that they're in the room of plenty, particularly if that room "looks" distinctive somehow. A more useful input to give them might be their current X and Y coordinates. If you did that, then I would definitely expect them to evolve to be in the good room more frequently (in proportion to how good it is, of course), because it would be possible for them to take action to go to and stay in that room.
As for how many generations it will take, that is incredibly hard to predict (especially without knowing more about your setup). If it takes them 100 generations to learn to eat food, then I would expect it to be on the order of hundreds. But the best way to find out is just to try it.
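For instance, here is a minimal sketch of what the input vector and a tiny fixed-topology controller could look like once the agent's own coordinates are added (all sizes, names, and outputs are placeholders of mine, not the asker's actual setup):

```python
import numpy as np

rng = np.random.default_rng(0)

def build_inputs(agent, food, look, world_w, world_h):
    """Candidate input vector: normalized nearest-food position, the
    'look at' vector, plus the agent's own normalized coordinates
    (the extra input suggested above)."""
    return np.array([
        food[0] / world_w, food[1] / world_h,
        look[0], look[1],
        agent[0] / world_w, agent[1] / world_h,
    ])

# A tiny fixed-topology controller whose weights the GA would evolve.
w1, b1 = rng.normal(0, 1, (4, 6)), rng.normal(0, 1, 4)
w2, b2 = rng.normal(0, 1, (2, 4)), rng.normal(0, 1, 2)

def act(inputs):
    """Forward pass: tanh hidden layer; outputs e.g. turn and thrust in [-1, 1]."""
    return np.tanh(w2 @ np.tanh(w1 @ inputs + b1) + b2)

steer = act(build_inputs((10, 20), (40, 5), (0.6, 0.8), 100, 100))
```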

If it's all about location, they could keep a map state in memory, and simple statistics would let them learn where food tends to be located; neural nets are overkill there.
If locations have other features (for example colour, smell, height, etc.), then mapping those features to a label (food present or not) is a good fit for neural nets, especially if some features are randomly unavailable or unreliable at any given moment.
If they need many decisions to reach the goal, you will need reinforcement learning. For example, they may move in a direction that is good for a while but takes them away from resources they will need later. A minimal sketch follows below.
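To make that third case concrete, here is a minimal tabular Q-learning sketch (the states and actions are abstract placeholders of mine, not the asker's simulation):

```python
import random
from collections import defaultdict

alpha, gamma, epsilon = 0.1, 0.95, 0.1   # learning rate, discount, exploration rate
Q = defaultdict(float)                    # Q[(state, action)] -> estimated return

def choose_action(state, actions):
    """Epsilon-greedy: usually exploit the best-known action, sometimes explore."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, actions):
    """One-step Q-learning: move Q toward reward + discounted best next value."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```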

I believe that a recurrent neural network could learn to expect apples to spawn in a certain region.

Reinforcement learning. Driving around objects with PPO

I am working on driving industrial robots with neural nets, and so far it is working well. I am using the PPO algorithm from OpenAI Baselines, and so far I can easily drive from point to point using the following reward strategy:
I calculate the normalized distance d between the target and the current position. The distance reward is then
r_d = 1 - (d / d_max)^a
and for each time step I give the agent a penalty of
y_t = 1 - (t / t_max) * b
where a and b are hyperparameters to tune.
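In code, these two terms are just a direct transcription of the formulas above:

```python
def distance_reward(d, d_max, a):
    """r_d = 1 - (d / d_max)**a : near 1 close to the target, falling off with distance."""
    return 1.0 - (d / d_max) ** a

def time_penalty(t, t_max, b):
    """y_t = 1 - (t / t_max) * b : shrinks each time step, rewarding fast solutions."""
    return 1.0 - (t / t_max) * b
```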
As I said, this works really well when driving from point to point. But what if I want to drive around something? For my work I need to avoid collisions, so the agent needs to drive around objects. If the object is not directly on the shortest path, this works OK: the robot can adapt and drive around it. But it becomes increasingly difficult, up to impossible, to drive around objects that sit directly in the way.
I already read a paper that combines PPO with NES to create some Gaussian noise for the parameters of the neural network, but I can't implement it myself.
Does anyone have experience with adding more exploration to the PPO algorithm? Or does anyone have general ideas on how I can improve my reward strategy?
What you describe is actually one of the most important research areas of Deep RL: the exploration problem.
The PPO algorithm (like many other "standard" RL algos) tries to maximise a return, which is a (usually discounted) sum of the rewards provided by your environment: R_t = r_t + γ·r_{t+1} + γ²·r_{t+2} + ...
In your case you have a deceptive-gradient problem: the gradient of your return points directly at your objective point (because your reward is based on the distance to your objective), which discourages your agent from exploring other areas.
There is an illustration of the deceptive gradient problem in this paper: the reward is computed like yours, and the gradient of the return function points directly at the objective (a little square in that example). If your agent starts in the bottom-right part of the maze, it is very likely to get stuck in a local optimum.
There are many ways to deal with the exploration problem in RL. In PPO, for example, you can add some noise to your actions; other approaches like SAC try to maximize both the reward and the entropy of the policy over the action space. But in the end you have no guarantee that adding exploration noise in your action space will result in efficient exploration of your state space (which is what you actually want to explore: the (x, y) positions of your environment).
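For reference, the action-noise option mentioned above is only a few lines. This is a sketch (sigma and the clipping range are assumptions, and as noted, action-space noise alone may not explore the state space efficiently):

```python
import numpy as np

def noisy_action(policy_action, sigma=0.1, low=-1.0, high=1.0):
    """Add zero-mean Gaussian noise to the policy's action, then clip to
    the valid action range. Annealing sigma over training is common."""
    noise = np.random.normal(0.0, sigma, size=np.shape(policy_action))
    return np.clip(policy_action + noise, low, high)
```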
I recommend reading the Quality Diversity (QD) literature, a very promising field aiming to solve the exploration problem in RL.
Here are two great resources:
A website gathering all information about QD
A talk from ICML 2019
Finally, I want to add that the problem is not your reward function. You should not try to engineer a complex reward function to make your agent behave the way you want; the goal is to have an agent that can solve your environment despite pitfalls like the deceptive gradient problem.

Why do we need stochasticity in deterministic simulations?

Assuming the world were deterministic, why would we still need to introduce stochasticity into our simulations?
In a nutshell, to simplify models.
Let’s go with your assumption, even though I don’t believe it. If the universe is completely deterministic, then in any given scenario you choose to model there is one and only one correct answer. Unless you include the complete state space of absolutely everything that determines that answer, your model is wrong. Wrong, wrong, wrong!!!
For instance, if you want to predict how long it will take to fly from New York to London, you need to know the vector sums of all forces acting on the aircraft, which means you need the complete state (down to the atomic level) of the aircraft itself, the passengers, the atmosphere, fluctuations in the magnetic field of the earth, cosmic rays that can trigger upper atmospheric events, etc, etc, ad nauseam. Exclusion of any aspect of the potential forces involved makes your answer wrong.
Clearly, there’s no way to measure it all, and even if there was, there’s no way to maintain so much state information in any computing device we can build. And so we simplify and acknowledge that there is some degree of uncertainty in our model’s predictions/solutions.
Embracing the existence of uncertainty brings us directly to stochastic solutions. One view of probability is that it is a mathematical formalism for modeling uncertainty. Rather than try to model every physical aspect of an aircraft's flight, we can characterize the likely outcomes based on what proportion of flights require less (or more) than any particular amount of time, i.e., by describing the distribution of possible flight times.
Once you adopt distributional modeling, you can see how distributional behaviors propagate through other parts of a system: either analytically, if your system is sufficiently simple, or by generating realizations of the distributions and using replication and sampling via simulation.
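For instance, here is a toy Monte Carlo version of the flight-time example, generating realizations and summarizing the distribution (every number is purely illustrative):

```python
import random

def flight_time_sample():
    """One realization: a nominal 7-hour NY->London flight plus random
    wind variation and ground delays. All parameters are made up."""
    nominal = 7.0 * 60                    # minutes
    wind = random.gauss(0, 15)            # headwind/tailwind effect
    ground = random.expovariate(1 / 20)   # taxi/holding delays, mean 20 min
    return nominal + wind + ground

# Replication and sampling: many runs characterize the distribution.
samples = sorted(flight_time_sample() for _ in range(100_000))
print("median flight time:", samples[len(samples) // 2], "min")
print("95th percentile:   ", samples[int(0.95 * len(samples))], "min")
```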

How do I determine processor speed required for optical flow?

I'd like to use an optical flow system to get velocities from the surrounding environment. I've read papers about how optical flow works, but they don't cover the details of the optical sensors.
My question is: How do I determine how much computational power is required to perform optical flow analysis?
I'd like to use a low-power system (like microcontrollers), but I don't know what kind of camera I could use with such a system. I mean, could it be color or does it need to be B/W? Rolling shutter or global shutter? Which frame rate or number of pixels?
I'd like to specify the system myself but, without knowing how those camera attributes impact the processing load, I'm not sure where to start.
As Chuck already said in the comment, you first need to start with something: optical-flow calculation really depends on what you are using it for and what you are trying to achieve. For real-time applications you might want to consider using faster processors (though this is always true).
Continuing to my answer:
Optical-flow calculation performance depends on a few main things:
The optical-flow method you choose (dense or sparse); you can read more about it here and here. Take into account not only that sparse is faster than dense, but also that sparse might be less accurate in some cases. Again, this depends on what you're trying to achieve.
In addition, you will see that there are different optical-flow algorithms. Some might be faster than others. There are many algorithms such as Lucas-Kanade, Horn-Schunck, TVL1, Farneback, etc.
Most optical-flow methods in libraries such as OpenCV give you the ability to change some parameters so you can play with the trade-off between accuracy and performance. See this, and also check OpenCV methods such as this and this, for example, and note their different arguments.
The resolution of your image. A smaller image usually means a faster calculation.
Few things you might also want to consider:
If you are using a processor with multiple cores, make sure you use all of them for the optical-flow calculation. Some libraries may already do this for you, but in some cases you will need to do it yourself. Take a look at my question and answer in this post; it might give you some ideas and help you get started with such a case.
If you want more accurate optical-flow results, you must use a global-shutter camera. Rolling-shutter cameras, such as most webcams, will add extra error you don't want.
You don't need a color image; if you have a grayscale camera, even better. If not, you will need to convert the image to grayscale (not B/W) for faster processing as well.
Some libraries, such as OpenCV, have the option (in some cases) to run these algorithms on a GPU. If using a GPU is an option, you might want to consider it as well.
From my own experience, the main thing that gave me a performance boost was reducing the resolution from 640x480 to 320x240 or even 160x120. In my case it didn't really hurt the accuracy.
I used an Odroid U3 mini-PC with OpenCV's PyrLK algorithm and input frames at 320x240 resolution. After applying what's described here (splitting the image into 4 parts for parallel calculation), it worked pretty well (real-time). A minimal sketch of that kind of pipeline follows.
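As a concrete starting point, here is a sketch using OpenCV's sparse pyramidal Lucas-Kanade tracker on downscaled grayscale frames (the parameter values are just reasonable defaults, not tuned):

```python
import cv2

cap = cv2.VideoCapture(0)                   # any camera or video file
ret, frame = cap.read()
small = cv2.resize(frame, (320, 240))        # low resolution: the cheapest speed-up
prev_gray = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)
prev_pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=100,
                                   qualityLevel=0.3, minDistance=7)

while True:
    ret, frame = cap.read()
    if not ret or prev_pts is None:
        break
    gray = cv2.cvtColor(cv2.resize(frame, (320, 240)), cv2.COLOR_BGR2GRAY)
    # Sparse pyramidal Lucas-Kanade: tracks only the selected corners.
    next_pts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, prev_pts, None)
    flow = next_pts[status == 1] - prev_pts[status == 1]   # per-point motion vectors
    prev_gray, prev_pts = gray, next_pts[status == 1].reshape(-1, 1, 2)
```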
The answer given by Sarid has some strong points, many of which are shared by researchers around the world. My opinions are shared by those who have actually worked with these topics in real-world settings; by real world, I mean implementing optical flow on drones, on mobile phones and in IP cameras that are not sitting in a protected office, and where other systems (such as humans) need to interact and be co-dependent.
First of all, depending on your problem, you may want to invest time in looking for ready-made solutions. Optical-flow sensors are readily available, cheap and robust (but usually not strong on accuracy). These are the kind of sensors you find in optical mice. They are low-power and easily interfaced with microcontrollers. Some have staggering sample rates of thousands of fps. They commonly have low spatial resolution, however, and (to emphasize) high robustness but low accuracy.
If instead you are looking for the kind of optical flow that can be used for shape-from-motion, pedestrian detection or video encoding, for example, then you are probably better off looking for something more advanced, and that's where Sarid's answer becomes relevant.
Since your question was migrated from Robotics Stack Exchange, I am going to assume you are interested in applications close to machine control and human-machine interaction. In that case, the most important aspects are the ones usually most ignored by people working in the field of optical-flow estimation, namely:
Latency. If you have a human interfacing at the front end, the common term is "glass-to-glass latency". This is completely different from the fps of your system, which is connected to throughput. If you find yourself in a discussion with someone who does not understand the difference between latency and fps, they are not the expert you are interested in. For example, almost all researchers in computer vision who do GPU implementations of optical flow add massive latency by allowing frame delays and inefficient memory handling (inefficient from the perspective of latency, but efficient in terms of throughput and hardware utilization). Consider the problem of controlling a drone, say making it self-stabilizing: it is better to receive a bad optical-flow estimate 10 ms earlier than a good one with 10 ms of extra delay, especially if the optical system gives you no upper bound on the delay at any given time. (See the timing sketch after this list.)
Algorithm stability. This is completely different from accuracy. Accuracy is what 99% of all optical-flow research has been obsessing about for the last 30 years; stability is not at all something evaluated in the Middlebury benchmark, for example. Stability deals with how far small changes in your data guarantee only small changes in the estimated optical flow. While some good work has been done in the community (most interestingly on robust statistics), in the end the final evaluation of any algorithm disregards stability. Consider the optical mouse as a good example. The first generations of optical mice had higher accuracy (the average error from the true motion was smaller) but lower stability (especially when you ran the mouse over "bad" textures, or with rotational motions). Later generations of optical mice have worse accuracy but focus on stability, as that is the most important thing: you don't experience the cursor jumping around as much as in the early days of these devices, but if you move the mouse left and right repeatedly on your mat, you will see the cursor slowly drifting (i.e., low accuracy).
Heat. Any device that estimates high-accuracy optical flow will require lots of computation. When it comes to computations per watt, GPUs are not that good. In drones you may be able to get away with this, because active cooling comes as a by-product of the propulsion system. In the real world, you most often cannot assume active cooling or an unlimited power supply.
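As a rough illustration of the latency/throughput distinction from the list above, here is a sketch that times a processing loop. Note that true glass-to-glass latency also includes camera and display delays, which this does not capture:

```python
import time
import cv2

cap = cv2.VideoCapture(0)
latencies = []
t0 = time.perf_counter()
for _ in range(300):
    start = time.perf_counter()
    ret, frame = cap.read()
    if not ret:
        break
    # ... your optical-flow step would run here ...
    latencies.append(time.perf_counter() - start)

elapsed = time.perf_counter() - t0
print(f"throughput: {len(latencies) / elapsed:.1f} fps")
# The worst case matters: an unbounded delay is what hurts a control loop.
print(f"worst per-frame latency: {max(latencies) * 1000:.1f} ms")
```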
To conclude, it's a fascinating area, and I hope you have a great experience coding solutions.

Output of Artificial Neural Network in Othello

I'm implementing Othello using an artificial neural network. Reading the document (here, page 19), I don't understand some points.
They calculate the output as the usual dense-layer activation, o = f(Σ_i w_i·x_i + b).
I don't understand: if they calculate the output like that, how does my AI know which moves are legal in the game, so it can choose the best legal move? That output is only a single float number (I think), so how can I use it?
The good news
It's super simple: the Neural-Network (NN) is a Value-Network (instead of a Policy-Network). This Value-Network takes a board-state as input and calculates some score describing how good the position is. It's the basic building-block of all Minimax-based Game-AIs, often called the evaluation function. (A Policy-Network output would give a probability-distribution over all possible moves)
So the NN gives you this score. You can then combine this score with some algorithm of your choice. Minimax (nearly all Chess-AIs) and MCTS (AlphaGo) are the most common.
Basic idea of Minimax: play a move, opponent plays a move, (repeat), evaluate with your NN -> do this for all possible combinations and propagate the scores with Minimax. Only a few plies (half-moves) will be feasible with this NN, but it will be very powerful for Othello and it's easy to implement.
Basic idea of MCTS: play a random move, play a random move, (repeat) until the game ends -> build a winner statistic. Then compare the average scores of all possible "first" moves and pick the best. (It's harder to incorporate the NN as a heuristic here.)
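To make the Minimax variant concrete, here is a hedged sketch: legal_moves and apply_move stand in for functions your Othello engine would provide, and value_net is the trained Value-Network.

```python
def minimax(board, depth, maximizing, value_net):
    """Search a few plies ahead; at the horizon the Value-Network scores the
    position. legal_moves(board, player) and apply_move(board, move) are
    assumed engine hooks, not part of any particular library."""
    moves = legal_moves(board, maximizing)
    if depth == 0 or not moves:
        return value_net(board), None
    best_score = float("-inf") if maximizing else float("inf")
    best_move = None
    for m in moves:
        score, _ = minimax(apply_move(board, m), depth - 1, not maximizing, value_net)
        if (maximizing and score > best_score) or (not maximizing and score < best_score):
            best_score, best_move = score, m
    return best_score, best_move
```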
The calculation you mentioned is just the classic neural-network rule for computing a dense layer's activation.
The bad news
I didn't read the paper, but the hard part is training and preparing your NN. You need to provide some data. Maybe training will be supervised (if you have historical games; easier), maybe reinforcement learning (Q-learning and co.). This will be very hard to do without experience.
I think I know all the theory needed, but I still failed to do this for some other (stochastic) games, because there are many, many issues with autocorrelation and the like, and a lot of hyperparameter tuning is needed.
Conclusion
This project is kind of complicated and there are many, many pitfalls. Please be sure you understand the algorithms you want to try. It looks like you are missing some of the basics: game theory (Minimax), AI/learning theory (MCTS, Markov decision processes, Q-learning...), and the basic internals of a NN.

Training for pattern recognition (neural network)

How do you train a neural network for pattern recognition? For example, for face recognition in a picture, how would you define the output neurons? (E.g., how do you detect where exactly the face is, rather than just saying that there is a face in the camera image?) Also, what about detecting multiple faces, and faces of different sizes?
If anyone could give me a pointer, it would be really great.
Cheers!
Generally speaking, I would split the problem into multiple stages, e.g.:
1 - Is there a face in the picture?
2 - Where is the face in the picture?
3 - Is the face in the picture one that the NN (Neural network) recognises?
In each instance I would suggest you build a separate NN and train it to answer the questions posed.
As for the structure of the NN, that's a bit trickier to answer, as it depends on your input data and desired output. For example, if you had a 100x100 px image, then I suppose it's feasible to have 10,000 inputs. You might want to consider doing some preprocessing beforehand to, say, detect ovals; that way you could check whether there are a number of ovals in a predictable layout (one for the face, two for the eyes, and possibly one for the mouth). If you preprocess the data, then you might have inputs for each oval instead.
Now for the output: for question one you could just have one output saying how sure the NN is that there is a face in the input data, i.e., a value from 0.0 (definitely no face) to 1.0 (definitely a face). This way you can move on to stages 2 and 3.
I should say at this point that this is a non-trivial problem, and you might be better off having a look at some of the frameworks available, e.g. OpenCV.
Now for the training part: you need a stockpile of images available to train the NN. There are a number of ways you could train it. One potential solution is to use a technique called back propagation 1, 2. In general terms, you run the NN on an image and compare its output to a predetermined target; if it's wrong, you tweak the NN to produce the desired output, and repeat (see the sketch below).
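In the spirit of that description, here is a minimal back-propagation sketch for stage 1 (face / no face) with a single sigmoid neuron and random placeholder data; real training would of course use actual labelled images:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy setup: flattened grayscale images in X (n_samples x n_pixels);
# labels y are 1.0 (face) or 0.0 (no face). Data here is a random stand-in.
rng = np.random.default_rng(0)
X = rng.random((32, 100))
y = rng.integers(0, 2, 32).astype(float)

w = rng.normal(0, 0.1, 100)   # weights of a deliberately tiny, single-neuron net
b, lr = 0.0, 0.1

for epoch in range(100):
    pred = sigmoid(X @ w + b)            # forward pass
    err = pred - y                       # compare to the predetermined output
    # Backpropagation step: gradient of the squared error w.r.t. w and b.
    grad = pred * (1 - pred) * err
    w -= lr * (X.T @ grad) / len(y)
    b -= lr * grad.mean()
```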
If you want a good book on AI, I would highly recommend Artificial Intelligence: A Modern Approach by Russell and Norvig. I'm sure there are more specialized computer-vision textbooks, but the Russell & Norvig book is an excellent starter.
Dear GantengX, you should prepare yourself for the fact that the answer is large, complex and hard to digest. There are many approaches to pattern and face recognition, and implementing a real-life face-recognition system is a huge amount of work that one person can never handle alone. Prepare yourself for at least 10 years of life behind books on mathematics and artificial intelligence, and I'm not even counting the five highly paid developers you would need to hire at the end who actually understand what you want them to do. Maybe then you will end up having your own face-recognition system. Dozens of other issues will jump out during the process, so be ready for a life full of stress and problems.
I'm sorry for stating the obvious, but your question was not specific; a complete answer would touch many different scientific fields and would fill a book of over 1,000 pages.
Regarding your question (the short answer).
There are several principal parts that each face recognition app consists of:
Artificial intelligence algorithm
Optimization algorithm (for AI optimization)
Different filtration algorithms
Effective data set development
Items 1 and 2 are the central part of each system; they do the actual work. Any other preprocessing just makes the input data less complex, making it easier for your AI to reach a decision. Don't start on 3 and 4 until you have your first results.
P.S.
Using existing solutions is more cost-effective, but if you are studying these things, then don't lose time like I did, and start your dissertation right away.