Pathfinding algorithm with only partial knowledge of graph - partial

I need to program an algorithm to navigate a robot through a "maze" (a rectangular grid with a starting point, a goal, empty spaces and uncrossable spaces or "walls"). It can move in any of the eight compass directions (N, NW, W, SW, S, SE, E, NE) with constant cost per move.
The problem is that the robot doesn't "know" the layout of the map. It can only view its 8 surrounding spaces and store them (it memorizes the surrounding tiles of every space it visits). The only other input, on every move, is the compass direction in which the goal lies.
Is there any researched algorithm that I could implement to solve this problem? The typical ones like Dijkstra's or A* aren't trivially adapted to the task, as I can't go back to revisit previous nodes in the graph without cost (retracing the steps of the robot to go to a better path would cost the moves again), and I can't think of a way to make a reasonable heuristic for A*.
I probably could come up with something reasonable, but I just wanted to know if this was an already solved problem, and I need not reinvent the wheel :P
Thanks for any tips!

The problem isn't solved, but like with many planning problems, there is a large amount of research already available.
Most of the work in this area is based on the original work of R. E. Korf in the paper "Real-time heuristic search". That paper seems to be paywalled, but the preliminary results from the paper, along with a discussion of the Real-Time A* algorithm are still available.
The best recent publications on discrete planning with hidden state (path-finding with partial knowledge of the graph) are by Sven Koenig. This includes the significant work on the Learning Real-Time A* algorithm.
Koenig's work also includes some demonstrations of a range of algorithms on theoretical experiments that are far more challenging than anything that would be likely to occur in a simulation. See in particular "Easy and Hard Testbeds for Real-Time Search Algorithms" by Koenig and Simmons.
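To give a feel for this family of algorithms, here is a minimal sketch in the spirit of Learning Real-Time A* with one-step lookahead. It is not copied from Korf's or Koenig's papers. It assumes the robot knows the goal coordinates, so that Chebyshev distance can serve as the initial heuristic (the question only provides a direction, so this is a simplifying assumption). The names lrta_star and grid and the '#' wall marker are made up for the example.

```python
def lrta_star(grid, start, goal, max_steps=10_000):
    """Walk from start to goal, reading only the 8 cells around the robot.

    grid is a list of equal-length strings; '#' marks a wall (assumed format).
    Returns the list of visited cells, or None if the goal was not reached.
    """
    rows, cols = len(grid), len(grid[0])
    learned = {}  # heuristic values updated while walking (the "learning" part)

    def initial_h(cell):
        # Chebyshev distance: never overestimates for 8-connected unit-cost moves.
        return max(abs(cell[0] - goal[0]), abs(cell[1] - goal[1]))

    def h(cell):
        return learned.get(cell, initial_h(cell))

    def visible_free_neighbours(cell):
        r, c = cell
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                    yield (nr, nc)

    current, path = start, [start]
    for _ in range(max_steps):
        if current == goal:
            return path
        best_f, best_next = None, None
        for n in visible_free_neighbours(current):
            f = 1 + h(n)  # one move plus estimated remaining cost
            if best_f is None or f < best_f:
                best_f, best_next = f, n
        if best_next is None:
            return None  # completely walled in
        # Raise the estimate of the cell being left, so that dead ends
        # become less attractive each time they are revisited.
        learned[current] = max(h(current), best_f)
        current = best_next
        path.append(current)
    return path if current == goal else None
```

The robot pays for every move it actually makes, including backtracking, which matches the cost model in the question. The learned values are what stop it from oscillating forever in front of a wall.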

Related

Reinforcement learning. Driving around objects with PPO

I am working on driving industrial robots with neural nets and so far it is working well. I am using the PPO algorithm from the OpenAI baseline and so far I can drive easily from point to point by using the following rewarding strategy:
I calculate the normalized distance between the target and the position. Then I calculate the distance reward with:
rd = 1-(d/dmax)^a
For each time step, I give the agent a penalty calculated by:
yt = 1-(t/tmax)*b
a and b are hyperparameters to tune.
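Written as code, the two terms look roughly like this. The post does not say how they are combined into the final reward, so this sketch keeps them separate; the function names are illustrative only.

```python
def distance_reward(d, d_max, a):
    """rd = 1 - (d / dmax)^a: 1 at the target, 0 at the maximum distance."""
    return 1.0 - (d / d_max) ** a


def time_term(t, t_max, b):
    """yt = 1 - (t / tmax) * b: shrinks linearly as the episode goes on."""
    return 1.0 - (t / t_max) * b
```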
As I said, this works really well if I want to drive from point to point. But what if I want to drive around something? For my work, I need to avoid collisions, so the agent needs to drive around objects. If the object is not directly in the way of the shortest path, it works fine: the robot adapts and drives around it. But it becomes increasingly difficult, up to impossible, to drive around objects that are directly in the way.
See this image :
I already read a paper which combines PPO with NES to create some Gaussian noise for the parameters of the neural network but I can't implement it by myself.
Does anyone have some experience with adding more exploration to the PPO algorithm? Or does anyone have some general ideas on how I can improve my rewarding strategy?
What you describe is actually one of the most important research areas of Deep RL: the exploration problem.
The PPO algorithm (like many other "standard" RL algos) tries to maximise a return, which is a (usually discounted) sum of rewards provided by your environment:
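In its usual form, the discounted return is G = r_0 + gamma * r_1 + gamma^2 * r_2 + ..., with a discount factor gamma between 0 and 1. A minimal sketch (the function name and the default gamma are assumptions, not taken from any PPO implementation):

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of rewards where the reward at step t is weighted by gamma**t."""
    g = 0.0
    # Accumulate from the last reward backwards: g_t = r_t + gamma * g_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```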
In your case, you have a deceptive gradient problem: the gradient of your return points directly at your objective point (because your reward is based on the distance to your objective), which discourages your agent from exploring other areas.
Here is an illustration of the deceptive gradient problem from this paper. The reward is computed like yours and, as you can see, the gradient of the return function points directly to the objective (the little square in this example). If your agent starts in the bottom right part of the maze, it is very likely to get stuck in a local optimum.
There are many ways to deal with the exploration problem in RL. In PPO, for example, you can add some noise to your actions. Other approaches, like SAC, try to maximize both the reward and the entropy of your policy over the action space. In the end, though, you have no guarantee that adding exploration noise in your action space will result in efficient exploration of your state space (which is what you actually want to explore: the (x,y) positions of your env).
I recommend reading the Quality Diversity (QD) literature, a very promising field that aims to solve the exploration problem in RL.
Here are two great resources:
A website gathering all the information about QD
A talk from ICML 2019
Finally, I want to add that the problem is not your reward function. You should not try to engineer a complex reward function so that your agent behaves the way you want. The goal is to have an agent that can solve your environment despite pitfalls like the deceptive gradient problem.

particle swarm optimization inertia factor

I am reading about soft computing algorithms, currently "Particle Swarm Optimization". I understand the technique in general, but I am stuck on the mathematical or physical part: I can't picture how it works or how it affects the flight of the particles. That part is the first term of the velocity update equation, which is called the "Inertia Factor".
The complete velocity update equation is:
I read in one article, in section 2.3 "Inertia Factor", that:
"This variation of the algorithm aims to balance two possible PSO tendencies (dependent on parameterization) of either exploiting areas around known solutions or explore new areas of the search space. To do so this variation focuses on the momentum component of the particles' velocity equation 2. Notice that if you remove this component the movement of the particle has no memory of the previous direction of movement and it will always explore close to a found solution. On the other hand if the velocity component is used, or even multiplied by a w (inertial weight, balances the importance of the momentum component) factor the particle will tend to explore new areas of the search space since it cannot easily change its velocity towards the best solutions. It must first "counteract" the momentum previously gained, in doing so it enables the exploration of new areas with the time "spend counteracting" the previous momentum. This variation is achieved by multiplying the previous velocity component with a weight value, w."
the full pdf at: https://www.google.com.eg/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CDIQFjAA&url=http%3A%2F%2Fweb.ist.utl.pt%2F~gdgp%2FVA%2Fdata%2Fpso.pdf&ei=0HwrUaHBOYrItQbwwIDoDw&usg=AFQjCNH8vChXHXWz_ydHxJKAY0cUa94n-g
But I still can't picture how this happens physically or numerically, or how this factor moves the search from exploration towards exploitation, so I need a numerical example to see how it works.
Also, in Genetic Algorithms there's a schema theorem, which is a proof of GA success in finding the optimum solution. Is there such a theorem for PSO?
It's not easy to explain PSO using mathematics (see Wikipedia article for example).
But you can think like this: the equation has 3 parts:
particle speed = inertia + local memory + global memory
So you control the 'importance' of these components by varying the coefficients in each part.
There's no analytical way to see this, unless you make the stochastic part constant and ignore things like particle-particle interaction.
Exploit: take advantage of the best known solutions (local and global).
Explore: search in new directions, but don't ignore the best known solutions.
In a nutshell, you can control how much importance to give to the particle's current speed (inertia), the particle's memory of its own best known solution, and the particle's memory of the swarm's best known solution.
I hope it can help you!
Br's
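A small numerical example may help with the "inertia + local memory + global memory" picture. It uses the commonly cited form of the update, v_new = w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x), in one dimension. Following the remark above about making the stochastic part constant, r1 and r2 are fixed at 0.5. The values c1 = c2 = 2 and the function name are assumptions for illustration.

```python
def velocity_update(v, x, pbest, gbest, w, c1=2.0, c2=2.0, r1=0.5, r2=0.5):
    """One PSO velocity update in one dimension with fixed 'random' factors."""
    inertia = w * v                     # keep part of the previous velocity
    cognitive = c1 * r1 * (pbest - x)   # pull towards the particle's own best
    social = c2 * r2 * (gbest - x)      # pull towards the swarm's best
    return inertia + cognitive + social


# The particle sits at x = 0 and moves right with v = 5,
# while both known best positions lie to the left, at -1.
v_no_inertia = velocity_update(5.0, 0.0, -1.0, -1.0, w=0.0)
v_high_inertia = velocity_update(5.0, 0.0, -1.0, -1.0, w=0.9)
```

With w = 0 the new velocity is -2.0: the particle forgets its previous direction and turns straight back towards the known solutions (exploitation). With w = 0.9 the new velocity is 2.5: it is slowed down but keeps moving away from them for a while, visiting new positions before the pull wins (exploration).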
Inertia was not part of the original PSO algorithm introduced by Kennedy and Eberhart in 1995. It took three years until Shi and Eberhart published this extension and showed (to some extent) that it works better.
One can set that value to a constant (supposedly [0.8 to 1.2] is best).
However, the point of the parameter is to balance exploitation and exploration of the space, and the authors got the best results when they defined the parameter with a linear function that decreases over time from [1.4 to 0].
Their rationale was that one should first explore solutions to find a good seed and later exploit the area around the seed.
My feeling about it is that the closer you are to 0, the more chaotic the turns that particles make.
For a detailed answer refer to Shi, Eberhart 1998 - "A modified Particle Swarm Optimizer".
Inertia controls the influence of the previous velocity.
When high, the cognitive and social components are less relevant. (The particle keeps going its way, exploring new portions of the space.)
When low, the particle better explores the region where the best-so-far optimum has been found.
Inertia can change over time: Start high, later decrease
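A "start high, later decrease" schedule can be as simple as a linear ramp. This is a minimal sketch; the function name is hypothetical and the default endpoints 1.4 and 0 are the ones quoted above.

```python
def linear_inertia(step, total_steps, w_start=1.4, w_end=0.0):
    """Inertia weight for iteration `step` (0-based) out of `total_steps`."""
    # Fraction of the run completed; max() guards against a single-step run.
    fraction = step / max(total_steps - 1, 1)
    return w_start + (w_end - w_start) * fraction
```

In a PSO loop, the value returned for the current iteration would be passed as w to the velocity update.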

Lucas Kanade Optical Flow, Direction Vector

I am working on optical flow, and based on the lecture notes here and some samples on the Internet, I wrote this Python code.
All code and sample images are there as well. For small displacements of around 4-5 pixels, the direction of vector calculated seems to be fine, but the magnitude of the vector is too small (that's why I had to multiply u,v by 3 before plotting them).
Is this because of a limitation of the algorithm, or an error in the code? The lecture notes shared above also say that motion needs to be small ("u, v are less than 1 pixel"), so maybe that's why. What is the reason for this limitation?
@belisarius says "LK uses a first order approximation, and so (u,v) should be ideally << 1, if not, higher order terms dominate the behavior and you are toast."
A standard conclusion from the optical flow constraint equation (OFCE, slide 5 of your reference) is that "your motion should be less than a pixel, lest higher order terms kill you". While technically true, you can overcome this in practice by using larger averaging windows. This requires that you do sane statistics, i.e. not pure least squares, as suggested in the slides. Equally fast computations and far superior results can be achieved with Tikhonov regularization. This necessitates setting a tuning value (the Tikhonov constant). This can be done with a global constant, or by letting it adjust to local information in the image (such as the Shi-Tomasi confidence, aka structure tensor determinant).
Note that this does not replace the need for multi-scale approaches in order to deal with larger motions. It may extend the range a bit for what any single scale can deal with.
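A minimal sketch of the Tikhonov-regularized least-squares step, for one window at a single scale. The function name lk_flow_window, the default value of lam and the use of np.gradient for the spatial derivatives are assumptions for illustration; this is not the code from the tutorial mentioned below.

```python
import numpy as np


def lk_flow_window(frame0, frame1, lam=1e-3):
    """Estimate a single (u, v) for a whole window from two grayscale frames.

    Solves (A^T A + lam * I) [u, v]^T = A^T b, where the rows of A are the
    spatial gradients (Ix, Iy) and b = -It. lam is the Tikhonov constant.
    """
    frame0 = np.asarray(frame0, dtype=float)
    frame1 = np.asarray(frame1, dtype=float)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    grad_y, grad_x = np.gradient(frame0)
    grad_t = frame1 - frame0
    A = np.stack([grad_x.ravel(), grad_y.ravel()], axis=1)
    b = -grad_t.ravel()
    # The regularizer keeps the 2x2 system well conditioned in flat regions,
    # at the price of shrinking the estimate slightly towards zero.
    normal_matrix = A.T @ A + lam * np.eye(2)
    return np.linalg.solve(normal_matrix, A.T @ b)
```

With lam = 0 this reduces to the plain least-squares solution from the slides. A larger lam pulls (u, v) towards zero where the gradients carry little information.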
Implementations, visualizations and code are available in tutorial format here, albeit in Matlab, not Python.

What's a genetic algorithm that would produce interesting/surprising results and not have a boring/obvious end point?

I find genetic algorithm simulations like this to be incredibly entrancing and I think it'd be fun to make my own. But the problem with most simulations like this is that they're usually just hill climbing to a predictable ideal result that could have been crafted with human guidance pretty easily. An interesting simulation would have countless different solutions that would be significantly different from each other and surprising to the human observing them.
So how would I go about trying to create something like that? Is it even reasonable to expect to achieve what I'm describing? Are there any "standard" simulations (in the sense that the game of life is sort of standardized) that I could draw inspiration from?
Depends on what you mean by interesting. That's a pretty subjective term. I once programmed a graph analyzer for fun. The program would first let you plot any f(x) of your choice and set the bounds. The second step was creating a tree holding the most common binary operators (+-*/) in a randomly generated function of x. The program would create a pool of such random functions, test how well they fit the original curve in question, then crossbreed and mutate some of the functions in the pool.
The results were quite cool. A totally weird function would often be a pretty good approximation to the query function. Perhaps not the most useful program, but fun nonetheless.
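A minimal sketch of that kind of program, assuming tuple-based expression trees, protected division and truncation selection. None of these details come from the original program; names such as evolve and graft are made up for illustration.

```python
import math
import operator
import random

# Binary operators used in the trees; division is "protected" so it never raises.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": lambda a, b: a / b if abs(b) > 1e-9 else 1.0,
}


def random_tree(depth, rng):
    """A leaf is 'x' or a constant; an inner node is (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else round(rng.uniform(-2, 2), 2)
    return (rng.choice(list(OPS)), random_tree(depth - 1, rng), random_tree(depth - 1, rng))


def evaluate(tree, x):
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree


def size(tree):
    return 1 + size(tree[1]) + size(tree[2]) if isinstance(tree, tuple) else 1


def error(tree, target, xs):
    """Sum of squared differences; non-finite results count as infinitely bad."""
    try:
        total = sum((evaluate(tree, x) - target(x)) ** 2 for x in xs)
    except OverflowError:
        return math.inf
    return total if math.isfinite(total) else math.inf


def random_subtree(tree, rng):
    while isinstance(tree, tuple) and rng.random() < 0.7:
        tree = tree[rng.choice((1, 2))]
    return tree


def graft(tree, donor, rng):
    """Replace a randomly chosen subtree of `tree` with `donor`."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return donor
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, graft(left, donor, rng), right)
    return (op, left, graft(right, donor, rng))


def evolve(target, xs, pop_size=60, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [random_tree(3, rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: error(t, target, xs))
        parents = pop[: pop_size // 4]  # keep the best quarter unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = graft(a, random_subtree(b, rng), rng)  # crossbreed
            if rng.random() < 0.3:
                child = graft(child, random_tree(2, rng), rng)  # mutate
            if size(child) <= 200:  # crude guard against ever-growing trees
                children.append(child)
        pop = parents + children
    return min(pop, key=lambda t: error(t, target, xs))
```

For example, `evolve(lambda x: x * x + 1, [i / 5 for i in range(-10, 11)])` returns the best tree found. It often looks nothing like the target formula even when the fit is good, which is where the fun comes from.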
Well, for starters, that genetic algorithm is not doing hill-climbing, otherwise it would get stuck at the first local maximum/minimum.
Also, how can you say it doesn't produce surprising results? Look at this vehicle here for example produced around generation 7 for one of the runs I tried. It's a very old model of a bicycle. How can you say that's not a surprising result when it took humans millennia to come up with the same model?
To get interesting emergent behavior (that is unpredictable yet useful) it is probably necessary to give the genetic algorithm an interesting task to learn and not just a simple optimisation problem.
For instance, the Car Builder that you referred to (although quite nice in itself) just uses a fixed road as the fitness function. This makes it easy for the genetic algorithm to find an optimal solution. However, if the road changed slightly, that optimal solution might not work anymore, because the fitness of a solution may have grown dependent on trivially small details in the landscape and not be robust to changes in it. In reality, cars did not evolve on one fixed test road either, but on many different roads and terrains. Using an ever-changing road as the (dynamic) fitness function, generated by random factors but within certain realistic boundaries for slopes etc., would be more realistic and more useful.
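A toy sketch of such a dynamic fitness function. The "car" here is reduced to a single number (a wheel radius) and the "simulation" is a one-line rule, both invented purely for illustration. The point is only that fitness is averaged over freshly generated roads with bounded steps instead of being measured on one fixed road.

```python
import random


def random_road(rng, length=50, max_step=1.0):
    """Heights of consecutive road segments; each step up or down is bounded."""
    heights = [0.0]
    for _ in range(length - 1):
        heights.append(heights[-1] + rng.uniform(-max_step, max_step))
    return heights


def distance_travelled(wheel_radius, road):
    """Toy stand-in for a physics simulation: the car stops at the first
    upward step that is taller than its wheel radius."""
    for i in range(1, len(road)):
        if road[i] - road[i - 1] > wheel_radius:
            return i - 1
    return len(road) - 1


def dynamic_fitness(wheel_radius, rng, n_roads=20):
    """Average distance over newly generated roads, not one fixed road."""
    roads = [random_road(rng) for _ in range(n_roads)]
    return sum(distance_travelled(wheel_radius, road) for road in roads) / n_roads
```

Because the roads differ on every evaluation, a candidate can only score well by being robust, not by fitting the quirks of one particular track.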
I think EvoLisa is a GA that produces interesting results. In one sense, the output is predictable, as you are trying to match a known image. On the other hand, the details of the output are pretty cool.

What is a fractal? [duplicate]

Closed 14 years ago.
Duplicate of How to program a fractal
What are fractals?
Is this one of the concepts that is brought over from Mathematics to programming to simplify or solve a particular set of problems?
I am closing this question and have posted a related question
If you want to know about fractals in a general non-programming way, I would suggest looking at a general non-programming site. Wikipedia has a good article on them. If you want to know about programming fractals, I would suggest looking at this already asked question:
How to program a fractal
It even has a fractal tag.
A fractal is generally "a rough or fragmented geometric shape that can be split into parts, each of which is (at least approximately) a reduced-size copy of the whole," a property called self-similarity. The term was coined by Benoît Mandelbrot in 1975 and was derived from the Latin fractus meaning "broken" or "fractured." A mathematical fractal is based on an equation that undergoes iteration, a form of feedback based on recursion.
A fractal often has the following features:
It has a fine structure at arbitrarily small scales.
It is too irregular to be easily described in traditional Euclidean geometric language.
It is self-similar (at least approximately or stochastically).
It has a Hausdorff dimension which is greater than its topological dimension (although this requirement is not met by space-filling curves such as the Hilbert curve).
It has a simple and recursive definition.
http://en.wikipedia.org/wiki/Fractal
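As an illustration of "an equation that undergoes iteration", here is a minimal sketch of the classic escape-time test for the Mandelbrot set, which iterates z -> z*z + c. The function names, the iteration limit and the plotted region are arbitrary choices for the example.

```python
def mandelbrot_iterations(c, max_iter=50):
    """Number of iterations of z -> z*z + c before |z| exceeds 2.

    Returns max_iter if the point has not escaped by then, in which case
    it is treated as belonging to the set.
    """
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def render(width=60, height=24, max_iter=50):
    """ASCII picture of the set: '#' for points that never escaped."""
    rows = []
    for j in range(height):
        y = -1.2 + 2.4 * j / (height - 1)
        row = ""
        for i in range(width):
            x = -2.1 + 3.0 * i / (width - 1)
            inside = mandelbrot_iterations(complex(x, y), max_iter) == max_iter
            row += "#" if inside else " "
        rows.append(row)
    return "\n".join(rows)
```

Calling `print(render())` draws a coarse picture of the set. Narrowing the coordinate range zooms in on the boundary, which stays just as intricate at every scale.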
It's a type of self-similar shape, often grounded in a repeated mathematical function (but not necessarily). It has nothing to do with programming technique, but the easiest way to view one is to write a program to draw it. (Drawing a fractal with pen and paper is pretty time-consuming.)
By 'self-similar' I mean that if you keep zooming in on different parts of the fractal, it doesn't get any "smoother" or more linear, as would happen with a non-fractal shape. Its degree of complexity is independent of the zoom level.
The Wikipedia page is pretty useful.
Look up Procedural Generation for one way fractals are used in programming. They are an excellent way of generating chaotic or seemingly complex data from a very simple source. The generated data often benefits from self-similarity and other bits of organization that make the content make more sense to people.
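One well-known procedural generation technique of this kind is midpoint displacement, which builds a terrain-like height profile by repeatedly splitting segments and nudging the midpoints by a shrinking random amount. A minimal one-dimensional sketch, with the function name and parameter defaults chosen for illustration:

```python
import random


def midpoint_displacement(levels, roughness=0.5, seed=0):
    """Return 2**levels + 1 heights forming a jagged, self-similar profile."""
    rng = random.Random(seed)
    heights = [0.0, 0.0]  # start with a flat segment between two endpoints
    scale = 1.0
    for _ in range(levels):
        refined = []
        for left, right in zip(heights, heights[1:]):
            refined.append(left)
            # The midpoint is the average of its neighbours plus random
            # displacement; the displacement range shrinks at every level.
            refined.append((left + right) / 2 + rng.uniform(-scale, scale))
        refined.append(heights[-1])
        heights = refined
        scale *= roughness
    return heights
```

The same rule is applied at every scale, which is what gives the result its self-similar look. A smaller roughness value gives smoother terrain.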