Suppose a NetLogo world is 160 x 101. How many stationary and moving agents can possibly be created in such a world? Would this world be able to support 100,000 moving agents (none of which die), or is there a limit on the number of moving agents NetLogo supports in a single model?
No formal restrictions, just performance and resources. See How to model a very large world in NetLogo? I have personally modelled with 50,000 agents and it still runs at reasonable speed. However, get your model working at a MUCH smaller size before expanding, as it will slow down.
I am working on driving industrial robots with neural nets, and so far it is working well. I am using the PPO algorithm from OpenAI Baselines, and I can drive easily from point to point by using the following rewarding strategy:
I calculate the normalized distance between the target and the position. Then I calculate the distance reward with:
rd = 1-(d/dmax)^a
For each time step, I give the agent a penalty calculated by:
yt = 1-(t/tmax)*b
a and b are hyperparameters to tune.
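A minimal sketch of this shaping in Python (how the two terms are combined into one step reward, and the default values of `a` and `b`, are my assumptions, not taken from the question):

```python
def step_reward(d, t, d_max, t_max, a=0.5, b=1.0):
    """Distance reward plus time term, as described above.

    d, t  : current distance to the target and current time step
    d_max : normalising (maximum) distance
    t_max : maximum episode length
    a, b  : shaping hyperparameters to tune
    """
    r_d = 1.0 - (d / d_max) ** a   # distance reward: 1 - (d/dmax)^a
    r_t = 1.0 - (t / t_max) * b    # time term: 1 - (t/tmax)*b
    return r_d + r_t               # combining them by summation is an assumption
```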
As I said, this works really well if I want to drive from point to point. But what if I want to drive around something? For my work I need to avoid collisions, so the agent needs to drive around objects. If the object is not directly in the way of the shortest path, this works OK: the robot can adapt and drives around it. But it becomes increasingly difficult, up to impossible, to drive around objects that lie directly in the way.
I already read a paper which combines PPO with NES to create some Gaussian noise for the parameters of the neural network, but I can't implement it by myself.
Does anyone have some experience with adding more exploration to the PPO algorithm? Or does anyone have some general ideas on how I can improve my rewarding strategy?
What you describe is actually one of the most important research areas of Deep RL: the exploration problem.
The PPO algorithm (like many other "standard" RL algorithms) tries to maximise a return, which is a (usually discounted) sum of the rewards provided by your environment: R = Σ_t γ^t r_t.
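Concretely, that return can be computed like this (a generic textbook sketch, not PPO-specific code):

```python
def discounted_return(rewards, gamma=0.99):
    """R = sum over t of gamma^t * r_t, accumulated backwards for efficiency."""
    R = 0.0
    for r in reversed(rewards):
        R = r + gamma * R
    return R
```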
In your case you have a deceptive gradient problem: the gradient of your return points directly at your objective point (because your reward is based on the distance to your objective), which discourages your agent from exploring other areas.
Here is an illustration of the deceptive gradient problem from this paper. The reward is computed like yours, and as you can see, the gradient of the return function points directly at the objective (the little square in this example). If your agent starts in the bottom-right part of the maze, it is very likely to get stuck in a local optimum.
There are many ways to deal with the exploration problem in RL. In PPO, for example, you can add some noise to your actions; other approaches like SAC try to maximise both the reward and the entropy of your policy over the action space. In the end, though, you have no guarantee that adding exploration noise in your action space will result in efficient exploration of your state space (which is what you actually want to explore, i.e. the (x, y) positions of your environment).
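For instance, adding Gaussian noise on top of the policy's actions might look like the sketch below (`policy` and the noise scale `sigma` are placeholders; note this perturbs the action space, not the state space):

```python
import numpy as np

def noisy_action(policy, state, sigma=0.1, low=-1.0, high=1.0):
    """Perturb the policy's action with Gaussian exploration noise."""
    action = np.asarray(policy(state))               # mean / deterministic action
    action = action + np.random.normal(0.0, sigma, size=action.shape)
    return np.clip(action, low, high)                # respect the action bounds
```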
I recommend reading the Quality Diversity (QD) literature, a very promising field aiming to solve the exploration problem in RL.
Here are two great resources:
A website gathering all information about QD
A talk from ICML 2019
Finally, I want to add that the problem is not your reward function: you should not have to engineer a complex reward function just so that your agent behaves the way you want. The goal is to have an agent that is able to solve your environment despite pitfalls like the deceptive gradient problem.
I've been playing with a solar system simulation lately, using the Barnes-Hut algorithm to speed things up.
The simulation works fine when fed with our solar system's data, but I'd like to test it on something bigger.
I tried to generate 500+ random bodies and even add initial orbital motion around the centre of gravity, but every time, after a short while, most of the bodies end up ejected far away into space.
Are there any methods to generate random sets of planets/stars for simulations like this that will remain relatively stable?
You should probably ask this question on the Physics or Mathematics Stack Exchange.
I think this is a very difficult question, to the point that great mathematicians have studied the stability of the solar system. Things are "easy" for the two-body problem, but the three-body problem is notorious for its chaotic behaviour (Poincaré studied it carefully and in the process laid the foundations of the qualitative theory of dynamical systems). If I am not mistaken (feel free to check this online), instability is the overwhelmingly likely outcome for the orbital dynamics of a large number of bodies (large meaning three or more), while coming across stable configurations has a very low probability.
Now, for so-called integrable systems ("exactly solvable"), like n copies of decoupled sun-one-planet models of a solar/star system, small perturbations are more likely to yield stable dynamics, due to the Kolmogorov-Arnold-Moser theorem. So it is more likely that you will come across stability if you first set up the bodies in your simulation as comparatively small gravity sources orbiting one significantly larger gravitational source. Each body then feels one dominating force from the large source and many much smaller perturbations from the rest of the bodies (or the averaged sources of your Barnes-Hut algorithm). If you consider only the dominating force and turn off the perturbations, you have a solar system of n decoupled two-body systems (each body following elliptical motion around a common gravitational centre). If you turn the perturbations on, the dynamics changes, but it tends to deviate from the unperturbed one very slowly and is more likely to remain stable. So start with highly ordered dynamics and then slightly change the bodies' masses, positions and velocities, following how the dynamics changes as you alter the parameters and the initial conditions.
One more thing: it is always a good idea to place the inertial coordinate system, with respect to which the positions and velocities of the bodies are represented, at the centre of mass of the group of bodies. This is more or less guaranteed when the initial momenta sum to the zero vector. This setup keeps the centre of mass of the system fixed at one point in space, so a simple translation moves it to the origin of the coordinate system.
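A sketch of such an initialisation in Python (the gravitational constant, masses, radii and jitter amount are arbitrary placeholders): each small body gets the circular-orbit speed v = sqrt(G·M/r) around one dominant central mass, plus a small perturbation, and the total momentum is subtracted so the centre of mass stays at rest.

```python
import numpy as np

def init_system(n=500, m_central=1.0, m_body=1e-6, r_min=1.0, r_max=10.0,
                G=1.0, jitter=0.01, rng=np.random.default_rng(0)):
    """Bodies on near-circular 2D orbits around one dominant central mass."""
    masses = np.concatenate(([m_central], np.full(n, m_body)))
    theta = rng.uniform(0.0, 2.0 * np.pi, n)      # random orbital angles
    r = rng.uniform(r_min, r_max, n)              # random orbital radii
    pos = np.zeros((n + 1, 2))                    # central body at the origin
    pos[1:, 0] = r * np.cos(theta)
    pos[1:, 1] = r * np.sin(theta)
    # Circular-orbit speed about the central mass, directed tangentially,
    # with a small random perturbation on top.
    v = np.sqrt(G * m_central / r)
    vel = np.zeros((n + 1, 2))
    vel[1:, 0] = -v * np.sin(theta)
    vel[1:, 1] = v * np.cos(theta)
    vel[1:] *= 1.0 + jitter * rng.standard_normal((n, 1))
    # Shift to the centre-of-mass frame: total momentum becomes zero.
    vel -= (masses[:, None] * vel).sum(axis=0) / masses.sum()
    pos -= (masses[:, None] * pos).sum(axis=0) / masses.sum()
    return masses, pos, vel
```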
I'm creating an evolution/artificial-life simulation game in 2D (purely for fun). It combines neural networks (for behaviour control) and a genetic algorithm (for breeding and mutations).
As input I give each creature the X, Y position of the nearest food (normalized) and the X, Y of its "look at" vector.
Currently they fly around, and when they collide with food (let's call it "eating apples") their fitness index is increased by one and the apple's position is re-randomized. After 2,000 turns the GA steps in and does its magic.
After about 100 generations they learn that eating apples is good and try to fly to the nearest ones.
But my question, as a neural network newbie, is: if I created a room where apples spawn much more frequently than on the rest of the map, would they learn and understand that? Would they fly to that room more often? And is it possible to tell how many generations it would take for them to learn this?
What they can learn and how fast depends a lot on the information you give them access to. For instance, if they have no way of knowing that they are in the room where food generates more frequently, then there is no way for them to evolve to go there more frequently.
It's not entirely clear from your question what the "look at" vector is. If it, for instance, shows them what's directly in front of them, then it might be enough information for them to figure out that they're in the room of plenty, particularly if that room "looks" distinctive somehow. A more useful input to give them might be their current X and Y coordinates. If you did that, then I would definitely expect them to evolve to be in the good room more frequently (in proportion to how good it is, of course), because it would be possible for them to take action to go to and stay in that room.
As for how many generations it will take, that is incredibly hard to predict (especially without knowing more about your setup). If it takes them 100 generations to learn to eat food, then I would expect it to be on the order of hundreds. But the best way to find out is just to try it.
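To make the suggestion above concrete, the network's input vector might look like this sketch (the normalisation scheme and names are illustrative, not from the question):

```python
def build_inputs(agent_x, agent_y, food_x, food_y, world_w, world_h):
    """Inputs for the creature's network: where the nearest food is, plus
    (the suggested addition) the creature's own position on the map."""
    return [
        (food_x - agent_x) / world_w,  # normalized x-offset to nearest food
        (food_y - agent_y) / world_h,  # normalized y-offset to nearest food
        agent_x / world_w,             # own normalized x position
        agent_y / world_h,             # own normalized y position
    ]
```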
If it's all about location, they may keep a state of the map in mind, and simple statistics will let them learn where the food is likely to be located. Neural nets are overkill there.
If locations have other features (for example colour, smell, height, etc.), mapping those features to the label (food exists or not) is a good fit for neural nets, especially if some features are randomly unavailable or unreliable at a given moment.
If they need many decisions to reach the goal, you will need reinforcement learning. For example, they may go in a direction that is good for a time but takes them away from resources they will need later.
I believe that a recurrent neural network could learn to expect apples to spawn in a certain region.
I was wondering how a mechanical stop can be modeled most efficiently.
I do a hydraulic simulation with a controlled hydraulic cylinder in OpenModelica. For the hydraulic cylinder I use the sweptVolume model from the Modelica Standard Library.
What bugs me about this model is that there is no mechanical stop when the piston reaches the bottom of the cylinder.
I tried several ideas with no good result. I tried to reset the piston's displacement to zero when it hits the bottom, via an if-expression. But this is not really a good option, because the volume is calculated from the piston's displacement.
I then tried to introduce a counter-force equal to the force applied to the piston when the piston hits the stop. This option didn't work either, because in that case the pressure inside the cylinder can no longer be calculated.
The third try was to use the MSL model of MassWithStopAndFriction linked to the translational flange of the sweptVolume model, but this model seems to be broken for me.
Now I count on you as a competent community to bring in some more ideas for me to test.
Depending on your application, you might deploy the Hydraulics library. It aims to model (compressible) fluid power systems and contains cylinders with end stops. Its scope is different from that of the Fluid package you are using.
From experience, I'd strongly discourage using when and/or if statements for this task. You may get one cylinder to work, but using that in a larger system will definitely get you into numerical problems. Have a look at the Mechanics package and analyse whether the ElastoGap can be of any use to you.
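To see the idea behind ElastoGap, here is the penalty-contact law it is based on, sketched in Python rather than Modelica (the stiffness `c` and damping `d` values are placeholders, and this is an illustration of the concept, not the library's actual equations):

```python
def end_stop_force(s, v, s_min=0.0, c=1e6, d=1e3):
    """Spring-damper end stop: it only pushes back while the piston
    position s penetrates the stop at s_min, and never pulls."""
    penetration = s_min - s                    # > 0 only past the stop
    if penetration <= 0.0:
        return 0.0                             # no contact, no force
    return max(0.0, c * penetration - d * v)   # elastic + damping, non-attractive
```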
I need to create a very large grid of patches to hold GIS information for a very large network (such as a city-wide network). My question is how to get NetLogo to model such a world. When I set max-pxcor and max-pycor to large numbers, it stops working. I need a world of, for example, size 50,000 x 50,000.
Thanks for your help.
See http://ccl.northwestern.edu/netlogo/docs/faq.html#howbig, which says in part: “The NetLogo engine has no fixed limits on size...”
It's highly unlikely that you'll be able to fit a 50,000 x 50,000 world in your computer, though: that's 2.5 billion patches. Memory usage in NetLogo is proportional to the number of agents, and patches are agents too.
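For a rough sense of scale (the per-patch figure is an assumption for illustration, not from the NetLogo docs): at even ~100 bytes per patch, 50,000 x 50,000 = 2.5 x 10^9 patches would need on the order of 2.5 x 10^9 x 100 B ≈ 250 GB of RAM, far beyond a typical desktop machine.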
You might take Stephen Guerin's advice at http://netlogo-users.18673.x6.nabble.com/Re-Rumors-of-Relogo-tp4869241p4869247.html on how to avoid needing an enormous patch grid when modeling transportation networks.