How do I build a NetLogo agent that must try to find the shortest route between all of the given locations while avoiding the given patches, which represent solid, impassable objects?
If you google "shortest route algorithm" you'll find many of them, for example: https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
But you have massively under-specified the problem you are trying to solve. You want a single agent to solve this? Why not a swarm or class of agents? If just one agent, why are you using NetLogo as a tool? Are the agents smart? Do they get smarter over time? Can 3 agents work together on exploring the terrain? Does this have to get solved just once or many times?
How much information is available to the agent? The whole world? Just the nearest radius R? Does the agent know in advance what the "given locations" are and what the obstacles are? Does the order in which the given-locations are visited matter?
This could be the classic "travelling salesman" problem, having to visit each of N cities with costs of travel between each pair being given. See https://en.wikipedia.org/wiki/Travelling_salesman_problem
Can the agent find ANY route, and then keep trying to improve it? Or must it work on the first pass? What's the stopping condition? How do you know when the best route you've found is the best possible route?
You should also look at the A* algorithm. https://en.wikipedia.org/wiki/A*_search_algorithm
There's a huge amount of prior work done on solving this class of problem, but it's mostly computational -- i.e., your agent could simply sit and plug numbers into software and compute an answer and never move. I don't think that's what you had in mind. But what DID you have in mind?
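If it does turn out to be the travelling-salesman reading of the problem (visit every location once, full knowledge of the world, order unconstrained), a common "find ANY route, then keep improving it" approach is a greedy nearest-neighbour tour followed by 2-opt improvement. The sketch below is a rough illustration in Python rather than NetLogo; the coordinates are hypothetical and it uses straight-line distances, so the obstacle patches would still need a pathfinder such as A* (mentioned above) to supply realistic leg lengths.

import math
from itertools import combinations

def nearest_neighbour_order(start, locations):
    """Greedy first-pass tour: always go to the closest unvisited location."""
    unvisited = list(locations)
    tour = [start]
    while unvisited:
        last = tour[-1]
        nxt = min(unvisited, key=lambda p: math.dist(last, p))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour

def two_opt(tour):
    """Keep reversing segments of the tour while doing so shortens it."""
    def length(t):
        return sum(math.dist(a, b) for a, b in zip(t, t[1:]))
    improved = True
    while improved:
        improved = False
        for i, j in combinations(range(1, len(tour) - 1), 2):
            candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
            if length(candidate) < length(tour):
                tour, improved = candidate, True
    return tour

# Hypothetical locations on a NetLogo-style coordinate grid
stops = [(3, 4), (-2, 7), (10, -1), (6, 6)]
route = two_opt(nearest_neighbour_order((0, 0), stops))
print(route)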
I'm new to AnyLogic and Java, so bear with me.
My simulation uses 5 transporters to transport agents through a production facility (they take different routes each time). I'm trying to track the distance each individual transporter travels until it returns to pick up a new agent.
I think this is possible with the getDistanceTravelled() function, but I have some questions on how to use it.
How do I identify each transporter from my transporterFleet?
How do I track the distance travelled for each transporter?
How can I reset the distance travelled when the transporter has returned to pick up a new agent?
How can I store and visualize the distance travelled for each transporter?
This is an example of the blocks I'm using to simulate one route for the transporters.
https://i.stack.imgur.com/XVjwp.png
I'm sorry if this is a lot to ask, but I can't seem to get the hang of this problem. Feel free to ask follow-up questions if my explanation isn't enough. Any help is appreciated.
Lars
I am taking part in a programming competition where the objective is writing a bot that can play a specific game.
The objective of the game is to earn a certain amount of points. You control multiple airships that you move around, capture islands, and navigate drones that carry treasure. You play against one opponent, turns happen simultaneously, and there is a time limit. You can move multiple ships and drones in one turn. You can program your bot in Python, Java or C#.
The exact details don't matter, just that each ship has around 15 options each turn (moving and shooting) and overall you have around 10000 different options for each turn (different configurations of airship movements and shooting).
Up until now I approached this competition naively and haven't done anything exceptionally clever (for example, if near an enemy, shoot). I have read about minimax algorithms, and I would really like to apply one here (or something similar); you can assume that I can tell the value of a state. My problem is the mass of options for each turn, which creates an enormous branching factor that doesn't let me search very deep.
Question 1: Is there a better, applicable approach to this problem? Perhaps deep-learning or something similar?
Question 2: Is there a way to minimize the branching factor? I've read about alpha-beta pruning and similar algorithms, but nothing seems to do the job.
Any help would be much appreciated
The minimax algorithm seems natural for these kinds of problems. First, the game is modelled in an abstract way, and then a solver is used to find the path from the current situation to a game state that maximizes the amount of points. A similar approach to minimax is GOAP, which was implemented in the 1970s for Shakey the robot under the name STRIPS. But GOAP and minimax have two problems: first, an abstract model of the game is needed (perhaps in PDDL or in Game Description Language), and second, the state space is too big.
A better alternative to planning is to use a behavior tree. That is a static program which describes the behavior of an agent. No solver is needed and no complete model of the game is needed. Instead, a bottom-up approach is used, with multiple edit-compile-run iterations for finding the optimal behavior tree (test-driven development). To implement such a programming approach, a so-called "reactive planner" has to be implemented first, which is another word for a real-time scheduler. That is a module which maps a behavior tree onto a Gantt chart, executing each action at a specific moment in time. As an introduction, the Unity3D engine is a good starting point, which has a full behavior-tree implementation out of the box.
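To make the behavior-tree idea concrete, here is a minimal sketch, illustrative only, in Python (since the competition allows it) and not tied to any engine; the node types are the standard Selector/Sequence composites, and the blackboard keys and leaf actions are invented:

# Minimal behavior-tree sketch: composite nodes tick their children and
# report SUCCESS, FAILURE or RUNNING. All blackboard keys are hypothetical.
SUCCESS, FAILURE, RUNNING = "success", "failure", "running"

class Sequence:
    """Succeeds only if every child succeeds, ticked left to right."""
    def __init__(self, *children): self.children = children
    def tick(self, blackboard):
        for child in self.children:
            status = child.tick(blackboard)
            if status != SUCCESS:
                return status
        return SUCCESS

class Selector:
    """Returns the first child that does not fail (a prioritized fallback)."""
    def __init__(self, *children): self.children = children
    def tick(self, blackboard):
        for child in self.children:
            status = child.tick(blackboard)
            if status != FAILURE:
                return status
        return FAILURE

class Leaf:
    """Wraps a plain function taking the blackboard and returning a status."""
    def __init__(self, fn): self.fn = fn
    def tick(self, blackboard): return self.fn(blackboard)

# Example tree: shoot if an enemy is in range, otherwise move toward an island.
def enemy_in_range(bb): return SUCCESS if bb.get("enemy_distance", 99) <= 3 else FAILURE
def shoot(bb):          bb["last_action"] = "shoot"; return SUCCESS
def move_to_island(bb): bb["last_action"] = "move";  return SUCCESS

ship_behavior = Selector(
    Sequence(Leaf(enemy_in_range), Leaf(shoot)),
    Leaf(move_to_island),
)

bb = {"enemy_distance": 2}
ship_behavior.tick(bb)      # tick once per ship, every game turn
print(bb["last_action"])    # -> "shoot"

Each turn you tick the root once per ship; the tree is a fixed, prioritized arrangement of condition and action leaves, so no search over the roughly 10000 joint moves is needed.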
I'm new to NetLogo and having difficulty grasping this problem.
Say you have 5,000 agents and each agent has five domains; each domain is given a number based on the agent's expertise. The agents' maximum for all domains is 15, and if an agent has over 7.5 for any domain, they are considered an expert.
There are two environments. The first is clustered by domain: agents move into the area of the domain in which they have the highest number. The second environment is mixed, with an expert from each of the five domains collaborating with experts from the other domains.
This model is to represent cross functional collaboration for innovation in businesses.
I'm having difficulty wrapping my head around the equation or activity to represent collaboration between the agents.
Also, I'm confused about how to represent their connections. I've looked at the Team Assembly model in NetLogo and think it's similar to what I want to represent. I am not sure how to quickly set up the random matrix that defines the agents' expertise.
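To make the setup concrete, here is roughly the kind of matrix I am trying to create, sketched in Python outside NetLogo; drawing the scores uniformly from 0 to 15 and treating "expert" as above 7.5 in a given domain are my own assumptions:

import random

N_AGENTS, N_DOMAINS, MAX_SCORE, EXPERT_THRESHOLD = 5000, 5, 15.0, 7.5

# One row per agent, one column per domain, values drawn uniformly from [0, 15].
expertise = [[random.uniform(0, MAX_SCORE) for _ in range(N_DOMAINS)]
             for _ in range(N_AGENTS)]

# An agent is an expert in a domain when its score there exceeds 7.5.
expert_in = [[score > EXPERT_THRESHOLD for score in row] for row in expertise]

# e.g. pick, for each domain, the agents that could seed the "clustered" environment
experts_per_domain = [
    [agent for agent in range(N_AGENTS) if expert_in[agent][d]]
    for d in range(N_DOMAINS)
]
print(len(experts_per_domain[0]), "agents qualify as experts in domain 0")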
Any thoughts or suggestions would be greatly appreciated :)
Thanks in advance for reading, and for your input.
I am thinking of implementing a learning strategy for different types of agents in my model. To be honest, I still do not know what kind of questions I should ask first or where to start.
I have two types of agents that I want to learn by experience; they have a pool of actions, each of which has a different reward based on the specific situations that might happen.
I am new to reinforcement learning methods, so any suggestions on what kind of questions I should ask myself are welcome :)
Here is how I am going about formulating my problem:
Agents have a lifetime, and they keep track of a few things that matter to them; these indicators are different for different agents. For example, one agent wants to increase A, while another wants B more than A.
States are points in an agent's lifetime at which they have more than one option (I do not have a clear definition for states, as they might happen a few times or not happen at all, because agents move around and might never face a given situation).
The reward is an increase or decrease in an indicator that an agent can get from an action in a specific state, and the agent does not know what the gain would have been if it had chosen another action.
The gain is not constant, the states are not well defined, and there is no formal transition from one state to another.
For example, an agent can decide to share with one of the co-located agents (Action 1) or with all of the agents at the same location (Action 2). If certain conditions hold true, Action 1 will be more rewarding for that agent, while under other conditions Action 2 will have the higher reward. My problem is that I have not seen any example with unknown rewards, since sharing in this scenario also depends on the other agents' characteristics (which affect the conditions of the reward system), and it will be different in different states.
In my model there is no relationship between the action and the following state, and that makes me wonder whether it is okay to think about RL in this situation at all.
What I am looking to optimize here is my agents' ability to reason about the current situation in a better way, rather than only respond to the needs triggered by their internal states. They have a few personality traits which define their long-term goals and can affect their decision making in different situations, but I want them to remember which action in a given situation helped them advance their preferred long-term goal.
In my model there is no relationship between the action and the following state, and that makes me wonder whether it is okay to think about RL in this situation at all.
This seems strange. What do actions do if not change state? Note that agents don't necessarily have to know how their actions will change their state. Similarly, actions could change the state imperfectly (a robot's treads could skid so it doesn't actually move when it tries to). In fact, some algorithms are specifically designed for this uncertainty.
In any case, even if the agents are moving around the state space without having any control, they can still learn the rewards for the different states. Indeed, many RL algorithms involve moving around the state space semi-randomly to figure out what the rewards are.
I do not have a clear definition for states, as they might happen a few times or not happen at all, because agents move around and might never face a given situation
You might consider expanding what goes into what you consider to be a "state". For instance, the position seems like it should definitely go into the variables identifying a state. Not all states need to have rewards (although good RL algorithms typically infer a measure of goodness of neutral states).
I would recommend clearly defining the variables that determine an agent's state. For instance, the state space could be current-patch X internal-variable-value X other-agents-present. In the simplest case, the agent can observe all of the variables that make up their state. However, there are algorithms that don't require this. An agent should always be in a state, even if the state has no reward value.
Now, concerning unknown reward. That's actually totally okay. Reward can be a random variable. In that case, a simple way to apply standard RL algorithms would be to use the expected value of the variable when making decisions. If the distribution is unknown, then the algorithm could just use the mean of the rewards observed so far.
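As a concrete illustration of "use the mean of the rewards observed so far", here is a small sketch of a learner that keeps a running average per (state, action) pair and chooses epsilon-greedily. The class, the action names, and the example state tuple are all hypothetical, not part of any particular NetLogo model:

import random
from collections import defaultdict

class AveragingLearner:
    """Keeps a running mean of observed rewards per (state, action) pair and
    picks greedily most of the time, exploring with probability epsilon."""
    def __init__(self, actions, epsilon=0.1):
        self.actions = actions
        self.epsilon = epsilon
        self.total = defaultdict(float)   # sum of rewards seen for (state, action)
        self.count = defaultdict(int)     # how often (state, action) was tried

    def estimate(self, state, action):
        n = self.count[(state, action)]
        return self.total[(state, action)] / n if n else 0.0

    def choose(self, state):
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.estimate(state, a))

    def observe(self, state, action, reward):
        self.total[(state, action)] += reward
        self.count[(state, action)] += 1

# Hypothetical use inside a model tick: the state could be a tuple such as
# (current_patch, internal_need_level, other_agents_present).
learner = AveragingLearner(actions=["share_one", "share_all"])
state = ("patch-3-4", "low-energy", True)
action = learner.observe(state, learner.choose(state), reward=1.5) if False else learner.choose(state)
learner.observe(state, action, reward=1.5)   # reward measured from the indicator change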
Alternatively, you could include the variables that determine the reward in the definition of the state. That way, if the reward changes, then the agent is literally in a different state. For example, suppose a robot is on top of a building. It needs to get to the top of the building in front of it. If it just moves forward, it falls to the ground. Thus, that state has a very low reward. However, if it first places a plank that goes from one building to the other and then moves forward, the reward changes. To represent this, we could include plank-in-place as a state variable, so that putting the plank in place actually changes the robot's current state and the state that would result from moving forward. Thus, the reward itself has not changed; it's just in a different state.
Hopefully this helps!
UPDATE 2/7/2018: A recent upvote reminded me of the existence of this question. In the years since it was asked, I've actually dived into RL in NetLogo to a much greater extent. In particular, I've made a Python extension for NetLogo, primarily to make it easier to integrate machine learning algorithms with models. One of the demos of the extension trains a collection of agents using deep Q-learning as the model runs.
Since SDK 3.0 it has been possible to easily use the GPS functionality in the iPhone, but it is explicitly forbidden to use Google's Maps.
This has two implications, I think:
You will have to provide maps yourself
You will have to calculate the shortest routes yourself.
I know that calculating the shortest route has puzzled mathematicians for ages, but both TomTom and Google are doing a great job, so that issue seems to have been solved.
Searching the net, not being a mathematician myself, I came across Dijkstra's algorithm. Is there anyone of you who has successfully used this algorithm in a Maps-like app on the iPhone?
Would you be willing to share it with me/the community?
Would this be the right approach, or are there other options?
Thank you so much for your consideration.
I do not believe Dijkstra's algorithm would be useful for real-world mapping because, as Tom Leys said (I would comment on his post, but lack the rep to do so), it requires a single starting point. If the starting point changes, everything must be recalculated, and I would imagine this would be quite slow on a device like the iPhone for a significantly large data set.
Dijkstra's algorithm is for finding the shortest path to all nodes (from a single starting node). Game programmers use a directed search such as A*. Where Dijkstra processes the node that is closest to the starting position first, A* processes the one that is estimated to be nearest to the end position.
The way this works is that you provide a cheap "estimate" function from any given position to the end point. A good example is how far a bird would fly to get there. A* adds this to the current distance from the start for each node and then chooses the node that seems to be on the shortest path.
The better your estimate, the shorter the time it will take to find a good path. If this time is still too long, you can do a path find on a simple map and then another on a more complex map to find the route between the places you found on the simple map.
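For reference, here is a compact sketch of A* over an explicit graph, using the "as the bird flies" distance as the estimate. The node names, coordinates and edge lengths are invented; a real road network would be far larger, but the structure is the same:

import heapq, itertools, math

def a_star(graph, coords, start, goal):
    """graph: {node: [(neighbour, edge_length), ...]}; coords: {node: (x, y)}.
    The heuristic is straight-line ('as the bird flies') distance to the goal."""
    def h(node):
        return math.dist(coords[node], coords[goal])

    tie = itertools.count()                      # breaks ties so the heap never compares nodes
    open_heap = [(h(start), 0.0, next(tie), start, None)]
    came_from, closed = {}, set()
    while open_heap:
        _, cost, _, node, parent = heapq.heappop(open_heap)
        if node in closed:
            continue
        came_from[node] = parent
        closed.add(node)
        if node == goal:                         # rebuild the path by walking parents back
            path = [node]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            return path[::-1], cost
        for neighbour, length in graph.get(node, []):
            if neighbour not in closed:
                g = cost + length
                heapq.heappush(open_heap, (g + h(neighbour), g, next(tie), neighbour, node))
    return None, math.inf

# Tiny made-up road network
coords = {"A": (0, 0), "B": (1, 0), "C": (1, 1), "D": (2, 1)}
graph = {"A": [("B", 1.0)], "B": [("C", 1.0), ("D", 1.5)], "C": [("D", 1.0)]}
print(a_star(graph, coords, "A", "D"))           # -> (['A', 'B', 'D'], 2.5)

The only difference from plain Dijkstra is the h() term added to the priority, which steers the search toward the goal.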
Update
After much searching, I have found an article on A* for you to read.
Dijkstra's algorithm is O(m log n) for n nodes and m edges (for a single path) and is efficient enough to be used for network routing. This means that it's efficient enough to be used for a one-off computation.
Briefly, Dijkstra's algorithm works like this:
Take the start node
    Assign it a depth of zero
    Insert it into a priority queue with its depth as the key
Repeat:
    Pop the node with the lowest depth from the priority queue
    Record the node that you came from, so you can trace the path back later
    Mark the node as having been visited
    If this node is the destination:
        Break
    For each neighbour:
        If the neighbour has not previously been visited:
            Calculate its depth as the depth of the current node + the distance to the neighbour
            Insert the neighbour into the priority queue at the calculated depth, remembering the current node as the one it was reached from
Return the destination node and the list of the nodes through which it was reached.
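A runnable version of that pseudocode might look like the following sketch (Python, with the graph given as an adjacency list of (neighbour, distance) pairs; the example intersections and distances are invented):

import heapq, itertools

def dijkstra(graph, start, destination):
    """graph: {node: [(neighbour, distance), ...]}.  Single-source shortest-path
    search that stops as soon as the destination is popped from the queue."""
    tie = itertools.count()                    # tie-breaker so the heap never compares nodes
    queue = [(0, next(tie), start, None)]      # (depth, tie, node, node we came from)
    came_from, visited = {}, set()
    while queue:
        depth, _, node, previous = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        came_from[node] = previous             # record how we reached this node
        if node == destination:                # walk the parents back to recover the path
            path = [node]
            while came_from[path[-1]] is not None:
                path.append(came_from[path[-1]])
            return list(reversed(path)), depth
        for neighbour, distance in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(queue, (depth + distance, next(tie), neighbour, node))
    return None, float("inf")

# Hypothetical intersection graph with distances in metres
roads = {"A": [("B", 200), ("C", 500)], "B": [("C", 100), ("D", 600)], "C": [("D", 300)]}
print(dijkstra(roads, "A", "D"))               # -> (['A', 'B', 'C', 'D'], 600)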
Contrary to popular belief, Dijkstra's algorithm is not necessarily an all-pairs shortest path calculator, although it can be adapted to do this.
You would have to get a graph of the streets and intersections with the distances between the intersections. If you had this data you could use Dijkstra's algorithm to compute a shortest route.
If you look at the technology TomTom calls 'IQ Routes', they measure actual speed and travel time per road stretch per time of day. This makes the arrival time more accurate, so the expected arrival time is more fact-based: http://www.tomtom.com/page/iq-routes
Calculating a route using the A* algorithm is plenty fast enough on an iPhone with offline map data. I have experience of doing this commercially. I use the A* algorithm as documented on Wikipedia, and I keep the road network in memory and re-use it; once it's loaded, routing even over a large area like Spain or the western half of Canada is practically instant.
I take data from OpenStreetMap or elsewhere and convert it into a directed graph, assuming (which is the right way to do it according to those who know) that any two roads sharing a point with the same ID are joined. I assign weights to different types of roads based on expected speeds, and if a portion of a road is one-way I create only a single arc; two-way roads get two arcs, one in each direction. That's pretty much the whole thing, apart from some ad-hoc code to prevent dangerous turns and to implement routing restrictions.
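In outline, the conversion looks something like the sketch below. This is a simplified illustration rather than production code; the input format, road types and expected speeds are all made up. The point is that each consecutive pair of points in a way becomes one arc (or two, for two-way roads), weighted by expected travel time:

import math

# Hypothetical input: each "way" is a list of (node_id, x, y) points plus tags.
ways = [
    {"nodes": [(1, 0.0, 0.0), (2, 0.3, 0.0)], "oneway": False, "type": "residential"},
    {"nodes": [(2, 0.3, 0.0), (3, 0.3, 0.4)], "oneway": True,  "type": "primary"},
]
EXPECTED_SPEED = {"primary": 80.0, "residential": 30.0}   # km/h, made-up weights

graph = {}   # node_id -> list of (neighbour_id, travel_time_weight)

def add_arc(a, b, weight):
    graph.setdefault(a, []).append((b, weight))

for way in ways:
    speed = EXPECTED_SPEED.get(way["type"], 50.0)
    pts = way["nodes"]
    for (a, ax, ay), (b, bx, by) in zip(pts, pts[1:]):
        length = math.dist((ax, ay), (bx, by))
        weight = length / speed                 # lower weight = expected to be faster
        add_arc(a, b, weight)                   # one-way roads get a single arc,
        if not way["oneway"]:                   # two-way roads get one arc per direction
            add_arc(b, a, weight)

print(graph)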
This was discussed earlier here: What algorithms compute directions from point a to point b on a map?
Have a look at CloudMade. They offer a free service for iPhone and iPad that allows navigation based on your current location. It is built on OpenStreetMap and has some nifty features, like making your own map style. It is a little slow from time to time, but it's totally free.