Dijkstra's: multiple copies of cells in the heap

I am trying to do Dijkstra's shortest path on a grid. Right now I have it working, but I do have some confusion. Say I am looking at a cell in the grid: I evaluate it and then push all of its neighbors onto the heap if they have not been evaluated. The issue I run into is that I end up with multiple instances of the same cell in the heap, which really bogs down the process. To fix this, I set it to not push a cell onto the heap if it's already in the heap. Is this a correct approach, or could this lead to issues? This is over an unweighted grid.

If it's Dijkstra's, you don't just have a heap, you have a priority queue.
And you have to store the shortest distance at which you have reached each node. If you arrive at a node again with a greater distance, you simply ignore the arrival. If you arrive with a shorter distance, you have to update the node's priority in the queue accordingly instead of inserting it twice (if your queue implementation doesn't support priority changes, just remove the node and reinsert it with the new priority).
https://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
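For illustration only (not from the original post), here is a minimal Python sketch of the other common workaround, "lazy deletion": duplicates are allowed into the queue, but stale entries are skipped when popped, so each cell is settled exactly once. The function name and the cost convention (grid[r][c] is the cost to enter that cell) are assumptions for this sketch.

```python
import heapq

def dijkstra_grid(grid, start):
    """Shortest distances from `start` on a grid; grid[r][c] is the cost
    to enter cell (r, c).

    Lazy deletion: a cell may be pushed more than once, but a popped entry
    whose distance is worse than the recorded best is simply skipped.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    heap = [(0, start)]                    # (distance, (row, col))
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist.get((r, c), float("inf")):
            continue                       # stale entry: better path already found
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + grid[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist
```

Note that on an unweighted grid, skipping cells that are already in the queue (as the question proposes) is safe, because with uniform edge weights the first push of a cell already carries its final distance; in that special case plain BFS with a visited set gives the same answer with less overhead.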

Related

Is there a way to record when (and how often) Transporters get within a certain distance of each other?

I have an AnyLogic simulation model using trucks and forklifts as agents (Transporter type), and among other things I would like to identify each time one of them comes within a certain distance of another one (for example, within 5 meters). I will record this count as a variable within the main space of the model. I am using path-guided navigation.
I have seen a method "agentsInRange" which will probably be able to do the trick, but am not sure where to call this from. I assume I should be able to use the AL functionality of "Min distance to obstacle" (TransporterFleet) and "Collision detection timeout" (TransporterControl)?
Thanks in advance!
Since there don't seem to be pre-built functions for this, afaik, the easiest way is to:
- add an int variable "counter" to your transporter agent type
- add an event "checkCollision" to your transporter type that triggers every second or so
- in the event, loop across the entire population of transporters and count the number that are closer than X meters (use distanceTo(otherTransporter) and write your own custom code)
- add that number to "counter"
Note that this might be very inefficient computationally as it is quite brute-force. But it might be good enough :)
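The loop in the third step can be sketched outside AnyLogic as plain Python; count_in_range and the (x, y) tuples below are illustrative stand-ins for the transporter population and its distanceTo() call:

```python
import math

def count_in_range(transporters, threshold):
    """Brute-force O(n^2) count of transporter pairs closer than `threshold`.

    `transporters` is a list of (x, y) positions; in AnyLogic you would
    iterate over the transporter population and call distanceTo() instead.
    Each pair is counted once (j starts at i + 1).
    """
    count = 0
    for i in range(len(transporters)):
        for j in range(i + 1, len(transporters)):
            (x1, y1), (x2, y2) = transporters[i], transporters[j]
            if math.hypot(x2 - x1, y2 - y1) < threshold:
                count += 1
    return count
```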

Changing max node capacity in M-tree affects the results

Posting the code for the entire tree for this problem would be pointless (too long and chaotic), and I've tried to fix this problem for a while now, so I don't really want some concrete solution, but more like ideas as to why this might be happening. So:
I have a dataset of 1,000,000 coordinates and I insert them into the tree. I do a range search afterwards, and for MaxCapacity=10 I get the correct results (and for any number >= 10). If I switch to MaxCapacity=4, the results are wrong. But if I shrink the dataset to about 20,000 coordinates, the results are again correct for MaxCapacity=4.
So to me this looks like an incorrect split algorithm, and it just shows for small MaxCapacities and large datasets, where we have an enormous number of splits. But the algorithm checks out for almost everything, so I can't really find a mistake there. Any other ideas? The tree is written in Scala; the promotion policy promotes the two points that are furthest away from each other, and for the split policy we iterate through the entries of the overflown node and put each entry into the group of whichever promoted point it is closer to.
Don't know if anyone will be interested in this, but I found the reason for it. I thought the problem was in the split, but I was wrong. The problem was in how the insert recursion chooses which node to descend into next in order to place the entry: I was choosing the node by calculating the distance between each node's center and the entry's point, and picking the node with the minimum such distance.
This works fine if the entry happens to reside inside the radius of multiple nodes; in that case the minimum distance works as intended. But what if the entry doesn't reside in any node's radius? Then we have to expand a radius to contain the entry, so we need to find the node whose radius would expand the least if it were to include the entry among its children. A node's distance from the entry point might be minimal while the expansion needed is catastrophically big. I had not considered this case, and as a result entries were placed in the wrong nodes, causing huge expansions, causing huge overlaps. When I implemented this case the problem was fixed!
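A sketch of that corrected subtree choice, written here in Python with 2-D points rather than the original Scala (choose_subtree and the (center, radius) pair representation are assumptions for illustration):

```python
import math

def choose_subtree(children, point):
    """Pick the index of the child node to descend into when inserting `point`.

    `children` is a list of (center, radius) pairs. If the point lies inside
    one or more child radii, pick the closest such center; otherwise pick the
    child whose radius would need the SMALLEST expansion to cover the point,
    which is not necessarily the child with the closest center.
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    covering = [(dist(center, point), i)
                for i, (center, radius) in enumerate(children)
                if dist(center, point) <= radius]
    if covering:
        return min(covering)[1]
    # No child covers the point: minimize the required radius expansion.
    return min(range(len(children)),
               key=lambda i: dist(children[i][0], point) - children[i][1])
```

With children [((0, 0), 1), ((10, 0), 8)] and the point (4, 0), the first center is closer (distance 4 vs. 6), but the second child already covers the point, so the second child is chosen and no expansion happens at all.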

AnyLogic: batching agents in terms of weight

How do I set the batch size in terms of the weight I want to batch? I am currently simulating a potato plant. The potatoes (agents) all have their own weight due to the natural randomness of potatoes, but now I must batch them into 10 kg bags. The weight should be just over 10 kg, never smaller, so a bag will be, say, 9.9 kg plus one more potato.
The F1 help function suggests to use a customized Queue. But I do not know how to go forward with that option.
Any help would be appreciated
You could use a "Wait" object with infinite capacity. Whenever a potato is added, check the total weight, and if it is above your threshold, call wait.freeAll(). This will send the potatoes into a downstream batch object.
Make sure to change the batch size to the number of potatoes in the "Wait" object before you call the freeAll() method, so that all freed potatoes are batched together. You can do that dynamically using batch.set_batchSize(x).
cheers
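Outside AnyLogic, the threshold logic of that on-enter check can be sketched in plain Python (release_batch is a hypothetical stand-in for the Wait block's on-enter action, not AnyLogic API):

```python
def release_batch(waiting_weights, threshold=10.0):
    """Decide whether the potatoes currently waiting form a full bag.

    `waiting_weights` is the list of individual potato weights (kg) held in
    the Wait object. Returns the number of potatoes to free, i.e. the value
    you would pass to batch.set_batchSize() before calling wait.freeAll(),
    or 0 if the total is still below the threshold.
    """
    if sum(waiting_weights) >= threshold:
        return len(waiting_weights)   # free everything waiting as one batch
    return 0
```

Because the check runs every time a potato enters, the total can exceed the threshold by at most one potato's weight, which matches the "9.9 kg plus one more potato" requirement.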

load factor in separate chaining?

Why is it recommended to have a load factor of 1.0 in separate chaining?
I've seen plenty of people saying that it is recommended, but not given a clear explanation of why.
With open addressing, I know the load factor should be between 0.5 and 0.7 because it should be a fast operation to find an unoccupied index when dealing with collisions. But I can't see why a load factor of 1 should be better in separate chaining. I mean, if I have a table of size 100, isn't there still a chance that all 100 elements hash to the same index and get placed in the same list? So I really can't comprehend why this specific load factor for separate chaining should be 1.
tl;dr: To save memory space by not leaving slots unoccupied, and to speed up access by minimizing the number of list traversal operations.
If you understand the load factor as n_used_slots / n_total_slots:
Having a load factor of 1 just describes the ideal situation for a well-implemented hash table using Separate Chaining collision handling: no slots are left empty.
The other classical approach, Open Addressing, requires the table to always have a free slot available when adding a new item. Resizing the table is far too costly to do for each item, but we are also restricted on memory and wouldn't want too many unused slots lying around. One has to find a balance between speed (few table resizes, quick inserts and lookups) and memory (few empty slots) [as so often in programming]. The ideal load factor is based on this balancing idea and can be estimated from the actual hash function, the value domain and other factors.
With Separate Chaining, on the other hand, we usually expect from the start to have (way) more items than available hash table slots. If a collision occurs, we need to add the item to the linked list stored in a specific slot. Since searching in a linked list is costly, we would like to minimize list traversal operations. For that, the best case is to have all slots filled with lists of ideally the same length! Having all slots filled corresponds to a load factor of 1.
To put it another way: A load factor < 1 means that there are empty slots and items had to be added to a linked list in another slot, increasing the number of list traversal operations and wasting some memory.
Concerning your example of a table with size 100: yes, there is a chance that all items collide and occupy just one single slot. In that case, the effective load factor would be 0.01 and performance would be heavily impacted.
If you understand the load factor as n_items / n_total_slots:
In that case, the load factor can be larger than 1. A factor < 1 means you have empty slots, while factor > 1 means that there are slots holding more than one item and consequently, list traversals are required. In the first case, you are wasting space and in the second case list traversals lead to a (small) performance hit, depending on the size of the lists.
Example: A load factor of 10 means that on average each slot holds 10 items. Searching for an item that is present therefore implies traversing about 5 list nodes on average (a failed search traverses all 10).
A load factor of 1 means you waste no space and have the fastest lookup, if you use a decent hash function that ensures a regular and evenly balanced usage of slots.
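To make the n_items / n_total_slots reading concrete, here is a toy separate-chaining table (a minimal sketch, not a production implementation; names are illustrative):

```python
class ChainedHashTable:
    """Minimal separate-chaining hash table illustrating the load factor.

    With n items in m slots, the load factor n/m equals the average chain
    length, so a lookup inspects roughly load_factor nodes on average,
    assuming the hash function spreads keys evenly across slots.
    """
    def __init__(self, slots=8):
        self.buckets = [[] for _ in range(slots)]
        self.count = 0

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # overwrite existing key
                return
        bucket.append((key, value))        # collision: chain grows
        self.count += 1

    def get(self, key):
        for k, v in self._bucket(key):     # traverse the chain
            if k == key:
                return v
        raise KeyError(key)

    def load_factor(self):
        return self.count / len(self.buckets)
```

Inserting 8 items into 4 slots gives a load factor of 2.0: every lookup still works, it just walks a chain of about two entries, which is exactly the speed/memory trade-off described above.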

stacks growing downward and heaps growing upward, what if they encounter?

This picture could be found on Operating system concepts, beginning of Chapter 9. The size of virtual address space is 0 to max. My question is:
what will decide the max value? Is it fixed?
what will happen if the hole between stack and heap is filled and one of them, or both of them, want to grow continually?
I know that my question may be duplicate, but I've read a lot threads and I still cannot get my answer. Thanks in advance!
Keep in mind that what you are seeing is a very simplified diagram of what happens. First of all, the underlying hardware sets a maximum logical address range.
Some part of that range will be reserved (either through hardware or software, depending upon the processor) for the operating system. The remaining addresses are for the user address space.
So what you are looking at is a conceptual view of a user address space. This can be further limited by system parameters and process quotas.
what will decide the max value? Is it fixed?
Thus MAX is a combination of hardware limits, operating system address allocation, system parameters, and process quotas. It is therefore not necessarily fixed.
what will happen if the hole between stack and heap is filled and one of them, or both of them, want to grow continually?
First of all, remember this diagram is only conceptual. One simplification is that the valid addresses within the address space need not be contiguous: there could be holes. Second, memory layout is usually controlled by the linker. The "text" and the "data" can be reversed or even interleaved.
The blue "hole" will generally be unallocated (invalid) memory pages. Some OSes do not grow the stack; it is preallocated by the linker. In a multi-threaded system there can be multiple stacks (another simplification of the diagram), and there are often multiple heaps.
As various functions map pages into the logical address space, the blue area shrinks. If it goes to zero, the next attempt to map pages will fail.