Algorithm stops illogically? - coordinates

I have been stuck on this problem for about 2 days now. Today after a lot of debugging I figured out that my nearest neighbour algorithm doesn't finish completely and stops in between when the last element from the ArrayList is removed.
I have two ArrayLists: one contains the X coordinates of all the cities and the other contains the Y coordinates of all the cities.
I am using a for loop to calculate the distances between the cities and to remove the coordinates of the previous cities. But when the last city in the ArrayList happens to be the previous city, it gets removed and then everything stops.
Here is the link to the code :
https://codeshare.io/5e6xB9
How do I fix this problem? Should I use a LinkedList instead of an ArrayList?
Your help will be much appreciated
Thank You.

The problem seems to be the .remove(i) lines. Each call removes an element from the list and shifts the remaining elements to the left, so the indices of the later elements change while your loop is still running.
At first glance there is also a lot of overhead in your code: lists, arrays, Stacks, reading the file twice, parsing the array twice... You should review your code, and at the very least I'd suggest making a City class with x and y coordinates.
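As a rough illustration of that suggestion (in Python rather than Java, and not based on the linked code, which is not shown here), the sketch below keeps each city as one object and only removes an element after the search over the unvisited list has finished, so no index shifts mid-loop. The names `City` and `nearest_neighbour_tour` are made up for this example.

```python
import math
from typing import List, NamedTuple


class City(NamedTuple):
    """Hypothetical container keeping x and y together (instead of two lists)."""
    x: float
    y: float


def nearest_neighbour_tour(cities: List[City]) -> List[City]:
    """Greedy nearest-neighbour tour starting from the first city."""
    if not cities:
        return []
    unvisited = list(cities)       # work on a copy, leave the input untouched
    tour = [unvisited.pop(0)]      # start at the first city
    while unvisited:
        current = tour[-1]
        # Find the index of the closest unvisited city WITHOUT removing
        # anything while we are still scanning the list.
        best = min(range(len(unvisited)),
                   key=lambda i: math.dist(current, unvisited[i]))
        # Remove exactly one element, after the scan is complete.
        tour.append(unvisited.pop(best))
    return tour


cities = [City(0, 0), City(5, 0), City(1, 0)]
print(nearest_neighbour_tour(cities))
```

Here the nearest city on the first step happens to be the last element of the list, which is the situation described in the question; because the removal happens outside the scan, the loop simply continues with the remaining cities.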

scipy.interpolate.griddata slow due to unnecessary data

I have a map with a 600*600 equidistant x,y grid with associated scalar values.
I have around 1000 x,y coordinates at which I would like to get the bilinearly interpolated map values. They are randomly placed in an inner central area of the map, around 400*400 in size.
I decided to go with the griddata function with method linear. My understanding is that with linear interpolation I would only need the three nearest grid positions around each coordinate to get well-defined interpolated values. So I would require around 3000 data points of the map to perform the interpolation. The 360k data points are highly unnecessary for this task.
Naively throwing in the complete map results in long execution times of about half a minute. Since it's easy to narrow the map down to the area of interest, I could reduce the execution time to nearly 20%.
I am now wondering whether I overlooked something in my assumption that I need only the three nearest neighbours for my task. And if not, whether there is a fast way to filter those 3000 out of the 360k. I assume looping 3000 times over the 360k lines will take longer than just throwing in the inner map.
Edit: I also had a look at a comparison of the results with 600*600 and with the reduced data points. I am actually surprised and concerned by the observation that the interpolation results partly differ significantly.
So I found out that RegularGridInterpolator is the way to go for me. It's fast and I already have a regular grid.
I tried to sort out my findings about the differences in the interpolated values and found that griddata shows behavior I did not expect.
Check out the issue I created for details.
https://github.com/scipy/scipy/issues/17378
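A minimal sketch of the RegularGridInterpolator approach might look like the following. The grid axes, the synthetic map values and the query points are all invented for the example (the real map is not shown in the post); only the grid size and the rough location of the query points follow the question.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator


def sample_regular_grid(x_axis, y_axis, grid_values, query_xy):
    """Linearly interpolate grid_values (shape len(x_axis) x len(y_axis))
    at the rows of query_xy (shape n x 2)."""
    interpolator = RegularGridInterpolator(
        (x_axis, y_axis), grid_values, method="linear"
    )
    return interpolator(query_xy)


# Assumed stand-in for the 600*600 map: a plane, so the exact answer is known.
x_axis = np.linspace(0.0, 599.0, 600)
y_axis = np.linspace(0.0, 599.0, 600)
xx, yy = np.meshgrid(x_axis, y_axis, indexing="ij")
grid_values = 2.0 * xx + 3.0 * yy

# Assumed stand-in for the ~1000 query points in the inner 400*400 area.
rng = np.random.default_rng(0)
query_xy = rng.uniform(100.0, 500.0, size=(1000, 2))

result = sample_regular_grid(x_axis, y_axis, grid_values, query_xy)
print(result.shape)
```

Because the interpolator is told that the data lies on a regular grid, it can locate the enclosing cell of each query point directly from the axes, so there is no need to pre-filter the 360k grid points by hand.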

Changing max node capacity in M-tree affects the results

Posting the code for the entire tree for this problem would be pointless (too long and chaotic), and I've tried to fix this problem for a while now, so I don't really want some concrete solution, but more like ideas as to why this might be happening. So:
I have a dataset of 1,000,000 coordinates and I insert them into the tree. I do a range search afterwards, and for MaxCapacity=10 I get the correct results (and for any number >= 10). If I switch to MaxCapacity=4 the results are wrong. But if I shrink the dataset to about 20,000 coordinates the results are again correct for MaxCapacity=4.
So to me, this looks like an incorrect split algorithm that only shows up for small MaxCapacities and large datasets, where we have an enormous number of splits. But the algorithm checks out for almost everything, so I can't really find a mistake there. Any other ideas? The tree is written in Scala. The promotion policy promotes the two points that are furthest away from each other, and for the split policy we iterate through the entries of the overflown node and put each entry into the group of the promoted point that it is closer to.
Don't know if anyone will be interested in this but I found the reasons causing this. I thought the problem was in split but I was wrong. The problem was when I was choosing in the Insert Recursion algorithm what node to jump to next in order to place the entry. So I was choosing this node by calculating the distance between each node's center and the entry's point. The node with minimum said distance was chosen.
This works fine if the entry happens to reside inside the radius of one or more nodes. In that case the minimum distance works as intended. But what if the entry doesn't reside in any node's radius? In that case we have to expand a radius to contain the entry, so we need to find the node whose radius would expand the least if it were to include the entry among its children. For a given node, its distance from the entry point might be minimal but the expansion needed might be catastrophically big. I had not considered this case, and as a result entries were placed in the wrong nodes, causing huge expansions and therefore huge overlaps. When I implemented this case the problem was fixed!
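The selection rule described above can be sketched roughly as follows (in Python rather than Scala, and with made-up names; the original tree code is not shown). `RoutingEntry` and `choose_subtree` are hypothetical, and plain Euclidean distance in 2D is assumed.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class RoutingEntry:
    """Hypothetical routing entry: a covering ball around a subtree."""
    center: Point
    radius: float


def choose_subtree(entries: List[RoutingEntry], point: Point) -> int:
    """Return the index of the entry to descend into when inserting point."""
    covering = []
    for i, entry in enumerate(entries):
        d = math.dist(entry.center, point)
        if d <= entry.radius:
            covering.append((d, i))
    if covering:
        # Case 1: at least one ball already contains the point,
        # so pick the one whose center is closest.
        return min(covering)[1]
    # Case 2: no ball contains the point, so pick the entry whose radius
    # has to grow the least (d - radius), NOT the one with the closest center.
    return min(
        range(len(entries)),
        key=lambda i: math.dist(entries[i].center, point) - entries[i].radius,
    )


entries = [RoutingEntry((0.0, 0.0), 1.0), RoutingEntry((10.0, 0.0), 5.5)]
print(choose_subtree(entries, (4.0, 0.0)))
```

In this example the first entry's center is closer to the point (distance 4 versus 6), but its radius would have to grow by 3, while the second entry only needs to grow by 0.5, so the second entry is chosen.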

Find points near LineString in mongodb sorted by distance

I have an array of points representing a street (black line) and points representing places on a map (red points). I want to find all the points near the specified street, sorted by distance. I also need to be able to specify a max distance (blue and green areas). Here is a simple example:
I thought of using the $near operator but it only accepts Point as an input, not LineString.
How can MongoDB handle this type of query?
As you mentioned, Mongo currently doesn't support anything other than Point. Have you come across the concept of a route boxer? [1] It was very popular a few years back on Google Maps. Given the line that you've drawn, it finds stops that are within dist(x). It does this by creating a series of bounding boxes around each point in the line and searching for points that fall within those boxes.
I stumbled upon your question after I realised that Mongo only works with points, which is reasonable, I assume.
I already have a few options for how to do it (they expand on what @mnemosyn says in the comment). With the dataset that I'm working on, it's all on the client side, so I could use the routeboxer, but I would like to implement it server-side for performance reasons. Here are my suggestions:
1. Break the LineString down into its individual coordinate sets, query for $near using each of those, combine the results and extract a unique set. There are algorithms out there for simplifying a complex line by reducing the number of points, but a simple one is easy to write.
2. Do the same as above, but as a stored procedure/function. I haven't played around with Mongo's stored functions, and I don't know how well they work with drivers, but this could be faster than the first option as you won't have to do roundtrips, and depending on the machine that your instance(s) of Mongo is (are) hosted on, calculations could be faster by microseconds.
3. Implement the routeboxer approach server-side (it has been done in PHP), and then use either of the above two to find stops that are $within the resulting bounding boxes. Since the routeboxer method returns rectangles, it would even be possible to merge all these rectangles into one polygon covering your route and just do a $within on that (what @mnemosyn suggested).
EDIT: I thought of this but forgot about it, but it might be possible to achieve some of the above using the aggregation framework.
It's something that I'm going to be working on soon (hopefully), I'll open-source my result(s) based on which I end up going with.
EDIT: I must mention though that 1 and 2 have the flaw that if you have 2 points in a line that are say 2km apart, and you want points that are within 1.8km of your line, you'll obviously miss all the points between that part of your line. The solution is to inject points onto your line when simplifying it (I know, beats the objective of reducing points when adding new ones back in).
The flaw with 3 then is that it won't always be accurate as some points within your polygon are likely to have a distance greater than your limit, though the difference wouldn't be a significant percentage of your limit.
[1] google maps utils routeboxer
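The "inject points onto your line" step mentioned in the second EDIT could be sketched like this. It is only an illustration under a strong assumption: coordinates are treated as planar (x, y) pairs with straight-line distances, which is not accurate for real longitude/latitude data. The function name `densify` is made up for this example.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def densify(line: List[Point], max_spacing: float) -> List[Point]:
    """Insert extra points so consecutive points are at most max_spacing apart.

    Assumes planar coordinates (hypothetical simplification)."""
    if max_spacing <= 0:
        raise ValueError("max_spacing must be positive")
    if not line:
        return []
    dense = [line[0]]
    for (x0, y0), (x1, y1) in zip(line, line[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        # Number of equal pieces needed so that no piece exceeds max_spacing.
        pieces = max(1, math.ceil(length / max_spacing))
        for k in range(1, pieces + 1):
            t = k / pieces
            dense.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return dense


print(densify([(0.0, 0.0), (2.0, 0.0)], 0.5))
```

Each point of the densified line could then be used as the centre of a separate $near query, at the cost of more queries per line.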
As you said, Mongo's $near only works with a point, not a line, as the centre. However, if you flip your premise from "find points near the line" to "find the line near the point", then you can use your points as the centre and the line as the target.
this is the difference between
foreach line find points near it
and
foreach point find line near it
If you have a large number of points to check, you can combine this with nevi_me's answer to reduce the list of points that need checking to a much smaller subset.
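Outside the database, the "for each point, find the line near it" idea amounts to computing each place's distance to the nearest segment of the street and sorting by it. A rough sketch follows, assuming planar coordinates, a street with at least two points, and made-up function names; it does not use any MongoDB API.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Distance from p to the segment a-b (planar coordinates assumed)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)   # degenerate segment
    # Project p onto the segment and clamp the projection to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def places_near_line(places: List[Point], line: List[Point],
                     max_distance: float) -> List[Tuple[float, Point]]:
    """Return (distance, place) pairs within max_distance, nearest first."""
    scored = []
    for place in places:
        d = min(point_segment_distance(place, a, b)
                for a, b in zip(line, line[1:]))
        if d <= max_distance:
            scored.append((d, place))
    scored.sort()
    return scored


street = [(0.0, 0.0), (10.0, 0.0)]
places = [(5.0, 3.0), (2.0, 1.0), (20.0, 0.0), (5.0, -0.5)]
print(places_near_line(places, street, max_distance=5.0))
```

This checks every place against every segment, so for a large collection it makes sense to first narrow the candidates down, for example with a bounding-box query.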

MATLAB: Dividing Items using a For-loop

I needed some help with a problem I'd been assigned in class. It's our introduction to for loops. Here is the problem:
Consider the following riddle.
This is all I have so far:
function pile = IslandBananas(numpeople, numbears)
for pilesize=1:10000000
end
I would really appreciate your input. Thank you!
I will help you, but you need to try harder than that. Also, you only need one for loop. First, think about how you would construct this algorithm. You know you have to use a for loop, so that is a start. So let's think about what is going on in the problem.
1) You have a pile.
2) First night someone takes the pile and divides it into 3 and finds that one is left over, this means mod(pile,3) = 1.
3) But he discards the extra banana. This means (pile-1).
4) He takes a third of it, leaving two-thirds left. This means (2/3)*(pile-1).
5) In the morning they take the pile and divide it into 3 and find again that one is left over, so this means mod((2/3)*(pile-1),3) = 1.
6) But they discard the extra banana. This means (2/3)*(pile-1)-1.
7) Finally, they have to each have at least one banana if it is to be the smallest pile possible. Thus, the smallest pile must be such that (1/3)*((2/3)*(pile-1)-1) = 1.
I have essentially given you the answer. The rest you can write with the formula (1/3)*((2/3)*(pile-1)-1) and a simple if statement that tests for the smallest possible integer result, which is 1. This can be done in four lines inside your for loop.
Expanding this to any number of people and any number of bears requires two simple substitutions in that formula! If your teacher demands it, this can easily be split into two nested for loops.
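For illustration only, steps 1) to 7) can be turned into a brute-force search like the one below. It is written in Python rather than MATLAB, it covers only the three-way case spelled out in the steps (the full riddle text is not reproduced above), and the function name `smallest_pile` is made up.

```python
def smallest_pile(limit=10_000_000):
    """Smallest pile that satisfies steps 1) to 7) for a three-way split."""
    for pile in range(1, limit + 1):
        # Step 2: dividing by 3 at night leaves one banana over.
        if pile % 3 != 1:
            continue
        # Steps 3 and 4: discard the extra banana, take a third, keep two-thirds.
        remaining = 2 * (pile - 1) // 3
        # Step 5: dividing by 3 in the morning leaves one banana over again.
        if remaining % 3 != 1:
            continue
        # Steps 6 and 7: discard the extra banana; everyone gets at least one.
        share = (remaining - 1) // 3
        if share >= 1:
            return pile
    return None  # nothing found below the limit


print(smallest_pile())  # prints 7
```

Integer arithmetic is used throughout so that the remainder checks are exact; with floating-point fractions such as 2/3 the mod tests could fail because of rounding.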

How can I randomly draw from an NSMutableArray and not use the same number twice?

For a game I am making, I am passing an array of objects of a set number that I am going to have the user identify. I would like to make the presentation of these objects random. But I also need to make it so that the same object isn't presented twice.
So this is the solution I have thought of tentatively: I pass in an array of objects to my game controller, of size 10. In the game controller I create a random number between 0-9 and then pull from the array with that number as the index...I don't know if that will work though because I need to make sure that the same random number isn't used in a game session.
Any thoughts on how I might accomplish this type of random game logic?
Thanks
You should look for a shuffling algorithm and shuffle your array contents.
This way you can just pop numbers off the array without worrying if they've been used before - when you get to the last number, reshuffle the array and repeat.
Here's an example: http://www.cocoanetics.com/2009/04/shuffling-an-nsarray/
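The shuffle-then-pop idea can be sketched as follows, using a Fisher-Yates shuffle. This is a language-neutral illustration in Python, not the code from the linked article, and `shuffled_copy` is a made-up name.

```python
import random


def shuffled_copy(items, rng=None):
    """Return a shuffled copy of items using the Fisher-Yates algorithm."""
    if rng is None:
        rng = random.Random()
    deck = list(items)
    # Walk from the last position down, swapping each slot with a
    # randomly chosen slot at or before it.
    for i in range(len(deck) - 1, 0, -1):
        j = rng.randint(0, i)   # inclusive on both ends
        deck[i], deck[j] = deck[j], deck[i]
    return deck


objects = ["apple", "ball", "cat", "dog", "egg"]
deck = shuffled_copy(objects)
while deck:
    current = deck.pop()        # each object is presented exactly once
    print(current)
```

Once the deck is empty, calling `shuffled_copy(objects)` again starts a new round in a fresh random order.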
To avoid having the same index come up more than once, I'd remove the object from the array each time. If you base the random number on the current size of the array you should be fine. Make a copy of the array if you want to be able to re-create it and start over again.