I'm working on a mobile app that tracks a user's location at regular intervals so they can plot the path of a journey on a map. We'd like to add an optional feature that tells them which other users of the app have made similar journeys in the timeframe they're looking at, be it today's commute or the last month of travel. We're referring to this as "path-matching".
The data is currently logged into files within the app's private storage directories on iOS and Android in a binary format that is easily and quickly scanned through to read locations. Each file contains the locations for one day, and generally runs to about 80KB.
To implement the path-matching feature we'll obviously need to start uploading these location logs to our server (with the user's permission, of course), which runs PHP. Someone suggested MongoDB for its geospatial prowess, but I have a few questions that maybe folks could help me with:
It seems like we could change our location logging to use BSON instead. The first field would be a device or user ID, followed by a list of locations for a particular day. The file could then be uploaded to our server and pushed into the MongoDB store. The online documentation, however, only seems to refer to importing BSON files created by mongodump. Is the format stable enough that any app could write BSON files readable directly by MongoDB?
Is MongoDB able to run geospatial queries on documents containing multiple locations, or on locations forming a path across multiple documents? Or does this strike you as something that would require excessive logic outside the database, on the PHP side?
The format is totally stable, but there isn't much tooling to do what you describe. Generally, you'd upload it to the backend, where it would end up in, say, $_POST['locations'] as an array of associative arrays. Sanitize it and save it to the database, something like:
// Assumes $collection is a MongoCollection and $userId is already known.
$locs = sanitize($_POST['locations']); // your own validation/sanitization
$doc = array(
    'path' => array('type' => 'LineString', 'coordinates' => $locs),
    'user' => $userId,
);
$collection->insert($doc);
In the example above I'm using some of the latest geo features (http://docs.mongodb.org/manual/release-notes/2.4/#new-geospatial-indexes-with-geojson-and-improved-spherical-geometry). You'll need a nightly build to get this, but it should be in the stable build in about a month. If you need it before then, you can use the older geo API: http://docs.mongodb.org/manual/core/geospatial-indexes/.
MongoDB doesn't read BSON files directly, but you could use mongorestore to load them manually. I would highly recommend letting the driver do the low-level work for you, though!
You can have a document containing a line (in the new geo stuff) and an array of points (in the old geo stuff). I'm not sure what you mean by "a path across multiple documents."
Edited to add: based on your comment, you might want to try {path : {$near : {$geometry : userPath}}} to find "nearby" paths. You could also try building a polygon around the user's path and querying for documents $geoWithin the polygon.
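To make the shapes concrete, here is a sketch of the document from the PHP example above plus the polygon approach just mentioned. Field names ("path", "user") follow that example; the coordinates and device ID are made up for illustration, and GeoJSON uses [longitude, latitude] order.

```javascript
// One day's locations as [longitude, latitude] pairs (GeoJSON order).
const locs = [[-0.1276, 51.5074], [-0.1426, 51.5014], [-0.1570, 51.4975]];

// Document to insert: a GeoJSON LineString plus the owning user.
const doc = {
  path: { type: 'LineString', coordinates: locs },
  user: 'device-1234',
};

// "Polygon around the user's path": a rough bounding box around the
// journey, closed by repeating the first vertex.
const box = {
  type: 'Polygon',
  coordinates: [[
    [-0.17, 51.49], [-0.11, 51.49], [-0.11, 51.52],
    [-0.17, 51.52], [-0.17, 51.49],
  ]],
};

// Matches documents whose whole path lies inside the polygon
// (a 2dsphere index on "path" is assumed).
const filter = { path: { $geoWithin: { $geometry: box } } };
```

A real polygon would usually be a buffer around the path rather than a simple bounding box, but the query shape is the same.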
I am new to MongoDB, and I have the following issue in a web application I'm currently developing.
We have an application where we use MongoDB to store data, and an API where we search for documents via text search.
As an example: if the user types "New York", the request should return all the data in the collection matching the keyword "New York". (We call the API for each letter typed.) We have nearly 200,000 documents in the DB, and for some keywords a search returns nearly 4,000 documents. We tried limiting the results to 5, so it returns the top 5 documents but none of the other available data. Without a limit it returns hundreds or thousands of documents, as I mentioned, which slows the request down.
On the frontend we bind the search results to a dropdown (Next.js).
My question:
Is there a way to optimize searching for a document?
Are there any suggestions for a suitable way to implement this requirement using MongoDB and .NET 5?
Or any other implementation methods for this requirement?
The following code segment shows the query that retrieves the data for the incoming keyword.
var hotels = await _hotelsCollection
.Find(Builders<HotelDocument>.Filter.Text(keyword))
.Project<HotelDocument>(hotelFields)
.ToListAsync();
var terminals = await _terminalsCollection
.Find(Builders<TerminalDocument>.Filter.Text(keyword))
.Project<TerminalDocument>(terminalFields)
.ToListAsync();
var destinations = await _destinationsCollection
.Find(Builders<DestinationDocument>.Filter.Text(keyword))
.Project<DestinationDocument>(destinationFields)
.ToListAsync();
So this is a classic "autocomplete" feature, and there are some known best practices you should follow:
On the client side you should use a debounce; this is a must. There is no reason to execute a request for each letter typed. This is the most critical piece of an autocomplete feature.
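A minimal debounce sketch: the search request only fires once the user has stopped typing for `delayMs`. The names and delay here are illustrative, not tied to the asker's codebase.

```javascript
// Returns a wrapped function that delays calls to `fn` until `delayMs`
// milliseconds have passed without another call.
function debounce(fn, delayMs) {
  let timer = null;
  return (...args) => {
    clearTimeout(timer);
    timer = setTimeout(() => fn(...args), delayMs);
  };
}

// Usage: wire the debounced function to the search input's change handler.
const search = debounce((keyword) => {
  // e.g. fetch(`/api/search?q=${encodeURIComponent(keyword)}`) ...
  console.log('querying for', keyword);
}, 300);

search('N'); search('Ne'); search('New'); // only "New" triggers a query
```

Libraries like lodash ship an equivalent `debounce`, so hand-rolling it is optional.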
On the backend things can get a bit more complicated. Naturally you want to be using a database suited for this task; specifically, MongoDB has a service called Atlas Search, a Lucene-based text search engine.
This will get you autocomplete support out of the box. However, if you don't want to make big changes to your infrastructure, here are some suggestions:
Make sure the field you're searching on is indexed.
I see you're executing 3 separate requests; consider using something like Task.WhenAll to execute all of them at once instead of one by one. I'm not sure how the client side is built, but if all 3 entities are shown in the same list, then ideally you'd merge the labels into one collection so you could paginate the search properly.
As mentioned in #2, you must add server-side pagination; no search engine can exist without one. I can't give specifics on how you should implement it, as you have 3 separate entities, which could make the pagination implementation harder. I'd consider whether or not you need all 3 of these in the same API route.
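The merge-then-paginate idea can be sketched framework-agnostically. The field names (`name`), labels, and page size below are assumptions for illustration; in the asker's .NET code this would live after the three collection queries.

```javascript
// Merge the three entity result sets into one labelled list, sort it,
// and return a single page of it (server-side pagination).
function mergeAndPage(hotels, terminals, destinations, page, pageSize) {
  const merged = [
    ...hotels.map((h) => ({ label: h.name, type: 'hotel' })),
    ...terminals.map((t) => ({ label: t.name, type: 'terminal' })),
    ...destinations.map((d) => ({ label: d.name, type: 'destination' })),
  ].sort((a, b) => a.label.localeCompare(b.label));

  const start = page * pageSize;
  return merged.slice(start, start + pageSize);
}

// First page of two results across all three entity types.
const page0 = mergeAndPage(
  [{ name: 'Astoria' }],
  [{ name: 'Central' }],
  [{ name: 'Beach' }],
  0, 2
);
```

In production you would push the limit/skip into the database queries themselves rather than slicing in memory; this just shows the shape of the merged, typed response.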
I have been using Firestore as the database for my project.
I have been building an app that stores paths generated by reading GPX files. The collection I use to store them is built like this:
Each document is a different trail path.
What I would like to do is conditionally search through the collection "paths". I have been reading this documentation: https://cloud.google.com/firestore/docs/query-data/queries?hl=en
The condition would be to return results that are within X km, using the latitude and longitude values stored in the nested array "coordonnees".
I would like to know if there is a way to use the kind of expression below to look for data where coordonnees.lat < X and coordonnees.lng < X, for example?
citiesRef.where("state", ">=", "CA").where("state", "<=", "IN");
The goal is that people will type their location, and I want to return the stored files that are near the location they typed in. If you know a better or different way to reach the same goal, that would be nice too!
If I understand you correctly, you want to find documents that are around a certain location. While Firestore has a GeoPoint data type built in, it unfortunately does not currently support performing geoqueries on those fields.
The best option is to use an add-on library based on geohashes, as explained in the documentation on performing geoqueries on Firestore.
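The trick those libraries rely on is that nearby points share a common geohash prefix, which turns a proximity search into an ordinary Firestore range query on a string field. A minimal encoder, for illustration only (use the maintained libraries in practice):

```javascript
// Geohash base-32 alphabet (digits plus letters, omitting a, i, l, o).
const BASE32 = '0123456789bcdefghjkmnpqrstuvwxyz';

// Encode a lat/lng pair to a geohash string by repeatedly bisecting the
// longitude and latitude ranges; each 5 bits becomes one character.
function geohashEncode(lat, lng, precision = 9) {
  let minLat = -90, maxLat = 90, minLng = -180, maxLng = 180;
  let hash = '', bit = 0, ch = 0, evenBit = true;
  while (hash.length < precision) {
    if (evenBit) { // longitude bit
      const mid = (minLng + maxLng) / 2;
      if (lng >= mid) { ch = ch * 2 + 1; minLng = mid; }
      else { ch = ch * 2; maxLng = mid; }
    } else { // latitude bit
      const mid = (minLat + maxLat) / 2;
      if (lat >= mid) { ch = ch * 2 + 1; minLat = mid; }
      else { ch = ch * 2; maxLat = mid; }
    }
    evenBit = !evenBit;
    if (++bit === 5) { hash += BASE32[ch]; bit = 0; ch = 0; }
  }
  return hash;
}
```

Storing such a hash on each path document lets you query `where('geohash', '>=', prefix).where('geohash', '<=', prefix + '~')` for candidates, then filter by exact distance in your code; the add-on libraries handle the edge cases (cell boundaries, multiple query ranges) for you.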
I have 2 collections, one named "USERS" and the other "MATCHES". Users can join matches, and the avatar of the user who joined appears in the match. The problem is that when the user changes their avatar image after joining a match, the match's avatar doesn't change, because the match still holds the old avatar.
The avatar is saved as Base64 in Firestore, but I will change it to Cloud Storage in the near future.
I have been trying to store a reference instead, but that only gives me the path.
If I have to make a database API call for each match the user has joined, I might have to make 20 API calls to update the matches. That could work, but it's not the best solution.
Maybe the solution is in the Google Functions?
I'm out of ideas.
Maybe the solution is in the Google Functions?
Cloud Functions also access Firestore through an SDK, so they can't magically do things that the SDK doesn't allow.
If you're duplicating data and you update one of the duplicates, you'll have to consider updating the others. If they all need to be updated, that indeed requires a separate call for each duplicate.
If you don't want to have to do this, don't store duplicate data.
For more on strategies for updating duplicated data, see How to write denormalized data in Firebase.
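If you do fan the update out (from the client or from a Cloud Function), Firestore batched writes are capped at 500 operations per batch, so the match references need to be split into chunks first. The helper below is pure JavaScript; the Firestore calls are sketched in comments, and `matchRefs`/`newAvatar` are assumed names.

```javascript
// Split an array into consecutive chunks of at most `size` elements,
// e.g. to respect Firestore's 500-writes-per-batch limit.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Sketch of the fan-out itself (not runnable without a Firestore app):
// for (const group of chunk(matchRefs, 500)) {
//   const batch = db.batch();
//   group.forEach((ref) => batch.update(ref, { avatar: newAvatar }));
//   await batch.commit();
// }
```

With ~20 matches per user you'd stay well within one batch, but chunking keeps the code safe if that number grows.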
I am constructing an app (in Xcode) which, in a general sense, displays information to users. The information is stored as individual objects in a database (a Parse Server hosted on Heroku). The user can elect to "see" information created within a set distance of their current location. (The information, when saved to the DB, is saved along with its latitude and longitude, based on the location of the user who initiated the save.) I know I can filter the pieces of information by comparing their latitude and longitude to the viewing user's current latitude and longitude, and only display those which are close enough. Roughly/generally:
let currentUserLat = latitude // latitude of user's current location
let infoSet = objects         // all info objects pulled from the DB
for info in infoSet {
    // Latitude-only check against an arbitrary threshold
    if abs(info.lat - currentUserLat) < 3 {
        // display the info
    } else {
        // don't display
    }
}
This is set up decently enough, and it works fine. The reason it works fine, though, is the small number of entries currently in the DB (the app is in development). Under practical usage (i.e. many users) the DB may be full of information objects (let's say a thousand). In my opinion, individually pulling each DB entry and comparing its latitude to the current user's latitude would take too long. I know there must be a way to do it in a timely manner (think Tinder: they only display profiles of people in the near vicinity, and it doesn't take long despite millions of profiles), but I do not know what is most efficient. I thought of creating separate sections of the DB for different geographical regions and then only searching the section matching the user's current location, but this seems unsophisticated and would still pull large amounts of info. What is the best way to do this?
Per Deploying a Parse Server to Heroku, you can install a MongoDB add-on (or another of the data stores in the add-on category) and use geospatial indexes and queries, which are specifically intended for this sort of application.
Is there a reason you need to do that sort of checking on the client side? I would suggest sending your coordinates to your server and having the server query your database with those coordinates to figure out which items to pull. The server can then return to the client whichever items were "close" to that user.
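Whichever side does the check, raw latitude differences (as in the question's loop) aren't a distance: a degree of longitude shrinks toward the poles, and negative differences would always pass the `< 3` test. A sketch of the proper great-circle (haversine) check the server would run, with illustrative field names:

```javascript
// Great-circle distance between two lat/lng points, in kilometres.
function haversineKm(lat1, lon1, lat2, lon2) {
  const R = 6371; // mean Earth radius in km
  const toRad = (d) => (d * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * R * Math.asin(Math.sqrt(a));
}

// Server-side filter: only return items within `radiusKm` of the user.
function nearby(items, userLat, userLon, radiusKm) {
  return items.filter(
    (i) => haversineKm(i.lat, i.lon, userLat, userLon) <= radiusKm
  );
}
```

In practice a geospatial index (as the other answer suggests) does this inside the database so you never pull the far-away rows at all; this only shows the distance math itself.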
Same situation like this question, but with current DerbyJS (version 0.6):
Using imported docs from MongoDB in DerbyJS
I have a MongoDB collection with data that was not saved through my
Derby app. I want to query against that and pull it into my Derby app.
Is this still possible?
The accepted answer there links to a now-dead page. The newest working link would be this: https://github.com/derbyjs/racer/blob/0.3/lib/descriptor/query/README.md
Which refers to the 0.3 branch for Racer (current master version is 0.6).
What I tried
Searching the internets
The naïve way:
var query = model.query('projects-legacy', { public: true });
model.fetch(query, function() {
  query.ref('_page.projects');
});
(doesn't work)
A utility was written for this purpose: https://github.com/share/igor
You may need to modify it to run against only a single collection instead of the whole database, but it essentially goes through every document in the database, adds the necessary livedb metadata, and creates a default operation for each document as well.
In livedb every collection has a corresponding operations collection; for example, profiles will have a profiles_ops collection which holds all the operations for profiles.
You will have to convert the collection to use it with Racer/livedb because of the metadata required on the document itself.
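To illustrate the kind of transformation involved, here is a sketch of decorating a plain document with livedb-style snapshot metadata. The exact field names (`_v`, `_type`, `_m`) and type URI are assumptions based on livedb 0.x snapshots; treat this as illustrative and prefer running the igor utility itself, which also writes the ops collection.

```javascript
// Sketch: add assumed livedb snapshot metadata to a plain MongoDB document.
// _v    - snapshot version (1 after the initial create operation)
// _type - OT type URI (JSON0 is what Racer uses)
// _m    - creation/modification timestamps
function addLivedbMeta(doc, now = Date.now()) {
  return {
    ...doc,
    _v: 1,
    _type: 'http://sharejs.org/types/JSONv0',
    _m: { ctime: now, mtime: now },
  };
}

const converted = addLivedbMeta({ _id: 'p1', public: true }, 1000);
```

Again, this omits the corresponding `_ops` entry that igor creates, which is why hand-rolling the conversion is not recommended.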
An alternative, if you don't want to convert, is to use traditional AJAX/REST to get the data from your Mongo database and then just put it in your local model. This will not be real-time or synced to the server, but it will let you drive your templates from data that you don't want to convert for some reason.