I have a collection of documents - events - that have venues/addresses:
venue: {
name: String,
street: String,
city: String
...
}
When the user creates a new event I would like to offer an autocomplete field for the venue - ideally ordered by city if I can determine the user's location beforehand.
I see that Mongo offers a few methods of managing data, from simply searching the collection to aggregation, etc. What would be the recommended approach for my situation?
If I index the event collection - when do I need to be concerned about the speed of search?
And for aggregation... I've not used it before, but it seems a good fit, especially with geo search, but I am unclear as to when aggregation occurs. Is this something that is automatically done/populated once I set it up, or do I need to run a cron job on this? The examples I am finding are unclear on this.
Would love to hear experiences people have with similar issues.
Aggregations are just a more sophisticated version of a query - just like in SQL you can do SELECT * FROM T1 or SELECT foo, COUNT(*) FROM T1 GROUP BY foo. So your question about cron jobs isn't really applicable, unless you want to query something that's intensive to compute and you want to pre-compute it periodically.
When you index the fields that you query on, the result is that the query will run much faster than an unindexed query - just like in an RDBMS, if you are familiar with those.
It sounds like you want to make a query based on what the user has already typed in. If I'm in San Francisco, CA and I started typing "S" in the venue field, presumably you want to run the query db.venues.find({city:"San Francisco",name:/^S/}) which would mean "venue in the city of San Francisco, name starting with 'S'."
I suspect you should save coordinates so that you can use the geospatial search features in MongoDB - then you would get matching results sorted from closest to furthest from the coordinates that the user set as their location.
You can do this with a query, so I don't see that you need an aggregation, but it's possible that you would use aggregation to do various analyses of venues - the same queries are supported there, along with other powerful data manipulation operations.
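To make the suggested query concrete, here is a minimal in-memory sketch of the matching logic - what db.venues.find({city: "San Francisco", name: /^S/}) selects - run over a plain array instead of a live MongoDB collection. The venue names are made-up examples.

```javascript
// Sample venue documents (hypothetical data).
const venues = [
  { name: "Slim's", city: "San Francisco" },
  { name: "Shoreline Amphitheatre", city: "Mountain View" },
  { name: "SFJAZZ Center", city: "San Francisco" },
  { name: "The Fillmore", city: "San Francisco" },
];

// Equivalent of db.venues.find({city: "San Francisco", name: /^S/}):
// exact match on city, prefix match on name. Note: if the prefix comes
// from user input, regex metacharacters should be escaped first.
function autocomplete(docs, city, prefix) {
  const re = new RegExp("^" + prefix);
  return docs.filter((v) => v.city === city && re.test(v.name));
}

const matches = autocomplete(venues, "San Francisco", "S");
// matches: Slim's and SFJAZZ Center
```

In MongoDB itself, an index on { city: 1, name: 1 } lets this city-plus-anchored-prefix query be served efficiently.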
Related
QUERYING MONGODB: RETRIEVE SHOPS BY NAME AND BY LOCATION WITH ONE SINGLE QUERY
Hi folks!
I'm building a "search shops" application using MEAN Stack.
I store shops documents in MongoDB "location" collection like this:
{
_id: .....
name: ...//shop name
location : //...GEOJson
}
The UI provides the users one single input for searching shops. Basically, I would like to perform one single query to retrieve, in the same results array:
All shops near the user (eventually limit to x)
All shops named "like" the input value
On the logical side, I think this is a "$or"-like query.
Based on this answer
Using full text search with geospatial index on Mongodb
probably assigning two special indexes (2dsphere and full text) to the collection is not the right way to achieve this. Anyway, I think this is a different case, just because I really don't want to apply sequential filters to the results - I "simply" want to retrieve data with 2 distinct criteria.
If I should set indexes on my collection, of course the approach is to perform two distinct queries with two distinct methods ($near for locations and $text for the name), and then merge the results with some server-side logic to remove duplicate documents and sort them in some way that's useful for the user experience. But I'm still wondering if there exists a method to achieve this result with one single query.
So, the question is: is it possible, or is this kind of approach outside MongoDB's purpose?
Hope this is clear and hope that someone can teach something today!
Thanks
My question: I am building a database in MongoDB, but the problem is I have two fields, country and city, and I want to query MongoDB like below:
db.database.find({country:country_code,city:city_code})
As per my perception, the backend MongoDB machine will perform the operation by first finding the country_code records, and that result will be filtered again with city_code.
Instead of that, I want to reduce the time. I found one solution, but I do not know how the MongoDB machine actually works: should I append the country and city codes to the unique id, so it becomes a flexible solution?
Something like this
db.database.find({A: {$regex: '^' + country_code + city_code}})
I am new to MongoDB, so please help me get the best performance.
Thanks in advance,
Ronak Amlani
You want an index on your fields.
db.<your collection>.createIndex({ country: 1, city: 1 })
Your regex solution is going to do very little in the way of performance and, even worse, will probably make your database a maintenance nightmare later on.
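To illustrate the difference, here is a small in-memory comparison (the documents are made-up examples). With separate country and city fields, the filter is a plain equality match - exactly what the compound index { country: 1, city: 1 } can serve - while the concatenated-key regex ties your data layout to the id format and leaves the query planner far less to work with.

```javascript
// Hypothetical documents with both structured fields and a concatenated _id.
const docs = [
  { _id: "IN-BOM-1", country: "IN", city: "BOM" },
  { _id: "IN-DEL-2", country: "IN", city: "DEL" },
  { _id: "US-NYC-3", country: "US", city: "NYC" },
];

// Structured equality filter - what find({country: ..., city: ...}) does,
// and what a compound index on { country: 1, city: 1 } serves directly.
const byFields = docs.filter((d) => d.country === "IN" && d.city === "BOM");

// Regex over a concatenated key - the same result, but now the country/city
// information only exists inside a string, so any schema change (or a code
// with a different length) breaks every query that parses the _id.
const byRegex = docs.filter((d) => /^IN-BOM/.test(d._id));
```

Both return the same document here, but only the structured version keeps the fields independently queryable and indexable.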
Sorry to have to ask this, but I am new to MongoDB (I only have experience with relational databases) and was just curious as to how you would structure your MongoDB database.
The documents will be JSON with some of the following fields:
{
"url": "http://....",
"text": "entire ad content including HTML (very long)",
"body": "text (50-200 characters)",
"date": "01/01/1990",
"phone": "8001112222",
"posting_title": "buy now"
}
Some of the values will be very long strings.
Each document is essentially an ad from a certain city. We are storing all ads for a lot of big cities in the US (about 422). We are storing more ads every day, and the amount of ads per city varies from as little as 0 to as big as 2000. The average is probably around 700-900.
We need to do the following types of queries, in almost instant time (if possible):
Get all ads for any specific city, for any specific date range.
Get all ads that were posted by a specific phone number, for any city, for any date range.
What would you recommend? I'm thinking I should have 422 collections - one for each city. I'm just worried about the query time when we query for phone numbers because it needs to go through each collection. I have an iterable list of all collection names.
Or would it be faster to just have one collection so that I don't have to switch through 422 collections?
Thank you so much, everyone. I'm here to answer any questions!
EDIT:
Here is my "iterating through all collections" snippet:
import glob

for name in glob.glob(r"Data\Nov. 12 - 5pm\*"):
    val = name.split("5pm")[1].split(".json")[0][1:]  # extract the city name from the file path
    coll = db[val]
    # Add into collection here...
MongoDB does not offer any operations which get results from more than one collection, so putting your data in multiple collections is not advisable in this case.
You can considerably speed up all the use-cases you mentioned by creating indexes for them. When you have a very large dataset and always query for exact equality, then hashed indexes are the fastest.
When you query a range of dates (between day x and day y), you should use the Date type and not strings, because this not only allows you to use lots of handy date operators in aggregation, but also allows you to speed up range queries and sorts with ascending or descending indexes.
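To make the Date-type point concrete, here is an in-memory sketch of the range filter that db.ads.find({city: "Boston", date: {$gte: from, $lt: to}}) performs. The collection name, cities, phone numbers, and dates are illustrative.

```javascript
// Sample ad documents storing real Date values instead of date strings.
const ads = [
  { city: "Boston", phone: "8001112222", date: new Date("2014-11-10") },
  { city: "Boston", phone: "8001113333", date: new Date("2014-11-12") },
  { city: "Chicago", phone: "8001112222", date: new Date("2014-11-12") },
];

// Equivalent of find({city: city, date: {$gte: from, $lt: to}}).
// Date objects compare chronologically, which is what makes both this
// filter and an ascending index on the field work for ranges -
// string dates like "01/01/1990" would sort lexicographically instead.
function adsInRange(docs, city, from, to) {
  return docs.filter((a) => a.city === city && a.date >= from && a.date < to);
}

const hits = adsInRange(ads, "Boston", new Date("2014-11-11"), new Date("2014-11-13"));
// hits: the single Boston ad from Nov 12
```

With one collection, a compound index such as { city: 1, date: 1 } (and another on phone) covers both query patterns described in the question.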
Maybe I'm missing something, but wouldn't making "city" a field in your JSON solve your problem? That way you only need to do something like this: db.posts.find({ city: {$in: ['Boston', 'Michigan']}})
I am playing with the best way to model MongoDB documents.
I am modelling a school.
A Student has many subjects.
Student{
  subjects: [ {name:'',
               level:'',
               short_name:''
              },
              {...},
              {...}]
}
Decided to denormalise and embed subjects into students for performance.
There are rare cases where a subject needs to be queried and updated.
subjects.all
subject1.short_name = 'something new'
I know I will have to iterate through every student to update every subject record.
However, what's the best way to return all unique subjects?
Can you do a unique search of student.subjects names, for example?
Or is it better to have another collection which is
Subjects{
  name:'',
  level:'',
  short_name:''
}
I would still keep the denormalised Student.subject, but this is simply there for querying all the subjects on offer.
An update would update this + every embedded Student.subject?
Any suggestions/recommendations?
However, what's the best way to return all unique subjects?
This is a shortfall of your schema here. You traded the ability to do this kind of thing easily in return for other speed benefits that you would use more often.
Currently the only real way is to either use the distinct() command ( http://docs.mongodb.org/manual/reference/method/db.collection.distinct/ ):
db.students.distinct('subjects.name');
or the aggregation framework:
db.students.aggregate([
{$unwind:'$subjects'},
{$group:{_id:'$subjects.name'}}
])
Like so.
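For intuition, here is an in-memory sketch of what distinct('subjects.name') - or the $unwind/$group pipeline above - computes: the set of unique subject names across all embedded arrays. The student data is made up.

```javascript
// Sample student documents with embedded subject arrays.
const students = [
  { name: "Ann", subjects: [{ name: "Maths" }, { name: "Physics" }] },
  { name: "Bob", subjects: [{ name: "Maths" }, { name: "History" }] },
];

// Mirrors db.students.distinct('subjects.name'): walk every embedded
// subject, collecting each name once.
function distinctSubjectNames(docs) {
  const names = new Set();
  for (const student of docs) {
    for (const subject of student.subjects) names.add(subject.name);
  }
  return [...names];
}

const unique = distinctSubjectNames(students).sort();
// unique: ["History", "Maths", "Physics"]
```

Note that either server-side form still has to scan the embedded arrays of every student document, which is why a separate Subjects collection is the better fit if this query is frequent.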
As for schema recommendation, if you intend to make this kind of query often then I would factor out subjects into a separate collection.
In the SQL world I could do something to the effect of:
SELECT name FROM table WHERE UPPER(name) = UPPER('Smith');
and this would match a search for "Smith", "SMITH", "SmiTH", etc... because it forces the query and the value to be the same case.
However, MongoDB doesn't seem to have this capability without using a RegEx, which won't use indexes and would be slow for a large amount of data.
Is there a way to convert a stored value to a particular case before doing a search against it in MongoDB?
I've come across the $toUpper aggregate, but I can't figure out how that would be used in this particular case.
If there's no way to convert stored values before searching, is it possible to have MongoDB convert a value when it's created in Mongo? So when I add a document to the collection it would force the "name" attribute to a particular case. Something like a callback in the Rails world.
It looks like there's the ability to create stored JS for MongoDB as well, similar to a Stored Procedure. Would that be a feasible solution as well?
Mostly looking for a push in the right direction; I can figure out the particular code once I know what I'm looking for, but so far I'm not even sure if my desired functionality is doable.
You have to normalize your data before storing them. There is no support for performing normalization as part of a query at runtime.
The simplest thing to do is probably to save both a case-normalized (i.e. all-uppercase) and display version of the field you want to search by. Suppose you are storing users and want to do a case-insensitive search on last name. You might store:
{
_id: ObjectId(...),
first_name: "Dan",
last_name: "Crosta",
last_name_upper: "CROSTA"
}
You can then create an index on last_name_upper, and query like:
> db.users.find({last_name_upper: "CROSTA"})
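The whole pattern - normalize at write time, normalize the input at query time - can be sketched in-memory like so. The document shape follows the users example above; the helper names are mine.

```javascript
// Write-time normalization: store an all-uppercase copy of the field
// alongside the display version (this is the application's job - MongoDB
// won't do it for you on insert).
function prepareUser(user) {
  return { ...user, last_name_upper: user.last_name.toUpperCase() };
}

const users = [
  prepareUser({ first_name: "Dan", last_name: "Crosta" }),
  prepareUser({ first_name: "Ann", last_name: "Smith" }),
];

// Query-time: uppercase the input the same way, then do a plain equality
// match - equivalent to db.users.find({last_name_upper: input.toUpperCase()}),
// which an index on last_name_upper can serve.
function findByLastName(docs, input) {
  const needle = input.toUpperCase();
  return docs.filter((u) => u.last_name_upper === needle);
}

const result = findByLastName(users, "sMiTh");
// result: the Smith document, regardless of input casing
```

The display version (last_name) stays untouched, so you can still show the name exactly as the user entered it.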