I see that chaining multiple where clauses acts like AND, but how about OR?
You can't really do an OR query in Cloud Firestore.
As a workaround, you could run two separate queries and merge them together on the client, or add some custom field that would essentially perform the "OR" query for you on the database. (For an example of the latter, if you know you're going to often run an "age > 65 OR age < 18" query on the database, you could create a specific age_high_or_low field that you would set to true if the age field were greater than 65 or less than 18.)
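A minimal sketch of the custom-field idea, in plain Python with in-memory dictionaries standing in for documents. The helper name with_age_flag and the sample users are hypothetical. Only the age_high_or_low field and its rule come from the answer above.

```python
def with_age_flag(user):
    """Return a copy of the document with the precomputed OR flag added."""
    age = user["age"]
    # The flag stores the result of "age > 65 OR age < 18" at write time,
    # so a later query only needs a single equality check on this field.
    return {**user, "age_high_or_low": age > 65 or age < 18}


# Hypothetical sample documents, flagged as they would be before being stored.
users = [
    {"name": "a", "age": 70},
    {"name": "b", "age": 30},
    {"name": "c", "age": 12},
]
stored = [with_age_flag(u) for u in users]

# Stand-in for an equality filter on age_high_or_low in the database.
matches = [u["name"] for u in stored if u["age_high_or_low"]]
```

The flag has to be recomputed whenever age changes, which is the price of moving the OR logic from query time to write time.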
Related
How can I query my Firestore database based on age range and location?
For example, let's say I want to get all users that are between the ages of 18 and 30, and are within 5 miles of me.
Here's a simplified version of my current database structure.
users
    uid_0
        age: 21
    uid_1
        age: 24
To filter by age parameters, I know I can do this:
// Swift
db.collection("users")
    .whereField("age", isGreaterThanOrEqualTo: 18)
    .whereField("age", isLessThanOrEqualTo: 30)
For location, I've read Firebase's geofire can be used, which would add an additional node such as:
_geofire
    uid_0:
        g: "asdaseeefef"
        l:
            0: 52.2101515118818
            1: -0.3215188181881
    uid_1:
        g: "oposooksok"
        l:
            0: 50.1234567898788
            1: -0.8789999595988
But I'm unsure how to add a location query on top of my original age query (tbh I'm uncertain how to make a location query even by itself). As for combining the two, my main concern is that the Firestore docs specify that you can only apply a range filter on one field, and I am already applying one for the age field.
Is it possible to filter by location while also filtering by an age range?
Firestore (and many other NoSQL databases) can only perform relational conditions, such as isGreaterThanOrEqualTo/isLessThanOrEqualTo/startAt/endAt, on a single field, due to how their indexes work behind the scenes.
To allow you to perform geoqueries on such a database, libraries such as GeoFire use so-called GeoHash values, which magically merge the two latitude and longitude values into a single value (the g in your data structure) that you can perform a range condition on. It's quite magical really. I did a talk on the topic a few years ago, which I highly recommend checking out: Geo-querying Firebase and Firestore.
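To make the "merge two values into one" step less magical, here is a minimal sketch of the standard geohash encoding in plain Python. It is not GeoFire's own code, and the function name geohash_encode is an assumption for illustration. The idea is to alternately halve the longitude and latitude ranges, emit one bit per halving, and pack every five bits into a base-32 character.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=10):
    """Interleave longitude and latitude bits into a single base-32 string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    value = 0
    use_lon = True  # the encoding starts with a longitude bit
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                value = (value << 1) | 1
                lon_lo = mid
            else:
                value <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                value = (value << 1) | 1
                lat_lo = mid
            else:
                value <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # five bits make one base-32 character
            chars.append(BASE32[value])
            bits = 0
            value = 0
    return "".join(chars)


short_hash = geohash_encode(42.605, -5.603, precision=5)
```

Because each extra character only refines the same cell, a shorter hash is always a prefix of a longer one for the same point. That is what lets a string range condition on a single field select a geographic area.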
Now if you'd also like to filter on another property such as age, you'd have to figure out a way to express the value of age into a single type with the longitude and latitude in a way that all three values can be filtered in one go. So you'd have to come up with a GeoHashAndAge type, which (while definitely interesting) seems a bit beyond what most of us are willing to go through.
That unfortunately leaves you with only two options: you can either pre-filter the data or post-filter it.
Pre-filtering means that you add one or more fields to each document that allow you to perform the necessary age filter without needing a relational condition. For example, if a use-case in your app is that you want people over 18, add a field isOver18 to each document with a boolean value, and you can filter on that with an equality check, which can be combined with the range filter on the geohash. This may not be possible for all use-cases, but when it is possible it allows you to leave the filtering to the database.
Post-filtering is simplest: you just perform the age filtering in your application code after retrieving the documents from Firestore. This always works, but of course means that you're reading more documents than are needed.
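As a rough sketch of post-filtering, assuming the geoquery has already returned a list of documents as plain dictionaries (the function name post_filter_by_age and the uid_2 entry are hypothetical):

```python
def post_filter_by_age(docs, min_age, max_age):
    """Keep only the documents whose age lies in [min_age, max_age]."""
    return [d for d in docs if min_age <= d["age"] <= max_age]


# Stand-in for the documents a geohash range query returned.
nearby = [
    {"uid": "uid_0", "age": 21},
    {"uid": "uid_1", "age": 24},
    {"uid": "uid_2", "age": 45},  # hypothetical extra user outside the range
]

kept = post_filter_by_age(nearby, 18, 30)
wasted_reads = len(nearby) - len(kept)  # documents read but then discarded
```

Every discarded document was still read from the database, which is the cost mentioned above.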
Based on @Frank van Puffelen's answer, I think the best solution (at least in the case of an app that needs to scale) is the pre-filtering option, because we can keep database storage and read costs at a minimum (explained under the Best Solution header further down).
Shortcomings of other options
Post-filtering shortcoming:
Since we could (in theory) have millions of users within a 100 mile radius, the post-filtering age option would require fetching all that data at once, which means we couldn't take advantage of Firebase's cost/time-saving .limit(to:) method and things would get expensive and time-consuming, fast.
GeoAgeHash shortcoming:
Frank's GeoAgeHash idea is interesting and theoretically possible, but I don't believe it would be financially feasible either. Just as a GeoHash search requires up to 9 queries, adding a third dimension for age (the first two being lat and long) would raise the number of queries for a single search to potentially 9 * 82 = 738 (assuming you were querying users with ages between 18 and 100, resulting in the 82 options). That would eat up the Firestore free-tier quota of 50K reads per day pretty quickly.
Best Solution
Pre-filtering:
EDIT: It turns out there is a major drawback to this solution as well: because we are using a range clause on the geohash, Firestore requires a composite index for each tag (below), which brings you close to the limit of 200 composite indexes. If you are querying by other fields as well, a composite index is required for each combination, which could put you over the limit, not to mention that it's very time-consuming to create that many indexes.
With that being said, I think the pre-filtering option is the best. For precise age filtering, you would just need to make sure you had a tag for each possible age.
For example: isOver18, isOver19, isOver20... tags to query against for the minimum age constraint, and then isUnder100, isUnder99, all the way down for the upper-age limit constraint.
And actually, more realistically, you should be storing birthdates instead of ages (since ages would need to be checked each day and possibly updated to remain accurate).
Therefore, for querying purposes, you would store birth year tags instead of age tags, such as isOver1900, isOver1901 and up, then isUnder2004, isUnder2003 and down.
In short, each user doc would need 82 isOver tags, and 82 isUnder tags, assuming you are planning for queries between the ages of 18 and 100 (which results in the 82 different ages).
Example Database Structure
Users Collection
    userDoc1
        birthDate: TimeInterval // I like to use TimeIntervalSince1970
        isOver1900: Bool
        isOver1901: Bool
        isOver1902: Bool
        ...etc...
        isUnder2004: Bool
        isUnder2003: Bool
        isUnder2002: Bool
        ...etc...
    userDoc2
        ...
    ...
Example Usage
// Assuming it's 2022, and you want to query users between the ages of 24 and 30
db.collection("Users")
    .whereField("isUnder1998", isEqualTo: true) // filters >= 23
    .whereField("isOver1991", isEqualTo: true) // filters <= 31
// Note: we must filter one year extra for the `isOver` clause because technically there could be some people born at the end of 1991 that are still 30, depending on what time of the year the query is made. These edge cases can easily be filtered out client-side.
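The tag scheme above could be generated with something like the following sketch. The function names and the 1900 to 2004 year range are assumptions. The tags are treated as inclusive (isOverY means born in year Y or later, isUnderY means born in year Y or earlier), which is what the comments in the example usage imply.

```python
def birth_year_tags(birth_year, first_year=1900, last_year=2004):
    """Build the isOver/isUnder boolean fields for one user document."""
    tags = {}
    for year in range(first_year, last_year + 1):
        tags[f"isOver{year}"] = birth_year >= year   # born in `year` or later
        tags[f"isUnder{year}"] = birth_year <= year  # born in `year` or earlier
    return tags


def tag_names_for_age_range(current_year, min_age, max_age):
    """Pick the two tags to filter on for ages min_age..max_age.

    The isOver bound is widened by one year, as in the note above, because
    someone born late in that year may not have had their birthday yet.
    """
    under_tag = f"isUnder{current_year - min_age}"
    over_tag = f"isOver{current_year - max_age - 1}"
    return under_tag, over_tag


under_tag, over_tag = tag_names_for_age_range(2022, 24, 30)
tags_1995 = birth_year_tags(1995)
```

The exact ages still need a client-side check against birthDate, since the tags only narrow the result to the nearest birth year.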
QUERYING MONGODB: RETRIEVE SHOPS BY NAME AND BY LOCATION WITH ONE SINGLE QUERY
Hi folks!
I'm building a "search shops" application using MEAN Stack.
I store shops documents in MongoDB "location" collection like this:
{
    _id: .....
    name: ...//shop name
    location : //...GEOJson
}
The UI gives users one single input for searching shops. Basically, I would like to perform one single query that retrieves, in the same results array:
All shops near the user (eventually limit to x)
All shops named "like" the input value
On logical side, I think this is a "$or like" query
Based on this answer
Using full text search with geospatial index on Mongodb
assigning two special indexes (2dsphere and full text) to the collection is probably not the right way to achieve this. Anyway, I think this is a different case, because I really don't want to apply sequential filters to the results; I "simply" want to retrieve data matching either of 2 distinct criteria.
If I do set both indexes on my collection, the obvious approach is to perform two distinct queries with two distinct methods ($near for locations and $text for name), and then merge the results with some server-side logic to remove duplicate documents and sort them in some useful way for the user experience. But I'm still wondering whether there is a way to achieve this result with one single query.
So, the question is: is this possible, or is this kind of approach outside MongoDB's purpose?
Hope this is clear and hope that someone can teach something today!
Thanks
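For reference, the merge step of the two-query approach described above could look roughly like this. It is a sketch in plain Python, assuming each query has already returned a list of documents as dictionaries keyed by _id. The function name merge_results and the sample shops are hypothetical.

```python
def merge_results(near_docs, name_docs):
    """Concatenate two result lists, keeping the first copy of each _id."""
    merged = {}
    for doc in [*near_docs, *name_docs]:
        # setdefault keeps the first document seen for an _id, so shops
        # found by location keep their distance-based position.
        merged.setdefault(doc["_id"], doc)
    return list(merged.values())


# Hypothetical results of the $near query and the $text query.
near_docs = [{"_id": 1, "name": "Corner Shop"}, {"_id": 2, "name": "Bakery"}]
name_docs = [{"_id": 2, "name": "Bakery"}, {"_id": 3, "name": "Bakery West"}]

merged_ids = [d["_id"] for d in merge_results(near_docs, name_docs)]
```

Putting the location results first is only one possible ordering; any ranking that suits the user experience could be applied after deduplication.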
The Firestore documentation on making queries includes examples where you can filter a collection of documents based on whether some field in the document either matches, is less than, or is greater than some value you pass in. For example:
db.collection("cities").whereField("population", isLessThan: 100000)
This will return every "city" whose "population" is less than 100000. This type of query can be made on fields of type String as well.
db.collection("cities").whereField("name", isGreaterThanOrEqualTo: "San Francisco")
I don't see a method to perform a substring search. For example, this is not available:
db.collection("cities").whereField("name", beginsWith: "San")
I suppose I could add something like this myself using greaterThan and lessThan but I wanted to check first:
Why doesn't this functionality exist?
My fear is that it doesn't exist because the performance would be terrible.
[Googler here] You are correct: there are no string operations like beginsWith or contains in Cloud Firestore, so you will have to approximate your query using greater than and less than comparisons.
You say "it doesn't exist because the performance would be terrible" and while I won't use those exact words you are right, the reason is performance.
All Cloud Firestore queries must hit an index. This is how we can guarantee that the performance of any query scales with the size of the result set even as the data set grows. We don't currently index string data in a way that would make it easy to service the queries you want.
Full text search operations are one of the top Cloud Firestore feature requests so we're certainly looking into it.
Right now if you want to do full text search or similar operations we recommend integrating with an external service, and we provide some guidance on how to do so:
https://firebase.google.com/docs/firestore/solutions/search
It is possible now:
db.collection('cities')
    .where('name', '>=', 'San')
    .where('name', '<', 'Sao');
For more details, see "Firestore query documents startsWith a string".
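The two bounds for such a prefix query can be derived mechanically. A minimal sketch in plain Python (the helper name prefix_range is hypothetical, and it assumes the last character of the prefix is not already the highest possible code point):

```python
def prefix_range(prefix):
    """Return (lower, upper) so that lower <= s < upper matches s.startswith(prefix)."""
    if not prefix:
        raise ValueError("prefix must not be empty")
    # Bump the last character by one code point: "San" -> "Sao".
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, upper


lower, upper = prefix_range("San")

# Stand-in for the indexed range scan the database would perform.
names = ["Sacramento", "San Diego", "San Francisco", "Santa Fe", "Sao Paulo"]
matches = [n for n in names if lower <= n < upper]
```

The upper bound has to sort after every string that starts with the prefix, so it must be greater than the prefix itself.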
I have tried and tried, on Meteor and on Robomongo (MongoDB), to select objects with dot notation.
I would like to be able to filter team.0.wageringStats.wageringStraightSpread objects (sometimes the subobjects can be fields or arrays, but that's another issue).
In the first image I select team.wageringStats.wageringStraightSpread and get back all the subobjects of team (team has siblings not shown in the images).
In the second image I tried team.0.wageringStats.wageringStraightSpread and I get no fields.
Lastly, I tried team.[0].wageringStats.wageringStraightSpread and team[0].wageringStats.wageringStraightSpread and got the same result: 0 fields.
I am at a loss and would like some help. Thank you.
I am not sure what you are trying to do. In your first command you already get a list of teams that match your criteria, which you can then process in a loop in Meteor. Why do you need to find only the first one? By the way, to select the nth document of a result set in MongoDB, you will need something like skip and limit:
db.collections.find({'team.wageringStats.wageringStraightSpread':1}).limit(1).skip(0)
(In skip, you pass the offset you want to reach.)
Also, if you only care about the first one, findOne is the query you need:
db.collections.findOne({'team.wageringStats.wageringStraightSpread':1})
Be aware that the query syntax of MongoDB and Meteor is a bit different.
I'm hoping to do a very common task, which is to never delete items that I store, but instead just mark them with a deleted flag. However, for almost every request I will now have to specify deleted:false. Is there a way to have a "default" filter that you can build on, so that I can construct a live_items filter and do queries on top of that?
This was just one guess at a potential answer. In general, I'd just like to have deleted=False be the default search.
Thanks!
In SQL you would do this with a view, but unfortunately MongoDB versions before 3.4 don't support views.
But when queries which exclude items marked as deleted are far more frequent than those which include them, you could remove the deleted items from the main items collection and put them in a separate items_deleted collection. This also has the nice side effect that the performance of the collection of active items doesn't get impaired by a large number of deleted items. The downside is that a unique index can't guarantee uniqueness across both collections.
I went ahead and just made a Python function that combines the default filter with the specified query:
def find_items(filt, single=False, live=True):
    # By default, restrict the query to items not marked as deleted.
    if live:
        filt = {**filt, 'deleted': False}
    if single:
        return db.Item.find_one(filt)
    return db.Item.find(filt)