The Firestore documentation on making queries includes examples where you can filter a collection of documents based on whether some field in the document either matches, is less than, or is greater than some value you pass in. For example:
db.collection("cities").whereField("population", isLessThan: 100000)
This will return every "city" whose "population" is less than 100000. This type of query can be made on fields of type String as well.
db.collection("cities").whereField("name", isGreaterThanOrEqualTo: "San Francisco")
I don't see a method to perform a substring search. For example, this is not available:
db.collection("cities").whereField("name", beginsWith: "San")
I suppose I could add something like this myself using greaterThan and lessThan but I wanted to check first:
Why doesn't this functionality exist?
My fear is that it doesn't exist because the performance would be terrible.
[Googler here] You are correct, there are no string operations like beginsWith or contains in Cloud Firestore, you will have to approximate your query using greater than and less than comparisons.
You say "it doesn't exist because the performance would be terrible" and while I won't use those exact words you are right, the reason is performance.
All Cloud Firestore queries must hit an index. This is how we can guarantee that the performance of any query scales with the size of the result set even as the data set grows. We don't currently index string data in a way that would make it easy to service the queries you want.
Full text search operations are one of the top Cloud Firestore feature requests so we're certainly looking into it.
Right now if you want to do full text search or similar operations we recommend integrating with an external service, and we provide some guidance on how to do so:
https://firebase.google.com/docs/firestore/solutions/search
It is possible now:
db.collection('cities')
.where('name', '>=', 'San')
.where('name', '<', 'Sao');
For more details, see "Firestore query documents startsWith a string".
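The range-comparison approach to prefix matching can be sketched as a small helper that computes both bounds. This is only a local Python illustration, not Firestore client code: `prefix_bounds` and `starts_with` are hypothetical names, and the sketch assumes strings are compared by code point. The upper bound is the prefix with its last character bumped to the next one, so "San" gives the half-open range ["San", "Sao").

```python
def prefix_bounds(prefix: str) -> tuple[str, str]:
    """Return (lower, upper) such that lower <= s < upper holds for strings starting with prefix."""
    if not prefix:
        raise ValueError("prefix must be non-empty")
    last = ord(prefix[-1])
    if last >= 0x10FFFF:
        # No next code point exists; this edge case is not handled in the sketch.
        raise ValueError("last character has no successor")
    # Bump the last character: "San" -> "Sao"
    return prefix, prefix[:-1] + chr(last + 1)


def starts_with(names: list[str], prefix: str) -> list[str]:
    """Local stand-in for a '>= lower' plus '< upper' range query."""
    lower, upper = prefix_bounds(prefix)
    return [n for n in names if lower <= n < upper]


cities = ["Salem", "San Diego", "San Francisco", "Santa Fe", "Sao Paulo", "Seattle"]
print(prefix_bounds("San"))
print(starts_with(cities, "San"))
```

The same two bounds would then be passed to the greater-than-or-equal and less-than filters of whichever client library is in use.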
I'm trying to make a social media app using Flutter. One key feature of apps like this is looking for friends, and I would like to implement this in my app.
Here's a look in my Firestore Database:
We have a collection of "users" where we store their data and a subcollection "friends" where we store the userID of the added users:
Let's say user "4uuBry" is friends with user "5083CM"; then "5083CM" should not appear in the list of friend suggestions for "4uuBry".
So how do I query all users who are not friends with "4uuBry" and display them in a ListView?
EDIT Sep 18 2020
The Firebase release notes suggest there are now not-in and != queries. (Proper documentation is now available.)
not-in finds documents where a specified field’s value is not in a specified array.
!= finds documents where a specified field's value does not equal the specified value.
Neither query operator will match documents where the specified field is not present. Be sure to see the documentation for the syntax for your language.
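To make the missing-field behaviour concrete, here is a minimal local simulation in Python. It is not the Firestore client API: `where_not_equal` and `where_not_in` are hypothetical helpers that only mimic the rule stated above, namely that a document lacking the field is never returned.

```python
def where_not_equal(docs: list[dict], field: str, value) -> list[dict]:
    """Mimic '!=': the field must exist and differ from value."""
    return [d for d in docs if field in d and d[field] != value]


def where_not_in(docs: list[dict], field: str, values: list) -> list[dict]:
    """Mimic 'not-in': the field must exist and not be one of values."""
    return [d for d in docs if field in d and d[field] not in values]


users = [
    {"id": "4uuBry", "city": "Oslo"},
    {"id": "5083CM", "city": "Lima"},
    {"id": "9zzQa1"},  # no 'city' field at all
]

print(where_not_equal(users, "city", "Oslo"))
print(where_not_in(users, "id", ["4uuBry", "5083CM"]))
```

Note that the third user is absent from the first result even though its city is certainly not "Oslo", because the field does not exist on that document.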
ORIGINAL ANSWER
Firestore doesn't provide inequality checks. According to the documentation:
The where() method takes three parameters: a field to filter on, a comparison operation, and a value. The comparison can be <, <=, ==, >, or >=.
Inequality operations don't scale like other operations that use an index. Firestore indexes are good for range queries. With this type of index, for an inequality query, the backend would still have to scan every document in the collection in order to come up with results, and that's extremely bad for performance when the number of documents grows large.
If you need to filter your results to remove particular items, you can still do that locally.
You also have the option of using multiple queries to exclude a distinct value. Something like this, if you want everything except 12. Query for value < 12, then query for value > 12, then merge the results in the client.
Example:
QuerySnapshot querySnap = await FirebaseFirestore.instance
.collection('chat')
.where('participant', arrayContains: myEmployeId)
.where('type', isEqualTo: '1') //you will change the query here
.get();
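The "query below the value, query above the value, then merge" idea from the original answer can be sketched locally as follows. This is a Python simulation over an in-memory list, not client-library code; the helper names are hypothetical, and each helper stands in for one range query sent to the backend.

```python
def query_less_than(docs: list[dict], field: str, value) -> list[dict]:
    """Stand-in for a 'field < value' range query."""
    return [d for d in docs if field in d and d[field] < value]


def query_greater_than(docs: list[dict], field: str, value) -> list[dict]:
    """Stand-in for a 'field > value' range query."""
    return [d for d in docs if field in d and d[field] > value]


def everything_except(docs: list[dict], field: str, value) -> list[dict]:
    """Merge the two range results on the client; the excluded value falls in neither range."""
    return query_less_than(docs, field, value) + query_greater_than(docs, field, value)


docs = [{"value": 10}, {"value": 12}, {"value": 15}, {"value": 12}, {"value": 3}]
print(everything_except(docs, "value", 12))
```

The merged list is not sorted across the two halves, so any ordering the UI needs would have to be applied after the merge.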
Main question
I am making a social media app which recommends posts to users. The posts have fields user, gym and grade, all of which are of type String. user refers to the user that made the post.
I want to recommend posts which have gym and grade that are found in lists of preferred gyms and grades, so I want to use whereIn on those fields. However, I do NOT want to recommend posts that were posted by the user of the app (i.e. I don't want users to see their own posts in their recommended posts), so I want to use isNotEqualTo on the user field.
Additionally, the posts have a timestamp field. I want to recommend the latest posts, so I want to use orderBy on the timestamp field.
Lastly, I only want to load 10 posts at a time, so I want to use limit(10).
However, I have to load the next 10 posts when the user scrolls past 10 posts, so I want to use startAfterDocument for that.
So this is what I attempted:
A possible searchTerms looks like this:
final searchTerms = {
'notUser': 'nathantew',
'gyms': ['gym1','gym2'],
'grades': ['grade1','grade2'],
}
Then the query using searchTerms (the actual query logic is longer in my code as the posts have more fields, and I attempted to make a generalised query function. I extracted the relevant parts):
var query =
FirebaseFirestore.instance.collection('posts').orderBy('timestamp');
if (lastDoc != null) {
query = query.startAfterDocument(lastDoc);
}
if (searchTerms['notUser'] != null) {
query = query.where('user', isNotEqualTo: searchTerms['notUser']);
}
if (searchTerms['gyms'] != null) {
query = query.where('gym', whereIn: searchTerms['gyms']);
}
if (searchTerms['grades'] != null) {
query = query.where('grade', whereIn: searchTerms['grades']);
}
return query.limit(10).get();
However, that throws an error on the isNotEqualTo filter:
_AssertionError ('package:cloud_firestore/src/query.dart': Failed assertion: line 677 pos 11: 'field == orders[0][0]': The initial orderBy() field '[[FieldPath([timestamp]), false]][0][0]' has to be the same as the where() field parameter 'FieldPath([user])' when an inequality operator is invoked.)
Extra questions and things I tried
I doubt this is how queries actually work for a non-SQL database like cloud firestore, but the way I imagine the query is that the posts are first ordered by timestamp, then we iterate through the posts from newest to oldest, then whichever post has a gym and grade in the list of preferred gyms and grades, AND doesn't have a user which is 'nathantew', will be added to the list of documents that will be returned to the app when the list reaches 10 documents.
And this way, when I request for the next 10 posts, I can pass in the last document of the previous query and start from there.
I tried searching up the error, but it seems to me that my understanding of how queries happen is completely wrong, because I quickly found a confusingly large amount of rules restricting how queries can be made.
The rules are confusing too, for example, https://firebase.google.com/docs/firestore/query-data/order-limit-data#limitations states "If you include a filter with a range comparison (<, <=, >, >=), your first ordering must be on the same field". Also, "You cannot order your query by any field included in an equality (=) or in clause." I'm using isNotEqualTo in this case, so does that fall under the first rule, or the second? It doesn't seem like isNotEqualTo is a "range comparison" filter, but the error called on the isNotEqualTo sounds like it is referring to the first rule...
I've seen suggestions to make a less complicated query omitting certain filters, and only add the filters in the local app code. I don't know how that'll work.
I can't omit the orderBy timestamp query filter or I would have to query the entire database to filter it locally.
But if I omit any of the where filters, then I need to query an unknown number of documents each time to ensure 10 documents are queried that satisfy the conditions.
And the more I think about it, the more possible errors I imagine. What if a user deletes their post as I am trying to use that document for the startAfterDocument filter...?
So, how should I make this query? Even better, where can I find best practices for such use cases? Are they all hidden from amateurs as they are used by large companies with professionals?
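For what it's worth, the "filter locally" suggestion mentioned in the question could work roughly like this: keep fetching timestamp-ordered batches from a cursor, drop the posts you don't want on the client, and stop once 10 have been collected. The Python below is only a sketch against an in-memory list; `make_fetch_page` is a hypothetical stand-in for an ordered query with a start-after cursor and a limit, and the batch size of 20 is an arbitrary assumption.

```python
from typing import Callable, Optional


def make_fetch_page(posts: list[dict]) -> Callable:
    """Build a fake 'ordered query': newest first, resumable after a cursor index."""
    ordered = sorted(posts, key=lambda p: p["timestamp"], reverse=True)

    def fetch_page(after: Optional[int], size: int) -> list[tuple[int, dict]]:
        start = 0 if after is None else after + 1
        return [(i, ordered[i]) for i in range(start, min(start + size, len(ordered)))]

    return fetch_page


def next_recommendations(fetch_page, cursor, exclude_user, want=10, batch=20):
    """Collect up to `want` posts not written by exclude_user; return them plus the new cursor."""
    results = []
    while len(results) < want:
        page = fetch_page(cursor, batch)
        if not page:
            break  # ran out of posts
        for idx, post in page:
            cursor = idx  # advance past every post we have looked at
            if post["user"] != exclude_user:
                results.append(post)
                if len(results) == want:
                    break
    return results, cursor


posts = [{"timestamp": i, "user": "nathantew" if i % 3 == 0 else "other"} for i in range(30)]
fetch = make_fetch_page(posts)
first, cur = next_recommendations(fetch, None, "nathantew")
second, cur = next_recommendations(fetch, cur, "nathantew")
print([p["timestamp"] for p in first])
print([p["timestamp"] for p in second])
```

The cursor here tracks the last post examined rather than the last post shown, so skipped posts are not re-read on the next call. The cost is that a call may read more than 10 documents, which is the trade-off the question already anticipates.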
QUERYING MONGODB: RETRIEVE SHOPS BY NAME AND BY LOCATION WITH ONE SINGLE QUERY
Hi folks!
I'm building a "search shops" application using MEAN Stack.
I store shop documents in a MongoDB "location" collection like this:
{
_id: .....
name: ...//shop name
location : //...GEOJson
}
The UI gives users one single input for searching shops. Basically, I would like to perform one single query that retrieves, in the same results array:
All shops near the user (eventually limit to x)
All shops named "like" the input value
On the logical side, I think this is an "$or like" query.
Based on this answer
Using full text search with geospatial index on Mongodb
Assigning two special indexes (2dsphere and full text) to the collection is probably not the right way to achieve this. In any case, I think mine is a different case, because I don't want to apply sequential filters to the results; I "simply" want to retrieve data matching two distinct criteria.
If I do set both indexes on my collection, the obvious approach is to perform two distinct queries with two distinct methods ($near for locations and $text for names), and then merge the results with some server-side logic to remove duplicate documents and sort them in a way that is useful for the user experience. But I'm still wondering whether there is a way to achieve this result with one single query.
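The server-side merge step described above might look something like the following Python sketch. It does not talk to MongoDB: the two input lists stand in for the results of the $near query and the $text query, and `merge_shops` is a hypothetical helper. Putting nearby shops first is just one possible ordering choice.

```python
def merge_shops(nearby: list[dict], by_name: list[dict]) -> list[dict]:
    """Concatenate both result lists, keeping the first occurrence of each _id."""
    seen = set()
    merged = []
    for shop in [*nearby, *by_name]:
        if shop["_id"] not in seen:
            seen.add(shop["_id"])
            merged.append(shop)
    return merged


# Stand-ins for the two query results
nearby = [{"_id": 1, "name": "Corner Bakery"}, {"_id": 2, "name": "Bike Shop"}]
by_name = [{"_id": 2, "name": "Bike Shop"}, {"_id": 3, "name": "Bike World"}]

print(merge_shops(nearby, by_name))
```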
So, the question is: is this possible, or is this kind of approach outside MongoDB's purpose?
Hope this is clear and hope that someone can teach something today!
Thanks
I would like to store and query documents that contain a from-to date range, where the range represents an interval when the document has been valid.
Typical use cases in lucene/solr documentation address the opposite problem: Querying for documents that contain a single timestamp and this timestamp is contained in a date range provided as query parameter. (createdate:[1976-03-06T23:59:59.999Z TO *])
I want to use the edismax parser.
I have found the ms() function, which seems to me to be designed for boosting score only, not to eliminate non-matching results entirely.
I have found the article Spatial Search Tricks for People Who Don't Have Spatial Data, where the problem described by me is said to be Easy... (Find People Alive On May 25, 1977).
Is there any simpler way to express something like
date_from_query:[valid_from_field TO valid_to_field] than using the spatial approach?
The most direct approach is to create the bounds yourself:
valid_from_field:[* TO date_from_query] AND valid_to_field:[date_from_query TO *]
.. which would give you documents where the valid_from_field is earlier than the date you're querying, and the valid_to_field is later than the date you're querying, in effect, extracting the interval contained between valid_from_field and valid_to_field. This assumes that neither field is multi valued.
I'd probably add it as a filter query, since you don't need any scoring from it, and you probably want to allow other search queries at the same time.
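As a rough illustration of the filter-query suggestion, the snippet below builds the request parameters in Python. The field names come from the question; `validity_filter` is a hypothetical helper, the `q` value is a placeholder, and the sketch assumes both fields are single-valued date fields and that the date is passed as a timezone-aware datetime.

```python
from datetime import datetime, timezone


def validity_filter(moment: datetime,
                    from_field: str = "valid_from_field",
                    to_field: str = "valid_to_field") -> str:
    """Build a range filter matching documents whose validity interval contains `moment`."""
    # Solr date fields expect UTC timestamps in ISO 8601 form ending in 'Z'
    stamp = moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{from_field}:[* TO {stamp}] AND {to_field}:[{stamp} TO *]"


params = {
    "defType": "edismax",          # parser requested in the question
    "q": "some user query",        # placeholder for the scored main query
    "fq": validity_filter(datetime(1977, 5, 25, tzinfo=timezone.utc)),
}
print(params["fq"])
```

Keeping the interval check in `fq` leaves the main `q` free for the user's search terms, and the filter does not affect scoring.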
Since it is not possible to find "blueberry" by searching for the word "blue" with a MongoDB full text search, I want to help my users complete the word "blue" to "blueberry". To do so, is it possible to query all the words in a MongoDB full text index, so that I can use the words as suggestions, e.g. for typeahead.js?
Language stemming in text search uses an algorithm to try to relate words derived from a common base (eg. "running" should match "run"). This is different from the prefix match (eg. "blue" matching "blueberry") that you want to implement for an autocomplete feature.
To most effectively use typeahead.js with MongoDB text search I would suggest focusing on the prefetch support in typeahead:
Create a keywords collection which has the common words (perhaps with usage frequency count) used in your collection. You could create this collection by running a Map/Reduce across the collection you have the text search index on, and keep the word list up to date using a periodic Incremental Map/Reduce as new documents are added.
Have your application generate a JSON document from the keywords collection with the unique keywords (perhaps limited to "popular" keywords based on word frequency to keep the list manageable/relevant).
You can then use the generated keywords JSON for client-side autocomplete with typeahead's prefetch feature:
$('.mysearch .typeahead').typeahead({
name: 'mysearch',
prefetch: '/data/keywords.json'
});
typeahead.js will cache the prefetch JSON data in localStorage for client-side searches. When the search form is submitted, your application can use the server-side MongoDB text search to return the full results in relevance order.
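The keyword-list generation step could be prototyped along these lines. This Python sketch is a stand-in for the Map/Reduce job, not a replacement for it: it counts words over an in-memory list of texts, `build_keywords` is a hypothetical helper, and the output is assumed to be a flat JSON array of strings (check what your typeahead.js version expects for prefetch data).

```python
import json
import re
from collections import Counter


def build_keywords(texts: list[str], min_count: int = 1, limit: int = 1000) -> list[str]:
    """Return the most frequent words, most popular first, ties broken alphabetically."""
    counts = Counter(word for text in texts for word in re.findall(r"[a-z]+", text.lower()))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, count in ranked if count >= min_count][:limit]


texts = ["Blueberry pie", "blueberry jam", "blue sky"]
keywords_json = json.dumps(build_keywords(texts, limit=3))
print(keywords_json)
```

No stemming is applied here, so the list contains the full words a user would expect to see as suggestions.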
A simple workaround I am doing right now is to break the text into individual chars stored as a text indexed array.
Then when you do the $search query you simply break up the query into chars again.
Please note that this only works for short strings, say shorter than 32 characters. Otherwise, building the index takes a long time, and performance drops significantly when inserting new records.
You can not query for all the words in the index, but you can of course query the original document's fields. The words in the search index are also not always the full words, but are stemmed anyway. So you probably wouldn't find "blueberry" in the index, but just "blueberri".
Don't know if this might be useful to some new people facing this problem.
Depending on the size of your collection and how much RAM you have available, you can make a search by $regex, by creating the proper index. E.g:
db.collection.find( {query : {$regex: /querywords/}}).sort({'criteria': -1}).limit(limit)
You would need an index as follows:
db.collection.createIndex( { "query": 1, "criteria" : -1 } )
This could be really fast if you have enough memory.
Hope this helps.
For those who have not yet started implementing any database architecture and are here for a solution, go for Elasticsearch. It's a JSON document-driven database, structurally similar to MongoDB. It has an "edge-ngram" analyzer, which is really efficient and quick at giving you "did you mean" suggestions for misspelled searches. You can also search partially.
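To show why edge n-grams make partial (prefix) search cheap, here is a small conceptual sketch in Python. It is not Elasticsearch's implementation or API: `edge_ngrams` and `build_index` are hypothetical helpers, and the gram lengths are arbitrary. Each term is indexed under all of its leading substrings, so a prefix lookup becomes an exact key lookup.

```python
def edge_ngrams(term: str, min_gram: int = 1, max_gram: int = 10) -> list[str]:
    """Return the leading substrings of term with lengths min_gram..max_gram."""
    return [term[:n] for n in range(min_gram, min(max_gram, len(term)) + 1)]


def build_index(terms: list[str], min_gram: int = 2, max_gram: int = 10) -> dict[str, set[str]]:
    """Map every edge n-gram to the set of terms that start with it."""
    index: dict[str, set[str]] = {}
    for term in terms:
        for gram in edge_ngrams(term.lower(), min_gram, max_gram):
            index.setdefault(gram, set()).add(term)
    return index


index = build_index(["blueberry", "blackberry", "blue"])
print(edge_ngrams("blueberry", 2, 5))
print(sorted(index["blue"]))
```

The trade-off is index size: every term contributes one entry per gram length, which is why real deployments usually cap the maximum gram length.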