How to create an index exemption on Firestore subdocuments? - google-cloud-firestore

We have a database structured as follows:
Collection foo
  Documents
    Collection bar
      Documents with many fields (approaching the 1 MB limit)
When I try to write a document containing 34571 fields to the bar collection, I get (from the Go API):
rpc error: code = InvalidArgument desc = too many builtin index entries for entity
OK, fine, it seems I need to add an exemption:
Large array or map fields
Large array or map fields can approach the limit of 20,000 index entries per document. If you are not querying based on a large array or map field, you should exempt it from indexing.
But how? The console only lets me set a single Collection ID and a single Field path, and / isn't accepted in either one. Using . is not clearly forbidden, but it results in a generic error when I try to save the exemption. I'm also not sure whether * is allowed.

Index exemptions are based on collection ID and not collection path. In this case, you can enter bar as the collection ID. This also means the exemption applies to all collections with ID bar, regardless of hierarchy.
As for the fields, you can specify only a single field path per exemption. The "*" all-selector is not supported. There is a limit of 200 index exemptions so you wouldn't be able to exempt all 34571 fields. If possible, I suggest moving your fields into a map. Then you could disable indexing on the map field.
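As a rough illustration of the map suggestion, the sketch below nests all but a few fields under one map field before the document is written. The function name nest_fields, the map field name data, and the sample field names are assumptions made for this example and are not part of the Firestore API; the resulting dictionary would be passed to whatever write call you already use.

```python
def nest_fields(doc, map_field="data", keep=()):
    """Return a new dict with every field not listed in `keep` moved under `map_field`."""
    nested = {k: v for k, v in doc.items() if k not in keep}
    top = {k: v for k, v in doc.items() if k in keep}
    top[map_field] = nested
    return top


# Hypothetical wide document: 34571 generated fields plus one field to keep queryable.
wide_doc = {f"f{i}": i for i in range(34571)}
wide_doc["name"] = "example"

compact = nest_fields(wide_doc, map_field="data", keep=("name",))
print(len(compact))          # 2 top-level fields: "name" and "data"
print(len(compact["data"]))  # 34571 fields now live inside the map
```

With this shape, a single exemption with collection ID bar and field path data should cover the nested fields, since subfields of a map inherit the map's exemption settings. The trade-off is that the fields inside data can no longer be queried.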

Related

Does length of indexed field matter while searching?

The chat app schema that I have is something like below.
1. conversations {participants[user_1, user_2], conversation_id}
2. messages {sender: user_1, conversation_id, timestamps}
I want to map this relationship using the existing _id:ObjectId, which is already indexed.
But if I want to get all conversations of user_1, I first have to find the conversations that user is involved in, get each conversation's _id, and then search messages again using that conversation _id.
So my questions are:
Does the length of the indexed field (here _id) matter while searching?
Should I create another, shorter indexed field?
Also, if there is a better alternative schema, please suggest it.
I would suggest maintaining the data as sub-documents instead of an array. The advantage is that you can build a separate index on just the conversation_id field, which is the field you query to find the conversations a user is involved in.
When you maintain the data as an array, you cannot index the conversation_id field separately. Instead you have to build a multikey index, which indexes all the elements of the array, including the sender and timestamps fields that you are never going to query on, and this also increases the index size.
Answering your questions:
Does the length of the indexed field (here _id) matter while searching? - Not really.
Should I create another, shorter indexed field? - Create a sub-document and index conversation_id.
Also, if there is a better alternative schema, please suggest it. - Maintain the array fields as sub-documents.
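The two-step lookup described in the question can be sketched on in-memory data as follows. The sample documents, the ids c1 and c2, and the helper name messages_for_user are made up for illustration; against MongoDB the two steps would correspond roughly to a filter such as {"participants": user} on conversations followed by {"conversation_id": {"$in": ids}} on messages.

```python
# Hypothetical sample data shaped like the schema in the question.
conversations = [
    {"_id": "c1", "participants": ["user_1", "user_2"]},
    {"_id": "c2", "participants": ["user_2", "user_3"]},
]
messages = [
    {"sender": "user_1", "conversation_id": "c1", "timestamp": 1},
    {"sender": "user_2", "conversation_id": "c1", "timestamp": 2},
    {"sender": "user_3", "conversation_id": "c2", "timestamp": 3},
]


def messages_for_user(user, conversations, messages):
    """Step 1: collect the ids of the user's conversations. Step 2: filter messages by those ids."""
    ids = {c["_id"] for c in conversations if user in c["participants"]}
    return [m for m in messages if m["conversation_id"] in ids]


result = messages_for_user("user_1", conversations, messages)
print([m["timestamp"] for m in result])  # [1, 2]
```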

Firestore index on maps and array - clarification

I'm trying to understand how Firestore creates indexes on fields. Given the following sample document, how are indexes created, especially for the maps/arrays?
I read the documentation at Index types in Cloud Firestore multiple times and I'm still unsure. There it says:
Automatic indexing
By default, Cloud Firestore automatically maintains single-field indexes for each field in a document and each subfield in a map. Cloud Firestore uses the following default settings for single-field indexes:
For each non-array and non-map field, Cloud Firestore defines two collection-scope single-field indexes, one in ascending mode and one in descending mode.
For each map field, Cloud Firestore creates one collection-scope ascending index and one descending index for each non-array and non-map subfield in the map.
For each array field in a document, Cloud Firestore creates and maintains a collection-scope array-contains index.
Single-field indexes with collection group scope are not maintained by default.
If I understand this correctly, an index is created for each of these fields, even for the values in the alternate_names array.
So if I want to search for any document where fields.alternate_names contains a value of (for example) "Caofang", Firestore would use an index for its search.
Is my assumption/understanding correct?
No, your understanding is not correct. fields.alternate_names is an array subfield in a map field, which means it would not satisfy the requirements in the second point. You can test your assumption simply by issuing the query. If the query fails, you will see in the error message that it failed due to lack of index.
Firestore will simply not allow queries that are not indexed. The error message from that failure will contain a link to the console that will let you create the index necessary for that query, if such a thing is possible.
If you want to be able to query the contents of fields.alternate_names, consider promoting it to its own top-level field, which will be indexed by default.
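A minimal sketch of that promotion step, applied to the document before it is written, might look like this. The helper name promote_subfield and the sample document are assumptions for illustration; only the field path fields.alternate_names and the value "Caofang" come from the question.

```python
import copy


def promote_subfield(doc, map_field, subfield):
    """Return a copy of `doc` with doc[map_field][subfield] moved to the top level."""
    result = copy.deepcopy(doc)
    inner = result.get(map_field)
    if isinstance(inner, dict) and subfield in inner:
        result[subfield] = inner.pop(subfield)
    return result


# Hypothetical document; the second alternate name is invented.
place = {
    "fields": {
        "name": "Caofang",
        "alternate_names": ["Caofang", "Caofangcun"],
    }
}

promoted = promote_subfield(place, "fields", "alternate_names")
print(sorted(promoted))                         # ['alternate_names', 'fields']
print("alternate_names" in promoted["fields"])  # False
```

An array-contains query would then target the top-level alternate_names field rather than the nested path.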

Cloud Firestore: check if value exists without knowing field name

I would like to check whether a certain value is present in any field of any document in my Cloud Firestore collection, and get back the ID of every document that has at least one field with that value.
In this example, the code should give back only 2 records when I look for "Peter": 8cyMJG7uNgVoenA63brG and fnk0kgW7gSBc3EdOYWxD.
I know how to do a search when the field name is known. But in this case, I cannot know the field name in advance.
If you don't know the name of a field, you can't perform any queries against its value. Firestore requires queries to use some index, and indexes always work with the names of fields in your documents.

Difference between wildcard search and individual text search

Is there a difference between a wildcard text index like $** and text indexes that I create for each of the fields in the collection?
I do see a small difference in response time when I create text indexes individually: the individual indexes respond faster. I am not able to post an example now, but will try to.
A wildcard text index will index every field that contains string data for each document in the collection (https://docs.mongodb.com/manual/core/index-text/#wildcard-text-indexes).
Because a wildcard text index essentially increases the number of fields indexed, searches against it can take longer than searches against a text index that targets specific fields.
Since you can only have one text index per collection (https://docs.mongodb.com/manual/core/index-text/#create-text-index), it's worth considering beforehand which fields you plan on querying against.

Mongo query for number of items in a sub collection

This seems like it should be very simple but I can't get it to work. I want to select all documents A where there are one or more B elements in a sub collection.
Like if a Store document had a collection of Employees. I just want to find Stores with 1 or more Employees in it.
I tried something like:
{Store.Employees:{$size:{$ne:0}}}
or
{Store.Employees:{$size:{$gt:0}}}
Just can't get it to work.
This isn't supported. $size only matches documents whose array length is exactly equal to the given value; it does not accept range conditions such as $gt or $ne.
What people normally do is that they cache array length in a separate field in the same document. Then they index that field and make very efficient queries.
Of course, this requires a little bit more work from you (not forgetting to keep that length field current).
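The cached-length approach can be sketched as below. The counter field name employee_count, the array field name Employees, and the helper names are assumptions for this example. The update document uses the $push and $inc operators so that the array and the counter change together in one update, and apply_locally is only an in-memory stand-in for the server so the sketch runs without a database.

```python
def add_employee_update(employee):
    """Build one update document that appends to the array and bumps the cached length."""
    return {
        "$push": {"Employees": employee},
        "$inc": {"employee_count": 1},
    }


# Filter for stores with one or more employees, using the cached (and indexable) counter.
NON_EMPTY_STORES = {"employee_count": {"$gt": 0}}


def apply_locally(store, update):
    """Tiny in-memory stand-in for the server applying $push and $inc."""
    for field, value in update.get("$push", {}).items():
        store.setdefault(field, []).append(value)
    for field, amount in update.get("$inc", {}).items():
        store[field] = store.get(field, 0) + amount
    return store


# Hypothetical store document.
store = {"name": "Main St", "Employees": [], "employee_count": 0}
apply_locally(store, add_employee_update({"name": "Ann"}))
print(store["employee_count"], len(store["Employees"]))  # 1 1
```

Every code path that removes an employee would need a matching decrement, which is the extra bookkeeping mentioned above.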