Mongo query for number of items in a sub collection - mongodb

This seems like it should be very simple but I can't get it to work. I want to select all documents A where there are one or more B elements in a sub collection.
For example, suppose a Store document has a collection of Employees. I just want to find Stores with one or more Employees.
I tried something like:
{Store.Employees:{$size:{$ne:0}}}
or
{Store.Employees:{$size:{$gt:0}}}
Just can't get it to work.

This isn't supported. The $size operator only matches documents whose array length equals the given number; it doesn't accept range conditions such as $gt or $ne.
What people normally do is cache the array length in a separate field in the same document. They then index that field, which makes range queries on it very efficient.
Of course, this requires a little more work from you: you must remember to keep that length field current.
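As a rough illustration of the cached-length approach, here is a minimal sketch that builds the update and query documents as plain JavaScript objects. The field name employeeCount and the collection name stores are assumptions made for the example, not names from the question. The first helper shows a commonly used alternative for the specific "one or more" case: checking that array position 0 exists.

```javascript
// Matches documents whose array field has at least one element,
// by checking that position 0 of the array exists.
function nonEmptyArrayQuery(arrayField) {
  return { [`${arrayField}.0`]: { $exists: true } };
}

// Update that appends an item and bumps the cached length in the same
// operation. `countField` is a hypothetical name for the cached length.
function pushWithCount(arrayField, countField, item) {
  return {
    $push: { [arrayField]: item },
    $inc: { [countField]: 1 },
  };
}

// Range query on the cached length field.
function minCountQuery(countField, min) {
  return { [countField]: { $gte: min } };
}

const byPosition = nonEmptyArrayQuery("Employees");
const update = pushWithCount("Employees", "employeeCount", { name: "Ann" });
const byCount = minCountQuery("employeeCount", 1);

console.log(JSON.stringify(byPosition));
console.log(JSON.stringify(update));
console.log(JSON.stringify(byCount));

// In the mongo shell these would be used roughly as follows
// (the collection name `stores` is assumed):
//   db.stores.find(byPosition)
//   db.stores.update({ _id: storeId }, update)
//   db.stores.find(byCount)
```

Removing an employee would need the matching $pull with an $inc of -1, which is the bookkeeping the answer warns about.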

Related

Does length of indexed field matter while searching?

The chat app schema that I have is something like below.
1. conversations {participants[user_1, user_2], conversation_id}
2. messages {sender: user_1, conversation_id, timestamps}
I want to map this relationship using existing _id:ObjectId which is already indexed.
But if I want to get all conversations of user_1, I first have to find which conversations that user is involved in and get each conversation's _id, then search the messages collection using those conversation _ids.
So my questions are:
Does the length of an indexed field (here _id) matter while searching?
Should I create another, shorter indexed field?
Also, if there is a better alternative schema, please suggest it.
I would suggest maintaining the data as sub-documents instead of an array. The advantage is that you can build another index on only the conversation_id field, which is the one you query to find out the user's involvement.
When you maintain it as an array, you cannot index the conversation_id field separately. Instead you have to build a multikey index, which indexes all the elements of the array (including the sender and timestamps fields). You are never going to query on those, and they increase the index size.
Answering your questions:
Does the length of an indexed field (here _id) matter while searching? - Not really.
Should I create another, shorter indexed field? - Create a sub-document and index conversation_id.
Also, if there is a better alternative schema, please suggest it. - Maintain the array fields as sub-documents.
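For reference, the two-step lookup described in the question can be sketched as two query documents. This is only an illustration of the asker's original layout, not of the sub-document layout suggested above. The collection names and the literal ids are assumptions.

```javascript
// Step 1: conversations the user takes part in. Matching a single value
// against an array field matches documents where any element equals it.
function conversationsForUser(userId) {
  return { participants: userId };
}

// Step 2: messages that belong to any of the conversations found in step 1.
function messagesForConversations(conversationIds) {
  return { conversation_id: { $in: conversationIds } };
}

const step1 = conversationsForUser("user_1");
// "c1" and "c2" stand in for the _id values returned by step 1.
const step2 = messagesForConversations(["c1", "c2"]);

console.log(JSON.stringify(step1));
console.log(JSON.stringify(step2));

// Rough mongo shell usage (collection names assumed):
//   var ids = db.conversations.find(step1, { _id: 1 }).map(function (c) { return c._id; });
//   db.messages.find(messagesForConversations(ids))
```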

How to create an index exemption on Firestore subdocuments?

We have a database structured as follows:
Collection foo
  Documents
    Collection bar
      Documents with many fields (approaching the 1 MB limit)
When I try to write a document containing 34571 fields to the bar collection, I get (from the Go API):
rpc error: code = InvalidArgument desc = too many builtin index entries for entity
OK, fine, it seems I need to add an exemption:
Large array or map fields
Large array or map fields can approach the limit of 20,000 index entries per document. If you are not querying based on a large array or map field, you should exempt it from indexing.
But how? The console only lets me set a single collection name and a single field path, and slashes aren't accepted.
I tried other combinations, but / isn't accepted in either the Collection ID or the Field path, and using ., while not clearly forbidden, results in a generic error when trying to save the exemption. I'm also not sure if * is allowed.
Index exemptions are based on collection ID and not collection path. In this case, you can enter bar as the collection ID. This also means the exemption applies to all collections with ID bar, regardless of hierarchy.
As for the fields, you can specify only a single field path per exemption. The "*" all-selector is not supported. There is a limit of 200 index exemptions so you wouldn't be able to exempt all 34571 fields. If possible, I suggest moving your fields into a map. Then you could disable indexing on the map field.
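The suggestion to move the fields into a map can be sketched as a small reshaping step applied before the write. The map field name data and the sample field names are assumptions chosen for illustration. The exemption itself would still be configured separately for collection ID bar and field path data.

```javascript
// Nest every field under one map field, so that a single exemption on
// that field path can cover all of them.
function wrapInMap(fields, mapField) {
  return { [mapField]: { ...fields } };
}

// Hypothetical document with many top-level fields (only three shown).
const original = { sensor_0001: 1.5, sensor_0002: 2.5, sensor_0003: 3.5 };
const wrapped = wrapInMap(original, "data");

console.log(Object.keys(wrapped));             // top-level fields after wrapping
console.log(Object.keys(wrapped.data).length); // number of nested fields
```

The trade-off is that, with indexing disabled on the map field, queries that filter on any of its subfields are no longer available.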

What's the easiest way to return the results of a query for a given key/value pair in mongo as an array of the values returned?

I have a field called id (not _id) in documents from two collections. I need to compare the contents of the first collection with the second. Basically, I need to know which documents with a given 'id' value exist in collection 'A' but not in 'B'. What's the easiest way to build an array of ids from collection A that I can use to do something like the following?
db.B.find({id:{$nin: array_of_ids_from_coll_A}})
Please don't get hung up over why I'm using 'id' in this case, and not '_id'. Thanks.
Strictly speaking, this doesn't answer the question of 'how to build an array that...', but I'd iterate over collection A and, for each element, try to find a match in B. If none is found, add to a list.
This has a lot of roundtrips to the database, so it's not very fast, but it's very simple. Also, if A contains a lot of elements, the array of ids might be too large to throw all of them in the $nin, which otherwise would have to be solved by splitting up the array of ids. To make matters worse, $nin isn't efficient with indexes anyway.
I incorrectly assumed that the function 'distinct' returned a set of distinct documents based on a given 'field'. In fact, it returns an array of distinct values, provided a specific field. So, I was able to construct the array I was looking for with db.A.distinct('id'). Thanks to anyone who took the time to read this question, anyway.
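A minimal sketch of that approach, runnable without a database: distinctValues is a simplified in-memory stand-in for what db.A.distinct('id') returns, and the sample documents are made up for illustration.

```javascript
// Simplified stand-in for db.A.distinct(field): the unique values of one
// field, in first-seen order.
function distinctValues(docs, field) {
  return [...new Set(docs.map((doc) => doc[field]))];
}

// Query document selecting documents whose field is not in `values`.
function notInQuery(field, values) {
  return { [field]: { $nin: values } };
}

// Hypothetical contents of collection A.
const collA = [{ id: 1 }, { id: 2 }, { id: 2 }, { id: 3 }];

const idsFromA = distinctValues(collA, "id");
const query = notInQuery("id", idsFromA);

console.log(JSON.stringify(idsFromA));
console.log(JSON.stringify(query));

// In the mongo shell the same thing would be roughly:
//   var ids = db.A.distinct('id');
//   db.B.find({ id: { $nin: ids } })
```

As noted in the earlier answer, a very large id array may need to be split into batches.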

MongoDB $in not only one result in case of repeated elements

I need to get the users whose ids are contained in an array. For this I'm using the $in operator. However, since this is inside an aggregate operation, I'd like to get a specific user back every time its id is present in the array, not just once. For example:
The ids array is A=[a,b,c,b] and U(x) is the user with id x.
With users.find({_id:{$in:A}}) I get these users as the result: U(a),U(b),U(c)
Instead I'd like to get back the result: U(a),U(b),U(c),U(b)
so that the user is returned every time its id appears.
I understand that $in is working as expected, but does anyone have an idea of how I can achieve this?
Thanks
This isn't possible using a MongoDB query.
MongoDB's query engine iterates over the documents in a collection (or over an index if there's a useful one) and returns to you any documents that match your query, in the order it finds them. Whether b appears once, twice, or a hundred times in your query makes no difference: the document with _id of b matches the query and is returned once, when MongoDB finds it.
You can do a post-processing step in your programming language to repeat documents as many times as you want.
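One way such a post-processing step might look, as a sketch: index the returned documents by _id, then walk the original ids array and pick the matching document for each entry. The sample documents are made up. Ids are compared by their string form, which assumes the id type has a meaningful string conversion.

```javascript
// Repeat documents so the output follows the order and multiplicity of
// `ids`. Ids with no matching document are skipped.
function repeatByIds(ids, docs) {
  const byId = new Map(docs.map((doc) => [String(doc._id), doc]));
  return ids
    .map((id) => byId.get(String(id)))
    .filter((doc) => doc !== undefined);
}

const A = ["a", "b", "c", "b"];

// Stand-in for the result of users.find({ _id: { $in: A } }):
// each user appears once, in whatever order the server found them.
const found = [
  { _id: "c", name: "U(c)" },
  { _id: "a", name: "U(a)" },
  { _id: "b", name: "U(b)" },
];

const expanded = repeatByIds(A, found);
console.log(expanded.map((doc) => doc.name).join(","));
```

Note that repeated entries refer to the same object. Copy the documents if they will be modified independently.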

The fastest way to show Documents with certain property first in MongoDB

I have collections with a huge number of Documents on which I need to run custom searches with various different queries.
Each Document has a boolean property. Let's call it "isInTop".
I need to show Documents which have this property first in all queries.
Yes, I can easily sort on this field like:
.sort( { isInTop: -1 } );
And create a proper index with the field "isInTop" as the last field in it. But this will work slowly, as indexes in mongo work best with unique fields.
So is there a solution to show Documents with the field "isInTop" at the top of each query?
I see two solutions here.
First: give the Documents which need to be on top an _id from the "future". As you know, ObjectId contains a timestamp. So I can create an ObjectId with a timestamp from the future and use natural order.
Second: create a separate collection for Documents which need to be on top, and run queries against it first.
Are there any other solutions for this problem? Which will work faster?
UPDATE
I solved this by sorting on a custom field which represents rank.
Using the _id field trick you mention has the problem that at some point real time will catch up with the special timestamp, and you can't change the _id field (without inserting a new document and removing the old one).
Creating a special collection which just holds the ones you care about is probably the best option. It gives you the ability to logically (and to some extent, physically) separate the documents.
MongoDB has also recently introduced support for a "sparse" index, which may fulfill your needs as well. You could set the "isInTop" field only when you want a document to be special, and then create a sparse index on it, which would not have the problems you would normally have with a single indexed boolean field (in B-trees).
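The separate-collection option could be sketched roughly like this: run the query against the "top" collection first, then fill the rest of the page from the regular collection. The callbacks findTop and findRegular are hypothetical stand-ins for the two collection queries, implemented here over in-memory arrays so the sketch runs without a database.

```javascript
// Build one page of results: top documents first, then regular ones
// until `limit` is reached. `findTop` and `findRegular` are hypothetical
// callbacks taking (query, limit) and returning an array of documents.
function topFirstPage(findTop, findRegular, query, limit) {
  const top = findTop(query, limit);
  const remaining = limit - top.length;
  const rest = remaining > 0 ? findRegular(query, remaining) : [];
  return [...top, ...rest];
}

// In-memory stand-in for a collection query: filter on `category`, then limit.
const makeFinder = (docs) => (query, limit) =>
  docs.filter((doc) => doc.category === query.category).slice(0, limit);

const findTop = makeFinder([
  { name: "t1", category: "x" },
  { name: "t2", category: "y" },
]);
const findRegular = makeFinder([
  { name: "r1", category: "x" },
  { name: "r2", category: "x" },
  { name: "r3", category: "x" },
]);

const page = topFirstPage(findTop, findRegular, { category: "x" }, 3);
console.log(page.map((doc) => doc.name).join(","));
```

Paging beyond the first page needs extra bookkeeping (how many top documents were already shown), which this sketch leaves out.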