How to query multiple collections in MongoDB (without using $lookup)?

I would like to create a single query that gets data from three different collections by providing a single query parameter. I have seen methods that use $lookup, but I do not want to use it because it cannot be used on sharded collections.
Here is an example to explain further.
I have three collections: user, chatroom and chatMessage.
user collection:
{
_id: ObjectId('456'),
username: 'John',
contacts: [
{
_id: ObjectId('AB12'),
name: 'Mary',
_idOfContact: ObjectId('123'),
},
{
_id: ObjectId('AB34'),
name: 'Jane',
_idOfContact: ObjectId('234'),
},
{
_id: ObjectId('AB56'),
name: 'Peter',
_idOfContact: ObjectId('345'),
}
],
}
chatroom collection:
{
_id: ObjectId('AB34'),
usersInThisChatRoom: [
ObjectId("456"),
ObjectId("123"),
ObjectId("234"),
]
}
chatMessage collection:
[
{
_id: ObjectId("M01"),
chatRoomObjectId: ObjectId('AB34'),
senderObjectId: ObjectId('456'),
message: 'Hello humans!',
date: ISODate("2019-09-03T07:24:28.742Z"),
},
...(other messages)
]
What I would like to be returned
[
{
chatRoomObjectId: ObjectId('AB34'),
usersInThisChatRoom: [
{
contactName: 'John',
contactUserId: ObjectId('456'),
},
{
contactName: 'Mary',
contactUserId: ObjectId('123'),
},
{
contactName: 'Jane',
contactUserId: ObjectId('234'),
}
],
chatMessages: [
{
_id: ObjectId("M01"),
senderObjectId: ObjectId('456'),
message: 'Hello humans!',
date: ISODate("2019-09-03T07:24:28.742Z"),
},
...(other messages)
]
},
...(other documents)
]
Is there a way to get my desired results by making a single query using the user._id and will that be performance friendly?
Or, do I have to make several queries, one after another, to achieve what I want?

According to this answer, you cannot perform a single query across multiple collections (aside from using the $lookup aggregation pipeline stage).
This means that you either use the $lookup aggregation stage or make several queries to the database and combine the results in your application.
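As a rough illustration of the "several queries" route, here is a minimal sketch of how the three result sets could be stitched together in application code. It is only a sketch under assumptions: plain arrays with string ids stand in for the query results, the function name assembleChatRooms is made up, and the display names are taken from the user's own contacts list.

```javascript
// Hypothetical results of three separate queries (string ids replace ObjectId):
//   1. user.findOne({ _id: userId })
//   2. chatroom.find({ usersInThisChatRoom: userId })
//   3. chatMessage.find({ chatRoomObjectId: { $in: roomIds } })
const user = {
  _id: '456',
  username: 'John',
  contacts: [
    { name: 'Mary', _idOfContact: '123' },
    { name: 'Jane', _idOfContact: '234' },
  ],
};
const chatrooms = [{ _id: 'AB34', usersInThisChatRoom: ['456', '123', '234'] }];
const chatMessages = [
  { _id: 'M01', chatRoomObjectId: 'AB34', senderObjectId: '456', message: 'Hello humans!' },
];

// assembleChatRooms is an illustrative helper, not part of any MongoDB API.
function assembleChatRooms(user, chatrooms, chatMessages) {
  // Map user ids to display names: the user's own name plus their contacts.
  const nameById = new Map([[user._id, user.username]]);
  for (const contact of user.contacts) {
    nameById.set(contact._idOfContact, contact.name);
  }

  return chatrooms.map((room) => ({
    chatRoomObjectId: room._id,
    usersInThisChatRoom: room.usersInThisChatRoom.map((id) => ({
      contactName: nameById.get(id), // undefined if the id is not a known contact
      contactUserId: id,
    })),
    // Keep only this room's messages and drop the room id from each one.
    chatMessages: chatMessages
      .filter((m) => m.chatRoomObjectId === room._id)
      .map(({ chatRoomObjectId, ...rest }) => rest),
  }));
}

const result = assembleChatRooms(user, chatrooms, chatMessages);
console.log(JSON.stringify(result, null, 2));
```

With real ObjectId values the ids are objects, so the === comparison and the Map keys would need string forms of the ids (for example via toString()).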

Related

How to only return X amount of embedded documents with MongoDB?

I have a large collection called posts, like so:
[{
_id: 349348jf49rk,
user: frje93u45t,
comments: [{
_id: fks9272ewt
user: 49wnf93hr9,
comment: "Hello world"
}, {
_id: j3924je93h
user: 49wnf93hr9,
comment: "Heya"
}, {
_id: 30283jt9dj
user: dje394ifjef,
comment: "Text"
}, {
_id: dkw9278467
user: fgsgrt245,
comment: "Hola"
}, {
_id: 4irt8ej4gt
user: 49wnf93hr9,
comment: "Test"
}]
}]
My comments subdocument can sometimes be 100s of documents long. My question is, how can I return just the 3 newest documents (based on the ID) instead of all the documents, and return the length of all documents as totalNumberOfComments as a count instead? I need to do this for 100s of posts sometimes. This is what the final result would look like:
[{
_id: 349348jf49rk,
user: frje93u45t,
totalNumberOfComments: 5,
comments: [{
_id: fks9272ewt
user: 49wnf93hr9,
comment: "Hello world"
}, {
_id: j3924je93h
user: 49wnf93hr9,
comment: "Heya"
}, {
_id: 30283jt9dj
user: dje394ifjef,
comment: "Text"
}]
}]
I understand that this could be completed after MongoDB returns the data by splicing, although I think it would be best to do this within the query so that Mongo doesn't have to return all comments for every single post all the time.
Does this solve your problem? Try plugging in the _id values, see what is missing, and post the results here.
Begin with this query:
db.collection.aggregate([
{$match: {_id: "349348jf49rk"}},
{$project: {
_id: 1,
user: 1,
totalNumberOfComments: {$size: "$comments"},
// in an aggregation $project, $slice takes the array and a count;
// use -3 instead of 3 to take the last three elements
comments: {$slice: ["$comments", 3]}
}}
])
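To make it easier to check what the $project stage should produce, here is a plain-JavaScript model of it (no database involved). This is only a sketch: projectPost is a made-up helper, and it assumes comments are appended in chronological order, so the newest ones are the last n elements, which corresponds to { $slice: ["$comments", -n] } in the aggregation.

```javascript
// Sample post shaped like the one in the question (ids quoted as strings).
const post = {
  _id: '349348jf49rk',
  user: 'frje93u45t',
  comments: [
    { _id: 'fks9272ewt', user: '49wnf93hr9', comment: 'Hello world' },
    { _id: 'j3924je93h', user: '49wnf93hr9', comment: 'Heya' },
    { _id: '30283jt9dj', user: 'dje394ifjef', comment: 'Text' },
    { _id: 'dkw9278467', user: 'fgsgrt245', comment: 'Hola' },
    { _id: '4irt8ej4gt', user: '49wnf93hr9', comment: 'Test' },
  ],
};

// projectPost is an illustrative helper; n is assumed to be a positive integer.
function projectPost(post, n) {
  return {
    _id: post._id,
    user: post.user,
    totalNumberOfComments: post.comments.length, // like { $size: "$comments" }
    comments: post.comments.slice(-n), // like { $slice: ["$comments", -n] }
  };
}

const projected = projectPost(post, 3);
console.log(projected.totalNumberOfComments, projected.comments.map((c) => c.comment));
```

Doing the same thing inside the aggregation keeps the untouched comments from being sent over the wire, which is the point of the question.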

Mongodb how to use $elemMatch to limit results

My problem is: I have a structure similar to this:
{
id: 1,
participants: [
{ name: "joe", status: 0 },
{ name: "james", status: 2}
],
content: "mongomongo"
}
{
id: 2,
participants: [
{ name: "joe", status: 1 },
{ name: "jordan", status: 3}
],
content: "dongodongo"
}
What I want to do is run a query with almost the same effect as this:
db.find({ '_id': { $in: someArray}}, { participants: {$elemMatch: {'name': someName }}})
I would specify an array of object IDs for the $in, and then I would provide a username. What happens is that it gives me back both objects, but the participants array only has the entry that the $elemMatch found:
{
id: 1,
participants: [
{ name: "joe", status: 0 }
]
}
{
id: 2,
participants: [
{ name: "joe", status: 1 }
]
}
This is what I want, but the part that I DON'T want is that it leaves out other fields (namely content). How can I adjust the query so that it still returns one entry in the participants array, but also returns the other fields such as content?
Thank you in advance!
Actually found the solution to my question. Just had to tweak the original query I used. I had confused the projection field and the options field since I was using Mongoose to manage mongodb interactions.
Here's the query that works:
db.find({ '_id': { $in: someArray}}, { participants: {$elemMatch: {'name': someName }}, content: 1, [anything] : 1});
EDIT:
I misunderstood the original post and example. If the only other field you are worried about returning is 'content', then you could add it to the projection argument like so:
db.collection.find(
{
'_id': {
$in: someArray
}
},
{
'participants': {
$elemMatch: {
'name': someName
}
},
'content' : 1
}
)
Hope this helps!
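For anyone unsure what the $elemMatch projection does to each document, the following plain-JavaScript model may help. It is a sketch, not driver code: projectFirstMatch is a made-up name, and it mimics the behaviour that only the first matching array element is kept and that the array field is left out when nothing matches.

```javascript
// The two sample documents from the question.
const docs = [
  { id: 1, participants: [{ name: 'joe', status: 0 }, { name: 'james', status: 2 }], content: 'mongomongo' },
  { id: 2, participants: [{ name: 'joe', status: 1 }, { name: 'jordan', status: 3 }], content: 'dongodongo' },
];

// Models the projection { participants: { $elemMatch: { name } }, content: 1 }.
function projectFirstMatch(doc, someName) {
  const out = { id: doc.id, content: doc.content };
  const match = doc.participants.find((p) => p.name === someName);
  if (match) {
    out.participants = [match]; // only the first matching element is kept
  }
  return out; // participants is omitted when nothing matches
}

const projectedDocs = docs.map((d) => projectFirstMatch(d, 'joe'));
console.log(JSON.stringify(projectedDocs));
```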

MongoDB aggregation framework approach to a multi-doc query

I am looking into the best way to organize filtering. I have the following document format:
{
_id: "info",
ids: ["id1", "id2", "id3"]
}
{
_id: "id1",
value: 5
}
{
_id: "id2",
value: 1
}
{
_id: "id3",
value: 5
}
I need to make the following query: get all documents by id from doc "info" and then filter them out by value 5. So, that result would be something like:
{
_id: "id1",
value: 5
}
{
_id: "id3",
value: 5
}
I suppose I need to do unwind on ids, but how do I then select all documents that match those values? Or maybe I should just use $in operator somehow to grab all documents and after that do filtering?
Any help is appreciated. Thanks.
If it is only MongoDB shell/script, I would do it like this:
db.ids.find({ _id: { $in: db.ids.findOne({ _id: "info" }).ids }, value: 5 })
There are also worse alternatives, such as the eval command (deprecated in MongoDB 3.0 and removed in 4.2):
db.runCommand({
eval: function(value) {
var ids = db.ids.findOne({ _id: "info" }).ids;
return db.ids.find({ _id: { $in: ids }, value: value }).toArray();
},
args: [5]
})
or the $where operator (low performance because you execute one find for each candidate result with value 5):
db.ids.find({
value: 5,
$where: "db.ids.findOne({ _id: 'info', ids: this._id })"
})
But if you are trying to run the queries through a MongoDB driver, the story might be different.

How to find a subdocument by id in mongoose and exclude some fields

I have a document stored in MongoDB:
shop: {
_id: '...',
title: 'my shop',
users: [
{
_id: '...',
name: 'user1',
username: '...'
},
{
_id: '...',
name: 'user2',
username: '...'
}
]
}
I use this query to get a subdocument user by his id:
Shop.findOne({'users._id': userId}, {'users.$': 1}, function (err, user) {
console.log(user);
});
Output:
{ _id: ...,
users:
[{
name: 'user1',
username: '...',
_id: ...
}]
}
How can I filter the result to return only the user's name?
The way I do it now:
Shop.findOne({'users._id': userId}, {'users.$': 1}, function (err, shop) {
shop = shop.toObject()
user = shop.users[0]
filtered = {
name: user.name
}
callback(filtered);
});
But is there a better way to do it all in the query?
This question is almost two years old, but I noticed that people are still looking for a solution to this problem. fernandopasik's answer helped me very much, but is missing a code sample on how to use the suggested aggregation operations. That's why I post a more detailed answer.
The document I used is:
{
_id: '...',
title: 'my shop',
users: [
{
_id: 'user1Id',
name: 'user1',
username: '...'
},
{
_id: 'user2Id',
name: 'user2',
username: '...'
}
]
}
The solution I came up with (after reading the mongodb docs about aggregation) was:
Shop.aggregate([
{$unwind: '$users'},
{$match: {'users._id': 'user2Id'}},
{$project: {_id: 0, 'name': '$users.name'}}
]);
To understand how the aggregation is working, it's best to try one operation at a time and read the mongodb docs of this operation.
Shop.aggregate([{$unwind: '$users'}])
1. $unwind deconstructs the users array (don't forget to include $ on the array name), so you end up with one document per array element:
{
_id: '...',
title: 'my shop',
users: {
_id: 'user1Id',
name: 'user1',
username: '...'
}
}
{
_id: '...',
title: 'my shop',
users: {
_id: 'user2Id',
name: 'user2',
username: '...'
}
}
2. Applying {$match: {'users._id': 'user2Id'}} to the output of $unwind (the two docs in this example) returns only the whole document where users._id is 'user2Id':
{
_id: '...',
title: 'my shop',
users: {
_id: 'user2Id',
name: 'user2',
username: '...'
}
}
3. To return only name: 'user2' you can use {$project: {_id: 0, 'name': '$users.name'}}:
{name: 'user2'}
The aggregation pipeline is not easy to grasp at first. I recommend reading through the MongoDB aggregation docs and trying one aggregation operation at a time. It is sometimes hard to spot the error in the whole aggregation pipeline. Most of the time you simply get no result document from the pipeline when there is an error somewhere in it.
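If it helps to see the three stages without a database, here is a rough plain-JavaScript model of them, run on an in-memory copy of the sample document. It is only meant to illustrate what each stage does to the data; the username values 'u1' and 'u2' and the shop id 'shop1' are placeholders I made up.

```javascript
// In-memory stand-in for the shops collection (placeholder ids and usernames).
const shops = [
  {
    _id: 'shop1',
    title: 'my shop',
    users: [
      { _id: 'user1Id', name: 'user1', username: 'u1' },
      { _id: 'user2Id', name: 'user2', username: 'u2' },
    ],
  },
];

// Like { $unwind: '$users' }: one output document per array element,
// with the users field replaced by that single element.
const unwound = shops.flatMap((shop) =>
  shop.users.map((u) => ({ ...shop, users: u }))
);

// Like { $match: { 'users._id': 'user2Id' } }: keep only matching documents.
const matched = unwound.filter((doc) => doc.users._id === 'user2Id');

// Like { $project: { _id: 0, name: '$users.name' } }: keep only the name.
const projectedNames = matched.map((doc) => ({ name: doc.users.name }));

console.log(unwound.length, JSON.stringify(projectedNames));
```

Logging the intermediate arrays one at a time mirrors the advice above about trying one aggregation operation at a time.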
You should try mongodb aggregation framework with mongoose:
http://mongoosejs.com/docs/api.html#aggregate-js
I suggest you first apply unwind users and then match the user id and then project the username.

Mongodb aggregate on subdocument in array

I am implementing a small application using mongodb as a backend. In this application I have a data structure where the documents will contain a field that contains an array of subdocuments.
I use the following use case as a basis:
http://docs.mongodb.org/manual/use-cases/inventory-management/
As you can see from the example, each document has a field called carted, which is an array of subdocuments.
{
_id: 42,
last_modified: ISODate("2012-03-09T20:55:36Z"),
status: 'active',
items: [
{ sku: '00e8da9b', qty: 1, item_details: {...} },
{ sku: '0ab42f88', qty: 4, item_details: {...} }
]
}
This fits me perfectly, except for one problem:
I want to count each unique item (with "sku" as the unique identifier key) in the entire collection where each document adds the count by 1 (multiple instances of the same "sku" in the same document will still just count 1). E.g. I would like this result:
{ sku: '00e8da9b', doc_count: 1 },
{ sku: '0ab42f88', doc_count: 9 }
After reading up on MongoDB, I am quite confused about how to do this (fast) when you have a complex schema as described above. If I have understood the otherwise excellent documentation correctly, such an operation may perhaps be achieved using either the aggregation framework or the map/reduce framework, but this is where I need some input:
Which framework would be better suited to achieve the result I am looking for, given the complexity of the structure?
What kind of indexes would be preferred in order to gain the best possible performance out of the chosen framework?
MapReduce is slow, but it can handle very large data sets. The Aggregation framework on the other hand is a little quicker, but will struggle with large data volumes.
The trouble with your structure shown is that you need to "$unwind" the arrays to crack open the data. This means creating a new document for every array item and with the aggregation framework it needs to do this in memory. So if you have 1000 documents with 100 array elements it will need to build a stream of 100,000 documents in order to groupBy and count them.
You might want to consider whether there's a schema layout that will serve your queries better, but if you want to do it with the Aggregation framework here's how you could do it (with some sample data so the whole script will drop into the shell):
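// clear the test collection (recent shells require an explicit filter)
db.so.remove({});
// createIndex replaces the deprecated ensureIndex
db.so.createIndex({ "items.sku": 1}, {unique:false});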
db.so.insert([
{
_id: 42,
last_modified: ISODate("2012-03-09T20:55:36Z"),
status: 'active',
items: [
{ sku: '00e8da9b', qty: 1, item_details: {} },
{ sku: '0ab42f88', qty: 4, item_details: {} },
{ sku: '0ab42f88', qty: 4, item_details: {} },
{ sku: '0ab42f88', qty: 4, item_details: {} },
]
},
{
_id: 43,
last_modified: ISODate("2012-03-09T20:55:36Z"),
status: 'active',
items: [
{ sku: '00e8da9b', qty: 1, item_details: {} },
{ sku: '0ab42f88', qty: 4, item_details: {} },
]
},
]);
db.so.runCommand("aggregate", {
pipeline: [
{ // optional filter to exclude inactive elements - can be removed
// you'll want an index on this if you use it too
$match: { status: "active" }
},
// unwind creates a doc for every array element
{ $unwind: "$items" },
{
$group: {
// group by unique SKU, but you only wanted to count a SKU once per doc id
_id: { _id: "$_id", sku: "$items.sku" },
}
},
{
$group: {
// group by unique SKU, and count them
_id: { sku:"$_id.sku" },
doc_count: { $sum: 1 },
}
}
]
//,explain:true
})
Note that I've $group'd twice, because you said that an SKU can only count once per document, so we need to first sort out the unique doc/sku pairs and then count them up.
If you want the output a little different (in other words, EXACTLY like in your sample) we can $project them.
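To sanity-check what the two $group stages are meant to compute, here is a small plain-JavaScript model of the same counting logic, run on a trimmed copy of the sample data. It is a sketch only: countDocsPerSku is a made-up helper, and a Set per document plays the role of the first $group (unique doc/sku pairs) while the Map plays the role of the second (count per SKU).

```javascript
// Trimmed copy of the sample data: only the fields the count depends on.
const docs = [
  { _id: 42, items: [{ sku: '00e8da9b' }, { sku: '0ab42f88' }, { sku: '0ab42f88' }, { sku: '0ab42f88' }] },
  { _id: 43, items: [{ sku: '00e8da9b' }, { sku: '0ab42f88' }] },
];

// countDocsPerSku is an illustrative helper, not a MongoDB API.
function countDocsPerSku(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // First $group: collapse duplicates into unique (doc, sku) pairs.
    const uniqueSkus = new Set(doc.items.map((item) => item.sku));
    // Second $group: add one per document for each unique SKU.
    for (const sku of uniqueSkus) {
      counts.set(sku, (counts.get(sku) || 0) + 1);
    }
  }
  return [...counts].map(([sku, doc_count]) => ({ sku, doc_count }));
}

const skuCounts = countDocsPerSku(docs);
console.log(JSON.stringify(skuCounts));
```

Both SKUs come out with a doc_count of 2 here, even though '0ab42f88' appears three times in document 42.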
With the latest mongo build (it may be true for other builds too), I've found that a slightly different version of cirrus's answer performs faster and consumes less memory. I don't know exactly why; it seems that with this version mongo has more opportunity to optimize the pipeline.
db.so.runCommand("aggregate", {
pipeline: [
{ $unwind: "$items" },
{
$group: {
// create array of unique sku's (or set) per id
_id: { id: "$_id"},
sku: {$addToSet: "$items.sku"}
}
},
// unroll all sets
{ $unwind: "$sku" },
{
$group: {
// then count unique values per each Id
_id: { id: "$_id.id", sku:"$sku" },
count: { $sum: 1 },
}
}
]
})
To match exactly the format asked for in the question, the document id should be left out of the final $group key, so that it groups by sku only.