MongoDB: execute multiple queries in one round trip

Is there anything like Elasticsearch's Multi Search API (https://www.elastic.co/guide/en/elasticsearch/reference/current/search-multi-search.html) in MongoDB?
Consider that I have multiple queries: I want to hand all of them to Mongo in one request and get the results back in order.

Yes, there is something similar in MongoDB. Using the Aggregation Framework you can define multiple aggregation pipelines inside a $facet stage.
Try:
db.col.save({ a: 1 })
db.col.save({ a: 2 })
db.col.aggregate([
  {
    $facet: {
      query1: [ { $match: { a: 1 } }, { $project: { _id: 0 } } ],
      query2: [ { $match: { a: 2 } }, { $project: { _id: 0 } } ]
    }
  }
])
which prints:
{ "query1" : [ { "a" : 1 } ], "query2" : [ { "a" : 2 } ] }
Using $facet, keep in mind that a single BSON document can't exceed 16 MB, and the entire $facet result is returned as one document. See the aggregation pipeline limits in the MongoDB documentation for more.

Related

MongoDB: find documents from multiple collections

In Node.js and MongoDB, using Mongoose, how can I query multiple collections?
For example, I have 3 collections: mycollection1, mycollection2, mycollection3.
I want to create a query like findOne or findMany on mycollection*, and have the query return all the documents that exist in those collections (the same as I can do in Elasticsearch).
Thanks,
Larry
You can use $unionWith:
db.coll1.aggregate([
  { "$unionWith": { "coll": "coll2" } },
  { "$unionWith": { "coll": "coll3" } },
  { "$match": { "$expr": { "$eq": [ "$a", 1 ] } } }
])
It's better to apply the filters before the unions rather than after, as the query above does. If you have filters, you can move the $match into each $unionWith via its pipeline option:
{ $unionWith: { coll: "<collection>", pipeline: [ <stage1>, ... ] } }
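A minimal sketch of the filtered version, assuming the same collections coll1/coll2/coll3 and the a = 1 filter from the query above:
db.coll1.aggregate([
  // filter the base collection first
  { $match: { a: 1 } },
  // each union then only pulls in already-filtered documents
  { $unionWith: { coll: "coll2", pipeline: [ { $match: { a: 1 } } ] } },
  { $unionWith: { coll: "coll3", pipeline: [ { $match: { a: 1 } } ] } }
])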

MongoDB - count by field, and sort by count

I am new to MongoDB and to writing more than super basic queries, and I didn't manage to create a query that does the following.
I have this collection, where each document represents one "use" of a benefit (e.g. the first document states that benefit "123" was used once):
[
{
"id" : "1111",
"benefit_id":"123"
},
{
"id":"2222",
"benefit_id":"456"
},
{
"id":"3333",
"benefit_id":"456"
},
{
"id":"4444",
"benefit_id":"789"
}
]
I need to create a query that outputs an array, with the most used benefit at the top along with how many times it was used.
For the above example the query should output:
[
{
"benefit_id":"456",
"cnt":2
},
{
"benefit_id":"123",
"cnt": 1
},
{
"benefit_id":"789",
"cnt":1
}
]
I have tried to work with the documentation and with $sortByCount but with no success.
$group
$group by benefit_id and get the count using $sum
$sort by count in descending order
db.collection.aggregate([
  {
    $group: {
      _id: "$benefit_id",
      count: { $sum: 1 }
    }
  },
  { $sort: { count: -1 } }
])
$sortByCount
The same operation using the $sortByCount stage:
db.collection.aggregate([
  { $sortByCount: "$benefit_id" }
])
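Note that $sortByCount emits documents with _id and count fields rather than the benefit_id/cnt shape asked for. If the exact shape matters, a $project can rename them (a small sketch, not part of the original answer):
db.collection.aggregate([
  { $sortByCount: "$benefit_id" },
  // rename _id/count to the requested benefit_id/cnt shape
  { $project: { _id: 0, benefit_id: "$_id", cnt: "$count" } }
])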

Is there a way to skip different find filter outputs in one MongoDB query

Is it possible to skip the first records per name? For example, product_detail is the collection and it has 10 documents with the name apple and 10 documents with the name mango; can I skip the first 2 documents of each? The queries below skip the first 2 documents for apple and for mango separately.
Query :
db.getCollection('product_detail').find({"productInfo.name" : "apple"}).skip(2);
db.getCollection('product_detail').find({"productInfo.name" : "mango"}).skip(2);
Instead of two queries to skip 2 documents for "productInfo.name": "apple" and "productInfo.name": "mango", I need one. Can anyone help me out?
Check out the $facet aggregation pipeline stage:
db.getCollection('product_detail').aggregate([
  {
    $facet: {
      apple: [
        { $match: { "productInfo.name": "apple" } },
        { $sort: { /* your sort condition here to ensure order */ } },
        { $skip: 2 }
      ],
      mango: [
        { $match: { "productInfo.name": "mango" } },
        { $sort: { /* your sort condition here to ensure order */ } },
        { $skip: 2 }
      ]
    }
  }
])
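This returns a single document with one array per facet, shaped roughly like this (sample, assuming documents matched in each facet):
{
  "apple": [ /* apple documents after the first 2, in sort order */ ],
  "mango": [ /* mango documents after the first 2, in sort order */ ]
}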

Using "$count" Within an "$addFields" Stage in MongoDB Aggregation

I am trying to find the correct combination of aggregation operators to add a field titled "totalCount" to my MongoDB view.
This gets me the count at this particular stage of the aggregation pipeline, which I want output on each of the documents:
{
$count: "count"
}
But I then end up with one document containing this result, rather than what I'm trying to accomplish, which is to have this value appear as an added field/value on all of the documents, or even better, as a value returned in addition to the documents.
I've tried this, but it gives me the error "Unrecognized expression '$count'":
{
$addFields: {
"totalCount" : { $count: "totalCount" }
}
}
What would the correct syntactical construction be for this? Is it possible to do it this way, or do I need to use $sum, or some other operator to make this work? I also tried this:
{
$addFields: {
"totalCount" : { $sum: { _id: 1 } }
}
},
... but while it doesn't give me any errors, it just prints 0 as the value for that field on every document rather than the total count of all documents.
A total count will always be a one-document result, so you need $facet to run multiple aggregation pipelines and then merge the results. Let's say your regular pipeline contains a simple $project and you want to merge its results with a $count. You can run the aggregation below:
db.col.aggregate([
  {
    $facet: {
      totalCount: [
        { $count: "value" }
      ],
      pipelineResults: [
        { $project: { _id: 1 } } // your regular aggregation pipeline here
      ]
    }
  },
  { $unwind: "$pipelineResults" },
  { $unwind: "$totalCount" },
  {
    $replaceRoot: {
      newRoot: { $mergeObjects: [ "$pipelineResults", { totalCount: "$totalCount.value" } ] }
    }
  }
])
After the $facet stage you'll get a single document like this:
{
"totalCount" : [
{
"value" : 3
}
],
"pipelineResults" : [
{
"_id" : ObjectId("5b313241120e4bc08ce87e46")
},
//....
]
}
Then you have to use $unwind to transform the arrays into multiple documents, and $replaceRoot with $mergeObjects to promote the regular pipeline results to the root level.
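For the sample above, each final document then carries its regular pipeline fields plus the count, along these lines:
{ "_id" : ObjectId("5b313241120e4bc08ce87e46"), "totalCount" : 3 }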
Since MongoDB 5.0 there is another option that avoids the disadvantage of $facet, namely the grouping of all returned documents into one big document: the main concern there is that a document has a size limit of 16 MB. Using $setWindowFields avoids this concern.
This single stage can replace the four stages above:
db.col.aggregate([
{$setWindowFields: {output: {totalCount: {$count: {}}}}}
])
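As a sketch of the output, assuming a collection holding just the two documents { a: 1 } and { a: 2 }, every document comes back with the collection-wide count attached:
{ "_id" : ObjectId("..."), "a" : 1, "totalCount" : 2 }
{ "_id" : ObjectId("..."), "a" : 2, "totalCount" : 2 }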

MongoDB - aggregate failed with memory error

I'm trying to find duplicates in my sharded collection using the id field, which follows this pattern:
"id" : {
"idInner" : {
"k1" : "v1",
"k2" : "v2",
"k3" : "v3",
"k4" : "v4"
}
}
I used the query below, but received the error "exception: Exceeded memory limit for $group, but didn't allow external sort. Pass allowDiskUse:true to opt in.", even though I did pass "allowDiskUse : true" in my query.
db.collection.aggregate([
  {
    $group: {
      _id: { id: "$id" },
      uniqueIds: { $addToSet: "$_id" },
      count: { $sum: 1 }
    }
  },
  { $match: { count: { $gte: 2 } } },
  { $sort: { count: -1 } },
  { $limit: 10 }
], {
  allowDiskUse: true
});
Is there another way to get what I want, or something else I should pass in the above query? Thanks.
Please use allowDiskUse in db.runCommand:
db.runCommand({
  aggregate: "collection",
  pipeline: [
    {
      $group: {
        _id: { id: "$id" },
        uniqueIds: { $addToSet: "$_id" },
        count: { $sum: 1 }
      }
    },
    { $match: { count: { $gte: 2 } } },
    { $sort: { count: -1 } },
    { $limit: 10 }
  ],
  allowDiskUse: true,
  cursor: { } // the aggregate command requires a cursor argument on MongoDB 3.6+
})
Let me know if this works for you.
Run a $match first in the pipeline to keep only documents whose id.idInner.k1 (say) falls within a given range, so that you process results for that range only. Since you are interested in duplicates on the id key, all copies of a duplicated document will satisfy the same criterion. See how much you need to narrow down that range, then run the query for the next range, and so on until you cover all documents.
If it is something you must do frequently, automate it: declare the ranges, feed them to a loop, keep the duplicates of every run, and merge the results at the end, as in the sketch below.
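A minimal shell sketch of that loop, with hypothetical range boundaries on id.idInner.k1:
// hypothetical string ranges over id.idInner.k1; adjust boundaries to your data
var ranges = [ [ "a", "g" ], [ "g", "n" ], [ "n", "u" ], [ "u", "~" ] ];
var duplicates = [];
ranges.forEach(function (r) {
  db.collection.aggregate([
    // narrow the working set first so $group stays within the memory limit
    { $match: { "id.idInner.k1": { $gte: r[0], $lt: r[1] } } },
    {
      $group: {
        _id: { id: "$id" },
        uniqueIds: { $addToSet: "$_id" },
        count: { $sum: 1 }
      }
    },
    { $match: { count: { $gte: 2 } } }
  ], { allowDiskUse: true }).forEach(function (doc) {
    duplicates.push(doc);
  });
});
// duplicates now holds the duplicate groups from all ranges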
Another fast hack/trick would be to bypass the mongos and run the aggregation directly on each shard. Doing so limits the documents each run sees to roughly docs/number_of_shards (assuming well-balanced shards), which may get you under the memory limit. This second approach assumes that your shard key is the id key; if it is not, it will not work, since the same duplicated documents will be scattered among the shards. A rough shell sketch follows.
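A rough sketch in the shell, with a hypothetical shard host (list the real ones with sh.status() on the mongos):
// connect straight to one shard, bypassing the mongos
var shard = new Mongo("shard0.example.net:27018"); // hypothetical host
var shardDb = shard.getDB("mydb");                 // hypothetical database name
shardDb.collection.aggregate([
  { $group: { _id: { id: "$id" }, uniqueIds: { $addToSet: "$_id" }, count: { $sum: 1 } } },
  { $match: { count: { $gte: 2 } } }
], { allowDiskUse: true })
// repeat for each shard and merge the per-shard duplicate lists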