I'm trying to run an aggregation that finds distinct property pairs in a collection of objects, with pagination, but $skip and $limit don't seem to work.
I have a collection with the following object type:
{
"_id" : {
"expiration" : ISODate("2021-06-30T00:00:00.000Z"),
"product" : "proda",
"site" : "warehouse1",
"type" : "AVAILABLE"
},
"quantity" : 2,
"date" : ISODate("2021-06-28T00:00:00.000Z"),
}
I'm trying to find distinct product/site pairs, but only 2 at a time with the following aggregation:
db.getCollection('OBJECT').aggregate( [
{ $group: { "_id": { product: "$_id.product", site: "$_id.site" } } },
{ $skip: 0 },
{ $limit: 2 }
])
With skip set to 0 it returns 2 distinct product-site pairs as expected, but when I increase the skip value to 2 or more for the following pages, the query returns nothing, even though I have many objects with distinct product-site pairs that should be returned.
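One thing worth checking, offered as a guess rather than a confirmed diagnosis of the empty pages: $group does not guarantee the order of its output, so paging with $skip and $limit is only stable when a $sort stage sits between the $group and the $skip. The plain JavaScript sketch below mimics group, sort, skip and limit on an in-memory array so the intended paging can be seen without a database. The sample documents and the `page` helper are made up for illustration.

```javascript
// Hypothetical sample documents shaped like the ones in the question.
const docs = [
  { _id: { product: "proda", site: "warehouse1", type: "AVAILABLE" }, quantity: 2 },
  { _id: { product: "proda", site: "warehouse1", type: "RESERVED" }, quantity: 1 },
  { _id: { product: "proda", site: "warehouse2", type: "AVAILABLE" }, quantity: 5 },
  { _id: { product: "prodb", site: "warehouse1", type: "AVAILABLE" }, quantity: 3 },
];

// Rough in-memory equivalent of the pipeline
//   { $group: { _id: { product: "$_id.product", site: "$_id.site" } } },
//   { $sort: { "_id.product": 1, "_id.site": 1 } },
//   { $skip: skip },
//   { $limit: limit }
function page(documents, skip, limit) {
  const seen = new Map();
  for (const d of documents) {
    // Distinct product/site pairs, like the $group stage.
    const key = JSON.stringify([d._id.product, d._id.site]);
    if (!seen.has(key)) seen.set(key, { product: d._id.product, site: d._id.site });
  }
  // A fixed order makes every page deterministic, like the $sort stage.
  const pairs = [...seen.values()].sort(
    (a, b) => a.product.localeCompare(b.product) || a.site.localeCompare(b.site)
  );
  return pairs.slice(skip, skip + limit);
}

const firstPage = page(docs, 0, 2);
const secondPage = page(docs, 2, 2);
console.log(firstPage, secondPage);
```

With the sort in place, each distinct pair appears on exactly one page, whatever order the grouping happened to produce.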
A little brainteaser for mongo users.
I have a collection of documents like
{
"_id" : ObjectId("19628f4f0545a733185b672f"),
"name" : "hello",
"items" : [
{
"itemNumber" : 12512,
"value" : "let"
},
{
"itemNumber" : 2546,
"value" : "put"
}
]
}
I need to make sure that every item's itemNumber is unique globally in the collection.
In an SQL database I would have a separate table for items, and the query to check whether the numbers are unique would be something like
select count(1)
from (
select itemNumber, count(itemNumber) as cnt
from items
group by itemNumber) sel
where cnt>1;
A result of 0 would mean that all itemNumbers are unique. (There are probably better ways to make that check in SQL.)
With MongoDB the only solution I can come up with is
a) use forEach to extract all items to separate collection
b) make a simple aggregation
db.items.aggregate(
{ $group : { _id : '$itemNumber', count : {$sum : 1} } },
{ $out : "cnt" }
)
c) db.cnt.find({count: {$gt: 1}}).count()
Is there any one-query way to do it?
Performance note: the collection is about 3M documents, 2.2 KB each. I have noticed that aggregations that contain $group take forever on this collection.
How about something like this:
db.items.aggregate(
{ $unwind: "$items" } ,
{ $group : { _id : '$items.itemNumber', count : { $sum : 1 } } },
{ $match: { "count": { $gt: 1 } } }
)
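To make the three stages easier to follow, here is a minimal plain JavaScript sketch of the same logic on an in-memory array: flatten the items ($unwind), count per itemNumber ($group with $sum), and keep the counts above 1 ($match). The second sample document and the `duplicateItemNumbers` helper are hypothetical, added only to produce a duplicate.

```javascript
// First document is from the question; the second is a made-up duplicate.
const docs = [
  { name: "hello", items: [{ itemNumber: 12512, value: "let" }, { itemNumber: 2546, value: "put" }] },
  { name: "world", items: [{ itemNumber: 2546, value: "get" }] },
];

function duplicateItemNumbers(documents) {
  const counts = new Map();
  for (const doc of documents) {
    // Like $unwind: visit every array element on its own.
    for (const item of doc.items || []) {
      // Like $group with { $sum: 1 }: count occurrences per itemNumber.
      counts.set(item.itemNumber, (counts.get(item.itemNumber) || 0) + 1);
    }
  }
  // Like $match: keep only the numbers seen more than once.
  return [...counts]
    .filter(([, count]) => count > 1)
    .map(([itemNumber, count]) => ({ _id: itemNumber, count }));
}

const duplicates = duplicateItemNumbers(docs);
console.log(duplicates);
```

An empty result means every itemNumber is unique across the collection.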
I'm new to the MongoDB world. I'm trying to figure out how to count the number of children organizations assigned to a parent organization. I have documents that have this general structure:
{
"_id" : "001",
"parentOrganization" : {
"organizationId" : "pOrg1"
},
"childOrganization" : {
"organizationId" : "cOrg1"
}
},
{
"_id" : "002",
"parentOrganization" : {
"organizationId" : "pOrg1"
},
"childOrganization" : {
"organizationId" : "cOrg2"
}
},
{
"_id" : "003",
"parentOrganization" : {
"organizationId" : "pOrg2"
},
"childOrganization" : {
"organizationId" : "cOrg3"
}
}
Each document has a parentOrganization with an associated childOrganization. There may be multiple documents with the same parentOrganization, but different childOrganizations. There may also be multiple documents with the same parent/child relationship. Additionally, there may even be a case where a child org may associate with multiple parent orgs.
I'm trying to group by parentOrganization and then count the number of unique childOrganizations associated with each parentOrganization, as well as display the unique IDs.
I have tried using the aggregation framework with $match and $group, but I'm still not getting into the child organization parts to count them. Here is what I'm currently attempting:
var s1 = {$match: {"parentOrganization.organizationId": {$exists: true}}};
var s2 = {$group: {_id: "$parentOrganization.organizationId", count: {$sum: "$childOrganization.organizationId"}}};
db.collection.aggregate(s1, s2);
My results are returning the parentOrganization, but my $sum is not returning the number of associated childOrganizations:
/* 1 */
{
"_id" : "pOrg1",
"count" : 0
}
/* 2 */
{
"_id" : "pOrg2",
"count" : 0
}
I get the feeling it is a bit more complicated than my limited knowledge has access to at this time. What details am I missing in this query?
Your $sum is referencing the childOrganization.organizationId value, which is a string. When $sum references a string, it will return the value 0.
I was unsure of exactly what you were asking for, but I believe that these aggregations can help you on your way.
This will return a count of documents grouped by the parentOrganization.organizationId:
db.collection.aggregate({$group: {"_id":"$parentOrganization.organizationId", "count": {"$sum": 1}}})
Output:
{ "_id" : "pOrg2", "count" : 1 }
{ "_id" : "pOrg1", "count" : 2 }
This will return a count of unique parent/child organizations:
db.collection.aggregate(
{$group: {"_id": {"parentOrganization": "$parentOrganization.organizationId", "childOrganization": "$childOrganization.organizationId"}, "count":{$sum:1}}})
Output:
{ "_id" : { "parentOrganization" : "pOrg2", "childOrganization" : "cOrg3" }, "count" : 1 }
{ "_id" : { "parentOrganization" : "pOrg1", "childOrganization" : "cOrg2" }, "count" : 1 }
{ "_id" : { "parentOrganization" : "pOrg1", "childOrganization" : "cOrg1" }, "count" : 1 }
This will return a count of unique child organizations and get the set of unique child organizations as well using $addToSet. One caveat of using $addToSet is that the MongoDB 16MB limit on document size still holds. This means that if your collection is large enough such that the size of the set will make one document greater than 16MB, the command will fail. The first $group will create a set of child organizations grouped by parent organization. The $project is used simply to add the total size of the set to the result.
db.collection.aggregate([
{$group: {"_id" : "$parentOrganization.organizationId", "childOrgs" : { "$addToSet" : "$childOrganization.organizationId"}}},
{$project: {"_id" : "$_id", "uniqueChildOrgsCount": {"$size" : "$childOrgs"}, "uniqueChildOrgs": "$childOrgs"}}])
Output:
{ "_id" : "pOrg2", "uniqueChildOrgsCount" : 1, "uniqueChildOrgs" : [ "cOrg3" ]}
{ "_id" : "pOrg1", "uniqueChildOrgsCount" : 2, "uniqueChildOrgs" : [ "cOrg2", "cOrg1" ]}
During these aggregations, I left out the $match statement you included for simplicity, but you could add that back as well.
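As a rough illustration of what the $addToSet and $size combination computes, the sketch below does the same grouping in plain JavaScript on the three sample documents from the question. The `uniqueChildren` helper is a name made up for this example, and the output order here follows insertion order, which MongoDB does not promise.

```javascript
// The three sample documents from the question.
const docs = [
  { _id: "001", parentOrganization: { organizationId: "pOrg1" }, childOrganization: { organizationId: "cOrg1" } },
  { _id: "002", parentOrganization: { organizationId: "pOrg1" }, childOrganization: { organizationId: "cOrg2" } },
  { _id: "003", parentOrganization: { organizationId: "pOrg2" }, childOrganization: { organizationId: "cOrg3" } },
];

function uniqueChildren(documents) {
  const byParent = new Map();
  for (const doc of documents) {
    const parent = doc.parentOrganization.organizationId;
    if (!byParent.has(parent)) byParent.set(parent, new Set());
    // A Set drops repeated children, like $addToSet.
    byParent.get(parent).add(doc.childOrganization.organizationId);
  }
  // Like the $project stage: report the set and its size.
  return [...byParent].map(([parent, children]) => ({
    _id: parent,
    uniqueChildOrgsCount: children.size,
    uniqueChildOrgs: [...children],
  }));
}

const perParent = uniqueChildren(docs);
console.log(perParent);
```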
I have 4 players with their scores in different matches,
e.g.
{user: score} -- json keys
{'a': 10}, {'a':12}, {'b':16}
I am trying to find a way to get the sum of a single player's scores using an aggregation.
db.users.aggregate([{$match:{'user':'a'}},{$group:{_id: null, scores:{$sum:'$score'}}}])
I repeat the same thing for b, and so on.
In short, I am running the same query many times, once per user.
What is the best (or a more optimized) way to write the aggregate query once for all users?
You can just match the required users with the $in clause, and then group as @Sourbh Gupta suggested.
db.users.aggregate([
{$match:{'user':{$in: ['a', 'b', 'c']}}},
{$group:{_id: '$user', scores:{$sum:'$score'}}}
])
Group the data by user, i.e.
db.users.aggregate([{$group:{_id: "$user", scores:{$sum:'$score'}}}])
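To show what grouping on the user field buys you, here is a small plain JavaScript sketch that computes every user's total in a single pass, which is what one $group stage with _id set to "$user" does. The field names user and score are assumed from the question, and `sumByUser` is a hypothetical helper.

```javascript
// Sample scores from the question, assuming fields named "user" and "score".
const docs = [
  { user: "a", score: 10 },
  { user: "a", score: 12 },
  { user: "b", score: 16 },
];

function sumByUser(documents) {
  const totals = new Map();
  for (const doc of documents) {
    // One running total per user, like { $group: { _id: "$user", scores: { $sum: "$score" } } }.
    totals.set(doc.user, (totals.get(doc.user) || 0) + doc.score);
  }
  return [...totals].map(([user, scores]) => ({ _id: user, scores }));
}

const totals = sumByUser(docs);
console.log(totals);
```

No per-user query is needed, since each user ends up in its own group.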
I'm not too sure about your document structure, but if you've got 2 different fields for 2 different scores, you can group and sum each of them, and then project the sum of the 2 grouped sums (if that makes sense).
So for example, I have these documents:
> db.scores.find()
{ "_id" : ObjectId("5858ed67b11b12dce194eec8"), "user" : "bob", "score" : { "a" : 10 } }
{ "_id" : ObjectId("5858ed6ab11b12dce194eec9"), "user" : "bob", "score" : { "a" : 12 } }
{ "_id" : ObjectId("5858ed6eb11b12dce194eeca"), "user" : "bob", "score" : { "b" : 16 } }
Notice we have a user bob and he has 2x a scores and 1x b score.
We can now write an aggregation query to do a match for bob then sum the scores.
db.scores.aggregate([
{ $match: { user : "bob" } },
{ $group: { _id : "$user", sumA : { $sum : "$score.a" }, sumB : { $sum : "$score.b" } } },
{ $project: { user: 1, score : { $sum: [ "$sumA", "$sumB" ] } } }
]);
This will give us the following result
{ "_id" : "bob", "score" : 38 }
Let's say I have a collection called phone_audit with document entries of the following form: _id, which is the phone number, and value, containing items that always has 2 entries (each with an id and a date).
Please see below:
{
"_id" : {
"phone_number" : "+012345678"
},
"value" : {
"items" : [
{
"_id" : "c14b4ac1db691680a3fb65320fba7261",
"updated_at" : ISODate("2016-03-14T12:35:06.533Z")
},
{
"_id" : "986b58e55f8606270f8a43cd7f32392b",
"updated_at" : ISODate("2016-07-23T11:17:53.552Z")
}
]
}
},
......
I need to get a list of _id values for every entry in that collection representing the older of the two items in each document.
So in the above - result would be [c14b4ac1db691680a3fb65320fba7261,...]
Any pointers to the type of query to execute would be very helpful, even if the exact syntax is not correct.
With aggregate(), you can $unwind value.items, $sort by updated_at, then use $first to get the oldest:
[
{
"$unwind": "$value.items"
},
{
"$sort": { "value.items.updated_at": 1 }
},
{
"$group":{
_id: "$_id.phone_number",
oldest:{$first:"$value.items"}
}
},
{
"$project":{
value_id: "$oldest._id"
}
}
]
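For anyone who wants to check the expected result without a database, the sketch below picks the older item per document in plain JavaScript, which is what the unwind, sort and $first steps amount to. It assumes every document has a non-empty value.items array, as the question states, and `oldestItemIds` is a name made up for this example.

```javascript
// The sample document from the question, with ISODate values as Date objects.
const docs = [
  {
    _id: { phone_number: "+012345678" },
    value: {
      items: [
        { _id: "c14b4ac1db691680a3fb65320fba7261", updated_at: new Date("2016-03-14T12:35:06.533Z") },
        { _id: "986b58e55f8606270f8a43cd7f32392b", updated_at: new Date("2016-07-23T11:17:53.552Z") },
      ],
    },
  },
];

function oldestItemIds(documents) {
  return documents.map((doc) => {
    // Keep the item with the smallest updated_at, like ascending $sort plus $first.
    const oldest = doc.value.items.reduce((a, b) => (b.updated_at < a.updated_at ? b : a));
    return { _id: doc._id.phone_number, value_id: oldest._id };
  });
}

const oldestIds = oldestItemIds(docs);
console.log(oldestIds);
```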
I would like to retrieve a list of values that comes from the oldest document currently signed, but I failed to select a document based on the date. Thanks.
Here is the JSON:
"ad" : "noc3",
"createdDate" : ISODate(),
"list" : [
{
"id" : "p45",
"value" : 21,
},
{
"id" : "p6",
"value" : 20,
},
{
"id" : "4578",
"value" : 319
}
]
and here is my aggregate request:
db.friends.aggregate({$match:{advertiser:"noc3", {$sort:{timestamps:-1},{$limit:1} }},{$unwind:"$list"},{$project:{_id: "$list.id", value:{$add:[0]}}});
Your aggregate query is incorrect. You add the sort and limit to the match, but that's not how you do that. You use different pipeline operators:
db.friends.aggregate( [
{ $match: { advertiser: "noc3" } },
{ $sort: { createdDate: -1 } },
{ $limit: 1 },
Your other pipeline operators are a bit strange too, and your query and your document don't match: the query sorts on timestamps, but the document has createdDate. If you add the expected output, I can update the answer to include the last bits of the query too.