On Mongo 2.4.6
Collection of Users
{
"_id" : User1,
"orgRoles" : [
{"_id" : 1, "app" : "ANGRYBIRDS", "orgId" : "CODOE"},
{"_id" : 2, "app" : "ANGRYBIRDS", "orgId" : "MSDN"}
],
},
{
"_id" : User2,
"orgRoles" : [
{"_id" : 1, "app" : "ANGRYBIRDS", "orgId" : "CODOE"},
{"_id" : 2, "app" : "HUNGRYPIGS", "orgId" : "MSDN"}
],
},
{
"_id" : User3,
"orgRoles" : [
{"_id" : 1, "app" : "ANGRYBIRDS", "orgId" : "YAHOO"},
{"_id" : 2, "app" : "HUNGRYPIGS", "orgId" : "MSDN"}
],
}
With data that looks like the above, I'm trying to write a query to get:
All the ids of the users that have only one ANGRYBIRDS app, where that ANGRYBIRDS app is in the CODOE organization.
So it would return User2, because they have one ANGRYBIRDS app and it is in the "CODOE" organization, but not User1, because they have two ANGRYBIRDS apps, or User3, because they don't have an ANGRYBIRDS app in the "CODOE" organization. I'm fairly new to mongo queries, so any help is appreciated.
For conditions that are more detailed than the standard query operators offer, your best approach is to use the aggregation framework. This allows you to do some processing to work out your conditions, such as the number of matches:
db.collection.aggregate([
// Filter the documents that are possible matches
{ "$match": {
"orgRoles": {
"$elemMatch": {
"app": "ANGRYBIRDS", "orgId": "CODOE"
}
}
}},
// De-normalize the array content
{ "$unwind": "$orgRoles" },
// Group and count the matches
{ "$group": {
"_id": "$_id",
"orgRoles": { "$push": "$orgRoles" },
"matched": {
"$sum": {
"$cond": [
{ "$eq": ["$orgRoles.app", "ANGRYBIRDS"] },
1,
0
]
}
}
}},
// Keep only the documents where matched is exactly 1
{ "$match": {
"orgRoles": {
"$elemMatch": {
"app": "ANGRYBIRDS", "orgId": "CODOE"
}
},
"matched": 1
}},
// Optionally project to just keep the original fields
{ "$project": { "orgRoles": 1 } }
])
The main work happens after the initial $match has reduced the input to documents with at least one array element matching the main condition, and after $unwind has split the array so that its elements can be inspected individually.
The trick is the conditional $sum with the $cond operator, which is a "ternary". It counts how many array elements have "app" equal to "ANGRYBIRDS". You then $match again to filter out any documents with a match count other than one. The $elemMatch condition is repeated in that second $match, but it is not really necessary there.
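To see what the conditional $sum works out, the same logic can be reproduced in plain JavaScript on in-memory copies of the sample documents. This is only an illustrative sketch, not how the server evaluates the pipeline: the `users` array and the `usersWithSingleApp` helper are made up for the example, and the third document is assumed to be User3.

```javascript
// In-memory copies of the sample documents (third _id assumed to be "User3").
const users = [
  { _id: "User1", orgRoles: [
    { _id: 1, app: "ANGRYBIRDS", orgId: "CODOE" },
    { _id: 2, app: "ANGRYBIRDS", orgId: "MSDN" } ] },
  { _id: "User2", orgRoles: [
    { _id: 1, app: "ANGRYBIRDS", orgId: "CODOE" },
    { _id: 2, app: "HUNGRYPIGS", orgId: "MSDN" } ] },
  { _id: "User3", orgRoles: [
    { _id: 1, app: "ANGRYBIRDS", orgId: "YAHOO" },
    { _id: 2, app: "HUNGRYPIGS", orgId: "MSDN" } ] },
];

// Hypothetical helper that mirrors the pipeline stages one by one.
function usersWithSingleApp(docs, app, orgId) {
  return docs
    // first $match with $elemMatch: one element must have both app and orgId
    .filter(d => d.orgRoles.some(r => r.app === app && r.orgId === orgId))
    // $unwind + $group with the conditional $sum: count elements for that app
    .map(d => ({
      _id: d._id,
      matched: d.orgRoles.filter(r => r.app === app).length,
    }))
    // final $match: keep documents with exactly one element for that app
    .filter(d => d.matched === 1)
    .map(d => d._id);
}

const singleAngryBirds = usersWithSingleApp(users, "ANGRYBIRDS", "CODOE");
console.log(singleAngryBirds); // [ 'User2' ]
```

User1 is dropped because its count is 2, and User3 never passes the first filter.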
Just for the record, this is also possible using the JavaScript evaluation of the $where clause, but because the JavaScript has to be evaluated for every candidate document it is likely to be less efficient:
db.collection.find({
"orgRoles": {
"$elemMatch": {
"app": "ANGRYBIRDS", "orgId": "CODOE"
}
},
"$where": function() {
var orgs = this.orgRoles.filter(function(el) {
return el.app == "ANGRYBIRDS";
});
return ( orgs.length == 1 );
}
})
One way of doing it using the aggregation pipeline is:
db.users.aggregate([
// Match the documents with app being "ANGRYBIRDS" and orgId being "CODOE"
// Note that this step filters out most of the documents and is good to have
// at the start of the pipeline, moreover it can make use of indexes, if
// used at the beginning of the aggregation pipeline.
{
$match : {
"orgRoles.app" : "ANGRYBIRDS",
"orgRoles.orgId" : "CODOE"
}
},
// unwind the elements in the orgRoles array
{
$unwind : "$orgRoles"
},
// group by userid and app
{
$group : {
"_id" : {
"id" : "$_id",
"app" : "$orgRoles.app"
},
// take the id and app of the first document in each group, since all
// the other documents in the group will have the same values.
"id" : {
$first : "$_id"
},
"app" : {
$first : "$orgRoles.app"
},
// orgId can be different, so form an array for each group.
"orgId" : {
$push : {
"id" : "$orgRoles.orgId"
}
},
// count the number of documents in each group.
"count" : {
$sum : 1
}
}
},
// find the matching group
{
$match : {
"count" : 1,
"app" : "ANGRYBIRDS",
"orgId" : {
$elemMatch : {
"id" : "CODOE"
}
}
}
},
// project only the userid
{
$project : {
"id" : 1,
"_id" : 0
}
} ]);
Edit: Removed the step that mapped over the aggregation result, since the problem requires a solution in v2.4.6. According to the documentation:
Changed in version 2.6: The db.collection.aggregate() method returns a cursor and can return result sets of any size. Previous versions returned all results in a single document, and the result set was subject to a size limit of 16 megabytes.
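Code that has to cope with both behaviours can normalise the return value before using it. A minimal sketch, assuming the pre-2.6 shape `{ result: [...], ok: 1 }` shown in this thread and a shell-style cursor whose `toArray()` returns an array synchronously. The helper name `toResultArray` and the `fakeCursor` stand-in are made up for illustration. In the Node.js driver `toArray()` is asynchronous, so this would need adapting there.

```javascript
// Hypothetical helper: return the aggregation output as a plain array.
function toResultArray(res) {
  if (Array.isArray(res)) return res;                       // already an array
  if (res && Array.isArray(res.result)) return res.result;  // 2.4-style single document
  if (res && typeof res.toArray === "function") return res.toArray(); // cursor-like (2.6+ shell)
  throw new TypeError("Unrecognised aggregate() return value");
}

// Stand-ins for the two return shapes; no database is involved here.
const legacy = { result: [{ id: "User2" }], ok: 1 };
const fakeCursor = { toArray: () => [{ id: "User2" }] };

const legacyIds = toResultArray(legacy).map(d => d.id);
const cursorIds = toResultArray(fakeCursor).map(d => d.id);
console.log(legacyIds, cursorIds); // [ 'User2' ] [ 'User2' ]
```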
I have an aggregation pipeline that nearly does what I want. I've used match / unwind / project / sort to get 99% of the way. It is returning multiple documents:
[
{
"_id" : 254.8
},
{
"_id" : 93.7
},
{
"_id" : 89.9
},
{
"_id" : 94.15
},
{
"_id" : 102.1
},
{
"_id" : 93.9
},
{
"_id" : 102.7
}
]
Note: I've added the array brackets and commas to make it more readable, but you can also read it as:
{
"_id" : 254.8
}
{
"_id" : 93.7
}
{
"_id" : 89.9
}
{
"_id" : 94.15
}
{
"_id" : 102.1
}
I need the contents of the ID fields from all 7 documents in an array of values in one document:
{values: [254.8, 93.7, 89.9, 94.15, 102.1, 93.9, 102.7]}
It would be easy to sort this with JS once I have the results but I'd rather do it in the pipeline if possible so my JS stays 100% generic and only returns pure pipeline data.
Here is what you need to complete the job:
db.collection.aggregate([
{
"$group": {
"_id": null,
"values": {
$push: "$_id"
}
}
},
{
"$project": {
_id: false
}
}
])
The result will be:
[
{
"values": [
254.8,
93.7,
89.9,
94.15,
102.1,
93.9,
102.7
]
}
]
https://mongoplayground.net/p/pTmR_rni0J1
I want to get the position of a particular user in a list after a $sort stage in an aggregation pipeline.
Let's say we have a leaderboard, and I need to get my rank in it with a single query that returns only my data.
I have tried $addFields and some queries with $map.
Let's say we have these documents:
/* 1 createdAt:8/18/2019, 4:42:41 PM*/
{
"_id" : ObjectId("5d5963e1c6c93b2da849f067"),
"name" : "x4",
"points" : 69
},
/* 2 createdAt:8/18/2019, 4:42:41 PM*/
{
"_id" : ObjectId("5d5963e1c6c93b2da849f07b"),
"name" : "x24",
"points" : 968
},
/* 3 createdAt:8/18/2019, 4:42:41 PM*/
{
"_id" : ObjectId("5d5963e1c6c93b2da849f06a"),
"name" : "x7",
"points" : 997
},
And I want to write a query like this
db.table.aggregate(
[
{ $sort : { points : 1 } },
{ $addFields: { order : "$index" } },
{ $match : { name : "x24" } }
]
)
I need to inject the order field with something like $index
I expect to have something like this in return
{
"_id" : ObjectId("5d5963e1c6c93b2da849f07b"),
"name" : "x24",
"points" : 968,
"order" : 2
}
I need something like the metadata of the result shown here, which would return 2:
/* 2 createdAt:8/18/2019, 4:42:41 PM*/
One workaround for this situation is to collect all of your documents into a single array, resolve each document's index in that array with the help of $unwind, and finally project the fields you need.
db.collection.aggregate([
{ $sort: { points: 1 } },
{
$group: {
_id: 1,
register: { $push: { _id: "$_id", name: "$name", points: "$points" } }
}
},
{ $unwind: { path: "$register", includeArrayIndex: "order" } },
{ $match: { "register.name": "x24" } },
{
$project: {
_id: "$register._id",
name: "$register.name",
points: "$register.points",
order: 1
}
}
]);
To make it more efficient, you can add $limit, $match and other filtering stages as your requirements allow.
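Note that the index produced by includeArrayIndex is zero-based, so the lowest-scoring document gets order 0 and you would add 1 to get a leaderboard position that starts at 1. The sort-then-index idea can be sketched in plain JavaScript on the three sample documents. The `players` array (with _id omitted) and the `rankOf` helper are made up for this illustration.

```javascript
// In-memory copies of the sample documents (_id omitted for brevity).
const players = [
  { name: "x4", points: 69 },
  { name: "x24", points: 968 },
  { name: "x7", points: 997 },
];

// Hypothetical helper mirroring $sort, $group/$push, $unwind and $match.
function rankOf(docs, name) {
  // $sort: { points: 1 } (copy first so the input array is not mutated)
  const sorted = [...docs].sort((a, b) => a.points - b.points);
  // $unwind with includeArrayIndex gives each element its zero-based index;
  // the later $match keeps just the requested name
  const order = sorted.findIndex(d => d.name === name);
  return order === -1 ? null : { ...sorted[order], order };
}

const x24 = rankOf(players, "x24");
console.log(x24); // { name: 'x24', points: 968, order: 1 }
```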
I have a collection with the following data:
{
"_id" : ObjectId("5516d416d0c2323619ddbca8"),
"date" : "28/02/2015",
"driver" : "user1",
"passengers" : [
{
"user" : "user2",
"times" : 2
},
{
"user" : "user3",
"times" : 3
}
]
}
{
"_id" : ObjectId("5516d517d0c2323619ddbca9"),
"date" : "27/02/2015",
"driver" : "user2",
"passengers" : [
{
"user" : "user1",
"times" : 2
},
{
"user" : "user3",
"times" : 2
}
]
}
I would like to perform an aggregation that tells me, for a given passenger, how many times they rode with each driver. In my example it would be:
for user1: [{ driver: user2, times: 2}]
for user2: [{ driver: user1, times: 2}]
for user3: [{ driver: user1, times: 3}, {driver: user2, times:2}]
I'm quite new to mongo and know how to perform a simple aggregation with sum, but not when the values are inside arrays and the subject itself is in the array.
What is the appropriate way to perform this kind of aggregation and, more specifically, how do I perform it in an express.js based server?
To do this with the aggregation framework, start with a $match on the passenger in question, which keeps only the documents that have that user in the passengers array. Follow it with $unwind, which deconstructs the passengers array and outputs one document per element. A second $match on the unwound documents then keeps only the elements for that passenger, and a final $project returns the required fields. So your aggregation pipeline for user3 would look like:
db.collection.aggregate([
{
"$match": {
"passengers.user": "user3"
}
},
{
"$unwind": "$passengers"
},
{
"$match": {
"passengers.user": "user3"
}
},
{
"$project": {
"_id": 0,
"driver": "$driver",
"times": "$passengers.times"
}
}
])
Result:
/* 0 */
{
"result" : [
{
"driver" : "user1",
"times" : 3
},
{
"driver" : "user2",
"times" : 2
}
],
"ok" : 1
}
UPDATE:
For grouping duplicates on drivers with different dates, as you mentioned you can do a $group operation just before the last $project pipeline stage where you compute the total passengers times using the $sum operator:
db.collection.aggregate([
{
"$match": {
"passengers.user": "user3"
}
},
{
"$unwind": "$passengers"
},
{
"$match": {
"passengers.user": "user3"
}
},
{
"$group": {
"_id": "$driver",
"total": {
"$sum": "$passengers.times"
}
}
},
{
"$project": {
"_id": 0,
"driver": "$_id",
"total": 1
}
}
])
Result:
/* 0 */
{
"result" : [
{
"total" : 2,
"driver" : "user2"
},
{
"total" : 3,
"driver" : "user1"
}
],
"ok" : 1
}
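The unwind, match and group steps can also be followed in plain JavaScript on the two sample documents. This is a sketch for illustration only: the `rides` array and the `timesPerDriver` helper are hypothetical names, and no database is involved.

```javascript
// In-memory copies of the two sample documents (_id omitted).
const rides = [
  { date: "28/02/2015", driver: "user1", passengers: [
    { user: "user2", times: 2 }, { user: "user3", times: 3 } ] },
  { date: "27/02/2015", driver: "user2", passengers: [
    { user: "user1", times: 2 }, { user: "user3", times: 2 } ] },
];

// Hypothetical helper mirroring $unwind, $match, $group/$sum and $project.
function timesPerDriver(docs, passenger) {
  const totals = new Map();
  for (const doc of docs) {
    for (const p of doc.passengers) {        // $unwind: one step per array element
      if (p.user !== passenger) continue;    // $match on passengers.user
      // $group by driver with $sum of passengers.times
      totals.set(doc.driver, (totals.get(doc.driver) || 0) + p.times);
    }
  }
  // $project: rename the group key to "driver"
  return [...totals].map(([driver, total]) => ({ driver, total }));
}

const user3Totals = timesPerDriver(rides, "user3");
console.log(user3Totals);
// [ { driver: 'user1', total: 3 }, { driver: 'user2', total: 2 } ]
```

The sketch returns drivers in the order they are first seen, whereas the order of $group output on the server is not guaranteed, which is why the result above lists user2 first.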
How can I get the maximum of sections.Id in the document below, where collection._id equals some parameter?
{
"_id" : ObjectId("571c5c87faf473f40fd0317c"),
"name" : "test 1",
"sections" : [
{
"Id" : 1,
"name" : "first section"
},
{
"Id" : 2,
"name" : "section 2"
},
{
"Id" : 3,
"name" : "section 3"
}
]
}
I have tried the following:
db.collection.aggregate(
[
{
"$match": {
"_id": ObjectId("571c5c87faf473f40fd0317c")
}
},
{
"$group" : {
"_id" : "$_id",
"maxSectionId" : {"$max" : "$sections.Id"}
}
}
]);
But instead of returning a single maximum integer value, it returns an array of all the Ids in the sections array.
Furthermore, the same query returns an empty array when executed in node.js.
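You can do this with a single $project stage, something like the one below. Note that it returns the whole section sub-document that has the largest Id rather than just the number, and that $indexOfArray needs MongoDB 3.4 or later. Put the $match on _id from the question in front of it if you only want one document.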
db.collection.aggregate([
{ "$project": {
"maxSectionId": {
"$arrayElemAt": [
"$sections",
{
"$indexOfArray": [
"$sections.Id",
{ "$max": "$sections.Id" }
]
}
]
}
}}
])
Your aggregation query needs $unwind to open up the "sections" array.
Add this stage to your aggregation query:
{$unwind : "$sections"}
The revised aggregation query then looks like this:
db.collection.aggregate(
[
{$unwind : "$sections"},
{
"$match": {
"_id": ObjectId("571c5c87faf473f40fd0317c")
}
},
{
"$group" : {
"_id" : "$_id",
"maxSectionId" : {"$max" : "$sections.Id"}
}
}
]);
For more on $unwind, see: https://docs.mongodb.org/manual/reference/operator/aggregation/unwind/
Replace $group with $project. The documentation for $max says:
In the $group stage, if the expression resolves to an array, $max does not traverse the array and compares the array as a whole.
With a single expression as its operand, if the expression resolves to an array, $max traverses into the array to operate on the numerical elements of the array to return a single value
[sic]
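The difference between the two behaviours can be pictured in plain JavaScript. This is an illustrative sketch on a copy of the sample document, not server code: the `doc` constant and the `maxSectionIdOf` helper are made-up names. Returning null for an empty array is an assumption about how you would want to handle that case.

```javascript
// In-memory copy of the sample document (_id omitted).
const doc = {
  name: "test 1",
  sections: [
    { Id: 1, name: "first section" },
    { Id: 2, name: "section 2" },
    { Id: 3, name: "section 3" },
  ],
};

// Hypothetical helper: the maximum of "$sections.Id" for one document.
function maxSectionIdOf(d) {
  // "$sections.Id" resolves to an array of the Id values: [1, 2, 3]
  const ids = d.sections.map(s => s.Id);
  // $max in $project traverses into that array and returns a single value
  return ids.length ? Math.max(...ids) : null;
}

const maxSectionId = maxSectionIdOf(doc);
console.log(maxSectionId); // 3
```

In $group without a preceding $unwind, by contrast, the array [1, 2, 3] is compared as a single value, which is why the original query returned the whole array.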
Assuming I have a collection called "posts" (in reality it is a more complex collection, posts is too simple) with the following structure:
> db.posts.find()
{ "_id" : ObjectId("50ad8d451d41c8fc58000003"), "title" : "Lorem ipsum", "author" :
"John Doe", "content" : "This is the content", "tags" : [ "SOME", "RANDOM", "TAGS" ] }
I expect this collection to span hundreds of thousands, perhaps millions, of documents. I need to query for posts by tags, group the results by tag, and display the results paginated. This is where the aggregation framework comes in. I plan to use the aggregate() method to query the collection:
db.posts.aggregate([
{ "$unwind" : "$tags" },
{ "$group" : {
_id: { tag: "$tags" },
count: { $sum: 1 }
} }
]);
The catch is that to create the paginator I would need to know the length of the output array. I know that to do that you can do:
db.posts.aggregate([
{ "$unwind" : "$tags" },
{ "$group" : {
_id: { tag: "$tags" },
count: { $sum: 1 }
} },
{ "$group" : {
_id: null,
total: { $sum: 1 }
} }
]);
But that would discard the output from the previous pipeline stage (the first group). Is there a way to combine the two operations while preserving each stage's output? I know that the output of the whole aggregate operation can be cast to an array in some language and have its contents counted, but the pipeline output may exceed the 16 MB limit. Also, performing the same query again just to obtain the count seems like a waste.
So is obtaining the document result and count at the same time possible? Any help is appreciated.
Use $project to save tag and count into tmp.
Use $push or $addToSet to store tmp in your data list.
Code:
db.test.aggregate(
{$unwind: '$tags'},
{$group:{_id: '$tags', count:{$sum:1}}},
{$project:{tmp:{tag:'$_id', count:'$count'}}},
{$group:{_id:null, total:{$sum:1}, data:{$addToSet:'$tmp'}}}
)
Output:
{
"result" : [
{
"_id" : null,
"total" : 5,
"data" : [
{
"tag" : "SOME",
"count" : 1
},
{
"tag" : "RANDOM",
"count" : 2
},
{
"tag" : "TAGS1",
"count" : 1
},
{
"tag" : "TAGS",
"count" : 1
},
{
"tag" : "SOME1",
"count" : 1
}
]
}
],
"ok" : 1
}
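What the two $group stages produce together, a total plus the per-tag counts, can be sketched in plain JavaScript. The two `posts` documents below are invented so that the counts match the output above; they are not the answerer's test data. The `tagCountsWithTotal` helper is likewise a made-up name.

```javascript
// Hypothetical sample data chosen to give the five tags in the output above.
const posts = [
  { title: "Lorem ipsum", tags: ["SOME", "RANDOM", "TAGS"] },
  { title: "Dolor sit", tags: ["RANDOM", "SOME1", "TAGS1"] },
];

// Hypothetical helper mirroring $unwind, the first $group and the second $group.
function tagCountsWithTotal(docs) {
  const counts = new Map();
  for (const post of docs) {
    for (const tag of post.tags) {                  // $unwind: '$tags'
      counts.set(tag, (counts.get(tag) || 0) + 1);  // $group by tag with $sum: 1
    }
  }
  // $project into { tag, count }, then collect everything into one document
  const data = [...counts].map(([tag, count]) => ({ tag, count }));
  return { total: data.length, data };              // total = number of distinct tags
}

const summary = tagCountsWithTotal(posts);
console.log(summary.total); // 5
```

Because everything ends up in one output document, that document is itself bound by the 16 MB BSON document size limit, so this approach suits cases where the number of distinct tags is modest.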
I'm not sure you need the aggregation framework for this, other than for counting all the tags, e.g.:
db.posts.aggregate(
{ "$unwind" : "$tags" },
{ "$group" : {
_id: { tag: "$tags" },
count: { $sum: 1 }
} }
);
For paginating through the posts for each tag you can just use the normal query syntax, like so:
db.posts.find({tags: "RANDOM"}).skip(10).limit(10)