A document in my DB looks like this:
{
"_id": ObjectId("5e92e63fad262707ff301d6c"),
"uknum": 30,
"area": "bath",
"ukelectors": 62355,
"ukresults": [
{
"party": "con",
"leader": "thatcher",
"ukvotes": 22544
},
{
"party": "lab",
"leader": "foot",
"ukvotes": 7259
},
{
"party": "sdp",
"leader": "jenkins",
"ukvotes": 17240
},
{
"party": "eco",
"leader": "whittaker",
"ukvotes": 441
}
]
}
Requirement:
I need to build a query in Python to get the name of the party that won area: bath. Basically, find which party got the most votes and return that party.
The idea was to use the $max aggregation operator, but it does not seem to work.
You can do that with either of the following aggregation pipelines:
Query 1: without $unwind, using $reduce on the array:
db.collection.aggregate([
{ $match: { area: "bath" } },
{
$addFields: {
ukresults: {
$let: {
vars: {
res: {
$reduce: {
input: "$ukresults",
initialValue: { votes: 0, party: {} },
in: {
votes: {
$cond: [
{ $gt: ["$$this.ukvotes", "$$value.votes"] },
"$$this.ukvotes",
"$$value.votes",
],
},
party: {
$cond: [
{ $gt: ["$$this.ukvotes", "$$value.votes"] },
"$$this",
"$$value.party",
],
},
},
},
},
},
in: "$$res.party",
},
},
},
},
]);
Test : MongoDB-Playground
Query 2: with $unwind:
db.collection.aggregate([
{
$match: {
area: "bath"
}
},
{
$unwind: {
path: "$ukresults",
preserveNullAndEmptyArrays: true
}
},
{
$sort: {
"ukresults.ukvotes": -1
}
},
{
$limit: 1
}
])
Test : MongoDB-Playground
Both should perform well, since the $match in the first stage means we are working on essentially one document. The first query may take a while to iterate if the array has many elements, so try both and choose whichever suits you best.
Ref: see the PyMongo documentation for aggregation examples: pymongo-aggregation.
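To make the logic of Query 2 concrete, here is a minimal plain-Python sketch of what the $unwind, $sort and $limit stages do to the one matched document. It does not talk to MongoDB; the helper name unwind is hypothetical and only mimics the stage for a non-empty array.

```python
# The matched document, trimmed to the fields the pipeline uses.
doc = {
    "area": "bath",
    "ukresults": [
        {"party": "con", "leader": "thatcher", "ukvotes": 22544},
        {"party": "lab", "leader": "foot", "ukvotes": 7259},
        {"party": "sdp", "leader": "jenkins", "ukvotes": 17240},
        {"party": "eco", "leader": "whittaker", "ukvotes": 441},
    ],
}


def unwind(document, field):
    """Yield one copy of the document per array element (hypothetical helper)."""
    for element in document[field]:
        yield {**document, field: element}


# $sort on ukresults.ukvotes descending, then $limit: 1.
rows = sorted(
    unwind(doc, "ukresults"),
    key=lambda d: d["ukresults"]["ukvotes"],
    reverse=True,
)
top = rows[:1]
print(top[0]["ukresults"]["party"])
```

The unwound copies exist only for the sort, which is why the $reduce variant can avoid them entirely.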
The following aggregation query prints the ukresults sub-document with maximum votes.
The aggregation operator $max cannot be applied directly to the ukresults array, because the array holds sub-documents rather than scalar values such as numbers. So we use the $reduce aggregation operator to extract the sub-document with the maximum votes. Note that both $reduce and $max perform a reduction over an array; the difference is the data type of the array elements.
The PyMongo code:
import pymongo
import pprint
client = pymongo.MongoClient()
collection = client.test.testCollection
pipeline = [
{
"$match": { "area": "bath" }
},
{
"$addFields": {
"maxvotes": {
"$reduce": {
"input": "$ukresults",
"initialValue": { "ukvotes": 0 },
"in": {
"$cond": [
{ "$gt": [ "$$this.ukvotes", "$$value.ukvotes"] },
"$$this",
"$$value"
]
}
}
}
}
},
{
"$project": {
"_id": 0,
"area": 1,
"maxvotes": 1
}
}
]
pprint.pprint(list(collection.aggregate(pipeline)))
The output:
[{'area': 'bath',
'maxvotes': {'leader': 'thatcher', 'party': 'con', 'ukvotes': 22544.0}}]
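For intuition, the same reduction can be sketched in plain Python with functools.reduce. This is only an illustration of the $reduce logic, run on the sample sub-documents outside MongoDB; the function name max_votes is hypothetical.

```python
from functools import reduce

# Sub-documents copied from the sample "bath" document.
ukresults = [
    {"party": "con", "leader": "thatcher", "ukvotes": 22544},
    {"party": "lab", "leader": "foot", "ukvotes": 7259},
    {"party": "sdp", "leader": "jenkins", "ukvotes": 17240},
    {"party": "eco", "leader": "whittaker", "ukvotes": 441},
]


def max_votes(results):
    """Keep whichever sub-document has more votes, like the $cond inside $reduce."""
    return reduce(
        lambda value, this: this if this["ukvotes"] > value["ukvotes"] else value,
        results,
        {"ukvotes": 0},  # plays the role of initialValue in the pipeline
    )


winner = max_votes(ukresults)
print(winner["party"])
```

As in the pipeline, an empty array simply returns the initial value, so the caller should be prepared for a result without a party key.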
I have the following mongodb structure...
[
{
track: 'Newcastle',
time: '17:30',
date: '22/04/2022',
bookmakers: [
{
bookmaker: 'Coral',
runners: [
{
runner: 'John',
running: true,
odds: 3.2
},
...
]
},
...
]
},
...
]
I'm trying to filter the bookmakers array of each document so that it only includes the objects that match the specified bookmaker values, for example:
{ 'bookmakers.bookmaker': { $in: ['Coral', 'Bet365'] } }
At the moment, I'm using the following MongoDB query to select only the specified bookmakers. However, I need to put the documents back together after they've been separated by the $unwind. Is there a way I can do this using $group?
await HorseRacingOdds.aggregate([
{ $unwind: "$bookmakers" },
{
$group: {
_id: "$_id",
bookmakers: "$bookmakers"
}
},
{
$project: {
"_id": 0,
"__v": 0,
"lastUpdate": 0
}
}
])
How about a plain $addFields with $filter?
db.collection.aggregate([
{
"$addFields": {
"bookmakers": {
"$filter": {
"input": "$bookmakers",
"as": "b",
"cond": {
"$in": [
"$$b.bookmaker",
[
"Coral",
"Bet365"
]
]
}
}
}
}
},
{
$project: {
"_id": 0,
"__v": 0,
"lastUpdate": 0
}
}
])
Here is the Mongo playground for your reference.
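As a rough illustration of what $filter does here, the following plain-Python sketch keeps only the wanted bookmakers inside one document and leaves the document itself in one piece, so no regrouping is needed. The third bookmaker name and the helper filter_bookmakers are hypothetical.

```python
# One document shaped like the question's sample; "OtherBook" is made up.
doc = {
    "track": "Newcastle",
    "time": "17:30",
    "bookmakers": [
        {"bookmaker": "Coral", "runners": []},
        {"bookmaker": "Bet365", "runners": []},
        {"bookmaker": "OtherBook", "runners": []},
    ],
}


def filter_bookmakers(document, names):
    """Return a copy whose bookmakers array keeps only entries named in `names`."""
    kept = [b for b in document["bookmakers"] if b["bookmaker"] in names]
    return {**document, "bookmakers": kept}


result = filter_bookmakers(doc, ["Coral", "Bet365"])
print([b["bookmaker"] for b in result["bookmakers"]])
```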
I have a collection of users where each document has following structure:
{
"_id": "<id>",
"login": "xxx",
"solved": [
{
"problem": "<problemID>",
"points": 10
},
...
]
}
The field solved may be empty or contain arbitrarily many subdocuments. My goal is to get a list of users together with their total score (sum of points), where users that haven't solved any problem yet are assigned a total score of 0. Is it possible to do this with a single query (ideally using the aggregation framework)?
I was trying to use the following query in the aggregation framework:
{ "$group": {
"_id": "$_id",
"login": { "$first": "$login" },
"solved": { "$addToSet": { "points": 0 } }
} }
{ "$unwind": "$solved" }
{ "$group": {
"_id": "$_id",
"login": { "$first": "$login" },
"solved": { "$sum": "$solved.points" }
} }
However, I am getting the following error:
exception: The top-level _id field is the only field currently supported for exclusion
Thank you in advance
In MongoDB 3.2 and newer, the $unwind operator accepts options; in particular, the preserveNullAndEmptyArrays option solves this.
If this option is set to true and if the path is null, missing, or an empty array, $unwind outputs the document. If false, $unwind does not output a document if the path is null, missing, or an empty array. In your case, set it to true:
db.collection.aggregate([
{ "$unwind": {
"path": "$solved",
"preserveNullAndEmptyArrays": true
} },
{ "$group": {
"_id": "$_id",
"login": { "$first": "$login" },
"solved": { "$sum": "$solved.points" }
} }
])
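The effect of preserveNullAndEmptyArrays can be sketched in plain Python. This is a simplified emulation that only handles missing, null, empty and array values; the logins and the helper names unwind and total_scores are hypothetical.

```python
users = [
    {"_id": 1, "login": "alice", "solved": [
        {"problem": "p1", "points": 10},
        {"problem": "p2", "points": 5},
    ]},
    {"_id": 2, "login": "bob", "solved": []},  # empty array
    {"_id": 3, "login": "carol"},              # field missing
]


def unwind(doc, field, preserve=False):
    """Simplified $unwind: one output per element, optionally keeping empty cases."""
    value = doc.get(field)
    if value is None or value == []:
        if preserve:
            # Emit the document once; an empty array field is dropped from it.
            yield {k: v for k, v in doc.items() if not (k == field and v == [])}
        return
    for element in value:
        yield {**doc, field: element}


def total_scores(docs, preserve):
    """Group by login and sum solved.points, treating a missing value as 0."""
    totals = {}
    for doc in docs:
        for row in unwind(doc, "solved", preserve):
            solved = row.get("solved") or {}
            totals[row["login"]] = totals.get(row["login"], 0) + solved.get("points", 0)
    return totals


print(total_scores(users, preserve=True))
print(total_scores(users, preserve=False))
```

With preserve=False the users without solved problems never reach the grouping step, which is exactly the behaviour the option is meant to avoid.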
Here is the solution - it assumes that the field "solved" is either absent, is equal to null or has an array of problems and scores solved. The case it does not handle is "solved" being an empty array - although that would be a simple additional adjustment you could add.
project = {$project : {
"s" : {
"$ifNull" : [
"$solved",
[
{
"points" : 0
}
]
]
},
"login" : 1
}
};
unwind={$unwind:"$s"};
group= { "$group" : {
"_id" : "$_id",
"login" : {
"$first" : "$login"
},
"score" : {
"$sum" : "$s.points"
}
}
}
db.students.aggregate( [ project, unwind, group ] );
$lookup, then $unwind an array inside the looked-up document that could be empty:
let posts = await Post.aggregate<ActivityDoc>([
{
$match: {
_id: new mongoose.Types.ObjectId(req.params.id),
},
},
{
$lookup: {
from: 'users',
localField: 'user',
foreignField: '_id',
as: 'user',
},
},
{
$unwind: '$user',
},
{
$unwind: {
path: '$user.follower',
preserveNullAndEmptyArrays: true,
},
},
{
$match: {
$or: [
{
$and: [
{
'privacy.mode': {
$eq: PrivacyMode.EveryOne,
},
},
],
},
{
$and: [
{
'privacy.mode': {
$eq: PrivacyMode.MyCircle,
},
},
{
'user.follower.id': {
$eq: req.currentUser?.id,
},
},
],
},
],
},
},
]);
I have an article collection:
{
_id: 9999,
authorId: 12345,
coAuthors: [23456,34567],
title: 'My Article'
},
{
_id: 10000,
authorId: 78910,
title: 'My Second Article'
}
I'm trying to figure out how to get a list of distinct author and co-author ids out of the database. I have tried push, concat, and addToSet, but can't seem to find the right combination. I'm on 2.4.6 so I don't have access to setUnion.
Whilst $setUnion would be the "ideal" way to do this, there is another way that basically involves "switching" on a "type" field to alternate which field is picked:
db.collection.aggregate([
{ "$project": {
"authorId": 1,
"coAuthors": { "$ifNull": [ "$coAuthors", [null] ] },
"type": { "$const": [ true,false ] }
}},
{ "$unwind": "$coAuthors" },
{ "$unwind": "$type" },
{ "$group": {
"_id": {
"$cond": [
"$type",
"$authorId",
"$coAuthors"
]
}
}},
{ "$match": { "_id": { "$ne": null } } }
])
And that is it. You may know the $const operation as the $literal operator from MongoDB 2.6. It has always been there, but was only documented and given an "alias" at the 2.6 release.
Of course the $unwind operations in both cases produce more "copies" of the data, but this is grouping for "distinct" values, so it does not matter. Depending on the alternating true/false value of the projected "type" field (once unwound), you simply pick one field or the other.
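The "type" switching trick may be easier to follow as nested loops. The plain-Python sketch below mirrors each stage of the pipeline on the two sample articles; it is an illustration only, and the function name distinct_author_ids is hypothetical.

```python
articles = [
    {"_id": 9999, "authorId": 12345, "coAuthors": [23456, 34567], "title": "My Article"},
    {"_id": 10000, "authorId": 78910, "title": "My Second Article"},
]


def distinct_author_ids(docs):
    ids = set()
    for doc in docs:
        co_authors = doc.get("coAuthors")
        if co_authors is None:
            co_authors = [None]                    # the $ifNull fallback
        for co_author in co_authors:               # first $unwind
            for pick_author in (True, False):      # second $unwind, on "type"
                # $cond: choose authorId or the co-author depending on "type"
                value = doc["authorId"] if pick_author else co_author
                if value is not None:              # final $match drops null
                    ids.add(value)                 # $group on the picked value
    return ids


print(sorted(distinct_author_ids(articles)))
```

Note that, as in the pipeline, a document whose coAuthors is an empty array contributes nothing, because unwinding an empty array yields no rows.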
Also this little mapReduce does much the same thing:
db.collection.mapReduce(
function() {
emit(this.authorId,null);
if ( this.hasOwnProperty("coAuthors"))
this.coAuthors.forEach(function(id) {
emit(id,null);
});
},
function(key,values) {
return null;
},
{ "out": { "inline": 1 } }
)
For the record, $setUnion is of course a lot cleaner and more performant:
db.collection.aggregate([
{ "$project": {
"combined": {
"$setUnion": [
{ "$map": {
"input": ["A"],
"as": "el",
"in": "$authorId"
}},
{ "$ifNull": [ "$coAuthors", [] ] }
]
}
}},
{ "$unwind": "$combined" },
{ "$group": {
"_id": "$combined"
}}
])
So there the only real concerns are converting the singular "authorId" to an array via $map and feeding an empty array where the "coAuthors" field is not present in the document.
Both output the same distinct values from the sample documents:
{ "_id" : 78910 }
{ "_id" : 23456 }
{ "_id" : 34567 }
{ "_id" : 12345 }
The collection below, named "coll", is stored in MongoDB.
{
{"_id":1, "set":[1,2,3,4,5]},
{"_id":2, "set":[0,2,6,4,5]},
{"_id":3, "set":[1,2,5,10,22]}
}
How can I find the intersection of the set elements in the documents with _id 1 and 3?
Use the aggregation framework to get the desired result. The aggregation set operator that would do the magic is $setIntersection.
The following aggregation pipeline achieves what you are after:
db.test.aggregate([
{
"$match": {
"_id": { "$in": [1, 3] }
}
},
{
"$group": {
"_id": 0,
"set1": { "$first": "$set" },
"set2": { "$last": "$set" }
}
},
{
"$project": {
"set1": 1,
"set2": 1,
"commonToBoth": { "$setIntersection": [ "$set1", "$set2" ] },
"_id": 0
}
}
])
Output:
/* 0 */
{
"result" : [
{
"set1" : [1,2,3,4,5],
"set2" : [1,2,5,10,22],
"commonToBoth" : [1,2,5]
}
],
"ok" : 1
}
UPDATE
For three or more documents to be intersected, you'd need the $reduce operator to fold the arrays into a single result. This lets you intersect any number of arrays, rather than just the two arrays from docs 1 and 3.
Consider running the following aggregate operation:
db.test.aggregate([
{ "$match": { "_id": { "$in": [1, 3] } } },
{
"$group": {
"_id": 0,
"sets": { "$push": "$set" },
"initialSet": { "$first": "$set" }
}
},
{
"$project": {
"commonSets": {
"$reduce": {
"input": "$sets",
"initialValue": "$initialSet",
"in": { "$setIntersection": ["$$value", "$$this"] }
}
}
}
}
])
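The fold performed by $reduce with $setIntersection can be sketched in plain Python on the sample collection. This is an illustration rather than MongoDB code; the helper common_elements is hypothetical, and the result is sorted only to make the output deterministic.

```python
from functools import reduce

coll = [
    {"_id": 1, "set": [1, 2, 3, 4, 5]},
    {"_id": 2, "set": [0, 2, 6, 4, 5]},
    {"_id": 3, "set": [1, 2, 5, 10, 22]},
]


def common_elements(docs, ids):
    """Intersect the "set" arrays of every document whose _id is in `ids`."""
    sets = [set(d["set"]) for d in docs if d["_id"] in ids]  # $match + $push
    if not sets:
        return []
    # The first set is the initial value, as with initialSet in the pipeline.
    common = reduce(lambda value, this: value & this, sets, sets[0])
    return sorted(common)


print(common_elements(coll, [1, 3]))
print(common_elements(coll, [1, 2, 3]))
```

Starting from the first set means it is intersected with itself once, which is harmless and matches what the pipeline does.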
I have collection of products. Each product contains array of items.
> db.products.find().pretty()
{
"_id" : ObjectId("54023e8bcef998273f36041d"),
"shop" : "shop1",
"name" : "product1",
"items" : [
{
"date" : "01.02.2100",
"purchasePrice" : 1,
"sellingPrice" : 10,
"count" : 15
},
{
"date" : "31.08.2014",
"purchasePrice" : 10,
"sellingPrice" : 1,
"count" : 5
}
]
}
Can you please advise how I can query MongoDB to retrieve all products, each with only the single item whose date equals the date I pass to the query as a parameter?
The result for "31.08.2014" must be:
{
"_id" : ObjectId("54023e8bcef998273f36041d"),
"shop" : "shop1",
"name" : "product1",
"items" : [
{
"date" : "31.08.2014",
"purchasePrice" : 10,
"sellingPrice" : 1,
"count" : 5
}
]
}
What you are looking for is the positional $ operator and "projection". For a single field you need to match the required array element using "dot notation", for more than one field use $elemMatch:
db.products.find(
{ "items.date": "31.08.2014" },
{ "shop": 1, "name":1, "items.$": 1 }
)
Or the $elemMatch for more than one matching field:
db.products.find(
{ "items": {
"$elemMatch": { "date": "31.08.2014", "purchasePrice": 1 }
}},
{ "shop": 1, "name":1, "items.$": 1 }
)
These work for a single array element only though and only one will be returned. If you want more than one array element to be returned from your conditions then you need more advanced handling with the aggregation framework.
db.products.aggregate([
{ "$match": { "items.date": "31.08.2014" } },
{ "$unwind": "$items" },
{ "$match": { "items.date": "31.08.2014" } },
{ "$group": {
"_id": "$_id",
"shop": { "$first": "$shop" },
"name": { "$first": "$name" },
"items": { "$push": "$items" }
}}
])
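The difference between the positional projection and the aggregation above can be sketched in plain Python: the first returns at most one matching array element, the second returns all of them. The third item and the helper names first_match and all_matches are hypothetical, added only to show the contrast.

```python
product = {
    "shop": "shop1",
    "name": "product1",
    "items": [
        {"date": "01.02.2100", "purchasePrice": 1, "sellingPrice": 10, "count": 15},
        {"date": "31.08.2014", "purchasePrice": 10, "sellingPrice": 1, "count": 5},
        # Hypothetical second item on the same date:
        {"date": "31.08.2014", "purchasePrice": 3, "sellingPrice": 4, "count": 2},
    ],
}


def first_match(doc, date):
    """Like the positional projection: keep only the first matching item."""
    for item in doc["items"]:
        if item["date"] == date:
            return {**doc, "items": [item]}
    return None  # the document does not match the query at all


def all_matches(doc, date):
    """Like $unwind / $match / $group: keep every matching item."""
    kept = [item for item in doc["items"] if item["date"] == date]
    return {**doc, "items": kept} if kept else None


print(len(first_match(product, "31.08.2014")["items"]))
print(len(all_matches(product, "31.08.2014")["items"]))
```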
Or possibly in shorter/faster form since MongoDB 2.6 where your array of items contains unique entries:
db.products.aggregate([
{ "$match": { "items.date": "31.08.2014" } },
{ "$project": {
"shop": 1,
"name": 1,
"items": {
"$setDifference": [
{ "$map": {
"input": "$items",
"as": "el",
"in": {
"$cond": [
{ "$eq": [ "$$el.date", "31.08.2014" ] },
"$$el",
false
]
}
}},
[false]
]
}
}}
])
Or possibly with $redact, but a little contrived:
db.products.aggregate([
{ "$match": { "items.date": "31.08.2014" } },
{ "$redact": {
"$cond": [
{ "$eq": [ { "$ifNull": [ "$date", "31.08.2014" ] }, "31.08.2014" ] },
"$$DESCEND",
"$$PRUNE"
]
}}
])
In more modern versions, you would use $filter:
db.products.aggregate([
{ "$match": { "items.date": "31.08.2014" } },
{ "$addFields": {
"items": {
"$filter": {
"input": "$items",
"cond": { "$eq": [ "$$this.date", "31.08.2014" ] }
}
}
}}
])
And with multiple conditions, use $elemMatch in the $match and $and within the $filter:
db.products.aggregate([
{ "$match": {
"items": {
"$elemMatch": { "date": "31.08.2014", "purchasePrice": 1 }
}
}},
{ "$addFields": {
"items": {
"$filter": {
"input": "$items",
"cond": {
"$and": [
{ "$eq": [ "$$this.date", "31.08.2014" ] },
{ "$eq": [ "$$this.purchasePrice", 1 ] }
]
}
}
}
}}
])
So it just depends on whether you expect a single element or multiple elements to match, and then on which approach is better. Where possible, the .find() method will generally be faster since it lacks the overhead of the other operations, although the last two forms do not lag far behind at all.
As a side note, your "dates" are represented as strings which is not a very good idea going forward. Consider changing these to proper Date object types, which will greatly help you in the future.
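To illustrate the point about string dates, here is a small Python sketch that parses the "DD.MM.YYYY" strings from the sample into datetime objects (which PyMongo stores as BSON dates). The helper name parse_item_date is hypothetical, and the format string assumes every date follows the sample's layout.

```python
from datetime import datetime

items = [
    {"date": "01.02.2100", "purchasePrice": 1, "sellingPrice": 10, "count": 15},
    {"date": "31.08.2014", "purchasePrice": 10, "sellingPrice": 1, "count": 5},
]


def parse_item_date(text):
    """Parse a 'DD.MM.YYYY' string such as '31.08.2014' into a datetime."""
    return datetime.strptime(text, "%d.%m.%Y")


# Copies of the items with real dates instead of strings.
converted = [{**item, "date": parse_item_date(item["date"])} for item in items]

# As strings the year 2100 entry sorts first; as dates it sorts last.
print(items[0]["date"] < items[1]["date"])
print(converted[0]["date"] < converted[1]["date"])
```

Range queries and sorting only behave sensibly on the converted values, which is the main practical reason to store real dates.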
Based on Neil Lunn's code, I use this solution. It automatically includes all first-level keys (but you could also exclude keys if you want):
db.products.find(
{ "items.date": "31.08.2014" },
{ items: { $elemMatch: { date: "31.08.2014" } } }
)
With multiple requirements:
db.products.find(
{ "items": {
"$elemMatch": { "date": "31.08.2014", "purchasePrice": 1 }
}},
{ items: { $elemMatch: { "date": "31.08.2014", "purchasePrice": 1 } } },
)
Mongo supports dot notation for sub-queries.
See: http://docs.mongodb.org/manual/reference/glossary/#term-dot-notation
Depending on your driver, you want something like:
db.products.find({"items.date":"31.08.2014"});
Note that the attribute is in quotes for dot notation, even if usually your driver doesn't require this.