I have a collection with 2 docs like below.
{
_id:1,
Score: 30,
Class:A,
School:X
}
{
Score:40,
Class:A,
School:Y
}
I need help writing a query that returns each school's score as a fraction of the total score, like below:
{
School:X,
Percent:30/70
}
{
School:Y,
Percent:40/70
}
This input:
var r =
[
{"school":"X", "class":"A", "score": 30}
,{"school":"Y", "class":"A", "score": 40}
,{"school":"Z", "class":"A", "score": 20}
,{"school":"Y", "class":"B", "score": 50}
,{"school":"Z", "class":"B", "score": 17}
];
run through this pipeline:
db.foo.aggregate([
// Use $group to gather up the class and save the inputs via $push
{$group: {_id: "$class", tot: {$sum: "$score"}, items: {$push: {score:"$score",school:"$school"}}} }
// Now we have total by class, so just "reproject" that array and do some nice
// formatting as requested:
,{$project: {
items: {$map: { // overwrite input array $items; this is OK
input: "$items",
as: "z",
in: {
school: "$$z.school",
pct: {$concat: [ {$toString: "$$z.score"}, "/", {$toString:"$tot"} ]}
}
}}
}}
]);
produces this output, where _id is the Class:
{
"_id" : "A",
"items" : [
{"school" : "X", "pct" : "30/90"},
{"school" : "Y", "pct" : "40/90"},
{"school" : "Z", "pct" : "20/90"}
]
}
{
"_id" : "B",
"items" : [
{"school" : "Y", "pct" : "50/67"},
{"school" : "Z", "pct" : "17/67"}
]
}
From here you can $unwind if you wish.
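If you want to sanity-check the numbers outside the database, here is a minimal plain-JavaScript sketch of the same two steps (total per class, then format each entry). The helper name `pctByClass` is made up for illustration and is not part of MongoDB.

```javascript
// Same sample input as above.
const docs = [
  { school: "X", class: "A", score: 30 },
  { school: "Y", class: "A", score: 40 },
  { school: "Z", class: "A", score: 20 },
  { school: "Y", class: "B", score: 50 },
  { school: "Z", class: "B", score: 17 },
];

// Hypothetical helper: mimics the $group and $project stages in plain JS.
function pctByClass(rows) {
  const groups = new Map();
  for (const { school, class: cls, score } of rows) {
    if (!groups.has(cls)) groups.set(cls, { _id: cls, tot: 0, items: [] });
    const g = groups.get(cls);
    g.tot += score;                  // like {$sum: "$score"}
    g.items.push({ school, score }); // like {$push: {...}}
  }
  // Like the $map inside $project: format "score/total" per item.
  return [...groups.values()].map(g => ({
    _id: g._id,
    items: g.items.map(z => ({ school: z.school, pct: `${z.score}/${g.tot}` })),
  }));
}

const result = pctByClass(docs);
console.log(JSON.stringify(result, null, 1));
```

If you would rather have a numeric percentage than the "30/90" string, replace the template string with `(z.score / g.tot) * 100`; in the pipeline the corresponding operators would be $divide and $multiply.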
I have a collection named "records" that contains documents in the following form:
{
"name": "a",
"items": [
{
"a": "5",
"b": "1",
"c": "2"
},
{
"a": "6",
"b": "3",
"c": "7"
}
]
}
I want to keep the data just as it is in the database (to make the data easy to read and interpret). But I'd like to run a query that returns the data in the following form:
{
"name": "a",
"items": [
["5", "1", "2"],
["6", "3", "7"]
]
}
Is this possible with pymongo? I know I can run a query and translate the documents using Python, but I'd like to avoid iterating over the query result if possible.
A note on terminology first: in MongoDB, "records" is a collection, not a table. And yes, this is possible with pymongo.
I'd suggest using a view to transform your data at query time in MongoDB.
That way you get the transformed data and can run find against it whenever you need.
db.createCollection(
"view_name",
{"viewOn": "original_collection_name",
"pipeline": [{$unwind: "$items"},
{$project: {name: 1, items: {$objectToArray: "$items"}}},
{$project: {name: 1, items: {$concatArrays: ["$items.v"]}}},
{$group: {_id: "$_id", name: {$first: "$name"},
items: {$push: "$items"}}}]
}
)
> db.view_name.find({name: "a"})
{ "_id" : ObjectId("5fc3dbb69cb76f866582620f"), "name" : "a", "items" : [ [ "5", "1", "2" ], [ "6", "3", "7" ] ] }
> db.view_name.find({"items": {$in: [["5", "1", "2"]]}})
{ "_id" : ObjectId("5fc3dbb69cb76f866582620f"), "name" : "a", "items" : [ [ "5", "1", "2" ], [ "6", "3", "7" ] ] }
> db.view_name.find()
{ "_id" : ObjectId("5fc3dbb69cb76f866582620f"), "name" : "a", "items" : [ [ "5", "1", "2" ], [ "6", "3", "7" ] ] }
Query:
db.original_collection_name.aggregate([
{$unwind: "$items"},
{$project: {name: 1, items: {$objectToArray: "$items"}}},
{$project: {name: 1, items: {$concatArrays: ["$items.v"]}}},
{$group: {_id: "$_id", name: {$first: "$name"}, items: {$push: "$items"}}}])
Using $objectToArray and $map transformations:
// { name: "a", items: [ { a: "5", b: "1", c: "2" }, { a: "6", b: "3", c: "7" } ] }
db.collection.aggregate([
{ $set: { items: { $map: { input: "$items", as: "x", in: { $objectToArray: "$$x" } } } } },
// {
// name: "a",
// items: [
// [ { k: "a", v: "5" }, { k: "b", v: "1" }, { k: "c", v: "2" } ],
// [ { k: "a", v: "6" }, { k: "b", v: "3" }, { k: "c", v: "7" } ]
// ]
// }
{ $set: { items: { $map: { input: "$items", as: "x", in: "$$x.v" } } } }
])
// { name: "a", items: [["5", "1", "2"], ["6", "3", "7"]] }
This maps items' elements as key/value arrays such that { field: "value" } becomes [ { k: "field", v: "value" } ]. This way whatever the field name, we can easily access the value using v, which is the role of the second $set stage: "$$x.v".
This has the benefit of avoiding heavy stages such as unwind/group.
Note that you can also nest the second $map inside the first, but that's probably less readable.
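To make the two steps concrete, here is a rough plain-JavaScript analogy (not MongoDB code): `Object.entries` plays the role of $objectToArray, and picking `kv.v` plays the role of "$$x.v". The variable names are only illustrative.

```javascript
const doc = {
  name: "a",
  items: [
    { a: "5", b: "1", c: "2" },
    { a: "6", b: "3", c: "7" },
  ],
};

// Step 1, like $objectToArray: { field: "value" } -> [ { k: "field", v: "value" } ]
const asKeyValue = doc.items.map(x =>
  Object.entries(x).map(([k, v]) => ({ k, v }))
);

// Step 2, like "$$x.v": keep only the values, whatever the field names were.
const valuesOnly = asKeyValue.map(x => x.map(kv => kv.v));

const transformed = { ...doc, items: valuesOnly };
console.log(JSON.stringify(transformed));
```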
I have a test collection:
{
"_id" : ObjectId("5exxxxxx03"),
"username" : "abc",
"col1" : [
{
"colId" : 1,
"col2" : [
{
"name" : "a",
"value" : 10
},
{
"name" : "b",
"value" : 20
},
{
"name" : "c",
"value" : 30
}
],
"col3" : [
{
"name" : "d",
"value" : 15
},
{
"name" : "e",
"value" : 25
},
{
"name" : "f",
"value" : 35
}
]
}
]
}
col1 holds a list of sub-documents, each containing the arrays col2 and col3, which have the same shape but convey different meanings. The elements of both arrays have name and value as fields.
Now, I need to find the max value from col2 or col3 and its corresponding name.
I tried the below query:
db.test.aggregate([
{$unwind: '$col1'},
{$unwind: '$col1.col2'},
{$unwind: '$col1.col3'},
{$group:
{_id: '$col1.colId',
maxCol2: {$max: '$col1.col2.value'},
maxCol3: {$max: '$col1.col3.value'}}},
{$project:
{maxValue: {$max: ['$maxCol2', '$maxCol3']},
name: {$cond: [
{$eq: ['$maxValue', '$maxCol2']},
'$col1.col2.name',
'$col1.col3.name']}}}]).pretty()
But it returned the following, with no name field in it:
{ "_id" : 1, "maxValue" : 35 }
So, just to check whether my condition is correct, I tried the following query (with $col1.col2.name and $col1.col3.name replaced by the strings 111 and 222):
db.test.aggregate([
{$unwind: '$col1'},
{$unwind: '$col1.col2'},
{$unwind: '$col1.col3'},
{$group:
{_id: '$col1.colId',
maxCol2: {$max: '$col1.col2.value'},
maxCol3: {$max: '$col1.col3.value'}}},
{$project:
{maxValue: {$max: ['$maxCol2', '$maxCol3']},
name: {$cond: [
{$eq: ['$maxValue', '$maxCol2']},
'111',
'222']}}}]).pretty()
Which gives me the expected output:
{ "_id" : 1, "maxValue" : 35, "name" : "222" }
Could anyone explain why I am not getting the correct answer, and how I should query this to get the correct output?
The correct output should be:
{ "_id" : 1, "maxValue" : 35, "name" : "f" }
P.S. - I'm a beginner.
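You can use the aggregation below. It merges col2 and col3 into one array with $reduce and takes the $max of it. Be aware that $max compares sub-documents field by field in stored order, so with name stored before value it actually selects the entry with the highest name; in the sample data that entry also happens to hold the highest value.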
db.collection.aggregate([
{ "$project": {
"col1": {
"$max": {
"$reduce": {
"input": "$col1",
"initialValue": [],
"in": {
"$concatArrays": [
"$$this.col2",
"$$value",
"$$this.col3"
]
}
}
}
}
}}
])
Try this one.
Explanation:
Besides the two maxima, the $group stage also collects the col2 and col3 entries with $addToSet. Once the max value is known, a $filter picks the entry whose value equals it, which gives us the name.
db.collection.aggregate([
{
$unwind: "$col1"
},
{
$unwind: "$col1.col2"
},
{
$unwind: "$col1.col3"
},
{
$group: {
_id: "$col1.colId",
maxCol2: {
$max: "$col1.col2.value"
},
maxCol3: {
$max: "$col1.col3.value"
},
col2: {
$addToSet: "$col1.col2"
},
col3: {
$addToSet: "$col1.col3"
}
}
},
{
$project: {
maxValue: {
$filter: {
input: {
$cond: [
{
$gt: [
"$maxCol2",
"$maxCol3"
]
},
"$col2",
"$col3"
]
},
cond: {
$eq: [
"$$this.value",
{
$cond: [
{
$gt: [
"$maxCol2",
"$maxCol3"
]
},
"$maxCol2",
"$maxCol3"
]
}
]
}
}
}
}
},
{
$unwind: "$maxValue"
},
{
$project: {
_id: 1,
maxValue: "$maxValue.value",
name: "$maxValue.name"
}
}
])
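The idea behind the pipeline (find the max value first, then look up the entry that carries it) can be sketched in plain JavaScript as follows. This is only an illustration of the logic, not MongoDB code; `maxPerCol1` is a made-up name, and the sketch assumes col2 and col3 together hold at least one entry.

```javascript
const doc = {
  username: "abc",
  col1: [
    {
      colId: 1,
      col2: [
        { name: "a", value: 10 },
        { name: "b", value: 20 },
        { name: "c", value: 30 },
      ],
      col3: [
        { name: "d", value: 15 },
        { name: "e", value: 25 },
        { name: "f", value: 35 },
      ],
    },
  ],
};

// Hypothetical helper: one result per col1 entry.
function maxPerCol1(d) {
  return d.col1.map(entry => {
    // Merge both arrays, since they share the same shape.
    const merged = [...entry.col2, ...entry.col3];
    // Keep the whole entry with the largest value, so the name comes along.
    const best = merged.reduce((a, b) => (b.value > a.value ? b : a));
    return { _id: entry.colId, maxValue: best.value, name: best.name };
  });
}

console.log(maxPerCol1(doc));
```

Keeping the whole sub-document while searching for the max is what the original attempt was missing: after its $group stage only the two numbers survived, so there was no name left to project.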
I have an aggregation:
{
$group: {
_id: "$_id",
cuid: {$first: "$cuid"},
uniqueConnexion: {
$addToSet: "$uniqueConnexion"
},
uniqueFundraisings: {
$addToSet: "$uniqueFundraisings"
}
}
},
that results in:
{
"cuid" : "cjcqe7qdo00nl0ltitkxdw8r6",
"uniqueConnexion" : [
"09.2019",
"06.2019",
"07.2019",
"08.2019",
"05.2019"
],
"uniqueFundraisings" : [
"06.2019",
"02.2019",
"01.2019",
"03.2019",
"09.2018",
"10.2018"
],
}
Now I want to merge the uniqueConnexion and uniqueFundraisings fields into a new field (named uniqueAction) and convert the values to a quarter format.
So an output like this:
{
"cuid" : "cjcqe7qdo00nl0ltitkxdw8r6",
"uniqueAction" : [
"Q4-2018",
"Q1-2019",
"Q2-2019",
"Q3-2019",
],
}
The previous answer shows the power of $setUnion operating on two lists. I have taken that and expanded a little more to get the OP target state. Given an input that more clearly shows some quarterly grouping (hint!):
var r =
{
"cuid" : "cjcqe7qdo00nl0ltitkxdw8r6",
"uniqueConnexion" : [
"01.2018",
"02.2018",
"08.2018",
"09.2018",
"10.2018",
"11.2018"
],
"uniqueFundraisings" : [
"01.2018",
"02.2018",
"05.2018",
"06.2018",
"12.2018"
],
};
this aggregation:
db.foo.aggregate([
// Unique-ify the two lists:
{ $project: {
cuid:1,
X: { $setUnion: [ "$uniqueConnexion", "$uniqueFundraisings" ] }
}}
// Now need to get to quarters....
// The input date is "MM.YYYY". Need to turn it into "Qn-YYYY":
,{ $project: {
X: {$map: {
input: "$X",
as: "z",
in: {$let: {
vars: { q: {$toInt: {$substr: ["$$z",0,2] }}},
in: {$concat: [{$cond: [
{$lte: ["$$q", 3]}, "Q1", {$cond: [
{$lte: ["$$q", 6]}, "Q2", {$cond: [
{$lte: ["$$q", 9]}, "Q3", "Q4"] }
]}
]} ,
"-", {$substr:["$$z",3,4]},
]}
}}}}}}
,{ $unwind: "$X"}
,{ $group: {_id: "$X", n: {$sum:1} }}
]);
produces this output. Yes, the OP was not looking for the count of things appearing in each quarter but very often that quickly follows on the heels of the original ask.
{ "_id" : "Q4-2018", "n" : 3 }
{ "_id" : "Q3-2018", "n" : 2 }
{ "_id" : "Q2-2018", "n" : 2 }
{ "_id" : "Q1-2018", "n" : 2 }
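For comparison, here is a small plain-JavaScript sketch of the same conversion, handy for checking the quarter logic by hand. It is not MongoDB code; `toQuarter` and `sortKey` are made-up helpers. It also builds the de-duplicated uniqueAction list the OP asked for.

```javascript
const r = {
  cuid: "cjcqe7qdo00nl0ltitkxdw8r6",
  uniqueConnexion: ["01.2018", "02.2018", "08.2018", "09.2018", "10.2018", "11.2018"],
  uniqueFundraisings: ["01.2018", "02.2018", "05.2018", "06.2018", "12.2018"],
};

// Hypothetical helper: "MM.YYYY" -> "Qn-YYYY".
function toQuarter(mmYYYY) {
  const month = parseInt(mmYYYY.slice(0, 2), 10);
  const year = mmYYYY.slice(3, 7);
  return `Q${Math.ceil(month / 3)}-${year}`;
}

// Like $setUnion: merge both lists and drop duplicates.
const months = [...new Set([...r.uniqueConnexion, ...r.uniqueFundraisings])];

// Like $unwind + $group with {$sum: 1}: count distinct months per quarter.
const counts = {};
for (const m of months) {
  const q = toQuarter(m);
  counts[q] = (counts[q] || 0) + 1;
}

// The OP's target shape: distinct quarters, sorted by year and then quarter.
const sortKey = q => q.slice(3) + q.slice(0, 2); // "Q4-2018" -> "2018Q4"
const uniqueAction = Object.keys(counts).sort((a, b) =>
  sortKey(a) < sortKey(b) ? -1 : sortKey(a) > sortKey(b) ? 1 : 0
);

console.log(counts, uniqueAction);
```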
I think this will help you:
{ $project: {
cuid:1,
uniqueAction: { $setUnion: [ "$uniqueConnexion", "$uniqueFundraisings" ] }, _id: 0
}
}
Is there an easy way in MongoDB to find the documents that match a query and then modify the result, without modifying the persisted data, depending on whether a certain value is contained in an array?
Let me explain with an example:
students = [
{
name: "Alice",
age: 25,
courses: [ { name: "Databases", credits: 6 },{ name: "Java", credits: 4 }]
},
{
name: "Bob",
age: 22,
courses: [ { name: "Java", credits: 4 } ]
},
{
name: "Carol",
age: 19,
courses: [ { name: "Databases", credits: 6 } ]
},
{
name: "Dave", age: 18
}
]
Now, I want to query all students. The result should return all their data except 'courses'. Instead, I want to output a flag 'participant' indicating whether that person participates in the Databases course:
result = [
{ name: "Alice", age: 25, participant: 1 },
{ name: "Bob", age: 22, participant: 0 },
{ name: "Carol", age: 19, participant: 1 },
{ name: "Dave", age: 18, participant: 0}
]
without changing anything in the database.
I've already found a solution using aggregate, but it is complicated and unwieldy, so I would like to know whether there is a handier solution to this problem.
My current solution looks like the following:
db.students.aggregate([
{$project: {"courses": {$ifNull: ["$courses", [{name: 0}]]}, name: 1, _id: 1, age: 1}},
{$unwind: "$courses"},
{$project: {name: 1, age: 1, participant: {$cond: [{$eq: ["$courses.name", "Databases"]}, 1, 0]}}},
{$group: {_id: {_id: "$_id", age: "$age", name: "$name"}, participant: {$sum: "$participant"}}},
{$project: {_id: "$_id._id", age: "$_id.age", name: "$_id.name", participant: 1}}
]);
One thing I don't like about this solution is that I have to specify the output fields three times. The pipeline is also quite long.
Run the following aggregation pipeline to get the desired result:
db.students.aggregate([
{
"$project": {
"name": 1,
"age": 1,
"participant": {
"$size": {
"$ifNull" : [
{
"$setIntersection" : [
{
"$map": {
"input": "$courses",
"as": "el",
"in": {
"$eq": [ "$$el.name", "Databases" ]
}
}
},
[true]
]
},
[]
]
}
}
}
}
])
Output:
{
"result" : [
{
"_id" : ObjectId("564f1bb67d3c273d063cd216"),
"name" : "Alice",
"age" : 25,
"participant" : 1
},
{
"_id" : ObjectId("564f1bb67d3c273d063cd217"),
"name" : "Bob",
"age" : 22,
"participant" : 0
},
{
"_id" : ObjectId("564f1bb67d3c273d063cd218"),
"name" : "Carol",
"age" : 19,
"participant" : 1
},
{
"_id" : ObjectId("564f1bb67d3c273d063cd219"),
"name" : "Dave",
"age" : 18,
"participant" : 0
}
],
"ok" : 1
}
The above pipeline uses only one stage, $project, in which the new field participant is created via a series of nested operators.
The key piece is the innermost $map operator, which evaluates a subexpression against each element of an array and returns a new array holding the results. Let's demonstrate this by running the pipeline with just the $map part:
db.students.aggregate([
{
"$project": {
"name": 1,
"age": 1,
"participant": {
"$map": {
"input": "$courses",
"as": "el",
"in": {
"$eq": [ "$$el.name", "Databases" ]
}
}
}
}
}
])
Output
{
"result" : [
{
"_id" : ObjectId("564f1bb67d3c273d063cd216"),
"name" : "Alice",
"age" : 25,
"participant" : [
true,
false
]
},
{
"_id" : ObjectId("564f1bb67d3c273d063cd217"),
"name" : "Bob",
"age" : 22,
"participant" : [
false
]
},
{
"_id" : ObjectId("564f1bb67d3c273d063cd218"),
"name" : "Carol",
"age" : 19,
"participant" : [
true
]
},
{
"_id" : ObjectId("564f1bb67d3c273d063cd219"),
"name" : "Dave",
"age" : 18,
"participant" : null
}
],
"ok" : 1
}
Next, narrow the array down with the $setIntersection operator, which returns a set containing the elements that appear in all of the input sets. Intersecting with [true] yields [true] when the student participates in the Databases course, an empty array when they take other courses only, and null when the courses field is missing. Let's see how adding that operator affects the previous result:
db.students.aggregate([
{
"$project": {
"name": 1,
"age": 1,
"participant": {
"$setIntersection" : [
{
"$map": {
"input": "$courses",
"as": "el",
"in": {
"$eq": [ "$$el.name", "Databases" ]
}
}
},
[true]
]
}
}
}
])
Output:
{
"result" : [
{
"_id" : ObjectId("564f1bb67d3c273d063cd216"),
"name" : "Alice",
"age" : 25,
"participant" : [
true
]
},
{
"_id" : ObjectId("564f1bb67d3c273d063cd217"),
"name" : "Bob",
"age" : 22,
"participant" : []
},
{
"_id" : ObjectId("564f1bb67d3c273d063cd218"),
"name" : "Carol",
"age" : 19,
"participant" : [
true
]
},
{
"_id" : ObjectId("564f1bb67d3c273d063cd219"),
"name" : "Dave",
"age" : 18,
"participant" : null
}
],
"ok" : 1
}
To handle nulls, apply the $ifNull operator (the equivalent of COALESCE in SQL) to substitute an empty array for null values:
db.students.aggregate([
{
"$project": {
"name": 1,
"age": 1,
"participant": {
"$ifNull" : [
{
"$setIntersection" : [
{
"$map": {
"input": "$courses",
"as": "el",
"in": {
"$eq": [ "$$el.name", "Databases" ]
}
}
},
[true]
]
},
[]
]
}
}
}
])
After this, wrap the $ifNull expression in the $size operator to return the number of elements in the participant array, which yields the final output shown above.
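The chain of operators can be mimicked step by step in plain JavaScript, which may help when reasoning about the null handling. This is only an analogy under the semantics shown in the outputs above (a missing array gives null); the function names are invented for the sketch.

```javascript
const students = [
  { name: "Alice", age: 25, courses: [{ name: "Databases", credits: 6 }, { name: "Java", credits: 4 }] },
  { name: "Bob", age: 22, courses: [{ name: "Java", credits: 4 }] },
  { name: "Dave", age: 18 },
];

// Like $map: one boolean per course, or null when courses is missing.
const mapStep = s =>
  s.courses == null ? null : s.courses.map(el => el.name === "Databases");

// Like $setIntersection with [true]: null stays null, otherwise keep
// the distinct values that also occur in [true].
const intersectStep = flags =>
  flags == null ? null : [...new Set(flags)].filter(f => f === true);

// Like $ifNull: replace null with an empty array.
const ifNullStep = v => (v == null ? [] : v);

// Like $size: the set holds at most one element, so this is 0 or 1.
const participant = s => ifNullStep(intersectStep(mapStep(s))).length;

const flagged = students.map(s => ({
  name: s.name,
  age: s.age,
  participant: participant(s),
}));
console.log(flagged);
```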
Based on what you said about the small number of objects, how about simply fetching the documents and using JavaScript's map to transform them? You're not saving much in terms of transfer, and the code will be far more readable than the pipeline.
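A minimal sketch of that client-side approach, assuming the documents have already been fetched into an array (the `fetched` array below is a stand-in for the query result, for example what you would get from a find on the students collection):

```javascript
// Stand-in for the documents returned by the query.
const fetched = [
  { name: "Alice", age: 25, courses: [{ name: "Databases", credits: 6 }, { name: "Java", credits: 4 }] },
  { name: "Bob", age: 22, courses: [{ name: "Java", credits: 4 }] },
  { name: "Carol", age: 19, courses: [{ name: "Databases", credits: 6 }] },
  { name: "Dave", age: 18 },
];

// Drop courses and add the participant flag; a missing courses
// field defaults to an empty array.
const result = fetched.map(({ courses = [], ...rest }) => ({
  ...rest,
  participant: courses.some(c => c.name === "Databases") ? 1 : 0,
}));

console.log(result);
```

Nothing is written back to the database here; only the in-memory copies are reshaped.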
Say every document in a collection has a string array. How can I count how many times each element occurs across the whole collection? Right now I can find all the distinct elements, but the map-reduce function is a little tricky and I haven't fully understood it.
Doc A
{
_id:
name:
actors: ["a", "b", "c"]
}
Doc B
{
_id:
name:
actors: ["a", "d"]
}
Doc C
{
_id:
name:
actors: ["a", "c", "f"]
}
I want to get a statistic like a:3 b:1 c:2 d:1 f:1.
An alternative route you could take is the aggregation framework. Consider the above collection as an example.
Populate test collection:
db.collection.insert([
{ "_id" : 1, "name" : "ABC1", "actors": ["a", "b", "c"] },
{ "_id" : 2, "name" : "ABC2", "actors" : ["a", "d"] },
{ "_id" : 3, "name" : "XYZ1", "actors" : ["a", "c", "f"] }
])
Using MongoDB 3.4.4 or newer:
db.collection.aggregate([
{ "$unwind" : "$actors" },
{ "$group": { "_id": "$actors", "count": { "$sum": 1} } },
{ "$group": {
"_id": null,
"counts": {
"$push": {
"k": "$_id",
"v": "$count"
}
}
} },
{ "$replaceRoot": {
"newRoot": { "$arrayToObject": "$counts" }
} }
])
Output
{
a: 3,
b: 1,
c: 2,
d: 1,
f: 1
}
Using MongoDB 3.2 and below:
The following aggregation pipeline uses the $unwind stage to output a document for each element in the actors array, and the $group stage to group the documents by the value in the actors array and then count the number of documents in each group (which gives the number of occurrences of each array element) by way of the $sum operator:
db.collection.aggregate([
{ "$unwind" : "$actors" },
{ "$group": { "_id": "$actors", "count": { "$sum": 1} } }
])
The operation returns the following results, which come close to your expectations but are not collapsed into a single document of key/value pairs:
/* 0 */
{
"result" : [
{
"_id" : "f",
"count" : 1
},
{
"_id" : "d",
"count" : 1
},
{
"_id" : "c",
"count" : 2
},
{
"_id" : "b",
"count" : 1
},
{
"_id" : "a",
"count" : 3
}
],
"ok" : 1
}
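If it helps to see the $unwind/$group idea without the pipeline syntax, here is a minimal plain-JavaScript sketch over the same three test documents. It is only an analogy: the inner loop stands in for $unwind and the counter object for $group with $sum.

```javascript
const docs = [
  { _id: 1, name: "ABC1", actors: ["a", "b", "c"] },
  { _id: 2, name: "ABC2", actors: ["a", "d"] },
  { _id: 3, name: "XYZ1", actors: ["a", "c", "f"] },
];

const counts = {};
for (const doc of docs) {
  // Like $unwind: visit each array element on its own.
  for (const actor of doc.actors) {
    // Like $group with {$sum: 1}: one counter per distinct element.
    counts[actor] = (counts[actor] || 0) + 1;
  }
}

console.log(counts);
```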