I have a MongoDB aggregation which currently ends with this $project stage:
{
$project: {
_id: 1,
name: 1,
subject_id: '$subject.subject',
groups: '$groups'
}
}
It creates this array of tests:
0:
_id: 1
groups: (27)
name: "Year 1 Maths Paper 1"
subject_id: "111"
1:
_id: 2
groups: (27)
name: "Year 1 Maths Paper 2"
subject_id: "111"
However, I want to group the data by the subject_id. So each subject will be an array with tests related to it inside. Does anyone know how to do this?
Demo - https://mongoplayground.net/p/-cST7kqFYxS
Use $group to group by subject_id and $push each name into a tests array.
db.collection.aggregate([
{
$group: {
_id: "$subject_id",
tests: { $push: "$name" }
}
}
])
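To make the shape of the result concrete, here is a minimal in-memory sketch in plain JavaScript of what this $group stage computes. It is only an illustration, not how MongoDB implements it; the third sample row and the groupBySubject helper are hypothetical, and groups is omitted for brevity.

```javascript
// Sample input mirroring the $project output from the question.
const tests = [
  { _id: 1, name: "Year 1 Maths Paper 1", subject_id: "111" },
  { _id: 2, name: "Year 1 Maths Paper 2", subject_id: "111" },
  { _id: 3, name: "Year 1 Physics Paper 1", subject_id: "222" }, // hypothetical extra row
];

// In-memory equivalent of:
//   { $group: { _id: "$subject_id", tests: { $push: "$name" } } }
function groupBySubject(docs) {
  const bySubject = new Map();
  for (const doc of docs) {
    if (!bySubject.has(doc.subject_id)) {
      bySubject.set(doc.subject_id, { _id: doc.subject_id, tests: [] });
    }
    // $push appends one value per input document
    bySubject.get(doc.subject_id).tests.push(doc.name);
  }
  return [...bySubject.values()];
}

const grouped = groupBySubject(tests);
console.log(JSON.stringify(grouped, null, 2));
```

If each entry should be the whole test document rather than just its name, pushing "$$ROOT" instead of "$name" in the pipeline gives that.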
I have a collection of nested documents, divided into buckets belonging to a single business id.
To illustrate, the following represents a document related to an invoice from business no. 1022 in which 10 roses, 20 oranges and 15 apples were sold:
sample_doc = {
'business_id': '1022',
'dt_op': Timestamp('2018-10-02 12:16:12'),
'transactions': [
{'Product name': 'Rose', "Quantity": 10},
{'Product name': 'Orange', "Quantity": 20},
{'Product name': 'Apple', "Quantity": 15}
]
}
I would like to get the total number of sales (sum of 'Quantity') for each product ('Product name') within a defined 'business_id'.
I tried, using Compass, to:
# Stage 1: $match
{
business_id: "1022"
}
# Stage 2: $group
{
_id: "$transactions.Product name",
TotalSum: {
$sum: "transactions.Quantity"
}
}
But a nested list of documents is returned, without performing sums.
How can I correctly perform the aggregation pipeline to get the total number of sales (sum of 'Quantity') for each product ('Product name') within a defined 'business_id'?
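You are very close. You need a $unwind before the $group stage so that each transaction becomes its own document, and the field path passed to $sum needs its $ prefix ("$transactions.Quantity"):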
db.collection.aggregate([
{
$match: {
business_id: "1022"
}
},
{
$unwind: "$transactions"
},
{
$group: {
_id: "$transactions.Product name",
TotalSum: {
$sum: "$transactions.Quantity"
}
}
}
])
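As a rough illustration of why $unwind matters here, the following plain JavaScript sketch mimics the three stages in memory. It is not MongoDB code; the second and third sample invoices and the totalsByProduct helper are made up for the example.

```javascript
const invoices = [
  {
    business_id: "1022",
    transactions: [
      { "Product name": "Rose", Quantity: 10 },
      { "Product name": "Orange", Quantity: 20 },
      { "Product name": "Apple", Quantity: 15 },
    ],
  },
  // Hypothetical second invoice for the same business
  { business_id: "1022", transactions: [{ "Product name": "Rose", Quantity: 5 }] },
  // Hypothetical invoice for another business, filtered out by the match
  { business_id: "9999", transactions: [{ "Product name": "Rose", Quantity: 100 }] },
];

function totalsByProduct(docs, businessId) {
  const totals = new Map();
  docs
    .filter((doc) => doc.business_id === businessId) // like $match
    .flatMap((doc) => doc.transactions) // like $unwind: one item per array element
    .forEach((t) => {
      // like $group with $sum: accumulate Quantity per product name
      const name = t["Product name"];
      totals.set(name, (totals.get(name) || 0) + t.Quantity);
    });
  return [...totals].map(([_id, TotalSum]) => ({ _id, TotalSum }));
}

const productTotals = totalsByProduct(invoices, "1022");
console.log(productTotals);
```

Without the flattening step, the grouping key would be the whole array of product names per invoice, which is why the original attempt returned nested lists instead of sums.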
I have some records like:
{
id: 1,
phone: "+15555555555",
name: "Acme CO.",
vendorcode: "ACMEC"
},
{
id: 2,
phone: "+15555555555",
name: "Acme corporation company",
vendorcode: "ACMECOMPANY"
},
{
id: 3,
phone: "+15555555555",
name: "Acme Incorporated",
vendorcode: null
}
I want to merge records:
If the phone field matches, merge the records (values can be overwritten by the values of the next record being merged).
But if there are non-null vendorcode values in multiple records, create an array of values. So "vendorcode" in the new record would be an array.
I would like the output of the above collection to be something like:
{
phone: "+15555555555",
name: "Acme Co.",
vendorcode: ["ACMEC","ACMECOMPANY"]
}
in a new collection.
How to write an aggregation for this in mongodb?
$group by phone, taking the first name and phone
$ifNull returns vendorcode if it is not null; otherwise it returns $$REMOVE, so nothing is added
$addToSet builds an array of unique vendorcode values
$project removes the _id field
$out writes the query result to a collection, creating it if it does not exist and replacing it if it does
db.collection.aggregate([
{
$group: {
_id: "$phone",
phone: { $first: "$phone" },
name: { $first: "$name" },
vendorcode: {
$addToSet: { $ifNull: ["$vendorcode", "$$REMOVE"] }
}
}
},
{ $project: { _id: 0 } },
{ $out: "newCollectionName" }
])
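The merge logic of the pipeline above can be sketched in plain JavaScript as follows. This is only an in-memory illustration under the sample data from the question; mergeByPhone is a hypothetical helper, not a MongoDB API.

```javascript
const vendors = [
  { id: 1, phone: "+15555555555", name: "Acme CO.", vendorcode: "ACMEC" },
  { id: 2, phone: "+15555555555", name: "Acme corporation company", vendorcode: "ACMECOMPANY" },
  { id: 3, phone: "+15555555555", name: "Acme Incorporated", vendorcode: null },
];

function mergeByPhone(docs) {
  const merged = new Map();
  for (const doc of docs) {
    if (!merged.has(doc.phone)) {
      // like $first: keep phone and name from the first document seen
      merged.set(doc.phone, { phone: doc.phone, name: doc.name, vendorcode: new Set() });
    }
    // like $ifNull + $$REMOVE: null or missing codes are skipped;
    // the Set provides the uniqueness of $addToSet
    if (doc.vendorcode != null) {
      merged.get(doc.phone).vendorcode.add(doc.vendorcode);
    }
  }
  // Convert each Set to a plain array for the final documents
  return [...merged.values()].map((m) => ({ ...m, vendorcode: [...m.vendorcode] }));
}

const mergedVendors = mergeByPhone(vendors);
console.log(JSON.stringify(mergedVendors, null, 2));
```

Two caveats: the sketch keeps codes in input order, whereas $addToSet does not guarantee any order, and $first depends on the order in which documents reach $group, so add a $sort before it if a specific name should win.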
I have a database of about 50k "company" records.
I want to find duplicates by matching:
1. name and street fields
OR
2. phone field
(I consider both #1 and #2 unique identifiers, so either can be used to find duplicates.)
I am able to write the $group statement to match based on #1:
_id: {
name: '$name',
street: '$street'
},
uniqueIds: {
$addToSet: '$_id'
},
count: {
$sum: 1
}
I tried something like this to match one or the other:
_id: {
$or: [
{name: '$name', street: '$street'},
{phone: '$phone'}
]
}...
But that just returns a boolean.
How to group by filtering for #1 or #2 above in the same aggregation?
One option is to use $facet:
db.company.aggregate([
  {
    $facet: {
      by_name_street: [
        { $group: { _id: { n: "$name", str: "$street" }, cnt: { $sum: 1 } } }
      ],
      by_phone: [
        { $group: { _id: "$phone", cnt: { $sum: 1 } } }
      ]
    }
  }
])
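The $facet pipeline above counts every group, including those with a single record. A possible extension, sketched here as an assumption rather than part of the original answer, keeps only groups with more than one record and collects their _id values as in the question. The names duplicatePipeline and ids are illustrative.

```javascript
// Each facet groups by one duplicate criterion, then keeps only real duplicates.
const duplicatePipeline = [
  {
    $facet: {
      by_name_street: [
        {
          $group: {
            _id: { n: "$name", str: "$street" },
            ids: { $addToSet: "$_id" }, // the documents sharing this name and street
            cnt: { $sum: 1 },
          },
        },
        { $match: { cnt: { $gt: 1 } } },
      ],
      by_phone: [
        {
          $group: {
            _id: "$phone",
            ids: { $addToSet: "$_id" }, // the documents sharing this phone
            cnt: { $sum: 1 },
          },
        },
        { $match: { cnt: { $gt: 1 } } },
      ],
    },
  },
];

// In mongosh this would be run as: db.company.aggregate(duplicatePipeline)
console.log(JSON.stringify(duplicatePipeline, null, 2));
```

Because $facet returns everything in a single output document, filtering out the non-duplicate groups also helps keep that document within the 16 MB BSON limit.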
I am trying to implement pagination and would like to use MongoDB's countDocuments() method to return the total number of teams whose leader belongs to the DC organization.
teams collection:
{
_id: 1,
name: 'avengers',
leader_id: 'L1'
},
{
_id: 2,
name: 'justice league',
leader_id: 'L2'
},
{
_id: 3,
name: 'suicide squad',
leader_id: 'L3'
}
leaders collection:
{
_id: 'L1',
name: 'ironman',
organization: 'MCU'
},
{
_id: 'L2',
name: 'superman',
organization: 'DC'
},
{
_id: 'L3',
name: 'harley quinn',
organization: 'DC'
}
My question is, can we use the $lookup aggregation as the query to match my output?
No. countDocuments() takes a query filter, not an aggregation pipeline, so it cannot use $lookup. Instead, run an aggregation and use the $count stage to get the count of documents in the pipeline.
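A minimal sketch of such a pipeline, assuming the teams and leaders collections shown in the question. The names countPipeline, leader and total are arbitrary choices for this example.

```javascript
const countPipeline = [
  // Join each team with its leader document from the leaders collection
  {
    $lookup: {
      from: "leaders",
      localField: "leader_id",
      foreignField: "_id",
      as: "leader",
    },
  },
  // Keep only teams whose joined leader belongs to DC
  { $match: { "leader.organization": "DC" } },
  // Emit a single document of the form { total: <number of matching teams> }
  { $count: "total" },
];

// In mongosh this would be run as: db.teams.aggregate(countPipeline)
console.log(JSON.stringify(countPipeline, null, 2));
```

Note that $count emits no document at all when nothing matches, so the caller should treat an empty result as a total of 0.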
If this is my collection structure:
{ _id: ObjectId("4fdbaf608b446b0477000142"), name: "product 1" }
{ _id: ObjectId("4fdbaf608b446b0477000143"), name: "product 2" }
{ _id: ObjectId("4fdbaf608b446b0477000144"), name: "product 3" }
and I query product 1, is there a way to query the next document, which in this case would be "product 2"?
It is best to add explicit sort() criteria if you want a predictable order of results.
Assuming the order you are after is "insertion order" and you are using MongoDB's default generated ObjectIds, then you can query based on the ObjectId:
// Find next product created
db.products.find({_id: {$gt: ObjectId("4fdbaf608b446b0477000142") }}).limit(1)
Note that this example only works because:
the first four bytes of the ObjectId are calculated from a unix-style timestamp (see: ObjectId Specification)
a query on _id alone will use the default _id index (sorted by id) to find a match
So really, this implicit sort is the same as:
db.products.find({_id: {$gt: ObjectId("4fdbaf608b446b0477000142")}}).sort({_id: 1}).limit(1);
If you added more criteria to the query to qualify how to find the "next" product (for example, a category), the query could use a different index and the order may not be as you expect.
You can check index usage with explain().
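To see the timestamp prefix that makes ObjectIds sort roughly by creation time, the first four bytes (eight hex characters) can be decoded by hand. This is a plain JavaScript sketch; objectIdTimestamp is a hypothetical helper written for this example, not a driver function.

```javascript
// Decode the creation time embedded in an ObjectId's hex string.
function objectIdTimestamp(hex) {
  // First 4 bytes = seconds since the Unix epoch, big-endian
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const createdAt = objectIdTimestamp("4fdbaf608b446b0477000142");
console.log(createdAt.toISOString());
```

The timestamp has one-second resolution, so ObjectIds generated within the same second on different machines are not guaranteed to sort in exact insertion order.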
Starting in MongoDB 5.0, this is a good use case for the new $setWindowFields aggregation stage:
// { _id: 1, name: "product 1" }
// { _id: 2, name: "product 2" }
// { _id: 3, name: "product 3" }
db.collection.aggregate([
{ $setWindowFields: {
sortBy: { _id: 1 },
output: { next: { $push: "$$ROOT", window: { documents: [1, 1] } } }
}}
])
// { _id: 1, name: "product 1", next: [{ _id: 2, name: "product 2" }] }
// { _id: 2, name: "product 2", next: [{ _id: 3, name: "product 3" }] }
// { _id: 3, name: "product 3", next: [ ] }
This:
sorts documents by their order of insertion using their _ids (ObjectIds contain the timestamp of insertion): sortBy: { _id: 1 }
adds the next field to each document (output: { next: { ... } })
by $pushing the whole document $$ROOT ($push: "$$ROOT")
on a specified span of documents (the window), which in this case contains only the following document: window: { documents: [1, 1] }.
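The effect of the [1, 1] window can be mimicked in memory with a few lines of plain JavaScript. This is only an illustration of the semantics described above; withNext is a hypothetical helper.

```javascript
const products = [
  { _id: 3, name: "product 3" },
  { _id: 1, name: "product 1" },
  { _id: 2, name: "product 2" },
];

function withNext(docs) {
  // like sortBy: { _id: 1 }
  const sorted = [...docs].sort((a, b) => a._id - b._id);
  // like window: { documents: [1, 1] }: only the document one position ahead;
  // slice returns an empty array for the last document
  return sorted.map((doc, i) => ({ ...doc, next: sorted.slice(i + 1, i + 2) }));
}

const productsWithNext = withNext(products);
console.log(JSON.stringify(productsWithNext, null, 2));
```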
You can get the items back in insertion order using ObjectId. See: http://www.mongodb.org/display/DOCS/Optimizing+Object+IDs#OptimizingObjectIDs
If you want to get them back in some other order then you will have to define what that order is and store more information in your documents.