Aggregate (group by) query in MongoDB by 2 fields - mongodb

I am using MongoDB. My collection object structure is like the following:
{
"_id" : ObjectId("5a58800acebcda57188bf0aa"),
"title" : "Article title",
"categories" : "politics",
"url" : "https://example.com",
"article_date" : ISODate("2018-01-11T10:00:00.000Z"),
"content" : "content here..."
},
{
"_id" : ObjectId("5a58800acebcda57188bf0aa"),
"title" : "Article title 2",
"categories" : "economics",
"url" : "https://example.com",
"article_date" : ISODate("2018-01-12T10:00:00.000Z"),
"content" : "content here..."
}
Articles are published every day, and I have many categories.
How can I group the data by date and count the documents in each category? For example:
{
"date": ISODate("2018-01-11T10:00:00.000Z"),
"result": [{
"category": "politics",
"count": 2
}, {
"category": "economics",
"count": 1
}]
},
{
"date": ISODate("2018-01-12T10:00:00.000Z"),
"result": [{
"category": "politics",
"count": 2
}, {
"category": "economics",
"count": 1
}]
}
Thank you in advance

You need to $group twice to get the result: first by article_date and categories, then again on article_date. The final $addFields stage replaces the compound _id with the plain date:
db.art.aggregate([
{$group : {
_id : {article_date : "$article_date", categories : "$categories"},
count : {$sum : 1}
}},
{$group : {
_id : {article_date : "$_id.article_date"},
result : {$push : {category : "$_id.categories", count : "$count"}}
}},
{$addFields :{
_id : "$_id.article_date"
}}
]).pretty()
Result for the sample data in the question:
{
"_id" : ISODate("2018-01-11T10:00:00Z"),
"result" : [
{
"category" : "politics",
"count" : 1
}
]
}
{
"_id" : ISODate("2018-01-12T10:00:00Z"),
"result" : [
{
"category" : "economics",
"count" : 1
}
]
}
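To see what the two $group stages are doing, the same logic can be sketched in plain JavaScript on an in-memory array, with no MongoDB involved. The helper name groupByDateAndCategory and the four sample articles are made up for illustration; this is only a rough emulation of the pipeline, not how the server executes it.

```javascript
// Plain-JavaScript emulation of the two $group stages (no MongoDB required).
// groupByDateAndCategory is a hypothetical helper name, not a MongoDB API.
function groupByDateAndCategory(docs) {
  // Stage 1: count documents per (article_date, categories) pair.
  const counts = new Map();
  for (const doc of docs) {
    const key = JSON.stringify([doc.article_date.toISOString(), doc.categories]);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // Stage 2: regroup by date only and push {category, count} into "result".
  const byDate = new Map();
  for (const [key, count] of counts) {
    const [date, category] = JSON.parse(key);
    if (!byDate.has(date)) {
      byDate.set(date, { _id: new Date(date), result: [] });
    }
    byDate.get(date).result.push({ category, count });
  }
  return [...byDate.values()];
}

// Made-up sample data: three articles on one day, one on the next.
const articles = [
  { title: "A", categories: "politics", article_date: new Date("2018-01-11T10:00:00Z") },
  { title: "B", categories: "politics", article_date: new Date("2018-01-11T10:00:00Z") },
  { title: "C", categories: "economics", article_date: new Date("2018-01-11T10:00:00Z") },
  { title: "D", categories: "economics", article_date: new Date("2018-01-12T10:00:00Z") },
];
const grouped = groupByDateAndCategory(articles);
console.log(JSON.stringify(grouped, null, 2));
```

Note that both the pipeline and this sketch group on the full timestamp, so two articles published on the same day at different times end up in different groups.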

What's the alternative to $replaceRoot on MongoDB? $replaceRoot is incompatible with DocumentDB

The problem: I'm trying to run a query written for MongoDB, but I'm using Amazon DocumentDB, where some operations are not supported. I want to find an alternative that gives the same result, if possible. Basically, I want to change the root of the result: instead of the top-level document, I need it to be a merge of values from different levels of the document.
So, I have the following structure in my collection:
{
"_id" : ObjectId("5e598bf4d98f7c70f9aa3b58"),
"status" : "active",
"invoices" : [
{
"_id" : ObjectId("5e598bf13b24713f50600375"),
"value" : 1157.52,
"receivables" : [
{
"situation" : {
"status" : "active",
"reason" : []
},
"rec_code" : "001",
"_id" : ObjectId("5e598bf13b24713f50600374"),
"expiration_date" : ISODate("2020-03-25T00:00:00.000Z"),
"value" : 1157.52
}
],
"invoice_code" : 9773,
"buyer" : {
"legal_name" : "test name",
"buyer_code" : "223132165498797"
}
},
],
"seller" : {
"code" : "321654897986",
"name" : "test name 2"
}
}
What I want to achieve is to list all "receivables" like this, where the _id is the _id of the receivable:
[{
"_id" : ObjectId("5e598bf13b24713f50600374"),
"situation" : {
"status" : "active",
"reason" : []
},
"rec_code" : "001",
"expiration_date" : ISODate("2020-03-25T00:00:00.000Z"),
"value" : 1157.52,
"status" : "active",
"seller" : {
"cnpj" : "321654897986",
"name" : "test name 2"
},
"invoice_code" : 9773.0,
"buyer" : {
"legal_name" : "test name",
"cnpj" : "223132165498797"
}
}]
I can do this on MongoDB with $replaceRoot using the query below, but on DocumentDB I can't use $replaceRoot or $mergeObjects. Do you know how I can get the same result with other operators?
db.testCollection.aggregate([
{ $unwind: "$invoices" },
{ $replaceRoot: {
newRoot: {
$mergeObjects: ["$$ROOT","$invoices"]}
}
},
{$project: {"_id": 0, "value": 0, "created_at": 0, "situation": 0}},
{ $unwind: "$receivables" },
{ $replaceRoot: {
newRoot: {
$mergeObjects: ["$receivables", "$$ROOT"]
}
}
},
{$project:{"created_at": 0, "receivables": 0, "invoices": 0}}
])
After going through the MongoDB operators, I got a result similar to what I wanted with the following query, without $replaceRoot. I think it turned out to be a better query:
db.testCollection.aggregate([
{$unwind: "$invoices"},
{$project : {
created_at: 1,
seller: "$seller",
buyer: "$invoices.buyer",
invoice_code: "$invoices.invoice_code",
receivable: '$invoices.receivables'
}
},
{$unwind: "$receivable"},
{$project : {
_id: '$receivable._id',
seller: 1,
buyer: 1,
invoice_code: 1,
receivable: 1,
created_at: 1,
}
},
{$sort: {"created_at": -1}},
])
This query returns a list with the following structure:
[{
"created_at" : ISODate("2020-03-06T09:47:26.161Z"),
"seller" : {
"name" : "Test name",
"cnpj" : "21231232131232"
},
"buyer" : {
"cnpj" : "21322132164654",
"legal_name" : "Test name 2"
},
"invoice_code" : 66119,
"receivable" : {
"rec_code" : "001",
"_id" : ObjectId("5e601bb5efff82b92935bad4"),
"expiration_date" : ISODate("2020-03-17T00:00:00.000Z"),
"value" : 6540.7,
"situation" : {
"status" : "active",
"reason" : []
}
},
"_id" : ObjectId("5e601bb5efff82b92935bad4")
}]
Support for $replaceRoot was added to Amazon DocumentDB in January 2021.
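For readers who want to check the shape of the flattened result without a database, the $unwind / $project / $unwind / $project sequence can be sketched in plain JavaScript. The helper name flattenReceivables and the string ids are assumptions for illustration only, and created_at and the final $sort are left out to keep the sketch short.

```javascript
// Plain-JavaScript emulation of unwinding invoices and then receivables.
// flattenReceivables is a hypothetical helper name, not a MongoDB or DocumentDB API.
function flattenReceivables(docs) {
  const out = [];
  for (const doc of docs) {
    // First $unwind: one row per invoice (missing or empty arrays yield no rows).
    for (const invoice of doc.invoices || []) {
      // Second $unwind: one row per receivable of that invoice.
      for (const receivable of invoice.receivables || []) {
        out.push({
          _id: receivable._id,                // promoted from the receivable
          seller: doc.seller,                 // taken from the top-level document
          buyer: invoice.buyer,               // taken from the invoice
          invoice_code: invoice.invoice_code, // taken from the invoice
          receivable,                         // the receivable itself, nested
        });
      }
    }
  }
  return out;
}

// Made-up sample shaped like the document in the question (ids are plain strings here).
const sampleDocs = [{
  _id: "doc1",
  status: "active",
  invoices: [{
    _id: "inv1",
    invoice_code: 9773,
    buyer: { legal_name: "test name", buyer_code: "223132165498797" },
    receivables: [
      { _id: "rec1", rec_code: "001", value: 1157.52 },
      { _id: "rec2", rec_code: "002", value: 10 },
    ],
  }],
  seller: { code: "321654897986", name: "test name 2" },
}];
const flat = flattenReceivables(sampleDocs);
console.log(JSON.stringify(flat, null, 2));
```

As in the query above, the receivable stays nested under its own key instead of being merged into the root, which is the main difference from the $replaceRoot version.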

Problems aggregating MongoDB

I am having problems aggregating my Product Document in MongoDB.
My Product Document is:
{
"_id" : ObjectId("5d81171c2c69f45ef459e0af"),
"type" : "T-Shirt",
"name" : "Panda",
"description" : "Panda's are cool.",
"image" : ObjectId("5d81171c2c69f45ef459e0ad"),
"created_at" : ISODate("2019-09-17T18:25:48.026+01:00"),
"is_featured" : false,
"sizes" : [
"XS",
"S",
"M",
"L",
"XL"
],
"tags" : [ ],
"pricing" : {
"price" : 26,
"sale_price" : 8
},
"categories" : [
ObjectId("5d81171b2c69f45ef459e086"),
ObjectId("5d81171b2c69f45ef459e087")
],
"sku" : "5d81171c2c69f45ef459e0af"
},
And my Category Document is:
{
"_id" : ObjectId("5d81171b2c69f45ef459e087"),
"name" : "Art",
"description" : "These items are our artsy options.",
"created_at" : ISODate("2019-09-17T18:25:47.196+01:00")
},
My aim is to aggregate the Product documents in order to count the number of items in each Category. For example, for the Category "Art", I need to count the products that are in the "Art" Category.
My current aggregate:
db.product.aggregate(
{ $unwind : "$categories" },
{
$group : {
"_id" : { "name" : "$name" },
"doc" : { $push : { "category" : "$categories" } },
}
},
{ $unwind : "$doc" },
{
$project : {
"_id" : 0,
"name" : "$name",
"category" : "$doc.category"
}
},
{
$group : {
"_id" : "$category",
"name": { "$first": "$name" },
"items_in_cat" : { $sum : 1 }
}
},
{ "$sort" : { "items_in_cat" : -1 } },
)
This does work, but not as I need:
{
"_id" : ObjectId("5d81171b2c69f45ef459e082"),
"name" : null, // Why is the name of the category not here?
"items_in_cat" : 4
},
As we can see the name is null. How can I aggregate the output to be:
{
"_id" : ObjectId("5d81171b2c69f45ef459e082"),
"name" : "Art",
"items_in_cat" : 4
},
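The name is null because, after the first $group, the product name lives in _id.name, so "$name" no longer refers to an existing field. Even if it did, it would be the product's name, not the category's. We need to use $lookup to fetch the name from the category collection.
The following query gives the expected output: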
db.product.aggregate([
{
$unwind:"$categories"
},
{
$group:{
"_id":"$categories",
"items_in_cat":{
$sum:1
}
}
},
{
$lookup:{
"from":"category",
"let":{
"id":"$_id"
},
"pipeline":[
{
$match:{
$expr:{
$eq:["$_id","$$id"]
}
}
},
{
$project:{
"_id":0,
"name":1
}
}
],
"as":"categoryLookup"
}
},
{
$unwind:{
"path":"$categoryLookup",
"preserveNullAndEmptyArrays":true
}
},
{
$project:{
"_id":1,
"name":{
$ifNull:["$categoryLookup.name","NA"]
},
"items_in_cat":1
}
}
]).pretty()
Data set:
Collection: product
{
"_id" : ObjectId("5d81171c2c69f45ef459e0af"),
"type" : "T-Shirt",
"name" : "Panda",
"description" : "Panda's are cool.",
"image" : ObjectId("5d81171c2c69f45ef459e0ad"),
"created_at" : ISODate("2019-09-17T17:25:48.026Z"),
"is_featured" : false,
"sizes" : [
"XS",
"S",
"M",
"L",
"XL"
],
"tags" : [ ],
"pricing" : {
"price" : 26,
"sale_price" : 8
},
"categories" : [
ObjectId("5d81171b2c69f45ef459e086"),
ObjectId("5d81171b2c69f45ef459e087")
],
"sku" : "5d81171c2c69f45ef459e0af"
}
Collection: category
{
"_id" : ObjectId("5d81171b2c69f45ef459e086"),
"name" : "Art",
"description" : "These items are our artsy options.",
"created_at" : ISODate("2019-09-17T17:25:47.196Z")
}
{
"_id" : ObjectId("5d81171b2c69f45ef459e087"),
"name" : "Craft",
"description" : "These items are our artsy options.",
"created_at" : ISODate("2019-09-17T17:25:47.196Z")
}
Output:
{
"_id" : ObjectId("5d81171b2c69f45ef459e087"),
"items_in_cat" : 1,
"name" : "Craft"
}
{
"_id" : ObjectId("5d81171b2c69f45ef459e086"),
"items_in_cat" : 1,
"name" : "Art"
}
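The same count-then-lookup idea can be sketched in plain JavaScript for quick experimentation. The helper name countItemsPerCategory and the sample data are assumptions, string ids stand in for ObjectId values, and the descending sort is carried over from the pipeline in the question, not from the answer's pipeline.

```javascript
// Plain-JavaScript emulation of $unwind + $group + $lookup (no MongoDB required).
// countItemsPerCategory is a hypothetical helper name.
function countItemsPerCategory(products, categories) {
  // $unwind + $group: count how many products reference each category id.
  const counts = new Map();
  for (const product of products) {
    for (const categoryId of product.categories || []) {
      counts.set(categoryId, (counts.get(categoryId) || 0) + 1);
    }
  }
  // $lookup: resolve each category id to its name, falling back to "NA".
  const nameById = new Map(categories.map((c) => [c._id, c.name]));
  return [...counts]
    .map(([id, n]) => ({ _id: id, name: nameById.get(id) ?? "NA", items_in_cat: n }))
    .sort((a, b) => b.items_in_cat - a.items_in_cat);
}

// Made-up sample data; "c3" deliberately has no matching category document.
const sampleProducts = [
  { name: "Panda", categories: ["c1", "c2"] },
  { name: "Tiger", categories: ["c2"] },
  { name: "Owl", categories: ["c2", "c3"] },
];
const sampleCategories = [
  { _id: "c1", name: "Art" },
  { _id: "c2", name: "Craft" },
];
const perCategory = countItemsPerCategory(sampleProducts, sampleCategories);
console.log(perCategory);
```

The "NA" fallback plays the role of the $ifNull in the final $project stage: a category id with no matching document still appears in the output.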

How to get either an existing document or newly added document

I have a table structure as given below.
{
"_id" : ObjectId("5870f1d9dd0ef6e62102e6b8"),
"product" : "product1",
"title" : "Alt Summit 1",
"category" : "category1",
"order" : 1
}
{
"_id" : ObjectId("5870f1d9dd0ef6e62102e6b9"),
"product" : "product2",
"title" : "Alt Summit 2",
"category" : "category1",
"order" : 2
}
{
"_id" : ObjectId("5870f1d9dd0ef6e62102e6ba"),
"product" : "product1",
"title" : "Alt Summit 1",
"category" : "category1",
"order" : 2,
"added_by" : "user1"
}
{
"_id" : ObjectId("5870f1d9dd0ef6e62102e6bb"),
"product" : "product2",
"title" : "Alt Summit 2",
"category" : "category1",
"order" : 1,
"added_by" : "user1"
}
{
"_id" : ObjectId("5870f1d9dd0ef6e62102e6bc"),
"product" : "product3",
"title" : "Alt Summit 3",
"category" : "category1",
"order" : 3,
"added_by" : "user1"
}
{
"_id" : ObjectId("5870f1d9dd0ef6e62102e6bd"),
"product" : "product4",
"title" : "Alt Summit 4",
"category" : "category1",
"order" : 4
}
I would like to explain the database structure first.
I have default products for every user on my website, and users can add their own products to the list or re-order the existing list. When they add a new product, it is stored with their username in added_by. When they re-order the products, I copy the existing default records, add the username, and change the order (you can see this in the data above).
Now, how can I get the products for a category, specific to a user?
For example, if I need to get all the products for user1 in category1, the output should be as below (sorted by the order field). If you look carefully at the data above, the same product is repeated with an extra field named added_by and a changed order. I want the user's order if the user has customized it, and the default order otherwise.
{
"_id" : ObjectId("5870f1d9dd0ef6e62102e6bb"),
"product" : "product2",
"title" : "Alt Summit 2",
"category" : "category1",
"order" : 1,
"added_by" : "user1"
}
{
"_id" : ObjectId("5870f1d9dd0ef6e62102e6ba"),
"product" : "product1",
"title" : "Alt Summit 1",
"category" : "category1",
"order" : 2,
"added_by" : "user1"
}
{
"_id" : ObjectId("5870f1d9dd0ef6e62102e6bc"),
"product" : "product3",
"title" : "Alt Summit 3",
"category" : "category1",
"order" : 3,
"added_by" : "user1"
}
{
"_id" : ObjectId("5870f1d9dd0ef6e62102e6bd"),
"product" : "product4",
"title" : "Alt Summit 4",
"category" : "category1",
"order" : 4
}
I am open to suggestions on whether I should change the structure of the table, as long as I can still get the output shown above at any given time.
You can try the aggregation below with the current MongoDB 3.4 version.
$match the documents for the query criteria.
$group the documents on product and pick the $first one while keeping the whole document with $$ROOT. Note that the pipeline has no $sort before the $group, so $first depends on the order in which the documents arrive.
$replaceRoot to promote the kept document to the top level.
$sort by the order field.
db.collection.aggregate([{
$match: {
"category": "category1",
$or: [{
"added_by": "user1"
}, {
"added_by": {
"$exists": false
}
}]
}
}, {
$group: {
"_id": "$product",
"root": {
"$first": "$$ROOT"
}
}
}, {
$replaceRoot: {
newRoot: "$root"
}
},{
$sort: {
"order": 1
}
}])
Update after OP's Comment
$group the documents on product and $push all the matching documents with $$ROOT.
$project stage that checks the $size of the pushed array: if it is $eq to 1, the single document is used; otherwise $filter picks the document that has the added_by field.
db.collection.aggregate([{
$match: {
"category": "category1",
$or: [{
"added_by": "user1"
}, {
"added_by": {
"$exists": false
}
}]
}
}, {
$group: {
"_id": "$product",
"products": {
"$push": "$$ROOT"
}
}
}, {
$project: {
"_id": 0,
"product": {
"$arrayElemAt": [{
"$cond": [
{
$eq: [{$size: "$products"}, 1]
},
"$products",
{
"$filter": {
input: "$products",
as: "product",
cond: {
$ifNull: ["$$product.added_by", false]
}
}
}
]
},
0
]
}
}
}, {
$replaceRoot: {
newRoot: "$product"
}
}, {
$sort: {
"order": 1
}
}]);
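The selection rule in the updated pipeline (prefer the user's copy of a product, fall back to the default one) can also be written out in plain JavaScript, which may help when reasoning about edge cases. The helper name productsForUser is an assumption, the ids are shortened stand-ins, and a default document is taken to be one with no added_by field at all.

```javascript
// Plain-JavaScript emulation of: $match, $group by product, prefer the user's copy, $sort.
// productsForUser is a hypothetical helper name, not part of MongoDB.
function productsForUser(docs, category, user) {
  const chosen = new Map();
  for (const doc of docs) {
    const isDefault = doc.added_by === undefined;
    // $match: right category, and either a default document or one added by this user.
    if (doc.category !== category) continue;
    if (!isDefault && doc.added_by !== user) continue;
    // $group on product: a user document replaces a default one, never the reverse.
    const current = chosen.get(doc.product);
    if (!current || (current.added_by === undefined && !isDefault)) {
      chosen.set(doc.product, doc);
    }
  }
  // $sort by the order field.
  return [...chosen.values()].sort((a, b) => a.order - b.order);
}

// The six documents from the question, with shortened ids.
const allProducts = [
  { _id: "b8", product: "product1", title: "Alt Summit 1", category: "category1", order: 1 },
  { _id: "b9", product: "product2", title: "Alt Summit 2", category: "category1", order: 2 },
  { _id: "ba", product: "product1", title: "Alt Summit 1", category: "category1", order: 2, added_by: "user1" },
  { _id: "bb", product: "product2", title: "Alt Summit 2", category: "category1", order: 1, added_by: "user1" },
  { _id: "bc", product: "product3", title: "Alt Summit 3", category: "category1", order: 3, added_by: "user1" },
  { _id: "bd", product: "product4", title: "Alt Summit 4", category: "category1", order: 4 },
];
const forUser1 = productsForUser(allProducts, "category1", "user1");
console.log(forUser1.map((p) => `${p.order}: ${p.product} (${p._id})`));
```

Unlike the first pipeline, this rule does not depend on the order in which documents arrive, which is the point of the update.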

MongoDb aggregation framework value of a field where max another field

I have a collection that has records looking like this:
{ "_id" : ObjectId("550424ef2f44472856286d56"), "accountId" : "123",
"contactOperations" :
[
{ "contactId" : "1", "operation" : 1, "date" : 500 },
{ "contactId" : "1", "operation" : 2, "date" : 501 },
{ "contactId" : "2", "operation" : 1, "date" : 502 }
]
}
I want to know the latest operation number that has been applied to a certain contact.
I'm using the aggregation framework to first unwind contactOperations and then group by accountId and contactOperations.contactId, taking the max of contactOperations.date.
aggregate([{$unwind : "$contactOperations"}, {$group : {"_id":{"accountId":"$accountId", "contactId":"$contactOperations.contactId"}, "date":{$max:"$contactOperations.date"} }}])
The result I get is:
{ "_id" : { "accountId" : "123", "contactId" : "2" }, "date" : 502 }
{ "_id" : { "accountId" : "123", "contactId" : "1" }, "date" : 501 }
This seems correct so far, but I also need the contactOperations.operation field that was recorded with the $max date. How can I select that?
You have to sort the unwound values by date and then apply the $last operator to get the operation for the max date. Hopefully this query solves your problem:
aggregate([
{
$unwind: "$contactOperations"
},
{
$sort: {
"contactOperations.date": 1
}
},
{
$group: {
"_id": {
"accountId": "$accountId",
"contactId": "$contactOperations.contactId"
},
"date": {
$max: "$contactOperations.date"
},
"operationId": {
$last: "$contactOperations.operation"
}
}
}
])
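The sort-then-$last idea can be checked in plain JavaScript on the sample record from the question. The helper name latestOperations is an assumption; the sketch unwinds the operations, sorts them by date ascending, and lets later rows overwrite earlier ones per (accountId, contactId) key, which is what $last does after a $sort.

```javascript
// Plain-JavaScript emulation of $unwind + $sort + $group with $max and $last.
// latestOperations is a hypothetical helper name, not a MongoDB API.
function latestOperations(accounts) {
  // $unwind: one row per contact operation, keeping the accountId alongside it.
  const rows = [];
  for (const account of accounts) {
    for (const op of account.contactOperations || []) {
      rows.push({ accountId: account.accountId, ...op });
    }
  }
  // $sort by date ascending, so the newest operation of each contact comes last.
  rows.sort((a, b) => a.date - b.date);
  // $group: later rows overwrite earlier ones, so the last (newest) row wins.
  const latest = new Map();
  for (const row of rows) {
    const key = JSON.stringify([row.accountId, row.contactId]);
    latest.set(key, {
      _id: { accountId: row.accountId, contactId: row.contactId },
      date: row.date,
      operationId: row.operation,
    });
  }
  return [...latest.values()];
}

// The sample record from the question.
const accounts = [{
  accountId: "123",
  contactOperations: [
    { contactId: "1", operation: 1, date: 500 },
    { contactId: "1", operation: 2, date: 501 },
    { contactId: "2", operation: 1, date: 502 },
  ],
}];
const latestOps = latestOperations(accounts);
console.log(JSON.stringify(latestOps));
```

The sort only helps if it is on the unwound field (contactOperations.date in the pipeline); sorting on a field that does not exist leaves the order unchanged and $last then returns whatever operation happens to be stored last.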

MongoDB: how to retrieve all the values in an array-type field?

Is there a way to retrieve all the values of an array-type field across documents? For example:
{ "slug" : "my-post", "status" : "publish", "published" : ISODate("2014-01-26T18:28:11Z"), "title" : "my post", "body" : "my body post", "_id" : ObjectId("52e553c937fb8bf218b8c624"), "tags" : [ "js", "php", "scala" ], "created" : ISODate("2014-01-26T18:28:25.298Z"), "author" : "whisher", "__v" : 0 }
{ "slug" : "my-post-2", "status" : "publish", "published" : ISODate("2014-01-26T18:28:27Z"), "title" : "my post 2", "body" : "spost body", "_id" : ObjectId("52e5540837fb8bf218b8c625"), "tags" : [ "android", "actionscript", "java" ], "created" : ISODate("2014-01-26T18:29:28.915Z"), "author" : "whisher", "__v" : 0 }
The result should look like:
"android", "actionscript", "java","js", "php", "scala"
You can $unwind the array and then $group the values back together:
db.collection.aggregate({ $unwind : "$tags" }, {$group:{_id: "$tags"}});
The result would be
{ _id: "android"},
{ _id: "actionscript"},
{ _id: "java"},
{ _id: "js"},
{ _id: "php"},
{ _id: "scala"}
Use the distinct command:
> db.test.distinct("tags")
[ "js", "php", "scala", "actionscript", "android", "java" ]
You could use aggregation if you eventually needed something more complex:
> db.test.aggregate(
{ $project: { tags : 1 } },
{ $unwind : "$tags" },
{ $group : { _id: "$tags" } } );
Results:
[
{
"_id" : "java"
},
{
"_id" : "actionscript"
},
{
"_id" : "android"
},
{
"_id" : "scala"
},
{
"_id" : "php"
},
{
"_id" : "js"
}
]
I'd use $project to reduce the number of fields passed through the pipeline, though. In the example above, I've used $project to include only the tags field.
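If the documents are already loaded in application code, the same distinct-values result can be obtained without a round trip to the server. This is a minimal sketch in plain JavaScript; the helper name distinctTags is an assumption, and the posts are trimmed-down versions of the two documents in the question.

```javascript
// Plain-JavaScript equivalent of distinct("tags") on documents already in memory.
// distinctTags is a hypothetical helper name, not a MongoDB API.
function distinctTags(posts) {
  const seen = new Set();
  for (const post of posts) {
    // Like $unwind, treat a missing tags array as contributing no values.
    for (const tag of post.tags || []) {
      seen.add(tag); // a Set silently ignores values it already contains
    }
  }
  return [...seen];
}

// Trimmed-down versions of the two posts from the question.
const posts = [
  { slug: "my-post", tags: ["js", "php", "scala"] },
  { slug: "my-post-2", tags: ["android", "actionscript", "java"] },
];
const tags = distinctTags(posts);
console.log(tags);
```

For data that lives in the database, distinct or the $unwind / $group pipeline is usually the better choice, since only the distinct values travel over the wire.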