I have some data in a MongoDB collection that looks like this:
[
{
"name" : "Apple",
"quantity" : "4",
},
{
"name" : "Apple",
"quantity" : "6",
},
{
"name" : "Orange",
"quantity" : "2",
},
{
"name" : "Orange",
"quantity" : "3",
},
]
I am trying to figure out a MongoDB query, and then its Mongoose counterpart, that uses $sum to get all unique names with their respective sums. The correct output after the query should look like this:
[
{
name: "Apple",
totalQuantity: "10"
},
{
name: "Orange",
totalQuantity: "5"
}
]
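The $group stage groups documents by the specified field(s):
$group by name, using "$name" as the group _id
$toInt converts the string quantity to an integer, and $sum accumulates it into totalQuantity
$project renames _id back to name and drops _id from the output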
db.collection.aggregate([
{
$group: {
_id: "$name",
totalQuantity: {
$sum: { $toInt: "$quantity" }
}
}
},
{
$project: {
_id: 0,
name: "$_id",
totalQuantity: 1
}
}
])
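To sanity-check what the pipeline should return, the same grouping can be emulated in plain JavaScript on the sample documents. This is only a sketch of the $group / $toInt / $sum semantics, not MongoDB code, and the helper name sumByName is made up for illustration. Note that the totals come out as numbers (10 and 5), not strings as written in the question.

```javascript
// Sample documents from the question (quantity is stored as a string).
const docs = [
  { name: "Apple", quantity: "4" },
  { name: "Apple", quantity: "6" },
  { name: "Orange", quantity: "2" },
  { name: "Orange", quantity: "3" },
];

// Hypothetical helper that mimics $group by name with $sum of $toInt(quantity).
function sumByName(documents) {
  const totals = new Map();
  for (const doc of documents) {
    const quantity = parseInt(doc.quantity, 10); // rough stand-in for $toInt
    totals.set(doc.name, (totals.get(doc.name) || 0) + quantity);
  }
  // Mimics the $project stage: expose the group key as "name".
  return Array.from(totals, ([name, totalQuantity]) => ({ name, totalQuantity }));
}

console.log(sumByName(docs));
```

For the Mongoose side, the pipeline array shown above can be passed unchanged to Model.aggregate([...]) on whatever model maps to the collection. $toInt needs MongoDB 4.0 or newer, and $group does not guarantee the order of its output documents.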
I'm quite new to MongoDB.
Having a document like:
"_id":0001
"Name": "John"
"Contacts": [
{
"Person" : [
{
"User" : {
"_id" : ObjectId("5836b916885383034437d230"),
"Name": "Name1",
"Age" : 25,
}
},
{
"User" : {
"_id" : ObjectId("2836b916885383034437d230"),
"Name": "Name2",
"Age" : 30,
}
},
{
"User" : {
"_id" : ObjectId("1835b916885383034437d230"),
"Name": "Name3",
"Age" : 31,
}
},
}
What is the best way to get the Contacts whose age is greater than or equal to 30?
The output should look like:
{_id: "John", "ContactName":"Name2", "Age":30 }
{_id: "John", "ContactName":"Name3", "Age":31 }
Is aggregation the best way to do it, or can it be done with a simple "find" statement?
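A plain find can select the matching documents, but it cannot return one result per contact, so aggregation is the better fit here:
$match keeps only documents that have at least one contact aged 30 or more
$unwind deconstructs the Contacts array
$unwind deconstructs the nested Contacts.Person array
$match filters again, now that each document holds a single contact
$project reshapes the result into _id, ContactName and Age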
db.collection.aggregate([
{
"$match": {
"Contacts.Person.User.Age": {
"$gte": 30
}
}
},
{
"$unwind": "$Contacts"
},
{
"$unwind": "$Contacts.Person"
},
{
"$match": {
"Contacts.Person.User.Age": {
"$gte": 30
}
}
},
{
"$project": {
"_id": "$Name",
"ContactName": "$Contacts.Person.User.Name",
"Age": "$Contacts.Person.User.Age"
}
}
])
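If it helps to see what each stage does, here is a rough plain-JavaScript emulation of the pipeline. The sample object assumes the structure the question seems to intend (Contacts is an array of objects, each holding a Person array of { User } entries), and contactsAtLeast is a made-up helper name, so treat it as a sketch only.

```javascript
// Assumed shape of the document from the question (ObjectIds left out).
const john = {
  Name: "John",
  Contacts: [
    {
      Person: [
        { User: { Name: "Name1", Age: 25 } },
        { User: { Name: "Name2", Age: 30 } },
        { User: { Name: "Name3", Age: 31 } },
      ],
    },
  ],
};

// Hypothetical helper mirroring $match, $unwind, $unwind, $match, $project.
function contactsAtLeast(documents, minAge) {
  return documents
    // First $match: keep documents with at least one matching contact.
    .filter((d) => d.Contacts.some((c) => c.Person.some((p) => p.User.Age >= minAge)))
    // Two $unwind stages: one row per (document, contact, person).
    .flatMap((d) => d.Contacts.flatMap((c) => c.Person.map((p) => ({ doc: d, user: p.User }))))
    // Second $match: drop the unwound rows that are too young.
    .filter((row) => row.user.Age >= minAge)
    // $project: reshape into the requested output.
    .map((row) => ({ _id: row.doc.Name, ContactName: row.user.Name, Age: row.user.Age }));
}

console.log(contactsAtLeast([john], 30));
```

The first filter is not needed for correctness; in MongoDB the early $match mainly avoids unwinding documents that cannot contribute any result.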
I have array fields with multiple properties in my document. This is what a single document looks like. I can search by name, which is unique. Is there a way to list all technologies associated with a name?
"name" : "Sam",
"date" : ISODate("2020-02-05T06:34:28.453Z"),
"technology" : [
{
"technologyId" : "1",
"technologyName" : "tech1"
},
{
"technologyId" : "2",
"technologyName" : "tech2"
},
{
"technologyId" : "3",
"technologyName" : "tech3"
},
{
"technologyId" : "4",
"technologyName" : "tech4"
}
],
"sector" : [
{
"sectorId" : "1",
"sectorName" : "sector1"
},
{
"sectorId" : "2",
"sectorName" : "sector2"
},
{
"sectorId" : "3",
"sectorName" : "sector3"
},
{
"sectorId" : "4",
"sectorName" : "sector4"
}
]
This is my simple query
db.getCollection('myCollection').find({'name':'Sam'})
Is there a way to retrieve all technologies for a name in a single query?
My output should contain only tech1, tech2, tech3, tech4.
A two-stage aggregation using $match and $project, with $map inside the $project.
Query:
db.collection.aggregate([
{
$match: {
name: "Sam"
}
},
{
$project: {
"name": "$name",
"technologies": {
$map: {
input: "$technology",
as: "t",
in: "$$t.technologyName"
}
}
}
}
]);
Result:
[
{
"_id": ObjectId("5a934e000102030405000000"),
"name": "Sam",
"technologies": [
"tech1",
"tech2",
"tech3",
"tech4"
]
}
]
If you don't want the name in the final output, remove it from the $project stage.
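For intuition, $map behaves much like Array.prototype.map in JavaScript. A minimal sketch on a trimmed copy of the sample document (the variable names are only for illustration):

```javascript
// Trimmed copy of the sample document from the question.
const sam = {
  name: "Sam",
  technology: [
    { technologyId: "1", technologyName: "tech1" },
    { technologyId: "2", technologyName: "tech2" },
    { technologyId: "3", technologyName: "tech3" },
    { technologyId: "4", technologyName: "tech4" },
  ],
};

// Equivalent of $map with input "$technology", as "t", in "$$t.technologyName".
const technologies = sam.technology.map((t) => t.technologyName);

console.log(technologies);
```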
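Assuming you don't have duplicate technologies under a single name, you can project only the technology names and then map over the cursor. The projected document still contains a technology array, so the names have to be read from its elements:
db.getCollection('myCollection')
.find({ name: 'Sam' }, { 'technology.technologyName': 1 })
.map(function(doc) { return doc.technology.map(function(t) { return t.technologyName }) })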
OK I am very new to Mongo, and I am already stuck.
Db has the following structure (much simplified for sure):
{
{
"_id" : ObjectId("57fdfbc12dc30a46507044ec"),
"keyterms" : [
{
"score" : "2",
"value" : "AA",
},
{
"score" : "2",
"value" : "AA",
},
{
"score" : "4",
"value" : "BB",
},
{
"score" : "3",
"value" : "CC",
}
]
},
{
"_id" : ObjectId("57fdfbc12dc30a46507044ef"),
"keyterms" : [
...
There are some objects. Each object has an array "keyterms", whose entries have a score and a value. There are some duplicates though (not really, since in the real db the keyterms entries have many more fields, but as far as value and score are concerned they are duplicates).
Now I need a query which
selects one object by id
groups its keyterms by value
counts the duplicates
and sorts them by score
So I want to have something like that as result
// for Object 57fdfbc12dc30a46507044ec
"keyterms": [
{
"score" : "4",
"value" : "BB",
"count" : 1
},
{
"score" : "3",
"value" : "CC",
"count" : 1
},
{
"score" : "2",
"value" : "AA",
"count" : 2
}
]
In SQL I would have written something like this
select
score, value, count(*) as count
from
all_keywords_table_or_some_join
group by
value
order by
score
But, sadly enough, it's not SQL.
In Mongo I managed to write this:
db.getCollection('tests').aggregate([
{$match: {'_id': ObjectId('57fdfbc12dc30a46507044ec')}},
{$unwind: "$keyterms"},
{$sort: {"keyterms.score": -1}},
{$group: {
'_id': "$_id",
'keyterms': {$push: "$keyterms"}
}},
{$project: {
'keyterms.score': 1,
'keyterms.value': 1
}}
])
But there is something missing: the grouping of the keyterms by their value. I can't get rid of the feeling that this is the wrong approach altogether. How can I select the keyterms array and continue with that, using an aggregate function only on it? That would be easy.
BTW I read this
(Mongo aggregate nested array)
but I can't figure it out for my example unfortunately...
You'd want an aggregation pipeline where after you $unwind the array, you group the flattened documents by the array's value and score keys, aggregate the counts using the $sum accumulator operator and retain the main document's _id with the $first operator.
After a $sort on the score, the next pipeline stage should then group the documents from the previous stage by the doc_id key, so as to preserve the original schema, and recreate the keyterms array using the $push operator.
The following demonstration attempts to explain the above aggregation operation:
db.tests.aggregate([
{ "$match": { "_id": ObjectId("57fdfbc12dc30a46507044ec") } },
{ "$unwind": "$keyterms" },
{
"$group": {
"_id": {
"value": "$keyterms.value",
"score": "$keyterms.score"
},
"doc_id": { "$first": "$_id" },
"count": { "$sum": 1 }
}
},
{ "$sort": {"_id.score": -1 } },
{
"$group": {
"_id": "$doc_id",
"keyterms": {
"$push": {
"value": "$_id.value",
"score": "$_id.score",
"count": "$count"
}
}
}
}
])
Sample Output
{
"_id" : ObjectId("57fdfbc12dc30a46507044ec"),
"keyterms" : [
{
"value" : "BB",
"score" : "4",
"count" : 1
},
{
"value" : "CC",
"score" : "3",
"count" : 1
},
{
"value" : "AA",
"score" : "2",
"count" : 2
}
]
}
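The same counting logic can be checked in plain JavaScript against the sample array. This is a sketch rather than anything MongoDB runs, and countKeyterms is a made-up helper name. It also makes one caveat visible: the scores are strings, so they are compared as strings, and a score of "10" would rank below "2".

```javascript
// keyterms array of the sample document from the question.
const keyterms = [
  { score: "2", value: "AA" },
  { score: "2", value: "AA" },
  { score: "4", value: "BB" },
  { score: "3", value: "CC" },
];

// Hypothetical helper: group by (value, score), count, sort by score descending.
function countKeyterms(terms) {
  const groups = new Map();
  for (const { value, score } of terms) {
    const key = JSON.stringify([value, score]); // compound group key
    const group = groups.get(key) || { value, score, count: 0 };
    group.count += 1; // same role as { $sum: 1 }
    groups.set(key, group);
  }
  // Scores are strings here, so this is a string comparison, as in the pipeline.
  return Array.from(groups.values()).sort((a, b) =>
    a.score < b.score ? 1 : a.score > b.score ? -1 : 0
  );
}

console.log(countKeyterms(keyterms));
```

If the scores should be ordered numerically, storing them as numbers (or converting them before sorting) would be the safer route.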
Meanwhile, I solved it myself:
db.getCollection('tests').aggregate([
{$match: {'_id': ObjectId('57fdfbc12dc30a46507044ec')}},
{$unwind: "$keyterms"},
{$sort: {"keyterms.score": -1}},
{$group: {
'_id': "$keyterms.value",
'keyterms': {$push: "$keyterms"},
'escore': {$first: "$keyterms.score"},
'evalue': {$first: "$keyterms.value"}
}},
{$limit: 15},
{$project: {
"score": "$escore",
"value": "$evalue",
"count": {$size: "$keyterms"}
}}
])
In my MongoDB database I have a collection called test that looks like this:
{
"_id" : ObjectId("5774f2807f93c094a6691506"),
"name" : "jack",
"city" : "LA",
"age" : 30.0,
"cars" : 0
}
{
"_id" : ObjectId("5774f2be7f93c094a6691507"),
"name" : "jack",
"city" : "LA",
"age" : 40.0,
"cars" : 0
}
{
"_id" : ObjectId("5774f2ed7f93c094a6691508"),
"name" : "peter",
"city" : "London",
"age" : 35.0,
"cars" : 1
}
I have made a query which groups the people by name and city and only displays the oldest element of each group. In addition, it only displays the people who have at least one car. The query looks like this:
db.getCollection('test').aggregate( [
{
"$match":{"cars":{$ne:0}}
},
{
"$group": { "_id": { name: "$name", city: "$city" }, "age":{$max:"$age"}}
}
,
{
"$project":{"age":1, "name":"$_id.name", "city":"$_id.city", "cars":true}
}
] )
After executing the above query I get the following result:
{
"_id" : {
"name" : "peter",
"city" : "London"
},
"age" : 35.0,
"name" : "peter",
"city" : "London"
}
It's correct because peter is the only guy that owns a car. The problem is that it doesn't display the "cars" field. As you can see in the query there is a $project operator and the "cars" field is set to true. So it should be displayed.
Does adding cars at the grouping stage help? I am assuming you need to count them.
"$group": {
"_id": { name: "$name", city: "$city" },
"age": { $max:"$age" },
"cars": { $sum:"$cars" }
}
The input of the project stage is the output of the grouping stage. In your original query, there was no cars field available in this input.
One solution could be to "$push" cars into an array while grouping the data.
db.getCollection('test').aggregate( [
{
"$match":{"cars":{$ne:0}}
},
{
"$group": { "_id": { name: "$name", city: "$city" },
cars : {$push : "$cars"}, "age":{$max:"$age"}}
}
,
{
"$project":{"age":1, "name":"$_id.name", "city":"$_id.city", "cars":true}
}
] )
It's not displayed because it isn't created in the previous pipeline stage. To understand how the aggregation pipeline works, treat the aggregation operation as you would a query in any other database system.
The $group pipeline operator is similar to SQL's GROUP BY clause. In SQL, every selected column that isn't part of the GROUP BY has to be wrapped in an aggregate function.
In the same way, you have to use an accumulator in MongoDB. In this instance, to generate a cars field you would use the $first operator to return the field from the top document in the group.
This only works well when the documents entering the $group step are ordered, hence the need for a $sort stage before the $group. You can then apply the $first operator to the ordered group to get the maximum age (which is essentially the top document in the ordered group, with its corresponding cars value).
A correct pipeline that returns the desired field would look like this:
db.test.aggregate([
{ "$match": { "cars": { "$ne": 0 } } },
{ "$sort": { "name": 1, "city": 1, "age": -1 } },
{
"$group": {
"_id": { "name": "$name", "city": "$city" },
"age": { "$first": "$age" } ,
"cars": { "$first": "$cars" }
}
},
{
"$project": {
"age": 1,
"cars": 1,
"_id": 0,
"name": "$_id.name",
"city": "$_id.city"
}
}
])
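To see why the $sort plus $first combination keeps the cars value of the oldest person in each group, here is a rough plain-JavaScript emulation on the three sample documents. It is a sketch under the assumption that localeCompare ordering is close enough to MongoDB's string ordering for these names, and oldestCarOwners is a made-up helper name.

```javascript
// Sample documents from the question (ObjectIds left out).
const people = [
  { name: "jack", city: "LA", age: 30, cars: 0 },
  { name: "jack", city: "LA", age: 40, cars: 0 },
  { name: "peter", city: "London", age: 35, cars: 1 },
];

// Hypothetical helper mirroring $match, $sort, $group with $first, $project.
function oldestCarOwners(documents) {
  const sorted = documents
    .filter((d) => d.cars !== 0) // $match: cars != 0
    .sort(
      (a, b) =>
        a.name.localeCompare(b.name) ||
        a.city.localeCompare(b.city) ||
        b.age - a.age // oldest first within each (name, city)
    );
  const groups = new Map();
  for (const d of sorted) {
    const key = JSON.stringify([d.name, d.city]);
    // $first: only the first (oldest) document of each group is kept.
    if (!groups.has(key)) {
      groups.set(key, { age: d.age, cars: d.cars, name: d.name, city: d.city });
    }
  }
  return Array.from(groups.values());
}

console.log(oldestCarOwners(people));
```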
Let's say I have 2 reports documents with an embedded line_items document:
Reports with embedded line_items
{
_id: "1",
week_number: "1",
line_items: [
{
cash: "5",
miscellaneous: "10"
},
{
cash: "20",
miscellaneous: "0"
}
]
},
{
_id: "2",
week_number: "2",
line_items: [
{
cash: "100",
miscellaneous: "0"
},
{
cash: "10",
miscellaneous: "0"
}
]
}
What I need to do is perform a set of additions on each line_item (in this case cash + miscellaneous) and have the grand total set on the reports query as a 'gross' field. I would like to end up with the following result:
Desired result
{ _id: "1", week_number: "1", gross: "35" },{ _id: "2", week_number: "2", gross: "110" }
I have tried the following query to no avail:
db.reports.aggregate([{$unwind: "$line_items"},{$group: {_id : "$_id", gross: {$sum : {$add: ["$cash", "$miscellaneous"]}}}}]);
You can't sum strings, so you'll first need to change the data type of the cash and miscellaneous fields in your docs to a numeric type.
But once you do that, you can sum them by including the line_items. prefix on those fields in your aggregate command:
db.reports.aggregate([
{$unwind: "$line_items"},
{$group: {
_id : "$_id",
gross: {$sum : {$add: ["$line_items.cash", "$line_items.miscellaneous"]}}
}}
]);
Output:
{
"result" : [
{
"_id" : "2",
"gross" : 110
},
{
"_id" : "1",
"gross" : 35
}
],
"ok" : 1
}
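A quick way to see why the conversion matters is to redo the sum in plain JavaScript on the question's data, where the amounts are still strings. This is only a sketch of the arithmetic, not something the server runs, and grossPerReport is a made-up helper name. Without the Number() calls the + operator would concatenate the strings instead of adding them.

```javascript
// Reports from the question, with the amounts stored as strings.
const reports = [
  {
    _id: "1",
    week_number: "1",
    line_items: [
      { cash: "5", miscellaneous: "10" },
      { cash: "20", miscellaneous: "0" },
    ],
  },
  {
    _id: "2",
    week_number: "2",
    line_items: [
      { cash: "100", miscellaneous: "0" },
      { cash: "10", miscellaneous: "0" },
    ],
  },
];

// Hypothetical helper: per report, sum cash + miscellaneous over all line items.
function grossPerReport(documents) {
  return documents.map((report) => ({
    _id: report._id,
    gross: report.line_items.reduce(
      // Number() converts the string amounts before adding them.
      (sum, item) => sum + Number(item.cash) + Number(item.miscellaneous),
      0
    ),
  }));
}

console.log(grossPerReport(reports));
```

On MongoDB 4.0 or newer, wrapping each field in $toInt inside the $add would be one way to do the same conversion in the pipeline itself instead of changing the stored data.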