MongoDB count() of internal array

I have the following MongoDB collection db.students:
/* 0 */
{
"id" : "0000",
"name" : "John",
"subjects" : [
{
"professor" : "Smith",
"day" : "Monday"
},
{
"professor" : "Smith",
"day" : "Tuesday"
}
]
}
/* 1 */
{
"id" : "0001",
"name" : "Mike",
"subjects" : [
{
"professor" : "Smith",
"day" : "Monday"
}
]
}
I want to find the number of subjects for a given student. I have a query:
db.students.find({'id':'0000'})
that will return the student document. How do I find the count for 'subjects'? Is it doable in a simple query?

If the query returns just one document:
db.students.find({'id':'0000'})[0].subjects.length;
For multiple documents in the cursor:
db.students.find({'id':'0000'}).forEach(function(doc) {
print(doc.subjects.length);
})
Remember to check that subjects exists, either in the query or before reading .length.
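A guarded version of the length check might look like the sketch below. It is plain JavaScript that works on an already-fetched document; countSubjects is a hypothetical helper name, not a MongoDB API, and the second sample document is made up to show the missing-field case.

```javascript
// Hypothetical helper: returns the number of subjects, or 0 when the
// document is null or the field is missing or not an array.
function countSubjects(doc) {
  return doc && Array.isArray(doc.subjects) ? doc.subjects.length : 0;
}

// Sample document shaped like the one in the question.
const john = {
  id: "0000",
  name: "John",
  subjects: [
    { professor: "Smith", day: "Monday" },
    { professor: "Smith", day: "Tuesday" }
  ]
};

// Made-up document that has no subjects field at all.
const noSubjects = { id: "0002", name: "Ann" };

console.log(countSubjects(john));       // 2
console.log(countSubjects(noSubjects)); // 0
```

In the shell, the same helper could be applied to the result of db.students.findOne({'id':'0000'}), which is null when nothing matches.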

You could use the aggregation framework:
db.students.aggregate(
[
{ $match : {'id': '0000'}},
{ $unwind : "$subjects" },
{ $group : { _id : null, number : { $sum : 1 } } }
]
);
The $match stage filters on the student's id field.
The $unwind stage deconstructs the subjects array into one document per element.
The $group stage does the counting. Its _id is null because all the unwound documents belong to a single student, so they form one group.
You will get a result like this:
{ "result" : [ { "_id" : null, "number" : 187 } ], "ok" : 1 }
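To make the three stages concrete, here is a rough in-memory imitation in plain JavaScript. It does not talk to MongoDB; the students array simply stands in for the collection, using the sample data from the question.

```javascript
// In-memory stand-in for db.students (sample data from the question).
const students = [
  { id: "0000", name: "John", subjects: [
    { professor: "Smith", day: "Monday" },
    { professor: "Smith", day: "Tuesday" }
  ] },
  { id: "0001", name: "Mike", subjects: [
    { professor: "Smith", day: "Monday" }
  ] }
];

// $match: keep only the requested student.
const matched = students.filter(s => s.id === "0000");

// $unwind: emit one document per element of the subjects array.
const unwound = matched.flatMap(s =>
  (s.subjects || []).map(subject => ({ ...s, subjects: subject })));

// $group with _id: null and $sum: 1: count the unwound documents.
// With no input documents, $group produces no output document at all.
const result = unwound.length > 0
  ? [{ _id: null, number: unwound.length }]
  : [];

console.log(result); // [ { _id: null, number: 2 } ]
```

Note the last step: for a student with an empty or missing subjects array, this approach returns an empty result rather than a count of 0.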

Just another nice and simple aggregation solution:
db.students.aggregate([
{ $match : { 'id':'0000' } },
{ $project: {
subjectsCount: { $cond: {
if: { $isArray: "$subjects" },
then: { $size: "$subjects" },
else: 0
}
}
}
}
]).then(result => {
// handle result
}).catch(err => {
throw err;
});
Thanks!

Related

How can I find the sum and average of a document array?

Currently, I have the following document structure. The range field holds sub JSON objects as an array.
{
"_id" : ObjectId("62f60ba0ed0f1a1a0v"),
"userId" : "1431",
"range" : [
{
"index" : 0,
"clubType" : "driver",
"swingSize" : "full",
"distance" : 200,
"createdAt" : "2022-08-12T08:13:20.435+00:00"
},
{
"index" : 0,
"clubType" : "driver",
"swingSize" : "full",
"distance" : 150,
"createdAt" : "2022-08-12T08:13:20.435+00:00"
},
{
"index" : 0,
"clubType" : "wood",
"swingSize" : "full",
"distance" : 180,
"createdAt" : "2022-08-12T08:13:20.435+00:00"
}
]
}
In the above document, I want to compute the sum and average of distance for the entries that share the same clubType and swingSize. So I used a Mongoose aggregate like the one below.
result = await ClubRangeResultSchema.aggregate([
{
$match : {
userId : "1431",
range : {
$elemMatch : {
$and : [
{
createdAt : { $gte : lastDate }
},
{
createdAt : { $lte : lastDate }
}
]
}
}
}
},
{
$group : {
'_id' : {
'clubName' : '$range.clubName',
'swingSize' : '$range.swingSize'
},
'totalDistance' : { $sum : { $sum : '$range.distance' }}
}
}
]);
The result of the above query contains duplicated field values, and the total is computed over all the data instead of per group.
How should I modify the query?
You're close, but you need to make a couple of changes:
You need to $unwind the range array. $group doesn't flatten arrays, so when you use $range.clubType you are grouping on the whole array of values.
You need an additional $match after the $unwind. The $elemMatch you use does not filter the elements of range; it only selects the documents that contain a matching element.
After the changes the pipeline should look like this:
db.collection.aggregate([
{
$match: {
userId: "1431",
range: {
$elemMatch: {
createdAt: "2022-08-12T08:13:20.435+00:00"
}
}
}
},
{
$unwind: "$range"
},
{
$match: {
"range.createdAt": "2022-08-12T08:13:20.435+00:00"
}
},
{
$group: {
"_id": {
"clubName": "$range.clubType",
"swingSize": "$range.swingSize"
},
"totalDistance": {
$sum: "$range.distance"
},
avgDistance: {
$avg: "$range.distance"
}
}
}
])
Mongo Playground
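For readers who want to check the expected numbers without a database, the unwind, filter and group steps can be imitated in plain JavaScript. This is only a sketch over the sample document from the question; the variable names are illustrative and the _id field is left out.

```javascript
// Sample document from the question (the _id field is omitted).
const doc = {
  userId: "1431",
  range: [
    { index: 0, clubType: "driver", swingSize: "full", distance: 200,
      createdAt: "2022-08-12T08:13:20.435+00:00" },
    { index: 0, clubType: "driver", swingSize: "full", distance: 150,
      createdAt: "2022-08-12T08:13:20.435+00:00" },
    { index: 0, clubType: "wood", swingSize: "full", distance: 180,
      createdAt: "2022-08-12T08:13:20.435+00:00" }
  ]
};

// Second $match: keep only the array elements with the wanted timestamp.
const wanted = "2022-08-12T08:13:20.435+00:00";
const shots = doc.range.filter(shot => shot.createdAt === wanted);

// $group: accumulate a total and a count per (clubType, swingSize) pair.
const groups = new Map();
for (const shot of shots) {
  const key = `${shot.clubType}|${shot.swingSize}`;
  const g = groups.get(key) ||
    { clubType: shot.clubType, swingSize: shot.swingSize, totalDistance: 0, count: 0 };
  g.totalDistance += shot.distance;
  g.count += 1;
  groups.set(key, g);
}

// $sum and $avg per group.
const summary = [...groups.values()].map(g => ({
  clubType: g.clubType,
  swingSize: g.swingSize,
  totalDistance: g.totalDistance,
  avgDistance: g.totalDistance / g.count
}));

console.log(summary);
// driver/full: totalDistance 350, avgDistance 175
// wood/full:   totalDistance 180, avgDistance 180
```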

How to use $sum (aggregation) for an array of objects and check greater than for each sum

My document structure is as follows:
{
"_id" : ObjectId("621ccb5ea46a9e41768e0ba8"),
"cust_name" : "Anuj Kumar",
"product" : [
{
"prod_name" : "Robot",
"price" : 15000
},
{
"prod_name" : "Keyboard",
"price" : 65000
}
],
"order_date" : ISODate("2022-02-22T00:00:00Z"),
"status" : "processed",
"invoice" : {
"invoice_no" : 111,
"invoice_date" : ISODate("2022-02-22T00:00:00Z")
}
}
How do I write the following query?
List the details of orders with a value > 10000.
I want to display only those orders whose sum of prices is greater than 10000.
I tried this:
db.order.aggregate([{$project : {sumOfPrice : {$sum : "$product.price"} }}])
Output
{ "_id" : ObjectId("621ccb5ea46a9e41768e0ba8"), "sumOfPrice" : 80000 }
{ "_id" : ObjectId("621ccba9a46a9e41768e0ba9"), "sumOfPrice" : 16500 }
{ "_id" : ObjectId("621ccbfaa46a9e41768e0baa"), "sumOfPrice" : 5000 }
I want to check whether this sumOfPrice is greater than 10000 and display the full order objects that qualify.
You can add a $match stage right after it that checks this condition. Using $addFields instead of $project keeps the rest of the order document in the output, like so:
db.collection.aggregate([
{
$addFields: {
sumOfPrice: {
$sum: "$product.price"
}
}
},
{
$match: {
sumOfPrice: {
$gt: 10000
}
}
}
])
Mongo Playground
You can also use the $expr operator in a find query:
db.order.find({
$expr: {
$gt: [ {$sum: '$product.price'}, 10000 ]
}
})
Mongo Playground
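Both variants compute the same thing: sum the prices inside each order, then keep the orders whose sum exceeds the threshold. A plain JavaScript sketch of that logic follows. Only the first order mirrors the document in the question; the other two customers and products are invented so that the sums match the 16500 and 5000 shown in the output above.

```javascript
// The first order comes from the question; the other two are hypothetical.
const orders = [
  { cust_name: "Anuj Kumar", product: [
    { prod_name: "Robot", price: 15000 },
    { prod_name: "Keyboard", price: 65000 }
  ] },
  { cust_name: "Customer B", product: [ { prod_name: "Mouse", price: 16500 } ] },
  { cust_name: "Customer C", product: [ { prod_name: "Cable", price: 5000 } ] }
];

// Counterpart of { $sum: "$product.price" } for a single order.
const sumOfPrice = order =>
  (order.product || []).reduce((total, p) => total + p.price, 0);

// Counterpart of the $match / $expr condition: keep the full order objects.
const bigOrders = orders.filter(order => sumOfPrice(order) > 10000);

console.log(bigOrders.map(o => [o.cust_name, sumOfPrice(o)]));
// [ [ 'Anuj Kumar', 80000 ], [ 'Customer B', 16500 ] ]
```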

mongodb count number of documents for every category

My collection looks like this:
{
"_id":ObjectId("5744b6cd9c408cea15964d18"),
"uuid":"bbde4bba-062b-4024-9bb0-8b12656afa7e",
"version":1,
"categories":["sport"]
},
{
"_id":ObjectId("5745d2bab047379469e10e27"),
"uuid":"bbde4bba-062b-4024-9bb0-8b12656afa7e",
"version":2,
"categories":["sport", "shopping"]
},
{
"_id":ObjectId("5744b6359c408cea15964d15"),
"uuid":"561c3705-ba6d-432b-98fb-254483fcbefa",
"version":1,
"categories":["politics"]
}
I want to count the number of documents for every category. To do this, I unwind the categories array:
db.collection.aggregate(
{$unwind: '$categories'},
{$group: {_id: '$categories', count: {$sum: 1}} }
)
Result:
{ "_id" : "sport", "count" : 2 }
{ "_id" : "shopping", "count" : 1 }
{ "_id" : "politics", "count" : 1 }
Now I want to count the number of documents for every category, but where document version is the latest version.
This is where I am stuck.
It's ugly but I think this gives you what you're after:
db.collection.aggregate(
{ $unwind : "$categories" },
{ $group :
{ "_id" : { "uuid" : "$uuid" },
"doc" : { $push : { "version" : "$version", "category" : "$categories" } },
"maxVersion" : { $max : "$version" }
}
},
{ $unwind : "$doc" },
{ $project : { "_id" : 0, "uuid" : "$_id.uuid", "category" : "$doc.category", "isCurrentVersion" : { $eq : [ "$doc.version", "$maxVersion" ] } } },
{ $match : { "isCurrentVersion" : true }},
{ $group : { "_id" : "$category", "count" : { $sum : 1 } } }
)
You can do this by first grouping the unwound documents (from the $unwind step) by two keys, the categories and version fields. This is necessary for the following pipeline step, which uses the $sort operator to order the grouped documents and their accumulated counts by version (descending) and categories (ascending).
Another grouping is then required to pick the top document in each categories group after ordering, using the $first operator. The following shows this:
db.collection.aggregate(
{ "$unwind": "$categories" },
{
"$group": {
"_id": {
'categories': '$categories',
'version': '$version'
},
"count": { "$sum": 1 }
}
},
{ "$sort": { "_id.version": -1, "_id.categories": 1 } },
{
"$group": {
"_id": "$_id.categories",
"count": { "$first": "$count" },
"version": { "$first": "$_id.version" }
}
}
)
Sample Output
{ "_id" : "shopping", "count" : 1, "version" : 2 }
{ "_id" : "sport", "count" : 1, "version" : 2 }
{ "_id" : "politics", "count" : 1, "version" : 1 }
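The group, sort and first steps described above can be traced by hand with a small plain JavaScript imitation over the three sample documents. Note that this, like the pipeline it mirrors, takes "latest" to mean the highest version seen for each category; whether that matches the latest version per uuid depends on the data.

```javascript
// Sample documents from the question (the _id fields are omitted).
const docs = [
  { uuid: "bbde4bba-062b-4024-9bb0-8b12656afa7e", version: 1, categories: ["sport"] },
  { uuid: "bbde4bba-062b-4024-9bb0-8b12656afa7e", version: 2, categories: ["sport", "shopping"] },
  { uuid: "561c3705-ba6d-432b-98fb-254483fcbefa", version: 1, categories: ["politics"] }
];

// $unwind + first $group: count documents per (category, version) pair.
const counts = new Map();
for (const d of docs) {
  for (const category of d.categories) {
    const key = JSON.stringify([category, d.version]);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
}

// $sort: highest version first, then category name ascending.
const sorted = [...counts.entries()]
  .map(([key, count]) => {
    const [category, version] = JSON.parse(key);
    return { category, version, count };
  })
  .sort((a, b) => b.version - a.version || a.category.localeCompare(b.category));

// Second $group with $first: keep the first row seen for each category.
const latest = new Map();
for (const row of sorted) {
  if (!latest.has(row.category)) latest.set(row.category, row);
}

console.log([...latest.values()]);
// shopping: count 1, version 2
// sport:    count 1, version 2
// politics: count 1, version 1
```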

Compare two collections in MongoDB

I have two different collections, book and music, in JSON. First, here is an example of the book collection:
{
"_id" : ObjectId("b1"),
"author" : [
"Mary",
],
"title" : "Book1",
}
{
"_id" : ObjectId("b2"),
"author" : [
"Joe",
"Tony",
"Mary"
],
"title" : "Book2",
}
{
"_id" : ObjectId("b3"),
"author" : [
"Joe",
"Mary"
],
"title" : "Book3",
}
.......
Mary wrote 3 books, Joe wrote 2 books, and Tony wrote 1 book. Second, here is an example of the music collection:
{
"_id" : ObjectId("m1"),
"author" : [
"Tony"
],
"title" : "Music1",
}
{
"_id" : ObjectId("m2"),
"author" : [
"Joe",
"Tony"
],
"title" : "Music2",
}
.......
Tony has 2 pieces of music, Joe has 1, and Mary has 0.
I want to get the number of authors who have written more books than music.
Thus Mary (3 > 0) and Joe (2 > 1) should be counted, but not Tony (1 < 2), so the final result should be 2 (Mary and Joe).
I wrote the following code, but I don't know how to do the comparison:
db.book.aggregate([
{ $project:{ _id:0, author:1}},
{ $unwind:"$author" },
{$group:{_id:"$author", count:{$sum:1}}}
]
)
db.music.aggregate([
{ $project:{ _id:0, author:1}},
{ $unwind:"$author" },
{$group:{_id:"$author", count:{$sum:1}}}
]
)
Is this right so far? How do I do the comparison? Thanks.
To solve this, we can use the $out stage to store the result of each query in an intermediate collection, and then join the two with an aggregation query ($lookup).
db.book.aggregate([{
$project : {
_id : 0,
author : 1
}
}, {
$unwind : "$author"
}, {
$group : {
_id : "$author",
count : {
$sum : 1
}
}
}, {
$project : {
_id : 0,
author : "$_id",
count : 1
}
}, {
$out : "bookAuthors"
}
])
db.music.aggregate([{
$project : {
_id : 0,
author : 1
}
}, {
$unwind : "$author"
}, {
$group : {
_id : "$author",
count : {
$sum : 1
}
}
}, {
$project : {
_id : 0,
author : "$_id",
count : 1
}
}, {
$out : "musicAuthors"
}
])
db.bookAuthors.aggregate([{
$lookup : {
from : "musicAuthors",
localField : "author",
foreignField : "author",
as : "music"
}
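}, {
// keep authors that have no entry in musicAuthors (for example Mary)
$unwind : {
path : "$music",
preserveNullAndEmptyArrays : true
}
}, {
$project : {
_id : "$author",
result : {
// a missing music count is treated as 0
$gt : ["$count", { $ifNull : ["$music.count", 0] }]
},
count : 1
}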
}, {
$match : {
result : true
}
}
])
EDIT CHANGES:
used the author field instead of _id
added a boolean result field, computed with $gt in the $project stage
Any questions welcome!
Have fun!
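As a cross-check of the expected answer, the same comparison can be sketched in plain JavaScript over the sample data, with no intermediate collections. countByAuthor is a hypothetical helper name. An author who appears in only one collection is treated as having a count of 0 in the other, which is what lets Mary qualify.

```javascript
// Sample data from the question (the _id fields are omitted).
const books = [
  { author: ["Mary"], title: "Book1" },
  { author: ["Joe", "Tony", "Mary"], title: "Book2" },
  { author: ["Joe", "Mary"], title: "Book3" }
];
const music = [
  { author: ["Tony"], title: "Music1" },
  { author: ["Joe", "Tony"], title: "Music2" }
];

// $unwind + $group: number of works per author.
function countByAuthor(works) {
  const counts = new Map();
  for (const work of works) {
    for (const name of work.author) {
      counts.set(name, (counts.get(name) || 0) + 1);
    }
  }
  return counts;
}

const bookCounts = countByAuthor(books);
const musicCounts = countByAuthor(music);

// $lookup + comparison: a missing music count is treated as 0.
const moreBooks = [...bookCounts.entries()]
  .filter(([name, n]) => n > (musicCounts.get(name) || 0))
  .map(([name]) => name);

console.log(moreBooks, moreBooks.length); // [ 'Mary', 'Joe' ] 2
```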

Mongodb Time Series operations and generation

I've got a MongoDB collection with documents like this:
{
"_id" : ObjectId("53cb898bed4bd6c24ae07a9f"),
"account" : "C1",
"created_on" : ISODate("2014-10-01T01:23:00.000Z"),
"value" : 253
}
and
{
"_id" : ObjectId("52cb898bed4bd6c24ae06a9e"),
"account" : "C2",
"created_on" : ISODate("2014-10-01T01:23:00.000Z"),
"value" : 9381
}
There is a document every minute for C1 and C2.
I would like to generate data for another account, "C0", whose value is (C2 - C1)*0.25.
So the aim is to generate a C0 document for every minute in the collection.
Is it possible to do that in the mongo shell?
Thank you very much :)
The logic to solve this problem is as follows:
a) group all the records by created_on date.
b) get the value of both documents in each group.
c) calculate the difference between the C2 and C1 documents for each group.
d) if one of the documents is missing, the difference is the value of the existing document.
e) project a document with value (difference*.25) for each group.
f) insert the projected document into the collection.
I would like to propose two solutions. The first one relies on your assumption:
There is a document every minute for C1 and C2.
So for every created_on time, there would be only two documents, C1 and C2.
db.time.aggregate([ {
$match : {
"account" : {
$in : [ "C1", "C2" ]
}
}
}, {
$group : {
"_id" : "$created_on",
"first" : {
$first : "$value"
},
"second" : {
$last : "$value"
},
"count" : {
$sum : 1
}
}
}, {
$project : {
"_id" : 0,
"value" : {
$multiply : [ {
$cond : [ {
$lte : [ "$count", 1 ]
}, "$first", {
$subtract : [ "$first", "$second" ]
} ]
}, 0.25 ]
},
"created_on" : "$_id",
"account" : {
$literal : "C0"
}
}
} ]).forEach(function(doc) {
doc.value = Math.abs(doc.value);
db.time.insert(doc);
});
The second solution covers a more realistic scenario: for a particular created_on time there can be n C1 documents and m C2 documents with different values, but we still need only one C0 document representing the difference for that time. You need an extra $group stage, as below:
db.time.aggregate([ {
$match : {
"account" : {
$in : [ "C1", "C2" ]
}
}
}, {
$group : {
"_id" : {
"created_on" : "$created_on",
"account" : "$account"
},
"created_on" : {
$first : "$created_on"
},
"values" : {
$sum : "$value"
}
}
}, {
$group : {
"_id" : "$created_on",
"first" : {
$first : "$values"
},
"second" : {
$last : "$values"
},
"count" : {
$sum : 1
}
}
}, {
$project : {
"_id" : 0,
"value" : {
$multiply : [ {
$cond : [ {
$lte : [ "$count", 1 ]
}, "$first", {
$subtract : [ "$first", "$second" ]
} ]
}, 0.25 ]
},
"created_on" : "$_id",
"account" : {
$literal : "C0"
}
}
} ]).forEach(function(doc) {
doc.value = Math.abs(doc.value);
db.time.insert(doc);
});
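For comparison, the per-minute arithmetic can also be sketched in plain JavaScript on documents that have already been read from the collection. This is an illustration, not a replacement for the pipelines above: it keeps the sign of (C2 - C1) instead of taking the absolute value, which is an assumption about what is wanted, and a missing account simply contributes 0. The second pair of readings is invented.

```javascript
// The first pair uses the values from the question; the second is hypothetical.
const readings = [
  { account: "C1", created_on: "2014-10-01T01:23:00.000Z", value: 253 },
  { account: "C2", created_on: "2014-10-01T01:23:00.000Z", value: 9381 },
  { account: "C1", created_on: "2014-10-01T01:24:00.000Z", value: 300 },
  { account: "C2", created_on: "2014-10-01T01:24:00.000Z", value: 500 }
];

// Group by timestamp, summing the values of each account separately.
const byMinute = new Map();
for (const r of readings) {
  if (r.account !== "C1" && r.account !== "C2") continue; // ignore other accounts
  const slot = byMinute.get(r.created_on) || { C1: 0, C2: 0 };
  slot[r.account] += r.value;
  byMinute.set(r.created_on, slot);
}

// Build one C0 document per timestamp: (C2 - C1) * 0.25.
const generated = [...byMinute.entries()].map(([created_on, slot]) => ({
  account: "C0",
  created_on,
  value: (slot.C2 - slot.C1) * 0.25
}));

console.log(generated);
// 01:23 -> value 2282, 01:24 -> value 50
```

Because each account is summed under its own key, the result does not depend on the order in which the documents arrive.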