In MongoDB, how to give two different $match

In my database I have some sample data:
Object 1
"_id" : ObjectId("5b5934bb49b")
"payment" : {
"paid_total" : 500,
"name" : "havi",
"payment_mode" : "cash",
"pd_no" : "PD20725001",
"invoices" : [
{
"invoice_number" : "IN11803831583"
}
],
"type" : "Payment"
}
Object 2
"_id" : ObjectId("5b5934ee31e"),
"patient" : {
"invoice_date" : "2018-07-26",
"invoiceTotal" : 2000,
"pd_no" : "PD20725001",
"type" : "Invoice",
"invoice_number" : "IN11803831583"
}
Note: All the data is in the same collection.
As shown above, I have many such objects in my database. How can I sum invoiceTotal and paid_total, subtract the paid_total sum from the invoiceTotal sum, and show the balance for documents with matching pd_no and invoice_number?
The output I expect looks like:
invoiceTotal : 2000
paid_total : 500
Balance : 1500

Sample Input :
{
"_id" : ObjectId("5b596969a88e07f00d6dac17"),
"payment" : {
"paid_total" : 500,
"name" : "havi",
"payment_mode" : "cash",
"pd_no" : "PD20725001",
"invoices" : [
{
"invoice_number" : "IN11803831583"
}
],
"type" : "Payment"
}
}
{
"_id" : ObjectId("5b596986a88e07f00d6dac18"),
"patient" : {
"invoice_date" : "2018-07-26",
"invoiceTotal" : 2000,
"pd_no" : "PD20725001",
"type" : "Invoice",
"invoice_number" : "IN11803831583"
}
}
Use this aggregation query:
db.test.aggregate([
{
$project : {
_id : 0,
pd_no : { $ifNull: ["$payment.pd_no", "$patient.pd_no" ] },
invoice_no : { $ifNull: [ { $arrayElemAt : ["$payment.invoices.invoice_number", 0] },"$patient.invoice_number" ] },
type : { $ifNull: [ "$payment.type", "$patient.type" ] },
paid_total : { $ifNull: [ "$payment.paid_total", 0 ] },
invoice_total : { $ifNull: [ "$patient.invoiceTotal", 0 ] },
}
},
{
$group : {
_id : {
pd_no : "$pd_no",
invoice_no : "$invoice_no"
},
paid_total : {$sum : "$paid_total"},
invoice_total : {$sum : "$invoice_total"}
}
},
{
$project : {
_id : 0,
pd_no : "$_id.pd_no",
invoice_no : "$_id.invoice_no",
invoice_total : "$invoice_total",
paid_total : "$paid_total",
balance : {$subtract : ["$invoice_total" , "$paid_total"]}
}
}
])
This query first derives pd_no and invoice_no from whichever sub-document (payment or patient) is present, then groups the documents on those two values. Within each group it sums invoice_total and paid_total, and the final $project subtracts paid_total from invoice_total to get the balance.
Output :
{
"pd_no" : "PD20725001",
"invoice_no" : "IN11803831583",
"invoice_total" : 2000,
"paid_total" : 500,
"balance" : 1500
}
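To make the three stages easier to follow, here is a minimal sketch in plain JavaScript (not MongoDB code) that applies the same logic to the two sample documents. The function name balances and the trimmed-down docs array are assumptions made for illustration only.

```javascript
// Plain JavaScript emulation of the three pipeline stages (not MongoDB code).
const docs = [
  { payment: { paid_total: 500, pd_no: "PD20725001", type: "Payment",
               invoices: [{ invoice_number: "IN11803831583" }] } },
  { patient: { invoiceTotal: 2000, pd_no: "PD20725001", type: "Invoice",
               invoice_number: "IN11803831583" } }
];

// Hypothetical helper name; it mirrors $project -> $group -> $project.
function balances(documents) {
  const groups = new Map();
  for (const doc of documents) {
    const pay = doc.payment || {};
    const inv = doc.patient || {};
    // Stage 1 ($ifNull): take each field from whichever sub-document exists.
    const pd_no = pay.pd_no ?? inv.pd_no;
    const invoice_no = pay.invoices?.[0]?.invoice_number ?? inv.invoice_number;
    // Stage 2 ($group): accumulate both totals per (pd_no, invoice_no) pair.
    const key = JSON.stringify([pd_no, invoice_no]);
    const g = groups.get(key) ?? { pd_no, invoice_no, invoice_total: 0, paid_total: 0 };
    g.invoice_total += inv.invoiceTotal ?? 0;
    g.paid_total += pay.paid_total ?? 0;
    groups.set(key, g);
  }
  // Stage 3 ($subtract): derive the balance for every group.
  return [...groups.values()].map(g => ({ ...g, balance: g.invoice_total - g.paid_total }));
}

console.log(balances(docs));
```

For the sample input this produces a single group whose balance is 1500, matching the output shown above.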

I assume that each document has either invoiceTotal or paid_total, never both at the same time.
To compute the balance you first need a signed amount per document: negative for a paid total and positive for an invoice total. You can derive it with a $project stage at the start of the pipeline.
collection.aggregate([
{
$project : {
'patient.invoiceTotal': 1,
'payment.paid_total': 1,
// signed amount: +invoiceTotal for invoices, -paid_total for payments
amount: {
$ifNull: ['$patient.invoiceTotal', { $multiply: [-1, '$payment.paid_total']}]
}
}
},
{
$group: {
_id: 'myGroup',
invoiceTotal: { $sum: '$patient.invoiceTotal' },
paid_total: { $sum: '$payment.paid_total' },
balance: { $sum: '$amount' }
}
}
])
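The signed-amount idea can be sketched in plain JavaScript (not MongoDB code) as follows. The function name totals and the reduced sample documents are assumptions for illustration.

```javascript
// Plain JavaScript emulation of the signed-amount idea (not MongoDB code).
const docs = [
  { payment: { paid_total: 500 } },
  { patient: { invoiceTotal: 2000 } }
];

// Hypothetical helper name; mirrors the $project and $group stages.
function totals(documents) {
  const out = { invoiceTotal: 0, paid_total: 0, balance: 0 };
  for (const doc of documents) {
    const invoice = doc.patient?.invoiceTotal;
    const paid = doc.payment?.paid_total ?? 0;
    // Mirrors $ifNull: the invoice total if present, otherwise the negated payment.
    const signed = invoice ?? -paid;
    out.invoiceTotal += invoice ?? 0;
    out.paid_total += paid;
    out.balance += signed;
  }
  return out;
}

console.log(totals(docs));
```

Note that the pipeline groups on a constant _id, so every document in the collection ends up in one total. To get a balance per pd_no and invoice_number, as the question asks, the group key would have to be built from those fields instead.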

Related

Finding/Counting Duplicate Values in Array in MongoDB

I am new to MongoDB and am using Robo 3T.
I have to find duplicate values inside an array based on channel_id.
I did some research and found that aggregation is needed to group the values and get their respective counts.
I have written the following query, but the results are not as expected.
Sample Documents:
{
"_id" : ObjectId("59b674d141b47e5401897d31"),
"subscribed_channels" : [
{
"channel_id" : "1001",
"channel_name" : "StarPlus",
"channelPrice":"100"
},
{
"channel_id" : "1002",
"channel_name" : "StarGold",
"channelPrice":"75"
},
{
"channel_id" : "1001",
"channel_name" : "StarPlus",
"channelPrice":"100"
},
{
"channel_id" : "1003",
"channel_name" : "SetMax",
"channelPrice":"80"
}
],
"viewer_account_id" : "59b6745b41b47e5401143b3d",
"public_id_type" : "PHONE_NUMBER",
"viewer_id" : "+919322264403",
"role" : "CONSUMER",
"active" : true,
"date_time_created" : NumberLong(1505129681330),
"date_time_modified" : NumberLong(1569320824387)
}
{
"_id" : ObjectId("59b674d141b47e5401897d31"),
"subscribed_channels" : [
{
"channel_id" : "1001",
"channel_name" : "StarPlus",
"channelPrice":"100"
},
{
"channel_id" : "1002",
"channel_name" : "StarGold",
"channelPrice":"75"
},
{
"channel_id" : "1001",
"channel_name" : "StarPlus",
"channelPrice":"100"
},
{
"channel_id" : "1001",
"channel_name" : "StarPlus",
"channelPrice":"100"
}
],
"viewer_account_id" : "59b6745b41b47e5401143c56",
"public_id_type" : "PHONE_NUMBER",
"viewer_id" : "+919322264404",
"role" : "CONSUMER",
"active" : true,
"date_time_created" : NumberLong(1505129681330),
"date_time_modified" : NumberLong(1569320824387)
}
Above are just two documents from the viewers collection.
Query :
db.getCollection('viewers').aggregate([
{
"$group" :
{_id:{
//viewer_id:"$consumer_id",
enterprise_id:"$subscribed_channels.channel_id",
},
"viewer_id": {
$first: "$viewer_id"
},
count:{$sum:1}
}},
{
"$match": {"count": { "$gt": 1 }}
}
])
Actual Output :
{
"_id" : {
"enterprise_id" : [
"1001",
"1001",
"1002",
"1003"
]
},
"consumer_id" : "+919322264403",
"count" : 2.0
}
{
"_id" : {
"enterprise_id" : [
"1001",
"1002",
"1001",
"1001"
]
},
"consumer_id" : "+919322264404",
"count" : 2.0
}
Expected Output :
I want to group based on subscribed_channels.channel_id and get a count respectively
{
"_id" : {
"enterprise_id" : [
"1001",
"1001",
"1002",
"1003"
]
},
"consumer_id" : "+919322264403",
"count" : 2.0
}
{
"_id" : {
"enterprise_id" : [
"1001",
"1001",
"1001",
"1002"
]
},
"consumer_id" : "+919322264404",
"count" : 3.0
}
Grouping is not happening on channel_id, and the count is incorrect.
The count gives me neither the number of subscribed channel_ids nor the number of duplicate channel_ids.
Please guide me in building a query that gives the correct result.
Try the query below:
Query :
db.collection.aggregate([
/** project only needed fields & transform fields as you like */
{
$project: {
customer_id: "$viewer_id",
enterprise_id: "$subscribed_channels.channel_id",
count: {
/** Subtract size of original array & newly formed array which has unique values to get count of duplicates */
$subtract: [
{
$size: "$subscribed_channels.channel_id" // get size of original array
},
{
$size: {
$setUnion: ["$subscribed_channels.channel_id", []] // This will give you an array with unique elements & get size of it
}
}
]
}
}
}
]);
Test : MongoDB-Playground
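The arithmetic behind the $size / $setUnion expression can be illustrated with a small plain JavaScript sketch (not MongoDB code). Both function names are hypothetical, and repeatedChannels is an extra helper that is not part of the query above.

```javascript
// Plain JavaScript emulation of the $size / $setUnion arithmetic (not MongoDB code).
function duplicateCount(channelIds) {
  // A Set keeps one copy of each id, much like $setUnion with an empty array.
  return channelIds.length - new Set(channelIds).size;
}

// Hypothetical helper: occurrences per channel_id, keeping only ids seen more than once.
function repeatedChannels(channelIds) {
  const counts = {};
  for (const id of channelIds) counts[id] = (counts[id] ?? 0) + 1;
  return Object.fromEntries(Object.entries(counts).filter(([, n]) => n > 1));
}

// channel_id values of the two sample documents
console.log(duplicateCount(["1001", "1002", "1001", "1003"]));
console.log(duplicateCount(["1001", "1002", "1001", "1001"]));
console.log(repeatedChannels(["1001", "1002", "1001", "1001"]));
```

The difference of the two sizes counts the extra copies only, so the second sample document yields 2, whereas the expected output in the question shows 3 (every occurrence of "1001"). If the per-channel occurrence count is what you need, the second helper is closer to that.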

Partition data around a match query during aggregation

What I have been trying to get my head around is how to perform some kind of partitioning (split by predicate) in a MongoDB query. My current query looks like:
db.posts.aggregate([
{"$match": { $and:[ {$or:[{"toggled":false},{"toggled":true, "status":"INACTIVE"}]} , {"updatedAt":{$gte:1549786260000}} ] }},
{"$unwind" :"$interests"},
{"$group" : {"_id": {"iid": "$interests", "pid":"$publisher"}, "count": {"$sum" : 1}}},
{"$project":{ _id: 0, "iid": "$_id.iid", "pid": "$_id.pid", "count": 1 }}
])
This results in the following output:
{
"count" : 3.0,
"iid" : "INT456",
"pid" : "P789"
}
{
"count" : 2.0,
"iid" : "INT789",
"pid" : "P789"
}
{
"count" : 1.0,
"iid" : "INT123",
"pid" : "P789"
}
{
"count" : 1.0,
"iid" : "INT123",
"pid" : "P123"
}
All good so far, but then I realized that for documents matching the specific filter {"toggled":true, "status":"INACTIVE"}, I would rather decrement the count by 1 (the eventual value may be negative as well).
Is there a way to partition the data after the match so that a different grouping operation is applied to each set of documents?
Something that sounds similar to what I am looking for is $mergeObjects, or maybe $reduce, but I could not relate much from the documentation examples to my case.
Note: I sense that one straightforward way to deal with this would be to run two queries, but I am looking for a single query.
Sample documents for the above output would be:
/* 1 */
{
"_id" : ObjectId("5d1f7******"),
"id" : "CON123",
"title" : "Game",
"content" : {},
"status" : "ACTIVE",
"toggle":false,
"publisher" : "P789",
"interests" : [
"INT456"
],
"updatedAt" : NumberLong(1582078628264)
}
/* 2 */
{
"_id" : ObjectId("5d1f8******"),
"id" : "CON456",
"title" : "Home",
"content" : {},
"status" : "INACTIVE",
"toggle":true,
"publisher" : "P789",
"interests" : [
"INT456",
"INT789"
],
"updatedAt" : NumberLong(1582078628264)
}
/* 3 */
{
"_id" : ObjectId("5d0e9******"),
"id" : "CON654",
"title" : "School",
"content" : {},
"status" : "ACTIVE",
"toggle":false,
"publisher" : "P789",
"interests" : [
"INT123",
"INT456",
"INT789"
],
"updatedAt" : NumberLong(1582078628264)
}
/* 4 */
{
"_id" : ObjectId("5d207*******"),
"id" : "CON789",
"title":"Stack",
"content" : { },
"status" : "ACTIVE",
"toggle":false,
"publisher" : "P123",
"interests" : [
"INT123"
],
"updatedAt" : NumberLong(1582078628264)
}
The result I am looking for, though, is:
{
"count" : 1.0, (2-1)
"iid" : "INT456",
"pid" : "P789"
}
{
"count" : 0.0, (1-1)
"iid" : "INT789",
"pid" : "P789"
}
{
"count" : 1.0,
"iid" : "INT123",
"pid" : "P789"
}
{
"count" : 1.0,
"iid" : "INT123",
"pid" : "P123"
}
This aggregation gives the desired result.
db.posts.aggregate( [
{ $match: { updatedAt: { $gte: 1549786260000 } } },
{ $facet: {
FALSE: [
{ $match: { toggle: false } },
{ $unwind : "$interests" },
{ $group : { _id : { iid: "$interests", pid: "$publisher" }, count: { $sum : 1 } } },
],
TRUE: [
{ $match: { toggle: true, status: "INACTIVE" } },
{ $unwind : "$interests" },
{ $group : { _id : { iid: "$interests", pid: "$publisher" }, count: { $sum : -1 } } },
]
} },
{ $project: { result: { $concatArrays: [ "$FALSE", "$TRUE" ] } } },
{ $unwind: "$result" },
{ $replaceRoot: { newRoot: "$result" } },
{ $group : { _id : "$_id", count: { $sum : "$count" } } },
{ $project:{ _id: 0, iid: "$_id.iid", pid: "$_id.pid", count: 1 } }
] )
[ EDIT ADD ]
The output from the query using the input data from the question post:
{ "count" : 1, "iid" : "INT123", "pid" : "P789" }
{ "count" : 1, "iid" : "INT123", "pid" : "P123" }
{ "count" : 0, "iid" : "INT789", "pid" : "P789" }
{ "count" : 1, "iid" : "INT456", "pid" : "P789" }
[ EDIT ADD 2 ]
This query gets the same result with a different approach:
db.posts.aggregate( [
{
$match: { updatedAt: { $gte: 1549786260000 } }
},
{
$unwind : "$interests"
},
{
$group : {
_id : {
iid: "$interests",
pid: "$publisher"
},
count: {
$sum: {
$switch: {
branches: [
{ case: { $eq: [ "$toggle", false ] },
then: 1 },
{ case: { $and: [ { $eq: [ "$toggle", true] }, { $eq: [ "$status", "INACTIVE" ] } ] },
then: -1 }
],
// $switch raises an error when no branch matches and no default is given
default: 0
}
}
}
}
},
{
$project:{
_id: 0,
iid: "$_id.iid",
pid: "$_id.pid",
count: 1
}
}
] )
[ EDIT ADD 3 ]
NOTE:
The $facet query runs the two facets (TRUE and FALSE) on the same set of documents; it is like running two queries in parallel. However, it duplicates some code and needs additional stages further down the pipeline to shape the documents into the desired output.
The second query avoids the code duplication and has far fewer stages in the aggregation pipeline. In terms of performance, this makes a difference when the input dataset has a large number of documents. In general, fewer stages means fewer passes over the documents (as each stage has to process the documents output by the previous stage).
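As a rough illustration of the single-pass approach, the sketch below reproduces the conditional sum in plain JavaScript (not MongoDB code) on the four sample documents from the question. The names weight and signedCounts are hypothetical, and returning 0 for documents that match neither case is an assumption of this sketch.

```javascript
// Sample documents from the question, reduced to the fields that matter here.
const posts = [
  { toggle: false, status: "ACTIVE",   publisher: "P789", interests: ["INT456"] },
  { toggle: true,  status: "INACTIVE", publisher: "P789", interests: ["INT456", "INT789"] },
  { toggle: false, status: "ACTIVE",   publisher: "P789", interests: ["INT123", "INT456", "INT789"] },
  { toggle: false, status: "ACTIVE",   publisher: "P123", interests: ["INT123"] }
];

// Mirrors the $switch: +1, -1, or 0 (assumed) when neither case applies.
function weight(post) {
  if (post.toggle === false) return 1;
  if (post.toggle === true && post.status === "INACTIVE") return -1;
  return 0;
}

function signedCounts(documents) {
  const counts = new Map();
  for (const post of documents) {
    for (const iid of post.interests) {         // like $unwind
      const key = `${iid}|${post.publisher}`;   // like the $group _id
      counts.set(key, (counts.get(key) ?? 0) + weight(post));
    }
  }
  return Object.fromEntries(counts);
}

console.log(signedCounts(posts));
```

Each document is visited once and contributes a signed weight to its groups, which is the same idea the $switch expression inside $sum expresses.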

MongoDB count distinct items

I have the following query on a collection with these fields: key, time, p, email
use app_db;
db.getCollection("app_log").aggregate(
[
{
"$match" : {
"key" : "login"
}
},
{
"$group" : {
"_id" : {
"$substr" : [
"$time",
0.0,
10.0
]
},
"total" : {
"$sum" : "$p"
},
"count" : {
"$sum" : 1.0
}
}
}
]
);
and the output is something like this :
{
"_id" : "2019-08-25",
"total" : NumberInt(623),
"count" : 400.0
}
{
"_id" : "2019-08-24",
"total" : NumberInt(2195),
"count" : 1963.0
}
{
"_id" : "2019-08-23",
"total" : NumberInt(1294),
"count" : 1706.0
}
{
"_id" : "2019-08-22",
"total" : NumberInt(53),
"count" : 1302.0
}
But I need the count to be distinct on the email field, that is, the number of distinct email addresses that logged in per day and whose p value is greater than 0.
You need $addToSet to get an array of unique email values per day and then you can use $size to count the number of items in that array:
db.getCollection("app_log").aggregate(
[
{
"$match" : {
"key" : "login"
}
},
{
"$group" : {
"_id" : {
"$substr" : [
"$time",
0.0,
10.0
]
},
"total" : {
"$sum" : "$p"
},
"emails" : {
"$addToSet": "$email"
}
}
},
{
$project: {
_id: 1,
total: 1,
countDistinct: { $size: "$emails" }
}
}
]
);
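The $addToSet plus $size combination behaves like collecting emails into a set per day and reading its size. A minimal plain JavaScript sketch (not MongoDB code) follows; the log rows are invented, the function name is hypothetical, and applying the p > 0 condition as a filter is an assumption taken from the question rather than from the pipeline above.

```javascript
// Hypothetical log rows; only the field names come from the question.
const logs = [
  { key: "login",  time: "2019-08-25 09:00:00", p: 2, email: "a@example.com" },
  { key: "login",  time: "2019-08-25 10:30:00", p: 1, email: "a@example.com" },
  { key: "login",  time: "2019-08-25 11:00:00", p: 0, email: "b@example.com" },
  { key: "login",  time: "2019-08-25 12:00:00", p: 4, email: "c@example.com" },
  { key: "login",  time: "2019-08-24 08:15:00", p: 3, email: "b@example.com" },
  { key: "logout", time: "2019-08-24 09:00:00", p: 5, email: "c@example.com" }
];

function distinctLoginsPerDay(rows) {
  const days = new Map();
  for (const row of rows) {
    // Assumption: keep only logins whose p value is greater than 0.
    if (row.key !== "login" || row.p <= 0) continue;
    const day = row.time.substring(0, 10);   // like $substr on "$time"
    const d = days.get(day) ?? { total: 0, emails: new Set() };
    d.total += row.p;                        // like $sum: "$p"
    d.emails.add(row.email);                 // like $addToSet: "$email"
    days.set(day, d);
  }
  // like $size applied to the collected set
  return [...days].map(([day, d]) => ({ _id: day, total: d.total, countDistinct: d.emails.size }));
}

console.log(distinctLoginsPerDay(logs));
```

In the pipeline itself, the p > 0 condition would most likely go into the $match stage as "p" : { "$gt" : 0 }, next to the key filter.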

Mongodb find data and groupby for another column

{
"_id" : ObjectId("5763e4d6c0140edcb8731485"),
"_class" : "net.microservice.product.domain.Product",
"createdAt" : ISODate("2016-06-17T11:53:58.228Z"),
"createdBy" : "user-0",
"modifiedAt" : ISODate("2016-06-21T06:21:47.524Z"),
"modifiedBy" : "user-0",
"merchant" : "a746f24safa5-e96f-4281-9759-a4a02b306d77",
"type" : DBRef("productTypes", ObjectId("575fd99236623f70c959247f")),
"fields" : {
"Image4" : {
"value" : "http://i.hizliresim.com/ZdELXa.jpg",
"detail" : {
"revisedBy" : "CTA",
"revisionDate" : ISODate("2016-06-21T06:21:47.204Z")
}
},
"Image3" : {
"value" : "http://i.hizliresim.com/l1WkqX.jpg",
"detail" : {
"revisedBy" : "CTA",
"revisionDate" : ISODate("2016-06-21T06:21:47.204Z")
}
},
"Image2" : {
"value" : "http://i.hizliresim.com/VYMl9n.jpg",
"detail" : {
"revisedBy" : "CTA",
"revisionDate" : ISODate("2016-06-21T06:21:47.204Z")
}
},
"Kur" : {
"value" : "TL",
"detail" : {
"revisedBy" : "CTA",
"revisionDate" : ISODate("2016-06-21T06:21:47.204Z")
}
},
"Image1" : {
"value" : "http://i.hizliresim.com/nrWAQ0.jpg",
"detail" : {
"revisedBy" : "CTA",
"revisionDate" : ISODate("2016-06-21T06:21:47.204Z")
}
},
"uploadDate" : ISODate("2016-06-17T11:53:00Z"),
"tasks" : [ ]
}
This is a sample document from the database. I want to get the documents in which:
- modifiedAt is before "modifiedAt" : ISODate("2016-07-21T06:21:47.524Z"),
So I do this, and it works:
db.products.find({
'modifiedAt':
{$lte: ISODate("2016-10-18T13:05:18.961Z"
)} }).
count()
14999
But I need the result per merchant. The 14999 figure is misleading because one merchant has many products, so it counts multiple products for the same merchant.
I need to group by merchant to get distinct merchants, but I could not do it.
I tried this, but
db.products.
aggregate([ {
$group: {
_id: '$merchant', } }, {
$match: {
modifiedAt:
{$lte: ISODate("2016-06-18T13:05:18.961Z")} }} ])
it returns nothing and raises no error.
You can try something like this. It gives you the number of products per merchant.
db.products.aggregate([
{$match: {modifiedAt:{$lte: ISODate("2016-06-21T06:21:47.524Z")}}},
{$group: { _id: "$merchant",count: { $sum: 1 }}}
])
Output:
{ "_id" : "a89846f24safa5-e96f-4281-9759-a4a02b306d77", "count" : 1 }
Always place the $match as early in the aggregation pipeline as possible. Because $match limits the total number of documents in the aggregation pipeline, earlier $match operations minimize the amount of processing down the pipe.
So your query would look like:
db.products.aggregate([
{
$match: {
modifiedAt: {
$lte: ISODate("2016-06-18T13:05:18.961Z")
}
}
},
{
$group: {
_id: '$merchant'
}
}
])
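The order of the stages also explains why the query in the question returned nothing: after a $group that keeps only _id, the documents no longer have a modifiedAt field, so the later $match has nothing to compare. The filter-then-group order can be sketched in plain JavaScript (not MongoDB code); the product rows and the function name are hypothetical.

```javascript
// Hypothetical products; only merchant and modifiedAt matter here.
const products = [
  { merchant: "m-1", modifiedAt: new Date("2016-06-17T10:00:00Z") },
  { merchant: "m-1", modifiedAt: new Date("2016-06-18T09:00:00Z") },
  { merchant: "m-2", modifiedAt: new Date("2016-06-18T12:00:00Z") },
  { merchant: "m-3", modifiedAt: new Date("2016-06-21T06:21:47Z") }
];

function productsPerMerchant(docs, cutoff) {
  const counts = {};
  for (const doc of docs) {
    if (doc.modifiedAt > cutoff) continue;                    // like $match with $lte
    counts[doc.merchant] = (counts[doc.merchant] ?? 0) + 1;   // like $group with $sum: 1
  }
  return counts;
}

const perMerchant = productsPerMerchant(products, new Date("2016-06-18T13:05:18.961Z"));
console.log(perMerchant);                      // products per merchant
console.log(Object.keys(perMerchant).length);  // number of distinct merchants
```

The number of keys in the result is the distinct merchant count the question was after.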

Get lowest per date from multiple arrays in mongodb

I have the following document structure:
{
"_id" : ObjectId("5786458371d24d924d8b4575"),
"uniqueNumber" : "3899822714",
"lastUpdatedAt" : ISODate("2016-07-13T20:11:11.000Z"),
"new" : [
{
"price" : 8.4,
"created" : ISODate("2016-07-13T13:11:28.000Z")
},
{
"price" : 10.0,
"created" : ISODate("2016-07-13T14:50:56.000Z")
}
],
"used" : [
{
"price" : 10.99,
"created" : ISODate("2016-07-08T13:46:31.000Z")
},
{
"price" : 8.59,
"created" : ISODate("2016-07-13T13:11:28.000Z")
}
]
}
Now I need to get a list that gives me the lowest price of each array per date.
So, as an example:
{
"uniqueNumber" : 1234,
"prices" : {
"created" : 2016-07-08,
"minNew" : 123,
"minUsed" : 22
}
}
So far I've built the following query:
db.getCollection('col').aggregate([
{
$match : {
"uniqueNumber" : "3899822714"
}
},
{
$unwind : "$used"
},
{
$project : {
"uniqueNumber" : "$uniqueNumber",
"price" : "$used.price",
"ts" : "$used.created"
}
},
{
$sort : { "ts" : 1 }
},
{
$group : {_id: "$uniqueNumber", priceOfMaxTS : { $min: "$price" }, ts : { $last: "$ts" }}
}
]);
But this only gives me the lowest price for the latest date. I couldn't really find anything that points me in the right direction to get the desired result.
UPDATE
I've found a way to get the lowest price of the used array grouped by day with this query:
db.getCollection('col').aggregate([
{
$match : {
"uniqueNumber" : "3899822714"
}
},
{
$unwind : "$used"
},
{
$project : {
"asin" : "$uniqueNumber",
"price" : "$used.price",
"ts" : "$used.created",
"y" : { "$year" : "$used.created" },
"m" : { "$month" : "$used.created" },
"d" : { "$dayOfMonth" : "$used.created" }
}
},
{
$group : { _id : { "year" : "$y", "month" : "$m", "day" : "$d" }, minPriceOfDay : { $min: "$price" }}
}
]);
Now I only need to find a way to do the same for the new array in the same query.
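To pin down the target shape, here is a minimal plain JavaScript sketch (not a MongoDB pipeline) that computes the lowest price per day for both arrays of the sample document. The function name minPricesPerDay, the use of null for days without a price, and returning prices as an array of per-day entries are assumptions of this sketch.

```javascript
// Sample document from the question, reduced to the relevant fields.
const doc = {
  uniqueNumber: "3899822714",
  new: [
    { price: 8.4,  created: new Date("2016-07-13T13:11:28.000Z") },
    { price: 10.0, created: new Date("2016-07-13T14:50:56.000Z") }
  ],
  used: [
    { price: 10.99, created: new Date("2016-07-08T13:46:31.000Z") },
    { price: 8.59,  created: new Date("2016-07-13T13:11:28.000Z") }
  ]
};

// Hypothetical helper: one entry per UTC day with the minimum of each array.
function minPricesPerDay(d) {
  const days = {};
  for (const [field, outKey] of [["new", "minNew"], ["used", "minUsed"]]) {
    for (const { price, created } of d[field] ?? []) {
      const day = created.toISOString().slice(0, 10);   // UTC calendar day
      if (!days[day]) days[day] = { created: day, minNew: null, minUsed: null };
      const entry = days[day];
      entry[outKey] = entry[outKey] === null ? price : Math.min(entry[outKey], price);
    }
  }
  const prices = Object.values(days).sort((a, b) => a.created.localeCompare(b.created));
  return { uniqueNumber: d.uniqueNumber, prices };
}

console.log(JSON.stringify(minPricesPerDay(doc), null, 2));
```

The key point is that both arrays are folded into the same per-day bucket, with a label recording which array each price came from. A pipeline would need an equivalent step before grouping by day.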