I have data like this in a database:
{
"_id" : ObjectId("5ec4e40a7c89c96c7c3818f0"),
"lob" : "DIGITAL_STORE",
"paymentMode" : "NET_BANKING",
"pgStatus" : "PG_SUCCESS",
"createdAt" : ISODate("2020-05-20T08:02:18.566Z"),
"updatedAt" : ISODate("2020-07-22T18:57:29.915Z"),
"updatedBy" : "ONLINE_CHANNEL",
"_class" : "com.airtel.payments.pg.commons.persistence.PgTransactionDetails"
},
{
"_id" : ObjectId("5ec4e40a7c89c96c7c3818f0"),
"lob" : "DIGITAL_STORE",
"paymentMode" : "NET_BANKING",
"pgStatus" : "PG_FAILED",
"createdAt" : ISODate("2020-05-20T08:02:18.566Z"),
"updatedAt" : ISODate("2020-07-22T18:57:29.915Z"),
"updatedBy" : "ONLINE_CHANNEL",
"_class" : "com.airtel.payments.pg.commons.persistence.PgTransactionDetails"
}
I need to fetch from MongoDB the count of successes and failures in one document, grouped by LOB and payment mode.
I tried something like this, but the results are not segregated by lob and payment mode.
db.getCollection('transactionDetails').aggregate([
{$project: {
Success: {$cond: [{$eq: ["$pgStatus", "PG_SUCCESS" ]}, 1, 0]},
Failed: {$cond: [{$eq: ["$pgStatus", "PG_FAILED"]}, 1, 0]}
}},
{$group: {
_id: {Lob:"$lob",Mode:"$paymentMode"},
Success: {$sum: "$Success"},
Failed: {$sum: "$Failed"}
}}
]);
I can do this separately, but I am not able to get both the success and failure counts in a single document.
This happens because $project passes on only the fields you list. Your stage outputs just Success and Failed (plus _id), so lob and paymentMode are no longer available to the $group stage.
You can either add those fields to the $project, or use $addFields instead of $project.
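A minimal sketch of the $addFields variant, assuming the same collection and field names as in the question. Unlike $project, $addFields keeps every existing field on the document, so lob and paymentMode are still there when $group runs. The pipeline is written as a plain array so it can be passed to aggregate() unchanged.

```javascript
// Sketch only: field and collection names are taken from the question.
const addFieldsPipeline = [
  {
    // $addFields adds the two flags and keeps all existing fields,
    // including lob and paymentMode.
    $addFields: {
      Success: { $cond: [{ $eq: ["$pgStatus", "PG_SUCCESS"] }, 1, 0] },
      Failed: { $cond: [{ $eq: ["$pgStatus", "PG_FAILED"] }, 1, 0] }
    }
  },
  {
    // Group on both keys and add up the flags.
    $group: {
      _id: { Lob: "$lob", Mode: "$paymentMode" },
      Success: { $sum: "$Success" },
      Failed: { $sum: "$Failed" }
    }
  }
];

// In the mongo shell:
// db.getCollection('transactionDetails').aggregate(addFieldsPipeline);
```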
Or you can do all the operations inside $group:
db.getCollection('transactionDetails').aggregate([
{
$group: {
_id: { Lob: "$lob", Mode: "$paymentMode" },
Success: {
$sum: { $cond: [{ $eq: ["$pgStatus", "PG_SUCCESS"] }, 1, 0] }
},
Failed: {
$sum: { $cond: [{ $eq: ["$pgStatus", "PG_FAILED"] }, 1, 0] }
}
}
}
])
I have a collection called "User". I'm passing a userid to get that record. In addition, I also need the 10 most recently updated records (by updatedAt, a DateTime), excluding the userid record, returned together with it. So the total returned will be 11 records in this case. Is that possible using a single query? I tried using Or and lookup but can't make it work as expected.
Any help is appreciated.
User collection:
[{
"id" : "123456",
"name" : "foo",
"addressIds" : [ObjectId(234567)]
},
{
"id" : "345678",
"name" : "bar",
"addressIds" : [ObjectId(678565), ObjectId(567456)]
}]
Address collection:
[{
"_id":"234567",
"district" : "district1",
"pincode" : "568923",
},
{
"_id":"678565",
"district" : "district2",
"pincode" : "568924",
},
{
"_id":"567456",
"district" : "district3",
"pincode" : "568925",
}]
Using facets, I have the User and the addressIds. Can I have the actual documents for the addressIds in User?
Edit:
You can use $facet, like this:
db.collection.aggregate([
{$sort: {date: -1}},
{$facet: {
byDate: [{$limit: 11}], // facet name must match "$byDate" below; 11 leaves 10 after the user's own record is filtered out
byUser: [{$match: {userId: 455845}}]
}
},
{$project: {
byDate: {
$filter: {
input: "$byDate",
as: "item",
cond: {$ne: ["$$item",{"$arrayElemAt": ["$byUser", 0]}]}
}
},
byUser: 1,
}
},
{
$project: {
byDate: {$slice: ["$byDate", 10]},
byUser: 1
}
}
])
You can see it working on the playground.
Change the userId in the $match stage (455845 here) to see both cases.
I have the collection below and need to find duplicate records in MongoDB. How can we do that? Below is one sample document; the collection has more than 10000 records.
/* 1 */
{
"_id" : 1814099,
"eventId" : "LAS012",
"eventName" : "CustomerTab",
"timeStamp" : ISODate("2018-12-31T20:09:09.820Z"),
"eventMethod" : "click",
"resourceName" : "CustomerTab",
"targetType" : "",
"resourseUrl" : "",
"operationName" : "",
"functionStatus" : "",
"results" : "",
"pageId" : "CustomerPage",
"ban" : "290824901",
"jobId" : "87377713",
"wrid" : "87377713",
"jobType" : "IBJ7FXXS",
"Uid" : "sc343x",
"techRegion" : "W",
"mgmtReportingFunction" : "N",
"recordPublishIndicator" : "Y",
"__v" : 0
}
We can first find the unique ids using
const data = await db.collection.aggregate([
{
$group: {
_id: "$eventId",
id: {
"$first": "$_id"
}
}
},
{
$group: {
_id: null,
uniqueIds: {
$push: "$id"
}
}
}
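]).toArray();
// aggregate() returns a cursor; toArray() yields a single document: [{_id: null, uniqueIds: [...]}]
And then we can make another query, which will find all the duplicate documents:
db.collection.find({_id: {$nin: data[0].uniqueIds}})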
This will find all the documents that are redundant.
Another way
To find the event ids which are duplicated
db.collection.aggregate([
{"$group" : { "_id": "$eventId", "count": { "$sum": 1 } } },
{"$match": {"_id" :{ "$ne" : null } , "count" : {"$gt": 1} } }
])
To get the duplicates, keep only the groups whose count is greater than one. After $group, add a $match stage that filters on the count field with the $gt ("greater than") operator and the value 1. This looks like the following:
db.collection.aggregate([
{$group: {
_id: {eventId: "$eventId"},
uniqueIds: {$addToSet: "$_id"},
count: {$sum: 1}
}
},
{$match: {
count: {"$gt": 1}
}
}
]);
I assume that eventId is a unique id.
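To make the shape of the result concrete, here is a plain JavaScript mirror of the $group and $match stages above, run over an in-memory array instead of a collection. It is a sketch for illustration: findDuplicateGroups is a hypothetical helper, not a MongoDB API, and the sample documents are made up.

```javascript
// In-memory mirror of the $group + $match pipeline, for illustration only.
function findDuplicateGroups(docs) {
  const groups = new Map();
  for (const doc of docs) {
    // Equivalent of $group with _id: {eventId: "$eventId"}
    if (!groups.has(doc.eventId)) {
      groups.set(doc.eventId, {
        _id: { eventId: doc.eventId },
        uniqueIds: [],
        count: 0
      });
    }
    const group = groups.get(doc.eventId);
    // Equivalent of $addToSet: "$_id"
    if (!group.uniqueIds.includes(doc._id)) group.uniqueIds.push(doc._id);
    // Equivalent of $sum: 1
    group.count += 1;
  }
  // Equivalent of $match: {count: {$gt: 1}}
  return [...groups.values()].filter((group) => group.count > 1);
}

// Made-up sample: two documents share eventId "LAS012".
const sampleEvents = [
  { _id: 1, eventId: "LAS012" },
  { _id: 2, eventId: "LAS012" },
  { _id: 3, eventId: "LAS013" }
];
const duplicateGroups = findDuplicateGroups(sampleEvents);
console.log(JSON.stringify(duplicateGroups));
```

Each returned group lists the _id values that share an eventId, so every entry in uniqueIds after the first is a candidate for removal.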
I have the data set below in my MongoDB collection. I need to find the latest data for each value of the field "eventType".
{
"_id" : ObjectId("5d5690843248b8c20481f5e9"),
"mrn" : "xp35",
"eventType" : "LAB",
"eventSubType" : "CBC",
"value" : 1,
"units" : 1,
"charttime" : ISODate("2019-08-16T16:46:21.393Z")
}
{
"_id" : ObjectId("5d5690843248b8c20481f5e9"),
"mrn" : "xp35",
"eventType" : "LAB",
"eventSubType" : "CBB",
"value" : 1,
"units" : 1,
"charttime" : ISODate("2019-08-16T16:46:22.393Z")
}
{
"_id" : ObjectId("5d5690843248b8c20481f5ea"),
"mrn" : "zfwy",
"eventType" : "EDLIST",
"eventSubType" : "Lipids",
"value" : 1,
"units" : 1,
"charttime" : ISODate("2019-08-16T16:46:23.394Z")
}
{
"_id" : ObjectId("5d5690843248b8c20481f5ea"),
"mrn" : "zfwy",
"eventType" : "EDLIST",
"eventSubType" : "L",
"value" : 1,
"units" : 1,
"charttime" : ISODate("2019-08-16T16:46:24.394Z")
}
I used 'aggregation' and 'find' queries and sorted it based on timestamp field "charttime" to fetch the latest data but it is not working. I need to fetch data based on field "eventType" so that for each 'eventType' I should get the latest data. So in the given example, I should get the latest data for "LAB" and "EDLIST". Ideally, it should return data:
{
"_id" : ObjectId("5d5690843248b8c20481f5e9"),
"mrn" : "xp35",
"eventType" : "LAB",
"eventSubType" : "CBB",
"value" : 1,
"units" : 1,
"charttime" : ISODate("2019-08-16T16:46:22.393Z")
}
{
"_id" : ObjectId("5d5690843248b8c20481f5ea"),
"mrn" : "zfwy",
"eventType" : "EDLIST",
"eventSubType" : "L",
"value" : 1,
"units" : 1,
"charttime" : ISODate("2019-08-16T16:46:24.394Z")
}
Follow the steps below:
Sort all documents by charttime in descending order first, so that $first picks the latest document in each group.
Group them by eventType.
Project again to put id back into _id (not necessary if you are OK with the id key).
Sort the results again (not necessary if you don't need the different event types ordered by date).
db.collection.aggregate([
{ $sort: {"charttime": -1 }},
{ $group: {
_id: "$eventType",
id: {$first: "$_id"},
"mrn": {$first: "$mrn"},
"eventType": {$first: "$eventType"},
"eventSubType": {$first: "$eventSubType"},
"value": {$first: "$value"},
"units": {$first: "$units"},
"charttime": {$first: "$charttime"}
}},
{$project: {
_id: "$id",
"mrn": 1,
"eventType": 1,
"eventSubType": 1,
"value": 1,
"units": 1,
"charttime": 1
}},
{ $sort: {"charttime": 1 }}
])
Hope this helps!
Output:
/* 1 */
{
"_id" : ObjectId("5d5cedb1fc18699f18a24fa2"),
"mrn" : "xp35",
"eventType" : "LAB",
"eventSubType" : "CBB",
"value" : 1,
"units" : 1,
"charttime" : ISODate("2019-08-16T16:46:22.393Z")
}
/* 2 */
{
"_id" : ObjectId("5d5cedc1fc18699f18a24fa9"),
"mrn" : "zfwy",
"eventType" : "EDLIST",
"eventSubType" : "L",
"value" : 1,
"units" : 1,
"charttime" : ISODate("2019-08-16T16:46:24.394Z")
}
===== UPDATE =====
As per your request to optimize the query:
db.collection.aggregate([
{ $sort: {"charttime": -1 }}, // Sort in descending. (So we would not have another sort after group)
{ $group: {
_id: "$eventType", // Group by event type
data: {$first: "$$ROOT"} // Take whole first record
}},
{ $replaceRoot: { newRoot: "$data" }} // Replaceroot to have document as per your requirement
])
===== UPDATE 2 ====
For too many records:
- Find each eventType and its maximum charttime
- Iterate over each result and fetch the matching records (this makes multiple calls to the DB, but it will take less time)
db.getCollection('Vehicle').aggregate([
{ $group: {
_id: "$eventType", // Group by event type
maxChartTime: {$max: "$charttime"}
}}
]).forEach(function(data) {
db.getCollection('Vehicle').find({
"eventType": data._id,
"charttime": data.maxChartTime
});
// Any mechanism to have array of all retrieved documents.
// You can handle it from your back end too.
})
Note: I have tested it with 506983 records and got results in 0.526 sec.
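The forEach above leaves open how to collect the retrieved documents. The following plain JavaScript sketch mirrors the same two steps over an in-memory array and pushes every match into one result array. latestPerEventType is a hypothetical helper, and the sample values are taken from the question; against a real collection, step 1 would be the $group/$max aggregation and step 2 the find() call.

```javascript
// In-memory sketch of the two-step approach, for illustration only.
function latestPerEventType(docs) {
  // Step 1: maximum charttime per eventType (what $group with $max returns).
  const maxByType = new Map();
  for (const doc of docs) {
    const current = maxByType.get(doc.eventType);
    if (current === undefined || doc.charttime > current) {
      maxByType.set(doc.eventType, doc.charttime);
    }
  }
  // Step 2: one lookup per eventType, with every match pushed into one array.
  const latest = [];
  for (const [eventType, maxChartTime] of maxByType) {
    for (const doc of docs) {
      if (doc.eventType === eventType &&
          doc.charttime.getTime() === maxChartTime.getTime()) {
        latest.push(doc);
      }
    }
  }
  return latest;
}

const chartEvents = [
  { mrn: "xp35", eventType: "LAB", eventSubType: "CBC", charttime: new Date("2019-08-16T16:46:21.393Z") },
  { mrn: "xp35", eventType: "LAB", eventSubType: "CBB", charttime: new Date("2019-08-16T16:46:22.393Z") },
  { mrn: "zfwy", eventType: "EDLIST", eventSubType: "Lipids", charttime: new Date("2019-08-16T16:46:23.394Z") },
  { mrn: "zfwy", eventType: "EDLIST", eventSubType: "L", charttime: new Date("2019-08-16T16:46:24.394Z") }
];
const latestEvents = latestPerEventType(chartEvents);
console.log(latestEvents.map((doc) => doc.eventType + ":" + doc.eventSubType));
```

Note that if several documents of one eventType share the same maximum charttime, this approach returns all of them, not just one.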
First sort the data by charttime (descending) so that the $first accumulator works properly.
Then group by eventType and find the latest of the dates with the $max accumulator.
The $project stage is there to retain the original _id under the same key name. If you don't need it as _id, you can remove that stage altogether.
Aggregation Query:
db.collection.aggregate([
{ $sort: { charttime: -1 } },
{
$group: {
_id: "$eventType",
id: { $first: "$_id" },
mrn: { $first: "$mrn" },
eventType: { $first: "$eventType" },
eventSubType: { $first: "$eventSubType" },
value: { $first: "$value" },
units: { $first: "$units" },
charttime: { $max: "$charttime" }
}
},
{
$project: {
_id: "$id",
mrn: 1,
eventType: 1,
eventSubType: 1,
value: 1,
units: 1,
charttime: 1
}
}
]);
In a MongoDB collection, there is data nested in an absence array.
{
"_id" : ObjectId("5c6c62f3d0e85e6ae3a8c842"),
"absence" : [
{
"date" : ISODate("2017-05-10T17:00:00.000-07:00"),
"code" : "E",
"type" : "E",
"isPartial" : false
},
{
"date" : ISODate("2018-02-24T16:00:00.000-08:00"),
"code" : "W",
"type" : "E",
"isPartial" : false
},
{
"date" : ISODate("2018-02-23T16:00:00.000-08:00"),
"code" : "E",
"type" : "E",
"isPartial" : false
},
{
"date" : ISODate("2018-02-21T16:00:00.000-08:00"),
"code" : "U",
"type" : "U",
"isPartial" : false
},
{
"date" : ISODate("2018-02-20T16:00:00.000-08:00"),
"code" : "R",
"type" : "E",
"isPartial" : false
}
]
}
I'd like to aggregate by absence.type to return a count of every type and the total number of absence children. The results might look like:
{
"_id" : ObjectId("5c6c62f3d0e85e6ae3a8c842"),
"U" : 1,
"E" : 4,
"total" : 5
}
There are several similar questions posted here, but I have yet to successfully adapt the answers to my schema. Any help is greatly appreciated.
Also, are there GUI modeling tools to help with MongoDB query building? The transition from RDBMS queries to the Mongo aggregation pipeline has been quite difficult.
You can use the aggregation below:
db.col.aggregate([
{
$unwind: "$absence"
},
{
$group: {
_id: { _id: "$_id", type: "$absence.type" },
count: { $sum: 1 }
}
},
{
$group: {
_id: "$_id._id",
types: { $push: { k: "$_id.type", v: "$count" } },
total: { $sum: "$count" }
}
},
{
$replaceRoot: {
newRoot: {
$mergeObjects: [ "$$ROOT", { $arrayToObject: "$types" } ]
}
}
},
{
$project: {
types: 0
}
}
])
$unwind gives you a single document per absence. Then you need two $group stages: the first counts by type and _id, and the second aggregates the data per _id. Once you have one document per _id, you just need $replaceRoot with $mergeObjects to promote the dynamically created keys and values (built by $arrayToObject) to the root level.
output:
{ "_id" : ObjectId("5c6c62f3d0e85e6ae3a8c842"), "total" : 5, "U" : 1, "E" : 4 }
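For readers coming from an RDBMS, it may help to see what the pipeline computes as ordinary code. This is a plain JavaScript sketch over one in-memory document, not a replacement for the aggregation: countAbsenceTypes is a hypothetical helper, and the sample keeps only the type field from the question's absence array (the _id is written as a plain string).

```javascript
// In-memory mirror of the $unwind / $group / $arrayToObject pipeline.
function countAbsenceTypes(doc) {
  const counts = {};
  let total = 0;
  // $unwind: visit each absence entry on its own.
  for (const absence of doc.absence) {
    // First $group: count per type within this document.
    counts[absence.type] = (counts[absence.type] || 0) + 1;
    // Second $group: running total across all types.
    total += 1;
  }
  // $replaceRoot with $mergeObjects: promote the per-type counts
  // to the top level next to _id and total.
  return { _id: doc._id, total, ...counts };
}

const absenceDoc = {
  _id: "5c6c62f3d0e85e6ae3a8c842",
  absence: [
    { type: "E" },
    { type: "E" },
    { type: "E" },
    { type: "U" },
    { type: "E" }
  ]
};
const absenceSummary = countAbsenceTypes(absenceDoc);
console.log(JSON.stringify(absenceSummary));
```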
If you know all the possible values of "absence.type", you can $filter the array on each value and compute the $size of the filtered array. This won't work if you don't know all the possible values of "absence.type".
db.col.aggregate([
{ $project: { U: { $size: { $filter: { input: "$absence", as: "a", cond: { $eq: [ "$$a.type", "U"]} }}},
E: { $size: { $filter: { input: "$absence", as: "a", cond: { $eq: [ "$$a.type", "E"]} }}} }},
{ $project: { total: { $add: [ "$U", "$E" ]}, U: 1, E: 1}},
])
I'm using mongodb 2.2. I would like to use the new Aggregation Framework to do queries over my documents, but the elements are arrays.
Here is an example of my $project result:
{
"type" : [
"ads-get-yyy",
"ads-get-zzz"
],
"count" : [
NumberLong(0),
NumberLong(10)
],
"latency" : [
0.9790918827056885,
0.9790918827056885
]
}
I want to group by type, so that for "ads-get-yyy", for example, I get the average of count and the average of latency.
I would like to have something similar to the next query, but one that works on the elements inside each array:
db.test.aggregate(
{
$project : {
"type" : 1,
"count" : 1,
"latency" : 1
}
},{
$group : {
_id: {type : "$type"},
count: {$avg: "$count"},
latency: {$avg: "$latency"}
}
});
I'm just learning the new AF too, but I think you need to first $unwind the types so that you can group by them. So something like:
db.test.aggregate({
$project : {
"type" : 1,
"count" : 1,
"latency" : 1
}
},{
$unwind : "$type"
},{
$group : {
_id: {type : "$type"},
count: {$avg: "$count"},
latency: {$avg: "$latency"}
}
});
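One thing to keep in mind is that unwinding type alone leaves count and latency as arrays. If the three arrays are index-aligned (an assumption; the sample suggests it but the question does not say so), the grouping asked for amounts to pairing elements by position. A plain JavaScript sketch of that pairing, done in application code rather than in the aggregation framework, with a hypothetical averageByType helper and made-up numbers:

```javascript
// Pairs type[i] with count[i] and latency[i], then averages per type.
// Assumes the three arrays in each document have the same length and order.
function averageByType(docs) {
  const sums = new Map();
  for (const doc of docs) {
    doc.type.forEach((type, i) => {
      const entry = sums.get(type) || { count: 0, latency: 0, n: 0 };
      entry.count += Number(doc.count[i]);
      entry.latency += doc.latency[i];
      entry.n += 1;
      sums.set(type, entry);
    });
  }
  // Same output shape as the $group stage: _id holds the type.
  return [...sums].map(([type, entry]) => ({
    _id: { type },
    count: entry.count / entry.n,
    latency: entry.latency / entry.n
  }));
}

// Made-up sample in the shape of the $project result from the question.
const projected = [
  { type: ["ads-get-yyy", "ads-get-zzz"], count: [0, 10], latency: [0.5, 2] },
  { type: ["ads-get-yyy"], count: [4], latency: [1.5] }
];
const averages = averageByType(projected);
console.log(JSON.stringify(averages));
```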