How to find distinct (and greatest) values from a collection using MongoDB?

I have the following collection:
[{
"customerId" : "54a32e9f1e14fa5476d654db",
"hostId" : "192.168.20.20",
"runtimeMilliSeconds" : 1422007201815
}
{
"customerId" : "54a32e9f1e14fa5476d654db",
"hostId" : "192.168.20.20",
"runtimeMilliSeconds" : 1422008101736
}
{
"customerId" : "54a32e9f1e14fa5476d654db",
"hostId" : "192.168.20.21",
"runtimeMilliSeconds" : 1422009002239
}
{
"customerId" : "54a32e9f1e14fa5476d654db",
"hostId" : "192.168.20.21",
"runtimeMilliSeconds" : 1422009901379
}
{
"customerId" : "54a32e9f1e14fa5476d654db",
"hostId" : "192.168.20.22",
"runtimeMilliSeconds" : 1422010801685
}
{
"customerId" : "54a32e9f1e14fa5476d654db",
"hostId" : "192.168.20.22",
"runtimeMilliSeconds" : 1422010801585
}]
I also have a list of hostIds: [ "192.168.20.20" , "192.168.20.21" , "192.168.20.22"]
I want to match the hostId list against the collection and, for each host, return only the latest (greatest) runtimeMilliSeconds, producing the following output:
[{"hostId":"192.168.20.20", "runtime": 1422008101736},
{"hostId":"192.168.20.21", "runtime": 1422009901379},
{"hostId":"192.168.20.22", "runtime": 1422010801685}]
I have tried the following aggregation pipeline:
{ "$match" : { "hostId" : { "$in" : [ "192.168.20.20" , "192.168.20.21" , "192.168.20.22"]} ,
"customerId" : "54a32e9f1e14fa5476d654db"}},
{ "$sort" : { "runtimeMilliSeconds" : -1}},
{ "$group" : { "_id" : { "hostId" : "$hostId" ,
"runtime" : "$runtimeMilliSeconds"}}},
{ "$project" : { "hostId" : "$_id.hostId" ,
"runtimeMilliSeconds" : "$_id.runtime" , "_id" : 0}}
But it returns every value in the collection.
How do I get the output shown above using MongoDB?

Use the $first operator after sorting in descending order:
db.test.aggregate(
[
{ "$match" : { "hostId" : { "$in" : [ "192.168.20.20" , "192.168.20.21" , "192.168.20.22"]} , "customerId" : "54a32e9f1e14fa5476d654db"}},
{ "$sort" : { "runtimeMilliSeconds" : -1}},
{ "$group" : { "_id" : { "hostId" : "$hostId" } , "runtime" : { $first : "$runtimeMilliSeconds" }}},
{ "$project" : { "hostId" : "$_id.hostId" , "runtimeMilliSeconds" : "$runtime" , "_id" : 0}}
]
)
The output will be:
{
"result" : [
{
"hostId" : "192.168.20.20",
"runtimeMilliSeconds" : 1422008101736
},
{
"hostId" : "192.168.20.21",
"runtimeMilliSeconds" : 1422009901379
},
{
"hostId" : "192.168.20.22",
"runtimeMilliSeconds" : 1422010801685
}
],
"ok" : 1
}

The most efficient way to do that would be to use the $max operator (no $sort stage needed):
[
{"$match" : {
"hostId" : { "$in" : [ "192.168.20.20" , "192.168.20.21" , "192.168.20.22"]},
"customerId" : "54a32e9f1e14fa5476d654db"
}},
{ "$group" : {
"_id" : "$hostId",
"runtime" : {"$max" : "$runtimeMilliSeconds"}
}},
{"$project" : {
"hostId" : "$_id" ,
"runtime" : 1,
"_id" : 0
}}
]
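To make the grouping logic concrete, here is a minimal plain JavaScript sketch (not MongoDB code) of what $match followed by $group with $max computes on the sample documents. The function name latestRuntimePerHost is hypothetical. It also hints at why the original attempt returned everything: grouping on both hostId and runtime makes every pair its own group, whereas grouping on hostId alone leaves one entry per host.

```javascript
// Sample documents copied from the question.
const docs = [
  { customerId: "54a32e9f1e14fa5476d654db", hostId: "192.168.20.20", runtimeMilliSeconds: 1422007201815 },
  { customerId: "54a32e9f1e14fa5476d654db", hostId: "192.168.20.20", runtimeMilliSeconds: 1422008101736 },
  { customerId: "54a32e9f1e14fa5476d654db", hostId: "192.168.20.21", runtimeMilliSeconds: 1422009002239 },
  { customerId: "54a32e9f1e14fa5476d654db", hostId: "192.168.20.21", runtimeMilliSeconds: 1422009901379 },
  { customerId: "54a32e9f1e14fa5476d654db", hostId: "192.168.20.22", runtimeMilliSeconds: 1422010801685 },
  { customerId: "54a32e9f1e14fa5476d654db", hostId: "192.168.20.22", runtimeMilliSeconds: 1422010801585 },
];
const hostIds = ["192.168.20.20", "192.168.20.21", "192.168.20.22"];

// Hypothetical helper: mirrors $match + $group/$max, keyed on hostId only.
function latestRuntimePerHost(documents, hosts, customerId) {
  const maxByHost = new Map();
  for (const doc of documents) {
    // Equivalent of $match: requested hosts for one customer.
    if (doc.customerId !== customerId || !hosts.includes(doc.hostId)) continue;
    const current = maxByHost.get(doc.hostId);
    // Equivalent of $max: keep the greatest value seen for this group key.
    if (current === undefined || doc.runtimeMilliSeconds > current) {
      maxByHost.set(doc.hostId, doc.runtimeMilliSeconds);
    }
  }
  // Equivalent of $project: reshape to { hostId, runtime }.
  return [...maxByHost].map(([hostId, runtime]) => ({ hostId, runtime }));
}

const latest = latestRuntimePerHost(docs, hostIds, "54a32e9f1e14fa5476d654db");
console.log(latest);
```

A single pass with a running maximum needs no sorting, which is the same reason the $max pipeline can drop the $sort stage.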

I think you are close to the answer; the following changes should produce the output you want:
{
"$match": {
"hostId": {
"$in": [
"192.168.20.20",
"192.168.20.21",
"192.168.20.22"
]
},
"customerId": "54a32e9f1e14fa5476d654db"
}
},
{
"$group": {
"_id": {
"hostId": "$hostId",
"runtime": "$runtimeMilliSeconds"
}
}
},
{
"$sort": {
"_id.runtime": -1
}
},
{
"$group": {
"_id": "$_id.hostId",
"runtime": {
"$first": "$_id.runtime"
}
}
}

How to write mongo query

How can I get the total number of seats available for a particular movie (the seats in all the theatres showing that movie) from the MongoDB schema below?
I need to write a mongo query to get the results.
{
"_id" : ObjectId("5d637b5ce27c7d60e5c42ae7"),
"name" : "Bangalore",
"movies" : [
{
"name" : "KGF",
"theatres" : [
{
"name" : "PVR",
"seats" : 45
},
{
"name" : "IMAX",
"seats" : 46
}
]
},
{
"name" : "Avengers",
"theatres" : [
{
"name" : "IMAX",
"seats" : 50
}
]
}
],
"_class" : "com.BMS_mongo.ZZ_BMS_mongo_demo.Entity.CityInfo"
}
I have written this code :
db.cities.aggregate( [
{ "$unwind" : "$movies" }, { "$unwind" : "$theatres" } ,
{ "$group" : { _id : "$movies.theatres.seats" ,
total : { "$sum" : "$seats" } }
}
] )
The following query can get us the expected output:
db.collection.aggregate([
{
$unwind:"$movies"
},
{
$unwind:"$movies.theatres"
},
{
$group:{
"_id":"$movies.name",
"movie":{
$first:"$movies.name"
},
"totalSeats":{
$sum:"$movies.theatres.seats"
}
}
},
{
$project:{
"_id":0
}
}
]).pretty()
Data set:
{
"_id" : ObjectId("5d637b5ce27c7d60e5c42ae7"),
"name" : "Bangalore",
"movies" : [
{
"name" : "KGF",
"theatres" : [
{
"name" : "PVR",
"seats" : 45
},
{
"name" : "IMAX",
"seats" : 46
}
]
},
{
"name" : "Avengers",
"theatres" : [
{
"name" : "IMAX",
"seats" : 50
}
]
}
],
"_class" : "com.BMS_mongo.ZZ_BMS_mongo_demo.Entity.CityInfo"
}
Output:
{ "movie" : "Avengers", "totalSeats" : 50 }
{ "movie" : "KGF", "totalSeats" : 91 }
Query:
db.movie.aggregate([{ $unwind: { path: "$movies",} },
{ $unwind: { path: "$movies.theatres",} },
{ $group: { _id: "$movies.name", "moviename": { $first: "$movies.name" },
"totalSeats": { $sum: "$movies.theatres.seats" }} }])
I got the answer using this query:
db.cities.aggregate( [
{ "$match" : { "name" : "Bangalore" } },
{ "$unwind" : "$movies" } ,
{ "$match" : {"movies.name" : "KGF"} },
{ "$unwind" : "$movies.theatres" },
{ "$group" : { _id : "$movies.name", total : { "$sum" : "$movies.theatres.seats"
} } }
] )
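As a rough illustration of what the two $unwind stages and the $sum accumulator do together, here is a plain JavaScript sketch (not MongoDB code) over the sample city document. The function name totalSeatsPerMovie is hypothetical. Each nested loop plays the role of one $unwind, which is why the second unwind has to address the full path movies.theatres.

```javascript
// Sample document from the question, trimmed to the relevant fields.
const cities = [
  {
    name: "Bangalore",
    movies: [
      { name: "KGF", theatres: [{ name: "PVR", seats: 45 }, { name: "IMAX", seats: 46 }] },
      { name: "Avengers", theatres: [{ name: "IMAX", seats: 50 }] },
    ],
  },
];

// Hypothetical helper: mirrors $unwind movies, $unwind movies.theatres, $group/$sum.
function totalSeatsPerMovie(cityDocs) {
  const totals = {};
  for (const city of cityDocs) {
    for (const movie of city.movies) {          // first $unwind: one row per movie
      for (const theatre of movie.theatres) {   // second $unwind: one row per theatre
        // $group on the movie name with $sum over the seats
        totals[movie.name] = (totals[movie.name] || 0) + theatre.seats;
      }
    }
  }
  return totals;
}

const seatTotals = totalSeatsPerMovie(cities);
console.log(seatTotals); // { KGF: 91, Avengers: 50 }
```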

Mongodb If then else, unwinding

I have a document structure like the one below.
{
"_id" : ObjectId("5852c49ba35fe"),
"date" : "20100101",
"p9" : "PWR_FSA",
"p10" : "00278",
"p11" : "002",
"gs" : [
{
"tis" : [
{
"rr" : "00",
"ul" : [
{
"amnt" : 1.0,
"su" : "N"
}
]
}
],
"type" : "PQR"
}
],
"trig_id" : "255"
}
{
"_id" : ObjectId("59fdc49ba35fe"),
"date" : "20100101",
"p9" : "PWR_FSA",
"p10" : "00278",
"p11" : "002",
"gs" : [
{
"mis" : [
{
"rr" : "00",
"tl" : [
{
"amnt" : -1.5,
"su" : "N"
}
]
}
],
"type" : "ABC"
}
],
"trig_id" : "255"
}
Now I want to aggregate the amount by p9, so I have written the query below.
I am getting an error at then : {{$unwind : "$gs.tis"},{$unwind : "$gs.tis.ul"}},
Where am I going wrong? Also, can I do unwinding in the then clause?
db.TB.aggregate({$match :{$and :[{"trig_id" : "255"}]}},
{ $unwind : "$gs" },
{$project: {_id :1,
"amnt" :{
$concat : [ {
$cond : { if : {"gs.type" : {$eq : "PQR"}} ,
then : {{$unwind : "$gs.tis"},{$unwind : "$gs.tis.ul"}},
$cond : { if : {$match : {"gs.tis.ul.su": "N"}} , then : {$gs.tis.ul.amnt} , else: {""}}
else : "" }
}]}},
"total" : { $sum : 1 }
}},
{$group :{"_id": "$p9",
"amnt": {"$sum": "$amnt"}
}
})
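One point that may help here: $unwind is a pipeline stage, not an expression, so it cannot appear inside $cond; any unwinding has to happen in its own stages before the $group. As far as the intent can be read from the query above, the goal is to sum amnt per p9 for gs entries of type PQR whose ul items have su equal to "N". That reading is an assumption, as is the hypothetical function name sumAmountByP9. The plain JavaScript sketch below (not MongoDB code) shows that computation on the two sample documents.

```javascript
// Sample documents from the question, trimmed to the relevant fields.
const tbDocs = [
  { p9: "PWR_FSA", trig_id: "255",
    gs: [{ type: "PQR", tis: [{ rr: "00", ul: [{ amnt: 1.0, su: "N" }] }] }] },
  { p9: "PWR_FSA", trig_id: "255",
    gs: [{ type: "ABC", mis: [{ rr: "00", tl: [{ amnt: -1.5, su: "N" }] }] }] },
];

// Hypothetical helper. Assumption: only gs entries of type PQR contribute,
// and only their ul items with su === "N" are summed.
function sumAmountByP9(documents) {
  const totals = {};
  for (const doc of documents) {
    if (doc.trig_id !== "255") continue;        // $match
    for (const g of doc.gs) {                   // $unwind gs
      if (g.type !== "PQR") continue;           // filter instead of $cond
      for (const ti of g.tis || []) {           // $unwind gs.tis
        for (const u of ti.ul || []) {          // $unwind gs.tis.ul
          if (u.su === "N") {
            totals[doc.p9] = (totals[doc.p9] || 0) + u.amnt; // $group/$sum
          }
        }
      }
    }
  }
  return totals;
}

const amountTotals = sumAmountByP9(tbDocs);
console.log(amountTotals); // { PWR_FSA: 1 }
```

The second sample document contributes nothing under this reading, because its gs entry has type ABC and stores its amounts under mis and tl rather than tis and ul.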

mongodb $unwind empty array

With this data:
{
"_id" : ObjectId("576948b4999274493425c08a"),
"virustotal" : {
"scan_id" : "4a6c3dfc6677a87aee84f4b629303c40bb9e1dda283a67236e49979f96864078-1465973544",
"sha1" : "fd177b8c50b457dbec7cba56aeb10e9e38ebf72f",
"resource" : "4a6c3dfc6677a87aee84f4b629303c40bb9e1dda283a67236e49979f96864078",
"response_code" : 1,
"scan_date" : "2016-06-15 06:52:24",
"results" : [
{
"sig" : "Gen:Variant.Mikey.29601",
"vendor" : "MicroWorld-eScan"
},
{
"sig" : null,
"vendor" : "nProtect"
},
{
"sig" : null,
"vendor" : "CAT-QuickHeal"
},
{
"sig" : "HEUR/QVM07.1.0000.Malware.Gen",
"vendor" : "Qihoo-360"
}
]
}
},
{
"_id" : ObjectId("5768f214999274362f714e8b"),
"virustotal" : {
"scan_id" : "3d283314da4f99f1a0b59af7dc1024df42c3139fd6d4d4fb4015524002b38391-1466529838",
"sha1" : "fb865b8f0227e9097321182324c959106fcd8c27",
"resource" : "3d283314da4f99f1a0b59af7dc1024df42c3139fd6d4d4fb4015524002b38391",
"response_code" : 1,
"scan_date" : "2016-06-21 17:23:58",
"results" : [
{
"sig" : null,
"vendor" : "Bkav"
},
{
"sig" : null,
"vendor" : "ahnlab"
},
{
"sig" : null,
"vendor" : "MicroWorld-eScan"
},
{
"sig" : "Mal/DrodZp-A",
"vendor" : "Qihoo-360"
}
]
}
}
I'm trying to group by and count the vendor when sig is not null in order to obtain something like:
{
"_id" : "Qihoo-360",
"count" : 2
},
{
"_id" : "MicroWorld-eScan",
"count" : 1
},
{
"_id" : "Bkav",
"count" : 0
},
{
"_id" : "CAT-QuickHeal",
"count" : 0
}
At the moment with this code:
db.analysis.aggregate([
{ $unwind: "$virustotal.results" },
{
$group : {
_id : "$virustotal.results.vendor",
count : { $sum : 1 }
}
},
{ $sort : { count : -1 } }
])
I'm getting everything:
{
"_id" : "Qihoo-360",
"count" : 2
},
{
"_id" : "MicroWorld-eScan",
"count" : 2
},
{
"_id" : "Bkav",
"count" : 1
},
{
"_id" : "CAT-QuickHeal",
"count" : 1
}
How can I count 0 if the sig is null?
You need a conditional expression inside the $sum operator that checks whether the "$virustotal.results.sig" value is null. The $gt comparison works for this because, in the BSON comparison order described in the documentation, null sorts below strings, so the comparison against null is true only when a signature is present.
You can restructure your pipeline by adding this expression as follows:
db.analysis.aggregate([
{ "$unwind": "$virustotal.results" },
{
"$group" : {
"_id": "$virustotal.results.vendor",
"count" : {
"$sum": {
"$cond": [
{ "$gt": [ "$virustotal.results.sig", null ] },
1, 0
]
}
}
}
},
{ "$sort" : { "count" : -1 } }
])
Sample Output
/* 1 */
{
"_id" : "Qihoo-360",
"count" : 2
}
/* 2 */
{
"_id" : "MicroWorld-eScan",
"count" : 1
}
/* 3 */
{
"_id" : "Bkav",
"count" : 0
}
/* 4 */
{
"_id" : "CAT-QuickHeal",
"count" : 0
}
/* 5 */
{
"_id" : "nProtect",
"count" : 0
}
/* 6 */
{
"_id" : "ahnlab",
"count" : 0
}
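To show why every vendor still appears while only non-null signatures are counted, here is a plain JavaScript sketch (not MongoDB code) of the $cond inside $sum, run over the two sample documents. The function name countSignaturesPerVendor is hypothetical, and the null check assumes sig is either a string or null, as in the sample data.

```javascript
// Sample documents from the question, trimmed to vendor and sig.
const analyses = [
  { virustotal: { results: [
    { sig: "Gen:Variant.Mikey.29601", vendor: "MicroWorld-eScan" },
    { sig: null, vendor: "nProtect" },
    { sig: null, vendor: "CAT-QuickHeal" },
    { sig: "HEUR/QVM07.1.0000.Malware.Gen", vendor: "Qihoo-360" },
  ] } },
  { virustotal: { results: [
    { sig: null, vendor: "Bkav" },
    { sig: null, vendor: "ahnlab" },
    { sig: null, vendor: "MicroWorld-eScan" },
    { sig: "Mal/DrodZp-A", vendor: "Qihoo-360" },
  ] } },
];

// Hypothetical helper: mirrors $unwind + $group with $sum over a $cond.
function countSignaturesPerVendor(documents) {
  const counts = {};
  for (const doc of documents) {
    for (const result of doc.virustotal.results) {   // $unwind
      // $cond: add 1 when a signature is present, otherwise add 0.
      // Adding 0 still creates the vendor's group, so it shows up with count 0.
      const increment = result.sig !== null && result.sig !== undefined ? 1 : 0;
      counts[result.vendor] = (counts[result.vendor] || 0) + increment;
    }
  }
  return counts;
}

const vendorCounts = countSignaturesPerVendor(analyses);
console.log(vendorCounts);
```

Filtering out the null rows before grouping would instead drop vendors such as Bkav entirely, which is not what the desired output shows.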
I replaced null with None and the numbers increased, but they still do not seem correct.
Basically, running the query in the mongo shell I get:
{
"_id" : "Kaspersky",
"count" : 176.0
}
and from Python:
Kaspersky 64
One of these two is wrong :)
So I'm trying to work out which part of the Python query is written incorrectly compared to the mongo shell one.
I ran a simple query.
In the mongo shell:
rtmp = results_db.analysis.count( { "virustotal.results" : { "$elemMatch" : { "vendor": "Kaspersky", "sig": {"$ne": "null"} } }})
results: 176
db.analysis.count( { "virustotal.results" : { $elemMatch : { "vendor": "Kaspersky", "sig": {$gt: null} } }})
results: 0
Then I tried in python:
rtmp = results_db.analysis.count( { "virustotal.results" : { "$elemMatch" : { "vendor": "Kaspersky", "sig": {"$ne": "null"} } }})
results: 568
rtmp = results_db.analysis.count( { "virustotal.results" : { "$elemMatch" : { "vendor": "Kaspersky", "sig": {"$ne": "None"} } }})
results: 568
rtmp = results_db.analysis.count( { "virustotal.results" : { "$elemMatch" : { "vendor": "Kaspersky", "sig": {"$gt": "None"} } }})
results: 64
rtmp = results_db.analysis.count( { "virustotal.results" : { "$elemMatch" : { "vendor": "Kaspersky", "sig": {"$gt": "null"} } }})
results: 6
It is hard to say which value is correct! I suppose 176, but I am not able to reproduce it in Python...

mongodb aggregation $group and then $push a object

This is my data:
> db.bookmarks.find({"userId" : "56b9b74bf976ab70ff6b9999"}).pretty()
{
"_id" : ObjectId("56c2210fee4a33579f4202dd"),
"userId" : "56b9b74bf976ab70ff6b9999",
"items" : [
{
"itemId" : "28",
"timestamp" : "2016-02-12T18:07:28Z"
},
{
"itemId" : "29",
"timestamp" : "2016-02-12T18:07:29Z"
},
{
"itemId" : "30",
"timestamp" : "2016-02-12T18:07:30Z"
},
{
"itemId" : "31",
"timestamp" : "2016-02-12T18:07:31Z"
},
{
"itemId" : "32",
"timestamp" : "2016-02-12T18:07:32Z"
},
{
"itemId" : "33",
"timestamp" : "2016-02-12T18:07:33Z"
},
{
"itemId" : "34",
"timestamp" : "2016-02-12T18:07:34Z"
}
]
}
I want to get something like this (ideally the _id would become the userId too):
{
"_id" : "56b9b74bf976ab70ff6b9999",
"items" : [
{ "itemId": "32", "timestamp": "2016-02-12T18:07:32Z" },
{ "itemId": "31", "timestamp": "2016-02-12T18:07:31Z" },
{ "itemId": "30", "timestamp": "2016-02-12T18:07:30Z" }
]
}
What I have now:
> db.bookmarks.aggregate(
... { $match: { "userId" : "56b9b74bf976ab70ff6b9999" } },
... { $unwind: '$items' },
... { $sort: { 'items.timestamp': -1} },
... { $skip: 2 },
... { $limit: 3},
... { $group: { '_id': '$userId' , items: { $push: '$items.itemId' } } }
... ).pretty()
{ "_id" : "56b9b74bf976ab70ff6b9999", "items" : [ "32", "31", "30" ] }
I read the MongoDB documentation and found that I can use $push, but I cannot find a way to push an object that is not defined anywhere in the original document. I want the timestamp as well, but I don't know how to modify the $group stage (or another stage) to do so. Thanks for helping!
This code, which I tested in the MongoDB 3.2.1 shell, should give you the output format that you want:
> db.bookmarks.aggregate(
{ "$match" : { "userId" : "Ursula" } },
{ "$unwind" : "$items" },
{ "$sort" : { "items.timestamp" : -1 } },
{ "$skip" : 2 },
{ "$limit" : 3 },
{ "$group" : { "_id" : "$userId", items: { "$push" : { "myPlace" : "$items.itemId", "myStamp" : "$items.timestamp" } } } } ).pretty()
Running the above will produce this output:
{
"_id" : "Ursula",
"items" : [
{
"myPlace" : "52",
"myStamp" : ISODate("2016-02-13T18:07:32Z")
},
{
"myPlace" : "51",
"myStamp" : ISODate("2016-02-13T18:07:31Z")
},
{
"myPlace" : "50",
"myStamp" : ISODate("2016-02-13T18:07:30Z")
}
]
}
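The same sort, skip, limit and push sequence can be sketched in plain JavaScript (not MongoDB code), here using the string timestamps and original field names from the question rather than the renamed myPlace and myStamp fields. The function name pageOfItems is hypothetical. It relies on ISO 8601 timestamp strings in the same time zone sorting chronologically when compared as text.

```javascript
// Sample document from the question.
const bookmark = {
  userId: "56b9b74bf976ab70ff6b9999",
  items: [
    { itemId: "28", timestamp: "2016-02-12T18:07:28Z" },
    { itemId: "29", timestamp: "2016-02-12T18:07:29Z" },
    { itemId: "30", timestamp: "2016-02-12T18:07:30Z" },
    { itemId: "31", timestamp: "2016-02-12T18:07:31Z" },
    { itemId: "32", timestamp: "2016-02-12T18:07:32Z" },
    { itemId: "33", timestamp: "2016-02-12T18:07:33Z" },
    { itemId: "34", timestamp: "2016-02-12T18:07:34Z" },
  ],
};

// Hypothetical helper: mirrors $unwind, $sort (descending), $skip, $limit, $group/$push.
function pageOfItems(doc, skip, limit) {
  // Copy before sorting so the input document is left untouched.
  const sorted = [...doc.items].sort((a, b) =>
    a.timestamp < b.timestamp ? 1 : a.timestamp > b.timestamp ? -1 : 0
  );
  return {
    _id: doc.userId,
    // $push of an object literal: build a new sub-document per item.
    items: sorted.slice(skip, skip + limit).map(({ itemId, timestamp }) => ({ itemId, timestamp })),
  };
}

const page = pageOfItems(bookmark, 2, 3);
console.log(JSON.stringify(page, null, 2));
```

The key point carried over from the answer is that $push accepts an object expression, so each pushed element can combine several fields of the unwound item.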
In MongoDB version 3.2.x, you can also use the $out operator in the very last stage of the aggregation pipeline, and have the output of the aggregation query written to a collection. Here is the code I used:
> db.bookmarks.aggregate(
{ "$match" : { "userId" : "Ursula" } },
{ "$unwind" : "$items" },
{ "$sort" : { "items.timestamp" : -1 } },
{ "$skip" : 2 },
{ "$limit" : 3 },
{ "$group" : { "_id" : "$userId", items: { "$push" : { "myPlace" : "$items.itemId", "myStamp" : "$items.timestamp" } } } },
{ "$out" : "ursula" } )
This gives me a collection named "ursula":
> show collections
ursula
and I can query that collection:
> db.ursula.find().pretty()
{
"_id" : "Ursula",
"items" : [
{
"myPlace" : "52",
"myStamp" : ISODate("2016-02-13T18:07:32Z")
},
{
"myPlace" : "51",
"myStamp" : ISODate("2016-02-13T18:07:31Z")
},
{
"myPlace" : "50",
"myStamp" : ISODate("2016-02-13T18:07:30Z")
}
]
}
>
Last of all, this is the input document I used in the aggregation query. You can compare this document to how I coded the aggregation query to see how I built the new items array.
> db.bookmarks.find( { "userId" : "Ursula" } ).pretty()
{
"_id" : ObjectId("56c240ed55f2f6004dc3b25c"),
"userId" : "Ursula",
"items" : [
{
"itemId" : "48",
"timestamp" : ISODate("2016-02-13T18:07:28Z")
},
{
"itemId" : "49",
"timestamp" : ISODate("2016-02-13T18:07:29Z")
},
{
"itemId" : "50",
"timestamp" : ISODate("2016-02-13T18:07:30Z")
},
{
"itemId" : "51",
"timestamp" : ISODate("2016-02-13T18:07:31Z")
},
{
"itemId" : "52",
"timestamp" : ISODate("2016-02-13T18:07:32Z")
},
{
"itemId" : "53",
"timestamp" : ISODate("2016-02-13T18:07:33Z")
},
{
"itemId" : "54",
"timestamp" : ISODate("2016-02-13T18:07:34Z")
}
]
}

Mongodb : How should I get original Json structure after filter the records based on requirement?

I am new to MongoDB.
I have a JSON document in a collection like this:
{
"_id" : ObjectId("55abf32f358e3aca807f0e6a"),
"usercbid" : 1995492.0000000000000000,
"defaultnotifytype" : {
"status" : true,
"alert" : true,
"action" : true
},
"calendar" : {
"alert" : 2468.0000000000000000,
"action" : 13579.0000000000000000,
"status" : 123456.0000000000000000
},
"assignment" : [
{
"orgid" : {
"service" : "AVPN",
"adminemail" : "pl9129#att.com",
"notifytype" : {
"status" : true,
"alert" : true
},
"keytype" : "MCN",
"KeyValue" : "SK1383"
}
},
{
"orgid" : {
"KeyValue" : "DD3342",
"service" : "<all>",
"keytype" : "MCN"
}
},
{
"orgid" : {
"notifytype" : {
"optout" : true
},
"keytype" : "MCN",
"keyvalue" : "<all>",
"service" : "MVPN"
}
},
{
"order" : {
"date" : "2015-03-15",
"adminemail" : "abc.com",
"notifytype" : {
"alert" : true
},
"id" : 123456.0000000000000000
}
},
{
"order" : {
"id" : 135246.0000000000000000,
"date" : "2015-03-17",
"adminemail" : "abc.com"
}
}
]
}
I would like to filter the above JSON document with the following condition:
var result = db.subscription.aggregate(
[ { $unwind: "$assignment" }
, {$match : {$or:
[
{
"assignment.order.id" : 123456
},
{
"assignment.orgid.keytype" : { $in: ["MCN"]}
,"assignment.orgid.KeyValue" : { $in: ["<all>","SK1383"]}
,"assignment.orgid.service" : { $in: ["<all>","AVPN"]}
}
]
}
}
,{$group: {_id: "$_id", assignment: {$push: "$assignment"}}}
// ,{$project : { usercbid : $usercbid, defaultnotifytype : 1, calendar : 1, assignment: 1} }
]
)
printjson(result);
Result of above query is :
{
"result" : [
{
"_id" : ObjectId("55abf32f358e3aca807f0e6a"),
"assignment" : [
{
"orgid" : {
"service" : "AVPN",
"adminemail" : "pl9129#att.com",
"notifytype" : {
"status" : true,
"alert" : true
},
"keytype" : "MCN",
"KeyValue" : "SK1383"
}
},
{
"order" : {
"date" : "2015-03-15",
"adminemail" : "pl9129#att.com",
"notifytype" : {
"alert" : true
},
"id" : 123456
}
}
]
}
],
"ok" : 1
}
But my final result lost the following original content:
"usercbid" : 1995492.0000000000000000,
"defaultnotifytype" : {
"status" : true,
"alert" : true,
"action" : true
},
"calendar" : {
"alert" : 2468.0000000000000000,
"action" : 13579.0000000000000000,
"status" : 123456.0000000000000000
},
How should I append above original content with filtered records?
Thanks,
$first is the operator that gives you the required output.
The result of a $group stage contains only the fields that are specified inside that $group stage.
Your query groups on "_id" and accumulates only the "assignment" field, so the output of the $group stage contains only those two fields ("_id" and "assignment").
To make the remaining fields (usercbid, defaultnotifytype, calendar) part of the $group output, name them explicitly in the $group stage using $first, referencing each field path with a leading $, as below:
{ $group: { _id: "$_id", assignment: {$push: "$assignment"},
usercbid : { $first : "$usercbid"} ,
defaultnotifytype : { $first : "$defaultnotifytype" } ,
calendar : { $first : "$calendar"}
}
}
$first returns the value that results from applying an expression to the first document in a group of documents that share the same group-by key.
The query below should return the required output:
var result = db.subscription.aggregate(
[ { $unwind: "$assignment" }
, { $match : {$or:
[
{
"assignment.order.id" : 123456
},
{
"assignment.orgid.keytype" : { $in: ["MCN"]}
,"assignment.orgid.KeyValue" : { $in: ["<all>","SK1383"]}
,"assignment.orgid.service" : { $in: ["<all>","AVPN"]}
}
]
}
}
,{ $group: { _id: "$_id", assignment: {$push: "$assignment"},
usercbid : { $first : "$usercbid"} ,
defaultnotifytype : { $first : "$defaultnotifytype" } ,
calendar : { $first : "$calendar"}
}
}
]
).pretty();
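The effect of carrying fields through the group can be sketched in plain JavaScript (not MongoDB code). The function name filterAssignments and the keepAssignment predicate are hypothetical, and the sample document is trimmed to the fields that matter. Because every unwound row of one document has the same top-level fields, taking them from the first row, as $first does, recovers the original values.

```javascript
// Sample document from the question, trimmed; _id shown as a plain string here.
const subscription = {
  _id: "55abf32f358e3aca807f0e6a",
  usercbid: 1995492,
  defaultnotifytype: { status: true, alert: true, action: true },
  calendar: { alert: 2468, action: 13579, status: 123456 },
  assignment: [
    { orgid: { service: "AVPN", keytype: "MCN", KeyValue: "SK1383" } },
    { orgid: { KeyValue: "DD3342", service: "<all>", keytype: "MCN" } },
    { orgid: { keytype: "MCN", keyvalue: "<all>", service: "MVPN" } },
    { order: { date: "2015-03-15", id: 123456 } },
    { order: { id: 135246, date: "2015-03-17" } },
  ],
};

// Hypothetical predicate mirroring the $or inside $match.
const keepAssignment = (a) =>
  Boolean(
    (a.order && a.order.id === 123456) ||
    (a.orgid &&
      a.orgid.keytype === "MCN" &&
      ["<all>", "SK1383"].includes(a.orgid.KeyValue) &&
      ["<all>", "AVPN"].includes(a.orgid.service))
  );

// Hypothetical helper: mirrors $unwind + $match + $group with $push and $first.
function filterAssignments(doc, keep) {
  const matched = doc.assignment.filter(keep);
  if (matched.length === 0) return null; // no rows survive $match, so no group is produced
  return {
    _id: doc._id,
    assignment: matched,                       // $push
    usercbid: doc.usercbid,                    // $first: "$usercbid"
    defaultnotifytype: doc.defaultnotifytype,  // $first: "$defaultnotifytype"
    calendar: doc.calendar,                    // $first: "$calendar"
  };
}

const filtered = filterAssignments(subscription, keepAssignment);
console.log(JSON.stringify(filtered, null, 2));
```

Note that the third assignment is excluded because its key is spelled keyvalue in lowercase, so a condition on KeyValue does not match it.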