One object per distinct field / value - mongodb

I have this schema, which represents a "File object":
{
    "_id" : ObjectId("59f7184aedd1712cdd6a148a"),
    "file" : {
        "part" : ".part000010",
        "creation_time" : "2017-10-26T14:01:42.309597",
        "archive_time" : "2017-10-30T12:17:14.770871",
        "inode" : 18644328,
        "size" : 733326,
        "tags" : null,
        "group" : "ssc_hpci",
        "uuid" : "408bf97c-bd6c-11e7-a854-de7a7d6f0f6c",
        "filename" : "build/hpca/",
        "owner" : "anl001",
        "gid" : 66026,
        "uid" : 66799,
        "checksum" : null,
        "symlink" : false
    },
    "archivename" : "build"
}
Many files will have the same archivename.
I need one object per distinct archivename: Obj1 would have an archive named "build", Obj2 an archive named "build1", and so on.
Neither projection nor aggregation seems suitable for this.
I'm using pymongo.
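If the goal is one representative document per archive name, a grouping step may still be worth a try. The following is only a sketch under assumptions: archivename is assumed to be a top-level field, the second and third sample documents are made up for illustration, and the helper is a pure-Python mirror of the `$group` + `$first` idea so that it can be checked without a server.

```python
# Pipeline that could be passed to pymongo's collection.aggregate().
# Assumption: "archivename" is a top-level field of each document.
pipeline = [
    {"$group": {"_id": "$archivename", "doc": {"$first": "$$ROOT"}}},
]


def one_per_archivename(docs):
    """Pure-Python equivalent: keep the first document seen per archivename."""
    seen = {}
    for doc in docs:
        seen.setdefault(doc["archivename"], doc)
    return list(seen.values())


# Illustrative data; only the first entry comes from the question.
sample = [
    {"archivename": "build", "file": {"part": ".part000010"}},
    {"archivename": "build", "file": {"part": ".part000011"}},
    {"archivename": "build1", "file": {"part": ".part000001"}},
]

result = one_per_archivename(sample)
print([d["archivename"] for d in result])
```

With pymongo this would look like `db.files.aggregate(pipeline)`, where `files` is a hypothetical collection name. If only the names themselves are needed, `db.files.distinct("archivename")` returns them directly.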


And Operator in Criteria not working as expected for nested documents inside aggregation Spring Data Mongo

I am trying to count the replies whose read value is true. I expect 2 (only two replies have read set to true), but the aggregation I built with Spring Data Mongo returns 3. Below is the code I wrote:
Aggregation sumOfRepliesAgg = newAggregation(match(new Criteria().andOperator(Criteria.where("replies.repliedUserId").is(userProfileId),Criteria.where("").is(true))),
unwind("replies"), group("replies").count().as("repliesCount"),project("repliesCount"));
AggregationResults<Comments> totalRepliesCount = mongoOps.aggregate(sumOfRepliesAgg, "COMMENTS",Comments.class);
return totalRepliesCount.getMappedResults().size();
I used the AND operator inside the Criteria query and passed two conditions, but it does not work as expected. Below is the sample data set:
{
    "_id" : ObjectId("5c4ca7c94807e220ac5f7ec2"),
    "_class" : "",
    "comment_data" : "logged by karthe99",
    "totalReplies" : 2,
    "replies" : [
        {
            "_id" : "b33a429f-b201-449b-962b-d589b7979cf0",
            "content" : "dasdsa",
            "createdDate" : ISODate("2019-01-26T18:33:10.674Z"),
            "repliedToUser" : "#karthe99",
            "repliedUserId" : "5bbc305950a1051dac1b1c96",
            "read" : false
        },
        {
            "_id" : "b886f8da-2643-4eca-9d8a-53f90777f492",
            "content" : "dasda",
            "createdDate" : ISODate("2019-01-26T18:33:15.461Z"),
            "repliedToUser" : "#karthe50",
            "repliedUserId" : "5c4bd8914807e208b8a4212b",
            "read" : true
        },
        {
            "_id" : "b56hy4rt-2343-8tgr-988a-c4f90598h492",
            "content" : "dasda",
            "createdDate" : ISODate("2019-01-26T18:33:15.461Z"),
            "repliedToUser" : "#karthe50",
            "repliedUserId" : "5c4bd8914807e208b8a4212b",
            "read" : true
        }
    ],
    "last_modified_by" : "karthe99",
    "last_modified_date" : ISODate("2019-01-26T18:32:41.394Z")
}
What is the mistake in the query that I wrote?
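A likely cause, offered as a sketch rather than a confirmed diagnosis: a `$match` placed before `$unwind` selects whole documents, so every reply of a matching comment survives the unwind, and grouping by the whole reply then yields one group per reply. Filtering again after `$unwind` keeps only the matching replies. The pipeline below is written as plain Python dicts (the corresponding Spring Data stages would be `match`, `unwind`, `match`, `group`); the field name `replies.read` is an assumption based on the sample data, and the helper simulates the stages in memory.

```python
USER_ID = "5c4bd8914807e208b8a4212b"  # repliedUserId taken from the sample data

# Assumed field names: replies.repliedUserId and replies.read.
pipeline = [
    # Optional pre-filter so that only relevant comments get unwound.
    {"$match": {"replies": {"$elemMatch": {"repliedUserId": USER_ID, "read": True}}}},
    {"$unwind": "$replies"},
    # The filter that matters: applied to each individual reply.
    {"$match": {"replies.repliedUserId": USER_ID, "replies.read": True}},
    {"$count": "repliesCount"},
]


def count_read_replies(comments, user_id):
    """In-memory simulation of the unwind + match + count stages."""
    return sum(
        1
        for comment in comments
        for reply in comment.get("replies", [])
        if reply.get("repliedUserId") == user_id and reply.get("read") is True
    )


comments = [{
    "comment_data": "logged by karthe99",
    "replies": [
        {"repliedUserId": "5bbc305950a1051dac1b1c96", "read": False},
        {"repliedUserId": "5c4bd8914807e208b8a4212b", "read": True},
        {"repliedUserId": "5c4bd8914807e208b8a4212b", "read": True},
    ],
}]

print(count_read_replies(comments, USER_ID))  # prints 2
```

Note also that the count should be read from the repliesCount field of the result, not from the number of mapped results.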

How to find something from an array in mongo

"_id" : ObjectId("586aac4c8231ee0b98458045"),
"store_code" : NumberInt(10800),
"counter_name" : "R.N.Electric",
"address" : "314 khatipura road",
"locality" : "Khatipura Road (Jhotwara)",
"pincode" : NumberInt(302012),
"town" : "JAIPUR",
"gtm_city" : "JAIPUR",
"sales_office" : "URAJ",
"owner_name" : "Rajeev",
"owner_mobile" : "9828024073",
"division_mapping" : [//this contains only 1 element in every doc
"dvcode" : "cfc",
"dc" : "trade",
"beatcode" : "govindpura",
"fos" : {
"_id" : ObjectId("586ab8318231ee0b98458843"),
"loginid" : "9928483483",
"name" : "Arpit Gupta",
"division" : [
"sales_office" : "URAJ", //office
"gtm_city" : "JAIPUR" //city
"beat" : {
"_id" : ObjectId("586d372b39f64316b9c3cbd7"),
"division" : {
"_id" : ObjectId("5869f8b639f6430fe4edee2a"),
"clientdvcode" : NumberInt(40),
"code" : "cfc",
"name" : "Cooking & Fabric Care",
"project_code" : "usha-fos",
"client_code" : "usha",
"agent_code" : "v5global"
"beatcode" : "govindpura",
"sales_office" : "URAJ",
"gtm_city" : "JAIPUR",
"active" : true,
"agency_code" : "v5global",
"client_code" : "USHA_FOS",
"proj_code" : "usha-fos",
"fos" : {
"_id" : ObjectId("586ab8318231ee0b98458843"),
"loginid" : "9928483483",
"name" : "Arpit Gupta",
"division" : [
"sales_office" : "URAJ",
"gtm_city" : "JAIPUR"
"distributor_mail" : "",
"project_code" : "usha-fos",
"client_code" : "usha",
"agent_code" : "v5global",
"distributor_name" : "Sundeep Electrical"
The division_mapping array has only one element in each document, and I want to find the documents whose dc in division_mapping is trade.
I have tried the following:
I don't know what I am doing wrong.
Maybe I have to unwind the array, but is there any other way?
According to the MongoDB documentation:
The $elemMatch operator matches documents that contain an array
field with at least one element that matches all the specified query
criteria.
Based on this description, you can retrieve only the documents whose dc in division_mapping is trade with an $elemMatch query on division_mapping.
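A minimal sketch of such a query, written as a Python dict so it could be passed to pymongo's `find()`. The helper is only a tiny equality-based stand-in for `$elemMatch`, used to check the idea in memory, and the second store in the sample is hypothetical.

```python
# Query document; with pymongo it could be passed to collection.find().
query = {"division_mapping": {"$elemMatch": {"dc": "trade"}}}


def matches_elem(doc, array_field, criteria):
    """Stand-in for $elemMatch supporting equality-only criteria."""
    return any(
        isinstance(elem, dict)
        and all(elem.get(key) == value for key, value in criteria.items())
        for elem in doc.get(array_field, [])
    )


# Illustrative data; the second store is made up for contrast.
stores = [
    {"counter_name": "R.N.Electric",
     "division_mapping": [{"dvcode": "cfc", "dc": "trade"}]},
    {"counter_name": "Hypothetical Store",
     "division_mapping": [{"dvcode": "cfc", "dc": "retail"}]},
]

found = [s["counter_name"] for s in stores
         if matches_elem(s, "division_mapping", {"dc": "trade"})]
print(found)
```

For a single condition like this one, dot notation (`{"division_mapping.dc": "trade"}`) should match the same documents, so no unwind is needed; `$elemMatch` becomes important once several conditions must hold for the same array element.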

Mongoid query embedded document and return parent

I have documents like this one; each represents a tool:
{
    "_id" : ObjectId("54da43aea96ddcc40915a457"),
    "checked_in" : false,
    "barcode" : "PXJ-234234",
    "calibrations" : [
        {
            "_id" : ObjectId("54da46ec546173129d810100"),
            "cal_date" : null,
            "cal_date_due" : ISODate("2014-08-06T00:00:00.000+0000"),
            "time_in" : ISODate("2015-02-10T17:46:20.250+0000"),
            "time_out" : ISODate("2015-02-10T17:46:20.250+0000"),
            "updated_at" : ISODate("2015-02-10T17:59:08.796+0000"),
            "created_at" : ISODate("2015-02-10T17:59:08.796+0000")
        },
        {
            "_id" : ObjectId("5509e815686d610b70010000"),
            "cal_date_due" : ISODate("2015-03-18T21:03:17.959+0000"),
            "time_in" : ISODate("2015-03-18T21:03:17.959+0000"),
            "time_out" : ISODate("2015-03-18T21:03:17.959+0000"),
            "cal_date" : ISODate("2015-03-18T21:03:17.959+0000"),
            "updated_at" : ISODate("2015-03-18T21:03:17.961+0000"),
            "created_at" : ISODate("2015-03-18T21:03:17.961+0000")
        },
        {
            "_id" : ObjectId("5509e837686d610b70020000"),
            "cal_date_due" : ISODate("2015-03-18T21:03:51.189+0000"),
            "time_in" : ISODate("2015-03-18T21:03:51.189+0000"),
            "time_out" : ISODate("2015-03-18T21:03:51.189+0000"),
            "cal_date" : ISODate("2015-03-18T21:03:51.189+0000"),
            "updated_at" : ISODate("2015-03-18T21:03:51.191+0000"),
            "created_at" : ISODate("2015-03-18T21:03:51.191+0000")
        }
    ],
    "group" : "Engine",
    "location" : "Here or there",
    "model" : "ZX101C",
    "serial" : NumberInt(15449),
    "tool" : "octane analyzer",
    "updated_at" : ISODate("2015-09-30T20:43:55.652+0000"),
    "description" : "Description..."
}
Tools are calibrated periodically. What I want to do is grab tools that are due this month.
Currently, my query is this:
scope :upcoming, -> { where(:at_ats => false).where('calibrations.0.cal_date_due' => {'$gte' =>, '$lte' =>}).order_by(:'calibrations.cal_date_due'.asc) }
However, this query gets the tool by the first calibration object and it needs to be the last. I've tried a myriad of things, but I'm stuck here.
How can I make sure I'm querying the most recent calibration document, not the first (which would be the oldest and therefore not relevant)?
You should look into the aggregation framework and the $unwind operator.
This link may be of help.
This link may be helpful. It contains an example of using the aggregation framework to get the last element of an array, which in your case is the most recent calibration.
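As a rough illustration of the "last element" idea, here is a sketch that uses `$arrayElemAt` with index -1 instead of `$unwind` (a different technique from the one the answers point to, and one that needs a MongoDB version with `$addFields`). It assumes the last entry of calibrations is the most recent one, as the question states; the field name `last_cal`, the date range, and the second tool are made up for illustration. The helper applies the same logic in memory.

```python
from datetime import datetime


def due_pipeline(start, end):
    """Aggregation stages (plain dicts) selecting tools whose LAST calibration is due in [start, end)."""
    return [
        {"$addFields": {"last_cal": {"$arrayElemAt": ["$calibrations", -1]}}},
        {"$match": {"last_cal.cal_date_due": {"$gte": start, "$lt": end}}},
        {"$sort": {"last_cal.cal_date_due": 1}},
    ]


def tools_due(tools, start, end):
    """In-memory equivalent of due_pipeline()."""
    due = []
    for tool in tools:
        calibrations = tool.get("calibrations") or []
        if not calibrations:
            continue
        due_date = calibrations[-1].get("cal_date_due")
        if due_date is not None and start <= due_date < end:
            due.append(tool)
    return sorted(due, key=lambda t: t["calibrations"][-1]["cal_date_due"])


# The first tool follows the question's document; the second is hypothetical.
tools = [
    {"barcode": "PXJ-234234", "calibrations": [
        {"cal_date_due": datetime(2014, 8, 6)},
        {"cal_date_due": datetime(2015, 3, 18, 21, 3, 17)},
        {"cal_date_due": datetime(2015, 3, 18, 21, 3, 51)},
    ]},
    {"barcode": "HYPO-0001", "calibrations": [
        {"cal_date_due": datetime(2015, 3, 2)},
        {"cal_date_due": datetime(2015, 9, 1)},
    ]},
]

due = tools_due(tools, datetime(2015, 3, 1), datetime(2015, 4, 1))
print([t["barcode"] for t in due])
```

The second tool shows why the first array element is the wrong one to test: its oldest calibration falls in March, but its latest one does not, so it is correctly left out.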

Fix duplicate name situation due to entities created before Orion 0.17.0

Since Orion 0.17.0, the attribute type is no longer used as an attribute "identification key". However, I have entities created with a pre-0.17.0 version that have attributes with the same name and different types. For example, the following entity, which has "ActivePower" duplicated:
> db.entities.findOne({"_id.type": "Regulator", "": "OUTSMART.RG_LAS_LLAMAS_01", "_id.servicePath": "/"})
{
    "_id" : {
        "type" : "Regulator",
        "servicePath" : "/"
    },
    "attrs" : [
        {
            "name" : "TimeInstant",
            "value" : "2015-04-27T01:51:36.000000Z",
            "type" : "urn:x-ogc:def:trs:IDAS:1.0:ISO8601",
            "modDate" : 1430092302
        },
        {
            "name" : "ActivePower",
            "value" : "11778",
            "type" : "urn:x-ogc:def:phenomenon:Outsmart:1.0:ActivePower",
            "modDate" : 1430092302
        },
        {
            "name" : "ReactivePower",
            "value" : "8414",
            "type" : "urn:x-ogc:def:phenomenon:Outsmart:1.0:ReactivePower",
            "modDate" : 1430092302
        },
        {
            "name" : "electricPotential",
            "value" : "231",
            "type" : "urn:x-ogc:def:phenomenon:IDAS:1.0:electricPotential",
            "modDate" : 1430092302
        },
        {
            "name" : "electricCurrent",
            "value" : "20890",
            "type" : "urn:x-ogc:def:phenomenon:IDAS:1.0:electricCurrent",
            "modDate" : 1430092302
        },
        {
            "name" : "Latitud",
            "value" : "43.4716987609863",
            "type" : "urn:x-ogc:def:phenomenon:IDAS:1.0:latitude",
            "modDate" : 1414522843
        },
        {
            "name" : "Longitud",
            "value" : "-3.80692005157471",
            "type" : "urn:x-ogc:def:phenomenon:IDAS:1.0:longitude",
            "modDate" : 1401818472
        },
        {
            "name" : "ActivePower",
            "creDate" : 1393420396,
            "value" : "11778.2",
            "type" : "float",
            "modDate" : 1430092302
        }
    ],
    "modDate" : 1430092302
}
How can I adapt that entity to work with Orion 0.17.0 and beyond?
The simplest solution is to edit the entity using the mongo console, removing all the "duplicated" attributes with the same name and leaving only one. In the example above we have one ActivePower with type urn:x-ogc:def:phenomenon:Outsmart:1.0:ActivePower and another with type float. Let's assume we want to keep the first one.
First of all, stop Orion and take a backup of the database. If something goes wrong while you edit the entity, you may need that backup to return to the initial state and try again.
Next, run the mongo console (let's assume that your DB is named "orion") and get the entity to modify using the findOne() operation. Let's store it in the doc variable.
# mongo orion
> doc = db.entities.findOne({"_id.type": "Regulator", "": "OUTSMART.RG_LAS_LLAMAS_01", "_id.servicePath": "/"})
Now, identify the position within the attrs array of the attribute to remove, taking into account that the position of the first element in the array is 0 (and not 1). In the example above, the attribute to remove is at position 7 (the eighth element). Check this by printing the attribute:
> doc.attrs[7]
{
    "name" : "ActivePower",
    "creDate" : 1393420396,
    "value" : "11778.2",
    "type" : "float",
    "modDate" : 1430092302
}
Use the splice() function to remove the attribute from doc, passing the position of the attribute as the first parameter and 1 as the second. Print the doc value to check that it has been removed:
> doc.attrs.splice(7, 1)
> doc
Repeat the above operation as many times as you need to remove all duplicates (in the example, there is only one). When you are done, save the new version of the entity in the DB:
> db.entities.save(doc)
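The manual steps above could also be scripted when many entities are affected. The following Python sketch is not part of the Orion tooling; it only illustrates the "keep the first attribute per name" rule on an in-memory copy of an entity (shortened here to four attributes), and the function name is hypothetical.

```python
import copy


def drop_duplicate_attrs(entity):
    """Return a copy of the entity keeping only the first attribute per name."""
    cleaned = copy.deepcopy(entity)
    seen = set()
    kept = []
    for attr in cleaned.get("attrs", []):
        if attr["name"] not in seen:
            seen.add(attr["name"])
            kept.append(attr)
    cleaned["attrs"] = kept
    return cleaned


# Shortened version of the entity shown in the question.
entity = {
    "_id": {"type": "Regulator", "servicePath": "/"},
    "attrs": [
        {"name": "TimeInstant", "value": "2015-04-27T01:51:36.000000Z",
         "type": "urn:x-ogc:def:trs:IDAS:1.0:ISO8601"},
        {"name": "ActivePower", "value": "11778",
         "type": "urn:x-ogc:def:phenomenon:Outsmart:1.0:ActivePower"},
        {"name": "ReactivePower", "value": "8414",
         "type": "urn:x-ogc:def:phenomenon:Outsmart:1.0:ReactivePower"},
        {"name": "ActivePower", "value": "11778.2", "type": "float"},
    ],
}

cleaned = drop_duplicate_attrs(entity)
print([a["name"] for a in cleaned["attrs"]])
```

Writing the cleaned document back (for example with pymongo's `replace_one`) would still require stopping Orion and taking a backup first, exactly as for the manual procedure.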

using 2 different result sets in mongodb

I'm using groovy with mongodb. I have a result set but need a value from a different grouping of documents. How do I pull that value into the result set I need?
MAIN:Network data
"resource_metadata" : {
"name" : "tapd2e75adf-71",
"parameters" : { },
"fref" : null,
"instance_id" : "9f170531-79d0-48ee-b0f7-9bd2788b1cc5"}
I need the display_name for the network data result set which is contained in the compute data.
CPU data
"resource_id" : "9f170531-79d0-48ee-b0f7-9bd2788b1cc5",
"resource_metadata" : {
"ramdisk_id" : "",
"display_name" : "testinstance0001"}
You can see that resource_id and instance_id hold the same value. I know MongoDB gives me no relationship to join on, but I'm reaching out to see if anyone has come across this. I'm using a table model to retrieve data for reporting. A Hashtable has been suggested to me, but I don't see how to make that work. Somehow, inside the hasNext loop, I need to include the display_name value in the networking data, so that the readable name from the compute data is shown instead of only the GUID.
def docs = meter.find(query).sort(sort).limit(50)
while (docs.hasNext()) {
    def doc = docs.next()
    model.addRow([doc.get("counter_name"), doc.get("counter_volume"), doc.get("timestamp")] as Object[])
}
Full document:
First set: the network data measurement, which has no name, only an id (resource_metadata.instance_id):
"_id" : ObjectId("528812f8be09a32281e137d0"),
"counter_name" : "network.outgoing.packets",
"user_id" : "4d4e43ec79c5497491b23b13644c2a3b",
"timestamp" : ISODate("2013-11-17T00:51:00Z"),
"resource_metadata" : {
"name" : "tap6baab24e-8f",
"parameters" : { },
"fref" : null,
"instance_id" : "a8727a1d-4661-4565-9c0a-511279024a97",
"instance_type" : "50",
"mac" : "fa:16:3e:a3:bf:fc"
"source" : "openstack",
"counter_unit" : "packet",
"counter_volume" : 4611911,
"project_id" : "97dc4ca962b040608e7e707dd03f2574",
"message_id" : "54039238-4f22-11e3-8e68-e4115b99a59d",
"counter_type" : "cumulative"
Second set: the one I want to grab the name from while I read the values (matched on resource_id):
"_id" : ObjectId("5287bc3ebe09a32281dd2594"),
"counter_name" : "cpu",
"user_id" : "4d4e43ec79c5497491b23b13644c2a3b",
"message_signature" :
"timestamp" : ISODate("2013-11-16T18:40:58Z"),
"resource_id" : "a8727a1d-4661-4565-9c0a-511279024a97",
"resource_metadata" : {
"ramdisk_id" : "",
"display_name" : "vmsapng01",
"name" : "instance-000014d4",
"disk_gb" : "",
"availability_zone" : "",
"kernel_id" : "",
"ephemeral_gb" : "",
"host" : "3746d148a76f4e1a8203d7e2378ef48ccad8a714a47e7481ab37bcb6",
"memory_mb" : "",
"instance_type" : "50",
"vcpus" : "",
"root_gb" : "",
"image_ref" : "869be2c0-9480-4239-97ad-df383c6d09bf",
"architecture" : "",
"os_type" : "",
"reservation_id" : ""
"source" : "openstack",
"counter_unit" : "ns",
"counter_volume" : NumberLong("724574640000000"),
"project_id" : "97dc4ca962b040608e7e707dd03f2574",
"message_id" : "a240fa5a-4eee-11e3-8e68-e4115b99a59d",
"counter_type" : "cumulative"
There is another collection that contains the same value, but I thought it would be easier to grab it from the same collection:
"_id" : "a8727a1d-4661-4565-9c0a-511279024a97",
"metadata" : {
"ramdisk_id" : "",
"display_name" : "vmsapng01",
"name" : "instance-000014d4",
"disk_gb" : "",
"availability_zone" : "",
"kernel_id" : "",
"ephemeral_gb" : "",
"host" : "3746d148a76f4e1a8203d7e2378ef48ccad8a714a47e7481ab37bcb6",
"memory_mb" : "",
"instance_type" : "50",
"vcpus" : "",
"root_gb" : "",
"image_ref" : "869be2c0-9480-4239-97ad-df383c6d09bf",
"architecture" : "",
"os_type" : "",
"reservation_id" : "",
It looks like these data are in two different collections. Is this correct?
Would you be able to query the CPU data for each instance_id (resource_id)?
Or, if this would cause too many queries to the database (it looks like you limit to 50), you could use $in with the list of all instance_id values.
Either way, you will need to query each collection separately.
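The `$in` suggestion could be sketched roughly as follows. This is Python rather than Groovy and works on in-memory dicts; the function names are hypothetical, and the field paths are taken from the documents in the question. The idea is two queries plus a dictionary lookup: collect the instance ids from the network rows, fetch the matching CPU documents in one query, then map each id to its display_name.

```python
def build_name_query(network_docs):
    """Build one $in query fetching the CPU documents for all instance ids."""
    ids = sorted({d["resource_metadata"]["instance_id"] for d in network_docs})
    return {"counter_name": "cpu", "resource_id": {"$in": ids}}


def attach_display_names(network_docs, cpu_docs):
    """Build table rows, replacing the instance id with display_name when known."""
    names = {d["resource_id"]: d["resource_metadata"]["display_name"]
             for d in cpu_docs}
    rows = []
    for d in network_docs:
        instance_id = d["resource_metadata"]["instance_id"]
        rows.append([
            names.get(instance_id, instance_id),  # fall back to the raw id
            d["counter_name"],
            d["counter_volume"],
            d["timestamp"],
        ])
    return rows


network_docs = [{
    "counter_name": "network.outgoing.packets",
    "counter_volume": 4611911,
    "timestamp": "2013-11-17T00:51:00Z",
    "resource_metadata": {"instance_id": "a8727a1d-4661-4565-9c0a-511279024a97"},
}]
cpu_docs = [{
    "counter_name": "cpu",
    "resource_id": "a8727a1d-4661-4565-9c0a-511279024a97",
    "resource_metadata": {"display_name": "vmsapng01"},
}]

print(build_name_query(network_docs))
print(attach_display_names(network_docs, cpu_docs))
```

The same lookup can be written in Groovy with a plain map filled from the second cursor before the hasNext loop adds rows to the table model.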