Using two different result sets in MongoDB

I'm using Groovy with MongoDB. I have a result set, but I need a value from a different group of documents. How do I pull that value into the result set?
MAIN:Network data
"resource_metadata" : {
"name" : "tapd2e75adf-71",
"parameters" : { },
"fref" : null,
"instance_id" : "9f170531-79d0-48ee-b0f7-9bd2788b1cc5"}
I need the display_name for the network data result set which is contained in the compute data.
CPU data
"resource_id" : "9f170531-79d0-48ee-b0f7-9bd2788b1cc5",
"resource_metadata" : {
"ramdisk_id" : "",
"display_name" : "testinstance0001"}
You can see that resource_id and instance_id hold the same value. I know MongoDB has no relationship (join) I can use here, but I'm reaching out to see whether anyone has come across this. I'm using a table model to retrieve data for reporting. A Hashtable has been suggested to me, but I don't see how to make that work. Somehow, inside the hasNext loop, I need to include the display_name value with the networking data, so that the report shows the readable name from the compute data instead of only the GUID.
def docs = meter.find(query).sort(sort).limit(50)
while (docs.hasNext()) {
    def doc = docs.next()  // advance the cursor to the next document
    // one table row per document
    model.addRow([doc.get("counter_name"), doc.get("counter_volume"), doc.get("timestamp")] as Object[])
}
Full document:
First set: the network data measurement, which has no name, only an ID (resource_metadata.instance_id):
"_id" : ObjectId("528812f8be09a32281e137d0"),
"counter_name" : "network.outgoing.packets",
"user_id" : "4d4e43ec79c5497491b23b13644c2a3b",
"timestamp" : ISODate("2013-11-17T00:51:00Z"),
"resource_metadata" : {
"name" : "tap6baab24e-8f",
"parameters" : { },
"fref" : null,
"instance_id" : "a8727a1d-4661-4565-9c0a-511279024a97",
"instance_type" : "50",
"mac" : "fa:16:3e:a3:bf:fc"
"source" : "openstack",
"counter_unit" : "packet",
"counter_volume" : 4611911,
"project_id" : "97dc4ca962b040608e7e707dd03f2574",
"message_id" : "54039238-4f22-11e3-8e68-e4115b99a59d",
"counter_type" : "cumulative"
Second set: the documents from which I want to take the name, matched on resource_id:
"_id" : ObjectId("5287bc3ebe09a32281dd2594"),
"counter_name" : "cpu",
"user_id" : "4d4e43ec79c5497491b23b13644c2a3b",
"message_signature" :
"timestamp" : ISODate("2013-11-16T18:40:58Z"),
"resource_id" : "a8727a1d-4661-4565-9c0a-511279024a97",
"resource_metadata" : {
"ramdisk_id" : "",
"display_name" : "vmsapng01",
"name" : "instance-000014d4",
"disk_gb" : "",
"availability_zone" : "",
"kernel_id" : "",
"ephemeral_gb" : "",
"host" : "3746d148a76f4e1a8203d7e2378ef48ccad8a714a47e7481ab37bcb6",
"memory_mb" : "",
"instance_type" : "50",
"vcpus" : "",
"root_gb" : "",
"image_ref" : "869be2c0-9480-4239-97ad-df383c6d09bf",
"architecture" : "",
"os_type" : "",
"reservation_id" : ""
"source" : "openstack",
"counter_unit" : "ns",
"counter_volume" : NumberLong("724574640000000"),
"project_id" : "97dc4ca962b040608e7e707dd03f2574",
"message_id" : "a240fa5a-4eee-11e3-8e68-e4115b99a59d",
"counter_type" : "cumulative"
Another collection contains the same value, but I thought it would be easier to take it from the same collection:
"_id" : "a8727a1d-4661-4565-9c0a-511279024a97",
"metadata" : {
"ramdisk_id" : "",
"display_name" : "vmsapng01",
"name" : "instance-000014d4",
"disk_gb" : "",
"availability_zone" : "",
"kernel_id" : "",
"ephemeral_gb" : "",
"host" : "3746d148a76f4e1a8203d7e2378ef48ccad8a714a47e7481ab37bcb6",
"memory_mb" : "",
"instance_type" : "50",
"vcpus" : "",
"root_gb" : "",
"image_ref" : "869be2c0-9480-4239-97ad-df383c6d09bf",
"architecture" : "",
"os_type" : "",
"reservation_id" : "",

It looks like these data are in two different collections. Is that correct?
Could you query the CPU data for each instance_id (matching it against resource_id)?
If that would cause too many queries to the database (it looks like you limit to 50), you could use $in with the list of all instance_id values.
Either way, you will need to query each collection separately.
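The $in approach suggested above could be sketched as follows in Python (the same filter document applies from Groovy). This is only a sketch under assumptions: build_name_query and attach_display_names are hypothetical helper names, the field paths are taken from the sample documents, and the functions work on plain dicts so the example runs without a database.

```python
def build_name_query(network_docs):
    """Build one filter that fetches the compute documents for all instances at once."""
    ids = sorted({d["resource_metadata"]["instance_id"] for d in network_docs})
    return {"resource_id": {"$in": ids}}


def attach_display_names(network_docs, compute_docs):
    """Return report rows with the display name in place of the GUID where one is known."""
    names = {}
    for d in compute_docs:
        name = d.get("resource_metadata", {}).get("display_name")
        if name:
            names[d["resource_id"]] = name
    rows = []
    for d in network_docs:
        instance_id = d["resource_metadata"]["instance_id"]
        # Fall back to the GUID when no display name was found.
        rows.append([names.get(instance_id, instance_id),
                     d["counter_name"], d["counter_volume"]])
    return rows


# Trimmed copies of the sample documents from the question.
network = [{"counter_name": "network.outgoing.packets", "counter_volume": 4611911,
            "resource_metadata": {"instance_id": "a8727a1d-4661-4565-9c0a-511279024a97"}}]
compute = [{"resource_id": "a8727a1d-4661-4565-9c0a-511279024a97",
            "resource_metadata": {"display_name": "vmsapng01"}}]

# With a real collection (assumed usage): compute = list(meter.find(query))
query = build_name_query(network)
rows = attach_display_names(network, compute)
print(rows)
```

Building the id-to-name map once keeps the number of database round trips at two (one per result set), regardless of how many rows the report shows.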


One object per distinct field / value

I have this schema that represents a "File object" :
"_id" : ObjectId("59f7184aedd1712cdd6a148a"),
"file" : {
"part" : ".part000010",
"creation_time" : "2017-10-26T14:01:42.309597",
"archive_time" : "2017-10-30T12:17:14.770871",
"inode" : 18644328,
"size" : 733326,
"tags" : null,
"group" : "ssc_hpci",
"uuid" : "408bf97c-bd6c-11e7-a854-de7a7d6f0f6c",
"filename" : "build/hpca/",
"owner" : "anl001",
"gid" : 66026,
"uid" : 66799,
"checksum" : null,
"symlink" : false
"archivename" : "build",
Many files will have the same archivename.
I need one object per distinct archivename. Obj1 would have an archive named "build", Obj2 an archive named "build1", etc.
Projection is not suitable for this, and neither is aggregation.
I'm using pymongo.
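If one representative document per archivename is enough, a minimal sketch could look like the following. It assumes archivename is a top-level field (the nesting in the sample is ambiguous) and uses hypothetical names throughout. PIPELINE relies on the standard $group, $first and $$ROOT aggregation features, and one_per_archive mirrors it on plain dicts so the example runs without a server. The second and third sample parts are made up for illustration.

```python
# Assumed usage with pymongo: collection.aggregate(PIPELINE)
# Without a preceding $sort, which document $first picks is not guaranteed.
PIPELINE = [
    {"$group": {"_id": "$archivename", "doc": {"$first": "$$ROOT"}}},
]


def one_per_archive(docs):
    """Keep the first document seen for each distinct archivename."""
    picked = {}
    for doc in docs:
        picked.setdefault(doc["archivename"], doc)
    return picked


files = [
    {"archivename": "build", "file": {"part": ".part000010"}},
    {"archivename": "build", "file": {"part": ".part000011"}},   # illustrative
    {"archivename": "build1", "file": {"part": ".part000001"}},  # illustrative
]
result = one_per_archive(files)
print(sorted(result))
```

If only the names themselves are needed, pymongo's collection.distinct("archivename") returns the distinct values directly.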

MongoDB: put the fields of an embedded document on the same level as the parent's fields

I have this array of documents. I would like to put the fields of "table" on the same level as mastil_antena and the other variables. How can I do that with aggregate?
I've tried the aggregate $project stage, but I can't get the result.
Example of Data
[ {
"mastil_antena" : "1",
"nro_platf" : "1",
"antmarcmast" : "ANDREW",
"antmodelmast" : "HWXXX6516DSA3M",
"retmarcmast" : "Ericsson",
"retmodelmast" : "ATM200-A20",
"distmast" : "1.50",
"altncramast" : "41.30",
"ORIENTMAG" : "73.00",
"incelecmast" : "RET",
"incmecmast" : "1.00",
"Feedertypemast" : "Fibra Optica",
"longjumpmast" : "5.00",
"longfo" : "100",
"calibrecablefuerza" : "10 mm",
"longcablefuerza" : "65.00",
"modelorruantena" : "32B66A",
"tiltmecfoto" : "",
"tiltmecfoto_fh" : "2017-10-18T05:51:22Z",
"az0foto" : "",
"az0foto_fh" : "2017-10-18T05:55:21Z",
"azneg60foto" : "",
"azneg60foto_fh" : "2017-10-18T05:55:36Z",
"azpos60foto" : "",
"azpos60foto_fh" : "2017-10-18T05:55:49Z",
"etiqantenafoto" : "",
"etiqantenafoto_fh" : "2017-10-18T05:56:01Z",
"tiltelectfoto" : "",
"tiltelectfoto_fh" : "2017-10-18T05:56:13Z",
"idcablefoto" : "",
"idcablefoto_fh" : "2017-10-18T05:56:38Z",
"rrutmafoto" : "",
"rrutmafoto_fh" : "2017-10-18T05:56:49Z",
"etiquetarrufoto" : "",
"etiquetarrufoto_fh" : "2017-10-18T05:57:02Z",
"rrutmafoto1" : "",
"rrutmafoto1_fh" : "2017-10-18T05:57:12Z",
"etiquetarrufoto1" : "",
"etiquetarrufoto1_fh" : "2017-10-18T05:57:27Z",
"botontorre4" : "sstelcel3",
"table" : { /* put all varibles one level up*/
"tecmast" : "LTE",
"frecmast" : "2100",
"secmast" : "1",
"untitled440" : "Salir"
"comentmast" : "",
"longfeedmast" : "",
"numtmasmast" : "",
"otra_marca_antena" : "",
"otro_modelo_antena" : ""
Starting from MongoDB version 3.4 you can use $addFields to do this. The pipeline below assumes that each element of the array shown above is a separate document in the collection. Note that field references in an aggregation expression need the $ prefix.
// replace products with the collection name in your database
db.products.aggregate([
  { // 1. copy the properties of the subdocument "table" to the top level
    $addFields: {
      "tecmast" : "$table.tecmast",
      "frecmast" : "$table.frecmast",
      "secmast" : "$table.secmast",
      "untitled440" : "$table.untitled440"
    }
  },
  { // 2. (optional) remove the "table" property
    $project: { "table" : 0 }
  }
])
Step 1: use $addFields to take the properties from the "table" subdocument and put them on the document itself.
Step 2: (optional) remove the "table" property from the document.
I hope this helps!!!
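The same flattening can be sketched in Python. This is a minimal illustration, assuming every array element is a standalone document with a "table" subdocument. lift_table and add_fields_stage are hypothetical helper names: the first does the move on a plain dict, the second builds an $addFields stage (with $-prefixed field references) from a list of subdocument field names.

```python
def lift_table(doc, key="table"):
    """Return a copy of doc with the fields of doc[key] moved to the top level."""
    flat = {k: v for k, v in doc.items() if k != key}
    # On a name clash the subdocument's value overwrites the top-level one.
    flat.update(doc.get(key) or {})
    return flat


def add_fields_stage(field_names, key="table"):
    """Build the $addFields stage for a known list of subdocument fields."""
    return {"$addFields": {name: "$%s.%s" % (key, name) for name in field_names}}


doc = {"mastil_antena": "1", "table": {"tecmast": "LTE", "frecmast": "2100"}}
flat = lift_table(doc)
stage = add_fields_stage(["tecmast", "frecmast"])
print(flat)
print(stage)
```

Generating the stage from a list avoids typing each field by hand when "table" has many properties.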

Getting data range from firebase with swift

How can I get a specific range of data from Firebase?
I have a table in Firebase like this:
"users" : {
"Jz3IpatRWiWoDbiYM62q6qbHB503" : {
"email" : "",
"lastName" : "Ozdemir",
"name" : "Kaan"
"PmeYYFiac0c55fU2sFpnTP308mC3" : {
"email" : "",
"lastName" : "Hart",
"name" : "Kevin"
"r0bMqSGCWihFi2EF4u6ckSzLP8v1" : {
"email" : "",
"lastName" : "Alvarez",
"name" : "Marcus"
"A3tmSSGCWihFi2EF4u6ckSzLP8c1" : {
"email" : "",
"lastName" : "Swift",
"name" : "Taylor"
"3SUTsiGCWihFi2EF4u6ckSzLP8v2" : {
"email" : "",
"lastName" : "Fellon",
"name" : "Jimmy"
"lgSit3GCWihFi2EF4u6ckSzLP8u3" : {
"email" : "",
"lastName" : "Teller",
"name" : "Jax"
For example, I would like to get the users at positions 2 to 4 (Marcus Alvarez, Taylor Swift, Jimmy Fellon).
Is there any way to do that server side? I don't want to fetch all the data and then pick the values I want. Does anyone know?
Change your JSON DB structure to include an index in every node:
"users" : {
"autoID1" : {
"email" :.....,
"lastName" : ......,
"name" :.......,
"index" : //e.g.. 1,2,3,4......
"noOfUsers" : 223,
If you append to this users node from the app, you have to keep track of the number of users in the users node and update noOfUsers whenever a new user is added. To set the next user's index, retrieve that value (e.g. 223), use it to set the new index, and then increment noOfUsers.
To retrieve the users with index 2 to 4, you can now use the following (the bounds are passed as numbers because index is stored as a number):
Database.database().reference().child("users").queryOrdered(byChild: "index").queryStarting(atValue: 2).queryEnding(atValue: 4).observe....
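The bookkeeping described above can be illustrated with an in-memory stand-in. This is not the Firebase API: db is a plain dict, add_user and users_in_range are hypothetical names, the user keys are shortened, and the index is assumed to start at 1. With a 1-based index, the three users the question asks for (Marcus, Taylor, Jimmy) have index 3 to 5.

```python
def add_user(db, uid, user):
    """Store a user with the next index and bump the noOfUsers counter."""
    next_index = db.get("noOfUsers", 0) + 1
    db.setdefault("users", {})[uid] = dict(user, index=next_index)
    db["noOfUsers"] = next_index
    return next_index


def users_in_range(db, start, end):
    """Mimic ordering by "index" and keeping start <= index <= end."""
    hits = [u for u in db.get("users", {}).values() if start <= u["index"] <= end]
    return sorted(hits, key=lambda u: u["index"])


db = {}
for uid, name in [("u1", "Kaan"), ("u2", "Kevin"), ("u3", "Marcus"),
                  ("u4", "Taylor"), ("u5", "Jimmy"), ("u6", "Jax")]:
    add_user(db, uid, {"name": name})

names = [u["name"] for u in users_in_range(db, 3, 5)]
print(names)
```

In a real multi-client app the counter update would need to be atomic (for example via a transaction), otherwise two clients adding users at the same time could receive the same index.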

How to find something from an array in mongo

"_id" : ObjectId("586aac4c8231ee0b98458045"),
"store_code" : NumberInt(10800),
"counter_name" : "R.N.Electric",
"address" : "314 khatipura road",
"locality" : "Khatipura Road (Jhotwara)",
"pincode" : NumberInt(302012),
"town" : "JAIPUR",
"gtm_city" : "JAIPUR",
"sales_office" : "URAJ",
"owner_name" : "Rajeev",
"owner_mobile" : "9828024073",
"division_mapping" : [//this contains only 1 element in every doc
"dvcode" : "cfc",
"dc" : "trade",
"beatcode" : "govindpura",
"fos" : {
"_id" : ObjectId("586ab8318231ee0b98458843"),
"loginid" : "9928483483",
"name" : "Arpit Gupta",
"division" : [
"sales_office" : "URAJ", //office
"gtm_city" : "JAIPUR" //city
"beat" : {
"_id" : ObjectId("586d372b39f64316b9c3cbd7"),
"division" : {
"_id" : ObjectId("5869f8b639f6430fe4edee2a"),
"clientdvcode" : NumberInt(40),
"code" : "cfc",
"name" : "Cooking & Fabric Care",
"project_code" : "usha-fos",
"client_code" : "usha",
"agent_code" : "v5global"
"beatcode" : "govindpura",
"sales_office" : "URAJ",
"gtm_city" : "JAIPUR",
"active" : true,
"agency_code" : "v5global",
"client_code" : "USHA_FOS",
"proj_code" : "usha-fos",
"fos" : {
"_id" : ObjectId("586ab8318231ee0b98458843"),
"loginid" : "9928483483",
"name" : "Arpit Gupta",
"division" : [
"sales_office" : "URAJ",
"gtm_city" : "JAIPUR"
"distributor_mail" : "",
"project_code" : "usha-fos",
"client_code" : "usha",
"agent_code" : "v5global",
"distributor_name" : "Sundeep Electrical"
There is only one element in each division_mapping array, and I want to find the documents whose dc in division_mapping is trade.
I have tried the following:
I don't know what I am doing wrong.
// Maybe I have to unwind the array, but is there any other way?
According to the MongoDB documentation:
The $elemMatch operator matches documents that contain an array
field with at least one element that matches all the specified query
criteria.
Based on that description, to retrieve only the documents whose dc in division_mapping is trade, try a filter of the form { "division_mapping" : { $elemMatch : { "dc" : "trade" } } }.
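As a sketch of what the $elemMatch filter expresses, the following Python snippet defines the filter document as it would be passed to a driver and a small pure-Python check with the same meaning. matches_dc is a hypothetical helper, and the second store (code 10801, dc "retail") is made up for illustration. For a single condition like this one, the dot-notation filter shown as DOT_FILTER selects the same documents, so no $unwind is needed.

```python
# Filter documents as they could be passed to a driver's find() call.
ELEM_MATCH_FILTER = {"division_mapping": {"$elemMatch": {"dc": "trade"}}}
DOT_FILTER = {"division_mapping.dc": "trade"}


def matches_dc(doc, dc="trade"):
    """True if at least one element of division_mapping has the given dc."""
    return any(isinstance(e, dict) and e.get("dc") == dc
               for e in doc.get("division_mapping", []))


stores = [
    {"store_code": 10800, "division_mapping": [{"dvcode": "cfc", "dc": "trade"}]},
    {"store_code": 10801, "division_mapping": [{"dvcode": "cfc", "dc": "retail"}]},  # illustrative
]
found = [s["store_code"] for s in stores if matches_dc(s)]
print(found)
```

$elemMatch becomes necessary once several conditions must hold for the same array element. With one condition the two filters behave alike.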

Migrating from MongoDB to HBase

Hi, I am very new to the HBase database. I downloaded some Twitter data and stored it in MongoDB. Now I need to move that data into HBase to speed up Hadoop processing, but I am not able to create its schema. Here is the Twitter data in JSON format:
"_id" : ObjectId("512b71e6e4b02a4322d1c0b0"),
"id" : NumberLong("306044618179506176"),
"source" : "Facebook",
"user" : {
"name" : "Dada Bhagwan",
"location" : "India",
"url" : "",
"id" : 191724440,
"protected" : false,
"timeZone" : null,
"description" : "Founder of Akram Vignan - Practical Spiritual Science of Self Realization",
"screenName" : "dadabhagwan",
"geoEnabled" : false,
"profileImageURL" : "",
"biggerProfileImageURL" : "",
"profileImageUrlHttps" : "",
"profileImageURLHttps" : "",
"biggerProfileImageURLHttps" : "",
"miniProfileImageURLHttps" : "",
"originalProfileImageURLHttps" : "",
"followersCount" : 499,
"profileBackgroundColor" : "EEE4C1",
"profileTextColor" : "333333",
"profileLinkColor" : "990000",
"lang" : "en",
"profileSidebarFillColor" : "FCF9EC",
"profileSidebarBorderColor" : "CBC09A",
"profileUseBackgroundImage" : true,
"showAllInlineMedia" : false,
"friendsCount" : 1,
"favouritesCount" : 0,
"profileBackgroundImageUrl" : "",
"profileBackgroundImageURL" : "",
"profileBackgroundImageUrlHttps" : "",
"profileBannerURL" : null,
"profileBannerRetinaURL" : null,
"profileBannerIPadURL" : null,
"profileBannerIPadRetinaURL" : null,
"miniProfileImageURL" : "",
"originalProfileImageURL" : "",
"utcOffset" : -1,
"contributorsEnabled" : false,
"status" : null,
"createdAt" : NumberLong("1284700143000"),
"profileBannerMobileURL" : null,
"profileBannerMobileRetinaURL" : null,
"profileBackgroundTiled" : false,
"statusesCount" : 1713,
"verified" : false,
"translator" : false,
"listedCount" : 6,
"followRequestSent" : false,
"descriptionURLEntities" : [ ],
"urlentity" : {
"url" : "",
"start" : 0,
"end" : 26,
"expandedURL" : "",
"displayURL" : ""
"rateLimitStatus" : null,
"accessLevel" : 0
"contributors" : [ ],
"geoLocation" : null,
"place" : null,
"favorited" : false,
"retweet" : false,
"retweetedStatus" : null,
"retweetCount" : 0,
"userMentionEntities" : [ ],
"retweetedByMe" : false,
"currentUserRetweetId" : -1,
"possiblySensitive" : false,
"urlentities" : [
"url" : "",
"start" : 113,
"end" : 135,
"expandedURL" : "",
"displayURL" : ""
"hashtagEntities" : [ ],
"mediaEntities" : [ ],
"truncated" : false,
"inReplyToStatusId" : -1,
"text" : "Spiritual Quote of the Day :\n\n‘I am Chandubhai’ is an illusion itself and from that are \nkarmas charged. When...",
"inReplyToUserId" : -1,
"inReplyToScreenName" : null,
"createdAt" : NumberLong("1361801697000"),
"rateLimitStatus" : null,
"accessLevel" : 0
How should I divide this data into columns and column families? I thought of making one "twitter" column family that contains source, geoLocation, place, retweet, etc., and another "user" column family that contains name, location, etc. (the user's data), i.e. a new column family for each inner-level sub-document.
Is this approach correct? How will I differentiate the urlentity in the "user" column family from the one in the "twitter" column family?
And how do I handle keys that contain a list of sub-documents (e.g. urlentity)?
There are many ways to model this in HBase, ranging from storing everything in a single column to having a separate table for each sub-entity, with several other tables for "indexing".
Generally speaking, you model the data in HBase based on your read and write access patterns. For example, column families are stored in different files on disk, so a reason to divide data into two column families is that there are many cases where you need data from one and not the other.
There's a good presentation about HBase schema design by Ian Varley from HBaseCon 2012; the slides and the video are both available online.
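One of the many possible layouts, the two-family idea from the question, can be sketched in Python as a flattening step. Everything here is an assumption for illustration: to_cells is a hypothetical helper, nested keys become dotted qualifiers, and list elements are distinguished by their position. Because the family name is part of each cell key, the user's urlentity and the tweet's urlentities do not collide.

```python
def to_cells(tweet, nested_families=("user",), default_family="twitter"):
    """Flatten a tweet dict into {"family:qualifier": value} cells."""
    cells = {}

    def walk(family, prefix, value):
        if isinstance(value, dict):
            for k, v in value.items():
                walk(family, prefix + [k], v)
        elif isinstance(value, list):
            # List elements get their position as part of the qualifier.
            for i, v in enumerate(value):
                walk(family, prefix + [str(i)], v)
        else:
            cells["%s:%s" % (family, ".".join(prefix))] = value

    for key, value in tweet.items():
        if key in nested_families and isinstance(value, dict):
            # A sub-document listed in nested_families becomes its own column family.
            for k, v in value.items():
                walk(key, [k], v)
        else:
            walk(default_family, [key], value)
    return cells


# Trimmed copy of the sample tweet from the question.
tweet = {
    "source": "Facebook",
    "user": {"name": "Dada Bhagwan", "urlentity": {"start": 0, "end": 26}},
    "urlentities": [{"start": 113, "end": 135}],
}
cells = to_cells(tweet)
for name in sorted(cells):
    print(name, cells[name])
```

In an actual HBase table the values would still have to be encoded as bytes and a row key chosen (the tweet id is one candidate). Whether two families are worthwhile depends on the access patterns described above.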