I have been asked to perform a very basic query task in MongoDB, but I am unable to work out the proper syntax for querying the collection with aggregate functions.
I need to query the email collection for all email attachment sizes and sum them for today. The customer is asking me to group all the email attachments for just their account for a single day (today). How would I do this?
Below is the output of db.email.findOne():
{
"_id" : ObjectId("55893983e4b0ea8af5a61550"),
"customer_id" : "12345",
"Subject" : "test message",
"Date" : ISODate("2016-08-04T10:48:13Z"),
"headers" : [
"Date: Tue, 23 Jun 2015 12:48:13 +0200 (CEST)",
"From: user#domain.tld",
"Message-ID: <240354118.javamail.email.server.tld>",
"Subject: Cats",
"Content-Type: text/plain; charset=us-ascii",
"Content-Transfer-Encoding: 7bit",
"To: undisclosed-recipients:;",
"X-ClamAV: clean"
],
"text" : "feed the cats please",
"attachments" : [ ],
"langprob" : 0.8301894511454121,
"original_message_file_id" : "239863489r7637208",
"account_id" : "xxx",
"received_time" : ISODate("2015-06-23T10:48:35.097Z"),
"direction" : "inbound",
"state" : "CLOSED",
"encryption_key_id" : null,
"size" : 1651,
"routing_type" : "PUSH",
"priority" : 1,
"closed_time" : ISODate("2015-07-10T21:02:53.409Z")
}
Can anyone please assist me in properly creating a query in JSON syntax to extract the data I need from MongoDB based on my predicates?
Thank you for any help!
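One way to approach this is sketched below as a small JavaScript helper that builds the aggregation pipeline. It rests on assumptions the sample document cannot confirm: the function name attachmentSizePipeline is made up for illustration, each element of "attachments" is assumed to carry a numeric "size" field (the sample array is empty, so the real field name may differ), and "today" is taken to mean the UTC day of the "Date" field (using "received_time" instead may fit better).

```javascript
// Hypothetical helper: builds a pipeline that sums attachment sizes
// for one customer over one UTC day.
// ASSUMPTION: every attachment subdocument has a numeric "size" field.
function attachmentSizePipeline(customerId, day) {
  // Midnight (UTC) at the start of the given day.
  var start = new Date(Date.UTC(day.getUTCFullYear(), day.getUTCMonth(), day.getUTCDate()));
  // Midnight (UTC) at the start of the following day.
  var end = new Date(start.getTime() + 24 * 60 * 60 * 1000);
  return [
    // Keep only this customer's emails dated within the day.
    { $match: { customer_id: customerId, Date: { $gte: start, $lt: end } } },
    // Emit one document per attachment (emails without attachments drop out).
    { $unwind: "$attachments" },
    // Add up the assumed per-attachment size field.
    { $group: { _id: "$customer_id", totalAttachmentSize: { $sum: "$attachments.size" } } }
  ];
}
```

The resulting array could then be passed to the shell, for example db.email.aggregate(attachmentSizePipeline("12345", new Date())). Placing $match first lets the server discard other customers and other days before unwinding.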
I'm using MongoDB 2.6.1
I have a collection that stores the emails, project-wise. The documents look as follows (I haven't included the 'Raw Email Text' key, for readability):
{
"_id" : ObjectId("540d4ae7eea013be22f1f0d6"),
"Project_Id" : "E11593",
"Project_Name" : "National Hearing Care- Novo",
"Email_Id" : "E11593.monitor#lntinfotech.com",
"Date" : "Mon Sep 08 05:05:35 IST 2014",
"To" : "manisha.bhopate#infostretch.com; ",
"From" : "Shubhangi Thorat",
"CC" : "NO VALUES",
"Subject" : "RE: pics",
"Unique_Id" : "Mon-Sep-08-11:51:20-IST-2014"
}
{
"_id" : ObjectId("540d4ae7eea013be22f1f0d7"),
"Project_Id" : "E11593",
"Project_Name" : "National Hearing Care- Novo",
"Email_Id" : "E11593.monitor#lntinfotech.com",
"Date" : "Mon Sep 08 05:02:38 IST 2014",
"To" : "manisha.bhopate#infostretch.com; ",
"From" : "Shubhangi Thorat",
"CC" : "NO VALUES",
"Subject" : "FW: pics",
"Unique_Id" : "Mon-Sep-08-11:51:20-IST-2014"
}
{
"_id" : ObjectId("540d4ae7eea013be22f1f0d8"),
"Project_Id" : "E11593",
"Project_Name" : "National Hearing Care- Novo",
"Email_Id" : "E11593.monitor#lntinfotech.com",
"Date" : "Mon Sep 08 04:37:47 IST 2014",
"To" : "Prachi Sutrawe; ",
"From" : "Mahindra Shambharkar",
"CC" : "NO VALUES",
"Subject" : "Accepted: Show and tell -Sale",
"Unique_Id" : "Mon-Sep-08-11:51:20-IST-2014"
}
I had the following thoughts in mind when selecting the shard key:
Build a compound index {Project_Id, _id}, since Project_Id has a low cardinality but _id has a high one
A hashed index on 'Date' / 'Unique_Id', which are both timestamps
A hashed index on the 'From' field, but its cardinality depends on the number of people involved in the project
'To' and 'CC' are multi-value keys and 'Subject' has high randomness, so I'm not sure whether these keys can be used at all
While not listed in the output, 'Raw_Text' will be read extensively by different applications, but I'm not sure whether an index should be built on this key, or whether it should be used for sharding at all!
What would be the optimal shard key in this case?
I have two collections, shown below; each product holds a reference to a user. I search for products by name, and in return I want the combined output of product and user using the map-reduce method.
user collection
{
"_id" : ObjectId("52ac5dd1fb670c2007000000"),
"company" : {
"about" : "This is textile machinery dealer",
"contactAddress" : [{
"address" : "abcd",
"city" : "52ac4bc6fb670c1007000000",
"zipcode" : "39as46as80"
},{
"address" : "abcd",
"city" : "52ac4bc6fb670c1007000000",
"zipcode" : "39as46as80"
}],
"fax" : "58784868",
"mainProducts" : "ads,asd,asd",
"mobileNumber" : "9537236588",
"name" : "krishna steels"
},
"user" : ObjectId("52ac4eb7fb670c0c07000000")
}
product collection
{
"_id" : ObjectId("52ac5722fb670cf806000002"),
"category" : "52a2a9cc48a508b80e00001d",
"deliveryTime" : "10 days after received the ",
"price" : {
"minPrice" : "2000",
"maxPrice" : "3000",
"perUnit" : "5288ac6f7c104203e0976851",
"currency" : "INR"
},
"productName" : "New Mobile Solar Charger with Carabiner",
"rejectReason" : "",
"status" : 1,
"user" : ObjectId("52ac4eb7fb670c0c07000000")
}
This cannot be done. MongoDB supports map-reduce on only one collection at a time. You could try fetching both collections and merging them in a Java collection. A couple of days ago I solved a similar problem using a Java collection.
See also a similar response about joins and multi-collection operations not being supported in MongoDB.
This can be done using two map-reduces.
You run your first MR and then use the reduce output mode to run the second MR onto the results of the first.
You shouldn't do this, though. JOINs are not designed to be done through MR; in fact, it sounds like you are trying to run this MR with inline output, which in itself is a very bad idea.
MRs are not designed to run inline with the application.
You would be better off doing the JOIN elsewhere.
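As a rough illustration of doing the join elsewhere, the sketch below merges the two result sets in application code after fetching them separately. The function name joinProductsWithUsers and the output keys "product" and "seller" are invented for this example; the only thing taken from the question is that both collections carry a "user" reference.

```javascript
// Hypothetical application-side join: pairs each product with the
// user-collection document that shares the same "user" reference.
function joinProductsWithUsers(products, users) {
  // Index the user documents by the string form of their "user" field.
  var byUser = {};
  users.forEach(function (u) {
    byUser[String(u.user)] = u;
  });
  // Attach the matching user document (or null if none) to every product.
  return products.map(function (p) {
    return { product: p, seller: byUser[String(p.user)] || null };
  });
}
```

In practice the products would come from a query on the product name, and the users from a second query restricted to the "user" values found in those products, so only two round trips are needed.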
I'm stuck on an aggregation problem in MongoDB. The data structure that I'm dealing with looks like this:
{
"_id" : ObjectId("4f16fe11d1e2d32371072aa0"),
"body" : " \nHi Kate, per our discussion on yesterday about the $15.00 flat fee on Tom's \nand Mark's deals, here is Bloomberg's response. Please pass this info to all \nof our traders. Please let me know what the response is from them.\n\nThanks\n\n\n---------------------- Forwarded by Evelyn Metoyer/Corp/Enron on 04/17/2001 \n02:34 PM ---------------------------\n\n\n\"PAUL CALLAHAN, BLOOMBERG/ NEW YORK\" <PCALLAHAN2#bloomberg.net> on 04/17/2001 \n02:28:57 PM\nTo: Evelyn.Metoyer#enron.com\ncc: \n\nSubject: Commission\n\n\nEvelyn, as of April 16, 2001 our charge for Spot trades is a flat fee of\n$15/trade.\n\n\n",
"filename" : "3272.",
"headers" : {
"Content-Transfer-Encoding" : "7bit",
"Content-Type" : "text/plain; charset=us-ascii",
"Date" : ISODate("2001-04-17T14:33:00Z"),
"From" : "evelyn.metoyer#enron.com",
"Message-ID" : "<33504483.1075841847839.JavaMail.evans#thyme>",
"Mime-Version" : "1.0",
"Subject" : "Commission for Bloomberg",
"To" : [
"kate.symes#enron.com"
],
"X-FileName" : "kate symes 6-27-02.nsf",
"X-Folder" : "\\kate symes 6-27-02\\Notes Folders\\Discussion threads",
"X-From" : "Evelyn Metoyer",
"X-Origin" : "SYMES-K",
"X-To" : "Kate Symes",
"X-bcc" : "",
"X-cc" : ""
},
"mailbox" : "symes-k",
"subFolder" : "discussion_threads"
}
There are 120477 records in the database. I'm supposed to find the pairs of people who communicate the most (and second most) with each other. The query that I've written is as follows:
db.messages.aggregate([{$project:{From:"$headers.From",To:"$headers.To",_id:0}
},{$unwind:"$headers.To"},{$group:{_id:{From:"$From",To:"$To"},number:{$sum:1}}}
,{$limit:3},{$sort:{number:-1}}]);
but it somehow does not work.
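Two things in the pipeline above look like probable causes, offered here as a sketch rather than a verified answer. After $project the recipients array is named "To", so $unwind has to reference "$To" ("$headers.To" no longer exists at that point, and unwinding a missing field yields no documents). Also, $limit placed before $sort cuts the result down to three arbitrary groups before they are ordered. A corrected pipeline might look like this:

```javascript
// Sketch of a corrected pipeline; field names follow the sample document.
var pipeline = [
  // Flatten the nested header fields to top-level From / To.
  { $project: { _id: 0, From: "$headers.From", To: "$headers.To" } },
  // One document per recipient; uses the projected name "To".
  { $unwind: "$To" },
  // Count messages for each (sender, recipient) combination.
  { $group: { _id: { From: "$From", To: "$To" }, number: { $sum: 1 } } },
  // Order by count first ...
  { $sort: { number: -1 } },
  // ... and only then keep the top two.
  { $limit: 2 }
];
// In the mongo shell: db.messages.aggregate(pipeline)
```

Note that this counts directed pairs, so A writing to B and B writing to A are separate groups; if the two directions should be combined, the sender and recipient would need to be normalised into a common order before grouping.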
My first adventure into Mongo. Please save me some time by answering the following. This is the schema.
"_id" : 1,
"FullName" : "Full Name",
"Email" : "email#email.com",
"FacebookId" : NumberLong(0),
"LastModified" : ISODate("2012-04-11T09:26:10.955Z"),
"Connections" : [{
"_id" : 7,
"FullName" : "Fuller name",
"Email" : "connections#email.com",
"FacebookId" : NumberLong(0),
"LastModified" : ISODate("0001-01-01T00:00:00Z")
},
....
Given the id of a single top-level user, I'd like to return all of the Emails in the Connections array, and preferably just the emails. What's the query string? Much obliged!
You can't get only the values from sub-objects with a simple query in MongoDB.
If you run a query like this:
db.test.find({"_id": 1}, {"Connections.Email":1});
you will get this kind of response:
{
"_id": 1,
"Connections" : [ {"Email":"connections#email.com"},
{"Email":"foo#example.com"} ]
}
This is the closest you can get with a simple query and field selection in MongoDB.
You can then pull out the e-mail values in your code with a simple foreach.
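The client-side step could look roughly like the following JavaScript sketch. The function name connectionEmails is made up for this example; it takes a document shaped like the response above and returns a plain array of e-mail strings.

```javascript
// Hypothetical helper: collects the Email value of every connection
// in a document returned by the projection query.
function connectionEmails(doc) {
  var emails = [];
  // Tolerate a missing document or a missing Connections array.
  var connections = (doc && doc.Connections) || [];
  connections.forEach(function (c) {
    // Skip connections that have no Email field.
    if (c.Email !== undefined) {
      emails.push(c.Email);
    }
  });
  return emails;
}
```

With the mongo shell this might be used as connectionEmails(db.test.findOne({"_id": 1}, {"Connections.Email": 1})).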
I have this collection:
{
"_id" : ObjectId("4f3176d21a8b87fcf14658a6"),
"quiosco_id" : "11111111 ",
"transacciones" : [{
"transaccion_uuid" : "60be5247-6a38-4da2-b7b3-ea1dfaf0293b",
"machine_uuid" : "11111111 ",
"audit" : "146018",
"mti" : "1810",
"direction" : "1",
"monto" : 1.1499999761581421,
"fecha" : "07/02/2012 02:39:14 PM",
"data1" : "181052200000028000001111111111111000000000115"
}, {
"transaccion_uuid" : "adcbda16-dda7-4887-9295-2e47df7520e2",
"machine_uuid" : "11111111 ",
"audit" : "146018",
"mti" : "1810",
"direction" : "2",
"monto" : 1.1499999761581421,
"fecha" : "07/02/2012 02:39:14 PM",
"data1" : "181052200000008000001111111111111000000000115"
}]
}
I need only one document with a specific transaccion_uuid.
A MongoDB query always returns the root document, so you can't load only the embedded document.
If you need the root document that contains the transaction with a specific id, you can get it easily via dot notation:
db.items.find({"transacciones.transaccion_uuid":
"adcbda16-dda7-4887-9295-2e47df7520e2"})
If you need just one transaction from the embedded array, you need to find it manually within your driver code.
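A minimal sketch of that manual lookup, in JavaScript, is shown below. The function name findTransaction is invented for this example; it expects the root document returned by the query above.

```javascript
// Hypothetical helper: returns the embedded transaction with the given
// transaccion_uuid from a root document, or null if there is none.
function findTransaction(doc, uuid) {
  // Tolerate a missing document or a missing transacciones array.
  var list = (doc && doc.transacciones) || [];
  for (var i = 0; i < list.length; i++) {
    if (list[i].transaccion_uuid === uuid) {
      return list[i];
    }
  }
  return null;
}
```

In the shell this could be combined with the query, for instance by passing the result of db.items.findOne({"transacciones.transaccion_uuid": id}) together with the same id.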