MongoDB: Find an element in a document that contains an object with a subobject with an array - mongodb

I'm getting more and more puzzled discovering how overcomplicated and badly designed MongoDB's query syntax is. Anyway, I have this kind of document in a db with thousands of records:
db.messages.aggregate([{$limit: 1}]).pretty()
{
"_id" : ObjectId("4f16fc97d1e2d32371003f42"),
"body" : "Hey Gillette,\n\nThe heat rate is going to depend on the type of fuel and the construction \ndate of the unit. Unfortunately, most of that info is proprietary. \n\nChris Gaskill is the head of our fundamentals group and he might be able to \nsupply you with some of the guidelines.\n\n-Bass\n\n\n \n\tEnron North America Corp.\n\t\n\tFrom: Lisa Gillette 04/05/2001 02:31 PM\n\t\n\nTo: Eric Bass/HOU/ECT#ECT\ncc: \nSubject: Power Generation Question\n\nHey Bass,\n\nI have a question and I am hoping you can help me. I am wanting to compile a \nlist of all the different types of power plants and their respective heat \nrates to determine some sort of generation ratio.\n\ni.e. Coal 4 mmbtu = 1 MW\n Simple Cycle 11 mmbtu = 1 MW\n\nPlease let me know if you can help me or point me to someone who can. Just \nFYI...Bryan suggested that I call you so blame him as you curse me under your \nbreath right now.\n\nThanks,\nLisa\n\n",
"filename" : "1045.",
"headers" : {
"Content-Transfer-Encoding" : "7bit",
"Content-Type" : "text/plain; charset=us-ascii",
"Date" : ISODate("2001-04-05T14:45:00Z"),
"From" : "eric.bass#enron.com",
"Message-ID" : "<2106897.1075854772243.JavaMail.evans#thyme>",
"Mime-Version" : "1.0",
"Subject" : "Re: Power Generation Question",
"To" : [
"lisa.gillette#enron.com"
],
"X-FileName" : "ebass.nsf",
"X-Folder" : "\\Eric_Bass_Jun2001\\Notes Folders\\Sent",
"X-From" : "Eric Bass",
"X-Origin" : "Bass-E",
"X-To" : "Lisa Gillette",
"X-bcc" : "",
"X-cc" : ""
},
"mailbox" : "bass-e",
"subFolder" : "sent"
}
And I need to find records from address X to address Y.
I managed to catch the "From" records with
db.messages.find({"headers.From": "eric.bass#enron.com"}).pretty().count()
But I can't get the "To" records (and I need to get both together).
To query the "To" field I've tried:
db.messages.find({headers: {$elemMatch :{ "To": "lisa.gillette#enron.com"}}})
But it returns nothing
What am I missing?
Thanks

$elemMatch needs an array field and a matching operator. In your query it is applied to headers, which is an embedded document rather than an array, so nothing matches. In your case it should be:
db.messages.find({"headers.To": {$elemMatch :{$eq:"lisa.gillette#enron.com"}}})
$elemMatch is most useful when several conditions must hold for the same array element. With only a single condition you don't need $elemMatch; a plain equality match on the array field is enough:
db.messages.find({"headers.To": "lisa.gillette#enron.com"});
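Since the question asks for both conditions together, the two equality matches can be put in one filter document. The following is a minimal sketch; the helper name fromToFilter is made up for illustration and is not part of MongoDB.

```javascript
// Hypothetical helper: build a filter for messages sent from `from` to `to`.
// Dot notation reaches into the embedded "headers" document, and an equality
// condition on an array field matches when any element equals the value.
function fromToFilter(from, to) {
  return { "headers.From": from, "headers.To": to };
}

const filter = fromToFilter("eric.bass#enron.com", "lisa.gillette#enron.com");
// In the mongo shell the filter would be used as:
//   db.messages.find(filter).count()
```

Conditions listed in the same filter document are combined with an implicit AND, so no explicit $and is needed here.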

Related

How can I form a Mongo best-match query for the case below to achieve good performance

For a collection as below:
Document 1
{
"entity" : "university",
"parEnityHRCHY" : "Planet>continent>country>state>city",
"parEnityVal" : "earth>North America>Massachusetts>Boston",
"entityVal" : [
"MIT",
"Harvard",
"New England"
]
}
Document 2
{
"entity" : "university",
"parEnityHRCHY" : "Planet>continent>country>state",
"parEnityVal" : "earth>North America>Massachusetts",
"entityVal" : [
"A",
"B",
"C"
]
}
I want to fetch the best-matching "entityVal" for the inputs "entity", "parEnityHRCHY" and "parEnityVal".
If no value is available at the exact match, it should look upward, level by level, until the root.
For example, in the case above, if "university" values are not available at the city level, it should look at the state level:
If the condition below matches exactly, return the result.
Input:
"parEnityHRCHY" : "Planet>continent>country>state>city",
"parEnityVal" : "earth>North America>Massachusetts>Boston",
else look at one level up
"parEnityHRCHY" : "Planet>continent>country>state",
"parEnityVal" : "earth>North America>Massachusetts",
and so on until the root element.
Please suggest an approach. I am planning to use $text search; the collection holds at most about 1 million documents, and the hierarchy has at most 10 levels.
You can try something like this:
db.doc.find({"parEnityHRCHY" : "Planet>continent>country>state",'parEnityVal':{$regex:"earth>North America>Massachusetts"}})
But I am not sure if this is what you're looking for.
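The fallback from the most specific level toward the root can also be done with exact matches only, by generating the candidate pairs on the client. This is a rough sketch under the assumption that both strings use ">" as separator; candidateLevels is a hypothetical helper name.

```javascript
// Hypothetical helper: list the (hierarchy, value) pairs to try, from the
// most specific level toward the root, dropping one trailing segment from
// each string per step.
function candidateLevels(hierarchy, values, sep) {
  sep = sep || ">";
  const h = hierarchy.split(sep);
  const v = values.split(sep);
  const out = [];
  while (h.length > 0 && v.length > 0) {
    out.push({ parEnityHRCHY: h.join(sep), parEnityVal: v.join(sep) });
    h.pop();
    v.pop();
  }
  return out;
}

const levels = candidateLevels(
  "Planet>continent>country>state>city",
  "earth>North America>Massachusetts>Boston"
);
// In the mongo shell, try each level in order and stop at the first hit:
//   for (const level of levels) {
//     const doc = db.doc.findOne(Object.assign({ entity: "university" }, level));
//     if (doc) { printjson(doc.entityVal); break; }
//   }
```

Because every lookup is an equality match, a compound index on entity, parEnityHRCHY and parEnityVal could serve each attempt, and with at most 10 levels that is at most 10 indexed lookups per request.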

Mongodb - combine data from two collections

I know this has been covered quite a lot on here; however, I'm very new to MongoDB and am struggling to apply the answers I've found to my situation.
In short, I have two collections: 'total_by_country_and_isrc', which is the output of a MapReduce function, and 'asset_report', which contains an asset_id that is present neither in 'total_by_country_and_isrc' nor in the original raw data collection it was MapReduced from.
An example of the data in 'total_by_country_and_isrc' is:
{ "_id" : { "custom_id" : 4748532, "isrc" : "GBCEJ0100080",
"country" : "AE" }, "value" : 0 }
And an example of the data in the 'asset_report' is:
{ "_id" : ObjectId("51824ef016f3edbb14ef5eae"), "Asset ID" :
"A836656134476364", "Asset Type" : "Web", "Metadata Origination" :
"Unknown", "Custom ID" : "4748532", "ISRC" : "", }
I'd like to end up with the following ('total_by_country_and_isrc_with_asset_id'):
{ "_id" : { "Asset ID" : "A836656134476364", "custom_id" : 4748532,
"isrc" : "GBCEJ0100080", "country" : "AE" }, "value" : 0 }
I know how I would approach this in a relational database, but I really want to get this working in Mongo, as I'm dealing with some pretty large collections and feel Mongo is the right tool for the job.
Can anyone offer some guidance here?
I think you want to use the "reduce" output action: Output to a Collection with an Action. You'll need to regenerate total_by_country_and_isrc, because it doesn't look like asset_report has the fields it needs to generate the keys you already have in total_by_country_and_isrc – so "joining" the data is impossible.
First, write a map method that is capable of generating the same keys from the original collection (used to generate total_by_country_and_isrc) and also from the asset_report collection. Think of these keys as the "join" fields.
Next, map and reduce your original collection to create total_by_country_and_isrc with the correct keys.
Finally, map asset_report with the same method you used to generate total_by_country_and_isrc, but use a reduce function that can be used to reduce the intersection (by key) of this mapped data from asset_report and the data in total_by_country_and_isrc.

MongoDB - How to find equals in collection and in embedded document

Gurus - I'm stuck: I can't figure out how to query the following collection, where each document has an embedded document "spouse", and check whether the spouse's "surname" equals the "surname" of the document itself:
{
"_id" : ObjectId("50bd2bb4fcfc6066b7ef090d"),
"name" : "Gwendolyn",
"surname" : "Davis",
"birthyear" : 1978,
"spouse" : {
"name" : "Dennis",
"surname" : "Evans",
"birthyear" : 1969
},
I need to query:
Output data for all spouses with the same surnames (if the surname of
one of the spouses is not specified, assume that it coincides with the
name of another)
I tried something like this:
db.task.find( {"surname" : { "spouse.surname" : 1 }} )
but it failed.
Please guide me on how I can achieve this; an example or sample would be really helpful :-)
Thanks a lot!
You have three options.
1. Use the $where operator:
db.task.find({$where: 'this.spouse.surname === this.surname'})
2. Update all your documents to add a special flag. After that you will be able to query documents by this flag. It's faster than $where, but requires altering your data.
3. Use MapReduce. It's quite complicated, but it allows you to do nearly anything.
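The question also asks that a missing surname be treated as matching. A possible way to express that rule is sketched below; sameSurname is a hypothetical helper, and treating a document without a spouse as a non-match is an assumption, not something the question states.

```javascript
// Hypothetical helper: decides whether a document counts as "same surname".
function sameSurname(doc) {
  // Assumption: a document without a spouse sub-document is not a couple.
  if (!doc.spouse) return false;
  const own = doc.surname;
  const other = doc.spouse.surname;
  // A missing surname on either side is assumed to coincide with the other.
  if (own == null || other == null) return true;
  return own === other;
}

// In the mongo shell the same logic could be passed to $where, where `this`
// is the current document:
//   db.task.find({ $where: function () {
//     if (!this.spouse) return false;
//     if (this.surname == null || this.spouse.surname == null) return true;
//     return this.surname === this.spouse.surname;
//   } })

const sample = {
  name: "Gwendolyn", surname: "Davis",
  spouse: { name: "Dennis", surname: "Evans" }
};
console.log(sameSurname(sample)); // false
```

The guard on this.spouse also avoids an error for documents that have no spouse field at all, which the one-line string version does not handle.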

Get the latest record from mongodb collection

I want to know the most recent record in a collection. How to do that?
Note: I know the following command line queries work:
1. db.test.find().sort({"idate":-1}).limit(1).forEach(printjson);
2. db.test.find().skip(db.test.count()-1).forEach(printjson)
where idate has the timestamp added.
The problem is that the longer the collection is, the longer it takes to get the data back, and my 'test' collection is really, really huge. I need a query with constant time response.
If there is any better mongodb command line query, do let me know.
This is a rehash of the previous answer but it's more likely to work on different mongodb versions.
db.collection.find().limit(1).sort({$natural:-1})
This will give you the last document of a collection:
db.collectionName.findOne({}, {sort:{$natural:-1}})
$natural:-1 means the reverse of natural (on-disk) order, which usually, but not always, matches the order in which records were inserted.
Edit: For all the downvoters, above is a Mongoose syntax,
mongo CLI syntax is: db.collectionName.find({}).sort({$natural:-1}).limit(1)
Yet another way of getting the last item from a MongoDB Collection (don't mind about the examples):
> db.collection.find().sort({'_id':-1}).limit(1)
Normal Projection
> db.Sports.find()
{ "_id" : ObjectId("5bfb5f82dea65504b456ab12"), "Type" : "NFL", "Head" : "Patriots Won SuperBowl 2017", "Body" : "Again, the Pats won the Super Bowl." }
{ "_id" : ObjectId("5bfb6011dea65504b456ab13"), "Type" : "World Cup 2018", "Head" : "Brazil Qualified for Round of 16", "Body" : "The Brazilians are happy today, due to the qualification of the Brazilian Team for the Round of 16 for the World Cup 2018." }
{ "_id" : ObjectId("5bfb60b1dea65504b456ab14"), "Type" : "F1", "Head" : "Ferrari Lost Championship", "Body" : "By two positions, Ferrari loses the F1 Championship, leaving the Italians in tears." }
Sorted Projection ( _id: reverse order )
> db.Sports.find().sort({'_id':-1})
{ "_id" : ObjectId("5bfb60b1dea65504b456ab14"), "Type" : "F1", "Head" : "Ferrari Lost Championship", "Body" : "By two positions, Ferrari loses the F1 Championship, leaving the Italians in tears." }
{ "_id" : ObjectId("5bfb6011dea65504b456ab13"), "Type" : "World Cup 2018", "Head" : "Brazil Qualified for Round of 16", "Body" : "The Brazilians are happy today, due to the qualification of the Brazilian Team for the Round of 16 for the World Cup 2018." }
{ "_id" : ObjectId("5bfb5f82dea65504b456ab12"), "Type" : "NFL", "Head" : "Patriots Won SuperBowl 2017", "Body" : "Again, the Pats won the Super Bowl." }
sort({'_id':-1}) returns all documents in descending order of their _ids.
Sorted Projection ( _id: reverse order ): getting the latest (last) document from a collection.
> db.Sports.find().sort({'_id':-1}).limit(1)
{ "_id" : ObjectId("5bfb60b1dea65504b456ab14"), "Type" : "F1", "Head" : "Ferrari Lost Championship", "Body" : "By two positions, Ferrari loses the F1 Championship, leaving the Italians in tears." }
"I need a query with constant time response"
By default, the indexes in MongoDB are B-trees. Searching a B-tree is an O(log N) operation, so even find({_id:...}) will not provide constant-time, O(1) responses.
That said, you can also sort by the _id if you are using ObjectId for your IDs. Of course, even that is only good to the last second.
You may have to resort to "writing twice": write once to the main collection and write again to a "last updated" collection. Without transactions this will not be perfect, but with only one item in the "last updated" collection it will always be fast.
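The "writing twice" idea might look roughly like the sketch below. The helper name insertAndTrackLatest, the side collection name test_latest and the fixed _id "latest" are all assumptions made for illustration; insertOne and replaceOne with upsert are the shell's collection methods in MongoDB 3.2 and later.

```javascript
// Hypothetical helper: `main` and `latest` are collection handles such as
// db.test and db.test_latest (the second name is an assumption).
function insertAndTrackLatest(main, latest, doc) {
  // First write: the regular insert into the main collection.
  main.insertOne(doc);
  // Second write: keep exactly one document in the side collection by always
  // overwriting the same fixed _id.
  latest.replaceOne(
    { _id: "latest" },
    { _id: "latest", doc: doc },
    { upsert: true }
  );
}

// Reading the most recent record is then a single lookup by _id:
//   db.test_latest.findOne({ _id: "latest" }).doc
```

The two writes are not atomic, so a crash between them can leave the side collection one record behind, which is the imperfection the answer mentions.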
php7.1 mongoDB:
$data = $collection->findOne([],['sort' => ['_id' => -1],'projection' => ['_id' => 1]]);
My Solution :
db.collection("name of collection").find({}, {limit: 1}).sort({$natural: -1})
If you are using auto-generated Mongo ObjectIds in your documents, each one contains a timestamp in its first 4 bytes, which can be used to find the latest document inserted into the collection. I understand this is an old question, but this is one more alternative for anyone still ending up here.
db.collectionName.aggregate(
[{$group: {_id: null, latestDocId: { $max: "$_id"}}}, {$project: {_id: 0, latestDocId: 1}}])
The above query gives the _id of the latest document inserted into the collection.
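To illustrate why sorting or taking the maximum of _id approximates insertion time, here is a small sketch that decodes the timestamp from an ObjectId's hex string. The function name objectIdToDate is made up; in the mongo shell, ObjectId values offer getTimestamp() for the same purpose.

```javascript
// Extract the creation time embedded in an ObjectId's 24-character hex string.
function objectIdToDate(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected a 24-character hex string");
  }
  // The first 4 bytes (8 hex characters) are seconds since the Unix epoch.
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

// Two _id values taken from the Sports example above.
const older = objectIdToDate("5bfb5f82dea65504b456ab12");
const newer = objectIdToDate("5bfb60b1dea65504b456ab14");
console.log(newer > older); // true
```

Because the timestamp has one-second resolution, two documents created within the same second are ordered by the remaining bytes of the ObjectId rather than strictly by insertion time.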
This is how to get the last record for each value of x from the "foo" collection (change foo, x, y, etc. to suit):
db.foo.aggregate([{$sort:{ x : 1, date : 1 } },{$group: { _id: "$x" ,y: {$last:"$y"},yz: {$last:"$yz"},date: { $last : "$date" }}} ],{ allowDiskUse:true })
You can add fields to, or remove them from, the $group stage.
Help articles: https://docs.mongodb.com/manual/reference/operator/aggregation/group/#pipe._S_group
https://docs.mongodb.com/manual/reference/operator/aggregation/last/
Mongo CLI syntax:
db.collectionName.find({}).sort({$natural:-1}).limit(1)
Let Mongo create the ID: an auto-generated ObjectId begins with a creation timestamp, so sorting by _id roughly follows insertion order.
pymongo:
self._collection.find().sort("_id",-1).limit(1)

In MongoDB, how does one get the value in a field for an embedded document, but query based on a different value

I have a basic structure like this:
> db.users.findOne()
{
"_id" : ObjectId("4f384903cd087c6f720066d7"),
"current_sign_in_at" : ISODate("2012-02-12T23:19:31Z"),
"current_sign_in_ip" : "127.0.0.1",
"email" : "something#gmail.com",
"encrypted_password" : "$2a$10$fu9B3M/.Gmi8qe7pXtVCPu94mBVC.gn5DzmQXH.g5snHT4AJSZYCu",
"last_sign_in_at" : ISODate("2012-02-12T23:19:31Z"),
"last_sign_in_ip" : "127.0.0.1",
"name" : "Trip Jameson",
"sign_in_count" : 100,
"usertimes" : [
...thousands and thousands of records like this one....
{
"enddate" : 348268392.115282,
"idle" : 0,
"startdate" : 348268382.116728,
"title" : "My Awesome Title"
},
]
}
So I want to find only the usertimes for a single user where the title was "My Awesome Title", and then see what the value of "idle" was in those records.
So far all I can figure out is that I can find the entire user record with a search like:
> db.users.find({'usertimes.title':"My Awesome Title"})
This just returns the entire User record though, which is useless for my purposes. Am I misunderstanding something?
Returning only part of an embedded document is currently not supported by MongoDB.
The whole matching User record will always be returned (at least with the current MongoDB version).
See this question for a similar case:
Filtering embedded documents in MongoDB
This is the corresponding Jira ticket:
http://jira.mongodb.org/browse/SERVER-142
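Use a projection on the array. Note that {'idle': 1} would look for a top-level field named idle, which does not exist here; the positional operator (MongoDB 2.2 and later) returns the first matching usertimes element instead:
db.users.find({'usertimes.title': "My Awesome Title"}, {'usertimes.$': 1});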
May I suggest you take a more detailed look at http://www.mongodb.org/display/DOCS/Querying, it'll explain things for you.
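If the whole user document does come back, the matching entries can still be picked out on the client. The sketch below assumes the document shape from the question; idleForTitle is a hypothetical helper, and the second usertimes entry is made up for the example.

```javascript
// Hypothetical helper: collect the idle values of all usertimes entries with
// the given title from a user document that has already been fetched.
function idleForTitle(user, title) {
  return (user.usertimes || [])
    .filter(function (t) { return t.title === title; })
    .map(function (t) { return t.idle; });
}

const user = {
  name: "Trip Jameson",
  usertimes: [
    { enddate: 348268392.115282, idle: 0, startdate: 348268382.116728, title: "My Awesome Title" },
    { enddate: 2, idle: 5, startdate: 1, title: "Another Title" } // made-up entry
  ]
};
console.log(idleForTitle(user, "My Awesome Title")); // logs an array containing only 0

// In the mongo shell:
//   db.users.find({ 'usertimes.title': "My Awesome Title" }).forEach(function (u) {
//     printjson(idleForTitle(u, "My Awesome Title"));
//   });
```

With thousands of entries per user this transfers far more data than needed, so on server versions that have the aggregation framework, a pipeline using $unwind on usertimes followed by $match on usertimes.title is likely the better fit.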