mongodb empty is the wildcard but only on top level? - mongodb

Apologies to those who do not like Python; my samples use pymongo.
In MongoDB, a field left out of the query seems to act as a "wildcard", as shown here (with the "wildcard" I get the same or more results):
>>> projects.find_one({'name': 'boodo3', '_id': ObjectId('4efa14c6b0ab584326000000')})
{u'_id': ObjectId('4efa14c6b0ab584326000000'), u'name': u'boodo3', u'updated_at': u'2011-12-28 13:47:09'}
>>> projects.find_one({'name': 'boodo3'})
{u'_id': ObjectId('4efa14c6b0ab584326000000'), u'name': u'boodo3', u'updated_at': u'2011-12-28 13:47:09'}
But when I query "inside" the document, the wildcard no longer works, as in the following sample:
>>> testruns.find_one({'_parent': {'_coll': u'projects', '_id': '4efa14c6b0ab584326000000'}})
{u'_parent': {u'_coll': u'projects', u'_id': u'4efa14c6b0ab584326000000'}, u'_id': ObjectId('4efa167eb0ab584351000002'), u'name': u'11121617', u'updated_at': u'2011-12-27 19:03:26'}
>>> testruns.find_one({'_parent': {'_coll': u'projects'}})
>>> <no results here>
I tried variations using '$in' and '$nin', but so far without luck. I hope there is some structured way of querying documents (I mean something besides regex). I would restructure my collections if necessary, but I believe that flat documents are not the solution here.
Does that mean I have to translate all the queries into dot notation?
Or is regex the way to query documents in MongoDB?
Disclaimer: I do not intend to criticize MongoDB in any way. This is my first app that uses MongoDB and I want to learn how to query documents in it.

This query:
testruns.find_one({'_parent': {'_coll': u'projects'}})
looks for documents where _parent is exactly equal to {'_coll': u'projects'}. Nothing matches because the embedded document has other keys in it as well, so the equality test is not satisfied; there is no notion of a "wildcard" here. Use dot notation instead:
testruns.find_one({'_parent._coll': u'projects'})
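If many nested queries need to be translated, the conversion to dot notation can be automated. The following is a minimal sketch, not part of PyMongo: to_dot_notation is a hypothetical helper name, and it assumes that any dict whose keys start with "$" is an operator expression that should be left alone.

```python
def to_dot_notation(query, prefix=""):
    """Flatten nested sub-document criteria into dot-notation keys.

    Hypothetical helper for illustration; it is not a PyMongo function.
    """
    flat = {}
    for key, value in query.items():
        path = f"{prefix}.{key}" if prefix else key
        # Only recurse into plain sub-documents; operator dicts such as
        # {"$in": [...]} and empty dicts are kept as values.
        is_plain_subdoc = (
            isinstance(value, dict)
            and bool(value)
            and not any(str(k).startswith("$") for k in value)
        )
        if is_plain_subdoc:
            flat.update(to_dot_notation(value, path))
        else:
            flat[path] = value
    return flat


nested = {"_parent": {"_coll": "projects"}}
print(to_dot_notation(nested))
```

The flattened filter matches on each listed field individually, so extra keys in the embedded document no longer prevent a match. It could then be passed to find_one() in place of the nested form.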

Please update the title, as the question as asked rests on an incorrect premise.
A better title would be "How to search nested document criteria in mongodb", which is a duplicate of:
How to correctly query a MongoDB nested document with python?
This is also answered in the mongodb docs here:
https://docs.mongodb.com/manual/tutorial/query-embedded-documents/
Under "Specify Equality Match on a Nested Field"

Related

MongoDB $regex with $in clause

I need a mongodb query something like
db.getCollection("xyz").find({"_id" : {$regex : {$in : [xxxx/*]}}})
My Use case is -- I have a list of Strings such as
[xyz/12/poi, abc/98/mnb, ytn/65/tdx, ...]
The ids that are there in the collection(test) are something like
xyz/12/poi/2019061304.
I will get the values like xyz/12/poi from the input list, the other part of the id being yyyymmddhh format.
So, I need to go to the collection and find all the documents matching the input list with the ID of the documents in the test collection.
I can retrieve the documents individually but that does not seem to be a feasible option as the size of the input list is more than 10000.
Can you suggest a more feasible solution? Thanks in advance.
I tried using $in with $regex, but it seems MongoDB does not support that. I have also tried pattern matching, but even that is not feasible for me. Can you please suggest an alternative to using $in with $regex in MongoDB?
The expected result could be an aggregate query or a normal query, so that we hit the database only once and get the desired output rather than hitting the database 10000-odd times.
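One possible approach, sketched here in Python with PyMongo in mind: $in does not accept $regex operator expressions, but it does accept regular expression objects as list elements, and PyMongo encodes patterns created with re.compile as BSON regular expressions. The helper name prefix_filter is hypothetical, and the sketch assumes every _id is a string of the form prefix/yyyymmddhh.

```python
import re


def prefix_filter(prefixes):
    """Build one filter matching any _id that starts with '<prefix>/'.

    Hypothetical helper for illustration; the result is a plain dict.
    """
    # Anchor each pattern with ^ so the server can use the _id index,
    # and escape the prefix so characters like '.' are taken literally.
    patterns = [re.compile("^" + re.escape(p) + "/") for p in prefixes]
    return {"_id": {"$in": patterns}}


query = prefix_filter(["xyz/12/poi", "abc/98/mnb", "ytn/65/tdx"])
print(len(query["_id"]["$in"]))
```

The resulting dict could be passed to a single find() call on the collection. With more than 10000 prefixes it may be worth splitting the list into a few batches; whether that helps would need to be measured.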

How to use query commands in MongoDB?

I am new to MongoDB and started learning it only recently. When I run a query command, for instance db.tests.find({"by":"Srihari"}), it does not give any output. Is there anything wrong with my query? Please help!
From the screenshot you've shared, the following documents exist in your tests collection:
{"username": "srihari"}
{"username": "srih"}
{"username": "srh"}
{"username": "sh"}
The query you're sending to mongodb is :
db.tests.find({"by":"Srihari"})
There isn't any document in the tests collection that matches your query.
However, a query like this:
db.tests.find({"username": "sh"})
will definitely return a result.
In MongoDB you specify equality conditions, using <field>:<value> expressions in the query filter. So db.tests.find({"by":"Srihari"}) is looking for all documents where the field "by" has the value "Srihari".
Since your document has the format
{
username: "srihari"
}
your query should be:
db.tests.find({username: "srihari"})
You can see more examples here: https://docs.mongodb.com/manual/tutorial/query-documents/
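To make the equality rule concrete, here is a toy model of it in plain Python. It is only a sketch: matches is a hypothetical function, not MongoDB's matcher, and it ignores arrays, null handling, and operators. It does show that both the field name and the exact, case-sensitive value have to agree.

```python
def matches(doc, query):
    """Toy equality match: every queried field must be present in doc
    with exactly the queried value (hypothetical, for illustration)."""
    return all(
        field in doc and doc[field] == value
        for field, value in query.items()
    )


docs = [
    {"username": "srihari"},
    {"username": "srih"},
    {"username": "srh"},
    {"username": "sh"},
]

# No document has a "by" field, so nothing is selected.
print([d for d in docs if matches(d, {"by": "Srihari"})])
# The field name and the lowercase value both agree here.
print([d for d in docs if matches(d, {"username": "srihari"})])
```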

Pymongo: iterate over all documents in the collection

I am using PyMongo and trying to iterate over (10 millions) documents in my MongoDB collection and just extract a couple of keys: "name" and "address", then output them to .csv file.
I cannot figure out the right syntax to do it with find().forEach()
I was trying workarounds like
cursor = db.myCollection.find({"name": {$regex: REGEX}})
where REGEX would match everything - and it resulted in "Killed".
I also tried
cursor = db.myCollection.find({"name": {"$exist": True}})
but that did not work either.
Any suggestions?
I cannot figure out the right syntax to do it with find().forEach()
cursor.forEach() is not available for Python, it's a JavaScript function. You would have to get a cursor and iterate over it. See PyMongo Tutorial: querying for more than one document, where you can do :
for document in myCollection.find():
    print(document)  # iterate the cursor
where REGEX would match everything - and it resulted in "Killed".
Unfortunately there is not enough information here to debug what 'Killed' means or why it happened. If you would like to match everything, you can write:
cursor = db.myCollection.find({"name": {"$regex": ".*"}})
This assumes that the field name contains string values. Using $exists to check whether the field name exists would be preferable to using a regex, though.
The use of the $exists operator in your example above is incorrect: you're missing an s in $exists. Again, we unfortunately don't have enough information on what 'did not work' meant to help debug further.
If you're writing this script as a Python exercise, I would recommend reviewing:
PyMongo Tutorial
MongoDB Tutorial: query documents
You could also enrol in a free online course at MongoDB University for M220P: MongoDB for Python Developers.
However, if you are just trying to export a CSV file from a collection, you could use MongoDB's mongoexport as an alternative. It supports:
Exporting specific fields via --fields "name,address"
Exporting in CSV via --type "csv"
Exporting specific values with query via --query "..."
See mongoexport usage for more information.
I had no luck with .find().forEach() either, but this should find what you are searching for and then print it.
First find all documents that match what you are searching for:
cursor = db.myCollection.find({"name": {"$regex": REGEX}})
then iterate over the matches:
for document in cursor:
    print(document.get("name"))
The find() method returns a PyMongo cursor, which is a reference to the result set of a query.
You have to de-reference that reference somehow, for example by iterating over it.
After that, you will have a better understanding of how to manipulate and manage the cursor.
Try the following for a start:
result = db.collection_name.find()
print(list(result))
I think I understand the question, but I don't believe there is an accurate answer yet. I had the same challenge, which is how I came across this, although I don't know how to output to a .csv file. In my situation I needed the result as JSON. Here's my solution to your question using MongoDB projections:
your_collection = db.myCollection
cursor = list(your_collection.find( { }, {"name": 1, "address": 1}))
This second line returns the result as a list using the python list() function.
And then you can use jsonify(cursor) or just print(cursor) as a list.
I believe that with the list it should be easier to figure out how to output to a .csv.
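For the .csv step that the answers above leave open, the following is a minimal sketch using Python's standard csv module. The helper name write_csv is hypothetical. It takes any iterable of dicts, so a PyMongo cursor can be streamed through it without first building a list, which matters with 10 million documents.

```python
import csv
import io


def write_csv(docs, fieldnames, out):
    """Write the chosen fields of each document to an open text file.

    Hypothetical helper; `docs` can be a list or a PyMongo cursor.
    Returns the number of rows written.
    """
    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames,
        extrasaction="ignore",  # skip keys such as _id that are not exported
        restval="",             # empty cell when a document lacks a field
    )
    writer.writeheader()
    count = 0
    for doc in docs:  # documents are consumed one at a time
        writer.writerow(doc)
        count += 1
    return count


# In-memory demonstration with sample data standing in for a cursor.
sample = [{"_id": 1, "name": "Ann", "address": "1 Main St"}, {"name": "Bob"}]
buffer = io.StringIO()
print(write_csv(sample, ["name", "address"], buffer))

# With a live collection (assumed names), the call might look like:
# with open("out.csv", "w", newline="") as f:
#     cursor = db.myCollection.find({}, {"name": 1, "address": 1, "_id": 0})
#     write_csv(cursor, ["name", "address"], f)
```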

Mongo remove last documents

I would like to know how to delete, for example, the last 100 documents inserted in my collection.
How is it possible from the shell?
You should be able to use the _id to sort on last inserted, as outlined in the answer here:
db.coll.find().sort({_id:-1}).limit(100);
It looks like using limit on the standard mongo remove operation isn't supported though, so you might use something like this to delete the 100 documents:
for (var i = 0; i < 100; i++) {
    db.coll.findAndModify({query: {}, sort: {"_id": -1}, remove: true})
}
See the docs for more on findAndModify.
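The same idea can be done in two round trips instead of 100 by collecting the newest _id values first and deleting them in one call. This is a sketch in Python, assuming a PyMongo-style collection object; delete_last_n is a hypothetical helper name, and like the shell version it relies on _id order reflecting insertion order.

```python
def delete_last_n(coll, n):
    """Delete the n documents with the highest _id values.

    Hypothetical helper; `coll` is assumed to behave like a PyMongo
    collection (find, sort, limit, delete_many).
    Returns the number of documents deleted.
    """
    # Fetch only the _id of the newest n documents (-1 means descending).
    newest = coll.find({}, {"_id": 1}).sort("_id", -1).limit(n)
    ids = [doc["_id"] for doc in newest]
    if not ids:
        return 0
    # One delete for the whole batch.
    return coll.delete_many({"_id": {"$in": ids}}).deleted_count
```

Note that documents inserted between the find and the delete are not affected, since the delete targets the specific _id values that were read.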

how can I manipulate the value field of MapReduce?

When using MapReduce, each resulting document 'result' is structured like this:
{ '_id' : 123, 'value' : {'sum_donations' : 999, 'nbr_visitors' : 50} }
I could access _id and value field by using:
db.result.find() OR db.result.find({},{_id:1, value:1})
Is there a way to select _id and sum_donations without selecting the nbr_visitors? Something like this:
{'id': 123, 'sum_donation': 999}
Or should I just create another MapReduce function that return that for me?
I was thinking about having one MapReduce Collection and manipulate it to answer different questions.
I tried
db.result.find({},{_id:1, value.sum_donations:1}) but it didn't work.
There are two problems with doing this:
The value field of the MR output cannot currently be manipulated from the MR itself. There is a JIRA ticket for it, but it's not exactly on the "list": https://jira.mongodb.org/browse/SERVER-2517
The query language of Mongo cannot automatically project your fields to the top-level document. Subdocument fields stay in the subdocument.
You could (if you're using MongoDB 2.2) use the aggregation framework here with the $project operator, but I believe this to be overkill, and it would slow down your system and your program.
So the best way to do this at the moment is to extend your application code to grab the field out of that subdocument. Simply coding around it is probably the most performant, direct and easiest method.
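"Coding around it" on the client side could look like the sketch below, written in Python for illustration. The function name pick_sum_donations is hypothetical, and the input stands in for whatever db.result.find() returns. As a side note, a dotted key in a shell projection has to be quoted, as in {"value.sum_donations": 1}, and even then the field stays nested under value.

```python
def pick_sum_donations(results):
    """Lift value.sum_donations to the top level of each result document.

    Hypothetical helper; `results` is any iterable of MapReduce output
    documents shaped like {'_id': ..., 'value': {...}}.
    """
    return [
        {"_id": doc["_id"], "sum_donations": doc["value"]["sum_donations"]}
        for doc in results
    ]


rows = [{"_id": 123, "value": {"sum_donations": 999, "nbr_visitors": 50}}]
print(pick_sum_donations(rows))
```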