Trouble querying specific text field in mongoDB using pymongo - mongodb

I have stored around 120 text files in a MongoDB database by connecting my local instance to MongoDB cloud. I used pymongo to automate inserting the contents of each text file. The collection of 120 documents looks like this:
'''
{
  _id: ObjectId(....),
  "nameTextdoc.txt": "text_document",
  content: ['Each sentence stored in an array.', '...']
}
'''
I am trying to retrieve the nameTextdoc.txt field and content field by using:
'''
collections.find_one({'nameTextdoc.txt': 'text_doc'})
'''
in a python script using pymongo. For some reason I receive None when I run this query. However, when I run:
'''
collections.find_one({})
'''
I get the entire document.
I would like assistance writing a query that retrieves the entire text file by querying the name of the text file. My key names contain periods, which may be the specific reason why I cannot retrieve them. Any help would be much appreciated.
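The dot in the key is the likely culprit: MongoDB's query language reads `{'nameTextdoc.txt': ...}` as a path to a field `txt` nested inside a subdocument `nameTextdoc`, not as a literal key. A minimal sketch of the mismatch and one workaround, reshaping documents so the filename is a value under a plain key (the `flatten_filename_key` helper and the `label` field are hypothetical, not part of the question's schema):

```python
def flatten_filename_key(doc):
    """Move a dotted filename key like 'nameTextdoc.txt' into a plain
    'name' field, so it can be queried without dot-path ambiguity.
    The 'label' field holding the old value is a made-up name."""
    fixed = dict(doc)
    for key in list(fixed):
        if key.endswith('.txt'):
            fixed['name'] = key              # keep the filename as a value
            fixed['label'] = fixed.pop(key)  # keep the old value too
    return fixed

# MongoDB reads {'nameTextdoc.txt': ...} as nested field 'txt' inside
# 'nameTextdoc', which no document has -- hence find_one returns None.
doc = {'nameTextdoc.txt': 'text_document', 'content': ['...']}
fixed = flatten_filename_key(doc)
# After reshaping and re-inserting, the pymongo query becomes simply:
#   collections.find_one({'name': 'nameTextdoc.txt'})
```

This is a schema-side fix; it avoids dotted keys entirely rather than trying to escape them in the query.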

Related

MongoDB - querying GridFS by metadata does not return any results

I am trying to query MongoDB database for a file stored in GridFS using metadata in the following way:
db['fs'].files.find({'metadata': {'a_field': 'a_value'}})
And it does not return any results whereas I can see the file with such a field value exists when I run e.g.:
db['fs'].files.find()
What is wrong about my query?
It turns out the problem is solved by changing the nesting of JSON query document from:
{'metadata': {'a_field': 'a_value'}}
to:
{'metadata.a_field': 'a_value'}
It is still a mystery to me why the two queries are not equivalent, though.
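The two filters really are different queries: `{'metadata': {'a_field': 'a_value'}}` asks for a subdocument that equals `{'a_field': 'a_value'}` exactly, so any extra field in the stored metadata makes it fail, while `{'metadata.a_field': 'a_value'}` constrains only that one nested field. A rough pure-Python sketch of the two semantics (a simplification: MongoDB's whole-subdocument equality is also field-order-sensitive, which plain dict comparison is not):

```python
def matches_exact(doc, query):
    """Mimic whole-subdocument equality: the stored value must equal
    the query value exactly, extra fields included."""
    return all(doc.get(k) == v for k, v in query.items())

def matches_dotted(doc, path, value):
    """Mimic dot notation: walk into the subdocument one key at a time
    and compare only the leaf value."""
    cur = doc
    for part in path.split('.'):
        if not isinstance(cur, dict) or part not in cur:
            return False
        cur = cur[part]
    return cur == value

stored = {'metadata': {'a_field': 'a_value', 'other': 1}}
# Exact-subdocument form fails because 'other' is also present;
# the dotted form succeeds because it ignores sibling fields.
exact = matches_exact(stored, {'metadata': {'a_field': 'a_value'}})
dotted = matches_dotted(stored, 'metadata.a_field', 'a_value')
```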

Export populated data from MongoDB to CSV file

I am using MongoDB at mLab. I have multiple collections - 1 main and other supporting. Therefore, the main collection consists of IDs pointing to supporting collections. I would like to export the actual data from the main collection to a CSV file. So I need to populate the data first and then export the result.
I see I can export collections individually but then the data are not populated. I suppose I should use bash script to do this but I do not know how.
Could you point me in the right direction or suggest a way to do this?
Thank you!
Using the mongo shell is probably the better idea in your case. Following the official documentation, below are the steps to read data from a mongo collection in a bash shell script.
A simple example that counts the documents whose updated date-time falls within the last 10 days:
DATE2=$(date -d '10 days ago' "+%Y-%m-%dT%H:%M:%S.%3NZ")
counter=$(mongo --quiet dbName --eval 'db.dbCollection.find({"updatedAt":{"$gt":new ISODate("'$DATE2'")}}).count()')
echo "$counter"
Alternatively, you can fetch the list of documents and iterate over it, populating the data as your requirements dictate.
For more on mongo shell queries, see the official mongo shell documentation.
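The same populate-then-export flow can also be sketched in Python with the csv module. All collection and field names below (`support_id`, `detail`, `title`) are hypothetical, and the database round-trip is replaced by plain dicts so the shape of the merge stays visible:

```python
import csv
import io

def populate(main_docs, supporting):
    """Replace each reference id in the main documents with the
    referenced document's fields (field names are made up here)."""
    out = []
    for doc in main_docs:
        row = dict(doc)
        ref = supporting.get(row.pop('support_id'))  # follow the reference
        if ref:
            row.update(ref)                          # inline its fields
        out.append(row)
    return out

main = [{'_id': 1, 'title': 'a', 'support_id': 10}]
supp = {10: {'detail': 'x'}}
rows = populate(main, supp)

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=['_id', 'title', 'detail'])
writer.writeheader()
writer.writerows(rows)
# buf.getvalue() now holds the populated CSV text
```

Against a live database you would instead build `main` from a pymongo `find()` on the main collection and `supp` from the supporting collections, then write `buf.getvalue()` to a file.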

MongoDB import to different collections set by a field

I have a file called data.json and extracted with mongoexport, with the following structure:
{"id":"63","name":"rcontent","table":"modules"}
{"id":"81","name":"choicegroup","table":"modules"}
{"id":"681","course":"1242","name":"Requeriments del curs","timemodified":"1388667164","table":"page"}
{"id":"682","course":"1242","name":"Guia d'estudi","timemodified":"1374183513","table":"page"}
What I need is to import this file into my local mongodb with a command like mongoimport or with pymongo, but storing every line in the collection named after the table value.
For example, the collection modules would contain the documents
{"id":"63","name":"rcontent"} and {"id":"81","name":"choicegroup"}
I've tried with mongoimport but I haven't seen any option which allows that. Does anyone know if there is a command or a method to do that?
Thank you
The basic steps for this using python are:
parse the data.json file to create python objects
extract the table key value pair from each document object
insert the remaining doc into a pymongo collection
Thankfully, pymongo makes this pretty straightforward, as below:
import json
from pymongo import MongoClient

client = MongoClient()  # default host and port
db = client['test-db']  # select the db to use

with open("data.json", "r") as json_f:
    for str_doc in json_f:
        doc = json.loads(str_doc)
        table = doc.pop("table")   # remove the 'table' key
        db[table].insert_one(doc)  # insert into the collection it named

fill up mongo data automatically by using script

I am new to mongo and have a collection in my mongodb. To test a feature in my project I need to update the database with some random data, so I need a script that identifies the data type of each field and fills in random data automatically.
suppose I have the fields in the collection:
id, name, first_name, last_name, current_date, user_income etc.
My questions are as follows:
1. Can we get all field names of a collection with their data types?
2. Can we generate a random value of that data type in mongo shell?
3. how to set the values dynamically to store random data.
At the moment I am entering this data manually, and frequently.
1. Can we get all field names of a collection with their data types?
mongodb collections are schema-less, which means each document (a row in a relational database) can have different fields. When you fetch a document from a collection, you can inspect its field names and data types.
2. Can we generate a random value of that data type in mongo shell?
3. how to set the values dynamically to store random data.
The mongo shell uses JavaScript, so you can write a js script and run it with mongo the_js_file.js, generating the random values inside that script.
It's useful to have a look at the mongo JavaScript API documentation and the mongo shell JavaScript Method Reference.
Other scripting languages such as Python can do this as well; mongodb has drivers for them too.
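As a concrete illustration of points 1–3, here is a hedged Python sketch: take one sample document as a template, infer each field's type, and generate a random value of the same type (a simplification, since a schema-less collection may mix types per field):

```python
import datetime
import random
import string

def random_value(example):
    """Generate a random value matching the type of an example value.
    The bool check must come before int, since bool subclasses int."""
    if isinstance(example, bool):
        return random.choice([True, False])
    if isinstance(example, int):
        return random.randint(0, 10**6)
    if isinstance(example, float):
        return random.uniform(0, 10**6)
    if isinstance(example, str):
        return ''.join(random.choices(string.ascii_lowercase, k=8))
    if isinstance(example, datetime.datetime):
        return datetime.datetime.now()
    return example  # unknown type: keep the sample value as-is

def random_doc(template):
    """Build a document shaped like `template` but with random values,
    dropping _id so the server assigns a fresh one."""
    return {k: random_value(v) for k, v in template.items() if k != '_id'}

sample = {'_id': 1, 'first_name': 'x', 'user_income': 100.0}
doc = random_doc(sample)
# With pymongo you might then bulk-insert test data, e.g.:
#   db.users.insert_many(random_doc(sample) for _ in range(100))
```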

Haskell mongodb text search

What is the status of text search with haskell mongodb driver?
There is no 'LIKE' operator in mongo similar to the SQL variants, so what is the best way to search a collection or the whole db for a particular text string?
I've read some people referencing external tools but I can also see that some text search was implemented in 2.4 mongo version which is done through command interface.
There should not be any problems doing it from console but how would I do it from haskell driver? I found 'runCommand' function in the driver APIs and it looks like it should be possible to send 'text' command to the server but the signature shows that it returns only one document - not a list of documents. So how is it done correctly?
How would I efficiently search for a word or a sentence in a collection or db so that it returns a list of documents containing the word? Is it possible to do this without external tools, using mongo's 'text search' feature? Should it be done at the application level?
Thanks.
The result type already contains the list of documents (those that contain the searched text). Unfortunately, I could not test the query on my running database, but I have used runCommand to run an aggregation (before it was implemented in the haskell driver). The result document you get for such a query looks something like this:
{
  results: [
    { score: ...,
      obj: { ... }
    },
    ...
  ],
  ...,
  ok: 1
}
The result document has a field results and its value is a document with fields score and obj. So in the end, you can find each of the matched document behind the obj-field in the list of results.
For more details, you should take a look here.
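The result-document unpacking described above is driver-agnostic; here is a small Python sketch of pulling the matched documents out of a text command result of that shape (the exact field names depend on the MongoDB version, so treat this as an assumption):

```python
def extract_matches(result):
    """Pull the matched documents out of a `text` command result,
    assuming the {ok, results: [{score, obj}, ...]} shape shown above."""
    if result.get('ok') != 1:
        return []  # command failed: no matches to report
    return [hit['obj'] for hit in result.get('results', [])]

# A stand-in for what the server would return:
fake = {'ok': 1,
        'results': [{'score': 1.5, 'obj': {'_id': 1, 'body': 'hello'}}]}
docs = extract_matches(fake)
```

In the Haskell driver the same walk would be done with document-field lookups on the value runCommand returns.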