How can I discover a mongo database's structure - mongodb

I have a Mongo database that I did not create or architect. Is there a good way to introspect the db or print out its structure, so I can start to get a handle on what types of data are being stored, how the data types are nested, etc.?

Just query the database by running the following commands in the mongo shell:
use mydb //this switches to the database you want to query
show collections //this command will list all collections in the database
db.collectionName.find().pretty() //this will show all documents in the collection in a readable format; do the same for each collection in the database
You should then be able to examine the document structure.

There is actually a tool to help you out here called Variety:
http://blog.mongodb.org/post/21923016898/meet-variety-a-schema-analyzer-for-mongodb
You can view the Github repo for it here: https://github.com/variety/variety
I should probably warn you that:
It uses MapReduce (MR) to accomplish its tasks
It uses certain other queries that could bring a production set-up to a near halt in terms of performance.
As such I recommend you run this on a development server, a hidden member of a replica set, or something similar.
Depending on the size and depth of your documents it may take a very long time to understand the rough structure of your database through this but it will eventually give one.

This will print each top-level field name and the type of its value:
var schematodo = db.collection_name.findOne()
for (var key in schematodo) { print(key, typeof schematodo[key]); }
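The loop above only looks at top-level fields. As a rough sketch of how the same idea could be extended to nested documents, the hypothetical helper below (describeFields is my own name, not a MongoDB function) walks a plain JavaScript object and records a dotted path and a type for each field. It assumes the document behaves like a plain object, which should be the case for the result of findOne() in the shell, and it deliberately does not descend into arrays or into special values such as ObjectId.

```javascript
// Hypothetical helper: maps dotted field paths to a rough type name.
function describeFields(doc, prefix = "", out = {}) {
  for (const key of Object.keys(doc)) {
    const value = doc[key];
    const path = prefix ? prefix + "." + key : key;

    let type;
    if (value === null) type = "null";
    else if (Array.isArray(value)) type = "array";
    else if (value instanceof Date) type = "date";
    else type = typeof value;
    out[path] = type;

    // Recurse only into plain nested documents, not arrays, dates or ObjectIds.
    if (type === "object" && Object.getPrototypeOf(value) === Object.prototype) {
      describeFields(value, path, out);
    }
  }
  return out;
}

// Example with a made-up document; in the shell you could pass
// db.collection_name.findOne() instead.
const example = { name: "x", address: { city: "y", geo: { lat: 1.5 } }, tags: ["a"] };
console.log(describeFields(example));
```

A single document only shows the fields that happen to be present in it, so the result is a starting point rather than a full schema.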

I would recommend limiting the result set rather than issuing an unrestricted find command.
use mydb
db.collectionName.find().limit(10)
var z = db.collectionName.find().limit(10).toArray()
Object.keys(z[0])
Object.keys(z[1])
This will help you begin to understand your database structure or lack thereof.
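Comparing Object.keys output by eye gets tedious beyond a couple of documents. A minimal sketch of one way to summarise a sample is to count how often each top-level key appears. The helper name countKeys and the sample data are made up for illustration; in the shell you could pass the array returned by find().limit(10).toArray() instead.

```javascript
// Hypothetical helper: counts how many documents contain each top-level key.
function countKeys(docs) {
  const counts = new Map();
  for (const doc of docs) {
    for (const key of Object.keys(doc)) {
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  return counts;
}

// Made-up sample standing in for a limited query result.
const sampleDocs = [
  { _id: 1, name: "a", tags: ["x"] },
  { _id: 2, name: "b" },
  { _id: 3, email: "c@example.com" }
];

for (const [key, n] of countKeys(sampleDocs)) {
  console.log(key + ": " + n + " of " + sampleDocs.length);
}
```

Keys that appear in only some of the sampled documents are a hint that the collection has optional fields or mixed document shapes.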

This is an open-source tool that I, along with my friend, have created - https://pypi.python.org/pypi/mongoschema/
It is a Python library with a pretty simple usage. You can try it out (even contribute).

One option is to use Mongoeye. It is an open-source tool similar to Variety.
The difference is that Mongoeye is a stand-alone program (the Mongo shell is not required) and has more features (histograms, most frequent values, etc.).
https://github.com/mongoeye/mongoeye

A few days ago I found the GUI client MongoDB Compass, which has some nice visualizations. See the product overview. It comes directly from the MongoDB people and according to their docs:
MongoDB Compass is designed to allow users to easily analyze and understand the contents of their data collections within MongoDB...

You may have been asking about the validation schema. Here is the answer on how to get it:
How to retrieve MongoDb collection validator rules?

Use MongoDB Compass, which samples the collection as explained here.
It takes a random sample of 1000 documents to derive the schema. It could miss something, but it's the only rational option if your database is several GBs.
Visualisation
The schema then can be exported as JSON
Documentation

You can use MongoDB's tool mongodump. On running it, a dump folder is created in the directory from which you executed mongodump. In that folder, there are subfolders that correspond to the databases in MongoDB, and inside each of them there are files for every collection: a .bson file holding the documents and a .metadata.json file holding the collection options and index definitions.
This method is the best I know of, as you can also make out the schema of empty collections.

Related

How to handle databases or collection being created accidentally in mongoDB? [duplicate]

Is there a way to switch off the ability of mongo to sporadically create dbs and collections as soon as it sees one in a query? I run queries on the mongo console all the time and mistype a db or collection name, causing mongo to just create one. There should be a switch to have mongo only explicitly create dbs and collections. I can't find one in the docs.
To be clear, MongoDB does not auto create collections or databases on queries. For collections, they are auto created when you actually save data to them. You can test this yourself, run a query on a previously unknown collection in a database like this:
use unknowndb
db.unknowncollection.find()
show collections
No collection named "unknowncollection" shows up until you insert or save into it.
Databases are a bit more complex. A simple "use unknowndb" will not auto create the database. However, if after you do that you run something like "show collections" it will create the empty database.
I agree, an option to control this behavior would be great. Happy to vote for it if you open a Jira ticket at mongoDB.
No, implicit creation of collections and DBs is a feature of the console and may not be disabled. You might take a look at the security/authorization/role features of 2.6 and see if anything might help (although there's not something that exactly matches your request as far as I know).
I'd suggest looking through the MongoDB issues/bug/requests database system here and optionally adding the feature request if it doesn't already exist.
For people who are using Mongoose, a new database will get created automatically if your Mongoose Schema contains any form of index. This is because Mongo needs to create a database before it can insert said index.

Using Alteryx's Mongo Connection Tool To Connect With A Mongo Labs Database (Collection)

I've been test driving Alteryx for the last week or so and was wondering if anyone has successfully connected to a Mongo Labs database using the Alteryx Mongo Input and Output tools. I've tried numerous times and can't seem to get it to work.
Yes, I am using that MongoDB extractor on a daily basis to generate TDE files to feed Tableau, since Tableau does not offer dedicated extractors, but I admit it was difficult to get started.
Extraction is now swift from a MongoDB running on AWS. Make sure you indicate the proper collection, and be aware of one flaw in the current version of the extractor, as of Alteryx 9.5: it is missing the flag that would let you read from a MongoDB configured as read-only. It is on the priority list for V10. In the meantime, you should connect to a DB that can be written to, and to reassure your DBA, show that Alteryx workflows can't write to it unless you drag the MongoDB Output Tool into the workflow.
Also you can use the Properties / Criteria window to set filters on the indexed field of your MongoDB. After much research, I found out the precise syntax for filtering the extraction by date:
{_id: {$gt: ObjectId("54a4fe800000000000000000")}}
This works because each MongoDB ObjectId contains an embedded timestamp of its creation time.
To get the proper time, you can use this excellent website:
http://steveridout.github.io/mongo-object-time/
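If you would rather compute the boundary value yourself, here is a minimal sketch in plain JavaScript. It relies on the documented ObjectId layout, in which the first 4 bytes are the creation time in seconds since the Unix epoch; the remaining 8 bytes are padded with zeros, as in the filter above. The function names are my own and are not part of any MongoDB or Alteryx API.

```javascript
// Build a 24-character hex string usable as a lower or upper bound in an
// _id range filter: 8 hex digits of epoch seconds, then 16 zeros.
function objectIdHexFromDate(date) {
  const seconds = Math.floor(date.getTime() / 1000);
  return seconds.toString(16).padStart(8, "0") + "0".repeat(16);
}

// Reverse direction: read the embedded creation time from an ObjectId hex string.
function dateFromObjectIdHex(hex) {
  return new Date(parseInt(hex.slice(0, 8), 16) * 1000);
}

console.log(objectIdHexFromDate(new Date("2015-01-01T08:00:00Z")));
console.log(dateFromObjectIdHex("54a4fe800000000000000000").toISOString());
```

The resulting string goes inside ObjectId("...") in the filter, exactly like the hand-written value above.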
And if you need fancier filters, here is some help with the syntax:
http://www.querymongo.com/
I hope that helps...

Finding Collection Data in Meteor

I'm trying to better understand the Meteor/MongoDB data model. When you create a new meteor project I'd like to know where the data in a collection is stored when you create a new collection or add data to a collection. I understand that it is supposed to be under the .meteor/local/db directory but thus far I have not found it. I've both created new collections and added data to preexisting collection to both the basic project and to the Meteor demo projects (like Leaderboard) and I can't find where this data is stored. Could someone please guide me on this matter?
I imagine that I would at least see a JSON-type list somewhere, or a GUI similar to something like MySQL Workbench (is there anything out there like this for Meteor? I've looked high and low but I haven't found it; Houston is insufficient).
In addition to scouring Stack Overflow for the answer to this question I've looked through a number of APIs (like Meteor's and Mongo's) and tutorials like http://meteortips.com/book/databases-part-1/
Again all I want to know is how can I see the data in Mongo as it is added to a collection. Thank you.
The data files are in MongoDB's own format and are not human-readable.
If you want to query mongo directly --
while meteor is running (from your app's directory)
meteor mongo
If meteor isn't running, and you want to launch just the database, you can try:
mongod --smallfiles --dbpath /path/to/my/app/.meteor/local/db --port 3001
Then connect with the regular mongo shell.
To access the database in a nice GUI, I use Robomongo.
What is nice is that you can connect to a local (on port 3001) or production MongoDB from it (see how to do that).
Update:
Remember to run the meteor command before connecting to the local MongoDB.
Thanks @iAmME
I have been using MongoVUE (http://www.mongovue.com/downloads/) for viewing the collections and it has been very handy in checking the data.
The different kinds of views (Table View, Tree View and Text View) make it easier to understand how the data is inserted, especially for anyone (like me) jumping from RDBMS to NoSQL.

Unknown Database Content

I was given the data files for a MongoDB database without being told much about what was in them. Is there a way to probe the contents of the database to identify commonalities within it?
There are a few different mongo shell helpers that will give you an idea of the general structure of collections (e.g. field/data types and their frequency of usage in a collection):
schema.js
variety
To get a better idea of the actual content, you could try one of the Admin UIs.