Analyse MongoDB's diagnostic.data files

My MongoDB crashed and I am trying to understand why. On Ubuntu MongoDB produces files in /var/lib/mongodb/diagnostic.data. Those files, e.g. metrics.2016-03-08T17-15-01Z0, are binary files.
What tool should I use to analyse MongoDB diagnostic files? What data do the diagnostic files have?

You can see the contents of the metrics... files using the bsondump tool, which is included in every MongoDB installation.
Just execute bsondump metrics.2016-03-08T17-15-01Z0 and it will print the decoded content of the file.
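Since the decoded output can be large, one option is to redirect it to a file and inspect that instead (the file name here just reuses the one from the question):

bsondump metrics.2016-03-08T17-15-01Z0 > metrics.json   # decode BSON to JSON text
head -n 5 metrics.json                                  # peek at the first few documents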

I believe that at the moment there is no tool from MongoDB to view this.
Please see this comment from a MongoDB engineer.
As of the latest version, the data collected consists of the output of serverStatus, replSetGetStatus, collStats on local.oplog.rs, buildInfo, getCmdLineOpts, and hostInfo.
To understand the data being collected, please head over to the MongoDB source code.

MongoDB 3.2 collects server statistics every second (the default interval) into the diagnostic files inside the diagnostic.data directory. This data is collected so that MongoDB engineers can analyse the server's behaviour. I think no tool or document has been released yet for the public to analyse the captured data.
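If you want to control this collection, MongoDB 3.2 exposes server parameters for it; a minimal sketch (the parameter names are from the FTDC feature introduced in 3.2, and the dbpath is a placeholder):

mongod --dbpath /data/db --setParameter diagnosticDataCollectionEnabled=false
# or keep it enabled but sample less often (value is in milliseconds):
mongod --dbpath /data/db --setParameter diagnosticDataCollectionPeriodMillis=5000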

Related

mongorestore for a collection results in "Killed" output and collection isn't fully restored

I run the following:
root#:/home/deploy# mongorestore --db=dbname --collection=collectionname pathtobackupfolder/collectionname.bson
Here's the output:
2016-07-16T00:08:03.513-0400 checking for collection data in pathtobackupfolder/collectionname.bson
2016-07-16T00:08:03.525-0400 reading metadata file from pathtobackupfolder/collectionname.bson
2016-07-16T00:08:03.526-0400 restoring collectionname from file pathtobackupfolder/collectionname.bson
Killed
What's going on? I can't find anything on Google or Stack Overflow about a mongorestore resulting in "Killed". The backup folder that I'm restoring from contains 12875 documents, yet every time I run the mongorestore it says "Killed", and it always restores a different number of documents that is less than the total: 4793, 2000, 4000, etc.
The machine that I'm performing this call on is "Ubuntu 14.04.3 LTS (GNU/Linux 3.13.0-71-generic x86_64)" from Digital Ocean
Any help is appreciated. Thanks.
After running the mongorestore command a fifth and sixth time after posting this question, more explicit output finally appeared indicating that it was a memory issue on my Digital Ocean droplet. I followed https://www.digitalocean.com/community/tutorials/how-to-add-swap-on-ubuntu-14-04 and the restore finished completely, without errors.
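For reference, a condensed sketch of what that tutorial does (the 4G size is a placeholder; match it to your droplet). Checking dmesg first confirms that the bare "Killed" message came from the kernel's OOM killer:

dmesg | grep -i 'killed process'   # confirm the OOM killer ended mongorestore
sudo fallocate -l 4G /swapfile     # create the swap file
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab   # persist across reboots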
If you are trying to solve this in Docker, just increase the swap memory in the settings.json file.
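On plain Linux Docker (rather than Docker Desktop) the equivalent is to raise the per-container limit with the --memory-swap flag on docker run; a hedged example, with the image tag and sizes as placeholders:

docker run --memory=2g --memory-swap=4g mongo:3.2   # container gets 2g RAM plus 2g swap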

Pentaho PDI - Reading data from MongoDB

I have installed Pentaho Data Integration (ce-5.0.1.A-stable) on my machine and I am trying to retrieve information from MongoDB using PDI. I have created a transformation with a MongoDB Input step. Now, when I try to configure my MongoDB connection details, I can't find any explicit connection type for MongoDB. Could someone please advise on how to configure a MongoDB data source in Pentaho?
I have referred to most of the Pentaho-MongoDB docs, but none of the solutions worked.
I have also tried performing the steps below, as described on the official Pentaho site (a shell sketch of these steps follows the list), but I still can't find any connection type for MongoDB:
1- Move the following folder out of the data-integration folder structure:
data-integration/plugins/pentaho-big-data-plugin
2- Move the following files out of the data-integration folder structure if they exist:
data-integration/libext/JDBC/pentaho-hadoop-hive-jdbc-shim-1.3.0.jar
data-integration/libext/JDBC/pentaho-hadoop-hive-jdbc-shim-1.3.1.jar
data-integration/libext/JDBC/pentaho-hadoop-hive-jdbc-shim-1.3.2.jar
3- Unzip the file pentaho-big-data-plugin-shimtastic-1.3.3.1.zip from the data-integration/plugins folder.
4- Optionally, remove irrelevant folders under data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations.
5- Copy the file pentaho-hadoop-hive-jdbc-shim-1.3.3.jar into the folder
data-integration/libext/JDBC
6- Unzip the file pentaho-instaview-templates-shimtastic-1.3.3.zip to the following directory:
data-integration/plugins/spoon/agile-bi/platform/pentaho-solutions/system/instaview/templates/Big Data
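Here are those steps as shell commands, run from inside the data-integration folder and assuming the two zip files have already been downloaded there (paths and versions are copied from the list above, not verified):

mv plugins/pentaho-big-data-plugin ..                                   # step 1
mv libext/JDBC/pentaho-hadoop-hive-jdbc-shim-1.3.{0,1,2}.jar .. 2>/dev/null   # step 2, ignore missing jars
unzip pentaho-big-data-plugin-shimtastic-1.3.3.1.zip -d plugins         # step 3
cp pentaho-hadoop-hive-jdbc-shim-1.3.3.jar libext/JDBC                  # step 5
unzip pentaho-instaview-templates-shimtastic-1.3.3.zip -d "plugins/spoon/agile-bi/platform/pentaho-solutions/system/instaview/templates/Big Data"   # step 6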
Any help is really appreciated!
Pentaho does not have a specific database connection type for MongoDB, so you will not find it in the Database Connection viewer. The way to connect to MongoDB is to use the MongoDB Input step in PDI. There you will find the connection details section, where you configure credentials. You can then connect a JSON Input step to read the results of your MongoDB output.
You can also read about it on the Pentaho Wiki here. The documentation seems slightly old, but it describes the exact process.
On a side note, you don't need the Big Data shims to connect to MongoDB. It seems you have configured the hadoop-hive shims; they are not required here.
Hope it helps :)

Incredibly low GridFS performance using MongoDB 3.0.0 and Mongofiles

I have a MongoDB database with a GridFS collection containing hundreds of thousands of files (345,073, to be precise, about 100 GB in volume).
On MongoDB 2.6.8 it takes a fraction of a second to list the files using the native mongofiles tool connected to mongod. This is the command I use:
mongofiles --db files list
I just brewed and linked MongoDB 3.0.0, and suddenly the same operation takes more than five minutes to complete, if it completes at all. I have to kill the query most of the time, as it drives two of my CPU cores to 100%. The log file does not show anything irregular. I rebuilt the indexes to no avail. I also tried the same with my other GridFS collections in other databases, each with millions of files, and I encountered the same issue.
Then I uninstalled 3.0.0, relinked 2.6.8, and everything was back to normal (using the exact same data files).
I am running MongoDB on Yosemite, and I reckon the problem might be platform-specific. But is there anything I have omitted and should take into consideration? Or have I really discovered a bug that I must report?
I'm having the same problem here. For me, running mongofiles 2.6 from a Docker image fixed it; it seems they broke something in the rewrite.
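If you only need the listing while on 3.0, one workaround is to query the fs.files collection directly from the mongo shell, bypassing mongofiles entirely (this assumes the default GridFS prefix fs and the database name from the question):

mongo files --eval 'db.fs.files.find({}, {filename: 1, length: 1}).limit(10).forEach(printjson)'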

Where does MongoDB store its documents?

I have inserted and fetched data using MongoDB, in PHP. Is there an actual copy of this data in a document somewhere?
By default Mongo stores its data in the directory /data/db.
You can specify a different directory using the --dbpath option.
If you’re running Mongo on Windows then the directory will be C:\data\db, where C is the drive letter of the working directory in which Mongo was started. This is quite confusing, so on Windows I’d recommend that you always specify a data directory using --dbpath.
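For example, to start the server with an explicit data directory (the paths here are just illustrations):

mongod --dbpath /srv/mongodb        # Linux/macOS
mongod --dbpath D:\mongodb\data     # Windows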
MongoDB stores its data in the data directory specified by --dbpath. It uses its own database format, so the data files are not actual documents; there are multiple documents in each file, and you cannot easily extract the data from this format yourself.
To read and/or update a document you need to use a MongoDB client, in the same way that you send SQL queries to MySQL through a MySQL client. You probably want to do it programmatically by using one of the client libraries for your programming language, but there is also a command-line client if you need to do manual updates.
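For instance, a quick manual session with the bundled command-line client might look like this (the database and collection names are placeholders):

mongo mydb
> db.mycollection.findOne()                          # read one document
> db.mycollection.update({_id: 1}, {$set: {x: 2}})   # update it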

Problem while exporting a MongoDB collection

I am using MongoDB version 1.6.5.
One of my collections has 973525 records.
When I try to export this collection, mongoexport exports only 101 records.
I can't figure out the problem. Does anyone know the solution?
This sounds like corruption. If your server has not shut down cleanly, that could be the cause. Have you had system crashes after which you didn't run a repair?
You can try to do a dump with mongodump --dbpath if you shut down the server first.
Note: mongoexport/mongoimport will not be able to restore all the data, since JSON can't represent all possible BSON data types.
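A sketch of that repair-then-dump route, with the server shut down and /data/db and /backup/dump as placeholder paths (--dbpath was supported by these tools in the 1.x series):

mongod --dbpath /data/db --repair                # repair the data files first
mongodump --dbpath /data/db --out /backup/dump   # then dump straight from the files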