How to get the size currently used on each shard server in a MongoDB sharded cluster - mongodb

I have a sharded cluster set up. Since my data is growing steadily, I need to keep monitoring its size and add new shards to the cluster.
Is there a command I can use to find out how much space is used on each shard server at any point in time?
For example, say I have a database, and the show dbs command from the mongos console shows this:
mongos> show dbs
company 0.375GB
config 0.046875GB
test 0.0625GB
I want to know how much data the company database uses on each shard server.
My architecture is as follows:
A single sharded database in which every collection is sharded
3 shard servers running mongod instances
1 server running mongos
1 server running config instance
My whole application layer talks to mongos directly.
I need this because I am planning to build a cron job that checks the size used on each shard server and, if it exceeds some threshold, sends a notification to the administrator.
Thanks in advance for responding to this post

After posting in the MongoDB user group, I got a solution, including the commands to use.
Commands
To see the space used by a particular database on each shard server, use
db.stats()
To see the space used by a particular collection on each shard server, use
db.<collectionname>.stats()
To use these in a PHP daemon or cron job, I can run the same commands through the PHP Mongo driver:
$con = new Mongo();
$stats = $con->dbName->command(array('dbStats' => 1)); // for db.stats()
$stats = $con->dbName->command(array('collStats' => 'collection_name')); // for db.<collectionname>.stats()
I still couldn't find a way to run such commands from Zend Shanty Mongo, but the default PHP PECL Mongo driver is enough for this.
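As a rough sketch of the threshold check such a cron job might perform, the following assumes that the dbStats result obtained through mongos carries a `raw` field keyed by shard host, with `dataSize` and `storageSize` in bytes for each shard. The helper names `shardUsage` and `shardsOverLimit`, the host names, and the sample numbers are made up for illustration.

```javascript
// Hypothetical helper: summarise per-shard usage from a dbStats result
// obtained through mongos. Assumes the result has a `raw` field keyed by
// shard host, each entry carrying `dataSize` and `storageSize` in bytes.
function shardUsage(dbStats) {
  const raw = dbStats.raw || {};
  return Object.keys(raw).map(function (host) {
    return {
      shard: host,
      dataSize: raw[host].dataSize,
      storageSize: raw[host].storageSize
    };
  });
}

// Return the shards whose storage size exceeds the limit (in bytes).
function shardsOverLimit(dbStats, limitBytes) {
  return shardUsage(dbStats).filter(function (s) {
    return s.storageSize > limitBytes;
  });
}

// Example with made-up numbers (not real measurements).
const sample = {
  raw: {
    "shard1/host1:27018": { dataSize: 100, storageSize: 150 },
    "shard2/host2:27018": { dataSize: 300, storageSize: 400 }
  }
};
console.log(shardsOverLimit(sample, 200));
```

In the mongo shell connected to mongos, the same function could presumably be fed a live result, for example `shardsOverLimit(db.getSiblingDB("company").stats(), limit)`, with the notification step left to the cron job.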
Thank you all for responding to this post

There are general monitoring solutions for that (Nagios, Zabbix, etc.). They monitor many parameters of your machines and can be set up to send alerts in certain situations. You don't need to reinvent the wheel.
Such a general solution can also warn you if you're running out of space on an app server (for example, because its logs take up all the space). Your specialized MongoDB cron job won't be able to do that.

Related

After adding a shard in MongoDB, does the router server need to be restarted?

I built a MongoDB sharding environment and want to test the performance of data migration.
I inserted one billion rows into a collection in Replica Set A.
I then added another shard, Replica Set B.
MongoDB started to balance chunks between those shards.
After balancing finished, I found that I couldn't look up some of the data.
That data had been moved to Replica Set B, and I could query it only after restarting all the mongos router services.
Is this a normal and unavoidable procedure, or is there a way to reload the whole system (through a mongo shell command or anything else)?
Thank you !!!
I found a command that seems to reload the router config:
db.adminCommand({"flushRouterConfig":1});
2017-05-18: After testing, it works!
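Since each mongos keeps its own cached routing metadata, the command probably has to be issued against every router rather than just one. A minimal sketch of that loop follows, assuming `routers` is an array of objects exposing `adminCommand()` (for instance the admin database handles of connections to each mongos). The helper name `flushAllRouters` and the stand-in router are hypothetical and exist only to show the call shape.

```javascript
// Hypothetical helper: run flushRouterConfig against every router.
// `routers` is assumed to be an array of objects exposing adminCommand(),
// e.g. the admin database handles of connections to each mongos.
function flushAllRouters(routers) {
  return routers.map(function (router) {
    return router.adminCommand({ flushRouterConfig: 1 });
  });
}

// Stand-in router used only to demonstrate the call shape; it records
// the commands it receives instead of talking to a server.
const calls = [];
const fakeRouter = {
  adminCommand: function (cmd) {
    calls.push(cmd);
    return { ok: 1 };
  }
};
const results = flushAllRouters([fakeRouter, fakeRouter]);
console.log(results);
```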

Local MongoDB instance with index in remote server

One of our clients has a server running a MongoDB instance, and we have to build an analytical application using the data stored in their MongoDB database, which changes frequently.
The client's requirements are:
That we do not connect to their MongoDB instance directly or run another instance of MongoDB on their server, but somehow run our own MongoDB instance on a machine in our office, using their MongoDB database directory remotely with read-only access.
We've suggested deploying a REST application or getting a copy of their database dump, but they did not want that. They just want us to run our own MongoDB instance hooked up to their MongoDB data directory. Is this even possible?
I've been searching for a solution for the past two days and we have to submit one by Monday. I really need some help.
I think this is a normal request, because analytical queries could put too much load on the production server. It is quite common to separate production and analytical databases.
The easiest option is to use MongoDB replication. Set up a MongoDB replica set with the production database instance as primary and the analytical database instance as secondary, and configure the analytical instance so that it never becomes primary.
If replication is not possible (for example, the client doesn't want it, or the servers cannot connect directly to each other), there is another option. You can read the oplog from the remote database and apply its operations to your own database instance. This is exactly the low-level mechanism by which a replica set works, but you can do it manually too. For example, MMS (MongoDB Monitoring Service) Backup reads the oplog for online backups of MongoDB.
Update: mongooplog could be the right tool for applying, in real time, an oplog pulled from a remote server to a local server.
I don't think that running two databases that point to the same database files is possible, let alone recommended.
You could use mongorestore to restore from their data files directly, but this will only work if their mongod instance is not running (because mongorestore will need to lock the directory).
Another solution would be to take file system snapshots and then restore them to your local database.
The downside of these backup/restore solutions is that your data will not be in sync all the time.
Probably the best solution is to use a replica set with a hidden member.
You can create a replica set with just two members:
Primary - this will be the client server.
Secondary - hidden, with votes and priority set to 0. This will be your local instance.
Their server will always be primary (because hidden members cannot become primaries). Clients cannot see hidden members so for all intents and purposes your server will be read only.
Another upside to this is that the MongoDB replication will do all the "heavy" work of syncing the data between servers and your instance will always have the latest data.
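The two-member layout described above could be expressed as a replica set configuration document roughly like the one built below. This is only a sketch: the set name, the host names, and the helper `buildHiddenMemberConfig` are placeholders, not values from the original post. The `priority`, `hidden`, and `votes` fields are the standard member settings for a hidden, non-voting member.

```javascript
// Build a replica set configuration document with one normal member and
// one hidden, non-voting member. The set name and host names passed in
// below are placeholders for illustration.
function buildHiddenMemberConfig(setName, primaryHost, hiddenHost) {
  return {
    _id: setName,
    members: [
      // The client's server: the only member that can become primary.
      { _id: 0, host: primaryHost },
      // The local analytics instance: hidden, never primary, no vote.
      { _id: 1, host: hiddenHost, priority: 0, hidden: true, votes: 0 }
    ]
  };
}

const config = buildHiddenMemberConfig(
  "rs0",
  "client.example.com:27017",
  "office.example.com:27017"
);
console.log(JSON.stringify(config, null, 2));
```

A document of this shape is what `rs.initiate()` accepts in the mongo shell. Both mongod processes would have to be started as members of the same replica set first, which means the client's instance needs to be reconfigured and the two servers must be able to reach each other.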

Mongodb sharding is NOT recovered after power off accident

I'm running 4 vms (centos) on a single machine (Windows 2008 R2). The 4 vms are setup as below:
1 mongos
1 mongo config server
2 mongod instances as shard servers
Everything was fine until a power outage. When the power came back, I manually rebooted all the VMs and found that the sharding settings were gone. The mongos can talk to the config server, but somehow the sharding data is lost and it shows the database as not sharded.
I set up sharding based on the documentation on the MongoDB website (e.g. running commands in the mongo shell to enable sharding for the db and each collection). Do I need to run all the mongo shell commands again after rebooting? Or is the configuration supposed to survive a restart once sharding is enabled?
Thanks.
Once you've established a sharded cluster, it is certainly supposed to stay configured, even if servers fail, and even if they all fail at the same time. Restarting the servers should bring everything up just the way it was before the outage. Based on your description, it is difficult to tell what might have gone wrong. A dump of the config database and the log files of all the affected servers would be necessary to analyze what happened. This should perhaps be filed as a ticket with MongoDB support.
(It is, by the way, recommended to run three config servers rather than a single one, for availability reasons. But even so, even a single server should recover just fine after having failed. The three-server recommendation is only to make sure that there's always a live config server even if one of them should fail.)

Need explanation on how the select queries are handled by mongoDB

I have no issues running a select query on mongos in a sharded environment, but my question is:
If I have a 2-shard setup and run a find query from the application layer, which part of the sharded environment is responsible for executing the query?
I am not able to see any change in any of the instances' consoles, and no new process is created. I tested this by executing 3000 find queries on a locally implemented sharding setup.
Can anybody explain where my understanding is wrong, or do find statements not put load on the servers?
How does MongoDB handle select or read operations?
I don't understand this well.
Thanks in advance for responding
When you connect to a mongod or mongos server via the shell (mongo), you won't see the queries happening on that server. The shell is mainly there to execute queries, configure the database, and check its status.
mongos is simply a router for the queries coming from the application; the mongod processes on the shards execute them.
To see the individual queries you'll need to check the log files, which are located on each server according to your configuration.
By default only slow queries (slower than 100ms) are logged, so you will need to enable the profiler to log all queries.
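In the mongo shell, `db.setProfilingLevel(2)` on a shard's mongod records all operations in that database's `system.profile` collection. Once entries are being recorded, a small helper can show where the load lands. The sketch below assumes the entries have the `ns`, `op`, and `millis` fields found in `system.profile` documents. The helper name `summariseProfile` and the sample entries are made up for illustration.

```javascript
// Count profiled operations per namespace and operation type.
// `entries` is assumed to be an array of documents shaped like those in
// the system.profile collection (fields `ns`, `op`, `millis`).
function summariseProfile(entries) {
  const summary = {};
  entries.forEach(function (e) {
    const key = e.ns + " " + e.op;
    if (!summary[key]) {
      summary[key] = { count: 0, totalMillis: 0 };
    }
    summary[key].count += 1;
    summary[key].totalMillis += e.millis;
  });
  return summary;
}

// Made-up entries for illustration only.
const entries = [
  { ns: "company.users", op: "query", millis: 3 },
  { ns: "company.users", op: "query", millis: 5 },
  { ns: "company.orders", op: "insert", millis: 1 }
];
console.log(summariseProfile(entries));
```

Running something like this against the profile data of each shard in turn would show which mongod actually served the find queries.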
You can read the MongoDB documentation pages for more info on sharding.

64-bit mongodb multiple-shards Issue

I am using 64-bit MongoDB and testing with multiple shards. If I keep all the shards on a single machine, it works fine, but if I keep the shards on different machines, sharding to the second shard fails. I have restricted the first shard's size to 10MB; once it reaches that limit, data should start going to the second shard, but that does not happen. Instead, writes fail to reach the second shard and keep going to the first. My shard details are as follows. Initially I have two shards: the first is on my first machine, running alongside my application, and the second is on my second machine.
Configuration as follows:
On both machines I run a shard server, a config server, and mongos. I connected the mongo shell through mongos with ./mongo hostname:27017/admin, added both shards from the first and second machines, and enabled sharding at the database and collection level using a shard key.
Please let me know if I have gone wrong anywhere in the configuration.
Thanks in advance,
Your post could use some editing; it is very difficult to read.
It looks like you have 2 machines. On each machine you have:
mongod process serving as one shard
mongod process serving as a config
mongos process
a copy of your application connecting to localhost:27017/admin
"Please, let me know if i gone wrong anywhere in the configuration."
There are several possible problems here. Please check the following:
You can only have 1 or 3 config processes. It looks like you have 2; this will not work.
When you connect to localhost:27017/admin, are you connecting to mongos or mongod? Either one could be running on that port. Can you specify the ports for each process to help clarify? You must connect to mongos or the sharding will not happen.
Please look at the logs; they generally have output indicating what the server is doing. If there is no indication of "splits" or "chunks" happening, then your database may be configured incorrectly.
Your best bet is to start from the top and test each piece one at a time.