Need explanation on how select queries are handled by MongoDB

I have no issues running a select query on mongos in a sharded environment, but my question is:
If I have a setup with 2 shard servers and run a find query via the application layer, which part of the sharded environment is responsible for executing the query?
I am not able to see any change in any of the instances' consoles, and no new process is created. I tested this by executing 3000 find queries on a locally implemented sharding setup.
Can anybody explain where my understanding is wrong, or is it that find statements don't put load on the servers?
How does MongoDB handle select or read operations?
I understand this poorly.
Thanks in advance for responding.

When you connect to a mongod or mongos server via the shell (mongo), you won't see the queries that other clients run on that server. The shell is mainly there to execute queries, configure the database and check its status.
mongos is simply a router for the queries coming from the application: it forwards each query to the shards (mongod instances) that hold the relevant data, and those shards execute it.
To see the individual queries you'll need to check the log files, which are located on each server based on your configuration.
By default only slow queries (over 100 ms) are logged, so you will need to enable the profiler to log all queries.
You can read the MongoDB documentation pages on sharding for more info.
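As a rough illustration of the profiler step, here is a minimal Python sketch. It assumes the pymongo driver and a direct connection to one shard's mongod; the host name and database name are hypothetical. The `profile` database command itself is standard MongoDB: level 2 records all operations, level 1 only those slower than `slowms`.

```python
def build_profile_command(level=2, slowms=100):
    """Build the document for MongoDB's `profile` database command.

    level 0 = profiler off, 1 = only operations slower than slowms,
    2 = all operations.
    """
    if level not in (0, 1, 2):
        raise ValueError("profiling level must be 0, 1 or 2")
    # The command name must be the first key of the document.
    return {"profile": level, "slowms": slowms}


def enable_profiling(db, level=2, slowms=100):
    """Send the profile command through any object that has a
    pymongo-style `command` method and return the server's reply."""
    return db.command(build_profile_command(level, slowms))


def main():
    # Assumption: pymongo is installed and "shard1.example.com:27018" is one
    # of the shard mongod instances (hypothetical host and database names).
    from pymongo import MongoClient

    db = MongoClient("mongodb://shard1.example.com:27018")["company"]
    enable_profiling(db)
    # Profiled operations are stored in the system.profile collection.
    for op in db["system.profile"].find().sort("ts", -1).limit(5):
        print(op.get("op"), op.get("ns"), op.get("millis"))
```

Running `main()` against each shard in turn should show which mongod actually served the find queries. Profiling everything adds overhead, so set the level back to 0 once you are done.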

Related

Using --rest with MongoDB's mongos

I've created a small MongoDB v2.6.4 cluster with one mongos router, three config servers, and a handful of mongod data servers. It's working fine: I can do CRUD operations from the mongos client, which directs them to the appropriate shards.
I'm now wanting to experiment with the rest service.
When I tried this with a single MongoDB data server using mongod, it worked exactly as prescribed.
Even now, I can connect to each shard which is running with --rest and get a web response. But the results are a little unexpected. I see the overview, clients, dbtop, write lock, log, etc. And if I click on listDatabase, I actually see my database listed. However, any REST request I try returns a JSON object showing offset 0, no rows, a total of 0 rows, no query, and 0 milliseconds.
At the moment I'm of the opinion that this doesn't matter, as I've directly connected to a shard and would anticipate only seeing just the contents of that shard -- although further experimentation has been unable to produce even that result.
Again, I'll stress that in the current sharded configuration, mongos, NodeJS, and PHP all can see, query, and manipulate the data within my MongoDB collection just fine.
However, what I'm trying to do (if it's even possible) is connect to the mongos shard service via REST, since it does offer the --httpinterface option, figuring it will expose access to the shards and do the right thing.
It delivers the standard status page, but clicking on any links or hand rolling any REST results in a message about --rest not being enabled and that I should enable it.
REST is not enabled, use --rest to turn on.
check that port 28017 is secured for the network too.
Problem is, mongos doesn't allow that as an option! (mongod, however, does.)
It is entirely possible this is correct behavior, but I'm not sure how REST services would then work in a sharded configuration.
I'm not using a configuration file, specifying everything by the command line. That said, I tried passing a config file with RESTInterfaceEnabled: true in the yaml, and it didn't help.
mongos is returning a web interface, with the List all commands, db version, and log. It's missing the other diagnostic information (that's usually mongod specific), which I'd perhaps expect. But the links it offers don't work: they all ask me to turn on an option that isn't there.
I've manually checked each shard data server (running mongod with --shardsvr), and directly connecting to them does provide REST access.
Everything else about the sharded cluster is working perfectly. I just can't do REST and access my database or collections, like I can in a single-node unsharded setup.
What might I have missed? Is this even possible? Any ideas?
AFAIK the REST interface is only for monitoring/management, not for accessing documents. MongoDB does not provide an official REST interface for CRUD operations. For that you need to either create your own or find a library in the ecosystem; look at http://docs.mongodb.org/ecosystem/tools/http-interfaces/
Do you have specific requirements/needs in mind?
PS : I do not use the --rest parameter in any of my deployment/development.

Mongodb sharding is NOT recovered after power off accident

I'm running 4 vms (centos) on a single machine (Windows 2008 R2). The 4 vms are setup as below:
1 mongos
1 mongo config server
2 mongod as sharding servers
OK, everything was fine before a power-off accident. When the power came back, I manually rebooted all the VMs and found that the sharding settings were gone. I mean, mongos can talk to the config server, but somehow the sharding data is lost and it shows the database as not sharded.
I set up the sharding based on documents from the MongoDB website (e.g. running some commands in the mongo shell to enable sharding for the db and each collection). Do I need to run all the mongo shell commands again after rebooting? Or is it supposed to recover automatically once sharding is enabled?
Thanks.
Once you've established a sharded cluster, it is certainly supposed to stay configured, even if servers fail, even if they all fail at the same time. Restarting the servers should bring everything up just the way it was before the outage. Based on your description, it is difficult to reason what might have gone wrong. A dump of the config database, and the log files of all the affected servers, would be necessary to analyze what happened. This should perhaps be filed as a ticket with MongoDB support.
(It is, by the way, recommended to run three config servers rather than a single one, for availability reasons. But even so, even a single server should recover just fine after having failed. The three-server recommendation is only to make sure that there's always a live config server even if one of them should fail.)
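One way to check whether the sharding metadata survived the restart is to read it back from the config database through mongos. The following is only a sketch in Python: it assumes the pymongo driver, a hypothetical mongos address, and the `partitioned` and `dropped` fields that older releases (such as the 2.x series) store in `config.databases` and `config.collections`; newer releases may lay this metadata out differently.

```python
def sharded_databases(database_docs):
    """Names of databases whose config.databases entry is marked partitioned."""
    return sorted(d["_id"] for d in database_docs if d.get("partitioned"))


def sharded_collections(collection_docs):
    """Map each non-dropped sharded namespace to its shard key."""
    return {c["_id"]: c.get("key")
            for c in collection_docs if not c.get("dropped")}


def main():
    # Assumption: pymongo is installed and a mongos listens on this
    # hypothetical address.
    from pymongo import MongoClient

    config = MongoClient("mongodb://mongos.example.com:27017")["config"]
    print("sharded databases:", sharded_databases(config["databases"].find()))
    print("sharded collections:", sharded_collections(config["collections"].find()))
```

If both results come back empty after the reboot, the config server is probably not serving the data it held before the outage (for example, it may have been started with a different data directory), which is worth checking before re-running the setup commands.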

How to get the size currently used on each shard server in a MongoDB sharded cluster

I have a sharded cluster set up. Since my data is growing steadily, I need to keep monitoring the size of the data and add new shards to the cluster.
Is there a command I could use to find out how much space is used on each shard server at any point in time?
For example, let's say I have a database, and the show dbs command from the mongos console shows this:
mongos> show dbs
company 0.375GB
config 0.046875GB
test 0.0625GB
I want to know how much data is used on each shard server for the company database.
My implemented architecture is as follows:
I have a single sharded database, in which each collection is sharded.
3 shard servers running mongod instances
1 server running mongos
1 server running config instance
My whole application layer talks to mongos directly.
I need to know this because I am planning to build a cron job that checks the available size of each shard server and, if it exceeds some amount, sends a notification to the administrator.
Thanks in advance for responding to this post.
After posting in the MongoDB user group, I got the solution on how to do this and which commands can be used.
Commands
To find the space used by a particular database on each shard server, use
db.stats()
To find the space used by a particular collection on each shard server, use
db.<collectionname>.stats()
To use these in the PHP daemon/cron job, I can call the commands through the PHP mongo driver:
$con = new Mongo();
$stats = $con->dbName->command(array('dbStats' => 1)); // for db.stats()
$stats = $con->dbName->command(array('collStats' => 'collection_name')); // for db.<collectionname>.stats()
I still couldn't find any method to execute such commands from Zend Shanty Mongo, but I could use the default PHP PECL mongo driver to achieve this.
Thank you all for responding to this post.
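For the cron idea, the per-shard numbers have to be pulled out of the dbStats reply. Below is a minimal Python sketch of that step. It assumes the pymongo driver, a hypothetical mongos address, and that the reply obtained through mongos carries a `raw` sub-document keyed by shard with a `dataSize` field in each entry, which is how sharded clusters report it as far as I know. The 50 GB threshold is an arbitrary example value.

```python
def per_shard_data_size(db_stats):
    """Map each shard to its dataSize in bytes, read from the "raw"
    section of a dbStats reply obtained through mongos."""
    return {shard: stats.get("dataSize", 0)
            for shard, stats in db_stats.get("raw", {}).items()}


def shards_over_limit(db_stats, limit_bytes):
    """Sorted list of shards whose dataSize exceeds limit_bytes."""
    sizes = per_shard_data_size(db_stats)
    return sorted(shard for shard, size in sizes.items() if size > limit_bytes)


def main():
    # Assumption: pymongo is installed and a mongos listens on this
    # hypothetical address; "company" is the database from the question.
    from pymongo import MongoClient

    db = MongoClient("mongodb://mongos.example.com:27017")["company"]
    stats = db.command("dbStats")
    for shard in shards_over_limit(stats, 50 * 1024 ** 3):
        print("shard over limit:", shard)
```

A cron job could call `main()` and replace the `print` with whatever notification mechanism the administrator prefers.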
There are general monitoring solutions for that (nagios, zabbix, etc). They monitor many parameters of your machines and can be set up to send alerts in certain situations. You don't need to reinvent the wheel.
Such a general solution can also warn you if you're running out of space on an app server (because its logs take up all the space). Your specialized MongoDB cron job won't be able to do that.

Executing Mongo mapreduce jobs on a machine without mongo installed

I have a set of mapreduce jobs which I need to execute from my Java program. Right now I am executing them via a Java Process calling
$MONGO_HOME/bin/mongo host:port/database jsFiles
Is there a way I could execute these mapreduce tasks on a machine which does not have Mongo installed? Does the mongo Java driver support such functionality?
Thanks!
MongoDB MapReduce jobs are always run on the Mongo server, never in the client, and any client can send a job to the server.
@Chris Shain pointed you to the docs (http://api.mongodb.org/java/current/com/mongodb/MapReduceCommand.html), and I recommend you read them, but also understand that most MapReduce operations are all about reducing the huge volumes of data stored in your database down to smaller result sets. The closer this is done to where the data is actually stored, the better, and most people do not execute commands directly on the server. In order for the MapReduce operation to be useful, Mongo would have to (and did!) provide a way to use it from the client. For general strategies, see here: http://www.mongodb.org/display/DOCS/MapReduce
Note that because the operation runs on the server, you may notice increased lock percentage. Consider running the MapReduce job on a slave or secondary Mongo instance if this is a problem for you.
The Java client driver for Mongo has the MapReduceCommand, documented here: http://api.mongodb.org/java/current/com/mongodb/MapReduceCommand.html
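To make the "any client can send a job" point concrete, here is a small sketch of building and sending the `mapReduce` database command from a machine that only has a driver installed. It is written in Python with pymongo as an assumption; the Java driver's MapReduceCommand mentioned above plays the same role. The host, database, collection name and the two JavaScript functions are hypothetical, and passing the functions as plain strings is also an assumption (if your server rejects that, wrap them in `bson.Code`).

```python
# Hypothetical map and reduce functions; they are executed by the server.
MAP_JS = "function () { emit(this.status, 1); }"
REDUCE_JS = "function (key, values) { return Array.sum(values); }"


def build_map_reduce_command(collection, map_js, reduce_js, query=None):
    """Build a mapReduce command document that returns results inline."""
    cmd = {
        "mapReduce": collection,   # command name must be the first key
        "map": map_js,
        "reduce": reduce_js,
        "out": {"inline": 1},      # return results in the reply itself
    }
    if query is not None:
        cmd["query"] = query       # optional filter applied before map
    return cmd


def main():
    # Assumption: pymongo is installed; no mongo binary is needed locally.
    from pymongo import MongoClient

    db = MongoClient("mongodb://dbhost.example.com:27017")["database"]
    reply = db.command(build_map_reduce_command("orders", MAP_JS, REDUCE_JS))
    for row in reply["results"]:
        print(row["_id"], row["value"])
```

If the existing .js files contain full shell scripts rather than bare map and reduce functions, the functions would need to be extracted first. Recent MongoDB releases deprecate mapReduce in favour of the aggregation pipeline, so this mainly applies to the older versions discussed here.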

64-bit mongodb multiple-shards Issue

I am using 64-bit MongoDB and am testing with multiple shards. If I keep multiple shards on a single machine, it works fine, but if I keep the shards on different machines, sharding to the second shard fails. I have restricted the first shard's size to 10MB; once it reaches that limit, data should start going to the second shard, but that does not happen. Instead, writes fail to reach the second shard and keep going to the first shard. My shard details are as follows. In my environment I initially have two shards. The first shard is on my first machine, running alongside my application. The second shard is on my second machine.
Configuration as follows:
*) On both machines I run a shard server, a config server and mongos. I connected the mongo shell through mongos as follows: ./mongo hostname:27017/admin. I added both shards (the first and the second), and enabled sharding at the database and collection level using a shard key.
Please let me know if I have gone wrong anywhere in the configuration.
Thanks in advance,
Your post could use some editing; it is very difficult to read.
It looks like you have 2 machines. On each machine you have:
mongod process serving as one shard
mongod process serving as a config
mongos process
a copy of your application connecting to localhost:27017/admin
"Please, let me know if i gone wrong anywhere in the configuration."
There are several possible problems here. Please check the following:
You can only have 1 or 3 config processes. It looks like you have 2; this will not work.
When you connect to localhost:27017/admin, are you connecting to mongos or mongod? Either one could be running on that port. Can you specify the ports for each process to help clarify? You must connect to mongos or the sharding will not happen.
Please look at the logs; they generally have output indicating what the server is doing. If there is no indication of "splits" or "chunks" happening, then your database may be configured incorrectly.
Your best bet is to start from the top and test each piece one at a time.
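Two of these checks can be scripted: whether the process on port 27017 is really a mongos, and whether any chunks have moved to the second shard. The sketch below uses Python with pymongo, which is an assumption, as is the connection address. `isdbgrid` is a command that only mongos answers, and `config.chunks` stores one document per chunk with a `shard` field.

```python
from collections import Counter


def chunks_per_shard(chunk_docs):
    """Count chunks per shard from config.chunks documents."""
    return Counter(doc["shard"] for doc in chunk_docs)


def all_on_one_shard(counts):
    """True when chunks exist on at most one shard, i.e. nothing has
    been distributed yet."""
    return len(counts) <= 1


def main():
    # Assumption: pymongo is installed and the process under test
    # listens on localhost:27017.
    from pymongo import MongoClient
    from pymongo.errors import OperationFailure

    client = MongoClient("mongodb://localhost:27017")
    try:
        client.admin.command("isdbgrid")  # succeeds only on mongos
    except OperationFailure:
        raise SystemExit("connected to a mongod, not a mongos")

    counts = chunks_per_shard(client["config"]["chunks"].find({}, {"shard": 1}))
    print(dict(counts))
    if all_on_one_shard(counts):
        print("no chunks have been distributed to a second shard")
```

With only around 10MB of data it is quite possible that the collection still fits in a single chunk, in which case everything staying on the first shard would be expected rather than a sign of misconfiguration.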