Load Balancing Between Mongos

I have created a sharded environment with two mongos instances. Is there a way I can load balance between the two mongos? At present the Mongo client seems to use only one of them. Or do I have to write my own load balancer?

The recommendation is to run one mongos per application server rather than implementing your own load balancer.
A query may not return the whole result in one batch, in which case the mongos will store some information associated with the cursor. If subsequent requests to iterate using the cursor are not redirected to the same mongos, then you will get errors. A load balancer would need to understand the MongoDB binary wire protocol to guarantee that scenario is handled properly.
See:
http://craiggwilson.com/2013/10/21/load-balanced-mongos/
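The cursor problem described above can be illustrated with a small toy model. This is only a sketch: `FakeMongos`, `find`, and `get_more` are hypothetical names invented for the illustration, not part of MongoDB or any driver, and the real wire protocol is considerably more involved.

```python
import itertools


class FakeMongos:
    """Toy stand-in for a mongos: open cursors live only in this instance's memory."""

    def __init__(self, name):
        self.name = name
        self._cursors = {}              # cursor id -> iterator over remaining docs
        self._ids = itertools.count(1)  # hypothetical cursor id generator

    def find(self, docs, batch_size):
        """Open a cursor and return (cursor_id, first_batch)."""
        cursor_id = next(self._ids)
        self._cursors[cursor_id] = iter(docs)
        return cursor_id, self.get_more(cursor_id, batch_size)

    def get_more(self, cursor_id, batch_size):
        """Return the next batch, failing if this instance never opened the cursor."""
        if cursor_id not in self._cursors:
            raise LookupError(f"cursor {cursor_id} not found on {self.name}")
        return list(itertools.islice(self._cursors[cursor_id], batch_size))


router_a = FakeMongos("mongos-a")
router_b = FakeMongos("mongos-b")

cursor_id, first_batch = router_a.find([1, 2, 3, 4, 5], batch_size=2)
second_batch = router_a.get_more(cursor_id, batch_size=2)  # same router: works
# router_b.get_more(cursor_id, 2) would raise LookupError, which is the
# failure a protocol-unaware load balancer can trigger.
```

A load balancer that sends the follow-up request to `router_b` hits the error path, which is why requests for an open cursor have to stay on the mongos that created it.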

We had the same question. It looks like the Java Mongo client can handle failover between connections by itself.
Please see the answers to this question:
MongoDB load balancing and failover of query routers

Related

Routing process in MongoDB that hides from clients which server they are interacting with

I got the following MongoDB question in my test:
Which routing process in MongoDB hides from clients which server they are interacting with?
A. mongo.mongo
B. mongoDB.mongoDB
C. mongos.mongos
D. mongos
I couldn't find the answer anywhere. Which of the above options is correct?

It's mongos.
The mongos tracks what data is on which shard by caching the metadata from the config servers. The mongos uses the metadata to route operations from applications and clients to the mongod instances.
So the user doesn't need to know which server the data comes from. They only know that they received it from mongos.
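The routing idea can be sketched as a lookup over cached chunk ranges. This is a simplified illustration, not how mongos is actually implemented: the `ChunkRouter` class, the integer shard key, and the shard names are all assumptions made for the example.

```python
import bisect


class ChunkRouter:
    """Toy router: maps a shard key to a shard using cached chunk lower bounds."""

    def __init__(self, chunks):
        # chunks: iterable of (inclusive_lower_bound, shard_name) pairs,
        # standing in for the metadata cached from the config servers.
        ordered = sorted(chunks)
        self._bounds = [lower for lower, _ in ordered]
        self._shards = [shard for _, shard in ordered]

    def shard_for(self, key):
        """Return the shard whose chunk range contains the key."""
        index = bisect.bisect_right(self._bounds, key) - 1
        if index < 0:
            raise KeyError(f"no chunk covers key {key!r}")
        return self._shards[index]


# Hypothetical metadata: keys 0-99 on shard0, 100-199 on shard1, 200+ on shard2.
router = ChunkRouter([(0, "shard0"), (100, "shard1"), (200, "shard2")])
target = router.shard_for(150)  # "shard1"
```

The client only ever calls the router; which shard ends up serving the request stays hidden behind that lookup.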

Using --rest with MongoDB's mongos

I've created a small MongoDB v2.6.4 cluster with one mongos router, three config servers, and a handful of mongod data servers. It's working fine; I can do CRUD operations from the mongos client, which directs them to the appropriate shards.
I'm now wanting to experiment with the rest service.
When I tried this with a single MongoDB data server using mongod, it worked exactly as prescribed.
Even now, I can connect to each shard which is running with --rest and get a web response. But the results are a little unexpected. I see the overview, clients, dbtop, write lock, log, etc. And if I click on listDatabase, I actually see my database listed. However, any REST request I try returns a JSON object showing offset 0, no rows, a total of 0 rows, no query, and 0 milliseconds.
At the moment I'm of the opinion that this doesn't matter, as I've directly connected to a shard and would anticipate only seeing just the contents of that shard -- although further experimentation has been unable to produce even that result.
Again, I'll stress that in the current sharded configuration, mongos, NodeJS, and PHP all can see, query, and manipulate the data within my MongoDB collection just fine.
However, what I'm trying to do (if it's even possible) is connect to the mongos shard service via REST, since it does offer the --httpinterface option, figuring it will expose access to the shards and do the right thing.
It delivers the standard status page, but clicking on any links or hand rolling any REST results in a message about --rest not being enabled and that I should enable it.
REST is not enabled, use --rest to turn on.
check that port 28017 is secured for the network too.
The problem is that mongos doesn't allow that as an option! (mongod, however, does.)
It is entirely possible this is correct behavior, but I'm not sure how REST services would then work in a sharded configuration.
I'm not using a configuration file, specifying everything by the command line. That said, I tried passing a config file with RESTInterfaceEnabled: true in the yaml, and it didn't help.
mongos is returning a web interface, with the List all commands, db version, and log. It's missing the other diagnostic information (that's usually mongod specific), which I'd perhaps expect. But the links it offers don't work -- all of them ask me to turn on an option that isn't there.
I've manually checked each shard data server (running mongod with --shardsvr), and directly connecting to them does provide REST access.
Everything else about the sharded cluster is working perfectly. I just can't use REST to access my database or collections the way I can in a single-node unsharded setup.
What might I have missed? Is this even possible? Any ideas?

AFAIK the REST interface is only for monitoring and management, not for accessing documents. MongoDB does not provide an official REST interface for CRUD operations. For that you need to either create your own or find a library in the ecosystem; look at http://docs.mongodb.org/ecosystem/tools/http-interfaces/
Do you have specific requirements or needs in mind?
PS: I do not use the --rest parameter in any of my deployment or development environments.

How do mongos instances work together in a cluster?

I'm trying to figure out how different instances of mongos server work together.
If I have 1 config server and some shards, for example four, each of them consisting of only one node (a master, of course), and I have four mongos servers... do the mongos servers communicate with each other? Is it possible for one mongos to redirect its load to another mongos?

When you have multiple mongos instances, they do not automatically load-balance between each other. They don't even know about each other's existence.
The MongoDB drivers for most programming languages allow you to specify multiple mongos instances when creating a connection. In that case the driver will usually ping all of them and connect to the one with the lowest latency. This will usually be the one which is closest geographically. When all have the same network distance, the one which is least busy right now will usually respond first. The driver will then stay connected to that one mongos, unless the program explicitly reconnects or the mongos can no longer be reached (in that case the driver will usually automatically pick another one from the initial list).
That means using multiple mongos instances is normally only a valid method for scaling when you have a large number of low-load clients, not one high-load client. When you want your one high-load client to make use of many mongos instances, you need to implement this yourself by creating a separate connection to each mongos instance and implementing your own mechanism to distribute queries among them.
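One possible shape for such a do-it-yourself mechanism is a round-robin pool over one client object per mongos. This is a minimal sketch under stated assumptions: `RoundRobinPool` and `next_client` are hypothetical names, and the strings below stand in for real driver connections.

```python
import itertools
import threading


class RoundRobinPool:
    """Hands out one client per call, cycling through the mongos connections."""

    def __init__(self, clients):
        clients = list(clients)
        if not clients:
            raise ValueError("at least one client is required")
        self._cycle = itertools.cycle(clients)
        self._lock = threading.Lock()  # makes next_client safe across threads

    def next_client(self):
        """Return the client that should handle the next query."""
        with self._lock:
            return next(self._cycle)


# Placeholders for real connections, one per mongos (hypothetical host names).
pool = RoundRobinPool(["mongos-a:27017", "mongos-b:27017"])
picks = [pool.next_client() for _ in range(4)]
# picks alternates between the two entries, starting with "mongos-a:27017"
```

The choice should be made once per query, not once per batch: a cursor has to be iterated to the end on the same connection that opened it.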

Short answer
As of MongoDB 2.4, the mongos servers only provide a routing service to direct read/write queries to the appropriate shard(s). The mongos servers discover the configuration for your sharded cluster via the config servers. You can find out more details in the MongoDB documentation: Sharded Cluster Query Routing.
Longer scoop
I'm trying to figure out how different instances of mongos server work together.
The mongos servers do not currently talk directly to each other. They do coordinate some activity via your config servers:
reading the sharded cluster metadata
initiating a balancing round (any mongos can start a balancing round, but only one round can be active at a time)
If I have 1 configserver
You should always have 3 config servers in production. If you somehow lose or corrupt your config server, you will have to combine your data and re-shard your database(s). The sharded cluster metadata saved on the config servers is the definitive source for what sharded data ranges should live on each shard.
some shards, for example four, each of them composed by only one node (a master of course)
Ideally each shard should be backed by a replica set if you want optimal uptime. Replica sets provide for auto-failover and can be very useful for administrative purposes (for example, taking backups or adding indexes offline).
Is it possible that one mongos redirect its load to another mongos?
No, the mongos do not perform any load balancing. The typical recommendation is to deploy one mongos per app server.
From an application/driver point of view you can specify multiple mongos in your connection string for failover purposes. The application drivers will generally connect to the nearest available mongos (by network ping time), and attempt to reconnect in the event the current mongos connection fails.
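As a rough sketch of the above: the standard connection string format lists the mongos hosts separated by commas, and the selection logic can be modeled as "pick the lowest ping, stay there, pick again on failure". The helper names (`mongos_uri`, `MongosSelector`), the host names, and the `ping` callable are assumptions for illustration; real drivers implement their own selection and may differ in detail.

```python
def mongos_uri(hosts, database=""):
    """Build a mongodb:// connection string listing several mongos hosts."""
    if not hosts:
        raise ValueError("at least one host is required")
    return "mongodb://" + ",".join(hosts) + "/" + database


class MongosSelector:
    """Toy model of the behavior described above, not a driver implementation."""

    def __init__(self, hosts, ping):
        self._hosts = list(hosts)
        self._ping = ping        # callable: host -> latency in ms, or None if down
        self._current = None

    def current(self):
        """Return the pinned mongos, selecting the nearest one if none is pinned."""
        if self._current is None:
            self._current = self._pick_nearest()
        return self._current

    def mark_failed(self):
        """Drop the pinned mongos so the next call selects again."""
        self._current = None

    def _pick_nearest(self):
        reachable = []
        for host in self._hosts:
            latency = self._ping(host)
            if latency is not None:
                reachable.append((latency, host))
        if not reachable:
            raise ConnectionError("no mongos reachable")
        return min(reachable)[1]


hosts = ["app1.example.net:27017", "app2.example.net:27017"]  # hypothetical
uri = mongos_uri(hosts, "test")
# uri == "mongodb://app1.example.net:27017,app2.example.net:27017/test"
# A driver would normally take this string directly, e.g. pymongo.MongoClient(uri).
```

In practice the driver does all of this for you; the only thing the application has to supply is the list of mongos hosts.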

MongoDB : does it need 2 mongos per shard?

It's all in the title: do we need 2 mongos per shard in MongoDB? I am not sure I understand exactly what mongos are for, and whether my website will communicate with them or whether they are something internal to MongoDB.

If you have a cluster set up (with shards, not to be confused with a replica set), then you have to have mongos instances deployed. A mongos is a router process. It knows which data resides where. The application talks to mongos, and mongos routes the request to the corresponding shard. Talking to shards directly is strongly discouraged.
You must have at least one mongos process. You can have more; they have a small resource footprint. I usually deploy one mongos per application server.

A mongos is basically nothing more than a router which gathers the configuration of your cluster from the config servers, caches that config, and uses it to route targeted and scatter-gather operations within a cluster of shards. It can also be used for aggregation, so if aggregation queries are common in your app the mongos can take some CPU and memory. For the most part, however, mongos instances are lightweight and can run on the smallest server.
You do not require 2 mongos; the number depends upon the operations being sent through that router. You can in theory make do with one, but that isn't very redundant and creates a single point of failure. Having 2 makes an outage less likely.

MongoDB load balancing

I am looking for the best load balancing option for concurrent users with MongoDB. I have looked at master-slave replication but don't think it will load balance. Are there any open source DB load balancers for MongoDB?
I have looked at Sequoia but looks like that project is no longer actively supported.
Please note: the data is not very large, and this is not a use case for sharding.

Both master-slave setups and replica sets will load balance reads in MongoDB if you set slaveOK in your driver.
When slaveOK is enabled, MongoDB drivers direct all reads to secondaries/slaves.
This provides relatively effective read balancing; for write balancing your only option would be sharding.
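In current drivers the slaveOK flag has been superseded by read preferences, which can be set in the connection string. The sketch below builds such a string for a replica set; the helper name `replica_set_uri`, the host names, and the set name `rs0` are hypothetical, and `secondaryPreferred` is just one of the available modes.

```python
from urllib.parse import urlencode


def replica_set_uri(hosts, replica_set, read_preference="secondaryPreferred"):
    """Build a connection string that lets reads go to secondaries."""
    if not hosts:
        raise ValueError("at least one host is required")
    options = urlencode({
        "replicaSet": replica_set,
        "readPreference": read_preference,
    })
    return "mongodb://" + ",".join(hosts) + "/?" + options


# Hypothetical three-member replica set.
uri = replica_set_uri(
    ["db1.example.net:27017", "db2.example.net:27017", "db3.example.net:27017"],
    "rs0",
)
```

Reads from secondaries can lag behind the primary, so this only suits queries that tolerate slightly stale data.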