One Shard with Multiple Mongos - mongodb

Can we have this type of configuration?
Two servers, each running the following:
1. Mongo config server.
2. Mongo router.
3. Application.
Total of 4 EC2 servers:
First server: running the web application and mongos.
Second server: running the web application and mongos.
Third server: running the first shard with the complete DB (say, for example, Demo).
Fourth server: running the second shard with the complete DB (say, for example, Demo).
Should both mongos instances point to one shard named Shard1?

Yes, you can have multiple mongos instances running against a single shard. Think of the mongos instances as clients for the sharded cluster which have to run as a daemon process in order to keep metadata and heartbeats up to date.
Edit: as for having a complete DB, this is only possible for a single DB. You can have one DB on shard1 and the other DB on shard2, for example, but you can never have a single complete DB on two shards. To achieve the goal of having db1 on shard1 and db2 on shard2, you simply make the respective shard the primary shard of the respective database and don't shard any collection. Please read the docs for the movePrimary command for details.
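A minimal sketch of the placement described above. The database names (db1, db2) and shard names (shard1, shard2) are placeholders, and the helper function is hypothetical; only the movePrimary command document itself is the real command shape. Check the movePrimary docs for version-specific caveats before running it against real data.

```javascript
// Build the admin command document that makes `shardName` the primary
// shard of `dbName`. The command name must be the first key.
function movePrimaryCommand(dbName, shardName) {
  return { movePrimary: dbName, to: shardName };
}

// Desired layout: each unsharded database lives entirely on one shard.
// These names are placeholders, not values from a real cluster.
const placement = [
  { db: "db1", shard: "shard1" },
  { db: "db2", shard: "shard2" },
];

const commands = placement.map((p) => movePrimaryCommand(p.db, p.shard));

// In a mongo shell connected to a mongos, each command would be run as:
//   db.adminCommand(commands[0]);
```

Because no collection is sharded, each database stays whole on its primary shard.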
A bit OOT:
However, running a single config server is strongly advised against, and for good reason. If the single config server goes down or gets corrupted, your cluster will be impossible to use, and recreating the sharded cluster will not be an easy task. It is also going to be a lengthy process. So please, use three config servers.

Related

How do mongos instances work together in a cluster?

I'm trying to figure out how different instances of mongos server work together.
If I have 1 configserver and some shards, for example four, each of them composed by only one node (a master of course), and have four mongos server... do the mongos server communicate between them? Is it possible that one mongos redirect its load to another mongos?
When you have multiple mongos instances, they do not automatically load-balance between each other. They don't even know about each other's existence.
The MongoDB drivers for most programming languages allow you to specify multiple mongos instances when creating a connection. In that case the driver will usually ping all of them and connect to the one with the lowest latency. This will usually be the one which is closest geographically. When all have the same network distance, the one which is least busy right now will usually respond first. The driver will then stay connected to that one mongos, unless the program explicitly reconnects or the mongos can no longer be reached (in that case the driver will usually automatically pick another one from the initial list).
That means using multiple mongos instances is normally only a valid method for scaling when you have a large number of low-load clients, not one high-load client. When you want your one high-load client to make use of many mongos instances, you need to implement this yourself by creating a separate connection to each mongos instance and implement your own mechanism to distribute queries among them.
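One simple way to "implement your own mechanism" is round-robin selection. The sketch below is only an illustration: the host names are hypothetical, and plain strings stand in for what would be separate open driver connections in a real client.

```javascript
// Round-robin selector over a fixed list of mongos targets.
// In a real client each target would be an open driver connection;
// strings are used here so the sketch stays driver-independent.
class RoundRobin {
  constructor(targets) {
    if (targets.length === 0) throw new Error("need at least one target");
    this.targets = targets;
    this.index = 0;
  }

  // Return the next target and advance, wrapping at the end of the list.
  pick() {
    const target = this.targets[this.index];
    this.index = (this.index + 1) % this.targets.length;
    return target;
  }
}

// Hypothetical mongos hosts.
const routers = new RoundRobin(["mongos1:27017", "mongos2:27017", "mongos3:27017"]);
const picks = [routers.pick(), routers.pick(), routers.pick(), routers.pick()];
// picks cycles through the three hosts and wraps back to the first one.
```

Round-robin ignores how busy each mongos is; a client that needs better balancing would have to track in-flight requests or latency itself.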
Short answer
As of MongoDB 2.4, the mongos servers only provide a routing service to direct read/write queries to the appropriate shard(s). The mongos servers discover the configuration for your sharded cluster via the config servers. You can find out more details in the MongoDB documentation: Sharded Cluster Query Routing.
Longer scoop
I'm trying to figure out how different instances of mongos server work together.
The mongos servers do not currently talk directly to each other. They do coordinate some activity via your config servers:
reading the sharded cluster metadata
initiating a balancing round (any mongos can start a balancing round, but only one round can be active at a time)
If I have 1 configserver
You should always have 3 config servers in production. If you somehow lose or corrupt your config server, you will have to combine your data and re-shard your database(s). The sharded cluster metadata saved on the config servers is the definitive source for what sharded data ranges should live on each shard.
some shards, for example four, each of them composed by only one node (a master of course)
Ideally each shard should be backed by a replica set if you want optimal uptime. Replica sets provide for auto-failover and can be very useful for administrative purposes (for example, taking backups or adding indexes offline).
Is it possible that one mongos redirect its load to another mongos?
No, the mongos do not perform any load balancing. The typical recommendation is to deploy one mongos per app server.
From an application/driver point of view you can specify multiple mongos in your connect string for failover purposes. The application drivers will generally connect to the nearest available mongos (by network ping time), and attempt to reconnect in the event the current mongos connection fails.
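As an illustration of listing several mongos in one connect string, the hypothetical helper below joins the router addresses into the standard mongodb:// URI form. The host names and the database name "demo" are placeholders.

```javascript
// Build a MongoDB connection string that lists several mongos routers.
// Format: mongodb://host1:port1,host2:port2/database
function mongosUri(hosts, dbName) {
  return "mongodb://" + hosts.join(",") + "/" + dbName;
}

// Placeholder hosts and database name.
const uri = mongosUri(["mongos1:27017", "mongos2:27017"], "demo");
```

The resulting string is what would be handed to the driver's connect call; how the driver chooses among the listed hosts depends on the driver and its version.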

MongoDB : does it need 2 mongos per shard?

All in the title : do we need 2 mongos per shard in MongoDB ? I am not sure I understand exactly what mongos are for and if my website will communicate with them or if it is something internal to MongoDB.
If you have a cluster set up (with shards, not to be confused with a replica set), then you have to have mongos instances deployed. It's a router process. It knows which data resides where. The application talks to mongos, and it routes the request to the corresponding shard. Talking to shards directly is strongly discouraged.
You must have at least one mongos process. You can have more, they have a small resource footprint. I usually deploy one mongos per application server.
A mongos is basically nothing more than a router: it gathers the configuration of your cluster from the config servers, caches that config, and uses it to route targeted and scatter-gather operations within a cluster of shards. It can also be used for aggregation, so if aggregation queries are common in your app, the mongos can take some CPU and memory. For the most part, however, mongos processes are lightweight and can run on the smallest server.
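To make the difference between targeted and scatter-gather operations concrete, here is a deliberately simplified model of shard-key routing. It is not how mongos is actually implemented; the chunk table, the shard key name (userId) and the shard names are all made up for illustration.

```javascript
// Simplified cached chunk table. Each chunk covers the half-open range
// [min, max) of shard key values and lives on exactly one shard.
const chunks = [
  { min: -Infinity, max: 100, shard: "shard1" },
  { min: 100, max: 200, shard: "shard2" },
  { min: 200, max: Infinity, shard: "shard1" },
];

// Return the shards a query must visit. A query that pins the shard key
// to one value is targeted; a query without the shard key has to be sent
// to every shard (scatter-gather).
function route(query, shardKey, chunkTable) {
  if (shardKey in query) {
    const value = query[shardKey];
    const chunk = chunkTable.find((c) => value >= c.min && value < c.max);
    return [chunk.shard];
  }
  return [...new Set(chunkTable.map((c) => c.shard))].sort();
}

const targeted = route({ userId: 150 }, "userId", chunks);    // one shard
const scattered = route({ name: "alice" }, "userId", chunks); // every shard
```

In this model the scatter-gather case is what costs the router work, since results from several shards have to be merged before they are returned.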
You do not require 2 mongos; the number depends upon the operations being sent through that router. You can in theory make do with one, but that isn't very redundant and creates a single point of failure; running 2 makes an outage less likely.

How to determine the MongoDB server type

A MongoDB instance can have different roles:
Config server
Router (mongos)
Data server
Arbiter server (for replica sets)
I know that db.serverStatus() can be used to see if an instance is a router, the process value is mongos.
But for config servers, arbiters and data nodes the process value is mongod.
Is there a simple way of distinguishing between these instance types?
I want to bring attention to one particularly important issue with this question: sharding is a horizontal dimension (several replica sets across which data is distributed), and a replica set is a high-availability solution composed of different mongod nodes!
So what you are actually trying to figure out is:
ReplicaSet nodes roles
Shard Nodes members
In the case of a replica set, what you might be interested in knowing is each node's role. You can easily get that information without needing to connect to all the nodes of the replica set; just run the command:
db.isMaster()
with this you will get the node members and roles of each member.
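As a rough sketch of how the replies could be turned into a role, the hypothetical function below inspects the result of db.isMaster() together with db.serverCmdLineOpts(). It assumes that mongos reports msg: "isdbgrid", that arbiters report arbiterOnly: true, and that a config server shows up in the parsed startup options either as configsvr: true or as sharding.clusterRole: "configsvr". The exact shape of the parsed options varies by version, so treat those field paths as assumptions to verify. The sample replies are hand-written, not captured from a real server.

```javascript
// Classify a node from two command replies:
//   isMasterReply: result of db.isMaster()
//   cmdLineOpts:   result of db.serverCmdLineOpts()
function classifyNode(isMasterReply, cmdLineOpts) {
  if (isMasterReply.msg === "isdbgrid") return "mongos";
  if (isMasterReply.arbiterOnly === true) return "arbiter";
  // Assumed field paths for the config server flag; verify per version.
  const parsed = (cmdLineOpts && cmdLineOpts.parsed) || {};
  const role = parsed.sharding && parsed.sharding.clusterRole;
  if (parsed.configsvr === true || role === "configsvr") return "config server";
  return "data node";
}

// Hand-written sample replies, trimmed to the fields used above.
const samples = {
  router: classifyNode({ ismaster: true, msg: "isdbgrid" }, {}),
  arbiter: classifyNode({ ismaster: false, arbiterOnly: true, setName: "rs0" }, { parsed: {} }),
  config: classifyNode({ ismaster: true }, { parsed: { configsvr: true } }),
  data: classifyNode({ ismaster: true, setName: "rs0" }, { parsed: { shardsvr: true } }),
};
```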
For shard node members, first of all you should never try to connect directly to the config servers. They are there to manage the distribution of chunks, chunk splits and other configuration data, relevant only for the shard cluster functionality. Avoid connecting to those IPs from your application.
So if you want to have a clear view of which members compose your shard cluster, how many shards you have, etc., you need to run the command:
db.printShardStatus()
or
sh.status()
Please review the documentation here
Cheers,
N.

Convert a Shard Cluster to a Replicated Shard Cluster

I've been working with mongo for a few weeks and am building my environment in dev. I started with a single node, then moved to a shard cluster, and now want to move to a replicated shard cluster. From what I read, a Replicated Shard Cluster is the best of the best: scalability, durability, performance increase, etc.
I've read most of the (very good) tutorials in their help. It seems their lessons advise going from single node, to replica set, to sharded replica set, which, unfortunately is the opposite way I did it. I can't seem to find anything to upgrade a sharded cluster to a replicated shard cluster.
Here are 5 hosts that I have:
APPSERVER
CONFIGSERVER
SHARD1
SHARD2
SHARD3
I started each of the shard servers with:
mongod --shardsvr
Then I started the config server with:
mongod --configsvr
Then I started the mongos process on the APPSERVER with:
mongos --configdb CONFIGSERVER
Then in mongos, I added the shards, enabled sharding on my database, and defined a shardkey for a collection:
sh.addShard("SHARD1:27018");//again for 2 and 3
sh.enableSharding("beacon");
sh.shardCollection("beacon.alpha2", {"ip":1});
I want each of the shards replicated on each of the other two. (right?) Do I need to bring down the mongod processes on the shards and restart them with different CL parameters? What commands do I need to issue in my mongos shell to get it to replicate? Should I export all my data, take everything down, restart and reimport? Again, I see a lot of tutorials on how to create a replica set, but I don't really see anything on how to do a replica set given a sharded system to start with.
Thanks!
For each shard, you will need to restart the current member and start both it and two new members (or 1 new member and an arbiter) with the --replSet command line option. You could add more members than that, but 3 is the lowest workable set. Then from inside what will become the new primary (your current SHARD1 for example) you could do the following, initiating the set first and then adding the new members:
rs.initiate();
rs.add("newmember1:port")
rs.add("newmember2:port")
You would then need to check and make sure that the sh.status() has been updated to reflect the new members of the replica set. In 2.2 this has become slightly easier as it should be automatic, for prior versions it was necessary to manually save the shard information in the config database, which is reflected in the documentation under sharded cluster. If it has been automatically added you will see the replica set list in the sh.status() output, similar to the following:
{ "_id" : "shard1", "host" : "shard1/SHARD1:port,newmember1:port,newmember2:port" }
If this does not happen automatically you will need to follow the procedure outlined in the documentation link above, namely from inside mongos:
db.getSiblingDB("config").shards.save({_id:"<name>", host:"<rsName>/member1,member2,..."})
Following the above example it would look like:
db.getSiblingDB("config").shards.save({_id:"SHARD1", host:"shard1/SHARD1:port,newmember1:port,newmember2:port"})
You would need to do this procedure for each shard, and you should do them one at a time so that you have all of SHARD1 as a replica set before moving on to SHARD2. You will also need to be aware that each replica set will become read-only while the initial election takes place, so at the very least you should schedule this in a downtime or maintenance window. Ideally test first in a staging environment.
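The shard host string and the replica set configuration both follow a fixed pattern, so they can be generated per shard. The two helpers below are hypothetical conveniences. Port 27018 for the new members is an assumption carried over from the sh.addShard example in the question, and the explicit config document is an alternative to calling rs.initiate() with no arguments followed by rs.add().

```javascript
// Build a replica set configuration document of the form accepted by
// rs.initiate(): { _id: <setName>, members: [{ _id: <n>, host: <host> }] }.
function replicaSetConfig(setName, hosts) {
  return {
    _id: setName,
    members: hosts.map((host, i) => ({ _id: i, host: host })),
  };
}

// Build the "host" value stored for a replica-set shard:
//   <setName>/<host1>,<host2>,...
function shardHostString(setName, hosts) {
  return setName + "/" + hosts.join(",");
}

// Placeholder member list for the first shard (ports are assumed).
const hosts = ["SHARD1:27018", "newmember1:27018", "newmember2:27018"];
const config = replicaSetConfig("shard1", hosts);
const shardHost = shardHostString("shard1", hosts);
```

Generating both values from the same host list keeps the replica set membership and the entry in the config database from drifting apart when the procedure is repeated for SHARD2 and SHARD3.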

64-bit mongodb multiple-shards Issue

I am using 64-bit MongoDB and testing a setup with multiple shards. If I keep multiple shards on a single machine, it works fine, but if I keep the shards on different machines, sharding to the second shard fails. I have restricted the first shard's size to 10MB; once it reaches that limit, data should start going to the second shard, but this does not happen. Instead, storing to the second shard fails and the data goes to the first shard. The following are my shard details. In my environment I initially have two shards. The first shard is on my first machine, running along with my application. The second shard is on my second machine.
Configuration is as follows:
On both of my shards I run a shard server, a config server and mongos. I connected the mongo shell through mongos with ./mongo hostname:27017/admin, added both shards on the first and the second shard, and enabled sharding at the database and collection level using a shard key.
Please let me know if I have gone wrong anywhere in the configuration.
Thanks in advance,
Your post could use some editing, this is very difficult to read.
It looks like you have 2 machines. On each machine you have:
mongod process serving as one shard
mongod process serving as a config
mongos process
a copy of your application connecting to localhost:27017/admin
Please let me know if I have gone wrong anywhere in the configuration.
There are several possible problems here. Please check the following:
You can only have 1 or 3 config processes. It looks like you have 2, this will not work.
When you connect to localhost:27017/admin are you connecting to mongos or mongod? Either one could be running on those ports. Can you specify the ports for each process to help clarify? You must connect to mongos or the sharding will not happen.
Please look at the logs, they generally have output indicating what the server is doing. If there is no indication of "splits" or "chunks" happening, then your database may be configured incorrectly.
Your best bet is to start from the top and test each piece one at a time.
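The first two checks can be expressed as a small sanity test. This is only a sketch: the function and its input shape are hypothetical, the "1 or 3" rule is taken from the answer above, and the mongos check assumes that db.isMaster() run through a mongos returns msg: "isdbgrid".

```javascript
// Return a list of problems found in a described setup.
//   setup.configServerCount: number of config server processes
//   setup.isMasterReply:     result of db.isMaster() on the app's connection
function findSetupProblems(setup) {
  const problems = [];
  if (setup.configServerCount !== 1 && setup.configServerCount !== 3) {
    problems.push("config server count must be 1 or 3");
  }
  if (setup.isMasterReply.msg !== "isdbgrid") {
    problems.push("connection goes to a mongod, not to mongos");
  }
  return problems;
}

// The setup as described in the question: two config servers, and a
// connection on port 27017 that may in fact be a plain mongod.
const reported = findSetupProblems({
  configServerCount: 2,
  isMasterReply: { ismaster: true },
});
```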