Setting up Sharding - mongodb

I am new to mongodb, after install hortonworks HDP cluster and embedded mongodb with 3 nodes at HDP cluster.
now, try to setup shardings with mongodb. I tried few things and executed few steps. when I mongo, I saw these 3 servers have shard0:PRIMARY, shard1:SECONDARY> and shard1:SECONDARY>
Q1. did this mean I have sharding working?
Q2. if this is not right, how to remove all settings and back to a initial settings?

Answer is simple. Yes, you have managed to start Replica Set, not cluster.
Cluster is group of mongod nodes or replica sets where sharded collections exists. Sharding is like oracle partitioning.
If you are building cluster, it's better to do it with replica sets, so every shard in cluster is one replica set of (at least) three nodes.

Related

What is the difference between a cluster and a replica set in mongodb atlas?

I'm taking the Mongodb University M103 course and over there they gave a brief overview of what a cluster and a replica set is.
From my understanding a cluster is a set of servers or nodes. While a replica set is a set of servers or nodes all of which has replication mechanism built into each of them for downtime access and faster read operation.
From that it seems that replica set is a specific type of cluster, but my confusion arises from MongoDB Atlas. In mongoDB atlas one has to create a cluster, is that a replica set as well?
Are those terms interchangeable in all scenarios?
Replica Set
In MongoDB, a replicaset is a concept that depicts a set of MongoDB server working in providing redundancy (all the servers in the replica set have the same data) and high availability (if a server goes down, the remaining servers can still fulfil requests). When you create a replicaset, you need a minimum of 3 servers. There will always be a primary (read and write) and the remaining are called secondaries (for reading only).
MongoDB Atlas Cluster
Atlas is a DaaS, meaning a database a service company. They remove the burdain of maintaining, scaling and monitoring MongoDB servers on premise, so that you can focus on your applications.
An Atlas MongoDB cluster is a set of configuration you give to Atlas so it can setup the MongoDB servers for you. Hence, a MongoDB ReplicaSet is a feature subset in Atlas.
For example, while creating an Atlas Cluster, they will ask you whether you want a replicaset, sharded cluster, etc. Also, in which cloud provider you want to deploy. Your backup policy, the specs of your MongoDB hardware and more...
The keyword here is configuration. At the end of the day, you will have your MongoDB servers (replicaset or not) up and ready.
Summary
MongoDB Cluster
A specific configuration set of MongoDB servers to provide specific
features. i.e. replicaset and sharding.
MongoDB Replicaset
A MongoDB cluster setup to provide redundancy and high
availability with 3 or more odd number of servers (3, 5, 7, etc.)
MongoDB Atlas Cluster
High level MongoDB cluster configuration that allows you to set a
replicaset or other type of MongoDB cluster with its location and performance range.
I would suggest you to play with their web console. You will definitely see the difference.

Convert an existing Replica Set In Mongodb to a Sharded Cluster

I have a replica set with 3 nodes. Is there a way to convert this existing Replica Set to a Sharded Cluster. Please find below the config file of the replica set:
All of my 3 nodes in replica set have sharding enabled:-
sharding:
clusterRole: shardsvr
sharding is enabled at the database level ....and then implemented at the collection level. any operating replica set can work in a sharding environment but there wouldn't be a compelling reason to do sharding until you have multiple replica clusters working in parallel... it's a big decision - no going back...and you need to be very clear on your sharding key.... so definitely invest time in learning the technology at the MongoDB University.

Is it possible to create Replica Set with Sharded Cluster (mongos) and Single Server (mongod)?

I want to make continuous backup of my Sharded Cluster on a single MongoDB server somewhere else.
So, is it possible to create Replica Set with Sharded Cluster (mongos instance) and single MongoDB server?
Did anyone experience creating Replica Sets with two Sharded Clusters or with one Sharded Cluster and one Single Server?
How does it work?
By the way, the best (and for now, the only) way to continuously backup Sharded Cluster is by using MongoDB Management Service (MMS).
I were also facing the same issue sometime back. I wanted to take replica of all sharded cluster into one MongoDB. But, i didn't found any solution for this scenario, and I think this is true, because -
If you configure the multiple shard server (say 2 shard server) with
one replica set, then this will not work because in a replica set (say
rs0) only 1-primary member is possible. And In this scenario, we will
have multiple primary depend on number of shard server.
To take the backup of your whole sharded cluster, you must stop all writes to the cluster. You can refer to MongoDB documentation on this - http://docs.mongodb.org/manual/tutorial/backup-sharded-cluster-with-database-dumps/

mongodb failover Demonstration! Help needed

Here is a newbie trying to play around Mongodb. I am trying to demonstrate scaling in my class, meaning, I need to show that I have 2 instances of mongoDB up and running and I need to replicate them, set one as master and the other as secondary.
Can any of you suggest me a simple way to demonstrate that if primary/master fails the slave/secondary comes up as the master?
Please keep it as simple as possible as I am teaching to a fairly beginners of MongoDB
MongoDB replica sets are not master/slave. In order to achieve automatic failover you need to have a majority of nodes in the replica set able to elect a new primary. The minimum number of nodes in your replica set should be 3, which can either be 3 data-bearing nodes or 2 data-bearing nodes and an arbiter, which is a node that votes in elections.
A demo using replication alone is more about failover and redundancy than scaling (better demo'd with sharding).
If you want a very simple (and non-production) way to stand up a replica set or sharded cluster in a development environment, I would suggest using the mlaunch script which is part of mtools.
For example, to create a 3-node replica set with an arbiter:
mlaunch --replicaset --nodes 2 --arbiter
To create a sharded cluster with 3 shards backed by a replica set (plus mongos and config server):
mlaunch --replicaset --sharded 3
As mentioned in the other comments here, the free MMS Monitoring service is a good way to visualise activity in your MongoDB deployment, and you can use db.shutdownServer() to shutdown specific nodes to see the outcome.
The easiest way would be to set up the MongoDB monitoring service. Stop the MongoD process on one and watch the other take over. But, use replica sets rather than master/secondary as they are the recommended approach.
Actually, it is pretty easy
Set up a replica set with 2 "normal" mongods and an arbiter
Connect to both of the normal mongods using mongo
Show the output of rs.status(). (Note the selffield)
Shut down the current primary
Show the output of rs.status() again and again, until the former secondary is elected primary
Another option would be to write a simple java app which utilizes the driver, put it in an infinite loop which writes one entry every second and puts out the number of objects in the database. Catch exceptions and write out that a problem occurred. Start the sharded cluster, then start your application. Shut down the primary while the program is running. during the elections, there may be exceptions be thrown. As soon as the former secondary is elected primary, the document count should start to rise again.

Convert a Shard Cluster to a Replicated Shard Cluster

I've been working with mongo for a few weeks and and building my environment in a dev. I started with a single node, then moved to a shard cluster, and now want to move to a replicated shard cluster. From what I read a Replicated Shard Cluster is the best of the best, scalability, durability, performance increase, etc.
I've read most of the (very good) tutorials in their help. It seems their lessons advise going from single node, to replica set, to sharded replica set, which, unfortunately is the opposite way I did it. I can't seem to find anything to upgrade a sharded cluster to a replicated shard cluster.
Here are 5 hosts that I have:
APPSERVER
CONFIGSERVER
SHARD1
SHARD2
SHARD3
I started each of the shard servers with:
mongod --shardsvr
Then I started the config server with:
mongod --configsvr
Then I started the mongos process on the APPSERVER with:
mongos --configdb CONFIGSERVER
Then in mongos, I added the shards, enabled sharding on my database, and defined a shardkey for a collection:
sh.addShard("SHARD1:27018");//again for 2 and 3
sh.enableSharding("beacon");
sh.shardCollection("beacon.alpha2", {"ip":1});
I want each of the shards replicated on each of the other two. (right?) Do I need to bring down the mongod processes on the shards and restart them with different CL parameters? What commands do I need to issue in my mongos shell to get it to replicate? Should I export all my data, take everything down, restart and reimport? Again, I see a lot of tutorials on how to create a replica set, but I don't really see anything on how to do a replica set given a sharded system to start with.
Thanks!
For each shard, you will need to restart the current member and start both it and two new members (or 1 new member and an arbiter) with the --replset command line option. You could add more members than that, but 3 is the lowest workable set. Then from inside what will become the new primary (your current SHARD1 for example) you could do the following:
rs.add("newmember1:port")
rs.add("newmember2:port")
rs.initiate();
You would then need to check and make sure that the sh.status() has been updated to reflect the new members of the replica set. In 2.2 this has become slightly easier as it should be automatic, for prior versions it was necessary to manually save the shard information in the config database, which is reflected in the documentation under sharded cluster. If it has been automatically added you will see the replica set list in the sh.status() output, similar to the following:
{ "_id" : "shard1", "host" : "shard1/SHARD1:port,newmember1:port,newmember2:port" }
If this does not happen automatically you will need to follow the procedure outlined in the documentation link above, namely from inside mongos:
db.getSiblingDB("config").shards.save({_id:"<name>", host:"<rsName>/member1,member2,..."})
Following the above example it would look like:
db.getSiblingDB("config").shards.save({_id:"SHARD1", host:"shard1/SHARD1:port,newmember1:port,newmember2:port"})
You would need to do this procedure for each shard, and you should do them one at a time so that you have all of SHARD1 as a replica set before moving on to SHARD2. You will also need to be aware that each replica set will become read-only while the initial election takes place, so at the very least you should schedule this in a downtime or maintenance window. Ideally test first in a staging environment.