MongoDB: How to perform sharding without replication?

I am trying to set up sharding across 2 machines, with the config server, the router, and 1 shard on machine A and another shard on machine B. I am finding this hard because I am a beginner and can't find much documentation or many tutorials online. I have started two mongod instances, one as the config server and another as a shard, but I am clueless about how to proceed.
Below is the sharding configuration in two of my mongod (config and shard ) conf files:
Config server:
sharding:
  clusterRole: configsvr
Shard:
sharding:
  clusterRole: shardsvr
As per the documentation, the next step is to execute the command rs.initiate(), but I don't require replication. I still tried to execute it just in case and received the error below:
{
"ok" : 0,
"errmsg" : "This node was not started with the replSet option",
"code" : 76,
"codeName" : "NoReplicationEnabled"
}
Is it mandatory to have replication while sharding? How to do sharding without replication within 2 machines?

That's not possible, see sharding Options:
Note
Setting sharding.clusterRole requires the mongod instance to be
running with replication. To deploy the instance as a replica set
member, use the replSetName setting and specify the name of the
replica set.
However, a replica set can have just one member; that is no problem.
Such a replica set consists of only the primary, which should work for this setup.
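As a minimal sketch of the single-member approach, assume the shard's mongod was started with a replSetName of shardA and is reachable at machineA:27018. Both names are hypothetical, as is the helper singleMemberConfig; only rs.initiate() and the document fields (_id, members, host, configsvr) come from MongoDB. The helper just builds the configuration document that rs.initiate() accepts:

```javascript
// Build a replica set configuration with exactly one member.
// setName must match the replSetName the mongod was started with.
function singleMemberConfig(setName, host, isConfigServer) {
  const config = {
    _id: setName,
    members: [{ _id: 0, host: host }],
  };
  // A config server replica set carries an extra configsvr: true flag.
  if (isConfigServer) {
    config.configsvr = true;
  }
  return config;
}

// Hypothetical set name and address, for illustration only.
const shardConfig = singleMemberConfig("shardA", "machineA:27018", false);
console.log(JSON.stringify(shardConfig));
// In the mongo shell connected to that mongod, one would then run:
//   rs.initiate(shardConfig)
```

The same would be repeated for the second shard on machine B and, with isConfigServer set to true, for the config server, each under its own replica set name.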

Related

Error in configuration of Mongodb sharded cluster

I have an error in the configuration of my MongoDB sharded cluster.
I tried all the variants rs.add("127.0.0.1:27002"), rs.add("localhost:27002") and rs.add("hostname:27002") for sharding.
But I am getting error:
{
"ok" : 0,
"errmsg" : "Either all host names in a replica set configuration must be localhost references, or none must be; found 1 out of 2",
"code" : 103
}
I assume that you are connected to your primary and are trying to add the secondary nodes. Connect to the instance with the mongo shell by typing
mongo localhost:30001
I suppose this is the primary. In the mongo shell for this primary, type this command:
rs.status()
This shows you the host name of your primary. Your secondary will have the same host name, with only the port number being different.
Once you have the name, just run rs.add("name:port_number") and the member will be added.
rs.add() is used to build a replica set, not a sharded cluster.
If you want to add a shard to a sharded cluster, use sh.addShard("host:port").
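The error message in the question says that the member list mixes localhost and non-localhost names ("found 1 out of 2"). A rough way to check a planned host list before calling rs.add() is sketched below. The helper names are hypothetical, and treating only "localhost" and 127.x.x.x addresses as localhost references is a simplifying assumption of this sketch, not the server's exact rule:

```javascript
// Simplified check: "localhost" and 127.x.x.x count as localhost references.
// (Assumption for this sketch; IPv6 and socket paths are not handled.)
function isLocalhostRef(hostAndPort) {
  const host = hostAndPort.split(":")[0].toLowerCase();
  return host === "localhost" || host.startsWith("127.");
}

// True when some, but not all, of the hosts are localhost references,
// which is the situation the replica set configuration rejects.
function mixesLocalhost(hosts) {
  const localCount = hosts.filter(isLocalhostRef).length;
  return localCount > 0 && localCount < hosts.length;
}

// "myhost" is a placeholder for the name the primary is registered under.
console.log(mixesLocalhost(["myhost:27001", "127.0.0.1:27002"])); // true
console.log(mixesLocalhost(["myhost:27001", "myhost:27002"]));    // false
```

If the primary is already registered under a real host name, adding the next member as 127.0.0.1 or localhost produces the mixed case, which is why using the name reported by rs.status() resolves it.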

Does a mongod Configserver also contain data (except metadata)?

I am getting started with MongoDB and cannot find the answer to the question.
For test purposes I want to create a cluster with 3 data nodes, but so far I am not sure how many machines I will need to start such a cluster. I want to have 2 routing servers in the cluster.
My current understanding is that I will need 4 machines.
Machine (Configserver and Routingserver): runs mongod --configsvr and mongos
Machine (Shard and Routingserver): runs mongod and mongos
Machine (Shard): runs only the mongod
Machine (Shard): runs only the mongod
So, as I understand it, a mongod --configsvr cannot be a shard at the same time?
In MongoDB, the config server will not store any data other than the metadata for a sharded cluster. If you manually connect to the config server and try to write data, you get this error:
WriteResult({
"nInserted" : 0,
"writeError" : {
"code" : 14037,
"errmsg" : "can't create user databases on a --configsvr instance"
}
})
Regarding the number of servers, each shard should run on its own machine. As you only have two shards, you can get away with 2 machines; however, 4 would be preferable so that each shard can be a replica set with a primary and a secondary. The config server and routing servers can run on any of those four machines, so 4 machines are enough.

MongoDB sharding: host does not belong to replica set

I am a Mongo newbie.
I am trying to spin up a MongoDB cluster with both sharding and replication. The cluster layout I want to implement is: https://github.com/ansible/ansible-examples/raw/master/mongodb/images/site.png
I am using server IP as replication set name. I.e. I am building replication sets with commands below:
rs.initiate()
rs.add("10.148.28.51:27118")
rs.add("10.148.28.52:27118")
rs.add("10.148.28.53:27118")
Replication is being configured correctly so when I am executing rs.status() on PRIMARY host 10.148.28.51 I am getting "10.148.28.51" as repl.set name: https://gist.github.com/daniilyar/630bc6fe7723ed06f243
But when I am trying to add shards at mongos instance it gives me 2 opposite errors (depending on what addShard() syntax variation I use):
mongos> sh.addShard("10.148.28.51:27118")
{
"ok" : 0,
"errmsg" : "host is part of set 10.148.28.51, use replica set url format <setname>/<server1>,<server2>,...."
}
mongos> sh.addShard("10.148.28.51/10.148.28.51:27118")
{
"ok" : 0,
"errmsg" : "in seed list 10.148.28.51/10.148.28.51:27118, host 10.148.28.51:27118 does not belong to replica set 10.148.28.51"
}
How do I add the shard if Mongo tells me both that "host X is in replica set Y" and that "host X does not belong to replica set Y" at the same time?
Any help would be greatly appreciated
From your description, it sounds like you need to tweak the way you are using the rs.add(...) command. You state that you are using the IP address as the name of the replica set, but this is not how rs.add(...) interprets its argument.
The argument you pass is the hostname (or IP) and port of the mongod instance you want to add to the replica set, not the replica set name. You set up this configuration while connected to the primary via the mongo shell. The replica set name is set when the primary is started:
mongod --replSet "rs1"
sets the replica set name to rs1.
I'd have a read over: http://docs.mongodb.org/manual/tutorial/convert-replica-set-to-replicated-shard-cluster/ as it covers pretty much what you appear to be trying to do.
I'd also consider what you are trying to achieve, as it sounds (from your description) like you may end up with a single replicated shard (!!!) when you are most probably looking to create multiple shards, each of which has its data replicated.
References:
rs.add command - http://docs.mongodb.org/manual/reference/method/rs.add/
sh.addShard command - http://docs.mongodb.org/manual/reference/method/sh.addShard/
Sharded Cluster - http://docs.mongodb.org/manual/core/sharded-cluster-components/
Thank you for the good explanation, now I understand. If you use rs.add("IP:port"), Mongo registers the replica set member under a name of the form ip-X-Y-Z-R:port. This seems to be Mongo's default behavior. So in my case the solution was to use the command:
sh.addShard("10.148.28.51/ip-10-148-28-51:27118")
instead of:
sh.addShard("10.148.28.51/10.148.28.51:27118")
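Since sh.addShard() rejects hosts that are not spelled exactly as they are in the replica set configuration, one way to avoid guessing is to build the seed string from that configuration itself. The sketch below uses a hypothetical helper, shardSeedFromConf, and a hand-written stand-in for rs.conf() output. The first member name is taken from the thread; the other two entries are my assumption about how the remaining members were registered:

```javascript
// Build "<setname>/<host1>,<host2>,..." from a document shaped like rs.conf().
function shardSeedFromConf(conf) {
  return conf._id + "/" + conf.members.map((m) => m.host).join(",");
}

// Stand-in for rs.conf(); the second and third hosts are assumed values.
const exampleConf = {
  _id: "10.148.28.51",
  members: [
    { _id: 0, host: "ip-10-148-28-51:27118" },
    { _id: 1, host: "10.148.28.52:27118" },
    { _id: 2, host: "10.148.28.53:27118" },
  ],
};

console.log(shardSeedFromConf(exampleConf));
// In the mongo shell on the primary, the real configuration could be used:
//   shardSeedFromConf(rs.conf())
// and the resulting string passed to sh.addShard(...) on the mongos.
```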

Setting up distributed MongoDB with 4 servers

I am supposed to set up MongoDB on 4 servers (a school project to benchmark MongoDB against others), so I am thinking of using only sharding, without replication. I set up the first 3 servers to run mongod --configsvr and the 4th as a normal mongod instance. Then I ran mongos on all servers. Now I am at the part where I run sh.addShard("...") and I get
{
"ok" : 0,
"errmsg" : "the specified mongod is a --configsvr and should thus not be a shard server"
}
It seems like I can't have a config server running as a shard too? How should I set things up, then?
No, the config server is not a shard. The config server is a special mongod instance that stores the sharded cluster metadata, essentially how the data is distributed across the sharded cluster.
You really only need 1 mongos instance for a simple setup like this. When you start mongos, you will provide it with the hostname(s) of your config server(s).
The Sharded Cluster Deployment Tutorial explains all of the steps that you need to follow.
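To make the mongos step concrete, here is a small sketch that assembles the command line for a single mongos. It assumes a MongoDB version in which the config servers form a replica set, so --configdb takes the form setName/host1:port,host2:port (older versions took a plain comma-separated host list instead). The set name cfgRS, the host names, and the helper mongosArgs are all hypothetical:

```javascript
// Assemble the argument list for starting one mongos.
// --configdb and --port are real mongos options; everything else is made up.
function mongosArgs(configSetName, configHosts, port) {
  return [
    "mongos",
    "--configdb",
    configSetName + "/" + configHosts.join(","),
    "--port",
    String(port),
  ];
}

// Hypothetical config server replica set and hosts.
const routerArgs = mongosArgs(
  "cfgRS",
  ["server1:27019", "server2:27019", "server3:27019"],
  27017
);
console.log(routerArgs.join(" "));
```

Shards are then added through that mongos with sh.addShard(), pointing only at mongod instances that were not started as config servers.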

Setup Shards: Should I install MongoDB on the following servers

While following the O'Reilly book Scaling MongoDB (page 27), I saw the following command:
Once you’re connected, you can add a shard. There are two ways to add
a shard, depending on whether the shard is a single server or a
replica set. Let’s say we have a single server, sf-02, that we’ve been
using for data. We can make it the first shard by running the addShard
command:
> db.runCommand({"addShard" : "sf-02:27017"})
{ "shardAdded" : "shard0000", "ok" : 1 }
Question 1>: What should be done on the server sf-02?
Should I also install MongoDB on it? If so, which package?
For example, if we had a replica set creatively named replica set “rs”
with members rs1-a, rs1-b, and rs1-c, we could say:
> db.runCommand({"addShard" : "rs/rs1-a,rs1-c"})
{ "shardAdded" : "rs", "ok" : 1 }
Question 2>: where is "rs" located?
Question 3>: Does rs1-a, rs1-c share the same machine?
Reply 1: You should run mongod with the --shardsvr option to start it as a shard server. Each shard server has to know that it will receive connections from a mongos (the shard router).
Reply 2: 'rs' is the name of a replica set. A set is just a group of machines (usually 3), so it is not located on a single machine; it is an abstract entity that represents the group of machines in the set.
Reply 3: No. For testing purposes you can run a replica set on a single machine, but the purpose of a replica set is failover. In production you should use a different machine for every member of the set.
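To illustrate reply 2, the addShard argument can be read as an optional replica set name, a slash, and a comma-separated list of member hosts. The sketch below splits the two example strings from the book excerpt accordingly. The helper parseShardSeed is hypothetical and only illustrates the naming; it is not how MongoDB itself parses the string:

```javascript
// Split an addShard argument into replica set name and host list.
// A string without "/" is treated as a single standalone server.
function parseShardSeed(seed) {
  const slash = seed.indexOf("/");
  if (slash === -1) {
    return { setName: null, hosts: [seed] };
  }
  return {
    setName: seed.slice(0, slash),
    hosts: seed.slice(slash + 1).split(","),
  };
}

console.log(JSON.stringify(parseShardSeed("rs/rs1-a,rs1-c")));
console.log(JSON.stringify(parseShardSeed("sf-02:27017")));
```

In the first case "rs" names the set as a whole, while rs1-a and rs1-c are the individual machines that belong to it.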