How to shift internal communication of nodes in a MongoDB cluster to another network to decrease the load on the main network

I have created an 8-node MongoDB cluster with 2 shards + 2 replicas (1 for each shard) + 3 config servers + 1 mongos.
All of these are on the 192.168.1.x network (eth0) together with the application server, so this network is handling all the traffic.
I have therefore created another network, 192.168.10.x (eth1), which contains only these 8 MongoDB nodes.
Now all eight nodes are part of both networks, with dual IPs.
I want to shift the internal traffic between these MongoDB nodes to the 192.168.10.x network (eth1) to reduce the load on the main network 192.168.1.x (eth0).
How do I bind the ports/nodes for this purpose?

You can use bind_ip as a startup or configuration option. Keep in mind that the various nodes still need to be reachable by each other in the event of failover.
A notable concern here is your single mongos: it is advisable to either co-locate a mongos on each app server or, depending on requirements, have a pool of them available to your driver connection. Preferably both, with a larger instance for each mongos where aggregation operations are used.
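A minimal sketch of the binding (the addresses 192.168.1.11 and 192.168.10.11 are hypothetical stand-ins for one node's eth0/eth1 IPs; adjust ports and paths to your layout). Each mongod can listen on both interfaces so it stays reachable from the application network while cluster peers address it via eth1:

# Listen on localhost plus both the eth0 and eth1 addresses of this node:
mongod --shardsvr --replSet shard1 --port 27018 \
    --bind_ip 127.0.0.1,192.168.1.11,192.168.10.11 \
    --dbpath /data/shard1

Note that which network the internal traffic actually crosses is then determined by the addresses the members use to reach each other (i.e. what you put in the replica set config and the mongos --configdb string), not by bind_ip alone.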

I found the solution to the problem I was looking for: I configured the cluster using the IPs of the 192.168.10.x network (eth1).
Now the internal data traffic is going through this network.
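A hedged sketch of that kind of reconfiguration (the ports, member indexes, and 192.168.10.x addresses are placeholders for your own layout), run against the primary of each shard's replica set:

# Repoint the replica set members at their eth1 addresses so that
# replication traffic moves off the main network:
mongo --port 27018 --eval '
var cfg = rs.conf();
cfg.members[0].host = "192.168.10.11:27018";
cfg.members[1].host = "192.168.10.12:27018";
rs.reconfig(cfg);
'

The config server addresses handed to mongos via --configdb need the same treatment for the config traffic to follow.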

Related

Does a Mesos slave need to be contacted by a Mesos master?

Can Apache Mesos 'slave' nodes be located on a separate network than the Mesos 'master' nodes? Similarly (for high-availability (HA) deploys), can the Apache Zookeeper nodes used in Mesos 'master' election be deployed on a separate network than the Mesos 'slave' nodes?
Currently, I have 3 master+slave nodes in the cloud, and I want to add a slave installed on my local subnet.
If such a setup is feasible, what are the pros/cons of such a setup?
I think https://www.stratio.com/blog/mesos-multi-data-center-architecture-for-disaster-recovery/ is a nice read on several of the things you need to make this work, including some scenarios on how to handle things when a DC is down.
Pros:
You can fail over in case of a DC being down or unreachable.
Cons:
Both DCs must be able to run the environment by themselves (active, or you should be able to scale up fast), which creates overhead costs.
Complexity increases (network, Mesos/application configuration).
About the networks: they must be able to connect somehow, either publicly (encrypted and firewalled; I also think every node would need a public IP) or via an IPsec tunnel or another of the options the link mentions.
I don't think going over the internet without tunneling (the first option I mention) is a very good idea.

MongoDB nodes (AWS EC2 instances) are still responsive even after a network partition created using Security Groups

I have created a MongoDB replica set using 5 EC2 instances on AWS. I added the nodes using the rs.add("[IP_Address]") command.
I want to create a network partition in the replica set. To do that, I have specified 2 kinds of security groups: 'SG1' has port 27017 (the MongoDB port) opened, while 'SG2' does not expose 27017.
I want to isolate 2 nodes from the replica set. When I apply SG2 to these 2 nodes (EC2 instances), they should ideally stop receiving reads and writes from the primary, as I am blocking port 27017 using security group SG2. But in my case they are still writable: data written on the primary is reflected on the partitioned nodes. Can someone help? TYA.
Most firewalls, including AWS security groups, block incoming connections at the time the connection is opened. Changing the settings affects all new connections, but existing open connections are not re-evaluated when the new rules are applied.
MongoDB maintains long-lived connections between hosts, so those would only be blocked after a loss of connectivity between the hosts.
On Linux you can restart networking, which resets the existing connections. After applying the new rules, run:
/etc/init.d/networking stop && /etc/init.d/networking start
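Alternatively, a sketch assuming the default port 27017: you can drop the MongoDB traffic on each instance you want to partition with iptables. Unlike security group changes, packet-filter DROP rules affect already-established connections immediately:

# Block traffic to the local mongod and replies coming back from
# remote mongods; existing replica set connections go dark at once:
iptables -I INPUT -p tcp --dport 27017 -j DROP
iptables -I INPUT -p tcp --sport 27017 -j DROP

# Heal the partition by deleting the same rules:
iptables -D INPUT -p tcp --dport 27017 -j DROP
iptables -D INPUT -p tcp --sport 27017 -j DROP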

Where to put mongos on AWS to ensure high availability

I have a mongos to route my queries to two different mongo clusters running on two different EC2 instances, so that if one EC2 instance goes down, I have a backup.
The challenge is: where should I put my mongos query router? I do not want to put it on a third EC2 instance upstream, because EC2 instances can fail and break; I've had this happen to me, and EC2 instances do not recover on their own and spin themselves back up, right? If the EC2 instance that my mongos query router is on goes down, then all the redundancy built upstream for high availability becomes irrelevant.
So is there another Amazon service (like an EC2 instance) that is small, would be dedicated only to one server (a mongos query distributor), can spin itself up again if it goes down due to hardware failures, and can auto-grow its own RAM and disk space to give the mongos query router more resources as software consumes system resources?
It looks like EC2 instances can now auto-recover; see
https://aws.amazon.com/blogs/aws/new-auto-recovery-for-amazon-ec2/
So having a mongos query router on a small EC2 instance with auto-scaling and auto-recovery should be safe.
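A hedged sketch of wiring that up (the instance ID and region are placeholders): auto-recovery is just a CloudWatch alarm on the EC2 system status check with a recover action:

# Recover the mongos instance when the system status check fails
# for two consecutive one-minute periods:
aws cloudwatch put-metric-alarm \
    --alarm-name mongos-auto-recover \
    --namespace AWS/EC2 \
    --metric-name StatusCheckFailed_System \
    --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
    --statistic Minimum \
    --period 60 \
    --evaluation-periods 2 \
    --threshold 0 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions arn:aws:automate:us-east-1:ec2:recover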
Also, though not an out-of-the-box solution for EC2 instances, it looks like you can now also scale up the RAM and disk size of your EC2 instance by using custom CloudWatch alarms to trigger these actions; see
http://aws.amazon.com/code/8720044071969977

MongoDB sharding: mongos and configuration servers together?

We want to create a MongoDB shard (v2.4). The official documentation recommends having 3 config servers.
However, our company's policies won't allow us to get 3 extra servers for this purpose. Since we already have 3 application servers (1 web node, 2 process nodes), we are considering putting the config servers on those same application servers, alongside the mongos. Availability is not critical for us.
What do you think about this configuration? Could we face problems, or is it discouraged for some reason?
Given that availability is not critical for your use case, it should be fine to place the config servers on the same servers as the applications and mongos.
If one of the process nodes is down, you will lose 1 mongos, 1 application server, and 1 config server. During this downtime, the other two config servers will be read-only, which means there won't be balancing of shards, modifications to the cluster config, etc. Your other two mongos should still be operational (CRUD-wise). If your web node is down, you have a bigger problem to deal with.
If two of the nodes are down (2 process nodes, or 1 web server and 1 process node), again you would have a bigger problem to deal with, i.e. your applications are probably not going to work anyway.
Having said that, please consider whether these nodes have the capacity to handle a mongos, an application server, and a config server: CPU, RAM, network connections, etc.
I would recommend testing the deployment architecture in a development/staging cluster first, under your typical workload and use case.
Also see Sharded Cluster High Availability for more info.
Lastly, I would recommend checking out MongoDB v3.2, which is the current stable release. The config servers in v3.2 are modelled as a replica set; see Sharded Cluster Config Servers for more info.
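A minimal sketch of that co-located layout (the hostnames app1/app2/app3 are hypothetical; the ports are the conventional ones):

# On each of the three application servers, run a config server:
mongod --configsvr --dbpath /data/configdb --port 27019

# Also on each application server, run a mongos pointed at all three
# config servers (the v2.4-style comma-separated --configdb string):
mongos --configdb app1:27019,app2:27019,app3:27019 --port 27017

With v3.2's replica set config servers, the --configdb argument becomes a replica set string such as configReplSet/app1:27019,app2:27019,app3:27019 instead.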

How do mongos instances work together in a cluster?

I'm trying to figure out how different instances of the mongos server work together.
If I have 1 config server and some shards, for example four, each composed of only one node (a master, of course), and I have four mongos servers... do the mongos servers communicate with each other? Is it possible for one mongos to redirect its load to another mongos?
When you have multiple mongos instances, they do not automatically load-balance between each other. They don't even know about each other's existence.
The MongoDB drivers for most programming languages allow you to specify multiple mongos instances when creating a connection. In that case the driver will usually ping all of them and connect to the one with the lowest latency. This will usually be the one which is closest geographically; when all have the same network distance, the one which is least busy right now will usually respond first. The driver will then stay connected to that one mongos, unless the program explicitly reconnects or the mongos can no longer be reached (in which case the driver will usually pick another one from the initial list automatically).
That means using multiple mongos instances is normally only a valid method for scaling when you have a large number of low-load clients, not one high-load client. When you want one high-load client to make use of many mongos instances, you need to implement this yourself by creating a separate connection to each mongos instance and implementing your own mechanism to distribute queries among them.
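For example (hostnames hypothetical), drivers accept such a list of mongos instances as a single connection string and handle the ping-and-pick logic described above themselves:

mongodb://mongos1.example.com:27017,mongos2.example.com:27017,mongos3.example.com:27017/mydb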
Short answer
As of MongoDB 2.4, the mongos servers only provide a routing service to direct read/write queries to the appropriate shard(s). The mongos servers discover the configuration for your sharded cluster via the config servers. You can find out more details in the MongoDB documentation: Sharded Cluster Query Routing.
Longer scoop
I'm trying to figure out how different instances of the mongos server work together.
The mongos servers do not currently talk directly to each other. They do coordinate some activity via your config servers:
reading the sharded cluster metadata
initiating a balancing round (any mongos can start a balancing round, but only one round can be active at a time; see the sketch below)
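As an illustration (era-appropriate for the 2.4-style cluster described here), you can see which mongos currently holds the balancer lock by querying the config database through any mongos:

# The "who" field of the balancer lock names the mongos that acquired it:
mongo --eval 'printjson(db.getSiblingDB("config").locks.findOne({_id: "balancer"}))'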
If I have 1 config server
You should always have 3 config servers in production. If you somehow lose or corrupt your config servers, you will have to combine your data and re-shard your database(s). The sharded cluster metadata saved on the config servers is the definitive source for which sharded data ranges should live on each shard.
some shards, for example four, each composed of only one node (a master, of course)
Ideally each shard should be backed by a replica set if you want optimal uptime. Replica sets provide for auto-failover and can be very useful for administrative purposes (for example, taking backups or adding indexes offline).
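For instance (the replica set name and hostnames are hypothetical), backing one shard with a three-member replica set and registering it with the cluster might look like:

# Initiate a three-member replica set for the shard, run against one member:
mongo --host shard1a --port 27018 --eval 'rs.initiate({
  _id: "shard1",
  members: [
    { _id: 0, host: "shard1a:27018" },
    { _id: 1, host: "shard1b:27018" },
    { _id: 2, host: "shard1c:27018" }
  ]
})'

# From a mongos, add the whole replica set as one shard:
mongo --eval 'sh.addShard("shard1/shard1a:27018,shard1b:27018,shard1c:27018")'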
Is it possible that one mongos redirect its load to another mongos?
No, the mongos do not perform any load balancing. The typical recommendation is to deploy one mongos per app server.
From an application/driver point of view you can specify multiple mongos in your connect string for failover purposes. The application drivers will generally connect to the nearest available mongos (by network ping time) and attempt to reconnect in the event the current mongos connection fails.