MongoDB Multiple Connections to Replica Set

Why does the MongoDB C# Client 2.0 create a connection to each member of the replica set when Read Preference is Primary (default)?
I have an application with MaxPoolSize set to 100, yet it creates 300 connections: a pool of 100 to each node in the replica set. Surely it should just connect to the Primary once it has identified which node is the Primary from the data received from the seed list?
I have two data nodes and one arbiter. The two data nodes are geographically close to the consuming application, with the arbiter a longer ping time away. Whilst I recognize the need to connect to each node at least once at the MongoClient level, why does the pool need to connect to all nodes for each pooled connection?
I only allow a Read Preference of Primary, so it is writing to and reading from a single server. The issue is that I am getting lots of connection errors (which is what led me to look into this and discover the behaviour).
I think the Client should connect to a single server per pooled connection: the Primary when writing, and a pooled Secondary or the Primary when reading, depending on Read Preference. It should not connect more than once to the Arbiter.
Am I missing something here? It is causing an issue when I burst up my pooled connections and the connections get throttled by the Azure load balancer.
My connection string:
mongodb://user:pass@mongo1.domain.com:27000,mongo2.domain.com:27000
Note that I do not specify the arbiter; the client finds it after querying the replica set, and proceeds to open 100 connections to it, which is useless as the arbiter holds no data.
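For reference, a quick way to verify the per-node counts from the mongo shell (run against each member in turn; the field names below are what serverStatus actually returns):

// In the mongo shell, connected to one member at a time:
db.serverStatus().connections
// -> { "current" : <open connections>, "available" : <remaining>, "totalCreated" : <lifetime total> }
// db.currentOp(true) additionally lists the client address behind each active operation.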

Related

Mongo Replica Endpoint

I have some questions about the mongo replica
If I set up 1 primary and 2 secondary MongoDB nodes for replication, I have 3 endpoints to 3 different DBs, and my apps can only write to the primary DB. What if my primary suddenly shuts down and a secondary DB takes over as primary? How do I automatically change the endpoint in my apps? Should I use mongos (the mongo router)? But that needs sharding, if I remember correctly.
Thank you.
All nodes in a replica set work together to hold identical data. Secondary nodes may lag behind the primary, but you don't get "3 different DBs": there is only one database, of which a copy exists on each node.
All MongoDB drivers know how to monitor replica set members and discover which one is the primary automatically. Some drivers need to be configured to do so by providing the replica set name; others do it automatically by default when they connect to a replica set node. Look up "connecting to a replica set" in your driver documentation.
In a proper connection string you will provide all three RS members, e.g.
mongodb://mongodb0.example.com:27017,mongodb1.example.com:27017,mongodb2.example.com:27017/?replicaSet=myRepl
The client will detect the PRIMARY and use it. I guess most drivers will reconnect automatically if the PRIMARY node changes.
Most drivers will detect the PRIMARY automatically if you provide the ReplicaSet name, i.e.
mongodb://mongodb0.example.com:27017/?replicaSet=myRepl
would connect to the PRIMARY even if it is not mongodb0.example.com. However, if mongodb0.example.com is not running, then you cannot connect at all. So it is beneficial to provide all replica set members in the connection string.
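As a sketch of what that looks like from code, here is a minimal Node.js example (2.x node-mongodb-native API; the database name mydb is a placeholder) using the full seed list from above:

var MongoClient = require('mongodb').MongoClient;

// All three members plus the replica set name; the driver discovers the
// current PRIMARY from any reachable seed and routes operations to it.
var uri = 'mongodb://mongodb0.example.com:27017,mongodb1.example.com:27017,' +
          'mongodb2.example.com:27017/mydb?replicaSet=myRepl';

MongoClient.connect(uri, function (err, db) {
  if (err) throw err;
  // With the default read preference, this write and any reads go to the primary.
  db.collection('test').insertOne({ hello: 'world' }, function (err) {
    if (err) throw err;
    db.close();
  });
});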
See Connection String URI Format
mongos is needed only to connect to a Sharded Cluster.

MongoDB replica set - virtual IP, downtime questions

I have 3 questions regarding a replica set on MongoDB on Windows:
I currently have a standalone running with data on it. If I create a replica set (adding 2 secondaries), will I have downtime, or can I create the replica set and add the 2 secondaries while the standalone (now the primary) is still running?
Will the 2 secondaries copy all the data from the primary, including data that was written to the standalone before it became part of the replica set?
Once there is an election, a secondary becomes primary, but then the primary is on a different IP + port. Does this mean I have to redirect my writes to the new primary myself, or does MongoDB do it by itself? Or do I need to use a virtual IP?
Have a look at Convert a Standalone to a Replica Set.
You need to change the configuration file and restart mongod, so you will have some downtime.
Yes; whenever you add a new member to a replica set, MongoDB performs an initial sync to the new secondary, which copies all existing data, including data written before the node joined the set.
You would need to change your connection, see Connect to a MongoDB Replica Set. The connection string contains all replica set members and the client will connect (by default) to the primary.
Actually you don't have to put all replica set members in your connection string, the client will discover them automatically. However, if you put only one member and by chance this member is down then you have no connection.
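In outline, the conversion looks like this (a sketch assuming a replica set name of rs0 and example hostnames; on Windows the replSet option goes into the configuration file instead of the command line):

# Restart the existing standalone mongod with a replica set name
# (this restart is the downtime mentioned above).
mongod --port 27017 --dbpath /data/db --replSet rs0

// Then, in the mongo shell connected to that node:
rs.initiate()                          // keeps the existing data; this node becomes primary
rs.add("mongodb1.example.net:27017")   // each new secondary performs an initial sync
rs.add("mongodb2.example.net:27017")
rs.status()                            // verify both secondaries reach state SECONDARY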

How do mongos instances work together in a cluster?

I'm trying to figure out how different instances of mongos server work together.
If I have 1 config server and some shards, for example four, each of them composed of only one node (a master, of course), and have four mongos servers... do the mongos servers communicate with each other? Is it possible that one mongos redirects its load to another mongos?
When you have multiple mongos instances, they do not automatically load-balance between each other. They don't even know about each other's existence.
The MongoDB drivers for most programming languages allow you to specify multiple mongos instances when creating a connection. In that case the driver will usually ping all of them and connect to the one with the lowest latency. This will usually be the one which is closest geographically. When all have the same network distance, the one which is least busy right now will usually respond first. The driver will then stay connected to that one mongos, unless the program explicitly reconnects or the mongos can no longer be reached (in that case the driver will usually automatically pick another one from the initial list).
That means using multiple mongos instances is normally only a valid method for scaling when you have a large number of low-load clients, not one high-load client. When you want your one high-load client to make use of many mongos instances, you need to implement this yourself by creating a separate connection to each mongos instance and implementing your own mechanism to distribute queries among them.
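A minimal sketch of that do-it-yourself approach in Node.js (2.x driver API; the mongos hostnames and database name are placeholders): open one connection per mongos and round-robin queries across them.

var MongoClient = require('mongodb').MongoClient;

// Hypothetical mongos addresses: one independent connection each.
var routers = ['mongos1.example.net:27017', 'mongos2.example.net:27017'];
var connections = [];
var next = 0;

// In real code you would wait for all connects to finish before serving queries.
routers.forEach(function (host) {
  MongoClient.connect('mongodb://' + host + '/mydb', function (err, db) {
    if (!err) connections.push(db);
  });
});

// Round-robin: each call hands back the next mongos connection.
function getDb() {
  var db = connections[next % connections.length];
  next++;
  return db;
}

// Usage: spreads reads over all mongos instead of pinning to one.
// getDb().collection('items').find().toArray(function (err, docs) { ... });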
Short answer
As of MongoDB 2.4, the mongos servers only provide a routing service to direct read/write queries to the appropriate shard(s). The mongos servers discover the configuration for your sharded cluster via the config servers. You can find out more details in the MongoDB documentation: Sharded Cluster Query Routing.
Longer scoop
I'm trying to figure out how different instances of mongos server work together.
The mongos servers do not currently talk directly to each other. They do coordinate some activity via your config servers:
reading the sharded cluster metadata
initiating a balancing round (any mongos can start a balancing round, but only one round can be active at a time)
If I have 1 configserver
You should always have 3 config servers in production. If you somehow lose or corrupt your config server, you will have to combine your data and re-shard your database(s). The sharded cluster metadata saved on the config servers is the definitive source for what sharded data ranges should live on each shard.
some shards, for example four, each of them composed by only one node (a master of course)
Ideally each shard should be backed by a replica set if you want optimal uptime. Replica sets provide for auto-failover and can be very useful for administrative purposes (for example, taking backups or adding indexes offline).
Is it possible that one mongos redirect its load to another mongos?
No, the mongos do not perform any load balancing. The typical recommendation is to deploy one mongos per app server.
From an application/driver point of view you can specify multiple mongos in your connect string for failover purposes. The application drivers will generally connect to the nearest available mongos (by network ping time), and attempt to reconnect in the event the current mongos connection fails.
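For example, a failover connect string listing two (hypothetical) mongos instances looks just like a replica set seed list, only without the replicaSet option:

mongodb://mongos1.example.net:27017,mongos2.example.net:27017/mydb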

How to add new server in replica set in production

I am new to mongodb replica set.
According to Replica Set Ref, this should be the connection string in my application to connect to MongoDB:
mongodb://db1.example.net,db2.example.net,db3.example.net:2500/?replicaSet=test
Suppose this is a production replica set (i.e. I cannot change the application code or stop all the mongo servers), and I want to add another mongod instance, db4.example.net, to the test replica set. How will I do that?
How will my application know about the new db4.example.net?
If you are looking for a real-world scenario:
In a situation where an existing server goes down due to hardware failure etc., it is natural to add another db server to the replica set to preserve the redundancy. But how do you do that?
The list of replica set hosts in your connection string is a "seed list", and does not have to include all of the members of your replica set.
The MongoDB client driver used by your application will iterate through the seed list until it can successfully connect to a host, and use that host to request the current replica set configuration which will list all current members of the replica set. Per the documentation, it is recommended to include at least two hosts in the connect string so that your driver can still connect in the event the first host happens to be down.
Any changes in replica set configuration (i.e. adding/removing members or election of a new primary) are automatically discovered by your client driver so you should not have to make any changes in the application configuration to add a new member to your replica set.
A change in replica set configuration may trigger an election for a new primary, so your application code should expect to handle transient errors for a few seconds during reconfiguration.
Some helpful mongo shell commands:
rs.conf() - display the current replication configuration
db.isMaster().primary - display the current primary
You should notice a version number in the configuration document returned by rs.conf(). This version is incremented on every configuration change so drivers and replica set nodes can check if they have a stale version of the config.
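For example (values illustrative):

rs.conf().version      // e.g. 3; goes up by one after every reconfiguration
db.isMaster().primary  // e.g. "db1.example.net:27017"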
How will my application know about the new db4.example.net?
Just rs.add("db4.example.net") and your application should discover this host automatically.
In your scenario, if you are replacing an entirely dead host you would likely also want to rs.remove() the original host (after adding the replacement) to maintain the voting majority for your replica set.
Alternatively, rather than adding a host with a new name you could replace the dead host with a new server using the same hostname as previously configured. For example, if db3.example.net died, you could replace it with a new db3.example.net and follow the steps to Resync a replica set member.
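Putting the add-then-remove sequence together in the mongo shell (run against the primary; hostnames are the ones from the question):

// Add the replacement first so the set keeps its voting majority.
rs.add("db4.example.net")

// Wait until the new member finishes its initial sync and reaches
// state SECONDARY (check with rs.status()), then remove the dead host.
rs.remove("db3.example.net:2500")   // use the host:port string exactly as it appears in rs.conf()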
A way to provide abstraction to your database is to set up a sharded cluster. In that case, the access point between your application and the database is the mongodb routers. What happens behind them is outside of the visibility of the application. You can add shards, remove shards, turn shards into replica sets and change those replica sets all you want. The application keeps talking to the routers, and the routers know which servers they need to forward requests to. You can change the cluster configuration at runtime by connecting to the routers with the mongo shell.
When you have questions about how to set up and administrate MongoDB clusters, please ask on http://dba.stackexchange.com.
But note that in the scenario you described, that wouldn't even be necessary. When one of your database servers has a hardware failure and your system administrators want to replace it without application downtime, they can just assign the same IP and hostname to the new server so the application doesn't even notice that it's a replacement.
When you want to know details about how to do this, you will find help on http://serverfault.com

node-mongodb-native does not recover on replica set primary network failure?

Our MongoDB setup uses three replica set shards. Each webserver runs a mongos instance locally, and the client node.js processes connect through that using Mongoose (3.6.20) and node-mongodb-native. So node-mongodb-native just connects to mongos on localhost.
When a replica set primary goes down hard (we can simulate this by doing 'ifdown eth0' on the primary) mongos properly detects this, and also detects that a new primary has been elected. So far, so good. But node-mongodb-native's connections to the mongos instance are still open but not functional, and a restart of the node procs is required.
Our assumption was that mongos would just kill any established connections to the dead primary and node-mongodb-native would reconnect, but that seems to not be the case; both the server and the OS think these connections are open. By contrast, on primary stepDown, the clients fail over fine, connections are closed and reopened.
We are looking at socketTimeoutMS, but that seems incorrect since it causes disconnects for connections that are merely idle.
Are we missing configuration to our client or mongos, or do we have to implement our own pinging?
Based on experimentation and the following MongoDB bug, this appears to just be a shortcoming of mongos (or, if you prefer, of the client libraries) at this point. Right now it looks like 'write your own pinging logic in your app and trigger a reconnect when that fails', so that's what we are doing.
https://jira.mongodb.org/browse/SERVER-9041
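For anyone in the same spot, the pinging logic we ended up with looks roughly like this sketch (2.x node-mongodb-native API; reconnect is an app-specific, hypothetical function that tears down and rebuilds the connection):

// Issue a cheap ping on a timer; treat an error or a hang as a dead connection.
var PING_INTERVAL_MS = 5000;
var PING_TIMEOUT_MS = 2000;

function startPinger(db, reconnect) {
  setInterval(function () {
    var timedOut = false;
    var timer = setTimeout(function () {
      timedOut = true;
      reconnect(); // ping hung: the socket is dead even though the OS thinks it is open
    }, PING_TIMEOUT_MS);

    // 'ping' is a no-op server command; any reply proves the connection works.
    db.command({ ping: 1 }, function (err) {
      clearTimeout(timer);
      if (err && !timedOut) reconnect();
    });
  }, PING_INTERVAL_MS);
}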