Problem with MongoDB Replication - AWS and Windows Hosts

I've been messing with this for a while now, and I have managed to crawl through the configuration given that the documentation is rather nonexistent.
Right now the problem is that my ReplicaSet Secondaries cannot get a heartbeat through to my Primary. I am able to ping all hosts from each other, and I am able to connect to the shell from all hosts.
The ReplicaSet initiated and I was able to add the members, so I know they can all communicate.
Is there something I need to open up on the firewall to get the heartbeats through?

The problem was with the inbound firewall rule I created for traffic over port 27017.
My inbound rule had a typo in the port number, preventing either secondary from contacting the primary.
The outbound rule was fine, which made it look like the ReplicaSet was working, because the secondaries were still receiving information from the Primary.
This creates a trap if you're in this scenario and you shut down the secondaries: the Primary will see that they went offline, lose its majority, and step down to Secondary, where it will stay until you find and fix the underlying issue.
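If you suspect this kind of one-way connectivity, a quick way to check from the shell is to look at the heartbeat fields in rs.status(); a member that is reachable outbound but blocked inbound will typically show health 0 and a lastHeartbeatMessage on one side only. A minimal sketch (works in both the legacy mongo shell and mongosh):
// print each member's view: name, state, health, and any heartbeat error
rs.status().members.forEach(function (m) {
    print(m.name, m.stateStr, "health:", m.health, m.lastHeartbeatMessage || "");
});
Run this on each member; asymmetric results (one side sees the other as unhealthy, but not vice versa) point at a one-directional firewall problem like the one above.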

Related

MongoDB nodes (AWS EC2 Instances) are still responsive even after network partitioning done using Security Groups

I have created a MongoDB replica set using 5 EC2 instances on AWS. I added the nodes using the rs.add("[IP_Address]") command.
I want to perform a network partition in the replica set. In order to do that, I have specified 2 kinds of security groups: 'SG1' has port 27017 (the MongoDB port) open, while 'SG2' doesn't expose 27017.
I want to isolate 2 nodes from the replica set. When I apply SG2 to these 2 nodes (EC2 instances), they should ideally stop receiving reads and writes from the primary, since I am blocking port 27017 using security group SG2. But in my case, they are still writable: data written on the Primary is reflected on the partitioned nodes. Can someone help? TYA.
Most firewalls, including AWS Security Groups, block incoming connections at the time the connection is opened. Changing the settings affects all new connections, but existing open connections are not re-evaluated when the new rules are applied.
MongoDB maintains long-lived connections between hosts, so those would only get blocked after the connection between the hosts is lost and has to be re-established.
On Linux you can restart networking, which will reset the connections. You can do this after applying the new rules by running:
sudo /etc/init.d/networking stop && sudo /etc/init.d/networking start
(on systemd-based distributions, sudo systemctl restart networking does the same). Note that if you are connected over SSH, this will also drop your own session.
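You can see the difference between an established and a new connection yourself. In a shell that was already connected to the partitioned node before the security group change, a ping still succeeds, while a brand-new connection from another host should now time out (the address below is a placeholder):
// in the already-open shell on the partitioned node:
db.adminCommand({ ping: 1 })   // still returns { ok: 1 } over the established socket
// from another host, a fresh connection should now hang and time out:
// mongo --host 192.168.1.99 --eval "db.adminCommand({ ping: 1 })"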

MongoDB error not master and slaveOk=false

I am using MongoDB with LoopBack in my application, with a LoopBack connector to MongoDB. My application was working fine, but now it throws an error:
not master and slaveOk=false.
Try running rs.slaveOk() in a MongoDB shell.
You are attempting to connect to a secondary replica, whereas previously your app (connection) was likely set to connect to the primary; hence the error. If you use rs.secondaryOk() (slaveOk is deprecated now), you will possibly solve the connection problem, but it might not be what you want.
To make sure you are doing the right thing, think about whether you want to connect to the secondary replica instead of the primary. Usually, it's not what you want.
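If reading from a secondary really is what you want, either of these enables it for the current shell session:
rs.secondaryOk()                                  // replacement for the deprecated rs.slaveOk()
db.getMongo().setReadPref('secondaryPreferred')   // or set an explicit read preference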
If you have permissions to amend the replica configuration
I suggest connecting using MongoDB Compass and executing rs.status() first to see the existing state and configuration of the cluster. Then verify which replica is primary.
If necessary, adjust the priorities in the replica set configuration to assign primary status to the right replica; the member with the highest priority number becomes primary. This article shows how to do it right.
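A minimal sketch of such a reconfiguration from the shell, assuming a three-member set where members[0] should become primary:
cfg = rs.conf()               // fetch the current replica set configuration
cfg.members[0].priority = 2   // the highest priority wins the election
cfg.members[1].priority = 1
cfg.members[2].priority = 1
rs.reconfig(cfg)              // must be run on the current primary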
If you aren't able to change the replica configuration
Try a few things:
make sure your hostname points to the primary replica
if it is a local environment issue - make sure you have added your local replica hostnames to /etc/hosts, pointing to 127.0.0.1
experiment with directConnection=true
experiment with multiple replica hosts and ?replicaSet=<name> - read this article (switch tabs to replica)
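For the last two bullets, the difference in the connection string looks roughly like this (the hostnames and the replica set name rs0 are placeholders):
mongodb://db1.example.com:27017/?directConnection=true
mongodb://db1.example.com:27017,db2.example.com:27017,db3.example.com:27017/?replicaSet=rs0
With directConnection=true the driver talks only to the host you name, even if it is a secondary; with replicaSet the driver discovers the topology and routes operations to the primary.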
The best bet is that your database configuration has changed and your connection string no longer reflects it correctly. Usually slight adjustments to the connection string are needed, or just checking which instance you actually want to connect to.

Multiple A records for failover

I have 5 MongoDB servers at Amazon:
192.168.1.1
192.168.1.2
192.168.1.3
192.168.1.4
192.168.1.5
and 2 HAProxy servers for load balancing:
192.168.1.6
192.168.1.7
My domain is registered at namecheap.com; let's call it domain.com.
1) Can I point database.domain.com to both HAProxy servers? If yes, how?
2) If HAProxy server 192.168.1.6 fails, will 192.168.1.7 take over?
3) Can I control the timeout (TTL) of the records?
Please explain to me how things work and how to make it work the way I want. I'm trying to understand how such a system is set up for failover. I'm seeking knowledge, not humiliation, so please either try to help or don't do anything.
Anna, stay positive; we are all learning from each other. You need to create a replica set of all of your MongoDB servers. A replica set is MongoDB's answer to handling failover.
Please see https://docs.mongodb.org/manual/replication/
To connect to MongoDB, you don't need any proxy servers; just point directly at the MongoDB primary. Depending on your application, the MongoDB connection string can look slightly different. Normally, it should be something like:
mongodb://192.168.1.1,192.168.1.2,192.168.1.3,192.168.1.4,192.168.1.5/?replicaSet=<replicaset_name>
See https://docs.mongodb.org/manual/reference/connection-string/
Just further to #Saleem's answer, by way of explanation of DNS: multiple A records in DNS don't act as a failover; rather, they act more like a load balancer, in that your upstream DNS server will request the A record and select one of the A records listed to return to you, and this record may change each time the time-to-live expires and your DNS server has to re-request the A record.
Some DNS servers are smart enough to request a new A record if the provided one doesn't work, which gives you a certain level of pseudo-redundancy, but most do not have this feature enabled.
(source: Using DNS for failover using multiple A records)
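You can watch this round-robin behaviour yourself with a quick Node.js check (using the question's placeholder domain); resolve4 returns every A record for the name, and resolvers may rotate their order between lookups:
const dns = require('dns');

// resolve4 returns all A records for the name; the order you get back
// (and which one a client actually uses) can vary per lookup
dns.resolve4('database.domain.com', (err, addresses) => {
  if (err) throw err;
  console.log(addresses); // e.g. [ '192.168.1.6', '192.168.1.7' ]
});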

node-mongodb-native does not recover on replica set primary network failure?

Our MongoDB setup uses three replica set shards. Each webserver runs a mongos instance locally, and the client node.js processes connect through that using Mongoose (3.6.20) and node-mongodb-native. So node-mongodb-native just connects to mongos on localhost.
When a replica set primary goes down hard (we can simulate this by doing 'ifdown eth0' on the primary) mongos properly detects this, and also detects that a new primary has been elected. So far, so good. But node-mongodb-native's connections to the mongos instance are still open but not functional, and a restart of the node procs is required.
Our assumption was that mongos would just kill any established connections to the dead primary and node-mongodb-native would reconnect, but that seems not to be the case; both the server and the OS think these connections are open. By contrast, on a primary stepDown, the clients fail over fine: connections are closed and reopened.
We are looking at socketTimeoutMS, but that seems incorrect since it causes disconnects for connections that are merely idle.
Are we missing configuration to our client or mongos, or do we have to implement our own pinging?
Based on experimentation and the following MongoDB bug, this appears to just be a shortcoming of mongos (or, if you prefer, of the client libraries) at this point. Right now it looks like 'write your own pinging logic in your app and trigger a reconnect when that fails', so that's what we are doing.
https://jira.mongodb.org/browse/SERVER-9041
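A rough sketch of that kind of ping-and-reconnect logic, written against the modern driver's promise API for readability (the node-mongodb-native of the question's era used callbacks instead; the 5-second interval and the force-close are illustrative choices, not anything mandated by the driver):
// assumes `client` is an already-connected MongoClient pointed at mongos
setInterval(async () => {
  try {
    // cheap round-trip that exercises the actual socket to mongos
    await client.db('admin').command({ ping: 1 });
  } catch (err) {
    console.error('mongos ping failed, recycling connections', err);
    await client.close(true);  // force-close the stale sockets
    await client.connect();    // re-establish the connection pool
  }
}, 5000);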

How a primary server going down is handled automatically in MongoDB replication

I have never really had my hands on coding. I have a doubt regarding MongoDB replica sets;
below is the situation.
I have an alert monitoring application.
It is using MongoDB with a replica set with 3 nodes.
The application's Java code base keeps connecting to the primary and doing some transactions.
Now my question is:
if the primary server is down, how will it affect the application server?
I mean, will the app server log errors saying the connection failed,
OR
will the replica set automatically pick one of the slaves as master and allow the application server to continue its activity? How will that happen?
Thanks & Regards,
UDAY
The replica set will try to pick another server as the new primary. If you have three nodes and one goes down, the other two will negotiate which one becomes the new master. If two go down, or communication between the remaining nodes somehow breaks down, there will be no new master until the situation is recovered.
The official drivers support this automatic fail-over, as does the mongos routing server if you use it. So the application code does not need to do anything here.
I am not sure if there will be connection errors during the brief period of time this fail-over negotiation takes (you will probably get errors for a few seconds).
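In practice you can smooth over that brief window on the driver side. Listing all members in the connection string lets the driver follow the election, and retryWrites (available in modern drivers; the hostnames and replica set name below are placeholders) retries a failed write once after the new primary is elected:
mongodb://node1:27017,node2:27017,node3:27017/?replicaSet=rs0&retryWrites=true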