Multiple A records for failover - MongoDB

I have 5 mongos servers at Amazon:
192.168.1.1
192.168.1.2
192.168.1.3
192.168.1.4
192.168.1.5
and 2 HAProxy servers for load balancing:
192.168.1.6
192.168.1.7
My domain is registered at namecheap.com; let's call it domain.com.
1) Can I point database.domain.com to both HAProxy servers? If yes, how?
2) If HAProxy server 192.168.1.6 fails, will 192.168.1.7 take over?
3) Can I control the timeout (TTL) of the DNS records?
Please explain to me how things work and how to make it work the way I want. I'm trying to understand how such a system is set up for failover. I'm seeking knowledge, not humiliation, so please either try to help or don't do anything.

Anna, stay positive; we are all learning from each other. You need to create a replica set of all of your MongoDB servers. A replica set is MongoDB's answer to handling failover.
Please see https://docs.mongodb.org/manual/replication/
To connect to MongoDB, you don't need any proxy servers; just point directly at the MongoDB primary. Depending on your application, the MongoDB connection string can look slightly different. Normally, it should be something like:
mongodb://192.168.1.1,192.168.1.2,192.168.1.3,192.168.1.4,192.168.1.5/?replicaSet=<replicaset_name>
See https://docs.mongodb.org/manual/reference/connection-string/
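For reference, connecting to such a replica set from Python could look roughly like this (the replica set name rs0 is an assumption; substitute whatever name you used when initiating the set):

    from pymongo import MongoClient

    # Hosts from the question above; "rs0" is a placeholder replica set name.
    uri = (
        "mongodb://192.168.1.1,192.168.1.2,192.168.1.3,"
        "192.168.1.4,192.168.1.5/?replicaSet=rs0"
    )

    # The driver discovers the current primary from this seed list and, if a
    # failover elects a new primary, re-routes operations to it, so no HAProxy
    # layer is needed in front of the replica set.
    client = MongoClient(uri)
    print(client.admin.command("ping"))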

Just further to #Saleem's answer, by way of explanation of the DNS side: multiple A records in DNS don't act as a failover; rather, they act more like a load balancer, in that your upstream DNS server will request the A record and select one of the listed A records to return to you, and this record may change each time the time to live (TTL) expires and your DNS server has to re-request the A record.
Some DNS servers are smart enough to request a new A record if the provided one doesn't work, which gives you a certain level of pseudo-redundancy, but most do not have this feature enabled.
(source: Using DNS for failover using multiple A records)
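As a small client-side illustration of the above (assuming database.domain.com has one A record per HAProxy box, and using port 27017 purely as a placeholder), this is roughly what walking all returned A records and falling back to the next one looks like in Python:

    import socket

    # Hypothetical name and port; database.domain.com would need one A record
    # per HAProxy server for getaddrinfo to return both IPs.
    HOST, PORT = "database.domain.com", 27017

    # getaddrinfo returns every A record; a client that walks the list and
    # retries the next address gets a crude form of failover on its own,
    # which is the "pseudo-redundancy" mentioned above.
    for *_, sockaddr in socket.getaddrinfo(HOST, PORT, type=socket.SOCK_STREAM):
        try:
            with socket.create_connection(sockaddr[:2], timeout=3):
                print("connected to", sockaddr[0])
                break
        except OSError as exc:
            print("failed to reach", sockaddr[0], "-", exc)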

Related

Problem with MongoDB Replication - AWS and Windows Hosts

I've been messing with this for a bit now, and I have managed to crawl through the configuration given that the documentation is rather nonexistent.
Right now the problem is that my ReplicaSet Secondaries cannot get a heartbeat to my Primary. I am able to ping all hosts from each other and I am able to connect to the shell from all hosts.
The ReplicaSet initiated and I was able to add the members, so I know they can all communicate.
Is there something I need to open up on the firewall to get the heartbeats through?
The problem was with the inbound Firewall Rule I created for traffic over 27017.
My inbound rule had a typo in the port number, preventing either secondary from contacting the primary.
The outbound rule was fine, which made it look like the ReplicaSet was working, because the secondaries still received information from the Primary.
This creates a problem if you're in this scenario and you shut down the secondaries: the Primary will see that they went offline and, no longer able to reach a majority, it will step down to Secondary and stay that way until you figure out the issue.
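If you suspect this kind of problem, a quick reachability check of the MongoDB port from each secondary can confirm it; here is a minimal sketch (the primary's address below is a placeholder):

    import socket

    # Placeholder address of the primary; run this from each secondary.
    PRIMARY = ("10.0.0.1", 27017)

    try:
        # If the inbound firewall rule on the primary is wrong (for example a
        # typo in the port number), this connect attempt times out or is
        # refused even though ICMP ping still succeeds.
        with socket.create_connection(PRIMARY, timeout=5):
            print("TCP connection to the primary on 27017 succeeded")
    except OSError as exc:
        print("cannot reach the primary on 27017:", exc)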

haproxy streaming - no dependency on proxy

I have been testing with HAProxy, which does cookie-based load balancing to our streaming servers. But let's say, for example, that HAProxy falls over (I know, unlikely)
and the streamer gets disconnected. Is there a way of passing on the connection without it relying on HAProxy, basically leaving the streamer connected to the destination and cutting all ties with HAProxy?
That is not possible by design.
HAProxy is a proxy (as the name suggests). As such, for each communication you have two independent TCP connections: one between the client and HAProxy, and another between HAProxy and your backend server.
If HAProxy fails or you need to failover, the standing connections will have to be re-created. You can't pass over existing connections to another server since there is a lot of state attached to each connection that can't be transferred.
If you want to remove the load balancer from the equation after the initial connection setup, you should look at Layer 3 load-balancing solutions like LVS on Linux with Direct Routing. Note that these solutions are much less flexible than HAProxy. There is no such thing as a free lunch, after all :)
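To make the two-connection point concrete, here is a deliberately naive sketch of what any TCP proxy does per client (the backend address and ports are placeholders); all of this per-connection state lives only inside the proxy process, which is why it cannot be handed over when the proxy dies:

    import socket
    import threading

    BACKEND = ("192.168.1.1", 8017)  # placeholder backend address

    def pipe(src, dst):
        # Copy bytes from one connection to the other until one side closes.
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)

    listener = socket.socket()
    listener.bind(("0.0.0.0", 9000))
    listener.listen()

    while True:
        client, _ = listener.accept()                 # connection 1: client <-> proxy
        upstream = socket.create_connection(BACKEND)  # connection 2: proxy <-> backend
        threading.Thread(target=pipe, args=(client, upstream), daemon=True).start()
        threading.Thread(target=pipe, args=(upstream, client), daemon=True).start()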

How to setup a MongoDB replica set in EC2 US-WEST with only two availability zones

We are setting up a MongoDB replica set on Amazon EC2 in the us-west-1 region.
This region only has two availability zones, though. My understanding is that MongoDB must have a majority to work correctly. If we create 2 servers in zone us-west-1b and one server in us-west-1c, this will not provide high availability if the entire us-west-1b zone goes down, right? How is this possible? What is the recommended configuration?
Having faced a similar challenge we looked at a number of possible solutions:
Put an arbiter in another region (a configuration sketch follows at the end of this answer):
Secure the connection either by using a point-to-point VPN between the regions and routing the traffic across this connection,
or
Give each server an Elastic IP and DNS name and use some combination of AWS security groups, iptables, and SSL to ensure connections are secure.
AWS actually has a whitepaper on this, though I'm not sure how old it is: http://media.amazonwebservices.com/AWS_NoSQL_MongoDB.pdf
Alternatively, you could allow the application to fall back to a read-only state until your servers come back online (not the nicest of options, though).
Hope this helps
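A rough sketch of the arbiter option when initiating the set (all hostnames, the port, and the replica set name rs0 are placeholders):

    from pymongo import MongoClient

    # Placeholder hostnames and replica set name; adjust to your environment.
    config = {
        "_id": "rs0",
        "members": [
            {"_id": 0, "host": "mongo-usw1b.example.com:27017"},
            {"_id": 1, "host": "mongo-usw1c.example.com:27017"},
            # Arbiter in another region: it stores no data and only votes, so
            # losing either data-bearing availability zone still leaves a
            # majority (2 of 3) and a primary can be elected.
            {"_id": 2, "host": "arbiter-useast1.example.com:27017",
             "arbiterOnly": True},
        ],
    }

    # Connect directly to one of the data-bearing members and initiate.
    client = MongoClient("mongo-usw1b.example.com", 27017, directConnection=True)
    client.admin.command("replSetInitiate", config)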

Amazon EC2 Elastic Load Balancer TCP disconnect after couple of hours

I am testing the reliability of TCP connections using Amazon Elastic Load Balancer compared to not using the Load Balancer to see if it has any impact.
I have set up a small Elastic Load Balancer in the Amazon EC2 us-east zones with 8 t2.micro instances, using an auto scaling group without a policy and set to 8 min/max instances.
Each instance runs a simple TCP server that accepts connections on port 8017 and relays some data to the clients, coming from another remote server located in my network. The same data is sent to all clients.
For the purpose of the test, the servers running on the micro instances only send 1 byte of data every 60 seconds (to be sure the connections don't time out).
I connected multiple clients from various outside networks using the ELB DNS name provided, and after maybe 6-24 hours I always stop receiving data and eventually the connections all die.
All clients stop around the same time, even though they are on different networks/ISPs. Each "client" application opens about 10 TCP connections, and they all stop receiving data.
All server instances look fine after this happens; they are still sending data.
To do further testing and rule out a problem in the TCP server code, I also have external clients connected directly to the public IP of a single instance, without the ELB, and in that case the data doesn't stop and the connection is not lost (so far).
The Load balancer Idle Timeout is set to 900 seconds.
The Cross-Zone load balancing is enabled and I am using the following zones: us-east-1e, us-east-1b, us-east-1c, us-east-1d
I read the documentation and searched everywhere to see if this is a known behaviour, but I couldn't find any clear answer or confirmation of others having the same issue; it seems clear it is happening in my case, though.
My question: Is this a known/expected behaviour for a TCP load balancer? Otherwise, any idea what could be the problem in my setup?
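For reference, the kind of test client described above could be sketched like this (the ELB DNS name below is a placeholder):

    import socket
    import time

    # Placeholder ELB DNS name and the port used by the test servers.
    ELB = ("my-test-elb-123456.us-east-1.elb.amazonaws.com", 8017)

    # The servers behind the ELB send 1 byte every 60 seconds, so a read that
    # blocks much longer than that indicates the flow through the ELB is dead.
    with socket.create_connection(ELB) as conn:
        conn.settimeout(180)  # three missed heartbeats
        while True:
            try:
                data = conn.recv(1)
            except socket.timeout:
                print("no data for 180s - connection through the ELB looks dead")
                break
            if not data:
                print("connection closed by peer")
                break
            print(time.strftime("%H:%M:%S"), "received", data)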

How do mongos instances work together in a cluster?

I'm trying to figure out how different instances of mongos server work together.
If I have 1 config server and some shards, for example four, each of them composed of only one node (a master, of course), and have four mongos servers... do the mongos servers communicate with each other? Is it possible that one mongos redirects its load to another mongos?
When you have multiple mongos instances, they do not automatically load-balance between each other. They don't even know about each other's existence.
The MongoDB drivers for most programming languages allow you to specify multiple mongos instances when creating a connection. In that case the driver will usually ping all of them and connect to the one with the lowest latency. This will usually be the one which is closest geographically. When all have the same network distance, the one which is least busy right now will usually respond first. The driver will then stay connected to that one mongos, unless the program explicitly reconnects or the mongos can no longer be reached (in that case the driver will usually automatically pick another one from the initial list).
That means using multiple mongos instances is normally only a valid method for scaling when you have a large number of low-load clients, not one high-load client. When you want your one high-load client to make use of many mongos instances, you need to implement this yourself by creating a separate connection to each mongos instance and implementing your own mechanism to distribute queries among them.
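A rough sketch of that do-it-yourself distribution, with placeholder mongos addresses and a simple round-robin policy:

    import itertools
    from pymongo import MongoClient

    # Placeholder mongos addresses; one separate connection per router.
    MONGOS_HOSTS = ["10.0.0.11:27017", "10.0.0.12:27017", "10.0.0.13:27017"]
    clients = [MongoClient("mongodb://" + host) for host in MONGOS_HOSTS]

    # Crude round-robin distribution of queries across the routers; a real
    # implementation would also skip routers that have become unreachable.
    rotation = itertools.cycle(clients)

    def find_one(db, coll, query):
        client = next(rotation)
        return client[db][coll].find_one(query)

    print(find_one("test", "users", {"name": "anna"}))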
Short answer
As of MongoDB 2.4, the mongos servers only provide a routing service to direct read/write queries to the appropriate shard(s). The mongos servers discover the configuration for your sharded cluster via the config servers. You can find out more details in the MongoDB documentation: Sharded Cluster Query Routing.
Longer scoop
I'm trying to figure out how different instances of mongos server work together.
The mongos servers do not currently talk directly to each other. They do coordinate some activity via your config servers:
reading the sharded cluster metadata
initiating a balancing round (any mongos can start a balancing round, but only one round can be active at a time)
If I have 1 configserver
You should always have 3 config servers in production. If you somehow lose or corrupt your config server, you will have to combine your data and re-shard your database(s). The sharded cluster metadata saved on the config servers is the definitive source for what sharded data ranges should live on each shard.
some shards, for example four, each of them composed of only one node (a master, of course)
Ideally each shard should be backed by a replica set if you want optimal uptime. Replica sets provide for auto-failover and can be very useful for administrative purposes (for example, taking backups or adding indexes offline).
Is it possible that one mongos redirects its load to another mongos?
No, the mongos do not perform any load balancing. The typical recommendation is to deploy one mongos per app server.
From an application/driver point of view you can specify multiple mongos in your connect string for failover purposes. The application drivers will generally connect to the nearest available mongos (by network ping time) and attempt to reconnect in the event that the current mongos connection fails.
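For example, in Python a connect string listing several mongos for failover might look like this (the addresses are placeholders; localThresholdMS tunes how close, in ping time, a mongos must be to the fastest one to still receive queries):

    from pymongo import MongoClient

    # Placeholder mongos addresses (typically one mongos per app server).
    # 15 ms is the drivers' usual default for localThresholdMS.
    uri = (
        "mongodb://10.0.0.11:27017,10.0.0.12:27017,10.0.0.13:27017/"
        "?localThresholdMS=15"
    )

    client = MongoClient(uri)
    print(client.admin.command("ping"))  # routed via whichever mongos was selected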