HAProxy read/write splitting for PostgreSQL

I use PostgreSQL as my database, with a master/slave setup using streaming replication. I want to use HAProxy for load balancing and send the writes to the master and the reads to the slave. Can I do this with HAProxy?

No, you can't. HAProxy doesn't understand the PostgreSQL protocol so it has no idea what "reads" or "writes" are.
Take a look at PgPool-II, which can do this to a limited extent. In practice it's usually better to configure the application so it knows to route its read-only queries to a different server if possible.
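If you do route in the application, the split can be as simple as holding two connections and deciding yourself which one each statement uses. A minimal sketch with psycopg2 (host names, database name and credentials are placeholders, not anything from the question):
import psycopg2

# Placeholders: point these at your primary and your streaming replica.
write_conn = psycopg2.connect(host="primary.db.local", port=5432,
                              dbname="mydb", user="app", password="secret")
read_conn = psycopg2.connect(host="replica.db.local", port=5432,
                             dbname="mydb", user="app", password="secret")

# Writes go to the primary...
with write_conn, write_conn.cursor() as cur:
    cur.execute("INSERT INTO events (payload) VALUES (%s)", ("hello",))

# ...and read-only queries go to the replica.
with read_conn, read_conn.cursor() as cur:
    cur.execute("SELECT count(*) FROM events")
    print(cur.fetchone())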

We are doing it by defining one frontend for reads and another for writes, each listening on a different port, and routing them to backends that point at the appropriate nodes of the database cluster.
Example of HAProxy config:
# PostgreSQL is proxied at the TCP level
frontend writes
    bind *:5439
    mode tcp
    default_backend writes_db

frontend reads
    bind *:5438
    mode tcp
    default_backend reads_db

# all writes go to the single master
backend writes_db
    mode tcp
    option pgsql-check user haproxy
    server master_db ip.for.my.master:5432 check

# reads are balanced across the replicas
backend reads_db
    mode tcp
    balance roundrobin
    option pgsql-check user haproxy
    server replica_db ip.for.my.replica:5432 check
In our case we use Django, so we define database routers and settings.DATABASES so that all write operations go through one port of the HAProxy server (5439) and all read operations go through the other one (5438).
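A rough sketch of what that can look like (the aliases, host name and credentials here are placeholders rather than our exact settings):
# settings.py
DATABASES = {
    "default": {                      # writes go through HAProxy port 5439
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydb",
        "USER": "app",
        "PASSWORD": "secret",
        "HOST": "haproxy.internal",
        "PORT": "5439",
    },
    "replica": {                      # reads go through HAProxy port 5438
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mydb",
        "USER": "app",
        "PASSWORD": "secret",
        "HOST": "haproxy.internal",
        "PORT": "5438",
    },
}
DATABASE_ROUTERS = ["myproject.routers.ReadWriteRouter"]

# myproject/routers.py
class ReadWriteRouter:
    """Send reads to the 'replica' alias and writes to 'default'."""

    def db_for_read(self, model, **hints):
        return "replica"

    def db_for_write(self, model, **hints):
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # Both aliases point at the same physical cluster.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Run migrations only through the write connection.
        return db == "default"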

Related

Zalando operator: load balancing read/write with pgbouncer

I have installed a Postgres cluster using the Zalando operator and enabled pgbouncer for the replicas and the master.
I would like to combine or load balance the replica and master connections, so that read requests are routed to the read replicas and write requests are routed to the master.
Can anyone help me achieve this? Thanks in advance.
I tried enabling pgbouncer, but it gets enabled either for the master or for a slave.
What I need is a single point that routes read requests to the slaves and write requests to the master.
There is no safe way to distinguish reading and writing statements in PostgreSQL. pgPool tries to do that, but I think any such solution is flaky. You will have to teach your application to direct reads and writes to different data sources.
I don't think PgBouncer provides any out-of-the-box way to load balance read and write queries. An alternative is to use Pgpool as the connection pooler. Pgpool provides a mode known as load_balance_mode which you can turn on; it will then try to load balance queries, sending write queries to the master and read queries to a replica. You can read more about load_balance_mode here
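If you keep the operator's PgBouncer pools, the usual pattern is to give the application two DSNs, one for the master pooler service and one for the replica pooler service, and route reads and writes yourself. A small sketch; the service names, database and credentials below are assumptions, so check what the operator actually created with kubectl get svc:
import os
import psycopg2

# Assumed service names; replace with the pooler services in your namespace.
WRITE_DSN = os.getenv(
    "WRITE_DSN", "host=acid-minimal-cluster-pooler dbname=mydb user=app password=secret")
READ_DSN = os.getenv(
    "READ_DSN", "host=acid-minimal-cluster-pooler-repl dbname=mydb user=app password=secret")

write_conn = psycopg2.connect(WRITE_DSN)   # goes to the master pool
read_conn = psycopg2.connect(READ_DSN)     # goes to the replica pool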

postgres redirect queries to standby?

I am trying to create a connection pooling system with load balancing. From what I understand, PgBouncer doesn't have a load balancing option; all I can do is create a file with all the users and passwords and configure the databases/clusters. But with that option I cannot direct connections to a specific cluster, i.e. send inserts to the primary and selects to the standby. All that is possible is to let user "user1" connect to the cluster on port 5432 to database "database123".
How can I redirect queries to the standby with other tools?
I tried to do this with pgpool, but for some reason the standby is always in "waiting" status --> Cannot configure pgpool with master and slave nodes
It is impossible to tell from an SQL statement if it will modify data or not. What about SELECT delete_my_data();?
So all tools that try to figure that out by looking at the SQL statement are potentially problematic.
The best you can do is to write your application so that it uses two data sources: one for reading and one for writing, and you determine what goes where.
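A rough sketch of that two-data-source approach in Python (psycopg2 here, with made-up host names and a hypothetical helper): the caller states explicitly whether an operation is read-only, rather than having a proxy guess from the SQL text:
import psycopg2

# Hypothetical DSNs: primary for writes, standby for reads.
DSNS = {
    False: "host=primary.db.local dbname=mydb user=app password=secret",
    True: "host=standby.db.local dbname=mydb user=app password=secret",
}

def run(sql, params=None, readonly=False):
    """The caller decides where the statement goes, not the SQL text."""
    conn = psycopg2.connect(DSNS[readonly])
    try:
        with conn, conn.cursor() as cur:   # commits (or rolls back) the transaction
            cur.execute(sql, params)
            return cur.fetchall() if readonly else None
    finally:
        conn.close()

# Even though this is a SELECT, the caller knows it writes, so it goes to the primary.
run("SELECT delete_my_data()", readonly=False)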

haproxy streaming - no dependency on proxy

I have been testing HAProxy, which does cookie-based load balancing to our streaming servers. But let's say, for example, HAProxy falls over (I know, unlikely) and the streamer gets disconnected. Is there a way of handing the connection off so it no longer relies on HAProxy, basically leaving the streamer connected to the destination and cutting all ties with HAProxy?
That is not possible by design.
HAProxy is a proxy (as the name suggests). As such, for each communication, you have two independent TCP-connections, one between the client and HAProxy and another between HAProxy and your backend server.
If HAProxy fails or you need to failover, the standing connections will have to be re-created. You can't pass over existing connections to another server since there is a lot of state attached to each connection that can't be transferred.
If you want to remove the load balancer from the equation after the initial connection setup, you should look at layer-3 load balancing solutions like LVS on Linux with Direct Routing. Note that these solutions are much less flexible than HAProxy. There is no such thing as a free lunch, after all :)

Multiple A records for failover

I have 5 MongoDB servers at Amazon:
192.168.1.1
192.168.1.2
192.168.1.3
192.168.1.4
192.168.1.5
and 2 HAProxy servers for load balancing:
192.168.1.6
192.168.1.7
My domain is registered at namecheap.com; let's call it domain.com.
1) Can I point database.domain.com to both HAProxy servers? If yes, how?
2) If HAProxy server 192.168.1.6 fails, will 192.168.1.7 take over?
3) Can I control the timeout (TTL) of the records?
Please explain to me how things work and how to make it work the way I want. I'm trying to understand how such a system is set up for failover. I'm seeking knowledge, not humiliation, so please either try to help or don't do anything.
Anna, stay positive; we are all learning from each other. You need to create a replica set of all of your MongoDB servers. A replica set is MongoDB's answer to handling failover.
Please see https://docs.mongodb.org/manual/replication/
To connect to MongoDB you don't need any proxy servers; just point your application directly at the replica set and the driver will route operations to the primary. Depending on your application, the MongoDB connection string can look slightly different. Normally, it should be something like:
mongodb://192.168.1.1,192.168.1.2,192.168.1.3,192.168.1.4,192.168.1.5/?replicaSet=<replicaset_name>
See https://docs.mongodb.org/manual/reference/connection-string/
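For example, with the Python driver (pymongo) the client takes the whole seed list and handles primary discovery and failover itself; the database name and replica set name below are placeholders:
from pymongo import MongoClient

# replicaSet name is a placeholder; use the name you chose in rs.initiate().
client = MongoClient(
    "mongodb://192.168.1.1,192.168.1.2,192.168.1.3,192.168.1.4,192.168.1.5"
    "/?replicaSet=rs0")

db = client["mydatabase"]
# Writes always go to whichever member is currently primary.
db.events.insert_one({"msg": "hello"})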
Just further to Saleem's answer, by way of explanation of the DNS side: multiple A records in DNS don't act as a failover mechanism; they act more like a load balancer, in that your upstream DNS server will request the A records and select one of those listed to return to you, and this record may change each time the time to live expires and your DNS server has to re-request the A record.
Some DNS servers are smart enough to request a new A record if the provided one doesn't work, which gives you a certain level of pseudo-redundancy, but most do not have this feature enabled.
(source: Using DNS for failover using multiple A records)
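You can see that behaviour from the client side with a quick sketch in Python (database.domain.com is just the hypothetical name from the question, so substitute a real one):
import socket

# Resolve every A record the resolver returns for the name.
infos = socket.getaddrinfo("database.domain.com", 27017,
                           socket.AF_INET, socket.SOCK_STREAM)
print({info[4][0] for info in infos})
# Most clients simply use the first address; real failover only happens if the
# client (or driver) is written to retry the remaining addresses itself.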

Is it a good idea to set "leastconn" as the load balancing method in HAProxy to handle BOSH connections?

I have an HAProxy instance used as a load balancer for BOSH (http-bind, http://xmpp.org/extensions/xep-0206.html) servers. It was running with the "roundrobin" load balancing method, but I ran into some issues: when some instances go down, all of their connections are redistributed to the active instances. When the dead nodes come back up, they don't have the same number of connections as the other instances and aren't using the same resources. If other instances then go down, the sessions are redistributed again, some servers become overloaded, and others that were already running at their limits go down too, so the whole service is interrupted and I need to restart all instances at the same time so that the sessions can be evenly redistributed.
I was reading about how to configure BOSH load balancing with HAProxy and found the book "Professional XMPP Programming with JavaScript and jQuery", in which the author recommends using "leastconn" as the balance method for HAProxy.
The HAProxy documentation says that we shouldn't use "leastconn" with short HTTP connections, but that we should use it where very long sessions are expected.
I think this balancing method can help with the issue when servers go down, because it will redistribute the sessions equally among the active nodes, and when an instance comes back up, all new sessions will go to that instance until it has the same number of sessions as the other servers.
Has anyone any experience with this kind of configuration? What HAProxy settings or tuning do you recommend for balancing BOSH connections?
If your sessions are long, and with BOSH they usually are, then leastconn will provide better load balancing than roundrobin.
Roundrobin works well for very short connections.
cheers