Load Balancing and Failover for Read-Only PostgreSQL Database

Scenario
Multiple application servers host web services written in Java, running in SpringSource dm Server. To implement a new requirement, they will need to query a read-only PostgreSQL database.
Issue
To support redundancy, at least two PostgreSQL instances will be running. Access to PostgreSQL must be load balanced and must auto-fail over to currently running instances if an instance should go down. Auto-discovery of newly running instances is desirable but not required.
Research
I have reviewed the official PostgreSQL documentation on this issue. However, that focuses on the more general case of read/write access to the database. Top Google results tend to lead to older newsgroup messages or dead projects such as Sequoia or DB Balancer, as well as one active project, PG Pool II.
Question
What are your real-world experiences with PG Pool II? What other simple and reliable alternatives are available?

PostgreSQL's wiki also lists clustering solutions, and the page on Replication, Clustering, and Connection Pooling has a table showing which solutions are suitable for load balancing.
I'm looking forward to PostgreSQL 9.0's combination of Hot Standby and Streaming Replication.

Have you looked at SQL Relay?

The standard solution for something like this is to look at Slony, Londiste or Bucardo. They all provide async replication to many slaves, where the slaves are read-only.
You then implement the load balancing independently of this - on the TCP layer with something like HAProxy. Such a solution will be able to do failover of the read connections (though you'll still lose transaction visibility on a failover and have to start a new transaction on the new slave - but that's fine for most people).
Then all you have left is failover of the master role. There are supported ways of doing it on all these systems. None of them are automatic by default (because automatic failover of a database master role is really dangerous - consider the situation you are in once you've got split brain), but they can be automated easily if you need this for the master as well.
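To show why this split keeps the application simple, here is a minimal Java sketch under that setup: the services point their JDBC URL at the HAProxy listener rather than at any individual slave, so balancing and failover of read connections happen entirely below the application. The host name, port, database, table, and credentials are placeholders, not values from the question.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ReadOnlyLookup {
    // Placeholder address: assumed to be an HAProxy "mode tcp" frontend that
    // round-robins connections across the read-only slaves and skips dead nodes.
    private static final String URL =
            "jdbc:postgresql://pg-read.example.com:5433/appdb";

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(URL, "app_reader", "secret");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT count(*) FROM orders")) {
            while (rs.next()) {
                // The application neither knows nor cares which slave answered.
                System.out.println("orders: " + rs.getLong(1));
            }
        }
    }
}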

Related

Flyway: Multiple nodes migrating in parallel with PgBouncer transaction pooling

We're encountering issues with using Flyway for database migrations with multiple nodes in parallel, backed by a PostgreSQL database behind PgBouncer with transaction pooling.
The problem is that when multiple nodes start up at the same time, Flyway takes an exclusive lock, but this appears to be a session-level lock, which isn't supported by PgBouncer transaction pooling (as multiple nodes may end up sharing the same session). The nodes then block each other and fail to start up.
Is there anything we can change or configure in Flyway to support this? We'd prefer not to switch away from transaction pooling if possible, as that's our main motivation for using PgBouncer.
At the moment, Flyway doesn't support PgBouncer, so you're seeing errors because of that lack of support, and there are currently no workarounds from the developers. I'd suggest opening an issue on the community GitHub; that's the best way to get changes in.
As a workaround, we're currently configuring two data sources for our application - one through PgBouncer as normal, and another with a single connection, used solely by Flyway, that bypasses PgBouncer and connects directly to the PostgreSQL back end.
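A minimal sketch of that workaround, assuming the plain Flyway Java API and HikariCP (host names, database, and credentials are placeholders): migrations run over a dedicated single-connection data source that goes straight to the PostgreSQL back end, while normal application traffic keeps going through PgBouncer's transaction pooling.

import com.zaxxer.hikari.HikariDataSource;
import org.flywaydb.core.Flyway;

public class MigrateThenServe {
    public static void main(String[] args) {
        // Dedicated single-connection pool that bypasses PgBouncer, so Flyway's
        // session-level lock is held on one real PostgreSQL session.
        HikariDataSource flywayDs = new HikariDataSource();
        flywayDs.setJdbcUrl("jdbc:postgresql://pg-primary.example.com:5432/appdb");
        flywayDs.setUsername("migrator");
        flywayDs.setPassword("secret");
        flywayDs.setMaximumPoolSize(1);

        Flyway.configure()
              .dataSource(flywayDs)
              .load()
              .migrate();
        flywayDs.close();

        // Normal application traffic still goes through PgBouncer (transaction pooling).
        HikariDataSource appDs = new HikariDataSource();
        appDs.setJdbcUrl("jdbc:postgresql://pgbouncer.example.com:6432/appdb");
        appDs.setUsername("app");
        appDs.setPassword("secret");
        // ... hand appDs to the rest of the application here.
    }
}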

PostgreSQL Multi-master Synchronisation

I have a scenario as follows:
One cloud server is running an application with PGSQL as DB
Multiple local servers are running the same application with PGSQL as DB
Users may access the cloud server to read/write data
Users may access any of the local servers to read/write data
What I need is synchronisation between all these databases. The synchronisation can be done live while connectivity is available, or else as soon as connectivity becomes available again.
Please guide me with some inputs on where I can start.
Rethink your requirements.
Multimaster replication is full of pitfalls, and it is easy to get your databases out of sync unless you plan carefully. You'd probably be better off with a single master node.
That said, you could look at BDR by 2ndQuadrant, which provides such functionality.

Clustering PostgreSQL clusters

This will be confusing for some due to poor terminology choices by the PostgreSQL folks, but please bear with me...
We have a need to be able to support multiple PostgreSQL (PG) clusters, and cluster them on multiple servers using, e.g., repmgr - for example, to support both server availability and PITR for each PG cluster. A single PG cluster per server is too expensive in many cases, so we multi-tenant (small) customers on separate PG clusters, for data separation, recovery, etc., but also want to be able to support HA via replication/fail-over.
The closest analogy for a PG cluster is a SQL Server instance - each can host multiple DBs, has its own port, etc. Like SQL Server, you can run multiple instances (PG clusters) on the same server, and set up replication for each.
Basic repmgr setup is no problem - that seems fairly clear in the single PG cluster model. But, is there any recommended/supported approach to multiple PG clusters using repmgr? I can kind of imagine faking repmgr into thinking each PG cluster is in effect a separate repmgr cluster (with separate repmgr.conf, connection info/port). But, I'm not yet sure that will work.
I'd typically expect to fail-over all PG clusters on the same server - not one at a time.
I recognize this may not be the best idea in all cases, but am mostly exploring what's possible. I have some alternatives, but this is closest to our current single-node model.
To clarify, I need to support many thousands of customers across many server clusters. Ideally, each cluster uses the same repmgr DB (in the main PG cluster, for example), and essentially stands alone from the other server clusters.
Thanks...
Answering my own question, but I hope someone eventually posts a better answer, as I otherwise quite like repmgr. In the end, it appears repmgr just isn't suitable for multiple PG clusters (instances), as there is an implied relationship between the repmgr cluster connection strings and the PG cluster (port). Thus, you'd essentially have to create a separate repmgr environment (DB) for every clustered PG cluster/instance, losing a lot of the operational simplicity that repmgr brings to the table.
I will investigate a more generic solution using Corosync/Pacemaker/etc., as at least in that case, the virtual cluster IP handling is built into the solution, and doesn't require additional software/resources to pull off.
I'm sure I'm probably over-simplifying things, but it seems like repmgr was tantalizingly close to solving much of the problem, had it allowed the repmgr DB to be fully independent of the PG cluster and allowed each repmgr cluster to specify its own connection info, not (only) the connection info for the repmgr DB itself.

Slony and PGPool for fail-over

We're considering Slony and PGPool as alternatives to handle failover in our application - and it seems like, since we're going to need at least two DB servers anyway, we could take advantage of load balancing too.
Under which circumstances is Slony better than PGPool, and vice versa?
This is anecdotal, so take it with a grain of salt.
PGPool and streaming WAL replication (with hot standby or not) work the way database replication ought to. Your application doesn't need to know anything about replication or whether it is part of a cluster or whatnot; it just talks to the database as it would any other. Streaming replication is robust, and has the ability to fall back to WAL shipping if streaming breaks. PGPool makes managing this process easy and gives good heartbeats and monitoring info.
Slony, on the other hand, is an administrative tar-pit, which needs to add trigger functions and numerous other objects to your database to work. Furthermore, Slony doesn't (easily) support the ability to replicate schema changes (DDL) in the same way it replicates ordinary insert/update/delete type operations (DML). Taken as a whole, these characteristics mean that, in many cases, your application needs to have special cases to handle Slony's eccentricities. In my opinion, that's bad: the application shouldn't have to be aware of/make changes to deal with the database replication solution that it runs on. I spent the better part of a year hacking Slony to work as a stable replication solution, and eventually came to the conclusion that it's a bad idea/bad replication mechanic implemented in an obtuse, illegible way, that is anything but stable or enterprise-ready. EDIT: as of PostgreSQL 9.3 you can install triggers (which Slony uses to detect changes) on DDL objects, which might allow Slony to replicate more aspects of a database.
That said, Slony does do two things better than streaming replication (administered via PGPool or no):
Slony allows per-table or per-database replication. Streaming replication only permits the replication of an entire Postgres instance. If that kind of granularity is important to you, you might want Slony.
Slony allows cascading (master -> slave -> slave) replication. Streaming replication only allows one level. EDIT: This is now supported in PostgreSQL native streaming replication, as of Postgres version 9.2.
For literally everything else, streaming replication is better and more stable.
But you're not just asking about streaming replication: PGPool does a great deal more than that. It allows the balancing of read and write loads between read-only slave databases and masters, and the implementation of backup plans, as well as a whole host of other things. Especially when compared to Slony's arcane command syntax and godawful administration scripts, PGPool wins in nearly every instance.
With regards to failover specifically, PGPool has heartbeat monitors and the ability to automatically route database traffic in a cluster. Slony only has a "fail over to slave now" command, leaving all of the monitoring and app-side routing up to you.
TL;DR: PGPool good. Slony bad.

Which PostgreSQL replication solution to use for my specific scenario

I need to replicate a PostgreSQL database server as follows:
Two servers are adjacent to each other - one is the master and the other the standby. If the master fails, the standby takes over. Replication from master to slave needs to be failsafe, hence synchronous. The standby will not be used for any querying unless it has become a master. So, no high-availability/load-balancing is required.
There is another backup server at a remote location. Data from the master server mentioned above will be replicated to this remote server asynchronously and in batches. Time is not a factor at all in this replication - a couple of hours is just fine. This server would be used just for backup.
I've studied the currently available replication solutions from the PostgreSQL docs as well as from Google, but can't decide which combination of synchronous and asynchronous solutions I would need.
The closest I came up with is using pgpool-II for scenario 1 and Mammoth for scenario 2. However, as pgpool is statement-based, what would happen to queries containing rand() and now()?
Please note that I'd rather use free and open-source replication tools.
Also, just a side question - according to scenario 1 above, when the master fails, the standby will take over. Would the master-slave roles be reversed after that, or would the slave go back to its standby state after the recovery of the master server?
Any suggestion would be highly appreciated. Thanks.
I suggest using DRBD for scenario 1 and either 9.0 built-in replication or Slony for scenario 2.
Before PostgreSQL 9.1 (not yet released), there is no other synchronous replication solution available, and DRBD is widely established for this purpose. Together with Pacemaker or Heartbeat, which come with all the scripts needed for PostgreSQL monitoring and switchover, you have a very robust and fairly easy to manage solution. (In fact, I'd consider continuing to use DRBD even after 9.1 comes out; it's just a lot easier and has a longer track record.)
For the cross-site asynchronous replication, you could try the built-in replication of PostgreSQL 9.0, perhaps in conjunction with repmgr for monitoring and management. Alternatively, you could try the (now a bit) old-school Slony, but I'd guess it will be more complicated for your needs.
You didn't mention if the server in question was on a specific version or if this was a new project with the freedom to choose the version. The answers vary based on that information.
If you are starting with a clean slate, I would recommend designing based on the PostgreSQL 9.1 beta. The final version will be released long before you would be ready to go into a production environment and it has binary synchronous replication built-in.
I've been using the built-in asynchronous replication in PostgreSQL for years in almost the exact same scenario you describe and it has always been rock-solid for me. It's become even better with 9.0 with Hot standby and it's become much easier to configure and maintain. 9.1 provides the only missing piece you require.
However, if you are trying to replicate an existing server, built-in asynchronous replication with aggressive settings for "checkpoint_timeout" and a very frequent backup of unarchived WAL files could be sufficient until you can upgrade to 9.1.
The bottom line here is that you can get exactly what you want with stock PostgreSQL 9.1 - no third-party products required.
As for failover, it is not an automatic process; you'll need to handle that yourself. I would recommend that, after a failover, you leave the roles of the two machines switched until either the next failover event or a controlled manual failover during a scheduled outage in a slow period of use. Again, this is not automatic and must be managed by the administrator (via shell scripts, presumably).