Pattern for a singleton application process using the database - postgresql

I have a backend process that maintains state in a PostgreSQL database, which needs to be visible to the frontend. I want to:
Properly handle the backend being stopped and started. This alone is as simple as clearing out the backend state tables on startup.
Guard against multiple instances of the backend trampling each other. There should only be one backend process, but if I accidentally start a second instance, I want to make sure either the first instance is killed, or the second instance is blocked until the first instance dies.
Solutions I can think of include:
Exploit the fact that my backend process listens on a port. If a second instance of the process tries to start, it will fail with "Address already in use". I just have to make sure it does the listen step before connecting to the database and wiping out state tables.
Open a secondary connection and run the following:
BEGIN;
LOCK TABLE initech.backend_lock IN EXCLUSIVE MODE;
Note: the reason for IN EXCLUSIVE MODE is that LOCK defaults to ACCESS EXCLUSIVE mode, which conflicts with the ACCESS SHARE lock acquired by pg_dump. EXCLUSIVE is the strongest mode that does not conflict with ACCESS SHARE, yet it still conflicts with itself, so a second backend taking the same lock blocks while pg_dump can still read the table.
Don't commit. Leave the table locked until the program dies.
What's a good pattern for maintaining a singleton backend process that maintains state in a PostgreSQL database? Ideally, I would acquire a lock for the duration of the connection, but LOCK TABLE cannot be used outside of a transaction.
Background
Consider an application with a "broker" process which talks to the database, and accepts connections from clients. Any time a client connects, the broker process adds an entry for it to the database. This provides two benefits:
The frontend can query the database to see what clients are connected.
When a row changes in another table called initech.objects, and clients need to know about it, I can create a trigger that generates a list of clients to notify of the change, writes it to a table, then uses NOTIFY to wake up the broker process.
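A hedged sketch of that second point; the tables initech.connected_clients and initech.pending_notifications and the channel broker_wakeup are illustrative names I'm assuming, not part of the original schema:

CREATE FUNCTION initech.queue_notifications() RETURNS trigger AS $$
BEGIN
  -- Record which clients need to hear about this change
  -- (a real version would filter to the clients that care).
  INSERT INTO initech.pending_notifications (client_id, object_id)
  SELECT c.client_id, NEW.id
  FROM initech.connected_clients c;
  -- Wake the broker process, which is blocked in LISTEN broker_wakeup.
  NOTIFY broker_wakeup;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER objects_changed
AFTER UPDATE ON initech.objects
FOR EACH ROW EXECUTE PROCEDURE initech.queue_notifications();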
Without the table of connected clients, the application has to figure out what clients to notify. In my case, this turned out to be quite messy: store a copy of the initech.objects table in memory, and any time a row changes, dispatch the old row and new row to handlers that check if the row changed and act if it did. To do it efficiently involves creating "indexes" against both the table-stored-in-memory, and handlers interested in row changes. I'm making a poor replica of SQL's indexing and querying capabilities in the broker program. I'd rather move this work to the database.
In summary, I want the broker process to maintain some of its state in the database. It vastly simplifies dispatching configuration changes to clients, but it requires that only one instance of the broker be connected to the database at a time.

This can be done with advisory locks:
http://www.postgresql.org/docs/9.1/interactive/functions-admin.html#FUNCTIONS-ADVISORY-LOCKS
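For example, a session-level advisory lock is held until it is explicitly released or the connection ends, which is exactly the "lock for the duration of the connection" behaviour asked for; the key (42 here) is an arbitrary constant the application picks:

-- Blocks until the lock is free; held until pg_advisory_unlock(42) or disconnect.
SELECT pg_advisory_lock(42);

-- Non-blocking variant: returns true if acquired, false if another backend
-- already holds it, so a second instance can quit instead of waiting.
SELECT pg_try_advisory_lock(42);

The blocking form gives the "second instance is blocked until the first instance dies" behaviour; the try form gives fail-fast.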

I solved this today in a way I thought was concise:
CREATE TYPE mutex AS ENUM ('active');
CREATE TABLE singleton (status mutex DEFAULT 'active' NOT NULL UNIQUE);
Then your backend process tries to do this:
INSERT INTO singleton VALUES ('active');
And quits or waits if it fails to do so.
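With this in place, a second instance's insert fails immediately with a unique-constraint violation (the constraint name below is PostgreSQL's auto-generated default for this table, assuming no explicit name was given):

INSERT INTO singleton VALUES ('active');
-- ERROR:  duplicate key value violates unique constraint "singleton_status_key"

One caveat: the row has to go away again, either by deleting it on clean shutdown or by never committing the INSERT and holding its transaction open, so a crashed instance doesn't block restarts forever.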

Related

Postgres: processes terminated after connection break / invalidation

I don't understand some of Postgres's mechanisms, and it makes me quite upset.
I usually use DBeaver as an SQL client to query an external Postgres database. If I run CREATE .. or INSERT .. queries and the connection is then broken or invalidated for some reason, the pid keeps running and finishes the transaction.
But for some more complicated PL/pgSQL functions we wrote (with temp tables, loops, inserts, etc.), breaking the connection always causes process termination (the process disappears from the session list just before making its next SQL operation, e.g. inserting a row into logtable). It doesn't matter whether it's the DBeaver editor or the psql command line.
I know that disconnecting may be a critical problem that should be eliminated, and that maybe I shouldn't expect the process to continue successfully, but I do :) Failing that, I'd like to know why it happens and whether it's possible to prevent it.
If the network connection fails, the database server can detect that in two ways:
if it tries to send data to the client, it will figure out pretty quickly that the connection is down
if it tries to receive data from the client, it will only notice when the kernel's TCP keepalive mechanism has determined that the connection is down
When you say that sometimes execution of a function is terminated right away, I would say that is because the function returned data to the client.
In the case where a query keeps running, it is not attempting to return any data yet.
There is no cure for the former, but in PostgreSQL v14 and later you can mitigate the latter by setting client_connection_check_interval, which makes the server check for a dead client periodically while a query is running. In addition, you have to set the PostgreSQL keepalive parameters so that the dead connection becomes known quickly; a sketch of both follows below.
See my article for more.
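A minimal sketch of the relevant settings (the values are illustrative, not recommendations; client_connection_check_interval requires v14 or later):

ALTER SYSTEM SET client_connection_check_interval = '10s';  -- 0 (the default) disables the check
ALTER SYSTEM SET tcp_keepalives_idle = 60;                  -- seconds of idle before probing starts
ALTER SYSTEM SET tcp_keepalives_interval = 10;              -- seconds between probes
ALTER SYSTEM SET tcp_keepalives_count = 3;                  -- lost probes before the connection counts as dead
SELECT pg_reload_conf();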

How to avoid long delay before finally getting "40001 could not serialize access due to concurrent update"

We have a Postgres 12 system running one master and two async hot-standby replica servers, and we use SERIALIZABLE transactions. All the database servers have very fast SSD storage for Postgres and 64 GB of RAM. Clients connect directly to the master server if they cannot accept delayed data for a transaction. Read-only clients that can accept data up to 5 seconds old query the replica servers, using REPEATABLE READ transactions.
I'm aware that because we use SERIALIZABLE transactions Postgres might give us false positive matches and force us to repeat transactions. This is fine and expected.
However, the problem I'm seeing is that, at random, a single-row INSERT or UPDATE query stalls for a very long time. As an example, one error case was as follows (speaking directly to the master, to allow modifying table data):
A simple single row insert
insert into restservices (id, parent_id, ...) values ('...', '...', ...);
stalled for 74.62 seconds before finally emitting error
ERROR 40001 could not serialize access due to concurrent update
with error context
SQL statement "SELECT 1 FROM ONLY "public"."restservices" x WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
We log all queries exceeding 40 ms so I know this kind of stall is rare. Like maybe a couple of queries a day. We average around 200-400 transactions per second during normal load with 5-40 queries per transaction.
After finally getting the above error, the client code automatically released two savepoints, rolled back the transaction, and disconnected from the database (this cleanup took 2 ms total). It then reconnected to the database 2 ms later and replayed the whole transaction from the start, finishing in 66 ms including the time to connect. So I think this is not about the performance of the client or of the master server as a whole. The expected transaction time is between 5 and 90 ms, depending on the transaction.
Is there some PostgreSQL connection or master configuration setting that I can use to make PostgreSQL return the error 40001 faster, even if it causes more transactions to be rolled back? Does anybody know if setting
set local statement_timeout='250'
within the transaction has dangerous side-effects? According to the documentation https://www.postgresql.org/docs/12/runtime-config-client.html "Setting statement_timeout in postgresql.conf is not recommended because it would affect all sessions" but I could set the timeout only for transactions by this client that's able to automatically retry the transaction very fast.
Is there anything else to try?
It looks like someone had the parent row to the one you were trying to insert locked. PostgreSQL doesn't know what to do about that until the lock is released, so it blocks. If you failed rather than blocking, and upon failure retried the exact same thing, the same parent row would (most likely) still be locked and so would just fail again, and you would busy-wait. Busy-waiting is not good, so blocking rather than failing is generally a good thing here. It blocks and then unblocks only to fail, but once it does fail a retry should succeed.
An obvious exception to blocking-being-better-than-failing is when, on retry, you can pick a different parent row, if that makes sense in your context. In that case, maybe the best thing to do is to explicitly lock the parent row with NOWAIT before attempting the insert, as sketched below. That way you can perhaps deal with failures in a more nuanced way.
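A sketch of that idea, assuming a hypothetical parent table restservices_parents and psql-style :placeholders; FOR KEY SHARE matches the lock the foreign-key check in the error context takes, and NOWAIT turns a wait into an immediate error:

BEGIN;
-- Fails at once with SQLSTATE 55P03 (lock_not_available) instead of blocking
-- if another transaction holds a conflicting lock on the parent row.
SELECT 1 FROM restservices_parents WHERE id = :parent_id FOR KEY SHARE NOWAIT;
INSERT INTO restservices (id, parent_id) VALUES (:id, :parent_id);
COMMIT;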
If you must retry with the same parent_id, then I think the only real solution is to figure out who is holding the parent row lock for so long, and fix that. I don't think that setting statement_timeout would be hazardous, but it also wouldn't solve your problem, as you would probably just keep retrying until the lock on the offending row is released. (Setting it on the other session, the one holding the lock, might be helpful, depending on what that session is doing while the lock is held.)
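To find out who is holding the lock, something along these lines works on PostgreSQL 9.6 and later (pg_blocking_pids returns the backends blocking a given pid):

SELECT pid, pg_blocking_pids(pid) AS blocked_by, state, query
FROM pg_stat_activity
WHERE cardinality(pg_blocking_pids(pid)) > 0;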

Are the changes of a write transaction in ClientA immediately visible to a ClientB read, started after COMMIT?

We are observing some behaviours/errors in some of our workflows, related to the consistency and visibility of a Postgres write transaction followed by a read. One of our developers offered an explanation, but I could not find any search results documenting the proposed reasoning.
Given a single Postgres 10.3 host, the following operations take place:
ClientA performs a successful write transaction
After the COMMIT, an external notification is emitted
ClientB reacts to external notification and performs a read, only to find that the UPDATE transaction changes are not visible
The explanation that was proposed is that two postgres client connections on different threads don't have a guaranteed view snapshot and may not immediately observe the write transaction update after the commit. But from what I have read, I would expect that after the COMMIT has succeeded, a read operation then starting in response should see the effects of that write.
My specific question is: Given two database client connections on different threads, is it possible for a race condition where one client does not see the effects of a write transaction even though its read started AFTER the other client committed? (no overlapping transactions)
Every bit of documentation I have found thus far only refers to concerns about overlapping/concurrent transaction and the MVCC/transaction isolation topics. Nothing about a synchronised serial operation between two different client connections.
Edit: Some extra details about the configuration.
ClientA and ClientB would be different threads accessing postgres through a connection pool. Clients may both be in the same connection pool on the same application server, or it may be ClientA/ApplicationA and ClientB/ApplicationB.
When ClientB reacts, it will access the existing Application server connection pool to make a new read.
No, that cannot happen, unless the reading transaction started earlier and is running at the REPEATABLE READ or SERIALIZABLE isolation level.
There is also the possibility that the reading transaction does not connect to the same server as the writing transaction, but to a streaming replication standby server with hot_standby enabled. Then this can easily happen, even with synchronous replication (unless you set synchronous_commit = remote_apply).
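For instance, a writer that must guarantee its commit is already visible on the standby before emitting the notification can request that per session; this sketch presumes synchronous streaming replication is configured:

-- COMMIT will not return until the synchronous standby has applied the WAL,
-- so a read routed to that standby afterwards sees the change.
SET synchronous_commit = remote_apply;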

In this case from Nygard's "Release it!" why do deadlocks happen?

I'm reading over and over this paragraph from Michael Nygard's book "Release it!" and I still don't understand why exactly deadlocks can happen:
Imagine 100,000 transactions all trying to update the same row of the same table in the same database. Somebody is bound to get deadlocked. Once a single transaction with a lock on the user's profile got hung (because of the need for a connection from a different resource pool), all the other database transactions on that row got blocked. Pretty soon, every single request-handling thread got used up with these bogus logins. As soon as that happens, the site is down.
When he says "because of the need for a connection from a different resource pool", is this inside the DB engine? What is this other resource pool and why would a connection from this other resource pool be needed?
Then, "every single request-handling thread" refers already not to DB threads, but to application threads, right? And they hung because they're waiting for the DB transactions (that are already hung) to finish?
The problem is that applications interface with a LOT of different systems, any of which can run in parallel, have internal or external locks, and depend on yet more systems.
A simple example of a deadlock is when two processes each need to acquire the same two locks to proceed, but cannot agree on who goes first and in which order (agreeing on that order is usually what the locks are for in the first place, so it's a chicken-and-egg problem, not exactly trivial). Processes A and B each need to acquire two locks, #1 and #2, to do their thing. While A is locking #1, B is locking #2; then A tries to lock #2 and B tries to lock #1 - that's a deadlock. Someone has to give in for any work to get done.
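A concrete illustration in SQL, using a hypothetical accounts table; when two sessions interleave like this, PostgreSQL's deadlock detector aborts one of them:

-- Session A:
BEGIN;
UPDATE accounts SET balance = balance + 1 WHERE id = 1;  -- A locks row 1

-- Session B:
BEGIN;
UPDATE accounts SET balance = balance + 1 WHERE id = 2;  -- B locks row 2
UPDATE accounts SET balance = balance + 1 WHERE id = 1;  -- B blocks, waiting for A

-- Session A again:
UPDATE accounts SET balance = balance + 1 WHERE id = 2;  -- A blocks on B: deadlock
-- After deadlock_timeout (1s by default), one session is aborted with:
-- ERROR:  deadlock detected  (SQLSTATE 40P01)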
In real life, let's say you're running multiple instances of your web application to serve multiple incoming client requests (e.g. web browsers) at the same time. It doesn't matter whether those are threads, processes, or coroutines. Instances of your application can hang if they each require locks on the same two database rows. Or they can hang because, in addition to a database lock, they also need a lock on a file in the file system. Or because they need a lock on a file and are also using a third-party remote REST API that has locks of its own. Or for countless other reasons, including all of the above simultaneously.

Concerns about zookeeper's lock-recipe

While reading the ZooKeeper recipe for locks, I got confused. It seems that this recipe for distributed locks cannot guarantee that "at any snapshot in time no two clients think they hold the same lock". But since ZooKeeper is so widely adopted, if there were such a mistake in the reference documentation, someone would have pointed it out long ago. So what did I misunderstand?
Quoting the recipe for distributed locks:
Locks
Fully distributed locks that are globally synchronous, meaning at any snapshot in time no two clients think they hold the same lock. These can be implemented using ZooKeeper. As with priority queues, first define a lock node.
Call create( ) with a pathname of "locknode/guid-lock-" and the sequence and ephemeral flags set.
Call getChildren( ) on the lock node without setting the watch flag (this is important to avoid the herd effect).
If the pathname created in step 1 has the lowest sequence number suffix, the client has the lock and the client exits the protocol.
The client calls exists( ) with the watch flag set on the path in the lock directory with the next lowest sequence number.
If exists( ) returns false, go to step 2. Otherwise, wait for a notification for the pathname from the previous step before going to step 2.
Consider the following case:
Client1 successfully acquired the lock (in step 3), with ZooKeeper node "locknode/guid-lock-0";
Client2 created node "locknode/guid-lock-1", failed to acquire the lock, and is now watching "locknode/guid-lock-0";
Later, for some reason (say, network congestion), Client1 fails to send a heartbeat message to the ZooKeeper cluster on time, but Client1 is still working away, mistakenly assuming that it still holds the lock.
But, ZooKeeper may think Client1's session is timed out, and then
delete "locknode/guid-lock-0",
send a notification to Client2 (or maybe send the notification first?),
but can not send a "session timeout" notification to Client1 in time (say, due to network congestion).
Client2 gets the notification, goes to step 2, gets the only node "locknode/guid-lock-1", which it created itself; thus, Client2 assumes it holds the lock.
But at the same time, Client1 assumes it holds the lock.
Is this a valid scenario?
The scenario you describe could arise. Client 1 thinks it has the lock, but in fact its session has timed out, and Client 2 acquires the lock.
The ZooKeeper client library will inform Client 1 that its connection has been disconnected (the client doesn't know the session has expired until it reconnects to a server), so the client can write code that assumes its lock has been lost if it has been disconnected for too long. But the thread which uses the lock needs to check periodically that the lock is still valid, which is inherently racy.
...But, ZooKeeper may think Client1's session is timed out, and then...
From the ZooKeeper documentation:
The removal of a node will only cause one client to wake up since each node is watched by exactly one client. In this way, you avoid the herd effect.
There is no polling or timeouts.
So I don't think the problem you describe arises. It looks to me as though there could be a risk of hanging locks if something happens to the clients that create them, but the scenario you describe should not arise.
From the Packt book Zookeeper Essentials:
If there was a partial failure in the creation of the znode due to connection loss, it's possible that the client won't be able to correctly determine whether it successfully created the child znode. To resolve such a situation, the client can store its session ID in the znode data field or even as a part of the znode name itself. As a client retains the same session ID after a reconnect, it can easily determine whether the child znode was created by it by looking at the session ID.