What to do after a query when auto_commit is disabled - postgresql

In some scenarios we should call setAutoCommit(false) before a query; see https://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor and "When does the PostgreSQL JDBC driver fetch rows after executing a query?".
But neither of these covers what to do after the query, when the ResultSet and Statement are closed but the Connection is not (it may be recycled by a ConnectionPool or DataSource).
I have these choices:
Do nothing (keep autoCommit = false for next query)
set autoCommit = true
commit
rollback
Which one is the best practice?

Even queries are executed in a transaction. If you started a transaction (which implicitly happened when you executed the query), then you should also end it. Generally, doing nothing would, with a well-behaved connection pool, result in a rollback when your connection is returned to the pool. However, it is best not to rely on such implicit behaviour, because not all connection pools or drivers adhere to it. For example, the Oracle JDBC driver will commit on connection close (or at least, it did so in the past; I'm not sure if it still does), and that might not be the correct behaviour for your program. Explicitly calling commit() or rollback() clearly documents the boundaries and expectations of your program.
Though committing or rolling back a transaction that only executed a query (and thus did not modify the database), will have the same end result, I would recommend using commit() rather than rollback(), to clearly indicate that the result was successful. For some databases, committing might be cheaper than rollback (or vice versa), but such systems usually have heuristics that will convert a commit to rollback (or vice versa, whatever is 'cheaper'), if the result would be equivalent.
You generally don't need to switch auto-commit mode when you're done. A well-behaved connection pool should do that for you (though not all do, or sometimes you need to explicitly configure this). Double check the behaviour and options of your connection pool to be sure.
If you want to continue using a connection yourself (without returning to the pool), then switching back to auto-commit mode is sufficient: calling setAutoCommit(true) with an active transaction will automatically commit that transaction.
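As a minimal sketch of the advice above (end the transaction explicitly, commit on success, roll back on failure), something like the following could be used. The class name QueryTx, the Work callback, and the countRows helper are hypothetical names introduced for illustration only; they are not part of JDBC or the PostgreSQL driver.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class QueryTx {
    /** Hypothetical callback type: the work to run inside the transaction. */
    @FunctionalInterface
    interface Work<T> {
        T run(Connection conn) throws SQLException;
    }

    /** Runs the work with auto-commit off and always ends the transaction explicitly. */
    static <T> T inTransaction(Connection conn, Work<T> work) throws SQLException {
        conn.setAutoCommit(false);
        try {
            T result = work.run(conn);
            conn.commit();   // success: end the transaction, even for a read-only query
            return result;
        } catch (SQLException | RuntimeException e) {
            try {
                conn.rollback();   // failure: end the transaction without keeping changes
            } catch (SQLException rollbackFailure) {
                e.addSuppressed(rollbackFailure);
            }
            throw e;
        }
    }

    /** Example use: count rows while fetching in batches of fetchSize. */
    static long countRows(Connection conn, String sql, int fetchSize) throws SQLException {
        return QueryTx.<Long>inTransaction(conn, c -> {
            long rows = 0;
            try (PreparedStatement ps = c.prepareStatement(sql)) {
                ps.setFetchSize(fetchSize);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        rows++;
                    }
                }
            }
            return rows;
        });
    }
}
```

The helper leaves auto-commit off when it returns; whether to switch it back is left to the pool or the caller, as discussed above.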

It depends what you want to do afterwards. If you want to return to autocommit mode after the operation:
conn.setAutoCommit(true);
This will automatically commit the open transaction.

Related

Postgres: processes terminated after connection break / invalidation

I don't understand some of Postgres's mechanisms and it makes me quite upset.
I usually use DBeaver as an SQL client to query an external pg database. If I run create.. or insert.. queries and the connection is then broken or invalidated for some reason, the backend process (pid) keeps running and finishes the transaction.
But for some more complicated PL/pgSQL functions we wrote (with temp tables, loops, inserts, etc.), breaking the connection always causes the process to terminate (it disappears from the session list just before making the next SQL operation, e.g. inserting a row into a log table). It makes no difference whether it's the DBeaver editor or the psql command.
I know that disconnecting may be a critical problem that should be eliminated, and maybe I shouldn't expect the process to continue successfully, but I do :) Or I would at least like to know why it happens and whether it is possible to prevent it.
If the network connection fails, the database server can detect that in two ways:
if it tries to send data to the client, it will figure out pretty quickly that the connection is down
if it tries to receive data from the client, it will only notice when the kernel's TCP keepalive mechanism has determined that the connection is down
When you say that sometimes execution of a function is terminated right away, I would say that is because the function returned data to the client.
In the case where a query keeps running, it is not attempting to return any data yet.
There is no cure for the former, but in PostgreSQL v14 you can prevent the latter by setting client_connection_check_interval. In addition, you have to set the PostgreSQL keepalive parameters so that the dead connection becomes known quickly.
See my article for more.

How to avoid long delay before finally getting "40001 could not serialize access due to concurrent update"

We have a Postgres 12 system running one master and two async hot-standby replica servers and we use SERIALIZABLE transactions. All the database servers have very fast SSD storage for Postgres and 64 GB of RAM. Clients connect directly to the master server if they cannot accept delayed data for a transaction. Read-only clients that accept data up to 5 seconds old use the replica servers for querying data. Read-only clients use REPEATABLE READ transactions.
I'm aware that because we use SERIALIZABLE transactions Postgres might give us false positive matches and force us to repeat transactions. This is fine and expected.
However, the problem I'm seeing is that randomly a single line INSERT or UPDATE query stalls for a very long time. As an example, one error case was as follows (speaking directly to master to allow modifying table data):
A simple single row insert
insert into restservices (id, parent_id, ...) values ('...', '...', ...);
stalled for 74.62 seconds before finally emitting error
ERROR 40001 could not serialize access due to concurrent update
with error context
SQL statement "SELECT 1 FROM ONLY "public"."restservices" x WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x"
We log all queries exceeding 40 ms so I know this kind of stall is rare. Like maybe a couple of queries a day. We average around 200-400 transactions per second during normal load with 5-40 queries per transaction.
After finally getting the above error, the client code automatically released two savepoints, rolled back the transaction and disconnected from database (this cleanup took 2 ms total). It then reconnected to database 2 ms later and replayed the whole transaction from the start and finished in 66 ms including the time to connect to the database. So I think this is not about performance of the client or the master server as a whole. The expected transaction time is between 5-90 ms depending on transaction.
Is there some PostgreSQL connection or master configuration setting that I can use to make PostgreSQL return the error 40001 faster, even if it causes more transactions to be rolled back? Does anybody know if setting
set local statement_timeout='250'
within the transaction has dangerous side-effects? According to the documentation https://www.postgresql.org/docs/12/runtime-config-client.html "Setting statement_timeout in postgresql.conf is not recommended because it would affect all sessions" but I could set the timeout only for transactions by this client that's able to automatically retry the transaction very fast.
Is there anything else to try?
It looks like someone had the parent row to the one you were trying to insert locked. PostgreSQL doesn't know what to do about that until the lock is released, so it blocks. If you failed rather than blocking, and upon failure retried the exact same thing, the same parent row would (most likely) still be locked and so would just fail again, and you would busy-wait. Busy-waiting is not good, so blocking rather than failing is generally a good thing here. It blocks and then unblocks only to fail, but once it does fail a retry should succeed.
An obvious exception to blocking being better than failing is when, on retry, you can pick a different parent row, if that makes sense in your context. In this case, maybe the best thing to do is to explicitly lock the parent row with NOWAIT before attempting the insert. That way you can perhaps deal with failures in a more nuanced way.
If you must retry with the same parent_id, then I think the only real solution is to figure out who is holding the parent row lock for so long, and fix that. I don't think that setting statement_timeout would be hazardous, but it also wouldn't solve your problem, as you would probably just keep retrying until the lock on the offending row is released. (Setting it on the other session, the one holding the lock, might be helpful, depending on what that session is doing while the lock is held.)
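The NOWAIT idea could be sketched roughly as follows. This is only an illustration under assumptions: the class ParentRowLock and its method names are hypothetical, the id is assumed to be passed as a string, and the lock strength (FOR KEY SHARE) is chosen to mirror the foreign-key check shown in the error context. SQLSTATE 55P03 is PostgreSQL's lock_not_available code. Because a failed statement aborts a PostgreSQL transaction, the sketch wraps the attempt in a savepoint, which requires auto-commit to be off.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Savepoint;

final class ParentRowLock {
    enum Result { LOCKED, BUSY, MISSING }

    /** PostgreSQL SQLSTATE for lock_not_available, raised by NOWAIT. */
    static final String LOCK_NOT_AVAILABLE = "55P03";

    static boolean isLockBusy(SQLException e) {
        return LOCK_NOT_AVAILABLE.equals(e.getSQLState());
    }

    /** Tries to lock the parent row without waiting; the connection must have auto-commit off. */
    static Result tryLockParent(Connection conn, String parentId) throws SQLException {
        Savepoint sp = conn.setSavepoint();
        String sql = "SELECT 1 FROM restservices WHERE id = ? FOR KEY SHARE NOWAIT";
        Result result;
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, parentId);
            try (ResultSet rs = ps.executeQuery()) {
                result = rs.next() ? Result.LOCKED : Result.MISSING;
            }
        } catch (SQLException e) {
            if (!isLockBusy(e)) {
                throw e;          // e.g. 40001: let the caller's retry logic handle it
            }
            conn.rollback(sp);    // undo only the failed lock attempt, keep the transaction usable
            return Result.BUSY;
        }
        conn.releaseSavepoint(sp);
        return result;
    }
}
```

On BUSY the caller can pick a different parent row or give up early; on LOCKED the insert can proceed in the same transaction.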

Bitronix transaction appears to be committing prematurely

We have a spring-batch process that uses the bitronix transaction manager. On the first pass of a particular step, we see the expected commit behavior - data is only committed to the target database when the transaction boundary is reached.
However, on the second and subsequent passes, rows are committed as soon as they are written. That is, they do not wait for the commit point.
We have confirmed that the bitronix commit is only called at the expected points.
Has anyone experienced this behavior before? What kind of bug am I looking for?
Java XA is designed in such a way that connections cannot be reused across transactions. Once the transaction is committed, the connection property is changed to autocommit=true, and the connection cannot be used in another transaction until it is returned to the connection pool and retrieved by the XA code again.

Can committing a transaction in PostgreSQL fail?

If I execute some SQL inside a transaction successfully, can it happen that the commit will fail? And what are possible causes? Can it fail related to the executed queries, or just due to some DB side issues?
The question comes up because I need to judge if it makes sense to commit transactions inside tests or if it is "safe enough" to just rollback after each test case.
If I execute some SQL inside a transaction successfully, can it happen that the commit will fail?
Yes.
And what are possible causes?
DEFERRABLE constraints with SET CONSTRAINTS DEFERRED or in a one-statement autocommit transaction. (Can't happen unless you use DEFERRABLE constraints)
SERIALIZABLE transaction with serialization failure detected at commit time. (Can't happen unless you use SERIALIZABLE transactions)
Asynchronous commit where the DB crashes or is shut down. (Can't happen if synchronous_commit = on, the default)
Disk I/O error, filesystem error, etc
Out-of-memory error
Network error leading to session disconnect after you send the commit but before you get confirmation of success. In this case you don't know for sure if it committed or not.
... probably more
Can it fail related to the executed queries, or just due to some DB side issues?
Either. A serialization failure, for example, is definitely related to the queries run.
If you're using READ COMMITTED isolation with no deferred constraints then commits are only likely to fail due to underlying system errors.
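To make the distinction between these failure kinds concrete, here is a rough sketch that classifies a SQLException thrown by commit() using its SQLSTATE. The class CommitFailure and its enum are hypothetical names; the mapping uses the standard SQLSTATE classes (40001 serialization failure, class 23 integrity constraint violation, class 08 connection exception) and assumes the driver reports them, which may not hold for every driver or error path.

```java
import java.sql.SQLException;

final class CommitFailure {
    enum Kind { SERIALIZATION_FAILURE, CONSTRAINT_VIOLATION, OUTCOME_UNKNOWN, OTHER }

    /** Classifies an exception thrown by Connection.commit() by its SQLSTATE. */
    static Kind classify(SQLException e) {
        String state = e.getSQLState();
        if (state == null) {
            return Kind.OTHER;
        }
        if (state.equals("40001")) {
            return Kind.SERIALIZATION_FAILURE;   // safe to retry the whole transaction
        }
        if (state.startsWith("23")) {
            return Kind.CONSTRAINT_VIOLATION;    // e.g. a deferred constraint checked at commit
        }
        if (state.startsWith("08")) {
            return Kind.OUTCOME_UNKNOWN;         // connection lost: commit may or may not have happened
        }
        return Kind.OTHER;
    }
}
```

A caller might retry on SERIALIZATION_FAILURE, report CONSTRAINT_VIOLATION as an application error, and re-check the database state on OUTCOME_UNKNOWN before deciding what to do.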
The question comes up because I need to judge if it makes sense to commit transactions inside tests or if it is "safe enough" to just rollback after each test case.
Any sensible test suite has to cover multiple concurrent transactions interacting, committing in different orders, etc.
If all you test is single standalone transactions you're not testing the real system.
So the question is IMO moot, because a decent suite of tests has to commit anyway.

Delete an uncommitted inserted row in DB2 (V8.2.7 - Fix 14)

Upon a client's request, I was asked to switch a web application to the read-uncommitted isolation level (it's probably a bad idea...).
While testing whether the isolation level was in place, I inserted a row without committing (DBVisualiser : #set autocommit off + stop VPN connection to the database) and I started testing my application against that uncommitted insert.
select * from MYTABLE WHERE MY ID = "NON_COMMIT_INSERT_ID" WITH UR is working fine. Now I would like to "delete" this row and I did not find any way...
UPDATE : The row did disappear after some time (about 30min). I guess there is some kind of timeout before a rollback is automatically issued. Is there any way to remove an uncommitted row before this happens ?
I don't think this is possible using normal SQL statements: the only way to delete the row is to roll back the transaction that inserted it (or wait for the transaction to commit, then delete the row). Since you disconnected from the DB at the network level, the 30 minutes you mention is probably a TCP timeout enforced at the operating system level. Once the TCP connection was terminated, DB2 rolled back the client's transaction automatically.
Still, I think you could administratively force the application to disconnect from the database (using FORCE APPLICATION with the handle obtained from LIST APPLICATIONS), which should roll back the transaction; see http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/core/r0001951.htm for details on these commands.
It's one thing to read uncommitted rows from a data base. There are sometimes good reasons (lack of read locks) for doing this.
It's another to leave inserted, updated, or deleted rows on a data base without a commit or roll back. You should never do this. Either commit or roll back after a database change.