Sudden increase in row exclusive locks and connection exhaustion in PostgreSQL

I have a scenario that repeats every few hours: there is a sudden increase in ROW EXCLUSIVE locks in the PostgreSQL database. At the same time some queries do not respond in time, which causes connection exhaustion, and PostgreSQL no longer accepts new clients. After 2-3 minutes the lock and connection counts drop and the system returns to its normal state.
I wonder if autovacuum could be the root cause of this? I see that analyze and vacuum (not VACUUM FULL) take about 20 seconds to complete on one of the tables. My application issues INSERT, SELECT, UPDATE and DELETE; there are no DDL commands (ALTER TABLE, DROP TABLE, CREATE INDEX, ...) running. Can autovacuum conflict with queries from my application and make them wait until the vacuum has completed? Or is this entirely the fault of the application and my design? I should mention that one of my tables has a jsonb column that holds relatively large data per row (roughly 10 MB).
I have attached an image from the monitoring application that shows the sudden increase in row exclusive locks.

ROW EXCLUSIVE locks are perfectly harmless; they are taken on tables against which DML statements run. Your graph reveals nothing. You should set log_lock_waits = on and log_min_duration_statement to a reasonable value. Perhaps you can spot something in the logs. Also, watch out for long running transactions.
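A minimal sketch of enabling that logging (the one-second threshold is only an example value; tune it to your workload):

ALTER SYSTEM SET log_lock_waits = on;                 -- log a message whenever a lock wait exceeds deadlock_timeout
ALTER SYSTEM SET log_min_duration_statement = '1s';   -- log every statement slower than this
SELECT pg_reload_conf();                              -- both settings take effect on reload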

Related

postgres autovacuum retry mechanism

In Postgres, suppose autovacuum runs but for some reason is unable to actually clean anything up - for example when hot_standby_feedback is set and there are long-running queries on the standby. Say tab1 has been updated and that triggers autovacuum; meanwhile a long-running query is active on the standby and reports this to the primary, which will skip the vacuum on tab1.
Since the autovacuum got skipped for tab1, when does autovacuum vacuum the table again? Or will it not run autovacuum on it again, so that we would need to run vacuum on that table manually? Basically, does autovacuum retry tables that could not be vacuumed the first time?
Autovacuum does not get skipped due to hot_standby_feedback. It still runs, it just might not accomplish anything if no rows can be removed. If that is the case, pg_stat_all_tables.n_dead_tup does not get decremented, which means the table will probably get autovacuumed again the next time the database is assessed for autovacuuming, as the stats that make it eligible have not changed. On an otherwise idle system, this will happen about once every however long it takes to scan the not-all-visible part of the table, rounded up to the next increment of autovacuum_naptime.
It might be a good idea (although the use case is narrow enough that I doubt it) to suppress repeat autovacuuming of a table until the horizon has advanced far enough to make it worthwhile, but currently there is no code to do this.
Note that this differs from INSERT-driven autovacuums. There, n_ins_since_vacuum does get reset, even if none of the tuples were marked all-visible. So that vacuum will not run again until the table crosses some other threshold that makes it eligible.
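To watch these counters for a given table, something like the query below works (n_ins_since_vacuum is available from PostgreSQL 13 on; tab1 is the placeholder name from the question):

SELECT relname, n_dead_tup, n_ins_since_vacuum, last_autovacuum, autovacuum_count
FROM pg_stat_all_tables
WHERE relname = 'tab1';   -- placeholder table name from the question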

Stop autovacuum on old partition

We're having some issues when autovacuum triggers on one of our large tables (~100 GB).
Our ETL jobs only hit the last three partitions of this table, but from my understanding, when autovacuum is triggered on a partition the whole table is vacuumed, and this is causing the ETL job to wait until the vacuum has finished.
So far I've tried setting autovacuum_vacuum_scale_factor to both higher and lower values, which yields approximately the same execution time for our job.
Since a rather large number of INSERTs/UPDATEs is performed on this table, I would like to set a low value for autovacuum_vacuum_scale_factor on the three most recent partitions, but prevent vacuuming of the older partitions.
We already have a vacuum script that runs every evening, so I'm thinking about setting autovacuum_enabled to off on the older partitions and letting the script handle the vacuuming of these tables, but I'm not sure it's the right way to treat this problem.
Another pair of parameters I've stumbled upon is vacuum_freeze_min_age and autovacuum_freeze_max_age, but I'm not sure I understand how to use them.
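A sketch of the setup described above; the partition names are hypothetical and the 0.01 scale factor is only an example. Note that autovacuum_enabled = off does not stop anti-wraparound (freeze) vacuums, which is where autovacuum_freeze_max_age comes into play:

-- older, static partitions: leave them to the nightly vacuum script
ALTER TABLE big_table_2022 SET (autovacuum_enabled = off);
-- the three most recent, heavily written partitions: vacuum much more aggressively
ALTER TABLE big_table_2024_05 SET (autovacuum_vacuum_scale_factor = 0.01);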

Slow bulk read from Postgres Read replica while updating the rows we read

On RDS we have a main Postgres server and a read replica.
We constantly write new data and keep updating data from the last couple of days.
Reading from the read replica works fine for older data, but reading data from the last couple of days, where we keep updating rows on the main server, is painfully slow.
Queries that take 2-3 minutes on old data can timeout after 20 minutes when querying data from the last day or two.
Looking at the monitors like CPU I don't see any extra load on the read replica.
Is there a solution for this?
You are accessing over 65 buffers for every 1 visible row found in the index scan (and over 500 buffers for each row that the index scan returns, since 90% are filtered out by the mmsi criterion).
One issue is that your index is not as selective as it could be. If you had the index on (day, mmsi) rather than just (day), it should be about 10 times faster.
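A sketch of the suggested two-column index (assuming the table and column names from this thread match your schema); CONCURRENTLY avoids blocking writes while the index builds:

CREATE INDEX CONCURRENTLY simplified_blips_day_mmsi_idx
    ON simplified_blips (day, mmsi);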
But it also looks like you have a massive amount of bloat.
You are probably not vacuuming the table often enough. With your described UPDATE pattern, all the vacuum needs are accumulating in the newest data, but the activity counters are evaluated based on the full table size, so autovacuum is not done often enough to suit the needs of the new data. You could lower the scale factor for this table:
alter table simplified_blips set (autovacuum_vacuum_scale_factor = 0.01)
Or if you partition the data based on "day", then the partitions for newer days will naturally get vacuumed more often because the occurrence of updates will be judged against the size of each partition, it won't get diluted out by the size of all the older inactive partitions. Also, each vacuum run will take less work, as it won't have to scan all of the indexes of the entire table, just the indexes of the active partitions.
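A minimal sketch of such a layout, assuming day is a date column; the column list and partition bounds are illustrative only:

CREATE TABLE simplified_blips_part (
    day  date NOT NULL,
    mmsi bigint
    -- ... remaining columns from the original table ...
) PARTITION BY RANGE (day);

CREATE TABLE simplified_blips_2024_05 PARTITION OF simplified_blips_part
    FOR VALUES FROM ('2024-05-01') TO ('2024-06-01');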
As suggested, the problem was bloat.
When you update a row in PostgreSQL, MVCC means the database creates a new version of the row containing the updated data rather than overwriting the old one.
After the update you end up with a "dead record" (a.k.a. dead tuple).
Every so often autovacuum runs and removes the dead tuples from the table.
Usually autovacuum is fine as it is, but if your table is really large and updated often, you should consider making the autovacuum and autoanalyze thresholds more aggressive.
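For example, per-table storage parameters along these lines make autovacuum and autoanalyze kick in after a much smaller fraction of the table has changed (the table name and values are only illustrative):

ALTER TABLE big_hot_table SET (
    autovacuum_vacuum_scale_factor  = 0.01,   -- vacuum after ~1% of rows are dead (default 20%)
    autovacuum_analyze_scale_factor = 0.02    -- analyze after ~2% of rows change (default 10%)
);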

Postgresql: Querying on Hot Standby, option to continue ignoring effects of vacuum on any relevant rows?

My long-running SELECT queries against the hot standby are failing, apparently because replay on the standby leads to vacuuming of some of the rows matching my query.
Is there support for an option where I can ask the hot standby server not to bother about such changes to the rows (even rows that were updated/deleted) and to continue with the scan for my query?
Or is cancelling queries whose matching rows were cleaned up by vacuum during replay something the server always does, with no other supported behaviour?
You can use hot_standby_feedback to tell the primary server not to vacuum rows that the standby server is still using. If you are concerned about affecting the primary in this way, you could instead use one of max_standby_streaming_delay or max_standby_archive_delay (depending on whether you are streaming or replaying archived WAL files).
These are all detailed here: https://www.postgresql.org/docs/current/runtime-config-replication.html
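A minimal sketch, assuming PostgreSQL 12 or later and superuser access on the standby (the same parameters can instead be set in postgresql.conf); the 30-second delay is only an example value:

-- run on the standby
ALTER SYSTEM SET hot_standby_feedback = on;              -- primary keeps rows the standby's queries still need
-- or, alternatively, let queries delay replay for a bounded time:
ALTER SYSTEM SET max_standby_streaming_delay = '30s';
SELECT pg_reload_conf();                                  -- both parameters take effect on reload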

WALWriteLock on Unlogged table

We've been experiencing some connection spikes on pgbouncer connected to a postgres database. When we query pg_stat_activity during these spikes we see tons of active queries with wait_event WALWriteLock.
We changed some of our highly inserted tables to unlogged, yet inserts into these tables are still showing up during the spikes with a wait_event of WALWriteLock. I thought that if a table is unlogged, then inserts into it wouldn't get caught up waiting for WALWriteLock. What gives?
Further, any suggestions on how to stop these spikes?
WALWriteLock is acquired by PostgreSQL processes while WAL records are flushed to disk or during a WAL segment switch. synchronous_commit=off removes the wait for the disk flush, and full_page_writes=off reduces the amount of data to flush.
You can try the above; they are also described here:
https://www.percona.com/blog/2018/10/30/postgresql-locking-part-3-lightweight-locks/
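If you try synchronous_commit = off, note that it can be limited to individual sessions or transactions rather than the whole cluster; a minimal sketch, with a hypothetical table name:

BEGIN;
SET LOCAL synchronous_commit = off;              -- this commit will not wait for the WAL flush
INSERT INTO sensor_events VALUES (1, now());     -- hypothetical table
COMMIT;                                          -- a crash may lose the last few async commits, but causes no corruption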
One possible answer is that, despite the table itself not being written to WAL, COMMITs are still written to WAL, and if you issue lots of tiny transactions this can show up. See if you can group the inserts into a smaller number of transactions.
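A sketch of grouping inserts into fewer transactions (table name and values are hypothetical):

BEGIN;
INSERT INTO sensor_events VALUES (1, now());
INSERT INTO sensor_events VALUES (2, now());
INSERT INTO sensor_events VALUES (3, now());
-- ... many more rows ...
COMMIT;   -- one commit per batch instead of one per row reduces commit/WAL-flush overhead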