I have a very large multi-million row transaction that I ended up needing to kill.
This transaction scanned a very large number of rows and created new rows in a new table if certain conditions were met.
This was in a commit block and did not complete before I killed the process— are there any repercussions to killing the process and restarting the server? I do not even see the tables in the db (presumably because the commit never happened). Can I just immediately try to do my migration again?
The answer depends on how you “killed” the transaction.
If you hit Ctrl+C or canceled the query with pg_cancel_backend or pg_terminate_backend, the transaction will have rolled back normally.
Any table you created in the session will be gone.
If you modified rows in pre-existing tables, the new rows will be dead and aotovacuum will remove them.
At worst, you will have some bloat in some tables that will be reused by the next attempt at your transaction.
Similarly, if you used a regular kill to kill the backend process of the session, everything will be fine.
If you used kill -9 to kill the session's backend process, PostgreSQL will have gone into crash recovery.
Your database will be consistent after crash recovery, but it is possible that some files (belonging to newly created tables) will be left behind. Such orphans take up space and are never removed, and the only safe way to get rid of that wasted space is to dump the database and restore it to a new database cluster.
Theoretically, yes. You should be able to just go ahead and try again. It might mean that some of the cleanup hasn't been performed yet, so there are some partial tables floating around, taking up memory, but nothing that should impact your data quality.
Related
I have a scenario that repeats itself every few hours. In every few hours, there is a sudden increase in row exclusive locks in PostgreSQL DB. In Meantime there seems that some queries are not responded in time and causes connection exhaustion to happen that PostgreSQL does not accept new clients anymore. After 2-3 minutes locks and connection numbers drops and the system comes back to normal state again.
I wonder if auto vacuum can be the root cause of this? I see analyze and vacuum (NOT FULL VACCUM) take about 20 seconds to complete on one of the tables. I have INSERT,SELECT,UPDATE and DELETE operations going on from my application and I don't have DDL commands (ALTER TABLE, DROP TABLE, CREATE INDEX, ...) going on. Can auto vacuum procedure conflict with queries from my application and cause them to wait until vacuum has completed? Or it's all the applications and my bad design fault? I should say one of my tables has a field of type jsonb that keeps relatively large data for each row (10 MB roughly).
I have attached an image from monitoring application that shows the sudden increase in row exclusive locks.
ROW EXCLUSIVE locks are perfectly harmless; they are taken on tables against which DML statements run. Your graph reveals nothing. You should set log_lock_waits = on and log_min_duration_statement to a reasonable value. Perhaps you can spot something in the logs. Also, watch out for long running transactions.
I'm getting the following error when running a query on a PostgreSQL db in standby mode. The query that causes the error works fine for 1 month but when you query for more than 1 month an error results.
ERROR: canceling statement due to conflict with recovery
Detail: User query might have needed to see row versions that must be removed
Any suggestions on how to resolve? Thanks
No need to touch hot_standby_feedback. As others have mentioned, setting it to on can bloat master. Imagine opening transaction on a slave and not closing it.
Instead, set max_standby_archive_delay and max_standby_streaming_delay to some sane value:
# /etc/postgresql/10/main/postgresql.conf on a slave
max_standby_archive_delay = 900s
max_standby_streaming_delay = 900s
This way queries on slaves with a duration less than 900 seconds won't be cancelled. If your workload requires longer queries, just set these options to a higher value.
Running queries on hot-standby server is somewhat tricky — it can fail, because during querying some needed rows might be updated or deleted on primary. As a primary does not know that a query is started on secondary it thinks it can clean up (vacuum) old versions of its rows. Then secondary has to replay this cleanup, and has to forcibly cancel all queries which can use these rows.
Longer queries will be canceled more often.
You can work around this by starting a repeatable read transaction on primary which does a dummy query and then sits idle while a real query is run on secondary. Its presence will prevent vacuuming of old row versions on primary.
More on this subject and other workarounds are explained in Hot Standby — Handling Query Conflicts section in documentation.
There's no need to start idle transactions on the master. In postgresql-9.1 the
most direct way to solve this problem is by setting
hot_standby_feedback = on
This will make the master aware of long-running queries. From the docs:
The first option is to set the parameter hot_standby_feedback, which prevents
VACUUM from removing recently-dead rows and so cleanup conflicts do not occur.
Why isn't this the default? This parameter was added after the initial
implementation and it's the only way that a standby can affect a master.
As stated here about hot_standby_feedback = on :
Well, the disadvantage of it is that the standby can bloat the master,
which might be surprising to some people, too
And here:
With what setting of max_standby_streaming_delay? I would rather
default that to -1 than default hot_standby_feedback on. That way what
you do on the standby only affects the standby
So I added
max_standby_streaming_delay = -1
And no more pg_dump error for us, nor master bloat :)
For AWS RDS instance, check http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.PostgreSQL.CommonDBATasks.html
The table data on the hot standby slave server is modified while a long running query is running. A solution (PostgreSQL 9.1+) to make sure the table data is not modified is to suspend the replication and resume after the query:
select pg_xlog_replay_pause(); -- suspend
select * from foo; -- your query
select pg_xlog_replay_resume(); --resume
I'm going to add some updated info and references to #max-malysh's excellent answer above.
In short, if you do something on the master, it needs to be replicated on the slave. Postgres uses WAL records for this, which are sent after every logged action on the master to the slave. The slave then executes the action and the two are again in sync. In one of several scenarios, you can be in conflict on the slave with what's coming in from the master in a WAL action. In most of them, there's a transaction happening on the slave which conflicts with what the WAL action wants to change. In that case, you have two options:
Delay the application of the WAL action for a bit, allowing the slave to finish its conflicting transaction, then apply the action.
Cancel the conflicting query on the slave.
We're concerned with #1, and two values:
max_standby_archive_delay - this is the delay used after a long disconnection between the master and slave, when the data is being read from a WAL archive, which is not current data.
max_standby_streaming_delay - delay used for cancelling queries when WAL entries are received via streaming replication.
Generally, if your server is meant for high availability replication, you want to keep these numbers short. The default setting of 30000 (milliseconds if no units given) is sufficient for this. If, however, you want to set up something like an archive, reporting- or read-replica that might have very long-running queries, then you'll want to set this to something higher to avoid cancelled queries. The recommended 900s setting above seems like a good starting point. I disagree with the official docs on setting an infinite value -1 as being a good idea--that could mask some buggy code and cause lots of issues.
The one caveat about long-running queries and setting these values higher is that other queries running on the slave in parallel with the long-running one which is causing the WAL action to be delayed will see old data until the long query has completed. Developers will need to understand this and serialize queries which shouldn't run simultaneously.
For the full explanation of how max_standby_archive_delay and max_standby_streaming_delay work and why, go here.
It might be too late for the answer but we face the same kind of issue on the production.
Earlier we have only one RDS and as the number of users increases on the app side, we decided to add Read Replica for it. Read replica works properly on the staging but once we moved to the production we start getting the same error.
So we solve this by enabling hot_standby_feedback property in the Postgres properties.
We referred the following link
https://aws.amazon.com/blogs/database/best-practices-for-amazon-rds-postgresql-replication/
I hope it will help.
Likewise, here's a 2nd caveat to #Artif3x elaboration of #max-malysh's excellent answer, both above.
With any delayed application of transactions from the master the follower(s) will have an older, stale view of the data. Therefore while providing time for the query on the follower to finish by setting max_standby_archive_delay and max_standby_streaming_delay makes sense, keep both of these caveats in mind:
the value of the follower as a standby / backup diminishes
any other queries running on the follower may return stale data.
If the value of the follower for backup ends up being too much in conflict with hosting queries, one solution would be multiple followers, each optimized for one or the other.
Also, note that several queries in a row can cause the application of wal entries to keep being delayed. So when choosing the new values, it’s not just the time for a single query, but a moving window that starts whenever a conflicting query starts, and ends when the wal entry is finally applied.
With slow query logging turned on, we see a lot of COMMITs taking upwards of multiple seconds to complete on our production database. On investigation, these are generally simple transactions: fetch a row, UPDATE the row, COMMIT. The SELECTs and UPDATEs in these particular transactions aren't being logged as slow. Is there anything we can do, or tools that we can use, to figure out the reason for these slow commits? We're running on an SSD, and are streaming to a slave, if that makes a difference.
Postgres commits are synchronous. This means they will wait for the WAL writes to complete before moving to the next one. You can adjust the WAL settings in the config file to adjust for this.
You can set the commit level to asynchronous at a session/user level or database wide with the synchronous_commit in the config file.
On the database side.
Vacuum your tables an update the statistics. This will get rid of dead tuples since your performing updates, there will be many.
VACUUM ANALYZE
I have found a bug in my application code where I have started a transaction, but never commit or do a rollback. The connection is used periodically, just reading some data every 10s or so. In the pg_stat_activity table, its state is reported as "idle in transaction", and its backend_start time is over a week ago.
What is the impact on the database of this? Does it cause additional CPU and RAM usage? Will it impact other connections? How long can it persist in this state?
I'm using postgresql 9.1 and 9.4.
Since you only SELECT, the impact is limited. It is more severe for any write operations, where the changes are not visible to any other transaction until committed - and lost if never committed.
It does cost some RAM and permanently occupies one of your allowed connections (which may or may not matter).
One of the more severe consequences of very long running transactions: It blocks VACUUM from doing it's job, since there is still an old transaction that can see old rows. The system will start bloating.
In particular, SELECT acquires an ACCESS SHARE lock (the least blocking of all) on all referenced tables. This does not interfere with other DML commands like INSERT, UPDATE or DELETE, but it will block DDL commands as well as TRUNCATE or VACUUM (including autovacuum jobs). See "Table-level Locks" in the manual.
It can also interfere with various replication solutions and lead to transaction ID wraparound in the long run if it stays open long enough / you burn enough XIDs fast enough. More about that in the manual on "Routine Vacuuming".
Blocking effects can mushroom if other transactions are blocked from committing and those have acquired locks of their own. Etc.
You can keep transactions open (almost) indefinitely - until the connection is closed (which also happens when the server is restarted, obviously.)
But never leave transactions open longer than needed.
There are two major impacts to the system.
The tables that have been used in those transactions:
are not vacuumed which means they are not "cleaned up" and their statistics aren't updated which might lead to bad (=slow) execution plans
cannot be changed using ALTER TABLE
We have a background process (linux daemon in an infinite loop) that automatically takes all lines from csv files that are placed in a certain directory and imports them into a table. The daemon processes any files that appear in the directory one by one, is written in python, and uses psycopg2 to connect to our postgresql database.
That process imports those records using INSERT statements, but first DELETES any table records that have the same unique key as any of the records in the csv file. Generally the process is DELETING a record for every record it INSERTS. So as this daemon is running in the background it is DELETING and then INSERTING rows. Every time it processes one file it specifically commits the transaction, closes the cursor, and then closes the connection.
Periodically (twice a day) we want to run CLUSTER to remove the dead tuples and keep the table to a manageable on disk size.
However, something in this process is stopping the CLUSTER command from removing the dead tuples for all the records that are being deleted as the process is running. We know this happening because if we run CLUSTER while the process is running, the on disk size of the table containing this imported data will not decrease and pg_stat_user_tables will show many dead tuples.
If we stop the process and then run CLUSTER, the on disk size of the table will decrease dramatically and and pg_stat_user_tables will report that all of the dead tuples are gone.
What's strange is we are committing the transaction and closing the connections every time we process each file, so I have no idea what is not allowing the dead tuples to be removed while the process is running.
Also strange, is that if we stop the process, then start the process again, then do a CLUSTER, it will remove all of the dead tuples created by the previous run of the daemon process; but any subsequent calls of CLUSTER will not clear any dead tuples created by the current run of the daemon process (while it is still running of course).
So something is maintaining some kind of link to the dead tuples until the process is stopped, even though we have committed the transaction and closed all connections to postgres that created those dead tuples. pg_locks does not report any open locks and no running transactions are reported, so it doesn't seem like its a lock or open transaction issue.
At the end of the day, this is stopping us from periodically running CLUSTER on the table so that it doesn't keep growing and growing.
I'm sure there is a simple answer to this, but I can't find it anywhere. Some skeleton code for the process is below. It really is a simple process so I have no idea what is going on here. Any guidance would be greatly appreciated.
while True:
l = [(get_modified_time(fname), fname) for fname in os.listdir('/tmp/data')]
l.sort()
for (t, fname) in l:
conn = psycopg2.connect("dbname='dbname' user='user' password='password'")
cursor = conn.cursor()
# Calls a postgresql function that reads a file and imports it into
# a table via INSERT statements and DELETEs any records that have the
# same unique key as any of the records in the file.
cursor.execute("SELECT import('%s', '%s');" % (fname, t))
conn.commit()
cursor.close()
conn.close()
os.remove(get_full_pathname(fname))
time.sleep(0.100)
What's wrong with autovacuum? When autovacuum does it's job, you don't have to use CLUSTER to cleanup dead tuples. CLUSTER isn't made for this, it's VACUUM.
If you change the proces to UPDATE duplicates, things might get even better when you use a lower FILLFACTOR: HOT updates. These are faster, reclaim space, keep the same order in storage and no need for VACUUM nor CLUSTER.