Heroku PG: Recover Write access revoked - postgresql

I have had write access revoked on my Heroku dev plan because I had too many rows in my DB.
Here is the result I have:
$ heroku pg:info
=== HEROKU_POSTGRESQL_WHITE_URL (DATABASE_URL)
Plan: Dev
Status: available
Connections: 1
PG Version: 9.2.7
Created: 2013-06-21 13:24 UTC
Data Size: 12.0 MB
Tables: 48
Rows: 10564/10000 (Write access revoked) - refreshing
Fork/Follow: Unsupported
Rollback: Unsupported
Region: Europe
Since then, I have deleted half of the rows, as they were created by a script that ran out of control, and this is my development environment. I checked in pgAdmin and in the Rails console, and it appears that the rows are actually deleted.
How can I recover write access to my database? I don't want to upgrade, as I don't normally need it. I've been waiting for 2 hours already and nothing has changed.
I read that heroku pg:info doesn't update very quickly, but what can I do in the meantime?
Thanks for your support.

For the Starter tier (now called Hobby tier), the statistics on row count, table count, and database size are calculated and updated periodically by background workers. They can lag by a few minutes, but typically not more than 5 or 10.
After you've cleared out data to get under your row limit, it may take 5-10 minutes for write access to be restored naturally. You can usually speed up this process by running pg:info, which should cause a refresh to happen (thus you see refreshing above).
2 hours is not expected. If you're still locked out and you're sure you're under the row count, please open a ticket at help.heroku.com.
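If you want to double-check the live row count yourself while you wait, a rough sanity check is to query the statistics views from heroku pg:psql (Heroku's own row counter is computed separately, so the numbers may not match exactly):
$ heroku pg:psql
-- approximate number of live rows across all user tables
SELECT SUM(n_live_tup) AS approx_rows FROM pg_stat_user_tables;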

According to Heroku documentation: https://devcenter.heroku.com/articles/heroku-postgres-plans#row-limit-enforcement
When you are over the hobby tier row limits and try to insert, you will see a Postgres error:
permission denied for relation <table name>
The row limits of the hobby tier database plans are enforced with the following mechanism:
When a hobby-dev database hits 7,000 rows, or a hobby-basic database hits 7 million rows, the owner receives a warning email stating they are nearing their row limits.
When the database exceeds its row capacity, the owner will receive an additional notification. At this point, the database will receive a 7 day grace period to either reduce the number of records, or migrate to another plan.
If the number of rows still exceeds the plan capacity after 7 days, INSERT privileges will be revoked on the database. Data can still be read, updated or deleted from the database. This ensures that users still have the ability to bring their database into compliance, and retain access to their data.
Once the number of rows is again in compliance with the plan limit, INSERT privileges are automatically restored to the database. Note that the database sizes are checked asynchronously, so it may take a few minutes for the privileges to be restored.
So, you most likely have to upgrade your DB plan.
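For reference, a rough sketch of the upgrade path using pg:copy; plan names change over time, so check the current Heroku docs, and the app name and database color below are hypothetical:
$ heroku addons:create heroku-postgresql:hobby-basic --app myapp
$ heroku maintenance:on --app myapp
$ heroku pg:copy DATABASE_URL HEROKU_POSTGRESQL_PINK_URL --app myapp
$ heroku pg:promote HEROKU_POSTGRESQL_PINK_URL --app myapp
$ heroku maintenance:off --app myapp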

I faced the same restriction, but because I was over Heroku's data size limit (in GB) rather than the row limit; the outcome was the same ("Write access revoked; Database deletion imminent").
Deleting rows alone was not enough to free up disk space: after deleting the rows I also had to run the Postgres VACUUM FULL command in order to see a reduction in the data size figure returned by heroku pg:info.
Here's a full explanation of the steps to solve it.
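For what it's worth, a minimal sketch of that sequence as I understand it (VACUUM FULL rewrites each table and takes an exclusive lock on it while it runs, so avoid it on a busy production database):
$ heroku pg:psql
-- rewrite the tables and return the freed space to the operating system
VACUUM FULL;
\q
$ heroku pg:info   -- the Data Size figure should drop after the next refresh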

Related

Postgres pg_dump getting very slow while copying large objects

I was performing a pg_dump operation on a Postgres (v9) database of around 80 GB.
The operation never seemed to finish even when trying the following:
running a FULL VACUUM before dumping
dumping the db into a directory-format archive (using -Fd)
without compression (-Z 0)
dumping the db into a directory in parallel (tried up to 10 threads -j 10)
When using the --verbose flag I saw that most of the log output was related to creating/executing large objects.
When I tried dumping each table on its own (pg_dump -t table_name) the result was fast again (a matter of minutes), but when restoring that dump to another DB, the application that uses the DB started throwing exceptions about some resources not being found (they should have been in the DB).
As stated in the Postgres pg_dump docs, when using the -t flag the command will not copy blobs.
I added the flag -b (pg_dump -b -t table_name) and the operation went back to being slow.
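For reference, a sketch of the kind of invocations described above; the database name, output paths, and table name are placeholders:
# directory-format dump, no compression, 10 parallel jobs
pg_dump -Fd -Z 0 -j 10 -f /backups/mydb.dir mydb
# single-table dump including large objects (the slow variant)
pg_dump -b -t table_name -f /backups/table_name.sql mydb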
So I guess the problem is with exporting the blobs in the DB.
The number of blobs should be around 5 million, which can explain the slowness in general, but the dump kept running for as long as 5 hours before I killed the process manually.
The blobs are relatively small (max 100 KB per blob).
Is this expected, or is there something fishy going on?
The slowness was due to a high number of orphaned blobs.
Apparently, running a VACUUM FULL on a Postgres database does not remove orphaned large objects.
When I queried the amount of large objects in my database
select count(distinct loid) from pg_largeobject;
output:
151200997
The count returned by the query did not match the expected value: the expected number of blobs in my case was around 5 million.
The table (the one that I created in the app) that references those blobs is, in my case, subject to frequent updates, and Postgres does not delete the old tuples (rows) but rather marks them as 'dead' and inserts the new ones. With each update to the table, the old blob is no longer referenced by live tuples, only by dead ones, which makes it an orphaned blob.
Postgres ships a dedicated program, vacuumlo, to vacuum orphaned blobs.
After using it (the vacuum took around 4 hours) the dump operation became much faster. The new duration is around 2 hours (previously it ran for hours and hours without finishing).
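A minimal sketch of the vacuumlo step, assuming local access and a database named mydb (connection details are hypothetical):
# remove large objects that are no longer referenced by any OID or lo column
vacuumlo -v -h localhost -U postgres mydb
# afterwards, a regular vacuum of pg_largeobject lets Postgres reuse the space
psql -d mydb -c "VACUUM ANALYZE pg_largeobject;"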

How to resolve Amazon RDS Postgresql instance's DiskFull error?

We have a very small database for storing some relational data in an Amazon RDS instance. The version of the PostgreSQL engine is 12.7.
There are a number of Lambda functions in AWS in the same region that access this instance for inserting records. In the process, some join queries are also used. We use the psycopg2 Python library to interact with the DB. Since the size of the data is very small, we have used a t2.small instance with 20 GB storage and 1 CPU. In production, however, a t2.medium instance is used. Auto scaling has not been enabled.
Recently, we have started experiencing an issue with this database. After the Lambda functions run for a while, at some point they time out. This is because the database takes too long to return a response, or sometimes throws a DiskFull error as follows:
DiskFull
could not write to file "base/pgsql_tmp/pgsql_tmp1258.168": No space left on device
I have referred to this documentation to identify the cause: Troubleshoot RDS DiskFull error
Following are the queries for checking the DB file size:
SELECT pg_size_pretty(pg_database_size('db_name'));
The response of this query is 35 MB.
SELECT pg_size_pretty(SUM(pg_relation_size(oid))) FROM pg_class;
The output of the above query is 33 MB.
As we can see, the DB file size is very small. However, on checking the size of the temporary files, we see the following:
SELECT datname, temp_files AS "Temporary files", temp_bytes AS "Size of temporary files" FROM pg_stat_database;
If we look at the size of the temporary files, it's roughly 18.69 GB, which is why the DB is throwing a DiskFull error.
Why is the PostgreSQL instance not deleting the temporary files after the queries have finished? Even after rebooting the instance, the temporary file size is the same (although rebooting is not a feasible solution, as we want the DB to delete the temporary files on its own). Also, how do I avoid the DiskFull error, as I may want to run more Lambda functions that interact with the DB?
Just for additional information, I am including some RDS Monitoring graphs taken while the DB slowed down for CPU Utilisation and Free Storage Space:
From this, I am guessing that we probably need to enable autoscaling, as the CPU utilisation hits 83.5%. I would highly appreciate it if someone shared some insights, helped resolve the DiskFull error, and identified why the temporary files are not deleted.
One of the join queries the lambda function runs on the database is:
SELECT DISTINCT
    scl1.*, scl2.date_to AS compiled_date_to
FROM
    logger_main_config_column_name_loading
JOIN
    column_name_loading ON column_name_loading.id = logger_main_config_column_name_loading.column_name_loading_id
JOIN
    sensor_config_column_name_loading ON sensor_config_column_name_loading.column_name_loading_id = column_name_loading.id
JOIN
    sensor_config_loading AS scl1 ON scl1.id = sensor_config_column_name_loading.sensor_config_loading_id
INNER JOIN (
    SELECT id, hash, min(date_from) AS date_from, max(date_to) AS date_to
    FROM sensor_config_loading
    GROUP BY id, hash
) AS scl2
    ON scl1.id = scl2.id AND scl1.hash = scl2.hash AND scl1.date_from = scl2.date_from
WHERE
    logger_main_config_loading_id = %(logger_main_config_loading_id)s;
How can this query be optimized? Will running smaller queries in a loop be faster?
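One hedged way to confirm whether this query is what fills the temp space: prefix it with EXPLAIN (ANALYZE, BUFFERS) in psql and look for plan lines such as "Sort Method: external merge  Disk: ..." or hash joins with many batches. Shown here on a trimmed version of the join; run it on the full query above with a real value substituted for the parameter:
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT scl1.*, scl2.date_to AS compiled_date_to
FROM sensor_config_loading AS scl1
INNER JOIN (
    SELECT id, hash, min(date_from) AS date_from, max(date_to) AS date_to
    FROM sensor_config_loading
    GROUP BY id, hash
) AS scl2
    ON scl1.id = scl2.id AND scl1.hash = scl2.hash AND scl1.date_from = scl2.date_from;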
pg_stat_database does not show the current size and number of temporary files, it shows cumulative historical data. So your database had 145 temporary files since the statistics were last reset.
Temporary files get deleted as soon as the query is done, whether it succeeds or fails.
You get the error because you have some rogue queries that write enough temporary files to fill the disk (perhaps some forgotten join conditions). To avoid the out-of-space condition, set the parameter temp_file_limit to a reasonable value (in postgresql.conf on a self-managed server; on RDS, via the DB parameter group) and reload PostgreSQL.
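A small sketch of how to watch this, assuming you can connect with psql; note that the pg_stat_database counters are cumulative and that temp_file_limit is expressed in kB unless a unit is given:
-- cumulative temp-file statistics for the current database
SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_bytes
FROM pg_stat_database
WHERE datname = current_database();
-- reset the counters so future readings reflect only new activity
SELECT pg_stat_reset();
-- show the current per-session cap on temporary file usage (-1 means unlimited)
SHOW temp_file_limit;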

PostgreSQL: even read access changes data files on disk, leading to large incremental backups using pgbackrest

We are using pgbackrest to backup our database to Amazon S3. We do full backups once a week and an incremental backup every other day.
The size of our database is around 1 TB; a full backup is around 600 GB, and an incremental backup is still around 400 GB!
We found out that even read access (pure select statements) on the database has the effect that the underlying data files (in /usr/local/pgsql/data/base/xxxxxx) change. This results in large incremental backups and also in very large storage (costs) on Amazon S3.
Usually the files with low index names (e.g. 391089.1) change on read access.
On an update, we see changes in one or more files - the index could correlate to the age of the row in the table.
Some more facts:
Postgres version 13.1
Database is running in docker container (docker version 20.10.0)
OS is CentOS 7
We see the phenomenon on multiple servers.
Can someone explain why PostgreSQL changes data files on pure read access?
We tested this on a database with no other clients accessing it.
This is normal. Some cases I can think of right away are:
a SELECT or other SQL statement setting a hint bit
This is a shortcut for subsequent statements that access the data, so they don't have to consult the commit log any more.
a SELECT ... FOR UPDATE writing a row lock
autovacuum removing dead row versions
These are leftovers from DELETE or UPDATE.
autovacuum freezing old visible row versions
This is necessary to prevent data corruption if the transaction ID counter wraps around.
The only way to fairly reliably prevent PostgreSQL from modifying a table in the future is:
never perform an INSERT, UPDATE or DELETE on it
run VACUUM (FREEZE) on the table and make sure that there are no concurrent transactions
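A minimal sketch of that last step, using a hypothetical table name (run it while nothing else is writing to the table):
-- aggressively freeze all row versions so later reads have nothing left to set
VACUUM (FREEZE, VERBOSE) my_archive_table;
-- sanity check: relfrozenxid advances and relallvisible approaches relpages
SELECT relname, relfrozenxid, relallvisible, relpages
FROM pg_class
WHERE relname = 'my_archive_table';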

How to fix "uncommitted xmin from before xid cutoff needs to be frozen" during automatic vacuum of table "db.pg_catalog.pg_largeobject"

The PostgreSQL database error log generates this error all day, and it continues into the next day:
[23523] ERROR: uncommitted xmin 53354897 from before xid cutoff 210760077 needs to be frozen
[23523] CONTEXT: automatic vacuum of table "xxxx.pg_catalog.pg_largeobject"
[23523] ERROR: uncommitted xmin 53354897 from before xid cutoff 210760077 needs to be frozen
[23523] CONTEXT: automatic vacuum of table "xxxx.pg_catalog.pg_largeobject_metadata"
The errors involve system catalogs (pg_catalog.pg_largeobject, pg_catalog.pg_largeobject_metadata).
I need help on how to fix this, or on what will be affected if I disable autovacuum on these two tables.
Note:
DB : PostgreSQL version 11.6
OS : Red Hat Enterprise Linux Server release 7.8
You are experiencing data corruption, and if you don't take action, you are headed for disaster: if autovacuum keeps failing (as it will), you will eventually get close enough to transaction ID wraparound that your database will stop accepting transactions.
Create a new database cluster, dump the corrupted cluster with pg_dumpall, restore it into the new cluster and remove the old one.
You are running an old minor release (current is 11.10), so you are missing about a year of bug fixes. The cause could be a software bug or (more often) a hardware problem.
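A rough sketch of that dump-and-restore route (ports, paths, and data directories are hypothetical, and a badly corrupted cluster may still make pg_dumpall fail partway):
pg_dumpall -p 5432 > /backups/cluster.sql                  # dump the old, corrupted cluster
initdb -D /var/lib/pgsql/11/new_data                       # create a fresh cluster
pg_ctl -D /var/lib/pgsql/11/new_data -o "-p 5433" start    # run it on another port
psql -p 5433 -d postgres -f /backups/cluster.sql           # restore into the new cluster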
As Laurenz told you, it is data corruption, but you don't necessarily have to dump everything and restore.
If the affected rows are not important, you can delete them by the xmin number 53354897.
To be safer, you can dump those rows first and delete them afterwards, achieving no downtime.
In my case, that error happened in a log table and I could delete the rows without any data loss.
Observation: if you have data corruption, you should check your hardware and data integrity as well, even if you delete the problematic rows.

How to retrieve all database sql executed from particular server?

I am running an application on a particular server which updates a Postgres database table. Is there any way that I can retrieve all the queries executed against that database (maybe just against my table) over a period of time, if I have admin privileges?
You can install the extension pg_stat_statements, which will give you a summary of the queries executed.
Note that the number of queries stored in the pg_stat_statements view is limited (the limit can be configured), so you will probably want to store a snapshot of that view on a regular basis. How often depends on your workload; increasing pg_stat_statements.max means you can take snapshots less frequently.
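A short sketch of the setup and a typical snapshot query; pg_stat_statements must be listed in shared_preload_libraries (a server restart, or a parameter-group change on managed services, is required), and the timing column is total_exec_time on PostgreSQL 13+ but total_time on older versions:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
-- top 10 statements by total execution time
SELECT query, calls, total_exec_time, rows
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;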