Is there a way to set a rate limit on delete queries (like 5 deletes in 5 minutes) on a table in PostgreSQL, to prevent DB users from deleting all data?
I have a situation where updates on my temp table are slow. Below is the scenario.
A temp table is created once per session; from then on, all insert, update and delete operations run against it until the session ends.
First I insert the rows, and based on those rows I update other columns. These updates are slow compared to a normal table: I checked the performance by swapping in a normal table, which takes around 50 to 60 seconds, whereas the temp table takes nearly 5 minutes.
I tried ANALYZE on the temp table and got improved performance: with ANALYZE the updates complete in about 50 seconds.
I tried user-defined types also, but no luck.
The record count in the temp table is 480.
Can anyone help to improve the performance on the temp table without ANALYZE, or suggest an alternative to bulk collect and bulk insert into user-defined types?
All of the above operations are in PostgreSQL.
The lack of information in your question forces me to guess, but if all other things are equal, the difference is probably that you don't have accurate statistics on the temporary table. For normal tables (which are visible to other sessions, and hence to autovacuum), autovacuum takes care of that automatically, but for temporary tables you have to run ANALYZE explicitly to gather table statistics.
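A minimal sketch of that pattern, with a hypothetical temp table and column names: bulk insert first, then ANALYZE before the updates that depend on good row estimates.

CREATE TEMP TABLE my_temp (id bigint, val text);

INSERT INTO my_temp (id, val)
SELECT g, 'row ' || g
FROM generate_series(1, 480) AS g;

ANALYZE my_temp;  -- autovacuum never does this for temp tables

UPDATE my_temp
SET val = upper(val)
WHERE id % 2 = 0;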
I have a table in Aurora Postgres 9.6 that is just:
create table myTable
(
  id    uuid  default extensions.uuid_generate_v4() not null,
  blobs jsonb not null
);
The blobs can get rather large at times, but are usually a few MB, and they end up being stored in the TOAST table.
Under increased load, I started to see table locks on the Aurora replica (lock_relation, or AccessExclusiveLock in Postgres terms). Looking at the pg_locks table, it seems that the holder of the table lock is a system process.
We are not able to select any rows from that table. What I found is that once we kill the vacuum, the table lock is released and we can fetch rows.
Locks
id,locktype,database,relation,page,tuple,virtualxid,transactionid,classid,objid,objsubid,virtualtransaction,pid,mode,granted,fastpath
1,relation,16394,142767200,,,,,,,,1/0,33244,AccessExclusiveLock,true,false
Relation Translation
142767200 -> pg_toast.pg_toast_142767196
pg_toast.pg_toast_142767196 - mySchema.myTable
Vacuum Run
0 years 0 mons 0 days 1 hours 37 mins 47.688907 secs rdsadmin autovacuum: VACUUM pg_toast.pg_toast_142767196 active
Screenshots of locks
Questions:
My understanding is that autovacuum shouldn't take locks that interfere with other operations, so why do I see those?
Are there any other system processes connected to autovacuum that, once killed, unlock the table? (The PIDs don't match up directly.)
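As a diagnostic sketch, the backend holding such a lock can usually be traced by joining pg_locks to pg_stat_activity; the relation OID below is the TOAST table from the output above:

SELECT l.locktype,
       l.relation::regclass AS relation,
       l.mode,
       l.granted,
       l.pid,
       a.query,
       a.backend_start
FROM pg_locks l
LEFT JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'pg_toast.pg_toast_142767196'::regclass;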
We have a table with nearly 2 billion events recorded. As per our data model, each event is uniquely identified by a combined primary key over 4 columns. Excluding the primary key, there are 5 B-tree indexes, each on a single, different column, so 6 B-tree indexes in total.
The recorded events span several years, and now we need to remove the data older than 1 year.
We have a time column with long values recorded for each event, and we use the following query:
delete from events where ctid = any ( array (select ctid from events where time < 1517423400000 limit 10000) )
Do the indexes get updated?
During testing, they didn't.
After insertion,
total_table_size - 27893760
table_size - 7659520
index_size - 20209664
After deletion,
total_table_size - 20226048
table_size - 0
index_size - 20209664
A REINDEX can be done:
Command: REINDEX
Description: rebuild indexes
Syntax:
REINDEX { INDEX | TABLE | DATABASE | SYSTEM } name [ FORCE ]
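A sketch of how the sizes above can be measured and the indexes rebuilt, assuming the table is named events as in the question:

SELECT pg_total_relation_size('events') AS total_table_size,
       pg_table_size('events')          AS table_size,
       pg_indexes_size('events')        AS index_size;

REINDEX TABLE events;  -- rebuilds every index on the table, reclaiming index bloat

A plain DELETE only marks the heap tuples dead; the matching index entries are removed later by (auto)vacuum, and index files rarely shrink on disk, which is why index_size stays the same until a REINDEX.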
Considering everything, @a_horse_with_no_name's method is the good solution.
What we had:
Postgres version 9.4.
One table with 2 billion rows, 21 columns (all bigint), a combined primary key over 5 columns, and 5 single-column indexes, with data spanning 2 years.
It looks similar to time-series data, with a time column containing a UNIX timestamp, except that it's an analytics project, so time does not increase in strict order. The table was insert- and select-only (most select queries use aggregate functions).
What we need: our data span is 6 months, and we need to remove the older data.
What we did (with limited knowledge of Postgres internals):
Delete rows in batches of 10,000.
Initially the deletes were fast, taking milliseconds, but as the bloat increased, each batch delete grew to nearly 10 seconds. Then autovacuum got triggered, and it ran for almost 3 months. The insert rate was high, and each batch delete increased the WAL size too. Poor statistics on the table made the current queries so slow that they ran for minutes and hours.
So we decided to go for partitioning. We implemented it using table inheritance in 9.4.
Note: Postgres has Declarative Partitioning from version 10, which handles most manual work needed in partitioning using Table Inheritance.
Please go through the official docs as they have clear explanation.
Simplified, this is how we implemented it (a sketch follows the list below):
Create parent table
Create child tables inheriting from it, with check constraints. (We had monthly partitions and created them using a scheduler.)
Indexes need to be created separately for each child table.
To drop old data, just drop the child table; no vacuum is needed and it is instant.
Make sure the Postgres parameter constraint_exclusion is set to partition.
VACUUM ANALYZE the old partition after you start inserting into the new partition. (In our case, it helped the query planner use an index-only scan instead of a sequential scan.)
Using triggers as mentioned in the docs may make inserts slower, so we deviated from that: since we partitioned based on the time column, we calculated the table name at the application level from the time value before every insert, and it did not affect the insert rate for us.
Also read other caveats mentioned there.
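A minimal sketch of the setup above, with hypothetical table and column names; the time column holds the same kind of epoch-millisecond values used in the delete query earlier:

CREATE TABLE events (
    time    bigint NOT NULL,
    payload bigint
);

-- Child partitions inherit from the parent and carry CHECK constraints.
CREATE TABLE events_2018_01 (
    CHECK (time >= 1514764800000 AND time < 1517443200000)
) INHERITS (events);

CREATE TABLE events_2018_02 (
    CHECK (time >= 1517443200000 AND time < 1519862400000)
) INHERITS (events);

-- Indexes have to be created separately on each child table.
CREATE INDEX ON events_2018_01 (time);
CREATE INDEX ON events_2018_02 (time);

-- Lets the planner skip partitions whose CHECK constraint excludes the query range.
SET constraint_exclusion = partition;

-- Refresh stats on the old partition once inserts move to the new one.
VACUUM ANALYZE events_2018_01;

-- Dropping old data is instant: no DELETE, no bloat, no long-running vacuum.
DROP TABLE IF EXISTS events_2017_01;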
I wanted to update all rows in a table of size 1000 GB. I tried 2 methods:
1) First I tried to update 1 million rows, but after this I found that the table size had increased by 30 GB. Since I didn't want to rely on autovacuum to clean up that bloat, I rejected this method.
2) Then I tried to create a clone table and insert all records from the current table into the clone table. After inserting all the data, I exchanged the table names. Since I couldn't rename the tables without service downtime, I rejected this method as well.
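For reference, a rough sketch of the clone-and-swap idea in 2), assuming a hypothetical two-column table; the renames are fast but still take a brief ACCESS EXCLUSIVE lock, and any writes arriving during the copy would be lost:

BEGIN;
CREATE TABLE my_table_new (LIKE my_table INCLUDING ALL);

INSERT INTO my_table_new (id, col)
SELECT id, upper(col)              -- apply the desired update while copying
FROM my_table;

ALTER TABLE my_table     RENAME TO my_table_old;
ALTER TABLE my_table_new RENAME TO my_table;
COMMIT;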
I need a way to update all the records without any downtime.
I keep getting emails that state the following:
"[Your database X] contains 16,919 rows, exceeding the plan limit of 10,000. INSERT privileges to the database will be automatically revoked in 7 days. This will cause service failures in most applications dependent on this database."
Even though I have limited the number of rows in my single-table application to a maximum of 10,000, it usually hovers at 9,999.
I have checked the number of rows and the number of tables by psql and PGAdmin3.
Any idea how Heroku counts the number of rows in a database? Is this a platform bug or am I missing something?
Right now it makes estimated counts until you reach a certain threshold, at which point it performs accurate counts (this mechanism is subject to change). It will never revoke access or email a user without doing an accurate count first (SELECT count(*) FROM table1 + SELECT count(*) FROM table2, etc.).
It does not count system tables; it considers all user-level tables. Oftentimes people don't realize they have tables that are eating up rows, such as sessions, events, or logs.
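If you want to check where the rows are going yourself, a quick sketch using the standard statistics view (these are estimates, refreshed by autovacuum/ANALYZE):

SELECT relname, n_live_tup
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;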