Do I need to manually VACUUM temporary tables in PostgreSQL? - postgresql

Consider I have an application server which:
uses connection pooling (with a relatively high number of allowed idle connections),
can run for months, and
makes heavy use of temporary tables (which are not DROP'ped on COMMIT).
The above means that I may have N "eternal" database sessions "holding" N temporary tables, which will only be dropped when the server is restarted.
I'm well aware that the autovacuum daemon can't access those temporary tables.
My question is: if I frequently INSERT into and DELETE from temporary tables, and the tables are supposed to "live" for a long time, do I need to manually VACUUM those tables after a deletion, or would a single manual ANALYZE be enough?
Currently, if I execute
select
n_tup_del,
n_live_tup,
n_dead_tup,
n_mod_since_analyze,
vacuum_count,
analyze_count
from
pg_stat_user_tables
where
relname = '...'
order by
n_dead_tup desc;
I see that vacuum_count is always zero:
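 n_tup_del | n_live_tup | n_dead_tup | n_mod_since_analyze | vacuum_count | analyze_count
-----------+------------+------------+---------------------+--------------+---------------
        64 |          3 |         64 |                   0 |            0 |            16
        50 |          1 |         50 |                  26 |            0 |             3
        28 |          1 |         28 |                   2 |            0 |             5
         7 |          1 |          7 |                   4 |            0 |             4
         3 |          1 |          3 |                   2 |            0 |             4
         1 |          6 |          1 |                   8 |            0 |             2
         0 |          0 |          0 |                   0 |            0 |             0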
which may mean that manual VACUUM is indeed required.

https://www.postgresql.org/docs/current/static/sql-commands.html
ANALYZE — collect statistics about a database
VACUUM — garbage-collect
and optionally analyze a database
VACUUM can optionally also analyze. So if all you want is fresh statistics, just run ANALYZE. If you want to "recover" unused rows, run VACUUM. If you want both, use VACUUM ANALYZE.
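As a minimal sketch of the three options on a session-local temporary table (the table name tmp_work and its columns are hypothetical, chosen only for illustration):

```sql
-- Hypothetical temp table; rows are kept across commits by default.
CREATE TEMPORARY TABLE tmp_work (id integer, payload text);

INSERT INTO tmp_work SELECT g, 'row ' || g FROM generate_series(1, 1000) AS g;
DELETE FROM tmp_work WHERE id % 2 = 0;

-- Fresh planner statistics only; dead rows stay in place.
ANALYZE tmp_work;

-- Mark the space held by dead rows as reusable; column statistics are not refreshed.
VACUUM tmp_work;

-- Both in one command.
VACUUM ANALYZE tmp_work;
```

Because autovacuum cannot see temporary tables, these commands have to be issued by the session that owns the table, and VACUUM has to be sent as a top-level statement outside any transaction block.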

We had an application that ran for 24+ hours using a lot of long-lived, heavily updated temp tables, and we used ANALYZE on them. But there is a problem with VACUUM: if you try to run it inside a function, you get an error:
ERROR: VACUUM cannot be executed from a function or multi-command string
CONTEXT: SQL statement "vacuum xxxxxx"
PL/pgSQL function inline_code_block line 4 at SQL statement
SQL state: 25001
Later we discovered that temp tables were not actually that advantageous, at least for our app. Technically they are normal tables stored as data files on disk in the so-called temporary tablespace (pg_default, unless you set temp_tablespaces in postgresql.conf). But they use only temp_buffers; they are not loaded into shared_buffers. So you have to set temp_buffers properly and rely more on the Linux page cache. And, as you already mentioned, the autovacuum daemon "does not see" them. We therefore later switched to using normal tables.
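To inspect the settings mentioned here, something like the following could be used. The 64MB value is only an illustrative assumption, not a recommendation.

```sql
-- Current per-session limit for temporary-table buffers (default 8MB).
SHOW temp_buffers;

-- Tablespace(s) used for temporary objects; empty means the database default.
SHOW temp_tablespaces;

-- Raise the limit for this session. This only takes effect if it runs
-- before the session first uses a temporary table.
SET temp_buffers = '64MB';
```

With a connection pool, the SET would have to be issued right after a connection is opened, before any temporary table is touched on it.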

Related

Why does Postgres VACUUM FULL ANALYZE gives performance boost but VACUUM ANALYZE does not

I have a large database, with the largest tables holding more than 30 million records. The database server is a dedicated machine with 64 cores and 128 GB RAM, running Ubuntu and PostgreSQL 12, so it is more powerful than we normally need. It receives around 300-400 new records every second.
The problem is that after about a week to 10 days of use the database becomes extremely slow, so we have to run VACUUM FULL ANALYZE, after which everything goes back to normal. But we have to put our server into maintenance mode to do this every week, which is a pain.
I thought we might not need VACUUM FULL and could just run ANALYZE on the database, as it can run in parallel, but that didn't work: there were no performance gains. Even when I run a plain VACUUM on the whole database followed by ANALYZE, it still doesn't give the kind of performance boost that we get from VACUUM FULL ANALYZE.
I know that VACUUM FULL copies the data from the old table into a new table and deletes the old one. But what else does it do?
Update:
I have also reindexed the 15 largest tables to check whether this would speed up the database, but that didn't work either.
So I had to execute VACUUM FULL ANALYZE, as I didn't see any other way. Now I am trying to identify the slow queries.
Thanks to jjanes, I was able to enable track_io_timing and also identified a few queries where indexes can be added. I am querying pg_stat_statements like this:
SELECT * FROM pg_stat_statements ORDER BY total_time DESC;
And I get this result:
userid | 10
dbid | 16401
queryid | -3264485807545194012
query | update events set field1 = $1, field2 = $2 , field3= $3, field4 = $4 , field5 =$5 where id = $6
calls | 104559
total_time | 106180828.60536088
min_time | 3.326082
max_time | 259055.09376800002
mean_time | 1015.5111334783633
stddev_time | 1665.0715182035976
rows | 104559
shared_blks_hit | 4456728574
shared_blks_read | 4838722113
shared_blks_dirtied | 879809
shared_blks_written | 326809
local_blks_hit | 0
local_blks_read | 0
local_blks_dirtied | 0
local_blks_written | 0
temp_blks_read | 0
temp_blks_written | 0
blk_read_time | 15074237.05887792
blk_write_time | 15691.634870000113
This query simply updates one record, and the table holds around 30 million records.
Question: this query already uses an index. Can you please advise what the next step should be, and why it is slow? Also, what does the I/O information show?
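One way to read these counters is to turn them into ratios per statement. The query below is a sketch that assumes the PostgreSQL 12 column names shown above (total_time was renamed to total_exec_time in version 13); the column aliases are made up for illustration.

```sql
SELECT query,
       calls,
       -- share of execution time spent reading blocks (needs track_io_timing)
       round((100 * blk_read_time / nullif(total_time, 0))::numeric, 1) AS read_time_pct,
       -- share of block accesses served from shared_buffers
       round(100.0 * shared_blks_hit
             / nullif(shared_blks_hit + shared_blks_read, 0), 1)        AS hit_pct,
       -- average number of blocks touched per execution
       (shared_blks_hit + shared_blks_read) / nullif(calls, 0)          AS blks_per_call
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT 10;
```

For the statement above this works out to roughly 14% of the time spent on block reads, a hit ratio of about 48%, and close to 89,000 blocks touched per call. That is far more than a single-row index lookup would normally need, so it may be worth checking the plan with EXPLAIN (ANALYZE, BUFFERS).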
As you say, VACUUM FULL is an expensive command. PostgreSQL's secret weapon is autovacuum, which monitors database statistics and targets tables with dead tuples. Read about how to tune it for the database as a whole, and possibly for individual big tables.
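A per-table override could look like the sketch below. The table name events comes from the query in the question; the numbers are illustrative assumptions, not tuned values.

```sql
-- By default autovacuum triggers at autovacuum_vacuum_threshold (50) plus
-- autovacuum_vacuum_scale_factor (0.2) times the number of rows,
-- i.e. roughly 6 million dead rows for a 30-million-row table.
SHOW autovacuum_vacuum_scale_factor;
SHOW autovacuum_vacuum_threshold;

-- Illustrative override so the big table is vacuumed much earlier.
ALTER TABLE events SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_threshold = 1000,
    autovacuum_analyze_scale_factor = 0.02
);

-- Check when autovacuum last processed the table.
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname = 'events';
```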

Postgres multi-column index is taking forever to complete

I have a table with around 270,000,000 rows and this is how I created it.
CREATE TABLE init_package_details AS
SELECT pcont.package_content_id as package_content_id,
pcont.activity_id as activity_id,
pc.org_id as org_id,
pc.bed_type as bed_type,
pc.is_override as is_override,
pmmap.package_id as package_id,
pcont.activity_qty as activity_qty,
pcont.charge_head as charge_head,
pcont.activity_charge as charge,
COALESCE(pc.charge,0) - COALESCE(pc.discount,0) as package_charge
FROM a pc
JOIN b od ON
(od.org_id = pc.org_id AND od.status='A')
JOIN c pm ON
(pc.package_id=pm.package_id)
JOIN d pmmap ON
(pmmap.pack_master_id=pm.package_id)
JOIN e pcont ON
(pcont.package_id=pmmap.package_id);
I need to build an index on the init_package_details table.
Creating the table takes around 5-6 minutes.
I created a btree index like this:
CREATE INDEX init_package_details_package_content_id_idx
ON init_package_details(package_content_id);
which takes 10 minutes (longer than it takes to create and populate the table itself).
And when I create another index like this:
CREATE INDEX init_package_details_package_act_org_bt_id_idx
ON init_package_details(activity_id,org_id,bed_type);
It just freezes and takes forever to complete. I waited for around 30 minutes before I manually cancelled it.
Below are stats from iotop -o, in case they help:
While creating the table: averaging around 110-120 MB/s (this is how 270 million rows got inserted in 5-6 minutes).
While creating the first index: averaging around 70 MB/s.
While creating the second index: crawling along at 5-7 MB/s.
Could someone explain why this is happening? Is there any way I can speed up index creation here?
EDIT 1: There are no other connections accessing the table. And, pg_stat_activity shows active as status throughout the running time. This happens inside a transaction (this is happening between BEGIN and COMMIT, it contains many other scripts in same .sql file).
EDIT 2:
postgres=# show work_mem ;
work_mem
----------
5MB
(1 row)
postgres=# show maintenance_work_mem;
maintenance_work_mem
----------------------
16MB
Building indexes takes a long time, that's normal.
If you are not bottlenecked on I/O, you are probably on CPU.
There are a few things to improve the performance:
Set maintenance_work_mem very high.
Use PostgreSQL v11 or better, where several parallel workers can be used.
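As a sketch of both suggestions applied to the second index from the question (the 2GB and 4 values are illustrative assumptions and depend on the available RAM and cores):

```sql
-- Session-level settings; the question shows maintenance_work_mem at 16MB.
SET maintenance_work_mem = '2GB';
SET max_parallel_maintenance_workers = 4;  -- PostgreSQL 11 or later

CREATE INDEX init_package_details_package_act_org_bt_id_idx
    ON init_package_details (activity_id, org_id, bed_type);
```

Since the index is built inside a larger BEGIN ... COMMIT script, SET LOCAL could be used instead of SET so that the values revert when the transaction ends.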

Autovacuum not removing dead rows (and xmin horizon doesn't match xmin of any session) [duplicate]

There is a table that has 200 rows, but the number of live tuples reported for it is much higher (around 60K).
select count(*) from subscriber_offset_manager;
count
-------
200
(1 row)
SELECT schemaname,relname,n_live_tup,n_dead_tup FROM pg_stat_user_tables where relname='subscriber_offset_manager' ORDER BY n_dead_tup
;
schemaname | relname | n_live_tup | n_dead_tup
------------+---------------------------+------------+------------
public | subscriber_offset_manager | 61453 | 5
(1 row)
But judging from pg_stat_activity and pg_locks, we are not able to track down any open connection.
SELECT query, state,locktype,mode
FROM pg_locks
JOIN pg_stat_activity
USING (pid)
WHERE relation::regclass = 'subscriber_offset_manager'::regclass
;
query | state | locktype | mode
-------+-------+----------+------
(0 rows)
I also tried a full vacuum on this table. Below are the results:
No rows are ever removed.
Sometimes all the live tuples become dead tuples.
Here is the output:
vacuum FULL VERBOSE ANALYZE subscriber_offset_manager;
INFO: vacuuming "public.subscriber_offset_manager"
INFO: "subscriber_offset_manager": found 0 removable, 67920 nonremovable row versions in 714 pages
DETAIL: 67720 dead row versions cannot be removed yet.
CPU 0.01s/0.06u sec elapsed 0.13 sec.
INFO: analyzing "public.subscriber_offset_manager"
INFO: "subscriber_offset_manager": scanned 710 of 710 pages, containing 200 live rows and 67720 dead rows; 200 rows in sample, 200 estimated total rows
VACUUM
SELECT schemaname,relname,n_live_tup,n_dead_tup FROM pg_stat_user_tables where relname='subscriber_offset_manager' ORDER BY n_dead_tup
;
schemaname | relname | n_live_tup | n_dead_tup
------------+---------------------------+------------+------------
public | subscriber_offset_manager | 200 | 67749
and after 10 sec
SELECT schemaname,relname,n_live_tup,n_dead_tup FROM pg_stat_user_tables where relname='subscriber_offset_manager' ORDER BY n_dead_tup
;
schemaname | relname | n_live_tup | n_dead_tup
------------+---------------------------+------------+------------
public | subscriber_offset_manager | 68325 | 132
How our app queries this table:
Our application generally selects some rows and, based on some business calculation, updates the row.
Select query (select based on some id):
select * from subscriber_offset_manager where shard_id=1 ;
Update query: updates some other column for the selected shard id.
Around 20 threads do this in parallel, and each thread works on only one row.
The app is written in Java and we use Hibernate for database operations.
The PostgreSQL version is 9.3.24.
One more interesting observation:
- When I stop my Java app and then do a full vacuum, it works fine (the number of rows and the number of live tuples become equal). So something goes wrong when we select and update continuously from the Java app.
Problem/Issue
These live tuples sometimes turn into dead tuples and after some time become live again.
Because of this behaviour, selects from the table take a long time and the load on the server increases, as there are lots of live/dead tuples.
I know three things that keep VACUUM from doing its job:
Long running transactions.
Prepared transactions that did not get committed.
Stale replication slots.
See my blog post for details.
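The three causes listed above can each be checked from the catalog. The queries below are a sketch; the column selection is just one reasonable choice.

```sql
-- 1. Long-running transactions: oldest open transactions first.
SELECT pid, state, xact_start, now() - xact_start AS xact_age, query
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
ORDER BY xact_start;

-- 2. Prepared transactions that were never committed or rolled back.
SELECT gid, prepared, owner, database
FROM pg_prepared_xacts
ORDER BY prepared;

-- 3. Replication slots (this view exists only in PostgreSQL 9.4 and later,
--    so it is not available on the 9.3 server from the question).
SELECT slot_name, slot_type, active, xmin
FROM pg_replication_slots;
```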
I found the issue ☺.
To understand it, consider the following flow:
Thread 1 -
Opens a Hibernate session
Makes some queries on Table-A
Selects from subscriber_offset_manager
Updates subscriber_offset_manager
Closes the session
Many threads of type Thread 1 run in parallel.
Thread 2 -
Threads of this type also run in parallel.
Opens a Hibernate session
Makes some select queries on Table-A
Does not close the session (session leak)
Temporary solution: if I close all the connections made by Thread 2 using pg_cancel_backend, vacuuming starts working.
We have recreated the issue many times, tried this solution, and it worked.
The following doubts are still not answered:
Why does Postgres not show any data related to the table "subscriber_offset_manager"?
The issue does not reproduce if, instead of running Thread 2, we run the select on Table-A using psql.
Why does Postgres behave like this with JDBC?
Some more mind-blowing observations:
Even if we run the queries on "subscriber_offset_manager" in a different session, the issue still occurs.
We found many instances where Thread 2 is working on some third table, "Table-C", and the issue still occurs.
The state of all these transactions in pg_stat_activity is "idle in transaction".
#Erwin Brandstetter and #Laurenz Albe, please let us know if there is a known bug related to Postgres/JDBC.
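Leaked sessions of this kind can be listed, and if necessary closed, from SQL. The following is only a sketch: the 5-minute threshold is an arbitrary assumption, and it uses pg_terminate_backend (which closes the connection) rather than the pg_cancel_backend mentioned above.

```sql
-- Sessions that hold a transaction open without running anything.
SELECT pid, usename, xact_start, state_change, query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
ORDER BY xact_start;

-- Terminate those whose transaction has been open for more than 5 minutes.
SELECT pg_terminate_backend(pid)
FROM pg_stat_activity
WHERE state = 'idle in transaction'
  AND xact_start < now() - interval '5 minutes'
  AND pid <> pg_backend_pid();
```

This only treats the symptom; the lasting fix is for the application to commit or close every session it opens.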
There might be locks after all, your query might be misleading:
SELECT query, state,locktype,mode
FROM pg_locks
JOIN pg_stat_activity USING (pid)
WHERE relation = 'subscriber_offset_manager'::regclass
pg_locks.pid can be NULL, then the join would eliminate rows. The manual for Postgres 9.3:
Process ID of the server process holding or awaiting this lock, or null if the lock is held by a prepared transaction
Bold emphasis mine. (Still the same in pg 10.)
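A variant of the query that keeps lock rows without a matching backend could be sketched with a LEFT JOIN (the selected columns are just one reasonable choice):

```sql
-- LEFT JOIN keeps rows from pg_locks even when pid is NULL
-- (locks held by prepared transactions have no backend process).
SELECT l.pid, l.locktype, l.mode, l.granted, a.state, a.query
FROM pg_locks l
LEFT JOIN pg_stat_activity a ON a.pid = l.pid
WHERE l.relation = 'subscriber_offset_manager'::regclass;
```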
Do you get anything for the simple query?
SELECT * FROM pg_locks
WHERE relation = 'subscriber_offset_manager'::regclass;
This could explain why VACUUM complains:
DETAIL: 67720 dead row versions cannot be removed yet.
This, in turn, would point to problems in your application logic / queries, locking more rows than necessary.
My first idea would be long running transactions, where even a simple SELECT (acquiring a lowly ACCESS SHARE lock) can block VACUUM from doing its job. 20 threads in parallel might chain up and lock out VACUUM indefinitely. Keep your transactions (and their locks) as brief as possible. And make sure your queries are optimized and don't lock more rows than necessary.
One more thing to note: transaction isolation levels SERIALIZABLE or REPEATABLE READ make it much harder for VACUUM to clean up. Default READ COMMITTED mode is less restrictive, but VACUUM can still be blocked as discussed.
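The isolation level in effect can be checked from any session, for example:

```sql
-- Server-wide default for new transactions.
SHOW default_transaction_isolation;

-- Isolation level of the current transaction.
SHOW transaction_isolation;
```

To see what the application actually uses, the second command would have to be run over the application's own connection, since a client library may set the level per connection.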
Related:
What are the consequences of not ending a database transaction?
Postgres UPDATE … LIMIT 1
VACUUM VERBOSE outputs, nonremovable “dead row versions cannot be removed yet”?

Reading pg_buffercache output

I am using postgres-9.3 (on CentOS 6.9) and trying to understand the pg_buffercache output.
I ran this:
SELECT c.relname,count(*) AS buffers FROM pg_class c INNER JOIN
pg_buffercache b ON b.relfilenode=c.relfilenode INNER JOIN
pg_database d ON (b.reldatabase=d.oid AND
d.datname=current_database()) GROUP BY c.relname
ORDER BY 2 DESC LIMIT 5;
and the output below showed one of the tables using 6594 buffers. This was while I had tons of INSERTs followed by SELECTs and UPDATEs on the data_main table.
relname | buffers
------------------+---------
data_main | 6594
objt_main | 1897
objt_access | 788
idx_data_mai | 736
I also ran "select * from pg_buffercache where isdirty", which showed around 50 entries.
How should I interpret these numbers? Does the buffer count correspond to all the transactions since I created the extension, or only the recent ones? How can I find out whether my specific operation is using the proper amount of buffers?
Here are my settings:
# show shared_buffers;
shared_buffers
----------------
1GB
# show work_mem;
work_mem
----------
128kB
# show maintenance_work_mem;
maintenance_work_mem
----------------------
64GB
And here is the current free memory (I have 64 GB of memory in this machine). This is a mixed-workload machine with periodic bursts of INSERTs and lots of SELECTs. Currently the database and tables are small, but they will grow to at least 2 million rows.
$ free -m
total used free shared buffers cached
Mem: 64375 33483 30891 954 15 15731
/+ buffers/cache: 18097 46278
Swap: 32767 38 32729
Basically, I am trying to understand how to properly use the pg_buffercache view. Should I run this query periodically? And do I need to change my shared_buffers accordingly?
I did some reading and testing, and this is what I found. I found a useful query here: How large is a "buffer" in PostgreSQL
Here are a few notes for others who have similar questions.
You will need to create the extension in each database: "\c db_name", then "create extension pg_buffercache".
The same goes for running the queries.
Restarting the database empties the buffer cache, so the counts start over.
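To put the raw buffer counts into perspective, they can be converted into sizes and a share of shared_buffers. The query below is a sketch that extends the one from the question; the column aliases are made up for illustration.

```sql
SELECT c.relname,
       count(*) AS buffers,
       -- each buffer is one block (8 kB unless the server was built differently)
       pg_size_pretty(count(*) * current_setting('block_size')::bigint) AS buffered,
       -- pg_settings reports shared_buffers as a number of blocks
       round(100.0 * count(*)
             / (SELECT setting::bigint FROM pg_settings
                WHERE name = 'shared_buffers'), 1) AS pct_of_shared_buffers,
       sum(CASE WHEN b.isdirty THEN 1 ELSE 0 END) AS dirty_buffers
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
JOIN pg_database d ON b.reldatabase = d.oid
                  AND d.datname = current_database()
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 5;
```

The view shows what is in shared_buffers at the moment of the query, not a history of past transactions. With the default 8 kB block size, the 6594 buffers held by data_main correspond to about 52 MB of the 1 GB of shared_buffers.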

How to shrink pg_toast table?

I am running Postgres 9.3 on Mac OS X and I have a database which grew out of control. I used to have a table with one column that stored large data. Then I noticed that the db size had grown to around 19 GB just because of a pg_toast table. I then removed that column and ran VACUUM to get the db back to a smaller size, but it remained the same. So how can I shrink the database?
SELECT nspname || '.' || relname AS "relation"
,pg_size_pretty(pg_relation_size(C.oid)) AS "size"
FROM pg_class C
LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
WHERE nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY pg_relation_size(C.oid) DESC
LIMIT 20;
results in
pg_toast.pg_toast_700305 | 18 GB
pg_toast.pg_toast_700305_index | 206 MB
public.catalog_hotelde_images | 122 MB
public.routes | 120 MB
VACUUM VERBOSE ANALYZE pg_toast.pg_toast_700305;
INFO: vacuuming "pg_toast.pg_toast_700305"
INFO: index "pg_toast_700305_index" now contains 9601330 row versions in 26329 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.06s/0.02u sec elapsed 0.33 sec.
INFO: "pg_toast_700305": found 0 removable, 0 nonremovable row versions in 0 out of 2393157 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.06s/0.07u sec elapsed 0.37 sec.
VACUUM
structure of the routes table
id serial NOT NULL,
origin_id integer,
destination_id integer,
total_time integer,
total_distance integer,
speed_id integer,
uid bigint,
created_at timestamp without time zone,
updated_at timestamp without time zone,
CONSTRAINT routes_pkey PRIMARY KEY (id)
You can use one of the two types of vacuuming: standard or full.
standard:
VACUUM table_name;
full:
VACUUM FULL table_name;
Keep in mind that VACUUM FULL locks the table it is working on until it's finished.
You may want to run a standard VACUUM more frequently on tables with frequent update/delete activity. It may not free as much space as VACUUM FULL does, but you can keep running operations like SELECT, INSERT, UPDATE and DELETE while it works, and it takes less time to complete.
In my case, when pg_toast (along with other tables) got out of control, standard VACUUM made a slight difference but was not enough. I used VACUUM FULL to reclaim more disk space, which was very slow on large relations. I decided to tune autovacuum and run standard VACUUM more often on my frequently updated tables.
If you need to use VACUUM FULL, you should do it when your users are less active.
Also, do not turn off autovacuum.
You can get some additional information by adding verbose to your commands:
VACUUM FULL VERBOSE table_name;
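For the TOAST table in the question, the table to run VACUUM FULL on is the one that owns it. A sketch for finding it follows; owning_table_name is a placeholder for whatever the first query returns.

```sql
-- Which table owns the TOAST table from the question?
SELECT c.oid::regclass AS owning_table,
       pg_size_pretty(pg_relation_size(c.oid))       AS heap_size,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
WHERE c.reltoastrelid = 'pg_toast.pg_toast_700305'::regclass;

-- Rewrite that table compactly, together with its TOAST table.
-- This takes an exclusive lock on the table for the duration.
VACUUM FULL VERBOSE owning_table_name;
```

Running the first query again afterwards shows whether total_size actually dropped.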
Try the following:
vacuum full