What is stored in pg_default tablespace? - postgresql

I have a ~2.5 TB database, which is divided into tablespaces. The problem is that ~250 GB are stored in the pg_default tablespace.
I have 3 tables and 6 tablespaces: one for each table and one for its index. None of the tablespace directories is empty, so there are no missing tablespaces for any table or index. But the size of the data/main/base/OID_of_database directory is about 250 GB.
Can anyone tell me what is stored there, whether that is OK, and if not, how I can move it to a tablespace?
I am using PostgreSQL 10.

Inspect the base subdirectory of the data directory. It will contain numbered directories that correspond to your databases, and perhaps a pgsql_tmp directory.
Find out which directory contains the 250 GB. Map directory names to databases using
SELECT oid, datname
FROM pg_database;
Once you have identified the directory, change into it and see what it contains.
Map the numbers to database objects using
SELECT relname, relkind, relfilenode
FROM pg_class;
(Make sure you are connected to the correct database.)
Now you know which objects take up the space.
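You can also let PostgreSQL do the mapping in the other direction. A minimal sketch, assuming the file lives in the database's default tablespace and using a hypothetical filenode taken from the file name you are curious about:
-- 0 stands for the database's default tablespace; 123456 is a placeholder
-- for the numeric part of the file name you found
SELECT pg_filenode_relation(0, 123456);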
If you had frequent crashes during operations like ALTER TABLE or VACUUM (FULL), the files may be leftovers from that. They can theoretically be deleted, but I wouldn't do that without consulting with a PostgreSQL expert.

Related

no active/idle session still temporary files present in postgresql

I have used the query below to find out about temporary files in a PostgreSQL 9.6 instance:
SELECT datname, temp_files AS "Temporary files", temp_bytes AS "Size of temporary files"
FROM pg_stat_database
ORDER BY temp_bytes DESC;
The result is as below.
Why is PostgreSQL maintaining temporary files when there is no active session?
You are misunderstanding what those numbers are. They are totals over the lifetime of the database. They are not numbers for currently present temporary files.
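If you want to see whether any temporary files exist right now, you can look at the directory itself. A sketch, assuming the default temp location under the data directory (superuser only; the directory may not exist at all if nothing has ever spilled to disk, in which case the call errors out):
-- list the temporary files currently present, if any
SELECT pg_ls_dir('base/pgsql_tmp');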

postgres lo_unlink not deleting objects

I'm a little bit new to Postgres, but I have a problem related to disk space. Right now I have 2 GB of free space and I need to delete some of the large objects. The situation is like this:
I have a table espbidocuments that stores an OID with an ID pointing to the pg_largeobject table, where my files are stored. pg_largeobject is now 87,201,636,352 bytes in size and I need to shrink it. I tried deleting the references from espbidocuments so the large objects would become orphans, then ran the function lo_unlink(oid) and got the result 1, so I assumed my files were deleted. But when I select from pg_largeobject, the rows still exist and the table size is not any smaller; it is the same. I can't run VACUUM because the table size is 80+ GB and free space is ~2 GB... how do I remove those darn large objects?

How can I determine whether a file in a PostgreSQL data directory is used in the database?

I have a situation in which summing the size of the tables in a tablespace (using pg_class among others) reveals that there are 550 GB of data files in a particular tablespace in a particular database.
However, there are 670 GB of files in that directory on the server.
FWIW, I don't know how that can be. No files have been written to that directory via any mechanism other than Postgres. My best guess is that the database crashed while an autovacuum was going on, leaving orphan files lying around. Does that sound plausible?
So I've worked out a way: read the output of an ls command into the database, strip off the numeric extensions for tables > 1 GB in size, and compare the names with the contents of pg_class. Doing that, I have in fact found about 120 GB of files not reflected in pg_class.
My question is, is it safe for me to delete these files, or could they be in active use by the database but not reflected in pg_class?
Do not manually delete files in the PostgreSQL data directory.
This is not safe and will corrupt your database.
The safe way to purge any files that don't belong to the database is to perform a pg_dumpall, stop the server, remove the data directory and the contents of all tablespace directories, create a new cluster with initdb and restore the dump.
If you want to investigate the issue, you could try to create a new tablespace and move everything from the old to the new tablespace. I will describe that in the rest of my answer.
Move all the tables and indexes in all databases to the new tablespace:
ALTER TABLE ALL IN TABLESPACE oldtblsp SET TABLESPACE newtblsp;
ALTER INDEX ALL IN TABLESPACE oldtblsp SET TABLESPACE newtblsp;
If oldtblsp is the default tablespace of a database:
ALTER DATABASE mydb SET TABLESPACE newtblsp;
Then run a checkpoint:
CHECKPOINT;
Make sure you didn't forget any database:
SELECT datname
FROM pg_database d
JOIN pg_tablespace s
ON d.dattablespace = s.oid
WHERE s.spcname = 'oldtblsp';
Make sure that there are no objects in the old tablespace by running this query in all databases:
SELECT t.relname, t.relnamespace::regnamespace, t.relkind
FROM pg_class t
JOIN pg_tablespace s
ON t.reltablespace = s.oid
WHERE s.spcname = 'oldtblsp';
This should return no results.
Now the old tablespace should be empty and you can
DROP TABLESPACE oldtblsp;
If you do get an error
ERROR: tablespace "tblsp" is not empty
there might be some files left behind.
Delete them at your own risk...
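If you want to inspect the leftovers by hand first, something like this shows where the old tablespace lives on disk (the actual data files are in a version-specific subdirectory underneath that path):
-- location of the old tablespace in the file system
SELECT spcname, pg_tablespace_location(oid) AS location
FROM pg_tablespace
WHERE spcname = 'oldtblsp';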

check size of relation being built in Postgres

I have a loaded OLTP database. I run ALTER TABLE ... ADD PK on a 100 GB relation and want to check the progress. But until it is built, the index is not visible in pg_catalog to other transactions, so I can't just select its size.
I tried find ./base/14673648/ -ctime 1 and also -mtime, which returned hundreds of files, and then I thought: why do I assume it has created a filenode at all? Just because it ate some space.
So forgive my ignorance and advise: how do I check the size of the PK being created so far?
Update: I can sum ./base/pgsql_tmp/pgsql_tmpPID.N, where PID is the PID of the session that creates the PK, as per the docs:
Temporary files (for operations such as sorting more data than can fit
in memory) are created within PGDATA/base/pgsql_tmp, or within a
pgsql_tmp subdirectory of a tablespace directory if a tablespace other
than pg_default is specified for them. The name of a temporary file
has the form pgsql_tmpPPP.NNN, where PPP is the PID of the owning
backend and NNN distinguishes different temporary files of that
backend.
New question: How can I get it from pg_catalog?
pondstats=# select pg_size_pretty(temp_bytes) from pg_stat_database where datid = 14673648;
pg_size_pretty
----------------
89 GB
(1 row)
This shows the sum of all temp files, not per relation.
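Based on the naming scheme quoted above, a per-backend sum can be approximated from inside the database. This is only a sketch: it requires superuser rights, it assumes the temp files live in the default location base/pgsql_tmp, and 12345 is a hypothetical backend PID:
-- sum the sizes of the temp files belonging to backend 12345
SELECT pg_size_pretty(sum((pg_stat_file('base/pgsql_tmp/' || f)).size)::bigint)
FROM pg_ls_dir('base/pgsql_tmp') AS f
WHERE f LIKE 'pgsql_tmp12345.%';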
A primary key is implemented with a unique index, and that has files in the data directory.
Unfortunately there is no way to check the progress of index creation (unless you know your way around the source and attach to the backend with a debugger).
You only need to concentrate on relation files that do not appear in the output of
SELECT relfilenode FROM pg_class
WHERE relfilenode <> 0
UNION
SELECT pg_relation_filenode(oid) FROM pg_class
WHERE pg_relation_filenode(oid) IS NOT NULL;
Once you know which file belongs to your index-in-creation (it should be growing fast, unless there is a lock blocking the statement) you can start guessing how long it has to go by comparing it to files belonging to a comparable index on a comparable table.
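To watch the growth, you can ask for the file size from within the database. A rough sketch (superuser only), where 14673648 is the database directory from the question and 123456 a hypothetical filenode; note that segments beyond 1 GB get a .1, .2, ... suffix and would have to be added up as well:
-- current size in bytes of one candidate relation file
SELECT (pg_stat_file('base/14673648/123456')).size;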
All pretty hand-wavy, I'm afraid.

PostgreSQL database size (tablespace size) much bigger than calculated total sum of relations

Hello all,
I see a very big difference between the actual database size (on the HDD, as also reported by pg_database_size()) and the size calculated by summing up the total relation sizes retrieved by pg_total_relation_size().
The former is 62 GB and the latter is 16 GB (exactly the amount of data deleted from the biggest table).
Here is a simplified query, that can show that difference on my system:
select current_database(),
       pg_size_pretty( sum(total_relation_raw_size)::bigint ) as calculated_database_size,
       pg_size_pretty( pg_database_size(current_database()) ) as database_size
from (select pg_total_relation_size(relid) as total_relation_raw_size
      from pg_stat_all_tables -- this includes also system tables shared between databases
      where schemaname != 'pg_toast'
     ) as stats;
It seems like there is some dangling data there, as this situation appeared after we dumped and full-vacuumed lots of unused data from that DB.
P.S.: I suppose it was database corruption of some sort... The only way to recover from this situation was to switch to the hot-standby database...
LOBs are a very valid concern as BobG writes, since they are not deleted when the rows of your application table (containing the OIDs) get deleted.
These will NOT be deleted by the VACUUM process automatically; they are only removed when you run vacuumlo on them.
Vacuumlo will delete all of the unreferenced LOBs from the database.
Example call:
vacuumlo -U postgres -W -v <database_name>
(I only included the -v to make vacuumlo a bit more verbose so that you see how many LOBs it removes)
After vacuumlo has deleted the LOBs, you can run VACUUM FULL (or let the auto-vacuum process run).
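To see how much that actually frees up, you can check how much space pg_largeobject itself occupies before and after:
-- total size of pg_largeobject including its index and TOAST data
SELECT pg_size_pretty(pg_total_relation_size('pg_largeobject'));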
Do you have unused LOBs?
If you have something like this:
CREATE TABLE bigobjects (
id BIGINT NOT NULL PRIMARY KEY,
filename VARCHAR(255) NOT NULL,
filecontents OID NOT NULL
);
followed by:
\lo_import '/tmp/bigfile'
11357
INSERT INTO bigobjects VALUES (1, 'bigfile', 11357);
TRUNCATE TABLE bigobjects;
You'll still have the LOB (id 11357) in the database.
You can check the pg_catalog.pg_largeobject system catalog table for all the large objects in your database (I recommend SELECT DISTINCT loid FROM pg_catalog.pg_largeobject unless you want to see all your LOB data as octal).
If you clean out all your unused LOBs and do a VACUUM FULL, you should see a hefty reduction in storage. I just tried this on a personal dev database I've been using and saw a reduction in size from 200MB down to 10MB (as reported by pg_database_size(current_database()).)
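For the example table above, the manual equivalent of what vacuumlo does would look roughly like this sketch; adjust the table and column names to your own schema, and keep in mind that vacuumlo checks every OID column in every table, not just one:
-- unlink every large object that no bigobjects row references
SELECT lo_unlink(m.oid)
FROM pg_largeobject_metadata m
LEFT JOIN bigobjects b ON b.filecontents = m.oid
WHERE b.filecontents IS NULL;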
As this situation appeared, after we dumped and full vacuumed lots of unused data from that DB.
I had a similar experience: a 3 GB DB with lots of dynamic data that went to 20 GB over a month or so.
Manually deleting / vacuuming the problematic tables didn't seem to have any effect ...
And then we just did a final
VACUUM FULL ANALYZE
on the WHOLE DB ... and it dropped the size by half.
It took 4hours so be careful with that.
Your query is specifically screening out pg_toast tables, which can be big. See if getting rid of that where schemaname != 'pg_toast' gets you a more accurate answer.
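For reference, the query from the question with that filter removed would look like this:
select current_database(),
       pg_size_pretty( sum(total_relation_raw_size)::bigint ) as calculated_database_size,
       pg_size_pretty( pg_database_size(current_database()) ) as database_size
from (select pg_total_relation_size(relid) as total_relation_raw_size
      from pg_stat_all_tables
     ) as stats;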