check size of relation being built in Postgres - postgresql

I have a loaded OLTP db. I run ALTER TABLE .. ADD PRIMARY KEY on a 100 GB relation and want to check the progress. But until it is built, the index doesn't appear in pg_catalog for other transactions, so I can't just select its size.
I tried find ./base/14673648/ -ctime 1 and also -mtime - hundreds of files, and then I thought: why do I assume it has created a filenode at all?.. Just because it ate some space.
So forgive my ignorance and advise me: how do I check the size of the PK being created so far?
Update: I can sum ./base/pgsql_tmp/pgsql_tmpPID.N, where PID is the pid of the session that creates the PK, as per the docs:
Temporary files (for operations such as sorting more data than can fit
in memory) are created within PGDATA/base/pgsql_tmp, or within a
pgsql_tmp subdirectory of a tablespace directory if a tablespace other
than pg_default is specified for them. The name of a temporary file
has the form pgsql_tmpPPP.NNN, where PPP is the PID of the owning
backend and NNN distinguishes different temporary files of that
backend.
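Side note: on PostgreSQL 12 and later the same sum can be taken from SQL with pg_ls_tmpdir(); a minimal sketch, with 12345 as a stand-in for the backend PID:
SELECT pg_size_pretty(sum(size)) AS temp_so_far
FROM pg_ls_tmpdir()
WHERE name LIKE 'pgsql_tmp12345.%';
On older versions you have to sum the files on disk as above.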
New question: How can I get it from pg_catalog?
pondstats=# select pg_size_pretty(temp_bytes) from pg_stat_database where datid = 14673648;
pg_size_pretty
----------------
89 GB
(1 row)
This shows the sum of all temp files for the whole database, not per relation.

A primary key is implemented with a unique index, and that has files in the data directory.
Unfortunately there is no way to check the progress of index creation (unless you know your way around the source and attach to the backend with a debugger).
You only need to concentrate on relation files that do not appear in the output of
SELECT relfilenode FROM pg_class
WHERE relfilenode <> 0
UNION
SELECT pg_relation_filenode(oid) FROM pg_class
WHERE pg_relation_filenode(oid) IS NOT NULL;
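Putting the two together, a sketch (assuming superuser access for pg_ls_dir() and the database OID 14673648 from the question) that lists on-disk files with no matching pg_class entry:
SELECT f.filename
FROM pg_ls_dir('base/14673648') AS f(filename)
WHERE f.filename ~ '^[0-9]+$'  -- skips segment files like 12345.1 and _fsm/_vm forks
  AND f.filename::oid NOT IN (SELECT pg_relation_filenode(oid)
                              FROM pg_class
                              WHERE pg_relation_filenode(oid) IS NOT NULL);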
Once you know which file belongs to your index-in-creation (it should be growing fast, unless there is a lock blocking the statement) you can start guessing how long it has to go by comparing it to files belonging to a comparable index on a comparable table.
All pretty hand-wavy, I'm afraid.

Related

pg_largeobject huge, but no tables have OID column type

postgresql noob, PG 9.4.x, no access to application code, developers, or anyone knowledgeable about it.
User database CT has a 427 GB pg_largeobject (PGLOB) table; the next largest table is 500-ish MB.
Per this post (Does Postgresql use PGLOB internally?) a very reputable member said postgresql does not use PGLOB internally.
I have reviewed the schema of all user tables in the database, and none of them have a column of type OID (or lo) - the value used in PGLOB rows to tie the collection of blob chunks back to a referencing table row. I think this means I cannot use vacuumlo to delete orphaned PGLOB rows, because that utility searches user tables for those two data types.
I HAVE identified a table with an integer field type that has int values that match LOID values in PGLOB. This seems to indicate that the developers somehow got their blobs into PGLOB using the integer value stored in a user table row.
QUESTION: Is that last statement possible?
A) If it is not, what could be adding all this data to PGLOB table?
B) If it is possible, is there a way I can programmatically search ALL tables for integer values that might represent rows in PGLOB?
NEED: I DESPERATELY need to reduce the size of the PGLOB table, as we are running out of disk space. And no, we cannot add space to existing disk per admin. So I somehow need to determine if there are LOID values in PGLOB that do NOT exist in ANY user tables as integer-type fields and then run lo_unlink to remove the rows. This could get me more usable 8K pages in the table.
BTW, I have also run pg_freespace on PGLOB, and it identified that most of the pages in PGLOB did not contain enough space in which to insert another blob chunk.
THANKS FOR THE ASSISTANCE!
Not really an answer but thinking out loud:
As you found, all large objects are stored in a single table. The oid field you refer to is something you add to a table so you can have a pointer to a particular LO oid in pg_largeobject. That being said, there is nothing compelling you to store that info in a table; you can just create LOs in pg_largeobject. From the looks of it, and this is just a guess, the developers stored the oids as integers with the intent of doing integer::oid to get a particular LO back as needed. I would look at what other information is stored in that table to see if it helps determine what the LOs are for.
Also, you might join the integer::oid values to the loid column in pg_largeobject to see if that table accounts for all of them.
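A sketch of that check, with app_table and blob_ref as hypothetical stand-ins for the table and integer column you identified:
SELECT count(*) AS unreferenced_lobs
FROM (SELECT DISTINCT loid FROM pg_largeobject) AS l
WHERE NOT EXISTS (SELECT 1 FROM app_table t
                  WHERE t.blob_ref::oid = l.loid);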
I was able to do a detailed analysis of all user tables in the database, find all columns that contained numeric data with no decimals, and then run a query against pg_largeobject with a NOT EXISTS clause for every table, matching pglob.loid against the appropriate field(s) in the user tables.
I found 25794 LOIDs that could be DELETEd from the PGLOB table, totaling 3.4M rows.
select distinct loid
into OrphanedBLOBs
from pg_largeobject l
where NOT exists (select * from tbl1 cn where cn.noteid = l.loid)
and not exists (select * from tbl1 d where d.document = l.loid)
and not exists (select * from tbl1 d where d.reportid = l.loid)
I used that table to execute lo_unlink(loid) for each of the LOIDs.
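For reference, that last step can be a single statement over the helper table (lo_unlink() removes one large object per call and errors out if the loid does not exist):
SELECT lo_unlink(loid) FROM OrphanedBLOBs;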

no active/idle sessions but temporary files still present in postgresql

I have used the query below to find the temporary files present in a postgresql-9.6 instance:
SELECT datname, temp_files AS "Temporary files", temp_bytes AS "Size of temporary files" FROM pg_stat_database
order by temp_bytes desc;
The result shows nonzero temp_files and temp_bytes for several databases.
Why is PostgreSQL maintaining temporary files when there is no active session?
You are misunderstanding what those numbers are. They are totals over the lifetime of the database. They are not numbers for currently present temporary files.
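If you want those counters to reflect only recent activity, you can reset them. Note this clears all cumulative statistics for the current database, not just the temp-file numbers:
SELECT pg_stat_reset();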

What is stored in pg_default tablespace?

I have a ~2.5 TB database, which is divided into tablespaces. The problem is that ~250 GB are stored in the pg_default tablespace.
I have 3 tables and 6 tablespaces: one for each table and one for each table's index. Each tablespace directory is non-empty, so there are no missing tablespaces for any table or index. But the size of the data/main/base/OID_of_database directory is about 250 GB.
Can anyone tell me what is stored there, whether that is OK, and if not, how I can move it to a tablespace?
I am using PostgreSQL 10.
Inspect the base subdirectory of the data directory. It will contain numbered directories that correspond to your databases and perhaps a pgsql_tmp directory.
Find out which directory contains the 250GB. Map directory names to databases using
SELECT oid, datname
FROM pg_database;
Once you have identified the directory, change into it and see what it contains.
Map the numbers to database objects using
SELECT relname, relkind, relfilenode
FROM pg_class;
(Make sure you are connected to the correct database.)
Now you know which objects take up the space.
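As a shortcut, a sketch that lists the largest relations still in the database's default tablespace straight from the catalog (reltablespace = 0 means the database default, which is pg_default here):
SELECT relname, relkind,
       pg_size_pretty(pg_relation_size(oid)) AS size
FROM pg_class
WHERE reltablespace = 0
ORDER BY pg_relation_size(oid) DESC
LIMIT 20;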
If you had frequent crashes during operations like ALTER TABLE or VACUUM (FULL), the files may be leftovers from that. They can theoretically be deleted, but I wouldn't do that without consulting with a PostgreSQL expert.

How to find the OID of the rows in a table in postgres?

I have a problem encountered lately in our Postgres database. When I query select * from myTable,
it results in the error 'could not open relation with OID 892600370', and the front-end application can't run properly anymore. Based on my research, I have determined the column that has the error, but I want to locate the exact rows and OIDs involved so that I can fix them. Please help.
Thank you in advance.
You've got a corrupted database. Might be a bug, but more likely bad hardware. If you have a recent backup, just use that. I'm guessing you don't though.
Make sure you locate any backups of either the database or its file tree and keep them safe.
Stop the PostgreSQL server and take a file backup of the entire database tree (base, global, pg_xlog - everything at that level). It is now safe to start fiddling...
Now, start the database server again and dump tables one at a time. If a table won't dump, try dropping any indexes and foreign-key constraints and give it another go.
For a table that won't dump, it might be just certain rows. Drop any indexes and dump a range of rows using COPY ... SELECT. That should let you narrow down any corrupted rows and get the rest.
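A sketch of that bisection, with mytable and id as hypothetical stand-ins (halve the range wherever a chunk fails):
\copy (SELECT * FROM mytable WHERE id BETWEEN 1 AND 100000) TO 'chunk_00.csv' CSV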
Now you have a mostly-recovered database, restore it on another machine and take whatever steps are needed to establish what is damaged/lost and what needs to be done.
Run a full set of tests on the old machine and see if anything needs replacement. Consider whether your monitoring needs improvement.
Then - make sure you keep proper backups next time, that way you won't have to do all this, you'll just use them instead.
could not open relation with OID 892600370
A relation is a table or index. A relation's OID is the OID of the row in pg_class where this relation is defined.
Try select relname from pg_class where oid=892600370;
Often it's immediately obvious from relname what this relation is, otherwise you want to look at the other fields in pg_class: relnamespace, relkind,...
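For example, to pull in the schema as well:
SELECT n.nspname, c.relname, c.relkind
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.oid = 892600370;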

PostgreSQL database size (tablespace size) much bigger than calculated total sum of relations

Hello all,
I see a very big difference between the actual database size (on the HDD, and as displayed by a pg_database_size() call) and the size calculated by summing up the total relation sizes retrieved by pg_total_relation_size().
The first is 62G and the latter is 16G (exactly the size of the data deleted from the biggest table).
Here is a simplified query, that can show that difference on my system:
select current_database(),
pg_size_pretty( sum(total_relation_raw_size)::bigint ) as calculated_database_size,
pg_size_pretty( pg_database_size(current_database()) ) as database_size
from (select pg_total_relation_size(relid) as total_relation_raw_size
from pg_stat_all_tables -- this also includes system tables shared between databases
where schemaname != 'pg_toast'
) as stats;
It seems like there is some dangling data there. This situation appeared after we dumped and VACUUM (FULL)-ed lots of unused data from that DB.
P.S.: I suppose it was database corruption of some sort... The only way to recover from this situation was to switch to the hot-standby database...
LOBs are a very valid concern as BobG writes, since they are not deleted when the rows of your application table (containing the OIDs) get deleted.
These will NOT be deleted by the VACUUM process automatically; you have to run vacuumlo on them.
Vacuumlo will delete all of the unreferenced LOBs from the database.
Example call:
vacuumlo -U postgres -W -v <database_name>
(I only included the -v to make vacuumlo a bit more verbose so that you see how many LOBs it removes)
After vacuumlo has deleted the LOBs, you can run VACUUM FULL (or let the auto-vacuum process run).
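If the bloat is concentrated in the large-object catalog itself, you can target it directly instead of rewriting the whole database (VACUUM FULL takes an exclusive lock while it rewrites the table):
VACUUM FULL pg_largeobject;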
Do you have unused LOBs?
If you have something like this:
CREATE TABLE bigobjects (
id BIGINT NOT NULL PRIMARY KEY,
filename VARCHAR(255) NOT NULL,
filecontents OID NOT NULL
);
followed by:
\lo_import '/tmp/bigfile'
11357
INSERT INTO bigobjects VALUES (1, 'bigfile', 11357);
TRUNCATE TABLE bigobjects;
You'll still have the LOB (id 11357) in the database.
You can check the pg_catalog.pg_largeobject system catalog table for all the large objects in your database (recommend SELECT DISTINCT loid FROM pg_catalog.pg_largeobject unless you want to see all your LOB data as octal.)
If you clean out all your unused LOBs and do a VACUUM FULL, you should see a hefty reduction in storage. I just tried this on a personal dev database I've been using and saw a reduction in size from 200MB down to 10MB (as reported by pg_database_size(current_database()).)
As this situation appeared, after we dumped and full vacuumed lots of unused data from that DB.
I had a similar experience: a 3GB db with lots of dynamic data that went to 20GB for a month or so.
Manually deleting / vacuuming the problematic tables didn't seem to have any effect...
And then we just did a final
VACUUM FULL ANALYZE
on the WHOLE DB ... and it dropped to half the size.
It took 4 hours, so be careful with that.
Your query is specifically screening out pg_toast tables, which can be big. See if getting rid of that where schemaname != 'pg_toast' gets you a more accurate answer.
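A sketch that breaks the size down per schema, so TOAST shows up explicitly (pg_relation_size() counts each pg_class row once, so nothing is double-counted):
SELECT n.nspname,
       pg_size_pretty(sum(pg_relation_size(c.oid))::bigint) AS total_size
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
GROUP BY n.nspname
ORDER BY sum(pg_relation_size(c.oid)) DESC;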