postgres lo_unlink not deleting objects

I'm a little bit new to Postgres, but I have a problem related to disk space. Right now I have 2 GB of free space, and I need to delete some of the large objects. The situation is like this:
I have a table espbidocuments that stores an OID with an ID pointing to the pg_largeobject table, where my files are stored. pg_largeobject is now 87,201,636,352 bytes and I need to shrink it. I tried deleting the references from espbidocuments so the large objects would become orphans, then I ran lo_unlink(oid) and got result 1, so I assumed my files were deleted. But when I select from pg_largeobject, the rows are still there and the table size is the same as before. I can't run a VACUUM (FULL) because the table is 80+ GB and I only have ~2 GB free... how do I remove those darn large objects?
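For reference, roughly what I ran, with a made-up OID standing in for one of the orphaned objects:

SELECT lo_unlink(12345);   -- returns 1, which I took to mean the object was removed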

Related

Huge delete on PostgreSQL table: deleting 99.9% of the rows of the table

I have a table in my PostgreSQL database that became huge, filled with a lot of useless rows.
As these useless rows represent 99.9% of my table data (about 3.3M rows), I was wondering if deleting them could have a bad impact on my DB:
I know that this operation could take some time, and I will be able to block writes on the table during the maintenance operation.
But I was wondering if this huge change in the data could also impact performance after the operation itself.
I found solutions like creating a new table or using TRUNCATE to drop all rows, but as this operation will be a specific, one-shot job, I would like to choose the most suitable solution.
I know that PostgreSQL has a VACUUM mechanism, but I'm not a DBA expert: could anyone please confirm that this delete will not impact my table integrity / data structure, and that the freed space will be reclaimed if needed for new data?
PostgreSQL 11.12, with default settings, on AWS RDS. I don't have any index on my table, and the criterion for row deletion will not be based on the PK.
Deleting rows typically does not shrink a PostgreSQL table, so you would then have to run VACUUM (FULL) to compact it, during which the table is inaccessible.
If you are deleting many rows, both the DELETE and the VACUUM (FULL) will take a long time, and you would be much better off like this:
create a new table that is defined like the old one
INSERT INTO new_tab SELECT * FROM old_tab WHERE ... to copy over the rows you want to keep
drop foreign key constraints that point to the old table
create all indexes and constraints on the new table
drop the old table and rename the new one
By planning that carefully, you can get away with a short down time.
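A rough sketch of that sequence, with made-up table, column, and constraint names:

CREATE TABLE new_tab (LIKE old_tab INCLUDING DEFAULTS);           -- same definition as the old table
INSERT INTO new_tab SELECT * FROM old_tab WHERE keep_flag;        -- copy only the rows to keep (keep_flag is a stand-in condition)
ALTER TABLE child_tab DROP CONSTRAINT child_tab_old_tab_fkey;     -- drop FKs pointing at the old table
ALTER TABLE new_tab ADD PRIMARY KEY (id);                         -- recreate indexes and constraints
DROP TABLE old_tab;
ALTER TABLE new_tab RENAME TO old_tab;
ALTER TABLE child_tab ADD CONSTRAINT child_tab_old_tab_fkey
    FOREIGN KEY (old_tab_id) REFERENCES old_tab (id);             -- re-point the FKs at the renamed table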

Cleaning up files from table without deleting rows in postgresql 9.6.3

I have a table with files and various relations to this table, files are stored as bytea. I want to free up space occupied by old files (according to timestamp), however the rows should still be present in the table.
Is it enough to set null to bytea field? Will the data be actually deleted from the table this way?
In PostgreSQL, updating a row creates a new tuple (row version), and the old one is left to be deleted by autovacuum.
Also, larger bytea attributes will be stored out-of-line in the TOAST table that belongs to the table.
When you set the bytea attribute to NULL (which is the right thing to do), two things will happen:
The main table will become bigger because of all the new tuples created by the UPDATE. Autovacuum will free the space, but not shrink the table (the empty space can be re-used by future data modifications).
Entries in the TOAST table will be deleted. Again, autovacuum will free the space, but the table won't shrink.
So what you will actually observe is that after the UPDATE, your table uses more space than before.
You can get rid of all that empty space by running VACUUM (FULL) on the table, but that will block concurrent access to the table for the duration of the operation, so be ready to schedule some down time (you'll probably do that for the UPDATE anyway).
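A minimal sketch of that, assuming a table called files with a bytea column contents and a timestamp column uploaded_at (all names are made up):

UPDATE files SET contents = NULL WHERE uploaded_at < now() - interval '1 year';   -- free the old file data
VACUUM FULL files;   -- optional: rewrites the table and returns the space to the OS, but blocks access meanwhile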

SQL Server Express 2008 10GB Size Limit

I am approaching the 10 GB limit that Express has on the primary database file.
The main problem appears to be some fixed length char(500) columns that are never near that length.
I have two tables with about 2 million rows between them. These two tables add up to about 8 GB of data with the remainder being spread over another 20 tables or so. These two tables each have 2 char(500) columns.
I am testing a way to convert these columns to varchar(500) and recover the trailing spaces.
I tried this:
Alter Table Test_MAILBACKUP_RECIPIENTS
Alter Column SMTP_address varchar(500)
GO
Alter Table Test_MAILBACKUP_RECIPIENTS
Alter Column EXDN_address varchar(500)
This quickly changed the column type but obviously didn’t recover the space.
The only way I can see to do this successfully is to:
Create a new table in tempdb with the varchar(500) columns,
Copy the information into the temp table trimming off the trailing spaces,
Drop the real table,
Recreate the real table with the new varchar(500) columns,
Copy the information back.
I'm open to other ideas here, as I'll have to take my application offline while this process completes.
Another thing I’m curious about is the primary key identity column.
This table has a Primary Key field set as an identity.
I know I have to use Set Identity_Insert on to allow the records to be inserted into the table and turn it off when I’m finished.
How will recreating the table affect new records being inserted into it after I'm finished? Or is this just “Microsoft Magic” that I don't need to worry about?
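For reference, here is a rough T-SQL sketch of the plan above, using the recipients table from the earlier example; the staging table name, the ID column, and the exact column definitions are guesses on my part:

SELECT ID, RTRIM(SMTP_address) AS SMTP_address, RTRIM(EXDN_address) AS EXDN_address
INTO tempdb.dbo.Staging_RECIPIENTS                      -- hypothetical staging table in tempdb
FROM Test_MAILBACKUP_RECIPIENTS;
GO
DROP TABLE Test_MAILBACKUP_RECIPIENTS;
GO
CREATE TABLE Test_MAILBACKUP_RECIPIENTS (
    ID INT IDENTITY(1,1) PRIMARY KEY,                   -- assumed identity PK from the question
    SMTP_address VARCHAR(500) NOT NULL,
    EXDN_address VARCHAR(500) NOT NULL
);
GO
SET IDENTITY_INSERT Test_MAILBACKUP_RECIPIENTS ON;      -- allow explicit values for the identity column
INSERT INTO Test_MAILBACKUP_RECIPIENTS (ID, SMTP_address, EXDN_address)
SELECT ID, SMTP_address, EXDN_address FROM tempdb.dbo.Staging_RECIPIENTS;
SET IDENTITY_INSERT Test_MAILBACKUP_RECIPIENTS OFF;
GO
DROP TABLE tempdb.dbo.Staging_RECIPIENTS;

(When explicit values are inserted with IDENTITY_INSERT ON, SQL Server moves the identity seed past the highest inserted value, so new rows added afterwards keep getting fresh IDs.)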
The problem with your initial approach is that you converted the columns to varchar but didn't trim the existing whitespace (which is kept after the conversion). After changing the data type of the columns, you should run:
update Test_MAILBACKUP_RECIPIENTS set
SMTP_address=rtrim(SMTP_address), EXDN_address=rtrim(EXDN_address)
This will eliminate all trailing spaces from your table, but note that the actual size on disk will stay the same: SQL Server doesn't shrink database files automatically, it just marks that space as unused and available for other data.
You can use this script from another question to see the actual space used by data in the DB files:
Get size of all tables in database
Usually shrinking a database is not recommended, but when there is a large difference between used space and file size you can do it with DBCC SHRINKDATABASE:
dbcc shrinkdatabase (YourDatabase, 10) -- leaving 10% of free space for new data
OK I did a SQL backup, disabled the application and tried my script anyway.
I was shocked that it ran in under 2 minutes on my slow old server.
I re-enabled my application and it still works. (Yay)
Looking at the reported size of the table now, it went from 1.4 GB to 126 MB! So at least that has bought me some time.
(Before/after screenshots of the table size, with the Data size in KB circled.)
My next problem is the MailBackup table which also has two char(500) columns.
It is shown as 6.7GB.
I can't use the same approach for this table because it contains a FILESTREAM column holding around 190 GB of data, and as far as I know tempdb does not support FILESTREAM.
Looks like this might be worth a new question.

Is it possible to truncate the pg_largeobject table in postgres?

I would like to discard the contents of the pg_largeobject table and reclaim its disk space. When I try issuing this command:
truncate pg_largeobject
I get this response:
ERROR: permission denied: "pg_largeobject" is a system catalog
This is even though I am issuing the command as user postgres (a superuser). There is insufficient disk space to do a VACUUM FULL while the table contains a lot of rows. I've also tried just deleting all the rows in preparation for a VACUUM FULL, but this was still going after a whole day, and ended up being interrupted. I'd prefer to truncate if at all possible.
Is truncation of this table possible? It currently contains around 1 TB of images I no longer want. I've removed references to the table from all my other tables (and deleted all rows from pg_largeobject_metadata).
Turning on allow_system_table_mods was the answer. The truncate then took only a few minutes. Thanks to Nick Barnes for this suggestion and to an old article that confirmed this approach.
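For anyone else hitting this, a minimal sketch of what that looks like (note: on older PostgreSQL releases allow_system_table_mods can only be set in postgresql.conf with a server restart; newer releases let a superuser change it per session):

SET allow_system_table_mods = on;    -- superuser only
TRUNCATE pg_largeobject;             -- pg_largeobject_metadata had already been emptied in my case
SET allow_system_table_mods = off;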

PostgreSQL database size (tablespace size) much bigger then calculated total sum of relations

Hello all,
I see a very big difference between the actual database size (on the HDD, and as reported by pg_database_size()) and the size calculated by summing up the total relation sizes returned by pg_total_relation_size().
The first is 62 GB and the latter is 16 GB (exactly the difference of the data deleted from the biggest table).
Here is a simplified query, that can show that difference on my system:
select current_database(),
pg_size_pretty( sum(total_relation_raw_size)::bigint ) as calculated_database_size,
pg_size_pretty( pg_database_size(current_database()) ) as database_size
from (select pg_total_relation_size(relid) as total_relation_raw_size
from pg_stat_all_tables -- this includes also system tables shared between databases
where schemaname != 'pg_toast'
) as stats;
It seems like there is some dangling data there. This situation appeared after we dumped and VACUUM FULLed lots of unused data from that DB.
P.S.: I suppose it was database corruption of some sort... The only way to recover from this situation was to switch to the hot-standby database...
LOBs are a very valid concern as BobG writes, since they are not deleted when the rows of your application table (containing the OIDs) get deleted.
These will NOT be deleted by the VACUUM process automatically; they are only removed when you run vacuumlo on them.
Vacuumlo will delete all of the unreferenced LOBs from the database.
Example call:
vacuumlo -U postgres -W -v <database_name>
(I only included the -v to make vacuumlo a bit more verbose so that you see how many LOBs it removes)
After vacuumlo has deleted the LOBs, you can run VACUUM FULL (or let the auto-vacuum process run).
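For example, something like this after vacuumlo (run as a superuser; VACUUM FULL needs enough free disk for a temporary copy of the table):

VACUUM FULL pg_largeobject;
VACUUM FULL pg_largeobject_metadata;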
Do you have unused LOBs?
If you have something like this:
CREATE TABLE bigobjects (
id BIGINT NOT NULL PRIMARY KEY,
filename VARCHAR(255) NOT NULL,
filecontents OID NOT NULL
);
followed by:
\lo_import '/tmp/bigfile'
11357
INSERT INTO bigobjects VALUES (1, 'bigfile', 11357);
TRUNCATE TABLE bigobjects;
You'll still have the LOB (id 11357) in the database.
You can check the pg_catalog.pg_largeobject system catalog table for all the large objects in your database (recommend SELECT DISTINCT loid FROM pg_catalog.pg_largeobject unless you want to see all your LOB data as octal.)
If you clean out all your unused LOBs and do a VACUUM FULL, you should see a hefty reduction in storage. I just tried this on a personal dev database I've been using and saw a reduction in size from 200MB down to 10MB (as reported by pg_database_size(current_database()).)
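A rough sketch of how you could list (and then unlink) the orphans by hand, assuming the bigobjects table above is the only place large object OIDs are referenced:

SELECT lo.loid
FROM (SELECT DISTINCT loid FROM pg_catalog.pg_largeobject) AS lo
LEFT JOIN bigobjects b ON b.filecontents = lo.loid
WHERE b.filecontents IS NULL;        -- large objects no application row points to

SELECT lo_unlink(lo.loid)
FROM (SELECT DISTINCT loid FROM pg_catalog.pg_largeobject) AS lo
LEFT JOIN bigobjects b ON b.filecontents = lo.loid
WHERE b.filecontents IS NULL;        -- unlink them (this is essentially what vacuumlo automates)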
This situation appeared after we dumped and VACUUM FULLed lots of unused data from that DB.
I had a similar experience: a 3 GB DB with lots of dynamic data that grew to 20 GB over a month or so.
Manually deleting / vacuuming the problematic tables didn't seem to have any effect.
And then we just did a final
VACUUM FULL ANALYZE
on the WHOLE DB... and it dropped to half the size.
It took 4 hours, so be careful with that.
Your query is specifically screening out pg_toast tables, which can be big. See if removing that where schemaname != 'pg_toast' filter gets you a more accurate answer.
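That would look roughly like this (the same query as above, just without the filter):

select current_database(),
pg_size_pretty( sum(total_relation_raw_size)::bigint ) as calculated_database_size,
pg_size_pretty( pg_database_size(current_database()) ) as database_size
from (select pg_total_relation_size(relid) as total_relation_raw_size
from pg_stat_all_tables
) as stats;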