How to delete unused sequences? - postgresql

We are using PostgreSQL. My requirement is to delete unused sequences from my database.
For example, when I create a table through my application, a sequence is created with it, but when the table is deleted, the sequence is not deleted along with it. If I then create the same table again, another sequence is created.
Example: table: file; automatically created sequence for id column: file_id_seq
When I delete the table file and create it with same name again, a new sequence is being created (i.e. file_id_seq1). I have accumulated a huge number of unused sequences in my application database this way.
How to delete these unused sequences?

A sequence that is created automatically for a serial column is deleted automatically, when the column (or its table) is dropped. The problem you describe should not exist to begin with. Only very old versions of PostgreSQL did not do that. 7.4 or older?
Solution for the problem
This query will generate the DDL commands to delete all "unbound" sequences in the database it is executed in:
SELECT string_agg('DROP SEQUENCE ' || c.oid::regclass, '; ') || ';' AS ddl
FROM pg_class c
LEFT JOIN pg_depend d ON d.refobjid = c.oid
AND d.deptype <> 'i'
WHERE c.relkind = 'S'
AND d.refobjid IS NULL;
The cast to regclass in c.oid::regclass automatically schema-qualifies sequence names where necessary according to the current search_path. See:
How to check if a table exists in a given schema
How does the search_path influence identifier resolution and the "current schema"
Result:
DROP SEQUENCE foo_id_seq;
DROP SEQUENCE bar_id_seq;
...
Execute the result to drop all sequences that are not bound to a serial column (or any other column). The meaning of the catalog tables and columns involved (pg_class, pg_depend) is documented in the manual.
Careful! These sequences might be in use otherwise. There are use cases where sequences are created as standalone objects. For instance, if you want multiple columns to share one sequence. You should know exactly what you are doing.
However, you cannot delete sequences bound to a serial column this way. So the operation is safe in this respect.
DROP SEQUENCE test_id_seq;
Result:
ERROR: cannot drop sequence test_id_seq because other objects depend on it
DETAIL: default for table test column id depends on sequence test_id_seq
HINT: Use DROP ... CASCADE to drop the dependent objects too.
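Before executing any generated DDL, it may help to list the candidates first and check them by hand. The following is only a sketch of the same catalog lookup as above, returning one row per sequence instead of a concatenated DDL string. The extra refclassid condition is an added guard that is not part of the original query, and the column aliases are illustrative.

```sql
-- Preview only: lists sequences that nothing else depends on.
-- Nothing is dropped by this query.
SELECT n.nspname AS schema_name,
       c.relname AS sequence_name
FROM   pg_class c
JOIN   pg_namespace n ON n.oid = c.relnamespace
LEFT   JOIN pg_depend d ON d.refobjid = c.oid
                       AND d.refclassid = 'pg_class'::regclass  -- added guard: only dependencies on relations
                       AND d.deptype <> 'i'
WHERE  c.relkind = 'S'          -- sequences only
AND    d.refobjid IS NULL       -- no dependent object found
ORDER  BY 1, 2;
```

Any sequence in this list that is still referenced by application code (for example via an explicit nextval() call) should be left alone.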

If you are using pgAdmin, you can select the sequence and check the "depends on" tab. It will list any object that relies on the sequence.
Another way is to TRY to delete the sequence. If a table references it, pgAdmin will throw an error saying that something is depending on this sequence. If you are able to delete the sequence without any errors, there is no dependency.
Be sure to test this somewhere.

What I do is first select all the sequences, save the result to a file, and then run that file in psql. The content below was saved as del_seq_all.sql; it writes the generated DROP statements to test1.sql and then executes that file. Note that it selects every sequence, not just unused ones. I don't know whether this is the correct solution, but the result is as expected.
\t on
\o d:/test1.sql
SELECT 'drop sequence ' || c.oid::regclass || ';' FROM pg_class c WHERE
(c.relkind = 'S');
\o
\t off
\i d:/test1.sql

Proceed with caution, "drop sequence sequence_name_here" will successfully drop a sequence even if it's attached as a default nextval() value of a table column. There seems to be some disconnect here especially if the sequence was created separately. I'm also looking for the perfect one liner to clean up 100% unused sequences.

Building on the answer by @erwin:
DO $$ DECLARE
r text;
BEGIN
FOR r IN (
SELECT cl.relname
FROM pg_class cl
LEFT JOIN pg_namespace ns ON ns."oid" = cl.relnamespace
LEFT JOIN pg_depend d ON d.refobjid = cl."oid" AND d.deptype <> 'i'
WHERE ns.nspname = 'public'
AND cl.relkind = 'S'
AND d.refobjid IS NULL
) LOOP
-- dangerous, test before you execute!
RAISE NOTICE '%', -- once confident, comment this line ...
-- EXECUTE -- ... and uncomment this one
'DROP SEQUENCE ' || quote_ident(r);
END LOOP;
END $$;
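As written, this only prints the DROP statements as notices. Once the RAISE NOTICE line is commented out and the EXECUTE line is uncommented, it actually executes them and produces the intended result.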

Related

PostgreSQL rename a column only if it exists

I couldn't find in the PostgreSQL documentation if there is a way to run an: ALTER TABLE tablename RENAME COLUMN IF EXISTS colname TO newcolname; statement.
I would be glad if we could, because I'm facing an error that depends on who wrote the SQL script I was given: in some cases everything is fine (the column has the wrong name, which the RENAME statement then changes), and in other cases it is not (the column already has the right name).
Hence the idea of using IF EXISTS on the column name when renaming it. If the column already has the right name (here cust_date_mean), the rename command, which applies only to the wrong name, should be skipped without raising the following error:
db_1 | [223] ERROR: column "cust_mean" does not exist
db_1 | [223] STATEMENT: ALTER TABLE tablename RENAME COLUMN cust_mean TO cust_date_mean;
db_1 | ERROR: column "cust_mean" does not exist
(In the meantime I will clarify things with the team, so it's not a big deal if no such command exists, but I think it could help.)
While there is no built-in feature, you can use a DO statement:
DO
$$
DECLARE
_tbl regclass := 'public.tbl'; -- not case sensitive unless double-quoted
_colname name := 'cust_mean'; -- exact, case sensitive, no double-quoting
_new_colname text := 'cust_date_mean'; -- exact, case sensitive, no double-quoting
BEGIN
IF EXISTS (SELECT FROM pg_attribute
WHERE attrelid = _tbl
AND attname = _colname
AND attnum > 0
AND NOT attisdropped) THEN
EXECUTE format('ALTER TABLE %s RENAME COLUMN %I TO %I', _tbl, _colname, _new_colname);
ELSE
RAISE NOTICE 'Column % of table % not found!', quote_ident(_colname), _tbl;
END IF;
END
$$;
Does exactly what you ask for. Enter table and column names in the DECLARE section.
The NOTICE is optional.
For repeated use, I would create a function and pass parameters instead of the variables.
Variables are handled safely (no SQL injection). The table name can optionally be schema-qualified. If it's not, it's resolved according to the current search_path, just like ALTER TABLE would.
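For repeated use, the DO block could be wrapped in a function along these lines. This is a minimal sketch, not a tested implementation: the function name rename_column_if_exists and its boolean return value are assumptions made for illustration, while the body follows the same pg_attribute check as above.

```sql
-- Hypothetical helper: returns true if the column was renamed, false if it was not found.
CREATE OR REPLACE FUNCTION rename_column_if_exists(
    _tbl         regclass,  -- table, optionally schema-qualified
    _colname     name,      -- current column name, exact and case sensitive
    _new_colname name       -- new column name, exact and case sensitive
)
RETURNS boolean
LANGUAGE plpgsql AS
$func$
BEGIN
    IF EXISTS (SELECT FROM pg_attribute
               WHERE  attrelid = _tbl
               AND    attname  = _colname
               AND    attnum   > 0
               AND    NOT attisdropped) THEN
        -- %s for the regclass (already quoted where needed), %I for identifiers
        EXECUTE format('ALTER TABLE %s RENAME COLUMN %I TO %I',
                       _tbl, _colname, _new_colname);
        RETURN true;
    END IF;

    RAISE NOTICE 'Column % of table % not found!', quote_ident(_colname), _tbl;
    RETURN false;
END
$func$;

-- Usage with the names from the question:
SELECT rename_column_if_exists('public.tbl', 'cust_mean', 'cust_date_mean');
```

Passing the table as regclass means a non-existent table raises an error at call time, before the function body runs.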
Related:
PostgreSQL create table if not exists
Table name as a PostgreSQL function parameter

Postgres: Understanding Primary Key Sequences

My database has fallen out of sync, which led me to this question:
How to reset postgres' primary key sequence when it falls out of sync? (Copied Below)
However, I don't quite understand some things here: What is 'your_table_id_seq'? I have no clue where to find this. Doing some digging, I found a table called pg_sequence in pg_catalog. Is it related to this? I can't see any way to relate that data back to my table though.
-- Login to psql and run the following
-- What is the result?
SELECT MAX(id) FROM your_table;
-- Then run...
-- This should be higher than the last result.
SELECT nextval('your_table_id_seq');
-- If it's not higher... run this to set the sequence to your highest id.
-- (wise to run a quick pg_dump first...)
BEGIN;
-- protect against concurrent inserts while you update the counter
LOCK TABLE your_table IN EXCLUSIVE MODE;
-- Update the sequence
SELECT setval('your_table_id_seq', COALESCE((SELECT MAX(id)+1 FROM your_table), 1), false);
COMMIT;
The following query gives names of all sequences.
SELECT c.relname
FROM pg_class c
WHERE c.relkind = 'S';
For a serial column named id, the sequence is typically named ${table}_id_seq (the general pattern is ${table}_${column}_seq).
I found the answer in this question: List all sequences in a Postgres db 8.1 with SQL
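Instead of guessing the name, the built-in function pg_get_serial_sequence() can look up the sequence attached to a given column. The sketch below assumes the placeholder names your_table and id from the question; the function returns NULL if no sequence is owned by that column.

```sql
-- Look up the sequence that belongs to your_table.id
SELECT pg_get_serial_sequence('your_table', 'id') AS seq_name;

-- The result can be passed straight to setval(), so the sequence name
-- never has to be typed out by hand.
SELECT setval(
    pg_get_serial_sequence('your_table', 'id'),
    COALESCE((SELECT MAX(id) + 1 FROM your_table), 1),
    false  -- the next nextval() returns exactly this value
);
```

As in the copied snippet, the setval() call is safest inside a transaction that locks the table against concurrent inserts.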

How to update results of EXECUTE format block in function (PostgreSQL)

Below is a great function to check the real count of all tables in PostgreSQL database. I found it here.
From my local test, it seems that the function returns all results only after it has finished counting all 100 tables.
I am trying to make it more practical. If we could save the result of each table counting as soon as it finished with the table, then we can check the progress of all counting jobs instead of waiting for the end.
I think if I could UPDATE the result in this function immediately after finishing the first table, it will be great for my requirement.
Can you let me know how I can update the result into the table after this function finishes the counting of the first table?
CREATE FUNCTION rowcount_all(schema_name text default 'public')
RETURNS table(table_name text, cnt bigint) as
$$
declare
table_name text;
begin
for table_name in SELECT c.relname FROM pg_class c
JOIN pg_namespace s ON (c.relnamespace=s.oid)
WHERE c.relkind = 'r' AND s.nspname=schema_name
ORDER BY c.relname
LOOP
RETURN QUERY EXECUTE format('select cast(%L as text), count(*) from %I.%I',
table_name, schema_name, table_name);
END LOOP;
end
$$ language plpgsql;
-- Query
WITH rc(schema_name,tbl) AS (
select s.n,rowcount_all(s.n) from (values ('schema1'),('schema2')) as s(n)
)
SELECT schema_name,(tbl).* FROM rc;
Updated
I have decided to use a shell script to run the function below as a background process. The function would generate a processing log file so that I can check the current process.
I think your idea is good, but I also don't think it will work "out of the box" on PostgreSQL. I'm by no means the expert on this, but a function runs inside a single transaction, and under MVCC the rows it writes stay invisible to other sessions until that transaction commits at the end.
This has a lot of advantages, most notably that when someone is updating tables it doesn't prevent others from querying from those same tables.
If this were Oracle, I think you could accomplish this within the stored proc by using commit, but this isn't Oracle. And to be fair, Oracle doesn't allow truncates to be rolled back within a stored proc the way PostgreSQL does, so there are gives and takes.
Again, I'm not the expert, so if I've messed up a detail or two, feel free to correct me.
So, back to the solution. One way you COULD accomplish this is to set up your server as a remote server. Something like this would work:
CREATE SERVER pgprod
FOREIGN DATA WRAPPER dblink_fdw
OPTIONS (dbname 'postgres', host 'localhost', port '5432');
Assuming you have a table that stores the tables and counts:
create table table_counts (
table_name text not null,
record_count bigint,
constraint table_counts_pk primary key (table_name)
);
Were it not for your desire to see these results as they occur, something like this would work, for a single schema. It's easy enough to make this all schemas, so this is for illustration:
CREATE or replace FUNCTION rowcount_all(schema_name text)
returns void as
$$
declare
rowcount bigint; -- count(*) returns bigint
tablename text;
begin
for tablename in SELECT c.relname FROM pg_class c
JOIN pg_namespace s ON (c.relnamespace=s.oid)
WHERE c.relkind = 'r' AND s.nspname=schema_name
ORDER BY c.relname
LOOP
EXECUTE format('select count(*) from %I.%I', schema_name, tablename) into rowcount;
insert into table_counts values (schema_name || '.' || tablename, rowcount)
on conflict (table_name) do
update set record_count = rowcount;
END LOOP;
end
$$ language plpgsql;
(this presupposes 9.5 or greater -- if not, hand-roll your own upsert).
However, since you want real-time updates to the table, you could then put that same upsert into a dblink expression:
perform dblink_exec('pgprod', '
<< your upsert statement here >>
');
Of course the formatting of the SQL within the DBlink is now a little extra tricky, but the upside is once you nail it, you can run the function in the background and query the table while it's running to see the dynamic results.
I'd weigh that against the need to really have the information real-time.
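To make the dblink variant concrete, the whole function might look roughly like the sketch below. It assumes the dblink extension is installed, that the pgprod server and the table_counts table from above exist, and that a user mapping for the calling role has been created on pgprod. The function name rowcount_all_live is made up for illustration.

```sql
-- Hypothetical variant: each upsert goes through a separate dblink connection,
-- so it commits immediately and is visible to other sessions while the loop runs.
CREATE OR REPLACE FUNCTION rowcount_all_live(schema_name text)
RETURNS void AS
$$
DECLARE
    rowcount  bigint;
    tablename text;
BEGIN
    FOR tablename IN
        SELECT c.relname
        FROM   pg_class c
        JOIN   pg_namespace s ON c.relnamespace = s.oid
        WHERE  c.relkind = 'r' AND s.nspname = schema_name
        ORDER  BY c.relname
    LOOP
        EXECUTE format('SELECT count(*) FROM %I.%I', schema_name, tablename)
        INTO rowcount;

        -- %L quotes the table label as a literal; %s inserts the numeric count
        PERFORM dblink_exec(
            'pgprod',
            format(
                'INSERT INTO table_counts (table_name, record_count)
                 VALUES (%L, %s)
                 ON CONFLICT (table_name) DO UPDATE
                 SET record_count = EXCLUDED.record_count',
                schema_name || '.' || tablename,
                rowcount
            )
        );
    END LOOP;
END
$$ LANGUAGE plpgsql;
```

Building the remote statement with format() avoids most of the nested-quoting trouble. Calling dblink_exec with a server name opens a new connection per table, which is acceptable for a hundred tables but adds overhead on larger schemas.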

Create a temporary table from a selection or insert if table already exist

How to create a temporary table, if it does not already exist, and add the selected rows to it?
CREATE TABLE AS is the simplest and fastest way:
CREATE TEMP TABLE tbl AS
SELECT * FROM tbl WHERE ... ;
Do not use SELECT INTO. See:
Combine two tables into a new one so that select rows from the other one are ignored
Not sure whether table already exists
CREATE TABLE IF NOT EXISTS ... was introduced in Postgres 9.1.
For older versions, use the function provided in this related answer:
PostgreSQL create table if not exists
Then:
INSERT INTO tbl (col1, col2, ...)
SELECT col1, col2, ...
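Put together for Postgres 9.1 or later, the two steps might look like the sketch below. All names (src_tbl, tmp_sel, col1, col2) and the WHERE condition are placeholders, and the first statement exists only so the example runs on its own.

```sql
-- Hypothetical source table, only here to make the sketch self-contained.
CREATE TABLE IF NOT EXISTS src_tbl (col1 int, col2 text);

-- Step 1: create the temp table with the same column layout if it is missing.
CREATE TEMP TABLE IF NOT EXISTS tmp_sel (LIKE src_tbl);

-- Step 2: append the selected rows, whether the table was just created or not.
INSERT INTO tmp_sel (col1, col2)
SELECT col1, col2
FROM   src_tbl
WHERE  col1 > 0;
```

If the temp table already held rows from an earlier run in the same session, this appends to them, so duplicates are possible.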
Chances are, something is going wrong in your code if the temp table already exists. Make sure you don't duplicate data in the table or something. Or consider the following paragraph ...
Unique names
Temporary tables are only visible within your current session (not to be confused with transaction!). So the table name cannot conflict with other sessions. If you need unique names within your session, you could use dynamic SQL and utilize a SEQUENCE:
Create once:
CREATE SEQUENCE tablename_helper_seq;
You could use a DO statement (or a plpgsql function):
DO
$do$
BEGIN
EXECUTE
'CREATE TEMP TABLE tbl' || nextval('tablename_helper_seq'::regclass) || ' AS
SELECT * FROM tbl WHERE ... ';
RAISE NOTICE 'Temporary table created: "tbl%"', lastval();
END
$do$;
lastval() and currval(regclass) are instrumental to return the dynamically created table name.

How do I drop all tables in psql (PostgreSQL interactive terminal) that starts with a common word?

How do I drop all tables whose name start with, say, doors_? Can I do some sort of regex using the drop table command?
I prefer not writing a custom script but all solutions are welcomed. Thanks!
This script will generate the DDL commands to drop them all:
SELECT 'DROP TABLE ' || t.oid::regclass || ';'
FROM pg_class t
-- JOIN pg_namespace n ON n.oid = t.relnamespace -- to select by schema
WHERE t.relkind = 'r'
AND t.relname ~~ E'doors\\_%' -- enter search term for table here; \\_ matches a literal underscore
-- AND n.nspname ~~ '%myschema%' -- optionally select by schema(s), too
ORDER BY 1;
The cast t.oid::regclass makes the syntax work for mixed case identifiers, reserved words or special characters in table names, too. It also prevents SQL injection and prepends the schema name where necessary. More about object identifier types in the manual.
About the schema search path.
You could automate the dropping, too, but it's unwise not to check what you actually delete before you do.
You could append CASCADE to every statement to DROP depending objects (views and referencing foreign keys). But, again, that's unwise unless you know very well what you are doing. Foreign key constraints are no big loss, but this will also drop all dependent views entirely. Without CASCADE you get error messages informing you which objects prevent you from dropping the table. And you can then deal with it.
I normally use one query to generate the DDL commands for me based on some of the metadata tables and then run those commands manually. For example:
SELECT 'DROP TABLE ' || tablename || ';' FROM pg_tables
WHERE tablename LIKE 'prefix%' AND schemaname = 'public';
This will return a bunch of DROP TABLE xxx; queries, which I simply copy&paste to the console. While you could add some code to execute them automatically, I prefer to run them on my own.
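Plain concatenation of tablename breaks for mixed-case names or names that need quoting. A slightly more defensive version of the same pg_tables query could use format() with %I, as sketched here; 'prefix%' and 'public' are the same placeholders as above.

```sql
-- Generates schema-qualified, properly quoted DROP statements.
-- Nothing is dropped until the output is reviewed and run by hand.
SELECT format('DROP TABLE %I.%I;', schemaname, tablename) AS ddl
FROM   pg_tables
WHERE  tablename LIKE 'prefix%'
AND    schemaname = 'public'
ORDER  BY 1;
```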