PostgreSQL rename a column only if it exists

PostgreSQL rename a column only if it exists - postgresql

I couldn't find in the PostgreSQL documentation if there is a way to run an: ALTER TABLE tablename RENAME COLUMN IF EXISTS colname TO newcolname; statement.
I would be glad we could, because I'm facing an error that depends on who made and gave me an SQL script, for which in some cases everything is perfectly fine (when the column has the wrong name, name that will actually be changed using a RENAME statement), and in other cases not (when the column already has the right name).
Hence the idea of using an IF EXISTS statement on the column name while trying to rename it. If the column has already the good name (here cust_date_mean), the rename command that must be applied only on the wrong name should be properly skipped and not issuing the following error:
db_1 | [223] ERROR: column "cust_mean" does not exist
db_1 | [223] STATEMENT: ALTER TABLE tablename RENAME COLUMN cust_mean TO cust_date_mean;
db_1 | ERROR: column "cust_mean" does not exist
(In the meantime I will clarify things with the team so it's not a big deal if such command doesn't exist but I think it could help).

While there is no built-in feature, you can use a DO statement:
DO
$$
DECLARE
_tbl regclass := 'public.tbl'; -- not case sensitive unless double-quoted
_colname name := 'cust_mean'; -- exact, case sensitive, no double-quoting
_new_colname text := 'cust_date_mean'; -- exact, case sensitive, no double-quoting
BEGIN
IF EXISTS (SELECT FROM pg_attribute
WHERE attrelid = _tbl
AND attname = _colname
AND attnum > 0
AND NOT attisdropped) THEN
EXECUTE format('ALTER TABLE %s RENAME COLUMN %I TO %I', _tbl, _colname, _new_colname);
ELSE
RAISE NOTICE 'Column % of table % not found!', quote_ident(_colname), _tbl;
END IF;
END
$$;
Does exactly what you ask for. Enter table and column names in the DECLARE section.
The NOTICE is optional.
For repeated use, I would create a function and pass parameters instead of the variables.
Variables are handled safely (no SQL injection). The table name can optionally be schema-qualified. If it's not, it's resolved according to the current search_path, just like ALTER TABLE would.
Related:
PostgreSQL create table if not exists
Table name as a PostgreSQL function parameter

Related

Declare and return value for DELETE and INSERT

I am trying to remove duplicated data from some of our databases based upon unique id's. All deleted data should be stored in a separate table for auditing purposes. Since it concerns quite some databases and different schemas and tables I wanted to start using variables to reduce chance of errors and the amount of work it will take me.
This is the best example query I could think off, but it doesn't work:
do $$
declare #source_schema varchar := 'my_source_schema';
declare #source_table varchar := 'my_source_table';
declare #target_table varchar := 'my_target_schema' || source_table || '_duplicates'; --target schema and appendix are always the same, source_table is a variable input.
declare #unique_keys varchar := ('1', '2', '3')
begin
select into #target_table
from #source_schema.#source_table
where id in (#unique_keys);
delete from #source_schema.#source_table where export_id in (#unique_keys);
end ;
$$;
The query syntax works with hard-coded values.
Most of the times my variables are perceived as columns or not recognized at all. :(

You need to create and then call a plpgsql procedure with input parameters :
CREATE OR REPLACE PROCEDURE duplicates_suppress
(my_target_schema text, my_source_schema text, my_source_table text, unique_keys text[])
LANGUAGE plpgsql AS
$$
BEGIN
EXECUTE FORMAT(
'WITH list AS (INSERT INTO %1$I.%3$I_duplicates SELECT * FROM %2$I.%3$I WHERE array[id] <# %4$L :: integer[] RETURNING id)
DELETE FROM %2$I.%3$I AS t USING list AS l WHERE t.id = l.id', my_target_schema, my_source_schema, my_source_table, unique_keys :: text) ;
END ;
$$ ;
The procedure duplicates_suppress inserts into my_target_schema.my_source_table || '_duplicates' the rows from my_source_schema.my_source_table whose id is in the array unique_keys and then deletes these rows from the table my_source_schema.my_source_table .
See the test result in dbfiddle.

As has been commented, you need some kind of dynamic SQL. In a FUNCTION, PROCEDURE or a DO statement to do it on the server.
You should be comfortable with PL/pgSQL. Dynamic SQL is no beginners' toy.
Example with a PROCEDURE, like Edouard already suggested. You'll need a FUNCTION instead to wrap it in an outer transaction (like you very well might). See:
When to use stored procedure / user-defined function?
CREATE OR REPLACE PROCEDURE pg_temp.f_archive_dupes(_source_schema text, _source_table text, _unique_keys int[], OUT _row_count int)
LANGUAGE plpgsql AS
$proc$
-- target schema and appendix are always the same, source_table is a variable input
DECLARE
_target_schema CONSTANT text := 's2'; -- hardcoded
_target_table text := _source_table || '_duplicates';
_sql text := format(
'WITH del AS (
DELETE FROM %I.%I
WHERE id = ANY($1)
RETURNING *
)
INSERT INTO %I.%I TABLE del', _source_schema, _source_table
, _target_schema, _target_table);
BEGIN
RAISE NOTICE '%', _sql; -- debug
EXECUTE _sql USING _unique_keys; -- execute
GET DIAGNOSTICS _row_count = ROW_COUNT;
END
$proc$;
Call:
CALL pg_temp.f_archive_dupes('s1', 't1', '{1, 3}', 0);
db<>fiddle here
I made the procedure temporary, since I assume you don't need to keep it permanently. Create it once per database. See:
How to create a temporary function in PostgreSQL?
Passed schema and table names are case-sensitive strings! (Unlike unquoted identifiers in plain SQL.) Either way, be wary of SQL-injection when concatenating SQL dynamically. See:
Are PostgreSQL column names case-sensitive?
Table name as a PostgreSQL function parameter
Made _unique_keys type int[] (array of integer) since your sample values look like integers. Use a the actual data type of your id columns!
The variable _sql holds the query string, so it can easily be debugged before actually executing. Using RAISE NOTICE '%', _sql; for that purpose.
I suggest to comment the EXECUTE line until you are sure.
I made the PROCEDURE return the number of processed rows. You didn't ask for that, but it's typically convenient. At hardly any cost. See:
Dynamic SQL (EXECUTE) as condition for IF statement
Best way to get result count before LIMIT was applied
Last, but not least, use DELETE ... RETURNING * in a data-modifying CTE. Since that has to find rows only once it comes at about half the cost of separate SELECT and DELETE. And it's perfectly safe. If anything goes wrong, the whole transaction is rolled back anyway.
Two separate commands can also run into concurrency issues or race conditions which are ruled out this way, as DELETE implicitly locks the rows to delete. Example:
Replicating data between Postgres DBs
Or you can build the statements in a client program. Like psql, and use \gexec. Example:
Filter column names from existing table for SQL DDL statement

Based on Erwin's answer, minor optimization...
create or replace procedure pg_temp.p_archive_dump
(_source_schema text, _source_table text,
_unique_key int[],_target_schema text)
language plpgsql as
$$
declare
_row_count bigint;
_target_table text := '';
BEGIN
select quote_ident(_source_table) ||'_'|| array_to_string(_unique_key,'_') into _target_table from quote_ident(_source_table);
raise notice 'the deleted table records will store in %.%',_target_schema, _target_table;
execute format('create table %I.%I as select * from %I.%I limit 0',_target_schema, _target_table,_source_schema,_source_table );
execute format('with mm as ( delete from %I.%I where id = any (%L) returning * ) insert into %I.%I table mm'
,_source_schema,_source_table,_unique_key, _target_schema, _target_table);
GET DIAGNOSTICS _row_count = ROW_COUNT;
RAISE notice 'rows influenced, %',_row_count;
end
$$;
--
if your _unique_key is not that much, this solution also create a table for you. Obviously you need to create the target schema yourself.
If your unique_key is too much, you can customize to properly rename the dumped table.
Let's call it.
call pg_temp.p_archive_dump('s1','t1', '{1,2}','s2');
s1 is the source schema, t1 is source table, {1,2} is the unique key you want to extract to the new table. s2 is the target schema

Dynamic SELECT INTO in PL/pgSQL function

How can I write a dynamic SELECT INTO query inside a PL/pgSQL function in Postgres?
Say I have a variable called tb_name which is filled in a FOR loop from information_schema.tables. Now I have a variable called tc which will be taking the row count for each table. I want something like the following:
FOR tb_name in select table_name from information_schema.tables where table_schema='some_schema' and table_name like '%1%'
LOOP
EXECUTE FORMAT('select count(*) into' || tc 'from' || tb_name);
END LOOP
What should be the data type of tb_name and tc in this case?

CREATE OR REPLACE FUNCTION myfunc(_tbl_pattern text, _schema text = 'public')
RETURNS void AS -- or whatever you want to return
$func$
DECLARE
_tb_name information_schema.tables.table_name%TYPE; -- currently varchar
_tc bigint; -- count() returns bigint
BEGIN
FOR _tb_name IN
SELECT table_name
FROM information_schema.tables
WHERE table_schema = _schema
AND table_name ~ _tbl_pattern -- see below!
LOOP
EXECUTE format('SELECT count(*) FROM %I.%I', _schema, _tb_name)
INTO _tc;
-- do something with _tc
END LOOP;
END
$func$ LANGUAGE plpgsql;
Notes
I prepended all parameters and variables with an underscore (_) to avoid naming collisions with table columns. Just a useful convention.
_tc should be bigint, since that's what the aggregate function count() returns.
The data type of _tb_name is derived from its parent column dynamically: information_schema.tables.table_name%TYPE. See the chapter Copying Types in the manual.
Are you sure you only want tables listed in information_schema.tables? Makes sense, but be aware of implications. See:
How to check if a table exists in a given schema
a_horse already pointed to the manual and Andy provided a code example. This is how you assign a single row or value returned from a dynamic query with EXECUTE to a (row) variable. A single column (like count in the example) is decomposed from the row type automatically, so we can assign to the scalar variable tc directly - in the same way we would assign a whole row to a record or row variable. Related:
How to get the value of a dynamically generated field name in PL/pgSQL
Schema-qualify the table name in the dynamic query. There may be other tables of the same name in the current search_path, which would result in completely wrong (and very confusing!) results without schema-qualification. Sneaky bug! Or this schema is not in the search_path at all, which would make the function raise an exception immediately.
How does the search_path influence identifier resolution and the "current schema"
Always quote identifiers properly to defend against SQL injection and random errors. Schema and table have to be quoted separately! See:
Table name as a PostgreSQL function parameter
Truncating all tables in a Postgres database
I use the regular expression operator ~ in table_name ~ _tbl_pattern instead of table_name LIKE ('%' || _tbl_pattern || '%'), that's simpler. Be wary of special characters in the pattern parameter either way! See:
PostgreSQL Reverse LIKE
Escape function for regular expression or LIKE patterns
Pattern matching with LIKE, SIMILAR TO or regular expressions in PostgreSQL
I set a default for the schema name in the function call: _schema text = 'public'. Just for convenience, you may or may not want that. See:
Assigning default value for type
Addressing your comment: to pass values, use the USING clause like:
EXECUTE format('SELECT count(*) FROM %I.%I
WHERE some_column = $1', _schema, _tb_name,column_name)
USING user_def_variable;
Related:
INSERT with dynamic table name in trigger function

It looks like you want the %I placeholder for FORMAT so that it treats your variable as an identifier. Also, the INTO clause should go outside the prepared statement.
FOR tb_name in select table_name from information_schema.tables where table_schema='some_schema' and table_name like '%1%'
LOOP
EXECUTE FORMAT('select count(*) from %I', tb_name) INTO tc;
END LOOP

PostgreSQL execute custom queries from string

I am trying to create a set of functions for a PostgreSQL database I have, and I am facing this problem. What I am trying to do is to create a function, which takes as a parameter the name of his column a user wants to change, and the new value they wish to insert.
I thought that the simplest way would be to create the queries as strings internally and execute them (sql injections are not of concern right now).
Searching a bit, I tried to use the EXECUTE command, with no success. Any arrangement of the queries I tried did not work, with or without SET, and even a super simple query shows a syntax error, inside the function code or in the pgadmin sql editor:
EXECUTE "SELECT * FROM user_data.users;";
ERROR: prepared statement "SELECT * FROM user_data.users;" does not exist
********** Error **********
ERROR: prepared statement "SELECT * FROM user_data.users;" does not exist
SQL state: 26000
Any suggestions to solve this?

To alter a column name you could use a function like this.
It takes the table name, column name, and new column name.
CREATE OR REPLACE FUNCTION foo(_t regclass, _ocn text, _ncn text)
RETURNS void AS
$func$
BEGIN
EXECUTE 'ALTER TABLE '|| _t ||'
RENAME COLUMN '|| quote_ident(_ocn) ||' TO '|| quote_ident(_ncn) ||'';
END
$func$ LANGUAGE plpgsql;
To alter the type is a bit more work and may add to the server overheads.
Lets use this table as an example
CREATE TABLE foobar (id serial primary key, name text);
Ref Postgresql Type Conversion
In many cases a user does not need to understand the details of the
type conversion mechanism. However, implicit conversions done by
PostgreSQL can affect the results of a query. When necessary, these
results can be tailored by using explicit type conversion.
Firstly we would have to get the column data type(s) from the information The Information Schema Ref postgresql 9.4
select data_type from information_schema.columns where table_name = 'foobar' and column_name = 'name';
We could cast types as mentioned in type conversion::Table 8-1. Data Types Ref Postgresql 9.4
ALTER TABLE foobar ALTER COLUMN name TYPE varchar(20) USING name::text;
I hope this helps

PostgreSQL function-OID: 'my OID' where 'my' is the currently executing user-defined-function

This query returns the OID of the function whose name and signature is supplied:
select 'myfunc(signature)'::regprocedure::oid;
But is there something in PostgreSQL plpgsql like a myNameAndSignature() function so we could use dynamic sql to build a statement that gets the OID of the function and then creates a temporary table with the OID appended to the name of the temp table?
The statement to execute dynamically is:
create temp table TT17015
I'm new to PostgreSQL, and maybe there's a better way to handle naming of temporary tables so the functions that use temp tables, and call each other, don't get the error that a particular temp table it is trying to delete is in use elsewhere?

Using the OID of a function does not necessarily prevent a naming conflict. The same function could be run multiple times in the same session.
If you are in need of a unique name, use a SEQUENCE. Run once in your database:
CREATE SEQUENCE tt_seq;
Then, in your plpgsql function or DO statement:
DO
$$
DECLARE
_tbl text := 'tt' || nextval('tt_seq');
BEGIN
EXECUTE 'CREATE TEMP TABLE ' || _tbl || '(id int)';
END
$$
Drawback is that you have to use dynamic SQL for dynamic identifiers. Plain SQL commands do not accept parameters for identifiers.

How to delete unused sequences?

We are using PostgreSQL. My requirement is to delete unused sequences from my database.
For example, if I create any table through my application, one sequence will be created, but for deleting the table we are not deleting the sequence, too. If want to create the same table another sequence is being created.
Example: table: file; automatically created sequence for id coumn: file_id_seq
When I delete the table file and create it with same name again, a new sequence is being created (i.e. file_id_seq1). I have accumulated a huge number of unused sequences in my application database this way.
How to delete these unused sequences?

A sequence that is created automatically for a serial column is deleted automatically, when the column (or its table) is dropped. The problem you describe should not exist to begin with. Only very old versions of PostgreSQL did not do that. 7.4 or older?
Solution for the problem
This query will generate the DDL commands to delete all "unbound" sequences in the database it is executed in:
SELECT string_agg('DROP SEQUENCE ' || c.oid::regclass, '; ') || ';' AS ddl
FROM pg_class c
LEFT JOIN pg_depend d ON d.refobjid = c.oid
AND d.deptype <> 'i'
WHERE c.relkind = 'S'
AND d.refobjid IS NULL;
The cast to regclass in c.oid::regclass automatically schema-qualifies sequence names where necessary according to the current search_path. See:
How to check if a table exists in a given schema
How does the search_path influence identifier resolution and the "current schema"
Result:
DROP SEQUENCE foo_id_seq;
DROP SEQUENCE bar_id_seq;
...
Execute the result to drop all sequences that are not bound to a serial column (or any other column). Study the meaning of columns and tables here.
Careful! These sequences might be in use otherwise. There are use cases where sequences are created as standalone objects. For instance, if you want multiple columns to share one sequence. You should know exactly what you are doing.
However, you cannot delete sequences bound to a serial column this way. So the operation is safe in this respect.
DROP SEQUENCE test_id_seq;
Result:
ERROR: cannot drop sequence test_id_seq because other objects depend on it
DETAIL: default for table test column id depends on sequence test_id_seq
HINT: Use DROP ... CASCADE to drop the dependent objects too.

If you are using pgAdmin, you can select the sequence and check the "depends on" tab. It will list any object that relies on the sequence.
Another way is to TRY to delete the sequence. If a table references it, pgAdmin will throw an error saying that something is depending on this sequence. If you are able to delete the sequence without any errors, there is no dependency.
Be sure to test this somewhere.

What i do is first I got all the sequences and then saved these result into a file then i run the file in psql: below content was saved with file name del_seq_all.sql and then list sequences in test1.sql . i dont know this is the correct solution or not. But result is coming as expected.
\o d:/test1.sql
SELECT 'drop sequence ' || c.relname || ';' FROM pg_class c WHERE
(c.relkind = 'S');
\o
\i d:/test1.sql

Proceed with caution, "drop sequence sequence_name_here" will successfully drop a sequence even if it's attached as a default nextval() value of a table column. There seems to be some disconnect here especially if the sequence was created separately. I'm also looking for the perfect one liner to clean up 100% unused sequences.

Building on the answer by #erwin:
DO $$ DECLARE
r text;
BEGIN
FOR r IN (
SELECT cl.relname
FROM pg_class cl
LEFT JOIN pg_namespace ns ON ns."oid" = cl.relnamespace
LEFT JOIN pg_depend d ON d.refobjid = cl."oid" AND d.deptype <> 'i'
WHERE ns.nspname = 'public'
AND cl.relkind = 'S'
AND d.refobjid IS NULL
) LOOP
-- dangerous, test before you execute!
RAISE NOTICE '%', -- once confident, comment this line ...
-- EXECUTE -- ... and uncomment this one
'DROP SEQUENCE ' || quote_ident(r);
END LOOP;
END $$;
This will actually execute the query and produce the intended result

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse