LOOP for table function - postgresql

I'm a newbie in Posgresql
I have a table function:
CREATE OR REPLACE FUNCTION stage.get_primary_key_info(
schemaName text,
tableName text
) RETURNS TABLE(constraint_name text, column_name text) AS
$BODY$
SELECT c.constraint_name, c.column_name
FROM information_schema.key_column_usage AS c
LEFT JOIN information_schema.table_constraints AS t
ON t.constraint_name = c.constraint_name
WHERE t.table_schema = schemaName
AND t.table_name = tableName
AND t.constraint_type = 'PRIMARY KEY'
;
$BODY$ LANGUAGE sql;
And I'm trying to use this function like:
FOR c IN (SELECT * FROM stage.get_primary_key_info(target_schema, stmt.tablename))
LOOP
joinFields = joinFields || FORMAT('t.%s = s.%s AND', c.column_name);
END LOOP;
But I have this error:
The loop variable of the tuples must be a variable of the type record
or tuple or a list of scalar variables

Try declaring c as a record. You are also missing a second parameter in the FORMAT string.
Also, take a look at string_agg and you might be able to skip that loop entirely. Here is an example (substituting y for the second parameter):
SELECT string_agg(FORMAT('t.%s = s.%s', column_name, 'y'), ' AND ')
FROM get_primary_key_info(target_schema, stmt.tablename)
;

Related

Find table names that contain a specific column entry from another table [duplicate]

Is it possible to search every column of every table for a particular value in PostgreSQL?
A similar question is available here for Oracle.
How about dumping the contents of the database, then using grep?
$ pg_dump --data-only --inserts -U postgres your-db-name > a.tmp
$ grep United a.tmp
INSERT INTO countries VALUES ('US', 'United States');
INSERT INTO countries VALUES ('GB', 'United Kingdom');
The same utility, pg_dump, can include column names in the output. Just change --inserts to --column-inserts. That way you can search for specific column names, too. But if I were looking for column names, I'd probably dump the schema instead of the data.
$ pg_dump --data-only --column-inserts -U postgres your-db-name > a.tmp
$ grep country_code a.tmp
INSERT INTO countries (iso_country_code, iso_country_name) VALUES ('US', 'United States');
INSERT INTO countries (iso_country_code, iso_country_name) VALUES ('GB', 'United Kingdom');
Here's a pl/pgsql function that locates records where any column contains a specific value.
It takes as arguments the value to search in text format, an array of table names to search into (defaults to all tables) and an array of schema names (defaults all schema names).
It returns a table structure with schema, name of table, name of column and pseudo-column ctid (non-durable physical location of the row in the table, see System Columns)
CREATE OR REPLACE FUNCTION search_columns(
needle text,
haystack_tables name[] default '{}',
haystack_schema name[] default '{}'
)
RETURNS table(schemaname text, tablename text, columnname text, rowctid text)
AS $$
begin
FOR schemaname,tablename,columnname IN
SELECT c.table_schema,c.table_name,c.column_name
FROM information_schema.columns c
JOIN information_schema.tables t ON
(t.table_name=c.table_name AND t.table_schema=c.table_schema)
JOIN information_schema.table_privileges p ON
(t.table_name=p.table_name AND t.table_schema=p.table_schema
AND p.privilege_type='SELECT')
JOIN information_schema.schemata s ON
(s.schema_name=t.table_schema)
WHERE (c.table_name=ANY(haystack_tables) OR haystack_tables='{}')
AND (c.table_schema=ANY(haystack_schema) OR haystack_schema='{}')
AND t.table_type='BASE TABLE'
LOOP
FOR rowctid IN
EXECUTE format('SELECT ctid FROM %I.%I WHERE cast(%I as text)=%L',
schemaname,
tablename,
columnname,
needle
)
LOOP
-- uncomment next line to get some progress report
-- RAISE NOTICE 'hit in %.%', schemaname, tablename;
RETURN NEXT;
END LOOP;
END LOOP;
END;
$$ language plpgsql;
See also the version on github based on the same principle but adding some speed and reporting improvements.
Examples of use in a test database:
Search in all tables within public schema:
select * from search_columns('foobar');
schemaname | tablename | columnname | rowctid
------------+-----------+------------+---------
public | s3 | usename | (0,11)
public | s2 | relname | (7,29)
public | w | body | (0,2)
(3 rows)
Search in a specific table:
select * from search_columns('foobar','{w}');
schemaname | tablename | columnname | rowctid
------------+-----------+------------+---------
public | w | body | (0,2)
(1 row)
Search in a subset of tables obtained from a select:
select * from search_columns('foobar', array(select table_name::name from information_schema.tables where table_name like 's%'), array['public']);
schemaname | tablename | columnname | rowctid
------------+-----------+------------+---------
public | s2 | relname | (7,29)
public | s3 | usename | (0,11)
(2 rows)
Get a result row with the corresponding base table and and ctid:
select * from public.w where ctid='(0,2)';
title | body | tsv
-------+--------+---------------------
toto | foobar | 'foobar':2 'toto':1
Variants
To test against a regular expression instead of strict equality, like grep, this part of the query:
SELECT ctid FROM %I.%I WHERE cast(%I as text)=%L
may be changed to:
SELECT ctid FROM %I.%I WHERE cast(%I as text) ~ %L
For case insensitive comparisons, you could write:
SELECT ctid FROM %I.%I WHERE lower(cast(%I as text)) = lower(%L)
to search every column of every table for a particular value
This does not define how to match exactly.
Nor does it define what to return exactly.
Assuming:
Find any row with any column containing the given value in its text representation - as opposed to equaling the given value.
Return the table name (regclass) and the tuple ID (ctid), because that's simplest.
Here is a dead simple, fast and slightly dirty way:
CREATE OR REPLACE FUNCTION search_whole_db(_like_pattern text)
RETURNS TABLE(_tbl regclass, _ctid tid) AS
$func$
BEGIN
FOR _tbl IN
SELECT c.oid::regclass
FROM pg_class c
JOIN pg_namespace n ON n.oid = relnamespace
WHERE c.relkind = 'r' -- only tables
AND n.nspname !~ '^(pg_|information_schema)' -- exclude system schemas
ORDER BY n.nspname, c.relname
LOOP
RETURN QUERY EXECUTE format(
'SELECT $1, ctid FROM %s t WHERE t::text ~~ %L'
, _tbl, '%' || _like_pattern || '%')
USING _tbl;
END LOOP;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM search_whole_db('mypattern');
Provide the search pattern without enclosing %.
Why slightly dirty?
If separators and decorators for the row in text representation can be part of the search pattern, there can be false positives:
column separator: , by default
whole row is enclosed in parentheses:()
some values are enclosed in double quotes "
\ may be added as escape char
And the text representation of some columns may depend on local settings - but that ambiguity is inherent to the question, not to my solution.
Each qualifying row is returned once only, even when it matches multiple times (as opposed to other answers here).
This searches the whole DB except for system catalogs. Will typically take a long time to finish. You might want to restrict to certain schemas / tables (or even columns) like demonstrated in other answers. Or add notices and a progress indicator, also demonstrated in another answer.
The regclass object identifier type is represented as table name, schema-qualified where necessary to disambiguate according to the current search_path:
Find the referenced table name using table, field and schema name
What is the ctid?
How do I decompose ctid into page and row numbers?
You might want to escape characters with special meaning in the search pattern. See:
Escape function for regular expression or LIKE patterns
There is a way to achieve this without creating a function or using an external tool. By using Postgres' query_to_xml() function that can dynamically run a query inside another query, it's possible to search a text across many tables. This is based on my answer to retrieve the rowcount for all tables:
To search for the string foo across all tables in a schema, the following can be used:
with found_rows as (
select format('%I.%I', table_schema, table_name) as table_name,
query_to_xml(format('select to_jsonb(t) as table_row
from %I.%I as t
where t::text like ''%%foo%%'' ', table_schema, table_name),
true, false, '') as table_rows
from information_schema.tables
where table_schema = 'public'
)
select table_name, x.table_row
from found_rows f
left join xmltable('//table/row'
passing table_rows
columns
table_row text path 'table_row') as x on true
Note that the use of xmltable requires Postgres 10 or newer. For older Postgres version, this can be also done using xpath().
with found_rows as (
select format('%I.%I', table_schema, table_name) as table_name,
query_to_xml(format('select to_jsonb(t) as table_row
from %I.%I as t
where t::text like ''%%foo%%'' ', table_schema, table_name),
true, false, '') as table_rows
from information_schema.tables
where table_schema = 'public'
)
select table_name, x.table_row
from found_rows f
cross join unnest(xpath('/table/row/table_row/text()', table_rows)) as r(data)
The common table expression (WITH ...) is only used for convenience. It loops through all tables in the public schema. For each table the following query is run through the query_to_xml() function:
select to_jsonb(t)
from some_table t
where t::text like '%foo%';
The where clause is used to make sure the expensive generation of XML content is only done for rows that contain the search string. This might return something like this:
<table xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<row>
<table_row>{"id": 42, "some_column": "foobar"}</table_row>
</row>
</table>
The conversion of the complete row to jsonb is done, so that in the result one could see which value belongs to which column.
The above might return something like this:
table_name | table_row
-------------+----------------------------------------
public.foo | {"id": 1, "some_column": "foobar"}
public.bar | {"id": 42, "another_column": "barfoo"}
Online example for Postgres 10+
Online example for older Postgres versions
Without storing a new procedure you can use a code block and execute to obtain a table of occurences. You can filter results by schema, table or column name.
DO $$
DECLARE
value int := 0;
sql text := 'The constructed select statement';
rec1 record;
rec2 record;
BEGIN
DROP TABLE IF EXISTS _x;
CREATE TEMPORARY TABLE _x (
schema_name text,
table_name text,
column_name text,
found text
);
FOR rec1 IN
SELECT table_schema, table_name, column_name
FROM information_schema.columns
WHERE table_name <> '_x'
AND UPPER(column_name) LIKE UPPER('%%')
AND table_schema <> 'pg_catalog'
AND table_schema <> 'information_schema'
AND data_type IN ('character varying', 'text', 'character', 'char', 'varchar')
LOOP
sql := concat('SELECT ', rec1."column_name", ' AS "found" FROM ',rec1."table_schema" , '.',rec1."table_name" , ' WHERE UPPER(',rec1."column_name" , ') LIKE UPPER(''','%my_substring_to_find_goes_here%' , ''')');
RAISE NOTICE '%', sql;
BEGIN
FOR rec2 IN EXECUTE sql LOOP
RAISE NOTICE '%', sql;
INSERT INTO _x VALUES (rec1."table_schema", rec1."table_name", rec1."column_name", rec2."found");
END LOOP;
EXCEPTION WHEN OTHERS THEN
END;
END LOOP;
END; $$;
SELECT * FROM _x;
If you're using IntelliJ add your DB to Database view then right click on databases and select full text search, it will list all tables and all fields for your specific text.
And if someone think it could help. Here is #Daniel Vérité's function, with another param that accept names of columns that can be used in search. This way it decrease the time of processing. At least in my test it reduced a lot.
CREATE OR REPLACE FUNCTION search_columns(
needle text,
haystack_columns name[] default '{}',
haystack_tables name[] default '{}',
haystack_schema name[] default '{public}'
)
RETURNS table(schemaname text, tablename text, columnname text, rowctid text)
AS $$
begin
FOR schemaname,tablename,columnname IN
SELECT c.table_schema,c.table_name,c.column_name
FROM information_schema.columns c
JOIN information_schema.tables t ON
(t.table_name=c.table_name AND t.table_schema=c.table_schema)
WHERE (c.table_name=ANY(haystack_tables) OR haystack_tables='{}')
AND c.table_schema=ANY(haystack_schema)
AND (c.column_name=ANY(haystack_columns) OR haystack_columns='{}')
AND t.table_type='BASE TABLE'
LOOP
EXECUTE format('SELECT ctid FROM %I.%I WHERE cast(%I as text)=%L',
schemaname,
tablename,
columnname,
needle
) INTO rowctid;
IF rowctid is not null THEN
RETURN NEXT;
END IF;
END LOOP;
END;
$$ language plpgsql;
Bellow is an example of usage of the search_function created above.
SELECT * FROM search_columns('86192700'
, array(SELECT DISTINCT a.column_name::name FROM information_schema.columns AS a
INNER JOIN information_schema.tables as b ON (b.table_catalog = a.table_catalog AND b.table_schema = a.table_schema AND b.table_name = a.table_name)
WHERE
a.column_name iLIKE '%cep%'
AND b.table_type = 'BASE TABLE'
AND b.table_schema = 'public'
)
, array(SELECT b.table_name::name FROM information_schema.columns AS a
INNER JOIN information_schema.tables as b ON (b.table_catalog = a.table_catalog AND b.table_schema = a.table_schema AND b.table_name = a.table_name)
WHERE
a.column_name iLIKE '%cep%'
AND b.table_type = 'BASE TABLE'
AND b.table_schema = 'public')
);
Here's #Daniel Vérité's function with progress reporting functionality.
It reports progress in three ways:
by RAISE NOTICE;
by decreasing value of supplied {progress_seq} sequence from
{total number of colums to search in} down to 0;
by writing the progress along with found tables into text file,
located in c:\windows\temp\{progress_seq}.txt.
_
CREATE OR REPLACE FUNCTION search_columns(
needle text,
haystack_tables name[] default '{}',
haystack_schema name[] default '{public}',
progress_seq text default NULL
)
RETURNS table(schemaname text, tablename text, columnname text, rowctid text)
AS $$
DECLARE
currenttable text;
columnscount integer;
foundintables text[];
foundincolumns text[];
begin
currenttable='';
columnscount = (SELECT count(1)
FROM information_schema.columns c
JOIN information_schema.tables t ON
(t.table_name=c.table_name AND t.table_schema=c.table_schema)
WHERE (c.table_name=ANY(haystack_tables) OR haystack_tables='{}')
AND c.table_schema=ANY(haystack_schema)
AND t.table_type='BASE TABLE')::integer;
PERFORM setval(progress_seq::regclass, columnscount);
FOR schemaname,tablename,columnname IN
SELECT c.table_schema,c.table_name,c.column_name
FROM information_schema.columns c
JOIN information_schema.tables t ON
(t.table_name=c.table_name AND t.table_schema=c.table_schema)
WHERE (c.table_name=ANY(haystack_tables) OR haystack_tables='{}')
AND c.table_schema=ANY(haystack_schema)
AND t.table_type='BASE TABLE'
LOOP
EXECUTE format('SELECT ctid FROM %I.%I WHERE cast(%I as text)=%L',
schemaname,
tablename,
columnname,
needle
) INTO rowctid;
IF rowctid is not null THEN
RETURN NEXT;
foundintables = foundintables || tablename;
foundincolumns = foundincolumns || columnname;
RAISE NOTICE 'FOUND! %, %, %, %', schemaname,tablename,columnname, rowctid;
END IF;
IF (progress_seq IS NOT NULL) THEN
PERFORM nextval(progress_seq::regclass);
END IF;
IF(currenttable<>tablename) THEN
currenttable=tablename;
IF (progress_seq IS NOT NULL) THEN
RAISE NOTICE 'Columns left to look in: %; looking in table: %', currval(progress_seq::regclass), tablename;
EXECUTE 'COPY (SELECT unnest(string_to_array(''Current table (column ' || columnscount-currval(progress_seq::regclass) || ' of ' || columnscount || '): ' || tablename || '\n\nFound in tables/columns:\n' || COALESCE(
(SELECT string_agg(c1 || '/' || c2, '\n') FROM (SELECT unnest(foundintables) AS c1,unnest(foundincolumns) AS c2) AS t1)
, '') || ''',''\n''))) TO ''c:\WINDOWS\temp\' || progress_seq || '.txt''';
END IF;
END IF;
END LOOP;
END;
$$ language plpgsql;
-- Below function will list all the tables which contain a specific string in the database
select TablesCount(‘StringToSearch’);
--Iterates through all the tables in the database
CREATE OR REPLACE FUNCTION **TablesCount**(_searchText TEXT)
RETURNS text AS
$$ -- here start procedural part
DECLARE _tname text;
DECLARE cnt int;
BEGIN
FOR _tname IN SELECT table_name FROM information_schema.tables where table_schema='public' and table_type='BASE TABLE' LOOP
cnt= getMatchingCount(_tname,Columnames(_tname,_searchText));
RAISE NOTICE 'Count% ', CONCAT(' ',cnt,' Table name: ', _tname);
END LOOP;
RETURN _tname;
END;
$$ -- here finish procedural part
LANGUAGE plpgsql; -- language specification
-- Returns the count of tables for which the condition is met.
-- For example, if the intended text exists in any of the fields of the table,
-- then the count will be greater than 0. We can find the notifications
-- in the Messages section of the result viewer in postgres database.
CREATE OR REPLACE FUNCTION **getMatchingCount**(_tname TEXT, _clause TEXT)
RETURNS int AS
$$
Declare outpt text;
BEGIN
EXECUTE 'Select Count(*) from '||_tname||' where '|| _clause
INTO outpt;
RETURN outpt;
END;
$$ LANGUAGE plpgsql;
--Get the fields of each table. Builds the where clause with all columns of a table.
CREATE OR REPLACE FUNCTION **Columnames**(_tname text,st text)
RETURNS text AS
$$ -- here start procedural part
DECLARE
_name text;
_helper text;
BEGIN
FOR _name IN SELECT column_name FROM information_schema.Columns WHERE table_name =_tname LOOP
_name=CONCAT('CAST(',_name,' as VarChar)',' like ','''%',st,'%''', ' OR ');
_helper= CONCAT(_helper,_name,' ');
END LOOP;
RETURN CONCAT(_helper, ' 1=2');
END;
$$ -- here finish procedural part
LANGUAGE plpgsql; -- language specification

Not able to execute query in function using input parameter?

I have this function, which for some reason is causing an issue with thw input parameter i pass to its execute statement.
DROP FUNCTION IF EXISTS ingoing_outgoing_reference_intigrity_breach(altered_table_name text, entity_ids bigint[]);
DROP TYPE IF EXISTS detection_intgrity_result;
CREATE or REPLACE AGGREGATE range_merge(anyrange)
(
sfunc = range_merge,
stype = anyrange
);
create type detection_intgrity_result
as (source_entity_id bigint,
source_valid tsrange,
causes_intigrity_breach boolean,
depending_entity_internal_name text,
depending_entity_id bigint,
depending_valid tsrange
);
CREATE OR REPLACE FUNCTION ingoing_outgoing_reference_intigrity_breach(altered_table_name text, entity_ids bigint[])
RETURNS setof detection_intgrity_result
AS $$
DECLARE
i record;
result_row detection_intgrity_result;
BEGIN
FOR i IN (
SELECT
tc.table_name, kcu.column_name,
ccu.table_name AS foreign_table_name
FROM
information_schema.table_constraints AS tc
JOIN
information_schema.key_column_usage AS kcu
ON
tc.constraint_name = kcu.constraint_name
JOIN
information_schema.constraint_column_usage AS ccu
ON
ccu.constraint_name = tc.constraint_name
WHERE
constraint_type = 'FOREIGN KEY' and
ccu.column_name != 'id' and
kcu.column_name != 'entity_id' and
ccu.table_name = altered_table_name OR tc.table_name = altered_table_name
and ccu.table_name != tc.table_name) LOOP
EXECUTE format('SELECT '||i.table_name||'.entity_id, '||i.table_name||'.'||i.column_name||',
range_merge('||i.foreign_table_name||'_registration.valid) #> range_merge('||i.table_name||'.valid) as lifespan,
range_merge('||i.foreign_table_name||'_registration.valid) <# range_merge('||i.table_name||'.valid) as reverse_lifespan
from '||i.foreign_table_name||'_registration, '||i.table_name||'
where '||i.table_name||'.'||i.column_name||' in $2
and '||i.table_name||'_registration.registration #> now()::timestamp
GROUP BY '||i.table_name||'.'||i.column_name||', '||i.table_name||'_registration.entity_id') using entity_ids;
END LOOP;
END
$$ LANGUAGE plpgsql;
select * from ingoing_outgoing_reference_intigrity_breach('country', '{1,12,1}')
Am i not able to pass input parameters to a execute statement?
The function look up tables which have ingoing and outgoing refereneces and ensure that data between those two is still valid.
There are three problems with that EXECUTE statement:
You use parameter $2, which corresponds to the second argument, but there is only a single argument.
You are trying to pass an array as IN list, which won't work. Use the equivalent, but correct
WHERE colname = ANY ($1)
You construct an SQL string with ||, which makes the code vulnerable to SQL injection. Use the format function with the %I pattern to make the code safe and more readable.

PostgreSQL left join with alias in a loop

Is it possible to iterate over a table's records and make a left join with them in a stored procedure?
Something like this:
FOR r IN SELECT tablename FROM tablewithtablenames ORDER BY tablename ASC
LOOP
INSERT INTO temp_Results
SELECT
temp_ids.Key as Key,
loggedvalue.pk_timestamp,
FROM
(temp_idS AS temp_ids
LEFT JOIN
quote_ident(r.tablename) AS loggedvalue
ON temp_ids.Key = loggedvalue.pk_fk_id);
END LOOP;
Unfortunately i get the following error message when i want to execute the stored procedure. (Function creation was successful.)
Error message:
ERROR: column loggedvalue.pk_fk_id does not exist LINE 29:
ON temp_ids.Key = "loggedvalue...
I have the feeling that i convert the record in a wrong way maybe because when i manually replaced the quote_ident(r.tablename) to the name of the table that i know the r contains it was fine, also i traced out the r.tablename in the loop and it was correct also.
As a_horse_with_no_name pointed out i should have use dynamic sql because in plpgsql you can not use a variable as a table name so i eliminated the loop and i used a union all:
CREATE OR REPLACE FUNCTION getaffectedtables(
OUT tableNames TEXT)
as $$
BEGIN
SELECT TRIM(TRAILING ' UNION ALL ' FROM string_agg('','SELECT * FROM "' || "tablename" || '" UNION ALL '))
INTO tableNames
FROM exampleTable;
END;$$
LANGUAGE plpgsql;
Then i used dynamic execute:
DECLARE
affectednames TEXT;
BEGIN
affectednames := getaffectedtables();
EXECUTE '
SELECT
temp_ids.Key as Key,
loggedvalue.pk_timestamp,
FROM
(temp_idS AS temp_ids
LEFT JOIN
('|| affectednames ||') AS loggedvalue
ON temp_ids.Key = loggedvalue.pk_fk_id);';

EXECUTE...INTO...USING statement in PL/pgSQL can't execute into a record?

I'm attempting to write an area of a function in PL/pgSQL that loops through an hstore and sets a record's column(the key of the hstore) to a specific value (the value of the hstore). I'm using Postgres 9.1.
The hstore will look like: ' "column1"=>"value1","column2"=>"value2" '
Generally, here is what I want from a function that takes in an hstore and has a record with values to modify:
FOR my_key, my_value IN
SELECT key,
value
FROM EACH( in_hstore )
LOOP
EXECUTE 'SELECT $1'
INTO my_row.my_key
USING my_value;
END LOOP;
The error which I am getting with this code:
"myrow" has no field "my_key". I've been searching for quite a while now for a solution, but everything else I've tried to achieve the same result hasn't worked.
Simpler alternative to your posted answer. Should perform much better.
This function retrieves a row from a given table (in_table_name) and primary key value (in_row_pk), and inserts it as new row into the same table, with some values replaced (in_override_values). The new primary key value as per default is returned (pk_new).
CREATE OR REPLACE FUNCTION f_clone_row(in_table_name regclass
, in_row_pk int
, in_override_values hstore
, OUT pk_new int)
LANGUAGE plpgsql AS
$func$
DECLARE
_pk text; -- name of PK column
_cols text; -- list of names of other columns
BEGIN
-- Get name of PK column
SELECT INTO _pk a.attname
FROM pg_catalog.pg_index i
JOIN pg_catalog.pg_attribute a ON a.attrelid = i.indrelid
AND a.attnum = i.indkey[0] -- single PK col!
WHERE i.indrelid = in_table_name
AND i.indisprimary;
-- Get list of columns excluding PK column
SELECT INTO _cols string_agg(quote_ident(attname), ',')
FROM pg_catalog.pg_attribute
WHERE attrelid = in_table_name -- regclass used as OID
AND attnum > 0 -- exclude system columns
AND attisdropped = FALSE -- exclude dropped columns
AND attname <> _pk; -- exclude PK column
-- INSERT cloned row with override values, returning new PK
EXECUTE format('
INSERT INTO %1$I (%2$s)
SELECT %2$s
FROM (SELECT (t #= $1).* FROM %1$I t WHERE %3$I = $2) x
RETURNING %3$I'
, in_table_name, _cols, _pk)
USING in_override_values, in_row_pk -- use override values directly
INTO pk_new; -- return new pk directly
END
$func$;
Call:
SELECT f_clone_row('tbl', 1, '"col1"=>"foo_new","col2"=>"bar_new"');
db<>fiddle here
Old sqlfiddle
Use regclass as input parameter type, so only valid table names can be used to begin with and SQL injection is ruled out. The function also fails earlier and more gracefully if you should provide an illegal table name.
Use an OUT parameter (pk_new) to simplify the syntax.
No need to figure out the next value for the primary key manually. It is inserted automatically and returned after the fact. That's not only simpler and faster, you also avoid wasted or out-of-order sequence numbers.
Use format() to simplify the assembly of the dynamic query string and make it less error-prone. Note how I use positional parameters for identifiers and unquoted strings respectively.
I build on your implicit assumption that allowed tables have a single primary key column of type integer with a column default. Typically serial columns.
Key element of the function is the final INSERT:
Merge override values with the existing row using the #= operator in a subselect and decompose the resulting row immediately.
Then you can select only relevant columns in the main SELECT.
Let Postgres assign the default value for the PK and get it back with the RETURNING clause.
Write the returned value into the OUT parameter directly.
All done in a single SQL command, that is generally fastest.
Since I didn't want to have to use any external functions for speed purposes, I created a solution using hstores to insert a record into a table:
CREATE OR REPLACE FUNCTION fn_clone_row(in_table_name character varying, in_row_pk integer, in_override_values hstore)
RETURNS integer
LANGUAGE plpgsql
AS $function$
DECLARE
my_table_pk_col_name varchar;
my_key text;
my_value text;
my_row record;
my_pk_default text;
my_pk_new integer;
my_pk_new_text text;
my_row_hstore hstore;
my_row_keys text[];
my_row_keys_list text;
my_row_values text[];
my_row_values_list text;
BEGIN
-- Get the next value of the pk column for the table.
SELECT ad.adsrc,
at.attname
INTO my_pk_default,
my_table_pk_col_name
FROM pg_attrdef ad
JOIN pg_attribute at
ON at.attnum = ad.adnum
AND at.attrelid = ad.adrelid
JOIN pg_class c
ON c.oid = at.attrelid
JOIN pg_constraint cn
ON cn.conrelid = c.oid
AND cn.contype = 'p'
AND cn.conkey[1] = at.attnum
JOIN pg_namespace n
ON n.oid = c.relnamespace
WHERE c.relname = in_table_name
AND n.nspname = 'public';
-- Get the next value of the pk in a local variable
EXECUTE ' SELECT ' || my_pk_default
INTO my_pk_new;
-- Set the integer value back to text for the hstore
my_pk_new_text := my_pk_new::text;
-- Add the next value statement to the hstore of changes to make.
in_override_values := in_override_values || hstore( my_table_pk_col_name, my_pk_new_text );
-- Copy over only the given row to the record.
EXECUTE ' SELECT * '
' FROM ' || quote_ident( in_table_name ) ||
' WHERE ' || quote_ident( my_table_pk_col_name ) ||
' = ' || quote_nullable( in_row_pk )
INTO my_row;
-- Replace the values that need to be changed in the column name array
my_row := my_row #= in_override_values;
-- Create an hstore of my record
my_row_hstore := hstore( my_row );
-- Create a string of comma-delimited, quote-enclosed column names
my_row_keys := akeys( my_row_hstore );
SELECT array_to_string( array_agg( quote_ident( x.colname ) ), ',' )
INTO my_row_keys_list
FROM ( SELECT unnest( my_row_keys ) AS colname ) x;
-- Create a string of comma-delimited, quote-enclosed column values
my_row_values := avals( my_row_hstore );
SELECT array_to_string( array_agg( quote_nullable( x.value ) ), ',' )
INTO my_row_values_list
FROM ( SELECT unnest( my_row_values ) AS value ) x;
-- Insert the values into the columns of a new row
EXECUTE 'INSERT INTO ' || in_table_name || '(' || my_row_keys_list || ')'
' VALUES (' || my_row_values_list || ')';
RETURN my_pk_new;
END
$function$;
It's quite a bit longer than what I had envisioned, but it works and is actually quite speedy.

Loop on tables with PL/pgSQL in Postgres 9.0+

I want to loop through all my tables to count rows in each of them. The following query gets me an error:
DO $$
DECLARE
tables CURSOR FOR
SELECT tablename FROM pg_tables
WHERE tablename NOT LIKE 'pg_%'
ORDER BY tablename;
tablename varchar(100);
nbRow int;
BEGIN
FOR tablename IN tables LOOP
EXECUTE 'SELECT count(*) FROM ' || tablename INTO nbRow;
-- Do something with nbRow
END LOOP;
END$$;
Errors:
ERROR: syntax error at or near ")"
LINE 1: SELECT count(*) FROM (sql_features)
^
QUERY: SELECT count(*) FROM (sql_features)
CONTEXT: PL/pgSQL function inline_code_block line 8 at EXECUTE statement
sql_features is a table's name in my DB. I already tried to use quote_ident() but to no avail.
I can't remember the last time I actually needed to use an explicit cursor for looping in PL/pgSQL.
Use the implicit cursor of a FOR loop, that's much cleaner:
DO
$$
DECLARE
rec record;
nbrow bigint;
BEGIN
FOR rec IN
SELECT *
FROM pg_tables
WHERE tablename NOT LIKE 'pg\_%'
ORDER BY tablename
LOOP
EXECUTE 'SELECT count(*) FROM '
|| quote_ident(rec.schemaname) || '.'
|| quote_ident(rec.tablename)
INTO nbrow;
-- Do something with nbrow
END LOOP;
END
$$;
You need to include the schema name to make this work for all schemas (including those not in your search_path).
Also, you actually need to use quote_ident() or format() with %I or a regclass variable to safeguard against SQL injection. A table name can be almost anything inside double quotes. See:
Table name as a PostgreSQL function parameter
Minor detail: escape the underscore (_) in the LIKE pattern to make it a literal underscore: tablename NOT LIKE 'pg\_%'
How I might do it:
DO
$$
DECLARE
tbl regclass;
nbrow bigint;
BEGIN
FOR tbl IN
SELECT c.oid
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
AND n.nspname NOT LIKE 'pg\_%' -- system schema(s)
AND n.nspname <> 'information_schema' -- information schema
ORDER BY n.nspname, c.relname
LOOP
EXECUTE 'SELECT count(*) FROM ' || tbl INTO nbrow;
-- raise notice '%: % rows', tbl, nbrow;
END LOOP;
END
$$;
Query pg_catalog.pg_class instead of tablename, it provides the OID of the table.
The object identifier type regclass is handy to simplify. n particular, table names are double-quoted and schema-qualified where necessary automatically (also prevents SQL injection).
This query also excludes temporary tables (temp schema is named pg_temp% internally).
To only include tables from a given schema:
AND n.nspname = 'public' -- schema name here, case-sensitive
The cursor returns a record, not a scalar value, so "tablename" is not a string variable.
The concatenation turns the record into a string that looks like this (sql_features). If you had selected e.g. the schemaname with the tablename, the text representation of the record would have been (public,sql_features).
So you need to access the column inside the record to create your SQL statement:
DO $$
DECLARE
tables CURSOR FOR
SELECT tablename
FROM pg_tables
WHERE tablename NOT LIKE 'pg_%'
ORDER BY tablename;
nbRow int;
BEGIN
FOR table_record IN tables LOOP
EXECUTE 'SELECT count(*) FROM ' || table_record.tablename INTO nbRow;
-- Do something with nbRow
END LOOP;
END$$;
You might want to use WHERE schemaname = 'public' instead of not like 'pg_%' to exclude the Postgres system tables.