ERROR: column "int4" specified more than once - postgresql

Steps for Execution:
Table Creation
CREATE TABLE xyz.table_a(
id bigint NOT NULL,
scores jsonb,
CONSTRAINT table_a_pkey PRIMARY KEY (id)
);
Add some dummy data:
INSERT INTO xyz.table_a(
id, scores)
VALUES (1, '{"a":20,"b":20}');
Function Creation
CREATE OR REPLACE FUNCTION xyz.example(
table_name text,
regular_columns text,
json_column text,
view_name text
) RETURNS text
LANGUAGE 'plpgsql'
COST 100
VOLATILE
AS $BODY$
DECLARE
cols TEXT;
cols_sum TEXT;
BEGIN
EXECUTE
format(
$ex$SELECT string_agg(
format(
'CAST(%2$s->>%%1$L AS INTEGER)',
key),
', '
)
FROM (SELECT DISTINCT key
FROM %1$s, jsonb_each(%2$s)
ORDER BY 1
) s;$ex$,
table_name, json_column
)
INTO cols;
EXECUTE
format(
$ex$SELECT string_agg(
format(
'CAST(%2$s->>%%1$L AS INTEGER)',
key
),
'+'
)
FROM (SELECT DISTINCT key
FROM %1$s, jsonb_each(%2$s)
ORDER BY 1) s;$ex$,
table_name, json_column
)
INTO cols_sum;
EXECUTE
format(
$ex$DROP VIEW IF EXISTS %2$s;
CREATE VIEW %2$s AS
SELECT %3$s, %4$s, SUM(%5$s) AS total
FROM %1$s
GROUP BY %3$s$ex$,
table_name, view_name, regular_columns, cols, cols_sum
);
RETURN cols;
END
$BODY$;
Call Function
SELECT xyz.example(
'xyz.table_a',
' id',
'scores',
'xyz.view_table_a'
);
When I run these steps, I get this error:
ERROR: column "int4" specified more than once
CONTEXT: SQL statement "
DROP VIEW IF EXISTS xyz.view_table_a;
CREATE VIEW xyz.view_table_a AS
SELECT id, CAST(scores->>'a' AS INTEGER), CAST(scores->>'b' AS INTEGER), SUM(CAST(scores->>'a' AS INTEGER)+CAST(scores->>'b' AS INTEGER)) AS total FROM xyz.table_a GROUP BY id

Look at the error message closely:
...
SELECT id, CAST(scores->>'a' AS INTEGER), CAST(scores->>'b' AS INTEGER),
...
There are multiple expressions without column alias. A named column like "id" defaults to the given name. But other expressions default to the internal type name, which is "int4" for integer. One might assume that the JSON key name is used, but that's not so. CAST(scores->>'a' AS INTEGER) is just another expression returning an unnamed integer value.
This still works for a plain SELECT: Postgres tolerates duplicate column names in the (outer) SELECT list. But a VIEW cannot be created that way, because its column names must be unique. Duplicates would result in ambiguities.
Either add column aliases to expressions in the SELECT list:
SELECT id, CAST(scores->>'a' AS INTEGER) AS a, CAST(scores->>'b' AS INTEGER) AS b, ...
Or add a list of column names to CREATE VIEW:
CREATE VIEW xyz.view_table_a(id, a, b, ...) AS ...
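To see the difference in isolation, here is a minimal sketch that assumes the xyz.table_a table from the question already exists. The first statement is the plain SELECT that Postgres tolerates; the second is the view definition with explicit aliases added by hand (the alias names a and b simply mirror the JSON keys in the sample row).

```sql
-- Plain SELECT: both expressions get the default name "int4", which is tolerated here.
SELECT id, CAST(scores->>'a' AS INTEGER), CAST(scores->>'b' AS INTEGER)
FROM   xyz.table_a;

-- View: every output column needs a unique name, so each expression gets an alias.
DROP VIEW IF EXISTS xyz.view_table_a;
CREATE VIEW xyz.view_table_a AS
SELECT id
     , CAST(scores->>'a' AS INTEGER) AS a
     , CAST(scores->>'b' AS INTEGER) AS b
     , SUM(CAST(scores->>'a' AS INTEGER) + CAST(scores->>'b' AS INTEGER)) AS total
FROM   xyz.table_a
GROUP  BY id;
```

Grouping by id alone is enough here because id is the primary key of xyz.table_a.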
Something like this should fix your function (preserving the literal spelling of JSON key names):
...
format(
'CAST(%2$s->>%%1$L AS INTEGER) AS %%1$I',
key),
...
See the working demo: db<>fiddle
Aside, your nested format() calls make the code pretty hard to read and maintain.
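One possible way to reduce the nesting, offered as a sketch rather than a drop-in replacement: build the inner format() template once, outside the dynamic statement, and hand it in as a parameter with USING. The DO block and its variable values are assumptions that mirror the call in the question.

```sql
DO $do$
DECLARE
   table_name  text := 'xyz.table_a';  -- assumed, as in the question's call
   json_column text := 'scores';       -- assumed, as in the question's call
   cols        text;
BEGIN
   EXECUTE format(
      -- $1 is the per-key template supplied via USING below
      $q$SELECT string_agg(format($1, key), ', ' ORDER BY key)
         FROM  (SELECT DISTINCT key FROM %s, jsonb_each(%I)) s$q$,
      table_name, json_column)
   INTO  cols
   USING format('CAST(%I->>%%1$L AS INTEGER) AS %%1$I', json_column);

   RAISE NOTICE '%', cols;  -- shows the generated SELECT list items
END
$do$;
```

The doubled %% is only needed once here, because only one format() call has to pass the placeholders through to the next level.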

Related

PostgreSQL -- JOIN UNNEST output with CTE INSERT ID -- INSERT many to many

In a PostgreSQL function, is it possible to join the result of UNNEST, which is an integer array from function input, with an ID returned from a CTE INSERT?
I have PostgreSQL tables like:
CREATE TABLE public.message (
id SERIAL PRIMARY KEY,
content TEXT
);
CREATE TABLE public.message_tag (
id SERIAL PRIMARY KEY,
message_id INTEGER NOT NULL CONSTRAINT message_tag_message_id_fkey REFERENCES public.message(id) ON DELETE CASCADE,
tag_id INTEGER NOT NULL CONSTRAINT message_tag_tag_id_fkey REFERENCES public.tag(id) ON DELETE CASCADE
);
I want to create a PostgreSQL function which takes input of content and an array of tag_id. This is for graphile. I want to do it all in one function, so I get a mutation.
Here's what I got so far. I don't know how to join an UNNEST across an id returned from a CTE.
CREATE FUNCTION public.create_message(content text, tags Int[])
RETURNS public.message
AS $$
-- insert to get primary key of message, for many to many message_id
WITH moved_rows AS (
INSERT INTO public.message (content)
RETURNING *;
)
-- many to many relation
INSERT INTO public.message_tag
SELECT moved_rows.id as message_id, tagInput.tag_id FROM moved_rows, UNNEST(tags) as tagInput;
RETURNING *
$$ LANGUAGE sql VOLATILE STRICT;
You're not that far from your goal:
the semicolon placement in the CTE is wrong
the first INSERT statement lacks a SELECT or VALUES clause to specify what should be inserted
the INSERT into message_tag should specify the columns in which to insert (especially if you have that unnecessary serial id)
you specified a relation alias for the UNNEST call already, but none for the column tag_id
your function was RETURNING a set of message_tag rows but was specified to return a single message row
To fix these:
CREATE FUNCTION public.create_message(content text, tags Int[])
RETURNS public.message
AS $$
-- insert to get primary key of message, for many to many message_id
WITH moved_rows AS (
INSERT INTO public.message (content)
VALUES ($1)
RETURNING *
),
-- many to many relation
_ AS (
INSERT INTO public.message_tag (message_id, tag_id)
SELECT moved_rows.id, tagInput.tag_id
FROM moved_rows, UNNEST($2) as tagInput(tag_id)
)
TABLE moved_rows;
$$ LANGUAGE sql VOLATILE STRICT;
(Online demo)
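A call might look like the following sketch. It assumes that public.tag (referenced by the foreign key but not shown in the question) already contains rows with id 1 and 2.

```sql
-- Inserts one message plus two message_tag rows, then returns the new message row.
SELECT * FROM public.create_message('hello world', ARRAY[1, 2]);

-- An empty array still inserts the message, just without any tags.
SELECT * FROM public.create_message('no tags', '{}');
```

Because the function is declared STRICT, passing NULL for either argument returns NULL without inserting anything.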

Make duplicate row in Postgresql

I am writing a migration script to migrate a database. I have to duplicate a row while incrementing its primary key, and the table can have any number of different columns depending on the database, so I can't list each and every column in the query. If I simply copy the row, I get a duplicate key error.
Query: INSERT INTO table_name SELECT * FROM table_name WHERE id=255;
ERROR: duplicate key value violates unique constraint "table_name_pkey"
DETAIL: Key (id)=(255) already exists.
It's good that I don't have to mention all the column names here, since * selects every column. But that is also what causes the duplicate key error.
What's the solution to this problem? Any help would be appreciated. Thanks in advance.
If you are willing to type all column names, you may write
INSERT INTO table_name (
pri_key
,col2
,col3
)
SELECT (
SELECT MAX(pri_key) + 1
FROM table_name
)
,col2
,col3
FROM table_name
WHERE pri_key = 255;
Another option (without typing all the columns, as long as you know the primary key) is to create a temp table, update it, and re-insert, all within one transaction.
BEGIN;
CREATE TEMP TABLE temp_tab ON COMMIT DROP AS SELECT * FROM table_name WHERE pri_key_col = 255;
UPDATE temp_tab SET pri_key_col = ( select MAX(pri_key_col) + 1 FROM table_name );
INSERT INTO table_name select * FROM temp_tab;
COMMIT;
This is just a DO block but you could create a function that takes things like the table name etc as parameters.
Setup:
CREATE TABLE public.t1 (a TEXT, b TEXT, c TEXT, id SERIAL PRIMARY KEY, e TEXT, f TEXT);
INSERT INTO public.t1 (e) VALUES ('x'), ('y'), ('z');
Code to duplicate values without the primary key column:
DO $$
DECLARE
_table_schema TEXT := 'public';
_table_name TEXT := 't1';
_pk_column_name TEXT := 'id';
_columns TEXT;
BEGIN
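-- quote each column name and keep the table's column order
SELECT STRING_AGG(quote_ident(column_name::text), ',' ORDER BY ordinal_position)
INTO _columns
FROM information_schema.columns
WHERE table_name = _table_name
AND table_schema = _table_schema
AND column_name <> _pk_column_name;
-- %I quotes schema and table name; the column list is already quoted above
EXECUTE FORMAT('INSERT INTO %1$I.%2$I (%3$s) SELECT %3$s FROM %1$I.%2$I', _table_schema, _table_name, _columns);
END $$;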
The query it creates and runs is: INSERT INTO public.t1 (a,b,c,e,f) SELECT a,b,c,e,f FROM public.t1. It selects all the columns apart from the PK one, so the PK default fills in a new value. You could put this code in a function and use it for any table you wanted, or just use it like this and edit it for whatever table.
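As a rough sketch of the function idea, the version below duplicates a single row. The name duplicate_row and its parameters are made up for illustration. It reads column names from pg_attribute instead of information_schema (a swapped-in technique, so that a regclass parameter can identify the table), and it assumes the primary key column has a default such as SERIAL and that the table has no generated columns.

```sql
CREATE OR REPLACE FUNCTION duplicate_row(_tbl regclass, _pk_column text, _pk_value bigint)
  RETURNS void
  LANGUAGE plpgsql AS
$func$
DECLARE
   _columns text;
BEGIN
   -- All live user columns except the primary key, in table order
   SELECT string_agg(quote_ident(attname), ', ' ORDER BY attnum)
   INTO   _columns
   FROM   pg_attribute
   WHERE  attrelid = _tbl
   AND    attnum > 0
   AND    NOT attisdropped
   AND    attname <> _pk_column;

   -- regclass is rendered schema-qualified and quoted where needed
   EXECUTE format('INSERT INTO %1$s (%2$s) SELECT %2$s FROM %1$s WHERE %3$I = $1',
                  _tbl, _columns, _pk_column)
   USING _pk_value;
END
$func$;

-- Usage with the setup above: duplicate the row with id = 2
SELECT duplicate_row('public.t1', 'id', 2);
```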

How to stop the "insert or update on table ...violates foreign key constraint"?

How can I construct an INSERT statement so that it does not raise the error "insert or update on table ... violates foreign key constraint" when the foreign key value does not exist in the referenced table?
In that case I just need no record to be created and a success response.
Thank you
Use a query as the source for the INSERT statement:
insert into the_table (id, some_data, some_fk_column)
select *
from (
values (42, 'foobar', 100)
) as x(id, some_data, some_fk_column)
where exists (select *
from referenced_table rt
where rt.primary_key_column = x.some_fk_column);
This can also be extended to a multi-row insert:
insert into the_table (id, some_data, some_fk_column)
select *
from (
values
(42, 'foobar', 100),
(24, 'barfoo', 101)
) as x(id, some_data, some_fk_column)
where exists (select *
from referenced_table rt
where rt.primary_key_column = x.some_fk_column);
You didn't show us your table definitions so I had to make up the table and column names. You will have to translate that to your names.
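For a self-contained trial run, here is a small sketch using the made-up names from above. The table definitions and column types are assumptions for illustration only.

```sql
CREATE TABLE referenced_table (primary_key_column int PRIMARY KEY);
CREATE TABLE the_table (
   id             int PRIMARY KEY,
   some_data      text,
   some_fk_column int REFERENCES referenced_table (primary_key_column)
);
INSERT INTO referenced_table VALUES (100);  -- 101 is deliberately missing

INSERT INTO the_table (id, some_data, some_fk_column)
SELECT *
FROM  (VALUES (42, 'foobar', 100)
            , (24, 'barfoo', 101)) AS x(id, some_data, some_fk_column)
WHERE  EXISTS (SELECT 1
               FROM   referenced_table rt
               WHERE  rt.primary_key_column = x.some_fk_column);

SELECT id FROM the_table;  -- only the row with id 42 was inserted
```

Note that a concurrent delete in referenced_table between the check and the insert could still raise the foreign key error, so this reduces the problem rather than ruling it out under concurrent writes.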
You could create a function with plpgsql, which inserts a row and catches the exception:
CREATE FUNCTION customInsert(int,varchar) RETURNS VOID
AS $$
BEGIN
INSERT INTO foo VALUES ($1,$2);
EXCEPTION
WHEN foreign_key_violation THEN
NULL; -- do nothing
END;
$$ LANGUAGE plpgsql;
You can then call this function by this:
SELECT customInsert(1,'hello');
This function tries to insert the given parameters into the table foo and catches the foreign_key_violation error if it occurs.
Of course you can generalise the function more, to be able to insert in more than one table, but your question sounded like this was only needed for one specific table.
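If the caller needs to know whether a row was actually written, a variant along these lines might help. The tables parent and foo and the function name custom_insert_checked are hypothetical, made up only so the sketch can run on its own.

```sql
-- Hypothetical tables for illustration only
CREATE TABLE parent (id int PRIMARY KEY);
CREATE TABLE foo (parent_id int REFERENCES parent (id), label varchar);
INSERT INTO parent VALUES (1);

CREATE FUNCTION custom_insert_checked(int, varchar) RETURNS boolean
AS $$
BEGIN
   INSERT INTO foo VALUES ($1, $2);
   RETURN true;          -- row inserted
EXCEPTION
   WHEN foreign_key_violation THEN
      RETURN false;      -- referenced key missing, nothing inserted
END;
$$ LANGUAGE plpgsql;

SELECT custom_insert_checked(1, 'hello');  -- true
SELECT custom_insert_checked(2, 'world');  -- false, no error raised
```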

Postgres find all rows in database tables matching criteria on a given column

I am trying to write sub-queries so that I search all tables for a column named id and since there are multiple tables with id column, I want to add the condition, so that id = 3119093.
My attempt was:
Select *
from information_schema.tables
where id = '3119093' and id IN (
Select table_name
from information_schema.columns
where column_name = 'id' );
This didn't work so I tried:
Select *
from information_schema.tables
where table_name IN (
Select table_name
from information_schema.columns
where column_name = 'id' and 'id' IN (
Select * from table_name where 'id' = 3119093));
This isn't the right way either. Any help would be appreciated. Thanks!
A harder attempt is:
CREATE OR REPLACE FUNCTION search_columns(
needle text,
haystack_tables name[] default '{}',
haystack_schema name[] default '{public}'
)
RETURNS table(schemaname text, tablename text, columnname text, rowctid text)
AS $$
begin
FOR schemaname,tablename,columnname IN
SELECT c.table_schema,c.table_name,c.column_name
FROM information_schema.columns c
JOIN information_schema.tables t ON
(t.table_name=c.table_name AND t.table_schema=c.table_schema)
WHERE (c.table_name=ANY(haystack_tables) OR haystack_tables='{}')
AND c.table_schema=ANY(haystack_schema)
AND t.table_type='BASE TABLE'
--AND c.column_name = "id"
LOOP
EXECUTE format('SELECT ctid FROM %I.%I WHERE cast(%I as text) like %L',
schemaname,
tablename,
columnname,
needle
) INTO rowctid;
IF rowctid is not null THEN
RETURN NEXT;
END IF;
END LOOP;
END;
$$ language plpgsql;
select * from search_columns('%3119093%'::varchar,'{}'::name[]) ;
The only problem is this code displays the table name and column name. I have to then manually enter
Select * from table_name where id = 3119093
where I got the table name from the code above.
I want to automatically implement returning rows from a table but I don't know how to get the table name automatically.
I took the time to make it work for you.
For starters, some information on what is going on inside the code.
Explanation
1. The function takes two input arguments: column name and column value.
2. It requires a created type, of which it returns a set.
3. The first loop identifies tables that have a column with the name given as the input argument.
4. It then forms a query that aggregates all rows matching the input condition inside every table taken from step 3, with the comparison based on ILIKE, as per your example.
5. The function goes into the second loop only if there is at least one row in the currently visited table that matches the specified condition (then the array is not null).
6. The second loop unnests the array of matching rows and puts every element into the function output with a RETURN NEXT rec clause.
Notes
Searching with LIKE is inefficient. I suggest adding another input argument, "column type", and using it to restrict the lookup by adding a join to the pg_catalog.pg_type table.
The second loop is there so that if more than 1 row is found for a particular table, then every row gets returned.
If you are looking for something else, like you need key-value pairs, not just the values, then you need to extend the function. You could for example build json format from rows.
Now, to the code.
Test case
CREATE TABLE tbl1 (col1 int, id int); -- does contain values
CREATE TABLE tbl2 (col1 int, col2 int); -- doesn't contain column "id"
CREATE TABLE tbl3 (id int, col5 int); -- doesn't contain values
INSERT INTO tbl1 (col1, id)
VALUES (1, 5), (1, 33), (1, 25);
Table stores data:
postgres=# select * From tbl1;
col1 | id
------+----
1 | 5
1 | 33
1 | 25
(3 rows)
Creating type
CREATE TYPE sometype AS ( schemaname text, tablename text, colname text, entirerow text );
Function code
CREATE OR REPLACE FUNCTION search_tables_for_column (
v_column_name text
, v_column_value text
)
RETURNS SETOF sometype
LANGUAGE plpgsql
STABLE
AS
$$
DECLARE
rec sometype%rowtype;
v_row_array text[];
rec2 record;
arr_el text;
BEGIN
FOR rec IN
SELECT
nam.nspname AS schemaname
, cls.relname AS tablename
, att.attname AS colname
, null::text AS entirerow
FROM
pg_attribute att
JOIN pg_class cls ON att.attrelid = cls.oid
JOIN pg_namespace nam ON cls.relnamespace = nam.oid
WHERE
cls.relkind = 'r'
AND att.attname = v_column_name
LOOP
EXECUTE format('SELECT ARRAY_AGG(row(tablename.*)::text) FROM %I.%I AS tablename WHERE %I::text ILIKE %s',
rec.schemaname, rec.tablename, rec.colname, quote_literal(concat('%',v_column_value,'%'))) INTO v_row_array;
IF v_row_array is not null THEN
FOR rec2 IN
SELECT unnest(v_row_array) AS one_row
LOOP
rec.entirerow := rec2.one_row;
RETURN NEXT rec;
END LOOP;
END IF;
END LOOP;
END
$$;
Example call and output
postgres=# select * from search_tables_for_column('id','5');
schemaname | tablename | colname | entirerow
------------+-----------+---------+-----------
public | tbl1 | id | (1,5)
public | tbl1 | id | (1,25)
(2 rows)
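As a sketch of the key-value extension mentioned in the notes, each matching row can be turned into JSON with to_jsonb instead of casting the row to text. The query below runs directly against tbl1 from the test case. Inside the function, the same expression would replace row(tablename.*)::text, which is an untested assumption about how one might extend it.

```sql
-- Each matching row as a jsonb object with column names as keys
SELECT to_jsonb(t) AS entirerow
FROM   tbl1 AS t
WHERE  t.id::text ILIKE '%5%';
```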

Can't drop temp table in Postgres function: "being used by active queries in this session"

It is expected to now take in a table called waypoints and follow through the function body.
drop function if exists everything(waypoints);
create function everything(waypoints) RETURNS TABLE(node int, xy text[]) as $$
BEGIN
drop table if exists bbox;
create temporary table bbox(...);
insert into bbox
select ... from waypoints;
drop table if exists b_spaces;
create temporary table b_spaces(
...
);
insert into b_spaces
select ...
drop table if exists b_graph; -- Line the error flags.
create temporary table b_graph(
...
);
insert into b_graph
select ...
drop table if exists local_green;
create temporary table local_green(
...
);
insert into local_green
...
with aug_temp as (
select ...
)
insert into b_graph(source, target, cost) (
(select ... from aug_temp)
UNION
(select ... from aug_temp)
);
return query
with
results as (
select id1, ... from b_graph -- The relation being complained about.
),
pkg as (
select loc, ...
)
select id1, array_agg(loc)
from pkg
group by id1;
return;
END;
$$ LANGUAGE plpgsql;
This returns the error: cannot DROP TABLE b_graph because it is being used by active queries in this session
How do I go about rectifying this issue?
The error message is rather obvious: you cannot drop a temp table while it is being used.
You might be able to avoid the problem by adding ON COMMIT DROP:
See the related question "Temporary table and loops in a function".
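A minimal sketch of ON COMMIT DROP, outside of any function. The column names source, target and cost come from the question; their types are assumptions.

```sql
BEGIN;

-- The table is removed automatically when the transaction commits,
-- so no explicit DROP TABLE is needed.
CREATE TEMP TABLE b_graph (
   source int,
   target int,
   cost   numeric
) ON COMMIT DROP;

INSERT INTO b_graph VALUES (1, 2, 0.5);
SELECT * FROM b_graph;

COMMIT;
-- b_graph no longer exists at this point
```

Whether this resolves the error in the original function depends on how the function is called, so treat it as something to try rather than a guaranteed fix.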
However, this can probably be simpler. If you don't need all those temp tables to begin with (which I suspect), you can replace them all with CTEs (or most of them probably even with cheaper subqueries) and simplify to one big query. Can be plpgsql or just SQL:
CREATE FUNCTION everything(waypoints)
RETURNS TABLE(node int, xy text[]) AS
$func$
WITH bbox AS (SELECT ... FROM waypoints) -- not the fct. parameter!
, b_spaces AS (SELECT ... )
, b_graph AS (SELECT ... )
, local_green AS (SELECT ... )
, aug_temp AS (SELECT ... )
, b_graph2(source, target, cost) AS (
SELECT ... FROM b_graph
UNION ALL -- guessing you really want UNION ALL
SELECT ... FROM aug_temp
UNION ALL
SELECT ... FROM aug_temp
)
, results AS (SELECT id1, ... FROM b_graph2)
, pkg AS (SELECT loc, ... )
SELECT id1, array_agg(loc)
FROM pkg
GROUP BY id1
$func$ LANGUAGE sql;
Views are just storing a query ("the recipe"), not the actual resulting values ("the soup").
It's typically cheaper to use CTEs instead of creating temp tables.
Derived tables in queries, sorted by their typical overall performance (exceptions for special cases involving indexes). From slow to fast:
CREATE TABLE
CREATE UNLOGGED TABLE
CREATE TEMP TABLE
CTE
subquery
UNION would try to fold duplicate rows. Typically, people really want UNION ALL, which just appends rows. Faster and does not try to remove dupes.
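The difference is easy to check with two constant rows:

```sql
-- UNION folds the duplicate row
SELECT count(*) FROM (SELECT 1 AS x UNION     SELECT 1) s;  -- 1

-- UNION ALL just appends
SELECT count(*) FROM (SELECT 1 AS x UNION ALL SELECT 1) s;  -- 2
```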