Getting A "Could Not Open Relation" Error On Simple Query - postgresql

I have a function that creates a set of INSERT INTO ... VALUES scripts. If I uncomment the dvp.content line, the function fails with an "ERROR: could not open relation with OID ###", which refers to the temp table. The content column is a jsonb type. Not sure where to begin?
CREATE OR REPLACE FUNCTION export_docs_as_sql(doc_list uuid[], to_org_id uuid)
RETURNS table(id integer, sql text)
AS $$
BEGIN
...
-- use a temp table to gather all INSERT statements
CREATE TEMP TABLE IF NOT EXISTS doc_data_export(
id serial PRIMARY KEY,
sql text
);
...
-- get doc_version_pages
INSERT INTO doc_data_export(sql)
SELECT 'INSERT INTO doc_version_pages(id, doc_version_id, persona_id, care_category_id, patient_group_id, title, content, created_at, updated_at, is_guide, is_root) VALUES (' ||
quote_literal(dvp.id::TEXT) || ', ' ||
quote_literal(dvp.doc_version_id::TEXT) || ', ' ||
CASE WHEN p.name IS NOT NULL THEN '(SELECT px.id FROM personas px WHERE px.org_id = ' || quote_literal(dv.id::TEXT) || ' AND px.name = ' || quote_literal(p.name) || '), ' ELSE 'NULL, ' END ||
CASE WHEN c.name IS NOT NULL THEN '(SELECT cx.id FROM care_categories cx WHERE cx.org_id = ' || quote_literal(to_org_id) || ' AND cx.name = ' || quote_literal(c.name) || '), ' ELSE 'NULL, ' END ||
CASE WHEN g.name IS NOT NULL THEN '(SELECT gx.id FROM patient_groups gx WHERE gx.org_id = ' || quote_literal(to_org_id) || ' AND gx.name = ' || quote_literal(g.name) || '), ' ELSE 'NULL, ' END ||
quote_literal(dvp.title::TEXT) || ', ' ||
--dvp.content || ', ' ||
quote_literal(dvp.created_at::TEXT) || ', ' ||
quote_literal(now()::timestamp) || ', ' ||
quote_literal(dvp.is_guide::TEXT) || ', ' ||
quote_literal(dvp.is_root::TEXT) || ');'
FROM unnest(doc_list) l
INNER JOIN doc_versions dv ON l = dv.doc_id
INNER JOIN doc_version_pages dvp ON dv.id = dvp.doc_version_id
LEFT JOIN personas p ON dvp.persona_id = p.id
LEFT JOIN care_categories c ON dvp.care_category_id = c.id
LEFT JOIN patient_groups g ON dvp.patient_group_id = g.id;
...
-- output all inserts
RETURN QUERY SELECT * FROM doc_data_export;
-- drop temp table
DROP TABLE doc_data_export;
END;
$$ LANGUAGE plpgsql;

The "Could Not Open Relation" problem is occurring due to the bug described here, which remains an issue as of Postgres 14.0:
What seems to be happening is that if the strings are large enough to be
toasted, then the data returned out of the function with RETURN QUERY
contains toast pointers referencing the temp table's toast table.
If you drop the temp table then those pointers will fail upon use.
To explain further, when a column value is greater than the TOAST_TUPLE_THRESHOLD configuration parameter (usually 2KB) and cannot be compressed or when the column is configured with a storage parameter of EXTERNAL, the value will be broken down into chunks and stored in a special secondary table called a TOAST table. This table will be stored in the pg_toast schema and will be named like pg_toast.pg_toast_<table OID>.
So when you add dvp.content to the sql statement you insert that into doc_data_export, some of these values are larger than the aforementioned constraints and are thus TOASTed. Your RETURN QUERY is only sending the pointers to the values in the toast table. After the return is done, the temporary table and its corresponding TOAST table is dropped. Thus when the outer query attempts to materialize the results, it can't find the TOAST table that these pointers reference - hence the cryptic error message you see.
You can avoid sending TOAST pointers for the temporary table -and thus safely DROP it after the RETURN QUERY -by performing an operation on the sql column that returns the same value:
RETURN QUERY SELECT id, sql || '' FROM doc_data_export;
The simple function below will reproduce a minimal example of the TOAST bug when you set fail to true and demonstrate the successful workaround when you set fail to false.
DROP FUNCTION IF EXISTS buttered_toast(boolean);
CREATE OR REPLACE FUNCTION buttered_toast(fail boolean)
RETURNS table(id integer, enormous_data text)
AS $$
BEGIN
CREATE TEMPORARY TABLE tbl_with_toasts (
id integer PRIMARY KEY,
enormous_data text
) ON COMMIT DROP;
--generate a giant string that is sure to generate a TOAST table.
INSERT INTO tbl_with_toasts(id,enormous_data) SELECT 1, string_agg(gen_random_uuid()::text,'-') FROM generate_series(1,10000) as ints(int);
IF buttered_toast.fail THEN
-- will return pointers to tbl_with_toast's TOAST table for the "enormous_data" column.
RETURN QUERY SELECT tbl_with_toasts.id, tbl_with_toasts.enormous_data FROM tbl_with_toasts ;
ELSE
-- will generate and return new values for the "enormous_data" column
RETURN QUERY SELECT tbl_with_toasts.id, tbl_with_toasts.enormous_data || '' FROM tbl_with_toasts ;
END IF;
DROP TABLE tbl_with_toasts;
END;
$$ LANGUAGE plpgsql;
-- fails with "Could Not Open Relation"
select * from buttered_toast(true)
--succeeds
select * from buttered_toast(false);

Related

Postgresql: EXECUTE sql_cmd merge with CREATE TEMP TABLE temp_tbl AS SELECT

Soo, here is the thing, I am using in my DB methods 2 approaches:
1.) Is compose and SQL query from various string, depending on what I need to filter out:
sql_cmd := 'SELECT count(*) FROM art_short_term_finished WHERE (entry_time <= ''' || timestamp_up || ''' AND exit_time >= ''' || timestamp_down || ''') AND ' || time_filter || ' AND entry_zone = ' || zone_parameter || ' AND park_uuid = ' || park_id_p || '';
EXECUTE sql_cmd INTO shortterm_counter;
2.) Copy part of the big table, into smaller temp table and work with it:
-- Get the data from FPL into smaller table for processing
DROP TABLE IF EXISTS temp_fpl_filtered;
CREATE TEMP TABLE temp_fpl_filtered AS SELECT car_id FROM flexcore_passing_log fpl WHERE fpl.zone_leaved = '0' AND fpl.status IN (SELECT status_id FROM fpl_ok_statuses) AND fpl.park_uuid = park_id_p AND (fpl.datetime BETWEEN row_i.start_d AND row_i.end_d);
But what If I want to mix those two?
I want to have the SELECT after CREATE TEMP TABLE temp_fpl_filtered AS to have different WHERE clauses depending on input parameters of stored procedure, without having to write the same statement xy times in one stored procedure.
But my approach:
-- art class is shortterm, check shortterm history
IF art_class_p = 1 OR article_p = 0 THEN
-- create temporary table derivated from shortterm history
IF article_p = 0 THEN
article_p_filter := '';
ELSE
article_p_filter := ' AND article_id = ' || article_p;
END IF;
short_cmd := 'SELECT car_id, article_id, entry_time, exit_time FROM art_short_term_finished WHERE zone_leaved = ''0'' AND status IN (SELECT status_id FROM fpl_ok_statuses) ''' || article_p_filter || ''' AND park_uuid = ''' || park_id_p || ''' AND (entry_time <= ''' || timestamp_up || ''' AND exit_time >= ''' || timestamp_down || ''')';
DROP TABLE IF EXISTS temp_short_full;
CREATE TEMP TABLE temp_short_full AS short_cmd;
--EXECUTE sql_cmd INTO shortterm_counter;
END IF;
throws me an exception when I try to inser stored procedure:
psql:report_parking_average.sql:107: ERROR: syntax error at or near "short_cmd"
LINE 50: CREATE TEMP TABLE temp_fpl_filtered AS short_cmd;
^
Also, the another try:
EXECUTE short_cmd INTO TEMP TABLE temp_short_full;
is not working..
You need to include the CREATE TABLE part into the SQL you generate:
short_cmd := 'CREATE TEMP TABLE temp_short_full AS SELECT car_id, article_id, entry_time, exit_time FROM art_short_term_finished WHERE zone_leaved = ''0'' AND status IN (SELECT status_id FROM fpl_ok_statuses) ''' || article_p_filter || ''' AND park_uuid = ''' || park_id_p || ''' AND (entry_time <= ''' || timestamp_up || ''' AND exit_time >= ''' || timestamp_down || ''')';
DROP TABLE IF EXISTS temp_short_full;
execute short_cmd;

Update newly created row value with a trigger

My problem is as follow: I insert or update a row in a postgresql database and need to modify one field in this row. BUT I need to know the new serial PK when I insert a new row to make a SELECT with JOIN on other tables.
I'm now stucked because I've done a AFTER INSERT AND UPDATE trigger to get the new PK (kkw_block_id). I get the value I need with the SELECT but after that I can't modify the value in the row: modifying the NEW.value is not possible with AFTER INSERT AND UPDATE and if I do an UPDATE on the row, I enter in an infinite loop, the trigger beeing called in the trigger...
CREATE TRIGGER tsvectorupdate
AFTER INSERT OR UPDATE
ON kkw_block
FOR EACH ROW
EXECUTE PROCEDURE kkw_search_trigger();
CREATE OR REPLACE FUNCTION kkw_search_trigger()
RETURNS trigger AS
$BODY$
DECLARE vector_en TEXT;
DECLARE vector_fr TEXT;
DECLARE vector_de TEXT;
BEGIN
-- I need the new serial PK(kkw_id) in the following section.
SELECT coalesce(modell_en, '') || ', ' || coalesce(bezeichnung_en,'') || ', ' || coalesce(kkw.kkw_name_en,'') || ', ' || coalesce(kkw_typ.typ_abr,'') || ', ' || coalesce(kkw_typ.typ_desc_en,'') || ', ' || coalesce(kkw_typ.typ_desc_short_en,'') INTO vector_en
FROM kkw_block
LEFT JOIN kkw ON NEW.kkw_id = kkw.kkw_id
LEFT JOIN kkw_typ ON NEW.kkw_typ_id = kkw_typ.kkw_typ_id
WHERE kkw_block_id = NEW.kkw_block_id;
-- I need to update a field of the newly created or updated row.
NEW.search_vector_en := to_tsvector('english', 'new test vector'); --- This doesn't work with 'AFTER UPDATE' trigger.
RETURN NULL;
END
$BODY$
Any idea?
Drop default for your PK and assign it in your BEFORE trigger. You will have to change that existing trigger from AFTER to BEFORE.
You can assign PK from sequence like that:
NEW.kkw_block_id = nextval('your_sequence_name_here');
Since you are using the same function for both INSERT and DELETE, you need to check if it is INSERT and only then use sequence. I have also included check if PK is null or not. I suppose that alone would be enough to not overwrite it during update.
IF (TG_OP = 'INSERT') AND NEW.kkw_block_id IS NULL THEN
NEW.kkw_block_id = nextval('your_sequence_name_here');
END IF;
This will be fine as long as this trigger will work for each new row, with seems to be the case. This will let you modify NEW and it will be reflected in data saved in table.
I end up with the following solution. I made a BEFORE trigger. The problem was the LEFT JOIN with reference to the table where the new row doesn't exist yet. It's not ideal but here it is:
CREATE TRIGGER tsvectorupdate
BEFORE INSERT OR UPDATE
ON kkw_block
FOR EACH ROW
EXECUTE PROCEDURE kkw_search_trigger();
CREATE TYPE kkw_type_record_type AS (typ_abr TEXT, typ_desc_en TEXT, typ_desc_short_en TEXT);
CREATE TYPE kkw_record_type AS (kkw_name_en TEXT);
CREATE OR REPLACE FUNCTION kkw_search_trigger()
RETURNS trigger AS
$BODY$
DECLARE kkw_rec kkw_record_type;
DECLARE kkw_typ_rec kkw_type_record_type;
DECLARE vector_en TEXT;
BEGIN
--- make a individual select instead of LEFT JOIN
SELECT kkw_name_en INTO kkw_rec.kkw_name_en
FROM kkw
WHERE kkw.kkw_id = NEW.kkw_id;
--- make a individual select instead of LEFT JOIN
SELECT typ_abr, typ_desc_en, typ_desc_short_en INTO kkw_typ_rec.typ_abr, kkw_typ_rec.typ_desc_en, kkw_typ_rec.typ_desc_short_en
FROM kkw_typ
WHERE kkw_typ.kkw_typ_id = NEW.kkw_typ_id;
vector_en := coalesce(NEW.modell_en, '') || ', ' || coalesce(NEW.bezeichnung_en,'') || ', ' || coalesce(kkw_rec.kkw_name_en,'') || ', ' || coalesce(kkw_typ_rec.typ_abr,'') || ', ' || coalesce(kkw_typ_rec.typ_desc_en,'') || ', ' || coalesce(kkw_typ_rec.typ_desc_short_en,'');
NEW.search_vector_en := to_tsvector('english', vector_en);
RETURN NEW;
END
$BODY$

Performance of joining on multiple columns with potential NULL values

Lets say we have the following table
CREATE TABLE my_table
(
record_id SERIAL,
column_1 INTEGER,
column_2 INTEGER,
column_3 INTEGER,
price NUMERIC
);
With the following data
INSERT INTO my_table (column_1, column_2, column_3, price) VALUES
(1, NULL, 1, 54.99),
(1, NULL, 1, 69.50),
(NULL, 2, 2, 54.99),
(NULL, 2, 2, 69.50),
(3, 3, NULL, 54.99),
(3, 3, NULL, 69.50);
Now we do something like
CREATE TABLE my_table_aggregations AS
SELECT
ROW_NUMBER() OVER () AS aggregation_id,
column_1,
column_2,
column_3
FROM my_table
GROUP BY
column_1,
column_2,
column_3;
What I want to do now is assign an aggregation_id to each record_id in my_table. Now because I have NULL values I cant simply join by t1.column_1 = t2.column_1 because NULL = NULL is NULL and so the join will exclude these records.
Now I know that I should use something like this
SELECT
t.record_id,
agg.aggregation_id
FROM my_table t
JOIN my_table_aggregations agg ON
(
((t.column_1 IS NULL AND agg.column_1 IS NULL) OR t.column_1 = agg.column_1) AND
((t.column_2 IS NULL AND agg.column_2 IS NULL) OR t.column_2 = agg.column_2) AND
((t.column_3 IS NULL AND agg.column_3 IS NULL) OR t.column_3 = agg.column_3)
);
The problem here is that I am dealing with hundreds of millions of records and having an OR in the join seems to take forever to run.
There is an alternative, which is something like this
SELECT
t.record_id,
agg.aggregation_id
FROM my_table t
JOIN my_table_aggregations agg ON
(
COALESCE(t.column_1, -1) = COALESCE(agg.column_1, -1) AND
COALESCE(t.column_2, -1) = COALESCE(agg.column_2, -1) AND
COALESCE(t.column_3, -1) = COALESCE(agg.column_3, -1)
);
But the problem with this is that I am assuming there is no value in any of those columns which is -1.
Do note, this is an example which I am well aware I can use DENSE_RANK to get the same result. So lets pretend that this isn't an option.
Is there some crazy awesome way to get around having to use COALESCE but keeping the performance it has over using the correct way of the OR? I run tests, and the COALESCE is over 10 times faster than the OR.
I am running this on a Greenplum database so I am not sure if this performance difference is the same on a standard Postgres database.
Since my solution with NULLIF had performance problems, and your use of COALESCE was much faster, I wonder if you could try tweaking that solution to deal with the issue of -1. To do that, you could try casting to avoid false matches. I'm not sure what the performance hit would be, but it would look like:
SELECT
t.record_id,
agg.aggregation_id
FROM my_table t
JOIN my_table_aggregations agg ON
(
COALESCE(cast(t.column_1 as varchar), 'NA') =
COALESCE(cast(agg.column_1 as varchar), 'NA') AND
COALESCE(cast(t.column_2 as varchar), 'NA') =
COALESCE(cast(agg.column_2 as varchar), 'NA') AND
COALESCE(cast(t.column_3 as varchar), 'NA') =
COALESCE(cast(agg.column_3 as varchar), 'NA')
);
After doing some thinking, I decided the best approach this this is to dynamically find a value for each column that can be used as the second param in a COALESCE join. The function is rather long, but it does what I need and more importantly, this way keeps the COALESCE performance, the only down side is getting the MIN values is an additional time cost, but we are talking a minute.
Here is the function:
CREATE OR REPLACE FUNCTION pg_temp.get_null_join_int_value
(
left_table_schema TEXT,
left_table_name TEXT,
left_table_columns TEXT[],
right_table_schema TEXT,
right_table_name TEXT,
right_table_columns TEXT[],
output_table_schema TEXT,
output_table_name TEXT
) RETURNS TEXT AS
$$
DECLARE
colum_name TEXT;
sql TEXT;
complete_sql TEXT;
full_left_table TEXT;
full_right_table TEXT;
full_output_table TEXT;
BEGIN
/*****************************
VALIDATE PARAMS
******************************/
-- this section validates all of the function parameters ensuring that the values that cannot be NULL are not so
-- also checks for empty arrays which is not allowed and then ensures both arrays are of the same length
IF (left_table_name IS NULL) THEN
RAISE EXCEPTION 'left_table_name cannot be NULL';
ELSIF (left_table_columns IS NULL) THEN
RAISE EXCEPTION 'left_table_columns cannot be NULL';
ELSIF (right_table_name IS NULL) THEN
RAISE EXCEPTION 'right_table_name cannot be NULL';
ELSIF (right_table_columns IS NULL) THEN
RAISE EXCEPTION 'right_table_columns cannot be NULL';
ELSIF (output_table_name IS NULL) THEN
RAISE EXCEPTION 'output_table_name cannot be NULL';
ELSIF (array_upper(left_table_columns, 1) IS NULL) THEN
RAISE EXCEPTION 'left_table_columns cannot be an empty array';
ELSIF (array_upper(right_table_columns, 1) IS NULL) THEN
RAISE EXCEPTION 'right_table_columns cannot be an empty array';
ELSIF (array_upper(left_table_columns, 1) <> array_upper(right_table_columns, 1)) THEN
RAISE EXCEPTION 'left_table_columns and right_table_columns must have a matching array length';
END IF;
/************************
TABLE NAMES
*************************/
-- create the full name of the left table
-- the schema name can be NULL which means that the table is temporary
-- because of this, we need to detect if we should specify the schema
IF (left_table_schema IS NOT NULL) THEN
full_left_table = left_table_schema || '.' || left_table_name;
ELSE
full_left_table = left_table_name;
END IF;
-- create the full name of the right table
-- the schema name can be NULL which means that the table is temporary
-- because of this, we need to detect if we should specify the schema
IF (right_table_schema IS NOT NULL) THEN
full_right_table = right_table_schema || '.' || right_table_name;
ELSE
full_right_table = right_table_name;
END IF;
-- create the full name of the output table
-- the schema name can be NULL which means that the table is temporary
-- because of this, we need to detect if we should specify the schema
IF (output_table_schema IS NOT NULL) THEN
full_output_table = output_table_schema || '.' || output_table_name;
ELSE
full_output_table = output_table_name;
END IF;
/**********************
LEFT TABLE
***********************/
-- start to create the table which will store the min values from the left table
sql =
'DROP TABLE IF EXISTS temp_null_join_left_table;' || E'\n' ||
'CREATE TEMP TABLE temp_null_join_left_table AS' || E'\n' ||
'SELECT';
-- loop through each column name in the left table column names parameter
FOR colum_name IN SELECT UNNEST(left_table_columns) LOOP
-- find the minimum value in this column and subtract one
-- we will use this as a value we know is not in the column of this table
sql = sql || E'\n\t' || 'MIN("' || colum_name || '")-1 AS "' || colum_name || '",';
END LOOP;
-- remove the trailing comma from the SQL
sql = TRIM(TRAILING ',' FROM sql);
-- finish the SQL to create the left table min values
sql = sql || E'\n' ||
'FROM ' || full_left_table || ';';
-- run the query that creates the table which stores the minimum values for each column in the left table
EXECUTE sql;
-- store the sql which will be the return value of the function
complete_sql = sql;
/************************
RIGHT TABLE
*************************/
-- start to create the table which will store the min values from the right table
sql =
'DROP TABLE IF EXISTS temp_null_join_right_table;' || E'\n' ||
'CREATE TEMP TABLE temp_null_join_right_table AS' || E'\n' ||
'SELECT';
-- loop through each column name in the right table column names parameter
FOR colum_name IN SELECT UNNEST(right_table_columns) LOOP
-- find the minimum value in this column and subtract one
-- we will use this as a value we know is not in the column of this table
sql = sql || E'\n\t' || 'MIN("' || colum_name || '")-1 AS "' || colum_name || '",';
END LOOP;
-- remove the trailing comma from the SQL
sql = TRIM(TRAILING ',' FROM sql);
-- finish the SQL to create the right table min values
sql = sql || E'\n' ||
'FROM ' || full_left_table || ';';
-- run the query that creates the table which stores the minimum values for each column in the right table
EXECUTE sql;
-- store the sql which will be the return value of the function
complete_sql = complete_sql || E'\n\n' || sql;
-- start to create the final output table which will contain the column names defined in the left_table_columns parameter
-- each column will contain a negative value that is not present in both the left and right tables for the given column
sql =
'DROP TABLE IF EXISTS ' || full_output_table || ';' || E'\n' ||
'CREATE ' || (CASE WHEN output_table_schema IS NULL THEN 'TEMP ' END) || 'TABLE ' || full_output_table || ' AS' || E'\n' ||
'SELECT';
-- loop through each index of the left_table_columns array
FOR i IN coalesce(array_lower(left_table_columns, 1), 1)..coalesce(array_upper(left_table_columns, 1), 1) LOOP
-- add to the sql a call to the LEAST function
-- this function takes an infinite number of columns and returns the smallest value within those columns
-- we have -1 hardcoded because the smallest minimum value may be a positive integer and so we need to ensure the number used is negative
-- this way we will not confuse this value with a real ID from a table
sql = sql || E'\n\t' || 'LEAST(l."' || left_table_columns[i] || '", r."' || right_table_columns[i] || '", -1) AS "' || left_table_columns[i] || '",';
END LOOP;
-- remove the trailing comma from the SQL
sql = TRIM(TRAILING ',' FROM sql);
-- finish off the SQL which creates the final table
sql = sql || E'\n' ||
'FROM temp_null_join_left_table l' || E'\n' ||
'CROSS JOIN temp_null_join_right_table r' || ';';
-- create the final table
EXECUTE sql;
-- store the sql which will be the return value of the function
complete_sql = complete_sql || E'\n\n' || sql;
-- we no longer need these tables
sql =
'DROP TABLE IF EXISTS temp_null_join_left_table;' || E'\n' ||
'DROP TABLE IF EXISTS temp_null_join_right_table;';
EXECUTE sql;
-- store the sql which will be the return value of the function
complete_sql = complete_sql || E'\n\n' || sql;
-- return the SQL that has been run, good for debugging purposes or just understanding what the function does
RETURN complete_sql;
END;
$$
LANGUAGE plpgsql;
Below is an example usage of the function
SELECT pg_temp.get_null_join_int_value
(
-- left table
'public',
'my_table',
'{"column_1", "column_2", "column_3"}',
-- right table
'public',
'my_table_aggregations',
'{"column_1", "column_2", "column_3"}',
-- output table
NULL,
'temp_null_join_values'
);
Once the temp_null_join_values table is created you can do a sub select in the join for the COALESCE 2nd param.
DROP TABLE IF EXISTS temp_result_table;
CREATE TEMP TABLE temp_result_table AS
SELECT
t.record_id,
agg.aggregation_id
FROM public.my_table t
JOIN my_table_aggregations agg ON
(
COALESCE(t.column_1, (SELECT column_1 FROM temp_null_join_values)) = COALESCE(agg.column_1, (SELECT column_1 FROM temp_null_join_values)) AND
COALESCE(t.column_2, (SELECT column_2 FROM temp_null_join_values)) = COALESCE(agg.column_2, (SELECT column_2 FROM temp_null_join_values)) AND
COALESCE(t.column_3, (SELECT column_3 FROM temp_null_join_values)) = COALESCE(agg.column_3, (SELECT column_3 FROM temp_null_join_values))
);
I hope this helps someone
How about:
SELECT
t.record_id,
a.aggregation_id
FROM my_table t
JOIN my_table_aggregations agg ON
(
NULLIF(t.column_1, agg.column_1) IS NULL
AND
NULLIF(agg.column_1, t.column_1) IS NULL
AND
NULLIF(t.column_2, agg.column_2) IS NULL
AND
NULLIF(agg.column_2, t.column_2) IS NULL
AND
NULLIF(t.column_3, agg.column_3) IS NULL
AND
NULLIF(agg.column_3, t.column_3) IS NULL
);

Postgresql create a log schema

So my problem is simple. I have a schema prod with many tables, and another one log with the exact same tables and structure (primary keys change that's it).
When I do UPDATE or DELETE in the schema prod, I want to record old data in the log schema.
I have the following function called after a update or delete:
CREATE FUNCTION prod.log_data() RETURNS trigger
LANGUAGE plpgsql AS $$
DECLARE
v RECORD;
column_names text;
value_names text;
BEGIN
-- get column names of current table and store the list in a text var
column_names = '';
value_names = '';
FOR v IN SELECT * FROM information_schema.columns WHERE table_name = quote_ident(TG_TABLE_NAME) AND table_schema = quote_ident(TG_TABLE_SCHEMA) LOOP
column_names = column_names || ',' || v.column_name;
value_names = value_names || ',$1.' || v.column_name;
END LOOP;
-- remove first char ','
column_names = substring( column_names FROM 2);
value_names = substring( value_names FROM 2);
-- execute the insert into log schema
EXECUTE 'INSERT INTO log.' || TG_TABLE_NAME || ' ( ' || column_names || ' ) VALUES ( ' || value_names || ' )' USING OLD;
RETURN NULL; -- no need to return, it is executed after update
END;$$;
The annoying part is that I have to get column names from information_schema for each row.
I would rather use this:
EXECUTE 'INSERT INTO log.' || TG_TABLE_NAME || ' SELECT ' || OLD;
But some values can be NULL so this will execute:
INSERT INTO log.user SELECT 2,,,"2015-10-28 13:52:44.785947"
instead of
INSERT INTO log.user SELECT 2,NULL,NULL,"2015-10-28 13:52:44.785947"
Any idea to convert ",," to ",NULL,"?
Thanks
-Quentin
First of all I must say that in my opinion using PostgreSQL system tables (like information_schema) is the proper way for such a usecase. Especially that you must write it once: you create the function prod.log_data() and your done. Moreover it may be dangerous to use OLD in that context (just like *) as always because of not specified elements order.
But,
to answer your exact question the only way I know is to do some operations on OLD. Just observe that you cast OLD to text by doing concatenation ... ' SELECT ' || OLD. The default casting create that ugly double-commas. So, next you can play with that text. In the end I propose:
DECLARE
tmp TEXT
...
BEGIN
...
/*to make OLD -> text like (2,,3,4,,)*/
SELECT '' || OLD INTO tmp; /*step 1*/
/*take care of commas at the begining and end: '(,' ',)'*/
tmp := replace(replace(tmp, '(,', '(NULL,'), ',)', ',NULL)'); /*step 2*/
/* replace rest of commas to commas with NULL between them */
SELECT array_to_string(string_to_array(tmp, ',', ''), ',', 'NULL') INTO tmp; /*step 3*/
/* Now we can do EXECUTE*/
EXECUTE 'INSERT INTO log.' || TG_TABLE_NAME || ' SELECT ' || tmp;
Of course you can do steps 1-3 in one big step
SELECT array_to_string(string_to_array(replace(replace('' || NEW, '(,', '(NULL,'), ',)', ',NULL)'), ',', ''), ',', 'NULL') INTO tmp;
In my opinion this approach isn't any better from using information_schema, but it's your call.

PostgreSQL ERROR: EXECUTE of SELECT ... INTO is not implemented

When I run the following command from a function I defined, I get the error "EXECUTE of SELECT ... INTO is not implemented". Does this mean the specific command is not allowed (i.e. "SELECT ...INTO")? Or does it just mean I'm doing something wrong? The actual code causing the error is below. I apologize if the answer is already out here, however I looked and could not find this specific error. Thanks in advance... For whatever it's worth I'm running 8.4.7
vCommand = 'select ' || stmt.column_name || ' as id ' ||
', count(*) as nCount
INTO tmpResults
from ' || stmt.table_name || '
WHERE ' || stmt.column_name || ' IN (select distinct primary_id from anyTable
WHERE primary_id = ' || stmt.column_name || ')
group by ' || stmt.column_name || ';';
EXECUTE vCommand;
INTO is ambiguous in this use case and then is prohibited there.
You can use a CREATE TABLE AS SELECT instead.
CREATE OR REPLACE FUNCTION public.f1(tablename character varying)
RETURNS integer
LANGUAGE plpgsql
AS $function$
begin
execute 'create temp table xx on commit drop as select * from '
|| quote_ident(tablename);
return (select count(*) from xx);
end;
$function$
postgres=# select f1('omega');
f1
────
2
(1 row)