How to call decode function on column before insert statement? - postgresql

I have a table and some of the columns are "bytea" type. What I want to do is; Before any insert statement check if a column type is "bytea" then decode hex value of column value.
Is there a way to create a trigger function like below?
INSERT INTO USERS (ID, NAME, STATUS) VALUES ('0x3BEDDASTSFSFSDS', 'test', 'new')
to
INSERT INTO USERS (ID, NAME, STATUS) VALUES (decode('0x3BEDDASTSFSFSDS', 'hex'), 'test', 'new')

There is no way to intercept the input string before Postgres parses it as a bytea.
It'll be read as an "escape" format literal, i.e. the bytea will end up holding the hex digits' ASCII codepoints. You can undo this, though it's a little unpleasant:
create function trg() returns trigger language plpgsql as $$
begin
assert substring(new.id from 1 for 2) = '0x';
new.id = decode(convert_from(substring(new.id from 3), current_setting('server_encoding')), 'hex');
return new;
end
$$;
create trigger trg
before insert on users
for each row
execute procedure trg();
If you have any control over the input, just change the 0x to \x, and Postgres will convert it for you.

Related

How to call an array in stored procedure?

May I know on how to call an array in stored procedure? I tried to enclosed it with a bracket to put the column_name that need to be insert in the new table.
CREATE OR REPLACE PROCEDURE data_versioning_nonull(new_table_name VARCHAR(100),column_name VARCHAR(100)[], current_table_name VARCHAR(100))
language plpgsql
as $$
BEGIN
EXECUTE ('CREATE TABLE ' || quote_ident(new_table_name) || ' AS SELECT ' || quote_ident(column_name) || ' FROM ' || quote_ident(current_table_name));
END $$;
CALL data_versioning_nonull('sales_2019_sample', ['orderid', 'product', 'address'], 'sales_2019');
Using execute format() lets you replace all the quote_ident() with %I placeholders in a single text instead of a series of concatenated snippets. %1$I lets you re-use the first argument.
It's best if you use ARRAY['a','b','c']::VARCHAR(100)[] to explicitly make it an array of your desired type. '{"a","b","c"}'::VARCHAR(100)[] works too.
You'll need to convert the array into a list of columns some other way, because when cast to text, it'll get curly braces which are not allowed in the column list syntax. Demo
It's not a good practice to introduce random limitations - PostgreSQL doesn't limit identifier lengths to 100 characters, so you don't have to either. The default limit is 63 bytes, so you can go way, way longer than 100 characters (demo). You can switch that data type to a regular text. Interestingly, exceeding specified varchar length would just convert it to unlimited varchar, making it just syntax noise.
DBFiddle online demo
CREATE TABLE sales_2019(orderid INT,product INT,address INT);
CREATE OR REPLACE PROCEDURE data_versioning_nonull(
new_table_name TEXT,
column_names TEXT[],
current_table_name TEXT)
LANGUAGE plpgsql AS $$
DECLARE
list_of_columns_as_quoted_identifiers TEXT;
BEGIN
SELECT string_agg(quote_ident(name),',')
INTO list_of_columns_as_quoted_identifiers
FROM unnest(column_names) name;
EXECUTE format('CREATE TABLE %1$I.%2$I AS SELECT %3$s FROM %1$I.%4$I',
current_schema(),
new_table_name,
list_of_columns_as_quoted_identifiers,
current_table_name);
END $$;
CALL data_versioning_nonull(
'sales_2019_sample',
ARRAY['orderid', 'product', 'address']::text[],
'sales_2019');
Schema awareness: currently the procedure creates the new table in the default schema, based on a table in that same default schema - above I made it explicit, but that's what it would do without the current_schema() calls anyway. You could add new_table_schema and current_table_schema parameters and if most of the time you don't expect them to be used, you can hide them behind procedure overloads for convenience, using current_schema() to keep the implicit behaviour. Demo
First, change your stored procedure to convert selected columns from array to csv like this.
CREATE OR REPLACE PROCEDURE data_versioning_nonull(new_table_name VARCHAR(100),column_name VARCHAR(100)[], current_table_name VARCHAR(100))
language plpgsql
as $$
BEGIN
EXECUTE ('CREATE TABLE ' || quote_ident(new_table_name) || ' AS SELECT ' || array_to_string(column_name, ',') || ' FROM ' || quote_ident(current_table_name));
END $$;
Then call it as:
CALL data_versioning_nonull('sales_2019_sample', '{"orderid", "product", "address"}', 'sales_2019');

Filter bigint values on insert Postgresql

I have 2 tables in Postgresql with the same schema, the only difference is that in one of the table id field is of type bigint. Schema of the table I need to fill with data looks like this:
create table test_int_table(
id int,
description text,
hash_code int
);
I need to copy the data from test_table with bigint id to public.test_int_table. And some of the values which are bigger than id range should be filtered out. How can I track those values without hardcoding the range?
I can do something like this, but I would like to build more generic solution:
insert into test_int_table
select * from test_table as test
where test.id not between 2147483647 and 9223372036854775808
By generic I mean without constraints on the columns names and their number. So that in case, I have multiple columns of bigint type in other tables how can I filter all of their columns values generically (without specifying a column name)?
There is no generic solution, as far as I can tell.
But I would write it as
INSERT INTO test_int_table
SELECT *
FROM test_table AS t
WHERE t.id BETWEEN -2147483647 AND 2147483647;
You can do something like this if you want to track :
Create a function like this :
CREATE OR REPLACE FUNCTION convert_to_integer(v_input bigint)
RETURNS INTEGER AS $$
DECLARE v_int_value INTEGER DEFAULT NULL;
BEGIN
BEGIN
v_int_value := v_input::INTEGER;
EXCEPTION WHEN OTHERS THEN
RAISE NOTICE 'Invalid integer value: "%". Returning NULL.', v_input;
RETURN NULL;
END;
RETURN v_int_value;
END;
$$ LANGUAGE plpgsql;
and write a query like this :
INSERT INTO test_int_table SELECT * FROM test_table AS t WHERE convert_to_integer(t.id) is not null;
Or you can modify a function to return 0.

String Manipulation throws error while Inserting into PostgreSql Jsonb Column

I have the following code, to insert data into my table res(ID bigserial,
Results jsonb not null). I want to insert data so that 'Display' column always has the 'i' appended to it, so that every row has a different value for 'Display' Column.
DO $$
declare cnt bigint;
BEGIN
FOR i IN 1..2 LOOP
INSERT INTO res (Results)
VALUES ('{"cid":"CID1","Display":"User One'|| i || '","FName":"Userfff","LName":"One"}');
INSERT INTO res (Results)
VALUES ('{"cid":"CID1","Display":"User One'|| i || '","FName":"Userfff","LName":"One"}');
INSERT INTO res (Results)
VALUES ('{"cid":"CID1","Display":"User One'|| i || '","FName":"Userfff","LName":"One"}');
END LOOP;
END;
$$
LANGUAGE plpgsql;
However, when I run this code, I get the following error:
ERROR: column "results" is of type jsonb but expression is of type text
LINE 2: VALUES ('{"cid":"CID1","Display":"User One'|| i ...
^
HINT: You will need to rewrite or cast the expression.
How should I modify my query so that the query runs successfully?
No need for a loop or even PL/pgSQL.
You can generate any number of rows using generate_series()
To put a value into a string I prefer to use format() as that makes dealing with strings a lot easier:
insert into res(results)
select format('{"cid":"CID1","Display":"User One %s","FName":"Userfff","LName":"One"}', i::text)::jsonb
from generate_series(1,5) g(i);
The second parameter to the format() function will be put where the %s in the first parameter.
The above inserts 5 rows each with a different value for display. This is what you stated in your question. Your sample code however inserts a total of 6 rows where 3 rows have the same value for display.

Function to return dynamic set of columns for given table

I have a fields table to store column information for other tables:
CREATE TABLE public.fields (
schema_name varchar(100),
table_name varchar(100),
column_text varchar(100),
column_name varchar(100),
column_type varchar(100) default 'varchar(100)',
column_visible boolean
);
And I'd like to create a function to fetch data for a specific table.
Just tried sth like this:
create or replace function public.get_table(schema_name text,
table_name text,
active boolean default true)
returns setof record as $$
declare
entity_name text default schema_name || '.' || table_name;
r record;
begin
for r in EXECUTE 'select * from ' || entity_name loop
return next r;
end loop;
return;
end
$$
language plpgsql;
With this function I have to specify columns when I call it!
select * from public.get_table('public', 'users') as dept(id int, uname text);
I want to pass schema_name and table_name as parameters to function and get record list, according to column_visible field in public.fields table.
Solution for the simple case
As explained in the referenced answers below, you can use registered (row) types, and thus implicitly declare the return type of a polymorphic function:
CREATE OR REPLACE FUNCTION public.get_table(_tbl_type anyelement)
RETURNS SETOF anyelement AS
$func$
BEGIN
RETURN QUERY EXECUTE format('TABLE %s', pg_typeof(_tbl_type));
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM public.get_table(NULL::public.users); -- note the syntax!
Returns the complete table (with all user columns).
Wait! How?
Detailed explanation in this related answer, chapter
"Various complete table types":
Refactor a PL/pgSQL function to return the output of various SELECT queries
TABLE foo is just short for SELECT * FROM foo:
Is there a shortcut for SELECT * FROM?
2 steps for completely dynamic return type
But what you are trying to do is strictly impossible in a single SQL command.
I want to pass schema_name and table_name as parameters to function and get record list, according to column_visible field in
public.fields table.
There is no direct way to return an arbitrary selection of columns (return type not known at call time) from a function - or any SQL command. SQL demands to know number, names and types of resulting columns at call time. More in the 2nd chapter of this related answer:
How do I generate a pivoted CROSS JOIN where the resulting table definition is unknown?
There are various workarounds. You could wrap the result in one of the standard document types (json, jsonb, hstore, xml).
Or you generate the query with one function call and execute the result with the next:
CREATE OR REPLACE FUNCTION public.generate_get_table(_schema_name text, _table_name text)
RETURNS text AS
$func$
SELECT format('SELECT %s FROM %I.%I'
, string_agg(quote_ident(column_name), ', ')
, schema_name
, table_name)
FROM fields
WHERE column_visible
AND schema_name = _schema_name
AND table_name = _table_name
GROUP BY schema_name, table_name
ORDER BY schema_name, table_name;
$func$ LANGUAGE sql;
Call:
SELECT public.generate_get_table('public', 'users');
This create a query of the form:
SELECT usr_id, usr FROM public.users;
Execute it in the 2nd step. (You might want to add column numbers and order columns.)
Or append \gexec in psql to execute the return value immediately. See:
How to force evaluation of subquery before joining / pushing down to foreign server
Be sure to defend against SQL injection:
INSERT with dynamic table name in trigger function
Define table and column names as arguments in a plpgsql function?
Asides
varchar(100) does not make much sense for identifiers, which are limited to 63 characters in standard Postgres:
Maximum characters in labels (table names, columns etc)
If you understand how the object identifier type regclass works, you might replace schema and table name with a singe regclass column.
I think you just need another query to get the list of columns you want.
Maybe something like (this is untested):
create or replace function public.get_table(_schema_name text, _table_name text, active boolean default true) returns setof record as $$
declare
entity_name text default schema_name || '.' || table_name;
r record;
columns varchar;
begin
-- Get the list of columns
SELECT string_agg(column_name, ', ')
INTO columns
FROM public.fields
WHERE fields.schema_name = _schema_name
AND fields.table_name = _table_name
AND fields.column_visible = TRUE;
-- Return rows from the specified table
RETURN QUERY EXECUTE 'select ' || columns || ' from ' || entity_name;
RETURN;
end
$$
language plpgsql;
Keep in mind that column/table references may need to be surrounded by double quotes if they have certain characters in them.

Execute a dynamic crosstab query

I implemented this function in my Postgres database: http://www.cureffi.org/2013/03/19/automatically-creating-pivot-table-column-names-in-postgresql/
Here's the function:
create or replace function xtab (tablename varchar, rowc varchar, colc varchar, cellc varchar, celldatatype varchar) returns varchar language plpgsql as $$
declare
dynsql1 varchar;
dynsql2 varchar;
columnlist varchar;
begin
-- 1. retrieve list of column names.
dynsql1 = 'select string_agg(distinct '||colc||'||'' '||celldatatype||''','','' order by '||colc||'||'' '||celldatatype||''') from '||tablename||';';
execute dynsql1 into columnlist;
-- 2. set up the crosstab query
dynsql2 = 'select * from crosstab (
''select '||rowc||','||colc||','||cellc||' from '||tablename||' group by 1,2 order by 1,2'',
''select distinct '||colc||' from '||tablename||' order by 1''
)
as ct (
'||rowc||' varchar,'||columnlist||'
);';
return dynsql2;
end
$$;
So now I can call the function:
select xtab('globalpayments','month','currency','(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)','text');
Which returns (because the return type of the function is varchar):
select * from crosstab (
'select month,currency,(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)
from globalpayments
group by 1,2
order by 1,2'
, 'select distinct currency
from globalpayments
order by 1'
) as ct ( month varchar,CAD text,EUR text,GBP text,USD text );
How can I get this function to not only generate the code for the dynamic crosstab, but also execute the result? I.e., the result when I manually copy/paste/execute is this. But I want it to execute without that extra step: the function shall assemble the dynamic query and execute it:
Edit 1
This function comes close, but I need it to return more than just the first column of the first record
Taken from: Are there any way to execute a query inside the string value (like eval) in PostgreSQL?
create or replace function eval( sql text ) returns text as $$
declare
as_txt text;
begin
if sql is null then return null ; end if ;
execute sql into as_txt ;
return as_txt ;
end;
$$ language plpgsql
usage: select * from eval($$select * from analytics limit 1$$)
However it just returns the first column of the first record :
eval
----
2015
when the actual result looks like this:
Year, Month, Date, TPV_USD
---- ----- ------ --------
2016, 3, 2016-03-31, 100000
What you ask for is impossible. SQL is a strictly typed language. PostgreSQL functions need to declare a return type (RETURNS ..) at the time of creation.
A limited way around this is with polymorphic functions. If you can provide the return type at the time of the function call. But that's not evident from your question.
Refactor a PL/pgSQL function to return the output of various SELECT queries
You can return a completely dynamic result with anonymous records. But then you are required to provide a column definition list with every call. And how do you know about the returned columns? Catch 22.
There are various workarounds, depending on what you need or can work with. Since all your data columns seem to share the same data type, I suggest to return an array: text[]. Or you could return a document type like hstore or json. Related:
Dynamic alternative to pivot with CASE and GROUP BY
Dynamically convert hstore keys into columns for an unknown set of keys
But it might be simpler to just use two calls: 1: Let Postgres build the query. 2: Execute and retrieve returned rows.
Selecting multiple max() values using a single SQL statement
I would not use the function from Eric Minikel as presented in your question at all. It is not safe against SQL injection by way of maliciously malformed identifiers. Use format() to build query strings unless you are running an outdated version older than Postgres 9.1.
A shorter and cleaner implementation could look like this:
CREATE OR REPLACE FUNCTION xtab(_tbl regclass, _row text, _cat text
, _expr text -- still vulnerable to SQL injection!
, _type regtype)
RETURNS text
LANGUAGE plpgsql AS
$func$
DECLARE
_cat_list text;
_col_list text;
BEGIN
-- generate categories for xtab param and col definition list
EXECUTE format(
$$SELECT string_agg(quote_literal(x.cat), '), (')
, string_agg(quote_ident (x.cat), %L)
FROM (SELECT DISTINCT %I AS cat FROM %s ORDER BY 1) x$$
, ' ' || _type || ', ', _cat, _tbl)
INTO _cat_list, _col_list;
-- generate query string
RETURN format(
'SELECT * FROM crosstab(
$q$SELECT %I, %I, %s
FROM %I
GROUP BY 1, 2 -- only works if the 3rd column is an aggregate expression
ORDER BY 1, 2$q$
, $c$VALUES (%5$s)$c$
) ct(%1$I text, %6$s %7$s)'
, _row, _cat, _expr -- expr must be an aggregate expression!
, _tbl, _cat_list, _col_list, _type);
END
$func$;
Same function call as your original version. The function crosstab() is provided by the additional module tablefunc which has to be installed. Basics:
PostgreSQL Crosstab Query
This handles column and table names safely. Note the use of object identifier types regclass and regtype. Also works for schema-qualified names.
Table name as a PostgreSQL function parameter
However, it is not completely safe while you pass a string to be executed as expression (_expr - cellc in your original query). This kind of input is inherently unsafe against SQL injection and should never be exposed to the general public.
SQL injection in Postgres functions vs prepared queries
Scans the table only once for both lists of categories and should be a bit faster.
Still can't return completely dynamic row types since that's strictly not possible.
Not quite impossible, you can still execute it (from a query execute the string and return SETOF RECORD.
Then you have to specify the return record format. The reason in this case is that the planner needs to know the return format before it can make certain decisions (materialization comes to mind).
So in this case you would EXECUTE the query, return the rows and return SETOF RECORD.
For example, we could do something like this with a wrapper function but the same logic could be folded into your function:
CREATE OR REPLACE FUNCTION crosstab_wrapper
(tablename varchar, rowc varchar, colc varchar,
cellc varchar, celldatatype varchar)
returns setof record language plpgsql as $$
DECLARE outrow record;
BEGIN
FOR outrow IN EXECUTE xtab($1, $2, $3, $4, $5)
LOOP
RETURN NEXT outrow
END LOOP;
END;
$$;
Then you supply the record structure on calling the function just like you do with crosstab.
Then when you all the query you would have to supply a record structure (as (col1 type, col2 type, etc) like you do with connectby.