How to convert a string into SQL syntax - PostgreSQL

An example of what I am attempting to do:
SELECT count(*) as "count"
FROM (
SELECT overlay('db_xx.company_summary' placing 'US' from 4)
) as s
This returns a count of 1. I would expect it to count all rows in db_us.company_summary (about 1*10^6).
I would expect this to result in a query similar to this:
SELECT count(*) as "count"
FROM db_us.company_summary
I attempted to use the overlay function to build something similar to the above query. I am not sure if it's possible to do this in SQL.
Normally in Python you would do something like this:
"hello {}".format("world")
So I would like the resulting string to be executed as a SQL command.
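For comparison, Postgres has a close analog to Python's str.format() in its format() function, where %s substitutes values and %I safely quotes identifiers (a quick illustration, not part of the original post):
SELECT format('hello %s', 'world');  -- hello world
SELECT format('SELECT count(*) FROM %I.company_summary', 'db_us');  -- builds the query text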

Plain SQL does not allow you to parameterize identifiers (or anything but values). You need dynamic SQL. Examples:
Truncating all tables in a Postgres database
How to use variable as table name in plpgsql
How do I use variables in a postgresql function for loop query
Something like this:
Minimal test setup (should be in the question):
CREATE TABLE country_code (iso2 text PRIMARY KEY);
INSERT INTO country_code VALUES ('US'), ('GB'), ('AT');
CREATE SCHEMA db_us;
CREATE SCHEMA db_gb;
CREATE SCHEMA db_at;
CREATE TABLE db_us.company_summary (foo int);
CREATE TABLE db_gb.company_summary (foo int);
CREATE TABLE db_at.company_summary (foo int);
INSERT INTO db_us.company_summary VALUES (1), (2);
INSERT INTO db_gb.company_summary VALUES (1), (2), (3);
INSERT INTO db_at.company_summary VALUES (1), (2), (3), (4);
PL/pgSQL function with dynamic SQL:
CREATE OR REPLACE FUNCTION f_counts()
  RETURNS SETOF bigint
  LANGUAGE plpgsql AS
$func$
DECLARE
   _lower_iso2 text;
BEGIN
   FOR _lower_iso2 IN
      SELECT lower(iso2) FROM country_code
   LOOP
      RETURN QUERY EXECUTE
      format('SELECT count(*) AS count FROM %I.company_summary'
           , overlay('db_xx' PLACING _lower_iso2 FROM 4));
   END LOOP;
END
$func$;
Call:
SELECT * FROM f_counts();
Result:
f_counts
---------
2
3
4
Be aware that Postgres identifiers are case sensitive. See:
Are PostgreSQL column names case-sensitive?
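A quick illustration of that pitfall (not part of the original answer):
CREATE TABLE t ("Foo" int);
SELECT foo   FROM t;  -- ERROR: column "foo" does not exist (unquoted names fold to lower case)
SELECT "Foo" FROM t;  -- works: quoted identifiers are kept verbatim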

Related

How to use a function as a parameter to a stored procedure in postgres?

I created a function in Postgres that returns a table with a varying number of rows.
NewProducts() is another function.
CREATE FUNCTION RetrieveRebates(customerID INT)
RETURNS TABLE(
   customer INT,
   reb NUMERIC,
   prod INT
)
LANGUAGE SQL
AS $$
   SELECT rebate_default.customer_id AS customer
        , rebate_default.rebate_group AS reb
        , NewProducts.product_id AS prod
   FROM NewProducts(customerID)
   JOIN rebate_default ON NewProducts.group = rebate_default.group
   WHERE rebate_default.customer_id = customerID;
$$;
I want to insert all the rows from this table into another table.
CREATE PROCEDURE InsertData(SELECT * FROM RetrieveRebates(customerID))
LANGUAGE SQL
BEGIN ATOMIC
INSERT INTO table3 VALUES (customer);
INSERT INTO table3 VALUES (prod);
INSERT INTO table3 VALUES (reb);
END;
How can I use a table as a parameter of a stored procedure and add all rows from this table parameter to another table?
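A minimal sketch of one way to do it (not from the original thread): you don't pass the table itself; you pass the scalar parameter and do a set-based INSERT ... SELECT inside the procedure. This assumes table3 has columns matching customer, reb and prod, and Postgres 14+ for the BEGIN ATOMIC body:
CREATE PROCEDURE InsertData(customerID INT)
LANGUAGE SQL
BEGIN ATOMIC
   INSERT INTO table3 (customer, reb, prod)  -- assumed column names
   SELECT customer, reb, prod
   FROM RetrieveRebates(customerID);
END;
Call: CALL InsertData(42);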

Postgresql: Partitioning a table for values in SELECT

I want to use PostgreSQL's declarative table partitioning on a column with UUID values. I want it partitioned by values that are returned by a SELECT.
CREATE TABLE foo (
id uuid NOT NULL,
type_id uuid NOT NULL,
-- other columns
PRIMARY KEY (id, type_id)
) PARTITION BY LIST (type_id);
CREATE TABLE foo_1 PARTITION OF foo
FOR VALUES IN (SELECT id FROM type_ids WHERE type_name = 'type1');
CREATE TABLE foo_2 PARTITION OF foo
FOR VALUES IN (SELECT id FROM type_ids WHERE type_name = 'type2');
I don't want to use specific UUIDs in FOR VALUES IN ('uuid'), since the UUIDs may differ by environment (dev, qa, prod). However, the SELECT syntax doesn't seem to be accepted by Postgres. Any suggestions?
I just wanted the SELECT to be evaluated at the table creation time
You should have made that clear in the question, not in a comment.
In that case - if this is a one-time thing - you can use dynamic SQL, e.g. wrapped into a procedure:
create procedure create_partition(p_part_name text, p_type_name text)
as
$$
declare
   l_sql text;
   l_id uuid;
begin
   select id
   into l_id
   from type_ids
   where type_name = p_type_name;

   l_sql := format('CREATE TABLE %I PARTITION OF foo FOR VALUES IN (%L)', p_part_name, l_id);
   execute l_sql;
end;
$$
language plpgsql;
Then you can do:
call create_partition('foo_1', 'type1');
call create_partition('foo_2', 'type2');
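To verify what was created, a quick catalog check works (a side note, not part of the original answer):
SELECT inhrelid::regclass AS partition
FROM pg_inherits
WHERE inhparent = 'foo'::regclass;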

Filter bigint values on insert Postgresql

I have 2 tables in PostgreSQL with the same schema; the only difference is that in one of the tables the id field is of type bigint. The schema of the table I need to fill with data looks like this:
create table test_int_table(
id int,
description text,
hash_code int
);
I need to copy the data from test_table (which has a bigint id) to public.test_int_table, and the values that fall outside the int range should be filtered out. How can I track those values without hardcoding the range?
I can do something like this, but I would like to build a more generic solution:
insert into test_int_table
select * from test_table as test
where test.id not between 2147483647 and 9223372036854775808
By generic I mean without constraints on the column names or their number, so that, if I have multiple columns of type bigint in other tables, I can filter all of their values generically (without specifying column names).
There is no generic solution, as far as I can tell.
But I would write it as
INSERT INTO test_int_table
SELECT *
FROM test_table AS t
WHERE t.id BETWEEN -2147483648 AND 2147483647;  -- the full int4 range
You can do something like this if you want to track them:
Create a function like this:
CREATE OR REPLACE FUNCTION convert_to_integer(v_input bigint)
RETURNS INTEGER AS $$
DECLARE
   v_int_value INTEGER DEFAULT NULL;
BEGIN
   BEGIN
      v_int_value := v_input::INTEGER;
   EXCEPTION WHEN OTHERS THEN
      RAISE NOTICE 'Invalid integer value: "%". Returning NULL.', v_input;
      RETURN NULL;
   END;
   RETURN v_int_value;
END;
$$ LANGUAGE plpgsql;
and write a query like this:
INSERT INTO test_int_table SELECT * FROM test_table AS t WHERE convert_to_integer(t.id) is not null;
Or you can modify the function to return 0 instead of NULL.
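That variant could look like this (a sketch, with a hypothetical name convert_to_integer_or_zero; out-of-range rows are then inserted with 0 instead of being skipped):
CREATE OR REPLACE FUNCTION convert_to_integer_or_zero(v_input bigint)
RETURNS INTEGER AS $$
BEGIN
   RETURN v_input::INTEGER;
EXCEPTION WHEN OTHERS THEN
   RAISE NOTICE 'Invalid integer value: "%". Returning 0.', v_input;
   RETURN 0;
END;
$$ LANGUAGE plpgsql;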

Is it possible to write a postgres function that will handle a many to many join?

I have a jobs table. I have an industries table. Jobs and industries have a many-to-many relationship via a join table called industriesjobs. Both tables have uuid as their primary key. My question is twofold. First, is it feasible to write two functions to insert data like this? If it is, my second question is: how do I express an array of the uuid column type? I'm unsure of the syntax.
CREATE OR REPLACE FUNCTION linkJobToIndustries(jobId uuid, industriesId uuid[]) RETURNS void AS $$
DECLARE
   industryId uuid;
BEGIN
   FOREACH industryId IN ARRAY industriesId LOOP
      INSERT INTO industriesjobs (industry_id, job_id) VALUES (industryId, jobId);
   END LOOP;
   RETURN;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION insertJobWithIndustries(orginsation varchar, title varchar, addressId uuid, industryIds uuid[]) RETURNS uuid AS $$
DECLARE
   jobId uuid;
BEGIN
   INSERT INTO jobs ("organisation", "title", "address_id") VALUES (orginsation, title, addressId) RETURNING id INTO jobId;
   PERFORM linkJobToIndustries(jobId, industryIds);
   RETURN jobId;
END;
$$ LANGUAGE plpgsql;
SELECT * FROM insertJobWithIndustries(
'Acme Inc'::varchar,
'Bomb Tester'::varchar,
'0030cfb3-1a03-4c5a-9afa-6b69376abe2e',
{ 19c2e0ee-acd5-48b2-9fac-077ad4d49b19, 21f8ffb7-e155-4c8f-acf0-9e991325784, 28c18acd-99ba-46ac-a2dc-59ce952eecf2 }
);
Thanks in advance.
Key to a solution is the function unnest(), which will (per documentation):
expand an array to a set of rows
And a data-modifying CTE.
A simple query does the job:
WITH ins_job AS (
   INSERT INTO jobs (organisation, title, address_id)
   SELECT 'Acme Inc', 'Bomb Tester', '0030cfb3-1a03-4c5a-9afa-6b69376abe2e'  -- job data here
   RETURNING id
   )
INSERT INTO industriesjobs (industry_id, job_id)
SELECT indid, id
FROM   ins_job i  -- that's a single row, so a CROSS JOIN is OK
     , unnest('{19c2e0ee-acd5-48b2-9fac-077ad4d49b19
              , 21f8ffb7-e155-4c8f-acf0-9e9913257845
              , 28c18acd-99ba-46ac-a2dc-59ce952eecf2}'::uuid[]) indid;  -- industry IDs here
Also demonstrating proper syntax for an array of uuid. (White space between elements and separators is irrelevant while not inside double-quotes.)
One of your UUIDs was one character short:
21f8ffb7-e155-4c8f-acf0-9e991325784
Must be something like:
21f8ffb7-e155-4c8f-acf0-9e9913257845 -- one more character
If you need functions, you do that, too:
CREATE OR REPLACE FUNCTION link_job_to_industries(_jobid uuid, _indids uuid[])
  RETURNS void AS
$func$
   INSERT INTO industriesjobs (industry_id, job_id)
   SELECT _indid, _jobid
   FROM unnest(_indids) _indid;
$func$ LANGUAGE sql;
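A call could look like this (IDs purely illustrative):
SELECT link_job_to_industries(
   '0030cfb3-1a03-4c5a-9afa-6b69376abe2e'  -- job id (hypothetical)
 , '{19c2e0ee-acd5-48b2-9fac-077ad4d49b19
   , 28c18acd-99ba-46ac-a2dc-59ce952eecf2}'::uuid[]);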
Etc.
Related:
Insert data in 3 tables at a time using Postgres
How to insert multiple rows using a function in PostgreSQL

Execute a dynamic crosstab query

I implemented this function in my Postgres database: http://www.cureffi.org/2013/03/19/automatically-creating-pivot-table-column-names-in-postgresql/
Here's the function:
create or replace function xtab (tablename varchar, rowc varchar, colc varchar, cellc varchar, celldatatype varchar) returns varchar language plpgsql as $$
declare
   dynsql1 varchar;
   dynsql2 varchar;
   columnlist varchar;
begin
   -- 1. retrieve list of column names.
   dynsql1 = 'select string_agg(distinct '||colc||'||'' '||celldatatype||''','','' order by '||colc||'||'' '||celldatatype||''') from '||tablename||';';
   execute dynsql1 into columnlist;
   -- 2. set up the crosstab query
   dynsql2 = 'select * from crosstab (
    ''select '||rowc||','||colc||','||cellc||' from '||tablename||' group by 1,2 order by 1,2'',
    ''select distinct '||colc||' from '||tablename||' order by 1''
    )
    as ct (
    '||rowc||' varchar,'||columnlist||'
    );';
   return dynsql2;
end
$$;
So now I can call the function:
select xtab('globalpayments','month','currency','(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)','text');
Which returns (because the return type of the function is varchar):
select * from crosstab (
'select month,currency,(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)
from globalpayments
group by 1,2
order by 1,2'
, 'select distinct currency
from globalpayments
order by 1'
) as ct ( month varchar,CAD text,EUR text,GBP text,USD text );
How can I get this function to not only generate the code for the dynamic crosstab, but also execute it? I.e., it should directly return the result that I currently get when I manually copy, paste and execute the generated query. The function shall assemble the dynamic query and execute it, without that extra step.
Edit 1
This function comes close, but I need it to return more than just the first column of the first record.
Taken from: Are there any way to execute a query inside the string value (like eval) in PostgreSQL?
create or replace function eval(sql text) returns text as $$
declare
   as_txt text;
begin
   if sql is null then return null; end if;
   execute sql into as_txt;
   return as_txt;
end;
$$ language plpgsql;
usage: select * from eval($$select * from analytics limit 1$$)
However, it just returns the first column of the first record:
eval
----
2015
when the actual result looks like this:
 Year | Month |    Date    | TPV_USD
------+-------+------------+---------
 2016 |     3 | 2016-03-31 |  100000
What you ask for is impossible. SQL is a strictly typed language. PostgreSQL functions need to declare a return type (RETURNS ..) at the time of creation.
A limited way around this is with polymorphic functions, if you can provide the return type at the time of the function call. But that's not evident from your question.
Refactor a PL/pgSQL function to return the output of various SELECT queries
You can return a completely dynamic result with anonymous records. But then you are required to provide a column definition list with every call. And how do you know about the returned columns? Catch 22.
There are various workarounds, depending on what you need or can work with. Since all your data columns seem to share the same data type, I suggest returning an array: text[]. Or you could return a document type like hstore or json (see the jsonb sketch after the related links below). Related:
Dynamic alternative to pivot with CASE and GROUP BY
Dynamically convert hstore keys into columns for an unknown set of keys
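A jsonb-based sketch of that idea (an assumption, not code from the original answer): wrap any query so each row comes back as one jsonb value; the declared return type stays fixed while the columns remain dynamic:
CREATE OR REPLACE FUNCTION eval_to_jsonb(_sql text)
  RETURNS SETOF jsonb
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- _sql is executed as-is; trusted input only!
   RETURN QUERY EXECUTE format('SELECT to_jsonb(x) FROM (%s) x', _sql);
END
$func$;
Usage: SELECT * FROM eval_to_jsonb('SELECT * FROM analytics LIMIT 1');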
But it might be simpler to just use two calls: 1. Let Postgres build the query. 2. Execute it and retrieve the returned rows. (See the psql shortcut after the related link below.)
Selecting multiple max() values using a single SQL statement
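In psql, those two steps can even be chained with \gexec, which executes each result row as a new statement (a client-side convenience of psql 9.6+, not part of the original answer):
SELECT xtab('globalpayments', 'month', 'currency'
          , '(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)', 'text')\gexec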
I would not use the function from Eric Minikel as presented in your question at all. It is not safe against SQL injection by way of maliciously malformed identifiers. Use format() to build query strings unless you are running an outdated version older than Postgres 9.1.
A shorter and cleaner implementation could look like this:
CREATE OR REPLACE FUNCTION xtab(_tbl regclass, _row text, _cat text
                              , _expr text  -- still vulnerable to SQL injection!
                              , _type regtype)
  RETURNS text
  LANGUAGE plpgsql AS
$func$
DECLARE
   _cat_list text;
   _col_list text;
BEGIN
   -- generate categories for xtab param and col definition list
   EXECUTE format(
      $$SELECT string_agg(quote_literal(x.cat), '), (')
             , string_agg(quote_ident(x.cat), %L)
        FROM  (SELECT DISTINCT %I AS cat FROM %s ORDER BY 1) x$$
    , ' ' || _type || ', ', _cat, _tbl)
   INTO _cat_list, _col_list;

   -- generate query string
   RETURN format(
     'SELECT * FROM crosstab(
       $q$SELECT %I, %I, %s
          FROM   %s
          GROUP  BY 1, 2  -- only works if the 3rd column is an aggregate expression
          ORDER  BY 1, 2$q$
     , $c$VALUES (%5$s)$c$
      ) ct(%1$I text, %6$s %7$s)'
    , _row, _cat, _expr  -- expr must be an aggregate expression!
    , _tbl, _cat_list, _col_list, _type);
END
$func$;
Same function call as your original version. The function crosstab() is provided by the additional module tablefunc which has to be installed. Basics:
PostgreSQL Crosstab Query
This handles column and table names safely. Note the use of object identifier types regclass and regtype. Also works for schema-qualified names.
Table name as a PostgreSQL function parameter
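For example (illustrative one-liners):
SELECT 'public.globalpayments'::regclass;  -- resolves the name, errors out if the table does not exist
SELECT 'text'::regtype;                    -- same idea for types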
However, it is not completely safe while you pass a string to be executed as expression (_expr - cellc in your original query). This kind of input is inherently unsafe against SQL injection and should never be exposed to the general public.
SQL injection in Postgres functions vs prepared queries
Scans the table only once for both lists of categories and should be a bit faster.
Still can't return completely dynamic row types since that's strictly not possible.
Not quite impossible: you can still execute it from a query, by executing the string and returning SETOF RECORD.
Then you have to specify the return record format at call time. The reason is that the planner needs to know the return format before it can make certain decisions (materialization comes to mind).
So in this case you would EXECUTE the query and return the rows as SETOF RECORD.
For example, we could do something like this with a wrapper function but the same logic could be folded into your function:
CREATE OR REPLACE FUNCTION crosstab_wrapper
  (tablename varchar, rowc varchar, colc varchar,
   cellc varchar, celldatatype varchar)
returns setof record language plpgsql as $$
DECLARE
   outrow record;
BEGIN
   FOR outrow IN EXECUTE xtab($1, $2, $3, $4, $5)
   LOOP
      RETURN NEXT outrow;
   END LOOP;
END;
$$;
Then, when you call the function, you supply the record structure (as (col1 type, col2 type, ...)), just like you do with crosstab() or connectby().
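A hypothetical call, reusing the column list generated earlier:
SELECT *
FROM crosstab_wrapper('globalpayments', 'month', 'currency'
                    , '(sum(total_fees)/sum(txn_amount)*100)::decimal(48,2)', 'text')
  AS ct (month varchar, CAD text, EUR text, GBP text, USD text);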