PL/PGSQL Operator does not exist: information_schema.sql_identifier - postgresql

I'm writing a function to write a dynamic query.
This is the original query without function
SELECT b.column_name, a.default_flag, CAST(AVG(a.payment_ratio) AS NUMERIC), CAST(MAX(a.payment_ratio) AS NUMERIC) FROM user_joined a, information_schema.columns b where b.column_name = 'payment_ratio' group by a.default_flag, b.column_name
Then, I put it into a function like this
CREATE OR REPLACE FUNCTION test4(col text)
RETURNS TABLE(
col_name TEXT,
default_flag bigint,
average NUMERIC,
maximum NUMERIC) AS $$
BEGIN
RETURN QUERY EXECUTE FORMAT
('SELECT CAST(b.column_name AS TEXT), a.default_flag, CAST(AVG(a.'||col||') AS NUMERIC), CAST(MAX(a.'||col||') AS NUMERIC) FROM user_joined a, information_schema.columns b where b.column_name = %I group by a.default_flag, b.column_name', col);
END; $$
LANGUAGE PLPGSQL;
When I try to run
SELECT * FROM test4('payment_ratio')
I get this error
ERROR: operator does not exist: information_schema.sql_identifier = double precision
LINE 1: ... information_schema.columns b where b.column_name = payment_...
Is there anything wrong with my function?

The columns in information_schema have the (somewhat strange) data type sql_identifier and that can't be compared directly to a text value. You need to cast it in the SQL query.
You are also using the %I incorrectly. In the join condition the column name is a string constant so you need to use %L there. In the SELECT list, it's an identifier, so you need to use %I there.
CREATE OR REPLACE FUNCTION test4(col text)
RETURNS TABLE(
col_name TEXT,
default_flag bigint,
average NUMERIC,
maximum NUMERIC) AS $$
BEGIN
RETURN QUERY EXECUTE
FORMAT ('SELECT CAST(b.column_name AS TEXT),
a.default_flag, CAST(AVG(a.%I) AS NUMERIC),
CAST(MAX(a.'||col||') AS NUMERIC)
FROM user_joined a
JOIN information_schema.columns b ON b.column_name::text = %L
group by a.default_flag, b.column_name', col, col);
END; $$
LANGUAGE PLPGSQL;

Related

PostgreSQL: function to display columns in alphabetical order

On PostgreSQL, I need to see the table's columns in alphabetical order, so I'm using the query:
SELECT column_name, data_type FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'organizations' ORDER BY column_name ASC;
I use it a lot every day, so I want to create a function:
CREATE OR REPLACE FUNCTION seecols(table_name text)
RETURNS TABLE (column_name varchar, data_type varchar)
AS $func$
DECLARE
_query varchar;
BEGIN
-- Displays columns by alphabetic order
_query := 'SELECT column_name, data_type FROM information_schema.columns WHERE table_name = '''||table_name||''' ';
RETURN QUERY EXECUTE _query;
END;
$func$ LANGUAGE plpgsql;
But when I try:
SELECT seecols('organizations');
I'm getting:
**structure of query does not match function result type**
I guess the line "RETURNS TABLE (column_name varchar, data_type varchar)" is wrongly defined. But since this is my first time using plpgsql, I don't know how to make it more dynamic.
You don't need neither dynamic sql nor plpgsql here. Just embed your sql query into a sql function :
CREATE OR REPLACE FUNCTION seecols (IN t_name text, OUT column_name varchar, OUT data_type varchar)
RETURNS setof record LANGUAGE sql AS $$
SELECT column_name, data_type
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = t_name
ORDER BY column_name ASC ;
$$ ;
see dbfiddle

How to use text input as column name(s) in a Postgres function?

I'm working with Postgres and PostGIS. Trying to write a function that that selects specific columns according to the given argument.
I'm using a WITH statement to create the result table before converting it to bytea to return.
The part I need help with is the $4 part. I tried it is demonstrated below and $4::text and both give me back the text value of the input and not the column value in the table if cols=name so I get back from the query name and not the actual names in the table. I also try data($4) and got type error.
The code is like this:
CREATE OR REPLACE FUNCTION select_by_txt(z integer,x integer,y integer, cols text)
RETURNS bytea
LANGUAGE 'plpgsql'
AS $BODY$
declare
res bytea;
begin
WITH bounds AS (
SELECT ST_TileEnvelope(z, x, y) AS geom
),
mvtgeom AS (
SELECT ST_AsMVTGeom(ST_Transform(t.geom, 3857), bounds.geom) AS geom, $4
FROM table1 t, bounds
WHERE ST_Intersects(t.geom, ST_Transform(bounds.geom, 4326))
)
SELECT ST_AsMVT(mvtgeom, 'public.select_by_txt')
INTO res
FROM mvtgeom;
RETURN res;
end;
$BODY$;
Example for calling the function:
select_by_txt(10,32,33,"col1,col2")
The argument cols can be multiple column names from 1 and not limited from above. The names of the columns inside cols will be checked before calling the function that they are valid columns.
Passing multiple column names as concatenated string for dynamic execution urgently requires decontamination. I suggest a VARIADIC function parameter instead, with properly quoted identifiers (using quote_ident() in this case):
CREATE OR REPLACE FUNCTION select_by_txt(z int, x int, y int, VARIADIC cols text[] = NULL, OUT res text)
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE format(
$$
SELECT ST_AsMVT(mvtgeom, 'public.select_by_txt')
FROM (
SELECT ST_AsMVTGeom(ST_Transform(t.geom, 3857), bounds.geom) AS geom%s
FROM table1 t
JOIN (SELECT ST_TileEnvelope($1, $2, $3)) AS bounds(geom)
ON ST_Intersects(t.geom, ST_Transform(bounds.geom, 4326))
) mvtgeom
$$, (SELECT ', ' || string_agg(quote_ident (col), ', ') FROM unnest(cols) col)
)
INTO res
USING z, x, y;
END
$func$;
db<>fiddle here
The format specifier %I for format() deals with a single identifier. You have to put in more work for multiple identifiers, especially for a variable number of 0-n identifiers. This implementation quotes every single column name, and only add a , if any column names have been passed. So it works for every possible input, even no input at all. Note VARIADIC cols text[] = NULL as last input parameter with NULL as default value:
Optional argument in PL/pgSQL function
Related:
quote_ident() does not add quotes to column name "first"
Column names are case sensitive in this context!
Call for your example (important!):
SELECT select_by_txt(10,32,33,'col1', 'col2');
Alternative syntax:
SELECT select_by_txt(10,32,33, VARIADIC '{col1,col2}');
More revealing call, with a third column name and malicious (though futile) intent:
SELECT select_by_txt(10,32,33,'col1', 'col2', $$col3'); DROP TABLE table1;--$$);
About that odd third column name and SQL injection:
https://www.explainxkcd.com/wiki/index.php/Little_Bobby_Tables
About VAIRADIC parameters:
Return rows matching elements of input array in plpgsql function
Pass multiple values in single parameter
Using an OUT parameter for simplicity. That's totally optional. See:
Returning from a function with OUT parameter
What I would not do
If you really, really trust the input to be a properly formatted list of 1 or more valid column names at all times - and you asserted that ...
the names of the columns inside cols will be checked before calling the function that they are valid columns
You could simplify:
CREATE OR REPLACE FUNCTION select_by_txt(z int, x int, y int, cols text, OUT res text)
LANGUAGE plpgsql AS
$func$
BEGIN
EXECUTE format(
$$
SELECT ST_AsMVT(mvtgeom, 'public.select_by_txt')
FROM (
SELECT ST_AsMVTGeom(ST_Transform(t.geom, 3857), bounds.geom) AS geom, %s
FROM table1 t
JOIN (SELECT ST_TileEnvelope($1, $2, $3)) AS bounds(geom)
ON ST_Intersects(t.geom, ST_Transform(bounds.geom, 4326))
) mvtgeom
$$, cols
)
INTO res
USING z, x, y;
END
$func$;
(How can you be so sure that the input will always be reliable?)
You would need to use a dynamic query:
CREATE OR REPLACE FUNCTION select_by_txt(z integer,x integer,y integer, cols text)
RETURNS bytea
LANGUAGE 'plpgsql'
AS $BODY$
declare
res bytea;
begin
EXECUTE format('
WITH bounds AS (
SELECT ST_TileEnvelope($1, $2, $3) AS geom
),
mvtgeom AS (
SELECT ST_AsMVTGeom(ST_Transform(t.geom, 3857), bounds.geom) AS geom, %I
FROM table1 t, bounds
WHERE ST_Intersects(t.geom, ST_Transform(bounds.geom, 4326))
)
SELECT ST_AsMVT(mvtgeom, ''public.select_by_txt'')
FROM mvtgeom', cols)
INTO res
USING z,x,y;
RETURN res;
end;
$BODY$;

PL/pgSQL: structure of query does not match function result type

I have a table
create table abac_object (
inbound abac_attribute,
outbound abac_attribute,
created_at timestamptz not null default now(),
primary key (inbound, outbound)
);
abac_attribute is my custom type that is defined like
create type abac_attribute as (
value text,
key text,
namespace_id uuid
);
I try to create a number of functions that work over the table
create function abac_object_list_1(_attr abac_attribute)
returns setof abac_object as $$
select *
from abac_object
where outbound = _attr
$$ language sql stable;
create function abac_object_list_2(_attr1 abac_attribute, _attr2 abac_attribute)
returns setof abac_object as $$
select t1.*
from abac_object as t1
inner join abac_object as t2 on t1.inbound = t2.inbound
where
t1.outbound = _attr1
and t2.outbound = _attr2
$$ language sql stable;
create function abac_object_list_3(_attr1 abac_attribute, _attr2 abac_attribute, _attr3 abac_attribute)
returns setof abac_object as $$
select t1.*
from abac_object as t1
inner join abac_object as t2 on t1.inbound = t2.inbound
inner join abac_object as t3 on t1.inbound = t3.inbound
where
t1.outbound = _attr1
and t2.outbound = _attr2
and t3.outbound = _attr3
$$ language sql stable;
create function abac_object_list(_attrs abac_attribute[], _offset integer, _limit integer)
returns setof abac_attribute as $$
begin
case array_length(_attrs, 1)
when 1 then return query
select t.inbound
from abac_object_list_1(_attrs[1]) as t
order by t.created_at
offset _offset
limit _limit;
when 2 then return query
select t.inbound
from abac_object_list_2(_attrs[1], _attrs[2]) as t
order by t.created_at
offset _offset
limit _limit;
when 3 then return query
select t.inbound
from abac_object_list_3(_attrs[1], _attrs[2], _attrs[3]) as t
order by t.created_at
offset _offset
limit _limit;
else raise exception 'bad argument' using detail = 'array length should be less or equal to 3';
end case;
end
$$ language plpgsql stable;
abac_object_list_1 works fine
select *
from abac_object_list_1(('apple', 'type', '00000000-0000-1000-a000-000000000010') ::abac_attribute);
However with abac_object_list I get the following error
select *
from abac_object_list(array [('apple', 'type', '00000000-0000-1000-a000-000000000010')] :: abac_attribute[], 0, 100);
[42804] ERROR: structure of query does not match function result type Detail: Returned type abac_attribute does not match expected type text in column 1. Where: PL/pgSQL function abac_object_list(abac_attribute[],integer,integer) line 4 at RETURN QUERY
I can't understand where text type comes from.
These functions worked before but then I added created_at column to be able to add explicit ORDER BY clause. And now I can't rewrite functions correctly.
Also maybe there is a better (or more idiomatic) way to define these function.
Could you please help me to figure it out?
Thank you.
P. S.
Well, it seems to work if I write
return query select (t.inbound).*
But I am not sure if it is a right approach.
Also I wonder what is better to use return table or return setof.

plpgsql return column name and column stats as a table

I am trying to use pgplpsql in a postgres database to:
- loop through all columns in a schema, and for every column that is double precision I want to return a table containing fields 'table name, column name, min value, max value, mean value, median value.
-So far I have managed to return the min, max, mean of these fields - but not in a table despite defining the table in the 'returns statement'.
Question:
How do I return a table properly
How do I include the column name and table name for the columns within this table? I have tried many things with a full range of error messages.
Becky
DROP FUNCTION household.numeric_stats(schemanm text);
CREATE OR REPLACE FUNCTION household.numeric_stats(schemanm text)
returns table(min double precision, max double precision, avg double precision)as $$
DECLARE
cname text;
tname text;
BEGIN
for cname,tname in SELECT column_name::text col,table_name::text tble FROM information_schema.columns
where table_schema = schemanm and data_type in ('double precision')
and table_name::text not in ('ap_household','derived_forest_income', 'derived_product_income','derivedproduct_income','view_income_overview_by_household')
LOOP
RAISE NOTICE 'cname is: % from %', cname, tname;
return query
execute format('select min(%I), max(%I), avg(%I) from %I.%I where %I != ''NaN''', cname, cname, cname, schemanm, tname, cname);
END;
$$
LANGUAGE plpgsql;
Would work like this:
CREATE OR REPLACE FUNCTION household.numeric_stats(schemanm text)
RETURNS TABLE(tname text, cname text
, min float8, max float8
, avg float8, median float8) AS
$func$
BEGIN
FOR tname, cname IN
SELECT table_name::text, column_name::text
FROM information_schema.columns
WHERE table_schema = schemanm
AND data_type = 'double precision'
AND table_name <> ALL ('{ap_household,derived_forest_income, derived_product_income,derivedproduct_income,view_income_overview_by_household}'::varchar[])
LOOP
-- RAISE NOTICE 'tname: %, cname: %', tname, cname;
RETURN QUERY EXECUTE format(
$f$SELECT $1, $2, min(%1$I), max(%1$I), avg(%1$I), median(%1$I)
FROM %2$I.%3$I
WHERE %1$I <> 'NaN'$f$, cname, schemanm, tname)
USING tname, cname;
END LOOP;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM household.numeric_stats('myschema');
You need to create the median aggregate functions before you can use it. Instructions in the Postgres Wiki:
https://wiki.postgresql.org/wiki/Aggregate_Median

Merge multiple result tables and perform final query on result

I have a function returning table, which accumulates output of multiple calls to another function returning table. I would like to perform final query on built table before returning result. Currently I implemented this as two functions, one accumulating and one performing final query, which is ugly:
CREATE OR REPLACE FUNCTION func_accu(LOCATION_ID INTEGER, SCHEMA_CUSTOMER TEXT)
RETURNS TABLE("networkid" integer, "count" bigint) AS $$
DECLARE
GATEWAY_ID integer;
BEGIN
FOR GATEWAY_ID IN
execute format(
'SELECT id FROM %1$I.gateway WHERE location_id=%2$L'
, SCHEMA_CUSTOMER, LOCATION_ID)
LOOP
RETURN QUERY execute format(
'SELECT * FROM get_available_networks_gw(%1$L, %2$L)'
, GATEWAY_ID, SCHEMA_CUSTOMER);
END LOOP;
END;
$$ LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION func_query(LOCATION_ID INTEGER, SCHEMA_CUSTOMER TEXT)
RETURNS TABLE("networkid" integer, "count" bigint) AS $$
DECLARE
BEGIN
RETURN QUERY execute format('
SELECT networkid, max(count) FROM func_accu(%2$L, %1$L) GROUP BY networkid;'
, SCHEMA_CUSTOMER, LOCATION_ID);
END;
$$ LANGUAGE plpgsql;
How can this be done in single function, elegantly?
Both functions simplified and merged, also supplying value parameters in the USING clause:
CREATE OR REPLACE FUNCTION pg_temp.func_accu(_location_id integer, schema_customer text)
RETURNS TABLE(networkid integer, count bigint) AS
$func$
BEGIN
RETURN QUERY EXECUTE format('
SELECT f.networkid, max(f.ct)
FROM %I.gateway g
, get_available_networks_gw(g.id, $1) f(networkid, ct)
WHERE g.location_id = $2
GROUP BY 1'
, _schema_customer)
USING _schema_customer, _location_id;
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM func_accu(123, 'my_schema');
Related:
Dynamically access column value in record
I am using alias names for the columns returned by the function (f(networkid, ct)) to be sure because you did not disclose the return type of get_available_networks_gw(). You can use the column names of the return type directly.
The comma (,) in the FROM clause is short syntax for CROSS JOIN LATERAL .... Requires Postgres 9.3 or later.
What is the difference between LATERAL and a subquery in PostgreSQL?
Or you could run this query instead of the function:
SELECT f.networkid, max(f.ct)
FROM myschema.gateway g, get_available_networks_gw(g.id, 'my_schema') f(networkid, ct)
WHERE g.location_id = $2
GROUP BY 1;