I am working in an existing Postgres 14 database that has the following operators and functions in place. They augment an '=' or 'LIKE' query to apply 'lower()' to both sides (thus allowing the query to take advantage of indexes), which lets us do case-insensitive queries without having to include 'lower()' in the query directly.
-- ------------ CREATE FUNCTIONS -----------
CREATE OR REPLACE FUNCTION ci_caseinsmatch(varchar, varchar) RETURNS boolean
AS $$
SELECT LOWER($1)::text = LOWER($2)::text;
$$
LANGUAGE sql
IMMUTABLE STRICT;
CREATE FUNCTION ci_like(varchar, varchar) RETURNS boolean
AS $$
SELECT LOWER($1)::text LIKE LOWER($2)::text;
$$
LANGUAGE sql;
-- ------------ CREATE OPERATORS -----------
CREATE OPERATOR = (
PROCEDURE = ci_caseinsmatch,
LEFTARG = varchar,
RIGHTARG = varchar,
COMMUTATOR = =,
NEGATOR = <>
);
CREATE OPERATOR ~~ (
PROCEDURE = ci_like,
LEFTARG = varchar,
RIGHTARG = varchar,
RESTRICT = likesel,
JOIN = likejoinsel
);
(That code courtesy of postgresonline)
With the proper indexes in place, this gets us the ability to do fast case-insensitive searches like
select * from packages where a_varchar_column='Bd1261d5-6e47-481f-a7e2-6f54c88dd8eb';
or
select * from packages where a_varchar_column LIKE 'Bd1261d5-6e47-481f-a7e2-6f54c88dd8eb';
any time we're comparing VARCHARs, without worrying about the casing of characters or having to inject lower() into the queries manually.
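For reference, the "proper indexes" here are expression indexes on lower(column); something along these lines, where the index name is only illustrative:
-- Hypothetical expression index; the inlined lower() comparison can use it.
CREATE INDEX packages_a_varchar_column_lower_idx ON packages (LOWER(a_varchar_column));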
Now I want to extend that capability to IN queries, like this
select * from packages where a_varchar_column IN ('b8fec092-fdea-4c1f-bf30-7f4124da660e', 'ae7a78f3-d419-4361-95c9-9254f1c76da2')
But IN (which Postgres translates to = ANY over an array) is not an operator, so I can't CREATE OPERATOR IN, nor an operator for = ANY.
So the underlying question is: can I interject a function any time an IN / = ANY construct is used, and how do I set that up?
In this case it happens to be desired whenever we have a varchar on the left that we want forced through lower(), but that part is already handled by the function.
I know for certain that if my app were using lower() like this
select * from packages where lower(a_varchar_column) IN ('b8fec092-fdea-4c1f-bf30-7f4124da660e', 'ae7a78f3-d419-4361-95c9-9254f1c76da2')
that the query is fast and takes advantage of our indexes. But I'm trying to have the database do that on the application's behalf.
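For reference, the rewrite mentioned above can be spelled out explicitly: Postgres parses an IN list of constants into = ANY over an array, so the IN query above is equivalent to this (same table and values):
SELECT *
FROM   packages
WHERE  a_varchar_column = ANY (ARRAY['b8fec092-fdea-4c1f-bf30-7f4124da660e',
                                     'ae7a78f3-d419-4361-95c9-9254f1c76da2']);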
Goal
I'm trying to make PostgreSQL do something like function dispatch, but I'm open to any other solution for the problem that lets me keep the function I'm calling a SQL function (versus PL/PGSQL) because I want it to be inlined.
Suppose I have two functions like:
create or replace function is_in_view_one(p people) returns boolean as $$
select p.state = ANY(ARRAY['TX', 'NY', 'CA'])
$$ language sql strict stable;
And:
create or replace function is_in_view_two(p people) returns boolean as $$
select p.state = ANY(ARRAY['NV', 'FL', 'MT'])
$$ language sql strict stable;
I want to be able to write some code, or adapt the above functions, so I can write:
select count(*)
from people
where is_in_view(people, 'one');
And I want is_in_view to be fully inline-able according to these criteria: https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions
Attempted Solution via Domains
I've tried to set up a solution using domains as function identifiers, and although it doesn't work, I think someone more knowledgeable about PostgreSQL types, casts, and function identification might know how to hack it.
I tried to do:
create domain view_one_id as uuid check (value = 'ed744964-6561-11eb-878e-c7ad77d3260a');
create domain view_two_id as uuid check (value = 'fa9fe0f8-6561-11eb-878e-c79c81b46d0c');
create or replace function say_n(v view_one_id) returns integer as $$
select 1
$$ language sql strict;
create or replace function say_n(v view_two_id) returns integer as $$
select 2
$$ language sql strict;
Hoping that I could then do:
select say_n('ed744964-6561-11eb-878e-c7ad77d3260a'); -- expect 1
select say_n('fa9fe0f8-6561-11eb-878e-c79c81b46d0c'); -- expect 2
But instead I get:
=# select say_n('fa9fe0f8-6561-11eb-878e-c79c81b46d0c');
ERROR: function say_n(unknown) is not unique
LINE 1: select say_n('fa9fe0f8-6561-11eb-878e-c79c81b46d0c');
^
HINT: Could not choose a best candidate function. You might need to add explicit type casts.
I can do:
=# select say_n('fa9fe0f8-6561-11eb-878e-c79c81b46d0c'::view_two_id);
say_n
-------
2
(1 row)
But in order to integrate with external tooling that only knows how to call functions (and cannot also supply an explicit type cast), I'm holding out hope for a solution that doesn't require modifying that tooling.
Happy to entertain alternatives! I feel like this solution might be possible by fiddling with the CAST or something, however.
What about using a single function with CASE:
create or replace function is_in_view(p people, which text) returns boolean as $$
  select p.state = ANY (CASE WHEN $2 = 'one'
                             THEN ARRAY['TX', 'NY', 'CA']
                             ELSE ARRAY['NV', 'FL', 'MT']
                        END)
$$ language sql strict stable;
But if you want efficiency, you would be better off with the two functions, because the above cannot use an index.
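If inlining is the concern, one way to check is EXPLAIN: when a SQL function is inlined, the expanded expression shows up in the plan's filter instead of a call to is_in_view(). A sketch, reusing the names from the question:
EXPLAIN (COSTS OFF)
SELECT count(*)
FROM   people
WHERE  is_in_view(people, 'one');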
I'm trying to build a parametrized view using a postgres function:
CREATE FUNCTION schemaB.testFunc(p INT)
RETURNS TABLE
AS
RETURN (SELECT * FROM schemaZ.mainTable WHERE id=p)
The problem is always the same:
SQL Error [42601]: ERROR: syntax error at or near "AS"
Any idea on what could I be doing wrong?
You need to specify the columns of the "return table"; this can either be done using
returns table(col_1 integer, col_2 text, ...)
In your case you are returning only rows of one table, so it's easier to use
returns setof maintable
As documented in the manual, the function body needs to be enclosed in single quotes or, better, in dollar quoting.
As stored functions can be written in many different languages in Postgres, you also need to specify a language - in this case language sql is suitable.
So putting all that together, you need:
CREATE FUNCTION schemaB.testFunc(p_id INT)
RETURNS setof schemaZ.mainTable
AS
$$
SELECT *
FROM schemaZ.mainTable
WHERE id = p_id
$$
language sql;
A return statement is not required for language sql functions.
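The function is then called like a table in the FROM clause (the parameter value is just an example):
SELECT *
FROM   schemaB.testFunc(42);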
I am currently learning a lot of PostgreSQL, especially PLPGSQL and am struggling in handling query results in functions.
I want to create a wrapper around a user table and use the result later on and then return it.
In my case the user and account are two different tables and I want to create it in one go.
My first and naïve approach was to build the following:
CREATE OR REPLACE FUNCTION schema.create_user_with_login (IN email varchar, IN password varchar, IN firstname varchar DEFAULT NULL, IN surname varchar DEFAULT NULL)
RETURNS schema.user
LANGUAGE plpgsql
VOLATILE
RETURNS NULL ON NULL INPUT
AS
$$
declare
created_user schema."user";
begin
INSERT INTO schema."user" ("firstname", "surname", "email")
VALUES (firstname, surname, email)
RETURNING * INTO created_user;
-- [...] create accounts and other data using e.g. created_user.id
-- the query should return the initially created user
RETURN created_user;
end;
$$;
This approach does not work, because schema.user has NOT NULL fields (domain types with that constraint), so the variable declaration itself throws an exception:
domain schema."USER_ID" does not allow null values
So maybe it could work, but not with in that constrained environment.
I also tried to use RETURNS SETOF schema.user and directly RETURN QUERY INSERT ...., but this does not return all columns, but instead one column with all the data.
How can I achieve the effect of returning the initial user object as a proper user row while having the data available inside the function?
I am using Postgres 9.6. My version output:
PostgreSQL 9.6.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.2 20140120 (Red Hat 4.8.2-16), 64-bit
Issue 1
I also tried to use RETURNS SETOF schema.user and directly RETURN QUERY INSERT ...., but this does not return all columns, but instead
one column with all the data.
Sure it returns all columns. You have to call set-returning functions like this:
SELECT * FROM schema.create_user_with_login(...);
You have to declare it as RETURNS SETOF foo.users to cooperate with RETURN QUERY.
Issue 2
It's nonsense to declare your function as STRICT (synonym for RETURNS NULL ON NULL INPUT) and then declare NULL parameter default values:
... firstname varchar DEFAULT NULL, IN surname varchar DEFAULT NULL)
You cannot pass NULL values to a function defined STRICT; it would just return NULL and do nothing. Since firstname and surname are meant to be optional, do not define the function STRICT (or pass empty strings instead).
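To illustrate the STRICT pitfall with a toy function (strict_demo is made up, not from the original post):
-- A STRICT function is not even executed when any argument is NULL.
CREATE FUNCTION strict_demo(_a text, _b text DEFAULT NULL)
RETURNS text AS $$ SELECT concat(_a, _b) $$ LANGUAGE sql STRICT;

SELECT strict_demo('hello');            -- NULL: the defaulted _b is NULL
SELECT strict_demo('hello', 'world');   -- 'helloworld'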
More suggestions
Don't call your schema "schema".
Don't use the reserved word user as identifier at all.
Use legal, lower-case, unquoted identifiers everywhere if possible.
Function
All things considered, your function might look like this:
CREATE OR REPLACE FUNCTION foo.create_user_with_login (_email text
, _password text
, _firstname text = NULL
, _surname text = NULL)
RETURNS SETOF foo.users
LANGUAGE plpgsql AS -- do *not* define it STRICT
$func$
BEGIN
RETURN QUERY
WITH u AS (
INSERT INTO foo.users (firstname, surname, email)
VALUES (_firstname, _surname, _email)
RETURNING *
)
, a AS ( -- create account using created_user.id
INSERT INTO accounts (user_id)
SELECT u.user_id FROM u
)
-- more chained CTEs with DML statements?
TABLE u; -- return the initially created user
END
$func$;
Yes, that's a single SQL statement with several data-modifying CTEs to do it all. Fastest and cleanest. The function wrapper is optional for convenience. Might as well be LANGUAGE sql. Related:
Insert data in 3 tables at a time using Postgres
I prepended the function parameter names with an underscore (_email) to rule out naming conflicts. This is totally optional, but if you don't, you have to carefully keep track of the scope of conflicting parameters, variables, and column names.
TABLE u is short for SELECT * FROM u.
Is there a shortcut for SELECT * FROM?
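For completeness, foo.create_user_with_login is then called like any other set-returning function (the argument values are made up for illustration):
SELECT *
FROM   foo.create_user_with_login('jane@example.com', 'secret', 'Jane', 'Doe');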
Store results of query in a plpgsql variable?
Three distinct cases:
Single value:
Store query result in a variable using in PL/pgSQL
Single row
Declare row type variable in PL/pgSQL
Set of rows (= table)
There are no "table variables", but several other options:
PostgreSQL table variable
How to use a record type variable in plpgsql?
I have a very simple query that is not much more complicated than:
select *
from table_name
where id = 1234
...it takes less than 50 milliseconds to run.
Took that query and put it into a function:
CREATE OR REPLACE FUNCTION pie(id_param integer)
RETURNS SETOF record AS
$BODY$
BEGIN
RETURN QUERY SELECT *
FROM table_name
where id = id_param;
END
$BODY$
LANGUAGE plpgsql STABLE;
When executed as select * from pie(123); this function takes 22 seconds.
If I hard code an integer in place of id_param, the function executes in under 50 milliseconds.
Why does the fact that I am using a parameter in the where statement cause my function to run slow?
Edit to add concrete example:
CREATE TYPE test_type AS (gid integer, geocode character varying(9))
CREATE OR REPLACE FUNCTION geocode_route_by_geocode(geocode_param character)
RETURNS SETOF test_type AS
$BODY$
BEGIN
RETURN QUERY EXECUTE
'SELECT gs.geo_shape_id AS gid,
gs.geocode
FROM geo_shapes gs
WHERE geocode = $1
AND geo_type = 1
GROUP BY geography, gid, geocode' USING geocode_param;
END;
$BODY$
LANGUAGE plpgsql STABLE;
ALTER FUNCTION geocode_route_by_geocode(character)
OWNER TO root;
--Runs in 20 seconds
select * from geocode_route_by_geocode('999xyz');
--Runs in 10 milliseconds
SELECT gs.geo_shape_id AS gid,
gs.geocode
FROM geo_shapes gs
WHERE geocode = '9999xyz'
AND geo_type = 1
GROUP BY geography, gid, geocode
Update in PostgreSQL 9.2
There was a major improvement, I quote the release notes here:
Allow the planner to generate custom plans for specific parameter
values even when using prepared statements (Tom Lane)
In the past, a prepared statement always had a single "generic" plan
that was used for all parameter values, which was frequently much
inferior to the plans used for non-prepared statements containing
explicit constant values. Now, the planner attempts to generate custom
plans for specific parameter values. A generic plan will only be used
after custom plans have repeatedly proven to provide no benefit. This
change should eliminate the performance penalties formerly seen from
use of prepared statements (including non-dynamic statements in
PL/pgSQL).
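One way to observe this behaviour is to EXPLAIN a prepared statement for a concrete parameter value; the statement name below is illustrative and the table and column come from the question:
PREPARE pie_stmt (integer) AS
    SELECT * FROM table_name WHERE id = $1;

-- On 9.2+ the first executions get a custom plan tailored to the supplied value.
EXPLAIN EXECUTE pie_stmt(1234);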
Original answer for PostgreSQL 9.1 or older
A PL/pgSQL function has a similar effect to a PREPARE statement: queries are parsed and the query plan is cached.
The advantage is that some overhead is saved for every call.
The disadvantage is that the query plan is not optimized for the particular parameter values it is called with.
For queries on tables with even data distribution, this will generally be no problem, and PL/pgSQL functions will perform somewhat faster than raw SQL queries or SQL functions. But if your query can use certain indexes depending on the actual values in the WHERE clause or, more generally, choose a better query plan for particular values, you may end up with a sub-optimal query plan. Try an SQL function, or use dynamic SQL with EXECUTE to force the query to be re-planned for every call. It could look like this:
CREATE OR REPLACE FUNCTION pie(id_param integer)
RETURNS SETOF record AS
$BODY$
BEGIN
RETURN QUERY EXECUTE
'SELECT *
FROM table_name
where id = $1'
USING id_param;
END
$BODY$
LANGUAGE plpgsql STABLE;
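Note that, because the function is declared RETURNS SETOF record, the caller has to supply a column definition list; the column names and types below are assumptions:
SELECT *
FROM   pie(1234) AS t(id integer, name text);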
Edit after comment:
If this variant does not change the execution time, there must be other factors at play that you may have missed or did not mention. Different database? Different parameter values? You would have to post more details.
I add a quote from the manual to back up my above statements:
An EXECUTE with a simple constant command string and some USING
parameters, as in the first example above, is functionally equivalent
to just writing the command directly in PL/pgSQL and allowing
replacement of PL/pgSQL variables to happen automatically. The
important difference is that EXECUTE will re-plan the command on each
execution, generating a plan that is specific to the current parameter
values; whereas PL/pgSQL normally creates a generic plan and caches it
for re-use. In situations where the best plan depends strongly on the
parameter values, EXECUTE can be significantly faster; while when the
plan is not sensitive to parameter values, re-planning will be a
waste.
Basically, at least for proof of concept, I want a function where I can run:
SELECT res('table_name'); and this will give me the results of SELECT * FROM table_name;
The issue I am having is the schema: in the declaration of the function I have:
CREATE OR REPLACE FUNCTION res(table_name TEXT) RETURNS SETOF THISISTHEPROBLEM AS
The problem is that I do not know how to declare my return, as it wants me to specify a table or a schema, and I won't have that until the function is actually run.
Any ideas?
You can do this, but as mentioned before, you have to add a column definition list in the query that calls the function.
CREATE OR REPLACE FUNCTION res(table_name TEXT) RETURNS SETOF record AS $$
BEGIN
RETURN QUERY EXECUTE 'SELECT * FROM ' || table_name;
END;
$$ LANGUAGE plpgsql;
SELECT * FROM res('sometable') sometable (col1 INTEGER, col2 INTEGER, col3 SMALLINT, col4 TEXT);
Why, for any real practical purpose, would you want to just pass in a table name and select * from it? For fun, maybe?
You can't do it without defining some kind of known output, as jack and rudi show. Or do it like depesz does here, using output parameters: http://www.depesz.com/index.php/2008/05/03/waiting-for-84-return-query-execute-and-cursor_tuple_fraction/.
A few hack-around approaches are to issue RAISE NOTICE in a loop and print out the result set one row at a time, or to create a function called get_rows_TABLENAME with a definition for every table you want to return (just use code to generate those function definitions). But again, I'm not sure how much value there is in doing a select * from a table, especially with no constraints, other than for fun or making the DBA's blood boil.
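A minimal sketch of that generated-per-table idea, with hypothetical table and function names:
-- One generated wrapper per table; it returns typed rows,
-- so callers don't need a column definition list.
CREATE OR REPLACE FUNCTION get_rows_sometable()
RETURNS SETOF sometable AS $$
    SELECT * FROM sometable;
$$ LANGUAGE sql STABLE;

SELECT * FROM get_rows_sometable();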
Now, in SQL Server you can have a stored procedure return a dynamic result set. This is both a blessing and a curse, as you can't be certain what comes back without looking up the definition. To me, PostgreSQL's implementation is the sounder way to go about it.
Even if you manage to do this (see rudi-moore's answer for a way if you have 8.4 or above), you will have to expand the type explicitly in the SELECT, e.g.:
SELECT * FROM res('table_name') AS foo(id int, ...)