%rowtype in nested `declare...begin...end` - postgresql

Inside an anonymous block, I have nested a declare...begin...end block. The inner block cannot "see" a temp table created in the outer block. What is the best way to solve this while still creating the temp table inside the block?
The code below fails with: CONTEXT: compilation of PL/pgSQL function "inline_code_block" near line 152
do
$$
declare
xyz...
begin
create temp table x ... <depending on xyz>
...
declare
r x%rowtype -> FAIL
begin
...
end;
end;
$$

This cannot work. The error is not related to block scope; it is related to timing. The temp table is created at run time, but x%rowtype is resolved when the block is compiled, before anything runs. PostgreSQL has a very simple and robust solution: use the record type:
declare r record;
begin
A variable of type record takes on its actual composite type when a value is assigned to it.
Although PL/pgSQL looks similar to PL/SQL, internally it is a completely different technology, and not all Oracle patterns are possible or efficient. Sometimes there are different (and sometimes much more comfortable) ways.
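A minimal sketch of the record approach, keeping the nested block from the question. The columns of x (id, label) are placeholders invented for this example, since the question does not show the real table definition.

```sql
do
$$
begin
  -- Created at run time, which is why x%rowtype cannot be resolved
  -- when the block is compiled. The columns are hypothetical.
  create temp table x (id integer, label text) on commit drop;
  insert into x values (1, 'one'), (2, 'two');

  declare
    r record;  -- gets its structure when a row is first assigned
  begin
    for r in select id, label from x order by id loop
      raise notice 'id = %, label = %', r.id, r.label;
    end loop;
  end;
end;
$$;
```

Field references such as r.id are looked up when the statement runs, so the block compiles even though x does not exist yet.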

Related

Is there any way to check whether a PostgreSQL record- or row-type variable contains a specific field from inside a function?

I have a trigger defined on several tables to fire after all INSERT, UPDATE, or DELETE, all using the same trigger function. The trigger function performs an expensive check, but I can speed it up significantly by filtering some of the intermediate steps of that check using either a WHERE machine_serial = NEW.machine_serial or WHERE machine_serial = OLD.machine_serial clause, depending on what type of statement fired the trigger. However, not all the tables actually have a machine_serial column, so I can't perform this filtering when the trigger is fired on one of those tables. I am currently trying to find a good solution to making the decision of whether to filter or not from within the trigger function, and I believe that simply checking whether NEW or OLD has the machine_serial field would be easiest, clearest, and fastest. I can't find any way to do that in the documentation though, but checking whether a RECORD contains a certain field seems like such a basic, commonplace operation for anyone that has to work with RECORDs that I assume that I've just got to be missing it somewhere - I can't imagine that it's just not possible.
For completeness, I'll go over the alternatives I've considered to the hypothetical does-RECORD-have-field check:
I could create two trigger functions, do_expensive_check_with_machine_serial() and do_expensive_check_without_machine_serial(), and use one or the other depending on whether the table has the machine_serial column. But if I or anyone after me needs to alter the logic in either one of these functions, they'll need to remember to alter the logic in the other one, too.
I could stick with the one trigger function I currently have, and figure out whether the firing table has machine_serial by just trying to access NEW.machine_serial or OLD.machine_serial. If that raises an exception, I can catch it and then I'll know the field isn't present. But the manual explicitly suggests avoiding using exception blocks unless absolutely necessary, due to performance impacts.
I could stick with the one trigger function I currently have, and just add a check like this: IF (TG_TABLE_SCHEMA = x AND TG_TABLE_NAME = y) OR (TG_TABLE_SCHEMA = w AND TG_TABLE_NAME = z) OR ..., and just maintain that list of every table that has a machine_serial column. But then I and anyone that comes after me would need to alter that check in the trigger function any time the trigger is added to a new table, which is less than ideal.
Of course, the above three alternatives would all function, but they all feel like bad design choices to me. Maybe it's because I'm used to the dynamicness offered by Python, but if I used any of these alternatives, I would feel like I'm doing something wrong. And PostgreSQL is pretty good about offering lots of operators on all sorts of data types, so I just can't imagine that something as basic as checking whether a RECORD or ROW-type variable contains a certain field is impossible.
Before I show the solution, I have to say that this requirement can be a sign of a questionable design. Maybe you are trying to implement functionality that should not live in triggers. Triggers are good, but triggers that are too smart, too generic, or too rich can be very slow and very hard to maintain and debug (although, as with everything in life, there are exceptions to the rule).
First, you can look in the system catalog:
CREATE FUNCTION public.foo_trg() RETURNS trigger
LANGUAGE plpgsql
AS $$
begin
raise notice 'a exists %', exists(select * from pg_attribute where attrelid = new.tableoid and attname = 'a');
raise notice 'd exists %', exists(select * from pg_attribute where attrelid = new.tableoid and attname = 'd');
return new;
end;
$$;
CREATE TABLE public.foo (
a integer,
b integer
);
CREATE TRIGGER foo_trg_insert
AFTER INSERT ON public.foo
FOR EACH ROW EXECUTE FUNCTION public.foo_trg();
(2022-09-02 06:18:41) postgres=# insert into foo values(1,2);
NOTICE: a exists t
NOTICE: d exists f
INSERT 0 1
The second solution is based on converting the record to jsonb:
CREATE OR REPLACE FUNCTION public.foo_trg()
RETURNS trigger
LANGUAGE plpgsql
AS $$
declare j jsonb;
begin
j := to_jsonb(new);
raise notice 'a exists %', j ? 'a';
raise notice 'd exists %', j ? 'd';
return new;
end;
$$;
(2022-09-02 06:24:54) postgres=# insert into foo values(1,2);
NOTICE: a exists t
NOTICE: d exists f
INSERT 0 1
The second solution can be faster because it does not need to query the system catalog; it only hits the system catalog cache. However, it does not work on some legacy PostgreSQL releases.
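Applied to the machine_serial case from the question, the jsonb check could look roughly like the sketch below. The function name do_expensive_check, the two demo tables, and the notices standing in for the real check are all assumptions made for illustration.

```sql
create function do_expensive_check() returns trigger
language plpgsql
as $$
declare
  j jsonb;
begin
  -- NEW is not populated for DELETE, so convert the row that exists.
  if tg_op = 'DELETE' then
    j := to_jsonb(old);
  else
    j := to_jsonb(new);
  end if;

  if j ? 'machine_serial' then
    -- Placeholder for the filtered variant of the expensive check.
    raise notice 'filtering on machine_serial = %', j ->> 'machine_serial';
  else
    -- Placeholder for the unfiltered variant.
    raise notice 'table % has no machine_serial column', tg_table_name;
  end if;

  return null;  -- the return value of an AFTER row trigger is ignored
end;
$$;

-- Hypothetical tables, one with the column and one without.
create table with_serial (id integer, machine_serial text);
create table without_serial (id integer);

create trigger with_serial_check
after insert or update or delete on with_serial
for each row execute function do_expensive_check();

create trigger without_serial_check
after insert or update or delete on without_serial
for each row execute function do_expensive_check();
```

The same function body serves both tables, so there is no per-table list to maintain and no exception block.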

How to do variable substitution in plpgsql?

I've got a bit of complex sql code that I'm converting from MSSql to Postgres (using Entity Framework Core 2.1), to deal with potential race conditions when inserting to a table with a unique index. Here's the dumbed-down version:
const string QUERY = @"
DO
$$
BEGIN
insert into Foo (Field1,Field2,Field3)
values (@value1,@value2,@value3);
EXCEPTION WHEN others THEN
-- do nothing; it's a race condition
END;
$$ LANGUAGE plpgsql;
select *
from Foo
where Field1 = @value1
and Field2 = @value2;
";
return DbContext.Foos
.FromSql(QUERY,
new NpgsqlParameter("value1", value1),
new NpgsqlParameter("value2", value2),
new NpgsqlParameter("value3", value3))
.First();
In other words, try to insert the record, but don't throw an exception if the attempt to insert it results in a unique index violation (index is on Field1+Field2), and return the record, whether it was created by me or another thread.
This concept worked fine in MSSql, using a TRY..CATCH block. As far as I can tell, the way to handle Postgres exceptions is as I've done, in a plpgsql block.
BUT...
It appears that variable substitution in plpgsql blocks doesn't work. The code above fails on the .First() (no elements in sequence), and when I comment out the EXCEPTION line, I see the real problem, which is:
Npgsql.PostgresException : 42703: column "value1" does not exist
When I test using regular Sql, i.e. doing the insert without using a plpgsql block, this works fine.
So, what is the correct way to do variable substitution in a plpgsql block?
The reason this doesn't work is that the body of the DO statement is actually a string literal (text). See reference
$$ is just another way to delimit text in PostgreSQL. It can just as well be replaced with ' or $somestuff$.
As it is a string, Npgsql and PostgreSQL have no reason to touch @value1 inside it.
Solutions? Only a very ugly one, so it is better not to use this construction, since you cannot pass any values to it. And building the block with string concatenation is no different from doing the concatenation in C# in the first place.
Alternatives? Yes!
You don't need to handle exceptions in plpgsql blocks. Simply insert, use the ON CONFLICT DO NOTHING, and be on your way.
INSERT INTO Foo (Field1,Field2,Field3)
VALUES (@value1,@value2,@value3)
ON CONFLICT DO NOTHING;
select *
from Foo
where Field1 = @value1
and Field2 = @value2;
Or if you really want to keep using plpgsql, you can simply create a temporary table, using the ON COMMIT DROP option, fill it up with these parameters as one row, then use it in the DO statement. For that to work all your code must execute as part of one transaction. You can use one explicitly just in case.
The only ways to pass parameters to plpgsql code are these 2 methods:
Declaring a function, then calling it with arguments
When already inside a plpgsql block you can call:
EXECUTE $$ INSERT ... VALUES ($1, $2, $3); $$ USING 3, 'text value', 5.234;
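As a sketch of the first method combined with ON CONFLICT DO NOTHING: the function name get_or_create_foo, the p_ parameter names, and the column types of foo are assumptions, because the question does not show the table definition.

```sql
-- Hypothetical definition of the table from the question.
create table foo (
  field1 integer,
  field2 text,
  field3 numeric,
  unique (field1, field2)
);

create function get_or_create_foo(p_field1 integer, p_field2 text, p_field3 numeric)
returns foo
language plpgsql
as $$
declare
  result foo;
begin
  -- Arguments are ordinary variables here, so no string substitution is needed.
  insert into foo (field1, field2, field3)
  values (p_field1, p_field2, p_field3)
  on conflict (field1, field2) do nothing;

  -- Return the row whether this call or a concurrent one created it.
  select * into result
  from foo
  where field1 = p_field1 and field2 = p_field2;

  return result;
end;
$$;
```

From the client, the whole operation then becomes a single parameterized statement such as select * from get_or_create_foo(@value1, @value2, @value3).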
End notes:
As a fellow T-SQL developer who loved its freedom but transitioned to PostgreSQL, I have to say that the big difference is this: in MSSQL the power comes from T-SQL, whereas in PostgreSQL the SQL dialect itself is very powerful. plpgsql is very rarely warranted. In fact, in a code base with megabytes of complex SQL, I can rewrite pretty much all of the plpgsql code in plain SQL. That's how powerful it really is compared to MSSQL-flavored SQL. It just takes some getting used to, and befriending the very ample documentation. Good luck!

plpgsql : how to reference variable in sql statement

I am rather new to PL/pgSQL and don't know how to reference a variable in a SELECT statement.
In this following function the SELECT INTO always returns NULL:
$body$
DECLARE
r RECORD;
c CURSOR FOR select name from t_people;
nb_bills integer;
BEGIN
OPEN c;
LOOP
FETCH c INTO r;
EXIT WHEN NOT FOUND;
RAISE NOTICE 'name found: %', r.name;
SELECT count(bill_id) INTO nb_bills from t_bills where name = r.name;
END LOOP;
END;
$body$
RAISE NOTICE allows me to verify that my CURSOR is working well: names are properly retrieved, but for some reason still unknown to me, not properly passed to the SELECT INTO statement.
For debugging purpose, I tried to replace the variable in SELECT INTO with a constant value and it worked:
SELECT count( bill_id) INTO nb_bills from t_bills where name = 'joe';
I don't know how to reference r.name in the SELECT INTO statement.
I tried r.name, I tried to create another variable containing quotes, it is always returning NULL.
I am stuck. If anyone knows ...
"the SELECT INTO always returns NULL"
"not properly passed to the SELECT INTO statement."
"it is always returning NULL."
None of this makes sense.
SELECT statements do not return anything by themselves in PL/pgSQL. You have to either assign results to variables or explicitly return results with one of the available RETURN variants.
SELECT INTO is only used for variable assignment and does not return anything, either. Not to be confused with SELECT INTO in plain SQL - which is generally discouraged:
Combine two tables into a new one so that select rows from the other one are ignored
It's not clear what's supposed to be returned in your example. You did not disclose the return type of the function and you did not actually return anything.
Start by reading the chapter Returning From a Function in the manual.
Here are some related answers with examples:
Can I make a plpgsql function return an integer without using a variable?
Return a query from a function?
How to return multiple rows from PL/pgSQL function?
Return setof record (virtual table) from function
And there may be naming conflicts with parameter names. We can't tell unless you post the complete function definition.
Better approach
That said, explicit cursors are only actually needed on rare occasions. Typically, the implicit cursor of a FOR loop is simpler to handle and cheaper:
Truncating all tables in a Postgres database
And most of the time you don't even need any cursors or loops at all. Set-based solutions are typically simpler and faster.
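A sketch of both alternatives for the bill count from the question. The table definitions and sample rows are hypothetical stand-ins, since the question only shows the column names name and bill_id.

```sql
-- Hypothetical minimal versions of the tables from the question.
create temp table t_people (name text);
create temp table t_bills (bill_id integer, name text);
insert into t_people values ('joe'), ('ann');
insert into t_bills values (1, 'joe'), (2, 'joe');

-- Set-based: one query, no cursor and no loop.
select p.name, count(b.bill_id) as nb_bills
from t_people p
left join t_bills b on b.name = p.name
group by p.name
order by p.name;

-- If per-row procedural work is really needed, a FOR loop
-- opens, fetches from, and closes its cursor implicitly.
do
$$
declare
  r record;
begin
  for r in
    select p.name, count(b.bill_id) as nb_bills
    from t_people p
    left join t_bills b on b.name = p.name
    group by p.name
  loop
    raise notice 'name: %, bills: %', r.name, r.nb_bills;
  end loop;
end;
$$;
```

The left join keeps people without bills in the result, with a count of zero.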

PostgreSQL insert or update trigger function volatility category

Assume I have 2 tables in my DB (postgresql-9.x):
CREATE TABLE FOLDER (
KEY BIGSERIAL PRIMARY KEY,
PATH TEXT,
NAME TEXT
);
CREATE TABLE FOLDERFILE (
FILEID BIGINT,
PATH TEXT,
PATHKEY BIGINT
);
I automatically update FOLDERFILE.PATHKEY from FOLDER.KEY whenever I insert into or update FOLDERFILE:
CREATE OR REPLACE FUNCTION folderfile_fill_pathkey() RETURNS trigger AS $$
DECLARE
pathkey bigint;
changed boolean;
BEGIN
IF tg_op = 'INSERT' THEN
changed := TRUE;
ELSE IF old.FILEID != new.FILEID THEN
changed := TRUE;
END IF;
END IF;
IF changed THEN
SELECT INTO pathkey key FROM FOLDER WHERE PATH = new.path;
IF FOUND THEN
new.pathkey = pathkey;
ELSE
new.pathkey = NULL;
END IF;
END IF;
RETURN new;
END
$$ LANGUAGE plpgsql VOLATILE;
-- BEFORE rather than AFTER: changes to NEW are only stored by a BEFORE row trigger
CREATE TRIGGER folderfile_fill_pathkey_trigger BEFORE INSERT OR UPDATE
ON FOLDERFILE FOR EACH ROW EXECUTE PROCEDURE folderfile_fill_pathkey();
So the question is about the volatility of the function folderfile_fill_pathkey(). The documentation says:
Any function with side-effects must be labeled VOLATILE
But as far as I understand, this function does not change any data in the tables it relies on, so I can mark it as IMMUTABLE. Is that correct?
Would there be any problem with IMMUTABLE trigger function if I bulk-insert many rows into FOLDERFILE within the same transaction, like:
BEGIN;
INSERT INTO FOLDERFILE ( ... );
...
INSERT INTO FOLDERFILE ( ... );
COMMIT;
Firstly, as @pozs already pointed out, the function definition you have provided is most definitely STABLE rather than IMMUTABLE since it performs database look-ups. This means that the result is not simply derived from the input parameters (as IMMUTABLE would suggest), but also from the data stored in your FOLDER table (which is bound to change). As per the documentation:
STABLE indicates that the function cannot modify the database, and
that within a single table scan it will consistently return the same
result for the same argument values, but that its result could change
across SQL statements. This is the appropriate selection for functions
whose results depend on database lookups, parameter variables (such as
the current time zone), etc.
Secondly, adding stability modifiers (IMMUTABLE/STABLE/VOLATILE) to your trigger functions serves an illustrative purpose at best, since AFAIK PostgreSQL doesn't actually perform any planning that would warrant their use. The following post from the pgsql-hackers mailing list seems to support my claim:
Volatility is a complete no-op for a trigger function anyway, as are
other planner parameters such as cost/rows, because there is no
planning involved in trigger calls.
To sum up: you're probably better off avoiding the stability keywords in your trigger(!) procedures for now, since including them seems to add little to no benefit but entails several unexpected caveats/pitfalls (see the end of @pozs's first comment).
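For contrast, here is a sketch of where the label does describe something real: an ordinary (non-trigger) lookup function called from queries. The function name folder_key and its parameter are hypothetical; the FOLDER table is the one from the question, recreated here only so the example stands alone.

```sql
create table if not exists folder (
  key bigserial primary key,
  path text,
  name text
);

-- STABLE: reads the database but does not modify it, and returns the
-- same result for the same argument within a single statement.
create function folder_key(p_path text) returns bigint
language sql
stable
as $$
  select key from folder where path = p_path;
$$;
```

Such a function can then be used in a query, for example select folder_key('/some/path'), and there the planner is allowed to take the STABLE label into account.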

postgresql privileges Ensuring inserts are only done through functions

Let's say I have a table persons which contains only a name(varchar) and a user client.
I'd like that the only way for client to insert to persons is through the function:
CREATE OR REPLACE FUNCTION add_a_person(a_name character varying)
RETURNS void AS
$BODY$
BEGIN
INSERT INTO persons VALUES(a_name);
END;
$BODY$
LANGUAGE plpgsql VOLATILE COST 100;
So, I don't want to grant client insert privileges on persons and only give execute privilege for add_a_person.
But without doing so, I'd get a permission denied because of the use of insert inside the function.
I have not found a way to do this in the postgres documentation about granting privileges.
Is there a way to do this?
You can define the function with SECURITY DEFINER. This will allow the function to run for the restricted user as if they had the higher privileges of the function's creator (which needs to be able to insert into the table).
The last line of the definition would look like this:
LANGUAGE plpgsql VOLATILE COST 100 SECURITY DEFINER;
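Put together with the privilege statements, the setup might look like the sketch below. The table definition and the create role line are assumptions added so the example stands alone (skip them if persons and client already exist). The set search_path clause is an addition that is not part of the answer; it follows the PostgreSQL documentation's advice for writing SECURITY DEFINER functions safely.

```sql
-- Hypothetical setup; skip if the table and role already exist.
create table persons (name varchar);
create role client;

create or replace function add_a_person(a_name character varying)
returns void
language plpgsql
security definer
-- Pin the search path so the caller cannot redirect the function
-- to objects in a schema of their own.
set search_path = public, pg_temp
as $body$
begin
  insert into persons values (a_name);
end;
$body$;

-- Functions are executable by PUBLIC by default, so revoke that first.
revoke all on function add_a_person(character varying) from public;
grant execute on function add_a_person(character varying) to client;

-- client gets no direct INSERT privilege on the table.
revoke insert on persons from client;
```

With this in place, client can call select add_a_person('Alice'), while a direct insert into persons is rejected with a permission error.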
This is a bit simplistic, but assuming you are running 9.2 or later, here is an example of how to check that an insert comes from a single permitted function:
CREATE TABLE my_table (col1 text, col2 integer, col3 timestamp);
CREATE FUNCTION my_table_insert_function(col1 text, col2 integer) RETURNS integer AS $$
BEGIN
INSERT INTO my_table VALUES (col1, col2, current_timestamp);
RETURN 1;
END $$ LANGUAGE plpgsql;
CREATE FUNCTION my_table_insert_trigger_function() RETURNS trigger AS $$
DECLARE
stack text;
fn integer;
BEGIN
RAISE EXCEPTION 'secured';
EXCEPTION WHEN OTHERS THEN
BEGIN
GET STACKED DIAGNOSTICS stack = PG_EXCEPTION_CONTEXT;
fn := position('my_table_insert_function' in stack);
IF (fn <= 0) THEN
RAISE EXCEPTION 'Expecting insert from my_table_insert_function'
USING HINT = 'Use function to insert data';
END IF;
RETURN new;
END;
END $$ LANGUAGE plpgsql;
CREATE TRIGGER my_table_insert_trigger BEFORE INSERT ON my_table
FOR EACH ROW EXECUTE PROCEDURE my_table_insert_trigger_function();
And a quick example of usage:
INSERT INTO my_table VALUES ('test one', 1, current_timestamp); -- FAILS
SELECT my_table_insert_function('test one', 1); -- SUCCEEDS
You'll want to peek into the stack in more detail if you want your code to be more robust, secure, etc. Checks for multiple functions are possible, of course, but involve more work. Splitting the stack into multiple lines and parsing it can be fairly involved, so you'll probably want some helper functions if things get more complex.
This is just a proof of concept, but it does what it claims. I would expect this code to be fairly slow given the use of exception handling and stack inspection, so don't use it in performance-critical parts of your application. It's not likely to be suitable for cases where DML statements are frequent, but if security is more important than performance, go for it.
Matthew's answer is correct in that a SECURITY DEFINER will allow the function to run with the privileges of a different user. Documentation for this is at http://www.postgresql.org/docs/9.1/static/sql-createfunction.html
Why are you trying to implement security this way? If you want to enforce some logic on the inserts, then I would strongly recommend doing it with constraints. http://www.postgresql.org/docs/9.1/static/ddl-constraints.html
If you want substantially higher levels of logic than can be reasonably implemented in constraints, I would suggest looking into building a business logic layer between your presentation layer and the data storage layer. You will find that scalability demands this pretty much instantly.
If your goal is to defend against SQL injection then you have found a way that might work, but that will create a heck of a lot of work for you. Worse, it leads to huge volumes of really mindless code that all has to be kept in sync across schema changes. This is pretty rough if you're trying to do anything agile. Consider instead using a programming framework that takes advantage of PREPARE / EXECUTE, which is pretty much all of them at this point.
http://www.postgresql.org/docs/9.0/static/sql-prepare.html