PostgreSQL privileges: ensuring inserts are only done through functions

Let's say I have a table persons which contains only a name(varchar) and a user client.
I'd like that the only way for client to insert to persons is through the function:
CREATE OR REPLACE FUNCTION add_a_person(a_name character varying)
RETURNS void AS
$BODY$
BEGIN
INSERT INTO persons VALUES(a_name);
END;
$BODY$
LANGUAGE plpgsql VOLATILE COST 100;
So, I don't want to grant client insert privileges on persons and only give execute privilege for add_a_person.
But without doing so, I'd get a permission denied because of the use of insert inside the function.
I have not found a way to do this in the Postgres documentation about granting privileges.
Is there a way to do this?

You can define the function with SECURITY DEFINER. This allows the function to run for the restricted user with the privileges of the function's owner (who needs to be able to insert into the table).
The last line of the definition would look like this:
LANGUAGE plpgsql VOLATILE COST 100 SECURITY DEFINER;
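Putting the pieces together, a minimal sketch of the whole setup might look like the following. It assumes the role client already exists, that the script is run by the owner of persons, and that the table lives in the public schema; the search_path setting and the REVOKE ... FROM PUBLIC are additions that the PostgreSQL documentation recommends for SECURITY DEFINER functions, not something taken from the question.

```sql
-- Assumption: run as the owner of the table; role "client" already exists.
CREATE TABLE IF NOT EXISTS persons (name varchar);

CREATE OR REPLACE FUNCTION add_a_person(a_name character varying)
RETURNS void AS
$BODY$
BEGIN
    INSERT INTO persons VALUES (a_name);
END;
$BODY$
LANGUAGE plpgsql VOLATILE SECURITY DEFINER
-- Pin the search path so the function cannot be tricked into
-- resolving "persons" to an object in a schema the caller controls.
SET search_path = public, pg_temp;

-- client gets no direct privileges on the table ...
REVOKE ALL ON persons FROM client;

-- ... and only client (not PUBLIC) may call the function.
REVOKE EXECUTE ON FUNCTION add_a_person(character varying) FROM PUBLIC;
GRANT EXECUTE ON FUNCTION add_a_person(character varying) TO client;
```

When connected as client, a direct INSERT INTO persons should then be rejected with a permission error, while SELECT add_a_person('Alice'); should insert the row.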

This is a bit simplistic, but assuming you are running 9.2 or later, this is an example of how to check for a single permitted function doing an insert:
CREATE TABLE my_table (col1 text, col2 integer, col3 timestamp);
CREATE FUNCTION my_table_insert_function(col1 text, col2 integer) RETURNS integer AS $$
BEGIN
INSERT INTO my_table VALUES (col1, col2, current_timestamp);
RETURN 1;
END $$ LANGUAGE plpgsql;
CREATE FUNCTION my_table_insert_trigger_function() RETURNS trigger AS $$
DECLARE
stack text;
fn integer;
BEGIN
RAISE EXCEPTION 'secured';
EXCEPTION WHEN OTHERS THEN
BEGIN
GET STACKED DIAGNOSTICS stack = PG_EXCEPTION_CONTEXT;
fn := position('my_table_insert_function' in stack);
IF (fn <= 0) THEN
RAISE EXCEPTION 'Expecting insert from my_table_insert_function'
USING HINT = 'Use function to insert data';
END IF;
RETURN new;
END;
END $$ LANGUAGE plpgsql;
CREATE TRIGGER my_table_insert_trigger BEFORE INSERT ON my_table
FOR EACH ROW EXECUTE PROCEDURE my_table_insert_trigger_function();
And a quick example of usage:
INSERT INTO my_table VALUES ('test one', 1, current_timestamp); -- FAILS
SELECT my_table_insert_function('test one', 1); -- SUCCEEDS
You'll want to peek into the stack in more detail if you want your code to be more robust, secure, etc. Checks for multiple functions are possible, of course, but involve more work. Splitting the stack into multiple lines and parsing it can be fairly involved, so you'll probably want some helper functions if things get more complex.
This is just a proof of concept, but it does what it claims. I would expect this code to be fairly slow given the use of exception handling and stack inspection, so don't use it in performance-critical parts of your application. It's not likely to be suitable for cases where DML statements are frequent, but if security is more important than performance, go for it.

Matthew's answer is correct in that a SECURITY DEFINER will allow the function to run with the privileges of a different user. Documentation for this is at http://www.postgresql.org/docs/9.1/static/sql-createfunction.html
Why are you trying to implement security this way? If you want to enforce some logic on the inserts, then I would strongly recommend doing it with constraints. http://www.postgresql.org/docs/9.1/static/ddl-constraints.html
If you want substantially higher levels of logic than can be reasonably implemented in constraints, I would suggest looking into building a business logic layer between your presentation layer and the data storage layer. You will find that scalability demands this pretty much instantly.
If your goal is to defend against SQL injection then you have found a way that might work, but that will create a heck of a lot of work for you. Worse, it leads to huge volumes of really mindless code that all has to be kept in sync across schema changes. This is pretty rough if you're trying to do anything agile. Consider instead using a programming framework that takes advantage of PREPARE / EXECUTE, which is pretty much all of them at this point.
http://www.postgresql.org/docs/9.0/static/sql-prepare.html
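To illustrate why prepared statements help here, the following sketch shows PREPARE / EXECUTE at the SQL level. The statement name add_person is made up for the example; in practice the client library or framework usually issues the equivalent protocol-level calls for you.

```sql
-- Assumption: a single-column table like the one in the question.
CREATE TABLE IF NOT EXISTS persons (name varchar);

-- The statement text is parsed once; $1 is a typed parameter slot.
PREPARE add_person(varchar) AS
    INSERT INTO persons VALUES ($1);

-- The argument is passed as data and is never spliced into the SQL text,
-- so the embedded quote and "DROP TABLE" are stored as a plain string.
EXECUTE add_person('Robert''); DROP TABLE persons; --');

-- Prepared statements live until the session ends or they are deallocated.
DEALLOCATE add_person;
```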

Related

Is there any way to check whether a PostgreSQL record- or row-type variable contains a specific field from inside a function?

I have a trigger defined on several tables to fire after all INSERT, UPDATE, or DELETE, all using the same trigger function. The trigger function performs an expensive check, but I can speed it up significantly by filtering some of the intermediate steps of that check using either a WHERE machine_serial = NEW.machine_serial or WHERE machine_serial = OLD.machine_serial clause, depending on what type of statement fired the trigger. However, not all the tables actually have a machine_serial column, so I can't perform this filtering when the trigger is fired on one of those tables. I am currently trying to find a good solution to making the decision of whether to filter or not from within the trigger function, and I believe that simply checking whether NEW or OLD has the machine_serial field would be easiest, clearest, and fastest. I can't find any way to do that in the documentation though, but checking whether a RECORD contains a certain field seems like such a basic, commonplace operation for anyone that has to work with RECORDs that I assume that I've just got to be missing it somewhere - I can't imagine that it's just not possible.
For completeness, I'll go over the alternatives I've considered to the hypothetical does-RECORD-have-field check:
I could create two trigger functions, do_expensive_check_with_machine_serial() and do_expensive_check_without_machine_serial(), and use one or the other depending on whether the table has the machine_serial column. But if I or anyone after me needs to alter the logic in either one of these functions, they'll need to remember to alter the logic in the other one, too.
I could stick with the one trigger function I currently have, and figure out whether the firing table has machine_serial by just trying to access NEW.machine_serial or OLD.machine_serial. If that raises an exception, I can catch it and then I'll know the field isn't present. But the manual explicitly suggests avoiding using exception blocks unless absolutely necessary, due to performance impacts.
I could stick with the one trigger function I currently have, and just add a check like this: IF (TG_TABLE_SCHEMA = x AND TG_TABLE_NAME = y) OR (TG_TABLE_SCHEMA = w AND TG_TABLE_NAME = z) OR ..., and just maintain that list of every table that has a machine_serial column. But then I and anyone that comes after me would need to alter that check in the trigger function any time the trigger is added to a new table, which is less than ideal.
Of course, the above three alternatives would all function, but they all feel like bad design choices to me. Maybe it's because I'm used to the dynamicness offered by Python, but if I used any of these alternatives, I would feel like I'm doing something wrong. And PostgreSQL is pretty good about offering lots of operators on all sorts of data types, so I just can't imagine that something as basic as checking whether a RECORD or ROW-type variable contains a certain field is impossible.
Before I show a solution, I have to say that this requirement can be a sign of a problematic design. Maybe you are trying to implement functionality that should not live in triggers. Triggers are good, but triggers that are too smart, too generic, or too feature-rich can be very slow and very hard to maintain and debug (though, as with everything in life, there are exceptions to the rule).
First, you can look in the system catalog:
CREATE FUNCTION public.foo_trg() RETURNS trigger
LANGUAGE plpgsql
AS $$
begin
raise notice 'a exists %', exists(select * from pg_attribute where attrelid = new.tableoid and attname = 'a');
raise notice 'd exists %', exists(select * from pg_attribute where attrelid = new.tableoid and attname = 'd');
return new;
end;
$$;
CREATE TABLE public.foo (
a integer,
b integer
);
CREATE TRIGGER foo_trg_insert
AFTER INSERT ON public.foo
FOR EACH ROW EXECUTE FUNCTION public.foo_trg();
(2022-09-02 06:18:41) postgres=# insert into foo values(1,2);
NOTICE: a exists t
NOTICE: d exists f
INSERT 0 1
The second solution is based on converting the record to jsonb:
CREATE OR REPLACE FUNCTION public.foo_trg()
RETURNS trigger
LANGUAGE plpgsql
AS $$
declare j jsonb;
begin
j := to_jsonb(new);
raise notice 'a exists %', j ? 'a';
raise notice 'd exists %', j ? 'd';
return new;
end;
$$;
(2022-09-02 06:24:54) postgres=# insert into foo values(1,2);
NOTICE: a exists t
NOTICE: d exists f
INSERT 0 1
The second solution can be faster because it does not run a query against the system catalog; it only hits the system catalog cache. However, it does not work on older PostgreSQL releases that lack to_jsonb (it was added in 9.5).
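Applied to the machine_serial case from the question, the jsonb approach could be sketched as below. The function name do_expensive_check and the trigger name are hypothetical, and the two branches only raise notices where the real filtered and unfiltered checks would go. EXECUTE FUNCTION needs PostgreSQL 11 or later; older releases spell it EXECUTE PROCEDURE.

```sql
-- Hypothetical shared trigger function: branches on whether the firing
-- table's row type has a machine_serial column.
CREATE OR REPLACE FUNCTION do_expensive_check() RETURNS trigger
LANGUAGE plpgsql
AS $$
DECLARE
    r jsonb;
BEGIN
    -- Only OLD is meaningful for DELETE, only NEW for INSERT.
    IF TG_OP = 'DELETE' THEN
        r := to_jsonb(OLD);
    ELSE
        r := to_jsonb(NEW);
    END IF;

    IF r ? 'machine_serial' THEN
        -- ->> extracts the value as text; cast it as needed in the real check.
        RAISE NOTICE 'filtered check for machine_serial = %', r ->> 'machine_serial';
    ELSE
        RAISE NOTICE 'unfiltered check for %.%', TG_TABLE_SCHEMA, TG_TABLE_NAME;
    END IF;

    RETURN NULL;  -- the return value of an AFTER row trigger is ignored
END;
$$;

-- Attach it to the example table foo from above (which has no machine_serial).
CREATE TRIGGER foo_expensive_check
AFTER INSERT OR UPDATE OR DELETE ON public.foo
FOR EACH ROW EXECUTE FUNCTION do_expensive_check();
```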

Postgres Locking a table inside the function is not working?

CREATE OR REPLACE FUNCTION()
RETURNS VOID AS
$$
BEGIN
FOR I IN 1..5
LOOP
LOCK TABLE tbl_Employee1 IN EXCLUSIVE MODE;
INSERT INTO tbl_Employee1
VALUES
(i,'test');
END LOOP;
COMMIT;
END;
$$ LANGUAGE PLPGSQL
When I select from the table, the query hangs indefinitely, which suggests the transaction never completes. Please help me out.
Your code has been stripped down so much that it doesn't really make sense any more.
However, you should only lock the table once, not in each iteration of the loop. Plus you can't use commit in a function in Postgres, so you have to remove that as well. It's also bad coding style (in Postgres and Oracle) to not provide column names for the insert statement.
Immediate solution:
CREATE OR REPLACE FUNCTION ...
RETURNS VOID AS
$$
BEGIN
LOCK TABLE Employee1 IN EXCLUSIVE MODE;
FOR I IN 1..5 LOOP
INSERT INTO Employee1 (id, name)
VALUES (i,'test');
END LOOP;
-- no commit here!
END;
$$ LANGUAGE PLPGSQL
The above is needlessly complicated in Postgres and can be implemented much more efficiently without a loop:
CREATE OR REPLACE FUNCTION ....
RETURNS VOID AS
$$
BEGIN
LOCK TABLE Employee1 IN EXCLUSIVE MODE;
INSERT INTO Employee1 (id, name)
select i, 'test'
from generate_series(1,5) as g(i);
END;
$$ LANGUAGE PLPGSQL
Locking a table in exclusive mode seems like a bad idea to begin with. In Oracle as well, but in Postgres this might have more severe implications. If you want to prevent duplicates in the table, create a unique index (or constraint) and deal with errors. Or use insert ... on conflict in Postgres. That will be much more efficient (and scalable) than locking a complete table.
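As a rough sketch of the lock-free alternative, assuming id is the column that must be unique (the question does not say which one is) and PostgreSQL 9.5 or later for ON CONFLICT:

```sql
-- Assumption: id is the natural key; the primary key provides the unique index.
CREATE TABLE IF NOT EXISTS Employee1 (
    id   integer PRIMARY KEY,
    name text
);

-- Rows whose id already exists are skipped instead of raising an error,
-- and no table-level lock is needed.
INSERT INTO Employee1 (id, name)
SELECT i, 'test'
FROM generate_series(1, 5) AS g(i)
ON CONFLICT (id) DO NOTHING;
```

Running the INSERT a second time should simply insert nothing, since all five ids are already present.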
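Additionally, be careful with the lock mode. In Postgres, EXCLUSIVE mode still permits plain SELECT statements on the table (as it does in Oracle), but it blocks all writes as well as SELECT ... FOR UPDATE and FOR SHARE. A LOCK TABLE without an explicit mode takes ACCESS EXCLUSIVE, which blocks every access to the table, including SELECT statements.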

Recursive function postgres to get the latest ID

I want to use a function in PostgreSQL to get the latest ID related to a history:
CREATE TABLE "tbl_ids" (
"ID" oid,
"Name" text,
"newID" oid
);
After creating this simple table, I have no idea where to start with my function. Before you ask: I know about the COALESCE() function, but I'm going to have more than one parent ID in the future.
CREATE FUNCTION get_latest_id(ID oid, newID oid) RETURNS oid AS $$
BEGIN
IF new IS NOT NULL THEN
--USE old--
END
IF new IS NULL THEN
get_latest_id(new, "newID")
END
END;
I gotta say it because you'd find out anyway: I'm really new in functions with PostgreSQL and I'm not even sure if this is possible. But assuming COALESCE()-Function also exists it has to be a server-side function I guess.
First, it is not clear what you are asking. oids are probably not the best type to use, primarily because they are an internal type designed for the system catalogs, so you cannot guarantee they will behave the way you expect.
Secondly, recursion seems to me a poor choice of tool if you just want to get the latest ID. If you want things to perform well, try to think in set operations rather than imperative algorithms.
If you want a trigger to get the latest (maximum) oid for a name and assign it to "newID" then:
CREATE OR REPLACE FUNCTION set_newID() RETURNS TRIGGER LANGUAGE PLPGSQL AS
$$
DECLARE maxid oid;
BEGIN
IF new."newID" IS NOT NULL THEN
RETURN new; -- do nothing
END IF;
SELECT max("ID") INTO maxid FROM tbl_ids WHERE "Name" = new."Name";
new."newID" = maxid;
RETURN new;
END;
$$;
That works with oids and ints. However it has to select a row from the db on each row modified by the trigger so you will have performance problems with bulk inserts for example.
Oh, and far better to use all lower case so you don't have to quote every identifier.
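If the intent is instead to follow a chain of replacements, a set-based way to express the recursion is a recursive CTE. The sketch below rests on an assumption the question does not confirm: that "newID" holds the "ID" of the row that supersedes the current one, and is NULL on the latest row. The single-argument signature is likewise only an illustration.

```sql
-- Assumption: "newID" points at the successor row; NULL marks the latest one.
CREATE OR REPLACE FUNCTION get_latest_id(start_id oid) RETURNS oid
LANGUAGE sql STABLE
AS $$
    WITH RECURSIVE chain AS (
        SELECT "ID", "newID"
        FROM tbl_ids
        WHERE "ID" = start_id
        -- UNION (not UNION ALL) discards rows already seen, so a cycle
        -- in the data cannot make the recursion run forever.
        UNION
        SELECT t."ID", t."newID"
        FROM tbl_ids t
        JOIN chain c ON t."ID" = c."newID"
    )
    SELECT "ID" FROM chain WHERE "newID" IS NULL LIMIT 1;
$$;
```

The function returns NULL when the starting ID does not exist or when the chain never reaches a row with a NULL "newID".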

PostgreSQL insert or update trigger function volatility category

Assume I have two tables in my DB (postgresql-9.x):
CREATE TABLE FOLDER (
KEY BIGSERIAL PRIMARY KEY,
PATH TEXT,
NAME TEXT
);
CREATE TABLE FOLDERFILE (
FILEID BIGINT,
PATH TEXT,
PATHKEY BIGINT
);
I automatically update FOLDERFILE.PATHKEY from FOLDER.KEY whenever I insert into or update FOLDERFILE:
CREATE OR REPLACE FUNCTION folderfile_fill_pathkey() RETURNS trigger AS $$
DECLARE
pathkey bigint;
changed boolean;
BEGIN
IF tg_op = 'INSERT' THEN
changed := TRUE;
ELSE IF old.FILEID != new.FILEID THEN
changed := TRUE;
END IF;
END IF;
IF changed THEN
SELECT INTO pathkey key FROM FOLDER WHERE PATH = new.path;
IF FOUND THEN
new.pathkey = pathkey;
ELSE
new.pathkey = NULL;
END IF;
END IF;
RETURN new;
END
$$ LANGUAGE plpgsql VOLATILE;
-- BEFORE, not AFTER: changes to NEW are ignored in AFTER row triggers
CREATE TRIGGER folderfile_fill_pathkey_trigger BEFORE INSERT OR UPDATE
ON FOLDERFILE FOR EACH ROW EXECUTE PROCEDURE folderfile_fill_pathkey();
So the question is about the volatility of the function folderfile_fill_pathkey(). The documentation says:
Any function with side-effects must be labeled VOLATILE
But as far as I understand, this function does not change any data in the tables it relies on, so I can mark it as IMMUTABLE. Is that correct?
Would there be any problem with IMMUTABLE trigger function if I bulk-insert many rows into FOLDERFILE within the same transaction, like:
BEGIN;
INSERT INTO FOLDERFILE ( ... );
...
INSERT INTO FOLDERFILE ( ... );
COMMIT;
Firstly, as @pozs already pointed out, the function definition you have provided is most definitely STABLE rather than IMMUTABLE since it performs database look-ups. This means that the result is not simply derived from the input parameters (as IMMUTABLE would suggest), but also from the data stored in your FOLDER table (which is bound to change). As per the documentation:
STABLE indicates that the function cannot modify the database, and
that within a single table scan it will consistently return the same
result for the same argument values, but that its result could change
across SQL statements. This is the appropriate selection for functions
whose results depend on database lookups, parameter variables (such as
the current time zone), etc.
Secondly, adding stability modifiers (IMMUTABLE/STABLE/VOLATILE) to your trigger functions serves an illustrative purpose at best, since AFAIK PostgreSQL doesn't actually perform any planning that would warrant their use. The following post from the pgsql-hackers mailing list seems to support my claim:
Volatility is a complete no-op for a trigger function anyway, as are
other planner parameters such as cost/rows, because there is no
planning involved in trigger calls.
To sum up: you're probably better off avoiding the stability keywords in your trigger(!) procedures for now, since including them seems to add little to no benefit but entails several unexpected caveats/pitfalls (see the end of @pozs's first comment).
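For an ordinary (non-trigger) function the category is not a no-op, because the planner does see those calls. As a small illustration, the same look-up wrapped in a plain SQL function would be declared STABLE; the helper name folder_key_for_path is hypothetical and not part of the question.

```sql
-- Hypothetical helper: reads from FOLDER, so its result depends on table
-- contents. That rules out IMMUTABLE; it modifies nothing, so STABLE fits.
CREATE OR REPLACE FUNCTION folder_key_for_path(p_path text) RETURNS bigint
LANGUAGE sql STABLE
AS $$
    SELECT key FROM folder WHERE path = p_path LIMIT 1;
$$;
```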

Is this generic MERGE/UPSERT function for PostgreSQL safe?

I have created a "merge" function which is supposed to execute either an UPDATE or an INSERT query, depending on existing data. Instead of writing an upsert-wrapper for each table (as in most of the available examples), this function takes entire SQL strings. Both of the SQL strings are automatically generated by our application.
The plan is to call the function like this:
-- hypothetical "settings" table, with a primary key of (user_id, setting):
SELECT merge(
$$UPDATE settings SET value = 'x' WHERE user_id = 42 AND setting = 'foo'$$,
$$INSERT INTO settings (user_id, setting, value) VALUES (42, 'foo', 'x')$$
);
Here's the full code of the merge() function:
CREATE OR REPLACE FUNCTION merge (update_sql TEXT, insert_sql TEXT) RETURNS TEXT AS
$func$
DECLARE
max_iterations INTEGER := 10;
i INTEGER := 0;
num_updated INTEGER;
BEGIN
-- usually returns before re-entering the loop
LOOP
-- first try the update
EXECUTE update_sql;
GET DIAGNOSTICS num_updated = ROW_COUNT;
IF num_updated > 0 THEN
RETURN 'UPDATE';
END IF;
-- nothing was updated: try the insert, watching out for concurrent inserts
BEGIN
EXECUTE insert_sql;
RETURN 'INSERT';
EXCEPTION WHEN unique_violation THEN
-- nop; just loop and try again from the top
END;
-- emergency brake
i := i + 1;
IF i >= max_iterations THEN
RAISE EXCEPTION 'merge(): tried looping % times, giving up now.', i;
EXIT;
END IF;
END LOOP;
END;
$func$
LANGUAGE plpgsql;
It appears to work well enough in my tests, but I'm not certain if I haven't missed anything crucial, especially regarding concurrent UPDATE/INSERT/DELETE queries, which may be issued without using this function. Did I overlook anything important?
Among the resources I consulted for this function are:
UPDATE/INSERT example 40.2 in the PostgreSQL manual
Why is UPSERT so complicated?
SO: Insert, on duplicate update (postgresql)
(Edit: one of the goals was to avoid locking the target table.)
The answer to your question depends on how your application(s) will access the database. There are many ways to solve this, as nicely discussed in the depesz post you cited yourself. In addition, you might want to consider using writeable CTEs. The question "Insert, on duplicate update in PostgreSQL?" also has some interesting discussion that may help with your decision.
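For reference, a writeable-CTE version of the settings example from the question might look like the sketch below (writeable CTEs need PostgreSQL 9.1 or later). It is not free of race conditions: two sessions can both find nothing to update and both attempt the insert, so a unique_violation still has to be handled or retried by the caller.

```sql
-- Assumption: the hypothetical settings table from the question,
-- with primary key (user_id, setting).
WITH upd AS (
    UPDATE settings
    SET value = 'x'
    WHERE user_id = 42 AND setting = 'foo'
    RETURNING user_id
)
-- Insert only when the UPDATE above touched no row.
INSERT INTO settings (user_id, setting, value)
SELECT 42, 'foo', 'x'
WHERE NOT EXISTS (SELECT 1 FROM upd);
```

On PostgreSQL 9.5 and later, INSERT ... ON CONFLICT (user_id, setting) DO UPDATE SET value = EXCLUDED.value covers the same case in a single statement and handles the concurrent-insert race for you.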