How to ensure all Postgres queries have WHERE clause? - postgresql

I am building a multi tenant system in which many clients data will be in the same database.
I am paranoid about some developer forgetting to put the appropriate "WHERE clientid = " onto every query.
Is there a way to, at the database level, ensure that every query has the correct WHERE = clause, thereby ensuring that no query will ever be executed without also specifying which client the query is for?
I was wondering if maybe the query rewrite rules could do this but it's not clear to me if they can do so.
thanks

Deny permissions on the table t for all users. Then give them permission on a function f that returns the table and accepts the parameter client_id:
create or replace function f(_client_id integer)
returns setof t as
$$
select *
from t
where client_id = _client_id
$$ language sql
;
select * from f(1);
client_id | v
-----------+---
1 | 2

Another way is to create a VIEW for:
SELECT *
FROM t
WHERE t.client_id = current_setting('session_vars.client_id');
And use SET session_vars.client_id = 1234 at the start of the session.
Deny acces to the tables, and leave only permissins for views.
You may need to create rewrite rules for UPDATE, DELETE, INSERT for the views (it depends on your PostgreSQL version).
Performance penalty will be small (if any) because PostgreSQL will rewrite the queries before execution.

Related

PostgreSQL 9.5 - Row level security / ROLE best practices

I'm tying to grasp the best way to use the new row level security feature in a multi-tenant database that supports a web application.
Currently, the application has a few different ROLEs available, depending on the action it is attempting to take.
Once the application makes a connection using its own ROLE, the application passes authentication parameters (provided by the user) into different functions that filter out rows based on the user supplied authentication parameters. The system is designed to work with thousands of users and it seems to work; however, it's defiantly clunky (and slow).
It seems that if I wanted to use the new row level security feature I would need to create a new ROLE for each real world user (not just for the web application) to access the database.
Is this correct? and if so, is it a good idea to create thousands of ROLEs in the database?
Update from a_horse_with_no_name's link in the comments (thanks, that thread is spot on):
CREATE USER application;
CREATE TABLE t1 (id int primary key, f1 text, app_user text);
INSERT INTO t1 VALUES(1,'a','bob');
INSERT INTO t1 VALUES(2,'b','alice');
ALTER TABLE t1 ENABLE ROW LEVEL SECURITY;
CREATE POLICY P ON t1 USING (app_user = current_setting('app_name.app_user'));
GRANT SELECT ON t1 TO application;
SET SESSION AUTHORIZATION application;
SET app_name.app_user = 'bob';
SELECT * FROM t1;
id | f1 | app_user
----+----+----------
1 | a | bob
(1 row)
SET app_name.app_user = 'alice';
SELECT * FROM t1;
id | f1 | app_user
----+----+----------
2 | b | alice
(1 row)
SET app_name.app_user = 'none';
SELECT * FROM t1;
id | f1 | app_user
----+----+----------
(0 rows)
Now, I'm confused by current_setting('app_name.app_user') as I was under the impression this was only for configuration parameters... where is app_name defined?
Setting security policies based on a session setting is a BAD BAD BAD idea (I hate both CAPS and bold so trust me that I mean it). Any user can SET SESSION 'app_name.app_user' = 'bob', so as soon as someone figures out that "app_name.app_user" is the door in (trust me, they will) then your whole security is out the door.
The only way that I see is to use a table accessible to your webadmin only which stores session tokens (uuid type comes to mind, cast to text for ease of use). The login() function is SECURITY DEFINER (assuming owner webadmin), setting the token as well as a session SETting, and then the table being owned by (or having appropriate privileges for) webadmin refers to that table and the session setting in its policy.
Unfortunately, you cannot use temporary (session) tables here because you cannot build policies on a temporary table so you have to use a "real" table. That is something of a performance penalty, but weigh that against the damage of a hack...
In practice:
CREATE FUNCTION login (uname text, pwd text) RETURNS boolean AS $$
DECLARE
t uuid;
BEGIN
PERFORM * FROM users WHERE user = uname AND password = pwd;
IF FOUND THEN
INSERT INTO sessions SET token = uuid_generate_v4()::text, user ....
RETURNING token INTO t;
SET SESSION "app_name.token" = t;
RETURN true;
ELSE
SET SESSION "app_name.token" = '';
RETURN false;
END IF;
END; $$ LANGUAGE plpgsql STRICT;
And now your policy would link to sessions:
CREATE POLICY p ON t1 FOR SELECT
USING (SELECT true FROM sessions WHERE token = current_setting('app_name.token'));
(Since uuids may be assumed to be unique, no need for LIMIT 1. ordering or other magic, if the uuid is in the table the policy will pass, otherwise fail.) The uuid is impossible to guess (within your lifetime anyway) and impossible to retrieve by anyone but webadmin.

How to implement data authorization logic inside a Posgres db?

We are building a web app that sits on top of a postgres db. We would like to implement authorization logic inside the database so that it is opaque to the app. For example, suppose a server side controller requests all users from a view v_user. We would like for the db to handle the authorization of which users the currently logged in user can or cannot see. Obviously the server is going to need to send over the login_pkey (user_pkey of logged in user) on every request for this to work.
The issue we are having is with reads. We were able to do this for inserts, updates and deletes by putting the logic in the triggers behind those operations on all views. The issue we are having is how to do this for reads. How can we include variable logic (i.e. logic that depends on which login_pkey is passed) in a view (or some other place) and how can we pass this information for each query.
If it is important, the server we are using is Node and the ORM is Sequelize.
Thanks in advance.
Ideally you really want row security to do this well. It's available in the 9.5 version in beta now.
But you can do what you need without.
To pass a user identity you can use a custom variable, e.g.
SET myapp.appuser = 'fred';
then access it with current_setting e.g.
SELECT current_setting('myapp.appuser')
This will raise an ERROR if the setting does not exist, so you should set a default blank value in postgresql.conf, with ALTER DATABASE SET, etc. Or use PostgreSQL 9.5's current_setting('settingname', true) to return null on missing values.
To filter what users can see, use views that check the user identity setting your app sets at connect-time, per the above.
This is not safe if your users can run arbitrary SQL, because nothing stops them RESETing the setting or doing a SET myapp.appuser = 'the-admin'.
It's very easy to implement this using Pl/Python global dict GD. First, you need to write auth() function:
create or replace function auth(login text, pass text) as $$
-- Check auth login here
GD['user_id'] = get_user_id_by_login(login)
$$ language plpythonu;
Then you have to write get_current_user() function
create or replace function get_current_user() returns integer as $$
return GD['user_id']
$$ langugage plpythonu;
Now, you can get current user any time you want. For example:
-- inside stored procedure
vUserId := get_current_user()
-- in query
select * from some_table where owner_id = get_current_user()
Remember, that GD is stored per session, so, as you wrote, you need to login every time you connect to database. In my ORM I do like this:
class MyORM():
def login(self, user, password):
cursor = self.__conn.cursor()
result = cursor.execute('select core.login(%s, %s)', (user, password,))
data = cursor.fetchone()
cursor.close()
return data[0]
def auth(self, cursor):
cursor.execute('select core.auth(%s)', (g.user_id,))
def query(self, query):
cursor = self.__conn.cursor()
self.auth(cursor)
cursor.execute(query)
data = cursor.fetchall()
cursor.close()
return data

Hana - Select without Cursor

I am doing a select in a stored procedure with a cursor, I would to know If I could do the same select without using a cursor.
PROCEDURE "DOUGLAS"."tmp.douglas.yahoo::testando" ( )
LANGUAGE SQLSCRIPT
SQL SECURITY INVOKER
DEFAULT SCHEMA DOUGLAS
AS
BEGIN
/*****************************
procedure logic
*****************************/
declare R varchar(50);
declare cursor users FOR select * from USERS WHERE CREATE_TIME between ADD_SECONDS (CURRENT_TIMESTAMP , -7200 ) and CURRENT_TIMESTAMP;
FOR R AS users DO
CALL _SYS_REPO.GRANT_ACTIVATED_ROLE('dux.health.model.roles::finalUser',R.USER_NAME);
END FOR;
END;
Technically you could convert the result set into an ARRAY and then loop over the array - but for what?
The main problem is that you want to automatically grant permissions on any users that match your time based WHERE condition.
This is not a good idea in most scenarios.
The point of avoiding cursors is to allow the DBMS to optimize SQL commands. Telling the DBMS what data you want, not how to produce it.
In this example it really wouldn't make any difference performance wise.
A much more important factor is that you run SELECT * even though you only need the USER_NAME and that your R variable is declared as VARCHAR(50) (which is wrong if you wanted to store the USER_NAME in it) but never actually used.
The R variable in the FOR loop exists in a different validity context and actually contains the current row of the cursor.

Postgres Rules Preventing CTE Queries

Using Postgres 9.3:
I am attempting to automatically populate a table when an insert is performed on another table. This seems like a good use for rules, but after adding the rule to the first table, I am no longer able to perform inserts into the second table using the writable CTE. Here is an example:
CREATE TABLE foo (
id INT PRIMARY KEY
);
CREATE TABLE bar (
id INT PRIMARY KEY REFERENCES foo
);
CREATE RULE insertFoo AS ON INSERT TO foo DO INSERT INTO bar VALUES (NEW.id);
WITH a AS (SELECT * FROM (VALUES (1), (2)) b)
INSERT INTO foo SELECT * FROM a
When this is run, I get the error
"ERROR: WITH cannot be used in a query that is rewritten by rules
into multiple queries".
I have searched for that error string, but am only able to find links to the source code. I know that I can perform the above using row-level triggers instead, but it seems like I should be able to do this at the statement level. Why can I not use the writable CTE, when queries like this can (in this case) be easily re-written as:
INSERT INTO foo SELECT * FROM (VALUES (1), (2)) a
Does anyone know of another way that would accomplish what I am attempting to do other than 1) using rules, which prevents the use of "with" queries, or 2) using row-level triggers? Thanks,
        
TL;DR: use triggers, not rules.
Generally speaking, prefer triggers over rules, unless rules are absolutely necessary. (Which, in practice, they never are.)
Using rules introduces heaps of problems which will needlessly complicate your life down the road. You've run into one here. Another (major) one is, for instance, that the number of affected rows will correspond to that of the very last query -- if you're relying on FOUND somewhere and your query is incorrectly reporting that no rows were affected by a query, you'll be in for painful bugs.
Moreover, there's occasional talk of deprecating Postgres rules outright:
http://postgresql.nabble.com/Deprecating-RULES-td5727689.html
As the other answer I definitely recommend using INSTEAD OF triggers before RULEs.
However if for some reason you don't want to change existing VIEW RULEs and still want use WITH you can do so by wrapping the VIEW in a stored procedure:
create function insert_foo(int) returns void as $$
insert into foo values ($1)
$$ language sql;
WITH a AS (SELECT * FROM (VALUES (1), (2)) b)
SELECT insert_foo(a.column1) from a;
This could be useful when using some legacy db through some system that wraps statements with CTEs.

PostgreSQL Set-Returning Function Call Optimization

I have the following problem with PostgreSQL 9.3.
There is a view encapsulating a non-trivial query to some resources (e.g., documents). Let's illustrate it as simple as
CREATE VIEW vw_resources AS
SELECT * FROM documents; -- there are several joined tables in fact...
The client application uses the view usually with some WHERE conditions on several fields, and might also use paging of the results, so OFFSET and LIMIT may also be applied.
Now, on top of the actual resource list computed by vw_resources, I only want to display resources which the current user is allowed for. There is quite a complex set of rules regarding privileges (they depend on several attributes of the resources in question, explicit ACLs, implicit rules based on user roles or relations to other users...) so I wanted to encapsulate all of them in a single function. To prevent repetitive costly queries for each resource, the function takes a list of resource IDs, evaluates the privileges for all of them at once, and returns the set of the requested resource IDs together with the according privileges (read/write is distinguished). It looks roughly like this:
CREATE FUNCTION list_privileges(resource_ids BIGINT[])
RETURNS TABLE (resource_id BIGINT, privilege TEXT)
AS $function$
BEGIN
-- the function lists privileges for a user that would get passed in an argument - omitting that for simplicity
RAISE NOTICE 'list_privileges called'; -- for diagnostic purposes
-- for illustration, let's simply grant write privileges for any odd resource:
RETURN QUERY SELECT id, (CASE WHEN id % 2 = 1 THEN 'write' ELSE 'none' END)
FROM unnest(resource_ids) id;
END;
$function$ LANGUAGE plpgsql STABLE;
The question is how to integrate such a function in the vw_resources view for it to give only resources the user is privileged for (i.e., has 'read' or 'write' privilege).
A trivial solution would use a CTE:
CREATE VIEW vw_resources AS
WITH base_data AS (
SELECT * FROM documents
)
SELECT base_data.*, priv.privilege
FROM base_data
JOIN list_privileges((SELECT array_agg(resource_id) FROM base_data)) AS priv USING (resource_id)
WHERE privilege IN ('read', 'write');
The problem is that the view itself gives too much rows - some WHERE conditions and OFFSET/LIMIT clauses are only applied to the view itself, like SELECT * FROM vw_resources WHERE id IN (1,2,3) LIMIT 10 (any complex filtering might be requested by the client application). And since PostgreSQL is unable to push the conditions down the CTE, the list_privileges(BIGINT[]) function ends up with evaluating privileges for all resources in the database, which effectively kills the performance.
So I attempted to use a window function which would collect resource IDs from the whole result set and join the list_privileges(BIGINT[]) function in an outer query, like illustrated below, but the list_privileges(BIGINT[]) function ends up being called repetitively for each row (as testified by 'list_privileges called' notices), which kinda ruins the previous effort:
CREATE VIEW vw_resources AS
SELECT d.*, priv.privilege
FROM (
SELECT *, array_agg(resource_id) OVER () AS collected
FROM documents
) AS d
JOIN list_privileges(d.collected) AS priv USING (resource_id)
WHERE privilege IN ('read', 'write');
I would resort to forcing clients to give two separate queries, the first taking the vw_resources without privileges applied, the second calling the list_privileges(BIGINT[]) function passing it the list of resource IDs fetched by the first query, and filtering the disallowed resources on the client side. It is quite clumsy for the client, though, and obtaining e.g. the first 20 allowed resources would be practically impossible as limiting the first query simply does not get it - if some resources are filtered out due to privileges then we simply don't have 20 rows in the overall result...
Any help welcome!
P.S. For the sake of completeness, I append a sample documents table:
CREATE TABLE documents (resource_id BIGINT, content TEXT);
INSERT INTO documents VALUES (1,'a'),(2,'b'),(3,'c');
If you must use plpgsql then create the function taking no arguments
create function list_privileges()
returns table (resource_id bigint, privilege text)
as $function$
begin
raise notice 'list_privileges called'; -- for diagnostic purposes
return query select 1, case when 1 % 2 = 1 then 'write' else 'none' end
;
end;
$function$ language plpgsql stable;
And join it to the other complex query to form the vw_resources view
create view vw_resources as
select *
from
documents d
inner join
list_privileges() using(resource_id)
The filter conditions will be added at query time
select *
from vw_resources
where
id in (1,2,3)
and
privilege in ('read', 'write')
Let the planner do its optimization magic and check the explain output before any "premature optimization".
This is just a conjecture: The function might make it harder or impossible for the planner to optimize.
If plpgsql is not really necessary, and that is very frequent, I would just create a view in instead of the function
create view vw_list_privileges as
select
1 as resource_id,
case when 1 % 2 = 1 then 'write' else 'none' end as privilege
And join it the same way to the complex query
create view vw_resources as
select *
from
documents d
inner join
vw_list_privileges using(resource_id)