I have a schema of tables whose contents basically boil down to:
A set of users
A set of object groups
An access control list (acl) indicating what users have access to what groups
A set of objects, each of which belongs to exactly one group.
I want to create a simple application that supports access control. I'm thinking views would be a good approach here.
Suppose I have the following database initialization:
/* Database definition */
BEGIN;
CREATE SCHEMA foo;
CREATE TABLE foo.users (
id SERIAL PRIMARY KEY,
name TEXT
);
CREATE TABLE foo.groups (
id SERIAL PRIMARY KEY,
name TEXT
);
CREATE TABLE foo.acl (
user_ INT REFERENCES foo.users,
group_ INT REFERENCES foo.groups
);
CREATE TABLE foo.objects (
id SERIAL PRIMARY KEY,
group_ INT REFERENCES foo.groups,
name TEXT,
data TEXT
);
/* Sample data */
-- Create groups A and B
INSERT INTO foo.groups VALUES (1, 'A');
INSERT INTO foo.groups VALUES (2, 'B');
-- Create objects belonging to group A
INSERT INTO foo.objects VALUES (1, 1, 'object in A', 'apples');
INSERT INTO foo.objects VALUES (2, 1, 'another object in A', 'asparagus');
-- Create objects belonging to group B
INSERT INTO foo.objects VALUES (3, 2, 'object in B', 'bananas');
INSERT INTO foo.objects VALUES (4, 2, 'object in B', 'blueberries');
-- Create users
INSERT INTO foo.users VALUES (1, 'alice');
INSERT INTO foo.users VALUES (2, 'amy');
INSERT INTO foo.users VALUES (3, 'billy');
INSERT INTO foo.users VALUES (4, 'bob');
INSERT INTO foo.users VALUES (5, 'caitlin');
INSERT INTO foo.users VALUES (6, 'charlie');
-- alice and amy can access group A
INSERT INTO foo.acl VALUES (1, 1);
INSERT INTO foo.acl VALUES (2, 1);
-- billy and bob can access group B
INSERT INTO foo.acl VALUES (3, 2);
INSERT INTO foo.acl VALUES (4, 2);
-- caitlin and charlie can access groups A and B
INSERT INTO foo.acl VALUES (5, 1);
INSERT INTO foo.acl VALUES (5, 2);
INSERT INTO foo.acl VALUES (6, 1);
INSERT INTO foo.acl VALUES (6, 2);
COMMIT;
My idea is to use views that mirror the database, but restrict content to only that which the current user (ascertained by my PHP script) may access (here I'll just use the user 'bob'). Suppose I run this at the beginning of every PostgreSQL session (meaning every time someone accesses a page on my site):
BEGIN;
CREATE TEMPORARY VIEW users AS
SELECT * FROM foo.users
WHERE name='bob';
CREATE TEMPORARY VIEW acl AS
SELECT acl.* FROM foo.acl, users
WHERE acl.user_=users.id;
CREATE TEMPORARY VIEW groups AS
SELECT groups.* FROM foo.groups, acl
WHERE groups.id=acl.group_;
CREATE TEMPORARY VIEW objects AS
SELECT objects.* FROM foo.objects, groups
WHERE objects.group_=groups.id;
COMMIT;
My question is, is this a good approach? Do these CREATE TEMPORARY VIEW statements produce significant overhead, especially compared to a couple simple queries?
Also, is there a way to make these views permanent in my database definition, then bind a value to the user name per session? This way, it doesn't have to create all these views every time a user loads a page.
Several problems with this approach:
One user web session is not the same thing as one database session. Multiple users with with sort of setup would fail instantly.
Management overhead creating/destroying the views.
Instead, I would recommend something like the following view:
CREATE VIEW AllowedObjects
SELECT objects.*, users.name AS alloweduser
FROM objects
INNER JOIN groups ON groups.id = objects.group_
INNER JOIN acl ON acl.group_ = groups.id
INNER JOIN users ON users.id = acl.user_
Then, everywhere you select objects:
SELECT * FROM AllowedObjects
WHERE alloweduser='Bob'
This assumes Bob can only have one ACL joining him to a particular group, otherwise a DISTINCT would be necessary.
This could be abstracted to a slightly less complex view that could be used to make it easier to check permissions for UPDATE and DELETE:
CREATE VIEW AllowedUserGroup
SELECT groups.id AS allowedgroup, users.name AS alloweduser
FROM groups
INNER JOIN acl ON acl.group_ = groups.id
INNER JOIN users ON users.id = acl.user_
This provides a flattened view of which users are in which groups, which you can check against the objects table during an UPDATE/DELETE:
UPDATE objects SET foo='bar' WHERE id=42 AND EXISTS
(SELECT NULL FROM AllowedUserGroup
WHERE alloweduser='Bob' AND allowedgroup = objects.group_)
Related
I have table tariffs, with two columns: (tariff_id, reception)
I have table users, with two columns: (user_id, reception)
And I have table users_tariffs with two columns: (user_id, tariff_id).
I want to prevent situation when tariff from one reception is assigned to user from another reception. How can I do that?
E.G
Users:
user_id | reception
Putin | Russia
Trump | USA
Tariffs:
tariff_id | reception
cheap | USA
expensive | Russia
Wrong situation at users_tariffs, because Cheap tariff is for USA only:
user_id | tariff_id
Putin | Cheap
SOLUTION 1: FOREIGN KEY CONSTRAINTS
I am assuming the following table definitions.
In particular, the composite key in user_tariffs makes this a many-to-many relationship between users and tariffs.
CREATE TABLE tariffs (tariff_id int NOT NULL PRIMARY KEY,
reception text NOT NULL);
CREATE TABLE users (user_id int NOT NULL PRIMARY KEY,
reception text NOT NULL);
CREATE TABLE user_tariffs (tariff_id int NOT NULL REFERENCES tariffs (tariff_id),
user_id int NOT NULL REFERENCES users (user_id),
PRIMARY KEY (tariff_id, user_id));
You probably need a combination of all three columns somewhere, so let's create this:
ALTER TABLE user_tariffs ADD COLUMN reception text;
UPDATE user_tariffs a
SET reception = b.reception
FROM (SELECT * FROM tariffs) b
WHERE a.tariff_id = b.tariff_id;
ALTER TABLE user_tariffs ALTER COLUMN reception SET NOT NULL;
Now we can use FOREIGN KEY REFERENCES (user_id, reception) into users.
CREATE UNIQUE INDEX ON tariffs (tariff_id, reception);
ALTER TABLE user_tariffs ADD FOREIGN KEY (tariff_id, reception)
REFERENCES tariffs (tariff_id, reception);
In addition, we can use FK REFs (tariff_id, reception) into tariffs.
CREATE UNIQUE INDEX ON users (user_id, reception);
ALTER TABLE user_tariffs ADD FOREIGN KEY (user_id, reception)
REFERENCES users (user_id, reception);
Populate with data:
INSERT INTO users VALUES (1, 'cheap'), (2, 'expensive');
INSERT INTO tariffs VALUES (1, 'cheap'), (2, 'expensive');
Now assume we have the following data (user_id, tariff_id) to insert:
WITH data (user_id, tariff_id)
AS (VALUES (1, 2), (2, 1)), -- here is your application data
datas (user_id, tariff_id, reception)
AS (SELECT user_id,
tariff_id,
(SELECT u.reception -- reception calculated by user
FROM users u
WHERE u.user_id = d.user_id)
FROM data d)
INSERT INTO user_tariffs SELECT * FROM datas ;
Then you cannot insert the data, because you can only add (1, 1) or (2, 2) with the same reception, but not (1, 2) or (2, 1) with different reception's. The error message is:
ERROR: insert or update on table "user_tariffs" violates foreign key constraint "user_tariffs_user_id_fkey1"
DETAIL: Key (user_id, reception)=(2, cheap) is not present in table "users".
But you can insert with data AS VALUES (1, 1), (2, 2).
I think the FOREIGN KEY CONSTRAINT solution is to be preferred.
Please describe your functional dependencies, if you want better table designs.
SOLUTION 2: TRIGGER
-- DROP TABLE user_tariffs CASCADE;
-- DROP TABLE users CASCADE;
-- DROP TABLE tariffs CASCADE;
CREATE TABLE tariffs (tariff_id int NOT NULL PRIMARY KEY,
reception text NOT NULL);
CREATE TABLE users (user_id int NOT NULL PRIMARY KEY,
reception text NOT NULL);
CREATE TABLE user_tariffs (tariff_id int NOT NULL REFERENCES tariffs (tariff_id),
user_id int NOT NULL REFERENCES users (user_id),
PRIMARY KEY (tariff_id, user_id));
INSERT INTO users VALUES (1, 'cheap'), (2, 'expensive');
INSERT INTO tariffs VALUES (1, 'cheap'), (2, 'expensive');
-- table user_tariffs (user_id, tariff_id) only, without reception column.
Create a function with return type trigger:
CREATE OR REPLACE FUNCTION check_reception()
RETURNS trigger AS $$
DECLARE valid boolean := false;
BEGIN
SELECT (SELECT u.reception FROM users u WHERE u.user_id = NEW.user_id)
= (SELECT t.reception FROM tariffs t WHERE t.tariff_id = NEW.tariff_id)
INTO valid FROM user_tariffs ;
IF valid = false
THEN RAISE EXCEPTION '(user, tariff, reception) invalid.';
END IF;
RETURN NEW;
END; $$ LANGUAGE plpgsql ;
and register it:
CREATE TRIGGER reception_trigger
AFTER INSERT OR UPDATE ON user_tariffs
FOR EACH ROW EXECUTE PROCEDURE check_reception();
Now try to insert (1, 2), which would be (cheap, expensive) and is not allowed:
INSERT INTO user_tariffs VALUES (1, 2);
ERROR: (user, tariff, reception) invalid.
KONTEXT: PL/pgSQL function check_reception() line 7 at RAISE
But we can insert (1, 1), which is (cheap, cheap) without problem:
INSERT INTO user_tariffs VALUES (1, 1);
SELECT * FROM user_tariffs;
Remark
Triggers are not the best solution here, in my opinion. Try to avoid triggers, if possible. They can have side effects (transactions etc). Check StackOverflow for further details :)
I'm currently going through the growing pains of trying to learn about functions and triggers. I'm trying to do a problem from a book I'm reading , but i dont understand how to do certain parts.
using this table
create table movies (
id integer primary key,
title varchar(255) not null,
year integer
);
insert into movies values (1, 'The Croods', 2013);
insert into movies values (2, 'Now You See Me', 2013);
insert into movies values (3, 'Argo', 2012);
insert into movies values (4, 'Jurassic World', 2015);
create table discs (
id integer primary key,
movie_id integer not null references movies(id),
type_id integer references disc_types(id),
price decimal(10,2),
available boolean
);
insert into discs values (1, 1, 1, 1.59, 't');
insert into discs values (2, 1, 1, 1.59, 'f');
insert into discs values (3, 1, 2, 2.99, 'f');
insert into discs values (4, 2, 1, 1.29, 't');
insert into discs values (5, 2, 1, 1.29, 't');
insert into discs values (6, 2, 2, 2.99, 't');
insert into discs values (7, 3, 2, 2.59, 't');
insert into discs values (8, 3, 2, 2.59, 't');
create table customers (
id integer primary key,
name varchar(255),
email varchar(255)
);
insert into customers values (1, 'John', 'john#hotmail.com');
insert into customers values (2, 'Jane', 'jane#gmail.com');
create table rentals (
id integer primary key,
customer_id integer not null references customers(id),
disc_id integer not null references discs(id),
date_rented date,
date_returned date
);
insert into rentals values (1, 1, 7, '2013-10-01', '2013-10-03');
insert into rentals values (2, 2, 5, '2013-10-05', '2013-10-06');
insert into rentals values (3, 2, 2, '2013-11-02', null);
insert into rentals values (4, 2, 3, '2013-11-02', null);
create table ratings (
customer_id integer not null references customers(id),
movie_id integer not null references movies(id),
rating integer,
primary key (customer_id, movie_id)
);
insert into ratings values (1, 1, 1);
insert into ratings values (1, 2, 4);
insert into ratings values (1, 3, 5);
insert into ratings values (2, 1, 4);
my logic was that i would have the new values of the ratings table that were going to be inserted or updated and use those to compare to whats in the rentals table to see if that customer had rented that movie already, if they did then they could enter a rating. but i cant transfer that logic in this lol. unless there an easier way to do this.
The loop inside the function complicates matters a bit, let's see if we can get rid of it. Your ratings table has a reference to customer and movie so we need a join.
SELECT COUNT(*) INTO rented FROM rentals WHERE disc_id IN
(SELECT id from discs INNER JOIN
rentals ON disc_id = discs.id where movie_id = new.movie_id)
AND customer_id = new.customer_id
Right this should make the logic of your stored procedure a lot easier. I am now leaving you to finish it because this after all is a learning exercise.
You need this sort of a join because it's more efficient and simpler than the loop. The ratings table has a reference to the movie_id but the rentals table only has a disc_id thus to find out if the user has rented a particular movie, you need to join it through the disc table.
You will need to change the return values. ref: http://www.postgresql.org/docs/9.2/static/plpgsql-trigger.html
Row-level triggers fired BEFORE can return null to signal the trigger
manager to skip the rest of the operation for this row (i.e.,
subsequent triggers are not fired, and the INSERT/UPDATE/DELETE does
not occur for this row). If a nonnull value is returned then the
operation proceeds with that row value
And also note that you do not do an INSERT inside your trigger function. You just return a non null value for the insert to proceed.
This is the EXISTS() version. (BTW: the definition for movies is missing)
CREATE OR REPLACE FUNCTION rate_only_rented()
RETURNS TRIGGER AS $func$
BEGIN
IF ( NOT EXISTS (
SELECT *
FROM rentals r
JOIN discs d ON r.disc_id = d.id
WHERE d.movie_id = NEW.movie_id
AND r.customer_id = NEW.customer_id
) ) THEN
RAISE EXCEPTION 'you(%) have not rented this movie(%) before'
, NEW.customer_id ,NEW.movie_id;
RETURN NULL;
ELSE
RETURN NEW;
END IF;
END;
$func$ language plpgsql;
And the trigger:
CREATE TRIGGER rate_only_rented
AFTER INSERT OR UPDATE
ON ratings
FOR EACH ROW
EXECUTE PROCEDURE rate_only_rented()
;
I understand that in a graph DB, relations are created after insertion and instead or field to field relations, they are record to record relation. If my understanding is correct, then what do I need to do when I import a million records from a CSV file and need to relate them to records in the other table?
Is it not possible to relations (edges) at design time, before insertion so that whenever a record is inserted, it has a relation already there?
My database is detailed as below.
create class Country extends V
create class Immigrant extends V
create class comesFrom extends E
create property Country.c_id integer
create property Country.c_name String
create property Immigrant.i_id integer
create property Immigrant.i_name String
create property Immigrant.i_country Integer
If I create the edges manually, it will be like this.
insert into Country(c_id, c_name) values (1, 'USA')
insert into Country(c_id, c_name) values (2, 'UK')
insert into Country(c_id, c_name) values (3,'PAK')
insert into Immigrant(i_id, i_name,i_country) values (1, 'John',1)
insert into Immigrant(i_id, i_name,i_country) values (2, 'Graham',2)
insert into Immigrant(i_id, i_name,i_country) values (3, 'Ali',3)
create edge comesFrom from (select from Immigrant where i_country = 1) to (select from Country where c_id = 1)
create edge comesFrom from (select from Immigrant where i_country = 2) to (select from Country where c_id = 2)
create edge comesFrom from (select from Immigrant where i_country = 3) to (select from Country where c_id = 3)
You can use OrientDB ETL to do that. For more information look at this example: http://orientdb.com/docs/last/Import-from-CSV-to-a-Graph.html
I have the following scenario in a Postgres 9.3 database:
Tables B and C reference Table A.
Table C has an optional field that references table B.
I would like to ensure that for each row of table C that references table B, c.b.a = c.a. That is, if C has a reference to B, both rows should point at the same row in table A.
I could refactor table C so that if c.b is specified, c.a is null but that would make queries joining tables A and C awkward.
I might also be able to make table B's primary key include its reference to table A and then make table C's foreign key to table B include table C's reference to table A but I think this adjustment would be too awkward to justify the benefit.
I think this can be done with a trigger that runs before insert/update on table C and rejects operations that violate the specified constraint.
Is there a better way to enforce data integrity in this situation?
There is a very simple, bullet-proof solution. Works for Postgres 9.3 - when the original question was asked. Works for the current Postgres 13 - when the question in the bounty was added:
Would like information on if this is possible to achieve without database triggers
FOREIGN KEY constraints can span multiple columns. Just include the ID of table A in the FK constraint from table C to table B. This enforces that linked rows in B and C always point to the same row in A. Like:
CREATE TABLE a (
a_id int PRIMARY KEY
);
CREATE TABLE b (
b_id int PRIMARY KEY
, a_id int NOT NULL REFERENCES a
, UNIQUE (a_id, b_id) -- redundant, but required for FK
);
CREATE TABLE c (
c_id int PRIMARY KEY
, a_id int NOT NULL REFERENCES a
, b_id int
, CONSTRAINT fk_simple_and_safe_solution
FOREIGN KEY (a_id, b_id) REFERENCES b(a_id, b_id) -- THIS !
);
Minimal sample data:
INSERT INTO a(a_id) VALUES
(1)
, (2);
INSERT INTO b(b_id, a_id) VALUES
(1, 1)
, (2, 2);
INSERT INTO c(c_id, a_id, b_id) VALUES
(1, 1, NULL) -- allowed
, (2, 2, 2); -- allowed
Disallowed as requested:
INSERT INTO c(c_id, a_id, b_id) VALUES (3,2,1);
ERROR: insert or update on table "c" violates foreign key constraint "fk_simple_and_safe_solution"
DETAIL: Key (a_id, b_id)=(2, 1) is not present in table "b".
db<>fiddle here
The default MATCH SIMPLE behavior of FK constraints works like this (quoting the manual):
MATCH SIMPLE allows any of the foreign key columns to be null; if any of them are null, the row is not required to have a match in the referenced table.
So NULL values in c(b_id) are still allowed (as requested: "optional field"). The FK constraint is "disabled" for this special case.
We need the logically redundant UNIQUE constraint on b(a_id, b_id) to allow the FK reference to it. But by making it out to be on (a_id, b_id) instead of (b_id, a_id), it is also useful in its own right, providing a useful index on b(a_id) to support the other FK constraint, among other things. See:
Is a composite index also good for queries on the first field?
(An additional index on c(a_id) is typically useful accordingly.)
Further reading:
Differences between MATCH FULL, MATCH SIMPLE, and MATCH PARTIAL?
Enforcing constraints “two tables away”
I ended up creating a trigger as follows:
create function "check C.A = C.B.A"()
returns trigger
as $$
begin
if NEW.b is not null then
if NEW.a != (select a from B where id = NEW.b) then
raise exception 'a != b.a';
end if;
end if;
return NEW;
end;
$$
language plpgsql;
create trigger "ensure C.A = C.B.A"
before insert or update on C
for each row
execute procedure "check C.A = C.B.A"();
Would like information on if this is possible to achieve without database triggers
Yes, it is possible. The mechanism is called ASSERTION and it is defined in SQL-92 Standard(though it is not implemented by any major RDBMS).
In short it allows to create multiple-row constraints or multi-table check constraints.
As for PostgreSQL it could be emulated by using view with WITH CHECK OPTION and performing operation on view instead of base table.
WITH CHECK OPTION
This option controls the behavior of automatically updatable views. When this option is specified, INSERT and UPDATE commands on the view will be checked to ensure that new rows satisfy the view-defining condition (that is, the new rows are checked to ensure that they are visible through the view). If they are not, the update will be rejected.
Example:
CREATE TABLE a(id INT PRIMARY KEY, cola VARCHAR(10));
CREATE TABLE b(id INT PRIMARY KEY, colb VARCHAR(10), a_id INT REFERENCES a(id) NOT NULL);
CREATE TABLE c(id INT PRIMARY KEY, colc VARCHAR(10),
a_id INT REFERENCES a(id) NOT NULL,
b_id INT REFERENCES b(id));
Sample inserts:
INSERT INTO a(id, cola) VALUES (1, 'A');
INSERT INTO a(id, cola) VALUES (2, 'A2');
INSERT INTO b(id, colb, a_id) VALUES (12, 'B', 1);
INSERT INTO c(id, colc, a_id) VALUES (15, 'C', 2);
Violating the condition(connecting C with B different a_id on both tables)
UPDATE c SET b_id = 12 WHERE id = 15;;
-- no issues whatsover
Creating view:
CREATE VIEW view_c
AS
SELECT *
FROM c
WHERE NOT EXISTS(SELECT 1
FROM b
WHERE c.b_id = b.id
AND c.a_id != b.a_id) -- here is the clue, we want a_id to be the same
WITH CHECK OPTION ;
Trying update second time(error):
UPDATE view_c SET b_id = 12 WHERE id = 15;
--ERROR: new row violates check option for view "view_c"
--DETAIL: Failing row contains (15, C, 2, 12).
Trying brand new inserts with incorrect data(also errors)
INSERT INTO b(id, colb, a_id) VALUES (20, 'B2', 2);
INSERT INTO view_c(id, colc, a_id, b_id) VALUES (30, 'C2', 1, 20);
--ERROR: new row violates check option for view "view_c"
--DETAIL: Failing row contains (30, C2, 1, 20)
db<>fiddle demo
I have a database with companies and their products, I want for each
company to have a separate product id sequence.
I know that postgresql can't do this, the only way is to have a separate sequence for each company but this is cumbersome.
I thought about a solution to have a separate table to hold the sequences
CREATE TABLE "sequence"
(
"table" character varying(25),
company_id integer DEFAULT 0,
"value" integer
)
"table" will be holt the table name for the sequence, such as products, categories etc.
and value will hold the actual sequence data that will be used for product_id on inserts
I will use UPDATE ... RETURNING value; to get a product id
I was wondering is this solution efficient?
With row level locking, only users of same company adding rows in the same table will have to wait to get a lock and I think that reduces race condition problems.
Is there a better way to solve this problem?
I don't want to use a sequence for products table for all companies because the difference between product id's will be to big, I want to keep it simple for the users.
You could just embed a counter in your companies table:
CREATE TABLE companies (
id SERIAL PRIMARY KEY,
name TEXT,
product_id INT DEFAULT 0
);
CREATE TABLE products (
company INT REFERENCES companies(id),
product_id INT,
PRIMARY KEY (company, product_id),
name TEXT
);
INSERT INTO companies (id, name) VALUES (1, 'Acme Corporation');
INSERT INTO companies (id, name) VALUES (2, 'Umbrella Corporation');
Then, use UPDATE ... RETURNING to get the next product ID for a given company:
> INSERT INTO products VALUES (1, (UPDATE companies SET product_id = product_id+1 WHERE id=$1 RETURNING product_id), 'Anvil');
ERROR: syntax error at or near "companies"
LINE 1: INSERT INTO products VALUES (1, (UPDATE companies SET produc...
^
Oh noes! It seems you can't (as of PostgreSQL 9.1devel) use UPDATE ... RETURNING as a subquery.
The good news is, it's not a problem! Just create a stored procedure that does the increment/return part:
CREATE FUNCTION next_product_id(company INT) RETURNS INT
AS $$
UPDATE companies SET product_id = product_id+1 WHERE id=$1 RETURNING product_id
$$ LANGUAGE 'sql';
Now insertion is a piece of cake:
INSERT INTO products VALUES (1, next_product_id(1), 'Anvil');
INSERT INTO products VALUES (1, next_product_id(1), 'Dynamite');
INSERT INTO products VALUES (2, next_product_id(2), 'Umbrella');
INSERT INTO products VALUES (1, next_product_id(1), 'Explosive tennis balls');
Be sure to use the same company ID in both the product value and the argument to next_product_id(company INT).
Depending on how many companies you have, you could create a sequence for each company. Query it by a function which is set as a default on your product_id column.
Alternatively this function could simply do a SELECT FOR UPDATE and update the values of your table. Should be pretty performant I think.