Cross table constraints in PostgreSQL - postgresql

Using PostgreSQL 9.2.4, I have a table users with a 1:many relation to the table user_roles. The users table stores both employees and other kinds of users.
Table "public.users"
Column | Type | Modifiers
-----------------+-------------------+-----------------------------------------------------
uid | integer | not null default nextval('users_uid_seq'::regclass)
employee_number | character varying |
name | character varying |
Indexes:
"users_pkey" PRIMARY KEY, btree (uid)
Referenced by:
TABLE "user_roles" CONSTRAINT "user_roles_uid_fkey" FOREIGN KEY (uid) REFERENCES users(uid)
Table "public.user_roles"
Column | Type | Modifiers
-----------+-------------------+------------------------------------------------------------------
id | integer | not null default nextval('user_roles_id_seq'::regclass)
uid | integer |
role | character varying | not null
Indexes:
"user_roles_pkey" PRIMARY KEY, btree (id)
Foreign-key constraints:
"user_roles_uid_fkey" FOREIGN KEY (uid) REFERENCES users(uid)
I want to ensure that the column users.employee_number cannot be NULL if there is a related row where user_roles.role_name contains an employee role name. That is, I want the database to enforce the constraint that for some roles, users.employee_number must have a value, but not for others.
How can I accomplish this, preferably without user-defined functions or triggers? I found (blog post, SO Answer) that SQL Server supports indexed views, which sounds like it would serve my purpose. However, I assume that materialized views will not work in my case, since they are not dynamically updated.

Clarifications
The formulation of this requirement leaves room for interpretation:
where UserRole.role_name contains an employee role name.
My interpretation:
with an entry in UserRole that has role_name = 'employee'.
Your naming convention is was problematic (updated now). User is a reserved word in standard SQL and Postgres. It's illegal as identifier unless double-quoted - which would be ill-advised. User legal names so you don't have to double-quote.
I am using trouble-free identifiers in my implementation.
The problem
FOREIGN KEY and CHECK constraint are the proven, air-tight tools to enforce relational integrity. Triggers are powerful, useful and versatile features but more sophisticated, less strict and with more room for design errors and corner cases.
Your case is difficult because a FK constraint seems impossible at first: it requires a PRIMARY KEY or UNIQUE constraint to reference - neither allows NULL values. There are no partial FK constraints, the only escape from strict referential integrity are NULL values in the referencing columns due to the default MATCH SIMPLE behavior of FK constraints. Per documentation:
MATCH SIMPLE allows any of the foreign key columns to be null; if any
of them are null, the row is not required to have a match in the referenced table.
Related answer on dba.SE with more:
Two-column foreign key constraint only when third column is NOT NULL
The workaround is to introduce a boolean flag is_employee to mark employees on both sides, defined NOT NULL in users, but allowed to be NULL in user_role:
Solution
This enforces your requirements exactly, while keeping noise and overhead to a minimum:
CREATE TABLE users (
users_id serial PRIMARY KEY
, employee_nr int
, is_employee bool NOT NULL DEFAULT false
, CONSTRAINT role_employee CHECK (employee_nr IS NOT NULL = is_employee)
, UNIQUE (is_employee, users_id) -- required for FK (otherwise redundant)
);
CREATE TABLE user_role (
user_role_id serial PRIMARY KEY
, users_id int NOT NULL REFERENCES users
, role_name text NOT NULL
, is_employee bool CHECK(is_employee)
, CONSTRAINT role_employee
CHECK (role_name <> 'employee' OR is_employee IS TRUE)
, CONSTRAINT role_employee_requires_employee_nr_fk
FOREIGN KEY (is_employee, users_id) REFERENCES users(is_employee, users_id)
);
That's all.
These triggers are optional but recommended for convenience to set the added tags is_employee automatically and you don't have to do anything extra:
-- users
CREATE OR REPLACE FUNCTION trg_users_insup_bef()
RETURNS trigger AS
$func$
BEGIN
NEW.is_employee = (NEW.employee_nr IS NOT NULL);
RETURN NEW;
END
$func$ LANGUAGE plpgsql;
CREATE TRIGGER insup_bef
BEFORE INSERT OR UPDATE OF employee_nr ON users
FOR EACH ROW
EXECUTE PROCEDURE trg_users_insup_bef();
-- user_role
CREATE OR REPLACE FUNCTION trg_user_role_insup_bef()
RETURNS trigger AS
$func$
BEGIN
NEW.is_employee = true;
RETURN NEW;
END
$func$ LANGUAGE plpgsql;
CREATE TRIGGER insup_bef
BEFORE INSERT OR UPDATE OF role_name ON user_role
FOR EACH ROW
WHEN (NEW.role_name = 'employee')
EXECUTE PROCEDURE trg_user_role_insup_bef();
Again, no-nonsense, optimized and only called when needed.
SQL Fiddle demo for Postgres 9.3. Should work with Postgres 9.1+.
Major points
Now, if we want to set user_role.role_name = 'employee', then there has to be a matching user.employee_nr first.
You can still add an employee_nr to any user, and you can (then) still tag any user_role with is_employee, irregardless of the actual role_name. Easy to disallow if you need to, but this implementation does not introduce any more restrictions than required.
users.is_employee can only be true or false and is forced to reflect the existence of an employee_nr by the CHECK constraint. The trigger keeps the column in sync automatically. You could allow false additionally for other purposes with only minor updates to the design.
The rules for user_role.is_employee are slightly different: it must be true if role_name = 'employee'. Enforced by a CHECK constraint and set automatically by the trigger again. But it's allowed to change role_name to something else and still keep is_employee. Nobody said a user with an employee_nr is required to have an according entry in user_role, just the other way round! Again, easy to enforce additionally if needed.
If there are other triggers that could interfere, consider this:
How To Avoid Looping Trigger Calls In PostgreSQL 9.2.1
But we need not worry that rules might be violated because the above triggers are only for convenience. The rules per se are enforce with CHECK and FK constraints, which allow no exceptions.
Aside: I put the column is_employee first in the constraint UNIQUE (is_employee, users_id) for a reason. users_id is already covered in the PK, so it can take second place here:
DB associative entities and indexing

First, you can solve this using a trigger.
But, I think you can solve this using constraints, with just a little weirdness:
create table UserRoles (
UserRoleId int not null primary key,
. . .
NeedsEmployeeNumber boolean not null,
. . .
);
create table Users (
. . .
UserRoleId int,
NeedsEmployeeNumber boolean,
EmployeeNumber,
foreign key (UserRoleId, NeedsEmployeeNumber) references UserRoles(UserRoleId, NeedsEmployeeNumber),
check ((NeedsEmployeeNumber and EmployeeNumber is not null) or
(not NeedsEmployeeNumber and EmployeeNumber is null)
)
);
This should work, but it is an awkward solution:
When you add a role to an employee, you need to add the flag along with the role.
If a role is updated to change the flag, then this needs to be propagated to existing records -- and the propagation cannot be automatic because you also need to potentially set EmployeeNumber.

New Answer:
This( SQL Sub queries in check constraint ) seems to answer your question, and the language is still in the 9.4 documentation( http://www.postgresql.org/docs/9.4/interactive/sql-createtable.html ).
Old Answer:
SELECT
User.*
, UserRole1.*
FROM
User
LEFT JOIN UserRole UserRole1
ON User.id = UserRole1.UserId
AND (
(
User.employee_number IS NOT NULL AND UserRole1.role_name IN (enumerate employee role names here)
)
OR
(User.employee_number IS NULL)
)
The above query selects all fields from User and all fields from UserRole(aliased as UserRole1). I assumed that the key field between the two fields is known as User.id and UserRole1.UserId, please change these to whatever the real values are.
In the JOIN part of the query there is an OR that on the left side requires an employee number not be NULL in the user table and that UserRole1.role_name be in a list that you must supply to the IN () operator.
The right part of the JOIN is the opposite, it requires that User.employee_number be NULL(this should be your non-employee set).
If you require a more exact solution then please provide more details on your table structures and what roles must be selected for employees.

Related

Composite FK referencing atomic PK + non unique attribute

I am trying to create the following tables in Postgres 13.3:
CREATE TABLE IF NOT EXISTS accounts (
account_id Integer PRIMARY KEY NOT NULL
);
CREATE TABLE IF NOT EXISTS users (
user_id Integer PRIMARY KEY NOT NULL,
account_id Integer NOT NULL REFERENCES accounts(account_id) ON DELETE CASCADE
);
CREATE TABLE IF NOT EXISTS calendars (
calendar_id Integer PRIMARY KEY NOT NULL,
user_id Integer NOT NULL,
account_id Integer NOT NULL,
FOREIGN KEY (user_id, account_id) REFERENCES users(user_id, account_id) ON DELETE CASCADE
);
But I get the following error when creating the calendars table:
ERROR: there is no unique constraint matching given keys for referenced table "users"
Which does not make much sense to me since the foreign key contains the user_id which is the PK of the users table and therefore also has a uniqueness constraint. If I add an explicit uniqueness constraint on the combined user_id and account_id like so:
ALTER TABLE users ADD UNIQUE (user_id, account_id);
Then I am able to create the calendars table. This unique constraint seems unnecessary to me as user_id is already unique. Can someone please explain to me what I am missing here?
Postgres is so smart/dumb that it doesn't assume the designer to do stupid things.
The Postgres designers could have taken different strategies:
Detect the transitivity, and make the FK not only depend on users.id, but also on users.account_id -> accounts.id. This is doable but costly. It also involves adding multiple dependency-records in the catalogs for a single FK-constraint. When imposing the constraint(UPDATE or DELETE in any of the two referred tables), it could get very complex.
Detect the transitivity, and silently ignore the redundant column reference. This implies: lying to the programmer. It would also need to be represented in the catalogs.
cascading DDL operations would get more complex, too. (remember: DDL is already very hard w.r.t. concurrency/versioning)
From the execution/performance point of view: imposing the constraints currently involves "pseudo triggers" on the referred table's indexes. (except DEFERRED, which has to be handled specially)
So, IMHO the Postgres developers made the sane choice of refusing to do stupid complex things.

understanding an inheritance in Postgres; why key "fails" in insert/update command

(One image, tousands of words)
I'd made few tables that are inherited between themselves. (persons)
And then assign child table (address), and relate it only to "base" table (person).
When try to insert in child table, and record is related to inherited table, insert statement fail because there is no key in master table.
And as I insert records in descendant tables, records are salo available in base table (so, IMHO, should be visible/accessible in inherited tables).
Please take a look on attached image. Obviously do someting wrong or didn't get some point....
Thank You in advanced!
Sorry, that's how Postgres table inheritance works. 5.10.1 Caveats explains.
A serious limitation of the inheritance feature is that indexes (including unique constraints) and foreign key constraints only apply to single tables, not to their inheritance children. This is true on both the referencing and referenced sides of a foreign key constraint. Thus, in the terms of the above example:
Specifying that another table's column REFERENCES cities(name) would allow the other table to contain city names, but not capital names. There is no good workaround for this case.
In their example, capitals inherits from cities as organization_employees inherits from person. If person_address REFERENCES person(idt_person) it will not see entries in organization_employees.
Inheritance is not as useful as it seems, and it's not a way to avoid joins. This can be better done with a join table with some extra columns. It's unclear why an organization would inherit from a person.
person
id bigserial primary key
name text not null
verified boolean not null default false
vat_nr text
foto bytea
# An organization is not a person
organization
id bigserial not null
name text not null
# Joins a person with an organization
# Stores information about that relationship
organization_employee
person_id bigint not null references person(id)
organization_id bigint not null references organization(id)
usr text
pwd text
# Get each employee, their name, and their org's name.
select
person.name
organization.name
from
organization_employee
join person on person_id = person.id
join organization on organization_id = organization.id
Use bigserial (bigint) for primary keys, 2 billion comes faster than you think
Don't enshrine arbitrary business rules in the schema, like how long a name can be. You're not saving any space by limiting it, and every time the business rule changes you have to alter your schema. Use the text type. Enforce arbitrary limits in the application or as constraints.
idt_table_name primary keys makes for long, inconsistent column names hard to guess. Why is the primary key of person_address not idt_person_address? Why is the primary key of organization_employee idt_person? You can't tell, at a glance, which is the primary key and which is a foreign key. You still need to prepend the column name to disambiguate; for example, if you join person with person_address you need person.idt_person and person_address.idt_person. Confusing and redundant. id (or idt if you prefer) makes it obvious what the primary key is and clearly differentiates it from table_id (or idt_table) foreign keys. SQL already has the means to resolve ambiguities: person.id.

Foreign Key referencing inherited table

I have the following tables:
CREATE TABLE mail (
id serial,
parent_mail_id integer,
...
PRIMARY KEY (id),
FOREIGN KEY (parent_mail_id) REFERENCES mail(id),
...
);
CREATE TABLE incoming (
from_contact_id integer NOT NULL REFERENCES contact(id),
...
PRIMARY KEY (id),
---> FOREIGN KEY (parent_mail_id) REFERENCES mail(id), <---
...
) INHERITS(mail);
CREATE TABLE outgoing (
from_user_id integer NOT NULL REFERENCES "user"(id),
...
PRIMARY KEY (id),
--> FOREIGN KEY (parent_mail_id) REFERENCES mail(id), <--
...
) INHERITS(mail);
incoming and outgoing inherit from mail and define their foreign keys (and primary keys) again, as they are not automatically inherited.
The problem is:
If I'd insert an incoming mail, it is not possible to reference it from the outgoing table as the foreign key only works with the super table (mails).
Is there a workaround for that?
PostgreSQL 9.3 docs:
A serious limitation of the inheritance feature is that indexes
(including unique constraints) and foreign key constraints only apply
to single tables, not to their inheritance children. This is true on
both the referencing and referenced sides of a foreign key constraint.
Thus, in the terms of the above example:
If we declared cities.name to be UNIQUE or a PRIMARY KEY, this would not stop the capitals table from having rows with names
duplicating rows in cities. And those duplicate rows would by default
show up in queries from cities. In fact, by default capitals would
have no unique constraint at all, and so could contain multiple rows
with the same name. You could add a unique constraint to capitals, but
this would not prevent duplication compared to cities.
Similarly, if we were to specify that cities.name REFERENCES some other table, this constraint would not automatically propagate to
capitals. In this case you could work around it by manually adding the
same REFERENCES constraint to capitals.
Specifying that another table's column REFERENCES cities(name) would allow the other table to contain city names, but not capital
names. There is no good workaround for this case.
These deficiencies will probably be fixed in some future release, but
in the meantime considerable care is needed in deciding whether
inheritance is useful for your application.
And not really a workaround, so maybe make mails a non-inherited table, and then separate incoming_columns and outgoing_columns for their respective extra columns, with the mail id as both their primary and foreign key. You can then create a view outgoing as mail INNER JOIN outgoing_columns, for example.
You may use a constraint trigger
CREATE OR REPLACE FUNCTION mail_ref_trigger()
RETURNS trigger AS
$BODY$
DECLARE
BEGIN
IF NOT EXISTS (
SELECT 1 FROM mail WHERE id = NEW.parent_mail_id
) THEN
RAISE foreign_key_violation USING MESSAGE = FORMAT('Referenced mail id not found, mail_id:%s', NEW.parent_mail_id);
END IF;
RETURN NEW;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
CREATE CONSTRAINT TRIGGER mail_fkey_trigger
AFTER UPDATE OR INSERT ON incoming
DEFERRABLE
FOR EACH ROW EXECUTE PROCEDURE mail_ref_trigger();

How to tell PostgreSQL not to verify datatype

I have a table like:
CREATE TABLE test(
id integer not null default nextval('test_id_seq'::regclass),
client_name_id integer not null
);
Foreign-key constraints:
"test_client_name_id_fkey" FOREIGN KEY (client_name_id) REFERENCES company(id) DEFERRABLE INITIALLY DEFERRED
and company table:
CREATE TABLE company(
id integer not null default nextval('company_id_seq'::regclass),
company_name character varying(64) not null
)
Now I have trigger on test table which fetch id from company table using provided value client_name_id which is string by matching it with company_name. but when I insert record PostgreSQL return error that client_name_id is string and int required which is true.
How can I tell PostgreSQL not to verify inserted row as I have taken care of it in my triggers.
What you are trying to do is very unorthodox. Are you sure, this is what you want? Of course, you cant enter a string (with non-digits) into an integer column. No surprise there, right? If you want to enter the text instead, you'd have to add a text column instead - with a fk-constraint to company(company_name) if you want to match your current layout.
ALTER TABLE test ALTER DROP COLUMN client_name_id; -- drops fk constraint, too
ALTER TABLE test ADD COLUMN client_name REFERENCES company(company_name);
You would need a UNIQUE constraint on company.company_name to allow this.
However, I would advise to rethink your approach. Your table layout seems proper as it is. The trigger is the unconventional element. Normally, you would reference the primary key, just like you have it now. No trigger needed. To get the company name, you would join the table in a SELECT:
SELECT *
FROM test t
JOIN company c ON t.client_name_id = c.id;
Also, these non-standard modifiers should only be there if you need them: DEFERRABLE INITIALLY DEFERRED. Like, when you have to enter values in table test before you enter the referenced values in table company (in the same transaction).

PostgreSQL ON INSERT CASCADE

I've got two tables - one is Product and one is ProductSearchResult.
Whenever someone tries to Insert a SearchResult with a product that is not listed in the Product table the foreign key constrain is violattet, hence i get an error.
I would like to know how i could get my database to automatically create that missing Product in the Product Table (Just the ProductID, all other attributes can be left blank)
Is there such thing as CASCADE ON INSERT? If there is, i was not able not get it working.
Rules are getting executed after the Insert, so because we get an Error beforehand there are useless if you USE an "DO ALSO". If you use "DO INSTEAD" and add the INSERT Command at the End you end up with endless recursion.
I reckon a Trigger is the way to go - but all my attempts to write one failed.
Any recommendations?
The Table Structure:
CREATE TABLE Product (
ID char(10) PRIMARY KEY,
Title varchar(150),
Manufacturer varchar(80),
Category smallint,
FOREIGN KEY(Category) REFERENCES Category(ID) ON DELETE CASCADE);
CREATE TABLE ProductSearchResult (
SearchTermID smallint NOT NULL,
ProductID char(10) NOT NULL,
DateFirstListed date NOT NULL DEFAULT current_date,
DateLastListed date NOT NULL DEFAULT current_date,
PRIMARY KEY (SearchTermID,ProductID),
FOREIGN KEY (SearchTermID) REFERENCES SearchTerm(ID) ON DELETE CASCADE,
FOREIGN KEY (ProductID) REFERENCES Product ON DELETE CASCADE);
Yes, triggers are the way to go. But before you can start to use triggers in plpgsql, you
have to enable the language. As user postgres, run the command createlang with the proper parameters.
Once you've done that, you have to
Write function in plpgsql
create a trigger to invoke that function
See example 39-3 for a basic example.
Note that a function body in Postgres is a string, with a special quoting mechanism: 2 dollar signs with an optional word in between them, as the quotes. (The word allows you to quote other similar quotes.)
Also note that you can reuse a trigger procedure for multiple tables, as long as they have the columns your procedure uses.
So the function has to
check if the value of NEW.ProductID exists in the ProductSearchResult table, with a select statement (you ought to be able to use SELECT count(*) ... INTO someint, or SELECT EXISTS(...) INTO somebool)
if not, insert a new row in that table
If you still get stuck, come back here.
In any case (rules OR triggers) the insert needs to create a new key (and new values for the attributes) in the products table. In most cases, this implies that a (serial,sequence) surrogate primary key should be used in the products table, and that the "real world" product_id ("product number") should default to NULL, and be degraded to a candidate key.
BTW: a rule can be used, rules just are tricky to implement correctly for N:1 relations (they need the same kind of EXISTS-logic as in Bart's answer above).
Maybe cascading on INSERT is not such a good idea after all. What do you want to happen if someone inserts a ProductSearchResult record for a not-existing product? [IMO a FK is always a domain; you cannot just extend a domain just by referring to a not-existant value for it; that would make the FK constraint meaningless]