postgresql - designing a tree hierarchy with mixed node types (inheritance does not help!) - postgresql

I have a question about implementing inheritance in PostgreSQL (9.1).
The purpose is to build a geo-hierarchy model in which countries, states and continents can be mixed to create "regions". These regions can in turn be mixed with countries, etc. to create a truly awesome region hierarchy.
So in my logical model, everything is a type of "place". A region tree is constructed edge by edge, each edge joining two "places". The design is below, and it is easy to manage in the Java layer.
create table place_t (
place_id serial primary key,
place_type varchar(10)
);
create table country_t (
short_name varchar(30) unique,
name varchar(255) null
) inherits(place_t);
create table region_t(
short_name varchar(30),
hierarchy_id integer, -- references hierarchy_t(hierarchy_id)
unique(short_name) -- (short_name,hierarchy_id)
) inherits(place_t);
create table region_hier_t(
parent integer references place_t(place_id), -- would prefer FK region_t(place_id)
child integer references place_t(place_id),
primary key(parent,child)
);
insert into region_t values(DEFAULT, 'region', 'NA', 'north american ops');
insert into region_t values(DEFAULT, 'region', 'EMEA', 'europe and middle east');
insert into country_t values(DEFAULT, 'country', 'US', 'USD', 'united states');
insert into country_t values(DEFAULT, 'country', 'CN', 'CND', 'canada');
So far so good. But the following fails:
insert into region_hier_t
select p.place_id, c.place_id
from region_t as p, country_t as c
where p.short_name = 'NA' and c.short_name = 'US';
The reason is that the first 4 inserts did not create any row in "place_t". RTFM! Postgres docs actually mention this.
The question is - is there a workaround? Via insert triggers on region_t and country_t to implement my own "inheritance" is the only thing I could think of.
A second question is - is there a better design for such a mixed-node tree structure?
For certain reasons I do not want to rely too much on postgres-contrib features. Perhaps that's very silly and please feel free to chime in, but gently (and only after answering the other question)!
Thanks

The references on the parent and child columns of region_hier_t are wrong: a foreign key to place_t(place_id) only sees rows stored in place_t itself, so you cannot insert a key that lives in country_t or region_t. You can either drop the references or replace them with new ones.
So let's take the second option and add primary keys on the referenced tables region_t and country_t, so that the new foreign keys have a unique constraint to point at:
ALTER TABLE region_t
ADD CONSTRAINT pk_region_t PRIMARY KEY(place_id );
ALTER TABLE country_t
ADD CONSTRAINT pk_country_t PRIMARY KEY(place_id );
The correct CREATE statement for region_hier_t is:
create table region_hier_t(
parent integer references region_t(place_id),
child integer references country_t(place_id),
primary key(parent,child)
);
And finally you can run your INSERT.
So, as you see, there are many improvements for you to make. Maybe you should reconsider your design. Take a look at this answer: How to store postal addresses and political divisions in a normalized way? It's much simpler than your solution and easier to maintain.
But if you want to stay with your solution, don't forget to set primary keys on the child tables (as shown above), which you haven't done yet. Only check constraints and not-null constraints are inherited by child tables; primary keys, unique constraints and foreign keys are not.
I see that some of your other inserts don't work correctly either:
insert into region_t values(DEFAULT, 'region', 'NA', 'north american ops');
ERROR: invalid input syntax for integer: "north american ops"
LINE 1: ...ert into region_t values(DEFAULT, 'region', 'NA', 'north ame...
So there is a problem with column assignment as well.
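One way to avoid the column-assignment problem is to name the target columns explicitly. The following is only a sketch against the table definitions in the question: region_t has no column for the description and country_t has no column for the currency code, so those values are simply left out here, which is an assumption about what was intended.

```sql
-- place_id is filled from the inherited serial default.
INSERT INTO region_t (place_type, short_name) VALUES ('region', 'NA');
INSERT INTO region_t (place_type, short_name) VALUES ('region', 'EMEA');

-- country_t only has short_name and name besides the inherited columns.
INSERT INTO country_t (place_type, short_name, name) VALUES ('country', 'US', 'united states');
INSERT INTO country_t (place_type, short_name, name) VALUES ('country', 'CN', 'canada');
```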

So it turns out that inheritance in PostgreSQL is somewhat different from that used in typical OOP languages. In particular, the "superclass" table is not populated automatically. If I had to use my own triggers to do that, I didn't have a use case left for the inheritance structure.
So I abandoned PostgreSQL inheritance and created my own "place_t" table, plus "country_t", "state_t", "county_t" and "region_t" child tables linked to the parent "place_t" through "place_id".
On these child tables, I created a BEFORE INSERT/UPDATE row-level trigger to ensure that "place_id" refers to a valid row in "place_t" and that the reference is not changed later. IOW, "place_id" in the child tables should behave as write-once-read-many.
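A minimal sketch of such a guard trigger, assuming plain (non-inherited) tables place_t(place_id) and country_t(place_id, ...). The function and trigger names are hypothetical and were not given in the post.

```sql
-- Hypothetical guard: place_id must exist in place_t and may never change.
CREATE FUNCTION child_place_id_guard() RETURNS trigger AS $$
BEGIN
    -- OLD is only available for UPDATE, so test TG_OP first.
    IF TG_OP = 'UPDATE' THEN
        IF NEW.place_id IS DISTINCT FROM OLD.place_id THEN
            RAISE EXCEPTION 'place_id is write-once and cannot be changed';
        END IF;
    END IF;
    IF NOT EXISTS (SELECT 1 FROM place_t WHERE place_id = NEW.place_id) THEN
        RAISE EXCEPTION 'place_id % does not exist in place_t', NEW.place_id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Repeat for state_t, county_t and region_t.
CREATE TRIGGER country_place_id_guard
BEFORE INSERT OR UPDATE ON country_t
FOR EACH ROW EXECUTE PROCEDURE child_place_id_guard();
```

With ordinary tables, the existence check alone could also be expressed as a regular foreign key to place_t(place_id); the trigger is mainly useful for the write-once rule.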
Now, I can insert the world geo. Also, define a new "region". I created a "region_composition_t" to record the edges of a regional hierarchy, the parent being a reference to "region_t" and child being a reference to "place_t".
So far so good. The challenge now is how to suppress any update/delete cascading effects.

The workaround is to get rid of your foreign keys to place_t and do instead:
CREATE FUNCTION place_t_exists(id int)
RETURNS bool LANGUAGE SQL AS
$$
-- Selecting from place_t also scans the tables that inherit from it.
SELECT count(*) = 1 FROM place_t WHERE place_id = $1;
$$;
CREATE FUNCTION fkey_place_t() RETURNS TRIGGER
LANGUAGE PLPGSQL AS $$
BEGIN
-- Trigger arguments are string constants: TG_ARGV[0] names the column to check.
IF place_t_exists(CASE TG_ARGV[0] WHEN 'parent' THEN NEW.parent ELSE NEW.child END) THEN RETURN NEW;
ELSE RAISE EXCEPTION 'place_t does not exist';
END IF;
END;
$$;
You also need something on the child tables to block changes while a hierarchy node still refers to the row:
CREATE FUNCTION hierarchy_exists(id int) RETURNS BOOL LANGUAGE SQL AS
$$
SELECT COUNT(*) > 0 FROM region_hier_t WHERE parent = $1 or child = $1;
$$;
CREATE OR REPLACE FUNCTION fkey_hierarchy_trigger() RETURNS trigger LANGUAGE PLPGSQL AS
$$
BEGIN
IF hierarchy_exists(old.place_id) THEN RAISE EXCEPTION 'Hierarchy node still exists';
ELSE RETURN OLD;
END IF;
END;
$$;
Then you can create your triggers:
CREATE CONSTRAINT TRIGGER fkey_place_parent AFTER INSERT OR UPDATE ON region_hier_t
FOR EACH ROW EXECUTE PROCEDURE fkey_place_t('parent');
CREATE CONSTRAINT TRIGGER fkey_place_child AFTER INSERT OR UPDATE ON region_hier_t
FOR EACH ROW EXECUTE PROCEDURE fkey_place_t('child');
And then for each of the place_t child tables:
CREATE CONSTRAINT TRIGGER fkey_hier_t AFTER UPDATE OF place_id OR DELETE ON [child_table]
FOR EACH ROW EXECUTE PROCEDURE fkey_hierarchy_trigger();
This solution may not be worth it, but it is worth knowing how to do it if you need to.

Related

Creating a SQL trigger to update tables when entering data in a view

If I have this setup:
CREATE TABLE category(
category_id serial PRIMARY KEY,
category_name text UNIQUE NOT NULL -- must be UNIQUE
);
CREATE TABLE parts (
part_id serial PRIMARY KEY,
category_id int REFERENCES category,
part_name text
);
CREATE VIEW partview AS
SELECT com.part_id, cat.category_name, com.part_name
FROM parts com
LEFT JOIN category cat USING (category_id);
How do I create a trigger so that when I insert data into the view, the source tables are updated?
I tried this... but it doesn't work :(
CREATE FUNCTION insert_view_func()
RETURNS trigger as
$func$
BEGIN
INSERT INTO parts (category_name)
select (select category_id from category where category_name = category.category_name)
RETURNING category_id as id
into new.componentid;
return new;
END
$func$ language plpgsql;
create trigger insert_view_trig
INSTEAD of insert on partview
for each row execute procedure insert_view_func();
The big issue with insert triggers on non-simple views is that you do not know what is being inserted.
In this case that could be a category, a part, or both, and your trigger has to handle all of these. Here that is not much of an issue:
create or replace function insert_view_func()
returns trigger
language plpgsql
as $$
begin
insert into category (category_name) values(new.category_name)
on conflict do nothing;
insert into parts (category_id, part_name)
select category_id, new.part_name
from category
where category_name = new.category_name;
return new;
end ;
$$ ;
That, however, is not the major issue here. Your data model sets up a 1:M relationship between Category and Parts.
That is not a problem if it is what you really want, but it does open a potential problem. Since Part_Name is not unique, there can be multiple parts with the same name (see fiddle), each associated with a separate category.
This could become quite confusing. To avoid it you may want to consider an M:M relationship with a resolution table. The other option would be modifying the trigger function to check for an existing part_name. Better yet, make Part_Name unique.
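As a rough illustration of the last suggestion, a unique constraint on the part name could be combined with the view trigger above. The constraint name and the sample values are made up for the example, and the ALTER TABLE assumes no duplicate names exist yet.

```sql
-- Hypothetical constraint name; fails if duplicate part names already exist.
ALTER TABLE parts ADD CONSTRAINT parts_part_name_key UNIQUE (part_name);

-- Insert through the view; the INSTEAD OF trigger fills both tables.
INSERT INTO partview (category_name, part_name) VALUES ('fasteners', 'M4 bolt');

-- The same part name under another category now raises a unique violation
-- instead of silently creating a second part.
INSERT INTO partview (category_name, part_name) VALUES ('hardware', 'M4 bolt');
```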

Postgresql - Constraints on Ranges - Two tables

With constraints on ranges, we prevent adding any overlapping values in an existing table. For example, in the following room_reservation table, we make sure that no room is reserved in a conflicting time.
CREATE EXTENSION btree_gist;
CREATE TABLE room_reservation (
room text,
during tsrange,
EXCLUDE USING GIST (room WITH =, during WITH &&)
);
What we need here is to also take into account another table (rooms), which also has room and during fields, and to check the records in that table while making a reservation. How can we do that?
Our specific scenario is exam management. We have an invigilation table (room reservation) and also a time table of classes. Once we are adding an invigilation record, we need to make sure that it does not coincide with any other invigilation record and make sure that there is no lecture at that time in that room.
You cannot do that with a single exclusion constraint. Instead, you should use the exclusion constraint on one table, say invigilation, and then use a BEFORE INSERT trigger on that same table that checks if there is a conflict in the second table, say rooms. The trigger function on the first table would do a simple range check on the second table:
CREATE FUNCTION check_no_class() RETURNS trigger AS $$
BEGIN
PERFORM * FROM rooms
WHERE room = NEW.room
AND during && NEW.during;
IF FOUND THEN
RETURN NULL;
ELSE
RETURN NEW;
END IF;
END; $$ LANGUAGE plpgsql;
CREATE TRIGGER check_rooms
BEFORE INSERT ON invigilation
FOR EACH ROW EXECUTE PROCEDURE check_no_class();
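If a class is scheduled in that room at an overlapping time, the trigger returns NULL and the row is not inserted into invigilation. Note that this skips the row silently rather than raising an error.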
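If an explicit error is preferred over a silently skipped row, the trigger function could raise an exception instead. This is a sketch of that variant, reusing the table and column names from the answer; the error message text is an assumption.

```sql
CREATE OR REPLACE FUNCTION check_no_class() RETURNS trigger AS $$
BEGIN
    -- Reject the invigilation if any class overlaps in the same room.
    IF EXISTS (SELECT 1 FROM rooms
               WHERE room = NEW.room
                 AND during && NEW.during) THEN
        RAISE EXCEPTION 'room % already has a class during %', NEW.room, NEW.during;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;
```

This only guards inserts into invigilation. A class added to rooms later is not checked against existing invigilation rows unless a similar trigger is created on rooms.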

Is it possible to refer a column in a view as foreign key (PostgreSQL 9.4)?

I know in older versions it was impossible, is it the same with version 9.4?
I'm trying to do something like this:
CREATE VIEW products AS
SELECT d1.id AS id, d1.price AS pr FROM dup.freshProducts AS d1
UNION
SELECT d2.id AS id, d2.price AS pr FROM dup.cannedProducts AS d2;
CREATE TABLE orderLines
(
line_id integer PRIMARY KEY,
product_no integer REFERENCES productView.id
);
I'm trying to implement an inheritance relationship where freshProducts and cannedProducts both inherit from products. I implemented it using two different tables and I created a view products that has only the common properties between freshProducts and cannedProducts. In addition, each row in orderLines has a relationship with a product, either a freshProduct or a cannedProduct. See image for clarification.
If referencing a view is still not possible, which solution do you think is best? I've thought of either a materialized view or implementing the restriction using triggers. Could you recommend any good example of such triggers to use as a basis?
Thank you very much!
Referencing a (materialized) view wouldn't work and a trigger might look like this:
CREATE OR REPLACE FUNCTION reject_not_existing_id()
RETURNS "trigger" AS
$BODY$
BEGIN
IF NEW.product_no NOT IN (SELECT id FROM dup.freshProducts UNION SELECT id FROM dup.cannedProducts) THEN
RAISE EXCEPTION 'The product id % does not exist', NEW.product_no;
END IF;
RETURN NEW;
END;
$BODY$
LANGUAGE 'plpgsql' VOLATILE;
CREATE TRIGGER tr_before_insert_or_update
BEFORE INSERT OR UPDATE OF product_no
ON orderLines
FOR EACH ROW
EXECUTE PROCEDURE reject_not_existing_id();
(See also http://www.tek-tips.com/viewthread.cfm?qid=1116256)
A materialized view might look like a good approach but fails for two reasons. First, like a view, you simply can't reference it, because it is not a table (go ahead and try). Second, assuming you could, there would still be the problem of preventing two equal ids in freshProducts and cannedProducts. Yes, you can define a UNIQUE INDEX on a materialized view, but how do you make sure the same id isn't used in both fresh and canned in the first place?
That's something you still have to solve if using the trigger in orderLines.
That brings me to suggest rethinking your model. 'Fresh' and 'canned' might as well be values of an attribute of a single table products, which makes all the trouble superfluous. If fresh and canned products differ significantly in (the number of) their attributes (I can't think of any other reason to create two different tables), then reference the product id in two other tables. Like
CREATE TABLE products
(
id ... PRIMARY KEY
, fresh_or_canned ...
, price ...
, another_common_attribute_1 ...
, ...
, another_common_attribute_n ...
);
CREATE TABLE canned_specific_data
(
canned_id ... REFERENCES products (id)
, type_of_can ...
, ...
, another_attribute_that_does_not_apply_to_fresh ...
);
CREATE TABLE fresh_specific_data
(
fresh_id ... REFERENCES products (id)
, date_of_harvest ...
, ...
, another_attribute_that_does_not_apply_to_canned ...
);
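To make the outline above concrete, here is one possible filled-in version. The column types, the CHECK on fresh_or_canned and the primary keys on the subtype tables are assumptions chosen for illustration, not part of the original answer. orderLines can then use an ordinary foreign key.

```sql
CREATE TABLE products (
    id              serial PRIMARY KEY,
    fresh_or_canned text NOT NULL CHECK (fresh_or_canned IN ('fresh', 'canned')),
    price           numeric(10,2) NOT NULL
);

-- One optional row of extra data per canned product.
CREATE TABLE canned_specific_data (
    canned_id   integer PRIMARY KEY REFERENCES products (id),
    type_of_can text
);

-- One optional row of extra data per fresh product.
CREATE TABLE fresh_specific_data (
    fresh_id        integer PRIMARY KEY REFERENCES products (id),
    date_of_harvest date
);

-- The order line references the common table directly.
CREATE TABLE orderLines (
    line_id    integer PRIMARY KEY,
    product_no integer REFERENCES products (id)
);
```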
The simple answer to preventing ID duplication is to simply use the same sequence as the default value for IDs in both freshProducts and cannedProducts.
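A sketch of the shared-sequence idea, assuming the two tables from the question live in schema dup and have an integer id column. The sequence name is hypothetical.

```sql
-- One sequence feeds the id default of both tables.
CREATE SEQUENCE dup.product_id_seq;

ALTER TABLE dup.freshProducts
    ALTER COLUMN id SET DEFAULT nextval('dup.product_id_seq');
ALTER TABLE dup.cannedProducts
    ALTER COLUMN id SET DEFAULT nextval('dup.product_id_seq');
```

This only avoids collisions for rows that take the default; an id supplied explicitly in an INSERT can still duplicate one in the other table.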
Now, there comes the question: why do you need a foreign key at all? Typically it is there to prevent deletion of data that another table depends upon; however, you can write a trigger to prevent that. Likewise, it stops the referencing value from being updated to something that doesn't exist in the referenced table, but you can write a trigger for that too.
So basically you can write triggers to implement all the desired functionality of a foreign key without actually needing a foreign key, with the added benefit that they WILL work with such a view.
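For the deletion side, a trigger along these lines could stand in for the missing foreign key. It is a sketch using the table names from the question; the function and trigger names are made up.

```sql
-- Hypothetical guard: refuse to delete a product that an order line still uses.
CREATE FUNCTION reject_delete_of_ordered_product() RETURNS trigger AS $$
BEGIN
    IF EXISTS (SELECT 1 FROM orderLines WHERE product_no = OLD.id) THEN
        RAISE EXCEPTION 'product % is still referenced by orderLines', OLD.id;
    END IF;
    RETURN OLD;  -- let the DELETE proceed
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tr_before_delete_fresh
BEFORE DELETE ON dup.freshProducts
FOR EACH ROW EXECUTE PROCEDURE reject_delete_of_ordered_product();

CREATE TRIGGER tr_before_delete_canned
BEFORE DELETE ON dup.cannedProducts
FOR EACH ROW EXECUTE PROCEDURE reject_delete_of_ordered_product();
```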

PostgreSQL: How to revalidate CHECKs

My database has the following structure:
CREATE TYPE instrument_type AS ENUM (
'Stock',
...
'Currency',
...
);
CREATE FUNCTION get_instrument_type(instrument_id bigint) RETURNS instrument_type
LANGUAGE plpgsql STABLE RETURNS NULL ON NULL INPUT
AS $$
BEGIN
RETURN (SELECT instr_type FROM instruments WHERE id = instrument_id);
END
$$;
CREATE TABLE instruments (
id bigserial PRIMARY KEY,
instr_type instrument_type NOT NULL,
...
);
CREATE TABLE countries_currencies (
...
curr bigint NOT NULL
REFERENCES instruments (id)
ON UPDATE CASCADE ON DELETE CASCADE
CHECK (get_instrument_type(curr) = 'Currency'),
...
);
As you can see, I use one common table for instruments. There are a lot of foreign keys referencing that table. But some tables, like countries_currencies, require that the referenced item is a 'Currency'. Since I can't use subqueries in CHECK constraints, I have to use a function.
One day it could happen that one bad man will change instr_type from 'Currency' to something else. If there is a row in countries_currencies referencing the modified instrument, the CHECK will become invalid for this row. But the CHECK is only applied to new or updated rows, not to already existing ones.
Is there any standard way to revalidate CHECKs? I want to run such procedure as a part of general data integrity test.
P.S. I know, I could write trigger on table instruments and forbid change if something could become broken. But it requires assurance that I check all referencing tables and their constraints, so it is error prone anyway.
You could simply update all rows in place to trigger the CHECK:
UPDATE countries_currencies SET curr = curr;
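If the goal is only to report violations as part of an integrity test, rather than to make a statement fail, a read-only query using the same function might be enough. This is an alternative sketch, not part of the answer above.

```sql
-- Rows whose referenced instrument is no longer a 'Currency'.
SELECT *
FROM countries_currencies
WHERE get_instrument_type(curr) <> 'Currency';
```

An empty result means every existing row still satisfies the condition used in the CHECK.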

Need help writing a PostgreSQL trigger function

I have two tables representing two different types of imagery. I am using PostGIS to represent the boundaries of those images. Here is a simplified example of those tables:
CREATE TABLE img_format_a (
id SERIAL PRIMARY KEY,
file_path VARCHAR(1000),
boundary GEOGRAPHY(POLYGON, 4326)
);
CREATE TABLE img_format_p (
id SERIAL PRIMARY KEY,
file_path VARCHAR(1000),
boundary GEOGRAPHY(POLYGON, 4326)
);
I also have a cross reference table, which I want to contain all the IDs of the images that overlap each other. Whenever an image of type "A" gets inserted into the database, I want to check to see whether it overlaps any of the existing imagery of type "P" (and vice versa) and insert corresponding entries into the img_a_img_p cross reference table. This table should represent a many-to-many relationship.
My first instinct is to write a trigger to manage this img_a_img_p table. I've never created a trigger before, so let me know if this is a silly thing to do, but it seems to make sense to me. So I create the following trigger:
CREATE TRIGGER update_a_p_cross_reference
AFTER INSERT OR DELETE OR UPDATE OF boundary
ON img_format_p FOR EACH ROW
EXECUTE PROCEDURE check_p_cross_reference();
The part where I am getting stuck is with writing the trigger function. My code is in Java and I see that there are tools like PL/pgSQL, but I'm not sure if that's what I should use or if I even need one of those special add-ons.
Essentially all I need the trigger to do is update the cross reference table each time a new image gets inserted into either img_format_a or img_format_p. When a new image is inserted, I would like to use a PostGIS function like ST_Intersects to determine whether the new image overlaps with any of the images in the other table. For each image pair where ST_INTERSECTS returns true, I would like to insert a new entry into img_a_img_p with the ID's of both images. Can someone help me figure out how to write this trigger function? Here is some pseudocode:
SELECT * FROM img_format_p P
WHERE ST_Intersects(A.boundary, P.boundary);
for each match in selection {
INSERT INTO img_a_img_p VALUES (A.id, P.id);
}
You could wrap the usual INSERT ... SELECT idiom in a PL/pgSQL function sort of like this:
create function check_p_cross_reference() returns trigger as
$$
begin
insert into img_a_img_p (img_a_id, img_p_id)
select a.id, p.id
from img_format_a a, img_format_p p
where p.id = NEW.id
and ST_Intersects(a.boundary, p.boundary);
return null;
end;
$$ language plpgsql;
Triggers have two extra variables, NEW and OLD:
NEW
Data type RECORD; variable holding the new database row for INSERT/UPDATE operations in row-level triggers. This variable is NULL in statement-level triggers and for DELETE operations.
OLD
Data type RECORD; variable holding the old database row for UPDATE/DELETE operations in row-level triggers. This variable is NULL in statement-level triggers and for INSERT operations.
So you can use NEW.id to access the new img_format_p value that's going in. You (currently) can't use the plain SQL language for triggers:
It is not currently possible to write a trigger function in the plain SQL function language.
but PL/pgSQL is pretty close. This would make sense as an AFTER INSERT trigger:
CREATE TRIGGER update_a_p_cross_reference
AFTER INSERT
ON img_format_p FOR EACH ROW
EXECUTE PROCEDURE check_p_cross_reference();
Deletes could be handled with a foreign key on img_a_img_p and a cascading delete. You could use your trigger for UPDATEs as well:
CREATE TRIGGER update_a_p_cross_reference
AFTER INSERT OR UPDATE OF boundary
ON img_format_p FOR EACH ROW
EXECUTE PROCEDURE check_p_cross_reference();
but you'd probably want to clear out the old entries before inserting the new ones with something like:
delete from img_a_img_p where img_p_id = NEW.id;
before the INSERT...SELECT statement.
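Putting the pieces together, the function for the INSERT OR UPDATE OF boundary trigger could look roughly like this. It is a sketch that assumes img_a_img_p has the columns img_a_id and img_p_id used above, and it reads the new boundary straight from NEW instead of joining back to img_format_p.

```sql
create or replace function check_p_cross_reference() returns trigger as
$$
begin
    -- On UPDATE, remove the stale pairs for this image first.
    if TG_OP = 'UPDATE' then
        delete from img_a_img_p where img_p_id = NEW.id;
    end if;

    -- Record every type-A image whose boundary intersects the new one.
    insert into img_a_img_p (img_a_id, img_p_id)
    select a.id, NEW.id
    from img_format_a a
    where ST_Intersects(a.boundary, NEW.boundary);

    return null;  -- the return value of an AFTER row trigger is ignored
end;
$$ language plpgsql;
```

A mirror-image function and trigger on img_format_a would be needed to cover inserts of type "A" images.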