Database design with composite types in postgresql - postgresql

How to refer another table built from composite type in my table.
I am trying to setup a database to understand postgresql and its Object oriented features.
The statement is as follows : There are multiple companies which can have board members.
Each company can own another company or a person can own that company too.
This is the type of database design I am looking for.
create type companyType(
name: VARCHAR,
boardMembers : personType[],
owns: companyType[]
)
create type personType(
name: VARCHAR,
owns: companyType[]
)
Create table company as companyType
Create table person as personType
I understand that I cannot self reference the companyType so I will probably move this another table.
My question is, when I am trying to insert into say company type, how do i insert list of person table objects as foreign key ?
Would making a column 'id' in each table and giving it type SERIAL work to use it as a foreign key?

That is not a relational database design, and you won't get happy with it.
Map each object to a table. The table columns are the attributes of the object. Add an artificial primary key (id bigint GENERATED ALWAYS AS IDENTITY). Don't use composite types or arrays.
Relationships are expressed like this:
If the relationship is one-to-many, add a foreign key to the "many' side.
If the relationship is many-to-many, add a "junction table" that has foreign keys to both tables. The primary key is the union of these foreign keys.
Normalize the resulting data model to remove redundancies.
Sprinkle with unique and check constraints as appropriate.
That way your queries will become simple, and you can use your database's features to make your life easier.

Related

PostgreSQL database: Get rid of redundant transitive relation (maybe 3NF is failed)

I'm creating a hybrid between "X-Com Enemy Unknown" and "The Sims". I maintain game state in a database--PostgreSQL--but my question is structural, not engine-specific.
As in X-Com, there are some bases in different locations, so I create a table named Base with ID autoincrement identity as primary key.
Every base has some facilities in its territory, so I create a table named Facility with a foreign key Facility.Base_ID, referring to Base.ID.
Every base has some landing crafts in its hangars, so I create a table named Craft with a foreign key Craft.Base_ID, referring to Base.ID.
Every base has some troopers in its barracks, so I create a table named Trooper with a foreign key Trooper.Base_ID, referring to Base.ID.
Just to this point, everything seems to be ok, doesn't it? However...
I want to have some sort of staff instruction. Like in the X-Com game, every trooper can be assigned to some craft for offense action, or can be unassigned. In addition, every trooper can be assigned to some facility (or can be unassigned) for defense action. So, I have to add nullable foreign keys Trooper.Craft_ID and Trooper.Facility_ID, referring to Craft.ID and Facility.ID respectively.
That database has a redundancy. If some trooper is assigned to a craft or to a facility (or both), it has two (or even three) relations to the base--one direct relation through its Base_ID and some indirect relations as Facility(Trooper.Facility_ID).Base_ID and Craft(Trooper.Craft_ID).Base_ID. Even if I get rid of Trooper.Base_ID (e.g. I can make both assignment mandatory and create a mock craft and a mock facility in every base), I can't get rid of both trooper-facility-base and trooper-craft-base relations.
In addition to this redundancy, there is a worse problem--in case of a mistake, some trooper can be assigned to a craft from one base and to a facility from another base, that's a really nasty situation. I can prohibit it in the application business logic tier, but it's still allowed by the database.
There can be some constraints to apply, but is there any structural modification to the schema that can get rid of the redundancy and potential inconsistency as a result of a good structure, not as a result of constraints?
CREATE TABLE base (
id int PRIMARY KEY
);
CREATE TABLE facility (
id int PRIMARY KEY,
base_id int REFERENCES base
);
CREATE TABLE craft (
id int PRIMARY KEY,
base_id int REFERENCES base
);
CREATE TABLE trooper (
id int PRIMARY KEY,
assigned_facility_id int REFERENCES facility,
assigned_craft_id int REFERENCES craft,
base_id int REFERENCES base
);
Now I want to get some sort of constraints on a trooper t so that
facilities.get(t.assigned_facility_id).base_id IS NULL OR EQUAL TO t.base_id
crafts.get(t.assigned_craft_id).base_id IS NULL OR EQUAL TO t.base_id
This hypothetical constraint has to be applied to table trooper, because it applies in boundaries of each trooper row separately. Constraints on one table have to check equality between fields of two other tables.
I would like to create a database schema where there is exactly one way, having a trooper.id, to find its referenced base.id. How do I normalise my schema?

understanding an inheritance in Postgres; why key "fails" in insert/update command

(One image, tousands of words)
I'd made few tables that are inherited between themselves. (persons)
And then assign child table (address), and relate it only to "base" table (person).
When try to insert in child table, and record is related to inherited table, insert statement fail because there is no key in master table.
And as I insert records in descendant tables, records are salo available in base table (so, IMHO, should be visible/accessible in inherited tables).
Please take a look on attached image. Obviously do someting wrong or didn't get some point....
Thank You in advanced!
Sorry, that's how Postgres table inheritance works. 5.10.1 Caveats explains.
A serious limitation of the inheritance feature is that indexes (including unique constraints) and foreign key constraints only apply to single tables, not to their inheritance children. This is true on both the referencing and referenced sides of a foreign key constraint. Thus, in the terms of the above example:
Specifying that another table's column REFERENCES cities(name) would allow the other table to contain city names, but not capital names. There is no good workaround for this case.
In their example, capitals inherits from cities as organization_employees inherits from person. If person_address REFERENCES person(idt_person) it will not see entries in organization_employees.
Inheritance is not as useful as it seems, and it's not a way to avoid joins. This can be better done with a join table with some extra columns. It's unclear why an organization would inherit from a person.
person
id bigserial primary key
name text not null
verified boolean not null default false
vat_nr text
foto bytea
# An organization is not a person
organization
id bigserial not null
name text not null
# Joins a person with an organization
# Stores information about that relationship
organization_employee
person_id bigint not null references person(id)
organization_id bigint not null references organization(id)
usr text
pwd text
# Get each employee, their name, and their org's name.
select
person.name
organization.name
from
organization_employee
join person on person_id = person.id
join organization on organization_id = organization.id
Use bigserial (bigint) for primary keys, 2 billion comes faster than you think
Don't enshrine arbitrary business rules in the schema, like how long a name can be. You're not saving any space by limiting it, and every time the business rule changes you have to alter your schema. Use the text type. Enforce arbitrary limits in the application or as constraints.
idt_table_name primary keys makes for long, inconsistent column names hard to guess. Why is the primary key of person_address not idt_person_address? Why is the primary key of organization_employee idt_person? You can't tell, at a glance, which is the primary key and which is a foreign key. You still need to prepend the column name to disambiguate; for example, if you join person with person_address you need person.idt_person and person_address.idt_person. Confusing and redundant. id (or idt if you prefer) makes it obvious what the primary key is and clearly differentiates it from table_id (or idt_table) foreign keys. SQL already has the means to resolve ambiguities: person.id.

Many-to-Many in Postgres?

I went with PostgreSQL because it is an ORDMBS rather than a standard relational DBMS. I have a class/object (below) that I would like to implement into the database.
class User{
int id;
String name;
ArrayList<User> friends;
}
Now, a user has many friends, so, logically, the table should be declared like so:
CREATE TABLE user_table(
id INT,
name TEXT,
friends TYPEOF(user_table)[]
)
However, to my knowledge, it is not possible to use a row of a table as a type (-10 points for postgreSQL), so, instead, my array of friends is stored as integers:
CREATE TABLE user_table(
id INT,
name TEXT,
friends INT[]
)
This is an issue because elements of an array cannot reference - only the array itself can. Added to this, there seems to be no way to import the whole user (that is to say, the user and all the user's friends) without doing multiple queries.
Am I using postgreSQL wrong? It seems to me that the only efficient way to use it is by using a relational approach.
I want a cleaner object-oriented approach similar to that of Java.
I'm afraid you are indeed using PostgreSQL wrong, and possibly misunderstanding the purpose of Object-relational databases as opposed to classic relational databases. Both classes of database are still inherently relational, but the former provides allowances for inheritance and user-defined types that the latter does not.
This answer to one of your previous questions provides you with some great pointers to achieve what you're trying to do using the Postgres pattern.
Well, first off PostgreSQL absolutely supports arrays of complex types like you describe (although I don't think it has a TYPEOF operator). How would the declaration you describe work, though? You are trying to use the table type in the declaration of the table. If what you want is a composite type in an array (and I'm not really sure that it is) you would declare this in two steps:
CREATE TYPE ima_type AS ( some_id integer, some_val text);
CREATE TABLE ima_table
( some_other_id serial NOT NULL
, friendz ima_type []
)
;
That runs fine. You can also create arrays of table types, because every table definition is a type definition in Postgres.
However, in a relational database, a more traditional model would use two tables:
CREATE TABLE persons
( person_id serial NOT NULL PRIMARY KEY
, person_name text NOT NULL
)
;
CREATE TABLE friend_lookup
( person_id integer FOREIGN KEY REFERENCES persons
, friend_id integer FOREIGN KEY REFERENCES persons(person_id)
, CONSTRAINT uq_person_friend UNIQUE (person_id, friend_id)
)
;
Ignoring the fact that the persons table has absolutely no way to prevent duplicate persons (what about misspellings, middle initials, spacing, honorifics, etc?; also two different people can have the same name), this will do what you want and allow for a simple query that lists all friends.

Entity Framework - Join table with composite key and a primary key

I am struggling with the way entity framework handles join tables, specifically because entity framework requires that a join table has a composite key composed of the primary keys on the two related entities I want the hold the relationship for. The problem here is that I need to hold a relationship to the relationship so to speak.
This may be a problem with my database design or equally due to my lack of understanding with EF. It is probably best illustrated through example (see below);
I have three tables each with a primary key:-
Table : DispatchChannel
{ *DispatchChannelID integer }
Table : Format
{ *FormatID integer }
Table : EventType
{ *EventTypeID integer }
The relationship between EventTypes and DispatchChannels is held in EventTypeDispatchChannels (see below) since this only contains a composite key it is not pulled through into our model and entity framework takes care of maintaining the relationship.
Table : EventTypeDispatchChannels
{ EventTypeID integer, DispatchChannelID integer
}
My problem now arises because for each combination of EventTypeID and DispatchChannelID I want to hold a list of available formats, this would be easy if my EventTypeDispatchChannels table had a primary key therefore my other join table would look like this;
Table : EventTypeDispatchChannelFormats
{ EventTypeDispatchChannelID integer, FormatID integer
}
The absence of a primary key on EventTypeDispatchChannels is where I am struggling to make this work, however if I had the key then entity framework no longer sees this as a linked entity.
I'm relatively new to C# so apologies if I have not explained this so well, but any advice would be appreciated.
The moment you want to give an association a more important role than just being a piece of string between two classes, the association becomes a first-class citizen of your domain and it's justified to make it part of the class model. It's also inevitable, but that's secondary.
So you should map EventTypeDispatchChannels to a class. The table could have its own simple primary key besides the two foreign keys. A simle PK is probably easier, so your table Format can do with a simple foreign key to EventTypeDispatchChannels for the one-to-many association.
You will lose the many to many feature to simply address dispatchChannel.Events. In stead you have to do
db.DispatchChannels.Where(d => d.DispatchChannelID == 1)
.SelectMany(d => d.EventTypeDispatchChannels)
.Select(ed => ed.Event)
On the other hand you have gained the possibility to create an association by just creating an EventTypeDispatchChannel and setting its primitive foreign key values. Many-to-many associations with a transparent junction table can only be set by adding objects to a collection (add an Event to dispatchChannel.Events). This means that the collection must be loaded and you need an Event object, which is more expensive in database round trips.

When to use inherited tables in PostgreSQL?

In which situations you should use inherited tables? I tried to use them very briefly and inheritance didn't seem like in OOP world.
I thought it worked like this:
Table users has all fields required for all user levels. Tables like moderators, admins, bloggers, etc but fields are not checked from parent. For example users has email field and inherited bloggers has it now too but it's not unique for both users and bloggers at the same time. ie. same as I add email field to both tables.
The only usage I could think of is fields that are usually used, like row_is_deleted, created_at, modified_at. Is this the only usage for inherited tables?
There are some major reasons for using table inheritance in postgres.
Let's say, we have some tables needed for statistics, which are created and filled each month:
statistics
- statistics_2010_04 (inherits statistics)
- statistics_2010_05 (inherits statistics)
In this sample, we have 2.000.000 rows in each table. Each table has a CHECK constraint to make sure only data for the matching month gets stored in it.
So what makes the inheritance a cool feature - why is it cool to split the data?
PERFORMANCE: When selecting data, we SELECT * FROM statistics WHERE date BETWEEN x and Y, and Postgres only uses the tables, where it makes sense. Eg. SELECT * FROM statistics WHERE date BETWEEN '2010-04-01' AND '2010-04-15' only scans the table statistics_2010_04, all other tables won't get touched - fast!
Index size: We have no big fat table with a big fat index on column date. We have small tables per month, with small indexes - faster reads.
Maintenance: We can run vacuum full, reindex, cluster on each month table without locking all other data
For the correct use of table inheritance as a performance booster, look at the postgresql manual.
You need to set CHECK constraints on each table to tell the database, on which key your data gets split (partitioned).
I make heavy use of table inheritance, especially when it comes to storing log data grouped by month. Hint: If you store data, which will never change (log data), create or indexes with CREATE INDEX ON () WITH(fillfactor=100); This means no space for updates will be reserved in the index - index is smaller on disk.
UPDATE:
fillfactor default is 100, from http://www.postgresql.org/docs/9.1/static/sql-createtable.html:
The fillfactor for a table is a percentage between 10 and 100. 100 (complete packing) is the default
"Table inheritance" means something different than "class inheritance" and they serve different purposes.
Postgres is all about data definitions. Sometimes really complex data definitions. OOP (in the common Java-colored sense of things) is about subordinating behaviors to data definitions in a single atomic structure. The purpose and meaning of the word "inheritance" is significantly different here.
In OOP land I might define (being very loose with syntax and semantics here):
import life
class Animal(life.Autonomous):
metabolism = biofunc(alive=True)
def die(self):
self.metabolism = False
class Mammal(Animal):
hair_color = color(foo=bar)
def gray(self, mate):
self.hair_color = age_effect('hair', self.age)
class Human(Mammal):
alcoholic = vice_boolean(baz=balls)
The tables for this might look like:
CREATE TABLE animal
(name varchar(20) PRIMARY KEY,
metabolism boolean NOT NULL);
CREATE TABLE mammal
(hair_color varchar(20) REFERENCES hair_color(code) NOT NULL,
PRIMARY KEY (name))
INHERITS (animal);
CREATE TABLE human
(alcoholic boolean NOT NULL,
FOREIGN KEY (hair_color) REFERENCES hair_color(code),
PRIMARY KEY (name))
INHERITS (mammal);
But where are the behaviors? They don't fit anywhere. This is not the purpose of "objects" as they are discussed in the database world, because databases are concerned with data, not procedural code. You could write functions in the database to do calculations for you (often a very good idea, but not really something that fits this case) but functions are not the same thing as methods -- methods as understood in the form of OOP you are talking about are deliberately less flexible.
There is one more thing to point out about inheritance as a schematic device: As of Postgres 9.2 there is no way to reference a foreign key constraint across all of the partitions/table family members at once. You can write checks to do this or get around it another way, but its not a built-in feature (it comes down to issues with complex indexing, really, and nobody has written the bits necessary to make that automatic). Instead of using table inheritance for this purpose, often a better match in the database for object inheritance is to make schematic extensions to tables. Something like this:
CREATE TABLE animal
(name varchar(20) PRIMARY KEY,
ilk varchar(20) REFERENCES animal_ilk NOT NULL,
metabolism boolean NOT NULL);
CREATE TABLE mammal
(animal varchar(20) REFERENCES animal PRIMARY KEY,
ilk varchar(20) REFERENCES mammal_ilk NOT NULL,
hair_color varchar(20) REFERENCES hair_color(code) NOT NULL);
CREATE TABLE human
(mammal varchar(20) REFERENCES mammal PRIMARY KEY,
alcoholic boolean NOT NULL);
Now we have a canonical reference for the instance of the animal that we can reliably use as a foreign key reference, and we have an "ilk" column that references a table of xxx_ilk definitions which points to the "next" table of extended data (or indicates there is none if the ilk is the generic type itself). Writing table functions, views, etc. against this sort of schema is so easy that most ORM frameworks do exactly this sort of thing in the background when you resort to OOP-style class inheritance to create families of object types.
Inheritance can be used in an OOP paradigm as long as you do not need to create foreign keys on the parent table. By example, if you have an abstract class vehicle stored in a vehicle table and a table car that inherits from it, all cars will be visible in the vehicle table but a foreign key from a driver table on the vehicle table won't match theses records.
Inheritance can be also used as a partitionning tool. This is especially usefull when you have tables meant to be growing forever (log tables etc).
Main use of inheritance is for partitioning, but sometimes it's useful in other situations. In my database there are many tables differing only in a foreign key. My "abstract class" table "image" contains an "ID" (primary key for it must be in every table) and PostGIS 2.0 raster. Inherited tables such as "site_map" or "artifact_drawing" have a foreign key column ("site_name" text column for "site_map", "artifact_id" integer column for the "artifact_drawing" table etc.) and primary and foreign key constraints; the rest is inherited from the the "image" table. I suspect I might have to add a "description" column to all the image tables in the future, so this might save me quite a lot of work without making real issues (well, the database might run little slower).
EDIT: another good use: with two-table handling of unregistered users, other RDBMSs have problems with handling the two tables, but in PostgreSQL it is easy - just add ONLY when you are not interrested in data in the inherited "unregistered user" table.
The only experience I have with inherited tables is in partitioning. It works fine, but it's not the most sophisticated and easy to use part of PostgreSQL.
Last week we were looking the same OOP issue, but we had too many problems with Hibernate - we didn't like our setup, so we didn't use inheritance in PostgreSQL.
I use inheritance when I have more than 1 on 1 relationships between tables.
Example: suppose you want to store object map locations with attributes x, y, rotation, scale.
Now suppose you have several different kinds of objects to display on the map and each object has its own map location parameters, and map parameters are never reused.
In these cases table inheritance would be quite useful to avoid having to maintain unnormalised tables or having to create location id’s and cross referencing it to other tables.
I tried some operations on it, I will not point out if is there any actual use case for database inheritance, but I will give you some detail for making your decision. Here is an example of PostgresQL: https://www.postgresql.org/docs/15/tutorial-inheritance.html
You can try below SQL script.
CREATE TABLE IF NOT EXISTS cities (
name text,
population real,
elevation int -- (in ft)
);
CREATE TABLE IF NOT EXISTS capitals (
state char(2) UNIQUE NOT NULL
) INHERITS (cities);
ALTER TABLE cities
ADD test_id varchar(255); -- Both table would contains test col
DROP TABLE cities; -- Cannot drop because capitals depends on it
ALTER TABLE cities
ADD CONSTRAINT fk_test FOREIGN KEY (test_id) REFERENCES sometable (id);
As you can see my comments, let me summarize:
When you add/delete/update fields -> the inheritance table would also be affected.
Cannot drop the parent table.
Foreign keys would not be inherited.
From my perspective, in growing applications, we cannot easily predict the changes in the future, for me I would avoid applying this to early database developing.
When features are stable as well and we want to create some database model which much likely the same as the existing one, we can consider that use case.
Use it as little as possible. And that usually means never, it boiling down to a way of creating structures that violate the relational model, for instance by breaking the information principle and by creating bags instead of relations.
Instead, use table partitioning combined with proper relational modelling, including further normal forms.