Postgres unique constraint across multiple values of a column - postgresql

I've got a tricky uniqueness requirement in one of my tables.
Say we've got a table of dogs. Dogs live in houses.
CREATE TABLE dogs (
    dog_id integer,
    house_id integer,
    dogname varchar
);
A dog's name must be unique within a house. There is also a "main" house, and no dog may have a name that is the same as a dog in the main house.
Example, where house_id 0 is the "main" house:
dog_id  house_id  dogname
1       0         Fido
2       0         Rover
3       1         Shep
4       1         Shep     -- FAIL, not unique in house 1
5       2         Shep     -- ok, allowed
6       2         Fido     -- FAIL, conflict with main house
How do I create a uniqueness constraint that models this?
I'm thinking there is a way to do it with an exclusion constraint, but I haven't figured out how.
I would prefer to do this with a constraint instead of a trigger because I also want to do upserts on this table, and you can only use ON CONFLICT with constraints.

For future visitors: the only way I could solve this was to query for a conflicting record first, and then do the insert only if there was none. This works if you wrap both commands in a transaction. I had to do this in the application, not at the database level. Not an ideal solution.
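A minimal sketch of that check-then-insert workaround (names as above; the UNIQUE constraint on (house_id, dogname) is an assumption, covering the within-house rule and giving ON CONFLICT a target):

```sql
-- Within-house uniqueness can still be a real constraint:
ALTER TABLE dogs ADD CONSTRAINT dogs_house_name_uniq UNIQUE (house_id, dogname);

-- The cross-check against the main house has to happen before the insert.
-- Both statements run in one transaction; under concurrent writers this
-- still needs SERIALIZABLE isolation (or an explicit lock) to be airtight.
BEGIN ISOLATION LEVEL SERIALIZABLE;
INSERT INTO dogs (dog_id, house_id, dogname)
SELECT 7, 2, 'Rex'
WHERE NOT EXISTS (
    SELECT 1 FROM dogs WHERE house_id = 0 AND dogname = 'Rex'
);
COMMIT;
```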


One-way multi column UNIQUE constraint

I have a system in which I am trying to describe event-based interactions between two targets. In our system, an event (interaction) has a "source" and a "target" (basically who [did what] to whom):
-- tried to remove some of the "noise" from this for the sake of the post:
CREATE TABLE interaction_relationship (
    id integer CONSTRAINT interaction_pk PRIMARY KEY,
    source_id integer NOT NULL CONSTRAINT source_fk REFERENCES entity(id),
    target_id integer NOT NULL CONSTRAINT target_fk REFERENCES entity(id),
    -- CONSTRAINT(s)
    CONSTRAINT interaction_relationship_deduplication UNIQUE (source_id, target_id)
);
Constraint interaction_relationship_deduplication is the source of my question:
In our system, a single source can interact with a single target multiple times, but that relationship can only exist once, i.e. if I am a mechanic working on a car, I may see that car multiple times in my shop, but I have only one relationship with that single car:
id   source_id   target_id
a    123abc      456def
b    123abc      789ghi
Ideally, this table also represents a unidirectional relationship. source_id is always the "owner" of the interaction, i.e. if the car 456def ran over the mechanic 123abc there would be another entry in the interaction_relationship table:
id   source_id   target_id
1    123abc      456def
2    123abc      789ghi
3    456def      123abc
So, my question: does UNIQUE on multiple columns take value order into consideration? Or would the above cause a failure?
does UNIQUE on multiple columns take value order into consideration?
Yes. The tuple (123abc, 456def) is different from the tuple (456def, 123abc); both may exist in the table at the same time.
That said, you might want to remove the surrogate id from the relationship table; there is hardly any use for it. A relation table (as opposed to an entity table, and even there) does fine with a multi-column primary key, which here is naturally the combination of source and target.
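For example, the same table with the natural composite key instead of the surrogate id (a sketch of the suggestion above):

```sql
CREATE TABLE interaction_relationship (
    source_id integer NOT NULL REFERENCES entity(id),
    target_id integer NOT NULL REFERENCES entity(id),
    CONSTRAINT interaction_pk PRIMARY KEY (source_id, target_id)
);
```

The primary key now doubles as the deduplication constraint, so the separate UNIQUE is no longer needed.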

Moving data between PostgreSQL databases respecting conflicting keys

Situation
I have two databases that were at one time direct copies of each other, but each now contains new, different data.
What do I want to do
I want to move data from database "SOURCE" to database "TARGET". The problem is that the tables use auto-incremented keys, and since both databases are in active use, many of the IDs are already taken in TARGET, so I cannot simply identity-insert the data coming from SOURCE.
In theory we could skip identity insert entirely and let the database assign new IDs.
What makes it harder is that we have around 50 tables connected by foreign keys. Clearly the foreign keys will also have to be remapped, or they will no longer reference the correct rows.
Let's see a very simplified example:
table Human {
id integer NOT NULL PK AutoIncremented
name varchar NOT NULL
parentId integer NULL FK -> Human.id
}
table Pet {
id integer NOT NULL PK AutoIncremented
name varchar NOT NULL
ownerId integer NOT NULL FK -> Human.id
}
SOURCE Human
Id name parentId
==========================
1 Aron null
2 Bert 1
3 Anna 2
SOURCE Pet
Id name ownerId
==========================
1 Frankie 1
2 Doggo 2
TARGET Human
Id name parentId
==========================
1 Armin null
2 Cecil 1
TARGET Pet
Id name ownerId
==========================
1 Gatto 2
Let's say I want to move Aron, Bert, Anna, Frankie and Doggo to the TARGET database.
But if we insert them directly without preserving the original ids, the foreign keys get garbled:
TARGET Human
Id name parentId
==========================
1 Armin null
2 Cecil 1
3 Aron null
4 Bert 1
5 Anna 2
TARGET Pet
Id name ownerId
==========================
1 Gatto 2
2 Frankie 1
3 Doggo 2
Anna's parent is now Cecil instead of Bert, Doggo's owner is Cecil instead of Bert, and Bert's parent is Armin instead of Aron.
How I want it to look is:
TARGET Human
Id name parentId
==========================
1 Armin null
2 Cecil 1
3 Aron null
4 Bert 3
5 Anna 4
TARGET Pet
Id name ownerId
==========================
1 Gatto 2
2 Frankie 3
3 Doggo 4
Imagine having around 50 similar tables with thousands of rows, so the solution has to be automated.
Questions
Is there a specific tool I can utilize?
Is there some simple SQL logic to precisely do that?
Do I need to roll my own software to do this (e.g. a service that connects to both databases, read everything in EF with including all relations, and save it to the other DB)? I fear that there are too many gotchas and it is time consuming.
Is there a specific tool? Not as far as I know.
Is there some simple SQL? Not exactly simple, but not all that complex either.
Do you need to roll your own? Maybe, depending on whether you use the SQL below.
I would guess there is no direct path; the problem, as you note, is getting the FK values reassigned. The following adds a column to every table that can be used to match rows across the databases. For this I would use a uuid. With it you can copy from one table set to the other, leaving the FKs aside; after copying, you join on the uuid to fill the FKs in.
-- establish a reference field unique across databases
alter table target_human add sync_id uuid default gen_random_uuid();
alter table target_pet   add sync_id uuid default gen_random_uuid();
alter table source_human add sync_id uuid default gen_random_uuid();
alter table source_pet   add sync_id uuid default gen_random_uuid();

-- copy source humans to target, except parentid
insert into target_human(name, sync_id)
select name, sync_id
from source_human;

-- reassign parentid: map each child's old parent to that parent's new id in target
with conv (sync_parent, sync_child, new_parent) as
  ( select h2p.sync_id sync_parent, h2c.sync_id sync_child, h1.id new_parent
    from source_human h2c
    join source_human h2p on h2c.parentid = h2p.id
    join target_human h1  on h1.sync_id = h2p.sync_id
  )
update target_human h1
set parentid = c.new_parent
from conv c
where h1.sync_id = c.sync_child;

-- pets: ownerid is NOT NULL, so relax it while rows are copied without owners
alter table target_pet alter column ownerid drop not null;

insert into target_pet(name, sync_id)
select name, sync_id
from source_pet;

with conv (sync_pet, new_owner) as
  ( select p2.sync_id, h1.id
    from source_pet p2
    join source_human h2 on p2.ownerid = h2.id
    join target_human h1 on h2.sync_id = h1.sync_id
  )
update target_pet p1
set ownerid = c.new_owner
from conv c
where p1.sync_id = c.sync_pet;

alter table target_pet alter column ownerid set not null;
See demo. You now reverse the source and target table definitions to complete the other side of the sync. You can then drop the uuid columns if desired, but you may want to keep them: if the databases have gotten out of sync once, they will again. You could even go a step further and make the uuid your PK/FK; then you could just copy the data and the keys would remain correct, though that might mean updating the apps for the revised DB structure.

This does not address communication across databases, but I assume you already have that handled. You will need to repeat this for each table set; perhaps you can write a script to generate the statements. I would guess this has fewer gotchas and is less time-consuming than rolling your own. It is basically 5 queries per table set, and to clean up the current mess, 500 queries is not that much.

RDBMS Ref. Integrity: A child with too many parents?

I have a general design question. Consider these 3 tables:
Table Restaurants:
RID pk auto_increment
etc...
Table Vacations:
VID pk auto_increment
etc...
Table Movies:
MID pk auto_increment
etc...
And now imagine we want to create a list "Top things to do when COVID is over" of selected records from these 3 different tables. The list may contain any mix of records from these tables. What comes to mind then is:
Table Todo:
Type [ one of R, V, M ]
ID [ the ID of the parent item ]
But how would you enforce referential integrity on this thing? I.e., how do we ensure that when a restaurant is deleted from Restaurants, it will also drop from Todo?
(I am aware of how to accomplish these things with triggers; Curious if there's a combination of entities that will accomplish this with pure RDBMS ref. int.)
Thank you!
You can add nullable foreign key columns in your todo table for each target table you have. So your table will look like:
Table Todo:
RID fk (nullable)
VID fk (nullable)
MID fk (nullable)
The type column isn't needed anymore, as you can check which column is filled with a foreign key. Obviously you have to add a CHECK constraint to ensure that exactly one foreign key is set.
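In Postgres, for example, a minimal sketch of that design could look like this (column names assumed; num_nonnulls() counts its non-null arguments and requires 9.6+, and ON DELETE CASCADE makes a todo row disappear with its parent, which answers the deletion question above):

```sql
CREATE TABLE Todo (
    rid integer REFERENCES Restaurants (rid) ON DELETE CASCADE,
    vid integer REFERENCES Vacations (vid) ON DELETE CASCADE,
    mid integer REFERENCES Movies (mid) ON DELETE CASCADE,
    CONSTRAINT exactly_one_parent CHECK (num_nonnulls(rid, vid, mid) = 1)
);
```

On databases without num_nonnulls(), the same rule can be spelled out as a sum of (col IS NOT NULL) cases.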

SymmetricDS pk alternative

After reading the SymmetricDS userguide I'm not sure if SymmetricDS supports conflict resolution which is not based on PK but exclusively on my own custom columns.
Given the following scenario:
2 nodes with bi-directional update
each node has one table products which must be synchronized
Now, the table schema looks like this (simplified):
id (pk) | name (char) | reference (char)
What I would like to know is, is it possible to define the column reference as identifier for conflict resolution and insert / update operations instead of the pk column id?
Example:
Node0
id (pk) | name (char) | reference (char)
1 Foo IN001
2 FooBaz IN003
----
Node1
id (pk) | name (char) | reference (char)
1 Bar EX001
2 Foo IN001
Changing row 2 in Node1 should trigger an update of row 1 in Node0 (they share reference IN001), while creating a new record in either node should trigger an insert in the other node, bearing in mind that the PK might already be taken.
Furthermore, I would like to filter the rows to be synchronized by the value of the reference column: only rows where reference startswith('IN') == True should be synced.
Thanks!
Look at the column 'SYNC_KEY_NAMES' on the TRIGGER table.
Specify a comma-delimited list of columns that should be used as the
key for synchronization operations. By default, if not specified, then
the primary key of the table will be used.
If you insert the value 'reference' into this column, SDS will handle that column as the PK.
Leaving id as a PK creates a hurdle. If this column auto-increments, you can try to exclude it via the trigger table column 'EXCLUDED_COLUMN_NAMES'. Since it is the PK, I don't know whether SDS will ignore it or not.
If that does not work you will have to write a Custom Load Filter to increment the id field on insert.

Constraint to avoid combination of foreign keys

I have a problem here for which I couldn't find a proper solution in my research; maybe I couldn't work out the exact terms to search for, so if this is a duplicate I will delete it.
I want to know whether it is possible to prevent a certain combination of values across two fields. I will show the structure and the kind of data I want to forbid; it will be easier to understand.
Table_A                      Table_B
------------------------     -------------------------------
id integer (PK)              id integer (PK)
description varchar(50)      title varchar(50)
                             id1_fromA (FK A->id)
                             id2_fromA (FK A->id)
I'm trying to validate the following data on table Table_B (combination is between id1_fromA and id2_fromA)
id  title        id1_fromA  id2_fromA
1   Some Title   1          2          -- permitted
2   Some other   1          2          -- duplicate, NOT ALLOWED
3   One more     1          1          -- equal values, NOT ALLOWED
4   Another      2          1          -- same pair as row 1, NOT ALLOWED
5   Sample data  3          2          -- ok
With the above data I can easily handle row ID=2 with
ALTER TABLE table_B ADD CONSTRAINT UK_TO_A_FKS UNIQUE (id1_fromA, id2_fromA);
and row ID=3 with
ALTER TABLE table_B ADD CONSTRAINT CHK_TO_A_FKS CHECK (id1_fromA != id2_fromA);
My problem is with row ID=4: I want the combination (2,1) to count as a duplicate of (1,2). Is it possible to do this with a CONSTRAINT, an INDEX, or a UNIQUE, or will I need a trigger or a procedure?
Thanks in advance.
You can't do this with a unique constraint, but you can do this with a unique index.
create unique index UK_TO_A_FKS
on table_b (least(id1_froma, id2_froma), greatest(id1_froma, id2_froma));
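With that index in place, the reversed pair from the example is rejected as a duplicate (the CHECK from the question still handles the id1 = id2 case):

```sql
INSERT INTO table_b (id, title, id1_fromA, id2_fromA)
VALUES (1, 'Some Title', 1, 2);   -- ok: indexed as (1, 2)

INSERT INTO table_b (id, title, id1_fromA, id2_fromA)
VALUES (4, 'Another', 2, 1);      -- fails: also normalizes to (1, 2)
```

Because the index is on least()/greatest() expressions, both orderings of a pair map to the same index entry, which is exactly what makes the combination unique regardless of column order.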