Postgres: check if a related table has an entry when a field is True - postgresql

I have a table with a boolean column automated. When that field is set to TRUE, then another table needs to have an entry referring to that row.
Table A
id | automated
---+----------
 1 | False
 2 | True
 3 | False

Table B
id | FK-TableA | Value
---+-----------+------
 2 |         2 | X
So whenever a new entry is inserted into Table A with automated set to TRUE, a row also has to be inserted (or already present) in Table B with a reference to it.

It seems an unnatural flow to me; with the constraint you are stating, the natural flow would be to create a TRIGGER on Table B that inserts a record into Table A whenever a new Table B record is inserted.
But I understand this is a simplification of a more elaborate problem, so if you really need this kind of procedure, there is still a question to be answered: what happens when the check is negative? Should there be an exception? Should the record be inserted with FALSE instead of TRUE? Should the record be ignored? There are two options from my point of view (a sketch of the first follows below):
Create a TRIGGER before INSERT on Table A that updates the row accordingly (i.e. a PROCEDURE that performs the check and a TRIGGER that executes it).
Create a RULE on INSERT on Table A that checks whether a matching record exists in Table B and either changes the record or does nothing.
With a little more background I can help you with the trigger/rule.
Anyway, take into account that this can be a real mistake performance-wise if this table gets lots of INSERTs; in that case you should go for some offline procedure (i.e. not run on the live INSERT) instead.
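A minimal sketch of the first option, choosing the "insert with FALSE instead of TRUE" behavior on a negative check (the table names table_a and table_b and the function name are mine):

CREATE FUNCTION check_automated() RETURNS trigger AS $$
BEGIN
    -- demote automated to FALSE when no Table B row references this row yet
    IF NEW.automated AND NOT EXISTS (
        SELECT 1 FROM table_b WHERE "FK-TableA" = NEW.id
    ) THEN
        NEW.automated := FALSE;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER table_a_check_automated
BEFORE INSERT ON table_a
FOR EACH ROW EXECUTE PROCEDURE check_automated();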

It is ugly and introduces a redundancy into the database, but I cannot think of a better way than this:
Introduce a new column b_id to a.
Add a UNIQUE constraint on ("FK-TableA", id) to b.
Add a foreign key on a so that (id, b_id) REFERENCES b("FK-TableA", id).
Add a CHECK (b_id IS NOT NULL OR NOT automated) constraint on a.
Then you have to point b_id to one of the rows in b that points back to this a row.
To make it perfect, you'd have to add triggers that guarantee that after each modification, the two foreign keys are still consistent.
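In DDL, a sketch of those steps (assuming the tables are literally named a and b):

ALTER TABLE a ADD COLUMN b_id integer;
ALTER TABLE b ADD UNIQUE ("FK-TableA", id);
ALTER TABLE a ADD FOREIGN KEY (id, b_id) REFERENCES b ("FK-TableA", id);
ALTER TABLE a ADD CHECK (b_id IS NOT NULL OR NOT automated);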
I told you it was ugly!


How can a relational database with foreign key constraints ingest data that may be in the wrong order?

The database is ingesting data from a stream, and all the rows needed to satisfy a foreign key constraint may be late or never arrive.
This could likely be accomplished by using another datastore, one without foreign key constraints, and then, once all the needed data is available, loading it into the database that has the FK constraints. However, this adds complexity and I'd like to avoid it.
We're working on a solution that creates "placeholder" rows to point the foreign key to. When the real data comes in, the placeholder is replaced with real values. Again, this adds complexity, but it's the best solution we've found so far.
How do people typically solve this problem?
Edit: Some sample data which might help explain the problem:
Let's say we have these tables:
CREATE TABLE "order" (  -- "order" is a reserved word, so it needs quoting
    id INTEGER NOT NULL,
    order_number INTEGER,
    PRIMARY KEY (id),
    UNIQUE (order_number)
);
CREATE TABLE line_item (
    id INTEGER NOT NULL,
    order_number INTEGER REFERENCES "order" (order_number),
    PRIMARY KEY (id)
);
If I insert an order first, no problem! But let's say I try:
INSERT INTO line_item (order_number) VALUES (123);
before order 123 has been inserted. This will fail the FK constraint, of course. But this might be the order in which I get the data, since it's read from a stream that collects this data from multiple sources.
Also, to address #philpxy's question, I didn't really find much on this. One thing that was mentioned was deferred constraints. This is a mechanism that waits to check the FK constraints until the end of a transaction. I don't think that is possible in my case, however, since these INSERT statements will be run at random times, whenever the data is received.
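For reference, a deferrable version of the constraint (in place of the inline REFERENCES above) would look like the sketch below; the constraint name is mine. It only helps when the out-of-order inserts happen inside a single transaction, which a stream of independent inserts cannot guarantee:

ALTER TABLE line_item
    ADD CONSTRAINT line_item_order_number_fkey
    FOREIGN KEY (order_number) REFERENCES "order" (order_number)
    DEFERRABLE INITIALLY DEFERRED;

BEGIN;
INSERT INTO line_item (id, order_number) VALUES (1, 123);  -- out of order, but OK
INSERT INTO "order" (id, order_number) VALUES (10, 123);
COMMIT;  -- the FK is checked here and passes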
You have a business workflow problem, because line items of individual orders are coming in before the orders themselves have come in. One workaround, perhaps not ideal, would be to create a before insert trigger which checks, for every incoming insert to the line_item table, whether that order already exists in the order table. If not, then it will first insert the order record before trying the insert on line_item.
CREATE OR REPLACE FUNCTION "public"."fn_insert_order" () RETURNS trigger AS $$
BEGIN
    INSERT INTO "order" (order_number)
    SELECT NEW.order_number
    WHERE NOT EXISTS (SELECT 1 FROM "order" WHERE order_number = NEW.order_number);
    RETURN NEW;
END
$$ LANGUAGE plpgsql;

-- trigger
CREATE TRIGGER "trigger_insert_order"
BEFORE INSERT ON line_item FOR EACH ROW
EXECUTE PROCEDURE fn_insert_order();
Note: I am assuming that the id column of the order table is in fact auto-increment, in which case Postgres will automatically assign a value to it when inserting as above. Most likely this is what you want, as having two id columns that both need to be manually assigned does not make much sense.
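With the trigger in place (and assuming "order".id is indeed auto-generated, e.g. declared SERIAL), an out-of-order insert now succeeds:

INSERT INTO line_item (id, order_number) VALUES (1, 123);
-- the trigger created a placeholder order on the fly:
SELECT * FROM "order" WHERE order_number = 123;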
You could accomplish that with a BEFORE INSERT trigger on line_item.
In that trigger you query order to see if a matching item exists, and if not, you insert a dummy row.
That will allow the INSERT to succeed, at the cost of some performance.
To insert rows into order, use

INSERT INTO "order" ...
ON CONFLICT (order_number) DO UPDATE
SET id = EXCLUDED.id;
Updating a primary key is problematic and may lead to conflicts. One way you could get around that is to use negative ids for artificially generated orders (assuming that the real ids are positive). If you have any references to that primary key, you'd have to define the constraint with ON UPDATE CASCADE.
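Putting those pieces together, a minimal sketch of this approach (the sequence, function, and trigger names are mine; a descending sequence supplies the negative ids):

CREATE SEQUENCE dummy_order_id INCREMENT BY -1;  -- yields -1, -2, -3, ...

CREATE FUNCTION create_dummy_order() RETURNS trigger AS $$
BEGIN
    -- insert a dummy order so the FK on line_item is satisfied
    INSERT INTO "order" (id, order_number)
    SELECT nextval('dummy_order_id'), NEW.order_number
    WHERE NOT EXISTS (SELECT 1 FROM "order" WHERE order_number = NEW.order_number);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER line_item_dummy_order
BEFORE INSERT ON line_item FOR EACH ROW
EXECUTE PROCEDURE create_dummy_order();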

How to delete child records using foreign keys and relational table in Postgres

I have a table design where "notes" for various entities are handled using a relational table.
For example, the following tables exist
'notes' table having fields id and note
'knifes' table having an id as its only field
'knife_notes' table having knife_id and note_id, foreign keys to 'id' in the knifes table and 'id' in the notes table, respectively.
Update: the note_id field in the knife_notes table is unique, so that each note can only be related to one particular knife.
When adding a note (a child) for a knife (the parent), a note record is created and a record in table knife_notes is created too, thereby relating a note id and a knife id.
The two foreign keys are having 'On Delete Cascade'.
However, when a knife is deleted, only the records in knife_notes are 'cascade' deleted, not the records in the notes table.
Do I need a second query to delete the notes records when deleting a knife, or is there a better way?
What you did was create an n-to-m relationship between knives and notes. Are you sure that is what you need?
Because, the way your datamodel is set up now, it is actually desirable that notes aren't deleted. The current model allows (among other things):
A. A knife that has 1 note
B. A specific knife that has more than 1 note (2 rows in knife_notes that point to the same knife and to different notes)
C. A specific note that is related to multiple knives
Because of scenario C the database can't just cascade from the knife_notes table to the notes table: there might be other rows in knife_notes that still depend on a specific note.
To make it visual, think of this scenario in the knife_notes table:
id | knife_id | note_id
---+----------+--------
 1 |       11 |     101
 2 |       11 |     102
 3 |       12 |     103
 4 |       13 |     103
From a database point of view this is perfectly legal; note 103 is used for both knife 12 and knife 13. Now if you were to delete knife 12, the database can't simply remove note 103, because it could still be used by other knives (and in fact it is).
There are two solutions I can think of:
1. simplify your datamodel
2. create a trigger or rule in PostgreSQL
Expanding on the options:
datamodel:
Why don't you create a datamodel that only has
Knife
Note (with a foreign key to knife)
This way a knife can have multiple notes, but a note is always related to a knife. This will not work, though, if a note is used multiple times or in multiple roles (i.e. if you also have a table "guns" that also needs notes, or if a specific note is used by multiple rows in the knife table, such as "the color of this knife is red").
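A sketch of that simplified model, keeping the question's table names (knifes, notes):

CREATE TABLE knifes (
    id serial PRIMARY KEY
);
CREATE TABLE notes (
    id serial PRIMARY KEY,
    note text,
    knife_id int NOT NULL REFERENCES knifes (id) ON DELETE CASCADE
);
-- deleting a knife now removes its notes automatically:
-- DELETE FROM knifes WHERE id = 12;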
triggers or rules
With rules you can rewrite your SQL query. With triggers you can execute a function just before or after an operation on a table.
Look up triggers & rewrite rules.
Myself, I'm most used to working with triggers. I would basically create an on-delete trigger that executes a trigger function. In that function I would check whether the note is used anywhere else and, if not, delete the note.
CREATE FUNCTION try_to_delete_note() RETURNS trigger AS $$
DECLARE
    usage_found int;
BEGIN
    -- check whether the note is still used anywhere else
    SELECT count(*) INTO usage_found FROM knife_notes WHERE note_id = OLD.note_id;
    IF (usage_found = 0) THEN
        DELETE FROM notes WHERE id = OLD.note_id;
    END IF;
    RETURN NULL; -- result is ignored since this is an AFTER trigger
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER knife_notes_deleted_trigger
AFTER DELETE ON knife_notes
FOR EACH ROW
EXECUTE PROCEDURE try_to_delete_note();
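With this in place, deleting a knife cascades to knife_notes, the trigger fires for each removed link row, and a note is deleted only when no other link still uses it. Using the sample data above:

DELETE FROM knifes WHERE id = 12;  -- note 103 survives: knife 13 still links to it
DELETE FROM knifes WHERE id = 13;  -- now note 103 is deleted as well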
Also see this page in postgresql docs with trigger examples.

incorrect data update on Sybase trigger execution

I have a table test_123 with the columns:
int_1 (int),
datetime_1 (datetime),
tinyint_1 (tinyint),
datetime_2 (datetime)
So when column datetime_1 is updated and the value of column tinyint_1 in that row is 1, I have to update column datetime_2 with the value of datetime_1.
I have created the trigger below for this, but with my trigger all datetime_2 column values are updated with datetime_1 when tinyint_1 = 1, whereas I just want to update the particular row whose datetime_1 value has actually changed.
Below is the trigger..
CREATE TRIGGER test_trigger_upd
ON test_123
FOR UPDATE
AS
FOR EACH STATEMENT
IF UPDATE(datetime_1)
BEGIN
UPDATE test_123
SET test_123.datetime_2 = inserted.datetime_1
WHERE test_123.tinyint_1 = 1
END
ROW-level triggers are not supported in ASE. There are only after-statement triggers.
As commented earlier, the problem you're facing is that you need to be able to link the rows in the 'inserted' pseudo-table to the base table itself. You can only do that if there is a key -- meaning: a column that uniquely identifies a row, or a combination of columns that does so. Without that, you simply cannot identify the row that needs to be updated, since there may be multiple rows with identical column values if uniqueness is not guaranteed.
(and on a side note: not having a key in a table is bad design practice -- and this problem is one of the many reasons why).
A simple solution is to add an identity column to the table, e.g.
ALTER TABLE test_123 ADD idcol INT IDENTITY NOT NULL
You can then add a predicate 'test_123.idcol = inserted.idcol' to the trigger join.
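A sketch of the corrected trigger with that predicate in the join (assuming the idcol column added above):

CREATE TRIGGER test_trigger_upd
ON test_123
FOR UPDATE
AS
IF UPDATE(datetime_1)
BEGIN
    UPDATE test_123
    SET test_123.datetime_2 = inserted.datetime_1
    FROM test_123, inserted
    WHERE test_123.idcol = inserted.idcol
      AND test_123.tinyint_1 = 1
END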

What is the difference between these two T-SQL statements?

In a SSIS package at work there are some SQL tasks that create staging tables for holding import data. All the statements take this form:
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'dbo.tbNewTable') AND type IN (N'U'))
BEGIN
    TRUNCATE TABLE dbo.tbNewTable
END
ELSE
BEGIN
    CREATE TABLE dbo.tbNewTable (
        ColumnA VARCHAR(10) NULL,
        ColumnB VARCHAR(10) NULL,
        ColumnC INT NULL
    ) ON [PRIMARY]
END
In Itzik Ben-Gan's T-SQL Fundamentals I see a different form of statement for creating a table:
IF OBJECT_ID('dbo.tbNewTable', 'U') IS NOT NULL
BEGIN
    DROP TABLE dbo.tbNewTable
END
CREATE TABLE dbo.tbNewTable (
    ColumnA VARCHAR(10) NULL,
    ColumnB VARCHAR(10) NULL,
    ColumnC INT NULL
) ON [PRIMARY]
Each of these appears to do the same thing. After execution, there will be an empty table called tbNewTable in the dbo schema.
Are there any practical or theoretical differences between the two? What implications might they have?
The first one assumes that if the table exists, it has the same columns as those it would create. The second one does not make that assumption. So if a table with that name happened to exist and had a different set of columns, the two would have very different results.
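A hypothetical illustration: suppose dbo.tbNewTable already exists with an extra legacy column.

-- pre-existing table with one more column than the script expects:
CREATE TABLE dbo.tbNewTable (
    ColumnA VARCHAR(10) NULL,
    ColumnB VARCHAR(10) NULL,
    ColumnC INT NULL,
    ColumnD DATETIME NULL  -- legacy column
)
-- The TRUNCATE form leaves ColumnD in place; the DROP/CREATE form
-- replaces the table, and ColumnD is gone afterwards.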
The first will not actually DROP the table -- it merely TRUNCATEs all the data in said table. Hence the CREATE is guarded.
Thus the form with the DROP will allow the subsequent CREATE to change the schema (when the new table is created) even if tbNewTable previously existed.
Because the DROP/CREATE alters the database schema, it may also not be allowed in all cases. For instance, a view created WITH SCHEMABINDING will prevent the table from being dropped. (The same holds true for more general FK relationships, should any exist.)
...when SCHEMABINDING is specified, the base table or tables cannot be modified in a way that would affect the view definition.
The TRUNCATE should be marginally faster in one of those constant "don't care" ways: there should be no performance consideration given to one over the other.
There are also permission differences. TRUNCATE only requires the ALTER permission.
The minimum permission required is ALTER on table_name. TRUNCATE TABLE permissions default to the table owner...
Happy coding.
These are very different.
The first does an equality check against the sys.objects system table, looking for a matching table name. If so, it truncates the table, removing all rows but maintaining the table structure itself - i.e. the actual table is never dropped.
In the second, the check that the table exists is done implicitly using the OBJECT_ID() function. If the table exists, it is dropped completely - rows and structure.
If you have primary and foreign key constraints on the table, you'll certainly have issues dropping it completely... and if other tables reference the table you are trying to 'truncate', you'll have issues there too, since TRUNCATE TABLE is not allowed on a table referenced by a FOREIGN KEY constraint.
I tend to dislike either construction in an SSIS package. I create the tables in a deployment script and I want the package to fail if one of the tables I use is missing later on because then something drastically wrong has happened and I want to investigate what before I try putting data anywhere.

PostgreSQL: dynamic row values (?)

Oh helloes!
I have two tables, first one (let's call it NameTable) is preset with a set of values (id, name) and the second one (ListTable) is empty but with same columns.
The question is: How can I insert into ListTable a value that comes from NameTable, so that if I change a name in NameTable, the values in ListTable are automagically updated as well?
Is there an INSERT for this, or do the tables have to be created in some special manner?
Tried browsing the manual but without success :(
The suggestion to use INSERT ... SELECT is the best method for moving data between tables in the same database.
However, there's another way to deal with the auto-update requirement.
It sounds like these are your criteria:
Table A is defined with columns (x,y)
(x,y) is unique
Table B is also defined with columns (x,y)
Table A is a superset of Table B
Table B is to be loaded with data from Table A and needs to remain in sync with UPDATEs on Table A.
This is a job for a FOREIGN KEY with the option ON UPDATE CASCADE:
ALTER TABLE B ADD FOREIGN KEY (x,y) REFERENCES A (x,y) ON UPDATE CASCADE;
Now, not only will Table B auto-update when Table A is updated, Table B is also protected against containing (x,y) pairs that do not exist in Table A. If you want records to auto-delete from Table B when they are deleted from Table A, add ON DELETE CASCADE.
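A short demonstration of that behavior, using hypothetical two-column tables A and B as in the criteria above:

CREATE TABLE A (x int, y text, UNIQUE (x, y));
CREATE TABLE B (x int, y text);
ALTER TABLE B ADD FOREIGN KEY (x, y) REFERENCES A (x, y)
    ON UPDATE CASCADE ON DELETE CASCADE;

INSERT INTO A VALUES (1, 'alpha');
INSERT INTO B VALUES (1, 'alpha');
UPDATE A SET y = 'beta' WHERE x = 1;  -- B's row follows along: (1, 'beta')
DELETE FROM A WHERE x = 1;            -- B's row is deleted too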
Hmmm... I'm a bit confused about exactly what you want to do or why, but here are a couple of pointers towards things you might want to take a look at: table inheritance, triggers and rules.
Table inheritance in PostgreSQL allows a table to share the data of another table. If you add a row to the base table, it won't show up in the inherited table, but if you add a row to the inherited table, it will show up in both tables, and updates in either place will be reflected in both.
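A quick illustration of that behavior (hypothetical table names):

CREATE TABLE base (id int, name text);
CREATE TABLE derived () INHERITS (base);

INSERT INTO derived VALUES (1, 'x');
SELECT * FROM base;     -- shows (1, 'x'): queries on the base include child rows
INSERT INTO base VALUES (2, 'y');
SELECT * FROM derived;  -- still only (1, 'x'): base rows don't appear in the child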
Triggers allow you to set up code that will be run when insert, update or delete operations happen on a table. This would allow you to add the behavior you describe manually.
Rules allow you to set up a rule that will replace a matching query with an alternative query when a specific condition is met.
If you describe your problem further as in why you want this behavior, it might be easier to suggest the right way to go about things :-)