How to apply sequence order in trigger? - triggers

In Informix, I need to update 2 (two) tables when the trigger is executed. Let's say Table_A and Table_B. In Table_A, there is a int8 (long data type) column as primary key. When a new record is inserted, this primary key column will retrieve the value from a sequence. This is the code:
sequence_A.nextVal
In Table_B, there is a foreign key column that references the primary key in Table_A. In order to make the primary key and the foreign key tally, I use sequence_A.currVal to insert the value into this foreign key column.
I did try the code below but Informix give me syntax error.
create trigger The_Trigger
insert on The_Table referencing new as n
for each row
(
insert into Table_A(...) value(sequence_A.nextVal, ...)
insert into Table_B(...) value(sequence_A.currVal, ...)
)
If I separate the insert statement into 2 (two) difference triggers, it works. Thus I was thinking to create 2 (two) triggers on The_Table. Let's say Trigger_A and Trigger_B, may I know how can I ensure that Trigger_A will get execute first then only Thrigger_B. Can I specify the order execution on triggers? Can this be done? And how?

In your first attempt, you omitted the comma between the two INSERT statements, and the keyword is VALUES, not VALUE:
CREATE TRIGGER The_Trigger
INSERT ON The_Table REFERENCING NEW AS n
FOR EACH ROW
(
INSERT INTO Table_A(...) VALUES(sequence_A.nextVal, ...),
INSERT INTO Table_B(...) VALUES(sequence_A.currVal, ...)
)
With those two changes, I believe you would have sequential execution.
Given a sufficiently recent version of Informix, you can have multiple triggers for a single event on a single table. The sequence of execution is the sequence in which the triggers are defined.

Related

How can a relational database with foreign key constraints ingest data that may be in the wrong order?

The database is ingesting data from a stream, and all the rows needed to satisfy a foreign key constraint may be late or never arrive.
This can likely be accomplished by using another datastore, one without foreign key constraints, and then when all the needed data is available, read into the database which has fk constraints. However, this adds complexity and I'd like to avoid it.
We're working on a solution that creates "placeholder" rows to point the foreign key to. When the real data comes in, the placeholder is replaced with real values. Again, this adds complexity, but it's the best solution we've found so far.
How do people typically solve this problem?
Edit: Some sample data which might help explain the problem:
Let's say we have these tables:
CREATE TABLE order (
id INTEGER NOT NULL,
order_number,
PRIMARY KEY (id),
UNIQUE (order_number)
);
CREATE TABLE line_item (
id INTEGER NOT NULL,
order_number INTEGER REFERENCES order(order_number),
PRIMARY KEY (id)
);
If I insert an order first, not a problem! But let's say I try:
INSERT INTO line_item (order_number) values (123) before order 123 was inserted. This will fail the fk constraint of course. But this might be the order I get the data, since it's reading from a stream that is collecting this data from multiple sources.
Also, to address #philpxy's question, I didn't really find much on this. One thing that was mentioned was deferred constraints. This is a mechanism that waits to do the fk constraints at the end of a transaction. I don't think it's possible to do that in my case however, since these insert statements will be run at random times whenever the data is received.
You have a business workflow problem, because line items of individual orders are coming in before the orders themselves have come in. One workaround, perhaps not ideal, would be to create a before insert trigger which checks, for every incoming insert to the line_item table, whether that order already exists in the order table. If not, then it will first insert the order record before trying the insert on line_item.
CREATE OR REPLACE FUNCTION "public"."fn_insert_order" () RETURNS trigger AS $$
BEGIN
INSERT INTO "order" (order_number)
SELECT NEW.order_number
WHERE NOT EXISTS (SELECT 1 FROM "order" WHERE order_number = NEW.order_number);
RETURN NEW;
END
$$
LANGUAGE 'plpgsql'
# trigger
CREATE TRIGGER "trigger_insert_order"
BEFORE INSERT ON line_item FOR EACH ROW
EXECUTE PROCEDURE fn_insert_order()
Note: I am assuming that the id column of the order table in fact is auto increment, in which case Postgres would automatically assign a value to it when inserting as above. Most likely, this is what you want, as having two id columns which both need to be manually assigned does not make much sense.
You could accomplish that with a BEFORE INSERT trigger on line_item.
In that trigger you query order if a matching item exists, and if not, you insert a dummy row.
That will allow the INSERT to succeed, at the cost of some performance.
To insert rows into order, use
INSERT INTO order ...
ON CONFLICT ON (order_number) DO UPDATE SET
id = EXCLUDED.id;
Updating a primary key is problematic and may lead to conflicts. One way you could get around that is if you use negative ids for artificially generated orders (assuming that the real ids are positive). If you have any references to that primary key, you'd have to define the constraint with ON UPDATE CASCADE.

Execute Function Once per Unique Value

I have two tables that contain data related to everyday business:
CREATE TABLE main_table (
main_id serial,
cola text,
colb text,
colc text,
CONSTRAINT main_table_pkey PRIMARY KEY (main_id)
);
CREATE TABLE second_table (
second_id serial,
main_id integer,
cold text,
CONSTRAINT second_table_pkey PRIMARY KEY (second_id),
CONSTRAINT second_table_fkey FOREIGN KEY (main_id)
REFERENCES main_table (main_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
);
We have a need to know when some data was updated in these tables so that exports can be generated and pushed to third parties. I've created a third table to hold the update information:
CREATE TYPE field AS ENUM ('cola', 'colb', 'colc', 'cold');
CREATE TABLE table_updates (
main_id int,
field field
updated_on date NOT NULL DEFAULT NOW(),
CONSTRAINT table_updates_fkey FOREIGN KEY (main_id)
REFERENCES main_table (main_id) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION
);
main_table has a trigger to update table_updates before UPDATE queries, which satisfies the need to track three of the four column updates.
I can easily add the same type of trigger to second_table, however because main_id is not unique the function can be executed several times for a single main_id value, which is not desirable.
How can I create a function that, when updating several rows in second_table, executes only once per main_id?
How can I create a function that, when updating several rows in second_table, executes only once per main_id?
If your inserts are batched insert by main_id ie, INSERT INTO tbl (main_id...) VALUES (main_id ...),(main_id ...),(main_id ...) you can use the rule system to trigger once for the INSERT or UPDATE
For the things that can be implemented by both, which is best depends on the usage of the database. A trigger is fired once for each affected row. A rule modifies the query or generates an additional query. So if many rows are affected in one statement, a rule issuing one extra command is likely to be faster than a trigger that is called for every single row and must re-determine what to do many times. However, the trigger approach is conceptually far simpler than the rule approach, and is easier for novices to get right.
Shy of that, you may also want to look into the normal LISTEN, and NOTIFY. Which give you the ability to use Async actions. If that's your thing and you decide to keep the trigger method consider Trigger Change Notification module, via tcn.
My suggestion is to do this in the app (outside of the DB) if at all possible. Remember in PostgreSQL temp tables are local to the session. So you can have each loader-session do something like this,
BEGIN
CREATE TEMP TABLE UNLOGGED etl_inventory;
COPY foo FROM stdin;
-- Are they different, if so `NOTIFY`
-- UPSERT
COMMIT;
And then one have one daemon that does exportation add to exportation queue when it receives the NOTIFY event.
While Evan's answer is correct, I think this question could benefit from an example.
This is the rule definition I used with the example tables in the question:
CREATE OR REPLACE RULE update_update_table
AS ON UPDATE TO second_table
DO ALSO (
INSERT INTO table_updates (
main_id, field
)
SELECT DISTINCT OLD.main_id, 'cold'::field
WHERE NOT EXISTS (
SELECT TRUE
FROM table_updates
WHERE main_id = OLD.main_id
AND field = 'cold'
);
UPDATE table_updates
SET updated_on = NOW()
WHERE main_id = OLD.main_id
AND field = 'cold'
)

PostgreSQL self referential table - how to store parent ID in script?

I've the following table:
DROP SEQUENCE IF EXISTS CATEGORY_SEQ CASCADE;
CREATE SEQUENCE CATEGORY_SEQ START 1;
DROP TABLE IF EXISTS CATEGORY CASCADE;
CREATE TABLE CATEGORY (
ID BIGINT NOT NULL DEFAULT nextval('CATEGORY_SEQ'),
NAME CHARACTER VARYING(255) NOT NULL,
PARENT_ID BIGINT
);
ALTER TABLE CATEGORY
ADD CONSTRAINT CATEGORY_PK PRIMARY KEY (ID);
ALTER TABLE CATEGORY
ADD CONSTRAINT CATEGORY_SELF_FK FOREIGN KEY (PARENT_ID) REFERENCES CATEGORY (ID);
Now I need to insert the data. So I start with parent:
INSERT INTO CATEGORY (NAME) VALUES ('PARENT_1');
And now I need the ID of the just inserted parent to add children to it:
INSERT INTO CATEGORY (NAME, PARENT_ID) VALUES ('CHILDREN_1_1', <what_goes_here>);
INSERT INTO CATEGORY (NAME, PARENT_ID) VALUES ('CHILDREN_1_2', <what_goes_here>);
How can I get and store the ID of the parent to later use it in the subsequent inserts?
You can use a data modifying CTE with the returning clause:
with parent_cat (parent_id) as (
INSERT INTO CATEGORY (NAME) VALUES ('PARENT_1')
returning id
)
INSERT INTO CATEGORY (NAME, PARENT_ID)
VALUES
('CHILDREN_1_1', (select parent_id from parent_cat) ),
('CHILDREN_1_2', (select parent_id from parent_cat) );
The answer is to use RETURNING along with WITH
WITH inserted AS (
INSERT INTO CATEGORY (NAME) VALUES ('PARENT_1')
RETURNING id
) INSERT INTO CATEGORY (NAME, PARENT_ID) VALUES
('CHILD_1_1', (SELECT inserted.id FROM inserted)),
('CHILD_2_1', (SELECT inserted.id FROM inserted));
( tl;dr : goto option 3: INSERT with RETURNING )
Recall that in postgresql there is no "id" concept for tables, just sequences (which are typically but not necessarily used as default values for surrogate primary keys, with the SERIAL pseudo-type).
If you are interested in getting the id of a newly inserted row, there are several ways:
Option 1: CURRVAL(<sequence name>);.
For example:
INSERT INTO persons (lastname,firstname) VALUES ('Smith', 'John');
SELECT currval('persons_id_seq');
The name of the sequence must be known, it's really arbitrary; in this example we assume that the table persons has an id column created with the SERIAL pseudo-type. To avoid relying on this and to feel more clean, you can use instead pg_get_serial_sequence:
INSERT INTO persons (lastname,firstname) VALUES ('Smith', 'John');
SELECT currval(pg_get_serial_sequence('persons','id'));
Caveat: currval() only works after an INSERT (which has executed nextval() ), in the same session.
Option 2: LASTVAL();
This is similar to the previous, only that you don't need to specify the sequence number: it looks for the most recent modified sequence (always inside your session, same caveat as above).
Both CURRVAL and LASTVAL are totally concurrent safe. The behaviour of sequence in PG is designed so that different session will not interfere, so there is no risk of race conditions (if another session inserts another row between my INSERT and my SELECT, I still get my correct value).
However they do have a subtle potential problem. If the database has some TRIGGER (or RULE) that, on insertion into persons table, makes some extra insertions in other tables... then LASTVAL will probably give us the wrong value. The problem can even happen with CURRVAL, if the extra insertions are done intto the same persons table (this is much less usual, but the risk still exists).
Option 3: INSERT with RETURNING
INSERT INTO persons (lastname,firstname) VALUES ('Smith', 'John') RETURNING id;
This is the most clean, efficient and safe way to get the id. It doesn't have any of the risks of the previous.
Drawbacks? Almost none: you might need to modify the way you call your INSERT statement (in the worst case, perhaps your API or DB layer does not expect an INSERT to return a value); it's not standard SQL (who cares); it's available since Postgresql 8.2 (Dec 2006...)
Conclusion: If you can, go for option 3. Elsewhere, prefer 1.
Note: all these methods are useless if you intend to get the last globally inserted id (not necessarily in your session). For this, you must resort to select max(id) from table (of course, this will not read uncommitted inserts from other transactions).

T-SQL create table with primary keys

Hello I wan to create a new table based on another one and create primary keys as well.
Currently this is how I'm doing it. Table B has no primary keys defined. But I would like to create them in table A. Is there a way using this select top 0 statement to do that? Or do I need to do an ALTER TABLE after I created tableA?
Thanks
select TOP 0 *
INTO [tableA]
FROM [tableB]
SELECT INTO does not support copying any of the indexes, constraints, triggers or even computed columns and other table properties, aside from the IDENTITY property (as long as you don't apply an expression to the IDENTITY column.
So, you will have to add the constraints after the table has been created and populated.
The short answer is NO. SELECT INTO will always create a HEAP table and, according to Books Online:
Indexes, constraints, and triggers defined in the source table are not
transferred to the new table, nor can they be specified in the
SELECT...INTO statement. If these objects are required, you must
create them after executing the SELECT...INTO statement.
So, after executing SELECT INTO you need to execute an ALTER TABLE or CREATE UNIQUE INDEX in order to add a primary key.
Also, if dbo.TableB does not already have an IDENTITY column (or if it does and you want to leave it out for some reason), and you need to create an artificial primary key column (rather than use an existing column in dbo.TableB to serve as the new primary key), you could use the IDENTITY function to create a candidate key column. But you still have to add the constraint to TableA after the fact to make it a primary key, since just the IDENTITY function/property alone does not make it so.
-- This statement will create a HEAP table
SELECT Col1, Col2, IDENTITY(INT,1,1) Col3
INTO dbo.MyTable
FROM dbo.AnotherTable;
-- This statement will create a clustered PK
ALTER TABLE dbo.MyTable
ADD CONSTRAINT PK_MyTable_Col3 PRIMARY KEY (Col3);

postgres autoincrement not updated on explicit id inserts

I have the following table in postgres:
CREATE TABLE "test" (
"id" serial NOT NULL PRIMARY KEY,
"value" text
)
I am doing following insertions:
insert into test (id, value) values (1, 'alpha')
insert into test (id, value) values (2, 'beta')
insert into test (value) values ('gamma')
In the first 2 inserts I am explicitly mentioning the id. However the table's auto increment pointer is not updated in this case. Hence in the 3rd insert I get the error:
ERROR: duplicate key value violates unique constraint "test_pkey"
DETAIL: Key (id)=(1) already exists.
I never faced this problem in Mysql in both MyISAM and INNODB engines. Explicit or not, mysql always update autoincrement pointer based on the max row id.
What is the workaround for this problem in postgres? I need it because I want a tighter control for some ids in my table.
UPDATE:
I need it because for some values I need to have a fixed id. For other new entries I dont mind creating new ones.
I think it may be possible by manually incrementing the nextval pointer to max(id) + 1 whenever I am explicitly inserting the ids. But I am not sure how to do that.
That's how it's supposed to work - next_val('test_id_seq') is only called when the system needs a value for this column and you have not provided one. If you provide value no such call is performed and consequently the sequence is not "updated".
You could work around this by manually setting the value of the sequence after your last insert with explicitly provided values:
SELECT setval('test_id_seq', (SELECT MAX(id) from "test"));
The name of the sequence is autogenerated and is always tablename_columnname_seq.
In the recent version of Django, this topic is discussed in the documentation:
Django uses PostgreSQL’s SERIAL data type to store auto-incrementing
primary keys. A SERIAL column is populated with values from a sequence
that keeps track of the next available value. Manually assigning a
value to an auto-incrementing field doesn’t update the field’s
sequence, which might later cause a conflict.
Ref: https://docs.djangoproject.com/en/dev/ref/databases/#manually-specified-autoincrement-pk
There is also management command manage.py sqlsequencereset app_label ... that is able to generate SQL statements for resetting sequences for the given app name(s)
Ref: https://docs.djangoproject.com/en/dev/ref/django-admin/#django-admin-sqlsequencereset
For example these SQL statements were generated by manage.py sqlsequencereset my_app_in_my_project:
BEGIN;
SELECT setval(pg_get_serial_sequence('"my_project_aaa"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "my_project_aaa";
SELECT setval(pg_get_serial_sequence('"my_project_bbb"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "my_project_bbb";
SELECT setval(pg_get_serial_sequence('"my_project_ccc"','id'), coalesce(max("id"), 1), max("id") IS NOT null) FROM "my_project_ccc";
COMMIT;
It can be done automatically using a trigger. This way you are sure that the largest value is always used as the next default value.
CREATE OR REPLACE FUNCTION set_serial_id_seq()
RETURNS trigger AS
$BODY$
BEGIN
EXECUTE (FORMAT('SELECT setval(''%s_%s_seq'', (SELECT MAX(%s) from %s));',
TG_TABLE_NAME,
TG_ARGV[0],
TG_ARGV[0],
TG_TABLE_NAME));
RETURN OLD;
END;
$BODY$
LANGUAGE plpgsql;
CREATE TRIGGER set_mytable_id_seq
AFTER INSERT OR UPDATE OR DELETE
ON mytable
FOR EACH STATEMENT
EXECUTE PROCEDURE set_serial_id_seq('mytable_id');
The function can be reused for multiple tables. Change "mytable" to the table of interest.
For more info regarding triggers:
https://www.postgresql.org/docs/9.1/plpgsql-trigger.html
https://www.postgresql.org/docs/9.1/sql-createtrigger.html