InsertOrUpdate in Slick - when not using primary key - scala

We are using Slick in a Scala Project. There's a module where we need to do Upsert operation ( Insert/Update) . One of the ways we know is to simply use SQL statements and do it, but for now we would like to stick to using Slick to do it instead.
Since Slick's version we use supports InsertOrUpdate() operation, we want to use that. Now here's an issue : -
Our table has one primary key, and that is the index which is set to auto-increment. And the upsert operation which we want to do is on a transaction id which though is expected to be unique but is not marked unique or primary key in the database.
How to handle this situation? I tried manually implementing Upsert by first checking the table, and then insert/update but that is failing in production environment when many requests are coming in every second and we are having 2 inserts for about 1% of the results.

Related

Is it possible to access current column data on conflict

I want to get such behaviour on inserting data (conflict on id):
if there is no model with same id in db do INSERT
if there is entry with same id in db and that entry is newer (updated_at field) do NOT UPDATE
if there is entry with same id in db and that entry is older (updated_at field) do UPDATE
I'm using Ecto for that and want to work on constraints, however I cannot find an option to do so in documentation. Pseudo code of constraint could look like:
CHECK: NULL(current.updated_at) or incoming.updated_at > current.updated_at
Is such behaviour possible in Postgres?
PostgreSQL does not support CHECK constraints that reference table
data other than the new or updated row being checked. While a CHECK
constraint that violates this rule may appear to work in simple tests,
it cannot guarantee that the database will not reach a state in which
the constraint condition is false (due to subsequent changes of the
other row(s) involved). This would cause a database dump and reload to
fail. The reload could fail even when the complete database state is
consistent with the constraint, due to rows not being loaded in an
order that will satisfy the constraint. If possible, use UNIQUE,
EXCLUDE, or FOREIGN KEY constraints to express cross-row and
cross-table restrictions.
If what you desire is a one-time check against other rows at row
insertion, rather than a continuously-maintained consistency
guarantee, a custom trigger can be used to implement that. (This
approach avoids the dump/reload problem because pg_dump does not
reinstall triggers until after reloading data, so that the check will
not be enforced during a dump/reload.)
That should be simple using the WHERE clause of ON CONFLICT ... DO UPDATE:
INSERT INTO mytable (id, entry) VALUES (42, '2021-05-29 12:00:00')
ON CONFLICT (id)
DO UPDATE SET entry = EXCLUDED.entry
WHERE mytable.entry < EXCLUDED.entry;

How to set Ignore Duplicate Key in Postgresql while table creation itself

I am creating a table in Postgresql 9.5 where id is the primary key. While inserting rows in the table if anyone tries to insert duplicate id, i want it to get ignored instead of raising exception. Is there any way such that i can set this while table creation itself that duplicate entries get ignored.
There are many techniques to resolve duplicate insertion issue while writing insertion query i.e. using ON CONFLICT DO NOTHING, or using WHERE EXISTS clause etc. But i want to handle this at table creation end so that the person writing insertion query doesn't need to bother any.
Creating RULE is one of the possible solution. Are there other possible solutions? Maybe something like this:
`CREATE TABLE dbo.foo (bar int PRIMARY KEY WITH (FILLFACTOR=90, IGNORE_DUP_KEY = ON))`
Although exact this statement doesn't work on Postgresql 9.5 on my machine.
add a trigger before insert or rule on insert do instead - otherwise has to be handled by inserting query. both solutions will require more resources on each insert.
Alternative way to use function with arguments for insert, that will check for duplicates, so end users will use function instead of INSERT statement.
WHERE EXISTS sub-query is not atomic btw - so you can still have exception after check...
9.5 ON CONFLICT DO NOTHING is the best solution still

Can Postgres silently ignore column constraint conflicts?

I have a Postgres 9.6 table with certain columns that must be unique. If I try to insert a duplicate row, I want Postgres to simply ignore the insert and continue, instead of failing or aborting. If the insert is wrapped in a transaction, it shouldn't abort the transaction or affect other updates in the transaction.
I assume there's a way to create the table as described above, but I haven't figured it out yet.
Bonus points if you can show me how to do it in Rails.
This is possible with the ON CONFLICT clause for INSERT:
The optional ON CONFLICT clause specifies an alternative action to
raising a unique violation or exclusion constraint violation error.
For each individual row proposed for insertion, either the insertion
proceeds, or, if an arbiter constraint or index specified by
conflict_target is violated, the alternative conflict_action is taken.
ON CONFLICT DO NOTHING simply avoids inserting a row as its
alternative action.
This is a relatively new feature and only available since Postgres 9.5, but that isn't an issue for you.
This is not something you specific at table creation, you'll need to modify each insert. I don't know how this works with Rails, but I guess you'll have to manually write at least part of the queries to do this.
This feature is also often called UPSERT, which is probably a better term to search for if you want to look for an integrated way in Rails to do this.

Way to migrate a create table with sequence from postgres to DB2

I need to migrate a DDL from Postgres to DB2, but I need that it works the same as in Postgres. There is a table that generates values from a sequence, but the values can also be explicitly given.
Postgres
create sequence hist_id_seq;
create table benchmarksql.history (
hist_id integer not null default nextval('hist_id_seq') primary key,
h_c_id integer,
h_c_d_id integer,
h_c_w_id integer,
h_d_id integer,
h_w_id integer,
h_date timestamp,
h_amount decimal(6,2),
h_data varchar(24)
);
(Look at the sequence call in the hist_id column to define the value of the primary key)
The business logic inserts into the table by explicitly providing an ID, and in other cases, it leaves the database to choose the number.
If I change this in DB2 to a GENERATED ALWAYS it will throw errors because there are some provided values. On the other side, if I create the table with GENERATED BY DEFAULT, DB2 will throw an error when trying to insert with the same value (SQL0803N), because the "internal sequence" does not take into account the already inserted values, and it does not retry with a next value.
And, I do not want to restart the sequence each time a provided ID was inserted.
This is the problem in BenchmarkSQL when trying to port it to DB2: https://sourceforge.net/projects/benchmarksql/ (File sqlTableCreates)
How can I implement the same database logic in DB2 as it does in Postgres (and apparently in Oracle)?
You're operating under a misconception: that sources external to the db get to dictate its internal keys. Ideally/conceptually, autogenerated ids will never need to be seen outside of the db, as conceptually there should be unique natural keys for export or reporting. Still, there are times when applications will need to manage some ids, often when setting up related entities (eg, JPA seems to want to work this way).
However, if you add an id value that you generated from a different source, the db won't be able to manage it. How could it? It's not efficient - for one thing, attempting to do so would do one of the following
Be unsafe in the face of multiple clients (attempt to add duplicate keys)
Serialize access to the table (for a potentially slow query, too)
(This usually shows up when people attempt something like: SELECT MAX(id) + 1, which would require locking the entire table for thread safety, likely including statements that don't even touch that column. If you try to find any "first-unused" id - trying to fill gaps - this gets more complicated and problematic)
Neither is ideal, so it's best to not have the problem in the first place. This is usually done by having id columns be autogenerated, but (as pointed out earlier) there are situations where we may need to know what the id will be before we insert the row into the table. Fortunately, there's a standard SQL object for this, SEQUENCE. This provides a db-managed, thread-safe, fast way to get ids. It appears that in PostgreSQL you can use sequences in the DEFAULT clause for a column, but DB2 doesn't allow it. If you don't want to specify an id every time (it should be autogenerated some of the time), you'll need another way; this is the perfect time to use a BEFORE INSERT trigger;
CREATE TRIGGER Add_Generated_Id NO CASCADE BEFORE INSERT ON benchmarksql.history
NEW AS Incoming_Entity
FOR EACH ROW
WHEN Incoming_Entity.id IS NULL
SET id = NEXTVAL FOR hist_id_seq
(something like this - not tested. You didn't specify where in the project this would belong)
So, if you then add a row with something like:
INSERT INTO benchmarksql.history (hist_id, h_data) VALUES(null, 'a')
or
INSERT INTO benchmarksql.history (h_data) VALUES('a')
an id will be generated and attached automatically. Note that ALL ids added to the table must come from the given sequence (as #mustaccio pointed out, this appears to be true even in PostgreSQL), or any UNIQUE CONSTRAINT on the column will start throwing duplicate-key errors. So any time your application needs an id before inserting a row in the table, you'll need some form of
SELECT NEXT VALUE FOR hist_id_seq
FROM sysibm.sysdummy1
... and that's it, pretty much. This is completely thread and concurrency safe, will not maintain/require long-term locks, nor require serialized access to the table.

EF db first and table without key

I am trying to use Entity Framework DB first to do quick prototyping of a reporting website for a huge db. The problem is one of the tables doesn't have a key. I got an 'Error 159: EntityType has no key defined'. If I add a key on the model designer, I got 'Error 3024: Must specify mapping for all key properties'. My question is whether there is a way to workaround this WITHOUT adding a key to the table. The table is not in our control.
Huge table which does not have a key? It would not be possible for you or for table owner to search for anything in this table without using full table scan. Also, it is basically impossible to use UPDATE by single row without having primary key.
You really have to either create synthetic key, or ask owner to do that. As a workaround, you might be able to find some existing column (or 2-3 columns) which is unique enough that it can be used as unique key. If it is unique but does not have actual index created, that would be still not good for performance - you should create such index.