PostgreSQL - Duplicate Unique Key - postgresql

On my table I have a secondary unique key labeled md5. Before inserting, I check to see if the MD5 exists, and if not, insert it, as shown below:
```sql
-- Attempt to find this item
SELECT INTO oResults (SELECT domain_id FROM db.domains WHERE "md5" = oMD5);
IF (oResults IS NULL) THEN
    -- Not found, so insert this domain
    INSERT INTO db.domains ("md5", "domain", "inserted")
    VALUES (oMD5, oDomain, now());
    RETURN currval('db.domains_seq');
END IF;
```
This works great for single-threaded inserts; my problem is when two external applications call my function concurrently and happen to have the same MD5. I end up with a situation where:
App 1: sees the MD5 does not exist.
App 2: inserts this MD5 into the table.
App 1: now goes to insert the MD5, thinking it doesn't exist, but gets an error, because right after it saw the value was absent, App 2 inserted it.
Is there a more effective way of doing this?
Can I catch the error on insert and if so, then select the domain_id?
Thanks in advance!
This also seems to be covered at Insert, on duplicate update in PostgreSQL?

You could just go ahead and try to insert the MD5 and catch the error. If you get a "unique constraint violation" error, ignore it and keep going; if you get some other error, bail out. That way you push the duplicate checking right down to the database and your race condition goes away.
Something like this:
Attempt to insert the MD5 value.
If you get a unique violation error, then ignore it and continue on.
If you get some other error, bail out and complain.
If you don't get an error, then continue on.
Do your SELECT INTO oResults (SELECT domain_id FROM db.domains WHERE "md5"=oMD5) to extract the domain_id.
There might be a bit of a performance hit but "correct and a little slow" is better than "fast but broken".
Eventually you might end up with more exceptions than successful inserts. In that case you could instead try the insert on the table that references (through a foreign key) your db.domains and trap the FK violation there. If you get an FK violation, do the old "insert and ignore unique violations" on db.domains and then retry the insert that gave you the FK violation. This is the same basic idea; it's just a matter of choosing which approach will probably throw the fewest exceptions and going with that.
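In PL/pgSQL the steps above can be sketched with an exception block inside the function (a sketch reusing the variable and table names from the question):

```sql
-- Insert first; only fall back to the SELECT when the row already exists.
BEGIN
    INSERT INTO db.domains ("md5", "domain", "inserted")
    VALUES (oMD5, oDomain, now());
    RETURN currval('db.domains_seq');
EXCEPTION WHEN unique_violation THEN
    -- Another session inserted the same MD5 first; fetch its id instead.
    SELECT INTO oResults domain_id FROM db.domains WHERE "md5" = oMD5;
    RETURN oResults;
END;
```

Note that the exception block creates a subtransaction, which is where the small performance hit mentioned above comes from.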

Related

Know which index caused an error during a bulk INSERT or bulk UPDATE in PostgreSQL

When I INSERT or UPDATE a list of rows in PostgreSQL and one of them is causing an error, how can I know which one exactly (its index in the input list) ?
For example, if I have a UNIQUE constraint on the name column, and if name two already exists, I want to know that the constraint violation is caused by the input row at index 1.
INSERT INTO table (id, name) VALUES ('0000', 'one'), ('0001', 'two');
I know PostgSQL will stop on the first error encountered, and therefore that we can't know all of the problematic rows. That's fine, I just need the first problematic index (if any).
Inserting each row separately is not a possibility since we want to optimize for performance as well.
Postgres gives you exactly what you are asking for. It provides the constraint name, the column(s), and the value(s). However, much of this arrives as subsequent detail fields of the error, not in the primary message, so you need to extract the complete message. See demo.
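For example, the detail fields can be read server-side with GET STACKED DIAGNOSTICS (PostgreSQL 9.2+). A sketch, using a hypothetical table t with a unique name column; the detail text names the conflicting key value, which you can then map back to its position in your input list:

```sql
DO $$
DECLARE
    v_constraint text;
    v_detail text;
BEGIN
    INSERT INTO t (id, name) VALUES ('0000', 'one'), ('0001', 'two');
EXCEPTION WHEN unique_violation THEN
    -- CONSTRAINT_NAME and PG_EXCEPTION_DETAIL carry the parts that are
    -- not in the primary message; the detail reads like
    -- Key (name)=(two) already exists.
    GET STACKED DIAGNOSTICS
        v_constraint = CONSTRAINT_NAME,
        v_detail = PG_EXCEPTION_DETAIL;
    RAISE NOTICE 'violated %: %', v_constraint, v_detail;
END $$;
```

Client libraries expose the same fields (e.g. via the error's diagnostic fields), so the extraction can also happen application-side.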

How to set Ignore Duplicate Key in Postgresql while table creation itself

I am creating a table in PostgreSQL 9.5 where id is the primary key. While inserting rows into the table, if anyone tries to insert a duplicate id, I want it to be ignored instead of raising an exception. Is there any way to set this at table creation itself so that duplicate entries get ignored?
There are many techniques to resolve the duplicate-insertion issue when writing the insertion query itself, e.g. using ON CONFLICT DO NOTHING or a WHERE EXISTS clause. But I want to handle this at the table-creation end so that the person writing the insertion query doesn't need to bother at all.
Creating a RULE is one possible solution. Are there other possible solutions? Maybe something like this:
`CREATE TABLE dbo.foo (bar int PRIMARY KEY WITH (FILLFACTOR=90, IGNORE_DUP_KEY = ON))`
Although this exact statement doesn't work on PostgreSQL 9.5 on my machine.
Add a trigger BEFORE INSERT, or a rule ON INSERT ... DO INSTEAD; otherwise it has to be handled by the inserting query. Both solutions will require more resources on each insert.
An alternative is to use a function with arguments for the insert that checks for duplicates, so end users call the function instead of a plain INSERT statement.
A WHERE EXISTS sub-query is not atomic, by the way, so you can still get an exception after the check.
On 9.5, ON CONFLICT DO NOTHING is still the best solution.
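The two options above might look like this (a sketch using a hypothetical table mytable; note the rule is still subject to the same race condition as any check-then-insert):

```sql
-- Per-statement approach (PostgreSQL 9.5+): the query writer opts in.
INSERT INTO mytable (id, val) VALUES (1, 'a')
ON CONFLICT (id) DO NOTHING;

-- Table-level rule, so a plain INSERT silently skips duplicates.
-- Rules have caveats (RETURNING, performance), so prefer ON CONFLICT
-- where the inserting query can be changed.
CREATE RULE ignore_dup_id AS
    ON INSERT TO mytable
    WHERE EXISTS (SELECT 1 FROM mytable WHERE id = NEW.id)
    DO INSTEAD NOTHING;
```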

Inserting row based on condition

I have a postgres table that is used to hold users' files. Two users can have a file with the same name, but a single user isn't allowed to have two files with the same name. Currently, if a user tries to upload a file with a name they already used, the database will spit out the error below, as it should.
IntegrityError: duplicate key value violates unique constraint "file_user_id_title_key"
What I would like to do is first query the database with the file name and user ID to see if the file name is being used by the user. If the name is already being used, return an error, otherwise write the row.
```python
cur.execute('INSERT INTO files (user_id, title, share) '
            'VALUES (%s, %s, %s) RETURNING id;',
            (user.id, file.title, file.share))
```
The problem is that you cannot really do that without opening a race condition:
There is nothing to keep somebody else from inserting a conflicting row between the time you query the table and when you try to insert the row, so the error could still happen (unless you go to extreme measures like locking the table before you do that, which would affect concurrency badly).
Moreover, your proposed technique incurs extra load on the database by adding a superfluous second query.
You are right that you should not confront the user with a database error message, but the correct way to handle this is as follows:
You INSERT the new row like you showed.
You check if you get an error.
If the SQLSTATE of the error is the SQL standard value 23505 (unique_violation), you know that there is already such a file for the user and show the appropriate error to the user.
So you can consider the INSERT statement as an atomic operation: check if there is already a matching entry, and if not, add the row.
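On PostgreSQL 9.5 or later, the same check-and-insert can also be done atomically in one statement, with no error to catch. A sketch using the table and constraint name from the question (the literal values are placeholders); RETURNING yields no row when the insert was skipped, which tells you the name was already taken:

```sql
INSERT INTO files (user_id, title, share)
VALUES (42, 'report.pdf', false)
ON CONFLICT ON CONSTRAINT file_user_id_title_key DO NOTHING
RETURNING id;
```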

Postgresql - is there a way of knowing an INSERT error code to determine if I should UPDATE because of a duplicate unique restriction?

I have a composite UNIQUE set of columns in my table. Therefore if I insert into the table where the unique key is violated, PostgreSQL returns an error and my PHP script can read this error.
When inserting, instead of doing this:
```sql
SELECT id FROM table WHERE col1 = 'x' AND col2 = 'y';
-- (if no rows)
INSERT INTO table...
-- (else if rows are found)
UPDATE table SET...
```
I prefer to use:
```sql
INSERT INTO table...
-- (if an error occurred, then attempt to UPDATE)
UPDATE table SET...
```
The kind of error returned from the above would be: ERROR: duplicate key value violates unique constraint "xxxxxxxx_key"
However, there is no point doing an UPDATE if the INSERT failed for some other reason, such as invalid data. Is there a way of "knowing" (from PHP/Postgres) if the error actually failed from this duplicate key issue, rather than invalid data? I'm just curious. Performing an UPDATE also would return an error anyway if the data were invalid, but what would you say is best practice?
Many thanks!
Just check the error message to see what kind of error you have. pg_result_error_field() gives you all of it: check the PGSQL_DIAG_SQLSTATE field and see the PostgreSQL manual for the details.
You might want to look into this example in the official documentation.
You're free to add more WHEN EXCEPTION ... THEN handlers, list of available errors can also be found in the documentation.
Note that in the example above the function re-raises any other error; only unique_violation is treated specially.
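The example in the documentation follows this shape (a sketch modeled on the manual's merge_db function, using a hypothetical table mytable; the loop handles the case where a concurrent insert or delete races with us):

```sql
CREATE FUNCTION upsert_row(k integer, v text) RETURNS void AS $$
BEGIN
    LOOP
        -- First try to update an existing row.
        UPDATE mytable SET val = v WHERE key = k;
        IF found THEN
            RETURN;
        END IF;
        -- No row: try to insert. If someone else inserted the same key
        -- concurrently, trap the unique_violation and retry the UPDATE.
        BEGIN
            INSERT INTO mytable (key, val) VALUES (k, v);
            RETURN;
        EXCEPTION WHEN unique_violation THEN
            -- loop around and try the UPDATE again
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
```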

Check referential integrity in stored procedure

I have a customer table and an order table in a SQL Server 2000 database.
I don't want an order to be in the order table with a customerID that doesn't exist in the customer table so I have put a foreign key constraint on customerID.
This all works fine but when writing a stored procedure that could possibly violate the constraint, is there a way to check whether the constraint will be violated and, if it will be, skip/rollback the query?
At the minute, all that happens is the stored procedure returns an error that is displayed on my ASP page; it looks rather ugly, and most users won't understand it.
I would like a more elegant way of handling the error if possible.
Thanks
You have two options:
Add error handling to catch the ugly error, inspect it to see if it's an FK constraint violation, and display this to the user. This is IMHO the better solution.
Add code in the stored procedure like the following:
```sql
IF EXISTS (SELECT NULL FROM customer WHERE customerID = @customerID)
BEGIN
    -- The customer exists, so insert the order
END
ELSE
BEGIN
    -- Do something to tell your code to display an error message
END
```
With the second option you will want to watch your transactional consistency. For example, what happens if the customer is deleted after your check is made?
You can inspect the data before attempting the operation, or you can attempt the operation and then check the errors after each statement, then ROLLBACK etc.
But you can handle it entirely within stored procedures and return appropriately to the caller according to your design.
Have a look at this article: http://www.sommarskog.se/error-handling-II.html
In SQL Server 2005, there is the possibility of using TRY/CATCH.
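On 2005 or later that would look something like this (a sketch only; the table and column names are hypothetical, and error number 547 is the engine's number for constraint conflicts, including foreign keys):

```sql
BEGIN TRY
    INSERT INTO orders (customerID, orderDate)
    VALUES (@customerID, GETDATE());
END TRY
BEGIN CATCH
    -- 547 = constraint conflict (FK/CHECK) in SQL Server
    IF ERROR_NUMBER() = 547
        RAISERROR('That customer does not exist.', 16, 1);
    ELSE
        RAISERROR('Unexpected error inserting the order.', 16, 1);
END CATCH
```

On SQL Server 2000 itself you are limited to checking @@ERROR after each statement, as described in the linked article.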