A lot of things might be covered here: job queue as SQL table with multiple consumers (PostgreSQL).
However, I just wanted to ask about my specific query.
Currently I have a job queue that should emit a new job for every consumer, but we found out that we sometimes got the same job twice on different consumers (probably a race condition).
This was our query (run inside a transaction):
UPDATE invoice_job SET status = 'working', date_time_start = now(),
node = $ip
WHERE id = (SELECT id FROM invoice_job WHERE status = 'created' ORDER BY id LIMIT 1)
RETURNING *
Currently the table is pretty simple: it has a status field (can be "created", "working", "done"), a date_time_start field, a created field (not used for the query), an id field, and a node field (where the job was run).
However, this emitted the same job twice at one point.
I have now changed the query to:
UPDATE invoice_job SET status = 'working', date_time_start = now(),
node = $ip
WHERE id = (SELECT id FROM invoice_job WHERE status = 'created' ORDER BY id LIMIT 1 FOR UPDATE SKIP LOCKED)
RETURNING *
Would that actually help and emit each job only once?
Your solution with FOR UPDATE SKIP LOCKED is fine. It ensures that a row is locked by exactly one session before it is updated for processing. No transaction can choose a row that is already locked by another transaction, and once the lock is released on commit, subsequent SELECTs will no longer match the row.
The original failed because the subquery's SELECT can choose the same row concurrently in multiple sessions, each of which then tries to UPDATE that row. There is no condition in the UPDATE's own WHERE clause that would make the second attempt fail; it's perfectly fine for two concurrent sessions to run UPDATE invoice_job SET status = 'working' WHERE node = 42 or whatever. The second update will happily run and commit as soon as the first transaction commits.
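To make that concrete, here is one possible interleaving under the default READ COMMITTED level (an illustrative sketch of your original query; the id 7 is made up):

-- Session A (its subquery picks the lowest 'created' row, say id 7):
UPDATE invoice_job SET status = 'working', date_time_start = now(), node = $ip
WHERE id = (SELECT id FROM invoice_job WHERE status = 'created' ORDER BY id LIMIT 1)
RETURNING *;          -- returns job 7 and holds the row lock until commit

-- Session B, started before A commits: its subquery also sees id 7 as 'created',
-- its UPDATE blocks on A's row lock, and after A commits it only re-checks
-- "id = 7", which still holds, so it updates row 7 again and also returns job 7.
UPDATE invoice_job SET status = 'working', date_time_start = now(), node = $ip
WHERE id = (SELECT id FROM invoice_job WHERE status = 'created' ORDER BY id LIMIT 1)
RETURNING *;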
You could also make it safe by repeating the status condition in the UPDATE's own WHERE clause:
UPDATE invoice_job SET status = 'working', date_time_start = now(),
node = $ip
WHERE id = (SELECT id FROM invoice_job WHERE status = 'created' ORDER BY id LIMIT 1)
AND status = 'created'
RETURNING *
... but this will often return zero rows under high concurrency.
In fact it will return zero rows for all but one of a set of concurrent executions, so it's no better than a serial queue worker. This is true of most of the other "clever" tricks people use to try to do concurrent queues, and one of the main reasons SKIP LOCKED was introduced.
The fact that you only noticed this problem now tells me that you would actually be fine with a simple, serial queue dispatch where you LOCK TABLE before picking the first row. But SKIP LOCKED will scale better if your workload grows.
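For comparison, such a serial dispatch might look roughly like this (a sketch using the same table; LOCK TABLE simply makes the pickers queue up behind each other):

BEGIN;
-- Only one dispatcher can hold this lock at a time; concurrent pickers wait here.
LOCK TABLE invoice_job IN EXCLUSIVE MODE;
UPDATE invoice_job SET status = 'working', date_time_start = now(), node = $ip
WHERE id = (SELECT id FROM invoice_job WHERE status = 'created' ORDER BY id LIMIT 1)
RETURNING *;
COMMIT;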
Related
I have about 10 queries that concurrently update a row, so I want to know what the difference is between
UPDATE account SET balance = balance + 1000
WHERE id = (SELECT id FROM account
            WHERE id = 1 FOR UPDATE);
and
BEGIN;
SELECT balance FROM account WHERE id = 1 FOR UPDATE;
-- compute $newval = $balance + 1000
UPDATE account SET balance = $newval WHERE id = 1;
COMMIT;
I am using PostgreSQL 11, so what is the right solution, and what will happen with multiple concurrent transactions in each of these two solutions?
Both versions will have exactly the same effect, and both prevent anomalies in the face of concurrency, because the row is locked before it is modified.
The first method is preferable, because there is only one client-server round trip, so the transaction is shorter and the lock is held for a shorter time, which improves concurrency.
The best way to do this and be safe from concurrent data modifications is:
UPDATE account
SET balance = balance + 1000
WHERE id = 1;
This does the same, because an UPDATE automatically puts an exclusive lock on the affected row, and a blocked query will see the updated version of the row when the lock is gone.
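To illustrate, a minimal two-session sketch of that blocking behavior (the comments describe one possible interleaving):

-- Session 1
BEGIN;
UPDATE account SET balance = balance + 1000 WHERE id = 1;  -- row 1 is now locked

-- Session 2, meanwhile
UPDATE account SET balance = balance + 1000 WHERE id = 1;  -- blocks on the row lock

-- Session 1
COMMIT;  -- session 2 resumes, sees the committed balance, and adds its own 1000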
I have a situation where I have multiple (potentially hundreds of) threads repeating the same task (using a Java scheduled executor, if you are curious). This task entails selecting rows of changes (from a table called change) that have not yet been processed (processed changes are tracked in an m:n join table called process_change_rel that records the process id, record id and status), processing them, then writing the status back.
My question is: what is the best way to prevent two threads of the same process from selecting the same row? Will the solution below (using FOR UPDATE to lock rows) work? If not, please suggest a working solution.
Create table change(
  -- id, autogenerated pk
  -- other fields
)
Create table change_process_rel(
  -- change id (pk of change table)
  -- process id (pk of process table)
  -- status
)
The query I would use is listed below:
Select * from
change c
where c.id not in(select changeid from change_process_rel with cs) for update
Please let me know if this would work
You have to "lock" a row which you are going to process somehow. Such "locking" should of course work concurrently, with a minimum of conflicts / errors.
One way is as follows:
Create table change
(
id int not null generated always as identity
, v varchar(10)
) in userspace1;
insert into change (v) values '1', '2', '3';
Create table change_process_rel
(
id int not null
, pid int not null
, status int not null
) in userspace1;
create unique index change_process_rel1 on change_process_rel(id);
Now you should be able to run the same statement from multiple concurrent sessions:
SELECT ID
FROM NEW TABLE
(
insert into change_process_rel (id, pid, status)
select c.id, mon_get_application_handle(), 1
from change c
where not exists (select 1 from change_process_rel r where r.id = c.id)
fetch first 1 row only
with ur
);
Each such statement inserts 1 or 0 rows into the change_process_rel table, which is used here as a "lock" table. The corresponding ID from change is returned, and you may proceed with processing the corresponding event in the same transaction.
If the transaction completes successfully, the row inserted into the change_process_rel table is kept, so the corresponding id from change can be considered processed. If the transaction fails, the corresponding "lock" row in change_process_rel disappears, and that row can be processed later by this or another application.
The problem with this method is that once both tables become large enough, the sub-select may not perform as well as it did before.
Another method is to use "Evaluate uncommitted data through lock deferral".
It requires placing the status column in the change table.
Unfortunately, Db2 for LUW doesn't have SKIP LOCKED functionality, which might help with this sort of algorithm.
If, let's say, status=0 means "not processed" and status<>0 is some processing / processed status, then after setting the DB2_EVALUNCOMMITTED and DB2_SKIP* registry variables and restarting the instance, you can "catch" the next ID for processing with the following statement.
SELECT ID
FROM NEW TABLE
(
update
(
select id, status
from change
where status=0
fetch first 1 row only
)
set status=1
);
Once you get it, you can do further processing of this ID in the same transaction, as before.
It's good to create an index for performance:
create index change1 on change(status);
and maybe set this table as volatile, or periodically collect distribution statistics on this column in addition to the regular statistics on the table and its indexes.
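For example (a sketch; MYSCHEMA is a placeholder schema name, and RUNSTATS is wrapped in ADMIN_CMD here so it can be issued as SQL):

-- Mark the table as volatile so the optimizer prefers index access:
ALTER TABLE change VOLATILE CARDINALITY;

-- Or collect distribution statistics (covering status) plus index statistics:
CALL SYSPROC.ADMIN_CMD('RUNSTATS ON TABLE MYSCHEMA.CHANGE WITH DISTRIBUTION AND INDEXES ALL');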
Note that such a registry variable setting has a global effect, which you should keep in mind...
I need to write a query to poll a database table, but only if I (the application process) am the leader. I plan on implementing leader election via database reservation (lock a table and update the leader record, if available, every so often). How can I combine the leader election query with the polling query such that the polling query is guaranteed not to run when executed by a process that is not the leader? This needs to be a DB-only solution (for a variety of reasons).
I'm thinking something like
SELECT *
FROM outbound_messages
WHERE status = 'READY'
AND 'JVM1' IN (SELECT jvm_name
FROM leader
WHERE leader_status = 'active')
Will this work?
This seems overly complicated.
In PostgreSQL you can do that easily with advisory locks. Choose a bigint lock number (I chose 42) and query like this:
WITH ok(ok) AS (
  SELECT pg_try_advisory_lock(42)
)
SELECT o.*
FROM outbound_messages o
CROSS JOIN ok
WHERE ok AND o.status = 'READY';
Only the first caller will obtain the lock and get a result.
You can release the lock by ending your session or calling
SELECT pg_advisory_unlock(42);
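If holding the lock only for the duration of the polling transaction is acceptable, a transaction-scoped variant is also possible (a sketch; pg_try_advisory_xact_lock releases the lock automatically at commit or rollback, so there is no unlock call, but leadership is then re-decided on every poll):

WITH ok(ok) AS (
  SELECT pg_try_advisory_xact_lock(42)
)
SELECT o.*
FROM outbound_messages o
CROSS JOIN ok
WHERE ok AND o.status = 'READY';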
There are 2 tables:
CREATE TABLE "job"
(
"id" SERIAL,
"processed" BOOLEAN NOT NULL,
PRIMARY KEY("id")
);
CREATE TABLE "job_result"
(
"id" SERIAL,
"job_id" INT NOT NULL,
PRIMARY KEY("id")
);
There are several consumers that do the following (sequentially):
1) start transaction
2) search for job not processed yet
3) process it
4) save result ( set processed field to true and insert into job_result )
5) commit
Questions:
1) Is the following SQL code correct, so that no job can be processed more than once?
2) If it is correct, can it be rewritten in a cleaner way? (I am confused about "UPDATE job SET id = id".)
UPDATE job
SET id = id
WHERE id =
(
SELECT MIN(id)
FROM job
WHERE processed = false AND pg_try_advisory_lock(id) = true
)
AND processed = false
RETURNING *
Thanks.
with job_update as (
    update job
    set processed = true
    where id = (
        select id
        from (
            select min(id) as id
            from job
            where processed = false
        ) s
        for update
    )
    returning id
)
insert into job_result (job_id)
select id
from job_update
Question 1
To answer your first question, the processing can be done twice if the database crashes between step 3 and step 5. When the server/service recovers, it will be processed again.
If the processing step only computes results which are sent to the database in the same connection as the queuing queries, then no one will be able to see that it was processed twice, as the results of the first time were never visible.
However if the processing step talks to the outside world, such as sending an email or charging a credit card, that action will be taken twice and both will be visible. The only way to avoid that is to use two-phase commits for all dealings with the outside world. Also, if the worker keeps two connections to the database and is not disciplined about their use, then that can also lead to visible double-processing.
Question 2
For your second question, there are several ways it can be made cleaner.
Most importantly, you will want to change the advisory lock from session-duration to transaction-duration. If you leave it at session-duration, long-lived workers will become slower and slower and will use more and more memory as time goes on. This is safe to do, because in the query as written you are checking the processed flag in both the sub-select and in the update itself.
You could make the table structure itself cleaner. You could have one table with both the processed flag and the results field, instead of two tables. Or, if you want two tables, you could remove the processed flag from the job table and signify completion simply by deleting the completed record from the table, rather than updating the processed flag.
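For instance, the delete-instead-of-flag variant could be sketched like this (assuming the two-table layout above with the processed column dropped, so a job row exists only while the job is pending; a losing concurrent worker simply gets no row back, similar to the zero-rows behavior discussed below):

WITH next_job AS (
    DELETE FROM job
    WHERE id = (SELECT min(id) FROM job)
    RETURNING id
)
INSERT INTO job_result (job_id)
SELECT id FROM next_job;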
Assuming you don't want to make such changes, you could still clean up the SQL without changing the table structure or semantics. You do need to lock the tuple to avoid a race condition with the release of the advisory lock. But rather than using the degenerate id=id construct (which some future maintainer is likely to remove, because it is not intuitively obvious why it is even there), you might as well just set the tuple to its final state by setting processed=true, and then removing that second update step from your step 4. This is safe to do because you do not issue an intermediate commit, so no one can see the tuple in this intermediate state of having processed=true but not yet really being processed.
UPDATE job
SET processed = true
WHERE id =
(
SELECT MIN(id)
FROM job
WHERE processed = false AND pg_try_advisory_xact_lock(id) = true
)
AND processed = false
RETURNING id
However, this query still has the unwanted feature that often someone looking for the next job to process will find no rows. That is because it suffered a race condition which was then filtered out by the outer processed=false condition. This is OK as long as your workers are prepared to retry, but it leads to needless contention in the database. This can be improved by making the inner select lock the tuple when it first encounters it by switching from a min(id) to a LIMIT 1 query:
UPDATE job
SET processed=true
WHERE id =
(
SELECT id
FROM job
WHERE processed = false AND pg_try_advisory_xact_lock(id) = true
order by id limit 1 for update
)
RETURNING id
If PostgreSQL allowed ORDER BY and LIMIT on UPDATEs, you could avoid the subselect altogether, but that is not currently implemented (maybe it will be in 9.5).
For good performance (or even to avoid memory errors), you will need an index like:
create index on job (id) where processed = false;
We have noticed a rare occurrence of a deadlock on a PostgreSQL 9.2 server in the following situation:
T1 starts the batch operation:
UPDATE BB bb SET status = 'PROCESSING', chunk_id = 0 WHERE bb.status ='PENDING'
AND bb.bulk_id = 1 AND bb.user_id IN (SELECT user_id FROM BB WHERE bulk_id = 1
AND chunk_id IS NULL AND status ='PENDING' LIMIT 2000)
When T1 commits after a few hundred milliseconds or so (BB has many millions of rows), multiple threads begin new transactions (one transaction per thread) that read items from BB, do some processing and update them in batches of 50 or so with the queries:
For select:
SELECT * FROM (SELECT *, RANK() OVER (ORDER BY user_id) AS rno FROM BB WHERE status = 'PROCESSING' AND bulk_id = 1) sub
WHERE rno = $1
And Update:
UPDATE BB set datetime=$1, status='DONE', message_id=$2 WHERE bulk_id=1 AND user_id=$3
(user_id, bulk_id have a UNIQUE constraint).
Due to a problem external to this situation, another transaction T2 executes the same query as T1 almost immediately after T1 has committed (the initial batch operation where items are marked as 'PROCESSING').
UPDATE BB bb SET status = 'PROCESSING', chunk_id = 0 WHERE bb.status ='PENDING'
AND bb.bulk_id = 1 AND bb.user_id IN (SELECT user_id FROM BB WHERE bulk_id = 1
AND chunk_id IS NULL AND status ='PENDING' LIMIT 2000)
However, although these items are marked as 'PROCESSING', this query deadlocks with some of the updates (which are done in batches, as I said) from the worker threads. To my understanding, this should not happen with the READ COMMITTED isolation level (the default) that we use. I am sure that T1 has committed because the worker threads execute after it has done so.
Edit: One thing I should clear up is that T2 starts after T1 but before T1 commits. However, due to a write-exclusive tuple lock that we acquire with a SELECT FOR UPDATE on the same row (one that is not affected by any of the above queries), it waits for T1 to commit before it runs the batch update query.
When T1 commits after a few hundred milliseconds or so (BB has many millions of rows), multiple threads begin new transactions (one transaction per thread) that read items from BB, do some processing and update them in batches of 50 or so with the queries:
This strikes me as a concurrency problem. I think you are far better off having one transaction read the rows and hand them off to worker processes, and then update them in batches when they come back. Your fundamental problem is that these rows are effectively in an uncertain state, held across transactions, and so on. You have to handle rollbacks and so forth separately, and consequently the locking is a real problem.
Now, if that solution is not possible, I would have a separate locking table. In this case, each thread spins up separately, locks the locking table, claims a bunch of rows by inserting records into the locking table, and commits. In this way each thread has claimed specific records. Then they can work on their record sets, update them, etc. You may want a process which periodically clears out stale locks.
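A rough sketch of that claim step (the bb_claims table, its columns, and $worker_id are hypothetical; adapt them to your schema):

BEGIN;
-- Serialize the claimers so no two threads pick the same rows.
LOCK TABLE bb_claims IN EXCLUSIVE MODE;
INSERT INTO bb_claims (user_id, claimed_by, claimed_at)
SELECT user_id, $worker_id, now()
FROM BB
WHERE bulk_id = 1
  AND status = 'PROCESSING'
  AND user_id NOT IN (SELECT user_id FROM bb_claims)
LIMIT 50;
COMMIT;
-- claimed_at lets a cleanup job delete stale claims later.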
In essence your problem is that rows go from state A -> processing -> state B and may be rolled back. Since the other threads have no way of knowing which rows are being processed, and by which threads, you can't safely allocate records. One option is to change the model to:
state A -> claimed state -> processing -> state B. However, you then need some way of ensuring that rows are actually allocated and that the threads know which rows have been allocated to which thread.
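A sketch of that extra transition, with a hypothetical claimed_by column (note that the FOR UPDATE subquery blocks rather than skips on 9.2, since SKIP LOCKED only arrived in 9.5; the outer status check mirrors the repeat-the-condition advice from earlier in this thread as an extra safeguard):

UPDATE BB
SET status = 'CLAIMED', claimed_by = $worker_id
WHERE user_id IN (
    SELECT user_id FROM BB
    WHERE bulk_id = 1 AND status = 'PROCESSING'
    LIMIT 50
    FOR UPDATE
)
AND bulk_id = 1
AND status = 'PROCESSING'
RETURNING user_id;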