Pretty simple question but I can't find the answer anywhere.
The pg-promise docs say that we should avoid using transactions because they're blocking operations and that we should be using tasks instead. https://vitaly-t.github.io/pg-promise/Database.html#tx
But what happens when a query inside of a task fails? Will the whole task be rolled back?
In other words, are tasks atomic?
Tasks (method task) are there simply to let you execute multiple queries against the same allocated connection from the pool. They are not atomic.
A transaction (method tx) extends the task with automatic injection of BEGIN + COMMIT/ROLLBACK. Every transaction is atomic.
Also, each multi-query is atomic, as documented, which is enforced by PostgreSQL.
The pg-promise docs say that we should avoid using transactions because they're blocking operations and that we should be using tasks instead.
That advice is given in the context of executing multiple queries where none of the queries changes the database, like SELECTs. If you are changing data, that advice does not apply.
From the tx documentation:
Note that transactions should be chosen over tasks only where necessary, because unlike regular tasks, transactions are blocking operations.
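To make the task/tx distinction concrete, here is a minimal pg-promise sketch; the connection string, tables, and queries are invented for illustration:

```js
const pgp = require('pg-promise')();
const db = pgp('postgres://user:pass@localhost:5432/mydb'); // placeholder connection

// Task: several queries share one pooled connection, but there is no BEGIN/COMMIT,
// so if the second query fails the first one is NOT rolled back.
function readUserStats(userId) {
    return db.task(async t => {
        const user = await t.one('SELECT * FROM users WHERE id = $1', [userId]);
        const logins = await t.one('SELECT count(*) FROM logins WHERE user_id = $1',
                                   [userId], row => +row.count);
        return {user, logins};
    });
}

// Transaction: pg-promise injects BEGIN + COMMIT/ROLLBACK around the callback,
// so either both updates are applied or neither is.
function transfer(fromId, toId, amount) {
    return db.tx(async t => {
        await t.none('UPDATE accounts SET balance = balance - $1 WHERE id = $2', [amount, fromId]);
        await t.none('UPDATE accounts SET balance = balance + $1 WHERE id = $2', [amount, toId]);
    });
}
```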
Related
I have a simple bug in my program that uses multi-user support. I'm using knex to build SQL queries, and I have some pseudocode that depicts the scenario:
const value = queryBuilder().readDataFromTheDatabase(); // executes this
// do some other work and get the updated value
queryBuilder().writeValueToTheDatabase(updateValue(value));
This piece of code is used in a sort of middleware function, and as you can see, it is a possible race condition: when multiple users execute it at roughly the same time, one of them gets a stale value.
My solution
So, I was thinking a possible solution would be to create a single queryBuilder statement:
queryBuilder().readAndUpdateValueInTheDatabase();
So I'll probably have to use a little bit of PL/pgSQL. I was wondering if this solution will be sufficient. Will the statement be executed atomically? That is, when one request has read but not yet finished its write, does another request wait to both read and write, or does it only wait to write and still read the stale value?
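For illustration, such a single-statement read-and-update could look roughly like this with knex.raw; the counters table, the value column, and the increment logic are invented here:

```js
// Hypothetical example: the read-modify-write is collapsed into one UPDATE,
// which PostgreSQL executes against the current row while holding a row lock.
async function readAndUpdateValueInTheDatabase(knex, id) {
    const result = await knex.raw(
        'UPDATE counters SET value = value + 1 WHERE id = ? RETURNING value',
        [id]
    );
    return result.rows[0].value; // pg driver result
}
```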
I think what you are looking for here is isolation, not atomicity. You could set all transactions to the highest isolation level, serializable (which is higher than the usual default level). With that level, if data that a transaction read (and presumably relied upon) is changed, then when it tries to commit it might get a serialization failure error. I say "might", because the system could conclude the situation would be consistent with the data change having happened after the commit, in which case the commit is allowed to stand.
To avoid a race condition with such a setup, you must run both the read and the write in the same database transaction.
There are two ways to do that:
Use the default READ COMMITTED isolation level and lock the rows when you read them:
SELECT ... FROM ... FOR NO KEY UPDATE;
That locks the rows against concurrent modifications, and the lock is held until the end of the transaction.
Use the REPEATABLE READ isolation level and don't lock anything. Then your UPDATE will receive a serialization error (SQLSTATE 40001) if somebody modified the row concurrently. In that case, you roll the transaction back and try again in a new REPEATABLE READ transaction.
The first solution is usually better if you expect conflicts frequently, while the second is better if conflicts are rare.
Note that you should keep the database transaction as short as possible in both cases to keep the risk of conflicts low.
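A minimal knex sketch of the first approach, assuming a hypothetical counters table with id and value columns (the code uses forUpdate(); recent knex versions also offer forNoKeyUpdate() to match the lock mode shown above):

```js
// Sketch only: table and column names are made up.
async function readAndUpdateValue(knex, id, updateValue) {
    return knex.transaction(async trx => {
        // Locks the selected row until the transaction ends, so a concurrent
        // request blocks here instead of reading a stale value.
        const row = await trx('counters')
            .where({ id })
            .forUpdate()
            .first('value');

        const newValue = updateValue(row.value);

        await trx('counters')
            .where({ id })
            .update({ value: newValue });

        return newValue; // the transaction commits when the callback resolves
    });
}
```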
Transactions in PostgreSQL use an optimistic locking model when accessing tables, while some other DBMSs use pessimistic locking (IBM Db2) or support both models (MS SQL Server).
Optimistic locking takes a snapshot of the data you are working on, and your modifications are made against that snapshot until the transaction ends. When the transaction finishes, the snapshot modifications are applied to the actual table rows; but if some other user changed the data between the snapshot capture and the commit, the changes cannot be applied and the COMMIT is rejected as a ROLLBACK.
You can raise the ISOLATION LEVEL (REPEATABLE READ or SERIALIZABLE) to avoid the trouble.
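To tie the last two answers together, here is a hedged knex sketch of the retry-based variant: run the read and the write in one REPEATABLE READ transaction and retry on serialization failure (SQLSTATE 40001). The counters table and column names are again made up.

```js
// Sketch only: retries the whole transaction when PostgreSQL reports a
// serialization failure (error code 40001).
async function updateWithRetry(knex, id, updateValue, maxRetries = 5) {
    for (let attempt = 0; attempt < maxRetries; attempt++) {
        try {
            return await knex.transaction(async trx => {
                // must be the first statement inside the transaction
                await trx.raw('SET TRANSACTION ISOLATION LEVEL REPEATABLE READ');
                const row = await trx('counters').where({ id }).first('value');
                const newValue = updateValue(row.value);
                await trx('counters').where({ id }).update({ value: newValue });
                return newValue;
            });
        } catch (err) {
            if (err.code !== '40001') {
                throw err; // not a serialization failure, rethrow
            }
            // serialization failure: fall through and retry in a fresh transaction
        }
    }
    throw new Error('gave up after repeated serialization failures');
}
```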
I have been working with a Db2 LUW database, and I want to submit procedures as parallel jobs. I have a procedure which runs some DDL and DML statements against one table. The table holds a huge amount of data, and the same procedure needs to run against a few more tables, in parallel.
I submit the jobs using the DBMS_JOB.SUBMIT statement and execute them using the DBMS_JOB.RUN statement. I have a job handler procedure which is supposed to run them in parallel.
But each job executes sequentially: the first job completes, then the second job starts; after the second job completes, the third one starts.
**My First Question**
How do I run DBMS_JOB jobs in parallel?
The second issue I'm facing is that the current session keeps waiting for all the jobs to complete. I can't use that particular session; only once all the jobs have completed do I get that session back.
**My Second Question**
*How do I make the session accessible instead of waiting for all jobs to complete?*
Please help me sir/madam.
DBMS_JOB is an interface to the Administrative Task Scheduler (ATS) of Db2-LUW, provided for some compatibility with the Oracle RDBMS. However, you can also use the ATS directly, independently of DBMS_JOB, via ADMIN_TASK_ADD and related procedures.
My experience is that db2acd (the process that implements autonomic actions, including the ATS) is unreliable, especially when ulimits are misconfigured, and it silently won't run jobs in some circumstances. It also wakes up only every 5 minutes to check for new jobs, which can be frustrating, and it requires an already-activated database, which is inconvenient for some use cases.
I would not recommend using the Db2 ATS for application-layer functionality. Full-function enterprise schedulers exist for good reasons.
For parallel invocations, I would use an enterprise scheduling tool if available, or, failing that, use the scheduler supplied by the operating system, either on the Db2 server or at worst on the client side, taking care in both cases that each stored-procedure invocation is its own scheduled job with its own Db2 connection.
If you use one Db2 connection per stored-procedure invocation and schedule the invocations concurrently, they run in parallel as long as their actions don't cause mutual contention.
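As a hedged client-side illustration of that point with the Node ibm_db driver (the connection string, schema, procedure name, and table list are placeholders): each invocation gets its own connection, so nothing on the client serializes them.

```js
const ibmdb = require('ibm_db');

// Placeholder connection string and procedure name.
const cn = 'DATABASE=MYDB;HOSTNAME=db2host;PORT=50000;PROTOCOL=TCPIP;UID=user;PWD=pass;';

// One Db2 connection per stored-procedure invocation; the calls run in
// parallel on the server as long as they don't contend with each other.
function runProcFor(tableName) {
    return new Promise((resolve, reject) => {
        ibmdb.open(cn, (err, conn) => {
            if (err) return reject(err);
            conn.query('CALL MYSCHEMA.PROCESS_TABLE(?)', [tableName], (qerr, rows) => {
                conn.close(() => (qerr ? reject(qerr) : resolve(rows)));
            });
        });
    });
}

Promise.all(['T1', 'T2', 'T3'].map(runProcFor))
    .then(() => console.log('all procedure calls finished'))
    .catch(console.error);
```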
Apart from the above, I believe the ATS will start jobs in parallel provided that the job definitions are correct.
Examine the contents of both ADMIN_TASK_LIST and ADMIN_TASK_STATUS administrative views, and corroborate with db2diag entries (diaglevel 4 may give more detail, even if you must use it only temporarily).
Calls to SQL PL (or PL/SQL) stored procedures are synchronous relative to the caller, which means that the Db2-connection is blocked until the stored procedure returns. You cannot "make the session accessible" if it is waiting for a stored procedure to complete, but you can open a new connection.
Different options exist for stored procedures written in C, C++, Java, or C++/CLR; they have more freedom. Other options exist for messaging/broker-based solutions. Much depends on available skillsets, toolsets, and experience, but in general it's wiser to keep it simple.
I would like all queries from my Spring-Hibernate application executed in a read-only transaction to be dispatched to a PostgreSQL slave and all read-write transaction queries to a master.
While using annotation-driven transactions in Spring, if the transaction is defined as read-only, the PostgreSQL driver allows only SELECT queries to be executed, which is obvious; however, there is no mention of how the driver would behave in a master-slave configuration. For example, the MySQL driver has a replication connection class which automatically dispatches read-only transaction queries to the slave.
One solution would be to use multiple Hibernate session factories and use the one pointing to the slave for selects and the other for updates, but that would be too much manual handling. How should I be designing this?
This is a surprisingly complex question, and the answer is not simple. You need to keep in mind that the dispatching has to happen in a layer that knows whether a given transaction is likely to be read-only or not.
The cleanest solution is probably to implement the dispatching in your middleware. This has the advantage of being a functional dispatch: we know what we are trying to do, so we dispatch accordingly. Of course, functions can create a bit of a knowledge gap about what is read-only and what writes.
The second option is to dispatch with something like PGPool or the like. In these cases you would probably want to avoid server-side prepared queries, because the more knowledge you provide the intermediate layer, the fewer problems you will have.
In MarkLogic, if I interrupt a long-running query with a database restart, will that query then no longer be fully applied when the database comes online again?
Yes, in general canceling an update query will roll back any changes it tried to make. You can think of this like a stack: every update in your query goes onto a stack, taking any necessary locks as it goes. After all the expressions have been evaluated, the update enters its commit phase and applies that stack atomically to the database. If the query is interrupted before that atomic commit, none of the changes are durable. This behavior covers the A (atomicity) and D (durability) aspects of the ACID properties common to transactional DBMS implementations.
There are some exceptions. It is possible to structure an update so that work is applied in granular sub-transactions. One way to do that is with a multi-statement transaction.
See http://docs.marklogic.com/guide/app-dev/transactions for more.
For the SNAPSHOT isolation level in SQL Server 2008 R2, the following is mentioned in the MSDN ADO.NET documentation:
Transactions that modify data do not block transactions that read data, and transactions that read data do not block transactions that write data, as they normally would under the default READ COMMITTED isolation level in SQL Server.
There is no mention of whether writes will block writes, when both transactions are in SNAPSHOT isolation mode. So my question is as follows:
Will writes in SNAPSHOT transaction trans1 block writes to the same tables in another SNAPSHOT transaction trans2?
LATEST UPDATE
After doing a lot of thinking about my question, I have come to the conclusion described in the paragraph below. I hope others can throw more light on this.
There is no relational database in which writes do NOT block writes. In other words, writes will always block writes. Writes include statements like INSERT, UPDATE, and DELETE. This is true no matter which isolation level you use, since all relational databases need to ensure data consistency when multiple writes are happening in the database. Of course, the simultaneous writes need to be conflicting (as in inserting into the same table or updating the same row(s)) for this blocking to occur.
Ligos is actually incorrect: if two separate transactions are trying to update the same record with SNAPSHOT on, transaction 2 WILL be blocked until transaction 1 releases the lock. Then, and ONLY then, will you get error 3960. I realize this thread is over 2 years old, but I wanted to avoid misinformation being out there.
Even the link Ligos references says the exact same thing I am mentioning above (check out the last non-red paragraph).
A write will not block another write only when the two records (i.e. rows) being updated are different.
No. They will not block. Instead, the UPDATE command in trans2 will fail with error number 3960.
Because of how SNAPSHOT isolation level works, any UPDATE command may fail. The only way you can tell is to catch and handle error 3960 (it is called optimistic concurrency because you don't expect this situation to happen very often).
I ended up testing this empirically, because it's not entirely obvious from the documentation. This blog post illustrates it nicely though.
Assumption: both trans1 and trans2 are updating the same row in the same table. Updating two different rows should work just fine.
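A hedged sketch with the Node mssql driver of what "catch and handle error 3960" can look like; the accounts table, the retry count, and the column names are invented:

```js
const sql = require('mssql');

// Sketch: update a row inside a SNAPSHOT transaction and retry when the
// optimistic-concurrency check fails with SQL Server error 3960.
async function updateBalance(pool, accountId, delta) {
    for (let attempt = 0; attempt < 3; attempt++) {
        const tx = new sql.Transaction(pool);
        await tx.begin(sql.ISOLATION_LEVEL.SNAPSHOT);
        try {
            await new sql.Request(tx)
                .input('id', sql.Int, accountId)
                .input('delta', sql.Int, delta)
                .query('UPDATE accounts SET balance = balance + @delta WHERE id = @id');
            await tx.commit();
            return;
        } catch (err) {
            try { await tx.rollback(); } catch (e) { /* may already be rolled back */ }
            if (err.number !== 3960) throw err; // not a snapshot update conflict
            // error 3960: another transaction changed the row first; retry
        }
    }
    throw new Error('gave up after repeated snapshot update conflicts');
}
```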