I'm using Npgsql with PostgreSQL. I want to see uncommitted changes of one transaction in a different one.
This is how I create my connection and transaction:
// create connection
m_Connection = new NpgsqlConnection(connectionString);
m_Connection.Open();
//create transaction
m_Transaction = m_Connection.BeginTransaction(IsolationLevel.ReadUncommitted);
In one thread I insert a row like so:
NpgsqlCommand command = CreateCommand("INSERT INTO TABLEA ....", parameters, commandTimeout)
command.ExecuteNonQuery();
and process something else without committing or rolling back the transaction.
In a different thread I read a row like so:
NpgsqlCommand command = CreateCommand("SELECT COUNT(*) FROM TABLEA", parameters, commandTimeout);
command.ExecuteScalar();
but somehow I don't see the results of the first INSERT. I don't see the results of the insert in pgAdmin either (even after running SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED).
What am I doing wrong? Any help will be appreciated.
PostgreSQL does not support uncommitted reads. Even if you request READ UNCOMMITTED, it behaves exactly like READ COMMITTED, so you will never be able to see changes from other transactions that have not been committed.
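A minimal psql sketch of this behaviour (tablea and its col column stand in for the question's table):

-- session 1
BEGIN;
INSERT INTO tablea (col) VALUES (1);   -- not committed yet

-- session 2
BEGIN TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;  -- accepted, but treated as READ COMMITTED
SELECT count(*) FROM tablea;           -- still does not include session 1's row
COMMIT;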
Note: I'm using DBeaver 21.1.3 as my PostgreSQL development tool.
For my testing, I have created a table:
CREATE TABLE test_sk1(n numeric(2));
then I disabled auto-commit in DBeaver to verify whether I could see the blocking query for my other transaction.
I have then executed an insert on my table:
INSERT INTO test_sk1(n) values(10);
Now this insert transaction is uncommitted, and it holds a lock on the table.
Then I opened another SQL window and tried an ALTER command on the table.
ALTER TABLE test_sk1 ADD COLUMN v VARCHAR(2);
Now I can see that the ALTER statement is blocked.
But when I checked the locks, I saw that the ALTER statement was blocked by a session whose query shows up as "SHOW search_path;", where I was expecting the "INSERT ..." statement as the blocking query.
I used the query below to find the blocking locks:
SELECT p1.pid,
p1.query as blocked_query,
p2.pid as blocked_by_pid,
p2.query AS blocking_query
FROM pg_stat_activity p1, pg_stat_activity p2
WHERE p2.pid IN (SELECT UNNEST(pg_blocking_pids(p1.pid)))
AND cardinality(pg_blocking_pids(p1.pid)) > 0;
Why is this happening on our databases?
Try using psql for such experiments. DBeaver and many other tools will execute many SQL statements that you didn't intend to run.
The query that you see in pg_stat_activity is just the latest query done by that process, not necessarily the one locking any resource.
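As a sketch (12345 is a placeholder for the pid of the blocked ALTER backend), you can look at the state of the blocking session instead; it will typically be "idle in transaction", with "SHOW search_path;" merely being the last statement it happened to run:

SELECT pid, state, xact_start, query_start, query
FROM pg_stat_activity
WHERE pid = ANY (pg_blocking_pids(12345));  -- 12345 = pid of the blocked session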
I am using sqlworkbench-j to query Redshift data. I am running into locked tables whenever I query this table, even for simple SELECT statements. I know this is happening because the workbench explicitly issues a BEGIN before every statement so that any changes to the data can be handled in a transaction, so after every query I need to issue an END TRANSACTION.
Is there any option in sqlworkbench-j to disable the BEGIN statement, or to add the END TRANSACTION statement automatically?
When you set up the Redshift connection, enable the "Autocommit" option.
see here for more detailed instructions
https://docs.aws.amazon.com/redshift/latest/mgmt/connecting-using-workbench.html
especially point 10
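If you prefer to keep autocommit off, a sketch of the manual alternative (my_table is a placeholder) is to end the implicitly opened transaction yourself after each statement:

SELECT count(*) FROM my_table;
END TRANSACTION;   -- commits and releases the locks taken by the implicit BEGIN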
While trying to support a PostgreSQL database in my application, I found this strange behaviour.
Preparation:
CREATE TABLE test(id INTEGER, flag BOOLEAN);
INSERT INTO test(id, flag) VALUES (1, true);
Assume two concurrent transactions (Autocommit=false, READ_COMMITTED) TX1 and TX2:
TX1:
UPDATE test SET flag = FALSE WHERE id = 1;
INSERT INTO test(id, flag) VALUES (2, TRUE);
-- (wait, no COMMIT yet)
TX2:
SELECT id FROM test WHERE flag=true FOR UPDATE;
-- waits for TX1 to release lock
Now, if I COMMIT in TX1, the SELECT in TX2 returns an empty cursor.
That is strange to me, because the same experiment in Oracle and MariaDB results in selecting the newly created row (id=2).
I could not find anything about this behaviour in PG documentation.
Am I missing something?
Is there any way to force PG server to "refresh" statement visibility after acquiring lock?
PS: PostgreSQL version 11.1
TX2 scans the table and tries to lock the results.
The scan sees the snapshot of the database from the start of the query, so it cannot see any rows that were inserted (or made eligible in some other way) by concurrent modifications that started after that snapshot was taken.
That is why you cannot see the row with the id 2.
For id 1, that is also true, so the scan finds that row. But the query has to wait until the lock is released. When that finally happens, it fetches the latest committed version of the row and performs the check again, so that row is excluded as well.
This “EvalPlanQual” recheck (to use PostgreSQL jargon) is only performed for rows that were found during the scan, but were locked. The second row isn't even found during the scan, so no such processing happens there.
This is a bit odd, admittedly. But it is not a bug; it is just the way PostgreSQL works.
If you want to avoid such anomalies, use the REPEATABLE READ isolation level. Then you will get a serialization error in such a case and can retry the transaction, thus avoiding inconsistencies like that.
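A sketch of what that looks like with the test table from the question:

-- TX2, using REPEATABLE READ instead
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
SELECT id FROM test WHERE flag = true FOR UPDATE;
-- if TX1 commits a conflicting change while this waits, the statement fails with:
--   ERROR:  could not serialize access due to concurrent update
ROLLBACK;  -- retry the whole transaction from the start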
I'm using Entity Framework 5.0. I need to restrict access to a row while I'm reading and updating it.
My application runs on more than 10 machines, and when I use TransactionScope, sometimes another application on another machine (randomly) hangs and cannot update or read data from that table.
I think TransactionScope restricts access to my table while it is reading or updating, so other update or read requests hang.
How can I handle requests from other applications to update or read data from that table while one application has not yet completed its TransactionScope?
using (var myDB = new MyDBEntities())
{
using (TransactionScope scope = new TransactionScope())
{
// read and update myDB object with some code in here
// ...
myDB.SaveChanges();
scope.Complete();
}
}
When using transaction scopes, another transaction that tries to select or update the same rows will be blocked until yours completes.
Also, you can hide uncommitted data from another transaction by using the table hint READPAST.
Example:
session1
BEGIN TRANSACTION
update users with (SERIALIZABLE) set name='test' where Id = 1
-- COMMIT --not committed yet
session2
select * from users where Id = 1
--waits session1 to finish its job
--rows returns after commit
session3
select * from users with (READPAST) where Id = 1 --returns 0 row
While you have not committed the transaction, other sessions cannot read or update the data. When you commit the transaction in session1, session2 will be able to read the row.
http://omerc.blogspot.com.tr/2010/04/transaction-and-locks-in-ms-sql-2008.html
https://technet.microsoft.com/en-us/library/jj856598(v=sql.110).aspx
https://www.codeproject.com/Articles/690136/All-About-TransactionScope
Be aware that the data is still open to lost updates. To prevent them, you may consider using optimistic or pessimistic locking.
https://logicalread.com/sql-server-concurrency-lost-updates-w01/#.WskuNNNuZ-U
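For example, a sketch of optimistic locking in plain SQL, assuming a hypothetical Users table with a RowVersion column (@originalRowVersion is the version value read earlier):

-- read the row together with its current version
SELECT Id, Name, RowVersion FROM Users WHERE Id = 1;

-- later, update only if nobody changed the row in the meantime
UPDATE Users
SET Name = 'test'
WHERE Id = 1
  AND RowVersion = @originalRowVersion;  -- the value read in the SELECT above
-- if @@ROWCOUNT is 0, another session updated the row first: reload and retry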
PostgreSQL has the READ COMMITTED isolation level. Now I have a transaction which consists of a single DELETE statement, and this DELETE statement has a subquery consisting of a SELECT statement for selecting the rows to delete.
Is it true that I have to use FOR UPDATE in the select statement to get no conflicts with other transaction?
My thinking is the following: First the corresponding rows are read out from the table and in a second step these rows are deleted, so another transaction could interfere.
And what about a simple DELETE FROM myTable WHERE id = 4 statement? Do I also have to use FOR UPDATE?
Is it true that I have to use FOR UPDATE in the select statement to
get no conflicts with other transaction?
What does "no conflicts with other transaction" mean to you? You can test this by opening two terminals, and executing statements in each of them. Interleaved correctly, the DELETE statement will make the "other transaction" (the one that has its isolation level set to READ COMMITTED) wait until it commits or rolls back.
sandbox=# set transaction isolation level read committed;
SET
sandbox=# select * from customer;
date_of_birth
---------------
1996-09-29
1996-09-28
(2 rows)
sandbox=# begin transaction;
BEGIN
sandbox=# delete from customer
sandbox-# where date_of_birth = '1996-09-28';
DELETE 1
sandbox=# update customer
sandbox-# set date_of_birth = '1900-01-01'
sandbox-# where date_of_birth = '1996-09-28';
(Execution pauses here, waiting for transaction in other terminal.)
sandbox=# commit;
COMMIT
sandbox=#
UPDATE 0
sandbox=#
See below for the documentation.
And what about a simple DELETE FROM myTable WHERE id = 4 statement? Do
I also have to use FOR UPDATE?
There's no such statement as DELETE . . . FOR UPDATE.
You need to be sensitive to context when you're reading about database updates. Update can mean any change to a database; it can include inserting, deleting, and updating rows. In the docs cited below, "locked as though for update" is explicitly talking about UPDATE and DELETE statements, among others.
Current docs
FOR UPDATE causes the rows retrieved by the SELECT statement to be
locked as though for update. This prevents them from being modified or
deleted by other transactions until the current transaction ends. That
is, other transactions that attempt UPDATE, DELETE, SELECT FOR UPDATE,
SELECT FOR NO KEY UPDATE, SELECT FOR SHARE or SELECT FOR KEY SHARE of
these rows will be blocked until the current transaction ends. The FOR
UPDATE lock mode is also acquired by any DELETE on a row, and also by
an UPDATE that modifies the values on certain columns. Currently, the
set of columns considered for the UPDATE case are those that have an
unique index on them that can be used in a foreign key (so partial
indexes and expressional indexes are not considered), but this may
change in the future. Also, if an UPDATE, DELETE, or SELECT FOR UPDATE
from another transaction has already locked a selected row or rows,
SELECT FOR UPDATE will wait for the other transaction to complete, and
will then lock and return the updated row (or no row, if the row was
deleted).
Short version: the FOR UPDATE in a sub-select is not necessary because the DELETE implementation already does the necessary locking. It would be redundant.
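In other words, a statement like the following sketch (table and column names are placeholders) is already safe on its own under READ COMMITTED:

DELETE FROM my_table
WHERE id IN (SELECT id FROM my_table WHERE status = 'obsolete');
-- the DELETE locks each row it removes; adding FOR UPDATE to the subquery changes nothing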
Ideally you should read and digest Concurrency Control to learn how the concurrency issues are dealt with by the SQL engine.
Specifically for the case you're mentioning, I think these couple of excerpts are the most relevant, in Read Committed Isolation Level:
UPDATE, DELETE, SELECT FOR UPDATE, and SELECT FOR SHARE commands
behave the same as SELECT in terms of searching for target rows: they
will only find target rows that were committed as of the command start
time.
However, such a target row might have already been updated (or
deleted or locked) by another concurrent transaction by the time it is
found. In this case, the would-be updater will wait for the first
updating transaction to commit or roll back (if it is still in
progress).
So one of your two concurrent DELETEs will be put to wait as soon as it tries to delete a row that the other one has already processed. This wait will only end when the other one commits or rolls back. In a way, that means that the engine "detected the conflict" and serialized the two DELETEs in order to deal with it.
If the first updater rolls back, then its effects are
negated and the second updater can proceed with updating the
originally found row. If the first updater commits, the second updater
will ignore the row if the first updater deleted it, otherwise it will
attempt to apply its operation to the updated version of the row.
In your scenario, after the first DELETE has committed and the second one is woken up, the second one will be unable to delete the row it was waiting for, because that row is no longer current; it's gone. That's not an error at this isolation level. The execution just goes on with the other rows, some of which may also have disappeared. Eventually it reports the actual number of rows that were deleted by this statement, which may differ from the number that the sub-select initially found before the statement was put to wait.
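A sketch of that outcome with two overlapping DELETEs (my_table is a placeholder):

-- session 1
BEGIN;
DELETE FROM my_table WHERE id IN (SELECT id FROM my_table WHERE status = 'obsolete');
-- e.g. DELETE 3

-- session 2 (blocks on the rows locked by session 1)
BEGIN;
DELETE FROM my_table WHERE id IN (SELECT id FROM my_table WHERE status = 'obsolete');

-- session 1
COMMIT;

-- session 2 resumes and reports fewer rows, e.g. DELETE 0,
-- because the rows it was waiting for are already gone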