Read modified but not yet committed data - db2

select .. WITH UR ignores locks but gives only the currently committed data.
How can I read data that is not yet committed?
As in Oracle:
update table set ..
select .. gives the modified, not yet committed data

Data that is not committed cannot be read by other transactions; the only way to retrieve that data is from within the transaction that is modifying it.
For more information about isolation and concurrency in DB2, please take a look at this tutorial: http://www.ibm.com/developerworks/data/tutorials/db2-cert6106/
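A minimal sketch (the staff table and its columns are hypothetical): the connection that performs the update can always read its own uncommitted change, much like the Oracle session in the question.
-- in the same connection/transaction that made the change
UPDATE staff SET salary = salary * 1.10 WHERE id = 10;
SELECT salary FROM staff WHERE id = 10;   -- sees the new, uncommitted value
ROLLBACK;                                 -- or COMMIT to make it visible to other connections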

BigQuery - how to determine Data Availability

Regarding: https://cloud.google.com/bigquery/streaming-data-into-bigquery#dataavailability
What's the best way to programmatically determine whether a table's data is available after streaming?
I am getting unexpected results trying to fetch Rows and TotalRows with the following APIs: Jobs.Query, Jobs.GetQueryResults, Tables.Get, Tabledata.List
Thanks.
You can tell whether data has been flushed to the table by calling the Tables.Get() API and looking at the streamingBuffer.oldestEntryTime value. This can be considered a high-water mark for data that has been flushed out of the buffer.
Any data before this timestamp should be available for copy, export, and list operations.
Also, I should clarify that data in the table is available for query immediately after streaming. It is only unavailable to table copy, export, and tabledata.list() operations. Yes, this is confusing, but yes, we're also working on addressing the problem.
For tables that haven't been streamed to before or recently, there is a warmup period where new streaming data won't show up.
See https://cloud.google.com/bigquery/streaming-data-into-bigquery#dataavailability for more information.

Check redo / committed data size in postgres?

I have the following queries:
How do I check the redo / uncommitted data size in PostgreSQL?
It looks like if I do multiple updates in sequence, they slow down.
Like update 1, update 2, .... update n; update n seems slower than update 1. Does the uncommitted data volume affect this? How does redo management work in PostgreSQL?
How do I monitor the currently running SQL inside a stored function? pg_stat_activity just shows the function call at the session level. How do I get the current SQL that is running inside that function?
~ Santosh
You're clearly coming from an Oracle background.
PostgreSQL does not have undo and redo logs, as such.
Uncommitted (in-progress or rolled-back) data, live committed data, and committed-then-deleted data are mixed together in the heap, i.e. the main table contents. The fraction used by rolled-back transactions, old versions of updated rows, and deleted rows is referred to as table bloat. See the wiki.
The closest thing to the redo log is the write-ahead log in pg_xlog. There's no SQL-level interface for getting the current xlog size.
The documentation discusses this in some more detail, but it's an area of PostgreSQL management that could really use more attention from interested contributors. Both better built-in monitoring tools and better documentation would be good. Patches are welcome.
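As a rough sketch of what you can already check from SQL (the table name is a placeholder): the statistics views expose dead-row counts and the size functions report on-disk sizes, which together are the usual bloat indicators.
-- dead row versions per table: a rough indicator of bloat
SELECT relname, n_live_tup, n_dead_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
-- on-disk size of one table, including indexes and TOAST
SELECT pg_size_pretty(pg_total_relation_size('my_table'));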
As for your second question... you don't. There isn't currently a way to get a function call stack. One is being discussed, but hasn't been implemented as of 9.5.
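For reference, a minimal query against pg_stat_activity (9.2+ column names), which is limited in exactly the way described above:
SELECT pid, state, query
FROM pg_stat_activity
WHERE state <> 'idle';
-- 'query' shows the outer function call itself, not the SQL currently running inside it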

Lock and transaction in postgres that should block a query

Let's assume in SQL window 1 I do:
-- query 1
BEGIN TRANSACTION;
UPDATE post SET title = 'edited' WHERE id = 1;
-- note that there is no explicit commit
Then from another window (window 2) I do:
-- query 2
SELECT * FROM post WHERE id = 1;
I get:
1 | original title
Which is fine as the default isolation level is READ COMMITTED and because query 1 is never committed, the change it performs is not readable until I explicitly commit from window 1.
In fact if I, in window 1, do:
COMMIT TRANSACTION;
I can then see the change if I re-run query 2.
1 | edited
My question is:
Why does query 2 return fine the first time I run it? I was expecting it to block, since the transaction in window 1 had not been committed yet and the lock placed on the row with id = 1 was (or should have been) an unreleased exclusive lock that would block a read like the one performed in window 2. Everything else makes sense to me, but I was expecting the SELECT to get stuck until an explicit commit was executed in window 1.
The behaviour you describe is normal and expected in any transactional relational database.
If PostgreSQL showed you the value edited for the first SELECT it'd be wrong to do so - that's called a "dirty read", and is bad news in databases.
PostgreSQL would be allowed to wait at the SELECT until you committed or rolled back, but it isn't required to by the SQL standard, you haven't told it you want to wait, and it doesn't have to wait for any technical reason, so it returns the data you asked for immediately. After all, until it's committed, that update only kind-of exists - it still might or might not happen.
If PostgreSQL always waited here, then you'd quickly land up with a situation where only one connection could be doing anything with the database at a time. Not pretty for performance, and totally unnecessary the vast majority of the time.
If you want to wait for a concurrent UPDATE (or DELETE), you'd use SELECT ... FOR SHARE. (But be aware that this won't work for INSERT).
Details:
SELECT without a FOR UPDATE or FOR SHARE clause does not take any row level locks. So it sees whatever is the current committed row, and is not affected by any in-flight transactions that might be modifying that row. The concepts are explained in the MVCC section of the docs. The general idea is that PostgreSQL is copy-on-write, with versioning that allows it to return the correct copy based on what the transaction or statement could "see" at the time it started - what PostgreSQL calls a "snapshot".
In the default READ COMMITTED isolation, snapshots are taken at the statement level, so if you SELECT a row, COMMIT a change to it from another transaction, and SELECT it again, you'll see different values even within one transaction. You can use REPEATABLE READ (snapshot) isolation if you don't want to see changes committed after the transaction begins, or SERIALIZABLE isolation to add further protection against certain kinds of transaction inter-dependencies.
See the transaction isolation chapter in the documentation.
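A small sketch of the difference, reusing the post table from the question:
-- READ COMMITTED (the default): each statement takes a fresh snapshot
BEGIN;
SELECT title FROM post WHERE id = 1;   -- can see changes committed after BEGIN
COMMIT;
-- REPEATABLE READ: one snapshot for the whole transaction
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT title FROM post WHERE id = 1;   -- repeated reads see the same row version
COMMIT;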
If you want a SELECT to wait for in-progress transactions to commit or rollback changes to rows being selected, you must use SELECT ... FOR SHARE. This will block on the lock taken by an UPDATE or DELETE until the transaction that took the lock rolls back or commits.
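In the scenario from the question, window 2 would look like this:
-- window 2: unlike the plain SELECT, this waits for window 1's uncommitted UPDATE
SELECT * FROM post WHERE id = 1 FOR SHARE;
-- blocks until window 1 commits or rolls back, then returns the committed row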
INSERT is different, though - the tuples just don't exist to other transactions until commit. The only way to wait for concurrent INSERTs is to take an EXCLUSIVE table-level lock, so you know nobody else is changing the table while you read it. Usually the need to do that means you have a design problem in the application though - your app should not care if there are uncommitted inserts still in flight.
See the explicit locking chapter of the documentation.
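A sketch of that table-level lock, again using the post table:
BEGIN;
LOCK TABLE post IN EXCLUSIVE MODE;   -- waits for in-flight writers, then blocks new ones
SELECT count(*) FROM post;           -- no uncommitted INSERTs can still be pending here
COMMIT;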
In PostgreSQL's MVCC implementation, the principle is reading does not block writing and vice-versa. The manual:
The main advantage of using the MVCC model of concurrency control rather than locking is that in MVCC locks acquired for querying (reading) data do not conflict with locks acquired for writing data, and so reading never blocks writing and writing never blocks reading. PostgreSQL maintains this guarantee even when providing the strictest level of transaction isolation through the use of an innovative Serializable Snapshot Isolation (SSI) level.
Each transaction only sees (mostly) what has been committed before the transaction began.
That does not mean there'd be no locking. Not at all. For many operations various kinds of locks are acquired. And various strategies are applied to resolve possible conflicts.

What IsolationLevel should I use in my TransactionScopes

What IsolationLevel should I use in my TransactionScopes for:
1. Reading a single record that I may then update. This record is independent of all other data in the database, so I only need to lock that one record.
2. Trying to read a single record. If no record exists, then create a record with that value in that table. This is independent of all other tables, but it needs to lock this table so another thread doesn't also find no record and then add the same record.
In the 2nd case, I think I need to lock the table to stop an insert on the table and any access on the record read, but allow reads of other records in the table and any access on any other table.
thanks - dave
I am not sure about EF as I have not worked with it, but my answer is as follows:
For the first case it is enough to use REPEATABLE READ, since it "Specifies that statements cannot read data that has been modified but not yet committed by other transactions and that no other transactions can modify data that has been read by the current transaction until the current transaction completes."
For the second case I would use SERIALIZABLE, since under it "No other transactions can modify data that has been read by the current transaction until the current transaction completes", and other transactions cannot insert new rows with key values that would fall in the range of keys read by the current transaction until it completes.
You can read more about isolation levels here.
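A hedged T-SQL sketch of the second case, assuming SQL Server and made-up table and column names; under SERIALIZABLE the existence check takes a key-range lock, and the UPDLOCK hint (my addition, not mentioned above) makes two concurrent attempts block on each other rather than deadlock:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
BEGIN TRANSACTION;
IF NOT EXISTS (SELECT 1 FROM Settings WITH (UPDLOCK) WHERE Name = 'timeout')
    INSERT INTO Settings (Name, SettingValue) VALUES ('timeout', '30');
COMMIT TRANSACTION;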

How commit works in oracle

I have a couple of statements; the pseudo code would look something like this:
insert into local_table
savepoint sp1
insert into remote_db    -- using db_link
update local_table2
delete from local_table
commit
Now I am a bit confused about the insert into remote_db statement. Is there any chance that the commit being applied has a different effect on the local db and on the remote db?
The problem is somewhat complex: the script which copies data from the local db to the remote db is producing duplicates. After investigating, that's the only place which looks suspicious, but I am not sure. I would really appreciate it if someone could shed light on how COMMIT works in Oracle.
If you are asking whether the commit could potentially cause duplicate rows, no, that's not possible.
Given the way that distributed transactions take place, it is possible that the transaction would not be committed at all on the remote database (in which case it would be an in-doubt distributed transaction that the remote DBA would likely need to resolve). But if the transaction is committed successfully, it's going to be committed correctly. It's not possible that some rows would get committed and others wouldn't, or that duplicate rows that didn't exist prior to the commit would be created by the act of committing.
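As a side note, an in-doubt distributed transaction can be spotted through the standard dictionary view (a sketch; run it on the database that may be holding the in-doubt transaction, and it requires DBA privileges):
SELECT local_tran_id, state, fail_time
FROM dba_2pc_pending;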