Compensation Log Records (CLR) in UNDO Phase of Database Recovery

The compensation log records' redo information corresponds to the undo information of the log entries that made their creation necessary during the undo phase.
That sounds to me like the CLRs' redo information is the same as the undo information of the log entries that sparked them.
But shouldn't they hold REDO information in order to cancel out the executed UNDO operations in case of an interrupted recovery process?
Here's an example:
Let T2 be a loser transaction with this log entry:
<#55, T2, P3, J=J+9, J=J-9, #53>
J=J+9 is the redo-op and J=J-9 is the undo-op.
Now the CLR that is appended to the log file during the undo phase would be:
<#56, T2, J=J-9, __, #53>
Here J=J-9, the undo-op of the original log entry, serves as the redo information of the CLR. If the recovery is interrupted, log entry #56 will be executed during the redo phase of the rerun.
The point of CLRs is to ensure that restarting the recovery process and running it again always leads to the same result.
How does running the J=J-9 operation during the redo phase of the rerun ensure this?
Can somebody please explain this to me?

You have a change that you need to apply to the database. You can either make the change and record how to undo it in case the transaction aborts, or you can record what you need to do (REDO) and consult the database and redo log to figure out the current state of the database. See also http://ariel.its.unimelb.edu.au/~yuan/Ingres/clr.html
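To see why this works, here is a minimal toy sketch in Java (the record layout and names such as Rec and undoNext are invented for the sketch; real ARIES adds page LSNs and checkpoints, but the control flow is the same). Two things happen on a rerun: the redo phase replays the CLR, repeating the undo that was already performed, and the undo phase then follows the CLR's back-pointer (#53 in your example), so #55 is never undone a second time. Together that makes an interrupted recovery land in the same state as an uninterrupted one.

import java.util.ArrayList;
import java.util.List;

// Toy write-ahead log showing why a CLR carries the undo of the original
// record as its redo information. Layout and names are invented for this
// sketch; real ARIES adds page LSNs and checkpoints.
public class ClrDemo {

    static class Rec {
        final String tx;
        final int redoDelta, undoDelta;    // "J = J + delta" operations
        final Integer prevLsn;             // previous record of the same tx
        final boolean isClr;
        final Integer undoNext;            // CLR only: where undo resumes

        Rec(String tx, int redoDelta, int undoDelta,
            Integer prevLsn, boolean isClr, Integer undoNext) {
            this.tx = tx; this.redoDelta = redoDelta; this.undoDelta = undoDelta;
            this.prevLsn = prevLsn; this.isClr = isClr; this.undoNext = undoNext;
        }
    }

    static int J;                                    // the entire "database"
    static final List<Rec> log = new ArrayList<>();  // list index == LSN

    // Redo phase: repeat history from scratch, CLRs included.
    static void redoPhase() {
        J = 0;
        for (Rec r : log) J += r.redoDelta;
    }

    // One step of the undo phase for a loser transaction; returns the LSN
    // to continue at, or null when the loser is fully undone.
    static Integer undoStep(String tx, int lsn) {
        Rec r = log.get(lsn);
        if (r.isClr) return r.undoNext;    // already undone: skip past it
        J += r.undoDelta;                  // perform the undo
        // The CLR's redo information IS the undo we just performed, and
        // its back-pointer says where undo should resume.
        log.add(new Rec(tx, r.undoDelta, 0, null, true, r.prevLsn));
        return r.prevLsn;
    }

    public static void main(String[] args) {
        // Loser T2 wrote two updates before the crash: J += 5, then J += 9.
        log.add(new Rec("T2", +5, -5, null, false, null));   // LSN 0
        log.add(new Rec("T2", +9, -9, 0, false, null));      // LSN 1

        redoPhase();                       // first recovery: J = 14
        undoStep("T2", 1);                 // undoes LSN 1, appends CLR as LSN 2
        // ...crash here: LSN 0 was never undone.

        redoPhase();                       // rerun: replays the CLR too, J = 5
        Integer cur = 2;                   // T2's newest record is the CLR
        while (cur != null) cur = undoStep("T2", cur);
        System.out.println(J);             // 0, same as an uninterrupted run
    }
}

Delete the simulated crash in the middle and main still prints 0: that is the "always leads to the same result" property the CLRs buy you.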

IBM CDC Table should already have been refreshed. Transformation Server will terminate

I have a table with both source and target on IBM DB2 iSeries. The replication method is Mirror. After a refresh before mirroring, the message "Table <lib>/<table> should already have been refreshed. Transformation Server will terminate." occurs and the state of the table stays as Refresh. Other tables in the same subscription are running normally. Below is the detailed log:
source
Table lib/table, member table will be refreshed to subscription.
Table lib/table, member table refresh to subscription is complete 200000 rows sent.
Table lib/table member table could not be refreshed.
Table lib/table should already have been refreshed. Transformation Server will terminate.
target
Refresh started for target table lib/table, member *ONLY.
220310 rows deleted from member *FIRST of table lib/table.
Refresh completed for table lib/table, member *ONLY. 200000 rows received, 199500 rows successfully applied, 500 rows failed.
Does anyone have any ideas about this kind of situation?
CDC iSeries will try to get a very short exclusive lock (allowing reads) to ensure that there are no uncommitted commit cycles involving the table at the time the refresh starts. If it cannot get the lock, it skips the refresh, moves on to the next table, and posts the message that you have reported.
So you will need to run the refresh of the table at a time of low activity on the table (or no activity).
This lock is required to ensure consistency if the source application is updating the table under commitment control, as the journal scraper would otherwise ignore any transactions belonging to a commit cycle that started before the refresh itself started.
If the source application is not using commitment control at all and the iSeries is the only source, then you can get the target to ignore commitment control. The source will then know not to try the lock.
To turn off commitment control for a Java-based target, add the target system parameter mirror_commit_on_transaction_boundary and set it to false; if the target is iSeries, change the target commitment control parameter to *NONE.
Please be sure that commitment control is not used at all if you make this change on the target, as otherwise you may have some troublesome synchronisation issues if changes are rolled back concurrently with a table refresh.
Seeing the job log might give more clarity as to the cause of this behavior, as it can happen for many reasons.
One thing that can be tried in Management Console: select the mapped tables, park the table, flag it for Refresh, and start the subscription; it will refresh the table, which then enters the "Active" state.
Thanks

Understanding Postgres Deadlock

I have a Python web app using Flask and SQLAlchemy, and there's a system update process that runs in multiple threads. When I run it, I get a deadlock from Postgres.
The queries that appear in the logs are the following:
ERROR: deadlock detected
DETAIL: Process 2269053 waits for ShareLock on transaction 42979254; blocked by process 2269014.
Process 2269014 waits for ShareLock on transaction 42979253; blocked by process 2269053.
Process 2269053: UPDATE sequence SET item_list='{"item_list": [162, 164]}' WHERE sequence.id = 1978
Process 2269014: UPDATE sequence SET item_list='{"item_list": [162, 165]}' WHERE sequence.id = 1977
HINT: See server log for query details.
while updating tuple (102,44) in relation "sequence"
STATEMENT: UPDATE sequence SET item_list='{"item_list": [162, 164]}' WHERE sequence.id = 1978
I see that these are two different PKs. From my understanding, when an UPDATE is performed only the row is locked, and the statements are for two different rows. There's clearly something I'm misunderstanding, so I wanted to ask if someone could help me clarify why this deadlock happens and how I can solve it.
Thank you
The missing information is that a transaction can span multiple statements, and each of these can take locks. So each of the quoted UPDATE statements blocked on a lock taken by some earlier statement in the other transaction.
For example, the previous statement of process 2269053 might have updated the row with id 1977, and the previous statement of process 2269014 might have updated the row with id 1978. There are of course numerous other possibilities.
You should figure out which part of your application issued these statements (application log file?) and look what these transactions did before. You might have to crank up application or database logging to get that information, if you cannot reconstruct it by looking at the code.
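To make the failure mode concrete, here is a self-contained sketch of the interleaving in Java/JDBC (the pattern is identical in SQLAlchemy; the URL and credentials are placeholders, while the table and ids come from your log). Your Python threads play the roles of the two threads here.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Sketch of the interleaving behind the reported deadlock. URL and
// credentials are placeholders; the table and ids come from the log above.
public class DeadlockDemo {
    static final String URL = "jdbc:postgresql://localhost/appdb"; // assumed

    // One transaction that updates the given rows in the order given. Each
    // UPDATE takes a row lock that is held until commit.
    static void updateRows(int... ids) throws Exception {
        try (Connection c = DriverManager.getConnection(URL, "app", "secret")) {
            c.setAutoCommit(false);        // one multi-statement transaction
            try (Statement s = c.createStatement()) {
                for (int id : ids) {
                    s.executeUpdate("UPDATE sequence SET item_list = '{}' "
                                    + "WHERE id = " + id);
                }
            }
            c.commit();
        }
    }

    static void run(int... ids) {
        try {
            updateRows(ids);
        } catch (Exception e) {
            System.err.println(e.getMessage()); // the loser gets SQLSTATE 40P01
        }
    }

    public static void main(String[] args) {
        // If the timing lines up, one thread locks 1977 and then waits for
        // 1978 while the other locks 1978 and then waits for 1977 -- exactly
        // the DETAIL lines in the question. Postgres detects the cycle and
        // aborts one of the two transactions.
        new Thread(() -> run(1977, 1978)).start();
        new Thread(() -> run(1978, 1977)).start();
        // The fix: touch rows in one global order (e.g. sort the ids first),
        // so both transactions use (1977, 1978) and simply queue up.
    }
}

Sorting the ids before updating is usually the cheapest fix; alternatives are taking the locks up front with SELECT ... FOR UPDATE in a fixed order, or catching the deadlock error and retrying the aborted transaction.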

What does the file Snapshot.scala do in Databricks?

I am running some streaming query jobs on a Databricks cluster, and when I look at the cluster/job logs, I see a lot of
first at Snapshot.scala:1
and
withNewExecutionId at TransactionalWriteEdge.scala:130
A quick search yielded this Scala source https://github.com/delta-io/delta/blob/master/src/main/scala/org/apache/spark/sql/delta/Snapshot.scala
Can anyone explain what this does in layman's terms?
Internally, this class manages the replay of actions stored in the checkpoint or delta files.
Generally, this "snapshotting" relies on delta encoding and indirectly enables snapshot isolation as well.
Practically, delta encoding remembers every side-effectful operation (INSERT, DELETE, UPDATE) performed since the last checkpoint. In the case of Delta Lake these are SingleAction (source): AddFile (insert) and RemoveFile (delete). Conceptually this approach is close to event sourcing: without it you'd have to literally store/broadcast the whole state (database or directory) on every update. It is also employed by many classic ACID databases with replication.
Overall it gives you:
the ability to continuously replicate file-system/directory/database state (see SnapshotManagement.update). Basically, that's why you see a lot of first at Snapshot.scala:1; it's called in order to catch up with the log every time you start a transaction, see DeltaLog.startTransaction. I couldn't find the TransactionalWriteEdge sources, but I guess it's called around the same time.
the ability to restore state by replaying every action since the last snapshot.
the ability to isolate (and store) transactions by keeping their snapshots apart until commit (every SingleAction has a txn field for isolation). Delta Lake uses optimistic locking for that: transaction commits fail if their logs are not mergeable, while readers don't see uncommitted actions.
P.S. You can see that the log is accessed in the line val deltaData = load(files), and actions are stacked on top of previousSnapshot (val checkpointData = previousSnapshot.getOrElse(emptyActions); val allActions = checkpointData.union(deltaData)).
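If it helps to see the idea outside the Delta codebase, here is a toy replay in Java (all names are invented for this sketch; Delta's real actions are the SingleAction subtypes mentioned above). The current table state is never stored as such; it is recomputed as the last checkpoint plus the actions logged since.

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy version of the snapshot replay: the current set of data files is
// recomputed by folding logged actions over the last checkpoint.
public class SnapshotSketch {

    sealed interface Action permits Add, Remove {}
    record Add(String file) implements Action {}     // stands in for AddFile
    record Remove(String file) implements Action {}  // stands in for RemoveFile

    // Replay: apply the delta actions on top of the checkpointed state.
    static Set<String> replay(Set<String> checkpoint, List<Action> delta) {
        Set<String> files = new HashSet<>(checkpoint);
        for (Action a : delta) {
            if (a instanceof Add add) files.add(add.file());
            else if (a instanceof Remove rm) files.remove(rm.file());
        }
        return files;
    }

    public static void main(String[] args) {
        Set<String> checkpoint = Set.of("part-0.parquet", "part-1.parquet");
        List<Action> sinceCheckpoint = List.of(
                new Add("part-2.parquet"),
                new Remove("part-0.parquet"));
        // Prints [part-1.parquet, part-2.parquet] (in some order).
        System.out.println(replay(checkpoint, sinceCheckpoint));
    }
}

Every reader that "catches up with the log" performs exactly this kind of fold before it sees a consistent snapshot.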

Suspend transaction in Postgres

I have seen another database system that offers the ability to suspend a transaction. The current transaction is kept intact but put on hold while your code is allowed to work with the database to effect immediate, permanent changes to rows. Then you can resume the transaction, continuing where you left off with the same locks and other transaction protections in place, as if you'd never interrupted it.
For example, say a customer is placing an order, in a transaction. During that transaction, the customer notices their phone number needs to be updated, so we change that data. Next, the customer decides to cancel the not-yet-completed order. A rollback of the order has the unintended consequence of also undoing the phone number change. So it would be nice if we could:
Suspend the transaction for the order.
Update the phone number, committing it to the database.
Resume the transaction for the order.
Is there some way to suspend a transaction in Postgres? In JDBC?
If a transaction cannot continue, it must roll back.
If your transaction has a point at which you don't know how to carry on, then your transaction logic is flawed and you need to reorganize it: either split it into multiple transactions (or sub-transactions, a.k.a. savepoints), or take out the parts that do not belong in the transaction logic.
Is there some way to suspend a transaction in Postgres?
No, no such thing. The data integrity principle is unconditional with respect to time.
No.
The closest things are
prepared transactions: these allow (with some conditions) a transaction to be saved, and then later rolled back or committed.
savepoints: these allow for "nested transactions", where portions of a transaction can be rolled back (sketched in JDBC below).
Neither of these fits exactly what you are looking for. It seems that your example has two operations that do not need to be part of the same transaction at all, since the phone number update appears to be unrelated to the success of the order. (Also, a long-running transaction is a bad idea: your order should probably be a state machine implemented without a long-running transaction.)
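To make the savepoint option concrete, here is a JDBC sketch (URL, credentials, table, and column names are placeholders). Note the limitation: rolling back to a savepoint undoes everything after it, so this only keeps the phone-number change because the change happens before the savepoint, which is not quite the order of events in the question.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Savepoint;
import java.sql.Statement;

// JDBC savepoint sketch. Rolling back to the savepoint undoes only what
// came after it, so the phone update survives here only because it was
// made first.
public class SavepointSketch {
    public static void main(String[] args) throws Exception {
        try (Connection c = DriverManager.getConnection(
                "jdbc:postgresql://localhost/shop", "app", "secret")) {
            c.setAutoCommit(false);
            try (Statement s = c.createStatement()) {
                s.executeUpdate(
                    "UPDATE customer SET phone = '555-0100' WHERE id = 7");

                Savepoint beforeOrder = c.setSavepoint("before_order");
                s.executeUpdate("INSERT INTO orders (customer_id) VALUES (7)");

                // Customer cancels: undo the order part only.
                c.rollback(beforeOrder);
            }
            c.commit();   // the phone change is kept, the order is not
        }
    }
}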
Workaround – open second connection
In JDBC, you could just open a second connection to the database.
Do your separate work on that second connection and close it. The first connection is still open and remains in its same state; any active transaction in that first connection remains.
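A sketch of that workaround in JDBC (URL, credentials, and SQL are placeholders): the order transaction stays open and uncommitted on the first connection while the phone number is committed immediately on the second.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Second-connection workaround: the long-lived order transaction lives on
// orderConn while the phone fix commits at once on fixConn.
public class TwoConnections {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:postgresql://localhost/shop"; // assumed
        try (Connection orderConn = DriverManager.getConnection(url, "app", "secret")) {
            orderConn.setAutoCommit(false);              // long-lived order tx
            try (Statement s = orderConn.createStatement()) {
                s.executeUpdate("INSERT INTO orders (customer_id) VALUES (7)");
            }

            // A separate connection is a separate transaction.
            try (Connection fixConn = DriverManager.getConnection(url, "app", "secret");
                 Statement s = fixConn.createStatement()) {
                // Autocommit is on by default, so this is permanent at once.
                s.executeUpdate(
                    "UPDATE customer SET phone = '555-0100' WHERE id = 7");
            }

            orderConn.rollback();  // cancel the order; the phone change survives
        }
    }
}

One caveat: if the second connection touches a row the first connection has already locked, it will block until the order transaction ends.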

Transaction & Locks Problem

Within a DO TRANSACTION block, I defined a label, and inside this label I am accessing a table with EXCLUSIVE-LOCK. At the end of the label I have made all my changes to that table, but I am still within the transaction block.
Now I tried to access that same table in another session, and it shows an error: table in use by another user. So is it possible to release the table within the transaction so that another user can access it?
For example:
Session 1)
DO TRANSACTION:
---
---
loopb:
REPEAT:
--
--
---------------------> control is here right now.
END. /*repeat*/
--
--
END. /*do transaction*/
Session 2)
I tried to access the same table, but it shows an error that the table is locked by another user.
All those records you touched in the loop using EXCLUSIVE-LOCK will not be available to be locked by another user until the TRANSACTION is complete. There is no getting around this. If the second process needs to lock those records, then all you can do is decrease your TRANSACTION scope in the first process. This is a safety feature: if an error happens later in the TRANSACTION, all the changes made during the TRANSACTION can be rolled back. Another way to look at it is that if you could release some record locks during a TRANSACTION, you would lose the atomicity (all-or-nothingness) that is part of the definition of a TRANSACTION.
It should be noted that if you don't really need to lock those records in the second process but just need to see their updated values, that is possible. Once the updated records are no longer in the record buffer (or the record lock status is downgraded to NO-LOCK within the TRANSACTION), they become limbo locks, and you can view their updated values using NO-LOCK. To make the last record in the loop become a limbo lock, you can either do this:
FIND CURRENT tablerecord NO-LOCK.
Or this, if you do not need to access the record buffer any longer:
RELEASE tablerecord.
Other sessions can do a "dirty read" of the record using NO-LOCK, but they will not be able to lock it or update it until the transaction is committed (or rolled back). And that won't happen until the REPEAT block iterates or you leave it.