Are ESENT records flushed to disk when performing a commit/rollback? - esent

I'm running some tests using the ManagedEsent interface and I wonder if someone here can clarify this:
Inside a transaction I do an update (insert a record) and then roll back the transaction.
If I look at the database with EseDatabaseView I can see the "uncommitted" record listed in the table. Going through esent.dll again doesn't give me the record.
So, using the esent API the record is not committed (and not visible), while using EDV (which I guess reads directly from the file) the record is there.
Is this "normal" behaviour? Does ESENT "always" write the record into the file and, if it is not committed, simply not read it? Is it a bug in EDV that it shows the record? Or am I missing something here?

As in most databases, esent uses transactions. They are stored in the LOG files that you can find in the same directory as the EDB file. If you want the data in the transaction LOG files to be applied to the database, you need to replay them into it (soft recovery). Use the following command to do that, where e00 is the log file base name:
eseutil /r e00
More information about esent's transaction log:
http://support.microsoft.com/kb/240145/en-us

Judging by the description of EseDatabaseView ('esent.dll (The dll file of Extensible Storage Engine) is not required to read the database.'), it sounds like they try to read the database file directly, not using the ESE APIs.
Was the database shut down cleanly first? Confirm with esentutl -mh [database name], which dumps the database header. Look for a state of 'Dirty Shutdown' or 'Clean Shutdown'.
Dirty Shutdown: You may see transactions that have not been committed (or rolled back). You need the information in the transaction log files to bring it to a clean state (see the recovery example after this list).
Clean Shutdown: Everything is in a clean state. You could potentially delete the log files if you want.
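A minimal sketch of running that soft recovery, assuming the default 'e00' log base name (check the actual prefix of your .log files first):
esentutl /r e00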
My guess is that the database was shut down in a dirty state, and the tool is reading the data that hasn't yet been rolled back.
-martin

Related

Is there a way to show everything that was changed in a PostgreSQL database during a transaction?

I often have to execute complex sql scripts in a single transaction on a large PostgreSQL database and I would like to verify everything that was changed during the transaction.
Verifying each single entry on each table "by hand" would take ages.
Dumping the database before and after the script to plain sql and using diff on the dumps isn't really an option since each dump would be about 50G of data.
Is there a way to show all the data that was added, deleted or modified during a single transaction?
What you are looking for is one of the most searched-for topics when it comes to capturing database changes; it is essentially a form of version control.
As far as I know, sadly there is no built-in approach for this in PostgreSQL (or MySQL). But you can work around it by adding triggers for the operations you use most.
You can create some audit schemas and tables to capture the rows that are updated, created, or deleted; a minimal sketch follows below.
In this way you can achieve what you want. I know this process is fully manual, but it is really effective.
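A minimal sketch of that trigger-based approach, assuming PostgreSQL 11+ and a hypothetical table named mytable (the audit table, function and trigger names are placeholders):

CREATE TABLE mytable_audit (
    audit_id   bigserial PRIMARY KEY,
    changed_at timestamptz NOT NULL DEFAULT now(),
    operation  text NOT NULL,   -- 'INSERT', 'UPDATE' or 'DELETE'
    old_row    jsonb,
    new_row    jsonb
);

CREATE OR REPLACE FUNCTION mytable_audit_fn() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO mytable_audit (operation, new_row) VALUES (TG_OP, to_jsonb(NEW));
    ELSIF TG_OP = 'UPDATE' THEN
        INSERT INTO mytable_audit (operation, old_row, new_row) VALUES (TG_OP, to_jsonb(OLD), to_jsonb(NEW));
    ELSE  -- DELETE
        INSERT INTO mytable_audit (operation, old_row) VALUES (TG_OP, to_jsonb(OLD));
    END IF;
    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER mytable_audit_trg
AFTER INSERT OR UPDATE OR DELETE ON mytable
FOR EACH ROW EXECUTE FUNCTION mytable_audit_fn();   -- use EXECUTE PROCEDURE on PostgreSQL 10 and older

After the script runs, SELECT * FROM mytable_audit ORDER BY changed_at shows what was touched. Note that a ROLLBACK of the script also rolls back the audit rows, since the trigger runs in the same transaction.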
If you need to analyze the script's behaviour only sporadically, the easiest approach is to change the server configuration parameter log_min_duration_statement to 0 and then back to whatever value it had once the analysis is done. All of the script's activity will then be written to the server log.
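A minimal sketch of toggling that parameter from SQL, assuming PostgreSQL 9.4+ (for ALTER SYSTEM) and superuser rights:

ALTER SYSTEM SET log_min_duration_statement = 0;   -- log every completed statement
SELECT pg_reload_conf();
-- ... run the script, then inspect the server log ...
ALTER SYSTEM RESET log_min_duration_statement;     -- revert to the previously configured value
SELECT pg_reload_conf();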
This approach is not suitable if your storage is not prepared to accommodate this amount of data, or for systems in which you don't want sensitive client data to be written to a plain-text log file.

Is it possible that the postgres Write-Ahead Log gets DOUBLE RE-APPLIED?

I found this quote here:
PostgreSQL is one of the databases relying on write-ahead log (WAL) –
all changes are written to a log (a stream of changes) first, and only
then to the data files. That provides durability, because in case of a
crash the database may use WAL to perform recovery – read the changes
from WAL and re-apply them on data files.
from this article https://blog.2ndquadrant.com/basics-of-tuning-checkpoints/
Let's say there is a WAL file containing the following change:
UPDATE page SET view_count = view_count + 1;
I can imagine the case where postgres has already applied this WAL record to the database,
but it crashes right after applying it.
The database hasn't updated the latest WAL position nor deleted the WAL log file yet.
When the database comes back up, it will do recovery and re-apply this WAL record again, won't it?
And would the final value in the database then become view_count + 2?
Please advise.
Such a situation is called a partial page write. PostgreSQL has a configuration option to protect against this problem, full_page_writes. It is enabled by default:
When this parameter is on, the PostgreSQL server writes the entire content of each disk page to WAL during the first modification of that page after a checkpoint. This is needed because a page write that is in process during an operating system crash might be only partially completed, leading to an on-disk page that contains a mix of old and new data.
Changes are replayed by restoring the page copy instead of redoing the update.
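If you want to check the setting on your own server, something like this should work (the ALTER SYSTEM part is only needed if it was ever turned off):

SHOW full_page_writes;                      -- expected: on
ALTER SYSTEM SET full_page_writes = on;     -- re-enable if it had been disabled
SELECT pg_reload_conf();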

How to rollback an update in PostgreSQL

While editing some records in my PostgreSQL database using sql in the terminal (in ubuntu lucid), I made a wrong update.
Instead of -
update mytable set start_time='13:06:00' where id=123;
I typed -
update mytable set start_time='13:06:00';
So, all records now have the same start_time value.
Is there a way to undo this change? There are some 500+ records in the table, and I do not know what the start_time value for each record was.
Is it lost forever?
I'm assuming it was a transaction that's already committed? If so, that's what "commit" means: you can't go back.
Some data may be recoverable if you're lucky. Stop the database NOW.
Here's an answer I wrote on the same topic earlier. I hope it's helpful.
This might be too: Recover deleted rows in postgresql.
Unless the data is absolutely critical, just restore from backups, it'll be lots easier and less painful. If you didn't have backups, consider yourself soundly thwacked.
If you catch the mistake and immediately bring down any applications using the database and take it offline, you can potentially use Point-in-Time Recovery (PITR) to replay your Write Ahead Log (WAL) files up to, but not including, the moment when the errant transaction was made. This would return the database to the state it was in prior, thus effectively 'undoing' that transaction.
As an approach for a production application database it has a number of obvious limitations, but there are circumstances in which PITR may be the best option available, especially when critical data loss has occurred. However, it is of no value if archiving was not already configured before the corruption event.
https://www.postgresql.org/docs/current/static/continuous-archiving.html
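A minimal sketch of the recovery settings involved, assuming WAL archiving was already enabled; the path and timestamp below are placeholders (on PostgreSQL 12+ these go in postgresql.conf plus an empty recovery.signal file in the data directory, on older versions in recovery.conf):

restore_command = 'cp /mnt/server/archivedir/%f "%p"'   # how to fetch archived WAL segments
recovery_target_time = '2021-05-04 13:05:59'            # a point just before the errant UPDATE
recovery_target_inclusive = false                       # stop before, not after, the target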
Similar capabilities exist with other relational database engines.

How can I audit with the Microsoft SQL Server LDF file?

We need an audit log in the product that we are creating. We use SQL Server 2008 R2. I learned that the LDF file keeps a complete log of all transactions that were made*.
I've found ApexSQL Log; this tool analyses the LDF file and provides a GUI. It's a great demonstration of what's possible, but it's expensive. More info: http://www.apexsql.com/sql_tools_log.aspx
Do you know of other programs that can analyse LDF files? Or perhaps other methods to provide audit-trail functionality? I know that it's possible to create triggers, but if it isn't necessary to add things to my database schema then I would rather not do it.
*Only if you select the full recovery model.
How about the new Change Data Capture (CDC) functionality in 2008 R2? Doesn't that serve your purpose?
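A minimal sketch of enabling CDC, assuming a hypothetical database MyDatabase and table dbo.MyTable (note that CDC in 2008 R2 requires Enterprise or Developer edition and that SQL Server Agent is running):

USE MyDatabase;
EXEC sys.sp_cdc_enable_db;
EXEC sys.sp_cdc_enable_table
     @source_schema = N'dbo',
     @source_name   = N'MyTable',
     @role_name     = NULL;
-- Changes can then be read back through cdc.fn_cdc_get_all_changes_dbo_MyTable(@from_lsn, @to_lsn, N'all').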
When it comes to the information stored in an LDF file, make sure to form a full log chain. A log chain is a continuous sequence of transaction log backups: it starts with a full database backup and continues with all subsequent log backups up through the auditing point. If the chain is broken, only the transactions in the logs up to the last backup before the missing one can be shown with full information (e.g. a schema and object name, or a row history).
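For example, a chain could be kept with something like this (database name and paths are placeholders):

BACKUP DATABASE MyDatabase TO DISK = N'D:\Backup\MyDatabase_full.bak';   -- starts the chain
BACKUP LOG MyDatabase TO DISK = N'D:\Backup\MyDatabase_log_001.trn';     -- scheduled log backups continue it
BACKUP LOG MyDatabase TO DISK = N'D:\Backup\MyDatabase_log_002.trn';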
Unlike INSERT and DELETE operations, which are fully logged in the LDF files, UPDATE operations are logged minimally: only the changes that are made are logged, not the old and new values. When logging UPDATE operations, SQL Server doesn't log complete before and after row states, only the incremental change that occurred to the row. For example, if the word "log" was updated to the word "blog", SQL Server will, in the general case, only log the addition of the letter "b" at index 0. This is enough for its purpose of ensuring ACID, but not enough to easily show before and after states of the row. So, in order to understand what change really occurred, you have to reconstruct the context in which the change occurred from the rest of the transaction log and/or backups and the online database data.

Database last updated?

I'm working with SQL 2000 and I need to determine which of these databases are actually being used.
Is there a SQL script I can use to tell me the last time a database was updated? Read? Etc.?
I Googled it, but came up empty.
Edit: the following targets the issue of finding, post-facto, the last access date. With regards to figuring out who is using which databases, this can be monitored definitively with the right filters in SQL Profiler. Beware, however, that profiler traces can get quite big (and hence slow/hard to analyze) when the filters are not adequate.
Changes to the database schema, i.e. addition of tables, columns, triggers and other such objects, typically leave "dated" tracks in the system tables/views (I can provide more detail about that if need be).
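On SQL 2000, for example, the creation dates of such objects can be pulled from sysobjects (this only covers object creation, not later data changes):

SELECT name, xtype, crdate
FROM sysobjects
WHERE xtype IN ('U', 'P', 'TR')   -- user tables, stored procedures, triggers
ORDER BY crdate DESC;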
However, unless the data itself includes timestamps of sorts, there are typically very few sure-fire ways of knowing when data was changed, short of a recovery model that keeps all such changes in the log. In that case you need some tools to "decompile" the log data...
With regards to detecting "read" activity... A tough one. There may be some computer-forensics-like tricks, but again, no easy solution I'm afraid (beyond the ability to see in the server activity the very last query for all still-active connections; obviously a very transient thing ;-) )
I typically run the profiler if I suspect the database is actually used. If there is no activity, then simply set it to read-only or offline.
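If a database turns out to be unused, something along these lines would park it (the database name is a placeholder):

ALTER DATABASE MyDatabase SET READ_ONLY WITH ROLLBACK IMMEDIATE;
-- or take it out of service entirely:
ALTER DATABASE MyDatabase SET OFFLINE WITH ROLLBACK IMMEDIATE;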
You can use a transaction log reader to check when data in a database was last modified.
With SQL 2000, I do not know of a way to know when the data was read.
What you can do is put a trigger on the login to the database, track when the login is successful, and track associated variables to find out who / what application is using the DB.
If your database is fully logged, create a new transaction log backup and check its size. The log backup will have a fixed, small length if no changes were made to the database since the previous transaction log backup was taken, and it will be larger if there were changes.
This is not a very exact method, but it is easy to check and might work for you.
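A rough sketch of that check (database name and path are placeholders); the sizes of past log backups can then be compared in msdb:

BACKUP LOG MyDatabase TO DISK = N'D:\Backup\MyDatabase_activity_check.trn';

SELECT TOP 5 backup_start_date, backup_size
FROM msdb.dbo.backupset
WHERE database_name = 'MyDatabase' AND type = 'L'   -- 'L' = transaction log backup
ORDER BY backup_start_date DESC;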