what happens to my dataset in case of unexpected failure - ado.net

i know this has been asked here. But my question is slightly different. When the dataset was designed keeping the disconnected principle in mind, what was provided as a feature which would handle unexpected termination of the application, say a power failure or a windows hang or system exception leading to restart. Say the user has entered some 100 rows and it is modified at the dataset alone. Usually the dataset is updated at the application close or at a timely period.
In old times which programming using vb 6.0 all interaction used to take place directly with the database, thus each successful transaction was committing itself automatically. How can that be done using datasets?

DataSets are never for direct access to database, they are a disconnected model only. There is no intent that they be able to recover from machine failures.
If you want to work live against the database you need to use DataReaders and issue DbCommands against the database live for changes. This of course will increase your load on the database server though.
You have to balance the two for most applications. If you know a user just entered vital data as a new row, execute an insert command to the database, and put a copy in your local cached DataSet. Then your local queries can run against the disconnected data, and inserts are stored immediately.

A DataSet can be serialized very easily, so you could implement your own regular backup to disk by using serialization of the DataSet to the filesystem. This will give you some protection, but you will have to write your own code to check for any data that your application may have saved to disk previously and so on...
You could also ignore DataSets and use SqlDataReaders and SqlCommands for the same sort of 'direct access to the database' you are describing.

Related

Is there a way to show everything that was changed in a PostgreSQL database during a transaction?

I often have to execute complex sql scripts in a single transaction on a large PostgreSQL database and I would like to verify everything that was changed during the transaction.
Verifying each single entry on each table "by hand" would take ages.
Dumping the database before and after the script to plain sql and using diff on the dumps isn't really an option since each dump would be about 50G of data.
Is there a way to show all the data that was added, deleted or modified during a single transaction?
Dude, What are you looking for is the most searchable thing on the internet when it comes to capturing Database changes. It is a kind of version control we can say.
But as long as I know, sadly there are no in-built approaches are available in PostgreSQL or MySql. But you can overcome it by setting/adding some triggers for your most usable operations.
You can create some backup schemas, and tables to capture your changes that are changed(updated), created, or deleted.
In this way you can achieve what you want. I know this process is fully manual, But really effective.
If you need to analyze the script's behaviour only sporadically, then the easiest approach would be to change server configuration parameter log_min_duration_statement to 0 and then back to any value it had before the analysis. Then all of the script activity will be written to the instance log.
This approach is not suitable if your storage is not prepared to accommodate this amount of data, or for systems in which you don't want sensitive client data to be written to a plain-text log file.

Postgresql: How to access data of a transaction with another connection

Goodmorning,
I use Postgresql for my database engine and some operations use transactions to be sure that everything goes fine.
Sometimes I need to test some specific datas at "that poin" of my application but these operations often make al lot changes in the database and it's not easy to reproduce "all the changes made inside the transaction" with another connection (like using a PgAdmin query tool) outside the transaction to test the single aspect that i need.
One way to test the specific data, is to load the data into a variable, and then debug-it, but i was searching for a more "wide solution".
So that's the question: Is there a way to access the data of a specific connection (which is in transaction) with another connection/query_tool?
Thanks,
Attilio
In postgresql (actually) there's no way to do it, full stop.

PostgreSQL blocking on too many inserts

I am working on a research platform that reads relevant Twitter feeds via the Twitter API and stores them in a PostgreSQL database for future analysis. Middleware is Perl, and the server is an HP ML310 with 8GB RAM running Debian linux.
The problem is that the twitter feed can be quite large (many entries per second), and I can't afford to wait for the insert before returning to wait for the next tweet. So what I've done is to use a fork() so each tweet gets a new process to insert into the database and the listener and return quickly to grab the next tweet. However, because each of these processes effectively opens a new connection to the PostgreSQL backend, the system never catches up with its twitter feed.
I am open to using a connection pooling suggestion and/or to upgrading hardware if necessary to make this work, but would appreciate any advice. Is this likely RAM bound, or is there configuration or software approaches I can try to make the system sufficiently speedy?
If you open and close a new connection for each insert, that is going to hurt big time. You should use a connection pooler instead. Creating a new database connection is not a lightweight thing to do.
Doing a a fork() for each insert is probably not such a good idea either. Can't you create one process that simply takes care of the inserts and listens on a socket, or scans a directory or something like that and another process signalling the insert process (a classical producer/consumer pattern). Or use some kind of message queue (I don't know Perl, so I can't say what kind of tools are available there).
When doing bulk inserts do them in a single transaction, sending the commit at the end. Do not commit each insert. Another option is to write the rows into a text file and then use COPY to insert them into the database (it doesn't get faster than that).
You can also tune the PostgreSQL server a bit. If you can afford to lose some transactions in case of a system crash, you might want to turn synchronous_commit off.
If you can rebuild the table from scratch anytime (e.g. by re-inserting the tweets), you might also want to make that table an "unlogged" table. It is faster than a regular table in writing, but if Postgres is not shown down cleanly, you lose all the data in the table.
Use COPY command.
One script reads Tweeter and appends strings to the CSV file on disk.
Other scripts looking for CSV file on disk, renamed this file file and started COPY command from this file.

Database last updated?

I'm working with SQL 2000 and I need to determine which of these databases are actually being used.
Is there a SQL script I can used to tell me the last time a database was updated? Read? Etc?
I Googled it, but came up empty.
Edit: the following targets issue of finding, post-facto, the last access date. With regards to figuring out who is using which databases, this can definitively monitored with the right filters in the SQL profiler. Beware however that profiler traces can get quite big (and hence slow/hard to analyze) when the filters are not adequate.
Changes to the database schema, i.e. addition of table, columns, triggers and other such objects typically leaves "dated" tracks in the system tables/views (can provide more detail about that if need be).
However, and unless the data itself includes timestamps of sorts, there are typically very few sure-fire ways of knowing when data was changed, unless the recovery model involves keeping all such changes to the Log. In that case you need some tools to "decompile" the log data...
With regards to detecting "read" activity... A tough one. There may be some computer-forensic like tricks, but again, no easy solution I'm afraid (beyond the ability to see in server activity the very last query for all still active connections; obviously a very transient thing ;-) )
I typically run the profiler if I suspect the database is actually used. If there is no activity, then simply set it to read-only or offline.
You can use a transaction log reader to check when data in a database was last modified.
With SQL 2000, I do not know of a way to know when the data was read.
What you can do is to put a trigger on the login to the database and track when the login is successful and track associated variables to find out who / what application is using the DB.
If your database is fully logged, create a new transaction log backup, and check it's size. The log backup will have a fixed small lengh, when there were no changes made to the database since the previous transaction log backup has been made, and it will be larger in case there were changes.
This is not a very exact method, but it can be easily checked, and might work for you.

SyBase SQL anywhere check if Synchronization is needed?

I have a Sybase SQL Anywhere 11.0.1 database that I am using to sync with an Oracle Consolidated Database.
I know that the SQL Anywhere database keeps track of all of the changes that are made to it so that it knows what to synchronize with the consolidated database. My question is whether or not there is a SQL command that will tell you if the database has changes to sync.
I have a mobile application and I want to show a little flag to the user anytime they have made changes to the handheld that need to be synced. I could just create another table to track that stuff myself but I would much rather just ping the database and ask it if it has changes that need to be synced.
There's nothing automatic to tell you that there is data to synchronize. In addition to Ben's suggestion, another idea would be to query the SYS.SYSSYNC table at the remote database to get an idea of whether there might be changes. The following statement returns a result set that shows a simple status of your last synchronization :
select ss.site_name, sp.publication_name, ss.log_sent,ss.progress
from sys.syssync ss, sys.syspublication sp
where ss.publication_id = sp.publication_id
and ss.publication_id is not null
and ss.site_name is not null
If progress < log_sent, then the status of the last synchronization is unknown. The last upload may or may not have been applied at the consolidated, because the upload was sent, but no response was received from the MobiLink server. In this case, suggesting a synch isn't a bad idea.
If progress = log_sent, then the last synch was successful. Knowing this, you could check the value of db_property('CurrentRedoPos'), which will return the current log offset of the remote database. If this value is significantly higher than the progress value, there have been many operations applied to the database since the last synchronization, so there's a good chance that there is data to synchronize. There are lots of reasons why even a large difference in progress and db_property('CurrentRedoPos') could result in no actual data needing synchronization.
The download from the ML Server is applied by dbmlsync after the progress value at the remote is updated by dbmlsync when the upload is confirmed by the ML Server. Operations applied in the download by dbmlsync are not synchronized back to the ML Server, so the entire offset range could just be the last download that was applied. This could be worked around by tracking the current log offset in the sp_hook_dbmlsync_end hook when the exit code value in the #hook_dict table value is zero. This would tell you the log offset of the database after the download was applied, and you could now compare the saved value with the current log offset.
All the operations in the transaction log could be operations on tables that are not synchronized.
All the operations in the transaction log could have been rolled back.
My solution is not ideal. Tracking the changes to synchronized tables yourself is the best solution, but I thought I could offer an alternative that might be OK for your needs, with the advantage that you are not triggering an extra action on every operation performed on a synchronized table.
The mobile database doesn't keep track of when the last sync was, the MobiLink server keeps all of that information in the MobiLink tables of the consolidated database.
Since synchronization only transfers necessary information, you could simply initiate a sync. If there's nothing to sync, then very little data will be used by your application.
As a side note, SQL Anywhere has its own SO clone which is monitored by Sybase engineers. If anyone knows for sure, it'll be them.
As of SQL Anywhere 17, SAP PM maps to a local Sybase database that contains a TTRANSACTION_UPLOAD table, so to determine if a synchronization is necessary we simply query this table to see if it has any records that need to be sync'd to the HANA consolidation database.