tsql query kill scenario - tsql

In our webapp, we have lots of queries running. Most of them reading data but some update queries with high priority might come. Since, we'd like to cancel read queries but when using KILL, I'd like the read query to return certain dataset result or execution result upon receiving cancel.
My intention is to mimic the behavior of signal in C programs for which a signal handler is invoked upon receiving a kill signal.
Is there any method to define an asynchrnous KILL signal handler for SPs?

This is not a fully tested answer. But it is a bit more than just a comment.
One is to have dirty read (with nolock).
This part is tested I do this all time.
Build a large scalable app you need to resort to this and manage it.
A dirty read will not block an update.
You can get that - a dirty read.
A lot of people think a dirty read may get corrupt data.
If you are updating smith to johnson the select is not going to get smison.
The select is going to get smith and it will be immediately stale.
But how is that worse then taking a read lock?
The read get smith and blocks the update.
Once the read locks are cleared it is updated.
I would contend that blocking an update is also stale data.
If you are using reader I think you could pass the same cancellation token to each select and then just cancel the one token.
But if may not process the CancellationToken until it read the row so it may not cancel a query a long running query that has not yet returned any rows.
DbDataReader.ReadAsync Method (CancellationToken)
Or if you are not using reader look at
SqlCommand.Cancel
As far as getting cancel to return alternate data. I doubt SQL is going to do that.

Related

How to investigate time required to obtain lock - and why - within a procedure

I am stumped on an issue I am having. The true context is rather complicated, but I can boil it down to these functional points (everything else is not related to the problematic table):
I have a trigger function that contains several SELECTs and then an UPDATE
The update takes an unreasonable amount of time to execute ("unreasonable" = > 1.4s)
The same exact queries when run outside the trigger (for the same rows, parameters, etc.) do not have any issues (i.e. they execute in under 1-2ms)
I am pretty sure that indexes, etc., are working as necessary; i.e. there shouldn't be any issues.
There are no circular triggers
There is on trigger on the destination table, but even with that removed the behavior is the same.
I have done many tests to no avail, but these are pretty meaningful:
when the update is replaced with a SELECT, the response time is fast, as expected
when the update is replaced with a SELECT... FOR UPDATE, the response time is slow, the same as the update
^ this (as well as other things) has led me to possibly believe that the delay is spent waiting to achieve a lock
No other transactions are really happening on that table. I am truly bewildered.
Server context: This is being run in AWS/RDS on db.m5.xlarge.
What I am looking for is whether there is a way to get some information about locks that are happening mid-transaction or possibly even a history of acquired locks? Or anything else that can give me insight into what is causing the delay that seems so closely related to acquiring a lock on that table.
Unfortunately, just to make everything even more frustrating, I cannot replicate the issue when I attempt to use EXPLAIN in the function body. The only way to do this (that I know of) is to use the EXECUTE... syntax with a query string. That doesn't have a delay - its also useless for the trigger.

How to create warning message in trigger?

Is it possible to create a warning message in a trigger in Firebird 2.5?
I know I can create an exception message which will stop the user from saving the record changes, but in this instance I don't mind if the user continues.
Could I call a procedure that generates the message?
There is no mechanism in Firebird to produce warnings in PSQL code, you can only raise exceptions, which in triggers will result in the effect of the executed statement that fired the trigger to be undone.
In short, this is not possible.
There are workarounds possible, but those would require 'external' protocols, like, for example, inserting the warning message into a global temporary table, requiring the calling code to explicitly select from that temporary table after execution.
SQL model does provide putting query on pause and then waiting for extra input from client to either unfreeze it or fail it. SQL is not user-interactive service and there is no confirmation dialogs. You have to rethink your application design.
One possible avenue, nominally staying withing 2-tier client-server framework, would be creating temporary tabless for all the data you want to save (for example transaction-scope GTTs), and then have TWO stored procedures. One SP would be sanity-checking and returning list of warnings, if any. Another SP then would dump the data from GTTs to main, persistent tables without doing those checks.
Your client app would select warnings from the check-SP first, if it returns any then show them to the user, then either call save-SP and commit, or rollback without calling save-SP.
This is abusing C/S idea, so there would be dragons. First of all, you would have to have several GTTs and two SPs for E-V-E-R-Y pausable data saving in your app. And that can be a lot.
Also, notice, that database data may change after you called check-SP and before you called save-SP. Becuse some OTHER application running elsewhere could be changing and committing data during that pause. Especially if you transaction was of READ COMMMITTED kind. But with SNAPSHOT tx too.
Better approach would be to drop C/S scheme and go to 3-tier model, AKA multi-tier, AKA "Application Server". That way your client app sends the "briefcase" of data to the app-server, it would be app-server (not SQL triggers) doing all the data validation, and then it would be saving it to data storage backend, SQL or any other.
There, of course, still would be that problem, that data could had been changed by other users, why you paused one user and waited him to read and decide. But you would have more flexibility in app-server on data reconcilation, than you would have with plain SQL.

Executing a trigger AFTER the completion of a transaction

In PostgreSQL, are DEFERRED triggers executed before (within) the completion of the transaction or just after it?
The documentation says:
DEFERRABLE
NOT DEFERRABLE
This controls whether the constraint can be deferred. A constraint
that is not deferrable will be checked immediately after every
command. Checking of constraints that are deferrable can be postponed
until the end of the transaction (using the SET CONSTRAINTS command).
It doesn't specify if it is still inside the transaction or out. My personal experience says that it is inside the transaction and I need it to be outside!
Are DEFERRED (or INITIALLY DEFERRED) triggers executed inside of the transaction? And if they are, how can I postpone their execution to the time when the transaction is completed?
To give you a hint what I'm after, I'm using pg_notify and RabbitMQ (PostgreSQL LISTEN Exchange) to send out messages. I process such messages in an external application. Right now I have a trigger which notifies the external app of the newly inserted records by including the record's id in the message. But in a non-deterministic way, once in a while, when I try to select a record by its id at hand, the record can not be found. That's because the transaction is not complete yet and the record is not actually added to the table. If I can only postpone the execution of the trigger for after the completion of the transaction, everything will work out.
In order to get better answers let me explain the situation even closer to the real world. The actual scenario is a little more complicated than what I explained before. The source code can be found here if anyone's interested. Becuase of reasons that I'm not gonna dig into, I have to send the notification from another database so the notification is actually sent like:
PERFORM * FROM dblink('hq','SELECT pg_notify(''' || channel || ''', ''' || payload || ''')');
Which I'm sure makes the whole situation much more complicated.
Triggers (including all sorts of deferred triggers) fire inside the transaction.
But that is not the problem here, because notifications are delivered between transactions anyway.
The manual on NOTIFY:
NOTIFY interacts with SQL transactions in some important ways.
Firstly, if a NOTIFY is executed inside a transaction, the notify
events are not delivered until and unless the transaction is
committed. This is appropriate, since if the transaction is aborted,
all the commands within it have had no effect, including NOTIFY. But
it can be disconcerting if one is expecting the notification events to
be delivered immediately. Secondly, if a listening session receives a
notification signal while it is within a transaction, the notification
event will not be delivered to its connected client until just after
the transaction is completed (either committed or aborted). Again, the
reasoning is that if a notification were delivered within a
transaction that was later aborted, one would want the notification to
be undone somehow — but the server cannot "take back" a notification
once it has sent it to the client. So notification events are only
delivered between transactions. The upshot of this is that
applications using NOTIFY for real-time signaling should try to keep
their transactions short.
Bold emphasis mine.
pg_notify() is just a convenient wrapper function for the SQL NOTIFY command.
If some rows cannot be found after a notification has been received, there must be a different cause! Go find it. Likely candidates:
Concurrent transactions interfering
Triggers doing something more or different than you think they do.
All sorts of programming errors.
Either way, like the manual suggests, keep transactions that send notifications short.
dblink
Update: Transaction control in a PROCEDURE or DO statement in Postgres 11 or later makes this a lot simpler. Just COMMIT; to (also) send waiting notifications.
Original answer (mostly for Postgres 10 or older):
PERFORM * FROM dblink('hq','SELECT pg_notify(''' || channel || ''', ''' || payload || ''')');
... which should be rewritten with format() to simplify and make the syntax secure:
PRERFORM dblink('hq', format('NOTIFY %I, %L', channel, payload));
dblink is a game-changer here, because it opens a separate transaction in the other database. This is sometimes used to fake autonomous transaction.
Does Postgres support nested or autonomous transactions?
How do I do large non-blocking updates in PostgreSQL?
dblink() waits for the remote command to finish. So the remote transaction will most probably commit first. The manual:
The function returns the row(s) produced by the query.
If you can send notification from the same transaction instead, that would be a clean solution.
Workaround for dblink
If notifications have to be sent from a different transaction, there is a workaround with dblink_send_query():
dblink_send_query sends a query to be executed asynchronously, that is, without immediately waiting for the result.
DO -- or plpgsql function
$$
BEGIN
-- do stuff
PERFORM dblink_connect ('hq', 'your_connstr_or_foreign_server_here');
PERFORM dblink_send_query('con1', format('SELECT pg_sleep(3); NOTIFY %I, %L ', 'Channel', 'payload'));
PERFORM dblink_disconnect('con1');
END
$$;
If you do this right before the end of the transaction, your local transaction gets 3 seconds (pg_sleep(3)) head start to commit. Chose an appropriate number of seconds.
There is an inherent uncertainty to this approach, since you get no error message if anything goes wrong. For a secure solution you need a different design. After successfully sending the command, chances for it to still fail are extremely slim, though. The chance that successful notifications are missed seem much higher, but that's built into your current solution already.
Safe alternative
A safer alternative would be to write to a queue table and poll it like discussed in #Bohemian's answer. This related answer demonstrates how to poll safely:
Postgres UPDATE … LIMIT 1
I'm posting this as an answer, assuming the actual problem you are trying to solve is deferring execution of an external process until after the transaction is completed (rather than the X-Y "problem" you're trying to solve using trigger Kung Fu).
Having the database tell an app to do something is a broken pattern. It's broken because:
There's no fallback if the app doesn't get the message, eg because it's down, network explodes, whatever. Even the app replying with an acknowledgment (which it can't), wouldn't fix this problem (see next point)
There's no sensible way to retry the work if the app gets the message but fails to complete it (for any of lots of reasons)
In contrast, using the database as a persistant queue, and having the app poll it for work, and take the work off the queue when work is complete, has none of the above problems.
There are lots of ways to achieve this. The one I prefer is to have some process (usually trigger on insert, update and delete) put data into a "queue" table. Have another process poll that table for work to do, and delete from the table when work is complete.
It also adds some other benefits:
The production and consumption of work is decoupled, which means you can safely kill and restart your app (which must happen from time to time, eg deploying) - the queue table will happily grow while the app is down, and will drain when the app is back up. You can even replace the app with an entirely new one
If for whatever reason you want to initiate processing of certain items, you can just manually insert rows into the queue table. I used this technique myself to initiate the processing of all items in a database that needed initialising by being put on the queue once. Importantly, I didn't need to do a perfunctory update to every row just to fire the trigger
Getting to your question, a slight delay can be introduced by adding a timestamp column to the queue table and having the poll query only select rows that are older than (say) 1 second, which gives the database time to complete its transaction
You can't overload the app. The app will read only as much work as it can handle. If your queue is growing, you need a faster app, or more apps If multiple consumers are operating, concurrency can be solved by (for example) adding a "token" column to the queue table
Queues that are backed by database tables is the basis of how persistent queues are implemented in commercial grade queue-based platforms, so the pattern is well tested, used and understood.
Leave the database to do what it does best, and the only thing it does well: Manage data. Don't try to make your database server into an app server.

Could I save Postgres transaction and continue work with db within it later

I know about prepared transaction in Postgres, but seems you can just commit or rollback it later. You cannot even view the transaction's db state before you've committed it. Is any way to save transaction for later use?
What I want to achieve actually is a preview (and correcting) of some changes in db (changes are imports from csv file, so user need to see preview before apply it). I want to make changes, add some changes later, see full state of db and apply it (certainly, commit transaction)
I cannot find a very good reference in docs, but I have a very strong feeling that the answer is: No, you cannot do that.
It would mean that when you "save" the transaction, the database would basically have to maintain all of its locks in place for an indefinite amount of time. Even if it was possible, it would mean horrible failure modes and trouble on all fronts.
For the pattern that you are describing, I would use two separate transactions. Import to a staging table and show that to user (or import to the main table but mark rows as "unapproved"). If user approves, in another transactions move or update these rows.
You can always end up in a situation where user can simply leave or crash without clicking "OK" or "Cancel". If what you're describing was possible, you would end up with a hung transaction holding all these resources. In my proposed solution you end up with wasteful rows in "staging" table that you may still show to user later or remove.
You may want to read up on persistence saga. This is actually a very simple example of a well known and researched problem.
To make the long story short, this pattern breaks down a long-running process like yours into smaller operations that are applied and persisted in some way in separate transactions. If any of them happens to fail (or does not occur as expected), you have compensating actions that usually undo what the steps executed so far have done (e.g. by throwing away stale/irrelevant data).
Here's a decent introduction:
https://blog.couchbase.com/saga-pattern-implement-business-transactions-using-microservices-part/#:~:text=The%20SAGA%20Pattern,completion%20of%20the%20previous%20one.
http://vasters.com/clemensv/2012/09/01/Sagas.aspx
This concept was formally introduced in the 80s, but is well alive and relevant today.

Getting past Salesforce trigger governors

I'm trying to write an "after update" trigger that does a batch update on all child records of the record that has just been updated. This needs to be able to handle 15k+ child records at a time. Unfortunately, the limit appears to be 100, which is so far below my needs it's not even close to acceptable. I haven't tried splitting the records into batches of 100 each, since this will still put me at a cap of 10k updates per trigger execution. (Maybe I could just daisy-chain triggers together? ugh.)
Does anyone know what series of hoops I can jump through to overcome this limitation?
Edit: I tried calling following #future function in my trigger, but it never updates the child records:
global class ParentChildBulkUpdater
{
#future
public static void UpdateChildDistributors(String parentId) {
Account[] children = [SELECT Id FROM Account WHERE ParentId = :parentId];
for(Account child : children)
child.Site = 'Bulk Updater Fired';
update children;
}
}
The best (and easiest) route to take with this problem is to use Batch Apex, you can create a batch class and fire it from the trigger. Like #future it runs in a separate thread, but it can process up to 50,000,000 records!
You'll need to pass some information to your batch class before using database.executeBatch so that it has the list of parent IDs to work with, or you could just get all of the accounts of course ;)
I've only just noticed how old this question is but hopefully this answer will help others.
It's worst than that, you're not even going to be able to get those 15k records in the first place, because there is a 1,000 row query limit within a trigger (This scales to the number of rows the trigger is being called for, but that probably doesnt help)
I guess your only way to do it is with the #future tag - read up on that in the docs. It gives you much higher limits. Although, you can only call so many of those in a day - so you may need to somehow keep track of which parent objects have their children updating, and then process that offline.
A final option may be to use the API via some external tool. But you'll still have to make sure everything in your code is batched up.
I thought these limits were draconian at first, but actually you can do a hell of a lot within them if you batch things correctly, we regularly update 1,000's of rows from triggers. And from an architectural point of view, much more than that and you're really talking batch processing anyway which isnt normally activated by a trigger. One things for sure - they make you jump through hoops to do it.
I think Codek is right, going the API / external tool route is a good way to go. The governor limits still apply, but are much less strict with API calls. Salesforce recently revamped their DataLoader tool, so that might be something to look into.
Another thing you could try is using a Workflow rule with an Outbound Message to call a web service on your end. Just send over the parent object and let a process on your end handle the child record updates via the API. One thing to be aware of with outbound messages, it is best to queue up the process on your end somehow, and immediately respond to Salesforce. Otherwise Salesforce will resend the message.
#future doesn't work (does not update records at all)? Weird. Did you try using your function in automated test? It should work and and the annotation should be ignored (during the test it will be executed instantly, test methods have higher limits). I suggest you investigate this a bit more, it seems like best solution to what you want to accomplish.
Also - maybe try to call it from your class, not the trigger?
Daisy-chaining triggers together will not work, I've tried it in the past.
Your last option might be batch Apex (from Winter'10 release so all organisations should have it by now). It's meant for mass data update/validation jobs, things you typically run overnight in normal databases (it can be scheduled). See http://www.salesforce.com/community/winter10/custom-cloud/program-cloud-logic/batch-code.jsp and release notes PDF.
I believe in version 18 of the API the 1000 limit has been removed. (so the documentations says but in some cases I still hit a limit)
So you may be able to use batch apex. With a single APEX update statement
Something like:
List children = new List{};
for(childObect__c c : [SELECT ....]) {
c.foo__c = 'bar';
children.add(c);
}
update(children);;
Besure you bulkify your tigger also see http://sfdc.arrowpointe.com/2008/09/13/bulkifying-a-trigger-an-example/
Maybe a change to your data model is the better option here. Think of creating a formula on the children object where you access the data from the parent. This would be far more efficient probably.