Asynchronous SQL procedure execution set and wait for completion - tsql

Say I have a large set of calls to a stored procedure to run, each with different parameters but independent of the others, so I want to make parallel/async calls. I use Service Broker to fire these all off, but the problem is that I want a neat way of knowing when they have all completed (or errored).
Is there a way to do this? I could just loop with waits, checking a result table for completion, but that isn't very "event triggered". I'm hoping for a nicer way to do this.
I have used Service Broker with queue code and processing based on this other answer: Remus' service broker queuing example

Good day Shiv,
As always, there are several ways to implement this requirement. One of them uses the following logic:
(1) Create two queues: one triggers the main SP that you want to execute asynchronously, and the other tracks whatever you want to execute after all the executions have ended.
(2) Whenever you create the message in the first queue, also create a matching message in the second queue. The second queue only tells us which executions have not ended yet (the first queue tells us which executions have started, since we consume and remove its message the moment we START the execution).
(3) Inside the SP that the first queue activates (this part executes synchronously):
(3.1) Execute the queries you need.
(3.2) Remove the matching message from the second queue (meaning the message is removed only after the queries have ended).
(3.3) Check whether any messages remain in the second queue. If there are none, then all the tasks have ended and you can execute your final step (a sketch follows below).
** Theoretically, instead of using a second queue you could store the state in a table, but a second queue should give better performance than updating a table each time an execution ends. Anyhow, it is worth testing the table option as well.
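Here is a minimal sketch of steps (3.2) and (3.3) inside the activated procedure. The queue and procedure names (TrackingQueue, dbo.FinalStep) are illustrative, and a real implementation also has to deal with concurrent activation and with ending conversations properly:

DECLARE @h UNIQUEIDENTIFIER;

-- (3.2) consume one tracking message: one execution has finished
-- (for pure counting, any message in the tracking queue will do)
RECEIVE TOP (1) @h = conversation_handle FROM dbo.TrackingQueue;
END CONVERSATION @h;

-- (3.3) if the tracking queue is now empty, every execution has finished
IF NOT EXISTS (SELECT 1 FROM dbo.TrackingQueue)
    EXEC dbo.FinalStep;   -- hypothetical "all done" step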

Related

Processing Groups of Results with Vertx - How to coordinate?

I have a job processing system where each job contains thousands of individual tasks that require different strategies to complete; together, the tasks make up the whole job. If all tasks complete, the job is marked as successfully completed and further steps are taken. If any task fails, the job must be marked as failed and other steps are taken. Likewise, if the job times out, it must be marked as failed and other steps are taken.
Once all of the results for a job have been received, the next job can be fetched. The next job shouldn't be fetched while a job is currently being processed.
Here is what the flow looks like:
The Job Polling Verticle publishes a job to the event bus, and the Job Processing Verticle publishes each task to the event bus. When the job strategy completes, it publishes the task result to the event bus.
The issue is that I don't know the right way to determine when all tasks have been completed in this model. All verticles are stateless, the Job Processing Verticle doesn't await any futures, and even if the Job Results Verticle were stateful, it doesn't know how many results it should expect.
The only way I can think of to do this would be a global stateful object, but I don't think that is good design.
Additionally, I need to know when a job has timed out, that is, when it has run longer than it should, so I can consider it failed, log it, and move on.
I could do this with the global state, but again I don't think that's the right solution.
Does this verticle pattern make sense for what I'm trying to do?
First, let me try to address your questions. Then I'll try to explain what problems this design has.
The issue is that I don't know the right way to determine when all tasks have been completed in this model. All verticles are stateless, the Job Processing Verticle doesn't await any futures, and even if the Job Results Verticle were stateful, it doesn't know how many results it should expect.
The solution could be a reference-counting verticle. Each worker should emit a start message on the event bus with the jobId when it starts, and an end message with the jobId when it completes. Even if you have fan-out (the cases where you don't know how many workers there are), the counting verticle will know. In your diagram, the "Job Post Processing Verticle" is a good candidate for this: it can maintain a counter and start the next job only when that counter reaches zero. That also helps you avoid actually sharing a memory reference.
Additionally, I need to know when a job has timed out, that is, when it has run longer than it should, so I can consider it failed, log it, and move on.
In the same verticle you can start a timer every time you get a new start message and cancel it when the matching end message arrives. If the timer fires first, cancel the current job and start again.
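A minimal sketch of such a counting verticle. The event bus addresses ("task.start", "task.end", "job.completed", "job.failed") and the 60-second watchdog are illustrative, not from your design:

import io.vertx.core.AbstractVerticle;

public class JobCounterVerticle extends AbstractVerticle {
  private int inFlight = 0;   // tasks started but not yet ended
  private long timerId = -1;

  @Override
  public void start() {
    vertx.eventBus().<String>consumer("task.start", msg -> {
      inFlight++;
      rearmTimeout(msg.body());              // the body carries the jobId
    });
    vertx.eventBus().<String>consumer("task.end", msg -> {
      if (--inFlight == 0) {                 // every started task has ended
        vertx.cancelTimer(timerId);
        vertx.eventBus().publish("job.completed", msg.body());
      } else {
        rearmTimeout(msg.body());
      }
    });
  }

  // Restart the watchdog; if no start/end message arrives within the
  // timeout, consider the job failed, log it, and move on.
  private void rearmTimeout(String jobId) {
    vertx.cancelTimer(timerId);
    timerId = vertx.setTimer(60_000, id -> {
      vertx.eventBus().publish("job.failed", jobId);
      inFlight = 0;
    });
  }
}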
Now, this solution will work, but the design has two main flaws. The first is that you seem to maintain all of your flow state in memory. If your application crashes, all progress is lost, and it's not clear how you would record it. Polling a Jobs table in the DB might actually be better, since your job execution is sequential anyway.
The second is that all of these timeouts and reference counts are a homemade implementation of structured concurrency. Maybe you should take a look at something like Kotlin coroutines, as they will handle many of these problems for you.

UVM shared variables

I have a question regarding UVM. Suppose I have a DUT with two interfaces, each with its own agent, generating transactions on the same clock. These transactions are handled by analysis imports (and write functions) on the scoreboard. My problem is that both of these transactions read/modify shared variables of the scoreboard.
My questions are:
1) Do I have to guarantee mutual exclusion explicitly through a semaphore? (I suppose yes.)
2) Is this, in general, a correct way to proceed?
3) And, the main problem: can the order of execution somehow be fixed?
Depending on that order, the values of the shared variables can change, creating inconsistency. Moreover, that order is fixed by the specification.
Thanks in advance.
While SystemVerilog tasks and functions do run concurrently, they do not run in parallel. It is important to understand the difference between parallelism and concurrency; it has been explained well here.
So while a SystemVerilog task or function may execute concurrently with another task or function, in reality it does not actually run at the same time (in the run-time context). The SystemVerilog scheduler keeps a list of all the tasks and functions that need to run at the same simulation time, and it executes them one by one (sequentially) on the same processor (concurrency), not together on multiple processors (parallelism). As a result, mutual exclusion is implicit, and you do not need semaphores on that account.
The sequence in which two such concurrent functions execute is not deterministic, but it is repeatable. So when you execute a testbench multiple times on the same simulator, the sequence of execution will be the same. But two different simulators (or two versions of the same simulator) could execute these functions in a different order.
If the specification requires a certain order of execution, you need to enforce that order by making one of the tasks/functions wait on the other. In your scoreboard example, since you are using analysis ports, you will have two "write" functions (perhaps declared with the uvm_analysis_imp_decl macro) executing concurrently. Since functions cannot wait, you can fork out join_none threads and make one thread wait on the other by introducing an event that is triggered at the conclusion of the first thread and waited on at the start of the second, as in the sketch below.
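A minimal sketch of that idea. The transaction type and the process_a/process_b helpers are illustrative placeholders, not from the question:

typedef uvm_sequence_item my_txn;  // stand-in transaction type for the sketch

`uvm_analysis_imp_decl(_a)
`uvm_analysis_imp_decl(_b)

class my_scoreboard extends uvm_scoreboard;
  `uvm_component_utils(my_scoreboard)

  uvm_analysis_imp_a #(my_txn, my_scoreboard) a_imp;
  uvm_analysis_imp_b #(my_txn, my_scoreboard) b_imp;
  event a_done;

  function new(string name, uvm_component parent);
    super.new(name, parent);
    a_imp = new("a_imp", this);
    b_imp = new("b_imp", this);
  endfunction

  // hypothetical helpers that actually touch the shared variables
  function void process_a(my_txn t); endfunction
  function void process_b(my_txn t); endfunction

  // interface A must update the shared state first, per the spec
  function void write_a(my_txn t);
    fork
      automatic my_txn tr = t;   // capture the handle for the spawned thread
      begin
        process_a(tr);
        -> a_done;               // signal that A's update is complete
      end
    join_none
  endfunction

  function void write_b(my_txn t);
    fork
      automatic my_txn tr = t;
      begin
        wait (a_done.triggered); // also sees a trigger earlier in this time step
        process_b(tr);
      end
    join_none
  endfunction
endclass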
This is a pretty difficult problem to address. If you get 2 transactions in the same time step, you have to be able to process them regardless of the order in which they get sent to your scoreboard. You can't know for sure which monitor will get triggered first. The only thing you can do is collect the transactions and at the end of the time step do your modeling/checking/etc.
Semaphores only help you if you have concurrent threads that take (simulation) time that are trying to access a shared resource. If you get things from an analysis port, then you get them in 0 time, so semaphores won't help you here.
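A sketch of that collect-then-check approach, inside the same kind of scoreboard as above (check_a/check_b are hypothetical checking routines, and #0 only defers past the current delta cycle; a real scoreboard might wait for a clock edge instead):

my_txn a_q[$], b_q[$];
event  got_txn;

function void write_a(my_txn t); a_q.push_back(t); ->got_txn; endfunction
function void write_b(my_txn t); b_q.push_back(t); ->got_txn; endfunction

task run_phase(uvm_phase phase);
  forever begin
    @got_txn;
    #0;  // let every write() of this time step run first
    // now process in the order the spec requires, not in arrival order
    while (a_q.size() > 0) check_a(a_q.pop_front());
    while (b_q.size() > 0) check_b(b_q.pop_front());
  end
endtask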
So, to my understanding, the answer is: the compiler/vendor/UVM cannot ensure the order of execution. If you need to enforce the order of things that happen in the same time step, you need to use a semaphore (or similar synchronization) correctly to make it work the way you want.
Another thing: only you yourself know which one must execute after the other when they occur at the same simulation time.
This is a classical race condition, where the result depends on the actual thread order.
First of all, you have to decide whether the write race is problematic for you and/or whether there is a priority order in this case. If you don't care, the last access wins.
If the access isn't atomic, you might need a semaphore to ensure that only one access is handled at a time and the next waits until the first has finished.
You can also try to control the order by changing the structure, by introducing thread ordering (wait_order), or, if possible, by removing the timing altogether (instead of operating directly on the data you receive, you store it for some time and operate on it later).

Oracle AQ Asynchronous notification

I'm planning to make use of the Oracle AQ asynchronous notification feature in an OLTP application. The number of messages enqueued might go up to 1,000 per minute during peak hours. The dequeue callback procedure processes each message and inserts an entry into a table that is determined by the type of the message.
My concern is: does the large number of notifications this generates (with the PL/SQL callback procedure being invoked in turn for every notification) cause database contention?
Is it advisable to use asynchronous notification for this purpose, or should I go with a dequeue polling process, where I dequeue one message at a time in a continuous loop?
My database version is 10gR2.
Your expert help is highly appreciated!!
For each message enqueued into the queue, an Oracle background process invokes the related callback procedure by creating a scheduler job.
As you are expecting 1,000 messages per minute, depending on the callback procedure increases the load on the Oracle background processes and creates that many one-time scheduler jobs. Moreover, if the number of parallel scheduler jobs exceeds the job_queue_processes parameter configured in your database, message processing is delayed. If you instead choose to poll and dequeue with an AQ listener, a single job can process all the messages enqueued into the queue.
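A minimal sketch of such a polling dequeue loop (the queue name, the payload type my_msg_type, and the process_message handler are illustrative; a real job would also add error handling and a stop condition):

DECLARE
  l_opts    DBMS_AQ.DEQUEUE_OPTIONS_T;
  l_props   DBMS_AQ.MESSAGE_PROPERTIES_T;
  l_msgid   RAW(16);
  l_payload my_msg_type;                 -- hypothetical queue payload type
BEGIN
  l_opts.wait := DBMS_AQ.FOREVER;        -- block until a message arrives
  LOOP
    DBMS_AQ.DEQUEUE(queue_name         => 'my_queue',
                    dequeue_options    => l_opts,
                    message_properties => l_props,
                    payload            => l_payload,
                    msgid              => l_msgid);
    process_message(l_payload);          -- insert into the proper table
    COMMIT;                              -- make dequeue + insert atomic
  END LOOP;
END;
/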

How do you create a sequential workflow with error handling in Celery?

I am currently using Redis for a workflow with a couple of steps in it. For each step, a worker snatches the payload from a queue and, when it's done, pushes it onto the next queue, where the next worker can take it further. If an exception occurs, the worker puts the task into a special queue.
The application logic for the flow through the application hence lies in the workers themselves. I now want to switch to Celery.
I understand that in Celery you can use subtasks, but I fail to see how you express your specific error handling there for different conditions such as exceptions and time-outs. Are you supposed to use different queues or use subtasks, and what would that look like in code?
I have now read the docs even more thoroughly and additionally made some tests, and this works:
The problem is to string together tasks so that they happen one after the other, but at the same time be able to handle error conditions and "break out" of the flow and do something else, not just abort.
You can string tasks together with link, and if the extra link_error parameter is given, it will be used on failure. From reading:
http://docs.celeryproject.org/en/latest/userguide/calling.html#linking-callbacks-errbacks
I made this:
res = add.apply_async((2, 2), link=mul.s(16), link_error=onerror.s())
The three tasks are add, mul and onerror. Add adds two numbers and mul multiplies two numbers. So this will add 2 and 2 together, and then the sum will be carried over to the next step (mul) and multiplied by 16.
However, if the add code is buggy, has bad data, or if something else bad but detectable occurs, add throws an exception and the onerror task is run instead of mul. The onerror task gets the uuid of the job and can look the job up in the result backend, if one is configured. The onerror task can then archive the job, send an e-mail, or whatever.
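For completeness, a minimal sketch of the three tasks, following the errback example from the linked docs page (the broker/backend URLs and what onerror does with the failure are illustrative):

from celery import Celery

app = Celery('tasks',
             broker='redis://localhost:6379/0',
             backend='redis://localhost:6379/0')

@app.task
def add(x, y):
    return x + y

@app.task
def mul(x, y):
    return x * y

@app.task(bind=True)
def onerror(self, uuid):
    # The errback is called with the uuid of the failed task; look it up
    # in the result backend and archive it, send an e-mail, or whatever.
    result = self.app.AsyncResult(uuid)
    print('Task {0} raised: {1!r}'.format(uuid, result.result))

res = add.apply_async((2, 2), link=mul.s(16), link_error=onerror.s())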

Networking using run loop

I have an application which uses an external library for analytics. The problem is that I suspect it does some things synchronously, which blocks my thread and makes the watchdog kill my app after 10 seconds (code 0x8badf00d). It is really hard to reproduce (I cannot), but there are quite a few cases "in the wild".
I've read some documentation which suggested that, instead of creating another thread, I should use run loops. Unfortunately, the more I read about them, the more confused I get. And the last thing I want to do is release a fix that breaks even more things :/
What I am trying to achieve is:
From the main thread, add a task to the run loop which calls just one function: initMyAnalytics(). My thread continues running, even if initMyAnalytics() gets blocked waiting for network data. After initMyAnalytics() finishes, it quietly exits and never gets called again (so it doesn't loop or anything).
Any ideas how to achieve it? Code examples are welcome ;)
Regards!
You don't need to use a run loop in this case. A run loop's purpose is to process events from various sources sequentially on a particular thread and to stay idle when there is nothing to do. Of course, you could detach a thread, create a run loop, add a source for your function and run the run loop until the function ends. The same way you could use a semi-trailer truck to carry your groceries home.
What you need here are dispatch queues. Dispatch queues are first-in, first-out data structures that run tasks asynchronously. In contrast to run loops, a dispatch queue isn't tied to a particular thread: the worker threads are created and terminated automatically, as and when required.
As you only have one task to execute, you don't need to create a dispatch queue. Instead, you will use an existing global concurrent queue. A concurrent queue executes one or more tasks concurrently, which is perfectly fine in this case. But if we had many tasks to execute and wanted each task to wait for its predecessor to end, we would need to create a serial queue.
So all you have to do is:
create a task for your function by enclosing it in a block
get a global queue using dispatch_get_global_queue
add the task to the queue using dispatch_async:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
initMyAnalytics();
});
DISPATCH_QUEUE_PRIORITY_DEFAULT is a macro that evaluates to 0. You can get different global queues with different priorities. The second parameter is reserved for future use and should always be 0.
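As an aside, and just a sketch: if you ever do need the serial behavior mentioned above (each task waiting for its predecessor), you would create a private serial queue instead of using a global one (the label and the follow-up task are illustrative):

dispatch_queue_t queue = dispatch_queue_create("com.example.analytics", DISPATCH_QUEUE_SERIAL);
dispatch_async(queue, ^{
    initMyAnalytics();      // runs first
});
dispatch_async(queue, ^{
    sendQueuedEvents();     // hypothetical follow-up, runs only after the first block
});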