What's the best strategy for cancelling a pending or executing long-running job using Rebus and MSMQ?

I have a web app sending messages through rebus to start long running jobs on a remote machine. I'm using the MSMQ transport. The remote worker can process 5 operations in parallel. Therefore, if the remote worker is busy, some messages could pile up in the queue until the worker can process them. Meanwhile, the user might decide to cancel a pending operation (or an executing one). What's the best strategy to handle this particular scenario when using rebus (or any bus for that matter)?

Well, since the queue is opaque and the endpoint will only ever get to see the messages that it actually attempts to receive, there's no way, really, to filter messages before they're received.
Depending on how important it is to you to avoid doing unnecessary work, I can think of a couple of ways to abort the processing before it commences.
One way I can think of is to take advantage of the fact that Rebus executes handlers in a handler pipeline, which means you can intercept messages before they're executed.
If your work is performed by DoWork, you can insert a "filter" like this:
Configure.With(...)
    .(...)
    .Options(o => o.SpecifyOrderOfHandlers()
        .First<AbortIfCancelled>()
        .Then<DoWork>())
    .Start();
and then your AbortIfCancelled could look like this:
public class AbortIfCancelled : IHandleMessages<Work>
{
    readonly IMessageContext _messageContext;
    readonly ICancelWork _cancelWork;

    public AbortIfCancelled(IMessageContext messageContext, ICancelWork cancelWork)
    {
        _messageContext = messageContext;
        _cancelWork = cancelWork;
    }

    public async Task Handle(Work work)
    {
        if (await _cancelWork.WasCancelled(work))
        {
            _messageContext.AbortDispatch();
        }
    }
}
thus aborting the rest of the pipeline if the ICancelWork thing returns true. You will then have to implement ICancelWork e.g. by stuffing a bool into a database somewhere.
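For example, a minimal ICancelWork implementation backed by SQL Server could look like this (a sketch only: the CancelledJobs table, the JobId property on Work, and the interface shape are all assumptions, not part of Rebus):

using System.Data.SqlClient;
using System.Threading.Tasks;

public interface ICancelWork
{
    Task<bool> WasCancelled(Work work);
}

public class SqlServerCancelWork : ICancelWork
{
    readonly string _connectionString;

    public SqlServerCancelWork(string connectionString)
    {
        _connectionString = connectionString;
    }

    public async Task<bool> WasCancelled(Work work)
    {
        using (var connection = new SqlConnection(_connectionString))
        {
            await connection.OpenAsync();

            using (var command = connection.CreateCommand())
            {
                // CancelledJobs is a hypothetical table that the web app
                // inserts into when the user cancels a job
                command.CommandText = "SELECT COUNT(*) FROM CancelledJobs WHERE JobId = @jobId";
                command.Parameters.AddWithValue("@jobId", work.JobId);

                return (int)await command.ExecuteScalarAsync() > 0;
            }
        }
    }
}

The web app then simply inserts a row for the job's ID when the user hits cancel, and any copy of the message still sitting in the queue gets dropped by AbortIfCancelled when its turn comes.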
PS: The AbortDispatch() function on IMessageContext is available from 0.98.11

Related

Background task in reactive pipeline (Fire-and-forget)

I have a reactive pipeline to process incoming requests. For each request I need to call a business-relevant function (doSomeRelevantProcessing).
After that is done, I need to notify some external service about what happened. That part of the pipeline should not increase the overall response time.
Also, notifying this external system is not business critical: giving a quick response after the main part of the pipeline is finished is more important than making sure the notification is successful.
As far as I have learned, the only way to run something in the background without slowing down the overall process is to subscribe to it directly in the pipeline, thus achieving a fire-and-forget mentality.
Is there a good alternative to subscribing inside the flatmap?
I am a little worried about what might happen if notifying the external service takes longer than the original processing and a lot of requests come in at once. Could this lead to memory exhaustion, or cause the overall process to block?
fun runPipeline(incoming: Mono<Request>) = incoming
    .flatMap { doSomeRelevantProcessing(it) } // this should not be delayed
    .flatMap { doBackgroundJob(it) } // this can take a moment, but is not super critical

fun doSomeRelevantProcessing(request: Request) = Mono.just(request) // do some processing

fun doBackgroundJob(request: Request) = Mono.deferContextual { ctx: ContextView ->
    val notification = "notification" // build an object from context
    // this uses non-blocking HTTP (i.e. WebClient), so it can take a second or so
    notifyExternalService(notification).subscribeOn(Schedulers.boundedElastic()).subscribe()
    Mono.just(Unit)
}

fun notifyExternalService(notification: String) = Mono.just(Unit) // might take a while
I'm answering this assuming that you notify the external service using purely reactive mechanisms - i.e. you're not wrapping a blocking service. If you are then the answer would be different as you're bound by the size of your bounded elastic thread pool, which could quickly become overwhelmed if you have hundreds of requests a second incoming.
(Assuming you're using reactive mechanisms, then there's no need for .subscribeOn(Schedulers.boundedElastic()) as you give in your example, as that's not buying you anything - it's designed for wrapping legacy blocking services.)
Could this lead to a memory exhaustion
It's only a possibility in really extreme cases; the memory used by each individual request will be tiny. It's almost certainly not worth worrying about: if you start seeing memory issues here, then you'll almost certainly be hit by other issues elsewhere.
That being said, I'd probably recommend adding .timeout(Duration.ofSeconds(5)) or similar before your inner subscribe method to make sure the requests are killed off after a while if they haven't worked for any reason - this will prevent them building up.
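For instance, a sketch of doBackgroundJob from the question with that timeout applied (and the boundedElastic hop dropped, per the note above; Duration is java.time.Duration):

fun doBackgroundJob(request: Request) = Mono.deferContextual { ctx: ContextView ->
    val notification = "notification" // build an object from context
    notifyExternalService(notification)
        .timeout(Duration.ofSeconds(5)) // kill off notifications that hang, so they can't pile up
        .subscribe()
    Mono.just(Unit)
}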
...or [can this cause] the overall process to block?
This one is easier - a short no, it can't.

How to choose which ExecutorService Play WS is executed in?

1)
I would like to execute specific Play WS calls in an isolated thread pool, because I'm going to make a lot of HTTP calls in the background and I don't want them to overload my main executor service.
Note: I also found some information here that I don't understand: https://groups.google.com/forum/#!topic/play-framework/ibDf2vL3sT0
It explains that Play WS already has its own thread pool. Is that still true in Play 2.6? I can't work this out from the Play documentation (see: https://www.playframework.com/documentation/2.5.x/ThreadPools#Using-the-default-thread-pool).
I created my own context:
call-to-db-context {
  fork-join-executor {
    parallelism-factor = 1
    parallelism-max = 24
  }
}
But I don't know how to specify that the WS request should use this context.
ws.url("http://127.0.0.1:8080/b")
.get() // How to specify executorContext here ?
2)
Also, this call-to-db-context should have low priority, because it runs background tasks. I would like Akka's handling of user requests, as well as my default execution context, to have higher priority. What is the best way to do that?
In early Play it was easier: you could just configure the client yourself (since WS was a wrapper on top of Ning).
It looks a bit more complicated in 2.6. In mature products, when something is not easy to change, you most probably do not need to change it.
So I do not think you need to specify a thread pool for the WS methods themselves. For the post-processing, maybe, if it is long. But the Play client is asynchronous, meaning it will not block a thread while waiting for the response. If you make requests over an unreliable network, just use timeouts.
See here for more about the Play client.
I'm not sure I understand your requirement about priority. Do you know about Akka deployment configuration? If not, you should read up on it here.
So, you can specify thread pools for your actors. By having different dispatchers for the actors and the WS post-processing (processing of data from the DB?) you separate these functionalities. If the post-processing of WS calls is heavy, limit the number of threads for your call-to-db-context dispatcher.
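For example, the deployment configuration could look like this (a sketch; /ws-worker is a hypothetical path for the worker actor shown further down, and the pool size of 5 is arbitrary):

akka.actor.deployment {
  /ws-worker {
    dispatcher = call-to-db-context
    router = round-robin-pool
    nr-of-instances = 5
  }
}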
Update after comment
The many WS calls in your case (if you think their number could affect performance) need to be limited in two places: the number of calls started, and the number of concurrent post-processing tasks. It is important to understand that setting a specific dispatcher for WS itself will not limit anything: since it is asynchronous, it can start 1000 requests with only one thread.
So I would say you can wrap your WS calls in an actor. The actor handles one message type to start the request, and others for the post-processing, e.g.:
def receive: Receive = {
  …
  case Get(url) =>
    ws.url(url).get().onComplete {
      case Success(response) => self ! GetSuccess(response)
      case Failure(exception) => self ! GetFailure(exception)
    }
  case GetSuccess(response) => …
  case GetFailure(exception) => …
}
You can deploy this actor on a specific dispatcher with a round-robin pool (setting the number of workers). This solution does not limit the starting of requests (so you can still build up a long queue of pending responses). You can add become to stop accepting Get while no GetSuccess or GetFailure has been received, so that a worker has to process a request completely before starting the next one, as sketched below.
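A rough sketch of that become-based throttling (Get, GetSuccess, GetFailure and the injected WSClient are the hypothetical pieces from the snippet above; Stash buffers the Get messages that arrive while a request is in flight):

import akka.actor.{Actor, Stash}
import play.api.libs.ws.WSClient
import scala.util.{Failure, Success}

class WsWorker(ws: WSClient) extends Actor with Stash {
  import context.dispatcher

  def receive: Receive = idle

  def idle: Receive = {
    case Get(url) =>
      ws.url(url).get().onComplete {
        case Success(response)  => self ! GetSuccess(response)
        case Failure(exception) => self ! GetFailure(exception)
      }
      context.become(busy) // stop accepting new work until this request is done
  }

  def busy: Receive = {
    case GetSuccess(response) =>
      // post-process the response here
      unstashAll()
      context.become(idle)
    case GetFailure(exception) =>
      // handle the failure here
      unstashAll()
      context.become(idle)
    case _: Get =>
      stash() // queue the request instead of starting it
  }
}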

How do you register a listener in Service Fabric (Azure)?

I am writing a reliable actor in Service Fabric whose job it will be to listen to changes in a Firebase DB and run logic based on those changes. I have it functioning, but not correctly. What I've done so far is write the actor code with a method called MonitorRules(), which listens to Firebase using a C# Firebase client wrapper called FireSharp. MonitorRules() looks like this:
public async Task MonitorRules()
{
    FireSharp.FirebaseClient client = new FireSharp.FirebaseClient(new FireSharp.Config.FirebaseConfig
    {
        AuthSecret = "My5up3rS3cr3tAu7h53cr37",
        BasePath = "https://myapp.firebaseio.com/"
    });

    await client.OnAsync("businessRules",
        added: (sender, args) =>
        {
            ActorEventSource.Current.ActorMessage(this, $"{args.Data} added at {args.Path}");
        },
        changed: (sender, args) =>
        {
            ActorEventSource.Current.ActorMessage(this, $"{args.OldData} changed to {args.Data} at {args.Path}");
        });
}
I then call MonitorRules() after the service is registered like so in the service's Main() method:
fabricRuntime.RegisterActor<RuleMonitor>();
var serviceUri = new Uri("fabric:/MyApp.RuleEngine/RuleMonitorActorService");
var actorId = ActorId.NewId();
var ruleMonitor = ActorProxy.Create<IRuleMonitor>(actorId, serviceUri);
ruleMonitor.MonitorRules();
This "works" in that the service opens a connection to Firebase and responds to data changes. The problem is that since the service is run on three nodes of a five node cluster, it's actually listening three times and processes each message three times. Also, if there is no activity for a while, the service is deactivated and no longer responds to changes in Firebase. All in all, not the right way to set something like this up I'm sure, but I can not find any documentation on how to set up a polling client like this in service fabric. Is there a way to set this up that will adhere to the spirit of azure service fabric?
Yeah, there are a few things to familiarize yourself with here. The first is the Actor lifecycle and garbage collection. Tl;dr: Actors are deactivated if they do not receive a client request (via ActorProxy) or a reminder for some period of time, which is configurable.
Second, Actors have Timers and Reminders that you can use to do periodic work, like polling a database for changes. The difference between a timer and reminder is that a timer doesn't count as "being used" meaning that the actor can still be deactivated which shuts down the timer, but a reminder counts as "being used" and can also re-activate a deactivated actor. The way to think about timers and reminders is that you're doing the polling, rather than waiting for a callback from something else like you have here with FireSharp.
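As a sketch, a reminder-based version of the actor from the question could look something like this (the reminder name and the 10-second/30-second timings are arbitrary choices; the actor has to implement IRemindable for reminders to work):

internal class RuleMonitor : Actor, IRuleMonitor, IRemindable
{
    protected override async Task OnActivateAsync()
    {
        // A reminder counts as "being used" and will re-activate a
        // deactivated actor, so the polling survives garbage collection.
        await RegisterReminderAsync(
            "PollFirebase",
            state: null,
            dueTime: TimeSpan.FromSeconds(10),
            period: TimeSpan.FromSeconds(30));
    }

    public Task ReceiveReminderAsync(string reminderName, byte[] state, TimeSpan dueTime, TimeSpan period)
    {
        if (reminderName == "PollFirebase")
        {
            // query Firebase for changes since the last poll here,
            // instead of keeping an open FireSharp callback
        }

        return Task.CompletedTask;
    }
}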
Finally, calling MonitorRules from Main() is not the best idea. The reason is that Main() is actually the entry point for your actor service host process, which is just an EXE that is used to host instances of your actors. The only thing that should happen in Main() is registering your actor type and nothing else. Let's look at what's happening here in more detail:
So you deploy your actor service to a cluster. The first thing that happens is we start the host process on as many nodes as necessary to run the actor service (in your case that's 3). We enter Main() where the actor service type gets registered and at this point, that's all we should do, because once the actor service is registered with the host process, we'll then create an instance (or multiple instances or replicas if it's stateful) of the service, and then the service can start doing its work. For actors, that means the actor service is ready to start activating actors when a client application makes a call using ActorProxy. But with the ActorProxy call you have in Main(), you're basically saying "activate an actor on every node where this host is when the host starts" which is why you're listening three times.
With all that in mind, the first question to ask yourself is whether actors are the right model for you. If you just want a simple place to monitor Firebase with a FireSharp client, it might be easier to just use a reliable service instead, because you can put your monitoring in RunAsync, which is started automatically when the service starts, unlike actors, which need a client to activate them.
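A sketch of that alternative, reusing the FireSharp setup from the question (you would deploy the service with InstanceCount set to 1 so that only a single node listens):

internal sealed class RuleMonitorService : StatelessService
{
    public RuleMonitorService(StatelessServiceContext context) : base(context) { }

    protected override async Task RunAsync(CancellationToken cancellationToken)
    {
        var client = new FireSharp.FirebaseClient(new FireSharp.Config.FirebaseConfig
        {
            AuthSecret = "My5up3rS3cr3tAu7h53cr37",
            BasePath = "https://myapp.firebaseio.com/"
        });

        // RunAsync is started by the runtime itself - no ActorProxy call needed
        await client.OnAsync("businessRules",
            added: (sender, args) => { /* handle added rule */ },
            changed: (sender, args) => { /* handle changed rule */ });

        // keep the service alive until Service Fabric asks it to stop
        await Task.Delay(Timeout.Infinite, cancellationToken);
    }
}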

MPI message availability notification

Is there a way in MPI to be notified of the availability of a message for a particular process? Currently I use polling with an asynchronous MPI_Iprobe followed by an MPI_Recv. This means the process has to stop what it is doing and call this method now and then. Is there a way to be notified of message availability through signals/interrupts? Another option is to do this polling in a separate thread, but I am not sure if that is acceptable because it consumes CPU time.
bool poll(int& source, int& message_id) {
    int flag;
    MPI_Status mpi_status;
    MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &flag, &mpi_status);
    if (flag) {
        message_id = mpi_status.MPI_TAG;
        source = mpi_status.MPI_SOURCE;
        return true;
    }
    return false;
}
Edit: It looks like MPI implementations use polling http://blogs.cisco.com/performance/polling-vs-blocking-message-passingprogress/
The best solution for this seems to be using the blocking MPI_Probe() together with
mpirun -n 2 --mca mpi_yield_when_idle 1 ...
This makes the polling thread yield the CPU while it waits for a message instead of busy-spinning. But some MPI implementations do not have this MCA option.
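A minimal sketch of that blocking variant of the poll function above:

void wait_for_message(int& source, int& message_id) {
    MPI_Status mpi_status;
    // Blocks until a message is available, instead of returning immediately
    MPI_Probe(MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &mpi_status);
    message_id = mpi_status.MPI_TAG;
    source = mpi_status.MPI_SOURCE;
    // The matching MPI_Recv(..., source, message_id, ...) can then be
    // issued exactly as before
}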
Such a mechanism is not directly available in a portable way.
You could use a thread with MPI_Wait (or a flavor thereof) - or even more simply MPI_Recv. However there is no guarantee that your MPI implementation doesn't consume CPU while waiting on the message. (While it is guaranteed that polling over MPI_Iprobe will indeed consume 100% CPU.) Also be careful with MPI and threads, there are pitfalls and different operating modes.
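For illustration, a sketch of the listener-thread option (MPI_THREAD_MULTIPLE is one of the operating modes referred to above; whether the blocking MPI_Recv spins or sleeps is up to the implementation, and not every build provides full thread support):

#include <mpi.h>
#include <pthread.h>

void* listener(void* arg) {
    int payload;
    MPI_Status status;
    // Blocking receive on a dedicated thread; the main thread keeps computing
    MPI_Recv(&payload, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
             MPI_COMM_WORLD, &status);
    // hand off (status.MPI_SOURCE, status.MPI_TAG, payload) to the main thread here
    return NULL;
}

int main(int argc, char** argv) {
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        // threads may not make concurrent MPI calls; fall back to polling
    }
    pthread_t thread;
    pthread_create(&thread, NULL, listener, NULL);
    // ... do useful work here ...
    pthread_join(thread, NULL);
    MPI_Finalize();
    return 0;
}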
Why would you want to do that in the first place? There may be a better mechanism for what you do, for instance using one-sided communication.

The reason why task deletion in uC/OS should not occur during an ISR

I'm modifying some functionalities (mainly scheduling) of uCos-ii.
And I found out that the OSTaskDel function does nothing when it is called from an ISR.
Though I have learned some basic features of OSes, I really don't understand why that should be prohibited.
All it does is withdraw the task from the ready list and release acquired resources like the TCB or semaphores...
Is there any reason for them to be banned while handling interrupt?
It is not clear from the documentation why it is prohibited in this case, but OSTaskDel() explicitly calls OS_Sched(), and in an ISR this should only happen when the outermost nested interrupt handler exits (handled by OSIntExit()).
I don't think the following is advisable, because there may be other reasons why this is prohibited, but you could remove the:
if (OSIntNesting > 0) {
    return (OS_TASK_DEL_ISR);
}
then make the OS_Sched() call conditional as follows:
if (OSIntNesting == 0) {
    OS_Sched();
}
If this dies horribly, remember I said it was ill-advised!
This operation will extend your interrupt processing time in any case so is probably a bad idea if only for that reason.
It is a bad idea in general (not just from an ISR) to asynchronously delete another task regardless of that task's state or resource usage. uC/OS-II provides the OSTaskDelReq() function to manage task deletion in a way that allows a task to delete itself on request, and therefore be able to correctly release all its resources. Even without that, sending a request via the task's normal IPC mechanisms is usually better (and more portable). The pattern is sketched below.
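For reference, the documented OSTaskDelReq() pattern looks roughly like this (a sketch; the cleanup is whatever resources the worker actually holds):

/* requesting task: ask the worker to delete itself, then wait until it has */
void request_worker_deletion(INT8U worker_prio)
{
    while (OSTaskDelReq(worker_prio) != OS_TASK_NOT_EXIST) {
        OSTimeDly(1);
    }
}

/* worker task: check for the request at a safe point in its loop */
void worker_task(void *pdata)
{
    for (;;) {
        if (OSTaskDelReq(OS_PRIO_SELF) == OS_TASK_DEL_REQ) {
            /* release semaphores, free buffers, etc. */
            OSTaskDel(OS_PRIO_SELF); /* does not return */
        }
        /* ... normal work ... */
    }
}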
If a task is not designed for self-deletion on demand, then you might simply use OSTaskSuspend() instead.
Generally, you cannot do a few things in ISRs:
block on a semaphore and the like
block while acquiring a spin lock, if it's a single-CPU system
cause a page fault that has to be resolved by the virtual memory subsystem (with virtual on-disk memory, that is)
If you do any of the above in an ISR, you'll have a deadlock.
OSTaskDel() is probably doing some of those things.