How do you register a listener in Service Fabric (Azure)?

I am writing a reliable actor in Service Fabric whose job will be to listen for changes in a Firebase DB and run logic based on those changes. I have it functioning, but not correctly. So far I have written the actor code with a method called MonitorRules(), which listens to Firebase using a C# Firebase client wrapper called FireSharp. MonitorRules() looks like this:
public async Task MonitorRules()
{
    FireSharp.FirebaseClient client = new FireSharp.FirebaseClient(new FireSharp.Config.FirebaseConfig
    {
        AuthSecret = "My5up3rS3cr3tAu7h53cr37",
        BasePath = "https://myapp.firebaseio.com/"
    });
    await client.OnAsync("businessRules",
        added: (sender, args) =>
        {
            ActorEventSource.Current.ActorMessage(this, $"{args.Data} added at {args.Path}");
        },
        changed: (sender, args) =>
        {
            ActorEventSource.Current.ActorMessage(this, $"{args.OldData} changed to {args.Data} at {args.Path}");
        }
    );
}
I then call MonitorRules() after the service is registered like so in the service's Main() method:
fabricRuntime.RegisterActor<RuleMonitor>();
var serviceUri = new Uri("fabric:/MyApp.RuleEngine/RuleMonitorActorService");
var actorId = ActorId.NewId();
var ruleMonitor = ActorProxy.Create<IRuleMonitor>(actorId, serviceUri);
ruleMonitor.MonitorRules();
This "works" in that the service opens a connection to Firebase and responds to data changes. The problem is that since the service runs on three nodes of a five-node cluster, it is actually listening three times and processes each message three times. Also, if there is no activity for a while, the service is deactivated and no longer responds to changes in Firebase. All in all, this is surely not the right way to set something like this up, but I cannot find any documentation on how to set up a polling client like this in Service Fabric. Is there a way to set this up that adheres to the spirit of Azure Service Fabric?

Yeah, there are a few things to familiarize yourself with here. The first is the Actor lifecycle and garbage collection. Tl;dr: Actors are deactivated if they do not receive a client request (via ActorProxy) or a reminder for some period of time, which is configurable.
Second, Actors have Timers and Reminders that you can use to do periodic work, like polling a database for changes. The difference between a timer and reminder is that a timer doesn't count as "being used" meaning that the actor can still be deactivated which shuts down the timer, but a reminder counts as "being used" and can also re-activate a deactivated actor. The way to think about timers and reminders is that you're doing the polling, rather than waiting for a callback from something else like you have here with FireSharp.
Finally, calling MonitorRules from Main() is not the best idea. The reason is that Main() is actually the entry point for your actor service host process, which is just an EXE that is used to host instances of your actors. The only thing that should happen in Main() is registering your actor type and nothing else. Let's look at what's happening here in more detail:
So you deploy your actor service to a cluster. The first thing that happens is we start the host process on as many nodes as necessary to run the actor service (in your case that's 3). We enter Main() where the actor service type gets registered and at this point, that's all we should do, because once the actor service is registered with the host process, we'll then create an instance (or multiple instances or replicas if it's stateful) of the service, and then the service can start doing its work. For actors, that means the actor service is ready to start activating actors when a client application makes a call using ActorProxy. But with the ActorProxy call you have in Main(), you're basically saying "activate an actor on every node where this host is when the host starts" which is why you're listening three times.
With all that in mind, the first question to ask yourself is whether actors are the right model for you. If you just want a simple place to monitor Firebase with a FireSharp client, it might be easier to just use a reliable service instead, because you can put your monitoring in RunAsync, which is started automatically when the service starts, unlike actors, which need a client to activate them.

How to choose in which ExecutorService the play ws is executed?

1)
I would like to execute specific Play WS calls in an isolated thread pool, because I am going to have a lot of HTTP calls to make in the background and I don't want them to overload my main executor service.
Note: I also found some information that I don't understand here: https://groups.google.com/forum/#!topic/play-framework/ibDf2vL3sT0
It explains that Play WS already has its own thread pool. Is that still true in Play 2.6? I can't work this out from the Play documentation (see: https://www.playframework.com/documentation/2.5.x/ThreadPools#Using-the-default-thread-pool)
I created my own context :
call-to-db-context {
  fork-join-executor {
    parallelism-factor = 1
    parallelism-max = 24
  }
}
But I don't know how to make the WS request use this context.
ws.url("http://127.0.0.1:8080/b")
.get() // How to specify executorContext here ?
2)
Also, this call-to-db-context must have a low priority because it runs background tasks. I would like Akka's handling of user requests, and my default execution context, to have a higher priority. What is the best way to do this?
In earlier versions of Play this was easier: you could configure the client yourself, since WS was a wrapper on top of Ning.
It looks a bit more complicated in 2.6. In mature products, when something is not easy to change, then most probably you do not need to change it.
So I do not think you need to specify a thread pool for the WS methods. For post-processing, maybe, if it is long. But the Play client is asynchronous, which means it will not block a thread while waiting for the response. If you make requests over an unreliable network, just use timeouts.
See here for more about the Play client.
I'm not sure I understand your requirement about priority. Do you know about Akka deployment configuration? If not, you need to read about it here.
With it you can specify thread pools for your actors. By using different dispatchers for actors and for WS post-processing (processing of data from the DB?), you separate these two kinds of work. If the post-processing of WS calls is heavy, limit the number of threads for your call-to-db-context dispatcher.
Update after comment
If you think the number of WS calls in your case could affect performance, they need to be limited in two places: limit the number of calls started, and limit the amount of concurrent post-processing. The key point is that setting a specific dispatcher for WS itself will not limit anything: because it is asynchronous, it can start 1000 requests using only one thread.
So I would say you can wrap your WS calls in an actor. The actor handles both the message that starts the request and the post-processing, e.g.
def receive: Receive = {
  …
  case Get(url) =>
    ws.url(url).get().onComplete {
      case Success(response) => self ! GetSuccess(response)
      case Failure(exception) => self ! GetFailure(exception)
    }
  case GetSuccess(response) => …..
  case GetFailure(exception) => ……
}
You can deploy this actor on a specific dispatcher behind a round-robin pool (with a fixed number of workers). This solution does not limit how many requests are started, so you can still get a long queue of responses. To limit that as well, use become to stop accepting Get until GetSuccess or GetFailure has been received, so that each worker processes one request completely before starting the next.
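The point that a dedicated thread pool does not bound asynchronous calls, and that the number of requests started has to be limited explicitly, can be illustrated outside of Play and Akka. The following is a minimal Java sketch, not Play WS code: `InFlightLimiter` and `fakeGet` are hypothetical names, `fakeGet` merely stands in for `ws.url(url).get()`, and the caller is blocked while waiting for a permit, which a real non-blocking application would replace with queueing (for example the actor mailbox described above).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Caps how many asynchronous calls may be in flight at once (hypothetical helper). */
class InFlightLimiter {
    private final Semaphore permits;

    InFlightLimiter(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /** Waits for a free permit, starts the async call, and frees the permit when it completes. */
    <T> CompletableFuture<T> submit(Supplier<CompletableFuture<T>> call) {
        permits.acquireUninterruptibly();
        try {
            // The permit is released on success and on failure alike.
            return call.get().whenComplete((result, error) -> permits.release());
        } catch (RuntimeException e) {
            permits.release(); // the call could not even be started
            throw e;
        }
    }
}

class InFlightLimiterDemo {
    private static final AtomicInteger inFlight = new AtomicInteger();
    private static final AtomicInteger maxObserved = new AtomicInteger();

    /** Stand-in for an async HTTP GET: completes after a short delay without holding a thread. */
    static CompletableFuture<String> fakeGet(String url) {
        maxObserved.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
        return CompletableFuture.supplyAsync(() -> {
            inFlight.decrementAndGet();
            return "response from " + url;
        }, CompletableFuture.delayedExecutor(20, TimeUnit.MILLISECONDS));
    }

    /** Starts the given number of requests and returns the highest in-flight count observed. */
    static int run(int requests, int limit) {
        inFlight.set(0);
        maxObserved.set(0);
        InFlightLimiter limiter = new InFlightLimiter(limit);
        List<CompletableFuture<String>> results = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            String url = "http://127.0.0.1:8080/b?i=" + i;
            results.add(limiter.submit(() -> fakeGet(url)));
        }
        CompletableFuture.allOf(results.toArray(new CompletableFuture[0])).join();
        return maxObserved.get();
    }

    public static void main(String[] args) {
        System.out.println("max in flight: " + run(20, 3));
    }
}
```

Without the limiter, all 20 requests would be started immediately regardless of how few threads are available, which is the behavior the answer warns about.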

Spring Cloud Circuit Breakers or Hystrix

Hystrix is predominantly meant for applications built using spring cloud.
Having said that there could be multiple services layer for an application.
e.g. Amazon (Amazon site must be having multiple services like login, products, carts, orders, payment and so on)
Client (say web user) -> web application X -> Service A (it uses Data Source A ) -> Service B (Data Source B) -> Service C (Data Source C) -> Service D (Data Source D) -> Service E (Data Source E)
With this kind of scenario, when something breaks in Service E, how does that get propagated back to the client?
How can Hystrix help to detect the unavailability of one specific piece of functionality in Service E?
If that example is wrong, is Hystrix's scope limited to multiple processes inside one service rather than multiple services used in one application?
That said, the example above can be tweaked as follows:
Client (say web user) -> web application X -> Service A -> inside Service A, let's say there are processes like process 1 -> process 2 -> process 3 -> process 4 -> process 5
and anything that fails in process 5 gets propagated back to process 1 and then back to the client.
My question is more about maintaining thread state here.
With try-catch, thread scope is limited per service (please correct me if I'm wrong).
How does Hystrix maintain the state of the thread during this whole transaction?
Hystrix is predominantly meant for applications built using spring cloud
Not exactly. Hystrix is generally used to enable Circuit Breaker functionality. It could be used everywhere. Even for a simple method call
For example
class A {
    B b;
    public void methodA() {
        b.methodB();
    }
}
class B {
    DatabaseConnectionPool object;
    @HystrixCommand(fallbackMethod = "abcd")
    public void methodB() {
        // Get db object from pool.
        // call db and fetch something.
    }
    // Fallback named in the annotation; runs when methodB() fails or its circuit is open.
    public void abcd() {
    }
}
Even for a simple method call, it can be used. It doesn't matter what is being done inside the code wrapped by Hystrix.
But generally we use Hystrix around pieces of code that could throw exceptions for unknown reasons (especially when calling different applications).
With this kind of scenario, when something breaks in Service E, how does that get propagated back to the client? How can Hystrix help to detect the unavailability of one specific piece of functionality in Service E?
If you wrap each method call with Hystrix, i.e. from Service A --> Service B, from Service B --> Service C, and so on, then each call is treated as a circuit, and you can use the Hystrix dashboard to visualize the state (closed, open, half-open) of each circuit.
Let's assume the call from Service B --> Service C fails (throws an exception). Hystrix then wraps the original exception in a Hystrix exception (HystrixRuntimeException) and throws it back. If you have a fallback method, execution goes to the fallback method in Service B, which returns the value specified there. If you don't have a fallback, the exception is thrown higher up the chain, and the same thing repeats at each level.
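To make the closed / open / half-open states and the fallback behavior concrete, here is a minimal hand-rolled sketch in Java. It is not the Hystrix API: `SimpleCircuitBreaker` and its parameters are hypothetical, and it trips on a count of consecutive failures, whereas Hystrix itself works from an error percentage over a rolling window and adds thread-pool isolation and timeouts.

```java
import java.util.function.Supplier;

/** Minimal circuit breaker sketch (hypothetical; not the Hystrix implementation). */
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold; // consecutive failures before the circuit opens
    private final long openMillis;      // how long to stay open before allowing a trial call
    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0;

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    synchronized State state() {
        return state;
    }

    /** Runs the call through the circuit; returns the fallback when it fails or the circuit is open. */
    synchronized <T> T execute(Supplier<T> call, Supplier<T> fallback) {
        if (state == State.OPEN) {
            if (System.currentTimeMillis() - openedAt < openMillis) {
                return fallback.get();   // short-circuit: the failing dependency is not called
            }
            state = State.HALF_OPEN;     // wait period elapsed: let one trial call through
        }
        try {
            T result = call.get();
            consecutiveFailures = 0;
            state = State.CLOSED;        // a success closes the circuit again
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;
                openedAt = System.currentTimeMillis();
            }
            return fallback.get();
        }
    }
}

class CircuitBreakerDemo {
    public static void main(String[] args) {
        SimpleCircuitBreaker breaker = new SimpleCircuitBreaker(2, 60_000);
        Supplier<String> callServiceC = () -> { throw new IllegalStateException("Service C is down"); };
        for (int i = 0; i < 3; i++) {
            String value = breaker.execute(callServiceC, () -> "fallback value");
            System.out.println(value + " (circuit " + breaker.state() + ")");
        }
    }
}
```

In this sketch Service B would wrap its call to Service C in `execute`; once the circuit is open, callers get the fallback immediately instead of waiting on a dependency that is known to be failing.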
How does Hystrix maintain the state of the thread during this whole transaction?
For each call wrapped with Hystrix, Hystrix maintains a Thread Pool and you can configure this completely.
If I already have an existing Java feature of using try-catch why will someone go for Hystrix explicitly?
Hystrix provides much more functionality. You cannot even compare that with try catch. I suggest you read Circuit breaker pattern.

What's the best strategy for cancelling a pending job or executing job using rebus and MSMQ? (long running)

I have a web app sending messages through rebus to start long running jobs on a remote machine. I'm using the MSMQ transport. The remote worker can process 5 operations in parallel. Therefore, if the remote worker is busy, some messages could pile up in the queue until the worker can process them. Meanwhile, the user might decide to cancel a pending operation (or an executing one). What's the best strategy to handle this particular scenario when using rebus (or any bus for that matter)?
Well, since the queue is opaque and the endpoint will only ever get to see the messages that it actually attempts to receive, there's no way, really, to filter messages before they're received.
Depending on how important it is to you that you don't unnecessarily do work, I can think of a couple of ways to abort the processing before it commences.
One way I can think of is to take advantage of the fact that Rebus executes handlers in a handler pipeline, which means you can intercept messages before they're executed.
If your work is performed by DoWork, you can insert a "filter" like this:
Configure.With(...)
    .(...)
    .Options(o => o.SpecifyOrderOfHandlers()
        .First<AbortIfCancelled>()
        .Then<DoWork>())
    .Start();
and then your AbortIfCancelled could look like this:
public class AbortIfCancelled : IHandleMessages<Work>
{
    readonly IMessageContext _messageContext;
    readonly ICancelWork _cancelWork;

    public AbortIfCancelled(IMessageContext messageContext, ICancelWork cancelWork)
    {
        _messageContext = messageContext;
        _cancelWork = cancelWork;
    }

    public async Task Handle(Work work)
    {
        // Stop the remaining handlers (DoWork) if this job was cancelled.
        if (await _cancelWork.WasCancelled(work))
        {
            _messageContext.AbortDispatch();
        }
    }
}
thus aborting the rest of the pipeline if the ICancelWork thing returns true. You will then have to implement ICancelWork e.g. by stuffing a bool into a database somewhere.
PS: The AbortDispatch() function on IMessageContext is available from 0.98.11
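The filter idea itself, an ordered handler pipeline in which an early handler can abort dispatch to the later ones, is independent of Rebus. The following Java sketch illustrates only that mechanism; `MessageContext`, `Handler`, `Pipeline` and `CancelFilterDemo` are hypothetical names, and an in-memory set stands in for the database flag that ICancelWork would consult.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Per-message context; abortDispatch() stops later handlers from running. */
class MessageContext {
    private boolean aborted;

    void abortDispatch() {
        aborted = true;
    }

    boolean isAborted() {
        return aborted;
    }
}

/** One step in the pipeline. */
interface Handler<M> {
    void handle(M message, MessageContext context);
}

/** Runs handlers in order and stops as soon as one of them aborts the dispatch. */
class Pipeline<M> {
    private final List<Handler<M>> handlers;

    Pipeline(List<Handler<M>> handlers) {
        this.handlers = handlers;
    }

    void dispatch(M message) {
        MessageContext context = new MessageContext();
        for (Handler<M> handler : handlers) {
            if (context.isAborted()) {
                break;
            }
            handler.handle(message, context);
        }
    }
}

class CancelFilterDemo {
    /** Returns the job ids that actually reached the work handler. */
    static List<String> run(List<String> jobIds, Set<String> cancelled) {
        List<String> processed = new ArrayList<>();
        // First handler: the filter. Second handler: the long-running work.
        Handler<String> abortIfCancelled = (jobId, ctx) -> {
            if (cancelled.contains(jobId)) {
                ctx.abortDispatch();
            }
        };
        Handler<String> doWork = (jobId, ctx) -> processed.add(jobId);
        Pipeline<String> pipeline = new Pipeline<>(List.of(abortIfCancelled, doWork));
        jobIds.forEach(pipeline::dispatch);
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("job-1", "job-2", "job-3"), Set.of("job-2")));
    }
}
```

Note that this only covers jobs that are still pending in the queue: a job that is already executing has passed the filter, so cancelling it would need a separate check inside the work handler itself.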

How to discover that a Scala remote actor has died?

In Scala, an actor can be notified when another (remote) actor terminates by setting the trapExit flag and invoking the link() method with the second actor as parameter. In this case when the remote actor ends its job by calling exit() the first one is notified by receiving an Exit message.
But what happens when the remote actor terminates in a less graceful way (e.g. the VM where it is running crashes)? In other words, how can the local actor discover that the remote one is no longer available? Of course I would prefer (if possible) that the local actor be notified by a message similar to the Exit one, but that does not seem feasible. Am I missing something? Should I continuously poll the state of the remote actor (and in this case I don't know the best way to do that), or is there a smarter solution?
But what happens when the remote actor terminates in a less graceful way (e.g. the VM where it is running crashes)
The actor proxy stays alive, accepting messages (and losing them) and waiting for you to restart the JVM with the remote actor. Watching for JVM crashes (and other failures at the infrastructure level) is far beyond Scala's responsibilities. A good choice for that could be monitoring through JMX.
In other words, how can the local actor discover that the remote one is no longer available?
You may define a timeout interval (say 5000 millis). If remote actor doesn't reply during this interval, it's a sign for you that something unexpected is happening to remote actor, and you may either ask it about its state or just treat it as dead.
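As an illustration of the timeout approach, here is a minimal sketch in Java rather than Scala actors. `TimeoutFailureDetector` is a hypothetical helper, the 5000 ms value is the one suggested above, and timestamps are passed in explicitly so that the logic can be followed without real waiting.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Treats a peer as dead when nothing has been heard from it within the timeout (hypothetical helper). */
class TimeoutFailureDetector {
    private final long timeoutMillis;
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    TimeoutFailureDetector(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Call whenever any reply or heartbeat arrives from the peer. */
    void recordReply(String peer, long nowMillis) {
        lastSeen.put(peer, nowMillis);
    }

    /** A peer that has never replied is also reported as not alive. */
    boolean isAlive(String peer, long nowMillis) {
        Long seen = lastSeen.get(peer);
        return seen != null && nowMillis - seen <= timeoutMillis;
    }
}

class TimeoutFailureDetectorDemo {
    public static void main(String[] args) {
        TimeoutFailureDetector detector = new TimeoutFailureDetector(5000);
        detector.recordReply("remote-actor", 0);
        System.out.println(detector.isAlive("remote-actor", 4000)); // within the timeout
        System.out.println(detector.isAlive("remote-actor", 6000)); // timeout exceeded
    }
}
```

A timeout can only suggest that the remote actor is dead: a slow network or a long pause looks the same from the local side, which is why the answer proposes either asking the actor about its state or simply treating it as dead.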
Should I continuously poll the state of the remote actor (and in this case I don't know the best way to do that), or is there a smarter solution?
You may put a kind of polling load balancer/dispatcher in front of a group of actors, which will use only those actors that are alive and ready to process messages (this makes sense for remote actors that may suddenly appear or disappear behind the proxy). See also the related question "Can Scala actors process multiple messages simultaneously?"
The book Actors in Scala mentions (not tested personally):
Trapping termination notifications.
In some cases, it is useful to receive termination notifications as messages in the mailbox of a monitoring actor.
For example, a monitoring actor may want to rethrow an exception that is not handled by some linked actor.
Or, a monitoring actor may want to react to normal termination, which is not possible by default.
Actors can be configured to receive all termination notifications as normal messages in their mailbox using the Boolean trapExit flag. In the following example actor b links itself to actor a:
val a = actor { ... }
val b = actor {
  self.trapExit = true
  link(a)
  ...
}
Note that before actor b invokes link it sets its trapExit member to true;
this means that whenever a linked actor terminates (normally or abnormally) it receives a message of type Exit.
Therefore, actor b is going to be notified whenever actor a terminates (assuming that actor a did not terminate before b’s invocation of link).
So "what happens when the remote actor terminates in a less graceful way"?
According to the book, the linking actor (b here) should receive an Exit message even in the case of an abnormal termination.
val b = actor {
  self.trapExit = true
  link(a)
  a ! 'start
  react {
    case Exit(from, reason) if from == a =>
      println("Actor 'a' terminated because of " + reason)
  }
}

Scala Remote Actors - Pitfalls

While writing Scala RemoteActor code, I noticed some pitfalls:
RemoteActor.classLoader = getClass().getClassLoader() has to be set in order to avoid "java.lang.ClassNotFoundException"
link doesn't always work due to "a race condition for which a NetKernel (the facility responsible for forwarding messages remotely) that backs a remote actor can close before the remote actor's proxy (more specifically, proxy delegate) has had a chance to send a message remotely indicating the local exit." (Stephan Tu)
RemoteActor.select doesn't always return the same delegate (RemoteActor.select - result deterministic?)
Sending a delegate over the network prevents the application from quitting normally (RemoteActor unregister actor)
Remote Actors won't terminate if RemoteActor.alive() and RemoteActor.register() are used outside act. (See the answer of Magnus)
Are there any other pitfalls a programmer should be aware of?
Here's another; you need to put your RemoteActor.alive() and RemoteActor.register() calls inside your act method when you define your actor or the actor won't terminate when you call exit(); see How do I kill a RemoteActor?