Scala folding using Akka - scala

I implemented in Java what I called a "foldable queue", i.e., a LinkedBlockingQueue used by an ExecutorService. The idea is that each task as a unique id that if is in the queue while another task is submitted via that same id, it is not added to the queue. The Java code looks like this:
public final class FoldablePricingQueue extends LinkedBlockingQueue<Runnable> {
#Override
public boolean offer(final Runnable runnable) {
if (contains(runnable)) {
return true; // rejected, but true not to throw an exception
} else {
return super.offer(runnable);
}
}
}
Threads have to be pre-started but this is a minor detail. I have an Abstract class that implements Runnable that takes a unique id... this is the one passed in
I would like to implement the same logic using Scala and Akka (Actors).
I would need to have access to the mailbox, and I think I would need to override the ! method and check the mailbox for the event.. has anyone done this before?

This is exactly how the Akka mailbox works. The Akka mailbox can only exist once in the task-queue.
Look at:
https://github.com/jboner/akka/blob/master/akka-actor/src/main/scala/akka/dispatch/Dispatcher.scala#L143
https://github.com/jboner/akka/blob/master/akka-actor/src/main/scala/akka/dispatch/Dispatcher.scala#L198
Very cheaply implemented using an atomic boolean, so no need to traverse the queue.
Also, by the way, your Queue in Java is broken since it doesn't override put, add or offer(E, long, TimeUnit).

Maybe you could do that with two actors. A facade one and a worker one. Clients send jobs to facade. Facade forwards then to worker, and remember them in its internal state, a Set queuedJobs. When it receives a job that is queued, it just discard it. Each time the worker starts processing a job (or completes it, whichever suits you), it sends a StartingOn(job) message to facade, which removes it from queuedJobs.

The proposed design doesn't make sense. The closest thing to a Runnable would be an Actor. Sure, you can keep them in a list, and not add them if they are already there. Such lists are kept by routing actors, which can be created from ready parts provided by Akka, or from a basic actor using the forward method.
You can't look into another actor's mailbox, and overriding ! makes no sense. What you do is you send all your messages to a routing actor, and that routing actor forwards them to a proper destination.
Naturally, since it receives these messages, it can do any logic at that point.

Related

Scala Akka broadcasting send message to all child actors from Parent Context

In my task parent actor create some child actors and send message to them
def apply(): Behavior[CreateBranch] = {
Behaviors.receive { (context, message) =>
val child1 = context.spawn(Child(), "child1")
val child2 = context.spawn(Child(), "child2")
val child3 = context.spawn(Child(), "child3")
context.children.foreach {
child ! Child.SomeMessage
{
}
}
Child actor defined as:
object Child {
sealed trait SomeMessage
def apply():Behavior[SomeMessage] = {
// some behaviour
{
}
But expression "context.children" does not work: children defined as Iterable[ActorRef[Nothing]] instead of Iterable[ActorRef[Child.SomeMessage]]
In documentation about "context.children" ActorContext.children
"this is not a useful way to lookup children when the purpose is to send messages to them."
And this document suggest alternative way of realisation - but it about of 30 lines of code!
Is there any other alternative way to realise broadcast message send ?
There is no way in general to guarantee the static type of the child actors (and the same is true of the parent actor and the guardian actor (context.system)).
However, if you're absolutely sure that all children will accept the message you can
context.children.foreach { child =>
child.unsafeUpcast[Child.SomeMessage] ! Child.SomeMessage
}
As the name implies, this is unsafe and allows you to send messages without any guarantee that the child will handle them. Since the implementation of an Akka Typed Behavior involves a classic actor which performs message casts, a mismatch between Behavior and ActorRef can cause the actor to immediately fail. unsafeUpcast is (at least from the sender's perspective...) safer than an asInstanceOf cast, as this at least guarantees that you're actually calling ! on an ActorRef.
You asked "Is there any other alternative way to realise broadcast message send ?".
Levi's answer is the most technically correct. But I think the "real" answer is not to use context.children. This complication is caused by the type system trying to protect you from something that is a very legitimate concern. That collection contains ALL children. And there are lots of times I might spawn a child just to monitor a timeout, receive a single message, or some other task that might not understand that message.
If you want to send a specific message to child actors, as was pointed out in Levi's answer, you probably don't really mean "all child actors", you mean "all child actors who understand this message". And although there are some other ways you could do this, I always find it easiest just to maintain your own collection of ActorRefs for an actor of a particular type. If I have "worker actors" I maintain my own sequence and then the message types just work automagically.
My experience is that you should really only use context.children for system-type operations. But anything else, I maintain my own collection. That way if I create a new type of child actor at a later date, I don't break any assumptions I'm making about the type of child actors and what messages they can receive.

Service Fabric, determine if specific actor exists

We are using Azure Service Fabric and are using actors to model specific devices, using the id of the device as the ActorId. Service Fabric will instantiate a new actor instance when we request an actor for a given id if it is not already instantiated, but I cannot seem to find an api that allows me to query if a specific device id already has an instantiated actor.
I understand that there might be some distributed/timing issues in obtaining the point-in-time truth but for our specific purpose, we do not need a hard realtime answer to this but can settle for a best guess. We would just like to, in theory, contact the current primary for the specific partition resolved by the ActorId and get back whether or not the device has an instantiated actor.
Ideally it is a fast/performant call, essentially faster than e.g. instantiating the actor and calling a method to understand if it has been initialized correctly and is not just an "empty" actor.
You can use the ActorServiceProxy to iterate through the information for a specific partition but that does not seem to be a very performant way of obtaining the information.
Anyone with insights into this?
The only official way you can check if the actor has been activated in any Service Partition previously is using the ActorServiceProxy query, like described here:
IActorService actorServiceProxy = ActorServiceProxy.Create(
new Uri("fabric:/MyApp/MyService"), partitionKey);
ContinuationToken continuationToken = null;
do
{
PagedResult<ActorInformation> page = await actorServiceProxy.GetActorsAsync(continuationToken, cancellationToken);
var actor = page.Items.FirstOrDefault(x => x.ActorId == idToFind);
continuationToken = page.ContinuationToken;
}
while (continuationToken != null);
By the nature of SF Actors, they are virtual, that means they always exist, even though you didn't activated then previously, so it make a bit harder to do this check.
As you said, it is not performant to query all actors, so, the other workarounds you could try is:
Store the IDs in a Reliable Dictionary elsewhere, every time an Actor is activated you raise an event and insert the ActorIDs in the Dictionary if not there yet.
You can use the OnActivateAsync() actor event to notify it's creation, or
You can use the custom actor factory in the ActorService to register actor activation
You can store the dictionary in another actor, or another StatefulService
Create a property in the actor that is set by the actor itself when it is activated.
The OnActivateAsync() check if this property has been set before
If not set yet, you set a new value and store in a variable (a non persisted value) to say the actor is new
Whenever you interact with actor you set this to indicate it is not new anymore
The next activation, the property will be already set, and nothing should happen.
Create a custom IActorStateProvider to do the same as mentioned in the option 2, instead of handle it in the actor it will handle a level underneath it. Honestly I think it is a bit of work, would only be handy if you have to do the same for many actor types, the option 1 and 2 would be much easier.
Do as Peter Bons Suggested, store the ActorID outside the ActorService, like in a DB, I would only suggest this option if you have to check this from outside the cluster.
.
The following snipped can help you if you want to manage these events outside the actor.
private static void Main()
{
try
{
ActorRuntime.RegisterActorAsync<NetCoreActorService>(
(context, actorType) => new ActorService(context, actorType,
new Func<ActorService, ActorId, ActorBase>((actorService, actorId) =>
{
RegisterActor(actorId);//The custom method to register the actor if new
return (ActorBase)Activator.CreateInstance(actorType.ImplementationType, actorService, actorId);
})
)).GetAwaiter().GetResult();
Thread.Sleep(Timeout.Infinite);
}
catch (Exception e)
{
ActorEventSource.Current.ActorHostInitializationFailed(e.ToString());
throw;
}
}
private static void RegisterActor(ActorId actorId)
{
//Here you will put the logic to register elsewhere the actor creation
}
Alternatively, you could create a stateful DeviceActorStatusActor which would be notified (called) by DeviceActor as soon as it's created. (Share the ActorId for correlation.)
Depending on your needs you can also register multiple Actors with the same status-tracking actor.
You'll have great performance and near real-time information.

Spray.io - delegate processing to another actor(s)

I implement a REST service using Spray.io framework. Such service must receive some "search" queries, process them and send result back to the client(s). The code that perfrom searching located in separate actor - SearchActor, so after receiving (JSON) query from user, i re-send (using ask pattern) this query to my SearchActor. But what i don't really understand it's how i must implement interaction between spray.io route actor and my SearchActor.
I see here several variants but which one is more correct and why?
Create one instance of SearchActor at startup and send every request to this actor
For every request create new instance of SearchActor
Create pool of SearchActor actors at startup and send requests to this pool
You're not forced to use the ask pattern. In fact, it will create a thread for each of your request and this is probably not what you want. I would recommend that you use a tell instead. You do this by spawning a new Actor for each request (less expensive than a thread), that has the RequestContext in one of its constructor fields. You will use this context to give the response back, typically with its complete method.
Example code.
class RESTActor extends HttpService {
val route = path("mypath") ~ post {
entity(as[SearchActor.Search]) { search => ctx =>
SearchActor(ctx) ! search
}
}
}
case class SearchActor(ctx: RequestContext) {
def receive = {
case msg: Search => //... search process
case msg: Result => ctx.complete(msg) // sends back reply
}
}
Variant #1 is out of question after the initial implementation - you would want to scale out, so single blocking actor is bad.
Variants #2 and #3 are not very different - creation of new actor is cheap and has minimal overhead. As your actors may die often (i.e. backend is not available), i would say that #2 is the way to go.
Concrete implementation idea is shown at http://techblog.net-a-porter.com/2013/12/ask-tell-and-per-request-actors/

Akka and Actor behavior interface

I just start trying myself out with Scala. I grow confident enough to start refactoring an ongoing multi-threaded application that i have been working on for about a year and half.
However something that somehow bother and i can't somehow figure it out, is how to expose the interface/contract/protocole of an Actor? In the OO mode, i have my public interface with synchronized method if necessary, and i know what i can do with that object. Now that i'm going to use actor, it seems that all of that won't be available anymore.
More specifically, I a KbService in my app with a set of method to work with that Kb. I want to make it an actor on its own. For that i need to make all the method private or protected given that now they will be only called by my received method. Hence the to expose the set of available behavior.
Is there some best practice for that ?
To tackle this issue I generally expose the set of messages that an Actor can receive via a "protocol" object.
class TestActor extends Actor {
def receive = {
case Action1 => ???
case Action2 => ???
}
}
object TestActorProtocol {
case object Action1
case object Action2
}
So when you want to communicate with TestActor you must send it a message from its protocol object.
import example.TestActorProtocol._
testActorRef ! TestActorProtocol.Action1
It can become heavy sometimes, but at least there is some kind of contract exposed.
Hope it helps

Actor-based webservice - How to do it properly?

In the past few months, me and my colleagues have successfully built a server-side system for dispatching push notifications to iPhone devices. Basically, a user registers for these notifications via a RESTful webservice (Spray-Server, recently updated to use Spray-can as the HTTP layer), and the logic schedules one or multiple messages for dispatch in the future, using Akka's scheduler.
This system, as we built it, simply works: it can handle hundreds, maybe even thousands of HTTP requests a second, and can send out notifications at a rate of 23,000 per second - possibly even more if we reduce log output, add multiple notification sender actors (and thus more connections with Apple), and there might be some optimization to be done in the Java library we use (java-apns).
This question is about how to do it Right(tm). My colleague, much more knowledgeable about Scala and actor-based systems in general, noted how the application isn't a 'pure' actor-based system - and he's right. What I'm wondering now is how to do it Right.
At the moment, we have a single Spray HttpService actor, not subclassed, that is initialized with a set of directives that outlines our HTTP service logic. Currently, very much simplified, we have directives like this:
post {
content(as[SomeBusinessObject]) { businessObject => request =>
// store the business object in a MongoDB back-end and wait for the ID to be
// returned; we want to send this back to the user.
val businessObjectId = persister !! new PersistSchedule(businessObject)
request.complete("/businessObject/%s".format(businessObjectId))
}
}
Now, if I get this right, 'waiting for a response' from an actor is a no-no in actor-based programming (plus the !! is deprecated). What I believe is the 'correct' way to do it is to pass the request object over to the persister actor in a message, and have it call request.complete as soon as it's received a generated ID from the back-end.
I have rewritten one of the routes in my application to do just this; in the message that is sent to the actor, the request object / reference is also sent. This seems to work like it's supposed to:
content(as[SomeBusinessObject]) { businessObject => request =>
persister ! new PersistSchedule(request, businessObject)
}
My main concern here is that we seem to pass the request object to the 'business logic', in this case the persister. The persister now gets additional responsibility, i.e. call request.complete, and knowledge about what system it runs in, i.e. that it's part of a webservice.
What would be the correct way to handle a situation like this, so that the persister actor becomes unaware of it being part of a http service, and doesn't need to know how to output the generated ID?
I'm thinking that the request should still be passed to the persister actor, but instead of the persister actor calling request.complete, it sends a message back to the HttpService actor (a SchedulePersisted(request, businessObjectId) message), which simply calls request.complete("/businessObject/%s".format(businessObjectId)). Basically:
def receive = {
case SchedulePersisted(request, businessObjectId) =>
request.complete("/businessObject/%s".format(businessObjectId))
}
val directives = post {
content(as[SomeBusinessObject]) { businessObject => request =>
persister ! new PersistSchedule(request, businessObject)
}
}
Am I on the right track with this approach?
A smaller secondary spray-server specific question, is it okay to subclass HttpService and override the receive method, or will I break things that way? (I have no clue about subclassing actors, or how to pass unrecognized messages to the 'parent' actor)
Final question, is passing the request object / reference around in actor messages that may pass throughout the entire application an okay approach, or is there a better way to 'remember' what request should be sent a response after flowing the request through the application?
In regards to your first question, yes, you are on the right track. (Although I would also like to see some alternative ways to handle this sort of issue).
One suggestion I have is to insulate the persister actor from knowing about requests at all. You can pass the request as an Any type. Your matcher in your service code can automagically cast the cookie back into a Request.
case class SchedulePersisted(businessObjectId: String, cookie: Any)
// in your actor
override def receive = super.receive orElse {
case SchedulePersisted(businessObjectId, request: Request) =>
request.complete("/businessObject/%s".format(businessObjectId))
}
In regards to your second question, actor classes are really no different than regular classes. But you do need to make sure you call the superclass's receive method, so that it can handle its own messages. I had some other ways of doing this in my original answer, but I think I prefer chaining partial functions like this:
class SpecialHttpService extends HttpService {
override def receive = super.receive orElse {
case SpecialMessage(x) =>
// handle special message
}
}
You could also use the produce directive. It allows you to decouple the actual marshalling from the request completion:
get {
produce(instanceOf[Person]) { personCompleter =>
databaseActor ! ShowPersonJob(personCompleter)
}
}
The produce directive in this example extracts a function Person => Unit that you can use to complete the request transparently deep within the business logic layer, which should not be aware of spray.
https://github.com/spray/spray/wiki/Marshalling-Unmarshalling