My application has a class ApplicationUsers that has no mutable members. Upon creation of instances, it reads the entire user database (relatively small) into an immutable collection. It has a number of methods to query the data.
I am now faced with the problem of having to create new users (or modify some of their attributes). My current idea is to use an Akka actor that, at a high level, would look like this:
class UserActor extends Actor {
  var users = new ApplicationUsers
  def receive = {
    case GetUsers => sender ! users
    case SomeMutableOperation => {
      PerformTheChangeOnTheDatabase() // does not alter users (which is immutable)
      users = new ApplicationUsers // reads the database from scratch into a new immutable instance
    }
  }
}
Is this safe? My reasoning is that it should be: whenever users is changed by SomeMutableOperation, any other threads using previous instances of users already hold a handle to an older version and should not be affected. Also, no GetUsers request will be acted upon until the new instance has been safely constructed.
Is there anything I am missing? Is my construct safe?
UPDATE: I probably should be using Agents to do this, but the question still holds: is the above safe?
You are doing it exactly right: have immutable data types and reference them via var within the actor. This way you can freely share the data and mutability is confined to the actor. The only thing to watch out for is if you reference the var from a closure which is executed outside of the actor (e.g. in a Future transformation or a Props instance). In such a case you need to make a stack-local copy:
val currentUsers = users
other ? Process(users) recoverWith { case _ => backup ? Process(currentUsers) }
In the first case you just grab the value—which is fine—but asking the backup happens from a different thread, hence the need for val currentUsers.
Looks fine to me. You don't seem to need Agents here.
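A minimal sketch of the stack-local copy advice, using only the Scala standard library rather than Akka. The names UserSnapshot, UserHolder, and lookupLater are hypothetical; the class merely stands in for an actor whose var is reassigned between messages.

```scala
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical immutable snapshot, standing in for ApplicationUsers.
final case class UserSnapshot(names: Set[String])

// Stands in for the actor: `users` is only reassigned by its owner.
class UserHolder(initial: UserSnapshot) {
  private var users: UserSnapshot = initial

  def replace(next: UserSnapshot): Unit = users = next

  // Copy the var into a local val *before* building the Future, so the
  // closure captures the snapshot and never reads the var from another thread.
  def lookupLater(name: String)(implicit ec: ExecutionContext): Future[Boolean] = {
    val currentUsers = users
    Future(currentUsers.names.contains(name))
  }
}
```

Because the closure holds currentUsers rather than users, a later call to replace cannot change what an already-started lookup sees.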
We are using Azure Service Fabric and model specific devices as actors, using the id of the device as the ActorId. Service Fabric will instantiate a new actor instance when we request an actor for a given id if it is not already instantiated, but I cannot seem to find an API that allows me to query whether a specific device id already has an instantiated actor.
I understand that there might be some distributed/timing issues in obtaining the point-in-time truth but for our specific purpose, we do not need a hard realtime answer to this but can settle for a best guess. We would just like to, in theory, contact the current primary for the specific partition resolved by the ActorId and get back whether or not the device has an instantiated actor.
Ideally it is a fast/performant call, essentially faster than e.g. instantiating the actor and calling a method to understand if it has been initialized correctly and is not just an "empty" actor.
You can use the ActorServiceProxy to iterate through the information for a specific partition but that does not seem to be a very performant way of obtaining the information.
Anyone with insights into this?
The only official way to check whether an actor has previously been activated in a service partition is to query through the ActorServiceProxy:
IActorService actorServiceProxy = ActorServiceProxy.Create(
    new Uri("fabric:/MyApp/MyService"), partitionKey);

ActorInformation found = null; // stays null if no actor in this partition has the id
ContinuationToken continuationToken = null;
do
{
    PagedResult<ActorInformation> page = await actorServiceProxy.GetActorsAsync(continuationToken, cancellationToken);
    found = page.Items.FirstOrDefault(x => x.ActorId == idToFind);
    continuationToken = page.ContinuationToken;
}
while (found == null && continuationToken != null); // stop paging once the actor is found
SF actors are virtual by nature: they always exist, even if you have never activated them, which makes this check harder.
As you said, querying all actors is not performant, so here are some other workarounds you could try:
1. Store the IDs in a Reliable Dictionary elsewhere: every time an actor is activated, raise an event and insert the ActorId into the dictionary if it is not there yet.
   - You can use the OnActivateAsync() actor event to signal its creation, or
   - you can use the custom actor factory in the ActorService to register actor activation.
   - You can store the dictionary in another actor or in another StatefulService.
2. Create a property in the actor that the actor itself sets when it is activated.
   - OnActivateAsync() checks whether this property has been set before.
   - If it has not, set a new value and also store a flag in a variable (a non-persisted value) to mark the actor as new.
   - Whenever you interact with the actor, clear that flag to indicate it is no longer new.
   - On the next activation the property will already be set, and nothing should happen.
3. Create a custom IActorStateProvider that does the same as option 2, but handles it one level below the actor instead of inside it. Honestly, I think this is a fair amount of work and would only pay off if you had to do the same for many actor types; options 1 and 2 are much easier.
4. Do as Peter Bons suggested and store the ActorId outside the ActorService, for example in a database. I would only suggest this option if you have to check this from outside the cluster.
The following snippet can help you if you want to manage these events outside the actor.
private static void Main()
{
    try
    {
        ActorRuntime.RegisterActorAsync<NetCoreActorService>(
            (context, actorType) => new ActorService(context, actorType,
                new Func<ActorService, ActorId, ActorBase>((actorService, actorId) =>
                {
                    RegisterActor(actorId); // The custom method to register the actor if new
                    return (ActorBase)Activator.CreateInstance(actorType.ImplementationType, actorService, actorId);
                })
            )).GetAwaiter().GetResult();
        Thread.Sleep(Timeout.Infinite);
    }
    catch (Exception e)
    {
        ActorEventSource.Current.ActorHostInitializationFailed(e.ToString());
        throw;
    }
}

private static void RegisterActor(ActorId actorId)
{
    // Put the logic that records the actor creation elsewhere here
}
Alternatively, you could create a stateful DeviceActorStatusActor which would be notified (called) by DeviceActor as soon as it's created. (Share the ActorId for correlation.)
Depending on your needs you can also register multiple Actors with the same status-tracking actor.
You'll have great performance and near real-time information.
I'm using the Akka libraries.
What happens when multiple actors call a function on an object? Would this block other actors from accessing the object?
The reason I ask is that I want to use JBCrypt with Akka actors. Since we can encrypt multiple strings concurrently, I have each actor calling JBcrypt.hash(...). I am not sure how this works: I think that in Scala an object exists in only one place, and I suspect that multiple actors using the same object (library) might prevent the concurrency from actually happening.
Multiple actors calling a function in an object that calls a library will not block one another unless the library being called uses a blocking concurrency control mechanism such as synchronized or an explicit object lock.
For example, concurrent calls to print on the Printer object below will block one another, because only one thread at a time can enter the synchronized method:
class BlockingPrinter() {
  def print(s: String) = synchronized { s }
}
object Printer {
  val printer = new BlockingPrinter()
  def print(str: String) = printer.print(str)
}
But calling it on the Printer object below will not:
class NonBlockingPrinter() {
  def print(s: String) = s
}
object Printer {
  val printer = new NonBlockingPrinter()
  def print(str: String) = printer.print(str)
}
In summary, the library that you're calling is the one that decides how concurrency is handled. Not the fact that you're calling an object.
It depends on how the function is implemented. If the function accesses some internal mutable state and synchronizes in order to achieve thread safety, then callers can block one another. If it is a pure function that does not access any external state, then it is safe. If the function does need mutable state, it should at least keep that state confined to itself.
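The difference between the two cases can be observed with a small standard-library sketch. ConcurrencyCheck and callsOverlap are hypothetical names, and the 500 ms wait is an arbitrary assumption: each caller announces that it is inside the call and then waits briefly for the others, which can only succeed if several threads are allowed inside at once.

```scala
import java.util.concurrent.{Callable, CountDownLatch, Executors, TimeUnit}

object ConcurrencyCheck {
  // Returns true when `parties` threads managed to be inside the call at the
  // same time; `useLock` wraps the call in synchronized, like BlockingPrinter.
  def callsOverlap(parties: Int, useLock: Boolean): Boolean = {
    val pool = Executors.newFixedThreadPool(parties)
    val inside = new CountDownLatch(parties)
    val lock = new Object

    // Each caller announces itself, then waits briefly for all the others.
    def body(): Boolean = {
      inside.countDown()
      inside.await(500, TimeUnit.MILLISECONDS)
    }
    def invoke(): Boolean = if (useLock) lock.synchronized(body()) else body()

    try {
      // Submit every task first, then collect the results.
      val results = (1 to parties).map { _ =>
        pool.submit(new Callable[Boolean] { def call(): Boolean = invoke() })
      }
      results.forall(_.get())
    } finally pool.shutdown()
  }
}
```

With useLock = false the callers should overlap; with useLock = true they are forced through one at a time, so the first callers time out waiting for the rest. The object itself plays no part in that outcome, only the lock does.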
I have following code that traverses through a list of people and calls a callback for each of them in class1.
def syncPeople(callback: Person => _) = Future {
  person.findAll(criteria).foldLeft(0L) { (count, obj) =>
    callback(obj)
    count + 1
  }
}
Callback and the call to syncPeople is in class2 and looks similar to this
def getActor(person: Person): ActorRef = {
  if (person.isMale) maleActor
  else femaleActor
}
// The actor does the actual work, which may be quite intense
def process(person: Person): Unit = {
  val workActor = getActor(person)
  workActor ! person
}
syncPeople(process)
Now I want to track the total time taken to sync all people, i.e. until the last workActor completes its work. I am using a third actor, MonitorActor, to keep track of start and end times. The MaleActor and FemaleActor can send messages to it when they finish processing an individual.
What's the best way to keep track of these spawned processes?
I explored:
- Future.sequence, but the class sending the message to the workActor is not an actor, so the future does not receive the reply.
- Keeping track of personIds as they finish, but accumulating the received messages in MonitorActor is not possible without a var, and using a var is not the preferred way of doing things.
What other ways are there to implement this?
Funny, I'm working on a very similar problem to this at the moment. The solution I would suggest is using akka-fsm which keeps track of state.
Essentially in something outside of your state object, do something like generate a Long that represents an id:
def getId(): Long = System.currentTimeMillis() / 1000L
The state object when implemented correctly is immutable, so you just keep reusing this id throughout the transaction.
I know this answer is missing a lot of the implementation details but I'm still working on the implementation myself in my own code. Hopefully after reading about akka-fsm a bit and playing with it, this answer will make sense?
Don't demonize mutable state; it is SHARED mutable state that causes most of the problems. You don't have shared mutable state inside an actor, because you always talk to ActorRefs and an actor processes only one message at a time (no race conditions and other evil stuff). What I'm saying is that it's OK to use a var, unless you spawn futures inside the actor that mutate the var, because then you are back to SHARING mutable state. FSM is another solution, as @devnulled suggested, but it sounds like overkill for your use case.
My app gets a new instance of Something via an API call on a stateless controller. After I do my mission critical stuff (like saving it to my Postgres database and committing the transaction) I would now like to do a bunch of fire-and-forget operations.
In my controller I send the model instance to the post-processor:
import _root_.com.eaio.uuid.UUID
import akka.actor.Props
// ... skip a bunch of code
play.api.libs.concurrent.Akka.system.actorOf(
  Props[MySomethingPostprocessorActor],
  name = "somethingActor"+new UUID().toString()
) ! something
The MySomethingPostprocessorActor actor looks like this:
class MySomethingPostprocessorActor extends Actor with ActorLogging {
  def receive = {
    case Something(thing, alpha, beta) => try {
      play.api.libs.concurrent.Akka.system.actorOf(
        Props[MongoActor],
        name = "mongoActor"+new UUID().toString()
      ) ! Something(thing, alpha, beta)
      play.api.libs.concurrent.Akka.system.actorOf(
        Props[PubsubActor],
        name = "pubsubActor"+new UUID().toString()
      ) ! Something(thing, alpha, beta)
      // ... and so forth
    } catch {
      case e => {
        log.error("MySomethingPostprocessorActor error=[{}]", e)
      }
    }
  }
}
So, here's what I'm not sure about:
I know Actor factories are discouraged as per the warning on this page. My remedy for this is to name each actor instance with a unique string provided by UUID, to get around the your-actor-is-not-unique errors:
play.core.ActionInvoker$$anonfun$receive$1$$anon$1:
Execution exception [[InvalidActorNameException:
actor name somethingActor is not unique!]]
Is there a better way to do the above, i.e. instead of giving everything a unique name? All examples in the Akka docs I encountered give actors a static name, which is a bit misleading.
(Any other comments are welcome too, e.g. if the bundling pattern I use is frowned upon, etc.)
As far as I'm aware, the name parameter is optional.
This may or may not be the case with Akka + Play (haven't checked). When working with standalone actor systems though, you usually only name an actor when you need that reference for later.
From the sounds of it you're tossing out these instances after using them, so you could probably skip the naming step.
Better yet, you could probably save the overhead of creating each actor instance by just wrapping your operations in Futures and using callbacks if need be: http://doc.akka.io/docs/akka/2.0.3/scala/futures.html
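A minimal sketch of the Futures suggestion, using only the Scala standard library. PostProcessing, fireAndForget, and the shape of Something are assumptions made for illustration; the real Mongo and pub/sub work would be passed in as the task functions.

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.util.control.NonFatal

object PostProcessing {
  // Hypothetical stand-in for the model instance in the question.
  final case class Something(thing: String, alpha: Int, beta: Int)

  // Starts every task on the execution context and returns immediately.
  // A failing task is reported through `onError` and does not affect the others.
  def fireAndForget(something: Something, tasks: Seq[Something => Unit])(
      onError: Throwable => Unit)(implicit ec: ExecutionContext): Seq[Future[Unit]] =
    tasks.map { task =>
      Future(task(something)).recover { case NonFatal(e) => onError(e) }
    }
}
```

A controller can ignore the returned futures for true fire-and-forget behaviour; they are returned only so that callers who care, such as tests, can wait for completion. No actor names are involved, so the uniqueness problem does not arise.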
I need to have one global variable (singleton) that will change very infrequently. Actually, it only changes when the actor restarts and reinitializes the variable. Since I cannot do this with a singleton val in a companion object, I have to declare it as a var (mutable).
object UserDatabase {
  var dbConnection = "" // initializing db connection
}
Many guidelines that I have read advise against sharing mutable state. So I moved the variable into a class and use message passing to retrieve it.
class UserDatabase extends Actor {
  val dbConnection = "" // initializing db connection locally
  def receive = { case GetConnection => self.reply(dbConnection) }
}
The problem is that dbConnection is accessed very frequently by a great many actors, and continuously sending messages will reduce performance (since Akka processes the mailbox one message at a time).
I don't see how I can do this without sacrificing performance. Any idea?
Perhaps use an Agent instead? http://akka.io/docs/akka/1.2-RC6/scala/agents.html
First of all, have you actually measured or noticed a performance reduction? Since messaging is lightweight, it may be fast enough for your application.
Then, a possible solution: if the "global" state is written rarely but read very often, you can choose a push strategy. Every time it changes, the UserDatabase actor sends the updated value to the interested actors. You can then use a publish/subscribe approach, rely on the actor registry, use a pool of actors, etc.
class UserDatabase extends Actor {
  var dbConnection = "" // initializing db connection locally
  def receive = {
    case SetConnection(newConnection) if dbConnection != newConnection => {
      dbConnection = newConnection
      sendUpdatedConnection() // sends the change to every relevant actor
    }
  }
}
If you don't need to use the variable very often in any case, it might be simpler and more efficient to make it a java.util.concurrent.atomic.AtomicReference or to wrap every access to it in a synchronized block (on the variable). Actors don't always make things easier and safer, just usually.
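A minimal sketch of the AtomicReference option. ConnectionHolder is a hypothetical name, and the connection is modelled as a String only because the question does so.

```scala
import java.util.concurrent.atomic.AtomicReference

object ConnectionHolder {
  // Hypothetical placeholder: the connection is a String, as in the question.
  private val current = new AtomicReference[String]("")

  // Lock-free read; safe to call from any thread or actor.
  def get: String = current.get()

  // Called rarely, e.g. when the owning actor restarts and reconnects.
  def set(connection: String): Unit = current.set(connection)
}
```

Readers call ConnectionHolder.get directly instead of sending a message, so frequent reads never queue up in a mailbox; the trade-off is that the value is once again globally shared state, guarded only by the atomic reference.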
Create many actors as routees of a RoundRobinRouter.
Make each actor hold a connection and handle the DB logic itself.