I'm evaluating Service Fabric for an IoT-style application using the model that each device has its own actor, along with other actors in the system. I understand that inactive actors will be garbage-collected automatically but their state will persist for when they are reactivated. I also see there is a way to explicitly delete an actor and its state.
In my scenario I'm wondering if there are any patterns or recommendations for handling devices that go dormant, fail or "disappear" and never send another message. Without an explicit delete their state will persist forever, and I would like to clean it up automatically, e.g. after six months.
Here's a method that works.
private async Task Kill()
{
    // Do other required cleanup here.

    // Get a proxy to the actor service hosting this actor, then ask it to delete this actor.
    var actorToDelete = ActorServiceProxy.Create(ServiceUri, Id);
    await actorToDelete.DeleteActorAsync(Id, CancellationToken.None).ConfigureAwait(false);
}
Then just call this method using the following line:
var killTask = Task.Run(Kill);
This runs the delete on a thread-pool thread, and the delete request is blocked until the current actor turn has ended. When the request finally gets access to the actor, it deletes it. The beauty is that this can be called from within the actor itself, meaning actors can "self-delete".
You'll have to do this kind of clean-up yourself by writing a "clean-up" service that periodically checks for dormant actors and deletes them. The actor framework doesn't keep track of the last deactivation time, so your individual actors will have to record it themselves (which is easy enough: you can override OnDeactivateAsync in your actor class and save a timestamp there).
This clean-up service can even be your actor service itself: override RunAsync and do the periodic clean-up work there.
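For what it's worth, here is a rough sketch of what such an in-service sweep could look like. It assumes each actor stores a UTC DateTime under a state name I made up ("LastSeenUtc"), and the class name, the 180-day limit and the daily sweep interval are all placeholders. It needs the Microsoft.ServiceFabric.Actors package and I haven't run it against a real cluster, so treat it as a starting point rather than a tested implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Fabric;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Actors;
using Microsoft.ServiceFabric.Actors.Query;
using Microsoft.ServiceFabric.Actors.Runtime;

// Hypothetical actor service that doubles as the clean-up service.
public class CleanupActorService : ActorService
{
    private const string LastSeenKey = "LastSeenUtc";                        // assumed state name
    private static readonly TimeSpan MaxIdle = TimeSpan.FromDays(180);       // assumed limit
    private static readonly TimeSpan SweepInterval = TimeSpan.FromHours(24); // assumed interval

    public CleanupActorService(StatefulServiceContext context, ActorTypeInformation actorTypeInfo)
        : base(context, actorTypeInfo)
    {
    }

    // Pure helper so the expiry rule can be checked on its own.
    public static bool IsStale(DateTime lastSeenUtc, DateTime nowUtc, TimeSpan maxIdle)
        => nowUtc - lastSeenUtc > maxIdle;

    protected override async Task RunAsync(CancellationToken cancellationToken)
    {
        await base.RunAsync(cancellationToken);
        var self = (IActorService)this;

        while (!cancellationToken.IsCancellationRequested)
        {
            // First pass: page through this partition's actors and collect the stale ones.
            var stale = new List<ActorId>();
            ContinuationToken token = null;
            do
            {
                PagedResult<ActorInformation> page = await self.GetActorsAsync(token, cancellationToken);
                foreach (ActorInformation info in page.Items)
                {
                    if (info.IsActive) continue; // still in memory, leave it alone
                    if (!await StateProvider.ContainsStateAsync(info.ActorId, LastSeenKey, cancellationToken)) continue;
                    DateTime lastSeen = await StateProvider.LoadStateAsync<DateTime>(info.ActorId, LastSeenKey, cancellationToken);
                    if (IsStale(lastSeen, DateTime.UtcNow, MaxIdle)) stale.Add(info.ActorId);
                }
                token = page.ContinuationToken;
            } while (token != null);

            // Second pass: delete after enumeration so paging is not disturbed.
            foreach (ActorId id in stale)
                await self.DeleteActorAsync(id, cancellationToken);

            await Task.Delay(SweepInterval, cancellationToken);
        }
    }
}
```

You would plug this in when registering the actor type, e.g. ActorRuntime.RegisterActorAsync<MyActor>((context, actorType) => new CleanupActorService(context, actorType)), where MyActor stands for your own actor class. RunAsync only runs on the primary replica of each partition, so each partition sweeps its own actors.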
Actors have a method OnPostActorMethodAsync which is called after every actor method is invoked (unless the method throws an exception, but I believe that's a bug). You could schedule a "kill me" reminder in that method to fire after X period of time. Every time an actor method is called, that time gets pushed back. When the "kill me" reminder finally does fire, simply delete all the actor's state and unregister any reminders. Eventually SF will kick it out of memory, and at that point I believe the actor has essentially been deleted (not in memory, no persisted state).
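A minimal sketch of that idea, assuming a made-up IDeviceActor interface, a reminder name of "kill-me" and a 180-day idle timeout (none of these come from the SDK). It relies on re-registering a reminder under the same name replacing the previous one, and on a period of -1 ms meaning "fire once". It is untested against a cluster.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Actors;
using Microsoft.ServiceFabric.Actors.Runtime;

// Hypothetical actor interface for a device.
public interface IDeviceActor : IActor
{
    Task ReportAsync(string reading);
}

[StatePersistence(StatePersistence.Persisted)]
public class DeviceActor : Actor, IDeviceActor, IRemindable
{
    public const string KillReminder = "kill-me";                         // assumed reminder name
    public static readonly TimeSpan IdleTimeout = TimeSpan.FromDays(180); // roughly six months

    public DeviceActor(ActorService actorService, ActorId actorId)
        : base(actorService, actorId)
    {
    }

    public Task ReportAsync(string reading)
        => StateManager.SetStateAsync("LastReading", reading);

    protected override async Task OnPostActorMethodAsync(ActorMethodContext actorMethodContext)
    {
        // Only real interface calls push the deadline back. This hook also runs after
        // timer and reminder callbacks, and those must not reschedule the kill.
        if (actorMethodContext.CallType != ActorCallType.ActorInterfaceMethod) return;

        // Re-registering under the same name replaces the existing reminder.
        await RegisterReminderAsync(KillReminder, null, IdleTimeout, TimeSpan.FromMilliseconds(-1));
    }

    public async Task ReceiveReminderAsync(string reminderName, byte[] state, TimeSpan dueTime, TimeSpan period)
    {
        if (reminderName != KillReminder) return;

        // Copy the names first so state is not removed while enumerating.
        var names = (await StateManager.GetStateNamesAsync()).ToList();
        foreach (string name in names)
            await StateManager.TryRemoveStateAsync(name);

        await UnregisterReminderAsync(GetReminder(KillReminder));
    }
}
```

Two caveats with this sketch: registering a reminder after every call is an extra persisted write per call, and if the actor has other reminders they need to be unregistered by name as well.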
The Service Fabric documentation states that:
Actors may receive duplicate messages from the same client.
Does this hold for reminders as well? If I set a single reminder for my actor instance, could it be called twice at the same time?
My team submitted a similar question to Service Fabric support, and this was their response...
"If there is a failover (i.e. current primary becomes secondary or primary process crashes) while the 'ReceiveReminderAsync()' callback is executing, or failover kicks in after 'ReceiveReminderAsync()' completes but before ActorRuntime does automatic save state and notes down completion, on the new primary this reminder will fire again immediately.
Note that in this scenario, as the new primary comes up and invokes the reminder, the reminder callback in the previous primary may still be executing (and will eventually fail to make any local state changes as the replica has become secondary)."
This behavior seems entirely consistent with why a public actor method would be invoked twice.
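Given that, one way to cope is to make the reminder callback idempotent. The sketch below is only an illustration under my own assumptions: IJobActor, the "done:" marker prefix and DoWorkAsync are invented names, and the job id is carried in the reminder's state bytes. It has not been run against a cluster.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Actors;
using Microsoft.ServiceFabric.Actors.Runtime;

// Hypothetical actor interface that schedules one-off jobs.
public interface IJobActor : IActor
{
    Task ScheduleAsync(Guid jobId, TimeSpan delay);
}

[StatePersistence(StatePersistence.Persisted)]
public class JobActor : Actor, IJobActor, IRemindable
{
    public JobActor(ActorService actorService, ActorId actorId)
        : base(actorService, actorId)
    {
    }

    // Name of the persisted marker that records a finished job (assumed naming scheme).
    public static string DoneKey(Guid jobId) => "done:" + jobId;

    public Task ScheduleAsync(Guid jobId, TimeSpan delay)
        // The job id travels in the reminder state; -1 ms period means fire once.
        => RegisterReminderAsync("job:" + jobId, jobId.ToByteArray(), delay, TimeSpan.FromMilliseconds(-1));

    public async Task ReceiveReminderAsync(string reminderName, byte[] state, TimeSpan dueTime, TimeSpan period)
    {
        var jobId = new Guid(state);

        // If the marker was already saved, this is a repeated firing: do nothing.
        if (await StateManager.ContainsStateAsync(DoneKey(jobId))) return;

        await DoWorkAsync(jobId);
        await StateManager.SetStateAsync(DoneKey(jobId), true);
    }

    // Hypothetical placeholder for the real work.
    private Task DoWorkAsync(Guid jobId) => Task.CompletedTask;
}
```

The marker only helps once it has been saved. If the failover happens before the state is saved, the work runs again, so any external side effects in DoWorkAsync still need to tolerate being repeated. The markers also accumulate and would need their own clean-up.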
In a Service Fabric app, I need to create thousands of stateful Actors, so I need to avoid accumulating Actors once they become useless.
I know I can't delete an Actor from the Actor itself, but I don't want to keep track of Actors and loop to delete them.
The Actors runtime uses garbage collection to remove deactivated Actor objects (but not their state), so I was thinking about removing the Actor state inside the OnDeactivateAsync() method and letting the GC deallocate the Actor object after the usual 60 minutes.
In theory, something like this should be equivalent to deleting the Actor, shouldn't it?
protected override async Task OnDeactivateAsync()
{
    // Remove the persisted state when the actor object is deactivated.
    await this.StateManager.TryRemoveStateAsync("MyState");
}
Is there anything remaining that only explicit deletion can remove?
According to the docs, you shouldn't change the state from OnDeactivateAsync.
If you need your Actor to not keep persisted state, you can use attributes to change the state persistence behavior:
No persisted state: State is not replicated or written to disk. This level is for actors that simply don't need to maintain state reliably.
[StatePersistence(StatePersistence.None)]
class MyActor : Actor, IMyActor
{
}
Finally, you can use the ActorService to query Actors, see if they are inactive, and delete them.
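A rough sketch of that last option, run from outside the actor service. The class and method names are mine; the service URI and partition key are whatever your application uses, and the call has to be repeated for every partition of the actor service. Note that "inactive" only means the actor object is not currently in memory, so on its own this is a blunt criterion. Untested against a cluster.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Actors;
using Microsoft.ServiceFabric.Actors.Client;
using Microsoft.ServiceFabric.Actors.Query;

// Hypothetical helper that deletes every inactive actor in one partition.
public static class InactiveActorSweeper
{
    public static async Task<int> DeleteInactiveAsync(Uri serviceUri, long partitionKey, CancellationToken cancellationToken)
    {
        // Proxy to the actor service partition that owns the given key.
        IActorService service = ActorServiceProxy.Create(serviceUri, partitionKey);

        // Page through all actors in the partition and collect the inactive ones.
        var inactive = new List<ActorId>();
        ContinuationToken token = null;
        do
        {
            PagedResult<ActorInformation> page = await service.GetActorsAsync(token, cancellationToken);
            inactive.AddRange(page.Items.Where(a => !a.IsActive).Select(a => a.ActorId));
            token = page.ContinuationToken;
        } while (token != null);

        // Delete after enumeration so paging is not disturbed.
        foreach (ActorId id in inactive)
            await service.DeleteActorAsync(id, cancellationToken);

        return inactive.Count;
    }
}
```

In practice you would probably combine the IsActive check with some application-level criterion (for example a last-seen timestamp) before deleting anything.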
TL;DR: There are some additional resources you can free yourself (reminders) and some that only explicit deletion can remove because they are not publicly accessible.
The Service Fabric Actor repo is available on GitHub. I am using the persistent storage model, which seems to use KvsActorStateProvider behind the scenes, so I'll base the answer on that. There is a series of calls that starts at IActorService.DeleteActorAsync and continues over to IActorManager.DeleteActorAsync. A lot happens in there, including a call to the state provider to remove the state part of the actor. The core code that handles this is here, and it seems to remove not only the state but also reminders and some internal actor data. In addition, if you are using actor events, all event subscribers are unsubscribed for your actor.
If you really want delete-like behavior without calling the actor runtime, I guess you could register a reminder that would delete the state and unregister itself plus other reminders.
I'm wondering whether there is anything stored/managed in Service Fabric for a non-activated actor without persistent state?
Let's say that an actor instance has the following life cycle:
Actor is activated for the first time.
Actor saves state (persistent and replicated).
Actor removes all saved state.
Actor is deactivated (GC).
Is there anything left now? Is it the same as if we had deleted it instead?
If you call IActorService.GetActorsAsync you will still get that actor in the list, so yes, something (a marker value) is left in the storage provider. If the StatePersistence is not set to Persisted, like all other state it may get lost if you turn off the machines, for example.
This question isn't as philosophical as the title might suggest. Consider the following approach to persistence:
Commands to perform Operations come in from various Clients. I represent both Operations and Clients as persistent actors. The Client's state is the lastOperationId to pass through. The Operation's state is pretty much an FSM of the Operation's progress (it's effectively a Saga, as it then needs to reach out to other systems external to the ActorSystem in order to move through its states).
A Reception actor receives the operation command, which contains the client id and operation id. The Reception actor creates or retrieves the Client actor and forwards it the command. The Client actor reads and validates the operation command, persists it, creates an OperationReceived event, and updates its own state with this operation id. Now it needs to create a new Operation actor to manage the new long-running operation. But here is where I get lost, and all the nice examples in the documentation and on the various blogs don't help. Most commentators say that a PersistentActor converts commands to events and then updates its state. It may also have side effects as long as they are not invoked during replay. So I have two areas of confusion:
Is the creation of an Operation actor in this context equivalent to creating state, or performing a side effect? It doesn't seem like a side effect, but at the same time it's not changing its own state, but causing a state change in a new child.
Am I supposed to construct a Command to send to the new Operation actor or will I simply forward it the OperationReceived event?
If I go with my assumption that creating a child actor is not a side effect, it means I must also create the child when replaying. This in turn would cause the state of the child to be recovered.
I hope the underlying question is clear. I feel it's a general question, but the best way I can formulate it is by giving a specific example.
Edit:
On reflection, I think that the creation of one persistent actor from another is an act of creating state, albeit outsourced. That means that the event that triggers the creation will trigger that creation on a subsequent replay (which will lead to the retrieval of the child's own persisted state). This makes me think that passing the event (rather than a wrapping command) might be the cleanest thing to do as the same event can be applied to update the state in both parent and child. There should be no need to persist the event as it comes into the child - it has already been persisted in the parent and will replay.
I am learning Scala and Akka.
In the problem I am trying to solve I want an actor to be reading a real-time data stream and perform a certain calculation that would update its state.
Every 3 seconds I send a request through a Scheduler for the actor to return its state.
I have pretty much everything implemented: my actor has a broadcaster and a receiver, and the function that updates the state is right. What I am not sure about is how to keep the calculation running. I could run the calculations continuously in a separate thread inside the actor, but I would like to know if there is a more elegant way to do this in Scala.
I would suggest dividing the work between two actors. The parent actor manages a child worker actor and tracks the state. It sends a message to the child worker actor to trigger data processing.
The child worker actor processes the data stream. Don't forget to wrap the processing in a Future so that it doesn't block the actor from processing messages. It also periodically sends messages to the parent with the current state. So the child worker is stateless; it only sends notifications when its progress changes.
If you want to know the current state of the work overall, you ask the parent. In principle, you can merge this into one actor which sends the status message to itself. I wouldn't update the state directly from inside the Future, to avoid concurrency issues: the data processing work running in the Future can possibly run on a different thread than message processing.