Does Service Fabric keep actor information indefinitely? - azure-service-fabric

For GDPR reasons, I need to ensure that I don't store customer data which I no longer need. When looking at Service Fabric Actors, I am uncertain what "garbage collection" really means.
[StatePersistence(StatePersistence.Persisted)]
internal class Actor1 : Actor, IActor1
{
    public Actor1(ActorService actorService, ActorId actorId)
        : base(actorService, actorId)
    {
    }
    public Task PingAsync(CancellationToken cancellationToken)
    {
        return Task.CompletedTask;
    }
}
IActorService actorServiceProxy = ActorServiceProxy.Create(
    new Uri("fabric:/MyApp/MyActorService"), partitionKey);
// ...
do
{
    PagedResult<ActorInformation> page = await actorServiceProxy.GetActorsAsync(continuationToken, cancellationToken);
    // ...
When enumerating actor instances in a test environment, it seemed to me like actor information was kept for at least 2 months, even though the actors did not have any stored state.
I found multiple articles mentioning that I will need to delete the actors manually if they have leftover state, but in my case the only "state" would be the fact that the actorId "exists". If I were to use something sensitive like a user email address as an actorId, would Service Fabric ever delete the information about the actorId by itself?

Garbage collection (in this context) means that an Actor object is removed from memory to free up resources. If the Actor has StatePersistence.Persisted, its state is written to disk on each replica of the underlying ActorService. Even if you're not explicitly storing anything in the StateManager, a record of the Actor (using ActorId as key) will exist.
It's up to you as a developer to manage the lifecycle of Actor state. Deleting an Actor explicitly also deletes its state.
Garbage collection of deactivated actors only cleans up the actor
object, but it does not remove data that is stored in an actor's State
Manager. When an actor is reactivated, its data is again made
available to it through the State Manager. In cases where actors store
data in State Manager and are deactivated but never reactivated, it
may be necessary to clean up their data.
More info here.
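As a minimal sketch of explicit deletion: the service URI is taken from the question, while the class name `ActorCleanup` and the helper `ForgetActorAsync` are hypothetical names introduced only for illustration. It assumes the actor ID is a string (for example the email address mentioned in the question).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Actors;
using Microsoft.ServiceFabric.Actors.Client;

public static class ActorCleanup
{
    // Hypothetical helper: asks the actor service to delete one actor.
    // Explicit deletion is what removes the persisted state for that ActorId.
    public static async Task ForgetActorAsync(string actorId, CancellationToken cancellationToken)
    {
        var id = new ActorId(actorId);

        // Creating the proxy from the ActorId routes the call to the
        // partition that hosts this actor.
        IActorService actorService = ActorServiceProxy.Create(
            new Uri("fabric:/MyApp/MyActorService"), id);

        await actorService.DeleteActorAsync(id, cancellationToken);
    }
}
```

Something has to decide when a given ID is no longer needed; the runtime does not make that decision on its own.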

Related

Creating children actors in Akka Typed persistent actors

Let’s assume an application implemented using Akka Typed has a persistent actor. This persistent actor as part of its operations creates transient (or non-persistent) children actors, each child has a unique ID and these IDs are part of the persisted state. The persistent actor also needs some way of communicating with its children, but we don’t want to persist children’s ActorRefs as they aren’t really part of the state. On recovery the persistent actor should recreate its children based on the recovered state. It doesn’t sound like a very unusual use case, I’m trying to figure out what’s the cleanest way of implementing it. I could create the children actors inside the andThen Effect in my command handler which is meant for side effects, but then there’s no way to save the child’s ActorRef from there. That seems to be a more general characteristic of the typed Persistence API - it’s very hard to have non-persistent state in persistent actors (which could be used for storing the transient children ActorRefs in this case). One solution I came up with is having a sort of “proxy” actor for creating children, keeping a map of IDs and ActorRefs, and forwarding messages based on IDs. The persistent actor holds a reference to that proxy actor and contacts it every time it needs to create a new child or send something to one of the existing children. I have mixed feelings about it though and would appreciate if somebody can point me to a better solution.
If you are not using snapshots then the persistence mechanism does not store the State object, it stores the sequence of Events that led to that State object. On recovery it simply re-plays those Events in the order in which they happened and your eventHandler will return a modified State object that reflects the effect of each event.
This means that the State object can contain values that are not themselves persisted but are just set by the processing of certain Events. They are, in effect, cached values derived from the persistent values in the State.
In your case the operation that causes the creation of a transient actor will be captured as an Event on the actor. So you can create the transient actor in the eventHandler and put the ActorRef in the new State object. When the actor is recovered it will replay that event and your actor will re-create the transient actor.
If you are using snapshots then I don't think there is a requirement that the snapshot object is the same type as your State object, so you can snapshot the state without the ActorRefs and re-create them when you get the SnapshotOffer message.
It's a design goal of typed persistence that the State be fully recoverable from the events (or from a snapshot and the events since that snapshot).
In general, the only way to have state that's non-persistent is to wrap the EventSourcedBehavior in a Behaviors.setup block which sets up the state. One option for this is some sort of mutable state in setup (e.g. a var holding an immutable collection, or a val holding a mutable collection, but preferably not both), which the command/event/recovery handlers manipulate.
A much more immutable alternative is to define an immutable fixture in setup, which includes a child actor which was spawned in setup to manage non-persistent state. You can also put things like the entity ID or other things that are immutable for at least this incarnation of the entity into the fixture.

Service Fabric actors auto delete

In a Service Fabric app, I need to create thousands of stateful Actors, so I need to avoid accumulating Actors once they are no longer useful.
I know I can't delete an Actor from the Actor itself, but I don't want to keep track of Actors and loop to delete them.
The Actors runtime uses garbage collection to remove the deactivated Actor objects (but not their state); so, I was thinking about removing Actor state inside the OnDeactivateAsync() method and letting the GC deallocate the Actor object after the usual 60 min.
In theory, something like this should be equivalent to deleting the Actor, shouldn't it?
protected override async Task OnDeactivateAsync()
{
    await this.StateManager.TryRemoveStateAsync("MyState");
}
Is there anything remaining that only explicit deletion can remove?
According to the docs, you shouldn't change the state from OnDeactivateAsync.
If you need your Actor to not keep persisted state, you can use attributes to change the state persistence behavior:
No persisted state: State is not replicated or written to disk. This
level is for actors that simply don't need to maintain state reliably.
[StatePersistence(StatePersistence.None)]
class MyActor : Actor, IMyActor
{
}
Finally, you can use the ActorService to query Actors, see if they are inactive, and delete them.
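The query-and-delete approach could look roughly like the sketch below. `InactiveActorSweeper`, `DeleteInactiveAsync`, and the `serviceUri` and `partitionKey` parameters are hypothetical names for illustration. It only covers one partition, so it would need to be repeated for each partition of the actor service. Note that "inactive" only means the actor object is not currently in memory, which by itself does not prove the actor is no longer needed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Actors;
using Microsoft.ServiceFabric.Actors.Client;
using Microsoft.ServiceFabric.Actors.Query;

public static class InactiveActorSweeper
{
    // Hypothetical helper: deletes every actor in one partition that is
    // not currently active, and returns how many were deleted.
    public static async Task<int> DeleteInactiveAsync(
        Uri serviceUri, long partitionKey, CancellationToken cancellationToken)
    {
        IActorService actorService = ActorServiceProxy.Create(serviceUri, partitionKey);

        // Collect IDs first and delete afterwards, so that paging is not
        // affected by the deletions.
        var inactive = new List<ActorId>();
        ContinuationToken continuationToken = null;
        do
        {
            PagedResult<ActorInformation> page =
                await actorService.GetActorsAsync(continuationToken, cancellationToken);
            inactive.AddRange(page.Items.Where(a => !a.IsActive).Select(a => a.ActorId));
            continuationToken = page.ContinuationToken;
        }
        while (continuationToken != null);

        foreach (ActorId id in inactive)
        {
            await actorService.DeleteActorAsync(id, cancellationToken);
        }

        return inactive.Count;
    }
}
```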
TL;DR There are some additional resources you can free yourself (reminders) and some that only explicit deletion can remove because they are not publicly accessible.
The Service Fabric Actor repo is available on GitHub. I am using the persistent storage model, which seems to use KvsActorStateProvider behind the scenes, so I'll base the answer on that. There is a series of calls that starts at IActorService.DeleteActorAsync and continues over to IActorManager.DeleteActorAsync. A lot happens in there, including a call to the state provider to remove the state part of the actor. The core code that handles this is here, and it seems to remove not only the state but also reminders and some internal actor data. In addition, if you are using actor events, all event subscribers are unsubscribed for your actor.
If you really want delete-like behavior without calling the actor runtime, I guess you could register a reminder that would delete the state and unregister itself plus other reminders.

Different use case for akka cluster aware router & akka cluster sharding?

Cluster aware router:
val router = system.actorOf(ClusterRouterPool(
  RoundRobinPool(0),
  ClusterRouterPoolSettings(
    totalInstances = 20,
    maxInstancesPerNode = 1,
    allowLocalRoutees = false,
    useRole = None
  )
).props(Props[Worker]), name = "router")
Here, we can send a message to the router, and the router will pass it on to one of a series of remote routee actors.
Cluster sharding (not considering persistence):
class NewShoppers extends Actor {
  ClusterSharding(context.system).start(
    "shardshoppers",
    Props(new Shopper),
    ClusterShardingSettings(context.system),
    Shopper.extractEntityId,
    Shopper.extractShardId
  )
  def proxy = {
    ClusterSharding(context.system).shardRegion("shardshoppers")
  }
  override def receive: Receive = {
    case msg => proxy forward msg
  }
}
Here, we can send a message to the proxy, and it will be forwarded to one of a series of sharded actors (a.k.a. entities).
So, my question is: it seems both methods can distribute tasks across a lot of actors. What is the design rationale behind each of them? Which situation calls for which choice?
The pool router is for when you just want to send some work to whatever node and have some processing happen; two messages sent in sequence will likely not end up in the same actor for processing.
Cluster sharding is for when you have a unique id on each actor of some kind, and you have too many of them to fit in one node, but you want every message with that id to always end up in the actor for that id. For example modelling a User as an entity, you want all commands about that user to end up with the user but you want the actor to be moved if the cluster topology changes (remove or add nodes) and you want them reasonably balanced across the existing nodes.
Credit to johanandren and the linked article as basis for the following answer:
Both a router and sharding distribute work. Sharding is required if, additionally to load balancing, the recipient actors have to reliably manage state that is directly associated with the entity identifier.
To recap, the entity identifier is a key, derived from the message being sent, that determines the message's recipient actor in the cluster.
First of all, can you manage state associated with an identifier across different nodes using a consistently hashing router? A Consistent Hash router will always send messages with an equal identifier to the same target actor. The answer is: No, as explained below.
The hash-based method stops working when nodes in the cluster go Down or come Up, because this changes the associated actor for some identifiers. If a node goes down, messages that were associated with it are now sent to a different actor in the network, but that actor is not informed about the former state of the actor which it is now replacing. Likewise, if a new node comes up, it will take care of messages (identifiers) that were previously associated with a different actor, and neither the new node nor the old node is informed about this.
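The remapping effect can be illustrated with a small, library-free sketch, written here in C#. It is a deliberate simplification: `Route` uses plain modulo hashing as a hypothetical stand-in for a hash-based router, whereas a real consistent-hashing ring moves fewer keys when the routee count changes. The point is only that some identifiers end up at a different routee, which knows nothing about their earlier state.

```csharp
using System;
using System.Linq;

public static class HashRoutingDemo
{
    // Hypothetical stand-in for a hash-based router: maps a key to a routee index.
    public static int Route(int key, int routeeCount) => key % routeeCount;

    // Counts the keys in [0, keyCount) whose routee differs between two cluster sizes.
    public static int CountMoved(int keyCount, int before, int after) =>
        Enumerable.Range(0, keyCount).Count(k => Route(k, before) != Route(k, after));

    public static void Main()
    {
        // Growing from 3 to 4 routees: 9 of the 12 keys change routee.
        Console.WriteLine(CountMoved(12, 3, 4)); // prints 9
    }
}
```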
With sharding, on the other hand, the actors that are created are aware of the entity identifier that they manage. Sharding will make sure that there is exactly one actor managing the entity in the cluster. And it will re-create sharded actors on a different node if their parent node goes down. So using persistence they will retain their (persisted) state across nodes when the number of nodes changes. You also don't have to worry about concurrency issues if an actor is re-created on a different node thanks to Sharding. Furthermore, if a message with a new entity identifier is encountered, for which an actor does not exist yet, a new actor is created.
A consistently hashing router may still be of use for caching, because messages with the same key generally do go to the same actor. To manage a stateful entity that exists only once in the cluster, Sharding is required.
Use routers for load balancing, use Sharding for managing stateful entities in a distributed manner.

Is removing all actor state eventually the same as deleting the actor?

I'm wondering whether there is anything stored/managed in Service Fabric for a non-activated actor without persistent state?
Let's say that an actor instance has the following life cycle:
Actor is activated for the first time.
Actor saves state (persistent and replicated).
Actor removes all saved state.
Actor is deactivated (GC).
Is there anything left now? Is it like we would have deleted it instead?
If you call IActorService.GetActorsAsync you will still get that actor in the list, so yes, something (a marker value) is left in the storage provider. If the StatePersistence is not set to Persisted, like all other state it may get lost if you turn off the machines, for example.

Cleaning up dormant actors in Azure Service Fabric

I'm evaluating Service Fabric for an IoT-style application using the model that each device has its own actor, along with other actors in the system. I understand that inactive actors will be garbage-collected automatically but their state will persist for when they are reactivated. I also see there is a way to explicitly delete an actor and its state.
In my scenario I'm wondering if there are any patterns or recommendations on how to handle devices that go dormant, fail or "disappear" and never send another message. Without an explicit delete their state will persist forever and I would like to clean it up automatically, e.g.: after six months.
Here's a method that works.
private async Task Kill()
{
    // Do other required cleanup
    var actorToDelete = ActorServiceProxy.Create(ServiceUri, Id);
    await actorToDelete.DeleteActorAsync(Id, CancellationToken.None).ConfigureAwait(false);
}
Then just call this method using the following line:
var killTask = Task.Run(Kill);
This will spin up a new thread that references the actor, which will be blocked until the current turn has ended. When the task finally receives access to the actor, it will delete it. The beauty is that this can be called within the actor itself, meaning they can "self-delete".
You'll have to do this kind of clean-up yourself by writing a "clean-up" service that periodically checks for dormant actors and deletes them. The actor framework doesn't keep track of last deactivated time, so your individual actors will have to do that (which is easy enough, you have an OnDeactivate event that you can override in your actor class and save a timestamp there).
This clean-up service can be your actor service itself even, where you can implement RunAsync and do periodic clean-up work there.
Actors have a method OnPostActorMethodAsync which is called after every actor method is invoked (unless the method throws an exception, but I believe that's a bug). You could schedule a "kill me" reminder in that method to fire after X period of time. Every time an actor method is called, that time gets pushed back. When the "kill me" reminder finally does fire, simply delete all the actor's state and unregister any reminders. Eventually SF will kick it out of memory, and at that point, I believe the actor has essentially been deleted (not in memory, no persisted state).
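A rough sketch of the "kill me" reminder idea follows. The interface `IDeviceActor`, the method `ReportAsync`, the state name `lastReading`, the reminder name `kill-me`, and the 30-day idle window are all assumptions made for illustration. It relies on re-registering a reminder under the same name replacing the earlier registration, so the due time is pushed back on each call. Another answer in this thread reports that an actor with no remaining state still shows up in GetActorsAsync, so this may not be fully equivalent to DeleteActorAsync.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Actors;
using Microsoft.ServiceFabric.Actors.Runtime;

// Hypothetical actor interface for this sketch.
public interface IDeviceActor : IActor
{
    Task ReportAsync(string reading);
}

[StatePersistence(StatePersistence.Persisted)]
internal class DeviceActor : Actor, IDeviceActor, IRemindable
{
    private const string KillReminder = "kill-me";                     // assumed name
    private static readonly TimeSpan MaxIdle = TimeSpan.FromDays(30);  // assumed idle window

    public DeviceActor(ActorService actorService, ActorId actorId)
        : base(actorService, actorId)
    {
    }

    public Task ReportAsync(string reading) =>
        this.StateManager.SetStateAsync("lastReading", reading);

    // Only interface calls count as activity; the guard is meant to keep the
    // reminder callback itself from re-arming the reminder.
    protected override async Task OnPostActorMethodAsync(ActorMethodContext actorMethodContext)
    {
        if (actorMethodContext.CallType == ActorCallType.ActorInterfaceMethod)
        {
            // One-shot reminder: a period of -1 ms disables periodic firing.
            await this.RegisterReminderAsync(
                KillReminder, null, MaxIdle, TimeSpan.FromMilliseconds(-1));
        }
    }

    public async Task ReceiveReminderAsync(
        string reminderName, byte[] state, TimeSpan dueTime, TimeSpan period)
    {
        if (reminderName != KillReminder)
        {
            return;
        }

        // Remove every named state entry this actor has stored.
        IEnumerable<string> names = await this.StateManager.GetStateNamesAsync();
        foreach (string name in names)
        {
            await this.StateManager.TryRemoveStateAsync(name);
        }

        await this.UnregisterReminderAsync(this.GetReminder(KillReminder));
    }
}
```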