How to manage a user's game state using akka - scala

I am trying to figure out how to manage a user's game state using Akka.
The game state will be persisted to mysql and this cannot change because we have other services that require this.
Anything that happens in a game is considered an "event".
Then I have "Levels" which someone can achieve. A level is achieved when you complete all the "events" associated with it.
So you have:
Level
- event1 e.g. reach a point in the game
- event2 e.g. pickup a sword
- event3 e.g. defeat a monster
So in a game there are many levels, and hundreds of events that are linked to levels.
All "events" are sent via HTTP to my backend, and I save each event in the database.
I then have to load the user's game profile into memory and recalculate the levels achieved, since a new event has happened.
Note: This calculation cannot be done at the database level because it is a little more complicated than what I am describing here.
The problem I see is that if I use Akka, I can't have multiple actors processing the events for the same user, because the data can become stale.
Just to be clear: when a new event arrives, I have to load the game profile into memory, loop through the levels, and see if any of them have been achieved. If they have, I update the database:
e.g. update levels set achieved=true where level_id = 123 and user_id=234
e.g. actor1 loads the profile (all the levels and events for this user) and then processes the new event that just arrived in its inbox.
At the same time, actor2 loads the profile (same as actor1) and then processes another new event. When it persists its changes to MySQL, the data will be out of sync.
If I were using threads, I would have to lock during the game profile calculation and while persisting to the DB.
How can I do this using Akka and still handle things in parallel, or does this scenario not allow for it?

Let's think about how you would manage it without actors. In a nutshell, you have the following problem scenario:
- two (or more) update requests arrive at the same time, and both are going to modify the same data
- both requests read the same stable data state, then each updates it in its own manner and persists it to the DB
- the modifications from the request that checked in first are lost, or more precisely, overwritten by the later request.
This is a classical problem. There are at least two classical solutions to it:
- Optimistic locking
- Pessimistic locking: it's typically achieved with explicit row locks (e.g. SELECT ... FOR UPDATE) or by applying the Serializable isolation level to transactions.
It's worth reading this answer with a nice comparison of both worlds.
As you're using Akka, you most probably want to prefer better concurrency and occasional failures that are easy to recover from. This is in line with the Akka motto "let it crash".
So, you need to take the following steps:
1) Add a version column to your table(s). It can be numeric or a string (with a hash). Numeric is the simplest one.
2) When you insert a new record, initialize its version.
3) When you update a record, check that the version value has not changed. So, here's your update strategy:
   - Read the record and its version.
   - Update the record in memory.
   - Execute the update query with the criteria where rec_id=$id and version=$version, incrementing the version in the same statement.
   - If the updated record count is 1, you're good. If it is 0, throw an OptimisticLockException or something similar.
Finally, it's time for Akka to do its job: come up with an appropriate supervision strategy (I'd pick something like "try again in 1 second"). In the actor's preRestart hook, send the failed update message back to the actor's own mailbox (see the Restart Hooks chapter in the Akka docs).
With this strategy, even if two requests try to update the same record at the same time, one of them will fail and will then be processed again against the fresh data.
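The read / modify / version-checked write cycle above can be sketched roughly as follows. This is only an illustration in plain Java: `VersionedStore` is a hypothetical in-memory stand-in for the MySQL table (its `update` method mimics the row count returned by `UPDATE ... WHERE rec_id=? AND version=?`), and the simple retry loop stands in for the Akka supervision and re-enqueue step described in the answer.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

// Hypothetical in-memory stand-in for a table with a numeric version column.
final class VersionedStore {
    static final class Row {
        final String data;
        final long version;
        Row(String data, long version) { this.data = data; this.version = version; }
    }

    private final ConcurrentHashMap<Integer, Row> rows = new ConcurrentHashMap<>();

    // New records start at version 0.
    void insert(int id, String data) { rows.put(id, new Row(data, 0L)); }

    Row read(int id) { return rows.get(id); }

    // Mimics: UPDATE t SET data=?, version=version+1 WHERE rec_id=? AND version=?
    // Returns the number of rows updated (0 or 1).
    int update(int id, String newData, long expectedVersion) {
        int[] updated = {0};
        rows.computeIfPresent(id, (key, current) -> {
            if (current.version != expectedVersion) return current; // stale read, change nothing
            updated[0] = 1;
            return new Row(newData, current.version + 1);
        });
        return updated[0];
    }
}

final class OptimisticUpdater {
    // Retries the whole read/modify/write cycle when the version check fails.
    static String updateWithRetry(VersionedStore store, int id,
                                  UnaryOperator<String> change, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            VersionedStore.Row row = store.read(id);           // read record and its version
            String changed = change.apply(row.data);           // update record in memory
            if (store.update(id, changed, row.version) == 1) { // version-checked write
                return changed;
            }
            // 0 rows updated: another writer won the race, so re-read and try again
        }
        throw new IllegalStateException("optimistic lock failed after " + maxAttempts + " attempts");
    }
}
```

The important property is that a write based on a stale read changes nothing and reports 0 rows, so the caller always recomputes from fresh data instead of silently overwriting someone else's update.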

Related

Designing event-based architecture for the customer service

Being a developer with solid experience, I am only now entering the world of microservices and event-driven architecture. Things like loose coupling, independent scalability and proper implementation of asynchronous business processes are something that I feel should be simpler compared with the traditional monolith approach. So I am giving it a try, making a simple PoC for myself.
I am considering making a simple application where a user can register, log in and change the customer details. However, I want to react to certain events asynchronously:
- customer logs in: we send them an email if the IP address used is new to the system.
- customer changes their name: we send them an email notifying them of the change.
The idea is to make a separate application that reacts to "CustomerLoggedIn" and "CustomerChangeName" events.
Here I can think of three approaches to implementing this simple functionality, each of them having some drawbacks. So, when a customer submits their name change:
1) Store the changed name in the DB and send an event to Kafka when the DB transaction is completed. One of the big problems that arises here is that if a customer has 2 tabs open and almost simultaneously submits a change from the initial name "Bob" to "Alice" in one tab and from "Bob" to "Jim" in the other, on the database level one of the updates overwrites the other, which is ok; however, we cannot guarantee that the order of the events will be the same. We can use some checks to ensure that the DB update is only done when "the last version" has been seen, thus preventing the second update entirely, so only one event will be emitted. But in the general case, this pattern will not allow us to preserve the same order of events in the DB as in Kafka, unless we do the DB change and the Kafka event sending in one distributed transaction, which is an anti-pattern afaik.
2) Change the name in the DB, and use Debezium or a similar DB CDC tool to capture the event and stream it. Here we get a single event source, so the ordering problem is solved; however, what bothers me is that I lose the ability to enrich the events with business information. Another related drawback is that CDC will stream all the updates in the "customer" table regardless of the business meaning of the event. So, in this case, I will probably need to build a Kafka Streams application to convert the DB CDC events to business events and decouple the DB structure from the event structure. The potential benefit of this approach is that I will be able to capture "direct" DB changes in the same manner as those originating in the application.
3) Emit the event from the application, without storing it in the DB. One of the subscribers might do the DB persistence, another will do the email sending, etc. The biggest problem I see here is: what do I return to the client? I cannot say "Ok, your name is changed"; it's more like "Ok, your request has been recorded and will be processed". If the customer quickly hits refresh, they expect to see their new name, as we don't want to explain to customers what eventual consistency is, do we? Also, the order in which the "email sender" and the "db updater" process the same event is not guaranteed, so I can send an email before the change is persisted.
I am looking for advice regarding any of these three approaches (and maybe some others I am missing), and maybe the use cases where one is preferable over the others.
It sounds to me like you want event sourcing. In event sourcing, all you need to store is the event: the current state of a customer is derived from replaying the events (either from the beginning of time, or since a snapshot: the snapshot is just an optional optimization). Some other process (there are a few ways to go about this) can then project the events to Kafka for consumption by interested parties. Since every event has a sequence number, you can use the sequence number to prevent concurrent modification (alternatively, the more actor modely event-sourcing implementations can use techniques like cluster sharding in Akka to achieve the same ends).
Doing this, you can have a "write-side" which processes the updates in a strongly consistent manner and can respond to queries which only involve a single customer, having seen every update to that point (the consistency boundary basically makes the customer in this case an aggregate in domain-driven-design terms). "Read-sides" consuming events are eventually consistent, though the latencies are typically fairly short. In this case your services sending emails are read-sides (as would be a hypothetical panel showing the names of all customers), but the customer's view of their own data could be served by the write-side.
(The separation into read-sides and write-side (the pluralization is significant) is Command Query Responsibility Segregation, which sometimes gets interpreted as "reads can only be served by a read-side". This is not totally accurate: for one thing a write-side's model needs to be read in order for the write-side to perform its task of validating commands and synchronizing updates, so nearly any CQRS-using project violates that interpretation. CQRS should instead be interpreted as "serve reads from the model that makes the most sense and avoid overcomplicating a model (including that model in the write-side) to support a new read".)
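To make the sequence-number idea concrete, here is a minimal sketch in plain Java. The class and method names (`CustomerJournal`, `NameChanged`, `append`) are made up for illustration and are not taken from any event-sourcing library; a real journal would also be durable, which this in-memory list is not.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event type: the only thing that gets stored.
final class NameChanged {
    final long sequenceNr;
    final String newName;
    NameChanged(long sequenceNr, String newName) {
        this.sequenceNr = sequenceNr;
        this.newName = newName;
    }
}

// Hypothetical per-customer journal acting as the write-side's consistency boundary.
final class CustomerJournal {
    private final List<NameChanged> events = new ArrayList<>();

    // Appends only if the caller has seen every event so far.
    // A writer holding an outdated sequence number is rejected instead of overwriting.
    synchronized boolean append(long expectedLastSequenceNr, String newName) {
        if (events.size() != expectedLastSequenceNr) {
            return false; // concurrent modification detected
        }
        events.add(new NameChanged(expectedLastSequenceNr + 1, newName));
        return true;
    }

    synchronized List<NameChanged> read() {
        return new ArrayList<>(events);
    }

    // Current state is derived by replaying the events in order.
    static String currentName(String initialName, List<NameChanged> history) {
        String name = initialName;
        for (NameChanged event : history) {
            name = event.newName;
        }
        return name;
    }
}
```

With the two-tabs example from the question, both tabs would submit with expected sequence number 0; whichever arrives first is appended as event 1, and the other is rejected and has to be resubmitted against the new state.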
I think I qualify to answer this, having used Debezium extensively to simplify architectures.
I would prefer Option 2:
- Every transaction always results in an event emitted in the correct order.
- Options 1 and 3 have a corner case: what if the transaction succeeds, but the application fails to emit the event?
To your point:
"Another related drawback is that CDC will stream all the updates in the "customer" table regardless of the business meaning of the event. So, in this case, I will probably need to build a Kafka Streams application to convert the DB CDC events to business events and decouple the DB structure from event structure."
I really don't think that is a roadblock. The benefit you get is that other use cases may crop up where another consumer of this topic wants to read other columns of the table.
Options 1 and 3 only tie this to your core application logic, which does you no favors from a simplification point of view. With option 2, with zero code changes to the core application APIs, a developer can work on the events independently, with no need to understand that core logic.
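For what it's worth, the conversion step from raw row changes to business events does not have to be large. Below is a rough sketch in plain Java, under the assumption that each change record carries the row state before and after the update (CDC tools such as Debezium emit before/after images along these lines). `RowChange`, `CustomerEventTranslator` and the column names are hypothetical, and no Kafka Streams API is used here.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Hypothetical, simplified row-level change record: column values before and after.
final class RowChange {
    final Map<String, String> before;
    final Map<String, String> after;
    RowChange(Map<String, String> before, Map<String, String> after) {
        this.before = before;
        this.after = after;
    }
}

// Business event, decoupled from the table layout.
final class CustomerChangeName {
    final String customerId;
    final String oldName;
    final String newName;
    CustomerChangeName(String customerId, String oldName, String newName) {
        this.customerId = customerId;
        this.oldName = oldName;
        this.newName = newName;
    }
}

final class CustomerEventTranslator {
    // Emits a business event only when the name column actually changed;
    // every other update to the "customer" row is dropped.
    static Optional<CustomerChangeName> toBusinessEvent(RowChange change) {
        if (change.before == null || change.after == null) {
            return Optional.empty(); // inserts and deletes are not handled in this sketch
        }
        String oldName = change.before.get("name");
        String newName = change.after.get("name");
        if (Objects.equals(oldName, newName)) {
            return Optional.empty();
        }
        return Optional.of(new CustomerChangeName(change.after.get("id"), oldName, newName));
    }
}
```

Since the translator sees changes in commit order, the business events it emits keep that order as well.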

How to prevent lost updates on the views in a distributed CQRS/ES system?

I have a CQRS/ES application where some of the views are populated by events from multiple aggregate roots.
I have a CashRegisterActivated event on the CashRegister aggregate root and a SaleCompleted event on the Sale aggregate root. Both events are used to populate the CashRegisterView. The CashRegisterActivated event creates the CashRegisterView or sets it active in case it already exists. The SaleCompleted event sets the last sale sequence number and updates the cash in the drawer.
When two of these events arrive within milliseconds, the first update is overwritten by the last one. So that's a lost update.
I already have a few possible solutions in mind, but they all have their drawbacks:
Marshal all event processing for a view or for one record of a view on the same thread. This works fine on a single node, but once you scale out, things start to get complex. You need to ensure all events for a view are delivered to the same node. And you need to migrate to another node when it goes down. This requires some smart load balancer which is aware of the events and the views.
Lock the record before updating to make sure no other threads or nodes modify it in the meantime. This will probably work fine, but it means giving up on a lock-free system. Threads will sit there, waiting for a lock to be freed. Locking also means increased latency when I scale out the data store (if I'm not mistaken).
For the record: I'm using Java with Apache Camel, RabbitMQ to deliver the events and MariaDB for the view data store.
I have a CQRS/ES application where some of the views in the read model are populated by events from multiple aggregate roots.
That may be a mistake.
Driving a process off of an isolated event is fine. But composing a view normally requires a history, rather than a single event.
A more likely implementation would be to use the arrival of the events to mark the current view stale, and to use a single writer to update the view from the history of events produced by the aggregate(s) concerned.
And that requires a smart messaging solution. I thought "Smart endpoints and dumb pipes" would be a good practice for CQRS/ES systems.
It is. The endpoints just need to be smart enough to understand when they need histories, or when events are sufficient.
A view, after all, is just a snapshot. You take inputs (X.history, Y.history), produce a snapshot, write the snapshot into your view store (possibly with meta data describing the positions in the histories that were used), and you are done.
The events are just used to indicate to the writer that a previous snapshot is stale. You don't use the event to extend the history, you use the event to tell the writer that a history has changed.
You don't lose updates with multiple events, because the event itself, with all of its state, is captured in the history. It's the history that is used to build the event-sourced view.
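A small sketch of that idea in Java, since that is what the question uses. The event names come from the question; everything else (`CashRegisterViewWriter`, the field names, representing the register history as a list of event names) is a simplification made up for illustration.

```java
import java.util.List;

final class SaleCompleted {
    final long saleSequenceNr;
    final long amountInCents;
    SaleCompleted(long saleSequenceNr, long amountInCents) {
        this.saleSequenceNr = saleSequenceNr;
        this.amountInCents = amountInCents;
    }
}

// The view is a snapshot plus metadata about which history positions produced it.
final class CashRegisterView {
    final boolean active;
    final long lastSaleSequenceNr;
    final long cashInDrawerCents;
    final int registerPosition;
    final int salePosition;
    CashRegisterView(boolean active, long lastSaleSequenceNr, long cashInDrawerCents,
                     int registerPosition, int salePosition) {
        this.active = active;
        this.lastSaleSequenceNr = lastSaleSequenceNr;
        this.cashInDrawerCents = cashInDrawerCents;
        this.registerPosition = registerPosition;
        this.salePosition = salePosition;
    }
}

// Single writer: incoming events only flag the view as stale.
final class CashRegisterViewWriter {
    private boolean stale = true;
    private CashRegisterView current;

    synchronized void onAnyEvent() { stale = true; }

    // Rebuilds from the full histories only when something has changed.
    synchronized CashRegisterView refresh(List<String> registerHistory, List<SaleCompleted> saleHistory) {
        if (stale) {
            current = rebuild(registerHistory, saleHistory);
            stale = false;
        }
        return current;
    }

    static CashRegisterView rebuild(List<String> registerHistory, List<SaleCompleted> saleHistory) {
        boolean active = registerHistory.contains("CashRegisterActivated");
        long lastSeq = 0;
        long cash = 0;
        for (SaleCompleted sale : saleHistory) {
            lastSeq = Math.max(lastSeq, sale.saleSequenceNr);
            cash += sale.amountInCents;
        }
        return new CashRegisterView(active, lastSeq, cash, registerHistory.size(), saleHistory.size());
    }
}
```

Because the view is always computed from both histories, two events arriving within milliseconds of each other cannot overwrite one another; at worst the view is rebuilt twice.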
Konrad Garus wrote
... handling events coming from a single source is easier, but more importantly because a DB-backed event store trivially guarantees ordering and has no issues with lost or duplicate messages.
A solution could be to detect when this situation happens, and do a retry.
To do this:
- Add to each table the aggregate version number, which is kept up to date.
- On each update statement, add the following to the where clause: "aggr_version=n-1" (where n is the version of the event being processed).
- When the result of the update statement is that no records were modified, it probably means that the event was processed out of order, and a retry strategy can be applied.
The problem is that this adds complexity and is hard to test. The performance bottleneck is very likely in the database, so a single process with a failover solution will probably be the easiest solution.
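To give an idea of the moving parts, here is a rough in-memory sketch of the version check plus retry queue. `ViewRow.update` imitates the row count of an `UPDATE ... WHERE aggr_version = n-1` statement; the class names and the simple "park and re-try after every successful update" policy are assumptions made for the example, not a recommendation for production.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;

final class VersionedEvent {
    final long version;
    final String payload;
    VersionedEvent(long version, String payload) {
        this.version = version;
        this.payload = payload;
    }
}

// Hypothetical stand-in for one record of the view table.
final class ViewRow {
    long aggrVersion = 0;
    String state = "";

    // Mimics: UPDATE view SET ..., aggr_version = n WHERE aggr_version = n - 1
    // Returns the number of modified records (0 or 1).
    int update(VersionedEvent event) {
        if (aggrVersion != event.version - 1) return 0;
        state = state + event.payload;
        aggrVersion = event.version;
        return 1;
    }
}

final class RetryingProjector {
    private final ViewRow row;
    private final Deque<VersionedEvent> retryQueue = new ArrayDeque<>();

    RetryingProjector(ViewRow row) { this.row = row; }

    int pending() { return retryQueue.size(); }

    void handle(VersionedEvent event) {
        if (event.version <= row.aggrVersion) return; // duplicate, already applied
        if (row.update(event) == 0) {
            retryQueue.add(event); // arrived too early, park it for a retry
            return;
        }
        // A successful update may unblock parked events, so re-try them until no progress.
        boolean progressed = true;
        while (progressed) {
            progressed = false;
            for (Iterator<VersionedEvent> it = retryQueue.iterator(); it.hasNext();) {
                if (row.update(it.next()) == 1) {
                    it.remove();
                    progressed = true;
                }
            }
        }
    }
}
```

Even in this toy form you can see where the extra complexity comes from: duplicates, parked events and the retry policy all need their own handling and tests.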
Although I see you ask how to handle these things at scale, I've seen people recommend using a single-threaded approach until such time as it actually becomes a problem, and then addressing it.
I would have a process manager per view model, draw the events you need from the store and write them single threaded.
I combined the answers of VoiceOfUnreason and StefRave into something I think might work. Populating a view from multiple aggregate roots feels wrong indeed. We have out of order detection with a retry queue. So an event on an aggregate root will only be processed when the last completely processed event is version n-1.
So when I create new aggregate roots for the views that would be populated by multiple aggregate roots (say aggregate views), all updates for the view will be synchronised without row locking or thread synchronisation. We have conflict detection with a retry mechanism on the aggregate roots as well, that will take care of concurrency on the command side. So if I just construct these aggregate roots from the events I'm currently using to populate the aggregate views, I will have solved the lost update problem.
Thoughts on this solution?

CQRS, Event-Sourcing and Web-Applications

As I am reading some CQRS resources, there is a recurrent point I do not catch. For instance, let's say a client emits a command. This command is integrated by the domain, so it can refresh its domain model (DM). On the other hand, the command is persisted in an Event-Store. That is the most common scenario.
1) When we say the DM is refreshed, I suppose data is persisted in the underlying database (if any). Am I right ? Otherwise, we would deal with a memory-transient model, which I suppose, would not be a good thing ? (state is not supposed to remain in memory on server side outside a client request).
2) If data is persisted, I suppose the read-model that relies on it is automatically updated, as each client that requests it generates a new "state/context" in the application (in case of a Web-Application or a RESTful architecture) ?
3) If the command is persisted, does that mean we deal with Event-Sourcing (by construct when we use CQRS) ? Does Event-Sourcing invalidate the database update process ? (as if state is reconstructed from the Event-Store, maintaining the database seems useless) ?
Does CQRS only apply to multi-databases systems (when data is propagated on separate databases), and, if it deals with memory-transient models, does that fit well with Web-Applications or RESTful services ?
1) As already said, the only things that are really stored are the events.
The only things that commands do are consistency checks prior to the raise of events. In pseudo-code:
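public void BorrowBook(BorrowableBook dto) {
    // command handler: consistency check first, then raise the event
    if (dto is valid)
        RaiseEvent(new BookBorrowedEvent(dto))
    else
        throw exception
}
public void Apply(BookBorrowedEvent evt) {
    // event handler: only mutates state, no validation here
    this.aProperty = evt.aProperty;
    ...
}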
The current state is retrieved by applying the events sequentially. Because of this, you have to pay close attention in the design phase, since there are common pitfalls to avoid (maybe you have already read it, but let me suggest this article by Martin Fowler).
So far so good, but this is just Event Sourcing. CQRS comes into play if you decide to use a different database to persist the state of an aggregate.
In my project we have a projection that every x minutes applies the new events (from the event store) to the aggregate and saves the results to a separate instance of MongoDB (the presentation layer accesses this DB for reading). This model is clearly eventually consistent, but in this way you really separate Command (write) from Query (read).
2) If you have decided to divide the write model from the read model there are various options that you can use to make them synchronized:
- Every x seconds, apply the events since the last checkpoint (some solutions offer snapshots to avoid re-applying a long history of events)
- A projection that subscribes to events and updates the read model as soon as an event is raised
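The first option (polling from a checkpoint) might look roughly like this. It is a simplified sketch in Java: the event store is just a list, `BookEvent` is a hypothetical event with a borrowed/returned flag, and the scheduling ("every x seconds") is left out, so `catchUp` would be called from whatever timer you use.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical event: a book was borrowed (true) or returned (false).
final class BookEvent {
    final String bookId;
    final boolean borrowed;
    BookEvent(String bookId, boolean borrowed) {
        this.bookId = bookId;
        this.borrowed = borrowed;
    }
}

final class CheckpointedProjection {
    private final Map<String, Boolean> readModel = new HashMap<>();
    private int checkpoint = 0; // index of the next event that has not been applied yet

    // Applies only the events appended since the last checkpoint.
    // Returns how many events were applied in this run.
    int catchUp(List<BookEvent> eventStore) {
        int applied = 0;
        while (checkpoint < eventStore.size()) {
            BookEvent event = eventStore.get(checkpoint);
            readModel.put(event.bookId, event.borrowed);
            checkpoint++;
            applied++;
        }
        return applied;
    }

    boolean isBorrowed(String bookId) {
        return readModel.getOrDefault(bookId, false);
    }
}
```

In a real system the checkpoint would be persisted together with the read model, so that a restart resumes from the right position instead of replaying everything.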
3) The only things stored are the events. In fact, we have an event store, not a command store :)
Is the database useless? It depends! How many events do you need to re-apply to bring the aggregate to its current state?
Three? Then maybe you don't need a database for the read model.
The thing to grok is that the ONLY thing stored is the events*. The domain model is rebuilt from the events.
So yes, the domain model is memory-transient as you say, in that no representation of the domain model is stored*, only the events which happened to the domain to put the model in its current state.
When an element from the domain model is loaded, a new instance of the element is created and then the events that affect that instance are replayed one after the other, in the right order, to put the element into the correct state.
You could keep instances of your domain objects around and have them subscribe to new events so that they can be kept up to date without loading them from all the events every time, but usually it's quick enough just to load all the events from the database and apply them every time, in the same way that you might load the instance from the database on every call to your web service.
*Unless you have snapshots of your domain objects to reduce the number of events you need to load/process
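As a small illustration of the footnote, loading from a snapshot just means starting the replay later. The sketch below is Java with an invented domain (an account whose events are deposit amounts); `AccountSnapshot` and `SnapshottingLoader` are hypothetical names.

```java
import java.util.List;

// Snapshot = derived state plus the number of events it already includes.
final class AccountSnapshot {
    final long balance;
    final int eventsIncluded;
    AccountSnapshot(long balance, int eventsIncluded) {
        this.balance = balance;
        this.eventsIncluded = eventsIncluded;
    }
}

final class SnapshottingLoader {
    // Rebuilds the state by replaying events; with a snapshot, only the tail is replayed.
    static long load(AccountSnapshot snapshot, List<Long> deposits) {
        long balance = (snapshot == null) ? 0L : snapshot.balance;
        int from = (snapshot == null) ? 0 : snapshot.eventsIncluded;
        for (long deposit : deposits.subList(from, deposits.size())) {
            balance += deposit;
        }
        return balance;
    }

    // A snapshot is itself computed from the events, so it can be thrown away and rebuilt.
    static AccountSnapshot takeSnapshot(List<Long> deposits) {
        return new AccountSnapshot(load(null, deposits), deposits.size());
    }
}
```

The result must be the same with or without the snapshot; the snapshot only shortens the replay.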
Persistence of data is not strictly needed. It might be sufficient to have enough copies in enough different locations (GigaSpaces). So no, a database is not required. This is (at least was a few years ago) used in production by the Dutch eBay equivalent.

Using aggregates and Domain events with nosql storage

I'm currently exploring the DDD and NoSQL field. I have a doubt now: I need to produce events from the aggregate and I would like to use a NoSQL storage. But how can I be sure that both the events and the changes to the aggregate root are saved to the storage, given that I don't have transactions?
Does it make sense? Is there a way to do this without being forced to use event sourcing or a transactional DB?
Actually, I was looking at implementing a two-phase commit algorithm, but it seems pretty heavy from a performance point of view...
Am I approaching the problem the wrong way?
Stuffed with questions...
Thanks for every suggestion
Enrico
PS
I'm a newbie on stackoverflow so any suggestion/critic/... is more than welcome
Enrico
Edit 1
Well, I would need events to notify aggregates that something happened and that they should react to the change. The problem arises when such events are important for the business logic. As far as I understood, after a night of thinking, I can't use a NoSQL storage to do such things. Let me explain (thinking out loud :P):
With ES (1st scenario): I save the "diff" of the data. Then I produce an event associated with it. 2 operations.
With ES (2nd scenario): I save the "diff" of the data. A process watches the ES and produces the event. But I'm tied to having only one watcher process to ensure the correct ordering of events.
With ES (3rd scenario): Idempotent events. The events can be inferred from the state, and every reapplication of an event can cause a change on the consumer only once, so I can have multiple "dequeue" processes and duplicates can't possibly happen. 1 operation, but it introduces heavy limitations on the consumers.
In general: I save the aggregate's data. Then I produce an event associated with it. 2 operations.
Now the question becomes wider imho, is it possible to work with domain events and nosql when such domain events are fundamental part of the business process?
I think that could be a better option to go relational... even if i would need to add quite a lot of machines to get the same performances.
Edit 2
For the sake of completeness, searching for "domain events nosql idempotent" on Google: http://svendvanderveken.wordpress.com/2011/08/26/transactional-event-based-nosql-storage/
If you need Event Sourcing, you should store events only.
This should be the sequence:
- the aggregate root receives a command
- it fires the proper events
- the events are stored
Each aggregate's rehydration should be done only by replaying events over it. You can create snapshots of aggregates if you measure performance problems on their initialization, but this doesn't require two-phase commits, since you can build snapshots asynchronously in a batch.
Note however that you need CQRS and/or Event Sourcing only if your application is heavily concurrent and you need to cope with partition tolerance and compensating actions.
edit
Event Sourcing is an alternative to persisting object state. You either store the events or the state of the object model. You can save snapshots, but they're just performance tools: your application must be able to work without them. You can consider such snapshots a caching technique. As an alternative you can persist object state (the classical model), but in that case you don't need to store events.
In my own DDD application, I use observable entities to decouple aggregates from their persistence (the repository subscribes directly to their events). For example, your repository can subscribe to each domain event and execute the actions required by the application (persist to the store, dispatch to a queue and so on...). But as a persistence technique, Event Sourcing is an alternative to classical persistence of the observable object state. In most scenarios you don't need both.
edit 2
A final note: if you choose ES, one of the event subscribers can build a relational read-model too.

EventStore: learning how to use

I'm trying to learn EventStore. I like the concept, but when I try to apply it in practice I keep getting stuck at the same point.
Let's see the code:
foreach (var k in stream.CommittedEvents)
{
//handling events
}
Two questions about that:
- When an app starts up after some maintenance, how do we bookmark in a safe way which event to start reading from? Is there a pattern to use?
- As soon as the events are all consumed, the loop ends. What about messages arriving at run time? I would expect the call to block until some new message arrives (of course this needs to be handled in a thread), or to have something like BeginRead/EndRead.
Do I have to bind an ESB to handle run-time events, or does the EventStore provide some facility to do this?
I try to better explain with an example
Suppose the aggregate is a financial portfolio, and the application shows that portfolio to a trader. Suppose the trader connects to the web app and looks at their own portfolio. The current state will be the whole history, so I have to read potentially a lot of records to reproduce the status. I guess this could be done by a so-called snapshot, but who's responsible for creating it? When should one choose to create an aggregate? How can one know whether a snapshot for an aggregate exists?
For the runtime part: as soon as the user looks at the reconstructed portfolio state, the real-time part begins to run. The user can place an order, and a new position can be created by successfully executing that order in the market. How is the portfolio updated by the infrastructure? I would expect, but maybe I'm completely wrong, the same event stream to be the source of that new "new long position" event; otherwise I have two paths handling the state of the same aggregate. I would like to know if this is how the strategy is supposed to work, even if it feels a little tricky to have two state agents that can possibly overlap.
Just to clarify how I fear the overlapping:
I know events have to be idempotent, so I know it should not be a problem anyway.
But let's consider the following:
- I subscribe to an event bus before streaming the events to update the state of the portfolio. Some "open position" events appear on the bus: I must handle them, but maybe the portfolio is not in the correct state to handle them since it is not yet up to date. Even if I'm able to handle such events, I will find them again when I read the stream.
- More insidious: I open the stream, read all events and create a state. Then I subscribe to the bus: some messages arrive on the bus between the end of the stream reading and the beginning of the subscription. Those events are missed and the aggregate is not in the correct state.
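One common way to close both gaps is to subscribe first, buffer live events while the stored history is being read, and then drain the buffer while skipping anything already seen. The Java sketch below only illustrates that ordering; it assumes every event carries a monotonically increasing sequence number and that the bus delivers in order, and `CatchUpSubscriber` is a hypothetical class, not an API of any EventStore library.

```java
import java.util.ArrayList;
import java.util.List;

final class SeqEvent {
    final long seq;
    final String payload;
    SeqEvent(long seq, String payload) {
        this.seq = seq;
        this.payload = payload;
    }
}

final class CatchUpSubscriber {
    private final List<String> state = new ArrayList<>();
    private final List<SeqEvent> buffer = new ArrayList<>();
    private long lastApplied = 0;
    private boolean live = false;

    // Called by the bus. May fire before the historical read has finished,
    // in which case the event is only buffered.
    synchronized void onLiveEvent(SeqEvent event) {
        if (!live) {
            buffer.add(event);
            return;
        }
        apply(event);
    }

    // Called once, AFTER subscribing: replay the stored history, then drain the buffer.
    synchronized void catchUp(List<SeqEvent> history) {
        for (SeqEvent event : history) apply(event);
        for (SeqEvent event : buffer) apply(event);
        buffer.clear();
        live = true;
    }

    synchronized List<String> currentState() {
        return new ArrayList<>(state);
    }

    private void apply(SeqEvent event) {
        if (event.seq <= lastApplied) return; // already seen via the other path
        state.add(event.payload);
        lastApplied = event.seq;
    }
}
```

Subscribing before reading means nothing can fall into the gap between the two steps, and the sequence-number check removes the duplicates that this overlap creates.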
Please be patient all, my English is poor and the argument is tricky, hope I managed to share my doubt :)
"The current state will be the whole history, so I have to read potentially a lot of records to reproduce the status. I guess this could be done by a so called snapshot, but who's responsible for creating it?"
In CQRS and event sourcing, queries are served by projections which are generated from events emitted by aggregates. You don't use the aggregate instance as reconstituted from the event store to display information.
The term snapshot refers specifically to an optimization of the event store which allows rebuilding the aggregate without replaying all of the events.
Projections are essentially event handlers which maintain a denormalized view of aggregates. Events emitted from aggregates are published, possibly out of band, and the projection subscribes to and handles those events. A projection can combine multiple aggregates if a requirement exists to display summary information, for instance. In case of a trading application, each view will typically contain data from various aggregates. Projections are designed in a consumer-driven way - application requirements determine the different views of the underlying data that are needed.
With this type of workflow you have to embrace eventual consistency throughout your application. For instance, if an end user is viewing their portfolio and initiating new trades, the UI has to subscribe to updates to reflect updated projections in an asynchronous manner.
Take a look here for an overview of CQRS and event sourcing.