CQRS/EventStore: How are failures to deliver events handled?

CQRS/EventStore: How are failures to deliver events handled? - cqrs

Getting into CQRS and I understand that you have commands (app layer) and events (from the domain).
In the simple case where events are to update the read model, do read model updates fail? If there is no "bug" then I cannot see them failing and as I am using EventStore, I know there is a commit flag which will retry failures.
So my question is do I have to do anything in addition to EventStore to handle failures?
Coming from a world where you do everything in one transaction and now things are done separately is worrying me.

Of course there may be cases where a published event will fail in the read models.
You have to make sure you can detect that and solve it.
The nice thing is that you can replay all the events again and again so you have the chance not only to fix the error. You can also test the fix by replaying every single event if you want.
I use NServiceBus as my publishing mechanism which allows me to use an error queue. Using my other logging tools together with the error queue I can easily determine what happened since I have the error log and the actual message that caused the error in the first place.

Related

How to persist and replay NestJS CQRS event and saga across restart?

I am making an application which will need to use NestJS' CQRS module, as the requirements naturally lend themselves to that pattern.
Updates to the application logic are expected to be frequent and to happen during busy hours (that's just how my management works...), so the application needs to be able to restart gracefully. However, this means that events started just before the shutdown may not finish, or even if they do, some sagas may not trigger due to some events having happened before the restart... I'd like to ensure that doesn't happen.
I'm aware of NestJS' OnApplicationShutdown and OnApplicationBootstrap hooks, which is exactly for this purpose, but what I'm not sure is what I should do there. How can I capture all events that have unfinished handlers and sagas? Then after a restart, how can I make the event bus aware of the events monitored by sagas, without executing the already executed handlers?
I guess the second part could be worked around with a random ID per event/handler combo, that will be looked up in a log, and if present, the handler will be skipped, and if not, it will be executed and added to the log... But even with such a workaround, I don't see how I could do the first part. There will be a lot of events, and sagas (by definition) execute commands, meaning they have side effects... Even if all commands can become idempotent, the sheer quantity of events and frequent restarts means restarting from the very first command is a no go.
I've seen this package but I'm not sure if it solves this particular use case, or if it's really just logging the events, and pretty much nothing more.

Append events to eventstore

We are using axon framework version 3.4.2 and found a bug inside our code. The bug relates to a missing event which was not published. The solution is to fix the code but this would not fix the event store and views.
My question is how would one fix this? We thought of appending the events to the event store (we use a JDBC event store), but without the correct data, the new events would not be processed. The best would be to do it in the application by publishing the event in axon and let axon handle all the details, but this is a once-off, correcting action.
Is there any way of "injecting" a once-off event into axon?

The comment which Matt shared is conceptually what you should do.
Thus, to resolve the issue you unintendedly introduced, you should produce a compensation action, aka a command. This command will be handled in your command model, will validate the model's state and publish the desired event.
Added, I am assuming this event of your should originate from an Aggregate, correct?
In Axon terms, that means you want to publish a domain event rather than a regular event.
Although you can publish events on the EventBus or store in the EventStore directly, it is rather complicated to make those domain events through that process.
Thus, as I started off with and what Matt Freeman commented on your question, a compensating action would be the way to go, with or without Axon.
Last note, know that Axon 4.2 is already out for some time now. Although Axon 3 will still under go bug fixes, none of these have occurred in the last year. Simply put, there is no active development on Axon 3. Migrating to a more recent version would thus be beneficial for your project.

CQRS - Single command handler?

I´m just trying to wrap my head around CQRS(/ES). I have not done anything serious with CQRS. Probably I´m just missing something very fundamental right now. Currently, I´m reading "Exploring CQRS and Event Sourcing". There is one sentence that somehow puzzles me in regards to commands:
"A single recipient processes a command."
I´ve seen this also in the CQRS sample application from Greg Young (FakeBus.cs) where an exception is thrown when more then one command handler is registered for any command type.
For me, this is an indication that this is a fundamental principle for CQRS (or Commands?). What is the reason? For me, it is somewhat counter-intuitive.
Imagine I have two components that need to perform some action in response to a command (it doesn´t matter if I have two instances of the same component or two independent components). Then I would need to create a handler that delegates the command to these components.
In my opinion, this is introducing an unnecessary dependency. In terms of CQRS, a command is nothing more than a message that is sent. I don´t get the reason why there should be only one handler for this message.
Can someone tell me what I am missing here? There is probably a very good reason for this that I just don´t see right now.
Regards

I am by no means an expert myself with CQRS, but perhaps I can help shed some light.
"A single recipient processes a command.", What is the reason?
One of the fundamental reasons for this is transactional consistency. A command needs to be handled in one discrete (and isolated) part of the application so that it can be committed in a single transaction. As soon as you start to have multiple handlers, distributing the application beyond a single process (and maintaining transactional consistency) is nearly impossible. So, while you could design that way, it is not recommended.
Hope this helps.

Imagine I have two components that need to perform some action in response to a command (it doesn´t matter if I have two instances of the same component or two independent components). Then I would need to create a handler that delegates the command to these components.
That's the responsibility of events.
A command must be handled by one command handler and must change the state for a single aggregate root. The aggregate root then raises one or more events indicating that something happened. These events can have multiple listeners that perform desired actions.
For example, you have a PurchaseGift command. Your command handler loads the Purchase aggregate root and performs the desired operation raising a GiftPurchased event. You can have one or more listeners to the GiftPurchase event, one for sending an email to the buyer confirming the operation and another to send the gift by mail.

What triggers UI refresh in CQRS client app?

I am attempting to learn and apply the CQRS design approach (pattern and architecture) to a new project but seem to be missing a key piece.
My client application executes a query and retrieves a list of light-weight, read-only DTOs from the read model. The user selects an item and clicks a button to initiate some action. The action is performed by creating and sending the corresponding command object to the write model (where the command handler carries out the action, updates the data store, etc.) At some point, however, I need to update the UI to reflect changes to the state of the application resulting from the action.
How does the UI know when it is time to refresh the original list?
Additional Info
I have noticed that most articles/blogs discussing CQRS use MVC client apps in their examples. I am working on a Silverlight client right now and am beginning to wonder if the pattern simply doesn't work in that case.
Follow-Up Question
After thinking more about Bartlomiej's response and subsequent discussion, I am wondering about error handling in CQRS. Given that commands are basically fire-and-forget asynchronous operations, how do we report an error condition to the UI?
I see 'refreshing the UI' to take one of two forms:
The operation succeeds, data has changed and the UI should be updated to reflect these changes
The operation fails, data has not changed but the user should be notified of the failure and potential corrective actions.
Even with a Post-Redirect-Get pattern in an MVC, you can't really Redirect until you know the outcome of the operation. None of the examples I've seen thus far address these real-world concerns.

I've been struggling with similar issues for a WPF client. The re-query trigger for any data is dependent on the data your updating, commands tend to fall into categories:
The command is a true fire and forget method, it informs the back-end of a state change but this change does not need to be reflected in the UI, or the change simply isn't important to the UI.
The command will alter the result of a single query
The command will alter the result of multiple queries, usually (in my domain at least) in a cascading fashion, that is, changing the state of a single "high level" piece of data will likely affect many "low level" caches.
My first trigger is the page load, very few items are exempt from this as most pages must assume data has been updated since it was last visited. Though some systems may be able to escape with only updating financial and other critical data in this way.
For short commands I also update data when 'success' is returned from a command. Though this is mostly laziness as IMHO all CQRS commands should be fired asynchronously. It's still an option I couldn't live without but one you may have to if your implementation expects high latency between command and query.
One pattern I'm starting to make use of is the mediator (most MVVM frameworks come with one). When I fire a command, I also fire a message to the mediator specifying which command was launched. Each Cache (A view model property Retriever<T>) listens for commands which affect it and then updates appropriately. I try to minimise the number of messages while still minimising the number of caches that update unnecessary from a single message so I'll (hopefully) eventually end up with a shortlist of update reasons, with each 'reason' updating a list of caches.
Another approach is simple honesty, I find that by exposing graphically how the system updates itself makes users more willing to be patient with it. On firing a command show some UI indicating you're waiting for the successful response, on error you could offer to retry / show the error, on success you start the update of the relevant fields. Baring in mind that this command could have been fired from another terminal (of which you have no knowledge) so data will need to timeout eventually to avoid missing state changes invoked by other machines also.
Noting the irony that the only efficient method of updating cache's and values on a client is to un-separate the commands and queries again, be it through hardcoding or something like a hashmap.

My two cents.
I think MVVM actually fits into CQRS quite well. The ViewModel simply becomes an observable ReadModel.
1 - You initialize your ViewModel state via a query on the ReadModel.
2 - Changes on your ViewModel are automatically reflected on any Views that are bound to it.
3 - Certain changes on your ViewModel trigger a command to propegate to a message queue, an object responsible for sending those commands to the server takes those messages off the queue and sends them to the WriteModel.
4 - Clients should be well formed, meaning the ViewModel should have performed appropriate validation before it ever triggered the command. Once the command has been triggered, any event notifications can be published onto an event bus for the client to communicate changes to other ViewModels or components in the system interested in those changes. These events should carry the relevant information necessary. Typically, this means that other view models usually don't have to re-query the read model as a result of the change unless they are dependent on other data that needs to be retrieved.
5 - There is an object that connects to the message bus on the server for real-time push notifications when other clients make changes that this client is interested in knowing about, falling back to long-polling if necessary. It propagates those to the internal message bus that ties the components on the client together.
6 - The last part to handle is the fact that clients can be occasionally connected, which should be the only reason a command fails (they don't have internet access at the moment), which is when the client should be notified of problems.

In my ASP.NET MVC 3 I use 2 techniques depending on use case:
already well-known Post-Redirect-Get pattern which fits nicely with CQRS. Your MVC action that triggers the command returns a redirection to action that performs a query.
in some cases, like real-time updates of other clients, I rely on domain events/messages. I create an event handler that uses singlarR to push changes to all connected and interested clients.

There are two major ways you can take as far as I know :
1) design your UI , so that the user does not see its changes right away. Like for instance a message to tell him his action is a success, and offering him different choices to continue his work. this should buy you enough time to have updated your readmodel.
2) more complex, but you might keep the information you have send to the server and shows them in the interface.
The most important I guess, educate your user if you can so that they know why the data is not here... yet!
I am thinking about it only now, but these are for sync command handling, not async, in async things go really harder on the brain...the client interface becomes an event eater too..

Microsoft Message Queue Missing Messages

I am using C# and .Net Framework 1.1 (yes its old but I inherited this stuff and can't convert up). I places messages on a transactional queue but it does not get on the queue about 50% of the time. Running workgroup and Windows/XP Professional with all service packs installed. I don't see any messages in the dead letter queue either.
Any ideas where to look?

If it isn't hitting the queue at all and isn't going to the dead-letter queue, it suggests the item isn't being sent to the queue. You should be able to confirm that this is the case by switching on the journal for the queue.
Assuming it isn't hitting the queue, it is probably a transaction issue. I would check that you are definitely committing the message to the queue every time. Make sure there aren't any exceptions being thrown and swallowed that causes the transaction to roll back or never be committed (essentially the same thing). Also make sure there aren't any conditional statements that mean the commit gets skipped.
I would add some logging around every location where a transaction is started, committed and rolled back and also around any location where you are creating a message. You can then review you log to see the order of events and see what's going astray.
Another option would be to remove all of the transaction code and test the code against a non-transactional queue. If the messages all appear then it is a transactional problem. If not, the issue is elsewhere.
I use MSMQ a lot and the one thing I have learned through experience is that it works really well and the weak point is me :-)

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse