CQRS - where should I place requests to external systems? [closed] - cqrs

Closed. This question is opinion-based. It is not currently accepting answers.
I'm wondering where requests to external systems (to be specific: a Webservice) should be placed in a CQRS-based system.
For example, given a system that sends a booking-request to an external flight service:
Should this be in the domain object, in the command handler for "bookFlight"? Or should this be in a saga, as a reaction to a domain object event "flightBookingPlaced"?

I'll make some assumptions:
The external request is part of the "transaction".
The external request is core to the behaviour of the command.
The external system response is synchronous in so far as it either responds or fails; there are no callbacks or polling involved.
I would say it can belong in the command or as a series of commands.
Hide the external service behind an ACL (anti-corruption layer) or facade, and make that a dependency of the command. The command will then represent the transition from "not booked" to "booked". Ignoring the complexities of the command (effectively) "blocking" until complete, that'll cover what you need.
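As a rough sketch of that first option (all names here, such as FlightBookingGateway and BookFlightHandler, are made up for illustration and not taken from any framework), the handler depends on the facade and only changes the booking once the external call has returned:

```kotlin
// Hypothetical names throughout; this is a sketch, not a framework API.

/** Facade (anti-corruption layer) hiding the external flight web service. */
interface FlightBookingGateway {
    /** Returns a confirmation reference, or throws if the external call fails. */
    fun book(flightNumber: String, passenger: String): String
}

data class BookFlight(val bookingId: String, val flightNumber: String, val passenger: String)

enum class BookingStatus { NOT_BOOKED, BOOKED }

class Booking(val id: String) {
    var status: BookingStatus = BookingStatus.NOT_BOOKED
        private set
    var confirmation: String? = null
        private set

    /** The only way to reach BOOKED; rejects a second confirmation. */
    fun confirm(reference: String) {
        check(status == BookingStatus.NOT_BOOKED) { "Booking $id is already booked" }
        confirmation = reference
        status = BookingStatus.BOOKED
    }
}

/** The handler blocks on the gateway; the booking only changes state on success. */
class BookFlightHandler(
    private val gateway: FlightBookingGateway,
    private val bookings: MutableMap<String, Booking>
) {
    fun handle(command: BookFlight): Booking {
        val booking = bookings.getOrPut(command.bookingId) { Booking(command.bookingId) }
        check(booking.status == BookingStatus.NOT_BOOKED) { "Booking ${booking.id} is already booked" }
        val reference = gateway.book(command.flightNumber, command.passenger) // may throw
        booking.confirm(reference)
        return booking
    }
}
```

If the gateway throws, the booking stays "not booked", which is the whole point of treating the external request as part of the "transaction".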
If you wanted to support a more granular approach, the small series of commands approach feels like it fits best:
not booked -> booking pending -> booked
Launch the event and trigger a RequestBookingCommand, which changes the booking state from "not booked" to "booking pending", and commits the transaction. This can then trigger the next command ExternalBookingCommand, which can work in the background without needing the domain object initially. The booking can be performed on the external system and if successful, take you from "booking pending" to "booked". If it fails, you can retry or revert the booking to "booking failed".
This then at least allows you to start putting validation around not attempting to double book etc.
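A minimal sketch of the granular version, assuming a plain in-memory object and hypothetical names (FlightBooking, BookingState). Each command maps to one guarded transition, which is where the double-booking validation ends up:

```kotlin
// Hypothetical sketch: state names mirror the text, class names are made up.
enum class BookingState { NOT_BOOKED, BOOKING_PENDING, BOOKED, BOOKING_FAILED }

class FlightBooking {
    var state: BookingState = BookingState.NOT_BOOKED
        private set

    /** RequestBookingCommand: not booked -> booking pending. */
    fun requestBooking() =
        transition(from = BookingState.NOT_BOOKED, to = BookingState.BOOKING_PENDING)

    /** Outcome of ExternalBookingCommand: booking pending -> booked or booking failed. */
    fun completeExternalBooking(succeeded: Boolean) = transition(
        from = BookingState.BOOKING_PENDING,
        to = if (succeeded) BookingState.BOOKED else BookingState.BOOKING_FAILED
    )

    /** Rejects any transition that does not start from the expected state. */
    private fun transition(from: BookingState, to: BookingState) {
        check(state == from) { "Cannot move to $to from $state" }
        state = to
    }
}
```

Calling requestBooking() twice throws on the second call, because the booking is already pending.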
I can't speak to sagas specifically, but I would like to think you could represent the protocol of "booking commands" as a little saga, mapping you from one domain state (not booked) to the eventual state (booked) with as many stops as you need in between.
In either approach, what is important is defending domain state and ensuring any transactions are integral. Going more granular with the states and events might also help, because you can use better language (one of DDD's tenets) to describe what is occurring: for example, RequestBookingCommand leaves you in a BookingRequested state, and a following PerformExternalBooking command starts from BookingRequested and leaves you in a Booked or BookingFailed state. You can also then introduce domain events such as SuccessfullyBooked or BookingRequestedOnFoo.
My approach to these situations, usually, is to try not to overthink it and first build a model that matches how I describe it verbally. Frameworks and infrastructure can help you combat technical considerations (such as transactions or concurrency).

Unless this is an internal microservice that is really fast and stable, I would do this in a Saga/Process Manager/Gateway: an async actor with its own state machine.
With external services you would want to have error processing, retries, timeouts - everything async, so your aggregate is not blocked.
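To make the retry/timeout part concrete, here is a small sketch using only the JDK's CompletableFuture. The function name, the result types and the parameters are my own assumptions; a real process manager would also persist its state between attempts, which this does not do:

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.TimeUnit

// Hypothetical result type for the sketch.
sealed class ExternalResult {
    data class Succeeded(val reference: String, val attempts: Int) : ExternalResult()
    data class GaveUp(val attempts: Int, val lastError: String) : ExternalResult()
}

/**
 * Runs the external call off the caller's thread, with a per-attempt timeout
 * and a bounded number of retries. The caller gets a future back immediately.
 */
fun bookWithRetries(
    maxAttempts: Int,
    timeoutMillis: Long,
    externalCall: () -> String
): CompletableFuture<ExternalResult> = CompletableFuture.supplyAsync<ExternalResult> {
    var result: ExternalResult? = null
    var attempt = 0
    var lastError = "no attempts made"
    while (result == null && attempt < maxAttempts) {
        attempt++
        try {
            // Note: a timed-out attempt is abandoned, not cancelled, in this sketch.
            val reference = CompletableFuture.supplyAsync { externalCall() }
                .get(timeoutMillis, TimeUnit.MILLISECONDS)
            result = ExternalResult.Succeeded(reference, attempt)
        } catch (e: Exception) {
            lastError = e.cause?.message ?: e.toString()
        }
    }
    result ?: ExternalResult.GaveUp(attempt, lastError)
}
```

The aggregate never waits on this; it would only be told about Succeeded or GaveUp through a follow-up command or event.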


Axon Server auto-scaling split/merge delay

I am implementing auto-scaling in an application using Axon Server, and running in k8s.
I have created ReST endpoints in the application itself, which look at the local configuration (for processors and thread counts) and then speak to the Axon Server ReST API in order to split/merge the processors appropriately. The intent being to use container lifecycle hooks to trigger them.
As a result, if a new instance (pod) of an application is launched, configured for 2 threads on ProcessorA, then my code will make 2 requests to the /v1/components/blah/processors/ProcessorA/segments/split?context=default endpoint on the server. This is in order to make full use of the 2 new threads.
Likewise, when the pod is shut down, it makes 2 similar requests to the merge endpoint on the server.
When scaling up I see the processor split twice, as expected. However, on shutdown I don't see the merge twice unless I put a long (5s) wait between requests. This isn't likely to be particularly stable, so I'm wondering if there's something else I need to be doing.
Perhaps I ought to request the merge, then loop waiting for it to occur, then request another. This seems like it's going to be excessively slow.
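For what it's worth, the "request, then wait for it to occur" loop could look roughly like the sketch below. The two lambdas are placeholders for whatever calls I end up making against the server (requestMerge and currentSegmentCount are hypothetical, not Axon Server API names), and the intervals are guesses:

```kotlin
/**
 * Requests `merges` merges one at a time, polling after each request until
 * the segment count is seen to drop. Returns how many merges were observed.
 * requestMerge and currentSegmentCount are supplied by the caller (hypothetical).
 */
fun mergeAndAwait(
    merges: Int,
    requestMerge: () -> Unit,
    currentSegmentCount: () -> Int,
    pollIntervalMillis: Long = 200,
    timeoutMillis: Long = 10_000
): Int {
    var completed = 0
    repeat(merges) {
        val before = currentSegmentCount()
        requestMerge()
        val deadline = System.currentTimeMillis() + timeoutMillis
        // Poll until the merge is visible, or give up at the deadline.
        while (currentSegmentCount() >= before) {
            if (System.currentTimeMillis() > deadline) return completed
            Thread.sleep(pollIntervalMillis)
        }
        completed++
    }
    return completed
}
```

This at least replaces the fixed 5s wait with "as long as it takes, up to a limit", though shutdown still has to fit within the pod's termination grace period.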
There was another question on SO somewhat related, Automatically scale Axon's tracking event processors, where Steven commented that there was no inbuilt auto-scaling in Axon Server at that point in time. I've not seen anything in more recent times either.
As it stands work is underway to improve the split/merge functionality. For one, the result of a split/merge will be returned, which has been resolved under issue #1001.
This should make it so you do not have to wait for the statuses to be updated, which is likely why it (seems to) take so long. This functionality will be part of Axon Framework / Server 4.4, by the way, which should be released relatively soon.
Subsequently, discussions are still underway to allow for auto scaling. One requirement deemed important is the capability of a TrackingEventProcessor to process several segments per thread (issue #1434). This will ensure that the TEP can take over several segments to transition the boundary when scaling, for example.
Eventually though, Axon Server should be able to do this for you. It's just not there yet.
So for now I think the most pragmatic solution is indeed to wait for the result to show up in the statuses. As said, I trust 4.4 will improve upon this by returning the result of the split/merge operation once called. Lastly, the Axon team is aware this can be improved upon further, which is why discussions on the matter are underway.

Where to draw the line with reactive programming [closed]

I have been using RxJava in my project for about a year now.
With time, I grew to love it very much - now I'm thinking maybe too much...
Most methods I write now have some form of Rx in it, which is great! (until it's not).
I now notice that some methods require a lot of work to combine the different observable producing methods.
I get the feeling that although I understand what I write now, the next programmer will have a really hard time understanding my code.
Before I get to the bottom line let me give an example straight from my code in Kotlin (Don't dive too deep into it):
private fun <T : Entity> getCachedEntities(
        getManyFunc: () -> Observable<Timestamped<List<T>>>,
        getFromNetwork: () -> Observable<ListResult<T>>,
        getFunc: (String) -> Observable<Timestamped<T>>,
        insertFunc: (T) -> Unit,
        updateFunc: (T) -> Unit,
        deleteFunc: (String) -> Unit)
        = concat(
                getManyFunc().filter { isNew(it.timestampMillis) }
                        .map { ListResult(it.value, "") },
                getFromNetwork().doOnNext {
                    syncWithStorage(it.entities, getFunc, insertFunc, updateFunc, deleteFunc)
                }).first()
        .onErrorResumeNext { e -> // If a network error occurred, return the cached data and the error
            concat(getManyFunc().map { ListResult(it.value, "") }, error(e))
        }
Briefly what this does is:
Retrieve some timestamped data from storage
If data is not new, fetch data from network
Sync network data again with the storage (to update it)
If a network error occurred, again retrieve the older data and the error
And here comes my actual question:
Reactive programming offers some really powerful concepts. But as we know with great power comes great responsibility.
Where do we draw the line? Is it OK to fill our entire programs with awesome reactive oneliners or should we save it only for really mundane operations?
Obviously this is very subjective, but I hope someone with more experience can share their knowledge and pitfalls.
Let me phrase it better
How do I design my code to be reactive yet easy to read?
When you pick up Rx, it becomes this awesome shiny hammer and everything starts looking like a rusty nail just waiting for you to bang in.
Personally, I think the biggest clue is in the name, reactive framework. Given a requirement, you need to reflect upon whether a reactive solution truly makes sense.
In any Rx proposition, you are looking to introduce one or more event streams and carry out some action in response to an event.
I think there are two key questions to ask:
Are you in control of the event stream?
To what degree must you complete responses at the rate of the event stream?
If you do not have control of the event stream and you must respond at the rate of the event stream then Rx is a good candidate.
In any other circumstance, it is probably a poor choice.
I have seen many examples where people have jumped through hoops to create the illusion of a lack of control in order to justify Rx - which seems crazy to me. Why give up the control that you have?
Some examples:
You have to extract data from a fixed list of files and store it in a database. You decide to push each file name into a subject and create a reactive pipeline that opens each file and projects the data, then processes the data in some way and finally writes it to the database.
This fails the control test and the rate test. It would be far easier to iterate over the files and pull them in and process them as fast as you can. The phrase "decide to push" is the giveaway here.
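For comparison, the pull version of the file example is about as boring as code gets. A sketch, with importFiles and the store callback being made-up names standing in for the real parsing and database write:

```kotlin
import java.io.File

/**
 * Pulls each file in turn, extracts its non-blank lines and hands them to
 * `store` (a stand-in for the database write). Returns the record count.
 */
fun importFiles(files: List<File>, store: (String, List<String>) -> Unit): Int {
    var stored = 0
    for (file in files) {
        val records = file.readLines().filter { it.isNotBlank() }
        store(file.name, records)
        stored += records.size
    }
    return stored
}
```

You decide when the next file is read, so there is no stream to keep up with.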
You need to display stock prices from a stock exchange.
Clearly this is a good choice for Rx. If you can't keep up with the rate of prices in general, you are screwed. It might be the case that you conflate prices (perhaps to provide an update only once every second) - but this still qualifies as keeping up. The one thing you can't do is ask the stock exchange to slow down.
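To illustrate what I mean by conflating, here is a batch-style sketch over an already collected list of ticks (Tick and conflate are hypothetical names; in a real Rx pipeline you would use the library's own sampling operators rather than this):

```kotlin
// Hypothetical tick type for the sketch.
data class Tick(val symbol: String, val price: Double, val timestampMillis: Long)

/**
 * Keeps only the last tick per symbol within each time window.
 * Assumes the input is already in arrival (time) order.
 */
fun conflate(ticks: List<Tick>, windowMillis: Long): List<Tick> =
    ticks.groupBy { it.symbol to it.timestampMillis / windowMillis }
        .values
        .map { it.last() }
        .sortedBy { it.timestampMillis }
```

Fewer updates reach the screen, but nothing upstream was asked to slow down, so this still counts as keeping up.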
These (real world) examples pretty much fall at opposite ends of the spectrum and don't have much grey area. But there is a lot of grey area out there where control isn't clear.
Sometimes you are wearing the client hat in a client/server system and it can be easy to fall into the trap of sacrificing control, or putting control in the wrong place - which can easily be fixed with correct design. Consider this:
A client application displays news updates from a server.
News updates are submitted to the server at any time and are created in high volume.
The client should be refreshed at an interval set by the client.
Refresh interval can be changed at any time and the user can always request an immediate refresh.
The client only shows updates tagged with particular keywords, as specified by the user.
The news updates are sometimes lengthy and the client should not store the full content of news updates, but rather display the headline and summary.
At user request, the full content of an article can be shown.
Here, the frequency of news updates is not in control of the client. But the desired refresh rate and the tags of interest are.
For the client to receive all the news updates as they arrive and filter them client side isn't going to work. But there are plenty of options:
Should the server send a data stream of updates taking into account the client refresh rate? What if the client goes offline?
What if there are thousands of clients? What if the client wants an immediate refresh?
There are lots of valid ways to tackle this problem that include more or less reactive elements. But any good solution should take account of the client's control of tags and desired refresh rate, and the lack of control of news update frequency (by client or server). You might want the server to react to changes in client interest by updating the events that it pushes to the client - which it pushes only as long as the client is listening (detected via a heartbeat). When the user wants a full article, then the client would pull the article down.
There is much debate in the Rx community about back-pressure. This is the idea that the client should inform the server when it is overloaded and the server respond by somehow reducing the event stream. I think this is a misguided approach that can lead to confusing designs.
To my mind, as soon as a client needs to give this feedback, it has failed the response rate test. At this point, you are not in a reactive situation, you are in an async enumerable situation. i.e. The client should be saying "I am ready" when it is ready for more and then waiting in a non-blocking fashion for server to respond.
This would be appropriate if the first scenario were modified to be files arriving in a drop-folder, of varying lengths and complexity to process. The client should make a non-blocking call for the next file, process it, and repeat. (Add parallelism as required) - and not be responding to a stream of file-arrived events.
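A stripped-down sketch of that "I am ready" loop, assuming a caller-supplied nextItem function that returns null when nothing is left (both names are mine). It is written as a plain blocking loop to keep it short; a non-blocking version would have the same shape with an awaitable nextItem:

```kotlin
/**
 * Pull-based consumption: the client asks for one item only when it is
 * ready, processes it, and repeats. Returns the number of items handled.
 */
fun <T : Any> drain(nextItem: () -> T?, process: (T) -> Unit): Int {
    var handled = 0
    while (true) {
        val item = nextItem() ?: break // "I am ready": ask for exactly one more
        process(item)
        handled++
    }
    return handled
}
```

The rate is set by the consumer, so the question of back-pressure never comes up.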
Wrap up
I've deliberately avoided other valid concerns such as maintainability of code, performance of Rx itself etc. Mostly because they are addressed elsewhere, and more importantly because I think the ideas here are more divisive than those concerns.
So if you reflect on the elements of control and response rate in your scenario, you will probably stay on the right track.
The response rate issue can be subtle - and the degree aspect is important. Arrival rate can fluctuate, and there is going to be some acceptable degree of fluctuation in response rate - clearly, if you don't ultimately have a way to "catch up" then at some point the client will blow up.
I find that there are two things I keep in mind when writing Rx (or any mildly sophisticated/new technology)
Can I test it?
Can I easily hire someone who can maintain it? Not someone who will struggle to maintain it, but someone who will be fine left alone to maintain it?
To this end, I also find that just because you can, doesn't always mean you should. As a guide I try to avoid creating queries that are over say 7 lines of code. Queries bigger than this, I try to separate into sub queries that I compose.
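As an illustration of splitting into named pieces, here is the shape of the cache-then-network logic from the question, written as small composable functions. This is deliberately plain synchronous Kotlin rather than Rx, and every name in it (Cached, Loaded, loadEntities and so on) is an assumption of mine, not the asker's code; the same split applies to Rx sub-queries:

```kotlin
// Hypothetical types standing in for the question's Timestamped / ListResult.
data class Cached<T>(val value: T, val timestampMillis: Long)
data class Loaded<T>(val entities: List<T>, val error: Throwable? = null)

/** Step 1: cached data, but only if it is still considered new. */
fun <T> freshFromCache(cache: () -> Cached<List<T>>?, isNew: (Long) -> Boolean): List<T>? =
    cache()?.takeIf { isNew(it.timestampMillis) }?.value

/** Steps 2 and 3: fetch from the network and sync the result into storage. */
fun <T> fetchAndSync(network: () -> List<T>, sync: (List<T>) -> Unit): List<T> =
    network().also(sync)

/** Step 4: on failure, fall back to whatever is cached and carry the error. */
fun <T> staleWithError(cache: () -> Cached<List<T>>?, error: Throwable): Loaded<T> =
    Loaded(cache()?.value.orEmpty(), error)

/** The composed query reads like the verbal description. */
fun <T> loadEntities(
    cache: () -> Cached<List<T>>?,
    network: () -> List<T>,
    sync: (List<T>) -> Unit,
    isNew: (Long) -> Boolean
): Loaded<T> {
    freshFromCache(cache, isNew)?.let { return Loaded(it) }
    return try {
        Loaded(fetchAndSync(network, sync))
    } catch (e: Exception) {
        staleWithError(cache, e)
    }
}
```

Each piece can be tested on its own, which also answers the "can I test it?" question above.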
If the code you have provided is at the core of the code base, and is at the extreme end of the complexity, then it may be fine. However, if you find all of your Rx code carries that much complexity, you may be creating a code base that is difficult to work with.

What triggers UI refresh in CQRS client app?

I am attempting to learn and apply the CQRS design approach (pattern and architecture) to a new project but seem to be missing a key piece.
My client application executes a query and retrieves a list of light-weight, read-only DTOs from the read model. The user selects an item and clicks a button to initiate some action. The action is performed by creating and sending the corresponding command object to the write model (where the command handler carries out the action, updates the data store, etc.) At some point, however, I need to update the UI to reflect changes to the state of the application resulting from the action.
How does the UI know when it is time to refresh the original list?
Additional Info
I have noticed that most articles/blogs discussing CQRS use MVC client apps in their examples. I am working on a Silverlight client right now and am beginning to wonder if the pattern simply doesn't work in that case.
Follow-Up Question
After thinking more about Bartlomiej's response and subsequent discussion, I am wondering about error handling in CQRS. Given that commands are basically fire-and-forget asynchronous operations, how do we report an error condition to the UI?
I see 'refreshing the UI' to take one of two forms:
The operation succeeds, data has changed and the UI should be updated to reflect these changes
The operation fails, data has not changed but the user should be notified of the failure and potential corrective actions.
Even with a Post-Redirect-Get pattern in an MVC, you can't really Redirect until you know the outcome of the operation. None of the examples I've seen thus far address these real-world concerns.
I've been struggling with similar issues for a WPF client. The re-query trigger for any data depends on the data you're updating. Commands tend to fall into categories:
The command is a true fire and forget method, it informs the back-end of a state change but this change does not need to be reflected in the UI, or the change simply isn't important to the UI.
The command will alter the result of a single query
The command will alter the result of multiple queries, usually (in my domain at least) in a cascading fashion, that is, changing the state of a single "high level" piece of data will likely affect many "low level" caches.
My first trigger is the page load, very few items are exempt from this as most pages must assume data has been updated since it was last visited. Though some systems may be able to escape with only updating financial and other critical data in this way.
For short commands I also update data when 'success' is returned from a command. Though this is mostly laziness as IMHO all CQRS commands should be fired asynchronously. It's still an option I couldn't live without but one you may have to if your implementation expects high latency between command and query.
One pattern I'm starting to make use of is the mediator (most MVVM frameworks come with one). When I fire a command, I also fire a message to the mediator specifying which command was launched. Each Cache (A view model property Retriever<T>) listens for commands which affect it and then updates appropriately. I try to minimise the number of messages while still minimising the number of caches that update unnecessarily from a single message, so I'll (hopefully) eventually end up with a shortlist of update reasons, with each 'reason' updating a list of caches.
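Stripped of any MVVM framework, the idea is roughly the following. It is sketched in Kotlin to keep it short; CommandMediator and its methods are made-up names, and Retriever<T> here is only a guess at the shape of my own cache class:

```kotlin
/** Routes "command X was sent" messages to the caches that care about X. */
class CommandMediator {
    private val listeners = mutableMapOf<String, MutableList<() -> Unit>>()

    fun subscribe(commandName: String, refresh: () -> Unit) {
        listeners.getOrPut(commandName) { mutableListOf() }.add(refresh)
    }

    fun commandSent(commandName: String) {
        listeners[commandName].orEmpty().forEach { it() }
    }
}

/** A cache that re-runs its query against the read model when told to. */
class Retriever<T>(private val query: () -> T) {
    var value: T = query()
        private set

    fun refresh() {
        value = query()
    }
}
```

A cache subscribes once per command that affects it, e.g. mediator.subscribe("PlaceOrder") { orders.refresh() }, and unrelated commands leave it alone.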
Another approach is simple honesty: I find that exposing graphically how the system updates itself makes users more willing to be patient with it. On firing a command, show some UI indicating you're waiting for the successful response; on error you could offer to retry or show the error; on success you start the update of the relevant fields. Bear in mind that this command could have been fired from another terminal (of which you have no knowledge), so data will also need to time out eventually to avoid missing state changes invoked by other machines.
Note the irony that the only efficient method of updating caches and values on a client is to un-separate the commands and queries again, be it through hardcoding or something like a hashmap.
My two cents.
I think MVVM actually fits into CQRS quite well. The ViewModel simply becomes an observable ReadModel.
1 - You initialize your ViewModel state via a query on the ReadModel.
2 - Changes on your ViewModel are automatically reflected on any Views that are bound to it.
3 - Certain changes on your ViewModel trigger a command that is propagated to a message queue; an object responsible for sending those commands to the server takes those messages off the queue and sends them to the WriteModel.
4 - Clients should be well formed, meaning the ViewModel should have performed appropriate validation before it ever triggered the command. Once the command has been triggered, any event notifications can be published onto an event bus for the client to communicate changes to other ViewModels or components in the system interested in those changes. These events should carry the relevant information necessary. Typically, this means that other view models usually don't have to re-query the read model as a result of the change unless they are dependent on other data that needs to be retrieved.
5 - There is an object that connects to the message bus on the server for real-time push notifications when other clients make changes that this client is interested in knowing about, falling back to long-polling if necessary. It propagates those to the internal message bus that ties the components on the client together.
6 - The last part to handle is the fact that clients can be occasionally connected, which should be the only reason a command fails (they don't have internet access at the moment), which is when the client should be notified of problems.
In my ASP.NET MVC 3 I use 2 techniques depending on use case:
already well-known Post-Redirect-Get pattern which fits nicely with CQRS. Your MVC action that triggers the command returns a redirection to action that performs a query.
in some cases, like real-time updates of other clients, I rely on domain events/messages. I create an event handler that uses SignalR to push changes to all connected and interested clients.
There are two major ways you can take, as far as I know:
1) Design your UI so that the user does not see their changes right away. For instance, show a message telling them the action succeeded and offer different choices to continue their work. This should buy you enough time to update your read model.
2) More complex: keep the information you have sent to the server and show it in the interface.
Most important, I guess: educate your users if you can, so that they know why the data is not here... yet!
I am thinking about it only now, but these are for sync command handling, not async. With async, things get much harder on the brain... the client interface becomes an event eater too.

Saga, Commands, events & ReadModel

I am currently writing my first saga and I am a bit puzzled with the read model. Let's explain it with an example :
I have three bounded contexts: programming, contractor and control. Each of them has its specific read model.
Workflow:
Programming sends an event "JobScheduled".
The Saga receives this event and tells Contractor to "schedule the Work".
When done, the Contractor sends an event "JobDone".
The Saga receives this event and tells Control to "Start Control Period".
Everything turns out to be fine here. We are on the write side so we are passing vital information for the process to go on.
My question comes with unnecessary information. Let's say that the event "JobScheduled"
has a note field : "test note", and before this job is done this field is changed to "test note important". This change is of no importance to the workflow as described, but it is important that the contractor might see the change in the field when looking at the read model of the contractor bounded context.
Am I to give to the saga the event NoteChanged and process it or should I create a projection which directly listens to this event in my contractor bounded context?
To give it to the saga looks to me like unnecessary work because I am only updating readmodel here there is no domain involved in the change.
On the other hand, coupling the two bounded contexts directly removes one of the assets of sagas, which is the possibility of modifying the interactions the bounded contexts have with each other in the workflow.
Thanks for your reading,
If changing the note is important, it should be modelled explicitly. This can be accomplished by introducing an Event just like you already did.
If said Event has any relevancy to the process, it can be handled by a Saga. If it only needs to be represented in different Read Models, then just handling it in their respective projections should be fine.
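A small sketch of that split, with hypothetical class names (ContractorJobNotes, JobSaga) and commands reduced to strings for brevity. Both consumers see the same events, but only the projection cares about NoteChanged:

```kotlin
// Event names follow the question; everything else is made up for the sketch.
sealed class JobEvent { abstract val jobId: String }
data class JobScheduled(override val jobId: String, val note: String) : JobEvent()
data class NoteChanged(override val jobId: String, val note: String) : JobEvent()
data class JobDone(override val jobId: String) : JobEvent()

/** Read-model projection in the contractor context: tracks the note only. */
class ContractorJobNotes {
    private val notes = mutableMapOf<String, String>()

    fun on(event: JobEvent) {
        when (event) {
            is JobScheduled -> notes[event.jobId] = event.note
            is NoteChanged -> notes[event.jobId] = event.note
            is JobDone -> Unit
        }
    }

    fun noteFor(jobId: String): String? = notes[jobId]
}

/** The saga reacts only to events that move the workflow forward. */
class JobSaga(private val send: (String) -> Unit) {
    fun on(event: JobEvent) {
        when (event) {
            is JobScheduled -> send("ScheduleWork:${event.jobId}")
            is JobDone -> send("StartControlPeriod:${event.jobId}")
            is NoteChanged -> Unit // irrelevant to the process
        }
    }
}
```

The saga stays free to change how the contexts interact, while the read model gets the note without any extra command round trip.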
One context may very well listen to and handle the events of another one, even across application boundaries. At least this is how cross-context integration should work in an event centric architecture.
Personally I would send a command "Change note", because I see it as information that has to be saved in the event stream of your aggregate. Then, if your saga doesn't need to tell anyone about this command, or simply hands the information to a handler that quietly updates your read model, I guess that is fine.

C# What is my best option? Service/Application/Multiple Applications [closed]

I am developing a solution that requires a number of tasks to be completed at various times. Example:
Task 1 - Monitor mailbox, process mail items
Task 2 - Monitor mailbox (different folder), process mail items
Task 3 - Generate PDF reports
Task 4 - Monitor folder, distribute files via email as attachments when new ones arrive.
I have already implemented the solution, however, it was basically just a quick fix to get the thing running. Now that it is up, I want to revisit the current setup and improve it so it is as efficient as possible.
For the current solution I have created a separate application for each different task and used the Task Scheduler to execute them at specific times.
Task 1 is a console application that runs on a scheduled task every 5 minutes
Task 2 is a console application that runs on a scheduled task every 5 minutes (2 minutes after the first application, because Task 1 will move emails into the folder Task 2 is monitoring)
Task 3 is run at 5am every day as a run-once application on a scheduled task
Task 4 is running indefinitely.
My question is, does this seem like a reasonable approach for a solution to this type of application? Do some of the tasks seem better as a service rather than an application?
I think I'd probably use a single service which can be easily configured to run the various tasks (so that if you want to separate them later, you can do so).
Scheduling specific applications is okay and certainly a simpler way of working, but this feels more like a service to me. Of course, if you separate out the "doing stuff" logic from the "invocation" side of things, you can easily switch from one to the other.
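To show what I mean by separating the two, here is a language-neutral sketch (written in Kotlin rather than C#, with made-up names ScheduledJob and JobRunner). The jobs know nothing about how they are invoked, so the same code can sit behind a scheduled console app or a long-running service:

```kotlin
/** The "doing stuff" side: one unit of work, unaware of who calls it. */
fun interface ScheduledJob {
    fun execute()
}

/** The "invocation" side: two hypothetical ways of driving the same jobs. */
class JobRunner(private val jobs: Map<String, ScheduledJob>) {
    /** Scheduler-style entry point: run one named job, then return. */
    fun runOnce(name: String) = jobs.getValue(name).execute()

    /** Service-style host: run every job on each cycle, pausing in between. */
    fun runCycles(cycles: Int, pauseMillis: Long) {
        repeat(cycles) {
            jobs.values.forEach { it.execute() }
            Thread.sleep(pauseMillis)
        }
    }
}
```

Switching from one hosting model to the other then only touches the thin entry point, not the mail or PDF logic.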
The efficiency side of things is unlikely to change much by this decision. Do you have good grounds to be worried about the overall efficiency at the moment? Have you profiled your applications to work out where any bottlenecks are? I'd say they're unlikely to be in the scheduling side of things.
A service sounds like the right way to approach this.
Long-running subtasks such as PDF generation are well suited to asynchronous execution, i.e. using worker threads that call back to the parent thread upon completion. This way the monitor tasks can run independently of the action tasks.
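A rough sketch of that worker-plus-callback shape, using the JDK's executor and CompletableFuture from Kotlin rather than C#. ReportWorker, render and onDone are hypothetical names, and the callback here runs on a pool thread rather than being marshalled back to a specific parent thread:

```kotlin
import java.util.concurrent.CompletableFuture
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors

/** Runs slow rendering on a worker pool and reports back via a callback. */
class ReportWorker(private val pool: ExecutorService = Executors.newFixedThreadPool(2)) {

    /** Returns immediately; onDone receives the report name and its size in bytes. */
    fun generate(
        name: String,
        render: (String) -> ByteArray,
        onDone: (String, Int) -> Unit
    ): CompletableFuture<Void> =
        CompletableFuture.supplyAsync({ render(name) }, pool)
            .thenAccept { bytes -> onDone(name, bytes.size) }

    fun shutdown() = pool.shutdown()
}
```

The monitoring loop can call generate(...) and carry on polling the mailbox while the report is built in the background.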