How to "join" two Aggregate Roots when preparing View Model? - cqrs

Assume that Book and Author are Aggregate Roots in my model.
In the read model I have a table AuthorsAndBooks, which is a list of Authors and Books joined by Book.AuthorId.
When the BookAdded event is fired, I want to receive Author data to create a new AuthorsAndBooks row.
Because Book is an Aggregate Root, information about the Author isn't included in the BookAdded event. And I cannot include it, because the Author root doesn't have getters (following the guidelines in all the examples and posts about CQRS and Event Sourcing).
I usually receive two types of answers to this question:
Enrich your domain event with all the data you need in event handlers. But as I said, I cannot do that for Aggregate Roots.
Use available data from the View Model, i.e. load the Author from the View Model and use it to build the AuthorsAndBooks row.
The last one has some problems with concurrency. Author data may not be available in the View Model at the time the BookAdded event is being handled.
What approach do you use to solve this? Thank you.

As general advice, make the event handlers idempotent and make sure you can deal with out-of-order message handling (either by re-queuing or by building in mechanisms to fill in missing data).
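A minimal sketch of what "idempotent" and "fill in missing data" could look like for the AuthorsAndBooks table. The AuthorCreated event, the event_id field and the in-memory dictionaries are assumptions made for illustration; the question only names BookAdded.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorsAndBooksProjection:
    """In-memory stand-in for the AuthorsAndBooks read-model table."""
    authors: dict = field(default_factory=dict)   # author_id -> author name
    rows: dict = field(default_factory=dict)      # book_id -> row dict
    seen: set = field(default_factory=set)        # ids of events already applied

    def handle(self, event: dict) -> None:
        # Idempotency: a redelivered event is ignored.
        if event["event_id"] in self.seen:
            return
        self.seen.add(event["event_id"])

        if event["type"] == "AuthorCreated":      # hypothetical event name
            self.authors[event["author_id"]] = event["name"]
            # Fill in rows for books that arrived before their author.
            for row in self.rows.values():
                if row["author_id"] == event["author_id"]:
                    row["author_name"] = event["name"]
        elif event["type"] == "BookAdded":
            self.rows[event["book_id"]] = {
                "book_id": event["book_id"],
                "title": event["title"],
                "author_id": event["author_id"],
                # None marks a gap that a later AuthorCreated will fill.
                "author_name": self.authors.get(event["author_id"]),
            }


projection = AuthorsAndBooksProjection()
book_added = {"event_id": "e1", "type": "BookAdded", "book_id": "b1",
              "title": "Dune", "author_id": "a1"}
author_created = {"event_id": "e2", "type": "AuthorCreated",
                  "author_id": "a1", "name": "Frank Herbert"}

# Deliver out of order, with a duplicate of the first event.
for event in (book_added, author_created, book_added):
    projection.handle(event)
```

The row exists as soon as BookAdded arrives, and the author columns are completed whenever the author event shows up, so the handler never has to wait.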
On the other hand, do question why author and book are such disparate aggregate roots. Maybe you should copy from the author upon adding a book (what the f* is "adding a book", how's that a command?). The problem is all these made-up examples. Descend to the real world; I doubt your problem exists.

Your question is missing some context, for example what is the user scenario that leads to this event and what is the state you are starting from? If you were writing the BDD tests for this case, what would they look like? Knowing this would help a lot in answering your question.
How you solve the problem of relating a book to an author is domain dependent. First, we are assuming it makes sense for your domain to have an aggregate for Author and an aggregate for Book. For example, if I were writing a library system, I doubt I would have an aggregate for authors, since I don't care about an author without his/her books; what I care about is books.
As for the lack of getters, it's worth mentioning that aggregate roots don't have getters because of a preference for a tell-don't-ask style of OOP. However, you can tell one AR to do something, which then tells something to another AR if you need. What matters is that the AR tells the others about itself, rather than you writing code that asks it and then passes the answer along.
Finally, I have to ask why you don't have the author's ID at the time you are adding the book? How would you even know who the author is then? I would assume you could just do the following (my code assumes you are using a fluent interface for creation of AR, but you can substitute factories, constructors, whatever you use):
CreateNew.Book()
    .ForAuthor(command.AuthorId)
    .WithContent(command.Content);
Now perhaps the scenario is you are adding a book along with a brand new author. I would either handle this as two separate commands (which may make more sense for your domain), or handle the command the following way:
var author = CreateNew.Author()
    .WithName(command.AuthorName);
var book = CreateNew.Book()
    .ForAuthor(author.Id)
    .WithContent(command.Content);
Perhaps the problem is you have no getter on the aggregate root Id, which I don't believe is necessary or common. However, assuming Id encapsulation is important to you, or your BookAdded event needs more information about the author than the Id alone can provide, then you could do something like this:
var author = CreateNew.Author()
    .WithName(command.AuthorName);
var book = author.AddBook(command.Content);

// Inside the Author aggregate: adds a new book belonging to this Author
public Book AddBook(BookContent content) {
    return CreateNew.Book()
        .ForAuthor(this.Id)
        .WithContent(content);
}
Here we are telling the author to add a book, at which point it creates the aggregate root for the book and passes its Id to the book. Then we can have the event BookAddedForAuthor, which will have the Id of the author.
The last one has a downside though: it creates a command that must act through multiple aggregate roots. As much as possible, I would try to figure out why the first example isn't working for you.
Also, I can't stress enough how the implementation you are looking for is dictated by your specific domain context.

IMHO, populate the read model from author/book events, using reordering to handle cases where events arrive out of order (the view handler is within its own consistency boundary and should handle ordering/deduplication cases anyway).

The first thing I would ask is why there are concurrency issues in the read model. If the client is sending a reference to the author aggregate inside the AddBook command, where did it get the information from? If the book and author are created at the same time, then your event can probably be enriched. Let me know if I'm missing something here.

The last one has some problems with concurrency. Author data may not be available in the View Model at the time the BookAdded event is being handled.
What about "handling the event later"? So you simply put it to the back of the queue until this data is available (maybe with a limit of x tries and x time between each try).
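A rough sketch of that "back of the queue" idea, assuming an in-memory queue of (event, attempts) pairs. MAX_ATTEMPTS, the AuthorCreated event and the dead_letters list are made up for illustration, and the waiting time between tries is left out to keep it short.

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed limit; tune for your system


def drain(queue, authors, rows, dead_letters):
    """Process events, re-queuing BookAdded events whose author is unknown."""
    while queue:
        event, attempts = queue.popleft()
        if event["type"] == "AuthorCreated":      # hypothetical event name
            authors[event["author_id"]] = event["name"]
        elif event["type"] == "BookAdded":
            name = authors.get(event["author_id"])
            if name is not None:
                rows.append((name, event["title"]))
            elif attempts + 1 < MAX_ATTEMPTS:
                # Author not projected yet: retry after the rest of the queue.
                queue.append((event, attempts + 1))
            else:
                # Give up so a missing author cannot block the queue forever.
                dead_letters.append(event)


queue = deque([
    ({"type": "BookAdded", "title": "Dune", "author_id": "a1"}, 0),
    ({"type": "AuthorCreated", "author_id": "a1", "name": "Frank Herbert"}, 0),
    ({"type": "BookAdded", "title": "Orphan", "author_id": "a9"}, 0),
])
authors, rows, dead_letters = {}, [], []
drain(queue, authors, rows, dead_letters)
```

The first book is retried once and succeeds after its author has been processed; the book whose author never arrives ends up in dead_letters after three attempts.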

Related

How to handle static data in ES/CQRS?

After reading dozens of articles and watching hours of videos, I don't seem to get an answer to a simple question:
Should static data be included in the events of the write/read models?
Let's take the oh-so-common "orders" example.
In all examples you'll likely see something like:
class OrderCreated(Event):
    ....
class LineAdded(Event):
    itemID
    itemCount
    itemPrice
But in practice, you will also have lots of "static" data (products, locations, categories, vendors, etc).
For example, we have a STATIC products table, with their SKUs, description, etc. But in all examples, the STATIC data is never part of the event.
What I don't understand is this:
Command side: should the STATIC data be included in the event? If so, which data? Should the entire "product" record be included? But a product also has a category and a vendor. Should their data be in the event as well?
Query side: should the STATIC data be included in the model/view? Or can/should it be JOINED with the static table when an actual query is executed.
If static data is NOT part of the event, then the projector cannot add it to the read model, which implies that the query MUST use joins.
If static data IS part of the event, then let's say we change something in the products table (e.g. typo in the item description), this change will not be reflected in the read model.
So, what's the right approach to using static data with ES/CQRS?
Should static data be included in the events of the write/read models?
"It depends".
First thing to note is that ES/CQRS are a distraction from this question.
CQRS is simply the creation of two objects where there was previously only one. -- Greg Young
In other words, CQRS is a response to the idea that we want to make different trade offs when reading information out of a system than when writing information into the system.
Similarly, ES just means that the data model should be an append only sequence of immutable documents that describe changes of information.
Storing snapshots of your domain entities (be that a single document in a document store, or rows in a relational database, or whatever) has to solve the same problems with "static" data.
For data that is truly immutable (the ratio of a circle's circumference and diameter is the same today as it was a billion years ago), pretty much anything works.
When you are dealing with information that changes over time, you need to be aware that the answer changes depending on when you ask the question.
Consider:
Monday: we accept an order from a customer
Tuesday: we update the prices in the product catalog
Wednesday: we invoice the customer
Thursday: we update the prices in the product catalog
Friday: we print a report for this order
What price should appear in the report? Does the answer change if the revised prices went down rather than up?
Recommended reading: Helland 2015
Roughly, if you are going to need now's information later, then you need to either (a) write the information down now or (b) write down the information you'll need later to look up now's information (ex: id + timestamp).
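To make the two options concrete, here is a small sketch built on the Monday-to-Friday timeline above. The price history, the SKU and the dates are made-up numbers, and price_as_of is a hypothetical helper, not part of any library.

```python
from bisect import bisect_right
from datetime import date

# Catalog price history for one SKU: (effective_from, price), sorted by date.
# The initial price and both revisions are made-up numbers.
price_history = [
    (date(2023, 12, 1), 10.00),
    (date(2024, 1, 2), 12.00),   # Tuesday's update
    (date(2024, 1, 4), 9.00),    # Thursday's update
]


def price_as_of(history, when):
    """Return the price in effect on `when` (option b: id + timestamp)."""
    dates = [effective_from for effective_from, _ in history]
    index = bisect_right(dates, when) - 1
    if index < 0:
        raise LookupError("no price recorded on or before that date")
    return history[index][1]


order_date = date(2024, 1, 1)    # Monday: order accepted
report_date = date(2024, 1, 5)   # Friday: report printed

# Option (a): copy the price into the event when the order is accepted.
line_added = {"sku": "SKU-1", "count": 2,
              "price": price_as_of(price_history, order_date)}

# Option (b): store only the SKU and a timestamp, resolve the price later.
price_at_order = price_as_of(price_history, order_date)
price_at_report = price_as_of(price_history, report_date)
```

Option (b) only works if the catalog keeps its history; if the catalog overwrites prices in place, option (a) is the only way to recover Monday's price on Friday.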
Furthermore, in a distributed system, you'll need to think about the implications when part of the system is unavailable (ex: what happens if we are trying to invoice, but the product catalog is unavailable? can we cache the data ahead of time?)
Sometimes, this sort of thing can turn into a complete tangle until you discover that you are missing some domain concept (the invoice depends on a price from a quote, not the catalog price) or that you have your service boundaries drawn incorrectly (Udi Dahan talks about this often).
So the "easy" part of the answer is that you should expect time to be a concept you model in your solution. After that, it gets context sensitive very quickly, and discovering the "right" answer may involve investigating subtle questions.

CQRS projections, joining data from different aggregates via probe commands

In CQRS, when we need to create custom-tailored projections for our read models, we usually prefer "denormalized" projections (assume we are talking about projecting onto a DB). It is not uncommon for the information needed by the application/UI to come from different aggregates (possibly from different BCs).
Imagine we need a projected table to contain a customer's information together with her full address, and that Customer and Address are different aggregates in our system (possibly in different BCs). This means that addresses are generated and maintained independently of customers. In other words, when a new customer is created, there is no guarantee that an AddressCreatedEvent will subsequently be produced by the system; this event may already have been processed before the customer was created. All we have at the time of CreateCustomerCommand is the UUID of an existing address.
We have several solutions here.
Enrich CreateCustomerCommand and the subsequent CustomerCreatedEvent to contain full address of the customer (looking up this information on the fly from the UI or the controller). This way the projection handler will just update the table directly upon receiving CustomerCreatedEvent.
Use the addrUuid provided in CustomerCreatedEvent to perform an ad-hoc query in the projection handler to get the missing part of the address information before updating the table.
These are commonly discussed solutions to this problem. However, as noted by many others, there are problems with each approach. Enriching events can be difficult to justify, as well described by Enrico Massone in this question, for example. Querying other views/projections (a kind of JOIN) will work but introduces coupling (see the same link).
I would like to describe another method here which, I believe, nicely addresses these concerns. I apologize beforehand for not giving proper credit if this is a known technique. Sincerely, I have not seen it described elsewhere (at least not as explicitly).
"A picture speaks a thousand words", as they say:
The idea is that:
We keep CreateCustomerCommand and CustomerCreatedEvent simple with only addrUuid attribute (no enriching).
In the API controller we send two commands to the command handlers (aggregates). The first one, as usual, is CreateCustomerCommand, which creates the customer and projects the customer information together with addrUuid to the table, leaving the other columns (full address, etc.) empty for the time being. (Warning: see the update; we may have a concurrency issue here and need to issue the probe command from a Saga.)
Right after this, and after we have obtained custUuid of the newly created customer, we issue a special ProbeAddrressCommand to Address aggregate triggering an AddressProbedEvent which will encapsulate the full state of the address together with the special attribute probeInitiatorUuid which is, of course our custUuid from the previous command.
The projection handler will then act upon AddressProbedEvent by simply filling in the missing pieces of the information in the table looking up the required row by matching the provided probeInitiatorUuid (i.e. custUuid) and addrUuid.
So we have two phases: create Customer and probe for the related Address. They are depicted in the diagram with (1) and (2) correspondingly.
Obviously, we can send as many such "probe" commands (in parallel) as needed by our projection: ProbeBillingCommand, ProbePreferencesCommand, etc. effectively populating or "filling in" the denormalized projection with missing data from each handled "probe" event.
The advantage of this method is that we keep the commands/events in the first phase simple (only UUIDs of other aggregates) while avoiding synchronous coupling (joining) of the projections. The whole approach has a nice EDA feeling about it.
My question is then: is this a known technique? It seems I have not seen it before. And what can go wrong with this approach?
I would be more than happy to update this question with any references to other sources which describe this method.
UPDATE 1:
There is one significant flaw with this approach that I can see already: the command ProbeAddrressCommand cannot be issued before the projection handler has had a chance to process CustomerCreatedEvent. But this is impossible to know from the API gateway (or controller).
The solution would probably involve a Saga, say CustomerAddressJoinProjectionSaga, which will start upon receiving CustomerCreatedEvent and which will only then issue ProbeAddrressCommand. The Saga will end upon registering AddressProbedEvent or, if many other aggregates are involved in probing, when all such events have been received.
So here is the updated diagram.
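In code, the saga-driven flow could be sketched roughly as follows. This is a single-process toy, assuming a synchronous publish function that delivers each event to the projection first and to the saga second; the dictionaries, the function names and the sample address are hypothetical, only the command and event names come from the description above.

```python
# In-memory stand-ins; everything except the command/event names is hypothetical.
addresses = {"addr-1": {"street": "1 Main St", "city": "Springfield"}}
table = {}           # custUuid -> denormalized projection row
open_sagas = set()   # custUuids with a probe still outstanding


def project(event):
    """Projection handler: creates the row, then fills in probed data."""
    if event["type"] == "CustomerCreatedEvent":
        table[event["custUuid"]] = {"name": event["name"],
                                    "addrUuid": event["addrUuid"],
                                    "address": None}
    elif event["type"] == "AddressProbedEvent":
        row = table.get(event["probeInitiatorUuid"])
        if row is not None and row["addrUuid"] == event["addrUuid"]:
            row["address"] = event["address"]


def handle_probe_command(command):
    """Address aggregate: answers a probe with a copy of its full state."""
    return {"type": "AddressProbedEvent",
            "addrUuid": command["addrUuid"],
            "probeInitiatorUuid": command["probeInitiatorUuid"],
            "address": dict(addresses[command["addrUuid"]])}


def saga(event):
    """CustomerAddressJoinProjectionSaga: probe only after the customer exists."""
    if event["type"] == "CustomerCreatedEvent":
        open_sagas.add(event["custUuid"])
        probe = {"type": "ProbeAddrressCommand",
                 "addrUuid": event["addrUuid"],
                 "probeInitiatorUuid": event["custUuid"]}
        publish(handle_probe_command(probe))
    elif event["type"] == "AddressProbedEvent":
        open_sagas.discard(event["probeInitiatorUuid"])   # saga ends


def publish(event):
    # Assumption: synchronous delivery, projection first, saga second.
    project(event)
    saga(event)


publish({"type": "CustomerCreatedEvent", "custUuid": "cust-1",
         "name": "Ada", "addrUuid": "addr-1"})
```

In a real distributed setup the "projection first" ordering is exactly the part that is not free; the toy only shows which component issues which message.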
UPDATE 2:
As noted by Levi Ramsey (see answer below) my example is rather convoluted with respect to the choice of aggregates. Indeed, Customer and Address are often conceptualized as belonging together (same Aggregate Root). So it is a better illustration of the problem to think of something like Student and Course instead, assuming for the sake of simplicity that there is a straightforward relation between the two: a student is taking a course. This way it is more obvious that Student and Course are independent aggregates (students and courses can be created and maintained at different times and different places in the system).
But the question still remains: how can we obtain a projection containing the full information about a student (full name, etc.) and the courses she is registered for (title, credits, the instructor's full name, prerequisites, etc.) all in the same table, if the UI requires it?
A couple of thoughts:
I question why address needs to be a separate aggregate much less in a different bounded context, in view of the requirement that customers have an address. If in some other bounded context customer addresses are meaningful (e.g. you want to know "which addresses have more customers" etc.), then that context can subscribe to the events from the customer service.
As an alternative, if there's a particularly strong reason to model addresses separately from customers, why not have the read side prospectively listen for events from the address aggregate and store the latest address for a given address UUID in case there's a customer who ends up with that address. The reliability per unit effort of that approach is likely to be somewhat greater, I would expect.
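A minimal sketch of that second suggestion, assuming the read side keeps a small lookup of the latest address per address UUID. The handler names, the AddressChangedEvent and the sample data are assumptions for illustration.

```python
latest_address = {}   # addrUuid -> latest known address
customers = {}        # custUuid -> denormalized row


def on_address_event(event):
    """Handles AddressCreatedEvent and any later address change event."""
    latest_address[event["addrUuid"]] = event["address"]
    # Refresh rows of customers already linked to this address.
    for row in customers.values():
        if row["addrUuid"] == event["addrUuid"]:
            row["address"] = event["address"]


def on_customer_created(event):
    customers[event["custUuid"]] = {
        "name": event["name"],
        "addrUuid": event["addrUuid"],
        # Usually present already, because addresses are recorded as they appear.
        "address": latest_address.get(event["addrUuid"]),
    }


on_address_event({"type": "AddressCreatedEvent", "addrUuid": "addr-1",
                  "address": "1 Main St"})
on_customer_created({"type": "CustomerCreatedEvent", "custUuid": "cust-1",
                     "name": "Ada", "addrUuid": "addr-1"})
on_address_event({"type": "AddressChangedEvent", "addrUuid": "addr-1",
                  "address": "2 Side St"})   # hypothetical later change
```

No probe command is needed here: the projection already holds whatever the address aggregate has published so far.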

.Net Core Rest API Request/Response best practice

I need some advice on how to best structure the requests and the responses for my Rest API.
I'm mostly trying to limit myself to CRUD operations on one resource, and I work with one object. For example, if the resource is "book", I end up with the following actions in the controller:
[HttpPost("books")] Book Create(Book book)
[HttpGet("books")] Book Get(int id)
This is relatively straightforward.
Now for a more complex example: to create a resource, I need to receive a complex object different from my resource and return an object containing the resource and extra data.
For example, for the Order resource I have the following action in the controller:
[HttpPost("/order")] CreateOrderResponse CreateOrder(CreateOrderRequest createOrderRequest)
Here my action will use the "CreateOrderRequest" object to build an Order.
Then I would like to return a "CreateOrderResponse" object which contains the Order but also extra information that the client needs.
I'm not sure this is the best way to go. Any advice?
Thanks in advance for your help
I prefer the following:
[HttpPost("/order")] CreateOrderResponse CreateOrder(CreateOrderRequest createOrderRequest)
And here is why:
By this method, you are able to protect your public API from implementation details. If you expose your model to your API then you cannot make the same guarantee.
You can also make your validations specific to the request format. In some cases, you might require one subset of your model when creating a record and another subset when editing data. This approach will allow you to handle that scenario as well.
Security. Were you going to add that Book right to a DbContext and save it? Or attach it and update directly? Those would be potential issues from security and data quality perspectives.
But there are downsides:
This approach is time-consuming. It may not be worth the time invested if you are writing something as a learning exercise or a quick implementation. And it adds complexity. But then, you might find complexity anyway when you realize your Book object is insufficient in some cases.
You will feel like there is duplicate code in different places. The code may appear to be the same, but the use cases are actually different and may diverge over time. Having a Book parameter will be a liability at that point.
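The separation is not specific to .NET, so here is a language-neutral sketch of the idea, written in Python rather than C#. The fields (sku, quantity, internal_cost, estimated_delivery_days) and the fixed values are invented for illustration; only the CreateOrderRequest/CreateOrderResponse names come from the question.

```python
from dataclasses import dataclass


@dataclass
class Order:
    """Internal model; never exposed directly over the API."""
    id: int
    sku: str
    quantity: int
    internal_cost: float   # implementation detail the client must not see or set


@dataclass
class CreateOrderRequest:
    sku: str
    quantity: int

    def validate(self):
        # Validation specific to the "create" request shape.
        if self.quantity <= 0:
            raise ValueError("quantity must be positive")


@dataclass
class CreateOrderResponse:
    order_id: int
    sku: str
    quantity: int
    estimated_delivery_days: int   # extra information the client needs


def create_order(request: CreateOrderRequest) -> CreateOrderResponse:
    request.validate()
    order = Order(id=1, sku=request.sku, quantity=request.quantity,
                  internal_cost=4.2)   # server-side value, not client-supplied
    return CreateOrderResponse(order.id, order.sku, order.quantity,
                               estimated_delivery_days=3)


response = create_order(CreateOrderRequest(sku="SKU-1", quantity=2))
```

Because the client can only send what CreateOrderRequest declares, it has no way to set internal_cost, which is the security point made above.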

MongoDB model design for meteorjs app

I'm more used to a relational database and am having a hard time thinking about how to design my database in mongoDB, and am even more unclear when taking into account some of the special considerations of database design for meteorjs, where I understand you often prefer separate collections over embedded documents/data in order to make better use of some of the benefits you get from collections.
Let's say I want to track students progress in high school. They need to complete certain required classes each school year in order to progress to the next year (freshman, sophomore, junior, senior), and they can also complete some electives. I need to track when the students complete each requirement or elective. And the requirements may change slightly from year to year, but I need to remember for example that Johnny completed all of the freshman requirements as they existed two years ago.
So I have:
Students
Requirements
Electives
Grades (frosh, etc.)
Years
Mostly, I'm trying to think about how to set up the requirements. In a relational DB, I'd have a table of requirements, with className, grade, and year, and a table of student_requirements, that tracks the students as they complete each requirement. But I'm thinking in MongoDB/meteorjs, I'd have a model for each grade/level that gets stored with a studentID and initially instantiates with false values for each requirement, like:
{
  student: [studentID],
  class: 'freshman',
  year: 2014,
  requirements: {
    class1: false,
    class2: false
  }
}
and as the student completes a requirement, it updates like:
{
  student: [studentID],
  class: 'freshman',
  year: 2014,
  requirements: {
    class1: false,
    class2: [completionDateTime]
  }
}
So in this way, each student will collect four Requirements documents, which are somewhat dictated by their initial instantiation values. And instead of the actual requirements for each grade/year living in the database, they would essentially live in the code itself.
Some of the actions I would like to be able to support are marking off requirements across a set of students at one time, and showing a grid of users/requirements to see who needs what.
Does this sound reasonable? Or is there a better way to approach this? I'm pretty early in this application and am hoping to avoid painting myself into a corner. Any help or suggestion is appreciated. Thanks! :-)
Currently I'm thinking about my application data design too. I've read the examples in the MongoDB manual
look up MongoDB manual data model design - docs.mongodb.org/manual/core/data-model-design/
and here -> MongoDB manual one to one relationship - docs.mongodb.org/manual/tutorial/model-embedded-one-to-one-relationships-between-documents/
(sorry I can't post more than one link at the moment in an answer)
They say:
In general, use embedded data models when:
you have “contains” relationships between entities.
you have one-to-many relationships between entities. In these relationships the “many” or child documents always appear with or are viewed in the context of the “one” or parent documents.
The normalized approach uses a reference in a document, to another document. Just like in the Meteor.js book. They create a web app which shows posts, and each post has a set of comments. They use two collections, the posts and the comments. When adding a comment it's submitted together with the post_id.
So in your example you have a students collection. And each student has to fulfill requirements? And each student has their own requirements, like a post has its own comments?
Then I would handle it like they did in the book. With two collections. I think that should be the normalized approach, not the embedded.
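Here is a rough sketch of what the two-collection layout could look like, using plain Python lists as stand-ins for the collections (no MongoDB driver involved). The collection names, field names and class names are assumptions; the two functions cover the "mark off a requirement for several students" and "grid of students and requirements" actions from the question.

```python
from datetime import datetime

# Stand-ins for two collections, as plain lists of documents.
requirements = [
    {"_id": "r1", "className": "Algebra I", "grade": "freshman", "year": 2014},
    {"_id": "r2", "className": "Biology", "grade": "freshman", "year": 2014},
]
completions = []   # one document per (student, requirement) completed


def mark_complete(student_ids, requirement_id, when):
    """Mark one requirement as completed for a set of students at once."""
    for student_id in student_ids:
        completions.append({"studentId": student_id,
                            "requirementId": requirement_id,
                            "completedAt": when})


def grid(student_ids, grade, year):
    """Build a student x requirement grid of completion timestamps (or None)."""
    wanted = [r for r in requirements
              if r["grade"] == grade and r["year"] == year]
    done = {(c["studentId"], c["requirementId"]): c["completedAt"]
            for c in completions}
    return {s: {r["className"]: done.get((s, r["_id"])) for r in wanted}
            for s in student_ids}


when = datetime(2014, 10, 1, 9, 0)
mark_complete(["johnny", "maria"], "r2", when)
result = grid(["johnny", "maria"], "freshman", 2014)
```

Because each requirement document carries its own year, the 2014 freshman requirements stay on record even if next year's list changes.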
I'm a little confused myself, so maybe you can tell me, if my answer makes sense.
Maybe you can help me too? I'm trying to make an app that manages a flea market.
Users of the app create events.
The creator of the event invites users to be cashiers for that event.
Users create lists of stuff they want to sell. Max. number of lists/sellers per event. Max. number of position on a list (25/50).
Cashiers type in the positions of those lists at the event, to track what is sold.
Event creators make billings for the sold stuff of each list, to hand out the money afterwards.
I'm confused how to set up the data design. I need Events and Lists. Do I use the normalized approach, or the embedded one?
Edit:
After reading percona.com/blog/2013/08/01/schema-design-in-mongodb-vs-schema-design-in-mysql/ I found the following advice:
If you read people information 99% of the time, having 2 separate collections can be a good solution: it avoids keeping in memory data is almost never used (passport information) and when you need to have all information for a given person, it may be acceptable to do the join in the application.
Same thing if you want to display the name of people on one screen and the passport information on another screen.
But if you want to display all information for a given person, storing everything in the same collection (with embedding or with a flat structure) is likely to be the best solution

Event Sourcing Commands vs Events

I understand the difference between commands and events but in a lot of cases you end up with redundancy and mapping between 2 classes that are essentially the same (ThingNameUpdateCommand, ThingNameUpdatedEvent). For these simple cases can you / do you use the event also as a command? Do people serialise to a store all commands as well as all events? Just seems to be a little redundant to me.
A lot of this redundancy is there for a reason, and in general you want to avoid using the same message for two different purposes, for a number of reasons:
Sourced events must be versioned when they change since they are stored and re-used (deserialized) when you hydrate an aggregate root. It will make things a bit awkward if the class is also being used as a message.
Coupling is increased: the same class is now being used by command handlers, the domain model and event handlers. Decoupling the command side from the event can simplify life for you down the road.
Finally clarity. Commands are issued in a language that asks something to be done (imperative generally). Events are representations of what has happened (past-tense generally). This language gets muddled if you use the same class for both.
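A small sketch of why the two classes tend to diverge even in a "simple" case. The class names follow the question; the fields (old_name, version) and the Thing aggregate are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ThingNameUpdateCommand:
    """Command: imperative, a request that may be rejected."""
    thing_id: str
    new_name: str


@dataclass(frozen=True)
class ThingNameUpdatedEvent:
    """Event: past tense, a stored fact, so it carries a schema version."""
    thing_id: str
    old_name: str
    new_name: str
    version: int = 1


class Thing:
    def __init__(self, thing_id, name):
        self.thing_id = thing_id
        self.name = name

    def handle(self, command: ThingNameUpdateCommand) -> list:
        if not command.new_name.strip():
            raise ValueError("name must not be empty")   # command rejected
        if command.new_name == self.name:
            return []                                     # nothing happened
        event = ThingNameUpdatedEvent(self.thing_id, self.name,
                                      command.new_name)
        self.apply(event)
        return [event]

    def apply(self, event: ThingNameUpdatedEvent) -> None:
        # Also used when rehydrating the aggregate from stored events.
        self.name = event.new_name


thing = Thing("t1", "old name")
events = thing.handle(ThingNameUpdateCommand("t1", "new name"))
no_events = thing.handle(ThingNameUpdateCommand("t1", "new name"))
```

One command can produce zero events (or be rejected), and the event can carry data the command never had, which is hard to express if both are the same class.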
In the end these are just data classes; it isn't like this is "hard" code. There are ways to avoid some of the typing for simple scenarios, like code-gen. For example, I know Greg has used XML and XSD transforms to create all the classes needed for a given domain in the past.
I'd say for a lot of simple cases you may want to question if this is really domain (i.e. modeling behavior) or just data. If it is just data consider not using event sourcing here. Below is a link to a talk by Udi Dahan about breaking up your domain model so that not all of it requires event-sourcing. I'm kind of in line with this way of thinking now myself.
http://skillsmatter.com/podcast/design-architecture/talk-from-udi-dahan
After working through some examples, and especially the Greg Young presentation (http://www.youtube.com/watch?v=JHGkaShoyNs), I've come to the conclusion that commands are redundant. They are simply events from your user: they did press that button. You should store these in exactly the same way as other events, because it is data and you don't know whether you will want to use it in a future view. Your user did add and then later remove that item from the basket, or at least attempted to. You may later want to use this information to remind the user of this at a later date.