Experts,
I am in the evaluating moving a nice designed DDD app to an event sourced architecture as a pet project.
As can be inferred, my Aggregate operations are relatively coarse grained. As part of my design process, I now find myself emitting a large number of events from a small subset of operations to satisfy what I know to be requirements of the read model. Is this acceptable practice ?
In addition to this, I have distilled a lot of the domain complexity via use of ValueObjects & entities. Can VO's/ E accept commands and emit events themselves, or should I expose state and add from the command handler lower down the stack ?
In terms of VO's note that I use mutable operations sparingly and it is a trade off between over complicating other areas of my domain.
I now find myself emitting a large number of events from a small subset of operations to satisfy what I know to be requirements of the read model. Is this acceptable practice ?
In most cases, you should have 1 event by command. Remember events describe the user intent so you have to stay close to the user action.
Can VO's/ E accept commands and emit events themselves
No only aggregates emit events and yes you can come up with very messy aggregates if you have a lot of events, but there is solutions for that like delegating the work to the commands and events themselves. I blogged about this technique here: https://dev.to/maximegel/cqrs-scalable-aggregates-731
In terms of VO's note that I use mutable operations sparingly
As long as you're aware of the consequences. That's fine. Trade-off are part of the job, but be sure you team is aware of that since it's written everywhere that value objects are immutable you expose yourselves to confusion and pointer issues.
Related
I am learning what is CQRS pattern and came to know there is also CQS pattern.
When I tried to search I found lots of diagrams and info on CQRS but didn't found much about CQS.
Key point in CQRS pattern
In CQRS there is one model to write (command model) and one model to read (query model), which are completely separate.
How is CQS different from CQRS?
CQS (Command Query Separation) and CQRS (Command Query Responsibility Segregation) are very much related. You can think of CQS as being at the class or component level, while CQRS is more at the bounded context level.
I tend to think of CQS as being at the micro level, and CQRS at the macro level.
CQS prescribes separate methods for querying from or writing to a model: the query doesn't mutate state, while the command mutates state but does not have a return value. It was devised by Bertrand Meyer as part of his pioneering work on the Eiffel programming language.
CQRS prescribes a similar approach, except it's more of a path through your system. A query request takes a separate path from a command. The query returns data without altering the underlying system; the command alters the system but does not return data.
Greg Young put together a pretty thorough write-up of what CQRS is some years back, and goes into how CQRS is an evolution of CQS. This document introduced me to CQRS some years ago, and I find it a still very useful reference document.
This is an old question but I am going to take a shot at answering it. I do not often answer questions on StackOverflow, so please forgive me if I do something outside the bounds of community in terms of linking to things, writing a long answer, etc.
There are many differences between CQRS and CQS however CQRS uses CQS inside of its definition! Let's start with defining the two and then we can discuss differences.
CQS defines two types of messages depending on their return value: no return value (void) specifies this is a Command; a return value (non-void) specifies this method is a Query.
Commands change information
Queries return information
Commands change state. Queries do not.
Now for CQRS, which uses the same definition as CQS for Commands and Queries. What CQRS says is that we do not want one object with Command and Query methods. Instead we want two objects: one with all the Commands and one with all the Queries.
The idea overall is very simple; it's everything after doing this where things become interesting. There are numerous talks online, of me discussing some of the attributes associated (sorry way too much to type here!).
CQS is about Command and Queries. It doesn't care about the model. You have somehow separated services for reading data, and others for writing data.
CQRS is about separate models for writes and reads. Of course, usage of write model often requires reading something to fulfill business logic, but you can only do reads on read model. Separate Databases are state of the art. But imagine single DB with separate models for read and writes modeled in ORM. It's very often good enough.
I have found that people often say they practice CQRS when they have CQS.
Read the inventor Greg Young's answer
I think, like "Dependency Injection" the concepts are so simple and taken for granted that the fact that they have fancy names seems to drive people to think they're something more than they are, especially as CQRS is often quoted alongside Event Sourcing.
CQS is the separation of methods that read to those that change state; don't do both in a single method. This is micro level.
CQRS extends this concept into a higher level for machine-machine APIs, separation of the message models and processing paths.
So CQRS is a principle you apply to the code in an API or facade.
I have found CQRS to essentially be a very strong S in SOLID, pushing the separation deeply into the psyche of developers to produce more maintainable code.
I think web applications are a bad fit for CQRS since the mutation of state via representation transfer means the command and query are two sides of the same request-response. The representation is a command and the response is the query.
For example, you send an order and receive a view of all your orders.
Imagine if the code of a website was factored into a command side and query side. The route action handling code would need to fall into one of those sides, but it does both.
Imagining a stronger segregation, if the code was moved into two different compilable codebases, then the website would accept a POST of a form, but the user would have to browse to another website URL to see the impact of the action. This is obviously crazy. One workaround would be to always redirect, though this wouldn't really be RESTful since the ideal REST application is where the next representation contains hypertext to drive the next state transition and so on.
Given that a website is a REST API between human and machine (or machine and machine), this also includes REST APIs, though other types of HTTP message passing API may be a perfect fit for CQRS.
A service or facade within the bounds of the website may obviously work well with CQRS, though the action handlers would sit outside this boundary.
See CQS on Wikipedia
The biggest difference is CQRS uses separate data stores for commands and queries. A query store can use a different technology like a document database or just be a denormalized schema in the same database that makes querying the data easier.
The data between databases is usually copied asynchronously using something like a service bus. Therefore, data in the query store is eventually consistent (is going to be there at some point of time). Applications need to account for that. While it is possible to use the same transaction (same database or a 2-phase commit) to write in both stores, it is usually not recommended for scalability reasons.
CQS architecture reads and writes from the same data store/tables.
I'm wandering on DDD and NoSql field actually. I have a doubt now: i need to produce events from the aggregate and i would like to use a NoSql storage. But how can i be sure that events are saved on the storage AND the changes on the aggregate root not having transactions?
Does it makes sense? Is there a way to do this without being forced to use event sourcing or a transactional db?
Actually i was lookin at implementing a 2 phase commit algorithm but it seems pretty heavy from a performance point of view...
Am i approaching the problem the wrong way?
Stuffed with questions...
Thanks for every suggestion
Enrico
PS
I'm a newbie on stackoverflow so any suggestion/critic/... is more than welcome
Enrico
Edit 1
Well i would need events to notify aggregates that something happened and i they should react to the change. The problem arise when such events are important for the business logic. As far as i understood, after a night of thinking, i can't use a nosql storage to do such things. Let me explain (thinking with loud voice :P):
With ES (1st scenery): I save the "diff" of the data. Then i produce an event associated with it. 2 operations.
With ES (2nd scenery): I save the "diff" of the data. A process, watch the ES and produce the event. But i'm tied to having only one watcher process to ensure the correct ordering of events.
With ES (3d scenery): Idempotent events. The events can be inferred by the state and every reapplication of the event can cause a change on the consumer only once, can have multiple "dequeue" processes, duplicates can't possibly happen. 1 operation, but it introduce heavy limitations on the consumers.
In general: I save the aggregate's data. Then i produce an event associated with it. 2 operations.
Now the question becomes wider imho, is it possible to work with domain events and nosql when such domain events are fundamental part of the business process?
I think that could be a better option to go relational... even if i would need to add quite a lot of machines to get the same performances.
Edit 2
For the sake of completness, searching for "domain events nosql idempotent" on google: http://svendvanderveken.wordpress.com/2011/08/26/transactional-event-based-nosql-storage/
If you need Event Sourcing, you should store events only.
This should be the sequence:
the aggregate root recieves a command
it fires proper events
events are stored
Each aggregate's re-hydratation should be done only by executing events over them. You can create aggregates' snapshots if you measure performance problems on their initialization, but this doesn't require two-phase commits, since you can build snapshots asynchronously via batch.
Note however that you need CQRS and/or Event Sourcing only if your application is heavily concurrent and you need to cope with partition tolerance and compensating actions.
edit
Event Sourcing is alternative to the persistence of object state. You either store the events or the state of the object model. You can save snapshot, but they're just performance tools: your application must be able to work without them. You can consider such snapshots as a caching technique. As an alternative you can persist object state (the classical model), but in that case you don't need to store events.
In my own DDD application, I use observable entities to decouple (via direct events' subscription from the repository) aggregates and their persistence. For example your repository can subscribe each domain events, and execute the actions required by the application (persist to the store, dispatch to a queue and so on...). But as a persistence technique, Event Sourcing is alternative to classical persistence of the observable object state. In most scenarios you don't need both.
edit 2
A final note: if you choose ES, one of the events subscriber can build a relational read-model too.
I'm trying to understand sagas, and meanwhile I have a specific way of thinking of them - but I am not sure whether I got the idea right. Hence I'd like to elaborate and have others tell me whether it's right or wrong.
In my understanding, sagas are a solution to the question of how to model long-running processes. Long-running means: Involving multiple commands, multiple events and possibly multiple aggregates. The process is not modeled inside one of the participating aggregates to avoid dependencies between them.
Basically, a saga is nothing more but a command / event handler that reacts on internal and external commands / events. It does not contain its own logic, it's just a (finite) state machine, and therefor provides tasks such as When event X happens, send command Y.
Sagas are persisted to the event store as well as aggregates, are correlated to a specific aggregate instance, and hence are reloaded when this specific aggregate (or set of aggregates) is used.
Is this right?
There are different means of implementing Sagas. Reaching from stateless event handlers that publish commands all the way to carrying all the state and basically being the domain's aggregates themselves. Udi Dahan once wrote an article about Sagas being the only Aggregates in a (in his specific case) correctly modeled system. I'll look it up and update this answer.
There's also the concept of document-based sagas.
Your definition of Sagas sounds right for me and I also would define them so.
The only change in your description I would made is that a saga is only a eventhandler (not a command) for event(s) and based on the receiving event and its internal state constructs a command and sents it to the CommandBus for execution.
Normally has a Saga only a single event to be started from (StartByEvent) and multiple events to transition (TransitionByEvent) to the next state and mutiple event to be ended by(EndByEvent).
On MSDN they defined Sagas as ProcessManager.
The term saga is commonly used in discussions of CQRS to refer to a
piece of code that coordinates and routes messages between bounded
contexts and aggregates. However, for the purposes of this guidance we
prefer to use the term process manager to refer to this type of code
artifact. There are two reasons for this: There is a well-known,
pre-existing definition of the term saga that has a different meaning
from the one generally understood in relation to CQRS. The term
process manager is a better description of the role performed by this
type of code artifact. Although the term saga is often used in the
context of the CQRS pattern, it has a pre-existing definition. We have
chosen to use the term process manager in this guidance to avoid
confusion with this pre-existing definition. The term saga, in
relation to distributed systems, was originally defined in the paper
"Sagas" by Hector Garcia-Molina and Kenneth Salem. This paper proposes
a mechanism that it calls a saga as an alternative to using a
distributed transaction for managing a long-running business process.
The paper recognizes that business processes are often comprised of
multiple steps, each of which involves a transaction, and that overall
consistency can be achieved by grouping these individual transactions
into a distributed transaction. However, in long-running business
processes, using distributed transactions can impact on the
performance and concurrency of the system because of the locks that
must be held for the duration of the distributed transaction.
reference: http://msdn.microsoft.com/en-us/library/jj591569.aspx
Didn't know how to shorten that title.
I'm basically trying to wrap my head around the concept of CQRS (http://en.wikipedia.org/wiki/Command-query_separation) and related concepts.
Although CQRS doesn't necessarily incorporate Messaging and Event Sourcing it seems to be a good combination (as can be seen with a lot of examples / blogposts combining these concepts )
Given a use-case for a state change for something (say to update a Question on SO), would you consider the following flow to be correct (as in best practice) ?
The system issues an aggregate UpdateQuestionCommand which might be separated into a couple of smaller commands: UpdateQuestion which is targeted at the Question Aggregate Root, and UpdateUserAction(to count points, etc) targeted at the User Aggregate Root. These are send asynchronously using point-to-point messaging.
The aggregate roots do their thing and if all goes well fire events QuestionUpdated and UserActionUpdated respectively, which contain state that is outsourced to an Event Store.. to be persisted yadayada, just to be complete, not really the point here.
These events are also put on a pub/sub queue for broadcasting. Any subscriber (among which likely one or multiple Projectors which create the Read Views) are free to subscribe to these events.
The general question: Is it indeed best practice, that Commands are communicated Point-to-Point (i.e: The receiver is known) whereas events are broadcasted (I.e: the receiver(s) are unknown) ?
Assuming the above, what would be the advantage/ disadvantage of allowing Commands to be broadcasted through pub/sub instead of point-to-point?
For example: When broadcasting Commands while using Saga's (http://blog.jonathanoliver.com/2010/09/cqrs-sagas-with-event-sourcing-part-i-of-ii/) could be a problem, since the mediation role a Saga needs to play in case of failure of one of the aggregate roots is hindered, because the saga doesn't know which aggregate roots participate to begin with.
On the other hand, I see advantages (flexibility) when broadcasting commands would be allowed.
Any help in clearing my head is highly appreciated.
Yes, for Command or Query there is only one and exactly one receiver (thus you can still load balance), but for Events there could be zero or more receivers (subscribers)
I have an entity in my domain that represent a city electrical network. Actually my model is an entity with a List that contains breakers, transformers, lines.
The network change every time a breaker is opened/closed, user can change connections etc...
In all examples of CQRS the EventStore is queried with Version and aggregateId.
Do you think I have to implement events only for the "network" aggregate or also for every "Connectable" item?
In this case when I have to replay all events to get the "actual" status (based on a date) I can have near 10000-20000 events to process.
An Event modify one property or I need an Event that modify an object (containing all properties of the object)?
Theres always an exception to the rule but I think you need to have an event for every command handled in your domain. You can get around the problem of processing so many events by making use of Snapshots.
http://thinkbeforecoding.com/post/2010/02/25/Event-Sourcing-and-CQRS-Snapshots
I assume you mean currently your "connectable items" are part of the "network" aggregate and you are asking if they should be their own aggregate? That really depends on the nature of your system and problem and is more of a DDD issue than simple a CQRS one. However if the nature of your changes is typically to operate on the items independently of one another then then should probably be aggregate roots themselves. Regardless in order to answer that question we would need to know much more about the system you are modeling.
As for the challenge of replaying thousands of events, you certainly do not have to replay all your events for each command. Sure snapshotting is an option, but even better is caching the aggregate root objects in memory after they are first loaded to ensure that you do not have to source from events with each command (unless the system crashes, in which case you can rely on snapshots for quicker recovery though you may not need them with caching since you only pay the penalty of loading once).
Now if you are distributing this system across multiple hosts or threads there are some other issues to consider but I think that discussion is best left for another question or the forums.
Finally you asked (I think) can an event modify more than one property of the state of an object? Yes if that is what makes sense based on what that event represents. The idea of an event is simply that it represents a state change in the aggregate, however these events should also represent concepts that make sense to the business.
I hope that helps.