Version of an aggregate in event sourcing

According to event sourcing, when a command is handled, all the domain events it raises have to be stored, and the system must increase the version of the aggregate for each event. My event store looks something like this:
(AggregateId, AggregateVersion, Sequence, Data, EventName, CreatedDate)
(AggregateId, AggregateVersion) is the key.
In some cases it does not make sense to increase the version of an aggregate. For example,
a command registers a user and raises RegisteredUser, WelcomeEmailEvent, and GiftCardEvent.
How can I handle this problem?

How can I handle this problem?
Avoid confusing your representation-of-information-changes events with your publishing-for-use-elsewhere events.
"Event sourcing", as commonly understood in the domain-driven design and CQRS space, is a kind of data model. We're talking specifically about the messages an aggregate sends to its future self that describe its own changes over time.
It's "just" another way of storing the state of the aggregate, same as we would do if we were storing information in a relational database, or a document store, etc.
Messages that we are going to send to other components and then forget about don't need to have events in the event stream.
In some cases, there can be confusion when we haven't recognized that there are multiple different processes at work.
A requirement like "when a new user is registered, we should send them a welcome email" is not necessarily part of the registration process; it might instead be an independent process that is triggered by the appearance of a RegisteredUser event. The information that you need to save for the SendEmail process would be "somewhere else" - outside of the Users event history.

An event changes the state of an aggregate, and therefore changes its version. If the state is not changed, then there should be no event for this aggregate.
In your example, I would ask myself: if WelcomeEmailEvent does not change the state of the User aggregate, then whose state does it change? Perhaps some other aggregate, such as an EmailNotification service that cares about successful or failed email attempts. In that case I would make it an event of the aggregate whose state it changes, and it will affect the version of that aggregate.
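A minimal sketch of the idea in both answers, assuming an in-memory store keyed by (AggregateId, AggregateVersion) as in the question. The class and function names (InMemoryEventStore, register_user, on_registered_user) and the outbox list are hypothetical, invented for illustration: only the event that changes the User goes into the User stream and bumps its version, while the welcome email is tracked by a separate process.

```python
from dataclasses import dataclass, field


class ConcurrencyError(Exception):
    """Raised when the stream is no longer at the version the writer expected."""


@dataclass
class InMemoryEventStore:
    # Maps (aggregate_id, aggregate_version) -> (event_name, data).
    rows: dict = field(default_factory=dict)

    def current_version(self, aggregate_id):
        versions = [v for (a, v) in self.rows if a == aggregate_id]
        return max(versions, default=0)

    def append(self, aggregate_id, expected_version, events):
        # Optimistic concurrency: refuse to write on top of unseen events.
        if self.current_version(aggregate_id) != expected_version:
            raise ConcurrencyError(aggregate_id)
        version = expected_version
        for name, data in events:
            version += 1  # one version increment per stored event
            self.rows[(aggregate_id, version)] = (name, data)
        return version


def register_user(store, user_id, email):
    # Only the event that changes the User aggregate enters its stream.
    return store.append(user_id, 0, [("RegisteredUser", {"email": email})])


def on_registered_user(user_id, data, outbox):
    # Hypothetical independent process: it reacts to RegisteredUser and keeps
    # its own record outside the User event history.
    outbox.append(("SendWelcomeEmail", user_id, data["email"]))


store = InMemoryEventStore()
outbox = []
new_version = register_user(store, "user-1", "a@example.com")
on_registered_user("user-1", store.rows[("user-1", 1)][1], outbox)
```

Here the User stream ends at version 1 after registration, and whether the email was sent or failed is state that belongs to whatever owns the outbox.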

Related

Synchronising events between microservices using Kafka and MongoDb connector

I'm experimenting with microservices architecture. I have UserService and ShoppingService.
In UserService I'm using MongoDb. When I create a new user in UserService, I want to sync basic user info to ShoppingService. In UserService I'm using something like event sourcing: when I create a new User, I first create the UserCreatedEvent and then apply the event to the domain User object. So in the end I get a domain User object that has the current state and a list of events containing one UserCreatedEvent.
I wonder if I should persist the Events collection as a nested property of the User document or in a separate UserEvents collection. I was planning to use Kafka Connect to synchronize the events from UserService to ShoppingService.
If I decide to persist the events inside the User document, then I don't need the transaction that I would otherwise use to save the event to a separate UserEvents collection, but I can't set up the Kafka connector to track changes in the nested property only.
If I decide to persist events in a separate UserEvents collection, I need to wrap the changes to User and UserEvents in a transaction. But saving events to a separate collection makes setting up the Kafka connector very easy, because I track only inserts and don't need to track updates of a nested UserEvents array in the User document.
I think I will go with the second option for the sake of simplicity, but maybe I've missed something. Is it a good idea to implement it like this?
I would generally advise the second approach. Note that you can also eliminate the need for a transaction by observing that User is just a snapshot based on the UserEvents up to some point in the stream and thus doesn't have to be immediately updated.
With this, your read operation for User can be: select a user from User (the latest snapshot), which includes a version/sequence number saying that it's as-of some event; then select the events with later sequence numbers and apply those events to the user. If there's some querier which wants a faster response and can tolerate getting something stale, a different endpoint (or an option in the query) can bypass the event replay.
You can then have some asynchronous process which subscribes to the stream of user events and updates User based on those events.
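The read path described above could look roughly like this. It is a sketch under assumptions, not the poster's implementation: the UserStore class, the event name UserEmailChanged, and the allow_stale option are all hypothetical, and plain dictionaries stand in for the User and UserEvents collections.

```python
from dataclasses import dataclass, field


@dataclass
class UserStore:
    snapshots: dict = field(default_factory=dict)  # user_id -> (version, state)
    events: dict = field(default_factory=dict)     # user_id -> [(sequence, name, data)]

    def append(self, user_id, name, data):
        stream = self.events.setdefault(user_id, [])
        stream.append((len(stream) + 1, name, data))


def apply(state, name, data):
    # Hypothetical event types; each returns a new state dict.
    if name == "UserCreated":
        return dict(data)
    if name == "UserEmailChanged":
        return {**state, "email": data["email"]}
    return state


def read_user(store, user_id, allow_stale=False):
    # Start from the latest snapshot, which records the sequence it is as-of.
    version, state = store.snapshots.get(user_id, (0, {}))
    if allow_stale:
        return version, state  # bypass the replay for callers that accept staleness
    for sequence, name, data in store.events.get(user_id, []):
        if sequence > version:
            state = apply(state, name, data)
            version = sequence
    return version, state


def refresh_snapshot(store, user_id):
    # What the asynchronous subscriber would do after seeing new events.
    store.snapshots[user_id] = read_user(store, user_id)


store = UserStore()
store.append("u1", "UserCreated", {"name": "Ann", "email": "ann@old.example"})
refresh_snapshot(store, "u1")
store.append("u1", "UserEmailChanged", {"email": "ann@new.example"})
fresh = read_user(store, "u1")
stale = read_user(store, "u1", allow_stale=True)
```

Because the snapshot is only a cache of the event stream, writing the event and updating the snapshot no longer need to happen in one transaction.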

CQRS/Event Sourcing: Should events (types) be shared?

Should events be shared? I am experimenting with CQRS and Event Sourcing and am wondering whether events (the types) should be shared/defined between services.
Case:
A request comes in and a new createUser command is pushed into the 'commands' event log. Service A (business logic) fetches this command and generates the data of the new user. Once the new user is created it pushes the new data into the 'events' event log with the event name newUser. Service B (projector) notices the new event and starts processing it.
Here lies my question. Should we define, for every event type (in this case newUser), the logic that needs to run in order to update the materialised view? In the example below there are two types of events, and for each event the actions that need to happen are defined. In this case the event types are defined in both the logic service and the projector service.
# <- onEvent
switch event.type
    case "newUser"
        putUsers(firstName=data.firstName, lastName=data.lastName) # put this data in the database
    case "updateUserFirstName"
        updateUsers(where id = 1, firstName=data.firstName)
Or is it a good idea to define in the event the type of operation that needs to be performed? In that case event types are not shared, and the projector service is able to handle unknown/new events without any modification.
# <- onEvent
switch event.operation
    case "create"
        putUser(...)
    case "update"
        updateUser(...) # update only the data defined in the event
Is option 2 a viable option? Or will I run into issues when choosing this strategy?
Events reflect something that has happened. They are usually named in the past tense - userCreated.
Generic events (or one event type per entity) have a number of drawbacks:
Finding proper past tense names for event types becomes more difficult
You lose some of the expressivity since the whole domain meaning is no longer immediately apparent just looking at the event type
Impossible to subscribe to events in a fine grained, streamlined way because you need to "open the envelope" to find out which specific event you're dealing with
Discrepancy between events you talk about with domain experts (for instance during Event Storming sessions) and the way they are encoded in your types, messages, etc.
I wouldn't recommend it except maybe in a very free-form/dynamic system where the entities are not known in advance.
I recommend using event type to determine what type of business logic/rules will consume the event.
As @guillaume31 mentioned, use the past tense to name your events. But if you want to plan for the future, you should also version your event types. For example, you can name your event types "userCreated_v1" or "userFirstNameChanged_v1". This gives you the ability to change the structure of event messages in the future and easily associate new business logic/rules with the new events.
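One way to dispatch on specific, versioned event types is a small handler registry. This is only a sketch: the `handles` decorator, the `users` dictionary standing in for the materialised view, and the shape of the hypothetical "userCreated_v2" payload are assumptions made for illustration.

```python
users = {}     # stand-in for the materialised view
HANDLERS = {}  # event type -> handler function


def handles(event_type):
    """Register a function as the handler for one specific event type."""
    def register(func):
        HANDLERS[event_type] = func
        return func
    return register


@handles("userCreated_v1")
def user_created_v1(data):
    users[data["id"]] = {"firstName": data["firstName"], "lastName": data["lastName"]}


@handles("userCreated_v2")
def user_created_v2(data):
    # Hypothetical v2 shape: a nested "name" object replaces the flat fields.
    users[data["id"]] = {"firstName": data["name"]["first"], "lastName": data["name"]["last"]}


@handles("userFirstNameChanged_v1")
def user_first_name_changed_v1(data):
    users[data["id"]]["firstName"] = data["firstName"]


def on_event(event):
    handler = HANDLERS.get(event["type"])
    if handler is None:
        return False  # unknown type: skip it rather than guess at its meaning
    handler(event["data"])
    return True
```

A projector built this way can subscribe to exactly the types it understands, and a new message shape gets its own handler without touching the old one.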

Event Sourcing and dealing with data dependencies

Given a REST API with the following operations resulting in events posted to Kafka:
AddCategory
UpdateCategory
RemoveCategory
AddItem (refers to a category by some identifier)
UpdateItem
RemoveItem
And an environment where multiple users may use the REST API at the same time, and the consumers must all get the same events. The consumers may be offline for an extended period of time (more than a day). New consumers may be added, and others removed.
The problems:
Event ordering (only workaround single topic/partition?)
AddItem before AddCategory, invalid category reference.
UpdateItem before AddCategory, used to be a valid reference, now invalid.
RemoveCategory before AddItem, category reference invalid.
....infinite list of other concurrency issues.
Event Store snapshots for fast resync of restarted consumers
Should there be a compacted log topic for both categories and items, each entity keyed by its identifier?
Can the whole compacted log topic be somehow identified as an offset?
Should there be only one entry in the compacted log topic, with its data containing a serialized blob of all categories and items as of a given offset (would require single topic/partition)?
How to deal with the handover from replaying the rendered entities event store to the "live stream" of commands/events? Encode offset in each item in the compacted log view, and pass that to replay from the live event log?
Are there other systems that fit this problem better?
I will give you a partial answer based on my experience in Event sourcing.
Event ordering (only workaround single topic/partition?)
AddItem before AddCategory, invalid category reference.
UpdateItem before AddCategory, used to be a valid reference, now invalid.
RemoveCategory before AddItem, category reference invalid.
....infinite list of other concurrency issues.
All scalable event stores that I know of guarantee event ordering inside a partition only. In DDD terms, the event store ensures that the Aggregate is rehydrated correctly by replaying the events in the order they were generated. An Apache Kafka topic seems to be a good choice for that. While this is sufficient for the Write side of an application, it is harder for the Read side to use it. Harder but not impossible.
Given that the events are already validated by the Write side (because they represent facts that already happened) we can be sure that any inconsistency that appears in the system is due to the wrong ordering of events. Also, given that the Read side is eventually consistent with the Write side, the missing events will eventually reach our Read models.
So, first thing, in your case AddItem before AddCategory, invalid category reference, should be in fact ItemAdded before CategoryAdded (terms are in the past).
Second, when ItemAdded arrives, you try to load the Category by ID, and if that fails (because of the delayed CategoryAdded event) you can create a NotYetAvailableCategory whose ID equals the ID referenced in the ItemAdded event, with a title of "Not Yet Available Please Wait a few milliseconds". Then, when the CategoryAdded event arrives, you just update all the Items that reference that category ID. So the main idea is that you create temporary entities that will be finalized when their events eventually arrive.
In the case of CategoryRemoved before ItemAdded, category reference invalid: when the ItemAdded event arrives, you could check whether the category was deleted (by having a ListOfCategoriesThatWereDeleted read model) and then take the appropriate action in your Item entity; what that is depends on your business.
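The placeholder approach can be sketched as a small in-memory read model. The function names, the dictionaries, and the choice to detach an item from a deleted category are assumptions for illustration; in this sketch items look up their category at read time, so "updating all the Items" reduces to replacing the placeholder entry.

```python
PLACEHOLDER_TITLE = "Not yet available"

categories = {}             # category_id -> {"title": str, "pending": bool}
items = {}                  # item_id -> {"name": str, "category_id": str or None}
deleted_categories = set()  # plays the role of ListOfCategoriesThatWereDeleted


def on_item_added(item_id, name, category_id):
    if category_id in deleted_categories:
        # Business decision (assumed here): keep the item but detach it.
        items[item_id] = {"name": name, "category_id": None}
        return
    if category_id not in categories:
        # CategoryAdded has not arrived yet: create a temporary entity.
        categories[category_id] = {"title": PLACEHOLDER_TITLE, "pending": True}
    items[item_id] = {"name": name, "category_id": category_id}


def on_category_added(category_id, title):
    # Finalizes a placeholder if one exists, otherwise creates the category.
    categories[category_id] = {"title": title, "pending": False}


def on_category_removed(category_id):
    categories.pop(category_id, None)
    deleted_categories.add(category_id)


def item_view(item_id):
    item = items[item_id]
    category = categories.get(item["category_id"])
    return (item["name"], category["title"] if category else None)
```

For example, handling ItemAdded for an unknown category yields the placeholder title until the matching CategoryAdded is processed, after which the same view shows the real title.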

Remove read data for authenticated user?

My requirement with DDS is this: I have many subscribers but a single publisher. Each subscriber reads the data from DDS and checks whether the message is meant for that particular subscriber. Only if the check succeeds does it take the data and remove it from DDS. The message must remain in DDS until the authenticated subscriber takes its data. How can I achieve this using DDS (in a Java environment)?
First of all, you should be aware that with DDS, a Subscriber is never able to remove data from the global data space. Every Subscriber has its own cached copy of the distributed data and can only act on that copy. If one Subscriber takes data, then other Subscribers for the same Topic will not be influenced by that in any way. Only Publishers can remove data globally for every Subscriber. From your question, it is not clear whether you know this.
Independent of that, it seems like the use of a ContentFilteredTopic (CFT) is suitable here. According to the description, the Subscriber knows the file name that it is looking for. With a CFT, the Subscriber can indicate that it is only interested in samples that have a particular value for the file_name attribute. The infrastructure will take care of the filtering process and will ensure that the Subscriber will not receive any data with a different value for the attribute file_name. As a consequence, any take() action done on the DataReader will contain relevant information and there is no need to check the data first and then take it.
The API documentation should contain more detailed information about how to use a ContentFilteredTopic.

SO style reputation system with CQRS & Event Sourcing

I am making my first forays into CQRS and Event Sourcing and I have a few points I'd like some guidance on. I would like to implement an SO style reputation system. This seems a perfect fit for this architecture.
Keeping SO as the example: say a question is upvoted. This generates an UpvoteCommand, which increases the question's total score and fires off a QuestionUpvotedEvent.
It seems like the author's User aggregate should subscribe to the QuestionUpvotedEvent, which could increase the reputation score. But how and when you do this subscription is not clear to me. In Greg Young's example the event/command handling is wired up in the global.asax, but this doesn't seem to involve any routing based on aggregate Id.
It seems as though every User aggregate would subscribe to every QuestionUpvotedEvent, which doesn't seem correct. To make such a scheme work, the event handler would have to contain behavior to identify whether that user owned the question that was just upvoted. Greg Young implied this should not be in event handler code, which should merely involve state change.
What am I getting wrong here?
Any guidance much appreciated.
EDIT
I guess what we are talking about here is inter-aggregate communication between the Question and User aggregates. One solution I can see is that the QuestionUpvotedEvent is subscribed to by a ReputationEventHandler, which could then fetch the corresponding User AR and call a corresponding method on this object, e.g. YourQuestionWasUpvoted. This would in turn generate a user-specific UserQuestionUpvoted event, thereby preserving replayability in the future. Is this heading in the right direction?
EDIT 2
See also the discussion on Google Groups here.
My understanding is that aggregates themselves should not be subscribing to events. The domain model only raises events. It's the query side or other infrastructure components (such as an emailing component) that subscribe to events.
Domain Services are designed to work with use-cases/commands that involve more than one aggregate.
What I would do in this situation:
VoteUpQuestionCommand gets invoked.
The handler for VoteUpQuestionCommand calls:
IQuestionVotingService.VoteUpQuestion(Guid questionId, Guid UserId);
This then fetches both the question and user aggregates, calling the appropriate methods on both, such as user.IncrementReputation(int amount) and question.VoteUp(). This would raise two events, UsersReputationIncreasedEvent and QuestionUpVotedEvent respectively, which would be handled by the query side.
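A rough sketch of such a domain service, written in Python rather than the C#-style signatures above. The repositories are plain dictionaries, the reputation amount of 10 is an assumed value, and the user id passed in is taken to be the user whose reputation changes; none of this comes from the original answer beyond the method and event names.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    id: str
    score: int = 0
    events: list = field(default_factory=list)

    def vote_up(self):
        self.score += 1
        self.events.append(("QuestionUpVotedEvent", self.id))


@dataclass
class User:
    id: str
    reputation: int = 0
    events: list = field(default_factory=list)

    def increment_reputation(self, amount):
        self.reputation += amount
        self.events.append(("UsersReputationIncreasedEvent", self.id, amount))


class QuestionVotingService:
    UPVOTE_REPUTATION = 10  # assumed amount, for illustration only

    def __init__(self, questions, users):
        self.questions = questions  # question_id -> Question
        self.users = users          # user_id -> User

    def vote_up_question(self, question_id, user_id):
        # Fetch both aggregates and call the appropriate method on each.
        question = self.questions[question_id]
        user = self.users[user_id]
        question.vote_up()
        user.increment_reputation(self.UPVOTE_REPUTATION)
        # Both aggregates raised an event; the query side would consume these.
        return question.events + user.events


service = QuestionVotingService({"q1": Question("q1")}, {"u1": User("u1")})
raised = service.vote_up_question("q1", "u1")
```

Note that this changes two aggregates in one operation, which is exactly the point the later answers argue about.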
My rule of thumb: if you do inter-AR communication use a saga. It keeps things within the transactional boundary and makes your links explicit => easier to handle/maintain.
The user aggregate should have a QuestionAuthored event... in that event it subscribes to the QuestionUpvotedEvent... similarly it should have a QuestionDeletedEvent and/or QuestionClosedEvent in which it does the proper handling like unsubscribing from the QuestionUpvotedEvent etc.
EDIT - as per comment:
I would implement the Question as an external event source and handle it via a gateway. The gateway in turn is the one responsible for handling any replay correctly so the end result stays exactly the same - except for special events like rejection events...
This is an old question and is marked as answered, but I think I can add something to it.
After a few months of reading, practice, and building a small framework and application based on CQRS+ES, I think CQRS tries to decouple component dependencies and responsibilities. Some resources say that a command handler should change at most one aggregate per command (you can load more than one aggregate in the handler, but only one of them may change).
So in your case I think the best practice is @Tom's answer, and you should use a saga. If your framework doesn't support sagas (like my small framework), you can create an event handler like UpdateUserReputationByQuestionVotedEvent. In that handler, create a command such as UpdateUserReputation(Guid userId, int amount) OR UpdateUserReputation(Guid userId, Guid questionId, int amount) OR
UpdateUserReputation(Guid userId, string description, int amount). After the command is sent to its handler, the handler loads the user by user id and updates its state and properties. With this type of handling you can build more complex scenarios or workflows.
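The event-handler-to-command route might be sketched like this, in Python rather than C#. The handler and command names follow the answer, while the question_authors lookup, the default amount of 10, and the in-memory reputation dictionary are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateUserReputation:
    user_id: str
    question_id: str
    amount: int


class UpdateUserReputationHandler:
    """Command handler: loads one user and changes only that user."""

    def __init__(self, reputations):
        self.reputations = reputations  # user_id -> reputation

    def handle(self, command):
        current = self.reputations.get(command.user_id, 0)
        self.reputations[command.user_id] = current + command.amount


class UpdateUserReputationByQuestionVotedEvent:
    """Event handler: translates a question event into a user command."""

    def __init__(self, question_authors, command_handler, amount=10):
        self.question_authors = question_authors  # question_id -> author user_id
        self.command_handler = command_handler
        self.amount = amount  # assumed reputation gain per upvote

    def handle(self, event):
        question_id = event["question_id"]
        author_id = self.question_authors[question_id]
        command = UpdateUserReputation(author_id, question_id, self.amount)
        self.command_handler.handle(command)
        return command


reputations = {}
commands = UpdateUserReputationHandler(reputations)
on_upvote = UpdateUserReputationByQuestionVotedEvent({"q1": "u1"}, commands)
sent = on_upvote.handle({"type": "QuestionUpvotedEvent", "question_id": "q1"})
```

Each command still touches a single aggregate, and the routing question from the original post (which user owns the question) is answered in the event handler rather than inside the User aggregate.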