Change stream duplication when a microservice instance is replicated - mongodb

I have implemented a MongoDB change stream in a Java microservice. When I run a replica of my microservice, I see the change stream watch listening twice and the work being duplicated. Is there any way to stop this?

I gave a similar answer here; however, as this question is directly related to Java, I feel it is actually more relevant on this question. I assume what you are after is that each change is processed only once among many replicated processes.
Doing this with strong guarantees is difficult but not impossible. I wrote about the details of one solution here: https://www.alechenninger.com/2020/05/building-kafka-like-message-queue-with.html
The solution is implemented in a proof-of-concept Java library that you are free to use or fork (the blog post explains how it works).
It comes down to a few techniques:
Each process attempts to obtain a lock
Each lock (or each change) has an associated fencing token
Processing each change must be idempotent
While processing the change, the token is used to ensure ordered, effectively-once updates.
More details in the blog post.
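To make the fencing-token idea concrete, here is a minimal sketch in plain Java (all names are hypothetical; the real library coordinates locks and tokens through MongoDB itself): each change carries a monotonically increasing token, and an update is applied only if its token is newer than the last one recorded for that key, so duplicate deliveries from another replica become harmless.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of fencing tokens; not the actual library code.
class FencedProcessor {
    // Last token applied per document key. In practice this check-and-set
    // lives in the database next to the document, making it atomic.
    private final Map<String, Long> lastApplied = new HashMap<>();

    /**
     * Applies a change only if its fencing token is newer than the last one
     * recorded for that key. Returns true if the change was applied; replays
     * with the same or an older token are skipped, making processing idempotent.
     */
    synchronized boolean apply(String key, long fencingToken, Runnable update) {
        Long previous = lastApplied.get(key);
        if (previous != null && fencingToken <= previous) {
            return false; // stale or duplicate delivery from another replica
        }
        update.run();
        lastApplied.put(key, fencingToken);
        return true;
    }
}
```

With this guard, two replicas that both see the change with token 1 for the same document act on it only once; the second attempt observes a token that is not newer and is skipped.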

Related

Axon Event Published Multiple Times Over EventBus

Just want to confirm the intended behavior of Axon versus what I'm seeing in my application. We have a customized Kafka publisher integrated with the Axon framework, as well as a custom Cassandra-backed event store.
The issue I’m seeing is as follows: (1) I publish a command (e.g. CreateServiceCommand) which hits the constructor of the ServiceAggregate, and then (2) A ServiceCreatedEvent is applied to the aggregate. (3) We see the domain event persisted in the backend and published over the EventBus (where we have a Kafka consumer).
All well and good, but suppose I publish that same command again with the same aggregate identifier. I do see the ServiceCreatedEvent being applied to the aggregate in the debugger, but since a domain event record already exists with that key, nothing is persisted to the backend. Again, all well and good; however, I see the ServiceCreatedEvent being published out to Kafka and consumed by our listener, which is unexpected behavior.
I’m wondering whether this is the expected behavior of Axon, or if our Kafka integrations ought to be ensuring we’re not publishing duplicate events over the EventBus.
Edit:
I swapped in Axon's JPA event store and saw the following log when attempting to issue a command to create an aggregate that already exists. The issue, then, is indeed due to a defect in our custom event store.
oracle.jdbc.OracleDatabaseException: ORA-00001: unique constraint (R671659.UK8S1F994P4LA2IPB13ME2XQM1W) violated
    at oracle.jdbc.driver.T4CTTIoer11.processError
To be honest, the given explanation has a couple of holes that make the behavior odd and the problem hard to pinpoint.
In short, no: Axon would not publish an event twice as a result of dispatching the exact same command a second time, although this does depend on your implementation. If the command creates an aggregate, you should see a constraint violation on the uniqueness of the aggregate identifier and (aggregate) sequence number. If it's a command that works on an existing aggregate, whether it is idempotent depends on your implementation.
From your description I guess you are talking about a command handler that creates an aggregate, so the behavior you are seeing strikes me as odd. Either the custom event store introduces this undesired behavior, or it's due to not using Axon's dedicated Kafka Extension.
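For illustration only (plain Java, not Axon's actual API), the guarantee a proper event store must provide is a uniqueness constraint on (aggregate identifier, sequence number), and publication must happen only after a successful append. Appending the "first" event for the same aggregate twice has to fail loudly, exactly as the ORA-00001 log above shows, rather than silently skip persistence while still publishing:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the guarantee an event store must provide.
class TinyEventStore {
    // Simulates the DB unique constraint on (aggregateId, sequenceNumber).
    private final Set<String> keys = new HashSet<>();
    // Stand-in for the downstream event bus / Kafka publisher.
    final List<String> published = new ArrayList<>();

    /** Appends an event; throws if (aggregateId, sequenceNumber) already exists. */
    void append(String aggregateId, long sequenceNumber, String event) {
        String key = aggregateId + "#" + sequenceNumber;
        if (!keys.add(key)) {
            // Mirrors the ORA-00001 unique constraint violation.
            throw new IllegalStateException("unique constraint violated for " + key);
        }
        published.add(event); // publish only AFTER the append succeeded
    }
}
```

A store that publishes first, or regardless of the append outcome, produces exactly the duplicate-to-Kafka behavior described in the question.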
Also note that using a single solution for event storage and message distribution, like Axon Server, will eliminate the problem entirely. You'd no longer need to configure any custom event handling and publication on Kafka at all, saving you development work and infrastructure coordination. In addition, it provides the guarantees I've discussed earlier. For more insights on the why/how of Axon Server, you can check this other SO response of mine.

Kafka Streams Application Updates

I've built a Kafka Streams application. It's my first one, so I'm moving out of a proof-of-concept mindset into a "how can I productionalize this?" mindset.
The tl;dr version: I'm looking for kafka streams deployment recommendations and tips, specifically related to updating your application code.
I've been able to find lots of documentation about how Kafka and the Streams API work, but I couldn't find anything on actually deploying a Streams app.
The initial deployment seems to be fairly easy - there is good documentation for configuring your Kafka cluster, then you must create topics for your application, and then you're pretty much fine to start it up and publish data for it to process.
But what if you want to upgrade your application later? Specifically, if the update contains a change to the topology. My application does a decent amount of data enrichment and aggregation into windows, so it's likely that the processing will need to be tweaked in the future.
My understanding is that changing the order of processing or inserting additional steps into the topology will cause the internal IDs for each processing step to shift. At best, that means new state stores will be created and the previous state lost; at worst, processing steps will read from an incorrect state store topic on startup. This implies that you either have to reset the application or give the new version a new application ID. But there are some problems with that:
If you reset the application or give a new id, processing will start from the beginning of source and intermediate topics. I really don't want to publish the output to the output topics twice.
Currently "in-flight" data would be lost when you stop your application for an upgrade (since that application would never start again to resume processing).
The only way I can think to mitigate this is to:
Stop data from being published to source topics. Let the application process all messages, then shut it off.
Truncate all source and intermediate topics.
Start new version of application with a new app id.
Start publishers.
This is "okay" for now since my application is the only one reading from the source topics, and intermediate topics are not currently used beyond feeding to the next processor in the same application. But, I can see this getting pretty messy.
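To illustrate the "drain before shutdown" part of step 1, here is a rough sketch (plain Java with a queue standing in for the source topic; with real Kafka Streams you would instead watch consumer-group lag reach zero before stopping the instance):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Illustration only: a queue stands in for the source topic.
class DrainingConsumer {
    /** Processes everything currently in the "topic", then returns so the app can stop. */
    static List<String> drain(Queue<String> sourceTopic) {
        List<String> processed = new ArrayList<>();
        String record;
        while ((record = sourceTopic.poll()) != null) {
            processed.add(record.toUpperCase()); // stand-in for real processing
        }
        return processed; // queue is empty: no in-flight data is lost on shutdown
    }
}
```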
Is there a better way to handle application updates? Or are my steps generally along the lines of what most developers do?
I think you have a full picture of the problem here and your solution seems to be what most people do in this case.
At the latest Kafka Summit, this question was asked after the talk by Gwen Shapira and Matthias J. Sax about Kubernetes deployment. The answer was the same: if your upgrade contains topology modifications, rolling upgrades can't be done.
It looks like there is no KIP about this for now.

Converting resources in a RESTful manner

I'm currently stuck on designing my endpoints in a way that conforms to the REST principles while also ensuring the integrity of the underlying data.
I have two resources, ShadowUser and RealUser, where the first one has only a first name, last name and an e-mail address.
The second one has many more properties, such as an ID under which the real user can be addressed elsewhere in the system.
My use case is to convert specific ShadowUsers into RealUsers.
In my head the flow seems pretty simple:
get the shadow users: GET /api/ShadowUsers?someProperty=someValue
create new real users with the fetched data: POST /api/RealUsers
delete the shadow users: DELETE /api/ShadowUsers?someProperty=someValue
But what happens when there is a problem between the creation of new users and the deletion of the shadow ones? The data would now be inconsistent.
The example is even simpler when there is only a single user, but the issue stays the same: something could go wrong between steps 2 and 3, leaving the user existing as both shadow and real.
So the question is: how can this be done in a "transactional" manner, where either everything succeeds and is persisted, or something goes wrong and nothing is changed in the underlying data store?
Are there any "best practices" or "design-patterns" which can be used?
Perhaps this is a job for the RESTful API GETting and POSTing those real users in batch (I asked a question some weeks ago about a related issue: Updating RESTful resources against aggregate roots only).
On the API side, POSTed users wouldn't be handled directly; instead, they would be enqueued in a reliable message queue (for example RabbitMQ). A background process subscribed to the queue would handle both the creation of real users and the removal of shadow users.
The point of using a reliable messaging system is that you can implement retry policies. If the operation is interrupted in the middle of finishing its work, you can retry it and detect which changes are still pending to complete the task.
In summary, using this approach you can implement that operation in a transactional way.
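A minimal sketch of such a worker task, assuming in-memory maps stand in for the two data stores and all names are invented for illustration: the task creates the real user before deleting the shadow user, and both steps are idempotent, so retrying after an interruption can never lose the user or create a duplicate.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical background worker converting shadow users into real users.
// Create-then-delete ordering plus idempotence makes retries safe.
class UserConverter {
    final Map<String, String> shadowUsers = new HashMap<>(); // id -> name
    final Map<String, String> realUsers = new HashMap<>();   // id -> name

    /** Handles one queued conversion task; safe to run more than once. */
    void convert(String userId) {
        String name = shadowUsers.get(userId);
        if (name != null && !realUsers.containsKey(userId)) {
            realUsers.put(userId, name); // step 1: create the real user
        }
        shadowUsers.remove(userId);      // step 2: delete the shadow user
    }
}
```

If the worker crashes between the two steps, the retry finds the real user already created, skips step 1, and finishes the deletion, which is the "detect which changes are still pending" behavior described above.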

Event sourcing and validation on writes

Still pretty young at ES and CQRS, I understand that they are tightly related to eventual consistency of data.
Eventual consistency can be problematic when we should perform validation before writing to the store, like checking that an email address isn't already used by an existing user. The only way to do that in a strongly consistent way would be to stop accepting new events, finish processing the remaining events against our view, and then query the view. We obviously don't want to go that far, and Greg Young actually recommends embracing eventual consistency and dealing with the (rare) cases where we break constraints.
Pushing this approach to the limit, my understanding is that this would mean, when developing a web API for example, responding 'OK' to every request, because it is impossible to validate it at the time of the request... Am I on the right track, or am I missing something here?
As hinted in my comment above, a RESTful API can return 202 Accepted.
This provides a way for a client to poll for status updates if that's necessary.
The client can monitor for state if that's desirable, but alternatively, it can also simply fire and forget, assuming that if it gets any sort of 200-range response, the command will eventually be applied. This can be a good alternative if you have an alternative channel on which you can propagate errors. For example, if you know which user submitted the command, and you have that user's email address, you can send an email in the event of a failure to apply the command.
One of the points of a CQRS architecture is that the edge of the system should do whatever it can to validate the correctness of a Command before it accepts it. Based on the known state of the system (as exposed by the Query side), the system can make a strong effort to validate that a given Command is acceptable. If it does that, the only permanent error that should happen if you accept a Command is a concurrency conflict. Depending on how fast your system approaches consistent states, such concurrency conflicts may be so few that e.g. sending the user an email is an appropriate error-handling strategy.
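A small sketch of that edge validation, with invented names and an in-memory set standing in for the read model: the endpoint rejects commands that the (possibly stale) query side already knows are invalid, and accepts the rest with 202 as described above, leaving rare late conflicts to out-of-band error handling.

```java
import java.util.Set;

// Hypothetical edge of the system: validates a command against the read model
// before accepting it. 202 = accepted for async processing, 409 = conflict.
class RegistrationEndpoint {
    private final Set<String> knownEmails; // read-model view; may lag behind

    RegistrationEndpoint(Set<String> knownEmails) {
        this.knownEmails = knownEmails;
    }

    /** Best-effort validation before the command is accepted. */
    int handleRegister(String email) {
        if (knownEmails.contains(email)) {
            return 409; // conflict detected up front; reject immediately
        }
        // Here the command would be enqueued for asynchronous processing.
        return 202; // accepted; a rare late conflict is reported out of band
    }
}
```

Because the read model is eventually consistent, a 202 is not a guarantee of success, only a strong effort; the remaining failure mode is the concurrency conflict discussed above.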

CQRS reading from the domain model?

Is it totally forbidden (or just inappropriate) for the domain model in CQRS to return state to the client? e.g. if live updated information exists in the domain model. I'm thinking of a command causing events and changes to the domain model and returning info to the client.
The alternatives seem to be:
the client determines the updated state itself (duplication of code), which should hopefully match the eventual update of the read model,
the client waits for the read model to be updated (eventual consistency) and somehow knows when to read from there to update its state, or
the whole command is done synchronously and the read model somehow pushes the information to the client, e.g. with some sort of web socket.
I guess the latter relates to the desire to build reactive applications incorporating CQRS. The notion of the client polling for updates seem unsatisfactory (and similarly for the read model polling for updates from the domain model).
Is it totally forbidden (or just inappropriate) for the domain model in CQRS to return state to the client?
Assuming you mean DDD/CQRS, I think the dogmatic answer is "so inappropriate it may as well be forbidden". Sure, it's not going to hard-break anything on its own, but your architecture has taken a step down a very slippery slope whose bottom is a big ball of mud.
You want your Domain model to be isolated from all the issues of what people want to see afterwards.
returning info to the client
Consider that for any given Command sent to the Domain model, there can be multiple versions of "info" the client might want, depending on how the command is being invoked!
Imagine a button called "Send Invoice" that appears both on the "regular flow" page and the "bulk modify" page. Those might invoke the same Command yet have drastically different requirements for what the user sees as "the result" of what they did.
The alternatives seem to be:
I'd look at the Application Layer for this, or whatever layer would know "where" the user is when they are doing things. (Perhaps the same place where POST data is becoming an instance of a Command object.) That layer is the one that has enough context to determine what read-model to look at for the eventual results.
Whether you want the server to block (polling the DB), the client to do its own polling, or asynchronous websocket stuff... that's another story.