Event Replay using TrackingEventProcessor - Axon 3 - cqrs

I'm following the axon-springboot example shared by Allard (https://github.com/abuijze/bootiful-axon).
My understanding so far is: (please correct me if I have misunderstood some of the concepts)
Events are raised and stored in the event store/event bus (MySQL, using EmbeddedEventStore). Event processors (TrackingProcessors in my case) then pull events from the source (MySQL, right?), event handlers execute the business logic and update the query storage, and a message is published to RabbitMQ.
First question: where, when, and by whom is this message published to RabbitMQ? (It is consumed by the statistics application, which has the message listener configured.)
I have configured the TrackingProcessor to try the replay functionality. To execute the replay I stop my processor, delete the token entry for the processor, and start the processor again. The events are replayed and my Query Storage is up-to-date as expected.
Second question: when the replay is triggered and the Query Storage is updated, I don't see any messages being published to RabbitMQ, so my statistics application is out of sync. Am I doing something wrong?
Can you please advise?
Thanks
Singh

First of all, a correction: it is not the Tracking Processor or the updater of the view model that sends the messages to RabbitMQ. The Events are forwarded to Rabbit as they are published to the Event Bus.
The answer to your first question: messages are published by the SpringAmqpPublisher, which connects directly to the Event Bus and forwards each message to RabbitMQ as it is published.
To answer your second question, let's clarify how replays work, first. While it's called a "replay", essentially it's more a "reset". The Tracking Processor uses a TrackingToken to remember its progress of processing the Event Store. When the token is deleted (or just not yet available), the Tracking Processor starts processing from the beginning of the Event Store.
You never replay an entire application, just a single (Tracking) Processor. Just imagine: you re-publish all messages to RabbitMQ, other components are triggered again, unaware of the fact that these are "old" messages, and user-confirmation emails are sent again, orders placed again, and so on.
If your Statistics are out of date, it's because they aren't part of the same processor and aren't rebuilt together with the other element. RabbitMQ doesn't support "replaying", since it doesn't remember the messages after delivering them.
Any model that you want to be able to rebuild should be managed by a Tracking Processor.
Check out the Axon Reference guide for more information: https://docs.axonframework.org/part3/event-processing.html#event-processors
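To make the "reset" idea concrete, here is a minimal Python sketch. It is not Axon code: the class TrackingProcessor, the dictionary token_store, the publish function and the forwarded list are all hypothetical stand-ins for the Tracking Processor, the token table, the Event Bus and the messages sent to RabbitMQ. It only illustrates that deleting the token makes one processor start over, while anything forwarded at publication time is not sent again.

```python
from typing import Callable, Dict, List, Optional

event_store: List[str] = []   # append-only store of events
forwarded: List[str] = []     # stand-in for messages sent to RabbitMQ


def publish(event: str) -> None:
    """Store the event and forward it once, at publication time."""
    event_store.append(event)
    forwarded.append(event)


class TrackingProcessor:
    """Toy processor that pulls events and remembers its position in a token."""

    def __init__(self, name: str, token_store: Dict[str, int],
                 handler: Callable[[str], None]):
        self.name = name
        self.token_store = token_store   # processor name -> index of last handled event
        self.handler = handler

    def process_available(self) -> int:
        """Handle every event after the stored token; return how many were handled."""
        token: Optional[int] = self.token_store.get(self.name)
        start = 0 if token is None else token + 1   # no token: start from the beginning
        for index in range(start, len(event_store)):
            self.handler(event_store[index])
            self.token_store[self.name] = index      # record progress after each event
        return len(event_store) - start


token_store: Dict[str, int] = {}
view: List[str] = []   # stand-in for the query storage
processor = TrackingProcessor("view-updater", token_store, view.append)

publish("OrderPlaced")
publish("OrderShipped")
first_run = processor.process_available()    # handles both events
second_run = processor.process_available()   # handles nothing, the token is current

# "Replay": clear the view, delete the token, and process from the start again.
view.clear()
del token_store["view-updater"]
replayed = processor.process_available()
```

After the replay the view is rebuilt, but forwarded still holds each event only once, which is why the statistics application sees nothing new.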

Related

Google PubSub with pull subscriber design flaw?

We are using Google's streaming pull subscriber. The design is as follows:
The FE (frontend) sends a file to the BE (backend).
The BE converts that file to a ByteArray and publishes it to a Pub/Sub topic as a message (so the ByteArray is the message payload).
The topic delivers that message to the subscriber, and the subscriber converts the ByteArray back into a file.
The subscriber sends the converted file to a tool.
The tool does some cool stuff with the file and notifies the subscriber of the status.
That status goes to the BE, which updates the DB and sends the status to the FE.
Now, in our subscriber, when we receive a message we immediately acknowledge it and remove the subscriber's listener so that we don't receive any more messages.
When the tool has finished its work, it sends the status to the subscriber (we have an Express server running on the subscriber), and
after receiving the status we re-create the subscriber's listener to receive messages again.
Note:
The tool may take 1 hour or more to do its work.
We are using ordering keys to properly distribute messages to VMs.
This code is working fine, but my questions are:
Is there any flaw in this (because we remove the listener and then re-create it, or anything like that)?
Is there a better option or GCP service that fits this design?
Is there any improvement to be made in the code?
EDIT :
Removed code sample
I would say that there are several parts of this design that are sub-optimal. First of all, acking a message before you have finished processing it means you risk message loss. What happens if your tool or subscriber crashes after acking the message, but before processing has completed? This means when the processes start back up, they will not receive the message again. Are you okay with requests from the frontend possibly never being processed? If not, you'll want to ack after processing is completed, or--given that your processing takes so long--persist the request to a database or to some storage and then acknowledge the message. If you are going to have to persist the file somewhere else anyway, you might want to consider taking Pub/Sub out of the picture and just writing the file to storage like GCS and then having your subscribers instead read out of GCS directly.
Secondly, stopping the subscriber upon each message being received is an anti-pattern. Your subscriber should be receiving and processing each message as it arrives. If you need to limit the number of messages being processed in parallel, use message flow control.
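As a rough sketch of the two points above (ack only after the request is safely persisted, and limit parallelism with flow control instead of removing the listener), something like the following could work in Python. The persist function, the project ID and the subscription ID are hypothetical placeholders, and the run function assumes the google-cloud-pubsub package is installed.

```python
from typing import Callable


def make_callback(persist: Callable[[bytes], None]):
    """Build a Pub/Sub callback that acks only after the payload is safely stored."""

    def callback(message) -> None:
        try:
            persist(message.data)   # e.g. write the file to a database or a bucket
        except Exception:
            message.nack()          # not stored: ask Pub/Sub to redeliver
            return
        message.ack()               # stored: now it is safe to acknowledge

    return callback


def run(project_id: str, subscription_id: str,
        persist: Callable[[bytes], None]) -> None:
    """Attach the callback to a streaming pull subscriber with flow control."""
    from google.cloud import pubsub_v1   # requires the google-cloud-pubsub package

    subscriber = pubsub_v1.SubscriberClient()
    path = subscriber.subscription_path(project_id, subscription_id)
    # max_messages=1 keeps this client at one outstanding message at a time,
    # so the listener never has to be torn down and re-created.
    flow_control = pubsub_v1.types.FlowControl(max_messages=1)
    future = subscriber.subscribe(path, callback=make_callback(persist),
                                  flow_control=flow_control)
    with subscriber:
        future.result()   # block while messages are delivered to the callback
```

The long-running tool would then pick the request up from wherever persist stored it, so a crash after the ack no longer loses the request.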
Also, ordering keys aren't really a way to "properly distribute message to VM's." Ordering keys are only a means by which to ensure ordered delivery. There are no guarantees that the messages for the same ordering key will continually go to the same subscriber client. In fact, if you shut down the subscriber client after receiving each message, then another subscriber could receive the next message for the ordering key, since you've acked the earlier message. If all you mean by "properly distribute message" is that you want the messages delivered in order, then this is the correct way to use ordering keys.
You say you have a subscription per client; whether or not that is the right thing to do depends on what you mean by "client." If client means "user of the front end," then I imagine you plan to have a different topic per user as well. If so, then you need to keep in mind the 10,000 topic-per-project limit. If you mean that each VM has its own subscription, then note that each VM is going to receive every message published to the topic. If you only want one VM to receive each message, then you need to use the same subscription across all VMs.
In general, also keep in mind that Cloud Pub/Sub has at-least-once delivery semantics. That means that even an acknowledged message could be redelivered, so you do need to be prepared to handle duplicate message delivery.
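One common way to cope with duplicates is to make the handler idempotent by remembering which requests have already been processed. The sketch below is only an illustration: IdempotentHandler and request_id are hypothetical names, the ID is assumed to be chosen by the publisher and carried with the message, and the in-memory set would have to be durable storage in a real deployment.

```python
from typing import Callable, Set


class IdempotentHandler:
    """Runs the wrapped function at most once per request ID."""

    def __init__(self, process: Callable[[bytes], None]):
        self.process = process
        self.seen: Set[str] = set()   # assumption: durable storage in production

    def handle(self, request_id: str, data: bytes) -> bool:
        """Return True if the data was processed, False if it was a duplicate."""
        if request_id in self.seen:
            return False              # duplicate delivery: skip the work
        self.process(data)
        self.seen.add(request_id)     # mark as done only after processing succeeded
        return True


processed = []
handler = IdempotentHandler(processed.append)
first = handler.handle("req-1", b"file-bytes")
duplicate = handler.handle("req-1", b"file-bytes")   # redelivery of the same request
```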

Pub/Sub and consumer aware publishing. Stop producing when nobody is subscribed

I'm trying to find a messaging system that supports the following use case.
Producer registers topic namespace
client subscribes to topic
first client triggers notification on producer to start producing
new client with subscription to the same topic receives data (potentially conflated, similar to hot/cold observables in RX world)
When the last client goes away, unsubscribe or crash, notify the producer to stop producing to said topic
I am aware that, according to the pub/sub pattern, a producer is defined to be blissfully unaware of the existence of consumers, meaning that my use case simply does not fit the pub/sub paradigm.
So far I have looked into Kafka, Redis, NATS.io and Amazon SQS, but without much success. I've been thinking about a few possible ways to solve this, but haven't found anything that would satisfy my needs yet.
One option that springs to mind for bullet 2) is to model a request/reply pattern, as detailed on the NATS page among others, to have the producer listen to clients. A client would then publish a 'subscribe' message into the system that the producer would pick up on a wildcard subscription. This however leaves one big problem, which is unsubscribing. Assuming the consumer stops as it should, publishing an unsubscribe message just like the subscribe message would work. But in the case of a crash or similar, this won't work.
I'd be grateful for any ideas, references or architectural patterns/best practices that satisfy the above.
I've been doing quite a bit of research over the past week but haven't come across any satisfying Q&A or articles. Either I'm approaching it entirely wrong, or there just doesn't seem to be much out there, which would surprise me, as this appears to be a fairly common scenario that applies to many domains.
thanks in advance
Chris
//edit
An actual simple use-case that I have at hand is stock quote distribution.
Quotes come from external source
subscribe to stock A quotes from external system when the first end-user looks at stock A
Stop receiving quotes for stock A from external system when no more end-users look at said stock
RabbitMQ has internal events you can use with the Event Exchange Plugin. Events such as consumer.created or consumer.deleted could be used to trigger actions at your server level: for example, checking the remaining number of consumers using the RabbitMQ Management API and taking action, such as closing a topic, based on your use case.
I don't have any messaging system with consumer-presence-based publishing in mind. It gets even worse because you'll need some kind of heartbeat mechanism to handle consumer crashes.
So here are my two cents. I'm not sure if you're looking for an out-of-the-box solution, but if not, you could build your application around a ZooKeeper cluster to handle all your use cases.
Simply use watchers on ephemeral nodes to detect when you have no more consumers (including crashes), and put a watcher on a 'consumers' path to be notified when consumers arrive.
On the consumer side, you would have to register your zk node ID whenever you start a consumer.
It's not that complicated to do, and zk is not the only solution for this; you could use other consensus technologies as well.
A start for zookeeper :
https://zookeeper.apache.org/doc/r3.1.2/zookeeperStarted.html
(I strongly advise using the Curator API, which handles a lot of recipes smoothly.)
Yannick
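To illustrate the idea without a ZooKeeper cluster, here is a small in-memory Python sketch of the same mechanism. It is a stand-in, not ZooKeeper code: LeaseRegistry is a hypothetical class in which each consumer holds a lease that it must renew with a heartbeat (playing the role of an ephemeral node), and the on_start and on_stop callbacks play the role of the watchers. Time is passed in explicitly to keep the example deterministic.

```python
from typing import Callable, Dict, List, Tuple


class LeaseRegistry:
    """Tracks live consumers per topic; a lease that is not renewed expires."""

    def __init__(self, ttl: float, on_start: Callable[[str], None],
                 on_stop: Callable[[str], None]):
        self.ttl = ttl
        self.on_start = on_start   # called when a topic gets its first consumer
        self.on_stop = on_stop     # called when a topic loses its last consumer
        self.leases: Dict[str, Dict[str, float]] = {}   # topic -> consumer -> expiry

    def heartbeat(self, topic: str, consumer: str, now: float) -> None:
        """Subscribe or renew; the first consumer of a topic starts the producer."""
        consumers = self.leases.setdefault(topic, {})
        if not consumers:
            self.on_start(topic)
        consumers[consumer] = now + self.ttl

    def unsubscribe(self, topic: str, consumer: str) -> None:
        """Clean shutdown of a consumer."""
        self._drop(topic, consumer)

    def expire(self, now: float) -> None:
        """Drop consumers whose lease ran out, e.g. because they crashed."""
        for topic in list(self.leases):
            for consumer, expiry in list(self.leases[topic].items()):
                if expiry <= now:
                    self._drop(topic, consumer)

    def _drop(self, topic: str, consumer: str) -> None:
        consumers = self.leases.get(topic, {})
        if consumer in consumers:
            del consumers[consumer]
            if not consumers:          # last consumer gone: stop the producer
                del self.leases[topic]
                self.on_stop(topic)


log: List[Tuple[str, str]] = []
registry = LeaseRegistry(ttl=10.0,
                         on_start=lambda t: log.append(("start", t)),
                         on_stop=lambda t: log.append(("stop", t)))

registry.heartbeat("stock-A", "alice", now=0.0)   # first viewer: start quotes
registry.heartbeat("stock-A", "bob", now=1.0)     # second viewer: nothing new
registry.unsubscribe("stock-A", "alice")          # clean exit, bob is still there
registry.expire(now=20.0)                         # bob crashed, his lease ran out
```

With ZooKeeper the expiry would be handled by session timeouts on the ephemeral nodes rather than by an explicit expire call.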
Unfortunately, you haven't specified the business use case that you are trying to solve with such requirements. From the sound of it, you want not a pub/sub system but an orchestration one.
I would recommend checking out the Cadence Workflow that is capable of supporting your listed requirements and many more orchestration use cases.
Here is a strawman design that satisfies your requirements:
Any new subscriber sends an event to a workflow, with the TopicName as the workflowID, to subscribe. If a workflow with the given ID doesn't exist, it is automatically started.
Any subscriber sends another signal to unsubscribe.
When no subscribers are left, the workflow exits.
The publisher sends an event to the workflow to deliver to subscribers.
The workflow delivers the event to the subscribers using an activity.
If a workflow with the given TopicName isn't running, publishing an event to it is going to fail.
Cadence offers a lot of other advantages over using queues for task processing.
Built-in exponential retries with unlimited expiration interval
Failure handling. For example, it allows executing a task that notifies another service if both updates couldn't succeed during a configured interval.
Support for long-running heartbeating operations
Ability to implement complex task dependencies. For example, to implement chaining of calls or compensation logic in case of unrecoverable failures (SAGA)
Gives complete visibility into the current state of the update. For example, when using queues, all you know is whether there are some messages in a queue, and you need an additional DB to track the overall progress. With Cadence every event is recorded.
Ability to cancel an update in flight.
Distributed CRON support
See the presentation that goes over Cadence programming model.

How should Event Sourcing event handlers be hosted to construct a read model?

There are various example applications and frameworks that implement a CQRS + Event Sourcing architecture and most describe use of an event handler to create a denormalized view from domain events stored in an event store.
One example of hosting this architecture is as a web api that accepts commands to the write side and supports querying the denormalized views. This web api is likely scaled out to many machines in a load balanced farm.
My question is where are the read model event handlers hosted?
Possible scenarios:
Hosted in a single windows service on a separate host.
If so, wouldn't that create a single point of failure? This probably complicates deployment too but it does guarantee a single thread of execution. Downside is that the read model could exhibit increased latency.
Hosted as part of the web api itself.
If I'm using EventStore, for example, for the event storage and event subscription handling, will multiple handlers (one in each web farm process) be fired for each single event and thereby cause contention in the handlers as they try to read/write to their read store? Or are we guaranteed for a given aggregate instance all its events will be processed one at a time in event version order?
I'm leaning towards scenario 2 as it simplifies deployment and also supports process managers that need to also listen to events. Same situation though as only one event handler should be handling a single event.
Can EventStore handle this scenario? How are others handling processing of events in eventually consistent architectures?
EDIT:
To clarify, I'm talking about the process of extracting event data into the denormalized tables rather than the reading of those tables for the "Q" in CQRS.
I guess what I'm looking for are options for how we "should" implement and deploy the event processing for read models/sagas/etc that can support redundancy and scale, assuming of course the processing of events is handled in an idempotent way.
I've read of two possible solutions for processing data saved as events in an event store but I don't understand which one should be used over another.
Event bus
An event bus/queue is used to publish messages after an event is saved, usually by the repository implementation. Interested parties (subscribers), such as read models, or sagas/process managers, use the bus/queue "in some way" to process it in an idempotent way.
If the queue is pub/sub this implies that each downstream dependency (read model, sagas, etc) can only support one process each to subscribe to the queue. More than one process would mean each processing the same event and then competing to make the changes downstream. Idempotent handling should take care of consistency/concurrency issues.
If the queue is competing consumer we at least have the possibility of hosting subscribers in each web farm node for redundancy. Though this requires a queue for each downstream dependency; one for sagas/process managers, one for each read model, etc, and so the repository would have to publish to each for eventual consistency.
Subscription/feed
A subscription/feed where interested parties (subscriber) read an event stream on demand and get events from a known checkpoint for processing into a read model.
This looks great for recreating read models if necessary. However, as per the usual pub/sub pattern, it would seem only one subscriber process per downstream dependency should be used. If we register multiple subscribers for the same event stream, one in each web farm node for example, they will all attempt to process and update the same respective read model.
In our project we use subscription-based projections. The reasons for this are:
Committing to the write side must be transactional, and if you use two pieces of infrastructure (event store and message bus), you have to start using DTC, or otherwise you risk your events being saved to the store but not published to the bus, or the other way around, depending on your implementation. DTC and two-phase commits are nasty things and you do not want to go this way.
Events are usually published to the message bus anyway (we do it via subscriptions too) for event-driven communication between different bounded contexts. If you use message subscribers to update your read model, then when you decide to rebuild the read model, your other subscribers will get these messages too, and this will bring the system into an invalid state. I think you have thought about this already when saying you must only have one subscriber for each published message type.
Message bus consumers have no guarantee on message order, and this can leave your read model in a mess.
Message consumers usually handle retries by sending the message back to the queue, usually to the end of the queue, for retrying. This means that your events can become heavily out of order. In addition, after some number of retries, the message consumer usually gives up on the poison message and puts it in a dead-letter queue (DLQ). If this consumer were your projection, one update would be ignored whilst others would be processed. This means that your read model would be in an inconsistent (invalid) state.
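The ordering problem in the last point can be shown with a toy example. The Python sketch below is only an illustration under simplifying assumptions: events are (sequence number, new customer name) pairs, the projection keeps only the latest name, and fail_once is a hypothetical set of sequence numbers whose first handling attempt fails. The queue consumer puts a failed message at the back of the queue, while the subscription retries at the same position.

```python
from collections import deque
from typing import Deque, List, Set, Tuple

Event = Tuple[int, str]   # (sequence number, new customer name)


def project_via_queue(events: List[Event], fail_once: Set[int]) -> str:
    """Queue consumer: a failed message is re-queued behind newer messages."""
    queue: Deque[Event] = deque(events)
    failing = set(fail_once)
    name = ""
    while queue:
        seq, value = queue.popleft()
        if seq in failing:
            failing.discard(seq)
            queue.append((seq, value))   # retry later, after newer events
            continue
        name = value                     # apply the event to the read model
    return name


def project_via_subscription(events: List[Event], fail_once: Set[int]) -> str:
    """Subscription: a failed event is retried in place, so order is kept."""
    failing = set(fail_once)
    name = ""
    position = 0
    while position < len(events):
        seq, value = events[position]
        if seq in failing:
            failing.discard(seq)
            continue                     # retry the same position
        name = value
        position += 1                    # checkpoint moves only after success
    return name


events = [(1, "Alice"), (2, "Alicia")]   # the customer was renamed to "Alicia"
queue_result = project_via_queue(events, fail_once={1})
subscription_result = project_via_subscription(events, fail_once={1})
```

Here the queue-based projection ends with the stale name "Alice", because event 1 is applied after event 2, while the subscription-based projection ends with "Alicia".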
Considering these reasons, we have single-threaded subscription-based projections that can do whatever they need. You can do different types of projections, each with its own checkpoint, subscribing to the event store using catch-up subscriptions. We host them in the same process as many other things for the sake of simplicity, but this process only runs on one machine. Should we want to scale out this process, we will have to take the subscriptions/projections out. It can easily be done, since this part has virtually no dependencies on other modules, except the read model DTOs themselves, which can be shared as an assembly anyway.
By using subscriptions you always project events that have been already committed. If something goes wrong with the projections, the write side is definitely the source of truth and remains so, you just need to fix the projection and run it again.
We have two separate subscriptions: one for projecting to the read model and another one for publishing events to the message bus. This construct has proven to work very well.
Specifically for EventStore, they now have competing consumers, which are server based subscriptions where many clients can subscribe to the subscription group but only one client gets the message.
It sounds like that is what you are after: each node in the farm can subscribe to the subscription group, and the node that receives the message does the projection.
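The competing-consumers idea can be pictured with a small Python sketch. This is not the EventStore client API: SubscriptionGroup is a hypothetical class, and the round-robin choice of client is an assumption made for the example. The only point it shows is that each event is handed to exactly one of the connected clients.

```python
from typing import Callable, List


class SubscriptionGroup:
    """Server-side group: every event goes to exactly one connected client."""

    def __init__(self) -> None:
        self.clients: List[Callable[[str], None]] = []
        self._next = 0

    def connect(self, client: Callable[[str], None]) -> None:
        self.clients.append(client)

    def dispatch(self, event: str) -> None:
        # Round-robin is an assumption of this sketch, not a documented strategy.
        client = self.clients[self._next % len(self.clients)]
        self._next += 1
        client(event)


node_a: List[str] = []
node_b: List[str] = []
group = SubscriptionGroup()
group.connect(node_a.append)   # web farm node A
group.connect(node_b.append)   # web farm node B

for event in ["e1", "e2", "e3", "e4"]:
    group.dispatch(event)
```

Each event lands on one node only, so the projection is not run twice. The sketch says nothing about ordering across nodes, which still has to be considered for the read model.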

MSMQ - Multiple Subscribers and Event Notification

I'm a bit new to MSMQ and need a bit of help. We have a JMS-based messaging system and we are considering replacing it with MSMQ. There are 2 existing scenarios in JMS which I need to verify that MSMQ supports.
Multiple Subscriber Applications for the same message.
Notification sent to a Subscriber Application that a message has arrived for them. (Basically MSMQ pushing the message to the subscriber application, as opposed to the Subscriber application checking the Queue in MSMQ.)
If anyone could provide any info or link to any sites with the relevant info, I'd appreciate it.
Thanks,
Tarique
Multiple Subscriber Applications for the same message.
You can do this with Multiple-Destination Messaging.
Notification sent to a Subscriber Application that a message has arrived for them.
Use the async pattern for this: you begin listening for a message and get a notification when it arrives (a C# callback method, such as MyReceiveCompleted in the code sample). From personal experience, this works slower than reading messages one by one synchronously. But if you handle fewer than 1k messages a second on an average machine, you will be fine.
See MessageQueue.BeginReceive for a code sample.

First message not arriving over an MSMQ/MassTransit Service Bus

I've got a MassTransit ServiceBus running over MSMQ. It appears that the first message sent over the Bus doesn't arrive, but subsequent messages do?
Is there some initialization that needs performing on the queue or bus before the message is sent?
Depending on a few settings, the system needs some time to set up before everything routes correctly. If only the first message fails to end up in the right location, then the subscription data likely hasn't propagated everywhere yet. http://readthedocs.org/docs/masstransit/en/develop/overview/subscriptions.html
Using multicast subscriptions, the easiest choice, requires a few seconds after an endpoint has come up for it to register a subscriber with all other endpoints. If you can control the order in which services start up, then this can often be avoided by starting them back to front in the flow.
If you are using the subscription service, then that can also take a couple of seconds to get data everywhere. It has to go through the subscription service, but the subscription is sent to everyone on the bus. This is tied to a SQL database, and latency to the database can affect this timing.
Lastly, if you are using static routing, then that should work immediately, because the subscription is set up upon startup.