Akka: advice on implementation

I would like to create an app that integrates our (enterprise-specific) tools with Slack. For that I foresee some bots that send events to Slack and a few commands that trigger actions.
I would like to use Akka because it is a buzzword and seems nice, but I don't have any real arguments in favor of it. (That is not a problem, since I will develop this app alone in my free time.)
But I don't have any experience in creating an actor-based application. I already have three actors: two to collect events and one to publish those events to Slack. Each collector is triggered by a timer, holds a reference to the publisher, and sends messages to it. That works.
For the commands part I don't have anything yet, but I can imagine a listener (a Play HTTP controller) that converts Slack requests to messages and sends them to an actor. Ideally, I would like to decouple this listener from the actor that handles the command.
I would like some advice on how to develop this kind of application, where some actors collect information on a timed basis and others react to messages.
Thanks.

For time-based activity, i.e. "ticks" or "heartbeats", the documentation demonstrates a very good pattern for sending messages to actors periodically.
For your other need, actors responding to messages as they come in, this is exactly the purpose of the actor's receive method. From the documentation:
Actors are objects which encapsulate state and behavior, they
communicate exclusively by exchanging messages which are placed into
the recipient’s mailbox. In a sense, actors are the most stringent
form of object-oriented programming, but it serves better to view them
as persons: while modeling a solution with actors, envision a group of
people and assign sub-tasks to them, arrange their functions into an
organizational structure and think about how to escalate failure (all
with the benefit of not actually dealing with people, which means that
we need not concern ourselves with their emotional state or moral
issues). The result can then serve as a mental scaffolding for
building the software implementation.
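
As a rough illustration of the two roles in the question (collectors driven by ticks, a publisher that reacts to messages), here is a minimal sketch in plain Python rather than Akka. The names Actor, Collector, Publisher, Tick and Event are hypothetical and are not Akka's API. A mailbox is modelled as a queue.Queue, and messages are processed by calling drain() by hand, where Akka would use its own scheduler and dispatcher.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class Tick:
    """Periodic message that a timer would send to a collector."""


@dataclass(frozen=True)
class Event:
    source: str
    text: str


class Actor:
    """Toy actor: a mailbox plus a receive method (not Akka's API)."""

    def __init__(self):
        self.mailbox = queue.Queue()

    def tell(self, message):
        # Senders only enqueue; they never call receive directly.
        self.mailbox.put(message)

    def drain(self):
        # Process queued messages one at a time, in arrival order.
        while not self.mailbox.empty():
            self.receive(self.mailbox.get())

    def receive(self, message):
        raise NotImplementedError


class Publisher(Actor):
    def __init__(self):
        super().__init__()
        self.sent = []  # stands in for posting to Slack

    def receive(self, message):
        if isinstance(message, Event):
            self.sent.append(f"[{message.source}] {message.text}")


class Collector(Actor):
    def __init__(self, name, publisher):
        super().__init__()
        self.name = name
        self.publisher = publisher
        self.ticks = 0

    def receive(self, message):
        if isinstance(message, Tick):
            self.ticks += 1
            self.publisher.tell(Event(self.name, f"poll #{self.ticks}"))


publisher = Publisher()
builds = Collector("builds", publisher)
tickets = Collector("tickets", publisher)

# A scheduler would send these periodically; here they are sent by hand.
for collector in (builds, tickets, builds):
    collector.tell(Tick())
for actor in (builds, tickets, publisher):
    actor.drain()
```

The same shape gives the decoupling asked about for commands: an HTTP listener only needs something it can tell() a message to, and does not have to know which actor ends up handling it.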

Related

Pub/Sub and consumer aware publishing. Stop producing when nobody is subscribed

I'm trying to find a messaging system that supports the following use case.
1. A producer registers a topic namespace.
2. A client subscribes to a topic.
3. The first client triggers a notification to the producer to start producing.
4. A new client subscribing to the same topic receives data (potentially conflated, similar to hot/cold observables in the Rx world).
5. When the last client goes away (unsubscribes or crashes), the producer is notified to stop producing to that topic.
I am aware that, according to the pub/sub pattern, a producer is defined to be blissfully unaware of the existence of consumers, meaning that my use case simply does not fit the pub/sub paradigm.
So far I have looked into Kafka, Redis, NATS.io and Amazon SQS, but without much success. I've been thinking about a few possible ways to solve this, but haven't found anything that would satisfy my needs yet.
One option that springs to mind for bullet 2) is to model a request/reply pattern, as detailed (among other places) on the NATS page, to have the producer listen to clients. A client would then publish a 'subscribe' message into the system that the producer would pick up on a wildcard subscription. This however leaves one big problem, which is unsubscribing. Assuming the consumer stops as it should, publishing an unsubscribe message just like the subscribe would work. But in the case of a crash or similar, this won't work.
I'd be grateful for any ideas, references or architectural patterns/best practices that satisfy the above.
I've been doing quite a bit of research over the past week but haven't come across any satisfying Q&A or articles. Either I'm approaching it entirely wrong, or there just doesn't seem to be much out there, which would surprise me, as this appears to me to be a fairly common scenario that applies to many domains.
thanks in advance
Chris
//edit
An actual simple use-case that I have at hand is stock quote distribution.
Quotes come from an external source.
Subscribe to stock A quotes from the external system when the first end-user looks at stock A.
Stop receiving quotes for stock A from the external system when no more end-users look at that stock.

RabbitMQ has internal events you can use with the Event Exchange Plugin. Events such as consumer.created or consumer.deleted could be used to trigger actions at your server level: for example, checking the remaining number of consumers using the RabbitMQ Management API and taking action, such as closing a topic, based on your use cases.

I don't know of any messaging system that bases publishing on consumer presence. It gets even worse because you'll need some kind of heartbeat mechanism to handle consumer crashes.
So here are my two cents. I'm not sure whether you're looking for an out-of-the-box solution, but if not, you could wrap your application around a ZooKeeper cluster to handle all your use cases.
Simply use watchers on ephemeral nodes to detect when you have no more consumers (including crashes), and put a watcher on a 'consumers' path to be notified when you get consumers.
On the consumer side, you would have to register a zk node ID whenever a consumer starts.
It's not so complicated to do, and zk is not the only solution for this; you might use other consensus technologies as well.
A starting point for ZooKeeper:
https://zookeeper.apache.org/doc/r3.1.2/zookeeperStarted.html
(I strongly advise using the Curator API, which handles a lot of recipes smoothly.)
Yannick
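
To make the idea concrete, here is a minimal in-memory sketch of the bookkeeping involved. It is not ZooKeeper's or Curator's API: the TopicRegistry class and its on_start/on_stop callbacks are hypothetical, and a lease that must be renewed by heartbeat stands in for an ephemeral node that disappears when its owner dies. Time is passed in explicitly to keep the example deterministic.

```python
class TopicRegistry:
    """Tracks live consumers per topic using leases that must be renewed."""

    def __init__(self, on_start, on_stop, ttl=10.0):
        self.on_start = on_start
        self.on_stop = on_stop
        self.ttl = ttl
        self.leases = {}  # topic -> {consumer_id: lease expiry time}

    def heartbeat(self, topic, consumer_id, now):
        """Subscribe or renew; the first consumer of a topic starts production."""
        consumers = self.leases.setdefault(topic, {})
        first = not consumers
        consumers[consumer_id] = now + self.ttl
        if first:
            self.on_start(topic)

    def unsubscribe(self, topic, consumer_id):
        """Clean shutdown path: the consumer says goodbye itself."""
        consumers = self.leases.get(topic, {})
        if consumers.pop(consumer_id, None) is not None:
            self._stop_if_empty(topic)

    def expire(self, now):
        """Drop consumers whose lease ran out (for example, after a crash)."""
        for topic in list(self.leases):
            consumers = self.leases[topic]
            for consumer_id, expiry in list(consumers.items()):
                if expiry <= now:
                    del consumers[consumer_id]
            self._stop_if_empty(topic)

    def _stop_if_empty(self, topic):
        if topic in self.leases and not self.leases[topic]:
            del self.leases[topic]
            self.on_stop(topic)


log = []
registry = TopicRegistry(
    on_start=lambda topic: log.append(("start", topic)),
    on_stop=lambda topic: log.append(("stop", topic)),
    ttl=10.0,
)
registry.heartbeat("STOCK.A", "alice", now=0.0)  # first viewer: start producing
registry.heartbeat("STOCK.A", "bob", now=1.0)    # already producing, no callback
registry.unsubscribe("STOCK.A", "alice")         # bob is still watching
registry.expire(now=20.0)                        # bob crashed: his lease ran out
```

In a real deployment the lease table would have to live somewhere that survives the producer, which is the part a coordination service such as ZooKeeper provides.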

Unfortunately, you haven't specified the business use case that you are trying to solve with these requirements. From the sound of it, you want not a pub/sub system but an orchestration one.
I would recommend checking out the Cadence Workflow that is capable of supporting your listed requirements and many more orchestration use cases.
Here is a strawman design that satisfies your requirements:
To subscribe, any new subscriber sends an event to a workflow that uses the TopicName as its workflowID. If a workflow with the given ID doesn't exist, it is started automatically.
To unsubscribe, a subscriber sends another signal.
When no subscribers are left, the workflow exits.
The publisher sends an event to the workflow to deliver to subscribers.
The workflow delivers the event to the subscribers using an activity.
If no workflow with the given TopicName is running, publishing an event to it fails.
Cadence offers a lot of other advantages over using queues for task processing.
Built-in exponential retries with unlimited expiration interval
Failure handling. For example, it allows executing a task that notifies another service if both updates couldn't succeed during a configured interval.
Support for long-running heartbeating operations
Ability to implement complex task dependencies. For example, to implement chaining of calls or compensation logic in case of unrecoverable failures (SAGA)
Gives complete visibility into the current state of the update. For example, when using queues, all you know is whether there are some messages in a queue, and you need an additional DB to track the overall progress. With Cadence every event is recorded.
Ability to cancel an update in flight.
Distributed CRON support
See the presentation that goes over Cadence programming model.

How to use pub/sub pattern in Event Sourcing & CQRS

I am developing microservices using Event Sourcing with the CQRS pattern. In my case, if a user is deleted or updated in one service, I want that service to publish an event, and another service to subscribe to it and delete the entries regarding that user from its own database as well.
How can I use the pub/sub pattern in Event Sourcing, and which event store can be used for it? I have seen some people using Azure Tables, but how can that be used for pub/sub?

Which Event store can be used for it ...?
If you have the luxury of choosing the technology to use, then I would suggest you start out by looking into Greg Young's Event Store.
Yes, that's the same guy that introduced CQRS to the world.
(You may also want to review his talk on polyglot data, which includes discussion of pull vs push based models).

how can I use pub/sub pattern in Event Sourcing
This use case fits event sourcing naturally, and if you implement it properly, the question of notifications goes away by itself.
The best way to implement the interaction is through a common bus. Each microservice that implements your aggregates or projections is connected to a single logical bus, is subscribed to all events, and can also publish any events to it.
Of course, if the system is under heavy load, some optimization is necessary: for example, introduce namespaces for events and tell the bus broker which events must be delivered to which microservice. Also, if some information is private to a microservice, it makes sense to create a private channel on the bus; however, this is not covered by event sourcing theory, just as validation between aggregates is not.
Thanks to the common bus, you also get reactivity for the system's clients (for example, browsers) "as a gift". However, clients should subscribe only to events, not to projections or aggregate state. If server events do not map one-to-one to client events, you can introduce an intermediate entity to translate and broadcast them, but that is no longer the responsibility of the event store.
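
For the user-deletion case from the question, the common bus could be sketched roughly as follows. This is an in-process toy, not a real broker: EventBus, UserService and OrderService are hypothetical names, and delivery is a synchronous function call where a real bus would deliver asynchronously between processes.

```python
from collections import defaultdict


class EventBus:
    """In-process stand-in for the common bus (hypothetical, synchronous)."""

    def __init__(self):
        self.handlers = defaultdict(list)  # event type -> list of handlers

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self.handlers[event_type]:
            handler(payload)


class UserService:
    """Owns users; records what happened as events and publishes them."""

    def __init__(self, bus):
        self.bus = bus
        self.events = []  # append-only event log of this service

    def delete_user(self, user_id):
        event = ("UserDeleted", {"user_id": user_id})
        self.events.append(event)
        self.bus.publish(*event)


class OrderService:
    """Keeps its own data and reacts to events from other services."""

    def __init__(self, bus):
        self.orders = {1: "u1", 2: "u2", 3: "u1"}  # order id -> user id
        bus.subscribe("UserDeleted", self.on_user_deleted)

    def on_user_deleted(self, payload):
        deleted = payload["user_id"]
        self.orders = {o: u for o, u in self.orders.items() if u != deleted}


bus = EventBus()
users = UserService(bus)
orders = OrderService(bus)
users.delete_user("u1")
```

The user service never references the order service; the only shared contract is the event type and its payload.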

How should Event Sourcing event handlers be hosted to construct a read model?

There are various example applications and frameworks that implement a CQRS + Event Sourcing architecture and most describe use of an event handler to create a denormalized view from domain events stored in an event store.
One example of hosting this architecture is as a web api that accepts commands to the write side and supports querying the denormalized views. This web api is likely scaled out to many machines in a load balanced farm.
My question is where are the read model event handlers hosted?
Possible scenarios:
1. Hosted in a single Windows service on a separate host.
If so, wouldn't that create a single point of failure? This probably complicates deployment too but it does guarantee a single thread of execution. Downside is that the read model could exhibit increased latency.
2. Hosted as part of the web api itself.
If I'm using EventStore, for example, for the event storage and event subscription handling, will multiple handlers (one in each web farm process) be fired for each single event and thereby cause contention in the handlers as they try to read/write to their read store? Or are we guaranteed for a given aggregate instance all its events will be processed one at a time in event version order?
I'm leaning towards scenario 2 as it simplifies deployment and also supports process managers that need to also listen to events. Same situation though as only one event handler should be handling a single event.
Can EventStore handle this scenario? How are others handling processing of events in eventually consistent architectures?
EDIT:
To clarify, I'm talking about the process of extracting event data into the denormalized tables rather than the reading of those tables for the "Q" in CQRS.
I guess what I'm looking for are options for how we "should" implement and deploy the event processing for read models/sagas/etc that can support redundancy and scale, assuming of course the processing of events is handled in an idempotent way.
I've read of two possible solutions for processing data saved as events in an event store but I don't understand which one should be used over another.
Event bus
An event bus/queue is used to publish messages after an event is saved, usually by the repository implementation. Interested parties (subscribers), such as read models, or sagas/process managers, use the bus/queue "in some way" to process it in an idempotent way.
If the queue is pub/sub this implies that each downstream dependency (read model, sagas, etc) can only support one process each to subscribe to the queue. More than one process would mean each processing the same event and then competing to make the changes downstream. Idempotent handling should take care of consistency/concurrency issues.
If the queue is competing consumer we at least have the possibility of hosting subscribers in each web farm node for redundancy. Though this requires a queue for each downstream dependency; one for sagas/process managers, one for each read model, etc, and so the repository would have to publish to each for eventual consistency.
Subscription/feed
A subscription/feed where interested parties (subscriber) read an event stream on demand and get events from a known checkpoint for processing into a read model.
This looks great for recreating read models if necessary. However, as per the usual pub/sub pattern, it would seem only one subscriber process per downstream dependency should be used. If we register multiple subscribers for the same event stream, one in each web farm node for example, they will all attempt to process and update the same respective read model.

In our project we use subscription-based projections. The reasons for this are:
Committing to the write side must be transactional. If you use two pieces of infrastructure (event store and message bus), you have to start using DTC; otherwise you risk your events being saved to the store but not published to the bus, or the other way around, depending on your implementation. DTC and two-phase commits are nasty things and you do not want to go this way.
Events are usually published in the message bus anyway (we do it via subscriptions too) for event-driven communication between different bounded contexts. If you use message subscribers to update your read model, then when you decide to rebuild the read model, your other subscribers will get these messages too, and this will bring the system to an invalid state. I think you have thought about this already when saying you must only have one subscriber for each published message type.
Message bus consumers have no guarantee on message order, and this can turn your read model into a mess.
Message consumers usually handle retries by sending the message back to the queue, usually to the end of the queue. This means that your events can become heavily out of order. In addition, after some number of retries, a message consumer usually gives up on the poison message and puts it into a DLQ. If this were your projection, one update would be ignored whilst others would be processed, leaving your read model in an inconsistent (invalid) state.
Considering these reasons, we have single-threaded subscription-based projections that can do whatever they need. You can have different types of projections, each with its own checkpoint, subscribing to the event store using catch-up subscriptions. We host them in the same process as many other things for the sake of simplicity, but this process only runs on one machine. Should we want to scale out this process, we will have to take the subscriptions/projections out. This can easily be done, since this part has virtually no dependencies on other modules except the read model DTOs themselves, which can be shared as an assembly anyway.
By using subscriptions you always project events that have already been committed. If something goes wrong with the projections, the write side is definitely the source of truth and remains so; you just need to fix the projection and run it again.
We have two separate subscriptions: one for projecting to the read model and another for publishing events to the message bus. This construct has proven to work very well.
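
The checkpoint idea can be sketched in a few lines. This is not the EventStore client API: the event log is a plain Python list, and the Projection class, its checkpoint field and the event tuples are assumptions made up for illustration.

```python
class Projection:
    """Single-threaded projection that remembers how far it has read."""

    def __init__(self):
        self.checkpoint = 0  # number of events already applied
        self.balances = {}   # the read model

    def catch_up(self, event_log):
        # Read only the events after the checkpoint, in stored order.
        for event in event_log[self.checkpoint:]:
            self.apply(event)
            self.checkpoint += 1

    def apply(self, event):
        kind, account, amount = event
        sign = 1 if kind == "Deposited" else -1
        self.balances[account] = self.balances.get(account, 0) + sign * amount


event_log = [("Deposited", "acc-1", 100), ("Withdrawn", "acc-1", 30)]

projection = Projection()
projection.catch_up(event_log)
event_log.append(("Deposited", "acc-2", 5))
projection.catch_up(event_log)  # only the new event is applied
projection.catch_up(event_log)  # nothing new, so nothing is counted twice

rebuilt = Projection()          # rebuilding means replaying from checkpoint 0
rebuilt.catch_up(event_log)
```

Because the projection pulls from the store, rebuilding a read model replays events only for that projection; nothing is re-sent to other subscribers.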

Specifically for EventStore, they now have competing consumers. These are server-based subscriptions where many clients can subscribe to a subscription group but only one client gets each message.
It sounds like that is what you are after: each node in the farm can subscribe to the subscription group, and the node that receives the message does the projection.

Handling multiple event dependency in event-driven architecture

What would be best practice if you have an event-driven architecture and a service subscribing to events has to wait for multiple events (of the same kind) before proceeding with creating the next event in the chain?
An example would be a book order handling service that has to wait for each book in the order to have been handled by the warehouse before creating the event that the order has been picked so that the shipping service (or something similar) picks up the order and starts preparing for shipping.

Another useful pattern besides the Aggregator that Tom mentioned is the saga pattern (a mini workflow).
I've used it before with a messaging library called NServiceBus to coordinate multiple messages that are correlated with each other.
The pattern is very useful and fits nicely for long-running processes, even if your correlated messages are of different types, like OrderStarted, OrderLineProcessed, OrderCompleted.

You can use the Aggregator pattern, also called Parallel Convoy.
Essentially, you need some way of identifying the messages that need to be aggregated, and of knowing when the aggregated set as a whole has been received, so that processing can start.
Without going out and buying the book*, the Apache Camel integration platform website has some nice resources on implementing the aggregator pattern. While this is obviously specific to Camel, you can see what kind of things are involved.
* Disclaimer: I am not affiliated in any way with Addison-Wesley or any of the authors of the book.
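
For the book order example from the question, the two ingredients (a correlation key and a completeness check) might look like this. It is a framework-free sketch, not Camel's aggregator: OrderAggregator and its method names are hypothetical, the order id is the correlation key, and the set of books still outstanding is the completeness condition.

```python
class OrderAggregator:
    """Collects per-book events until every book of an order is handled."""

    def __init__(self, on_complete):
        self.on_complete = on_complete
        self.pending = {}  # order id -> set of books still to be picked

    def order_placed(self, order_id, books):
        self.pending[order_id] = set(books)

    def book_picked(self, order_id, book):
        remaining = self.pending.get(order_id)
        if remaining is None:
            return  # unknown or already completed order: ignore
        remaining.discard(book)  # discard makes duplicate events harmless
        if not remaining:
            del self.pending[order_id]
            self.on_complete(order_id)  # emit the "order picked" event


picked_orders = []
aggregator = OrderAggregator(on_complete=picked_orders.append)

aggregator.order_placed("o-1", ["dune", "emma"])
aggregator.book_picked("o-1", "dune")
aggregator.book_picked("o-1", "dune")   # duplicate delivery
before_last_book = list(picked_orders)  # still empty at this point
aggregator.book_picked("o-1", "emma")   # last book: order is complete
aggregator.book_picked("o-1", "emma")   # late duplicate after completion
```

A production version would also need to persist the pending state and decide what to do with orders that never complete, for example with a timeout.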

Implementing a message bus using ZeroMQ

I have to develop a message bus for processes to send messages to and receive messages from each other. Currently, we are running on Linux with a view to porting to other platforms later.
For this, I am using ZeroMQ over TCP. The pattern is PUB-SUB with a forwarder. My bus runs as a separate process and all clients connect to SUB port to receive messages and PUB to send messages. Each process subscribes to messages by a unique tag. A send call from a process sends messages to all. A receive call will fetch that process the messages marked with the tag of that process. This is working fine.
Now I need to wrap the ZeroMQ stuff. My clients only need to supply a unique tag. I need to maintain a global list of tags vs. ZeroMQ context and socket details. When a client calls initialize_comms("name"); the bus needs to check that this name is unique and create the ZeroMQ contexts and sockets. Similarly, when a client calls receive("name"); the bus needs to fetch messages with that tag.
To summarize the problems I am facing;
Is there any way to achieve this using facilities provided by ZeroMQ?
Is ZeroMQ the right tool for this, or should I look for something like nanomsg?
Is PUB-SUB with forwarder the right pattern for this?
Or, am I missing something here?

Answers
Yes, ZeroMQ is capable of serving this need.
Yes. ZeroMQ is the right tool (or rather a powerful toolbox of low-latency components) for this. While nanomsg has a dedicated bus primitive, the core distributed logic can be built within the ZeroMQ framework.
Yes and no. PUB-SUB as described above may serve to emulate a "shout-cast" to the bus, building on the SUB-side effect of using subscription keys. The whole rest of the logic has to be rethought and designed so that the overall construction meets your plans (see below). Also bear in mind that early versions of ZeroMQ performed PUB/SUB subscription filtering of the incoming message stream on the receiver side, so large designs should be checked against traffic volumes, risk of flooding, and process inefficiency at scale.
Yes. ZeroMQ is a well-tuned foundation of primitive elements (as far as architecture is concerned, not its power and performance) on which to build smarter, more robust and almost linearly scalable Formal Communication Patterns. Do not get stuck on the PUB/SUB or PAIR primitives when sketching an architecture. Any design will remain poor if one forgets where the true power comes from.
A good place to start the next step towards a scalable and fault-resilient bus
The best next step, IMHO, is to get a more global view. That may sound complicated for the first few things one tries to code with ZeroMQ, but at least jump to page 265 of Code Connected, Volume 1, if you are not going to read step by step up to that point.
The fastest learning curve would be to first look at Fig. 60 Republishing Updates and Fig. 62 HA Clone Server pair for a possible high-availability approach, and then go back to the roots, elements and details.

Here is what I ended up designing, if anyone is interested. Thanks everyone for the tips and pointers.
I have a message bus implemented using ZeroMQ (and CZMQ) running as a separate process.
The pattern is PUBLISHER-SUBSCRIBER with a LISTENER. They are connected using a PROXY.
In addition, there is a ROUTER invoked using a newly forked thread.
These three endpoints run on TCP and are bound to predefined ports which the clients know of.
PUBLISHER accepts all messages from clients.
SUBSCRIBER sends messages with a unique tag to the clients that have subscribed to that tag.
LISTENER listens to all messages passing through. Currently, this is for logging and testing purposes.
ROUTER provides a separate comms channel to clients. Messages such as control commands are directed here so that they will not get passed downstream.
Clients connect to:
PUBLISHER to send messages.
SUBSCRIBER to receive messages. Subscription is using unique tags.
ROUTER to send commands (check tag uniqueness etc.)
I am still doing implementation so there may be unseen problems, but right now it works fine. Also, there may be a more elegant way but I didn't want to throw away the PUB-SUB thing I had built.
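
One detail worth keeping in mind for the tag uniqueness check: ZeroMQ SUB sockets match a subscription as a byte prefix of the message, so two distinct tags can still overlap if one is a prefix of the other. The sketch below is plain Python with no ZeroMQ calls. TagRegistry and delivered are hypothetical helpers showing the kind of check the ROUTER side might perform, on the assumption that each message starts with its tag.

```python
class TagRegistry:
    """Accepts a tag only if it cannot be confused with an existing one."""

    def __init__(self):
        self.tags = set()

    def register(self, tag: bytes) -> bool:
        if not tag:
            return False  # an empty subscription would match every message
        for existing in self.tags:
            # Reject exact duplicates and tags that are prefixes of each other.
            if existing.startswith(tag) or tag.startswith(existing):
                return False
        self.tags.add(tag)
        return True


def delivered(subscription: bytes, message: bytes) -> bool:
    """Prefix test of the kind a SUB socket applies to incoming messages."""
    return message.startswith(subscription)


registry = TagRegistry()
results = [registry.register(t) for t in (b"gps", b"gps2", b"imu", b"gps")]

# Why b"gps2" was rejected: a client subscribed to b"gps" would also
# receive messages that were meant for b"gps2".
leak = delivered(b"gps", b"gps2 fix=ok")
```

An alternative to rejecting overlapping tags is to end every tag with a delimiter that cannot appear inside a tag, so that no tag can be a prefix of another.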
