Akka persistent query reader as competing consumer

I'm experimenting with Akka and Persistence Query. The design below shows what I'm doing.
When I run multiple instances of the "persistence reader actor", every instance (#6 below) receives the same message. Ideally, only one of them would receive each message. Is it possible to implement "competing consumers" so that the same message is not processed by multiple persistence reader actors?
I did look at Akka's "smallest-mailbox-pool" as a way to implement competing consumers, but I wanted to make sure nothing is built into the persistence reader plugins before handling competing consumers in my own code.
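For reference, the routing idea behind a "smallest-mailbox" pool can be sketched without Akka. The snippet below is a plain-Scala illustration only: it is not Akka's implementation and not a Persistence Query feature, and Worker, route and the String message type are hypothetical names chosen for the example. Each message goes to exactly one worker, the one with the fewest queued messages.

```scala
import java.util.concurrent.ConcurrentLinkedQueue

object SmallestMailboxSketch {
  // Hypothetical worker: only its mailbox matters for this illustration.
  final class Worker(val name: String) {
    val mailbox = new ConcurrentLinkedQueue[String]()
  }

  // Delivers msg to exactly one worker: the one with the fewest queued messages.
  def route(workers: Seq[Worker], msg: String): Worker = {
    val target = workers.minBy(_.mailbox.size)
    target.mailbox.offer(msg)
    target
  }
}
```

In an actual Akka setup the reader side would presumably run as a single query stream that hands events to such a pool, rather than as several independent readers that each see every event.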

Related

Why is there no Async / Non-Blocking Support in Kafka Streams API?

I am wondering why there is no non-blocking support via simple callbacks or Java's CompletableFuture or Scala Futures in the Kafka Stream API.
I do understand that ordering in a partition needs to be maintained, but across partitions I do not see the reason to achieve ordering by blocking an expensive resource: a thread.
For example, if my Kafka Streams app calls an external service (e.g. in mapValues), runs on one server, and has thousands of partitions, I will probably lock up the machine because all threads are blocked. An API method like mapValuesAsync() would be nice here, wouldn't it?
Also, a Kafka Streams app with several blocking operations in its flow would need far fewer partitions per topic to run into the problem. Wasting threads doesn't look like good API design here.
Is there any support planned for this? Or am I overlooking something?
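To make the wish concrete, here is a minimal sketch of what a mapValuesAsync-style helper could look like with plain Scala Futures. It is not part of the Kafka Streams API: mapValuesAsync, callExternalService and the in-memory Seq of records are all hypothetical, and the sketch leaves out offset tracking and fault tolerance entirely.

```scala
import scala.concurrent.{ExecutionContext, Future}

object AsyncMapSketch {
  // Hypothetical stand-in for a slow call to an external service.
  def callExternalService(value: String)(implicit ec: ExecutionContext): Future[String] =
    Future(value.toUpperCase)

  // Starts one asynchronous call per record without blocking the caller.
  // The resulting sequence keeps the order of the input records.
  def mapValuesAsync[K, V, R](records: Seq[(K, V)])(f: V => Future[R])(
      implicit ec: ExecutionContext): Future[Seq[(K, R)]] =
    Future.traverse(records) { case (k, v) => f(v).map(r => k -> r) }
}
```

The caller gets a single Future back and decides where, if anywhere, to wait for it.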

Async processing is generally hard in stream processing. It's not just about ordering, but also about fault-tolerance, tracking progress etc.
It's not impossible to support, though, and in fact there is already a design proposal for it: https://cwiki.apache.org/confluence/display/KAFKA/KIP-408%3A+Add+Asynchronous+Processing+To+Kafka+Streams
Feel free to help build this feature!

What is the essential difference between Akka and ThreadPool+BlockingQueue in ONE process?

We know Akka is one implementation of the actor pattern. Without Akka, I usually implement a simple actor pattern using ThreadPool+BlockingQueue: messages are offered to the queue, and the workers (actors) take messages from the queue and do their work. Of course, this kind of implementation only works within ONE process.
So, within one process:
What is the essential difference between the two (Akka vs. ThreadPool+BlockingQueue)?
Moreover, what is the difference between the actor pattern and the producer-consumer model?
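For concreteness, the ThreadPool+BlockingQueue setup described above might look roughly like the sketch below. QueueWorkersSketch, process and the "handled:" prefix are hypothetical names for the example, and the 5-second wait is an arbitrary bound.

```scala
import java.util.concurrent.{ConcurrentLinkedQueue, CountDownLatch, Executors, LinkedBlockingQueue, TimeUnit}

object QueueWorkersSketch {
  // Starts `workers` threads that take messages from one shared blocking queue
  // and returns the handled results as a set (completion order is not deterministic).
  def process(messages: Seq[String], workers: Int): Set[String] = {
    val queue   = new LinkedBlockingQueue[String]()
    val results = new ConcurrentLinkedQueue[String]()
    val done    = new CountDownLatch(messages.size)
    val pool    = Executors.newFixedThreadPool(workers)

    for (_ <- 1 to workers) pool.execute(new Runnable {
      def run(): Unit =
        try {
          while (true) {
            val msg = queue.take() // blocks this worker thread until a message arrives
            results.add("handled:" + msg)
            done.countDown()
          }
        } catch {
          case _: InterruptedException => () // shutdownNow() interrupts idle workers
        }
    })

    messages.foreach(m => queue.put(m)) // producer side: offer messages to the queue
    done.await(5, TimeUnit.SECONDS)     // wait until every message has been handled
    pool.shutdownNow()
    results.toArray(new Array[String](0)).toSet
  }
}
```

Note that every idle worker holds a thread that is parked inside take().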

The actor model is indeed quite similar to the producer-consumer (P-C) model.
However, if you use a blocking queue with P-C, your application won't be completely non-blocking and asynchronous. The promise of the actor model and Akka is that all messages are sent asynchronously and don't block the sender.
Another aspect is that managing these queues gets quite cumbersome once you have many consumers and producers. With actors you simply send a message and don't have to think about these low-level details. Under the hood, Akka keeps a message queue (the mailbox) per actor, with a dispatcher assigning actors to the thread pool to process those messages.
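The mailbox-per-actor idea can be illustrated with a toy version built on the JDK alone. This is a sketch under stated assumptions, not how Akka is implemented: MiniActor, tell and handler are hypothetical names, and supervision, routing and everything else Akka provides are left out.

```scala
import java.util.concurrent.{ConcurrentLinkedQueue, ExecutorService}
import java.util.concurrent.atomic.AtomicBoolean

// Toy "actor": one mailbox per instance, processed on a shared thread pool.
final class MiniActor[A <: AnyRef](pool: ExecutorService)(handler: A => Unit) {
  private val mailbox   = new ConcurrentLinkedQueue[A]()
  private val scheduled = new AtomicBoolean(false)

  // Non-blocking send: enqueue the message and return immediately.
  def tell(msg: A): Unit = {
    mailbox.offer(msg)
    trySchedule()
  }

  // Ensures at most one drain task per actor is running at any time.
  private def trySchedule(): Unit =
    if (!mailbox.isEmpty && scheduled.compareAndSet(false, true))
      pool.execute(new Runnable { def run(): Unit = drain() })

  // Handles queued messages one at a time, in arrival order.
  private def drain(): Unit =
    try {
      var msg = mailbox.poll()
      while (msg != null) {
        handler(msg)
        msg = mailbox.poll()
      }
    } finally {
      scheduled.set(false)
      trySchedule() // pick up messages that arrived while the drain was finishing
    }
}
```

Unlike a worker parked on a blocking queue, an idle MiniActor occupies no thread; it is only scheduled on the pool while its mailbox has messages.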
It's much easier to use Akka to build a highly performant and resilient application than to code it yourself. You get fault tolerance, resource management, location transparency, routing, distributed and asynchronous processing, and hierarchical supervision out of the box. Not to mention other frameworks and libraries that leverage these features to give you even more (Reactive Streams, Akka HTTP, etc.). Lots of patterns have already been developed for you, so why bother with your own?

How to recover messages in Akka Actors now that Durable Mailboxes are removed?

I was working with the latest version of Akka when I noticed that durable mailboxes are now removed from Akka.
I need to make sure that my messages are recovered upon a restart after a crash. Is there an alternative way to do this without durable mailboxes, or a custom implementation by someone else?
I also tried Akka Persistence, but it replays the messages, and I don't want to send the same messages twice in the event of a crash, given that all messages are expensive to process.

While this is not exactly a solution that works with Akka actors, it does solve the original problem in question.
Instead of using Akka here, I believe it's a better idea to use something like Kafka together with reactive streams, for example via akka/reactive-kafka.
A system like that is very good for persistence and offers strong semantics for preserving the message queue across a crash. This is much better than storing the messages to be processed somewhere yourself, and it generally performs better.
It does not have to be Kafka; any backend that can plug into a reactive stream (Akka's implementation or otherwise) will do.

Akka Persistence replays events that were created based on received commands. Events are generated from command messages after validation and shouldn't be able to create invalid actor states.
This means the originally received messages (commands) are not necessarily replayed; instead, you can persist events that are cheaper to apply when reconstructing the actor's state after a crash. In addition, you can use snapshots to recover state directly.
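The command/event distinction can be sketched in plain Scala. The snippet does not use the Akka Persistence API; ProcessOrder, OrderProcessed, State, handle and recover are hypothetical names, and the in-memory Seq stands in for the journal.

```scala
object EventReplaySketch {
  sealed trait Command
  final case class ProcessOrder(id: String, amount: Int) extends Command

  sealed trait Event
  final case class OrderProcessed(id: String, amount: Int) extends Event

  final case class State(total: Int = 0, seen: Set[String] = Set.empty) {
    // Applying an event is cheap: it only updates in-memory state.
    def applyEvent(e: Event): State = e match {
      case OrderProcessed(id, amount) => copy(total + amount, seen + id)
    }
  }

  // Command handling: validate first; only a valid, new command yields an event.
  // The expensive work would happen here, once, before the event is persisted.
  def handle(state: State, cmd: Command): Option[Event] = cmd match {
    case ProcessOrder(id, amount) if amount > 0 && !state.seen(id) =>
      Some(OrderProcessed(id, amount))
    case _ => None
  }

  // Recovery: start from a snapshot (or the empty state) and replay events only.
  def recover(journal: Seq[Event], snapshot: State = State()): State =
    journal.foldLeft(snapshot)((s, e) => s.applyEvent(e))
}
```

Recovery never calls handle, so the expensive command processing is not repeated when events are replayed.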
Edit:
As mentioned in the comments, it is true that only the state of the actor is persisted and survives the crash. This state only reflects the consumed messages and not those that still reside in the actor's mailbox.
However, instead of pushing messages to an actor, which would then be stored in a durable mailbox, an alternative might be for the 'recipient' to pull messages from a persistent actor that stores the list of messages as part of its state.
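A rough sketch of such a pull-based store follows, as an immutable state that a persistent actor might keep. WorkItem, WorkStore, pull, ack and recoverAfterCrash are hypothetical names; persisting the state changes themselves is assumed and not shown.

```scala
object PullWorkSketch {
  final case class WorkItem(id: Int, payload: String)

  // State a persistent actor might keep: work that has not been confirmed as done.
  final case class WorkStore(
      pending: Vector[WorkItem] = Vector.empty,
      inFlight: Map[Int, WorkItem] = Map.empty) {

    def add(item: WorkItem): WorkStore = copy(pending = pending :+ item)

    // A consumer asks for work; the item stays in inFlight until acknowledged.
    def pull: (Option[WorkItem], WorkStore) = pending.headOption match {
      case Some(item) =>
        (Some(item), copy(pending = pending.tail, inFlight = inFlight + (item.id -> item)))
      case None => (None, this)
    }

    // The consumer confirms completion; only then is the item forgotten.
    def ack(id: Int): WorkStore = copy(inFlight = inFlight - id)

    // After a crash, unacknowledged items are offered again.
    def recoverAfterCrash: WorkStore =
      copy(pending = inFlight.values.toVector.sortBy(_.id) ++ pending, inFlight = Map.empty)
  }
}
```

An item that was in flight at the time of the crash is handed out again, so this gives at-least-once rather than exactly-once processing; expensive work would still need an idempotency check on the consumer side.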
UntypedPersistentActorWithAtLeastOnceDelivery, part of Akka Persistence, offers another possibility in which the sender takes care of persisting messages.
I realize that these are not drop-in replacements for durable mailboxes, as they require rethinking the system. Having the consumers pull work has worked for me so far. Initially we also considered message queue products (RabbitMQ with durable queues), but since our initial work items come from a database, we can deal with an Akka crash without durable messages.

Scala integration with Rabbit MQ

I have a back-end Scala application that needs to integrate with RabbitMQ. The back-end Scala app executes long running tasks asynchronously. Messages to execute the tasks are queued into RabbitMQ by a web client. The back-end application then consumes each of these messages, executing the corresponding long-running tasks.
Should the Scala app consume messages directly from RabbitMQ and simply process the corresponding tasks using Futures? Or is it better to use Akka actors to receive these messages from RabbitMQ and then execute the long-running tasks?
What are the pros and cons of each approach?

Futures sound like a simpler approach for your use case combined with the RabbitMQ Java client.
My model for choosing actors vs. futures is: prefer futures, and switch to actors when I feel I have a good use case for them (see "Good use case for Akka" for some examples). For example, if you were trying to divide and conquer the batch workloads (as the linked answer states), actors may serve your purposes well.
Use the RabbitMQ Java examples below as a starting point, modifying them to do the work in futures so that the thread polling the work queue is not blocked. I included links to both the work queue and RPC examples in case you need to return a response (RabbitMQ is good at this, as it has a concept of correlationId built in).
Java RabbitMQ examples:
Work Queues
Remote procedure call (RPC)
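As a rough illustration of "doing the work in futures", the sketch below hands each delivery to a dedicated pool so the receiving thread returns immediately. It does not use the RabbitMQ client: onDelivery is a hypothetical stand-in for the client's delivery callback, longRunningTask is a placeholder for the real work, and the pool size of 4 is arbitrary.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Future}

object FutureConsumerSketch {
  // Dedicated pool for long-running tasks, separate from the thread that receives deliveries.
  private val workerPool = Executors.newFixedThreadPool(4)
  implicit val workerEc: ExecutionContext = ExecutionContext.fromExecutorService(workerPool)

  // Placeholder for the real long-running task.
  def longRunningTask(body: String): String = body.reverse

  // Hypothetical delivery callback: starts the task on the worker pool and returns at once.
  def onDelivery(body: String): Future[String] = Future(longRunningTask(body))

  def shutdown(): Unit = workerPool.shutdown()
}
```

With manual acknowledgements, one option would be to acknowledge a message only when its future completes, so that unfinished work is redelivered after a crash.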

How to implement an actor topology in AKKA?

I need to implement various kinds of topologies (line, mesh, etc.) in an Akka actor system. Are there built-in libraries for such topologies in Akka? If not, how can I build them? Also, how do we connect two or more actors in Akka, since that is the basis for creating any topology?
Thanks

I don't think that topologies make any sense in Akka.
As you have noticed, the basic tool for creating topologies is a connection between two entities. But no such connection exists in Akka, because any actor can send a message to any other actor. This is the foundation of the actor model, after all.
Of course, you can try and emulate some topology manually, but I cannot see any use for it.
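If one did want to emulate a topology manually, the usual approach is to give each node references to its neighbours only. Below is a minimal plain-Scala sketch of a line topology; Node, receive and line are hypothetical names, and with Akka the neighbour would be an ActorRef handed to the actor rather than an object reference.

```scala
object LineTopologySketch {
  // Each node knows only its successor in the line, if there is one.
  final class Node(val name: String, next: Option[Node]) {
    // Records that this node saw the message, then forwards it down the line.
    def receive(msg: String, visited: List[String] = Nil): List[String] = {
      val seen = visited :+ s"$name:$msg"
      next match {
        case Some(n) => n.receive(msg, seen)
        case None    => seen
      }
    }
  }

  // Builds the line back to front and returns its head node (None for an empty list).
  def line(names: List[String]): Option[Node] =
    names.foldRight(Option.empty[Node])((name, next) => Some(new Node(name, next)))
}
```

A mesh would follow the same idea, with each node holding a collection of neighbours instead of a single successor.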
