Source.queue recently gained an overload which specializes to OverflowStrategy.dropNew and avoids the async mechanism. Materializing it yields a BoundedSourceQueue[T] (compared to SourceQueueWithComplete[T] in the older version). The docs for the SourceQueueWithComplete variants of Source.queue make it clear that the materialized queue may only be used by a bounded number of concurrent producers:
The materialized SourceQueue may be used by up to maxConcurrentOffers concurrent producers.
The docs for the BoundedSourceQueue don't say anything about this. Is this constraint lifted for BoundedSourceQueue? Can it be used by any number of concurrent producers?
Technically, the SourceQueueWithComplete variant doesn't have the maxConcurrentOffers restriction if OverflowStrategy.dropNew is in effect.
However, because the result of offering an element to the SourceQueueWithComplete is communicated asynchronously, a producer that offers elements faster than it handles the resulting futures can overwhelm memory: asynchrony removes backpressure unless some other mechanism reintroduces it.
Because with dropNew it is known immediately whether the element was dropped, the result of an offer can be communicated synchronously (i.e., the producer is blocked until it handles or discards the result). This allows arbitrarily many producers without OOM risk. For this reason, if you're using the dropNew strategy, the BoundedSourceQueue version is recommended (i.e., only use SourceQueueWithComplete if some other strategy is in play), and the recommendation grows stronger as load grows higher.
Yes: for the BoundedSourceQueue variant, the only practical limit on the number of concurrent producers is the number of running threads.
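For illustration, here is a minimal sketch of the BoundedSourceQueue variant (buffer size, element counts, and thread count are arbitrary; it assumes an Akka version recent enough to have the overload):

```scala
import akka.actor.ActorSystem
import akka.stream.{BoundedSourceQueue, QueueOfferResult}
import akka.stream.scaladsl.{Sink, Source}

object BoundedQueueDemo extends App {
  implicit val system: ActorSystem = ActorSystem("demo")

  // The newer overload: no OverflowStrategy parameter, dropNew semantics built in.
  val queue: BoundedSourceQueue[Int] =
    Source.queue[Int](64)                        // buffer size is arbitrary here
      .to(Sink.foreach(n => println(s"consumed $n")))
      .run()

  // offer returns a plain QueueOfferResult rather than a Future, so each
  // producer learns the outcome immediately instead of accumulating
  // unhandled futures in memory.
  def produce(ids: Range): Unit = ids.foreach { n =>
    queue.offer(n) match {
      case QueueOfferResult.Enqueued => ()       // accepted
      case QueueOfferResult.Dropped  => ()       // buffer full: dropped, as dropNew dictates
      case other                     => println(s"stream unavailable: $other")
    }
  }

  // Arbitrarily many concurrent producers are safe:
  val producers = (0 until 8).map(i => new Thread(() => produce(i * 1000 until (i + 1) * 1000)))
  producers.foreach(_.start())
  producers.foreach(_.join())
  queue.complete()
}
```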
Related
According to this blog post:
If the source of the stream polls an external entity for new messages and the downstream processing is non-uniform, inserting a buffer can be crucial to realizing good throughput. For example, a large buffer inserted after the Kafka Consumer from the Reactive Streams Kafka library can improve performance by an order of magnitude in some situations. Otherwise, the source may not poll Kafka fast enough to keep the downstream saturated with work, with the source oscillating between backpressuring and polling Kafka.
The documentation for the Alpakka Kafka connector doesn't mention that, so I was wondering if it makes sense to use a buffer in this case. Also, does the same apply to Kafka sinks (should I add a buffer before the sink)?
...I was wondering if it makes sense to use a buffer in this case
Consider the following segment from the blog post you quoted:
...the downstream processing is non-uniform....
One of the points of that section of the post is to illustrate the similar effects that a user-defined buffer and an asynchronous boundary can have on a stream. The default behavior, in which there are no buffers or asynchronous boundaries, is to enable operator fusion, which runs a stream in a single actor. This essentially means that for every Kafka message that is consumed, the message must go through the entire pipeline of the stream, from source to sink, before the next message goes through the pipeline. In other words, a message m2 will not go through the pipeline until the preceding message m1 is finished processing.
If the processing that occurs downstream from a Kafka connector source is "non-uniform" (i.e., it can take varying amounts of time: sometimes the processing happens quickly, sometimes it takes a while), then introducing a buffer or an asynchronous boundary could improve the overall throughput. This is because a buffer or asynchronous boundary can allow the source to continue consuming Kafka messages even if the downstream processing happens to take a long time. That is, if m1 takes a long time to process, the source could consume messages m2, m3, etc. (until the buffer is full), without waiting for m1 to finish. As Colin Breck states in his post:
The buffer improves performance by decoupling stages, allowing the upstream or downstream to continue to process elements, on average, even if one of them is busy processing a relatively expensive workload.
This potential performance boost doesn't apply for all situations. Again quoting Breck:
Similar to the async method discussed in the previous section, it should be noted that inserting buffers indiscriminately will not improve performance and simply consume additional resources. If adjacent workloads are relatively uniform, the addition of a buffer will not change the performance, as the overall performance of the stream will simply be dominated by the slowest processing stage.
One obvious way to determine whether using a buffer (i.e., .buffer) makes sense in your case is to try it. You might also try adding an asynchronous boundary (i.e., .async) instead. Compare the following three approaches: (1) the default fused behavior without buffering, (2) .buffer, and (3) .async, and see which one results in the best performance; a sketch of the three variants follows.
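Here is a rough sketch of the three variants, with the Kafka source replaced by a synthetic one and a deliberately non-uniform workload (buffer size and timings are arbitrary; time each variant separately rather than running all three at once):

```scala
import akka.actor.ActorSystem
import akka.stream.OverflowStrategy
import akka.stream.scaladsl.{Sink, Source}

object BufferVsAsync extends App {
  implicit val system: ActorSystem = ActorSystem("compare")

  // A synthetic stand-in for a Kafka source, and a non-uniform workload:
  val source = Source(1 to 10000)
  def work(n: Int): Int = { Thread.sleep(if (n % 100 == 0) 50 else 0); n }

  // (1) default fused behavior: one element traverses the whole pipeline at a time
  val fused = source.map(work).runWith(Sink.ignore)

  // (2) explicit buffer: the source can run up to 256 elements ahead of `work`
  val buffered = source.buffer(256, OverflowStrategy.backpressure).map(work).runWith(Sink.ignore)

  // (3) async boundary: source and `work` run in separate actors with a small internal buffer
  val async = source.async.map(work).runWith(Sink.ignore)
}
```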
I frequently see queues in software architecture, especially ones described as "scalable", with the Actor model from Akka.io as a prominent representative. But how can a queue be scalable if we have to synchronize putting messages into it (and therefore operate single-threaded rather than multi-threaded) and again synchronize taking messages out (to ensure each message is taken exactly once)? It gets even more complicated when those messages can change the state of the (actor) system: in that case, even after a message is taken off the queue, it cannot be load-balanced but must still be processed in a single thread.
1. Is it correct that putting messages into a queue must be synchronized?
2. Is it correct that taking messages out of a queue must be synchronized?
3. If 1 or 2 is correct, then how is the queue scalable? Doesn't synchronization onto a single thread immediately create a bottleneck?
4. How can an (actor) system be scalable if it is stateful?
5. Does a stateful actor/bean mean that I have to process messages in a single thread and in order?
6. Does statefulness mean that I have to have a single copy of the bean/actor for the entire system?
7. If 6 is false, then how do I share this state between instances?
8. When I am trying to connect my new P2P node to the network, I believe I need some "server" that tells me who the other peers are; is that correct? When downloading a torrent, I have to connect to a tracker; if there is a "server", why do we call it P2P? And if that tracker goes down, I cannot connect to peers, is that correct?
9. Do synchronization and statefulness destroy scalability?
Is it correct that putting messages into a queue must be synchronized?
Is it correct that taking messages out of a queue must be synchronized?
No.
Assuming we're talking about Java's synchronized keyword: that is a reentrant mutual-exclusion lock on the object. Even multiple threads accessing that lock can be fast as long as contention is low. And each object has its own lock, so there are many locks, each of which only needs to be held for a short time; i.e., it is fine-grained locking.
But even if that locking did become a bottleneck, queues need not be implemented via mutual-exclusion locks at all. Lock-free and even wait-free queue data structures exist, which means the mere presence of a queue does not imply single-threaded execution; see the sketch below.
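As a minimal illustration, Java's ConcurrentLinkedQueue (used here from Scala) is a lock-free queue based on compare-and-swap; the element counts are arbitrary:

```scala
import java.util.concurrent.ConcurrentLinkedQueue

object LockFreeDemo extends App {
  // ConcurrentLinkedQueue uses CAS operations rather than a mutual-exclusion
  // lock: producers and consumers all make progress without blocking.
  // (java.lang.Integer is used so that poll() can signal "empty" with null.)
  val q = new ConcurrentLinkedQueue[Integer]()

  val producers = (1 to 4).map { p =>
    new Thread(() => (1 to 10000).foreach(n => q.offer(p * 100000 + n)))
  }
  val consumer = new Thread(() => {
    var seen = 0
    while (seen < 40000) {
      if (q.poll() ne null) seen += 1          // poll never blocks; null means empty
    }
    println(s"consumed $seen elements, no locks taken")
  })

  (producers :+ consumer).foreach(_.start())
  (producers :+ consumer).foreach(_.join())
}
```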
The rest of your questions should be asked separately because they are not about message queuing.
Of course you are correct that a single queue is not scalable. The point of the Actor Model is that you can have millions of actors and therefore distribute the load over millions of queues, provided you have that many cores in your cluster. Always remember what Carl Hewitt said:
One Actor is no actor. Actors come in systems.
Each single actor is a fully sequential and single-threaded unit of computation. The whole model is constructed such that it is perfectly suited to describe distribution, though; this means that you create as many actors as you need.
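A minimal sketch of what "as many actors as you need" can look like in Akka (the sharding-by-key scheme and all names here are made up for illustration): each actor is sequential, but load spreads across many independent mailboxes:

```scala
import akka.actor.{Actor, ActorSystem, Props}

// One sequential actor (and therefore one queue) per shard:
// unrelated keys never contend with each other.
class Counter extends Actor {
  private var n = 0
  def receive: Receive = { case "inc" => n += 1 }
}

object ManyQueues extends App {
  val system = ActorSystem("sharded")
  val shards = 1024
  val counters = Vector.tabulate(shards)(i => system.actorOf(Props[Counter](), s"counter-$i"))

  // Load is distributed over 1024 independent mailboxes instead of one queue:
  def increment(key: Long): Unit = counters((key % shards).toInt) ! "inc"

  (1L to 1000000L).foreach(increment)
}
```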
I am aware this is a very imprecise question and might be deemed inappropriate for Stack Overflow. Unfortunately, smaller applications (in terms of the number of actors) and 'tutorial-like' ones don't help me develop intuition about the overhead of message dispatch and a sweet spot for granularity between a 'Scala object' and a 'CORBA object'.
While almost certainly keeping, say, the state of a conversation with a client deserves an actor, in most real use cases it would involve conditional/parallel/alternative interactions modeled by many classes. This leaves the choice between treating actors as facades to quite complex services, similar to the justly retired EJB, or akin to Smalltalk objects, firing messages between each other willy-nilly whenever communication can possibly be implemented in an asynchronous manner.
Apart from the overhead of message passing itself, there will also be overhead involved with lifecycle management, and I am wary of potential problems caused by chained-restarts of whole subtrees of actors due to exceptions or other errors in their root.
For the sake of this question we may assume that vast majority of the communication happens within a single machine and network crossing is insignificant.
I am not sure what you mean by an "overhead of message passing itself".
When network/serialisation is not involved, the overhead is negligible: one side pushes a message onto a queue, the other reads it off.
Akka claims that it can go as fast as 50 million messages per second on a single machine. This means you wouldn't use actors merely as facades for complex subsystems. You would rather model them as much smaller "working units". They can be more complex than Smalltalk objects when convenient. You could have, say, a KafkaConsumerActor which internally uses other "normal" classes such as Connection, Configuration, etc.; these don't have to be Akka actors. But it is still small enough to be a simple working unit doing one simple thing: consuming a message and sending it somewhere (sketched below).
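A rough sketch of that shape of actor; every name here (Connection, Configuration, pollOne, the printer sink) is a hypothetical stand-in, and the polling is stubbed:

```scala
import akka.actor.{Actor, ActorRef, ActorSystem, Props}

// Hypothetical plain classes -- not actors -- used internally:
final case class Configuration(topic: String, groupId: String)
final class Connection(config: Configuration) {
  private val it = Iterator.from(1)
  def pollOne(): Option[String] = Some(s"${config.topic}-record-${it.next()}")  // stubbed
}

// The actor itself stays a small working unit: consume one record, pass it on.
class KafkaConsumerActor(config: Configuration, sink: ActorRef) extends Actor {
  private val connection = new Connection(config)
  def receive: Receive = {
    case "poll" =>
      connection.pollOne().foreach(sink ! _)
      self ! "poll"                            // keep consuming
  }
}

object KafkaSketch extends App {
  val system = ActorSystem("kafka-sketch")
  val printer = system.actorOf(Props(new Actor {
    def receive: Receive = { case r => println(r) }
  }))
  val consumer = system.actorOf(Props(new KafkaConsumerActor(Configuration("orders", "g1"), printer)))
  consumer ! "poll"                            // runs indefinitely, like a daemon
}
```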
50 million a second is really a lot.
The memory footprint is also extremely small: Akka itself claims you can have ~2.5 million actors for just 1 GB of heap. Compared to what a typical system does, that is, indeed, nothing.
As for lifecycle, creating an actor is not much heavier than creating a class instance and a mailbox, so I don't expect it to be significant.
That said, you typically don't have many actors in your system that handle one message and die. Normally you spawn actors that live much longer; an actor that calculates your mortgage repayments based on the parameters you provide has no reason to die at all.
Also Akka makes it very simple to use actor pools (different kinds of them).
So performance here is very tweakable.
The last point is that you should consider Akka's overhead in context. For example, if your system is doing database queries, serving or performing HTTP requests, or doing significant IO of some sort, the overhead of those activities likely makes Akka's overhead so insignificant that you wouldn't bother thinking about it. At ~50 million messages per second, one message costs on the order of 20 ns, so a single 50 ms round trip to the DB is equivalent to the overhead of ~2.5 million Akka messages. Does it matter?
So can you find an edge case scenario where Akka would force you to pay performance penalties? Probably. Akka is not a golden hammer (and nothing is).
But with all of the above in mind, you should ask whether Akka is really the performance bottleneck in your specific context, or whether you are wasting time on micro-optimisation.
When using Akka 1.3, do I need to worry about what happens when the actors producing messages produce them faster than the actors consuming them can process them?
Without any mechanism, in a long running process, the queue sizes would grow to consume all available memory.
The doc says the default dispatcher is the ExecutorBasedEventDrivenDispatcher.
This dispatcher has five queue configurations:
Bounded LinkedBlockingQueue
Unbounded LinkedBlockingQueue
Bounded ArrayBlockingQueue
Unbounded ArrayBlockingQueue
SynchronousQueue
and four overload policies:
CallerRuns
Abort
Discard
DiscardOldest
Is this the right mechanism to be looking at? If so, what are this dispatchers' default settings?
The dispatcher has a task queue. This is unrelated to your problem. In fact, you want as many mailboxes to be enqueued as possible.
What you might be looking for is: http://doc.akka.io/docs/akka/1.3.1/scala/dispatchers.html#Making_the_Actor_mailbox_bounded
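For example, in Akka 1.x the bound can be set in akka.conf; the keys below are from memory, so verify them against the linked documentation (a negative capacity means an unbounded mailbox):

```
akka {
  actor {
    default-dispatcher {
      # Bound each actor's mailbox to 1000 messages; a sender enqueueing into a
      # full mailbox waits up to the push timeout, after which the enqueue fails.
      mailbox-capacity = 1000
      mailbox-push-timeout-time = 10
    }
  }
}
```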
I'm coming from Java, where I'd submit Runnables to an ExecutorService backed by a thread pool. It's very clear in Java how to set limits to the size of the thread pool.
I'm interested in using Scala actors, but I'm unclear on how to limit concurrency.
Let's just say, hypothetically, that I'm creating a web service which accepts "jobs". A job is submitted with POST requests, and I want my service to enqueue the job then immediately return 202 Accepted — i.e. the jobs are handled asynchronously.
If I'm using actors to process the jobs in the queue, how can I limit the number of simultaneous jobs that are processed?
I can think of a few different ways to approach this; I'm wondering if there's a community best practice, or at least, some clearly established approaches that are somewhat standard in the Scala world.
One approach I've thought of is having a single coordinator actor which would manage the job queue and the job-processing actors; I suppose it could use a simple int field to track how many jobs are currently being processed. I'm sure there'd be some gotchas with that approach, however, such as making sure to decrement the count when an error occurs. That's why I'm wondering whether Scala already provides a simpler or more encapsulated approach to this.
BTW I tried to ask this question a while ago but I asked it badly.
Thanks!
I'd really encourage you to have a look at Akka, an alternative Actor implementation for Scala.
http://www.akkasource.org
Akka already has a JAX-RS[1] integration, and you could use that in concert with a LoadBalancer[2] to throttle how many actions can be done in parallel:
[1] http://doc.akkasource.org/rest
[2] http://github.com/jboner/akka/blob/master/akka-patterns/src/main/scala/Patterns.scala
You can override the system properties actors.maxPoolSize and actors.corePoolSize, which limit the size of the actor thread pool, and then throw as many jobs at the pool as your actors can handle (see the sketch below). Why do you think you need to throttle your reactions?
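A minimal sketch, under the assumption that the properties are set before the actor scheduler starts (the values and the job message are illustrative):

```scala
object ThrottledPool extends App {
  // Limit the scala.actors thread pool before any actor is started:
  System.setProperty("actors.corePoolSize", "4")
  System.setProperty("actors.maxPoolSize", "8")

  import scala.actors.Actor._

  // Many event-based actors can exist, but at most ~8 threads
  // will ever run their reactions concurrently:
  val workers = (1 to 100).map { i =>
    actor {
      loop {
        react { case job: String => println(s"actor $i processed $job") }
      }
    }
  }
  workers.head ! "first-job"
}
```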
You really have two problems here.
The first is keeping the thread pool used by actors under control. That can be done by setting the system property actors.maxPoolSize.
The second is runaway growth in the number of tasks that have been submitted to the pool. You may or may not be concerned with this one; however, it is entirely possible to trigger failure conditions such as out-of-memory errors, and in some cases potentially subtler problems, by generating too many tasks too fast.
Each worker thread maintains a deque of tasks, implemented as an array that the worker thread dynamically enlarges up to some maximum size. In 2.7.x the queue can grow quite large, and I've seen that trigger out-of-memory errors when combined with many concurrent threads. The maximum deque size is smaller in 2.8. The deque can also fill up.
Addressing this problem requires controlling how many tasks you generate, which probably means some sort of coordinator as you've outlined. I've encountered this problem when the actors that initiate a kind of data-processing pipeline are much faster than the ones later in the pipeline. To control the process, I usually have the actors later in the chain ping back the actors earlier in the chain every X messages, and have the earlier ones stop after X messages and wait for the ping-back (see the sketch below). You could also do it with a more centralized coordinator.
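A minimal sketch of that ping-back scheme using the old scala.actors library (the window size, message types, and the sleep standing in for slow work are all made up):

```scala
import scala.actors.Actor
import scala.actors.Actor._

case class Job(n: Int, from: Actor)
case object Ack

// Downstream stage: acknowledges every `window` jobs so upstream may continue.
class Downstream(window: Int) extends Actor {
  private var count = 0
  def act(): Unit = loop {
    react {
      case Job(n, from) =>
        Thread.sleep(1)                        // stand-in for slow processing
        count += 1
        if (count % window == 0) from ! Ack
    }
  }
}

// Upstream stage: sends `window` jobs, then waits for an Ack before sending more,
// so the downstream mailbox never holds more than `window` unprocessed jobs.
class Upstream(next: Actor, window: Int, total: Int) extends Actor {
  def act(): Unit = {
    var sent = 0
    def batch(): Unit = {
      val n = math.min(window, total - sent)
      (1 to n).foreach(i => next ! Job(sent + i, this))
      sent += n
    }
    batch()
    loopWhile(sent < total) {
      react { case Ack => batch() }
    }
  }
}

object PipelineDemo extends App {
  val down = new Downstream(window = 10); down.start()
  val up = new Upstream(down, window = 10, total = 100); up.start()
}
```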