Inter-executor communication in a Storm worker

Suppose I have a topology with one spout S feeding a bolt A, and suppose I run this topology in a single worker.
My questions:
Does the send thread of the executor running S move tuples from S's outgoing queue directly to A's incoming queue, or does it move them to the shared LMAX Disruptor queue maintained by the worker?
If the latter is the correct answer, which thread then moves the tuples emitted by S to the incoming queue of A?
Thank you very much!

Related

Opening Kafka streams dynamically from a queue consumer

We have a use case where, based on a work item arriving on a worker queue, we need to use the message metadata to decide which Kafka topic to stream our data from. We would have maybe fewer than 100 worker nodes deployed, and each worker node can have a configurable number of threads to receive messages from the queue. So if a worker has "n" threads, we could end up opening Kafka streams to "n" different topics (n is usually less than 10).
Once the worker is done processing the message, we also need to close the stream.
The worker can receive the next message once it has acked the first one, at which point I need to open a Kafka stream for another topic.
Also, every Kafka stream needs to scan all the partitions (around 5-10) of the topic to filter by a certain attribute.
Can a flow like this work for Kafka Streams, or is this not an optimal approach?
I am not sure I fully understand the use case, but it seems to be a "simple" copy-data-from-topic-A-to-topic-B use case, i.e., no data processing/modification. The logic that decides what to copy from input to output topic seems complex, though, and thus Kafka Streams (i.e., Kafka's stream processing library) might not be the best fit, as you need more flexibility.
However, using plain KafkaConsumers and KafkaProducers should allow you to implement what you want.
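
For illustration, here is a minimal sketch of that plain-client approach, assuming string keys/values and that the filter attribute is carried in the record key; the class, method, and parameter names are hypothetical, not part of any existing codebase:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class DynamicTopicCopier {

    // Called once per work item: open a consumer on the topic named in the
    // item's metadata, filter by attribute, forward matches, then close.
    static void processWorkItem(String sourceTopic, String targetTopic,
                                String attributeValue,
                                Properties consumerProps, Properties producerProps) {
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
             KafkaProducer<String, String> producer = new KafkaProducer<>(producerProps)) {
            consumer.subscribe(Collections.singletonList(sourceTopic)); // all partitions get scanned
            // Single poll as a simplification; a real worker would loop until done.
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(5));
            for (ConsumerRecord<String, String> record : records) {
                if (attributeValue.equals(record.key())) { // filter by attribute (assumed to be the key)
                    producer.send(new ProducerRecord<>(targetTopic, record.key(), record.value()));
                }
            }
        } // try-with-resources closes the "stream" once the work item is done
    }
}

One caveat worth weighing: opening and closing a consumer per work item is relatively expensive (group coordination, TCP setup), so if n stays small, pooling consumers per topic may be worth considering.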

How does Storm (with multiple worker nodes) guarantee message processing while reading from a Kafka topic

I have a Storm setup that picks up messages from a Kafka topic, then processes and persists them.
I want to understand how Storm guarantees message processing in such a scenario.
Consider the below scenario:
I have configured multiple supervisors+workers for a Storm cluster.
The KafkaSpout reads a message from the topic and passes it on to a bolt. The bolt acks upon completion and the spout moves forward to the next message.
I have 2 supervisors running, each running 3 workers.
From what I understand, each worker on every supervisor is capable of processing a message.
So, at any given time, 6 messages are being processed in parallel in the Storm cluster.
What if the second message fails, either due to worker shutdown or due to supervisor shutdown, while ZooKeeper is already pointing to the 7th message for the consumer group?
In such a scenario, how will the second message get processed?
I guess there is some misunderstanding. The following claims seem to be wrong:
The bolt acks upon completion and the spout moves forward to the next message.
at any given time, 6 messages are being processed in parallel in the Storm cluster
=> A spout does not wait for acks; it fetches tuples over and over again at maximum speed, regardless of the processing speed of the bolts, as long as new messages are available in Kafka. (Or did you limit the number of tuples in flight via max.spout.pending?) Thus, many messages are processed in parallel (even if only #executors are given to a UDF, many other messages are buffered in internal Storm queues).
As far as I know (but I am not 100% sure), KafkaSpout "orders" the incoming acks and only moves the offset forward once all consecutive acks are available, i.e., message 7 is not acked to Kafka if the Storm ack of message 6 is not there yet. Thus, KafkaSpout can re-emit message 6 if it fails. Recall that Storm does not give any ordering guarantees.
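
For reference, a minimal sketch of capping tuples in flight programmatically via Storm's Config (the value 500 is illustrative only, not a recommendation):

import org.apache.storm.Config;

public class PendingConfig {
    public static Config build() {
        Config conf = new Config();
        // topology.max.spout.pending: cap on unacked tuples per spout task.
        // Without it the spout keeps fetching regardless of bolt speed;
        // it only applies when tuples are emitted with a message id.
        conf.setMaxSpoutPending(500);
        return conf;
    }
}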

Storm max spout pending

This is a question about how Storm's max spout pending works. I currently have a spout that reads a file and emits a tuple for each line in the file (I know Storm is not the best solution for dealing with files, but I do not have a choice for this problem).
I set topology.max.spout.pending to 50k to throttle how many tuples get into the topology. However, I see this setting having no effect: all records in the file are emitted every time. My guess is that this is due to a loop in my nextTuple() method that emits all records in the file.
My question is: does Storm just stop calling nextTuple() for the spout task when topology.max.spout.pending is reached? Does this mean I should only emit one tuple every time the method is called?
Exactly! Storm can only throttle your spout through the nextTuple() call, so if you emit everything on the first call, there is no way for Storm to throttle your spout.
The Storm developers recommend emitting a single tuple per nextTuple() call. The Storm framework will then throttle your spout as needed to meet the "max spout pending" requirement. If you're emitting a high number of tuples, you can batch your emits to at most a tenth of your max spout pending, to give Storm the chance to throttle.
Storm topologies have a max spout pending parameter. The max spout pending value for a topology can be configured via the “topology.max.spout.pending” setting in the topology configuration yaml file. This value puts a limit on how many tuples can be in flight, i.e. have not yet been acked or failed, in a Storm topology at any point of time. The need for this parameter comes from the fact that Storm uses ZeroMQ to dispatch tuples from one task to another task. If the consumer side of ZeroMQ is unable to keep up with the tuple rate, then the ZeroMQ queue starts to build up. Eventually tuples timeout at the spout and get replayed to the topology thus adding more pressure on the queues. To avoid this pathological failure case, Storm allows the user to put a limit on the number of tuples that are in flight in the topology. This limit takes effect on a per spout task basis and not on a topology level. For cases when the spouts are unreliable, i.e. they don’t emit a message id in their tuples, this value has no effect. (source)
One of the problems that Storm users continually face is in coming up with the right value for this max spout pending parameter. A very small value can easily starve the topology and a sufficiently large value can overload the topology with a huge number of tuples to the extent of causing failures and replays. Users have to go through several iterations of topology deployments with different max spout pending values to find the value that works best for them.
One solution is to build the input queue outside the nextTuple method, so that the only thing nextTuple does is poll the queue and emit. If you are processing multiple files, your nextTuple method should also check whether the result of polling the queue is null, and if so, atomically reset the source file that is populating your queue.
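
A minimal sketch of that pattern, assuming a Storm 2.x BaseRichSpout with a background reader (not shown) filling the queue; the class name and the file-switching helper are hypothetical:

import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;

public class QueueBackedSpout extends BaseRichSpout {
    private SpoutOutputCollector collector;
    private Queue<String> lines;

    @Override
    public void open(Map<String, Object> conf, TopologyContext context,
                     SpoutOutputCollector collector) {
        this.collector = collector;
        this.lines = new ConcurrentLinkedQueue<>();
        // A background reader (not shown) fills `lines` from the current file.
    }

    @Override
    public void nextTuple() {
        String line = lines.poll();
        if (line == null) {
            // Queue drained: switch to the next source file here
            // (hypothetical helper, e.g. advanceToNextFile()).
            return; // emit nothing this call; Storm will call nextTuple() again
        }
        // One tuple per nextTuple() call, so max.spout.pending can throttle us.
        // The line doubles as the message id, making the tuple ackable.
        collector.emit(new Values(line), line);
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("line"));
    }
}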

Is it possible to dispatch no more than N number of messages from the queue at a given time with distributed consumers?

I have a distributed system that reads messages from RabbitMQ. In my situation I need to process no more than N msgs/s.
For example: imagine a service A that sends text messages (SMS). This service can only handle 100 msgs/s. My system has multiple consumers that read data from RabbitMQ. Each message needs to be processed first and then sent to service A. Processing time is always different.
So the question:
Is it possible to configure the queue to dispatch no more than 100 msgs/s to multiple consumers?
You can use the prefetch_count parameter of the qos method to cap how many unacknowledged messages are outstanding per consumer, which bounds the throughput delivered to your consumers.
channel.basic_qos(100);
See also: http://www.rabbitmq.com/blog/2012/05/11/some-queuing-theory-throughput-latency-and-bandwidth/
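
For illustration, a minimal sketch with the RabbitMQ Java client; the queue name is hypothetical, and note that prefetch caps unacked messages in flight rather than enforcing an exact msgs/s rate:

import java.nio.charset.StandardCharsets;
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import com.rabbitmq.client.DeliverCallback;

public class ThrottledConsumer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        Connection connection = factory.newConnection();
        Channel channel = connection.createChannel();

        // At most 100 unacknowledged messages delivered to this consumer;
        // combined with manual acks, this caps the work in flight.
        channel.basicQos(100);

        DeliverCallback callback = (consumerTag, delivery) -> {
            String body = new String(delivery.getBody(), StandardCharsets.UTF_8);
            // ... process the message and forward it to service A ...
            channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
        };
        // autoAck=false so the prefetch limit actually applies
        channel.basicConsume("sms-work-queue", false, callback, consumerTag -> {});
    }
}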

How many messages can a queue hold?

In Celery, what is the upper bound on the number of messages in a queue?
How many messages can wait in a queue to be prefetched/consumed by a worker?
Queue length depends on the broker (and on message size). For example, if you are using RabbitMQ as the broker, you can expect millions of messages (I have seen hundreds of thousands in practice). You can do simple load testing using the RabbitMQ management plugin (monitor resources).
This thread can be helpful.