I have recently been using Spark Streaming to process data in Kafka.
After the application starts and a few batches have finished, a persistent delay appears.
Most of the time, data processing completes within 1-5 seconds.
However, after several batches, every batch takes 41~45 seconds, and most of the delay occurs in stage 0, where the data is fetched from Kafka.
I happened to notice that the Kafka request.timeout.ms setting defaults to 40 seconds and changed it to 10 seconds.
I then restarted the application and observed that batches completed in 11 to 15 seconds.
The actual processing time is still 1-5 seconds, and I cannot understand this delay.
What is wrong?
My environment is as follows.
Spark Streaming: 2.1.0 (createDirectStream)
Kafka: 0.10.1
Batch interval: 20 s
request.timeout.ms: 10 s
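For reference, a minimal sketch of how request.timeout.ms can be passed to createDirectStream through the consumer params, assuming the Java API of the spark-streaming-kafka-0-10 integration (broker address, group id, and topic name are placeholders, not taken from the real job):

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class DirectStreamTimeoutExample {
  public static void main(String[] args) throws InterruptedException {
    SparkConf conf = new SparkConf().setAppName("kafka-direct-stream"); // master supplied via spark-submit
    // 20 s batch interval, as in the environment above
    JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(20));

    Map<String, Object> kafkaParams = new HashMap<>();
    kafkaParams.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");   // placeholder
    kafkaParams.put(ConsumerConfig.GROUP_ID_CONFIG, "my-streaming-group");      // placeholder
    kafkaParams.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    kafkaParams.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    // The setting discussed above; note that some client versions require
    // request.timeout.ms to be greater than session.timeout.ms and fetch.max.wait.ms.
    kafkaParams.put(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG, 10000);
    kafkaParams.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 8000);

    JavaInputDStream<ConsumerRecord<String, String>> stream = KafkaUtils.createDirectStream(
        jssc,
        LocationStrategies.PreferConsistent(),
        ConsumerStrategies.<String, String>Subscribe(Arrays.asList("my-topic"), kafkaParams)); // placeholder topic

    stream.foreachRDD(rdd -> System.out.println("records in this batch: " + rdd.count()));

    jssc.start();
    jssc.awaitTermination();
  }
}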
/////
The following capture is the graph when request.timeout.ms is set to 8 seconds.
I found the problem and solution:
Basically, when your executors read each Kafka partition, Spark Streaming caches what it reads from the partition (and the underlying Kafka consumer) in memory to improve read and processing performance.
If the topic is very large, the cache can overflow, and when the consumer then fetches from Kafka while the cache is full, it hits the timeout.
Solution: if you are on Spark 2.2.0 or higher, this is the fix (from the Spark documentation); it is a bug known to Spark and Cloudera:
The cache for consumers has a default maximum size of 64. If you expect to be handling more than (64 * number of executors) Kafka partitions, you can change this setting via spark.streaming.kafka.consumer.cache.maxCapacity.
If you would like to disable the caching for Kafka consumers, you can set spark.streaming.kafka.consumer.cache.enabled to false. Disabling the cache may be needed to workaround the problem described in SPARK-19185. This property may be removed in later versions of Spark, once SPARK-19185 is resolved.
The cache is keyed by topicpartition and group.id, so use a separate group.id for each call to createDirectStream.
Pass spark.streaming.kafka.consumer.cache.enabled=false as a parameter in your spark-submit, and your micro-batch performance will be like a supersonic aeroplane.
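A minimal sketch of the two equivalent ways to set this flag (the application name is a placeholder):

import org.apache.spark.SparkConf;

public class DisableConsumerCache {
  public static void main(String[] args) {
    // Disable the per-executor Kafka consumer cache (workaround for SPARK-19185).
    SparkConf conf = new SparkConf()
        .setAppName("kafka-direct-stream")   // placeholder
        .set("spark.streaming.kafka.consumer.cache.enabled", "false");

    // Equivalent on the command line:
    //   spark-submit --conf spark.streaming.kafka.consumer.cache.enabled=false ...
    System.out.println(conf.get("spark.streaming.kafka.consumer.cache.enabled"));
  }
}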
We faced the same issue too, and after a lot of analysis we found that it is due to a Kafka bug, described in KAFKA-4303.
For Spark applications, we can avoid this issue by setting reconnect.backoff.ms = 0 in the consumer config.
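For example, as part of the consumer params passed to createDirectStream (a sketch; the broker address and group id are placeholders, and only the last put is the actual workaround):

import java.util.HashMap;
import java.util.Map;

public class ReconnectBackoffWorkaround {
  // Builds the Kafka consumer params for the direct stream, including the
  // KAFKA-4303 workaround. Broker address and group id are placeholders.
  public static Map<String, Object> kafkaParams() {
    Map<String, Object> params = new HashMap<>();
    params.put("bootstrap.servers", "broker1:9092");
    params.put("group.id", "my-streaming-group");
    params.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    params.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
    // Do not back off between reconnect attempts (workaround for KAFKA-4303).
    params.put("reconnect.backoff.ms", 0);
    return params;
  }
}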
I may describe more details when I have time.
Related
I have a Kafka consumer running in a Spring application.
I am trying to configure the consumer with fetch.max.wait.ms and fetch.min.bytes.
I would like the consumer to wait until there are 15000000 bytes of messages or 1 minute has passed.
consumerProps.put(ConsumerConfig.FETCH_MAX_WAIT_MS_CONFIG, 60000);   // wait at most 1 minute for a fetch...
consumerProps.put(ConsumerConfig.FETCH_MIN_BYTES_CONFIG, 15000000);  // ...unless 15,000,000 bytes become available first
factory.setConsumerFactory(new DefaultKafkaConsumerFactory<>(consumerProps));
I know this configuration has an effect, because once it was set I started to get org.apache.kafka.common.errors.DisconnectException.
To resolve it, I increased request.timeout.ms:
consumerProps.put(ConsumerConfig.REQUEST_TIMEOUT_MS_CONFIG, 120000);
This resolved the errors, but the behavior is not as expected:
The consumer picks up messages very often, and in small amounts, nowhere near fetch.min.bytes.
Within a single minute it will sometimes do multiple fetches.
It works OK in my local dev environment when I test it with Spring EmbeddedKafka, but it doesn't work in production (MSK).
What can explain this? Is it possible it doesn't work well on MSK?
Are there other properties that play a role here or can get in the way?
Is it correct to say that, assuming I am always under fetch.min.bytes, I won't see more than one fetch per minute?
What about the case where new records are written while records are being polled: what is the expected behavior then, and does it affect the current poll or the next one?
(other properties defined for this consumer: session.timeout.ms, max.poll.records, max.partition.fetch.bytes)
====== EDIT ======
After some investigation I discovered something:
The configuration works as expected when the consumer works against a topic with a single partition.
When working against a topic with multiple partitions, the fetch timing becomes unpredictable.
I have not used the Spring consumer myself, but after doing some research it seems it is not possible to achieve what you are trying to do. As per this thread, it is not possible to configure the poll duration in the listener implementation.
However, you can write your own poll logic and achieve the desired behaviour using the poll duration and max.poll.records. You can use this code as a reference (see the sketch after the list below) and configure:
Poll duration as 60 seconds
max.poll.records
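A minimal sketch of such a hand-rolled poll loop with the plain Kafka consumer (broker, group id, topic name, and the max.poll.records value are placeholders; poll(Duration) requires kafka-clients 2.0 or later):

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class ManualPollLoop {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");   // placeholder
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "batching-consumer");       // placeholder
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
    // Upper bound on how many records a single poll() may return (illustrative value).
    props.put(ConsumerConfig.MAX_POLL_RECORDS_CONFIG, 5000);

    try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
      consumer.subscribe(Collections.singletonList("my-topic"));          // placeholder
      while (true) {
        // Block for up to 60 seconds waiting for records, instead of relying on
        // fetch.max.wait.ms / fetch.min.bytes to shape the batches.
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(60));
        for (ConsumerRecord<String, String> record : records) {
          // business logic goes here
          System.out.printf("partition=%d offset=%d%n", record.partition(), record.offset());
        }
        if (!records.isEmpty()) {
          consumer.commitSync();
        }
      }
    }
  }
}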
I am running Spark Streaming with Kafka for a word-count program, and there is a lot of delay in batch creation and processing: around 2 minutes for each batch.
How could I reduce this time? Are there any properties to configure so that this runs as quickly as possible, at the Spark Streaming level or the Kafka level?
You should define the interval between batches in your unstructured (DStream) StreamingContext, for example:
val ssc = new StreamingContext(new SparkConf(), Minutes(1))
In Structured Streaming you have an option, kafkaConsumer.pollTimeoutMs,
with 512 ms as the default value; more information: https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html
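For example, the option is set directly on the Kafka source (a sketch; the broker and topic names are placeholders):

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class StructuredKafkaPollTimeout {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("structured-kafka")
        .master("local[2]")   // for local testing; normally set by spark-submit
        .getOrCreate();

    Dataset<Row> df = spark
        .readStream()
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker1:9092")   // placeholder
        .option("subscribe", "my-topic")                      // placeholder
        // How long the Kafka consumer waits when polling for data on the executors.
        .option("kafkaConsumer.pollTimeoutMs", "1024")
        .load();

    df.printSchema();
  }
}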
Another problem can come from Kafka lag. Your application can take a long time to process a specific offset, maybe 2 minutes, and only once that offset is finished will it poll others for processing. Try comparing the current offset of your consumer group with the latest offset of your topic.
We are trying to benchmark the performance of our Storm topology. We are ingesting around 1000 messages/second into a Kafka topic. When we set max.spout.pending=2000 on our KafkaSpout, we don't see any failed messages in the Storm UI, but when we decrease the max.spout.pending value to 500 or 100, we see a lot of failed messages on the spout in the Storm UI. My understanding was that if we keep max.spout.pending low we would not have any failed messages, because nothing would time out, but it behaves in the opposite manner. We are using Storm 1.1.0 from HDP 2.6.5.
We have one Kafka spout and two bolts.
KafkaSpout Parallelism - 1
Processing Bolt Parallelism - 1
Custom Kafka Writer Bolt Parallelism - 1
Does anyone have any idea about this?
The first thing to do is check the latency statistics in the Storm UI. You should also look at how loaded the bolts/spouts are (the capacity statistic).
Is the rate at which tuples are emitted much higher than the rate at which the data is sunk? That is the indication I get when you mention that increasing max.spout.pending fixes the issue. Can you provide these stats? Another part worth exploring is increasing the tuple timeout (to see whether timeouts are causing replays that flood the topology).
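For example, both the pending cap and the tuple timeout live on the topology Config (a sketch with illustrative values; topology.message.timeout.secs defaults to 30 seconds):

import org.apache.storm.Config;

public class SpoutTimeoutTuning {
  // Returns a Config with the two settings discussed above; the values are
  // illustrative, not a recommendation.
  public static Config tunedConfig() {
    Config conf = new Config();
    // Cap on un-acked tuples in flight per spout task (topology.max.spout.pending).
    conf.setMaxSpoutPending(2000);
    // Time a tuple may stay in the topology before it is failed and replayed
    // (topology.message.timeout.secs).
    conf.setMessageTimeoutSecs(60);
    return conf;
  }
}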
Please find the topology stats below:
This is interesting. You are right; follow these steps to narrow down the issue:
Upload a screenshot of your Topology Visualization screen at peak load.
Check for bolts that change their color to brown/red. Red indicates that a bolt takes too much time to process records.
Your spout/bolt executor counts look far too low to process 1K tuples per second.
How many machines are you using?
If tuples fail in the KafkaSpout, most of the time it means a timeout error.
Find out after how many processed events the tuples start failing.
We have 25 million records written to the Kafka topic.
The topic has 24 partitions and 24 consumers.
Each message is 1 KB, and the messages are wrapped with Avro for serialization and deserialization.
The replication factor is 2.
The fetch size is 50000 and the poll interval is 50 ms.
Right now, during the load test, consuming and processing 1 million records takes 40 minutes on average, but we want to process the 25 million records in less than 20 to 30 minutes.
Broker configs:
background.threads = 10
num.network.threads = 7
num.io.threads = 8
Set replica.lag.time.max.ms = 500
Set replica.lag.max.messages = 4
Set log.flush.interval.ms to default value as per logs
Used G1 collector instead of MarkSweepGC
Changed Xms to 4G and Xmx to 4G
Our setup has 8 brokers, each with 3 disks, on 10 Gbps Ethernet (simplex network).
Consumer Configs:
We are using the Java consumer API to consume the messages. We set swappiness to 1 and use 200 threads to process the data within the consumer. Inside the consumer we pick up each message and hit Redis and MapR-DB to perform some business logic. Once the logic is complete, we commit the message using a synchronous Kafka commit (commitSync).
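(For reference, the consume-and-commit loop follows roughly the pattern sketched below; the broker, group, and topic names are placeholders, and the Avro decoding and the 200-thread fan-out are omitted.)

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class ConsumeAndCommit {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");   // placeholder
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "load-test-group");         // placeholder
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
        "org.apache.kafka.common.serialization.ByteArrayDeserializer");
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
        "org.apache.kafka.common.serialization.ByteArrayDeserializer");   // Avro decoding omitted
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");

    try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
      consumer.subscribe(Collections.singletonList("my-topic"));          // placeholder
      while (true) {
        ConsumerRecords<byte[], byte[]> records = consumer.poll(Duration.ofMillis(50));
        for (ConsumerRecord<byte[], byte[]> record : records) {
          process(record);   // Redis / MapR-DB business logic lives here
        }
        if (!records.isEmpty()) {
          consumer.commitSync();   // synchronous commit after the batch is processed
        }
      }
    }
  }

  private static void process(ConsumerRecord<byte[], byte[]> record) {
    // placeholder for the real business logic
  }
}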
Each consumer runs with -Xms 4G and -Xmx 4G. What other aspects do we need to consider in order to increase the read throughput?
I won't give you an exact answer to your problem, but rather a roadmap and some methodological help.
10 minutes for 1 million messages is indeed slow IF everything works fine AND the consumer's task is light.
The first thing you need to know is where your bottleneck is.
It could be:
the Kafka cluster itself: messages take a long time to be pulled out of the cluster. To test that, you should check with a simple consumer (the one provided with the Kafka CLI, for example), running directly on a machine where you have a broker (or close to it), to avoid network latency. How fast is that?
the network between the brokers and the consumer
the consumer: what does it do? Maybe the processing is really long, in which case the optimisation should happen there. Can you monitor the resources (CPU, RAM) required by your consumer? One good test you could do is create a test consumer in which you load 10k messages into memory, then run your business logic and time it. How long does it take? This will tell you the max throughput of your consumer, irrespective of Kafka's speed.
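A minimal sketch of such a probe, assuming the plain Java consumer (the broker, group, and topic names are placeholders, and the business logic is a stub to be replaced):

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.time.Duration;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

public class ConsumerThroughputProbe {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");   // placeholder
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "throughput-probe");        // placeholder
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
        "org.apache.kafka.common.serialization.ByteArrayDeserializer");
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
        "org.apache.kafka.common.serialization.ByteArrayDeserializer");
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");

    List<ConsumerRecord<byte[], byte[]>> sample = new ArrayList<>();

    // Step 1: pull ~10k messages into memory, keeping this out of the timing.
    try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
      consumer.subscribe(Collections.singletonList("my-topic"));          // placeholder
      while (sample.size() < 10_000) {
        consumer.poll(Duration.ofSeconds(1)).forEach(sample::add);
      }
    }

    // Step 2: run only the business logic over the in-memory sample and time it.
    long start = System.nanoTime();
    for (ConsumerRecord<byte[], byte[]> record : sample) {
      businessLogic(record);   // placeholder for the Redis / MapR-DB work
    }
    long elapsedMs = Math.max(1, (System.nanoTime() - start) / 1_000_000);
    System.out.printf("processed %d records in %d ms (%.0f msg/s)%n",
        sample.size(), elapsedMs, sample.size() * 1000.0 / elapsedMs);
  }

  private static void businessLogic(ConsumerRecord<byte[], byte[]> record) {
    // plug the real processing in here
  }
}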
I am running a Spark Streaming job (meaning data keeps getting pushed to a Kafka topic and read by the Spark consumer continuously). The Kafka topic for the input data has a retention time (retention.ms) set to 60000 (1 minute). However, the input topic doesn't clear messages after 1 minute; it takes approximately 1:26 minutes to clear if no new data is added to the topic.
If I add data continuously for two minutes, I would expect half of the old data to be cleared because retention.ms is set to 1 minute, but I just see double the data.
Has anyone seen a similar pattern? How can I resolve this? Do you need more details?
You need to set the broker property log.retention.check.interval.ms, which controls the frequency in milliseconds at which the log cleaner checks whether any log is eligible for deletion.
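For example, in the broker's server.properties (the value below is illustrative; the default is 5 minutes):

# Let the log cleaner look for deletable log segments every 30 seconds.
log.retention.check.interval.ms=30000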