How to tune for slow MDBs? - WildFly

I have created a simple MDB that sleeps for 1 sec (to simulate the network call it makes in real life). When I produce 100 messages there is a queue build-up, and it takes about 4 seconds to consume all of the messages. What should I tune to make it run all 100 messages simultaneously? I have tried:
@ActivationConfigProperty(propertyName = "maxSession", propertyValue = "100")
on the MDB and
<subsystem xmlns="urn:jboss:domain:messaging:2.0">
    <hornetq-server>
        <thread-pool-max-size>500</thread-pool-max-size>
in the WildFly config
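For reference, the maxSession property sits on the bean roughly like this (a sketch; the bean class and queue name here are placeholders):

import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.Message;
import javax.jms.MessageListener;

// Sketch of the MDB described above; destination and class names are placeholders.
@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destinationLookup", propertyValue = "java:/jms/queue/testQueue"),
    // Upper bound on concurrent sessions (and so concurrent onMessage calls) for this MDB.
    @ActivationConfigProperty(propertyName = "maxSession", propertyValue = "100")
})
public class SlowMdb implements MessageListener {

    @Override
    public void onMessage(Message message) {
        try {
            Thread.sleep(1000); // simulate the ~1 second network call
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}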

Related

If some brokers go down while producing messages, does the Kafka producer side get any exception?

I am testing the following scenario.
I am producing messages to a sink, which is a Kafka cluster with three brokers.
If some brokers go down, does the producing side run into any issue because of the broker outage?
When I tested this locally using Flink, I generated messages and sank them to Kafka, which has three brokers. When I reduced the number of brokers to 2, there were no problems. And obviously, when all the brokers go down, the producer-side app throws an exception.
So, based on these facts, I think the producer-side app can stay alive without any errors as long as at least one broker remains. Is my assumption correct?
Below is my producer-side configuration.
acks = 1
batch.size = 16384
compression.type = lz4
connections.max.idle.ms = 540000
delivery.timeout.ms = 120000
enable.idempotence = false
key.serializer = class org.apache.kafka.common.serialization.ByteArraySerializer
linger.ms = 0
partitioner.class = class org.apache.kafka.clients.producer.internals.DefaultPartitioner
receive.buffer.bytes = 32768
reconnect.backoff.max.ms = 1000
reconnect.backoff.ms = 50
request.timeout.ms = 30000
The replication factor is 2 and I have three partitions per topic.
Any help will be appreciated.
Thanks.
It all depends on your requirements and your producer configuration. At the moment, yes: you can have 2 out of 3 brokers alive and your producer will continue as normal.
This is because you have acks=1, which means only the leader has to acknowledge the message before it is considered successful; the followers don't have to acknowledge it.
You should also check whether you have changed min.insync.replicas at the broker or topic level. The default is 1, meaning only 1 in-sync replica is needed for the broker to accept acks=all requests.
Side note: you have replication=2; I'd change this so partitions are replicated across all 3 brokers.
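If you do move to acks=all with replication factor 3 and a topic-level min.insync.replicas=2, the producer side would look roughly like this (a sketch; the broker address, topic name and class name are placeholders, not from the question):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class DurableProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, ByteArraySerializer.class);
        // acks=all: the leader waits for the in-sync replicas before acknowledging.
        // Combined with replication factor 3 and min.insync.replicas=2 on the topic,
        // this tolerates one broker failure without losing acknowledged writes.
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        try (KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("my-topic", "hello".getBytes()));
        }
    }
}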
I'm not sure if I understood your question, but in the Kafka client API there are some retriable exceptions (such as "not leader for partition", or an unreachable/unknown host).
So your producer will retry until it reaches the first of these two limits:
retries : https://kafka.apache.org/documentation/#producerconfigs_retries
delivery.timeout.ms : https://kafka.apache.org/documentation/#producerconfigs_delivery.timeout.ms
So, using the default values:
retries ≈ 2 billion (Integer.MAX_VALUE) and
delivery.timeout.ms = 2 minutes,
your producer will keep retrying, but only for 2 minutes, after which the send fails.
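As a sketch of how those two settings interact (the class name here is made up; the longer delivery timeout is just one illustrative way to ride out longer outages):

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class RetrySettingsSketch {
    // With the defaults, retries is effectively unlimited (Integer.MAX_VALUE),
    // so delivery.timeout.ms (default 120000 ms) is what actually bounds how long
    // a record may be retried before the send is failed with a TimeoutException.
    static Properties retryProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);
        // Raise this if the producer should keep retrying through longer broker outages.
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, 300_000);
        return props;
    }
}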

Kafka Streams throwing InvalidProducerEpochException frequently

I have a Kafka Streams application with 4 instances, each running on a separate EC2 instance with 16 threads. Total threads = 16 * 4. The input topic has only 32 partitions, so I understand that some of the threads will remain idle.
I am continuously seeing this exception:
Caused by: org.apache.kafka.common.errors.InvalidProducerEpochException: Producer attempted to produce with an old epoch.
01:57:23.971 [kafka-producer-network-thread | bids_kafka_streams_beta_007-fd78c6fa-62bc-437d-add0-c31f5b7c1901-StreamThread-12-1_6-producer] ERROR org.apache.kafka.streams.processor.internals.RecordCollectorImpl - stream-thread [bids_kafka_streams_beta_007-fd78c6fa-62bc-437d-add0-c31f5b7c1901-StreamThread-12] task [1_6] Error encountered sending record to topic kafka_streams_bids_output for task 1_6 due to:
org.apache.kafka.common.errors.InvalidProducerEpochException: Producer attempted to produce with an old epoch.
Written offsets would not be recorded and no more records would be sent since the producer is fenced, indicating the task may be migrated out
The only settings I have changed in the Streams config are the producer configs, to reduce CPU usage on the brokers:
linger.ms=10000
commit.interval.ms=10000
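For context, those two overrides are applied in the Streams configuration roughly like this (a sketch; the application id is taken from the log above and the bootstrap server is a placeholder):

import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class StreamsOverridesSketch {
    static Properties streamsProps() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "bids_kafka_streams_beta_007"); // from the log above
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker:9092");              // placeholder
        // Producer override: let records sit and batch for up to 10 s before sending.
        props.put(StreamsConfig.producerPrefix("linger.ms"), 10000);
        // Streams setting: commit offsets and state every 10 s.
        props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 10000);
        return props;
    }
}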
Records are windowed into 2-minute windows.
Is this due to rebalancing? Why is it so frequent?

Kafka producer quota and timeout exceptions

I am trying to come up with a configuration that enforces a producer quota based on a producer's average byte rate.
I did a test with a 3-node cluster. The topic, however, was created with 1 partition and a replication factor of 1, so that producer_byte_rate is measured against only 1 broker (the leader).
I set producer_byte_rate to 20480 for the client id test_producer_quota.
I used kafka-producer-perf-test to test the throughput and throttling.
kafka-producer-perf-test --producer-props bootstrap.servers=SSL://kafka-broker1:6667 \
  client.id=test_producer_quota \
  --topic quota_test \
  --producer.config /myfolder/client.properties \
  --record-size 2048 --num-records 4000 --throughput -1
I expected the producer client to learn about the throttle and eventually smooth out the requests sent to the broker. Instead, I noticed the throughput alternating between 98 recs/sec and 21 recs/sec for a period of more than 30 seconds. During this time the average latency slowly kept increasing, and when it finally hit 120000 ms I started to see the timeout exception below:
org.apache.kafka.common.errors.TimeoutException : Expiring 7 records for quota_test-0: 120000 ms has passed since batch creation.
What is possibly causing this issue?
The producer hits the timeout when latency reaches 120 seconds (the default value of delivery.timeout.ms).
Why isn't the producer learning about the throttle and quota and slowing down or backing off?
What other producer configuration could help alleviate this timeout issue?
(2048 * 4000) / 20480 = 400 (sec)
This means that if your producer tries to send the 4000 records at full speed (which is the case, because you set throughput to -1), it will batch them and put them in the queue within maybe one or two seconds (depending on your CPU).
Then, given your quota setting (20480 bytes/sec), you can be sure the broker won't 'complete' the processing of those 4000 records in much under 400 seconds.
The broker does not return an error when a client exceeds its quota, but instead attempts to slow the client down. The broker computes the amount of delay needed to bring a client under its quota and delays the response for that amount of time.
With delivery.timeout.ms left at its default of 120 seconds (120000 ms), batches expire while they are still being throttled, hence the TimeoutException.
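One way to make the numbers line up is to give the producer at least as long as the quota needs to drain the batches, sketched below (class name is made up; the figures are the ones from the test above):

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class QuotaAwareTimeoutSketch {
    static Properties producerProps() {
        // 4000 records * 2048 bytes at a 20480 bytes/sec quota take about 400 s to drain,
        // so the default delivery.timeout.ms of 120 s expires batches long before that.
        long expectedDrainMs = 4000L * 2048L * 1000L / 20480L; // = 400000 ms

        Properties props = new Properties();
        // Allow each batch to wait in the accumulator while the broker throttles us.
        props.put(ProducerConfig.DELIVERY_TIMEOUT_MS_CONFIG, (int) expectedDrainMs + 60_000);
        return props;
    }
}

Alternatively, run the perf test with --throughput set at or below the quota (about 10 records/sec at 2048 bytes per record) instead of -1, so the client never outruns the throttle.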

Apache Kafka: Lowering `request.timeout.ms` causes metadata fetch failures?

I have a 9-broker Kafka setup with a 5-node ZooKeeper ensemble.
In order to reduce the time it takes to report failures, we set request.timeout.ms to 3000. However, with this setting I'm observing some weird behavior.
Occasionally, I see the client (producer) getting this error:
org.apache.kafka.common.errors.TimeoutException: Failed to update metadata after 60000 ms.
This doesn't always happen; some producers work just fine.
When I bumped the request.timeout.ms value back up, I didn't see any errors.
Any idea why lowering request.timeout.ms causes metadata fetch timeouts?
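For reference, the change described boils down to something like this on the producer (a sketch; the bootstrap server is a placeholder):

import java.util.Properties;
import org.apache.kafka.clients.producer.ProducerConfig;

public class LowRequestTimeoutSketch {
    static Properties producerProps() {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka-broker1:9092"); // placeholder
        // Fail any request that gets no broker response within 3 s (default is 30 s).
        // Metadata requests are subject to this same timeout; the 60000 ms in the
        // error above is the producer's max.block.ms bound on waiting for metadata.
        props.put(ProducerConfig.REQUEST_TIMEOUT_MS_CONFIG, 3000);
        return props;
    }
}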

Slow Consumers on Kafka

I use Kafka to feed a Python service that is supposed to work in parallel so it can handle the slow API requests for each message efficiently.
I use the multiprocessing module in Python and kafka-python for the consumers.
ZooKeeper and Kafka 2.11 run on the same Ubuntu server with mostly default configurations.
The topic is auto-created by another kafka-python producer and set to have 10 partitions in order to use up to 10 consumers at the same time.
When I check, I see that the queue is really long, so the producer is clearly sending plenty of messages:
$ bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic usrReq --time -1
usrReq:8:1157
usrReq:2:1185
usrReq:5:1167
usrReq:4:1115
usrReq:7:1164
usrReq:10:1150
usrReq:1:1149
usrReq:9:1138
usrReq:3:1186
usrReq:6:1220
usrReq:0:6264
However, although 10 cores are working in parallel, the consumers take a very long time (117 seconds in the sample log below) to get the next message from the queue:
thread 7, consumer: 117.485 sec
api1:0.412 sec
api2:0.752 sec
db_insert:0.132 sec
This is how each process creates its own consumer, fetches messages and runs the analysis code:
import json
from kafka import KafkaConsumer

# Each worker process creates its own consumer in the same consumer group.
consumer = KafkaConsumer(group_id='my-group',
                         bootstrap_servers='localhost',
                         value_deserializer=lambda m: json.loads(m.decode('ascii')))
consumer.subscribe(topics=['usrReq'])

while True:
    # Block until the next message arrives, then run the slow analysis on it.
    msg = next(consumer).value['id']
    method(msg)
Where could the problem be in this setup?