Kafka MirrorMaker: No broker partitions consumed by consumer thread kafka-mirror

This is regarding the Kafka MirrorMaker tool.
I have configured Kafka on two machines.
source:
destination: a VM [Ubuntu, at the source only]
Kafka at both the source and the destination is the same version [kafka_2.11-0.9.0.0].
At the source and the destination, the respective ZooKeeper and Kafka servers are running.
With the MirrorMaker tool I wanted to replicate/mirror topics from the source to the destination.
Below is the command that I used:
./bin/kafka-run-class.sh kafka.tools.MirrorMaker --consumer.config ./config/mirror_consumer.properties --producer.config ./config/mirror_producer.properties --whitelist='.*' &>mirror-log.log
The configuration files contain:
a. mirror_consumer.properties
#host:port of kafka source zookeeper to be mirrored
zookeeper.connect=source-ip:3181
zookeeper.connection.timeout.ms=1000000
consumer.timeout.ms=-1
security.protocol=PLAINTEXT
group.id=kafka-mirror
where:
source-ip is the IP address of the source machine, and
ZooKeeper at the source is running on port 3181.
b. mirror_producer.properties
# mirror broker (local) at the destination
bootstrap.servers=localhost:9092
producer.type=async
where:
localhost resolves to the destination, i.e. the Ubuntu VM, and
Kafka is running on the default port, i.e. 9092.
Initially, I created a few topics, say source1 and source2.
From the source machine I sent some messages to these topics using the respective command-line producers.
After executing the MirrorMaker command at the destination,
I could see that the consumer at the destination was trying to consume the topics.
Unfortunately, the consumer at the destination fails to read the partitions from the broker for each topic.
Please have a look at the sample log entries below:
[2016-05-06 13:25:00,931] WARN No broker partitions consumed by consumer thread kafka-mirror_mojes-VirtualBox-1462521159741-6c2475c3-0 for topic source1 (kafka.consumer.RangeAssignor)
[2016-05-06 13:25:00,931] WARN No broker partitions consumed by consumer thread kafka-mirror_mojes-VirtualBox-1462521295337-c3742307-0 for topic source1 (kafka.consumer.RangeAssignor)
[2016-05-06 13:25:00,931] WARN No broker partitions consumed by consumer thread kafka-mirror_mojes-VirtualBox-1462517840512-a134d048-0 for topic source2 (kafka.consumer.RangeAssignor)
[2016-05-06 13:25:00,932] WARN No broker partitions consumed by consumer thread kafka-mirror_mojes-VirtualBox-1462519206297-63bc9c58-0 for topic source2 (kafka.consumer.RangeAssignor)
[2016-05-06 13:25:00,932] WARN No broker partitions consumed by consumer thread kafka-mirror_mojes-VirtualBox-1462519513695-bee7950e-0 for topic source2 (kafka.consumer.RangeAssignor)
Please let me know if you see anything that is missing or needs to be fixed.
It would be a great help.
Thanks in advance.

We get this issue when there is a mismatch between the number of partitions in a topic and the number of consumers in the consumer group reading that topic: consumer threads in excess of the partition count are assigned no partitions, which is exactly what the warning reports.
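To check, compare the topic's partition count against the total number of consumer threads registered in the kafka-mirror group (every MirrorMaker instance sharing that group.id contributes its consumer streams). Below is a minimal sketch, not from the original post, that prints the partition count using the 0.9 Java consumer; the broker address source-ip:9092 and the class name are placeholders:
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PartitionCountCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        // assumed broker address of the source cluster
        props.put("bootstrap.servers", "source-ip:9092");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // partitionsFor() only fetches metadata; it does not join any consumer group
            System.out.println("source1 partitions: " + consumer.partitionsFor("source1").size());
        }
    }
}
If the topic has only one partition, only one consumer thread in the group gets an assignment and the remaining threads log the warning shown above.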

Related

Apache Beam KafkaIO mention topic partition instead of topic name

Apache Beam KafkaIO has support for kafka consumers to read only from specified partitions. I have the following code.
KafkaIO.<String, String>read()
.withCreateTime(Duration.standardMinutes(1))
.withReadCommitted()
.withBootstrapServers(endPoint)
.withConsumerConfigUpdates(new ImmutableMap.Builder<String, Object>()
.put(ConsumerConfig.GROUP_ID_CONFIG, groupName)
.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, 5)
.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest")
.build())
.commitOffsetsInFinalize()
.withTopicPartitions(List<TopicPartitions>)
I have the following two questions:
How do I get the partition names from Kafka, and how do I specify them in KafkaIO?
Does Apache Beam spawn a number of Kafka consumers equal to the number of partitions in the list given when the Kafka consumer is created?
I found the answers myself.
How do I tell KafkaIO to read from particular partitions?
KafkaIO has the method withTopicPartitions(List<TopicPartition>), which accepts a list of TopicPartition objects.
Topic partitions are numbered sequentially starting from zero. Hence, the following should work:
KafkaIO.<String, String>read()
.withCreateTime(Duration.standardMinutes(1))
.withReadCommitted()
.withBootstrapServers(endPoint)
.withConsumerConfigUpdates(new ImmutableMap.Builder<String, Object>()
.put(ConsumerConfig.GROUP_ID_CONFIG, groupName)
.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, 5)
.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest")
.build())
.commitOffsetsInFinalize()
.withTopicPartitions(Arrays.asList(new TopicPartition(topicName, 0), new TopicPartition(topicName, 1), new TopicPartition(topicName, 2)))
To test it out, use kafkacat with the following command, which produces to the specified partition:
kafkacat -P -b localhost:9092 -t sample -p 0
Does Apache Beam spawn a number of Kafka consumers equal to the number of partitions in the list given when the Kafka consumer is created?
It spawns a single consumer group with as many consumers as the number of partitions explicitly listed when the KafkaIO read transform is built.

How can you set the max.message.bytes of a state store changelog topic?

I have a Kafka Streams application with messages up to 10MiB. I want to persist these messages in a state store, but Kafka Streams fails to produce to the internal changelog topic:
2017-11-17 08:36:19,792 ERROR RecordCollectorImpl - task [4_5] Error sending record to topic appid-statestorename-state-store-changelog. No more offsets will be recorded for this task and the exception will eventually be thrown
org.apache.kafka.common.errors.RecordTooLargeException: The request included a message larger than the max message size the server will accept.
2017-11-17 08:36:20,583 ERROR StreamThread - stream-thread [StreamThread-1] Failed while executing StreamTask 4_5 due to flush state:
After adding some logging, it looks like the default max.message.bytes setting of an internal topic is 1 MiB.
The default max.message.bytes for the cluster is set to 50MiB.
Is it possible to tweak the configuration of internal topics of Kafka Streams applications?
A work-around is to start the streams application, let it create the topics, and afterwards alter the topic config. But this feels like a dirty hack.
./kafka-topics.sh --zookeeper ... \
--alter --topic appid-statestorename-state-store-changelog \
--config max.message.bytes=10485760
Kafka 1.0 allows you to specify custom topic properties for internal topics via StreamsConfig.
You prefix those configs with "topic." and can use any config defined in TopicConfig.
See the original KIP for more details:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-173%3A+Add+prefix+to+StreamsConfig+to+enable+setting+default+internal+topic+configs
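For example, a minimal sketch of how such a prefixed setting could look in the Streams configuration (the application id and broker address are placeholders; StreamsConfig.topicPrefix(...) turns the key into "topic.max.message.bytes"):
import java.util.Properties;
import org.apache.kafka.common.config.TopicConfig;
import org.apache.kafka.streams.StreamsConfig;

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "appid");             // placeholder
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
// applies to the internal changelog/repartition topics created by this application
props.put(StreamsConfig.topicPrefix(TopicConfig.MAX_MESSAGE_BYTES_CONFIG), 10485760);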

What if kafka offset manager is down

A Confluence doc shows how to fetch consumer offsets stored in Kafka: https://cwiki.apache.org/confluence/display/KAFKA/Committing+and+fetching+consumer+offsets+in+Kafka
It seems one broker is assigned as the offset manager, and all offset fetches and commits go to this broker. But what if this broker is down?
// from the wiki example: re-discover the current offset coordinator and reconnect to it
Broker offsetManager = metadataResponse.coordinator();
// if the coordinator is different from the above channel's host, then reconnect
channel.disconnect();
channel = new BlockingChannel(offsetManager.host(), offsetManager.port(),
        BlockingChannel.UseDefaultBufferSize(),
        BlockingChannel.UseDefaultBufferSize(),
        5000 /* read timeout in millis */);
channel.connect();
By configuring:
1. offsets.topic.num.partitions: the number of partitions for the offsets commit topic, and
2. offsets.topic.replication.factor: the replication factor for the offsets topic
in the server.properties file, each partition of the offsets topic has one broker acting as leader and the rest as followers, so it follows the same leader-failure mechanism as any other topic in Kafka.
Hence, when the offset manager that handles offset commits goes down, the controller broker eventually elects one of the in-sync replicas (ISR) as the next offset manager (leader).

Kafka rolling restart: Data is lost

As part of our current Kafka cluster, high-availability (HA) testing is being done. The objective is: while a producer job is pushing data to a particular partition of a topic, all the brokers in the Kafka cluster are restarted sequentially (stop the first broker, restart it, and after the first broker comes back up, do the same for the second broker, and so on). The producer job pushes around 7 million records over about 30 minutes while this test is going on. At the end of the job, it was noticed that around 1000 records were missing.
Below are the specifics of our Kafka cluster (kafka_2.10-0.8.2.0):
- 3 Kafka brokers, each with two 100 GB mounts
The topic was created with:
- replication factor of 3
- min.insync.replicas=2
server.properties:
broker.id=1
num.network.threads=3
num.io.threads=8
socket.send.buffer.bytes=1048576
socket.receive.buffer.bytes=1048576
socket.request.max.bytes=104857600
log.dirs=/drive1,/drive2
num.partitions=1
num.recovery.threads.per.data.dir=1
log.flush.interval.messages=10000
log.retention.hours=1
log.segment.bytes=1073741824
log.retention.check.interval.ms=1800000
log.cleaner.enable=false
zookeeper.connect=ZK1:2181,ZK2:2181,ZK3:2181
zookeeper.connection.timeout.ms=10000
advertised.host.name=XXXX
auto.leader.rebalance.enable=true
auto.create.topics.enable=false
queued.max.requests=500
delete.topic.enable=true
controlled.shutdown.enable=true
unclean.leader.election=false
num.replica.fetchers=4
controller.message.queue.size=10
Producer.properties (async producer with the new producer API)
bootstrap.servers=broker1:9092,broker2:9092,broker3:9092
acks=all
buffer.memory=33554432
compression.type=snappy
batch.size=32768
linger.ms=5
max.request.size=1048576
block.on.buffer.full=true
reconnect.backoff.ms=10
retry.backoff.ms=100
key.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
value.serializer=org.apache.kafka.common.serialization.ByteArraySerializer
Can someone share any info about Kafka cluster HA configuration that would ensure data is not lost while rolling-restarting the Kafka brokers?
Also, here is my producer code. This is a fire-and-forget kind of producer; we are not handling failures explicitly as of now. It works fine for millions of records. I see the problem only when the Kafka brokers are restarted as explained above.
public void sendMessage(List<byte[]> messages, String destination, Integer partition, String kafkaDBKey) {
    for (byte[] message : messages) {
        producer.send(new ProducerRecord<byte[], byte[]>(destination, partition, kafkaDBKey.getBytes(), message));
    }
}
By increasing the default retries value from 0 to 4000 on the producer side, we are able to send data successfully without losing any:
retries=4000
Due to this setting, there is a possibility of the same message being sent twice, and messages may be out of sequence by the time the consumer receives them (the second message might arrive before the first). But for our current problem that is not an issue, and it is handled on the consumer side to ensure everything ends up in order.
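To at least make failed sends visible instead of pure fire-and-forget, the same send loop can pass a callback. This is a hedged sketch based on the producer method from the question (Callback, RecordMetadata and ProducerRecord come from org.apache.kafka.clients.producer); it does not change the retry behaviour, it only logs records that could not be delivered after the configured retries:
public void sendMessage(List<byte[]> messages, String destination, Integer partition, String kafkaDBKey) {
    for (byte[] message : messages) {
        producer.send(new ProducerRecord<byte[], byte[]>(destination, partition, kafkaDBKey.getBytes(), message),
                new Callback() {
                    @Override
                    public void onCompletion(RecordMetadata metadata, Exception exception) {
                        if (exception != null) {
                            // delivery failed even after the configured retries
                            System.err.println("Send failed: " + exception);
                        }
                    }
                });
    }
}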

Kafka unrecoverable if broker dies

We have a kafka cluster with three brokers (node ids 0,1,2) and a zookeeper setup with three nodes.
We created a topic "test" on this cluster with 20 partitions and replication factor 2. We are using the Java producer API to send messages to this topic. One of the Kafka brokers intermittently goes down, after which it is unrecoverable. To simulate the case, we killed one of the brokers manually. As per the Kafka architecture, it is supposed to recover by itself, but that is not happening. When I describe the topic on the console, I see that the number of ISRs has been reduced to one for a few of the partitions, since one of the brokers was killed. Now, whenever we try to push messages via the producer API (either the Java client or the console producer), we encounter a SocketTimeoutException. A quick look into the logs shows "Unable to fetch the metadata".
WARN [2015-07-01 22:55:07,590] [ReplicaFetcherThread-0-3][] kafka.server.ReplicaFetcherThread - [ReplicaFetcherThread-0-3],
Error in fetch Name: FetchRequest; Version: 0; CorrelationId: 23711; ClientId: ReplicaFetcherThread-0-3;
ReplicaId: 0; MaxWait: 500 ms; MinBytes: 1 bytes; RequestInfo: [zuluDelta,2] -> PartitionFetchInfo(11409,1048576),[zuluDelta,14] -> PartitionFetchInfo(11483,1048576).
Possible cause: java.nio.channels.ClosedChannelException
[2015-07-01 23:37:40,426] WARN Fetching topic metadata with correlation id 0 for topics [Set(test)] from broker [id:1,host:abc-0042.yy.xxx.com,port:9092] failed (kafka.client.ClientUtils$)
java.net.SocketTimeoutException
at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:201)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:86)
at java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:221)
at kafka.utils.Utils$.read(Utils.scala:380)
at kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
at kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
at kafka.network.BlockingChannel.receive(BlockingChannel.scala:111)
at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:75)
at kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:72)
at kafka.producer.SyncProducer.send(SyncProducer.scala:113)
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:58)
at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:93)
at kafka.consumer.ConsumerFetcherManager$LeaderFinderThread.doWork(ConsumerFetcherManager.scala:66)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
Any leads will be appreciated...
From your error, Unable to fetch metadata, the most likely cause is that you set bootstrap.servers in the producer to only the broker that has died.
Ideally, you should have more than one broker in the bootstrap.servers list, because if one of the brokers fails (or is unreachable) another one can still give you the metadata.
FYI: metadata is the information about a particular topic that tells how many partitions it has, their leader brokers, follower brokers, etc.
So, when a record is produced to a partition, the messages are sent to that partition's leader broker.
From your question, your ISR set has only one broker. You could try setting bootstrap.servers to this broker.
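More generally, a minimal sketch (host names are illustrative) of a producer configuration that lists all three brokers, so metadata can still be fetched while any single broker is down:
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;

Properties props = new Properties();
// list every broker you can; any live one is enough to bootstrap the metadata
props.put("bootstrap.servers", "broker0:9092,broker1:9092,broker2:9092");
props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
Producer<byte[], byte[]> producer = new KafkaProducer<byte[], byte[]>(props);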