We have 3 Kafka machines and 3 ZooKeeper servers:
kafka1 - 1001 (broker ID)
kafka2 - 1002 (broker ID)
kafka3 - 1003 (broker ID)
Our issue is about rebalancing the partitions across the available brokers. As you can see below, some partitions are under-replicated, with only two brokers in the ISR instead of three.
What is the best way to rebalance the Kafka topic partitions so that all replicas are back in the ISR?
Second - we can see that broker 1002 is missing as a leader. What is the solution for this?
Remark - we have 23 topics (the list below is partial).
[kafka#kafka01 bin]$ ./kafka-topics.sh -describe --zookeeper master:2181
Topic:lop_gt PartitionCount:50 ReplicationFactor:3 Configs:segment.bytes=104857600,cleanup.policy=compact,compression.type=producer
Topic: lop_gt Partition: 0 Leader: 1001 Replicas: 1002,1003,1001 Isr: 1001,1003,1002
Topic: lop_gt Partition: 1 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: lop_gt Partition: 2 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: lop_gt Partition: 3 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: lop_gt Partition: 4 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: lop_gt Partition: 5 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: lop_gt Partition: 6 Leader: 1001 Replicas: 1002,1003,1001 Isr: 1001,1003,1002
Topic: lop_gt Partition: 7 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: lop_gt Partition: 8 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: lop_gt Partition: 9 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: lop_gt Partition: 10 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: lop_gt Partition: 11 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: lop_gt Partition: 12 Leader: 1001 Replicas: 1002,1003,1001 Isr: 1001,1003,1002
Topic: lop_gt Partition: 13 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: lop_gt Partition: 14 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: lop_gt Partition: 15 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: lop_gt Partition: 16 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: lop_gt Partition: 17 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: lop_gt Partition: 18 Leader: 1001 Replicas: 1002,1003,1001 Isr: 1001,1003,1002
Topic: lop_gt Partition: 19 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: lop_gt Partition: 20 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: lop_gt Partition: 21 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: lop_gt Partition: 22 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: lop_gt Partition: 23 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: lop_gt Partition: 24 Leader: 1001 Replicas: 1002,1003,1001 Isr: 1001,1003,1002
Topic: lop_gt Partition: 25 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: lop_gt Partition: 26 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: lop_gt Partition: 27 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: lop_gt Partition: 28 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: lop_gt Partition: 29 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: lop_gt Partition: 30 Leader: 1001 Replicas: 1002,1003,1001 Isr: 1001,1003,1002
Topic: lop_gt Partition: 31 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: lop_gt Partition: 32 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: lop_gt Partition: 33 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: lop_gt Partition: 34 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: lop_gt Partition: 35 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: lop_gt Partition: 36 Leader: 1001 Replicas: 1002,1003,1001 Isr: 1001,1003,1002
Topic: lop_gt Partition: 37 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: lop_gt Partition: 38 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: lop_gt Partition: 39 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: lop_gt Partition: 40 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: lop_gt Partition: 41 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: lop_gt Partition: 42 Leader: 1001 Replicas: 1002,1003,1001 Isr: 1001,1003,1002
Topic: lop_gt Partition: 43 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: lop_gt Partition: 44 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: lop_gt Partition: 45 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: lop_gt Partition: 46 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: lop_gt Partition: 47 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: lop_gt Partition: 48 Leader: 1001 Replicas: 1002,1003,1001 Isr: 1001,1003,1002
Topic: lop_gt Partition: 49 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic:_schemas PartitionCount:1 ReplicationFactor:3 Configs:cleanup.policy=compact
Topic: _schemas Partition: 0 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic:ambari_kafka_service_check PartitionCount:1 ReplicationFactor:1 Configs:
Topic: ambari_kafka_service_check Partition: 0 Leader: 1002 Replicas: 1002 Isr: 1002
Topic:jr_dfse PartitionCount:10 ReplicationFactor:3 Configs:
Topic: jr_dfse Partition: 0 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: jr_dfse Partition: 1 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: jr_dfse Partition: 2 Leader: 1001 Replicas: 1002,1003,1001 Isr: 1001,1003,1002
Topic: jr_dfse Partition: 3 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: jr_dfse Partition: 4 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: jr_dfse Partition: 5 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: jr_dfse Partition: 6 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: jr_dfse Partition: 7 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: jr_dfse Partition: 8 Leader: 1001 Replicas: 1002,1003,1001 Isr: 1001,1003,1002
Topic: jr_dfse Partition: 9 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic:frte_nnc PartitionCount:6 ReplicationFactor:3 Configs:
Topic: frte_nnc Partition: 0 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: frte_nnc Partition: 1 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: frte_nnc Partition: 2 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: frte_nnc Partition: 3 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: frte_nnc Partition: 4 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: frte_nnc Partition: 5 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic:erw_plk PartitionCount:100 ReplicationFactor:3 Configs:
Topic: erw_plk Partition: 0 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 1 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 2 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 3 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: erw_plk Partition: 4 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: erw_plk Partition: 5 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: erw_plk Partition: 6 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 7 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 8 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 9 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: erw_plk Partition: 10 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: erw_plk Partition: 11 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: erw_plk Partition: 12 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 13 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 14 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 15 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: erw_plk Partition: 16 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: erw_plk Partition: 17 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: erw_plk Partition: 18 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 19 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 20 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 21 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: erw_plk Partition: 22 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: erw_plk Partition: 23 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: erw_plk Partition: 24 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 25 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 26 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 27 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: erw_plk Partition: 28 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: erw_plk Partition: 29 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: erw_plk Partition: 30 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 31 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 32 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 33 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: erw_plk Partition: 34 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: erw_plk Partition: 35 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: erw_plk Partition: 36 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 37 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 38 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 39 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: erw_plk Partition: 40 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: erw_plk Partition: 41 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: erw_plk Partition: 42 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 43 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 44 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 45 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: erw_plk Partition: 46 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: erw_plk Partition: 47 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: erw_plk Partition: 48 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 49 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 50 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 51 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: erw_plk Partition: 52 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: erw_plk Partition: 53 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: erw_plk Partition: 54 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 55 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 56 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 57 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: erw_plk Partition: 58 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: erw_plk Partition: 59 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: erw_plk Partition: 60 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 61 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 62 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 63 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: erw_plk Partition: 64 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: erw_plk Partition: 65 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: erw_plk Partition: 66 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 67 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 68 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 69 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: erw_plk Partition: 70 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: erw_plk Partition: 71 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: erw_plk Partition: 72 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 73 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 74 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 75 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: erw_plk Partition: 76 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: erw_plk Partition: 77 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: erw_plk Partition: 78 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 79 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 80 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 81 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: erw_plk Partition: 82 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: erw_plk Partition: 83 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: erw_plk Partition: 84 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 85 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 86 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 87 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: erw_plk Partition: 88 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: erw_plk Partition: 89 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: erw_plk Partition: 90 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 91 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 92 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 93 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: erw_plk Partition: 94 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: erw_plk Partition: 95 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: erw_plk Partition: 96 Leader: 1003 Replicas: 1002,1003,1001 Isr: 1003,1001
Topic: erw_plk Partition: 97 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: erw_plk Partition: 98 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
Topic: erw_plk Partition: 99 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic:loe_sd PartitionCount:6 ReplicationFactor:3 Configs:
Topic: loe_sd Partition: 0 Leader: 1001 Replicas: 1002,1001,1003 Isr: 1001,1003,1002
Topic: loe_sd Partition: 1 Leader: 1003 Replicas: 1003,1002,1001 Isr: 1003,1001
Topic: loe_sd Partition: 2 Leader: 1001 Replicas: 1001,1003,1002 Isr: 1003,1001,1002
Topic: loe_sd Partition: 3 Leader: 1001 Replicas: 1002,1003,1001 Isr: 1001,1003,1002
Topic: loe_sd Partition: 4 Leader: 1003 Replicas: 1003,1001,1002 Isr: 1003,1001
Topic: loe_sd Partition: 5 Leader: 1001 Replicas: 1001,1002,1003 Isr: 1003,1001,1002
You have two options for influencing partition leadership. With the configuration option auto.leader.rebalance.enable set to true (which is the default), Kafka will automatically try to move the leadership of each partition back to its preferred broker. The preferred broker is the first one in the list of replicas. This runs as a periodic check, so it might not happen immediately.
Alternatively - if the automatic rebalancing is turned off - you can trigger the preferred replica election manually using the bin/kafka-preferred-replica-election.sh tool. For more info see the Kafka docs.
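For example, a minimal sketch of the manual election, assuming the same ZooKeeper connection string you used for --describe (with no JSON file the tool runs the election for all partitions):
./kafka-preferred-replica-election.sh --zookeeper master:2181
You can also limit it to specific partitions by passing --path-to-json-file with a file of the form {"partitions":[{"topic":"lop_gt","partition":1}]}.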
However, in your case it looks like broker 1002 is either just back from a restart or still needs more time to re-sync its data, since 1002 is not yet in the ISR for all partitions. If this is a permanent state and you are 100% sure the broker has had enough time to sync all partitions, there may be some other problem with 1002. But that is hard to say without the logs from that broker.
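A quick way to watch the re-sync progress is to list only the problematic partitions (a sketch, using the same ZooKeeper address as your describe command):
./kafka-topics.sh --describe --zookeeper master:2181 --under-replicated-partitions
When this command returns nothing, broker 1002 is back in the ISR everywhere and the preferred leader election (automatic or manual) can move leadership back to it.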
I am following the Kafka quickstart from the documentation: https://kafka.apache.org/quickstart.
I have deployed 3 brokers and created a topic.
➜ kafka_2.10-0.10.1.0 bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic my-replicated-topic
Topic:my-replicated-topic PartitionCount:1 ReplicationFactor:3 Configs:
Topic: my-replicated-topic Partition: 0 Leader: 2 Replicas: 2,0,1 Isr: 2,1,0
Then I use "bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic" to test the producer,
and "bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --from-beginning --topic my-replicated-topic" to test the consumer.
The producer and consumer work well.
If I kill server 1 or 2, the producer and consumer still work properly.
But if I kill server 0 and type a message in the producer terminal, the consumer can't read any new messages.
When I kill server 0, the consumer prints this log:
[2017-06-23 17:29:52,750] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-23 17:29:52,974] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-23 17:29:53,085] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-23 17:29:53,195] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-23 17:29:53,302] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-23 17:29:53,409] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
Then I restart server 0, and the consumer prints the messages plus some more WARN logs:
hhhh
hello
[2017-06-23 17:32:32,795] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
[2017-06-23 17:32:32,902] WARN Auto offset commit failed for group console-consumer-97540: Offset commit failed with a retriable exception. You should retry committing offsets. (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator)
This confuses me. Why is server 0 so special? It is not even the leader.
I also noticed that the server log printed by server 0 contains a lot of entries like the following:
[2017-06-23 17:32:33,640] INFO [Group Metadata Manager on Broker 0]: Finished loading offsets from [__consumer_offsets,23] in 38 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-06-23 17:32:33,641] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from [__consumer_offsets,26] (kafka.coordinator.GroupMetadataManager)
[2017-06-23 17:32:33,646] INFO [Group Metadata Manager on Broker 0]: Finished loading offsets from [__consumer_offsets,26] in 4 milliseconds. (kafka.coordinator.GroupMetadataManager)
[2017-06-23 17:32:33,646] INFO [Group Metadata Manager on Broker 0]: Loading offsets and group metadata from [__consumer_offsets,29] (kafka.coordinator.GroupMetadataManager)
But the logs from server 1 and server 2 don't have that content.
Can somebody explain this to me? Thanks very much!
Solved:
The replication factor of the __consumer_offsets topic is the root cause. It's a known issue: issues.apache.org/jira/browse/KAFKA-3959
kafka-console-producer defaults to acks=1, so it is not fault tolerant at all. Add the flag or config parameter to set acks=all; if your topic and the __consumer_offsets topic were both created with a replication factor of 3, your test will work.
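For example, a sketch of running the console producer with acks=all (the exact flag depends on your Kafka version; --producer-property is the newer form, --request-required-acks the older one):
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic --producer-property acks=all
or, on older releases:
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic my-replicated-topic --request-required-acks -1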
The servers share the load of managing consumer groups between them.
Usually each independent consumer has a unique consumer group ID, and you use the same group ID when you want to split the consuming work between multiple consumers.
That being said: being the leader broker within the cluster is just about coordinating the other brokers. The leader has nothing to do (directly) with the server that currently manages the group ID and offset commits for a specific consumer!
So whenever you subscribe, one server is designated to handle the offset commits for your group, and this has nothing to do with leader election.
Shut that server down and you may have issues with your group's consumption until the Kafka cluster stabilizes again (it reassigns the group management to another server, or waits for the node to respond again... I am not expert enough to tell you exactly how the failover happens).
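If you want to see which broker currently coordinates your group, a quick check (a sketch; the group name is the one from your consumer logs, and the --state option requires a reasonably recent Kafka) is:
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group console-consumer-97540 --state
The COORDINATOR column shows which broker handles the offset commits for that group.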
Probably the __consumer_offsets topic has its "Replicas" set to broker 0 only.
To confirm this, verify the topic __consumer_offsets:
kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic __consumer_offsets
Topic: __consumer_offsets PartitionCount: 50 ReplicationFactor: 1 Configs: compression.type=producer,cleanup.policy=compact,segment.bytes=104857600
Topic: __consumer_offsets Partition: 0 Leader: 0 Replicas: 0 Isr: 0
Topic: __consumer_offsets Partition: 1 Leader: 0 Replicas: 0 Isr: 0
Topic: __consumer_offsets Partition: 2 Leader: 0 Replicas: 0 Isr: 0
Topic: __consumer_offsets Partition: 3 Leader: 0 Replicas: 0 Isr: 0
Topic: __consumer_offsets Partition: 4 Leader: 0 Replicas: 0 Isr: 0
...
Topic: __consumer_offsets Partition: 49 Leader: 0 Replicas: 0 Isr: 0
Notice the "Replicas: 0 Isr: 0". This is why the consumer stops getting messages as soon as you stop broker 0.
To correct this, you need to alter the "Replicas" of the __consumer_offsets topic so that it includes the other brokers.
Create a json file like this (config/inc-replication-factor-consumer_offsets.json):
{"version":1,
"partitions":[
{"topic":"__consumer_offsets", "partition":0, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":1, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":2, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":3, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":4, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":5, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":6, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":7, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":8, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":9, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":10, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":11, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":12, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":13, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":14, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":15, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":16, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":17, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":18, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":19, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":20, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":21, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":22, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":23, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":24, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":25, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":26, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":27, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":28, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":29, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":30, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":31, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":32, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":33, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":34, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":35, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":36, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":37, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":38, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":39, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":40, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":41, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":42, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":43, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":44, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":45, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":46, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":47, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":48, "replicas":[0, 1, 2]},
{"topic":"__consumer_offsets", "partition":49, "replicas":[0, 1, 2]}
]
}
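If you prefer not to type all 50 entries by hand, a small shell loop can generate the same file (a sketch; it assumes brokers 0, 1 and 2 and the 50 partitions shown above):
{
  echo '{"version":1,"partitions":['
  for p in $(seq 0 49); do
    # trailing comma on every entry except the last
    [ "$p" -lt 49 ] && sep="," || sep=""
    echo "  {\"topic\":\"__consumer_offsets\", \"partition\":$p, \"replicas\":[0, 1, 2]}$sep"
  done
  echo ']}'
} > config/inc-replication-factor-consumer_offsets.json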
Execute the following command:
kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --reassignment-json-file config/inc-replication-factor-consumer_offsets.json --execute
(On older Kafka versions where this tool does not accept --bootstrap-server, use --zookeeper localhost:2181 instead.)
Confirm the "Replicas":
kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic __consumer_offsets
Topic: __consumer_offsets PartitionCount: 50 ReplicationFactor: 3 Configs: compression.type=producer,cleanup.policy=compact,segment.bytes=104857600
Topic: __consumer_offsets Partition: 0 Leader: 0 Replicas: 0,1,2 Isr: 0,2,1
Topic: __consumer_offsets Partition: 1 Leader: 0 Replicas: 0,1,2 Isr: 0,2,1
Topic: __consumer_offsets Partition: 2 Leader: 0 Replicas: 0,1,2 Isr: 0,2,1
Topic: __consumer_offsets Partition: 3 Leader: 0 Replicas: 0,1,2 Isr: 0,2,1
...
Topic: __consumer_offsets Partition: 49 Leader: 0 Replicas: 0,1,2 Isr: 0,2,1
Now you can stop just broker 0, produce some messages, and see the result on the consumer side.