Curator leader election gives connection refused error - apache-zookeeper

I implemented the Curator leader election example given on this site. Instead of having a number of Curator clients, I added only one Curator client, as follows:
public void selectLeader() {
    CuratorFramework client = null;
    try {
        client = CuratorFrameworkFactory.newClient("localhost:2181", new ExponentialBackoffRetry(1000, 3));
        LeaderSelectorService service = new LeaderSelectorService(client, "/leaderSelections", "LeaderElector");
        client.start();
        Thread.sleep(10000);
        service.start();
    } catch (Exception e) {
        System.out.println("error" + e);
    } finally {
        System.out.println("Shutting down...");
        // CloseableUtils.closeQuietly(client);
    }
}
public class LeaderSelectorService extends LeaderSelectorListenerAdapter implements Closeable {
    private final String name;
    private final LeaderSelector leaderSelector;

    public LeaderSelectorService(CuratorFramework client, String path, String name) {
        this.name = name;
        // create a leader selector using the given path for management
        // all participants in a given leader selection must use the same path
        // ExampleClient here is also a LeaderSelectorListener but this isn't required
        leaderSelector = new LeaderSelector(client, path, this);
        // for most cases you will want your instance to requeue when it relinquishes leadership
        leaderSelector.autoRequeue();
    }

    public void start() throws IOException {
        // the selection for this instance doesn't start until the leader selector is started
        // leader selection is done in the background so this call to leaderSelector.start() returns immediately
        leaderSelector.start();
    }

    @Override
    public void takeLeadership(CuratorFramework arg0) throws Exception {
        // we are now the leader. This method should not return until we want to relinquish leadership
        final int waitSeconds = (int) (5 * Math.random()) + 1;
        System.out.println(name + " is now the leader. Waiting " + waitSeconds + " seconds...");
        //System.out.println(name + " has been leader " + leaderCount.getAndIncrement() + " time(s) before.");
        try {
            Thread.sleep(TimeUnit.SECONDS.toMillis(waitSeconds));
        } catch (InterruptedException e) {
            System.err.println(name + " was interrupted.");
            Thread.currentThread().interrupt();
        } finally {
            System.out.println(name + " relinquishing leadership.\n");
        }
    }

    @Override
    public void close() throws IOException {
        leaderSelector.close();
    }
}
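For reference, a minimal usage sketch of the two classes above (not part of the original example; the 30-second run time is arbitrary). It also shows the shutdown order the commented-out CloseableUtils line hints at: close the selector service first, then the client.
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.curator.utils.CloseableUtils;

public class LeaderElectionDemo {
    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "localhost:2181", new ExponentialBackoffRetry(1000, 3));
        LeaderSelectorService service =
                new LeaderSelectorService(client, "/leaderSelections", "LeaderElector");
        client.start();
        service.start();
        Thread.sleep(30000);                  // let the selector run for a while
        CloseableUtils.closeQuietly(service); // relinquish leadership first
        CloseableUtils.closeQuietly(client);  // then close the client
    }
}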
I have only one ZooKeeper instance, and I am using ZooKeeper 3.4.6 with curator-framework 4.0.0 and curator-recipes 4.0.0.
When I start the client, it connects to ZooKeeper, and in the log I can see the "State change: connected" message.
Then I wait 10 s and start the leader election, which repeatedly gives me the error below.
2017-09-06 09:34:22.727 INFO 1228 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Unable to read additional data from server sessionid 0x15e555a719d0000, likely server has closed socket, closing socket connection and attempting reconnect
2017-09-06 09:34:22.830 INFO 1228 --- [c-1-EventThread] o.a.c.f.state.ConnectionStateManager : State change: SUSPENDED
2017-09-06 09:34:23.302 INFO 1228 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-09-06 09:34:23.303 INFO 1228 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Socket connection established, initiating session, client: /127.0.0.1:49594, server: localhost/127.0.0.1:2181
2017-09-06 09:34:23.305 INFO 1228 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15e555a719d0000, negotiated timeout = 120000
2017-09-06 09:34:23.305 INFO 1228 --- [c-1-EventThread] o.a.c.f.state.ConnectionStateManager : State change: RECONNECTED
2017-09-06 09:34:23.310 WARN 1228 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Session 0x15e555a719d0000 for server localhost/127.0.0.1:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.8.0_131]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[na:1.8.0_131]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[na:1.8.0_131]
at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[na:1.8.0_131]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[na:1.8.0_131]
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:75) ~[zookeeper-3.5.3-beta.jar:3.5.3-beta-8ce24f9e675cbefffb8f21a47e06b42864475a60]
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:363) ~[zookeeper-3.5.3-beta.jar:3.5.3-beta-8ce24f9e675cbefffb8f21a47e06b42864475a60]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214) ~[zookeeper-3.5.3-beta.jar:3.5.3-beta-8ce24f9e675cbefffb8f21a47e06b42864475a60]
After some time, it started to give me the error message below.
org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) ~[zookeeper-3.5.3-beta.jar:3.5.3-beta-8ce24f9e675cbefffb8f21a47e06b42864475a60]
at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:831) [curator-framework-4.0.0.jar:4.0.0]
at org.apache.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:623) [curator-framework-4.0.0.jar:4.0.0]
at org.apache.curator.framework.imps.WatcherRemovalFacade.processBackgroundOperation(WatcherRemovalFacade.java:152) [curator-framework-4.0.0.jar:4.0.0]
at org.apache.curator.framework.imps.GetConfigBuilderImpl$2.processResult(GetConfigBuilderImpl.java:222) [curator-framework-4.0.0.jar:4.0.0]
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:590) [zookeeper-3.5.3-beta.jar:3.5.3-beta-8ce24f9e675cbefffb8f21a47e06b42864475a60]
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:499) [zookeeper-3.5.3-beta.jar:3.5.3-beta-8ce24f9e675cbefffb8f21a47e06b42864475a60]
2017-09-06 09:34:31.897 INFO 1228 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
2017-09-06 09:34:31.898 INFO 1228 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Socket connection established, initiating session, client: /127.0.0.1:49611, server: localhost/127.0.0.1:2181
2017-09-06 09:34:31.899 INFO 1228 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Session establishment complete on server localhost/127.0.0.1:2181, sessionid = 0x15e555a719d0000, negotiated timeout = 120000
2017-09-06 09:34:31.899 INFO 1228 --- [c-1-EventThread] o.a.c.f.state.ConnectionStateManager : State change: RECONNECTED
2017-09-06 09:34:31.907 WARN 1228 --- [localhost:2181)] org.apache.zookeeper.ClientCnxn : Session 0x15e555a719d0000 for server localhost/127.0.0.1:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Xid out of order. Got Xid 41 with err -6 expected Xid 40 for a packet with details: clientPath:/leaderSelections serverPath:/leaderSelections finished:false header:: 40,12 replyHeader:: 0,0,-4 request:: '/leaderSelections,F response:: v{}
at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:892) ~[zookeeper-3.5.3-beta.jar:3.5.3-beta-8ce24f9e675cbefffb8f21a47e06b42864475a60]
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101) ~[zookeeper-3.5.3-beta.jar:3.5.3-beta-8ce24f9e675cbefffb8f21a47e06b42864475a60]
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:363) ~[zookeeper-3.5.3-beta.jar:3.5.3-beta-8ce24f9e675cbefffb8f21a47e06b42864475a60]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214) ~[zookeeper-3.5.3-beta.jar:3.5.3-beta-8ce24f9e675cbefffb8f21a47e06b42864475a60]
I tried several solutions from the internet, but none succeeded. Does anybody know the root cause of this issue?

I have fixed this issue. There was a version mismatch between the ZooKeeper version and the Curator version: I was using Curator 4.0.0 against ZooKeeper 3.4.6, but according to the Apache Curator site, Curator 4.0.0 is compatible with ZooKeeper 3.5.x. (You can see the mismatch in the stack traces above: the client jar is zookeeper-3.5.3-beta.jar, pulled in transitively by Curator 4.0.0, while the server is 3.4.6.) I changed my Curator version to 2.8.0.
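The corresponding dependency change, as a sketch assuming a Maven build (Curator 2.x is the release line built against ZooKeeper 3.4.x):
<dependency>
    <groupId>org.apache.curator</groupId>
    <artifactId>curator-framework</artifactId>
    <version>2.8.0</version>
</dependency>
<dependency>
    <groupId>org.apache.curator</groupId>
    <artifactId>curator-recipes</artifactId>
    <version>2.8.0</version>
</dependency>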

Related

ActiveMQ in Eclipse to listen to remote broker

My goal is to listen to a broker at a specific IP address and port, subscribed to a given topic, and receive JMS messages from it with ActiveMQ in the Eclipse IDE.
Here is my code:
public static void main(String[] args) throws Exception {
    Connection connection = null;
    try {
        ConnectionFactory connectionFactory = new ActiveMQConnectionFactory("tcp://1.2.3.4:5678");
        connection = connectionFactory.createConnection();
        connection.start();
        Session session = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
        Topic topic = session.createTopic("MyTopic");
        // consumer subscribes to the topic
        MessageConsumer consumer1 = session.createConsumer(topic);
        consumer1.setMessageListener(new MessageConsumeListener("MyTopic"));
        System.out.print("connected");
        Thread.sleep(3000);
        /* session.close(); */
    } finally {
        if (connection != null) {
            connection.close();
        }
    }
}
This is the error:
Exception in thread "main" javax.jms.JMSException: Could not connect to broker URL: tcp://1.2.3.4.5678 Reason: java.net.ConnectException: Connection refused (Connection refused)
at org.apache.activemq.util.JMSExceptionSupport.create(JMSExceptionSupport.java:36)
at org.apache.activemq.ActiveMQConnectionFactory.createActiveMQConnection(ActiveMQConnectionFactory.java:360)
at org.apache.activemq.ActiveMQConnectionFactory.createActiveMQConnection(ActiveMQConnectionFactory.java:305)
at org.apache.activemq.ActiveMQConnectionFactory.createConnection(ActiveMQConnectionFactory.java:245)
at My.Project.subscribelistener.main(subscribelistener.java:29)
And also this:
Caused by: java.net.ConnectException: Connection refused (Connection refused)
at java.base/java.net.PlainSocketImpl.socketConnect(Native Method)
at java.base/java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:399)
at java.base/java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:242)
at java.base/java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:224)
at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.base/java.net.Socket.connect(Socket.java:609)
at org.apache.activemq.transport.tcp.TcpTransport.connect(TcpTransport.java:501)
at org.apache.activemq.transport.tcp.TcpTransport.doStart(TcpTransport.java:464)
at org.apache.activemq.util.ServiceSupport.start(ServiceSupport.java:55)
at org.apache.activemq.transport.AbstractInactivityMonitor.start(AbstractInactivityMonitor.java:168)
at org.apache.activemq.transport.InactivityMonitor.start(InactivityMonitor.java:50)
at org.apache.activemq.transport.TransportFilter.start(TransportFilter.java:58)
at org.apache.activemq.transport.WireFormatNegotiator.start(WireFormatNegotiator.java:72)
at org.apache.activemq.transport.TransportFilter.start(TransportFilter.java:58)
at org.apache.activemq.transport.TransportFilter.start(TransportFilter.java:58)
at org.apache.activemq.ActiveMQConnectionFactory.createActiveMQConnection(ActiveMQConnectionFactory.java:340)
... 3 more
I can also ping the IP address and with telnet I can connect to the port.
How can I connect to a remote broker?
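For what it's worth, a minimal diagnostic sketch (host and port are the placeholders from the code above): if a plain TCP connect from the same JVM fails while telnet succeeds, the difference is usually on the JVM side, for example proxy or network settings.
import java.net.InetSocketAddress;
import java.net.Socket;

public class BrokerProbe {
    public static void main(String[] args) throws Exception {
        // Placeholder host/port taken from the broker URL above; adjust as needed.
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress("1.2.3.4", 5678), 5000); // 5 s timeout
            System.out.println("TCP connect OK");
        }
    }
}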

Exactly once semantic with spring kafka

I'm trying to test my exactly-once configuration to make sure all the configs I set are correct and the behavior is as I expect.
I seem to encounter a problem with duplicate sends.
public static void main(String[] args) {
    MessageProducer producer = new ProducerBuilder()
            .setBootstrapServers("kafka:9992")
            .setKeySerializerClass(StringSerializer.class)
            .setValueSerializerClass(StringSerializer.class)
            .setProducerEnableIdempotence(true).build();

    MessageConsumer consumer = new ConsumerBuilder()
            .setBootstrapServers("kafka:9992")
            .setIsolationLevel("read_committed")
            .setTopics("someTopic2")
            .setGroupId("bla")
            .setKeyDeserializerClass(StringDeserializer.class)
            .setValueDeserializerClass(MapDeserializer.class)
            .setConsumerMessageLogic(new ConsumerMessageLogic() {
                @Override
                public void onMessage(ConsumerRecord cr, Acknowledgment acknowledgment) {
                    producer.sendMessage(new TopicPartition("someTopic2", cr.partition()),
                            new OffsetAndMetadata(cr.offset() + 1), "something1", "im in transaction", cr.key());
                    acknowledgment.acknowledge();
                }
            }).build();
    consumer.start();
}
this is my "test", you can assume the builder puts the right configuration.
ConsumerMessageLogic is a class that handles the "process" part of the read-process-write that the exactly once semantic is supporting
inside the producer class i have a send message method like so:
public void sendMessage(TopicPartition topicPartition, OffsetAndMetadata offsetAndMetadata, String sendToTopic, V message, PK partitionKey) {
    try {
        KafkaRecord<PK, V> partitionAndMessagePair = producerMessageLogic.prepareMessage(topicPartition.topic(), partitionKey, message);
        if (kafkaTemplate.getProducerFactory().transactionCapable()) {
            kafkaTemplate.executeInTransaction(operations -> {
                sendMessage(message, partitionKey, sendToTopic, partitionAndMessagePair, operations);
                operations.sendOffsetsToTransaction(
                        Map.of(topicPartition, offsetAndMetadata), "bla");
                return true;
            });
        } else {
            sendMessage(message, partitionKey, topicPartition.topic(), partitionAndMessagePair, kafkaTemplate);
        }
    } catch (Exception e) {
        failureHandler.onFailure(partitionKey, message, e);
    }
}
I create my consumer like so:
/**
 * Start the message consumer.
 * Each record event will be delegated to onMessage().
 */
public void start() {
    initConsumerMessageListenerContainer();
    container.start();
}

/**
 * Initialize the kafka message listener.
 */
private void initConsumerMessageListenerContainer() {
    // use an acknowledging message listener to allow manual commits
    messageListener = consumerMessageLogic::onMessage;
    // create and initialize the consumer container
    container = initContainer(messageListener);
    // sets the number of consumers; the topic partitions will be divided among the consumers
    container.setConcurrency(springConcurrency);
    springContainerPollTimeoutOpt.ifPresent(p -> container.getContainerProperties().setPollTimeout(p));
    if (springAckMode != null) {
        container.getContainerProperties().setAckMode(springAckMode);
    }
}

private ConcurrentMessageListenerContainer<PK, V> initContainer(AcknowledgingMessageListener<PK, V> messageListener) {
    return new ConcurrentMessageListenerContainer<>(
            consumerFactory(props),
            containerProperties(messageListener));
}
When I create my producer, I create it with a UUID as the transaction prefix, like so:
public ProducerFactory<PK, V> producerFactory(boolean isTransactional) {
    ProducerFactory<PK, V> res = new DefaultKafkaProducerFactory<>(props);
    if (isTransactional) {
        ((DefaultKafkaProducerFactory<PK, V>) res).setTransactionIdPrefix(UUID.randomUUID().toString());
        ((DefaultKafkaProducerFactory<PK, V>) res).setProducerPerConsumerPartition(true);
    }
    return res;
}
Now, after everything is set up, I bring up 2 instances on a topic with 2 partitions; each instance gets 1 partition of the consumed topic.
I send a message and, in instance A, wait in debug for the transaction timeout (to simulate loss of connection). Once the timeout passes, the other instance (instance B) automatically processes the record and sends it to the target topic, because a rebalance occurred.
So far so good.
Now when I release the breakpoint on instance A, it says it's rebalancing and couldn't commit, but I still see another output record in my destination topic.
My expectation was that instance A would not continue its work once I released the breakpoint, as the record was already processed.
Am I doing something wrong?
Can this scenario be achieved?
Edit 2:
After Gary's remarks about executeInTransaction, I get the duplicate record if I freeze one of the instances until the timeout and release it after the other instance has processed the record; the frozen instance then processes and produces the same record to the output topic...
public static void main(String[] args) {
    MessageProducer producer = new ProducerBuilder()
            .setBootstrapServers("kafka:9992")
            .setKeySerializerClass(StringSerializer.class)
            .setValueSerializerClass(StringSerializer.class)
            .setProducerEnableIdempotence(true).build();

    MessageConsumer consumer = new ConsumerBuilder()
            .setBootstrapServers("kafka:9992")
            .setIsolationLevel("read_committed")
            .setTopics("someTopic2")
            .setGroupId("bla")
            .setKeyDeserializerClass(StringDeserializer.class)
            .setValueDeserializerClass(MapDeserializer.class)
            .setConsumerMessageLogic(new ConsumerMessageLogic() {
                @Override
                public void onMessage(ConsumerRecord cr, Acknowledgment acknowledgment) {
                    producer.sendMessage("something1", "im in transaction");
                }
            }).build();
    consumer.start(producer.getProducerFactory());
}
The new sendMessage method in the producer, without executeInTransaction:
public void sendMessage(V message, PK partitionKey, String topicName) {
    try {
        KafkaRecord<PK, V> partitionAndMessagePair = producerMessageLogic.prepareMessage(topicName, partitionKey, message);
        sendMessage(message, partitionKey, topicName, partitionAndMessagePair, kafkaTemplate);
    } catch (Exception e) {
        failureHandler.onFailure(partitionKey, message, e);
    }
}
I also changed the consumer container creation to use a transaction manager with the same ProducerFactory, as suggested:
/**
 * Initialize the kafka message listener.
 */
private void initConsumerMessageListenerContainer(ProducerFactory<PK, V> producerFactory) {
    // use an acknowledging message listener to allow manual commits
    acknowledgingMessageListener = consumerMessageLogic::onMessage;
    // create and initialize the consumer container
    container = initContainer(acknowledgingMessageListener, producerFactory);
    // sets the number of consumers; the topic partitions will be divided among the consumers
    container.setConcurrency(springConcurrency);
    springContainerPollTimeoutOpt.ifPresent(p -> container.getContainerProperties().setPollTimeout(p));
    if (springAckMode != null) {
        container.getContainerProperties().setAckMode(springAckMode);
    }
}

private ConcurrentMessageListenerContainer<PK, V> initContainer(AcknowledgingMessageListener<PK, V> messageListener, ProducerFactory<PK, V> producerFactory) {
    return new ConcurrentMessageListenerContainer<>(
            consumerFactory(props),
            containerProperties(messageListener, producerFactory));
}

@NonNull
private ContainerProperties containerProperties(MessageListener<PK, V> messageListener, ProducerFactory<PK, V> producerFactory) {
    ContainerProperties containerProperties = new ContainerProperties(topics);
    containerProperties.setMessageListener(messageListener);
    containerProperties.setTransactionManager(new KafkaTransactionManager<>(producerFactory));
    return containerProperties;
}
My expectation is that the broker, once it receives the processed record from the frozen instance, will know that the record was already handled by another instance, as it contains the exact same metadata (or does it? I mean, the PID will be different, but should it be?).
Maybe the scenario I'm looking for is not even supported by the exactly-once support that Kafka and Spring currently provide...
If I have 2 instances of read-process-write, that means I have 2 producers with 2 different PIDs.
Now when I freeze one of the instances, and the unfrozen instance takes over processing the record due to a rebalance, it will send the record with its own PID and a sequence number in the metadata.
Now when I release the frozen instance, it sends the same record but with its own PID, so there's no way the broker will know it's a duplicate...
Am I wrong? How can I avoid this scenario? I thought the rebalance stops the instance and doesn't let it complete its processing (where it produces the duplicate record), because it no longer has responsibility for that record.
Adding the logs:
Frozen instance: you can see the freeze time at 10:53:34; I released it at 10:54:02 (the rebalance time is 10 secs).
2020-06-16 10:53:34,393 DEBUG [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.c.DefaultKafkaProducerFactory.debug:296] Created new Producer: CloseSafeProducer [delegate=org.apache.kafka.clients.producer.KafkaProducer@5c7f5906]
2020-06-16 10:53:34,394 DEBUG [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.c.DefaultKafkaProducerFactory.debug:296] CloseSafeProducer [delegate=org.apache.kafka.clients.producer.KafkaProducer@5c7f5906] beginTransaction()
2020-06-16 10:53:34,395 DEBUG [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.t.KafkaTransactionManager.doBegin:149] Created Kafka transaction on producer [CloseSafeProducer [delegate=org.apache.kafka.clients.producer.KafkaProducer@5c7f5906]]
2020-06-16 10:54:02,157 INFO [${sys:spring.application.name}] [kafka-coordinator-heartbeat-thread | bla] [o.a.k.c.c.i.AbstractCoordinator.:] [Consumer clientId=consumer-bla-1, groupId=bla] Group coordinator X.X.X.X:9992 (id: 2147482646 rack: null) is unavailable or invalid, will attempt rediscovery
2020-06-16 10:54:02,181 DEBUG [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.l.KafkaMessageListenerContainer$ListenerConsumer.debug:296] Sending offsets to transaction: {someTopic2-0=OffsetAndMetadata{offset=23, leaderEpoch=null, metadata=''}}
2020-06-16 10:54:02,189 INFO [${sys:spring.application.name}] [kafka-producer-network-thread | producer-b76e8aba-8149-48f8-857b-a19195f5a20abla.someTopic2.0] [i.i.k.s.p.SimpleSuccessHandler.:] Sent message=[im in transaction] with offset=[252] to topic something1
2020-06-16 10:54:02,193 INFO [${sys:spring.application.name}] [kafka-producer-network-thread | producer-b76e8aba-8149-48f8-857b-a19195f5a20abla.someTopic2.0] [o.a.k.c.p.i.TransactionManager.:] [Producer clientId=producer-b76e8aba-8149-48f8-857b-a19195f5a20abla.someTopic2.0, transactionalId=b76e8aba-8149-48f8-857b-a19195f5a20abla.someTopic2.0] Discovered group coordinator X.X.X.X:9992 (id: 1001 rack: null)
2020-06-16 10:54:02,263 INFO [${sys:spring.application.name}] [kafka-coordinator-heartbeat-thread | bla] [o.a.k.c.c.i.AbstractCoordinator.:] [Consumer clientId=consumer-bla-1, groupId=bla] Discovered group coordinator 192.168.144.1:9992 (id: 2147482646 rack: null)
2020-06-16 10:54:02,295 DEBUG [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.t.KafkaTransactionManager.processCommit:740] Initiating transaction commit
2020-06-16 10:54:02,296 DEBUG [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.c.DefaultKafkaProducerFactory.debug:296] CloseSafeProducer [delegate=org.apache.kafka.clients.producer.KafkaProducer@5c7f5906] commitTransaction()
2020-06-16 10:54:02,299 DEBUG [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.l.KafkaMessageListenerContainer$ListenerConsumer.debug:296] Commit list: {}
2020-06-16 10:54:02,301 INFO [${sys:spring.application.name}] [consumer-0-C-1] [o.a.k.c.c.i.AbstractCoordinator.:] [Consumer clientId=consumer-bla-1, groupId=bla] Attempt to heartbeat failed for since member id consumer-bla-1-b3ad1c09-ad06-4bc4-a891-47a2288a830f is not valid.
2020-06-16 10:54:02,302 INFO [${sys:spring.application.name}] [consumer-0-C-1] [o.a.k.c.c.i.ConsumerCoordinator.:] [Consumer clientId=consumer-bla-1, groupId=bla] Giving away all assigned partitions as lost since generation has been reset, indicating that consumer is no longer part of the group
2020-06-16 10:54:02,302 INFO [${sys:spring.application.name}] [consumer-0-C-1] [o.a.k.c.c.i.ConsumerCoordinator.:] [Consumer clientId=consumer-bla-1, groupId=bla] Lost previously assigned partitions someTopic2-0
2020-06-16 10:54:02,302 INFO [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.l.ConcurrentMessageListenerContainer.info:279] bla: partitions lost: [someTopic2-0]
2020-06-16 10:54:02,303 INFO [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.l.ConcurrentMessageListenerContainer.info:279] bla: partitions revoked: [someTopic2-0]
2020-06-16 10:54:02,303 DEBUG [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.l.KafkaMessageListenerContainer$ListenerConsumer.debug:296] Commit list: {}
The regular instance that takes over the partition and produces the record after the rebalance:
2020-06-16 10:53:46,536 DEBUG [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.c.DefaultKafkaProducerFactory.debug:296] Created new Producer: CloseSafeProducer [delegate=org.apache.kafka.clients.producer.KafkaProducer@26c76153]
2020-06-16 10:53:46,537 DEBUG [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.c.DefaultKafkaProducerFactory.debug:296] CloseSafeProducer [delegate=org.apache.kafka.clients.producer.KafkaProducer@26c76153] beginTransaction()
2020-06-16 10:53:46,539 DEBUG [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.t.KafkaTransactionManager.doBegin:149] Created Kafka transaction on producer [CloseSafeProducer [delegate=org.apache.kafka.clients.producer.KafkaProducer@26c76153]]
2020-06-16 10:53:46,556 DEBUG [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.l.KafkaMessageListenerContainer$ListenerConsumer.debug:296] Sending offsets to transaction: {someTopic2-0=OffsetAndMetadata{offset=23, leaderEpoch=null, metadata=''}}
2020-06-16 10:53:46,563 INFO [${sys:spring.application.name}] [kafka-producer-network-thread | producer-1d8e74d3-8986-4458-89b7-6d3e5756e213bla.someTopic2.0] [i.i.k.s.p.SimpleSuccessHandler.:] Sent message=[im in transaction] with offset=[250] to topic something1
2020-06-16 10:53:46,566 INFO [${sys:spring.application.name}] [kafka-producer-network-thread | producer-1d8e74d3-8986-4458-89b7-6d3e5756e213bla.someTopic2.0] [o.a.k.c.p.i.TransactionManager.:] [Producer clientId=producer-1d8e74d3-8986-4458-89b7-6d3e5756e213bla.someTopic2.0, transactionalId=1d8e74d3-8986-4458-89b7-6d3e5756e213bla.someTopic2.0] Discovered group coordinator X.X.X.X:9992 (id: 1001 rack: null)
2020-06-16 10:53:46,668 DEBUG [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.t.KafkaTransactionManager.processCommit:740] Initiating transaction commit
2020-06-16 10:53:46,669 DEBUG [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.c.DefaultKafkaProducerFactory.debug:296] CloseSafeProducer [delegate=org.apache.kafka.clients.producer.KafkaProducer@26c76153] commitTransaction()
2020-06-16 10:53:46,672 DEBUG [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.l.KafkaMessageListenerContainer$ListenerConsumer.debug:296] Commit list: {}
2020-06-16 10:53:51,673 DEBUG [${sys:spring.application.name}] [consumer-0-C-1] [o.s.k.l.KafkaMessageListenerContainer$ListenerConsumer.debug:296] Received: 0 records
I noticed they both note the exact same offset to commit:
Sending offsets to transaction: {someTopic2-0=OffsetAndMetadata{offset=23, leaderEpoch=null, metadata=''}}
I thought that when they tried to commit the exact same thing, the broker would abort one of the transactions...
I also noticed that if I reduce transaction.timeout.ms to just 2 seconds, it doesn't abort the transaction no matter how long I freeze the instance in debug...
Maybe the timer of transaction.timeout.ms starts only after I send the message?
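For reference, transaction.timeout.ms is a producer-side property; a sketch of where it would be set (the 2-second value is just the experiment described above, the other keys mirror the builder configuration, and the class name is illustrative):
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.producer.ProducerConfig;

public class TxTimeoutConfigSketch {
    public static Map<String, Object> producerProps() {
        Map<String, Object> props = new HashMap<>();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9992");
        props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true);
        // transaction.timeout.ms: how long the transaction coordinator waits for a
        // status update from this producer before proactively aborting its transaction.
        props.put(ProducerConfig.TRANSACTION_TIMEOUT_CONFIG, 2000);
        return props;
    }
}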
You must not use executeInTransaction at all - see its Javadocs; it is used when there is no active transaction or if you explicitly don't want an operation to participate in an existing transaction.
You need to add a KafkaTransactionManager to the listener container; it must have a reference to same ProducerFactory as the template.
Then, the container will start the transaction and, if successful, send the offset to the transaction.
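A minimal sketch of that arrangement (class and method names here are illustrative, not from the question's code): one ProducerFactory is shared by the KafkaTemplate and the container's KafkaTransactionManager, and the listener calls a plain send(), so both the send and the offset commit ride on the container-started transaction.
import java.util.Map;
import java.util.UUID;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.springframework.kafka.core.DefaultKafkaProducerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.core.ProducerFactory;
import org.springframework.kafka.listener.ContainerProperties;
import org.springframework.kafka.transaction.KafkaTransactionManager;

public class TransactionalWiringSketch {

    // One ProducerFactory, shared by the KafkaTemplate and the transaction manager.
    static ProducerFactory<String, String> producerFactory(Map<String, Object> props) {
        DefaultKafkaProducerFactory<String, String> pf = new DefaultKafkaProducerFactory<>(props);
        pf.setTransactionIdPrefix(UUID.randomUUID().toString()); // makes the factory transactional
        return pf;
    }

    static ContainerProperties containerProperties(ProducerFactory<String, String> pf, Object listener) {
        ContainerProperties cp = new ContainerProperties("someTopic2");
        cp.setMessageListener(listener);
        // The container begins a transaction for each record and, on success,
        // sends the consumed offsets to that same transaction before committing it.
        cp.setTransactionManager(new KafkaTransactionManager<>(pf));
        return cp;
    }

    // In the listener: a plain send(), no executeInTransaction(), no manual
    // sendOffsetsToTransaction(); the send participates in the container's transaction.
    static void onMessage(ConsumerRecord<String, String> cr, KafkaTemplate<String, String> template) {
        template.send("something1", cr.key(), "im in transaction");
    }
}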

app is alive but stops consuming messages after a while

The spring-kafka consumer stops consuming messages after a while. The stoppage happens every time, but never after the same duration. When the app is no longer consuming, at the end of the log I always see the statement that the consumer sent a LEAVE_GROUP signal. If I am not seeing any errors or exceptions, why is the consumer leaving the group?
org.springframework.boot:spring-boot-starter-parent:2.0.4.RELEASE
spring-kafka:2.1.8.RELEASE
org.apache.kafka:kafka-clients:1.0.2
I've set logging as
logging.level.org.apache.kafka=DEBUG
logging.level.org.springframework.kafka=INFO
other settings
spring.kafka.listener.concurrency=5
spring.kafka.listener.type=single
spring.kafka.listener.ack-mode=record
spring.kafka.listener.poll-timeout=10000
spring.kafka.consumer.heartbeat-interval=5000
spring.kafka.consumer.max-poll-records=50
spring.kafka.consumer.fetch-max-wait=10000
spring.kafka.consumer.enable-auto-commit=false
spring.kafka.consumer.properties.security.protocol=SSL
spring.kafka.consumer.retry.maxAttempts=3
spring.kafka.consumer.retry.backoffperiod.millisecs=2000
ContainerFactory setup
@Bean
public ConcurrentKafkaListenerContainerFactory<String, String> recordsKafkaListenerContainerFactory(RetryTemplate retryTemplate) {
    ConcurrentKafkaListenerContainerFactory<String, String> factory = new ConcurrentKafkaListenerContainerFactory<>();
    factory.setConsumerFactory(consumerFactory);
    factory.setConcurrency(listenerCount);
    factory.getContainerProperties().setAckMode(AbstractMessageListenerContainer.AckMode.RECORD);
    factory.getContainerProperties().setPollTimeout(pollTimeoutMillis);
    factory.getContainerProperties().setErrorHandler(new SeekToCurrentErrorHandler());
    factory.getContainerProperties().setAckOnError(false);
    factory.setRetryTemplate(retryTemplate);
    factory.setStatefulRetry(true);
    factory.getContainerProperties().setIdleEventInterval(60000L);
    return factory;
}
Listener configuration
@Component
public class RecordsEventListener implements ConsumerSeekAware {
    private static final org.slf4j.Logger LOG = org.slf4j.LoggerFactory.getLogger(RecordsEventListener.class);

    @Value("${mode.replay:false}")
    public void setModeReplay(boolean enabled) {
        this.isReplay = enabled;
    }

    @KafkaListener(topics = "${event.topic}", containerFactory = "RecordsKafkaListenerContainerFactory")
    public void handleEvent(@Payload String payload) throws RecordsEventListenerException {
        try {
            //business logic
        } catch (Exception e) {
            LOG.error("Process error for event: {}", payload, e);
            if (isRetryableException(e)) {
                LOG.warn("Retryable exception detected. Going to retry.");
                throw new RecordsEventListenerException(e);
            } else {
                LOG.warn("Dropping event because non retryable exception");
            }
        }
    }

    private Boolean isRetryableException(Exception e) {
        return binaryExceptionClassifier.classify(e);
    }

    @Override
    public void registerSeekCallback(ConsumerSeekCallback callback) {
        //do nothing
    }

    @Override
    public void onPartitionsAssigned(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
        //do this only once per start of app
        if (isReplay && !partitonSeekToBeginningDone) {
            assignments.forEach((t, p) -> callback.seekToBeginning(t.topic(), t.partition()));
            partitonSeekToBeginningDone = true;
        }
    }

    @Override
    public void onIdleContainer(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
        //do nothing
        LOG.info("Container is IDLE; no messages to pull.");
        assignments.forEach((t, p) -> LOG.info("Topic:{}, Partition:{}, Offset:{}", t.topic(), t.partition(), p));
    }

    boolean isPartitionSeekToBeginningDone() {
        return partitonSeekToBeginningDone;
    }

    void setPartitonSeekToBeginningDone(boolean partitonSeekToBeginningDone) {
        this.partitonSeekToBeginningDone = partitonSeekToBeginningDone;
    }
}
When the app is no longer consuming, at the end of the log I always see the statement that the consumer sent a LEAVE_GROUP signal:
2019-05-02 18:31:05.770 DEBUG 9548 --- [kafka-coordinator-heartbeat-thread | app] o.a.k.c.c.internals.AbstractCoordinator : [Consumer clientId=consumer-1, groupId=app] Sending Heartbeat request to coordinator x.x.x.com:9093 (id: 2147482638 rack: null)
2019-05-02 18:31:05.770 DEBUG 9548 --- [kafka-coordinator-heartbeat-thread | app] org.apache.kafka.clients.NetworkClient : [Consumer clientId=consumer-1, groupId=app] Using older server API v0 to send HEARTBEAT {group_id=app,generation_id=6,member_id=consumer-1-98d28e69-b0b9-4c2b-82cd-731e53b74b87} with correlation id 5347 to node 2147482638
2019-05-02 18:31:05.872 DEBUG 9548 --- [kafka-coordinator-heartbeat-thread | app] o.a.k.c.c.internals.AbstractCoordinator : [Consumer clientId=consumer-1, groupId=app] Received successful Heartbeat response
2019-05-02 18:31:10.856 DEBUG 9548 --- [kafka-coordinator-heartbeat-thread | app] o.a.k.c.c.internals.AbstractCoordinator : [Consumer clientId=consumer-1, groupId=app] Sending Heartbeat request to coordinator x.x.x.com:9093 (id: 2147482638 rack: null)
2019-05-02 18:31:10.857 DEBUG 9548 --- [kafka-coordinator-heartbeat-thread | app] org.apache.kafka.clients.NetworkClient : [Consumer clientId=consumer-1, groupId=app] Using older server API v0 to send HEARTBEAT {group_id=app,generation_id=6,member_id=consumer-1-98d28e69-b0b9-4c2b-82cd-731e53b74b87} with correlation id 5348 to node 2147482638
2019-05-02 18:31:10.958 DEBUG 9548 --- [kafka-coordinator-heartbeat-thread | app] o.a.k.c.c.internals.AbstractCoordinator : [Consumer clientId=consumer-1, groupId=app] Received successful Heartbeat response
2019-05-02 18:31:11.767 DEBUG 9548 --- [kafka-coordinator-heartbeat-thread | app] o.a.k.c.c.internals.AbstractCoordinator : [Consumer clientId=consumer-1, groupId=app] Sending LeaveGroup request to coordinator x.x.x.com:9093 (id: 2147482638 rack: null)
2019-05-02 18:31:11.767 DEBUG 9548 --- [kafka-coordinator-heartbeat-thread | app] org.apache.kafka.clients.NetworkClient : [Consumer clientId=consumer-1, groupId=app] Using older server API v0 to send LEAVE_GROUP {group_id=app,member_id=consumer-1-98d28e69-b0b9-4c2b-82cd-731e53b74b87} with correlation id 5349 to node 2147482638
2019-05-02 18:31:11.768 DEBUG 9548 --- [kafka-coordinator-heartbeat-thread | app] o.a.k.c.c.internals.AbstractCoordinator : [Consumer clientId=consumer-1, groupId=app] Disabling heartbeat thread
Thanks to all who replied. It turns out it was indeed the broker dropping the consumer on session timeout. The broker, a very old version (0.10.0.1), did not support the newer features outlined in KIP-62 that the spring-kafka version we used could take advantage of.
Since we could not dictate an upgrade to the broker or a change to the session timeout, we simply modified our processing logic so as to finish the task within the session timeout.
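For reference, with a pre-KIP-62 broker the practical knobs are the session timeout and the amount of work done per poll. In the spring.kafka property style used above, that could look like the sketch below (values are illustrative; the broker's group.max.session.timeout.ms caps what a client may request):
spring.kafka.consumer.properties.session.timeout.ms=30000
spring.kafka.consumer.max-poll-records=10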

java.net.SocketException: Connection reset since TLSv1.2 protocol

I'm no longer able to connect to api.softlayer.com.
I'm calling the REST API from a WebSphere Portal 8.5 application (Java 7) using Apache HttpClient (httpclient-4.5.3.jar).
The code is:
HttpClient client = HttpClientBuilder.create().build();
HttpResponse response = client.execute(request);
The error is
17:44:00.997 [main] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection request: [route: {s}->https://api.softlayer.com:443][total kept alive: 0; route allocated: 0 of 2; total allocated: 0 of 20]
17:44:01.044 [main] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection leased: [id: 0][route: {s}->https://api.softlayer.com:443][total kept alive: 0; route allocated: 1 of 2; total allocated: 1 of 20]
17:44:01.044 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Opening connection {s}->https://api.softlayer.com:443
17:44:01.060 [main] DEBUG o.a.h.i.c.DefaultHttpClientConnectionOperator - Connecting to api.softlayer.com/66.228.119.120:443
17:44:01.060 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Connecting socket to api.softlayer.com/66.228.119.120:443 with timeout 0
17:44:01.606 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Enabled protocols: [TLSv1]
17:44:01.606 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Enabled cipher suites:[TLS_EMPTY_RENEGOTIATION_INFO_SCSV, SSL_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256, SSL_ECDHE_RSA_WITH_AES_128_CBC_SHA256, SSL_RSA_WITH_AES_128_CBC_SHA256, SSL_ECDH_ECDSA_WITH_AES_128_CBC_SHA256, SSL_ECDH_RSA_WITH_AES_128_CBC_SHA256, SSL_DHE_RSA_WITH_AES_128_CBC_SHA256, SSL_DHE_DSS_WITH_AES_128_CBC_SHA256, SSL_ECDHE_ECDSA_WITH_AES_128_CBC_SHA, SSL_ECDHE_RSA_WITH_AES_128_CBC_SHA, SSL_RSA_WITH_AES_128_CBC_SHA, SSL_ECDH_ECDSA_WITH_AES_128_CBC_SHA, SSL_ECDH_RSA_WITH_AES_128_CBC_SHA, SSL_DHE_RSA_WITH_AES_128_CBC_SHA, SSL_DHE_DSS_WITH_AES_128_CBC_SHA, SSL_ECDHE_ECDSA_WITH_RC4_128_SHA, SSL_ECDHE_RSA_WITH_RC4_128_SHA, SSL_RSA_WITH_RC4_128_SHA, SSL_ECDH_ECDSA_WITH_RC4_128_SHA, SSL_ECDH_RSA_WITH_RC4_128_SHA, SSL_ECDHE_ECDSA_WITH_3DES_EDE_CBC_SHA, SSL_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA, SSL_RSA_WITH_3DES_EDE_CBC_SHA, SSL_ECDH_ECDSA_WITH_3DES_EDE_CBC_SHA, SSL_ECDH_RSA_WITH_3DES_EDE_CBC_SHA, SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA, SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA, SSL_RSA_WITH_RC4_128_MD5]
17:44:01.606 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Starting handshake
17:44:01.747 [main] DEBUG o.a.h.i.c.DefaultManagedHttpClientConnection - http-outgoing-0: Shutdown connection
17:44:01.747 [main] DEBUG o.a.h.impl.execchain.MainClientExec - Connection discarded
17:44:01.747 [main] DEBUG o.a.h.i.c.PoolingHttpClientConnectionManager - Connection released: [id: 0][route: {s}->https://api.softlayer.com:443][total kept alive: 0; route allocated: 0 of 2; total allocated: 0 of 20]
17:44:01.747 [main] INFO o.a.http.impl.execchain.RetryExec - I/O exception (java.net.SocketException) caught when processing request to {s}->https://api.softlayer.com:443: Connection reset
17:44:01.747 [main] DEBUG o.a.http.impl.execchain.RetryExec - Connection reset
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:207) ~[na:1.7.0]
at java.net.SocketInputStream.read(SocketInputStream.java:133) ~[na:1.7.0]
at com.ibm.jsse2.a.a(a.java:110) ~[na:7.0 build_20131216]
at com.ibm.jsse2.a.a(a.java:141) ~[na:7.0 build_20131216]
at com.ibm.jsse2.qc.a(qc.java:691) ~[na:7.0 build_20131216]
at com.ibm.jsse2.qc.h(qc.java:266) ~[na:7.0 build_20131216]
at com.ibm.jsse2.qc.a(qc.java:770) ~[na:7.0 build_20131216]
at com.ibm.jsse2.qc.startHandshake(qc.java:476) ~[na:7.0 build_20131216]
at org.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:396) ~[httpclient-4.5.3.jar:4.5.3]
at org.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:355) ~[httpclient-4.5.3.jar:4.5.3]
The solution is to enable TLSv1.2:
import javax.net.ssl.SSLContext;
import org.apache.http.client.HttpClient;
import org.apache.http.conn.ssl.NoopHostnameVerifier;
import org.apache.http.conn.ssl.SSLConnectionSocketFactory;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.ssl.SSLContexts;

public class HttpClientFactory {
    private static CloseableHttpClient client;

    public static HttpClient getHttpsClient() throws Exception {
        if (client != null) {
            return client;
        }
        SSLContext sslContext = SSLContexts.createDefault();
        SSLConnectionSocketFactory sslsf = new SSLConnectionSocketFactory(sslContext,
                new String[] { "TLSv1.2" }, null, new NoopHostnameVerifier());
        client = HttpClients.custom().setSSLSocketFactory(sslsf).build();
        return client;
    }
}
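Usage would then look something like this sketch (the URL is a placeholder):
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;

public class HttpClientFactoryDemo {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClientFactory.getHttpsClient();
        // Placeholder URL; the handshake now offers TLSv1.2 only.
        HttpResponse response = client.execute(new HttpGet("https://api.softlayer.com/"));
        System.out.println(response.getStatusLine());
    }
}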
SoftLayer's servers only accept TLSv1.2 connections, so you must make sure your code only makes connections with that protocol, which it seems your code is not doing:
17:44:01.606 [main] DEBUG o.a.h.c.s.SSLConnectionSocketFactory - Enabled protocols: [TLSv1]

Kafka consumer java code not working

I have just started with Kafka. I am able to produce and consume data through the command prompt, and also to produce data through Java code, even from a remote server.
But when I try this simple consumer Java code, it does not work.
public class Simpleconsumer {
    private final ConsumerConnector consumer;
    private final String topic;

    public Simpleconsumer(String topic) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "127.0.0.1:2181");
        props.put("group.id", "topic1");
        props.put("auto.offset.reset", "smallest");
        consumer = Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
        this.topic = topic;
    }

    public void testConsumer() {
        try {
            Map<String, Integer> topicCount = new HashMap<String, Integer>();
            topicCount.put(topic, 1);
            Map<String, List<KafkaStream<byte[], byte[]>>> consumerStreams = consumer.createMessageStreams(topicCount);
            List<KafkaStream<byte[], byte[]>> streams = consumerStreams.get(topic);
            System.out.println("start.......");
            for (final KafkaStream<byte[], byte[]> stream : streams) {
                ConsumerIterator<byte[], byte[]> it = stream.iterator();
                System.out.println("iterate.......");
                while (it.hasNext()) {
                    System.out.println("Message from Single Topic: " + new String(it.next().message()));
                }
            }
            System.out.println("end.......");
            if (consumer != null) {
                consumer.shutdown();
            }
        } catch (Exception e) {
            System.out.println(e);
        }
    }

    public static void main(String[] args) {
        // String topic = args[0];
        Simpleconsumer simpleHLConsumer = new Simpleconsumer("topic1");
        simpleHLConsumer.testConsumer();
    }
}
The output is:
log4j:WARN No appenders could be found for logger (kafka.utils.VerifiableProperties).
log4j:WARN Please initialize the log4j system properly.
start.......
iterate.......
There is no error; the program doesn't terminate or give any further output.
ZooKeeper log:
2016-02-18 17:31:31,790 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory#197] - Accepted socket connection from /127.0.0.1:33338
2016-02-18 17:31:31,793 [myid:] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer#868] - Client attempting to establish new session at /127.0.0.1:33338
2016-02-18 17:31:31,821 [myid:] - INFO [SyncThread:0:ZooKeeperServer#617] - Established session 0x152f4265b0b0009 with negotiated timeout 6000 for client /127.0.0.1:33338
2016-02-18 17:31:31,891 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor#645] - Got user-level KeeperException when processing sessionid:0x152f4265b0b0009 type:create cxid:0x1 zxid:0x718 txntype:-1 reqpath:n/a Error Path:/consumers Error:KeeperErrorCode = NodeExists for /consumers
2016-02-18 17:31:31,892 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor#645] - Got user-level KeeperException when processing sessionid:0x152f4265b0b0009 type:create cxid:0x2 zxid:0x719 txntype:-1 reqpath:n/a Error Path:/consumers/artinew Error:KeeperErrorCode = NodeExists for /consumers/artinew
2016-02-18 17:31:31,892 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor#645] - Got user-level KeeperException when processing sessionid:0x152f4265b0b0009 type:create cxid:0x3 zxid:0x71a txntype:-1 reqpath:n/a Error Path:/consumers/artinew/ids Error:KeeperErrorCode = NodeExists for /consumers/artinew/ids
2016-02-18 17:31:32,000 [myid:] - INFO [SessionTracker:ZooKeeperServer#347] - Expiring session 0x152f4265b0b0008, timeout of 6000ms exceeded
2016-02-18 17:31:32,000 [myid:] - INFO [ProcessThread(sid:0 cport:-1)::PrepRequestProcessor#494] - Processed session termination for sessionid: 0x152f4265b0b0008
2016-02-18 17:31:32,002 [myid:] - INFO [SyncThread:0:NIOServerCnxn#1007] - Closed socket connection for client /127.0.0.1:33337 which had sessionid 0x152f4265b0b0
I am getting this in the Kafka console in an infinite loop. Please explain:
[2016-02-17 20:50:08,594] INFO Closing socket connection to /xx.xx.xx.xx. (kafka.network.Processor)
[2016-02-17 20:50:08,174] INFO Closing socket connection to /xx.xx.xx.xx. (kafka.network.Processor)
[2016-02-17 20:50:08,385] INFO Closing socket connection to /xx.xx.xx.xx. (kafka.network.Processor)
[2016-02-17 20:50:08,760] INFO Closing socket connection to /xx.xx.xx.xx. (kafka.network.Processor)
I created the topic in the following manner:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 5 --topic topic1
I am able to consume it at the command prompt using:
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --from-beginning --topic topic1
I am not able to understand what the issue is.
Try localhost instead of 127.0.0.1 in the code to make sure local name resolution is working fine.
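As a sketch, the suggested change applied to the constructor above; the consumer.timeout.ms line is an optional debugging aid not mentioned in the answer (with the old high-level consumer it makes it.hasNext() throw a ConsumerTimeoutException instead of blocking forever when nothing arrives):
public Simpleconsumer(String topic) {
    Properties props = new Properties();
    props.put("zookeeper.connect", "localhost:2181"); // localhost instead of 127.0.0.1
    props.put("group.id", "topic1");
    props.put("auto.offset.reset", "smallest");
    // Optional while debugging: make it.hasNext() throw ConsumerTimeoutException
    // after 10 s of silence instead of blocking forever.
    props.put("consumer.timeout.ms", "10000");
    consumer = Consumer.createJavaConsumerConnector(new ConsumerConfig(props));
    this.topic = topic;
}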