My producer can create a topic, but data doesn't seem to be stored inside the broker - apache-kafka

My producer can create a topic, but it doesn't seem to store any data inside a broker. I can check that the topic is created with kafka-topics script.
When I tried to consume with kafka-console-consumer, it doesn't consume anything. (I know --from-beginning.)
When I produced with kafka-console-producer, my consumer(kafka-console-consumer) can consume it right away. So there is something wrong with my java code.
And when I run my code with localhost:9092, it worked fine. And when I consume the topic with my consumer code, it was working properly. My producer works with Kafka server on my local machine but doesn't work with another Kafka server on remote machine.
Code :
//this code is inside the main method
Properties properties = new Properties();
//properties.put("bootstrap.servers", "localhost:9092");
//When I used localhost, my consumer code consumes it fine.
properties.put("bootstrap.servers", "192.168.0.30:9092");
properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
properties.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
KafkaProducer<String, String> kafkaProducer = new KafkaProducer<String, String>(properties);
ProducerRecord<String, String> record = new ProducerRecord<>("test5", "1111","jin1111");
//topc is created, but consumer can't consume any data.
//I tried putting different values for key and value parameters but no avail.
try {
kafkaProducer.send(record);
System.out.println("complete");
} catch (Exception e) {
e.printStackTrace();
} finally {
kafkaProducer.close();
System.out.println("closed");
}
/*//try{
for(int i = 0; i < 10000; i++){
System.out.println(i);
kafkaProducer.send(new ProducerRecord("test", Integer.toString(i), "message - " + i ));
}*/
My CLI (Putty) :
I want to see my consumer consuming when I run my java code. (Those data shown in the image are from the producer script.)
update
After reading answers and comments, this is what I've tried so far. Still not consuming any messages. I think message produced in this code is not stored in the broker. I tried with the different server, too. The same problem. Topic was created, but no consumer exists in the consumer group list and can't consume. And no data can be consumed with consumer script.
I also tried permission change. (chown) and tried with etc/hosts files. but no luck. I'll keep on trying until I solve this.
public static void main(String[] args){
Properties properties = new Properties();
//properties.put("bootstrap.servers", "localhost:9092");
properties.put("bootstrap.servers", "192.168.0.30:9092");
properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
properties.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
properties.put("linger.ms", "1");
properties.put("batch.size", "16384");
properties.put("request.timeout.ms", "30000");
KafkaProducer<String, String> kafkaProducer = new KafkaProducer<String, String>(properties);
ProducerRecord<String, String> record = new ProducerRecord<>("test5", "1111","jin1111");
System.out.println("1");
try {
kafkaProducer.send(record);
//kafkaProducer.send(record).get();
// implement Callback
System.out.println("complete");
kafkaProducer.flush();
System.out.println("flush completed");
} catch (Exception e) {
e.printStackTrace();
} finally {
kafkaProducer.flush();
System.out.println("another flush test");
kafkaProducer.close();
System.out.println("closed");
}
}
When I run this in Eclipse, the console shows :

To complete the ppatierno answer, you should call KafkaProducer.flush() before calling KafkaProducer.close(). This is a blocking call and will not return before all record got sent.
Yannick

My guess is that your main method exits and the application ends before the message is sent by the Kafka client.
The send method is not sync. The client buffers messages and send them after reaching a timeout named linger time (see linger.ms) or the buffer is filled to a specific size (see batch.size parameter for example). The default linger time is anyway 0.
So what your main method does is providing the message to the send method but then it exits and the underlying thread in the Kafka client isn't able to send the message.

I finally figured out. If you experienced similar problem, there are things you can do.
In your server.properties, uncomment these and put the ip and port.
(There seems to be a problem with the port, so I changed it.)
listeners=PLAINTEXT://192.168.0.30:9093
advertised.listeners=PLAINTEXT://192.168.0.30:9093
(Before restarting your broker with your changed server.properties, you might want to clean all existing log.dir. Try this, if nothing works)
Some other things you might want to consider :
change your log.dir. Usually the default path is tmp, but sometimes there is a noexec setting, so configure to a different location
check your etc/hosts
check your permission : And use chown and chmod
change zookeeper port and kafka port if necessary.
change broker.id
My working producer code :
public class Producer1 {
public static void main(String[] args){
Properties properties = new Properties();
properties.put("bootstrap.servers", "192.168.0.30:9093");
properties.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
properties.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
KafkaProducer<String, String> kafkaProducer = new KafkaProducer<String, String>(properties);
ProducerRecord<String, String> record = new ProducerRecord<>("test", "1","jin");
try {
kafkaProducer.send(record);
System.out.println("complete");
} catch (Exception e) {
e.printStackTrace();
} finally {
kafkaProducer.close();
System.out.println("closed");
}
}
}
working Consumer code:
public class Consumer1 {
public static void main(String[] args) {
Properties props = new Properties();
props.put("bootstrap.servers", "192.168.0.30:9093");
props.put("group.id", "jin");
props.put("auto.offset.reset", "earliest");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<String, String>(props);
consumer.subscribe(Collections.singletonList("test"));
try {
while (true) {
ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(1000));
for (ConsumerRecord<String, String> record : records){
System.out.printf("offset = %d, key = %s, value = %s", record.offset(), record.key(), record.value());
}
}
} catch (Exception e){
e.printStackTrace();
} finally {
consumer.close();
System.out.println("closed");
}
}
}
Console :

Related

How to run several instances of a kafka transactional producer on same broker with same transactionalId?

I am using Kafka Transactional producer to post atomically to 2 topics on a broker. My code looks similar to this:
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("transactional.id", "my-transactional-id");
Producer<String, String> producer = new KafkaProducer<>(props, new StringSerializer(), new StringSerializer());
producer.initTransactions();
try {
producer.beginTransaction();
for (int i = 0; i < 100; i++)
producer.send(new ProducerRecord<>("my-topic", Integer.toString(i), Integer.toString(i)));
producer.commitTransaction();
} catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
// We can't recover from these exceptions, so our only option is to close the producer and exit.
producer.close();
} catch (KafkaException e) {
// For all other exceptions, just abort the transaction and try again.
producer.abortTransaction();
}
producer.close();
Acc. to kafka docs and my understanding, initTransactions() has to be called only once per producer session, and it registers that producer instance to the broker with the specified transactional-id.
Now, in my case I need to deploy this code to several servers using same kafka broker.
Do I need different transactional Ids for each instance?
Is there any way to close initTransactions() once it is called so that it doesn't block other producers executing transactions as they have same transactional-id.
P.s. I don't want to close the producer and re-instantiate it after every sent transaction as this can impact performance, I believe. How can we implement an efficient solution to this problem?

Kafka transactional producer writes messages even if producer commit in not invoked

For me it seems like kafka transactional producer is behaving like a regular producer, the meesages are visible on the topic as send is called for each message. Maybe I am missing something basic. I was expecting the messages to appear in the topic only after the producer commit method is called. In my code below produce.commitTransactions() is commented out but I still get the messages in the topic. Thanks for any pointers.
public static void main(String[] args) {
try {
Properties producerConfig = new Properties();
producerConfig.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "...");
producerConfig.put(ProducerConfig.CLIENT_ID_CONFIG, "transactional-producer-1");
producerConfig.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, true); // enable idempotence
producerConfig.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "test-transactional-id-1"); // set transaction id
producerConfig.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
producerConfig.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
Producer<String, String> producer = new KafkaProducer<>(producerConfig);
producer.initTransactions(); //initiate transactions
try {
producer.beginTransaction(); //begin transactions
for (Integer i = 0; i < 1000; i++) {
producer.send(new ProducerRecord<String, String>("t_test", i.toString(), "value_" + i));
}
// producer.commitTransaction(); //commit
} catch (KafkaException e) {
// For all other exceptions, just abort the transaction and try again.
producer.abortTransaction();
}
producer.close();
} catch (Exception e) {
System.out.println(e.toString());
}
}
When it comes to transactions in Kafka you need to consider a Producer/Consumer pair. A Producer itself, as you have observed, is just producing data and either committing the transaction or not.
Only in interplay with a consumer you can "complete" a transaction by setting the KafkaConsumer configuration isolation.level set to read_committed (by default it is set to read_uncommitted). This configuration is described as:
isolation.level: Controls how to read messages written transactionally. If set to read_committed, consumer.poll() will only return transactional messages which have been committed. If set to read_uncommitted' (the default), consumer.poll() will return all messages, even transactional messages which have been aborted. Non-transactional messages will be returned unconditionally in either mode.

Getting the number of partitions in the producer in kafka 0.8.0 using partitionsFor method

Do we have a support for partitionsFor method in the producer in kafka version 0.8.0? I want to use this method to get the number of partitions given a kafka topic.
If this method is not available in kafka 0.8.0, what is the easiest way to get the number of partitions in the producer in this specific version of kafka?
You can use listTopics() method also
ArrayList<Topics> topicList = new ArrayList<Topics>();
Properties props = new Properties();
Map<String, List<PartitionInfo>> topics;
Topics topic;
InputStream input =
getClass().getClassLoader().getResourceAsStream("kafkaCluster.properties");
try {
props.load(input);
props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
props.getProperty("BOOTSTRAP_SERVERS_CONFIG"));
props.put("key.deserializer",
"org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer",
"org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> consumer = new KafkaConsumer<String,
String>(props);
topics = consumer.listTopics();
// System.out.println(topics.get(topics));
for (Map.Entry<String, List<PartitionInfo>> entry :
topics.entrySet()) {
System.out.println("Key = " + entry.getKey() + ", Value = " +
entry.getValue());
topic = new Topics();
topic.setTopic_name(entry.getKey());
topic.setPartitions(Integer.toString(entry.getValue().size()));
topicList.add(topic);
}
} catch (IOException e) {
e.printStackTrace();
}
Why don't you try following approach
https://stackoverflow.com/a/35458605/5922904.
Also ZkUtils have the method getPartitionsForTopics which can also be used. Although I have not tried and tested it myself

Kafka consumer does not start from latest message

I want to have a Kafka Consumer which starts from the latest message in a topic.
Here is the java code:
private static Properties properties = new Properties();
private static KafkaConsumer<String, String> consumer;
static
{
properties.setProperty("bootstrap.servers","localhost");
properties.setProperty("enable.auto.commit", "true");
properties.setProperty("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
properties.setProperty("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
properties.setProperty("group.id", "test");
properties.setProperty("auto.offset.reset", "latest");
consumer = new KafkaConsumer<>(properties);
consumer.subscribe(Collections.singletonList("mytopic"));
}
#Override
public StreamHandler call() throws Exception
{
while (true)
{
ConsumerRecords<String, String> consumerRecords = consumer.poll(200);
Iterable<ConsumerRecord<String, String>> records = consumerRecords.records("mytopic");
for(ConsumerRecord<String, String> rec : records)
{
System.out.println(rec.value());
}
}
}
Although the value for auto.offset.reset is latest, but the consumer starts form messages which belong to 2 days ago and then it catches up with the latest messages.
What am I missing?
Have you run this same code before with the same group.id? The auto.offset.reset parameter is only used if there is not an existing offset already stored for your consumer. So if you've run the example previously, say two days ago, and then you run it again, it will start from the last consumed position.
Use seekToEnd() if you would like to manually go to the end of the topic.
See https://stackoverflow.com/a/32392174/1392894 for a slightly more thorough discussion of this.
If you want to manually control the position of your offsets you need to set enable.auto.commit = false.
If you want to position all offsets to the end of each partition then call seekToEnd()
https://kafka.apache.org/0102/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#seekToEnd(java.util.Collection)

Kafka Consumer not getting invoked when the kafka Producer is set to Sync

I have a requirement where there are 2 topics to be maintained 1 with synchronous approach and other with an asynchronous way.
The asynchronous works as expected invoking the consumer record, however in the synchronous approach the consumer code is not getting invoked.
Below is the code declared in the config file
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9093");
props.put(ProducerConfig.RETRIES_CONFIG, 3);
props.put(ProducerConfig.BATCH_SIZE_CONFIG, 16384);
props.put(ProducerConfig.ACKS_CONFIG, "all");
props.put(ProducerConfig.LINGER_MS_CONFIG, 1);
props.put(ProducerConfig.BUFFER_MEMORY_CONFIG, 33554432);
I have enabled autoFlush true here
#Bean( name="KafkaPayloadSyncTemplate")
public KafkaTemplate<String, KafkaPayload> KafkaPayloadSyncTemplate() {
return new KafkaTemplate<String,KafkaPayload>(producerFactory(),true);
}
The control stops thereafter not making any calls to the consumer after returning the recordMetadataResults object
private List<RecordMetadata> sendPayloadToKafkaTopicInSync() throws InterruptedException, ExecutionException {
final List<RecordMetadata> recordMetadataResults = new ArrayList<RecordMetadata>();
KafkaPayload kafkaPayload = constructKafkaPayload();
ListenableFuture<SendResult<String,KafkaPayload>>
future = KafkaPayloadSyncTemplate.send(TestTopic, kafkaPayload);
SendResult<String, KafkaPayload> results;
results = future.get();
recordMetadataResults.add(results.getRecordMetadata());
return recordMetadataResults;
}
Consumer Code
public class KafkaTestListener {
#Autowired
TestServiceImpl TestServiceImpl;
public final CountDownLatch countDownLatch = new CountDownLatch(1);
#KafkaListener(id="POC", topics = "TestTopic", group = "TestGroup")
public void listen(ConsumerRecord<String,KafkaPayload> record, Acknowledgment acknowledgment) {
countDownLatch.countDown();
TestServiceImpl.consumeKafkaMessage(record);
System.out.println("Acknowledgment : " + acknowledgment);
acknowledgment.acknowledge();
}
}
Based on the issue, I have 2 questions
Should we manually call the listen() inside the Listener Class when its a Sync Producer. If Yes, How to do that ?
If the listener(#KafkaListener) get called automatically, what other setup/configurations do I need to add to make this working.
Thanks for the inputs in advance
-Srikant
You should be sure that you use consumerProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest"); for Consumer Properties.
Not sure what you mean about sync/async, but produce and consume are fully distinguished operations. And you can't affect consumer from your producer side. Because in between there is Kafka Broker.