Looking for a very simple EsperIO Kafka example

I'm looking for example code for the Esper CEP Kafka adapter. I've already installed Kafka and written data to a Kafka topic using a producer, and now I want to process it with Esper CEP. Unfortunately, the Esper documentation for the Kafka adapter is not very helpful. Does anyone have a very simple example?
Edit:
So far I have added an adapter and it seems to work. However, I don't know how to read from the adapter or how to link a CEP pattern to it. This is my code so far:
config.addImport(KafkaOutputDefault.class);
Properties props = new Properties();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringDeserializer.class.getName());
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringDeserializer.class.getName());
props.put(ConsumerConfig.GROUP_ID_CONFIG, "group.id");
props.put(EsperIOKafkaConfig.INPUT_SUBSCRIBER_CONFIG, EsperIOKafkaInputSubscriberByTopicList.class.getName());
props.put(EsperIOKafkaConfig.TOPICS_CONFIG, "test123");
props.put(EsperIOKafkaConfig.INPUT_PROCESSOR_CONFIG, EsperIOKafkaInputProcessorDefault.class.getName());
props.put(EsperIOKafkaConfig.INPUT_TIMESTAMPEXTRACTOR_CONFIG, EsperIOKafkaInputTimestampExtractorConsumerRecord.class.getName());
Configuration config2 = new Configuration();
config2.addPluginLoader("KafkaInput", EsperIOKafkaInputAdapterPlugin.class.getName(), props, null);
EsperIOKafkaInputAdapter adapter = new EsperIOKafkaInputAdapter(props, "default");
adapter.start();

I've had the same problem. I created a sample project you could have a look at, especially the plain-esper branch.
An even more simplified version would be:
public class KafkaExample implements Runnable {
private String runtimeURI;
public KafkaExample(String runtimeURI) {
this.runtimeURI = runtimeURI;
}
public static void main(String[] args){
new KafkaExample("KafkaExample").run();
}
@Override
public void run() {
Configuration configuration = new Configuration();
configuration.getCommon().addImport(KafkaOutputDefault.class);
configuration.getCommon().addEventType(String.class);
Properties consumerProps = new Properties();
// Kafka Consumer Properties
consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
consumerProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
consumerProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,StringDeserializer.class.getName());
consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, UUID.randomUUID().toString());
consumerProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, OffsetResetStrategy.EARLIEST.toString().toLowerCase());
// EsperIO Kafka Input Adapter Properties
consumerProps.put(EsperIOKafkaConfig.INPUT_SUBSCRIBER_CONFIG, Consumer.class.getName());
consumerProps.put(EsperIOKafkaConfig.INPUT_PROCESSOR_CONFIG, InputProcessor.class.getName());
consumerProps.put(EsperIOKafkaConfig.INPUT_TIMESTAMPEXTRACTOR_CONFIG, EsperIOKafkaInputTimestampExtractorConsumerRecord.class.getName());
configuration.getRuntime().addPluginLoader("KafkaInput", EsperIOKafkaInputAdapterPlugin.class.getName(), consumerProps, null);
String stmt = "@name('sampleQuery') select * from String";
EPCompiled compiled;
try {
compiled = EPCompilerProvider.getCompiler().compile(stmt, new CompilerArguments(configuration));
} catch (EPCompileException ex) {
throw new RuntimeException(ex);
}
EPRuntime runtime = EPRuntimeProvider.getRuntime(runtimeURI, configuration);
EPDeployment deployment;
try {
deployment = runtime.getDeploymentService().deploy(compiled, new DeploymentOptions().setDeploymentId(UUID.randomUUID().toString()));
} catch (EPDeployException ex) {
throw new RuntimeException(ex);
}
EPStatement statement = runtime.getDeploymentService().getStatement(deployment.getDeploymentId(), "sampleQuery");
statement.addListener((newData, oldData, sta, run) -> {
for (EventBean nd : newData) {
System.out.println(nd.getUnderlying());
}
});
while (true) {}
}
}
public class Consumer implements EsperIOKafkaInputSubscriber {
@Override
public void subscribe(EsperIOKafkaInputSubscriberContext context) {
Collection<String> collection = new ArrayList<String>();
collection.add("input");
context.getConsumer().subscribe(collection);
}
}
public class InputProcessor implements EsperIOKafkaInputProcessor {
private EPRuntime runtime;
@Override
public void init(EsperIOKafkaInputProcessorContext context) {
this.runtime = context.getRuntime();
}
@Override
public void process(ConsumerRecords<Object, Object> records) {
for (ConsumerRecord record : records) {
if (record.value() != null) {
try {
runtime.getEventService().sendEventBean(record.value().toString(), "String");
} catch (Exception e) {
throw e;
}
}
}
}
public void close() {}
}

Sample code follows. This code assumes there are already some messages in the topic. This does not loop and wait for more messages.
Properties consumerProps = new Properties();
consumerProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, ip);
consumerProps.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringDeserializer.class.getName());
consumerProps.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, org.apache.kafka.common.serialization.StringDeserializer.class.getName());
consumerProps.put(ConsumerConfig.GROUP_ID_CONFIG, "mygroup");
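KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps);
// Subscribe before polling; "test123" is the topic name used in the question
consumer.subscribe(java.util.Collections.singletonList("test123"));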
ConsumerRecords<String, String> rows = consumer.poll(1000);
Iterator<ConsumerRecord<String, String>> it = rows.iterator();
while (it.hasNext()) {
ConsumerRecord<String, String> row = it.next();
MyEvent event = new MyEvent(row.value()); // transform string to event
// process event
runtime.sendEvent(event);
}

Related

Why is windowing not working for Kafka Streams?

I am running a simple Kafka Streams program in Eclipse. It runs successfully, but the windowing does not behave as I expect.
I want to collect all the messages received within a 5-second window and send them to the output topic. I googled and understand that I need a tumbling window. However, I see that the output is sent to the output topic instantly.
What am I doing wrong here? Below is the main method that I am running:
public static void main(String[] args) throws Exception {
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "streams-wordcount");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0);
props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
final StreamsBuilder builder = new StreamsBuilder();
KStream<String, String> source = builder.stream("wc-input");
@SuppressWarnings("deprecation")
KTable<Windowed<String>, Long> counts = source
.flatMapValues(new ValueMapper<String, Iterable<String>>() {
@Override
public Iterable<String> apply(String value) {
return Arrays.asList(value.toLowerCase(Locale.getDefault()).split(" "));
}
})
.groupBy(new KeyValueMapper<String, String, String>() {
@Override
public String apply(String key, String value) {
return value;
}
})
.count(TimeWindows.of(10000L)
.until(10000L),"Counts");
// need to override value serde to Long type
counts.to("wc-output");
final Topology topology = builder.build();
final KafkaStreams streams = new KafkaStreams(topology, props);
final CountDownLatch latch = new CountDownLatch(1);
// attach shutdown handler to catch control-c
Runtime.getRuntime().addShutdownHook(new Thread("streams-wordcount-shutdown-hook") {
@Override
public void run() {
streams.close();
latch.countDown();
}
});
try {
streams.start();
long windowSizeMs = TimeUnit.MINUTES.toMillis(50000); // 5 * 60 * 1000L
TimeWindows.of(windowSizeMs);
TimeWindows.of(windowSizeMs).advanceBy(windowSizeMs);
latch.await();
} catch (Throwable e) {
System.exit(1);
}
System.exit(0);
}
Windowing does not mean "one output" per window. If you want to get only one output per window, you want to use suppress() on the result KTable.
Compare this article: https://www.confluent.io/blog/watermarks-tables-event-time-dataflow-model/
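The difference between forwarding every update and emitting one result per window can be illustrated without Kafka. The following is a minimal plain-Java sketch, not the Kafka Streams API: the class `TumblingWindowSketch` and its methods are hypothetical names invented for this illustration. It counts words in tumbling windows two ways. `emitEveryUpdate` produces a line for each incoming record, which resembles what the question observes with caching disabled. `emitFinalOnly` produces one line per word and window after all input is consumed, which is roughly the effect suppress() is meant to give.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is not the Kafka Streams API.
final class TumblingWindowSketch {

    // Start of the tumbling window that contains the given timestamp.
    static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    // Forwards every count update: one output line per incoming record.
    static List<String> emitEveryUpdate(long[] timestamps, String[] words, long sizeMs) {
        Map<String, Long> counts = new LinkedHashMap<>();
        List<String> out = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            String key = words[i] + "@" + windowStart(timestamps[i], sizeMs);
            long updated = counts.merge(key, 1L, Long::sum);
            out.add(key + "=" + updated);
        }
        return out;
    }

    // Emits one line per (word, window) only after all input has been seen.
    static List<String> emitFinalOnly(long[] timestamps, String[] words, long sizeMs) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (int i = 0; i < words.length; i++) {
            String key = words[i] + "@" + windowStart(timestamps[i], sizeMs);
            counts.merge(key, 1L, Long::sum);
        }
        List<String> out = new ArrayList<>();
        counts.forEach((key, count) -> out.add(key + "=" + count));
        return out;
    }
}
```

With three occurrences of the same word at 1 s, 2 s and 6 s and a 5-second window, the first method yields three lines and the second yields two, one per window.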

Kafka Consumer does not read data from Producer

My Kafka consumer doesn't read from my producer. I noticed that after the call to the poll method, the code does not execute the print of "hello", and no error message is shown.
The code runs without errors, but it is as if it stops after the poll method.
Note: my producer works well. I created a consumer to test it.
Code:
public class ConsumerApp {
public static void main(String[] args) {
// Create the property dictionary for the consumer config settings
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer<String, String> myconsumer = new KafkaConsumer<String, String>(props);
myconsumer.subscribe(Arrays.asList("test"));
try {
while (true) {
ConsumerRecords<String, String> records = myconsumer.poll(100);
System.out.println("hello");
// processing logic goes here
for (ConsumerRecord<String, String> record : records) {
// processing records
System.out.println(String.format("topic = %s, partition = %d, offset = %d, key = %s, value = %s",
record.topic(), record.partition(), record.offset(), record.key(), record.value()));
}
}
} catch (Exception e) {
e.printStackTrace();
} finally {
// Closing Consumer
myconsumer.close();
}
}
}
I found the solution: I hadn't set up a connection to the ZooKeeper server. Now that I have, my consumer reads the data. Here is the code:
public static void main(String[] args) {
// Create the property dictionary for the consumer config settings
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("zookeeper.connect", "localhost:2181");
props.put("group.id", "console");
props.put("zookeeper.session.timeout.ms", "500");
props.put("zookeeper.sync.timeout.ms", "500");
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
KafkaConsumer< String, String > myconsumer = new KafkaConsumer<String, String> (props);
myconsumer.subscribe(Collections.singletonList("test"));
try {
while(true){
ConsumerRecords<String, String> records = myconsumer.poll(100);
// processing logic goes here
for (ConsumerRecord<String, String> record : records) {
// processing records
System.out.printf("offset = %d, key = %s, value = %s\n",
record.offset(), record.key(), record.value());
}
}
} catch (Exception e) {
e.printStackTrace();
} finally {
// Closing Consumer
myconsumer.close();
}
}
}
A long time ago I played with this example and it worked well; try it:
Consumer:
package com.spnotes.kafka.simple;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.WakeupException;
import java.util.Arrays;
import java.util.Properties;
import java.util.Scanner;
/**
* Created by sunilpatil on 12/28/15.
*/
public class Consumer {
private static Scanner in;
public static void main(String[] argv)throws Exception{
if (argv.length != 2) {
System.err.printf("Usage: %s <topicName> <groupId>\n",
Consumer.class.getSimpleName());
System.exit(-1);
}
in = new Scanner(System.in);
String topicName = argv[0];
String groupId = argv[1];
ConsumerThread consumerRunnable = new ConsumerThread(topicName,groupId);
consumerRunnable.start();
String line = "";
while (!line.equals("exit")) {
line = in.next();
}
consumerRunnable.getKafkaConsumer().wakeup();
System.out.println("Stopping consumer .....");
consumerRunnable.join();
}
private static class ConsumerThread extends Thread{
private String topicName;
private String groupId;
private KafkaConsumer<String,String> kafkaConsumer;
public ConsumerThread(String topicName, String groupId){
this.topicName = topicName;
this.groupId = groupId;
}
public void run() {
Properties configProperties = new Properties();
configProperties.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
configProperties.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.ByteArrayDeserializer");
configProperties.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer");
configProperties.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
configProperties.put(ConsumerConfig.CLIENT_ID_CONFIG, "simple");
//Figure out where to start processing messages from
kafkaConsumer = new KafkaConsumer<String, String>(configProperties);
kafkaConsumer.subscribe(Arrays.asList(topicName));
//Start processing messages
try {
while (true) {
ConsumerRecords<String, String> records = kafkaConsumer.poll(100);
for (ConsumerRecord<String, String> record : records)
System.out.println(record.value());
}
}catch(WakeupException ex){
System.out.println("Exception caught " + ex.getMessage());
}finally{
kafkaConsumer.close();
System.out.println("After closing KafkaConsumer");
}
}
public KafkaConsumer<String,String> getKafkaConsumer(){
return this.kafkaConsumer;
}
}
}
Producer:
package com.spnotes.kafka.simple;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;
import java.util.Scanner;
/**
* Created by sunilpatil on 12/28/15.
*/
public class Producer {
private static Scanner in;
public static void main(String[] argv)throws Exception {
if (argv.length != 1) {
System.err.println("Please specify 1 parameters ");
System.exit(-1);
}
String topicName = argv[0];
in = new Scanner(System.in);
System.out.println("Enter message(type exit to quit)");
//Configure the Producer
Properties configProperties = new Properties();
configProperties.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,"localhost:9092");
configProperties.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,"org.apache.kafka.common.serialization.ByteArraySerializer");
configProperties.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,"org.apache.kafka.common.serialization.StringSerializer");
org.apache.kafka.clients.producer.Producer producer = new KafkaProducer(configProperties);
String line = in.nextLine();
while(!line.equals("exit")) {
//TODO: Make sure to use the ProducerRecord constructor that does not take a partition ID
ProducerRecord<String, String> rec = new ProducerRecord<String, String>(topicName,line);
producer.send(rec);
line = in.nextLine();
}
in.close();
producer.close();
}
}
You can find another nice example here: https://www.codenotfound.com/spring-kafka-consumer-producer-example.html

Write to a Kafka topic through Java code

I am trying to write to a Kafka topic from Java. I have already created the topic, and now I want to insert some data into it.
Thanks in advance.
Here's an example of a synchronous producer. It should work with Kafka 0.11 (and a few prior releases too):
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.serialization.LongSerializer;
import org.apache.kafka.common.serialization.StringSerializer;
import java.util.Properties;
public class MyKafkaProducer {
private final static String TOPIC = "my-example-topic";
private final static String BOOTSTRAP_SERVERS = "localhost:9092,localhost:9093,localhost:9094";
private static Producer<Long, String> createProducer() {
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, BOOTSTRAP_SERVERS);
props.put(ProducerConfig.CLIENT_ID_CONFIG, "MyKafkaProducer");
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, LongSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
return new KafkaProducer<>(props);
}
static void runProducer(final int sendMessageCount) throws Exception {
final Producer<Long, String> producer = createProducer();
try {
for (long index = 1; index <= sendMessageCount; index++) {
final ProducerRecord<Long, String> record = new ProducerRecord<>(TOPIC, index, "Message " + index);
RecordMetadata metadata = producer.send(record).get();
System.out.printf("sent record(key=%s value='%s')" + " metadata(partition=%d, offset=%d)\n",
record.key(), record.value(), metadata.partition(), metadata.offset());
}
} finally {
producer.flush();
producer.close();
}
}
public static void main(String[] args) throws Exception {
if (args.length == 0) {
runProducer(5);
} else {
runProducer(Integer.parseInt(args[0]));
}
}
}
You may need to modify some of the hard-coded settings.
Reference: http://cloudurable.com/blog/kafka-tutorial-kafka-producer/index.html

Using kafka streams to segregate messages

I have a setup where each Kafka message contains a "sender" field. All these messages are sent to a single topic.
Is there a way to segregate these messages on the consumer side? I would like a sender-specific consumer that reads only the messages pertaining to that sender.
Should I be using Kafka Streams to achieve this? I am new to Kafka Streams, so any advice or guidance would be helpful.
public class KafkaStreams3 {
public static void main(String[] args) throws JSONException {
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "kafkastreams1");
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
final Serde < String > stringSerde = Serdes.String();
Properties kafkaProperties = new Properties();
kafkaProperties.put("key.serializer",
"org.apache.kafka.common.serialization.StringSerializer");
kafkaProperties.put("value.serializer",
"org.apache.kafka.common.serialization.StringSerializer");
kafkaProperties.put("bootstrap.servers", "localhost:9092");
KafkaProducer<String, String> producer = new KafkaProducer<String, String>(kafkaProperties);
KStreamBuilder builder = new KStreamBuilder();
KStream<String, String> source = builder.stream(stringSerde, stringSerde, "topic1");
KStream<String, String> s1 = source.map(new KeyValueMapper<String, String, KeyValue<String, String>>() {
@Override
public KeyValue<String, String> apply(String dummy, String record) {
JSONObject jsonObject;
try {
jsonObject = new JSONObject(record);
return new KeyValue<String,String>(jsonObject.get("sender").toString(), record);
} catch (JSONException e) {
e.printStackTrace();
return new KeyValue<>(record, record);
}
}
});
s1.print();
s1.foreach(new ForeachAction<String, String>() {
@Override
public void apply(String key, String value) {
ProducerRecord<String, String> data1 = new ProducerRecord<String, String>(
key, key, value);
producer.send(data1);
}
});
KafkaStreams streams = new KafkaStreams(builder, props);
streams.start();
Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
@Override
public void run() {
streams.close();
producer.close();
}
}));
}
}
I believe the simplest way to achieve this is to use your "sender" field as the message key and have a single topic partitioned by "sender". This gives you locality and ordering per "sender", so you get a stronger ordering guarantee for each sender, and you can connect clients that consume from specific partitions.
Another possibility is to stream the messages from the initial topic into other topics, grouping by key, so that you end up with one topic per "sender".
Here's a fragment of code for a producer, followed by a stream, using JSON serializers and deserializers.
Producer:
private Properties kafkaClientProperties() {
Properties properties = new Properties();
final Serializer<JsonNode> jsonSerializer = new JsonSerializer();
properties.put("bootstrap.servers", config.getHost());
properties.put("client.id", clientId);
properties.put("key.serializer", StringSerializer.class);
properties.put("value.serializer", jsonSerializer.getClass());
return properties;
}
public Future<RecordMetadata> send(String topic, String key, Object instance) {
ObjectMapper objectMapper = new ObjectMapper();
JsonNode jsonNode = objectMapper.convertValue(instance, JsonNode.class);
return kafkaProducer.send(new ProducerRecord<>(topic, key,
jsonNode));
}
The stream:
log.info("loading kafka stream configuration");
final Serializer<JsonNode> jsonSerializer = new JsonSerializer();
final Deserializer<JsonNode> jsonDeserializer = new JsonDeserializer();
final Serde<JsonNode> jsonSerde = Serdes.serdeFrom(jsonSerializer, jsonDeserializer);
KStreamBuilder kStreamBuilder = new KStreamBuilder();
Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, config.getStreamEnrichProduce().getId());
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, hosts);
//stream from topic...
KStream<String, JsonNode> stockQuoteRawStream = kStreamBuilder.stream(Serdes.String(), jsonSerde , config.getStockQuote().getTopic());
Map<String, Map> exchanges = stockExchangeMaps.getExchanges();
ObjectMapper objectMapper = new ObjectMapper();
kafkaProducer.configure(config.getStreamEnrichProduce().getTopic());
// - enrich stockquote with stockdetails before producing to new topic
stockQuoteRawStream.foreach((key, jsonNode) -> {
StockQuote stockQuote = null;
StockDetail stockDetail;
try {
stockQuote = objectMapper.treeToValue(jsonNode, StockQuote.class);
} catch (JsonProcessingException e) {
e.printStackTrace();
}
JsonNode exchangeNode = jsonNode.get("exchange");
// get stockDetail that matches current quote being processed
Map<String, StockDetail> stockDetailMap = exchanges.get(exchangeNode.toString().replace("\"", ""));
stockDetail = stockDetailMap.get(key);
stockQuote.setStockDetail(stockDetail);
kafkaProducer.send(config.getStreamEnrichProduce().getTopic(), null, stockQuote);
});
return new KafkaStreams(kStreamBuilder, props);
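The first suggestion in this answer, keying messages by "sender" so that each sender lands on one partition, can be sketched in plain Java without a broker. This is only an illustration under stated assumptions: `SenderPartitioningSketch`, `partitionFor` and `route` are hypothetical names, and `String.hashCode()` stands in for the hash that Kafka's default partitioner applies to the serialized key (murmur2). The property it demonstrates is the same: equal keys always map to the same partition, so per-sender order is preserved within that partition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; Kafka's default partitioner uses murmur2
// on the serialized key, not String.hashCode().
final class SenderPartitioningSketch {

    // Deterministically maps a sender key to a partition index in [0, numPartitions).
    static int partitionFor(String sender, int numPartitions) {
        return Math.floorMod(sender.hashCode(), numPartitions);
    }

    // Routes {sender, payload} pairs to partitions, keeping arrival order inside each partition.
    static Map<Integer, List<String>> route(String[][] messages, int numPartitions) {
        Map<Integer, List<String>> partitions = new TreeMap<>();
        for (String[] message : messages) {
            String sender = message[0];
            String payload = message[1];
            partitions
                .computeIfAbsent(partitionFor(sender, numPartitions), p -> new ArrayList<>())
                .add(sender + ":" + payload);
        }
        return partitions;
    }
}
```

Note that several senders can share a partition when there are more senders than partitions, so a consumer assigned to one partition may still need to filter by key.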

Understanding Kafka ZooKeeper auto reset

I still have doubts about Kafka's ZOOKEPER_AUTO_RESET. I have seen a lot of questions asked about this, so kindly excuse me if this is a duplicate.
I have a high-level Java consumer that keeps consuming.
I have multiple topics, and each topic has a single partition.
My concern is the following.
I started consumerkafka.jar with the consumer group name “ncdev1” and ZOOKEPER_AUTO_RESET = smallest. I could observe that the init offset was set to -1. Then I stopped and restarted the jar after some time. This time it picked the latest offset assigned to the consumer group (ncdev1), i.e. 36. I restarted again after some time, and the init offset was set to 39, which is the latest value.
Then I changed the group name to ZOOKEPER_GROUP_ID = ncdev2 and restarted the jar file. This time the offset was again set to -1. On further restarts, it jumped to the latest value, i.e. 39.
Then I set ZOOKEPER_AUTO_RESET = largest and ZOOKEPER_GROUP_ID = ncdev3.
Then I tried restarting the jar file with the group name ncdev3. There was no difference in the way it picked the offset on restart: it picked 39, the same as with the previous configuration.
Any idea why it is not picking the offset from the beginning? Is there any other configuration needed to make it read from the beginning? (My understanding of largest and smallest comes from "What determines Kafka consumer offset?")
Thanks in advance.
Code added:
public class ConsumerForKafka {
private final ConsumerConnector consumer;
private final String topic;
private ExecutorService executor;
ServerSocket soketToWrite;
Socket s_Accept ;
OutputStream s1out ;
DataOutputStream dos;
static boolean logEnabled ;
static File fileName;
private static final Logger logger = Logger.getLogger(ConsumerForKafka.class);
public ConsumerForKafka(String a_zookeeper, String a_groupId, String a_topic,String session_timeout,String auto_reset,String a_commitEnable) {
consumer = kafka.consumer.Consumer.createJavaConsumerConnector(
createConsumerConfig(a_zookeeper, a_groupId,session_timeout,auto_reset,a_commitEnable));
this.topic =a_topic;
}
public void run(int a_numThreads,String a_zookeeper, String a_topic) throws InterruptedException, IOException {
Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
topicCountMap.put(topic, new Integer(a_numThreads));
Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap = consumer.createMessageStreams(topicCountMap);
String socketURL = PropertyUtils.getProperty("SOCKET_CONNECT_HOST");
int socketPort = Integer.parseInt(PropertyUtils.getProperty("SOCKET_CONNECT_PORT"));
Socket socks = new Socket(socketURL,socketPort);
//****
String keeper = a_zookeeper;
String topic = a_topic;
long millis = new java.util.Date().getTime();
//****
PrintWriter outWriter = new PrintWriter(socks.getOutputStream(), true);
List<KafkaStream<byte[], byte[]>> streams = null;
// now create an object to consume the messages
//
int threadNumber = 0;
// System.out.println("going to forTopic value is "+topic);
boolean keepRunningThread =false;
boolean chcek = false;
logger.info("logged");
BufferedWriter bw = null;
FileWriter fw = null;
if(logEnabled){
fw = new FileWriter(fileName, true);
bw = new BufferedWriter(fw);
}
for (;;) {
streams = consumerMap.get(topic);
keepRunningThread =true;
for (final KafkaStream stream : streams) {
ConsumerIterator<byte[], byte[]> it = stream.iterator();
while(keepRunningThread)
{
try{
if (it.hasNext()){
if(logEnabled){
String data = new String(it.next().message())+""+"\n";
bw.write(data);
bw.flush();
outWriter.print(data);
outWriter.flush();
consumer.commitOffsets();
logger.info("Explicit commit ......");
}else{
outWriter.print(new String(it.next().message())+""+"\n");
outWriter.flush();
}
}
// logger.info("running");
} catch(ConsumerTimeoutException ex) {
keepRunningThread =false;
break;
}catch(NullPointerException npe ){
keepRunningThread =true;
npe.printStackTrace();
}catch(IllegalStateException ile){
keepRunningThread =true;
ile.printStackTrace();
}
}
}
}
}
private static ConsumerConfig createConsumerConfig(String a_zookeeper, String a_groupId,String session_timeout,String auto_reset,String commitEnable) {
Properties props = new Properties();
props.put("zookeeper.connect", a_zookeeper);
props.put("group.id", a_groupId);
props.put("zookeeper.session.timeout.ms", session_timeout);
props.put("zookeeper.sync.time.ms", "2000");
props.put("auto.offset.reset", auto_reset);
props.put("auto.commit.interval.ms", "60000");
props.put("consumer.timeout.ms", "30");
props.put("auto.commit.enable",commitEnable);
//props.put("rebalance.max.retries", "4");
return new ConsumerConfig(props);
}
public static void main(String[] args) throws InterruptedException {
String zooKeeper = PropertyUtils.getProperty("ZOOKEEPER_URL_PORT");
String groupId = PropertyUtils.getProperty("ZOOKEPER_GROUP_ID");
String session_timeout = PropertyUtils.getProperty("ZOOKEPER_SESSION_TIMOUT_MS"); //6400
String auto_reset = PropertyUtils.getProperty("ZOOKEPER_AUTO_RESET"); //smallest
String enableLogging = PropertyUtils.getProperty("ENABLE_LOG");
String directoryPath = PropertyUtils.getProperty("LOG_DIRECTORY");
String log4jpath = PropertyUtils.getProperty("LOG_DIR");
String commitEnable = PropertyUtils.getProperty("ZOOKEPER_COMMIT"); //false
PropertyConfigurator.configure(log4jpath);
String socketURL = PropertyUtils.getProperty("SOCKET_CONNECT_HOST");
int socketPort = Integer.parseInt(PropertyUtils.getProperty("SOCKET_CONNECT_PORT"));
try {
Socket socks = new Socket(socketURL,socketPort);
boolean connected = socks.isConnected() && !socks.isClosed();
if(connected){
//System.out.println("Able to connect ");
}else{
logger.info("Not able to conenct to socket ..Exiting...");
System.exit(0);
}
} catch (UnknownHostException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
} catch(java.net.ConnectException cne){
logger.info("Not able to conenct to socket ..Exitring...");
System.exit(0);
}
catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
// String zooKeeper = args[0];
// String groupId = args[1];
String topic = args[0];
int threads = 1;
logEnabled = Boolean.parseBoolean(enableLogging);
if(logEnabled)
createDirectory(topic,directoryPath);
ConsumerForKafka example = new ConsumerForKafka(zooKeeper, groupId, topic, session_timeout,auto_reset,commitEnable);
try {
example.run(threads,zooKeeper,topic);
} catch(java.net.ConnectException cne){
cne.printStackTrace();
System.exit(0);
}
catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
private static void createDirectory(String topic,String d_Path) {
try{
File file = new File(d_Path);
if (!file.exists()) {
if (file.mkdir()) {
logger.info("Directory Created" +file.getPath());
} else {
logger.info("Directory Creation failed");
}
}
fileName = new File(d_Path + topic + ".log");
if (!fileName.exists()) {
fileName.createNewFile();
}
}catch(IOException IOE){
//logger.info("IOException occured during Directory or During File creation ");
}
}
}
After rereading your post carefully, I think what you ran into is the expected behavior.
I started the consumerkafka.jar with consumer group name as “ncdev1” and ZOOKEPER_AUTO_RESET = smallest . Could observe that init offset is set as -1. Then I stop/started the jar after sometime. At this time, it picks the latest offset assigned to the consumer group (ncdev1) ie 36.
auto.offset.reset only applies when there is no initial offset or when the offset is out of range. Since you only have 36 messages in the log, the consumer group can read all of those records very quickly, which is why you see the consumer group pick the latest offset every time it is restarted.
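The rule can be written down as a small decision function. The sketch below is plain Java and does not call any Kafka API; `OffsetResetSketch`, `resolveStartOffset` and the `-1` sentinel for "no committed offset" are assumptions made for illustration. It accepts both the old consumer's policy names (smallest, largest) and the newer ones (earliest, latest).

```java
// Hypothetical illustration of when auto.offset.reset takes effect; not Kafka code.
final class OffsetResetSketch {

    // Sentinel used here to mean "this group has no committed offset yet".
    static final long NO_COMMITTED_OFFSET = -1L;

    // Returns the offset a consumer would start from under the rule described above.
    static long resolveStartOffset(long committed, long logStart, long logEnd, String resetPolicy) {
        boolean missing = committed == NO_COMMITTED_OFFSET;
        boolean outOfRange = !missing && (committed < logStart || committed > logEnd);
        if (!missing && !outOfRange) {
            // A valid committed offset always wins; the reset policy is ignored.
            return committed;
        }
        boolean fromBeginning = "smallest".equals(resetPolicy) || "earliest".equals(resetPolicy);
        return fromBeginning ? logStart : logEnd;
    }
}
```

For example, `resolveStartOffset(-1, 0, 39, "smallest")` gives 0 for a brand-new group, while `resolveStartOffset(36, 0, 39, "smallest")` gives 36 because a committed offset already exists. To re-read from the beginning, the group needs to have no valid committed offset, for instance by using a new group ID.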