kafka-apache flink execution log4j error - apache-kafka

I'm trying to run a simple Apache Flink script with Kafka integration, but I keep running into problems with the execution.
The script should read messages coming from a Kafka producer, process them, and then send the result of the processing back to another topic.
I got the example from here:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Simple-Flink-Kafka-Test-td4828.html
The error I have is:
Exception in thread "main" java.lang.NoSuchFieldError: ALL
at org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createJobGraph(StreamingJobGraphGenerator.java:86)
at org.apache.flink.streaming.api.graph.StreamGraph.getJobGraph(StreamGraph.java:429)
at org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:46)
at org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:33)
This is my code:
public class App {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties properties = new Properties();
        properties.setProperty("bootstrap.servers", "localhost:9092");
        //properties.setProperty("zookeeper.connect", "localhost:2181");
        properties.setProperty("group.id", "javaflink");

        DataStream<String> messageStream = env.addSource(new FlinkKafkaConsumer010<String>("test", new SimpleStringSchema(), properties));
        System.out.println("Step D");

        messageStream.map(new MapFunction<String, String>() {
            public String map(String value) throws Exception {
                // TODO Auto-generated method stub
                return "Blablabla " + value;
            }
        }).addSink(new FlinkKafkaProducer010("localhost:9092", "demo2", new SimpleStringSchema()));

        env.execute();
    }
}
These are the pom.xml dependencies:
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-core</artifactId>
<version>1.3.1</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-java_2.11</artifactId>
<version>0.10.2</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.11</artifactId>
<version>1.3.1</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-core</artifactId>
<version>0.9.1</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-java_2.11</artifactId>
<version>1.3.1</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-kafka-0.10_2.11</artifactId>
<version>1.3.1</version>
</dependency>
What could cause this kind of error?
Thanks
Luca

The problem is most likely caused by the mixture of different Flink versions you have defined in your pom.xml. In order to run this program, it should be enough to include the following dependencies:
<!-- Streaming API -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-java_2.11</artifactId>
<version>1.3.1</version>
</dependency>
<!-- In order to execute the program from within your IDE -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients_2.11</artifactId>
<version>1.3.1</version>
</dependency>
<!-- Kafka connector dependency -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-kafka-0.10_2.11</artifactId>
<version>1.3.1</version>
</dependency>

Related

Embedded mongodb to spring-boot application java exception

I need to set up embedded MongoDB in my Spring Boot project, but it shows endless error logs. Can someone help me?
I use these dependencies:
<!-- https://mvnrepository.com/artifact/de.flapdoodle.embed/de.flapdoodle.embed.mongo -->
<dependency>
<groupId>de.flapdoodle.embed</groupId>
<artifactId>de.flapdoodle.embed.mongo</artifactId>
<version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/cz.jirutka.spring/embedmongo-spring -->
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>bson</artifactId>
<version>3.8.0</version>
</dependency>
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>mongodb-driver</artifactId>
<version>3.8.0</version>
<exclusions>
<exclusion>
<groupId>org.mongodb</groupId> <!-- Exclude Project-E from Project-B -->
<artifactId>bson</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>mongodb-driver-core</artifactId>
<version>3.8.0</version>
</dependency>
<dependency>
<groupId>cz.jirutka.spring</groupId>
<artifactId>embedmongo-spring</artifactId>
<version>1.3.1</version>
</dependency>
And then I configure the MongoTemplate with this method in a configuration class:
private static final String MONGO_DB_URL = "localhost";
private static final String MONGO_DB_NAME = "embedded_db";

@Bean
public MongoTemplate mongoTemplate() throws IOException {
    EmbeddedMongoFactoryBean mongo = new EmbeddedMongoFactoryBean();
    mongo.setBindIp(MONGO_DB_URL);
    MongoClient mongoClient = (MongoClient) mongo.getObject();
    return new MongoTemplate(mongoClient, MONGO_DB_NAME);
}
But when I run my application it shows this error:
Exception in thread "main" 11:56:00.710 [Thread-0] DEBUG de.flapdoodle.embed.process.store.CachingArtifactStore - force delete for PRODUCTION:Windows:B64 and de.flapdoodle.embed.process.extract.ImmutableExtractedFileSet#545997b1
11:56:00.710 [Thread-1] DEBUG de.flapdoodle.embed.mongo.AbstractMongoProcess - try to stop mongod
java.lang.NoSuchMethodError: 'void com.mongodb.client.internal.MongoClientDelegate.<init>(com.mongodb.connection.Cluster, java.util.List, java.lang.Object)'
at com.mongodb.Mongo.<init>(Mongo.java:319)
at com.mongodb.Mongo.<init>(Mongo.java:291)
at com.mongodb.Mongo.<init>(Mongo.java:286)
at com.mongodb.Mongo.<init>(Mongo.java:282)
at com.mongodb.MongoClient.<init>(MongoClient.java:180)
at com.mongodb.MongoClient.<init>(MongoClient.java:155)
at com.mongodb.MongoClient.<init>(MongoClient.java:145)
at cz.jirutka.spring.embedmongo.EmbeddedMongoBuilder.build(EmbeddedMongoBuilder.java:104)
at cz.jirutka.spring.embedmongo.EmbeddedMongoFactoryBean.getObject(EmbeddedMongoFactoryBean.java:52)
at com.nextage.arcacrmconnector.commons.EmbeddedMongoDb.mongoTemplate(EmbeddedMongoDb.java:20)
at com.nextage.arcacrmconnector.commons.MongoTemplateSingleton.setMongoTemplate(MongoTemplateSingleton.java:20)
at com.nextage.arcacrmconnector.commons.MongoTemplateSingleton.getMongoTemplate(MongoTemplateSingleton.java:13)
at com.nextage.arcacrmconnector.services.CommonMongoService.<init>(CommonMongoService.java:12)
at com.nextage.arcacrmconnector.services.LogService.<init>(LogService.java:18)
at com.nextage.arcacrmconnector.consumer.QueueConsumerTimerTask.<init>(QueueConsumerTimerTask.java:23)
at com.nextage.arcacrmconnector.application.Application.<clinit>(Application.java:30)
11:56:00.716 [Thread-0] WARN de.flapdoodle.embed.process.io.file.Files - could not delete C:\Users\DONATE~1\AppData\Local\Temp\extract-70ac2cd1-bb5b-4276-9243-cdf6b52db3famongod.exe. Will try to delete it again when program exits.
Exception in thread "Thread-0" java.lang.IllegalStateException: Shutdown in progress
at java.base/java.lang.ApplicationShutdownHooks.add(ApplicationShutdownHooks.java:66)
at java.base/java.lang.Runtime.addShutdownHook(Runtime.java:213)
at de.flapdoodle.embed.process.io.file.FileCleaner.forceDeleteOnExit(FileCleaner.java:51)
at de.flapdoodle.embed.process.io.file.Files.forceDelete(Files.java:128)
at de.flapdoodle.embed.process.extract.ExtractedFileSets.delete(ExtractedFileSets.java:77)
at de.flapdoodle.embed.process.store.ArtifactStore.removeFileSet(ArtifactStore.java:90)
at de.flapdoodle.embed.process.store.CachingArtifactStore$FilesWithCounter.forceDelete(CachingArtifactStore.java:176)
at de.flapdoodle.embed.process.store.CachingArtifactStore.removeAll(CachingArtifactStore.java:100)
at de.flapdoodle.embed.process.store.CachingArtifactStore$CacheCleaner.run(CachingArtifactStore.java:196)
at java.base/java.lang.Thread.run(Thread.java:830)
How can I fix it? Is it a dependency version error?
Please try this temporary workaround: put the following in your Mongo configuration class, and make sure it executes before the embedded MongoDB is started, so that the stale extract files left in the temp directory are deleted first.
String tempFile = System.getenv("temp") + File.separator + "extract-" + System.getenv("USERNAME") + "-extractmongod";
String executable;
if (System.getenv("OS") != null && System.getenv("OS").contains("Windows")) {
    executable = tempFile + ".exe";
} else {
    executable = tempFile + ".sh";
}
Files.deleteIfExists(new File(executable).toPath());
Files.deleteIfExists(new File(tempFile + ".pid").toPath());
The issue is tracked here: https://github.com/flapdoodle-oss/de.flapdoodle.embed.mongo/issues/171
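For context, a minimal sketch of how that workaround can be wired into the configuration class from the question so it runs before the embedded Mongo bean is created. This is a sketch only; the class name and the cleanUpStaleMongoExtracts() helper are illustrative and not part of the original answer.
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

import com.mongodb.MongoClient;
import cz.jirutka.spring.embedmongo.EmbeddedMongoFactoryBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.mongodb.core.MongoTemplate;

@Configuration
public class EmbeddedMongoConfig {

    private static final String MONGO_DB_URL = "localhost";
    private static final String MONGO_DB_NAME = "embedded_db";

    @Bean
    public MongoTemplate mongoTemplate() throws IOException {
        // Delete leftover extractmongod files before starting the embedded instance.
        cleanUpStaleMongoExtracts();
        EmbeddedMongoFactoryBean mongo = new EmbeddedMongoFactoryBean();
        mongo.setBindIp(MONGO_DB_URL);
        MongoClient mongoClient = (MongoClient) mongo.getObject();
        return new MongoTemplate(mongoClient, MONGO_DB_NAME);
    }

    // Hypothetical helper wrapping the deletion snippet from the workaround above.
    private void cleanUpStaleMongoExtracts() throws IOException {
        String tempFile = System.getenv("temp") + File.separator + "extract-" + System.getenv("USERNAME") + "-extractmongod";
        String executable = System.getenv("OS") != null && System.getenv("OS").contains("Windows")
                ? tempFile + ".exe" : tempFile + ".sh";
        Files.deleteIfExists(new File(executable).toPath());
        Files.deleteIfExists(new File(tempFile + ".pid").toPath());
    }
}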

Kafka-spark Streaming processing jobs synchronically

I'm trying a simple test where I use Kafka Connect and Spark.
I wrote a custom Kafka Connect source that creates this source record:
SourceRecord sr = new SourceRecord(null,
null,
destTopic,
Schema.STRING_SCHEMA,
cleanPath);
In Spark I receive the message like this:
val kafkaConsumerParams = Map[String, String](
  "metadata.broker.list" -> prop.getProperty("kafka_host"),
  "zookeeper.connect" -> prop.getProperty("zookeeper_host"),
  "group.id" -> prop.getProperty("kafka_group_id"),
  "schema.registry.url" -> prop.getProperty("schema_registry_url"),
  "auto.offset.reset" -> prop.getProperty("auto_offset_reset")
)
val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaConsumerParams, topicsSet)
val ds = messages.foreachRDD(rdd => {
  val toPrint = rdd.map(t => {
    val file_path = t._2
    val startTime = DateTime.now()
    Thread.sleep(1000 * 60)
    1
  }).sum()
  LogUtils.getLogger(classOf[DeviceManager]).info(" toPrint = " + toPrint + " (number of flows calculated)")
})
}
When I use the connector to send multiple messages to the desired topic (in my test it had 6 partitions), the sleep thread gets all the messages but processes them synchronously instead of asynchronously.
When I create a simple test producer, the sleeps are done asynchronously.
I also created 2 simple consumers and tried both the connector and a producer; both tasks were consumed asynchronously, which means the problem lies in the way Spark receives the messages sent from the connector.
I can't figure out why the tasks don't behave the same way as they do when I send them from a producer.
I even printed the records Spark receives, and they are exactly the same.
Producer-sent records:
1: {partition=2, offset=11, value=something, key=null}
2: {partition=5, offset=9, value=something2, key=null}
Connect-sent record:
1: {partition=3, offset=9, value=something, key=null}
The versions used in my project are:
<scala.version>2.11.7</scala.version>
<confluent.version>4.0.0</confluent.version>
<kafka.version>1.0.0</kafka.version>
<java.version>1.8</java.version>
<spark.version>2.0.0</spark.version>
dependencies
<dependency>
<groupId>io.confluent</groupId>
<artifactId>kafka-avro-serializer</artifactId>
<version>${confluent.version}</version>
</dependency>
<dependency>
<groupId>io.confluent</groupId>
<artifactId>kafka-schema-registry-client</artifactId>
<version>${confluent.version}</version>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
<version>1.8.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka_2.11</artifactId>
<version>1.6.3</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-graphx_2.11</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.11</artifactId>
<version>2.0.0-RC1</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>2.8.0</version>
</dependency>
<dependency>
<groupId>io.confluent</groupId>
<artifactId>kafka-avro-serializer</artifactId>
<version>${confluent.version}</version>
<scope>${global.scope}</scope>
</dependency>
<dependency>
<groupId>io.confluent</groupId>
<artifactId>kafka-connect-avro-converter</artifactId>
<version>${confluent.version}</version>
<scope>${global.scope}</scope>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>connect-api</artifactId>
<version>${kafka.version}</version>
</dependency>
We cannot run Spark-Kafka streaming jobs asynchronously, but we can run them in parallel, the same way multiple Kafka consumers work. For that, we need to set the following configuration in SparkConf():
sparkConf.set("spark.streaming.concurrentJobs", "4")
By default its value is "1", but it can be overridden with a higher value.
I hope this helps!
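For illustration, a minimal sketch of where that setting goes when the streaming context is built. It is written in Java against the Spark Streaming API (the question's code is Scala, but the property is identical), and the app name, master and batch interval here are arbitrary placeholders.
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class ConcurrentJobsExample {
    public static void main(String[] args) throws InterruptedException {
        SparkConf sparkConf = new SparkConf()
                .setAppName("concurrent-jobs-example")
                .setMaster("local[*]");
        // Allow up to 4 streaming output jobs to run at the same time (default is 1).
        sparkConf.set("spark.streaming.concurrentJobs", "4");

        // The setting must be in place before the StreamingContext is created.
        JavaStreamingContext ssc = new JavaStreamingContext(sparkConf, Durations.seconds(5));

        // ... define the Kafka direct stream and the foreachRDD logic here,
        // then start the context as usual:
        ssc.start();
        ssc.awaitTermination();
    }
}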

MongoDB Scala - query document for a specific field value

So I know that in the Mongo shell you use dot notation to get the field you want in any document.
How is dot notation achieved with the MongoDB Scala driver? I'm confused about how it works. Here is the code that fetches a document from a collection:
val record = collection.find().projection(fields(include("offset"), excludeId())).limit(1)
EDIT:
I'm trying to build a mechanism to re-consume Kafka records from the point where the consumer was shut down. To do this, I store my Kafka records in an external database and then try to fetch the most recent offset from there and start consuming from that point. Here is my Scala method that should do that:
def getLatestCommitOffsetFromDB(collectionName: String): Long = {
  import com.mongodb.Block
  import org.bson.Document

  val printBlock = new Block[Document]() {
    override def apply(document: Document): Unit = {
      println(document.toJson)
    }
  }

  import com.mongodb.async.SingleResultCallback
  val callbackWhenFinished = new SingleResultCallback[Void]() {
    override def onResult(result: Void, t: Throwable): Unit = {
      System.out.println("Latest offset fetched from database.")
    }
  }

  var obj: String = " "
  try {
    val record = collection.find().projection(fields(include("offset"), excludeId())).limit(1)
    //TODO FIND A WAY TO GET THE VALUE AND STORE IT IN A VARIABLE
  } catch {
    case e: RuntimeException =>
      logger.error(s"MongoDB Server Error : Unable to fetch data from collection : $collection")
      logger.error(e.printStackTrace().toString())
  }
  obj.toLong
}
The problem isn't fetching documents from Mongo; it's accessing a particular field of the document. The document has four fields: topic, partition, message, and offset. I want to get the "offset" field and store it in a variable, so I can use it as the restarting point for re-consuming Kafka records.
Where do I go from there?
POM.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>OffsetManagementPoC</groupId>
<artifactId>OffsetManagementPoC</artifactId>
<version>1.0-SNAPSHOT</version>
<dependencies>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>1.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka_2.12</artifactId>
<version>1.0.0</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-compiler</artifactId>
<version>2.11.8</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.25</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-streams</artifactId>
<version>0.10.0.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kafka-0-10 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-core -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<version>2.6.5</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.6.5</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-annotations -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
<version>2.6.5</version>
</dependency>
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>casbah_2.12</artifactId>
<version>3.1.1</version>
<type>pom</type>
</dependency>
<dependency>
<groupId>com.typesafe</groupId>
<artifactId>config</artifactId>
<version>1.2.1</version>
</dependency>
<dependency>
<groupId>org.mongodb.scala</groupId>
<artifactId>mongo-scala-driver_2.12</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-compiler</artifactId>
<version>2.11.8</version>
</dependency>
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>mongo-java-driver</artifactId>
<version>3.4.2</version>
</dependency>
<dependency>
<groupId>org.mongodb.scala</groupId>
<artifactId>mongo-scala-driver_2.11</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>bson</artifactId>
<version>3.3.0</version>
</dependency>
<dependency>
<groupId>org.mongodb</groupId>
<artifactId>mongodb-driver-async</artifactId>
<version>3.4.3</version>
</dependency>
<dependency>
<groupId>org.mongodb.scala</groupId>
<artifactId>mongo-scala-bson_2.11</artifactId>
<version>2.1.0</version>
</dependency>
</dependencies>
</project>
You can modify your query this way:
import com.mongodb.MongoClient
import com.mongodb.client.MongoCollection
import com.mongodb.client.model.Projections
def getLatestCommitOffsetFromDB(
databaseName: String,
collectionName: String
): Long = {
val mongoClient = new MongoClient("localhost", 27017);
val collection =
mongoClient.getDatabase(databaseName).getCollection(collectionName)
val record = collection
.find()
.projection(
Projections
.fields(Projections.include("offset"), Projections.excludeId()))
.first
record.get("offset").asInstanceOf[Double].toLong
}
I think you were missing the com.mongodb.client.model.Projections imports in order to use fields, include and excludeId
I used first instead of limit(1) to make it easier to extract the result.
first returns a Document object on which you can call get to retrieve the value of the requested field.
But in fact, since you just want one record and one field, you can remove the projection!:
val record = collection.find().first
According to the documentation, collection.find() accepts a com.mongodb.DBObject
One of the implementations of that interface that you can use is BasicDBObject which is basically like a mutable.Map[String, Object]. You can use the constructor which accepts a map like:
import scala.collection.JavaConverters._

val query = new com.mongodb.BasicDBObject(Map(
  "foo.bar" -> "value1",
  "bar.foo" -> "value2"
).asJava)
val record = collection.find(query)....
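Putting the pieces together, here is a minimal sketch of reading the offset value out of the returned Document, written in Java against the same sync driver API used above. The null check and the Number cast are assumptions, since the BSON type the offset was stored with is not shown in the question.
import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Projections;
import org.bson.Document;

public class OffsetReader {
    public static long getLatestCommitOffset(String databaseName, String collectionName) {
        MongoClient mongoClient = new MongoClient("localhost", 27017);
        try {
            MongoCollection<Document> collection =
                    mongoClient.getDatabase(databaseName).getCollection(collectionName);
            Document doc = collection.find()
                    .projection(Projections.fields(Projections.include("offset"), Projections.excludeId()))
                    .first();
            if (doc == null || doc.get("offset") == null) {
                return 0L; // nothing stored yet; pick whatever default suits your consumer
            }
            // The offset may have been stored as Int32, Int64 or Double; Number covers all three.
            return ((Number) doc.get("offset")).longValue();
        } finally {
            mongoClient.close();
        }
    }
}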

java.lang.ClassNotFoundException: io.jsonwebtoken.Jwts when using JJWT JSON Web Token

When I try to use JJWT from Stormpath, it throws a runtime exception, java.lang.ClassNotFoundException: io.jsonwebtoken.Jwts. I am using Jersey 2 embedded on GlassFish 4.1; here is the code that throws the exception:
private String issueToken(String login) {
    Key key = keyGenerator.generateKey();
    //Key key = MacProvider.generateKey();
    String jwtToken = Jwts.builder()
            .setIssuer(uriInfo.getAbsolutePath().toString())
            //.setIssuer("http://trustyapp.com/")
            .setSubject(login)
            .setIssuedAt(new Date())
            .setExpiration(toDate(LocalDateTime.now().plusMinutes(15L)))
            .signWith(SignatureAlgorithm.HS512, key)
            .compact();
    logger.info("#### generating token for a key : " + jwtToken + " - " + key);
    return jwtToken;
}
I have imported io.jsonwebtoken.Jwts and my pom.xml has :
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
<version>2.8.2</version>
<scope>compile</scope>
</dependency>
I also tried it without the above dependency, in case the dependency below, which is in my pom.xml, is enough:
<dependency>
<groupId>io.jsonwebtoken</groupId>
<artifactId>jjwt</artifactId>
<version>0.7.0</version>
<scope>compile</scope>
</dependency>
I tried the recommendations from this and this, but it did not work. Please help.
The problem is solved after adding the following dependencies into my pom.xml:
<dependency>
<groupId>org.glassfish.jersey.core</groupId>
<artifactId>jersey-common</artifactId>
<version>${version.jersey}</version>
</dependency>
<dependency>
<groupId>org.glassfish.jersey.containers</groupId>
<artifactId>jersey-container-jdk-http</artifactId>
<version>${version.jersey}</version>
</dependency>
<dependency>
<groupId>org.glassfish.jersey.core</groupId>
<artifactId>jersey-client</artifactId>
<version>${version.jersey}</version>
</dependency>
<dependency>
<groupId>org.glassfish.jersey.core</groupId>
<artifactId>jersey-server</artifactId>
<version>${version.jersey}</version>
</dependency>
<dependency>
<groupId>org.glassfish.jersey.containers</groupId>
<artifactId>jersey-container-servlet</artifactId>
<version>${version.jersey}</version>
</dependency>
I had assumed that such dependencies were not required, since I am using Jersey 2, which is embedded in the GlassFish 4.1.1 server.

Storm-Kafka-client from Storm 1.0.1

Based on the Storm documentation, the supported implementation of KafkaSpout is based on the old consumer API. I noticed the external package has another implementation named storm-kafka-client.
https://github.com/apache/storm/tree/master/external/storm-kafka-client
It is unclear if the new client release in 1.0.1 is production ready. Does anyone have experience running it?
I posted the same question to the Storm mailing list.
The new API is production ready; we should use the 1.x branch.
I plan to test with
<!-- https://mvnrepository.com/artifact/org.apache.storm/storm-kafka-client -->
<dependency>
<groupId>org.apache.storm</groupId>
<artifactId>storm-kafka-client</artifactId>
<version>1.0.1</version>
</dependency>
Will update on the progress.
The code below works fine for me!
public TopologyBuilder myTopology() {
    TopologyBuilder builder = new TopologyBuilder();
    try {
        KafkaSpoutConfig<String, String> kafkaSpoutConfig = getKafkaSpoutConfig("KAFKA_IP:9092", KAFKA_TOPIC);
        KafkaSpout kafkaSpout = new KafkaSpout<>(kafkaSpoutConfig);
        builder.setSpout("kafkaSpout", kafkaSpout, 2 * 2);
        builder.setBolt("Bolt-1", new TestBolt(), parallelism).shuffleGrouping("kafkaSpout", KAFKA_TOPIC);
    } catch (Exception ex) {
    }
    return builder;
}
Configure the spout:
protected KafkaSpoutConfig<String, String> getKafkaSpoutConfig(String bootstrapServers, String topic) {
    ByTopicRecordTranslator<String, String> trans = new ByTopicRecordTranslator<>(
            (r) -> new Values(r.topic(), r.partition(), r.offset(), r.key(), r.value()),
            new Fields("topic", "partition", "offset", "key", "value"), topic);
    Builder<String, String> builder = KafkaSpoutConfig.builder(bootstrapServers, new String[]{topic});
    return builder.setProp(ConsumerConfig.GROUP_ID_CONFIG, topic)
            .setProcessingGuarantee(ProcessingGuarantee.AT_LEAST_ONCE)
            .setRetry(getRetryService())
            .setRecordTranslator(trans)
            .setOffsetCommitPeriodMs(10_000)
            .setFirstPollOffsetStrategy(UNCOMMITTED_EARLIEST)
            .setMaxUncommittedOffsets(1000)
            .build();
}
To configure the retry logic for failed messages:
protected KafkaSpoutRetryService getRetryService() {
    return new KafkaSpoutRetryExponentialBackoff(TimeInterval.microSeconds(500),
            TimeInterval.milliSeconds(2), Integer.MAX_VALUE, TimeInterval.seconds(10));
}
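For reference, a minimal sketch of submitting a topology built with the myTopology() method above to an in-process cluster for local testing. The topology name and worker count are arbitrary, and this assumes Storm 1.x.
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;

// Build the topology with myTopology() from the answer above and run it locally.
Config config = new Config();
config.setNumWorkers(2);

LocalCluster cluster = new LocalCluster();
cluster.submitTopology("kafka-spout-test", config, myTopology().createTopology());

// ... let it run for a while, then clean up:
// cluster.killTopology("kafka-spout-test");
// cluster.shutdown();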
You can use the following Maven dependencies for Storm 1.1.0:
<dependency>
<groupId>org.apache.storm</groupId>
<artifactId>storm-core</artifactId>
<version>1.1.0</version>
<scope>provided</scope>
<exclusions>
<exclusion>
<groupId>org.slf4j</groupId>
<artifactId>log4j-over-slf4j</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>0.10.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.storm</groupId>
<artifactId>storm-kafka</artifactId>
<version>1.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka_2.10</artifactId>
<version>0.9.0.0</version>
<exclusions>
<exclusion>
<groupId>org.apache.zookeeper</groupId>
<artifactId>zookeeper</artifactId>
</exclusion>
<exclusion>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
</exclusion>
</exclusions>
</dependency>
You may face some more dependency issues, which you can resolve by adding the required jars.
Also, the imports in your Java code change from backtype.storm.XXXXX to org.apache.storm.XXXXX.
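For example, a few representative imports under the storm-core 1.x package layout (the same classes used to live under backtype.storm.*):
// storm-core 1.x packages:
import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Values;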