How can I reduce lag in a Kafka consumer/producer - Scala

I am looking for ways to improve this Scala Kafka code. What should I change in the consumer and producer to reduce lag?
This is the code I got from someone.
I know this code is not difficult, but I have never seen Scala code before and I am just beginning to learn about Kafka, so I have a hard time finding the problem.
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import scala.util.Try
class KafkaMessenger(val servers: String, val sender: String) {
val props = new Properties()
props.put("bootstrap.servers", servers)
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("producer.type", "async")
val producer = new KafkaProducer[String, String](props)
def send(topic: String, message: Any): Try[Unit] = Try {
producer.send(new ProducerRecord(topic, message.toString))
}
def close(): Unit = producer.close()
}
object KafkaMessenger {
def apply(host: String, topic: String, sender: String, message: String): Unit = {
val messenger = new KafkaMessenger(host, sender)
messenger.send(topic, message)
messenger.close()
}
}
And this is the consumer code.
import java.util.Properties
import java.util.concurrent.Executors
import com.satreci.g2gs.common.impl.utils.KafkaMessageTypes._
import kafka.admin.AdminUtils
import kafka.consumer._
import kafka.utils.ZkUtils
import org.I0Itec.zkclient.{ZkClient, ZkConnection}
import org.slf4j.LoggerFactory
import scala.language.postfixOps
class KafkaListener(val zookeeper: String,
val groupId: String,
val topic: String,
val handleMessage: ByteArrayMessage => Unit,
val workJson: String = ""
) extends AutoCloseable {
private lazy val logger = LoggerFactory.getLogger(this.getClass)
val config: ConsumerConfig = createConsumerConfig(zookeeper, groupId)
val consumer: ConsumerConnector = Consumer.create(config)
val sessionTimeoutMs: Int = 10 * 1000
val connectionTimeoutMs: Int = 8 * 1000
val zkClient: ZkClient = ZkUtils.createZkClient(zookeeper, sessionTimeoutMs, connectionTimeoutMs)
val zkUtils = new ZkUtils(zkClient, new ZkConnection(zookeeper), false)
def createConsumerConfig(zookeeper: String, groupId: String): ConsumerConfig = {
val props = new Properties()
props.put("zookeeper.connect", zookeeper)
props.put("group.id", groupId)
props.put("auto.offset.reset", "smallest")
props.put("zookeeper.session.timeout.ms", "5000")
props.put("zookeeper.sync.time.ms", "200")
props.put("auto.commit.interval.ms", "1000")
props.put("partition.assignment.strategy", "roundrobin")
new ConsumerConfig(props)
}
def run(threadCount: Int = 1): Unit = {
val streams = consumer.createMessageStreamsByFilter(Whitelist(topic), threadCount)
if (!AdminUtils.topicExists(zkUtils, topic)) {
AdminUtils.createTopic(zkUtils, topic, 1, 1)
}
val executor = Executors.newFixedThreadPool(threadCount)
for (stream <- streams) {
executor.submit(new MessageConsumer(stream))
}
logger.debug(s"KafkaListener start with ${threadCount}thread (topic=$topic)")
}
override def close(): Unit = {
consumer.shutdown()
logger.debug(s"$topic Listener close")
}
class MessageConsumer(val stream: MessageStream) extends Runnable {
override def run(): Unit = {
val it = stream.iterator()
while (it.hasNext()) {
val message = it.next().message()
if (workJson == "") {
handleMessage(message)
}
else {
val strMessage = new String(message)
val newMessage = s"$strMessage/#/$workJson"
val outMessage = newMessage.toCharArray.map(c => c.toByte)
handleMessage(outMessage)
}
}
}
}
}
Specifically, I want to change the structure that creates a new KafkaProducer object every time I send a message. There also seem to be many other possible improvements to reduce lag.

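Increase the number of consumer (KafkaListener) instances with the same group id.
This increases the consumption rate, so the lag between producer writes and consumer reads shrinks over time. Note that Kafka assigns each partition to at most one consumer in a group, so instances beyond the topic's partition count stay idle; the listener above creates the topic with a single partition, so the partition count has to grow as well.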
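On the producer side, the companion `KafkaMessenger.apply` in the question builds and closes a new KafkaProducer for every message, which pays the connection and metadata cost each time. A minimal sketch of the alternative is shown here, assuming the newer `kafka-clients` producer API; `SharedMessenger` is a hypothetical name, and the `linger.ms` value is only an illustrative setting, not a recommendation.

```scala
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, Producer, ProducerRecord}

// Hypothetical wrapper: one long-lived producer is shared by every send.
class SharedMessenger(producer: Producer[String, String]) extends AutoCloseable {
  def send(topic: String, message: String): Unit = {
    // send() only queues the record; the producer batches and ships it in the background.
    producer.send(new ProducerRecord[String, String](topic, message))
    ()
  }

  // Close once, when the application shuts down (this also flushes pending records).
  override def close(): Unit = producer.close()
}

object SharedMessenger {
  def apply(servers: String): SharedMessenger = {
    val props = new Properties()
    props.put("bootstrap.servers", servers)
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    // Assumption: a small linger lets the producer batch records; tune or remove as needed.
    props.put("linger.ms", "5")
    new SharedMessenger(new KafkaProducer[String, String](props))
  }
}
```

Create the messenger once (for example `val messenger = SharedMessenger("localhost:9092")`), call `messenger.send(topic, message)` for each message, and call `messenger.close()` only at shutdown. KafkaProducer is thread-safe, so a single instance can be shared across threads.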

Related

Kafka Producer/Consumer crashing every second API call

Every time I make the second API call, I get an error in Postman saying "There was an internal server error."
I don't understand whether the problem is related to my Kafka producer or consumer; they both worked just fine yesterday. The messages no longer arrive at the consumer, and I can't make a second API call because the code crashes every second time (without giving any logs in Scala).
This is my producer code:
class Producer(topic: String, brokers: String) {
val producer = new KafkaProducer[String, String](configuration)
private def configuration: Properties = {
val props = new Properties()
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, brokers)
props.put(ProducerConfig.ACKS_CONFIG, "all")
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getCanonicalName)
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getCanonicalName)
props
}
def sendMessages(message: String): Unit = {
val record = new ProducerRecord[String, String](topic, "1", message)
producer.send(record)
producer.close()
}
}
This is where I'm using it:
object Message extends DefaultJsonProtocol with SprayJsonSupport {
val newConversation = new Producer(brokers = KAFKA_BROKER, topic = "topic_2")
def sendMessage(sender_id: String, receiver_id: String, content: String): String = {
val JsonMessage = Map("sender_id" -> sender_id, "receiver_id" -> receiver_id, "content" -> content)
val i = JsonMessage.toJson.prettyPrint
newConversation.sendMessages(i)
"Message Sent"
}
}
And this is the API:
final case class Message(sender_id: String, receiver_id: String, content: String)
object producerRoute extends DefaultJsonProtocol with SprayJsonSupport {
implicit val MessageFormat = jsonFormat3(Message)
val sendMessageRoute:Route = (post & path("send")){
entity(as[Message]){
msg => {
complete(sendMessage(msg.sender_id,msg.receiver_id,msg.content))
}
}
}
}
On the other hand, this is my Consumer code:
class Consumer(brokers: String, topic: String, groupId: String) {
val consumer = new KafkaConsumer[String, String](configuration)
consumer.subscribe(util.Arrays.asList(topic))
private def configuration: Properties = {
val props = new Properties()
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, brokers)
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, classOf[StringDeserializer].getCanonicalName)
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, classOf[StringDeserializer].getCanonicalName)
props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId)
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest")
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true)
props
}
def receiveMessages():Array[String] = {
val a:ArrayBuffer[String] = new ArrayBuffer[String]
while (true) {
val records = consumer.poll(Duration.ofSeconds(0))
records.forEach(record => a.addOne(record.value()))
}
println(a.toArray)
a.toArray
}
}
object Consumer extends App {
val consumer = new Consumer(brokers = KAFKA_BROKER, topic = "topic_2", groupId = "test")
consumer.receiveMessages()
}
I don't even get the result from the print in the consumer anymore. I don't understand what's the problem as it worked just fine before and I didn't change anything since the last time it worked.
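Judging only from the code shown in the question, two things look likely to produce these symptoms. First, `sendMessages` calls `producer.close()` right after the first send, so the second request uses a closed producer, and KafkaProducer throws an IllegalStateException when `send` is called after `close`. Second, `receiveMessages` loops on `while (true)`, so the `println` after the loop is never reached. The following is a minimal sketch of both changes, not a verified fix for the asker's application; `MessagingSketch`, `maxPolls`, and the 500 ms poll timeout are hypothetical choices.

```scala
import java.time.Duration
import scala.collection.mutable.ArrayBuffer
import org.apache.kafka.clients.consumer.{Consumer => KafkaConsumerApi}
import org.apache.kafka.clients.producer.{Producer => KafkaProducerApi, ProducerRecord}

object MessagingSketch {
  // Send without closing: the same producer instance serves every request.
  def sendMessage(producer: KafkaProducerApi[String, String], topic: String, message: String): Unit = {
    producer.send(new ProducerRecord[String, String](topic, "1", message))
    // Optional: push the record out now instead of waiting for the next batch.
    producer.flush()
  }

  // Poll a bounded number of times so the method can actually return its results.
  def receiveMessages(consumer: KafkaConsumerApi[String, String], maxPolls: Int): Array[String] = {
    val buffer = ArrayBuffer.empty[String]
    for (_ <- 1 to maxPolls) {
      // A non-zero timeout gives the consumer time to join the group and fetch.
      val records = consumer.poll(Duration.ofMillis(500))
      records.forEach(record => buffer += record.value())
    }
    buffer.toArray
  }
}
```

The producer should then be closed once, when the service shuts down. If the consumer is meant to run forever, the alternative is to keep the endless loop and handle each record inside it instead of collecting into an array.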

Flink 1.12 serialize Avro Generic Record to Kafka failed with com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException

I have a DataStream[GenericRecord]:
val consumer = new FlinkKafkaConsumer[String]("input_csv_topic", new SimpleStringSchema(), properties)
val stream = senv.
addSource(consumer).
map(line => {
val arr = line.split(",")
val schemaUrl = "" // avro schema link, standard .avsc file format
val schemaStr = scala.io.Source.fromURL(schemaUrl).mkString.toString().stripLineEnd
import org.codehaus.jettison.json.{JSONObject, JSONArray}
val schemaFields: JSONArray = new JSONObject(schemaStr).optJSONArray("fields")
val genericDevice: GenericRecord = new GenericData.Record(new Schema.Parser().parse(schemaStr))
for(i <- 0 until arr.length) {
val fieldObj: JSONObject = schemaFields.optJSONObject(i)
val columnName = fieldObj.optString("name")
var columnType = fieldObj.optString("type")
if (columnType.contains("string")) {
genericDevice.put(columnName, arr(i))
} else if (columnType.contains("int")) {
genericDevice.put(columnName, toInt(arr(i)).getOrElse(0).asInstanceOf[Number].intValue)
} else if (columnType.contains("long")) {
genericDevice.put(columnName, toLong(arr(i)).getOrElse(0).asInstanceOf[Number].longValue)
}
}
genericDevice
})
val kafkaSink = new FlinkKafkaProducer[GenericRecord](
"output_avro_topic",
new MyKafkaAvroSerializationSchema[GenericRecord](classOf[GenericRecord], "output_avro_topic", "this is the key", schemaStr),
properties,
FlinkKafkaProducer.Semantic.AT_LEAST_ONCE)
stream.addSink(kafkaSink)
Here is MyKafkaAvroSerializationSchema implementation:
class MyKafkaAvroSerializationSchema[T](avroType: Class[T], topic: String, key: String, schemaStr: String) extends KafkaSerializationSchema[T] {
lazy val schema: Schema = new Schema.Parser().parse(schemaStr)
override def serialize(element: T, timestamp: lang.Long): ProducerRecord[Array[Byte], Array[Byte]] = {
val cl = Thread.currentThread().getContextClassLoader()
val genericData = new GenericData(cl)
val writer = new GenericDatumWriter[T](schema, genericData)
// val writer = new ReflectDatumWriter[T](schema)
// val writer = new SpecificDatumWriter[T](schema)
val out = new ByteArrayOutputStream()
val encoder: BinaryEncoder = EncoderFactory.get().binaryEncoder(out, null)
writer.write(element, encoder)
encoder.flush()
out.close()
new ProducerRecord[Array[Byte], Array[Byte]](topic, key.getBytes, out.toByteArray)
}
}
Here's the stack trace:
com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException
Serialization trace:
reserved (org.apache.avro.Schema$Field)
fieldMap (org.apache.avro.Schema$RecordSchema)
schema (org.apache.avro.generic.GenericData$Record)
How can I use Flink to serialize an Avro GenericRecord to Kafka? I have tested different writers, but I still get com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException. Thanks for your input.
You can simply add the flink-avro module to your project and use the provided AvroSerializationSchema, which works for both SpecificRecord and GenericRecord once you supply the schema.
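A minimal sketch of that suggestion, assuming Flink 1.12 with the `flink-avro` and `flink-connector-kafka` modules on the classpath; `AvroSinkSketch` and `avroSink` are hypothetical names, and the schema string must be available when the job graph is built rather than inside the `map` function.

```scala
import java.util.Properties
import org.apache.avro.Schema
import org.apache.avro.generic.GenericRecord
import org.apache.flink.formats.avro.AvroSerializationSchema
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer

object AvroSinkSketch {
  // Builds a Kafka sink that writes each GenericRecord as Avro binary using the given schema.
  def avroSink(topic: String, schemaStr: String, properties: Properties): FlinkKafkaProducer[GenericRecord] = {
    val schema = new Schema.Parser().parse(schemaStr)
    new FlinkKafkaProducer[GenericRecord](
      topic,
      AvroSerializationSchema.forGeneric(schema), // value serializer provided by flink-avro
      properties)
  }
}
```

It would be attached with `stream.addSink(AvroSinkSketch.avroSink("output_avro_topic", schemaStr, properties))`. One caveat: the Kryo trace in the question points at the GenericRecord being serialized between operators, which the sink alone may not change. Declaring an implicit `TypeInformation[GenericRecord]` backed by flink-avro's `GenericRecordAvroTypeInfo(schema)` before the `map` call is a commonly used remedy for that, but it is an assumption here, not something the answer above states.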

Kafka and akka (scala): How to create Source[CommittableMessage[Array[Byte], String], Consumer.Control]?

For a unit test, I would like to create a source of committable messages with a Consumer control.
Or to transform a source created like this:
val message: Source[Array[Byte], NotUsed] = Source.single("one message".getBytes)
to something like this:
Source[CommittableMessage[Array[Byte], String], Consumer.Control]
The goal is to unit test actor behavior on a message without having to install Kafka on the build machine.
You can use this helper to create a CommittableMessage:
package akka.kafka.internal
import akka.Done
import akka.kafka.ConsumerMessage.{CommittableMessage, CommittableOffsetBatch, GroupTopicPartition, PartitionOffset}
import akka.kafka.internal.ConsumerStage.Committer
import org.apache.kafka.clients.consumer.ConsumerRecord
import scala.collection.immutable
import scala.concurrent.Future
object AkkaKafkaHelper {
private val committer = new Committer {
def commit(offsets: immutable.Seq[PartitionOffset]): Future[Done] = Future.successful(Done)
def commit(batch: CommittableOffsetBatch): Future[Done] = Future.successful(Done)
}
def commitableMessage[K, V](key: K, value: V, topic: String = "topic", partition: Int = 0, offset: Int = 0, groupId: String = "group"): CommittableMessage[K, V] = {
val partitionOffset = PartitionOffset(GroupTopicPartition(groupId, topic, partition), offset)
val record = new ConsumerRecord(topic, partition, offset, key, value)
CommittableMessage(record, ConsumerStage.CommittableOffsetImpl(partitionOffset)(committer))
}
}
Use Consumer.committableSource to create a Source[CommittableMessage[K, V], Control]. The idea is that in your test you would produce one or more messages onto some topic, then use committableSource to consume from that same topic.
The following is an example that illustrates this approach: it's a slightly adjusted excerpt from the IntegrationSpec in the Akka Streams Kafka project. IntegrationSpec uses scalatest-embedded-kafka, which provides an in-memory Kafka instance for ScalaTest specs.
Source(1 to 100)
.map(n => new ProducerRecord(topic1, partition0, null: Array[Byte], n.toString))
.runWith(Producer.plainSink(producerSettings))
val consumerSettings = createConsumerSettings(group1)
val (control, probe1) = Consumer.committableSource(consumerSettings, TopicSubscription(Set(topic1)))
.filterNot(_.record.value == InitialMsg)
.mapAsync(10) { elem =>
elem.committableOffset.commitScaladsl().map { _ => Done }
}
.toMat(TestSink.probe)(Keep.both)
.run()
probe1
.request(25)
.expectNextN(25).toSet should be(Set(Done))
probe1.cancel()
Await.result(control.isShutdown, remainingOrDefault)

How to Test Kafka Consumer

I have a Kafka Consumer (built in Scala) which extracts latest records from Kafka. The consumer looks like this:
val consumerProperties = new Properties()
consumerProperties.put("bootstrap.servers", "localhost:9092")
consumerProperties.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
consumerProperties.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
consumerProperties.put("group.id", "something")
consumerProperties.put("auto.offset.reset", "latest")
val consumer = new KafkaConsumer[String, String](consumerProperties)
consumer.subscribe(java.util.Collections.singletonList("topic"))
Now, I want to write an integration test for it. Is there any way or any best practice for Testing Kafka Consumers?
You need to start zookeeper and kafka programmatically for integration tests.
1.1 start zookeeper (ZooKeeperServer)
def startZooKeeper(zooKeeperPort: Int, zkLogsDir: Directory): ServerCnxnFactory = {
val tickTime = 2000
val zkServer = new ZooKeeperServer(zkLogsDir.toFile.jfile, zkLogsDir.toFile.jfile, tickTime)
val factory = ServerCnxnFactory.createFactory
factory.configure(new InetSocketAddress("0.0.0.0", zooKeeperPort), 1024)
factory.startup(zkServer)
factory
}
1.2 start kafka (KafkaServer)
case class StreamConfig(streamTcpPort: Int = 9092,
streamStateTcpPort :Int = 2181,
stream: String,
numOfPartition: Int = 1,
nodes: Map[String, String] = Map.empty)
def startKafkaBroker(config: StreamConfig,
kafkaLogDir: Directory): KafkaServer = {
val syncServiceAddress = s"localhost:${config.streamStateTcpPort}"
val properties: Properties = new Properties
properties.setProperty("zookeeper.connect", syncServiceAddress)
properties.setProperty("broker.id", "0")
properties.setProperty("host.name", "localhost")
properties.setProperty("advertised.host.name", "localhost")
properties.setProperty("port", config.streamTcpPort.toString)
properties.setProperty("auto.create.topics.enable", "true")
properties.setProperty("log.dir", kafkaLogDir.toAbsolute.path)
properties.setProperty("log.flush.interval.messages", 1.toString)
properties.setProperty("log.cleaner.dedupe.buffer.size", "1048577")
config.nodes.foreach {
case (key, value) => properties.setProperty(key, value)
}
val broker = new KafkaServer(new KafkaConfig(properties))
broker.startup()
println(s"KafkaStream Broker started at ${properties.get("host.name")}:${properties.get("port")} at ${kafkaLogDir.toFile}")
broker
}
2. Emit some events to the stream using KafkaProducer.
3. Then consume them with your consumer to test and verify that it works.
You can use scalatest-eventstream, which has a startBroker method that starts ZooKeeper and Kafka for you.
It also has destroyBroker, which cleans up Kafka after the tests.
For example:
class MyStreamConsumerSpecs extends FunSpec with BeforeAndAfterAll with Matchers {
implicit val config =
StreamConfig(streamTcpPort = 9092, streamStateTcpPort = 2181, stream = "test-topic", numOfPartition = 1)
val kafkaStream = new KafkaEmbeddedStream
override protected def beforeAll(): Unit = {
kafkaStream.startBroker
}
override protected def afterAll(): Unit = {
kafkaStream.destroyBroker
}
describe("Kafka Embedded stream") {
it("does consume some events") {
//uses application.properties
//emitter.broker.endpoint=localhost:9092
//emitter.event.key.serializer=org.apache.kafka.common.serialization.StringSerializer
//emitter.event.value.serializer=org.apache.kafka.common.serialization.StringSerializer
kafkaStream.appendEvent("test-topic", """{"MyEvent" : { "myKey" : "myValue"}}""")
val consumerProperties = new Properties()
consumerProperties.put("bootstrap.servers", "localhost:9092")
consumerProperties.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
consumerProperties.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
consumerProperties.put("group.id", "something")
consumerProperties.put("auto.offset.reset", "earliest")
val myConsumer = new KafkaConsumer[String, String](consumerProperties)
myConsumer.subscribe(java.util.Collections.singletonList("test-topic"))
val events = myConsumer.poll(2000)
events.count() shouldBe 1
events.iterator().next().value() shouldBe """{"MyEvent" : { "myKey" : "myValue"}}"""
println("=================" + events.count())
}
}
}

Create a Simple Kafka Consumer using Scala

I am currently learning Scala & was trying to create a SimpleConsumer for retrieving messages from a Kafka partition.
The consumer should be able to handle the following tasks:
Keep track of Offsets.
Figure out which Broker is the lead Broker for a topic and partition
Must be able to handle Broker leader changes.
I was able to find very good documentation for creating this consumer in Java (https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example).
Does anyone have sample Scala code for creating this SimpleConsumer? If you could refer me to documentation that points me in the right direction, it would be greatly appreciated.
Here is sample code for a simple Kafka consumer written in Scala. I got it working after some trial and error.
package com.Kafka.Consumer
import kafka.api.FetchRequest
import kafka.api.FetchRequestBuilder
import kafka.api.PartitionOffsetRequestInfo
import kafka.common.ErrorMapping
import kafka.common.TopicAndPartition
import kafka.javaapi._
import kafka.javaapi.consumer.SimpleConsumer
import kafka.message.MessageAndOffset
import java.nio.ByteBuffer
import java.util.ArrayList
import java.util.Collections
import java.util.HashMap
import java.util.List
import java.util.Map
import SimpleExample._
//remove if not needed
import scala.collection.JavaConversions._
object SimpleExample {
def main(args: Array[String]) {
val example = new SimpleExample()
val maxReads = java.lang.Integer.parseInt(args(0))
val topic = args(1)
val partition = java.lang.Integer.parseInt(args(2))
val seeds = new ArrayList[String]()
seeds.add(args(3))
val port = java.lang.Integer.parseInt(args(4))
try {
example.run(maxReads, topic, partition, seeds, port)
} catch {
case e: Exception => {
println("Oops:" + e)
e.printStackTrace()
}
}
}
def getLastOffset(consumer: SimpleConsumer,
topic: String,
partition: Int,
whichTime: Long,
clientName: String): Long = {
val topicAndPartition = new TopicAndPartition(topic, partition)
val requestInfo = new HashMap[TopicAndPartition, PartitionOffsetRequestInfo]()
requestInfo.put(topicAndPartition, new PartitionOffsetRequestInfo(whichTime, 1))
val request = new kafka.javaapi.OffsetRequest(requestInfo, kafka.api.OffsetRequest.CurrentVersion, clientName)
val response = consumer.getOffsetsBefore(request)
if (response.hasError) {
println("Error fetching data Offset Data the Broker. Reason: " +
response.errorCode(topic, partition))
return 0
}
val offsets = response.offsets(topic, partition)
offsets(0)
}
}
class SimpleExample {
private var m_replicaBrokers: List[String] = new ArrayList[String]()
def run(a_maxReads: Int,
a_topic: String,
a_partition: Int,
a_seedBrokers: List[String],
a_port: Int) {
val metadata = findLeader(a_seedBrokers, a_port, a_topic, a_partition)
if (metadata == null) {
println("Can't find metadata for Topic and Partition. Exiting")
return
}
if (metadata.leader == null) {
println("Can't find Leader for Topic and Partition. Exiting")
return
}
var leadBroker = metadata.leader.host
val clientName = "Client_" + a_topic + "_" + a_partition
var consumer = new SimpleConsumer(leadBroker, a_port, 100000, 64 * 1024, clientName)
var readOffset = getLastOffset(consumer, a_topic, a_partition, kafka.api.OffsetRequest.EarliestTime, clientName)
var numErrors = 0
//while (a_maxReads > 0) {
if (consumer == null) {
consumer = new SimpleConsumer(leadBroker, a_port, 100000, 64 * 1024, clientName)
}
val req = new FetchRequestBuilder().clientId(clientName).addFetch(a_topic, a_partition, readOffset,
100000)
.build()
val fetchResponse = consumer.fetch(req)
if (fetchResponse.hasError) {
numErrors += 1
val code = fetchResponse.errorCode(a_topic, a_partition)
println("Error fetching data from the Broker:" + leadBroker +
" Reason: " +
code)
if (numErrors > 5) //break
if (code == ErrorMapping.OffsetOutOfRangeCode) {
readOffset = getLastOffset(consumer, a_topic, a_partition, kafka.api.OffsetRequest.LatestTime, clientName)
//continue
}
consumer.close()
consumer = null
leadBroker = findNewLeader(leadBroker, a_topic, a_partition, a_port)
//continue
}
numErrors = 0
var numRead = 0
for (messageAndOffset <- fetchResponse.messageSet(a_topic, a_partition)) {
val currentOffset = messageAndOffset.offset
if (currentOffset < readOffset) {
println("Found an old offset: " + currentOffset + " Expecting: " +
readOffset)
//continue
}
readOffset = messageAndOffset.nextOffset
val payload = messageAndOffset.message.payload
val bytes = Array.ofDim[Byte](payload.limit())
payload.get(bytes)
println(String.valueOf(messageAndOffset.offset) + ": " + new String(bytes, "UTF-8"))
numRead += 1
// a_maxReads -= 1
}
if (numRead == 0) {
try {
Thread.sleep(1000)
} catch {
case ie: InterruptedException =>
}
}
//}
if (consumer != null) consumer.close()
}
private def findNewLeader(a_oldLeader: String,
a_topic: String,
a_partition: Int,
a_port: Int): String = {
for (i <- 0 until 3) {
var goToSleep = false
val metadata = findLeader(m_replicaBrokers, a_port, a_topic, a_partition)
if (metadata == null) {
goToSleep = true
} else if (metadata.leader == null) {
goToSleep = true
} else if (a_oldLeader.equalsIgnoreCase(metadata.leader.host) && i == 0) {
goToSleep = true
} else {
return metadata.leader.host
}
if (goToSleep) {
try {
Thread.sleep(1000)
} catch {
case ie: InterruptedException =>
}
}
}
println("Unable to find new leader after Broker failure. Exiting")
throw new Exception("Unable to find new leader after Broker failure. Exiting")
}
private def findLeader(a_seedBrokers: List[String],
a_port: Int,
a_topic: String,
a_partition: Int): PartitionMetadata = {
var returnMetaData: PartitionMetadata = null
for (seed <- a_seedBrokers) {
var consumer: SimpleConsumer = null
try {
consumer = new SimpleConsumer(seed, a_port, 100000, 64 * 1024, "leaderLookup")
val topics = Collections.singletonList(a_topic)
val req = new TopicMetadataRequest(topics)
val resp = consumer.send(req)
val metaData = resp.topicsMetadata
for (item <- metaData; part <- item.partitionsMetadata){
if (part.partitionId == a_partition) {
returnMetaData = part
//break
}
}
} catch {
case e: Exception => println("Error communicating with Broker [" + seed + "] to find Leader for [" +
a_topic +
", " +
a_partition +
"] Reason: " +
e)
} finally {
if (consumer != null) consumer.close()
}
}
if (returnMetaData != null) {
m_replicaBrokers.clear()
for (replica <- returnMetaData.replicas) {
m_replicaBrokers.add(replica.host)
}
}
returnMetaData
}
}
I built a simple Kafka consumer and producer using Scala.
consumer:
package com.kafka
import java.util.concurrent._
import java.util.{Collections, Properties}
import com.sun.javafx.util.Logging
import org.apache.kafka.clients.consumer.{ConsumerConfig, KafkaConsumer}
import scala.collection.JavaConversions._
class Consumer(val brokers: String,
val groupId: String,
val topic: String) extends Logging {
val props = createConsumerConfig(brokers, groupId)
val consumer = new KafkaConsumer[String, String](props)
var executor: ExecutorService = null
def shutdown() = {
if (consumer != null)
consumer.close()
if (executor != null)
executor.shutdown()
}
def createConsumerConfig(brokers: String, groupId: String): Properties = {
val props = new Properties()
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, brokers)
props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId)
props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true")
props.put(ConsumerConfig.AUTO_COMMIT_INTERVAL_MS_CONFIG, "1000")
props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, "30000")
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer")
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringDeserializer")
props
}
def run() = {
consumer.subscribe(Collections.singletonList(this.topic))
Executors.newSingleThreadExecutor.execute(new Runnable {
override def run(): Unit = {
while (true) {
val records = consumer.poll(1000)
for (record <- records) {
System.out.println("Received message: (" + record.key() + ", " + record.value() + ") at offset " + record.offset())
}
}
}
})
}
}
object Consumer extends App{
val newArgs = Array("localhost:9092", "2","test")
val example = new Consumer(newArgs(0), newArgs(1), newArgs(2))
example.run()
}
producer:
package com.kafka
import java.util.{Date, Properties}
import org.apache.kafka.clients.producer.KafkaProducer
import org.apache.kafka.clients.producer.ProducerRecord
object Producer extends App{
val newArgs = Array("20","test","localhost:9092")
val events = newArgs(0).toInt
val topic = newArgs(1)
val brokers = newArgs(2)
val props = new Properties()
props.put("bootstrap.servers", brokers)
props.put("client.id", "producer")
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
val producer = new KafkaProducer[String, String](props)
val t = System.currentTimeMillis()
for (nEvents <- Range(0, events)) {
val key = "messageKey " + nEvents.toString
val msg = "test message"
val data = new ProducerRecord[String, String](topic, key, msg)
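//async with a completion callback
//producer.send(data, (m,e) => {})
//fire-and-forget: send() is asynchronous; call .get() on the returned Future to make it synchronous
producer.send(data)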
}
System.out.println("sent per second: " + events * 1000 / (System.currentTimeMillis() - t))
producer.close()
}
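Regarding the two send variants in the producer above: `send` hands the record to a background sender and returns a `Future` without waiting for the broker acknowledgement, so it is asynchronous with or without a callback. A send only becomes synchronous when the caller blocks on that `Future`. A small sketch of both modes follows, assuming the `kafka-clients` producer API; `SendModes`, `sendSync`, and `sendAsync` are hypothetical helper names.

```scala
import org.apache.kafka.clients.producer.{Callback, Producer, ProducerRecord, RecordMetadata}

object SendModes {
  // Synchronous: block on the returned Future until the broker acknowledges the record.
  def sendSync(producer: Producer[String, String],
               record: ProducerRecord[String, String]): RecordMetadata =
    producer.send(record).get()

  // Asynchronous: return immediately; the callback fires once the send succeeds or fails.
  def sendAsync(producer: Producer[String, String],
                record: ProducerRecord[String, String])
               (onDone: Either[Exception, RecordMetadata] => Unit): Unit = {
    producer.send(record, new Callback {
      override def onCompletion(metadata: RecordMetadata, exception: Exception): Unit =
        onDone(if (exception != null) Left(exception) else Right(metadata))
    })
    ()
  }
}
```

Blocking on every record limits throughput to one round trip per message, so measuring "sent per second" right after a loop of non-blocking sends mostly measures how fast records are queued, not how fast they are delivered.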