Kafka Producer/Consumer crashing on every second API call - Scala

Every time I make a second API call, I get an error in Postman saying "There was an internal server error."
I don't understand whether the problem is related to my Kafka producer or my consumer; they both worked just fine yesterday. The messages no longer arrive at the consumer, and I can't make a second API call because the code crashes every second time (without printing any logs in Scala).
This is my producer code:
class Producer(topic: String, brokers: String) {

  val producer = new KafkaProducer[String, String](configuration)

  private def configuration: Properties = {
    val props = new Properties()
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, brokers)
    props.put(ProducerConfig.ACKS_CONFIG, "all")
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getCanonicalName)
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getCanonicalName)
    props
  }

  def sendMessages(message: String): Unit = {
    val record = new ProducerRecord[String, String](topic, "1", message)
    producer.send(record)
    producer.close()
  }
}
This is where I'm using it:
object Message extends DefaultJsonProtocol with SprayJsonSupport {

  val newConversation = new Producer(brokers = KAFKA_BROKER, topic = "topic_2")

  def sendMessage(sender_id: String, receiver_id: String, content: String): String = {
    val JsonMessage = Map("sender_id" -> sender_id, "receiver_id" -> receiver_id, "content" -> content)
    val i = JsonMessage.toJson.prettyPrint
    newConversation.sendMessages(i)
    "Message Sent"
  }
}
And this is the API:
final case class Message(sender_id: String, receiver_id: String, content: String)

object producerRoute extends DefaultJsonProtocol with SprayJsonSupport {

  implicit val MessageFormat = jsonFormat3(Message)

  val sendMessageRoute: Route = (post & path("send")) {
    entity(as[Message]) { msg =>
      complete(sendMessage(msg.sender_id, msg.receiver_id, msg.content))
    }
  }
}
On the other hand, this is my Consumer code:
class Consumer(brokers: String, topic: String, groupId: String) {

  val consumer = new KafkaConsumer[String, String](configuration)
  consumer.subscribe(util.Arrays.asList(topic))

  private def configuration: Properties = {
    val props = new Properties()
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, brokers)
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, classOf[StringDeserializer].getCanonicalName)
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, classOf[StringDeserializer].getCanonicalName)
    props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId)
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest")
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, true)
    props
  }

  def receiveMessages(): Array[String] = {
    val a: ArrayBuffer[String] = new ArrayBuffer[String]
    while (true) {
      val records = consumer.poll(Duration.ofSeconds(0))
      records.forEach(record => a.addOne(record.value()))
    }
    println(a.toArray)
    a.toArray
  }
}
object Consumer extends App {
  val consumer = new Consumer(brokers = KAFKA_BROKER, topic = "topic_2", groupId = "test")
  consumer.receiveMessages()
}
I don't even get the output from the print in the consumer anymore. I don't understand what the problem is, as it worked just fine before and I haven't changed anything since the last time it worked.

Related

Flink 1.12: serializing an Avro GenericRecord to Kafka fails with com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException

I have a DataStream[GenericRecord]:
val consumer = new FlinkKafkaConsumer[String]("input_csv_topic", new SimpleStringSchema(), properties)

val stream = senv.
  addSource(consumer).
  map(line => {
    val arr = line.split(",")
    val schemaUrl = "" // avro schema link, standard .avsc file format
    val schemaStr = scala.io.Source.fromURL(schemaUrl).mkString.toString().stripLineEnd
    import org.codehaus.jettison.json.{JSONObject, JSONArray}
    val schemaFields: JSONArray = new JSONObject(schemaStr).optJSONArray("fields")
    val genericDevice: GenericRecord = new GenericData.Record(new Schema.Parser().parse(schemaStr))
    for (i <- 0 until arr.length) {
      val fieldObj: JSONObject = schemaFields.optJSONObject(i)
      val columnName = fieldObj.optString("name")
      var columnType = fieldObj.optString("type")
      if (columnType.contains("string")) {
        genericDevice.put(columnName, arr(i))
      } else if (columnType.contains("int")) {
        genericDevice.put(columnName, toInt(arr(i)).getOrElse(0).asInstanceOf[Number].intValue)
      } else if (columnType.contains("long")) {
        genericDevice.put(columnName, toLong(arr(i)).getOrElse(0).asInstanceOf[Number].longValue)
      }
    }
    genericDevice
  })

val kafkaSink = new FlinkKafkaProducer[GenericRecord](
  "output_avro_topic",
  new MyKafkaAvroSerializationSchema[GenericRecord](classOf[GenericRecord], "output_avro_topic", "this is the key", schemaStr),
  properties,
  FlinkKafkaProducer.Semantic.AT_LEAST_ONCE)

stream.addSink(kafkaSink)
Here is MyKafkaAvroSerializationSchema implementation:
class MyKafkaAvroSerializationSchema[T](avroType: Class[T], topic: String, key: String, schemaStr: String) extends KafkaSerializationSchema[T] {

  lazy val schema: Schema = new Schema.Parser().parse(schemaStr)

  override def serialize(element: T, timestamp: lang.Long): ProducerRecord[Array[Byte], Array[Byte]] = {
    val cl = Thread.currentThread().getContextClassLoader()
    val genericData = new GenericData(cl)
    val writer = new GenericDatumWriter[T](schema, genericData)
    // val writer = new ReflectDatumWriter[T](schema)
    // val writer = new SpecificDatumWriter[T](schema)
    val out = new ByteArrayOutputStream()
    val encoder: BinaryEncoder = EncoderFactory.get().binaryEncoder(out, null)
    writer.write(element, encoder)
    encoder.flush()
    out.close()
    new ProducerRecord[Array[Byte], Array[Byte]](topic, key.getBytes, out.toByteArray)
  }
}
Here's the stack trace:
com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException
Serialization trace:
reserved (org.apache.avro.Schema$Field)
fieldMap (org.apache.avro.Schema$RecordSchema)
schema (org.apache.avro.generic.GenericData$Record)
How can I use Flink to serialize an Avro GenericRecord to Kafka? I have tested different writers, but I still get com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException. Thanks for your input.
You can simply add the flink-avro module to your project and use the provided AvroSerializationSchema, which works for both SpecificRecord and GenericRecord once you supply the schema.
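For example, a minimal sketch of that suggestion (assumptions: Flink 1.12 with the flink-avro and flink-connector-kafka modules on the classpath, and that the .avsc contents (schemaStr), the producer properties, and the stream from the question are in scope where the sink is built):
import org.apache.avro.Schema
import org.apache.avro.generic.GenericRecord
import org.apache.flink.formats.avro.AvroSerializationSchema
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer

val avroSchema: Schema = new Schema.Parser().parse(schemaStr)

// flink-avro ships a ready-made SerializationSchema for GenericRecord
val valueSerializer: AvroSerializationSchema[GenericRecord] =
  AvroSerializationSchema.forGeneric(avroSchema)

val kafkaSink = new FlinkKafkaProducer[GenericRecord](
  "output_avro_topic",   // target topic
  valueSerializer,       // provided by flink-avro instead of the hand-written schema
  properties)            // producer config from the question

stream.addSink(kafkaSink)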

How can I reduce lag in a Kafka consumer/producer?

I am looking for improvements to this Scala Kafka code. To reduce lag, what should I do in the consumer and producer?
This is code I got from someone else.
I know this is not difficult code, but I have never seen Scala code before and I am only beginning to learn about Kafka, so I have a hard time finding the problem.
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import scala.util.Try

class KafkaMessenger(val servers: String, val sender: String) {

  val props = new Properties()
  props.put("bootstrap.servers", servers)
  props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
  props.put("producer.type", "async")

  val producer = new KafkaProducer[String, String](props)

  def send(topic: String, message: Any): Try[Unit] = Try {
    producer.send(new ProducerRecord(topic, message.toString))
  }

  def close(): Unit = producer.close()
}

object KafkaMessenger {
  def apply(host: String, topic: String, sender: String, message: String): Unit = {
    val messenger = new KafkaMessenger(host, sender)
    messenger.send(topic, message)
    messenger.close()
  }
}
And this is the consumer code:
import java.util.Properties
import java.util.concurrent.Executors

import com.satreci.g2gs.common.impl.utils.KafkaMessageTypes._
import kafka.admin.AdminUtils
import kafka.consumer._
import kafka.utils.ZkUtils
import org.I0Itec.zkclient.{ZkClient, ZkConnection}
import org.slf4j.LoggerFactory

import scala.language.postfixOps

class KafkaListener(val zookeeper: String,
                    val groupId: String,
                    val topic: String,
                    val handleMessage: ByteArrayMessage => Unit,
                    val workJson: String = ""
                   ) extends AutoCloseable {

  private lazy val logger = LoggerFactory.getLogger(this.getClass)

  val config: ConsumerConfig = createConsumerConfig(zookeeper, groupId)
  val consumer: ConsumerConnector = Consumer.create(config)

  val sessionTimeoutMs: Int = 10 * 1000
  val connectionTimeoutMs: Int = 8 * 1000
  val zkClient: ZkClient = ZkUtils.createZkClient(zookeeper, sessionTimeoutMs, connectionTimeoutMs)
  val zkUtils = new ZkUtils(zkClient, new ZkConnection(zookeeper), false)

  def createConsumerConfig(zookeeper: String, groupId: String): ConsumerConfig = {
    val props = new Properties()
    props.put("zookeeper.connect", zookeeper)
    props.put("group.id", groupId)
    props.put("auto.offset.reset", "smallest")
    props.put("zookeeper.session.timeout.ms", "5000")
    props.put("zookeeper.sync.time.ms", "200")
    props.put("auto.commit.interval.ms", "1000")
    props.put("partition.assignment.strategy", "roundrobin")
    new ConsumerConfig(props)
  }

  def run(threadCount: Int = 1): Unit = {
    val streams = consumer.createMessageStreamsByFilter(Whitelist(topic), threadCount)
    if (!AdminUtils.topicExists(zkUtils, topic)) {
      AdminUtils.createTopic(zkUtils, topic, 1, 1)
    }
    val executor = Executors.newFixedThreadPool(threadCount)
    for (stream <- streams) {
      executor.submit(new MessageConsumer(stream))
    }
    logger.debug(s"KafkaListener start with ${threadCount}thread (topic=$topic)")
  }

  override def close(): Unit = {
    consumer.shutdown()
    logger.debug(s"$topic Listener close")
  }

  class MessageConsumer(val stream: MessageStream) extends Runnable {
    override def run(): Unit = {
      val it = stream.iterator()
      while (it.hasNext()) {
        val message = it.next().message()
        if (workJson == "") {
          handleMessage(message)
        } else {
          val strMessage = new String(message)
          val newMessage = s"$strMessage/#/$workJson"
          val outMessage = newMessage.toCharArray.map(c => c.toByte)
          handleMessage(outMessage)
        }
      }
    }
  }
}
Specifically, I want to change the structure so that a new KafkaProducer object is not created every time I send a message. There also seem to be many other possible improvements to reduce lag.
Increase the number of consumer (KafkaListener) instances with the same group id.
This will increase the consumption rate, and the lag between the producer writes and the consumer reads will eventually be minimized.
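For example, a hedged sketch using the KafkaListener class from the question (the ZooKeeper address, group id, topic, and handler are placeholders, and this only helps if the topic has at least as many partitions as the total number of consumer threads):
// hypothetical handler; assumes ByteArrayMessage is Array[Byte]
val handle: ByteArrayMessage => Unit = bytes => println(new String(bytes))

// several listeners in the same consumer group share the topic's partitions
val listeners = (1 to 3).map { _ =>
  new KafkaListener(
    zookeeper = "localhost:2181",
    groupId = "my-group",        // same group id for every instance
    topic = "my-topic",
    handleMessage = handle)
}

listeners.foreach(_.run(threadCount = 2))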

My Kafka producer code runs without any exception, but no data is sent to the brokers

I have created the topic "test_topic_02" and manually wrote data into the broker, which succeeded. But when I produce data with the following code, writing data to the broker does not work.
object KafkaProducer {

  private val log: slf4j.Logger = LoggerFactory.getLogger(this.getClass)
  Logger.getLogger("org").setLevel(Level.WARN)

  def main(args: Array[String]): Unit = {
    val topic = "test_topic_02"
    val brokers = "10.31.31.45:9092"
    val props = new Properties()
    var partition: Int = 0

    // partition
    val list: List[Int] = List(0, 1, 2, 3, 4)

    props.put("bootstrap.servers", brokers)
    props.put("client.id", "KafkaProducer")
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

    // create producer
    val producer: KafkaProducer[String, String] = new KafkaProducer[String, String](props)

    while (true) {
      for (i <- 1 to 100) {
        for (j <- list) {
          partition = j
          try {
            val producerData = new ProducerRecord[String, String](topic, Integer.valueOf(partition), "message from simulator_" + Integer.toString(i), Integer.toString(i))
            val future: Future[RecordMetadata] = producer.send(producerData)
            println(producerData)
            // implicit number to long
            future.get(long2Long(3), TimeUnit.SECONDS)
            println("Message Sent Successfully")
            Thread.sleep(1000)
          } catch {
            case e: Exception =>
              log.error("Launching Failed")
          }
        }
      }
    }
    println("Stop Producing Data")
    producer.close()
  }
}
You should use the overload of send that takes a Callback when writing to Kafka. In the callback, you can check for exceptions or errors.
ProducerRecord<byte[], byte[]> myRecord = new ProducerRecord<byte[], byte[]>("the-topic", key, value);
producer.send(myRecord,
  new Callback() {
    public void onCompletion(RecordMetadata metadata, Exception e) {
      if (e != null) {
        e.printStackTrace();
      } else {
        System.out.println("The offset of the record we just sent is: " + metadata.offset());
      }
    }
  });
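In Scala, the same idea applied to the question's producer might look like this (a sketch; the key and value are illustrative):
import org.apache.kafka.clients.producer.{Callback, ProducerRecord, RecordMetadata}

val data = new ProducerRecord[String, String](topic, "some key", "some value")

producer.send(data, new Callback {
  override def onCompletion(metadata: RecordMetadata, e: Exception): Unit = {
    if (e != null) {
      e.printStackTrace()  // the write failed; the exception says why
    } else {
      println(s"Record written to ${metadata.topic()}-${metadata.partition()} at offset ${metadata.offset()}")
    }
  }
})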

How to Test Kafka Consumer

I have a Kafka Consumer (built in Scala) which extracts latest records from Kafka. The consumer looks like this:
val consumerProperties = new Properties()
consumerProperties.put("bootstrap.servers", "localhost:9092")
consumerProperties.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
consumerProperties.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
consumerProperties.put("group.id", "something")
consumerProperties.put("auto.offset.reset", "latest")
val consumer = new KafkaConsumer[String, String](consumerProperties)
consumer.subscribe(java.util.Collections.singletonList("topic"))
Now I want to write an integration test for it. Is there any recommended way or best practice for testing Kafka consumers?
You need to start ZooKeeper and Kafka programmatically for integration tests.
1.1 Start ZooKeeper (ZooKeeperServer)
def startZooKeeper(zooKeeperPort: Int, zkLogsDir: Directory): ServerCnxnFactory = {
  val tickTime = 2000
  val zkServer = new ZooKeeperServer(zkLogsDir.toFile.jfile, zkLogsDir.toFile.jfile, tickTime)
  val factory = ServerCnxnFactory.createFactory
  factory.configure(new InetSocketAddress("0.0.0.0", zooKeeperPort), 1024)
  factory.startup(zkServer)
  factory
}
1.2 Start Kafka (KafkaServer)
case class StreamConfig(streamTcpPort: Int = 9092,
                        streamStateTcpPort: Int = 2181,
                        stream: String,
                        numOfPartition: Int = 1,
                        nodes: Map[String, String] = Map.empty)

def startKafkaBroker(config: StreamConfig,
                     kafkaLogDir: Directory): KafkaServer = {

  val syncServiceAddress = s"localhost:${config.streamStateTcpPort}"

  val properties: Properties = new Properties
  properties.setProperty("zookeeper.connect", syncServiceAddress)
  properties.setProperty("broker.id", "0")
  properties.setProperty("host.name", "localhost")
  properties.setProperty("advertised.host.name", "localhost")
  properties.setProperty("port", config.streamTcpPort.toString)
  properties.setProperty("auto.create.topics.enable", "true")
  properties.setProperty("log.dir", kafkaLogDir.toAbsolute.path)
  properties.setProperty("log.flush.interval.messages", 1.toString)
  properties.setProperty("log.cleaner.dedupe.buffer.size", "1048577")

  config.nodes.foreach {
    case (key, value) => properties.setProperty(key, value)
  }

  val broker = new KafkaServer(new KafkaConfig(properties))
  broker.startup()

  println(s"KafkaStream Broker started at ${properties.get("host.name")}:${properties.get("port")} at ${kafkaLogDir.toFile}")
  broker
}
Emit some events to the stream using a KafkaProducer.
Then consume them with your consumer under test and verify that it works; a minimal sketch of these two steps follows.
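A sketch under these assumptions: the embedded broker above is listening on localhost:9092, the topic is "topic", and consumer is the consumer from the question, built against a kafka-clients version that supports poll(Duration):
import java.time.Duration
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

// emit a test event
val producerProps = new Properties()
producerProps.put("bootstrap.servers", "localhost:9092")
producerProps.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
producerProps.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

val testProducer = new KafkaProducer[String, String](producerProps)
testProducer.send(new ProducerRecord[String, String]("topic", "k1", "test event")).get()
testProducer.close()

// let the consumer under test read it back and verify
val records = consumer.poll(Duration.ofSeconds(5))
assert(records.count() == 1)
assert(records.iterator().next().value() == "test event")

Note that the question's consumer uses auto.offset.reset=latest; for a test like this it is usually easier to use earliest (as the scalatest-eventstream example below does), or to subscribe and poll once before producing the test event.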
You can also use scalatest-eventstream, which has a startBroker method that will start ZooKeeper and Kafka for you.
It also has destroyBroker, which will clean up your Kafka after the tests.
For example:
class MyStreamConsumerSpecs extends FunSpec with BeforeAndAfterAll with Matchers {

  implicit val config =
    StreamConfig(streamTcpPort = 9092, streamStateTcpPort = 2181, stream = "test-topic", numOfPartition = 1)

  val kafkaStream = new KafkaEmbeddedStream

  override protected def beforeAll(): Unit = {
    kafkaStream.startBroker
  }

  override protected def afterAll(): Unit = {
    kafkaStream.destroyBroker
  }

  describe("Kafka Embedded stream") {
    it("does consume some events") {
      // uses application.properties
      // emitter.broker.endpoint=localhost:9092
      // emitter.event.key.serializer=org.apache.kafka.common.serialization.StringSerializer
      // emitter.event.value.serializer=org.apache.kafka.common.serialization.StringSerializer
      kafkaStream.appendEvent("test-topic", """{"MyEvent" : { "myKey" : "myValue"}}""")

      val consumerProperties = new Properties()
      consumerProperties.put("bootstrap.servers", "localhost:9092")
      consumerProperties.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
      consumerProperties.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
      consumerProperties.put("group.id", "something")
      consumerProperties.put("auto.offset.reset", "earliest")

      val myConsumer = new KafkaConsumer[String, String](consumerProperties)
      myConsumer.subscribe(java.util.Collections.singletonList("test-topic"))

      val events = myConsumer.poll(2000)
      events.count() shouldBe 1
      events.iterator().next().value() shouldBe """{"MyEvent" : { "myKey" : "myValue"}}"""
      println("=================" + events.count())
    }
  }
}

Kafka: one producer for two topics vs. two producers

There are two Kafka topics:
logs (in text format); for this I use the standard StringSerializer for Kafka
events (in JSON format); for this I use a custom JSON serializer for Kafka
There is a REST web application (Servlet-based).
Which approach is best for this application?
Approach 1: Create a single producer for both topics:
val producer = new KafkaProducer[String, AnyRef](...props...)

// send logs
producer.send(new ProducerRecord[String, AnyRef](
  "logs", "some log key", "some log str"))

// send events
producer.send(new ProducerRecord[String, AnyRef](
  "events", "some evt key", Event("some")))

Approach 2: Create two producers with strongly typed values:
val logsProducer = new KafkaProducer[String, String](...props...)
val eventsProducer = new KafkaProducer[String, Event](...props...)

// send logs
logsProducer.send(new ProducerRecord[String, String](
  "logs", "some log key", "some log str"))

// send events
eventsProducer.send(new ProducerRecord[String, Event](
  "events", "some evt key", Event("some event")))
Update 1: For Approach 1, I use my own serializer based on Json4s:
class KafkaJson4sSerializer[T <: AnyRef] extends Serializer[T] {

  import org.json4s._
  import org.json4s.native.Serialization
  import org.json4s.native.Serialization.write

  implicit val formats = Serialization.formats(NoTypeHints)

  override def configure(configs: util.Map[String, _], isKey: Boolean): Unit = {}

  override def serialize(topic: String, data: T): Array[Byte] = {
    write(data).getBytes
  }

  override def close(): Unit = {}
}

val p = new Properties()
p.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
  "org.apache.kafka.common.serialization.StringSerializer")
p.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
  "KafkaJson4sSerializer") // use the above own serializer

val producer = new KafkaProducer[String, AnyRef](p)

// send string type to topic 'logs'
producer.send(new ProducerRecord[String, AnyRef]("logs", "k1", "string value"))

// send Event type to another topic 'events'
producer.send(new ProducerRecord[String, AnyRef]("events", "k2", Event("some evt")))