How to create a Kafka topic dynamically using Apache Kafka, even with the auto.create.topics.enable property [duplicate] - apache-kafka

I am using the Camel route below to produce messages to a Kafka instance, but the topic is not present there. How can I create the topic when it does not exist in the Kafka instance?
@Component
public class kafkaConfig extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        from("timer:time")
            .to("kafka:aayush?brokers=test-hydra-cf--gejc-inbggu-ef-a.cloud.com:443&saslMechanism=PLAIN&securityProtocol=SASL_SSL&saslJaasConfig=org.apache.kafka.common.security.plain.PlainLoginModule required username=\"...\" password=\"...\";");
    }
}
Output: it says the topic is not present.
How can I create the topic when it is not present?

Camel can't; creating a topic is a plain Kafka SDK (AdminClient) action.
Plus, you'd need to enable auto topic creation on the broker (not recommended).
With spring-kafka, you can define a @Bean to create a new topic upon application startup.
https://docs.spring.io/spring-kafka/reference/html/#configuring-topics
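For example, a minimal sketch with Spring Boot and spring-kafka (Boot auto-configures a KafkaAdmin that creates any NewTopic beans at startup; the topic name, partition count, and replication factor below are illustrative):

import org.apache.kafka.clients.admin.NewTopic;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class TopicConfig {

    // Created at application startup if it does not already exist
    @Bean
    public NewTopic aayushTopic() {
        return new NewTopic("aayush", 3, (short) 1);
    }
}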
However, some organizations lock this down further such that only authorized services can create topics, so you'd need to contact your cluster administrator to create any topic.

Related

Can Kafka publish messages to AWS Lambda

I have to publish messages from a Kafka topic to a Lambda, process them, and store them in a database using a Spring Boot application. I did some research and found something to consume messages from Kafka:
public Function<KStream<String, String>, KStream<String, String>> process(){}
However, I'm not sure whether this is only used to publish the consumed messages to another Kafka topic, or whether it can be used as an event source for Lambda. I need some guidance on consuming the Kafka messages and converting them into an event source.
Brokers do not push. Consumers always poll.
The code shown is for the Kafka Streams API, which primarily writes to other Kafka topics. While you could fire HTTP events from a stream to invoke a Lambda, that's not recommended.
Alternatively, Kafka is already supported as an event source. You don't need to write any consumer code.
https://aws.amazon.com/about-aws/whats-new/2020/12/aws-lambda-now-supports-self-managed-apache-kafka-as-an-event-source/
This is possible from MSK or a self-managed Kafka cluster.
As for "process them and store in a database":
Your Lambda could process the data and send it to a new Kafka topic using a producer. You can then use MSK Connect, or run your own Kafka Connect cluster elsewhere, to dump records into a database. No Spring/Java code would be necessary for that part.
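For the producer part ("send it to a new Kafka topic"), a minimal sketch of a plain Java producer that a Lambda handler could call; the broker address, class name, and topic name are illustrative:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ProcessedEventPublisher {
    // Illustrative broker address and output topic; replace with your own
    private static final String BOOTSTRAP_SERVERS = "broker-1:9092";
    private static final String OUTPUT_TOPIC = "processed-events";

    private final KafkaProducer<String, String> producer;

    public ProcessedEventPublisher() {
        Properties props = new Properties();
        props.put("bootstrap.servers", BOOTSTRAP_SERVERS);
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        this.producer = new KafkaProducer<>(props);
    }

    // Send one processed record; a Connect sink can then move it into the database
    public void publish(String key, String processedValue) {
        producer.send(new ProducerRecord<>(OUTPUT_TOPIC, key, processedValue));
    }

    public void close() {
        producer.close();
    }
}

Kafka Connect (for example a JDBC sink connector) can then move the records from that topic into the database without any further code.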

How do I create a Kafka topic on the fly / on startup for the producer to send to?

I'm starting to use the Confluent .NET library for Kafka and am attempting to implement a pattern I used against Azure Service Bus: create the topic upon startup of the producer application (create if not exists). How would this be done with the Kafka APIs, and can it be done at all?
This would allow the topics to be part of source control and configured in the automated release processes rather than manually set up per topic/environment. Also, I'd prefer that my developers not have to go to each Kafka instance / environment and configure them first to match.
If I can't do it this way, I'll have to bake it into bash scripts in the release process, but would prefer it in the startup code.
You could enable the cluster-wide configuration auto.create.topics.enable.
This will automatically create a topic if a new producer tries to send data to a topic that does not exist yet.
However, be aware of the following:
- The topic will be created with the broker's default settings for replication factor, number of partitions, and retention. Make sure to change those defaults as required; note that all automatically created topics will have the same configuration.
- Typos in the topic name in the producer code can lead to the unwanted creation of topics.
Alternatively, you can make use of the AdminClient API. An example is shown here:
static async Task CreateTopicAsync(string bootstrapServers, string topicName)
{
    using (var adminClient = new AdminClientBuilder(new AdminClientConfig { BootstrapServers = bootstrapServers }).Build())
    {
        try
        {
            await adminClient.CreateTopicsAsync(new TopicSpecification[]
            {
                new TopicSpecification { Name = topicName, ReplicationFactor = 1, NumPartitions = 1 }
            });
        }
        catch (CreateTopicsException e)
        {
            Console.WriteLine($"An error occurred creating topic {e.Results[0].Topic}: {e.Results[0].Error.Reason}");
        }
    }
}

Kafka Connect Avro consumer in Scala

I have a producer built with Kafka Connect that uses the Confluent Kafka Connect API and publishes messages as a SourceRecord, which contains a schema and a struct, as shown below.
I am looking for sample code to build a Kafka consumer in Scala that consumes the messages and deserializes them into an object.
import org.apache.kafka.connect.source.SourceRecord;
import org.apache.kafka.connect.source.SourceTask;

// publish kafka message in avro format
protected SourceRecord makeSourceRecord(AvroDataEvent avroDataEvent) {
    return new SourceRecord(
            partitionKey(config.sourceJdbcUrl),
            config.topicName,
            avroDataEvent.schema(),
            avroDataEvent.struct());
}
You can consume directly from the config.topicName topic using the Confluent KafkaAvroDeserializer class along with the schema registry that the Connect worker is configured with.
Just because data came from Connect doesn't mean you must use the Connect API to read it.
Regarding sample code, try this as a starting point (it's in Kotlin): http://aseigneurin.github.io/2018/08/03/kafka-tutorial-5-consuming-avro.html
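As a rough starting point in plain Java (the same Confluent classes are called identically from Scala), a sketch assuming the Connect worker uses the Avro converter and the schema registry runs at http://localhost:8081; the topic name and group id are illustrative:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.avro.generic.GenericRecord;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class AvroConsumerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "avro-consumer-example");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // KafkaAvroDeserializer looks up the writer schema in the schema registry
        props.put("value.deserializer", "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("schema.registry.url", "http://localhost:8081");

        try (KafkaConsumer<String, GenericRecord> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("my-connect-topic")); // config.topicName
            while (true) {
                ConsumerRecords<String, GenericRecord> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, GenericRecord> record : records) {
                    // GenericRecord exposes the Avro fields; map it to your own object as needed
                    System.out.println(record.value());
                }
            }
        }
    }
}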

Delete Kafka logs for consumed messages, using SCS

I am new to Kafka and Spring Cloud Stream and need some help.
Setup
I have two Spring Boot applications, App-1 and App-2.
I am using Spring Cloud Stream and spring-cloud-stream-binder-kafka for async communication.
There is one topic, TOPIC-1.
Use Case
Suppose App-1 sends a message on topic TOPIC-1, to which App-2 is listening.
App-2 consumes the message and processes it successfully.
The consumer offset for that topic is then incremented.
Question
How can I implement a mechanism to delete only the successfully consumed messages' data from the Kafka logs after a specified period of time?
In Kafka, tracking what has been consumed is the responsibility of the consumer, so I guess there must be some Kafka log control mechanism in Spring Cloud Stream Kafka that I am not aware of.
NOTE 1: I know about the Kafka log retention time and disk properties, but Kafka logs will be deleted even for unconsumed messages.
NOTE 2: I have gone through this question, but it doesn't help.
There is no such mechanism, that I am aware of, in Kafka; and certainly not in Spring Cloud Stream or the libraries it is based on. Kafka clients don't have access to such low-level constructs.
Also, consumer offsets are completely separate from the topic logs; in modern brokers, they are stored in a special internal topic.
EDIT
Per the comment below, the kafka-delete-records.sh command-line tool can be used.
Note that this uses the Scala AdminClient, which is not on the SCSt classpath by default (since 2.0).
However, the Java AdminClient supports similar functionality:
/**
 * Delete records whose offset is smaller than the given offset of the corresponding partition.
 *
 * This is a convenience method for {@link #deleteRecords(Map, DeleteRecordsOptions)} with default options.
 * See the overload for more details.
 *
 * This operation is supported by brokers with version 0.11.0.0 or higher.
 *
 * @param recordsToDelete The topic partitions and related offsets from which records deletion starts.
 * @return The DeleteRecordsResult.
 */
public DeleteRecordsResult deleteRecords(Map<TopicPartition, RecordsToDelete> recordsToDelete) {
    return deleteRecords(recordsToDelete, new DeleteRecordsOptions());
}
You can create an AdminClient using Boot's auto-configured KafkaAdmin:
AdminClient client = AdminClient.create(kafkaAdmin.getConfig());
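A sketch of calling deleteRecords (the helper class and method names are made up, and the partition and offset values are illustrative):

import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.RecordsToDelete;
import org.apache.kafka.common.TopicPartition;
import org.springframework.kafka.core.KafkaAdmin;

public class RecordPurger {

    // Deletes all records below the given offset in partition 0 of the topic
    public static void purgeBeforeOffset(KafkaAdmin kafkaAdmin, String topic, long offset)
            throws ExecutionException, InterruptedException {
        try (AdminClient client = AdminClient.create(kafkaAdmin.getConfig())) {
            Map<TopicPartition, RecordsToDelete> toDelete = Collections.singletonMap(
                    new TopicPartition(topic, 0), RecordsToDelete.beforeOffset(offset));
            client.deleteRecords(toDelete).all().get(); // block until the broker applies the deletion
        }
    }
}

For example, purgeBeforeOffset(kafkaAdmin, "TOPIC-1", 42L). Note that this truncates the partition up to the given offset for every consumer group, so it only makes sense once all groups have consumed past that point.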

Two Spring Kafka consumers with different configurations in one application

I am a newbie to Kafka, working on a Spring Kafka POC. Our Kafka servers are Kerberized, and with all the required configuration we are able to access the Kerberized Kafka server. Now we have another requirement: we have to consume topics from non-Kerberized (simple Kafka consumer) Kafka servers. Can we do this in a single application by creating another KafkaConsumer with its own listener?
Yes; just define a different consumer factory bean for the second consumer.
If you are using Spring Boot's auto configuration, you will have to manually declare both because the auto configuration is disabled if a user-defined bean is discovered.
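A sketch of the second, non-Kerberized consumer setup (broker address, group id, topic, and bean names are illustrative; the Kerberized factory is declared the same way with its SASL/GSSAPI properties):

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
import org.springframework.stereotype.Component;

@Configuration
class PlainKafkaConfig {

    @Bean
    ConsumerFactory<String, String> plainConsumerFactory() {
        Map<String, Object> props = new HashMap<>();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "plain-broker:9092"); // illustrative
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "plain-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        return new DefaultKafkaConsumerFactory<>(props);
    }

    @Bean
    ConcurrentKafkaListenerContainerFactory<String, String> plainListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(plainConsumerFactory());
        return factory;
    }
}

@Component
class PlainTopicListener {

    // Point this listener at the non-Kerberized container factory by name
    @KafkaListener(topics = "plain-topic", containerFactory = "plainListenerContainerFactory")
    public void listen(String message) {
        System.out.println(message);
    }
}

Any @KafkaListener that should read from the Kerberized cluster simply references the other container factory by name.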