Customize input Kafka topic name for Spring Cloud Stream

Since @EnableBinding and @StreamListener(Sink.INPUT) were deprecated in favor of the functional programming model, I need to create a consumer that reads messages from a Kafka topic.
My consumer function:
@Bean
public Consumer<Person> log() {
    return person -> {
        System.out.println("Received: " + person);
    };
}
and my application.yml configuration:
spring:
  cloud:
    stream:
      kafka:
        binder:
          brokers: localhost:9092
      bindings:
        consumer:
          destination: messages
          contentType: application/json
Instead of connecting to the messages topic, it keeps connecting to a topic named log-in-0.
How can I fix this?

spring.cloud.stream.bindings.log-in-0.destination=messages
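With the functional programming model the binding name is derived from the function bean name as <functionName>-in-<index>, so the destination has to be set on the log-in-0 binding rather than on one called consumer. As a sketch, the equivalent application.yml would look roughly like this (spring.cloud.function.definition only needs to be set explicitly when more than one function bean is present):
spring:
  cloud:
    function:
      definition: log
    stream:
      kafka:
        binder:
          brokers: localhost:9092
      bindings:
        log-in-0:
          destination: messages
          contentType: application/json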

Related

Spring cloud stream: Attaching function to binder by type

I have a Spring Cloud Stream application implemented with the functional approach. The app consumes events from multiple Kafka topics, normalizes the input into an output schema (always the same schema) and publishes to Kafka. I am not using Kafka Streams since no join/enrichment/state is required.
I want to allow flexible deployment by controlling at runtime which input topics are consumed: either all topics or a single topic. My approach was to declare a dedicated function for each event type and a dedicated binding for each function.
The problem is that the binder (there is a single one) routes all incoming messages to all bindings, and I get a ClassCastException when the wrong function is called to handle some event type.
I thought of the following solutions, yet I want to know if there is a better way:
Having a separate binder per binding. I would rather not, especially since I am using a well-configured binder and I don't want to simply duplicate it.
Having a single binder and a single function that takes Message<?>, internally checks the payload type, casts it and handles it by type (a rough sketch of this option follows below).
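For illustration, a rough sketch of that second option might look like the following (the router bean name is hypothetical; DataFunction and MoreFunction are the same delegates used by the dedicated functions shown further down, and MessageBuilder comes from spring-messaging):
@Bean
public Function<Message<?>, Message<Output>> router() {
    // Reuse the type-specific delegates and dispatch on the payload's runtime type.
    Function<Message<Data>, Message<Output>> dataFunction = new DataFunction();
    Function<Message<More>, Message<Output>> moreFunction = new MoreFunction();
    return message -> {
        Object payload = message.getPayload();
        if (payload instanceof Data) {
            return dataFunction.apply(MessageBuilder.createMessage((Data) payload, message.getHeaders()));
        }
        if (payload instanceof More) {
            return moreFunction.apply(MessageBuilder.createMessage((More) payload, message.getHeaders()));
        }
        throw new IllegalArgumentException("Unsupported payload type: " + payload.getClass());
    };
}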
My application.yaml looks like this:
spring:
  cloud:
    function:
      definition: data;more
    stream:
      default-binder: kafka-string-avro
      bindings:
        data-in-0:
          binder: kafka-string-avro
          destination: data.emails.events
          group: communication_system_events_data_gp
        data-out-0:
          binder: kafka-string-avro
          destination: communication.system.emails.events
          producer:
            useNativeEncoding: true
        more-in-0:
          binder: kafka-string-avro
          destination: communication.emails.send.status
          group: communication_system_events_more_gp
        more-out-0:
          binder: kafka-string-avro
          destination: communication.system.emails.events
          producer:
            useNativeEncoding: true
My functions:
#Bean("data")
public Function<Message<Data>, Message<Output>> dataFunction() {
return new DataFunction();
}
#Bean("more")
public Function<Message<More>, Message<Output>> moreFunction() {
return new MoreFunction();
}
Not sure where the issue is, but I am seeing some configuration issues with what you provided. It might be a typo when you copied it into the question, but the following config should isolate the two topics to their corresponding functions.
spring:
  cloud:
    function:
      definition: dataFunction;moreFunction
    stream:
      default-binder: kafka-string-avro
      bindings:
        dataFunction-in-0:
          binder: kafka-string-avro
          destination: data.emails.events
          group: communication_system_events_data_gp
        dataFunction-out-0:
          binder: kafka-string-avro
          destination: communication.system.emails.events
          producer:
            useNativeEncoding: true
        moreFunction-in-0:
          binder: kafka-string-avro
          destination: communication.emails.send.status
          group: communication_system_events_more_gp
        moreFunction-out-0:
          binder: kafka-string-avro
          destination: communication.system.emails.events
          producer:
            useNativeEncoding: true
#Bean("data")
public Function<Message<Data>, Message<Output>> dataFunction() {
return new DataFunction();
}
#Bean("more")
public Function<Message<More>, Message<Output>> moreFunction() {
return new MoreFunction();
}

MongoDB inserting twice when called on different threads

Basically I am consuming messages from Spring Cloud Stream Kafka and inserting them into MongoDB.
My code works fine if my Mongo cluster is up.
I have two problems when my Mongo instance is down:
1. Auto commit in Cloud Stream is disabled (autoCommitOffset set to false), yet re-polling does not happen even though the message has not been acknowledged.
2. Checking the Mongo connection takes some time, and if two messages with the same ID arrive in that window and I then start the Mongo instance again, both get inserted, whereas in the normal case deduplication works fine.
Is there a solution for these?
Here is my code:
interface ResourceInventorySink {
    companion object {
        const val INPUT = "resourceInventoryInput"
    }

    @Input(INPUT)
    fun input(): SubscribableChannel
}
@EnableBinding(ResourceInventorySink::class)
class InventoryEventListeners {

    val logger = LoggerFactory.getLogger(javaClass)

    @Autowired
    lateinit var resourceInventoryService: ResourceInventoryService

    @StreamListener(ResourceInventorySink.INPUT, condition = OperationConstants.INSERT)
    fun receiveInsert(event: Message<ResourceInventoryEvent>) {
        logger.info("received Insert message {}", event.payload.toString())
        val success = resourceInventoryService.insert(event.payload)
        success.subscribe({
            logger.info("Data Inserted {}", event.payload.toString())
            event.headers.get(KafkaHeaders.ACKNOWLEDGMENT, Acknowledgment::class.java)?.acknowledge()
        }, {
            if (it !is DataAccessResourceFailureException) {
                logger.error("Exception Occurred {} {}", it.message, it.cause.toString())
                event.headers.get(KafkaHeaders.ACKNOWLEDGMENT, Acknowledgment::class.java)?.acknowledge()
            } else {
                logger.error("Error Inserting in Mongo DB {}", it.cause)
            }
        })
    }
}
Here is my service class:
@Service
class ResourceInventoryService() {

    val logger = LoggerFactory.getLogger(javaClass)

    @Autowired
    lateinit var resourceInventoryRepository: ResourceInventoryRepository

    fun insert(newResource: ResourceInventoryEvent) = resourceInventoryRepository
        .findByProductId(newResource.productId)
        .switchIfEmpty(newResource.convertTODocument().toMono())
        .flatMap { resourceInventoryRepository.save(it) }
        .onErrorResume { Mono.error(it) }
}
This is my application.yml:
spring:
  cloud:
    stream:
      default:
        consumer:
          useNativeEncoding: true
      kafka:
        binder:
          brokers:
            - localhost:9092
          consumer-properties:
            key.deserializer: org.apache.kafka.common.serialization.StringDeserializer
            value.deserializer: io.confluent.kafka.serializers.KafkaAvroDeserializer
            schema.registry.url: http://localhost:8081
            enable.auto.commit: false
            specific.avro.reader: true
        bindings:
          resourceInventoryInput:
            consumer:
              autoOffsetCommit: false
      default-binder: kafka
      bindings:
        resourceInventoryInput:
          binder: kafka
          destination: ${application.messaging.topic}
          content-type: application/*+avro
          group: ${application.messaging.group}
EDIT 1. Acknowledgment is null

kafka-consumer-groups CLI not showing node-kafka consumer group

I have a Kafka consumer group running on Node.js, powered by node-kafka. Whether this consumer group is active or inactive, I expect to see it reported by the kafka-consumer-groups CLI.
However, the kafka-consumer-groups CLI shows the console consumers but not the node consumer.
I can see the node consumer group in Kafka Tool, but it doesn't show up in the kafka-consumer-groups CLI output:
kafka-consumer-groups --bootstrap-server localhost:9092 --list
kafka-consumer-groups --bootstrap-server localhost:9092 --group node-kafka-consumer --describe
The kafka-consumer-groups CLI should show all consumers, both console and programmatic (in my case the node-kafka consumer).
Here is a solution that uses the kafka-node ConsumerGroup object to write offsets to Kafka instead of ZooKeeper:
const kafka = require('kafka-node');
const { ConsumerGroup } = kafka;

const consumerOptions = {
  kafkaHost: 'localhost:9092',
  groupId: 'kafka-node-consumer-group',
  protocol: ['roundrobin'],
  fromOffset: 'earliest'
};
const topics = ['zoo_animals'];

const consumerGroup = new ConsumerGroup(
  { id: 'node-app-1', ...consumerOptions },
  topics
);
consumerGroup.on('message', onMessage);
consumerGroup.on('error', onError);

function onMessage(message) {
  console.log('message', message);
}

function onError(error) {
  console.log('error', error);
}

process.once('SIGINT', function() {
  consumerGroup.close(true, err => {
    if (err) {
      console.log('error closing consumer', err);
    } else {
      console.log('closed consumer');
    }
  });
});
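Once this consumer is running and its offsets are committed to Kafka rather than ZooKeeper, the group should show up in the CLI, for example:
kafka-consumer-groups --bootstrap-server localhost:9092 --list
kafka-consumer-groups --bootstrap-server localhost:9092 --group kafka-node-consumer-group --describe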

How can I test a Spring Cloud Stream Kafka Streams application that uses Avro and the Confluent Schema Registry?

I am having trouble figuring out how to test a Spring Cloud Stream Kafka Streams application that uses Avro as message format and a (Confluent) schema registry.
The configuration could be something like this:
spring:
  application:
    name: shipping-service
  cloud:
    stream:
      schema-registry-client:
        endpoint: http://localhost:8081
      kafka:
        streams:
          binder:
            configuration:
              application:
                id: shipping-service
              default:
                key:
                  serde: org.apache.kafka.common.serialization.Serdes$IntegerSerde
              schema:
                registry:
                  url: ${spring.cloud.stream.schema-registry-client.endpoint}
              value:
                subject:
                  name:
                    strategy: io.confluent.kafka.serializers.subject.RecordNameStrategy
          bindings:
            input:
              consumer:
                valueSerde: io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde
            order:
              consumer:
                valueSerde: io.confluent.kafka.streams.serdes.avro.GenericAvroSerde
            output:
              producer:
                valueSerde: io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde
      bindings:
        input:
          destination: customer
        order:
          destination: order
        output:
          destination: order
server:
  port: 8086
logging:
  level:
    org.springframework.kafka.config: debug
NOTES:
It is using native serialization/deserialization.
Test framework: JUnit 5
I guess that for the Kafka broker I should use an EmbeddedKafkaBroker bean, but as you can see, it also relies on a schema registry that should be mocked in some way. How?
Sorting this out has been a real pain, but I finally managed to make it work using fluent-kafka-streams-tests:
Extra dependencies:
testImplementation("org.springframework.kafka:spring-kafka-test")
testImplementation("com.bakdata.fluent-kafka-streams-tests:schema-registry-mock-junit5:2.0.0")
The key is to set the necessary configuration as system properties. For that I created a separate test configuration class:
@Configuration
class KafkaTestConfiguration(private val embeddedKafkaBroker: EmbeddedKafkaBroker) {

    private val schemaRegistryMock = SchemaRegistryMock()

    @PostConstruct
    fun init() {
        System.setProperty("spring.kafka.bootstrap-servers", embeddedKafkaBroker.brokersAsString)
        System.setProperty("spring.cloud.stream.kafka.streams.binder.brokers", embeddedKafkaBroker.brokersAsString)

        schemaRegistryMock.start()
        System.setProperty("spring.cloud.stream.schema-registry-client.endpoint", schemaRegistryMock.url)
        System.setProperty("spring.cloud.stream.kafka.streams.binder.configuration.schema.registry.url", schemaRegistryMock.url)
    }

    @Bean
    fun schemaRegistryMock(): SchemaRegistryMock {
        return schemaRegistryMock
    }

    @PreDestroy
    fun preDestroy() {
        schemaRegistryMock.stop()
    }
}
Finally, the test class, where you can now produce and consume Avro messages, have your KStream process them, and take advantage of the mocked schema registry:
@EmbeddedKafka
@SpringBootTest(properties = [
    "spring.profiles.active=local",
    "schema-registry.user=",
    "schema-registry.password=",
    "spring.cloud.stream.bindings.event.destination=event",
    "spring.cloud.stream.bindings.event.producer.useNativeEncoding=true",
    "spring.cloud.stream.kafka.streams.binder.configuration.application.server=localhost:8080",
    "spring.cloud.stream.kafka.streams.bindings.event.consumer.keySerde=io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde",
    "spring.cloud.stream.kafka.streams.bindings.event.consumer.valueSerde=io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde"])
class MyApplicationTests {

    @Autowired
    private lateinit var embeddedKafka: EmbeddedKafkaBroker

    @Autowired
    private lateinit var schemaRegistryMock: SchemaRegistryMock

    @Test
    fun `should process events`() {
        val senderProps = KafkaTestUtils.producerProps(embeddedKafka)
        senderProps[ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG] = "io.confluent.kafka.serializers.KafkaAvroSerializer"
        senderProps[ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG] = "io.confluent.kafka.serializers.KafkaAvroSerializer"
        senderProps["schema.registry.url"] = schemaRegistryMock.url
        val pf = DefaultKafkaProducerFactory<Int, String>(senderProps)
        try {
            val template = KafkaTemplate(pf, true)
            template.defaultTopic = "event"
            ...
}

How can we configure value.subject.name.strategy for schemas in Spring Cloud Stream Kafka producers, consumers and KStreams?

I would like to customize the naming strategy of the Avro schema subjects in Spring Cloud Stream Producers, Consumers and KStreams.
In plain Kafka this is done with the properties key.subject.name.strategy and value.subject.name.strategy (see https://docs.confluent.io/current/schema-registry/serializer-formatter.html#subject-name-strategy).
In a native Kafka Producer this works:
private val producer: KafkaProducer<Int, Customer>

init {
    val props = Properties()
    ...
    props[AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG] = "http://localhost:8081"
    props[AbstractKafkaAvroSerDeConfig.VALUE_SUBJECT_NAME_STRATEGY] = TopicRecordNameStrategy::class.java.name
    producer = KafkaProducer(props)
}

fun sendCustomerEvent(customer: Customer) {
    val record: ProducerRecord<Int, Customer> = ProducerRecord("customer", customer.id, customer)
    producer.send(record)
}
However, I cannot find how to do this in Spring Cloud Stream. So far I have tried this in a producer:
spring:
  application:
    name: spring-boot-customer-service
  cloud:
    stream:
      kafka:
        bindings:
          output:
            producer:
              configuration:
                key:
                  serializer: org.apache.kafka.common.serialization.IntegerSerializer
                value:
                  subject:
                    name:
                      strategy: io.confluent.kafka.serializers.subject.TopicRecordNameStrategy
Apparently Spring Cloud Stream uses its own subject naming strategy, defined by the interface org.springframework.cloud.stream.schema.avro.SubjectNamingStrategy, with only one implementation: DefaultSubjectNamingStrategy.
Is there a declarative way of configuring value.subject.name.strategy, or are we expected to provide our own org.springframework.cloud.stream.schema.avro.SubjectNamingStrategy implementation and set the property spring.cloud.stream.schema.avro.subject-naming-strategy?
As pointed out in the other answer, there is a dedicated property, spring.cloud.stream.schema.avro.subjectNamingStrategy, that allows setting up a different naming strategy for Kafka producers.
I contributed org.springframework.cloud.stream.schema.avro.QualifiedSubjectNamingStrategy, which provides that functionality out of the box.
In the case of Kafka Streams and native serialization/deserialization (the default behaviour since Spring Cloud Stream 3.0.0) you have to use Confluent's implementation (io.confluent.kafka.serializers.subject.RecordNameStrategy) and the native properties:
spring:
  application:
    name: shipping-service
  cloud:
    stream:
      ...
      kafka:
        streams:
          binder:
            configuration:
              application:
                id: shipping-service
              ...
              value:
                subject:
                  name:
                    strategy: io.confluent.kafka.serializers.subject.RecordNameStrategy
You can declare it in your properties as
spring.cloud.stream.schema.avro.subjectNamingStrategy=MyStrategy
where MyStrategy is an implementation of the interface, for instance:
object MyStrategy : SubjectNamingStrategy {
    override fun toSubject(schema: Schema): String = schema.fullName
}