Here is my situation:
We have a Spring Cloud Stream 3 Kafka service connected to multiple topics on the same broker, but I want to control whether it connects to a specific topic based on properties.
Every topic has its own binder and binding, but the broker is the same for all of them.
I tried disabling the binding (the only solution I have found so far) using the property below. That stops the StreamListener from receiving messages, but the connection to the topic and the rebalancing still happen.
spring:
  cloud:
    stream:
      bindings:
        ...
        anotherBinding:
          consumer:
            ...
            auto-startup: false
I wonder if there is any setting at the binder level that prevents it from starting. The consumer for one of the topics should only be available in one of the environments.
Thanks
Disabling the bindings by setting autoStartup to false should work; I am not sure what the issue is.
It doesn't look like you are using the new functional model, but rather StreamListener. If you switch to the functional model, here is another thing you can try: you can disable bindings by not including the corresponding functions at runtime. For example, assume you have the following two consumers.
@Bean
public Consumer<String> one() {
    return message -> { /* handle messages arriving on binding one-in-0 */ };
}

@Bean
public Consumer<String> two() {
    return message -> { /* handle messages arriving on binding two-in-0 */ };
}
When running this app, you can provide the property spring.cloud.function.definition to include/exclude functions. For instance, when you run it with spring.cloud.function.definition=one, then the consumer two will not be activated at all. When running with spring.cloud.function.definition=two, then the consumer one will not be activated.
The downside to the above approach is that if you decide to start the other function once the app has started (given autoStartup is false on that function), it will not work, as it was not part of the original bindings created through spring.cloud.function.definition. However, based on your requirements, this is probably not an issue, as you know which environments are targeted for the corresponding topics. In other words, if you know that consumer one always needs to consume from topic one, then you simply don't include consumer two as part of the definition.
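To make that concrete, here is roughly what the properties could look like when only consumer one should be active (the binding name follows the functional one-in-0 convention; the topic name is just a placeholder):

spring.cloud.function.definition=one
spring.cloud.stream.bindings.one-in-0.destination=topic-one

With that definition, the binding for two is never created, so there is no connection to its topic and no rebalancing for it.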
Related
I'm new to Kafka. How can I define two (or more) Kafka consumers using configuration .properties or .yml files? I'm interested in using the spring.kafka.* application properties, and I would like to specify different properties for each consumer in these configuration files. For example, consumerA would have spring.kafka.bootstrap-servers=localhost:9090 and consumerB spring.kafka.bootstrap-servers=localhost:9091. I have seen examples online using multiple @KafkaListener methods with a single application.yml, where all the common properties of the @KafkaListener (consumer) beans are defined in application.yml. That concept is fine, but if there are more consumers and they have completely different configurations, then all of the configs need to be put in the @KafkaListener annotations, which makes the class long and hard to read and the .yml obsolete. Instead, I would like something like this:
spring:
  kafka:
    consumer1:
      bootstrap-servers: localhost:9091
      auto-offset-reset: earliest
    consumer2:
      bootstrap-servers: localhost:9092
      auto-offset-reset: latest
    consumer3:
      bootstrap-servers: localhost:9093
      auto-offset-reset: latest
And how would I then connect this configuration to the appropriate beans? I could of course define the consumer beans myself and use my own custom configuration properties to create as many different consumers as I like, but it seems to me that I would be reinventing the wheel.
Thanks
No; you can't declare consumers like that. You have to use @KafkaListener annotations on methods or create listener containers/listeners yourself.
You can override the bootstrap servers in the listener itself...
@KafkaListener(... properties = "bootstrap.servers:${some.property}")
void listen(...) {
}
You can programmatically make multiple consumers from user properties, using the technique described here...
Can I add topics to my @KafkaListener at runtime
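As a rough illustration of that technique (the group id, topic name, and property values below are made-up placeholders; in a real app they would come from your own @ConfigurationProperties, and the usual spring-kafka and kafka-clients imports are assumed), you can build one listener container per configured consumer:

@Bean
public ConcurrentMessageListenerContainer<String, String> consumerAContainer() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9090");
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "groupA");
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    ConsumerFactory<String, String> consumerFactory = new DefaultKafkaConsumerFactory<>(props);

    ContainerProperties containerProps = new ContainerProperties("topicA");
    containerProps.setMessageListener((MessageListener<String, String>) record -> {
        // handle each ConsumerRecord here
    });
    return new ConcurrentMessageListenerContainer<>(consumerFactory, containerProps);
}

Repeat (or loop over your bound properties) for consumerB, consumerC, and so on, each with its own bootstrap servers; containers declared as beans are started automatically.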
We have a Spring Boot Kafka Streams processor. For various reasons, we may have a situation where we need the process to start and run, but there are no topics we wish to subscribe to. In such cases, we just want the process to "sleep", because other liveness/environment checkers depend on it running. Also, it's part of a RedHat OCP cluster, and we don't want the pod to be constantly doing a crash backoff loop. I fully understand that it'll never really do anything until it's restarted with a valid topic(s), but that's OK.
If we start it with no topics, we get this message: Failed to start bean 'kStreamBuilder'; nested exception is org.springframework.kafka.KafkaException: Could not start stream: ; nested exception is org.apache.kafka.streams.errors.TopologyException: Invalid topology: Topology has no stream threads and no global threads, must subscribe to at least one source topic or global table.
In a test environment, we could just create a topic that's never written to, but in production, we don't have that flexibility, so a programmatic solution would be best. Ideally, I think, if there's a "null topic" abstraction of some sort (a Kafka "/dev/null"), that would look the cleanest in the code.
Best practices, please?
You can set the autoStartup property on the StreamsBuilderFactoryBean to false and only start() it if you have at least one stream.
If using Spring Boot, it's available as a property:
https://docs.spring.io/spring-boot/docs/current/reference/html/application-properties.html#application-properties.integration.spring.kafka.streams.auto-startup
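A minimal sketch of that approach (hasAtLeastOneTopic() is a made-up placeholder for whatever environment check you use; the StreamsBuilderFactoryBean is the one Spring Boot auto-configures):

spring.kafka.streams.auto-startup=false

@Bean
public ApplicationRunner kafkaStreamsStarter(StreamsBuilderFactoryBean factoryBean) {
    return args -> {
        // Only build and start the topology when there is something to subscribe to;
        // otherwise the app keeps running (and passing liveness checks) with Streams idle.
        if (hasAtLeastOneTopic()) {
            factoryBean.start();
        }
    };
}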
I am trying to figure out how to use ActiveMQ Artemis to achieve the following topology. I need to have several producers writing to queues hosted on two standalone Artemis brokers. For the moment, every producer creates two connection factories that handle the connections to the two brokers and create the corresponding queues.
@Bean
public ConnectionFactory jmsConnectionFactoryBroker1() {
    ActiveMQConnectionFactory connectionFactory = new ActiveMQConnectionFactory(brokerUrl_1, username, password);
    return connectionFactory;
}

@Bean
public ConnectionFactory jmsConnectionFactoryBroker2() {
    ActiveMQConnectionFactory connectionFactory = new ActiveMQConnectionFactory(brokerUrl_2, username, password);
    return connectionFactory;
}
My main issue is that I need to know which queue is assigned to which broker, and at the same time I need to be sure that if one broker goes down for some reason I can re-create that queue on the other broker on the fly and avoid losing any further messages. So my approach was to set up the broker URLs as below:
artemis.brokerUrl_1=(tcp://myhost1:61616,tcp://myhost2:61616)?randomize=false
artemis.brokerUrl_2=(tcp://myhost2:61616,tcp://myhost1:61616)?randomize=false
So, using a different JmsTemplate for each broker URL, my intention was that the JmsTemplate referring to brokerUrl_1 would create its queues on myhost1, and likewise for the corresponding JmsTemplate for brokerUrl_2.
I would have expected (due to the randomize parameter) that each queue would have some kind of static membership to a broker, and in the case of a broker failure there would be some kind of migration by re-creating the queue from scratch on the other broker.
Instead, what I notice is that almost every time the distribution of queue creation does not happen as expected but rather randomly, since the same queue can appear on either broker, which is not desirable for my use-case.
How can I approach this and solve my problem so that I can create my queues on a predefined broker, with the fail-safe that if one broker is down the producer will create the same queue on the other broker and continue?
Note that having shared state between the brokers is not an option.
The randomize=false doesn't apply to the Artemis core JMS client. It only applies to the OpenWire JMS client distributed with ActiveMQ 5.x. Which connector is selected from the URL is determined by the connection load-balancing policy as discussed in the documentation. The default connection load-balancing policy is org.apache.activemq.artemis.api.core.client.loadbalance.RoundRobinConnectionLoadBalancingPolicy which will select a random connector from the URL list and then round-robin connections after that. There are other policies available, and if none of them give you the behavior you want then you can potentially implement your own.
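For example, a policy that always prefers the first connector in the URL list (roughly the behavior you expected from randomize=false) could look like the sketch below; Artemis already ships a FirstElementConnectionLoadBalancingPolicy that behaves this way, and the policy is normally selected via the connectionLoadBalancingPolicyClassName URL parameter (the class and package name here are placeholders):

public class FirstBrokerLoadBalancingPolicy implements ConnectionLoadBalancingPolicy {

    // Always pick index 0, i.e. the first connector listed in the URL.
    @Override
    public int select(int max) {
        return 0;
    }
}

artemis.brokerUrl_1=(tcp://myhost1:61616,tcp://myhost2:61616)?connectionLoadBalancingPolicyClassName=com.example.FirstBrokerLoadBalancingPolicy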
That said, it sounds like what you really want/need is 2 pairs of brokers where each pair consists of a live and a backup. That way if the live broker fails then all the clients can fail-over to the backup and you won't have to deal with any of this other complexity of this "fake" fail-over functionality you're trying to implement.
Also, since you're using Spring's JmsTemplate you should be aware of some well-known anti-patterns that it uses which may significantly impact performance in a negative way.
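The main one is that a plain JmsTemplate opens and closes a connection, session, and producer for every send. The usual mitigation, sketched here against the factories defined above, is to wrap each factory in Spring's CachingConnectionFactory:

@Bean
public JmsTemplate jmsTemplateBroker1(ConnectionFactory jmsConnectionFactoryBroker1) {
    // Reuse connections and sessions instead of creating them for every operation.
    CachingConnectionFactory cachingFactory = new CachingConnectionFactory(jmsConnectionFactoryBroker1);
    return new JmsTemplate(cachingFactory);
}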
I was reading about Kafka Streams' elastic scaling features.
This means Kafka Streams can hand over a task to another instance, and the task state will be re-created using the changelog. It is mentioned that instances coordinate with each other to achieve rebalancing.
But no detail is given about how exactly the rebalance works.
Is it the same as how a consumer group works, or a different mechanism, given that Kafka Streams instances are not exactly the same as consumers in a consumer group?
Visit this article for a more thorough explanation.
..."In a nutshell, running instances of your application will automatically become aware of new instances joining the group, and will split the work with them; and vice versa, if any running instances are leaving the group (e.g. because they were stopped or they failed), then the remaining instances will become aware of that, too, and will take over their work. More specifically, when you are launching instances of your Streams API based application, these instances will share the same Kafka consumer group id. The group.id is a setting of Kafka’s consumer configuration, and for a Streams API based application this consumer group id is derived from the application.id setting in the Kafka Streams configuration."...
I have a problem using KafkaEmbedded from https://mvnrepository.com/artifact/org.springframework.kafka/spring-kafka-test/2.1.10.RELEASE
I'm using KafkaEmbedded to create a Kafka broker for testing producer/consumer pipelines. These producers/consumers are standard clients from kafka-clients. I'm not using Spring Kafka clients.
Everything is working and the code works fine, but I have to use the consumeFromEmbeddedTopics() method from KafkaEmbedded to make the consumer work. If I don't use this method, the consumer does not get any messages.
There are two problems I have with this method: first, it needs the KafkaConsumer as a parameter (and I don't want to expose it in the class), and second, invoking this method gives a ConcurrentModificationException when an object invokes poll using @Scheduled.
I'm already using the auto.offset.reset property, so that's a different thing.
My question is: how do I correctly consume records from KafkaEmbedded without invoking these consumeFromEmbeddedTopics() methods?
There is nothing special about that method, it simply subscribes the consumer to the topic(s) and polls it.
There is no reason you can't do the same with your Consumer.
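In other words, something along these lines should be enough (the group id and topic name are placeholders; the bootstrap servers come from the embedded broker):

Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, kafkaEmbedded.getBrokersAsString());
props.put(ConsumerConfig.GROUP_ID_CONFIG, "test-group");
props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);

KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
// Subscribe and poll yourself instead of calling consumeFromEmbeddedTopics().
consumer.subscribe(Collections.singletonList("test-topic"));
ConsumerRecords<String, String> records = consumer.poll(5000);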