I have an application with a high level of load, and performance is critical.
Now I'm migrating the application to use EJB. I'm very worried about using EJB to consume messages from queues, because transactionality can decrease performance.
Currently I'm consuming X messages in the same transaction, but I don't know how to do the same using MDBs.
Is it possible to consume a block of messages in an MDB using only one transaction?
It is not guaranteed that the same MDB instance will process the whole stream of messages.
I think you can achieve what you want by using a stateless bean with an @Asynchronous invocation and passing your set of messages.
Something like this:
@Stateless
public class AsynchProcessor {

    @Asynchronous
    public void processMessages(Set<MyMessage> messages) {....}
}
Have the method return a Future if necessary. Then, in your client:
Set<MyMessage> messages = ...
asynchProcessor.processMessages(messages);
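If the client needs to know when the batch has been processed, the method can return a Future instead of void. A minimal sketch, assuming your MyMessage type and the standard EJB 3.1 AsyncResult wrapper:

import java.util.Set;
import java.util.concurrent.Future;
import javax.ejb.AsyncResult;
import javax.ejb.Asynchronous;
import javax.ejb.Stateless;

@Stateless
public class AsynchProcessor {

    @Asynchronous
    public Future<Integer> processMessages(Set<MyMessage> messages) {
        int processed = 0;
        for (MyMessage message : messages) {
            // process each message; the whole set is handled in this single bean invocation
            processed++;
        }
        // AsyncResult is only a wrapper; the container hands the real Future to the caller
        return new AsyncResult<>(processed);
    }
}

The client can then call get() (or poll isDone()) on the returned Future to find out when the whole set has been handled.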
I use Spring Cloud Stream with Kafka. I have a topic X, with partition Y and consumer group Z. Spring Boot starter parent 2.7.2, Spring Kafka version 2.8.8:
@StreamListener("input-channel-name")
public void processMessage(final DomainObject domainObject) {
    // some processing
}
It works fine.
I would like to have an endpoint in the app that allows me to re-read/re-process (seek, right?) all the messages in X.Y again. Not after rebalancing (ConsumerSeekAware#onPartitionsAssigned) or after an app restart (KafkaConsumerProperties#resetOffsets), but on demand, like this:
@RestController
@Slf4j
@RequiredArgsConstructor
public class SeekController {

    @GetMapping
    public void seekToBeginningForDomainObject() {
        // seekToBeginning for X, Y, input-channel-name
    }
}
I just can't achieve that. Is it even possible? I understand that I have to do this at the consumer level, probably on the consumer that is created after the @StreamListener("input-channel-name") subscription, right? But I have no clue how to obtain that consumer. How can I execute a seek on demand, so that Kafka sends the messages to the consumer again? I just want to reset the offset for X.Y.Z to 0 so the app loads and processes all the messages again.
https://docs.spring.io/spring-cloud-stream/docs/current/reference/html/spring-cloud-stream-binder-kafka.html#rebalance-listener
KafkaBindingRebalanceListener.onPartitionsAssigned() provides a boolean to indicate whether this is an initial assignment vs. a rebalance assignment.
Spring Cloud Stream does not currently support arbitrary seeks at runtime, even though the underlying KafkaMessageDrivenChannelAdapter does support getting access to a ConsumerSeekCallback (which allows arbitrary seeks between polls). It would need an enhancement to the binder to allow access to it.
It is possible, though, to consume idle container events in an event listener; the event contains the consumer, so you could do arbitrary seeks under those conditions.
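A minimal sketch of that idle-event approach, assuming idle events are enabled for the binding (e.g. via the binder's idleEventInterval consumer property) and using a hypothetical rewind flag that your REST endpoint would set:

import java.util.concurrent.atomic.AtomicBoolean;

import org.springframework.context.event.EventListener;
import org.springframework.kafka.event.ListenerContainerIdleEvent;
import org.springframework.stereotype.Component;

@Component
public class RewindOnIdle {

    // set to true by your REST endpoint when a full re-read is requested (hypothetical flag)
    private final AtomicBoolean rewindRequested = new AtomicBoolean();

    public void requestRewind() {
        rewindRequested.set(true);
    }

    @EventListener
    public void onIdle(ListenerContainerIdleEvent event) {
        // idle events are published on the consumer thread, so touching the consumer here is safe
        if (rewindRequested.compareAndSet(true, false)) {
            event.getConsumer().seekToBeginning(event.getTopicPartitions());
        }
    }
}

The GET endpoint would then just call requestRewind(), and the seek happens the next time the container goes idle.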
Currently, I have a Kafka Listener configured with a ConcurrentKafkaListenerContainerFactory and a SeekToCurrentErrorHandler (with a DeadLetterPublishingRecoverer configured with 1 retry).
My listener method is annotated with @Transactional (and so are all the methods in my services that interact with the DB).
My listener method does the following:
Receive the message from Kafka.
Interact with several services that save different parts of the received data to the DB.
Ack the message in Kafka (i.e., commit the offset).
If it fails somewhere in the middle, it should roll back and retry until the maximum number of retries is reached, then send the message to the DLT.
I'm trying to make this method fully transactional, i.e., if something fails, all previous changes are rolled back.
However, the @Transactional annotation on the listener method is not enough.
How can I achieve this?
What configuration should I employ to make the listener method fully transactional?
If you are not also publishing to Kafka from the listener, there is no need for (or benefit from) using Kafka transactions; just overhead. The STCEH + DLPR is enough.
If you are also publishing to Kafka (and want those to be rolled back too), then see the documentation - configure a KafkaTransactionManager in the listener container.
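A configuration sketch for the simple case (DB rollback via @Transactional, no Kafka transactions), assuming a kafkaTemplate bean for the dead-letter publishing and the pre-2.8 SeekToCurrentErrorHandler API:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.KafkaTemplate;
import org.springframework.kafka.listener.DeadLetterPublishingRecoverer;
import org.springframework.kafka.listener.SeekToCurrentErrorHandler;
import org.springframework.util.backoff.FixedBackOff;

@Configuration
public class KafkaListenerConfig {

    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory(
            ConsumerFactory<String, String> consumerFactory,
            KafkaTemplate<Object, Object> kafkaTemplate) {

        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory);

        // after one delivery retry, publish the failed record to the dead letter topic
        DeadLetterPublishingRecoverer recoverer = new DeadLetterPublishingRecoverer(kafkaTemplate);
        factory.setErrorHandler(new SeekToCurrentErrorHandler(recoverer, new FixedBackOff(0L, 1L)));
        return factory;
    }
}

With this, the @Transactional listener rolls back the DB work on any exception, the error handler re-seeks the record for the retry, and once retries are exhausted the recoverer sends it to the DLT; only if you also produce to Kafka from the listener would you additionally configure a KafkaTransactionManager on the container.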
Let me describe the rationale behind my question:
We have a Micronaut-based application consuming messages from Kafka broker.
The consumed messages are processed and fed to another remote "downstream" application.
If this downstream application is restarted on purpose, it takes a while before it is ready to accept further messages from our Micronaut-based application.
So we have the idea of sending our Micronaut application a request to SUSPEND/PAUSE consumption of messages from Kafka (e.g., via HTTP to an appropriate endpoint).
The KafkaConsumer interface seems to have appropriate methods to achieve this goal, like
public void pause(java.util.Collection<TopicPartition> partitions)
public void resume(java.util.Collection<TopicPartition> partitions)
But how do we get a reference to the appropriate KafkaConsumer instance into our HTTP endpoint?
We've tried to get it injected into the constructor of the HTTP endpoint/controller class, but this yields:
Error instantiating bean of type [HttpController]
Message: Missing bean arguments for type: org.apache.kafka.clients.consumer.KafkaConsumer. Requires arguments: AbstractKafkaConsumerConfiguration consumerConfiguration
It's possible to get a reference to the KafkaConsumer instance as a method parameter of @Topic-annotated receive methods, as described in the Micronaut Kafka documentation, but this would mean storing the reference in an instance variable, having the HTTP endpoint access it, and so on, which doesn't sound very convincing:
You get a reference to the KafkaConsumer ONLY when receiving the next message! This might be appropriate for SUSPENDING/PAUSING, but not for RESUMING!
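For reference, the pattern described above looks roughly like this (a sketch with hypothetical topic/group names; the stashed reference must only ever be used from the consumer's own thread, which is exactly the limitation discussed below):

import io.micronaut.configuration.kafka.annotation.KafkaListener;
import io.micronaut.configuration.kafka.annotation.Topic;
import org.apache.kafka.clients.consumer.Consumer;

@KafkaListener(groupId = "my-group")
public class ForwardingListener {

    // only updated when a record arrives; unsafe to call from an HTTP thread
    private volatile Consumer<?, ?> kafkaConsumer;

    @Topic("my-topic")
    void receive(String message, Consumer<?, ?> kafkaConsumer) {
        this.kafkaConsumer = kafkaConsumer;
        // ... forward the message to the downstream application
    }
}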
By the way, calling KafkaConsumer.resume(...) on a reference saved as an instance variable yields
java.util.ConcurrentModificationException: KafkaConsumer is not safe for multi-threaded access
at org.apache.kafka.clients.consumer.KafkaConsumer.acquire(KafkaConsumer.java:2201)
at org.apache.kafka.clients.consumer.KafkaConsumer.acquireAndEnsureOpen(KafkaConsumer.java:2185)
at org.apache.kafka.clients.consumer.KafkaConsumer.resume(KafkaConsumer.java:1842)
[...]
I think the same holds true when implementing the KafkaConsumerAware interface to store a reference to the freshly created KafkaConsumer instance.
So, are there any ideas on how to handle this in an appropriate way?
Thanks
Christian
I am using JMS to get Yahoo stock quotes asynchronously. I am creating a JMSContext on the producer side, and I would like to use the same context in the consumer class as well. When I make it public static, the JMSContext is set to null. So can a JMSContext be public and static? Is there any other way to create the JMSContext in the consumer? I am using NetBeans to implement this task.
A JMSContext is a Java object, so it can have whatever visibility you require for your application architecture. However, read the JMS spec and you'll see that only one thread may use it at any one time. If you can enforce that in your application, you can share the context; if that doesn't make sense, don't. It's not the JMS provider's job to enforce this threading restriction.
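A minimal sketch of sharing one context while honouring that rule, assuming a JMS 2.0 ConnectionFactory and Queue obtained elsewhere (the class and names are placeholders): every use of the shared JMSContext goes through the same lock.

import javax.jms.ConnectionFactory;
import javax.jms.JMSConsumer;
import javax.jms.JMSContext;
import javax.jms.Queue;

public class SharedJmsContext {

    private final JMSContext context;
    private final JMSConsumer consumer;
    private final Queue queue;

    public SharedJmsContext(ConnectionFactory connectionFactory, Queue queue) {
        this.context = connectionFactory.createContext(); // one context, created once and shared
        this.consumer = context.createConsumer(queue);
        this.queue = queue;
    }

    // synchronized methods guarantee that only one thread touches the
    // JMSContext (and its consumer) at any one time, as the spec requires
    public synchronized void send(String body) {
        context.createProducer().send(queue, body);
    }

    public synchronized String receive(long timeoutMillis) {
        return consumer.receiveBody(String.class, timeoutMillis);
    }
}

Making the context public static is not the problem in itself; the problem is concurrent access, and if the producer and consumer run on different threads it is usually simpler to give each its own JMSContext.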
We were using the Kafka 0.8 async producer, but it is dropping messages (and there is no async response from another thread, or we could keep using async).
We have set batch.num.messages to 500 and our consumer is not changing. I read that batch.num.messages only applies to the async producer, not the sync one, so I need to batch myself. We are using compression.codec=snappy and our own serializer class.
My question is two-fold:
Can I assume that I can just use our own serializer class and then send the message on my own?
Do I need to worry about any special snappy options/parameters that Kafka might be using?
Yes, that's because batch.num.messages controls the behaviour of the async producer only. This is stated explicitly in the relevant configuration guide:
The number of messages to send in one batch when using async mode. The producer will wait until either this number of messages are ready to send or queue.buffer.max.ms is reached.
In order to have batching with the sync producer, you have to send a list of messages:
public void trySend(List<M> messages) {
    List<KeyedMessage<String, M>> keyedMessages = Lists.newArrayListWithExpectedSize(messages.size());
    for (M m : messages) {
        keyedMessages.add(new KeyedMessage<String, M>(topic, m));
    }
    try {
        producer.send(keyedMessages);
    } catch (Exception ex) {
        log.error(ex);
    }
}
Note that I'm using kafka.javaapi.producer.Producer here.
Once send is executed, the batch is sent.
Can I assume that I can just use our own serializer class and then send the message on my own?
Do I need to worry about any special snappy options/parameters that Kafka might be using?
Both compression and the serializer are orthogonal features that don't affect batching; they are applied to individual messages.
Note that there will be API changes in upcoming Kafka versions, and the async/sync APIs will be unified.
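To make the two quoted points concrete, here is what the sync-producer configuration might look like with snappy and a custom serializer. This is only a sketch: com.example.MyMessageEncoder is a hypothetical kafka.serializer.Encoder implementation, MyMessage stands in for your own message type, and no snappy-specific settings are needed beyond compression.codec.

import java.util.Properties;

import kafka.javaapi.producer.Producer;
import kafka.producer.ProducerConfig;

public class SyncProducerFactory {

    public static Producer<String, MyMessage> create() {
        Properties props = new Properties();
        props.put("metadata.broker.list", "broker1:9092,broker2:9092"); // placeholder brokers
        props.put("producer.type", "sync");                             // sync producer: no internal batching
        props.put("compression.codec", "snappy");                       // compression handled transparently by the producer
        props.put("serializer.class", "com.example.MyMessageEncoder");  // hypothetical custom serializer
        props.put("request.required.acks", "1");
        return new Producer<>(new ProducerConfig(props));
    }
}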