The Kafka documentation says:
The producer is thread safe and sharing a single producer instance
across threads will generally be faster than having multiple
instances.
So I have the following code and want to use only one KafkaProducer instance for all send requests. But where is the best place in the code to call close() on it, given that I can't call it in the send method? How should I write the code to handle this?
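import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.IntegerSerializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class Producer {
    private final KafkaProducer<Integer, String> producer;
    private final String topic; // kept so that send() can refer to it
    public Producer(String topic, Boolean isAsync) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, KafkaProperties.KAFKA_SERVER_URL + ":" + KafkaProperties.KAFKA_SERVER_PORT);
        props.put(ProducerConfig.CLIENT_ID_CONFIG, "DemoProducer");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, IntegerSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        this.topic = topic;
        producer = new KafkaProducer<>(props);
    }
    public void send(String message) {
        // No key is supplied here, so the record is sent with a null key.
        producer.send(new ProducerRecord<>(topic, message));
    }
}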
You can create the producer first and pass (inject) it to your web resource or resources.
Then you can use a shutdown hook to close the producer through that reference.
Even better, use the lifecycle stop hooks of your framework if it provides them.
Example in Dropwizard:
https://www.dropwizard.io/en/latest/manual/core.html?highlight=managed#managed-objects
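The create-once, inject, close-on-shutdown idea can also be sketched without any framework. This is only a minimal sketch: `SharedSender` is a hypothetical stand-in for the shared KafkaProducer (any `AutoCloseable` would do), and `closeOnShutdown` is an illustrative helper name, not part of Kafka or Dropwizard.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the shared KafkaProducer; only the lifecycle matters here.
final class SharedSender implements AutoCloseable {
    private final AtomicBoolean closed = new AtomicBoolean(false);

    void send(String message) {
        if (closed.get()) {
            throw new IllegalStateException("sender already closed");
        }
        // A real implementation would hand the message to the producer here.
    }

    boolean isClosed() {
        return closed.get();
    }

    @Override
    public void close() {
        // Idempotent: closing twice is harmless.
        closed.set(true);
    }
}

final class ShutdownHooks {
    private ShutdownHooks() {}

    // Registers a JVM shutdown hook that closes the resource, and returns the hook thread.
    static Thread closeOnShutdown(AutoCloseable resource) {
        Thread hook = new Thread(() -> {
            try {
                resource.close();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }, "close-on-shutdown");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

At startup one would create the sender once, call `ShutdownHooks.closeOnShutdown(sender)`, and pass the same instance to every resource that needs to send.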
KafkaProducer implements the AutoCloseable interface, so you can declare it in a try-with-resources block, and it will release its resources when your code leaves the scope of the block.
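A minimal sketch of the try-with-resources behaviour follows. `DemoResource` is a hypothetical stand-in for the producer, used so that the example runs without a broker. Note that the resource is closed as soon as the block exits, so this pattern suits a producer that lives for one bounded unit of work rather than one that is shared across requests.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a producer; it records what happens to it.
final class DemoResource implements AutoCloseable {
    private final List<String> events;

    DemoResource(List<String> events) {
        this.events = events;
        events.add("open");
    }

    void send(String message) {
        events.add("send:" + message);
    }

    @Override
    public void close() {
        events.add("close");
    }
}

final class TryWithResourcesDemo {
    private TryWithResourcesDemo() {}

    // Sends every message through one resource, which is closed when the block exits.
    static List<String> sendAll(List<String> messages) {
        List<String> events = new ArrayList<>();
        try (DemoResource resource = new DemoResource(events)) {
            for (String message : messages) {
                resource.send(message);
            }
        } // close() is called here, even if send() threw
        events.add("after-block");
        return events;
    }
}
```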
Do you ever need to change producer parameters at runtime, for example when changing the broker URLs or during tests?
If you do and you have a singleton producer, make sure to provide a hook to close and recreate the producer with the new parameters.
Related
I have a producer method, publisherMethod, which uses properties to create a KafkaProducer object.
I am trying to write a JUnit test for this method, but I am not able to mock the producer.
I found the MockProducer class on the web, but I don't have a clear idea of how to use it.
Properties props = new Properties();
Producer<String, String> producer = new KafkaProducer<>(props); //this line I need to mock
Exception when I try to mock :
org.apache.kafka.common.KafkaException: Failed to construct kafka producer
I need to convert org.apache.activemq.artemis.core.message.impl.CoreMessage to javax.jms.Message. How can I do this? Maybe there is a suitable util method somewhere in the code, or does it need to be done manually?
I want to intercept the following events:
afterSend
afterDeliver
messageExpired
And then send the message to a direct endpoint Camel route which requires a javax.jms.Message instance.
My recommendation would be to simply copy the message and route the copy to the address of your choice, e.g.:
public class MyPlugin implements ActiveMQServerMessagePlugin {
    ActiveMQServer server;

    @Override
    public void registered(ActiveMQServer server) {
        this.server = server;
    }

    @Override
    public void afterSend(ServerSession session,
                          Transaction tx,
                          Message message,
                          boolean direct,
                          boolean noAutoCreateQueue,
                          RoutingStatus result) throws ActiveMQException {
        Message copy = message.copy();
        copy.setAddress("foo");
        try {
            server.getPostOffice().route(copy, false);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Then a Camel consumer can pick up the message and do whatever it needs to with it. This approach has a few advantages:
It's simple. It would technically be possible to convert the org.apache.activemq.artemis.api.core.Message instance into a javax.jms.Message instance, but it's not going to be straightforward. javax.jms.Message is a JMS client class. It's not used on the server anywhere, so there is no existing facility to do any kind of conversion to or from it.
It's fast. If you use a javax.jms.Message you'd also have to use a JMS client to send it and that would mean creating and managing JMS resources like a javax.jms.Connection and a javax.jms.Session. This is not really something you want to be doing in a broker plugin as it will add a fair amount of latency. The method shown here uses the broker's own internal API to deal with the message. No client resources are necessary.
It's asynchronous. By sending the message and letting Camel pick it up later you don't have to wait on Camel at all which reduces the latency added by the plugin.
org.apache.activemq.artemis.jms.client.ActiveMQMessage
This looks like the implementation of javax.jms.Message with an underlying org.apache.activemq.artemis.api.core.client.ClientMessage which extends CoreMessage
I want to read a flat file that is 10 GB in size. For that, I chose to use a ThreadPoolTaskExecutor to make my step multi-threaded.
I am wondering how these 4 worker threads work internally. How is it that one thread doesn't read the data already read by another thread? If someone can explain how this works internally, that would be a great help.
@Bean
@StepScope
public FlatFileItemReader<Transaction> fileTransactionReader(@Value("#{jobParameters['inputFlatFile']}") Resource resource) {
    return new FlatFileItemReaderBuilder<Transaction>()
            .saveState(false)
            .resource(resource)
            .delimited()
            .names(new String[] {"account", "amount", "timestamp"})
            .fieldSetMapper(fieldSet -> {
                Transaction transaction = new Transaction();
                transaction.setAccount(fieldSet.readString("account"));
                transaction.setAmount(fieldSet.readBigDecimal("amount"));
                transaction.setTimestamp(fieldSet.readDate("timestamp", "yyyy-MM-dd HH:mm:ss"));
                return transaction;
            })
            .build();
}
Code -
@Bean
public Job multithreadedJob() {
    return this.jobBuilderFactory.get("multithreadedJob")
            .start(step1())
            .build();
}

@Bean
public Step step1() {
    ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
    taskExecutor.setCorePoolSize(4);
    taskExecutor.setMaxPoolSize(4);
    taskExecutor.afterPropertiesSet();
    return this.stepBuilderFactory.get("step1")
            .<Transaction, Transaction>chunk(100)
            .reader(fileTransactionReader(null))
            .writer(writer(null))
            .taskExecutor(taskExecutor)
            .build();
}
FlatFileItemReader is not in itself thread-safe, as it extends AbstractItemCountingItemStreamItemReader, whose javadoc states "Subclasses are inherently not thread-safe." So strictly speaking, you should wrap it in a SynchronizedItemStreamReader. See also: Can I use FlatfileItemReader with Taskexecutor?
Having said that, if you
don't care about restartability,
don't care about the line numbers,
don't use a mapping that would require state,
set saveState to false,
and don't change the reader's default bufferedReaderFactory,
then the reader is just a thin wrapper around
a BufferedReader whose method readLine is called for each FlatFileItemReader::read,
and a LineMapper that maps each line to the target type
And BufferedReader is thread-safe which makes your reader effectively safe to call in a multi-threaded step.
But beware: the Spring Batch API makes no promises about the thread-safety of the reader. Quite the opposite, actually. So the multi-threaded behavior is, at least in theory, subject to change in future versions. Furthermore, there are a lot of conditions listed above which someday may no longer hold for your implementation. Thus, using a SynchronizedItemStreamReader is really recommended.
See also Can spring batch multi-threaded step be used safely if number of items in file are very less?
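Why a synchronized wrapper keeps two worker threads from receiving the same line can be shown with plain JDK classes. This is a sketch of the idea only: `LineReader`, `SynchronizedLineReader` and `MultiThreadedReadDemo` are hypothetical names, not Spring Batch types, and the real SynchronizedItemStreamReader is not reproduced here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical reader contract: returns the next item, or null when the input is exhausted.
interface LineReader {
    String read();
}

// Delegating wrapper: only one thread at a time may advance the underlying reader.
final class SynchronizedLineReader implements LineReader {
    private final BufferedReader delegate;

    SynchronizedLineReader(BufferedReader delegate) {
        this.delegate = delegate;
    }

    @Override
    public synchronized String read() {
        try {
            return delegate.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

final class MultiThreadedReadDemo {
    private MultiThreadedReadDemo() {}

    // Lets `threads` workers drain the same reader and returns every line they saw.
    static List<String> readAll(String input, int threads) {
        LineReader reader = new SynchronizedLineReader(new BufferedReader(new StringReader(input)));
        List<String> seen = new CopyOnWriteArrayList<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                String line;
                // Each successful read() hands a distinct line to exactly one worker.
                while ((line = reader.read()) != null) {
                    seen.add(line);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

Every call to `read()` advances the shared position under a lock, so a line goes to whichever worker asked first and to no other. The order in which lines are processed is not preserved across workers.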
Currently, we are using transactional Kafka producers. What we have noticed is that the tracing aspect of Kafka is missing which means we don't get to see the instrumentation of Kafka producers thereby missing the b3 headers.
After going through the code, we found that the post processors are not invoked for transactional producers, which means the TracingProducer is never created by the TraceProducerPostProcessor. Is there a reason for that? Also, what is the workaround for enabling tracing for transactional producers? There does not seem to be a single place where a tracing producer can easily be created (DefaultKafkaProducerFactory#doCreateTxProducer is private).
Screen shot attached(DefaultKafkaProducerFactory class). In the screenshot you can see the post processors are invoked only for raw producer not for the case for transactional producer.
Your help will be much appreciated.
Thanks
DefaultKafkaProducerFactory#createRawProducer
??
createRawProducer() is called for both transactional and non-transactional producers:
Something else is going on.
EDIT
The problem is that Sleuth replaces the producer with a different one, but the factory discards that and uses the original.
https://github.com/spring-projects/spring-kafka/issues/1778
EDIT2
Actually, it's a good thing that we discard the tracing producer here; Sleuth also wraps the factory in a proxy and wraps the CloseSafeProducer in a TracingProducer; but I see the same result with both transactional and non-transactional producers...
@SpringBootApplication
public class So67194702Application {
    public static void main(String[] args) {
        SpringApplication.run(So67194702Application.class, args);
    }

    @Bean
    public ApplicationRunner runner(ProducerFactory<String, String> pf) {
        return args -> {
            Producer<String, String> prod = pf.createProducer();
            prod.close();
        };
    }
}
Putting a breakpoint on the close()...
Thanks, Gary Russell, for the very quick response. createRawProducer is indeed called for both transactional and non-transactional producers.
Sleuth uses the TraceProducerPostProcessor to wrap a Kafka producer in a TracingProducer. As the ProducerPostProcessor interface extends the Function interface, one would expect the result of the function to be used, but the createRawProducer method of DefaultKafkaProducerFactory applies the post processors without using the returned value. That causes the issue in this specific case.
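The effect of ignoring a Function's return value can be reproduced without Spring or Kafka. In this sketch `Sender`, `TracingSender` and both `create...` methods are hypothetical names chosen for illustration; they only mimic the shape of the post-processor loop described above.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical minimal "producer".
interface Sender {
    String send(String payload);
}

// Hypothetical decorator playing the role of a tracing wrapper.
final class TracingSender implements Sender {
    private final Sender delegate;

    TracingSender(Sender delegate) {
        this.delegate = delegate;
    }

    @Override
    public String send(String payload) {
        return "traced:" + delegate.send(payload);
    }
}

final class PostProcessorDemo {
    private PostProcessorDemo() {}

    // Applies the post processors but ignores what they return: wrappers are lost.
    static Sender createDiscardingResult(List<Function<Sender, Sender>> postProcessors) {
        Sender sender = payload -> payload;
        for (Function<Sender, Sender> pp : postProcessors) {
            pp.apply(sender);
        }
        return sender;
    }

    // Reassigns the result on each step: wrappers are kept.
    static Sender createAssigningResult(List<Function<Sender, Sender>> postProcessors) {
        Sender sender = payload -> payload;
        for (Function<Sender, Sender> pp : postProcessors) {
            sender = pp.apply(sender);
        }
        return sender;
    }
}
```

With a single post processor `TracingSender::new`, the first method returns a sender whose `send("x")` yields `x`, while the second yields `traced:x`.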
So, couldn't we modify the implementation of createRawProducer to assign the result of each post processor? If not, wouldn't it be better to have post processors extend Consumer instead of Function?
A successful test was made by overriding the createRawProducer method as follows:
@Override
protected Producer<K, V> createRawProducer(Map<String, Object> rawConfigs) {
    Producer<K, V> kafkaProducer = new KafkaProducer<>(rawConfigs, getKeySerializerSupplier().get(), getValueSerializerSupplier().get());
    for (ProducerPostProcessor<K, V> pp : getPostProcessors()) {
        kafkaProducer = pp.apply(kafkaProducer);
    }
    return kafkaProducer;
}
Thank you for your help.
I implemented in Java what I call a "foldable queue", i.e., a LinkedBlockingQueue used by an ExecutorService. The idea is that each task has a unique id; if a task with that id is already in the queue when another one is submitted with the same id, the new one is not added to the queue. The Java code looks like this:
public final class FoldablePricingQueue extends LinkedBlockingQueue<Runnable> {
    @Override
    public boolean offer(final Runnable runnable) {
        if (contains(runnable)) {
            return true; // rejected, but true not to throw an exception
        } else {
            return super.offer(runnable);
        }
    }
}
Threads have to be pre-started but this is a minor detail. I have an Abstract class that implements Runnable that takes a unique id... this is the one passed in
I would like to implement the same logic using Scala and Akka (Actors).
I would need to have access to the mailbox, and I think I would need to override the ! method and check the mailbox for the event. Has anyone done this before?
This is exactly how the Akka mailbox works. The Akka mailbox can only exist once in the task-queue.
Look at:
https://github.com/jboner/akka/blob/master/akka-actor/src/main/scala/akka/dispatch/Dispatcher.scala#L143
https://github.com/jboner/akka/blob/master/akka-actor/src/main/scala/akka/dispatch/Dispatcher.scala#L198
Very cheaply implemented using an atomic boolean, so no need to traverse the queue.
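The idea can be sketched in plain Java. This is not Akka's actual code: `Mailbox`, `enqueue` and the single `scheduled` flag are hypothetical simplifications, meant only to show how an atomic boolean keeps a mailbox from being submitted to the executor twice.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical mailbox: messages queue up, but the mailbox itself is scheduled at most once.
final class Mailbox<T> implements Runnable {
    private final Queue<T> messages = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);
    private final AtomicInteger submissions = new AtomicInteger();
    private final Executor executor;
    private final Consumer<T> handler;

    Mailbox(Executor executor, Consumer<T> handler) {
        this.executor = executor;
        this.handler = handler;
    }

    void enqueue(T message) {
        messages.add(message);
        trySchedule();
    }

    // Only the caller that flips the flag from false to true submits the mailbox.
    private void trySchedule() {
        if (scheduled.compareAndSet(false, true)) {
            submissions.incrementAndGet();
            executor.execute(this);
        }
    }

    @Override
    public void run() {
        T message;
        while ((message = messages.poll()) != null) {
            handler.accept(message);
        }
        scheduled.set(false);
        // A message may have arrived after the last poll; reschedule if so.
        if (!messages.isEmpty()) {
            trySchedule();
        }
    }

    // How many times the mailbox has been handed to the executor.
    int submissionCount() {
        return submissions.get();
    }
}
```

However many messages arrive while the mailbox is waiting to run, the executor's queue holds it only once, and no scan of that queue is needed to find out.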
Also, by the way, your Queue in Java is broken since it doesn't override put, add or offer(E, long, TimeUnit).
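One way to close those gaps is to route the other insertion methods through the same duplicate check. A minimal sketch, assuming tasks implement `equals` on their id; `FoldableQueue` is a hypothetical generic variant of the class from the question. The check-then-insert is still not atomic, so two threads inserting equal tasks at the same moment could both succeed.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical variant of the queue from the question with every insertion path guarded.
final class FoldableQueue<E> extends LinkedBlockingQueue<E> {
    private static final long serialVersionUID = 1L;

    @Override
    public boolean offer(E element) {
        // Report success for a duplicate so that callers do not treat it as a rejection.
        return contains(element) || super.offer(element);
    }

    @Override
    public boolean offer(E element, long timeout, TimeUnit unit) throws InterruptedException {
        return contains(element) || super.offer(element, timeout, unit);
    }

    @Override
    public void put(E element) throws InterruptedException {
        if (!contains(element)) {
            super.put(element);
        }
    }

    @Override
    public boolean add(E element) {
        // Made explicit so that add() visibly takes the same guarded path as offer().
        return offer(element);
    }
}
```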
Maybe you could do that with two actors: a facade and a worker. Clients send jobs to the facade. The facade forwards them to the worker and remembers them in its internal state, a Set queuedJobs. When it receives a job that is already queued, it just discards it. Each time the worker starts processing a job (or completes it, whichever suits you), it sends a StartingOn(job) message to the facade, which removes the job from queuedJobs.
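The bookkeeping the facade has to do is small. The sketch below models it as a plain Java class rather than an actor, so `JobFacade`, `submit` and `startingOn` are hypothetical names. In an actor the two methods would be message handlers, and `synchronized` would not be needed because an actor processes one message at a time.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical facade: forwards a job id to the worker unless that id is already queued.
final class JobFacade {
    private final Set<String> queuedJobs = new HashSet<>();
    private final Consumer<String> worker;

    JobFacade(Consumer<String> worker) {
        this.worker = worker;
    }

    // Returns true if the job was forwarded, false if it was folded into a queued one.
    synchronized boolean submit(String jobId) {
        if (!queuedJobs.add(jobId)) {
            return false;
        }
        worker.accept(jobId);
        return true;
    }

    // Called when the worker reports that it is starting on the job.
    synchronized void startingOn(String jobId) {
        queuedJobs.remove(jobId);
    }
}
```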
The proposed design doesn't make sense. The closest thing to a Runnable would be an Actor. Sure, you can keep them in a list, and not add them if they are already there. Such lists are kept by routing actors, which can be created from ready parts provided by Akka, or from a basic actor using the forward method.
You can't look into another actor's mailbox, and overriding ! makes no sense. What you do is you send all your messages to a routing actor, and that routing actor forwards them to a proper destination.
Naturally, since it receives these messages, it can do any logic at that point.