We are currently using transactional Kafka producers and have noticed that Kafka tracing is missing: the producers are not instrumented, so the b3 headers are never added.
After going through the code, we found that the post processors are not invoked for transactional producers, which means the TracingProducer is never created by the TraceProducerPostProcessor. Is there a reason for that? Also, what is the workaround for enabling tracing for transactional producers? There does not seem to be a single, easily accessible place to create a tracing producer (DefaultKafkaProducerFactory#doCreateTxProducer is private).
Screenshot attached (DefaultKafkaProducerFactory class). In the screenshot you can see that the post processors are invoked only for the raw producer, not for the transactional producer.
Your help will be much appreciated.
Thanks
DefaultKafkaProducerFactory#createRawProducer
??
createRawProducer() is called for both transactional and non-transactional producers:
Something else is going on.
EDIT
The problem is that Sleuth replaces the producer with a different one, but the factory discards that and uses the original.
https://github.com/spring-projects/spring-kafka/issues/1778
EDIT2
Actually, it's a good thing that we discard the tracing producer here; Sleuth also wraps the factory in a proxy and wraps the CloseSafeProducer in a TracingProducer; but I see the same result with both transactional and non-transactional producers...
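import org.apache.kafka.clients.producer.Producer;
import org.springframework.boot.ApplicationRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.kafka.core.ProducerFactory;

@SpringBootApplication
public class So67194702Application {

    public static void main(String[] args) {
        SpringApplication.run(So67194702Application.class, args);
    }

    @Bean
    public ApplicationRunner runner(ProducerFactory<String, String> pf) {
        return args -> {
            // Create a producer from the (possibly proxied) factory and close it right away
            Producer<String, String> prod = pf.createProducer();
            prod.close();
        };
    }
}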
Putting a breakpoint on the close()...
Thanks, Gary Russell, for the very quick response. createRawProducer is indeed called for both transactional and non-transactional producers.
Sleuth uses the TraceProducerPostProcessor to wrap a Kafka producer in a TracingProducer. Since the ProducerPostProcessor interface extends Function, one would expect the function's result to be used, but the createRawProducer method of DefaultKafkaProducerFactory applies the post processors without using the return value, which causes the issue in this specific case.
So, couldn't the implementation of createRawProducer be changed to assign the result of each post processor? If not, wouldn't it be better for post processors to extend Consumer instead of Function?
I tested this successfully by overriding the createRawProducer method as follows:
@Override
protected Producer<K, V> createRawProducer(Map<String, Object> rawConfigs) {
    Producer<K, V> kafkaProducer = new KafkaProducer<>(rawConfigs, getKeySerializerSupplier().get(), getValueSerializerSupplier().get());
    for (ProducerPostProcessor<K, V> pp : getPostProcessors()) {
        // Keep whatever the post processor returns instead of discarding it
        kafkaProducer = pp.apply(kafkaProducer);
    }
    return kafkaProducer;
}
Thank you for your help.
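To make the difference between the two loops concrete, here is a minimal stand-alone sketch. It does not use Kafka or Sleuth at all: FakeProducer, PlainProducer, TracingWrapper and PostProcessorDemo are hypothetical stand-ins invented for this illustration, and only java.util.function.Function is real. It shows what happens when the result of a Function-based post processor is ignored versus assigned; it does not claim which behaviour the factory should have.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for a Kafka Producer; NOT the real library type.
interface FakeProducer {
    String send(String payload);
}

class PlainProducer implements FakeProducer {
    @Override
    public String send(String payload) {
        return payload;
    }
}

// Hypothetical stand-in for a tracing decorator such as TracingProducer.
class TracingWrapper implements FakeProducer {
    private final FakeProducer delegate;

    TracingWrapper(FakeProducer delegate) {
        this.delegate = delegate;
    }

    @Override
    public String send(String payload) {
        // Pretend to add a trace header before delegating.
        return "[traced] " + delegate.send(payload);
    }
}

class PostProcessorDemo {
    // Applies each post processor but ignores what it returns.
    static FakeProducer applyDiscarding(FakeProducer producer,
            List<Function<FakeProducer, FakeProducer>> postProcessors) {
        for (Function<FakeProducer, FakeProducer> pp : postProcessors) {
            pp.apply(producer); // the wrapper is created and then thrown away
        }
        return producer;
    }

    // Applies each post processor and keeps the returned instance.
    static FakeProducer applyAssigning(FakeProducer producer,
            List<Function<FakeProducer, FakeProducer>> postProcessors) {
        FakeProducer current = producer;
        for (Function<FakeProducer, FakeProducer> pp : postProcessors) {
            current = pp.apply(current);
        }
        return current;
    }

    public static void main(String[] args) {
        Function<FakeProducer, FakeProducer> tracing = TracingWrapper::new;
        List<Function<FakeProducer, FakeProducer>> pps = Collections.singletonList(tracing);
        System.out.println(applyDiscarding(new PlainProducer(), pps).send("hello")); // prints "hello"
        System.out.println(applyAssigning(new PlainProducer(), pps).send("hello"));  // prints "[traced] hello"
    }
}
```

A post processor that only mutates the instance it is given works with either loop; one that returns a new wrapper only takes effect with the second.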
I need to convert org.apache.activemq.artemis.core.message.impl.CoreMessage to javax.jms.Message. How can I do this? Is there a utility method for this somewhere in the code, or does it need to be done manually?
I want to intercept the following events:
afterSend
afterDeliver
messageExpired
And then send the message to a direct endpoint Camel route which requires a javax.jms.Message instance.
My recommendation would be to simply copy the message and route the copy to the address of your choice, e.g.:
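import org.apache.activemq.artemis.api.core.ActiveMQException;
import org.apache.activemq.artemis.api.core.Message;
import org.apache.activemq.artemis.core.postoffice.RoutingStatus;
import org.apache.activemq.artemis.core.server.ActiveMQServer;
import org.apache.activemq.artemis.core.server.ServerSession;
import org.apache.activemq.artemis.core.server.plugin.ActiveMQServerMessagePlugin;
import org.apache.activemq.artemis.core.transaction.Transaction;

public class MyPlugin implements ActiveMQServerMessagePlugin {

    ActiveMQServer server;

    @Override
    public void registered(ActiveMQServer server) {
        // Keep a reference to the broker so the plugin can route messages itself
        this.server = server;
    }

    @Override
    public void afterSend(ServerSession session,
                          Transaction tx,
                          Message message,
                          boolean direct,
                          boolean noAutoCreateQueue,
                          RoutingStatus result) throws ActiveMQException {
        // Copy the message and route the copy to another address
        Message copy = message.copy();
        copy.setAddress("foo");
        try {
            server.getPostOffice().route(copy, false);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}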
Then a Camel consumer can pick up the message and do whatever it needs to with it. This approach has a few advantages:
It's simple. It would technically be possible to convert the org.apache.activemq.artemis.api.core.Message instance into a javax.jms.Message instance, but it's not going to be straight-forward. javax.jms.Message is a JMS client class. It's not used on the server anywhere so there is no existing facility to do any kind of conversion to/from it.
It's fast. If you use a javax.jms.Message you'd also have to use a JMS client to send it and that would mean creating and managing JMS resources like a javax.jms.Connection and a javax.jms.Session. This is not really something you want to be doing in a broker plugin as it will add a fair amount of latency. The method shown here uses the broker's own internal API to deal with the message. No client resources are necessary.
It's asynchronous. By sending the message and letting Camel pick it up later you don't have to wait on Camel at all which reduces the latency added by the plugin.
org.apache.activemq.artemis.jms.client.ActiveMQMessage
This looks like the implementation of javax.jms.Message with an underlying org.apache.activemq.artemis.api.core.client.ClientMessage which extends CoreMessage
I have an external dependency on another system in my streams app and would like to publish a message to DLQ kafka topic from within my streams app whenever a Deserialization/Producer/or any external/network exception happens, so that I can monitor that topic and reprocess records as needed. I can't seem to find a good example of doing this anywhere. The closest reference I found is https://docs.confluent.io/current/streams/faq.html#option-3-quarantine-corrupted-records-dead-letter-queue, but 1. It talks only about DeserializationExceptionHandler, what about other exception scenarios? 2. It doesn't demo the right way to configure/manage/close the associated KafkaProducer.
I would like to have try catch for the external dependency code and send the record(s) that cause exception to a dead letter queue topic. Any help will be appreciated!
For the processing logic you could take this approach:
someKStream
    // the processing logic
    .mapValues(inputValue -> {
        // each execution of the "return" below could yield a different class than the previous run,
        // e.g. "return isFailedProcessing ? failValue : successValue;"
        // where failValue and successValue have unrelated classes
        return someObject; // the class of someObject varies at runtime depending on your business logic
    }) // here you'll have KStream<whateverKeyClass, Object> -> yes, Object for the value!
    // you could use different logic for choosing
    // the target topic; below is just an example
    .to((k, v, recordContext) -> v instanceof failValueClass ?
            "dead-letter-topic" : "success-topic",
        // you could completely ignore the "Produced" part
        // and rely on spring-boot properties only, e.g.
        // spring.kafka.streams.properties.default.key.serde=yourKeySerde
        // spring.kafka.streams.properties.default.value.serde=org.springframework.kafka.support.serializer.JsonSerde
        Produced.with(yourKeySerde,
            // JsonSerde could be an instance configured as you need
            // (with type mappings or headers setting disabled, etc)
            new JsonSerde<>()));
Your classes, though different and landing into different topics, will serialize as expected.
If, instead of calling to(), you want to continue with further processing, you can use branch() and split the logic based on the class of the Kafka value. The trick with branch() is to have it return KStream<keyClass, ?>[] so that the items of each resulting stream can then be cast to the appropriate class.
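The try/catch around the external dependency that the question asks for can live inside the mapping step. Below is a minimal sketch of just that part, written against the JDK only so it runs without Kafka on the classpath. FailedRecord, DlqRoutingSketch, processSafely and topicFor are hypothetical names made up for this example; in a real topology the two methods would roughly be the bodies of the mapValues lambda and of the topic-choosing lambda passed to to(). Note that a try/catch here only covers exceptions thrown inside that lambda, not deserialization or producer errors.

```java
import java.util.function.Function;

// Hypothetical marker type for a record whose processing failed.
class FailedRecord {
    final String originalValue;
    final String error;

    FailedRecord(String originalValue, String error) {
        this.originalValue = originalValue;
        this.error = error;
    }
}

class DlqRoutingSketch {
    static final String DEAD_LETTER_TOPIC = "dead-letter-topic";
    static final String SUCCESS_TOPIC = "success-topic";

    // Wraps the external call; a runtime exception becomes a FailedRecord
    // that still carries the original input for later reprocessing.
    static Object processSafely(String input, Function<String, String> externalCall) {
        try {
            return externalCall.apply(input);
        } catch (RuntimeException e) {
            return new FailedRecord(input, String.valueOf(e.getMessage()));
        }
    }

    // Chooses the target topic from the runtime class of the value.
    static String topicFor(Object value) {
        return value instanceof FailedRecord ? DEAD_LETTER_TOPIC : SUCCESS_TOPIC;
    }

    public static void main(String[] args) {
        // Stand-in for the external system: rejects empty input.
        Function<String, String> flaky = s -> {
            if (s.isEmpty()) {
                throw new IllegalStateException("external system rejected empty input");
            }
            return s.toUpperCase();
        };
        System.out.println(topicFor(processSafely("ok", flaky))); // prints "success-topic"
        System.out.println(topicFor(processSafely("", flaky)));   // prints "dead-letter-topic"
    }
}
```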
As the Kafka documentation says:
The producer is thread safe and sharing a single producer instance
across threads will generally be faster than having multiple
instances.
So I have the following code and want to share a single KafkaProducer instance across all send requests. But where is the best place in the code to call close() on it? I can't call close() in the send method. How should I write the code to handle this?
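import java.util.Properties;
import java.util.concurrent.atomic.AtomicInteger;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.IntegerSerializer;
import org.apache.kafka.common.serialization.StringSerializer;

public class Producer {
    private final KafkaProducer<Integer, String> producer;
    private final String topic;
    // Running message key; atomic because send() may be called from several threads
    private final AtomicInteger messageNo = new AtomicInteger();

    public Producer(String topic, Boolean isAsync) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, KafkaProperties.KAFKA_SERVER_URL + ":" + KafkaProperties.KAFKA_SERVER_PORT);
        props.put(ProducerConfig.CLIENT_ID_CONFIG, "DemoProducer");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, IntegerSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        producer = new KafkaProducer<>(props);
        this.topic = topic;
    }

    public void send(String message) {
        producer.send(new ProducerRecord<>(topic, messageNo.incrementAndGet(), message));
    }
}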
You can create the producer first and pass (inject) it to your web resource or resources.
Then you can use a shutdown hook to close the producer with its reference.
Even better if you can use life cycle stop hooks in your framework.
Example in dropwizard:
https://www.dropwizard.io/en/latest/manual/core.html?highlight=managed#managed-objects
The Kafka producer implements the AutoCloseable interface, so you can declare it in a try-with-resources block, which takes care of releasing the resources when your code leaves the scope of the block.
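Both suggestions (try-with-resources and a shutdown hook) come down to giving the object that owns the producer a close() method and deciding who calls it. Here is a small sketch of that idea using only the JDK. FakeKafkaProducer is a hypothetical stand-in that records calls instead of talking to a broker, and MessageSender and ProducerLifecycleDemo are names invented for this example; with the real client you would hold a KafkaProducer instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for KafkaProducer; records calls instead of doing network I/O.
// Unlike the real producer, this fake is not thread safe.
class FakeKafkaProducer implements AutoCloseable {
    final List<String> sent = new ArrayList<>();
    boolean closed = false;

    void send(String message) {
        if (closed) {
            throw new IllegalStateException("producer already closed");
        }
        sent.add(message);
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Owns one producer for its whole lifetime and exposes close() to its caller.
class MessageSender implements AutoCloseable {
    private final FakeKafkaProducer producer;

    MessageSender(FakeKafkaProducer producer) {
        this.producer = producer;
    }

    void send(String message) {
        producer.send(message);
    }

    @Override
    public void close() {
        producer.close();
    }
}

class ProducerLifecycleDemo {
    public static void main(String[] args) {
        FakeKafkaProducer producer = new FakeKafkaProducer();
        // Option A: scope the sender to a block that spans all sends.
        try (MessageSender sender = new MessageSender(producer)) {
            sender.send("first");
            sender.send("second");
        }
        System.out.println(producer.sent + " closed=" + producer.closed); // prints "[first, second] closed=true"

        // Option B: for a long-lived service, close when the JVM shuts down instead.
        MessageSender longLived = new MessageSender(new FakeKafkaProducer());
        Runtime.getRuntime().addShutdownHook(new Thread(longLived::close));
    }
}
```

Option A only fits when all sends happen inside one block; a web application that sends on every request is closer to Option B or to a framework lifecycle hook.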
Do you ever need to change producer parameters at runtime? For example when changing the broker urls or during tests?
If you do and you have a singleton producer, make sure to provide a hook to close and recreate the producer with the new parameters.
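One possible shape for such a hook is a small holder that builds a new instance from the new parameters, swaps it in, and closes the old one. This is only a sketch under that assumption: ReconfigurableHolder and FakeBrokerClient are hypothetical names, and the fake client stands in for a real producer so the example runs with the JDK alone.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

// Hypothetical stand-in for a producer bound to one broker URL.
class FakeBrokerClient implements AutoCloseable {
    final String brokerUrl;
    volatile boolean closed = false;

    FakeBrokerClient(String brokerUrl) {
        this.brokerUrl = brokerUrl;
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Holds the current resource and lets callers replace it at runtime.
class ReconfigurableHolder<C, R extends AutoCloseable> {
    private final Function<C, R> factory;
    private final AtomicReference<R> current;

    ReconfigurableHolder(C initialConfig, Function<C, R> factory) {
        this.factory = factory;
        this.current = new AtomicReference<>(factory.apply(initialConfig));
    }

    R get() {
        return current.get();
    }

    // Swaps in a resource built from the new config, then closes the old one.
    void reconfigure(C newConfig) {
        R old = current.getAndSet(factory.apply(newConfig));
        try {
            old.close();
        } catch (Exception e) {
            throw new IllegalStateException("failed to close previous instance", e);
        }
    }
}

class ReconfigureDemo {
    public static void main(String[] args) {
        ReconfigurableHolder<String, FakeBrokerClient> holder =
                new ReconfigurableHolder<>("broker-a:9092", FakeBrokerClient::new);
        FakeBrokerClient first = holder.get();
        holder.reconfigure("broker-b:9092");
        System.out.println(first.closed + " " + holder.get().brokerUrl); // prints "true broker-b:9092"
    }
}
```

Threads that fetched the old instance just before the swap may still try to use it after it is closed, so callers should call get() for each send rather than cache the result.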
What class/method in Kafka Streams can we use to serialize/deserialize Java object to byte array OR vice versa? The following link proposes the usage of ByteArrayOutputStream & ObjectOutputStream but they are not thread safe.
Send Custom Java Objects to Kafka Topic
Another option is to use ObjectMapper or ObjectReader (which is thread safe), but that converts POJO -> JSON -> byte array, which seems expensive. I wanted to check whether there is a direct, thread-safe way to translate an object into a byte array and back. Please suggest.
import java.util.Map;

import org.apache.kafka.common.serialization.Serializer;

public class HouseSerializer<T> implements Serializer<T> {

    private Class<T> tClass;

    public HouseSerializer() {
    }

    @SuppressWarnings("unchecked")
    @Override
    public void configure(Map<String, ?> configs, boolean isKey) {
        // The target class is passed in through the serializer configuration
        tClass = (Class<T>) configs.get("POJOClass");
    }

    @Override
    public void close() {
    }

    @Override
    public byte[] serialize(String topic, T data) {
        // Object serialization to be performed here
        return null;
    }
}
Note: Kafka version - 0.10.1
Wanted to check if there is a direct way to translate object into bytearray
I would suggest you look at using Avro serialization with the Confluent Schema Registry if possible, though it is not required. JSON is a good fallback but takes more space "on the wire", so MsgPack would be the alternative there.
See Avro code example here
Above example is using the avro-maven-plugin to generate a LogLine class from the src/main/resources/avro schema file.
Otherwise, it's up to you how to serialize your object into a byte array. For example, a String is commonly packed as
[(length of string) (UTF8 encoded bytes)]
while a boolean is typically written as a single byte holding 0 or 1.
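As an illustration of that hand-rolled layout, here is a minimal sketch using java.nio.ByteBuffer. The House class and its two fields are assumptions made up for this example (the question does not show the POJO), and the 4-byte length prefix is one common choice, not a Kafka requirement. The methods keep no shared state, so calling them from several threads is safe.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical POJO used only for illustration.
class House {
    final String address;
    final boolean occupied;

    House(String address, boolean occupied) {
        this.address = address;
        this.occupied = occupied;
    }
}

class HouseCodec {
    // Layout: [4-byte length][UTF-8 bytes of address][1 byte: occupied as 0 or 1]
    static byte[] serialize(House house) {
        byte[] address = house.address.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buffer = ByteBuffer.allocate(Integer.BYTES + address.length + 1);
        buffer.putInt(address.length);
        buffer.put(address);
        buffer.put((byte) (house.occupied ? 1 : 0));
        return buffer.array();
    }

    // Reads the fields back in the same order they were written.
    static House deserialize(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        byte[] address = new byte[buffer.getInt()];
        buffer.get(address);
        boolean occupied = buffer.get() == 1;
        return new House(new String(address, StandardCharsets.UTF_8), occupied);
    }

    public static void main(String[] args) {
        byte[] bytes = serialize(new House("1 Main St", true));
        House copy = deserialize(bytes);
        System.out.println(bytes.length + " " + copy.address + " " + copy.occupied); // prints "14 1 Main St true"
    }
}
```

Inside a Kafka Serializer, the body of serialize(topic, data) could delegate to a method like HouseCodec.serialize.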
which is threadsafe
I understand the concern, but you don't commonly share deserialized data between threads. Each message is sent, read, and processed independently of the others.
We have multiple JBoss server instances in a clustered environment. For background tasks there is a global queue that manages all jobs registered with it. On each node a simple listener (MDB) handles the incoming messages for this queue. The listener does a manual lookup (no injection) of a singleton bean and invokes a predefined method.
Everything works fine so far, but the method in the singleton bean uses other (non-singleton) services that are not available under some circumstances.
For example, if a node is restarted while unprocessed messages are left in the queue, the listener picks up the messages while all further beans are still null, so the job fails with an NPE.
Is it possible to define a delay in the JMS listener before messages are picked up, or to hook into an "application completely deployed" event there? The DependsOn annotation does not work because non-singletons are involved.
One possibility would be to set the MDB property "DeliveryActive" to false and start the bean after full deployment. Is there a simple, working way to do this programmatically (not in the jmx-console)? Every manual I found for this points me to a manual JNDI lookup. I think it has to be possible to inject the bean via annotation and call startDelivery()? Is there a good place in the application to do this?
Another hint points to the initialize-in-order property in application.xml, because the problem might be related to the JBoss deployment order (some EJBs become available later than the listener), but there seems to be a bug in JBoss 6.0, and upgrading to 6.1 is not an option. Maybe there is a walkthrough for this?
I hope the problem is explained well enough; otherwise please ask for further information.
Thanks in advance,
Danny
Additional information:
JBoss 6.0.0 Final
HornetQ 2.2.5 Final (already updated, because of the buggy default version of JBoss)
The Listener:
@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue"),
    @ActivationConfigProperty(propertyName = "destination", propertyValue = "/queue/SchedulerQueue")
})
public class SchedulerQueueListener implements MessageListener {
    ...
    @Override
    public void onMessage(Message message) {
        ...
        service = (IScheduledWorkerService) new InitialContext().lookup(jndiName);
        EJobResult eJobResult = service.executeJob(message);
        ...
    }
}
A sample worker:
@Singleton
@LocalBinding(jndiBinding = SampleJobWorkerService.JNDI_NAME)
public class SampleJobWorkerService implements IScheduledWorkerService {
    ...
    @EJB(name = "SampleEJB/local")
    private ISampleEJB sampleEjb;
    ...
    @Override
    public EJobResult executeJob(Message message) {
        int state = sampleEjb.doSomething(message.getLongProperty(A_PROPERTY));
    }
}
In this case the sampleEjb member is sometimes null.
As a workaround, instead of calling EJBs directly from the MDB, you can create a timer with a delayed timeout, so execution is postponed.
In the timer's timeout method you can then call the singleton EJB, which in turn calls the other non-singleton EJBs.
JBoss specific: you can try setting this property on the message object before sending.
msg.setLongProperty("JMS_JBOSS_SCHEDULED_DELIVERY", (current + delay));
Another alternative is _JBM_SCHED_DELIVERY.
Edit:
For the first part, you can use a JTA transaction, which can span both JMS and EJB, so failover and similar cases can be handled accordingly.
You can also increase the redelivery delay for messages on the queue.
<address-setting match="jms.queue.someQueue">
<redelivery-delay>5000</redelivery-delay>
</address-setting>
I am having the same trouble at the moment.
I suggest using the EJB 3.1 @Startup annotation on your singleton bean to invoke the startDelivery method on your message listeners.