I'm creating a simple Apache Flink job (in Scala) that only tries to print the case class representing the events received from a RabbitMQ queue (RMQSource).
I've created my own deserialization schema (using Jackson) and it works fine when the consumed message is actually JSON representing the case class. But the job fails and keeps restarting if the queue receives a badly formatted event (we can call it a "poisoned message", I guess). I have to purge the queue before the job state changes back to 'running'.
QUESTION:
How can I prevent the job from failing when it receives a poisoned message? Can I validate the message before consuming it? If I set up a dead-letter exchange in RabbitMQ, where should I do (if possible) the negative acknowledgements on behalf of Flink as the consumer? Is there a better way of handling this that keeps the job running and consuming the next well-formatted messages?
My custom DeserializationSchema provided to RMQSource[Test]:
class eventSerializationSchema extends DeserializationSchema[Test] {
  import eventSerializationSchema.objectMapper

  @throws(classOf[IOException])
  def deserialize(message: Array[Byte]): Test = objectMapper.readValue(message, classOf[Test])

  def isEndOfStream(nextElement: Test): Boolean = false

  def getProducedType: TypeInformation[Test] = createTypeInformation[Test]
}

object eventSerializationSchema {
  val objectMapper: ObjectMapper = new ObjectMapper().registerModule(DefaultScalaModule)
}
Error obtained when a poisoned message arrives at the consumed queue:
com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'a': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')
at [Source: (byte[])"a"; line: 1, column: 2]
at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:2337)
at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:720)
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._reportInvalidToken(UTF8StreamJsonParser.java:3593)
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._handleUnexpectedValue(UTF8StreamJsonParser.java:2688)
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._nextTokenNotInObject(UTF8StreamJsonParser.java:870)
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken(UTF8StreamJsonParser.java:762)
at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:4684)
at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:4586)
at com.fasterxml.jackson.databind.ObjectMapper.readValue(ObjectMapper.java:3609)
at org.angoglez.deserializers.eventSerializationSchema.deserialize(eventSerializationSchema.scala:17)
at org.angoglez.deserializers.eventSerializationSchema.deserialize(eventSerializationSchema.scala:14)
at org.apache.flink.streaming.connectors.rabbitmq.RMQDeserializationSchemaWrapper.deserialize(RMQDeserializationSchemaWrapper.java:47)
at org.apache.flink.streaming.connectors.rabbitmq.RMQSource.processMessage(RMQSource.java:319)
at org.apache.flink.streaming.connectors.rabbitmq.RMQSource.run(RMQSource.java:331)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:116)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:73)
at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:323)
From my point of view, you have a few options:
Catch exceptions thrown inside the deserialize method and drop the poison records (a sketch of this follows below).
Catch exceptions thrown inside the deserialize method and somehow encode what you want to know about these poison records into the object being produced. Then, in a downstream process function, filter out these poison records and send them to a side output.
Don't apply the ObjectMapper in the deserializer, and instead do the real deserialization in a chained process function that can send the poison records directly to a side output.
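A minimal sketch of the first option (hedged: it reuses the question's Test case class and Jackson mapper, but wraps the result in an Option so that malformed payloads can be dropped downstream instead of failing the source; names other than Test are illustrative):
import java.io.IOException

import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import org.apache.flink.api.common.serialization.DeserializationSchema
import org.apache.flink.api.common.typeinfo.TypeInformation
import org.apache.flink.streaming.api.scala._

// Emits Some(event) for well-formed JSON and None for poisoned messages, so the source
// itself never throws and the job keeps running.
class TolerantEventDeserializationSchema extends DeserializationSchema[Option[Test]] {
  @transient private lazy val objectMapper: ObjectMapper =
    new ObjectMapper().registerModule(DefaultScalaModule)

  @throws(classOf[IOException])
  override def deserialize(message: Array[Byte]): Option[Test] =
    try Some(objectMapper.readValue(message, classOf[Test]))
    catch {
      case _: IOException => None // poisoned message: log or count it here instead of failing the job
    }

  override def isEndOfStream(nextElement: Option[Test]): Boolean = false

  override def getProducedType: TypeInformation[Option[Test]] = createTypeInformation[Option[Test]]
}

// Downstream, keep only the well-formed events:
// env.addSource(new RMQSource(connectionConfig, "my-queue", true, new TolerantEventDeserializationSchema))
//   .flatMap(_.toList)
//   .print()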
I'm looking to respond from a REST endpoint with a success/failure response while dynamically accepting a topic as a query param. In Quarkus with SmallRye Reactive Messaging, the code would look something like the snippet below, wrapping the payload with OutgoingKafkaRecordMetadata:
i.e. https://myendpoint/publishToKafka?topic=myDynamicTopic
#Channel("test")
Emitter<byte []> kafkaEmitter;
#POST
#Path("/publishToKafka")
public CompletionStage<Void> publishRecord(#QueryParam("topic") String topic, byte [] payload){
kafkaEmitter.send(Message.of(payload).addMetadata(OutgoingKafkaRecordMetadata.<String>builder()
.withKey("my-key")
.withTopic("myDynamicTopic")
.build()));
}
From the Quarkus docs: "If the endpoint does not return a CompletionStage, the HTTP response may be written before the message is sent to Kafka, and so failures won’t be reported to the user." The example there describes this process when you send a payload directly (i.e. emitter.send(payload), which returns a CompletionStage, whereas emitter.send(message) returns void), but this requires configuring the topic in advance. Is it possible to specify metadata with a Message and still respond to the calling client with a success/failure response? (I don't mind if it's with Emitter and CompletionStage or MutinyEmitter and Uni.)
Any advice or suggestions would be appreciated.
Because you use a Message (as you need to specify the topic), you need something a bit more convoluted:
#Channel("test")
Emitter<byte []> kafkaEmitter;
#POST
#Path("/publishToKafka")
public CompletionStage<Void> publishRecord(#QueryParam("topic") String topic, byte [] payload){
CompletableFuture<Void> future = new CompletableFuture<>();
Message<byte[]> message = Message.of(payload).addMetadata(OutgoingKafkaRecordMetadata.
<String>builder()
.withKey("my-key")
.withTopic("myDynamicTopic")
.build()));
message = message.withAck(() -> {
future.complete(null));
return CompleteableFuture.completedFuture(null);
}
.withNack(t -> {
future.completeExceptionnaly(t));
return CompleteableFuture.completedFuture(null);
});
kafkaEmitter.send(message);
return future;
}
In this snippet, I also attach the ack and nack handlers called when the message is either acknowledged (accepted by the broker) or rejected (something wrong happened).
These callbacks report to future, a CompletableFuture created in the method. This is the object to return, as it will do what you want: indicate the outcome.
I know the callbacks are slightly complicated. This is mainly due to the spec: we have to return CompletableFuture.completedFuture(...) to acknowledge that the nack-process was successful. If we were to return future instead (which we have completed with future.completeExceptionally(t)), this would be interpreted as a failure during the nack-process. That would basically be the equivalent of a throw inside a catch block in the imperative world.
Fortunately, an easier version will be available soonish (no worries, we won't break).
I'm using Spring Cloud Stream with Kafka Streams. Let's say I have a processor which is a Function that converts a KStream of Strings to a KStream of CityProgrammes. It invokes an API to find the city by name, and another transformation finds any events near that city.
Now the problem is that if any error happens during the transformation, the whole application stops. I want to send that one particular message to a DLQ and move along. I've been reading for days and everyone suggests handling errors within the called services, but that is nonsense in my opinion; plus I still need to return a KStream: how do I do that within a catch?
I also looked at the UncaughtExceptionHandler, but it is not aware of the message and is only able to restart the processing, which won't skip the invalid message.
This might sound like an A-B problem, so the question rephrased: how do I keep the flow going in a KStream when an exception occurs, and send the invalid item to the DLQ?
When it comes to the application-level errors you have, it is up to the application itself how the error is handled. Kafka Streams and the Spring Cloud Stream binder mainly support deserialization and serialization errors at the framework level. Although that is the case, I think your scenario can be handled. If you are using a Kafka client prior to 2.8, here is an SO answer I gave before on something similar: https://stackoverflow.com/a/66749750/2070861
If you are using Kafka/Streams 2.8, here is an idea that you can use. However, the code below should only be used as a starting point; adjust it according to your use case. Read more on how branching works in Kafka Streams 2.8; the branching API was significantly refactored in 2.8 compared to prior versions.
public Function<KStream<?, String>, KStream<?, Foo>> convert() {
    Foo[] foo = new Foo[1]; // single-element holder so the lambdas below can share the converted value
    return input -> {
        final Map<String, ? extends KStream<?, String>> branches =
                input.split(Named.as("foo-")).branch((key, value) -> {
                    try {
                        foo[0] = new Foo(); // your API call for the CityProgramme conversion here, possibly
                        return true;
                    }
                    catch (Exception e) {
                        Message<?> message = MessageBuilder.withPayload(value).build();
                        streamBridge.send("to-my-dlt", message);
                        return false;
                    }
                }, Branched.as("bar"))
                .defaultBranch();

        final KStream<?, String> kStream = branches.get("foo-bar");
        return kStream.map((key, value) -> new KeyValue<>("", foo[0]));
    };
}
The default branch is ignored in this code because it only contains the records that threw exceptions; those were handled by the catch block above, in which we send the records to a DLT programmatically. Finally, we take the good records, map them to a new KStream, and send them through the outbound.
I need to catch the exceptions in case of an async send to Kafka. The Kafka producer API comes with the function send(ProducerRecord record, Callback callback). But when I tested this against the following two scenarios:
Kafka broker down
Topic not pre-created
the callbacks are not getting called. Rather, I am getting a warning in the logs for the unsuccessful send (as shown below).
Questions:
So are the callbacks called only for specific exceptions?
When does the Kafka client try to connect to the Kafka broker during an async send: on every batch send, or periodically?
Kafka Warning Image
Note: I am also using a linger.ms setting of 25 seconds to batch-send my records.
public class ProducerDemo {
    static KafkaProducer<String, String> producer;

    public static void main(String[] args) throws IOException {
        final Logger logger = LoggerFactory.getLogger(ProducerDemo.class);

        Properties properties = new Properties();
        properties.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092");
        properties.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        properties.setProperty(ProducerConfig.ACKS_CONFIG, "1");
        properties.setProperty(ProducerConfig.LINGER_MS_CONFIG, "30000");
        producer = new KafkaProducer<String, String>(properties);

        String topic = "first_topic";
        for (int i = 0; i < 5; i++) {
            String value = "hello world " + Integer.toString(i);
            String key = "id_" + Integer.toString(i);
            ProducerRecord<String, String> record = new ProducerRecord<String, String>(topic, key, value);
            producer.send(record, new Callback() {
                public void onCompletion(RecordMetadata recordMetadata, Exception e) {
                    // executed every time a record is successfully sent or an exception is thrown
                    if (e == null) {
                        // No exception
                    } else {
                        // Exception handling
                    }
                }
            });
        }
        producer.close();
    }
}
You will get those warnings for a non-existing topic as a resilience mechanism provided with KafkaProducer. If you wait a bit longer (60 seconds by default), the callback will be called eventually. Here's a snippet along those lines:
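(A minimal sketch in Scala against the same kafka-clients API, not the original snippet; the broker address and topic name are placeholders.) After the metadata warnings, onCompletion does fire, either with metadata once the topic becomes available or with an exception once the wait runs out:
import java.util.Properties
import org.apache.kafka.clients.producer.{Callback, KafkaProducer, ProducerConfig, ProducerRecord, RecordMetadata}
import org.apache.kafka.common.serialization.StringSerializer

object EventualCallbackDemo extends App {
  val props = new Properties()
  props.setProperty(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "127.0.0.1:9092")
  props.setProperty(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)
  props.setProperty(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, classOf[StringSerializer].getName)

  val producer = new KafkaProducer[String, String](props)
  val record = new ProducerRecord[String, String]("topic_that_does_not_exist_yet", "key", "value")

  producer.send(record, new Callback {
    override def onCompletion(metadata: RecordMetadata, exception: Exception): Unit =
      if (exception == null) println(s"eventually acknowledged: $metadata")
      else println(s"eventually failed: $exception") // e.g. a timeout once the metadata wait runs out
  })

  producer.close() // flushes outstanding records, so the callback above has run by the time we exit
}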
So, when something goes wrong and the async send is not successful, it will eventually fail with a failed future and/or a callback with an exception.
If you are not running it transactionally, it can still mean that some messages from the batch have found their way to the broker, while others haven't.
It will most certainly be a problem if you need a blocking-style acknowledgement to the upstream system (like an HTTP ingestion interface, etc.) for every message that is sent to Kafka. The only way to do that is to block on every message with the future's get, as described in the documentation:
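A minimal sketch of that blocking style (again in Scala against the plain producer API; the helper name and types are illustrative, not part of the Kafka API):
import java.util.concurrent.ExecutionException
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord, RecordMetadata}

object BlockingSend {
  // Blocking per message: .get() waits until the broker has acknowledged the record (or the
  // send has failed), so the caller can answer its own upstream request with the real outcome.
  def sendBlocking(producer: KafkaProducer[String, String],
                   record: ProducerRecord[String, String]): RecordMetadata =
    try producer.send(record).get() // java.util.concurrent.Future: blocks until acked or failed
    catch {
      case e: ExecutionException => throw e.getCause // surface the actual Kafka failure to the caller
    }
}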
In general, I've noticed a lot of questions related to KafkaProducer delivery semantics and guarantees. It can definitely be documented better.
One more thing, since you mentioned linger.ms:
Note that records that arrive close together in time will generally batch together even with linger.ms=0, so under heavy load batching will occur regardless of the linger configuration.
For the first question, here is the answer.
As per the Apache Kafka documentation, you can capture the exceptions listed below using the onCompletion method when implementing the Callback interface:
https://kafka.apache.org/25/javadoc/org/apache/kafka/clients/producer/Callback.html
For the second question, the combination of the properties below controls when to send the records, and as far as I understand it's the same for a synchronous or asynchronous call:
linger.ms
max.block.ms
https://kafka.apache.org/documentation/#linger.ms
So are the callbacks called only for specific exceptions ?
Yes, that's how it works. From the documentation (2.5.0):
* Fully non-blocking usage can make use of the {@link Callback} parameter to provide a callback that
* will be invoked when the request is complete.
Notice the important part: when the request is complete, which means that the producer must have accepted the record and sent the ProduceRequest to the Kafka broker. Without digging too deep into the internals, this means that broker metadata must be present and the partition must exist.
When it comes to a formal specification, you'd need to take a good look at send()'s Javadoc and possibly at KafkaProducer's implementation of the doSend method. There you're going to see that multiple exceptions can be thrown in the submitting call (instead of returning a future and invoking the callback; a sketch follows the list), e.g.:
if the broker metadata is not available within the given timeout,
if the data could not be serialized,
if the serialized form was too large, etc.
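A hedged sketch of guarding the submitting call itself (Scala, against the same client API; the object and method names are illustrative):
import org.apache.kafka.clients.producer.{Callback, KafkaProducer, ProducerRecord}
import org.apache.kafka.common.KafkaException
import org.apache.kafka.common.errors.SerializationException

object GuardedSend {
  // Exceptions raised synchronously by send() never reach the callback, so they need their
  // own handling around the call itself.
  def send(producer: KafkaProducer[String, String],
           record: ProducerRecord[String, String],
           callback: Callback): Unit =
    try producer.send(record, callback)
    catch {
      case e: SerializationException =>
        println(s"record could not be serialized: $e")
      case e: KafkaException =>
        println(s"send() failed before the record was handed over for sending: $e")
    }
}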
I'm unable to perform a check when using jsonPath (or any other matcher).
When using message I'm able to save the whole JSON message in the session:
.check(wsAwait
.within(6 seconds)
.until(1)
.message.exists
//.jsonpJsonPath("$.data").exists
.saveAs("CID"))
And later in the scenario I'm able to print the whole message:
{"event":"ConversationCreated",
"data":"{"conversationId":"0e21f93d-6b0c-441f-a01d-8b0aa4e14769",
"customerInfo":null,"deviceInfo":null}"}
But when using the jsonPath matcher, my check times out:
.check(wsAwait
.within(6 seconds)
.until(1)
// .message.exists
.jsonpJsonPath("$.data").exists
.saveAs("CID"))
When run, it produces:
...
12:09:32.248 [ERROR] i.g.c.a.b.SessionHookBuilder$$anon$1 - 'hook-2' crashed with 'java.util.NoSuchElementException: key not found: CID', forwarding to the next one
...
---- Errors --------------------------------------------------------------------
> Check failed: Timeout 1 (100,0%)
I've managed to make it "work".
The issue was a lack of escaping of the JSON inside the JSON.
When you send some data through SSE, your clients receive both the data and the event type, and Gatling provides it as a JSON object. If the data you send is itself JSON, Gatling will just pass it along as a plain string, without any modification, which creates improper JSON.
Actual:
{"event":"ConversationCreated",
"data":"{"conversationId":"0e21f93d-6b0c-441f-a01d-8b0aa4e19",
"customerInfo":null,"deviceInfo":null}"}
Proper:
{"event":"ConversationCreated",
"data":"{\"conversationI\d":\"0e21f93d-6b0c-441f-a01d-8b0aa4e19\",
\"customerInfo\":null,\"deviceInfo\":null}\"}
Or, rather than keeping the message as JSON inside JSON, it could be kept as an object:
case class Message(event: String, data: String)
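On the producing side, one way to get that escaping for free is to serialize the whole object with a JSON library; a hedged sketch with Jackson and its Scala module (any mapper that serializes the complete object would escape the nested quotes the same way):
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule

case class Message(event: String, data: String)

object SsePayload extends App {
  private val mapper = new ObjectMapper().registerModule(DefaultScalaModule)

  // `data` is itself a JSON document, kept as a plain String field of the case class.
  val inner = """{"conversationId":"0e21f93d-6b0c-441f-a01d-8b0aa4e19","customerInfo":null,"deviceInfo":null}"""

  // Serializing the whole object escapes the inner quotes, producing valid JSON that a
  // $.data check can resolve:
  // {"event":"ConversationCreated","data":"{\"conversationId\":...,\"deviceInfo\":null}"}
  println(mapper.writeValueAsString(Message("ConversationCreated", inner)))
}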
A project I'm working on requires the reading of messages from SQS, and I decided to use Akka to distribute the processing of these messages.
Since SQS is supported by Camel, and there is built-in functionality for using it from Akka via the Consumer class, I imagined it would be best to implement the endpoint and read messages this way, though I had not seen many examples of people doing so.
My problem is that I cannot poll my queue quickly enough to keep it empty, or near empty. What I originally thought was that I could get a Consumer to receive messages over Camel from SQS at a rate of X/s. From there, I could simply create more Consumers to get up to the rate at which I needed messages processed.
My Consumer:
import akka.camel.{CamelMessage, Consumer}
import akka.actor.{ActorRef, ActorPath}

class MyConsumer() extends Consumer {
  def endpointUri = "aws-sqs://my_queue?delay=1&maxMessagesPerPoll=10&accessKey=myKey&secretKey=RAW(mySecret)"

  var count = 0

  def receive = {
    case msg: CamelMessage => {
      count += 1
    }
    case _ => {
      println("Got something else")
    }
  }

  override def postStop() {
    println("Count for actor: " + count)
  }
}
As shown, I've set delay=1 as well as maxMessagesPerPoll=10 to improve the rate of messages, but I'm unable to spawn multiple consumers with the same endpoint.
I read in the docs that "By default endpoints are assumed not to support multiple consumers", and I believe this holds true for SQS endpoints as well: spawning multiple consumers gives me only one active consumer, where after running the system for a minute one actor outputs Count for actor: x and the others output Count for actor: 0.
If this is at all useful: I'm able to read approximately 33 messages/second with this current implementation on the single consumer.
Is this the proper way to read messages from an SQS queue in Akka? If so, is there a way I can scale this outward so that I can bring my rate of message consumption closer to 900 messages/second?
Sadly, Camel does not currently support parallel consumption of messages on SQS:
http://camel.465427.n5.nabble.com/Amazon-SQS-listener-as-multi-threaded-td5741541.html
To address this I've written my own actor to poll batches of messages from SQS using the aws-java-sdk.
def receive = {
  case BeginPolling => {
    // re-queue sending asynchronously
    self ! BeginPolling

    // traverse the response
    val deleteMessageList = new ArrayList[DeleteMessageBatchRequestEntry]
    val messages = sqs.receiveMessage(receiveMessageRequest).getMessages
    messages.toList.foreach {
      node => {
        deleteMessageList.add(new DeleteMessageBatchRequestEntry(node.getMessageId, node.getReceiptHandle))
        //log.info("Node body: {}", node.getBody)
        filterSupervisor ! node.getBody
      }
    }
    if (deleteMessageList.size() > 0) {
      val deleteMessageBatchRequest = new DeleteMessageBatchRequest(queueName, deleteMessageList)
      sqs.deleteMessageBatch(deleteMessageBatchRequest)
    }
  }
  case _ => {
    log.warning("Unknown message")
  }
}
Though I'm not certain this is the best implementation, and it could of course be improved so that requests are not constantly hitting an empty queue (see the long-polling sketch below), it does suit my current need of being able to poll messages from the same queue.
Getting about 133 messages/second per actor from SQS with this.
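One such improvement (a hedged sketch; the queue URL and the rest of the client setup are placeholders) is to enable SQS long polling, so that an empty queue doesn't turn into a tight request loop:
import com.amazonaws.services.sqs.model.ReceiveMessageRequest

object LongPollingConfig {
  val queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/my_queue" // placeholder

  // With WaitTimeSeconds > 0, the ReceiveMessage call blocks server-side for up to that long
  // when the queue is empty, instead of returning immediately and being re-issued at once.
  val receiveMessageRequest: ReceiveMessageRequest = new ReceiveMessageRequest(queueUrl)
    .withMaxNumberOfMessages(10) // SQS returns at most 10 messages per call
    .withWaitTimeSeconds(20)     // 20 seconds is the maximum long-poll wait SQS allows
}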
Camel 2.15 supports concurrentConsumers, though I'm not sure how useful this is: I don't know whether akka-camel supports Camel 2.15, and I don't know whether having a single Consumer actor makes a difference even if there are multiple consumers.
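If akka-camel does resolve to Camel 2.15+ on your classpath, the option should simply be another query parameter on the endpoint URI; a hedged sketch based on the consumer from the question:
import akka.camel.{CamelMessage, Consumer}

// Sketch only: concurrentConsumers is a Camel 2.15+ option of the aws-sqs component, so
// whether it has any effect depends on the Camel version akka-camel pulls in.
class MyConcurrentConsumer extends Consumer {
  def endpointUri =
    "aws-sqs://my_queue?concurrentConsumers=10&delay=1&maxMessagesPerPoll=10" +
      "&accessKey=myKey&secretKey=RAW(mySecret)"

  def receive = {
    case msg: CamelMessage => // handle the message body here
    case _                 => println("Got something else")
  }
}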