I am using the Flink streaming API in my application, with Kafka as the streaming source. My Kafka producer publishes data in ascending order of time to different Kafka partitions, and the consumer reads data from these partitions. However, some Kafka partitions may be slow due to some operation and produce late results. Is there any way to maintain order in this stream even though the data arrives out of order?
I have tried BoundedOutOfOrdernessTimestampExtractor, but it did not serve the purpose. While digging into this problem I came across your documentation (URL: https://cwiki.apache.org/confluence/display/FLINK/Time+and+Order+in+Streams) and tried to implement it, but it did not work either. I also tried the Table API order by, but it seems orderBy is not supported in Flink 1.5.
Please suggest a workaround for this. I am using the custom watermark generator below with parallelism 4.
DataStream<Document> streamSource = env
.addSource(kafkaConsumer).setParallelism(4);
public class BoundedOutOfOrdernessGenerator implements AssignerWithPeriodicWatermarks<Document> {

    private final long maxOutOfOrderness = 3500; // 3.5 seconds

    private long currentMaxTimestamp;

    @Override
    public long extractTimestamp(Document event, long previousElementTimestamp) {
        Map timeStamp = (Map) event.get("ts");
        long timestamp = (long) timeStamp.get("value");
        // track the highest timestamp seen so far; the watermark trails it by the bound
        this.currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp);
        return timestamp;
    }

    @Override
    public Watermark getCurrentWatermark() {
        // return the watermark as current highest timestamp minus the out-of-orderness bound
        return new Watermark(currentMaxTimestamp - maxOutOfOrderness);
    }
}
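For reference, the assigner is attached to the source roughly like this (a sketch using the same env, kafkaConsumer and BoundedOutOfOrdernessGenerator shown above):
DataStream<Document> withTimestampsAndWatermarks = env
        .addSource(kafkaConsumer)
        .setParallelism(4)
        .assignTimestampsAndWatermarks(new BoundedOutOfOrdernessGenerator());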
Thanks,
I have an Apache Kafka 2.6 Producer which writes to topic-A (TA).
I also have a Kafka streams application which consumes from TA and writes to topic-B (TB).
In the streams application, I have a custom timestamp extractor which extracts the timestamp from the message payload.
For one of my failure handling test cases, I shutdown the Kafka cluster while my applications are running.
When the producer application tries to write messages to TA, it cannot because the cluster is down and hence (I assume) buffers the messages.
Let's say it receives 4 messages m1,m2,m3,m4 in increasing time order. (i.e. m1 is first and m4 is last).
When I bring the Kafka cluster back online, the producer sends the buffered messages to the topic, but they are not in order. I receive for example, m2 then m3 then m1 and then m4.
Why is that? Is it because the buffering in the producer is multi-threaded, with each thread producing to the topic at the same time?
I assumed that the custom timestamp extractor would help in ordering messages when consuming them, but it does not. Or maybe my understanding of the timestamp extractor is wrong.
I got one solution from SO here: to just stream all events from TA to an intermediate topic (say TA'), which uses the timestamp extractor, and from there to another topic. But I am not sure if this will cause the events to get reordered based on the extracted timestamp.
My code for the Producer is as shown below (I am using Spring Cloud for creating the Producer):
Producer.java
@Service
public class Producer {

    private String topicName = "input-topic";

    private ApplicationProperties appProps;

    @Autowired
    private KafkaTemplate<String, MyEvent> kafkaTemplate;

    public Producer() {
        super();
    }

    @Autowired
    public void setAppProps(ApplicationProperties appProps) {
        this.appProps = appProps;
        this.topicName = appProps.getInput().getTopicName();
    }

    public void sendMessage(String key, MyEvent ce) {
        ListenableFuture<SendResult<String, MyEvent>> future = this.kafkaTemplate.send(this.topicName, key, ce);
    }
}
Why is that? Is it because the buffering in the producer is multi-threaded, with each thread producing to the topic at the same time?
By default, the producer allows up to 5 parallel in-flight requests to a broker, and thus if some requests fail and are retried the request order might change.
To avoid this re-ordering issue, you can either set max.in.flight.requests.per.connection = 1 (which may have a performance hit) or set enable.idempotence = true.
Btw: you did not say whether your topic has a single partition or multiple partitions, and whether your messages have a key. If your topic has more than one partition and your messages are sent to different partitions, there is no ordering guarantee on read anyway, because offset ordering is only guaranteed within a partition.
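As a sketch, this is what those settings look like on a plain producer config (the bootstrap server and serializers below are placeholders, not taken from your question):
Properties props = new Properties();
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
// either: keep pipelining but make retries idempotent, so order is preserved
props.put(ProducerConfig.ENABLE_IDEMPOTENCE_CONFIG, "true");
// or: allow only one in-flight request per connection (may reduce throughput)
// props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, "1");
KafkaProducer<String, String> producer = new KafkaProducer<>(props);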
I assumed that the custom timestamp extractor would help in ordering messages when consuming them, but it does not. Or maybe my understanding of the timestamp extractor is wrong.
The timestamp extractor only extracts a timestamp. Kafka Streams does not re-order any messages; it always processes messages in offset order.
If not, then what are the specific uses of the timestamp extractor ? Just to associate a timestamp with an event ?
Correct.
I got one solution from SO here: to just stream all events from TA to an intermediate topic (say TA'), which uses the timestamp extractor, and from there to another topic. But I am not sure if this will cause the events to get reordered based on the extracted timestamp.
No, it won't do any reordering. The other SO question is just about changing the timestamp; if you read messages in order a,b,c, the result would be written in order a,b,c (just with different timestamps, and offset order is preserved).
This talk explains some more details: https://www.confluent.io/kafka-summit-san-francisco-2019/whats-the-time-and-why/
I am using the Kafka Processor API (not the DSL).
public class StreamProcessor implements Processor<String, String>
{
    private ProcessorContext context;
    private KeyValueStore<String, String> stateStore; // holds key -> "topic1|topic2|..." mappings

    @Override
    public void init(ProcessorContext context)
    {
        this.context = context;
        context.commit();
        // state store initialized with key,value
        this.stateStore = (KeyValueStore<String, String>) context.getStateStore("state-store"); // hypothetical store name
    }

    @Override
    public void process(String key, String val)
    {
        try
        {
            // "|" is a regex metacharacter, so it must be escaped for split()
            String[] topicList = stateStore.get(key).split("\\|");
            for (String topic : topicList)
            {
                context.forward(key, val, To.child(topic));
            } // forward same message to a list of topics (1..n topics); rollback if the write to some topics fails?
        }
        catch (Exception e)
        {
            // how to roll back here?
        }
    }

    @Override
    public void close() { }
}
Scenario: we are reading data from a source topic and the stream processor writes data to multiple sink topics (topicList above).
Question: how can a rollback mechanism be implemented with the Kafka Streams Processor API when one or more of the topics in topicList fail to receive the message?
My understanding is that the Processor API can roll back each record that failed to send, but can rolling back an entire batch of failed messages be achieved as well? Since the process method of the Processor interface is called per record rather than per batch, I would surmise it can only be done per record. Is this a correct assumption? If not, please suggest how to achieve per-record and per-batch rollbacks for failed topics using the Processor API.
You would need to implement it yourself. For example, you could use two stores: a main store and a "buffer" store. First update only the buffer store, then call context.forward() to make sure all writes are in the output topic, and afterwards merge the "buffer" store into the main store.
If you need to roll back, you drop the content from the buffer store.
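A rough sketch of that two-store idea with the Processor API (the store names, types, and error handling below are illustrative assumptions, not a drop-in implementation):
public class BufferedProcessor implements Processor<String, String> {

    private ProcessorContext context;
    private KeyValueStore<String, String> mainStore;   // hypothetical store names
    private KeyValueStore<String, String> bufferStore;

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        this.context = context;
        this.mainStore = (KeyValueStore<String, String>) context.getStateStore("main-store");
        this.bufferStore = (KeyValueStore<String, String>) context.getStateStore("buffer-store");
    }

    @Override
    public void process(String key, String val) {
        bufferStore.put(key, val);           // 1. stage the update in the buffer store
        try {
            context.forward(key, val);       // 2. forward to the downstream sink topics
            mainStore.put(key, val);         // 3. merge the buffered update into the main store
            bufferStore.delete(key);
        } catch (RuntimeException e) {
            bufferStore.delete(key);         // roll back: drop the staged update
            throw e;                         // or handle/log, depending on the error strategy
        }
    }

    @Override
    public void close() { }
}
Note that context.forward() only hands the record to the downstream node, so whether an actual produce failure surfaces as an exception at this point depends on how producer errors are configured and reported.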
I'm using a Spring Kafka consumer that is under group management.
I have the following code in my consumer class
public class ConsumerHandler implements Receiver<String, Message>, ConsumerSeekAware {

    @Value("${topic}")
    protected String topic;

    public ConsumerHandler() {}

    @KafkaListener(topics = "${topic}")
    public Message receive(List<ConsumerRecord<String, Message>> messages, Acknowledgment acknowledgment) {
        for (ConsumerRecord<String, Message> message : messages) {
            Message msg = message.value();
            this.handleMessage(msg, message); // application-specific handling of the deserialized payload
        }
        acknowledgment.acknowledge();
        return null;
    }

    @Override
    public void registerSeekCallback(ConsumerSeekCallback callback) {
    }

    @Override
    public void onPartitionsAssigned(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
        for (Entry<TopicPartition, Long> pair : assignments.entrySet()) {
            TopicPartition tp = pair.getKey();
            callback.seekToEnd(tp.topic(), tp.partition());
        }
    }

    @Override
    public void onIdleContainer(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {}
}
This code works great while my consumer is running. However, sometimes the amount of messages being processed is too much and the messages stack up. I've implemented concurrency on my consumers and still sometimes there's delay in the messages over time.
So as a workaround, before I figure out why the delay is happening, I'm trying to keep my consumer up to the latest messages.
I'm having to restart my app to get onPartitionsAssigned invoked so that my consumer seeks to the end and starts processing the latest messages.
Is there a way to seek to end without having to bounce my application?
Thanks.
As explained in the JavaDocs and the reference manual, you can save off the ConsumerSeekCallback passed into registerSeekCallback in a ThreadLocal<ConsumerSeekCallback>.
Then, you can perform arbitrary seek operations whenever you want; however, since the consumer is not thread-safe, you must perform the seeks within your @KafkaListener so they run on the consumer thread, hence the need to store the callback in a ThreadLocal.
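A rough sketch of that approach inside your listener class (the seekToEndRequested flag and requestSeekToEnd() method are hypothetical additions for illustration; with concurrent consumers you would need one flag per consumer thread):
private final ThreadLocal<ConsumerSeekCallback> seekCallback = new ThreadLocal<>();
private final ThreadLocal<Collection<TopicPartition>> assignedPartitions = new ThreadLocal<>();
private final AtomicBoolean seekToEndRequested = new AtomicBoolean();

@Override
public void registerSeekCallback(ConsumerSeekCallback callback) {
    this.seekCallback.set(callback);
}

@Override
public void onPartitionsAssigned(Map<TopicPartition, Long> assignments, ConsumerSeekCallback callback) {
    this.assignedPartitions.set(assignments.keySet());
}

// called from elsewhere in the application (e.g. an admin endpoint) to request a seek
public void requestSeekToEnd() {
    this.seekToEndRequested.set(true);
}

@KafkaListener(topics = "${topic}")
public Message receive(List<ConsumerRecord<String, Message>> messages, Acknowledgment acknowledgment) {
    if (this.seekToEndRequested.compareAndSet(true, false)) {
        // runs on the consumer thread, so the stored callback is safe to use here
        for (TopicPartition tp : this.assignedPartitions.get()) {
            this.seekCallback.get().seekToEnd(tp.topic(), tp.partition());
        }
    }
    // ... process messages as before ...
    acknowledgment.acknowledge();
    return null;
}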
In version 2.0 and later, you can add the consumer as a parameter to the #KafkaListener method and perform the seeks directly thereon.
public Message receive(List<ConsumerRecord<String, Message>> messages, Acknowledgment acknowledgment,
Consumer<?, ?> consumer) {
The current version is 2.1.6.
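For example, the seek can then be done directly on that parameter (a sketch; seekToEndRequested is the same hypothetical flag as in the previous snippet):
@KafkaListener(topics = "${topic}")
public Message receive(List<ConsumerRecord<String, Message>> messages, Acknowledgment acknowledgment,
        Consumer<?, ?> consumer) {
    if (this.seekToEndRequested.compareAndSet(true, false)) {
        consumer.seekToEnd(consumer.assignment()); // jump straight to the latest offsets
    }
    // ... process messages as before ...
    acknowledgment.acknowledge();
    return null;
}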
I have never found or seen a fixed solution for this kind of problem. What I do is boost performance as high as possible based on the amount of messages to be processed and the Kafka parameters.
Let's say you have an online shopping app; then you can bound the number of transactions per day, say N. You should make the app work well in the scenario where 1.5*N or 2*N transactions need to be synced to the Kafka cluster. You keep this setup until the day your shopping app reaches a new level, and then you upgrade your Kafka system again. Online shopping apps see an especially high number of transactions on promotion or mega-sale days, so those are the days you should size your system for.
I am building a pretty straightforward KafkaStreams demo application, to test a use case.
I am not able to upgrade the Kafka broker I am using (which is currently on version 0.10.0), and there are several messages written by a pre-0.10.0 Producer, so I am using a custom TimestampExtractor, which I add as a default to the config in the beginning of my main class:
config.put(StreamsConfig.DEFAULT_TIMESTAMP_EXTRACTOR_CLASS_CONFIG, GenericRecordTimestampExtractor.class);
When consuming from my source topic, this works perfectly fine. But when using an aggregation operator, I run into an exception because the FailOnInvalidTimestamp implementation of TimestampExtractor is used instead of the custom implementation when consuming from the internal aggregation topic.
The code of the Streams app looks something like this:
...
KStream<String, MyValueClass> clickStream = streamsBuilder
.stream("mytopic", Consumed.with(Serdes.String(), valueClassSerde));
KTable<Windowed<Long>, Long> clicksByCustomerId = clickStream
.map(((key, value) -> new KeyValue<>(value.getId(), value)))
.groupByKey(Serialized.with(Serdes.Long(), valueClassSerde))
.windowedBy(TimeWindows.of(TimeUnit.MINUTES.toMillis(1)))
.count();
...
The Exception I'm encountering is the following:
Exception in thread "click-aggregator-b9d77f2e-0263-4fa3-bec4-e48d4d6602ab-StreamThread-1" org.apache.kafka.streams.errors.StreamsException:
Input record ConsumerRecord(topic = click-aggregator-KSTREAM-AGGREGATE-STATE-STORE-0000000002-repartition, partition = 9, offset = 0, CreateTime = -1, serialized key size = 8, serialized value size = 652, headers = RecordHeaders(headers = [], isReadOnly = false), key = 11230, value = org.example.MyValueClass@2a3f2ea2) has invalid (negative) timestamp.
Possibly because a pre-0.10 producer client was used to write this record to Kafka without embedding a timestamp, or because the input topic was created before upgrading the Kafka cluster to 0.10+. Use a different TimestampExtractor to process this data.
Now the question is: Is there any way I can make Kafka Streams use the custom TimestampExtractor when reading from the internal aggregation topic (optimally while still using the Streams DSL)?
You cannot change the timestamp extractor (as of v1.0.0). This is not allowed for correctness reasons.
But I am really wondering how a record with timestamp -1 was written into this topic in the first place. Kafka Streams uses the timestamp that was provided by your custom extractor when writing the record. Also note that KafkaProducer does not allow writing records with a negative timestamp.
Thus, the only explanation I can think of is that some other producer did write into the repartitioning topic -- and this is not allowed... Only Kafka Streams should write into the repartitioning topic.
I guess, you will need to delete this topic and let Kafka Streams recreate it to get back into a clean state.
From the discussion/comment of the other answer:
You need 0.10+ format to work with Kafka Streams. If you upgrade your brokers and keep 0.9 format or older, Kafka Streams might not work as expected.
It is a well-known issue :-). I have the same problem with old clients in projects which are still using older Kafka clients like 0.9, and also when communicating with some "not certified" .NET clients.
Therefore I wrote a dedicated class:
public class MyTimestampExtractor implements TimestampExtractor {

    private static final Logger LOG = LogManager.getLogger(MyTimestampExtractor.class);

    @Override
    public long extract(ConsumerRecord<Object, Object> consumerRecord, long previousTimestamp) {
        final long timestamp = consumerRecord.timestamp();
        if (timestamp < 0) {
            final String msg = consumerRecord.toString().trim();
            LOG.warn("Record has wrong Kafka timestamp: {}. It will be patched with local timestamp. Details: {}", timestamp, msg);
            return System.currentTimeMillis();
        }
        return timestamp;
    }
}
When there are many messages you may skip logging, as it may flood.
After reading Matthias' answer I double checked everything and the cause of the issue were incompatible versions between the Kafka Broker and the Kafka Streams app. I was stupid enough to use Kafka Streams 1.0.0 with a 0.10.1.1 Broker, which is clearly stated as incompatible in the Kafka Wiki here.
Edit (thx to Matthias): The actual cause of the problem was the fact that the log format used by our 0.10.1.x broker was still 0.9.0.x, which is incompatible with Kafka Streams.
I am using Apache Flink and the KafkaConsumer to read some values from a Kafka Topic.
I also have a stream obtained from reading a file.
Depending on the received values, I would like to write this stream on different Kafka Topics.
Basically, I have a network with a leader linked to many children. For each child, the Leader needs to write the stream read in a child-specific Kafka Topic, so that the child can read it.
When the child is started, it registers itself in the Kafka topic read from the Leader.
The problem is that I don't know a priori how many children I have.
For example, I read 1 from the Kafka Topic, I want to write the stream in just one Kafka Topic named Topic1.
I read 1-2, I want to write on two Kafka Topics (Topic1 and Topic2).
I don't know if it is possible, because in order to write to a topic I am using the Kafka producer along with the addSink method, and to my understanding (and from my attempts) it seems that Flink requires knowing the number of sinks a priori.
But then, is there no way to obtain such behavior?
If I understood your problem well, I think you can solve it with a single sink, since you can choose the Kafka topic based on the record being processed. It also seems that one element from the source might be written to more than one topic, in which case you would need a FlatMapFunction to replicate each source record N times (once for each output topic). I would recommend outputting a pair (aka Tuple2) of (topic, record).
DataStream<Tuple2<String, MyValue>> stream = input.flatMap(new FlatMapFunction<MyValue, Tuple2<String, MyValue>>() {
    @Override
    public void flatMap(MyValue value, Collector<Tuple2<String, MyValue>> out) {
        for (String topic : topics) {
            out.collect(Tuple2.of(topic, value));
        }
    }
});
Then you can use the topic previously computed by creating the FlinkKafkaProducer with a KeyedSerializationSchema in which you implement getTargetTopic to return the first element of the pair.
stream.addSink(new FlinkKafkaProducer010<>(
        "default-topic",
        new KeyedSerializationSchema<Tuple2<String, MyValue>>() {
            @Override
            public String getTargetTopic(Tuple2<String, MyValue> element) {
                return element.f0;
            }
            ...
        },
        kafkaProperties)
);
KeyedSerializationSchema is now deprecated. Instead you have to use KafkaSerializationSchema.
The same can be achieved by overriding the serialize method.
@Override
public ProducerRecord<byte[], byte[]> serialize(
        String inputString, @Nullable Long aLong) {
    // customTopicName and key come from the surrounding schema class
    return new ProducerRecord<>(customTopicName,
            key.getBytes(StandardCharsets.UTF_8), inputString.getBytes(StandardCharsets.UTF_8));
}