I have a GUI events producer. I want to take the latest emission from it and have it processed on a different thread. While that emission is being processed, further emissions from the GUI producer should be dropped. Once processing finishes, I want to take the latest emission from the GUI producer again.
I tried the onBackpressureLatest() overflow strategy, but the queue size is my problem. With the default queue size of 256 I get the expected behaviour, but I first have to process 255 emissions that are useless to me. Decreasing the queue size to 16 still leaves me with 15 useless emissions. I imagine a queue size of 1 would give the expected behaviour, but 16 is the minimum value.
The code described in the previous paragraph is attached below.
import reactor.core.publisher.Flux;
import reactor.core.scheduler.Schedulers;

import java.util.concurrent.TimeUnit;

public class App {

    public static void main(String[] args) throws InterruptedException {
        // 16 is the smallest value Reactor accepts for the small buffer size
        System.setProperty("reactor.bufferSize.small", "16");

        guiEventsProducer()
                .onBackpressureLatest()
                .log()
                .publishOn(Schedulers.single())
                .subscribe(next -> {
                    System.out.println("Processing " + next);
                    try {
                        TimeUnit.MILLISECONDS.sleep(1);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                    System.out.println("Processing " + next + " done");
                });

        TimeUnit.SECONDS.sleep(60);
    }

    private static Flux<Integer> guiEventsProducer() {
        return Flux.range(0, 10000);
    }
}
A variant that gives the intended behaviour: with flatMap at concurrency = 1 and prefetch = 1, only one element is requested from upstream at a time, so onBackpressureLatest() holds just the single latest emission while the previous one is being processed.

import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;
import reactor.core.scheduler.Schedulers;

import java.util.concurrent.TimeUnit;

public class App {

    public static void main(String[] args) throws InterruptedException {
        Flux.range(1, 10000)
                .log()
                .onBackpressureLatest()
                .log()
                // concurrency = 1, prefetch = 1: request a single element at a time
                .flatMap(next -> Mono.just(next).subscribeOn(Schedulers.single()), 1, 1)
                .subscribe(integer -> {
                    System.out.println("[" + Thread.currentThread().getName() + "] result=" + integer);
                    try {
                        TimeUnit.MILLISECONDS.sleep(10);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                });

        TimeUnit.SECONDS.sleep(60);
    }
}
Below is the consumer code that receives messages from a Kafka topic (8 partitions) and processes them.
@Component
public class MessageConsumer {

    private static final String TOPIC = "mytopic.t";
    private static final String GROUP_ID = "mygroup";
    private final ReceiverOptions<String, String> consumerSettings;
    private static final Logger LOG = LoggerFactory.getLogger(MessageConsumer.class);

    @Autowired
    public MessageConsumer(@Qualifier("consumerSettings") ReceiverOptions<String, String> consumerSettings) {
        this.consumerSettings = consumerSettings;
        consumerMessage();
    }

    private void consumerMessage() {
        KafkaReceiver<String, String> receiver = KafkaReceiver.create(receiverOptions(Collections.singleton(TOPIC)));
        Scheduler scheduler = Schedulers.newElastic("FLUX_DEFER", 10, true);

        Flux.defer(receiver::receive)
                .groupBy(m -> m.receiverOffset().topicPartition())
                .flatMap(partitionFlux ->
                        partitionFlux.publishOn(scheduler)
                                .concatMap(m -> {
                                    LOG.info("message received from kafka : " + "key : " + m.key() + " partition: " + m.partition());
                                    return process(m.key(), m.value())
                                            .thenEmpty(m.receiverOffset().commit());
                                }))
                .retryBackoff(5, Duration.ofSeconds(2), Duration.ofHours(2))
                .doOnError(err -> handleError(err))
                .retry()
                .doOnCancel(() -> close())
                .subscribe();
    }

    private void close() {
    }

    private void handleError(Throwable err) {
        LOG.error("kafka stream error : ", err);
    }

    private Mono<Void> process(String key, String value) {
        if (key.equals("error")) {
            return Mono.error(new Exception("process error : "));
        }
        LOG.error("message consumed : " + key);
        return Mono.empty();
    }

    public ReceiverOptions<String, String> receiverOptions(Collection<String> topics) {
        return consumerSettings
                .commitInterval(Duration.ZERO)
                .commitBatchSize(0)
                .addAssignListener(p -> LOG.info("Group {} partitions assigned {}", GROUP_ID, p))
                .addRevokeListener(p -> LOG.info("Group {} partitions revoked {}", GROUP_ID, p))
                .subscription(topics);
    }
}
@Bean(name = "consumerSettings")
public ReceiverOptions<String, String> getConsumerSettings() {
    Map<String, Object> props = new HashMap<>();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.GROUP_ID_CONFIG, GROUP_ID);
    props.put(ConsumerConfig.CLIENT_ID_CONFIG, GROUP_ID);
    props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest");
    props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, false);
    props.put("max.block.ms", "3000");
    props.put("request.timeout.ms", "3000");
    return ReceiverOptions.create(props);
}
On receiving each message, my processing logic returns an empty Mono if the message was processed successfully.
Everything works as expected if no error is returned from the processing logic.
But if I throw an error in the processing logic to simulate an exception for a particular message, the message that caused the exception is not processed: the stream simply moves on to the next message.
What I want to achieve is: process the current message and, if it succeeds, commit the offset and move to the next record.
If there is any exception while processing the message, don't commit the current offset and retry the same message until it succeeds; don't move to the next message until the current one is successful.
Please let me know how to handle processing failures without skipping the message, so that the stream resumes from the offset where the exception was thrown.
Regards,
Vinoth
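A minimal sketch of that behaviour, reusing the question's receiver, scheduler, process() and LOG: retrying the inner processing Mono before the commit keeps the partition on the failing record until process() eventually succeeds, so nothing is skipped.

// Sketch only: retry process() itself with a fixed delay between attempts and
// commit the offset only once it finally succeeds, so the failing record is
// never skipped and the partition does not advance past it.
Flux.defer(receiver::receive)
        .groupBy(m -> m.receiverOffset().topicPartition())
        .flatMap(partitionFlux ->
                partitionFlux.publishOn(scheduler)
                        .concatMap(m -> process(m.key(), m.value())
                                .retryWhen(errors -> errors
                                        .doOnNext(e -> LOG.warn("processing failed, retrying key {}", m.key()))
                                        .delayElements(Duration.ofSeconds(2))) // retry the same message indefinitely
                                .thenEmpty(m.receiverOffset().commit())))
        .subscribe();

The answer below takes a different route, with a bounded number of retries and a dead-letter topic.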
The code below works for me. The idea is to retry a failed message a configured number of times; if it still fails, move it to a failure queue and commit its offset. Messages from other partitions are processed concurrently at the same time.
If a message from a particular partition fails the configured number of times, the stream is restarted after a delay, so that failing dependencies are not hammered continuously.
@Autowired
public ReactiveMessageConsumer(@Qualifier("consumerSettings") ReceiverOptions<String, String> consumerSettings, MessageProducer producer) {
    this.consumerSettings = consumerSettings;
    this.producer = producer;
    consumerMessage();
}

private void consumerMessage() {
    int numRetries = 3;
    Scheduler scheduler = Schedulers.newElastic("FLUX_DEFER", 10, true);
    KafkaReceiver<String, String> receiver = KafkaReceiver.create(receiverOptions(Collections.singleton(TOPIC)));

    Flux<GroupedFlux<TopicPartition, ReceiverRecord<String, String>>> f = Flux.defer(receiver::receive)
            .groupBy(m -> m.receiverOffset().topicPartition());

    Flux f1 = f.publishOn(scheduler).flatMap(r -> r.publishOn(scheduler).concatMap(b ->
            Flux.just(b)
                    .concatMap(a -> {
                        LOG.error("processing message - order: {} offset: {} partition: {}", a.key(), a.receiverOffset().offset(), a.receiverOffset().topicPartition().partition());
                        return process(a.key(), a.value())
                                .then(a.receiverOffset().commit())
                                .doOnSuccess(d -> LOG.info("committing order {}: offset: {} partition: {} ", a.key(), a.receiverOffset().offset(), a.receiverOffset().topicPartition().partition()))
                                .doOnError(d -> LOG.info("committing offset failed for order {}: offset: {} partition: {} ", a.key(), a.receiverOffset().offset(), a.receiverOffset().topicPartition().partition()));
                    })
                    .retryWhen(companion -> companion
                            .doOnNext(s -> LOG.info(" --> Exception processing message for order {}: offset: {} partition: {} message: {} ", b.key(), b.receiverOffset().offset(), b.receiverOffset().topicPartition().partition(), s.getMessage()))
                            .zipWith(Flux.range(1, numRetries), (error, index) -> {
                                if (index < numRetries) {
                                    LOG.info(" --> Retrying {} order: {} offset: {} partition: {} ", index, b.key(), b.receiverOffset().offset(), b.receiverOffset().topicPartition().partition());
                                    return index;
                                } else {
                                    LOG.info(" --> Retries Exhausted: {} - order: {} offset: {} partition: {}. Message moved to error queue. Commit and proceed to next", index, b.key(), b.receiverOffset().offset(), b.receiverOffset().topicPartition().partition());
                                    producer.sendMessages(ERROR_TOPIC, b.key(), b.value());
                                    b.receiverOffset().commit();
                                    //return index;
                                    throw Exceptions.propagate(error);
                                }
                            })
                            .flatMap(index -> Mono.delay(Duration.ofSeconds((long) Math.pow(1.5, index - 1) * 3)))
                            .doOnNext(s -> LOG.info(" --> Retried at: {} ", LocalTime.now()))
                    ))
    );

    f1.doOnError(a -> {
        LOG.info("Moving to next message because of : ", a);
        try {
            Thread.sleep(5000); // configurable
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }).retry().subscribe();
}
public ReceiverOptions<String, String> receiverOptions(Collection<String> topics) {
    return consumerSettings
            .commitInterval(Duration.ZERO)
            .commitBatchSize(0)
            .addAssignListener(p -> LOG.info("Group {} partitions assigned {}", GROUP_ID, p))
            .addRevokeListener(p -> LOG.info("Group {} partitions revoked {}", GROUP_ID, p))
            .subscription(topics);
}

private Mono<Void> process(String orderId, String traceId) {
    try {
        Thread.sleep(500); // simulate slow response
    } catch (InterruptedException e) {
        // Causes the restart
        e.printStackTrace();
    }
    if (orderId.startsWith("error")) { // simulate error scenario
        return Mono.error(new Exception("processing message failed for order: " + orderId));
    }
    return Mono.empty();
}
Create different consumer groups, each consumer group related to one database.
Create your consumers so that they only process the relevant events and push them to the related database. If the database is down, configure the consumer to retry an infinite number of times, as sketched below.
If your consumer dies for any reason, make sure the replacement starts from where the earlier consumer left off. There is a small possibility that your consumer dies right after committing the data to the database but before sending the ack to the Kafka broker, so you need to update the consumer code to make sure you process messages exactly once (if needed).
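A minimal sketch of that setup, assuming plain kafka-clients and a hypothetical writeToDatabase helper: the consumer uses the group dedicated to its database, retries the write indefinitely while the database is down, and only commits offsets once the records have been persisted.

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;

public class PerDatabaseConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "orders-db-group");          // one group per target database
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");          // commit manually after persisting
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singleton("mytopic.t"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    boolean written = false;
                    while (!written) {                        // retry until the database accepts the write
                        try {
                            writeToDatabase(record.value());  // hypothetical DB write
                            written = true;
                        } catch (Exception databaseDown) {
                            sleepQuietly(5_000);              // back off before retrying
                        }
                    }
                }
                consumer.commitSync();                        // ack to Kafka only after the batch is persisted
            }
        }
    }

    private static void writeToDatabase(String value) {
        // placeholder for the real database write
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

In a real consumer the retry loop would also have to respect max.poll.interval.ms (or pause the partition) so the group coordinator does not evict the consumer during a long outage.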
Given the following code:
kafkaConsumer
    .rxSubscription()
    .subscribeOn(Schedulers.io())
    .map(s -> {
        logger.info("Mapping on Thread: " + Thread.currentThread().getName());
        return s;
    })
    .observeOn(Schedulers.computation())
    .subscribe(set -> {
        logger.info("Subscribing on Thread: " + Thread.currentThread().getName());
    });
where kafkaConsumer is a Vert.x KafkaConsumer, I expect that the
.map(s -> {
    logger.info("Mapping on Thread: " + Thread.currentThread().getName());
    return s;
})
would happen on the RxJava io scheduler thread. However, it executes on the Vert.x event-loop thread. When I run the following test class, the same scenario runs the map method on the IO thread, as expected.
public class ThreadTesting {

    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();
        Observable.fromArray(new String[] {"start"})
                .flatMapSingle(s -> method1())
                .subscribeOn(Schedulers.io())
                .map(s -> {
                    System.out.println("mapping 2 on Thread: " + Thread.currentThread().getName());
                    return s.concat(method2());
                })
                .observeOn(Schedulers.computation())
                .subscribe(
                        str -> {
                            System.out.println("Subscribing on Thread: " + Thread.currentThread().getName());
                        },
                        onError -> {
                            onError.printStackTrace();
                        });
    }

    public static Single<String> method1() {
        System.out.println("Executing method 1 on Thread: " + Thread.currentThread().getName());
        AsyncResultSingle<String> vertxSingle = new AsyncResultSingle<>(
                h -> h.handle(Future.succeededFuture("method 1 string")));
        return vertxSingle;
    }

    public static String method2() {
        System.out.println("Executing method 2 on Thread: " + Thread.currentThread().getName());
        return "method 2 String";
    }
}
What causes this discrepancy in thread execution?
The Vert.x KafkaConsumer emits items asynchronously on an event-loop thread, even if you subscribe to it on the io scheduler.
In your snippet, you try to force items to be emitted on the computation scheduler. It works, but not on the observable you expect: it applies to the observable returned by the map operation.
If you want map to operate on the computation scheduler, you need to apply the observeOn operator before it:
kafkaConsumer
    .rxSubscription()
    .subscribeOn(Schedulers.io())
    .observeOn(Schedulers.computation())
    .map(s -> {
        logger.info("Mapping on Thread: " + Thread.currentThread().getName());
        return s;
    })
    .subscribe(set -> {
        logger.info("Subscribing on Thread: " + Thread.currentThread().getName());
    });
I'm working on implementing a simple data provider with a two-level cache using RxJava 2.
public static void main(String[] args) {
    Observable.concat(getFromMemory(), getFromFileCache(), getFromNetwork())
            .firstElement()
            .subscribe(integer -> System.out.println("Completed with val: " + integer));
}

static Observable<Integer> getFromMemory() {
    return Observable.create(e -> {
        System.out.println("Source: Memory");
        e.onNext(1);
        e.onComplete();
    });
}

static Observable<Integer> getFromFileCache() {
    try {
        Thread.sleep(1000);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    return Observable.create(e -> {
        System.out.println("Source: FileCache");
        e.onNext(2);
        e.onComplete();
    });
}

static Observable<Integer> getFromNetwork() {
    try {
        Thread.sleep(2000);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    return Observable.create(e -> {
        System.out.println("Source: Network");
        e.onNext(3);
        e.onComplete();
    });
}
The goal is to look for the object in the memory cache; if it is not found, read it from the file cache; if it is still not found, call the network to get the resource. Updating the caches is not important in this sample.
When executing this code, I see the following console log:
Source: Memory
Source: FileCache
Source: Network
Completed with val: 1
This means the file cache and network calls are executed even though the memory cache returns a value.
I'm using RxJava 2. Which operator can I use to combine the sources but stop executing at the first value found? I experimented with first(default) and take, with no luck so far.
Actually, it's a bug in RxJava 2; upgrading to 2.1.0 helped:
Github issue: https://github.com/ReactiveX/RxJava/issues/5100
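For reference, a minimal sketch of the lazy variant on a fixed RxJava version (2.1.0 or later); the TwoLevelCache class name is just for illustration. Wrapping each source in Observable.defer additionally keeps the per-source setup work (the sleeps simulating slow sources) from running unless that source is actually subscribed to.

import io.reactivex.Observable;

public class TwoLevelCache {

    public static void main(String[] args) {
        // Each source is wrapped in defer, so even the setup work (the sleeps that
        // simulate slow sources) only runs if concat actually subscribes to that
        // source. With the fixed concat (RxJava 2.1.0+), firstElement() completes
        // after the memory hit and the file/network sources are never touched.
        Observable.concat(
                Observable.defer(TwoLevelCache::getFromMemory),
                Observable.defer(TwoLevelCache::getFromFileCache),
                Observable.defer(TwoLevelCache::getFromNetwork))
                .firstElement()
                .subscribe(integer -> System.out.println("Completed with val: " + integer));
        // Expected console output:
        // Source: Memory
        // Completed with val: 1
    }

    static Observable<Integer> getFromMemory() {
        return Observable.create(e -> {
            System.out.println("Source: Memory");
            e.onNext(1);
            e.onComplete();
        });
    }

    static Observable<Integer> getFromFileCache() {
        sleep(1000); // simulate a slow file read
        return Observable.create(e -> {
            System.out.println("Source: FileCache");
            e.onNext(2);
            e.onComplete();
        });
    }

    static Observable<Integer> getFromNetwork() {
        sleep(2000); // simulate a slow network call
        return Observable.create(e -> {
            System.out.println("Source: Network");
            e.onNext(3);
            e.onComplete();
        });
    }

    static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}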
I have a client-server application and I'm using RxJava to make server requests from the client. The client should only make one request at a time, so I intend to use a single-threaded queueing scheduler similar to the trampoline scheduler.
Now I am trying to implement a mechanism to watch for changes on the server. For that I send a long-lived request that blocks until the server has some changes and sends back the result (long polling).
This long-polling request should only run when the job queue is idle. I'm looking for a way to automatically stop the watch request when a regular request is scheduled and start it again when the queue becomes empty. I thought about modifying the trampoline scheduler to get this behaviour, but I have the feeling this is a common problem and there might be an easier solution.
You can hold on to the Subscription returned by scheduling the long-poll task, unsubscribe it when the queue becomes non-empty, and re-schedule it when the queue becomes empty again.
Edit: here is an example built on a plain single-threaded ExecutorService:
import java.util.concurrent.*;
import java.util.concurrent.atomic.*;

public class IdleScheduling {

    static final class TaskQueue {
        final ExecutorService executor;
        final AtomicReference<Future<?>> idleFuture;
        final Runnable idleRunnable;
        final AtomicInteger wip;

        public TaskQueue(Runnable idleRunnable) {
            this.executor = Executors.newFixedThreadPool(1);
            this.idleRunnable = idleRunnable;
            this.idleFuture = new AtomicReference<>();
            this.wip = new AtomicInteger();
            this.idleFuture.set(executor.submit(idleRunnable));
        }

        public void shutdownNow() {
            executor.shutdownNow();
        }

        public Future<?> enqueue(Runnable task) {
            if (wip.getAndIncrement() == 0) {
                // first regular task: interrupt the idle (long-poll) task
                idleFuture.get().cancel(true);
            }
            return executor.submit(() -> {
                task.run();
                if (wip.decrementAndGet() == 0) {
                    // queue drained: restart the idle task
                    startIdle();
                }
            });
        }

        void startIdle() {
            idleFuture.set(executor.submit(idleRunnable));
        }
    }

    public static void main(String[] args) throws Exception {
        TaskQueue tq = new TaskQueue(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException ex) {
                    System.out.println("Idle interrupted...");
                    return;
                }
                System.out.println("Idle...");
            }
        });
        try {
            Thread.sleep(1500);
            tq.enqueue(() -> System.out.println("Work 1"));
            Thread.sleep(500);
            tq.enqueue(() -> {
                System.out.println("Work 2");
                try {
                    Thread.sleep(500);
                } catch (InterruptedException ex) {
                }
            });
            tq.enqueue(() -> System.out.println("Work 3"));
            Thread.sleep(1500);
        } finally {
            tq.shutdownNow();
        }
    }
}