Aggregate resource requests & dispatch responses to each subscriber

I'm fairly new to RxJava and struggling with a use case that seems quite common to me:
Gather multiple requests from different parts of the application, aggregate them, make a single resource call and dispatch the results to each subscriber.
I've tried a lot of different approaches, using subjects, connectable observables, deferred observables... none did the trick so far.
I was quite optimistic about this approach, but it turns out it fails just like the others:
//(...)
static HashMap<String, String> requests = new HashMap<>();
//(...)
@Test
public void myTest() throws InterruptedException {
TestScheduler scheduler = new TestScheduler();
Observable<String> interval = Observable.interval(10, TimeUnit.MILLISECONDS, scheduler)
.doOnSubscribe(() -> System.out.println("new subscriber!"))
.doOnUnsubscribe(() -> System.out.println("unsubscribed"))
.filter(l -> !requests.isEmpty())
.doOnNext(aLong -> System.out.println(requests.size() + " requests to send"))
.flatMap(aLong -> {
System.out.println("requests " + requests);
return Observable.from(requests.keySet()).take(10).distinct().toList();
})
.doOnNext(strings -> System.out.println("calling aggregate for " + strings + " (from " + requests + ")"))
.flatMap(Observable::from)
.doOnNext(s -> {
System.out.println("----");
System.out.println("removing " + s);
requests.remove(s);
})
.doOnNext(s -> System.out.println("remaining " + requests));
TestSubscriber<String> ts1 = new TestSubscriber<>();
TestSubscriber<String> ts2 = new TestSubscriber<>();
TestSubscriber<String> ts3 = new TestSubscriber<>();
TestSubscriber<String> ts4 = new TestSubscriber<>();
Observable<String> defer = buildObservable(interval, "1");
defer.subscribe(ts1);
Observable<String> defer2 = buildObservable(interval, "2");
defer2.subscribe(ts2);
Observable<String> defer3 = buildObservable(interval, "3");
defer3.subscribe(ts3);
scheduler.advanceTimeBy(200, TimeUnit.MILLISECONDS);
Observable<String> defer4 = buildObservable(interval, "4");
defer4.subscribe(ts4);
scheduler.advanceTimeBy(100, TimeUnit.MILLISECONDS);
ts1.awaitTerminalEvent(1, TimeUnit.SECONDS);
ts2.awaitTerminalEvent(1, TimeUnit.SECONDS);
ts3.awaitTerminalEvent(1, TimeUnit.SECONDS);
ts4.awaitTerminalEvent(1, TimeUnit.SECONDS);
ts1.assertValue("1");
ts2.assertValue("2"); //fails (test stops here)
ts3.assertValue("3"); //fails
ts4.assertValue("4"); //fails
}
public Observable<String> buildObservable(Observable<String> interval, String key) {
return Observable.defer(() -> {
System.out.printf("creating observable for key " + key);
return Observable.create(subscriber -> {
requests.put(key, "xxx");
interval.doOnNext(s -> System.out.println("filtering : key/val " + key + "/" + s))
.filter(s1 -> s1.equals(key))
.doOnError(subscriber::onError)
.subscribe(s -> {
System.out.println("intern " + s);
subscriber.onNext(s);
subscriber.onCompleted();
subscriber.unsubscribe();
});
});
}
)
;
}
Output :
creating observable for key 1new subscriber!
creating observable for key 2new subscriber!
creating observable for key 3new subscriber!
3 requests to send
requests {3=xxx, 2=xxx, 1=xxx}
calling aggregate for [3, 2, 1] (from {3=xxx, 2=xxx, 1=xxx})
----
removing 3
remaining {2=xxx, 1=xxx}
filtering : key/val 1/3
----
removing 2
remaining {1=xxx}
filtering : key/val 1/2
----
removing 1
remaining {}
filtering : key/val 1/1
intern 1
creating observable for key 4new subscriber!
1 requests to send
requests {4=xxx}
calling aggregate for [4] (from {4=xxx})
----
removing 4
remaining {}
filtering : key/val 1/4
The test fails at the second assertion (ts2 does not receive "2").
It turns out the pseudo-aggregation works as expected, but the values are not dispatched to the corresponding subscribers (only the first subscriber receives its value).
Any idea why?
Also, I feel like I'm missing the obvious here. If you think of a better approach, I'm more than willing to hear about it.
EDIT: Adding some context regarding what I want to achieve.
I have a REST API exposing data via multiple endpoints (e.g. user/{userid}). This API also makes it possible to aggregate requests (e.g. user/user1 & user/user2) and get the corresponding data in a single HTTP request instead of two.
My goal is to automatically aggregate the requests made from different parts of my application within a given time frame (say 10ms) and with a max batch size (say 10), make one aggregate HTTP request, then dispatch the results to the corresponding subscribers.
Something like this:
// NOTE: those calls can be fired from anywhere in the app, and randomly combined. The timing and order is completely unpredictable
//ts : 0ms
api.call(userProfileRequest1).subscribe(this::show);
api.call(userProfileRequest2).subscribe(this::show);
//--> after 10ms, should fire one single http aggregate request with those 2 calls, map the response items & send them to the corresponding subscribers (that will show the right user profile)
//ts : 20ms
api.call(userProfileRequest3).subscribe(this::show);
api.call(userProfileRequest4).subscribe(this::show);
api.call(userProfileRequest5).subscribe(this::show);
api.call(userProfileRequest6).subscribe(this::show);
api.call(userProfileRequest7).subscribe(this::show);
api.call(userProfileRequest8).subscribe(this::show);
api.call(userProfileRequest9).subscribe(this::show);
api.call(userProfileRequest10).subscribe(this::show);
api.call(userProfileRequest11).subscribe(this::show);
api.call(userProfileRequest12).subscribe(this::show);
//--> should fire a single http aggregate request RIGHT AWAY (we hit the max batch size) with the 10 items, map the response items & send them to the corresponding subscribers (that will show the right user profile)
The test code I wrote (with just strings) and pasted at the top of this question is meant to be a proof of concept for my final implementation.

Your Observable is not well constructed
public Observable<String> buildObservable(Observable<String> interval, String key) {
return interval.doOnSubscribe(() -> System.out.printf("creating observable for key " + key))
.doOnSubscribe(() -> requests.put(key, "xxx"))
.doOnNext(s -> System.out.println("filtering : key/val " + key + "/" + s))
.filter(s1 -> s1.equals(key));
}
Subscribing inside a subscriber is often a sign of bad design.
I'm not sure I understand what you want to achieve, but I think my code should be pretty close to yours.
Please note that, for all side effects, I use the do* methods (like doOnNext, doOnSubscribe) to make it explicit that I want to perform a side effect.
I replaced your defer call by returning the interval directly: since you want to emit all interval events in the custom observable built inside your defer call, returning the interval observable is better.
Please note that you are filtering your interval Observable:
Observable<String> interval = Observable.interval(10, TimeUnit.MILLISECONDS, scheduler)
.filter(l -> !requests.isEmpty()).
// ...
So the filtered interval only emits while the requests map is non-empty; as soon as the map has been emptied, it stops emitting.
I don't understand what you want to achieve with the requests map, but please note that you may want to avoid side effects, and updating this map is clearly a side effect.
Update regarding comments
You may want to use the buffer operator to aggregate requests, and then perform the requests in bulk:
PublishSubject<String> subject = PublishSubject.create();
TestScheduler scheduler = new TestScheduler();
Observable<Pair> broker = subject.buffer(100, TimeUnit.MILLISECONDS, 10, scheduler)
.flatMapIterable(list -> list) // you can bulk calls here
.flatMap(id -> Observable.fromCallable(() -> api.call(id)).map(response -> Pair.of(id, response)));
TestSubscriber<Object> ts1 = new TestSubscriber<>();
TestSubscriber<Object> ts2 = new TestSubscriber<>();
TestSubscriber<Object> ts3 = new TestSubscriber<>();
TestSubscriber<Object> ts4 = new TestSubscriber<>();
broker.filter(pair -> pair.id.equals("1")).take(1).map(pair -> pair.response).subscribe(ts1);
broker.filter(pair -> pair.id.equals("2")).take(1).map(pair -> pair.response).subscribe(ts2);
broker.filter(pair -> pair.id.equals("3")).take(1).map(pair -> pair.response).subscribe(ts3);
broker.filter(pair -> pair.id.equals("4")).take(1).map(pair -> pair.response).subscribe(ts4);
subject.onNext("1");
subject.onNext("2");
subject.onNext("3");
scheduler.advanceTimeBy(1, TimeUnit.SECONDS);
ts1.assertValue("resp1");
ts2.assertValue("resp2");
ts3.assertValue("resp3");
ts4.assertNotCompleted();
subject.onNext("4");
scheduler.advanceTimeBy(1, TimeUnit.SECONDS);
ts4.assertValue("resp4");
ts4.assertCompleted();
If you want to perform network request collapsing, you may want to check Hystrix: https://github.com/Netflix/Hystrix
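To make the "collect, send one bulk call, dispatch per key" idea concrete without any Rx machinery, here is a minimal plain-Java sketch. It is not the RxJava solution above and not Hystrix; `BatchingBroker`, `BatchingDemo` and `fakeBulk` are names invented for this example, and the bulk call is a stand-in for the aggregate HTTP request. The time window is left to the caller: a timer would invoke `flush()` every few milliseconds, while hitting the max batch size flushes immediately.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical helper for illustration; not part of RxJava or Hystrix.
class BatchingBroker {
    private final int maxBatchSize;
    // Stand-in for the aggregate HTTP call: batched keys in, key -> response out.
    private final Function<List<String>, Map<String, String>> bulkCall;
    private final Map<String, CompletableFuture<String>> pending = new LinkedHashMap<>();
    int bulkCalls = 0; // exposed only so the sketch can be inspected

    BatchingBroker(int maxBatchSize, Function<List<String>, Map<String, String>> bulkCall) {
        this.maxBatchSize = maxBatchSize;
        this.bulkCall = bulkCall;
    }

    // Registers a request. Callers asking for the same key share one future.
    synchronized CompletableFuture<String> call(String key) {
        CompletableFuture<String> result =
                pending.computeIfAbsent(key, k -> new CompletableFuture<>());
        if (pending.size() >= maxBatchSize) {
            flush(); // batch is full: send right away
        }
        return result;
    }

    // In a real setup a timer would call this every few milliseconds.
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        Map<String, CompletableFuture<String>> batch = new LinkedHashMap<>(pending);
        pending.clear();
        bulkCalls++;
        try {
            Map<String, String> responses = bulkCall.apply(new ArrayList<>(batch.keySet()));
            // Dispatch each response to the caller that asked for that key.
            batch.forEach((key, future) -> future.complete(responses.get(key)));
        } catch (RuntimeException ex) {
            // A failed bulk call fails every request of that batch.
            batch.values().forEach(future -> future.completeExceptionally(ex));
        }
    }
}

class BatchingDemo {
    // Fake aggregate endpoint, invented for this example.
    static Map<String, String> fakeBulk(List<String> keys) {
        Map<String, String> out = new HashMap<>();
        for (String key : keys) {
            out.put(key, "resp" + key);
        }
        return out;
    }

    public static void main(String[] args) {
        BatchingBroker broker = new BatchingBroker(10, BatchingDemo::fakeBulk);
        CompletableFuture<String> r1 = broker.call("1");
        CompletableFuture<String> r2 = broker.call("2");
        broker.flush(); // simulates the timer firing
        System.out.println(r1.join() + ", " + r2.join()
                + " via " + broker.bulkCalls + " bulk call(s)");
    }
}
```

The key difference from filtering a shared stream per subscriber is that each caller gets its own handle keyed by request id, so dispatching is a map lookup rather than every subscriber inspecting every response.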

Related

WebFlux/Reactor: checking conditions before+after Flux execution with doOnComplete

I'm already querying some external resource with Flux.using(). Now I want to implement a kind of optimistic locking: read some state before the query starts to execute and check whether it was updated after the query has finished. If so, throw an exception to break HTTP request handling.
I've achieved this by using doOnComplete:
final AtomicReference<String> initialState = new AtomicReference<>();
return Flux.just("some", "constant", "data")
.doOnComplete(() -> initialState.set(getState()))
.concatWith(Flux.using(...)) //actual data query
.doOnComplete(() -> {if (!initialState.get().equals(getState())) throw new RuntimeException();})
.concatWithValues("another", "constant", "data")
My questions:
Is it correct? Is it guaranteed that the 1st doOnComplete lambda finishes before Flux.using() starts, and that the 2nd doOnComplete lambda is executed strictly after it?
Does a more elegant solution exist?
The first doOnComplete is executed after Flux.just("some", "constant", "data") has emitted all its elements, and the second one after the Publisher passed to concatWith completes successfully. This works because both publishers have a finite number of elements.
With the proposed approach, however, the pre-/postconditions of a particular operation are handled outside of the operation, at a higher level. In other words, the condition check belonging to the operation leaks into the flux definition.
Suggestion, pushing the condition check down to the operation:
var otherElements = Flux.using( // actual data query
() -> "other",
x -> {
var initialState = getState();
return Flux.just(x).doOnComplete(() ->
{ if (!initialState.equals(getState())) throw new IllegalStateException(); }
);
},
x -> { }
);
Flux.just("some", "constant", "data")
.concatWith(otherElements)
.concatWith(Mono.just("another")) // "constant", "data" ...
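Stripped of Reactor, the "push the check down into the operation" idea boils down to one wrapper that reads the state, runs the query, and compares. The following is a blocking plain-Java sketch of that shape only; `OptimisticCheck` and `runChecked` are hypothetical names, and the `AtomicReference` stands in for whatever `getState()` reads.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical helper for illustration; a blocking analogue, not Reactor code.
class OptimisticCheck {
    // Runs the query and fails if the observed state changed while it ran.
    static <S, T> T runChecked(Supplier<S> readState, Supplier<T> query) {
        S before = readState.get();          // precondition snapshot
        T result = query.get();              // the actual data query
        if (!before.equals(readState.get())) {
            throw new IllegalStateException("state changed during query");
        }
        return result;
    }

    public static void main(String[] args) {
        AtomicReference<String> state = new AtomicReference<>("v1");

        // State untouched: the result is returned.
        System.out.println(runChecked(state::get, () -> "other"));

        // State modified while the query runs: the call is rejected.
        try {
            runChecked(state::get, () -> {
                state.set("v2");
                return "other";
            });
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Callers then compose the checked operation like any other, without knowing that a pre-/postcondition exists.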

How can I change the period for Flowable.interval

Is there a way to change the Flowable.interval period at runtime?
LOGGER.info("Start generating bullshit for 7 seconds:");
Flowable.interval(3, TimeUnit.SECONDS)
.map(tick -> random.nextInt(100))
.subscribe(tick -> LOGGER.info("tick = " + tick));
TimeUnit.SECONDS.sleep(7);
LOGGER.info("Change interval to 2 seconds:");
I have a workaround, but the best way would be to create a new operator.
How does this solution work?
You have a trigger source that provides values indicating when to start a new interval. The source is switchMapped with an interval as the inner stream. The inner stream uses the value emitted by the upstream source as the new interval period.
switchMap
When the source emits a time (Long), the switchMap lambda is invoked and the returned Flowable is subscribed to immediately. When a new value arrives at the switchMap, the currently subscribed inner interval Flowable is unsubscribed from and the lambda is invoked again. The newly returned interval Flowable is then subscribed to.
This means that on each emission from the source, a new interval is created.
How does it behave?
When the interval is subscribed to and is about to emit a new value, and a new value is emitted from the source, the inner stream (interval) is unsubscribed from. Therefore that value is no longer emitted. The new interval Flowable is subscribed to and will emit values according to its configuration.
Solution
lateinit var scheduler: TestScheduler
@Before
fun init() {
scheduler = TestScheduler()
}
@Test
fun `62232235`() {
val trigger = PublishSubject.create<Long>()
val switchMap = trigger.toFlowable(BackpressureStrategy.LATEST)
// make sure that a value is emitted from upstream, so that at least one interval emits values when the upstream source does not provide a seed value.
.startWith(3)
.switchMap {
Flowable.interval(it, TimeUnit.SECONDS, scheduler)
.map { tick: Long? ->
tick
}
}
val test = switchMap.test()
scheduler.advanceTimeBy(10, TimeUnit.SECONDS)
test.assertValues(0, 1, 2)
// send new onNext value at absolute time 10
trigger.onNext(10)
// the inner stream is unsubscribed and a new stream with interval(10) is subscribed to. Therefore the first value will be emitted at 20 (current: 10 + 10 configured)
scheduler.advanceTimeTo(21, TimeUnit.SECONDS)
// if the switch did not happen, there would be 7 values
test.assertValues(0, 1, 2, 0)
}
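The arithmetic behind that test can be checked with a small timeline model. This is a plain-Java simulation, not RxJava: `SwitchTimeline` and `tickTimes` are invented for the example, and it assumes that a tick due at the exact instant of a switch still fires before the switch (which matches advancing the scheduler first and triggering afterwards).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of "restart the interval on every switch"; not RxJava code.
class SwitchTimeline {
    // switches[i] = {timeOfSwitch, newPeriod}, sorted by time.
    // Returns the absolute times at which a tick is emitted, up to and including `end`.
    static List<Long> tickTimes(long[][] switches, long end) {
        List<Long> ticks = new ArrayList<>();
        for (int i = 0; i < switches.length; i++) {
            long start = switches[i][0];
            long period = switches[i][1];
            if (period <= 0) {
                throw new IllegalArgumentException("period must be positive");
            }
            // The inner interval lives until the next switch, or until `end` for the last one.
            long until = (i + 1 < switches.length) ? Math.min(switches[i + 1][0], end) : end;
            for (long t = start + period; t <= until; t += period) {
                ticks.add(t);
            }
        }
        return ticks;
    }

    public static void main(String[] args) {
        // Seed period 3 at time 0, switch to period 10 at time 10, observe until 21.
        System.out.println(tickTimes(new long[][] {{0, 3}, {10, 10}}, 21));
        // Same window without the switch, for comparison.
        System.out.println(tickTimes(new long[][] {{0, 3}}, 21));
    }
}
```

With the switch, ticks land at 3, 6, 9 and 20, which is four emissions; without it there would be ticks at 3, 6, ..., 21, which is the seven values mentioned in the test comment.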

How to process all events emitted by RX Java regardless of error?

I'm using the vertx.io web framework to send a list of items to a downstream HTTP server.
records.records() emits 4 records and I have specifically set the web client to connect to the wrong IP/port.
Processing... prints 4 times.
Exception outer! prints 3 times.
If I put back the proper IP/port then Subscribe outer! prints 4 times.
io.reactivex.Flowable
.fromIterable(records.records())
.flatMap(inRecord -> {
System.out.println("Processing...");
// Do stuff here....
Observable<Buffer> bodyBuffer = Observable.just(Buffer.buffer(...));
Single<HttpResponse<Buffer>> request = client
.post(..., ..., ...)
.rxSendStream(bodyBuffer);
return request.toFlowable();
})
.subscribe(record -> {
System.out.println("Subscribe outer!");
}, ex -> {
System.out.println("Exception outer! " + ex.getMessage());
});
UPDATE:
I now understand that on error Rx stops right away. Is there a way to continue and process all records regardless, and get an error for each?
Given this article: https://medium.com/@jagsaund/5-not-so-obvious-things-about-rxjava-c388bd19efbc
I have come up with this... Unless you see something wrong with this?
io.reactivex.Flowable
.fromIterable(records.records())
.flatMap
(inRecord -> {
Observable<Buffer> bodyBuffer = Observable.just(Buffer.buffer(inRecord.toString()));
Single<HttpResponse<Buffer>> request = client
.post("xxxxxx", "xxxxxx", "xxxxxx")
.rxSendStream(bodyBuffer);
// So we can capture how long each request took.
final long startTime = System.currentTimeMillis();
return request.toFlowable()
.doOnNext(response -> {
// Capture total time and print it with the logs. Removed below for brevity.
long processTimeMs = System.currentTimeMillis() - startTime;
int status = response.statusCode();
if(status == 200)
logger.info("Success!");
else
logger.error("Failed!");
}).doOnError(ex -> {
long processTimeMs = System.currentTimeMillis() - startTime;
logger.error("Failed! Exception.", ex);
}).doOnTerminate(() -> {
// Do some extra stuff here...
}).onErrorResumeNext(Flowable.empty()); // This will allow us to continue.
}
).subscribe(); // Don't handle here. We subscribe to the inner events.
Is there a way to continue and process all records regardless and get an error for each?
According to the documentation, an observable terminates when it encounters an error, so you can't receive each error in the outer onError.
You can use onErrorReturn() or onErrorResumeNext() on the inner stream to tell it what to do when it encounters an error (e.g. emit a fallback value or switch to Flowable.empty()).
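The underlying idea is to turn each failure into a value at the per-item level, so that the outer sequence never sees an error and keeps going. Here is a minimal synchronous plain-Java sketch of that idea; it is not Vert.x or RxJava code, and `PerItemErrors`, `sendAll` and the `"ok:"`/`"error:"` outcome strings are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper for illustration; a synchronous analogue of per-item onErrorReturn.
class PerItemErrors {
    // Applies `send` to every record. A failure is recorded for that record
    // and processing continues with the next one.
    static List<String> sendAll(List<String> records, Function<String, String> send) {
        List<String> outcomes = new ArrayList<>();
        for (String record : records) {
            try {
                outcomes.add("ok:" + send.apply(record));
            } catch (RuntimeException ex) {
                // The error becomes a value, so the loop is not aborted.
                outcomes.add("error:" + record + ":" + ex.getMessage());
            }
        }
        return outcomes;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a", "b", "c", "d");
        // Fake sender, invented for this example: fails for record "b" only.
        Function<String, String> send = r -> {
            if (r.equals("b")) {
                throw new IllegalStateException("connection refused");
            }
            return r.toUpperCase();
        };
        sendAll(records, send).forEach(System.out::println);
    }
}
```

In the Rx version the same shape is obtained by placing the error handling operator on the inner Flowable returned from flatMap, not on the outer chain.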

RxJava2: output of 2 separate observables and output of the same observables merged differ

With Snippet1, I can see the sysout from both subscribers.
With Snippet2, I don't see output from the second observable.
Why is the merge not working for me?
Snippet1
x = createQ2Flowable().subscribeOn(Schedulers.computation())
.observeOn(Schedulers.io())
.filter(predicate -> !predicate.toString().contains("<log realm=\"\""))
.subscribe(onNext -> System.out.println("Q2->" + onNext));
y = createMetricsFlowable().subscribeOn(Schedulers.computation())
.observeOn(Schedulers.io())
.subscribe(onNext -> System.out.println("metrics->" + onNext));
Snippet2
createQ2Flowable().mergeWith(createMetricsFlowable())
.subscribeOn(Schedulers.computation())
.subscribe(onNext -> System.out.println(onNext));
[edit]: Added flowable creators
private Flowable<String> createMetricsFlowable() {
return Flowable.create(source -> {
Space sp = SpaceFactory.getSpace("rxObservableFeeder");
while (running()) {
String line = (String) sp.in("RXTmFeeder");
source.onNext(line);
}
}, BackpressureStrategy.BUFFER);
}
private Flowable<String> createQ2Flowable() {
return Flowable.create(source -> {
Space sp = SpaceFactory.getSpace("LoggerSpace");
while (running()) {
LogEvent line = (LogEvent) sp.in("rxLoggingKey");
source.onNext(line.toString());
}
}, BackpressureStrategy.BUFFER);
}
From the comments:
try
createQ2Flowable()
.subscribeOn(Schedulers.computation()) // <-------------------------
.mergeWith(createMetricsFlowable()
.subscribeOn(Schedulers.computation()) // <-------------------------
)
Now I need to know why it happened
Given the detailed implementation, you have two synchronous Flowables. When you merge them, the first Flowable is subscribed to, starts emitting immediately and never gives control back to mergeWith; therefore the second Flowable is never subscribed to.
The subscribeOn after mergeWith is not equivalent to the solution provided above. You have to explicitly subscribe both Flowables on a background thread, so that mergeWith can subscribe to the second Flowable now that the synchronous loop has been moved off the thread mergeWith uses for subscribing to its sources.
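The effect can be reproduced with plain threads. In the sketch below (plain Java, not RxJava; `MergeOnThreads` and its methods are invented for the example) each source emits one item and then blocks until both sources have started. Run one after the other on a single thread, the first source would block forever, just like the endless while loops in the question. Given a thread each, both make progress.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Hypothetical illustration of why each blocking source needs its own thread.
class MergeOnThreads {
    // A blocking source: emits one item, then refuses to continue until BOTH
    // sources have started.
    static Runnable source(String name, CountDownLatch bothStarted, List<String> sink) {
        return () -> {
            sink.add(name + "0");
            bothStarted.countDown();
            try {
                bothStarted.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            sink.add(name + "1");
        };
    }

    // Gives every source its own thread and collects everything into one list.
    static List<String> mergeOnSeparateThreads() {
        List<String> sink = new CopyOnWriteArrayList<>();
        CountDownLatch bothStarted = new CountDownLatch(2);
        Thread q2 = new Thread(source("Q2-", bothStarted, sink));
        Thread metrics = new Thread(source("metrics-", bothStarted, sink));
        q2.start();
        metrics.start();
        try {
            q2.join();
            metrics.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for sources", e);
        }
        return sink;
    }

    public static void main(String[] args) {
        // Order between the two sources may vary from run to run.
        System.out.println(mergeOnSeparateThreads());
    }
}
```

This corresponds to applying subscribeOn to each source before merging: the blocking loop runs on the source's own thread, so the merging side is free to start the next source.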

Rate-limiting multiple observables created by multiple threads using RxJava

I'm developing a simple REST application that leverages RxJava to send requests to a remote server (1). For each incoming request to the REST API, a request is sent (using RxJava and RxNetty) to (1). Everything is working fine, but now I have a new use case:
In order not to bombard (1) with too many requests, I need to implement rate limiting. One way to solve this (I assume) would be to add each Observable created when sending a request to (1) into another Observable (2) that does the actual rate limiting. (2) will then act more or less like a queue and process the outbound requests as fast as possible (but not faster than the rate limit). Here's some pseudo-like code:
Observable<MyResponse> r1 = createRequestToExternalServer() // In thread 1
Observable<MyResponse> r2 = createRequestToExternalServer() // In thread 2
// Somehow send r1 and r2 to the "rate limiter" observable, (2)
rateLimiterObservable.sample(1 / rate, TimeUnit.MILLISECONDS)
How would I use Rx/RxJava to solve this?
I'd use a hot timer along with an atomic counter that keeps track of the remaining connections for the given duration:
int rate = 5;
long interval = 1000;
AtomicInteger remaining = new AtomicInteger(rate);
ConnectableObservable<Long> timer = Observable
.interval(interval, TimeUnit.MILLISECONDS)
.doOnNext(e -> remaining.set(rate))
.publish();
timer.connect();
Observable<Integer> networkCall = Observable.just(1).delay(150, TimeUnit.MILLISECONDS);
Observable<Integer> limitedNetworkCall = Observable
.defer(() -> {
// > 0 rather than != 0: the counter keeps decrementing below zero once the window is exhausted
if (remaining.getAndDecrement() > 0) {
return networkCall;
}
return Observable.error(new RuntimeException("Rate exceeded"));
});
Observable.interval(100, TimeUnit.MILLISECONDS)
.flatMap(t -> limitedNetworkCall.onErrorReturn(e -> -1))
.take(20)
.toBlocking()
.forEach(System.out::println);
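The counter logic can be looked at in isolation. Below is a minimal plain-Java sketch of a fixed-window limiter (`FixedWindowLimiter` is a name invented for the example, not an RxJava class). The window reset is an explicit method that a timer would call, which keeps the example deterministic. Note that this sketch checks `> 0`, because the counter keeps decrementing below zero once the window is used up.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration; the timer is replaced by explicit resetWindow() calls.
class FixedWindowLimiter {
    private final int rate;
    private final AtomicInteger remaining;

    FixedWindowLimiter(int rate) {
        this.rate = rate;
        this.remaining = new AtomicInteger(rate);
    }

    // Called at the start of each window, e.g. from a periodic timer.
    void resetWindow() {
        remaining.set(rate);
    }

    // True if a permit was left in the current window.
    // The counter may go negative, hence the > 0 check.
    boolean tryAcquire() {
        return remaining.getAndDecrement() > 0;
    }

    public static void main(String[] args) {
        FixedWindowLimiter limiter = new FixedWindowLimiter(5);
        for (int i = 0; i < 8; i++) {
            System.out.println("call " + i + " allowed: " + limiter.tryAcquire());
        }
        limiter.resetWindow(); // next window begins
        System.out.println("after reset allowed: " + limiter.tryAcquire());
    }
}
```

A limiter like this rejects excess calls instead of queueing them; delaying the rejected calls until the next window would need an additional queue.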