subscribeOn(Schedulers.parallel()) is not working - reactive-programming

I am learning Reactor Core, following this tutorial: https://www.baeldung.com/reactor-core
ArrayList<Integer> arrList = new ArrayList<Integer>();
System.out.println("Before: " + arrList);
Flux.just(1, 2, 3, 4)
    .log()
    .map(i -> i * 2)
    .subscribeOn(Schedulers.parallel())
    .subscribe(arrList::add);
System.out.println("After: " + arrList);
When I execute the above code, it prints:
Before: []
[DEBUG] (main) Using Console logging
After: []
The code above should start executing on another thread, but it is not working at all. Can somebody help me with this?

As mentioned in the Reactor documentation for the various subscribe methods:
Keep in mind that since the sequence can be asynchronous, this will
immediately return control to the calling thread. This can give the
impression the consumer is not invoked when executing in a main thread
or a unit test for instance.
This means that the end of the main method is reached, and thus the main thread exits before any thread is able to subscribe to the Reactive chain, as mentioned by Piotr.
What you want to do is wait till the entire chain completes before printing the contents of the array.
The naive way of doing this is:
ArrayList<Integer> arrList = new ArrayList<>();
System.out.println("Before: " + arrList);
Flux.just(1, 2, 3, 4)
    .log()
    .map(i -> i * 2)
    .subscribeOn(Schedulers.parallel())
    .doOnNext(arrList::add)
    .blockLast();
System.out.println("After: " + arrList);
Here, you block the main thread until the last element of the Flux is processed, so the final System.out will not execute until your ArrayList is fully populated.
Remember that code runs a little differently in a console application than in a server environment like Netty. The only way to make a console application wait for all subscriptions to complete is to block. But blocking is not permitted on parallel threads, so this approach would not work in, say, a Netty environment. There the server keeps running until it is explicitly shut down, so a plain subscribe would be fine.
However, in the above code snippet you are blocking not just to prevent the application from exiting, but also to wait before you read the data that has been populated.
An improvement to the above code would be as follows:
ArrayList<Integer> arrList = new ArrayList<>();
System.out.println("Before: " + arrList);
Flux.just(1, 2, 3, 4)
    .log()
    .map(i -> i * 2)
    .subscribeOn(Schedulers.parallel())
    .doOnNext(arrList::add)
    .doOnComplete(() -> System.out.println("After: " + arrList))
    .blockLast();
Even here, the doOnComplete accesses data from outside the reactive chain. To prevent this, you would collect the elements of the Flux in the chain itself, like this:
System.out.println("Before.");
Flux.just(1, 2, 3, 4)
.log()
.map(i -> i * 2)
.subscribeOn(Schedulers.parallel())
.collectList()
.doOnSuccess(list -> System.out.println("After: " + list))
.block();
Again, remember that when running in Netty (say, a Spring WebFlux application), the above code would end in a subscribe().
Note, though, that switching from a Flux to a List (or any Collection) means you are switching out of the reactive paradigm into imperative programming. You should be able to implement any functionality within the Reactive paradigm itself.
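To make both points concrete, here is a minimal sketch of what this could look like in a Spring WebFlux controller; the class, mapping, and method names are invented for illustration, and the framework performs the subscription for you:
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;
import reactor.core.scheduler.Schedulers;

@RestController
public class DoublerController {

    // Hypothetical endpoint: the returned Flux is subscribed to by the
    // framework, so there is no block() and the data never leaves the
    // reactive chain until it is written to the response.
    @GetMapping("/doubled")
    public Flux<Integer> doubled() {
        return Flux.just(1, 2, 3, 4)
                .map(i -> i * 2)
                .subscribeOn(Schedulers.parallel());
    }
}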

I think there is some confusion. When you call subscribeOn(Schedulers.parallel()), you specify that you want to receive items on a different thread. You also have to slow your code down so the subscription can actually kick in (that is why I added Thread.sleep(100)). If you run the code below, it works. You see, there is no magic synchronization mechanism in Reactor.
ArrayList<Integer> arrList = new ArrayList<Integer>();
Flux.just(1, 2, 3, 4)
    .log()
    .map(i -> i * 2)
    .subscribeOn(Schedulers.parallel())
    .subscribe(t -> {
        System.out.println(t + " thread id: " + Thread.currentThread().getId());
        arrList.add(t);
    });
System.out.println("size of arrList(before the wait): " + arrList.size());
System.out.println("Thread id: " + Thread.currentThread().getId() + ": id of main thread");
Thread.sleep(100);
System.out.println("size of arrList(after the wait): " + arrList.size());
If you want to add items to the list in parallel, Reactor is not a good choice; Java 8 parallel streams are a better fit:
List<Integer> collect = Stream.of(1, 2, 3, 4)
    .parallel()
    .map(i -> i * 2)
    .collect(Collectors.toList());
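(Note that collect(Collectors.toList()) is safe with a parallel stream: collect accumulates into per-thread containers and merges them afterwards, so no external synchronization is needed, unlike adding to a shared ArrayList from multiple threads.)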
The tutorial you posted is not very precise when it comes to the concurrency part. To the author's credit, he/she says that more articles are to come, but I don't think that particular example should have been posted at all, as it creates confusion. I suggest not trusting resources on the internet too much :)

Related

Threads still alive after subscription

We are currently migrating from RxJava2 to Project Reactor. To parallelize work, we create a new Thread via Schedulers.newThread() for every parallel HTTP request. We cannot reuse threads because Spring's SecurityContext is bound to a ThreadLocal.
Recently we ran into OutOfMemoryErrors after some time, because the created threads were never disposed, leading to thousands of parked threads; with RxJava the threads were disposed after a successful HTTP request.
For RxJava the code would be the following, showing no additional active threads when a breakpoint is set on the last line.
Observable<String> deferred = Observable.fromCallable(() -> "1")
    .subscribeOn(io.reactivex.schedulers.Schedulers.newThread());
Observable<String> deferred2 = Observable.fromCallable(() -> "2")
    .subscribeOn(io.reactivex.schedulers.Schedulers.newThread());
System.out.println("obs: " + deferred.blockingSingle());
System.out.println("obs2: " + deferred2.blockingSingle());
However for Project Reactor, both Threads are still alive after both stdouts.
Mono<String> mono1 = Mono.fromSupplier(() -> "1").subscribeOn(Schedulers.newSingle("single"));
Mono<String> mono2 = Mono.fromSupplier(() -> "2").subscribeOn(Schedulers.newSingle("single"));
System.out.println("Mono1: " + mono1.block());
System.out.println("Mono2: " + mono2.block());
A solution for this would be to tear down the scheduler manually in doFinally:
Scheduler newSingle = Schedulers.newSingle("single");
Mono<String> doFinally = Mono.defer(() -> Mono.fromSupplier(() -> "1"))
    .subscribeOn(newSingle)
    .doFinally(s -> newSingle.dispose());
However, is this really necessary, or is there a way to establish the same behavior as in RxJava2?
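For what it's worth, a slightly tidier variant of that manual teardown (just a sketch of the same idea, not an official recommendation) is Mono.using, which ties the scheduler's lifecycle to the subscription:
Mono<String> mono = Mono.using(
    () -> Schedulers.newSingle("single"),                          // create a scheduler per subscription
    scheduler -> Mono.fromSupplier(() -> "1").subscribeOn(scheduler),
    Scheduler::dispose);                                           // disposed on complete, error, or cancel
System.out.println("Mono: " + mono.block());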

Is there a better way to log information in Scala than using a function?

I'm writing a web service and am often sending messages over HTTPS.
val response = sendJson(JsonUtil.getTextMessageJson("Sorry I don't understand what day you want. You can say " + "things like \"tomorrow\" or \"next Friday\""))
I could follow this call with onComplete {} and handle the resulting Success or Failure, but since I do this so often, I have written a simple function in a helper class:
def logSendResult(response: Future[WSResponse])(implicit userId: String): Unit = {
  response onComplete {
    case Success(res) => Logger.info("Message to " + userId + " sent successfully with " +
      "response code: " + res.status)
    case Failure(exception) => Logger.info("Message to " + userId + " failed with " +
      "exception: " + exception.getMessage)
  }
}
Which I then call with:
LogUtils.logSendResult(response)
This is working fine but I was wondering if there is a better way?
I believe you have a cross-cutting concern, which can cause problems if not handled properly: https://en.wikipedia.org/wiki/Cross-cutting_concern
AOP (aspect-oriented programming) addresses cross-cutting concerns.
Here is a sample AOP program using the AspectJ library in Scala. The sample aspect prints the method, its input, and its result whenever a method annotated with @Loggable is entered and finishes execution:
https://github.com/knoldus/CrossCuttingConcern_Scala/blob/master/src/main/scala/com/knoldus/aspect/Aspect.scala
Hope this helps. Best of luck!
HTTP Filters
This sounds like a good application of a Filter: https://www.playframework.com/documentation/2.5.x/ScalaHttpFilters
In the example they even show how to build a logging Filter.

Aggregate resource requests & dispatch responses to each subscriber

I'm fairly new to RxJava and struggling with a use case that seems quite common to me:
Gather multiple requests from different parts of the application, aggregate them, make a single resource call and dispatch the results to each subscriber.
I've tried a lot of different approaches, using subjects, connectable observables, deferred observables... none did the trick so far.
I was quite optimistic about this approach, but it turns out it fails just like the others:
//(...)
static HashMap<String, String> requests = new HashMap<>();
//(...)

@Test
public void myTest() throws InterruptedException {
    TestScheduler scheduler = new TestScheduler();
    Observable<String> interval = Observable.interval(10, TimeUnit.MILLISECONDS, scheduler)
        .doOnSubscribe(() -> System.out.println("new subscriber!"))
        .doOnUnsubscribe(() -> System.out.println("unsubscribed"))
        .filter(l -> !requests.isEmpty())
        .doOnNext(aLong -> System.out.println(requests.size() + " requests to send"))
        .flatMap(aLong -> {
            System.out.println("requests " + requests);
            return Observable.from(requests.keySet()).take(10).distinct().toList();
        })
        .doOnNext(strings -> System.out.println("calling aggregate for " + strings + " (from " + requests + ")"))
        .flatMap(Observable::from)
        .doOnNext(s -> {
            System.out.println("----");
            System.out.println("removing " + s);
            requests.remove(s);
        })
        .doOnNext(s -> System.out.println("remaining " + requests));

    TestSubscriber<String> ts1 = new TestSubscriber<>();
    TestSubscriber<String> ts2 = new TestSubscriber<>();
    TestSubscriber<String> ts3 = new TestSubscriber<>();
    TestSubscriber<String> ts4 = new TestSubscriber<>();

    Observable<String> defer = buildObservable(interval, "1");
    defer.subscribe(ts1);
    Observable<String> defer2 = buildObservable(interval, "2");
    defer2.subscribe(ts2);
    Observable<String> defer3 = buildObservable(interval, "3");
    defer3.subscribe(ts3);
    scheduler.advanceTimeBy(200, TimeUnit.MILLISECONDS);
    Observable<String> defer4 = buildObservable(interval, "4");
    defer4.subscribe(ts4);
    scheduler.advanceTimeBy(100, TimeUnit.MILLISECONDS);

    ts1.awaitTerminalEvent(1, TimeUnit.SECONDS);
    ts2.awaitTerminalEvent(1, TimeUnit.SECONDS);
    ts3.awaitTerminalEvent(1, TimeUnit.SECONDS);
    ts4.awaitTerminalEvent(1, TimeUnit.SECONDS);

    ts1.assertValue("1");
    ts2.assertValue("2"); // fails (test stops here)
    ts3.assertValue("3"); // fails
    ts4.assertValue("4"); // fails
}

public Observable<String> buildObservable(Observable<String> interval, String key) {
    return Observable.defer(() -> {
        System.out.printf("creating observable for key " + key);
        return Observable.create(subscriber -> {
            requests.put(key, "xxx");
            interval.doOnNext(s -> System.out.println("filtering : key/val " + key + "/" + s))
                .filter(s1 -> s1.equals(key))
                .doOnError(subscriber::onError)
                .subscribe(s -> {
                    System.out.println("intern " + s);
                    subscriber.onNext(s);
                    subscriber.onCompleted();
                    subscriber.unsubscribe();
                });
        });
    });
}
Output:
creating observable for key 1new subscriber!
creating observable for key 2new subscriber!
creating observable for key 3new subscriber!
3 requests to send
requests {3=xxx, 2=xxx, 1=xxx}
calling aggregate for [3, 2, 1] (from {3=xxx, 2=xxx, 1=xxx})
----
removing 3
remaining {2=xxx, 1=xxx}
filtering : key/val 1/3
----
removing 2
remaining {1=xxx}
filtering : key/val 1/2
----
removing 1
remaining {}
filtering : key/val 1/1
intern 1
creating observable for key 4new subscriber!
1 requests to send
requests {4=xxx}
calling aggregate for [4] (from {4=xxx})
----
removing 4
remaining {}
filtering : key/val 1/4
The test fails at the second assertion (ts2 not receiving "2")
Turns out the pseudo-aggregation works as expected, but the values are not dispatched to the corresponding subscribers (only the first subscriber receives it)
Any idea why?
Also, I feel like I'm missing the obvious here. If you think of a better approach, I'm more than willing to hear about it.
EDIT: Adding some context regarding what I want to achieve.
I have a REST API exposing data via multiple endpoints (eg. user/{userid}). This API also makes it possible to aggregate requests (eg. user/user1 & user/user2) and get the corresponding data in one single http request instead of two.
My goal is to be able to automatically aggregate the requests made from different parts of my application in a given time frame (say 10ms) with a max batch size (say 10), make an aggregate http request, then dispatch the results to the corresponding subscribers.
Something like this:
// NOTE: those calls can be fired from anywhere in the app, and randomly combined. The timing and order is completely unpredictable
//ts : 0ms
api.call(userProfileRequest1).subscribe(this::show);
api.call(userProfileRequest2).subscribe(this::show);
//--> after 10ms, should fire one single http aggregate request with those 2 calls, map the response items & send them to the corresponding subscribers (that will show the right user profile)
//ts : 20ms
api.call(userProfileRequest3).subscribe(this::show);
api.call(userProfileRequest4).subscribe(this::show);
api.call(userProfileRequest5).subscribe(this::show);
api.call(userProfileRequest6).subscribe(this::show);
api.call(userProfileRequest7).subscribe(this::show);
api.call(userProfileRequest8).subscribe(this::show);
api.call(userProfileRequest9).subscribe(this::show);
api.call(userProfileRequest10).subscribe(this::show);
api.call(userProfileRequest11).subscribe(this::show);
api.call(userProfileRequest12).subscribe(this::show);
//--> should fire a single http aggregate request RIGHT AWAY (we hit the max batch size) with the 10 items, map the response items & send them to the corresponding subscribers (that will show the right user profile)
The test code I wrote (with just strings) and pasted at the top of this question is meant to be a proof of concept for my final implementation.
Your Observable is not well constructed
public Observable<String> buildObservable(Observable<String> interval, String key) {
    return interval
        .doOnSubscribe(() -> System.out.printf("creating observable for key " + key))
        .doOnSubscribe(() -> requests.put(key, "xxx"))
        .doOnNext(s -> System.out.println("filtering : key/val " + key + "/" + s))
        .filter(s1 -> s1.equals(key));
}
Subscribing inside a subscriber is often a sign of bad design.
I'm not sure I understand exactly what you want to achieve, but I think my code should be pretty close to yours.
Please note that for all side effects I use the do-methods (like doOnNext, doOnSubscribe) to show explicitly that I want a side effect.
I replaced your defer call by returning the interval directly: since you want to emit all interval events in the custom observable built inside your defer call, returning the interval observable is better.
Also note that you are filtering your interval Observable:
Observable<String> interval = Observable.interval(10, TimeUnit.MILLISECONDS, scheduler)
    .filter(l -> !requests.isEmpty())
    // ...
So, as soon as the requests map becomes empty, the interval will stop emitting.
I don't understand what you want to achieve with the requests map, but note that you may want to avoid side effects, and updating this map is clearly a side effect.
Update regarding comments
You may want to use the buffer operator to aggregate requests, and then perform the requests in bulk:
PublishSubject<String> subject = PublishSubject.create();
TestScheduler scheduler = new TestScheduler();
Observable<Pair> broker = subject.buffer(100, TimeUnit.MILLISECONDS, 10, scheduler)
    .flatMapIterable(list -> list) // you can bulk calls here
    .flatMap(id -> Observable.fromCallable(() -> api.call(id)).map(response -> Pair.of(id, response)));

TestSubscriber<Object> ts1 = new TestSubscriber<>();
TestSubscriber<Object> ts2 = new TestSubscriber<>();
TestSubscriber<Object> ts3 = new TestSubscriber<>();
TestSubscriber<Object> ts4 = new TestSubscriber<>();

broker.filter(pair -> pair.id.equals("1")).take(1).map(pair -> pair.response).subscribe(ts1);
broker.filter(pair -> pair.id.equals("2")).take(1).map(pair -> pair.response).subscribe(ts2);
broker.filter(pair -> pair.id.equals("3")).take(1).map(pair -> pair.response).subscribe(ts3);
broker.filter(pair -> pair.id.equals("4")).take(1).map(pair -> pair.response).subscribe(ts4);

subject.onNext("1");
subject.onNext("2");
subject.onNext("3");
scheduler.advanceTimeBy(1, TimeUnit.SECONDS);
ts1.assertValue("resp1");
ts2.assertValue("resp2");
ts3.assertValue("resp3");
ts4.assertNotCompleted();

subject.onNext("4");
scheduler.advanceTimeBy(1, TimeUnit.SECONDS);
ts4.assertValue("resp4");
ts4.assertCompleted();
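The key design point here is that the PublishSubject acts as a single shared broker: every caller pushes its id into the subject, and each subscriber filters the broker stream for the pair matching its own id, so one bulk pass can serve many subscribers.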
If you want to perform network request collapsing, you may want to check out Hystrix: https://github.com/Netflix/Hystrix
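Hystrix's request collapser follows the same idea as the buffer-based broker above: calls made within a short time window are gathered into one batch command, and the batch response is then mapped back to the individual callers.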

Rate-limiting multiple observables created by multiple threads using RxJava

I'm developing a simple REST application that leverages RxJava to send requests to a remote server (1). For each incoming request to the REST API, a request is sent (using RxJava and RxNetty) to (1). Everything is working fine, but now I have a new use case:
In order not to bombard (1) with too many requests, I need to implement rate limiting. One way to solve this (I assume) would be to add each Observable created when sending a request to (1) into another Observable (2) that does the actual rate limiting. (2) would then act more or less like a queue and process the outbound requests as fast as possible (but not faster than the rate limit). Here's some pseudo-like code:
Observable<MyResponse> r1 = createRequestToExternalServer() // In thread 1
Observable<MyResponse> r2 = createRequestToExternalServer() // In thread 2
// Somehow send r1 and r2 to the "rate limiter" observable, (2)
rateLimiterObservable.sample(1 / rate, TimeUnit.MILLISECONDS)
How would I use Rx/RxJava to solve this?
I'd use a hot timer along with an atomic counter that keeps track of the remaining calls allowed in the current interval:
int rate = 5;         // max calls per interval
long interval = 1000; // interval length in ms
AtomicInteger remaining = new AtomicInteger(rate);

// Hot timer: refills the budget at the start of every interval.
ConnectableObservable<Long> timer = Observable
    .interval(interval, TimeUnit.MILLISECONDS)
    .doOnNext(e -> remaining.set(rate))
    .publish();
timer.connect();

Observable<Integer> networkCall = Observable.just(1).delay(150, TimeUnit.MILLISECONDS);

// defer() checks the budget at subscription time: within budget the real
// call runs, otherwise the subscriber gets an error immediately.
Observable<Integer> limitedNetworkCall = Observable
    .defer(() -> {
        if (remaining.getAndDecrement() != 0) {
            return networkCall;
        }
        return Observable.error(new RuntimeException("Rate exceeded"));
    });

// Demo: fire a call every 100 ms; rejected calls surface as -1.
Observable.interval(100, TimeUnit.MILLISECONDS)
    .flatMap(t -> limitedNetworkCall.onErrorReturn(e -> -1))
    .take(20)
    .toBlocking()
    .forEach(System.out::println);
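Note that this sketch sheds load rather than queueing it: calls beyond the per-interval budget fail fast and are mapped to -1 by onErrorReturn. If every request must eventually go through, one common alternative (a sketch added here, not part of the original answer) is to zip the work stream with a ticking interval, so items are released at the tick rate:
// Each work item is paired with one tick, so emissions are paced at one
// item per 200 ms regardless of how fast work arrives. Beware that zip
// buffers pending work items in memory if they arrive faster than ticks.
Observable<Integer> work = Observable.range(1, 20);
Observable<Long> ticks = Observable.interval(200, TimeUnit.MILLISECONDS);
Observable<Integer> throttled = Observable.zip(work, ticks, (item, tick) -> item);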

How to use Couchbase as a FIFO queue

With the Java client, how can I use Couchbase to implement a thread-safe FIFO queue? There can be many threads popping from the queue and pushing into the queue. Each object in the queue is a String[].
Couchbase doesn't have any built-in functionality for creating queues, but you can build one yourself. I'll explain how in the short example below.
Say we have a queue named queue whose items are stored under keys of the form <queue_name>:item:<index>, where the index lives under a separate key, queue:index, which you increment while pushing to the queue and decrement while popping.
In Couchbase you can use the increment and decrement operations to implement this, because those operations are atomic and thread-safe.
So your push and pop functions will look like this:
void push(String queue, String[] value) {
    int index = couchbase.increment(queue + ":index");
    couchbase.set(queue + ":item:" + index, value);
}

String[] pop(String queue) {
    int index = couchbase.get(queue + ":index");
    String[] result = couchbase.get(queue + ":item:" + index);
    couchbase.decrement(queue + ":index");
    return result;
}
Sorry for the code; I used Java and the Couchbase Java client a long time ago. If the Java client now has callbacks, like the Node.js client, you can rewrite this code to use them, which would be better, I think.
You can also add an extra check to the set operation: use the add operation (in the C# client it's called StoreMode.Add), which fails if an item with the given key already exists. You can catch that failure and call the push function again with the same arguments.
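A sketch of that retry, assuming a hypothetical client handle whose add returns false when the key is already taken (exact signatures vary across client versions):
// Sketch only: "couchbase" is a hypothetical handle whose add(key, value)
// claims the key only if it does not exist yet.
void safePush(String queue, String[] value) {
    while (true) {
        int index = couchbase.increment(queue + ":index");
        if (couchbase.add(queue + ":item:" + index, value)) {
            return; // slot claimed successfully
        }
        // key already existed: another writer got here first, retry with a fresh index
    }
}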
UPD: I'm sorry, it was too early in the morning and I couldn't think clearly.
For FIFO, as @avsej said, you'll need two counters: queue:head and queue:tail. So for FIFO:
void push(String queue, String[] value) {
    int index = couchbase.increment(queue + ":tail");
    couchbase.set(queue + ":item:" + index, value);
}

String[] pop(String queue) {
    int index = couchbase.increment(queue + ":head") - 1;
    String[] result = couchbase.get(queue + ":item:" + index);
    return result;
}
Note: the code may look slightly different depending on the starting values of queue:tail and queue:head (whether they start at zero, one, or something else).
You can also set a max value for the counters; after reaching it, queue:tail and queue:head are reset to 0 (just to limit the number of documents). You can also set an expiry on each document if you actually need it.
Couchbase already has a CouchbaseQueue data structure.
Example usage, taken from the SDK documentation linked below:
Queue<String> shoppingList = new CouchbaseQueue<String>("queueDocId", collection, String.class, QueueOptions.queueOptions());
shoppingList.add("loaf of bread");
shoppingList.add("container of milk");
shoppingList.add("stick of butter");

// What does the JSON document look like?
System.out.println(collection.get("queueDocId").contentAsArray());
//=> ["stick of butter","container of milk","loaf of bread"]

String item;
while ((item = shoppingList.poll()) != null) {
    System.out.println(item);
    // => loaf of bread
    // => container of milk
    // => stick of butter
}

// What does the JSON document look like after draining the queue?
System.out.println(collection.get("queueDocId").contentAsArray());
//=> []
Java SDK 3.1 CouchbaseQueue Doc