Does disposing a Disposable guarantee no future calls? - rx-java2

Does disposing the Disposable returned from Observable#subscribe(Consumer) guarantee that the Consumer will receive no further calls to accept? If not, how can I obtain such a guarantee?
If it makes a difference, the Observable is observed and the Disposable is disposed in the same single threaded Scheduler.

Does disposing the Disposable returned from Observable#subscribe(Consumer) guarantee that the Consumer will receive no further calls to accept?
There is no such hard guarantee. RxJava 2 does its best effort to stop an emission upon an asynchronous cancellation, but it is possible the thread issuing the cancellation is delayed ever so slightly that from an outside point of view, one or more items still slipped through.
If not, how can I obtain such a guarantee?
You can make things more eager and mutually exclusive but in the asynchronous world it is often difficult to tell what happened first or during an action.
If it makes a difference, the Observable is observed and the Disposable is disposed in the same single threaded Scheduler.
If the cancellation happens in-sequence, such as using take(2), a synchronously connected upstream will stop sending items: range(1, 5).take(2) will never try to emit 3, 4, 5. However,
range(1, 5)
.map(v -> v + 1)
.observeOn(io())
.take(2)
.subscribe()
may run through 1..5 and perform the mapping before take even sees the value 2.
Depending on what type of flow you have, sending a task to a scheduler to dispose a running sequence on that scheduler may never execute as it is blocked by the task currently emitting, thus the following will not stop the sequence:
var single = Schedulers.single();
var dispose = single.scheduleDirect(() -> {
while (!Thread.currentThread().isInterrupted()) {
System.out.println("Processing...");
}
});
// this will not execute
single.scheduleDirect(() -> dispose.dispose());
// but this will stop the processing
dispose.dispose();
Thread.sleep(1000);

Related

Background task in reactive pipeline (Fire-and-forget)

I have a reactive pipeline to process incoming requests. For each request I need to call a business-relevant function (doSomeRelevantProcessing).
After that is done, I need to notify some external service about what happened. That part of the pipeline should not increase the overall response time.
Also, notifying this external system is not business critical: giving a quick response after the main part of the pipeline is finished is more important than making sure the notification is successful.
As far as I learned, the only way to run something in the background without slowing down the overall process is to subscribe to in directly in the pipeline, thus achieving a fire-and-forget mentality.
Is there a good alternative to subscribing inside the flatmap?
I am a little worried about what might happen if notifying the external service takes longer than the original processing and a lot of requests are coming in at once. Could this lead to a memory exhaustion or the overall process to block?
fun runPipeline(incoming: Mono<Request>) = incoming
.flatMap { doSomeRelevantProcessing(it) } // this should not be delayed
.flatMap { doBackgroundJob(it) } // this can take a moment, but is not super critical
fun doSomeRelevantProcessing(request: Request) = Mono.just(request) // do some processing
fun doBackgroundJob(request: Request) = Mono.deferContextual { ctx: ContextView ->
val notification = "notification" // build an object from context
// this uses non-blocking HTTP (i.e. webclient), so it can take a second or so
notifyExternalService(notification).subscribeOn(Schedulers.boundedElastic()).subscribe()
Mono.just(Unit)
}
fun notifyExternalService(notification: String) = Mono.just(Unit) // might take a while
I'm answering this assuming that you notify the external service using purely reactive mechanisms - i.e. you're not wrapping a blocking service. If you are then the answer would be different as you're bound by the size of your bounded elastic thread pool, which could quickly become overwhelmed if you have hundreds of requests a second incoming.
(Assuming you're using reactive mechanisms, then there's no need for .subscribeOn(Schedulers.boundedElastic()) as you give in your example, as that's not buying you anything - it's designed for wrapping legacy blocking services.)
Could this lead to a memory exhaustion
It's only a possibility in really extreme cases, the memory used by each individual request will be tiny. It's almost certainly not worth worrying about, if you start seeing memory issues here then you'll almost certainly be hit by other issues elsewhere.
That being said, I'd probably recommend adding .timeout(Duration.ofSeconds(5)) or similar before your inner subscribe method to make sure the requests are killed off after a while if they haven't worked for any reason - this will prevent them building up.
...or [can this cause] the overall process to block?
This one is easier - a short no, it can't.

Difference between map and subscribe on Mono\Flux?

Am I right to assume that "map" could essentially be a "subscribe" with a return type . They both seem to get called asynchronously when the promise gets resolved ?
For example , if I dispatch a list of 3 Async calls concurrently, would applying the map operation in manner below be blocking ?
Flux.merge(albums.stream().map(album -> {
Mono<CoverResponse> responseMono = clientRequestHandler.makeAsyncCall()
//2.call and handler for async call
return responseMono
.map(response -> processResponse());
}).collect(Collectors.toList())).then(Mono.just(monoResponse));
In the above snippet, is each map operation going to be blocking? If say, the first call takes 5 ms to to return and every other call takes 2 ms to return , are we going to wait 3ms+2ms+2m = 7ms for the enitre operation ? or just 3ms since once the first call gets resolved , the 2ms calls are already resolved by then.
First of all, nothing will happen until someone subscribes. Subscribe is the last thing in the chain, that will trigger the start of all the events.
Second of all you need to understand the difference of running something in parallell vs running something non-blocking.
to resolve your first map, it needs to make the rest call, then with the response it needs to do your second map. These two wont be run in parallell.
Your responseMono.map can't be run until Mono<Response> responseMono actually has something in it. Think of it as a Promise that will signal the application when it has been resolved.
Or you can think of it as a chain of callbacks.
So in your example you are doing a clientRequestHandler.makeAsyncCall() but you are returning a Mono<CoverResponse> the next part the responseMono.map wont trigger, until there is a CoverResponse in the mono. So your "async" call, is probably async, but still adheres to list order since its all in a sequential stream.
But map is a mapping function. It takes something out of the box, performs a computation on something in the box and then returns the new value or type.
What makes reactive better than other options is that when you do your side effect the remote call somewhere that takes time, the thread that is processing this wont hang around and wait for the external request to finish, it will start doing other things, like processing other requests.
Then when the Mono<Response> signals the system that there is "something in the box" a response, then the same thread or any other available thread will keep on processing the request.

Meaning of async in DispatchQueue

I'm watching the concurrent programming talk from WWDC, found here and I'm a little confused about the meaning of async. In Javascript world, we use "async" to describe processes that happen "out of order" or practically, processes that are not guaranteed to return before the next line of code is executed. However, in this talk it seems as though what the presenter is demonstrating is in fact a series of tasks being executed one after the other (ie. synchronous behavior)
Then in this code example
let queue = DispatchQueue(label: "com.example.imagetransform")
queue.async {
let smallImage = image.resize(to: rect)
DispatchQueue.main.async {
imageView.image = smallImage
}
}
This DispatchQueue doesn't seem to behave like the FIFO data structure it's supposed to be on first glance. I see that they are passing in a closure into the queue.async method, but I'm not sure how to add more blocks, or tasks, for it to execute.
I'm guessing the async nature of this is similar to Javascript callbacks in the sense that all the variables being assigned in the closure are only scoped to that closure and we can only act on that data within the closure.
DispatchQueues are FIFO with regard to the start of tasks. If they are concurrent queues, or if you're submitting async tasks, then there's no FIFO guarantee on when they finish
Concurrent queues allow multiple tasks to run at the same time. Tasks are guaranteed to start in the order they were added. Tasks can finish in any order and you have no knowledge of the time it will take for the next task to start, nor the number of tasks that are running at any given time.
- https://www.raywenderlich.com/148513/grand-central-dispatch-tutorial-swift-3-part-1

priority of Dispatch Queues in swift 3

I have read the tutorial about GCD and Dispatch Queue in Swift 3
But I'm really confused about the order of synchronous execution and asynchronous execution and main queue and background queue.
I know that if we use sync then we execute them one after the precious one, if we use async then we can use QoS to set their priority, but how about this case?
func queuesWithQoS() {
let queue1 = DispatchQueue(label: "com.appcoda.myqueue1")
let queue2 = DispatchQueue(label: "com.appcoda.myqueue2")
for i in 1000..<1005 {
print(i)
}
queue1.async {
for i in 0..<5{
print(i)
}
}
queue2.sync {
for i in 100..<105{
print( i)
}
}
}
The outcome shows that we ignore the asynchronous execution. I know queue2 should be completed before queue1 since it's synchronous execution but why we ignore the asynchronous execution and
what is the actual difference between async, sync and so-called main queue?
You say:
The outcome shows that we ignore the asynchronous execution. ...
No, it just means that you didn't give the asynchronously dispatched code enough time to get started.
I know queue2 should be completed before queue1 since it's synchronous execution ...
First, queue2 might not complete before queue1. It just happens to. Make queue2 do something much slower (e.g. loop through a few thousand iterations rather than just five) and you'll see that queue1 can actually run concurrently with respect to what's on queue2. It just takes a few milliseconds to get going and the stuff on your very simple queue2 is finishing before the stuff on queue1 gets a chance to start.
Second, this behavior is not technically because it's synchronous execution. It's just that async takes a few milliseconds to get it's stuff running on some worker thread, whereas the synchronous call, because of optimizations that I won't bore you with, gets started more quickly.
but why we ignore the asynchronous execution ...
We don't "ignore" it. It just takes a few milliseconds to get started.
and what is the actual difference between async, sync and so-called main queue?
"Async" merely means that the current thread may carry on and not wait for the dispatched code to run on some other thread. "Sync" means that the current thread should wait for the dispatched code to finish.
The "main thread" is a different topic and simply refers to the primary thread that is created for driving your UI. In practice, the main thread is where most of your code runs, basically running everything except that which you manually dispatch to some background queue (or code that is dispatched there for you, e.g. URLSession completion handlers).
sync and async are related to the same thread / queue. To see the difference please run this code:
func queuesWithQoS() {
let queue1 = DispatchQueue(label: "com.appcoda.myqueue1")
queue1.async {
for i in 0..<5{
print(i)
}
}
print("finished")
queue1.sync {
for i in 0..<5{
print(i)
}
}
print("finished")
}
The main queue is the thread the entire user interface (UI) runs on.
First of all I prefer to use the term "delayed" instead of "ignored" about the code execution, because all your code in your question is executed.
QoS is an enum, the first class means the highest priority, the last one the lowest priority, when you don't specify any priority you have a queue with default priority and default is in the middle:
userInteractive
userInitiated
default
utility
background
unspecified
Said that, you have two synchronous for-in loops and one async, where the priority is based by the position of the loops and the kind of the queues (sync/async) in the code because here we have 3 different queues (following the instructions about your link queuesWithQoS() could be launched in viewDidAppearso we can suppose is in the main queue)
The code show the creation of two queues with default priority, so the sequence of the execution will be:
the for-in loop with 1000..<1005 in the main queue
the synchronous queue2 with default priority
the asynchronous queue1 (not ignored, simply delayed) with default priority
Main queue have always the highest priority where all the UI instructions are executed.

Practical use of futures? Ie, how to kill them?

Futures are very convenient, but in practice, you may need some guarantees on their execution. For example, consider:
import scala.actors.Futures._
def slowFn(time:Int) = {
Thread.sleep(time * 1000)
println("%d second fn done".format(time))
}
val fs = List( future(slowFn(2)), future(slowFn(10)) )
awaitAll(5000, fs:_*)
println("5 second expiration. Continuing.")
Thread.sleep(12000) // ie more calculations
println("done with everything")
The idea is to kick off some slow running functions in parallel. But we wouldn't want to hang forever if the functions executed by the futures don't return. So we use awaitAll() to put a timeout on the futures. But if you run the code, you see that the 5 second timer expires, but the 10 second future continues to run and returns later. The timeout doesn't kill the future; it just limits the join wait.
So how do you kill a future after a timeout period? It seems like futures can't be used in practice unless you're certain that they will return in a known amount of time. Otherwise, you run the risk of losing threads in the thread pool to non-terminating futures until there are none left.
So the questions are: How do you kill futures? What are the intended usage patterns for futures given these risks?
Futures are intended to be used in settings where you do need to wait for the computation to complete, no matter what. That's why they are described as being used for slow running functions. You want that function's result, but you have other stuff you can be doing meanwhile. In fact, you might have many futures, all independent of each other that you may want to run in parallel, while you wait until all complete.
The timer just provides a wait to get partial results.
I think the reason Future can't simply be "killed" is exactly the same as why java.lang.Thread.stop() is deprecated.
While Future is running, a Thread is required. In order to stop a Future without calling stop() on the executing Thread, application specific logic is needed: checking for an application specific flag or the interrupted status of the executing Thread periodically is one way to do it.