How to drop new elements if an observer is busy? - swift

I have an observable that regularly emits elements. On those elements, I perform one fast and one slow operation. I want the slow observer to drop new elements while it is busy. Is there any way to achieve this with Rx instead of keeping a flag in the slow operation?
I am very new to Reactive Extensions, so please correct me if any of my assumptions are wrong.
let tick = Observable<Int>.interval(.seconds(1),
                                    scheduler: SerialDispatchQueueScheduler(qos: .background)).share()
tick.subscribe {
    print("fast observer \($0)")
}.disposed(by: disposeBag)
// observing in another queue so that it does not block the source
tick.observeOn(SerialDispatchQueueScheduler(qos: .background))
    .subscribe {
        print("slow observer \($0)")
        sleep(3) // cpu-intensive task
    }.disposed(by: disposeBag)

For this, flatMap is your friend. Whenever you want to drop events (either the current one when a new one comes in, or subsequent ones while working on the current one) use flatMap. More information can be found in my article: RxSwift’s Many Faces of FlatMap
Here you go:
let tick = Observable<Int>.interval(.seconds(1), scheduler: MainScheduler.instance).share()
func cpuLongRunningTask(_ input: Int) -> Observable<Int> {
    return Observable.create { observer in
        print("start task")
        sleep(3)
        print("finish task")
        observer.onNext(input)
        observer.onCompleted()
        return Disposables.create { /* cancel the task if possible */ }
    }
}
tick
    .subscribe {
        print("fast \($0)")
    }
    .disposed(by: disposeBag)
tick
    .flatMapFirst {
        // subscribing in another scheduler so that it does not block the source
        cpuLongRunningTask($0)
            .subscribeOn(SerialDispatchQueueScheduler(qos: .background))
    }
    .observeOn(MainScheduler.instance) // make sure the print happens on the main thread
    .subscribe {
        print("slow \($0)")
    }
    .disposed(by: disposeBag)
Sample output as follows:
fast next(0)
start task
fast next(1)
fast next(2)
fast next(3)
finish task
slow next(0)
fast next(4)
start task
fast next(5)
fast next(6)
fast next(7)
finish task
slow next(4) <-- slow ignored the 1, 2, and 3 values.

I'm afraid there is not a straightforward solution. The issue you describe is related to backpressure and unfortunately, RxSwift does not provide support for it (Apple Combine does). Usually, you will have to handle this situation manually by using one of the filtering operators: debounce, throttle or filter.
By using debounce or throttle you would need to know the exact duration of the operation which probably is not always the case.
By using filter, as you said, you could check for a flag you set before starting the long-running operation.
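The flag-based filter approach could look roughly like the sketch below. It assumes RxSwift 5 (to match the observeOn spelling used in the question). The helpers tryStart and finish are hypothetical names, and the lock is there because the flag is read on the timer's queue and reset on the worker's queue.

```swift
import Foundation
import RxSwift

let disposeBag = DisposeBag()

// Shared one-second timer, as in the question.
let tick = Observable<Int>
    .interval(.seconds(1), scheduler: SerialDispatchQueueScheduler(qos: .background))
    .share()

// Hypothetical helpers: a busy flag guarded by a lock.
let busyLock = NSLock()
var isBusy = false

/// Claims the slow worker. Returns false if it is already busy.
func tryStart() -> Bool {
    busyLock.lock()
    defer { busyLock.unlock() }
    guard !isBusy else { return false }
    isBusy = true
    return true
}

/// Releases the slow worker so that the next element is accepted.
func finish() {
    busyLock.lock()
    isBusy = false
    busyLock.unlock()
}

tick
    .filter { _ in tryStart() } // drop elements while the worker is busy
    .observeOn(SerialDispatchQueueScheduler(qos: .background))
    .subscribe(onNext: { value in
        defer { finish() }
        print("slow observer \(value)")
        sleep(3) // stand-in for the CPU-intensive task
    })
    .disposed(by: disposeBag)
```

The filter closure has a side effect here (it claims the worker), which is the price of keeping the flag outside the Rx chain. The process has to stay alive, for example in an app or a playground, for the timer to fire.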

Related

Is there a Combine-y way to stop multiple requests being triggered?

I currently have code like...
var isFetching = false
func fetch() {
    guard !isFetching else { return }
    isFetching = true
    apiPublisher
        .receive(on: DispatchQueue.main)
        .handleEvents(receiveCompletion: { [weak self] _ in
            self?.isFetching = false
        })
        .assign(to: \.foo, on: self)
        .store(in: &cancellables)
}
But I'm not happy with the way I'm stopping multiple requests happening like this. It works but feels clunky.
I feel like there should be a more "Combine-y" way of doing this.
Is there a more elegant/Combine-y way of doing this?
The issue here is that you have the entire Publisher subscription chain in a function (fetch()) that can be called by virtually any other code at virtually any time from any thread. To do this in a more "Combine-y" way, you need to limit that. So the question is: what calls this fetch() function? Wrap all those call sites in a Publisher (a trigger), then set up trigger.flatMap { apiPublisher } so that the flatMap ignores events while it is waiting for the current network request to complete.
Something like this:
trigger
    .flatMap(maxPublishers: .max(1)) { _ in
        // at most one apiPublisher subscription is active at a time
        apiPublisher
    }
    .receive(on: DispatchQueue.main)
    .assign(to: \.foo, on: self)
    .store(in: &cancellables)
The important thing to note here is that you only execute this code once (probably in viewDidLoad if this is a view controller). Then, when you want to make the request, you send an event through the trigger Publisher. The flatMap will ensure that only one apiPublisher subscription is active at a time.
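A minimal sketch of this pattern, assuming the trigger is a PassthroughSubject, which discards values sent while downstream demand is zero. The names started, received, and inFlight are hypothetical. inFlight is a hand-driven subject standing in for apiPublisher so that the example runs synchronously.

```swift
import Combine

let trigger = PassthroughSubject<Int, Never>()
var inFlight: PassthroughSubject<String, Never>? // stand-in for apiPublisher
var started: [Int] = []
var received: [String] = []

let cancellable = trigger
    .flatMap(maxPublishers: .max(1)) { id -> AnyPublisher<String, Never> in
        // Record which trigger events actually start a request.
        started.append(id)
        let request = PassthroughSubject<String, Never>()
        inFlight = request
        return request.eraseToAnyPublisher()
    }
    .sink { received.append($0) }

trigger.send(1)                       // starts request 1
trigger.send(2)                       // sent while request 1 is still in flight
inFlight?.send("response 1")
inFlight?.send(completion: .finished) // frees the single flatMap slot
trigger.send(3)                       // starts request 3

print(started, received)
```

In this sketch the second event is expected to be discarded, because flatMap only asks the trigger for another value once the current inner publisher has completed. A trigger that buffers instead of discarding (for example, one built from a sequence) would queue the event rather than drop it.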

Behavior for receive(on:) for DispatchQueue.main

Given the code below from a class:
cancellable = input
    .receive(on: scheduler)
    .map { ... }
    .sink(receiveValue: { value in
        self.state = value
    })
where input is a PassthroughSubject.
Now, when scheduler is the main queue or the RunLoop.main AND input will be called from the main thread, does receive(on: scheduler) programmatically optimise away an explicit dispatch to the main queue?
So, basically something like this:
if Thread.isMainThread {
    /* execute closure */
} else {
    /* dispatch closure async to main */
}
The documentation for receive(on:) gives a vague hint, that it might perform some optimisations:
"Prefer receive(on:options:) over explicit use of dispatch queues"
pub.sink {
    DispatchQueue.main.async {
        // Do something.
    }
}
No, receive(on:) does not optimize away the dispatch. Doing so could lead to a deadlock. Example:
let l = NSLock()
let cancellable = input
    .receive(on: scheduler)
    .map { ... }
    .sink(receiveValue: { value in
        l.lock()
        self.state = value
        l.unlock()
    })
l.lock()
input.send(1)
l.unlock()
If the dispatch were eliminated, this example would try to lock the already-locked lock l, and hang (or crash if it can detect the deadlock).
Looking at the sources for RunLoop and DispatchQueue, it doesn't look like there is such an optimisation in their conformance to the Scheduler protocol.
But to be fair, there might be lower level optimisations at play.
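One way to check this empirically is to send a value from the main thread and see whether the sink has already run by the time send returns. The sketch below assumes it runs as top-level code in a script or playground (so it starts on the main thread). The events array is a hypothetical helper for recording the order.

```swift
import Combine
import Foundation

let input = PassthroughSubject<Int, Never>()
var events: [String] = []

let cancellable = input
    .receive(on: DispatchQueue.main)
    .sink { value in events.append("sink \(value)") }

// Sent from the main thread. If receive(on:) skipped the dispatch,
// the sink would run before send returns.
input.send(1)
events.append("after send")

// Let the main run loop drain the main queue so that the sink can run.
RunLoop.main.run(until: Date().addingTimeInterval(0.1))

print(events)
```

If "after send" is recorded before "sink 1", the value was delivered through an asynchronous hop to the main queue rather than inline.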

Combine's subscribe(on:options:) operator

I have a question about the subscribe(on:options:) operator. I would appreciate it if anyone could help me figure it out.
So what we have from the documentation:
Specifies the scheduler on which to perform subscribe, cancel, and request operations.
In contrast with receive(on:options:), which affects downstream messages, subscribe(on:options:) changes the execution context of upstream messages.
Also, what I got from different articles is that unless we explicitly specify the Scheduler to receive our downstream messages on (using receive(on:options:)), messages will be sent on the Scheduler used for receiving a subscription.
This information does not match what I actually observe during execution.
I have the following code:
Just("Some text")
    .map { _ in
        print("Map: \(Thread.isMainThread)")
    }
    .subscribe(on: DispatchQueue.global())
    .sink { _ in
        print("Sink: \(Thread.isMainThread)")
    }
    .store(in: &subscriptions)
I would expect the following output:
Map: false
Sink: false
But instead I am getting:
Map: true
Sink: false
The same thing happens when I use Sequence publisher.
If I swap the position of map operator and subscribe operator, I receive what I want:
Just("Some text")
    .subscribe(on: DispatchQueue.global())
    .map { _ in
        print("Map: \(Thread.isMainThread)")
    }
    .sink { _ in
        print("Sink: \(Thread.isMainThread)")
    }
    .store(in: &subscriptions)
Output:
Map: false
Sink: false
Interestingly, when I use the same order of operators as in my first listing with my own custom publisher, I get the behaviour I want:
struct TestJust<Output>: Publisher {
    typealias Failure = Never
    private let value: Output
    init(_ output: Output) {
        self.value = output
    }
    func receive<S>(subscriber: S) where S : Subscriber, Failure == S.Failure, Output == S.Input {
        subscriber.receive(subscription: Subscriptions.empty)
        _ = subscriber.receive(value)
        subscriber.receive(completion: .finished)
    }
}
TestJust("Some text")
    .map { _ in
        print("Map: \(Thread.isMainThread)")
    }
    .subscribe(on: DispatchQueue.global())
    .sink { _ in
        print("Sink: \(Thread.isMainThread)")
    }
    .store(in: &subscriptions)
Output:
Map: false
Sink: false
So I think either I completely misunderstand these mechanisms, or some publishers intentionally choose the thread on which they publish values (Just, Sequence -> main; URLSession.DataTaskPublisher -> some background thread). The latter does not make sense to me, because in that case what would we need subscribe(on:options:) for?
Could you please help me understand what I am missing? Thank you in advance.
The first thing to understand is that messages flow both up a pipeline and down a pipeline. Messages that flow up a pipeline ("upstream") are:
The actual performance of the subscription (receive subscription)
Requests from a subscriber to the upstream publisher asking for a new value
Cancel messages (these percolate upwards from the final subscriber)
Messages that flow down a pipeline ("downstream") are:
Values
Completions, consisting of either a failure (error) or completion-in-good-order (reporting that the publisher emitted its last value)
Okay, well, as the documentation clearly states, subscribe(on:) is about the former: messages that flow upstream. But you are not actually tracking any of those messages in your tests, so none of your results reflect any information about them! Insert an appropriate handleEvents operator above the subscription point to see stuff flow upwards up the pipeline (e.g. implement its receiveRequest: parameter):
Just("Some text")
    .handleEvents(receiveRequest: {
        _ in print("Handle1: \(Thread.isMainThread)")
    })
    .map // etc.
Meanwhile, you should make no assumptions about the thread on which messages will flow downstream (i.e. values and completions). You say:
Also, what I got from different articles is that unless we explicitly specify the Scheduler to receive our downstream messages on (using receive(on:options:)), messages will be sent on the Scheduler used for receiving a subscription.
But that seems like a bogus assumption. And nothing about your code determines the downstream-sending thread in a clear way. As you rightly say, you can take control of this with receive(on:), but if you don't, I would say you must assume nothing about the matter. Some publishers certainly do produce a value on a background thread, such as the data task publisher, which makes perfect sense (the same thing happens with a data task completion handler). Others don't.
What you can assume is that operators other than receive(on:) will not generally alter the value-passing thread. But whether and how an operator will use the subscription thread to determine the receive thread, that is something you should assume nothing about. To take control of the receive thread, take control of it! Call receive(on:) or assume nothing.
Just to give an example, if you change your opening to
Just("Some text")
    .receive(on: DispatchQueue.main)
then both your map and your sink will report that they are receiving values on the main thread. Why? Because you took control of the receive thread. This works regardless of what you may say in any subscribe(on:) commands. They are different matters entirely.
Maybe if you call subscribe(on:) but you don't call receive(on:), some things about the downstream-sending thread are determined by the subscribe(on:) thread, but I sure wouldn't rely on there being any hard and fast rules about it; there's nothing saying that in the documentation! Instead, don't do that. If you implement subscribe(on:), implement receive(on:) too so that you are in charge of what happens.
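A sketch that puts the advice together: it uses both subscribe(on:) and receive(on:), and records the thread at each stage. The record helper and the log dictionary are hypothetical names, and the lock is there because stages may run on different threads. The handleEvents call sits directly after Just, so the map that follows is the generic operator rather than Just's own map overload.

```swift
import Combine
import Foundation

let lock = NSLock()
var log: [String: Bool] = [:]

/// Records whether the named stage ran on the main thread.
func record(_ stage: String) {
    lock.lock()
    defer { lock.unlock() }
    log[stage] = Thread.isMainThread
}

let cancellable = Just("Some text")
    .handleEvents(receiveRequest: { _ in record("request") }) // upstream message
    .map { text -> String in
        record("map")
        return text.uppercased()
    }
    .subscribe(on: DispatchQueue.global()) // where subscribe, request, and cancel run
    .receive(on: DispatchQueue.main)       // where values and completion are delivered
    .sink { _ in record("sink") }

// Keep the main run loop alive long enough for both hops to finish.
RunLoop.main.run(until: Date().addingTimeInterval(0.5))

lock.lock()
print(log)
lock.unlock()
```

With both operators in place, the request stage should report a background thread and the sink should report the main thread. Everything between the two operators is left unspecified, which is the point of the advice above.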

What is the difference between .subscribe and .drive

I am quite new to reactive programming. Here is what I'm trying:
.drive
searchController.rx.text
    .asDriver()
    .drive(onNext: { (element) in
        print(element)
    }).disposed(by: disposeBag)
.subscribe
searchController.rx.text
    .asObservable()
    .subscribe(onNext: { (element) in
        print(element)
    }).disposed(by: disposeBag)
Both blocks work exactly the same. What is the purpose of using .drive over .subscribe? In which scenarios should we use .drive over .subscribe?
Any help will be appreciated.
Driver is a bit different from Observable. From documentation:
Trait that represents observable sequence with following properties:
it never fails
it delivers events on MainScheduler.instance
share(replay: 1, scope: .whileConnected) sharing strategy
I assume that searchController.rx.text never fails and share isn't required in this situation.
So we have only one point that makes them different in your situation:
it delivers events on MainScheduler.instance
And you can check it yourself. Before subscribe insert this and your events won't be delivered on main thread:
.observeOn(ConcurrentDispatchQueueScheduler(qos: .background))
That is how I checked it in my code:
something
    .asObservable()
    .observeOn(ConcurrentDispatchQueueScheduler(qos: .background))
    .subscribe(onNext: { _ in
        print("observable is on main thread: ", Thread.isMainThread)
    })
something
    .asDriver()
    .drive(onNext: { _ in
        print("driver is on main thread: ", Thread.isMainThread)
    })
Logs:
driver is on main thread: true
observable is on main thread: false
In which scenario we should use .drive:
When working with UI. Why? From documentation:
Important
Use UIKit classes only from your app’s main thread or main
dispatch queue, unless otherwise indicated. This restriction
particularly applies to classes derived from UIResponder or that
involve manipulating your app’s user interface in any way.

Is it possible to make `ReplaySubject` to run a closure on being subscribed to?

I want to create a cold observable that only starts an expensive operation once there is an actual subscription. ReplaySubject would fit nicely, except that I need to start the expensive background operation when the actual subscription is made, not when the observable is created. Is there a way to do so? Some sort of onSubscribed { ... } method.
Here are a couple of options:
Adding the expensive operation to a doOn(onSubscribe:) that's in between the Observable and the subscription:
let observable = Observable.of(1, 2)
    .doOn(onSubscribe: { _ in
        expensiveOperation()
    })
observable
    .subscribeNext { e in
        print(e)
    }
Making the Observable connectable and separating the doOn(onSubscribe:):
let observable = Observable.of(1, 2)
    .publish()
observable
    .doOn(onSubscribe: { _ in
        expensiveOperation()
    })
    .subscribe()
observable
    .subscribeNext { e in
        print(e)
    }
observable.connect()
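In RxSwift 3 and later, doOn and subscribeNext are spelled do(...) and subscribe(onNext:). A sketch of the first option in the newer spelling follows. Here expensiveOperation is a hypothetical stand-in that only counts how often it runs, and operationRuns and received are illustrative names.

```swift
import RxSwift

var operationRuns = 0

// Hypothetical stand-in for the expensive work.
func expensiveOperation() {
    operationRuns += 1
}

// Building the chain does not run the closure.
let observable = Observable.of(1, 2)
    .do(onSubscribe: { expensiveOperation() })

// The onSubscribe closure fires once for each subscription.
var received: [Int] = []
let disposable = observable.subscribe(onNext: { received.append($0) })

print(operationRuns, received)
```

Because the closure runs once per subscription, a second subscriber would trigger the expensive work again. The connectable variant above avoids that by attaching the side effect to a single dedicated subscription.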