Can RxJava2 CompositeDisposable Clean Itself of Disposed Subscriptions? - rx-java2

From time to time my app triggers a Single for a network request, which I add to a CompositeDisposable in case I need to cancel all pending network requests.
The CompositeDisposable adds each Disposable to an inner HashSet, so the more Singles I add over time, the more memory the CompositeDisposable takes.
Is there any way for the CompositeDisposable to "clean up", removing all disposed Disposables from its inner HashSet to save memory?

I had the same problem.
I'm not sure if it's the best approach, but I implemented it in this way:
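package io.reactivex.disposables

import io.reactivex.internal.util.OpenHashSet

// Declared in RxJava's own package so that the package-private fields
// `disposed` and `resources` of CompositeDisposable are accessible.
fun CompositeDisposable.clearDisposed() {
    if (disposed) {
        return
    }
    var notDisposedSet: OpenHashSet<Disposable>
    synchronized(this) {
        if (disposed) {
            return
        }
        // Copy only the disposables that are still active into a fresh set
        notDisposedSet = OpenHashSet()
        for (res in resources?.keys().orEmpty()) {
            if (res is Disposable && !res.isDisposed) {
                notDisposedSet.add(res)
            }
        }
        resources = notDisposedSet
    }
}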

My solution in Kotlin was to create such an extension function:
fun <T> CompositeDisposable.subscribeAndPush(single: Single<T>,
                                             onError: ((t: Throwable) -> Unit)? = null,
                                             onSuccess: (t: T) -> Unit) {
    lateinit var disposable: Disposable
    disposable = single.doAfterTerminate {
        this.remove(disposable)
    }
        .subscribe(onSuccess, onError ?: {})
    this.add(disposable)
}
Then sample usage:
compositeDisposable.subscribeAndPush(
    authenticationRepository.login(credentials), ::onError) { result ->
    // on success here
}
The actual subscription happens inside the extension, so the extension can first attach a listener that lets the composite remove the disposable once the Single terminates. I don't know if it's a good solution, but maybe it will help someone.

Swift Concurrency async/await equivalent of a dispatch barrier / semaphore [duplicate]

I have a document-based application that uses a struct for its main data/model. As the model is a property of (a subclass of) NSDocument, it needs to be accessed from the main thread. So far all good.
But some operations on the data can take quite a long time, and I want to show the user a progress bar. This is where the problems start, especially when the user starts two operations from the GUI in quick succession.
If I run the operation on the model synchronously (or in a 'normal' Task {}) I get the correct serial behaviour, but the Main thread is blocked, hence I can't show a progress bar. (Option A)
If I run the operation on the model in a Task.detached {} closure I can update the progress bar, but depending on the run time of the operations on the model, the second action of the user might complete before the first operation, resulting in invalid/unexpected state of the model. This is due to the await statements needed in the detached task (I think). (Option B).
So I want a) to free up the main thread to update the GUI and b) make sure each task runs to full completion before another (queued) task starts. This would be quite possible using a background serial dispatch queue, but I'm trying to switch to the new Swift concurrency system, which is also used to perform any preparations before the model is accessed.
I tried using a global actor, as that seems to be some sort of serial background queue, but it also needs await statements. Although the likelihood of unexpected state in the model is reduced, it's still possible.
I've written some small code to demonstrate the problem:
The model:
struct Model {
    var doneA = false
    var doneB = false

    mutating func updateA() {
        Thread.sleep(forTimeInterval: 5)
        doneA = true
    }

    mutating func updateB() {
        Thread.sleep(forTimeInterval: 1)
        doneB = true
    }
}
And the document (leaving out standard NSDocument overrides):
@globalActor
struct ModelActor {
    actor ActorType { }
    static let shared: ActorType = ActorType()
}

class Document: NSDocument {
    var model = Model() {
        didSet {
            Swift.print(model)
        }
    }

    func update(model: Model) {
        self.model = model
    }

    @ModelActor
    func updateModel(with operation: (Model) -> Model) async {
        var model = await self.model
        model = operation(model)
        await update(model: model)
    }

    @IBAction func operationA(_ sender: Any?) {
        //Option A
        // Task {
        //     Swift.print("Performing some A work...")
        //     self.model.updateA()
        // }

        //Option B
        // Task.detached {
        //     Swift.print("Performing some A work...")
        //     var model = await self.model
        //     model.updateA()
        //     await self.update(model: model)
        // }

        //Option C
        Task.detached {
            Swift.print("Performing some A work...")
            await self.updateModel { model in
                var model = model
                model.updateA()
                return model
            }
        }
    }

    @IBAction func operationB(_ sender: Any?) {
        //Option A
        // Task {
        //     Swift.print("Performing some B work...")
        //     self.model.updateB()
        // }

        //Option B
        // Task.detached {
        //     Swift.print("Performing some B work...")
        //     var model = await self.model
        //     model.updateB()
        //     await self.update(model: model)
        // }

        //Option C
        Task.detached {
            Swift.print("Performing some B work...")
            await self.updateModel { model in
                var model = model
                model.updateB()
                return model
            }
        }
    }
}
Clicking 'Operation A' and then 'Operation B' should result in a model with two true's. But it doesn't always.
Is there a way to make sure that operation A completes before I get to operation B and have the Main thread available for GUI updates?
EDIT
Based on Rob's answer I came up with the following. I modified it this way because I can then wait on the created operation and report any error to the original caller. I found it easier to follow what's happening with all the code inside a single update function, so I chose to go for a detached task instead of an actor. I also return the intermediate model from the task, as otherwise an old model might be used.
class Document {
    //The most recently queued update (declaration assumed; it was not shown in the original snippet)
    private var previousTask: Task<Model, Error>?

    func updateModel(operation: @escaping (Model) throws -> Model) async throws {
        //Update the model in the background
        let modelTask = Task.detached { [previousTask, model] () throws -> Model in
            var model = model
            //Check whether we're cancelled
            try Task.checkCancellation()
            //Check whether we need to wait on earlier task(s)
            if let previousTask = previousTask {
                //If the preceding task succeeds we use its model
                do {
                    model = try await previousTask.value
                } catch {
                    throw CancellationError()
                }
            }
            return try operation(model)
        }
        previousTask = modelTask
        //Make sure a later task can always start if we throw, but only clear the
        //reference if no newer task has replaced it while we were waiting
        defer { if previousTask == modelTask { previousTask = nil } }
        //Wait for the operation to finish and store the model
        do {
            self.model = try await modelTask.value
        } catch {
            if error is CancellationError { return }
            else { throw error }
        }
    }
}
Call side:
@IBAction func operationA(_ sender: Any?) {
    //Option D
    Task {
        do {
            try await updateModel { model in
                var model = model
                model.updateA()
                return model
            }
        } catch {
            presentError(error)
        }
    }
}
It seems to do everything I need: it queues updates to a property on a document, and those updates can be awaited and can return errors, much as if everything happened on the main thread.
The only drawback seems to be that the closure at the call site is very verbose, because it has to copy the model into a var and return it explicitly.
Obviously if your tasks do not have any await or other suspension points, you would just use an actor, and not make the method async, and it automatically will perform them sequentially.
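As a rough illustration of that first case, here is a minimal sketch. The `ModelStore` actor and the `demo()` function are hypothetical names, not part of the question's code, and the `Model` here is a trimmed-down stand-in. Because `update` is not `async`, it has no suspension points, so the actor cannot interleave two calls.

```swift
// Trimmed-down stand-in for the question's model.
struct Model: Sendable {
    var doneA = false
    var doneB = false
}

// Hypothetical actor that owns the model.
actor ModelStore {
    private(set) var model = Model()

    // Synchronous method: once a call starts, it runs to completion
    // before the actor picks up the next call.
    func update(_ operation: @Sendable (inout Model) -> Void) {
        operation(&model)
    }
}

// Callers still need `await` to hop onto the actor, but the body of
// `update` itself is never interrupted halfway through.
func demo() async -> Model {
    let store = ModelStore()
    await store.update { $0.doneA = true }
    await store.update { $0.doneB = true }
    return await store.model
}
```

The work runs off the main thread, so the GUI stays responsive while an update is in progress.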
But, when dealing with asynchronous actor methods, one must appreciate that actors are reentrant (see SE-0306: Actors - Actor Reentrancy). If you really are trying to have a series of asynchronous tasks run serially, you will want to manually have each subsequent task await the prior one. E.g.,
actor Foo {
    private var previousTask: Task<(), Error>?

    func add(block: @Sendable @escaping () async throws -> Void) {
        previousTask = Task { [previousTask] in
            let _ = await previousTask?.result
            return try await block()
        }
    }
}
There are two subtle aspects to the above:
I use the capture list of [previousTask] to make sure to get a copy of the prior task.
I perform await previousTask?.result inside the new task, not before it.
If you await prior to creating the new task, you have a race: if you launch three tasks, both the second and the third will await the first task, i.e. the third task is not awaiting the second one.
And, perhaps needless to say, because this is within an actor, it avoids the need for detached task, while keeping the main thread free.
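To see the chaining in action, here is a small self-contained sketch along the same lines. The names `SerialTasks`, `Log` and `demo()` are made up for the illustration, and `add` returns the new task (an addition to the pattern above) purely so the demo can wait for the last block.

```swift
actor SerialTasks {
    private var previousTask: Task<Void, Error>?

    // Chains each block behind the one added before it.
    func add(_ block: @Sendable @escaping () async throws -> Void) -> Task<Void, Error> {
        let task = Task<Void, Error> { [previousTask] in
            // Wait for the prior block inside the new task, ignoring its outcome
            _ = await previousTask?.result
            try await block()
        }
        previousTask = task
        return task
    }
}

// Collects the order in which the blocks finish.
actor Log {
    private(set) var entries: [String] = []
    func append(_ entry: String) { entries.append(entry) }
}

func demo() async -> [String] {
    let queue = SerialTasks()
    let log = Log()
    // The first block is the slowest, yet it still finishes first.
    _ = await queue.add {
        try await Task.sleep(nanoseconds: 300_000_000)
        await log.append("A")
    }
    let last = await queue.add {
        try await Task.sleep(nanoseconds: 10_000_000)
        await log.append("B")
    }
    _ = await last.result
    return await log.entries
}
```

Here `demo()` should return the entries in the order the blocks were added, even though block "B" sleeps for a much shorter time.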

Swift MetalKit buffer completion handler vs waitForCompletion?

When creating a render pass in MetalKit is it better in terms of performance to wait for completion or to add a completion handler? If I use the completion handler then I'll end up with a lot of nested closures, but I think waitForCompletion might block a thread. If the completion handler is preferred, is there a better way in Swift to do this without having to use so many nested closures?
For example,
buffer.addCompletedHandler { _ in
    // ... next task
    buffer2.addCompletedHandler { _ in
        // ... etc etc
    }
}
The other people are right in telling you that this is probably not what you want to do, and you should go educate yourself on how others have created render loops in Metal.
That said, if you actually have use cases for non-blocking versions of waitUntilCompleted or waitUntilScheduled, you can create and use your own until Apple gets around to providing the same.
public extension MTLCommandBuffer {
    /// Wait until this command buffer is scheduled for execution on the GPU.
    var schedulingCompletion: Void {
        get async {
            await withUnsafeContinuation { continuation in
                addScheduledHandler { _ in
                    continuation.resume()
                }
            }
        }
    }

    /// Wait until the GPU has finished executing the commands in this buffer.
    var completion: Void {
        get async {
            await withUnsafeContinuation { continuation in
                addCompletedHandler { _ in
                    continuation.resume()
                }
            }
        }
    }
}
But I doubt those properties will improve any code, as all of the task ordering code, necessary to ensure that the "handlers" are added before commit is called, is worse than the callbacks.
let string: String = await withTaskGroup(of: String.self) { group in
    let buffer = MTLCreateSystemDefaultDevice()!.makeCommandQueue()!.makeCommandBuffer()!

    group.addTask {
        await buffer.schedulingCompletion
        return "2"
    }

    group.addTask {
        await buffer.completion
        return "3"
    }

    group.addTask {
        buffer.commit()
        return "1"
    }

    return await .init(group)
}

XCTAssertEqual(string, "123")

public extension String {
    init<Strings: AsyncSequence>(_ strings: Strings) async rethrows
    where Strings.Element == String {
        self = try await strings.reduce(into: .init()) { $0.append($1) }
    }
}
However, while I'm unconvinced on addScheduledHandler being improvable, I think pairing addCompletedHandler and commit has more potential.
public extension MTLCommandBuffer {
    /// Commit this buffer and wait for the GPU to finish executing its commands.
    func complete() async {
        await withUnsafeContinuation { continuation in
            self.addCompletedHandler { _ in
                continuation.resume()
            }
            commit()
        } as Void
    }
}
You are supposed to use MTLCommandQueues and MTLEvents for serializing your GPU work, not completion handlers. Completion handlers are meant to be used only in cases where you need CPU-GPU synchronization, e.g. when you need to read back the result of a GPU calculation on the CPU, or when you need to add back pressure, for example when you only want a certain number of frames in flight at once.
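For what it's worth, a minimal sketch of the MTLEvent approach could look like the following. The function name `runInOrder` and its two encoding closures are hypothetical; the Metal calls used are `makeEvent()`, `encodeSignalEvent(_:value:)` and `encodeWaitForEvent(_:value:)`. Two queues are used only to show that the ordering comes from the event rather than from a single queue.

```swift
import Metal

/// Encodes two command buffers so that the GPU starts the second one only
/// after the first one has finished. Returns false if Metal is unavailable.
func runInOrder(encodeFirst: (MTLCommandBuffer) -> Void,
                encodeSecond: (MTLCommandBuffer) -> Void) -> Bool {
    guard let device = MTLCreateSystemDefaultDevice(),
          let queueA = device.makeCommandQueue(),
          let queueB = device.makeCommandQueue(),
          let event = device.makeEvent(),
          let first = queueA.makeCommandBuffer(),
          let second = queueB.makeCommandBuffer() else { return false }

    encodeFirst(first)
    // The event reaches 1 once the work encoded above has completed
    first.encodeSignalEvent(event, value: 1)

    // The GPU holds back the work encoded below until the event reaches 1
    second.encodeWaitForEvent(event, value: 1)
    encodeSecond(second)

    first.commit()
    second.commit()
    return true
}
```

No CPU thread blocks and no completion handler is nested here; the dependency is resolved on the GPU timeline.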

Wait combineLatest until #selector is called

TL;DR;
I need to find a way to set up combineLatest so that it processes events only after a particular method, self.myMethod(), is called, without subscribing inside that method.
Description
My component A has a subscribe() routine in init(), where all Rx subscriptions are set up.
import RxSwift

final class A {
    let bag = DisposeBag()

    init() {
        //...
        subscribe()
    }
    //...
    private func subscribe() {
        // Setup all Rx subscriptions here
    }
}
There are two other dependencies B and C, each having their statuses that A needs to combineLatest and yield some UI Event upon that combination.
Observable.combineLatest(b.status,
                         c.status)
    .filter { $0.0 == .connecting && $0.1 == .notReachable }
    .map { _ -> Error in
        return AError.noInternet
    }
    .debounce(RxTimeInterval.seconds(5), scheduler: MainScheduler.instance)
    .subscribe(onNext: { [weak self] error in
        self?.didFail(with: error)
    })
    .disposed(by: bag)
A is not a UI component and basically handles business logic, thus it should wait until UI "says" it is ready to handle that business logic. E.g., after myMethod() is called on A by UI layer.
Problem
I want the Observable.combineLatest in subscribe() to be set up so that it waits until myMethod() is called and then immediately receives the latest events from B's status and C's status.
Currently I do it this way in A:
public func myMethod() {
    // ...
    Observable.combineLatest(...
}
which breaks the clean code I am striving for.
One thing you could do is make the observable connectable with publish(), and call connect() when you need to:
private var statusErrors: ConnectableObservable<Error>?

func subscribe() {
    let errors = Observable.combineLatest(b.status, c.status)
        .filter { ... }
        .map { ... }
        .publish()
    errors.subscribe(...)
        .disposed(by: bag)
    statusErrors = errors
}
Then, in myMethod() you can do:
func myMethod() {
    statusErrors?.connect()
        .disposed(by: bag)
}
Another option would be to add a PublishSubject to your A class.
final class A {
    let myMethodCalled = PublishSubject<Void>()

    init() {
        myMethodCalled
            .withLatestFrom(Observable.combineLatest(b.status, c.status))
        // etc...
    }

    func myMethod() {
        myMethodCalled.onNext(())
    }
}
The above might be a problem if, for example, myMethod() is called before b.status and c.status have emitted any values though.
The best solution is to pass in an Observable that triggers the whole thing instead of calling myMethod(). Embrace the Rx paradigm and get rid of the passive (as opposed to reactive) myMethod().
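A minimal sketch of that last idea, under some assumptions: the `ready` parameter, the `Status` cases and the `lastError` property are hypothetical stand-ins, the status streams are assumed to replay their latest value (e.g. a BehaviorSubject), and the debounce from the question is left out to keep the example synchronous.

```swift
import RxSwift

// Hypothetical stand-ins for the types in the question.
enum Status { case connecting, connected, notReachable }
enum AError: Error { case noInternet }

final class A {
    private let bag = DisposeBag()
    private(set) var lastError: Error?

    // `ready` is emitted by the UI layer when it can handle events;
    // it replaces the imperative myMethod() call.
    init(ready: Observable<Void>,
         bStatus: Observable<Status>,
         cStatus: Observable<Status>) {
        ready
            .take(1)
            // The statuses are only subscribed to once `ready` fires
            .flatMap { _ in Observable.combineLatest(bStatus, cStatus) }
            .filter { $0.0 == .connecting && $0.1 == .notReachable }
            .map { _ -> Error in AError.noInternet }
            .subscribe(onNext: { [weak self] error in
                self?.didFail(with: error)
            })
            .disposed(by: bag)
    }

    private func didFail(with error: Error) {
        lastError = error
    }
}
```

With this shape, everything is still wired up in one place at init time, and nothing is processed until the UI's observable emits.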

Combine scan with network request to update model in RxSwift

I currently have an Observable created with scan that updates the underlying model in response to a PublishSubject, like this:
class ViewModel {
    private enum Action {
        case updateName(String)
    }

    private let product: Observable<Product>
    private let actions = PublishSubject<Action>()

    init(initialProduct: Product) {
        product = actions
            .scan(initialProduct, accumulator: { (oldProduct, action) -> Product in
                var newProduct = oldProduct
                switch action {
                case .updateName(let name):
                    newProduct.name = name
                }
                return newProduct
            })
            .startWith(initialProduct)
            .share()
    }

    func updateProductName(_ name: String) {
        actions.onNext(.updateName(name))
    }

    private func getProductDetail() {
        /// This will call a network request
    }
}
Every "local" action, like updating the product's name or prices, is done through a method like updateProductName(_ name: String) above. But what if I want a network request that also updates the product and can be called whenever I want, for example after a button tap, or after calling updateProductName?
// UPDATE: After reading iWheelBuy's comment and Daniel's answer, I ended up using two more actions
class ViewModel {
    private enum Action {
        case getDetail
        case updateProduct(Product)
    }
    ///....
    init(initialProduct: Product) {
        product = actions
            .scan(initialProduct, accumulator: { (oldProduct, action) -> Product in
                var newProduct = oldProduct
                switch action {
                case .updateName(let name):
                    newProduct.name = name
                case .getDetail:
                    self.getProductDetail()
                case .updateProduct(let p):
                    return p
                }
                return newProduct
            })
            .startWith(initialProduct)
            .share()
    }

    func getProductDetail() {
        actions.onNext(.getDetail)
    }

    private func getProductDetail(id: Int) {
        ProductService.getProductDetail(id) { product in
            self.actions.onNext(.updateProduct(product))
        }
    }
}
But I feel that I am triggering a side effect (a network request) inside scan without updating the model. Is that wrong?
Also, how can I use an "rx" network request?
// What if I want to use this method instead of the one above,
// without subscribing inside the view model?
private func rxGetProductDetail(id: Int) -> Observable<Product> {
    return ProductService.rxGetProductDetail(id: id)
}
I'm not sure why @iWheelBuy didn't make a real answer because their comment is the correct answer. Given the hybrid approach to Rx in your question, I expect something like the below will accommodate your style:
import Foundation
import RxSwift
import RxCocoa // provides URLSession.shared.rx

class ViewModel {
    private enum Action {
        case updateName(String)
        case updateProduct(Product)
    }

    private let product: Observable<Product>
    private let actions = PublishSubject<Action>()
    private var disposable: Disposable?

    init(initialProduct: Product) {
        product = actions
            .scan(initialProduct, accumulator: { (oldProduct, action) -> Product in
                var newProduct = oldProduct
                switch action {
                case .updateName(let name):
                    newProduct.name = name
                case .updateProduct(let product):
                    newProduct = product
                }
                return newProduct
            })
            .startWith(initialProduct)
            .share()
        // without a subscribe, none of this matters. I assume you just didn't show all your code.
    }

    deinit {
        disposable?.dispose()
    }

    func updateProductName(_ name: String) {
        actions.onNext(.updateName(name))
    }

    private func getProductDetail() {
        let request = URLRequest(url: URL(string: "https://foo.com")!)
        // Cancel any request that is still in flight before starting a new one
        disposable?.dispose()
        disposable = URLSession.shared.rx.data(request: request)
            .map { try JSONDecoder().decode(Product.self, from: $0) }
            .map { Action.updateProduct($0) }
            .subscribe(
                onNext: { [actions] in actions.onNext($0) },
                onError: { error in /* handle error */ }
            )
    }
}
The style above is still pretty imperative but if you don't want your use of Rx to leak out of the view model it's okay.
If you want to see a "full Rx" setup, you might find my sample repo interesting: https://github.com/danielt1263/RxEarthquake
UPDATE
"But I feel that, I trigger side effect (call network request) inside scan, without updating the model, is that something wrong?"
The scan function should be pure, with no side effects. Making a network request inside its closure is inappropriate.
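To make "pure" concrete, here is a small plain-Swift sketch with no Rx involved. The `Product` fields and the `apply(_:to:)` helper are assumptions made up for the illustration. The accumulator only maps (state, action) to a new state, and the network result arrives as just another action instead of being fetched inside it.

```swift
// Hypothetical stand-in for the question's Product type.
struct Product: Equatable {
    var id: Int
    var name: String
}

enum Action {
    case updateName(String)
    case updateProduct(Product)
}

// Pure accumulator: the result depends only on its arguments,
// and nothing outside the function is read or modified.
func apply(_ action: Action, to product: Product) -> Product {
    switch action {
    case .updateName(let name):
        var updated = product
        updated.name = name
        return updated
    case .updateProduct(let fetched):
        return fetched
    }
}

let initial = Product(id: 1, name: "Old")
let actions: [Action] = [
    .updateName("New"),
    // A finished network request is fed in as an ordinary action
    .updateProduct(Product(id: 1, name: "From server"))
]
let result = actions.reduce(initial) { apply($1, to: $0) }
```

The same `apply` function could be handed to scan as its accumulator, while the request itself lives outside the stream and only pushes `.updateProduct` into `actions` when it completes.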