Concurrency issues with observable.observeOn() and common resources - swift

I have an observable inside a function.
The function happens in a certain queue, queueA, and the observable is subscribed to with observeOn(schedulerB). In onNext, I'm changing a class variable.
In another function, I'm changing the same class variable, from a different queue.
Here is some code to demonstrate my situation:
class SomeClass {
    var commonResource: [String: String] = [:]
    var queueA = DispatchQueue(label: "A")
    var queueB = DispatchQueue(label: "B")
    // lazy, because a property initializer cannot refer to another instance property
    lazy var schedulerB = ConcurrentDispatchQueueScheduler(queue: queueB)
    func writeToResourceInOnNext() {
        let obs: PublishSubject<String> = OtherClass.GetObservable()
        obs.observeOn(schedulerB)
            .subscribe(onNext: { [weak self] res in
                // this happens on queue B
                self?.commonResource["key"] = res
            })
    }
    func writeToResource() {
        // this happens on queue A
        commonResource["key"] = "otherValue"
    }
}
My question is, is it likely to have concurrency issues, if commonResource is modified in both places at the same time?
What is the common practice for writing/reading from class/global variables inside onNext in an observable with observeOn?
Thanks all!

Since your SomeClass has no control over when these functions will be called or on what threads, the answer is yes: you are set up to have concurrency issues in this code due to its passive nature.
The obvious solution here is to dispatch to queue B inside writeToResource() in order to avoid the race condition.
Another option would be to use an NSLock (or NSRecursiveLock) and lock it before you write to the resource and unlock it after.
The best practice is: when you have a side effect happening inside a subscribe function's closure (in this case, writing to commonResource), make sure that the closure is the only place where the side effect occurs. This would mean doing away with the passive writeToResource() function and instead passing in an Observable that was generated by whatever code is currently calling the function.
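As a rough illustration of the "dispatch everything to one queue" option, here is a minimal sketch that uses only Foundation and leaves RxSwift out. SynchronizedStore and its method names are hypothetical, not part of the question's code; the idea is simply that every read and write of the dictionary goes through a single serial queue.

```swift
import Foundation

// Hypothetical wrapper: all access to `storage` is funnelled through one serial queue.
final class SynchronizedStore {
    private var storage: [String: String] = [:]
    private let queue = DispatchQueue(label: "SynchronizedStore.serial")

    // Writes are enqueued asynchronously; the serial queue applies them one at a time.
    func set(_ value: String, forKey key: String) {
        queue.async { self.storage[key] = value }
    }

    // Reads run synchronously on the same queue, so they never overlap a write.
    func value(forKey key: String) -> String? {
        queue.sync { storage[key] }
    }

    var count: Int {
        queue.sync { storage.count }
    }
}

let store = SynchronizedStore()

// Hammer the store from many threads at once.
DispatchQueue.concurrentPerform(iterations: 1_000) { i in
    store.set("value \(i)", forKey: "key \(i)")
}

// The queue is FIFO, so this sync read runs after every write enqueued above.
print(store.count)
```

An NSLock around each access would work in much the same way; the queue version just makes the ordering of writes explicit.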

Related

Cannot safely access globalActor's method from a closure marked with same globalActor

@globalActor
actor LibraryAccount {
    static let shared = LibraryAccount()
    var booksOnLoan: [Book] = [Book()]
    func getBook() -> Book {
        return booksOnLoan[0]
    }
}
class Book {
    var title: String = "ABC"
}
func test() async {
    let handle = Task { @LibraryAccount in
        let b = await LibraryAccount.shared.getBook() // WARNING and even ERROR without await (although function is not async)
        print(b.title)
    }
}
This code generates the following warning:
Non-sendable type 'Book' returned by call to actor-isolated instance method 'getBook()' cannot cross actor boundary
However, the closure is itself marked with the same global actor, so there should be no actor boundary here. Interestingly, when removing the await from the offending line, it will emit an error:
Expression is 'async' but is not marked with 'await'
This looks like it does not recognize that this is the same instance of the Actor as the one guarding the closure.
What's going on here? Am I misunderstanding how GlobalActors work or is this a bug?
Global actors aren't really designed to be used this way.
The type on which you mark @globalActor is just a marker type, providing a shared property which returns the actual actor instance doing the synchronisation.
As the proposal puts it:
A global actor type can be a struct, enum, actor, or final class. It is essentially just a marker type that provides access to the actual shared actor instance via shared. The shared instance is a globally-unique actor instance that becomes synonymous with the global actor type, and will be used for synchronizing access to any code or data that is annotated with the global actor.
Therefore, what you write after static let shared = doesn't necessarily have to be LibraryAccount(), it can be any actor in the world. The type marked as @globalActor doesn't even need to be an actor itself.
So from Swift's perspective, it's not at all obvious that the LibraryAccount global actor is the same actor as any actor-typed expression you write in code, like LibraryAccount.shared. You might have implemented shared so that it returns a different instance of LibraryAccount the second time you call it, who knows? Static analysis only goes so far.
What Swift does know is that two things marked @LibraryAccount are isolated to the same actor, i.e. this is purely nominal. After all, the original motivation for global actors was to easily mark things that need to be run on the main thread with @MainActor. Quote (emphasis mine):
The primary motivation for global actors is the main actor, and the
semantics of this feature are tuned to the needs of main-thread
execution. We know abstractly that there are other similar use cases,
but it's possible that global actors aren't the right match for those
use cases.
You are supposed to create a "marker" type, with almost nothing in it - that is the global actor. And isolate your business logic to the global actor by marking it.
In your case, you can rename LibraryAccount to LibraryAccountActor, then move your properties and methods to a LibraryAccount class, marked with @LibraryAccountActor:
@globalActor
actor LibraryAccountActor {
    static let shared = LibraryAccountActor()
}
@LibraryAccountActor
class LibraryAccount {
    static let shared = LibraryAccount()
    var booksOnLoan: [Book] = [Book()]
    func getBook() async -> Book { // this doesn't need to be async, does it?
        return booksOnLoan[0]
    }
}
class Book {
    var title: String = "ABC"
}
func test() async {
    let handle = Task { @LibraryAccountActor in
        let b = await LibraryAccount.shared.getBook()
        print(b.title)
    }
}

Publish `operationCount` from operationQueue inside actor?

I have an actor:
actor MyActor {
let theQueue = OperationQueue()
init() {
_ = theQueue.observe(\OperationQueue.operationCount, options: .new) { oq, change in
print("OperationQueue.operationCount changed: \(self.theQueue.operationCount)")
}
}
....
}
I was trying to get a KVO going to then trigger some type of publisher call that other models in the app could subscribe to and react as needed when the operationCount changes.
I was going to have a function that maybe would set that up, but, as of now, using self in that initializer gives me this warning, which, according to this:
https://forums.swift.org/t/proposal-actor-initializers-and-deinitializers/52322
it will turn into an error soon.
The warning I get is this:
Actor 'self' can only be captured by a closure from an async initializer
So, how could I trigger a publisher other models can then react to that would publish the operation queue's operationCount as it changes?
You don't need to capture self here. observe sends you the new value (for basically exactly this reason):
_ = theQueue.observe(\OperationQueue.operationCount, options: .new) { oq, change in
print("OperationQueue.operationCount changed: \(change.newValue!)")
}
Also, oq is theQueue if you need that. If you need self, the typical way to do that is:
observation = observe(\.theQueue.operationCount, options: .new) { object, change in
// object is `self` here.
}
Just remember that you're outside the actor inside this closure, so calls may need to be async inside a Task.
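If the goal is only to let other models subscribe to the count, one possible sketch is to skip the manual observation and expose Foundation's KVO publisher. This assumes an Apple platform where Combine is available; operationCountPublisher is a hypothetical name, and the property is marked nonisolated on the assumption that subscribers should not have to await the actor to get the publisher.

```swift
import Foundation
import Combine

actor MyActor {
    let theQueue = OperationQueue()

    // Hypothetical property: wraps the KVO-compliant `operationCount` in a Combine publisher.
    // No `self` is captured in the initializer, so the actor-init warning does not arise.
    nonisolated var operationCountPublisher: AnyPublisher<Int, Never> {
        theQueue.publisher(for: \.operationCount)
            .eraseToAnyPublisher()
    }
}

let monitored = MyActor()
let cancellable = monitored.operationCountPublisher.sink { count in
    print("OperationQueue.operationCount changed: \(count)")
}
```

With the default options, the publisher emits the current value on subscription and then each new value. Values may arrive on whatever thread changes the count, so subscribers that touch UI would likely want a receive(on:) step.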

In a Combine Publisher chain, how to keep inner objects alive until cancel or complete?

I've created a Combine publisher chain that looks something like this:
let pub = getSomeAsyncData()
.mapError { ... }
.map { ... }
...
.flatMap { data in
let wsi = WebSocketInteraction(data, ...)
return wsi.subject
}
.share().eraseToAnyPublisher()
It's a flow of different possible network requests and data transformations. The calling code wants to subscribe to pub to find out when the whole asynchronous process has succeeded or failed.
I'm confused about the design of the flatMap step with the WebSocketInteraction. That's a helper class that I wrote. I don't think its internal details are important, but its purpose is to provide its subject property (a PassthroughSubject) as the next Publisher in the chain. Internally the WebSocketInteraction uses URLSessionWebSocketTask, talks to a server, and publishes to the subject. I like flatMap, but how do you keep this piece alive for the lifetime of the Publisher chain?
If I store it in the outer object (no problem), then I need to clean it up. I could do that when the subject completes, but if the caller cancels the entire publisher chain then I won't receive a completion event. Do I need to use Publisher.handleEvents and listen for cancellation as well? This seems a bit ugly. But maybe there is no other way...
.flatMap { data in
let wsi = WebSocketInteraction(data, ...)
self.currentWsi = wsi // store in containing object to keep it alive.
wsi.subject.sink(receiveCompletion: { self.currentWsi = nil })
wsi.subject.handleEvents(receiveCancel: {
wsi.closeWebSocket()
self.currentWsi = nil
})
Anyone have any good "design patterns" here?
One design I've considered is making my own Publisher. For example, instead of having WebSocketInteraction vend a PassthroughSubject, it could conform to Publisher. I may end up going this way, but making a custom Combine Publisher is more work, and the documentation steers people toward using a subject instead. To make a custom Publisher you have to implement some of the things that the PassthroughSubject does for you, like responding to demand and cancellation, and keeping state to ensure you complete at most once and don't send events after that.
[Edit: to clarify that WebSocketInteraction is my own class.]
It's not exactly clear what problems you are facing with keeping an inner object alive. The object should be alive so long as something has a strong reference to it.
It's either an external object that will start some async process, or an internal closure that keeps a strong reference to self via self.subject.send(...).
class WebSocketInteraction {
    private let subject = PassthroughSubject<String, Error>()
    private var isCancelled: Bool = false
    init() {
        // start some async work
        DispatchQueue.main.asyncAfter(deadline: .now() + 1) {
            if !self.isCancelled { self.subject.send("Done") } // <-- ref
        }
    }
    // return a publisher that can cancel the operation when the subscription is cancelled
    var pub: AnyPublisher<String, Error> {
        subject
            .handleEvents(receiveCancel: {
                print("cancel handler")
                self.isCancelled = true // <-- ref
            })
            .eraseToAnyPublisher()
    }
}
You should be able to use it as you wanted with flatMap, since the pub property returns a publisher, and the inner closures hold a reference to self:
let pub = getSomeAsyncData()
...
.flatMap { data in
let wsi = WebSocketInteraction(data, ...)
return wsi.pub
}

Swift - Is checking whether a weak variable is nil or not thread-safe?

I have a process which runs for a long time and which I would like the ability to interrupt.
func longProcess(shouldAbort: @escaping () -> Bool) {
    // Runs a long loop and periodically checks shouldAbort(),
    // returning early if shouldAbort() returns true
}
Here's my class which uses it:
class Example {
private var abortFlag: NSObject? = .init()
private var dispatchQueue: DispatchQueue = .init(label: "Example")
func startProcess () {
let shouldAbort: ()->Bool = { [weak abortFlag] in
return abortFlag == nil
}
dispatchQueue.async {
longProcess(shouldAbort: shouldAbort)
}
}
func abortProcess () {
self.abortFlag = nil
}
}
The shouldAbort closure captures a weak reference to abortFlag, and checks whether that reference points to nil or to an NSObject. Since the reference is weak, if the original NSObject is deallocated then the reference that is captured by the closure will suddenly be nil and the closure will start returning true. The closure will be called repeatedly during the longProcess function, which is occurring on the private dispatchQueue. The abortProcess method on the Example class will be externally called from some other queue. What if someone calls abortProcess(), thereby deallocating abortFlag, at the exact same time that longProcess is trying to perform the check to see if abortFlag has been deallocated yet? Is checking myWeakReference == nil a thread-safe operation?
You can create the dispatched task as a DispatchWorkItem, which already has a thread-safe isCancelled property. You can then dispatch that DispatchWorkItem to a queue and have it periodically check its isCancelled. You can then just cancel the dispatched item at whatever point you want to stop it.
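A minimal sketch of that DispatchWorkItem approach might look like the following. startLongProcess and its step parameter are hypothetical names used only for illustration; the block reaches its own work item through a weak local variable so that the item and its block do not retain each other.

```swift
import Foundation

// Hypothetical helper: runs `step` repeatedly on `queue` until the returned item is cancelled.
func startLongProcess(on queue: DispatchQueue, step: @escaping () -> Void) -> DispatchWorkItem {
    // Weak, so the block does not keep its own work item alive (no retain cycle).
    weak var weakItem: DispatchWorkItem?

    let item = DispatchWorkItem {
        // The item is alive while it is executing; stop as soon as it is cancelled.
        while let current = weakItem, !current.isCancelled {
            step()
        }
    }
    weakItem = item

    queue.async(execute: item)
    return item
}

let workQueue = DispatchQueue(label: "Example")
let process = startLongProcess(on: workQueue) {
    Thread.sleep(forTimeInterval: 0.01) // stand-in for one chunk of real work
}

Thread.sleep(forTimeInterval: 0.05)
process.cancel()       // safe to call from any thread
workQueue.sync { }     // serial queue: returns once the cancelled loop has exited
```

Note that cancel() does not interrupt a step that is already running; the loop only notices the flag the next time it checks isCancelled.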
Alternatively, when trying to wrap some work in an object, we’d often use Operation, instead, which encapsulates the task in its own class quite nicely:
class SomeLongOperation: Operation {
override func main() {
// Runs a long loop and periodically checks `isCancelled`
while !isCancelled {
Thread.sleep(forTimeInterval: 0.1)
print("tick")
}
}
}
And to create queue and add the operation to that queue:
let queue = OperationQueue()
let operation = SomeLongOperation()
queue.addOperation(operation)
And to cancel the operation:
operation.cancel()
Or
queue.cancelAllOperations()
Bottom line, whether you use Operation (which is, frankly, the “go-to” solution for wrapping some task in its own object) or roll-your-own with DispatchWorkItem, the idea is the same, namely that you don’t need to have your own state property to detect cancellation of the task. Both dispatch queues and operation queues already have nice mechanisms to simplify this process for you.
I saw this bug (Weak properties are not thread safe when reading SR-192) indicating that weak reference reads weren't thread safe, but it has been fixed, which suggests that (absent any bugs in the runtime), weak reference reads are intended to be thread safe.
Also interesting: Friday Q&A 2017-09-22: Swift 4 Weak References by Mike Ash

How can I create a reference cycle using dispatchQueues?

I feel that I've always misunderstood when reference cycles are created. I used to think that almost anywhere you have a block and the compiler forces you to write self., it's a sign that you're creating a reference cycle and need to use [weak self] in.
But the following setup doesn't create a reference cycle.
import Foundation
import PlaygroundSupport
PlaygroundPage.current.needsIndefiniteExecution = true
class UsingQueue {
var property : Int = 5
var queue : DispatchQueue? = DispatchQueue(label: "myQueue")
func enqueue3() {
print("enqueued")
queue?.asyncAfter(deadline: .now() + 3) {
print(self.property)
}
}
deinit {
print("UsingQueue deinited")
}
}
var u : UsingQueue? = UsingQueue()
u?.enqueue3()
u = nil
The block only retains self for 3 seconds. Then releases it. If I use async instead of asyncAfter then it's almost immediate.
From what I understand the setup here is:
self ---> queue
self <--- block
The queue is merely a shell/wrapper for the block. Which is why even if I nil the queue, the block will continue its execution. They’re independent.
So is there any setup that only uses queues and creates reference cycles?
From what I understand [weak self] is only to be used for reasons other than reference cycles ie to control the flow of the block. e.g.
Do you want to retain the object and run your block and then release it? A real scenario would be to finish this transaction even though the view has been removed from the screen...
Or do you want to use [weak self] in so that you can exit early if your object has been deallocated? e.g. some purely UI work, like stopping a loading spinner, is no longer needed
FWIW I understand that if I use a closure then things are different ie if I do:
import PlaygroundSupport
import Foundation
PlaygroundPage.current.needsIndefiniteExecution = true
class UsingClosure {
var property : Int = 5
var closure : (() -> Void)?
func closing() {
closure = {
print(self.property)
}
}
func execute() {
closure!()
}
func release() {
closure = nil
}
deinit {
print("UsingClosure deinited")
}
}
var cc : UsingClosure? = UsingClosure()
cc?.closing()
cc?.execute()
cc?.release() // Either this needs to be called or I need to use [weak self] for the closure otherwise there is a reference cycle
cc = nil
In the closure example the setup is more like:
self ----> block
self <--- block
Hence it's a reference cycle and doesn't deallocate unless I set the capturing block to nil.
EDIT:
class C {
var item: DispatchWorkItem!
var name: String = "Alpha"
func assignItem() {
item = DispatchWorkItem { // Oops!
print(self.name)
}
}
func execute() {
DispatchQueue.main.asyncAfter(deadline: .now() + 1, execute: item)
}
deinit {
print("deinit hit!")
}
}
With the following code, I was able to create a leak ie in Xcode's memory graph I see a cycle, not a straight line. I get the purple indicators. I think this setup is very much like how a stored closure creates leaks. And this is different from your two examples, where execution is never finished. In this example execution is finished, but because of the references it remains in memory.
I think the reference is something like this:
┌─────────┐─────────────self.item──────────────▶┌────────┐
│ self │ │workItem│
└─────────┘◀︎────item = DispatchWorkItem {...}───└────────┘
You say:
From what I understand the setup here is:
self ---> queue
self <--- block
The queue is merely a shell/wrapper for the block. Which is why even if I nil the queue, the block will continue its execution. They’re independent.
The fact that self happens to have a strong reference to the queue is inconsequential. A better way of thinking about it is that GCD itself keeps a reference to all dispatch queues on which there is anything queued. (It’s analogous to a custom URLSession instance that won’t be deallocated until all tasks on that session are done.)
So, GCD keeps a reference to any queue with dispatched tasks. The queue keeps a strong reference to the dispatched blocks/items. Each queued block keeps a strong reference to any reference types it captures. When the dispatched task finishes, it releases its strong references to any captured reference types and is removed from the queue (unless you keep your own reference to it elsewhere), generally thereby resolving any strong reference cycles.
Setting that aside, where the absence of [weak self] can get you into trouble is where GCD keeps a reference to the block for some reason, such as dispatch sources. The classic example is the repeating timer:
class Ticker {
private var timer: DispatchSourceTimer?
func startTicker() {
let queue = DispatchQueue(label: Bundle.main.bundleIdentifier! + ".ticker")
timer = DispatchSource.makeTimerSource(queue: queue)
timer!.schedule(deadline: .now(), repeating: 1)
timer!.setEventHandler { // whoops; missing `[weak self]`
self.tick()
}
timer!.resume()
}
func tick() { ... }
}
Even if the view controller in which I started the above timer is dismissed, GCD keeps firing this timer and Ticker won’t be released. As the “Debug Memory Graph” feature shows, the block, created in the startTicker routine, is keeping a persistent strong reference to the Ticker object.
This is obviously resolved if I use [weak self] in that block used as the event handler for the timer scheduled on that dispatch queue.
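For comparison, here is a sketch of the same kind of timer with the weak capture in place. The queue label, the deinit that cancels the timer, and the print in tick() are assumptions added for illustration rather than part of the class above.

```swift
import Foundation

final class Ticker {
    private var timer: DispatchSourceTimer?

    func startTicker() {
        let queue = DispatchQueue(label: "ticker") // hypothetical label
        let timer = DispatchSource.makeTimerSource(queue: queue)
        timer.schedule(deadline: .now(), repeating: 1)
        timer.setEventHandler { [weak self] in
            // The handler no longer keeps `Ticker` alive; if it is gone, nothing happens.
            self?.tick()
        }
        timer.resume()
        self.timer = timer
    }

    func tick() {
        print("tick")
    }

    deinit {
        // Assumption: stop the timer once the owner goes away.
        timer?.cancel()
    }
}

var ticker: Ticker? = Ticker()
ticker?.startTicker()
weak var observedTicker = ticker

Thread.sleep(forTimeInterval: 0.1) // let the first tick finish
ticker = nil                       // last strong reference dropped; deinit cancels the timer
```

Because the handler holds self only weakly, dropping the last outside reference lets Ticker deallocate, and the deinit gets a chance to cancel the timer.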
Other scenarios include a slow (or indefinite length) dispatched task, where you want to cancel it (e.g., in the deinit):
class Calculator {
private var item: DispatchWorkItem!
deinit {
item?.cancel()
item = nil
}
func startCalculation() {
let queue = DispatchQueue(label: Bundle.main.bundleIdentifier! + ".calcs")
item = DispatchWorkItem { // whoops; missing `[weak self]`
while true {
if self.item?.isCancelled ?? true { break }
self.calculateNextDataPoint()
}
self.item = nil
}
queue.async(execute: item)
}
func calculateNextDataPoint() {
// some intense calculation here
}
}
All of that having been said, in the vast majority of GCD use-cases, the choice of [weak self] is not one of strong reference cycles, but rather merely whether we mind if strong reference to self persists until the task is done or not.
If we’re just going to update the UI when the task is done, there’s no need to keep the view controller and its views in the hierarchy waiting for some UI update if the view controller has been dismissed.
If we need to update the data store when the task is done, then we definitely don’t want to use [weak self] if we want to make sure that update happens.
Frequently, the dispatched tasks aren’t consequential enough to worry about the lifespan of self. For example, you might have a URLSession completion handler dispatch UI update back to the main queue when the request is done. Sure, we theoretically would want [weak self] (as there’s no reason to keep the view hierarchy around for a view controller that’s been dismissed), but then again that adds noise to our code, often with little material benefit.
Unrelated, but playgrounds are a horrible place to test memory behavior because they have their own idiosyncrasies. It’s much better to do it in an actual app. Plus, in an actual app, you then have the “Debug Memory Graph” feature where you can see the actual strong references. See https://stackoverflow.com/a/30993476/1271826.