Concurrency issue when dividing up recursive task

I am dealing with geometric types that can be subdivided into instances of themselves. This capability is expressed by the following protocol:
protocol Subdividable {
func subdivision(using block: (Self) -> ())
}
An implementation of Subdividable might look something like this:
struct Tile { ... }
extension Tile: Subdividable {
func subdivision(using block: (Tile) -> ()) {
if condition() {
let (a, b) = createSubTiles()
block(a)
block(b)
} else {
let (a, b, c) = createDifferentSubTiles()
block(a)
block(b)
block(c)
}
}
}
The number of instances a given type subdivides into is not fixed and may or may not depend on properties of the subdivided instance. Once created, every new instance is passed to block.
To create the final result, I need to apply such subdivisions recursively a given number of times:
extension Subdividable {
func subdivision(level: Int, using block: (Self) -> ()) {
switch level {
case 0:
return
case 1:
subdivision(using: block)
default:
precondition(level > 1)
subdivision { value in
value.subdivision(level: level - 1, using: block)
}
}
}
}
Since it generally can't be predicted at the outset how many times block will be called, it is also necessary to keep track of an index (e.g. to store the results in a buffer):
extension Subdividable {
@discardableResult func subdivision(level: Int, using block: (Int, Self) -> ()) -> Int {
var result = 0
subdivision(level: level) { value in
block(result, value)
result += 1
}
return result
}
}
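To make the behaviour of the recursive and indexed variants concrete, here is a minimal sketch that repeats the protocol and extensions from above and adds a hypothetical conforming type, Interval, which always splits into two halves. Interval and the buffer variable are assumptions made for illustration; they are not part of the original code.

```swift
protocol Subdividable {
    func subdivision(using block: (Self) -> ())
}

extension Subdividable {
    // Recursive subdivision, as in the question.
    func subdivision(level: Int, using block: (Self) -> ()) {
        switch level {
        case 0:
            return
        case 1:
            subdivision(using: block)
        default:
            precondition(level > 1)
            subdivision { value in
                value.subdivision(level: level - 1, using: block)
            }
        }
    }

    // Indexed variant: passes a running index and returns the total count.
    @discardableResult
    func subdivision(level: Int, using block: (Int, Self) -> ()) -> Int {
        var result = 0
        subdivision(level: level) { value in
            block(result, value)
            result += 1
        }
        return result
    }
}

// Hypothetical conforming type: an interval that splits into two halves.
struct Interval: Subdividable {
    var lower: Double
    var upper: Double

    func subdivision(using block: (Interval) -> ()) {
        let mid = (lower + upper) / 2
        block(Interval(lower: lower, upper: mid))
        block(Interval(lower: mid, upper: upper))
    }
}

// Collect every piece produced by three levels of subdivision.
var buffer: [Interval] = []
let count = Interval(lower: 0, upper: 1).subdivision(level: 3) { index, piece in
    buffer.append(piece)
}
print(count)  // 8
```

Because each Interval yields two pieces, three levels produce 2 × 2 × 2 = 8 calls to block. Types that subdivide into a varying number of pieces would make the count unpredictable, which is why the indexed variant returns it.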
So far, so good. The resulting geometries can be very complex and may consist of several million elements, making performance a concern. That's why I tried to divide the recursive subdivision into multiple concurrent tasks. However, this version does not reliably accumulate the results from each of them:
extension Subdividable {
@discardableResult func subdivision(level: (lhs: Int, rhs: Int), using block: @escaping (Self) -> (Int, Self) -> ()) -> Int {
var result = 0
let (queue, group) = (DispatchQueue.global(), DispatchGroup())
subdivision(level: level.lhs) { value in
queue.async(group: group) {
let count = value.subdivision(level: level.rhs, using: block(value))
// not all barrier tasks will have finished after group.wait()
queue.async(group: group, flags: .barrier) {
result += count // accumulate result
}
}
}
group.wait()
return result
}
}
Why does waiting on the group not guarantee the inner async calls to be finished?
I have also tried implementing the above function with a custom serial queue to push the accumulation block onto, and that works, always giving the correct result. I just don't understand why that's not equivalent to pushing it onto the concurrent queue as a barrier task.
What am I missing here? Is there a way to implement this without requiring a custom queue?
Edit No. 1
The issue doesn't seem to be the group not waiting properly but rather the global queue dropping items. Changing queue from DispatchQueue.global() to DispatchQueue(label: "subdivision", attributes: .concurrent) results in correct behavior.
I also ended up wrapping the entire function body in withoutActuallyEscaping(block) { block in ... } so that the input closure doesn't need to be @escaping. This did not crash when using the global queue, which it would have if there actually were tasks still executing after group.wait(). To me this reads as the global queue silently dropping items for whatever reason, right?
How can this be and why does a custom concurrent queue not show the same behavior?
Edit No. 2
According to this answer barriers just aren't supported on global queues. They seem to be happy to accept them but then ignore the flag, running the tasks concurrently, resulting in undefined behavior.
I can't find this being properly documented anywhere, does anybody have some pointers?
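One possible way to keep the global queue for the work while avoiding both barriers and a custom queue is to guard the accumulator with a lock. The following is a minimal sketch under that assumption; Counter and concurrentTotal are hypothetical names, and the multiplication merely stands in for the expensive value.subdivision(level:using:) call.

```swift
import Foundation

// Hypothetical lock-protected accumulator. It is marked @unchecked Sendable
// because all access to `total` goes through the lock.
final class Counter: @unchecked Sendable {
    private let lock = NSLock()
    private var total = 0

    func add(_ amount: Int) {
        lock.lock()
        defer { lock.unlock() }
        total += amount
    }

    var value: Int {
        lock.lock()
        defer { lock.unlock() }
        return total
    }
}

func concurrentTotal(of items: [Int]) -> Int {
    let counter = Counter()
    let group = DispatchGroup()
    for item in items {
        DispatchQueue.global().async(group: group) {
            let count = item * 2   // stand-in for the per-task subdivision count
            counter.add(count)     // accumulation finishes before the task leaves the group
        }
    }
    group.wait()
    return counter.value
}

print(concurrentTotal(of: Array(1...100)))  // 10100
```

Since the accumulation happens inside the task itself, group.wait() cannot return before every count has been added. Whether a lock or a private serial queue is preferable here is a design choice; both serialize only the short accumulation step.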

Related

How to achieve thread safety for removeAtIndex when array has concurrent reads in Swift?

I was looking at this answer that provides code for a thread safe array with concurrent reads. As @tombardey points out in the comments, the code (relevant snippet below) is not completely safe:
public func removeAtIndex(index: Int) {
self.accessQueue.async(flags:.barrier) {
self.array.remove(at: index)
}
}
public var count: Int {
var count = 0
self.accessQueue.sync {
count = self.array.count
}
return count
}
...Say the synchronized array has
one element, wouldn't this fail? if synchronizedArray.count == 1 {
synchronizedArray.remove(at: 0) } It's a race condition, say two
threads execute the statement. Both read a count of 1 concurrently,
both enqueue a write block concurrently. The write blocks execute
sequentially, the second one will fail... (cont.)
@Rob replies:
@tombardey - You are absolutely right that this level of
synchronization (at the property/method level) is frequently
insufficient to achieve true thread-safety in broader applications.
Your example is easily solved (by adding an method that dispatches
block to the queue), but there are others that aren't (e.g.
"synchronized" array simultaneously used by a UITableViewDataSource
and mutated by some background operation). In those cases, you have to
implement your own higher-level synchronization. But the above
technique is nonetheless very useful in certain, highly constrained
situations.
I am struggling to work out what @Rob means by "Your example is easily solved (by adding an method that dispatches block to the queue)". I would be interested to see an example implementation of this method (or any other) technique to solve the problem.

This is a very good example of why "atomic" mutable operations on individual properties are rarely sufficient, and are dangerous to add without a great deal of care.
The fundamental problem in this example is that any time the array is modified, it invalidates existing indices. In order to safely use an index, you must ensure that the entire "fetch an index, use the index" operation is atomic. You can't just ensure that each piece is atomic. There is no safe way to write removeAtIndex in isolation, because there is no safe way to acquire the index you pass. Between the time you fetch the index, and the time you use it, the array may have been changed in arbitrary ways.
The point is that there's no such thing as a "thread-safe (mutable) array" that you can use just like a normal array and not have to worry about concurrency issues. A "thread-safe" mutable array cannot return or accept indices, because its indices aren't stable. Exactly what data structure is appropriate depends on the problem you're trying to solve. There's no one answer here.
In most cases the answer is "less concurrency." Rather than trying to manage concurrent access to individual data structures, think about larger-scoped "units of work" that carry all their own data and have exclusive access to it. Put those larger units of work onto queues. (In many cases, even this is overkill. You'd be shocked how often adding concurrency makes things slower if you don't design it very carefully.) For more recommendations, see Modernizing Grand Central Dispatch Usage.
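As an illustration of "units of work that carry all their own data", here is a rough sketch in which each worker receives its own chunk of the input and writes only to its own result slot, so no shared mutable collection needs synchronizing. The function name parallelSquareSum and the chunking scheme are assumptions made for this example only.

```swift
import Dispatch

/// Sums the squares of `values` by handing each worker its own chunk.
func parallelSquareSum(_ values: [Int], chunkCount: Int) -> Int {
    precondition(chunkCount > 0, "chunkCount must be positive")
    let chunkSize = max(1, (values.count + chunkCount - 1) / chunkCount)

    // Each chunk is an independent copy of its slice: the unit of work owns its data.
    let chunks: [[Int]] = stride(from: 0, to: values.count, by: chunkSize).map { start in
        Array(values[start..<min(start + chunkSize, values.count)])
    }

    // One result slot per chunk; every worker writes only to its own slot.
    var partialSums = [Int](repeating: 0, count: chunks.count)
    partialSums.withUnsafeMutableBufferPointer { slots in
        DispatchQueue.concurrentPerform(iterations: chunks.count) { index in
            slots[index] = chunks[index].reduce(0) { $0 + $1 * $1 }
        }
    }

    // Combine the partial results on the calling thread after all workers finish.
    return partialSums.reduce(0, +)
}

print(parallelSquareSum(Array(1...10), chunkCount: 3))  // 385
```

Whether splitting the work like this is actually faster depends on how expensive each unit is; for tiny chunks the dispatch overhead can easily dominate.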

You said:
I am struggling to work out what @Rob means by “Your example is easily solved (by adding [a] method that dispatches block to the queue)”. I would be interested to see an example implementation of this method (or any other) technique to solve the problem.
Let’s expand upon the example that I posted in response to your other question (see point 3 in this answer), adding a few more Array methods:
class SynchronizedArray<T> {
private var array: [T]
private let accessQueue = DispatchQueue(label: "com.domain.app.reader-writer", attributes: .concurrent)
init(_ array: [T] = []) {
self.array = array
}
subscript(index: Int) -> T {
get { reader { $0[index] } }
set { writer { $0[index] = newValue } }
}
var count: Int {
reader { $0.count }
}
func append(newElement: T) {
writer { $0.append(newElement) }
}
func remove(at index: Int) {
writer { $0.remove(at: index) }
}
func reader<U>(_ block: ([T]) throws -> U) rethrows -> U {
try accessQueue.sync { try block(array) }
}
func writer(_ block: @escaping (inout [T]) -> Void) {
accessQueue.async(flags: .barrier) { block(&self.array) }
}
}
So, let’s imagine that you wanted to delete an item if there was only one item in the array. Consider:
let numbers = SynchronizedArray([42])
...
if numbers.count == 1 {
numbers.remove(at: 0)
}
That looks innocent enough, but it is not thread-safe. You could have a race condition if other threads are inserting or removing values. E.g., if some other thread appended a value between the time you tested the count and when you went to remove the value.
You can fix that by wrapping the whole operation (the if test and the consequent removal) in a single block that is synchronized. Thus you could:
numbers.writer { array in
if array.count == 1 {
array.remove(at: 0)
}
}
This writer method (in this reader-writer-based synchronization) is an example of what I meant by a “method that dispatches block to the queue”.
Now, clearly, you could also give your SynchronizedArray its own method that did this for you, e.g.:
func safelyRemove(at index: Int) {
writer { array in
if index < array.count {
array.remove(at: index)
}
}
}
Then you can do:
numbers.safelyRemove(at: index)
... and that is thread-safe, but still enjoys the performance benefits of reader-writer synchronization.
But the general idea is that when dealing with a thread-safe collection, you invariably have a series of tasks that you will want to synchronize together, at a higher level of abstraction. By exposing the synchronization methods of reader and writer, you have a simple, generalized mechanism for doing that.
All of that having been said, as others have said, the best way to write thread-safe code is to avoid concurrent access altogether. But if you must make a mutable object thread-safe, then it is the responsibility of the caller to identify the series of tasks that must be performed as a single, synchronized operation.

I see several problems with the posted code and example:
Function removeAtIndex is not checking whether it can actually remove at provided index. So it should be changed to
public func removeAtIndex(index: Int) {
// Check if it even makes sense to schedule an update
// This is optional, but IMO just a better practice
guard count > index else { return }
self.accessQueue.async(flags: .barrier) {
// Check again before removing to make sure array didn't change
// Here we can actually check the size of the array, since other threads are blocked
guard self.array.count > index else { return }
self.array.remove(at: index)
}
}
Using a thread-safe class also implies that a single operation both checks and acts on the item that is supposed to be thread-safe. So if you are checking the array size and then removing an element in a separate call, you are breaking that thread-safety envelope; that is not a correct use of the class. The particular case if synchronizedArray.count == 1 { synchronizedArray.remove(at: 0) } is resolved by the adjustments to the function above (you don't need to check count anymore, as the function already does that). But if you still needed a function that both verifies the count and then removes an item, you would have to create a function in your thread-safe class that does both operations with no possibility of other threads modifying the array in between. You may even need two functions: synchronizedArray.getCountAndRemove (get count, then remove) and synchronizedArray.removeAndGetCount (remove, then get count).
public func getCountAndRemoveAtIndex(index: Int) -> Int {
var currentCount = count
guard currentCount > index else { return currentCount }
// Has to run synchronously to ensure the value is returned,
// and as a barrier because the block mutates the array
self.accessQueue.sync(flags: .barrier) {
// Re-check inside the barrier block; the array may have changed since the first check
currentCount = self.array.count
guard currentCount > index else { return }
self.array.remove(at: index)
}
return currentCount
}
In general, removing an item at an index from an array that is used from multiple threads is of little use: you can't even be sure what you are removing. Maybe there are some cases where it would make sense, but usually it makes more sense to either remove by some logic (e.g. a specific value), or have a function that returns the value of the item it removes (e.g. func getAndRemoveAtIndex(index: Int) -> T).
Always test every function and combinations of them. For example, if the original poster had tested removal like this:
let array = SynchronizedArray<Int>()
array.append(newElement: 1)
array.append(newElement: 2)
array.append(newElement: 3)
DispatchQueue.concurrentPerform(iterations: 5) {_ in
array.removeAtIndex(index: 0)
}
they would have gotten a Fatal error: Index out of range: file Swift/Array.swift, line 1298 in 2 out of 5 threads, so it would have been clear that the original implementation of this function is not right. Try the same test with the function I posted above, and you will see the difference.
BTW we are only talking about removeAtIndex, but subscript has a similar problem as well. But interestingly first() is implemented correctly.
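The idea of a removal function that returns the value it removed can be sketched as follows. This is a reduced, hypothetical version of the synchronized array (only count and one removal method), not the class from the original answer; removeAndReturn(at:) is an assumed name. The check and the removal run inside one barrier block, so no other thread can change the array in between.

```swift
import Foundation

final class SynchronizedArray<T>: @unchecked Sendable {
    private var array: [T]
    private let accessQueue = DispatchQueue(label: "example.synchronized-array",
                                            attributes: .concurrent)

    init(_ array: [T] = []) {
        self.array = array
    }

    var count: Int {
        accessQueue.sync { array.count }
    }

    /// Removes and returns the element at `index`, or returns nil if the index
    /// is no longer valid by the time the barrier block runs.
    func removeAndReturn(at index: Int) -> T? {
        accessQueue.sync(flags: .barrier) { () -> T? in
            guard array.indices.contains(index) else { return nil }
            return array.remove(at: index)
        }
    }
}

// Five concurrent removals against three elements: two of them find nothing to remove.
let numbers = SynchronizedArray([1, 2, 3])
var removed = [Int?](repeating: nil, count: 5)
removed.withUnsafeMutableBufferPointer { slots in
    DispatchQueue.concurrentPerform(iterations: 5) { iteration in
        slots[iteration] = numbers.removeAndReturn(at: 0)
    }
}
let removedValues = removed.compactMap { $0 }.sorted()
print(removedValues)  // [1, 2, 3]
print(numbers.count)  // 0
```

Which iteration receives which value varies from run to run, but each element is handed out exactly once and the surplus calls return nil instead of trapping.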

Creating and consuming a cursor with Vapor 3

This might be a can of worms, I'll do my best to describe the issue. We have a long running data processing job. Our database of actions is added to nightly and the outstanding actions are processed. It takes about 15 minutes to process nightly actions. In Vapor 2 we utilised a lot of raw queries to create a PostgreSQL cursor and loop through it until it was empty.
For the time being, we run the processing via a command line parameter. In future we wish to have it run as part of the main server so that progress can be checked while processing is being performed.
func run(using context: CommandContext) throws -> Future<Void> {
let table = "\"RecRegAction\""
let cursorName = "\"action_cursor\""
let chunkSize = 10_000
return context.container.withNewConnection(to: .psql) { connection in
return PostgreSQLDatabase.transactionExecute({ connection -> Future<Int> in
return connection.simpleQuery("DECLARE \(cursorName) CURSOR FOR SELECT * FROM \(table)").map { result in
var totalResults = 0
var finished : Bool = false
while !finished {
let results = try connection.raw("FETCH \(chunkSize) FROM \(cursorName)").all(decoding: RecRegAction.self).wait()
if results.count > 0 {
totalResults += results.count
print(totalResults)
// Obviously we do our processing here
}
else {
finished = true
}
}
return totalResults
}
}, on: connection)
}.transform(to: ())
}
Now this doesn't work because I'm calling wait() and I get the error "Precondition failed: wait() must not be called when on the EventLoop" which is fair enough. One of the issues I face is that I have no idea how you even get off the main event loop to run things like this on a background thread. I am aware of BlockingIOThreadPool, but that still seems to operate on the same EventLoop and still causes the error. While I'm able to theorise more and more complicated ways to achieve this, I'm hoping I'm missing an elegant solution which perhaps somebody with better knowledge of SwiftNIO and Fluent could help out with.
Edit: To be clear, the goal of this is obviously not to total up the number of actions in the database. The goal is to use the cursor to process every action synchronously. As I read the results in, I detect changes in the actions and then throw batches of them out to processing threads. When all the threads are busy, I don't start reading from the cursor again until they complete.
There are a LOT of these actions, up to 45 million in a single run. Aggregating promises and recursion didn't seem to be a great idea and when I tried it, just for the sake of it, the server hung.
This is a processing intensive task that can run for days on a single thread, so I'm not concerned about creating new threads. The issue is that I cannot work out how I can use the wait() function inside a Command as I need a container to create the database connection and the only one I have access to is context.container. Calling wait() on this leads to the above error.
TIA

Ok, so as you know, the problem lies in these lines:
while ... {
...
try connection.raw("...").all(decoding: RecRegAction.self).wait()
...
}
you want to wait for a number of results and therefore you use a while loop and .wait() for all the intermediate results. Essentially, this is turning asynchronous code into synchronous code on the event loop. That is likely leading to deadlocks and will for sure stall other connections which is why SwiftNIO tries to detect that and give you that error. I won't go into the details why it's stalling other connections or why this is likely to lead to deadlocks in this answer.
Let's see what options we have to fix this issue:
1. As you say, we could just have this .wait() on another thread that isn't one of the event loop threads. For this, any non-EventLoop thread would do: either a DispatchQueue or the BlockingIOThreadPool (which does not run on an EventLoop).
2. We could rewrite your code to be asynchronous.
Both solutions will work but (1) is really not advisable as you would burn a whole (kernel) thread just to wait for the results. And both Dispatch and BlockingIOThreadPool have a finite number of threads they're willing to spawn so if you do that often enough you might run out of threads so it'll take even longer.
So let's look into how we can call an asynchronous function multiple times whilst accumulating the intermediate results, and then, once all of them have been accumulated, continue with the complete set.
To make things easier let's look at a function that is very similar to yours. We assume this function to be provided just like in your code
/// delivers partial results (integers) and `nil` if no further elements are available
func deliverPartialResult() -> EventLoopFuture<Int?> {
...
}
what we would like now is a new function
func deliverFullResult() -> EventLoopFuture<[Int]>
please note how the deliverPartialResult returns one integer each time and deliverFullResult delivers an array of integers (ie. all the integers). Ok, so how do we write deliverFullResult without calling deliverPartialResult().wait()?
What about this:
func accumulateResults(eventLoop: EventLoop,
partialResultsSoFar: [Int],
getPartial: @escaping () -> EventLoopFuture<Int?>) -> EventLoopFuture<[Int]> {
// let's run getPartial once
return getPartial().then { partialResult in
// we got a partial result, let's check what it is
if let partialResult = partialResult {
// another intermediate results, let's accumulate and call getPartial again
return accumulateResults(eventLoop: eventLoop,
partialResultsSoFar: partialResultsSoFar + [partialResult],
getPartial: getPartial)
} else {
// we've got all the partial results, yay, let's fulfill the overall future
return eventLoop.newSucceededFuture(result: partialResultsSoFar)
}
}
}
Given accumulateResults, implementing deliverFullResult is not too hard anymore:
func deliverFullResult() -> EventLoopFuture<[Int]> {
return accumulateResults(eventLoop: myCurrentEventLoop,
partialResultsSoFar: [],
getPartial: deliverPartialResult)
}
But let's look more into what accumulateResults does:
1. It invokes getPartial once; then, when it calls back, it
2. checks whether we have
   - a partial result, in which case we remember it alongside the other partialResultsSoFar and go back to (1), or
   - nil, which means partialResultsSoFar is all we get, and we return a new succeeded future with everything we have collected so far.
that's already it really. What we did here is to turn the synchronous loop into asynchronous recursion.
Ok, we looked at a lot of code but how does this relate to your function now?
Believe it or not but this should actually work (untested):
accumulateResults(eventLoop: el, partialResultsSoFar: []) {
connection.raw("FETCH \(chunkSize) FROM \(cursorName)")
.all(decoding: RecRegAction.self)
.map { results -> Int? in
if results.count > 0 {
return results.count
} else {
return nil
}
}
}.map { allResults in
return allResults.reduce(0, +)
}
The result of all this will be an EventLoopFuture<Int> which carries the sum of all the intermediate result.count.
Sure, we first collect all your counts into an array to then sum it up (allResults.reduce(0, +)) at the end which is a bit wasteful but also not the end of the world. I left it this way because that makes accumulateResults be usable in other cases where you want to accumulate partial results in an array.
Now one last thing, a real accumulateResults function would probably be generic over the element type and also we can eliminate the partialResultsSoFar parameter for the outer function. What about this?
func accumulateResults<T>(eventLoop: EventLoop,
getPartial: @escaping () -> EventLoopFuture<T?>) -> EventLoopFuture<[T]> {
// this is an inner function just to hide it from the outside which carries the accumulator
func accumulateResults<T>(eventLoop: EventLoop,
partialResultsSoFar: [T] /* our accumulator */,
getPartial: @escaping () -> EventLoopFuture<T?>) -> EventLoopFuture<[T]> {
// let's run getPartial once
return getPartial().then { partialResult in
// we got a partial result, let's check what it is
if let partialResult = partialResult {
// another intermediate results, let's accumulate and call getPartial again
return accumulateResults(eventLoop: eventLoop,
partialResultsSoFar: partialResultsSoFar + [partialResult],
getPartial: getPartial)
} else {
// we've got all the partial results, yay, let's fulfill the overall future
return eventLoop.newSucceededFuture(result: partialResultsSoFar)
}
}
}
return accumulateResults(eventLoop: eventLoop, partialResultsSoFar: [], getPartial: getPartial)
}
EDIT: After your edit your question suggests that you do not actually want to accumulate the intermediate results. So my guess is that instead, you want to do some processing after every intermediate result has been received. If that's what you want to do, maybe try this:
func processPartialResults<T, V>(eventLoop: EventLoop,
process: @escaping (T) -> EventLoopFuture<V>,
getPartial: @escaping () -> EventLoopFuture<T?>) -> EventLoopFuture<V?> {
func processPartialResults<T, V>(eventLoop: EventLoop,
soFar: V?,
process: @escaping (T) -> EventLoopFuture<V>,
getPartial: @escaping () -> EventLoopFuture<T?>) -> EventLoopFuture<V?> {
// let's run getPartial once
return getPartial().then { partialResult in
// we got a partial result, let's check what it is
if let partialResult = partialResult {
// another intermediate results, let's call the process function and move on
return process(partialResult).then { v in
return processPartialResults(eventLoop: eventLoop, soFar: v, process: process, getPartial: getPartial)
}
} else {
// we've got all the partial results, yay, let's fulfill the overall future
return eventLoop.newSucceededFuture(result: soFar)
}
}
}
return processPartialResults(eventLoop: eventLoop, soFar: nil, process: process, getPartial: getPartial)
}
This will (as before) run getPartial until it returns nil but instead of accumulating all of getPartial's results, it calls process which gets the partial result and can do some further processing. The next getPartial call will happen when the EventLoopFuture process returns is fulfilled.
Is that closer to what you would like?
Notes: I used SwiftNIO's EventLoopFuture type here, in Vapor you would just use Future instead but the remainder of the code should be the same.
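The same "loop turned into asynchronous recursion" idea can be tried out without SwiftNIO or Vapor. The sketch below swaps futures for plain completion handlers on a Dispatch queue, so it is an analogy rather than the code above: ChunkSource is a hypothetical stand-in for the database cursor, and processAll, ResultBox and the chunk contents are likewise assumptions for illustration.

```swift
import Dispatch

// Hypothetical stand-in for a cursor: hands out one chunk per call on a
// background queue and delivers nil once nothing is left.
final class ChunkSource: @unchecked Sendable {
    private let queue = DispatchQueue(label: "example.chunk-source")
    private var remaining: [[Int]]

    init(chunks: [[Int]]) {
        remaining = chunks
    }

    func fetchNext(_ completion: @escaping ([Int]?) -> Void) {
        queue.async {
            let next: [Int]? = self.remaining.isEmpty ? nil : self.remaining.removeFirst()
            completion(next)
        }
    }
}

// Asynchronous recursion: fetch a chunk, process it, and only then fetch the next one.
func processAll(from source: ChunkSource,
                processedSoFar: Int = 0,
                process: @escaping ([Int]) -> Int,
                completion: @escaping (Int) -> Void) {
    source.fetchNext { chunk in
        guard let chunk = chunk else {
            completion(processedSoFar)  // nil: the source is exhausted
            return
        }
        let processed = process(chunk)
        processAll(from: source,
                   processedSoFar: processedSoFar + processed,
                   process: process,
                   completion: completion)
    }
}

// Small holder so the result can be read after the asynchronous work finishes.
final class ResultBox: @unchecked Sendable {
    var total = 0
}

let box = ResultBox()
let finished = DispatchSemaphore(value: 0)
processAll(from: ChunkSource(chunks: [[1, 2, 3], [4, 5], [6]]),
           process: { chunk in chunk.count }) { total in
    box.total = total
    finished.signal()
}
finished.wait()   // blocks the main thread only so a command-line script does not exit early
print(box.total)  // 6
```

Each recursive call is scheduled through a fresh async dispatch, so the call stack does not grow with the number of chunks. Only one chunk is held at a time, which matters when the total runs into millions of rows.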

Here's the generic solution, rewritten for NIO 2.16/Vapor 4, and as an extension to EventLoop
extension EventLoop {
func accumulateResults<T>(getPartial: @escaping () -> EventLoopFuture<T?>) -> EventLoopFuture<[T]> {
// this is an inner function just to hide it from the outside which carries the accumulator
func accumulateResults<T>(partialResultsSoFar: [T] /* our accumulator */,
getPartial: @escaping () -> EventLoopFuture<T?>) -> EventLoopFuture<[T]> {
// let's run getPartial once
return getPartial().flatMap { partialResult in
// we got a partial result, let's check what it is
if let partialResult = partialResult {
// another intermediate results, let's accumulate and call getPartial again
return accumulateResults(partialResultsSoFar: partialResultsSoFar + [partialResult],
getPartial: getPartial)
} else {
// we've got all the partial results, yay, let's fulfill the overall future
return self.makeSucceededFuture(partialResultsSoFar)
}
}
}
return accumulateResults(partialResultsSoFar: [], getPartial: getPartial)
}
}

Properly pass Swift closure into another thread

How do I properly (from multithreading point of view) pass a closure to another thread?
Consider a situation:
class NetManager {
...
var processingClosure : (Data, DispatchQueue, @escaping (Data?) -> ()) -> () = {
respData, complQueue, complClosure in
let resultData = // process respData according to some logic and get resultData
complQueue.async {
complClosure(resultData)
}
// PLEASE NOTE that there is no captured variables in this closure
}
...
func requestData1(..., complClosure) {
// this is main thread context
// make request to endpoint 1 somehow and process result in separate processing queue
...
let procClosure = self.processingClosure
// processingQueue is NOT main queue and not completion queue
request.processingQueue.async {
// Question HERE:
procClosure(data, DispatchQueue.main, complClosure)
// is such passing of the closure safe? Can I have issues with concurrency?
}
}
func requestData2(..., complClosure) {
// the same as requestData1 but gets data from endpoint 2
...
let procClosure = self.processingClosure
request.processingQueue.async {
procClosure(data, DispatchQueue.main, complClosure)
}
}
}
This seems a safe way to pass closure since it doesn't capture any variables. Will I have any concurrency issues with procClosure call?
Is there a better way to encapsulate a common functionality of data transformation to reuse in similar requests to different endpoints (I can encapsulate only data processing but not requesting)?

RxSwift: Nested Queries and ReplaySubject

I have to fetch three types of data (AType, BType, CType) using three separate API requests. The objects returned by the APIs are related by one-to-many:
1 AType object is parent of N BType objects
1 BType object is parent of P CType objects
I'm using the following three functions to fetch each type:
func get_A_objects() -> Observable<AType> { /* code here */ }
func get_B_objects(a_parentid:Int) -> Observable<BType> { /* code here */}
func get_C_objects(b_parentid:Int) -> Observable<CType> { /* code here */}
and to avoid nested subscriptions, these three functions are chained using flatMap:
func getAll() -> Observable<CType> {
return self.get_A_objects()
.flatMap { (aa:AType) in return get_B_objects(aa.id) }
.flatMap { (bb:BType) in return get_C_objects(bb.id) }
}
func setup() {
self.getAll().subscribeNext { _ in
print ("One more item fetched")
}
}
The above code works fine, when there are M objects of AType, I could see the text "One more item fetched" printed MxNxP times.
I'd like to setup the getAll() function to deliver status updates throughout the chain using ReplaySubject<String>. My initial thought is to write something like:
func getAll() -> ReplaySubject<String> {
let msg = ReplaySubject<String>.createUnbounded()
self.get_A_objects().doOnNext { aobj in msg.onNext ("Fetching A \(aobj)") }
.flatMap { (aa:AType) in
return get_B_objects(aa.id).doOnNext { bobj in msg.onNext ("Fetching B \(bobj)") }
}
.flatMap { (bb:BType) in
return get_C_objects(bb.id).doOnNext { cobj in msg.onNext ("Fetching C \(cobj)") }
}
return msg
}
but this attempt failed, i.e., the following print() does not print anything.
getAll().subscribeNext {
print ($0)
}
How should I rewrite my logic?

Problem
It's because you're not retaining your Disposables, so they're being deallocated immediately, and thus do nothing.
In getAll, you build an Observable chain starting from get_A_objects(), yet it is never subscribed to, so no Disposable is created or added to a DisposeBag. When the chain goes out of scope (at the end of the func), it will be deallocated. So { aobj in msg.onNext ("Fetching A \(aobj)") } will never happen (or at least isn't likely to, if it's async).
Also, you aren't retaining the ReplaySubject<String> returned from getAll().subscribeNext either. So for the same reason, this would also be a deal-breaker.
Solution
Since you want two Observables: one for the actual final results (Observable<CType>), and one for the progress status (ReplaySubject<String>), you should return both from your getAll() function, so that both can be "owned", and their lifetime managed.
func getAll() -> (Observable<CType>, ReplaySubject<String>) {
let progress = ReplaySubject<String>.createUnbounded()
let results = self.get_A_objects()......
return (results, progress)
}
let (results, progress) = getAll()
progress
.subscribeNext {
print ($0)
}
.addDisposableTo(disposeBag)
results
.subscribeNext {
print ($0)
}
.addDisposableTo(disposeBag)
Some notes:
You shouldn't need to use createUnbounded, which could be dangerous if you aren't careful.
You probably don't really want to use ReplaySubject at all, since it would be a lie to say that you're "fetching" something later if someone subscribes after, and gets an old progress status message. Consider using PublishSubject.
If you follow the above recommendation, then you just need to make sure that you subscribe to progress before results to be sure that you don't miss any progress status messages, since the output won't be buffered anymore.
Also, just my opinion, but I would re-word "Fetching X Y" to something else, since you aren't "fetching", but you have already "fetched" it.

How do I dispatch functions in Swift the right way?

I've kept trying but I just don't get it. I'm rather new to programming so almost every new step is an experiment. Whereas I have no problems dispatching normal closures without arguments/returns, I haven't understood so far how to deal with functions that take (multiple) arguments and return in the end.
To get the logic of the proper "work around" it would be great if someone could post a practical example so I could see whether I've got all of it right. I'd be very thankful for any kind of help... If some other practical example illustrate the topic in a better way, please go ahead with your own!
Let's say we'd like to asynchronously dispatch the following function to a background queue with low priority (or do I make the mistake, trying to implement the dispatch when defining a function instead of waiting till it is called from somewhere else?!):
func mutateInt(someInt: Int) -> Int {
"someHeavyCalculations"
return result
}
or a function with multiple arguments that in addition calls the first function at some point (everything in background queue):
func someBadExample(someString: String, anotherInt: Int) -> Int {
"someHeavyStuff"
println(testString)
mutateInt(testInt)
return result
}
or a UI-function that should be ensured to run just on main queue (just a fictitious example):
override func tableView(tableView: UITableView, numberOfRowsInSection section: Int) -> Int {
let sectionInfo = self.fetchedResultsController.sections?[section] as NSFetchedResultsSectionInfo
return sectionInfo.numberOfObjects
}

Let's say you had some function like so:
func calculate(foo: String, bar: Int) -> Int {
// slow calculations performed here
return result
}
If you wanted to do that asynchronously, you could wrap it in something like this:
func calculate(foo: String, bar: Int, completionHandler: @escaping (Int) -> Void) {
DispatchQueue.global().async {
// slow calculations performed here
completionHandler(result)
}
}
Or, alternatively, if you want to ensure the completion handler is always called on the main queue, you could have this do that for you, too:
func calculate(foo: String, bar: Int, completionHandler: @escaping (Int) -> Void) {
DispatchQueue.global().async {
// slow calculations performed here
DispatchQueue.main.async {
completionHandler(result)
}
}
}
For the work being performed in the background, you may use a different priority background queue, or you might use your own custom queue or your own operation queue. But those details aren't really material to the question at hand.
What is relevant is that this function, itself, doesn't return any value even though the underlying synchronous function does. Instead, this asynchronous rendition is passing the value back via the completionHandler closure. Thus, you would use it like so:
calculate(foo: "life", bar: 42) { result in
// we can use the `result` here (e.g. update model or UI accordingly)
print("the result is = \(result)")
}
// but don't try to use `result` here, because we get here immediately, before
// the above slow, asynchronous process is done
(FYI, all of the above examples are Swift 3. For Swift 2.3 rendition, see previous version of this answer.)
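For readers who want to run the pattern end to end, here is a small self-contained sketch. It assumes a trivial stand-in for the slow calculation (mutateInt simply doubles its input) and adds two hypothetical parameters, workQueue and completionQueue, so that the same function can be exercised from a command-line script, where nothing services the main queue unless the program runs a run loop or calls dispatchMain().

```swift
import Dispatch

// Stand-in for an expensive synchronous computation.
func mutateInt(_ someInt: Int) -> Int {
    return someInt * 2
}

// Asynchronous wrapper: runs the work on a lower-priority queue and delivers
// the result through the completion handler on the chosen queue.
func mutateIntAsync(_ someInt: Int,
                    workQueue: DispatchQueue = DispatchQueue.global(qos: .utility),
                    completionQueue: DispatchQueue = DispatchQueue.main,
                    completion: @escaping (Int) -> Void) {
    workQueue.async {
        let result = mutateInt(someInt)
        completionQueue.async {
            completion(result)
        }
    }
}

// In an app you would normally keep the default (main) completion queue.
// A private queue is used here so the example also works in a script.
let callbackQueue = DispatchQueue(label: "example.callbacks")
let finished = DispatchSemaphore(value: 0)
mutateIntAsync(21, completionQueue: callbackQueue) { result in
    print("the result is \(result)")  // the result is 42
    finished.signal()
}
finished.wait()  // only so that the script does not exit before the callback runs
```

The semaphore exists purely for the demonstration; in UI code, blocking the main thread like this would defeat the purpose of dispatching the work asynchronously.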
