How to check when multiple concurrent threads have finished? - swift

I have code that goes like this:
myArray.forEach { item in
concurrentOperation(item)
}
Every item in the array goes through a concurrent operation function, which runs in different threads, I'm not sure exactly which thread or how many threads because the function is from a third party library and out of my control. I need a way to find out once all operations are finished.
How can I do this?

Without modifying concurrentOperation(), this is not possible, sorry.
UPDATE for user @Scriptable
The next snippet demonstrates why his solution doesn't work:
import PlaygroundSupport
import Dispatch
import Foundation // needed for sin
PlaygroundPage.current.needsIndefiniteExecution = true
let pq = DispatchQueue(label: "print", qos: .background)
func dprint(_ items: Any...) {
pq.async {
let r = items.map{ String(describing: $0) }.joined(separator: " ")
print(r)
}
}
func concurrentOperation<T>(item: T) { // dummy items
DispatchQueue.global().async {
// long time operation
for i in 0..<10000 {
_ = sin(Double(i))
}
dprint(item, "done")
}
}
let myArray = [1,2,3,4,5,6,7,8,9,0]
let g = DispatchGroup()
myArray.forEach { (item) in
DispatchQueue.global().async(group: g) {
concurrentOperation(item: item)
}
}
g.notify(queue: DispatchQueue.main) {
dprint("all jobs done???")
}
UPDATE 2
Without modifying the code of concurrentOperation(), in this block:
DispatchQueue.global().async(group: g) {
concurrentOperation(item: item)
}
the dispatch group is entered and left immediately, because concurrentOperation is itself asynchronous and returns before its work has finished. If it were synchronous, the question would make no sense.
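For contrast, here is a minimal sketch of what would work if the function reported when it was done. The `completion` parameter and the `runAll` helper are assumptions made for illustration; the library function in the question has no such parameter, which is exactly why the group cannot track it.

```swift
import Dispatch
import Foundation

// Hypothetical variant of the library function that reports completion.
// The real `concurrentOperation` in the question has no such parameter.
func concurrentOperation<T>(item: T, completion: @escaping () -> Void) {
    DispatchQueue.global().async {
        // stand-in for the long-running work
        for i in 0..<10_000 {
            _ = sin(Double(i))
        }
        completion()
    }
}

// Hypothetical helper: starts the operation for every item and calls `done`
// on `queue` once every operation has called its completion handler.
func runAll<T>(_ items: [T], notifyOn queue: DispatchQueue, done: @escaping () -> Void) {
    let group = DispatchGroup()
    for item in items {
        group.enter()                      // balance each enter ...
        concurrentOperation(item: item) {
            group.leave()                  // ... with a leave when the work really ends
        }
    }
    group.notify(queue: queue) {
        done()
    }
}
```

The point is that `enter` and `leave` bracket the actual work, not just the call that starts it.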

Related

Why doesn't .async on a concurrent queue in a for loop behave the same as DispatchQueue.concurrentPerform?

import Dispatch
class SynchronizedArray<T> {
private var array: [T] = []
private let accessQueue = DispatchQueue(label: "SynchronizedArrayAccess", attributes: .concurrent)
var get: [T] {
accessQueue.sync {
array
}
}
func append(newElement: T) {
accessQueue.async(flags: .barrier) {
self.array.append(newElement)
}
}
}
If I run the following code, 10,000 elements are appended to the array as expected even if I am reading concurrently:
DispatchQueue.concurrentPerform(iterations: 10000) { i in
_ = threadSafeArray.get
threadSafeArray.append(newElement: i)
}
But when I do this, it never comes close to adding 10,000 elements (it added only 92 elements on my computer the last time I ran it).
let concurrent = DispatchQueue(label: "com.concurrent", attributes: .concurrent)
for i in 0..<10000 {
concurrent.async {
_ = threadSafeArray.get
threadSafeArray.append(newElement: i)
}
}
Why does the former work, and why doesn't the latter work?
It's good that you found a solution to the thread explosion. See the discussion of thread explosion in WWDC 2015 Building Responsive and Efficient Apps with GCD and again in WWDC 2016 Concurrent Programming With GCD in Swift 3.
That having been said, DispatchSemaphore is a bit of an anti-pattern, nowadays, given the presence of concurrentPerform (or OperationQueue with its maxConcurrentOperationCount or Combine with its maxPublishers). All of these manage degrees of concurrency more elegantly than dispatch semaphores.
All that having been said, a few observations on your semaphore pattern:
When using this DispatchSemaphore pattern, you generally put the wait before the concurrent.async { ... } (because, as written, you're getting nine concurrent operations, not eight, which is a bit misleading).
The deeper problem here is that you've reduced the count problem, but it still persists. Consider:
let threadSafeArray = SynchronizedArray<Int>()
let concurrent = DispatchQueue(label: "com.concurrent", attributes: .concurrent)
let semaphore = DispatchSemaphore(value: 8)
for i in 0..<10000 {
semaphore.wait()
concurrent.async {
threadSafeArray.append(newElement: i)
semaphore.signal()
}
}
print(threadSafeArray.get.count)
When you leave the for loop, you can still have up to eight of the async tasks on concurrent still running, and the count (unsynchronized with respect to concurrent queue) can still be less than 10,000. You have to add another concurrent.async(flags: .barrier) { ... }, which is just adding a second layer of synchronization. E.g.
let semaphore = DispatchSemaphore(value: 8)
for i in 0..<10000 {
semaphore.wait()
concurrent.async {
threadSafeArray.append(newElement: i)
semaphore.signal()
}
}
concurrent.async(flags: .barrier) {
print(threadSafeArray.get.count)
}
Or you can use a DispatchGroup, the classical mechanism for determining when a series of asynchronously dispatched blocks finish:
let semaphore = DispatchSemaphore(value: 8)
let group = DispatchGroup()
for i in 0..<10000 {
semaphore.wait()
concurrent.async(group: group) {
threadSafeArray.append(newElement: i)
semaphore.signal()
}
}
group.notify(queue: .main) {
print(threadSafeArray.get.count)
}
Using concurrentPerform eliminates the need for either of these patterns because it won't continue execution until all of the concurrent tasks are done. (It will also automatically optimize the degree of concurrency for the number of cores on your device.)
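As a minimal sketch of that point (the helper name `fillWithConcurrentPerform` is made up for illustration, and the wrapper is a trimmed copy of the one in the question):

```swift
import Dispatch

// Minimal reader-writer wrapper, mirroring the one in the question.
final class SynchronizedArray<T> {
    private var array: [T] = []
    private let accessQueue = DispatchQueue(label: "SynchronizedArrayAccess", attributes: .concurrent)

    var count: Int {
        accessQueue.sync { array.count }
    }

    func append(_ newElement: T) {
        accessQueue.async(flags: .barrier) {
            self.array.append(newElement)
        }
    }
}

// Hypothetical helper for illustration.
func fillWithConcurrentPerform(iterations: Int) -> Int {
    let threadSafeArray = SynchronizedArray<Int>()
    // Does not return until every iteration has run.
    DispatchQueue.concurrentPerform(iterations: iterations) { i in
        threadSafeArray.append(i)
    }
    // Every barrier append was enqueued before this synchronous read,
    // so the read runs after all of them.
    return threadSafeArray.count
}
```

No semaphore, barrier block, or dispatch group is needed at the call site, because by the time `concurrentPerform` returns every append has already been submitted to the access queue.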
FWIW, a much better alternative to SynchronizedArray is to not expose the underlying array at all, and just implement whatever methods you want to expose, integrating the necessary synchronization. It makes for a cleaner call site, and solves many issues.
For example, assuming you wanted to expose subscript operator and a count variable, you would do:
class SynchronizedArray<T> {
private var array: [T]
private let accessQueue = DispatchQueue(label: "com.domain.app.reader-writer", attributes: .concurrent)
init(_ array: [T] = []) {
self.array = array
}
subscript(index: Int) -> T {
get { reader { $0[index] } }
set { writer { $0[index] = newValue } }
}
var count: Int {
reader { $0.count }
}
func append(newElement: T) {
writer { $0.append(newElement) }
}
func reader<U>(_ block: ([T]) throws -> U) rethrows -> U {
try accessQueue.sync { try block(array) }
}
func writer(_ block: @escaping (inout [T]) -> Void) {
accessQueue.async(flags: .barrier) { block(&self.array) }
}
}
This solves a variety of issues. For example, you can now do:
print(threadSafeArray.count) // get the count
print(threadSafeArray[500]) // get the 500th item
You can now also do things like:
let average = threadSafeArray.reader { array -> Double in
let sum = array.reduce(0, +)
return Double(sum) / Double(array.count)
}
But, bottom line, when dealing with collections (or any mutable object), you invariably do not want to expose the mutable object, itself, but rather write your own synchronized methods for common operations (subscripts, count, removeAll, etc.), and possibly also expose the reader/writer interface for those cases where the app developer might need a broader synchronization mechanism.
(FWIW, the changes to this SynchronizedArray apply to both the semaphore and concurrentPerform scenarios; it is just that the semaphore happens to manifest the problem in this case.)
Needless to say, you would generally have more work being done on each thread, too, because as modest as the context-switching overhead is, it is likely enough here to offset any advantages gained from parallel processing. (But I understand that this was likely just a conceptual demonstration of a problem, not a proposed implementation.) Just an FYI to future readers.
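One common way to get more work per iteration is to hand each iteration a chunk of indices rather than a single one. The following is only a sketch under assumptions: the function name `chunkedSum`, the `sin` workload, and the chunk size are all illustrative, not something from the question.

```swift
import Dispatch
import Foundation

// Sums sin(i) for i in 0..<total, giving each iteration a chunk of indices
// rather than a single index, so the per-iteration overhead is amortized.
func chunkedSum(total: Int, chunkSize: Int) -> Double {
    let chunkCount = (total + chunkSize - 1) / chunkSize
    var partials = [Double](repeating: 0, count: chunkCount)
    let lock = NSLock()

    DispatchQueue.concurrentPerform(iterations: chunkCount) { chunk in
        let start = chunk * chunkSize
        let end = min(start + chunkSize, total)
        var local = 0.0
        for i in start..<end {
            local += sin(Double(i))
        }
        // One synchronized write per chunk instead of one per element.
        lock.lock()
        partials[chunk] = local
        lock.unlock()
    }
    return partials.reduce(0, +)
}
```

Each iteration synchronizes once, after its local loop, so the cost of synchronization is spread over `chunkSize` elements.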
It seems I was experiencing thread explosion: 82 threads were being created and the app ran out of threads. The solution I used is a semaphore to limit the number of threads:
let semaphore = DispatchSemaphore(value: 8)
let concurrent = DispatchQueue(label: "com.concurrent", attributes: .concurrent)
for i in 0..<10000 {
concurrent.async {
_ = threadSafeArray.get
threadSafeArray.append(newElement: i)
semaphore.signal()
}
semaphore.wait()
}
Edit: Rob's answer explains some issues with above code

How to make for-in loop wait for data fetch function to complete

I am trying to fetch a bunch of data in a for-in loop, but the data doesn't come back in the correct order. It looks like some items take longer to fetch, so they end up mixed up in an array where I need all the data in the correct order. So I used DispatchGroup, but it's not working. Can you please let me know what I am doing wrong here? I've spent the past 10 hours searching for a solution. Below is my code.
@IBAction func parseXMLTapped(_ sender: Any) {
let codeArray = codes[0]
for code in codeArray {
self.fetchData(code)
}
dispatchGroup.notify(queue: .main) {
print(self.dataToAddArray)
print("Complete.")
}
}
private func fetchData(_ code: String) {
dispatchGroup.enter()
print("count: \(count)")
let dataParser = DataParser()
dataParser.parseData(url: url) { (dataItems) in
self.dataItems = dataItems
print("Index #\(self.count): \(self.dataItems)")
self.dataToAddArray.append(self.dataItems)
}
self.dispatchGroup.leave()
dispatchGroup.enter()
self.count += 1
dispatchGroup.leave()
}
The problem with asynchronous functions is that you can never know in which order the blocks return.
If you need to preserve the order, use indices like so:
let dispatchGroup = DispatchGroup()
var dataToAddArray = [String](repeating: "", count: codeArray.count)
for (index, code) in codeArray.enumerated() {
dispatchGroup.enter()
DataParser().parseData(url: url) { dataItems in
dataToAddArray[index] = dataItems
dispatchGroup.leave()
}
}
dispatchGroup.notify(queue: .main) {
print("Complete")
}
Also in your example you are calling dispatchGroup.leave() before the asynchronous block has even finished. That would also yield wrong results.
Using semaphores to eliminate all concurrency solves the order issue, but with a large performance penalty. Dennis has the right idea, namely, rather than sacrificing concurrency, instead, just sort the results.
That having been said, I would probably use a dictionary:
let group = DispatchGroup()
var results: [String: [DataItem]] = [:] // you didn't say what `dataItems` was, so I'll assume it's an array of `DataItem` objects; but this detail isn't material to the broader question
for code in codes {
group.enter()
DataParser().parseData(url: url) { dataItems in
results[code] = dataItems // if parseData doesn't already use the main queue for its completion handler, then dispatch these two lines to the main queue
group.leave()
}
}
group.notify(queue: .main) {
let sortedResults = codes.compactMap { results[$0] } // this very efficiently gets the results in the right order
// do something with sortedResults
}
Now, I might advise constraining the degree of concurrency (e.g. maybe you want to constrain this to the number of CPUs or some reasonable fixed number, such as 4 or 6). That is a separate question. But I would advise against sacrificing concurrency just to get the results in the right order.
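For what it's worth, here is one possible sketch that combines both ideas: a capped number of requests in flight, and results returned in the original order. Everything named here is an assumption for illustration: `fakeFetch` is a stand-in for the asker's `parseData`, and `fetchAll`, the `String` result type, and the private serial queue are made up for the example.

```swift
import Dispatch
import Foundation

// Hypothetical stand-in for an asynchronous fetch; it is not the asker's DataParser.
// It completes on a background queue after a short random delay.
func fakeFetch(_ code: String, completion: @escaping (String) -> Void) {
    let delay = Double.random(in: 0.001...0.02)
    DispatchQueue.global().asyncAfter(deadline: .now() + delay) {
        completion("data for \(code)")
    }
}

// Fetches all codes with at most `maxConcurrent` requests in flight and
// delivers the results in the original order of `codes`.
func fetchAll(codes: [String], maxConcurrent: Int, completion: @escaping ([String]) -> Void) {
    let group = DispatchGroup()
    let semaphore = DispatchSemaphore(value: maxConcurrent)
    let syncQueue = DispatchQueue(label: "results.sync")   // serializes dictionary access
    var results: [String: String] = [:]

    // Wait on the semaphore from a background queue so the caller is never blocked.
    DispatchQueue.global().async {
        for code in codes {
            semaphore.wait()
            group.enter()
            fakeFetch(code) { data in
                syncQueue.async {
                    results[code] = data
                    semaphore.signal()
                    group.leave()
                }
            }
        }
        // All `enter` calls have happened by now, so notify fires only after the last `leave`.
        group.notify(queue: syncQueue) {
            completion(codes.compactMap { results[$0] })
        }
    }
}
```

The requests still overlap, up to the cap, and the ordering is restored at the end by walking `codes` and looking each one up in the dictionary.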
In this case, using DispatchSemaphore:
let semaphore = DispatchSemaphore(value: 0)
DispatchQueue.global().async {
for code in codeArray {
self.fetchData(code)
semaphore.wait()
}
}
private func fetchData(_ code: String) {
print("count: \(count)")
let dataParser = DataParser()
dataParser.parseData(url: url) { (dataItems) in
self.dataItems = dataItems
print("Index #\(self.count): \(self.dataItems)")
self.dataToAddArray.append(self.dataItems)
semaphore.signal()
}
}

using dispatch group in multi for loop with urlsession tasks

I am using a dispatch group's wait() to block a for loop until a set of URLSession tasks (started in an inner loop, each with a completion handler) has completed, before appending a new element to my array.
The current code finishes the outer loop before the inner loop of urlClass.selectfoodURL calls has completed.
I want to append to the mealHistory array only after the inner for loop over the food names has completed.
One of the problems with my approach is wait(): when selectfoodURL is called, the URLSession task gets stuck and never completes while group.wait() is active.
func userSnackHistoryArray() {
let group = DispatchGroup()
let Arrays // array of dictionary
for array in Arrays {
var generateMeal = MealDetails() // struct type
do {
let aa = try JSONDecoder().decode(userSnack.self, from: array)
generateMeal.names = convertToJsonFile.type
for name in generateMeal.names!{
group.enter()
urlClass.selectfoodURL(foodName: name){ success in
generateMeal.units!.append(allVariables.selectedUnit)
group.leave()
}
}
// my select food is called but the urlsession stuck and doesnt complete with group.wait is active
// group.wait()
mealHistory.append(generateMeal)
} catch { }
}
group.notify(queue: .main){
print("complete")
}
}
I have shortened my code to focus on the problem. I could split it into two functions and solve the problem that way, but I want to use only one function.
Any suggestions or ideas?
Rather than waiting, you should just create a local array of values to be added, and then add them when it’s done:
func retrieveSnacks() {
var snacksToAdd: [Snack] = []
let group = DispatchGroup()
...
for url in urls {
group.enter()
fetchSnack(with: url) { result in
dispatchPrecondition(condition: .onQueue(.main)) // note, I’m assuming that this closure is running on the main queue; if not, dispatch this appending of snacks (and `leave` call) to the main queue
if case .success(let snack) = result {
snacksToAdd.append(snack)
}
group.leave()
}
}
// when all the `leave` calls are called, only then append the results
group.notify(queue: .main) {
self.snacks += snacksToAdd
// trigger UI update, or whatever, here
}
}
Note, the above does not assure that the objects are added in the original order. If you need that, you can use a dictionary to build the temporary results and then append the results in sorted order:
func retrieveSnacks() {
var snacksToAdd: [URL: Snack] = [:]
let group = DispatchGroup()
...
for url in urls {
group.enter()
fetchSnack(with: url) { result in
if case .success(let snack) = result {
snacksToAdd[url] = snack
}
group.leave()
}
}
group.notify(queue: .main) {
let sortedSnacks = urls.compactMap { snacksToAdd[$0] }
self.snacks += sortedSnacks
// trigger UI update, or whatever, here
}
}
Finally, I might suggest adopting a completion handler pattern:
func retrieveSnacks(completion: @escaping ([Snack]) -> Void) {
var snacksToAdd: [URL: Snack] = [:]
let group = DispatchGroup()
...
for url in urls {
group.enter()
fetchSnack(with: url) { result in
if case .success(let snack) = result {
snacksToAdd[url] = snack
}
group.leave()
}
}
group.notify(queue: .main) {
let sortedSnacks = urls.compactMap { snacksToAdd[$0] }
completion(sortedSnacks)
}
}
retrieveSnacks { addedSnacks in
self.snacks += addedSnacks
// update UI here
}
This pattern ensures that you don’t entangle your network-related code with your UI code.
I apologize that the above is somewhat refactored from your code snippet, but there wasn’t enough there for me to illustrate what precisely it would look like. But hopefully the above illustrates the pattern and you can see how you’d apply it to your code base. So, don’t get lost in the details, but focus on the basic pattern of building records to be added in a local variable and only update the final results in the .notify block.
FWIW, this is the method signature for the method that the above snippets are using to asynchronously fetch the objects in question.
func fetchSnack(with url: URL, completion: @escaping (Result<Snack, Error>) -> Void) {
...
// if async fetch not successful
DispatchQueue.main.async {
completion(.failure(error))
}
// if successful
DispatchQueue.main.async {
completion(.success(snack))
}
}

Dealing with multiple completion handlers

I'm trying to coordinate several completion handlers for each element in an array.
The code is essentially this:
var results = [String:Int]()
func requestData(for identifiers: [String])
{
identifiers.forEach
{ identifier in
service.request(identifier, completion: { (result) in
results[identifier] = result
})
}
// Execute after all the completion handlers finish
print(results)
}
So each element in the array is sent through a service with a completion handler, and all the results are stored in a dictionary. Once all of these handlers complete, I wish to execute some code.
I attempted to do this with DispatchQueue
var results = [String:Int]()
func requestData(for identifiers: [String])
{
let queue = DispatchQueue.init(label: "queue")
identifiers.forEach
{ identifier in
service.request(identifier, completion: { (result) in
queue.sync
{
results[identifier] = result
}
})
}
// Execute after all the completion handlers finish
queue.sync
{
print(results)
}
}
but the print call is still being executed first, with an empty Dictionary
If I understand correctly what you are trying to do, you probably want to use a DispatchGroup.
Here is an example:
let group = DispatchGroup()
var letters = ["a", "b", "c"]
for letter in letters {
group.enter()
Server.doSomething(completion: { [weak self] (result) in
print("Letter is: \(letter)")
group.leave()
})
}
group.notify(queue: .main) {
print("- done")
}
This will print something like:
Letter is: b
Letter is: c
Letter is: a
// ^ in some order
- done
First, note that your service.request(...) runs asynchronously. The other problem is that you want to wait until all the service requests in that loop have finished.
My suggestion is to give the function a completion handler and increment a counter each time a request completes. Your function would look similar to the one below.
var results = [String:Int]()
func requestData(for identifiers: [String], callback: @escaping (Bool) -> Void)
{
var counter = 0
let maxItem = identifiers.count
identifiers.forEach
{ identifier in
service.request(identifier, completion: { (result) in
// assumes all completion handlers run on the same serial queue;
// otherwise `results` and `counter` need synchronization
results[identifier] = result
counter += 1
if counter == maxItem {
callback(true) // all requests in the loop have completed
}
// if not, wait for the remaining requests
})
}
}
This is how another part of your code would call the function and wait for the callback:
requestData(for: yourArrays) { (complete) in
if complete {
print(results)
}
}
Don't forget to handle the case where a request fails.
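As a rough sketch of what error handling could look like, here is a variant that uses a DispatchGroup instead of a manual counter and collects failures alongside results. `DemoService`, its `Result`-based completion handler, and the private serial queue are all assumptions made for illustration; the real `service.request` signature is not shown in the question.

```swift
import Dispatch
import Foundation

// Hypothetical service used only for illustration.
struct DemoService {
    enum RequestError: Error { case notFound(String) }

    // Completes asynchronously; fails for an empty identifier.
    func request(_ identifier: String, completion: @escaping (Result<Int, Error>) -> Void) {
        DispatchQueue.global().async {
            if identifier.isEmpty {
                completion(.failure(RequestError.notFound(identifier)))
            } else {
                completion(.success(identifier.count))
            }
        }
    }
}

func requestData(for identifiers: [String],
                 using service: DemoService,
                 completion: @escaping ([String: Int], [Error]) -> Void) {
    let group = DispatchGroup()
    let syncQueue = DispatchQueue(label: "requestData.sync")  // serializes access to the two collections
    var results: [String: Int] = [:]
    var errors: [Error] = []

    for identifier in identifiers {
        group.enter()
        service.request(identifier) { result in
            syncQueue.async {
                switch result {
                case .success(let value): results[identifier] = value
                case .failure(let error): errors.append(error)
                }
                group.leave()   // leave on success and on failure alike
            }
        }
    }
    group.notify(queue: syncQueue) {
        completion(results, errors)
    }
}
```

The important detail is that `leave` is called on every path; if a failing request skipped it, the notify block would never run.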

Swift closure async order of execution

In my model I have a function to fetch data which expects a completion handler as a parameter:
func fetchMostRecent(completion: (sortedSections: [TableItem]) -> ()) {
self.addressBook.loadContacts({
(contacts: [APContact]?, error: NSError?) in
// 1
if let unwrappedContacts = contacts {
for contact in unwrappedContacts {
// handle contacts
...
self.mostRecent.append(...)
}
}
// 2
completion(sortedSections: self.mostRecent)
})
}
It calls another function which loads contacts asynchronously, and I forward my completion handler to it.
The call of fetchMostRecent with completion looks like this:
model.fetchMostRecent({(sortedSections: [TableItem]) in
dispatch_async(dispatch_get_main_queue()) {
// update some UI
self.state = State.Loaded(sortedSections)
self.tableView.reloadData()
}
})
Sometimes this works, but very often the order of execution is not what I would expect. The problem is that sometimes completion() under // 2 is executed before the scope of the if under // 1 has finished.
Why is that? How can I ensure that execution of // 2 is started after // 1?
A couple of observations:
It will always execute what's at 1 before 2. The only way you'd get the behavior you describe is if you're doing something else inside that for loop that is, itself, asynchronous. And if that were the case, you'd use a dispatch group to solve that (or refactor the code to handle the asynchronous pattern). But without seeing what's in that for loop, it's hard to comment further. The code in the question, alone, should not manifest the problem you describe. It's got to be something else.
Unrelated, you should note that it's a little dangerous to be updating model objects inside your asynchronously executing for loop (assuming it is running on a background thread). It's much safer to update a local variable, and then pass that back via the completion handler, and let the caller take care of dispatching both the model update and the UI updates to the main queue.
In comments, you mention that in the for loop you're doing something asynchronous, and something that must be completed before the completionHandler is called. So you'd use a dispatch group to ensure this happens only after all the asynchronous tasks are done.
Note, since you're doing something asynchronous inside the for loop, not only do you need to use a dispatch group to trigger the completion of these asynchronous tasks, but you probably also need to create your own synchronization queue (you shouldn't be mutating an array from multiple threads). So, you might create a queue for this.
Pulling this all together, you end up with something like:
func fetchMostRecent(completionHandler: ([TableItem]?) -> ()) {
addressBook.loadContacts { contacts, error in
var sections = [TableItem]()
let group = dispatch_group_create()
let syncQueue = dispatch_queue_create("com.domain.app.sections", nil)
if let unwrappedContacts = contacts {
for contact in unwrappedContacts {
dispatch_group_enter(group)
self.someAsynchronousMethod {
// handle contacts
dispatch_async(syncQueue) {
let something = ...
sections.append(something)
dispatch_group_leave(group)
}
}
}
dispatch_group_notify(group, dispatch_get_main_queue()) {
self.mostRecent = sections
completionHandler(sections)
}
} else {
completionHandler(nil)
}
}
}
And
model.fetchMostRecent { sortedSections in
guard let sortedSections = sortedSections else {
// handle failure however appropriate for your app
return
}
// update some UI
self.state = State.Loaded(sortedSections)
self.tableView.reloadData()
}
Or, in Swift 3:
func fetchMostRecent(completionHandler: @escaping ([TableItem]?) -> ()) {
addressBook.loadContacts { contacts, error in
var sections = [TableItem]()
let group = DispatchGroup()
let syncQueue = DispatchQueue(label: "com.domain.app.sections")
if let unwrappedContacts = contacts {
for contact in unwrappedContacts {
group.enter()
self.someAsynchronousMethod {
// handle contacts
syncQueue.async {
let something = ...
sections.append(something)
group.leave()
}
}
}
group.notify(queue: .main) {
self.mostRecent = sections
completionHandler(sections)
}
} else {
completionHandler(nil)
}
}
}