DispatchGroup and OperationQueue have the methods wait() and waitUntilAllOperationsAreFinished(), which block until all operations in the respective group or queue complete.
But when I call cancelAllOperations(), it only sets the isCancelled flag on every running operation and stops the queue from starting new operations; the wait still lasts until the running operations complete. Running operations must therefore be stopped from the inside, which is possible only if the operation is incremental or has an inner loop of some kind. When it is a single long external call (a web request, for example), the isCancelled flag is of no use.
Is there any way to stop the OperationQueue or DispatchGroup from waiting for the operations to complete if one of the operations decides that the whole queue is now outdated?
The practical case is mapping a request to a list of responders when it is known that only one may answer. If that happens, the queue should stop waiting for the other operations to finish and unblock the thread.
Edit: Using DispatchGroup and OperationQueue is not obligatory; these are just the tools I thought would fit.
OK, so I think I came up with something. Results are stable, I've just tested. The answer is just one semaphore :)
let semaphore = DispatchSemaphore(value: 0)
let group = DispatchGroup()
let queue = DispatchQueue(label: "map-reduce", qos: .userInitiated, attributes: .concurrent)
// Serial queue that guards every write to `result`
let resultQueue = DispatchQueue(label: "map-reduce.result")
let stopAtFirst = true // false for all results to be appended into one array
let values: [U] = <some input values>
let mapper: (U) throws -> T? = <closure>
var result: [T?] = []
for value in values {
    queue.async(group: group) {
        do {
            let res = try mapper(value)
            // Appending must always be thread-safe, otherwise you end up with a race condition and unstable results.
            // A global queue is concurrent, so sync on it does not serialize the writes; a serial queue does.
            resultQueue.sync {
                result.append(res)
            }
            if stopAtFirst && res != nil {
                semaphore.signal()
            }
        } catch let error {
            print("Could not map value \"\(value)\" to mapper \(mapper): \(error)")
        }
    }
}
group.notify(queue: queue) { // this must be declared after submitting all tasks, otherwise the notification fires instantly
    semaphore.signal()
}
if semaphore.wait(timeout: .now() + 5) == .timedOut {
    print("MapReduce timed out on values \(values)")
}
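The thread-safety comment in the code above can be checked in isolation. The following is a minimal sketch, assuming Swift 5 language mode; the queue label and variable names are made up for illustration. Many concurrent iterations append to one array, and a serial queue acts as the lock around each append.

```swift
import Dispatch

// A DispatchQueue is serial unless .concurrent is requested, so sync on it
// lets only one append run at a time.
let appendLock = DispatchQueue(label: "example.append-lock")
var collected: [Int] = []

// Run 1,000 iterations in parallel; each one appends its own index.
DispatchQueue.concurrentPerform(iterations: 1_000) { index in
    appendLock.sync { collected.append(index) }
}

// Every index arrived exactly once, although the order is not predictable.
print(collected.count)
```

Replacing appendLock with DispatchQueue.global() removes the protection, because a global queue is concurrent and sync on it does not exclude other callers.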
I have a JSONEncoder encoding a 20 MB file, which takes ages to process. If the data it's processing changes, I'd like to cancel the encoding and restart it, but I can't think of a way to do this. Any ideas?
I could call JSONEncoder.encode again, but then I would have two 30-second processes running, with double the memory and processor overhead.
It would be lovely to be able to cancel the previous one.
EDIT: Some of you requested to see my encoder. Here's the one which I'd say causes the biggest bottleneck...
func encode(to encoder: Encoder) throws {
    try autoreleasepool {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(brush, forKey: .brush)
        if encoder.coderType == CoderType.export {
            let bezierPath = try NSKeyedUnarchiver.unarchivedObject(ofClass: UIBezierPath.self, from: beziersData)
            let jsonData = try UIBezierPathSerialization.data(with: bezierPath, options: UIBezierPathWritingOptions.ignoreDrawingProperties)
            let bezier = try? JSONDecoder().decode(DBBezier.self, from: jsonData)
            try container.encodeIfPresent(bezier, forKey: .beziersData)
        } else {
            try container.encodeIfPresent(beziersData, forKey: .beziersData)
        }
    }
}
You can use OperationQueue and add your long running task into that operation queue.
var queue: OperationQueue?
// Initialisation
if queue == nil {
    queue = OperationQueue()
    queue?.maxConcurrentOperationCount = 1
}
queue?.addOperation {
    // The work itself must check the operation's isCancelled property to stop an ongoing execution.
    self.encodeHugeJSON()
}
You can also cancel the task whenever you want using the following code:
//Whenever you want to cancel the task, you can do it like this
queue?.cancelAllOperations()
queue = nil
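Cancelling only sets a flag, so the work has to poll it. The following is a minimal sketch of that cooperative check, assuming the long job can be split into chunks; encodeInChunks and encodingQueue are hypothetical names, not part of the question's code.

```swift
import Foundation

// Hypothetical chunked job: returns how many chunks were processed before stopping.
func encodeInChunks(chunkCount: Int, isCancelled: () -> Bool) -> Int {
    var completed = 0
    for _ in 0..<chunkCount {
        if isCancelled() { break }  // cooperative cancellation point
        completed += 1              // the real per-chunk work would go here
    }
    return completed
}

let encodingQueue = OperationQueue()
encodingQueue.maxConcurrentOperationCount = 1

// Create the operation first so its block can read the operation's own isCancelled flag.
let operation = BlockOperation()
operation.addExecutionBlock { [weak operation] in
    let done = encodeInChunks(chunkCount: 1_000) { operation?.isCancelled ?? true }
    print("Processed \(done) chunks")
}
encodingQueue.addOperation(operation)

// Calling encodingQueue.cancelAllOperations() at this point would set isCancelled
// and make the loop exit at its next check.
encodingQueue.waitUntilAllOperationsAreFinished()
```

A single JSONEncoder.encode call offers no such checkpoint, so this only helps if the encoding can be broken into pieces that are encoded separately.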
What is an Operation Queue:
An operation queue invokes its queued Operation objects based on their
priority and readiness. After you add an operation to a queue, it
remains in the queue until the operation finishes its task. You can’t
directly remove an operation from a queue after you add it.
Reference links:
https://developer.apple.com/documentation/foundation/operationqueue
https://www.hackingwithswift.com/example-code/system/how-to-use-multithreaded-operations-with-operationqueue
I have a func that gets a list of Players. When I fetch the players, I need to show only those who belong to the current Team, so I display a filtered subset of the original list. Before making the request, I don't know how many players belong to the Team selected by the User, so I may need additional requests until I can display at least 10 rows of Players in the TableView. The User can request more players by pulling up from the bottom of the TableView. To do this, I call a first async request func which, inside a while loop, calls another nested async request func. Here is some code to give you an idea of what I am trying to do:
let semaphore = DispatchSemaphore(value: 0)
func getTeamPlayersRequest() {
    service.getTeamPlayers(...)
    {
        (result) in
        switch result
        {
        case .success(let playersModel):
            if let validCurrentPage = currentPageTmp,
               let validTotalPages = totalPagesTmp,
               let validNextPage = self.getTeamPlayersListNextPage()
            {
                while self.playersToShowTemp.count < 10 && self.currentPage < validTotalPages
                {
                    self.currentPage = validNextPage //global var
                    self.fetchMorePlayers()
                    self.semaphore.wait() //global semaphore
                }
            }
        case .failure(let error):
            //some code...
        }
    }
}
private func fetchMorePlayers() {
    // Completion handler of the following function is never called..
    service.getTeamPlayers(requestedPage: currentPage, completion: {
        (result) in
        switch result
        {
        case .success(let playersModel):
            if let validPlayerList = playersList,
               let validPlayerListData = validPlayerList.data,
               let validTeamModel = self.teamPlayerModel,
               let validNextPage = self.getTeamPlayersListNextPage()
            {
                for player in validPlayerListData
                {
                    if (validTeamModel.id == player.team?.id)
                    {
                        self.playersToShowTemp.append(player)
                    }
                }
                // validNextPage is only in scope inside this if-let block
                self.currentPage = validNextPage
            }
            self.semaphore.signal() //global semaphore
        case .failure(let error):
            //some code...
        }
    })
}
I have tried both DispatchGroup and a semaphore, but I don't understand what I am doing wrong. I debugged the code and saw that the first async call gets executed on a different queue (not the main queue) and a different thread. The nested async call gets executed on a different thread, but I don't know whether it is on the same concurrent queue as the first async call.
The completion handler of the nested call is never called. Does anyone know why? Is self.semaphore.wait(), even though it is executed after fetchMorePlayers() returns, preventing the nested async completion handler from being called?
I notice in the debugger that completion() has the note "swift partial apply forwarder for closure #1" in the Xcode variables window.
If we inline the function call in your loop, it looks something like this:
while self.playersToShowTemp.count < 10 && self.currentPage < validTotalPages
{
    self.currentPage = validNextPage //global var
    nbaService.getTeamPlayers(requestedPage: currentPage, completion: { ... })
    self.semaphore.wait() //global semaphore
}
So nbaService.getTeamPlayers schedules a request, probably on the main DispatchQueue and immediately returns. Then you call wait on your semaphore, which blocks, probably before GCD even tries to run the task scheduled by nbaService.getTeamPlayers.
That's a problem on DispatchQueue.main, which is a serial queue. It has to be a serial queue for UI updates to work. What normally happens is that on some iteration of the run loop you make a request and return. Control bubbles back up to the run loop, which checks for more events and queued tasks. In this case, when your completion handler in getTeamPlayersRequest is waiting to be run, the run loop (via GCD) executes it on that iteration. Then you block the main thread, so the run loop can't continue. If you do need to block, always do it on a different DispatchQueue, preferably a .concurrent one.
There is sometimes confusion about what .async does. It only means "run this later and right now return control back to the caller". That's all. It does not guarantee that your closure will run concurrently. It merely schedules it to be run later (possibly soon) on whatever DispatchQueue you called it on. If that queue is a serial queue, the closure waits its turn behind the tasks already queued there. If it's a concurrent queue (i.e. one whose attributes you specifically set to include .concurrent), it will run, possibly at the same time as other tasks on that same DispatchQueue.
To avoid blocking, you can use async chaining instead of a loop.
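The serial-queue behaviour described above can be observed directly. This is a minimal sketch, assuming Swift 5 language mode and a made-up queue label: five closures are submitted with async to a serial queue, and they run one at a time in submission order.

```swift
import Dispatch

let serialQueue = DispatchQueue(label: "example.serial") // serial by default
let submitted = DispatchGroup()
var runOrder: [Int] = []

for number in 0..<5 {
    // async returns immediately; the closure is only queued here.
    serialQueue.async(group: submitted) {
        // Safe without a lock: a serial queue runs one closure at a time.
        runOrder.append(number)
    }
}

// Block this (non-queue) thread until all five closures have run.
submitted.wait()
print(runOrder) // [0, 1, 2, 3, 4]
```

On a queue created with attributes: .concurrent the closures could overlap, so the append would need protection and the order would no longer be guaranteed.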
private func fetchMorePlayers(while condition: @autoclosure @escaping () -> Bool) {
    guard condition() else { return }
    nbaService.getTeamPlayers(requestedPage: currentPage, completion: {
        (result) in
        switch result
        {
        case .success(let playersModel):
            if let validPlayerList = playersList,
               let validPlayerListData = validPlayerList.data,
               let validTeamModel = self.teamPlayerModel,
               let validNextPage = self.getTeamPlayersListNextPage()
            {
                for player in validPlayerListData
                {
                    if (validTeamModel.id == player.team?.id)
                    {
                        self.playersToShowTemp.append(player)
                    }
                }
                // validNextPage is only in scope inside this if-let block
                self.currentPage = validNextPage
            }
            // Chain to next call. Swift 5 does not forward an autoclosure as-is,
            // so it is called here and re-wrapped by the parameter's @autoclosure.
            self.fetchMorePlayers(while: condition())
        case .failure(let error):
            //some code...
        }
    })
}
Then in getTeamPlayersRequest you can do this:
func getTeamPlayersRequest() {
    service.getTeamPlayers(...)
    {
        (result) in
        switch result
        {
        case .success(let playersModel):
            if let validCurrentPage = currentPageTmp,
               let validTotalPages = totalPagesTmp,
               let validNextPage = self.getTeamPlayersListNextPage()
            {
                self.currentPage = validNextPage //global var
                self.fetchMorePlayers(while: self.playersToShowTemp.count < 10 && self.currentPage < validTotalPages)
            }
        case .failure(let error):
            //some code...
        }
    }
}
This avoids the need to block on a semaphore, because each subsequent request happens in the completion handler of the previously completed one. The only issue is if you need the completion handler in getTeamPlayersRequest to block while the fetchMorePlayers requests are in flight, because now it won't. In that case you can re-introduce the semaphore, and the guard statement in fetchMorePlayers becomes:
guard condition() else
{
    self.semaphore.signal()
    return
}
That way it only signals on the last completion handler in the chain. You may need to block in a different DispatchQueue though. I think if you need to block, you probably have something about your design that needs to be reconsidered.
If you find yourself reaching for semaphores, it is almost always a mistake. Semaphores are inefficient at best, and introduce deadlock risks if misused. Semaphores should generally be avoided. (Don't get me wrong: Semaphores can be useful in some very narrow use cases, but this is not one of them.)
Use asynchronous patterns. One simple approach might be to recursively call the routine, calling the completion handler when done:
func startFetching(completion: @escaping () -> Void) {
    fetchPlayers(page: 0, completion: completion)
}
private func fetchPlayers(page: Int, completion: @escaping () -> Void) {
    // prepare request
    // now perform request
    performRequest(...) { ...
        if let error = error {
            completion()
            return
        }
        ...
        if doesNeedMorePlayers {
            self.fetchPlayers(page: page + 1, completion: completion)
        } else {
            completion()
        }
    }
}
Personally, I would probably add another closure to emit the players retrieved as we go along, in the spirit of (if not actually) a Combine Publisher. Or, if you want to update the UI all at once at the very end, just pass the players retrieved thus far as an additional parameter in this recursive routine and pass the whole array back in the completion handler. But avoid globals or other state properties.
But the broader idea is to scrupulously avoid semaphores and instead embrace asynchronous patterns.
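As a rough illustration of passing the accumulated players through the recursion, here is a minimal sketch, assuming Swift 5 language mode. fetchPage is a hypothetical stand-in for the real paged request, integer IDs stand in for players, and the even-ID filter stands in for "belongs to the selected team".

```swift
import Foundation

// Hypothetical paged request: page n asynchronously yields four player IDs.
func fetchPage(_ page: Int, completion: @escaping ([Int]) -> Void) {
    DispatchQueue.global().async {
        completion((0..<4).map { page * 4 + $0 })
    }
}

// Requests pages one after another until `minimum` matches are collected
// or the pages run out, then hands the whole array to `completion`.
func fetchPlayers(page: Int, totalPages: Int, minimum: Int,
                  collected: [Int] = [],
                  completion: @escaping ([Int]) -> Void) {
    fetchPage(page) { players in
        let matches = collected + players.filter { $0 % 2 == 0 }
        if matches.count < minimum && page + 1 < totalPages {
            // The next request starts only inside this completion handler.
            fetchPlayers(page: page + 1, totalPages: totalPages, minimum: minimum,
                         collected: matches, completion: completion)
        } else {
            completion(matches)
        }
    }
}

// Demo only: a command-line script must stay alive until the chain ends.
// An app would simply reload its table view in the completion handler.
let demoFinished = DispatchGroup()
var shownPlayers: [Int] = []
demoFinished.enter()
fetchPlayers(page: 0, totalPages: 10, minimum: 5) { players in
    shownPlayers = players
    demoFinished.leave()
}
demoFinished.wait()
print(shownPlayers) // [0, 2, 4, 6, 8, 10]
```

No state lives outside the call chain, so two overlapping fetches cannot corrupt each other's page counter or result list.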
I've set up this script to loop through a bunch of data in the background and I've successfully set up a semaphore to keep everything (the array that will populate the table) in order but I cannot exactly understand how or why the semaphore keeps the array in order. The dispatchGroup is entered, the loop stops and waits until the image is downloaded, once the image is gotten the dispatchSemaphore is set to 1 and immediately the dispatchGroup is exited and the semaphore set back to 0. The semaphore is toggled so fast from 0 to 1 that I don't understand how it keeps the array in order.
let dispatchQueue = DispatchQueue(label: "someTask")
let dispatchGroup = DispatchGroup()
let dispatchSemaphore = DispatchSemaphore(value: 0)
dispatchQueue.async {
    for doc in snapshot.documents {
        // create data object for array
        dispatchGroup.enter()
        // get image with asynchronous completion handler
        Storage.storage().reference(forURL: imageId).getData(maxSize: 1048576, completion: { (data, error) in
            defer {
                dispatchSemaphore.signal()
                dispatchGroup.leave()
            }
            if let imageData = data,
               error == nil {
                // add image to data object
                // append to array
            }
        })
        dispatchSemaphore.wait()
    }
    // do some extra stuff in background after loop is done
}
dispatchGroup.notify(queue: dispatchQueue) {
    DispatchQueue.main.async {
        self.tableView.reloadData()
    }
}
The solution is in your comment "get image with asynchronous completion handler". Without the semaphore, all image downloads would be started at the same time and race for completion, so the image that downloads fastest would be added to the array first.
So after you start your download you immediately wait on your semaphore. This will block until it is signaled in the callback closure from the getData method. Only then the loop can continue to the next document and download it. This way you download one file after another and block the current thread while the downloads are running.
Using a serial queue is not an option here, since this would only cause the downloads to start serially, but you can’t affect the order in which they finish.
This is rather inefficient, though. Your network layer can probably run faster if you give it multiple requests at the same time (think of parallel downloads and HTTP pipelining). Also, you're 'wasting' a thread which could do different work in the meantime. If there is more work to do at the same time, GCD will spawn another thread, which wastes memory and other resources.
A better pattern would be to skip the semaphore, let the downloads run in parallel and store the image directly at the correct index in your array. This of course means you have to prepare an array of the appropriate size beforehand, and you have to think of a placeholder for missing or failed images. Optionals would do the trick nicely:
var images: [UIImage?] = Array(repeating: nil, count: snapshot.documents.count)
for (index, doc) in snapshot.documents.enumerated() {
    // create data object for array
    dispatchGroup.enter()
    // get image with asynchronous completion handler
    Storage.storage().reference(forURL: imageId).getData(maxSize: 1048576) { data, error in
        defer {
            dispatchGroup.leave()
        }
        if let imageData = data,
           error == nil {
            // add image to data object; the slot stays nil if decoding fails
            images[index] = UIImage(data: imageData)
        }
    }
}
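The index-based pattern above can be tried without a network. This is a minimal sketch, assuming Swift 5 language mode; fakeDownload is a hypothetical stand-in for the storage call, strings stand in for images, and ID 2 simulates a failed download.

```swift
import Foundation

// Hypothetical download: completes after a random delay, failing for id 2.
func fakeDownload(_ id: Int, completion: @escaping (String?) -> Void) {
    let delay = Double.random(in: 0...0.05)
    DispatchQueue.global().asyncAfter(deadline: .now() + delay) {
        completion(id == 2 ? nil : "image-\(id)")
    }
}

let documentIDs = [0, 1, 2, 3, 4]
// One slot per document; nil is the placeholder for a missing image.
var downloaded = [String?](repeating: nil, count: documentIDs.count)
let slotLock = DispatchQueue(label: "example.slot-lock") // serial, guards the array
let allDownloads = DispatchGroup()

for (index, id) in documentIDs.enumerated() {
    allDownloads.enter()
    fakeDownload(id) { image in
        // Completion order is random, but each result lands in its own slot.
        slotLock.sync { downloaded[index] = image }
        allDownloads.leave()
    }
}

// Demo only: block until every download has reported back.
allDownloads.wait()
print(downloaded)
```

The serial slotLock is there because these fake completions arrive on arbitrary background threads; if a library always delivers its callbacks on one serial queue, that queue already provides the same protection.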
The DispatchGroup isn't really doing anything here. You have mutual exclusion granted by the DispatchSemaphore, and the ordering is simply provided by the iteration order of snapshot.documents.
I would like to perform multiple Alamofire requests. However, because of data dependencies, a new request should only start when the previous one has finished.
I already asked a question with a more general example of an asynchronous request, which was solved with OperationQueue. However, I have not managed to achieve the same with Alamofire.
public func performAlamofireRequest(_ number: Int, success: @escaping (Int) -> Void) -> Void {
    Alamofire.request(String(format: "http://jsonplaceholder.typicode.com/posts/%i", number+1)) // NSURLSession dispatch queue
        .responseString { response in // Completion handler at main dispatch queue?
            if response.result.isSuccess {
                // print("data")
            } else if response.result.isFailure {
                // print("error")
            }
            success(number) // Always leave closure in this example
        }
}
To ensure that each request has finished before the next one starts, I use OperationQueue as follows:
let operationQueue = OperationQueue.main
for operationNumber in 0..<4 { // Create some operations
    let operation = BlockOperation(block: {
        performAlamofireRequest(operationNumber) { number in
            print("Operation #\(number) finished")
        }
    })
    operation.name = "Operation #\(operationNumber)"
    if operationNumber > 0 {
        operation.addDependency(operationQueue.operations.last!)
    }
    operationQueue.addOperation(operation)
}
However, the output is:
Operation #0 finished
Operation #3 finished
Operation #2 finished
Operation #1 finished
which is clearly not correct.
How would it be possible to achieve this with Alamofire?
The issue is just the same as in the related question you posed: operation dependencies are on an operation finishing, as documented, but you have written code where the operation exits right after asynchronously dispatching a request for future execution. The operations you created and added to the queue do finish in the order set by their dependencies, but the requests are fired concurrently by the NSURLSession underlying Alamofire.
If you need serial execution, you can for instance do the following:
// you should create an operation queue, not use OperationQueue.main here:
// synchronous network IO that would end up waiting on main queue is a real bad idea.
let operationQueue = OperationQueue()
// run one operation at a time, so each request starts only after the previous one has finished
operationQueue.maxConcurrentOperationCount = 1
let timeout: TimeInterval = 30.0
for operationNumber in 0..<4 {
    let operation = BlockOperation {
        let s = DispatchSemaphore(value: 0)
        self.performAlamofireRequest(operationNumber) { number in
            // do stuff with the response.
            s.signal()
        }
        // the timeout here is really an extra safety measure: the request itself should time out and end up firing the completion handler.
        _ = s.wait(timeout: .now() + timeout)
    }
    operationQueue.addOperation(operation)
}
Various other solutions are discussed in connection to this question, arguably a duplicate. There's also Alamofire-Synchronous.
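Another way to address the point that a block operation finishes before its request completes is an operation that stays "executing" until its completion handler runs. The following is a minimal sketch for Apple platforms, assuming Swift 5 language mode; AsyncOperation and FakeRequestOperation are hypothetical names, and a random delay stands in for the Alamofire request.

```swift
import Foundation

/// Base class whose operations stay "executing" until finish() is called.
class AsyncOperation: Operation {
    private let stateLock = NSLock()
    private var executingState = false
    private var finishedState = false

    override var isAsynchronous: Bool { true }
    override var isExecuting: Bool {
        stateLock.lock(); defer { stateLock.unlock() }
        return executingState
    }
    override var isFinished: Bool {
        stateLock.lock(); defer { stateLock.unlock() }
        return finishedState
    }

    override func start() {
        guard !isCancelled else { finish(); return }
        willChangeValue(forKey: "isExecuting")
        stateLock.lock(); executingState = true; stateLock.unlock()
        didChangeValue(forKey: "isExecuting")
        main()
    }

    /// Subclasses call this from their completion handler.
    func finish() {
        willChangeValue(forKey: "isExecuting")
        willChangeValue(forKey: "isFinished")
        stateLock.lock(); executingState = false; finishedState = true; stateLock.unlock()
        didChangeValue(forKey: "isFinished")
        didChangeValue(forKey: "isExecuting")
    }
}

/// Hypothetical request that completes after a short random delay.
final class FakeRequestOperation: AsyncOperation {
    let number: Int
    let onResponse: (Int) -> Void
    init(number: Int, onResponse: @escaping (Int) -> Void) {
        self.number = number
        self.onResponse = onResponse
        super.init()
    }
    override func main() {
        DispatchQueue.global().asyncAfter(deadline: .now() + Double.random(in: 0...0.05)) {
            self.onResponse(self.number)
            self.finish() // only now may dependent operations start
        }
    }
}

let requestQueue = OperationQueue()
let orderLock = NSLock()
var finishedOrder: [Int] = []
var previousOperation: Operation?
for number in 0..<4 {
    let operation = FakeRequestOperation(number: number) { n in
        orderLock.lock(); finishedOrder.append(n); orderLock.unlock()
    }
    if let previous = previousOperation { operation.addDependency(previous) }
    requestQueue.addOperation(operation)
    previousOperation = operation
}
requestQueue.waitUntilAllOperationsAreFinished() // demo only
print(finishedOrder) // [0, 1, 2, 3]
```

No thread is parked on a semaphore here; the queue simply does not consider an operation finished until finish() posts the isFinished change.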
I'm getting into concurrent programming and have run into some semaphore issues.
My function first loads data from the server, analyzes the received info and then, if necessary, makes a second request to the server.
I tried different ways to make it run, but none of them worked well.
My current code seems correct TO ME, but on the second request it just locks up (maybe a deadlock), and the last log is "<__NSCFLocalDataTask: 0x7ff470c58c90>{ taskIdentifier: 2 } { suspended }"
Please tell me what I am missing. Maybe there is a more elegant way to work with completion handlers for these purposes?
Thank you in advance!
var users = [Int]()
let linkURL = URL.init(string: "https://bla bla")
let session = URLSession.shared()
let semaphore = DispatchSemaphore.init(value: 0)
let dataRequest = session.dataTask(with:linkURL!) { (data, response, error) in
    let json = JSON (data: data!)
    if (json["queue"]["numbers"].intValue>999) {
        for i in 0...999 {
            users.append(json["queue"]["values"][i].intValue)
        }
        for i in 1...lround(json["queue"]["numbers"].doubleValue/1000) {
            let session2 = URLSession.shared()
            let semaphore2 = DispatchSemaphore.init(value: 0)
            let linkURL = URL.init(string: "https://bla bla")
            let dataRequest2 = session2.dataTask(with:linkURL!) { (data, response, error) in
                let json = JSON (data: data!)
                print(i)
                semaphore2.signal()
            }
            dataRequest2.resume()
            semaphore2.wait(timeout: DispatchTime.distantFuture)
        }
    }
    semaphore.signal()
}
dataRequest.resume()
semaphore.wait(timeout: DispatchTime.distantFuture)
P.S. Why do I do it. Server returns limited count of data. To get more, I have to use offset.
This is deadlocking because you are waiting for a semaphore on the URLSession's delegateQueue. The default delegate queue is not the main queue, but it is a serial background queue (i.e. an OperationQueue with a maxConcurrentOperationCount of 1). So your code is waiting for a semaphore on the same serial queue that is supposed to be signaling the semaphore.
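The mechanism can be reproduced with plain GCD. This is a minimal sketch, assuming Swift 5 language mode; a serial DispatchQueue with a made-up label plays the role of the session's delegate queue, and a short timeout is used so the demonstration ends instead of hanging forever.

```swift
import Dispatch

let delegateLikeQueue = DispatchQueue(label: "example.delegate-queue") // serial
let demoFinished = DispatchSemaphore(value: 0)
var waitResult: DispatchTimeoutResult = .success

delegateLikeQueue.async {
    let inner = DispatchSemaphore(value: 0)
    // This block is queued behind the one that is currently running...
    delegateLikeQueue.async { inner.signal() }
    // ...so waiting for it here can never succeed; with .distantFuture this would hang.
    waitResult = inner.wait(timeout: .now() + 0.2)
    demoFinished.signal()
}

demoFinished.wait()
print(waitResult == .timedOut) // true
```

The nested block only gets to run after the outer block has given up waiting, which is exactly the situation of the inner data task's completion handler in the question.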
The tactical fix is to make sure you're not calling wait on the same serial queue that the session's completion handlers are running on. There are two obvious fixes:
Do not use the shared session (whose delegateQueue is a serial queue), but rather instantiate your own URLSession and specify its delegateQueue to be a concurrent OperationQueue that you create:
let queue = OperationQueue()
queue.name = "com.domain.app.networkqueue"
let configuration = URLSessionConfiguration.default
let session = URLSession(configuration: configuration, delegate: nil, delegateQueue: queue)
Alternatively, you can solve this by dispatching the code with the semaphore off to some other queue, e.g.
let mainRequest = session.dataTask(with: mainUrl) { data, response, error in
    // ...
    DispatchQueue.global(qos: .userInitiated).async {
        let semaphore = DispatchSemaphore(value: 0)
        for i in 1 ... n {
            let childUrl = URL(string: "https://blabla/\(i)")!
            let childRequest = session.dataTask(with: childUrl) { data, response, error in
                // ...
                semaphore.signal()
            }
            childRequest.resume()
            _ = semaphore.wait(timeout: .distantFuture)
        }
    }
}
mainRequest.resume()
For the sake of completeness, I'll note that you probably shouldn't be using semaphores to issue these requests at all, because you'll end up paying a material performance penalty for issuing a series of consecutive requests (plus you're blocking a thread, which is generally discouraged).
The refactoring needed to do that is somewhat more involved. It basically entails issuing a series of concurrent requests, perhaps using "download" tasks rather than "data" tasks to minimize memory impact, and then, when all of the requests are done, piecing it all together as needed at the end (triggered by either an Operation "completion" operation or a dispatch group notification).