How to suspend subsequent tasks until first finishes then share its response with tasks that waited? - swift

I have an actor which throttles requests in a way where the first one will suspend subsequent requests until finished, then share its response with them so they don't have to make the same request.
Here's what I'm trying to do:
let cache = Cache()
let operation = OperationStatus()
func execute() async {
if await operation.isExecuting else {
await operation.waitUntilFinished()
} else {
await operation.set(isExecuting: true)
}
if let data = await cache.data {
return data
}
let request = myRequest()
let response = await myService.send(request)
await cache.set(data: response)
await operation.set(isExecuting: false)
}
actor Cache {
var data: myResponse?
func set(data: myResponse?) {
self.data = data
}
}
actor OperationStatus {
#Published var isExecuting = false
private var cancellable = Set<AnyCancellable>()
func set(isExecuting: Bool) {
self.isExecuting = isExecuting
}
func waitUntilFinished() async {
guard isExecuting else { return }
return await withCheckedContinuation { continuation in
$isExecuting
.first { !$0 } // Wait until execution toggled off
.sink { _ in continuation.resume() }
.store(in: &cancellable)
}
}
}
// Do something
DispatchQueue.concurrentPerform(iterations: 1_000_000) { _ in execute() }
This ensures one request at a time, and subsequent calls are waiting until finished. It seems this works but wondering if there's a pure Concurrency way instead of mixing Combine in, and how I can test this? Here's a test I started but I'm confused how to test this:
final class OperationStatusTests: XCTestCase {
private let iterations = 10_000 // 1_000_000
private let outerIterations = 10
actor Storage {
var counter: Int = 0
func increment() {
counter += 1
}
}
func testConcurrency() {
// Given
let storage = Storage()
let operation = OperationStatus()
let promise = expectation(description: "testConcurrency")
promise.expectedFulfillmentCount = outerIterations * iterations
#Sendable func execute() async {
guard await !operation.isExecuting else {
await operation.waitUntilFinished()
promise.fulfill()
return
}
await operation.set(isExecuting: true)
try? await Task.sleep(seconds: 8)
await storage.increment()
await operation.set(isExecuting: false)
promise.fulfill()
}
waitForExpectations(timeout: 10)
// When
DispatchQueue.concurrentPerform(iterations: outerIterations) { _ in
(0..<iterations).forEach { _ in
Task { await execute() }
}
}
// Then
// XCTAssertEqual... how to test?
}
}

Before I tackle a more general example, let us first dispense with some natural examples of sequential execution of asynchronous tasks, passing the result of one as a parameter of the next. Consider:
func entireProcess() async throws {
let value = try await first()
let value2 = try await subsequent(with: value)
let value3 = try await subsequent(with: value2)
let value4 = try await subsequent(with: value3)
// do something with `value4`
}
Or
func entireProcess() async throws {
var value = try await first()
for _ in 0 ..< 4 {
value = try await subsequent(with: value)
}
// do something with `value`
}
This is the easiest way to declare a series of async functions, each of which takes the prior result as the input for the next iteration. So, let us expand the above to include some signposts for Instruments’ “Points of Interest” tool:
import os.log
private let log = OSLog(subsystem: "Test", category: .pointsOfInterest)
func entireProcess() async throws {
let id = OSSignpostID(log: log)
os_signpost(.begin, log: log, name: #function, signpostID: id, "start")
var value = try await first()
for _ in 0 ..< 4 {
os_signpost(.event, log: log, name: #function, "Scheduling: %d with input of %d", i, value)
value = try await subsequent(with: value)
}
os_signpost(.end, log: log, name: #function, signpostID: id, "%d", value)
}
func first() async throws -> Int {
let id = OSSignpostID(log: log)
os_signpost(.begin, log: log, name: #function, signpostID: id, "start")
try await Task.sleep(seconds: 1)
let value = 42
os_signpost(.end, log: log, name: #function, signpostID: id, "%d", value)
return value
}
func subsequent(with value: Int) async throws -> Int {
let id = OSSignpostID(log: log)
os_signpost(.begin, log: log, name: #function, signpostID: id, "%d", value)
try await Task.sleep(seconds: 1)
let newValue = value + 1
defer { os_signpost(.end, log: log, name: #function, signpostID: id, "%d", newValue) }
return newValue
}
So, there you see a series of requests that pass their result to the subsequent request. All of that os_signpost signpost stuff is so we can visually see that they are running sequentially in Instrument’s “Points of Interest” tool:
You can see ⓢ event signposts as each task is scheduled, and the intervals illustrate the sequential execution of these asynchronous tasks.
This is the easiest way to have dependencies between tasks, passing values from one task to another.
Now, that begs the question is how to generalize the above, where we await the prior task before starting the next one.
One pattern is to write an actor that awaits the result of the prior one. Consider:
actor SerialTasks<Success> {
private var previousTask: Task<Success, Error>?
func add(block: #Sendable #escaping () async throws -> Success) {
previousTask = Task { [previousTask] in
let _ = await previousTask?.result
return try await block()
}
}
}
Unlike the previous example, this does not require that you have a single function from which you initiate the subsequent tasks. E.g., I have used the above when some separate user interaction requires me to add a new task to the end of the list of previously submitted tasks.
There are two subtle, yet critical, aspects of the above actor:
The add method, itself, must not be an asynchronous function. We need to avoid actor reentrancy. If this were an async function (like in your example), we would lose the sequential execution of the tasks.
The Task has a [previousTask] capture list to capture a copy of the prior task. This way, each task will await the prior one, avoiding any races.
The above can be used to make a series of tasks run sequentially. But it is not passing values between tasks, itself. I confess that I have used this pattern where I simply need sequential execution of largely independent tasks (e.g., sending separate commands being sent to some Process). But it can probably be adapted for your scenario, in which you want to “share its response with [subsequent requests]”.
I would suggest that you post a separate question with MCVE with a practical example of precisely what you wanted to pass from one asynchronous function to another. I have, for example, done permutation of the above, passing integer from one task to another. But in practice, that is not of great utility, as it gets more complicated when you start dealing with the reality of heterogenous results parsing. In practice, the simple example with which I started this question is the most practical pattern.
On the broader question of working with/around actor reentrancy, I would advise keeping an eye out on SE-0306 - Future Directions which explicitly contemplates some potential elegant forthcoming alternatives. I would not be surprised to see some refinements, either in the language itself, or in the Swift Async Algorithms library.
tl;dr
I did not want to encumber the above with discussion regarding your code snippets, but there are quite a few issues. So, if you forgive me, here are some observations:
The attempt to use OperationStatus to enforce sequential execution of async calls will not work because actors feature reentrancy. If you have an async function, every time you hit an await, that is a suspension point at which point another call to that async function is allowed to proceed. The integrity of your OperationStatus logic will be violated. You will not experience serial behavior.
If you are interesting in suspension points, I might recommend watching WWDC 2021 video Swift concurrency: Behind the scenes.
The testConcurrency is calling waitForExpectations before it actually starts any tasks that will fulfill any expectations. That will always timeout.
The testConcurrency is using GCD concurrentPerform, which, in turn, just schedules an asynchronous task and immediately returns. That defeats the entire purpose of concurrentPerform (which is a throttling mechanism for running a series of synchronous tasks in parallel, but not exceed the maximum number of cores on your CPU). Besides, Swift concurrency features its own analog to concurrentPerform, namely the constrained “cooperative thread pool” (also discussed in that video, IIRC), rendering concurrentPerform obsolete in the world of Swift concurrency.
Bottom line, it doesn't make sense to include concurrentPerform in a Swift concurrency codebase. It also does not make sense to use concurrentPerform to launch asynchronous tasks (whether Swift concurrency or GCD). It is for launching a series of synchronous tasks in parallel.
In execute in your test, you have two paths of execution, one which will await some state change and fulfills the expectation without ever incrementing the storage. That means that you will lose some attempts to increment the value. Your total will not match the desired resulting value. Now, if your intent was to drop requests if another was pending, that's fine. But I don't think that was your intent.
In answer to your question about how to test success at the end. You might do something like:
actor Storage {
private var counter: Int = 0
func increment() {
counter += 1
}
var value: Int { counter }
}
func testConcurrency() async {
let storage = Storage()
let operation = OperationStatus()
let promise = expectation(description: "testConcurrency")
let finalCount = outerIterations * iterations
promise.expectedFulfillmentCount = finalCount
#Sendable func execute() async {
guard await !operation.isExecuting else {
await operation.waitUntilFinished()
promise.fulfill()
return
}
await operation.set(isExecuting: true)
try? await Task.sleep(seconds: 1)
await storage.increment()
await operation.set(isExecuting: false)
promise.fulfill()
}
// waitForExpectations(timeout: 10) // this is not where you want to wait; moved below, after the tasks started
// DispatchQueue.concurrentPerform(iterations: outerIterations) { _ in // no point in this
for _ in 0 ..< outerIterations {
for _ in 0 ..< iterations {
Task { await execute() }
}
}
await waitForExpectations(timeout: 10)
// test the success to see if the store value was correct
let value = await storage.value // to test that you got the right count, fetch the value; note `await`, thus we need to make this an `async` test
// Then
XCTAssertEqual(finalCount, value, "Count")
}
Now, this test will fail for a variety of reasons, but hopefully this illustrates how you would verify the success or failure of the test. But, note, that this will test only that the final result was correct, not that they were executed sequentially. The fact that Storage is an actor will hide the fact that they were not really invoked sequentially. I.e., if you really needed the results of one request to prepare the next is not tested here.
If, as you go through this, you want to really confirm the behavior of your OperationStatus pattern, I would suggest using os_signpost intervals (or simple logging statements where your tasks start and finish). You will see that the separate invocations of the asynchronous execute method are not running sequentially.

Related

Cancelling the task of a URLSession.AsyncBytes doesn't seem to work

I want to download a large file, knowing the number of bytes transferred, and be able to cancel the download if necessary.
I know that this can be done having a URLSessionDownloadTask and conforming to the URLSessionDownloadDelegate, but I wanted to achieve it through an async/await mechanism, so I used URLSession.shared.bytes(from: url) and then a for-await-in loop to handle each byte.
The issue comes when trying to cancel the ongoing task, as even though the URLSession.AsyncBytes's Task has been cancelled, the for-await-in loop keeps processing bytes, so I'm assuming that the download is still ongoing.
I've tested it with this piece of code in a playground.
let url = URL(string: "https://example.com/large_file.zip")!
let (asyncBytes, _) = try await URLSession.shared.bytes(from: url)
DispatchQueue.main.asyncAfter(deadline: .now() + 1) {
asyncBytes.task.cancel()
}
var data = Data()
for try await byte in asyncBytes {
data.append(byte)
print(data.count)
}
I would have expected that, as soon as the task is cancelled, the download would have been stopped and, therefore, the for-await-in would stop processing bytes.
What am I missing here? Can these tasks not be effectively cancelled?
Canceling a URLSessionDataTask works fine with AsyncBytes. That having been said, even if the URLSessionDataTask is canceled, the AsyncBytes will continue to iterate through the bytes received prior to cancelation. But the data task does stop.
Consider experiment1:
#MainActor
class ViewModel: ObservableObject {
private let url: URL = …
private let session: URLSession = …
private var cancelButtonTapped = false
private var dataTask: URLSessionDataTask?
#Published var bytesBeforeCancel = 0
#Published var bytesAfterCancel = 0
func experiment1() async throws {
let (asyncBytes, _) = try await session.bytes(from: url)
dataTask = asyncBytes.task
var data = Data()
for try await byte in asyncBytes {
if cancelButtonTapped {
bytesAfterCancel += 1
} else {
bytesBeforeCancel += 1
}
data.append(byte)
}
}
func cancel() {
dataTask?.cancel()
cancelButtonTapped = true
}
}
So, I canceled after 1 second (at which point I had iterated through 2,022 bytes), and it continues to iterate through the remaining 14,204 bytes that had been received prior to the cancelation of the URLSessionDataTask. But the download does stop successfully. (In my example, the actual asset being downloaded was 74mb.) When using URLSession, the data comes in packets, so it takes AsyncBytes a little time to get through everything that was actually received before the URLSession request was canceled.
You might consider canceling the Swift concurrency Task, rather than the URLSessionDataTask. (I really wish they did not use the same word, “task”, to refer to entirely different concepts!)
Consider experiment2:
#MainActor
class ViewModel: ObservableObject {
private let url: URL = …
private let session: URLSession = …
private var cancelButtonTapped = false
private var task: Task<Void, Error>?
#Published var bytesBeforeCancel = 0
#Published var bytesAfterCancel = 0
func experiment2() async throws {
task = Task { try await download() }
try await task?.value
}
func cancel() {
task?.cancel()
cancelButtonTapped = true
}
func download() async throws {
let (asyncBytes, _) = try await session.bytes(from: url)
var data = Data()
for try await byte in asyncBytes {
try Task.checkCancellation()
if cancelButtonTapped { // this whole `if` statement is no longer needed, but I've kept it here for comparison to the previous example
bytesAfterCancel += 1
} else {
bytesBeforeCancel += 1
}
data.append(byte)
}
}
}
Without the try Task.checkCancellation() line, the behavior is almost the same as in experiment1. The cancelation of the Task with the AsyncBytes will result in the cancelation of the underlying URLSessionDataTask (but the sequence will continue to iterate through the bytes in the packets that were successfully received prior to cancelation). But with try Task.checkCancellation(), it will exit as soon as the Task is canceled.
TL;DR Read Rob's answer, but the iterator code and and the partial download code are still handy so I'm leaving this answer with corrections.
Okay so I spent some time on this because I'm about to try to write my own cancellable url stream object. and it appears that asyncBytes.task.cancel() is more along the lines of URLSession's finishTasksAndInvalidate() than invalidateAndCancel(). Since you are pointing your streaming task at a file that isn't really that large the URLSessionDataTask had already gotten the bytes in the buffer.
You can see this when you change up the function a bit (see Rob's example as well):
func test_funcCondition(timeOut:TimeInterval, url:URL, session:URLSession) async throws {
let (asyncBytes, _) = try await session.bytes(from: url)
let deadLine = Date.now + timeOut
var data = Data()
func someConditionCheck(_ deadline:Date) -> Bool {
Date.now > deadLine
}
for try await byte in asyncBytes {
if someConditionCheck(deadLine) {
asyncBytes.task.cancel()
print("trying to cancel...")
}
//Wrong type of task! Should not work. if Task.isCancelled { print ("cancelled") }
data.append(byte)
//just to reduce the amount of printing
if data.count % 100 == 0 {
print(data.count)
}
}
}
If you point the URL at "https://example.com/large_file.zip" like your example and make the time interval very short the function will print "trying to cancel..." between the time your marker hits and the file completes. It does NOT however, ever print "cancelled". (The task being cancelled is a URLSessionDataTask, not a Swift concurrency Task, that line never would have worked.)
If you point either what you wrote or this function at a Server-Sent-Event stream it will cancel out just fine. (While true, its not in contrast to the other behavior, which also works just fine. There are just bigger pauses in SSE data.)
If that isn't what you want, if you want to be able to start-stop streams mid-chunk, maybe explore a custom delegate (something I haven't done yet myself), or go work with AVFoundation if that's an option because they've thought a lot about working with large streaming files. I did not check making my own session and running session.invalidateAndCancel() on it instead, because that seems kind of extreme, but may be the way to go if you want to flush the buffer immediately.
The below will work to stop caring about the buffer immediately. It involves making a custom iterator. but it seems kind of quirky and may not in fact arrest the downloading (still cost users data rates and power). I haven't looked into how the stream protocol relates to the network protocol on that lower level, if you stop asking does it stop getting? I don't know. The cancel will arrest the stream allowing through the bytes that are already in the buffer, but your code won't get them. On my todo-list now is to look into how to change buffering policies.
Rob's code seems a nice way to go and advantage of a concurrency Task.
func test_customIterator(timeOut:TimeInterval, url:URL, session:URLSession) async throws {
let (asyncBytes, _) = try await session.bytes(from: url)
let deadLine = Date.now + timeOut
var data = Data()
func someConditionCheck(_ deadline:Date) -> Bool {
Date.now > deadLine
}
//could also be asyncBytes.lines.makeAsyncIterator(), etc.
var iterator = asyncBytes.makeAsyncIterator()
while !someConditionCheck(deadLine) {
//await Task.yield()
let byte = try await iterator.next()
data.append(byte!)
print(data.count)
}
//make sure to still tell URLSession you aren't listening anymore.
//It may auto-close but that's not how I roll.
asyncBytes.task.cancel()
}
let tap_out:TimeInterval = 0.0005
try await test_customIterator(timeOut: tap_out, url: URL(string:"https://example.com/large_file.zip")!, session: URLSession.shared)
Interesting flavor of behavior. Thanks for pointing it out. Also I didn't know that the task was already available (asyncBytes.task). Thanks for that. Incorrect. The asyncBytes.task is a URLSessionDataTask not a concurrency Task
UPDATED TO ADD:
To get part of the file explicitly
//https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests
func requestInChunks(data:inout Data, url:URL, session:URLSession, offset:Int, length:Int) async throws {
var urlRequest = URLRequest(url: url)
urlRequest.addValue("bytes=\(offset)-\(offset + length - 1)", forHTTPHeaderField: "Range")
let (asyncBytes, response) = try await
session.bytes(for: urlRequest, delegate: nil)
guard (response as? HTTPURLResponse)?.statusCode == 206 else { //NOT 200!!
throw APIngError("The server responded with an error.")
}
for try await byte in asyncBytes {
data.append(byte)
if data.count % 100 == 0 {
print(data.count)
}
}
}
Still think if my task on hand was about file downloading session.download would be my go to, but then there is file clean up, etc. so I get why not go there.

Actors with shared state

The following two actors share the same URLSession instance. The actors work in isolation, so is this a problem because each actor may change properties on the session at the same time? How can I protect against this?
actor ImageDownloaderA {
let session = URLSession.shared
func fetchImage() async throws -> Data {
let request = ...
return try await session.data(for: ...)
}
}
actor ImageDownloaderB {
let session = URLSession.shared
func fetchImage() async throws -> Data {
let request = ...
return try await session.data(for: ...)
}
}
struct Fetcher {
func fetcher() async throws {
let downloaderA = ImageDownloaderA()
let downloaderB = ImageDownloaderB()
_ = try await downloaderA.fetchImage()
_ = try await downloaderB.fetchImage()
}
}
URLSession is Sendable and is safe to be passed across concurrency domains. The documentation also explicitly tells us that it is thread-safe:
Thread Safety
The URL session API is thread-safe. You can freely create sessions and
tasks in any thread context. When your delegate methods call the
provided completion handlers, the work is automatically scheduled on
the correct delegate queue.
It should be noted that these two requests are not actually executing concurrently. This is not because of URLSession, but rather because this code will await the completion of ImageDownloaderA.fetchImage before even starting ImageDownloaderB.fetchImage. If we want concurrent execution, we might use async let or a task group.
Now, you’re not showing us what you’re doing with these two responses, but let’s imagine that for demonstration purposes you wanted them to simply print the number of bytes that each returned.
func fetcher() async throws {
let downloaderA = ImageDownloaderA()
let downloaderB = ImageDownloaderB()
async let data1 = downloaderA.fetchImage()
async let data2 = downloaderB.fetchImage()
let count1 = try await data1.count
let count2 = try await data2.count
print(count1, count2)
}
That will run the requests concurrently because we used async let and then only introduced await suspension points after both had been started. For details on async let, see SE-0317.
But, regardless, you can use the same URLSession in separate concurrency contexts without incident.

Can we define an async/await function that returns one value instantly, as well as another value asynchronously (as you'd normally expect)?

Picture an image loading function with a closure completion. Let's say it returns a token ID that you can use to cancel the asynchronous operation if needed.
#discardableResult
func loadImage(url: URL, completion: #escaping (Result<UIImage, Error>) -> Void) -> UUID? {
if let image = loadedImages[url] {
completion(.success(image))
return nil
}
let id = UUID()
let task = URLSession.shared.dataTask(with: url) { data, response, error in
defer {
self.requests.removeValue(forKey: id)
}
if let data, let image = UIImage(data: data) {
DispatchQueue.main.async {
self.loadedImages[url] = image
completion(.success(image))
}
return
}
if let error = error as? NSError, error.code == NSURLErrorCancelled {
return
}
//TODO: Handle response errors
print(response as Any)
completion(.failure(.loadingError))
}
task.resume()
requests[id] = task
return id
}
func cancelRequest(id: UUID) {
requests[id]?.cancel()
requests.removeValue(forKey: id)
print("ImageLoader: cancelling request")
}
How would we accomplish this (elegantly) with swift concurrency? Is it even possible or practical?
I haven't done much testing on this, but I believe this is what you're looking for. It allows you to simply await an image load, but you can cancel using the URL from somewhere else. It also merges near-simultaneous requests for the same URL so you don't re-download something you're in the middle of.
actor Loader {
private var tasks: [URL: Task<UIImage, Error>] = [:]
func loadImage(url: URL) async throws -> UIImage {
if let imageTask = tasks[url] {
return try await imageTask.value
}
let task = Task {
// Rather than removing here, you could skip that and this would become a
// cache of results. Making that correct would take more work than the
// question asks for, so I won't go into it
defer { tasks.removeValue(forKey: url) }
let data = try await URLSession.shared.data(from: url).0
guard let image = UIImage(data: data) else {
throw DecodingError.dataCorrupted(.init(codingPath: [],
debugDescription: "Invalid image"))
}
return image
}
tasks[url] = task
return try await task.value
}
func cancelRequest(url: URL) {
// Remove, and cancel if it's removed
tasks.removeValue(forKey: url)?.cancel()
print("ImageLoader: cancelling request")
}
}
Calling it looks like:
let image = try await loader.loadImage(url: url)
And you can cancel a request if it's still pending using:
loader.cancelRequest(url: url)
A key lesson here is that it is very natural to access task.value multiple times. If the task has already completed, then it will just return immediately.
Return a task in a tuple or other structure.
In the cases where you don't care about the ID, do this:
try await imageTask(url: url).task.value
private var requests: [UUID: Task<UIImage, Swift.Error>] = [:]
func imageTask(url: URL) -> (id: UUID?, task: Task<UIImage, Swift.Error>) {
switch loadedImages[url] {
case let image?: return (id: nil, task: .init { image } )
case nil:
let id = UUID()
let task = Task {
defer { requests[id] = nil }
guard let image = UIImage(data: try await URLSession.shared.data(from: url).0)
else { throw Error.loadingError }
try Task.checkCancellation()
Task { #MainActor in loadedImages[url] = image }
return image
}
requests[id] = task
return (id: id, task: task)
}
}
Is it even possible or practical?
Yes to both.
As I say in a comment, I think you may be missing the fact that a Task is an object you can retain and later cancel. Thus, if you create an architecture where you apply an ID to a task as you ask for the task to start, you can use that same ID to cancel that task before it has returned.
Here's a simple demonstration. I've deliberately written it as Playground code (though I actually developed it in an iOS project).
First, here is a general TimeConsumer class that wraps a single time-consuming Task. We can ask for the task to be created and started, but because we retain the task, we can also cancel it midstream. It happens that my task doesn't return a value, but that's neither here nor there; it could if we wanted.
class TimeConsumer {
var current: Task<(), Error>?
func consume(seconds: Int) async throws {
let task = Task {
try await Task.sleep(for: .seconds(seconds))
}
current = task
_ = await task.result
}
func cancel() {
current?.cancel()
}
}
Now then. In front of my TimeConsumer I'll put a TaskVendor actor. A TimeConsumer represents just one time-consuming task, but a TaskVendor has the ability to maintain multiple time-consuming tasks, identifying each task with an identifier.
actor TaskVendor {
private var tasks = [UUID: TimeConsumer?]()
func giveMeATokenPlease() -> UUID {
let uuid = UUID()
tasks[uuid] = nil
return uuid
}
func beginTheTask(uuid: UUID) async throws {
let consumer = TimeConsumer()
tasks[uuid] = consumer
try await consumer.consume(seconds: 10)
tasks[uuid] = nil
}
func cancel(uuid: UUID) {
tasks[uuid]??.cancel()
tasks[uuid] = nil
}
}
That's all there is to it! Observe how TaskVendor is configured. I can do three things: I can ask for a token (really my actual TaskVendor needn't bother doing this, but I wanted to centralize everything for generality); I can start the task with that token; and, optionally, I can cancel the task with that token.
So here's a simple test harness. Here we go!
let vendor = TaskVendor()
func test() async throws {
let uuid = await vendor.giveMeATokenPlease()
print("start")
Task {
try await Task.sleep(for: .seconds(2))
print("cancel?")
// await vendor.cancel(uuid: uuid)
}
try await vendor.beginTheTask(uuid: uuid)
print("finish")
}
Task {
try await test()
}
What you will see in the console is:
start
[two seconds later] cancel?
[eight seconds after that] finish
We didn't cancel anything; the word "cancel?" signals the place where our test might cancel, but we didn't, because I wanted to prove to you that this is working as we expect: it takes a total of 10 seconds between "start" and "finish", so sure enough, we are consuming the expected time fully.
Now uncomment the await vendor.cancel line. What you will see now is:
start
[two seconds later] cancel?
[immediately!] finish
We did it! We made a cancellable task vendor.
I'm including one possible answer to the question, for the benefit of others. I'll leave the question in place in case someone has another take on it.
The only way that I know of having a 'one-shot' async method that would return a token before returning the async result is by adding an inout argument:
func loadImage(url: URL, token: inout UUID?) async -> Result<UIImage, Error> {
token = UUID()
//...
}
Which we may call like this:
var requestToken: UUID? = nil
let result = await imageLoader.loadImage(url: url, token: &requestToken)
However, this approach and the two-shot solution by #matt both seem fussy, from the api design standpoint. Of course, as he suggests, this leads to a bigger question: How do we implement cancellation with swift concurrency (ideally without too much overhead)? And indeed, using tasks and wrapper objects seems unavoidable, but it certainly seems like a lot of legwork for a fairly simple pattern.

Testing parallel execution with structured concurrency

I’m testing code that uses an actor, and I’d like to test that I’m properly handling concurrent access and reentrancy. One of my usual approaches would be to use DispatchQueue.concurrentPerform to fire off a bunch of requests from different threads, and ensure my values resolve as expected. However, since the actor uses structured concurrency, I’m not sure how to actually wait for the tasks to complete.
What I’d like to do is something like:
let iterationCount = 100
let allTasksComplete = expectation(description: "allTasksComplete")
allTasksComplete.expectedFulfillmentCount = iterationCount
DispatchQueue.concurrentPerform(iterations: iterationCount) { _ in
Task {
// Do some async work here, and assert
allTasksComplete.fulfill()
}
}
wait(for: [allTasksComplete], timeout: 1.0)
However the timeout for the allTasksComplete expectation expires every time, regardless of whether the iteration count is 1 or 100, and regardless of the length of the timeout. I’m assuming this has something to do with the fact that mixing structured and DispatchQueue-style concurrency is a no-no?
How can I properly test concurrent access — specifically how can I guarantee that the actor is accessed from different threads, and wait for the test to complete until all expectations are fulfilled?
A few observations:
When testing Swift concurrency, we no longer need to rely upon expectations. We can just mark our tests as async methods. See Asynchronous Tests and Expectations. Here is an async test adapted from that example:
func testDownloadWebDataWithConcurrency() async throws {
let url = try XCTUnwrap(URL(string: "https://apple.com"), "Expected valid URL.")
let (_, response) = try await URLSession.shared.data(from: url)
let httpResponse = try XCTUnwrap(response as? HTTPURLResponse, "Expected an HTTPURLResponse.")
XCTAssertEqual(httpResponse.statusCode, 200, "Expected a 200 OK response.")
}
FWIW, while we can now use async tests when testing Swift concurrency, we still can use expectations:
func testWithExpectation() {
let iterations = 100
let experiment = ExperimentActor()
let e = self.expectation(description: #function)
e.expectedFulfillmentCount = iterations
for i in 0 ..< iterations {
Task.detached {
let result = await experiment.reentrantCalculation(i)
let success = await experiment.isAcceptable(result)
XCTAssert(success, "Incorrect value")
e.fulfill()
}
}
wait(for: [e], timeout: 10)
}
You said:
However the timeout for the allTasksComplete expectation expires every time, regardless of whether the iteration count is 1 or 100, and regardless of the length of the timeout.
We cannot comment without seeing a reproducible example of the code replaced with the comment “Do some async work here, and assert”. We do not need to see your actual implementation, but rather construct the simplest possible example that manifests the behavior you describe. See How to create a Minimal, Reproducible Example.
I personally suspect that you have some other, unrelated deadlock. E.g., given that concurrentPerform blocks the thread from which you call it, maybe you are doing something that requires the blocked thread. Also, be careful with Task { ... }, which runs the task on the current actor, so if you are doing something slow and synchronous inside there, that could cause problems. We might use detached tasks, instead.
In short, we cannot diagnose the issue without a Minimal, Reproducible Example.
As a more general observation, one should be wary about mixing GCD (or semaphores or long-lived locks or whatever) with Swift concurrency, because the latter uses a cooperative thread pool, which relies upon assumptions about its threads being able to make forward progress. But if you have GCD API blocking threads, those assumptions may no longer be valid. It may not the source of the problem here, but I mention it as a cautionary note.
As an aside, concurrentPerform (which constrains the degree of parallelism) only makes sense if the work being executed runs synchronously. Using concurrentPerform to launch a series of asynchronous tasks will not constrain the concurrency at all. (The cooperative thread pool may, but concurrentPerform will not.)
So, for example, if we wanted to test a bunch of calculations in parallel, rather than concurrentPerform, we might use a TaskGroup:
func testWithStructuredConcurrency() async {
let iterations = 100
let experiment = ExperimentActor()
await withTaskGroup(of: Void.self) { group in
for i in 0 ..< iterations {
group.addTask {
let result = await experiment.reentrantCalculation(i)
let success = await experiment.isAcceptable(result)
XCTAssert(success, "Incorrect value")
}
}
}
let count = await experiment.count
XCTAssertEqual(count, iterations)
}
Now if you wanted to verify concurrent execution within an app, normally I would just profile the app (not unit tests) with Instruments, and either watch intervals in the “Points of Interest” tool or look at the new “Swift Tasks” tool described in WWDC 2022’s Visualize and optimize Swift concurrency video. E.g., here I have launched forty tasks and I can see that my device runs six at a time:
See Alternative to DTSendSignalFlag to identify key events in Instruments? for references about the “Points of Interest” tool.
If you really wanted to write a unit test to confirm concurrency, you could theoretically keep track of your own counters, e.g.,
final class MyAppTests: XCTestCase {
func testWithStructuredConcurrency() async {
let iterations = 100
let experiment = ExperimentActor()
await withTaskGroup(of: Void.self) { group in
for i in 0 ..< iterations {
group.addTask {
let result = await experiment.reentrantCalculation(i)
let success = await experiment.isAcceptable(result)
XCTAssert(success, "Incorrect value")
}
}
}
let count = await experiment.count
XCTAssertEqual(count, iterations, "Correct count")
let degreeOfConcurrency = await experiment.maxDegreeOfConcurrency
XCTAssertGreaterThan(degreeOfConcurrency, 1, "No concurrency")
}
}
Where:
actor ExperimentActor {
var degreeOfConcurrency = 0
var maxDegreeOfConcurrency = 0
var count = 0
/// Calculate pi with Leibniz series
///
/// Note: I am awaiting a detached task so that I can manifest actor reentrancy.
func reentrantCalculation(_ index: Int, decimalPlaces: Int = 8) async -> Double {
let task = Task.detached {
logger.log("starting \(index)") // I wouldn’t generally log in a unit test, but it’s a quick visual confirmation that I’m enjoying parallel execution
await self.increaseConcurrencyCount()
let threshold = pow(0.1, Double(decimalPlaces))
var isPositive = true
var denominator: Double = 1
var result: Double = 0
var increment: Double
repeat {
increment = 4 / denominator
if isPositive {
result += increment
} else {
result -= increment
}
isPositive.toggle()
denominator += 2
} while increment >= threshold
logger.log("finished \(index)")
await self.decreaseConcurrencyCount()
return result
}
count += 1
return await task.value
}
func increaseConcurrencyCount() {
degreeOfConcurrency += 1
if degreeOfConcurrency > maxDegreeOfConcurrency { maxDegreeOfConcurrency = degreeOfConcurrency}
}
func decreaseConcurrencyCount() {
degreeOfConcurrency -= 1
}
func isAcceptable(_ result: Double) -> Bool {
return abs(.pi - result) < 0.0001
}
}
Please note that if testing/running on simulator, the cooperative thread pool is somewhat constrained, not exhibiting the same degree of concurrency as you will see on an actual device.
Also note that if you are testing whether a particular test is exhibiting parallel execution, you might want to disable the parallel execution of tests, themselves, so that other tests do not tie up your cores, preventing any given particular test from enjoying parallel execution.

dispatch group and Threading swift (Async Await)

So I was practicing threading and I have a requirement.
I want to wait for the first api call to finish then I want to execute the second one, (Just like await async) if the first one didn't finish I want to stop the code there until it finishes.
So, I wrote this. Tell me if it is correct or if there is any better way to do this
func myFunction() {
var a: Int?
let group = DispatchGroup()
group.enter()
DispatchQueue.global().asyncAfter(deadline: .now() + 4) {
a = 1
print("First")
group.leave()
}
group.wait()
group.enter()
DispatchQueue.global().asyncAfter(deadline: .now() + 2) {
a = 3
print("Second")
group.leave()
}
// wait ...
group.wait()
group.notify(queue: .main) {
print(a) // y
}
}
Swift 5.5 and Xcode 13.0 bring async/await, so you can do something like this:
import Foundation
print("Hello, World!")
// Returns a 1 after a 3-second delay
func giveMeAOne() async -> Int {
Thread.sleep(forTimeInterval: 3.0)
return 1
}
// Returns a 5 after a 3-second delay
func giveMeAFive() async -> Int {
Thread.sleep(forTimeInterval: 3.0)
return 5
}
func myFunction() async {
var a: Int = 0
print(a)
a += await giveMeAOne()
print(a)
a += await giveMeAFive()
print(a)
}
async {
await myFunction()
}
sleep(7) // added at the end so that process doesn't exit right away
Note that the above was built as a macOS command-line tool in Xcode, rather than with Swift Playgrounds. As of Xcode 13.0 beta 1, Swift Playgrounds fail on a lot of async/await stuff, unfortunately.
That code above runs them sequentially, and cleans up myFunction quite a bit. Note that this doesn't introduce any concurrency, however. Both giveMeAOne() and giveMeAFive() are run sequentially.
If you wanted to run 4 function calls concurrently, you could do something like this:
/// Returns a random integer from 1 through 10, after a 6-second delay
func randomSmallInt() async -> Int {
Thread.sleep(forTimeInterval: 6)
return Int.random(in: 1 ... 10)
}
/// Fetches 4 random integers, summing them
func myFunction() async {
// print timestamp
print(Date())
async let a = randomSmallInt()
async let b = randomSmallInt()
async let c = randomSmallInt()
async let d = randomSmallInt()
// This awaits for a, b, c, and d to all complete.
// You can only await a value once.
let numbersToAdd = await [a, b, c, d]
var sum = 0
for number in numbersToAdd {
print("adding \(number)")
sum += number
}
print("Sum = \(sum)")
// print another timestamp, so you can see that the async let statements ran
// concurrently, rather than sequentially
print(Date())
}
async {
await myFunction()
}
// make sure we have time to finish before process exits
Thread.sleep(forTimeInterval: 10)
When you run this, you can see from the timestamps that the total execution time only took about 6 seconds -- because all of the async let ... statements were executed concurrently.
This works great if you have a fixed number of things you need to call asynchronously. But what if you wanted to make 100 asynchronous function calls, perhaps loading images or something?
You may think you could do something like this, which will work, but will run everything synchronously -- probably not what you want:
// Don't do this -- these all run sequentially
var sum = 0
for _ in 0 ..< 100 {
async let value = randomSmallishInt()
let number = await value
print("Adding \(number)")
sum += number
}
If you want to do an arbitrary or unknown number of asynchronous things, the best way it to use a taskGroup. This code is a bit more verbose than it needs to be, in order to make things clearer. Note: Xcode 13.0 beta struggles with respect to print statement output, missing much of it. Anyway, here it is:
/// Returns a random integer from 1 through 10
func randomSmallInt() async -> Int {
Thread.sleep(forTimeInterval: 6)
return Int.random(in: 1 ... 10)
}
func myFunction_arbitrary_repetitions_asynchronously() async {
// print timestamp
print(Date())
// This is pretty cool --
// `withTaskGroup` creates a context for executing a group of async tasks
// In this case, `Int.self` specifies that each individual task will return
// an Int. `[Int].self` specifies that the entire group will return an array of Ints:
let numbers = await withTaskGroup(of: Int.self, returning: [Int].self) { group in
// Repeat 100 times
for _ in 0 ..< 100 {
// Run the next thing asynchronously
group.async {
return await randomSmallInt()
}
}
// Iterate through results
var result: [Int] = []
for await individualResult in group {
result.append(individualResult)
}
return result
}
var sum = 0
for number in numbers {
print("Adding \(number)")
sum += number
}
print("Sum = \(sum)")
// print another timestamp, so you can see that the async let statements ran
// concurrently, rather than sequentially
print(Date())
}
async { await myFunction_arbitrary_repetitions_asynchronously() }
Thread.sleep(10)
You could shorten this up considerably:
func myFunction_arbitrary_repetitions_asynchronously_short() async {
// print timestamp
print(Date())
// In this case, the taskGroup just returns the sum
let sum = await withTaskGroup(of: Int.self, returning: Int.self) { group in
// Repeat 100 times
for _ in 0 ..< 100 {
// Run the next thing asynchronously
group.async {
return await randomSmallInt()
}
}
// Iterate through results
var sum = 0
for await individualResult in group {
print("Adding \(individualResult)")
sum += individualResult
}
return sum
}
print("Sum = \(sum)")
// print another timestamp, so you can see that the async let statements ran
// concurrently, rather than sequentially
print(Date())
}
async { await myFunction_arbitrary_repetitions_asynchronously_short() }
Important: Xcode 13.0 beta 1 gets quite unstable when dealing with this code. Specifically, the debugger likes to detach when the process isn't finished yet. For example, the output of many of the print statements does not appear, and using breakpoints is tough with async code. Building a command-line tool and then running the resulting executable (found in ~/Library/Developer/Xcode/DerivedData//Build/Products/Debug) will at least show all the output of your print statements.
Don't wait.
Run the second API call in the completion handler of the first
func myFunction() {
var a = 0
DispatchQueue.global().asyncAfter(deadline: .now() + 4) {
a = 1
print("First")
DispatchQueue.global().asyncAfter(deadline: .now() + 2) {
a = 3
print("Second")
DispatchQueue.main.async {
print(a)
}
}
}
}
Just like await async
The lack of a built-in language-level await async mechanism is a serious issue for Swift, and will probably be remedied in a future version. In the meantime, Swift has basically no built-in language-level mechanism for threading; you have to deal with things manually, by talking to either Cocoa/Objective-C (Grand Central Dispatch, NSOperation) or some other framework that deals with asynchronicity (Combine).
The particular problem you've focused on is what I call serializing asynchronicity. There are several ways to do this.
Grand Central Dispatch
The simple way is, as you've been told, perform the second async call at the end of the execution handler of the first async call. This is not very general for lining up numerous tasks, obviously.
You can certainly use DispatchGroup in the way you have outlined. Your statement of the pattern is almost correct. You do not need the last wait, and, more important, you must get onto a background queue right at the start, because using DispatchGroup on the main queue is illegal; your code, as it stands, effectively block the main queue while we wait, which is something you must not do.
Operation Queue
You can make a serial OperationQueue and load Operation objects onto it. This is a much more general mechanism. You can make Operation objects depend upon one another. Before the arrival of the Combine framework, this was the best general solution.
Combine Framework
If you can confine yourself to iOS 13 and later, the Combine framework is the neatest way to serialize asynchronicity in a general way. See for further discussion Combine framework serialize async operations.