How do I check the current progress of URLSession.dataTaskPublisher? - swift

I'm using a dataTaskPublisher to fetch some data:
func downloadData(_ req: URLRequest) {
self.cancelToken = dataTaskPublisher(for: req).sink { /* ... */ }
}
If the function is called while the request is in progress, I would like to return.
Currently I either:
1. Set the cancelToken to nil in the sink or
2. Crate and manage a isDownloading variable.
Is there a built-in way to check if the dataTaskPublisher is running (and optionally its progress)?

I mostly agree with #Rob, however you can control state of DataTask initiated by DataTaskPublisher and its progress by using of URLSession methods:
func downloadData(_ req: URLRequest) {
URLSession.shared.getAllTasks { (tasks) in
let task = tasks.first(where: { (task) -> Bool in
return task.originalRequest?.url == req.url
})
switch task {
case .some(let task) where task.state == .running:
print("progress:", Double(task.countOfBytesReceived) / Double(task.countOfBytesExpectedToReceive))
return
default:
self.cancelToken = URLSession.shared.dataTaskPublisher(for: req).sink { /* ... */ }
}
}
}

Regarding getting progress from this publisher, looking at the source code, we can see that it’s just doing a completion-block-based data task (i.e. dataTask(with:completionHandler:)). But the resulting URLSessionTask is a private property of the private Subscription used by DataTaskPublisher. Bottom line, this publisher doesn’t provide any mechanism to monitor progress of the underlying task.
As Denis pointed out, you are limited to querying the URLSession for information about its tasks.

Related

Cancelling the task of a URLSession.AsyncBytes doesn't seem to work

I want to download a large file, knowing the number of bytes transferred, and be able to cancel the download if necessary.
I know that this can be done having a URLSessionDownloadTask and conforming to the URLSessionDownloadDelegate, but I wanted to achieve it through an async/await mechanism, so I used URLSession.shared.bytes(from: url) and then a for-await-in loop to handle each byte.
The issue comes when trying to cancel the ongoing task, as even though the URLSession.AsyncBytes's Task has been cancelled, the for-await-in loop keeps processing bytes, so I'm assuming that the download is still ongoing.
I've tested it with this piece of code in a playground.
let url = URL(string: "https://example.com/large_file.zip")!
let (asyncBytes, _) = try await URLSession.shared.bytes(from: url)
DispatchQueue.main.asyncAfter(deadline: .now() + 1) {
asyncBytes.task.cancel()
}
var data = Data()
for try await byte in asyncBytes {
data.append(byte)
print(data.count)
}
I would have expected that, as soon as the task is cancelled, the download would have been stopped and, therefore, the for-await-in would stop processing bytes.
What am I missing here? Can these tasks not be effectively cancelled?
Canceling a URLSessionDataTask works fine with AsyncBytes. That having been said, even if the URLSessionDataTask is canceled, the AsyncBytes will continue to iterate through the bytes received prior to cancelation. But the data task does stop.
Consider experiment1:
#MainActor
class ViewModel: ObservableObject {
private let url: URL = …
private let session: URLSession = …
private var cancelButtonTapped = false
private var dataTask: URLSessionDataTask?
#Published var bytesBeforeCancel = 0
#Published var bytesAfterCancel = 0
func experiment1() async throws {
let (asyncBytes, _) = try await session.bytes(from: url)
dataTask = asyncBytes.task
var data = Data()
for try await byte in asyncBytes {
if cancelButtonTapped {
bytesAfterCancel += 1
} else {
bytesBeforeCancel += 1
}
data.append(byte)
}
}
func cancel() {
dataTask?.cancel()
cancelButtonTapped = true
}
}
So, I canceled after 1 second (at which point I had iterated through 2,022 bytes), and it continues to iterate through the remaining 14,204 bytes that had been received prior to the cancelation of the URLSessionDataTask. But the download does stop successfully. (In my example, the actual asset being downloaded was 74mb.) When using URLSession, the data comes in packets, so it takes AsyncBytes a little time to get through everything that was actually received before the URLSession request was canceled.
You might consider canceling the Swift concurrency Task, rather than the URLSessionDataTask. (I really wish they did not use the same word, “task”, to refer to entirely different concepts!)
Consider experiment2:
#MainActor
class ViewModel: ObservableObject {
private let url: URL = …
private let session: URLSession = …
private var cancelButtonTapped = false
private var task: Task<Void, Error>?
#Published var bytesBeforeCancel = 0
#Published var bytesAfterCancel = 0
func experiment2() async throws {
task = Task { try await download() }
try await task?.value
}
func cancel() {
task?.cancel()
cancelButtonTapped = true
}
func download() async throws {
let (asyncBytes, _) = try await session.bytes(from: url)
var data = Data()
for try await byte in asyncBytes {
try Task.checkCancellation()
if cancelButtonTapped { // this whole `if` statement is no longer needed, but I've kept it here for comparison to the previous example
bytesAfterCancel += 1
} else {
bytesBeforeCancel += 1
}
data.append(byte)
}
}
}
Without the try Task.checkCancellation() line, the behavior is almost the same as in experiment1. The cancelation of the Task with the AsyncBytes will result in the cancelation of the underlying URLSessionDataTask (but the sequence will continue to iterate through the bytes in the packets that were successfully received prior to cancelation). But with try Task.checkCancellation(), it will exit as soon as the Task is canceled.
TL;DR Read Rob's answer, but the iterator code and and the partial download code are still handy so I'm leaving this answer with corrections.
Okay so I spent some time on this because I'm about to try to write my own cancellable url stream object. and it appears that asyncBytes.task.cancel() is more along the lines of URLSession's finishTasksAndInvalidate() than invalidateAndCancel(). Since you are pointing your streaming task at a file that isn't really that large the URLSessionDataTask had already gotten the bytes in the buffer.
You can see this when you change up the function a bit (see Rob's example as well):
func test_funcCondition(timeOut:TimeInterval, url:URL, session:URLSession) async throws {
let (asyncBytes, _) = try await session.bytes(from: url)
let deadLine = Date.now + timeOut
var data = Data()
func someConditionCheck(_ deadline:Date) -> Bool {
Date.now > deadLine
}
for try await byte in asyncBytes {
if someConditionCheck(deadLine) {
asyncBytes.task.cancel()
print("trying to cancel...")
}
//Wrong type of task! Should not work. if Task.isCancelled { print ("cancelled") }
data.append(byte)
//just to reduce the amount of printing
if data.count % 100 == 0 {
print(data.count)
}
}
}
If you point the URL at "https://example.com/large_file.zip" like your example and make the time interval very short the function will print "trying to cancel..." between the time your marker hits and the file completes. It does NOT however, ever print "cancelled". (The task being cancelled is a URLSessionDataTask, not a Swift concurrency Task, that line never would have worked.)
If you point either what you wrote or this function at a Server-Sent-Event stream it will cancel out just fine. (While true, its not in contrast to the other behavior, which also works just fine. There are just bigger pauses in SSE data.)
If that isn't what you want, if you want to be able to start-stop streams mid-chunk, maybe explore a custom delegate (something I haven't done yet myself), or go work with AVFoundation if that's an option because they've thought a lot about working with large streaming files. I did not check making my own session and running session.invalidateAndCancel() on it instead, because that seems kind of extreme, but may be the way to go if you want to flush the buffer immediately.
The below will work to stop caring about the buffer immediately. It involves making a custom iterator. but it seems kind of quirky and may not in fact arrest the downloading (still cost users data rates and power). I haven't looked into how the stream protocol relates to the network protocol on that lower level, if you stop asking does it stop getting? I don't know. The cancel will arrest the stream allowing through the bytes that are already in the buffer, but your code won't get them. On my todo-list now is to look into how to change buffering policies.
Rob's code seems a nice way to go and advantage of a concurrency Task.
func test_customIterator(timeOut:TimeInterval, url:URL, session:URLSession) async throws {
let (asyncBytes, _) = try await session.bytes(from: url)
let deadLine = Date.now + timeOut
var data = Data()
func someConditionCheck(_ deadline:Date) -> Bool {
Date.now > deadLine
}
//could also be asyncBytes.lines.makeAsyncIterator(), etc.
var iterator = asyncBytes.makeAsyncIterator()
while !someConditionCheck(deadLine) {
//await Task.yield()
let byte = try await iterator.next()
data.append(byte!)
print(data.count)
}
//make sure to still tell URLSession you aren't listening anymore.
//It may auto-close but that's not how I roll.
asyncBytes.task.cancel()
}
let tap_out:TimeInterval = 0.0005
try await test_customIterator(timeOut: tap_out, url: URL(string:"https://example.com/large_file.zip")!, session: URLSession.shared)
Interesting flavor of behavior. Thanks for pointing it out. Also I didn't know that the task was already available (asyncBytes.task). Thanks for that. Incorrect. The asyncBytes.task is a URLSessionDataTask not a concurrency Task
UPDATED TO ADD:
To get part of the file explicitly
//https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests
func requestInChunks(data:inout Data, url:URL, session:URLSession, offset:Int, length:Int) async throws {
var urlRequest = URLRequest(url: url)
urlRequest.addValue("bytes=\(offset)-\(offset + length - 1)", forHTTPHeaderField: "Range")
let (asyncBytes, response) = try await
session.bytes(for: urlRequest, delegate: nil)
guard (response as? HTTPURLResponse)?.statusCode == 206 else { //NOT 200!!
throw APIngError("The server responded with an error.")
}
for try await byte in asyncBytes {
data.append(byte)
if data.count % 100 == 0 {
print(data.count)
}
}
}
Still think if my task on hand was about file downloading session.download would be my go to, but then there is file clean up, etc. so I get why not go there.

Can we define an async/await function that returns one value instantly, as well as another value asynchronously (as you'd normally expect)?

Picture an image loading function with a closure completion. Let's say it returns a token ID that you can use to cancel the asynchronous operation if needed.
#discardableResult
func loadImage(url: URL, completion: #escaping (Result<UIImage, Error>) -> Void) -> UUID? {
if let image = loadedImages[url] {
completion(.success(image))
return nil
}
let id = UUID()
let task = URLSession.shared.dataTask(with: url) { data, response, error in
defer {
self.requests.removeValue(forKey: id)
}
if let data, let image = UIImage(data: data) {
DispatchQueue.main.async {
self.loadedImages[url] = image
completion(.success(image))
}
return
}
if let error = error as? NSError, error.code == NSURLErrorCancelled {
return
}
//TODO: Handle response errors
print(response as Any)
completion(.failure(.loadingError))
}
task.resume()
requests[id] = task
return id
}
func cancelRequest(id: UUID) {
requests[id]?.cancel()
requests.removeValue(forKey: id)
print("ImageLoader: cancelling request")
}
How would we accomplish this (elegantly) with swift concurrency? Is it even possible or practical?
I haven't done much testing on this, but I believe this is what you're looking for. It allows you to simply await an image load, but you can cancel using the URL from somewhere else. It also merges near-simultaneous requests for the same URL so you don't re-download something you're in the middle of.
actor Loader {
private var tasks: [URL: Task<UIImage, Error>] = [:]
func loadImage(url: URL) async throws -> UIImage {
if let imageTask = tasks[url] {
return try await imageTask.value
}
let task = Task {
// Rather than removing here, you could skip that and this would become a
// cache of results. Making that correct would take more work than the
// question asks for, so I won't go into it
defer { tasks.removeValue(forKey: url) }
let data = try await URLSession.shared.data(from: url).0
guard let image = UIImage(data: data) else {
throw DecodingError.dataCorrupted(.init(codingPath: [],
debugDescription: "Invalid image"))
}
return image
}
tasks[url] = task
return try await task.value
}
func cancelRequest(url: URL) {
// Remove, and cancel if it's removed
tasks.removeValue(forKey: url)?.cancel()
print("ImageLoader: cancelling request")
}
}
Calling it looks like:
let image = try await loader.loadImage(url: url)
And you can cancel a request if it's still pending using:
loader.cancelRequest(url: url)
A key lesson here is that it is very natural to access task.value multiple times. If the task has already completed, then it will just return immediately.
Return a task in a tuple or other structure.
In the cases where you don't care about the ID, do this:
try await imageTask(url: url).task.value
private var requests: [UUID: Task<UIImage, Swift.Error>] = [:]
func imageTask(url: URL) -> (id: UUID?, task: Task<UIImage, Swift.Error>) {
switch loadedImages[url] {
case let image?: return (id: nil, task: .init { image } )
case nil:
let id = UUID()
let task = Task {
defer { requests[id] = nil }
guard let image = UIImage(data: try await URLSession.shared.data(from: url).0)
else { throw Error.loadingError }
try Task.checkCancellation()
Task { #MainActor in loadedImages[url] = image }
return image
}
requests[id] = task
return (id: id, task: task)
}
}
Is it even possible or practical?
Yes to both.
As I say in a comment, I think you may be missing the fact that a Task is an object you can retain and later cancel. Thus, if you create an architecture where you apply an ID to a task as you ask for the task to start, you can use that same ID to cancel that task before it has returned.
Here's a simple demonstration. I've deliberately written it as Playground code (though I actually developed it in an iOS project).
First, here is a general TimeConsumer class that wraps a single time-consuming Task. We can ask for the task to be created and started, but because we retain the task, we can also cancel it midstream. It happens that my task doesn't return a value, but that's neither here nor there; it could if we wanted.
class TimeConsumer {
var current: Task<(), Error>?
func consume(seconds: Int) async throws {
let task = Task {
try await Task.sleep(for: .seconds(seconds))
}
current = task
_ = await task.result
}
func cancel() {
current?.cancel()
}
}
Now then. In front of my TimeConsumer I'll put a TaskVendor actor. A TimeConsumer represents just one time-consuming task, but a TaskVendor has the ability to maintain multiple time-consuming tasks, identifying each task with an identifier.
actor TaskVendor {
private var tasks = [UUID: TimeConsumer?]()
func giveMeATokenPlease() -> UUID {
let uuid = UUID()
tasks[uuid] = nil
return uuid
}
func beginTheTask(uuid: UUID) async throws {
let consumer = TimeConsumer()
tasks[uuid] = consumer
try await consumer.consume(seconds: 10)
tasks[uuid] = nil
}
func cancel(uuid: UUID) {
tasks[uuid]??.cancel()
tasks[uuid] = nil
}
}
That's all there is to it! Observe how TaskVendor is configured. I can do three things: I can ask for a token (really my actual TaskVendor needn't bother doing this, but I wanted to centralize everything for generality); I can start the task with that token; and, optionally, I can cancel the task with that token.
So here's a simple test harness. Here we go!
let vendor = TaskVendor()
func test() async throws {
let uuid = await vendor.giveMeATokenPlease()
print("start")
Task {
try await Task.sleep(for: .seconds(2))
print("cancel?")
// await vendor.cancel(uuid: uuid)
}
try await vendor.beginTheTask(uuid: uuid)
print("finish")
}
Task {
try await test()
}
What you will see in the console is:
start
[two seconds later] cancel?
[eight seconds after that] finish
We didn't cancel anything; the word "cancel?" signals the place where our test might cancel, but we didn't, because I wanted to prove to you that this is working as we expect: it takes a total of 10 seconds between "start" and "finish", so sure enough, we are consuming the expected time fully.
Now uncomment the await vendor.cancel line. What you will see now is:
start
[two seconds later] cancel?
[immediately!] finish
We did it! We made a cancellable task vendor.
I'm including one possible answer to the question, for the benefit of others. I'll leave the question in place in case someone has another take on it.
The only way that I know of having a 'one-shot' async method that would return a token before returning the async result is by adding an inout argument:
func loadImage(url: URL, token: inout UUID?) async -> Result<UIImage, Error> {
token = UUID()
//...
}
Which we may call like this:
var requestToken: UUID? = nil
let result = await imageLoader.loadImage(url: url, token: &requestToken)
However, this approach and the two-shot solution by #matt both seem fussy, from the api design standpoint. Of course, as he suggests, this leads to a bigger question: How do we implement cancellation with swift concurrency (ideally without too much overhead)? And indeed, using tasks and wrapper objects seems unavoidable, but it certainly seems like a lot of legwork for a fairly simple pattern.

Synchronize nested async network requests inside a while loop by using Semaphores

I have a func that gets a list of Players. When i fetch the players i need only to show those who belongs to the current Team so i am showing only a subset of the original list by filtering them. I don't know in advance, before making the request, how much players belong to the Team selected by the User, so i may need to do additional requests until i can display on the TableView at least 10 rows of Players. The User by pulling up from the bottom of the TableView can request more players to display. To do this i am calling a first async func request which in turn calls, inside a while, another nested async func request. Here a code to give you an idea of what i am trying to do:
let semaphore = DispatchSemaphore(value: 0)
func getTeamPlayersRequest() {
service.getTeamPlayers(...)
{
(result) in
switch result
{
case .success(let playersModel):
if let validCurrentPage = currentPageTmp ,
let validTotalPages = totalPagesTmp ,
let validNextPage = self.getTeamPlayersListNextPage()
{
while self.playersToShowTemp.count < 10 && self.currentPage < validTotalPages
{
self.currentPage = validNextPage //global var
self.fetchMorePlayers()
self.semaphore.wait() //global semaphore
}
}
case .failure(let error):
//some code...
}
})
}
private func fetchMorePlayers(){
// Completion handler of the following function is never called..
service.getTeamPlayers(requestedPage: currentPage, completion: {
(result) in
switch result
{
case .success(let playersModel):
if let validPlayerList = playersList,
let validPlayerListData = validPlayerList.data,
let validTeamModel = self.teamPlayerModel,
let validNextPage = self.getTeamPlayersListNextPage()
{
for player in validPlayerListData
{
if ( validTeamModel.id == player.team?.id)
{
self.playersToShowTemp.append(player)
}
}
}
self.currentPage = validNextPage
self.semaphore.signal() //global semaphore
case .failure(let error):
//some code...
}
}
}
I have tried both with DispatchGroup and Semaphore but i don't get it what i am doing wrong. I debugged the code and saw that the first async call get executed in a different queue (not the main queue) and a different thread. The nested async call getexecuted on a different thread but i don't know if it's the same concurrent queue of the first async call.
The completion handler of thenested call it's never called. Does anyone know why? is the self.semaphore.wait(), even if it get executed after the fetchMorePlayers() return, blocking/preventing the nested async completion handler to be called?
I am noticing through the Debugger that the completion() in the Xcode vars window has the note "swift partial apply forwarder for closure #1"
If we inline the function call in your loop, it looks something like this:
while self.playersToShowTemp.count < 10 && self.currentPage < validTotalPages
{
self.currentPage = validNextPage //global var
nbaService.getTeamPlayers(requestedPage: currentPage, completion: { ... })
self.semaphore.wait() //global semaphore
}
So nbaService.getTeamPlayers schedules a request, probably on the main DispatchQueue and immediately returns. Then you call wait on your semaphore, which blocks, probably before GCD even tries to run the task scheduled by nbaService.getTeamPlayers.
That's a problem on DispatchQueue.main, which is a serial queue. It has to be a serial queue for UI updates to work. What normally happens is on some iteration of the run loop you make a request, and return.. that bubbles back up to the run loop, which checks for more events and queued tasks. In this case, when your completion handler in getTeamPlayersRequest is waiting to be run, the run loop (via GCD) executes it for that iteration. Then you block the main thread, so the run loop can't continue. If you do need to block always do it on a different DispatchQueue, preferably a .concurrent one.
There is sometimes confusion about what .async does. It only means "run this later and right now return control back to the caller". That's all. It does not guarantee that your closure will run concurrently. It merely schedules it to be run later (possibly soon) on whatever DispatchQueue you called it on. If that queue is a serial queue, then it will be queued to run in its turn in that dispatch queue's run loop. If it's a concurrent queue (ie one you specifically set the attributes to include .concurrent). Then it will run, possibly at the same time as other tasks on that same DispatchQueue.
To avoid that instead of using a loop you can use async-chaining.
private func fetchMorePlayers(while condition: #autoclosure #escaping () -> Bool){
guard condition() else { return }
nbaService.getTeamPlayers(requestedPage: currentPage, completion: {
(result) in
switch result
{
case .success(let playersModel):
if let validPlayerList = playersList,
let validPlayerListData = validPlayerList.data,
let validTeamModel = self.teamPlayerModel,
let validNextPage = self.getTeamPlayersListNextPage()
{
for player in validPlayerListData
{
if ( validTeamModel.id == player.team?.id)
{
self.playersToShowTemp.append(player)
}
}
}
self.currentPage = validNextPage
// Chain to next call
self.fetchMorePlayers(while: condition))
case .failure(let error):
//some code...
}
}
}
Then in getTeamPlayersRequest you can do this:
func getTeamPlayersRequest() {
service.getTeamPlayers(...)
{
(result) in
switch result
{
case .success(let playersModel):
if let validCurrentPage = currentPageTmp ,
let validTotalPages = totalPagesTmp ,
let validNextPage = self.getTeamPlayersListNextPage()
{
self.currentPage = validNextPage //global var
self.fetchMorePlayers(while: self.playersToShowTemp.count < 10 && self.currentPage < validTotalPages)
}
case .failure(let error):
//some code...
}
})
}
This avoids the need to block on a semaphore, because each subsequent request happens in the completion handler of the previously completed one. The only issue is if you need for the completion handler in getTeamPlayersRequest to block while the fetchMorePlayers requests are being fetched, because now it won't you can re-introduce the semaphore. In that case the guard statement in fetchMorePlayers becomes:
guard condition() else
{
self.semaphore.signal()
return
}
That way it only signals on the last completion handler in the chain. You may need to block in a different DispatchQueue though. I think if you need to block, you probably have something about your design that needs to be reconsidered.
If you find yourself reaching for semaphores, it is almost always a mistake. Semaphores are inefficient at best, and introduce deadlock risks if misused. Semaphores should generally be avoided. (Don't get me wrong: Semaphores can be useful in some very narrow use cases, but this is not one of them.)
Use asynchronous patterns. One simple approach might be to recursively call the routine, calling the completion handler when done:
func startFetching(#escaping completion: () -> Void) {
fetchPlayers(page: 0, completion: completion)
}
private func fetchPlayers(page: Int, #escaping completion: () -> Void) {
// prepare request
// now perform request
performRequest(...) { ...
if let error = error {
completion()
return
}
...
if doesNeedMorePlayers {
fetchPlayers(page: page + 1, completion: completion)
} else {
completion()
}
}
}
Personally, I might probably add another closure to emit the players retrieved as we go along, e.g. like, if not actually, a Combine Publisher. Or if you want to update the UI all at once at the very end, just pass the players retrieved thus far as additional parameter in this recursive routine and pass the whole array back in the completion handler. But avoid globals or other state properties.
But the broader idea is to scrupulously avoid semaphores and instead embrace asynchronous patterns.

Make a Publisher from a callback

I would like to wrap a simple callback so that it would be able to be used as a Combine Publisher. Specifically the NSPersistentContainer.loadPersistentStore callback so I can publish when the container is ready to go.
func createPersistentContainer(name: String) -> AnyPublisher<NSPersistentContainer, Error> {
// What goes here?
// Happy path: send output NSPersistentContainer; send completion.
// Not happy path: send failure Error; send completion.
}
For instance, what would the internals of a function, createPersistentContainer given above, look like to enable me to do something like this in my AppDelegate.
final class AppDelegate: UIResponder, UIApplicationDelegate {
let container = createPersistentContainer(name: "DeadlyBattery")
.assertNoFailure()
.eraseToAnyPublisher()
// ...
}
Mostly this boils down to, how do you wrap a callback in a Publisher?
As one of the previous posters #Ryan pointed out, the solution is to use the Future publisher.
The problem of using only the Future, though, is that it is eager, which means that it starts executing its promise closure at the moment of creation, not when it is subscribed to. The answer to that challenge is to wrap it in the Deferred publisher:
func createPersistentContainer(name: String) -> AnyPublisher<NSPersistentContainer, Error> {
return Deferred {
Future<NSPersistentContainer, Error> { promise in
let container = NSPersistentContainer(name: name)
container.loadPersistentStores { _, error in
if let error = error {
promise(.failure(error))
} else {
promise(.success(container))
}
}
}
}.eraseToAnyPublisher()
}
It seems that Combine's Future is the correct tool for the job.
func createPersistentContainer(name: String) -> AnyPublisher<NSPersistentContainer, Error> {
let future = Future<NSPersistentContainer, Error> { promise in
let container = NSPersistentContainer(name: name)
container.loadPersistentStores { _, error in
if let error = error {
promise(.failure(error))
} else {
promise(.success(container))
}
}
}
return AnyPublisher(future)
}
NSPersistentContainer is just a convenience wrapper around a core data stack, you would be better off subscribing at the source:
NotificationCenter.default.publisher(for: .NSPersistentStoreCoordinatorStoresDidChange)

Recursive/looping NSURLSession async completion handlers

The API I use requires multiple requests to get search results. It's designed this way because searches can take a long time (> 5min). The initial response comes back immediately with metadata about the search, and that metadata is used in follow up requests until the search is complete. I do not control the API.
1st request is a POST to https://api.com/sessions/search/
The response to this request contains a cookie and metadata about the search. The important fields in this response are the search_cookie (a String) and search_completed_pct (an Int)
2nd request is a POST to https://api.com/sessions/results/ with the search_cookie appended to the URL. eg https://api.com/sessions/results/c601eeb7872b7+0
The response to the 2nd request will contain either:
The search results if the query has completed (aka search_completed_pct == 100)
Metadata about the progress of search, search_completed_pct is the progress of the search and will be between 0 and 100.
If the search is not complete, I want to make a request every 5 seconds until it's complete (aka search_completed_pct == 100)
I've found numerous posts here that are similar, many use Dispatch Groups and for loops, but that approach did not work for me. I've tried a while loop and had issues with variable scoping. Dispatch groups also didn't work for me. This smelled like the wrong way to go, but I'm not sure.
I'm looking for the proper design to make these recursive calls. Should I use delegates or are closures + loop the way to go? I've hit a wall and need some help.
The code below is the general idea of what I've tried (edited for clarity. No dispatch_groups(), error handling, json parsing, etc.)
Viewcontroller.swift
apiObj.sessionSearch(domain) { result in
Log.info!.message("result: \(result)")
})
ApiObj.swift
func sessionSearch(domain: String, sessionCompletion: (result: SearchResult) -> ()) {
// Make request to /search/ url
let task = session.dataTaskWithRequest(request) { data, response, error in
let searchCookie = parseCookieFromResponse(data!)
********* pseudo code **************
var progress: Int = 0
var results = SearchResults()
while (progress != 100) {
// Make requests to /results/ until search is complete
self.getResults(searchCookie) { searchResults in
progress = searchResults.search_pct_complete
if (searchResults == 100) {
completion(searchResults)
} else {
sleep(5 seconds)
} //if
} //self.getResults()
} //while
********* pseudo code ************
} //session.dataTaskWithRequest(
task.resume()
}
func getResults(cookie: String, completion: (searchResults: NSDictionary) -> ())
let request = buildRequest((domain), url: NSURL(string: ResultsUrl)!)
let session = NSURLSession.sharedSession()
let task = session.dataTaskWithRequest(request) { data, response, error in
let theResults = getJSONFromData(data!)
completion(theResults)
}
task.resume()
}
Well first off, it seems weird that there is no API with a GET request which simply returns the result - even if this may take minutes. But, as you mentioned, you cannot change the API.
So, according to your description, we need to issue a request which effectively "polls" the server. We do this until we retrieved a Search object which is completed.
So, a viable approach would purposely define the following functions and classes:
A protocol for the "Search" object returned from the server:
public protocol SearchType {
var searchID: String { get }
var isCompleted: Bool { get }
var progress: Double { get }
var result: AnyObject? { get }
}
A concrete struct or class is used on the client side.
An asynchronous function which issues a request to the server in order to create the search object (your #1 POST request):
func createSearch(completion: (SearchType?, ErrorType?) -> () )
Then another asynchronous function which fetches a "Search" object and potentially the result if it is complete:
func fetchSearch(searchID: String, completion: (SearchType?, ErrorType?) -> () )
Now, an asynchronous function which fetches the result for a certain "searchID" (your "search_cookie") - and internally implements the polling:
func fetchResult(searchID: String, completion: (AnyObject?, ErrorType?) -> () )
The implementation of fetchResult may now look as follows:
func fetchResult(searchID: String,
completion: (AnyObject?, ErrorType?) -> () ) {
func poll() {
fetchSearch(searchID) { (search, error) in
if let search = search {
if search.isCompleted {
completion(search.result!, nil)
} else {
delay(1.0, f: poll)
}
} else {
completion(nil, error)
}
}
}
poll()
}
This approach uses a local function poll for implementing the polling feature. poll calls fetchSearch and when it finishes it checks whether the search is complete. If not it delays for certain amount of duration and then calls poll again. This looks like a recursive call, but actually it isn't since poll already finished when it is called again. A local function seems appropriate for this kind of approach.
The function delay simply waits for the specified amount of seconds and then calls the provided closure. delay can be easily implemented in terms of dispatch_after or a with a cancelable dispatch timer (we need later implement cancellation).
I'm not showing how to implement createSearch and fetchSearch. These may be easily implemented using a third party network library or can be easily implemented based on NSURLSession.
Conclusion:
What might become a bit cumbersome, is to implement error handling and cancellation, and also dealing with all the completion handlers. In order to solve this problem in a concise and elegant manner I would suggest to utilise a helper library which implements "Promises" or "Futures" - or try to solve it with Rx.
For example a viable implementation utilising "Scala-like" futures:
func fetchResult(searchID: String) -> Future<AnyObject> {
let promise = Promise<AnyObject>()
func poll() {
fetchSearch(searchID).map { search in
if search.isCompleted {
promise.fulfill(search.result!)
} else {
delay(1.0, f: poll)
}
}
}
poll()
return promise.future!
}
You would start to obtain a result as shown below:
createSearch().flatMap { search in
fetchResult(search.searchID).map { result in
print(result)
}
}.onFailure { error in
print("Error: \(error)")
}
This above contains complete error handling. It does not yet contain cancellation. Your really need to implement a way to cancel the request, otherwise the polling may not be stopped.
A solution implementing cancellation utilising a "CancellationToken" may look as follows:
func fetchResult(searchID: String,
cancellationToken ct: CancellationToken) -> Future<AnyObject> {
let promise = Promise<AnyObject>()
func poll() {
fetchSearch(searchID, cancellationToken: ct).map { search in
if search.isCompleted {
promise.fulfill(search.result!)
} else {
delay(1.0, cancellationToken: ct) { ct in
if ct.isCancelled {
promise.reject(CancellationError.Cancelled)
} else {
poll()
}
}
}
}
}
poll()
return promise.future!
}
And it may be called:
let cr = CancellationRequest()
let ct = cr.token
createSearch(cancellationToken: ct).flatMap { search in
fetchResult(search.searchID, cancellationToken: ct).map { result in
// if we reach here, we got a result
print(result)
}
}.onFailure { error in
print("Error: \(error)")
}
Later you can cancel the request as shown below:
cr.cancel()