I want to download a large file, knowing the number of bytes transferred, and be able to cancel the download if necessary.
I know that this can be done having a URLSessionDownloadTask and conforming to the URLSessionDownloadDelegate, but I wanted to achieve it through an async/await mechanism, so I used URLSession.shared.bytes(from: url) and then a for-await-in loop to handle each byte.
The issue comes when trying to cancel the ongoing task, as even though the URLSession.AsyncBytes's Task has been cancelled, the for-await-in loop keeps processing bytes, so I'm assuming that the download is still ongoing.
I've tested it with this piece of code in a playground.
let url = URL(string: "https://example.com/large_file.zip")!
let (asyncBytes, _) = try await URLSession.shared.bytes(from: url)
DispatchQueue.main.asyncAfter(deadline: .now() + 1) {
asyncBytes.task.cancel()
}
var data = Data()
for try await byte in asyncBytes {
data.append(byte)
print(data.count)
}
I would have expected that, as soon as the task is cancelled, the download would have been stopped and, therefore, the for-await-in would stop processing bytes.
What am I missing here? Can these tasks not be effectively cancelled?
Canceling a URLSessionDataTask works fine with AsyncBytes. That having been said, even if the URLSessionDataTask is canceled, the AsyncBytes will continue to iterate through the bytes received prior to cancelation. But the data task does stop.
Consider experiment1:
@MainActor
class ViewModel: ObservableObject {
private let url: URL = …
private let session: URLSession = …
private var cancelButtonTapped = false
private var dataTask: URLSessionDataTask?
@Published var bytesBeforeCancel = 0
@Published var bytesAfterCancel = 0
func experiment1() async throws {
let (asyncBytes, _) = try await session.bytes(from: url)
dataTask = asyncBytes.task
var data = Data()
for try await byte in asyncBytes {
if cancelButtonTapped {
bytesAfterCancel += 1
} else {
bytesBeforeCancel += 1
}
data.append(byte)
}
}
func cancel() {
dataTask?.cancel()
cancelButtonTapped = true
}
}
So, I canceled after 1 second (at which point I had iterated through 2,022 bytes), and it continued to iterate through the remaining 14,204 bytes that had been received prior to the cancelation of the URLSessionDataTask. But the download does stop successfully. (In my example, the actual asset being downloaded was 74 MB.) When using URLSession, the data comes in packets, so it takes AsyncBytes a little time to get through everything that was actually received before the URLSession request was canceled.
You might consider canceling the Swift concurrency Task, rather than the URLSessionDataTask. (I really wish they did not use the same word, “task”, to refer to entirely different concepts!)
Consider experiment2:
@MainActor
class ViewModel: ObservableObject {
private let url: URL = …
private let session: URLSession = …
private var cancelButtonTapped = false
private var task: Task<Void, Error>?
@Published var bytesBeforeCancel = 0
@Published var bytesAfterCancel = 0
func experiment2() async throws {
task = Task { try await download() }
try await task?.value
}
func cancel() {
task?.cancel()
cancelButtonTapped = true
}
func download() async throws {
let (asyncBytes, _) = try await session.bytes(from: url)
var data = Data()
for try await byte in asyncBytes {
try Task.checkCancellation()
if cancelButtonTapped { // this whole `if` statement is no longer needed, but I've kept it here for comparison to the previous example
bytesAfterCancel += 1
} else {
bytesBeforeCancel += 1
}
data.append(byte)
}
}
}
Without the try Task.checkCancellation() line, the behavior is almost the same as in experiment1. The cancelation of the Task with the AsyncBytes will result in the cancelation of the underlying URLSessionDataTask (but the sequence will continue to iterate through the bytes in the packets that were successfully received prior to cancelation). But with try Task.checkCancellation(), it will exit as soon as the Task is canceled.
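The effect of try Task.checkCancellation() can be tried without a network connection. The sketch below is not part of the answer above: collect(_:) is a hypothetical helper that accepts any AsyncSequence of bytes (URLSession.AsyncBytes is one such sequence), so an AsyncStream can stand in for the download.

```swift
/// Hypothetical helper: gathers bytes from any async sequence of UInt8.
/// It throws CancellationError as soon as the surrounding Swift concurrency
/// Task is cancelled, even if already-buffered bytes are still waiting.
func collect<S: AsyncSequence>(_ bytes: S) async throws -> [UInt8] where S.Element == UInt8 {
    var buffer: [UInt8] = []
    for try await byte in bytes {
        try Task.checkCancellation()   // leave the loop promptly once cancelled
        buffer.append(byte)
    }
    return buffer
}
```

Called as try await collect(asyncBytes) from inside a Task, this should behave like download() in experiment2: cancelling that Task ends the loop at the next byte instead of draining what was already received.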
TL;DR Read Rob's answer, but the iterator code and the partial download code are still handy, so I'm leaving this answer up with corrections.
Okay, so I spent some time on this because I'm about to try to write my own cancellable URL stream object, and it appears that asyncBytes.task.cancel() is more along the lines of URLSession's finishTasksAndInvalidate() than invalidateAndCancel(). Since you are pointing your streaming task at a file that isn't really that large, the URLSessionDataTask had already gotten the bytes into the buffer.
You can see this when you change up the function a bit (see Rob's example as well):
func test_funcCondition(timeOut:TimeInterval, url:URL, session:URLSession) async throws {
let (asyncBytes, _) = try await session.bytes(from: url)
let deadLine = Date.now + timeOut
var data = Data()
func someConditionCheck(_ deadline:Date) -> Bool {
Date.now > deadline
}
for try await byte in asyncBytes {
if someConditionCheck(deadLine) {
asyncBytes.task.cancel()
print("trying to cancel...")
}
//Wrong type of task! Should not work. if Task.isCancelled { print ("cancelled") }
data.append(byte)
//just to reduce the amount of printing
if data.count % 100 == 0 {
print(data.count)
}
}
}
If you point the URL at "https://example.com/large_file.zip", as in your example, and make the time interval very short, the function will print "trying to cancel..." between the time your marker hits and the file completes. It does NOT, however, ever print "cancelled". (The task being cancelled is a URLSessionDataTask, not a Swift concurrency Task, so that line never would have worked.)
If you point either what you wrote or this function at a Server-Sent Events stream, it will cancel out just fine. (While true, it's not in contrast to the other behavior, which also works just fine. There are just bigger pauses in SSE data.)
If that isn't what you want, if you want to be able to start-stop streams mid-chunk, maybe explore a custom delegate (something I haven't done yet myself), or go work with AVFoundation if that's an option because they've thought a lot about working with large streaming files. I did not check making my own session and running session.invalidateAndCancel() on it instead, because that seems kind of extreme, but may be the way to go if you want to flush the buffer immediately.
The code below will stop caring about the buffer immediately. It involves making a custom iterator, but it seems kind of quirky and may not in fact arrest the downloading (so it could still cost users data and power). I haven't looked into how the stream protocol relates to the network protocol at that lower level: if you stop asking, does it stop receiving? I don't know. The cancel will arrest the stream while allowing through the bytes that are already in the buffer, but your code won't get them. On my to-do list now is to look into how to change buffering policies.
Rob's code seems a nice way to go, and it takes advantage of a concurrency Task.
func test_customIterator(timeOut:TimeInterval, url:URL, session:URLSession) async throws {
let (asyncBytes, _) = try await session.bytes(from: url)
let deadLine = Date.now + timeOut
var data = Data()
func someConditionCheck(_ deadline:Date) -> Bool {
Date.now > deadline
}
//could also be asyncBytes.lines.makeAsyncIterator(), etc.
var iterator = asyncBytes.makeAsyncIterator()
while !someConditionCheck(deadLine) {
//await Task.yield()
//next() returns nil when the stream ends, so stop instead of force-unwrapping
guard let byte = try await iterator.next() else { break }
data.append(byte)
print(data.count)
}
//make sure to still tell URLSession you aren't listening anymore.
//It may auto-close but that's not how I roll.
asyncBytes.task.cancel()
}
let tap_out:TimeInterval = 0.0005
try await test_customIterator(timeOut: tap_out, url: URL(string:"https://example.com/large_file.zip")!, session: URLSession.shared)
Interesting flavor of behavior. Thanks for pointing it out. Also, I didn't know that the task was already available (asyncBytes.task); thanks for that. (Correction: asyncBytes.task is a URLSessionDataTask, not a concurrency Task.)
UPDATED TO ADD:
To get part of the file explicitly
//https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests
func requestInChunks(data:inout Data, url:URL, session:URLSession, offset:Int, length:Int) async throws {
var urlRequest = URLRequest(url: url)
urlRequest.addValue("bytes=\(offset)-\(offset + length - 1)", forHTTPHeaderField: "Range")
let (asyncBytes, response) = try await
session.bytes(for: urlRequest, delegate: nil)
guard (response as? HTTPURLResponse)?.statusCode == 206 else { //NOT 200!!
throw APIngError("The server responded with an error.")
}
for try await byte in asyncBytes {
data.append(byte)
if data.count % 100 == 0 {
print(data.count)
}
}
}
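To call requestInChunks repeatedly over a whole file, the offsets and lengths have to be worked out first. The following is a minimal sketch of that arithmetic only; rangeHeaderValue and chunkRanges are hypothetical helper names, and the total size is assumed to be known in advance (for example from a Content-Length header).

```swift
/// Hypothetical helper: builds an HTTP Range header value for `length`
/// bytes starting at `offset`. Both ends of a byte range are inclusive.
func rangeHeaderValue(offset: Int, length: Int) -> String {
    precondition(offset >= 0 && length > 0, "offset must be >= 0 and length > 0")
    return "bytes=\(offset)-\(offset + length - 1)"
}

/// Hypothetical helper: splits `totalSize` bytes into consecutive
/// (offset, length) chunks; the last chunk may be shorter than `chunkSize`.
func chunkRanges(totalSize: Int, chunkSize: Int) -> [(offset: Int, length: Int)] {
    precondition(totalSize >= 0 && chunkSize > 0, "sizes must be non-negative, chunkSize > 0")
    return stride(from: 0, to: totalSize, by: chunkSize).map { offset in
        (offset: offset, length: min(chunkSize, totalSize - offset))
    }
}
```

Each (offset, length) pair can then be passed to requestInChunks in turn, which also gives a natural place to check for cancellation between requests.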
I still think that if my task at hand were about file downloading, session.download would be my go-to, but then there is file cleanup, etc., so I get why you might not go there.
I have an OperationQueue with multiple custom Operations which all append to the same array on completion (each operation downloads a file from user's iCloud and when it's done it appends the file to the array)
This, sometimes, causes the app to crash, because several operations try to edit the array at the same time.
How can I prevent this and only edit the array 1 operation at a time but running all operations simultaneously?
I must use OperationQueue because I need the operations to be cancelable.
func convertAssetsToMedias(assets: [PHAsset],
completion: @escaping (_ medias: [Media]) ->()) {
operationQueue = OperationQueue()
var medias: [Media] = []
operationQueue?.progress.totalUnitCount = Int64(assets.count)
for asset in assets {
// For each asset we start a new operation
let convertionOperation = ConvertPHAssetToMediaOperation(asset)
convertionOperation.qualityOfService = .userInteractive
convertionOperation.completionBlock = { [unowned convertionOperation] in
let media = convertionOperation.media
medias.append(media) // CRASH HERE (sometimes)
self.operationQueue?.progress.completedUnitCount += 1
if let progress = self.operationQueue?.progress.fractionCompleted {
self.delegate?.onICloudProgressUpdate(progress: progress)
}
convertionOperation.completionBlock = nil
}
operationQueue?.addOperation(convertionOperation)
}
operationQueue?.addBarrierBlock {
completion(medias)
}
}
Edit 1:
The Media file itself is nothing big, just a bunch of metadata and a URL to an actual file in the documents directory. There are usually about 24 medias max in one run. The memory is barely increasing during those operations. The crash never occurred due to a lack of memory.
The operation ConvertPHAssetToMediaOperation is a subclass of AsyncOperation where the isAsynchronous property is set to true.
That's how I construct the Media object in the end of each operation:
self.media = Media(type: mediaType, url: resultURL, creationDate: date)
self.finish()
Edit 2: The crash is always the same:
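For reference, one common way to let all operations run concurrently while only one of them touches the array at a time is to guard the array with a lock. This is only a sketch under assumptions, not code from the project above: SynchronizedArray is a hypothetical wrapper name, and NSLock is used because it is available wherever Foundation is.

```swift
import Foundation

/// Hypothetical wrapper: an array whose appends are serialized by a lock,
/// so many operations can finish concurrently without racing on the storage.
final class SynchronizedArray<Element> {
    private var storage: [Element] = []
    private let lock = NSLock()

    /// Appends one element; only one thread can be inside at a time.
    func append(_ element: Element) {
        lock.lock()
        defer { lock.unlock() }
        storage.append(element)
    }

    /// Returns a snapshot copy of the current contents.
    var elements: [Element] {
        lock.lock()
        defer { lock.unlock() }
        return storage
    }
}
```

With something like this, medias.append(media) in the completion block would become a call on the wrapper, and the barrier block would pass its elements to completion. Dispatching the append to a single serial queue would be an alternative with the same effect.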
I have a question about how to call CIDetector correctly. I'm trying to run face detection in real time, and this works very well. However, the memory consumption of the app increases linearly with time, as you can see in the image below. I think this is due to objects being created but not released. Can anyone advise how to do this correctly?
I have pinpointed the issue to this function: every time it's invoked, memory increases linearly, and when it terminates, memory quickly drops to almost 80 MB instead of rising toward 11 GB. I also checked for memory leaks; however, none were found.
My target development platform is macOS. I'm trying to extract the mouth position from the CIDetector and then use it to compute a delta in the mouse function for a game.
I also looked at this post and tried its approach, but it did not work for me:
CIDetector isn't releasing memory
fileprivate func faceDetection(){
// setting up dispatchQueue
dispatchQueue.async {
// checking if sample buffer is equal to nil if not assign its value to sample
if let sample = self.sampleBuffers {
// if allfeatures is not equal to nil. if yes assign allfeatures to features otherwise return
guard let features = self.allFeatures(sample: sample) else { return }
// loop to cycle through all features
for feature in features {
// checks if the feature is a CIFaceFeature if yes assign feature to face feature and go on.
if let faceFeature = feature as? CIFaceFeature {
if !self.hasEnded {
if self.calX.count > 30 {
self.sens.append((self.calX.max()! - self.calX.min()!))
self.sens.append((self.calY.max()! - self.calY.min()!))
print((self.calX.max()! - self.calX.min()!))
self.hasEnded = true
} else {
self.calX.append(faceFeature.mouthPosition.x)
self.calY.append(faceFeature.mouthPosition.y)
}
} else {
self.mouse(position: CGPoint(x: (faceFeature.mouthPosition.x - 300 ) * 2, y: (faceFeature.mouthPosition.y + 20 ) * 2), faceFeature: faceFeature)
}
}
}
}
if !self.faceTrackingEnds {
self.faceDetection()
}
}
}
This problem was caused by repeatedly calling the function without waiting for its completion. The fix was implementing a dispatch group and then calling the function again on its completion, like this. Now the CIDetector runs comfortably at 200 MB of memory.
fileprivate func faceDetection(){
let group = DispatchGroup()
group.enter()
// setting up dispatchQueue
dispatchQueue.async {
// checking if sample buffer is equal to nil if not assign its value to sample
if let sample = self.sampleBuffers {
// if allfeatures is not equal to nil. if yes assign allfeatures to features otherwise return
// leave the group before returning early, otherwise notify never fires
guard let features = self.allFeatures(sample: sample) else { group.leave(); return }
// loop to cycle through all features
for feature in features {
// checks if the feature is a CIFaceFeature if yes assign feature to face feature and go on.
if let faceFeature = feature as? CIFaceFeature {
self.mouse(position: faceFeature.mouthPosition, faceFeature: faceFeature)
}
}
}
group.leave()
}
group.notify(queue: .main) {
if !self.faceTrackingEnds {
self.faceDetection()
}
}
}
I'm using Core Location's CLGeocoder().geocodeAddressString() to get the coordinates and county information for a list of addresses. Everything works great as long as the number of requests is under 50, but with anything over 50 I hit the API's limit. Since CLGeocoder calls are asynchronous, I can't easily throttle or control the flow of the calls (geocoding one address at a time, for instance). How would I do this correctly in the "asynchronous world"? (DISCLAIMER: I'm new to the world of GCD and asynchronous flow control, so I think I might require a more detailed response.)
Here's the relevant code:
A method of the Property class that calls CLGeocoder on the Property's address:
func initializeCoordinates() {
let addressForCoords = self.address.getAddress()
CLGeocoder().geocodeAddressString(addressForCoords, completionHandler: { (placemarks, error) -> Void in
if error != nil {
print(error!)
return
}
if placemarks!.count > 0 {
let placemark = placemarks?[0]
let location = placemark?.location
self.coordinates = location?.coordinate
if let subAdminArea = placemark?.subAdministrativeArea {
self.address.county = subAdminArea
}
}
})
}
And then the section in ImportVC that imports all the properties' addresses from a text box (and calls the initializeCoordinates method on each Property):
for line in importText {
let newAddress = Address()
let newHouse = Property()
// parse the tab delimited address for each line of input
let address = line.components(separatedBy: "\t")
if address.count == 4 {
newAddress.street = address[0]
newAddress.city = address[1]
newAddress.state = trimState(state: address[2])
newAddress.zip = address[3]
newHouse.address = newAddress
newHouse.initializeCoordinates()
houses.append(newHouse)
}
}
I faced a similar problem recently. Replace your for-loop with a recursive function that calls itself at the end. The trick, however, is to have it call itself with a 0.2 second delay; I use 0.4 to be on the safe side. This will increase the waiting time for the user, but we have no choice due to the API limit.
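A minimal sketch of that recursive pattern, assuming nothing about the question's classes: processSequentially and its parameter names are hypothetical, and the work closure is expected to call next() once its own asynchronous call has finished.

```swift
import Foundation

/// Hypothetical helper: handles `items` one at a time. After each item's
/// work signals that it is done, it waits `delay` seconds and then recurses
/// into the next index. `completion` runs once every item has been handled.
func processSequentially<Item>(
    _ items: [Item],
    startingAt index: Int = 0,
    delay: TimeInterval = 0.4,
    queue: DispatchQueue = .main,
    work: @escaping (Item, @escaping () -> Void) -> Void,
    completion: @escaping () -> Void
) {
    guard index < items.count else {
        completion()
        return
    }
    work(items[index]) {
        // The recursive call happens only after the delay has elapsed.
        queue.asyncAfter(deadline: .now() + delay) {
            processSequentially(items, startingAt: index + 1, delay: delay,
                                queue: queue, work: work, completion: completion)
        }
    }
}
```

For the geocoding case, work would call geocodeAddressString for one Property and invoke next() inside the geocoder's completion handler, so at most one request is in flight at any time.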
I'm trying to update a struct from multi-level nested async callbacks. Each level's callback provides the info for the next batch of requests, until everything is done. It's like a tree structure, and each time I can only get one level deeper.
However, my first attempt with an inout parameter failed. I have now learned the reason, thanks to the great answers here:
Inout parameter in async callback does not work as expected
My problem remains unsolved. The only way I can think of is to store the value in a local file or persistent store and modify it directly each time. After writing the sample code, I think a global var could help me out with this as well. But I guess the best way is to have a struct instance for this job, and for each round of requests to store that round's info in one place, to avoid the mess created by different rounds working at the same time.
With sample code below, only the global var update works. And I believe the reason the other two fail is the same as the question I mentioned above.
func testThis() {
var d = Data()
d.getData()
}
let uriBase = "https://hacker-news.firebaseio.com/v0/"
let u: [String] = ["bane", "LiweiZ", "rdtsc", "ssivark", "sparkzilla", "Wogef"]
var successfulRequestCounter = 0
struct A {}
struct Data {
var dataOkRequestCounter = 0
var dataArray = [A]()
mutating func getData() {
for s in u {
let p = uriBase + "user/" + s + ".json"
getAnApiData(p)
}
}
mutating func getAnApiData(path: String) {
var req = NSURLRequest(URL: NSURL(string: path)!)
var config = NSURLSessionConfiguration.ephemeralSessionConfiguration()
var session = NSURLSession(configuration: config)
println("p: \(path)")
var task = session.dataTaskWithRequest(req) {
(data: NSData!, res: NSURLResponse!, err: NSError!) in
if let e = err {
// Handle error
} else if let d = data {
// Successfully got data. Based on this data, I need to further get more data by sending requests accordingly.
self.handleSuccessfulResponse()
}
}
task.resume()
}
mutating func handleSuccessfulResponse() {
println("successfulRequestCounter before: \(successfulRequestCounter)")
successfulRequestCounter++
println("successfulRequestCounter after: \(successfulRequestCounter)")
println("dataOkRequestCounter before: \(dataOkRequestCounter)")
dataOkRequestCounter++
println("dataOkRequestCounter after: \(dataOkRequestCounter)")
println("dataArray count before: \(dataArray.count)")
dataArray.append(A())
println("dataArray count after: \(dataArray.count)")
if successfulRequestCounter == 6 {
println("Proceeded")
getData()
}
}
}
func getAllApiData() {
for s in u {
let p = uriBase + "user/" + s + ".json"
getOneApiData(p)
}
}
Well, in my actual project, I successfully appended to a var in the struct in the first batch of callbacks, and it failed in the second one. But I failed to reproduce that in the sample code. I tried many times, which is why it took me so long to update my question with sample code. Anyway, I think the main issue is to learn the appropriate approach for this task, so I've put that aside for now.
I guess there is no way to do it with a closure, given how closures work. But I still want to ask and learn the best way.
Thanks.
What I did was use an inout NSMutableDictionary.
func myAsyncFunc(inout result: NSMutableDictionary){
let priority = DISPATCH_QUEUE_PRIORITY_DEFAULT
dispatch_async(dispatch_get_global_queue(priority, 0)) {
let intValue = result.valueForKey("intValue")
if intValue as! Int > 0 {
//Do Work
}
}
dispatch_async(dispatch_get_main_queue()) {
result.setValue(0, forKey: "intValue")
}
}
I know you already tried using inout, but NSMutableDictionary worked for me when no other object did.
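A likely reason this works is that NSMutableDictionary is a class: the closure captures a reference to the same object, so later mutations are visible to the caller, whereas a struct is copied. Under that reading the inout is not what makes the difference. The sketch below shows the same idea in current Swift; RequestTally and simulateCallbacks are hypothetical names, and a serial queue is assumed so that the callbacks never mutate the object at the same time.

```swift
import Foundation

/// Hypothetical reference type standing in for the shared result.
final class RequestTally {
    var successes = 0
}

/// Simulates `requests` asynchronous callbacks that all update the same
/// object. `queue` is assumed to be a serial queue: its blocks run one at a
/// time and in order, so the completion block runs after every update.
func simulateCallbacks(requests: Int,
                       tally: RequestTally,
                       on queue: DispatchQueue,
                       completion: @escaping () -> Void) {
    for _ in 0..<requests {
        queue.async {
            tally.successes += 1   // mutates the shared instance, not a copy
        }
    }
    queue.async { completion() }
}
```

Because the closures hold a reference to one RequestTally instance, the count survives across batches of callbacks, which a struct captured by value would not do.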