I was hoping to run this code to fetch multiple images from URLs but although the loop runs through every object in my array, the only object that is saved is the final one. Is there a better way to do this?
EDIT: It seems that all the images from the different URLs were saved, but they were all saved under the same filename, the final filename in the loop. This seems to be caused by a delay: by the time NSURLSession finishes and the image is written to file, the filename has already been set to the final one of the loop.
for object in objectArray {
url = NSURL(string:object.urlToLoad)
imageFile = paths.stringByAppendingPathComponent("\(object.filename).png")
let task = NSURLSession.sharedSession().dataTaskWithURL(url!) {(data, response, error) in
data.writeToFile(imageFile, atomically: true)
return
}
task.resume()
}
Thanks for any help!
The problem is that they are all capturing/sharing the same imageFile variable, as it is declared outside the scope of the for loop. So first you go over the loop n times, overwriting the imageFile variable with the next filename as you go and firing off an asynchronous download. Then, later, by the time each of the n closures actually executes on completion of its download, they are all referencing the same filename, the one matching the last value in the array.
Try sticking a let in front of it, thus declaring a fresh local variable with every iteration, that each closure in turn will capture. You should do the same with the url as well, even though it happens not to be a problem.
Generally, this is another good example of why you should use let rather than var at every opportunity, and avoid re-using variables in outer scopes except when you specifically want to communicate data to the outer scope.
Here's some standalone code that demonstrates this without the URL-downloading aspect, just 5 sleep-and-prints running in parallel:
import Dispatch
let q = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)
var shared: String = "shared=x"
for x: UInt32 in 1...5 {
shared = "shared=\(x)"
let unique = "unique=\(x)"
dispatch_async(q) {
sleep(x)
println("\(shared), \(unique)")
}
}
println("for loop completed, \(shared)")
dispatch_main()
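For readers on a current toolchain, the same effect can be reproduced with the modern Dispatch API. This is only a minimal sketch, not the code from the answer above: `captureDemo` and the queue label are made-up names, and the queue is created inactive so that every closure is guaranteed to run after the loop has finished (the role that network latency plays in the question). It assumes the Swift 5 language mode, where capturing a mutable local variable in a dispatched closure still compiles.

```swift
import Foundation

/// Returns the lines that three delayed closures would have printed.
func captureDemo() -> [String] {
    // The queue stays inactive until the loop is done, standing in for network latency.
    let queue = DispatchQueue(label: "capture.demo",
                              attributes: [.concurrent, .initiallyInactive])
    let group = DispatchGroup()
    let lock = NSLock()
    var lines: [String] = []
    var shared = "shared=x"            // declared outside the loop: one variable for all closures

    for x in 1...3 {
        shared = "shared=\(x)"         // overwritten on every pass
        let unique = "unique=\(x)"     // a fresh constant per iteration
        queue.async(group: group) {
            lock.lock()
            lines.append("\(shared), \(unique)")
            lock.unlock()
        }
    }

    queue.activate()                   // only now do the closures start running
    group.wait()
    return lines.sorted()
}
```

Each returned line carries its own `unique` value but the same final `shared` value, which is what happened to the filenames in the question.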
I am trying to do some calculations on a large number of objects. The objects are saved in an array and the results of the operation should be saved in a new array. To speed up the processing, I'm trying to break up the task into multiple subtasks which can run concurrently on different threads. The simplified example code below replaces the actual operation with two seconds of wait.
I have tried multiple ways of solving this issue, using both DispatchQueues and Tasks.
Using DispatchQueue
The basic setup I used is the following:
import Foundation
class Main {
let originalData = ["a", "b", "c"]
var calculatedData = Set<String>()
func doCalculation() {
//calculate length of array slices.
let totalLength = originalData.count
let sliceLength = Int(totalLength / 3)
var start = 0
var end = 0
let myQueue = DispatchQueue(label: "Calculator", attributes: .concurrent)
var allPartialResults = [Set<String>]()
for i in 0..<3 {
if i != 2 {
start = sliceLength * i
end = start + sliceLength - 1
} else {
start = totalLength - sliceLength * (i - 1)
end = totalLength - 1
}
allPartialResults.append(Set<String>())
myQueue.async {
allPartialResults[i] = self.doPartialCalculation(data: Array(self.originalData[start...end]))
}
}
myQueue.sync(flags: .barrier) {
for result in allPartialResults {
self.calculatedData.formUnion(result)
}
}
//do further calculations with the data
}
func doPartialCalculation(data: [String]) -> Set<String> {
print("began")
sleep(2)
let someResultSet: Set<String> = ["some result"]
print("ended")
return someResultSet
}
}
As expected, the Console Log is the following (with all three "ended" appearing at once, two seconds after all three "began" appeared at once):
began
began
began
ended
ended
ended
When measuring performance using os_signpost (and using real data and calculations), this approach reduces the time needed for the entire doCalculation() function to run from 40ms to around 14ms.
Note that to avoid data races when appending the results to the final calculatedData Set, I created an array of partial Data sets of which every DispatchQueue only accesses one index (which is not a solution I like and the main reason why I am not satisfied with this approach). What I would have liked to do is to call DispatchQueue.main from within myQueue and add the new data to the calculatedData Set on the main thread, however calling DispatchQueue.main.sync causes a deadlock and using the async version leads to the barrier flag not working as intended.
Using Tasks
In a second attempt, I tried using Tasks to run code concurrently. As I understand it, there are two options for running code concurrently with Tasks: async let and withTaskGroup. To retrieve a variable number of partial results from a variable number of concurrent tasks, I figured withTaskGroup was the better option for me.
I modified the code to look like this:
class Main {
let originalData = ["a", "b", "c"]
var calculatedData = Set<String>()
func doCalculation() async {
//calculate length of array slices.
let totalLength = originalData.count
let sliceLength = Int(totalLength / 3)
var start = 0
var end = 0
await withTaskGroup(of: Set<String>.self) { group in
for i in 0..<3 {
if i != 2 {
start = sliceLength * i
end = start + sliceLength - 1
} else {
start = totalLength - sliceLength * (i - 1)
end = totalLength - 1
}
group.addTask {
return await self.doPartialCalculation(data: Array(self.originalData[start...end]))
}
}
for await newSet in group {
calculatedData.formUnion(newSet)
}
}
//do further calculations with the data
}
func doPartialCalculation(data: [String]) async -> Set<String> {
print("began")
try? await Task.sleep(nanoseconds: UInt64(1e9))
let someResultSet: Set<String> = ["some result"]
print("ended")
return someResultSet
}
}
However, the Console Log prints the following (with every "ended" coming 2 seconds after the preceding "began"):
began
ended
began
ended
began
ended
Measuring performance using os_signpost revealed that the operation takes 40ms to complete. Therefore it is not running concurrently.
With that being said, what is the best course of action for this problem?
Using DispatchQueue, how do you call the Main Queue to avoid data races from within a queue, while at the same time preserving a barrier flag later on in the code?
Using Task, how can you actually make them run concurrently?
EDIT
Running the code on a real device instead of the simulator and changing the sleep function inside the Task from sleep() to Task.sleep(), I was able to achieve concurrent behavior in that the Console prints the expected log. However, the operation time for the task remains upwards of 40-50ms and is highly variable, sometimes reaching 200ms or more. This problem remains after adding the .userInitiated property to the Task.
Why does it take so much longer to run the same operation concurrently using Task compared to using DispatchQueue? Am I missing something?
A few observations:
One possible performance difference is that the simulator artificially constrains the “cooperative thread pool” used by async-await. See Maximum number of threads with async-await task groups. This is one cause of a lack of full concurrency (on the simulator).
In the async-await test, another factor that can affect concurrency is an actor. If an actor is enforcing serial execution, then consider declaring doPartialCalculation as nonisolated, so that it allows concurrent execution. Failure to do so can prevent any concurrent execution (with your sleep scenario, for example).
The fact that you saw a significant performance difference when you went from sleep to Task.sleep makes me wonder if you might have done this within an actor. Actors are “reentrant”, and Task.sleep suspends execution and lets the actor switch to another task. So it allows concurrency for a series of async methods.
But Task.sleep is not analogous to a computationally intensive task that ties up the thread. Declaring the function as nonisolated is what achieves concurrent execution for computationally intensive work, and that can yield performance nearly equivalent to what you achieved with a GCD implementation.
That having been said, you might still find that async-await is a tiny bit slower than pure GCD implementations. Then again, Swift concurrency offers more native protections and compile-time warnings to ensure thread-safety.
E.g., here are 100 compute-heavy tasks in both GCD and async-await, performed twice for each:
So, you simply have to ask yourself whether the benefits of async-await warrant the modest performance impact or not.
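To make the nonisolated suggestion concrete, here is a minimal sketch under some assumptions: the question's `Main` class is recast as an actor (named `Calculator` here, a made-up name), the "calculation" is just an uppercasing placeholder, and the slicing is simplified. It is not a benchmark, only an illustration of which parts run in parallel and which are serialized by the actor.

```swift
import Foundation

actor Calculator {
    private(set) var calculatedData = Set<String>()

    // nonisolated: the call does not hop onto the actor, so several child
    // tasks can execute it in parallel.
    nonisolated func doPartialCalculation(data: ArraySlice<String>) -> Set<String> {
        Set(data.map { $0.uppercased() })      // placeholder for the real work
    }

    func doCalculation(on originalData: [String], slices: Int = 3) async {
        precondition(slices > 0)
        // Round up so that the slices cover every element.
        let sliceLength = max(1, (originalData.count + slices - 1) / slices)

        let merged = await withTaskGroup(of: Set<String>.self,
                                         returning: Set<String>.self) { group in
            for start in stride(from: 0, to: originalData.count, by: sliceLength) {
                let end = min(start + sliceLength, originalData.count)
                let slice = originalData[start..<end]   // constant captured per iteration
                group.addTask { self.doPartialCalculation(data: slice) }
            }
            var merged = Set<String>()
            for await partial in group {
                merged.formUnion(partial)
            }
            return merged
        }

        calculatedData.formUnion(merged)       // actor-isolated, hence serialized
    }
}
```

If `doPartialCalculation` were left isolated to the actor, each child task would have to take its turn on the actor, and a blocking body would run one call at a time.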
A few unrelated asides on the GCD implementation:
It should be noted that your GCD example is not thread-safe and so the comparison of your two code snippets is not entirely fair. You should make the GCD implementation thread-safe. (Perhaps consider temporarily testing with TSAN. See “Detect Data Races Among Your App’s Threads” section of Diagnosing Memory, Thread, and Crash Issues Early.) You should perform doPartialCalculation in parallel, but you must synchronize the update of allPartialResults (or any shared resource). You can use a GCD serial queue for this. Or, since you seem to be so concerned about performance, perhaps an NSLock or os_unfair_lock (though care must be taken with the latter). See the GCD example at the end of this answer.
If your dispatched blocks are taking ~50 msec, that simply might not be enough work to justify the overhead of concurrency. You may even find that a simple, serial, rendition is faster!
Often, to maximize the amount of work done per thread, we would “stride” through our index (which is what you appear to be doing with your “slice” logic). But if, even after striding, the time per concurrent loop is still measured in milliseconds, then it may turn out that concurrency is unwarranted altogether. Some tasks are so trivial that they simply will not benefit from concurrent execution.
In your GCD example, you are dispatching to a concurrent queue, which if you have too many iterations, can lead to “thread explosion”, exhausting a very limited worker thread pool. You are only doing three iterations, so that’s not a problem now, but if the number of iterations grows, you would want to abandon that pattern, and adopt concurrentPerform (as seen here). It’s a great way to make full use of the hardware capabilities while avoiding the exhausting of the worker thread pool.
As an aside, I would be wary of using any of the sleep methods as a proxy for a time consuming task. You actually want to keep the CPU busy. I personally use an inefficient π calculation as my general proxy for “do something slow”. That is what I used above.
import os.signpost

// `poi` is assumed to be an OSLog defined elsewhere (e.g. one using the
// .pointsOfInterest category); it is not shown in this snippet.
func performHeavyTask(iteration: Int) {
    let id = OSSignpostID(log: poi)
    os_signpost(.begin, log: poi, name: #function, signpostID: id, "%d", iteration)
    let pi = calculatePi(iterations: 100_000_000)
    os_signpost(.end, log: poi, name: #function, signpostID: id, "%f", pi)
}

// calculate pi using Gregory-Leibniz series
func calculatePi(iterations: Int) -> Double {
    var result = 0.0
    var sign = 1.0
    for i in 0 ..< iterations {
        result += sign / Double(i * 2 + 1)
        sign *= -1
    }
    return result * 4
}
E.g. here is a GCD example which
uses concurrentPerform;
performs calculation in parallel but synchronizes array updates;
performs update of model on main thread;
uses Sequence<String> rather than [String] to eliminate expensive array creation:
func doCalculation() {
DispatchQueue.global().async { [originalData] in // gives me the willies to see asynchronous routine accessing property, so I might capture it here in case it ever changes to mutable property; or, better, it should be parameter of `doCalculation`
let totalLength = originalData.count
let iterations = 3 // avoid brittle pattern of repeating this number (of values based upon it) repeatedly
let sliceLength = totalLength / iterations
let queue = DispatchQueue(label: "Calculator") // serial queue for synchronization
var allResults = Set<String>()
DispatchQueue.concurrentPerform(iterations: iterations) { i in
let start = i * sliceLength
let end = (i == iterations - 1) ? totalLength : start + sliceLength // last slice absorbs any remainder of the integer division
let result = self.doPartialCalculation(with: originalData[start..<end]) // do calculation in parallel
queue.sync { allResults.formUnion(result) } // synchronize update
}
// personally, I would not update a property from this method,
// but rather would use local var and supply the results in a completion
// handler parameter, and let caller update model as it sees fit.
//
// But if you are going to do this, synchronize the update somehow,
// e.g., do it on the main thread.
DispatchQueue.main.async { // update on main thread
self.calculatedData = allResults // or `self.calculatedData.formUnion(allResults)`, if that's what you really mean
}
}
}
// note, rather than taking `[String]`, which requires us to create a new
// `Array` instance, let's change this to take `Sequence<String>` as
// input ... that way we can supply array slices directly
func doPartialCalculation<S>(with data: S) -> Set<String> where S: Sequence, S.Element == String {
print("began")
sleep(2)
let someResultSet: Set<String> = ["some result"]
print("ended")
return someResultSet
}
Or, alternatively, you could do the updates of the local var asynchronously and keep track of them with a DispatchGroup, performing the final update (or call to the completion handler) on the .main queue:
func doCalculation() {
DispatchQueue.global().async { [originalData] in // gives me the willies to see asynchronous routine accessing property, so I might capture it here in case it ever changes to mutable property; or, better, it should be parameter of `doCalculation`
let totalLength = originalData.count
let iterations = 3 // avoid brittle pattern of repeating this number (of values based upon it) repeatedly
let sliceLength = totalLength / iterations
let queue = DispatchQueue(label: "Calculator") // serial queue for synchronization
let group = DispatchGroup()
var allResults = Set<String>()
DispatchQueue.concurrentPerform(iterations: iterations) { i in
let start = i * sliceLength
let end = (i == iterations - 1) ? totalLength : start + sliceLength // last slice absorbs any remainder of the integer division
let result = self.doPartialCalculation(with: originalData[start..<end]) // do calculation in parallel
queue.async(group: group) { allResults.formUnion(result) } // synchronize update
}
// personally, I would not update a property from this method,
// but rather would use local var and supply the results in a completion
// handler parameter, and let caller update model as it sees fit.
//
// But if you are going to do this, synchronize the update somehow,
// e.g., do it on the main thread.
group.notify(queue: .main) {
self.calculatedData = allResults // or `self.calculatedData.formUnion(allResults)`, if that's what you really mean
}
}
}
You can benchmark this and see whether the asynchronous update has any material impact. It probably will not in this case, but the proof is in the pudding.
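For comparison, the NSLock option mentioned earlier could look roughly like the following. Treat it as a sketch under assumptions rather than a drop-in replacement: `parallelUnion` is a made-up helper, the per-slice work is an uppercasing placeholder, and the mutation of a captured local assumes the Swift 5 language mode.

```swift
import Foundation

/// Hypothetical helper: processes the slices in parallel and merges the
/// partial sets under a lock.
func parallelUnion(of data: [String], iterations: Int = 3) -> Set<String> {
    precondition(iterations > 0)
    let lock = NSLock()
    var allResults = Set<String>()
    // Round up so that the slices cover every element.
    let sliceLength = (data.count + iterations - 1) / iterations

    DispatchQueue.concurrentPerform(iterations: iterations) { i in
        let start = min(i * sliceLength, data.count)
        let end = min(start + sliceLength, data.count)
        // Parallel part: placeholder for the real calculation.
        let partial = Set(data[start..<end].map { $0.uppercased() })

        // Serialized part: keep the critical section as short as possible.
        lock.lock()
        allResults.formUnion(partial)
        lock.unlock()
    }
    return allResults
}
```

Because concurrentPerform does not return until every iteration has finished, the result can be returned directly. Call it from a background queue, as in the examples above, so the caller's thread is not blocked.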
Your Task-based example looks like it should execute concurrently. I ran it and am able to get concurrent execution.
Probably the issue you're having is that Swift concurrency tries to limit Task concurrency to the number of available cores. And (I don't think this is well documented!) Swift playgrounds and the iOS simulators seem to execute in a single-core environment.
So if you run your code in a Swift playground, you'll get serial task execution. If you make a Mac app and run it in that, or on an iOS device, you should get parallel execution.
This WWDC talk from last year has a discussion of why it works that way: https://developer.apple.com/videos/play/wwdc2021/10254/?time=652
That's worth paying attention to. You'll of course be fine scheduling 3 blocks on a concurrent queue, but if your example is standing in for a real workload that might have hundreds or thousands, it's easy to cause thread explosion and create new, harder to understand performance issues.
func forwardGeocoding(address: String) {
CLGeocoder().geocodeAddressString(address, completionHandler: { (placemarks, error) in
if error != nil {
print(error)
return
}
if placemarks?.count > 0 {
let placemark = placemarks?[0]
let location = placemark?.location
let coordinate = location?.coordinate
print("\nlat: \(coordinate!.latitude), long: \(coordinate!.longitude)")
if placemark?.areasOfInterest?.count > 0 {
let areaOfInterest = placemark!.areasOfInterest![0]
print(areaOfInterest)
} else {
print("No area of interest found.")
}
}
})
}
var INITIAL_DESTINATION = forwardGeocoding(initialDestination)
var DESIRED_DESTINATION = forwardGeocoding(desiredDestination)
var location = CLLocationCoordinate2DMake(<#T##CLLocationDegrees#>, <#T##CLLocationDegrees#>)
Hello, I am trying to make a mapping app and am having trouble with this part. I want to extract the latitude and longitude of INITIAL_DESTINATION separately, because I need them to create a coordinate with CLLocationCoordinate2DMake. I have been trying INITIAL_DESTINATION.latitude and INITIAL_DESTINATION.longitude, but I continually get the same error: "Value of tuple type '()' has no member 'latitude'". Strangely, it does not give that error for INITIAL_DESTINATION.longitude.
Any help or suggestions are greatly appreciated, and thank you for reading and taking the time to respond.
Your function returns nothing, and does nothing with the value returned in the asynchronous completion handler. You need to take the asynchronous result and use it in some fashion.
Try this: put print statements at the end of the function and inside the completion handler, then run the code. You will see that the function finishes before the completion handler runs, because the code inside the block does not run until the remote service returns an answer across the network. Only then does the geocoder hand the result to your code in the completion block.
You'll also need to be aware that there are multiple queues in iOS, and UI changes can only be made on the main queue. Not every completion block runs on the main queue, so check the documentation of the API you are calling; if the block runs elsewhere, use the dispatch_async function to call a function in your program and have it execute on the main queue.
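One way to "use the asynchronous result in some fashion" is to give your own function a completion handler too. The sketch below deliberately avoids CoreLocation so that it runs anywhere: `Coordinate`, `lookUpCoordinate` and `showCoordinate` are hypothetical names, the delay simulates the network round trip, and the returned numbers are made up.

```swift
import Foundation

struct Coordinate {
    let latitude: Double
    let longitude: Double
}

// Hypothetical stand-in for an asynchronous lookup such as geocodeAddressString:
// it returns immediately and reports its result later through `completion`.
func lookUpCoordinate(for address: String,
                      completion: @escaping (Coordinate?) -> Void) {
    DispatchQueue.global().asyncAfter(deadline: .now() + 0.1) {
        // Made-up result; a real implementation would forward the geocoder's answer.
        let result: Coordinate? = address.isEmpty
            ? nil
            : Coordinate(latitude: 37.33, longitude: -122.03)
        completion(result)
    }
}

// The coordinate only exists inside the handler, so the work that needs it
// (building a map annotation, say) has to start there.
func showCoordinate(for address: String) {
    lookUpCoordinate(for: address) { coordinate in
        guard let coordinate = coordinate else {
            print("No coordinate found for \(address)")
            return
        }
        print("lat: \(coordinate.latitude), long: \(coordinate.longitude)")
    }
}
```

Note that nothing is returned from `lookUpCoordinate` itself, which is why assigning its result to a variable, as in the question, yields `()`.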
https://github.com/mateo951/ISBN-Vista-Jera-rquica- Github Link
The structure I have is supposed to be appending values after an internet search. The internet search is called within a function and returns two strings and an image. When I try to append the returned values in the structure, the image is saved but strings are nil.
var datosLibros = [bookData]()
@IBAction func Search(sender: UITextField) {
let (title1, author1, cover1) = (internetSearch(sender.text!))
let libro = bookData(title: title1, author: author1,image:cover1)
datosLibros.append(libro)
print(datosLibros)
}
The saved structure that is printed to the console is the following:
bookData(title: "", author: "", image: <UIImage: 0x7f851a57fbf0>, {0, 0})
Structure:
struct bookData {
var title: String
var author: String
var image: UIImage
init(title: String, author: String, image: UIImage) {
self.title = title
self.author = author
self.image = image
}
}
Thanks in advance for any advice or help provided. I'm new to Swift, so there is a lot I have yet to uncover.
The problem is not with the code you posted but with internetSearch.
But before I explain what is going on there, just a quick note about Swift structs. Structs come with one free initializer that takes as its parameters one value for each stored property defined on the struct. Argument labels correspond to the variable labels.
So for your struct bookData (which really should be BookData since types should be capitalized), you do not need to include that initializer you wrote because it will be automatically provided for you as long as you do not create any additional BookData initializers.
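As a quick illustration of that free initializer, here is a minimal sketch. The `cover` property uses `Data` purely as a stand-in for `UIImage` so the snippet builds without UIKit, and the sample values are arbitrary.

```swift
import Foundation

struct BookData {
    var title: String
    var author: String
    var cover: Data    // stand-in for UIImage in this sketch
}

// No init was written; the compiler supplies the memberwise initializer,
// with argument labels taken from the property names.
let libro = BookData(title: "Don Quijote", author: "Miguel de Cervantes", cover: Data())
```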
Now for the reason your results are not what you expect. Your Strings are not coming back as nil. Instead, they are coming back as empty Strings, or "". In Swift, "" is very different from nil, which means a complete absence of a value. So your Strings are indeed there, they are just empty.
Okay, our Strings are coming back empty. How about our image? No, our image is not coming back either. You thought it was because you saw a UIImage reference printed in the console, but if you look closer you will notice it is a bogus image. Notice "{0, 0}" after the memory address for the instance. As far as I'm aware, this means the image has a size of 0 x 0. How many useful images do you know that have a size of 0 x 0?
So now we have discovered that our Strings are coming back empty and effectively so is our image. What is going on here?
Well, in your implementation of internetSearch I found on GitHub, this is the first thing you do:
var bookTitle = String()
var bookAuthor = String()
var bookCover = UIImage()
Naturally, you do this so that you have some variables of the correct types ready to plop in some actual results if you find them. Just for fun, let's see what the result of the code above would be if there were no results.
Well, the initializer for String that accepts no parameters results in an empty String being created.
Okay, how about our image. While the documentation for UIImage does not even mention an initializer that takes no parameters, it does inherit one from NSObject and it turns out that it will just create an empty image object.
So we now have discovered that what internetSearch is returning is actually the same as what it would be if there were no results. Assuming you are searching for something that you know exists, there must be a problem with the search logic, right? Not necessarily. I noticed that your implementation of the rest of internetSearch relies on an NSURLSession that you use like so:
var bookTitle = String()
var bookAuthor = String()
var bookCover = UIImage()
let session = NSURLSession.sharedSession()
let task = session.dataTaskWithURL(url) { (data, response, error) -> Void in
// Lots of code that eventually sets the three variables above to a found result
}
task.resume()
return (bookTitle, bookAuthor, bookCover)
That seems fine and dandy, except for the fact that NSURLSession performs its tasks asynchronously! Yes, in parts you even dispatch back to the main queue to perform some tasks, but the closure as a whole is asynchronous. This means that as soon as you call task.resume(), NSURLSession executes that task on its own thread/queue/network and as soon as that task is set up it returns way before it completes. So task.resume() returns almost immediately, before any of your search code in the task actually runs, and especially before it completes.
The runtime then goes to the next line and returns those three variables, just like you told it to. This, of course, is the problem because your internetSearch function is returning those initial empty variables before task has a chance to run asynchronously and set them to helpful values.
Suggesting a fully-functional solution is probably beyond the scope of this already-long answer, but it will require a big change in your implementation detail and you should search around for using data returned by NSURLSession.
One possible solution, without me posting any code, is to have your internetSearch function not return anything, but on completion of the task call a function that would then append the result to an array and print it out, like you show. Please research this concept.
Also, I recommend changing your implementation of internetSearch further by declaring your initial values not as:
var bookTitle = String()
var bookAuthor = String()
var bookCover = UIImage()
…but as:
var bookTitle: String?
var bookAuthor: String?
var bookCover: UIImage?
This way, if you find a result then you can represent it wrapped in an Optional, and if not you can represent that as nil, which will automatically be the default value of the variables in the code directly above.
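Putting the two suggestions together (no return value, optional fields), a rough sketch might look like this. It is not your actual search code: `SearchResult`, the `isbn` parameter, and the fake lookup are assumptions, `Data?` stands in for `UIImage?`, and the real NSURLSession request and parsing are replaced by a placeholder.

```swift
import Foundation

struct SearchResult {
    var title: String?
    var author: String?
    var cover: Data?       // stand-in for UIImage?
}

// Hypothetical reshaping of internetSearch: nothing is returned directly;
// the result is delivered to `completion` once the work has finished.
func internetSearch(isbn: String, completion: @escaping (SearchResult) -> Void) {
    DispatchQueue.global().async {
        var result = SearchResult()        // every field starts out as nil
        // Placeholder for the real network request and parsing.
        if isbn == "0000000000" {
            result.title = "Example Title"
            result.author = "Example Author"
        }
        completion(result)
    }
}
```

The caller would then append to datosLibros inside the completion closure, hopping to the main queue first if the UI is touched, instead of reading the values right after the call.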
I tried to make a serial queue for network operations with GCD like this:
let mySerialQueue = dispatch_queue_create("com.myApp.mySerialQueue", dispatch_queue_attr_make_with_qos_class(DISPATCH_QUEUE_SERIAL, QOS_CLASS_USER_INITIATED, 0))
func myFunc() {
dispatch_async(mySerialQueue) {
do {
// Get object from the database if it exists
let query = PFQuery(className: aClass)
query.whereKey(user, equalTo: currentUser)
let result = try? query.getFirstObject()
// Use existing object or create a new one
let object = result ?? PFObject(className: aClass)
object.setObject(currentUser, forKey: user)
try object.save()
} catch {
print(error)
}
}
}
The code first looks for an existing object in the database.
If it finds one, it updates it. If it doesn't find one, it creates a new one. This is using the Parse SDK and only synchronous network functions (.getFirstObject, .save).
For some reason it seems that this is not executed serially, because a new object is sometimes written into the database, although one existed already that should have been updated only.
Am I missing something about the GCD?
From the documentation on dispatch_queue_attr_make_with_qos_class:
relative_priority: A negative offset from the maximum supported scheduler priority for the given quality-of-service class. This value must be less than 0 and greater than MIN_QOS_CLASS_PRIORITY
Therefore you should be passing in a value less than 0 for this.
However, if you have no need for a priority, you can simply pass DISPATCH_QUEUE_SERIAL into the attr argument when you create your queue. For example:
let mySerialQueue = dispatch_queue_create("com.myApp.mySerialQueue", DISPATCH_QUEUE_SERIAL)
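On current Swift versions the same queue is created through the DispatchQueue initializer, which is serial unless `.concurrent` is requested. The following is a small sketch showing that blocks submitted to such a queue run one at a time and in order; `runInOrder` is a made-up helper, and the captured array assumes the Swift 5 language mode.

```swift
import Dispatch

/// Submits `count` blocks to a serial queue and reports the order in which they ran.
func runInOrder(count: Int) -> [Int] {
    // Serial by default; the QoS replaces dispatch_queue_attr_make_with_qos_class.
    let serialQueue = DispatchQueue(label: "com.myApp.mySerialQueue", qos: .userInitiated)
    var order: [Int] = []
    for i in 0..<count {
        serialQueue.async { order.append(i) }   // only ever touched on the serial queue
    }
    serialQueue.sync { }   // returns once every block queued above has finished
    return order
}
```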
I watched the WWDC 2015 sessions about Advanced NSOperations with great attention and played a little with the example code.
The provided abstractions are really great, but there is something I may not have fully understood.
I would like to pass result data between two consecutive Operation subclasses without using a MOC.
Imagine I have an APIQueryOperation which has an NSData? property and a second operation, ParseJSONOperation, consuming this property. How do I provide this NSData? instance to the second operation?
I tried something like this :
queryOperation = APIQueryOperation(request: registerAPICall)
parseOperation = ParseJSONOperation(data: queryOperation.responseData)
parseOperation.addDependency(queryOperation)
But when I enter the execute method of the ParseJSONOperation, the instance is not the same as the one passed to the initialiser.
What did I do wrong ?
Your issue is that you are constructing your ParseJSONOperation with a nil value. Since you have two operations that rely on this NSData object I would suggest you write a wrapper object to house this data.
To try and be aligned with the WWDC talk, let's call this object the APIResultContext:
class APIResultContext {
var data: NSData?
}
Now we can pass this object into both the APIQueryOperation and the ParseJSONOperation so that we have a valid object that can store the data transferred from the API.
This would make the constructors for the query:
let context = APIResultContext()
APIQueryOperation(request: registerAPICall, context: context)
ParseJSONOperation(context: context)
Inside your ParseJSONOperation you should be able to access the data, assuming the query sets the data before it completes.
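Here is a self-contained sketch of the shared-context idea. To keep it runnable it uses plain Foundation `Operation` subclasses overriding `main()` rather than the WWDC sample's `Operation` with `execute()`, `Data` rather than `NSData`, and made-up names (`QueryOperation`, `ParseOperation`, `Registration`) with a hard-coded response in place of a real API call.

```swift
import Foundation

final class APIResultContext {
    var data: Data?
}

struct Registration: Decodable {
    let registered: Bool
}

final class QueryOperation: Operation {
    private let context: APIResultContext
    init(context: APIResultContext) {
        self.context = context
        super.init()
    }
    override func main() {
        // Placeholder for the real API response.
        context.data = Data("{\"registered\": true}".utf8)
    }
}

final class ParseOperation: Operation {
    private let context: APIResultContext
    private(set) var result: Registration?
    init(context: APIResultContext) {
        self.context = context
        super.init()
    }
    override func main() {
        // Read the context when the operation runs, not when it is created.
        guard let data = context.data else { return }
        result = try? JSONDecoder().decode(Registration.self, from: data)
    }
}

let context = APIResultContext()
let query = QueryOperation(context: context)
let parse = ParseOperation(context: context)
parse.addDependency(query)         // parse starts only after query has finished

let queue = OperationQueue()
queue.addOperations([query, parse], waitUntilFinished: true)
```

The important difference from the attempt in the question is that the parse operation looks at the context only once it executes, by which point the dependency guarantees the query has stored its data.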
Thread Safety
As @CouchDeveloper pointed out, data is not, strictly speaking, thread safe. For this trivial example, since the two operations are dependent, we can safely write and read knowing that these accesses won't take place at the same time. However, to round out the solution and make the context thread safe, we can add a simple NSLock:
class APIResultContext {
    var data: NSData? {
        set {
            lock.lock()
            _data = newValue
            lock.unlock()
        }
        get {
            lock.lock()
            let result = _data
            lock.unlock()
            return result
        }
    }
    private var _data: NSData?
    private let lock = NSLock()
}