In Swift, we can leverage DispatchQueue to prevent race conditions. With a serial queue, everything is performed in order. From https://developer.apple.com/library/content/documentation/General/Conceptual/ConcurrencyProgrammingGuide/OperationQueues/OperationQueues.html:
Serial queues (also known as private dispatch queues) execute one task
at a time in the order in which they are added to the queue. The
currently executing task runs on a distinct thread (which can vary
from task to task) that is managed by the dispatch queue. Serial
queues are often used to synchronize access to a specific resource.
But we can easily create a deadlock (see "How do I create a deadlock in Grand Central Dispatch?") by performing a sync inside an async on the same queue:
let serialQueue = DispatchQueue(label: "Cache.Storage.SerialQueue")
serialQueue.async {
    serialQueue.sync {
        print("perform some job")
    }
    print("this can't be reached")
}
The only way I see to prevent this deadlock is to use 2 serial queues, one each for the sync and async function versions. But that can cause a race condition when writeSync and writeAsync happen at the same time.
I see that the fs module supports both sync and async functions, like fs.writeFileSync(file, data[, options]) and fs.writeFile(file, data[, options], callback). By offering both versions, users can call them in any order they want. So couldn't they easily create a deadlock like the one above?
So maybe fs has a clever approach that we can apply to Swift? How do we support both sync and async in a thread-safe manner?
serialQueue.async {
    serialQueue.sync {
        print("perform some job")
    }
}
This deadlocks because the code queues a second task on the same dispatch queue and then waits for that second task to finish. The second task can't even start, however, because it is a serial queue and the first task is still executing (albeit blocked on an internal semaphore).
The way to avoid this kind of deadlock is to never do that. It's especially stupid when you consider that you can achieve the same effect with the following:
serialQueue.async {
    print("perform some job")
}
There are some use cases for running synchronous tasks on a different queue from the one you are in, e.g.
if the other queue is the main queue and you want to do some stuff in the UI before carrying on
as a means of synchronisation between tasks in different queues, for example if you want to make sure that all the current tasks in another queue have finished before carrying on (see the sketch after this list).
However, there is never a reason to synchronously do something on the same queue; you might as well just do the thing directly. Or to put it another way, if you just write statements one after the other, they are already executing synchronously on the same queue.
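For the second use case above, a common idiom is to synchronously dispatch an empty block; a minimal sketch, assuming otherQueue is a serial queue whose pending work you want to wait for:

// Blocks the caller until everything previously submitted
// to otherQueue (an illustrative serial queue) has finished.
otherQueue.sync { }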
I see that the fs module supports both sync and async functions, like fs.writeFileSync(file, data[, options]) and fs.writeFile(file, data[, options], callback). By offering both versions, users can call them in any order they want. So couldn't they easily create a deadlock like the one above?
That depends on how the two APIs are implemented. The synchronous version of the call might just do the work directly without involving other threads. If it does grab another thread and then wait around until that other thread is finished, then yes, there is a potential for deadlock if the node.js server runs out of threads.
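As a rough sketch of how both variants might be offered safely in Swift, each can funnel into the same private serial queue; the Storage, writeSync, and writeAsync names are illustrative, and writeSync will still deadlock if it is called from code already running on that queue:

import Foundation

final class Storage {
    private let queue = DispatchQueue(label: "Cache.Storage.SerialQueue")
    private var cache = [String: Data]()

    // Blocks the caller until the write has completed.
    func writeSync(_ data: Data, forKey key: String) {
        queue.sync { self.cache[key] = data }
    }

    // Returns immediately; the write happens later on the queue.
    func writeAsync(_ data: Data, forKey key: String) {
        queue.async { self.cache[key] = data }
    }
}

Because both versions serialize on the same queue, writeSync and writeAsync cannot race with each other.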
Related
I need to defer the execution of a task until I complete a high priority task, such as re-authenticating, then execute the original task from there. I'm trying to use Swift Concurrency's Task object for this:
Task {
    await service.fetch(...)
}
I see that I can cancel the task, but I want to stop it and start it later instead. I was thinking of storing it in a queue and flushing the queue after the high-priority task finishes. Can this be done with Swift Concurrency? I'm hoping I don't have to wrap an Operation object with async/await or something similar.
From the WWDC 2021 video "Swift concurrency: Behind the scenes": a task schedules code to execute in the future, but doesn't give you a lot of control over when that task will execute. In particular, even though you can provide a priority for a task, there are situations where you may still have priority inversion in the order tasks execute.
Your best bet for the type of control you seem to want is using Grand Central Dispatch and have both high priority and low priority queues feeding into a serial dispatch queue and then you can put your task in the appropriate queue.
EDIT: I just saw a construct that's new to me (been a while since I was working with GCD daily). It's called a Workloop (https://developer.apple.com/documentation/dispatch/workloop) and sounds like it might be just the ticket without manually tying dispatch queues together.
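As a rough sketch of the queue-funneling idea (all names here are illustrative), you can target two queues at a single serial queue. Note that a plain serial target executes items in submission order; reordering by priority is the part a workloop adds:

import Dispatch

// A serial "funnel": both queues below target it, so at most
// one task runs at a time across the two priority levels.
let funnel = DispatchQueue(label: "com.example.funnel")
let high = DispatchQueue(label: "com.example.high", qos: .userInitiated, target: funnel)
let low = DispatchQueue(label: "com.example.low", qos: .utility, target: funnel)

low.async { /* the original, deferrable task */ }
high.async { /* e.g. the re-authentication work */ }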
After reading about Concurrent and Serial queues, sync and async, I think I have an idea about how to create queues and the order they are executed in. My problem is that in any of the tutorials I have seen, none of them actually tell you many use cases. For example:
I have a network manager that uses URLSession and serializes JSON to send requests to my API. Does it make sense to wrap it in a .utility or .userInitiated queue, or should I not wrap it in a queue at all?
let task = LoginTask(username: username, password: password)
let networkQueue = DispatchQueue(label: "com.messenger.network",
                                 qos: DispatchQoS.userInitiated)
networkQueue.async {
    task.dataTask(in: dispatcher) { (user, httpCode, error) in
        self.presenter?.loginUserResponse(user: user, httpCode: httpCode, error: error)
    }
}
My question is: are there any guidelines I can follow to know when there is a need to use queues? I can't find this information anywhere. I realise Apple provides example usage, but it is very vague.
Dispatch queues are used in a multitude of use cases, so it's hard to enumerate them, but two very common use cases are as follows:
You have some expensive and/or time-consuming process that you want to run on some thread other than the current thread. Often this is used when you're on the main thread and you want to run something on a background thread.
A good example of this would be image manipulation, which is a notoriously computationally (and memory) intensive process. So, you'd create a queue for image manipulation and then you'd dispatch each image manipulation task to that queue. You might also dispatch the UI update when it's done back to the main queue (because all UI updates must happen on the main thread). A common pattern would be:
imageQueue.async {
    // manipulate the image here

    // when done, update the UI:
    DispatchQueue.main.async {
        // update the UI and/or model objects on the main thread
    }
}
You have some shared resource (it could be a simple variable, or some interaction with another shared resource like a file or database) that you want to synchronize regardless of which thread accesses it. This is often part of a broader strategy of making something that is not inherently thread-safe behave in a thread-safe manner.
The virtue of dispatch queues is that it greatly simplifies writing multi-threaded code, an otherwise very complicated technology.
The thing is that your example, initiating a network request, already runs the request on a background thread, and URLSession manages all of this for you, so there's little value in using queues for that.
In the interest of full disclosure, there is a surprising variety of different tools using GCD directly (e.g. dispatch groups or dispatch sources) or indirectly (e.g. operation queues), above and beyond the basic dispatch queues discussed above:
Dispatch groups: Sometimes you will initiate a series of asynchronous tasks and you want to be notified when they're all done. You can use a dispatch group (see https://stackoverflow.com/a/28101212/1271826 for a random example). This saves you from needing to keep track of when all of these tasks are done yourself.
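A minimal sketch, assuming urls is an array of URLs you want to fetch:

import Foundation

let urls: [URL] = [] // illustrative: whatever requests you need to issue
let group = DispatchGroup()

for url in urls {
    group.enter()
    URLSession.shared.dataTask(with: url) { _, _, _ in
        // handle the response here...
        group.leave() // balance every enter() exactly once
    }.resume()
}

group.notify(queue: .main) {
    // all requests have completed; safe to update the UI here
}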
Dispatch "apply" (now called concurrentPerform): Sometimes when you're running some massively parallel task, you want to use as many threads as you reasonably can. So concurrentPerform lets you effectively perform a for loop in parallel, and Apple has optimized it for the number of cores and CPUs your particular device, while not flooding it with too many concurrent tasks at any one time, exhausting the limited number of worker threads. See the https://stackoverflow.com/a/39949292/1271826 for an example of running a for loop in parallel.
Dispatch sources:
For example, if you have some background task that is doing a lot of work and you want to update the UI with the progress, sometimes those UI updates can come more quickly than the UI can handle them. So you can use a dispatch source (a DispatchSourceUserDataAdd) to decouple the background process from the UI updates. See the aforementioned https://stackoverflow.com/a/39949292/1271826 for an example.
Traditionally, a Timer runs on the main run loop. Sometimes, though, you want to run it on a background thread, and doing that with a Timer is complicated. You can instead use a DispatchSourceTimer (a GCD timer) to run a timer on a queue other than the main queue. See https://stackoverflow.com/a/38164203/1271826 for an example of how to create and use a dispatch timer. Dispatch timers can also be used to avoid some of the strong reference cycles that are easily introduced with target-based Timer objects.
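A minimal sketch of such a GCD timer (the label is illustrative; keep a strong reference to the timer, or it will be cancelled when it is deallocated):

import Dispatch

let timerQueue = DispatchQueue(label: "com.example.timer")
let timer = DispatchSource.makeTimerSource(queue: timerQueue)

timer.schedule(deadline: .now(), repeating: 1.0)
timer.setEventHandler {
    // fires every second on timerQueue, not on the main run loop
}
timer.resume()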
Barriers: Sometimes when using a concurrent queue, you want most things to run concurrently, but for other things to run serially with respect to everything else on the queue. A barrier is a way to say "add this task to the queue, but make sure it doesn't run concurrently with respect to anything else on that queue."
An example of a barrier is the reader-writer pattern, where reading from some memory resource can happen concurrently with respect to all other reads, but any writes must not happen concurrently with respect to anything else on the queue. See https://stackoverflow.com/a/28784770/1271826 or https://stackoverflow.com/a/45628393/1271826.
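A minimal sketch of that reader-writer pattern, using a hypothetical SynchronizedDictionary wrapper:

import Dispatch

final class SynchronizedDictionary<Key: Hashable, Value> {
    private var storage = [Key: Value]()
    private let queue = DispatchQueue(label: "com.example.sync-dict",
                                      attributes: .concurrent)

    // reads run concurrently with respect to other reads
    func value(forKey key: Key) -> Value? {
        queue.sync { storage[key] }
    }

    // the barrier makes each write exclusive with respect to
    // everything else on the queue
    func setValue(_ value: Value, forKey key: Key) {
        queue.async(flags: .barrier) {
            self.storage[key] = value
        }
    }
}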
Dispatch semaphores: Sometimes you need to let two tasks running on separate threads communicate to each other. You can use semaphores for one thread to "wait" for the "signal" from another.
One common application of semaphores is to make an inherently asynchronous task behave in a more synchronous manner.
networkQueue.async {
    let semaphore = DispatchSemaphore(value: 0)
    let task = session.dataTask(with: url) { data, _, error in
        // process the response

        // when done, signal that we're done
        semaphore.signal()
    }
    task.resume()
    _ = semaphore.wait(timeout: .distantFuture)
}
The virtue of this approach is that the dispatched task won't finish until the asynchronous network request is done. So if you needed to issue a series of network requests, but not have them run concurrently, semaphores can accomplish that.
Semaphores should be used sparingly, though, because they're inherently inefficient (generally blocking one thread waiting for another). Also, make sure you never wait for a semaphore from the main thread (because you're defeating the purpose of having the asynchronous task). That's why in the above example, I'm waiting on the networkQueue, not the main queue. All of this having been said, there are often better techniques than semaphores, but they are sometimes useful.
Operation queues: Operation queues are built on top of GCD dispatch queues, but offer some interesting advantages including:
The ability to wrap an inherently asynchronous task in a custom Operation subclass. (This avoids the disadvantages of the semaphore technique I discussed earlier.) Dispatch queues are generally used when running inherently synchronous tasks on a background thread, but sometimes you want to manage a bunch of tasks that are, themselves, asynchronous. A common example is wrapping asynchronous network requests in an Operation subclass.
The ability to easily control the degree of concurrency. Dispatch queues can be either serial or concurrent, but it's cumbersome to design a control mechanism to say, for example, "run the queued tasks concurrently with respect to each other, but no more than four at any given time." Operation queues make this much easier with the use of maxConcurrentOperationCount. (See https://stackoverflow.com/a/27022598/1271826 for an example.)
The ability to establish dependencies between various tasks (e.g. you might have a queue for downloading images and another queue for manipulating the images). With operation queues you can have one operation for the downloading of an image and another for the processing of the image, and you can make the latter dependent upon the completion of the former.
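A minimal sketch of such a dependency (the operation bodies are placeholders):

import Foundation

let queue = OperationQueue()
queue.maxConcurrentOperationCount = 4

let download = BlockOperation {
    // fetch the image data
}
let process = BlockOperation {
    // manipulate the downloaded image
}

process.addDependency(download) // process starts only after download finishes
queue.addOperations([download, process], waitUntilFinished: false)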
There are lots of other GCD related applications and technologies, but these are a few that I use with some frequency.
Suppose multiple async tasks running on a serial queue access the same shared resource. Is there any chance we might face a race condition?
Following the comment I've added, this is taken from the Apple documentation. The last sentence is the part that answers what you are looking for.
Serial queues (also known as private dispatch queues) execute one task
at a time in the order in which they are added to the queue. The
currently executing task runs on a distinct thread (which can vary
from task to task) that is managed by the dispatch queue. Serial
queues are often used to synchronize access to a specific resource.
If you are using a concurrent queue instead you could have a race condition. You can prevent it using dispatch barriers, for example. See Grand Central Dispatch In-Depth: Part 1/2 for more details.
The same applies to NSOperation and NSOperationQueue. An NSOperationQueue can be made serial by setting maxConcurrentOperationCount to 1. In addition, using dependencies between operations, you can synchronize access to a shared resource.
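In Swift terms, a minimal sketch of both techniques:

import Foundation

let queue = OperationQueue()
queue.maxConcurrentOperationCount = 1 // the queue now behaves serially

let first = BlockOperation { /* touch the shared resource */ }
let second = BlockOperation { /* runs only after first finishes */ }
second.addDependency(first)

queue.addOperations([first, second], waitUntilFinished: false)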
No, you cannot run into a race condition (see what I did there) when running async tasks on a serial queue. The queue type determines how tasks are executed relative to each other, while sync versus async only determines whether the caller waits, i.e. the responsiveness of your application while completing an expensive task.
It is easy to run into a race condition on a concurrent queue because tasks there are allowed to execute at the same time, so different threads may be "racing" to perform an action and end up overwriting each other's work. On a serial queue, tasks are executed one at a time, so two threads can't race to complete a task; everything happens in sequential order. Hope that helps!
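To illustrate, a small sketch: dispatching increments to a serial queue yields an exact count, whereas the same increments submitted to a concurrent queue could interleave and lose updates:

import Dispatch

var counter = 0
let serial = DispatchQueue(label: "com.example.counter")
let group = DispatchGroup()

for _ in 0..<10_000 {
    serial.async(group: group) {
        counter += 1 // serialized: no two increments overlap
    }
}

group.wait()
print(counter) // reliably 10_000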
I am a rookie in Swift, and I have some misunderstandings.
What is the difference between these ways of creating a dispatch queue?
sample 1
let backgroundQueue = DispatchQueue(label: "com.app.queue",
                                    qos: .background,
                                    target: nil)
backgroundQueue.async {
    print("Dispatched to background queue")
}
sample 2
let backgroundQueue = DispatchQueue.global()
backgroundQueue.async {
    print("Dispatched to background queue")
}
As far as I understand, these two approaches do the same thing.
Or, for example, this approach:
DispatchQueue.global(qos: .userInitiated).async {
    print("user initiated task")
}
what does it mean?
The queue you create in your first example is your own custom serial queue. As the somewhat dated, yet still relevant, Concurrency Programming Guide says:
Serial queues (also known as private dispatch queues) execute one task at a time in the order in which they are added to the queue. The currently executing task runs on a distinct thread (which can vary from task to task) that is managed by the dispatch queue. Serial queues are often used to synchronize access to a specific resource.
You can create as many serial queues as you need, and each queue operates concurrently with respect to all other queues. In other words, if you create four serial queues, each queue executes only one task at a time but up to four tasks could still execute concurrently, one from each queue.
Whereas your latter examples are simply retrieving system-provided global queues, which are concurrent:
Concurrent queues (also known as a type of global dispatch queue) execute one or more tasks concurrently, but tasks are still started in the order in which they were added to the queue. The currently executing tasks run on distinct threads that are managed by the dispatch queue. The exact number of tasks executing at any given point is variable and depends on system conditions.
Now, you can nowadays create your own custom concurrent queue, but a global queue is simply a concurrent queue that was created for us.
So, what does this mean to you?
If you dispatch blocks to a serial queue (your first example), only one block can run at any given time. This makes it really useful for synchronizing memory access in multi-threaded apps, but it can be used in any situation where you need a background queue and want dispatched blocks to run serially (i.e. sequentially) with respect to other blocks on that queue.
The global queues that you are using in your latter examples are concurrent queues. This means that if you dispatch four separate tasks to a global queue, those blocks may run concurrently with respect to each other. This is ideal where you want not only background execution, but also don't care if these blocks run at the same time as other dispatched blocks.
In your latter examples, where you're accessing a global queue, recognize that because those are system-provided, you have some modest limitations on your interaction with these queues, namely:
You cannot suspend global queues;
You cannot use barriers on global queues;
But, with that notwithstanding, if you are just looking for a simple way of dispatching blocks to run in the background (and you don't care whether those dispatched blocks run at the same time as each other), then global queues are an incredibly simple and efficient way to do that.
By the way, the difference between your second example (for which I assume you intended let backgroundQueue = DispatchQueue.global()) and the third example, is merely that in your third example, you assigned the explicit quality of service (qos), whereas in your second example, you're using the default qos. FWIW, it's generally advisable to specify a qos, so that the OS can prioritize threads contending for limited system resources accordingly.
There is not much difference between the three.
Sample 1 creates a queue that only your app has access to. You also set its label to com.app.queue, which can make debugging easier.
Sample 2 is not valid Swift code. I believe what you meant was:
let backgroundQueue = DispatchQueue.global(qos: .default) // or any other QoS
backgroundQueue.async {
    print(backgroundQueue.label) // the label property tells you what queue it is
}
This is the same as sample 1, except that you are using a system queue which is shared with other applications.
Sample 3 is simply a convenience method (save for the different QoS you specified):
DispatchQueue.global(qos: .userInitiated).async {
    // you don't have any reference to the queue your code
    // is running on, but most of the time it's not needed
}
The majority of the code I've written uses something similar to sample 3. Sample 2 is the least common. Sample 1 is used mostly when you need to synchronize actions across multiple queues.
I'm writing an app which requires running a method after another method completes. (Common scenario, right?)
I'm trying to implement chained methods. The best I've come up with is to call performSelector:withObject:afterDelay:. I'm simply not sure if that is the best way to do this. I've looked into how the Cocos2d game engine implements its CCSequence class, but I'm not sure I understand it.
I suspect blocks would do well here, except I'm not sure how to use them as callback objects or whatever.
How would I implement a mechanism for running methods, one after the other? (I'm open to using timers or blocks, but I don't know how I'd use blocks in this case.)
Edit:
To clarify, I'm trying to implement a system like cocos2d's CCSequence class, which takes a few methods and "dispatches" them in sequence. Things like animations, which take much more than a single clock cycle to run.
I'm not looking to block the main thread, nor do I want to hard code methods to each other. Cocos2d has a sequencing system where I can pass in methods to a queue and run them sequentially.
Edit 2:
Also, I'd like to be able to cancel my scheduled queues, and so I'm not sure GCD is a good match for this. Can GCD serial queues be canceled?
You can use the technique of thread migration. This is where the interesting technology called GCD (Grand Central Dispatch) comes in:
Grand Central Dispatch (GCD) is a technology developed by Apple Inc.
to optimize application support for systems with multi-core processors
and other symmetric multiprocessing systems. It is an implementation of
task parallelism based on the thread pool pattern.
GCD works by allowing specific tasks in a program that can be run in
parallel to be queued up for execution and, depending on availability
of processing resources, scheduling them to execute on any of the
available processor cores.
Dispatch Queues are objects that maintain a queue of tasks, either anonymous code blocks or functions, and execute these tasks in their
turn. The library automatically creates several queues with different
priority levels that execute several tasks concurrently, selecting the
optimal number of tasks to run based on the operating environment. A
client to the library may also create any number of serial queues,
which execute tasks in the order they are submitted, one at a time.
Because a serial queue can only run one task at a time, each task
submitted to the queue is critical with regard to the other tasks on
the queue, and thus a serial queue can be used instead of a lock on a
contended resource.
Dispatch queues execute their tasks concurrently with respect to other
dispatch queues. The serialization of tasks is limited to the tasks in
a single dispatch queue.
In your case you can use Serial Dispatch Queues
Serial queues are useful when you want your tasks to execute in a
specific order. A serial queue executes only one task at a time and
always pulls tasks from the head of the queue. You might use a serial
queue instead of a lock to protect a shared resource or mutable data
structure. Unlike a lock, a serial queue ensures that tasks are
executed in a predictable order. And as long as you submit your tasks
to a serial queue asynchronously, the queue can never deadlock.
Unlike concurrent queues, which are created for you, you must
explicitly create and manage any serial queues you want to use. You
can create any number of serial queues for your application but should
avoid creating large numbers of serial queues solely as a means to
execute as many tasks simultaneously as you can. If you want to
execute large numbers of tasks concurrently, submit them to one of the
global concurrent queues. When creating serial queues, try to identify
a purpose for each queue, such as protecting a resource or
synchronizing some key behavior of your application.
dispatch_queue_t queue;
queue = dispatch_queue_create("com.example.MyQueue", NULL);
This code shows the steps required to create a custom serial queue.
The dispatch_queue_create function takes two parameters: the queue
name and a set of queue attributes. The debugger and performance tools
display the queue name to help you track how your tasks are being
executed. The queue attributes are reserved for future use and should
be NULL.
Grand Central Dispatch provides functions to let you access several
common dispatch queues from your application:
Use the dispatch_get_current_queue function for debugging purposes
or to test the identity of the current queue. Calling this function
from inside a block object returns the queue to which the block was
submitted (and on which it is now presumably running). Calling this
function from outside of a block returns the default concurrent queue
for your application.
Use the dispatch_get_main_queue function to get the serial
dispatch queue associated with your application’s main thread. This
queue is created automatically for Cocoa applications and for
applications that either call the dispatch_main function or configure
a run loop (using either the CFRunLoopRef type or an NSRunLoop object)
on the main thread.
Use the dispatch_get_global_queue function to get any of the
shared concurrent queues.
Note: You do not need to retain or release any of the global dispatch
queues, including the concurrent dispatch queues or the main dispatch
queue. Any attempts to retain or release the queues are ignored.
Source: Concurrency Programming Guide
What about using a serial GCD queue?
private dispatch queues
Serial queues (also known as private dispatch queues) execute one task at a time in the order in which they are added to the queue. The currently executing task runs on a distinct thread (which can vary from task to task) that is managed by the dispatch queue. Serial queues are often used to synchronize access to a specific resource.
You can create as many serial queues as you need, and each queue operates concurrently with respect to all other queues. In other words, if you create four serial queues, each queue executes only one task at a time but up to four tasks could still execute concurrently, one from each queue. For information on how to create serial queues, see “Creating Serial Dispatch Queues.”
(source)
This would be useful if you want all of your messages to be handled on a background thread.
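A minimal sketch, assuming hypothetical stepOne/stepTwo methods. Note that this serializes only the calls themselves: if stepOne kicks off asynchronous work (such as an animation) and returns immediately, the queue will not wait for that work to finish:

import Dispatch

func stepOne() { /* first piece of work (hypothetical) */ }
func stepTwo() { /* second piece of work (hypothetical) */ }

let sequence = DispatchQueue(label: "com.example.sequence")
sequence.async { stepOne() }
sequence.async { stepTwo() } // begins only after stepOne() has returned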
There are two performSelector methods that can wait for completion, so there's no need to guess at timing.
[self performSelector:<#(SEL)#> onThread:<#(NSThread *)#> withObject:<#(id)#> waitUntilDone:<#(BOOL)#>];
[self performSelectorOnMainThread:<#(SEL)#> withObject:<#(id)#> waitUntilDone:<#(BOOL)#>];
It sounds like you want to check out NSOperationQueue, NSOperation, and either NSBlockOperation or NSInvocationOperation. Unlike a GCD queue, an NSOperationQueue supports cancelling jobs.
You can create your own queue and set its maximum concurrent operation count to 1 to force it to execute operations serially. Or you can set dependencies between operations to force those operations to run serially.
Start with the chapter on Operation Queues in the Concurrency Programming Guide.
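In Swift terms, a minimal sketch of a cancellable serial operation queue (the operation bodies are placeholders):

import Foundation

let queue = OperationQueue()
queue.maxConcurrentOperationCount = 1 // execute the operations serially

queue.addOperation { /* animation step 1 */ }
queue.addOperation { /* animation step 2 */ }

// unlike a plain GCD queue, pending operations can be cancelled:
queue.cancelAllOperations()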
I finally found what I'm looking for: completion blocks. Simply put, I'd write a method like this:
- (void)performSomeActionWithCompletion:(void (^)(void))completion {
    [self someAction];
    // only invoke the completion block if the caller supplied one
    if (completion) {
        completion();
    }
}
Now I can call my method like so:
[self performSomeActionWithCompletion:^{
    NSLog(@"All done! (Well, not the async stuff, but at any rate...)");
}];