Is GCD really thread-safe? - swift

I have been studying GCD and thread safety.
Apple's documentation says GCD is thread-safe, meaning that multiple threads can access it.
And I learned that "thread-safe" means an object always gives the same result whenever multiple threads access it.
I think those two meanings of thread-safe are not the same, so I tested the case below, which sums 0 to 9999.
The value of something.n is not the same each time I execute the code below.
If GCD is thread-safe, why isn't the value of something.n always the same?
I'm really confused by that... Could you help me?
I really want to master thread safety!
class Something {
    var n = 0
}

class ViewController: UIViewController {
    let something = Something()
    var concurrentQueue = DispatchQueue(label: "asdf", attributes: .concurrent)

    override func viewDidLoad() {
        super.viewDidLoad()

        let group = DispatchGroup()

        for idx in 0..<10000 {
            concurrentQueue.async(group: group) {
                self.something.n += idx
            }
        }

        group.notify(queue: .main) {
            print(self.something.n)
        }
    }
}

You said:
I have been studying GCD and thread safety. Apple's documentation says GCD is thread-safe, meaning that multiple threads can access it. And I learned that "thread-safe" means an object always gives the same result whenever multiple threads access it.
They are saying the same thing. A block of code is thread-safe only if it is safe to invoke it from different threads at the same time (and this thread safety is achieved by making sure that the critical portion of code cannot run on one thread at the same time as another thread).
But let us be clear: Apple is not saying that if you use GCD, that your code is automatically thread-safe. Yes, the dispatch queue objects, themselves, are thread-safe (i.e. you can safely dispatch to a queue from whatever thread you want), but that doesn’t mean that your own code is necessarily thread-safe. If one’s code is accessing the same memory from multiple threads concurrently, one must provide one’s own synchronization to prevent writes simultaneous with any other access.
In the Threading Programming Guide: Synchronization, which predates GCD, Apple outlines various mechanisms for synchronizing code. You can also use a GCD serial queue for synchronization. If you use a concurrent queue, you can achieve thread safety by using a "barrier" for write operations. See the latter part of this answer for a variety of ways to achieve thread safety.
But be clear, Apple is not introducing a different definition of “thread-safe”. As they say in that aforementioned guide:
When it comes to thread safety, a good design is the best protection you have. Avoiding shared resources and minimizing the interactions between your threads makes it less likely for those threads to interfere with each other. A completely interference-free design is not always possible, however. In cases where your threads must interact, you need to use synchronization tools to ensure that when they interact, they do so safely.
And in the Concurrency Programming Guide: Migrating Away from Threads: Eliminating Lock-Based Code, which was published when GCD was introduced, Apple says:
For threaded code, locks are one of the traditional ways to synchronize access to resources that are shared between threads. ... Instead of using a lock to protect a shared resource, you can instead create a queue to serialize the tasks that access that resource.
But they are not saying that you can just use GCD concurrent queues and automatically achieve thread-safety, but rather that with careful and proper use of GCD queues, one can achieve thread-safety without using locks.
By the way, Apple provides tools to help you diagnose whether your code is thread-safe, namely the Thread Sanitizer (TSAN). See Diagnosing Memory, Thread, and Crash Issues Early.
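For example, here is a minimal sketch (using the question's own Something class; the queue label is illustrative) of how the summation could be made thread-safe by funneling every mutation of n through a serial queue, so that only one block touches it at a time:

let something = Something()
let group = DispatchGroup()
// A serial queue: blocks run one at a time, so `n` can never be
// read and written by two threads simultaneously.
let syncQueue = DispatchQueue(label: "sync.n")

for idx in 0..<10000 {
    syncQueue.async(group: group) {
        something.n += idx
    }
}

group.notify(queue: .main) {
    print(something.n) // always 49995000
}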

Your current queue is concurrent

var concurrentQueue = DispatchQueue(label: "asdf", attributes: .concurrent)

which means the dispatched blocks can start in any order and, more importantly, several of them can run at the same time. Every run interleaves differently, and nothing stops two blocks from reading and writing something.n simultaneously, which is why the printed sum varies.
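If you want to keep the concurrent queue, one hedged sketch is to serialize just the writes with a barrier; the .barrier flag makes a block run exclusively, with nothing else executing on that queue at the same time:

for idx in 0..<10000 {
    concurrentQueue.async(group: group, flags: .barrier) {
        self.something.n += idx
    }
}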

Related

How to enforce queue to NOT run on main thread in Swift unit tests

There are many examples of how to enforce code to run on the main thread. I have the opposite need, for unit-testing purposes: I want to exclude the main thread from taking on some work.
Why do I need this: I would like to test that a particular function runs correctly, specifically when it's simultaneously called from 2 or more background threads (while the main thread is free and available). This is because the function itself makes use of the main thread, while it is a typical use case that it will be called from background threads, possibly concurrently.
I already have one test case where one of the calling threads is the main thread. But I also want to test the case when the main thread is not busy, when 2 or more other threads call the function.
The problem is that there seems to be no way to leave the main thread free. For instance, even if the queue is defined with qos: .background, the main thread was still used:
private let queue = DispatchQueue(label: "bgqueue", qos: .background, attributes: .concurrent)

let iterations = 10
DispatchQueue.concurrentPerform(iterations: iterations) { _ in
    queue.async {
        callMyFunction() // still may run on main thread, even with 1-2 iterations
    }
}
The only approach I can think of is blocking all threads at the same spot (using a CountDownLatch, for example), and then proceeding to the function from all threads but main:
let latch = CountDownLatch(count: iterations)

DispatchQueue.concurrentPerform(iterations: iterations) { _ in
    latch.countdown()
    latch.await()
    if !Thread.isMainThread {
        callMyFunction()
    }
}
The problems here are: (1) I have to make sure that iterations < available threads; and (2) it feels wrong to block the main thread.
So is there any better solution? How to exclude main thread from picking up DispatchQueue work within unit tests?
As long as you're not dispatching to the main queue (or some other queue that uses the main queue as its “target”), async will never use the main thread. There are some interesting optimizations with sync which affect which thread is used (which isn’t relevant here), but not for async. The asynchronously dispatched task will run on a worker thread associated with that QoS, not the main thread.
So, go ahead and insert a thread assertion in your test to make sure it’s not on the main thread, and I think you’ll find that it is fine:
XCTAssertFalse(Thread.isMainThread, "Shouldn't be on main thread")
That having been said, you allude to the fact that “function itself makes use of main thread”. If that’s the case, you can easily deadlock if your test is blocking the main thread, waiting for the called functions to finish processing on the main thread as well.
You can often avoid these sorts of unit test deadlocks by using “expectations”.
func testWillNotDeadlock() throws {
    let expectation = self.expectation(description: "testWillNotDeadlock")

    someAsyncMethod {
        expectation.fulfill()
    }

    waitForExpectations(timeout: 100)
}
In this case, although someAsyncMethod may have used the main thread, we didn't deadlock because we made sure the test used expectations rather than blocking the current thread.
So, bottom line, if you're testing some asynchronous methods that must use the main thread, make sure that the tests don't block, but rather that they use expectations.
There are, admittedly, other sources of potential problems. For example, you are calling async for every iteration. You need to be careful with that pattern, because you can easily exhaust all of the worker threads with this "thread explosion". The attempt to introduce concurrentPerform (which is often used to solve thread-explosion problems) won't work if the code in the closure is calling async.
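As a hedged sketch (callMyFunction and iterations are the asker's placeholders), concurrentPerform only constrains the parallelism if the work is done synchronously inside its closure; dispatching the whole thing off the main thread first also guarantees main stays free:

DispatchQueue.global(qos: .background).async {
    // concurrentPerform blocks until all iterations finish and throttles
    // the parallelism to the available cores, avoiding thread explosion.
    DispatchQueue.concurrentPerform(iterations: iterations) { _ in
        callMyFunction() // runs synchronously on a worker thread
    }
}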

Swift dispatch queues async order of execution

Considering this trivial piece of code
DispatchQueue.global().async {
    print("2")
}
print("1")
we can say that the output will be as follows:
1
2
Are there any circumstances under which the order of execution will be different (disregarding the kind of queue used)? Can a different order be forced manually?
You said:
we can say that the output will be as follows ...
No, at best you can only say that the output will often/frequently be in that order, but is not guaranteed to be so.
The code snippet is dispatching code asynchronously to a global, concurrent queue, which will run it on a separate worker thread, and, in the absence of any synchronization, you have a classic race condition between the current thread and that worker thread. You have no guarantee of the sequence of these print statements, though, in practice, you will frequently see 1 before 2.
This is one of the common questions at tech interviews, and for some reason interviewers expect the conclusion that the order of execution is constant here. Since that was not aligned with my understanding, I decided to clarify.
Your understanding is correct, that the sequence of these two print statements is definitely not guaranteed.
Are there any circumstances under which order of execution will be different
A couple of thoughts:
By adjusting queue priorities, for example, you can change the likelihood that 1 will appear before 2. But, again, not guaranteed.
There are a variety of mechanisms to guarantee the order:
You can use a serial queue ... I gather you didn't want to consider using another/different queue, but it is generally the right solution, so any discussion of this topic would be incomplete without that scenario (a minimal sketch follows this list);
You can use a dispatch group ... you can notify on the global queue when the current queue satisfies the group;
You can use dispatch semaphores ... semaphores are a classic answer to the question, but IMHO semaphores should be used sparingly as it's so easy to make mistakes ... plus blocking threads is never a good idea;
For the sake of completeness, we should mention that you really can use any synchronization mechanism, such as locks.
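A minimal sketch of the serial-queue approach (the queue label is illustrative): because a serial queue runs its blocks one at a time in FIFO order, "1" is guaranteed to print before "2":

let serialQueue = DispatchQueue(label: "ordered") // serial by default

serialQueue.async { print("1") }
serialQueue.async { print("2") } // always runs after the first block finishes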
I did a quick test too. Normally I do a UI update after code execution on the global queue is complete, and I would usually put it at the end of the code block (step 2).
But today I found that even if I put that main-queue code at the beginning of the global-queue block (step 1), it is still executed after all the global-queue code has completed.
DispatchQueue.global().async {
    // step 1
    DispatchQueue.main.async {
        print("1. on the main thread")
    }

    // code on the global queue
    print("1. off the main thread")
    print("2. off the main thread")

    // step 2
    DispatchQueue.main.async {
        print("2. on the main thread")
    }
}
Here is the output:
1. off the main thread
2. off the main thread
1. on the main thread
2. on the main thread

Using libevent together with GCD (libdispatch) in Swift

I'm creating a server-side app in Swift 3. I've chosen libevent for implementing the networking code because it's cross-platform and doesn't suffer from the C10K problem. Libevent implements its own event loop, but I want to keep CFRunLoop and GCD (DispatchQueue.main.after etc.) functional as well, so I need to glue them together somehow.
This is what I've come up with:
var terminated = false

DispatchQueue.main.after(when: DispatchTime.now() + 3) {
    print("Dispatch works!")
    terminated = true
}

while !terminated {
    switch event_base_loop(eventBase, EVLOOP_NONBLOCK) { // libevent
    case 1:
        break // No events were processed
    case 0:
        print("DEBUG: Libevent processed one or more events")
    default: // -1
        print("Unhandled error in network backend")
        exit(1)
    }
    RunLoop.current().run(mode: RunLoopMode.defaultRunLoopMode,
                          before: Date(timeIntervalSinceNow: 0.01))
}
This works, but introduces a latency of 0.01 sec. While RunLoop is sleeping, libevent won't be able to process events. Lowering this timeout increases CPU usage significantly when the app is idle.
I was also considering using only libevent, but third party libs in the project can use dispatch_async internally, so this can be problematic.
Running libevent's loop on a different thread makes synchronization more complex; is that the only way of solving this latency issue?
LINUX UPDATE. The above code does not work on Linux (2016-07-25-a Swift snapshot): RunLoop.current().run exits with an error. Below is a working Linux version reimplemented with a timer and dispatch_main. It suffers from the same latency issue:
let queue = dispatch_get_main_queue()
let timer = dispatch_source_create(DISPATCH_SOURCE_TYPE_TIMER, 0, 0, queue)
let interval = 0.01

let block: () -> () = {
    guard !terminated else {
        print("Quitting")
        exit(0)
    }
    switch server.loop() {
    case 1: break // Just idling
    case 0: break // print("Libevent: processed event(s)")
    default: // -1
        print("Unhandled error in network backend")
        exit(1)
    }
}

block()

let fireTime = dispatch_time(DISPATCH_TIME_NOW, Int64(interval * Double(NSEC_PER_SEC)))
dispatch_source_set_timer(timer, fireTime, UInt64(interval * Double(NSEC_PER_SEC)), UInt64(NSEC_PER_SEC) / 10)
dispatch_source_set_event_handler(timer, block)
dispatch_resume(timer)
dispatch_main()
A quick search of the Open Source Swift Foundation libraries on GitHub reveals that the support in CFRunLoop is (perhaps obviously) implemented differently on different platforms. This means, in essence, that RunLoop and libevent, with respect to cross-platform-ness, are just different ways to achieve the same thing. I can see the thinking behind the thought that libevent is probably better suited to server implementations, since CFRunLoop didn't grow up with that specific goal, but as far as being cross-platform goes, they're both barking up the same tree.
That said, the underlying synchronization primitives used by RunLoop and libevent are inherently private implementation details and, perhaps more importantly, different between platforms. From the source, it looks like RunLoop uses epoll on Linux, as does libevent, but on macOS/iOS/etc, RunLoop is going to use Mach ports as its fundamental primitive, but libevent looks like it's going to use kqueue. You might, with enough effort, be able to make a hybrid RunLoopSource that ties to a libevent source for a given platform, but this would likely be very fragile, and generally ill-advised, for a couple of reasons: Firstly, it would be based on private implementation details of RunLoop that are not part of the public API, and therefore subject to change at any time without notice. Second, assuming you didn't go through and do this for every platform supported by both Swift and libevent, you would have broken the cross-platform-ness of it, which was one of your stated reasons for going with libevent in the first place.
One additional option you might not have considered would be to use GCD by itself, without RunLoops. Look at the docs for dispatch_main. In a server application, there's (typically) nothing special about a "main thread," so dispatching to the "main queue" should be good enough (if needed at all). You can use dispatch "sources" to manage your connections, etc. I can't personally speak to how dispatch sources scale up to the C10K/C100K/etc. level, but they've seemed pretty lightweight and low-overhead in my experience. I also suspect that using GCD like this would likely be the most idiomatic way to write a server application in Swift. I've written up a small example of a GCD-based TCP echo server as part of another answer here.
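As a hedged illustration of that dispatch-source approach (modern Swift Dispatch API; socketFD is an assumed, already-connected non-blocking socket descriptor), a read source might look like this:

import Dispatch
#if canImport(Darwin)
import Darwin
#else
import Glibc
#endif

// socketFD: Int32 is assumed to come from your accept/connect code.
let readSource = DispatchSource.makeReadSource(fileDescriptor: socketFD, queue: .global())
readSource.setEventHandler {
    // Data is available; read without blocking.
    var buffer = [UInt8](repeating: 0, count: 4096)
    let bytesRead = read(socketFD, &buffer, buffer.count)
    if bytesRead <= 0 {
        readSource.cancel() // EOF or error: stop monitoring.
    }
}
readSource.setCancelHandler { close(socketFD) }
readSource.resume() // sources start suspended; resume to begin monitoring
dispatchMain()      // park the main thread and let GCD drive everything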
If you were bound and determined to use both RunLoop and libevent in the same application, it would, as you guessed, be best to give libevent its own separate thread, but I don't think it's as complex as you might think. You should be able to dispatch_async from libevent callbacks freely, and similarly marshal replies from GCD-managed threads to libevent using libevent's multi-threading mechanisms fairly easily (i.e. either by running with locking on, or by marshaling your calls into libevent as events themselves.) Similarly, third-party libraries using GCD should not be an issue even if you chose to use libevent's loop structure. GCD manages its own thread pools and would have no way of stepping on libevent's main loop, etc.
You might also consider architecting your application such that it didn't matter what concurrency and connection handling library you used. Then you could swap out libevent, GCD, CFStreams, etc. (or mix and match) depending on what worked best for a given situation or deployment. Choosing a concurrency approach is important, but ideally you wouldn't couple yourself to it so tightly that you couldn't switch if circumstances called for it.
When you have such an architecture, I'm generally a fan of the approach of using the highest level abstraction that gets the job done, and only driving down to lower level abstractions when specific circumstances require it. In this case, that would probably mean using CFStreams and RunLoops to start, and switching out to "bare" GCD or libevent later, if you hit a wall and also determined (through empirical measurement) that it was the transport layer and not the application layer that was the limiting factor. Very few non-trivial applications actually get to the C10K problem in the transport layer; things tend to have to scale "out" at the application layer first, at least for apps more complicated than basic message passing.

NSManagedObjectContext performBlockAndWait: doesn't execute on background thread?

I have an NSManagedObjectContext declared like so:
- (NSManagedObjectContext *)backgroundMOC {
    if (backgroundMOC != nil) {
        return backgroundMOC;
    }
    backgroundMOC = [[NSManagedObjectContext alloc] initWithConcurrencyType:NSPrivateQueueConcurrencyType];
    return backgroundMOC;
}
Notice that it is declared with a private queue concurrency type, so its tasks should be run on a background thread. I have the following code:
- (void)testThreading
{
    /* ok */
    [self.backgroundMOC performBlock:^{
        assert(![NSThread isMainThread]);
    }];

    /* CRASH */
    [self.backgroundMOC performBlockAndWait:^{
        assert(![NSThread isMainThread]);
    }];
}
Why does calling performBlockAndWait execute the task on the main thread rather than background thread?
Tossing in another answer, to try to explain why performBlockAndWait will always run in the calling thread.
performBlock is completely asynchronous. It will always enqueue the block onto the queue of the receiving MOC, and then return immediately. Thus,
[moc performBlock:^{
    // Foo
}];

[moc performBlock:^{
    // Bar
}];
will place two blocks on the queue for moc. They will always execute asynchronously. Some unknown thread will pull blocks off of the queue and execute them. In addition, those blocks are wrapped within their own autorelease pool, and also they will represent a complete Core Data user event (processPendingChanges).
performBlockAndWait does NOT use the internal queue. It is a synchronous operation that executes in the context of the calling thread. Of course, it will wait until the current operations on the queue have been executed, and then that block will execute in the calling thread. This is documented (and reasserted in several WWDC presentations).
Furthermore, performBlockAndWait is re-entrant, so nested calls all happen right in that calling thread.
The Core Data engineers have been very clear that the actual thread in which a queue-based MOC operation runs is not important. It's the synchronization by using the performBlock* API that's key.
So, consider 'performBlock' as "This block is being placed on a queue, to be executed at some undetermined time, in some undetermined thread. The function will return to the caller as soon as it has been enqueued"
performBlockAndWait is "This block will be executed at some undetermined time, in this exact same thread. The function will return after this code has completely executed (which will occur after the current queue associated with this MOC has drained)."
EDIT
Are you sure that "performBlockAndWait does NOT use the internal queue"? I think it does. The only difference is that performBlockAndWait will wait until the block's completion. And what do you mean by calling thread? In my understanding, [moc performBlockAndWait] and [moc performBlock] both run on its private queue (background or main). The important concept here is that the moc owns the queue, not the other way around. Please correct me if I am wrong. – Philip007
It is unfortunate that I phrased the answer as I did, because, taken by itself, it is incorrect. However, in the context of the original question it is correct. Specifically, when calling performBlockAndWait on a private queue, the block will execute on the thread that called the function - it will not be put on the queue and executed on the "private thread."
Now, before I even get into the details, I want to stress that depending on internal workings of libraries is very dangerous. All you should really care about is that you can never expect a specific thread to execute a block, except anything tied to the main thread. Thus, expecting a performBlockAndWait to not execute on the main thread is not advised because it will execute on the thread that called it.
performBlockAndWait uses GCD, but it also has its own layer (e.g., to prevent deadlocks). If you look at the GCD code (which is open source), you can see how synchronous calls work - and in general they synchronize with the queue and invoke the block on the thread that called the function - unless the queue is the main queue or a global queue. Also, in the WWDC talks, the Core Data engineers stress the point that performBlockAndWait will run in the calling thread.
So, when I say it does not use the internal queue, that does not mean it does not use the data structures at all. It must synchronize the call with the blocks already on the queue, and those submitted in other threads and other asynchronous calls. However, when calling performBlockAndWait it does not put the block on the queue... instead it synchronizes access and runs the submitted block on the thread that called the function.
Now, SO is not a good forum for this, because it's a bit more complex than that, especially w.r.t the main queue, and GCD global queues - but the latter is not important for Core Data.
The main point is that when you call any performBlock* or GCD function, you should not expect it to run on any particular thread (except something tied to the main thread) because queues are not threads, and only the main queue will run blocks on a specific thread.
When calling the core data performBlockAndWait the block will execute in the calling thread (but will be appropriately synchronized with everything submitted to the queue).
I hope that makes sense, though it probably just caused more confusion.
EDIT
Furthermore, you can see the unspoken implications of this, in that the way in which performBlockAndWait provides re-entrant support breaks the FIFO ordering of blocks. As an example...
[context performBlockAndWait:^{
    NSLog(@"One");

    [context performBlock:^{
        NSLog(@"Two");
    }];

    [context performBlockAndWait:^{
        NSLog(@"Three");
    }];
}];
Note that strict adherence to the FIFO guarantee of the queue would mean that the nested performBlockAndWait ("Three") would run after the asynchronous block ("Two") since it was submitted after the async block was submitted. However, that is not what happens, as it would be impossible... for the same reason a deadlock ensues with nested dispatch_sync calls. Just something to be aware of if using the synchronous version.
In general, avoid sync versions whenever possible because dispatch_sync can cause a deadlock, and any re-entrant version, like performBlockAndWait will have to make some "bad" decision to support it... like having sync versions "jump" the queue.
Why not? Grand Central Dispatch's block concurrency paradigm (which I assume the MOC uses internally) is designed so that only the runtime and operating system need to worry about threads, not the developer (because the OS can do it better than you can, having more detailed information). Too many people assume that queues are the same as threads. They are not.
Queued blocks are not required to run on any given thread (the exception being blocks in the main queue must execute on the main thread). So, in fact, sometimes sync (i.e. performBlockAndWait) queued blocks will run on the main thread if the runtime feels it would be more efficient than creating a thread for it. Since you are waiting for the result anyway, it wouldn't change the way your program functioned if the main thread were to hang for the duration of the operation.
This last part I am not sure if I remember correctly, but in the WWDC 2011 videos about GCD, I believe that it was mentioned that the runtime will make an effort to run on the main thread, if possible, for sync operations because it is more efficient. In the end though, I suppose the answer to "why" can only be answered by the people who designed the system.
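A hedged sketch of that behavior with plain GCD: when you dispatch synchronously to a private queue, the runtime will typically run the block directly on the calling thread rather than waking a separate worker:

import Dispatch
import Foundation

let queue = DispatchQueue(label: "worker") // a private serial queue

print("caller thread: \(Thread.current)")
queue.sync {
    // dispatch sync usually borrows the calling thread for the block.
    print("block thread:  \(Thread.current)") // usually the same thread as above
}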
I don't think that the MOC is obligated to use a background thread; it's just obligated to ensure that your code will not run into concurrency issues with the MOC if you use performBlock: or performBlockAndWait:. Since performBlockAndWait: is supposed to block the current thread, it seems reasonable to run that block on that thread.
The performBlockAndWait: call only makes sure that you execute the code in such a way that you don't introduce concurrency (i.e. on 2 threads performBlockAndWait: will not run at the same time, they will block each other).
The long and the short of it is that you can't depend on which thread a MOC operation runs on, well basically ever. I've learned the hard way that if you use GCD or just straight up threads, you always have to create local MOCs for each operation and then merge them to the master MOC.
There is a great library (MagicalRecord) that makes that process very simple.

Does putting a block on a sync GCD queue lock that block and pause the others?

I read that GCD synchronous dispatch (dispatch_sync) should be used to implement critical sections of code. An example would be a block that subtracts a transaction amount from an account balance. The interesting part of sync calls is the question of how they affect the work of other blocks on multiple threads.
Let's imagine a situation where there are 3 threads that use and execute both system and user-defined blocks from main and custom queues in asynchronous mode. Those blocks are all executed in parallel in some order. Now, if a block is put on a custom queue with sync mode, does that mean that all other blocks (including those on other threads) are suspended until that block executes successfully? Or does that mean that only some lock is put on that block while the others still execute? However, if other blocks use the same data as the sync block, then it's inevitable that the other blocks will wait until that lock is released.
IMHO it doesn't matter whether there is one core or multiple cores; sync mode should freeze the whole app's work. However, these are just my thoughts, so please comment on that and share your insights :)
Synchronous dispatch suspends the execution of your code until the dispatched block has finished. Asynchronous dispatch returns immediately, the block is executed asynchronously with regard to the calling code:
dispatch_sync(somewhere, ^{ something });
// Reached later, when the block is finished.

dispatch_async(somewhere, ^{ something });
// Reached immediately. The block might be waiting
// to be executed, executing or already finished.
And there are two kinds of dispatch queues, serial and concurrent. The serial ones dispatch the blocks strictly one by one in the order they are being added. When one finishes, another one starts. There is only one thread needed for this kind of execution. The concurrent queues dispatch the blocks concurrently, in parallel. There are more threads being used there.
You can mix and match sync/async dispatch and serial/concurrent queues as you see fit. If you want to use GCD to guard access to a critical section, use a single serial queue and dispatch all operations on the shared data on this queue (synchronously or asynchronously, does not matter). That way there will always be just one block operating with the shared data:
- (void)addFoo:(id)foo {
    dispatch_sync(guardingQueue, ^{ [sharedFooArray addObject:foo]; });
}

- (void)removeFoo:(id)foo {
    dispatch_sync(guardingQueue, ^{ [sharedFooArray removeObject:foo]; });
}
Now if guardingQueue is a serial queue, the add/remove operations can never clash even if the addFoo: and removeFoo: methods are called concurrently from different threads.
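For Swift readers, a hedged sketch of the same guarding pattern (names are illustrative; Foo is assumed Equatable):

struct Foo: Equatable { let id: Int }

let guardingQueue = DispatchQueue(label: "fooGuard") // serial by default
var sharedFooArray = [Foo]()

func addFoo(_ foo: Foo) {
    guardingQueue.sync { sharedFooArray.append(foo) }
}

func removeFoo(_ foo: Foo) {
    guardingQueue.sync {
        if let idx = sharedFooArray.firstIndex(of: foo) {
            sharedFooArray.remove(at: idx)
        }
    }
}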
No it doesn't.
The synchronized part is that the block is put on a queue, but control does not pass back to the calling function until the block returns.
Many uses of GCD are asynchronous; you put a block on a queue and, rather than waiting for the block to complete its work, control is passed back to the calling function.
This has no effect on other queues.
If you need to serialize access to a certain resource, then there are at least two mechanisms available to you. If you have an account object (one that is unique for a given account number), then you can do something like:

@synchronized(accountObject) { ... }

If you don't have an object but are using a C structure for which there is only one such structure for a given account number, then you can do the following:
// Should be added to the account structure.
// 1 => at most one thread can be inside the critical section at a time.
dispatch_semaphore_t accountLock = dispatch_semaphore_create(1);

// In your block you do the following:
block = ^(void) {
    dispatch_semaphore_wait(accountLock, DISPATCH_TIME_FOREVER);
    // Do something
    dispatch_semaphore_signal(accountLock);
};

// -- Edited: semaphore was leaking.
// At the appropriate time, release the lock. If the semaphore was
// created in init, then it should be released in the release method.
dispatch_release(accountLock);
With this, regardless of the level of concurrency of your queues, you are guaranteed that only one thread will access an account at any given time.
There are many more types of synchronization objects, but these two are easy to use and quite flexible.
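For Swift readers, a hedged sketch of the same semaphore pattern; DispatchSemaphore is reference-counted in Swift, so no explicit release is needed:

var balance = 100
let accountLock = DispatchSemaphore(value: 1) // at most one thread in the critical section

func withdraw(_ amount: Int) {
    accountLock.wait()             // acquire the lock
    defer { accountLock.signal() } // release it even on an early return
    balance -= amount
}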