Swift Queues/Concurrency and Locking

I usually use serial queues as a locking mechanism to make sure that a single resource can be accessed by many different threads without problems. But I have seen cases where other devs use concurrent queues with or even without semaphores (I saw IBM/Swift on Linux using a concurrent queue with a semaphore).
Are there any advantages/disadvantages? I would have thought that just using serial queues would correctly block the resource without wasting time on semaphores.
On the other hand, what happens when the CPU is busy? If I remember correctly, a serial queue is not necessarily executed on the same thread/same CPU, right?
That would be the only explanation I can think of: a concurrent queue would be able to share the workload across all available threads/CPUs, ensuring thread-safe access through the semaphore.
Using a concurrent queue without a semaphore would not be safe, right?

Concurrent queues with semaphores give you more granularity over what requires locking. Most of the work can be executed in parallel, with only the mutually exclusive regions (the critical regions) requiring locking.
However, this can be equally well achieved with a concurrent queue whose critical regions are dispatched to a serial queue, ensuring mutual exclusion.
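For example, here is a minimal sketch of both approaches in GCD. The type names are hypothetical, and the `.barrier` flag stands in for the semaphore-guarded critical region described above:

```swift
import Foundation

// Approach 1: a concurrent queue where only the mutating (critical)
// region is made exclusive, here via a barrier block.
final class BarrierCache {
    private var storage: [String: Int] = [:]
    private let queue = DispatchQueue(label: "com.example.cache",
                                      attributes: .concurrent)

    func value(forKey key: String) -> Int? {
        queue.sync { storage[key] }          // reads may run in parallel
    }

    func setValue(_ value: Int, forKey key: String) {
        // The barrier makes this block mutually exclusive with every
        // other block on the queue, like a semaphore-protected region.
        queue.async(flags: .barrier) { self.storage[key] = value }
    }
}

// Approach 2: a plain serial queue; every access is serialized,
// which is simpler but allows no read parallelism.
final class SerialCache {
    private var storage: [String: Int] = [:]
    private let queue = DispatchQueue(label: "com.example.serial-cache")

    func value(forKey key: String) -> Int? {
        queue.sync { storage[key] }
    }

    func setValue(_ value: Int, forKey key: String) {
        queue.async { self.storage[key] = value }
    }
}
```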
I would have thought that just using serial queues would correctly block the resource without wasting time on semaphores.
Serial queues also need semaphores internally, as mutation of the queue itself must be synchronized. However, the queue tucks this under the rug and protects you from the many easy-to-make mistakes associated with manual semaphore use.
Using a concurrent queue without a semaphore would not be safe, right?
Nope, it would not be safe.

Related

If the wait and signal operations in semaphore are atomic, does that mean two processes can simultaneously perform them on two different processors?

By the definition of atomicity, it implies that "either all occur, or nothing occurs". But if two processes simultaneously perform a wait on the same semaphore on two different processors, that does not violate atomicity, yet it will lead to problems. So what exactly do you mean by atomicity in this context? Shouldn't the operations also be performed in a mutually exclusive manner?
Let's say you have 2 threads and a semaphore with a count of 1.
If they both down() at the same time, the atomicity of the primitive guarantees that one will be granted the semaphore and the other one will go to sleep. In particular, it is impossible for both to decide to go to sleep or for both to acquire it.
Similarly for down() vs up(): up() will release and wake up as necessary. In particular, it is impossible for the thread doing down() to go to sleep after up() released it.
It's the entire point of the primitive.
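You can observe this guarantee with Swift's DispatchSemaphore, where wait() plays the role of down() and signal() the role of up(). A minimal sketch:

```swift
import Foundation

// Two threads contend for a semaphore with a count of 1. Atomicity
// guarantees exactly one acquires it; the other sleeps until signal().
let semaphore = DispatchSemaphore(value: 1)

for id in 1...2 {
    Thread.detachNewThread {
        semaphore.wait()                        // down(): decrement or sleep
        print("thread \(id) acquired")
        Thread.sleep(forTimeInterval: 0.1)      // simulate a critical section
        print("thread \(id) releasing")
        semaphore.signal()                      // up(): increment and wake
    }
}

Thread.sleep(forTimeInterval: 1)  // keep the process alive for the demo
```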
How semaphores are implemented depends upon whether they will work on a single processor or on multiple processors. On a single processor, a system can lock the semaphore by blocking interrupts for the short time required, or by using atomic instructions (i.e., instructions that cannot be interrupted).
Obviously, that does not work in a multiprocessor system. There, you have to implement semaphores using interlocked instructions that block other processors from accessing the data at the same time.
So it all depends upon how your semaphore is implemented.

Is there a way we might face a race condition when running async tasks on a serial queue?

Suppose multiple async tasks running on a serial queue access the same shared resource; is there any chance we might face a race condition?
Following up on the comment I've added, this is taken from the Apple documentation; the first sentence is the part you are looking for.
Serial queues (also known as private dispatch queues) execute one task at a time in the order in which they are added to the queue. The currently executing task runs on a distinct thread (which can vary from task to task) that is managed by the dispatch queue. Serial queues are often used to synchronize access to a specific resource.
If you are using a concurrent queue instead, you could have a race condition. You can prevent it using dispatch barriers, for example. See Grand Central Dispatch In-Depth: Part 1/2 for more details.
The same applies to NSOperation and NSOperationQueue. An NSOperationQueue can be made serial by setting maxConcurrentOperationCount to 1. In addition, using dependencies between operations, you can synchronize access to a shared resource.
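A minimal sketch of those two NSOperationQueue techniques (the operation names are hypothetical):

```swift
import Foundation

let queue = OperationQueue()
queue.maxConcurrentOperationCount = 1   // makes the queue effectively serial

let produce = BlockOperation { print("produce the shared value") }
let consume = BlockOperation { print("consume the shared value") }

// A dependency also enforces ordering, even on a concurrent queue.
consume.addDependency(produce)

queue.addOperations([produce, consume], waitUntilFinished: true)
```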
No, you cannot run into a race condition when running async tasks on a serial queue. The type of queue determines the way tasks are executed, whereas synchrony versus asynchrony concerns the responsiveness of your application when completing an expensive task.
The reason it is easy to run into a race condition on a concurrent queue is that a concurrent queue allows tasks to execute at the same time, so different threads may be "racing" to perform an action and, in actuality, overwriting the previous thread's work when performing the same action. On a serial queue, tasks are executed one at a time, so two threads can't race to complete a task, because everything happens in sequential order. Hope that helps!

Semaphore when using a pre-emptive kernel

I know what a binary semaphore is: it is a flag that is set to 1 by an interrupt's ISR.
But what is a semaphore when we are using a pre-emptive kernel, say FreeRTOS? Is it the same as a binary semaphore?
it is a flag that is set to 1 by an interrupt's ISR.
That is neither a complete nor accurate description of a semaphore. What you have described is merely a flag. A semaphore is a synchronisation object; there are three forms provided by a typical RTOS:
Binary Semaphore
Counting Semaphore
Mutual Exclusion Semaphore (Mutex)
In the case of a binary semaphore, there are two operations: give and take. A task taking a semaphore will block (i.e. suspend execution and allow other lower or equal priority threads to run) until some other thread or interrupt handler gives the semaphore. Binary semaphores are used to signal between threads and from ISRs to threads. They are often used to implement deferred interrupt handlers, so that an ISR can be very short and the handler can benefit from RTOS mechanisms that are not allowed in an ISR (anything that blocks or suspends execution).
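FreeRTOS itself is C, but the give/take pattern can be sketched with Swift's DispatchSemaphore, with wait() standing in for take and signal() for give; the "ISR" here is just another thread, for illustration:

```swift
import Foundation

// A deferred-handler sketch: the handler thread blocks on take
// until something gives the semaphore. Starting the count at 0
// approximates an initially empty binary semaphore.
let pending = DispatchSemaphore(value: 0)

Thread.detachNewThread {
    while true {
        pending.wait()                 // take: block until signalled
        print("deferred handler: processing the event")
    }
}

// Pretend this is the ISR signalling that work is pending.
pending.signal()                       // give: wake the handler
Thread.sleep(forTimeInterval: 0.5)     // keep the demo alive briefly
```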
Multiple threads may block on a single semaphore, but only one of those tasks will take the semaphore. Some RTOSs have a flush operation (VxWorks for example) that puts all threads waiting on a semaphore in the ready state simultaneously, in which case they will run according to the priority scheduling scheme.
A Counting Semaphore is similar to a Binary Semaphore, except that it can be given multiple times, and tasks may take the semaphore without blocking until the count is zero.
A Mutex is used for resource locking. It is possible to use a binary semaphore for this, but a mutex provides features that make this safer. The operations on a mutex are lock and unlock. When a thread locks a mutex, and another task attempts to lock the same mutex, the second (and any subsequent) task blocks until the first task unlocks it. This can be used to prevent more than one thread accessing a resource (memory or I/O) simultaneously. A thread may lock a mutex multiple times; a count is maintained, so that it must be unlocked an equal number of times before the lock is released. This allows a thread to nest locks.
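That nesting behaviour maps directly onto Foundation's NSRecursiveLock, for example:

```swift
import Foundation

let lock = NSRecursiveLock()

func outer() {
    lock.lock()          // count: 1
    inner()
    lock.unlock()        // count: 0 -> lock actually released
}

func inner() {
    lock.lock()          // count: 2; same thread, so no deadlock
    print("nested critical section")
    lock.unlock()        // count: 1
}

outer()
```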
A special feature of a mutex is that if the thread holding the lock has a lower priority than a thread requesting the lock, then the lower priority thread is boosted to the priority of the higher one. This prevents a priority inversion, where a middle priority task could preempt the low priority task holding the lock, increasing the length of time the higher priority task must wait and thus rendering the scheduling non-deterministic.
The above descriptions are typical; specific RTOS implementations may differ. For example, FreeRTOS distinguishes between a mutex and a recursive mutex, the latter supporting the nesting feature, while the former is marginally more efficient where nesting is not needed.
Semaphores are not just flags, or counts. They support send and wait operations. A user-space thread can wait on a semaphore without unnecessary and unwanted polling and be made ready/running 'immediately' when another thread, or an appropriately-designed driver/ISR, sends a unit.
By 'appropriately-designed driver/ISR', I mean one that can perform a send() operation and then exit via the OS scheduler whenever it needs to set a waiting thread ready/running.
Such a mechanism is vitally important on preemptive kernels because it allows them to achieve very good I/O performance without wasting time, CPU cycles and memory bandwidth on polling. Non-preemptive systems are hopelessly slow, latency-ridden and wasteful at I/O; this is why they are essentially no longer used, and why we put up with all the synchronization/locking/queueing issues.

When to use MCS lock

I have been reading about MCS locks, which I feel are pretty cool. Now that I know how they're implemented, the next question is when to use them. Below are my thoughts. Please feel free to add items to the list:
1) Ideally they should be used when there are more than 2 threads we want to synchronise.
2) MCS locks reduce the number of cache lines that have to be invalidated. In the worst case, the cache lines of 2 CPUs are invalidated.
Is there anything else I'm missing?
Also, can MCS be used to implement a mutex instead of a spinlock?
Code will benefit from using an MCS lock when there is high lock contention, i.e., many threads attempting to acquire the lock simultaneously. With a simple spin lock, all threads poll a single shared variable, whereas MCS forms a queue of waiting threads such that each thread polls its predecessor in the queue. Hence cache-coherency traffic is much lighter, since each thread waits "locally".
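For contrast, here is the kind of naive test-and-set spin lock MCS improves on, sketched with the swift-atomics package (an assumed dependency; a real MCS lock additionally needs an atomic tail pointer and per-thread queue nodes):

```swift
import Atomics   // https://github.com/apple/swift-atomics, assumed

// Every waiter polls the same `flag`, so each release invalidates a
// cache line on *all* waiting CPUs. MCS avoids this by giving each
// waiter its own node to spin on.
final class TASSpinLock {
    private let flag = ManagedAtomic<Bool>(false)

    func lock() {
        // Spin until we atomically flip the flag from false to true.
        while !flag.compareExchange(expected: false,
                                    desired: true,
                                    ordering: .acquiring).exchanged {
            // Contended waiting hammers this one shared variable;
            // this is the coherency traffic MCS is designed to avoid.
        }
    }

    func unlock() {
        flag.store(false, ordering: .releasing)
    }
}
```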
Using MCS to implement a mutex doesn't really make sense.
In a mutex, waiting threads are usually queued by the OS and de-scheduled, so there is no polling whatsoever. For example, check out pthread's mutex implementation.
I think the other answer by #CodeMoneky1 doesn't really explain "Also can MCS used to implement a mutex instead of a spinlock ?"
A mutex is typically implemented using a spinlock + a counter + a wait queue. Here the spinlock is usually a Test&Set primitive, or Peterson's solution. I would actually agree that MCS could be an alternative; the reason it is not used is probably that the gain is limited. After all, the scope of the spinlock used inside a mutex is much smaller.

CoreData: would it be safe using NSConfinementConcurrencyType and dispatch all operations using this context on a serial dispatch queue?

Strictly speaking, this concurrency type requires a specific thread, but using a serial queue would be easier. Is it safe to use a context with the NSConfinementConcurrencyType concurrency type on a serial dispatch queue?
As long as you're sure you only use that queue with the context, yes, that's completely fine.
Core Data doesn't care about the thread so much as it cares about concurrent access. If you serialize access, you're safe, however you choose to do it. You could use NSRecursiveLock or semaphores or whatever works for you.
Note that the newer concurrency models are queue based. NSPrivateQueueConcurrencyType does not guarantee that operations are always performed on the same thread, even when you use performBlock:. They happen on a private queue and might run on different threads at different times. If you can manage your queue and your access well enough to do this yourself, it's reasonable to do so.
No, having a serial queue does not guarantee the operations will execute on the same thread:
The Concurrency Programming Guide specifies
Serial queues (also known as private dispatch queues) execute one task at a time in the order in which they are added to the queue. The currently executing task runs on a distinct thread (which can vary from task to task) that is managed by the dispatch queue. Serial queues are often used to synchronize access to a specific resource.
Why don't you just use NSPrivateQueueConcurrencyType? It will make your code cleaner and thread-safe. You just need to call -performBlock: or -performBlockAndWait: when accessing the context from somewhere other than the block that initialized the context.
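In Swift that looks roughly like the sketch below; perform/performAndWait are the Swift names for -performBlock:/-performBlockAndWait:, and the model name "Model" and entity name "Item" are hypothetical:

```swift
import CoreData

let container = NSPersistentContainer(name: "Model")   // "Model" is hypothetical
container.loadPersistentStores { _, error in
    precondition(error == nil)

    // A context with .privateQueueConcurrencyType; all access is
    // funnelled through perform, whatever thread the private queue
    // happens to run its blocks on.
    let context = container.newBackgroundContext()

    context.perform {
        let item = NSEntityDescription.insertNewObject(
            forEntityName: "Item", into: context)      // "Item" is hypothetical
        _ = item
        try? context.save()
    }
}
```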