How does multi-threading work in a Cadence/Temporal workflow?

In Cadence/Temporal workflow programming:
Native threading libraries are not allowed. E.g. in Java, threads must be created through Async.procedure or Async.function, and in Golang, through workflow.Go. Why is that?
Are there race conditions, as with native threading? E.g. should Hashtable or ConcurrentHashMap be used instead of HashMap for thread safety?

Summary:
Workflow execution must be deterministic; this is required for history replay to rebuild thread states. To stay deterministic, Cadence/Temporal controls thread scheduling in a cooperative manner (instead of preemptively, as most operating systems do):
Only one workflow thread can be running at any point in time.
Only when the currently executing workflow thread blocks on something does it yield and let the next workflow thread run.
The order of "next workflow thread" is deterministic.
Therefore:
Native threading libraries are never allowed in workflow code, as Cadence/Temporal would lose the control it needs for determinism.
The race conditions we usually run into never happen, thanks to the cooperative multi-threading. A plain HashMap is safe to use in workflow code.
More Details
Cadence/Temporal SDKs have a DeterministicRunner to manipulate thread execution; see the Java SDK and Golang SDK. The deterministic runner decides which workflow thread to run, in the right order and one at a time. For each decision task, it executes in a loop until all threads are blocked -- RunUntilAllBlocked/ExecuteUntilAllBlocked.
Async.procedure/Async.function/workflow.Go create a new thread and add it to the list in the DeterministicRunner, so that its execution is controlled.
Because only one thread can execute at any time, most race conditions that we run into in regular code cannot happen.
However, this doesn't mean concurrency bugs are impossible: some patterns can still lead to deadlock.
In other words, being cooperative doesn't mean no deadlock at all; it just means deadlock situations are much rarer, and there are no race conditions causing dirty reads, as there can be under preemptive thread scheduling.
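To make the cooperative model concrete, here is a toy Swift sketch of a deterministic runner (invented names; not the SDK's actual code). Each "workflow thread" is a step function that marks itself blocked or finished; the runner executes exactly one thread at a time, in a fixed order, until all of them are blocked -- the shape of RunUntilAllBlocked/ExecuteUntilAllBlocked:

// Toy model of a cooperative, deterministic runner.
final class ToyWorkflowThread {
    let name: String
    var blocked = false
    var finished = false
    private let step: (ToyWorkflowThread) -> Void

    init(name: String, step: @escaping (ToyWorkflowThread) -> Void) {
        self.name = name
        self.step = step
    }

    func run() { step(self) }
}

final class ToyDeterministicRunner {
    private var threads: [ToyWorkflowThread] = []

    func add(_ thread: ToyWorkflowThread) { threads.append(thread) }

    // Run threads one at a time, always in list order, until every
    // thread reports that it is blocked (or has finished).
    func runUntilAllBlocked() {
        var ranSomething = true
        while ranSomething {
            ranSomething = false
            for thread in threads where !thread.blocked && !thread.finished {
                thread.run()       // only this one thread runs right now
                ranSomething = true
            }
        }
    }
}

Because the iteration order is fixed, replaying the same history drives the threads through exactly the same interleaving.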
Deadlock example
If threadA grabs lockA and then waits for an activity, it yields and lets threadB run; threadB then grabs lockB and waits for an activity.
After its activity completes, threadA tries to get lockB before releasing lockA;
threadB tries to get lockA before releasing lockB;
now they run into deadlock once both activities are completed.
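The same lock-ordering deadlock, sketched with native Swift threads purely for illustration (real workflow code would use workflow-level locks and activities, but the shape is identical):

import Foundation

let lockA = NSLock()
let lockB = NSLock()

// threadA: grabs lockA, "waits for an activity", then wants lockB.
Thread.detachNewThread {
    lockA.lock()
    Thread.sleep(forTimeInterval: 0.1)   // stands in for the activity
    lockB.lock()                         // stuck: threadB holds lockB
    lockB.unlock()
    lockA.unlock()
}

// threadB: grabs lockB, "waits for an activity", then wants lockA.
Thread.detachNewThread {
    lockB.lock()
    Thread.sleep(forTimeInterval: 0.1)
    lockA.lock()                         // stuck: threadA holds lockA
    lockA.unlock()
    lockB.unlock()
}

Thread.sleep(forTimeInterval: 1)         // both threads are now deadlocked

Acquiring the locks in the same order in both threads would avoid the deadlock.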
More references
https://community.temporal.io/t/how-does-workflow-thread-synchronization-work/504

Related

Can a Swift Concurrency Task be stored in a queue and started later?

I need to defer the execution of a task until I complete a high priority task, such as re-authenticating, then execute the original task from there. I'm trying to use Swift Concurrency's Task object for this:
Task {
await service.fetch(...)
}
I see that I can cancel a task, but I want to stop it and start it later instead. I was thinking of storing tasks in a queue and flushing the queue after the high-priority task finishes. Can this be done with Swift Concurrency? I'm hoping I don't have to wrap an Operation object with async/await or something similar.
From the WWDC 2021 video "Swift concurrency: Behind the scenes": a task schedules work to execute in the future, but doesn't give you much control over when that work will execute. In particular, even though you can provide a priority for a task, there are situations where you may still have priority inversion in the order tasks execute.
Your best bet for the type of control you seem to want is to use Grand Central Dispatch: have both a high-priority and a low-priority queue feeding into a serial dispatch queue, and put each task in the appropriate queue.
EDIT: I just saw a construct that's new to me (been a while since I was working with GCD daily). It's called a Workloop (https://developer.apple.com/documentation/dispatch/workloop) and sounds like it might be just the ticket without manually tying dispatch queues together.
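One way to get that behavior with Swift Concurrency is to store the work itself rather than a Task, because a Task starts running the moment it is created. A sketch, where DeferredWork is an invented name:

// Holds async closures without starting them; flush() runs them later.
actor DeferredWork {
    private var pending: [() async -> Void] = []

    func enqueue(_ work: @escaping () async -> Void) {
        pending.append(work)
    }

    // Call once the high-priority work (e.g. re-authentication) is done.
    func flush() async {
        let jobs = pending
        pending.removeAll()
        for job in jobs {
            await job()
        }
    }
}

// Usage (service.fetch is the call from the question):
// await deferred.enqueue { await service.fetch(...) }
// ...re-authenticate...
// await deferred.flush()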

Is there a way we might face a race condition when running async tasks on a serial queue?

Suppose multiple async tasks running on a serial queue access the same shared resource; is there any chance we might face a race condition?
Following up on my comment, this is taken from the Apple documentation; the last sentence is exactly what you are looking for:
Serial queues (also known as private dispatch queues) execute one task at a time in the order in which they are added to the queue. The currently executing task runs on a distinct thread (which can vary from task to task) that is managed by the dispatch queue. Serial queues are often used to synchronize access to a specific resource.
If you are using a concurrent queue instead, you could have a race condition. You can prevent it using dispatch barriers, for example. See Grand Central Dispatch In-Depth: Part 1/2 for more details.
The same applies to NSOperation and NSOperationQueue. An NSOperationQueue can be made serial by setting maxConcurrentOperationCount to 1. In addition, using dependencies between operations, you can synchronize access to a shared resource. Both techniques are sketched below.
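Both techniques in a small Swift sketch (the queue label and values are placeholders):

import Foundation

var value = 0

// Concurrent queue with a barrier: reads overlap freely, while the
// barrier block waits for in-flight tasks and then runs exclusively.
let store = DispatchQueue(label: "com.example.store", attributes: .concurrent)
store.async(flags: .barrier) { value = 42 }   // exclusive write
store.sync { print(value) }                   // safe read, prints 42

// OperationQueue made serial: operations touch the resource one at a time.
let opQueue = OperationQueue()
opQueue.maxConcurrentOperationCount = 1
opQueue.addOperation { value += 1 }
opQueue.addOperation { print(value) }         // prints 43
opQueue.waitUntilAllOperationsAreFinished()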
No, you cannot run into a race condition (see what I did there) when running async tasks on a serial queue. The type of queue determines how tasks are executed, whereas synchrony versus asynchrony only concerns the responsiveness of your application while an expensive task completes.
It is easy to run into a race condition on a concurrent queue because tasks there are allowed to execute at the same time, so different threads may be "racing" to perform an action, and in actuality they end up overwriting each other's work. On a serial queue, tasks are executed one at a time, so two threads can't race to complete a task: everything happens in sequential order. Hope that helps!
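To make the distinction concrete, a minimal Swift sketch (the label is a placeholder). Note the serial queue protects the counter only as long as every access goes through it:

import Dispatch

var counter = 0

// Serial queue: the increments run strictly one at a time, so none are lost.
let serial = DispatchQueue(label: "com.example.counter")
for _ in 0..<1_000 {
    serial.async { counter += 1 }
}

// sync enqueues behind all the increments, so this reliably prints 1000.
serial.sync { print(counter) }

// The same loop on a concurrent queue would race: overlapping
// read-modify-write cycles would overwrite each other's updates.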

How does the scheduler know a task is in the blocking state?

I am reading "Embedded Software Primer" by David E.Simon.
In it discusses RTOS and its building blocks Scheduler and Task. It says each Task is either in Ready State, Running State, or Blocking State. My question is how the scheduler determines a Task is in Blocking State? Assume it's waiting for a Semaphore. Then it likely Semaphore is in a state it can't return. Does Scheduler see if a function does not return, then mark its state as Blocking?
The implementation details will vary by RTOS. Generally, each task has a state variable that identifies whether the task is ready, running, or blocked. The scheduler simply reads the task's state variable to determine whether the task is blocked.
Each task has a set of parameters that determine the state and context of the task. These parameters are often stored in a struct and called the "task control block" (although the implementation varies by RTOS). The ready/run/block state variable may be a part of the task control block.
When the task attempts to get the semaphore and the semaphore is not available then the task will be set to the blocked state. More specifically, the semaphore-get function will change the task from running to blocked. And then the scheduler will be called to determine which task should run next. The scheduler will read through the task state variables and will not run those tasks that are blocked.
When another task eventually sets the semaphore then the task that is blocked on the semaphore will be changed from the blocked to the ready state and the scheduler may be called to determine if a context switch should occur.
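A toy Swift model of that flow (invented types, not any particular RTOS): the semaphore-get call is what flips the task's state variable, and the scheduler never picks blocked tasks:

enum TaskState { case ready, running, blocked }

// A minimal "task control block": a name plus the state variable.
final class TCB {
    let name: String
    var state: TaskState = .ready
    init(name: String) { self.name = name }
}

final class ToyScheduler {
    var tasks: [TCB] = []

    // Read the state variables and pick the next runnable task;
    // blocked tasks are simply never chosen.
    func reschedule() {
        for task in tasks where task.state == .running { task.state = .ready }
        if let next = tasks.first(where: { $0.state == .ready }) {
            next.state = .running
        }
    }
}

final class ToySemaphore {
    private var count: Int
    private var waiters: [TCB] = []
    init(count: Int) { self.count = count }

    // Called by the running task: if unavailable, the get call itself
    // marks the task blocked and asks the scheduler to pick another.
    func take(by task: TCB, on scheduler: ToyScheduler) {
        if count > 0 { count -= 1; return }
        task.state = .blocked
        waiters.append(task)
        scheduler.reschedule()
    }

    // Called by another task: the first waiter becomes ready again.
    func give(on scheduler: ToyScheduler) {
        if waiters.isEmpty { count += 1; return }
        waiters.removeFirst().state = .ready
        scheduler.reschedule()
    }
}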
As I'm writing an RTOS (http://distortos.org/), I thought I might chime in.
A variable which holds the state of each thread is indeed common in RTOS implementations, including my own:
https://github.com/DISTORTEC/distortos/blob/master/include/distortos/ThreadState.hpp#L26
https://github.com/DISTORTEC/distortos/blob/master/include/distortos/internal/scheduler/ThreadControlBlock.hpp#L329
However, this variable is usually used only as a debugging aid or for additional checks (like preventing you from starting a thread that is already started).
In RTOSes targeted at deeply embedded systems, the distinction between ready and blocked is usually made using the containers that hold the threads. Threads are usually "chained" into linked lists, typically sorted by priority and insertion time. The scheduler has its own list of threads that are "ready" ( https://github.com/DISTORTEC/distortos/blob/master/include/distortos/internal/scheduler/Scheduler.hpp#L340 ). Each synchronization object (like a semaphore) also has its own list of threads that are "blocked" waiting for this object ( https://github.com/DISTORTEC/distortos/blob/master/include/distortos/Semaphore.hpp#L244 ). When a thread attempts to use a semaphore that is currently not available, it is simply moved from the scheduler's "ready" list to the semaphore's "blocked" list ( https://github.com/DISTORTEC/distortos/blob/master/source/synchronization/Semaphore.cpp#L82 ). The scheduler doesn't need to decide anything; from the scheduler's perspective, this thread is just gone. When the semaphore is later released by another thread, the first thread waiting on the semaphore's "blocked" list is moved back to the scheduler's "ready" list ( https://github.com/DISTORTEC/distortos/blob/master/source/synchronization/Semaphore.cpp#L39 ). A reduced sketch of this list-based approach appears below.
Usually there's no need to make a special distinction between threads that are ready and the thread that is actually running. The number of threads that can actually run is fixed and equal to the number of available CPU cores, so all you need is a pointer per CPU core to the thread from the "ready" list that is running on that core at the moment. In my system I do the same: the thread at the head of the "ready" list is the one that is running, but I also maintain an iterator which points to that thread ( https://github.com/DISTORTEC/distortos/blob/master/include/distortos/internal/scheduler/Scheduler.hpp#L337 ). You could have a separate list for running threads, but in most cases it would be a waste of space (there's usually just one) and would make other things slightly more complicated.
I've actually written an article about thread states and their transitions, if you're interested: http://distortos.org/documentation/task-states/ . The article makes no special distinction between a thread that is "ready" and the one that is actually running; I don't consider that distinction useful for anything, as long as you have other means to tell which of the "ready" threads is running.
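The list-based approach, reduced to a toy Swift sketch (invented names): "blocked" is not a flag on the thread but the fact that it sits in a semaphore's wait list instead of the scheduler's ready list:

final class ToyThread {
    let name: String
    init(name: String) { self.name = name }
}

// The scheduler only ever looks at its own ready list; the head is running.
var readyList: [ToyThread] = []

// Each semaphore keeps its own list of threads blocked on it.
var semaphoreWaiters: [ToyThread] = []

// Semaphore unavailable: move the thread out of the scheduler's sight.
func blockOnSemaphore(_ thread: ToyThread) {
    if let index = readyList.firstIndex(where: { $0 === thread }) {
        readyList.remove(at: index)
        semaphoreWaiters.append(thread)
    }
}

// Semaphore released: the first waiter goes back on the ready list.
func releaseSemaphore() {
    if !semaphoreWaiters.isEmpty {
        readyList.append(semaphoreWaiters.removeFirst())
    }
}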

Networking using run loop

I have an application which uses an external library for analytics. The problem is that I suspect the library does some things synchronously, which blocks my thread and makes the watchdog kill my app after 10 seconds (exception code 0x8badf00d). It is really hard to reproduce (I can't), but there are quite a few cases "in the wild".
I've read some documentation which suggested that, instead of creating another thread, I should use run loops. Unfortunately, the more I read about them, the more confused I get, and the last thing I want to do is release a fix that breaks even more things :/
What I am trying to achieve is:
From the main thread, add a task to the run loop which calls just one function: initMyAnalytics(). My thread continues running, even if initMyAnalytics() gets stuck waiting for network data. After initMyAnalytics() finishes, it quietly quits and never gets called again (so it doesn't loop or anything).
Any ideas how to achieve it? Code examples are welcome ;)
Regards!
You don't need to use a run loop in that case. Run loops' purpose is to process events from various sources sequentially in a particular thread and stay idle when they have nothing to do. Of course, you could detach a thread, create a run loop, add a source for your function and run the run loop until the function ends -- the same way you could use a semi-trailer truck to carry your groceries home.
Here, what you need are dispatch queues. Dispatch queues are first-in-first-out data structures that run tasks asynchronously. In contrast to run loops, a dispatch queue isn't tied to a particular thread: the worker threads are automatically created and terminated as and when required.
As you only have one task to execute, you don't need to create a dispatch queue. Instead you will use an existing global concurrent queue. A concurrent queue executes one or more tasks concurrently, which is perfectly fine in our case. But if we had many tasks to execute and wanted each task to wait for its predecessor to end, we would need a serial queue.
So all you have to do is:
create a task for your function by enclosing it in a block,
get a global queue using dispatch_get_global_queue,
add the task to the queue using dispatch_async:
dispatch_async(dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0), ^{
initMyAnalytics();
});
DISPATCH_QUEUE_PRIORITY_DEFAULT is a macro that evaluates to 0. You can get different global queues with different priorities. The second parameter is reserved for future use and should always be 0.
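The Swift equivalent of that snippet, assuming initMyAnalytics() is the blocking entry point from the question:

import Dispatch

// Placeholder for the blocking library call from the question.
func initMyAnalytics() { /* ... */ }

DispatchQueue.global(qos: .default).async {
    initMyAnalytics()   // may block on the network; the calling thread stays free
}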

Who schedules the scheduler in an OS? Isn't it a chicken-and-egg scenario?

Who schedules the scheduler?
Which is the first task created, and how is it created? Doesn't it require resources and memory? Isn't it like a chicken-and-egg scenario?
Isn't the scheduler itself a task? Does it get the CPU at the end of each time slice to check which task should be given the CPU next?
Are there any good links that make a person think about and deeply understand these concepts, rather than spelling out some theory that has to be learned by heart?
The scheduler is scheduled by
an (external) event such as an interrupt, (disk done, mouse click, timer tick)
or an internal event (such as the completion of a thread, the signalling by a thread that it needs to wait for something, or the signalling of a thread that it has released a resource, or a trap caused by a thread doing something illegal like division by zero)
In short, it is triggered by any event that might require the set of tasks to be run, and/or the priorities of those tasks, to be reevaluated. The scheduler decides which task(s) run next and passes control to the next task.
Typically, this "scheduling" of the scheduler is caused by the code associated with a hardware interrupt, or code associated with a system call.
While you can think of the scheduler as a real thread, in practice it doesn't need to be implemented that way, because it executes with higher priority than any other task. Sophisticated OSes may in fact set aside a special thread that is the scheduler and mark it busy when the scheduler gets control. That makes it pretty, but the bogus thread isn't scheduled by the scheduler.
One can have multiple schedulers: the highest-priority one (e.g., the one we just described) and other schedulers which really are threads and run like other user tasks. Such lower-priority schedulers tend to manage actions which occur at much longer intervals, such as background jobs.
It is usually invoked periodically by a timed CPU interrupt.