What is wrong with a simple variable mutex implementation?

I have this lock() unlock() mutex implementation:
void lock(boolean *m) {
    while (*m == true) { }
    *m = true;
}

void unlock(boolean *m) {
    *m = false;
}
The question is: what is wrong with this kind of approach, besides the obvious performance implications?

Suppose the mutex is initially true (locked) and threads A and B have each called lock() and are spinning. If a third thread C then unlocks it, and A and B both read *m as false before either one sets it back to true, both will get past the lock: the check and the set are two separate operations, not one atomic operation.
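A minimal sketch of the usual fix (not from the original post, written with C++ std::atomic purely for illustration): fuse the test and the set into a single atomic read-modify-write, so at most one waiter can observe the unlocked state and claim it.

#include <atomic>

// exchange() atomically writes true and returns the previous value, so
// exactly one thread sees false and enters the critical section.
void lock(std::atomic<bool> &m) {
    while (m.exchange(true, std::memory_order_acquire)) { /* spin */ }
}

void unlock(std::atomic<bool> &m) {
    m.store(false, std::memory_order_release);
}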

FreeRTOS mutex/binary semaphore and deadlock

I am new to FreeRTOS, so I started with what I think is a great tutorial, the one presented by Shawn Hymel. I'm also running the code that I'm writing on an ESP32 DevkitC V4.
However, I think that I don't understand the difference between binary semaphores and mutexes. When I run this code that tries to avoid deadlock between two tasks that use two mutexes to protect a critical section (as shown in the tutorial):
// Use only core 1 for demo purposes
#if CONFIG_FREERTOS_UNICORE
static const BaseType_t app_cpu = 0;
#else
static const BaseType_t app_cpu = 1;
#endif

// Settings
TickType_t mutex_timeout = 1000 / portTICK_PERIOD_MS;  // Timeout for any task that tries to take a mutex!

// Globals
static SemaphoreHandle_t mutex_1;
static SemaphoreHandle_t mutex_2;

//**********************************************************
// Tasks

// Task A (high priority)
void doTaskA(void *parameters) {
  while (1) {
    // Take mutex 1
    if (xSemaphoreTake(mutex_1, mutex_timeout) == pdTRUE) {
      Serial.println("Task A took mutex 1");
      vTaskDelay(1 / portTICK_PERIOD_MS);
      // Take mutex 2
      if (xSemaphoreTake(mutex_2, mutex_timeout) == pdTRUE) {
        Serial.println("Task A took mutex 2");
        // Critical section protected by 2 mutexes
        Serial.println("Task A doing work");
        vTaskDelay(500 / portTICK_PERIOD_MS);  // simulate that critical section takes 500ms
      } else {
        Serial.println("Task A timed out waiting for mutex 2. Trying again...");
      }
    } else {
      Serial.println("Task A timed out waiting for mutex 1. Trying again...");
    }
    // Return mutexes
    xSemaphoreGive(mutex_2);
    xSemaphoreGive(mutex_1);
    Serial.println("Task A going to sleep");
    vTaskDelay(500 / portTICK_PERIOD_MS);  // Wait to let the other task execute
  }
}

// Task B (low priority)
void doTaskB(void *parameters) {
  while (1) {
    // Take mutex 2 and wait to force deadlock
    if (xSemaphoreTake(mutex_2, mutex_timeout) == pdTRUE) {
      Serial.println("Task B took mutex 2");
      vTaskDelay(1 / portTICK_PERIOD_MS);
      if (xSemaphoreTake(mutex_1, mutex_timeout) == pdTRUE) {
        Serial.println("Task B took mutex 1");
        // Critical section protected by 2 mutexes
        Serial.println("Task B doing work");
        vTaskDelay(500 / portTICK_PERIOD_MS);  // simulate that critical section takes 500ms
      } else {
        Serial.println("Task B timed out waiting for mutex 1");
      }
    } else {
      Serial.println("Task B timed out waiting for mutex 2");
    }
    // Return mutexes
    xSemaphoreGive(mutex_1);
    xSemaphoreGive(mutex_2);
    Serial.println("Task B going to sleep");
    vTaskDelay(500 / portTICK_PERIOD_MS);  // Wait to let the other task execute
  }
}

void setup() {
  Serial.begin(115200);
  vTaskDelay(1000 / portTICK_PERIOD_MS);
  Serial.println();
  Serial.println("---FreeRTOS Deadlock Demo---");

  // Create mutexes
  mutex_1 = xSemaphoreCreateMutex();
  mutex_2 = xSemaphoreCreateMutex();

  // Start task A (high priority)
  xTaskCreatePinnedToCore(doTaskA, "Task A", 1500, NULL, 2, NULL, app_cpu);
  // Start task B (low priority)
  xTaskCreatePinnedToCore(doTaskB, "Task B", 1500, NULL, 1, NULL, app_cpu);

  vTaskDelete(NULL);
}

void loop() {
}
My ESP32 starts rebooting automatically after both tasks take their first mutex, displaying this message:
---FreeRTOS Deadlock Demo---
Task A took mutex 1
Task B took mutex 2
Task A timed out waiting for mutex 2. Trying again...
assert failed: xQueueGenericSend queue.c:832 (pxQueue->pcHead != ((void *)0) || pxQueue->u.xSemaphore.xMutexHolder == ((void *)0) || pxQueue->u.xSemaphore.xMutexHolder == xTaskGetCurrentTaskHandle())
I am unable to interpret the error. However, when I change the definition of the mutexes to binary semaphores in setup():
//create mutexes
mutex_1 = xSemaphoreCreateBinary();
mutex_2 = xSemaphoreCreateBinary();
The code runs fine on the ESP32. Would anyone please explain to me why this happens? Many thanks, and sorry if the question wasn't asked adequately, as this is my first one.
One of the key differences between semaphores and mutexes is the concept of ownership. A semaphore has no owning task: any task can give (release) it, regardless of which task took it. A mutex, on the other hand, is owned by the task that takes it and can only be released by that task.
In your code above, mutex_1 is taken by Task A and mutex_2 is taken by Task B, and at that point Task A is stuck trying to take mutex_2, which it cannot do because Task B owns it. After Task A times out, it unconditionally gives both mutexes back.
The error comes from that ownership rule. Task A can give mutex_1 without a problem because it owns it, but when it tries to give mutex_2 it is not the owner, so FreeRTOS hits the assert in xQueueGenericSend: a task must never give a mutex it does not hold. Binary semaphores have no ownership, so the same unconditional gives are legal and the demo keeps running.
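A minimal way to avoid the assert (a sketch based on the question's own globals mutex_1, mutex_2 and mutex_timeout, not code from the tutorial) is to nest each give inside the matching successful take, so a task only returns what it actually holds:

// Task A body, restructured so that every xSemaphoreGive() is matched to a
// successful xSemaphoreTake() by the same task.
if (xSemaphoreTake(mutex_1, mutex_timeout) == pdTRUE) {
    Serial.println("Task A took mutex 1");
    if (xSemaphoreTake(mutex_2, mutex_timeout) == pdTRUE) {
        Serial.println("Task A doing work");   // critical section
        xSemaphoreGive(mutex_2);               // give only what this task holds
    } else {
        Serial.println("Task A timed out waiting for mutex 2. Trying again...");
    }
    xSemaphoreGive(mutex_1);
}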
If you want to read a little more about the differences between mutexes and semaphores, you can check out this article.

Swift 5.5, when to use `Task.suspend` in custom async implementation?

The new async/await syntax looks great! But I wonder how to write my own asynchronous implementation.
I've stumbled upon this API:
https://developer.apple.com/documentation/swift/task/3862702-suspend (overview in yield)
https://developer.apple.com/documentation/swift/task/3814840-yield (renamed to suspend)
This API allows me to suspend a task manually whenever I choose. The problem is, I'm not sure how I SHOULD do it in order to benefit from concurrency AND avoid bad practices.
In other words, I don't know the best practices for Task.suspend().
for example:
func example() async {
    for i in 0..<100 {
        print("example", i)
        await Task.suspend() // <-- is this OK?
    }
}
Some specific questions:
How often should one call suspend?
Should suspend be called before an intensive operation, or after (for example: IO, crypto, etc.)?
Should there be a maximum number of calls to suspend?
What is the "price" of calling suspend intensively?
When should one NOT call suspend?
Are there any other ways to implement this kind of concurrency (async/await style, not GCD)?
Real-life example: I'm implementing a function that encrypts the contents of a big file. Since it is an IO- and crypto-intensive task, it should be async, and I wonder how to use Task.suspend (or any other async/await tools) to make it asynchronous.
Calling Task.suspend() voluntarily suspends the current task to give any tasks that are waiting some time to run, which is particularly important if you're doing intensive work in a loop and all your tasks use the same priority. Otherwise your heavy task can starve all the other asynchronous code in your app. For instance:
func f() async {
    for _ in 0...10 {
        var arr = (1...10000).map { _ in arc4random() }
        arr.sort()
    }
    print("f")
}

func z() async {
    print("z")
}

// Run in parallel
Task {
    await f()
}
Task {
    await z()
}
Outputs:
f
z
As you can see, z() waits for f(), because f() performs a long-running operation: it sorts a large array many times. To fix this you can add Task.suspend() inside your loop:
func f() async {
    for _ in 0...10 {
        var arr = (1...10000).map { _ in arc4random() }
        arr.sort()
        await Task.suspend() // Voluntarily suspend itself
    }
    print("f")
}
Outputs:
z
f
async/await runs on its own cooperative thread pool, and if you don't want to suspend manually, consider moving your task to a non-default priority, e.g. Task(priority: .background), or running your heavy work on a separate queue.

Mutex does not work as I expected

My Environment: C++ Builder XE4.
I am using a mutex. In the following code, I expect that while Timer1 holds the mutex, the Timer2 handler would be skipped. However, Timer2 was not skipped at all.
What is the problem in the code?
Unit1.cpp
//---------------------------------------------------------------------------
#include <vcl.h>
#pragma hdrstop

#include "Unit1.h"
//---------------------------------------------------------------------------
#pragma package(smart_init)
#pragma resource "*.dfm"
TForm1 *Form1;
//---------------------------------------------------------------------------
__fastcall TForm1::TForm1(TComponent* Owner)
    : TForm(Owner)
{
}
//---------------------------------------------------------------------------
String MutexName = L"Project1";
HANDLE HWNDMutex;

void __fastcall TForm1::FormShow(TObject *Sender)
{
    HWNDMutex = CreateMutex(NULL, false, MutexName.c_str());
    if (HWNDMutex == NULL) {
        String msg = L"failed to create mutex";
        OutputDebugString(msg.c_str());
    }

    Timer1->Enabled = false;
    Timer1->Interval = 1000; // msec
    Timer1->Enabled = true;

    Timer2->Enabled = false;
    Timer2->Interval = 200; // msec
    Timer2->Enabled = true;
}

__fastcall TForm1::~TForm1()
{
    CloseHandle(HWNDMutex);
}

void __fastcall TForm1::Timer1Timer(TObject *Sender)
{
    if (WaitForSingleObject(HWNDMutex, INFINITE) == WAIT_TIMEOUT) {
        return;
    }
    if (CHK_update->Checked) {
        String msg = L"Timer1 " + Now().FormatString(L"yyyy/mm/dd hh:nn:ss.zzz");
        Memo1->Lines->Add(msg);
    }
    for (int loop = 0; loop < 10; loop++) {
        Application->ProcessMessages();
        Sleep(90); // msec
    }
    ReleaseMutex(HWNDMutex);
}
//---------------------------------------------------------------------------
void __fastcall TForm1::Timer2Timer(TObject *Sender)
{
    if (WaitForSingleObject(HWNDMutex, INFINITE) == WAIT_TIMEOUT) {
        return;
    }
    if (CHK_update->Checked) {
        String msg = L">>>Timer2 " + Now().FormatString(L"yyyy/mm/dd hh:nn:ss.zzz");
        Memo1->Lines->Add(msg);
    }
    ReleaseMutex(HWNDMutex);
}
//---------------------------------------------------------------------------
Result
Timer1 2017/11/08 15:20:39.781
>>>Timer2 2017/11/08 15:20:39.786
>>>Timer2 2017/11/08 15:20:40.058
>>>Timer2 2017/11/08 15:20:40.241
>>>Timer2 2017/11/08 15:20:40.423
>>>Timer2 2017/11/08 15:20:40.603
Timer1 2017/11/08 15:20:40.796
>>>Timer2 2017/11/08 15:20:40.799
>>>Timer2 2017/11/08 15:20:41.071
>>>Timer2 2017/11/08 15:20:41.254
>>>Timer2 2017/11/08 15:20:41.436
>>>Timer2 2017/11/08 15:20:41.619
Timer1 2017/11/08 15:20:41.810
>>>Timer2 2017/11/08 15:20:41.811
>>>Timer2 2017/11/08 15:20:42.083
>>>Timer2 2017/11/08 15:20:42.265
>>>Timer2 2017/11/08 15:20:42.448
>>>Timer2 2017/11/08 15:20:42.633
I tried using TMutex with acquire() and release(), but it did not work either.
A mutex has a thread affinity and thus is re-entrant:
A mutex object is a synchronization object whose state is set to signaled when it is not owned by any thread, and nonsignaled when it is owned. Only one thread at a time can own a mutex object, whose name comes from the fact that it is useful in coordinating mutually exclusive access to a shared resource. For example, to prevent two threads from writing to shared memory at the same time, each thread waits for ownership of a mutex object before executing the code that accesses the memory. After writing to the shared memory, the thread releases the mutex object.
...
After a thread obtains ownership of a mutex, it can specify the same mutex in repeated calls to the wait-functions without blocking its execution. This prevents a thread from deadlocking itself while waiting for a mutex that it already owns. To release its ownership under such circumstances, the thread must call ReleaseMutex once for each time that the mutex satisfied the conditions of a wait function.
TTimer is a message-based timer. You have two timers running in the same thread. Which means their OnTimer events are serialized by default in relation to each other. Only one event can be running at a time (unless you do something stupid like call Application->ProcessMessages(), which is a re-entrant nightmare).
Timer2 will trigger first (4-5 times, actually), acquiring and releasing the mutex lock each time, before Timer1 triggers. Then Timer1 triggers, acquires the lock, runs a loop to pump the main UI message queue, thus allowing Timer2 to trigger again (multiple times) while Timer1Timer() is still running. Timer2 will re-acquire and release the same lock that the UI thread already has, so WaitForSingleObject() exits with WAIT_OBJECT_0 immediately. Then the loop ends and Timer1 releases the lock.
Your mutex is useless in this code. A mutex is meant for inter-thread synchronization, but you have no worker threads in this code! You have a single thread synchronizing against itself, which is redundant, and exactly the kind of deadlock-causing situation that many synchronization objects avoid by supporting re-entry.
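To make the re-entrancy concrete, here is a small stand-alone illustration (my own example, not part of the original answer): a thread that already owns a Win32 mutex can wait on it again without blocking, and must then release it once per successful wait.

#include <windows.h>
#include <cstdio>

int main()
{
    HANDLE m = CreateMutexW(NULL, FALSE, NULL);    // unnamed, initially unowned
    WaitForSingleObject(m, INFINITE);              // first acquire: this thread now owns it
    DWORD r = WaitForSingleObject(m, 0);           // same thread waits again: does not block
    std::printf("second wait returned %lu\n", r);  // prints 0, i.e. WAIT_OBJECT_0
    ReleaseMutex(m);                               // one release per successful wait
    ReleaseMutex(m);
    CloseHandle(m);
    return 0;
}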
A critical section also has a thread affinity and is re-entrant, so that is not going to help you, either:
A critical section object provides synchronization similar to that provided by a mutex object, except that a critical section can be used only by the threads of a single process.
...
When a thread owns a critical section, it can make additional calls to EnterCriticalSection or TryEnterCriticalSection without blocking its execution. This prevents a thread from deadlocking itself while waiting for a critical section that it already owns. To release its ownership, the thread must call LeaveCriticalSection one time for each time that it entered the critical section. There is no guarantee about the order in which waiting threads will acquire ownership of the critical section.
However, a semaphore would work for what you are attempting, as it does not have a thread affinity:
A semaphore object is a synchronization object that maintains a count between zero and a specified maximum value. The count is decremented each time a thread completes a wait for the semaphore object and incremented each time a thread releases the semaphore. When the count reaches zero, no more threads can successfully wait for the semaphore object state to become signaled. The state of a semaphore is set to signaled when its count is greater than zero, and nonsignaled when its count is zero.
The semaphore object is useful in controlling a shared resource that can support a limited number of users. It acts as a gate that limits the number of threads sharing the resource to a specified maximum number. For example, an application might place a limit on the number of windows that it creates. It uses a semaphore with a maximum count equal to the window limit, decrementing the count whenever a window is created and incrementing it whenever a window is closed. The application specifies the semaphore object in call to one of the wait functions before each window is created. When the count is zero—indicating that the window limit has been reached—the wait function blocks execution of the window-creation code.
...
A thread that owns a mutex object can wait repeatedly for the same mutex object to become signaled without its execution becoming blocked. A thread that waits repeatedly for the same semaphore object, however, decrements the semaphore's count each time a wait operation is completed; the thread is blocked when the count gets to zero. Similarly, only the thread that owns a mutex can successfully call the ReleaseMutex function, though any thread can use ReleaseSemaphore to increase the count of a semaphore object.
If you switch to a semaphore, your code as shown would deadlock itself as soon as Application->ProcessMessages() is called and the semaphore counter drops to 0, because of your use of INFINITE timeouts. So use smaller timeouts to prevent that.
Try this:
//---------------------------------------------------------------------------
#include <vcl.h>
#pragma hdrstop

#include "Unit1.h"
//---------------------------------------------------------------------------
#pragma package(smart_init)
#pragma resource "*.dfm"
TForm1 *Form1;
//---------------------------------------------------------------------------
__fastcall TForm1::TForm1(TComponent* Owner)
    : TForm(Owner)
{
}
//---------------------------------------------------------------------------
HANDLE hSemaphore;

void __fastcall TForm1::FormShow(TObject *Sender)
{
    hSemaphore = CreateSemaphore(NULL, 1, 1, NULL);
    if (hSemaphore == NULL) {
        OutputDebugString(L"failed to create semaphore");
    }

    Timer1->Enabled = false;
    Timer1->Interval = 1000; // msec
    Timer1->Enabled = true;

    Timer2->Enabled = false;
    Timer2->Interval = 200; // msec
    Timer2->Enabled = true;
}

__fastcall TForm1::~TForm1()
{
    if (hSemaphore)
        CloseHandle(hSemaphore);
}

void __fastcall TForm1::Timer1Timer(TObject *Sender)
{
    if (WaitForSingleObject(hSemaphore, 0) != WAIT_OBJECT_0) {
        return;
    }
    if (CHK_update->Checked) {
        String msg = L"Timer1 " + Now().FormatString(L"yyyy/mm/dd hh:nn:ss.zzz");
        Memo1->Lines->Add(msg);
    }
    for (int loop = 0; loop < 10; loop++) {
        Application->ProcessMessages();
        Sleep(90); // msec
    }
    ReleaseSemaphore(hSemaphore, 1, NULL);
}
//---------------------------------------------------------------------------
void __fastcall TForm1::Timer2Timer(TObject *Sender)
{
    if (WaitForSingleObject(hSemaphore, 0) != WAIT_OBJECT_0) {
        return;
    }
    if (CHK_update->Checked) {
        String msg = L">>>Timer2 " + Now().FormatString(L"yyyy/mm/dd hh:nn:ss.zzz");
        Memo1->Lines->Add(msg);
    }
    ReleaseSemaphore(hSemaphore, 1, NULL);
}
//---------------------------------------------------------------------------
On a side note: beware of giving a kernel-based synchronization object a name. That allows other processes to access it and mess around with its state behind your back. Don't name objects that you don't intend to share across process boundaries! Mutexes and semaphores are namable objects.

Boost ASIO asynchronous socket with timeout

I am trying to find the proper / canonical way to implement the code below, which provides a synchronous wrapper around async asio methods in order to have a timeout. The code appears to work, but none of the examples I have looked at use a boolean in the lambda to terminate the do/while loop that runs the I/O service, so I'm not sure whether this is the proper form or whether it will have unintended consequences down the road. Some do things like
while (IOService.run_one());
but that never terminates.
Edit:
I'm trying to follow this example:
http://www.boost.org/doc/libs/1_53_0/doc/html/boost_asio/example/timeouts/blocking_tcp_client.cpp
But in this code they avoid needing the number of bytes read by using a \n terminator. I need the number of bytes read, hence the callback.
I have seen many other solutions that use boost async futures as well as other methods, but they do not seem to compile with the versions of gcc / boost standard for Ubuntu 16.04 and I would like to stay with those versions.
ByteArray SessionInfo::Read(const boost::posix_time::time_duration &timeout)
{
    Deadline.expires_from_now(timeout);
    auto bytes_received = 0lu;
    auto got_callback = false;

    SessionSocket->async_receive(
        boost::asio::buffer(receive_buffer_, 1024),
        [&bytes_received, &got_callback](const boost::system::error_code &error,
                                         std::size_t bytes_transferred) {
            bytes_received = bytes_transferred;
            got_callback = true;
        });

    do
    {
        IOService.run_one();
    } while (!got_callback);

    auto bytes = ByteArray(receive_buffer_, receive_buffer_ + bytes_received);
    return bytes;
}
This is how I'd do it: the first event that fires cancels the other, so io_service::run() returns once both handlers have completed.
ByteArray SessionInfo::Read(const boost::posix_time::time_duration &timeout)
{
    Deadline.expires_from_now(timeout); // I assume this is a member of SessionInfo
    auto got_callback{false};
    auto result = ByteArray();

    SessionSocket->async_receive( // idem for SessionSocket
        boost::asio::buffer(receive_buffer_, 1024),
        [&](const boost::system::error_code ec, std::size_t bytes_received)
        {
            if (!ec)
            {
                result = ByteArray(receive_buffer_, receive_buffer_ + bytes_received);
                got_callback = true;
            }
            Deadline.cancel();
        });

    Deadline.async_wait([&](const boost::system::error_code ec)
        {
            if (!ec)
            {
                SessionSocket->cancel();
            }
        });

    IOService.run();
    return result;
}
Reading the conversation below M. Roy's answer, your goal is to make sure that IOService.run() returns. All the points made there are valid: an instance of boost::asio::io_service should only be run by one thread of execution at a time (it can be run multiple times in series, just not simultaneously), so it is imperative to know how it is used elsewhere. That said, to make the IOService stop I would amend M. Roy's solution like so:
ByteArray SessionInfo::Read(const boost::posix_time::time_duration &timeout)
{
    Deadline.expires_from_now(timeout);
    auto got_callback{false};
    auto result = ByteArray();

    SessionSocket->async_receive(
        boost::asio::buffer(receive_buffer_, 1024),
        [&](const boost::system::error_code ec, std::size_t bytes_received)
        {
            if (!ec)
            {
                result = ByteArray(receive_buffer_, receive_buffer_ + bytes_received);
                got_callback = true;
            }
            Deadline.cancel();
        });

    Deadline.async_wait(
        [&](const boost::system::error_code ec)
        {
            if (!ec)
            {
                SessionSocket->cancel();
                IOService.stop();
            }
        });

    IOService.run();
    return result;
}
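One caveat worth adding (my own note, not from either answer): once run() has returned, whether because it ran out of work or because stop() was called, the io_service will not dispatch handlers again until it is reset. So if Read() is called repeatedly on the same IOService, something like the following is needed between calls (names taken from the question's code):

// Hypothetical caller-side helper: Read() is the member function shown above.
ByteArray ReadAgain(SessionInfo &session, boost::asio::io_service &io)
{
    io.reset();  // restart() in newer Boost; makes run() able to dispatch work again
    return session.Read(boost::posix_time::seconds(2));
}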

Semaphore implementation : why is disabling interrupts required along with test-and-set?

Going over this sample semaphore implementation (for SMP systems), I understand that test-and-set is required for atomic checks across multiple processors. However, once we add the atomic checks, isn't disabling interrupts redundant? Disabling interrupts, after all, only offers atomicity over one processor. The addition to the semaphore queue also needs to be protected.
class semaphore {
    private int t;
    private int count;
    private queue q;

    public semaphore(int init)
    {
        t = 0;
        count = init;
        q = new queue();
    }

    public void P()
    {
        Disable interrupts;
        while (TAS(t) != 0) { /* just spin */ };
        if (count > 0) {
            count--;
            t = 0;
            Enable interrupts;
            return;
        }
        Add process to q;
        t = 0;
        Enable interrupts;
        Redispatch;
    }

    public V()
    {
        Disable interrupts;
        while (TAS(t) != 0) { /* just spin */ };
        if (q == empty) {
            count++;
        } else {
            Remove first process from q;
            Wake it up;
        }
        t = 0;
        Enable interrupts;
    }
}
While it is true that turning interrupts off on one processor is insufficient to guarantee atomic memory access in a multiprocessor system (because, as you mention, threads on other processors can still access shared resources), we turn interrupts off for part of the multiprocessor semaphore implementation because we do not want to be descheduled while we are holding the test-and-set lock.
If a thread holding the test-and-set is descheduled, no other thread can do anything with the semaphore while the holder is asleep, because the count (and the queue) is protected by that test-and-set; every other processor just spins until the preempted thread runs again, which is not good. To guarantee that this cannot happen, we turn interrupts off on our own processor while we are using the test-and-set.
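As a rough illustration of how the two mechanisms pair up (my own sketch, not from the notes above; disable_interrupts()/enable_interrupts() stand in for whatever per-CPU primitives the kernel actually provides):

#include <atomic>

std::atomic_flag t = ATOMIC_FLAG_INIT;   // the TAS word guarding count and q

void acquire_guard() {
    // disable_interrupts();  // hypothetical kernel primitive: this CPU can no
    //                        // longer be preempted while spinning on or holding t
    while (t.test_and_set(std::memory_order_acquire)) {
        /* just spin: cross-CPU mutual exclusion */
    }
}

void release_guard() {
    t.clear(std::memory_order_release);   // t = 0
    // enable_interrupts();   // hypothetical kernel primitive
}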