How mutex guarantee ownership in freeRTOS? - mutex

I'm playing with Mutex in freeRTOS using esp32. in some documents i have read that mutex guarantee ownership, which mean if a thread (let's name it task_A) locks up a critical resource (take token) other threads (task_B and task_C) will stay in hold mode waiting for that resource to be unlocked by the same thread that locked it up(which is task_A). i tried to prove that by setting up the other tasks (task_B and task_C) to give a token before start doing anything and just after that it will try to take a token from the mutex holder, which is surprisingly worked without showing any kid of error.
Well, the method i used to verify or display how things works i created a display function that read events published (set and cleared) by each task (when it's in waiting mode it set the waiting bit up if it's working it will set the working bit up etc..., you get the idea). and a simple printf() in case of error in take or give function ( xSemaphoreTake != true and xSemaphoreGive != true).
I can't use the debug mode because i don't have any kind of micro controller debugger.
This is an example of what i'm trying to do:
i created many tasks and each one will call this function but in different time with different setup.
void vVirtualResource(int taskId, int runTime_ms){
int delay_tick = 10;
int currentTime_tick = 0;
int stopTime_tick = runTime_ms/portTICK_PERIOD_MS;
if(xSemaphoreGive(xMutex)!=true){
printf("Something wrong in giving first mutex's token in task id: %d\n", taskId);
}
while(xSemaphoreTake(xMutex, 10000/portTICK_PERIOD_MS) != true){
vTaskDelay(1000/portTICK_PERIOD_MS);
}
// notify that the task with <<task id>> is currently running and using this resource
switch (taskId)
{
case 1:
xEventGroupClearBits(xMutexEvent, EVENTMASK_MUTEXTSK1);
xEventGroupSetBits(xMutexEvent, EVENTRUN_MUTEXTSK1);
break;
case 2:
xEventGroupClearBits(xMutexEvent, EVENTMASK_MUTEXTSK2);
xEventGroupSetBits(xMutexEvent, EVENTRUN_MUTEXTSK2);
break;
case 3:
xEventGroupClearBits(xMutexEvent, EVENTMASK_MUTEXTSK3);
xEventGroupSetBits(xMutexEvent, EVENTRUN_MUTEXTSK3);
break;
default:
break;
}
// start running the resource
while(currentTime_tick<stopTime_tick){
vTaskDelay(delay_tick);
currentTime_tick += delay_tick;
}
// gives back the token
if(xSemaphoreGive(xMutex)!=true){
printf("Something wrong in giving mutex's token in task id: %d\n", taskId);
}
}
You will notice that for the very first time, the first task that will start running in the processor will print out the first error message because it can't give a token while there still a token in the mutex holder, it's normal, so i just ignore it.
Hope someone can explain to me how mutex guarantee ownership using code in freeRTOS. In the first place i didn't use the first xSemaphoreGive function and it worked fine. but that doesn't mean it guarantee anything. or i'm not coding right.
Thank you.

Your example is quite convoluted, I also don't see clear code of task_A, task_B or task_C so I'll try to explain on a simplier example which hopefully explains how mutex guarantees resource ownership.
The general approach to working with mutexes is the following:
void doWork()
{
// attempt to take mutex
if(xSemaphoreTake(mutex, WAIT_TIME) == pdTRUE)
{
// mutex taken - do work
...
// release mutex
xSemaphoreGive(mutex);
}
else
{
// failed to take mutex for 'WAIT_TIME' amount of time
}
}
The doWork function above is the function that may be called by multiple threads at the same time and needs to be protected. This pattern repeats for every function on given resource that needs protection. If resource is more complex, a good approach is to guard the top-most functions that are callable by threads, then if mutex is successfully taken call internal functions that do the actual work.
The ownership guarantee you speak about is the fact that there may not be more than one context (threads, but also interrupts) that are under the if(xSemaphoreTake(mutex, WAIT_TIME) == pdTRUE) statement. In other words, if one context successfully takes the mutex, it is guaranteed that no other context will be able to also take it, unless the original context releases it with xSemaphoreGive first.
Now as for your scenario - while it is not entirely clear to me how it's supposed to work, I can see two issues with your code:
xSemaphoreGive at the beginning of the function - don't do that. Mutexes are by default "given" and you're not supposed to be "giving" it if you aren't the one "taking" it first. Always put a xSemaphoreGive under a successful xSemaphoreTake and nowhere else.
This code block:
while(xSemaphoreTake(xMutex, 10000/portTICK_PERIOD_MS) != true){
vTaskDelay(1000/portTICK_PERIOD_MS);
}
If you need to wait for mutex for longer - specify a longer time. If you want infinite wait, simply specify longest possible time (0xFFFFFFFF). In your scenario, you're polling for mutex every 10s, then delay for 1s during which mutex isn't actually checked, meaning there will be cases where you'll have to wait almost a full second after mutex is released by other thread to start doing work in the current thread that requested it. Waiting for mutex is already done by RTOS in an optimal way - it'll wake the highest priority task currently waiting for the mutex as soon as it's released, there's no need to do more than necessary.
If I was to give an advice of how to fix your example - simplify it and don't do more than needed such as additional calls to xSemaphoreGive or implementing your own waiting for mutex. Isolate the portion of code that performs some work to a separate function that does a single call to xSemaphoreTake at the very top and a single call to xSemaphoreGive only if xSemaphoreTake succeeds. Then call this function from different threads to test whether it works.

Related

UVM End of test Objection Mechanism and Phase Ready to End Implementation

I am exploring different ways to end a UVM test. One method that has come often from studying different blogs from Verification Academy and other sites is to use the Phase Ready to End. I have some questions regarding the implementation of this method.
I am using this method in scoreboard class, where my understanding is after my usual run phase is finished, it will call the phase ready to end method and implement it. The reason I am using it my scoreboard's run_phase finishes early, and there are some data into queues that need to be processed. So I am trying to prolong this scoreboard run_phase using this method. Here are is some pseudo-code that I have used.
function void phase_ready_to_end(uvm_phase phase);
if (phase.get_name() != "run") return;
if (queue.size() != 0) begin
phase.raise_objection(.obj(this));
fork
begin
delay_phase(phase);
end
join_none
end
endfunction
task delay_phase(uvm_phase phase);
wait(queue.size() == 0);
phase.drop_objection(.obj(this));
endtask
I have taken inspiration for this implementation from this link UVM-End of Test Mechanism for your reference. Here are some of the ungated thoughts in my mind on which I need guidance and help.
to the best of my understanding the phase_ready_to_end is called at the end of run_phase and when it runs it raises the objection for that scoreboard run_phase and runs delay_phase task.
That Delay Phase task is just waiting for the queue to end, but I am not seeing any method or task which will pop the items from the queue. Does I have to call some method to pop from the queue or as according to the 1st point above the raised objection will start the run phase so there is no need for that and we have to wait for a considerable amount of time?
Let me give you some pre-context to this question. I have a scoreboard where there are two queues whose write methods are implemented and they are being fed correctly by their source.
task run_phase (uvm_phase phase);
forever begin
compare_queues(); // this method takes data from two queues and compares them, both queues implementation are fine and they take data from their respective sources. Let me give you a scenario, let's suppose there are a total of 10 transactions being generated but the scoreboard was able to process only 6 of them and there are 4 transactions left when all objections are dropped. So to tackle that I implement this phase_to_ready_end method in my scoreboard.
end
endtask
The problem with this method that I am having is that, when I raise the objection in this phase_ready_to_end and call delay_phase method, nothing happens. And I am curious is there more to this implementation or not?
Sorry for the delay. I have shared more context to the existing question. Please see to that, let me know if it is confusing.
We have a pair of monitors that calls write method implemented inside the scoreboard. The monitors typically capture the transaction from BUS and call these WR methods to push the transactions. Thus two source and destination monitors WR into two - source and destination - queues as and when they find the transactions.
We have a checker task with RD-n-check running in forever loop in the run-phase of scoreboard. It's in a while loop and watches if the destination queue has non-zero entry. Once it finds so, it pops the head entry from destination queue and then pops the head entry from source queue as well and compares the two entries to declare if the check was a PASS or FAIL.
There are more than 2 queues and more than a pair of source/destination of course, but broadly this is the architecture around here.
Now in the current scenario, it seems that the checker tasks stop prints after certain point of time in some of the test cases. Upon adding debug prints thoroughly, it seems that checker tasks that does the job #2/#3 above and gets called inside the forever loop of the run-phase, exits gracefully one last time. However they are entered again - which is to say that the forever loop that should be calling them didn't call. As if the forever loop of run-phase stopped completely.
We also added another forever loop in run-phase that observes whether the queues are empty. From prints inside that parallel loop and from the monitor prints, we know that the queues aren't empty and monitors did push WRs into the queues for a long time.
It seems that the forever loop stopped working suddenly ( going by prints spewed out) all of a sudden but another set of threads that we added in runphase in another forever loop just to monitor those queues - keep printing that the queues have contents. So run-phase shouldn't be over but the checker tasks running in forever has stopped.
We are using Vivado 2020.2 for the simulation. This is a baffling/weird problem for us and we did go through prints multiple times to make sure nothing has been missed out. It seems we are missing very very basic or has hit a bug/broken some basics of UVM coding to land into here.
If you have any help, thoughts here, will appreciate that greatly.
The function phase_ready_to_end() gets called at the end of every task-based phase when all objections have been dropped (or never raised at all).
Typically a scoreboard has a queue or some kind of array of transactions waiting to be checked sent from a monitor via an analysis_port write() method. If your scoreboard is an in-order comparison checker, the queue size is zero when there are no more transactions waiting to be received.
If you look at the code in the link you shared, there is the following in the write_south method doing exactly that:
if (!item.compare(item_stream.pop_front()))

What happens if a thread is in the critical section or entering the critical section?

I am trying to better understand a chapter and have been confused about what happens if a thread is in the critical section or is entering the critical section. May someone explain or give me an idea on the process of what the thread undergoes in such circumstances? Thank you.
For an example, let's assume that you have an array, and multiple threads that read and write to the array; and if different threads are reading and writing to the array at the same time they'd see inconsistent data and it'd cause problems. To prevent those problems you protect the array with some kind of lock - before doing anything with the array a thread acquires the array's lock, and when it's finished using the array the thread releases the array's lock.
For example:
acquire_array_lock();
/** Critical section (code that does something with the array) **/
release_array_lock();
There's nothing special about the code in the critical section. It does whatever it was designed to do (maybe sorting the array, maybe adding up all the numbers in the array, maybe displaying the array, etc) using code that's no different to code that you might use to do the same thing in a single-threaded system without locks.
The only special parts are the code to acquire and release the lock.
There are many types of locks (spinlocks, mutexes, semaphores), but they all have the same fundamental principle - when acquiring it you have something (e.g. a variable) to determine if a thread can/can't continue, then either (if the thread can't continue) some kind of waiting or (if the thread can continue) some kind of change to let others know they need to wait; and when releasing you have something to let others know they can stop waiting.
The main difference between different kinds of locks is the implementation details - what kind of data is used to determine if a thread can/can't continue, and how a thread waits.
For the simplest kind of lock (a spinlock) you might just have a single "yes/no" flag, a little bit like this (but not literally like this):
acquire_lock(void) {
while(myLock == 0) {
// do nothing then retry
}
myLock = 1;
}
release_lock(void) {
myLock = 0;
}
However this won't work because two or more threads can see that myLock == 0 at the same time and think they can both continue (and then do the myLock = 1 after it's too late). To fix this you need assembly language or special language support for atomic operations (e.g. a special function for "test and set" or "compare and exchange").
The reason this is called a "spinlock" is that (if a thread needs to wait) it wastes CPU time continually checking ("spinning") to see if it can continue. Instead of doing this (to avoid wasting CPU time), a thread could tell a scheduler not to give it any CPU time until the lock is released; and this is how a mutex works.

Modification to "Implementing an N process barrier using semaphores"

Recently I see this problem which is pretty similar to First reader/writer problem.
Implementing an N process barrier using semaphores
I am trying to modify it to made sure that it can be reuse and work correctly.
n = the number of threads
count = 0
mutex = Semaphore(1)
barrier = Semaphore(0)
mutex.wait()
count = count + 1
if (count == n){ barrier.signal()}
mutex.signal()
barrier.wait()
mutex.wait()
count=count-1
barrier.signal()
if(count==0){ barrier.wait()}
mutex.signal()
Is this correct?
I'm wondering if there exist some mistakes I didn't detect.
Your pseudocode correctly returns barrier back to initial state. Insignificant suggestion: replace
barrier.signal()
if(count==0){ barrier.wait()}
with IMHO more readable
if(count!=0){ barrier.signal()} //if anyone left pending barrier, release it
However, there may be pitfalls in the way, barrier is reused. Described barrier has two states:
Stop each thread until all of them are pending.
Resume each thread until all of them are running
There is no protection against mixing them: some threads are being resumed, while other have already hit the first stage again. For example, you have bunch of threads, which do some stuff and then sync up on barrier. Each thread body would be:
while (!exit_condition) {
do_some_stuff();
pend_barrier(); // implementation from current question
}
Programmer expects, that number of calls for do_some_stuff() will be the same for all threads. What may (or may not) happen depending on timing: the first thread released from barrier finishes do_some_stuff() calculations before all threads have left the barrier, so it reentered pending for the second time. As a result he will be released along with other threads in the current barrier release iteration and will have (at least) one more call to do_some_stuff().

The reason why Task deletion of uCOS should not occur during ISR

I'm modifying some functionalities (mainly scheduling) of uCos-ii.
And I found out that OSTaskDel function does nothing when it is called by ISR.
Though I learned some basic features of OS, I really don't understand why that should be prohibited.
All it does is withrawl from readylist and release of acquired resources like TCB or semaphores...
Is there any reason for them to be banned while handling interrupt?
It is not clear from the documentation why it is prohibited in this case, but OSTaskDel() explicitly calls OS_Sched(), and in an ISR this should only happen when the outer-most nested interrupt handler exists (handled by OSIntExit()).
I don't think the following is advisable, because there may be other reasons why this is prohibited, but you could remove the:
if (OSIntNesting > 0) {
return (OS_TASK_DEL_ISR);
}
then make the OS_Sched() call conditional as follows:
if (OSIntNesting == 0) {
OS_Sched();
}
If this dies horribly, remember I said it was ill-advised!
This operation will extend your interrupt processing time in any case so is probably a bad idea if only for that reason.
It is a bad idea in general (not just from an ISR) to asynchronously delete another task regardless of that tasks state or resource usage. uC/OS-II provides the OSTaskDelReq() function to manage task deletion in a way that allows a task to delete itself on request and therefore be able to correctly release all its resources. Even without that, sending a request via the task's normal IPC mechanisms is usually better (and more portable).
If a task is not designed for self-deletion on demand, then you might simply use OSSuspend().
Generally, you cannot do a few things in ISRs:
block on a semaphore and the like
block while acquiring a spin lock, if it's a single-CPU system
cause a page fault, that has to be resolved by the virtual memory subsystem (with virtual on-disk memory, that is)
If you do any of the above in an ISR, you'll have a deadlock.
OSTaskDel() is probably doing some of those things.

Does puting a block on a sync GCD queue locks that block and pauses the others?

I read that GCD synchronous queues (dispatch_sync) should be used to implement critical sections of code. An example would be a block that subtracts transaction amount from account balance. The interesting part of sync calls is a question, how does that affect the work of other blocks on multiple threads?
Lets imagine the situation where there are 3 threads that use and execute both system and user defined blocks from main and custom queues in asynchronous mode. Those block are all executed in parallel in some order. Now, if a block is put on a custom queue with sync mode, does that mean that all other blocks (including on other threads) are suspended until the successful execution of the block? Or does that mean that only some lock will be put on that block while other will still execute. However, if other blocks use the same data as the sync block then it's inevitable that other blocks will wait until that lock will be released.
IMHO it doesn't matter, is it one or multiple cores, sync mode should freeze the whole app work. However, these are just my thoughts so please comment on that and share your insights :)
Synchronous dispatch suspends the execution of your code until the dispatched block has finished. Asynchronous dispatch returns immediately, the block is executed asynchronously with regard to the calling code:
dispatch_sync(somewhere, ^{ something });
// Reached later, when the block is finished.
dispatch_async(somewhere, ^{ something });
// Reached immediately. The block might be waiting
// to be executed, executing or already finished.
And there are two kinds of dispatch queues, serial and concurrent. The serial ones dispatch the blocks strictly one by one in the order they are being added. When one finishes, another one starts. There is only one thread needed for this kind of execution. The concurrent queues dispatch the blocks concurrently, in parallel. There are more threads being used there.
You can mix and match sync/async dispatch and serial/concurrent queues as you see fit. If you want to use GCD to guard access to a critical section, use a single serial queue and dispatch all operations on the shared data on this queue (synchronously or asynchronously, does not matter). That way there will always be just one block operating with the shared data:
- (void) addFoo: (id) foo {
dispatch_sync(guardingQueue, ^{ [sharedFooArray addObject:foo]; });
}
- (void) removeFoo: (id) foo {
dispatch_sync(guardingQueue, ^{ [sharedFooArray removeObject:foo]; });
}
Now if guardingQueue is a serial queue, the add/remove operations can never clash even if the addFoo: and removeFoo: methods are called concurrently from different threads.
No it doesn't.
The synchronised part is that the block is put on a queue but control does not pass back to the calling function until the block returns.
Many uses of GCD are asynchronous; you put a block on a queue and rather than waiting for the block to complete it's work control is passed back to the calling function.
This has no effect on other queues.
If you need to serialize the access to a certain resource then there are at least two
mechanisms that are accessible to you. If you have an account object (that is unique
for a given account number), then you can do something like:
#synchronize(accountObject) { ... }
If you don't have an object but are using a C structure for which there is only one
such structure for a given account number then you can do the following:
// Should be added to the account structure.
// 1 => at most 1 object can access accountLock at a time.
dispatch_semaphore_t accountLock = dispatch_semaphore_create(1);
// In your block you do the following:
block = ^(void) {
dispatch_semaphore_wait(accountLock,DISPATCH_TIME_FOREVER);
// Do something
dispatch_semaphore_signal(accountLock);
};
// -- Edited: semaphore was leaking.
// At the appropriate time release the lock
// If the semaphore was created in the init then
// the semaphore should be released in the release method.
dispatch_release(accountLock);
With this, regardless of the level of concurrency of your queues, you are guaranteed that only one thread will access an account at any given time.
There are many more types of synchronization objects but these two are easy to use and
quite flexible.