Bounded-Waiting Mutual Exclusion with Compare-and-Swap

I added comments to each line to the best of my understanding, but I still don't get why we set waiting[j] = false; at the end without running process j's critical section. In my opinion, waiting[j] = false; should be replaced with i = j; so when it loops again, we run process j's critical section. Or else we'll always be running process i's critical section!

For a process to enter its critical section, waiting[i] must be false, but waiting[i] can only be set to false if a process is leaving its critical section by calling waiting[j] = false, which I take to mean that now process j can enter its critical section, prompting process i to wait. I'm still learning these concepts so I'm not 100% sure. The 9th edition of Operating System Concepts (Silberschatz, Galvin, and Gagne) does not do a very thorough job of explaining these algorithms.

Validity of the Algorithm
First, it is important to note that this algorithm solves the critical section problem only when there are two processes (here they are referred to as process 0 and process 1).
Next, as per the convention in Operating System Concepts by Abraham Silberschatz, Peter B. Galvin, and Greg Gagne, i refers to one of the processes 0 and 1, and j refers to the other process.
Mapping of right code to right process
Having said that, note that this code is for process i. The code for process j is obtained by interchanging i and j in the given code. (In my opinion this is what confused you, since you said the following.)
"waiting[j] = false; should be replaced with i = j; so when it loops again, we run process j's critical section. Or else we'll always be running process i's critical section!"
Finally, consequences of waiting[j] = false (which happens in Process i)
Now, both of these codes execute as two different processes in the system. So as soon as you set waiting[j] = false in the last line of Process i, the following events occur:
The condition in while(waiting[j] && key == 1) in the code of Process j (obtained by replacing i with j, as explained under the previous heading, Mapping of right code to right process) becomes false, and therefore Process j resumes its execution by entering the critical section.
Meanwhile, Process i loops back to the top of the outermost while(true) loop, sets waiting[i] to true and key to 1, and waits by spinning in the while(waiting[i] && key == 1) loop.
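The hand-off described above can be tried out with two threads. The listing under discussion is not reproduced in this thread, so the following Python sketch assumes the usual textbook shape of the entry and exit sections; in particular, the branch that releases the lock when nobody is waiting is an assumption, as are the names run_two_workers, lock_word and cas_guard. A Python lock stands in for the atomicity that a hardware compare-and-swap would provide.

```python
import threading
import time


def run_two_workers(rounds=50):
    """Run two threads through the entry/exit protocol; return (count, max_inside)."""
    lock_word = [0]            # shared lock: 0 = free, 1 = held
    waiting = [False, False]   # waiting[i] is True while worker i is trying to enter
    state = {"count": 0, "inside": 0, "max_inside": 0}
    cas_guard = threading.Lock()  # stands in for the atomicity of a hardware CAS

    def compare_and_swap(expected, new):
        # Atomically replace lock_word[0] if it equals expected; return the old value.
        with cas_guard:
            old = lock_word[0]
            if old == expected:
                lock_word[0] = new
            return old

    def worker(i):
        j = 1 - i  # the other worker
        for _ in range(rounds):
            # Entry section: spin until we grab the lock or the other worker lets us in.
            waiting[i] = True
            key = 1
            while waiting[i] and key == 1:
                key = compare_and_swap(0, 1)
                time.sleep(0)  # yield so the other thread gets a chance to run
            waiting[i] = False

            # Critical section: a deliberately non-atomic read-modify-write.
            state["inside"] += 1
            state["max_inside"] = max(state["max_inside"], state["inside"])
            current = state["count"]
            time.sleep(0)
            state["count"] = current + 1
            state["inside"] -= 1

            # Exit section.
            if waiting[j]:
                waiting[j] = False  # let j in directly; the lock stays held
            else:
                lock_word[0] = 0    # nobody is waiting, so release the lock

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"], state["max_inside"]
```

Calling run_two_workers() returns the final counter and the largest number of workers ever seen inside the critical section at once. Note that worker i never runs worker j's critical section: setting waiting[j] to False only ends j's own spin loop, and j then enters by itself.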

Related

Anylogic: Queue TimeOut blocks flow

I have a pretty simple Anylogic DE model where POs are launched regularly, and a certain amount of material gets to the incoming Queue in one shot (See Sample Picture below). Then the Manufacturing process starts using that material at a regular rate, but I want to check if the material in the queue gets outdated, so I'm using the TimeOut option of that queue, in order to scrap the outdated material (older than 40wks).
The problem is that every time that some material gets scrapped through this Timeout exit, the downstream Manufacturing process "stops" pulling more material, instead of continuing, and it does not get restarted until a new batch of material gets received into the Queue.
What am I doing wrong here? Thanks a lot in advance!!
Kindest regards
Your situation is interesting because there doesn't seem to be anything wrong with what you're doing. So even though your setup seems to be correct, I will provide you with a workaround. Instead of the Queue block, use a Wait block. You can assign a timeout and link the timeout port just like you did for the queue (see the image at the end of the answer).
In the On Enter field of the wait block (which I will assume is named Fridge), write the following code:
if( MFG.size() < MFG.capacity ) {
self.free(agent);
}
In the On Enter of MFG block write the following:
if( self.size() < self.capacity && Fridge.size() > 0 ) {
Fridge.free(Fridge.get(0));
}
And finally, in the On Exit of your MFG block write the following:
if( Fridge.size() > 0 ) {
Fridge.free(Fridge.get(0));
}
What the code above does is push the agents manually: each time an agent enters or leaves MFG, the model checks whether there is capacity to send more, and if so, the next agent is released from Fridge.
I know this is an unpleasant workaround, but it provides you with a solution until AnyLogic support can figure it out.
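The push logic in the three hooks above can be mimicked outside AnyLogic to check that a timeout does not stall the flow. This is only a toy model in Python, not AnyLogic code: the class PushModel and its method names are made up for illustration, and it assumes that a freed agent enters MFG immediately and that MFG's size already counts the entering agent.

```python
from collections import deque


class PushModel:
    """Toy stand-in for a Wait block (Fridge) feeding a capacity-limited MFG block."""

    def __init__(self, capacity):
        self.capacity = capacity  # plays the role of MFG.capacity
        self.fridge = deque()     # agents held in the Wait block
        self.mfg = []             # agents currently inside MFG

    def _free_first(self):
        # Mirrors Fridge.free(Fridge.get(0)): release the oldest held agent.
        agent = self.fridge.popleft()
        self.mfg_enter(agent)

    def fridge_enter(self, agent):
        # Mirrors the Wait block's "On enter" hook.
        self.fridge.append(agent)
        if len(self.mfg) < self.capacity:
            self.fridge.remove(agent)  # self.free(agent)
            self.mfg_enter(agent)

    def mfg_enter(self, agent):
        # Mirrors MFG's "On enter" hook: pull another agent if there is still room.
        self.mfg.append(agent)
        if len(self.mfg) < self.capacity and self.fridge:
            self._free_first()

    def mfg_exit(self, agent):
        # Mirrors MFG's "On exit" hook: a slot opened up, so push the next agent.
        self.mfg.remove(agent)
        if self.fridge:
            self._free_first()

    def fridge_timeout(self, agent):
        # Outdated material leaves through the timeout port and is scrapped.
        self.fridge.remove(agent)
```

For example, with capacity 2 and five arrivals numbered 0 to 4, agents 0 and 1 go straight to MFG. If agent 2 then times out and agent 0 finishes, agent 3 is pushed into MFG and agent 4 stays in Fridge, so the scrap event does not block the downstream block.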

SFC Steps in IEC 61131-3 Programming

So I have a problem where my SFC jumps to an initial step but the commands written in the step do not register.
At the end of the SFC a step inputs 5 into A_Status(INT).
The very next transition checks if the value of A_Status is 5.
No problems so far, but after the transition when it jumps to the start of the SFC,
where the first step is supposed to input 0 into A_Status, A_Status stays at 5.
The cycle time of my program is 100ms. I have tried slowing the cycle but it didn't work.
What seems to be the problem here? Maybe the same variable used in such a sequence just doesn't work?
Reply would be greatly appreciated.
You don't mention whether you write the values during Entry/Exit or in the SFC step actions. But beware that on some occasions code from a previous step can be executed later than code in the new step.
Here is a link that explains the call order and why parts of the code are sometimes executed twice:
https://infosys.beckhoff.com/english.php?content=../content/1033/tc3_plc_intro/45035999420423563.html
I've had success with adding the following code in all the actions to prevent this from happening.
IF STEP_NAME.x THEN // Only execute this while the step is active.
// Insert code here.
END_IF

matlab parpool failed when stopping mdce on one of workers node [duplicate]

When an out-of-memory error is raised in a parfor, is there any way to kill only one Matlab slave to free some memory instead of having the entire script terminate?
Here is what happens by default when an out-of-memory error occurs in a parfor: the script terminates, as shown in the screenshot below.
I wish there was a way to just kill one slave (i.e. removing a worker from parpool) or stop using it to release as much memory as possible from it:
If you get an out-of-memory error in the master process, there is no chance to fix this. For an out-of-memory error on a slave, this should do it:
The simple idea of the code: restart the parfor again and again with the missing data until you get all results. If one iteration fails, a flag (a file) is written, which lets all remaining iterations throw an error as soon as the first error has occurred. This way we get "out of the loop" without wasting time producing further out-of-memory errors.
%Your intended iterator
iterator=1:10;
%flags which indicate what succeeded
succeeded=false(size(iterator));
%result array
result=nan(size(iterator));
FLAG='ANY_WORKER_CRASHED';
while ~all(succeeded)
    fprintf('Another try\n')
    %determine which iterations should be done
    todo=iterator(~succeeded);
    %initialize array for the remaining results
    partresult=nan(size(todo));
    %initialize flags which indicate which iterations succeeded (we can not
    %throw errors, that would throw away results)
    partsucceeded=false(size(todo));
    %flag indicates that any worker crashed. Have to use file based
    %solution, don't know a better one.
    if exist(FLAG,'file')
        delete(FLAG); %only delete if present, avoids a warning on the first pass
    end
    try
        parfor falseindex=1:sum(~succeeded)
            realindex=todo(falseindex);
            try
                % The flag is used to let all other workers jump out of the
                % loop as soon as one calculation has crashed.
                if exist(FLAG,'file')
                    error('some other worker crashed');
                end
                % insert your code here
                %dummy code which randomly throws an exception
                if rand<.5
                    error('hit out of memory')
                end
                partresult(falseindex)=realindex*2;
                % End of user code
                partsucceeded(falseindex)=true;
                fprintf('trying to run %d and succeeded\n',realindex)
            catch ME
                % catch errors within workers to preserve work
                partresult(falseindex)=nan;
                partsucceeded(falseindex)=false;
                fprintf('trying to run %d but it failed\n',realindex)
                fclose(fopen(FLAG,'w'));
            end
        end
    catch
        %reduce poolsize by 1
        %(matlabpool was superseded by parpool in R2013b)
        newsize = matlabpool('size')-1;
        matlabpool close
        matlabpool(newsize)
    end
    %put the result of the current iteration into the full result
    result(~succeeded)=partresult;
    succeeded(~succeeded)=partsucceeded;
end
After quite a bit of research, and a lot of trial and error, I think I may have a decent, compact answer. What you're going to do is:
Declare some max memory value. You can set it dynamically using the MATLAB function memory, but I like to set it directly.
Call memory inside your parfor loop, which returns the memory information for that particular worker.
If the memory used by the worker exceeds the threshold, cancel the task that worker was working on. Now, here it gets a bit tricky. Depending on the way you're using parfor, you'll need to either delete or cancel the task or the worker. I've verified that it works with the code below when there is one task per worker, on a remote cluster.
Insert the following code at the beginning of your parfor contents. Tweak as necessary.
memLimit = 280000000; %// This doesn't have to be in parfor. Everything else does.
memData = memory;
if memData.MemUsedMATLAB > memLimit
task = getCurrentTask();
cancel(task);
end
Enjoy! (Fun question, by the way.)
One other option to consider is that since R2013b, you can open a parallel pool with 'SpmdEnabled' set to false. This allows MATLAB worker processes to die without the whole pool being shut down; see the doc here: http://www.mathworks.co.uk/help/distcomp/parpool.html . Of course, you still need to arrange somehow to shut down the workers.

Save the debug state in matlab

I am looking for a way to save 'everything' in the matlab session when it is stopped for debugging.
Example
function funmain
a=1;
if a>1
funsub(1)
end
funsub(2)
end
function funsub(c)
b = c + 1;
funsubsub(c)
end
function funsubsub(c)
c = c + 2; %Line with breakpoint
end
When I finally reach the line with the breakpoint, I can easily navigate all workspaces and see where all function calls are made.
The question
How can I preserve this situation?
When debugging nested programs that take a long time to run, I often find myself waiting for a long time to reach a breakpoint. And sometimes I just have to close matlab, or want to try some stuff and later return to this point, so finding a way to store this state would be quite desirable. I work in Windows Server 2008, but would prefer a platform-independent solution that does not require installation of any software.
What have I tried
1. Saving all variables in the workspace: This works sometimes, but often I will also need to navigate other workspaces
2. Saving all variables in the calling workspace: This is already better as I can run the lowest function again, but may still be insufficient. Doing this for all nested workspaces is not very convenient, and navigating the saved workspaces may be even worse.
Besides the mentioned inconveniences, this also doesn't allow me to see the exact route via which the breakpoint is reached. Therefore I hope there is a better solution!
Code structure example
The code looks a bit like this
function fmain
fsub1()
fsub2()
fsub3()
end
function fsub1
fsubsub11
fsubsub12
...
fsubsub19
end
function fsub2
fsubsub21
fsubsub22
...
fsubsub29
end
function fsub3
fsubsub31
fsubsub32
...
fsubsub39
end
function fsubsub29
fsubsubsub291
fsubsubsub292% The break may occur in here
...
fsubsubsub299
The break can of course occur anywhere, and normally I would be able to navigate the workspace and all those above it.
Checkpointing
What you're looking to implement is known as checkpointing code. This can be very useful on pieces of code that run for a very long time. Let's take a very simple example:
f=zeros(1e6,1);
for i=1:1e6
f(i) = g(i) + i*2+5; % //do some stuff with f, not important for this example
end
This would obviously take a while on most machines so it would be a pain if it ran half way, and then you had to restart. So let's add a checkpoint!
f=zeros(1e6,1);
i=1; % //start at 1
% //unless there is a previous checkpoint, in which case skip all those iterations
if exist('checkpoint.mat')==2
load('checkpoint.mat'); % //this will load f and i
end
while i<1e6+1
f(i) = g(i) + i*2+5;
i=i+1;
if mod(i,1000)==0 % //let's save our state every 1000 iterations
save('checkpoint.mat','f','i');
end
end
delete('checkpoint.mat') % //make sure to remove it when we're done!
This allows you to quit your code midway through processing without losing all of that computation time. Deciding when and how often to checkpoint is the balance between performance and lost time!
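The same save-and-resume loop can be sketched in Python for comparison. This is an illustrative sketch only: g is a hypothetical stand-in for the expensive work, the function name run_with_checkpoint and the crash_at parameter (which simulates an interruption) are made up here, and pickle is used in place of a .mat file. The checkpoint is written to a temporary file and then renamed, so an interruption during the save cannot leave a half-written checkpoint behind.

```python
import os
import pickle


def g(i):
    # Hypothetical stand-in for the expensive per-iteration work.
    return i * i


def run_with_checkpoint(n, path, save_every=1000, crash_at=None):
    """Fill f[0..n-1], resuming from the checkpoint at `path` if one exists.

    crash_at simulates an interruption: the function returns None when the
    loop index reaches that value, leaving the last checkpoint on disk.
    """
    f = [0] * n
    i = 0
    if os.path.exists(path):
        with open(path, "rb") as fh:
            f, i = pickle.load(fh)  # restore both the data and the loop position
    while i < n:
        if crash_at is not None and i == crash_at:
            return None  # simulated crash
        f[i] = g(i) + 2 * i + 5
        i += 1
        if i % save_every == 0:
            tmp = path + ".tmp"
            with open(tmp, "wb") as fh:
                pickle.dump((f, i), fh)
            os.replace(tmp, path)  # atomic rename: the checkpoint is never half-written
    if os.path.exists(path):
        os.remove(path)  # finished, so the checkpoint is no longer needed
    return f
```

A first call with crash_at=2500 stops early and leaves a checkpoint taken at iteration 2000; a second call with the same path picks up from there and returns the complete list.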
Sample Code Implementation
Your sample code would need to be updated as follows:
function fmain
sub1done=false; % //These really wouldn't be necessary if each function returns
sub2done=false; % //something, you could just check if the return exists
sub3done=false;
if exist('checkpoint_main.mat')==2, load('checkpoint_main.mat');end
if ~sub1done
fprintf('Entering fsub1\n');
fsub1()
fprintf('Finished with fsub1\n');
sub1done=true;
save('checkpoint_main.mat');
end
if ~sub2done
fprintf('Entering fsub2\n');
fsub2()
fprintf('Finished with fsub2\n');
sub2done=true;
save('checkpoint_main.mat');
end
if ~sub3done
fprintf('Entering fsub3\n');
fsub3()
fprintf('Finished with fsub3\n');
sub3done=true;
save('checkpoint_main.mat');
end
delete('checkpoint_main.mat');
end
function fsub2
subsub21_done=false;subsub22_done=false;...subsub29_done=false;
if exist('checkpoint_fsub2.mat')==2, load('checkpoint_fsub2.mat');end
if ~subsub21_done
fprintf('\tEntering fsubsub21\n');
fsubsub21
fprintf('\tFinished with fsubsub21\n');
subsub21_done=true;
save('checkpoint_fsub2.mat');
end
...
if ~subsub29_done
fprintf('\tEntering fsubsub29\n');
fsubsub29
fprintf('\tFinished with fsubsub29\n');
subsub29_done=true;
save('checkpoint_fsub2.mat');
end
delete('checkpoint_fsub2.mat');
end
function fsubsub29
subsubsub291_done=false;...subsubsub299_done=false;
if exist('checkpoint_fsubsub29.mat')==2,load('checkpoint_fsubsub29.mat');end
if ~subsubsub291_done
fprintf('\t\tEntering fsubsubsub291\n');
fsubsubsub291
fprintf('\t\tFinished with fsubsubsub291\n');
subsubsub291_done=true;
save('checkpoint_fsubsub29.mat');
end
if ~subsubsub292_done
fprintf('\t\tEntering fsubsubsub292\n');
fsubsubsub292% The break may occur in here
fprintf('\t\tFinished with fsubsubsub292\n');
subsubsub292_done=true;
save('checkpoint_fsubsub29.mat');
end
delete('checkpoint_fsubsub29.mat');
end
With this structure, if you restart the program after it was killed, it resumes from the last saved checkpoint. For example, if the program died in fsubsubsub292, it would skip fsub1 altogether, just loading the result. Then, inside fsub2, it would skip fsubsub21 through fsubsub28 and enter fsubsub29. There it would skip fsubsubsub291 and enter fsubsubsub292, where it left off, having loaded all of the variables in that workspace and in the previous workspaces. So if you backed out of 292 into 29, you would have the same workspace as if the code had just run. Note that this will also print a nice tree structure as it enters and exits functions, to help debug execution order.
Reference:
https://wiki.hpcc.msu.edu/pages/viewpage.action?pageId=14781653
After a bit of googling, I found that using putvar (custom function from here: http://au.mathworks.com/matlabcentral/fileexchange/27106-putvar--uigetvar ) solved this.

Question on critical section algorithm

Operating System Concepts, 6th edition, presents a trivial algorithm for implementing a critical section.
do{
while (turn != i);
critical section
turn = j;
remainder section
} while(1);
Note: Pi is the process with identifier i, and Pj is the process with identifier j. To simplify the question, the book limits i and j to 0 and 1, i.e. a two-process environment.
Question 1: Does this algorithm violate the Progress requirement, which is one of the three requirements for a critical-section solution?
In my opinion, when Pi is in its remainder section, it cannot participate in the decision on whether Pj can enter the critical section, so the algorithm meets the requirement.
Or is my understanding of the Progress requirement totally wrong? That is, does the algorithm violate the rule because Pi, after leaving its remainder section, cannot enter the critical section immediately?
Question 2: Consider this statement:
"If turn == 0 and P1 is ready to enter its critical section, P1 cannot do so, even though P0 may be in its remainder section."
What does this statement mean? I cannot see how turn == 0 and P0 being in its remainder section could both hold at the same time.
So is this statement wrong?
Suppose that turn = 0 initially. P0 executes its critical section and sets turn = 1. Now P1 must execute its critical section before P0 may execute its own again. But just because both threads have a critical section doesn't mean that they're going to want to alternate their use of it in this way; in fact, P1 may never take its turn. (And in the general case, you can't determine this at compile time.)
So basically the problem is that the threads are forced to alternate their turns, even if one of them actually doesn't want to enter its critical section for an indefinitely long time.
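By the way, on Question 1: the algorithm does fail the Progress condition. A process that is in its remainder section and has no interest in entering (P1 above) still keeps P0 out, which is exactly what Progress rules out. Bounded Waiting is not the problem here, since strict alternation lets the other process enter at most once before a waiting process gets its turn.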
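The forced alternation described in this answer can be replayed without any threads. The Python sketch below is only an illustration: the function strict_alternation and its schedule argument are made up here, and each entry in the schedule stands for one attempt by that process to get past while (turn != i).

```python
def strict_alternation(schedule, turn=0):
    """Replay entry attempts under the turn-based rule.

    schedule is a sequence of process ids (0 or 1); each element is one
    attempt by that process to enter its critical section.
    Returns (log, turn), where log holds (pid, entered) pairs.
    """
    log = []
    for pid in schedule:
        if turn == pid:
            # The process enters, runs its critical section, then passes the turn on.
            turn = 1 - pid
            log.append((pid, True))
        else:
            # while (turn != i); would keep spinning here.
            log.append((pid, False))
    return log, turn
```

With the schedule [0, 0, 0], P0 gets in once and is refused on both later attempts, because turn stays at 1 until P1 decides to enter. With [0, 1, 0, 1] every attempt succeeds, which is the only usage pattern the algorithm handles well.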