// down = acquire the resource
// up = release the resource
typedef int semaphore;
semaphore resource_1;
semaphore resource_2;

void process_A(void) {
    down(&resource_1);
    down(&resource_2);
    use_both_resources();
    up(&resource_2);
    up(&resource_1);
}

void process_B(void) {
    down(&resource_2);
    down(&resource_1);
    use_both_resources();
    up(&resource_1);
    up(&resource_2);
}
Why does this code cause a deadlock?
If we change process_B so that both processes request the resources in the same order:
void process_B(void) {
    down(&resource_1);
    down(&resource_2);
    use_both_resources();
    up(&resource_2);
    up(&resource_1);
}
Then there is no deadlock.
Why so?
Imagine that process A is running, tries to get resource_1, and gets it.
Now process B takes control, tries to get resource_2, and gets it. Process B then tries to get resource_1 and does not get it, because it is held by process A. So process B goes to sleep.
Process A gets control again and tries to get resource_2, but it is held by process B, so process A goes to sleep too.
At this point, process A is waiting for resource_2 and process B is waiting for resource_1, and neither can ever proceed.
If you change the order, process B will never lock resource_2 unless it gets resource_1 first, and the same holds for process A.
They can never deadlock.
A necessary condition for a deadlock is a cycle of resource acquisitions. The first example constructs such a cycle, 1->2->1. The second example acquires the resources in a fixed order, which makes a cycle, and hence a deadlock, impossible.
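For what it's worth, here is a minimal sketch of the fixed-order version using POSIX threads. It is not the code from the question: the pthread mutexes stand in for the semaphores, and the names worker, run_ordered and uses are made up for illustration. Both threads lock resource_1 before resource_2, so no cycle can form and the run always completes.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the question's resource_1 and resource_2. */
static pthread_mutex_t resource_1 = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t resource_2 = PTHREAD_MUTEX_INITIALIZER;
static long uses = 0; /* counts calls to use_both_resources() */

static void use_both_resources(void) { uses++; }

/* Every worker locks resource_1 first and resource_2 second. */
static void *worker(void *arg) {
    long rounds = *(long *)arg;
    for (long i = 0; i < rounds; i++) {
        pthread_mutex_lock(&resource_1);
        pthread_mutex_lock(&resource_2);
        use_both_resources();
        pthread_mutex_unlock(&resource_2);
        pthread_mutex_unlock(&resource_1);
    }
    return NULL;
}

/* Runs "process A" and "process B" as two threads; returns the total uses. */
long run_ordered(long rounds) {
    pthread_t a, b;
    int rc;

    uses = 0;
    rc = pthread_create(&a, NULL, worker, &rounds);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, worker, &rounds);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return uses;
}
```

Calling run_ordered(1000) returns 2000. If one of the workers took the locks in the opposite order, the same call could hang exactly as described above.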
I'm wondering how to stop a for-await loop from "awaiting".
Here's the loop. I use it to listen for new transactions with StoreKit 2:
transactionListener = Task(priority: .background) { [self] in
    // wait for transactions and process them as they arrive
    for await verificationResult in Transaction.updates {
        if Task.isCancelled { print("canceled"); break }
        // --- do some funny stuff with the transaction here ---
        await transaction.finish()
    }
    print("done")
}
As you can see, Transaction.updates is awaited and delivers a new transaction whenever one is created. When the app finishes, I cancel the loop with transactionListener.cancel(), but the cancellation is ignored because Transaction.updates is still waiting for the next transaction, and there is no direct way in the API to stop it (the way Task.sleep(), for example, stops on cancellation).
The issue starts when I run unit tests. The listener from a previous test is still listening while the next test is already running. This produces very unreliable test results and crashes our CI/CD pipeline. I narrowed it down to the piece of code shown above and the issue described.
So, question: Is it possible to interrupt a for-await loop from awaiting? I have something like the Unix/Linux command kill -1 in mind. Any ideas?
I need to be able to monitor and cancel an ALREADY RUNNING job on a queue.
There are a lot of answers about deleting QUEUED jobs, but none about an already running one.
This is the situation: I have a "job", which consists of HUNDREDS OF THOUSANDS rows on a database, that need to be queried ONE BY ONE against a web service.
Every row needs to be picked up and queried against a web service, then the response stored and the row's status updated.
I had that already working as a Command (launching from / outputting to console), but now I need to implement queues in order to allow piling up more jobs from more users.
So far I've looked at Horizon (which doesn't run on Windows due to missing process control libs). However, judging from some demos I've seen, it lacks (I believe) a couple of things I need:
Dynamically configurable timeout (the whole job may take more than 12 hours, depending on the number of rows to process on the selected job)
Ability to CANCEL an ALREADY RUNNING job.
I also considered generating EACH REQUEST as a new job instead of treating a "job" as the whole collection of rows (this would get around the timeout problem), but that would give me a Horizon "pending jobs" list of hundreds of thousands of records per job, and that would kill the browser (I know Redis can handle this without breaking a sweat). Further, I guess it is not possible to cancel "all jobs belonging to X tag".
I've been thinking about hitting an API route, firing the job and decoupling it from the app, but I'm seeing that this requires forking processes.
For the ability to cancel, I would keep a database table keyed by job_id, and when the user hits an API to cancel a job, I'd mark it as "halted". On every loop iteration the job would check its status and, if it finds "halted", stop itself.
If I've missed any aspect just holler and I'll add it or clarify about it.
So I'm asking for advice here since I'm new to Laravel: how could I achieve this?
So I finally came up with this (a bit clunky) solution:
In Controller:
public function cancelJob()
{
    $jobs = DB::table('jobs')->get();
    # I could use a specific ID and user owner filter, etc.
    foreach ($jobs as $job) {
        DB::table('jobs')->delete($job->id);
    }
    # This is a file that... well, it's self explaining
    touch(base_path(config('files.halt_process_signal')));
    return "Job cancelled - It will stop soon";
}
In job class (inside model::chunk() function)
# CHECK FOR HALT SIGNAL AND [OPTIONALLY] STOP THE PROCESS
if ($this->service->shouldHaltProcess()) {
    # build stats, do some cleanup, log, etc...
    $this->halted = true;
    $this->service->stopProcess();
    # Returning FALSE is what makes the chunk() method stop looping
    return false;
}
In service class:
/**
 * Checks the existence of the 'Halt Process Signal' file
 *
 * @return bool
 */
public function shouldHaltProcess(): bool
{
    return file_exists($this->config['files.halt_process_signal']);
}

/**
 * Stop the batch process
 *
 * @return void
 */
public function stopProcess(): void
{
    logger()->info("=== HALT PROCESS SIGNAL FOUND - STOPPING THE PROCESS ===");
    $this->deleteHaltProcessSignalFile();
    return;
}
It doesn't look very elegant, but it works.
I've searched all over the web, and most people go for Horizon or other tools that don't fit my case.
If anyone has a better way to achieve this, they're welcome to share it.
Laravel queues have 3 important config options:
1. retry_after
2. timeout
3. tries
See more: https://laravel.com/docs/5.8/queues
Dynamically configurable timeout (the whole job may take more than 12 hours, depending on the number of rows to process on the selected job)
I think you can set timeout and retry_after to about 24 hours.
Ability to CANCEL an ALREADY RUNNING job.
Delete the job from the jobs table.
Kill the worker process by its process ID on your server.
Hope it helps :)
Consider the following piece of code running under Solaris 11.3 (a simplified version of system(3C)):
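#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char **argv) {
    pid_t pid = fork();
    pid_t w;
    int status;
    if (pid == 0) {
        /* child: replace the process image with the requested command */
        execvp(argv[1], argv + 1);
        perror("Failed to exec");
        exit(127);
    }
    if (pid > 0) {
        /* parent: wait for the child and report how it ended */
        w = waitpid(pid, &status, 0);
        if (w == -1) {
            perror("Wait: ");
            exit(1);
        }
        else if (WIFEXITED(status) > 0) {
            printf("\nFinish code: %d\n", WEXITSTATUS(status));
        }
        else {
            printf("\nUnexpected termination of child process.\n");
        }
    }
    if (pid == -1) {
        perror("Failed to fork");
    }
}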
The problem I get is that whenever the process is finished via a signal (for instance, SIGINT) the "Unexpected termination" message is never printed.
The way I see it, the whole process group receives signals from the terminal, and in this case the parent process simply terminates before waitpid(2) returns (which happens every time, apparently).
If that is the case, I have a follow-up question. How can the parent retrieve information about the signal that terminated the child process without using a signal handler? For example, I could have added another if-else block with a WIFSIGNALED check and a WTERMSIG call on the status variable. (In fact, I did, but upon termination with Ctrl+C the program delivered no output whatsoever.)
So what exactly and in which order is happening there?
You say, “… whenever the process is finished via a signal
(for instance, SIGINT) …”, but you aren’t specific enough
to enable anybody to answer your question definitively.
If you are sending a signal to the child process with a kill command,
you have an odd problem.
But if, as I suspect (and as you suggest when you say
“the whole process group receives signals from the terminal”),
you are just typing Ctrl+C, it’s simple:
When you type an INTR, QUIT, or SUSP character,
the corresponding signal (SIGINT, SIGQUIT, or SIGTSTP) is sent
simultaneously to all processes in the terminal process group.
OK, strictly speaking, it’s not simultaneous.
It happens in a loop in the terminal driver
(specifically, I believe, the “line discipline” handler), in the kernel.
No user process execution can occur before this loop completes.
You say “… the parent process simply terminates
before waitpid(2) returns (… every time, apparently).”
Technically this is true.
As described above, all processes in the process group
(including your parent and child processes) receive the signal
(essentially) simultaneously.
Since the parent is not handling the signal, itself,
it terminates before it can possibly do any processing
triggered by the child’s receipt of the signal.
You say “Signal is always caught by parent process first”.
No; see above.
And the processes terminate in an unspecified order —
this may be the order in which they appear in the process table
(which is indeterminate),
or determined by some subtle (and, perhaps, undocumented) aspect
of the scheduler’s algorithm.
Related U&L questions:
What is the purpose of abstractions, session, session leader
and process groups?
What are the responsibilities of each Pseudo-Terminal (PTY) component
(software, master side, slave side)?
Does it work OK if you send the signal via "kill" from another tty? I tried this on Linux and it seems to behave the same way.
I think you're right that terminal control signals are sent to the whole process group, and so you have a race. In the parent you need to catch them and delay acting on them.
What I ran was "./prog cat".
Sending the signal with kill -SIGINT works fine.
Pressing Control-C prints nothing.
Putting a setsid() in front makes the parent terminate, but the child keeps running.
I'm trying out some code that is supposed to block until moving to a new simulation time step (similar to waiting for sys.tick_start in e).
I tried writing a function that does this:
task wait_triggered();
    event e;
    `uvm_info("DBG", "Waiting trig", UVM_NONE)
    -> e;
    $display("e.triggered = ", e.triggered);
    wait (!e.triggered);
    `uvm_info("DBG", "Wait trig done", UVM_NONE)
endtask
The idea behind it is that I trigger some event, meaning that its triggered field is going to be 1 when control reaches the line with wait(!e.triggered). This line should unblock in the next time slot, when triggered is going to be cleared.
To test this out I added some other thread that consumes simulation time:
fork
    wait_triggered();
    begin
        `uvm_info("DBG", "Doing stuff", UVM_NONE)
        #1;
        `uvm_info("DBG", "Did stuff", UVM_NONE)
    end
join
#1;
$finish(1);
I see the messages Doing stuff and Did stuff, but Wait trig done never comes. The simulation also stops before reaching the $finish(1). One simulator told me that this is because no further events have been scheduled.
All simulators exhibit the same behavior, so there must be something I'm missing. Could anyone explain what's going on?
The problem is with wait (!e.triggered); at the point where e.triggered changes from 1 to 0. It has to change in a region where nothing can be scheduled, so whether it changes at the end of the current time slot or at the beginning of the next time slot is unobservable. So the wait will hang waiting for the end of the current time slot, which never comes.
I think the closest thing to what you are looking for is #1step. This blocks for the smallest simulation precision time step. But I've got to believe there is a better way to code what you want without having to know if time is advancing.
We have an HTTP endpoint that takes a long time to run and can also be called concurrently by users. As part of this request, we update the model inside a synchronized block so that other (possibly concurrent) requests pick up that change.
E.g.
MyModel m = null;
synchronized (lockObject) {
    m = MyModel.findById(id);
    if (m.status == PENDING) {
        m.status = ACTIVE;
    } else {
        //render a response back to user that the operation is not allowed
    }
    m.save(); //Is not expected to be called unless we set m.status = ACTIVE
}
//Long running operation continues here. It can involve further changes to instance "m"
The reason for the synchronized block is to ensure that even concurrent requests pick up the latest status. However, the underlying JPA does not commit my changes (m.save()) until the request is complete. Since this is a long-running request, I do not want to wait until the request is complete, and I still want to ensure that other callers see the change in status. I tried calling "m.em().flush(); JPA.em().getTransaction().commit();" after m.save(), but that makes the transaction unavailable for the subsequent actions in the same request. Can I just call "JPA.em().getTransaction().begin();" and let Play handle the transaction from then on? If not, what is the best way to handle this use case?
UPDATE:
Based on the response, I modified my code as follows:
MyModel m = null;
synchronized (lockObject) {
    m = MyModel.findById(id);
    if (m.status == PENDING) {
        m.status = ACTIVE;
    } else {
        //render a response back to user that the operation is not allowed
    }
    m.save(); //Is not expected to be called unless we set m.status = ACTIVE
}
new MyModelUpdateJob(m.id).now();
And in my job, I have the following lines:
public void doJob() {
    MyModel m = MyModel.findById(id);
    System.out.println(m.status); //This still prints the old status as if m.save() had no effect...
}
What am I missing?
Put your update code in a job and call
new MyModelUpdateJob(id).now().get();
That way the update is done in another transaction, which is committed at the end of the job.
Ouch, as soon as you add more Play servers, you will be in trouble. You may want to try optimistic locking in your example, or (and I advise against it) pessimistic locking... ick.
HOWEVER, looking at your code, maybe read the article Building on Quicksand. I am not sure you need a synchronized block in that case at all... try to aim for being idempotent.
In your case, if user 1 and user 2 both call that method while the status is pending, it goes to active either way (idempotent).
Whether user 1 or user 2 wins, well, that would be just as if you had the synchronized block anyway.
I am sure, however, that you have a more complex scenario not shown here, BUT READ that article, Building on Quicksand, as it really changes the traditional way of thinking and is how Google, Amazon and other very large-scale systems operate.
Another option for distributed transactions across Play servers is ZooKeeper, which the big NoSQL systems use, BUT only as a last resort ;) ;)
later,
Dean