I want to start a process, get its PID, and write it to a PID file. I then want to check that file, get the PID, and check if the process is running with kill 0.
If the process is not running, I want to start it, get its PID, and write it to the PID file. If the process is already running, then I want to ignore it.
How can I start a process so that it keeps running and I can check its status with Perl?
It is traditional on UNIX for a process to manage its own PID file if it is understood that other processes will need its PID as a way to interact with it.
But if you use fork/exec to start the process, the parent receives the PID of the child upon a successful fork().
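For example, here is a minimal sketch in Perl of the pattern you describe (the PID file path and the worker command are just placeholders): the script first checks an existing PID file with kill 0, and only forks/execs a new worker if nothing is running.

    use strict;
    use warnings;

    my $pidfile = '/tmp/myworker.pid';            # placeholder path

    # If a PID file exists and the process answers kill 0, leave it alone.
    if (-e $pidfile) {
        open my $fh, '<', $pidfile or die "open $pidfile: $!";
        chomp(my $old_pid = <$fh>);
        close $fh;
        if ($old_pid && kill 0, $old_pid) {       # kill 0 sends no signal, it only probes
            print "already running as PID $old_pid\n";
            exit 0;
        }
    }

    # Otherwise start the worker and record its PID.
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        exec '/usr/local/bin/myworker' or die "exec failed: $!";   # placeholder command
    }
    open my $fh, '>', $pidfile or die "open $pidfile: $!";
    print {$fh} "$pid\n";
    close $fh;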
If you give us more detail, we can give more precise help.
--------------------- 2014-11-04 -----------------
Your web services 'should' be creating their own PID files (many commercially available server products already do this). But you don't say how those services are started, nor what kind of processes they are: apache, iis, node, websphere, etc.
In general, this feels like an XY problem: you tell us you want to do X, but the bigger picture is that you're doing Y, and there's a better way to do Y than just doing X.
So please tell us about the environment and the software.
Recently, I have been reading about Operating Systems, and this bugs me a lot.
How is it really possible for one process to manage other processes?
Basically, a CPU simply executes instructions: after executing one instruction, it executes the instruction at the address pointed to by the IP and increments the IP.
Let me elaborate my doubt with an example. Let's say I have a user process (or simply a process) which is being executed by the CPU. Let's say it has 'n' instructions and is currently executing the 'i'th instruction; the IP points to the (i+1)th instruction.
So, at this point, how can all the other OS processes like the scheduler, dispatcher, etc. come into play, since the CPU is already executing another process?
One solution (just a guess) I could think of is the use of interrupts and interrupt service routines.
But it's only a guess.
PS: I searched and couldn't find any satisfying answer.
With the help of the hardware, timer ticks cause the CPU to execute operating system code. This code checks the system state and the time that has elapsed since this process began executing. At that point, the operating system can decide to schedule a different process. All it has to do is save the current state of the running process and load the state of the process that is about to start running (basically saving the registers of the old process and restoring the registers of the new one before switching).
Eventually, the CPU is taken away even if the process doesn't want to yield it.
To address your concern: there are no operating system processes in the way you think of them. It isn't as if OS processes sit in the queue waiting among the other processes; the scheduler is just kernel code that runs when the interrupt fires, not a separate process waiting its turn.
I've been using supervisord for a while -- outstanding tool. The one use case I haven't been able to figure out is, how to configure jobs to be restarted until a condition is met, then stop restarting.
Example: let's say you have a bunch of work to do, like scaling thousands of images, or servicing millions of requests on a queue. A useful pattern would be to run many workers in parallel to work on that backlog. You could set up a supervisord job that ensures 100 workers are running, and if any of them crash, supervisord will spin up replacements so the pool of workers won't shrink.
That's great until the work is done. Maybe when the backlog is gone, the number of workers should scale down to 1 or 0. Supervisord will keep spinning up the total to be 100 processes, even if each new process checks to see if there's work to be done, sees none, and shuts down very quickly.
Is there a way for a process instance or process family to communicate with supervisord to say that the autorestart behavior is no longer needed? Better yet, is there a way to scale the number of worker processes up and down based on some condition (like the number of files in a directory or ??).
I know it can be done by updating the supervisord.conf file and running supervisorctl reload, but I'd prefer something that's more declarative and self-managing if such a thing exists.
Is there a way for a process instance or process family to communicate with supervisord to say that the autorestart behavior is no longer needed?
You can wind down an activity by making sure your processes exit with different exitcode(s) when there is no work and making those the expected exitcodes with autorestart=unexpected in the configuration.
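For example, a hypothetical [program:x] section along these lines (the program name and command are placeholders) treats exit code 0 as "no more work" and lets the pool drain itself instead of being respawned:

    [program:worker]
    ; placeholder worker command
    command=/usr/local/bin/worker
    numprocs=100
    process_name=%(program_name)s_%(process_num)02d
    ; exit code 0 is expected ("backlog empty"), so supervisord does not
    ; restart it; any other exit code or a crash is unexpected and is restarted
    autorestart=unexpected
    exitcodes=0

Each worker then checks for work on startup, does some, and exits 0 when the backlog is gone.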
Better yet, is there a way to scale the number of worker processes up and down based on some condition (like number of files in a directory or ??).
The trouble is that the automatic state transitions don't allow for getting processes running again from an expected EXITED state. AFAIK the only way to do this is with the XML-RPC API's startProcess, so you would need to write or find an appropriate event listener that watches for your start condition and then uses the API.
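As a rough sketch of the API side (the URL, the program name, and the choice of Frontier::Client are assumptions; it requires [inet_http_server] to be enabled, and any XML-RPC client would do), the listener would make a call like this when its start condition is met:

    use strict;
    use warnings;
    use Frontier::Client;    # one of several XML-RPC client modules on CPAN

    # Assumes supervisord's [inet_http_server] listens on port 9001 and a
    # program named "worker" is defined in supervisord.conf.
    my $rpc = Frontier::Client->new(url => 'http://localhost:9001/RPC2');

    # Ask supervisord to move the process out of its (expected) EXITED state.
    # A fault from the server (e.g. if it is already running) raises an error.
    $rpc->call('supervisor.startProcess', 'worker');
    print "asked supervisord to start 'worker'\n";

This is the same call that supervisorctl start worker makes from the shell.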
An alternate design is to wrap your worker process in an event handler watching PROCESS_COMMUNICATION events, and have one normal subprocess communicating new tasks to a pool of event listeners. But that model doesn't currently eliminate a pool of waiting processes when there is no work; it just organizes the control task in a way that may make it easier to separate out the task-related logic and resource usage.
I have a process A that needs to send a message to all processes of type B that are running. Process A doesn't know about these other processes; they can be created and destroyed depending on external factors, so I can have a varying number of processes of type B running.
I thought I could use a UDP socket in process A to send messages to a port P and have all my processes of type B listen to this port P and receive a copy of the message.
Is that possible?
I am working with Linux OpenWRT.
I am trying with LuaSockets, but I am getting an "address already in use" error. It seems that I cannot have multiple applications listening on the same port?
Thanks for your help
It could be useful to use shared memory if all the processes are local to a single machine.
Have a look at http://man7.org/linux/man-pages/man7/shm_overview.7.html for an explanation.
In short, you will need the master process to create a shared memory region and write the data into it. The slave processes can then check the data in the memory region and, if it has changed, act upon it. This is, however, just one of many ways to approach this problem. You could also look into using pipes and tee.
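As a rough sketch of that master/slave pattern (written here in Perl with SysV shared memory for brevity, rather than the POSIX shm_open interface from the man page above; on OpenWRT you would do the equivalent from Lua or C, but the structure is the same). The key file path is arbitrary, and real code would add a semaphore or version counter so slaves can tell when the data has changed:

    use strict;
    use warnings;
    use IPC::SysV qw(ftok IPC_CREAT);

    my $keyfile = '/tmp/msgbus.key';               # arbitrary rendezvous file
    open my $kf, '>>', $keyfile or die "open $keyfile: $!";
    close $kf;                                     # ftok needs the file to exist
    my $key  = ftok($keyfile, 1);
    my $size = 1024;

    # Master (process A): create/attach the segment and publish a message.
    my $id = shmget($key, $size, IPC_CREAT | 0666);
    defined $id or die "shmget: $!";
    my $msg = "reload-config";
    shmwrite($id, pack("N a*", length $msg, $msg), 0, $size) or die "shmwrite: $!";

    # Slave (each process of type B): read the segment and act on its contents.
    my $buf;
    shmread($id, $buf, 0, $size) or die "shmread: $!";
    my ($len) = unpack("N", $buf);
    print "slave saw: ", substr($buf, 4, $len), "\n";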
When a packet is routed to its destination, a port number is used to map it to the appropriate process on the server. However, I cannot find any documentation on how this port-to-process mapping is done. Please let me know with some interesting links/references. Thanks.
The operating system knows which process has which ports open; that's about it in general terms. A specific answer would require specifying a specific operating system, but you can guess that there is something like a port control block for each port and that it probably contains the PID of the process that owns it, or a pointer to its process control block, etc.
I've been using Proc::Daemon in an attempt to make a start/stop daemon script, something that allows me to do:
X start
X stop
X status
etc. However, in the source code it looks like Proc::Daemon uses either a "pid" file, or a search of the process table. I'm concerned with both of these approaches: firstly, "pid"s are reused, which may give the impression a service is up when it's actually down; and secondly, process table entries are easily faked, and the checking doesn't look particularly robust.
Is there any robust way to make a start/stop daemon script/program like I've described, or has someone already made one? Note that I haven't got root access, and I'm also on Solaris if that's important.
Although pids are reused, I believe that they round-robin through a (large) fixed size set. e.g. on Solaris this used to be 30,000 (it may be different now). So 30,000 processes would have to start/finish before your pid was reused.
The approach used by Proc::Daemon doesn't look unreasonable and is a fairly common approach to this problem.
An approach I use is to have the daemon process obtain an exclusive (write) lock on a file.
You can test to see if anyone is holding the lock by trying to obtain the lock yourself, and there are various ways of obtaining the PID of a process holding a lock on a file - e.g. fcntl with F_GETLK, and probably something in /proc.
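A minimal sketch of that in Perl (the lock file path is only an example; see the advice below about choosing and keeping it):

    use strict;
    use warnings;
    use Fcntl qw(:flock);

    my $lockfile = '/var/tmp/mydaemon.lock';    # example path

    open my $lock_fh, '>>', $lockfile or die "open $lockfile: $!";

    if (flock($lock_fh, LOCK_EX | LOCK_NB)) {
        # We hold the lock: no other instance is running.
        # Daemonize here and keep $lock_fh open for the daemon's lifetime.
    }
    else {
        # Someone else holds the lock: the daemon is already up.
        print "already running\n";
        exit 0;
    }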
Some words of advice:
Use local files (ie. not NFS) for locks.
Make sure the lock file exists before the daemon is started.
Never delete the lock file.
The kernel associates locks with the inode number of the file, so you always want the lock file to have the same inode number throughout all time. Deleting and recreating the lock file will change the inode associated with the lock.
A simple keep-alive mechanism can be implemented as a cron job - the cron job just tries to spawn the daemon process every N minutes, and the daemon quietly exits if it can't obtain the exclusive lock.
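For example, a crontab entry along these lines (the path and interval are placeholders) re-launches the daemon every five minutes, and the lock test above makes the redundant launches exit immediately while an instance is already running:

    */5 * * * * /home/me/bin/mydaemon >/dev/null 2>&1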