Perl - passing an open socket across fork/exec - perl

I want to have a Perl daemon listen for and accept an incoming connection from a client, and then fork & exec another Perl program to continue the conversation with the client.
I can do this fine when simply forking - where the daemon code also contains the child's code. But I don't see how the open socket can be "passed" across an exec() to another Perl program.
Somehow I got the impression that this was easy in Unix (which is my environment) and therefore in Perl, too. Can it actually be done?

This can be done in approximately three steps:
Clear the close-on-exec flag on the file descriptor.
Tell the exec'd program which file descriptor to use.
Restore the file descriptor into a handle.
1. Perl (by default) sets the close-on-exec flag on file descriptors it opens. This means file descriptors won't be preserved across an exec. You have to clear this flag first:
use Fcntl;
my $flags = fcntl $fh, F_GETFD, 0 or die "fcntl F_GETFD: $!";
fcntl $fh, F_SETFD, $flags & ~FD_CLOEXEC or die "fcntl F_SETFD: $!";
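As an aside, Perl consults the special variable $^F ($SYSTEM_FD_MAX) when deciding whether to set close-on-exec: descriptors numbered at or below it (ordinarily 2) are left open across exec. So a sketch of an alternative, assuming a hypothetical listening socket $listener, is to raise $^F before the socket is created:
my $client;
{
    local $^F = 1024;            # fds numbered <= $^F are not marked close-on-exec
    $client = $listener->accept; # $listener is a made-up listening socket
}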
2. Now that the file descriptor will stay open across exec, you need to tell the program which descriptor it is:
my $fd = fileno $fh;
exec 'that_program', $fd; # pass it on the command line
# (you could also pass it via %ENV or whatever)
3. Recover the filehandle on the other side:
use IO::Handle; # for ->autoflush on older perls
my $fd = $ARGV[0]; # or however you passed it
open my $fh, '+<&=', $fd or die "fdopen $fd: $!"; # fdopen
$fh->autoflush(1); # because "normal" sockets have that enabled by default
Now you have a Perl-level handle in $fh again.
Addendum: As ikegami mentioned in a comment, you can also make sure the socket is on one of the three "standard" file descriptors (0 (stdin), 1 (stdout), 2 (stderr)), which are (1) left open across exec by default, (2) have well-known numbers, so nothing needs to be passed, and (3) automatically given corresponding Perl handles in the new program.
open STDIN, '+<&', $fh or die "dup: $!"; # now STDIN refers to the socket
exec 'that_program';
Now that_program can simply use STDIN. This works even for output; there is no inherent restriction that file descriptors 0, 1, and 2 be used for input or output only. That is merely a convention that Unix programs follow.
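Putting the pieces together, here is a minimal end-to-end sketch of the accept/fork/exec loop using the addendum's dup-to-STDIN approach (port 7777 and the child program ./handler.pl are made-up placeholders):
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

my $server = IO::Socket::INET->new(
    LocalPort => 7777,   # placeholder port
    Listen    => 5,
    ReuseAddr => 1,
) or die "listen: $!";

while (my $client = $server->accept) {
    defined(my $pid = fork) or die "fork: $!";
    if ($pid == 0) {                             # child
        open STDIN,  '+<&', $client or die "dup to STDIN: $!";
        open STDOUT, '+<&', $client or die "dup to STDOUT: $!";
        exec './handler.pl';                     # hypothetical child program
        die "exec: $!";
    }
    close $client;                               # parent: the child owns the socket now
}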

Related

How to write to an existing file in Perl?

I want to open an existing file on my desktop and write to it, but for some reason I can't do it in Ubuntu. Maybe I'm not writing the path exactly?
Is it possible without modules, etc.?
open(WF,'>','/home/user/Desktop/write1.txt');
$text = "I am writing to this file";
print WF $text;
close(WF);
print "Done!\n";
You have to open the file in append (>>) mode in order to write to the same file without clobbering it.
(Use a modern way to open a file, with a lexical filehandle:)
Here is the code snippet (tested in Ubuntu 20.04.1 with Perl v5.30.0):
#!/usr/bin/perl
use strict;
use warnings;
my $filename = '/home/vkk/Scripts/outfile.txt';
open(my $fh, '>>', $filename) or die "Could not open file '$filename': $!";
print $fh "Write this line to file\n";
close $fh;
print "done\n";
For more info, refer to these links: open, or appending-to-files by Gabor.
Please see the following code sample; it demonstrates some aspects of the correct usage of open and of environment variables, and it reports an error if a file cannot be opened for writing.
Note: Run a search in Google for the Perl bookshelf.
#!/bin/env perl
#
# vim: ai ts=4 sw=4
#
use strict;
use warnings;
use feature 'say';
my $fname = $ENV{HOME} . '/Desktop/write1.txt';
my $text = 'I am writing to this file';
open my $fh, '>', $fname
or die "Can't open $fname";
say $fh $text;
close $fh;
say 'Done!';
Documentation quote
About modes
When calling open with three or more arguments, the second argument -- labeled MODE here -- defines the open mode. MODE is usually a literal string comprising special characters that define the intended I/O role of the filehandle being created: whether it's read-only, or read-and-write, and so on.
If MODE is <, the file is opened for input (read-only). If MODE is >, the file is opened for output, with existing files first being truncated ("clobbered") and nonexisting files newly created. If MODE is >>, the file is opened for appending, again being created if necessary.
You can put a + in front of the > or < to indicate that you want both read and write access to the file; thus +< is almost always preferred for read/write updates--the +> mode would clobber the file first. You can't usually use either read-write mode for updating textfiles, since they have variable-length records. See the -i switch in perlrun for a better approach. The file is created with permissions of 0666 modified by the process's umask value.
These various prefixes correspond to the fopen(3) modes of r, r+, w, w+, a, and a+.
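(To make the fopen(3) correspondence concrete, here is a small sketch; each open is independent and notes.txt is a made-up file name:)
open my $in,  '<',  'notes.txt' or die $!;  # read-only           (fopen "r")
open my $out, '>',  'notes.txt' or die $!;  # write, clobbers     (fopen "w")
open my $app, '>>', 'notes.txt' or die $!;  # append, creates     (fopen "a")
open my $upd, '+<', 'notes.txt' or die $!;  # read/write update   (fopen "r+")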
Documentation: open, close.

How to write a Perl script that can't be run simultaneously [duplicate]

I need to run a Perl script via cron periodically (every 3-5 minutes or so). I want to ensure that only one instance of the script is running at a time, so the next cycle won't start until the previous one has finished. Could/should that be achieved by some built-in functionality of cron or Perl, or do I need to handle it at the script level?
I am quite new to Perl and cron, so help and general recommendations are appreciated.
I have always had good luck using File::NFSLock to get an exclusive lock on the script itself.
use Fcntl qw(LOCK_EX LOCK_NB);
use File::NFSLock;
# Try to get an exclusive lock on myself.
my $lock = File::NFSLock->new($0, LOCK_EX|LOCK_NB);
die "$0 is already running!\n" unless $lock;
This is much the same as the other lock-file suggestions, except I don't have to do anything but attempt to get the lock.
The Sys::RunAlone module does what you want very nicely. Just add
use Sys::RunAlone;
near the top of your code.
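One caveat: Sys::RunAlone works by taking a flock on the script's own DATA handle, so (if I recall its documentation correctly) the script must end with an __END__ or __DATA__ token. A minimal sketch:
#!/usr/bin/perl
use strict;
use warnings;
use Sys::RunAlone;  # exits with a message if another instance already holds the lock

# ... do the periodic work here ...

__END__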
Use File::Pid to store the script's pid in a file, which the script should check for at startup and abort if it is found. You can remove the pidfile when the script is done, but it's not truly necessary, as you can simply check later whether that process ID is still alive (which also accounts for the cases when your script aborts unexpectedly):
use strict;
use warnings;
use File::Pid;
my $pidfile = File::Pid->new({file => '/var/run/myscript'});
exit if $pidfile->running();
$pidfile->write();
# ... rest of script...
# end of script
$pidfile->remove();
exit;
A typical approach is for each process to open and lock a certain file. Then the process reads the process ID contained in the file.
If a process with that ID is running, the latecomer exits quietly. Otherwise, the new winner writes its process ID ($$ in Perl) to the pidfile, closes the handle (which releases the lock), and goes about its business.
Example implementation below:
#! /usr/bin/perl
use warnings;
use strict;
use Fcntl qw/ :DEFAULT :flock :seek /;
my $PIDFILE = "/tmp/my-program.pid";
sub take_lock {
    sysopen my $fh, $PIDFILE, O_RDWR | O_CREAT or die "$0: open $PIDFILE: $!";
    flock $fh => LOCK_EX or die "$0: flock $PIDFILE: $!";
    my $pid = <$fh>;
    if (defined $pid) {
        chomp $pid;
        if (kill 0 => $pid) {
            close $fh;
            exit 1;
        }
    }
    else {
        die "$0: readline $PIDFILE: $!" if $!;
    }
    sysseek $fh, 0, SEEK_SET or die "$0: sysseek $PIDFILE: $!";
    truncate $fh, 0 or die "$0: truncate $PIDFILE: $!";
    print $fh "$$\n" or die "$0: print $PIDFILE: $!";
    close $fh or die "$0: close: $!";
}
take_lock;
print "$0: [$$] running...\n";
sleep 2;
I have always used this - small and simple - no dependency on any modules, and it works on both Windows and Linux.
use Fcntl ':flock';
### Check to make sure there is only one instance ###
open SELF, "< $0" or die("Cannot run two instances of this program");
unless ( flock SELF, LOCK_EX | LOCK_NB ) {
    print "You cannot run two instances of this program, a process is still running";
    exit 1;
}
AFAIK Perl has no such thing built in. You could easily create a temporary file when you start your application and delete it when your script is done.
Given the frequency, I would normally write a daemon (server) that waits idly between job runs (i.e. sleep()) rather than try to use cron for such fairly fine-grained scheduling.
If necessary, on Unix/Linux systems you could run it from /etc/inittab (or its replacement) to ensure that it is always running, and that it is automatically restarted if the process is killed or dies.
Added: (and some irrelevant stuff removed)
The always-present (running, but mostly idle) daemon approach has the benefit of eliminating the possibility of concurrent instances of the script being started by cron automatically.
However, it does mean you are responsible for managing the timing correctly, such as when there is an overlap (i.e. a previous run is still running when a new trigger occurs). This may help you decide whether to use a forking or non-forking design. Threads don't provide any advantage in this scenario, so there is no need to consider using them.
This does not completely eliminate the possibility of multiple processes running, but that is a common problem with many daemons. The typical solution is to use a semaphore such as a mutually exclusive lock on a file to prevent a second instance from running. The file lock is automatically released when the process ends, so in the case of abnormal termination (e.g. power failure) no clean-up of the lock itself is necessary.
An approach using the Fcntl module and Perl's sysopen with the O_EXCL flag (or O_RDWR | O_CREAT | O_EXCL) was given by Greg Bacon. The only differences I would make are to combine the exclusive locking into the sysopen call (i.e. use the flags I've suggested) and to remove the then-redundant flock call. Oh, and I would follow the UNIX (and Linux FHS) file-system naming convention of /var/run/daemonname.pid.
Another approach would be to use djb's daemontools or similar to "daemonize" the task.

Perl File pointers

I have a question concerning these two files:
1)
use strict;
use warnings;
use v5.12;
use externalModule;
my $file = "test.txt";
unless(unlink $file){ # unlink is UNIX equivalent of rm
say "DEBUG: No test.txt persent";
}
unless (open FILE, '>>'.$file) {
die("Unable to create $file");
}
say FILE "This name is $file"; # print newline automatically
unless(externalModule::external_function($file)) {
say "error with external_function";
}
print FILE "end of file\n";
close FILE;
and the external module (.pm):
use strict;
use warnings;
use v5.12;
package externalModule;
sub external_function {
my $file = $_[0]; # first argument
say "externalModule line 11: $file";
# create a new file handler
unless (open FILE, '>>'.$file) {
die("Unable to create $file");
}
say FILE "this comes from an external module";
close FILE;
1;
}
1; # return true
Now,
In the first Perl script, at line 14:
# create a new file handler
unless (open FILE, '>>'.$file) {
die("Unable to create $file");
}
If I had
'>'.$file
instead, then the string printed by the external module would not appear in the final test.txt file.
Why is that?
Kind Regards
'>' means open the file for output, possibly overwriting it ("clobbering"). '>>' means open it for appending, so existing contents are kept.
BTW, it is recommended to use the 3-argument form of open with a lexical file-handle:
open my $FH, '>', $file or die "Cannot open $file: $!\n";
If you use >$file in your main function, it will write from the start of the file, and it will buffer its output as well. So after your external function returns, "end of file" is appended to the buffer, and the buffer is flushed -- with the file pointer still at position 0 in the file, so you simply overwrite the text from the external function. Try a much longer text in the external function, and you'll see that the last part of it remains, with the first part getting overwritten.
This is very old syntax, and not recommended in modern Perl. The 3-argument version, which was "only" introduced in 5.6.1 about 10 years ago, is preferred. So is using a lexical variable for a file handle, rather than an uppercase bareword.
Anyway, >> means open for append, whereas > means open for write, which will remove any existing data in the file.
You're clobbering your file when you reopen it once more. The > means open the file for writing, and delete the old file if it exists and create a new one. The >> means open the file for writing, but append the data if the file already exists.
As you can see, it's very hard to pass FILE back and forth between your module and your program.
The latest syntax is to use lexically scoped variables for file handles:
use autodie;
# Here I'm using the newer three parameter version of the open statement.
# I'm also using '$fh' instead of 'FILE' to store the pointer to the open
# file. This makes it easier to pass the file handle to various functions.
#
# I'm also using "autodie". This causes my program to die due to certain
# errors, so if I forget to test, I don't cause problems. I can use `eval`
# to test for an error if I don't want to die.
open my $fh, ">>", $file; # No die if it doesn't open thx to autodie
# Now, I can pass the file handle to whatever my external module needs
# to do with my file. I no longer have to pass the file name and have
# my external module reopen the file
externalModule::external_function( $fh );
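On the module side, a hypothetical rewrite of external_function that works on the handle it is given, rather than reopening the file, might look like this (a sketch, not the original module):
package externalModule;
use strict;
use warnings;
use v5.12;

sub external_function {
    my ($fh) = @_;    # now a filehandle, not a file name
    say {$fh} "this comes from an external module";
    return 1;         # the caller closes the handle when it is done
}

1;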

Atomic open of non-existing file in Perl

I want to write something to a file which name is in variable $filename.
I don't want to overwrite it, so I check first if it exists and then open it:
#stage1
if(-e $filename)
{
print "file $filename exists, not overwriting\n";
exit 1;
}
#stage2
open(OUTFILE, ">", $filename) or die $!;
But this is not atomic. Theoretically, someone could create this file between stage 1 and stage 2. Is there some variant of the open command that will do both these things in an atomic way, so that it fails to open the file for writing if the file already exists?
Here is an atomic way of opening files:
#!/usr/bin/env perl
use strict;
use warnings qw(all);
use Fcntl qw(:DEFAULT :flock);
my $filename = 'test';
my $fh;
# this is "atomic open" part
unless (sysopen($fh, $filename, O_CREAT | O_EXCL | O_WRONLY)) {
print "file $filename exists, not overwriting\n";
exit 1;
}
# flock() isn't required for "atomic open" per se
# but useful in real world usage like log appending
flock($fh, LOCK_EX);
# use the handle as you wish
print $fh scalar localtime;
print $fh "\n";
# unlock & close
flock($fh, LOCK_UN);
close $fh;
Debug session:
stas@Stanislaws-MacBook-Pro:~/stackoverflow$ cat test
Wed Dec 19 12:10:37 2012
stas@Stanislaws-MacBook-Pro:~/stackoverflow$ perl sysopen.pl
file test exists, not overwriting
stas@Stanislaws-MacBook-Pro:~/stackoverflow$ cat test
Wed Dec 19 12:10:37 2012
If you're concerned about multiple Perl scripts modifying the same file, just use the flock() function in each one to lock the file you're interested in.
If you're worried about external processes, which you probably don't have control over, you can use the sysopen() function. According to the Programming Perl book (which I highly recommend, by the way):
To fix this problem of overwriting, you’ll need to use sysopen, which
provides individual controls over whether to create a new file or
clobber an existing one. And we’ll ditch that –e file existence test
since it serves no useful purpose here and only increases our exposure
to race conditions.
They also provide this sample block of code:
use Fcntl qw/O_WRONLY O_CREAT O_EXCL/;
open(FH, "<", $file)
|| sysopen(FH, $file, O_WRONLY | O_CREAT | O_EXCL)
|| die "can't create new file $file: $!";
In this example, they first pull in a few constants (to be used in the sysopen call). Next, they try to open the file with open, and if that fails, they then try sysopen. They continue on to say:
Now even if the file somehow springs into existence between when open
fails and when sysopen tries to open a new file for writing, no harm
is done, because with the flags provided, sysopen will refuse to open
a file that already exists.
So, to make things clear for your situation, remove the file test completely (no more stage 1), and only do the open operation using code similar to the block above. Problem solved!
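Applied to the question's variables, stages 1 and 2 collapse into a single call (a sketch along the lines of the first answer above):
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

sysopen(my $out, $filename, O_WRONLY | O_CREAT | O_EXCL) or do {
    print "file $filename exists, not overwriting\n";
    exit 1;
};
print $out "safe to write here\n";
close $out;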
