Read from process before write with open3 in Perl

Let's say I have such a C program:
...
puts("What is your name?");
scanf("%s", name);
puts("Thank You!");
...
This program asks you to type your name, accepts input, displays a "Thank You!" message, and exits.
I want to write a program which automatically writes a name to my C program, receives its output, and prints it. The following Perl program works fine:
use strict;
use IPC::Open3;
my $bin = './myniceprogram';
my $pid = open3(\*CHLD_IN, \*CHLD_OUT, \*CHLD_ERR, "$bin")
or die "open3() failed $!";
my $r;
print CHLD_IN "A"x10 . "\n";
$r = <CHLD_OUT>;
print "$r";
$r = <CHLD_OUT>;
print "$r";
waitpid $pid, 0;
It produces the following output:
What is your name?
Thank You!
However, I want to read the first line from my C program ("What is your name?") BEFORE writing to it. But if I change the order of reads and writes in my Perl program, it just hangs:
use strict;
use IPC::Open3;
my $bin = './myniceprogram';
my $pid = open3(\*CHLD_IN, \*CHLD_OUT, \*CHLD_ERR, "$bin")
or die "open3() failed $!";
my $r;
$r = <CHLD_OUT>;
print "$r";
print CHLD_IN "A"x10 . "\n";
$r = <CHLD_OUT>;
print "$r";
waitpid $pid, 0;
With strace I can see that it's stuck on a read call:
[myuser#myhost mydir]$ strace perl test2.pl
...
close(6) = 0
close(8) = 0
read(5,
But why?

You are suffering from buffering. The child program is probably using the normal convention that STDOUT is line-buffered when connected to a terminal, but block-buffered otherwise.
If this is the case, you can fool it into flushing its buffer when it outputs a line feed by using a pseudo-tty (pty). A simple way of doing this is by executing unbuffer ./myniceprogram (unbuffer comes with the expect package). IPC::Run also provides a simple mechanism for using pseudo-ttys: simply use <pty< and >pty> instead of < and >.
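For instance, here is a minimal sketch of the pseudo-tty approach with IPC::Run (it needs the IO::Pty module installed, and note that a pty will normally echo what you write back into the output stream):
use IPC::Run qw(start pump finish);

my ($in, $out) = ('', '');
# '>pty>' gives the child a pseudo-tty, so a line-buffered child flushes
# its prompt as soon as the newline is printed.
my $h = start ['./myniceprogram'], '<pty<', \$in, '>pty>', \$out;
pump $h until $out =~ /name\?/;   # wait for "What is your name?"
$in .= "A" x 10 . "\n";           # now supply the answer
finish $h;                        # drain the rest and reap the child
print $out;                       # prompt, (echoed input,) "Thank You!"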
But that's not going to work for "binary data" (data not organized into LF-terminated lines). A program that uses line-buffering for binary data should be considered buggy. If it's not meant to be interactive, it should use block buffering at all times. But this one is meant to be interactive, so it should flush its buffer at the appropriate time or avoid buffering entirely. Fixing this problem will require modifying myniceprogram.
Similarly, you probably want to disable buffering on CHLD_IN, the handle you write to. There are two ways of doing this (load IO::Handle first so the methods are available):
Turn on autoflush for the handle:
CHLD_IN->autoflush( 1 );
Or flush the handle after every write (print):
CHLD_IN->flush();
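Putting both together, a sketch of the reader-first script (this still assumes the child actually flushes its prompt, e.g. because it was fixed or is wrapped in unbuffer; otherwise the first read will block just as before):
use strict;
use warnings;
use IPC::Open3;
use IO::Handle;   # enables ->autoflush on the handle

my $bin = './myniceprogram';
my $pid = open3(\*CHLD_IN, \*CHLD_OUT, \*CHLD_ERR, $bin);
CHLD_IN->autoflush(1);   # our writes reach the child immediately

my $r = <CHLD_OUT>;      # "What is your name?"
print $r;
print CHLD_IN "A" x 10 . "\n";
$r = <CHLD_OUT>;         # "Thank You!"
print $r;
waitpid $pid, 0;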

Related

How to run shell command in Perl, like Raku?

I have excellent code in Raku:
#!/usr/bin/env perl6
CONTROL {
when CX::Warn {
note $_;
die
}
}
use fatal;
role KeyRequired {
method AT-KEY (\key) {
die "Key {key} not found" unless self.EXISTS-KEY(key);
nextsame
}
}
sub execute ($cmd) {
put $cmd;
my $proc = shell $cmd, :err, :out;
if $proc.exitcode != 0 {
put 'exit code = ' ~ $proc.exitcode;
put 'stderr ' ~ $proc.err.slurp;
put 'stdout ' ~ $proc.out.slurp;
die
}
}
execute "ls *.p6"
I say "excellent" because the Raku version runs a command, returns an exit code, and prints stdout/stderr if needed, and all in an easily-read and easily-understood manner.
Reading through the Perl 5 manual for IPC::Run https://metacpan.org/pod/IPC::Run, I've come across what appears to be the best Perl 5 way of doing this, but I find the methods used there much less readable and understandable than the Raku way of doing things. The best I can find there is:
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use feature 'say';
use autodie qw(:all);
use IPC::Run qw(run timeout);
sub execute {
my $cmd = shift;
my @cat = ('cat', __FILE__); # Raku doesn't need to split the string into an array
run \@cat, \undef, \my $out, \my $err, timeout( 10 ) or die "cat: $?";
if ($out ne '') {
say "\$out = $out";
}
if ($err ne '') {
say "\$err = $err";
}
}
execute("cat " . __FILE__);
execute("cat __Fle"); #intentionally wrong to produce an error
How can I re-write the Perl5 so that it is as easily read and used as the Raku code?
You've unfairly loaded the Perl 5 example with a lot of extra fluff, and you haven't handled many things in the Raku code. For instance, the Raku version outputs the results regardless of what's in the variables, while the Perl 5 version tests them first.
Your Perl 5 would look more like this:
use v5.30;
use IPC::Run qw(run timeout);
sub execute {
my @command = @_;
run \@command, \undef, \my $out, \my $err, timeout( 10 )
or die "cat: $?";
say "\$out = $out";
say "\$err = $err";
}
execute("cat ", __FILE__);
ikegami offered this version in his pastebin link:
sub execute {
my ($command) = @_;
if (! run $command, \undef, \my $out, \my $err, timeout( 10 ) ) {
say "exit code = $?";
say "stderr $err";
say "stdout $out";
die "Died";
}
}
There's an interesting thing to note in both of those cases. You are assuming an error if the exit code is not zero (and Raku assumes that, which is why you have to worry about not sinking the result). However, many useful programs don't follow that convention. For instance, git merge-base uses exit value 1 to mean "not an ancestor" and all exit values higher than 1 to mean an error. The command-line grep is similar. sendmail used exit code 75 to mean that something didn't work out, but that it would try again later.
Raku, having an opinion on that, ignores this sort of thing and does not allow you to tell the Proc which exit values it should accept as successful exits. Perl 5 is not so opinionated. Using or die or ! ... really says "the exit code is not zero", but that's not always a good enough description. In many cases you get away with it, but at least Perl 5 isn't deciding for you. If you expanded the Raku example to check the literal value and decide whether that's successful, it would look messy.
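A sketch of what honoring another convention might look like in Perl 5, treating exit code 1 from git merge-base as a valid "no" answer rather than a failure:
use v5.30;
use IPC::Run qw(run timeout);

# Exit 0 means "is an ancestor", exit 1 means "is not"; anything else is
# treated as an actual error here.
sub is_ancestor {
    my ($maybe_ancestor, $commit) = @_;
    run [qw(git merge-base --is-ancestor), $maybe_ancestor, $commit],
        \undef, \my $out, \my $err, timeout(10);
    my $exit = $? >> 8;
    return 1 if $exit == 0;
    return 0 if $exit == 1;
    die "git merge-base failed (exit $exit): $err";
}

say is_ancestor('HEAD~1', 'HEAD') ? 'ancestor' : 'not an ancestor';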
But, notice that Raku's shell documentation notes that it's unsafe and that you should use run instead.
For what it's worth, I don't find Raku's interprocess communication all that trustworthy. In many cases, I think its IPC design was neglected. See, for instance, Does changing Perl 6's $*OUT change standard output for child processes? . I have several other IPC questions spread out in bug reports and on Stack Overflow, and almost none of them received a satisfactory answer. Mostly, I think that's because nobody thought about it that much. Granted, Raku is developed by a small team and it's a big project, but when it comes to production programming, that's little comfort.
Some more Raku shell weirdness:
Which shell does Perl 6's shell() use?

Passing arguments containing spaces from one script to another in Perl

I am trying to pass arguments from one Perl script to another. Some of the arguments contain spaces.
I am reading in a comma-delimited text file and splitting each line on the comma.
my ($jockey, $racecourse, $racenum, $hnamenum, $trainer, $TDRating, $PRO) = split(/,/, $line);
The data in the comma-delimited text file look as follows:
AARON LYNCH,WARRNAMBOOL,RACE 1,DAREBIN (8),ERIC MUSGROVE,B,1
When I print out each variable, from the parent script, they look fine (as above).
print "$jockey\n";
print "$racecourse\n";
print "$racenum\n";
print "$hnamenum\n";
print "$trainer\n";
print "$TDRating\n";
print "$PRO\n";
AARON LYNCH
WARRNAMBOOL
RACE 1
DAREBIN (8)
ERIC MUSGROVE
B
1
When I pass the arguments to the child script (as follows), the arguments are passed incorrectly.
system("perl \"$bindir\\narrative4.pl\" $jockey $racecourse $racenum $hnamenum $trainer $TDRating $PRO");
AARON
LYNCH
WARRNAMBOOL
RACE
1
DAREBIN
(8)
As you can see, $ARGV[0] becomes AARON, $ARGV[1] becomes LYNCH, $ARGV[2] becomes WARRNAMBOOL, and so on.
I have investigated adding quotes to the arguments using qq, quotemeta and Win32::ShellQuote; unfortunately, even if I pass qq{"$jockey"}, the quotes are still stripped before they reach the child script, so they must be protected in some way.
I'm not sure if either of the aforementioned solutions is correct, but I'm happy to be corrected.
I'd appreciate any suggestions. Thanks in advance.
Note: I am running this using Strawberry Perl on a Windows 10 PC.
Note2: I purposely left out use strict; & use warnings; in these examples.
Parent Script
use Cwd;
$dir = getcwd;
$bin = "bin"; $bindir = "$dir/$bin";
$infile = "FINAL-SORTED-JOCKEY-RIDES-FILE.list";
open (INFILE, "<$infile") or die "Could not open $infile $!\n";
while (<INFILE>)
{
$line = $_;
chomp($line);
my ($jockey, $racecourse, $racenum, $hnamenum, $trainer, $TDRating, $PRO) = split(/,/, $line);
print "$jockey\n";
print "$racecourse\n";
print "$racenum\n";
print "$hnamenum\n";
print "$trainer\n";
print "$TDRating\n";
print "$PRO\n";
system("perl \"$bindir\\narrative4.pl\" $jockey $racecourse $racenum $hnamenum $trainer $TDRating $PRO");
sleep (1);
}
close INFILE;
exit;
Child Script
$passedjockey = $ARGV[0];
$passedracecourse = $ARGV[1];
$passedracenum = $ARGV[2];
$passedhnamenum = $ARGV[3];
$passedtrainer = $ARGV[4];
$passedTDRating = $ARGV[5];
$passedPRO = $ARGV[6];
print "$passedjockey\n";
print "$passedracecourse\n";
print "$passedracenum\n";
print "$passedhnamenum\n";
print "$passedtrainer\n";
print "$passedTDRating\n";
print "$passedPRO\n\n";
That whole double-quoted string passed to system is first evaluated, and thus all variables are interpolated -- so the intended multi-word arguments become merely words in a list. In the end the string contains a command to run with each word as a separate argument.
Then, even if you figure out how to stick quotes in there just right so as to keep those multi-word arguments together, there's still a chance of a shell being invoked, in which case those arguments again get broken up into words before being passed to the program.
Instead of all this use the LIST form of system. The first argument is then the name of the program that will be directly executed without a shell (see docs for some details on that), and the remaining arguments are passed as they are to that program.
parent
use warnings;
use strict;
use feature 'say';
my @args = ('first words', 'another', 'two more', 'final');
my $prog = 'print_args.pl';
system($prog, @args) == 0
or die "Error w/ system($prog, @args): $!";
and the invoked print_args.pl
use warnings;
use strict;
use feature 'say';
say for @ARGV;
The @ARGV array contains the arguments passed to the program at invocation. There's more that can be done to inspect the error; see the docs and links in them.†
From what you show, you indeed don't need a shell, and the LIST form is generally easy to recommend as the basic way to use system when the shell isn't needed. If you were to need the shell's capabilities for something in that command, then you'd have to figure out how to protect those spaces.
† And then there are modules for running external programs that are far better than system & Co. From ease-of-use to features and power:
IPC::System::Simple, Capture::Tiny, IPC::Run3, IPC::Run.
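For instance, a small sketch of the question's system call using IPC::System::Simple (paths hypothetical): systemx never invokes a shell, so multi-word arguments arrive in the child's @ARGV intact, and any non-zero exit becomes a fatal exception automatically.
use strict;
use warnings;
use IPC::System::Simple qw(systemx);

my @fields = ('AARON LYNCH', 'WARRNAMBOOL', 'RACE 1', 'DAREBIN (8)');

# No shell: spaces and parentheses pass through verbatim, and a failed
# child throws an exception instead of being silently ignored.
systemx('perl', 'bin/narrative4.pl', @fields);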

Capture the output of Perl's 'system()'

I need to run a shell command with system() in Perl. For example,
system('ls')
The system call will print to STDOUT, but I want to capture the output into a variable so that I can do future processing with my Perl code.
That's what backticks are for. From perldoc perlfaq8:
Why can't I get the output of a command with system()?
You're confusing the purpose of system() and backticks (``). system()
runs a command and returns exit status information (as a 16 bit value:
the low 7 bits are the signal the process died from, if any, and the
high 8 bits are the actual exit value). Backticks (``) run a command
and return what it sent to STDOUT.
my $exit_status = system("mail-users");
my $output_string = `ls`;
See perldoc perlop for more details.
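Since backticks set $? just as system does, you can capture the output and still check the exit status; a small sketch:
# Backticks capture STDOUT and still populate $?:
my $output = `ls`;
my $exit   = $? >> 8;   # the high 8 bits hold the actual exit value
die "ls failed with exit code $exit" if $exit != 0;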
IPC::Run is my favourite module for this kind of task. Very powerful and flexible, and also trivially simple for small cases.
use IPC::Run 'run';
run [ "command", "arguments", "here" ], ">", \my $stdout;
# Now $stdout contains output
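If you also want what the command wrote to standard error, the same call can capture it separately ('2>' names the child's file descriptor 2):
use IPC::Run 'run';

run [ "command", "arguments", "here" ], ">", \my $stdout, "2>", \my $stderr;
# $stdout and $stderr now hold the two streams independently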
Use backticks, similar to the Bash example:
$variable = `some_command some args`;
That's all. Note that you will not see anything printed to STDOUT, because the output is redirected to the variable.
This example is unusable for a command that interacts with the user, unless you have prepared answers. For that, you can use something like this, feeding the answers in through a pipeline of shell commands:
$variable = `cat answers.txt | some_command some args`;
Inside the answers.txt file you should prepare all answers for some_command to work properly.
I know this isn't the best programming style :) But it is the simplest way to achieve the goal, especially for Bash programmers.
Of course, if the output is bigger (an ls of a directory tree, say), you shouldn't grab all of the output at once. Read the command the same way as you would read a regular file:
open CMD, '-|', 'your_command some args' or die $!;
my $line;
while (defined($line = <CMD>)) {
print $line; # or push @table, $line, or whatever line-by-line processing you want
}
close CMD;
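One refinement: close on a pipe handle waits for the command and sets $?, so its return value is worth checking. A small sketch of a stricter ending for the loop above:
# close() reaps the command; it returns false either on a syscall error
# ($! set) or when the command exited non-zero ($? set).
close CMD or die $! ? "Error closing pipe: $!"
                    : "Command exited with status " . ($? >> 8);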
An additional, extended solution for processing a long command's output without an extra shell invocation:
my @CommandCall = qw(find / -type d); # a single example command
my $commandSTDOUT; # file handle
my $pid = open($commandSTDOUT, '-|'); # there will be an implicit fork!
defined $pid or die "Cannot fork: $!";
if ($pid) {
# parent side
my $singleLine;
while (defined($singleLine = <$commandSTDOUT>)) {
chomp $singleLine; # typically we don't need the EOL
do_some_processing_with($singleLine);
}
close $commandSTDOUT; # at this point $? is set and can be captured
my $exitcode = $? >> 8;
do_something_with_exit_code($exitcode);
} else {
# child side; this is where the command really gets called
open STDERR, '>&', \*STDOUT; # redirect stderr to stdout if needed; affects only the child, remember the fork
exec(@CommandCall); # here the child is replaced by the external command and its parameters
die "Cannot call @CommandCall: $!"; # reached only if exec itself fails
}
If you use a procedure like that, you will capture all of the command's output and can process it line by line. Good luck :)
I wanted to run system() instead of backticks because I wanted to see the output of rsync --progress. However, I also wanted to capture the output in case something goes wrong, depending on the return value. (This is for a backup script.) This is what I am using now:
use File::Temp qw(tempfile);
use Term::ANSIColor qw(colored colorstrip);
sub mysystem {
my $cmd = shift; # "rsync -avz --progress -h $fullfile $copyfile";
my ($fh, $filename) = tempfile();
# http://stackoverflow.com/a/6872163/2923406
# I want to have rsync progress output on the terminal AND capture it in case of error.
# Need to use pipefail because 'tee' would be the last cmd otherwise and hence $? would be wrong.
my @cmd = ("bash", "-c", "set -o pipefail && $cmd 2>&1 | tee $filename");
my $ret = system(@cmd);
my $outerr = join('', <$fh>);
if ($ret != 0) {
logit(colored("ERROR: Could not execute command: $cmd", "red"));
logit(colored("ERROR: stdout+stderr = $outerr", "red"));
logit(colored("ERROR: \$? = $?, \$! = $!", "red"));
}
close $fh;
unlink($filename);
return $ret;
}
# And logit() is something like:
sub logit {
my $s = shift;
my ($logsec, $logmin, $loghour, $logmday, $logmon, $logyear, $logwday, $logyday, $logisdst) = localtime(time);
$logyear += 1900;
my $logtimestamp = sprintf("%4d-%02d-%02d %02d:%02d:%02d", $logyear, $logmon+1, $logmday, $loghour, $logmin, $logsec);
my $msg = "$logtimestamp $s\n";
print $msg;
open LOG, ">>$LOGFILE";
print LOG colorstrip($msg);
close LOG;
}

Making an IRC bot - how can I let people !eval perl/javascript code?

I'm working on a bot in Perl (based on POE) and so far so good, but I can't figure out how I can add a !js or !perl command to evaluate the respective code and return one line of output to be printed into the channel. I found App::EvalServer but I don't get how to use it.
Thanks for any help!
The App::EvalServer module comes with a binary to run as a standalone application. You do not put it in your program; rather, you run it on its own. It opens a port where you can hand it code as a JSON string. This does not sound like a good idea to me either.
There is another module you might want to look at called Safe. I suggest you read through the complete documentation as well as the one to Opcode (linked in the doc) before you do anything with this. YOU CAN DO SERIOUS DAMAGE IF YOU EVALUATE ARBITRARY CODE! Never forget that.
UPDATE:
Here's an example of how to capture the output of print or say from your evaled code. You can use open with a variable to make printed output always go to that variable. If you switch back afterwards, you can work with the captured output in your variable. This is called an in-memory file.
use strict; use warnings;
use feature 'say';
use Safe;
# Put our STDOUT into a variable
my $printBuffer;
open(my $buffer, '>', \$printBuffer);
# Everything we say and print will go into $printBuffer until we change it back
my $stdout = select($buffer);
# Create a new Safe
my $compartment = Safe->new;
$compartment->permit(qw(print)); # for testing
# This is where the external code comes in:
my $external_code = qq~print "Hello World!\n"~;
# Execute the code
my $ret = $compartment->reval($external_code, 1);
# Go back to STDOUT
select($stdout);
printf "The return value of the reval is: %d\n", $ret;
say "The reval's output is:";
say $printBuffer;
# Now you can do whatever you want with your output
$printBuffer =~ s/World/Earth/;
say "After I change it:";
say $printBuffer;
Disclaimer: Use this code at your own risk!
Update 2: After a lengthy discussion in chat, here's what we came up with. It implements a kind of timeout to stop the execution if the reval is taking too long, e.g. because of an infinite loop.
#!/usr/bin/perl
use warnings;
use strict;
use Safe;
use Benchmark qw(:hireswallclock);
my ($t0, $t1); # Benchmark
my $timedOut = 0;
my $userError = 0;
my $printBuffer;
open (my $buffer, '>', \$printBuffer);
my $stdout = select($buffer);
my $cpmt = Safe->new;
$cpmt->permit_only(qw(:default :base_io sleep));
eval
{
local $SIG{'ALRM'} = sub { $timedOut = 1; die "alarm\n"};
$t0 = Benchmark->new;
alarm 2;
$cpmt->reval('print "bla\n"; die "In the user-code!";');
# $cpmt->reval('print "bla\n"; sleep 50;');
alarm 0;
$t1 = Benchmark->new;
if ($@)
{
$userError = "The user-code died! $@\n";
}
};
$t1 //= Benchmark->new; # if the alarm fired inside reval, $t1 was never set
select($stdout);
if ($timedOut)
{
print "Timeout!\n";
my $td = timediff($t1, $t0);
print timestr($td), "\n";
print $printBuffer;
}
else
{
print "There was no timeout...\n";
if ($userError)
{
print "There was an error with your code!\n";
print $userError;
print "But here's your output anyway:\n";
print $printBuffer;
}
else
{
print $printBuffer;
}
}
Take a look at Perl's eval(): you can pass it a string and it will evaluate it as if it were Perl code. Likewise, JavaScript has an eval() function that behaves similarly.
However, DO NOT EVALUATE ARBITRARY CODE in either Perl or JavaScript unless you can run it in a completely closed environment (and even then, it's still a bad idea). Lots of people spend lots of time preventing exactly this from happening. So that's how you'd do it, but you really don't want to do it, at all.

How can I run a system command and die if anything is written to STDERR?

I'm writing a Perl script which uses an external script. The external script must run from a specific directory so I found the following useful:
use IPC::System::Simple qw(capture);
my @args = ('external script path...', 'arg1', ...);
my $out = capture( [0], "cd $dir ; @args" );
Sometimes the external script writes stuff to STDERR but still returns 0. I wish to capture these times and confess (or die). Since I don't control the return value of the external script, I thought maybe I could capture its STDERR so I'll have something like this:
my ($out, $err) = cool_capture( [0], "cd $dir ; @args" );
say "Output was: $out";
if ($err) {
die "Error: this was written to STDERR: $err";
}
What can I do?
This is covered in the Perl FAQ.
Presuming test_app is a program that outputs one line to stdout and one line to stderr:
use IPC::Open3;
use Symbol 'gensym';
my($wtr, $rdr, $err);
$err = gensym;
my $pid = open3($wtr, $rdr, $err, 'test_app');
waitpid($pid, 0);
my $status = $? >> 8;
my $stdout = <$rdr>;
my $stderr = <$err>;
print "out output: $stdout\n";
print "err output: $stderr\n";
print "Exit code: $status\n";
EDIT: Updated per the request to include capturing the exit code. You could also have asked perldoc IPC::Open3, which says
waitpid( $pid, 0 );
my $child_exit_status = $? >> 8;
And which you should read anyway for its cautions and caveats.
If significant output is being written to stdout and/or stderr, or you're both reading from and writing to the process, you need to be a lot more careful with your I/O handling to avoid various blocking problems.
use IPC::Open3;
use IO::Select;
use Symbol 'gensym';

my ($wtr, $rdr, $err);
$err = gensym; # without a separate handle, stderr gets merged into stdout
my $pid = IPC::Open3::open3($wtr, $rdr, $err, @_);
close($wtr);
my $stdout = '';
my $stderr = '';
my $s = IO::Select->new;
$s->add($rdr) if $rdr;
$s->add($err) if $err;
while (my @ready = $s->can_read) {
foreach my $ioh (@ready) {
my $bytes_read = sysread($ioh, my $chunk = '', 1024);
die "read error: $!" unless defined $bytes_read;
if ($bytes_read) {
($ioh eq $rdr ? $stdout : $stderr) .= $chunk;
}
else {
$s->remove($ioh);
}
}
}
# All output has been read; now reap the child.
waitpid($pid, 0);
Finish reading before you shut down the process. If you're writing to the process's stdin, you'd also need to add $wtr (via can_write) and syswrite to the above select loop, as sketched below.
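For example, here is a hedged sketch of such a loop ('./interactive_prog' is a hypothetical program that reads a line and replies); it writes only what the pipe will accept and keeps both output pipes drained the whole time:
use strict;
use warnings;
use IPC::Open3;
use IO::Select;
use Symbol 'gensym';

my $err = gensym;  # separate stderr handle; otherwise it merges with stdout
my $pid = open3(my $wtr, my $rdr, $err, './interactive_prog');

my $input = "A" x 10 . "\n";      # data still to be sent
my ($stdout, $stderr) = ('', '');

my $rsel = IO::Select->new($rdr, $err);
my $wsel = IO::Select->new($wtr);
while ($rsel->count or length $input) {
    # Write only when the pipe can accept data, and only what it accepts.
    if (length $input and $wsel->can_write(0)) {
        my $n = syswrite($wtr, $input);
        die "write error: $!" unless defined $n;
        substr($input, 0, $n, '');            # drop what was sent
        unless (length $input) { close $wtr; $wsel->remove($wtr); }
    }
    # Keep stdout and stderr drained so the child never blocks on a full pipe.
    for my $ioh ($rsel->can_read(0.1)) {
        my $n = sysread($ioh, my $chunk = '', 1024);
        die "read error: $!" unless defined $n;
        if ($n) { ($ioh == $rdr ? $stdout : $stderr) .= $chunk }
        else    { $rsel->remove($ioh) }
    }
}
waitpid($pid, 0);
print "out: $stdout\nerr: $stderr\n";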
EDIT
Rationale:
The above is probably overkill for simple cases. This advanced handling of input and output comes into play when you're likely to move more than a few K of data.
You wouldn't need it if you were executing a 'df' command for example.
However, it's when system buffers for any of stdin, stdout or stderr fill up that blocking becomes likely and things can get more involved.
If the child process fills up its stderr and/or stdout buffers, it will likely block and wait for you to clear them. But if you're waiting for the process to finish before you read from stdout or stderr, that's a deadlock. You'll likely see that the system call never finishes and the child process never completes.
There's a similar possibility of deadlock if stdin is being written to, but the child process is unable to consume the input. This is particularly likely in a 'pipe' situation where the child process is consuming input and writing to stdout.
The select loop is about progressively clearing the buffers to avoid blocking. Both stdout and stderr are monitored concurrently.
If you're writing to stdin and reading from stdout (a pipe), you'll want to keep stdout and stderr clear and only write to stdin when it's ready to receive input.
Simply waiting for the process to finish, then reading stdout/stderr probably works 90% of the time. This reply is just to give you somewhere to go if things get more complicated and processes start to block or go into deadlock.
EDIT2
As for which to use, I'd say start simple, test hard.
Go with Sorpigal's approach, but try to stress-test with higher data volumes and under more difficult loads and conditions than you'd ever expect in a live system.