Perl: retrieve output from a process with IPC::Run if it dies

I have been running some commands with the IPC::Run module and everything is fine, except that I can't access the output (STDOUT, STDERR) that the process produced and that was redirected into variables. Is there a way to retrieve it in the error handling?
my @commands = ();
foreach my $id (1..3) {
    push @commands, ["perl", "script" . $id . ".pl"];
}
foreach my $cmd (@commands) {
    my $out = "";
    my $err = "";
    my $h = harness $cmd, \undef, \$out, \$err,
                    timeout(12, exception => {name => 'timeout'});
    eval {
        run $h;
    };
    if ($@) {
        my $err_msg = $@; # save in case another error happens
        print "$out\n";
        print "$err\n";
        $h->kill_kill;
    }
}
I don't need any input for now, I just need to execute it and get the output.
EDIT
I have been testing it by running Perl scripts that look like this:
for (my $i = 0; $i < 10; $i++) {
    sleep 1;
    print "Hello from script 1 " . localtime() . "\n";
}
I have three such scripts with different durations; the third takes 20 seconds to complete, which is more than the 12 seconds I have in the timer.

As noted by @ysth, the reason you do not get any output is that STDOUT and STDERR of the process running $cmd are not line buffered but block buffered: all output is collected in a buffer that is not written out until the buffer is full or it is explicitly flushed. When your command times out, the output is still sitting in that buffer, unflushed, and therefore never reaches the variable $out in the parent script.
Also note that since your $cmd script is a Perl script, this behavior is documented in perlvar:
$|
If set to nonzero, forces a flush right away and after every write
or print on the currently selected output channel. Default is 0
(regardless of whether the channel is really buffered by the system or
not; $| tells you only whether you've asked Perl explicitly to flush
after each write). STDOUT will typically be line buffered if output is
to the terminal and block buffered otherwise.
The problem (that the program is not connected to a terminal or tty) is also noted in the IPC::Run documentation:
Interactive applications are usually optimized for human use. This can
help or hinder trying to interact with them through modules like
IPC::Run. Frequently, programs alter their behavior when they detect
that stdin, stdout, or stderr are not connected to a tty, assuming
that they are being run in batch mode. Whether this helps or hurts
depends on which optimizations change. And there's often no way of
telling what a program does in these areas other than trial and error
and occasionally, reading the source. This includes different versions
and implementations of the same program.
The documentation also lists a set of possible workarounds, including using pseudo terminals.
One solution for your specific case is then to enable autoflush on STDOUT at the beginning of your script, so every print is flushed immediately:
use IO::Handle;          # provides ->autoflush (autoloaded on modern Perls)
STDOUT->autoflush(1);    # flush STDOUT after every print
# Alternatively use: $| = 1;
for (my $i = 0; $i < 10; $i++) {
    sleep 1;
    print "Hello from script 1 " . localtime() . "\n";
}
Edit:
If you cannot modify the scripts you are running for some reason, you could try connecting the script to a pseudo terminal. Instead of inserting statements like STDOUT->autoflush(1) into the source code of the script, you can fool the script into believing it is connected to a terminal, and hence that it should use line buffering. For your case, we just add a >pty> argument before the \$out argument in the call to harness:
my $h = harness $cmd, \undef, '>pty>', \$out,
        timeout(12, exception => {name => 'timeout'});
eval {
    run $h;
};
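Putting the pieces together, here is a minimal, self-contained sketch of the whole approach; the script name slow.pl is a stand-in for one of your test scripts:
use strict;
use warnings;
use IPC::Run qw(harness run timeout);

my $out = '';
# 'slow.pl' is a placeholder; '>pty>' fools it into line buffering its STDOUT
my $h = harness [ 'perl', 'slow.pl' ], \undef, '>pty>', \$out,
        timeout(12, exception => { name => 'timeout' });
eval { run $h };
if ($@) {
    my $err_msg = $@;   # save $@ before anything else can clobber it
    $h->kill_kill;      # make sure the child process is gone
    print "Partial output before timeout:\n$out";
}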

Related

Flush output of child process

I created a child process via IPC::Open2.
I need to read from the stdout of this child process line by line.
Problem is, as the stdout of the child process is not connected to a terminal, it's fully buffered and I can't read from it until the process terminates.
How can I flush the output of the child process without modifying its code ?
child process code
while (<STDIN>) {
    print "Received : $_";
}
parent process code:
use IPC::Open2;
use Symbol;

my $in  = gensym();
my $out = gensym();
my $pid = open2($out, $in, './child_process');
while (<STDIN>) {
    print $in $_;
    my $line = <$out>;
    print "child said : $line";
}
When I run the code, it gets stuck waiting for the output of the child process.
However, if I run it with bc the result is what I expect; I believe bc must flush its output manually.
note:
In the child process if I add $| = 1 at the beginning or STDOUT->flush() after printing, the parent process can properly read from it.
However this is an example and I must handle programs that don't manually flush their output.
Unfortunately, Perl has no control over the buffering behavior of the programs it executes. Some systems have an unbuffer utility that can force a child's output to be unbuffered. If you have access to this tool, you could say
my $pid = open2($out, $in, 'unbuffer ./child_process');
There's a discussion here about the equivalent tools for Windows, but I couldn't say whether any of them are effective.
One way to (try to) deal with buffering is to set up a terminal-like environment for the process, a pseudo-terminal (pty). That is not easy to do in general but IPC::Run has that capability ready for easy use.
Here is the driver. For testing, run it using the at facility so that it has no controlling terminal (or run it via cron):
use warnings;
use strict;
use feature 'say';
use IPC::Run qw(run);
my @cmd = qw(./t_term.pl input arguments);
run \@cmd, '>pty>', sub { say "out: @_" };
#run \@cmd, '>', sub { say "out: @_" };  # no pty
With >pty> it sets up a pseudo-terminal for STDOUT of the program in @cmd (with > it's a pipe); also see <pty< and the IPC::Run documentation for more about redirection.
The anonymous sub {} gets called every time there is output from the child, so one can process it as it goes. There are other related options.
The program that is called (t_term.pl) only tests whether it is attached to a terminal:
use warnings;
use strict;
use feature 'say';
say "Is STDOUT filehandle attached to a terminal: ",
( (-t STDOUT) ? "yes" : "no" );
sleep 2;
say "bye from $$";
The -t STDOUT (see filetest operators) is a suitable way to check for a terminal in this example. For more/other ways see this post.
The output shows that the called program (t_term.pl) does see a terminal on its STDOUT, even when a driver runs without one (using at, or out of a crontab). If the >pty> is changed to the usual redirection > (a pipe) then there is no terminal.
Whether this solves the buffering problem is clearly up to that program, and to whether it is enough to fool it with a terminal.
Another way around the problem is using unbuffer when possible, as in mob's answer.

How can I show all input and output with Expect.pm (Perl)?

Here is a little Perl server. It displays (1), accepts a line of input, then displays (2), etc. If you type "error" or "commit", it gives a custom message. If you type "exit", it quits. Otherwise, it just endlessly takes lines of input.
use strict;
use warnings;
$|++;
my $counter = 1;
print "($counter) ";
while (<STDIN>) {
    chomp;
    if ($_ eq "error")  { print "Error on command #$counter\n"; }
    if ($_ eq "commit") { print "Committing data\n"; }
    if ($_ eq "exit")   { print "Exiting program...\n"; exit; }
    $counter++;
    print "($counter) ";
}
Now, here is an Expect.pm client script to interact with the server script by typing in various lines.
use strict;
use warnings;
use Expect;
$|++;
my $exp = new Expect;
$exp->raw_pty(1);
$exp->log_file("/tmp/expect.out");
$exp->log_stdout(1);
my @commands = (
    "This is the first command",
    "Here is the second command",
    "error",
    "commit",
    "This is the last command",
    "exit",
);
$exp->spawn("./expecttest_server.pl");
foreach my $command (@commands) {
    print "$command\n";
    $exp->send("$command\n");
    $exp->expect(1, '-re', '\(\d+\)');
}
$exp->soft_close();
What I want is to be able to store the entire session from start to finish, including everything the server script generated, and everything the Expect.pm script sent.
That is, I want my client script to be able to return output like this, which is what you would see if you ran and interacted with the server script manually:
(1) This is the first command
(2) Here is the second command
(3) error
Error on command #3
(4) commit
Committing data
(5) This is the last command
(6) exit
Exiting program...
But the STDOUT display that comes from running the client script looks like this:
This is the first command
(1) (2) Here is the second command
error
(3) Error on command #3
(4) commit
This is the last command
Committing data
(5) (6) exit
Exiting program...
and the file specified by $exp->log_file (/tmp/expect.out) shows this:
(1) (2) (3) Error on command #3
(4) Committing data
(5) (6) Exiting program...
I've tried experimenting by logging various combinations of the command itself + the before_match and after_match variables returned by $exp->expect(). But so far I haven't gotten the right combination. And it seems like an awfully clunky way to get what I'm looking for.
So, what's the best practice for capturing the entirety of an Expect.pm session?
Thanks to anyone who can help!
When run on the command line, your server prints
(1)
to stdout immediately and waits for input.
However, when you create an Expect object, you are actually setting up a PTY (a pseudo terminal). Any processes you spawn will have their stdin and stdout connected to this PTY, not to the TTY that your shell is connected to. This means that it's up to your Expect object whether output from the spawned process is displayed or not; it will not be displayed automatically.
When you spawn a process, the Expect object holds onto any output in an input buffer. When you send a string to the process, any additional output that is generated will be read into the buffer. If the PTY has echoing enabled (the default), the string you send will be echoed back, but the contents of the Expect object's buffer will not.
When you call the expect method, Expect waits until a matching string appears in the input buffer. If a match is found before the timeout expires, expect returns and prints the matching string.
So, all you need to do is call expect before sending your first command, something like this:
Server
use strict;
use warnings;
$| = 1;
my $counter = 1;
do {
    print "($counter) ";
    $counter++;
} while (<>);
Client
use strict;
use warnings;
use Expect;
$| = 1;
my $exp = Expect->new;
my $server = './expect_server';
$exp->spawn($server);
my @commands = qw(foo bar baz);
foreach my $command (@commands) {
    $exp->expect(1, '-re', '\(\d+\)');
    $exp->send("$command\r");
}
$exp->soft_close;
Output
(1) foo
(2) bar
(3) baz
(4)
Note that this is the exact same process you would use when interacting with a process manually: wait for the prompt, then type your command.
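To capture the entire session in a single file, you can combine this wait-then-send pattern with log_file. A minimal sketch (the log path and server name are assumptions; because raw_pty is not enabled here, the pty echoes each command you send, so the echoed commands land in the log alongside the server's output):
use strict;
use warnings;
use Expect;

my $exp = Expect->new;
$exp->log_file('/tmp/session.log');   # assumed path; the full transcript lands here
$exp->log_stdout(1);                  # also mirror the session to our STDOUT
$exp->spawn('./expect_server') or die "Cannot spawn server: $!";

for my $command (qw(foo bar exit)) {
    $exp->expect(1, '-re', '\(\d+\)') or last;  # wait for the prompt first
    $exp->send("$command\r");                   # echoed by the pty, so logged too
}
$exp->soft_close;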

When does the output of a script get appended to a file, and what can I do to circumvent the outcome?

I have a script called my_bash.sh and it calls a perl script, and appends the output of the perl script to a log file. Does the file only get written once the perl script has completed?
my_bash.sh
#!/bin/bash
echo "Starting!" > "my_log.log"
perl my_perl.pl >> "my_log.log" 2>&1
echo "Ending!" >> "my_log.log"
The issue is that I'd like to manipulate the contents of my_log.log while the Perl script is running, but the file appears to be blank until the script finishes. Is there a proper way to do this? Please let me know if you'd like more information.
my_perl.pl
...
foreach $component (@arrayOfComponents)
{
    print "Component Name: $component (MORE_INFO)\n";
    # Do some work to gather more info (including other prints)
    # ...
    # I want to replace "MORE_INFO" above with what I've calculated here
    system("sed 's/MORE_INFO/$moreInfo/' my_log.log");
}
The sed isn't working correctly since the print statements haven't yet made it to the my_log.log.
Perl buffers output by default. To disable buffering, set $| to a non-zero value. Add
$|++;
at the top of your Perl script.
Quoting perldoc perlvar:
$|
If set to nonzero, forces a flush right away and after every write or
print on the currently selected output channel. Default is 0
(regardless of whether the channel is really buffered by the system or
not; $| tells you only whether you've asked Perl explicitly to flush
after each write). STDOUT will typically be line buffered if output is
to the terminal and block buffered otherwise. Setting this variable is
useful primarily when you are outputting to a pipe or socket, such as
when you are running a Perl program under rsh and want to see the
output as it's happening. This has no effect on input buffering. See
getc for that. See select on how to select the output channel. See
also IO::Handle.
Mnemonic: when you want your pipes to be piping hot.
The answer to this question depends on how your my_perl.pl is outputting data and how much data is being output.
If you're using normal (buffered) I/O to produce your output, then my_log.log will only be written to once the STDOUT buffer fills. Generally speaking, if you're not producing a lot of output, this is when the program ends and the buffer is flushed.
If you're producing enough output to fill the output buffer, you will get output in my_log.log prior to my_perl.pl completing.
Additionally, in Perl, you can make your STDOUT unbuffered with the following code:
select STDOUT; $| = 1;
In which case, your output would be written to STDOUT (and then to my_log.log via redirection) the moment it is produced in your script.
Depending on what you need to do to the log file, you might be able to read each line of output from the Perl script, do something with the line, then write it to the log yourself (or not):
#!/bin/bash
echo "Starting!" > "my_log.log"
perl my_perl.pl | \
while read line; do
    # do something with the line
    echo "$line" >> "my_log.log"
done
echo "Ending!" >> "my_log.log"
In between
print "Component Name: $component (MORE_INFO)\n";
and
system("sed 's/MORE_INFO/$moreInfo/' my_log.log");
do you print stuff? If not, delay the first print until you've figured out $moreInfo
my $header = "Component Name: $component (MORE_INFO)";
# ...
# now I have $moreInfo
$header =~ s/MORE_INFO/$moreInfo/;
print $header, "\n";
If you do print stuff, you could always "queue" it until you have the info you need:
my @output;
foreach my $component (...) {
    @output = ("Component Name: $component (MORE_INFO)");
    # ...
    push @output, "something to print";
    # ...
    $output[0] =~ s/MORE_INFO/$moreInfo/;
    print join("\n", @output), "\n";
}

Perl Capture and Modify STDERR before it prints to a file [duplicate]

I want to execute an external command from within my Perl script, putting the output of both stdout and stderr into a $variable of my choice, and to get the command's exit code into the $? variable.
I went through the solutions in perlfaq8 and various forums, but they're not working for me. The strange thing is that I never get the output of stderr, even though the exit code is correct.
I'm using Perl version 5.8.8, on Red Hat Linux 5.
Here's an example of what I'm trying:
my $cmd="less";
my $out=`$cmd 2>&1`;
or
my $out=qx($cmd 2>&1);
or
open(PIPE, "$cmd 2>&1|");
When the command runs successfully, I can capture stdout.
I don't want to use additional capture modules. How can I capture the full results of the external command?
This was exactly the challenge that David Golden faced when he wrote Capture::Tiny. I think it will help you do exactly what you need.
Basic example:
#!/usr/bin/env perl
use strict;
use warnings;
use Capture::Tiny 'capture';
my ($stdout, $stderr, $return) = capture {
    system( 'echo Hello' );
};
print "STDOUT: $stdout\n";
print "STDERR: $stderr\n";
print "Return: $return\n";
After rereading, you might actually want capture_merged to join STDOUT and STDERR into one variable, but the example above is nice and general, so I will leave it.
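For reference, a minimal sketch of the capture_merged variant (the echoed strings are just stand-ins):
use strict;
use warnings;
use Capture::Tiny 'capture_merged';

# STDOUT and STDERR interleaved into a single variable
my $merged = capture_merged {
    system('echo to stdout; echo to stderr >&2');
};
print "Merged output: $merged";
print "Exit status: ", $? >> 8, "\n";   # $? from system() survives the block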
Actually, the proper way to write this is:
#!/usr/bin/perl
my $cmd = 'lsss';
my $out = qx($cmd 2>&1);
my $r_c = $?;
print "output was $out\n";
print "return code = ", $r_c, "\n";
You will get 0 if there was no error and -1 if the command could not be executed.
STDERR is intended to be used for errors or messages that might need to be separated from the STDOUT output stream. Hence, I would not expect any STDERR from the output of a command like less.
If you want both (or either) stream and the return code, you could do:
my $out = qx($cmd 2>&1);
my $r_c = $?;
print "output was $out\n";
print "return code = ", $r_c == -1 ? $r_c : $r_c >> 8, "\n";
If the command isn't executable (perhaps because you meant to use less but wrote lsss instead), the return code will be -1. Otherwise, the actual exit value is in the high 8 bits of $?. See system.
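A sketch of the full decoding of $?, adapted from the documentation for system ($cmd as in the snippet above):
my $out = qx($cmd 2>&1);
if ($? == -1) {
    print "failed to execute: $!\n";
}
elsif ($? & 127) {
    printf "child died with signal %d, %s coredump\n",
        ($? & 127), ($? & 128) ? 'with' : 'without';
}
else {
    printf "child exited with value %d\n", $? >> 8;
}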
A frequently given answer to this question is to use a command line containing shell type redirection. However, suppose you want to avoid that, and use open() with a command and argument list, so you have to worry less about how a shell might interpret the input (which might be partly made up of user-supplied values). Then without resorting to packages such as IPC::Open3, the following will read both stdout and stderr:
my ($child_pid, $child_rc);
unless ($child_pid = open(OUTPUT, '-|')) {
    # Child: send STDERR to the same place as STDOUT, then exec
    open(STDERR, ">&STDOUT");
    exec('program', 'with', 'arguments');
    die "ERROR: Could not execute program: $!";
}
# Parent: read everything before waiting, or a full pipe buffer can deadlock
while (<OUTPUT>) {
    # Do something with it
}
close(OUTPUT);        # close on a piped handle waits for the child to exit
$child_rc = $? >> 8;

Perl bidirectional pipe IPC, how to avoid output buffering

I am trying to communicate with an interactive process. I want my Perl script to be a "middle man" between the user and the process. The process writes text to stdout, prompts the user for a command, writes more text to stdout, prompts the user for a command, and so on. A primitive graphic is provided:
User <----STDOUT---- interface.pl <-----STDOUT--- Process
User -----STDIN----> interface.pl ------STDIN---> Process
User <----STDOUT---- interface.pl <-----STDOUT--- Process
User -----STDIN----> interface.pl ------STDIN---> Process
User <----STDOUT---- interface.pl <-----STDOUT--- Process
User -----STDIN----> interface.pl ------STDIN---> Process
The following simulates what I'm trying to do:
#!/usr/bin/perl
use strict;
use warnings;
use FileHandle;
use IPC::Open2;
my $pid = open2( \*READER, \*WRITER, "cat -n" );
WRITER->autoflush(); # default here, actually
my $got = "";
my $input = " ";
while ($input ne "") {
    chomp($input = <STDIN>);
    print WRITER "$input \n";
    $got = <READER>;
    print $got;
}
Due to output buffering, the above example does not work. No matter what text is typed in, or how many times Enter is pressed, the program just sits there. The way to fix it is to use:
my $pid = open2( \*READER, \*WRITER, "cat -un" );
Notice "cat -un" as opposed to just "cat -n". -u turns off output buffering on cat. When output buffering is turned off this works. The process I am trying to interact with most likely buffers output as I am facing the same issues with "cat -n". Unfortunately I can not turn off output buffering on the process I am communicating with, so how do I handle this issue?
UPDATE1 (using a pty):
#!/usr/bin/perl
use strict;
use warnings;
use IO::Pty;
use IPC::Open2;
my $reader = new IO::Pty;
my $writer = new IO::Pty;
my $pid = open2( $reader, $writer, "cat -n" );
my $got = "";
my $input = " ";
$writer->autoflush(1);
while ($input ne "") {
    chomp($input = <STDIN>);
    $writer->print("$input \n");
    $got = $reader->getline;
    print $got;
}
There are three kinds of buffering:
Block buffering: Output is placed into a fixed-sized buffer. The buffer is flushed when it becomes full. You'll see the output come out in chunks.
Line buffering: Output is placed into a fixed-sized buffer. The buffer is flushed when a newline is added to the buffer and when it becomes full.
No buffering: Output is passed directly to the OS.
In Perl, buffering works as follows (a small demo follows this list):
File handles are buffered by default. One exception: STDERR is not buffered by default.
Block buffering is used. One exception: STDOUT is line buffered if and only if it's connected to a terminal.
Reading from STDIN flushes the buffer for STDOUT.
Until recently, Perl used 4KB buffers. Now, the default is 8KB, but that can be changed when Perl is built.
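Here is a tiny script that makes the STDOUT rule visible. Run it directly on a terminal and each line appears once per second (line buffering); pipe it through cat and all lines appear at once when it exits (block buffering):
use strict;
use warnings;

for my $i (1 .. 5) {
    print "tick $i\n";   # line buffered on a tty, block buffered through a pipe
    sleep 1;
}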
The first two rules are surprisingly standard across applications. That means:
User -------> interface.pl
User is a person. People don't buffer per se, though they are a very slow source of data. OK
interface.pl ----> Process
interface.pl's output is block buffered. BAD
Fixed by adding the following to interface.pl:
use IO::Handle qw( );
WRITER->autoflush(1);
Process ----> interface.pl
Process's output is block buffered. BAD
Fixed by adding the following to Process:
use IO::Handle qw( );
STDOUT->autoflush(1);
Now, you're probably going to tell me you can't change Process. If so, that leaves you three options:
Use a command-line or configuration option provided by the tool to change its buffering behaviour. I don't know of any tools that provide such an option.
Fool the child into using line buffering instead of block buffering by using a pseudo tty instead of a pipe (a sketch follows this list).
Quitting.
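Here is a minimal sketch of option 2 using IPC::Run's pty support for the bidirectional case. cat -n stands in for the unmodifiable process; note that a pty may also echo what you write to it, so a real program may require filtering the echoed input back out:
use strict;
use warnings;
use IPC::Run qw(start pump finish);

my ($in, $out) = ('', '');
# '<pty<' and '>pty>' connect the child's stdin/stdout to a pseudo-tty
my $h = start [ 'cat', '-n' ], '<pty<', \$in, '>pty>', \$out;

$in .= "hello\n";
pump $h until $out =~ /\n/;   # drive the harness until a full line arrives
print "got: $out";

finish $h;                    # close the pty and reap the child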
interface.pl -------> User
interface.pl's output is line buffered. OK (right?)