IO::Pipe - close(<handle>) does not set $?

My understanding is that closing the handle for an IO::Pipe object should be done with the method ($fh->close) and not the built-in (close($fh)).
The other day I goofed and used the built-in out of habit on an IO::Pipe object that was opened to a command I expected to fail. I was surprised when $? was zero and my error checking wasn't triggered.
I realized my mistake: if I use the built-in, IO::Pipe can't perform the waitpid() and can't set $?. What surprised me was that Perl still seemed to close the pipe via the core without setting $?.
I worked up a little test script to show what I mean:
use 5.012;
use warnings;
use IO::Pipe;
say 'init pipes:';
pipes();
my $fh = IO::Pipe->reader(q(false));
say 'post open pipes:';
pipes();
say 'return: ' . $fh->close;
#say 'return: ' . close($fh);
say 'status: ' . $?;
say q();
say 'post close pipes:';
pipes();
sub pipes
{
    for my $fd ( glob("/proc/self/fd/*") )
    {
        say readlink($fd) if -p $fd;
    }
    say q();
}
When using the method it shows the pipe being gone after the close and $? is set as I expected:
init pipes:
post open pipes:
pipe:[992006]
return: 1
status: 256
post close pipes:
And, when using the built-in it also appears to close the pipe, but does not set $?:
init pipes:
post open pipes:
pipe:[952618]
return: 1
status: 0
post close pipes:
It seems odd to me that the built-in results in the pipe closure, but doesn't set $?. Can anyone help explain the discrepancy?
Thanks!

If you look at the code for IO::Handle (of which IO::Pipe::End is a sub-class), you will see the following:
sub close {
    @_ == 1 or croak 'usage: $io->close()';
    my($io) = @_;

    close($io);
}
It looks like $fh->close just calls close $fh. Of course, we should not be peeking behind the curtain.
We can see that after IO::Pipe does a close $fh (behind the scenes), it then does a waitpid:
package IO::Pipe::End;

our(@ISA);
@ISA = qw(IO::Handle);

sub close {
    my $fh = shift;
    my $r = $fh->SUPER::close(@_);   # <-- This just calls a CORE::close

    waitpid(${*$fh}{'io_pipe_pid'}, 0)
        if (defined ${*$fh}{'io_pipe_pid'});

    $r;
}
Also interesting is this from the close Perldoc:
If the filehandle came from a piped open, close returns false if one of the other syscalls involved fails or if its program exits with non-zero status. If the only problem was that the program exited non-zero, $! will be set to 0.
Closing a pipe also waits for the process executing on the pipe to exit --in case you wish to look at the output of the pipe afterwards-- and implicitly puts the exit status value of that command into $? and ${^CHILD_ERROR_NATIVE}.
That answers your question right there.

But what I was surprised by was that perl seemed to still close the pipe without setting $? via the core.
Why would it? It has no way to know the process at the other end is a child, much less one for which the program should wait. Since it has no reason to call waitpid, $? isn't going to get set.
In fact, I doubt it could wait for the process at the other end of the pipe even if it wanted to, because I doubt there's a way of obtaining the pid of the process at the other end of the pipe; it's actually possible for there to be multiple processes at the other end of the pipe.
IO::Pipe::close only calls waitpid when IO::Pipe is used to "open a process".
Similarly, close only calls waitpid when open is used to "open a process".
A process "opened" using one method cannot be closed by the other.

It turns out that my confusion stems from a flawed assumption that the disappearing pipe coincided with a complete process termination. That appears to not be the case, as the process is still available for a wait().
> perl -MIO::Pipe -le 'my $io = IO::Pipe->reader(q(false)); close($io); print $?; print wait(); print $?'
0
8857
256

Related

How to interrupt a read of a lexical filehandle in Perl?

As far as I know, the pattern for interrupting something after some time in Perl is as follows:
#!/usr/bin/perl

local $SIG{ALRM} = sub { die "timed out\n"; };
my $pid;

eval {
    alarm 10;
    # do stuff
    alarm 0;
};

print "done\n";
kill(9, $pid) if $pid;
So one example of a "something" is opening a subprocess for reading which might randomly hang. Simulating it for testing purposes works as expected:
# replace "# do stuff" with:
$pid = open FH, "sleep 120|";
my $line = <FH>;
close FH;
However, these globally-named things can't be passed to other subroutines, so I really want to use the following implementation:
# replace "# do stuff" with:
$pid = open my $fh, "sleep 120|";
my $line = <$fh>;
close $fh;
... but when I run this, the Perl script hangs indefinitely! Why?
When there are no references left to a variable (perhaps as a result of it going out of scope), Perl cleans up the resources associated with that variable. Destructors are called, memory is freed, file handles are closed, etc. This is the case even if the variable ceases to be referenced due to an exception.
When a file handle created using open -| or open |- is closed, Perl waits for the child process to end. This happens not just for explicit calls to close, but also for the implicit closes that happen when an open file handle is destroyed.
In your scenario, Perl is waiting for the child to complete as a result of the implicit close of the file handle. It doesn't hang indefinitely, but only for the remainder of the 120 seconds it takes for the child to end. This is the behaviour one should observe for both versions of the program, and it is the behaviour one observes for both versions unless one is using an old version of Perl: up until the release of 5.18, a bug caused the closing of file handles to be skipped during global destruction.
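One way out, given that behaviour, is to make sure the child is already gone before the implicit close runs, for example by killing it from the ALRM handler. A sketch, using the same sleep 120 stand-in for a hanging program:
#!/usr/bin/perl
use strict;
use warnings;

my $pid;

# Kill the child inside the handler, before the die unwinds the eval and the
# implicit close of the lexical handle starts waiting for the child to exit.
local $SIG{ALRM} = sub {
    kill 'KILL', $pid if $pid;
    die "timed out\n";
};

eval {
    alarm 10;
    $pid = open my $fh, '-|', 'sleep', '120'
        or die "can't start child: $!";
    my $line = <$fh>;   # blocks here until the alarm fires
    alarm 0;
};
print "done\n";
Since the child is already dead by the time the handle is destroyed, the implicit close's wait returns promptly.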

Flush output of child process

I created a child process via IPC::Open2.
I need to read from the stdout of this child process line by line.
Problem is, as the stdout of the child process is not connected to a terminal, it's fully buffered and I can't read from it until the process terminates.
How can I flush the output of the child process without modifying its code?
Child process code:
while (<STDIN>) {
    print "Received : $_";
}
Parent process code:
use IPC::Open2;
use Symbol;

my $in  = gensym();
my $out = gensym();

my $pid = open2($out, $in, './child_process');

while (<STDIN>) {
    print $in $_;
    my $line = <$out>;
    print "child said : $line";
}
When I run the code, it gets stuck waiting for the output of the child process.
However, if I run it with bc the result is what I expect; I believe bc must flush its output manually.
note:
In the child process if I add $| = 1 at the beginning or STDOUT->flush() after printing, the parent process can properly read from it.
However this is an example and I must handle programs that don't manually flush their output.
Unfortunately Perl has no control over the buffering behavior of the programs it executes. Some systems have an unbuffer utility that can do this. If you have access to this tool, you could say
my $pid = open2($out, $in, 'unbuffer ./child_process');
There's a discussion here about the equivalent tools for Windows, but I couldn't say whether any of them are effective.
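If GNU coreutils is available, stdbuf is another option worth trying; it asks the child's stdio layer to line-buffer its stdout, so it only helps programs that rely on stdio's default buffering (a hedged sketch, not from the original answer):
# Prepend stdbuf to line-buffer the child's stdout; it has no effect on
# programs that manage their own output buffers.
my $pid = open2($out, $in, 'stdbuf', '-oL', './child_process');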
One way to (try to) deal with buffering is to set up a terminal-like environment for the process, a pseudo-terminal (pty). That is not easy to do in general but IPC::Run has that capability ready for easy use.
Here is the driver; for testing, run it using the at facility so that it has no controlling terminal (or run it via cron):
use warnings;
use strict;
use feature 'say';

use IPC::Run qw(run);

my @cmd = qw(./t_term.pl input arguments);

run \@cmd, '>pty>', sub { say "out: @_" };
#run \@cmd, '>', sub { say "out: @_" };  # no pty
With >pty> it sets up a pseudo-terminal for STDOUT of the program in @cmd (with > it's a pipe); also see <pty< and the documentation for more about redirection.
The anonymous sub {} gets called every time there is output from the child, so one can process it as it goes. There are other related options.
The program that is called (t_term.pl) only tests for a terminal
use warnings;
use strict;
use feature 'say';
say "Is STDOUT filehandle attached to a terminal: ",
( (-t STDOUT) ? "yes" : "no" );
sleep 2;
say "bye from $$";
The -t STDOUT (see filetest operators) is a suitable way to check for a terminal in this example. For more/other ways see this post.
The output shows that the called program (t_term.pl) does see a terminal on its STDOUT, even when a driver runs without one (using at, or out of a crontab). If the >pty> is changed to the usual redirection > (a pipe) then there is no terminal.
Whether this solves the buffering problem is clearly up to that program, and to whether it is enough to fool it with a terminal.
Another way around the problem is using unbuffer when possible, as in mob's answer.

Measure the runtime of a program run via perl "open"

I'm working on a library with a test suite that uses Perl's open to run its tests. It looks something like this:
open (MYOUT, "$myprog $arg1 $arg2 $arg3 2>&1 |") or die "Bad stuff happened";
What I'd really like to do is to measure the runtime of $myprog. Unfortunately, grabbing a start time and end time around the open command only measures roughly how long it takes to start up the process.
Is there some way of either forcing the open command to finish the process (and therefore accurately measure time) or perhaps something else that would accomplish the same thing?
Key constraints are that we need to capture (potentially a lot of) STDOUT and STDERR.
Since you open a pipe, you need to time from before the open to at least after the reading:
use warnings;
use strict;
use Time::HiRes qw(gettimeofday tv_interval sleep);
my $t0 = [gettimeofday];
open my $read, '-|', qw(ls -l) or die "Can't open process: $!";
while (<$read>)
{
    sleep 0.1;
    print;
}
print "It took ", tv_interval($t0), " seconds\n";
# close pipe and check
Or, to time the whole process, take the end time after calling close on the pipe (after all reading is done):
my $t0 = [gettimeofday];
open my $read, '-|', qw(ls -l) or die "Can't open process: $!";
# ... while (<$read>) { ... }

close $read or
    warn $! ? "Error closing pipe: $!" : "Exit status: $?";
print "It took ", tv_interval($t0), " seconds\n";
The close blocks and waits for the program to finish:
Closing a pipe also waits for the process executing on the pipe to exit--in case you wish to look at the output of the pipe afterwards--and implicitly puts the exit status value of that command into $? [...]
For the status check see $? variable in perlvar and system
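The usual idiom for unpacking $? after the close (adapted from the system docs):
# Interpret the child's status word left in $? by close/waitpid.
if ($? == -1) {
    warn "failed to execute: $!\n";
}
elsif ($? & 127) {
    printf "child died with signal %d\n", $? & 127;
}
else {
    printf "child exited with value %d\n", $? >> 8;
}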
If the timed program forks and doesn't wait on its children in a blocking way, this won't time them correctly.
In that case you need to identify resources that they use (files?) and monitor that.
I'd like to add that external commands should be put together carefully, to avoid shell injection trouble. A good module is String::ShellQuote. See for example this answer and this answer
Using a module for capturing streams would free you from the shell and perhaps open other ways to run and time this more reliably. A good one is Capture::Tiny (and there are others as well).
Thanks to HåkonHægland for comments. Thanks to ikegami for setting me straight, to use close (and not waitpid).

Set a filehandle so that prints to it are quietly skipped?

This strange interest comes from expanding requirements and no time to change design (refactor). This is not good design, sure, but I need to deal with it now and hope to refactor later.
There are a few log files opened early on which are printed to throughout code. The new requirement implies that with a (new) command-line option (--noflag) one of these log files is irrelevant.
All I could do at the moment is to pad the definition (open my $fh, ...) and all uses of it (print $fh ...) with if $flag. This is clearly bad design and it is error prone (it isn't pretty either).
Is there a way to do something with $fh when it is associated with the file
so that any following print $fh ... is accepted by the interpreter but results in the print simply not running, without error? (Let me imagine something like, say, $fh = VOID if $flag;.) Or, is there some NULL stream or such? All I know of are STDOUT (1), STDERR (2), and STDIN (0).
I do not want $fh to print anywhere else, ideally not even to /dev/null (if that is possible?). I did look around and couldn't find anything related. I'd appreciate being pointed to information if in fact it is out there already.
Any ideas are appreciated.
PS. First question ever asked here (after years of using SO), please let me know if it's off.
UPDATE
Thanks for responses. They prompt me to add to/refine this question: Are prints marked to go to /dev/null possibly optimized, so that the 'printing' actually doesn't happen? (While I am still interested in whether it is possible to set a filehandle so to tell to Perl 'do not print here'.)
I am trying to avoid running void (print) statements, without adding conditionals.
Update/Clarification
To summarize a bit from comments (thank you!): This was not a quest for performance optimization. I completely agree with everything said in comments on this. It is simply that executing pointless statements (typically around a million) makes me uneasy. Also, I was curious about some possible dark corner of Perl that I haven't run into. (Most of this has been addressed in answers/comments.)
If you are on a Unix operating system you can use '/dev/null':
open my $fh, '>', '/dev/null' or die 'This should never happen';
/dev/null will silently accept all input.
Closing your filehandle
close $fh;
will make all your prints to that file handle fail. Run
no warnings 'closed';
to suppress all the warning messages that would generate (you do use warnings, right?)
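A compact sketch of that approach, with $noflag standing in for the (hypothetical) result of parsing --noflag:
use strict;
use warnings;

my $noflag = 1;   # stand-in for the --noflag command-line option

open my $log, '>', 'extra.log' or die "Can't open log: $!";
close $log if $noflag;            # prints to $log will now quietly fail

{
    no warnings 'closed';         # hush "print() on closed filehandle"
    print $log "this line is skipped when --noflag was given\n";
}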
Through magic, you could create a magical handle for which operations are always successful.
perl -e'
    {
        package Handle::Dummy;
        use Tie::Handle qw( );
        use Symbol qw( gensym );
        our @ISA = qw( Tie::Handle );
        sub new { my $fh = gensym; tie *$fh, $_[0]; $fh }
        sub TIEHANDLE { bless(\my $dummy, $_[0]) }
        sub READ { return 1; }
        sub WRITE { return 1; }
        sub CLOSE { return 1; }
    }

    my $fh = Handle::Dummy->new();

    print($fh "abc\n") or die $!;
    close($fh) or die $!;

    print("ok\n");
'
ok
That avoids the systems calls, but it replaces them with expensive Perl subroutine calls.
It's far simpler and more reliable[1] to simply use /dev/null. It could very well be faster too.
Are prints marked to go to /dev/null possibly optimized
No. Perl doesn't know anything about /dev/null.
How slow do you think a system call is? This doesn't sound like the right thing to optimize!
[1] The magical file handle is not associated with a system file handle, so it can't be passed to a C library, it won't survive exec, etc.
You can use an anonymous, temporary file (about a quarter of the way down the perldoc page) like so:
#!/usr/bin/env perl

use strict;
use Getopt::Long;

my $fh;
my $need_log = 2;

print "Initial need_log: $need_log\n";
GetOptions('flag!' => \$need_log);
print "After option processing, need_log: ", $need_log, "\n";

if ($need_log) {
    open($fh, '>', "log.txt") or die "Failed to open log: $!\n";
}
else {
    open($fh, '>', undef);
}

print $fh "Hello World... NOT\n";

exit 0;
Here are a few runs with different uses of the --flag option:
User@Ubuntu:~$ ls -l log.txt
ls: cannot access log.txt: No such file or directory
User@Ubuntu:~$ ./nf.pl
Initial need_log: 2
After option processing, need_log: 2
User@Ubuntu:~$ cat log.txt
Hello World... NOT
User@Ubuntu:~$ rm log.txt
User@Ubuntu:~$
User@Ubuntu:~$
User@Ubuntu:~$ ./nf.pl --flag
Initial need_log: 2
After option processing, need_log: 1
User@Ubuntu:~$ cat log.txt
Hello World... NOT
User@Ubuntu:~$ rm log.txt
User@Ubuntu:~$
User@Ubuntu:~$
User@Ubuntu:~$ ./nf.pl --noflag
Initial need_log: 2
After option processing, need_log: 0
User@Ubuntu:~$ cat log.txt
cat: log.txt: No such file or directory
User@Ubuntu:~$
I've initialized the $need_log variable to '2' so that we can tell if it has a 'True' value as a result of the flag option being present (in which case it will have the value 1) or as a result of no mention of the flag option at all (in which case it will have the value 2).
Specifying '--noflag' triggers the else clause, which has 'undef' as the third argument and so creates the anonymous temporary file. This doesn't perfectly match your question of not writing at all, but if the file is temporary and you're not putting gigabytes into it, this will hopefully suffice.

How do I influence the width of Perl IPC::Open3 output?

I have the following Perl code and would like it to display exactly as invoking /bin/ls in the terminal would display. For example on a terminal sized to 100 columns, it would print up to 100 characters worth of output before inserting a newline. Instead this code prints 1 file per line of output. I feel like it involves assigning some terminal settings to the IO::Pty instance, but I've tried variations of that without luck.
UPDATE: I replaced the <$READER> with a call to sysread hoping the original code might just have a buffering issue, but the output received from sysread is still one file per line.
UPDATE: I added code showing my attempt at changing the IO::Pty's size via the clone_winsize_from method. This didn't result in the output being any different.
UPDATE: As best I can tell (from reading IPC::open3 code for version 1.12) it seems you cannot pass a variable of type IO::Handle without open3 creating a pipe rather than dup'ing the filehandle. This means isatty doesn't return a true value when ls invokes it and ls then forces itself into "one file per line" mode.
I think I just need to do a fork/exec and handle the I/O redirection myself.
#!/usr/bin/env perl
use IPC::Open3;
use IO::Pty;
use strict;
my $READER = IO::Pty->new();
$READER->slave->clone_winsize_from(\*STDIN);
my $pid = open3(undef, $READER, undef, "/bin/ls");
while (my $line = <$READER>)
{
    print $line;
}
waitpid($pid, 0) or die "Error waiting for pid: $!\n";
$READER->close();
I think $READER is getting overwritten with a pipe created by open3, which can be avoided by changing
my $READER = ...;
my $pid = open3(undef, $READER, undef, "/bin/ls");
to
local *READER = ...;
my $pid = open3(undef, '>&READER', undef, "/bin/ls");
See the docs.
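An untested sketch of how that change might slot into the original script, additionally handing the child the pty's slave end (rather than the master) so that ls sees a terminal on its stdout; the names and the exact wiring here are my assumptions, not part of the answer above:
#!/usr/bin/env perl
use strict;
use warnings;
use IPC::Open3;
use IO::Pty;

my $pty = IO::Pty->new();
$pty->slave->clone_winsize_from(\*STDIN);

# Alias a package filehandle to the slave so open3 dups it ('>&SLAVE')
# instead of replacing a lexical variable with a freshly created pipe.
local *SLAVE = $pty->slave;
my $pid = open3(undef, '>&SLAVE', undef, '/bin/ls');
close *SLAVE;                  # the parent keeps only the master end

while (my $line = <$pty>) {    # read ls's output from the pty master
    print $line;
}
waitpid($pid, 0);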
You can pass the -C option to ls to force it to use columnar output (without getting IO::Pty involved).
The IO::Pty docs describe a clone_winsize_from(\*FH) method. You might try cloning your actual pty's dimensions.
I see that you're setting up the pty only as stdout of the child process. You might need to set it up also as its stdin — when the child process sends the "query terminal size" escape sequence to its stdout, it would need to receive the response on its stdin.