The syntax of the open3 command - perl

According to the manual, the syntax of IPC::Open3::open3 is
$pid = open3(\*CHLD_IN, \*CHLD_OUT, \*CHLD_ERR, 'some cmd and args', 'optarg', ...);
I am confused regarding the first three parameters. Are these references to typeglobs?
I tried the following:
my $pid=open3("STDIN", "STDOUT", "STDERR", $cmd);
my $pid=open3(*STDIN, *STDOUT, *STDERR, $cmd);
my $pid=open3(\*STDIN, \*STDOUT, \*STDERR, $cmd);
my $pid=open3('<&STDIN', '>&STDOUT', '>&STDERR', $cmd);
but only number 4 seemed to work. According to the manual, I thought number 3 should also work. For example:
use warnings;
use strict;
use feature qw(say);
use IPC::Open3;
my $cmd="sleep 1; echo This STDOUT; sleep 1; echo 'This is STDERR' 1>&2; sleep 1; echo -n 'Reading STDIN: '; read a";
my $pid=open3('<&STDIN', '>&STDOUT', '>&STDERR', $cmd);
say "Waiting for child..";
wait;
say "Done.";

If you pass the strings '<&STDIN' and '>&STDOUT' then the child process gets a duplicate of your Perl program's own standard input and output handles, and can read and write to and from them without any further intervention.
That is a very different thing from specifying filehandles using typeglob references. The CHLD_OUT file handle in the documentation is connected to the STDOUT of the child process, and it allows your Perl program to read from CHLD_OUT so that you can acquire and process the data the child is sending. Using STDOUT here won't work because it is the output file handle of your Perl process. You could, if you really wanted, use STDIN, but that would leave you unable to read anything that was originally presented to your standard input.
The equivalent points apply to CHLD_IN, which is a handle that you print to in order to send data to the child process. Again, you could use STDOUT here, but that deprives you of the original standard output channel. In any case you would still have to invent another file handle for CHLD_ERR, because you would be reading from it to see what the child was sending to its standard error output, and of course you cannot read from STDERR.
So the best you could do is to replace
open3(\*CHLD_IN, \*CHLD_OUT, \*CHLD_ERR, 'command')
with
open3(\*STDOUT, \*STDIN, \*CHLD_ERR, 'command')
but file handles aren't costly, so why commit yourself to losing your standard input and output? Much better to create three new file handles and work with all six.
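To make the "three new file handles" advice concrete, here is a minimal sketch. The helper name run_with_three_handles and the sh command are made up for illustration; it assumes a Unix-like system with sh available and a child that writes only a little output (a chatty child needs IO::Select or similar to avoid deadlock).

```perl
use strict;
use warnings;
use IPC::Open3;
use Symbol qw(gensym);

# Hypothetical helper: run a command and return what it wrote to
# its stdout and stderr, plus its exit code.
sub run_with_three_handles {
    my @cmd = @_;

    # open3 autovivifies the first two handles but not the third:
    # if CHLD_ERR is false, the child's stderr is merged into CHLD_OUT.
    my $chld_err = gensym;
    my $pid = open3(my $chld_in, my $chld_out, $chld_err, @cmd);

    close $chld_in;              # nothing to send to the child

    # Reading one handle to EOF and then the other is only safe
    # for small outputs.
    my @out = <$chld_out>;
    my @err = <$chld_err>;

    waitpid($pid, 0);
    return (join('', @out), join('', @err), $? >> 8);
}

my ($out, $err, $status) =
    run_with_three_handles('sh', '-c', 'echo to-stdout; echo to-stderr 1>&2');
print "stdout: $out";
print "stderr: $err";
```

The Perl program's own STDIN, STDOUT and STDERR are untouched here, so all six handles remain usable.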

Related

Flush output of child process

I created a child process via IPC::Open2.
I need to read from the stdout of this child process line by line.
Problem is, as the stdout of the child process is not connected to a terminal, it's fully buffered and I can't read from it until the process terminates.
How can I flush the output of the child process without modifying its code ?
child process code
while (<STDIN>) {
print "Received : $_";
}
parent process code:
use IPC::Open2;
use Symbol;
my $in = gensym();
my $out = gensym();
my $pid = open2($out, $in, './child_process');
while (<STDIN>) {
print $in $_;
my $line = <$out>;
print "child said : $line";
}
When I run the code, it gets stuck waiting for the output of the child process.
However, if I run it with bc the result is what I expect, I believe bc must manually flush its output
note:
In the child process if I add $| = 1 at the beginning or STDOUT->flush() after printing, the parent process can properly read from it.
However this is an example and I must handle programs that don't manually flush their output.
Unfortunately Perl has no control over the buffering behavior of the programs it executes. Some systems have an unbuffer utility that can do this. If you have access to this tool, you could say
my $pid = open2($out, $in, 'unbuffer ./child_process');
There's a discussion here about the equivalent tools for Windows, but I couldn't say whether any of them are effective.
One way to (try to) deal with buffering is to set up a terminal-like environment for the process, a pseudo-terminal (pty). That is not easy to do in general but IPC::Run has that capability ready for easy use.
Here is the driver, run for testing using the at facility so that it has no controlling terminal (or run it via cron)
use warnings;
use strict;
use feature 'say';
use IPC::Run qw(run);
my @cmd = qw(./t_term.pl input arguments);
run \@cmd, '>pty>', sub { say "out: @_" };
#run \@cmd, '>', sub { say "out: @_" } # no pty
With >pty> it sets up a pseudo-terminal for STDOUT of the program in @cmd (with > it's a pipe); also see <pty< and see more about redirection.
The anonymous sub {} gets called every time there is output from the child, so one can process it as it goes. There are other related options.
The program that is called (t_term.pl) only tests for a terminal
use warnings;
use strict;
use feature 'say';
say "Is STDOUT filehandle attached to a terminal: ",
( (-t STDOUT) ? "yes" : "no" );
sleep 2;
say "bye from $$";
The -t STDOUT (see filetest operators) is a suitable way to check for a terminal in this example. For more/other ways see this post.
The output shows that the called program (t_term.pl) does see a terminal on its STDOUT, even when a driver runs without one (using at, or out of a crontab). If the >pty> is changed to the usual redirection > (a pipe) then there is no terminal.
Whether this solves the buffering problem is clearly up to that program, and to whether it is enough to fool it with a terminal.
Another way around the problem is using unbuffer when possible, as in mob's answer.

Can I capture STDOUT write events from a process in perl?

I need (would like?) to spawn a slow process from a web app using a Minion queue.
The process - a GLPK solver - can run for a long time but generates progress output.
I'd like to capture that output as it happens and write it to somewhere (database? log file?) so that it can be played back to the user as a status update inside the web app.
Is that possible? I have no idea (hence no code).
I was exploring Capture::Tiny - the simplicity of it is nice but I can't tell if it can track write events upon writing.
A basic way is to use pipe open, where you open a pipe to a process that gets forked. Then the STDOUT from the child is piped to the filehandle in the parent, or the parent pipes to its STDIN.
use warnings;
use strict;
my @cmd = qw(ls -l .); # your command
my $pid = open(my $fh, '-|', @cmd) // die "Can't open pipe from @cmd: $!";
while (<$fh>) {
    print;
}
close $fh or die "Error closing pipe from @cmd: $!";
This way the parent receives child's STDOUT right as it is emitted.†
There is a bit more that you can do with error checking, see the man page, close, and $? in perlvar. Also, install a handler for SIGPIPE, see perlipc and %SIG in perlvar.
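As a rough sketch of that extra error checking, the following wraps the pipe open in a helper and decodes $? after close. The name read_from_command and the sh test command are made up for illustration; it assumes a Unix-like system.

```perl
use strict;
use warnings;

# Hypothetical helper: feed each line of a command's STDOUT to a
# callback, then report how the command ended.
sub read_from_command {
    my ($on_line, @cmd) = @_;

    my $pid = open(my $fh, '-|', @cmd)
        // die "Can't open pipe from @cmd: $!";

    while (my $line = <$fh>) {
        $on_line->($line);
    }

    # For a pipe, close waits for the child and sets $?.
    # A false return with $! == 0 only means a non-zero exit status.
    unless (close $fh) {
        die "Error closing pipe from @cmd: $!" if $!;
    }

    my $signal = $? & 127;   # signal that killed the child, if any
    my $exit   = $? >> 8;    # the child's exit code
    return ($exit, $signal);
}

my @lines;
my ($exit, $signal) = read_from_command(
    sub { push @lines, $_[0] },
    'sh', '-c', 'echo one; echo two; exit 3',
);
print "got: $_" for @lines;
print "exit code $exit, signal $signal\n";
```

The callback is where progress lines could be written to a log or database as they arrive.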
There are modules that make it far easier to run and manage external commands and, in particular, check errors. However, Capture::Tiny and IPC::Run3 use files to transfer the external program's streams.
On the other hand, IPC::Run gives you far more control and power.
To have code executed "... each time some data is read from the child" use a callback
use warnings;
use strict;
use IPC::Run qw(run);
my @cmd = (
'perl',
'-le',
'STDOUT->autoflush(1); for (qw( abc def ghi )) { print; sleep 1; }'
);
run \@cmd, '>', sub { print $_[0] };
Once you use IPC::Run a lot more is possible, including better error interrogation, setting up pseudo tty for the process, etc. For example, using >pty> instead of > sets up a terminal-like environment so the external program that is run may turn back to line buffering and provide more timely output. If demands on how to manage the process grow more complex then work will be easier with the module.
Thanks to ikegami for comments, including the demo @cmd.
† To demonstrate that the parent receives child's STDOUT as it is emitted use a command that emits output with delays. For example, instead of ls -l above use
my @cmd = (
'perl',
'-le',
'STDOUT->autoflush(1); for (qw( abc def ghi )) { print; sleep 1; }'
);
This Perl one-liner prints words one second apart, and that is how they wind up on screen.

Set a filehandle so that prints to it are quietly skipped?

This strange interest comes from expanding requirements and no time to change design (refactor). This is not good design, sure, but I need to deal with it now and hope to refactor later.
There are a few log files opened early on which are printed to throughout code. The new requirement implies that with a (new) command-line option (--noflag) one of these log files is irrelevant.
All I could do at the moment is to pad the definition (open my $fh, ...) and all uses of it (print $fh ...) with if $flag. This is clearly bad design and it is error prone (it isn't pretty either).
Is there a way to do something with $fh when it is associated with the file
so that any following print $fh ... is accepted by the interpreter but will result in simply not running the print, without error? (Let me imagine something like, say, $fh = VOID if $flag;.) Or, is there some NULL stream or such? All I know of are STDOUT (1), STDERR (2), and STDIN (0).
I do not want $fh to print anywhere else, ideally not even to /dev/null (if that is possible?). I did look around and couldn't find anything related. I'd appreciate being pointed to information if in fact it is out there already.
Any ideas are appreciated.
PS. First question ever asked here (after years of using SO), please let me know if it's off.
UPDATE
Thanks for responses. They prompt me to add to/refine this question: Are prints marked to go to /dev/null possibly optimized, so that the 'printing' actually doesn't happen? (While I am still interested in whether it is possible to set a filehandle so to tell to Perl 'do not print here'.)
I am trying to avoid running void (print) statements, without adding conditionals.
Update/Clarification
To summarize a bit from comments (thank you!): This was not a quest for performance optimization. I completely agree with everything said in comments on this. It is simply that executing pointless statements (typically around a million) makes me uneasy. Also, I was curious about some possible dark corner of Perl that I haven't run into. (Most of this has been addressed in answers/comments.)
If you are on a unix operating system you can use '/dev/null'
open my $fh, '>', '/dev/null' or die 'This should never happen';
Dev null will silently accept all input.
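A portable variant of the same idea, as a sketch: File::Spec->devnull returns the name of the null device for the current platform, so the choice between the real log and the null device can be made once, where the handle is opened. The helper name open_log and the file name unused.log are made up for illustration.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: open the real log, or the null device when
# logging is switched off. Callers print to the handle either way.
sub open_log {
    my ($path, $enabled) = @_;
    my $target = $enabled ? $path : File::Spec->devnull;
    open(my $fh, '>', $target) or die "Can't open $target: $!";
    return $fh;
}

my $fh = open_log('unused.log', 0);   # disabled: unused.log is never created
my $ok = print $fh "this line is discarded\n";
close $fh or die "close failed: $!";
print "print returned ", ($ok ? "true" : "false"), "\n";
```

None of the existing print $fh ... statements need a conditional with this approach.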
Closing your filehandle
close $fh;
will make all your prints to that file handle fail. Run
no warnings 'closed';
to suppress all the warning messages that would generate (you do use warnings, right?)
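A small sketch of what that looks like in practice. An in-memory file is used here (my choice, not part of the answer) only so the result can be inspected without touching disk.

```perl
use strict;
use warnings;

# In-memory file, used here only so the effect is easy to inspect.
open(my $fh, '>', \my $buffer) or die "Can't open in-memory file: $!";
print $fh "kept\n";
close $fh;

my $ok;
{
    no warnings 'closed';    # silence "print() on closed filehandle"
    $ok = print $fh "dropped\n";
}
print "print succeeded? ", ($ok ? "yes" : "no"), "\n";
print "buffer holds: $buffer";
```

Note that print still runs and returns false each time, so any code that checks the return value of print will see failures.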
Through magic, you could create a magical handle for which operations are always successful.
perl -e'
{
package Handle::Dummy;
use Tie::Handle qw( );
use Symbol qw( gensym );
our @ISA = qw( Tie::Handle );
sub new { my $fh = gensym; tie *$fh, $_[0]; $fh }
sub TIEHANDLE { bless(\my $dummy, $_[0]) }
sub READ { return 1; }
sub WRITE { return 1; }
sub CLOSE { return 1; }
}
my $fh = Handle::Dummy->new();
print($fh "abc\n") or die $!;
close($fh) or die $!;
print("ok\n");
'
ok
That avoids the system calls, but it replaces them with expensive Perl subroutine calls.
It's far simpler and more reliable[1] to simply use /dev/null. It could very well be faster too.
Are prints marked to go to /dev/null possibly optimized
No. Perl doesn't know anything about /dev/null.
How slow do you think a system call is? This doesn't sound like the right thing to optimize!
[1] The magical file handle is not associated with a system file handle, so it can't be passed to a C library, it won't survive exec, etc.
You can use an anonymous, temporary file (about a quarter of the way down the perldoc page) like so;
#!/usr/bin/env perl
use strict;
use Getopt::Long;
my $fh;
my $need_log = 2;
print "Intitial need_log: $need_log\n";
GetOptions('flag!' => \$need_log);
print "After option processing, need_log: ", $need_log, "\n";
if ($need_log) {
open($fh, '>', "log.txt") or die "Failed to open log: $!\n";
}
else {
open($fh, '>', undef);
}
print $fh "Hello World... NOT\n";
exit 0;
Here are a few runs with different uses of the --flag option:
User@Ubuntu:~$ ls -l log.txt
ls: cannot access log.txt: No such file or directory
User@Ubuntu:~$ ./nf.pl
Intitial need_log: 2
After option processing, need_log: 2
User@Ubuntu:~$ cat log.txt
Hello World... NOT
User@Ubuntu:~$ rm log.txt
User@Ubuntu:~$
User@Ubuntu:~$
User@Ubuntu:~$ ./nf.pl --flag
Intitial need_log: 2
After option processing, need_log: 1
User@Ubuntu:~$ cat log.txt
Hello World... NOT
User@Ubuntu:~$ rm log.txt
User@Ubuntu:~$
User@Ubuntu:~$
User@Ubuntu:~$ ./nf.pl --noflag
Intitial need_log: 2
After option processing, need_log: 0
User@Ubuntu:~$ cat log.txt
cat: log.txt: No such file or directory
User@Ubuntu:~$
I've initialized the $need_log variable to '2' so that we can tell if it has a 'True' value as a result of the flag option being present (in which case it will have the value 1) or as a result of no mention of the flag option at all (in which case it will have the value 2).
Specifying '--noflag' triggers the else clause which has 'undef' as the third argument which creates the anonymous temporary file. This doesn't perfectly match your question of not writing at all, but if the file is temporary and you're not putting gigabytes in it, this will hopefully suffice.
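To show that the data really does go into a temporary file, here is a short sketch that opens the anonymous file in read-write mode ('+>' instead of '>', my change for the demonstration) and reads the line back.

```perl
use strict;
use warnings;

# '+>' with undef opens an anonymous temporary file for reading and writing.
open(my $tmp, '+>', undef) or die "Can't open anonymous temp file: $!";
print $tmp "Hello World... NOT\n";

seek($tmp, 0, 0) or die "seek failed: $!";   # rewind to read back what was written
my $stored = <$tmp>;
close $tmp;                                  # the temporary file is gone after this

print "temp file held: $stored";
```

So the prints are still executed and still cost I/O; they just end up somewhere that cleans itself up.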

Perl: Pass one byte plus STDIN to another command

I would like to do this efficiently:
my $buf;
my $len = read(STDIN,$buf,1);
if($len) {
# Not empty
open(OUT,"|-", "wc") || die;
print OUT $buf;
# This is the line I want to do faster
print OUT <STDIN>;
exit;
}
The task is to start wc only if there is any input. If there is no input, the program should just exit.
wc is just an example here. It will be substituted with a much more complex command.
The input can be of several TB of data, so I would really like to not touch that data at all (not even with a sysread). I tried doing:
pipe(STDIN,OUT);
But that doesn't work. Is there some other way that I can tell OUT that after it has gotten the first byte, it should just read from STDIN? Maybe some open(">=&2") gymnastics combined with exec?
The FIONREAD ioctl, mentioned in the Perl Cookbook, can tell you how many bytes are pending on a file descriptor without consuming them. In perlish terms:
use strict;
use warnings;
use IO::Select qw( );
BEGIN { require 'sys/ioctl.ph'; }
sub fionread {
my $sz = pack('L', 0);
return unless ioctl($_[0], FIONREAD, $sz);
return unpack('L', $sz);
}
# Wait until it's known whether the handle has data to read or has reached EOF.
IO::Select->new(\*STDIN)->can_read();
if (fionread(\*STDIN)) {
system('wc');
# Check for errors
}
This should be very widely portable to UNIX and UNIX-like platforms.
A child process is always given duplicates of its parent's file handles, so simply starting wc - either with backticks or with a call to system or exec - will cause it to read from the same place as the Perl process's STDIN.
As for starting wc only when there is something to read, it looks like you need IO::Select, which will allow you either to check whether a file handle has something to read, or to block until it does have something.
This program will check whether STDIN has any data waiting, and run wc and print its output if so.
use strict;
use warnings;
use IO::Select;
my $select = IO::Select->new(\*STDIN);
if ( $select->can_read(0) ) {
print `wc`;
}
The parameter to can_read is a timeout in seconds. Passing a value of zero makes it return immediately, reporting true (actually it returns the file handle itself) if there is data waiting, or false (undef) if not.
If you don't pass a parameter then can_read will wait forever until there is something to read, so you can suspend your program and wait for data for wc by writing just
$select->can_read;
print `wc`;
or you could combine the construction of the object to make it even more concise
IO::Select->new(\*STDIN)->can_read;
print `wc`;
Note also that IO::Select works fine with file descriptors too, and as the fileno for STDIN is zero, you could write
my $select = IO::Select->new(0);
but that isn't very descriptive and would need a comment to make sense.
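The timeout behaviour of can_read can be checked with a pipe standing in for STDIN. This is only a sketch; the variable names are arbitrary. It also shows one caveat: a handle at EOF is reported as readable too, so can_read alone does not prove there is data.

```perl
use strict;
use warnings;
use IO::Select;
use IO::Handle;

pipe(my $reader, my $writer) or die "pipe failed: $!";
$writer->autoflush(1);

my $select = IO::Select->new($reader);

# Nothing written yet: a zero timeout returns at once with an empty list.
my @before = $select->can_read(0);

print $writer "x";

# One byte is now pending, so the reader is reported as ready.
my @after = $select->can_read(0);

# Close the writer and drain the byte: the reader is now at EOF,
# which select also reports as "readable".
close $writer;
sysread($reader, my $byte, 1);
my @at_eof = $select->can_read(0);

printf "before: %d, after: %d, at EOF: %d\n",
    scalar @before, scalar @after, scalar @at_eof;
```

With a zero timeout there is also a race: if the producer simply has not written yet, can_read(0) reports nothing even though data is on its way.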
The specific solution in which you're interested is impossible.
As you surely discovered already, you can't easily determine if a file handle has reached EOF without reading from it (although, apparently, it is possible). select(2) will get you close. It will tell you that a handle has reached EOF or has data waiting, but it won't tell you which. This is why you're looking into alternate solutions. Unfortunately, the one you're looking into is just as impossible.
Is there some other way that I can tell OUT that after it has gotten the first byte, it should just read from STDIN?
No. OUT isn't code; it doesn't read anything. It's a variable. Furthermore, it's a variable in the parent. Changing a variable in the parent isn't going to affect the child.
Maybe you meant to ask: Can one tell the child program to start reading from a second handle?
No, generally speaking. You can't go and edit another program's variables. The program would have to be specifically written to accept two file handles and read from one after the other.
Then again, it's possible to obtain a file name for an arbitrary file handle, so all we need is a program that is specifically written to accept two file names and read from one after the other, and that's quite common.
$ echo abcdef | perl -MFcntl -e'
if (sysread(STDIN, $buf, 1)) {
pipe(my $r, my $w);
my $pid = fork();
if (!$pid) {
close($w);
# Clear close-on-exec flag.
my $flags = fcntl($r, Fcntl::F_GETFD, 0);
fcntl($r, Fcntl::F_SETFD, $flags & ~Fcntl::FD_CLOEXEC);
exec("cat", "/proc/$$/fd/".fileno($r), "/proc/$$/fd/".fileno(STDIN));
die $!;
}
close($r);
print($w $buf);
close($w);
waitpid($pid, 0);
}
'
abcdef
(Lots of error checking needed.)
Above, cat was used as an example where your program would be used, but that presents another solution: Why not just use cat? The overhead of cat should be quite minor for an IO-bound program.
use String::ShellQuote qw( shell_quote );
my $cmd1 = shell_quote("cat", "/proc/$$/fd/".fileno($r), "/proc/$$/fd/".fileno(STDIN));
my $cmd2 = ...
exec("$cmd1 | $cmd2");

How to send STDIN(multiple arguments) to external process and work within interactive mode

An external program has an interactive mode that asks for some details. Each argument passed must be confirmed with the return key. So far I have managed to pass an argument to the external process; the problem is that when more than one argument is passed, Perl executes them all only when the pipe is closed.
That is impractical in interactive mode, where arguments have to be passed one by one.
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open2;
open(HANDLE, "|cmd|");
print HANDLE "time /T\n";
print HANDLE "date /T\n";
print HANDLE "dir\n";
close HANDLE;
Unfortunately you can't pass double pipes into open as one would like, and loading IPC::Open2 doesn't fix that. You have to use the open2 function exported by IPC::Open2.
use strict;
use warnings;
use IPC::Open2;
use IO::Handle; # so we can call methods on filehandles
my $command = 'cat';
open2( my $out, my $in, $command ) or die "Can't open $command: $!";
# Set both filehandles to print immediately and not wait for a newline.
# Just a good idea to prevent hanging.
$out->autoflush(1);
$in->autoflush(1);
# Send lines to the command
print $in "Something\n";
print $in "Something else\n";
# Close input to the command so it knows nothing more is coming.
# If you don't do this, you risk hanging reading the output.
# The command thinks there could be more input and will not
# send an end-of-file.
close $in;
# Read all the output
print <$out>;
# Close the output so the command process shuts down
close $out;
This pattern works if all you have to do is send a command a bunch of lines and then read the output once. If you need to be interactive, it's very very easy for your program to hang waiting for output that is never coming. For interactive work, I would suggest IPC::Run. It's rather overpowered, but it will cover just about everything you might want to do with an external process.
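For comparison, here is a sketch of a line-by-line exchange with open2 that does work, under one important assumption: the child flushes its output after every line. The child here is a made-up Perl one-liner run through $^X (the current perl binary); with a child that buffers its output, the read inside the loop would hang, which is exactly the risk described above.

```perl
use strict;
use warnings;
use IPC::Open2;
use IO::Handle;

# Stand-in child that flushes after every line ($| = 1).
my @child = ($^X, '-e', '$| = 1; while (<STDIN>) { print "Received: $_" }');

my $pid = open2(my $from_child, my $to_child, @child);
$to_child->autoflush(1);

my @replies;
for my $line ("first\n", "second\n") {
    print $to_child $line;         # send one line...
    my $reply = <$from_child>;     # ...and wait for the one-line answer
    push @replies, $reply;
}

close $to_child;                   # EOF tells the child to finish
waitpid($pid, 0);
print @replies;
```

The exchange relies on the child answering every input line with exactly one output line; any other protocol needs select or a module such as IPC::Run.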