Opening a pipe and delays in parsing the output in Perl

I am trying to do a fairly simple process using Perl. A snippet of the code:
open(FH,"<command> |") or die "Could not run command .. $!\n";
print "After open\n";
while(<FH>)
{
print "I am inside loop\n";
<process..something>
}
I am seeing some inexplicable delay when the while loop first reads from the handle. The open takes 9-10 ms to run (which is within the expected range), but I see a 200-250 ms delay between the messages "After open" and "I am inside loop".
Has anyone seen anything like this before? Any help would be appreciated.
Thanks
Rajib

This is almost certainly because the output from <command> is buffered until either the buffer fills up or the process terminates.
You can probably get around this using unbuffer, which pretends to the command that it is outputting to a terminal.
Try using this instead:
open my $fh, '-|', 'unbuffer <command>' or die "Could not run command: $!\n";
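For instance, here is a minimal sketch that timestamps each line to confirm the delay disappears. It assumes unbuffer (from the expect package) is installed, and uses vmstat 1 purely as a hypothetical stand-in for <command>:
use strict;
use warnings;
use Time::HiRes qw(time);

my $t0 = time;

# 'vmstat 1' is a stand-in for the real command
open my $fh, '-|', 'unbuffer vmstat 1'
    or die "Could not run command: $!\n";
printf "After open (%.3f s)\n", time() - $t0;

while (<$fh>) {
    printf "[%.3f s] inside loop\n", time() - $t0;
    last if $. >= 5;    # stop after a few lines for the demo
}
close $fh;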

Related

Perl: `die` did not work upon opening a nonexistent gz file using gzip

The following script creates a gzipped file named "input.gz". Then the script attempts to open "input.gz" using gzip -dc. Intuitively, die should be triggered if a wrong input file name is provided. However, as in the following script, the program will not die even if a wrong input file name is provided ("inputx.gz"):
use warnings;
use strict;
system("echo PASS | gzip -c > input.gz");
open(IN,"-|","gzip -dc inputx.gz") || die "can't open input.gz!";
print STDOUT "die statment was not triggered!\n";
close IN;
The output of the script above was
die statement was not triggered!
gzip: inputx.gz: No such file or directory
My question is: why wasn't the die statement triggered even though gzip quit with an error? And how can I make die trigger when a wrong file name is given?
It's buried in perlipc, but this seems relevant (emphasis added):
Be careful to check the return values from both open() and close(). If you're writing to a pipe, you should also trap SIGPIPE. Otherwise, think of what happens when you start up a pipe to a command that doesn't exist: the open() will in all likelihood succeed (it only reflects the fork()'s success), but then your output will fail--spectacularly. Perl can't know whether the command worked, because your command is actually running in a separate process whose exec() might have failed. Therefore, while readers of bogus commands return just a quick EOF, writers to bogus commands will get hit with a signal, which they'd best be prepared to handle.
Use IO::Uncompress::Gunzip to read gzipped files instead.
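A minimal sketch of that approach (IO::Uncompress::Gunzip ships with core Perl; a bad file name fails right at the constructor):
use strict;
use warnings;
use IO::Uncompress::Gunzip qw($GunzipError);

my $z = IO::Uncompress::Gunzip->new('inputx.gz')
    or die "Can't open inputx.gz: $GunzipError\n";    # dies here on a bad name

while (my $line = $z->getline) {
    print $line;
}
$z->close;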
The open documentation is explicit about open-ing a process, since that is indeed different:
If you open a pipe on the command - (that is, specify either |- or -| with the one- or two-argument forms of open), an implicit fork is done, so open returns twice: in the parent process it returns the pid of the child process, and in the child process it returns (a defined) 0. Use defined($pid) or // to determine whether the open was successful.
For example, use either
my $child_pid = open(my $from_kid, "-|") // die "Can't fork: $!";
or
my $child_pid = open(my $to_kid, "|-") // die "Can't fork: $!";
(The documentation continues with code showing one use of this, which you don't need here.) The main point is to check for defined-ness: by design we get undef if open for a process fails, not just any "false".
While this should be corrected, keep in mind that this open call fails only if fork itself fails, which is rare; in most cases when a "command fails" the fork was successful but something later wasn't. So in such cases we never get the // die message, but end up seeing messages from the shell, the command, or the OS, hopefully.
That is alright, though, if informative messages indeed get emitted by some part of the process. Wrap the whole thing in eval and you'll have manageable error reporting.
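For instance, a minimal sketch of that eval wrapping, using gzip -dc as a stand-in command:
use strict;
use warnings;

my $out;
my $ok = eval {
    open(my $in, '-|', 'gzip', '-dc', 'input.gz') // die "Can't fork: $!";
    local $/;                                       # slurp all output
    $out = <$in>;
    close $in or die $! ? "Error closing pipe: $!\n" : "Exit status: $?\n";
    1;
};
warn "Pipeline failed: $@" unless $ok;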
But it is in general difficult to ensure that you get all the right messages, and in some cases it is not possible. One good approach is to use a module for running and managing external commands. Among their many other advantages, they also usually handle errors much more nicely. If you need to handle the process's output right as it is emitted, I recommend IPC::Run (which I'd recommend otherwise as well).
Read what the linked docs say, for specific examples of what you need and for much useful insight.
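For example, a minimal IPC::Run sketch that handles output as it arrives (IPC::Run is a CPAN module, not core; gzip -dc stands in for your command):
use strict;
use warnings;
use IPC::Run qw(run);

my @cmd = ('gzip', '-dc', 'input.gz');

# The sub is called with each chunk of output as it is produced
run \@cmd,
    '>',  sub { print "got: $_[0]" },
    '2>', \my $err
    or die "Command failed (exit " . ($? >> 8) . "): $err";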
In your case:
# Check input, depending on how it is given;
# consider String::ShellQuote if needed
my $file = ...;

my @cmd = ('gzip', '-dc', $file);

my $child_pid = open(my $in, '-|', @cmd)
    // die "Can't fork for '@cmd': $!";

while (<$in>) {
    ...
}

close $in or die "Error closing pipe: $!";
Note a few other points:
- the "list form" of the command bypasses the shell
- a lexical filehandle (my $fh) is much better than a typeglob (IN)
- print the actual error in the die statement, via the $! variable
- check close for a good final check on how it all went

Measure the runtime of a program run via perl "open"

I'm working on a library with a test suite that uses Perl's open to run its tests. It looks something like this:
open (MYOUT, "$myprog $arg1 $arg2 $arg3 2>&1 |") or die "Bad stuff happened";
What I'd really like to do is to measure the runtime of $myprog. Unfortunately, just grabbing a start time and end time around the open command just grabs roughly how long it takes to start up the process.
Is there some way of either forcing the open command to finish the process (and therefore accurately measure time) or perhaps something else that would accomplish the same thing?
Key constraints are that we need to capture (potentially a lot of) STDOUT and STDERR.
Since you open a pipe, you need to time from before opening it until at least after the reading is done:
use warnings;
use strict;
use Time::HiRes qw(gettimeofday tv_interval sleep);

my $t0 = [gettimeofday];

open my $read, '-|', qw(ls -l) or die "Can't open process: $!";

while (<$read>) {
    sleep 0.1;    # simulate per-line processing time
    print;
}
print "It took ", tv_interval($t0), " seconds\n";
# close pipe and check
Or, to time the whole process, measure after calling close on the pipe (after all reading is done):
my $t0 = [gettimeofday];

open my $read, '-|', qw(ls -l) or die "Can't open process: $!";
# ... while (<$read>) { ... }

close $read or
    warn $! ? "Error closing pipe: $!" : "Exit status: $?";

print "It took ", tv_interval($t0), " seconds\n";
The close blocks and waits for the program to finish:
Closing a pipe also waits for the process executing on the pipe to exit--in case you wish to look at the output of the pipe afterwards--and implicitly puts the exit status value of that command into $? [...]
For the status check, see the $? variable in perlvar and the system documentation.
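A slightly expanded sketch of that status check, assuming the $read pipe handle from above:
close $read or do {
    if ($!) {
        warn "Error closing pipe: $!";
    }
    else {
        my $exit   = $? >> 8;     # child's exit code
        my $signal = $? & 127;    # signal that killed it, if any
        warn "Command exited with code $exit (signal $signal)";
    }
};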
If the timed program forks and doesn't wait on its children in a blocking way, this won't time them correctly.
In that case you need to identify resources that the children use (files?) and monitor those.
I'd like to add that external commands should be assembled carefully, to avoid shell-injection trouble. A good module for that is String::ShellQuote.
Using a module for capturing streams would free you from the shell and perhaps open other ways to run and time this more reliably. A good one is Capture::Tiny (and there are others as well).
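For instance, a minimal Capture::Tiny sketch that times the command and captures both streams (ls -l stands in for the real program):
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);
use Capture::Tiny qw(capture);

my $t0 = [gettimeofday];

# capture() returns the block's STDOUT, STDERR, and return value
my ($stdout, $stderr, $status) = capture {
    system('ls', '-l');    # system returns $?
};

print "It took ", tv_interval($t0), " seconds\n";
print "Exit code: ", $status >> 8, "\n";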
Thanks to HåkonHægland for comments. Thanks to ikegami for setting me straight about using close (and not waitpid).

simply tee in Perl without fork, File::Tee, or piping to tee

Is there a simple way in Perl to send STDOUT or STDERR to multiple places without forking, using File::Tee, or opening a pipe to /usr/bin/tee?
Surely there is a way to do this in pure Perl without writing 20+ lines of code, right? What am I missing? Similar questions have been asked, both here on SO and elsewhere, but none of the answers satisfy the requirement that I not have to
- fork
- use File::Tee / IO::Tee / some other module+dependencies whose code footprint is 1000x larger than my actual script
- open a pipe to the actual tee command
I can see the use of a Core module as a tradeoff here, but really is that needed?
It looks like I can simply do this:
BEGIN {
    open my $log, '>>', 'error.log' or die $!;
    $SIG{__WARN__} = sub { print $log @_ and print STDERR @_ };
    $SIG{__DIE__}  = sub { warn @_ and exit 1 };
}
This simply and effectively sends most error messages both to the original STDERR and to a log file (apparently stuff trapped in an eval doesn't show up, I'm told). So there are downsides to this, mentioned in the comments. But as mentioned in the original question, the need was specific. This isn't meant for reuse. It's for a simple, small script that will never be more than 100 lines long.
If you are looking for a way to do this that isn't a "hack", the following was adapted from http://grokbase.com/t/perl/beginners/096pcz62bk/redirecting-stderr-with-io-tee
use IO::Tee;
open my $save_stderr, '>&STDERR' or die $!;
close STDERR;
open my $error_log, '>>', 'error.log' or die $!;
*STDERR = IO::Tee->new( $save_stderr, $error_log ) or die $!;
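After the reassignment, anything written to STDERR reaches both destinations:
print STDERR "this goes to the terminal and to error.log\n";
warn "so does this warning\n";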

Perl - Show progress during execution of a system() command

I am trying to write a Perl CGI which executes an RKHunter scan. While executing the command, I would like to show something to indicate progress instead of the actual output, which is to be redirected to another file. The code thus far is:
open(my $quik_rk, '-|', 'rkhunter', '--enable', '"known_rkts"')
    or print "ERROR RUNNING QUICK ROOTKIT CHECK!!";

while (<$quik_rk>) {
    print ".";
}
print "\n";
close($quik_rk);
This doesn't show any output, and I am presented with a blank screen while waiting for execution to complete. All the dots are printed to the screen together instead of one by one. Moreover, when I use the following to redirect, the command doesn't execute at all:
open(my $quik_rk, '-|', 'rkhunter', '--enable', '"known_rkts"', '>>', '/path/to/file') or print "ERROR RUNNING QUICK ROOTKIT CHECK!!";
How can I fix this in such a way that the verbose output is redirected to a file and only a .... steadily progresses on the screen?
$|=1;
At the beginning of your script.
This turns autoflush on so every print actually prints instead of waiting for a newline before flushing the buffer.
Also see: http://perldoc.perl.org/perlvar.html#Variables-related-to-filehandles
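Putting it together, a minimal sketch that keeps the dots on screen and writes the verbose output to a file from Perl. Note that in the list form of open the '>>' is passed to rkhunter as a literal argument rather than doing a shell redirect, which is why your second version doesn't run; the extra quotes around known_rkts would also be passed literally, so they are dropped here (an assumption about what rkhunter expects):
use strict;
use warnings;

$| = 1;    # autoflush STDOUT so each dot appears immediately

open my $log, '>>', '/path/to/file'
    or die "Can't open log file: $!";
open my $quik_rk, '-|', 'rkhunter', '--enable', 'known_rkts'
    or die "ERROR RUNNING QUICK ROOTKIT CHECK: $!";

while (<$quik_rk>) {
    print {$log} $_;    # verbose output goes to the file
    print '.';          # progress indicator on screen
}
print "\n";
close $quik_rk;
close $log;
If rkhunter block-buffers its own output when writing to a pipe, the dots will still arrive in bursts; the unbuffer trick from the first answer applies here too.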

How do I receive command output immediately?

I'm using Perl's backticks syntax to run some commands.
I would like the output of the command to be written to a file and also printed to stdout.
I can accomplish the first by adding a > at the end of my backticked string, but I do not know how to make the output be printed as soon as it is generated. If I do something like
print `command`;
the output is printed only after command finished executing.
Thanks,
Dave
You cannot do it with the backticks, as they return to the Perl program only when the execution has finished.
So,
print `command1; command2; command3`;
will wait until command3 finishes to output anything.
You should use a pipe instead of backticks to be able to get output immediately:
open (my $cmds, "-|", "command1; command2; command3") or die "Can't run commands: $!";
while (<$cmds>) {
    print;
}
close $cmds;
After you've done this, you then have to decide whether or not you want buffering (depending on how immediate you want the output to be): see "Suffering from buffering?".
To print and store the output, you can open() a file to write the output to:
open (my $cmds, "-|", "command1; command2; command3") or die "Can't run commands: $!";
open (my $outfile, ">", "file.txt") or die "Can't open file.txt: $!";
while (<$cmds>) {
    print;
    print $outfile $_;
}
close $cmds;
close $outfile;
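If you tail file.txt while the commands are still running, you may also want autoflush on the handles; a minimal sketch:
use IO::Handle;           # autoflush method on lexical handles (older perls)
$outfile->autoflush(1);   # flush each line to file.txt immediately
STDOUT->autoflush(1);     # and to the screen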