I need (would like?) to spawn a slow process from a web app using a Minion queue.
The process - a GLPK solver - can run for a long time but generates progress output.
I'd like to capture that output as it happens and write it somewhere (a database? a log file?) so that it can be played back to the user as a status update inside the web app.
Is that possible? I have no idea (hence no code).
I was exploring Capture::Tiny - its simplicity is nice but I can't tell whether it can report output as it is written.
A basic way is to use a piped open, where you open a pipe to a process that gets forked. Then the child's STDOUT is piped to a filehandle in the parent, or the parent pipes to the child's STDIN, depending on the mode of the open.
use warnings;
use strict;
my @cmd = qw(ls -l .);  # your command
my $pid = open(my $fh, '-|', @cmd) // die "Can't open pipe from @cmd: $!";
while (<$fh>) {
    print;
}
close $fh or die "Error closing pipe from @cmd: $!";
This way the parent receives the child's STDOUT right as it is emitted.†
There is a bit more that you can do with error checking; see open and close in perlfunc, and $? in perlvar. Also, install a handler for SIGPIPE; see perlipc and %SIG in perlvar.
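A minimal sketch of that error handling (the SIGPIPE handler here just warns; what you do in it is up to you):

use warnings;
use strict;

# SIGPIPE arrives when writing to a child that has exited;
# relevant for write pipes (open with '|-')
$SIG{PIPE} = sub { warn "Broken pipe\n" };

my @cmd = qw(ls -l .);
my $pid = open(my $fh, '-|', @cmd) // die "Can't open pipe from @cmd: $!";
print while <$fh>;

# On failure $! is set for a real close error, while a nonzero $?
# means the child itself exited with that (wait-style) status
close $fh or warn $! ? "Error closing pipe: $!"
                     : "Child exited with status " . ($? >> 8) . "\n";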
There are modules that make it far easier to run and manage external commands and, in particular, to check for errors. However, Capture::Tiny and IPC::Run3 use files to transfer the external program's streams, so you get the output only once the program is done. IPC::Run, on the other hand, gives you far more control and power.
To have code executed "... each time some data is read from the child", use a callback:
use warnings;
use strict;
use IPC::Run qw(run);
my @cmd = (
    'perl',
    '-le',
    'STDOUT->autoflush(1); for (qw( abc def ghi )) { print; sleep 1; }'
);

run \@cmd, '>', sub { print $_[0] };
Once you use IPC::Run a lot more is possible, including better error interrogation, setting up a pseudo-tty for the process, and so on. For example, using >pty> instead of > sets up a terminal-like environment, so an external program that drops line buffering when it sees no terminal may switch back to it and provide more timely output. If the demands on how to manage the process grow more complex, the work will be easier with the module.
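For instance, a minimal sketch swapping the redirect in the run call above:

# '>pty>' attaches the child's STDOUT to a pseudo-terminal instead of a
# pipe, so a program that line-buffers only on a tty flushes promptly
run \@cmd, '>pty>', sub { print $_[0] };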
Thanks to ikegami for comments, including the demo @cmd.
† To demonstrate that the parent receives the child's STDOUT as it is emitted, use a command that emits output with delays. For example, instead of ls -l above use
my @cmd = (
    'perl',
    '-le',
    'STDOUT->autoflush(1); for (qw( abc def ghi )) { print; sleep 1; }'
);
This Perl one-liner prints words one second apart, and that is how they wind up on screen.
Related
I created a child process via IPC::Open2.
I need to read from the stdout of this child process line by line.
Problem is, as the stdout of the child process is not connected to a terminal, it's fully buffered and I can't read from it until the process terminates.
How can I flush the output of the child process without modifying its code?
child process code
while (<STDIN>) {
    print "Received : $_";
}
parent process code:
use IPC::Open2;
use Symbol;
my $in = gensym();
my $out = gensym();
my $pid = open2($out, $in, './child_process');
while (<STDIN>) {
    print $in $_;
    my $line = <$out>;
    print "child said : $line";
}
When I run the code, it gets stuck waiting for the output of the child process.
However, if I run it with bc the result is what I expect; I believe bc must flush its output manually.
Note:
If I add $| = 1 at the beginning of the child process, or STDOUT->flush() after printing, the parent process can read from it properly.
However, this is just an example; I must handle programs that don't flush their output themselves.
Unfortunately Perl has no control over the buffering behavior of the programs it executes. Some systems have an unbuffer utility that can do this. If you have access to this tool, you could say
my $pid = open2($out, $in, 'unbuffer ./child_process');
There's a discussion here about the equivalent tools for Windows, but I couldn't say whether any of them are effective.
One way to (try to) deal with buffering is to set up a terminal-like environment for the process, a pseudo-terminal (pty). That is not easy to do in general but IPC::Run has that capability ready for easy use.
Here is the driver; for testing, run it via the at facility so that it has no controlling terminal (or run it from cron):
use warnings;
use strict;
use feature 'say';
use IPC::Run qw(run);
my @cmd = qw(./t_term.pl input arguments);

run \@cmd, '>pty>', sub { say "out: @_" };
#run \@cmd, '>', sub { say "out: @_" };  # no pty
With >pty> it sets up a pseudo-terminal for the STDOUT of the program in @cmd (with > it's a pipe); also see <pty< and the documentation on redirection.
The anonymous sub {} gets called every time there is output from the child, so one can process it as it goes. There are other related options.
The program that is called (t_term.pl) only tests for a terminal
use warnings;
use strict;
use feature 'say';
say "Is STDOUT filehandle attached to a terminal: ",
( (-t STDOUT) ? "yes" : "no" );
sleep 2;
say "bye from $$";
The -t STDOUT (see filetest operators) is a suitable way to check for a terminal in this example. For more/other ways see this post.
The output shows that the called program (t_term.pl) does see a terminal on its STDOUT, even when a driver runs without one (using at, or out of a crontab). If the >pty> is changed to the usual redirection > (a pipe) then there is no terminal.
Whether this solves the buffering problem is clearly up to that program, and to whether it is enough to fool it with a terminal.
Another way around the problem is using unbuffer when possible, as in mob's answer.
I'm working on a library with a test suite that uses Perl open to run its tests. It looks something like this:
open (MYOUT, "$myprog $arg1 $arg2 $arg3 2>&1 |") or die "Bad stuff happened";
What I'd really like to do is to measure the runtime of $myprog. Unfortunately, just grabbing a start time and an end time around the open call only measures roughly how long it takes to start the process.
Is there some way of either forcing the open command to finish the process (and therefore accurately measure time) or perhaps something else that would accomplish the same thing?
Key constraints are that we need to capture (potentially a lot of) STDOUT and STDERR.
Since you open a pipe, you need to time from before opening it until at least after the reading is done:
use warnings;
use strict;
use Time::HiRes qw(gettimeofday tv_interval sleep);
my $t0 = [gettimeofday];
open my $read, '-|', qw(ls -l) or die "Can't open process: $!";
while (<$read>) {
    sleep 0.1;
    print;
}
print "It took ", tv_interval($t0), " seconds\n";
# close pipe and check
or, to time the whole process, take the time after calling close on the pipe (after all reading is done):
my $t0 = [gettimeofday];
open my $read, '-|', qw(ls -l) or die "Can't open process: $!";
# ... while (<$read>) { ... }
close $read or
warn $! ? "Error closing pipe: $!" : "Exit status: $?";
print "It took ", tv_interval($t0), " seconds\n";
The close blocks and waits for the program to finish
Closing a pipe also waits for the process executing on the pipe to exit--in case you wish to look at the output of the pipe afterwards--and implicitly puts the exit status value of that command into $? [...]
For the status check, see the $? variable in perlvar and system in perlfunc.
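For reference, the usual way to unpack $?, following the decomposition shown under system in perlfunc:

if ($? == -1) {
    warn "Failed to execute: $!";
}
elsif ($? & 127) {
    warn sprintf "Child died with signal %d\n", $? & 127;
}
else {
    warn sprintf "Child exited with value %d\n", $? >> 8;
}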
If the timed program forks and doesn't wait for its children in a blocking way, this won't time them correctly.
In that case you need to identify resources that they use (files?) and monitor that.
I'd like to add that external commands should be put together carefully, to avoid shell injection trouble. A good module is String::ShellQuote. See for example this answer and this answer
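A minimal sketch with String::ShellQuote (the $dir value is a made-up stand-in for untrusted input):

use String::ShellQuote qw(shell_quote);

my $dir = 'some untrusted value';          # hypothetical input
my $cmd = shell_quote('ls', '-l', $dir);   # safely quoted for the shell
open my $read, '-|', "$cmd 2>&1" or die "Can't open process: $!";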
Using a module for capturing streams would free you from the shell and perhaps open other ways to run and time this more reliably. A good one is Capture::Tiny (and there are others as well).
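A sketch of how that could look with Capture::Tiny, timing the whole run while still collecting both streams (the @cmd values stand in for $myprog and its arguments from the question):

use Capture::Tiny qw(capture);
use Time::HiRes qw(gettimeofday tv_interval);

my @cmd = ($myprog, $arg1, $arg2, $arg3);   # as in the question

my $t0 = [gettimeofday];
# capture() runs the block and returns STDOUT, STDERR, and the block's
# return value; system() with a list bypasses the shell entirely
my ($stdout, $stderr, $exit) = capture { system @cmd };
print "It took ", tv_interval($t0), " seconds\n";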
Thanks to HåkonHægland for comments. Thanks to ikegami for setting me straight, to use close (and not waitpid).
According to the manual, the syntax of IPC::Open3::open3 is
$pid = open3(\*CHLD_IN, \*CHLD_OUT, \*CHLD_ERR, 'some cmd and args', 'optarg', ...);
I am confused regarding the first three parameters. Are these references to typeglobs?
I tried the following:
my $pid=open3("STDIN", "STDOUT", "STDERR", $cmd);
my $pid=open3(*STDIN, *STDOUT, *STDERR, $cmd);
my $pid=open3(\*STDIN, \*STDOUT, \*STDERR, $cmd);
my $pid=open3('<&STDIN', '>&STDOUT', '>&STDERR', $cmd);
but only number 4 seemed to work. According to the manual, I thought number 3 should also work. For example:
use warnings;
use strict;
use feature qw(say);
use IPC::Open3;
my $cmd="sleep 1; echo This STDOUT; sleep 1; echo 'This is STDERR' 1>&2; sleep 1; echo -n 'Reading STDIN: '; read a";
my $pid=open3('<&STDIN', '>&STDOUT', '>&STDERR', $cmd);
say "Waiting for child..";
wait;
say "Done.";
If you pass the strings '<&STDIN' and '>&STDOUT' then the child process gets a duplicate of your Perl program's own standard input and output handles, and can read and write to and from them without any further intervention.
That is a very different thing from specifying filehandles using typeglob references. The CHLD_OUT file handle in the documentation is the STDOUT for the child process, and it allows your Perl program to read from CHLD_OUT so that you can acquire and process the data it is sending. Using STDOUT here won't work because it is the output file handle for your Perl process. You could, if you really wanted, use STDIN, but that would leave you unable to read anything that was originally presented to your standard input.
The equivalent points apply to CHLD_IN, which is a handle that you print to in order to send data to the child process. Again, you could use STDOUT here, but that deprives you of the original standard output channel. In any case you would still have to invent another file handle for CHLD_ERR, because you would be reading from it to see what the child was sending to its standard error output, and of course you cannot read from STDERR.
So the best you could do is to replace
open3(\*CHLD_IN, \*CHLD_OUT, \*CHLD_ERR, 'command')
with
open3(\*STDOUT, \*STDIN, \*CHLD_ERR, 'command')
but file handles aren't costly, so why commit yourself to losing your standard input and output? Much better to create three new file handles and work with all six.
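A sketch of that all-six-handles version; note that open3 autovivifies the first two handles but the error handle must be pre-created (e.g. with gensym), or STDERR gets folded into CHLD_OUT (the 'some_command' name is a placeholder):

use IPC::Open3;
use Symbol qw(gensym);

my $err = gensym;   # must exist beforehand; open3 won't autovivify it
my $pid = open3(my $in, my $out, $err, 'some_command');

print $in "input for the child\n";
close $in;              # signal end-of-input

my @stdout = <$out>;    # caution: draining one stream to the end can
my @stderr = <$err>;    # deadlock if the other fills its pipe buffer

waitpid $pid, 0;        # reap the child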
I have an application which publishes realtime market data rates. This app is invoked from the command line and has an interactive mode where the user can change various parameters on the fly by simply typing the parameter followed by its corresponding value.
e.g. ur 2000
would dynamically set the update rate to 2000 updates per second.
What I need to do is perform some soak testing for several hours/days, and I need to be able to change the update rate to different values at random times. I normally do all my scripting in Perl, so I need a way of invoking the application (easy enough) but then having the script randomly change any given parameter (like the update rate).
Any ideas or advice would be really appreciated.
Thanks very much
You can open a pipe to your program with open my $fh, "|-", ... and then set the handle to autoflush with
select $fh;
$| = 1;
Now you have a direct line to the standard input of your system under test, as in the demonstration below.
#! /usr/bin/env perl
use strict;
use warnings;
no warnings "exec";
my @system_under_test = ("cat");
open my $fh, "|-", @system_under_test or die "$0: open @system_under_test: $!";

select $fh;
$| = 1;  # autoflush

for (map int rand 2000, 1 .. 10) {
    print $fh "ur $_\n";
    sleep int rand 10;
}

close $fh or warn "$0: close: $!";
For your soak test, you would of course want to sleep for more intervals and iterate the loop many more times.
You can use the command "mkfifo", which creates a named pipe. If you start your program with the fifo as its input, it should work.
Create a fifo:
mkfifo MyFifo
Start your application with fifo as input:
./yourAppName < MyFifo
Now everything you write to "MyFifo" (e.g. using echo) will be forwarded to yourAppName's stdin.
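From Perl, writing to the fifo is ordinary file I/O (a sketch; note the open blocks until yourAppName has the fifo open for reading):

use IO::Handle;

open my $fifo, '>', 'MyFifo' or die "Can't open MyFifo: $!";
$fifo->autoflush(1);           # push each command through immediately
print $fifo "ur 2000\n";
close $fifo;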
An external program has an interactive mode that asks for some details. Each passed argument must be confirmed with the return key. So far I have managed to pass an argument to the external process; however, the problem I'm facing is that when more than one argument is passed, Perl delivers them all only when you close the pipe.
That's impractical in an interactive mode where arguments are passed one by one.
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open2;
open(HANDLE, "|cmd|");
print HANDLE "time /T\n";
print HANDLE "date /T\n";
print HANDLE "dir\n";
close HANDLE;
Unfortunately you can't pass double pipes into open as one would like, and loading IPC::Open2 doesn't fix that. You have to use the open2 function exported by IPC::Open2.
use strict;
use warnings;
use IPC::Open2;
use IO::Handle; # so we can call methods on filehandles
my $command = 'cat';
my $pid = open2( my $out, my $in, $command );  # open2 raises an exception on failure
# Set both filehandles to print immediately and not wait for a newline.
# Just a good idea to prevent hanging.
$out->autoflush(1);
$in->autoflush(1);
# Send lines to the command
print $in "Something\n";
print $in "Something else\n";
# Close input to the command so it knows nothing more is coming.
# If you don't do this, you risk hanging reading the output.
# The command thinks there could be more input and will not
# send an end-of-file.
close $in;
# Read all the output
print <$out>;
# Close the output so the command process shuts down
close $out;
This pattern works if all you have to do is send a command a bunch of lines and then read the output once. If you need to be interactive, it's very very easy for your program to hang waiting for output that is never coming. For interactive work, I would suggest IPC::Run. It's rather overpowered, but it will cover just about everything you might want to do with an external process.
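For instance, a sketch of an interactive exchange with IPC::Run's start/pump/finish, using bc as a stand-in for the external program:

use IPC::Run qw(start pump finish timeout);

my ($in, $out) = ('', '');
my $h = start ['bc'], \$in, \$out, timeout(10);

$in .= "2+2\n";                # queue a line of input
pump $h until $out =~ /\n/;    # drive the child until a reply arrives
print "bc said: $out";

finish $h;                     # close pipes and reap the child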