Perl - output from external process directly to stdout (avoid buffering) - perl

I have a Perl script that has to wrap a PHP script that produces a lot of output, and takes about half an hour to run.
At moment I'm shelling out with:
print `$command`;
This works in the sense that the PHP script is called, and it does it's job, but, there is no output rendered by Perl until the PHP script finishes half an hour later.
Is there a way I could shell out so that the output from PHP is printed by perl as soon as it receives it?

The problem is that Perl's not going to finish reading until the PHP script terminates, and only when it finishes reading will it write. The backticks operator blocks until the child process exits, and there's no magic to make a read/write loop implicitly.
So you need to write one. Try a piped open:
open my $fh, '-|', $command or die 'Unable to open';
while (<$fh>) {
print;
}
close $fh;
This should then read each line as the PHP script writes it, and immediately output it. If the PHP script doesn't output in convenient lines and you want to do it with individual characters, you'll need to look into using read to get data from the file handle, and disable output buffering ($| = 1) on stdout for writing it.
See also http://perldoc.perl.org/perlipc.html#Using-open()-for-IPC

Are you really doing print `$command`?
If you are only running a command and not capturing any of its output, simply use system $command. It will write to stdout directly without passing through Perl.

You might want to investigate Capture::Tiny. IIRC something like this should work:
use strict;
use warnings;
use Capture::Tiny qw/tee/;
my ($stdout, $stderr, #result) = tee { system $command };
Actually, just using system might be good enough, YMMV.

Related

IPC::Open2 middleware

I would like to make a wrapper that would take data from STDIN and pass it to another script, wait for his STDOUT response and output it to STDOUT on the parent side.
I have the following code but it does not seems to work:
test.pl
#!/usr/bin/perl
#
use IPC::Open2;
$pid = open2( \*RDR, \*WTR, '/usr/bin/perl test2.pl');
while (<STDIN>) {
print WTR;
}
while (<RDR>) {
print STDOUT;
}
and on the test2.pl, i have:
#!/usr/bin/perl
#
while (<STDIN>) {
print STDOUT;
}
It seems to write to test2.pl but i have no feedback from test2.pl.
Any hints?
Thank's,
You should close WTR when you are done reading from STDIN. Your external command will keep expecting input until you do this, and if you are suffering from buffering, your external program won't terminate and it won't output anything.
You are probably "suffering from buffering" in both your primary script and in your external command.
In your test script, you can add $|=1 to the top of the script to make its output more responsive. You might not be able to affect the output buffering of an arbitrary external command, though.
Update: IPC::Open2 already sets autoflush on the write filehandle, so the external command won't be starved for input.

Reopen STDERR/STDOUT to write to combined logfile with timestamps

I basically want to reopen STDERR/STDOUT so they write to one logfile with both the stream and the timestamp included on every line. So print STDERR "Hello World" prints STDERR: 20130215123456: Hello World. I don't want to rewrite all my print statements into function calls, also some of the output will be coming from external processes via system() calls anyway which I won't be able to rewrite.
I also need for the output to be placed in the file "live", i.e. not only written when the process completes.
(p.s. I'm not asking particularly for details of how to generate timestamps, just how to redirect to a file and prepend a string)
I've worked out the following code, but it's messy:
my $mode = ">>";
my $file = "outerr.txt";
open(STDOUT, "|-", qq(perl -e 'open(FILE, "$mode", "$file"); while (<>) { print FILE "STDOUT: \$\_"; }'));
open(STDERR, "|-", qq(perl -e 'open(FILE, "$mode", "$file"); while (<>) { print FILE "STDERR: \$\_"; }'));
(The above doesn't add dates, but that should be trivial to add)
I'm looking for a cleaner solution, one that doesn't require quoting perl code and passing it on the command line, or at least module that hides some of the complexity. Looking at the code for Capture::Tiny it doesn't look like it can handle writing a part of output, though I'm not sure about that. annotate-output only works on an external command sadly, I need this to work on both external commands and ordinary perl printing.
The child launched via system doesn't write to STDOUT because it does not have access to variables in your program. Therefore, means having code run on a Perl file handle write (e.g. tie) won't work.
Write another script that runs your script with STDOUT and STDERR replaced with pipes. Read from those pipes and print out the modified output. I suggest using IPC::Run to do this, because it'll save you from using select. You can get away without it if you combine STDOUT and STDERR in one stream.

Is There Any Way to Pipe Data from Perl to a Unix Command Line Utility

I have a command line utility from a third party (it's big and written in Java) that I've been using to help me process some data. This utility expects information in a line delimited file and then outputs processed data to STDOUT.
In my testing phases, I was fine with writing some Perl to create a file full of information to be processed and then sending that file to this third party utility, but as I'm nearing putting this code in production, I'd really prefer to just pipe data to this utility directly instead of first writing that data to a file as this would save me the overhead of having to write unneeded information to disk. I recently asked on this board how I could do this in Unix, but have since realized that it would be incredibly more convenient to actually run it directly out of a Perl module. Perhaps something like:
system(bin/someapp do-action --option1 some_value --input $piped_in_data)
Currently I call the utility as follows:
bin/someapp do-action --option1 some_value --input some_file
Basically, what I want is to write all my data either to a variable or to STDOUT and then to pipe it to the Java app through a system call in the SAME Perl script or module. This would make my code a lot more fluid. Without it, I'd wind up needing to write a Perl script which calls a bash file half way through which in turn would need to call another Perl script to prep data. If at all possible I'd love to just stay in Perl the whole way through. Any ideas?
If I am reading your question correctly, you are wanting to spawn a process and be able to both write to its stdin and read from its stdout. If that is the case, then IPC::Open2 is exactly what you need. (Also see IPC::Open3 you also need to read from the process' stderr.)
Here is some sample code. I have marked the areas you will have to change.
#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open2;
# Sample data -- ignore this.
my #words = qw(the quick brown fox jumped over the lazy dog);
# Automatically reap child processes. This is important when forking.
$SIG{'CHLD'} = 'IGNORE';
# Spawn the external process here. Change this to the process you need.
open2(*READER, *WRITER, "wc -c") or die "wc -c: $!";
# Fork into a child process. The child process will write the data, while the
# parent process reads data back from the process. We need to fork in case
# the process' output buffer fills up and it hangs waiting for someone to read
# its output. This could cause a deadlock.
my $pid;
defined($pid = fork()) or die "fork: $!";
if (!$pid) {
# This is the child.
# Close handle to process' stdout; the child doesn't need it.
close READER;
# Write out some data. Change this to print out your data.
print WRITER $words[rand(#words)], " " for (1..100000);
# Then close the handle to the process' stdin.
close WRITER;
# Terminate the child.
exit;
}
# Parent closes its handle to the process' stdin immediately! As long as one
# process has an open handle, the program on the receiving end of the data will
# never see EOF and may continue waiting.
close WRITER;
# Read in data from the process. Change this to whatever you need to do to
# process the incoming data.
print "READ: $_" while (<READER>);
# Close the handle to the process' stdin. After this call, the process should
# be finished executing and will terminate on its own.
close READER;
If it only accepts files, let it open "/proc/self/fd/0", which is the same as STDIN. For the rest, see cdhowies answer.
If all you want to do is pipe the STDOUT from your program into your other program's STDIN, you can do this via the standard Perl open command:
open (CMD, "|$command") or die qq(Couldn't execute $command for piping);
Then, all you have to do to send data to this command is to use the print statement:
print CMD $dataToCommand;
And, you finally close your pipe with the close statement:
close (CMD);
PERL HINT
Perl has a command called perldoc which can give you the documentation of any Perl function or Perl module installed on your system. To get more information about the open command, type:
$ perldoc -f open
The -f says this is a Perl function
If you're doing what cdhowie said in his answer, (you're spawning a process, then reading and writing to that process), you will need IPC::Open2. To get information about the IPC::Open2 module, type:
$ perldoc IPC::Open2

How can I capture the stdin and stdout of system command from a Perl script?

In the middle of a Perl script, there is a system command I want to execute. I have a string that contains the data that needs to be fed into stdin (the command only accepts input from stdin), and I need to capture the output written to stdout. I've looked at the various methods of executing system commands in Perl, and the open function seems to be what I need, except that it looks like I can only capture stdin or stdout, not both.
At the moment, it seems like my best solution is to use open, redirect stdout into a temporary file, and read from the file after the command finishes. Is there a better solution?
IPC::Open2/3 are fine, but I've found that usually all I really need is IPC::Run3, which handles the simple cases really well with minimal complexity:
use IPC::Run3; # Exports run3() by default
run3( \#cmd, \$in, \$out, \$err );
The documentation compares IPC::Run3 to other alternatives. It's worth a read even if you don't decide to use it.
The perlipc documentation covers many ways that you can do this, including IPC::Open2 and IPC::Open3.
Somewhere at the top of your script, include the line
use IPC::Open2;
That will include the necessary module, usually installed with most Perl distributions by default. (If you don't have it, you could install it using CPAN.) Then, instead of open, call:
$pid = open2($cmd_out, $cmd_in, 'some cmd and args');
You can send data to your command by sending it to $cmd_in and then read your command's output by reading from $cmd_out.
If you also want to be able to read the command's stderr stream, you can use the IPC::Open3 module instead.
IPC::Open3 would probably do what you want. It can capture STDERR and STDOUT.
http://metacpan.org/pod/IPC::Open3
A very easy way to do this that I recently found is the IPC::Filter module. It lets you do the job extremely intuitively:
$output = filter $input, 'somecmd', '--with', 'various=args', '--etc';
Note how it invokes your command without going through the shell if you pass it a list. It also does a reasonable job of handling errors for common utilities. (On failure, it dies, using the text from STDERR as its error message; on success, STDERR is just discarded.)
Of course, it’s not suitable for huge amounts of data since it provides no way of doing any streaming processing; also, the error handling might not be granular enough for your needs. But it makes the many simple cases really really simple.
I think you want to take a look at IPC::Open2
There is a special perl command for it
open2()
More info can be found on: http://sunsite.ualberta.ca/Documentation/Misc/perl-5.6.1/lib/IPC/Open2.html
If you do not want to include extra packages, you can just do
open(TMP,">tmpfile");
print TMP $tmpdata ;
open(RES,"$yourcommand|");
$res = "" ;
while(<RES>){
$res .= $_ ;
}
which is the contrary of what you suggested, but should work also.
I always do it this way if I'm only expecting a single line of output or want to split the result on something other than a newline:
my $result = qx( command args 2>&1 );
my $rc=$?;
# $rc >> 8 is the exit code of the called program.
if ($rc != 0 ) {
error();
}
If you want to deal with a multi-line response, get the result as an array:
my #lines = qx( command args 2>&1 );
foreach ( my $line ) (#lines) {
if ( $line =~ /some pattern/ ) {
do_something();
}
}

How do I run a Perl script from within a Perl script?

I've got a Perl script that needs to execute another Perl script. This second script can be executed directly on the command line, but I need to execute it from within my first program. I'll need to pass it a few parameters that would normally be passed in when it's run standalone (the first script runs periodically, and executes the second script under a certain set of system conditions).
Preliminary Google searches suggest using backticks or a system() call. Are there any other ways to run it? (I'm guessing yes, since it's Perl we're talking about :P ) Which method is preferred if I need to capture output from the invoked program (and, if possible, pipe that output as it executes to stdout as though the second program were invoked directly)?
(Edit: oh, now SO suggests some related questions. This one is close, but not exactly the same as what I'm asking. The second program will likely take an hour or more to run (lots of I/O), so I'm not sure a one-off invocation is the right fit for this.)
You can just do it.
{
local #ARGV = qw<param1 param2 param3>;
do '/home/buddy/myscript.pl';
}
Prevents the overhead of loading in another copy of perl.
The location of your current perl interpreter can be found in the special variable $^X. This is important if perl is not in your path, or if you have multiple perl versions available but which to make sure you're using the same one across the board.
When executing external commands, including other Perl programs, determining if they actually ran can be quite difficult. Inspecting $? can leave lasting mental scars, so I prefer to use IPC::System::Simple (available from the CPAN):
use strict;
use warnings;
use IPC::System::Simple qw(system capture);
# Run a command, wait until it finishes, and make sure it works.
# Output from this program goes directly to STDOUT, and it can take input
# from your STDIN if required.
system($^X, "yourscript.pl", #ARGS);
# Run a command, wait until it finishes, and make sure it works.
# The output of this command is captured into $results.
my $results = capture($^X, "yourscript.pl", #ARGS);
In both of the above examples any arguments you wish to pass to your external program go into #ARGS. The shell is also avoided in both of the above examples, which gives you a small speed advantage, and avoids any unwanted interactions involving shell meta-characters. The above code also expects your second program to return a zero exit value to indicate success; if that's not the case, you can specify an additional first argument of allowable exit values:
# Both of these commands allow an exit value of 0, 1 or 2 to be considered
# a successful execution of the command.
system( [0,1,2], $^X, "yourscript.pl", #ARGS );
# OR
capture( [0,1,2, $^X, "yourscript.pl", #ARGS );
If you have a long-running process and you want to process its data while it's being generated, then you're probably going to need a piped open, or one of the more heavyweight IPC modules from the CPAN.
Having said all that, any time you need to be calling another Perl program from Perl, you may wish to consider if using a module would be a better choice. Starting another program carries quite a few overheads, both in terms of start-up costs, and I/O costs for moving data between processes. It also significantly increases the difficulty of error handling. If you can turn your external program into a module, you may find it simplifies your overall design.
All the best,
Paul
I can think of a few ways to do this. You already mentioned the first two, so I won't go into detail on them.
backticks: $retVal = `perl somePerlScript.pl`;
system() call
eval
The eval can be accomplished by slurping the other file into a string (or a list of strings), then 'eval'ing the strings. Heres a sample:
#!/usr/bin/perl
open PERLFILE, "<somePerlScript.pl";
undef $/; # this allows me to slurp the file, ignoring newlines
my $program = <PERLFILE>;
eval $program;
4 . do: do 'somePerlScript.pl'
You already got good answers to your question, but there's always the posibility to take a different point of view: maybe you should consider refactoring the script that you want to run from the first script. Turn the functionality into a module. Use the module from the first and from the second script.
If you need to asynchronously call your external script -you just want to launch it and not wait for it to finish-, then :
# On Unix systems, either of these will execute and just carry-on
# You can't collect output that way
`myscript.pl &`;
system ('myscript.pl &');
# On Windows systems the equivalent would be
`start myscript.pl`;
system ('start myscript.pl');
# If you just want to execute another script and terminate the current one
exec ('myscript.pl');
Use backticks if you need to capture the output of the command.
Use system if you do not need to capture the output of the command.
TMTOWTDI: so there are other ways too, but those are the two easiest and most likely.
See the perlipc documentation for several options for interprocess communication.
If your first script merely sets up the environment for the second script, you may be looking for exec.
#!/usr/bin/perl
use strict;
open(OUTPUT, "date|") or die "Failed to create process: $!\n";
while (<OUTPUT>)
{
print;
}
close(OUTPUT);
print "Process exited with value " . ($? >> 8) . "\n";
This will start the process date and pipe the output of the command to the OUTPUT filehandle which you can process a line at a time. When the command is finished you can close the output filehandle and retrieve the return value of the process. Replace date with whatever you want.
I wanted to do something like this to offload non-subroutines into an external file to make editing easier. I actually made this into a subroutine. The advantage of this way is that those "my" variables in the external file get declared in the main namespace. If you use 'do' they apparently don't migrate to the main namespace. Note the presentation below doesn't include error handling
sub getcode($) {
my #list;
my $filename = shift;
open (INFILE, "< $filename");
#list = <INFILE>;
close (INFILE);
return \#list;
}
# and to use it:
my $codelist = [];
$codelist = getcode('sourcefile.pl');
eval join ("", #$codelist);