I can use <> to loop through the pipelined input to a Perl program. But how can I decide whether there is pipelined input? If there is none, I will use an environment variable to load a file. I am trying to use:
my @lines = (<>);
if ($#lines == -1) {
    use setenv;
    open FILE, "$ENV{'ART_FILE_LIST'}" or die $!;
    @lines = <FILE>;
}
Obviously it doesn't work, because the program just waits at the first line for input.
use 5.010_000;
use utf8;
use strict;
use autodie;
use warnings qw< FATAL all >;
use open qw< :std :utf8 >;
END {
    close(STDOUT)
        || die "can't close stdout: $!";
}

if (@ARGV == 0 && -t STDIN) {
    # NB: This is magic open, so the envariable
    # could hold a pipe, like 'cat -n /some/file |'
    @ARGV = $ENV{ART_FILE_LIST}
        || die q(need $ART_FILE_LIST envariable set);
}

while (<>) {
    # blah blah blah
}
You can use the -t operator to see if you are a terminal, i.e., not a pipeline:
if (-t STDIN) { print "Terminal\n" }
else { print "Not a terminal\n" }
Use Getopt::Long
perl -MMylib -e 'Mylib::do_stuff' --i_am_pipe_lined
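For instance, a minimal sketch (Mylib, do_stuff, and the switch name here are hypothetical, mirroring the one-liner above):

#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long;

# Let the caller state explicitly whether input arrives on a pipeline.
GetOptions('i_am_pipe_lined' => \my $pipelined)
    or die "usage: $0 [--i_am_pipe_lined]\n";

if ($pipelined) {
    print while <STDIN>;    # read the pipeline
} else {
    print "load from \$ENV{ART_FILE_LIST} instead\n";
}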
One of the things about UNIX pipelines is that they achieve their usefulness by not caring what's before them or after them. They just have a job to do and they do it. They do one thing, simply, but they all have switches to do their simple job with a little more customization.
Related
I want to write:
... | my_filter | myperlprogram
But I do not know how to run my_filter until I have started myperlprogram.
Can I somehow in myperlprogram loop STDIN through my_filter before reading it?
I am thinking something like:
pipe(my $reader, my $writer) or die "pipe: $!";
if (not fork()) {
    # child: send its STDOUT into the pipe, then become my_filter
    close $reader;
    open STDOUT, '>&', $writer or die "dup STDOUT: $!";
    exec "my_filter --with the correct --options";
} else {
    # parent: take the filter's output as the new STDIN
    close $writer;
    open STDIN, '<&', $reader or die "dup STDIN: $!";
}
# continue reading STDIN now looped through `my_filter`
It's not at all clear from the description why a simple
open STDIN, '-|', 'your_filter', '--option1', ...
will not do.
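Spelled out as a complete program (my_filter and --option1 are stand-ins for the real filter command), that suggestion might look like:

#!/usr/bin/perl
use strict;
use warnings;

# Reopen our own STDIN through the filter. The forked child inherits the
# original STDIN, so the filter still sees the upstream pipeline.
open STDIN, '-|', 'my_filter', '--option1'
    or die "cannot start my_filter: $!";

while (<STDIN>) {
    print "filtered: $_";
}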
The way I see the problem: filter the script's STDIN by using an external program, run from inside the script once the script is already running (so, not with a shell pipeline). With IPC::Run:
use warnings;
use strict;
use feature 'say';
use IPC::Run qw(start pump finish);
my $filtered_in;
FILTER_IN: {
    my @cmd = qw(a_filter.pl);  # add filter's options/arguments
    my $h = start \@cmd, \my $in, \$filtered_in;
    while (<>) {
        $in = $_;
        pump $h while length $in;
        # Wait for filter's output -- IF WANT to process lines as received
        pump $h until $filtered_in =~ /\n\z/;
        chomp $filtered_in;     # process/use filter's output
        $filtered_in .= '|';    # as it's coming (if needed)
    }
    finish $h or die "Cleanup returned: $?";
};
say $filtered_in // 'no input';
This allows one to process the filter's lines of output as they are emitted. If that is not needed, and we only want to accumulate the filter's output for later, then you don't need the code under # Wait for...
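For that accumulate-only case, a shorter sketch using IPC::Run's run (same hypothetical a_filter.pl) could be:

use warnings;
use strict;
use feature 'say';
use IPC::Run qw(run);

# Slurp all input first, run the filter once, collect all of its output.
my $all_in = do { local $/; <> } // '';
run [ 'a_filter.pl' ], \$all_in, \my $filtered_in
    or die "Filter exited with $?";
say $filtered_in;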
Simplest test with a_filter.pl such as
use warnings;
use strict;
STDOUT->autoflush(1);
my $cnt = 0;
while (<>) { print "line ", ++$cnt, ": ", $_ }
and then run
echo "a\nfew\nlines" | script.pl
with output
line 1: a|line 2: few|line 3: lines|
from our toy processing in script.pl above.
This will filter input via a file as well,
script.pl < input.txt
The following code works as expected:
use feature qw(say);
use strict;
use warnings;
use open qw/:std IN :encoding(utf-8) OUT :utf8/;
say join ' ', (PerlIO::get_layers(\*STDOUT));
my $pid = fork();
die "fork() failed: $!" unless defined $pid;
if ($pid == 0) {
    say join ' ', (PerlIO::get_layers(\*STDOUT));
}
Output:
unix perlio utf8
unix perlio utf8
But if I use a daemon process instead of a regular fork:
use feature qw(say);
use strict;
use warnings;
use open qw/:std IN :encoding(utf-8) OUT :utf8/;
use Cwd qw(getcwd);
use Proc::Daemon;
my $work_dir = getcwd;
my $daemon = Proc::Daemon->new(
    work_dir     => $work_dir,
    child_STDOUT => 'stdout.txt',
    child_STDERR => 'stderr.txt',
    pid_file     => 'pid.txt',
);
my $pid = $daemon->Init();
if ( $pid == 0 ) {
    say join ' ', (PerlIO::get_layers(\*STDOUT));
}
The output to the file stdout.txt is:
unix perlio
so the utf8 IO layer has been stripped off.
Consider this example:
Foo.pm:
package Foo;
use warnings;
use strict;
sub test {
    close STDIN;
    open \*STDIN, "<", "/dev/null";
}
1;
test.pl:
#!/usr/bin/perl
use strict;
use warnings;
use feature qw/say/;
use open qw/:std IN :encoding(UTF-8) OUT :encoding(UTF-8)/;
use lib qw/./;
use Foo;
$, = ' ';
say "Original STDIN:", PerlIO::get_layers(\*STDIN);
close STDIN;
open \*STDIN, "<", "/dev/null";
say "Reopened STDIN:", PerlIO::get_layers(\*STDIN);
Foo::test();
say "Reopened STDIN in different package:", PerlIO::get_layers(\*STDIN);
Results:
$ perl test.pl
Original STDIN: unix perlio encoding(utf-8-strict) utf8
Reopened STDIN: unix perlio encoding(utf-8-strict) utf8
Reopened STDIN in different package: unix perlio
Looks like use open is like other lexically scoped pragmas and only applies to the file it's in. So when Proc::Daemon closes standard input, output, and error, and reopens them, it naturally doesn't see the extra layers.
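A plausible workaround (my sketch, not something from the original post) is to re-apply the layer by hand in the daemonized child, since the pragma's defaults do not follow handles reopened in another file:

my $pid = $daemon->Init();
if ( $pid == 0 ) {
    # re-add the :utf8 layer that Proc::Daemon's reopen stripped off
    binmode STDOUT, ':utf8' or die "binmode STDOUT: $!";
    say join ' ', (PerlIO::get_layers(\*STDOUT));  # unix perlio utf8 again
}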
I am writing a simple script using IPC::Open3. The script produces no output to either stdout or stderr, while I would expect output to both.
The complete source code:
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;
use IPC::Open3;
pipe my $input, my $output or die $!;
my $pid = open3(\*STDIN, $output, \*STDERR, 'dd', 'if=/usr/include/unistd.h') or die $!;
while (<$input>) {
    print $_."\n";
}
waitpid $pid, 0;
I am fairly certain that I am using IPC::Open3 incorrectly. However, I am still confused as to what I should be doing.
It's the pipe. Without knowing why it's there I can't say more. This works fine.
my $reader;
my $pid = open3(\*STDIN, $reader, \*STDERR, 'dd', 'if=/usr/include/unistd.h');
while (<$reader>) {
    print $_."\n";
}
waitpid $pid, 0;
I realize it's probably just an example, but in case it's not... this is complete overkill for what you're doing. You can accomplish that with backticks.
print `dd if=/usr/include/unistd.h`
IPC::Open3 is a bit overcomplicated. There are better modules such as IPC::Run and IPC::Run3.
use strict;
use warnings;
use IPC::Run3;
run3(['perl', '-e', 'warn "Warning!"; print "Normal\n";'],
    \*STDIN, \*STDOUT, \*STDERR
);
Your program suffers from the following problems:
\*STDIN (open STDIN as a pipe tied to the child's STDIN) should be <&STDIN (use the parent's STDIN as the child's STDIN).
\*STDERR (open STDERR as a pipe tied to the child's STDERR) should be >&STDERR (use the parent's STDERR as the child's STDERR).
The value you place in $output is being ignored and overwritten. Fortunately, it's being overwritten with a correct value!
You use print $_."\n";, but $_ is already newline-terminated. Either chomp first, or don't add a newline.
open3 isn't a system call, so it doesn't set $!.
open3 doesn't return false on error; it throws an exception.
So we get something like:
#!/usr/bin/env perl
use strict;
use warnings;
use feature qw( say );
use IPC::Open3;
my $pid = open3('<&STDIN', my $output, '>&STDERR',
    'dd', 'if=/usr/include/unistd.h');
while (<$output>) {
    chomp;
    say "<$_>";
}
waitpid($pid, 0);
In the following code, if there is a space between FILE and ( in the printf statement, like
printf FILE ("Test string inline\n");
Perl will treat FILE as a filehandle; otherwise
printf FILE("Test string inline\n");
will be treated as a subroutine call (if no subroutine named FILE is defined, Perl will throw an error: Undefined subroutine &main::FILE called at ./test.pl line xx). Isn't there a better way Perl could implement this? (Maybe this is why bareword filehandles are considered outdated?)
#!/usr/bin/perl
use warnings;
open(FILE,">test.txt");
printf FILE ("Test string inline\n");
close(FILE);
sub FILE
{
    return("Test string subroutine\n");
}
Are you asking how to avoid triggering that error by accident? You could wrap the handle in curlies
printf({ HANDLE } $pattern, @args);
print({ HANDLE } @args);
say({ HANDLE } @args);
Or since parens are often omitted for say, print and printf,
printf { HANDLE } $pattern, @args;
print { HANDLE } @args;
say { HANDLE } @args;
Or you could use a method call
HANDLE->printf($pattern, @args);
HANDLE->print(@args);
HANDLE->say(@args);
Try:
#!/usr/bin/env perl
use strict;
use warnings;
# --------------------------------------
use charnames qw( :full :short );
use English qw( -no_match_vars ) ; # Avoids regex performance penalty
my $test_file = 'test.txt';
open my $test_fh, '>', $test_file or die "could not open $test_file: $OS_ERROR\n";
printf {$test_fh} "Test string inline" or die "could not print $test_file: $OS_ERROR\n";
close $test_fh or die "could not close $test_file: $OS_ERROR\n";
I need to modify an existing Perl program. I want to pipe a string (which can contain multiple lines) through an external program and read the output from this program. This external program is used to modify the string. Let's simply use cat as a filter program. I tried it like this, but it doesn't work. (Output of cat goes to STDOUT instead of being read by perl.)
#!/usr/bin/perl
open(MESSAGE, "| cat |") or die("cat failed\n");
print MESSAGE "Line 1\nLine 2\n";
my $message = "";
while (<MESSAGE>)
{
    $message .= $_;
}
close(MESSAGE);
print "This is the message: $message\n";
I've read that this isn't supported by Perl because it may end up in a deadlock, and I can understand it. But how do I do it then?
You can use IPC::Open3 to achieve bi-directional communication with the child.
use strict;
use IPC::Open3;
my $pid = open3(\*CHLD_IN, \*CHLD_OUT, \*CHLD_ERR, 'cat');

my $r;
for (my $i = 1; $i < 10; $i++) {
    print CHLD_IN "$i\n";
    $r = <CHLD_OUT>;
    print "Got $r from child\n";
}
close CHLD_IN;      # let cat see EOF
waitpid $pid, 0;    # reap the child
This involves system programming, so it’s more than a basic question. As written, your main program doesn’t require full-duplex interaction with the external program. Dataflow travels in one direction, namely
string → external program → main program
Creating this pipeline is straightforward. Perl’s open has a useful mode explained in the “Safe pipe opens” section of the perlipc documentation.
Another interesting approach to interprocess communication is making your single program go multiprocess and communicate between—or even amongst—yourselves. The open function will accept a file argument of either "-|" or "|-" to do a very interesting thing: it forks a child connected to the filehandle you’ve opened. The child is running the same program as the parent. This is useful for safely opening a file when running under an assumed UID or GID, for example. If you open a pipe to minus, you can write to the filehandle you opened and your kid will find it in his STDIN. If you open a pipe from minus, you can read from the filehandle you opened whatever your kid writes to his STDOUT.
This is an open that involves a pipe, which gives nuance to the return value. The perlfunc documentation on open explains.
If you open a pipe on the command - (that is, specify either |- or -| with the one- or two-argument forms of open), an implicit fork is done, so open returns twice: in the parent process it returns the pid of the child process, and in the child process it returns (a defined) 0. Use defined($pid) or // to determine whether the open was successful.
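A tiny sketch of that double return, before building the full pipeline (the child here just prints a fixed line):

my $pid = open(my $fh, "-|") // die "$0: fork: $!";  # fork; no command
if ($pid == 0) {                 # child: its STDOUT feeds $fh in the parent
    print "hello from the child\n";
    exit 0;
}
print while <$fh>;               # parent: read what the child wrote
close $fh or warn "$0: close: $!";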
To create the scaffolding, we work in right-to-left order using open to fork a new process at each step.

1. Your main program is already running.
2. Next, fork a process that will eventually become the external program.
3. Inside the process from step 2:
   - First fork the string-printing process so as to make its output arrive on our STDIN.
   - Then exec the external program to perform its transformation.
4. Have the string-printer do its work and then exit, which kicks up to the next level.
5. Back in the main program, read the transformed result.
With all of that set up, all you have to do is implant your suggestion at the bottom, Mr. Cobb.
#! /usr/bin/env perl
use 5.10.0; # for defined-or and given/when
use strict;
use warnings;
my @transform = qw( tr [A-Za-z] [N-ZA-Mn-za-m] );  # rot13

my @inception = (
    "V xabj, Qnq. Lbh jrer qvfnccbvagrq gung V pbhyqa'g or lbh.",
    "V jnf qvfnccbvagrq gung lbh gevrq.",
);

sub snow_fortress { print map "$_\n", @inception }

sub hotel {
    given (open(STDIN, "-|") // die "$0: fork: $!") {
        snow_fortress when 0;
        exec @transform or die "$0: exec: $!";
    }
}

given (open(my $fh, "-|") // die "$0: fork: $!") {
    hotel when 0;
    print while <$fh>;
    close $fh or warn "$0: close: $!";
}
Thanks for the opportunity to write such a fun program!
You can use the -n command-line switch to effectively wrap your existing program code in a while loop... look at the man page for -n:
LINE:
while (<>) {
    ...  # your program goes here
}
Then you can use the operating system's pipe mechanism directly
cat file | your_perl_prog.pl
(Edit)
I'll try to explain this more carefully...
The question is not clear about what part the perl program plays: filter or final stage. This works in either case, so I will assume it is the latter.
'your_perl_prog.pl' is your existing code. I'll call your filter program 'filter'.
Modify your_perl_prog.pl so that the shebang line has an added '-n' switch: #!/usr/bin/perl -n (an env shebang such as #!/usr/bin/env perl -n may not work, since many systems pass at most one argument on the shebang line)
This effectively puts a while(<>){} loop around the code in your_perl_prog.pl
add a BEGIN block to print the header:
BEGIN { print "HEADER LINE\n"; }
Inside the implicit loop each line arrives in $_; process and print it there (see the sketch after the invocation below)
Then invoke the lot with
cat sourcefile | filter | your_perl_prog.pl
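Putting those steps together, a minimal sketch of what your_perl_prog.pl could look like:

#!/usr/bin/perl -n
# the -n switch wraps the code below in:  LINE: while (<>) { ... }
BEGIN { print "HEADER LINE\n"; }  # runs once, before the implicit loop
print "processed: $_";            # runs for every input line, found in $_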
I want to expand on @Greg Bacon's answer without changing it.
I had to do something similar, but wanted to code without the given/when constructs, and I also found explicit exit() calls were missing: in the sample code the child simply fell through and exited.
I also had to make it work with ActiveState Perl, where that approach does not work.
See this question: How to read to and write from a pipe in perl with ActiveState Perl?
#! /usr/bin/env perl
use strict;
use warnings;
my $isActiveStatePerl = defined(&Win32::BuildNumber);
sub pipeFromFork
{
    return open($_[0], "-|") if (!$isActiveStatePerl);
    die "active state perl cannot cope with dup file handles after fork";
    # The pipe-based emulation below is unreachable; it is left in to show
    # the attempt that does not work under ActiveState Perl.
    pipe $_[0], my $child or die "cannot create pipe";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid) {  # parent
        close $child;
    } else {     # child
        open(STDOUT, ">&=", $child) or die "cannot clone child to STDOUT";
        close $_[0];
    }
    return $pid;
}
my @transform = qw( tr [A-Za-z] [N-ZA-Mn-za-m] );  # rot13

my @inception = (
    "V xabj, Qnq. Lbh jrer qvfnccbvagrq gung V pbhyqa'g or lbh.",
    "V jnf qvfnccbvagrq gung lbh gevrq.",
);

sub snow_fortress { print map "$_\n", @inception }
sub hotel
{
    my $fh;
    my $pid = pipeFromFork($fh);  # was: my $pid = open STDIN, "-|";
    defined($pid) or die "$0: fork: $!";
    if (0 == $pid) {
        snow_fortress;
        exit(0);
    }
    open(STDIN, "<&", $fh) or die "cannot clone to STDIN";
    exec @transform or die "$0: exec: $!";
}
my $fh;
my $pid = pipeFromFork($fh); # my $pid = open my $fh, "-|";
defined($pid) or die "$0: fork: $!";
if (0 == $pid) {
    hotel;
    exit(0);
}
print while <$fh>;
close $fh or warn "$0: close: $!";
The simplest way -- not involving all these cool internals -- to do what the OP needs is to use a temporary file to hold the output until the external processor is done, like so:
open ToTemp, "|/usr/bin/tac>/tmp/MyTmp$$.whee" or die "open the tool: $!";
print ToTemp $TheMessageWhateverItIs;
close ToTemp;
my $Result = `cat /tmp/MyTmp$$.whee`; # or open and read it, or use File::Slurp, etc
unlink "/tmp/MyTmp$$.whee";
Of course, this isn't going to work for something interactive, but co-routines appear to be out of the scope of the original question.