How to I use a class property/variable as a print filehandle in Perl? - perl

I want to do the same thing as
open MYFILE, ">", "data.txt";
print MYFILE "Bob\n";
but instead in class variable like
sub _init_tmp_db
{
my ($self) = #_;
open $$self{tmp_db_fh}, ">", "data.txt";
print $$self{tmp_db_fh} "Bob\n";
}
It gave me this error : 'String found where operator expected near "Bob\n"'
what should I do?

From the print manpage:
If you're storing handles in an array or hash, or in general whenever
you're using any expression more complex than a bareword handle or a
plain, unsubscripted scalar variable to retrieve it, you will have to
use a block returning the filehandle value instead.
You should be using:
print { $$self{tmp_db_fh} } "Bob\n";
This code won't work under use strict. To fix it just use a my variable:
open my $fh, ">", "data.txt" or die $!;
$$self{tmp_db_fh} = $fh;
print { $$self{tmp_db_fh} } "Bob\n";

You should the IO::File module instead.
use IO::File;
my $file = IO::File->new;
$file->open("> data.txt");
print_something($file);
sub print_something {
my ($file) = #_;
$file->print("hello world\n");
}
Or in your example function:
use IO::File;
# ...
sub _init_tmp_db
{
my ($self) = #_;
$self{tmp_db_fh} = IO::File->new;
$self{tmp_db_fh}->open(">", "data.txt");
$self{tmp_db_fh}->print"Bob\n";
}
(note, you can still non -> based calls too, but I wrote the above
using the more traditional ->open() type calls.)

Filehandles can only be scalars.
But $$self{tmp_db_fh} is either an open filehandle (to data.txt) then this would work:
sub _init_tmp_db
{
my ($self) = #_;
my $filehandle = $$self{tmp_db_fh} ;
print $filehandle "Bob\n";
}
or you open the filehandle inside _init_tmp_db
sub _init_tmp_db
{
my ($self) = #_;
open my $filehandle , ">", "data.txt" or die "Cannot open data.txt" ;
print $filehandle "Bob\n";
}
But providing a string in $$self{tmp_db_fh} (like 'FILEHANDLE') won't work.

This is easily solved by creating a variable for a file handle:
sub _init_tmp_db {
my $self = shift;
my $fh;
open $fh, ">", "data.txt"
$self->{temp_db_fh} = $fh;
# Sometime later...
$fh = $self-{temp_db_hf};
print $fh "Bob\n";
}
This is an issue because the way the print syntax is parsed and the early sloppiness of the syntax. The print statement has really two separate formats: Format #1 is that the you're simply passing it stuff to print. Format #2 says that the first item may be a file handle, and the rest is the stuff you want to print to the file handle. If print can't easily determine that the first parameter is a file handle, it fails.
If you look at other languages, they'll use a parameter for passing the file handle, and maybe the stuff to print. Or in object oriented languages, they'll overload >> for the file handle parameter. They'll look something like this:
print "This is my statement", file=file_handle;
or
print "This is my statement" >> file_handle;
You might be able to munge the syntax to get away from using a variable. However, it doesn't make the program more efficient or more readable, and may simply make the program harder to maintain. So, just use a variable for the file handle.
You said class in your title. I assume that you are interested in writing a fully fledge object oriented package to do this. Here's a quick example. Notice in the write subroutine method I retrieve the file handle into a variable and use the variable in the print statement.
#! /usr/bin/env perl
#
use strict;
use warnings;
#######################################################
# MAIN PROGRAM
#
my $file = File->new;
$file->open("OUTPUT") or
die "Can't open 'OUTPUT' for writing\n";
$file->write("This is a test");
#
#######################################################
package File;
use Carp;
sub new {
my $class = shift;
my $self = {};
bless $self, $class;
return $self;
}
sub open {
my $self = shift;
my $file = shift;
my $fh;
if (defined $file) {
$self->{FILE} = $file;
open ($fh, ">", $file) and $self->_fh($fh);
}
return $self->_fh;
}
sub _fh {
my $self = shift;
my $fh = shift;
if (defined $fh) {
$self->{FH} = $fh;
}
return $self->{FH};
}
sub write {
my $self = shift;
my $note = shift;
my $fh = $self->_fh;
print $fh $note . "\n";
return
}

Related

perl redirect stdout to lexical filehandle

I'm trying to write a helper function that runs a perl function in another process and returns a closure that produces a line of output at a time when called.
I figured out a way of doing this using pipe that mixes old and new-style filehandles. I used an old-style one for the sink in order to use the open(STDOUT, ">&thing") syntax and a new-style one for the source since it needs to be captured by a closure and I didn't want to burden the caller with providing a filehandle.
Is there a way of using a new-style filehandle in a construction with the same meaning as open(STDOUT, ">&thing")?
#!/usr/bin/env perl
# pipe.pl
# use pipe() to create a pair of fd's.
# write to one and read from the other.
#
# The source needs to be captured by the closure and can't be
# destructed at the end of get_reader(), so it has to be lexical.
#
# We need to be able to redirect stdout to sink in such a way that
# we actually dup the file descriptor (so shelling out works as intended).
# open(STDOUT, ">&FILEHANDLE") achieves this but appears to require an
# old-style filehandle.
use strict;
use warnings;
sub get_reader {
local *SINK;
my $source;
pipe($source, SINK) or die "can't open pipe!";
my $cpid = fork();
if ($cpid == -1) {
die 'failed to fork';
}
elsif ($cpid == 0) {
open STDOUT, ">&SINK" or die "can't open sink";
system("echo -n hi");
exit;
}
else {
return sub {
my $line = readline($source);
printf "from child (%s)\n", $line;
exit;
}
}
}
sub main {
my $reader = get_reader();
$reader->();
}
main();
When run, this produces
from child (hi)
as expected.
sub get_reader {
my ($cmd) = #_;
open(my $pipe, '-|', #$cmd);
return sub {
return undef if !$pipe;
my $line = <$pipe>;
if (!defined($line)) {
close($pipe);
$pipe = undef;
return undef;
}
chomp($line);
return $line;
};
}
If that's not good enough (e.g. because you also need to redirect the child's STDIN or STDERR), you can use IPC::Run instead.
use IPC::Run qw( start );
sub get_reader {
my ($cmd) = #_;
my $buf = '';
my $h = start($cmd, '>', \$buf);
return sub {
return undef if !$h;
while (1) {
if ($buf =~ s/^([^\n]*)\n//) {
return $1;
}
if (!$h->pump())) {
$h->finish();
$h = undef;
return substr($buf, 0, length($buf), '') if length($buf);
return undef;
}
}
};
}
Either way, you can now do
my $i = get_reader(['prog', 'arg', 'arg']);
while (defined( my $line = $i->() )) {
print "$line\n";
}
Either way, error handling left to you.

Is there a way to be notified when 'print' is called on $fh?

When I do:
print $fh 'text';
I need a some &sub to be called.
It there a way to do that?
You can tie a filehandle and customize the behavior for printing to that filehandle or for any other operation on that filehandle.
sub PrintNotifier::TIEHANDLE {
my ($pkg, $orignalHandle) = #_;
bless { glob => $orignalHandle }, $pkg;
}
sub PrintNotifier::PRINT {
my ($self,#msg) = #_;
... do whatever you want with #msg here ...
return print {$self->{glob}} #msg;
}
sub PrintNotifier::CLOSE { return close $_[0]->{glob} }
open my $fh, '>', 'some-file';
tie *$fh, 'PrintNotifier', $fh;
print $fh "something"; # calls PrintNotifier::PRINT
You can tie the handle, as mob suggested. Or, if you can change the code and your Perl is new enough, you can replace
print $fh 'text';
with
$fh->print('text');
which you might consider cleaner syntax; then you can sub-class IO::File:
package MyFH {
use parent qw/ IO::File /;
use mro; # Get next::method
sub print {
my ($self, #args) = #_;
warn 'Printing ', #args;
$self->next::method(#args);
}
}
my $fh = MyFH->new();
$fh->open('file', '>') or die $!;
However, this doesn't capture the old-fashioned
print $fh 'text';
style.
Depending on your preference, you might find the new style cleaner anyway, since if your file handle is an expression it allows
$obj->method()->print('text');
instead of
print {$obj->method()} 'text';
It works transparently for Perl 5.14 and up, and can be made to work for older Perls back to (at least) 5.8 by adding
use IO::Handle;
to the top of the files you want to use it in (just to be on the safe side).
In is not documented in perl but has usefull usage
If you want to process print'ing on your object in your class you can tie into itself:
tie *$self, $self;
print $self 'text';
$self->print( 'text' );

Preserve local context across nested subroutines

Let's consider the wanted code below. I have recursive calls to process and for each recursion I use a local %context. In this way I can get my context back when I return from a call.
sub process {
my %context; # Local context
process() if rerun();
job1();
job2();
sub job1() {print $context{foo}}
sub job2() {print $context{bar}}
}
Unfortunately perl does not manage nested subroutines as I expected. By moving my subroutines outside from the process subroutine I will get a problem because I won't be able to access %context anymore. So I need to make it global and use a stack as follow:
my %context; # Local context
my #context_stack;
sub process {
push #context_stack, %context;
%context = undef;
process() if rerun();
job1();
job2();
%context = pop #context_stack;
}
sub job1() {print $context{foo}}
sub job2() {print $context{bar}}
The third solution is to pass the context to all subroutines which can be annoying for very small subroutines. Also %context become global to all my program. So I loose the privacy of this variable.
my %context; # Local context
my #context_stack;
sub process {
push #context_stack, %context;
%context = undef;
process() if rerun(\%context);
job1(\%context);
job2(\%context);
%context = pop #context_stack;
}
sub job1() {$context = shift; print $context->{foo}}
sub job2() {$context = shift; print $context->{bar}}
What would be the best approach?
EDIT
For a better understanding of my specific, I provide another example:
process(#ARGV);
exit 0;
sub process {
my $infile = shift;
my $outfile = shift;
open my $fp_in, '<', $infile;
open my $fp_out, '>', $outfile;
LINE: while(<$fp_in>) {
remove_c_comment();
say STDERR "File is $infile";
process($1, "$1.processed") if /#include "(.*?)";
warning("Huh raisin, no!") if /\braisin/;
say STDERR "Fill is still $infile";
print $fp_out $_;
}
sub remove_c_comment { s|//.*$|| }
sub warning { say "[Warning] $infile:$. ".shift() }
}
The thing you're looking for - but you may not know it - is called a closure. (see also: perlref)
{
my %context;
sub job1 { print $context{foo} };
sub job2 { print $context{bar} };
sub init_context{ $context{foo} = 1 };
}
Context remains private within this block, but accessible to all the subroutines.
As an alternative - you can return a code reference from a subroutine - like this:
use strict;
use warnings;
sub make_sub_with_context {
my %context;
$context{"bar"} = 1;
return sub { print $context{"bar"}++ };
}
my $job1_with_context = make_sub_with_context();
my $another_job_with_context = make_sub_with_context();
$job1_with_context->();
$job1_with_context->();
$another_job_with_context->();
$another_job_with_context->();
$another_job_with_context->();
Which may be a better example.
Edit:
Following on from your updated example it looks like your problem spec is to iterate a set of files, and (recursively) traverse referenced files.
Sort of like a find but following include directives. I would point out that by doing it that way, what you're doing is potentially going to end up with a loop, which isn't ideal.
Can I suggest instead taking a different approach? Don't recurse:
use strict;
use warnings;
my #files_to_process = #ARGV;
my %done;
while ( my $infile = pop #files_to_process ) {
next if $done{$infile}++;
open my $fp_in, '<', $infile or die $!;
open my $fp_out, '>', $infile . ".processed" or die $!;
while ( my $line = <$fp_in> ) {
$line =~ s|\/\/.*$||;
if ( my ($include) = ( $line =~ m/#include "(.*?)"/ ) ) {
push #files_to_process, $include;
}
print {$fp_out} $line;
}
close($fp_out);
close($fp_in);
}
With a bit more thought, and the expansion that this task needs to process stuff in declaration order - I'd offer instead - perhaps taking an OO approach would help. Something like:
use strict;
use warnings;
package parser;
sub new {
my ($class) = #_;
my $self = {};
bless $self, $class;
return $self;
}
sub process {
my ( $self, $infile, $outfile ) = #_;
open my $fp_in, '<', $infile;
open my $fp_out, '>', $outfile;
LINE: while ( my $line = <$fp_in> ) {
$line =~ s|\/\/.*$||;
say STDERR "File is $infile";
if ( my ($includefile) = ( $line =~ m/#include "(.*?)"/ ) ) {
my $processor = parser->new();
$processor -> process( $includefile, "$includefile.processed" );
}
$self->warning("Huh raisin, no!") if /\braisin/;
say STDERR "Fill is still $infile";
print $fp_out $line;
}
}
package main;
my $processor = parser->new()->process(#ARGV);

function call in perl

As a part of my course work I have been learning perl programming language for the first time in last the few weeks. I have been writing small functions and making function calls. I have written a function for string matching.
use strict;
use warnings;
sub find_multi_string {
my ($file, #strings) = #_;
my $fh;
open ($fh, "<$file");
#store the whole file in an array
my #array = <$fh>;
for my $string (#strings) {
if (grep /$string/, #array) {
next;
} else {
die "Cannot find $string in $file";
}
}
return 1;
}
find_multi_string('file name', 'string1','string2','string3','string4','string 5');
In the above script I'm passing the arguments in the function call. The script works.
But I'd like to know if there is way to specify the file name and string1... string n in an array in the program itself and just make the function call.
find_multi_string();
That would be a mistake, always pass parameters and return values to your subroutines.
What you're describing is essentially using subroutines solely to subdivide and document your code. If you were to do that, it would better to just remove the subroutine entirely and include a comment before the section of code.
Overall, your code looks good as is. You probably will want to use quotemeta though, and your logic can be simplified a little:
use strict;
use warnings;
use autodie;
sub find_multi_string {
my ($file, #strings) = #_;
# Load the file
my $data = do {
open my $fh, "<", $file;
local $/;
<$fh>
};
for my $string (#strings) {
if ($data !~ /\Q$string/) {
die "Cannot find $string in $file";
}
}
return 1;
}
find_multi_string('file name', 'string1','string2','string3','string4','string 5');
A few improvements of your original code:
use autodie
use 3-args open
as you want to check anywhere in the file, just load the file as a single string
if the matching string are just text without metacharacters from regexp, just use the index function
Your question is about passing the function arguments from your program.
I suspect that you are looking for #ARGV. See perlvar.
Here is the modified code:
use strict;
use warnings;
use autodie;
sub find_multi_string {
my ($file, #strings) = #_;
my $content = do {
open my $fh, '<', $file;
local $/;
<$fh>
};
foreach (#strings) {
die "Cannot find $string in $file" unless index($content, $_) >= 0;
}
return 1;
}
find_multi_string(#ARGV);

How to pass a file handle to a function?

When I run the code below I get
Can't use string ("F") as a symbol ref while "strict refs" in use at ./T.pl line 21.
where line 21 is
flock($fh, LOCK_EX);
What am I doing wrong?
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl ':flock', 'SEEK_SET'; # file locking
use Data::Dumper;
# use xx;
my $file = "T.yaml";
my $fh = "F";
my $obj = open_yaml_with_lock($file, $fh);
$obj->{a} = 1;
write_yaml_with_lock($obj, $fh);
sub open_yaml_with_lock {
my ($file, $fh) = #_;
open $fh, '+<', $file;
flock($fh, LOCK_EX);
my $obj = YAML::Syck::LoadFile($fh);
return $obj;
}
sub write_yaml_with_lock {
my ($obj, $fh) = #_;
my $yaml = YAML::Syck::Dump($obj);
$YAML::Syck::ImplicitUnicode = 1;
seek $fh,0, SEEK_SET; # seek back to the beginning of file
print $fh $yaml . "---\n";
close $fh;
}
What you're doing wrong is using the string "F" as a filehandle. This
has never been something that's worked; you could use a bareword as a
filehandle (open FH, ...; print FH ...), or you could pass in an
empty scalar and perl would assign a new open file object to that
variable. But if you pass in the string F, then you need to refer to
then handle as F, not $fh. But, don't do that.
Do this instead:
sub open_yaml_with_lock {
my ($file) = #_;
open my $fh, '+<', $file or die $!;
flock($fh, LOCK_EX) or die $!;
my $obj = YAML::Syck::LoadFile($fh); # this dies on failure
return ($obj, $fh);
}
We're doing several things here. One, we're not storing the
filehandle in a global. Global state makes your program extremely
difficult to understand -- I had a hard time with your 10 line post --
and should be avoided. Just return the filehandle, if you want to
keep it around. Or, you can alias it like open does:
sub open_yaml_with_lock {
open $_[0], '+<', $_[1] or die $!;
...
}
open_yaml_with_lock(my $fh, 'filename');
write_yaml_with_lock($fh);
But really, this is a mess. Put this stuff in an object. Make new
open and lock the file. Add a write method. Done. Now you can
reuse this code (and let others do the same) without worrying about
getting something wrong. Less stress.
The other thing we're doing here is checking errors. Yup, disks can
fail. Files can be typo'd. If you blissfully ignore the return value
of open and flock, then your program may not be doing what you think
it's doing. The file might not be opened. The file might not be
locked properly. One day, your program is not going to work properly
because you spelled "file" as "flie" and the file can't be opened.
You will scratch your head for hours wondering what's going on.
Eventually, you'll give up, go home, and try again later. This time,
you won't typo the file name, and it will work. Several hours will
have been wasted. You'll die several years earlier than you should
because of the accumulated stress. So just use autodie or write or
die $! after your system calls so that you get an error message when
something goes wrong!
Your script would be correct if you wrote use autodie qw/open flock
seek close/ at the top. (Actually, you should also check that
"print" worked or use
File::Slurp or
syswrite, since autodie can't detect a failing print statement.)
So anyway, to summarize:
Don't open $fh when $fh is defined. Write open my $fh to
avoid thinking about this.
Always check the return values of system calls. Make autodie do
this for you.
Don't keep global state. Don't write a bunch of functions that
are meant to be used together but rely on implicit preconditions
like an open file. If functions have preconditions, put them in
a class and make the constructor satisfy the preconditions.
This way, you can't accidentally write buggy code!
Update
OK, here's how to make this more OO. First we'll do "pure Perl" OO
and then use Moose. Moose is
what I would use for any real work; the "pure Perl" is just for the
sake of making it easy to understand for someone new to both OO and
Perl.
package LockedYAML;
use strict;
use warnings;
use Fcntl ':flock', 'SEEK_SET';
use YAML::Syck;
use autodie qw/open flock sysseek syswrite/;
sub new {
my ($class, $filename) = #_;
open my $fh, '+<', $filename;
flock $fh, LOCK_EX;
my $self = { obj => YAML::Syck::LoadFile($fh), fh => $fh };
bless $self, $class;
return $self;
}
sub object { $_[0]->{obj} }
sub write {
my ($self, $obj) = #_;
my $yaml = YAML::Syck::Dump($obj);
local $YAML::Syck::ImplicitUnicode = 1; # ensure that this is
# set for us only
my $fh = $self->{fh};
# use system seek/write to ensure this really does what we
# mean. optional.
sysseek $fh, 0, SEEK_SET;
syswrite $fh, $yaml;
$self->{obj} = $obj; # to keep things consistent
}
Then, we can use the class in our main program:
use LockedYAML;
my $resource = LockedYAML->new('filename');
print "Our object looks like: ". Dumper($resource->object);
$resource->write({ new => 'stuff' });
Errors will throw exceptions, which can be handled with
Try::Tiny, and the YAML
file will stay locked as long as the instance exists. You can, of
course, have many LockedYAML objects around at once, that's why we
made it OO.
And finally, the Moose version:
package LockedYAML;
use Moose;
use autodie qw/flock sysseek syswrite/;
use MooseX::Types::Path::Class qw(File);
has 'file' => (
is => 'ro',
isa => File,
handles => ['open'],
required => 1,
coerce => 1,
);
has 'fh' => (
is => 'ro',
isa => 'GlobRef',
lazy_build => 1,
);
has 'obj' => (
is => 'rw',
isa => 'HashRef', # or ArrayRef or ArrayRef|HashRef, or whatever
lazy_build => 1,
trigger => sub { shift->_update_obj(#_) },
);
sub _build_fh {
my $self = shift;
my $fh = $self->open('rw');
flock $fh, LOCK_EX;
return $fh;
}
sub _build_obj {
my $self = shift;
return YAML::Syck::LoadFile($self->fh);
}
sub _update_obj {
my ($self, $new, $old) = #_;
return unless $old; # only run if we are replacing something
my $yaml = YAML::Syck::Dump($new);
local $YAML::Syck::ImplicitUnicode = 1;
my $fh = $self->fh;
sysseek $fh, 0, SEEK_SET;
syswrite $fh, $yaml;
return;
}
This is used similarly:
use LockedYAML;
my $resource = LockedYAML->new( file => 'filename' );
$resource->obj; # the object
$resource->obj( { new => 'object' }); # automatically saved to disk
The Moose version is longer, but does a lot more runtime consistency
checking and is easier to enhance. YMMV.
From the documentation:
open FILEHANDLE,EXPR
If FILEHANDLE is an undefined scalar variable (or array or hash
element) the variable is assigned a reference to a new anonymous
filehandle, otherwise if FILEHANDLE is an expression, its value is
used as the name of the real filehandle wanted. (This is considered a
symbolic reference, so "use strict 'refs'" should
not be in effect.)
Filehandle here is an expression ("F") so itsvalue is used as the name of the real filehandle you want. (A filehandle called F). And then... the documentation says "use strict 'refs'" should not be in effect, because you're using F as a symbolic reference.
(use strict; on line 1 includes strict 'refs'.)
Had you just said at the beginning:
my $fh;
This would have worked, because then $fh would become a reference to a new anonymous filehandle and Perl won't try to use it as a symbolic reference.
This works:
#!/usr/bin/perl
my $global_fh;
open_filehandle(\$global_fh);
use_filehandle(\$global_fh);
sub open_filehandle {
my ($fh)=#_;
open($$fh, ">c:\\temp\\testfile") || die;
}
sub use_filehandle {
my($fh) = #_;
# Print is pecular that it expects the next token to be the filehandle
# or a simple scalar. Thus, print $$fh "Hello, world!" will not work.
my $lfh = $$fh;
print $lfh "Hello, world!";
close($$fh);
}
Or you can do what the other poster suggested and use $_[1] directly, but that's a bit harder to read.
If you use the value directly in the sub, it will work:
use strict;
use warnings;
use autodie;
my $fh;
yada($fh);
print $fh "testing, testing";
sub yada {
open $_[0], '>', 'yada.gg';
}
Or as a reference:
yada(\$fh);
sub yada {
my $handle = shift;
open $$handle, '>', 'yada.gg';
}
Or better yet, return a filehandle:
my $fh = yada($file);
sub yada {
my $inputfile = shift;
open my $gg, '>', $inputfile;
return $gg;
}
Replace
my $fh = "F"; # text and also a ref in nonstrict mode
with
my $fh = \*F; # a reference, period
Of course, it's better yet to use lexical filehandles, as in open my $fd, ... or die ..., but that's not always possible, e.g. you have STDIN that's predefined. In such cases, use \*FD wherever $fd fits.
There's also a case with old scripts, you have to watch out where a global FD is opened and closed.