How to pass a file handle to a function? - perl

When I run the code below I get
Can't use string ("F") as a symbol ref while "strict refs" in use at ./T.pl line 21.
where line 21 is
flock($fh, LOCK_EX);
What am I doing wrong?
#!/usr/bin/perl
use strict;
use warnings;
use Fcntl ':flock', 'SEEK_SET'; # file locking
use Data::Dumper;
# use xx;
my $file = "T.yaml";
my $fh = "F";
my $obj = open_yaml_with_lock($file, $fh);
$obj->{a} = 1;
write_yaml_with_lock($obj, $fh);
sub open_yaml_with_lock {
my ($file, $fh) = #_;
open $fh, '+<', $file;
flock($fh, LOCK_EX);
my $obj = YAML::Syck::LoadFile($fh);
return $obj;
}
sub write_yaml_with_lock {
my ($obj, $fh) = #_;
my $yaml = YAML::Syck::Dump($obj);
$YAML::Syck::ImplicitUnicode = 1;
seek $fh,0, SEEK_SET; # seek back to the beginning of file
print $fh $yaml . "---\n";
close $fh;
}

What you're doing wrong is using the string "F" as a filehandle. This
has never been something that's worked; you could use a bareword as a
filehandle (open FH, ...; print FH ...), or you could pass in an
empty scalar and perl would assign a new open file object to that
variable. But if you pass in the string F, then you need to refer to
then handle as F, not $fh. But, don't do that.
Do this instead:
sub open_yaml_with_lock {
my ($file) = #_;
open my $fh, '+<', $file or die $!;
flock($fh, LOCK_EX) or die $!;
my $obj = YAML::Syck::LoadFile($fh); # this dies on failure
return ($obj, $fh);
}
We're doing several things here. One, we're not storing the
filehandle in a global. Global state makes your program extremely
difficult to understand -- I had a hard time with your 10 line post --
and should be avoided. Just return the filehandle, if you want to
keep it around. Or, you can alias it like open does:
sub open_yaml_with_lock {
open $_[0], '+<', $_[1] or die $!;
...
}
open_yaml_with_lock(my $fh, 'filename');
write_yaml_with_lock($fh);
But really, this is a mess. Put this stuff in an object. Make new
open and lock the file. Add a write method. Done. Now you can
reuse this code (and let others do the same) without worrying about
getting something wrong. Less stress.
The other thing we're doing here is checking errors. Yup, disks can
fail. Files can be typo'd. If you blissfully ignore the return value
of open and flock, then your program may not be doing what you think
it's doing. The file might not be opened. The file might not be
locked properly. One day, your program is not going to work properly
because you spelled "file" as "flie" and the file can't be opened.
You will scratch your head for hours wondering what's going on.
Eventually, you'll give up, go home, and try again later. This time,
you won't typo the file name, and it will work. Several hours will
have been wasted. You'll die several years earlier than you should
because of the accumulated stress. So just use autodie or write or
die $! after your system calls so that you get an error message when
something goes wrong!
Your script would be correct if you wrote use autodie qw/open flock
seek close/ at the top. (Actually, you should also check that
"print" worked or use
File::Slurp or
syswrite, since autodie can't detect a failing print statement.)
So anyway, to summarize:
Don't open $fh when $fh is defined. Write open my $fh to
avoid thinking about this.
Always check the return values of system calls. Make autodie do
this for you.
Don't keep global state. Don't write a bunch of functions that
are meant to be used together but rely on implicit preconditions
like an open file. If functions have preconditions, put them in
a class and make the constructor satisfy the preconditions.
This way, you can't accidentally write buggy code!
Update
OK, here's how to make this more OO. First we'll do "pure Perl" OO
and then use Moose. Moose is
what I would use for any real work; the "pure Perl" is just for the
sake of making it easy to understand for someone new to both OO and
Perl.
package LockedYAML;
use strict;
use warnings;
use Fcntl ':flock', 'SEEK_SET';
use YAML::Syck;
use autodie qw/open flock sysseek syswrite/;
sub new {
my ($class, $filename) = #_;
open my $fh, '+<', $filename;
flock $fh, LOCK_EX;
my $self = { obj => YAML::Syck::LoadFile($fh), fh => $fh };
bless $self, $class;
return $self;
}
sub object { $_[0]->{obj} }
sub write {
my ($self, $obj) = #_;
my $yaml = YAML::Syck::Dump($obj);
local $YAML::Syck::ImplicitUnicode = 1; # ensure that this is
# set for us only
my $fh = $self->{fh};
# use system seek/write to ensure this really does what we
# mean. optional.
sysseek $fh, 0, SEEK_SET;
syswrite $fh, $yaml;
$self->{obj} = $obj; # to keep things consistent
}
Then, we can use the class in our main program:
use LockedYAML;
my $resource = LockedYAML->new('filename');
print "Our object looks like: ". Dumper($resource->object);
$resource->write({ new => 'stuff' });
Errors will throw exceptions, which can be handled with
Try::Tiny, and the YAML
file will stay locked as long as the instance exists. You can, of
course, have many LockedYAML objects around at once, that's why we
made it OO.
And finally, the Moose version:
package LockedYAML;
use Moose;
use autodie qw/flock sysseek syswrite/;
use MooseX::Types::Path::Class qw(File);
has 'file' => (
is => 'ro',
isa => File,
handles => ['open'],
required => 1,
coerce => 1,
);
has 'fh' => (
is => 'ro',
isa => 'GlobRef',
lazy_build => 1,
);
has 'obj' => (
is => 'rw',
isa => 'HashRef', # or ArrayRef or ArrayRef|HashRef, or whatever
lazy_build => 1,
trigger => sub { shift->_update_obj(#_) },
);
sub _build_fh {
my $self = shift;
my $fh = $self->open('rw');
flock $fh, LOCK_EX;
return $fh;
}
sub _build_obj {
my $self = shift;
return YAML::Syck::LoadFile($self->fh);
}
sub _update_obj {
my ($self, $new, $old) = #_;
return unless $old; # only run if we are replacing something
my $yaml = YAML::Syck::Dump($new);
local $YAML::Syck::ImplicitUnicode = 1;
my $fh = $self->fh;
sysseek $fh, 0, SEEK_SET;
syswrite $fh, $yaml;
return;
}
This is used similarly:
use LockedYAML;
my $resource = LockedYAML->new( file => 'filename' );
$resource->obj; # the object
$resource->obj( { new => 'object' }); # automatically saved to disk
The Moose version is longer, but does a lot more runtime consistency
checking and is easier to enhance. YMMV.

From the documentation:
open FILEHANDLE,EXPR
If FILEHANDLE is an undefined scalar variable (or array or hash
element) the variable is assigned a reference to a new anonymous
filehandle, otherwise if FILEHANDLE is an expression, its value is
used as the name of the real filehandle wanted. (This is considered a
symbolic reference, so "use strict 'refs'" should
not be in effect.)
Filehandle here is an expression ("F") so itsvalue is used as the name of the real filehandle you want. (A filehandle called F). And then... the documentation says "use strict 'refs'" should not be in effect, because you're using F as a symbolic reference.
(use strict; on line 1 includes strict 'refs'.)
Had you just said at the beginning:
my $fh;
This would have worked, because then $fh would become a reference to a new anonymous filehandle and Perl won't try to use it as a symbolic reference.
This works:
#!/usr/bin/perl
my $global_fh;
open_filehandle(\$global_fh);
use_filehandle(\$global_fh);
sub open_filehandle {
my ($fh)=#_;
open($$fh, ">c:\\temp\\testfile") || die;
}
sub use_filehandle {
my($fh) = #_;
# Print is pecular that it expects the next token to be the filehandle
# or a simple scalar. Thus, print $$fh "Hello, world!" will not work.
my $lfh = $$fh;
print $lfh "Hello, world!";
close($$fh);
}
Or you can do what the other poster suggested and use $_[1] directly, but that's a bit harder to read.

If you use the value directly in the sub, it will work:
use strict;
use warnings;
use autodie;
my $fh;
yada($fh);
print $fh "testing, testing";
sub yada {
open $_[0], '>', 'yada.gg';
}
Or as a reference:
yada(\$fh);
sub yada {
my $handle = shift;
open $$handle, '>', 'yada.gg';
}
Or better yet, return a filehandle:
my $fh = yada($file);
sub yada {
my $inputfile = shift;
open my $gg, '>', $inputfile;
return $gg;
}

Replace
my $fh = "F"; # text and also a ref in nonstrict mode
with
my $fh = \*F; # a reference, period
Of course, it's better yet to use lexical filehandles, as in open my $fd, ... or die ..., but that's not always possible, e.g. you have STDIN that's predefined. In such cases, use \*FD wherever $fd fits.
There's also a case with old scripts, you have to watch out where a global FD is opened and closed.

Related

Perl: Signal handler: Can't use an undefined value as a symbol reference

I have the following:
#! /usr/bin/perl
use Fcntl ':flock';
use strict;
...
my (#list, $fh);
open $fh, "<:encoding(utf8)", $file or die "$file: $!";
flock $fh, LOCK_EX;
use sigtrap 'handler' => sub {flock($fh, LOCK_UN);}, 'normal-signals'; # line 72
It runs but gives an error when I press ^C:
$ verify.pl
...
Can't use an undefined value as a symbol reference at ./verify.pl line 72.
The docs says it's OK
$ perldoc sigtrap
...
use sigtrap 'handler' => \&my_handler, 'normal-signals';
I can't figure out what's wrong.
I'm going to suggest that $fh is the undefined symbol reference. The use is run during the compilation phase (as if in a BEGIN {...} block), and, depending on where you think you're calling it, the $fh may never have been opened, or its scope may not be what you think.
Add some debugging to your handler to show, for example, the value of $fh and the refaddr of $fh. Also add to your open/flock code the same. I bet the refaddr isn't the same.
Since the sigtrap is registered globally, you may be best off with a global, e.g.:
my #handlers;
use sigtrap handler => sub { $_->() for grep defined, #handlers }, 'normal-signals';
{
my ($fh, #list);
open my $fh, ...
flock $fh, LOCK_EX;
my $handler = sub { flock $fh, LOCK_UN };
#handlers = map { weaken $_ } grep defined, #handlers, $handler;
# ... do stuff. When $handler goes out of scope, it'll go undef in #handlers
}

Is there a way to be notified when 'print' is called on $fh?

When I do:
print $fh 'text';
I need a some &sub to be called.
It there a way to do that?
You can tie a filehandle and customize the behavior for printing to that filehandle or for any other operation on that filehandle.
sub PrintNotifier::TIEHANDLE {
my ($pkg, $orignalHandle) = #_;
bless { glob => $orignalHandle }, $pkg;
}
sub PrintNotifier::PRINT {
my ($self,#msg) = #_;
... do whatever you want with #msg here ...
return print {$self->{glob}} #msg;
}
sub PrintNotifier::CLOSE { return close $_[0]->{glob} }
open my $fh, '>', 'some-file';
tie *$fh, 'PrintNotifier', $fh;
print $fh "something"; # calls PrintNotifier::PRINT
You can tie the handle, as mob suggested. Or, if you can change the code and your Perl is new enough, you can replace
print $fh 'text';
with
$fh->print('text');
which you might consider cleaner syntax; then you can sub-class IO::File:
package MyFH {
use parent qw/ IO::File /;
use mro; # Get next::method
sub print {
my ($self, #args) = #_;
warn 'Printing ', #args;
$self->next::method(#args);
}
}
my $fh = MyFH->new();
$fh->open('file', '>') or die $!;
However, this doesn't capture the old-fashioned
print $fh 'text';
style.
Depending on your preference, you might find the new style cleaner anyway, since if your file handle is an expression it allows
$obj->method()->print('text');
instead of
print {$obj->method()} 'text';
It works transparently for Perl 5.14 and up, and can be made to work for older Perls back to (at least) 5.8 by adding
use IO::Handle;
to the top of the files you want to use it in (just to be on the safe side).
In is not documented in perl but has usefull usage
If you want to process print'ing on your object in your class you can tie into itself:
tie *$self, $self;
print $self 'text';
$self->print( 'text' );

function call in perl

As a part of my course work I have been learning perl programming language for the first time in last the few weeks. I have been writing small functions and making function calls. I have written a function for string matching.
use strict;
use warnings;
sub find_multi_string {
my ($file, #strings) = #_;
my $fh;
open ($fh, "<$file");
#store the whole file in an array
my #array = <$fh>;
for my $string (#strings) {
if (grep /$string/, #array) {
next;
} else {
die "Cannot find $string in $file";
}
}
return 1;
}
find_multi_string('file name', 'string1','string2','string3','string4','string 5');
In the above script I'm passing the arguments in the function call. The script works.
But I'd like to know if there is way to specify the file name and string1... string n in an array in the program itself and just make the function call.
find_multi_string();
That would be a mistake, always pass parameters and return values to your subroutines.
What you're describing is essentially using subroutines solely to subdivide and document your code. If you were to do that, it would better to just remove the subroutine entirely and include a comment before the section of code.
Overall, your code looks good as is. You probably will want to use quotemeta though, and your logic can be simplified a little:
use strict;
use warnings;
use autodie;
sub find_multi_string {
my ($file, #strings) = #_;
# Load the file
my $data = do {
open my $fh, "<", $file;
local $/;
<$fh>
};
for my $string (#strings) {
if ($data !~ /\Q$string/) {
die "Cannot find $string in $file";
}
}
return 1;
}
find_multi_string('file name', 'string1','string2','string3','string4','string 5');
A few improvements of your original code:
use autodie
use 3-args open
as you want to check anywhere in the file, just load the file as a single string
if the matching string are just text without metacharacters from regexp, just use the index function
Your question is about passing the function arguments from your program.
I suspect that you are looking for #ARGV. See perlvar.
Here is the modified code:
use strict;
use warnings;
use autodie;
sub find_multi_string {
my ($file, #strings) = #_;
my $content = do {
open my $fh, '<', $file;
local $/;
<$fh>
};
foreach (#strings) {
die "Cannot find $string in $file" unless index($content, $_) >= 0;
}
return 1;
}
find_multi_string(#ARGV);

How can I use tie() to redirect STDOUT, STDERR only for certain packages?

I need to work with some libraries that unfortunately log diagnostic
messages to STDOUT and STDERR. By using tie, I can redirect those
writes to a function that captures those. Since I don't want all
STDOUT and STDERR output of my programs to be captured thtough the
tied handle, I'd like to do this only for certain packages.
I have come up with a solution where the actual behavior is determined
by looking at caller() as can be seen below, but I have the feeling
that there has to be a better way... Is there a more elegant solution?
package My::Log::Capture;
use strict;
use warnings;
use 5.010;
sub TIEHANDLE {
my ($class, $channel, $fh, $packages) = #_;
bless {
channel => lc $channel,
fh => $fh,
packages => $packages,
}, $class;
}
sub PRINT {
my $self = shift;
my $caller = (caller)[0];
if ($caller ~~ $self->{packages}) {
local *STDOUT = *STDOUT;
local *STDERR = *STDERR;
given ($self->{channel}) {
when ('stdout') {
*STDOUT = $self->{fh};
}
when ('stderr') {
*STDERR = $self->{fh};
}
}
# Capturing/Logging code goes here...
} else {
$self->{fh}->print(#_);
}
}
1;
package main;
use My::Foo;
# [...]
use My::Log::Capture;
open my $stderr, '>&', *STDERR;
tie *STDERR, 'My::Log::Capture', (stderr => $stderr, [qw< My::Foo >]);
# My::Foo's STDERR output will be captured, everyone else's STDERR
# output will just be relayed.
Aside from fixing the libraries, I can think of only one solution that might be better.
You can re-open STDOUT and STDERR file handles into your own file handles. Then, re-open STDOUT and STDERR with your tied handles.
For example, here's how you do it for STDOUT:
open my $fh, ">&", \*STDOUT or die "cannot reopen STDOUT: $!";
close STDOUT;
open STDOUT, ">", "/tmp/test.txt";
say $fh "foo"; # goes to real STDOUT
say "bar"; # goes to /tmp/test.txt
You can read perldoc -f open for all the gory details on what ">&" and such does.
Anyway, instead of "/tmp/test.txt" you can replace that open call with the setup for your tied file handle.
Your code will have to always use an explicit file handle to write or use select to switch file handles:
select $fh;
say "foo"; # goes to real STDOUT
select STDOUT;
say "bar"; # goes to /tmp/test.txt

How to I use a class property/variable as a print filehandle in Perl?

I want to do the same thing as
open MYFILE, ">", "data.txt";
print MYFILE "Bob\n";
but instead in class variable like
sub _init_tmp_db
{
my ($self) = #_;
open $$self{tmp_db_fh}, ">", "data.txt";
print $$self{tmp_db_fh} "Bob\n";
}
It gave me this error : 'String found where operator expected near "Bob\n"'
what should I do?
From the print manpage:
If you're storing handles in an array or hash, or in general whenever
you're using any expression more complex than a bareword handle or a
plain, unsubscripted scalar variable to retrieve it, you will have to
use a block returning the filehandle value instead.
You should be using:
print { $$self{tmp_db_fh} } "Bob\n";
This code won't work under use strict. To fix it just use a my variable:
open my $fh, ">", "data.txt" or die $!;
$$self{tmp_db_fh} = $fh;
print { $$self{tmp_db_fh} } "Bob\n";
You should the IO::File module instead.
use IO::File;
my $file = IO::File->new;
$file->open("> data.txt");
print_something($file);
sub print_something {
my ($file) = #_;
$file->print("hello world\n");
}
Or in your example function:
use IO::File;
# ...
sub _init_tmp_db
{
my ($self) = #_;
$self{tmp_db_fh} = IO::File->new;
$self{tmp_db_fh}->open(">", "data.txt");
$self{tmp_db_fh}->print"Bob\n";
}
(note, you can still non -> based calls too, but I wrote the above
using the more traditional ->open() type calls.)
Filehandles can only be scalars.
But $$self{tmp_db_fh} is either an open filehandle (to data.txt) then this would work:
sub _init_tmp_db
{
my ($self) = #_;
my $filehandle = $$self{tmp_db_fh} ;
print $filehandle "Bob\n";
}
or you open the filehandle inside _init_tmp_db
sub _init_tmp_db
{
my ($self) = #_;
open my $filehandle , ">", "data.txt" or die "Cannot open data.txt" ;
print $filehandle "Bob\n";
}
But providing a string in $$self{tmp_db_fh} (like 'FILEHANDLE') won't work.
This is easily solved by creating a variable for a file handle:
sub _init_tmp_db {
my $self = shift;
my $fh;
open $fh, ">", "data.txt"
$self->{temp_db_fh} = $fh;
# Sometime later...
$fh = $self-{temp_db_hf};
print $fh "Bob\n";
}
This is an issue because the way the print syntax is parsed and the early sloppiness of the syntax. The print statement has really two separate formats: Format #1 is that the you're simply passing it stuff to print. Format #2 says that the first item may be a file handle, and the rest is the stuff you want to print to the file handle. If print can't easily determine that the first parameter is a file handle, it fails.
If you look at other languages, they'll use a parameter for passing the file handle, and maybe the stuff to print. Or in object oriented languages, they'll overload >> for the file handle parameter. They'll look something like this:
print "This is my statement", file=file_handle;
or
print "This is my statement" >> file_handle;
You might be able to munge the syntax to get away from using a variable. However, it doesn't make the program more efficient or more readable, and may simply make the program harder to maintain. So, just use a variable for the file handle.
You said class in your title. I assume that you are interested in writing a fully fledge object oriented package to do this. Here's a quick example. Notice in the write subroutine method I retrieve the file handle into a variable and use the variable in the print statement.
#! /usr/bin/env perl
#
use strict;
use warnings;
#######################################################
# MAIN PROGRAM
#
my $file = File->new;
$file->open("OUTPUT") or
die "Can't open 'OUTPUT' for writing\n";
$file->write("This is a test");
#
#######################################################
package File;
use Carp;
sub new {
my $class = shift;
my $self = {};
bless $self, $class;
return $self;
}
sub open {
my $self = shift;
my $file = shift;
my $fh;
if (defined $file) {
$self->{FILE} = $file;
open ($fh, ">", $file) and $self->_fh($fh);
}
return $self->_fh;
}
sub _fh {
my $self = shift;
my $fh = shift;
if (defined $fh) {
$self->{FH} = $fh;
}
return $self->{FH};
}
sub write {
my $self = shift;
my $note = shift;
my $fh = $self->_fh;
print $fh $note . "\n";
return
}