Trouble passing an IO::File handle to a Perl class - perl

If I pass it as an argument I get the error:
'Can't locate object method "getline" via package "Bad" at Bad.pm line 27.'
But if I insert it in the module it works.
This is the boiled down code. bad.pl uses the module Bad.pm. Set $CAUSE_ERROR to see the problem.
#!/usr/bin/env perl
# This is bad.pl
use strict;
use warnings;
use IO::File;
use Bad; # use the bad module "Bad.pm"
&Main();
sub Main {
my $filename = "bad.pl";
warn "About to parse '$filename'\n";
my $MyWord = Bad->new(); # Create a new object.
my $io = IO::File->new($filename, "r");
#####################
my $CAUSE_ERROR = 1; # Set to 0 it does NOT cause an error. Set to 1 it DOES.
#####################
if($CAUSE_ERROR ) {
$MyWord->Parse($MyWord, $io);
} else {
$MyWord->{fd} = $io;
$MyWord->Parse($MyWord);
}
}
This is Bad.pm
package Bad;
# This is Bad.pm
use warnings;
use strict;
sub new {
my ($class, $args) = #_;
my $self = {
fd => undef,
};
return bless($self, $class); # Changes a function to a class
}
sub Parse {
my ($MyWord, $io) = #_;
if(defined($MyWord->{fd})){
# WORKS
$io = $MyWord->{fd};
while ( defined(my $inputline = $io->getline) ) {
print "${inputline}";
}
} else {
# FAILS
while ( defined(my $inputline = $io->getline) ) {
print "${inputline}";
}
}
}
1;
Using Perl v5.22.3 under Cygwin.
Originally I had Bad.pm in a sub directory but I simplified it.
Thank you for you time.

To summarize:
$MyWord->Parse($MyWord, $io);
Given that $MyWord is a reference blessed into the Bad class (i.e, it's an instance of Bad), this calls Bad::Parse with the arguments ($MyWord, $MyWord, $io). That is, it behaves as if you'd called:
Bad::Parse($MyWord, $MyWord, $io)`.
However, Bad::Parse() is written to expect the arguments ($MyWord, $io), so $io gets set to the second $MyWord, and Bad::Parse() throws an error when it tries to call $io->getline because the Bad module doesn't implement that method.
The fix is simple:
Call the function as $MyWord->Parse($io).
Change the variable name for the first argument in Bad::Parse() from $MyWord to $self. This isn't strictly necessary -- you can technically call this variable whatever you want -- but it's conventional, and will make your code much more readable to other Perl programmers.

To summarize errors in the posted code: The class name is passed to the constructor behind the scenes, as is the object to methods; we do not supply them. We do pass the filehandle to new, so that it is assigned to object's data and it can thus be used by methods in the class.
Here is a basic example. I try to stick to the posted design as much as possible. This does not do much of what is needed with I/O objects, but is rather about writing a class in general.
The class is meant to process a file, having been passed a filehandle for it. We expect to have one filehandle per object. Since we get it open the reponsibility to close it is left to the caller.
script.pl
use strict;
use warnings;
use feature 'say';
use IO::File;
use ProcessFile;
my $filename = shift || $0; # from command line, or this file
say "About to parse '$filename'";
my $io = IO::File->new($filename, "r") or die "Can't open $filename: $!";
my $word = ProcessFile->new($io); # Create a new object, initialize with $io
$word->parse();
# OR, by chaining calls
#my $word = ProcessFile->new($io)->parse();
say "Have ", ProcessFile->num_objects(), " open filehandles";
$io->close;
The package file ProcessFile.pm
package ProcessFile;
use warnings;
use strict;
use Carp qw(croak);
use Scalar::Util qw(openhandle);
# Example of "Class" data and methods: how many objects (open filehandles)
our $NumObjects;
sub num_objects { return $NumObjects }
sub DESTROY { --$NumObjects }
sub new {
my ($class, $fh) = #_; # class name, arguments passed to constructor
# To also check the mode (must be opened for reading) use Fcntl module
croak "No filehandle or not open or invalid " if not openhandle $fh;
my $self = { _fh => $fh }; # add other data that may make sense
bless $self, $class; # now $self is an object of class ProcessFile
++$NumObjects;
return $self;
}
sub parse {
my ($self, #args) = #_; # object, arguments passed to method (if any)
# Filehandle is retrieved from data, $self->{_fh}
while ( defined(my $inputline = $self->{_fh}->getline) ) {
print $inputline;
}
# Rewind before returning $self (or not, depending on design/#args)
# Can do more here, set some data etc, as needed by class design
seek $self->{_fh}, 0, 0;
return $self;
}
1;
A few comments on the above code follow. Let me know if more would be helpful.
Class data and methods don't belong to any one object, and are used for purposes that relate to the class as a whole (for example, to track all objects in play).
The DESTROY method runs when an object is destroyed, for example when it goes out of scope. Here we need it in order to decrease the count of existing objects. Try: place the code creating an object in a block { ... }; and see what count we get after the block.
We use openhandle from Scalar::Util to test whether the filehandle is open. We should really also test whether it is open for reading, since that is the fixed purpose of the class, using Fcntl.
In the sole, example method parse we read out the file and then rewind the filehandle, before returning the object. That is a placeholder for saving and/or setting the state for repeated use. What is done depends on the purpose and design of the class, and can be controlled by arguments.
Documentation: tutorial perlootut and reference perlobj on object-oriented work in Perl, perlmod for modules (a class is firstly a package), and a tutorial perlreftut for references.
There are also many informative SO posts around, please search.

Related

How can I ensure my method calls use the right method name on the right object?

I am working on a program which makes multiple attempts at processing, storing to a new log each time it tries (several other steps before/after).
use strict;
for (my $i = 0; $i < 3; $i++)
{
my $loggerObject = new MyLoggerObject(tag => $i);
#.. do a bunch of other things ..
Process($loggerObject,$i);
#.. do a bunch of other things ..
}
sub Process
{
my ($logger,$thingToLog) = #_;
sub Logger { $logger->Print($_[0]); }
Logger("Processing $thingToLog");
}
package MyLoggerObject;
sub new
{
my $package = shift;
my %hash = (#_); my $self = \%hash;
return bless $self, $package;
}
sub Print
{
my $self = shift;
my $value = shift;
print "Entering into log ".$self->{tag}.": $value\n";
}
1;
To avoid having to do a bunch of $self->{logger}->Print() and risk misspelling Print, I tried to collapse them into the local subroutine as seen above. However, when I run this I get:
perl PerlLocalMethod.pl
Entering into log 0: Processing 0
Entering into log 0: Processing 1
Entering into log 0: Processing 2
instead of:
perl PerlLocalMethod.pl
Entering into log 0: Processing 0
Entering into log 1: Processing 1
Entering into log 1: Processing 2
I am presuming the problem is that the Logger method is 'compiled' the first time I call the Process method with the object reference I used on the first call but not afterwards.
If I did $logger->Print(), misspelling Print, and hit a codepath I can't reliably test (this is for an embedded system and I can't force every error condition) it would error out the script with an undefined Method. I suppose I could use AUTOLOAD within logger and log any bad Method calls, but I'd like to know any other recommendations on how to make sure my Logger() calls are reliable and using the correct object.
In Perl, subroutines are compiled during compile time. Embedding a named subroutine declaration into a subroutine doesn't do what one would expect and isn't recommended.
If you are afraid of typos, write tests. See Test::More on how to do it. Use mocking if you can't instantiate system specific classes on a dev machine. Or use shorter names, like P.
You can declare the Logger in the highest scope as a closure over $logger that you would need to declare there, too:
my $logger;
sub Logger { $logger->Print($_[0]) }
But it's confusing and can lead to code harder to maintain if there are many variables and subroutines like that.
If you had used use warnings in your code you would have seen the message:
Variable "$logger" will not stay shared at logger line 24.
Which would have alerted you to the problem (moral: always use strict and use warnings).
I'm not entirely sure why you need so many levels of subroutines in order to do your logging, but it seems to me that all of your subroutines which take the $logger object as their first parameter should probably by methods on the MyLoggerObject (which should probably be called MyLoggerClass as it's a class, not an object).
If you do that, then you end up with this code (which seems to do what you want):
use strict;
use warnings;
for my $i (0 .. 2) {
my $loggerObject = MyLoggerClass->new(tag => $i);
#.. do a bunch of other things ..
$loggerObject->Process($i);
#.. do a bunch of other things ..
}
package MyLoggerClass;
sub new {
my $package = shift;
my $self = { #_ };
return bless $self, $package;
}
sub Process {
my $self = shift;
my ($thingToLog) = #_;
$self->Logger("Processing $thingToLog");
}
sub Logger {
my $self = shift;
$self->Print($_[0]);
}
sub Print {
my $self = shift;
my ($value) = #_;
print "Entering into log $self->{tag}: $value\n";
}
1;
Oh, and notice that I moved away from the indirect object notation call (new Class(...)) to the slightly safer Class->new(...). The style you used will work in the vast majority of cases, but when it doesn't you'll waste days trying to fix the problem.
As already explained above, using lexical defined variables in these kinds of method is not possible.
If you have to "duct-tape" this problem you could use global Variables (our instead of my).
sub Process
{
our ($logger,$thingToLog) = #_;
sub Logger { $logger->Print($_[0]); }
Logger("Processing $thingToLog");
}
But be aware that $logger and $thingToLog are now global variables accessible outside this function.

is it allowed to pass pipes to constructors?

I tried to do something very fancy in Perl, and I think I'm suffering the consequences. I don't know if what I was trying to do is possible, actually.
My main program creates a pipe like this:
pipe(my $pipe_reader, my $pipe_writer);
(originally it was pipe(PIPE_READER, PIPE_WRITER) but I changed to regular variables when I was trying to debug this)
Then it forks, but I think that is probably irrelevant here. The child does this:
my $response = Response->new($pipe_writer);
The constructor of Response is bare bones:
sub new {
my $class = shift;
my $writer = shift;
my $self = {
writer => $writer
};
bless($self, $class);
return($self);
}
Then later the child will write its response:
$response->respond(123, "Here is my response");
The code for respond is as follows:
sub respond {
my $self = shift;
my $number = shift;
my $text = shift;
print $self->{writer} "$number\n";
print $self->{writer} "$text\n";
close $self->{writer}
}
This triggers a strange compile error: 'String found where operator expected ... Missing operator before "$number\n"?' at the point of the first print. Of course this is the normal syntax for a print, except that I have the object property instead of a normal handle AND it happens to be a pipe, not a file handle. So now I'm wondering if I'm not allowed to do this.
From print
If you're storing handles in an array or hash, or in general whenever you're using any expression more complex than a bareword handle or a plain, unsubscripted scalar variable to retrieve it, you will have to use a block returning the filehandle value instead, ...
print { $files[$i] } "stuff\n";
print { $OK ? *STDOUT : *STDERR } "stuff\n";
(my emphasis)
So you need
print { $self->{writer} } "$number\n";
Or, per Borodin's comment
$self->{writer}->print("$number\n");
The syntax of print is special, see for example this post and this post. For one, after print must come either a "simple" filehandle or a block evaluating to one, as quoted above, to satisfy the parser.
But with the dereference (arrow) operator the filehandle is found to be an IO::File object† and so its parent's IO::Handle::print method is invoked on it.
Prior to v5.14 there had to be use IO::Handle; for this to work, though not anymore. See this post and links in it for more.
Note that print FILEHANDLE LIST is not an indirect method call,
even as it may appear to be. It is just a function call to the print builtin under rather special syntax rules. It is only with an explicit ->
that an IO::Handle method gets called.
† It is either blessed into the class as the method call is encountered (and fails), or at creation; I can't find it in docs or otherwise resolve whether filehandles are blessed at creation or on demand
perl -MScalar::Util=blessed -wE'
pipe(RD,WR);
say *WR{IO}; #--> IO::File=IO(0xe8cb58)
say blessed(WR)//"undef"; #--> undef
'
(warns of unused RD)   We can't do this with lexical filehandles as they are not in the symbol table.
But once needed a filehandle is an IO::File or IO::Handle object (depending on Perl version).

how to get the name of invocant as string in perl

In perl, a class "Lamba" implements a method called "process".
use Lambda;
my $omega = Lambda->new();
$omega->process();
in the process method, how can we get the name of it's invocant?
package Lambda;
use strict;
sub new {
my $class = shift;
my $self = {};
bless ($self, $class);
return $self;
}
sub process {
my $self = shift;
my $invocant;
#
# ??? what is the variable name of my caller ???
#
# ie. how to set $invocant = 'omega';
#
return $self;
}
Update
I've just realised that you want the name of the variable that was used to call the process method. You can't do that without a source filter, because there may be several names all referring to the same object, like this
my $omega = Lambda->new;
my $aa = $omega;
my $bb = $omega;
$aa->process;
and there is quite sensibly no way to get hold of the name actually used to call the method
This is an X Y problem, and is comparable to asking how to use data strings to name a variable. Variable identifiers are purely for the consumption of the programmer, and if you think your program needs to know them then you have a design problem. If you explain exactly what it is that you want to achieve via this mechanism then I am sure we could help you better
Original solution
I've left this here in cased someone arrives at this page looking for a way to discover the name of the calling code
You can use the caller function
A call without parameters like caller() will return the package name, source file name, and line number where the current call was made
You get get more detailed information by adding a parameter that represents the depth on the call stack that you want to examine, so caller(0) will return information about the current subroutine, while the values from caller(1) will be about the calling subroutine
If we change your main code to use a subroutine to call the method, then we can write this
Lambda.pm
package Lambda;
use strict;
use warnings;
sub new {
bless {}, shift;
}
sub process {
my $self = shift;
#
# ??? what is the variable name of my caller ???
#
# ie. how to set $invocant = 'omega';
#
my $calling_sub = (caller(1))[3];
print "Called by $calling_sub\n";
return $self;
}
1;
main.pl
use strict;
use warnings 'all';
use Lambda;
mysub();
sub mysub {
my $omega = Lambda->new;
$omega->process;
}
output
Called by main::mysub
The caller function returns information about the calling subroutine/sourcecode line:
sub process {
my $self = shift;
print join(', ',caller(0)); # Some of these values will be undef!
}
The manual page shows this example:
($package, $filename, $line, $subroutine, $hasargs,
$wantarray, $evaltext, $is_require, $hints, $bitmask, $hinthash)
= caller($i);
Increase $i to walk further through the stacktrace (reverse list of callers):
my $i = 0;
while (my #ci = caller($i++)) {
print "$i. ".$ci[1].':'.$ci[2]."\n";
}
Starts with $i=0 and increases $i after passing it to the caller() function. Prints a stacktrace back to the line of the script starting the current process.

Is it possible to get all valid methods for a particular Perl class?

Is it possible to get all valid methods for a particular Perl class?
I am trying to manipulate the symbol table of a class and get all of its methods. I found I can separate out the subroutines from the non-subroutines via the $obj->can($method), but that doesn't do exactly what I think it does.
The following returns:
subroutine, Property, croak, Group, confess, carp, File
However, subroutine isn't a method, (just a subroutine), and croak, confess, and carp were all imported into my package.
What I really want to print out is:
Property,Group, File
But I'll take:
subroutine, Property,Group, File
Below is my program:
#! /usr/bin/env perl
use strict;
use warnings;
use feature qw(say);
my $sections = Section_group->new;
say join ", ", $sections->Sections;
package Section_group;
use Carp;
sub new {
return bless {}, shift;
}
sub Add {
my $self = shift;
my $section = shift;
}
sub Sections {
my $self = shift;
my #sections;
for my $symbol ( keys %Section_group:: ) {
next if $symbol eq "new"; # This is a constructor
next if $symbol eq "Add"; # Not interested in this method
next if $symbol eq "Sections"; # This is it's own method
push #sections, $symbol if $self->can($symbol);
}
return wantarray ? #sections : \#sections;
}
sub subroutine {
my $param1 = shift;
my $param2 = shift;
}
sub Group {
my $self = shift;
my $section = shift;
}
sub File {
my $self = shift;
my $section = shift;
}
sub Property {
my $self = shift;
my $section = shift;
}
This is fairly trivial. We only want to keep those sub names that were originally defined in our package. Every CV (code value) has a pointer to the package where it was defined. Thanks to B, we can examine that:
use B ();
...
if (my $coderef = $self->can($symbol)) {
my $cv = B::svref_2object $coderef;
push #sections, $symbol if $cv->STASH->NAME eq __PACKAGE__;
}
# Output as wanted
That is, we perform introspection using svref_2object. This returns a Perl object representing an internal perl data structure.
If we look into a coderef, we get a B::CV object, which represents the internal CV. The STASH field in a CV points to the Stash where it was defined. As you know, a Stash is just a special hash (internally represented as a HV), so $cv->STASH returns a B::HV. The NAME field of a HV contains the fully qualified package name of the Stash if the HV is a Stash, and not a regular hash.
Now we have all the info we need, and can compare the wanted package name to the name of the stash of the coderef.
Of course, this is simplified, and you will want to recurse through #ISA for general classes.
Nobody likes polluted namespaces. Thankfully, there are modules that remove foreign symbols from the Stash, e.g. namespace::clean. This is no problem when the CVs of all subs you are calling are known at compile time.
What are you trying to do? Why does it matter how a class defined or implements a method it responds to?
Perl is a dynamic language, so that means that methods don't have to exist at all. With AUTOLOAD, a method might be perfectly fine and callable, but never show up in the symbol table. A good interface would make can work in those cases, but there might be cases where a class or an object decides to respond to that with false.
The Package::Stash module can help you find defined subroutines in a particular namespace, but as you say, they might not be defined in the same file. The methods in a class might come from an inherited class. If you care about where they come from, you're probably doing it wrong.

How to pass filehandle as reference between modules and subs in perl

I'm maintaining old Perl code and need to enable strict pragma in all modules. I have a problem in passing a file handle as a reference between modules and subs. We have a common module responsible for opening the log file which is passed as typeglob reference. In other modules, the run function first calls open_log() from the common module, then it passes this file handle to other subs.
Here I've written a simple test to simulate the situation.
#!/usr/bin/perl -w
use strict;
$::STATUS_OK = 0;
$::STATUS_NOT_OK = 1;
sub print_header {
our $file_handle = #_;
print { $$file_handle } "#### HEADER ####"; # reference passing fails
}
sub print_text {
my ($file_handle, $text)= #_;
print_header(\$file_handle);
print { $$file_handle } $text;
}
sub open_file_handle {
my ($file_handle, $path, $name) = #_;
my $filename = $path."\\".$name;
unless ( open ($$file_handle, ">".$filename)) {
print STDERR "Failed to open file_handle $filename for writing.\n";
return $::STATUS_NOT_OK;
}
print STDERR "File $filename was opened for writing successfully.\n";
return $::STATUS_OK;
}
my $gpath = "C:\\Temp";
my $gname = "mylogfile.log";
my $gfile_handle;
if (open_file_handle(\$gfile_handle, $gpath, $gname) == $::STATUS_OK) {
my $text = "BIG SUCCESS!!!\n";
print_text(\$gfile_handle, $text);
print STDERR $text;
} else {
print STDERR "EPIC FAIL!!!!!!!!\n";
}
The Main function first calls open_file_handle and passes a file handle reference to the print_text function. If I comment out the row:
print_header(\$file_handle);
Everything works fine, but I need to pass the file handle reference to other functions from the print_text function, and this doesn't work.
I'm a Java developer and Perl's reference handling is not familiar to me. I don't want to change the open_log() sub to return a file handle (now it returns only status), since I have lots of modules and hundreds of code lines to go through to make this change in all places.
How can I fix my code to make it work?
There are two types of filehandles in Perl. Lexical and global bareword filehandles:
open my $fh, '>', '/path/to/file' or die $!;
open FILEHANDLE, '>', '/path/to/file' or die $!;
You are dealing with the first, which is good. The second one is global and should not be used.
The file handles you have are lexical, and they are stored in a scalar variable. It's called scalar because it has a dollar sign $. These can be passed as arguments to subs.
foo($fh);
They can also be referenced. In that case, you get a scalar reference.
my $ref = \$fh;
Usually you reference stuff if you hand it over to a function so Perl does not make a copy of the data. Think of a reference like a pointer in C. It's only the memory location of the data (structure). The piece of data itself remains where it is.
Now, in your code you have references to these scalars. You can tell because it is dereferenced in the print statement by saying $$fh.
sub print_text {
my ($file_handle, $text)= #_;
print_header(\$file_handle);
print { $$file_handle } $text;
}
So the $file_handle you get as a parameter (that's what the = #_ does) is actually a reference. You do not need to reference it again when you pass it to a function.
I guess you wrote the print_header yourself:
sub print_header {
our $file_handle = #_;
print { $$file_handle } "#### HEADER ####"; # reference passing fails
}
There are a few things here:
- our is for globals. Do not use that. Use my instead.
- Put parenthesis around the parameter assignment: my ($fh) = #_
- Since you pass over a reference to a reference to a scalar, you need to dereference twice: ${ ${ $file_handle } }
Of course the double-deref is weird. Get rid of it passing the variable $file_hanlde to print_header instead of a refence to it:
sub print_text {
my ($file_handle, $text)= #_;
print_header($file_handle); # <-- NO BACKSLASH HERE
print { $$file_handle } $text;
}
That is all you need to to make it work.
In general, I would get rid of all the references to the $file_handle vars here. You don't need them. The lexical filehandle is already a reference to an IO::Handle object, but don't concern yourself with that right now, it is not important. Just remember:
use filehandles that have a $ up front
pass them without references and you do not need to worry about \ and ${} and stuff like that
For more info, see perlref and perlreftut.
You are having difficulties because you added multiple extra level of references. Objects like lexical filehandles already are references.
If you have difficulties keeping track of what is a reference, you might want to use some kind of hungarian notation, like a _ref suffix.
In print_text, this would be:
sub print_text {
my ($file_handle_ref, $text)= #_;
print_header(\$file_handle_ref);
print { $$file_handle_ref } $text;
}
And in print_header:
sub print_header {
my ($file_handle_ref_ref) = #_; # don't use `our`, and assign to a lvalue list!
print { $$$file_handle_ref_ref } "#### HEADER ####"; # double derefernence … urgh
}
A far superior solution is to pass the filehandle around directly, without references.
sub print_header {
my ($file_handle) = #_;
print {$file_handle} "#### HEADER ####"; # no reference, no cry
}
sub print_text {
my ($file_handle, $text)= #_;
print_header($file_handle);
print {$file_handle} $text;
}
And in the main part:
my $gpath = "C:/Temp"; # forward slashes work too, as long as you are consistent
my $gname = "mylogfile.log";
if (open_file_handle(\my $gfile_handle, $gpath, $gname) == $::STATUS_OK) {
my $text = "BIG SUCCESS!!!\n";
print_text($gfile_handle, $text);
...
} else {
...
}
the reference operator is "\" (backslash)
anything includes arrays, hashes and even sub-routines can be referenced
the 5th line to count backwards
print_text(\$gfile_handle, $text);
you passed a referenced variable \$gfile_handle to the sub-routine print_text
sub print_text {
my ($file_handle, $text)= #_;
print_header(\$file_handle);
print { $$file_handle } $text;
}
and in this sub-routine, $file_handle is already a reference
then your referenced it again and pass it to the sub-routine print_header
so, you can solve this problem by putting off the reference operator the 5th line to count backwards like this:
print_text($gfile_handle, $text);
and try again :-)