Should a subroutine always return explicitly? - perl

If perlcritic says "having no returns in a sub is wrong", what is the alternative if they really aren't needed?
I've developed two apparently bad habits:
I explicitly assign variables to the '$main::' namespace.
I then play with those variables in subs.
For example, I might do..
#!/usr/bin/perl
use strict;
use warnings;
#main::array = (1,4,2,6,1,8,5,5,2);
&sort_array;
&push_array;
&pop_array;
sub sort_array{
#main::array = sort #main::array;
for (#main::array){
print "$_\n";
}
}
sub push_array{
for ( 1 .. 9 ){
push #main::array, $_;
}
}
sub pop_array {
for ( 1 .. 3 ){
pop #main::array;
}
}
I don't do this all the time. But in the above, it makes sense, because I can segregate the operations, not have to worry about passing values back and forth and it generally looks tidy to me.
But as I said, perl critic says its wrong - because there's no return..
So, is anyone able to interpret what I'm trying to do and suggest a better way of approaching this style of coding in perl? eg. am I sort of doing OOP?

In short - yes, you're basically doing OO, but in a way that's going to confuse everyone.
The danger of doing subs like that is that you're acting at a distance. It's a bad coding style to have to look somewhere else entirely for what might be breaking your code.
This is generally why 'globals' are to be avoided wherever possible.
For a short script, it doesn't matter too much.
Regarding return values - Perl returns the result of the last expression by default. (See: return)
(In the absence of an explicit return, a subroutine, eval, or do FILE automatically returns the value of the last expression evaluated.)
The reason Perl critic flags it is:
Require all subroutines to terminate explicitly with one of the following: return, carp, croak, die, exec, exit, goto, or throw.
Subroutines without explicit return statements at their ends can be confusing. It can be challenging to deduce what the return value will be.
Furthermore, if the programmer did not mean for there to be a significant return value, and omits a return statement, some of the subroutine's inner data can leak to the outside.
Perlcritic isn't always right though - if there's good reason for doing what you're doing, then turn it off. Just as long as you've thought about it and are aware of the risks an consequences.
Personally I think it's better style to explicitly return something, even if it is just return;.
Anyway, redrafting your code in a (crude) OO fashion:
#!/usr/bin/perl
use strict;
use warnings;
package MyArray;
my $default_array = [ 1,4,2,6,1,8,5,5,2 ];
sub new {
my ( $class ) = #_;
my $self = {};
$self -> {myarray} = $default_array;
bless ( $self, $class );
return $self;
}
sub get_array {
my ( $self ) = #_;
return ( $self -> {myarray} );
}
sub sort_array{
my ( $self ) = #_;
#{ $self -> {myarray} } = sort ( #{ $self -> {myarray} } );
for ( #{ $self -> {myarray} } ) {
print $_,"\n";
}
return 1;
}
sub push_array{
my ( $self ) = #_;
for ( 1 .. 9 ){
push #{$self -> {myarray}}, $_;
}
return 1;
}
sub pop_array {
my ( $self ) = #_;
for ( 1 .. 3 ){
pop #{$self -> {myarray}};
}
return 1;
}
1;
And then call it with:
#!/usr/bin/perl
use strict;
use warnings;
use MyArray;
my $array = MyArray -> new();
print "Started:\n";
print join (",", #{ $array -> get_array()} ),"\n";
print "Reshuffling:\n";
$array -> sort_array();
$array -> push_array();
$array -> pop_array();
print "Finished:\n";
print join (",", #{ $array -> get_array()} ),"\n";
It can probably be tidied up a bit, but hopefully this illustrates - within your object, you've got an internal 'array' which you then 'do stuff with' by making your calls.
Result is much the same (I think I've replicated the logic, but don't trust that entirely!) but you have a self contained thing going on.

If the function doesn't mean to return anything, there's no need to use return!
No, you don't use any aspects of OO (encapsulation, polymorphism, etc). What you are doing is called procedural programming. Nothing wrong with that. All my work for nuclear power plants was written in that style.
The problem is using #main::array, and I'm not talking about the fact that you could abbreviate that to #::array. Fully-qualified names escape strict checks, so they are far, far more error-prone. Mistyped var name won't get caught as easily, and it's easy to have two pieces of code collide by using the same variable name.
If you're just using one file, you can use my #array, but I presume you are using #main::array because you are accessing it from multiple files/modules. I suggest placing our #array in a module, and exporting it.
package MyData;
use Exporter qw( import );
our #EXPORT = qw( #array );
our #array;
1;
Having some kind of hint in the variable name (such as a prefix or suffix) indicating this is a variable used across many modules would be nice.
By the way, if you wanted do create an object, it would look like
package MyArray;
sub new {
my $class = shift;
my $self = bless({}, $class);
$self->{array} = [ #_ ];
return $self;
}
sub get_elements {
my ($self) = #_;
return #{ $self->{array} };
}
sub sort {
my ($self) = #_;
#{ $self->{array} } = sort #{ $self->{array} };
}
sub push {
my $self = shift;
push #{ $self->{array} }, #_;
}
sub pop {
my ($self, $n) = #_;
return splice(#{ $self->{array} }, 0, $n//1);
}
my $array = MyArray->new(1,4,2,6,1,8,5,5,2);
$array->sort;
print("$_\n") for $array->get_elements();
$array->push_array(1..9);
$array->pop_array(3);
I improved your interface a bit. (Sorting shouldn't print. Would be nice to push different things and to pop other than three elements.)

Related

Catching undefined value accessing in Perl

How can I catch access to member variables?
$Class1->{Class2}
If the Class2 field doesn't exist, is is possible to catch this from an internal function?
You can, but you probably shouldn't. The problem here is - if you access a variable within a class directly... then you just can. You can prevent this with a couple of workarounds - and this is where things like Moose come in.
And there's a couple of slightly hacky tricks like inside-out objects (which I think aren't common practice any more - Perl Best Practice advocated them some years back) or using anonymous hashes to hold state.
But failing that - why not use an accessor, and auto-generate one using 'AUTOLOAD'.
#!/usr/bin/env perl
package MyClass;
use strict;
use warnings;
use vars '$AUTOLOAD';
sub AUTOLOAD {
my ( $self ) = #_;
my $subname = $AUTOLOAD =~ s/.*:://r;
if ( $self -> {$subname} ) {
return $self -> {$subname};
}
warn "Sub called $subname was called\n";
return "$subname";
}
sub new {
my ( $class ) = #_;
my $self = {};
bless $self, $class;
}
package main;
use strict;
use warnings;
my $object = MyClass -> new;
$object -> {var} = "fleeg";
print "Undef fiddle was: ", $object -> fiddle,"\n";
print "But 'var' was: ", $object -> var,"\n";
This has the same problem, in that changing method names might cause things to break. However it has the advantage that you can handle 'invalid' method calls however you like.
But really - explicit 'get' and 'set' methods are better choices for most use-cases.
You do this by providing proper getter/setter methods that wrap around your class/instance variables. The internals should never be accessed directly, particularly from outside of the class itself (it's wise to not do so within the class either, except for the actual method that maintains that specific attribute. Here's a very basic example:
use warnings;
use strict;
package A;
sub new {
my ($class, %args) = #_;
my $self = bless {}, $class;
$self->x($args{x});
$self->y($args{y});
return $self;
}
sub x {
my ($self, $x) = #_;
$self->{x} = $x if defined $x;
return $self->{x} // 1;
}
sub y {
my ($self, $y) = #_;
$self->{y} = $y if defined $y;
return $self->{y} // 2;
}
package main;
my $obj = A->new(x => 5, y => 3);
print $obj->x ."\n";
print $obj->y ."\n";
Now, you could just as easily do print $obj->{x}, but that's where your problem is. What happens when the code is much more complicated than this, and for some reason you want to change the x attribute name to foo, but retain the x() method? $obj->{x} will now be undef as its never set.
Always use the provided methods for accessing attributes of a class/object. Encapsulation such as this is a staple of OO programming.

Hook to provide a value for every Hash lookup in Perl

Is it possible to provide a hook in Perl to make sure no Hash key lookup fails ?
Example :
use strict;
use warnings;
my %hash_example = ( "a"=>"apple", "b"=>"ball" );
print $hash_example{"a"}; # Goes Fine.
print $hash_example{"c"}; # Throws Warning ( "Use of uninitialized value " ).
Codepad link
Whenever a hash lookup happens, some subroutine could get called which can provide a default value.
I mean, any hash lookup should call a sub ( say "get_hash_value (hash_ref, key) " ) and pass the hash and key to it. A sample of such a sub is shown below :
sub get_hash_value {
my $hash_ref = shift;
my $key = shift;
if ( exists $hash_ref->{$key} ) { # For Normal Lookup.
return $hash_ref->{$key};
}
else {
# This is the interesting place where we could provide our own values.
return "custom_value_based_on_certain_conditions"; # Some value
}
}
Another consequence would be the ability to alter the value returned against a key. We would be able to return a different value than what actually is stored against that key ( in that hash ).
There might not be a valid use case for this but am intrigued and would like to learn if such things are supported in Perl.
As said by Сухой27 in comment, this works fine:
my %hash_example = ( "a"=>"apple", "b"=>"ball" );
print $hash_example{"a"};
print $hash_example{"c"} // "custom_value_based_on_certain_conditions";
Doc on logical defined or
I would suggest that trying to alter how a hash lookup "works" is a really terrible idea, as a good way to create code that's hard to maintain.
However instead I would suggest you look at creating an object instead of a hash. They are basically the same thing, but an object includes code, and there is an expectation that the code within the object is 'doing it's own thing'.
So at a basic level:
#!/usr/bin/env perl
use strict;
use warnings;
package Hash_Ob;
sub new {
my ($class) = #_;
my $self = {};
bless( $self, $class );
return $self;
}
sub get_value {
my ( $self, $valuename ) = #_;
if ( $self->{$valuename} ) {
return $self->{$valuename};
}
else {
#generate your own value here!
$self->{$valuename} = 42;
return $self->{$valuename};
}
}
1;
Which you'd then 'call' using:
#!/usr/bin/env perl
use strict;
use warnings;
use Hash_Ob;
my $magic_hash = Hash_Ob -> new();
print $magic_hash -> get_value('new_value');
This avoids the problem of altering how a 'well known' mechanism actually works, and so future maintenance programmers will not curse your name.
Then maybe you want to use a tied hash. Tying is a mechanism to change the behavior of a builtin data type. See perltie for the gory details.
{
package HashWithDefault;
use Tie::StdHash;
our #ISA = qw(Tie::StdHash); # inherit STORE, FIRST, NEXT, etc.
sub TIEHASH {
my ($pkg,$default_val) = #_;
return bless { __default_val__ => $default_val}, $pkg;
}
sub FETCH {
my ($self,$key) = #_;
exists $self->{$key} ? $self->{$key} : $self->{__default_val__};
}
sub CLEAR { # don't clear the default val
my $self = shift;
%$self = ( __default_val__ => $self->{__default_val__} );
}
}
tie my %hash, 'HashWithDefault', "42";
%hash = (foo => 123, bar => 456);
print $hash{foo}; # 123
print $hash{quux}; # 42

returning a lazily-computed scalar, in perl

I'm trying to add some functionality to our code base by using tied scalars.
We have a function which is specified to return scalars. I thought I could add some features to the system by tie-ing these scalars before returning them, but it looks like the FETCH method is called just before the return, which results in an untied scalar being returned.
Is there any way around this?
I really want to keep the subroutine's interface (returning scalars) intact if it's at all possible.
use strict;
use warnings;
main();
sub GetThing{
my $thing;
tie $thing, 'mything', #_;
return $thing;
}
sub main {
my %m;
$m{pre} = GetThing('Fred');
print "1\n";
print $m{pre};
print "2\n";
print $m{pre};
print "3\n";
}
package mything;
require Tie::Scalar;
my #ISA = qw(Tie::StdScalar);
sub TIESCALAR {
my $class = shift;
bless {
name => shift || 'noname',
}, $class;
}
sub FETCH {
my $self = shift;
print "ACCESS ALERT!\n";
return " NAME: '$self->{name}'\n";
}
Desired output:
1
ACCESS ALERT!
NAME: 'Fred'
2
ACCESS ALERT!
NAME: 'Fred'
3
I can get the desired output by returning a reference, and dereferencing on each access, but that ruins our established interface, and makes it more confusing for our users.
--Buck
As DVK said, tie applies to containers, so isn't useful for returned values.
For that, you use overloading. An example (not all the possible overloaded operations are supplied; see http://perldoc.perl.org/overload.html#Minimal-set-of-overloaded-operations):
use strict;
use warnings;
main();
sub GetThing{
my $thing;
$thing = "mything"->new(#_);
return $thing;
}
sub main {
my %m;
$m{pre} = GetThing('Fred');
print "1\n";
print $m{pre};
print "2\n";
print $m{pre};
print "3\n";
}
package mything;
use overload 'fallback' => 1, '""' => 'FETCH';
sub new {
my $class = shift;
bless {
name => shift || 'noname',
}, $class;
}
sub FETCH {
my $self = shift;
print "ACCESS ALERT!\n";
return " NAME: '$self->{name}'\n";
}
As mentioned in other answers, tie applies to containers, and not to values, so there is no way to assign a tied variable to another variable and retain the tied properties.
Since assignment is out, you need to pass the container into the GetThing routine. You can do this by reference as follows:
use strict;
use warnings;
main();
sub GetThing{
tie ${$_[1]}, 'mything', $_[0];
}
sub main {
my %m;
GetThing('Fred' => \$m{pre});
print "1\n";
print $m{pre};
print "2\n";
print $m{pre};
print "3\n";
}
package mything;
require Tie::Scalar;
my #ISA = qw(Tie::StdScalar);
sub TIESCALAR {
my $class = shift;
bless {
name => shift || 'noname',
}, $class;
}
sub FETCH {
my $self = shift;
print "ACCESS ALERT!\n";
return " NAME: '$self->{name}'\n";
}
which produces the correct output.
However, if you want to retain the assignment, you will need to use overloading, which applies to values (actually to objects, but they themselves are values). Without more detail on your intended purpose it is hard to give a complete answer, but this will meet your stated requirements:
use strict;
use warnings;
main();
sub GetThing{
return mything->new( shift );
}
sub main {
my %m;
$m{pre} = GetThing('Fred');
print "1\n";
print $m{pre};
print "2\n";
print $m{pre};
print "3\n";
}
package mything;
sub new {
my $class = shift;
bless {
name => shift || 'noname',
}, $class;
}
use overload '""' => sub { # '""' means to overload stringification
my $self = shift;
print "ACCESS ALERT!\n";
return " NAME: '$self->{name}'\n";
};
Both ties and overloads can get complicated, so read through all of the documentation if anything is not clear.
First, the exact method of doing what you are proposing seems technically impossible:
Tied variables have the tie attached to the variable itself, not to its value.
In Perl, subroutine's return values are returned by value, meaning you take the value passed to return, access it (in you case, accessing the tied variable and calling FETCH in the process) - and then copy that value! Which means that what the caller gets is a scalar VALUE, not a scalar variable (tied or untied).
Your confusion, in short, seems to stem from mixing together variables (locations in program's symbol table) and values stored in those variables.
Second, you were somewhat unclear as to what exactly you are trying to achieve, so it's hard to propose how to achieve what you want. But assuming, based on your description, that you wanted to call some method upon subroutine's return (possibly passing it the return value), you CAN do that.
To do so, you need to employ what fancy people call aspect programming. The politically (and technically) correct way of doing it in Perl is by using Moose.
However, you can DIY it, by basically replacing the original method with a wrapper method.
The exact mechanics of both Moose and DIY approaches can be seen in the first two answers to the following SO question, so I won't copy/paste them here, hope you don't mind:
Simulating aspects of static-typing in a duck-typed language
If you're feeling adventurous, you could also use the Scalar::Defer module which provides a general-purpose mechanism for a scalar variable to compute a value lazily, either once or on each access.

Can I overload Perl's =? (And a problem while use Tie)

I choose to use tie and find this:
package Galaxy::IO::INI;
sub new {
my $invocant = shift;
my $class = ref($invocant) || $invocant;
my $self = {']' => []}; # ini section can never be ']'
tie %{$self},'INIHash';
return bless $self, $class;
}
package INIHash;
use Carp;
require Tie::Hash;
#INIHash::ISA = qw(Tie::StdHash);
sub STORE {
#$_[0]->{$_[1]} = $_[2];
push #{$_[0]->{']'}},$_[1] unless exists $_[0]->{$_[1]};
for (keys %{$_[2]}) {
next if $_ eq '=';
push #{$_[0]->{$_[1]}->{'='}},$_ unless exists $_[0]->{$_[1]}->{$_};
$_[0]->{$_[1]}->{$_}=$_[2]->{$_};
}
$_[0]->{$_[1]}->{'='};
}
if I remove the last "$[0]->{$[1]}->{'='};", it does not work correctly.
Why ?
I know a return value is required. But "$[0]->{$[1]};" cannot work correctly either, and $[0]->{$[1]}->{'='} is not the whole thing.
Old post:
I am write a package in Perl for parsing INI files.
Just something based on Config::Tiny.
I want to keep the order of sections & keys, so I use extra array to store the order.
But when I use " $Config->{newsection} = { this => 'that' }; # Add a section ", I need to overload '=' so that "newsection" and "this" can be pushed in the array.
Is this possible to make "$Config->{newsection} = { this => 'that' };" work without influence other parts ?
Part of the code is:
sub new {
my $invocant = shift;
my $class = ref($invocant) || $invocant;
my $self = {']' => []}; # ini section can never be ']'
return bless $self, $class;
}
sub read_string {
if ( /^\s*\[\s*(.+?)\s*\]\s*$/ ) {
$self->{$ns = $1} ||= {'=' => []}; # ini key can never be '='
push #{$$self{']'}},$ns;
next;
}
if ( /^\s*([^=]+?)\s*=\s*(.*?)\s*$/ ) {
push #{$$self{$ns}{'='}},$1 unless defined $$self{$ns}{$1};
$self->{$ns}->{$1} = $2;
next;
}
}
sub write_string {
my $self = shift;
my $contents = '';
foreach my $section (#{$$self{']'}}) {
}}
Special Symbols for Overload
lists the behaviour of Perl overloading for '='.
The value for "=" is a reference to a function with three arguments, i.e., it looks like the other values in use overload. However, it does not overload the Perl assignment operator. This would go against Camel hair.
So you will probably need to rethink your approach.
This is not exactly JUST operator overloading, but if you absolutely need this functionality, you can try a perl tie:
http://perldoc.perl.org/functions/tie.html
Do you know about Config::IniFiles? You might consider that before you go off and reinvent it. With some proper subclassing, you can add ordering to it.
Also, I think you have the wrong interface. You're exposing the internal structure of your object and modifying it through magical assignments. Using methods would make your life much easier.

Deferring code on scope change in Perl

I often find it useful to be able to schedule code to be executed upon leaving the current scope. In my previous life in TCL, a friend created a function we called defer.
It enabled code like:
set fp [open "x"]
defer("close $fp");
which was invoked when the current scope exited. The main benefit is that it's always invoked no matter how/where I leave scope.
So I implemented something similar in Perl but it seems there'd be an easier way. Comments critiques welcome.
The way I did it in Perl:
create a global, tied variable which holds an array of subs to be executed.
whenever I want to schedule a fn to be invoked on exit, I use local to change the array.
when I leave the current scope, Perl changes the global to the previous value
because the global is tied, I know when this value change happens and can invoke the subs in the list.
The actual code is below.
Is there a better way to do this? Seems this would be a commonly needed capability.
use strict;
package tiescalar;
sub TIESCALAR {
my $class = shift;
my $self = {};
bless $self, $class;
return $self;
}
sub FETCH {
my $self = shift;
return $self->{VAL};
}
sub STORE {
my $self = shift;
my $value = shift;
if (defined($self->{VAL}) && defined($value)) {
foreach my $s (#{$self->{VAL}}) { &$s; }
}
$self->{VAL} = $value;
}
1;
package main;
our $h;
tie($h, 'tiescalar');
$h = [];
printf "1\n";
printf "2\n";
sub main {
printf "3\n";
local $h = [sub{printf "9\n"}];
push(#$h, sub {printf "10\n";});
printf "4\n";
{
local $h = [sub {printf "8\n"; }];
mysub();
printf "7\n";
return;
}
}
sub mysub {
local $h = [sub {printf "6\n"; }];
print "5\n";
}
main();
printf "11\n";
Well, your specific case is already handled if you use lexical filehandles (as opposed to the old style bareword filehandles). For other cases, you could always use the DESTROY method of an object guaranteed to go to zero references when it goes out of scope:
#!/usr/bin/perl
use strict;
use warnings;
for my $i (1 .. 5) {
my $defer = Defer::Sub->new(sub { print "end\n" });
print "start\n$i\n";
}
package Defer::Sub;
use Carp;
sub new {
my $class = shift;
croak "$class requires a function to call\n" unless #_;
my $self = {
func => shift,
};
return bless $self, $class;
}
sub DESTROY {
my $self = shift;
$self->{func}();
}
ETA: I like brian's name better, Scope::OnExit is a much more descriptive name.
Instead of using tie for this, I think I'd just create an object. You can also avoid the local that way too.
{
my $defer = Scope::OnExit->new( #subs );
$defer->push( $other_sub ); # and pop, shift, etc
...
}
When the variable goes out of scope, you have a chance to do things in the DESTROY method.
Also, in the example you posted, you need to check that the values you store are code references, and it's probably a good idea to check that the VAL value is an array reference:
sub TIESCALAR { bless { VAL => [] }, $_[0] }
sub STORE {
my( $self, $value ) = #_;
carp "Can only store array references!" unless ref $value eq ref [];
foreach { #$value } {
carp "There should only be code refs in the array"
unless ref $_ eq ref sub {}
}
foreach ( #{ $self->{VAL}} ) { $_->() }
$self->{VAL} = $value;
}
You may want to try out B::Hooks::EndOfScope
I Believe this works:
use B::Hooks::EndOfScope;
sub foo {
on_scope_end {
$codehere;
};
$morecode
return 1; # scope end code executes.
}
foo();
I think you want something like Scope::Guard, but it can't be pushed. Hmmm.
Thanks.
Trivially,
sub OnLeavingScope::DESTROY { ${$_[0]}->() }
used like:
{
...
my $onleavingscope = bless \sub { ... }, 'OnLeavingScope';
my $onleavingscope2 = bless \\&whatever, 'OnLeavingScope';
...
}
(The extra level of having a reference to a reference to a sub is necessary only to work around an optimization (that's arguably a bug) when using a non-closure anonymous sub.)