Is there any difference between "standard" and block package declaration? - perl

Usually, a package starts simply as
package Cat;
... #content
1;
I just discovered that, starting with Perl 5.14, there is a "block" syntax too.
package Cat {
... #content
}
It is probably the same. But just to be sure, is there any difference?
And about the 1; at the end of the package file: the return value of a block is the value of the last expression evaluated in it. So can I put the 1; before the closing }? As far as keeping require happy goes, is there any difference between:
package Cat {
... #content
1;
}
and
package Cat {
... #content
}
1;

Of course there is a difference. The second variant has a block.
A package declaration sets the current namespace for subs and globals. This is scoped normally, i.e. the scope ends with the end of file or eval string, or with an enclosing block.
The package NAME BLOCK syntax is just syntactic sugar for
{ package NAME;
...;
}
and even compiles down to the same opcodes.
While the package declaration is syntactically a statement, this isn't semantically true; it just sets compile-time properties. Therefore, the last statement of the last block is the last statement of the file, and there is no difference between
package Foo;
1;
and
package Foo {
1;
}
wrt. the last statement.
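A minimal sketch of the scoping described above, assuming Perl 5.14 or later. The sub name where is made up for illustration; the point is only that the package declared with a block ends at the closing brace.

```perl
use v5.14;      # package BLOCK needs Perl 5.14 or later
use warnings;

package Cat {
    sub where { return __PACKAGE__ }   # compiled inside package Cat
}

# The block has ended, so we are back in the enclosing package (main).
sub where { return __PACKAGE__ }

say Cat::where();   # Cat
say where();        # main
```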
The package BLOCK syntax is interesting mainly because it looks like class Foo {} in other languages, I think. Because the block limits scope, this also makes using properly scoped variables easier. Think:
package Foo;
our $x = 1;
package main;
$::x = 42;
say $x;
Output: 1, because our is lexically scoped like my and just declares an alias! This can be prevented by the block syntax:
package Foo {
our $x = 1;
}
package main {
$::x = 42;
say $x;
}
works as expected (42), although strict isn't happy.
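For what it's worth, a variant that also keeps strict happy could look like the sketch below. It assumes each package declares its own alias with our, and reaches the other package's variable through its fully qualified name.

```perl
use v5.14;      # enables strict and say
use warnings;

package Foo {
    our $x = 1;          # alias for $Foo::x, visible only inside this block
}

package main {
    our $x = 42;         # alias for $main::x
    say $x;              # 42
    say $Foo::x;         # 1
}
```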

package Foo { ... }
is equivalent to
{ package Foo; ... }
which differs from the following in that it creates a block:
package Foo; ...
This only matters if you have code that follows in the file.
package Foo { ... }
isn't equivalent to
BEGIN { package Foo; ... }
I would have liked this, but that proposal was not accepted.
require (and do) requires that the last expression evaluated in the file returns a true value. { ... } evaluates to the last value evaluated within, so the following is fine:
package Foo { ...; 1 }
Placing the 1; on the outside makes more sense to me — it pertains to the file rather than the package — but that's merely a style choice.
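To try both placements, something like the following sketch should do. It writes two throwaway modules to a temporary directory and requires each one; the module names InsideTrue and OutsideTrue are invented for the test, and the modules need Perl 5.14 or later for the block syntax.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Two hypothetical modules: one with the true value inside the block,
# one with it after the block.
my %source = (
    'InsideTrue.pm'  => "package InsideTrue { sub hi { 'inside' } 1 }\n",
    'OutsideTrue.pm' => "package OutsideTrue { sub hi { 'outside' } }\n1;\n",
);

my $dir = tempdir(CLEANUP => 1);
for my $file (sort keys %source) {
    my $path = File::Spec->catfile($dir, $file);
    open my $fh, '>', $path or die "Cannot write $path: $!";
    print {$fh} $source{$file};
    close $fh or die "Cannot close $path: $!";
}

unshift @INC, $dir;

my %loaded;
for my $module (qw(InsideTrue OutsideTrue)) {
    # require dies if the file does not end with a true value
    $loaded{$module} = (eval "require $module; 1") ? 1 : 0;
    print "$module: ", ($loaded{$module} ? "loaded" : "failed: $@"), "\n";
}
```

If both lines report "loaded", the position of the 1 really is only a matter of style.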

There is a difference :) Try running the two code snippets below (just import the module in a Perl script):
# perlscript.pl
use wblock;
wblock::wbmethod();
First Code snippet without block, wblock.pm
package wblock1;
my $a =10;
sub wbmethod1{
print "in wb $a";
}
package wblock;
sub wbmethod{
print "in wb1 $a";
}
1;
Second with wblock.pm
package wblock1 {
my $a =10;
sub wbmethod1{
print "in wb $a";
}
1;
}
package wblock {
sub wbmethod{
print "in wb1 $a";
}
1;
}
The difference, as you might have seen, is that the variable $a is not available in the wblock package when we use BLOCK. Without BLOCK we can use $a from the other package, as its scope is the file.
More to say from perldoc itself:
That is, the forms without a BLOCK are operative through the end of
the current scope, just like the my, state, and our operators.
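Since $a is a special variable that strict does not complain about, the effect is easier to see with another name. A rough sketch, with package and variable names chosen only for illustration; the block form is compiled through a string eval so that the compile error can be caught:

```perl
use v5.14;
use warnings;

# Statement form: both packages share the file scope, so the lexical leaks.
package First;
my $shared = 10;

package Second;
sub peek { return $shared }      # same $shared as above

package main;
say Second::peek();              # 10

# Block form: $hidden ends with its block, so the second package
# cannot see it and the code fails to compile under strict.
my $compiled = eval q{
    package Third  { my $hidden = 10; }
    package Fourth { sub peek { return $hidden } }
    1;
};
say($compiled ? 'compiled' : 'did not compile');   # did not compile
```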

There is a difference, but only if you are writing hard-to-maintain code; specifically, if you are using a my variable across packages. The convention in Perl is to have only one package per file and the entire package in one file.

There's one other difference that other answers haven't given, that may help explain why there are two.
package Foo;
sub bar { ... }
was always the way to do it in Perl 5. The package BLOCK syntax of
package Foo {
sub bar { ... }
}
was only added at perl 5.14.
The main difference then is that the latter form is neater but only works since 5.14; the former form will work back to older versions. The neater form was added largely for visual neatness; it doesn't have any semantic difference worth worrying about.

Related

How to use `our` class variables with `UNITCHECK` correctly with `use strict`?

As Perl constants are somewhat strange to use, I decided to implement my "class variables" as our variables, just like:
our $foo = '...';
However when I added a UNITCHECK block using the class variables, I realized that the variables were not set yet, so I changed the code to:
BEGIN {
our $foo = '...';
}
UNITCHECK {
if ($foo eq 'bla') {
#...
}
}
Then I realized that I had mistyped some variable names in UNITCHECK, so I decided to add use warnings and use strict.
Unfortunately I'm getting new errors like
Variable "$foo" is not imported at .. line ..
When I initialize the variable outside BEGIN, the error goes away, but then I have the original problem back.
So I wonder:
Is our $var = 'value'; the recommended and correct use, or should it be split into our $var; outside the BEGIN and $var = 'value'; inside BEGIN?
As my list of variables is rather long, I'm trying to avoid listing them twice (introducing the possibility of misspelling some again).
What is the recommended correct way to do it?
our is lexically scoped so in your code the variable only exists in the BEGIN block. You will need to separate out the declaration from the assignment like this:
our $foo;
BEGIN {
$foo = '...';
}
UNITCHECK {
if ($foo eq 'bla') {
#...
}
}
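A runnable sketch of that split, assuming two variables ($bar is added here just to show a list). The names still appear twice, but with strict enabled a misspelling in the BEGIN block is a compile-time error rather than a silent new global.

```perl
use strict;
use warnings;

our ($foo, $bar);                  # declared once, at file scope

BEGIN {
    ($foo, $bar) = ('bla', 'baz'); # assigned at compile time
}

my @seen;
UNITCHECK {
    # Runs right after this file has been compiled; the values are set.
    push @seen, "UNITCHECK: $foo $bar";
}

push @seen, "runtime: $foo $bar";
print "$_\n" for @seen;
# UNITCHECK: bla baz
# runtime: bla baz
```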

When and why would you use a class with no data members?

I have noticed some Perl modules use a class based structure, but don't manage any data. The class is simply used to access the methods within and nothing more.
Consider the following example:
Class.pm
package Class;
use Moose;
sub do_something {
print "Hi!\n";
}
1;
test.pl
use Class;
# Instantiate an object from the class
my $obj = Class->new();
$obj->do_something();
In this example you can see that you would first instantiate an instance of the class, then call the method from the created object.
The same end result can be achieved like so:
Module.pm
package Module;
use strict;
use warnings;
sub do_something {
print "Hi!\n";
}
1;
test.pl
use Module;
Module::do_something();
I am wondering why people write modules using the first approach, and if there is some benefit that it provides. To me it seems like it adds an extra step, because in order to use the methods, you first need to instantiate an object of the class.
I don't understand why people would program like this unless it has some benefit that I am not seeing.
One benefit is inheritance. You can subclass behavior of an existing class if it supports the -> style subroutine calls (which is a weaker statement than saying the class is object-oriented, as I said in a comment above).
package Class;
sub new { bless \__PACKAGE__,__PACKAGE__ }
sub do_something { "foo" }
sub do_something_else { 42 }
1;
package Subclass;
@Subclass::ISA = qw(Class);
sub new { bless \__PACKAGE__,__PACKAGE__ }
sub do_something_else { 19 }
package main;
use feature 'say';
$o1 = Class->new;
$o2 = Subclass->new;
say $o1->do_something; # foo
say $o2->do_something; # foo
say $o1->do_something_else; # 42
say $o2->do_something_else; # 19
A prominent use of this technique is the UNIVERSAL class, which all blessed references implicitly subclass. The methods defined in the UNIVERSAL namespace generally take a package name as the first argument (or resolve a reference in the first argument to its package name), and return some package information. The DB class also does something like this (though the DB package also maintains plenty of state).
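As a small illustration of the same idea without any object at all, here is a sketch using class-method calls only. Greeter and LoudGreeter are hypothetical names; isa and can are the methods every class inherits from UNIVERSAL.

```perl
use v5.14;
use warnings;

package Greeter {
    sub greet  { return 'hello' }
    sub volume { return 1 }
}

package LoudGreeter {
    use parent -norequire, 'Greeter';   # inherit without loading a file
    sub volume { return 11 }            # override one method
}

package main;

# Class-method calls: no object is ever constructed.
say LoudGreeter->greet;                          # hello
say LoudGreeter->volume;                         # 11
say LoudGreeter->isa('Greeter') ? 'yes' : 'no';  # yes
say LoudGreeter->can('greet')   ? 'yes' : 'no';  # yes
```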

Moose Perl: "modify multiple methods in all subclasses"

I have a Moose BaseDBModel which has different subclasses mapping to my tables in the database. All the methods in the subclasses are like "get_xxx" or "update_xxx" which refers to the different DB operations.
Now I want to implement a cache system for all these methods, so my idea is that "before" every method named like "get_xxx", I search my memcache pool for a value using the method name as the key. If I find the value, I return it directly instead of calling the method.
Ideally, my code is like this:
BaseDBModel
package Speed::Module::BaseDBModel;
use Moose;
sub BUILD {
my $self = shift;
for my $method ($self->meta->get_method_list()){
if($method =~ /^get_/){
$self->meta->add_before_method_modifier($method,sub {
warn $method;
find_value_by_method_name($method);
[return_value_if_found_value]
});
}
}
}
SubClasses Example 1
package Speed::Module::Character;
use Moose;
extends 'Speed::Module::BaseDBModel';
method get_character_by_id {
xxxx
}
Now my problem is that when my program is running, it modifies the methods repeatedly. For example:
restart apache
visit the page which will call get_character_by_id, so I can see one warning message
Codes:
my $db_character = Speed::Module::Character->new(glr => $self->glr);
$character_state = $db_character->get_character_by_id($cid);
Warnings:
get_character_by_id at /Users/dyk/Sites/speed/lib/Speed/Module/BaseDBModel.pm line 60.
but if I refresh the page, I see two warning messages
Warnings:
get_character_by_id at /Users/dyk/Sites/speed/lib/Speed/Module/BaseDBModel.pm line 60.
get_character_by_id at /Users/dyk/Sites/speed/lib/Speed/Module/BaseDBModel.pm line 60.
I am using mod_perl 2.0 with Apache. Every time I refresh the page, my get_character_by_id method is modified again, which I don't want.
Isn't your BUILD doing the add_before every time you construct a new instance? I'm not sure that's what you want.
Well, the simple/clunky way would be to set some package-level flag so you only do it once.
Otherwise, I think you want to hook into Moose's own attribute building. Have a look at this: http://www.perlmonks.org/?node_id=948231
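The "do it only once" flag could be sketched roughly as follows. Note that this is plain Perl working on the symbol table, not Moose's modifier API, and that Cacher, wrap_getters and the in-memory %cache (standing in for the memcache pool) are all made-up names for the illustration.

```perl
use strict;
use warnings;

package Cacher;          # hypothetical helper, plain Perl rather than Moose

my %wrapped;             # classes whose getters are already wrapped
my %cache;               # stands in for the memcache pool

sub wrap_getters {
    my ($class) = @_;
    return if $wrapped{$class}++;        # the flag: do the work only once

    no strict 'refs';
    no warnings 'redefine';
    for my $name (grep { /^get_/ } keys %{"${class}::"}) {
        next unless defined &{"${class}::$name"};
        my $orig = \&{"${class}::$name"};
        *{"${class}::$name"} = sub {
            # Key on class, method name and arguments (not the invocant).
            my $key = join "\0", $class, $name, @_[1 .. $#_];
            $cache{$key} //= $orig->(@_);
            return $cache{$key};
        };
    }
    return;
}

package Character;

my $lookups = 0;
sub new { return bless {}, shift }
sub get_character_by_id { $lookups++; return "character $_[1]" }

package main;

for (1 .. 3) {                           # e.g. three requests
    Cacher::wrap_getters('Character');
    Character->new->get_character_by_id(7);
}
print "lookups: $lookups\n";             # lookups: 1
```

Without the %wrapped check, every request would wrap the already wrapped method again, which is the same symptom as the repeated warnings in the question.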
The problem is that BUILD runs every time you create an object (i.e. after every ->new() call), but add_before_method_modifier adds the modifier to the class, i.e. for all objects.
Simple solution
Keep in mind that use calls the import function of the used package every time. That is the place where you want to add the modifiers.
Parent:
package Parent;
use Moose;
sub import {
my ($class) = @_;
foreach my $method ($class->meta->get_method_list) {
if ($method =~ /^get_/) {
$class->meta->add_before_method_modifier($method, sub {
warn $method
});
}
}
}
1;
Child1:
package Child1;
use Moose;
extends 'Parent';
sub get_a { 'a' }
1;
Child2:
package Child2;
use Moose;
extends 'Parent';
sub get_b { 'b' }
1;
So now it works as expected:
$ perl -e 'use Child1; use Child2; Child1->new->get_a; Child2->new->get_b; Child1->new->get_a;'
get_a at Parent.pm line 11.
get_b at Parent.pm line 11.
get_a at Parent.pm line 11.
Cleaner solution
Since you can't be 100% sure import will be called (since you can't be sure use will be used), the cleaner and more straightforward solution is to add something like use My::Getter::Cacher to every derived class.
package My::Getter::Cacher;
sub import {
my $class = [caller]->[0];
# ...
}
In this case every derived class should contain both extends 'Parent' and use My::Getter::Cacher, since the first line is about inheritance while the second is about adding the before modifier. You may consider it a bit redundant, but as I said, I believe it's cleaner and more straightforward.
P.S.
You may also want to take a look at the Memoize module.
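For reference, a minimal Memoize sketch on a plain function. The function body is a stand-in for the real database lookup, and $queries is only there to count how often it actually runs. When memoizing methods, keep in mind that the invocant is part of the argument list and therefore, by default, part of the cache key.

```perl
use strict;
use warnings;
use Memoize;

my $queries = 0;

# Hypothetical stand-in for a database lookup.
sub get_character_by_id {
    my ($id) = @_;
    $queries++;
    return "character $id";
}

memoize('get_character_by_id');   # replace the sub with a caching wrapper

get_character_by_id(42) for 1 .. 3;
get_character_by_id(7);
print "queries: $queries\n";      # queries: 2
```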

Nested subroutines and Scoping in Perl

I have been writing Perl for quite some time now and am always discovering new things. I just ran into something interesting that I can't explain and couldn't find an explanation for on the web.
sub a {
sub b {
print "In B\n";
}
}
b();
How come I can call b() from outside its scope and it works?
I know it's bad practice, and I don't do it; I use closures and such for these cases. I just happened to see it.
Subroutines are stored in a global namespace at compile time. In your example b(); is shorthand for main::b();. To limit the visibility of a function to a scope, you need to assign an anonymous subroutine to a variable.
Both named and anonymous subroutines can form closures, but since named subroutines are only compiled once, if you nest them they don't behave as many people expect.
use warnings;
sub one {
my $var = shift;
sub two {
print "var: $var\n";
}
}
one("test");
two();
one("fail");
two();
__END__
output:
Variable "$var" will not stay shared at -e line 5.
var: test
var: test
Nesting named subroutines is allowed in Perl but it's almost certainly a sign that the code is doing something incorrectly.
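For contrast, the anonymous-sub version of the one/two example might look like this sketch (make_printer is an invented name). Each call creates a fresh closure over its own $var, so the "will not stay shared" problem does not arise.

```perl
use strict;
use warnings;

sub make_printer {
    my ($var) = @_;
    # A new anonymous sub is created on every call and captures this $var.
    return sub { return "var: $var" };
}

my $first  = make_printer('test');
my $second = make_printer('fail');

print $first->(),  "\n";   # var: test
print $second->(), "\n";   # var: fail
```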
The "official" way to create nested subroutines in perl is to use the local keyword. For example:
sub a {
local *b = sub {
return 123;
};
return b(); # Works as expected
}
b(); # Error: "Undefined subroutine &main::b called at ..."
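One caveat worth keeping in mind: local is dynamically scoped, not lexically scoped. The sketch below (helper is a made-up name) shows that a sub defined outside a() can still call b() for as long as a() is running, and that b() is gone again afterwards.

```perl
use strict;
use warnings;

# Not nested inside a(), yet it can call b() while a() is on the call stack.
sub helper { return b() * 2 }

sub a {
    local *b = sub { return 123 };   # installed until a() returns
    return helper();
}

my $value = a();
print "$value\n";                    # 246

# Outside a(), the temporary definition has been removed again.
my $ok = eval { b(); 1 };
print($ok ? "b() is callable\n" : "b() failed: $@");
```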
The perldoc page perlref has this example:
sub outer {
my $x = $_[0] + 35;
local *inner = sub { return $x * 19 };
return $x + inner();
}
"This has the interesting effect of creating a function local to another function, something not normally supported in Perl."
The following prints 123.
sub a {
$b = 123;
}
a();
print $b, "\n";
So why are you surprised that the following does too?
sub a {
sub b { return 123; }
}
a();
print b(), "\n";
Nowhere is any request for $b or &b to be lexical. In fact, you can't ask for &b to be lexical (yet).
sub b { ... }
is basically
BEGIN { *b = sub { ... }; }
where *b is the symbol table entry for $b, @b, ..., and of course &b. That means subs belong to packages, and thus can be called from anywhere within the package, or anywhere at all if their fully qualified name is used (MyPackage::b()).
Subroutines are defined during compile time, and are not affected by scope. In other words, they cannot truly be nested. At least not as far as their own scope is concerned. After being defined, they are effectively removed from the source code.
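On newer perls there is now a way to ask for a lexical sub: my sub was added as an experimental feature in Perl 5.18 and has been stable since 5.26. A short sketch, assuming Perl 5.26 or later; helper is just an illustrative name.

```perl
use v5.26;      # lexical subs are stable from Perl 5.26
use warnings;

sub outer {
    my sub helper { return 123 }   # visible only inside outer's block
    return helper();
}

say outer();                                             # 123
say main->can('helper') ? 'package sub' : 'not in the symbol table';
# not in the symbol table
```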

Why does this Perl produce "Not a CODE reference?"

I need to remove a method from the Perl symbol table at runtime. I attempted to do this using undef &Square::area, which does delete the function but leaves some traces behind. Specifically, when $square->area() is called, Perl complains that it is "Not a CODE reference" instead of "Undefined subroutine &Square::area called" which is what I expect.
You might ask, "Why does it matter? You deleted the function, why would you call it?" The answer is that I'm not calling it, Perl is. Square inherits from Rectangle, and I want the inheritance chain to pass $square->area through to &Rectangle::area, but instead of skipping Square where the method doesn't exist and then falling through to Rectangle's area(), the method call dies with "Not a CODE reference."
Oddly, this appears to only happen when &Square::area was defined by typeglob assignment (e.g. *area = sub {...}). If the function is defined using the standard sub area {} approach, the code works as expected.
Also interesting, undefining the whole glob works as expected. Just not undefining the subroutine itself.
Here's a short example that illustrates the symptom, and contrasts with correct behavior:
#!/usr/bin/env perl
use strict;
use warnings;
# This generates "Not a CODE reference". Why?
sub howdy; *howdy = sub { "Howdy!\n" };
undef &howdy;
eval { howdy };
print $@;
# Undefined subroutine &main::hi called (as expected)
sub hi { "Hi!\n" }
undef &hi;
eval { hi };
print $@;
# Undefined subroutine &main::hello called (as expected)
sub hello; *hello = sub { "Hello!\n" };
undef *hello;
eval { hello };
print $@;
Update: I have since solved this problem using Package::Stash (thanks @Ether), but I'm still confused by why it's happening in the first place. perldoc perlmod says:
package main;
sub Some_package::foo { ... } # &foo defined in Some_package
This is just a shorthand for a typeglob assignment at compile time:
BEGIN { *Some_package::foo = sub { ... } }
But it appears that it isn't just shorthand, because the two cause different behavior after undefining the function. I'd appreciate if someone could tell me whether this is a case of (1) incorrect docs, (2) bug in perl, or (3) PEBCAK.
Manipulating symbol table references yourself is bound to get you into trouble, as there are lots of little fiddly things that are hard to get right. Fortunately there is a module that does all the heavy lifting for you, Package::Stash -- so just call its methods add_package_symbol and remove_package_symbol as needed.
Another good method installer that you may want to check out is Sub::Install -- especially nice if you want to generate lots of similar functions.
As to why your approach is not correct, let's take a look at the symbol table after deleting the code reference:
sub foo { "foo!\n"}
sub howdy; *howdy = sub { "Howdy!\n" };
undef &howdy;
eval { howdy };
print $@;
use Data::Dumper;
no strict 'refs';
print Dumper(\%{"main::"});
prints (abridged):
$VAR1 = {
'howdy' => *::howdy,
'foo' => *::foo,
};
As you can see, the 'howdy' slot is still present: undefining &howdy doesn't do enough. You need to explicitly remove the glob slot, *howdy.
The reason it happens is precisely because you assigned a typeglob.
When you delete the CODE symbol, the rest of the typeglob still lingers, so when you try to execute howdy it will point to the non-CODE piece of the typeglob.
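If pulling in Package::Stash is not an option, removing the whole symbol table entry by hand is one possibility. This is only a sketch: it throws away every slot named area in Square (scalar, array, hash and so on, not just the code), which is exactly the bookkeeping that Package::Stash does more carefully. The Rectangle and Square bodies are stand-ins for the classes in the question.

```perl
use strict;
use warnings;
no warnings 'once';      # *area is mentioned only once below

package Rectangle;
sub new  { return bless {}, shift }
sub area { return 'Rectangle::area' }

package Square;
our @ISA = ('Rectangle');
*area = sub { return 'Square::area' };   # installed by typeglob assignment

package main;

my $square = Square->new;
my $before = $square->area;

# Remove the whole symbol table entry (all slots named "area" in Square),
# not just the code it holds.
delete $Square::{area};

my $after = $square->area;
print "before: $before\n";
print "after:  $after\n";
```

With the entry gone, method resolution should fall through to Rectangle's area instead of stopping at the leftover glob.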