Perl, shift if(@_) - perl

Can someone tell me what
shift if(@_)
means in perl?
and so what would
sub id {
my $self = shift;
$self->{ID} = shift if @_;
return $self->{ID};
}
mean? Thanks.

That is a handrolled accessor/mutator for an object
print $obj->id; # Accessor form
$obj->id(NEW_VAL); # Mutator form
It is functionally equivalent to:
sub id {
my $self = shift;
if (@_) { # If called with additional parameters, set the value:
$self->{ID} = shift(@_);
}
return $self->{ID};
}

In a sub, shift without an argument operates on the @_ array.
So
$self->{ID} = shift if @_;
is equivalent to
$self->{ID} = shift(@_) if @_;
(remove the leftmost element from the @_ array and assign it to $self->{ID})

shift takes the first element off an array and returns it. If no array is given inside a subroutine, it operates on @_, the array containing the function arguments. The if @_ statement modifier causes the preceding statement to be executed only if @_ has at least one element.
In your example, $self->{ID} = shift @_ if @_; means "If there is a function argument, assign it to $self->{ID}."

shift if(@_) says "if @_ has any elements (i.e., @_ evaluated in scalar context is greater than zero), shift off the first element of the default argument (inside a subroutine, this is @_)".
The sub is a standard pre-Moose setter/getter method. Commented to explain:
sub id {
my $self = shift; # @_ is the implied argument of shift.
# Since method calls prepend the object reference to @_,
# this grabs the object itself, assumed by the later code
# to be a hash reference.
$self->{ID} = shift if @_;
# If there's still anything left in @_ (`if @_`), get
# the first item and stash it under the ID key in
# the hash referenced by $self (note that if there is more
# than one item, we'll only stash the first one).
return $self->{ID}; # Return whatever the value of the item stored under ID in the
# hash referenced by $self is. This will be the value just
# assigned if the method was called with a scalar argument,
# or whatever value was there before if no argument was passed.
# This will be undef if nothing was ever stored under the ID key.
}
Invocation would be
$obj->id();
to fetch, and
$obj->id($value);
to set.
A standard pre-Moose constructor would create an anonymous hash reference, bless it to turn it into an object (connecting a package implementing the class's behavior to that reference), and return it:
sub new {
my($class) = @_; # List assignment: copies the first item of @_ into $class.
# A call to a class method prepends the package (class) name
# to @_, so this gets us the name of the class this
# object is to belong to.
my $self = {}; # Gets a new anonymous hash reference into $self.
bless $self, $class; # Connects the hash reference to the package, so that method calls
# made on this object are directed to this class (and its @ISA
# ancestors) to find the sub to be called to implement the method.
return $self; # Hands the object back to the caller.
}
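To see the accessor and the constructor working together, here is a minimal runnable sketch. The class name Widget is made up for illustration; the two subs follow the code discussed above.

```perl
use strict;
use warnings;

package Widget;    # hypothetical class name, chosen only for this sketch

sub new {
    my ($class) = @_;
    return bless {}, $class;
}

sub id {
    my $self = shift;             # the invocant
    $self->{ID} = shift if @_;    # setter branch: runs only if an argument remains
    return $self->{ID};
}

package main;

my $obj    = Widget->new;
my $before = $obj->id;            # nothing stored yet, so this is undef
$obj->id(42);                     # mutator form
my $after  = $obj->id;            # accessor form

printf "before=%s after=%s\n", defined $before ? $before : 'undef', $after;
```

Calling id with no argument never touches the stored value, which is why the same sub can serve as both getter and setter.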

There is no shift if @_ in your example -- if applies to the entire statement, not just the shift expression (because if has very low precedence in the expression hierarchy).
$self->{ID} = shift if @_;
is short for:
if (@_) {
$self->{ID} = shift;
}
shift with no argument pulls the first element off the @_ list, so this sets $self->{ID} to the first argument of the subroutine if there are any arguments.
The general rule regarding postfix use of if is that:
<expression1> if <expression2>;
is short for:
if (<expression2>) {
<expression1>;
}
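A small sketch of that rule in action. The sub name record and the @log array are invented for this example; the point is only that the whole assignment is skipped when the condition is false.

```perl
use strict;
use warnings;

my @log;

sub record {
    my @args = @_;
    my $first;
    # The modifier guards the entire assignment, not just the shift.
    $first = shift @args if @args;
    push @log, defined $first ? "got $first" : 'got nothing';
    return $first;
}

record();            # no arguments: the assignment never runs
record('a', 'b');    # the first argument is assigned

print "$_\n" for @log;
```

Note that $first is declared on its own line; combining my with a statement modifier is best avoided.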

Related

Getting Variable "@xml_files" will not stay shared at ... line

I have the following Perl code:
sub merge_xml {
foreach my $repository ('repo1', 'repo2') {
my @xml_files;
sub match_xml {
my $filename = $File::Find::dir . '/' . $_;
if ($filename =~ m%/(main|test)\.xml$%) {
push(@xml_files, $filename);
}
}
find(\&match_xml, $repository);
print Dumper(\@xml_files);
}
}
And I am getting the warning:
Variable "@xml_files" will not stay shared at ./script/src/repair.pl line 87.
How does one fix this?
PS find as in File::Find
"Nested" named subs in fact aren't -- they're compiled as separate subroutines, and so having them written as "nested" can only be misleading.
Further, this creates a problem, since the "inner" subroutine supposedly closes over the variable @xml_files that it uses, which, being lexical, is created anew on each call. But the sub, built at compile time and not being a lexical closure, only keeps the reference to the variable from the first call, so it works right only on the first call to the outer sub (merge_xml here).
We do get the warning though. With use diagnostics; (or see it in perldiag)
Variable "$x" will not stay shared at -e line 1 (#1)
(W closure) An inner (nested) named subroutine is referencing a
lexical variable defined in an outer named subroutine.
When the inner subroutine is called, it will see the value of
the outer subroutine's variable as it was before and during the first
call to the outer subroutine; in this case, after the first call to the
outer subroutine is complete, the inner and outer subroutines will no
longer share a common value for the variable. In other words, the
variable will no longer be shared.
This problem can usually be solved by making the inner subroutine
anonymous, using the sub {} syntax. When inner anonymous subs that
reference variables in outer subroutines are created, they
are automatically rebound to the current values of such variables.
So pull out that "inner" sub (match_xml) and use it normally from the "outer" one (merge_xml). In general you'd pass a reference to the array (@xml_files) to it; since File::Find's find gives you no way to pass arguments to the callback, you can instead declare the array in a scope where both subs can see it.
Or, since the purpose of match_xml is to be find's "wanted" function, you can use an anonymous sub for that purpose, so there is no need for a separate named sub
find( sub { ... }, @dirs );
Or store that coderef in a variable, as shown in Ed Heal's answer
my $wanted_coderef = sub { ... };
find( $wanted_coderef, @dirs );
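A self-contained sketch of the anonymous-sub approach, so it can be run without an existing repository. The helper names make_repo and collect_xml and the file names are assumptions made for this example; it builds two throwaway directories with File::Temp and searches them with File::Find.

```perl
use strict;
use warnings;
use File::Find qw(find);
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical helper: creates a temporary directory holding empty files.
sub make_repo {
    my @names = @_;
    my $dir = tempdir( CLEANUP => 1 );
    for my $name (@names) {
        my $path = File::Spec->catfile( $dir, $name );
        open my $fh, '>', $path or die "Cannot create $path: $!";
        close $fh;
    }
    return $dir;
}

sub collect_xml {
    my @repositories = @_;
    my %found;
    for my $repository (@repositories) {
        my @xml_files;    # a fresh lexical on every iteration
        # The anonymous sub is created at run time, so it captures
        # this iteration's @xml_files rather than a stale one.
        my $wanted = sub {
            push @xml_files, $File::Find::name
                if /\A(?:main|test)\.xml\z/;    # $_ is the bare file name here
        };
        find( $wanted, $repository );
        $found{$repository} = [@xml_files];
    }
    return \%found;
}

my $repo1 = make_repo(qw(main.xml notes.txt));
my $repo2 = make_repo(qw(test.xml main.xml));
my $found = collect_xml( $repo1, $repo2 );

printf "%s: %d match(es)\n", $_, scalar @{ $found->{$_} } for $repo1, $repo2;
```

With a named inner sub, the second repository would not see its own array; here each iteration gets its own closure.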
With help from zdim I came up with:
sub merge_xml {
foreach my $repository ('repo1', 'repo2') {
my @xml_files;
my $match_xml = sub {
my $filename = $File::Find::dir . '/' . $_;
if ($filename =~ m%/(main|test)\.xml$%) {
push(@xml_files, $filename);
}
};
find($match_xml, $repository);
print Dumper(\@xml_files);
}
}
Might I suggest another alternative. By using a factory function, you can eliminate the need to hand write a find subroutine each time.
A factory is a function that generates another function (or subroutine in this case). You feed it some parameters and it creates a custom subroutine with those parameters bolted in. My example uses a closure but you could also build it with a string eval if the closure is costly for some reason.
sub match_factory {
my ($filespec, $array) = @_;
# Validate arguments
die qq{File spec must be a regex, not "$filespec"\n}
unless ref $filespec eq "Regexp";
die qq{Second argument must be an array reference, not "$array"\n}
unless ref $array eq "ARRAY";
# Generate an anonymous sub to perform the match check
# that creates a closure on $filespec and $array
my $matcher = sub {
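# Compare against the full path: by default $_ holds only the
# file name, and the pattern used below contains a "/"
$File::Find::name =~ $filespec and
push(@$array, $File::Find::name);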
};
return $matcher;
}
sub merge_xml {
my @repos = @_; # or qw/foo bar .../
foreach my $repository (@repos) {
my @matched_files;
find(
match_factory( qr/\/(main|test)\.xml$/, \@matched_files ),
$repository
);
print Dumper(\@matched_files);
}
}
HTH

Class::Method::Modifiers - How to append parameters in "before"?

See https://stackoverflow.com/a/45590793/2139766 for the original problem.
How to append parameter(s) to @_ in before provided by Class::Method::Modifiers?
before 'MIME::Lite::__opts' => sub {
# grep { s/^Hello$/SSL/ } @_; # OK - changes $_[1]
push(@_,'SSL'); # no effect
};
@_ is scoped to your sub. Changes to it can't be seen on the outside.
You noticed that changing the elements of @_ had an effect on scalars outside of the sub, but that's because the existing elements of @_ are aliased to scalars in the caller. But @_ itself is a local variable, so adding and removing elements has no effect on the caller.
You can use around to pass the original method a different @_.
around 'MIME::Lite::__opts' => sub {
my $orig = shift;
return $orig->(@_, 'SSL');
};
Note that changing $method->(@_); to &$method; in Class/Method/Modifiers.pm would allow you to do what you want (by having your sub use the caller's @_ instead of being given its own). However, that would cause a lack of symmetry with after, which has no way of manipulating the returned value.
grep { s/^Hello$/SSL/ } @_; # OK - changes $_[1]
No, that's not ok.
First of all,
grep { s/^Hello$/SSL/ } @_;
is a poor way to write
map { s/^Hello$/SSL/ } @_;
which is a poor way to write
s/^Hello$/SSL/ for @_;
That said, you shouldn't do any of those. They alter the arguments in the "real" caller, so you are risking nasty side-effects.
Even claiming it works is pushing it, seeing as the following will crash!
before method => sub {
s/^Hello$/SSL/ for @_;
};
$o->method("Hello"); # Dies: Can't modify constant item in substitution
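The difference between changing an element of @_ and changing @_ itself can be shown without any modules. This is a plain-sub sketch (the name tweak is invented here), not a Class::Method::Modifiers example.

```perl
use strict;
use warnings;

sub tweak {
    $_[0] = 'SSL';       # writes through the alias into the caller's variable
    push @_, 'extra';    # grows only this call's @_; the caller never sees it
    return scalar @_;
}

my @args  = ('Hello');
my $count = tweak(@args);

print "args=(@args) count_inside=$count\n";
```

The caller's array still has one element afterwards, although that element was overwritten through the alias.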

How to access hash from object

I have written a small class which just got some getter and setter methods. One of those Properties is a hash.
sub getMyData
{
my $objekt = shift;
return $objekt->{MYDATA};
}
sub setMyData
{
my $objekt = shift;
my %myData= shift;
$objekt->{MYDATA} = \%myData;
}
If I set the value like this in another script which accesses my class:
my %test;
$test{'apple'}='red';
$objekt = MYNAMESPACE::MYCLASS->new;
$objekt->setMyData(%test);
I thought I could access this value easily via:
my $data = $objekt->getMyData;
print $data{'apple'};
I just get undef value.
Output from Dumper:
Can someone tell me what's wrong here and how I can access getMyData and print the value 'red'?
shift removes and returns the first element of an array. Inside a subroutine, a bare shift operates on @_, which holds all the arguments passed to that subroutine.
What is really happening here is that setMyData is being passed this data:
setMyData($objekt, 'apple', 'red');
The first shift in setMyData removes $objekt from @_
The second shift in setMyData removes 'apple', but since you assign the result of this shift to a Hash it creates a Hash that looks like this: 'apple' => undef
You take a reference to this Hash and store it in the MYDATA key of $objekt
What you really want is to assign the remainder of @_ to your Hash:
sub setMyData {
my $objekt = shift;
my %myData = @_;
# my ($objekt, %myData) = @_; (alternative)
$objekt->{MYDATA} = \%myData;
}
Another option is to instead send a Hash reference to setMyData, which would work with shift:
sub setMyData {
my $objekt = shift;
my $myData_ref = shift;
$objekt->{MYDATA} = $myData_ref;
}
$objekt->setMyData(\%test);
You are missing the dereference arrow. Because you put a hashref (\%myData) in, you also get a reference out.
my $data = $objekt->getMyData;
print $data->{'apple'};
# ^
# here
You also need to change the assignment, because you are passing a list to the setter, not a reference. shift is for scalar (single) values, but %test gets turned into a list (many values).
sub setMyData
{
my $objekt = shift;
my %myData = @_;
$objekt->{MYDATA} = \%myData;
}
However, there are a few more issues with your code.
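Putting the two fixes together (list assignment in the setter, arrow dereference when reading), a minimal runnable sketch could look like this. The class name Basket and its trivial constructor are assumptions, since the original class is not shown.

```perl
use strict;
use warnings;

package Basket;    # hypothetical stand-in for MYNAMESPACE::MYCLASS

sub new { my ($class) = @_; return bless {}, $class }

sub setMyData {
    my ($objekt, %myData) = @_;    # everything after the invocant fills the hash
    $objekt->{MYDATA} = \%myData;
    return;
}

sub getMyData {
    my $objekt = shift;
    return $objekt->{MYDATA};      # a hash reference
}

package main;

my %test = ( apple => 'red' );
my $objekt = Basket->new;
$objekt->setMyData(%test);

my $data = $objekt->getMyData;
print $data->{apple}, "\n";        # note the arrow: $data is a reference
```

Running the original print under use strict would also have flagged the mistake, because $data{'apple'} refers to an undeclared hash %data.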

Perl object, toString output from within module

I'm doing a class assignment to learn about Object Oriented programming in Perl. I've got a real basic class that looks like this.
sub new{
my $class = shift;
my $self = {
'Sides' => 3,
'SL' => \@sidelengths};
bless $self, $class;
return $self;
}
I've got two modules to change the sides and length (can't figure out how to modify the sidelengths with an accessor though), but I have a requirement for my work that I have a method like this
"a method: toString() which returns all of the file attributes in a printable
string. If this is done correctly, the PERL
print $file->toString() . "\n";
should print a readable summary of the file."
I already think I want to use Data::Dumper to do this and that works within a script but it sounds like I need to use it within a module and call that to print a string of whats in the object. So far I have this
sub toString{
my $self = @_;
Dumper( $self );
}
Which just prints out "$VAR1 = 1"
What you want here is to shift an argument out of @_.
sub toString {
my $self = shift @_;
Dumper( $self );
}
When you have $var = @array, that evaluates the array in scalar context, which returns the number of elements in the array. So, your statement my $self = @_; set $self to the number of arguments passed to toString, which in this case was 1. (The $self argument.)
Alternately, you can capture the first element of @_ this way:
sub toString {
my ($self) = @_;
Dumper( $self );
}
What this does is evaluate @_ in list context since it uses list assignment. It assigns the first element of @_ to $self.
my $self = @_;
is a scalar assignment operator, so it evaluates @_ in scalar context, which yields the number of elements it contains. You want to use the list assignment operator.
sub toString {
my ($self) = @_;
return Dumper( $self );
}
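The two kinds of assignment can be compared side by side in a short sketch (the sub name contexts is made up for the example):

```perl
use strict;
use warnings;

sub contexts {
    my $count   = @_;    # scalar assignment: the number of elements
    my ($first) = @_;    # list assignment: the first element
    return ( $count, $first );
}

my ( $count, $first ) = contexts( 'triangle', 3 );
print "count=$count first=$first\n";
```

The only visible difference is the pair of parentheses around the variable on the left-hand side.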

Object-Oriented Perl constructor syntax and named parameters

I'm a little confused about what is going on in Perl constructors. I found these two examples in perldoc perlbot.
package Foo;
#In Perl, the constructor is just a subroutine called new.
sub new {
#I don't get what this line does at all, but I always see it. Do I need it?
my $type = shift;
#I'm turning the array of inputs into a hash, called parameters.
my %params = @_;
#I'm making a new hash called $self to store my instance variables?
my $self = {};
#I'm adding two values to the instance variables called "High" and "Low".
#But I'm not sure how $params{'High'} has any meaning, since it was an
#array, and I turned it into a hash.
$self->{'High'} = $params{'High'};
$self->{'Low'} = $params{'Low'};
#Even though I read the page on [bless][2], I still don't get what it does.
bless $self, $type;
}
And another example is:
package Bar;
sub new {
my $type = shift;
#I still don't see how I can just turn an array into a hash and expect things
#to work out for me.
my %params = @_;
my $self = [];
#Exactly where did params{'Left'} and params{'Right'} come from?
$self->[0] = $params{'Left'};
$self->[1] = $params{'Right'};
#and again with the bless.
bless $self, $type;
}
And here is the script that uses these objects:
package main;
$a = Foo->new( 'High' => 42, 'Low' => 11 );
print "High=$a->{'High'}\n";
print "Low=$a->{'Low'}\n";
$b = Bar->new( 'Left' => 78, 'Right' => 40 );
print "Left=$b->[0]\n";
print "Right=$b->[1]\n";
I've injected the questions/confusion that I've been having into the code as comments.
To answer the main thrust of your question, since a hash can be initialized as a list of key => value pairs, you can send such a list to a function and then assign @_ to a hash. This is the standard way of doing named parameters in Perl.
For example,
sub foo {
my %stuff = @_;
...
}
foo( beer => 'good', vodka => 'great' );
This will result in %stuff in subroutine foo having a hash with two keys, beer and vodka, and the corresponding values.
Now, in OO Perl, there are some additional wrinkles. Whenever you use the arrow (->) operator to call a method, whatever was on the left side of the arrow is stuck onto the beginning of the @_ array.
So if you say Foo->new( 1, 2, 3 );
Then inside your constructor, @_ will look like this: ( 'Foo', 1, 2, 3 ).
So we use shift, which without an argument operates on @_ implicitly, to get that first item out of @_, and assign it to $type. After that, @_ has just our name/value pairs left, and we can assign it directly to a hash for convenience.
We then use that $type value for bless. All bless does is take a reference (in your first example a hash ref) and say "this reference is associated with a particular package." Alakazzam, you have an object.
Remember that $type contains the string 'Foo', which is the name of our package. If you don't specify a second argument to bless, it will use the name of the current package, which will also work in this example but will not work for inherited constructors.
1. In Perl, the constructor is just a subroutine called new.
Yes, by convention new is a constructor. It may also perform initialization or not. new should return an object on success or throw an exception (die/croak) if an error has occurred that prevents object creation.
You can name your constructor anything you like, have as many constructors as you like, and even bless objects into any name space you desire (not that this is a good idea).
2. I don't get what my $type = shift; does at all, but I always see it. Do I need it?
shift with no arguments takes an argument off the head of @_ and returns it; here the result is assigned to $type. The -> operator passes the invocant (left hand side) as the first argument to the subroutine. So this line gets the class name from the argument list. And, yes, you do need it.
3. How does an array of inputs become the %params hash? my %params = @_;
Assignment into a hash is done in list context, with pairs of list items being grouped as key/value pairs. So %foo = (1, 2, 3, 4); creates a hash such that $foo{1} == 2 and $foo{3} == 4. This is typically done to create named parameters for a subroutine. If the sub is passed an odd number of arguments, a warning will be generated if warnings are enabled.
4. What does my $self = {}; do?
This line creates an anonymous hash reference and assigns it to the lexically scoped variable $self. The hash reference will store the data for the object. Typically, the keys in the hash have a one-to-one mapping to the object attributes. So if class Foo has attributes 'size' and 'color', if you inspect the contents of a Foo object, you will see something like $foo = { size => 'm', color => 'black' };.
5. Given $self->{'High'} = $params{'High'}; where does $params{'High'} come from?
This code relies on the arguments passed to new. If new was called like Foo->new( High => 46 ), then the hash created as per question 3 will have a value for the key High (46). In this case it is equivalent to saying $self->{High} = 46. But if the method is called like Foo->new() then no value will be available, and we have $self->{High} = undef.
6. What does bless do?
bless takes a reference and associates it with a particular package, so that you can use it to make method calls. With one argument, the reference is associated with the current package. With two arguments, the second argument specifies the package to associate the reference with. It is best to always use the two argument form, so that your constructors can be inherited by a sub class and still function properly.
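Why the two-argument form matters for inheritance can be shown with a small sketch. The class names Animal and Dog and both constructor names are invented for this illustration.

```perl
use strict;
use warnings;

package Animal;    # hypothetical base class

# Two-argument bless: the object lands in whatever class was invoked.
sub new_two_arg {
    my $class = shift;
    my $self  = {};
    return bless $self, $class;
}

# One-argument bless: the object always lands in the current package.
sub new_one_arg {
    my $self = {};
    return bless $self;
}

package Dog;       # hypothetical subclass that inherits both constructors
our @ISA = ('Animal');

package main;

my $good = Dog->new_two_arg;
my $bad  = Dog->new_one_arg;

print ref($good), " ", ref($bad), "\n";
```

The second object ends up in the base class even though the constructor was called through Dog, so any methods defined only in Dog would not be found on it.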
Finally, I'll rewrite your hash based object accessor as I would write it using classical OO Perl.
package Foo;
use strict;
use warnings;
use Carp qw(croak);
sub new {
my $class = shift;
croak "Illegal parameter list has odd number of values"
if @_ % 2;
my %params = @_;
my $self = {};
bless $self, $class;
# This could be abstracted out into a method call if you
# expect to need to override this check.
for my $required (qw{ name rank serial_number }) {
croak "Required parameter '$required' not passed to '$class' constructor"
unless exists $params{$required};
}
# initialize all attributes by passing arguments to accessor methods.
for my $attrib ( keys %params ) {
croak "Invalid parameter '$attrib' passed to '$class' constructor"
unless $self->can( $attrib );
$self->$attrib( $params{$attrib} );
}
return $self;
}
Your question is not about OO Perl. You are confused about data structures.
A hash can be initialized using a list or array:
my @x = ('High' => 42, 'Low' => 11);
my %h = @x;
use Data::Dumper;
print Dumper \%h;
$VAR1 = {
'Low' => 11,
'High' => 42
};
When you invoke a method on a blessed reference, the reference is prepended to the argument list the method receives:
#!/usr/bin/perl
package My::Mod;
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Indent = 0;
sub new { bless [] => shift }
sub frobnicate { Dumper(\@_) }
package main;
use strict;
use warnings;
my $x = My::Mod->new;
# invoke instance method
print $x->frobnicate('High' => 42, 'Low' => 11);
# invoke class method
print My::Mod->frobnicate('High' => 42, 'Low' => 11);
# call sub frobnicate in package My::Mod
print My::Mod::frobnicate('High' => 42, 'Low' => 11);
Output:
$VAR1 = [bless( [], 'My::Mod' ),'High',42,'Low',11];
$VAR1 = ['My::Mod','High',42,'Low',11];
$VAR1 = ['High',42,'Low',11];
Some points that haven't been dealt with yet:
In Perl, the constructor is just a
subroutine called new.
Not quite. Calling the constructor new is just a convention. You can call it anything you like. There is nothing special about that name from perl's point of view.
bless $self, $type;
Both of your examples don't return the result of bless explicitly. I hope that you know that they do so implicitly anyway.
If you assign an array to a hash, perl treats alternating elements in the array as keys and values. Your array looks like
my @array = (key1, val1, key2, val2, key3, val3, ...);
When you assign that to %hash, you get
my %hash = @array;
# %hash = ( key1 => val1, key2 => val2, key3 => val3, ...);
Which is another way of saying that in perl list/hash construction syntax, "," and "=>" mean nearly the same thing (the only difference being that => also quotes a bareword on its left).
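A quick check of that equivalence, as a sketch with made-up variable names: the same hash is built once with plain commas and once with =>, and the contents are compared.

```perl
use strict;
use warnings;

my %with_commas = ( 'High', 42, 'Low', 11 );
my %with_arrows = ( High => 42, Low => 11 );    # => quotes the bareword on its left

# Serialize both hashes in sorted key order so they can be compared as strings.
my $commas = join ',', map { "$_=$with_commas{$_}" } sort keys %with_commas;
my $arrows = join ',', map { "$_=$with_arrows{$_}" } sort keys %with_arrows;

print $commas eq $arrows ? "identical\n" : "different\n";
```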
In Perl, all arguments to subroutines are passed via the predefined array @_.
The shift removes and returns the first item from the @_ array. In Perl OO, this is the method invocant -- typically a class name for constructors and an object for other methods.
Hashes flatten to and can be initialized by lists. It's a common trick to emulate named arguments to subroutines. e.g.
Employee->new(name => 'Fred Flintstone', occupation => 'quarry worker');
Ignoring the class name (which is shifted off) the odd elements become hash keys and the even elements become the corresponding values.
The my $self = {} creates a new hash reference to hold the instance data. The bless function is what turns the normal hash reference $self into an object. All it does is add some metadata that identifies the reference as belonging to the class.
Yes, I know that I'm being a bit of a necromancer here, but...
While all of these answers are excellent, I thought I'd mention Moose. Moose makes constructors easy (package Foo;use Moose; automatically provides a constructor called new (although the name "new" can be overridden if you'd like)) but doesn't take away any configurability if you need it.
Once I looked through the documentation for Moose (which is pretty good overall, and there are a lot more tutorial snippets around if you google appropriately), I never looked back.