What does "CODE" means for variable? - perl

In perl I wanted to debug some module code, so I temporarily added the following line to such source code:
print $${${$${$$h[1]{$j}}{proxy_cache}}{$e}}{'fetch_handler'}{'ownerDocument'}
...and it prints:
CODE(0x9b2b3e0)
What "CODE" means? I expected HASH(0x???????). I am pretty new in Perl, so please explain me this, as goooooogling for +Perl +CODE is not helpful :)
I was looking for url of ownerDocument information, btw.
[UPDATE]
I am trying to use module WWW::Scripter for my needs and I already found several bugs that author of this module (Father Chrysostomos) already fixed based on my inputs.
Now I'm "debugging" some issues with images that are created dynamically in JavaScript (for example ((new Image()).src='http://...'), as those images are now not included in $w->images results.
If you take a look at sub update_html in module source [http://cpansearch.perl.org/src/SPROUT/WWW-Scripter-0.026/lib/WWW/Scripter.pm], there is a line that starts with
$h && $h->eval($self, $code ...
This is a section that I need to debug. I am trying to "search" for new images in DOM after script is evaluated. I was able to find image elements pretty easy, but now I am trying to find information to which document they belong to, as I need to get them with correct referer information. Some images are created within frames, iframes, scripts, etc. If incorrect referer information is used, then it may lead to incorrect response, as most of such (new Image()).src='http://...' images are used for tracking with cookies, not for real image content. To get correct content of document, all these special images need to be properly processed and without correct referer it does not work...

It's a code reference, e.g.:
my $var = sub { ... };
print "$var\n";

This is a bit too long for a comment, but it's not a direct answer to your question.
I wanted to figure out your data structure, which I fully realize you might not control. I'm curious why you have to deal with that, and if you have any hair, or sanity, left.
The multiple references are a bit painful, but it also reminds me of stupid things I used to do with references and that I even presented at the first Perl Conference.
When I first started using references, I thought, stupidly, that every time that I wanted to pass a reference I had to take a reference, even if the thing was already a reference. I'd end up with something ugly like $$$$ref:
my $string = 'Buster';
some_sub( \$string );
sub some_sub {
my $ref = shift;
some_other_sub( \$ref );
}
sub some_other_sub {
my $ref = shift;
yet_another_sub( \$ref );
}
sub yet_another_sub {
my $ref = shift;
print "$$$$ref\n"; #fuuuuugly!
}
This gets even worse when you start taking to references to aggregates, which is what I think is happening in your data structure. Since a reference to a reference is just a scalar, as is the original reference, you can't dereference it by lining up subscripts. Hence, all of the $${ } in your line.
I couldn't see what was happening until I started from the inside out, and even then I just used trial and error until I got things that worked.
The first level is an array reference that contains a hash reference at index 1. That's not so hard or ugly:
my $j = 'foo';
my $e = 'baz';
my $h = [];
$h->[1] = { foo => 'bar' }; # to get to $$h[1]{$j}
print "1: $h->[1]{$j}\n";
The next level is a bit weird. To get $${ ... }{proxy_cache}, you need a reference to a hash reference:
$h->[1] = {
foo => \ { proxy_cache => 'duck' } # ref to hash reference
};
print "2. $${ $$h[1]{$j} }{proxy_cache}\n";
I'm not sure how you are building this data structure, but you should look for places where you already have a hash reference and not take another ref. That's the stupid thing I was doing in my youth. It might look like this:
sub some_sub {
my $hash = shift;
$h->[1] = {
foo => \ $hash # don't do that!
};
The next part isn't that bad. It's just a regular hash reference as the value (instead of duck):
$h->[1] = {
foo => \ { proxy_cache => { $e => 'quux' } }
};
print "3. ${ $${ $$h[1]{$j} }{proxy_cache} }{$e}\n";
The next level is another reference to a hash reference:
$h->[1] = {
foo => \ {
proxy_cache => {
$e => \ { fetch_handler => 'zap' }
}
}
};
print "4. $${ ${ $${ $$h[1]{$j} }{proxy_cache} }{$e} }{'fetch_handler'}\n";
Finally, I get to the last key, ownerDocument, and assign a subroutine reference:
$h->[1] = {
foo => \ {
proxy_cache => {
$e => \ { fetch_handler => {
ownerDocument => sub { print "Buster\n" },
}
}
}
}
};
print "5. $${ ${ $${ $$h[1]{$j} }{proxy_cache} }{$e} }{'fetch_handler'}{'ownerDocument'}\n";
The output is the CODE(0x.......) that you've already seen.
I wanted to simplify that, but there's not much to remove because of those pesky non-aggregate references. This removes only three non-whitespace of characters to line up the {$e} key:
print "6. ";
print $${ $${ $h->[1]{$j} }{proxy_cache}{$e} }{'fetch_handler'}{'ownerDocument'};
print "\n";

Related

How can I access an scalar iside a hash atribute of a class in perl?

I am still learning some object oriented perl and I am struggling a bit on how to access my classes.
I've successfully created a class which one of its attributes is a hash. When I want print directly the value of a given key in this hash, the output is always blank. If I assign a new variable to the value of the hash, then I can print this variable.
Toy example code:
sub new {
my($clas) = #_;
my($self) = {};
bless($self,$clas);
$self->{age_record} = {};
return($self);
}
Imaginary code that fills up my hash
my $class->new("class");
fill_hash($class);
Let's use Data::Dumper to see what's in the hash.
print Dumper $class->{age_record};
$VAR1 = {
'Rigobert' => 17,
'Helene' => 42
};
I get nothing if I print directly.
print $class->{age_record}{'Rigobert'};
But if I asign it first to a new variable, it works.
my $age = $class->{age_record}{'Rigobert'};
print "Age is : $age\n";
I get
Age is : 17
What am I doing wrong when referencing the hash attribute?
As far as I can see there is nothing wrong, and the only reason I can think of for your output not appearing is that it is buffered. You should try adding
STDOUT->autoflush;
near the top of your program.
However, you shouldn't be accessing internal data structures from the calling code. fill_hash should be a better_named method and you need to write an accessor method to get to the age_record element. Something like this
sub pupil_age {
my $self = shift;
my ($name) = #_;
$self->{age_record}{$name};
}
and then you can call it as
printf "Age is : %d\n", $class->pupil_age('Rigobert');
(The printf is just a style choice — there's no other need to use it above a simple print)
I'm puzzled by the line
my $class->new("class");
It should be something like:
my $class = Class->new();
(Assuming that your class's name is "Class" - it's pretty confusing having a class called "Class". I'm assuming that you have class modelling an academic class!)
Then you'd have a method to add a pupil.
sub add_pupil {
my ($self, $name, $age) = #_;
$self->{age_record}{$name} = $age;
}
And another method to get a pupil.
sub get_pupil {
my ($self, $name) = #_;
return $self->{age_record}{$name};
}
You'd then use it like this.
my $class = Class->new();
$class->add_pupil('Rigobert', 17);
$class->add_pupil('Helene', 42);
print $class->get_pupil('Rigobert'); # prints 17
I think you're confused because you haven't really thought about how you are modelling your data inside the class. And I can't be much more help as I don't know what the other attributes are or what your fill_hash() subroutine looks like.

Accessing Hash of hash of Array in Perl

I have the following data I wish to access:
$data1=
{'Family' => {
'House' => [
{
'Id' => '1111',
'Name' => 'DFG'
},
{
'Id' => '211',
'Name' => 'ABC'
}
]
}
}
I want to access the each Name field value. I am using this code:
foreach(keys%$data1) {
if(ref($data1->{$_}) eq 'HASH') {
foreach my $inner_key (keys%{$data1->{$_}}) {
print "Key:$inner_key and value:$data1->{$_}->{$inner_key}\n";
}
}
else {
print "Key: $_ and Value: $data1->{$_}\n"
}
}
It prints Key:House and value:ARRAY(OXXX).
I know I am doing something wrong. Since the data in 'House' is an array of hashes, I even tried accessing through $data1->{$_}->{$inner_key}[0]. What is wrong in the code???
You have to dereference array for foreach loop first, and then dereference hashref to reach "Name" values.
print "Key:$inner_key and value:$_->{Name}\n"
for #{$data1->{$_}->{$inner_key}};
You should read perlref first to learn how to create and use references.
Here is a demonstration:
#!/usr/bin/perl
use strict;
use warnings;
my $data1=
{'Family' => {
'House' => [
{
'Id' => '1111',
'Name' => 'DFG'
},
{
'Id' => '211',
'Name' => 'ABC'
}
]
}
};
while (my ($key1, $val1) = each %$data1) {
print "\$key1 = $key1\n";
while (my ($key2, $val2) = each %$val1) {
print "\t\$key2 = $key2\n";
foreach my $val3 (#$val2) {
while (my ($key4, $val4) = each %$val3) {
print "\t\t\$key4 = $key4 => $val4\n";
}
print "\n";
}
}
}
[Edit I typed too slowly while answering, so this response bascially duplicates #mpapec's below - I will leave the references here and you can vote me up for those ;-) but do not accept my response as the answer].
Try something like the following to see if it works:
for $inner_hash (#{ $data1->{Family}{House} }) {
say "Name: $inner_hash->{Name}"
}
since you need to get the inner hashes' values from inside the elements of the array (that is what value:ARRAY(OXXX) is telling you).
You can use perldoc to look at the perldata, perlref, perlreftut and perldsc PODs to learn more about data structures and dereferencing. If keeping your data structure in mind while you are writing code gets to be too hard to do, it may mean you need to simplify things: either the data itself or by writing sub's to make it easier to access, or making use some of the excellent utility modules from CPAN.
There's also some good perl data structure related tutorials out there. The POD/perldoc documentation that ships with perl (along with Chapter 9 of Programming Perl) is the canonical reference, but you might browse these nodes from perlmonks:
References quick reference
Referencing in advanced data structures
Visualizing perl data structures
Perlmonks Hash Tutorial
NB Above I'm using the perlcritic and Perl Best Practices style of dereferencing: e.g.: #{ $data1->{Family}{House} } so the syntax reminds me that the inner hashes (or inner-inner?) are inside an array. There's a cool new way of dereferencing being introduced in perl 5.20 called postfix dereferencing which will be great, but you can't go wrong following the recommendations of PBP.
"Until you start thinking in terms of hashes, you aren't really thinking in Perl." -- Larry Wall
Cheers,

Perl - Overloading package/class properties?

I am attempting to port some code from PHP that basically boils down to property overloading. That is, if you try to get or set a class property that is not actually defined as a part of the class, it will send that information to a function that will pretty much do anything I want with it. (In this case, I want to search an associative array within the class before giving up.)
However, Perl is... quite a bit different from PHP, given that classes are already hashes. Is there any way that I can apply some equivalent of __get() and __set() to a Perl "class" that will remain completely encapsulated in that package, transparent to anything trying to actually get or set properties?
EDIT: The best way to explain this may be to show you code, show the output, and then show what I want it to output.
package AccessTest;
my $test = new Sammich; #"improper" style, don't care, not part of the question.
say 'bacon is: ' . $test->{'bacon'};
say 'cheese is: ' . $test->{'cheese'};
for (keys $test->{'moreProperties'}) {
say "$_ => " . $test->{'moreProperties'}{$_};
}
say 'invalid is: ' . $test->{'invalid'};
say 'Setting invalid.';
$test->{'invalid'} = 'true';
say 'invalid is now: ' . $test->{'invalid'};
for (keys $test->{'moreProperties'}) {
say "$_ => " . $test->{'moreProperties'}{$_};
}
package Sammich;
sub new
{
my $className = shift;
my $this = {
'bacon' => 'yes',
'moreProperties' => {
'cheese' => 'maybe',
'ham' => 'no'
}
};
return bless($this, $className);
}
This currently outputs:
bacon is: yes
Use of uninitialized value in concatenation (.) or string at ./AccessTest.pl line 11.
cheese is:
cheese => maybe
ham => no
Use of uninitialized value in concatenation (.) or string at ./AccessTest.pl line 17.
invalid is:
Setting invalid.
invalid is now: true
cheese => maybe
ham => no
Now, I need to make modifications to Sammich only, without making any changes at all to the initial AccessTest package, that will result in this:
bacon is: yes
cheese is: maybe
cheese => maybe
ham => no
invalid is: 0
Setting invalid.
invalid is now: true
cheese => maybe
ham => no
invalid => true
As you can see, the desired effect is that the 'cheese' property, since it's not a part of the test object directly, would instead be grabbed from the 'moreProperties' hash. 'invalid' would attempt the same thing, but since it is neither a direct property nor in 'moreProperties', it would act in whatever way programmed - in this case, I would want it to simply return the value 0, without any errors or warnings. Upon attempting to set the 'invalid' property, it would not be added to the object directly, because it's not already there, but would instead be added to the 'moreProperties' hash.
I'm expecting this to take more than the six or so lines it would require in PHP, but as it is a very important concept of OOP, I fully expect Perl to handle it somehow.
As I have said in my comments, the reason this problem is hard is that you aren't following one of the golden rules of Object-Oriented Programming, namely encapsulation. How can you expect to intercept a call that isn't a method? If your exposed API consists of getter/setters then you can intercept an unknown method call with an AUTOLOAD method. If you don't you may use #pilcrow's noble suggestion of using a tied hash (Edit: or #tchrist's valiant use of the overload pragma); still this is more a tribute to Perl's flexibility than your API.
To do this more correctly (and yes I see you "require" that the API not be modified, if you choose to ignore this, then call this post a message to future readers).
#!/usr/bin/env perl
use v5.10; # say
use strict;
use warnings;
use MooseX::Declare;
use Method::Signatures::Modifiers;
class Sammich {
has 'bacon' => ( isa => 'Str', is => 'rw', default => 'yes' );
has 'more' => (
isa => 'HashRef',
is => 'rw',
default => sub{ {
cheese => 'maybe',
ham => 'no',
} },
);
our $AUTOLOAD;
method AUTOLOAD ($val?) {
# get unknown method name
my $attr = (split( /::/, $AUTOLOAD))[-1];
my $more = $self->more;
# see if that method name exists in "more"
if (exists $more->{$attr}) {
# if so, are there new values? then set
if (defined $val) {
$more->{$attr} = $val;
}
# ... and return
return $more->{$attr};
}
# attr does not exist, so set it or initialize it to 0
$more->{$attr} = defined $val ? $val : 0;
return $more->{$attr};
}
}
# I don't care that you don't care
my $test = Sammich->new();
say 'bacon is: ' . $test->bacon;
say 'cheese is: ' . $test->cheese;
for (keys %{ $test->more }) {
say "$_ => " . $test->more->{$_};
}
say 'invalid is: ' . $test->invalid;
say 'Setting invalid.';
$test->invalid( 'true' );
say 'invalid is now: ' . $test->invalid;
for (keys %{ $test->more }) {
say "$_ => " . $test->more->{$_};
}
Some may say that my wording here is harsh and perhaps it is. I try to help those who will be helped, therefore seeing a bolded message like
I need to make modifications to Sammich only, without making any changes at all to the initial AccessTest package
then demanding that Perl bow to your whim
I'm expecting this to take more than the six or so lines it would require in PHP, but as it is a very important concept of OOP, I fully expect Perl to handle it somehow.
is irksome. I hope that future readers will see this as a case example of why encapsulation helps in the long run.
Update to the update
(I am receiving anonymous downvotes, presumably for abetting your misguided approach. :) )
Just to be clear, the question you pose is an "XY Problem", specifically an artifact of the mistaken translation of OO technique from PHP to Perl. For example, as mentioned passim in this question, object properties in perl generally should not be implemented as directly accessed hash(ref) elements. That's the wrong "X".
Jumping from one language to another introduces more than merely syntactical differences.
Update
You can accomplish what you want with something like this:
package UnfortunateHack; {
use Tie::Hash;
our #ISA = qw(Tie::StdHash);
sub FETCH {
my ($self, $key) = #_;
return exists $self->{$key}
? $self->{$key}
: $self->{moreProperties}->{$key};
}
}
...
package Sammich;
sub new {
my $class = shift;
tie my %this, 'UnfortunateHack';
%this = ( bacon => 'yes',
moreProperties => { ... } );
bless \%this, $class;
}
Original Answer
If I understand your intent — to intercept $obj->property calls where TheClass::property isn't necessarily defined — then perl's AUTOLOAD in an OO context will do what you want.
From the linked docs (perlobj):
If you call a method that doesn't exist in a class, Perl will throw an error. However, if that class or any of its parent classes defines an AUTOLOAD method, that AUTOLOAD method is called instead.
You are completely violating encapsulation. To prove it to you, comment out the bless from &Sammich::new so that it returns a plain hash reference.
package Sammich;
sub new {
my $className = shift;
my $this = {
'bacon' => 'yes',
'moreProperties' => {
'cheese' => 'maybe',
'ham' => 'no'
}
};
# don't even bother blessing it
# return bless($this, $className);
}
The only way to get what you want it is to use magic.
In Perl classes are more than hashes, they are built on packages and you can define there whatever method you want and it remains encapsulated in that package/class.
You can see a code example in the Object-Oriented Programming in Perl Tutorial.

Using a variable as a method name in Perl

I have a perl script (simplified) like so:
my $dh = Stats::Datahandler->new(); ### homebrew module
my %url_map = (
'/(article|blog)/' => \$dh->articleDataHandler,
'/video/' => \$dh->nullDataHandler,
);
Essentially, I'm going to loop through %url_map, and if the current URL matches a key, I want to call the function pointed to by the value of that key:
foreach my $key (keys %url_map) {
if ($url =~ m{$key}) {
$url_map{$key}($url, $visits, $idsite);
$mapped = 1;
last;
}
}
But I'm getting the message:
Can't use string ("/article/") as a subroutine ref while "strict refs" in use at ./test.pl line 236.
Line 236 happens to be the line $url_map{$key}($url, $visits, $idsite);.
I've done similar things in the past, but I'm usually doing it without parameters to the function, and without using a module.
Since this is being answered here despite being a dup, I may as well post the right answer:
What you need to do is store a code reference as the values in your hash. To get a code reference to a method, you can use the UNIVERSAL::can method of all objects. However, this is not enough as the method needs to be passed an invocant. So it is clearest to skip ->can and just write it this way:
my %url_map = (
'/(article|blog)/' => sub {$dh->articleDataHandler(#_)},
'/video/' => sub {$dh->nullDataHandler(#_)},
);
This technique will store code references in the hash that when called with arguments, will in turn call the appropriate methods with those arguments.
This answer omits an important consideration, and that is making sure that caller works correctly in the methods. If you need this, please see the question I linked to above:
How to take code reference to constructor?
You're overthinking the problem. Figure out the string between the two forward slashes, then look up the method name (not reference) in a hash. You can use a scalar variable as a method name in Perl; the value becomes the method you actually call:
%url_map = (
'foo' => 'foo_method',
);
my( $type ) = $url =~ m|\A/(.*?)/|;
my $method = $url_map{$type} or die '...';
$dh->$method( #args );
Try to get rid of any loops where most of the iterations are useless to you. :)
my previous answer, which I don't like even though it's closer to the problem
You can get a reference to a method on a particular object with can (unless you've implemented it yourself to do otherwise):
my $dh = Stats::Datahandler->new(); ### homebrew module
my %url_map = (
'/(article|blog)/' => $dh->can( 'articleDataHandler' ),
'/video/' => $dh->can( 'nullDataHandler' ),
);
The way you have calls the method and takes a reference to the result. That's not what you want for deferred action.
Now, once you have that, you call it as a normal subroutine dereference, not a method call. It already knows its object:
BEGIN {
package Foo;
sub new { bless {}, $_[0] }
sub cat { print "cat is $_[0]!\n"; }
sub dog { print "dog is $_[0]!\n"; }
}
my $foo = Foo->new;
my %hash = (
'cat' => $foo->can( 'cat' ),
'dog' => $foo->can( 'dog' ),
);
my #tries = qw( cat dog catbird dogberg dogberry );
foreach my $try ( #tries ) {
print "Trying $try\n";
foreach my $key ( keys %hash ) {
print "\tTrying $key\n";
if ($try =~ m{$key}) {
$hash{$key}->($try);
last;
}
}
}
The best way to handle this is to wrap your method calls in an anonymous subroutine, which you can invoke later. You can also use the qr operator to store proper regexes to avoid the awkwardness of interpolating patterns into things. For example,
my #url_map = (
{ regex => qr{/(article|blog)/},
method => sub { $dh->articleDataHandler }
},
{ regex => qr{/video/},
method => sub { $dh->nullDataHandler }
}
);
Then run through it like this:
foreach my $map( #url_map ) {
if ( $url =~ $map->{regex} ) {
$map->{method}->();
$mapped = 1;
last;
}
}
This approach uses an array of hashes rather than a flat hash, so each regex can be associated with an anonymous sub ref that contains the code to execute. The ->() syntax dereferences the sub ref and invokes it. You can also pass parameters to the sub ref and they'll be visible in #_ within the sub's block. You can use this to invoke the method with parameters if you want.

How can I cleanly turn a nested Perl hash into a non-nested one?

Assume a nested hash structure %old_hash ..
my %old_hash;
$old_hash{"foo"}{"bar"}{"zonk"} = "hello";
.. which we want to "flatten" (sorry if that's the wrong terminology!) to a non-nested hash using the sub &flatten(...) so that ..
my %h = &flatten(\%old_hash);
die unless($h{"zonk"} eq "hello");
The following definition of &flatten(...) does the trick:
sub flatten {
my $hashref = shift;
my %hash;
my %i = %{$hashref};
foreach my $ii (keys(%i)) {
my %j = %{$i{$ii}};
foreach my $jj (keys(%j)) {
my %k = %{$j{$jj}};
foreach my $kk (keys(%k)) {
my $value = $k{$kk};
$hash{$kk} = $value;
}
}
}
return %hash;
}
While the code given works it is not very readable or clean.
My question is two-fold:
In what ways does the given code not correspond to modern Perl best practices? Be harsh! :-)
How would you clean it up?
Your method is not best practices because it doesn't scale. What if the nested hash is six, ten levels deep? The repetition should tell you that a recursive routine is probably what you need.
sub flatten {
my ($in, $out) = #_;
for my $key (keys %$in) {
my $value = $in->{$key};
if ( defined $value && ref $value eq 'HASH' ) {
flatten($value, $out);
}
else {
$out->{$key} = $value;
}
}
}
Alternatively, good modern Perl style is to use CPAN wherever possible. Data::Traverse would do what you need:
use Data::Traverse;
sub flatten {
my %hash = #_;
my %flattened;
traverse { $flattened{$a} = $b } \%hash;
return %flattened;
}
As a final note, it is usually more efficient to pass hashes by reference to avoid them being expanded out into lists and then turned into hashes again.
First, I would use perl -c to make sure it compiles cleanly, which it does not. So, I'd add a trailing } to make it compile.
Then, I'd run it through perltidy to improve the code layout (indentation, etc.).
Then, I'd run perlcritic (in "harsh" mode) to automatically tell me what it thinks are bad practices. It complains that:
Subroutine does not end with "return"
Update: the OP essentially changed every line of code after I posted my Answer above, but I believe it still applies. It's not easy shooting at a moving target :)
There are a few problems with your approach that you need to figure out. First off, what happens in the event that there are two leaf nodes with the same key? Does the second clobber the first, is the second ignored, should the output contain a list of them? Here is one approach. First we construct a flat list of key value pairs using a recursive function to deal with other hash depths:
my %data = (
foo => {bar => {baz => 'hello'}},
fizz => {buzz => {bing => 'world'}},
fad => {bad => {baz => 'clobber'}},
);
sub flatten {
my $hash = shift;
map {
my $value = $$hash{$_};
ref $value eq 'HASH'
? flatten($value)
: ($_ => $value)
} keys %$hash
}
print join( ", " => flatten \%data), "\n";
# baz, clobber, bing, world, baz, hello
my %flat = flatten \%data;
print join( ", " => %flat ), "\n";
# baz, hello, bing, world # lost (baz => clobber)
A fix could be something like this, which will create a hash of array refs containing all the values:
sub merge {
my %out;
while (#_) {
my ($key, $value) = splice #_, 0, 2;
push #{ $out{$key} }, $value
}
%out
}
my %better_flat = merge flatten \%data;
In production code, it would be faster to pass references between the functions, but I have omitted that here for clarity.
Is it your intent to end up with a copy of the original hash or just a reordered result?
Your code starts with one hash (the original hash that is used by reference) and makes two copies %i and %hash.
The statement my %i=%{hashref} is not necessary. You are copying the entire hash to a new hash. In either case (whether you want a copy of not) you can use references to the original hash.
You are also losing data if your hash in the hash has the same value as the parent hash. Is this intended?