Why do I see HASH(0xABCDEF) in my Perl output? - perl

I am running perl, v5.6.1 built for sun4-solaris-64int
I am calling print on an array:
print "#vals\n";
and the output looks like:
HASH(0x229a4) uid cn attuid
or another example:
#foo = {};
push(#foo, "c");
print "#foo I am done now\n";
with output of:
HASH(0x2ece0) c I am done now
Where is HASH(0x2ece0) coming from?

Your braces in #foo = {} are creating it. The braces create an unnamed hash reference.
If you want to set #foo to an empty list, use #foo = ()

The key to understanding this sort of problem is that you get an extra item in the output. It's not too important what that item is.
In general, the first thing you'd want to do when your container variable has more (or less) in it than you expect is to look at it's contents. The Data::Dumper module comes with Perl and can pretty print data structures for you:
use Data::Dumper;
print Dumper( \#foo );
Once you see what is in your container, you can start to work backward to find out how it got in there. You'd eventually end up noticing that right after you initialized #foo that it already had one element, which isn't what you wanted.
Another trick is to check the number of elements in the list:
print "There are " . #array . " elements in \#array\n";
If you get a number you don't expect, work backward to find out when the extra element showed up.

You accidentally have a hash reference in #foo. When you print a reference out without dereferencing it (almost always by accident), you get a debugging string (the type of reference it is and a memory location).
I think you want my #foo = (); push #foo, "c"; rather than what you have now. On the other hand, you can also simply say my #foo; to create the array. You don't need to explicitly mark it as empty with ().
See perldoc perlreftut and perldoc perlref for more on references.

Your code should be written like this:
use strict;
use warnings;
my #foo;
push #foo, "c";
print "#foo I am done now\n";
You don't need to initialize variables in Perl, if you want to have an empty variable. You should however use my to declare a local variable. And you don't need parentheses around built-in functions, that just adds clutter.

Related

Change ref of hash in Perl

I ran into this and couldn't find the answer. I am trying to see if it is possible to "change" the reference of a hash. In other words, I have a hash, and a function that returns a hashref, and I want to make my hash point to the location in memory specified by this ref, instead of copying the contents of the hash it points to. The code looks something like this:
%hash = $h->hashref;
My obvious guess was that it should look like this:
\%hash = $h->hashref;
but that gives the error:
Can't modify reference constructor in scalar assignment
I tried a few other things, but nothing worked. Is what I am attempting actually possible?
An experimental feature which would seemingly allow you to do exactly what you're describing has been added to Perl 5.21.5, which is a development release (see "Aliasing via reference").
It sounds like you want:
use Data::Alias;
alias %hash = $h->hashref;
Or if %hash is a package variable, you can instead just do:
*hash = $h->hashref;
But either way, this should almost always be avoided; simply use the hash reference.
This question is really old, but Perl now allows this sort of thing as an experimental feature:
use v5.22;
use experimental qw(refaliasing);
my $first = {
foo => 'bar',
baz => 'quux',
};
\my %hash = $first;
Create named variable aliases with ref aliasing
Mix assignment and reference aliasing with declared_refs
Yes, but…
References in Perl are scalars. You are trying to alias the return value. This actually is possible, but you should not do this, since it involves messing with the symbol table. Furthermore, this only works for globals (declared with our): If you assign a hashref to the glob *hash it will assign to the symbol table entry %hash:
#!/usr/bin/env perl
use warnings;
use strict;
sub a_hashref{{a => "one", b => "two"}}
our %hash;
*hash = a_hashref;
printf "%3s -> %s\n", $_, $hash{$_} foreach keys %hash;
This is bad style! It isn't in PBP (directly, but consider section 5.1: “non-lexicals should be avoided”) and won't be reported by perlcritic, but you shouldn't pollute the package namespace for a little syntactic fanciness. Furthermore it doesn't work with lexical variables (which is what you might want to use most of the time, because they are lexically scoped, not package wide).
Another problem is, that if the $h->hashref method changes its return type, you'll suddenly assign to another table entry! (So if $h->hashref changes its return type to an arrayref, you assign to #hash, good luck detecting that). You could circumvent that by checking if $h->hashref really returns a hashref with 'HASH' eq ref $h->hashref`, but that would defeat the purpose.
What is the problem with just keeping the reference? If you get a reference, just store it in a scalar:
$hash = $h->hashref
To read more about the global symbol table, take a look at perlmod and consider perlref for the *FOO{THING} syntax, which sadly isn't for lvalues.
To achieve what you want, you could check out the several aliasing modules on cpan. Data::Alias or Lexical::Alias seem to fit your purpose. Also if you are interested in tie semantics and/or don't want to use XS modules, Tie::Alias might be worth a shoot.

regarding usage of arrow notation in perl

I've following two statements written in perl :
#m1 = ( [1,2,3],[4,5,6],[7,8,9] ); # It is an array of references.
$mr = [ [1,2,3],[4,5,6],[7,8,9] ]; # It is an anonymous array. $mr holds reference.
When I try to print:
print "$m1[0][1]\n"; # this statement outputs: 2; that is expected.
print "$mr->[0][1]\n"; #this statement outputs: 2; that is expected.
print "$mr[0][1]\n"; #this statement doesn't output anything.
I feel second and third print statements are same. However, I didn't any output with third print statement.
Can anyone let me know what is wrong with third print statement?
This is simple. $mr is a reference. So you use the Arrow Operator to dereference.
Also, if you would use use warnings; use strict;, you would have received a somewhat obvious error message:
Global symbol "#mr" requires explicit package name
$mr is a scalar variable whose value is a reference to a list. It is not a list, and it can't be used as if it was a list. The arrow is needed to access the list it refers to.
But hold on, $m1[0] is also not a list, but a reference to one. You may be wondering why you don't have to write an arrow between the indexes, like $m1[0]->[1]. There's a special rule that says you can omit the arrow when accessing list or hash elements in a list or hash of references, so you can write $mr->[0][1] instead of $mr->[0]->[1] and $m1[0][1] instead of $m1[0]->[1].
$mr holds a reference (conceptually similar to the address of a variable in compiled languages). thus you have an extra level of indirection. replace $mrwith $$mr and you'll be fine.
btw, you can easily check questions like these by browsing for tutorials on perldoc.
You said:
print "$m1[0][1]\n"; # this statement outputs: 2; that is expected.
print "$mr[0][1]\n"; #this statement doesn't output anything.
Notice how you used the same syntax both times.
As you've established by this first line, this syntax accesses the array named: #m1 and #mr. You have no variable named #mr, so you get undef for $mr[0][1].
Maybe you don't realizes that scalar $mr and array #mr have no relation to each other.
Please use use strict; use warnings; to avoid these and many other errors.

Pass by value vs pass by reference for a Perl hash

I'm using a subroutine to make a few different hash maps. I'm currently passing the hashmap by reference, but this conflicts when doing it multiple times. Should I be passing the hash by value or passing the hash reference?
use strict;
use warnings;
sub fromFile($){
local $/;
local our %counts =();
my $string = <$_[0]>;
open FILE, $string or die $!;
my $contents = <FILE>;
close FILE or die $!;
my $pa = qr{
( \pL {2} )
(?{
if(exists $counts{lc($^N)}){
$counts{lc($^N)} = $counts{lc($^N)} + 1;
}
else{
$counts{lc($^N)} = '1';
}
})
(*FAIL)
}x;
$contents =~ $pa;
return %counts;
}
sub main(){
my %english_map = &fromFile("english.txt");
#my %german_map = &fromFile("german.txt");
}
main();
When I run the different txt files individually I get no problems, but with both I get some conflicts.
Three comments:
Don't confuse passing a reference with passing by reference
Passing a reference is passing a scalar containing a reference (a type of value).
The compiler passes an argument by reference when it passes the argument without making a copy.
The compiler passes an argument by value when it passes a copy of the argument.
Arguments are always passed by reference in Perl
Modifying a function's parameters (the elements of #_) will change the corresponding variable in the caller. That's one of the reason the convention to copy the parameters exists.
my ($x, $y) = #_; # This copies the args.
Of course, the primary reason for copying the parameters is to "name" them, but it saves us from some nasty surprises we'd get by using the elements of #_ directly.
$ perl -E'sub f { my ($x) = #_; "b"=~/(.)/; say $x; } "a"=~/(.)/; f($1)'
a
$ perl -E'sub f { "b"=~/(.)/; say $_[0]; } "a"=~/(.)/; f($1)'
b
One cannot pass an array or hash as an argument in Perl
The only thing that can be passed to a Perl sub is a list of scalars. (It's also the only thing that can be returned by one.)
Since #a evaluates to $a[0], $a[1], ... in list context,
foo(#a)
is the same as
foo($a[0], $a[1], ...)
That's why we create a reference to the array or hash we want to pass to a sub and pass the reference.
If we didn't, the array or hash would be evaluated into a list of scalars, and it would have to be reconstructed inside the sub. Not only is that expensive, it's impossible in cases like
foo(#a, #b)
because foo has no way to know how many arguments were returned by #a and how many were returned by #b.
Note that it's possible to make it look like an array or hash is being passed as an argument using prototypes, but the prototype just causes a reference to the array/hash to be created automatically, and that's what actually passed to the sub.
For a couple of reasons you should use pass-by-reference, but the code you show returns the hash by value.
You should use my rather than local except for built-in variables like $/, and then for only as small a scope as possible.
Prototypes on subroutines are almost never a good idea. They do something very specific, and if you don't know what that is you shouldn't use them.
Calling subroutines using the ampersand sigil, as in &fromFile("english.txt"), hasn't been correct since Perl 4, about twenty years ago. It affects the parameters delivered to a subroutine in at least two different ways and is a bad idea.
I'm not sure why you are using a file glob with my $string = <$_[0]>. Are you expecting wildcards in the filename passed as the parameter? If so then you will be opening and reading only the first matching file, otherwise the glob is unnecessary.
Lexical file handles like $fh are better than bareword file handles like FILE, and will be closed implicitly when they are destroyed - usually at the end of the block where they are declared.
I am not sure how your hash %counts gets populated. No regex on its own can fill a hash, but I will have to trust you!
Try this version. People familiar with Perl will thank you (ironically!) for not using camel-case variable names. And it is rare to see a main subroutine declared and called. That is C, this is Perl.
Update I have changed this code to do what your original regex did.
use strict;
use warnings;
sub from_file {
my ($filename) = #_;
my $contents = do {
open my $fh, '<', $filename or die qq{Unable to open "$filename": $!};
local $/;
my $contents = <$fh>;
};
my %counts;
$counts{lc $1}++ while $contents =~ /(?=(\pL{2}))/g;
return \%counts;
}
sub main {
my $english_map = from_file('english.txt');
my $german_map = from_file('german.txt');
}
main();
You can use either a reference or pass the entire hash or array. Your choice. There are two issues that might make you choose one over the other:
Passing other parameters
Memory Management
Perl doesn't really have subroutine parameters. Instead, you're simply passing in an array of parameters. What if you're subroutine is seeing which array has more elements. I couldn't do this:
foo(#first, #second);
because all I'll be passing in is one big array that combines all the members of both. This is true with hashes too. Imagine a program that takes two hashes and finds the ones with common keys:
#common_keys = common(%hash1, %hash1);
Again, I'm combining all the keys and their values in both hashes into one big array.
The only way around this issue is to pass a reference:
foo(\#first, \#second);
#common_keys = common(\%hash1, \%hash2);
In this case, I'm passing the memory location where these two hashes are stored in memory. My subroutine can use those hash references. However, you do have to take some care which I'll explain with the second explanation.
The second reason to pass a reference is memory management. If my array or hash is a few dozen entries, it really doesn't matter all that much. However, imagine I have 10,000,000 entries in my hash or array. Copying all those members could take quite a bit of time. Passing by reference saves me memory, but with a terrible cost. Most of the time, I'm using subroutines as a way of not affecting my main program. This is why subroutines are suppose to use their own variables and why you're taught in most programming courses about variable scope.
However, when I pass a reference, I'm breaking that scope. Here's a simple program that doesn't pass a reference.
#! /usr/bin/env perl
use strict;
use warnings;
my #array = qw(this that the other);
foo (#array);
print join ( ":", #array ) . "\n";
sub foo {
my #foo_array = #_;
$foo_array[1] = "FOO";
}
Note that the subroutine foo1 is changing the second element of the passed in array. However, even though I pass in #array into foo, the subroutine doesn't change the value of #array. That's because the subroutine is working on a copy (created by my #foo_array = #_;). Once the subroutine exists, the copy disappears.
When I execute this program, I get:
this:that:the:other
Now, here's the same program, except I'm passing in a reference, and in the interest of memory management, I use that reference:
#! /usr/bin/env perl
use strict;
use warnings;
my #array = qw(this that the other);
foo (\#array);
print join ( ":", #array ) . "\n";
sub foo {
my $foo_array_ref = shift;
$foo_array_ref->[1] = "FOO";
}
When I execute this program, I get:
this:FOO:the:other
That's because I don't pass in the array, but a reference to that array. It's the same memory location that holds #array. Thus, changing the reference in my subroutine causes it to be changed in my main program. Most of the time, you do not want to do this.
You can get around this by passing in a reference, then copying that reference to an array. For example, if I had done this:
sub foo {
my #foo_array = #{ shift() };
I would be making a copy of my reference to another array. It protects my variables, but it does mean I'm copying my array over to another object which takes time and memory. Back in the 1980s when I first was programming, this was a big issue. However, in this age of gigabyte memory and quadcore processors, the main issue isn't memory management, but maintainability. Even if your array or hash contained 10 million entries, you'll probably not notice any time or memory issues.
This also works the other way around too. I could return from my subroutine a reference to a hash or the entire hash. Many people like returning a reference, but this could be problematic.
In object oriented Perl programming, I use references to keep track of my objects. Normally, I'll have a reference to a hash I can use to store other values, arrays, and hashes.
In a recent program, I was counting IDs and how many times they are referenced in a log file. This was stored in an object (which is just a reference to a hash). I had a method that would return the entire hash of IDs and their counts. I could have done this:
return $self->{COUNT_HASH};
But, what happened, if the user started modifying that reference I passed? They would be actually manipulating my object without using my methods to add and subtract from the IDs. Not something that I want them to do. Instead, I create a new hash, and then return a reference to that hash:
my %hash_counts = % { $self-{COUNT_HASH} };
return \%hash_count;
This copied my reference to an array, and then I passed the reference to the array. This protects my data from outside manipulation. I could still return a reference, but the user would no longer have access to my object without going through my methods.
By the way, I like using wantarray which gives the caller a choice on how they want their data:
my %hash_counts = %{ $self->{COUNT_HASH} };
return want array ? %hash_counts : \%hash_counts;
This allows me to return a reference or a hash depending how the user called my object:
my %hash_counts = $object->totals(); # Returns a hash
my $hash_counts_ref = $object->totals(); # Returns a reference to a hash
1 A footnote: The #_ array is pointing to the same memory location as the parameters of your calling subroutine. Thus, if I pass in foo(#array) and then did $_[1] = "foo";, I would be changing the second element of #array.

perl: printing object properties

I'm playing a bit with the Net::Amazon::EC2 libraries, and can't find out a simple way to print object properties:
This works:
my $snaps = $ec2->describe_snapshots();
foreach my $snap ( #$snaps ) {
print $snap->snapshot_id . " " . $snap->volume_id . "\n";
}
But if I try:
print "$snap->snapshot_id $snap->volume_id \n";
I get
Net::Amazon::EC2::Snapshot=HASH(0x4c1be90)->snapshot_id
Is there a simple way to print the value of the property inside a print?
$snap->volume_id is not a property, it is a method call. While you could interpolate a method call inside a string, it is exceedingly ugly.
To get all the properties of an object you can use the module Data::Dumper, included with core perl:
use Data::Dumper;
print Dumper($object);
Not in the way you want to do it. In fact, what you're doing with $snap->snapshot_id is calling a method (as in sub). Perl cannot do that inside a double-quoted string. It will interpolate your variable $snap. That becomes something like HASH(0x1234567) because that is what it is: a blessed reference of a hash.
The interpolation only works with scalars (and arrays, but I'll omit that). You can go:
print "$foo $bar"; # scalar
print "$hash->{key}"; # scalar inside a hashref
print "$hash->{key}->{moreKeys}->[0]"; # scalar in an array ref in a hashref...
There is one way to do it, though: You can reference and dereference it inside the quoted string, like I do here:
use DateTime;
my $dt = DateTime->now();
print "${\$dt->epoch }"; # both these
print "#{[$dt->epoch]}"; # examples work
But that looks rather ugly, so I would not recommend it. Use your first approach instead!
If you're still interested in how it works, you might also want to look at these Perl FAQs:
What's wrong with always quoting "$vars"?
How do I expand function calls in a string?
From perlref:
Here's a trick for interpolating a subroutine call into a string:
print "My sub returned #{[mysub(1,2,3)]} that time.\n";
The way it works is that when the #{...} is seen in the double-quoted
string, it's evaluated as a block. The block creates a reference to an
anonymous array containing the results of the call to mysub(1,2,3) .
So the whole block returns a reference to an array, which is then
dereferenced by #{...} and stuck into the double-quoted string. This
chicanery is also useful for arbitrary expressions:
print "That yields #{[$n + 5]} widgets\n";
Similarly, an expression that returns a reference to a scalar can be
dereferenced via ${...} . Thus, the above expression may be written
as:
print "That yields ${\($n + 5)} widgets\n";
Stick with the first sample you showed. It looks cleaner and is easier to read.
I'm answering this because it took me a long time to find this and I feel like other people may benefit as well.
For nicer printing of objects use Data::Printer and p():
use DateTime;
use Data::Printer;
my $dt = DateTime->from_epoch( epoch => time );
p($dt);
The PERL translator has limited depth perception within quotes. Removing them should solve the problem. Or just load the real values into a simple variable that you can print within the quotes. Might need to do that if you have objects which contain pointers to other objects:
SwissArmyChainSaw =/= PureMagic:
print("xxx".$this->{whatever}."rest of string\n");
The problem is that $snap is being interpolated inside the string, but $snap is a reference. As perldoc perlref tells us: "Using a reference as a string produces both its referent's type, including any package blessing as described in perlobj, as well as the numeric address expressed in hex."
In other words, within a string, you can't dereference $snap. Your first try was the correct way to do it.
I agree with most comment, stick to concatenation for easy reading. You can use
say
instead of print to spare of using the "\n".

perl encapsulate single variable in double quotes

In Perl, is there any reason to encapsulate a single variable in double quotes (no concatenation) ?
I often find this in the source of the program I am working on (writen 10 years ago by people that don't work here anymore):
my $sql_host = "something";
my $sql_user = "somethingelse";
# a few lines down
my $db = sub_for_sql_conection("$sql_host", "$sql_user", "$sql_pass", "$sql_db");
As far as I know there is no reason to do this. When I work in an old script I usualy remove the quotes so my editor colors them as variables not as strings.
I think they saw this somewhere and copied the style without understanding why it is so. Am I missing something ?
Thank you.
All this does is explicitly stringify the variables. In 99.9% of cases, it is a newbie error of some sort.
There are things that may happen as a side effect of this calling style:
my $foo = "1234";
sub bar { $_[0] =~ s/2/two/ }
print "Foo is $foo\n";
bar( "$foo" );
print "Foo is $foo\n";
bar( $foo );
print "Foo is $foo\n";
Here, stringification created a copy and passed that to the subroutine, circumventing Perl's pass by reference semantics. It's generally considered to be bad manners to munge calling variables, so you are probably okay.
You can also stringify an object or other value here. For example, undef stringifies to the empty string. Objects may specify arbitrary code to run when stringified. It is possible to have dual valued scalars that have distinct numerical and string values. This is a way to specify that you want the string form.
There is also one deep spooky thing that could be going on. If you are working with XS code that looks at the flags that are set on scalar arguments to a function, stringifying the scalar is a straight forward way to say to perl, "Make me a nice clean new string value" with only stringy flags and no numeric flags.
I am sure there are other odd exceptions to the 99.9% rule. These are a few. Before removing the quotes, take a second to check for weird crap like this. If you do happen upon a legit usage, please add a comment that identifies the quotes as a workable kludge, and give their reason for existence.
In this case the double quotes are unnecessary. Moreover, using them is inefficient as this causes the original strings to be copied.
However, sometimes you may want to use this style to "stringify" an object. For example, URI ojects support stringification:
my $uri = URI->new("http://www.perl.com");
my $str = "$uri";
I don't know why, but it's a pattern commonly used by newcomers to Perl. It's usually a waste (as it is in the snippet you posted), but I can think of two uses.
It has the effect of creating a new string with the same value as the original, and that could be useful in very rare circumstances.
In the following example, an explicit copy is done to protect $x from modification by the sub because the sub modifies its argument.
$ perl -E'
sub f { $_[0] =~ tr/a/A/; say $_[0]; }
my $x = "abc";
f($x);
say $x;
'
Abc
Abc
$ perl -E'
sub f { $_[0] =~ tr/a/A/; say $_[0]; }
my $x = "abc";
f("$x");
say $x;
'
Abc
abc
By virtue of creating a copy of the string, it stringifies objects. This could be useful when dealing with code that alters its behaviour based on whether its argument is a reference or not.
In the following example, explicit stringification is done because require handles references in #INC differently than strings.
$ perl -MPath::Class=file -E'
BEGIN { $lib = file($0)->dir; }
use lib $lib;
use DBI;
say "ok";
'
Can't locate object method "INC" via package "Path::Class::Dir" at -e line 4.
BEGIN failed--compilation aborted at -e line 4.
$ perl -MPath::Class=file -E'
BEGIN { $lib = file($0)->dir; }
use lib "$lib";
use DBI;
say "ok";
'
ok
In your case quotes are completely useless. We can even says that it is wrong because this is not idiomatic, as others wrote.
However quoting a variable may sometime be necessary: this explicitely triggers stringification of the value of the variable. Stringification may give a different result for some values if thoses values are dual vars or if they are blessed values with overloaded stringification.
Here is an example with dual vars:
use 5.010;
use strict;
use Scalar::Util 'dualvar';
my $x = dualvar 1, "2";
say 0+$x;
say 0+"$x";
Output:
1
2
My theory has always been that it's people coming over from other languages with bad habits. It's not that they're thinking "I will use double quotes all the time", but that they're just not thinking!
I'll be honest and say that I used to fall into this trap because I came to Perl from Java, so the muscle memory was there, and just kept firing.
PerlCritic finally got me out of the habit!
It definitely makes your code more efficient, but if you're not thinking about whether or not you want your strings interpolated, you are very likely to make silly mistakes, so I'd go further and say that it's dangerous.