Perl: How to break on write-to-variable in Eclipse

I have a script that writes entries into a hash. However, at a certain point the hash contains entries that I think should not be there. So obviously I've cocked it up somewhere, but there is only one place where I think I add elements to the hash, and I've tested that to make sure these "rogue" elements aren't being added there.
What I would like to do is break on a write to the hash, something like this, but in a "global" kinda way, because I don't know where this stray write is in the code; I can't see it...
So what are my options? Can I set a watchpoint in the EPIC debugger and, if so, how? (I've had a play but can't find anything relevant.)
Or could I perhaps create an extended hash that can intercept writes somehow?
Any ideas on an "easy" debugging method? Otherwise I think I'll be back to brute-force debugging :S Thanks in advance...

Not an EPIC-specific answer, but check out Tie::Watch. You can set up a variable (like a hash) to be watched, and your program can output something every time the variable is updated.
updated: Tie::Trace does pretty much the same thing, with a simpler interface.

Here is a DIY version of mob's answer: tie the hash to a class that outputs a stack trace on every write access:
package MyHash {
use Tie::Hash;
use parent -norequire, "Tie::StdHash";
use Carp "cluck";
sub STORE {
my ($self, $k, $v) = @_;
cluck "hash write to $self: $k => $v";
$self->SUPER::STORE($k => $v);
}
}
tie my %hash => "MyHash";
$hash{foo} = 42;
print "$hash{foo}\n";
Output:
hash write to MyHash=HASH(0x91be87c): foo => 42 at -e line 1.
MyHash::STORE('MyHash=HASH(0x91be87c)', 'foo', 42) called at -e line 1
42
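If a full stack trace on every write is too noisy, a lighter variant of the same idea is to record only where each write came from. This is a minimal sketch, not the answerer's code: the package name WriteLog and the @writes log are made up for illustration, and it relies only on the core Tie::Hash module and the built-in caller.

```perl
use strict;
use warnings;

package WriteLog {
    use Tie::Hash;                   # provides Tie::StdHash
    our @ISA = ('Tie::StdHash');
    our @writes;                     # hypothetical log of every store

    sub STORE {
        my ( $self, $key, $value ) = @_;
        # caller reports the file and line of the statement doing the write
        my ( undef, $file, $line ) = caller;
        push @writes, "$key written at $file line $line";
        $self->SUPER::STORE( $key, $value );
    }
}

tie my %hash, 'WriteLog';
$hash{foo} = 42;
$hash{bar} = 23;

# One line per write, in the order the writes happened.
print "$_\n" for @WriteLog::writes;
```

Searching the log for the rogue key afterwards should point at the statement that stored it.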

use Hash::Util qw(
lock_keys unlock_keys
lock_hash unlock_hash
);
my %hash = (foo => 42, bar => 23);
# no keys change
lock_keys(%hash);
# no keys/values change
lock_hash(%hash);
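Applied to the original problem, the idea would be to lock the keys once the legitimate entries are in place, so that a stray write to a new key dies at the offending line. A small sketch, assuming the rogue entries use keys outside the known set; the data and the key name rogue are invented for the example:

```perl
use strict;
use warnings;
use Hash::Util qw(lock_keys unlock_keys);

# Hypothetical data standing in for the real hash.
my %hash = ( foo => 42, bar => 23 );

# From here on, storing under any key other than foo/bar is fatal.
lock_keys(%hash);

# Existing keys can still be updated.
$hash{foo} = 43;

# A stray write to a new key dies; eval traps it so the message can be
# inspected. The message names the key and the file/line of the write.
my $ok    = eval { $hash{rogue} = 1; 1 };
my $error = $ok ? '' : $@;
print "caught: $error";

# Unlock again once the culprit has been found.
unlock_keys(%hash);
$hash{rogue} = 1;
```

Without the eval, the script simply stops at the stray write and reports its file and line.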

Related

Printing hash value in Perl

When I print a variable, I am getting a HASH(0xd1007d0) value. I need to print the values of all the keys and values. However, I am unable to as the control does not enter the loop.
foreach my $var(keys %{$HashVariable}){
print"In the loop \n";
print"$var and $HashVariable{$var}\n";
}
But the control is not even entering the loop. I am new to perl.
I can't answer completely, because it depends entirely on what's in $HashVariable.
The easiest way to tell what's in there is:
use Data::Dumper;
print Dumper $HashVariable;
Assuming this is a hash reference - which it would be, if print $HashVariable gives HASH(0xdeadbeef) as an output.
So this should work:
#!/usr/bin/env perl
use strict;
use warnings;
my $HashVariable = { somekey => 'somevalue' };
foreach my $key ( keys %$HashVariable ) {
print $key, " => ", $HashVariable->{$key},"\n";
}
The only mistake you're making is that $HashVariable{$key} won't work - you need to dereference, because as it stands it refers to %HashVariable not $HashVariable which are two completely different things.
Otherwise, if it's not entering the loop, it may mean that keys %$HashVariable isn't returning anything. That is why the Dumper test would be useful: is there any chance you're either not populating it correctly, or writing to %HashVariable instead?
E.g.:
my %HashVariable;
$HashVariable{'test'} = "foo";
There's an obvious problem here, but it wouldn't cause the behaviour that you are seeing.
You think that you have a hash reference in $HashVariable and that sounds correct given the HASH(0xd1007d0) output that you see when you print it.
But setting up a hash reference and running your code, gives slightly strange results:
my $HashVariable = {
foo => 1,
bar => 2,
baz => 3,
};
foreach my $var(keys %{$HashVariable}){
print"In the loop \n";
print"$var and $HashVariable{$var}\n";
}
The output I get is:
In the loop
baz and
In the loop
bar and
In the loop
foo and
Notice that the values aren't being printed out. That's because of the problem I mentioned above. Adding use strict to the program (which you should always do) tells us what the problem is.
Global symbol "%HashVariable" requires explicit package name (did you forget to declare "my %HashVariable"?) at hash line 14.
Execution of hash aborted due to compilation errors.
You are using $HashVariable{$var} to look up a key in your hash. That would be correct if you had a hash called %HashVariable, but you don't - you have a hash reference called $HashVariable (note the $ instead of %). To look up a key from a hash reference, you need to use a dereferencing arrow - $HashVariable->{$var}.
Fixing that, your program works as expected.
use strict;
use warnings;
my $HashVariable = {
foo => 1,
bar => 2,
baz => 3,
};
foreach my $var(keys %{$HashVariable}){
print"In the loop \n";
print"$var and $HashVariable->{$var}\n";
}
And I see:
In the loop
bar and 2
In the loop
foo and 1
In the loop
baz and 3
The only way that you could get the results you describe (the HASH(0xd1007d0) output but no iterations of the loop) is if you have a hash reference but the hash has no keys.
So (as I said in a comment) we need to see how your hash reference is created.
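The empty-hash case is easy to reproduce. A minimal sketch, assuming the reference really does point at a hash with no keys:

```perl
use strict;
use warnings;

# An empty hash reference: still a valid reference, but it has no keys.
my $HashVariable = {};

# It stringifies as HASH(0x...) just like a populated one.
my $printed = "$HashVariable";
print "$printed\n";

# keys returns an empty list, so the loop body never runs.
my $iterations = 0;
foreach my $var ( keys %{$HashVariable} ) {
    $iterations++;
    print "$var and $HashVariable->{$var}\n";
}
print "iterations: $iterations\n";
```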

Use of uninitialized value $_ in hash element

This is regarding a warning message I received when running a Perl script.
I understand why I'm receiving this warning: probably because $element is undefined when being called but I don't see it.
for ( my $element->{$_}; @previous_company_names; ) {
map { $element => $previous_company_names->{$_} }
0 .. $previous_company_names;
The result is this message
Use of uninitialized value $_ in hash element
First and foremost: for a new programmer, absolutely the most important thing you must do is use strict; and use warnings;. You've got my in there, which suggests you might be, but it pays to reiterate it.
$_ is a special variable, called the implicit variable. It doesn't really make sense to use it the way you're doing in that for loop. Take a look at perlvar for some more detail.
Indeed, I'd suggest steering clear of map entirely until you really grok it, because it's a good way to confuse yourself.
With a for (or foreach) loop you can either:
for my $thing ( @list_of_things ) {
print $thing;
}
Or you can do:
for ( @list_of_things ) {
print $_;
}
$_ is set implicitly by each iteration of the second loop, which can be quite useful because lots of things default to using it.
E.g.
for ( @list_of_things ) {
chomp;
s/ /_/g;
print;
}
When it comes to map - map is a clever little function, that lets you evaluate a code block for each element in a list. Personally - I still get confused by it, and tend to stick with for or foreach loops instead, most of the time.
But what you're doing with it isn't really going to work. map returns a list, and assigning that list of key/value pairs to a hash is what makes a hash.
So something like:
use Data::Dumper;
my %things = map { $_ => 1 } 1..5;
print Dumper \%things;
This creates the hash 'things':
$VAR1 = {
'1' => 1,
'3' => 1,
'5' => 1,
'4' => 1,
'2' => 1
};
Again, $_ is used inside, because it's the magic variable - it's set to 'whatever was in the second bit' (e.g 1,2,3,4,5) each loop, and then the block is evaluated.
So your map expression doesn't really make a lot of sense, because you don't have $element defined... and even if you did, you'd repeatedly overwrite it.
I would also note that $previous_company_names would need to be numeric, and is in NO way related to @previous_company_names. You might be meaning to use $#previous_company_names, which is the index of the last element.
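As a rough sketch of what the loop might have been aiming for: this assumes @previous_company_names is a plain array of names and that the goal is a hash keyed by index. Both are guesses, since the question does not show the data, and the names below are invented.

```perl
use strict;
use warnings;

# Hypothetical data; the question does not show the real array.
my @previous_company_names = ( 'Acme Ltd', 'Acme Holdings', 'Acme Group' );

# $#previous_company_names is the index of the last element.
my %element_by_index;
for my $i ( 0 .. $#previous_company_names ) {
    $element_by_index{$i} = $previous_company_names[$i];
}

# The same thing with map: the block returns a key/value pair per index,
# and assigning the resulting list to a hash builds the hash.
my %same = map { $_ => $previous_company_names[$_] } 0 .. $#previous_company_names;

print "$_ => $element_by_index{$_}\n" for sort keys %element_by_index;
```

Inside both the for loop and the map block, $_ (or $i) is set for each index, so there is no uninitialized value warning.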

perl: call hash values as methods

I have a usual hash or hashref.
my %hash; $hash{'key'} = 'value';
All these quote marks and curly brackets are not very easy to type.
I know there is a trick to call hash values like methods:
$hash->key = 'value'; # even no need for round brackets !
Maybe it uses some magic module. I know it exists; I have seen this code once:
use CallHashLikeMethods 'hash';
$hash->key = 'value';
Of course, I can write a class for this hash and then tie it, but that is very manual.
I am looking for a magic module which prepares the hash automatically. I just forgot its name.
What you are asking to do is a fairly bad idea:
Maintainability: When a hash access doesn't look like a hash access, that's bad.
Performance: Method calls are much more expensive than hash accesses.
Correctness: An overdose of cleverness could make other clever code break. Keep your code simple and stupid.
Furthermore, this will not save you any typing, because the keys in a hash access are auto-quoted:
$hash{key} = 'value';
$hash->key = 'value';
… as long as the key is a valid bareword.
I don't know of any pre-written CPAN modules that do this, but it's not exactly difficult...
use v5.10;
use strict;
use warnings;
sub HASH::AUTOLOAD :lvalue {
my ($key) = ($HASH::AUTOLOAD =~ /(\w+)\z/);
shift->{$key};
}
my $hash = {
foo => 1,
bar => 0, # not 2
baz => 3,
};
bless($hash, 'HASH');
$hash->bar = 1;
$hash->bar++;
say $hash->foo;
say $hash->bar;
say $hash->baz;
I agree with amon though - normal hash access will be clearer and faster, and the syntax is not exactly onerous.
Update: found a CPAN module for it: Hash::AsObject.
If you want to use a fixed set of keys as structure values, you might like Struct::Dumb:
use Struct::Dumb;
struct Point => ['x', 'y', 'z'];
my $p = Point(10, 20, 30);
$p->x = 40;

Confusion about proper usage of dereference in Perl

I noticed the other day, while altering values in a hash, that when you dereference a hash in Perl you are actually making a copy of that hash. To confirm, I wrote this quick little script:
#! perl
use warnings;
use strict;
my %h = ();
my $hRef = \%h;
my %h2 = %{$hRef};
my $h2Ref = \%h2;
if($hRef eq $h2Ref) {
print "\n\tThey're the same $hRef $h2Ref";
}
else {
print "\n\tThey're NOT the same $hRef $h2Ref";
}
print "\n\n";
The output:
They're NOT the same HASH(0x10ff6848) HASH(0x10fede18)
This leads me to realize that there could be spots in some of my scripts where they aren't behaving as expected. Why is it even like this in the first place? If you're passing or returning a hash, it would be more natural to assume that dereferencing the hash would allow me to alter the values of the hash being dereferenced. Instead I'm just making copies all over the place without any real need/reason to beyond making syntax a little more obvious.
I realize the fact that I hadn't even noticed this until now shows it's probably not that big of a deal (in terms of the need to go and fix all of my scripts, though it is important going forward). I think it's going to be pretty rare to see noticeable performance differences out of this, but that doesn't alter the fact that I'm still confused.
Is this by design in Perl? Is there some explicit reason I don't know about for this, or is this just known and you, as the programmer, are expected to know and write scripts accordingly?
The problem is that you are making a copy of the hash to work with in this line:
my %h2 = %{$hRef};
And that is understandable, since many posts here on SO use that idiom to make a local name for a hash, without explaining that it is actually making a copy.
In Perl, a hash is a plural value, just like an array. This means that in list context (such as you get when assigning to a hash) the aggregate is taken apart into a list of its contents. This list of pairs is then assembled into a new hash as shown.
What you want to do is work with the reference directly.
for (keys %$href) {...}
for (values %$href) {...}
my $x = $href->{some_key};
# or
my $x = $$href{some_key};
$$href{new_key} = 'new_value';
When working with a normal hash, you have the sigil, which is a % when talking about the entire hash, a $ when talking about a single element, and an @ when talking about a slice. Each of these sigils is then followed by an identifier.
%hash # whole hash
$hash{key} # element
@hash{qw(a b)} # slice
To work with a reference named $href simply replace the string hash in the above code with $href. In other words, $href is the complete name of the identifier:
%$href # whole hash
$$href{key} # element
@$href{qw(a b)} # slice
Each of these could be written in a more verbose form as:
%{$href}
${$href}{key}
@{$href}{qw(a b)}
Which is again a substitution of the string '$href' for 'hash' as the name of the identifier.
%{hash}
${hash}{key}
@{hash}{qw(a b)}
You can also use a dereferencing arrow when working with an element:
$href->{key} # exactly the same as $$href{key}
But I prefer the doubled sigil syntax since it is similar to the whole aggregate and slice syntax, as well as the normal non-reference syntax.
So to sum up, any time you write something like this:
my @array = @$array_ref;
my %hash = %$hash_ref;
You will be making a copy of the first level of each aggregate. When using the dereferencing syntax directly, you will be working on the actual values, and not a copy.
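A short illustration of the difference, including the "first level" caveat. The hash contents here are made up; this is only a sketch:

```perl
use strict;
use warnings;

my %original = ( count => 1, nested => { inner => 'a' } );
my $ref      = \%original;

# Assigning the dereferenced hash copies the top-level key/value pairs.
my %copy = %$ref;
$copy{count} = 99;              # changes only the copy

# Going through the reference changes the original hash.
$ref->{count}++;

# The copy is shallow: the nested hash reference is shared by both.
$copy{nested}{inner} = 'b';

print "original count: $original{count}\n";
print "copy count:     $copy{count}\n";
print "original inner: $original{nested}{inner}\n";
```

The original's count is affected only by the write through $ref, while the nested value changes in both, because the copy holds the same inner reference.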
If you want a REAL local name for a hash, but want to work on the same hash, you can use the local keyword to create an alias.
sub some_sub {
my $hash_ref = shift;
our %hash; # declare a lexical name for the global %{__PACKAGE__::hash}
local *hash = \%$hash_ref;
# install the hash ref into the glob
# the `\%` bit ensures we have a hash ref
# use %hash here, all changes will be made to $hash_ref
} # local unwinds here, restoring the global to its previous value if any
That is the pure Perl way of aliasing. If you want to use a my variable to hold the alias, you can use the module Data::Alias
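A runnable version of the glob-aliasing sub might look like the following. The sub name add_entry and the keys are invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical sub: adds a key through a local alias instead of a copy.
sub add_entry {
    my ( $hash_ref, $key, $value ) = @_;

    our %hash;                  # lexically scoped name for the package variable %hash
    local *hash = $hash_ref;    # alias %hash to the caller's hash for this scope

    $hash{$key} = $value;       # writes straight into the caller's hash
    return scalar keys %hash;
}

my %data  = ( a => 1 );
my $count = add_entry( \%data, b => 2 );

print "keys: $count\n";
print "b is $data{b}\n";
```

Because %hash is an alias rather than a copy, the new key shows up in %data after the call.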
You are confusing the actions of dereferencing, which does not inherently create a copy, and using a hash in list context and assigning that list, which does. $hashref->{'a'} is a dereference, but most certainly does affect the original hash. This is true for $#$arrayref or values(%$hashref) also.
Without the assignment, just the list context %$hashref is a mixed beast; the resulting list contains copies of the hash keys but aliases to the actual hash values. You can see this in action:
$ perl -wle'$x={"a".."f"}; for (%$x) { $_=chr(ord($_)+10) }; print %$x'
epcnal
vs.
$ perl -wle'$x={"a".."f"}; %y=%$x; for (%y) { $_=chr(ord($_)+10) }; print %$x; print %y'
efcdab
epcnal
but %$hashref isn't acting any differently than %hash here.
No, dereferencing does not create a copy of the referent. It's my that creates a new variable.
$ perl -E'
my %h1; my $h1 = \%h1;
my %h2; my $h2 = \%h2;
say $h1;
say $h2;
say $h1 == $h2 ?1:0;
'
HASH(0x83b62e0)
HASH(0x83b6340)
0
$ perl -E'
my %h;
my $h1 = \%h;
my $h2 = \%h;
say $h1;
say $h2;
say $h1 == $h2 ?1:0;
'
HASH(0x9eae2d8)
HASH(0x9eae2d8)
1
No, $#{$someArrayHashRef} does not create a new array.
If perl did what you suggest, then variables would get aliased very easily, which would be far more confusing. As it is, you can alias variables with globbing, but you need to do so explicitly.

What is the Perlish way to iterate from item n to the end of an array?

The problem is that I have n command-line arguments. There are always going to be at least 2, however the maximum number is unbounded. The first argument specifies a mode of operation and the second is a file to process. The 3rd through nth are the things to do to the file (which might be none, since the user might just want to clean the file, which is done if you just pass it 2 arguments).
I'm looking at the methods available to me in Perl for working with arrays, but I'm not sure what the "Perlish" way of iterating from item 3 to the end of my array is.
Some options that I've seen:
Pop from the end of the array until I find an element that does not begin with "-" (since the file path does not begin with a "-", although I suppose it could, which might cause problems).
Shift the array twice to remove the first two elements. Whatever I'm left with I can just iterate over, if its size is at least 1.
I like the second option, but I don't know if it's Perlish. And since I'm trying to learn Perl, I might as well learn the right way to do things in Perl.
Aside from using Getopt module as Sinan wrote, I would probably go with:
my ( $operation, $file, @things ) = @ARGV;
And then you can:
for my $thing_to_do ( @things ) {
...
}
IMHO, the Perlish way of accomplishing what you need would be to use one of the Getopt modules on CPAN.
If you still want to do it by hand, I would go for the second option (this is similar to how we handle the first argument of a method call):
die "Must provide filename and operation\n" unless @ARGV >= 2;
my $op = shift @ARGV;
my $file = shift @ARGV;
if ( @ARGV ) {
# handle the other arguments;
}
I would highly recommend using Getopt::Long for parsing command line arguments. It's a standard module, it works awesome, and makes exactly what you're trying to do a breeze.
use strict;
use warnings;
use Getopt::Long;
my $first_option = undef;
my $second_option = undef;
GetOptions ('first-option=s' => \$first_option,
'second-option=s' => \$second_option);
die "Didn't pass in first-option, must be xxxyyyzzz."
if ! defined $first_option;
die "Didn't pass in second-option, must be aaabbbccc."
if ! defined $second_option;
foreach my $arg (@ARGV) {
...
}
This lets you have a long option name, and automatically fills in the information into variables for you, and allows you to test it. It even lets you add extra commands later, without having to do any extra parsing of the arguments, like adding a 'version' or a 'help' option:
# adding these to the above example...
my $VERSION = '1.000';
sub print_help { ... }
# ...and replacing the previous GetOptions with this...
GetOptions ('first-option=s' => \$first_option,
'second-option=s' => \$second_option,
'version' => sub { print "Running version $VERSION"; exit 1 },
'help' => sub { print_help(); exit 2 } );
Then, you can invoke it on the command line using -, --, the first letter, or the entire option, and GetOptions figures it all out for you. It makes your program more robust and easier to figure out; it's more "guessable", you could say. The best part is you never have to change your code that processes @ARGV, because GetOptions will take care of all that setup for you.
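To see what is left over after option parsing without touching the real command line, Getopt::Long's GetOptionsFromArray can be pointed at an ordinary array. A minimal sketch; the argument values are invented, and the abbreviated --second relies on the module's default abbreviation handling:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical command line, so the example does not depend on the real @ARGV.
my @args = ( '--first-option=clean', '--second', 'file.txt', 'trim', 'dedupe' );

my ( $first_option, $second_option );
GetOptionsFromArray(
    \@args,
    'first-option=s'  => \$first_option,
    'second-option=s' => \$second_option,    # matched here by the abbreviation --second
) or die "Bad command line\n";

# Recognised options have been removed; only the extra arguments remain.
print "first:  $first_option\n";
print "second: $second_option\n";
print "rest:   @args\n";
```

Whatever remains in the array afterwards is the list of "things to do", ready for a plain foreach.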
The most standard way of doing things in Perl is through CPAN.
So my first choice would be Getopt::Long. There is also a tutorial on DevShed: Processing Command Line Options with Perl
You can use a slice to extract the 3rd through last items, for example:
[dsm@localhost:~]$ perl -le 'print join ", ", @ARGV[2..$#ARGV];' 1 2 3 4 5 6 7 8 9 10 00
3, 4, 5, 6, 7, 8, 9, 10, 00
[dsm@localhost:~]$
however, you should probably be using shift (or even better, Getopt::Long)
depesz's answer is one good way to go.
There is also nothing wrong with your second option:
my $op = shift; # implicit shift from @ARGV
my $file = shift;
my @things = @ARGV;
# iterate over @things;
You could also skip copying @ARGV into @things and work directly on it. However, unless the script is very short, very simple, and unlikely to grow more complex over time, I would avoid taking too many shortcuts.
Whether you choose depesz's approach or this one is largely a matter of taste.
Deciding which is better is really a matter of philosophy. The crux of the issue is whether you should modify globals like @ARGV. Some would say it is no big deal as long as it is done in a highly visible way. Others would argue in favor of leaving @ARGV untouched.
Pay no attention to anyone arguing in favor of one option or the other due to speed or memory issues. The @ARGV array is limited by most shells to a very small size and thus no significant optimization is available by using one method over the other.
Getopt::Long, as has been mentioned is an excellent choice, too.
Do have a look at MooseX::Getopt because it may whet your appetite for even more things Moosey!.
Example of MooseX::Getopt:
# getopt.pl
{
package MyOptions;
use Moose;
with 'MooseX::Getopt';
has oper => ( is => 'rw', isa => 'Int', documentation => 'op doc stuff' );
has file => ( is => 'rw', isa => 'Str', documentation => 'about file' );
has things => ( is => 'rw', isa => 'ArrayRef', default => sub {[]} );
no Moose;
}
my $app = MyOptions->new_with_options;
for my $thing (@{ $app->things }) {
print $app->file, " : ", $thing, "\n";
}
# => file.txt : item1
# => file.txt : item2
# => file.txt : item3
Will produce the above when run like so:
perl getopt.pl --oper 1 --file file.txt --things item1 --things item2 --things item3
These Moose types are checked... ./getopt --oper "not a number" produces:
Value "not a number" invalid for option oper (number expected)
And for free you always get a usage list ;-)
usage: getopt.pl [long options...]
--file bit about file
--oper op doc stuff
--things
/I3az/
For the more general case with any array:
for(my $i=2; $i<@array; $i++) {
print "$array[$i]\n";
}
That loops through the array, starting with the third element (index 2). Obviously, for the specific example you specify, depesz's answer is the most straightforward and best.
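For the general case, an array slice avoids the index bookkeeping. A small sketch; the helper name from_index and the sample data are made up:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the elements from index $n to the end.
sub from_index {
    my ( $n, @array ) = @_;
    # If $n is past the last index, the range is empty and so is the slice.
    return @array[ $n .. $#array ];
}

my @array = qw(mode file.txt trim dedupe sort);

for my $item ( from_index( 2, @array ) ) {
    print "$item\n";
}

# With only two elements there is nothing from index 2 onwards.
my @none = from_index( 2, qw(mode file.txt) );
print "nothing left: ", scalar(@none), "\n";
```

The empty-range behaviour means no separate length check is needed before looping.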