Why does the Perl conditonal operator behave different with a undefined hashref key? - perl

Why does the ternary conditional operator behave different with a undefined hashref key than a regular undefined $ variable ?
For example if $form is undefined and $str is defined the following ternary always assigns the value of str to the value of form:
($str) ? $form = $str : $form = $form;
However if undefined hash keys are substituted it no longer works.
I have made a script that demonstrates the odd behavior.
Note you'll get the 'Use of uninitialized value' warnings in all tests.
That is intentional as I am testing for an undefined value.
Thanks in advance
#!/usr/bin/env perl
# ternary operator
# condition ? if True : if False
# Why does the ternary operator behave different with a undefined hashref key
# than a regular undefined $ variable ?
use strict;
use warnings;
use diagnostics;
my $i = 1;
my $form = { vc => 'customer', customer_id => ''};
my $ref = { customer_id => 12345 };
&string_test;# passes
print "-" x 80;
&mixed_test_ref; # passes
print "-" x 80;
&mixed_test; #fails
print "-" x 80;
&hashref_test; #fails
print "-" x 80;
&works_always;
print "-" x 80;
print "\n";
sub string_test {
my $str = '1234';
my $form; # not defined
print qq|
Why does the ternary operator behave different
with a undefined hashref key
than a regular undefined \$ variable ?
First with a undefined variable called form:
Before:
str : '$str' \n
form : $form
|; # will throw undefined warning
# ternary operator
#condition ? if True : if False
($str) ? $form = $str :
$form = $form;
print qq|
Always works:
str : '$str' \n
form: $form \n|;
}
sub mixed_test_ref {
my $local;
print qq|
Mixed test ref
Before:
form->{vc} '$form->{vc}' \n
ref->{"$form->{vc}_id"} : '$ref->{"$form->{vc}_id"}' \n
|;
# ternary operator
#condition ? if True : if False
($ref->{"$form->{vc}_id"}) ? $local = $ref->{"$form->{vc}_id"} :
$local = $local;
print qq|
After:
form->{vc} '$form->{vc}' \n
ref->{"$form->{vc}_id"} : '$ref->{"$form->{vc}_id"}' \n
local: '$local' \n
|;
if ($local = $ref->{"$form->{vc}_id"}) {print "Pass\n"}
else {print "Fail\n"}
}
sub mixed_test {
# $form->{"$form->{vc}_id_$i"} is not defined
# setting it to empty string makes it pass
#$form->{"$form->{vc}_id_$i"} = '';
delete $form->{"$form->{vc}_id_$i"};
my $str = '1234';
print qq|
mixed test
Before:
str : '$str' \n
form: '$form->{"$form->{vc}_id_$i"}' \n
|;
# ternary operator
#condition ? if True : if False
($str) ? $form->{"$form->{vc}_id_$i"} = $str :
$form->{"$form->{vc}_id_$i"} = $form->{"$form->{vc}_id_$i"};
if ( $form->{"$form->{vc}_id_$i"} == $str ) {
print qq|Pass\
After:
str : '$str' \n
form: $form->{"$form->{vc}_id_$i"} \n|;
}
else {print "Fail\n"}
}
sub hashref_test {
# $form->{"$form->{vc}_id_$i"} is not defined
# setting it to empty string makes it pass
#$form->{"$form->{vc}_id_$i"} = '';
delete $form->{"$form->{vc}_id_$i"};
print qq|
hash ref test
Before:
form->{vc} '$form->{vc}' \n
ref->{"$form->{vc}_id"} : '$ref->{"$form->{vc}_id"}' \n
form->{"$form->{vc}_id_$i"} : $form->{"$form->{vc}_id_$i"} \n
|;
# ternary operator
#condition ? if True : if False
$ref->{"$form->{vc}_id"} ? $form->{"$form->{vc}_id_$i"} = $ref->{"$form->{vc}_id"} :
$form->{"$form->{vc}_id_$i"} = $form->{"$form->{vc}_id_$i"};
if ($form->{"$form->{vc}_id_$i"}) {print "Passes and I am amazed\n"}
else {
print qq|
Why does this not work?
After:
form->{vc} '$form->{vc}' \n
ref->{"$form->{vc}_id"} : '$ref->{"$form->{vc}_id"}' \n
form->{"$form->{vc}_id_$i"} : $form->{"$form->{vc}_id_$i"} \n|;
}
}
sub works_always {
if ($ref->{"$form->{vc}_id"}) { $form->{"$form->{vc}_id_$i"} = $ref->{"$form->{vc}_id"} }
# not necessary but for completeness
else { $form->{"$form->{vc}_id_$i"} = $form->{"$form->{vc}_id_$i" } }
print qq|
Always works as expected with if statement:
form->{vc} '$form->{vc}' \n
ref->{"$form->{vc}_id"} : '$ref->{"$form->{vc}_id"}' \n
form->{"$form->{vc}_id_$i"} : $form->{"$form->{vc}_id_$i"} \n|;
}

I knew when I saw the title of the question that the answer would be about precedence.
You might think that
($str) ? $form = $str : $form = $form;
means
if ($str) {
$form = $str;
} else {
$form = $form;
}
But this command
$ perl -MO=Deparse,-p -e '($str) ? $form = $str : $form = $form'
(($str ? ($form = $str) : $form) = $form);
shows us that it really means
if ($str) {
$form = $str = $form;
} else {
$form = $form;
}
This can be fixed in many ways. With more parentheses:
($str) ? ($form = $str) : ($form = $form)
or by writing a much simpler expression:
$form = $str if $str;
$form = $str || $form;

Related

Perl regex to capture group and stop matching

I need some help with this perl regular expression
s/.*?<\?lsmb if\s*?(\S*)\s*?\?>/$1/
in the code below parsing out some non-whitespace chars [A-Z][a-z][0-9][_] surrounded by any number of whitespace and the other chars. I have tried various Perl regular expressions which are all commented out in the program below.
My main problem I think is stopping matching at the end.
The code below runs 8 tests, and I am hoping to find something that passes all 8.
#!/usr/bin/perl
use strict;
use warnings;
use diagnostics;
my $count = 0;
my $t = 0;
#examples of things I need to match, match => catagory
my $self = { 'customerfax' => 'alpha',
'_____' => 'Underscore',
'000000' => 'numeric',
'letter_reason_4' => 'alfa-numeric-underscore',
'customerphone7' => 'alfa-numeric',
'customer_phone' => 'alfa-underscore',
};
# must contain <?lsmb 'varname from $self' ?>
# may contain any amount of whitespace chars where one is depected
# will end with \n that is removed by chop below
my $test1 = qq|<?lsmb if customerfax ?> caacaacac\n|;
my $test2 = qq|<?lsmb if _____ ?> bbb\n|;
my $test3 = qq|<?lsmb if 000000 ?> cccc\n|;
my $test4 = qq|<?lsmb if letter_reason_4 ?><t \></'><><><>\n|; # /
my $test5 = qq| <?lsmb if customerfax ?> |;
my $test6 = qq|<?lsmb if customerphone7 ?> \<?lsmb ignore this >n|;
my $test7 = qq|<?lsmb if customer_phone ?>\n|;
my $test8 = qq| avcscc 34534534 <?lsmb if letter_reason_4 ?> 0xffff\n|;
strip_markup($test1);
strip_markup($test2);
strip_markup($test3);
strip_markup($test4);
strip_markup($test5);
strip_markup($test6);
strip_markup($test7);
strip_markup($test8);
if ($count == 8) { print "Passed All done\n";}
else { print "All done passed $count out of 8 Try again \n"; }
sub strip_markup {
$_= shift;
#print "strip_markup $_ \n";
if (/<\?lsmb if /) {
chop; # gets rid ot the new line
#original
#s/.*?<\?lsmb if (.+?) \?>/$1/;
#What I have tried:
#s/.*?<\?lsmb if(?:\s)*?(\S+?)(?:\s)*?\?>\b/$1/;
s/.*?<\?lsmb if\s*?(\S*)\s*?\?>/$1/;
#s/.*?<\?lsmb if\s*?([A-Za-z0-9_]*?)\s*?\?>/$1/;
#s/.*?<\?lsmb if[\s]*?(\S*?)[\s]*?\?>/$1/;
#s/.*?<\?lsmb if (\S*?) \?>/$1/;
#s/.*?<\?lsmb if (\S+?) \?>/$1/;
#s/.*?<\?lsmb if ([\S]+?)([\s]+?)\?>/$1/;
#s/.*?<\?lsmb if[\s]+([\S]+)[\s]+\?>/$1/;
#s/.*?<\?lsmb if\s*?([\S]*?)\s*?\?>/$1/;
#s/.*?<\?lsmb if\s+?([\S]+?)[\s]+?\?>/$1/;
#s/.*?<\?lsmb if ([\S]+?) \?>/$1/;
#s/.*?<\?lsmb if\s*?([\S_]*?)\s*?\?>/$1/;
#s/.*?<\?lsmb if\s*?([[a-zA-Z]|[\d]|[_]]*?)\s*?\?>/$1/;
#s/.*?<\?lsmb if\s*?([a-zA-Z\d_]*?)\s*?\?>/$1/;
#s/.*?<\?lsmb if\s*?([^[:space:]]+?)\s*?\?>/$1/;
$t++;
print "Test $t ";
#look up the result as the hash key
my $ok = $self->{$_};
if ($ok) {
$count++;
print "OK passed $ok,";
}
print qq|Test Value : '$_' \n|;
}
}
Here are some of the Tests and what they should return:
Test1 = <?lsmb if customerfax ?> caacaacac\n should return customerfax
Test2 = <?lsmb if _____ ?> bbb\n should return _____
Test8 = avcscc 34534534 <?lsmb if letter_reason_4 ?> 0xffff\n
should return letter_reason_4
If my understanding of your requirements is right, the needed phrase is extracted by simple
my ($match) = $string =~ /<\?lsmb \s+ if \s+ (\w+)/x
In the list context the match operator m// returns a list with matches. Even if it's just one, we need the list context – in the scalar context its behavior is different. The list context comes from assigning to a list from it, my (...) =. The /x modifier merely allows us to use spaces inside, for readability. See perlretut for starters.
What may precede <? doesn't have to be specified, since the pattern matches anywhere in the string. The \w is for [A-Za-z0-9_] (see perlrecharclass), what seems to match your examples and description. The \S is more permissive. Nothing is needed after \w+.
Also, there is no need to first test whether the pattern is there
sub strip_markup
{
my ($test_res) = $_[0] =~ /<\?lsmb if (\w+)/;
if ($test_res) {
# ...
}
return $test_res; # return something!
}
There is no reason for the substitution so we use a match.
I understand that you are working with code you can't change, but would still like to comment
No need to remove the newline here. But when you do that, use chomp and not chop
The sub uses global variables. That can lead to bugs. Declare in small scope. Pass
The sub modifies global variables. That often leads to bugs while there is rarely need for it
Use arrays for repetitions of the same thing
This can be organized differently, to separate work more clearly
For example
my #tests = (
qq|<?lsmb if customerfax ?> caacaacac\n|,
# ...
);
my ($cnt, $t);
foreach my $test (#tests)
{
my $test_res = strip_markup($test);
if (defined $test_res) {
$t++;
print "Test $t ";
#look up the result as the hash key
my $ok = $self->{$test_res};
if ($ok) {
$count++;
print "OK passed $ok,";
}
print qq|Test Value : '$_' \n|;
}
else { } # report failure
}
sub strip_markup {
my ($test_res) = $_[0] =~ /<\?lsmb \s+ if \s+ (\w+)/x;
return $test_res;
}
The defined test of $test_res is to allow for falsey things (like 0 or '') to be valid results.
The reporting code can, and should be, in another subroutine.

Perl functions: Get "name" of incoming parameters

I'm trying to wrap my head around perl. It's different enough from what I'm used too (.Net mainly, but C/C++/php/javascript) that some things just aren't "grepping" for me.
I'm working through a book that has some exercises. One of the exercises involves practice coercing scalars into Boolean - Undef and Zeros = false, others = true.
Taking the question, some previous code examples and a desire to pull functions out of repeated code... I'm sitting with this function:
# Which of the following evaluates to true?
use strict;
use warnings;
use diagnostics;
sub check1 {
my #args = #_;
if ($args[0]) {
return "true";
} else {
return "false";
}
}
sub check2 {
my #args = #_;
if ($args[0]) {
print "$args[1] = true\n";
} else {
print "$args[1] = false\n";
}
}
# false (zero or undef)
my $def1 = undef; check2($def1, 'def1');
my $def3 = 0.0; check2($def3, 'def3');
my $def5 = 0; check2($def5, 'def5');
# true (not zero and not undef)
my $def2 = ' '; check2($def2, 'def2');
my $def4 = '0.0'; check2($def4, 'def4');
my $def6 = 'false';check2($def6, 'def6');
my $def7 = 1/0; check2($def7, 'def7');
#as suggested by TheSuitIsBlackNot
my $foo = 0.0; print '$foo = ', check1($foo), "\n";
My question is: Is there a way to remove the duplicate parameter? such that the check function can pull the name of the parameter instead of saying $def1, "def1"?
The solution isn't to get the name of the variable; the solution is to eliminate the variable.
sub check {
my ($name, $value, $expected) = #_;
my $got = $value ? 'true' : 'false';
print("$name is $got. Expected $expected.\n");
}
my #tests = (
[ 'def1', undef, 'false' ],
[ 'def3', 0.0, 'false' ],
[ 'def5', 0, 'false' ],
[ 'def2', ' ', 'true' ],
[ 'def4', '0.0', 'true' ],
[ 'def6', 'false', 'true' ],
);
check(#$_) for #tests;

Using a scalar as a condition in perl

First timer...so let me know if there is anything that I have not paid attention to whilst posing a question.
The question is how to use a scalar as a condition, as the code below does not work.
my #parameter=('hub');
my %condition;
$condition{'hub'}{'1'}='$degree>=5';
foreach (#parameter) {
if ($condition{$_}{'1'}) {..}
}
I thought that is because the condition is not interpreted correctly, so I also tried the following, which also did not work.
if ("$condition{$parameter}{'1'}") { ..}
Would really appreciate any help. :)
You either want string eval, which evaluates a string as Perl code
if (eval $condition{$_}{'1'}) { ...
or perhaps a more secure approach would be using code references
$condition{'hub'}{'1'} = sub { return $degree>=5 };
if ($condition{$_}{'1'}->()) { ...
In the second example, you are attaching a piece of code to a variable. The $var->() syntax executes the code and evaluates to the return value of the code.
What you are trying to do is to evaluate '$degree>=5' as real code. Rather than trying to evaluate the string as code (which can be done with eval), it's usually safer and often more robust to instead pass a code-reference. You can use a generator subroutine to generate conditional subs on demand, like this:
sub generate_condition {
my ( $test, $bound ) = #_;
return sub { return $test >= $bound; };
}
my %condition;
$condition{'hub'}{'1'} = generate_condition( $degree, 5 );
if( $condition{$parameter}{1}->() ) { ... }
It gets a little more tricky if you want the >= (ie, the relationship itself) to be dynamically created as well. Then you have a couple of choices. One takes you back to stringy eval, with all of its risks (especially if you start letting your user specify the string). The another would be a lookup table within your generate_condition() sub.
generate_condition() returns a subroutine reference that when invoked, will evaluate the condition that was bound in at creation time.
Here's a generalized solution that will accept any of Perl's conditionals and wrap them along with the arguments being tested into a subroutine. The subref can then be invoked to evaluate the conditional:
use strict;
use warnings;
use feature qw/state/;
sub generate_condition {
my ( $test, $relation, $bound ) = #_;
die "Bad relationship\n"
if ! $relation =~ m/^(?:<=?|>=?|==|l[te]|g[te]|cmp)$/;
state $relationships = {
'<' => sub { return $test < $bound },
'<=' => sub { return $test <= $bound },
'==' => sub { return $test == $bound },
'>=' => sub { return $test >= $bound },
'>' => sub { return $test > $bound },
'<=>' => sub { return $test <=> $bound },
'lt' => sub { return $test lt $bound },
'le' => sub { return $test le $bound },
'eq' => sub { return $test eq $bound },
'ge' => sub { return $test ge $bound },
'gt' => sub { return $test gt $bound },
'cmp' => sub { return $test cmp $bound },
};
return $relationships->{$relation};
}
my $true_condition = generate_condition( 10, '>', 5 );
my $false_condition = generate_condition( 'flower', 'eq', 'stamp' );
print '10 is greater than 5: ',
$true_condition->() ? "true\n" : "false\n";
print '"flower" is equal to "stamp": ',
$false_condition->() ? "true\n" : "false\n";
Often when you construct these sorts of things one is interested in leaving one parameter open to bind at call-time rather than at subroutine manufacture-time. Let's say you only want to bind the "$bound" and "$relation" parameters, but leave "$test" open for specification at subroutine call time. You would modify your sub generation like this:
sub generate_condition {
my ( $relation, $bound ) = #_;
die "Bad relationship\n"
if ! $relation =~ m/^(?:<=?|>=?|==|l[te]|g[te]|cmp)$/;
state $relationships = {
'<' => sub { return $_[0] < $bound },
# ......
And then invoke it like this:
my $condition = generate_condition( '<', 5 );
if( $condition->(2) ) {
print "Yes, 2 is less than 5\n";
}
If the goal is to provide late binding of both the lefthand and righthand side in the relational evaluation, this will work:
sub generate_condition {
my $relation = shift;
die "Bad relationship\n"
if ! $relation =~ m/^(?:<=?|>=?|==|l[te]|g[te]|cmp)$/;
state $relationships = {
'<' => sub { return $_[0] < $_[1] },
'<=' => sub { return $_[0] <= $_[1] },
# ...... and so on .....
return $relationship->($relation);
}
my $condition = generate_condition( '<' );
if( $condition->(2,10) ) { print "True.\n"; }
This sort of tool falls into the category of functional programming, and is covered in beautiful detail in Mark Jason Dominus's book Higher Order Perl
What are you expecting? String values are interpreted as true when they are nonempty.
themel#kallisti: ~ $ perl -e 'print "oops\n" if "false" ; '
oops
themel#kallisti: ~ $ perl -e 'print "oops\n" if "" ; '
themel#kallisti: ~ $ perl -e 'print "oops\n" if "\$degree < 5" ;'
oops
If you want to dynamically evaluate code in your conditions, you have to investigate eval. Example:
my #conds=('$foo>42', '$foo>23');
my $foo = 33;
foreach my $cond(#conds) {
print "$cond itself was true\n" if $cond;
print "$cond evaluated to true\n" if eval($cond);
}
prints
$foo>42 itself was true
$foo>23 itself was true
$foo>23 evaluated to true

Recursively printing data structures in Perl

I am currently learning Perl. I have Perl hash that contains references to hashes and arrays. The hashes and arrays may in turn contain references to other hashes/arrays.
I wrote a subroutine to parse the hash recursively and print them with proper indentation. Though the routine works as expected, my instructor was not convinced about the readability and elegance of the below code.
I would really appreciate to get the views of Perl experts here on possible optimization of the below code.
Here is my complete code snippet..
# Array of Arrays
$ref_to_AoA = [
[ "fred", "barney" ],
[ "george", "jane", "elroy" ],
[ "homer", "marge", "bart" ],
];
#Array of Hashes
$ref_to_AoH = [
{
husband => "barney",
wife => "betty",
son => "bamm bamm",
},
{
husband => "george",
wife => "jane",
son => "elroy",
},
];
# Hash of Hashes
$ref_to_HoH = {
flintstones => {
husband => "fred",
pal => "barney",
},
jetsons => {
husband => "george",
wife => "jane",
"his boy" => "elroy", # Key quotes needed.
},
simpsons => {
husband => "homer",
wife => "marge",
kid => "bart",
},
};
# Hash which contains references to arrays and hashes
$finalHash = {
'arrayofArrays' => $ref_to_AoA,
'arrayofHash' => $ref_to_AoH,
'hashofHash' => $ref_to_HoH,
};
$string = str($finalHash);
print "$string\n";
#------------------------------------------------------------------
sub str {
my $hash = shift;
my ($space, $newline, $delimiter) = #_;
$space = "" unless (defined $space);
$newline = "\n\n\n" unless (defined $newline);
$delimiter = "\n--------------------------------------------" unless (defined $delimiter);
my $str = "";
for (sort keys %{$hash}) {
my $value = $hash->{$_};
$str .= "$newline$space$_ == $value$delimiter";
$str .= recurseErrors($value,$space);
}
$str;
}
#------------------------------------------------------------------
sub recurseErrors {
my $str;
my ($value,$space) = #_;
my $ref = ref $value;
if ($ref eq 'ARRAY') {
my $i = 0;
my $isEmpty = 1;
my #array = #$value;
$space .= "\t";
for my $a (#array) {
if (defined $a) {
$isEmpty = 0;
$str .= "\n$space$_\[$i\] :";
$str .= recurseErrors($a,$space);
}
$i++;
}
$str .= "= { }" if ($isEmpty);
} elsif ($ref eq 'HASH') {
$space .= "\t";
for my $k (sort keys %$value) {
if ( ( ref($value->{$k}) eq 'HASH') || (ref $value->{$k} eq 'ARRAY') ) {
my $val = $value->{$k};
$str .= "\n\n$space$k == ";
$str .= "$val";
}
else {
$str .= "\n$space$k == ";
}
$str .= recurseErrors($value->{$k},$space);
}
# we have reached a scalar (leaf)
} elsif ($ref eq '') {
$str .= "$value";
}
$str
}
#------------------------------------------------------------------
Output:
arrayofArrays == ARRAY(0x9d9baf8)
--------------------------------------------
arrayofArrays[0] :
arrayofArrays[0] :fred
arrayofArrays[1] :barney
arrayofArrays[1] :
arrayofArrays[0] :george
arrayofArrays[1] :jane
arrayofArrays[2] :elroy
arrayofArrays[2] :
arrayofArrays[0] :homer
arrayofArrays[1] :marge
arrayofArrays[2] :bart
arrayofHash == ARRAY(0x9d9bba8)
--------------------------------------------
arrayofHash[0] :
husband == barney
son == bamm bamm
wife == betty
arrayofHash[1] :
husband == george
son == elroy
wife == jane
hashofHash == HASH(0x9da45f8)
--------------------------------------------
flintstones == HASH(0x9d9bb48)
husband == fred
pal == barney
jetsons == HASH(0x9d9bbf8)
his boy == elroy
husband == george
wife == jane
simpsons == HASH(0x9d9bc48)
husband == homer
kid == bart
wife == marge
Always use use strict;
To be a good boy, use use warnings as well.
The names you use for subroutines should make it obvious what the subroutine does. "recurseErrors" kind of violates that principle. Yes, it does recurse. But what errors?
On the first line of each subroutine you should declare and initialize any parameters. recurseErrors first declares $str and then declares its parameters.
Don't mix shift and = #_ like you do in str()
You might consider breaking up what is now called recurseErrors into specialized routines for handling arrays and hashes.
There's no need to quote variables like you do on lines 99 and 109.
Apart from that I think your instructor had a bad day that day.
maybe Data::Dumper is what you want:
use Data::Dumper;
$str = Dumper($foo);
print($str);
If you are new to perl, I'd recommend running your code through perl-critic (there is also a script you can install from CPAN, normally I use it as a test so it gets run from the command line whenever I do "make test"). In addition to its output, you might want to break up your functions a bit more. recurseErrors has three cases that could be split into sub functions (or even put into a hash of ref-type to sub-function ref).
If this were a production job, I'd use Data::Dumper, but it sounds like this is homework, so your teacher might not be too pleased.
Here is one simple example why your code is not easily readable:
$delimiter = "\n--------------------------------------------" unless (defined $delimiter);
You could use the defined or operator:
$delimiter //= "\n" . '-' x 44;
If you are worried about earlier Perls:
defined $delimeter or $delimeter = "\n" . '-' x 44;
Conditionals going off the right margin are enough of a turn-off for me not to read the rest of the code.
My guess is that he doesn't like that you
expect a hash in the str function.
call the same function to print arrays as hashes, despite that there appears to be no common function between them.
allow various ways to call str, but it never figures into the final result.
allow configurable space to be passed in to the root function, but have a tab hardcoded in the recursive function.
omit undefined values that actually hold a place in the arrays
Those are issues that I can see, pretty quickly.
You could have separated out the code blocks that dealt with arrays, and hashes.
sub recurse{
...
recurse_A(#_) if $ref eq 'ARRAY';
recurse_H(#_) if $ref eq 'HASH';
...
}
sub recurse_A{ ... }
sub recurse_H{ ... }
I would recommend starting out your subroutines like this, unless you have a real good reason for doing otherwise.
sub example{
my( $one, $two, $three, $optional_four ) = #_;
( If you do it like this then Komodo, at least, will be able to figure out what the arguments are to your subroutine )
There is rarely any reason to put a variable into a string containing only the variable.
"$var" eq $var;
The only time I can think I would ever do that is when I am using an object that has an overloaded "" function, and I want to get the string, without also getting the object.
package My_Class;
use overload
'""' => 'Stringify',
;
sub new{
my( $class, $name ) = #_;
my $self = bless { name => $name }, $class;
return $self;
}
sub Stringify{
my( $self ) = #_;
return $self->{name};
}
my $object = My_Class->new;
my $string = "$object";
I've struggled with this same problem before, and found my way here. I almost used a solution posted here, but found a more suitable one (for me anyway). Read about Depth First Recursion here.
The sub in the above article works perfectly with a reference containing other Hashes, Arrays, or Scalars. It did not print Hash key names, though, so I slightly modified it:
#!/usr/bin/perl
#
# See:
#
# http://perldesignpatterns.com/?DepthFirstRecursion
#
use strict;
use warnings;
my %hash = (
'a' => {
'one' => 1111,
'two' => 222,
},
'b' => [ 'foo', 'bar' ],
'c' => 'test',
'd' => {
'states' => {
'virginia' => 'richmond',
'texas' => 'austin',
},
'planets' => [ 'venus','earth','mars' ],
'constellations' => ['orion','ursa major' ],
'galaxies' => {
'milky way' => 'barred spiral',
'm87' => 'elliptical',
},
},
);
&expand_references2(\%hash);
sub expand_references2 {
my $indenting = -1;
my $inner; $inner = sub {
my $ref = $_[0];
my $key = $_[1];
$indenting++;
if(ref $ref eq 'ARRAY'){
print ' ' x $indenting,'ARRAY:';
printf("%s\n",($key) ? $key : '');
$inner->($_) for #{$ref};
}elsif(ref $ref eq 'HASH'){
print ' ' x $indenting,'HASH:';
printf("%s\n",($key) ? $key : '');
for my $k(sort keys %{$ref}){
$inner->($ref->{$k},$k);
}
}else{
if($key){
print ' ' x $indenting,$key,' => ',$ref,"\n";
}else{
print ' ' x $indenting,$ref,"\n";
}
}
$indenting--;
};
$inner->($_) for #_;
}
#use strict ;
use warnings ;
# use module
use XML::Simple;
use Data::Dumper;
#debug print "START SCRIPT " ;
my $fileToParse = 'C:/Temp/CDIP/scripts/perl/nps_all_workflows.xml' ;
# create object
my $objXml= new XML::Simple;
# read XML file
my $data = $objXml->XMLin("$fileToParse");
# #debug print "\n FirstLevel is " . $objXml->{'POWERMART'} ;
my $level = 1 ;
#
printHashKeyValues ($data ) ;
sub printHashKeyValues
{
$level ++ ;
my $refHash = shift ;
my $parentKey = shift ;
my $parentValue = shift ;
while( my ($key, $value) = each %$refHash)
{
if ( defined ( $key ) )
{
if ( ref ($refHash->{"$key"}) eq 'HASH' )
{
my $newRefHash = $refHash->{"$key"} ;
#debug print " \n The key is a hash " ;
printHashKeyValues ($newRefHash , $key , $value) ;
}
if ( ref ($refHash->{"$key"}) eq 'ARRAY' )
{
#debug print " \n the key is an ARRAY " ;
printArrayValues ( $refHash->{"$key"} ) ;
}
} #eof if ( defined ( $key ))
if ( defined ( $value) )
{
if ( ref ($refHash->{"$value"}) eq 'HASH' )
{
my $newRefHash = $refHash->{"$value"} ;
#debug print " \n The value is a hash " ;
printHashKeyValues ($newRefHash , $key , $value) ;
}
if ( ref ($refHash->{"$value"}) eq 'ARRAY' )
{
#debug print " \n the value is an ARRAY " ;
printArrayValues ( $refHash->{"$value"} ) ;
}
} #eof if defined ( $value )
#debug print "\n key: $key, value: $value.\n";
} #eof while
} #eof sub
sub printArrayValues
{
my $arrRef = shift ;
my #array = #$arrRef;
my $parrentArrayElement = shift ;
#debug print "printArrayValues CALLED " ;
foreach my $arrayElement ( #array )
{
if (defined ( $arrayElement ) )
{
if ( ref ($arrayElement) eq 'HASH' )
{
#debug print " \n The \$arrayElement is a hash FROM THE ARRAY " ;
printHashKeyValues ($arrayElement ) ;
} #eof if
if ( ref ($arrayElement) eq 'ARRAY' )
{
#debug print " \n The \$arrayElement is a ARRAY FROM THE ARRAY " ;
printArrayValues ($arrayElement ) ;
} #eof if
#debug print "\n \$arrayElement is $arrayElement " ;
} #eof if ( defined ( $arrayElement ) )
} #eof foreach
} #eof sub
# #debug print output
##debug print Dumper($data);
1 ;

Parse::RecDescent - getting information from it

I'm working with the Parse::RecDescent parser in Perl, and I seem to have the most terrible time getting information from it. The information readily available online does not seem to have non-trivial examples.
Here is the code:
event_function: object_list ':' event_list ';'
<defer:
{ #item is a special character with Parse::Recdescent.
print Dumper($item{object_list});
$return = $item[1];
}
>
| object_list ':' ';'
<defer:
{
print Dumper($item{object_list});
$return = $item[1];
}
>
Here is the output
PS W:\developers\paulnathan\rd_dir> perl parser.pl testfile
$VAR1 = 4;
$VAR1 = 8;
PS W:\developers\paulnathan\rd_dir>
The input file parses correctly.
stuff, stuff2: pre-operation event = {foo1, foo2};
It should be outputting a hash keyed by "stuff", "stuff2".
Thoughts?
edit:
object_list :
object ',' object_list
<defer:
{
my $retval = ();
$retval = ::merge_hash_refs($item[1], $item[3]);
$return = $retval;
}
>
| object
<defer:
{
#print Dumper($item{object});
$return = $item{object};
}
>
object :
'/' /[a-z0-9_][a-z0-9_]*/ '/' '...'
<defer:
{
$::objects->{$item[2]} = "stuff";
$return = $::objects;
}
>
| /[a-z0-9_][a-z0-9_]*/
<defer:
{
$::objects->{$item[1]} = "stuff";
$return = $::objects;
}
>
edit2:
Merge_hash_refs, just in case. :-)
#takes two hash references.
sub merge_hash_refs {
my($ref1, $ref2) = #_;
my $retref = ();
while( my ($k, $v) = each %$ref1 ) {
$retref->{$k} = $v;
}
while( my ($k, $v) = each %$ref2 ) {
$retref->{$k} = $v;
}
return $retref;
}
If you add a use strict to your script you'll get the fatal error Can't use string ("1") as a HASH ref while "strict refs" in use at [the call to merge_hash_refs]. It appears that the closures created by the <defer> directives are causing the contents of #item to be the ones when the production matched instead of the hashrefs eventually returned by the subrules. Removing the <defer> directives gives me this output:
$VAR1 = {
'stuff2' => 'stuff',
'stuff' => 'stuff'
};
Of course, this has the side effect that $::object is updated by successful object productions even if the higher level rules fail (including backtracking). I'd write it this way:
use strict;
use warnings;
use Parse::RecDescent;
use Data::Dumper;
my $parser = Parse::RecDescent->new(<<'EOT');
event_function: object_list ':' event_list(?) ';'
{
$return = $item[1];
}
object_list : <leftop: object ',' object>
{
$return = { map { %$_ } #{$item[1]} };
}
object :
'/' /[a-z0-9_][a-z0-9_]*/ '/' '...'
{
$return = { $item[2] => 'stuff' };
}
| /[a-z0-9_][a-z0-9_]*/
{
$return = { $item[1] => 'stuff' };
}
# stub, don't know what this should be
event_list : /[^;]+/
EOT
my %object;
while (<DATA>) {
my $x = $parser->event_function($_);
next unless $x;
# merge objects into master list
while (my ($k, $v) = each %$x) {
$object{$k} = $v;
}
}
print Dumper \%object;
__DATA__
stuff, stuff2: pre-operation event = {foo1, foo2};
stuff3, stuff4: ;
The output is:
$VAR1 = {
'stuff2' => 'stuff',
'stuff3' => 'stuff',
'stuff' => 'stuff',
'stuff4' => 'stuff'
};
Probably not an answer to your question, but when you start an each() loop through a hash, if each() had previously been used on the hash it just starts from wherever the iterator was pointing. To be safe, put a void-context keys() (e.g. keys(%$ref1);) before the while loop to reset the iterator. Older versions of Data::Dumper had a cute little bug of leaving the iterator pointing just after the last element sometimes, making the hash appear to be empty to an unsafe while(...each...) loop :)