perl string interpolation when using the package delimiter


Am fighting some legacy perl looking like the following:
sub UNIVERSAL::has_sub_class {
my ($package,$class) = @_;
my $all = all_packages();
print "$package - $class", "\n";
print "$package::$class", "\n";
return exists $all->{"$package::$class"};
}
On two different systems, with two different Perl installations and versions, this code behaves differently: the "$package::$class" construct resolves to the correct package name on one system, but not on the other.
The following different print outputs can be seen when running has_sub_class on the two different systems:
# print output on system 1 (perl v5.8.6):
webmars::parameter=HASH(0xee93d0) - webmars::parameter::date
webmars::parameter::date
# print output on system 2 (perl v5.18.1):
webmars::parameter=HASH(0x251c500) - webmars::parameter::date
webmars::parameter=HASH(0x251c500)::webmars::parameter::date
Have there been any string interpolation changes between perl v5.8.6 and perl v5.18.1 that you know might cause this behaviour? Or should I look somewhere else? I've really tried googling around and reading through the perl change notes, but could not find anything of interest.
With my limited knowledge of perl, I've tried getting the smallest piece of code that could reproduce the problem I am having. I have come up with the following which I hope is relevant:
# system 1 (perl v5.8.6):
$ perl -e 'my %x=(),$x=bless(\%x),$y='bar';print "$x::$y\n";'
bar
# system 2 (perl v5.18.1):
$ perl -e 'my %x=(),$x=bless(\%x),$y='bar';print "$x::$y\n";'
main=HASH(0xec0ce0)::bar
The outputs are different ! Any ideas ?

Shorter demonstration:
($x::, $x) = (1,2); print "$x::$x"
$ perl5.16.3 -e '($x::, $x) = (1,2); print "$x::$x"'
12
$ perl5.18.1 -e '($x::, $x) = (1,2); print "$x::$x"'
2::2
Getting warmer.
$ perl5.16.3 -MO=Concise -e 'print "$x::$x"'
8 <#> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -e:1) v:{ ->3
7 <#> print vK ->8
3 <0> pushmark s ->4
- <1> ex-stringify sK/1 ->7
- <0> ex-pushmark s ->4
6 <2> concat[t3] sK/2 ->7
- <1> ex-rv2sv sK/1 ->5
4 <#> gvsv[*x::] s ->5 <- $x::
- <1> ex-rv2sv sK/1 ->6
5 <#> gvsv[*x] s ->6 <- $x
-e syntax OK
$ perl5.18.1 -MO=Concise -e 'print "$x::$x"'
a <#> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -e:1) v:{ ->3
9 <#> print vK ->a
3 <0> pushmark s ->4
- <1> ex-stringify sK/1 ->9
- <0> ex-pushmark s ->4
8 <2> concat[t4] sKS/2 ->9
6 <2> concat[t2] sK/2 ->7
- <1> ex-rv2sv sK/1 ->5
4 <#> gvsv[*x] s ->5 <- $x
5 <$> const[PV "::"] s ->6 <- "::"
- <1> ex-rv2sv sK/1 ->8
7 <#> gvsv[*x] s ->8 <- $x
-e syntax OK
TL;DR. v5.16 parses "$x::$x" as $x:: . $x. v5.18 as $x . "::" . $x. I don't see any obvious reference to this change in the delta docs, but I'll keep looking.

So, my very quick test confirms the issue - using
perl -Mstrict -we 'my %x=(),$x=bless(\%x),$y="bar";print "$x::$y\n";'
(With your version, I get a bareword warning for 'bar').
The warning I get in 5.8.8 is "use of uninitialized value in concatenation".
The difference seems to be that when I run with perl -MO=Deparse I get:
my ( %x ) = ();
my $x = bless ( \%x );
my $y = 'bar';
print "$x::$y\n";
If I run on 5.20.2 though, I get:
my ( %x ) = ();
my $x = bless ( \%x );
my $y = 'bar';
print "${x}::$y\n";
So yes, there has been a change in how the same code is parsed. But I'm not entirely sure how that helps you, apart from perhaps enlightening you as to what's going on?
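One way to sidestep the parsing difference, sketched here under assumptions, is to delimit each variable name with braces so that neither parser can read $package:: as a variable of its own. The helper name qualified_name is made up for illustration and is not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the original code.
sub qualified_name {
    my ($package, $class) = @_;
    # Braces end each variable name explicitly, so "::" stays literal text.
    return "${package}::${class}";
}

print qualified_name('webmars', 'parameter'), "\n";   # webmars::parameter
```

Note that on the older Perl the string reduced to the value of $class alone, as the output in the question shows, so legacy code that depends on that result would need to use $class by itself rather than the joined name.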

Related

Perl increment operator

$a = 10;
$b = (++$a) + (++$a) + (++$a);
print $b;
I am getting the answer 37.
Can anybody explain how this operation is proceeding and how the result is getting 37.
As per my logic it should be 36:
(++$a) + (++$a) + (++$a)
11 + 12 + 13 = 36
But I am getting the answer 37

Perl is executing this as
( ( $a = $a + 1 ) + ( $a = $a + 1 ) ) + ( $a = $a + 1 )
You have even put each ++$a in parentheses to say that they should happen first, before the additions, although they have higher precedence anyway
This is centred around the fact that the assignment operator = returns its first operand, which allows operations like
(my $x = $y) =~ tr/A-Z/a-z/
If the result of the assignment were simply the value copied from $y to $x then the tr/// would cause a Can't modify a constant item or the equivalent, and it would have no effect on what was stored in either variable
Here is the variable $a, and the execution is as follows
Execute the first increment, returning $a
$a is now 11
Execute the second increment, returning $a again
$a is now 12
Execute the first addition, which adds what was returned by the two increments—both $a
$a is 12, so $a + $a is 24
Execute the third increment, returning $a again
$a is now 13
Execute the second addition, which adds what was returned by the first addition (24) and the third increment ($a)
$a is 13, so 24 + $a is 37
Note that this should not be relied on. It is not documented anywhere except to say that it is undefined, and the behaviour could change with any release of Perl
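If the intended result is 11 + 12 + 13, a well-defined alternative is to give each increment its own statement and add the saved values afterwards. This is only a sketch; the helper name sum_of_three_increments is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: each increment is its own statement, so every
# saved value is a copy taken before the next increment happens.
sub sum_of_three_increments {
    my ($n) = @_;
    my $first  = ++$n;
    my $second = ++$n;
    my $third  = ++$n;
    return $first + $second + $third;
}

print sum_of_three_increments(10), "\n";   # 36
```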

As a complement to mob and Borodin's answer, you can see what's happening clearly if you think about how the operations are interacting with the stack and recognize that preinc returns the variable, not its value.
op | a's value | stack
$a | 10 | $a
++ | 11 | $a
$a | 11 | $a $a
++ | 12 | $a $a
+ | 12 | 24
$a | 12 | 24 $a
++ | 13 | 24 $a
+ | 13 | 37

As it has been noted in comments, changing a variable multiple times within a single statement leads to undefined behavior, as explained in perlop.
So the exact behavior is not specified and may vary between versions and implementations.
As to how it works out, here is one way to see it. Since + is a binary operator, at each operation its left-hand-side operand does get involved when ++ is executed on the other. So at each position $a gets ++ed, and picks up another increment as a LHS operand.
That means that the LHS $a gets incremented additionally (to its ++) once in each + operation. The + operations after the first one must accumulate these, one extra for each extra term. With three terms here that's another +3, once. So there are altogether 7 increments.
Yet another (fourth) term incurs an extra +4, etc
perl -wE'$x=10; $y = ++$x + ++$x + ++$x + ++$x; say $y' # 4*10 + 2+2+3+4
This is interesting to tweak by changing ++$x to $x++ -- the effect depends on position.
Increments in steps
first $a gets incremented (to 11)
in the first addition, as the second $a is incremented (to 11) the first one gets a bump as well being an operand (to 12)
in the second addition, the second $a gets incremented (to 12) as an operand
as the second addition comes up, the third $a is updated and thus picks up increments from both additions, plus its increment (to 13)
The enumeration of $a above refers to their presence at multiple places in the statement.

As @Håkon Hægland pointed out, running this code under B::Concise, which outputs the opcodes that the Perl script generates, is illuminating. Here are two slightly different examples than the one you provided:
$ perl -E 'say $b=$a + ((++$a)+(++$a))'
6
$ perl -E 'say $b=($a+(++$a)) + (++$a)'
4
So what's going on here? Let's look at the opcodes:
$ perl -MO=Concise -E 'say $b=$a+((++$a)+(++$a))'
e <#> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 47 -e:1) v:%,{,469764096 ->3
d <#> say vK ->e
3 <0> pushmark s ->4
c <2> sassign sKS/2 ->d
a <2> add[t6] sK/2 ->b
- <1> ex-rv2sv sK/1 ->5
4 <#> gvsv[*a] s ->5
9 <2> add[t5] sKP/2 ->a
6 <1> preinc sKP/1 ->7
- <1> ex-rv2sv sKRM/1 ->6
5 <#> gvsv[*a] s ->6
8 <1> preinc sKP/1 ->9
- <1> ex-rv2sv sKRM/1 ->8
7 <#> gvsv[*a] s ->8
- <1> ex-rv2sv sKRM*/1 ->c
b <#> gvsv[*b] s ->c
-e syntax OK
There are no conditionals in this program. The leftmost column indicates the order of operations in this program. Wherever you see the ex-rv2sv token, that is where Perl is reading the value of an expression like a global scalar variable.
The preinc operations occur at labels 6 and 8. The add operations occur at labels 9 and a. This tells us that both increments occurred before Perl performed the additions, and so the final expression would be something like 2 + (2 + 2) = 6.
In the other example, the opcodes look like
$ perl -MO=Concise -E 'say $b=($a+(++$a)) + (++$a)'
e <#> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 47 -e:1) v:%,{,469764096 ->3
d <#> say vK ->e
3 <0> pushmark s ->4
c <2> sassign sKS/2 ->d
a <2> add[t6] sK/2 ->b
7 <2> add[t4] sKP/2 ->8
- <1> ex-rv2sv sK/1 ->5
4 <#> gvsv[*a] s ->5
6 <1> preinc sKP/1 ->7
- <1> ex-rv2sv sKRM/1 ->6
5 <#> gvsv[*a] s ->6
9 <1> preinc sKP/1 ->a
- <1> ex-rv2sv sKRM/1 ->9
8 <#> gvsv[*a] s ->9
- <1> ex-rv2sv sKRM*/1 ->c
b <#> gvsv[*b] s ->c
-e syntax OK
Now the preinc operations occur at 6 and 9, but there is an add operation at label 7, after $a has only been incremented one time. This makes the values used in the final expression (1 + 1) + 2 = 4.
So in your example:
$ perl -MO=Concise -E '$a=10;$b=(++$a)+(++$a)+(++$a);say $b'
l <#> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 47 -e:1) v:%,{,469764096 ->3
5 <2> sassign vKS/2 ->6
3 <$> const[IV 10] s ->4
- <1> ex-rv2sv sKRM*/1 ->5
4 <#> gvsv[*a] s ->5
6 <;> nextstate(main 47 -e:1) v:%,{,469764096 ->7
g <2> sassign vKS/2 ->h
e <2> add[t7] sK/2 ->f
b <2> add[t5] sK/2 ->c
8 <1> preinc sKP/1 ->9
- <1> ex-rv2sv sKRM/1 ->8
7 <#> gvsv[*a] s ->8
a <1> preinc sKP/1 ->b
- <1> ex-rv2sv sKRM/1 ->a
9 <#> gvsv[*a] s ->a
d <1> preinc sKP/1 ->e
- <1> ex-rv2sv sKRM/1 ->d
c <#> gvsv[*a] s ->d
- <1> ex-rv2sv sKRM*/1 ->g
f <#> gvsv[*b] s ->g
h <;> nextstate(main 47 -e:1) v:%,{,469764096 ->i
k <#> say vK ->l
i <0> pushmark s ->j
- <1> ex-rv2sv sK/1 ->k
j <#> gvsv[*b] s ->k
-e syntax OK
We see preinc occurring at labels 8, a, and d. The add operations occur at b and e. That is, $a is incremented twice, then two $a's are added together. Then $a is incremented again. Then $a is added to the result. So the output is (12 + 12) + 13 = 37.

Why debugger stops on `use` statement?

When we enclose this code into braces:
#!/usr/bin/env perl
{
use warnings 'void';
1;
}
The debugger stops on use warnings 'void' statement:
main::(/home/kes/tmp/t3.pl:4): use warnings 'void';
DB<1> l 1-20
1 #!/usr/bin/env perl
2
3 {
4==> use warnings 'void';
5: 1;
6 }
7
But if we do not:
#!/usr/bin/env perl
use warnings 'void';
1;
The debugger does not stop on use warnings 'void' statement:
main::(/home/kes/tmp/t3.pl:5): 1;
DB<1> l 1-20
1 #!/usr/bin/env perl
2
3
4: use warnings 'void';
5==> 1;
6
7
But, as we can see, line 4 is still marked as breakable.
What are the differences between these examples,
and why does the debugger not stop on line 4?

The use statement is never added to the compiled program because it is executed at compile-time.
$ perl -MO=Concise,-exec -e'use warnings qw( void ); f()'
1 <0> enter
2 <;> nextstate(main 2 -e:1) v:{
3 <0> pushmark s
4 <#> gv[*f] s/EARLYCV
5 <1> entersub[t2] vKS/TARG
6 <#> leave[1 ref] vKP/REFC
-e syntax OK
As such, the debugger is never actually stopping on the use statement. Let's look at the compiled form of both programs:
$ perl -MO=Concise
{
use warnings qw( void );
1;
}
^D
6 <#> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -:2) v:{ ->3
5 <2> leaveloop vK/2 ->6
3 <{> enterloop(next->5 last->5 redo->4) v ->4
- <#> lineseq vKP ->5
4 <;> nextstate(main 3 -:3) v:{ ->5
- <0> ex-const v ->-
- syntax OK
$ perl -MO=Concise
use warnings qw( void );
1;
^D
3 <#> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 2 -:2) v:{ ->3
- <0> ex-const v ->3
- syntax OK
nextstate ops set the line number for run-time warnings. (For example, -:2 means line 2 of the code read from STDIN.) The debugger is also using these to know where to break and to find the line in the source file of the current statement.
The second snippet has only one run-time statement, so it has a single nextstate op at which the debugger stops.
The first snippet, however, has two statements. A bare loop ({ ... }), and a constant. The bare loop creates a nextstate op with the line of the first non-whitespace character that follows the {, which happens to be the use warnings;. I'm not sure why it does this.
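The compile-time nature of use can be seen without the debugger. A use statement behaves like a BEGIN block that loads and imports the module, so the small sketch below uses a plain BEGIN block as a stand-in; the variable name @order is an assumption chosen for illustration.

```perl
use strict;
use warnings;

our @order;
push @order, 'run time';               # executed when the program runs
BEGIN { push @order, 'compile time' }  # executed while the file is compiled

print join(', ', @order), "\n";   # compile time, run time
```

Although the BEGIN block appears later in the source, its entry lands first, because it has already run by the time the first run-time statement executes.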

Find the C APIs used for perl script

Trying to understand the C code that's behind a perl script. For example, the following contrived code:
$name = "john";
$greeting = "hi $name, how old are you?";
if ($greeting =~ /hi (\S+)/) {
$b = $1;
print "got $b as expected\n";
}
Would like to know how the variable $name is substituted in $greeting string, also would like to know what c API is used for the regular expression match.
I heard something like perl -MO=Bytecode,-H test.pl where test.pl has the above content, but the output is binary.

There isn't a direct mapping of Perl code to C code. Instead, Perl is a bytecode compiler. What you can get is the bytecode, the tree of opcodes. There are several modules to get this in a human readable form, one is B::Concise.
perl -MO=Concise test.pl
Produces this...
w <#> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 test.plx:1) v:{ ->3
5 <2> sassign vKS/2 ->6
3 <$> const[PV "john"] s ->4
- <1> ex-rv2sv sKRM*/1 ->5
4 <#> gvsv[*name] s ->5
6 <;> nextstate(main 1 test.plx:2) v:{ ->7
d <2> sassign vKS/2 ->e
- <1> ex-stringify sK/1 ->c
- <0> ex-pushmark s ->7
b <2> concat[t5] sKS/2 ->c
9 <2> concat[t4] sK/2 ->a
7 <$> const[PV "hi "] s ->8
- <1> ex-rv2sv sK/1 ->9
8 <#> gvsv[*name] s ->9
a <$> const[PV ", how old are you?"] s ->b
- <1> ex-rv2sv sKRM*/1 ->d
c <#> gvsv[*greeting] s ->d
The documentation for B::Concise explains all this. This tells you the operator sequence, type, name, flags, and the next op in the sequence. For example...
7 <$> const[PV "hi "] s ->8
This is operator 7, it is an SVOP (it applies to scalars), its name is "const" and it's for the scalar string (PV) "hi ", it's in scalar context, and the next operator is 8.
More about operators can be learned from perlguts and the Illustrated Perl Guts and by poking around in the Perl source code. Each operator has a C function associated with it called pp_OPNAME so to find the "const" operator look for pp_const.
The Perl regular expression engine is completely custom and has its own perlreguts documentation.
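The two concat ops in the Concise output above suggest that the interpolated string is built the same way as an explicit chain of the . operator. A minimal sketch of that equivalence follows; the sub names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helpers: the same greeting written two ways.
sub greet_interpolated {
    my ($name) = @_;
    return "hi $name, how old are you?";
}

sub greet_concatenated {
    my ($name) = @_;
    return "hi " . $name . ", how old are you?";
}

my $same = greet_interpolated("john") eq greet_concatenated("john");
print $same ? "same string\n" : "different strings\n";   # same string
```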

Perl concatenate operator vs. append operator

First, here is the example straight from Learning Perl (p.29)
# Append a space to $str
$str = $str . " ";
# The same thing with an assignment operator
$str .= " ";
Is either of these methods more "correct" or preferred for speed or syntactical reasons?

Looking at the Concise output for each option:
perl -MO=Concise,-exec -e 'my $str = "a"; $str = $str . " ";'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <$> const[PV "a"] s
4 <0> padsv[$str:1,2] sRM*/LVINTRO
5 <2> sassign vKS/2
6 <;> nextstate(main 2 -e:1) v:{
7 <0> padsv[$str:1,2] s
8 <$> const[PV " "] s
9 <2> concat[$str:1,2] sK/TARGMY,2
a <#> leave[1 ref] vKP/REFC
-e syntax OK
perl -MO=Concise,-exec -e 'my $str = "a"; $str .= " ";'
1 <0> enter
2 <;> nextstate(main 1 -e:1) v:{
3 <$> const[PV "a"] s
4 <0> padsv[$str:1,2] sRM*/LVINTRO
5 <2> sassign vKS/2
6 <;> nextstate(main 2 -e:1) v:{
7 <0> padsv[$str:1,2] sRM
8 <$> const[PV " "] s
9 <2> concat[t2] vKS/2
a <#> leave[1 ref] vKP/REFC
-e syntax OK
While they are slightly different (.= does concatenation in a void context, and the other in scalar) the main reason to choose one or the other is style/maintainability. I prefer to write:
$str .= " ";
Mainly for ease of typing and because it's obvious you're appending to the end of string without having to check the variable on the RHS is the same as on the LHS.
Essentially: Use whichever you prefer!
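If speed is a concern, it can be measured rather than guessed. The sketch below uses the core Benchmark module to compare the two forms; the sub names and the labels concat and assign are made up for illustration, and the figures it prints will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical helpers used only to check that both forms agree.
sub append_with_concat { my ($str) = @_; $str = $str . " "; return $str }
sub append_with_assign { my ($str) = @_; $str .= " ";       return $str }

print "identical results\n"
    if append_with_concat("a") eq append_with_assign("a");

# Run each form for at least one CPU second and print a comparison table.
cmpthese(-1, {
    concat => sub { my $str = "a"; $str = $str . " "; },
    assign => sub { my $str = "a"; $str .= " "; },
});
```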

Is there any difference between &$func($arg) and $func->($arg)?

While trying to understand closures, reading thru perl-faq and coderef in perlref found those examples:
sub add_function_generator {
return sub { shift() + shift() };
}
my $add_sub = add_function_generator();
my $sum = $add_sub->(4,5);
and
sub newprint {
my $x = shift;
return sub { my $y = shift; print "$x, $y!\n"; };
}
$h = newprint("Howdy");
&$h("world");
Here are two forms of calling a function stored in a variable.
&$func($arg)
$func->($arg)
Are those totally equivalent (only syntactically different) or are there some differences?

There is no difference. Proof: the opcodes generated by each version:
$ perl -MO=Concise -e'my $func; $func->()'
8 <#> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -e:1) v:{ ->3
3 <0> padsv[$func:1,2] vM/LVINTRO ->4
4 <;> nextstate(main 2 -e:1) v:{ ->5
7 <1> entersub[t2] vKS/TARG ->8
- <1> ex-list K ->7
5 <0> pushmark s ->6
- <1> ex-rv2cv K ->-
6 <0> padsv[$func:1,2] s ->7
$ perl -MO=Concise -e'my $func; &$func()'
8 <#> leave[1 ref] vKP/REFC ->(end)
1 <0> enter ->2
2 <;> nextstate(main 1 -e:1) v:{ ->3
3 <0> padsv[$func:1,2] vM/LVINTRO ->4
4 <;> nextstate(main 2 -e:1) v:{ ->5
7 <1> entersub[t2] vKS/TARG ->8
- <1> ex-list K ->7
5 <0> pushmark s ->6
- <1> ex-rv2cv sKPRMS/4 ->-
6 <0> padsv[$func:1,2] s ->7
… wait, there are actually slight differences in the flags for - <1> ex-rv2cv sKPRMS/4 ->-. Anyways, they don't seem to matter, and both forms behave the same.
But I would recommend to use the form $func->(): I perceive this syntax as more elegant, and you can't accidentally forget to use parens (&$func works but makes the current @_ visible to the function, which is not what you'd usually want).
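The pitfall with the paren-less form can be shown in a few lines. In this sketch the names $count_args and call_all_forms are hypothetical; the code ref simply reports how many arguments it received.

```perl
use strict;
use warnings;

my $count_args = sub { return scalar @_ };

# Hypothetical wrapper used only to give the calls a non-empty @_.
sub call_all_forms {
    my $arrow      = $count_args->();   # explicit empty argument list
    my $amp_parens = &$count_args();    # explicit empty argument list
    my $amp_bare   = &$count_args;      # no parens: callee sees our @_
    return ($arrow, $amp_parens, $amp_bare);
}

print join(',', call_all_forms(1, 2, 3)), "\n";   # 0,0,3
```

The first two calls pass nothing, while the last one lets the code ref see the three arguments that were given to call_all_forms.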
