Why aren't variables interpolated in `constant` declarations?

use v6.d;
my Str $foo = 'Hello';
my constant $BAR = "--$foo--";
say $BAR;
OUTPUT:
Use of uninitialized value element of type Str in string context.
Methods .^name, .raku, .gist, or .say can be used to stringify it to something meaningful.
in block at deleteme.raku line 4
----
Expected OUTPUT:
--Hello--
The same thing happens without the my or with our in place of my.
[187] > $*DISTRO
macos (12.6)
[188] > $*KERNEL
darwin
[189] > $*RAKU
Raku (6.d)

The value assigned to a constant is evaluated at compile time, not runtime. This means that constant values can be calculated as part of the compilation and cached.
Regular assignments happen at runtime. Thus, in:
my Str $foo = 'Hello';
my constant $BAR = "--$foo--";
The assignment to $foo has not yet taken place when "--$foo--" is evaluated. By contrast, if $foo were itself a constant, its value would be available at compile time and would be interpolated, so:
my constant $foo = 'Hello';
my constant $BAR = "--$foo--";
say $BAR;
Produces:
--Hello--

Related

Can someone explain why Perl behaves this way (variable scoping)?

My test goes like this:
use strict;
use warnings;
func();
my $string = 'string';
func();
sub func {
print $string, "\n";
}
And the result is:
Use of uninitialized value $string in print at test.pl line 10.
string
Perl allows us to call a function before it has been defined. However when the function uses a variable declared only after the function call, the variable appears to be undefined. Is this behavior documented somewhere? Thank you!
The behaviour of my is documented in perlsub - it boils down to this - perl knows $string is in scope - because the my tells it so.
The my operator declares the listed variables to be lexically confined to the enclosing block, conditional (if/unless/elsif/else), loop (for/foreach/while/until/continue), subroutine, eval, or do/require/use'd file.
It means it's 'in scope' from the point at which it's first 'seen' until the closing bracket of the current 'block'. (Or in your example - the end of the code)
However - in your example my also assigns a value.
This scoping process happens at compile time, when perl checks whether it's valid to use $string or not (thanks to strict). However, it can't know what the value will be, because that might change during code execution (and is non-trivial to analyze).
So if you do this it might be a little clearer what's going on:
#!/usr/bin/env perl
use strict;
use warnings;
my $string; #undefined
func();
$string = 'string';
func();
sub func {
print $string, "\n";
}
$string is in scope in both cases - because the my happened at compile time - before the subroutine has been called - but it doesn't have a value set beyond the default of undef prior to the first invocation.
Note this contrasts with:
#!/usr/bin/env perl
use strict;
use warnings;
sub func {
print $string, "\n";
}
my $string; #undefined
func();
$string = 'string';
func();
Which errors because when the sub is declared, $string isn't in scope.
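The split between compile-time declaration and run-time assignment can be sketched as follows. The names @seen and $early are illustrative and not part of the question, and the BEGIN block is only there to show one way of making an assignment happen during compilation.

```perl
use strict;
use warnings;

my @seen;                  # illustrative: records what func() observed

func();                    # the name $string is known, but nothing is assigned yet
my $string = 'string';     # the assignment itself runs here, at run time
func();

# Assumption for illustration: a BEGIN block runs while the file is still
# being compiled, so $early already holds its value when run time starts.
my $early;
BEGIN { $early = 'compile time' }

sub func {
    push @seen, defined $string ? $string : '(undef)';
}

print join(', ', @seen, $early), "\n";   # (undef), string, compile time
```

The first call records '(undef)' because only the declaration has taken effect at that point; the second call sees the assigned value.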
First of all, I would consider this undefined behaviour since it skips executing my like my $x if $cond; does.
That said, the behaviour is currently consistent and predictable. And in this instance, it behaves exactly as expected if the optimization that warranted the undefined behaviour notice didn't exist.
At compile-time, my has the effect of declaring and allocating the variable[1]. Scalars are initialized to undef when created. Arrays and hashes are created empty.
my $string was encountered by the compiler, so the variable was created. But since you haven't executed the assignment yet, it still has its default value (undefined) during the first call to func.
This model allows variables to be captured by closures.
Example 1:
{
my $x = "abc";
sub foo { $x } # Named subs capture at compile-time.
}
say foo(); # abc, even though $x fell out of scope before foo was called.
Example 2:
sub make_closure {
my ($x) = @_;
return sub { $x }; # Anon subs capture at run-time.
}
my $foo = make_closure("foo");
my $bar = make_closure("bar");
say $foo->(); # foo
say $bar->(); # bar
[1] The allocation is possibly deferred until the variable is actually used.

Perl: what happens when we assign a constant value to a reference variable instead of address of a variable?

Example:
#! /usr/bin/perl -w
$$var=10;
print "Variable is containing $$var and is of type ". ref($$var)."\n";
Output:
$perl testRefVar.pl
Variable is containing 10 and is of type
Here the variable $$var is not referring to any other variable, but a constant. So is there any significance of prefixing a variable with '$$'?
This is the same as setting any other variable to 10, except that the access is indirect.
perl will autovivify an anonymous scalar variable and assign the reference to $var so that this action can be fulfilled. If you print out $var you will get something like SCALAR(0x3ecec4).
To do this explicitly you could write
my $temp;
my $var = \$temp;
$$var = 10;
which would assign 10 to the variable $temp.
If, instead, you set $var to a reference to a constant, like this
my $var = \99;
$$var = 10;
you would get the error message
Modification of a read-only value attempted
because in this case $var refers to a constant instead of a variable.
Here, $var is a reference to a scalar, and $$var is the scalar itself. Unless your code specifically wants references, the indirection (requiring the use of two dollar signs when you want the actual scalar) is at the very least a cumbersome hassle.
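A minimal sketch of the points above, assuming strict and warnings are enabled (the question's script had neither). The names $outer, $inner, $const and $error are made up for illustration. It also shows why the question printed an empty type: ref was applied to $$var (the value 10) rather than to $var.

```perl
use strict;
use warnings;

my $var;                        # starts out undefined
$$var = 10;                     # autovivifies an anonymous scalar for $var to point at

my $outer = ref $var;           # 'SCALAR': $var is a reference
my $inner = ref $$var;          # '': the referenced value 10 is not a reference

my $const = \99;                # reference to a read-only literal
my $ok    = eval { $$const = 10; 1 };
my $error = $ok ? '' : $@;      # holds the "read-only value" message

print "outer=$outer inner='$inner' value=$$var\n";
print "error: $error";
```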

Symbolic reference to sub in a package, trying to understand

use strict;
use warnings;
package Foo::Bar;
sub baz { print "$_[0]\n" }
package main;
{ # test 1
my $quux = "Foo::Bar::baz";
no strict 'refs';
&$quux(1);
}
{ # test 2
my $qux = 'Foo::Bar';
my $quux = "$qux\::baz";
no strict 'refs';
&$quux(2);
}
{ # test 3
my $qux = 'Foo::Bar';
my $quux = "$qux::baz";
no strict 'refs';
&$quux(3);
}
Output:
Name "qux::baz" used only once: possible typo at test31.pl line 21.
1
2
Use of uninitialized value $qux::baz in string at test31.pl line 21.
Undefined subroutine &main:: called at test31.pl line 23.
Why does test 2 work, and why does the backslash have to be placed exactly there?
Why does test 3 not work? Is it so far, in syntax, from test 1?
I tried to write that string as "{$qux}::baz", but that doesn't work either.
I came across this looking at source of Image::Info distribution.
$qux::baz refers to scalar baz in package qux.
"$qux::baz" is the stringification of that scalar.
"$qux::baz" is another way of writing $qux::baz."".
Curlies can be used to indicate where the variable ends.
"$foo bar" means $foo." bar"
"${f}oo bar" means $f."oo bar"
As such,
"${qux}::baz" is another way of writing $qux."::baz".
"$qux\::baz" is a cute way of doing $qux."::baz" since \ can't appear in variable names.
A variable name can either be simple like $foo or a fully qualified name (in the case of package variables). Such a fully qualified name looks like $Foo::bar. This is the “global” variable $bar in the package Foo.
If an interpolated variable is to be followed by a double colon :: and the variable should not be interpreted as a fully qualified package variable name, then you can either:
Use string concatenation: $qux . "::baz"
Terminate the variable name with a backslash: "$qux\::baz"
Surround the variable name (not the sigil) with curly braces, as if the name were a symbolic reference: "${qux}::baz".
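The three spellings listed above can be checked side by side. This is only a sketch: baz is changed to return a string so the result can be inspected, and the can() lookup at the end is an alternative that does not appear in the question, shown because it avoids symbolic references entirely.

```perl
use strict;
use warnings;

package Foo::Bar;
sub baz { return "baz($_[0])" }

package main;

my $qux = 'Foo::Bar';

# Three spellings that all build the string 'Foo::Bar::baz'.
my @names = ( $qux . '::baz', "$qux\::baz", "${qux}::baz" );

# Alternative not used in the question: can() returns a code reference,
# so no symbolic reference and no "no strict 'refs'" is needed.
my $code   = $qux->can('baz');
my $result = $code->(4);

print "$_\n" for @names, $result;
```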

Why does this Perl variable keep its value

What is the difference between the following two Perl variable declarations?
my $foo = 'bar' if 0;
my $baz;
$baz = 'qux' if 0;
The difference is significant when these appear at the top of a loop. For example:
use warnings;
use strict;
foreach my $n (0,1){
my $foo = 'bar' if 0;
print defined $foo ? "defined\n" : "undefined\n";
$foo = 'bar';
print defined $foo ? "defined\n" : "undefined\n";
}
print "==\n";
foreach my $m (0,1){
my $baz;
$baz = 'qux' if 0;
print defined $baz ? "defined\n" : "undefined\n";
$baz = 'qux';
print defined $baz ? "defined\n" : "undefined\n";
}
results in
undefined
defined
defined
defined
==
undefined
defined
undefined
defined
It seems that if 0 fails, so foo is never reinitialized to undef. In this case, how does it get declared in the first place?
First, note that my $foo = 'bar' if 0; is documented to be undefined behaviour, meaning it's allowed to do anything including crash. But I'll explain what happens anyway.
my $x has three documented effects:
It declares a symbol at compile-time.
It creates a new variable on execution.
It returns the new variable on execution.
In short, it's supposed to be like Java's Scalar x = new Scalar();, except it returns the variable if used in an expression.
But if it actually worked that way, the following would create 100 variables:
for (1..100) {
my $x = rand();
print "$x\n";
}
This would mean two or three memory allocations per loop iteration for the my alone! A very expensive prospect. Instead, Perl only creates one variable and clears it at the end of the scope. So in reality, my $x actually does the following:
It declares a symbol at compile-time.
It creates the variable at compile-time[1].
It puts a directive on the stack that will clear[2] the variable when the scope is exited.
It returns the new variable on execution.
As such, only one variable is ever created[2]. This is much more CPU-efficient than creating one every time the scope is entered.
Now consider what happens if you execute a my conditionally, or never at all. By doing so, you are preventing it from placing the directive to clear the variable on the stack, so the variable never loses its value. Obviously, that's not meant to happen, so that's why my ... if ...; isn't allowed.
Some take advantage of the implementation as follows:
sub foo {
my $state if 0;
$state = 5 if !defined($state);
print "$state\n";
++$state;
}
foo(); # 5
foo(); # 6
foo(); # 7
But doing so requires ignoring the documentation forbidding it. The above can be achieved safely using
{
my $state = 5;
sub foo {
print "$state\n";
++$state;
}
}
or
use feature qw( state ); # Or: use 5.010;
sub foo {
state $state = 5;
print "$state\n";
++$state;
}
Notes:
[1] "Variable" can mean a couple of things. I'm not sure which definition is accurate here, but it doesn't matter.
[2] If anything but the sub itself holds a reference to the variable (REFCNT>1) or if the variable contains an object, the directive replaces the variable with a new one (on scope exit) instead of clearing the existing one. This allows the following to work as it should:
my @a;
for (...) {
my $x = ...;
push @a, \$x;
}
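A runnable version of that pattern, with made-up loop bounds and values filled in purely for illustration (@refs, @values and %distinct are not from the answer). Because each reference keeps its $x alive, every iteration ends up with a distinct variable.

```perl
use strict;
use warnings;

my @refs;
for my $i (1 .. 3) {
    my $x = $i * 10;
    push @refs, \$x;          # the reference keeps this iteration's $x alive
}

my @values   = map { $$_ } @refs;
my %distinct = map { ($_ + 0) => 1 } @refs;   # a reference numifies to its address

print "@values\n";                    # 10 20 30
print scalar(keys %distinct), "\n";   # 3
```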
See ikegami's better answer, probably above.
In the first example, you never define $foo inside the loop because of the conditional, so when you use it, you're referencing and then assigning a value to an implicitly declared global variable. Then, the second time through the loop that outside variable is already defined.
In the second example, $baz is defined inside the block each time the block is executed. So the second time through the loop it is a new, not yet defined, local variable.

Query regarding code in List::Util::reduce

I came across the following code in List::Util for reduce subroutine.
my $caller = caller;
local(*{$caller."::a"}) = \my $a;
local(*{$caller."::b"}) = \my $b;
I could understand that reduce function is called as:
my $sum = reduce { $a + $b } 1 .. 1000;
So, I understood the code is trying to reference $a mentioned in the subroutine. But, I am unable to understand the intent correctly.
For reference, I am adding the complete code for subroutine
sub reduce (&@) {
my $code = shift;
require Scalar::Util;
my $type = Scalar::Util::reftype($code);
unless($type and $type eq 'CODE') {
require Carp;
Carp::croak("Not a subroutine reference");
}
no strict 'refs';
return shift unless @_ > 1;
use vars qw($a $b);
my $caller = caller;
local(*{$caller."::a"}) = \my $a;
local(*{$caller."::b"}) = \my $b;
$a = shift;
foreach (@_) {
$b = $_;
$a = &{$code}();
}
$a;
}
The following aliases package variable $foo to variable $bar.
*foo = \$bar;
Any change to one changes the other as both names refer to the same scalar.
$ perl -E'
*foo = \$bar;
$bar=123; say $foo;
$foo=456; say $bar;
say \$foo == \$bar ? 1 : 0;
'
123
456
1
Of course, you can fully qualify *foo since it's a symbol table entry. The following aliases package variable $main::foo to $bar.
*main::foo = \$bar;
Or, if you don't know the name at compile time
my $caller = 'main';
*{$caller."::foo"} = \$bar; # Symbolic reference
$bar, of course, can just as easily be a lexical variable as a package variable. And since my $bar; actually returns the variable being declared,
my $bar;
*foo = \$bar;
can be written as
*foo = \my $bar;
So,
my $caller = caller;
local(*{$caller."::a"}) = \my $a;
local(*{$caller."::b"}) = \my $b;
declares lexical variables $a and $b and aliases the similarly named package variables in the caller's namespace to them.
local simply causes everything to return to its original state once the sub is exited.
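The effect of combining local with a glob assignment can be sketched like this. The sub name peek_through_alias and the variable values are invented for the example; the glob is named directly instead of through $caller to keep the sketch short.

```perl
use strict;
use warnings;

our $foo = 'package value';

sub peek_through_alias {
    my $bar = 'lexical value';
    local *foo = \$bar;        # until this sub returns, $main::foo is $bar
    $bar = 'changed via $bar';
    my $seen = $main::foo;     # reads the aliased scalar
    return $seen;
}

my $inside = peek_through_alias();
my $after  = $foo;             # alias undone: the original scalar is back

print "$inside\n$after\n";
```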
On scope
Perl has two variable name scoping mechanisms: global and lexical. Lexical variables are declared with my, and they are accessible by that name until the closing curly brace of the enclosing block.
Global variables, on the other hand, are accessible from anywhere and do not have a scope. They can be declared with our and use vars, or do not have to be declared if strict is not in effect. However, they have namespaces, or packages. The namespace is a prefix separated from the variable name by two colons (or a single quote, but never do that). Inside the package of the variable, the variable can be accessed with or without the prefix. Outside of the package, the prefix is required.
The local function is somewhat special and gives global variables a temporary value. The scope of this value is the same as that of a lexical variable plus the scopes of all subs called within this scope. The old value is restored once this scope is exited. This is called the dynamic scope.
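A small sketch of dynamic scope, using invented names ($level, show, with_local): the temporary value is visible in subs called from within the scope, and the old value returns afterwards.

```perl
use strict;
use warnings;

our $level = 'outer';

sub show { return $level }         # reads whatever value is current

sub with_local {
    local $level = 'inner';        # temporary value for this dynamic scope
    return show();                 # show() is called inside that scope
}

my @results = ( show(), with_local(), show() );
print "@results\n";                # outer inner outer
```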
On Globs
Perl organizes global variables in a big hash representing the namespace and all variable names (sometimes called the stash). In each slot of this hash, there is a so-called glob. A typeglob is a special hash that has a field for each of Perl's native types, e.g. scalar, array, hash, IO, format, code etc. You assign to a slot by passing the glob a reference to the value you want to add; the glob figures out the right slot on its own. This is also the reason you can have multiple variables with the same name (like $thing, @thing, %thing, thing()). Typeglobs have a special sigil, namely the asterisk *.
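To make the slots concrete, here is a sketch that reads each slot of one glob back with the *name{SLOT} syntax. The package variables named thing and their values are made up for the example.

```perl
use strict;
use warnings;

our $thing = 'scalar';
our @thing = (1, 2, 3);
our %thing = (key => 'value');
sub thing { return 'code' }

# *thing{SLOT} returns a reference to what is stored in that slot.
my $scalar = ${ *thing{SCALAR} };
my $count  = @{ *thing{ARRAY} };       # array in scalar context: element count
my $value  = ${ *thing{HASH} }{key};
my $called = &{ *thing{CODE} }();

print "$scalar $count $value $called\n";   # scalar 3 value code
```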
On no strict 'refs'
The no strict 'refs' is a cool thing if you know what you are doing. Normally you can only dereference normal references, e.g.
my @array = (1 .. 5);
my $arrayref = \@array; # is a reference
push @{$arrayref}, 6; # works
push @{array}, 6; # works; barewords are considered o.k.
push @{"array"}, 6; # dies horribly, if strict refs enabled.
The last line tries to dereference a string, which is considered bad practice. However, under no strict 'refs', we can access a variable whose name we do not know at compile time, as we do here.
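A sketch of the difference, with one adjustment worth stating: symbolic references resolve to package variables, so the array here is declared with our rather than my. The names $name, $ok and $error are illustrative.

```perl
use strict;
use warnings;

our @array = (1 .. 5);     # package variable: symbolic references only reach these
my $name = 'array';        # variable name known only at run time

{
    no strict 'refs';
    push @{$name}, 6;      # symbolic reference to @main::array
}

# With strict refs back in effect the same expression dies at run time.
my $ok    = eval { push @{$name}, 7; 1 };
my $error = $ok ? '' : $@;

print "@array\n";          # 1 2 3 4 5 6
print "error: $error";
```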
Conclusion
The caller function returns the name of the package of the calling code, i.e. it looks up one call stack frame. The name is used here to construct the full names of $a and $b variables of the calling packages, so that they can be used there without a prefix. Then, these names are locally (i.e. in the dynamic scope) assigned to the reference of a newly declared, lexical variable.
The global variables $a and $b are predeclared in each package.
In the foreach loop, these lexicals are assigned different values (lexical vars take precedence over global vars), but the global variables $foo::a and $foo::b point to the same data because of the reference, allowing the anonymous callback sub in the reduce call to read the two arguments easily. (See ikegami's answer for details on this.)
All of this hassle is good because (a) the effects are not externally visible, and (b) the callback doesn't have to do tedious argument unpacking.
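Putting the pieces together, a stripped-down sketch of the same technique might look like the following. call_with_pair is a hypothetical helper written for this illustration, not part of List::Util; it handles exactly two values and skips the argument checks of the real code.

```perl
use strict;
use warnings;

# Hypothetical helper, loosely modelled on the reduce code quoted above.
sub call_with_pair {
    my ($code, $first, $second) = @_;
    my $caller = caller;                     # package name of the calling code
    no strict 'refs';
    local *{ $caller . '::a' } = \my $a;     # caller's $a now aliases our lexical
    local *{ $caller . '::b' } = \my $b;
    ($a, $b) = ($first, $second);
    return $code->();                        # callback reads $a and $b directly
}

my $sum      = call_with_pair(sub { $a + $b }, 40, 2);
my $restored = defined $main::a ? 'still set' : 'restored';

print "$sum $restored\n";                    # 42 restored
```

The callback never unpacks @_, and once call_with_pair returns, the caller's $a and $b are back to whatever they were before.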