Programming Languages problem, Procedural Language, Dynamic Scope - perl

I have been assigned the following problem in my software development course. The usual approach is to trace each procedure by hand and keep track of every call made by each subprogram. However, I am a somewhat lazy programmer, so I decided to take a shortcut by implementing the given pseudocode in an actual programming language.
Problem statement:
procedure Main is
X, Y, Z : Integer;
procedure Sub1 is
A, Y, Z : Integer;
begin
...
end;
procedure Sub2 is
A, B, Z : Integer;
begin
...
procedure Sub4 is
A, B, W : Integer;
begin
...
end;
end;
procedure Sub3 is
A, X, W : Integer;
begin
...
end;
begin
...
end;
Consider the program above. Given the following calling sequences and assuming that
dynamic scoping is used, what variables are visible during the execution of the last subprogram
activated? Include with each visible variable the name of the unit where it is declared (e.g. Main.X).
Main calls Sub1; Sub1 calls Sub3; Sub3 calls Sub2;
My Attempt:
$x = 10;
$y = 20;
$z = 30;
sub Sub2
{
return $x;
}
sub Sub1
{
local $x = 9;
local $y = 19;
local $z = 29;
return Sub2();
}
print Sub1()."\n";
I'm stuck at this point and have no idea how to change the code so that it shows me the variables. I suspect the solution is obvious, but I've only coded in C++ and Java so far.

It would have been better to spend the time you used asking this question on watching tutorials. However, we have all been there at some point, confused while exploring a new language. Try not to ask for the answer to your homework next time.
So, I see you'd like to use Perl, which is a good choice. I did a similar task recently; here is my approach.
As R. Sebesta (2019) writes in the book named "Concepts of Programming
Languages" (12 ed.), the best examples of dynamic scoping are Perl and
Common Lisp.
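Basically, dynamic scoping resolves a variable name by following the sequence of subprogram calls, which is determined only at run time: the most recently activated subprogram that declares the name wins.
The following program shows how the subprogram calls affect the variable values: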
$x = 0;
$y = 0;
$z = 0;
sub sub1
{
    local $a = 1;
    local $y = 1;
    local $z = 1;
    return sub3();
}
sub sub2
{
    local $a = 2;
    local $b = 2;
    local $z = 2;
    # Declared here to mirror the pseudocode; never called in this sequence.
    sub sub4
    {
        local $a = 4;
        local $b = 4;
        local $w = 4;
    }
    # Each value is the number of the subprogram whose declaration is visible.
    return "Sub".$a.".A, "."Sub".$b.".B, "."Sub".$w.".W, "."Sub".$x.".X, "
        ."Sub".$y.".Y, "."Sub".$z.".Z";
}
sub sub3
{
    local $a = 3;
    local $x = 3;
    local $w = 3;
    return sub2();
}
print sub1()."\n";
Output: Sub2.A, Sub2.B, Sub3.W, Sub3.X, Sub1.Y, Sub2.Z
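Note: a variable reported as Sub0 would belong to the Main subprogram scope (the globals initialised to 0). None appears for this calling sequence.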
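The same idea can be sketched with each variable holding the name of the unit that declared it, so the result reads directly as Main.X, Sub1.Y and so on. This is only an illustrative variant, not part of the original answer: the helper visible() and the upper-case variable names are assumptions (upper case avoids clashing with the special sort variables $a and $b).

```perl
use strict;
use warnings;

# Each package variable holds the name of the unit whose declaration
# is currently visible under dynamic scoping.
our ($A, $B, $W, $X, $Y, $Z);
($X, $Y, $Z) = ('Main') x 3;    # Main declares X, Y, Z

# Hypothetical helper: list every variable that is visible right now.
sub visible {
    my %owner = (A => $A, B => $B, W => $W, X => $X, Y => $Y, Z => $Z);
    return join ', ',
        map  { "$owner{$_}.$_" }
        grep { defined $owner{$_} }
        sort keys %owner;
}

sub Sub1 { local ($A, $Y, $Z) = ('Sub1') x 3; return Sub3() }
sub Sub2 { local ($A, $B, $Z) = ('Sub2') x 3; return visible() }
sub Sub3 { local ($A, $X, $W) = ('Sub3') x 3; return Sub2() }

# Main calls Sub1; Sub1 calls Sub3; Sub3 calls Sub2.
my $result = Sub1();
print "$result\n";   # Sub2.A, Sub2.B, Sub3.W, Sub3.X, Sub1.Y, Sub2.Z
```

Changing which sub calls which is enough to try the other calling sequences of the exercise.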

If you want to check the values of the variables in each subroutine, you can dump them with a module like Data::Dump or Data::Dumper.
use Data::Dumper;
sub foo {
    printf "foo() current values are %s\n\n",
        Data::Dumper::Dumper($a, $b, $c, $x, $y, $z);
}
If you want to see the call stack of the current subroutine you would use the Carp module.
use Carp;
sub foo { Carp::cluck("foo() here"); }
sub bar { foo() }
&bar;
# Output
foo() here at (eval 284) line 1.
W::foo() called at (eval 284) line 1
W::bar() called at (eval 285) line 1
eval 'package W; bar' called at script.pl line 116
console::_console called at script.pl line 473


Perl: Visibility of a subroutine's variables in a subroutine it calls

I have a problem with the variables of one subroutine, which cannot be accessed by another subroutine.
The first subroutine:
sub esr_info {
    my $esr;
    my @vpls = ();
    my @sap = ();
    my @spoke = ();
    &conf_esr($esr, 1);
}
The second:
sub conf_esr {
    my $e = @_[0];
    # some code using (@vpls, @sap, @spoke)
}
The first calls the second, and I need the variables of the first to be local rather than global to the whole program (for threading purposes). The second uses all the variables of the first. I get these errors:
Global symbol "$esr" requires explicit package name (did you forget to declare "my $esr"?) at w.pl line 63.
Global symbol "@vpls" requires explicit package name (did you forget to declare "my @vpls"?) at w.pl line 74.
My question : Can a subroutine access the vars of another without declaring those vars as global ?
Many thanks for reading the post.
You can contain (restrict the visibility of) the variables to the two subs by introducing a scope { ... }, for example:
{
    my $esr;
    my @vpls = ();
    my @sap = ();
    my @spoke = ();
    sub esr_info {
        conf_esr($esr, 1);
    }
    sub conf_esr {
        my $e = $_[0];    # first argument
        # some code (@vpls, @sap, @spoke);
    }
}
But note that the variables now retain their values after the subs exit (they behave like state variables). This is also called a closure.
Other approaches could be more appropriate, depending on your situation, because closures can make the code more difficult to read and hence to maintain. For example, alternatives could be:
you could pass references to the variables as arguments to conf_esr, or better
use an object oriented approach where the variables are contained in a $self hash.
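The first alternative, passing references, might look like the sketch below. The bodies of the two subs are placeholders invented for illustration (the question does not show what conf_esr actually does), and the string values are made up.

```perl
use strict;
use warnings;

# conf_esr receives references, so it works on the caller's arrays
# without those arrays being global.
sub conf_esr {
    my ($esr, $vpls, $sap, $spoke) = @_;
    push @$vpls, "vpls-for-$esr";   # placeholder work
    push @$sap,  "sap-for-$esr";
    return;
}

sub esr_info {
    my ($esr) = @_;
    my (@vpls, @sap, @spoke);       # private to this call
    conf_esr($esr, \@vpls, \@sap, \@spoke);
    return "@vpls @sap";
}

my $summary = esr_info('esr1');
print "$summary\n";   # vpls-for-esr1 sap-for-esr1
```

Because every call to esr_info gets fresh arrays, nothing is shared between calls, which tends to suit the threading requirement better than the closure version.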
My question : Can a subroutine access the vars of another without declaring those vars as global ?
No. You should try passing in variables, it's better form, but you can also use global variables.
my $i=1;
mysub(); # This will not change the global $i
print "i=$i\n"; # This should print '1'
exit;
##########
sub mysub
{my $i=2; # This is a variable local to mysub() only.
return;
}
Type in the code above and run it with Perl. Notice that the $i in the subroutine mysub() is completely different from the file-level ("global") $i in the program itself, because my creates a separate variable that exists only inside mysub().
Now let's remove the my inside mysub(). The sub will now change the outer $i because it no longer declares an $i of its own.
my $i=1;
mysub(); # This changes the outer $i
print "i=$i\n"; # This should print '2'
exit;
##########
sub mysub
{$i=2; # This is changing the value in the global $i memory area.
return;
}

In perl, when assigning a subroutine's return value to a variable, is the data duplicated in memory?

sub foo {
    my @return_value = (1, 2);
}
my @receiver = foo();
Is this assignment like any other assignment in Perl? Is the array duplicated in memory? I doubt it, because the array held by the subroutine is disposable, so duplicating it would be redundant. It would make sense to just 'link' the array to @receiver as an optimization.
by the way, I noticed a similar question Perl: function returns reference or copy? but didn't get what I want.
and I'm talking about Perl5
ps. any books or materials on such sort of topics about perl?
The scalars returned by :lvalue subs aren't copied.
The scalars returned by XS subs aren't copied.
The scalars returned by functions (named operators) aren't copied.
The scalars returned by other subs are copied.
But that's before any assignment comes into play. If you assign the returned values to a variable, you will be copying them (again, in the case of a normal Perl sub).
This means my $y = sub { $x }->(); copies $x twice!
But that doesn't really matter because of optimizations.
Let's start with an example of when they aren't copied.
$ perl -le'
sub f :lvalue { my $x = 123; print \$x; $x }
my $r = \f();
print $r;
'
SCALAR(0x465eb48) # $x
SCALAR(0x465eb48) # The scalar on the stack
But if you remove :lvalue...
$ perl -le'
sub f { my $x = 123; print \$x; $x }
my $r = \f();
print $r;
'
SCALAR(0x17d0918) # $x
SCALAR(0x17b1ec0) # The scalar on the stack
Worse, one usually follows up by assigning the scalar to a variable, so a second copy occurs.
$ perl -le'
sub f { my $x = 123; print \$x; $x }
my $r = \f(); # \
print $r; # > my $y = f();
my $y = $$r; # /
print \$y;
'
SCALAR(0x1802958) # $x
SCALAR(0x17e3eb0) # The scalar on the stack
SCALAR(0x18028f8) # $y
On the plus side, assignment is optimized to minimize the cost of copying strings.
XS subs and functions (named operators) typically return mortal ("TEMP") scalars. These are scalars "on death row". They will be automatically destroyed if nothing steps in to claim a reference to them.
In older versions of Perl (<5.20), assigning a mortal string to another scalar will cause ownership of the string buffer to be transferred to avoid having to copy the string buffer. For example, my $y = lc($x); doesn't copy the string created by lc; simply the string pointer is copied.
$ perl -MDevel::Peek -e'my $s = "abc"; Dump($s); $s = lc($s); Dump($s);'
SV = PV(0x1705840) at 0x1723768
REFCNT = 1
FLAGS = (PADMY,POK,IsCOW,pPOK)
PV = 0x172d4c0 "abc"\0
CUR = 3
LEN = 10
COW_REFCNT = 1
SV = PV(0x1705840) at 0x1723768
REFCNT = 1
FLAGS = (PADMY,POK,pPOK)
PV = 0x1730070 "abc"\0 <-- Note the change of address from stealing
CUR = 3 the buffer from the scalar returned by lc.
LEN = 10
In newer versions of Perl (≥5.20), the assignment operator never[1] copies the string buffer. Instead, newer versions of Perl use a copy-on-write ("COW") mechanism.
$ perl -MDevel::Peek -e'my $x = "abc"; my $y = $x; Dump($x); Dump($y);'
SV = PV(0x26b0530) at 0x26ce230
REFCNT = 1
FLAGS = (POK,IsCOW,pPOK)
PV = 0x26d68a0 "abc"\0 <----+
CUR = 3 |
LEN = 10 |
COW_REFCNT = 2 +-- Same buffer (0x26d68a0)
SV = PV(0x26b05c0) at 0x26ce248 |
REFCNT = 1 |
FLAGS = (POK,IsCOW,pPOK) |
PV = 0x26d68a0 "abc"\0 <----+
CUR = 3
LEN = 10
COW_REFCNT = 2
Ok, so far, I've only talked about scalars. Well, that's because subs and functions can only return scalars[2].
In your example, the scalars assigned to @return_value would be returned[3], copied, then copied a second time into @receiver by the assignment.
You could avoid all of this by returning a reference to the array.
sub f { my @fizbobs = ...; \@fizbobs }
my $fizbobs = f();
The only thing copied there is a reference, the simplest non-undefined scalar.
[1] Ok, maybe not never. I think there needs to be a free byte in the string buffer to hold the COW count.
[2] In list context, they can return 0, 1 or many of them, but they can only return scalars.
[3] The last operator of your sub is a list assignment operator. In list context, the list assignment operator returns the scalars to which its left-hand side (LHS) evaluates. See Scalar vs List Assignment Operator for more info.
The subroutine returns the result of the last operation if you don't specify an explicit return.
@return_value is created separately from @receiver. The values are copied, and the memory used by @return_value is released when it goes out of scope at subroutine exit.
So yes - the memory used is duplicated.
If you desperately want to avoid this, you can create an anonymous array once, and 'pass' a reference to it around:
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
sub foo {
my $anon_array_ref = [ 1, 2 ];
return $anon_array_ref;
}
my $results_from_foo = foo();
print Dumper $results_from_foo;
This will usually be premature optimisation though, unless you know you're dealing with really big data structures.
Note - you should probably include an explicit return; in your sub after the assignment, as it's good practice to make clear what you're doing.
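The difference between returning a list and returning a reference can be observed with a small sketch. The names @shared, as_list and as_ref are made up for this illustration; the array is kept outside the subs only so that it can be inspected afterwards.

```perl
use strict;
use warnings;

my @shared = (1, 2);

sub as_list { return @shared }     # caller receives copies of the elements
sub as_ref  { return \@shared }    # caller receives a reference to the same array

my @copy = as_list();
my $ref  = as_ref();

push @copy, 3;    # changes only the copy
push @$ref, 4;    # changes @shared itself

print "shared: @shared\n";   # shared: 1 2 4
print "copy:   @copy\n";     # copy:   1 2 3
```

The flip side of avoiding the copy is that every holder of the reference sees later modifications.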

Access "external" Perl Array from "BEGIN" Block/Package

Below is a simplified version of a perl script I am trying to modify:
use MODULE_1 ;
use MODULE_2 ;
use vars qw(%ARR $VarZZ) ;
sub A {
# Somestuff
# Call to Sub B
B() ;
# Call to Sub C
C() ;
}
BEGIN {
package XYZ ;
use vars qw($VarYY $VarXX) ;
# MISC SUBS HERE
# end of package XYZ
}
sub B {
# Somestuff
}
sub C {
# Somestuff to set %ARR
}
# Call to Sub A
A() ;
My issue is that I would like to access %ARR within the package XYZ BEGIN block, but I keep getting error messages saying I need to define %ARR ("requires explicit package name").
I have tried copying a similar example within the block, $main::ARR{index}, but have failed so far.
I assume it may be because %ARR isn't set when that block is being evaluated and that I perhaps need to call "sub C", but &main::C(); seems to be failing as well.
How can I access this array there?
I have looked at: Perl's main package - block syntax - pragmas and BEGIN/END blocks which seems to be addressing similar themes but struggling to properly understand the answers
** EDIT **
Expanded skeleton script showing some attempts at moving forward:
use MODULE_1 ;
use MODULE_2 ;
use vars qw(%ARR $VarZZ) ;
sub A {
# Somestuff
# Call to Sub B
B() ;
# Call to Sub C
C() ;
# Call to Sub E
E() ;
}
sub E {
# Call to Package XYZ subs
}
BEGIN {
package XYZ ;
use vars qw($VarYY $VarXX %ARR) ;
# I tried to Call to Sub C and load a local version of %ARR
#
# This fails with "Undefined subroutine &main::C" error
&main::C() ;
#
# We never get here so not sure if correct
%ARR = &main::ARR ;
# MISC SUBS HERE
sub X {
# Call to Sub D
&main::D() ;
}
# end of package XYZ
}
sub B {
# Somestuff
}
sub C {
# Somestuff to set %ARR
}
sub D {
# Somestuff
}
# Call to Sub A
A() ;
Note that the call to &main::E() works when made from within the subs in package XYZ, but both this and &main::C() fail when called free-standing. Perhaps the free-standing call is made at compile time, before the subs are defined.
BTW, I tried the our definition but getting a 502 error: Nginx Debug Log
Perhaps this is because the array is not available?
%main::ARR or $main::ARR{index} is correct for the code skeleton you have provided (well, anything is correct because you haven't said use strict, but anyway ...). Is it possible that main is not the correct namespace (i.e., is there some package statement that precedes use vars ...)?
In any case, you can work around this issue with the our keyword. If you declare it at the top level, it will be in scope throughout the rest of the file:
package ABC;
our %ARR; # %ABC::ARR
sub foo {
$ARR{"key"} = "value"; # %ABC::ARR
}
{ # "BEGIN" optional
package XYZ;
our %Hash; # %XYZ::Hash
sub bar {
my $key1 = $Hash{"key1"}; # %XYZ::Hash
my $val1 = $ARR{$key1}; # %ABC::ARR
$ARR{$val1} = $key1;
}
}
...
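For the skeleton in the question, a fully qualified name is another option that works under strict. The following is a minimal sketch, assuming the hash really lives in package main; the key 'colour' and the sub XYZ::lookup are invented for the example.

```perl
use strict;
use warnings;

our %ARR;                            # this is %main::ARR

sub C { $ARR{colour} = 'red' }       # fills %main::ARR at run time

{
    package XYZ;

    sub lookup {
        my ($key) = @_;
        return $main::ARR{$key};     # fully qualified, so strict is satisfied
    }
}

C();                                 # run-time call, after all subs are compiled
print XYZ::lookup('colour'), "\n";   # red
```

The hash is read inside XYZ::lookup at run time rather than directly inside a BEGIN block, so C() has already been compiled and has filled the hash by the time the lookup happens.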

Is it a good practice to use self invoking anonymous function in Perl?

It is a common practice to use self invoking anonymous functions to scope variables etc. in JavaScript:
;(function() {
...
})();
Is it a good practice to use such functions in Perl ?
(sub {
...
})->();
Or is it better for some reason to use main subroutine ?
sub main {
...
}
main();
Perl has lexical scoping mechanisms JS lacks. You are better off simply enclosing code you want scoped somehow in a block, e.g.:
{
my $localvar;
. . .
}
In this case $localvar will be completely invisible outside of those braces; that is also the same mechanism one can use to localise builtin variables such as $/:
{
local $/ = undef;
#reading from a file handle now consumes the entire file
}
#But not out here
(Side note: never set $/ globally. It can break things in subtle and horrible ways if you forget to set it back when you're done, or if you call other code before restoring it.)
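As a concrete sketch of that localisation, the hypothetical helper slurp() below reads a whole filehandle in one go while leaving $/ untouched for the rest of the program. An in-memory filehandle is used so the example runs without any file on disk.

```perl
use strict;
use warnings;

# Hypothetical helper: return everything left on a filehandle.
sub slurp {
    my ($fh) = @_;
    local $/;                 # undef only until slurp() returns
    return scalar <$fh>;
}

my $text = "line 1\nline 2\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my $all   = slurp($fh);       # both lines arrive in a single read
my $after = defined $/ ? 'restored' : 'still undef';
close $fh;

print "$after\n";             # restored
```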
In perl, the best practise is to put things in subs when it makes sense; when it doesn't make sense or unnecessarily complicates the code, lexical blocks ensure scoping; if you do need anonymous subroutines (generally for callbacks or similar) then you can do my $subref = sub { . . . }; or even just stick the sub declaration directly into a function argument: do_something(callback => sub { . . . });
Note: see also ysth's answer for a resource-related advantage to self-invoking anonymous subs.
Since perl provides lexically scoped variables (and, as of 5.18, lexical named subs), there is no scoping reason for doing that.
The only reason to do it that I can think of would be for memory management; if the sub in question is a closure (references at least one external lexical variable), any memory used by the sub will be totally freed instead of retained for reuse on the next call:
$ perl -MDevel::Peek -wle'sub { my $x; Dump $x; $x = 42 }->() for 1..2'
SV = NULL(0x0) at 0x944a88
REFCNT = 1
FLAGS = (PADMY)
SV = IV(0x944a78) at 0x944a88
REFCNT = 1
FLAGS = (PADMY)
IV = 42
$ perl -MDevel::Peek -wle'my $y; sub { $y if 0; my $x; Dump $x; $x = 42 }->() for 1..2'
SV = NULL(0x0) at 0x259d238
REFCNT = 1
FLAGS = (PADMY)
SV = NULL(0x0) at 0x259d220
REFCNT = 1
FLAGS = (PADMY)
Though if you are not concerned about memory, this would be a disadvantage.
It's not unheard of but not common either. To restrict variable scope temporarily, it's much more common to use a block with a my variable declaration:
...
{
my $local_variable;
...
}
In Javascript, self-invoking functions have two uses:
Variable scoping. The var declarations are hoisted into the scope of the first enclosing function or into global scope. Therefore,
function () {
if (true) {
var foo = 42
}
}
is the same as
function () {
var foo
if (true) {
foo = 42
}
}
– often an unwanted effect.
Statements on the expression level. Sometimes you need multiple statements to compute something, but want to do so inside an expression.
largeObject = {
...,
// sum from 1 to 42
sum: (function(n){
var sum = 0;
for(var i = 1; i <= n; i++)
sum += i;
return sum;
})(42),
...,
};
Perl has no need for self-invoking functions as a scoping mechanism, because a new scope is introduced by any curly brace. A bare block is always allowed on a statement level:
...
my $foo = 10;
{
my $foo = 42;
}
$foo == 10 or die; # lives
Perl has reduced need for self-invoking functions to introduce statements into an expression because of the do BLOCK builtin:
%large_hash = (
...,
sum => do {
my $sum = 0;
$sum += $_ for 1 .. 42;
$sum;
},
...,
);
However, you will sometimes want to short-circuit in such a block. As return exits the surrounding subroutine (not block), it can be quite useful here. For example in a memoized function:
# moronic cached division by two
my %cache;
sub lookup {
my $key = shift;
return $cache{$key} //= sub {
for (1 .. 100) {
return $_ if $_ * 2 == $key
}
return;
}->();
}

Why does this Perl variable keep its value

What is the difference between the following two Perl variable declarations?
my $foo = 'bar' if 0;
my $baz;
$baz = 'qux' if 0;
The difference is significant when these appear at the top of a loop. For example:
use warnings;
use strict;
foreach my $n (0,1){
my $foo = 'bar' if 0;
print defined $foo ? "defined\n" : "undefined\n";
$foo = 'bar';
print defined $foo ? "defined\n" : "undefined\n";
}
print "==\n";
foreach my $m (0,1){
my $baz;
$baz = 'qux' if 0;
print defined $baz ? "defined\n" : "undefined\n";
$baz = 'qux';
print defined $baz ? "defined\n" : "undefined\n";
}
results in
undefined
defined
defined
defined
==
undefined
defined
undefined
defined
It seems that if 0 fails, so foo is never reinitialized to undef. In this case, how does it get declared in the first place?
First, note that my $foo = 'bar' if 0; is documented to be undefined behaviour, meaning it's allowed to do anything including crash. But I'll explain what happens anyway.
my $x has three documented effects:
It declares a symbol at compile-time.
It creates a new variable on execution.
It returns the new variable on execution.
In short, it's supposed to be like Java's Scalar x = new Scalar();, except it returns the variable if used in an expression.
But if it actually worked that way, the following would create 100 variables:
for (1..100) {
my $x = rand();
print "$x\n";
}
This would mean two or three memory allocations per loop iteration for the my alone! A very expensive prospect. Instead, Perl only creates one variable and clears it at the end of the scope. So in reality, my $x actually does the following:
It declares a symbol at compile-time.
It creates the variable at compile-time[1].
It puts a directive on the stack that will clear[2] the variable when the scope is exited.
It returns the new variable on execution.
As such, only one variable is ever created[2]. This is much more CPU-efficient than creating one every time the scope is entered.
Now consider what happens if you execute a my conditionally, or never at all. By doing so, you are preventing it from placing the directive to clear the variable on the stack, so the variable never loses its value. Obviously, that's not meant to happen, so that's why my ... if ...; isn't allowed.
Some take advantage of the implementation as follows:
sub foo {
my $state if 0;
$state = 5 if !defined($state);
print "$state\n";
++$state;
}
foo(); # 5
foo(); # 6
foo(); # 7
But doing so requires ignoring the documentation forbidding it. The above can be achieved safely using
{
my $state = 5;
sub foo {
print "$state\n";
++$state;
}
}
or
use feature qw( state ); # Or: use 5.010;
sub foo {
state $state = 5;
print "$state\n";
++$state;
}
Notes:
[1] "Variable" can mean a couple of things. I'm not sure which definition is accurate here, but it doesn't matter.
[2] If anything but the sub itself holds a reference to the variable (REFCNT>1) or if the variable contains an object, the directive replaces the variable with a new one (on scope exit) instead of clearing the existing one. This allows the following to work as it should:
my @a;
for (...) {
    my $x = ...;
    push @a, \$x;
}
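A runnable version of that last snippet makes the replacement visible: because a reference to $x is kept, each iteration ends up with its own variable. The concrete loop range and values are invented for the illustration.

```perl
use strict;
use warnings;

my @refs;
for my $n (1 .. 3) {
    my $x = $n * 10;
    push @refs, \$x;          # keeps this iteration's $x alive
}

my @values   = map { $$_ } @refs;
my %distinct = map { ("$_" => 1) } @refs;   # a stringified ref includes its address

print "@values\n";                                      # 10 20 30
print scalar(keys %distinct), " distinct variables\n";  # 3 distinct variables
```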
See ikegami's better answer, probably above.
In the first example, you never define $foo inside the loop because of the conditional, so when you use it, you're referencing and then assigning a value to an implicitly declared global variable. Then, the second time through the loop that outside variable is already defined.
In the second example, $baz is defined inside the block each time the block is executed. So the second time through the loop it is a new, not yet defined, local variable.