Moving from CGI to mod_perl. Understanding my, our, local


I have been using Apache mod_cgi for some years. Now I am moving to mod_perl and I have found some problems, especially with subroutines. Until now I never used my, our or local, and the CGI scripts worked without problems. After reading the documentation and some previous questions posted here, I understand more or less how my, our and local work. My concern is what information is going to be shared between requests (if I understand correctly, that is the main concern I must have when running mod_perl instead of mod_cgi).
Is there any difference between declaring a scalar with our and just using the scalar without any declaration such as my? Aren't both global?
If I do not declare the scalar as private, is it going to be shared with the next request? Even with a request for a different Perl script on the same server?
How can I share the value of a scalar set inside a subroutine with code outside that subroutine, but not outside the same file or the same request?
If I declare a scalar with my inside an if, at file level or inside a subroutine, and afterwards I write another if that uses the same scalar, is that scalar shared between both ifs, or is each if a different block? What about while and for: are they separate blocks for a scalar previously declared with my, or does that only apply to subroutines and files?

mod_perl works by wrapping each Perl script in a subroutine called handler within a package based on the name and path of the script. Instead of starting a new process to run each script, this handler subroutine is called by one of a number of persistent Perl threads.
Ordinarily this knowledge would help a lot to understand the changes in environment from mod_cgi, but since you have never added use strict to your programs and become familiar with the workings of declared variables you have a lot of catching up to do!
The mod_perl environment has the potential for causing non-obvious security breaches, and you should start now to use strict on every script and declare every variable. use Carp will also help you to understand the error logs.
A variable name declared with our is a lexically-scoped synonym for a package variable of the same name that can be used without fully qualifying the name by including the package name. For instance, ordinarily a variable declared with our $var will provide access to the $main::var scalar (if there has been no preceding package declaration) without specifying main::. However, such variables that began life with a value of undef in mod_cgi will now retain their values from the previous execution of any given mod_perl thread, and for consistency it is safest to always initialise them at the point of declaration. Note also that the default package name is no longer main because of the wrapping that mod_perl does, so you can no longer access package variables using the main:: prefix, and it is unwise to find the actual name of the package and explicitly use that because it will be a very long name and will change if you move or rename your script.
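The persistence described above can be imitated without mod_perl by calling an ordinary subroutine twice in one process. This is only a sketch: handler_uninitialised, handler_initialised, $hits and $fresh_hits are made-up names, and the real wrapper that mod_perl generates is not shown.

```perl
use strict;
use warnings;

# Stand-ins for the wrapper that mod_perl builds around a script. Calling
# each one twice imitates two requests served by the same interpreter.
sub handler_uninitialised {
    our $hits;              # package variable, never reset
    $hits++;
    return $hits;
}

sub handler_initialised {
    our $fresh_hits = 0;    # reset at the point of declaration on every call
    $fresh_hits++;
    return $fresh_hits;
}

my @stale = (handler_uninitialised(), handler_uninitialised());
my @fresh = (handler_initialised(),   handler_initialised());

print "uninitialised: @stale\n";    # uninitialised: 1 2
print "initialised:   @fresh\n";    # initialised:   1 1
```

The second "request" sees the value left behind by the first unless the declaration also assigns.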
A my variable is one that exists independently of the package symbol table, and normally its lifetime is the run time of the enclosing file (for variables declared at file scope) or subroutine. They are safe in mod_perl if both declared and used at file scope of the script or entirely within one subroutine, but you can be stung if you mix scopes and declare a my $global at file scope and then try to use it in a subroutine. The reason for this isn't simple, but it is caused by mod_perl wrapping your script in a handler subroutine so you have nested subroutine declarations. The inner subroutine will tend to adopt only the first instantiation of $global and ignore any others created by later calls to handler. If you need a global variable you should declare it with our and initialise it in that declaration as described above.
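The nested-subroutine effect can be reproduced in plain Perl. In this sketch handler and fixed_handler are hand-written stand-ins for the wrapper that mod_perl generates, and show_name, show_shared, $name and $shared_name are hypothetical names chosen for the demonstration.

```perl
use strict;
use warnings;
no warnings 'closure';   # silence "Variable ... will not stay shared" for this demo

# Stand-in for the wrapped script: what was a file-scope my variable is now
# a my variable inside handler, and the script's sub is a nested named sub.
sub handler {
    my $name = shift;

    sub show_name { return $name }      # bound to the first $name only

    return show_name();
}

# The same layout using our: every call assigns to the one package variable.
sub fixed_handler {
    our $shared_name = shift;

    sub show_shared { return $shared_name }

    return show_shared();
}

my @broken = (handler('first'),       handler('second'));
my @fixed  = (fixed_handler('first'), fixed_handler('second'));

print "my:  @broken\n";    # my:  first first
print "our: @fixed\n";     # our: first second
```

With warnings left on, Perl reports the first case at compile time as "Variable "$name" will not stay shared", which is worth looking for in the error log.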
A local variable is like an our variable in that it refers to a package variable; local itself declares nothing, so under use strict the name must still be declared with our or be fully qualified. What local does is temporarily save the current value of that variable and provide a new copy for use until the end of the file or block scope. Because of its automatic creation and deletion within its scope it can be a useful alternative to a my variable in mod_perl scripts, particularly where you are using references to data structures like, say, an instance of the CGI class. Declaring our $cgi = CGI->new would correctly create the object but, because of mod_perl's persistence, would leave it in memory until the next execution of the thread deletes it to make room for another one.
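A small sketch of the save-and-restore behaviour of local, using a made-up package variable $mode and two hypothetical subs:

```perl
use strict;
use warnings;

our $mode = 'default';

sub current_mode { return $mode }    # reads the package variable

sub run_verbose {
    local $mode = 'verbose';         # temporary value, restored when the sub returns
    return current_mode();           # callees see the temporary value
}

my $during = run_verbose();
my $after  = current_mode();

print "during: $during, after: $after\n";    # during: verbose, after: default
```

Unlike a my variable, the temporary value is visible to any subroutine called while the local is in effect.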
As for your questions:
Using a variable without declaring it causes a compile-time error if use strict is in place, as it should be. Otherwise it refers to the package variable of that name in the current package namespace.
Variables are either package variables or lexical variables; there is no way to declare a variable as private as such. Lexical variables (declared with my) will be created and destroyed with each execution of the script, unless you have created an invalid closure as described above by writing a subroutine that uses a variable declared at a wider scope, when the variable will be persistent but won't do what you want it to. A variable declared with our will retain its value across calls to the script, while one declared with local will be destroyed when the script terminates. Both our and local variables are package variables and all references to the same variable name refer to the same variable.
To declare a variable that is consistently accessible everywhere within any one call of a script you can either use a local variable or an initialised our variable. At file scope local $global is largely equivalent to our $global = undef for mod_perl scripts. If you use an our variable to point to a data structure then remember to destroy it at the end of the script with undef $global.
my variables are unique to, and visible within, the block in which they are declared, whether that is a block within an if, while or for, or even just a bare { ... } block scope. Always use my variables for temporary work variables that are used only within a block and accessed from nowhere else.
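To illustrate, here is a short sketch in which every block declares its own $label; the variable names are arbitrary.

```perl
use strict;
use warnings;

my $label = 'file';
my @seen;

if (1) {
    my $label = 'if block';          # a new variable, visible only in this if
    push @seen, $label;
}

for my $n (1 .. 2) {
    my $label = "for pass $n";       # a fresh variable on every iteration
    push @seen, $label;
}

{
    my $label = 'bare block';
    push @seen, $label;
}

push @seen, $label;                  # the file-scope variable was never touched

print join(', ', @seen), "\n";
# if block, for pass 1, for pass 2, bare block, file
```

A second if that needs the value set in the first one must use a variable declared before both of them.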
I hope this helps

Edit: this is general information on Perl variable scoping only. Please see Borodin's post for specific mod_perl issues.
Variables declared with my are lexical. In other words, they exist only within the current scope. You should declare all of your variables with my by default; only do something else when you specifically want different functionality.
Using lexically-scoped variables is a basic part of good code design in (almost) any language. Putting use strict; and use warnings; in all of your scripts will require you to follow this good practice.
our is a way of declaring a global variable; the underlying result is very similar to using undeclared globals. However, it has two differences:
You are explicitly stating that you want the variable to be global. This is a good practice to follow, since use of global variables should be an exceptional case. Because of this, you can create a global in this way even if you use strict;.
The variable declared with our will be accessible by the name you declare throughout all packages in the current scope. An undeclared variable, by contrast, is only accessible by simple name within the current package. Outside of that, you could only refer to it as $package::variable.
See the documentation for our for more details.
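The second difference can be seen in a single file that switches packages. Alpha and Beta are hypothetical package names used only for this sketch.

```perl
use strict;
use warnings;

package Alpha;
our $name = 'alpha';                 # declares an alias for $Alpha::name

package Beta;
# The alias created by our is lexical, so it survives the package change
# and still refers to $Alpha::name, not to $Beta::name.
my $via_short_name = $name;
my $via_full_name  = $Alpha::name;

print "short: $via_short_name, full: $via_full_name\n";    # short: alpha, full: alpha
```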
local does not create a lexical variable; instead, it is a way to give a global variable a temporary value within the current scope. It is mostly used with Perl's special built-in (punctuation) variables:
{
    local $/; #make the record separator undefined in this scope only.
    my $file = <FILE>; #read in an entire file at once.
}
You can go far simply by using my at all times for your variables and using local only for special cases like that shown above.
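A runnable variant of that idiom, as a sketch: slurp is a hypothetical helper, and an in-memory filehandle stands in for a real file so the example needs nothing on disk.

```perl
use strict;
use warnings;

# Read everything from a filehandle in one go.
sub slurp {
    my ($fh) = @_;
    local $/;                        # record separator undefined inside this sub only
    my $content = <$fh>;
    return $content;
}

my $text = "line one\nline two\n";

open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $content = slurp($fh);
close $fh;

# Outside slurp the record separator is "\n" again, so this reads line by line.
open $fh, '<', \$text or die "Cannot open in-memory file: $!";
my @lines = <$fh>;
close $fh;

print "slurped ", length($content), " characters, then ", scalar(@lines), " lines\n";
```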

Related

Correct way of variable declaration in Perl

I have a set of 3 or 4 separate Perl scripts that used to be part of a simple pipeline, but I am now trying to combine them in a single script for easier use (for now without subroutine functions). The thing is that several variables with the same name are defined in the different scripts. The workaround I found was to give different names to those variables, but it can start to become messy and probably it is not the correct way of doing so.
I know the concept of global and local variables but I do not quite understand exactly how they work.
Are there any rules of thumb for dealing with this sort of variable? Do you know any good documentation that can shed some light on variable scope, or have any advice on this?
Thanks.
EDITED: I already use "use warnings; use strict;" and declare variables with "my". The question might actually be more related to the definition of scoping blocks and how to get them to be independent from each other...

You are likely getting into trouble because of your use of global variables (which actually likely exist in package main). You should try to avoid the use of global variables.
And to do so, you should become familiar with the meaning of variable scope. Although somewhat dated, Coping with Scoping offers a good introduction to this topic. Also see this answer and the others to the question How to properly use Global variables in perl. (Short Answer: avoid them to the degree possible.)
The principle of variable scope and limiting use of global variables actually applies to nearly all programming languages. You should get in the habit of declaring variables as close as possible to the point where you are actually using them.
And finally, to save yourself from a lot of headaches, get in the habit of:
including use strict; and use warnings; at the top of every Perl source file, and
declaring variables with my within each of your subs (to limit the scope of those variables to the sub).
(See this PerlMonks article for more on this recommendation.)
I refer to this practice as "Perl programming with your seat belt on." :-)

The rule of thumb is to put your code in subroutines, each of them focused on a simple, well-defined part of the larger process. From this one decision flow many virtuous outcomes, including a natural solution to the variable scoping problem you asked about.
sub foo {
    my $x = 99;
    ...
}
sub bar {
    my $x = 1234; # Won't interfere with foo's $x.
    ...
}
If, for some reason, you really don't want to do that, you can wrap each section of the code in scoping blocks, and make sure you declare all variables with my (you should be doing the latter nearly always as a matter of common practice):
{
    my $x = 99;
    ...
}
{
    my $x = 1234; # Won't interfere with the other $x.
    ...
}

"Fake" global lexical variables in Common Lisp

It is stated in section "Global variables and constants" of the Google Common Lisp Style Guide that:
"Common Lisp does not have global lexical variables, so a naming convention is used to ensure that globals, which are dynamically bound, never have names that overlap with local variables.
It is possible to fake global lexical variables with a differently named global variable and a DEFINE-SYMBOL-MACRO. You should not use this trick, unless you first publish a library that abstracts it away."
Can someone, please, help me to understand the meaning of this last sentence.

The last sentence,
You should not use this trick, unless you first publish a library that abstracts it away.
means that if you do something that simulates global lexical variables, then the implementation of that simulation should not be apparent to the user. For instance, you might simulate a global lexical using some scheme using define-symbol-macro, but if you do, it should be transparent to the user. See Ron Garret's GLOBALS — Global Variables Done Right for an example of “a library that abstracts it away.”

How to import all "our"-variables from the unnamed Perl module without listing them?

I need to import all our variables from the unnamed Perl module (Module.pm) and use them inside the Perl script (Script.pl).
The following code works well without "use strict", but fails with it. How can I change this code to work with "use strict" without manually listing all imported variables (as described in the answer to another question)?
Thanks a lot for your help!
Script.pl:
use strict;
require Module;
print $Var1;
Module.pm:
our $Var1 = "1\n";
...
our $VarN = "N\n";
return 1;
Run the script:
$> perl Script.pl
Errors:
Global symbol "$Var1" requires explicit package name at Script.pl line 3.
Execution of Script.pl aborted due to compilation errors.
NOTE (1): The module is unnamed, so using a Module:: prefix is not an option.
NOTE (2): Module.pm contains also a set of functions configured by global variables.
NOTE (3): Variables are different and should NOT be stored in one array.
NOTE (4): Design is NOT good, but the question is not about the design. It's about forcing of the listed code to work with minimal modifications with the complexity O(1), i.e. a few lines of code that don't depend on the N.
Solution Candidate (ACCEPTED): Add $:: before all imported variables. It is compliant with strict and also makes it easy to tell my variables from imported ones in the code.

Change your script to:
use strict;
require Module;
print $Module::Var1;
The problem is the $Var1 isn't in the main namespace, it's in Module's namespace.
Edit: As is pointed out in comments below, you haven't named your module (i.e. it doesn't say package Module; at the top). Because of this, there is no Module namespace. Changing your script to:
use strict;
require Module;
print $main::Var1;
...allows the script to correctly print out 1\n.
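The same behaviour can be checked in a single file. In this sketch a bare block stands in for require Module; (an assumption made so the example is self-contained), and $:: is used as the shorthand for $main::.

```perl
use strict;
use warnings;

# Stand-in for "require Module;": a file with no package statement is
# compiled into package main, so its our variables live there.
{
    our $Var1 = "1\n";
}

my $long  = $main::Var1;
my $short = $::Var1;                 # $:: is shorthand for $main::

# The bare name is still rejected by strict outside the declaring scope.
my $bare_ok = eval q{ my $copy = $Var1; 1 } ? 1 : 0;

print "long: $long";
print "short: $short";
print "bare name accepted: $bare_ok\n";    # bare name accepted: 0
```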

If you have to import all the our variables in every module, there's something seriously wrong with your design. I suggest that you redesign your program to separate the elements so there is a minimum of cross-talk between them. This is called decoupling.

You want to export all variables from a module, and you want to do it in such a way that you don't even know what you're exporting? Forget about use strict and use warnings because if you put them in your program, they'll just run screaming out, and curl up in a corner weeping hysterically.
I never, and I don't mean hardly ever, never export variables. I always create a method to pull out the required value. It gives me vital control over what I'm exposing to the outside world and it keeps the user's namespace pure.
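As a rough sketch of that approach, with a made-up package My::Settings and a hypothetical get method (neither comes from the question):

```perl
use strict;
use warnings;

package My::Settings;

# The data stays private to this file; nothing is exported.
my %setting = (
    Var1 => "1\n",
    VarN => "N\n",
);

# Class method that hands out one value and refuses unknown names.
sub get {
    my ($class, $key) = @_;
    die "Unknown setting '$key'\n" unless exists $setting{$key};
    return $setting{$key};
}

package main;

print My::Settings->get('Var1');     # prints 1
```

The caller's namespace stays clean, and a misspelled name fails immediately instead of silently producing undef.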
Let's look at the possible problems with your idea.
You have no idea what is being exported in your module. How is the program that uses that module going to know what to use? Somewhere, you have to document that the variables $foo and @bar are available for use. If you have to do that, why not simply play it safe?
You have the issue of someone changing the module, and suddenly a new variable is being exported into the program using that module. Imagine if that variable was already in use. The program suddenly has a bug, and you'll never be able to figure it out.
You are exporting a variable in your module, and the developer decides to modify that variable, or even removes it from the program. Again, because you have no idea what is being imported or exported, there's no way of knowing why a bug suddenly appeared in the program.
As I mentioned, you have to know somewhere what is being used in your module that the program can use, so you have to document it anyway. If you're going to insist on importing variables, at least use the @EXPORT_OK array and the Exporter module. That will help limit the damage. This way, your program can declare what variables it is depending upon and your module can declare what variables it knows programs might be using. If I am modifying the module, I would be extra careful of any variable I see I am exporting. And, if you must specify in your program what variables you're importing, you know to be cautious about those particular variables.
Otherwise, why bother with modules? Why not simply go back to Perl 3.0 and use require instead of use and forget about using the package statement?
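A minimal sketch of the @EXPORT_OK approach. The package name My::Config is hypothetical, and the module is defined inline in a BEGIN block (with %INC set by hand) only so that the example fits in one file; normally it would live in My/Config.pm.

```perl
use strict;
use warnings;

BEGIN {
    package My::Config;
    use Exporter 'import';

    # Only these names may be requested by a caller.
    our @EXPORT_OK = qw($Var1 $VarN);

    our $Var1 = "1\n";
    our $VarN = "N\n";

    # Mark the module as loaded so that "use" does not look for a file.
    $INC{'My/Config.pm'} = __FILE__;
}

# The script states exactly which variable it depends on.
use My::Config qw($Var1);

print $Var1;                         # prints 1, and compiles under strict
```

$VarN is exportable but was not requested, so it remains reachable only as $My::Config::VarN.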

It sounds like you have data in a file and are trying to load that data into your program.
As it is now, the our declarations in the module only declare variables for the scope of that file. Once the file finishes running, you need to use their fully qualified names to access the variables. If your module has a package xyz; line, then the fully qualified name is $xyz::Var1. If there is no package declaration, then the default package main is used, giving your variables the name $main::Var1.
However, any time that you are making many variables all with numeric name changes, you probably should be using an array.
Change your module to something like:
@My::Module::Data = ("1\n", "2\n" ... )
and then access the items by index:
$My::Module::Data[1]

how to implement import semantics into the current block scope?

The documentation of use indicates that:
Some ... pseudo-modules import semantics into the current block scope (like strict or integer, unlike ordinary modules, which import symbols into the current package (which are effective through the end of the file)).
Similarly, the documentation of autodie describes it as:
Replace functions with ones that succeed or die with lexical scope
How to implement import semantics into the current block scope with ordinary modules?

strict and warnings are implemented using some special flag variables that don't contain room for user pragmas. Starting with perl 5.10, you can write your own lexically scoped pragmas. perlpragma contains information on how to do so. You can also browse the source of existing pragmatic modules.
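A small sketch of a user pragma in the style described in perlpragma, using the %^H hint hash and caller. The pragma name Demo::loud and the shout function are invented for illustration, and the package is defined inline (with %INC set by hand) so the example runs as a single file on perl 5.10 or later.

```perl
use strict;
use warnings;

BEGIN {
    package Demo::loud;

    # "use Demo::loud" and "no Demo::loud" record a flag in the hint hash
    # of the scope that is being compiled at that moment.
    sub import   { $^H{'Demo::loud'} = 1 }
    sub unimport { $^H{'Demo::loud'} = 0 }

    # Was the pragma in effect where the code $level frames up was compiled?
    sub in_effect {
        my $level = shift // 0;
        my $hints = (caller($level))[10] || {};
        return $hints->{'Demo::loud'} ? 1 : 0;
    }

    $INC{'Demo/loud.pm'} = __FILE__;    # let "use" find the inline package
}

# A function whose behaviour depends on the pragma at its call site.
sub shout {
    my ($text) = @_;
    return Demo::loud::in_effect(1) ? uc($text) : $text;
}

my $before = shout('hi');
my $inside;
{
    use Demo::loud;                  # active until the end of this block
    $inside = shout('hi');
}
my $after = shout('hi');

print "$before $inside $after\n";    # hi HI hi
```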

Does Perl monkey-patching allow you to see the patched package's scope?

I'm monkey patching a package using a technique given at the beginning of "How can I monkey-patch an instance method in Perl?". The problem that I'm running into is that the original subroutine used a package-level my variable which the patched subroutine appears not to have access to, either by full path specification or implicit use.
Is there any way to get at the data scoped in this way for use in the patched subroutine?

You can obtain lexicals with the PadWalker module. Evil, but it works.

No. Your mistake is in thinking that they are package scoped; they are not. A lexical variable is by definition limited to its lexical scope, in other words, the block it is in.

Lexicals (i.e. declared with 'my') are not visible outside the lexical scope (file or block) in which they are declared. That's the whole point of lexical variables.
If there is a subroutine/method which is in the same scope as the lexical var, then it can return the value of the lexical and that can allow indirect access to the var from outside its scope.
There is no such thing as a 'full path specification' for lexical variables. That's for package variables. If the var was declared with 'our' instead of 'my' you could do that.
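A sketch of the indirect route: the patched subroutine cannot name the lexical, but it can call a subroutine that was compiled in the lexical's scope. Original, greet, get_secret and $secret are hypothetical names, and the accessor is assumed to exist already in the package being patched.

```perl
use strict;
use warnings;

{
    package Original;

    my $secret = 'hidden value';     # lexical: visible only inside this block

    sub greet      { return "original sees $secret" }
    sub get_secret { return $secret }    # accessor compiled in the same scope
}

{
    no warnings 'redefine';

    # The replacement is compiled outside the block, so $secret cannot be
    # named here; it goes through the accessor instead.
    *Original::greet = sub {
        return 'patched sees ' . Original::get_secret();
    };
}

print Original::greet(), "\n";       # patched sees hidden value
```

If the package offers no such accessor, the lexical is out of reach for ordinary code.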
