how to fake a perl module for dependency? - perl

An external Perl library that I am using has a dependency (DBD::mysql) that I will not be using in my application (DBD::SQLite), so I would like the system to just pretend the dependency is there, even if it's a "fake".
Can I just create an empty DBD::mysql.pm module that compiles or is there a more straightforward way of doing this?

So I think there are a few issues here.
When you say dependency, do you mean the external module simply tries to require or use DBD::mysql? If that is the case then you should advise the developer that he shouldn't be explicitly doing that because that defeats the purpose of using DBI. The database driver should be selected on the fly based on the DSN.
Assuming that the author is merely useing the package name because he thought that was a useful or meaningful thing to do, then yes, you may override that package, and there are a few ways to do it.
As you suggested, you can merely create your own module DBD/mysql.pm that would define the DBD::mysql package.
There are some other things you could do if you are interested. Instead of littering your source tree with fake directories and files, you just need to convince Perl that the module was loaded. We can do this by directly manipulating %INC.
package main; # or wherever
BEGIN {
$INC{'DBD/mysql.pm'} = "nothing to see here";
}
Simply by adding this hash key, we preclude a search of the filesystem for the offending module. Observe that this is in a BEGIN block. If the external author did a use then we must populate this value before the use statement is evaluated. The use statements are equivalent to a require and import wrapped in a BEGIN.
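A minimal, self-contained sketch of the effect described above. The module name My::Fake::Driver is hypothetical and has no file on disk; the only thing that lets the use statement succeed is the %INC entry set in the BEGIN block.

```perl
use strict;
use warnings;

BEGIN {
    # Pretend the module has already been loaded; any true value works.
    $INC{'My/Fake/Driver.pm'} = __FILE__;
}

# Without the BEGIN block above, this line would die with "Can't locate ...".
use My::Fake::Driver;

print "My::Fake::Driver counts as loaded from $INC{'My/Fake/Driver.pm'}\n";
```

Because the fake package defines no import method, the implicit import call made by use is silently skipped.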
Now let's further speculate, in the general sense, that the external author was attempting to call methods of the package. You will get run-time errors if those symbols don't exist. You can take advantage of Perl's AUTOLOAD to intercept such calls and do the right thing. What the right thing is can vary a lot, from simply logging a message to something more elaborate. For instance, you could use this facility to examine the depth of the coupling that the author introduced by monitoring all the calls.
package DBD::mysql;
sub AUTOLOAD {
printf(
"I don't wanna '%s' called from '%s'\n", $AUTOLOAD, caller(0)
);
}
package main; # or wherever
BEGIN {
$INC{'DBD/mysql.pm'} = "nothing to see here";
}
DBD::mysql::blah()
Now let's also cover the case where the offending author also created some object oriented instances of a class, and his code doesn't properly account
for your stub code. We will stub the constructor which we assume is new to just bless an anonymous hash with our package name. That way you won't get
errors when he calls methods on an instance.
package DBD::mysql;
sub AUTOLOAD {
printf(
"I don't wanna '%s' called from '%s'\n", $AUTOLOAD, caller(0)
);
}
sub new {
bless({}, __PACKAGE__)
}
package main; # or wherever
BEGIN {
$INC{'DBD/mysql.pm'} = "nothing to see here";
}
my $thing = new DBD::mysql;
$thing->blah()
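One detail worth knowing with this kind of stub: AUTOLOAD also receives the implicit DESTROY call when an object goes away, unless DESTROY is defined. The following is a sketch under strict and warnings, using a hypothetical package name My::Stub rather than DBD::mysql, that records intercepted calls instead of printing them.

```perl
use strict;
use warnings;

package My::Stub;    # hypothetical stand-in for the faked package

our $AUTOLOAD;       # Perl stores the fully qualified name of the missing sub here
my @calls;           # record of intercepted calls

sub new { return bless {}, shift }

sub AUTOLOAD {
    my ($caller_pkg) = caller(0);          # package the call came from
    push @calls, "$AUTOLOAD from $caller_pkg";
    return;
}

sub DESTROY { }      # defined explicitly so object destruction is not routed to AUTOLOAD

sub calls { return @calls }

package main;

BEGIN { $INC{'My/Stub.pm'} = __FILE__ }    # mark the module as already loaded
require My::Stub;                          # succeeds without touching the filesystem

my $thing = My::Stub->new;
$thing->blah;                              # method call, intercepted
My::Stub::other();                         # plain function call, intercepted too
print "$_\n" for My::Stub->calls;
```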

Related

When do Perl package variables fall out of scope?

From my main program I require a file containing a package, and then call a subroutine from that package:
while($somecondition){
require( 'people.pm' );
my $result = PERSON::stuff($args);
}
The PERSON package has multiple subs and some 'our' variables declared:
package PERSON;
our $name;
our ...
sub stuff {
...
}
In my understanding of other more object oriented languages you would need to declare a new object instance, maybe with its own constructor/initialization functions to use "package" variables. That doesn't seem to be the case here with Perl.
I'm dealing with legacy code so I don't want to change much, I just want to understand when the package variables ($name) come into existence, and when are they returned to memory from the perspective of the main program.
Would putting a PERSON::stuff() call after the while loop have new package variables?
After calling a single function inside a package do the package variables live until the end of the program?
The question mixes up some concepts so let's first address what appears to be the main issue: If a package is require'd inside some scope, what of it outside of that scope?
In short, (dynamical global) symbols from the package are accessible everywhere in the unit in which it is require'd, via their fully qualified names.†
Let's try with an example
use warnings;
use strict;
use feature 'say';
TEST_SCOPE: {
say "In scope, in ", __PACKAGE__;
require TestPack;
#hi(); # "Undefined subroutine &main::hi called..."
TestPack::hi(); # ok
#say $global; # $global ... who??
say $TestPack::global; # ok
say "Leaving scope\n";
};
say "--- in ", __PACKAGE__;
TestPack::hi(); # ok
say $TestPack::global; # ok
File TestPack.pm:
package TestPack;
use warnings;
use strict;
use feature 'say';
#use Exporter qw(import); # This is normally done to export symbols
#our @EXPORT_OK = qw(hi); # (unless the package is a class)
our $global = 7;
sub hi { say "hi from ", __PACKAGE__ }
1;
One needs to use fully qualified names for those symbols as they weren't imported. If the package exports symbols and we import some‡ then they go into the calling package's namespace so in the example above they'd be available in main::, so they can be accessed by any code in the interpreter by their exported names (hi, no need for TestPack::hi). One cannot access lexical variables from that package (created with my, our, state)§.
This also works if instead of the mere block (named TEST_SCOPE) we introduce another package, and require our TestPack inside of it.
...
package AnotherPack {
require TestPack;
...
1;
};
...
TestPack::hi(); # ok
...
(That package should be inside a BEGIN block really, which doesn't change the main point here.) Global symbols from TestPack are still accessible in main::, via their fully qualified names. The exported names, which we import along with require, are then available as such in this package, but not in main::.
Comments on the question
Package name (PERSON) and the filename for it (person.pm) have to agree. For example, the namespace (==package) Person is defined in the file Person.pm
This is about basics related to require-ing a package; it has nothing to do with object-oriented notions. (Even though, a class is firstly a package. See perlootut and perlobj.) Also see Packages in perlmod and our.
If you were to use a package that bless-es, the returned object (instance) is (assigned to) a lexical variable. As such, it doesn't exist outside of the scope. The package itself, though, is visible just as shown above but in object-oriented work we don't want to poke at a package's insides, but rather use methods via objects.
So yes, to work with that package outside of the scope in which it is require-ed you'd need to instantiate an object in that other scope as well. That would still work much like the example above -- we can use the package name, outside of scope in which it was required, to instantiate an object (try!), even though I'd raise questions of such design (see next)
This hints at a convoluted design though, bringing in packages inside scopes, at runtime (via require); what is the context? (And I hope it's not really in a while loop.)
† Print out the main's symbol table, %main:: (using Data::Dumper for example) and we find
"TestPack::" => *main::TestPack::
along with all other symbols from TestPack namespace.
‡ If a package exports symbols and we require the package then we can import by
require Pack::Name;
Pack::Name->import( qw(subname anothername ...) );
§ Note that our creates a lexical which is an alias for a package variable, which is accessible.
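As a runnable sketch of the require-then-import footnote, here is the same pattern with the core module List::Util standing in for Pack::Name (the choice of module and functions is only for illustration).

```perl
use strict;
use warnings;

# Load at run time; nothing is imported yet.
require List::Util;
print List::Util::sum(1, 2, 3), "\n";    # fully qualified name works right away

# Import explicitly, as use would have done at compile time.
List::Util->import(qw(sum max));

# Parentheses are needed here: the names were unknown when this file was compiled.
print sum(1, 2, 3), " ", max(4, 9, 2), "\n";
```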
zdim's answer gives a very good explanation of how package variables work and can be used. I don't think it directly answers the question of when they fall out of scope though.
Succinctly:
Package variables are global static variables, just namespaced so the "global" aspect isn't as terrible.
As with any static variable, they are in scope for the entire execution of the program.
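A small sketch of that lifetime, with a hypothetical Counter package defined in the same file so it runs on its own: the package variable keeps its value between calls and is still there after the loop ends.

```perl
use strict;
use warnings;

package Counter;               # hypothetical package, inline for the example

our $count = 0;                # package variable $Counter::count

sub bump { return ++$count }

package main;

for (1 .. 3) {
    Counter::bump();           # each call sees the value left by the previous one
}

# Still alive after the loop, reachable by its fully qualified name.
print "count after loop: $Counter::count\n";
```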
You also asked:
In my understanding of other more object oriented languages you would need to declare a new object instance, maybe with its own constructor/initialization functions to use "package" variables. That doesn't seem to be the case here with Perl.
Package variables are fairly unrelated to object-oriented programming in Perl. They are not used for storing instance data. (Except sometimes in the case of inside-out objects, though that's more of an advanced topic.)

Best way to add dynamic code to a perl application

I know specific instances of this question have been answered before:
How can I dynamically include Perl modules without using eval?
How do I use a Perl package known only in runtime?
There are also good answers at Perl Monks:
Writing a Perl module that dynamically loads other modules.
Creating subroutines on the fly
But I would like a robust way to add functionality to a Perl application that will be:
Efficient: if the code is not needed it should not be compiled.
Easy to debug: error reporting if something goes wrong at the dynamic code, should point at the right place at the dynamic code.
Easy to extend: adding new code should be as easy as adding a new file or directory+file.
Easy to invoke: the main application should be able to use an "add on" without much trouble. An efficient mechanism to check if the "add on" has already been loaded and if not load it, would be a plus.
To illustrate the point, here are some examples that would benefit from a good solution:
A set of scripts that move data from different applications. For instance, moving data from OpenCart to Prestashop, where each entity in the data model has a specific "add on" that deals with the input or output; then an intermediate data model takes care of the transformation of the data. This could be used to move data in any direction or even between different versions of the same ecommerce.
A web application that needs to render different types of HTML in different places. Each "module" knows how to handle a certain information and accepts parameters to do it. A module outputs HTML, another a list of documents, another a document, another a banner, and so on.
Here are some examples that I have used and that work.
Load a function at run time and output the possible compile errors:
eval `cat $file_with_function`;
if( $@ ) {
print STDERR $@, "\n";
die "Errors at file $file_with_function\n";
}
Or more robust using File::Slurp:
eval read_file("$file_with_function", binmode => ':utf8');
Check that a certain function has been defined:
if( !defined &myfunction ) {
die "myfunction is not defined\n";
}
The function may be called from there on. This is fine with one function, but not for many.
If the function is put in a module:
require $file_with_function; # needs the ".pm" extension, i.e. addon/func.pm
$name_of_module->import(); # need to know the module name, i.e. Addon::Func
$name_of_module->myfunction(...);
Where the require may be protected inside an eval and then use $@ as before.
With Module::Load:
load $name_of_module;
Followed by the import and used in the same way. Security should not be a concern as it may be assumed that the dynamic code comes from a trusted place. Are there better ways? Which way would be considered good practice?
In case it helps, I will be using the solution (among other places, but not exclusively) within the Dancer framework.
EDIT: Given the comments, I add some more info. All cases that I have in mind have in common:
There is more than one dynamic piece of code. Probably many to start with.
Each bit of code has the same interface.
Given the comments and the lack of responses, I have done some research to answer my own question. Comments or other answers are welcome!
Dynamic code
By dynamic code I mean code that is evaluated at run-time. In general, I consider it better to compile an application so that you have all the error checking the Perl compiler can offer before starting to execute. Together with use strict and use warnings, you can catch many common mistakes that way. So why use dynamic code at all? These are the reasons I consider:
An application performs many different actions that are chosen depending on the context of execution. For instance, an application extracts certain properties from a file. The way to extract them depends on the file type and we want to deal with many file types, but we do not want to change the application for each new file type we add. We also want the application to start quickly.
An application needs to be expanded on the fly in a way that does not require the application to restart.
We have a large application that contains a number of features. When we deploy the application, we do not want to provide all the possible features all the time, maybe because we licence them separately, maybe because not all of them are able to run under all platforms. By throwing in only the files with the features we want, we have a distribution that does not require changing any code or config files.
How do we do it?
Given the possibilities that Perl offers, solutions to adding dynamic code come in two flavors: using eval and using require. Then there are modules that may help do things in an easier or more maintainable way.
The quick and dirty way
The eval way uses the form eval EXPR to compile a piece of Perl code at run-time. The expression could be a string but I suggest putting the code in a file and grouping other similar files in a convenient place. Then, if possible using File::Slurp:
eval read_file("$file_with_code", binmode => ':utf8');
if( $@ ) {
die "$file_with_code: error $@\n";
}
if( !defined &myfunction ) {
die "myfunction is not defined at $file_with_code\n";
}
Specifying the character set to read_file makes sure that the file will be interpreted correctly. It is also good to check that the compilation was correct and that the function we expect was defined. So in $file_with_code, we will have:
sub myfunction(...) {
# Do whatever; maybe return something
}
Then you may invoke the function normally. The function will be a different one depending on which file was loaded. Simple and dynamic.
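For the "easy to debug" requirement, one option with the eval approach is Perl's #line directive, which makes errors inside the evaluated string report the add-on's file name and line numbers. This is only a sketch: the file name and the add-on source are hypothetical, and the source is held in a string here so the example runs without extra files.

```perl
use strict;
use warnings;

# Hypothetical add-on; in practice $source would be read from $file_with_code.
my $file_with_code = 'addons/broken_addon.pl';
my $source = <<'END_CODE';
sub myfunction {
    die "something went wrong\n" if !@_;
    return "called with @_";
}
die "failure while loading";
END_CODE

# The "#line" directive makes Perl report positions relative to the add-on file.
my $ok    = eval qq{#line 1 "$file_with_code"\n$source; 1};
my $error = $ok ? '' : $@;

print "load error: $error" if !$ok;
```

Note that myfunction is still defined even though loading failed, because named subs are installed when the string is compiled; checking the result of eval is therefore more telling than checking defined &myfunction alone.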
The modular way (recommended)
The way I would do it with maintainability in mind would be using require. Unlike use, that is evaluated at compile-time, require may be used to load a module at run-time. Out of the various ways to invoke require, I would go for:
my $mymodule = 'MyCompany::MyModule'; # The module name ends up in $mymodule
(my $file = "$mymodule.pm") =~ s{::}{/}g; # require EXPR expects a file name, not a package name
require $file;
Also unlike use, require will load the module but will not call import. So we may use any function inside the module, and those function names will not pollute the calling namespace. To access the function we will need to use:
$mymodule->myfunction($a, $b);
See below as to how the arguments get passed. This way of invoking a function will add an argument before $a and $b that is usually named $self. You may ignore it if you don't know anything about object orientation.
As require will try to load a module and the module may not exist or it may not compile, to catch the error it will be better to use:
eval "require $mymodule";
Then $@ may be used to check for an error in the loading+compiling process. We may also check that the function has been defined with:
if( !$mymodule->can('myfunction') ) {
die "myfunction is not defined at module $mymodule\n";
}
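The loading and checking steps above can be combined into one helper. This is a sketch, not a definitive implementation: load_addon is a hypothetical name, and the core module File::Basename stands in for a real add-on so that the example runs as is.

```perl
use strict;
use warnings;

# Load a module by name at run time and return a reference to one of its functions.
sub load_addon {
    my ($module, $function) = @_;

    # require EXPR wants a relative file name, so turn Some::Module into Some/Module.pm.
    (my $file = "$module.pm") =~ s{::}{/}g;

    # A repeated require is cheap: Perl consults %INC and does not reload the file.
    eval { require $file; 1 }
        or die "Cannot load $module: $@";

    my $code = $module->can($function)
        or die "$function is not defined in module $module\n";

    return $code;
}

my $basename = load_addon('File::Basename', 'basename');
print $basename->('/tmp/example.txt'), "\n";
```

Returning a code reference lets the caller invoke the function directly, without the extra class-name argument that $mymodule->myfunction(...) would pass.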
In this case we will need to create a directory for the modules and a file with the .pm extension for each one:
MyCompany
MyModule.pm
Inside MyModule.pm we will have:
package MyCompany::MyModule;
sub myfunction {
my ($self, $a, $b) = @_;
# Do whatever; maybe return something
# $self will be 'MyCompany::MyModule'
}
1;
The package bit is essential and will make sure that whatever definitions we put inside will be at the MyCompany::MyModule namespace. The 1; at the end will tell require that the module initialization was correct.
In case we wanted to implement the module by using other libraries whose imported names could pollute the module's namespace (and so become callable through it), we could use the namespace::clean module. It removes the functions imported or defined before the use namespace::clean line from the package's namespace once compilation finishes, so the code doing the require does not see them. It is used in this way:
package MyCompany::MyModule;
# Definitions by these modules will not be available to the code doing the require
use Library1 qw(def1 def2);
use Library2 qw(def3 def4);
...
# Private functions go here and will not be visible from the code doing the require
sub private_function1 {
...
}
...
use namespace::clean;
# myfunction will be available
sub myfunction {
# Do whatever; maybe return something
}
...
1;
What happens if we include a module more than once?
The short answer is nothing. Perl keeps track of which modules have been loaded, and from where, using the %INC variable. Neither use nor require will load a library twice. A repeated use will still call import and add any exported names to the caller's namespace; require does not do that. In case you want to check whether a module has been loaded already, you could use %INC or, better yet, Module::Loaded, which is part of the core in modern Perl versions:
use Module::Loaded;
if( !is_loaded( $mymodule ) ) {
eval "require $mymodule";
...
}
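To see the %INC bookkeeping directly, here is a short sketch that uses the core module Text::Wrap purely as an example of something not loaded yet.

```perl
use strict;
use warnings;

my $key = 'Text/Wrap.pm';     # %INC is keyed by relative file name, not package name

my $before = exists $INC{$key} ? 1 : 0;
require Text::Wrap;           # first require: file is located, compiled and executed
my $after  = exists $INC{$key} ? 1 : 0;
require Text::Wrap;           # second require: %INC already has the key, nothing is reloaded

print "loaded before: $before, after: $after\n";
print "loaded from: $INC{$key}\n";
```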
How do I make sure Perl finds my module files?
For use and require Perl uses the @INC variable to define the list of directories that will be used to look for libraries. Adding a new directory to it may be achieved (among other ways) by adding it to the PERL5LIB environment variable or by using:
use lib '/the/path/to/my/libs';
Helper libraries
I have found some libraries that may be used to make the code that uses the dynamic mechanism more maintainable. They are:
The if module: will load a module or not depending on a condition: use if CONDITION, MODULE => ARGUMENTS;. With no if CONDITION, MODULE => ARGUMENTS; it conditionally calls the module's unimport instead.
Module::Load::Conditional: will not die on you while trying to load a module and may also be used to check the module version or its dependencies. It is also able to load a list of modules all at once even checking their versions before doing so.
Taken from the Module::Load::Conditional documentation:
use Module::Load::Conditional qw(can_load);
my $use_list = {
CPANPLUS => 0.05,
LWP => 5.60,
'Test::More' => undef,
};
print can_load( modules => $use_list )
? 'all modules loaded successfully'
: 'failed to load required modules';

Perl file changes package between / inside functions

I'm looking at some of our Perl codebase and am puzzled by the use of package in some files.
We have a file containing some useful functions, functions.pl, which is laid out roughly like this:
package functions;
use strict;
sub function_a {
# code here
}
sub function_b {
# code here
}
package main;
sub function_c {
my ($arguments, $for, $this, $function) = @_;
package functions;
# Actual function code here.
}
(Function and package names changed, obviously.)
Functions in this file are used in other scripts by require 'functions.pl' and then calling &function_c() - since the scripts where function_c is called do not declare a package, presumably they're in the main namespace so don't have to prepend anything to function_c when calling it.
function_a and function_b aren't used outside this file, so presumably keeping the main body of function_c back in the non-main namespace means that code in there doesn't have to prepend functions:: to any calls to them.
Does anyone know why someone might write a script to be require'd in this way, rather than writing it as a module and explicitly importing certain functions?
And I know that there's more than one way to do it in Perl, but is package really supposed to be switched around in one file whenever you feel like it like this?
Technically, there's nothing wrong with the code. The package declaration can indeed be used to "switch around" the current package like that.
That said, it's certainly not the standard or generally recommended way to do this; as you note, that would be to turn the script into a module and (optionally) export the public functions into the namespace where the module is used.
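A runnable sketch of what the layout in the question does, with hypothetical names (Helpers, helper, wrapper): the sub is named in main, but a package statement inside its body changes how unqualified names are resolved until the end of the enclosing block.

```perl
use strict;
use warnings;

package Helpers;

sub helper { return "helper called" }

package main;

# wrapper() is installed in main because that is the current package at "sub wrapper".
sub wrapper {
    package Helpers;      # in effect only until the end of this block
    return helper() . " in " . __PACKAGE__;   # resolves to Helpers::helper
}

print wrapper(), "\n";
print "back in ", __PACKAGE__, "\n";
```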
One practical use of multiple package declarations is in OO code, where you may want to define multiple classes in one file, e.g. like this:
package MyClass;
# ... MyClass methods here ...
package MyClass::Helper {
# ... helper class methods here ...
}
# ... more MyClass methods here ...
or, in older Perl versions (< 5.14):
package MyClass;
# ... MyClass methods here ...
{
package MyClass::Helper;
# ... helper class methods here ...
}
# ... more MyClass methods here ...
One reason could be that the script was originally designed to import functions from a different set of files (e.g. function.pl, function1.pl, function2.pl) based on user input or other conditions.
This entails importing the functions at run time, hence 'require $function', where $function could be function.pl or function1.pl.
Another reason could be that the person was not aware of modules at that time ;)

Difference between use and require (I listed the differences, need to know what else are there)

I have read the explanations in perldoc and on StackOverflow, but I am still a little confused.
use normally loads the module at compile time, whereas require does so at run time.
use calls the module's import function automatically, whereas with require we need to call import separately, like
BEGIN {
require ModuleName;
ModuleName->import;
}
require is used if we want to load bigger modules only occasionally.
use throws the exception early, at compile time, whereas require throws it only when the statement is reached at run time.
With use we can selectively import only a few procedures instead of all of them, like
use Module qw(foo bar) # it will load foo and bar only
is it possible in require also?
Besides that, are there any other differences between use and require?
There is a lot of discussion on Google, but I understood only the points mentioned above.
Please help me with the other points.
This is sort of like the differences between my, our, and local. The differences are important, but you should be using my 99% of the time.
Perl is a fairly old and crufty language. It has evolved over the years from a combination awk/shell/kitchen sink language into a stronger typed and more powerful language.
Back in Perl 3.x days before the concept of modules and packages solidified, there was no concept of modules having their own namespace for functions and variables. Everything was available everywhere. There was nothing to import. The use keyword didn't exist. You always used require.
By the time Perl 5 came out, modules had their own storage for variable and subroutine names. Thus, I could use $total in my program, and my Foo::Bar module could also use $total because my $total was really $main::total and their $total was really $Foo::Bar::total.
Exporting was a way to make variables and subroutines from a module available to your main program. That way, you can say copy( $file, $tofile); instead of File::Copy::copy( $file, $tofile );.
The use keyword simply automated stuff for you. Plus, use ran at compile time before your program was executed. This allows modules to use prototyping, so you can say foo( @array ) instead of foo( \@array ) or munge $file; instead of munge( $file );
As it says in the use perldoc's page:
It [use] is exactly equivalent to:
BEGIN { require Module; Module->import( LIST ); }
Basically, you should be using use over require 99% of the time.
I can only think of one occasion where you need to use require over use, but that's only to emulate use. There are times when a module is optional. If Foo::Bar is available, I may use it, but if it's not, I won't. It would be nice if I could check whether Foo::Bar is available.
Let's try this:
eval { use Foo::Bar; };
my $foo_bar_is_available = 1 unless ($@);
If Foo::Bar isn't available, I get this:
Can't locate Foo/Bar.pm in @INC (@INC contains:....)
That's because use happens BEFORE I can run eval on it. However, I know how to emulate use with require:
BEGIN {
eval { require Foo::Bar; Foo::Bar->import( qw(foo bar barfu) ); };
our $foo_bar_module_available = 1 unless ($@);
}
This does work. I can now check for this in my code:
our $foo_bar_module_available;
if ( $foo_bar_module_available ) {
fubar( $var, $var2 ); #I can use it
}
else {
... #Do something else
}
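A variation on the same idea, offered as a sketch: instead of inspecting $@ afterwards, let the eval block end in a true value and test what eval returns. The core module List::Util stands in for the optional Foo::Bar here so that the example runs anywhere.

```perl
use strict;
use warnings;

my $have_list_util;
BEGIN {
    # The block's final "1" is returned only if require and import both succeed,
    # so the result does not depend on the state of $@ afterwards.
    $have_list_util = eval {
        require List::Util;
        List::Util->import('max');
        1;
    } ? 1 : 0;
}

if ($have_list_util) {
    print "largest: ", max(3, 7, 5), "\n";
}
else {
    print "List::Util is not available\n";
}
```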
I think the code you wrote yourself in the second point is self-explanatory about the difference between the two ...
In practice, "use" performs a "require" of the module and then automatically imports it; with "require" the module only has to be present, and you are free to import from it when you need to ...
Given the above, the question in point 5 makes little sense: since "require" doesn't import anything, there is nothing to select at that stage; you select the parts you need when you call import ...
Furthermore, bear in mind that while "use" acts at compile time (the Perl compilation phase), "require" acts at run time; for this reason, with "require" you can load the package only if and when it is really needed.
Difference between use and require:
With "use" there is no need to give the file extension. Ex: use server_update_file.
With "require" and a file name we need to give the file extension. Ex: require "server_update_file.pm";
"use" method is used only for modules.
"require" method is used for both libraries and modules.
Refer the link for more information: http://www.perlmonks.org/?node_id=412860
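The two spellings of require can be compared side by side. In this sketch the core module File::Basename is used only as a convenient example.

```perl
use strict;
use warnings;

require File::Basename;        # bareword: "::" becomes "/" and ".pm" is appended
require "File/Basename.pm";    # string: taken as a file name and searched for in @INC

# Both forms register the same %INC key, so the file is loaded only once.
my @keys = grep { m{^File/Basename} } keys %INC;
print "@keys\n";
```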

Prevent multiple inclusions in perl

Suppose I have two files: a module file that looks like this:
package myPackage;
use Bio::Seq;
and another file that looks like this:
use lib "path/to/lib";
use myPackage;
use Bio::Seq;
How can i prevent that Bio::Seq is included twice? Thanx
It won't be included twice. use semantics could be described like that:
require the module
call module's import
As the documentation says, it's equivalent to:
BEGIN { require Module; Module->import( LIST ); }
The require mechanism, on the other hand, ensures that a module's code is compiled and executed only once, the first time something requires it. This mechanism is based on the special variable %INC. You can find further details in the documentation for use, require, and in the perlmod page.
use Foo
is mostly equivalent to
# perldoc -f use
BEGIN {
require "Foo.pm";
Foo->import();
}
And require "Foo.pm" is mostly equivalent to
# perldoc -f require
sub require {
my ($filename) = @_;
if (exists $INC{$filename}) {
return 1 if $INC{$filename};
die "Compilation failed in require";
}
# .... find $filename in @INC
# really load
return do $realfilename;
}
So
No, the code won't be "Loaded" more than once, only "imported" more than once.
If you have code such as
package Bio::Seq;
...
sub import {
# fancy stuff
}
And you wanted to make sure a library was loaded, but not call import on it,
#perldoc -f use
use Bio::Seq ();
Modules aren't "included" in Perl like they are in C. They are "loaded", by which I mean "executed".
A module will only be loaded/executed once, no matter how many use statements specify it.
The only thing that happens for every use of a module is the call to the module's import method. That is typically used to export symbols to the using namespace.
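The difference between loading and importing can be observed with a throwaway module. In this sketch LoadDemo is a hypothetical module written to a temporary directory at run time, and explicit require plus import calls stand in for repeated use statements.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Write a tiny module to a temporary directory (removed again at exit).
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/LoadDemo.pm" or die "Cannot write module: $!";
print {$fh} <<'END_MODULE';
package LoadDemo;
my $loads   = 0;
my $imports = 0;
$loads++;                      # file body: runs only when the file is actually executed
sub import  { $imports++ }     # runs on every use or explicit import call
sub loads   { $loads }
sub imports { $imports }
1;
END_MODULE
close $fh or die "Cannot close module file: $!";

unshift @INC, $dir;

# Equivalent to three "use LoadDemo;" statements, but at run time.
for (1 .. 3) {
    require LoadDemo;
    LoadDemo->import;
}

printf "loads=%d imports=%d\n", LoadDemo->loads, LoadDemo->imports;
```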
I guess, you want to optimize the loading(usage) of Module.
For optimizing, dynamic loading may be helpful.
For dynamically loading a Perl Module, we use Class::Autouse.
For more details you can visit this link.
I guess the OP may look for a way of avoiding a long list of use statement boilerplate at the beginning of his/her Perl script. In this case, I'd like to point everyone to Import::Into. It works like the keyword import in Java and Python. Also, this blog post provides a wonderful demo of Import::Into.