How do I dynamically discover packages from a partial namespace in Perl?

I have a directory structure that looks like:
Foo::Bar::Baz::1
Foo::Bar::Baz::2 etc
Can I list the packages from something like:
use Foo::Bar::Baz;
Thanks!
Edit: Made it more clear what the modules are.

If you want to load all modules in your include path with a certain prefix (e.g. everything under a::b::c), you can use Module::Find.
For example:
use Module::Find 'useall';
my @loaded = useall 'Foo::Bar::Baz'; # loads everything under Foo::Bar::Baz
This depends on your @INC path being set up with the necessary directories, so do any required manipulation (e.g. with use lib) first.

Normally a script such as a/b/c.pl won't have a namespace other than main. Perhaps you are thinking of discovering modules with names such as a/b/c.pm (which is a bad name, since lower-cased package names are generally reserved for Perl internals).
However, given a directory path, you can look for potential Perl modules using File::Find:
use strict;
use warnings;
use File::Find;
use Data::Dumper;
my @modules;
sub wanted
{
push @modules, $_ if m/\.pm$/
}
find(\&wanted, 'A/B');
print "possible modules found:\n";
print Dumper(\@modules);

This might be overkill, but you can inspect the symbol table before and after loading the module and see what changed:
use strict; use warnings;
my %original = map { $_ => 1 } get_namespaces("::");
require Inline;
print "New namespaces since 'require Inline' call are:\n";
my @new_namespaces = sort grep !defined $original{$_}, get_namespaces("::");
foreach my $new_namespace (@new_namespaces) {
print "\t$new_namespace\n";
}
sub get_namespaces {
# recursively inspect symbol table for known namespaces
my $pkg = shift;
my @namespace = ();
my %s = eval "%" . $pkg;
foreach my $key (grep /::$/, keys %s) {
next if $key eq "main::";
push @namespace, "$pkg$key", get_namespaces("$pkg$key");
}
return @namespace;
}
New namespaces since 'require Inline' call are:
::AutoLoader::
::Config::
::Digest::
::Digest::MD5::
::Dos::
::EPOC::
::Exporter::
::Exporter::Heavy::
::File::
::File::Spec::
::File::Spec::Cygwin::
::File::Spec::Unix::
::File::Spec::Win32::
::Inline::Files::
::Inline::denter::
::Scalar::
::Scalar::Util::
::Socket::
::VMS::
::VMS::Filespec::
::XSLoader::
::vars::
::warnings::register::

Just to be clear, are you looking at random packages in random Perl code?
Or for Perl modules, e.g. "a/b/c/d1.pm" with module "a::b::c::d1"?
In either case, you can not use a single "use" statement to load them all.
What you need to do is to find all the appropriate files, using either glob or File::Find.
In the first case (modules), you can then load them either by require-ing each file, OR by converting filename into module name (s#/#::#g; s#\.pm$##;) and calling use on each module individually.
As far as actual packages nested in random Perl files, those packages can be:
Listed by grepping each file (again, found via glob or File::Find) for /^package (.*);/
Actually loaded by executing require $file for each file.
In this case, please note that the package name for each of those packages in a/b/c/1.pl will NOT need to be related to "a::b::c" - e.g. they CAN be named by the file author "p1", "a::p1" or "a::b::c::p1_something".
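For the module case, the find-then-convert steps above can be sketched with core modules only. This is a rough sketch, not a replacement for Module::Find: the helper name modules_under is made up for illustration, and it only locates files, it does not load them.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;

# Hypothetical helper: return the sorted names of modules found under
# $prefix in the given include directories. Nothing is loaded.
sub modules_under {
    my ($prefix, @inc_dirs) = @_;
    my @parts = split /::/, $prefix;
    my %seen;
    for my $inc (grep { !ref } @inc_dirs) {    # skip @INC hooks (references)
        my $base = File::Spec->catdir($inc, @parts);
        next unless -d $base;
        find({
            no_chdir => 1,                     # $_ holds the full path
            wanted   => sub {
                return unless -f $_ && /\.pm\z/;
                # Path relative to the include dir, e.g. Foo/Bar/Baz/One.pm
                my $rel = File::Spec->abs2rel($_, $inc);
                $rel =~ s/\.pm\z//;
                $seen{ join '::', File::Spec->splitdir($rel) } = 1;
            },
        }, $base);
    }
    return sort keys %seen;
}

print "$_\n" for modules_under('Foo::Bar::Baz', @INC);
```

Each name it returns can then be passed to require after converting it back to a file name, as described above.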

Related

perl: change @INC for current scope only

Modifying Perl's @INC array for an individual scope seems very confusing. I would like some clarification, as it seems to be fighting any means of dynamic initialization of objects.
One would think that I could define it as local to solve this problem.
According to the manual, "local modifies the listed variables to be local to the enclosing block, file, or eval."
The part that is annoying me is the "or" portion.
Problem: I would like to change the @INC array to include one and ONLY one directory under certain circumstances and ONLY for the current file.
Example attempt and issues:
Lets say I have a launching script index.pl:
#!/usr/bin/perl
use strict;
use warnings FATAL => 'all';
use File::Basename;
# Lets say I want to modify @INC here to look in ONLY one path. Local
# should allow us to declare for one scope or file (how non explicit this
# is annoys me) Since I have not defined a scope with brackets, it should
# be effective for the current file
local @INC = (dirname(__FILE__) . '/foo/'); # some relative path
# Lets say bar now uses standard perl modules
require 'bar.pm';
# ^- This will fail because local did not work as described, fails at use
# XML::Simple because it is traversing foo
my $bar = bar->new();
For the sake of being comprehensive, here is a bar.pm:
package bar;
use strict;
use warnings;
sub new
{
my $class = shift;
my $self = bless {}, $class;
use XML::Simple;
return $self;
}
1;
Is there any way to modify @INC ONLY for the current file while leaving it intact in all parsed files afterwards?
(I know I can unshift, but eventually there could be dozens of directories it could be traversing)
require(dirname(__FILE__) . '/foo/bar.pm');
use File::Basename;
use subs 'require';
sub require {
my $module_file = shift;
die "unexpected absolute path $module_file\n" if $module_file =~ /^\//;
CORE::require(dirname(__FILE__) . '/foo/' . $module_file);
}
See http://perldoc.perl.org/CORE.html#OVERRIDING-CORE-FUNCTIONS
local @INC works, but your bar.pm file still needs to be able to find XML/Simple.pm (a use statement is executed at the time a file is compiled, no matter where it appears in the file), presumably from the original @INC, so your local @INC should start with a copy of the original @INC.
{
local @INC = (dirname(__FILE__) . '/foo/', @INC);
require 'bar.pm';
} # local @INC out of scope now, original @INC restored
my $bar = bar->new();
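To see why "current file only" does not fall out of local: local gives a variable a temporary value for the dynamic extent of the block, so anything called (or required) from inside the block sees the modified @INC too. A small sketch, assuming a placeholder directory /hypothetical/foo and a sub first_inc_dir that stands in for code living in a required file:

```perl
use strict;
use warnings;

my @original = @INC;

# Stands in for code in another file that is loaded from inside the block.
sub first_inc_dir { return $INC[0] }

my $seen_inside;
{
    # '/hypothetical/foo' is a placeholder directory for illustration
    local @INC = ('/hypothetical/foo', @INC);
    $seen_inside = first_inc_dir();    # the callee sees the localized @INC
}

print "inside the block, callee saw: $seen_inside\n";
print "after the block, \@INC is restored\n" if "@INC" eq "@original";
```

So local limits the change in time (until the block exits), not to the text of one file, which is why bar.pm's own use statements are searched for in the localized @INC.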

Exporting subroutines from a module 'used' via a 'require'

I'm working with a set of perl scripts which our build system is written with. Unfortunately they were not written as a set of modules, but instead a bunch of .pl files which 'require' each other.
After making some changes to a 'LogOutput.pl' which was used by almost every other file, I started to suffer from some issues caused by the file being 'require'd multiple times.
In an effort to fix this, while not changing every file (some of which are not under my direct control), I did the following:
-Move everything in LogOutput.pl to a new file LogOutput.pm, this one having everything needed to make it a module (based on reading http://www.perlmonks.org/?node_id=102347 ).
-Replace the existing LogOutput.pl with the following
BEGIN
{
use File::Spec;
push @INC, File::Spec->catfile($BuildScriptsRoot, 'Modules');
}
use COMPANY::LogOutput qw(:DEFAULT);
1;
This works, except that I need to change calling code to prefix the sub names with the new package (i.e. COMPANY::LogOutput::OpenLog instead of just OpenLog)
Is there any way for me to export the new module's subroutine's from within LogOutput.pl?
The well named Import::Into can be used to export a module's symbols into another package.
use Import::Into;
# As if Some::Package did 'use COMPANY::LogOutput'
COMPANY::LogOutput->import::into("Some::Package");
However, this shouldn't be necessary. Since LogOutput.pl has no package, its code is in the package it was required from. use COMPANY::LogOutput will export into the package which required LogOutput.pl. Your code, as written, should work to emulate a bunch of functions in a .pl file.
Here's what I assume LogOutput.pl looked like (using the subroutine "pass" as a stand in for whatever subroutines you had in there)...
sub pass { print "pass called\n" }
1;
And what I assume LogOutput.pl and LogOutput.pm look like now...
# LogOutput.pl
BEGIN
{
use File::Spec;
push @INC, File::Spec->catfile($BuildScriptsRoot, 'Modules');
}
use COMPANY::LogOutput qw(:DEFAULT);
1;
# LogOutput.pm
package test;
use strict;
use warnings;
use Exporter "import";
our @EXPORT_OK = qw(pass);
our %EXPORT_TAGS = (
':DEFAULT' => [qw(pass)],
);
sub pass { print "pass called\n" }
1;
Note this will not change the basic nature of require. A module will still only be required once, after that requiring it again is a no-op. So this will still not work...
{
package Foo;
require "test.pl"; # this one will work
pass();
}
{
package Bar;
require "test.pl"; # this is a no-op
pass();
}
You can make it work. Perl stores the list of files that have been required in %INC. If you delete an entry, Perl will load the file again. However, you have to be careful that all the code in the .pl file is OK with this. That @INC hack has to make sure it's only run once.
BEGIN
{
use File::Spec;
# Only run this code once, no matter how many times this
# file is loaded.
push @INC, File::Spec->catfile($BuildScriptsRoot, 'Modules')
unless $LogOutput_pl::only_once++;
}
use COMPANY::LogOutput qw(:DEFAULT);
# Allow this to be required and functions imported more
# than once.
delete $INC{"LogOutput.pl"};
1;
This is one of the few cases that a global variable is justified. A lexical (my) variable must be declared and would be reset with each loading of the library. A global variable does not need to be declared and will persist between loading.
This turned out to just be a stupid mistake on my part, I didn't put the subs into the @EXPORT list, only into @EXPORT_OK.
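For anyone hitting the same thing, here is a small self-contained sketch of the difference. The package My::LogOutput and the subs OpenLog and CloseLog are stand-ins made up for illustration (the real module is not shown in the question), and the module is defined inline so the example runs as a single file.

```perl
use strict;
use warnings;

BEGIN {
    package My::LogOutput;           # hypothetical stand-in for COMPANY::LogOutput
    use Exporter 'import';
    our @EXPORT    = qw(OpenLog);    # exported by default, and by the :DEFAULT tag
    our @EXPORT_OK = qw(CloseLog);   # exported only when asked for by name
    sub OpenLog  { return "log opened" }
    sub CloseLog { return "log closed" }
    $INC{'My/LogOutput.pm'} = __FILE__;   # mark as loaded so 'use' skips the file search
}

package Caller::Default;
use My::LogOutput qw(:DEFAULT);      # imports what is in @EXPORT only

package Caller::Explicit;
use My::LogOutput qw(CloseLog);      # imports exactly what is named

package main;
for my $pkg (qw(Caller::Default Caller::Explicit)) {
    for my $name (qw(OpenLog CloseLog)) {
        printf "%-16s %-8s %s\n", $pkg, $name,
            $pkg->can($name) ? "imported" : "not imported";
    }
}
```

The :DEFAULT tag is built into Exporter and always refers to @EXPORT, so subs listed only in @EXPORT_OK are never pulled in by it.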

Perl portable serialization only with CORE modules

Is there any portable serialization method/module that is included in the CORE modules? I know there is Storable, but it is not truly portable nor "cross-platform-standardized". I am looking for something like YAML, JSON, XML or similar...
I already checked http://perldoc.perl.org/index-modules-T.html - but maybe I missed something.
Motivation: I want to make a simple Perl script that will work with any perl (without CPAN) and can read some configuration (and data) from a file. Using require with the Data::Dumper format is not very "user friendly"...
So possible solutions:
include something like YAML directly to my script (can be a solution, but...)
forcing users to install CPAN modules (not a solution)
use native perl and require - not very userfriendly syntax (for a non-perl users)
Any other suggested solution?
PS: I understand the need to keep the core as small as is reasonable, but maybe reading data in some standardized format should be in the core...
There is a YAML parser and serializer bundled with Perl, hidden away. It's called CPAN::Meta::YAML. It only handles a subset of YAML, but that may be sufficient for your purposes.
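A minimal round-trip sketch with CPAN::Meta::YAML, assuming a perl recent enough to bundle it (5.14 or later, as far as I know) and data that stays within the YAML subset it supports. The $config structure is made up for illustration.

```perl
use strict;
use warnings;
use CPAN::Meta::YAML;

# Example data, invented for this sketch
my $config = {
    name   => 'Bob',
    phones => [ '555-1234', '555-2345' ],
};

# Serialize: the object is an array of YAML documents, here just one.
my $text = CPAN::Meta::YAML->new($config)->write_string;
print $text;

# Parse it back; the first document is element 0.
my $loaded = CPAN::Meta::YAML->read_string($text)->[0];
print "name: $loaded->{name}\n";
```

The module's documentation describes it as intended for CPAN metadata files, so treat it as a convenience for simple configuration rather than a general YAML implementation.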
You can configure Data::Dumper's output to be JSON-like. For example:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Pair = ': ';
$Data::Dumper::Terse = 1;
$Data::Dumper::Useqq = 1;
$Data::Dumper::Indent = 1;
my $structure = {
foo => 'bar',
baz => {
quux => 'duck',
badger => 'mythical',
}
};
print Dumper( $structure );
This prints:
{
"baz": {
"quux": "duck",
"badger": "mythical"
},
"foo": "bar"
}
That might get you most of the way towards interoperability? The module does have a bunch of options for controlling / changing output e.g. the Freezer and Toaster options.
Can you explain to me the problem with Storable again? If you look at perlport, after a discussion of big-endianness and little-endianness, it concludes:
One can circumnavigate both these problems in two ways. Either transfer and store numbers always in text format, instead of raw binary, or else consider using modules like Data::Dumper and Storable (included as of perl 5.8). Keeping all data as text significantly simplifies matters.
So, Storable is universal for storing and retrieving data in Perl, and it's not only easy to use, but it's a standard Perl module.
Is the issue that you want to be able to write the data without having a Perl program do it for you? You could simply write your own Perl module. In most Perl installations, that module could be placed in the same directory as your program.
package Some_data; # Can be put in the same directory as the program like a config file
our $data; # Module variable which makes it accessible to your local program
$data = {}; # I am making this complex data structure...
$data->{NAME}->{FIRST} = "Bob";
$data->{NAME}->{LAST} = "Smith";
$data->{PHONE}->[0]->{TYPE} = "C";
$data->{PHONE}->[0]->{NUMBER} = "555-1234";
$data->{PHONE}->[1]->{TYPE} = "H";
$data->{PHONE}->[1]->{NUMBER} = "555-2345";
# Or use Subroutines
sub first {
return "Bob";
}
sub last {
return "Smith"
}
...
Now you can include this in your program:
use Some_data;
my $first_name = $Some_data::data->{NAME}->{FIRST}; # As a hash of hashes
# OR
my $first_name = Some_data::first; # As a constant
The nice thing about the subroutines is that you can't change the data in your program. They're constants. In fact, that's exactly how Perl constants work too.
Speaking about constants. You could use use constant too:
package Some_data;
use constant {
FIRST => "Bob",
SECOND => "Smith",
};
And in your program:
use strict;
use warnings;
use Some_data;
my $first_name = &Some_data::FIRST; # Note the use of the ampersand!
Not quite as clean because you need to prefix the constant with an ampersand. There are ways of getting around that ampersand, but they're not all that pretty.
Now, you have a way of importing your data in your program, and it's really no harder to maintain than a JSON data structure. There's nothing your program has to do except to use Module; to get that data.
One final possibility
Here's one I've done before. I simply have a configuration file that looks like what you'd put on the command line, then use Getopt::Long to pull in the configuration:
Configfile
-first Bob -last Smith
-phone 555-1212
NOTE: It doesn't matter if you put it all on one line or not:
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromString);
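my $param_file = "Configfile"; # path to the configuration file shown above
open my $param_fh, "<", $param_file or die "Can't open $param_file: $!";
my @parameters = <$param_fh>;
close $param_fh;
my $params = join " ", @parameters; # One long string
my ( $first, $last, $phone );
GetOptionsFromString ( $params,
"first=s" => \$first,
"last=s" => \$last,
"phone=s" => \$phone,
);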
You can't get easier to maintain than that!

Finding the path to a Perl module upon loading

I am using a legacy Perl application which loads a library, call it "blah". I need to know where does "blah" resides in my file system.
I am not familiar at all with Perl, and I wonder what is the equivalent way to print the path to the module, along the lines of the special variable __file__ in Python. In other words, the Perl equivalent of the following Python script:
import blah
print blah.__file__
Many thanks.
use blah;
print $INC{'blah.pm'};
use Blah1::Blah2::blah;
print $INC{'Blah1/Blah2/blah.pm'};
The case is significant, even on Windows. use Blah will create an entry for $INC{'Blah.pm'} and use blah will create an entry for $INC{'blah.pm'}.
C:\>perl -MList::util -e "print join $/, keys %INC"
XSLoader.pm
Carp.pm
warnings/register.pm
Exporter.pm
vars.pm
strict.pm
List/util.pm
warnings.pm
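If you'd rather not write the %INC key by hand, the name-to-key conversion can be wrapped in a small helper. This is just a sketch; module_path is a name invented for the example, and List::Util is used only because it ships with Perl.

```perl
use strict;
use warnings;
use List::Util ();

# Hypothetical helper: turn Some::Module into the key Some/Module.pm
# and look it up in %INC. Returns undef if the module was not loaded
# under exactly that name (the lookup is case-sensitive).
sub module_path {
    my ($module) = @_;
    (my $key = $module) =~ s{::}{/}g;
    return $INC{"$key.pm"};
}

my $path = module_path('List::Util');
print "List::Util was loaded from $path\n";
```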
To expand on my comment on mob's answer, try a more loose use of %INC to help you:
#!/usr/bin/perl
use strict;
use warnings;
use blah;
foreach (keys %INC) {
if (m'blah.pm') {
print "$_ => $INC{$_}\n";
}
}
The relevant perldoc perlvar on the subject says
%INC
The hash %INC contains entries for each filename included via the do, require, or use operators. The key is the filename you specified (with module names converted to pathnames), and the value is the location of the file found. The require operator uses this hash to determine whether a particular file has already been included.
If the file was loaded via a hook (e.g. a subroutine reference, see require for a description of these hooks), this hook is by default inserted into %INC in place of a filename. Note, however, that the hook may have set the %INC entry by itself to provide some more specific info.
If even that doesn't help, you may, as the documentation above suggests, read about the require command to understand how the file is being loaded in the first place. This should help you trace it back, perhaps by iterating through @INC, which lists the directories Perl searches for files to be required.
I found the following one-liner which solved my problem:
$ perl -MList::Util -e'print $_ . " => " . $INC{$_} . "\n" for keys %INC'
Where -MList::Util stands for the List::Util module (in my case, I used -MBlah)

How can I dynamically include Perl modules without using eval?

I need to dynamically include a Perl module, but if possible would like to stay away from eval due to work coding standards. This works:
$module = "My::module";
eval("use $module;");
But I need a way to do it without eval if possible. All google searches lead to the eval method, but none in any other way.
Is it possible to do it without eval?
Use require to load modules at runtime. It is often a good idea to wrap this in a block (not string) eval in case the module can't be loaded.
eval {
require My::Module;
My::Module->import();
1;
} or do {
my $error = $@;
# Module load failed. You could recover, try loading
# an alternate module, die with $error...
# whatever's appropriate
};
The reason for the eval {...} or do {...} syntax and making a copy of $@ is that $@ is a global variable that can be set by many different things. You want to grab the value as atomically as possible to avoid a race condition where something else has set it to a different value.
If you don't know the name of the module until runtime you'll have to do the translation between module name (My::Module) and file name (My/Module.pm) manually:
my $module = 'My::Module';
eval {
(my $file = $module) =~ s|::|/|g;
require $file . '.pm';
$module->import();
1;
} or do {
my $error = $@;
# ...
};
How about using the core module Module::Load
With your example:
use Module::Load;
my $module = "My::module";
load $module;
"Module::Load - runtime require of both modules and files"
"load eliminates the need to know whether you are trying to require either a file or a module."
If it fails it will die with something like "Can't locate xxx in @INC (@INC contains: ...".
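Since load dies on failure, it combines naturally with the block eval shown in the earlier answer. A small sketch, where try_load is a helper name invented for the example and No::Such::Module is assumed not to be installed:

```perl
use strict;
use warnings;
use Module::Load;

# Hypothetical helper: try to load a module whose name is only known
# at run time. Returns 1 on success and 0 on failure instead of dying.
sub try_load {
    my ($module) = @_;
    return 1 if eval { load $module; 1 };
    return 0;
}

for my $module ('File::Spec', 'No::Such::Module') {
    print "$module: ", (try_load($module) ? "loaded" : "not available"), "\n";
}
```

This uses only a block eval, which most coding standards treat differently from the string eval in the question.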
Well, there's always require as in
require 'My/Module.pm';
My::Module->import();
Note that you lose whatever effects you may have gotten from the import being called at compile time instead of runtime.
Edit: The tradeoffs between this and the eval way are: eval lets you use the normal module syntax and gives you a more explicit error if the module name is invalid (as opposed to merely not found). OTOH, the eval way is (potentially) more subject to arbitrary code injection.
No, it's not possible to without eval, as require() needs the bareword module name, as described at perldoc -f require. However, it's not an evil use of eval, as it doesn't allow injection of arbitrary code (assuming you have control over the contents of the file you are requireing, of course).
EDIT: Code amended below, but I'm leaving the first version up for completeness.
I used to use this little sugar module to do dynamic loads at runtime:
package MyApp::Util::RequireClass;
use strict;
use warnings;
use Exporter 'import'; # gives you Exporter's import() method directly
our @EXPORT_OK = qw(requireClass);
# Usage: requireClass(moduleName);
# does not do imports (wrong scope) -- you should do this after calling me: $class->import(@imports);
sub requireClass
{
my ($class) = @_;
eval "require $class" or do { die "Ack, can't load $class: $@" };
}
1;
PS. I'm staring at this definition (I wrote it quite a while ago) and I'm pondering adding this:
$class->export_to_level(1, undef, @imports);... it should work, but is not tested.
EDIT: version 2 now, much nicer without an eval (thanks ysth): :)
package MyApp::Util::RequireClass;
use strict;
use warnings;
use Exporter 'import'; # gives you Exporter's import() method directly
our @EXPORT_OK = qw(requireClass);
# Usage: requireClass(moduleName);
# does not do imports (wrong scope) -- you should do this after calling me: $class->import(@imports);
sub requireClass
{
my ($class) = @_;
(my $file = $class) =~ s|::|/|g;
$file .= '.pm';
require $file; # will die if there was an error
}
1;
Class::MOP on CPAN has a load_class method for this:
http://metacpan.org/pod/Class::MOP
i like doing things like..
require Win32::Console::ANSI if ( $^O eq "MSWin32" );