Include/eval a Perl file into a unique namespace defined at runtime

I'm writing a tool that must import a number of other Perl config files. The files are not wrapped in packages and may have similar or conflicting variables/functions. I can't change the format of these files, so I must work with them as they are. My idea was to import each into a unique namespace, but I haven't found a way to do that using do, require, or use. With a hardcoded name instead of a dynamic one, I can do it.
Want something like this:
sub sourceTheFile {
my ($namespace, $file) = @_;
package $namespace;
do $file;
1;
return;
}
That doesn't work because the package command requires a constant for the name. So then I try something like this:
sub sourceTheFile {
my ($namespace, $file) = @_;
eval "package $namespace;do $file;1;"
return;
}
But the contents of the file read by do are placed in the main:: scope, not the one I want. The target scope is created, just not populated by the do. (I tried require, and just a straight cat $file inside the eval as well.)
I'm using Devel::Symdump to verify that the namespaces are built correctly or not.
example input file:
my $xyz = "some var";
%all_have_this = ( common=>"stuff" );
ADDITIONAL CHALLENGE
Using the answer that does the temp file build and do call, I can make this work dynamically as I require. BUT, big but, how do I now reference the data inside this new namespace? Perl doesn't seem to have the loose ability to build a variable name from a string and use that as the variable.

I am not sure why the eval did not work. Maybe a bug? Here is a workaround using a temp file. This works for me:
use strict;
use warnings;
use Devel::Symdump;
use File::Temp;
my $file = './test.pl';
my $namespace = 'TEST';
{
my $fh = File::Temp->new();
print $fh "package $namespace;\n";
print $fh "do '$file';\n";
print $fh "1;\n";
close $fh;
do $fh->filename;
}
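On the additional challenge of reaching the data afterwards: Perl can build a variable name from a string through symbolic references, provided strict refs is switched off for that small scope. The following is only a sketch under assumptions. The helper names load_into_namespace and namespace_hash are made up for illustration, the file contents are held in a string instead of being read from disk, and the namespace check is a simple guard, not a complete validator. Note that variables declared with my in the loaded code (such as $xyz in the example input) stay private to the eval and are not reachable through any namespace.

```perl
use strict;
use warnings;

# Hypothetical helper: compile $source inside the package named by $namespace.
sub load_into_namespace {
    my ($namespace, $source) = @_;
    # Guard against injecting arbitrary code through the package name.
    die "Bad namespace '$namespace'\n"
        unless $namespace =~ /\A[A-Za-z_]\w*(?:::\w+)*\z/;
    # 'no strict' because legacy config files assign to undeclared globals.
    my $ok = eval "package $namespace; no strict; no warnings;\n$source\n;1;";
    die "Failed to load into $namespace: $@" unless $ok;
    return;
}

# Hypothetical helper: fetch a hash from that package via a symbolic reference.
sub namespace_hash {
    my ($namespace, $name) = @_;
    no strict 'refs';
    return \%{"${namespace}::$name"};
}

# Stand-in for one config file's contents (the example input from the question).
my $config = q{
my $xyz = "some var";
%all_have_this = ( common => "stuff" );
};

load_into_namespace('Conf::A', $config);
print namespace_hash('Conf::A', 'all_have_this')->{common}, "\n";   # prints "stuff"
```

Loading a second file into another name, for example 'Conf::B', leaves %Conf::A::all_have_this untouched, which is the isolation the question asks for.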

Perl's use and require facilities make use of any hooks you might have installed in @INC. You can simply install a hook which looks in a specific location to load modules with a prefix you choose:
package MyIncHook;
use strict;
use warnings;
use autouse Carp => qw( croak );
use File::Spec::Functions qw( catfile );
sub import {
my ($class, $prefix, $location) = @_;
unshift @INC, _loader_for($prefix, $location);
return;
}
sub _loader_for {
my $prefix = shift;
my $location = shift;
$prefix =~ s{::}{/}g;
return sub {
my $self = shift;
my $wanted = shift;
return unless $wanted =~ /^\Q$prefix/;
my $path = catfile($location, $wanted);
my ($is_done);
open my $fh, '<', $path
or croak "Failed to open '$path' for reading: $!";
my $loader = sub {
if ($is_done) {
close $fh
or croak "Failed to close '$path': $!";
return 0;
}
if (defined (my $line = <$fh>)) {
$_ = $line;
return 1;
}
else {
$_ = "1\n";
$is_done = 1;
return 1;
}
};
(my $package = $wanted) =~ s{/}{::}g;
$package =~ s/[.]pm\z//;
my @ret = (\"package $package;", $loader);
return @ret;
}
}
__PACKAGE__;
__END__
Obviously, modify the construction of $path according to your requirements.
You can use it like this:
#!/usr/bin/env perl
use strict;
use warnings;
use MyIncHook ('My::Namespace', "$ENV{TEMP}/1");
use My::Namespace::Rand;
print $My::Namespace::Rand::settings{WARNING_LEVEL}, "\n";
where $ENV{TEMP}/1/My/Namespace/Rand.pm contains:
%settings = (
WARNING_LEVEL => 'critical',
);
Output:
C:\Temp> perl t.pl
critical
You can, obviously, define your own mapping from made up module names to file names.
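To see the mechanism in isolation, here is a reduced sketch of the same idea. It is not the MyIncHook module above: the %virtual_files hash is a made-up stand-in for files on disk, and the hook returns a filehandle on an in-memory string instead of a line generator. What it keeps is the documented hook return shape, a scalar reference with source to prepend followed by a filehandle.

```perl
use strict;
use warnings;

# Hypothetical stand-in for config files on disk: module path => file contents.
my %virtual_files = (
    'My/Namespace/Rand.pm' => "%settings = ( WARNING_LEVEL => 'critical' );\n1;\n",
);

unshift @INC, sub {
    my (undef, $wanted) = @_;
    my $source = $virtual_files{$wanted};
    return unless defined $source;    # not ours: let the other @INC entries try

    # Turn 'My/Namespace/Rand.pm' into 'My::Namespace::Rand'.
    (my $package = $wanted) =~ s{/}{::}g;
    $package =~ s/[.]pm\z//;

    open my $fh, '<', \$source
        or die "Cannot open in-memory source for '$wanted': $!";
    # The scalar reference is prepended to whatever the filehandle yields.
    return (\"package $package;", $fh);
};

require My::Namespace::Rand;

sub warning_level {
    no warnings 'once';
    return $My::Namespace::Rand::settings{WARNING_LEVEL};
}

print warning_level(), "\n";    # prints "critical"
```

Because the package statement is injected by the hook, the unwrapped file itself never has to mention a package name.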

Related

Overwriting a function defined in a module but before used in its runtime phase?

Let's take something very simple,
# Foo.pm
package Foo {
my $baz = bar();
sub bar { 42 }; ## Overwrite this
print $baz; ## Before this is executed
}
Is there anyway that I can from test.pl run code that changes what $baz is set to and causes Foo.pm to print something else to the screen?
# maybe something here.
use Foo;
# maybe something here
Is it possible with the compiler phases to force the above to print 7?
A hack is required because require (and thus use) both compiles and executes the module before returning.
Same goes for eval. eval can't be used to compile code without also executing it.
The least intrusive solution I've found would be to override DB::postponed. This is called before evaluating a compiled required file. Unfortunately, it's only called when debugging (perl -d).
Another solution would be to read the file, modify it and evaluate the modified file, kinda like the following does:
use File::Slurper qw( read_binary );
eval(read_binary("Foo.pm") . <<'__EOS__') or die $@;
package Foo {
no warnings qw( redefine );
sub bar { 7 }
}
__EOS__
The above doesn't properly set %INC, it messes up the file name used by warnings and such, it doesn't call DB::postponed, etc. The following is a more robust solution:
use IO::Unread qw( unread );
use Path::Class qw( dir );
BEGIN {
my $preamble = '
UNITCHECK {
no warnings qw( redefine );
*Foo::bar = sub { 7 };
}
';
my @libs = @INC;
unshift @INC, sub {
my (undef, $fn) = @_;
return undef if $_[1] ne 'Foo.pm';
for my $qfn (map dir($_)->file($fn), @libs) {
open(my $fh, '<', $qfn)
or do {
next if $!{ENOENT};
die $!;
};
unread $fh, "$preamble\n#line 1 $qfn\n";
return $fh;
}
return undef;
};
}
use Foo;
I used UNITCHECK (which is called after compilation but before execution) because I prepended the override (using unread) rather than reading in the whole file and appending the new definition. If you want to use that approach, you can get a file handle to return using
open(my $fh_for_perl, '<', \$modified_code);
return $fh_for_perl;
Kudos to @Grinnz for mentioning @INC hooks.
Since the only options here are going to be deeply hacky, what we really want here is to run code after the subroutine has been added to the %Foo:: stash:
use strict;
use warnings;
# bless a coderef and run it on destruction
package RunOnDestruct {
sub new { my $class = shift; bless shift, $class }
sub DESTROY { my $self = shift; $self->() }
}
use Variable::Magic 0.58 qw(wizard cast dispell);
use Scalar::Util 'weaken';
BEGIN {
my $wiz;
$wiz = wizard(store => sub {
return undef unless $_[2] eq 'bar';
dispell %Foo::, $wiz; # avoid infinite recursion
# Variable::Magic will destroy returned object *after* the store
return RunOnDestruct->new(sub { no warnings 'redefine'; *Foo::bar = sub { 7 } });
});
cast %Foo::, $wiz;
weaken $wiz; # avoid memory leak from self-reference
}
use lib::relative '.';
use Foo;
This will emit some warnings, but prints 7:
sub Foo::bar {}
BEGIN {
$SIG{__WARN__} = sub {
*Foo::bar = sub { 7 };
};
}
First, we define Foo::bar. Its value will be redefined by the declaration in Foo.pm, but the "Subroutine Foo::bar redefined" warning will be triggered, which will call the signal handler that redefines the subroutine again to return 7.
Here is a solution that combines hooking the module loading process with the readonly-making capabilities of the Readonly module:
$ cat Foo.pm
package Foo {
my $baz = bar();
sub bar { 42 }; ## Overwrite this
print $baz; ## Before this is executed
}
$ cat test.pl
#!/usr/bin/perl
use strict;
use warnings;
use lib qw(.);
use Path::Tiny;
use Readonly;
BEGIN {
my @remap = (
'$Foo::{bar} => \&mybar'
);
my $pre = join ' ', map "Readonly::Scalar $_;", @remap;
my @inc = @INC;
unshift @INC, sub {
return undef if $_[1] ne 'Foo.pm';
my ($pm) = grep { $_->is_file && -r } map { path $_, $_[1] } @inc
or return undef;
open my $fh, '<', \($pre. "#line 1 $pm\n". $pm->slurp_raw);
return $fh;
};
}
sub mybar { 5 }
use Foo;
$ ./test.pl
5
I have revised my solution here, so that it no longer relies on Readonly.pm, after learning that I had missed a very simple alternative, based on m-conrad's answer, which I have reworked into the modular approach that I had started here.
Foo.pm (Same as in the opening post)
package Foo {
my $baz = bar();
sub bar { 42 }; ## Overwrite this
print $baz; ## Before this is executed
}
# Note, even though print normally returns true, a final line of 1; is recommended.
OverrideSubs.pm Updated
package OverrideSubs;
use strict;
use warnings;
use Path::Tiny;
use List::Util qw(first);
sub import {
my (undef, %overrides) = @_;
my $default_pkg = caller; # Default namespace when unspecified.
my %remap;
for my $what (keys %overrides) {
( my $with = $overrides{$what} ) =~ s/^([^:]+)$/${default_pkg}::$1/;
my $what_pkg = $what =~ /^(.*)\:\:/ ? $1 : $default_pkg;
my $what_file = ( join '/', split /\:\:/, $what_pkg ). '.pm';
push @{ $remap{$what_file} }, "*$what = *$with";
}
my @inc = grep !ref, @INC; # Filter out any existing hooks; strings only.
unshift @INC, sub {
my $remap = $remap{ $_[1] } or return undef;
my $pre = join ';', @$remap;
my $pm = first { $_->is_file && -r } map { path $_, $_[1] } @inc
or return undef;
# Prepend code to override subroutine(s) and reset line numbering.
open my $fh, '<', \( $pre. ";\n#line 1 $pm\n". $pm->slurp_raw );
return $fh;
};
}
1;
test-run.pl
#!/usr/bin/env perl
use strict;
use warnings;
use lib qw(.); # Needed for newer Perls that typically exclude . from @INC by default.
use OverrideSubs
'Foo::bar' => 'mybar';
sub mybar { 5 } # This can appear before or after 'use OverrideSubs',
# but must appear before 'use Foo'.
use Foo;
Run and output:
$ ./test-run.pl
5
If the sub bar inside Foo.pm has a different prototype than an existing Foo::bar function, Perl won't overwrite it? That seems to be the case, and makes the solution pretty simple:
# test.pl
BEGIN { *Foo::bar = sub () { 7 } }
use Foo;
or kind of the same thing
# test.pl
package Foo { use constant bar => 7 };
use Foo;
Update: no, the reason this works is that Perl won't redefine a "constant" subroutine (with prototype ()), so this is only a viable solution if your mock function is constant.
Let's have a Golf contest!
sub _override { 7 }
BEGIN {
my ($pm)= grep -f, map "$_/Foo.pm", @INC or die "Foo.pm not found";
open my $fh, "<", $pm or die;
local $/= undef;
eval "*Foo::bar= *main::_override;\n#line 1 $pm\n".<$fh> or die $@;
$INC{'Foo.pm'}= $pm;
}
use Foo;
This just prefixes the module's code with a replacement of the method, which will be the first line of code that runs after the compilation phase and before the execution phase.
Then, fill in the %INC entry so that future loads of use Foo don't pull in the original.
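The timing that this trick depends on can be checked without touching the file system. The sketch below is an assumption-laden variant, not the golfed code above: Foo.pm is replaced by an in-memory string, an extra accessor Foo::baz is added so the result can be inspected, and the names $foo_source and replacement_bar are made up. It combines the prefix idea with an @INC hook, so %INC is filled in by require itself.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Foo.pm on disk, so the sketch needs no extra files.
my $foo_source = <<'END_FOO';
package Foo;
my $baz = bar();
sub bar { 42 }
sub baz { $baz }
1;
END_FOO

sub replacement_bar { 7 }

unshift @INC, sub {
    my (undef, $file) = @_;
    return unless $file eq 'Foo.pm';
    # The assignment is an ordinary statement, so it runs after the whole
    # module (including "sub bar") has compiled but before "my $baz = bar()".
    my $code = "*Foo::bar = \\&main::replacement_bar;\n"
             . "#line 1 \"Foo.pm\"\n"
             . $foo_source;
    open my $fh, '<', \$code or die "Cannot open in-memory source: $!";
    return $fh;
};

require Foo;
print Foo::baz(), "\n";    # prints 7
```

Putting the same assignment in a BEGIN block at the top of the prefix would not be enough here, because the later compilation of sub bar { 42 } would overwrite it before the module body runs.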

How to pass an anonymous sub to File::Find

I know I can do this as an expression modifier:
#!/usr/bin/perl -w
use strict;
use File::Find;
sub file_find{
my ($path,$filter) = @_;
find(sub {print $File::Find::name."\n" if /$filter/}, $path);
}
file_find($newdir,'\.txt');
or this which is less readable:
find(sub {if(/$filter/){print $File::Find::name."\n"}}, $path);
But if I wanted to do something like this, how can I do it?
sub file_find{
my ($path,$filter) = @_;
find(\&print, $path);
sub print {
if(/$filter/){ #Variable $filter will not stay shared
print $File::Find::name."\n";
}
}
}
file_find($newdir,'\.txt')
I get 'variable will not stay shared'. I believe I'm supposed to make it an anonymous sub:
my $print = sub {
if(/$filter/){
print $File::Find::name."\n";
}
}
But then I don't know how to pass the reference to the find sub. Perhaps it's something silly I'm missing.
Edit: Never mind, this seems to work:
sub file_find{
my ($path,$filter) = @_;
my $subref = sub{
if(/$filter/){
print $File::Find::name."\n";
}
};
find($subref,$path);
}
file_find($newdir,'\.txt');
I had to push the find sub to the bottom! Man I feel so dumb :)
I would separate the subs apart (and rename the print() one as it conflicts with the built-in with the same name!), then you can do something along these lines (if I'm understanding what you want correctly):
use warnings;
use strict;
use File::Find;
file_find('.', '.txt');
sub file_find{
my ($path,$filter) = @_;
find(sub {my_print($filter)}, $path); # find() does not return the matches
}
sub my_print {
my $filter = shift;
my $fname = $File::Find::name;
if($fname =~ /$filter/){
print "$fname\n";
}
}
However, with that said, File::Find::Rule can make these things very, very easy (particularly handling the file filters as it handles regex natively):
use warnings;
use strict;
use File::Find::Rule;
my $filter = '*.txt';
my $dir = '.';
my @files = File::Find::Rule->file()
->name($filter)
->in($dir);
print "$_\n" for @files;
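If you would rather stay with core File::Find, the closure from the question can also collect the matches instead of printing them, since find itself does not return them. This is a small sketch under assumptions: find_matching is a made-up name, the filter is passed as a compiled regex, and the directory tree is a throwaway one created with File::Temp so the example runs anywhere.

```perl
use strict;
use warnings;
use File::Find;
use File::Spec;
use File::Temp qw(tempdir);

# Return every plain file under $path whose name matches the regex $filter.
sub find_matching {
    my ($path, $filter) = @_;
    my @found;
    # The anonymous sub closes over this call's $filter and @found; a named
    # nested sub would capture only the variables of the first call.
    my $wanted = sub {
        push @found, $File::Find::name if -f $_ && /$filter/;
    };
    find($wanted, $path);
    my @sorted = sort @found;
    return @sorted;
}

# Build a small throwaway tree: two .txt files and one .log file.
my $dir = tempdir(CLEANUP => 1);
mkdir File::Spec->catdir($dir, 'sub') or die "mkdir failed: $!";
for my $rel ('a.txt', 'b.log', 'sub/c.txt') {
    open my $fh, '>', "$dir/$rel" or die "Cannot create $rel: $!";
    close $fh or die "Cannot close $rel: $!";
}

print "$_\n" for find_matching($dir, qr/\.txt\z/);
```

The anchored pattern qr/\.txt\z/ avoids the looser match of a bare '.txt', where the dot matches any character.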

How to include a data file with a Perl module?

What is the "proper" way to bundle a required-at-runtime data file with a Perl module, such that the module can read its contents before being used?
A simple example would be this Dictionary module, which needs to read a list of (word,definition) pairs at startup.
package Reference::Dictionary;
# TODO: This is the Dictionary, which needs to be populated from
# data-file BEFORE calling Lookup!
our %Dictionary;
sub new {
my $class = shift;
return bless {}, $class;
}
sub Lookup {
my ($self,$word) = @_;
return $Dictionary{$word};
}
1;
and a driver program, Main.pl:
use Reference::Dictionary;
my $dictionary = new Reference::Dictionary;
print $dictionary->Lookup("aardvark");
Now, my directory structure looks like this:
root/
Main.pl
Reference/
Dictionary.pm
Dictionary.txt
I can't seem to get Dictionary.pm to load Dictionary.txt at startup. I've tried a few methods to get this to work, such as...
Using BEGIN block:
BEGIN {
open(FP, '<', 'Dictionary.txt') or die "Can't open: $!\n";
while (<FP>) {
chomp;
my ($word, $def) = split(/,/);
$Dictionary{$word} = $def;
}
close(FP);
}
No dice: Perl is looking in cwd for Dictionary.txt, which is the path of the main script ("Main.pl"), not the path of the module, so this gives File Not Found.
Using DATA:
BEGIN {
while (<DATA>) {
chomp;
my ($word, $def) = split(/,/);
$Dictionary{$word} = $def;
}
close(DATA);
}
and at end of module
__DATA__
aardvark,an animal which is definitely not an anteater
abacus,an oldschool calculator
...
This too fails because BEGIN executes at compile-time, before DATA is available.
Hard-code the data in the module
our %Dictionary = (
aardvark => 'an animal which is definitely not an anteater',
abacus => 'an oldschool calculator'
...
);
Works, but is decidedly non-maintainable.
Similar question here: How should I distribute data files with Perl modules? but that one deals with modules installed by CPAN, not modules relative to the current script as I'm attempting to do.
There's no need to load the dictionary at BEGIN time. BEGIN time is relative to the file being loaded. When your main.pl says use Dictionary, all the code in Dictionary.pm is compiled and loaded. Put the code to load it early in Dictionary.pm.
package Dictionary;
use strict;
use warnings;
my %Dictionary; # There is no need for a global
while (<DATA>) {
chomp;
my ($word, $def) = split(/,/);
$Dictionary{$word} = $def;
}
You can also load from Dictionary.txt located in the same directory. The trick is you have to provide an absolute path to the file. You can get this from __FILE__ which is the path to the current file (ie. Dictionary.pm).
use File::Basename;
# Get the directory Dictionary.pm is located in.
my $dir = dirname(__FILE__);
open(my $fh, '<', "$dir/Dictionary.txt") or die "Can't open: $!\n";
my %Dictionary;
while (<$fh>) {
chomp;
my ($word, $def) = split(/,/);
$Dictionary{$word} = $def;
}
close($fh);
Which should you use? DATA is easier to distribute. A separate, parallel file is easier for non-coders to work on.
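One caveat on the __FILE__ approach: __FILE__ can itself be a relative path, depending on how the module was found, so it is safer to make it absolute before taking the directory. A minimal sketch, assuming a made-up helper name sibling_path and using only core modules:

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Spec;

# Hypothetical helper: absolute path of a data file that sits beside this
# source file.
sub sibling_path {
    my ($name) = @_;
    # rel2abs resolves a relative __FILE__ against the current working
    # directory at the time of the call.
    my $dir = dirname(File::Spec->rel2abs(__FILE__));
    return File::Spec->catfile($dir, $name);
}

print sibling_path('Dictionary.txt'), "\n";
```

Because rel2abs uses the working directory at call time, a program that changes directory after loading the module should compute the path once at load time and keep it.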
Better than loading the whole dictionary when the library is loaded, it is more polite to wait to load it when it's needed.
use File::Basename;
# Load the dictionary from Dictionary.txt
sub _load_dictionary {
my %dictionary;
# Get the directory Dictionary.pm is located in.
my $dir = dirname(__FILE__);
open(my $fh, '<', "$dir/Dictionary.txt") or die "Can't open: $!\n";
while (<$fh>) {
chomp;
my ($word, $def) = split(/,/);
$dictionary{$word} = $def;
}
return \%dictionary;
}
# Get the possibly cached dictionary
my $Dictionary;
sub _get_dictionary {
return $Dictionary ||= _load_dictionary;
}
sub new {
my $class = shift;
my $self = bless {}, $class;
$self->{dictionary} = $self->_get_dictionary;
return $self;
}
sub lookup {
my $self = shift;
my $word = shift;
return $self->{dictionary}{$word};
}
Each object now contains a reference to the shared dictionary (eliminating the need for a global) which is only loaded when an object is created.
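The caching step can be verified on its own. In this sketch the data file is replaced by an in-memory string with invented sample entries, and load_count is a made-up counter added purely to show that the load happens once; neither is part of the module above. The split limit of 2 is a small variation that keeps commas inside a definition intact.

```perl
use strict;
use warnings;

# Stand-in for Dictionary.txt so the sketch is self-contained (sample data).
my $raw = "aardvark,an animal which is definitely not an anteater\n"
        . "abacus,an oldschool calculator, with beads\n";
my $load_count = 0;
my $cache;

sub _load_dictionary {
    $load_count++;
    open my $fh, '<', \$raw or die "Can't open in-memory data: $!\n";
    my %dictionary;
    while (my $line = <$fh>) {
        chomp $line;
        # A limit of 2 keeps commas inside the definition intact.
        my ($word, $def) = split /,/, $line, 2;
        $dictionary{$word} = $def;
    }
    close $fh;
    return \%dictionary;
}

# Load on first use, then hand back the cached hash reference.
sub get_dictionary { return $cache ||= _load_dictionary() }
sub load_count     { return $load_count }

print get_dictionary()->{abacus}, "\n";
get_dictionary() for 1 .. 3;
print "loaded ", load_count(), " time(s)\n";    # prints "loaded 1 time(s)"
```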
I suggest using DATA with INIT instead of BEGIN to ensure that the data is initialised before run time. It also makes the code more self-documenting.
Alternatively, a UNITCHECK block may be more appropriate: it is executed as early as possible, immediately after the library file is compiled, and so can be considered an extension of the compilation.
package Dictionary;
use strict;
use warnings;
my %dictionary;
UNITCHECK {
while ( <DATA> ) {
chomp;
my ($k, $v) = split /,/;
$dictionary{$k} = $v;
}
}
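The relative order of these phases can be observed with a few lines. This sketch assumes it is run as the main program; the @phases array is only there for illustration. One thing to keep in mind when choosing INIT: it runs only for code compiled before the main program starts running, so a module pulled in by a run-time require is too late for its INIT block, whereas UNITCHECK still fires right after that file is compiled.

```perl
use strict;
use warnings;

our @phases;

# Deliberately written in the "wrong" order: when a block runs depends on
# its type, not on its position in the file.
INIT      { push @phases, 'INIT' }
UNITCHECK { push @phases, 'UNITCHECK' }
BEGIN     { push @phases, 'BEGIN' }

push @phases, 'run time';
print join(' -> ', @phases), "\n";    # BEGIN -> UNITCHECK -> INIT -> run time
```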

Perl print out all subs arguments at every call at runtime

I'm looking for a way to debug-print each subroutine call from the namespace Myapp::* (e.g. without dumping the CPAN modules), but without having to edit every .pm file manually to insert some module or print statement.
I'm just learning (or rather, trying to understand) the DB package, which lets me trace the execution (using the shebang #!/usr/bin/perl -d:Mytrace):
package DB;
use 5.010;
sub DB {
my( $package, $file, $line ) = caller;
my $code = \@{"::_<$file"};
print STDERR "--> $file $line $code->[$line]";
}
#sub sub {
# print STDERR "$sub\n";
# &$sub;
#}
1;
and I'm looking for a way to use the sub call to print the actual arguments of the called sub from the Myapp::* namespace.
Or is there some easier (common) method to combine the execution line-tracer DB::DB with a dump of each subroutine call's arguments (and its return values, if possible)?
I don't know if it counts as "easier" in any sane meaning of the word, but you can walk the symbol table and wrap all functions in code that prints their arguments and return values. Here's an example of how it might be done:
#!/usr/bin/env perl
use 5.14.2;
use warnings;
package Foo;
sub first {
my ( $m, $n ) = @_;
return $m+$n;
}
sub second {
my ( $m, $n ) = @_;
return $m*$n;
}
package main;
no warnings 'redefine';
for my $k (keys %{$::{'Foo::'}}) {
my $orig = *{$::{'Foo::'}{$k}}{CODE};
$::{'Foo::'}{$k} = sub {
say "Args: @_";
unless (wantarray) {
my $r = $orig->(@_);
say "Scalar return: $r";
return $r;
}
else {
my @r = $orig->(@_);
say "List return: @r";
return @r
}
}
}
say Foo::first(2,3);
say Foo::second(4,6);
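The same wrapping can be packaged as a reusable function that takes the package name as a string and skips symbol-table entries that are not subs. This is a sketch, not a drop-in tracer: trace_package and the Demo package are made-up names, calls are collected in an array instead of being printed, and a call in void context is treated like a scalar call.

```perl
use strict;
use warnings;

package Demo;
sub add { my ($m, $n) = @_; return $m + $n }
sub mul { my ($m, $n) = @_; return $m * $n }

package main;

# Hypothetical helper: wrap every sub in $pkg so each call is recorded in @$log.
sub trace_package {
    my ($pkg, $log) = @_;
    no strict 'refs';
    no warnings 'redefine';
    for my $name (sort keys %{"${pkg}::"}) {
        my $full = "${pkg}::$name";
        next unless defined &$full;    # skip variables, nested packages, etc.
        my $orig = \&$full;
        *$full = sub {
            if (wantarray) {
                my @r = $orig->(@_);
                push @$log, "$full(@_) -> (@r)";
                return @r;
            }
            my $r = $orig->(@_);
            push @$log, "$full(@_) -> " . ($r // 'undef');
            return $r;
        };
    }
    return;
}

my @log;
trace_package('Demo', \@log);
my $sum     = Demo::add(2, 3);    # scalar context
my @product = Demo::mul(4, 6);    # list context
print "$_\n" for @log;
```

For the Myapp::* case in the question, the function would have to be called once per package, for example by walking the nested 'Name::' keys of the Myapp:: stash.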

function call in perl

As a part of my course work I have been learning the Perl programming language for the first time in the last few weeks. I have been writing small functions and making function calls. I have written a function for string matching.
use strict;
use warnings;
sub find_multi_string {
my ($file, @strings) = @_;
my $fh;
open ($fh, "<$file");
#store the whole file in an array
my @array = <$fh>;
for my $string (@strings) {
if (grep /$string/, @array) {
next;
} else {
die "Cannot find $string in $file";
}
}
return 1;
}
find_multi_string('file name', 'string1','string2','string3','string4','string 5');
In the above script I'm passing the arguments in the function call. The script works.
But I'd like to know if there is way to specify the file name and string1... string n in an array in the program itself and just make the function call.
find_multi_string();
That would be a mistake, always pass parameters and return values to your subroutines.
What you're describing is essentially using subroutines solely to subdivide and document your code. If you were to do that, it would be better to just remove the subroutine entirely and include a comment before the section of code.
Overall, your code looks good as is. You probably will want to use quotemeta though, and your logic can be simplified a little:
use strict;
use warnings;
use autodie;
sub find_multi_string {
my ($file, @strings) = @_;
# Load the file
my $data = do {
open my $fh, "<", $file;
local $/;
<$fh>
};
for my $string (@strings) {
if ($data !~ /\Q$string/) {
die "Cannot find $string in $file";
}
}
return 1;
}
find_multi_string('file name', 'string1','string2','string3','string4','string 5');
A few improvements of your original code:
use autodie
use 3-args open
as you want to check anywhere in the file, just load the file as a single string
if the strings to match are plain text without regex metacharacters, just use the index function
Your question is about passing the function arguments from your program.
I suspect that you are looking for @ARGV. See perlvar.
Here is the modified code:
use strict;
use warnings;
use autodie;
sub find_multi_string {
my ($file, @strings) = @_;
my $content = do {
open my $fh, '<', $file;
local $/;
<$fh>
};
foreach (@strings) {
die "Cannot find $_ in $file" unless index($content, $_) >= 0;
}
return 1;
}
find_multi_string(@ARGV);
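A variation on the index approach, offered as a sketch and not as part of the answer above: instead of dying on the first miss, return the list of strings that were not found. The name missing_strings and the sample content are made up for illustration. The third search string shows why a literal comparison matters, since as a regex g.mma would match "gamma".

```perl
use strict;
use warnings;

# Hypothetical variant: report what is missing instead of dying on the
# first miss.
sub missing_strings {
    my ($content, @strings) = @_;
    # index() compares literally, so no quotemeta or \Q is needed.
    return grep { index($content, $_) < 0 } @strings;
}

my $content = "alpha beta\ngamma (delta)\n";
my @missing = missing_strings($content, 'beta', '(delta)', 'g.mma', 'omega');
print "Missing: @missing\n";    # prints "Missing: g.mma omega"
```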