Using filehandles in Perl to alter actively running code - perl

I've been learning about filehandles in Perl, and I was curious to see if there's a way to alter the source code of a program as it's running. For example, I created a script named "dynamic.pl" which contained the following:
use strict;
use warnings;
open(my $append, ">>", "dynamic.pl");
print $append "print \"It works!!\\n\";\n";
This program adds the line
print "It works!!\n";
to the end of its own source file, and I hoped that once that line was added, it would then execute and output "It works!!"
Well, it does correctly append the line to the source file, but it doesn't execute it then and there.
So I assume that when perl executes a program, it loads it into memory and runs it from there. My question is: is there a way to access this loaded version of the program, so that a program can alter itself as it runs?

The missing piece you need is eval EXPR. This compiles, "evaluates", any string as code.
my $string = q[print "Hello, world!";];
eval $string;
This string can come from any source, including a filehandle.
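As a minimal sketch of the filehandle case: the code below reads Perl source from a filehandle and evals it. The in-memory filehandle and the sub name answer are assumptions made for illustration; a handle opened on a real file would be read the same way.

```perl
use strict;
use warnings;

# Hypothetical source text; in practice this could live in a file.
my $source = qq{sub answer { return 6 * 7 }\n1;\n};

# Open a read handle on the string (an in-memory filehandle).
open(my $in, '<', \$source) or die "Cannot open in-memory handle: $!";
my $code = do { local $/; <$in> };    # slurp everything from the handle
close($in);

# The source ends in "1;" so a successful eval returns a true value.
eval $code or die "eval failed: $@";

print answer(), "\n";
```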
It also doesn't have to be a single statement. If you want to modify how a program runs, you can replace its subroutines.
use strict;
use warnings;
use v5.10;
sub speak { return "Woof!"; }
say speak();
eval q[sub speak { return "Meow!"; }];
say speak();
You'll get a Subroutine speak redefined warning from that. It can be suppressed with no warnings "redefine".
{
# The block is so this "no warnings" only affects
# the eval and not the entire program.
no warnings "redefine";
eval q[sub speak { return "Shazoo!"; }];
}
say speak();
Obviously this is a major security hole. There are many, many, many things to consider here, too many for one answer, and I strongly recommend you not do this and find a better solution to whatever problem you're trying to solve this way.
One way to mitigate the potential for damage is to use the Safe module. This is like eval but limits what built in functions are available. It is by no means a panacea for the security issues.
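A rough sketch of what that looks like, assuming the default compartment that Safe->new creates (the expressions being evaluated are made up for the example):

```perl
use strict;
use warnings;
use Safe;

# A compartment with Safe's default operator mask.
my $compartment = Safe->new;

# Plain arithmetic is allowed by the default mask.
my $result = $compartment->reval('2 + 3 * 4');
die "reval failed: $@" if $@;
print "result: $result\n";

# system() is not in the default set, so the code fails to compile
# inside the compartment; reval returns undef and sets $@.
my $blocked = $compartment->reval('system("echo hi")');
print "blocked: $@" if $@;
```

The default mask is quite restrictive, and loosening it with permit() brings the usual eval risks back, so treat this as damage limitation rather than a sandbox you can trust.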

With a warning about all kinds of issues, you can reload modules.
There are packages for that, for example, Module::Reload. Then you can write code that you intend to change in a module, change the source at runtime, and have it reloaded.
By hand you would delete that from %INC and then require, like
# ... change source code in the module ...
delete $INC{'ModuleWithCodeThatChanges.pm'};
require ModuleWithCodeThatChanges;
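To see the by-hand reload end to end, here is a self-contained sketch. The module name Greeter, its greet sub, and the temporary directory are all made up for the example; the only real mechanism is the delete from %INC followed by require.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical module written into a throwaway directory.
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'Greeter.pm');

# (Re)write the module so that greet() returns the given text.
sub write_module {
    my ($text) = @_;
    open(my $fh, '>', $file) or die "Cannot write $file: $!";
    print $fh "package Greeter;\nsub greet { return '$text' }\n1;\n";
    close($fh) or die "Cannot close $file: $!";
}

unshift @INC, $dir;

write_module('hello');
require Greeter;
my $first = Greeter::greet();

# Change the source, forget that the module was loaded, load it again.
write_module('goodbye');
delete $INC{'Greeter.pm'};
require Greeter;
my $second = Greeter::greet();

print "$first, then $second\n";
```

If the module itself enables warnings, expect a "Subroutine ... redefined" warning on the second require.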
The only reason I can think of for doing this is experimentation and play. Otherwise, there are all kinds of concerns with doing something like this, and whatever your goal may be there are other ways to accomplish it.
Note: The question does specify a filehandle. However, I don't see that as really related to what I take to be the heart of the question, which is modifying code at runtime.

The source file isn't used after it's been compiled.
You could just eval it.
use strict;
use warnings;
my $code = <<'__EOS__';
print "It works!!\n";
__EOS__
open(my $append_fh, ">>", "dynamic.pl")
or die($!);
print($append_fh $code);
eval("$code; 1")
or die($@);

There's almost definitely a better way to achieve your end goal here. BUT, you could recursively make exec() or system() calls (the latter if you need a return value). Be sure to set up some stopping condition or the dominoes will keep falling. Again, you should rethink this, unless it's just practice of some sort, or maybe I don't get it!
Each call should execute the latest state of the file; also be sure to close the file before each call.
i.e.,
exec("dynamic.pl"); or
my $retval;
$retval = system("perl dynamic.pl");
Don't use eval ever.

Related

perl - two stage conditional compilation

I have pretty big perl script executed quite frequently (from cron).
Most executions require pretty short & simple tests.
How to split single file script into two parts with "part two" compiled based on "part 1" decision?
Considered solution:
using BEGIN{ …; exit if …; } block for trivial test.
two file solution with file_1 using require to compile&execute file_2.
I would prefer single file solution to ease maintenance if the cost is reasonable.
First, you should measure how long the compilation really takes, to see if this "optimization" is even necessary. If it does happen to be, then since you said you'd prefer a one-file solution, one possible solution is using the __DATA__ section for code like so:
use warnings;
use strict;
# measure compilation and execution time
use Time::HiRes qw/ gettimeofday tv_interval /;
my $start;
BEGIN { $start = [gettimeofday] }
INIT { printf "%.06f\n", tv_interval($start) }
END { printf "%.06f\n", tv_interval($start) }
my $condition = 1; # dummy for testing
# conditionally compile and run the code in the DATA section
if ($condition) {
eval do { local $/; <DATA>.'; 1' } or die $@;
}
__DATA__
# ... lots of code here ...
I see two ways of achieving what you want. The simple one would be to divide the script in two parts. The first part will do the simple tests. Then, if you need to do more complicated tests you may "add" the second part. The way to do this is using eval like this:
<first-script.pl>
...
eval `cat second-script.pl`;
if( $@ ) {
print STDERR $@, "\n";
die "Errors in the second script.\n";
}
Or using File::Slurp in a more robust way:
eval read_file("second-script.pl", binmode => ':utf8');
Or, following @amon's suggestion, use do:
do "second-script.pl";
Only beware that do is different from eval in this way:
It also differs in that code evaluated with do FILE cannot see lexicals in the enclosing scope; eval STRING does. It's the same, however, in that it does reparse the file every time you call it, so you probably don't want to do this inside a loop.
The eval will execute in the context of the first script, so any variables or initializations will be available to that code.
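The difference in lexical visibility can be checked with a small experiment. This is only a sketch: the variable $secret and the temporary file are invented for the demonstration, and the same one-line snippet is run both ways.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $secret = 42;    # a lexical in the enclosing scope

# The same snippet is run through eval STRING and through do FILE.
my $code = 'defined $secret ? "sees $secret" : "cannot see it"';

my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print $fh $code;
close($fh) or die "Cannot close $path: $!";

# eval STRING compiles in the current lexical scope, so $secret is visible.
my $from_eval = eval $code;
die $@ if $@;

# do FILE compiles the file in its own scope; there, $secret is the
# (undefined) package variable $main::secret, not our lexical.
my $from_do = do $path;
die "do failed: ", ($@ || $!) unless defined $from_do;

print "eval STRING: $from_eval\n";
print "do FILE:     $from_do\n";
```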
Related to this, there is this question: Best way to add dynamic code to a perl application, which I asked some time ago (and answered myself with the help of the comments provided and some research.) I took some time to document everything I could think of for anyone (and myself) to refer to.
The second way I see would be to turn your testing script into a daemon and have the crontab bit call this daemon as necessary. The daemon remains alive so any data structures that you may need will remain in memory. On the down side, this will take resources continuously, as the daemon process will always be running.

I serialized my data in Perl with Data::Dumper. Now when I eval it I get "Global symbol "$VAR1" requires explicit package name"

I serialized my data to string in Perl using Data::Dumper. Now in another program I'm trying to deserialize it by using eval and I'm getting:
Global symbol "$VAR1" requires explicit package name
I'm using use warnings; use strict; in my program.
Here is how I'm evaling the code:
my $wiki_categories = eval($db_row->{categories});
die $@ if $@;
# ... use $wiki_categories ...
How can I disable my program dying because of "$VAR1" not being declared as my?
Should I append "my " before the $db_row->{categories} in the eval? Like this:
my $wiki_categories = eval("my ".$db_row->{categories});
I didn't test this yet, but I think it would work.
Any other ways to do this? Perhaps wrap it in some block, and turn off strict for that block? I haven't ever done it but I've seen it mentioned.
Any help appreciated!
This is normal. By default, when Data::Dumper serializes data, it outputs something like:
$VAR1 = ...your data...
To use Data::Dumper for serialization, you need to configure it a little. Terse being the most important option to set, it turns off the $VAR thing.
use Data::Dumper;
my $data = {
foo => 23,
bar => [qw(1 2 3)]
};
my $dumper = Data::Dumper->new([]);
$dumper->Terse(1);
$dumper->Values([$data]);
print $dumper->Dump;
Then the result can be evaled straight into a variable.
my $data = eval $your_dump;
You can do various tricks to shrink the size of Data::Dumper, but on the whole it's fast and space efficient. The major down sides are that it's Perl only and wildly insecure. If anyone can modify your dump file, they own your program.
There are modules on CPAN which take care of this for you, and a whole lot more, such as Data::Serializer.
Your question has a number of implications, I'll try to address as many as I can.
First, read the perldoc for Data::Dumper. Setting $Data::Dumper::Terse = 1 may suffice for your needs. There are many options here in global variables, so be sure to localise them. But this changes the producer, not the consumer, of the data. I don't know how much control you have over that. Your question implies you're working on the consumer, but makes no mention of any control over the producer. Maybe the data already exists, and you have to use it as is.
The next implication is that you're tied to Data::Dumper. Again, the data may already exist, so too bad, use it. If this is not the case, I would recommend switching to another storable format. A fairly common one nowadays is JSON. While JSON isn't part of core perl, it's pretty trivial to install. It also makes this much easier. One advantage is that the data is useful in other languages, too. Another is that you avoid eval STRING which, should the data be compromised, could easily compromise your consumer.
The next item is just how to solve it as is. If the data exists, for example. A simple solution is to just add the my as you did. This works fine. Another one is to strip the $VAR1: (my $dumped = $db_row->{categories}) =~ s/^\s*\$\w+\s*=\s*//;. Another one is to put the "no warnings" right into the eval: eval ("no warnings; no strict; " . $db_row->{categories});.
Personally, I go with JSON whenever possible.
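For comparison, a JSON round trip might look like the sketch below. It assumes JSON::PP, which has shipped with perl since 5.14 (the JSON and JSON::XS modules on CPAN offer the same encode/decode calls); the sample data is made up.

```perl
use strict;
use warnings;
use JSON::PP;

my $data = { foo => 23, bar => [1, 2, 3] };

# canonical(1) sorts hash keys so the output is stable between runs.
my $json = JSON::PP->new->canonical(1);

my $text = $json->encode($data);    # producer side: serialize
my $copy = $json->decode($text);    # consumer side: parse, no eval involved

print "$text\n";
print "foo is $copy->{foo}\n";
```

The consumer never executes the stored text as code, which is the main point of switching away from eval.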
Your code would work as it stood except that the eval fails because $VAR1 is undeclared in the scope of the eval and use strict 'vars' is in effect.
Get around this by disabling strictures within as tight a block as possible. A do block does the trick, like this
my $wiki_categories = do {
no strict 'vars';
eval $db_row->{categories};
};

Perl - New definition of myprint() or Overload print command

I am a newb to Perl. I am writing some scripts and want to define my own print called myprint() which will print the stuff passed to it based on some flags (verbose/debug flag)
open(FD, "> /tmp/abc.txt") or die "Cannot create abc.txt file";
print FD "---Production Data---\n";
myprint "Hello - This is only a comment - debug data";
Can someone please help me with some sample code for the myprint() function?
Do you care more about writing your own logging system, or do you want to know how to put logging statements in appropriate parts of your program which you can turn off (and, incur little performance penalty when they are turned off)?
If you want a logging system that is easy to start using, but also offers a world of features which you can incrementally discover and use, Log::Log4perl is a good option. It has an easy mode, which allows you to specify the desired logging level, and emits only those logging messages that are above the desired level.
#!/usr/bin/env perl
use strict; use warnings;
use File::Temp qw(tempfile);
use Log::Log4perl qw(:easy);
Log::Log4perl->easy_init({level => $INFO});
my ($fh, $filename) = tempfile;
print $fh "---Production Data---\n";
WARN 'Wrote something somewhere somehow';
The snippet also shows a better way of opening a temporary file using File::Temp.
As for overriding the built-in print … It really isn't a good idea to fiddle with built-ins except in very specific circumstances. perldoc perlsub has a section on Overriding Built-in Functions. The accepted answer to this question lists the Perl built-ins that cannot be overridden. print is one of those.
But, then, one really does not need to override a built-in to write a logging system.
So, if an already-written logging system does not do it for you, you really seem to be asking "how do I write a function that prints stuff conditionally depending on the value of a flag?"
Here is one way:
#!/usr/bin/env perl
package My::Logger;
{
use strict; use warnings;
use Sub::Exporter -setup => {
exports => [
DEBUG => sub {
return sub {} unless $ENV{MYDEBUG};
return sub { print 'DEBUG: ' => @_ };
},
]
};
}
package main;
use strict; use warnings;
# You'd replace this with use My::Logger qw(DEBUG) if you put My::Logger
# in My/Logger.pm somewhere in your @INC
BEGIN {
My::Logger->import('DEBUG');
}
sub nicefunc {
print "Hello World!\n";
DEBUG("Isn't this a nice function?\n");
return;
}
nicefunc();
Sample usage:
$ ./yy.pl
Hello World!
$ MYDEBUG=1 ./yy.pl
Hello World!
DEBUG: Isn't this a nice function?
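If pulling in Sub::Exporter is more than you want, a plain sub with a flag does much the same job. This is a bare-bones sketch: the MYVERBOSE environment variable and the choice of STDERR are assumptions, not anything from the question.

```perl
use strict;
use warnings;

# Hypothetical flag; it could just as well come from Getopt::Long.
my $Verbose = $ENV{MYVERBOSE} ? 1 : 0;

# Print the message only when the flag is set.
# Returns 1 if something was printed, 0 otherwise.
sub myprint {
    my @message = @_;
    return 0 unless $Verbose;
    print STDERR @message, "\n";
    return 1;
}

myprint("Hello - This is only a comment - debug data");
```

Unlike the DEBUG import above, the flag is checked on every call, so this is simpler but not free when logging is off.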
I wasn't going to answer this because Sinan already has the answer I'd recommend, but tonight I also happened to be working on the "Filehandle References" chapter of the upcoming Intermediate Perl. There are a couple of relevant paragraphs which I'll just copy directly without adapting them to your question:
IO::Null and IO::Interactive
Sometimes we don't want to send our output anywhere, but we are forced
to send it somewhere. In that case, we can use IO::Null to create
a filehandle that simply discards anything that we give it. It looks
and acts just like a filehandle, but does nothing:
use IO::Null;
my $null_fh = IO::Null->new;
some_printing_thing( $null_fh, @args );
Other times, we want output in some cases but not in others. If we are
logged in and running our program in our terminal, we probably want to
see lots of output. However, if we schedule the job through cron, we
probably don't care so much about the output as long as it does the job.
The IO::Interactive module is smart enough to tell the difference:
use IO::Interactive qw(interactive);
print { interactive } 'Bamboo car frame';
The interactive subroutine returns a filehandle. Since the
call to the subroutine is not a simple scalar variable, we surround
it with braces to tell Perl that it's the filehandle.
Now that you know about "do nothing" filehandles, you can replace some
ugly code that everyone tends to write. In some cases you want output
and in some cases you don't, so many people use a post-expression
conditional to turn off a statement in some cases:
print STDOUT "Hey, the radio's not working!" if $Debug;
Instead of that, you can assign different values to $debug_fh based
on whatever condition you want, then leave off the ugly if $Debug
at the end of every print:
use IO::Null;
my $debug_fh = $Debug ? *STDOUT : IO::Null->new;
$debug_fh->print( "Hey, the radio's not working!" );
The magic behind IO::Null might give a warning about "print() on
unopened filehandle GLOB" with the indirect object notation (e.g.
print $debug_fh) even though it works just fine. We don't get that
warning with the direct form.
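If IO::Null isn't installed, roughly the same effect is available with core modules only, by opening the platform's null device. This is a swapped-in technique, not what the excerpt above uses; the MYDEBUG variable is an assumption for the example.

```perl
use strict;
use warnings;
use File::Spec;

my $Debug = $ENV{MYDEBUG} ? 1 : 0;    # hypothetical debug switch

# File::Spec->devnull gives /dev/null on Unix and the equivalent elsewhere.
open(my $null_fh, '>', File::Spec->devnull)
    or die "Cannot open null device: $!";

# Pick the destination once, then print without a trailing "if $Debug".
my $debug_fh = $Debug ? \*STDERR : $null_fh;
print {$debug_fh} "Hey, the radio's not working!\n";
```

Since both choices are real filehandles, the indirect-object print form works without the warning mentioned above.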

In Perl, is there any way to tie a stash?

Similar to the way AUTOLOAD can be used to define subroutines on demand, I am wondering if there is a way to tie a package's stash so that I can intercept access to variables in that package.
I've tried various permutations of the following idea, but none seem to work:
{package Tie::Stash;
use Tie::Hash;
BEGIN {our @ISA = 'Tie::StdHash'}
sub FETCH {
print "calling fetch\n";
}
}
{package Target}
BEGIN {tie %Target::, 'Tie::Stash'}
say $Target::x;
This dies with Bad symbol for scalar ... on the last line, without ever printing "calling fetch". If the say $Target::x; line is removed, the program runs and exits properly.
My guess is that the failure has to do with stashes being like, but not the same as hashes, so the standard tie mechanism is not working right (or it might just be that stash lookup never invokes tie magic).
Does anyone know if this is possible? Pure Perl would be best, but XS solutions are ok.
You're hitting a compile time internal error ("Bad symbol for scalar"), this happens while Perl is trying to work out what '$Target::x' should be, which you can verify by running a debugging Perl with:
perl -DT foo.pl
...
### 14:LEX_NORMAL/XOPERATOR ";\n"
### Pending identifier '$Target::x'
Bad symbol for scalar at foo.pl line 14.
I think the GV for '::Target' is replaced by something else when you tie() it, so that whatever eventually tries to get to its internal hash cannot. Given that tie() is a little bit of a mess, I suspect what you're trying to do won't work, which is also suggested by this (old) set of exchanges on p5p:
https://groups.google.com/group/perl.perl5.porters/browse_thread/thread/f93da6bde02a91c0/ba43854e3c59a744?hl=en&ie=UTF-8&q=perl+tie+stash#ba43854e3c59a744
A little late to the question, but although it's not possible to use tie to do this, Variable::Magic allows you to attach magic to a stash and thereby achieve something similar.

How can I make a static analysis call graph for Perl?

I am working on a moderately complex Perl program. As a part of its development, it has to go through modifications and testing. Due to certain environment constraints, running this program frequently is not an option that is easy to exercise.
What I want is a static call-graph generator for Perl. It doesn't have to cover every edge case (e.g., redefining variables to be functions or vice versa in an eval).
(Yes, I know there is a run-time call-graph generating facility with Devel::DprofPP, but run-time is not guaranteed to call every function. I need to be able to look at each function.)
Can't be done in the general case:
my $obj = Obj->new;
my $method = some_external_source();
$obj->$method();
However, it should be fairly easy to get a large number of the cases (run this program against itself):
#!/usr/bin/perl
use strict;
use warnings;
sub foo {
bar();
baz(quux());
}
sub bar {
baz();
}
sub baz {
print "foo\n";
}
sub quux {
return 5;
}
my %calls;
while (<>) {
next unless my ($name) = /^sub (\S+)/;
while (<>) {
last if /^}/;
next unless my @funcs = /(\w+)\(/g;
push @{$calls{$name}}, @funcs;
}
}
use Data::Dumper;
print Dumper \%calls;
Note, this misses
calls to functions that don't use parentheses (e.g. print "foo\n";)
calls to functions that are dereferenced (e.g. $coderef->())
calls to methods that are strings (e.g. $obj->$method())
calls that put the open parenthesis on a different line
other things I haven't thought of
It incorrectly catches
commented functions (e.g. #foo())
some strings (e.g. "foo()")
other things I haven't thought of
If you want a better solution than that worthless hack, it is time to start looking into PPI, but even it will have problems with things like $obj->$method().
Just because I was bored, here is a version that uses PPI. It only finds function calls (not method calls). It also makes no attempt to keep the names of the subroutines unique (i.e. if you call the same subroutine more than once it will show up more than once).
#!/usr/bin/perl
use strict;
use warnings;
use PPI;
use Data::Dumper;
use Scalar::Util qw/blessed/;
sub is {
my ($obj, $class) = @_;
return blessed($obj) && $obj->isa($class);
}
my $program = PPI::Document->new(shift);
my $subs = $program->find(
sub { $_[1]->isa('PPI::Statement::Sub') and $_[1]->name }
);
die "no subroutines declared?" unless $subs;
for my $sub (@$subs) {
print $sub->name, "\n";
next unless my $function_calls = $sub->find(
sub {
$_[1]->isa('PPI::Statement') and
$_[1]->child(0)->isa("PPI::Token::Word") and
not (
$_[1]->isa("PPI::Statement::Scheduled") or
$_[1]->isa("PPI::Statement::Package") or
$_[1]->isa("PPI::Statement::Include") or
$_[1]->isa("PPI::Statement::Sub") or
$_[1]->isa("PPI::Statement::Variable") or
$_[1]->isa("PPI::Statement::Compound") or
$_[1]->isa("PPI::Statement::Break") or
$_[1]->isa("PPI::Statement::Given") or
$_[1]->isa("PPI::Statement::When")
)
}
);
print map { "\t" . $_->child(0)->content . "\n" } @$function_calls;
}
I'm not sure it is 100% feasible (since Perl code can not be statically analyzed in theory, due to BEGIN blocks and such - see very recent SO discussion). In addition, subroutine references may make it very difficult to do even in places where BEGIN blocks don't come into play.
However, someone apparently made the attempt - I only know of it but never used it so buyer beware.
I don't think there is a "static" call-graph generator for Perl.
The next closest thing would be Devel::NYTProf.
Its main goal is profiling, but its output can tell you how many times a subroutine has been called, and from where.
If you need to make sure every subroutine gets called, you could also use Devel::Cover, which checks to make sure your test-suite covers every subroutine.
I recently stumbled across a script while trying to find an answer to this same question. The script (linked to below) uses GraphViz to create a call graph of a Perl program or module. The output can be in a number of image formats.
http://www.teragridforum.org/mediawiki/index.php?title=Perl_Static_Source_Code_Analysis
I solved a similar problem recently, and would like to share my solution.
This tool was born out of desperation, untangling an undocumented part of a 30,000-line legacy script, in order to implement an urgent bug fix.
It reads the source code(s), uses GraphViz to generate a png, and then displays the image on-screen.
Since it uses simple line-by-line regexes, the formatting must be "sane" so that nesting can be determined.
If the target code is badly formatted, run it through a linter first.
Also, don't expect miracles such as parsing dynamic function calls.
The silver lining of a simple regex engine is that it can be easily extended for other languages.
The tool now also supports awk, bash, basic, dart, fortran, go, lua, javascript, kotlin, matlab, pascal, perl, php, python, r, raku, ruby, rust, scala, swift, and tcl.
https://github.com/koknat/callGraph