perl - two stage conditional compilation

perl - two stage conditional compilation - perl

I have pretty big perl script executed quite frequently (from cron).
Most executions require pretty short & simple tests.
How to split single file script into two parts with "part two" compiled based on "part 1" decision?
Considered solution:
using BEGIN{ …; exit if …; } block for trivial test.
two file solution with file_1 using require to compile&execute file_2.
I would prefer single file solution to ease maintenance if the cost is reasonable.

First, you should measure how long the compilation really takes, to see if this "optimization" is even necessary. If it does happen to be, then since you said you'd prefer a one-file solution, one possible solution is using the __DATA__ section for code like so:
use warnings;
use strict;
# measure compliation and execution time
use Time::HiRes qw/ gettimeofday tv_interval /;
my $start;
BEGIN { $start = [gettimeofday] }
INIT { printf "%.06f\n", tv_interval($start) }
END { printf "%.06f\n", tv_interval($start) }
my $condition = 1; # dummy for testing
# conditionally compile and run the code in the DATA section
if ($condition) {
eval do { local $/; <DATA>.'; 1' } or die $#;
}
__DATA__
# ... lots of code here ...

I see two ways of achieving what you want. The simple one would be to divide the script in two parts. The first part will do the simple tests. Then, if you need to do more complicated tests you may "add" the second part. The way to do this is using eval like this:
<first-script.pl>
...
eval `cat second-script.pl`;
if( $# ) {
print STDERR $#, "\n";
die "Errors in the second script.\n";
}
Or using File::Slurp in a more robust way:
eval read_file("second-script.pl", binmode => ':utf8');
Or following #amon suggestion and do:
do "second-script.pl";
Only beware that do is different from eval in this way:
It also differs in that code evaluated with do FILE cannot see lexicals in the enclosing scope; eval STRING does. It's the same, however, in that it does reparse the file every time you call it, so you probably don't want to do this inside a loop.
The eval will execute in the context of the first script, so any variables or initializations will be available to that code.
Related to this, there is this question: Best way to add dynamic code to a perl application, which I asked some time ago (and answered myself with the help of the comments provided and some research.) I took some time to document everything I could think of for anyone (and myself) to refer to.
The second way I see would be to turn your testing script into a daemon and have the crontab bit call this daemon as necessary. The daemon remains alive so any data structures that you may need will remain in memory. On the down side, this will take resources in a continuos way as the daemon process will always be running.

Related

Using filehandles in Perl to alter actively running code

I've been learning about filehandles in Perl, and I was curious to see if there's a way to alter the source code of a program as it's running. For example, I created a script named "dynamic.pl" which contained the following:
use strict;
use warnings;
open(my $append, ">>", "dynamic.pl");
print $append "print \"It works!!\\n\";\n";
This program adds the line
print "It works!!\n";
to the end of it's own source file, and I hoped that once that line was added, it would then execute and output "It works!!"
Well, it does correctly append the line to the source file, but it doesn't execute it then and there.
So I assume therefore that when perl executes a program that it loads it to memory and runs it from there, but my question is, is there a way to access this loaded version of the program so you can have a program that can alter itself as you run it?

The missing piece you need is eval EXPR. This compiles, "evaluates", any string as code.
my $string = q[print "Hello, world!";];
eval $string;
This string can come from any source, including a filehandle.
It also doesn't have to be a single statement. If you want to modify how a program runs, you can replace its subroutines.
use strict;
use warnings;
use v5.10;
sub speak { return "Woof!"; }
say speak();
eval q[sub speak { return "Meow!"; }];
say speak();
You'll get a Subroutine speak redefined warning from that. It can be supressed with no warnings "redefine".
{
# The block is so this "no warnings" only affects
# the eval and not the entire program.
no warnings "redefine";
eval q[sub speak { return "Shazoo!"; }];
}
say speak();
Obviously this is a major security hole. There is many, many, many things to consider here, too long for an answer, and I strongly recommend you not do this and find a better solution to whatever problem you're trying to solve this way.
One way to mitigate the potential for damage is to use the Safe module. This is like eval but limits what built in functions are available. It is by no means a panacea for the security issues.

With a warning about all kinds of issues, you can reload modules.
There are packages for that, for example, Module::Reload. Then you can write code that you intend to change in a module, change the source at runtime, and have it reloaded.
By hand you would delete that from %INC and then require, like
# ... change source code in the module ...
delete $INC{'ModuleWithCodeThatChages.pm'};
require ModuleWithCodeThatChanges;
The only reason I can think of for doing this is experimentation and play. Otherwise, there are all kinds of concerns with doing something like this, and whatever your goal may be there are other ways to accomplish it.
Note The question does specify a filehandle. However, I don't see that to be really related to what I see to be the heart of the question, of modifying code at runtime.

The source file isn't used after it's been compiled.
You could just eval it.
use strict;
use warnings;
my $code = <<'__EOS__'
print "It works!!\n";
__EOS__
open(my $append_fh, ">>", "dynamic.pl")
or die($!);
print($append_fh $code);
eval("$code; 1")
or die($#);

There's almost definitely a better way to achieve your end goal here. BUT, you could recursively make exec() or system() calls -- latter if you need a return value. Be sure to setup some condition or the dominoes will keep falling. Again, you should rethink this, unless it's just practice of some sort, or maybe I don't get it!
Each call should execute the latest state of the file; also be sure to close the file before each call.
i.e.,
exec("dynamic.pl"); or
my retval;
retval = system("perl dynamic.pl");
Don't use eval ever.

Why is "fork inside BEGIN ... a horrible prospect" in Perl?

As part of a comment on another SO Q&A, a user noted that:
fork inside BEGIN is a horrible prospect
Why is that a "horrible prospect"? (on a technical level - let's leave aside readability or prettiness of design)
use strict;
BEGIN {
# Begin block to fork off a child process.
# This is done in BEGIN, because otherwise My::HeavyModule module
# will be loaded BEFORE the fork and thus inherited by child process
# which is something we want to avoid
my $init = 1; # Some lightweight init code
if (my $pid = fork()) {
# Nothing to do here, proceed to the rest of the main program
} else {
die "cannot fork" unless defined $pid;
print "Child process started!\n";
exit 0;
}
} # End BEGIN block
use My::HeavyModule; # Very heavyweight on both startup time and memory
# Start parent process logic using My::HeavyModule;
Just to be clear: I am NOT asking if there are better ways to achieve what this code is doing. I'm asking why this approach was called "horrible", NOT whether it can be replaced by something that may be better.

I wrote the following comment which you are referring to:
fork inside BEGIN is a horrible prospect. You could also delay compilation for some parts using eval "string" or require, but that also has its issues.
although not because of technical reasons (there are none beyond what dan1111 mentioned, and an explicit portability warning in perlfork), but because it completely broke my expectations about any programs behaviour.
Each piece of code has a compile time. I expect that packages and classes will be set up here, and maybe that some metaprogramming takes place.
Then, there is a run time during which the main control flow of our program occurs, during which the job of this program is being carried out.
Perl complicates this simple distinction in that one block's compile time is another block's run time (e.g. via BEGIN or use or eval or require). But there is always a clear reference point: The phase of $0, the program originally being invoked (see also ${^GLOBAL_PHASE}). There will always be cases where it is a good idea to do funky things during the main script's compile time, but this doesn't necessitate that doing so would be a best practice – on the contrary, and thus my objection that doing so would be a “horrible prospect”.
If the main job of your program is to kick off two other independent programs, it might look like this:
use strict;
use warnings;
my $pid = fork;
if (not defined $pid) {
die "welp, can't fork: $!";
}
if ($pid) {
exec $^X, "heavy_program_you_intended_to_be.pl", #ARGV;
}
else {
... # background yourself, etc.
exec $^X, "light_program_you_intended_to_kick_off.pl";
}
But it would not fork in a BEGIN. I think this is one of many examples where you can do something with Perl, but this doesn't mean you should.

You are running a process of your program before you even know if the whole thing compiles successfully. That doesn't seem to be the type of thing you should want to do.
Readability.
Prettiness of design.
(Sorry, I know you said to ignore those last two. But that isn't really fair, is it? Both are very important considerations in whether code is a "horrible prospect".)
On the other hand, I love that you can do ill-advised things like this in Perl. And I'm not going to tell you you need to re-write your code. That depends too much on the situation: how mission-critical is this code? How maintainable does it need to be? How many other developers will work on it? etc.

Perl - New definition of myprint() or Overload print command

I am a newb to Perl. I am writing some scripts and want to define my own print called myprint() which will print the stuff passed to it based on some flags (verbose/debug flag)
open(FD, "> /tmp/abc.txt") or die "Cannot create abc.txt file";
print FD "---Production Data---\n";
myprint "Hello - This is only a comment - debug data";
Can someone please help me with some sample code to for myprint() function?

Do you care more about writing your own logging system, or do you want to know how to put logging statements in appropriate parts of your program which you can turn off (and, incur little performance penalty when they are turned off)?
If you want a logging system that is easy to start using, but also offers a world of features which you can incrementally discover and use, Log::Log4perl is a good option. It has an easy mode, which allows you to specify the desired logging level, and emits only those logging messages that are above the desired level.
#!/usr/bin/env perl
use strict; use warnings;
use File::Temp qw(tempfile);
use Log::Log4perl qw(:easy);
Log::Log4perl->easy_init({level => $INFO});
my ($fh, $filename) = tempfile;
print $fh "---Production Data---\n";
WARN 'Wrote something somewhere somehow';
The snippet also shows a better way of opening a temporary file using File::Temp.
As for overriding the built-in print … It really isn't a good idea to fiddle with built-ins except in very specific circumstances. perldoc perlsub has a section on Overriding Built-in Functions. The accepted answer to this question lists the Perl built-ins that cannot be overridden. print is one of those.
But, then, one really does not need to override a built-in to write a logging system.
So, if an already-written logging system does not do it for you, you really seem to be asking "how do I write a function that prints stuff conditionally depending on the value of a flag?"
Here is one way:
#!/usr/bin/env perl
package My::Logger;
{
use strict; use warnings;
use Sub::Exporter -setup => {
exports => [
DEBUG => sub {
return sub {} unless $ENV{MYDEBUG};
return sub { print 'DEBUG: ' => #_ };
},
]
};
}
package main;
use strict; use warnings;
# You'd replace this with use My::Logger qw(DEBUG) if you put My::Logger
# in My/Logger.pm somewhere in your #INC
BEGIN {
My::Logger->import('DEBUG');
}
sub nicefunc {
print "Hello World!\n";
DEBUG("Isn't this a nice function?\n");
return;
}
nicefunc();
Sample usage:
$ ./yy.pl
Hello World!
$ MYDEBUG=1 ./yy.pl
Hello World!
DEBUG: Isn't this a nice function?

I wasn't going to answer this because Sinan already has the answer I'd recommend, but tonight I also happened to be working on the "Filehandle References" chapter to the upcoming Intermediate Perl. That are a couple of relevant paragraphs which I'll just copy directly without adapting them to your question:
IO::Null and IO::Interactive
Sometimes we don't want to send our output anywhere, but we are forced
to send it somewhere. In that case, we can use IO::Null to create
a filehandle that simply discards anything that we give it. It looks
and acts just like a filehandle, but does nothing:
use IO::Null;
my $null_fh = IO::Null->new;
some_printing_thing( $null_fh, #args );
Other times, we want output in some cases but not in others. If we are
logged in and running our program in our terminal, we probably want to
see lots of output. However, if we schedule the job through cron, we
probably don't care so much about the output as long as it does the job.
The IO::Interactive module is smart enough to tell the difference:
use IO::Interactive;
print { is_interactive } 'Bamboo car frame';
The is_interactive subroutine returns a filehandle. Since the
call to the subroutine is not a simple scalar variable, we surround
it with braces to tell Perl that it's the filehandle.
Now that you know about "do nothing" filehandles, you can replace some
ugly code that everyone tends to write. In some cases you want output
and in some cases you don't, so many people use a post-expression
conditional to turn off a statement in some cases:
print STDOUT "Hey, the radio's not working!" if $Debug;
Instead of that, you can assign different values to $debug_fh based
on whatever condition you want, then leave off the ugly if $Debug
at the end of every print:
use IO::Null;
my $debug_fh = $Debug ? *STDOUT : IO::Null->new;
$debug_fh->print( "Hey, the radio's not working!" );
The magic behind IO::Null might give a warning about "print() on
unopened filehandle GLOB" with the indirect object notation (e.g.
print $debug_fh) even though it works just fine. We don't get that
warning with the direct form.

How can I make a static analysis call graph for Perl?

I am working on a moderately complex Perl program. As a part of its development, it has to go through modifications and testing. Due to certain environment constraints, running this program frequently is not an option that is easy to exercise.
What I want is a static call-graph generator for Perl. It doesn't have to cover every edge case(e,g., redefining variables to be functions or vice versa in an eval).
(Yes, I know there is a run-time call-graph generating facility with Devel::DprofPP, but run-time is not guaranteed to call every function. I need to be able to look at each function.)

Can't be done in the general case:
my $obj = Obj->new;
my $method = some_external_source();
$obj->$method();
However, it should be fairly easy to get a large number of the cases (run this program against itself):
#!/usr/bin/perl
use strict;
use warnings;
sub foo {
bar();
baz(quux());
}
sub bar {
baz();
}
sub baz {
print "foo\n";
}
sub quux {
return 5;
}
my %calls;
while (<>) {
next unless my ($name) = /^sub (\S+)/;
while (<>) {
last if /^}/;
next unless my #funcs = /(\w+)\(/g;
push #{$calls{$name}}, #funcs;
}
}
use Data::Dumper;
print Dumper \%calls;
Note, this misses
calls to functions that don't use parentheses (e.g. print "foo\n";)
calls to functions that are dereferenced (e.g. $coderef->())
calls to methods that are strings (e.g. $obj->$method())
calls the putt the open parenthesis on a different line
other things I haven't thought of
It incorrectly catches
commented functions (e.g. #foo())
some strings (e.g. "foo()")
other things I haven't thought of
If you want a better solution than that worthless hack, it is time to start looking into PPI, but even it will have problems with things like $obj->$method().
Just because I was bored, here is a version that uses PPI. It only finds function calls (not method calls). It also makes no attempt to keep the names of the subroutines unique (i.e. if you call the same subroutine more than once it will show up more than once).
#!/usr/bin/perl
use strict;
use warnings;
use PPI;
use Data::Dumper;
use Scalar::Util qw/blessed/;
sub is {
my ($obj, $class) = #_;
return blessed $obj and $obj->isa($class);
}
my $program = PPI::Document->new(shift);
my $subs = $program->find(
sub { $_[1]->isa('PPI::Statement::Sub') and $_[1]->name }
);
die "no subroutines declared?" unless $subs;
for my $sub (#$subs) {
print $sub->name, "\n";
next unless my $function_calls = $sub->find(
sub {
$_[1]->isa('PPI::Statement') and
$_[1]->child(0)->isa("PPI::Token::Word") and
not (
$_[1]->isa("PPI::Statement::Scheduled") or
$_[1]->isa("PPI::Statement::Package") or
$_[1]->isa("PPI::Statement::Include") or
$_[1]->isa("PPI::Statement::Sub") or
$_[1]->isa("PPI::Statement::Variable") or
$_[1]->isa("PPI::Statement::Compound") or
$_[1]->isa("PPI::Statement::Break") or
$_[1]->isa("PPI::Statement::Given") or
$_[1]->isa("PPI::Statement::When")
)
}
);
print map { "\t" . $_->child(0)->content . "\n" } #$function_calls;
}

I'm not sure it is 100% feasible (since Perl code can not be statically analyzed in theory, due to BEGIN blocks and such - see very recent SO discussion). In addition, subroutine references may make it very difficult to do even in places where BEGIN blocks don't come into play.
However, someone apparently made the attempt - I only know of it but never used it so buyer beware.

I don't think there is a "static" call-graph generator for Perl.
The next closest thing would be Devel::NYTProf.
The main goal is for profiling, but it's output can tell you how many times a subroutine has been called, and from where.
If you need to make sure every subroutine gets called, you could also use Devel::Cover, which checks to make sure your test-suite covers every subroutine.

I recently stumbled across a script while trying to solve find an answer to this same question. The script (linked to below) uses GraphViz to create a call graph of a Perl program or module. The output can be in a number of image formats.
http://www.teragridforum.org/mediawiki/index.php?title=Perl_Static_Source_Code_Analysis

I solved a similar problem recently, and would like to share my solution.
This tool was born out of desperation, untangling an undocumented part of a 30,000-line legacy script, in order to implement an urgent bug fix.
It reads the source code(s), uses GraphViz to generate a png, and then displays the image on-screen.
Since it uses simple line-by-line regexes, the formatting must be "sane" so that nesting can be determined.
If the target code is badly formatted, run it through a linter first.
Also, don't expect miracles such as parsing dynamic function calls.
The silver lining of a simple regex engine is that it can be easily extended for other languages.
The tool now also supports awk, bash, basic, dart, fortran, go, lua, javascript, kotlin, matlab, pascal, perl, php, python, r, raku, ruby, rust, scala, swift, and tcl.
https://github.com/koknat/callGraph

How do I run a Perl script from within a Perl script?

I've got a Perl script that needs to execute another Perl script. This second script can be executed directly on the command line, but I need to execute it from within my first program. I'll need to pass it a few parameters that would normally be passed in when it's run standalone (the first script runs periodically, and executes the second script under a certain set of system conditions).
Preliminary Google searches suggest using backticks or a system() call. Are there any other ways to run it? (I'm guessing yes, since it's Perl we're talking about :P ) Which method is preferred if I need to capture output from the invoked program (and, if possible, pipe that output as it executes to stdout as though the second program were invoked directly)?
(Edit: oh, now SO suggests some related questions. This one is close, but not exactly the same as what I'm asking. The second program will likely take an hour or more to run (lots of I/O), so I'm not sure a one-off invocation is the right fit for this.)

You can just do it.
{
local #ARGV = qw<param1 param2 param3>;
do '/home/buddy/myscript.pl';
}
Prevents the overhead of loading in another copy of perl.

The location of your current perl interpreter can be found in the special variable $^X. This is important if perl is not in your path, or if you have multiple perl versions available but which to make sure you're using the same one across the board.
When executing external commands, including other Perl programs, determining if they actually ran can be quite difficult. Inspecting $? can leave lasting mental scars, so I prefer to use IPC::System::Simple (available from the CPAN):
use strict;
use warnings;
use IPC::System::Simple qw(system capture);
# Run a command, wait until it finishes, and make sure it works.
# Output from this program goes directly to STDOUT, and it can take input
# from your STDIN if required.
system($^X, "yourscript.pl", #ARGS);
# Run a command, wait until it finishes, and make sure it works.
# The output of this command is captured into $results.
my $results = capture($^X, "yourscript.pl", #ARGS);
In both of the above examples any arguments you wish to pass to your external program go into #ARGS. The shell is also avoided in both of the above examples, which gives you a small speed advantage, and avoids any unwanted interactions involving shell meta-characters. The above code also expects your second program to return a zero exit value to indicate success; if that's not the case, you can specify an additional first argument of allowable exit values:
# Both of these commands allow an exit value of 0, 1 or 2 to be considered
# a successful execution of the command.
system( [0,1,2], $^X, "yourscript.pl", #ARGS );
# OR
capture( [0,1,2, $^X, "yourscript.pl", #ARGS );
If you have a long-running process and you want to process its data while it's being generated, then you're probably going to need a piped open, or one of the more heavyweight IPC modules from the CPAN.
Having said all that, any time you need to be calling another Perl program from Perl, you may wish to consider if using a module would be a better choice. Starting another program carries quite a few overheads, both in terms of start-up costs, and I/O costs for moving data between processes. It also significantly increases the difficulty of error handling. If you can turn your external program into a module, you may find it simplifies your overall design.
All the best,
Paul

I can think of a few ways to do this. You already mentioned the first two, so I won't go into detail on them.
backticks: $retVal = `perl somePerlScript.pl`;
system() call
eval
The eval can be accomplished by slurping the other file into a string (or a list of strings), then 'eval'ing the strings. Heres a sample:
#!/usr/bin/perl
open PERLFILE, "<somePerlScript.pl";
undef $/; # this allows me to slurp the file, ignoring newlines
my $program = <PERLFILE>;
eval $program;
4 . do: do 'somePerlScript.pl'

You already got good answers to your question, but there's always the posibility to take a different point of view: maybe you should consider refactoring the script that you want to run from the first script. Turn the functionality into a module. Use the module from the first and from the second script.

If you need to asynchronously call your external script -you just want to launch it and not wait for it to finish-, then :
# On Unix systems, either of these will execute and just carry-on
# You can't collect output that way
`myscript.pl &`;
system ('myscript.pl &');
# On Windows systems the equivalent would be
`start myscript.pl`;
system ('start myscript.pl');
# If you just want to execute another script and terminate the current one
exec ('myscript.pl');

Use backticks if you need to capture the output of the command.
Use system if you do not need to capture the output of the command.
TMTOWTDI: so there are other ways too, but those are the two easiest and most likely.

See the perlipc documentation for several options for interprocess communication.
If your first script merely sets up the environment for the second script, you may be looking for exec.

#!/usr/bin/perl
use strict;
open(OUTPUT, "date|") or die "Failed to create process: $!\n";
while (<OUTPUT>)
{
print;
}
close(OUTPUT);
print "Process exited with value " . ($? >> 8) . "\n";
This will start the process date and pipe the output of the command to the OUTPUT filehandle which you can process a line at a time. When the command is finished you can close the output filehandle and retrieve the return value of the process. Replace date with whatever you want.

I wanted to do something like this to offload non-subroutines into an external file to make editing easier. I actually made this into a subroutine. The advantage of this way is that those "my" variables in the external file get declared in the main namespace. If you use 'do' they apparently don't migrate to the main namespace. Note the presentation below doesn't include error handling
sub getcode($) {
my #list;
my $filename = shift;
open (INFILE, "< $filename");
#list = <INFILE>;
close (INFILE);
return \#list;
}
# and to use it:
my $codelist = [];
$codelist = getcode('sourcefile.pl');
eval join ("", #$codelist);

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse