Re-use a sub from a perl script in another script - perl

I have two perl scripts. Both have no "package" keyword or anything. One has a sub (and plus some free floating code too) that I want to use in another script, without running the free floating code in the process.
A.pl
sub my_sub {
# do something
}
# do something else
my_sub();
# do something else
B.pl
require A.pl; # but load only the subs, don't run anything
my_sub();
Is this possible without having to separate out the sub in a separate .pm file and then loading it?

require will run the code. So then you need A.pl to recognise the difference between being loaded and being called on the command line. I think it can be done, but you're kind of missing the point.
Instead, I highly recommend reversing everything. Instead of having mammoth pl scripts, and no pm modules, reverse it and have tiny pl files whose sole function is to load module(s) and call into the function that encapsulates the functionality there. And then it's trivial to add new modules to hold shared form or functionality.

Related

Running nested, dependent perl scripts

I have two perl scripts that I need to run together.
The first script defines a number of common functions and a main method.
Script 1(one.pl) Example:
#!/usr/bin/perl
sub getInformation
{
my $serverMode = $_[0];
my $port = $_[1];
return "You\nneed\nto\nparse\nme\nright\nnow\n!\n";
}
#main
&parseInformation(&getInformation($ARGV[0], $ARGV[1]));
The second is a script that calls the second script after defining two functions.
Script 2(two.pl) Example:
#!/usr/bin/perl
sub parseInformation
{
my $parsedInformation = $_[0];
#omitted
print "$parsedInformation";
}
my $parameterOne = $ARGV[0];
my $parameterTwo = $ARGV[1];
do "./one.pl $parameterOne $parameterTwo";
Command line usage:
> ./two.pl bayside 20
I have attempted to do this and the script seems to run however, whenever I run the script in perl -d two.pl mode I get no information from the debugger about the other script.
I have done some research and read about system, capture, require and do. If use the system function to run the script, how will I be able to export the functions defined in script two?
Questions:
1. Is there anyway to do this in perl?
2. If so how exactly do I need to achieve that?
I fully understand that perl is perl. Not another programming language. Unfortunately, when transitioning one tends to bring with what they knew with them. My apologies.
References:
How to run a per script from within a perl script
Perl documentation for require function
Generally speaking, that is not the way you should write reusable common functions in Perl. Instead, you should put the bulk of your code into Perl modules, and write just short scripts that act as wrappers for the modules. These short scripts should basically just grab and validate command-line arguments, pass those arguments to the modules for the real work, then format and output the results.
I really wish I could recommend perldoc perlmod to learn about writing modules, but it seems to mostly concentrate on the minutiae rather than a high-level overview of how to write and use a Perl module. Gabor Szabo's tutorial is perhaps a better place to start.
Here's a simple example, creating a script that outputs the Unix timestamp. This is the module:
# This file is called "lib/MyLib/DateTime.pm"
use strict;
use warnings;
package MyLib::DateTime;
use parent "Exporter";
our #EXPORT_OK = qw( get_timestamp );
sub get_timestamp {
my $ts = time;
return $ts;
}
1;
And this is the script that uses it:
#!/usr/bin/env perl
use strict;
use warnings;
use lib "/path/to/lib"; # path to the module we want, but
# excluding the "MyLib/DateTime.pm" part
use MyLib::DateTime qw( get_timestamp ); # import the function we want
# Here we might deal with input; e.g. #ARGV
# but as get_timestamp doesn't need any input, we don't
# have anything to do.
# Here we'll call the function we defined in the module.
my $result = get_timestamp();
# And here we'll do the output
print $result, "\n";
Now, running the script should output the current Unix timestamp. Another script that was doing something more complex with timestamps could also use MyLib::DateTime.
More importantly, another module which needed to do something with timestamps could use MyLib::DateTime. Putting logic into modules, and having those modules use each other is really the essence of CPAN. I've been demonstrating a really basic date and time library, but the king of datetime manipulation is the DateTime module on CPAN. This in turn uses DateTime::TimeZone.
The ease of re-using code, and the availability of a large repository of free, well-tested, and (mostly) well-documented modules on CPAN, is one of the key selling points of Perl.
Exactly.
Running 2 separate scripts at the same time won't give either script access to the others functions at all. They are 2 completely separate processes. You need to use modules. The point of modules is so that you don't repeat yourself, often called "dry" programming. A simple rule of thumb is:
If you are going to use a block of code more than once put it into a subroutine in the current script.
If you are going to use the same block in several programs put it in a module.
Also remember that common problems usually have a module on CPAN
That should be enough to get you going. Then if you're going to do much Perl Programming you should buy the book "Programming Perl" by Larry Wall, if you've programmed in other languages, or "Learning Perl" by Randal Schwartz if you're new to programming. I'm old fashioned so I have both books in print but you can still get them as ebooks. Also check out Perl.org as you're not alone.

identify a procedure and replace it with a different procedure

What I want to achieve:
###############CODE########
old_procedure(arg1, arg2);
#############CODE_END######
I have a huge code which has a old procedure in it. I want that the call to that old_procedure go to a call to a new procedure (new_procedure(arg1, arg2)) with the same arguments.
Now I know, the question seems pretty stupid but the trick is I am not allowed to change the code or the bad_function. So the only thing I can do it create a procedure externally which reads the code flow or something and then whenever it finds the bad_function, it replaces it with the new_function. They have a void type, so don't have to worry about the return values.
I am usng perl. If someone knows how to atleast start in this direction...please comment or answer. It would be nice if the new code can be done in perl or C, but other known languages are good too. C++, java.
EDIT: The code is written in shell script and perl. I cannot edit the code and I don't have location of the old_function, I mean I can find it...but its really tough. So I can use the package thing pointed out but if there is a way around it...so that I could parse the thread with that function and replace function calls. Please don't remove tags as I need suggestions from java, C++ experts also.
EDIT: #mirod
So I tried it out and your answer made a new subroutine and now there is no way of accessing the old one. I had created an variable which checks the value to decide which way to go( old_sub or new_sub)...is there a way to add the variable in the new code...which sends the control back to old_function if it is not set...
like:
use BadPackage; # sub is defined there
BEGIN
{ package BapPackage;
no warnings; # to avoid the "Subroutine bad_sub redefined" message
# check for the variable and send to old_sub if the var is not set
sub bad_sub
{ # good code
}
}
# Thanks #mirod
This is easier to do in Perl than in a lot of other languages, but that doesn't mean it's easy, and I don't know if it's what you want to hear. Here's a proof-of-concept:
Let's take some broken code:
# file name: Some/Package.pm
package Some::Package;
use base 'Exporter';
our #EXPORT = qw(forty_two nineteen);
sub forty_two { 19 }
sub nineteen { 19 }
1;
# file name: main.pl
use Some::Package;
print "forty-two plus nineteen is ", forty_two() + nineteen();
Running the program perl main.pl produces the output:
forty-two plus nineteen is 38
It is given that the files Some/Package.pm and main.pl are broken and immutable. How can we fix their behavior?
One way we can insert arbitrary code to a perl command is with the -M command-line switch. Let's make a repair module:
# file: MyRepairs.pm
CHECK {
no warnings 'redefine';
*forty_two = *Some::Package::forty_two = sub { 42 };
};
1;
Now running the program perl -MMyRepairs main.pl produces:
forty-two plus nineteen is 61
Our repair module uses a CHECK block to execute code in between the compile-time and run-time phase. We want our code to be the last code run at compile-time so it will overwrite some functions that have already been loaded. The -M command-line switch will run our code first, so the CHECK block delays execution of our repairs until all the other compile time code is run. See perlmod for more details.
This solution is fragile. It can't do much about modules loaded at run-time (with require ... or eval "use ..." (these are common) or subroutines defined in other CHECK blocks (these are rare).
If we assume the shell script that runs main.pl is also immutable (i.e., we're not allowed to change perl main.pl to perl -MMyRepairs main.pl), then we move up one level and pass the -MMyRepairs in the PERL5OPT environment variable:
PERL5OPT="-I/path/to/MyRepairs -MMyRepairs" bash the_immutable_script_that_calls_main_pl.sh
These are called automated refactoring tools and are common for other languages. For Perl though you may well be in a really bad way because parsing Perl to find all the references is going to be virtually impossible.
Where is the old procedure defined?
If it is defined in a package, you can switch to the package, after it has been used, and redefine the sub:
use BadPackage; # sub is defined there
BEGIN
{ package BapPackage;
no warnings; # to avoid the "Subroutine bad_sub redefined" message
sub bad_sub
{ # good code
}
}
If the code is in the same package but in a different file (loaded through a require), you can do the same thing without having to switch package.
if all the code is in the same file, then change it.
sed -i 's/old_procedure/new_procedure/g codefile
Is this what you mean?

Having a perl script make use of one among several secondary scripts

I have a main program mytool.pl to be run from the command line. There are several auxillary scripts special1.pl, special2.pl, etc. which each contain a couple subroutines and a hash, all identically named across scripts. Let's suppose these are named MySpecialFunction(), AnotherSpecialFunction() and %SpecialData.
I'd like for mytool to include/use/import the contents of one of the special*.pl files, only one, according to a command line option. For example, the user will do:
bash> perl mytool.pl --specialcase=5
and mytools will use MySpecialFunction() from special5.pl, and ignore all other special*.pl files.
Is this possible and how to do it?
It's important to note that the selection of which special file to use is made at runtime, so adding a "use" at the top of mytool.pl probably isn't the right thing to do.
Note I am a long-time C programmer, not a perl expert; I may be asking something obvious.
This is for a one-off project that will turn to dust in only a month. Neither mytool.pl nor special?.pl (nor perl itself) will be of interest beyond the end of this short project. Therefore, we don't care for solutions that are elaborate or require learning some deep magic. Quick and dirty preferred. I'm guessing that Perl's module mechanism is overkill for this, but have no idea what the alternatives are.
You can use a hash or array to map values of specialcase to .pl files and require or do them as needed.
#!/usr/bin/env perl
use strict; use warnings;
my #handlers = qw(one.pl two.pl);
my ($case) = #ARGV;
$case = 0 unless defined $case;
# check that $case is within range
do $handlers[$case];
print special_function(), "\n";
When you use a module, Perl just require's the module in a BEGIN block (and imports the modules exported items). Since you want to change what script you load at runtime, call require yourself.
if ($special_case_1) {
require 'special1.pl';
# and go about your business
}
Here's a good reference on when to use use vs. require.

.pm file that's loaded on every invocation of the perl interpreter?

I thought I remember reading somewhere about where perl can be configured to automatically load a certain .pm file on start up.
I know about PERL5OPT, but to my recollection, this was a specific file that would be loaded if it exists.
Is it a compile option that can be set (i.e. via Configure)?
Reading through perldoc perlrun it looks like you are looking for what is talked about in the -f option:
-f
Disable executing $Config{sitelib}/sitecustomize.pl at startup.
Perl can be built so that it by default will try to execute
$Config{sitelib}/sitecustomize.pl at startup (in a BEGIN block). This
is a hook that allows the sysadmin to customize how Perl behaves. It
can for instance be used to add entries to the #INC array to make Perl
find modules in non-standard locations.
Perl actually inserts the following code:
BEGIN {
do { local $!; -f "$Config{sitelib}/sitecustomize.pl"; }
&& do "$Config{sitelib}/sitecustomize.pl";
}
Since it is an actual do (not a require), sitecustomize.pl doesn't
need to return a true value. The code is run in package main , in its
own lexical scope. However, if the script dies, $# will not be set.
The value of $Config{sitelib} is also determined in C code and not
read from Config.pm , which is not loaded.
The code is executed very early. For example, any changes made to #INC
will show up in the output of perl -V. Of course, END blocks will be
likewise executed very late.
To determine at runtime if this capability has been compiled in your
perl, you can check the value of $Config{usesitecustomize} .
I've never done this, but it looks like if you put what you want in $Config{sitelib}/sitecustomize.pl you'll get what you are looking for.
See:
http://perldoc.perl.org/perlrun.html
http://www.nntp.perl.org/group/perl.perl5.porters/2007/10/msg129926.html
I'm confused by what you mean by "on start up". If you mean when a script / CGI / whatever is "started", then just use the module in the script:
use Data::Dumper;
Or do you mean something else?

How should I organize many Perl modules?

Consider that I have 100 Perl modules in 12 directories. But, looking into the main Perl script, it looks like 100 use p1 ; use p2 ; etc. What is the to best way to solve this issue?
It seems unlikely to me that you're useing all 100 modules directly in your main program. If your program uses a function in module A which then calls a function from module B, but the main program itself doesn't reference anything in module B, then the program should only use A. It should not use B unless it directly calls anything from module B.
If, on the other hand, your main program really does talk directly to all 100 modules, then it's probably just plain too big. Identify different functional groupings within the program and break each of those groups out into its own module. The main reason for doing this is so that it will result in code that is more maintainable, flexible, and reusable, but it will also have the happy side-effect of reducing the number of modules that the main program talks to directly, thus cutting down on the number of use statements required in any one place.
(And, yes, I do realize that 100 was probably an exaggeration, but, if you're getting uncomfortable about the number of modules being used by your code, then that's usually a strong indication that the code in question is trying to do too much in one place and should be broken down into a collection of modules.)
Put all the use statements in one file, say Mods.pm:
package Mods;
use Mod1;
use Mod2;
...
and include the file in your main script:
use Mods;
I support eugene's solution, but you could group the use statements in files by topic, like:
package Math;
use ModMatrix;
use ModFourier;
...
And of course you should name the modules and the mod-collections meaningful.
Putting all of the use statements in a separate file as eugene y suggested is probably the best approach. you can minimize the typing in that module with a bit of meta programming:
package Mods;
require Exporter;
our #ISA = 'Exporter';
my #packages = qw/Mod1 Mod2 Mod3 .... /;
# or map {"Mod$_"} 1 .. 100 if your modules are actually named that way
for (#packages) {
eval "require $_" or die $#; # 'use' means "require pkg; pkg->import()"
$_->import(); # at compile time
}
our #EXPORT = grep {*{$Mods::{$_}}{CODE}} keys %Mods::; # grab imported subs
#or #EXPORT_OK