Checking if imported modules are ever used

Checking if imported modules are ever used - perl

For whatever reason my coworkers (and probably myself in the past) have imported modules that we don't use to our projects. It's so bad that modules that don't even exist are being imported.
In Perl, you can say something like:
use Socket;
gethostname("DERP");
It's very hard to tell from that function that it was tied to the socket module.
Is there any program that exists or some functionality of Perl that I'm not aware of that can tell you if you've got bloat imported into your script?

That's why I always explicitly lists imports. e.g. use Socket qw( gethostname ); or use Foo qw( ); (The latter imports nothing rather than defaults.)
One approach would be to go file by file and add these import lists, then stop loading module with an empty import list for which no mention of its package name exists in the file.

Use a profiler to see what modules ARE in use. From that find out which are not. Works for me:
Install the Devel::DProf module if you havent (cpanm Devel::DProf or if ubuntu: apt install libdevel-dprof-perl). Newer profilers also exists, i.e. Devel::NYTProf
Change #!/use/bin/perl to #!/use/bin/perl -d:DProf in your program and then run it. This will make a tmon.out file in your current dir
Run dprofpp -aO9999 to analyze tmon.out. This outputs all executed subs including the module name on each line.
Note: if a sub is called conditionally (inside an if-block for instance) the profiler will not list it if the condition is always false. It helps to have a test suit with full code coverage and run that in profiling mode.

Related

Force locally installed module to use locally installed dependencies

I'm working on a project that requires all third-party (read: CPAN) perl modules to be installed in a perforce repository, so that any code that depends on them can be successfully run without anyone else needing to manually install them. I use cpanminus to install my CPAN modules, so I ran cpanm -L . Moose in the desired directory, and everything installed successfully. However, when I try to compile a module I made with Moose, I sometimes get this error:
Undefined subroutine &Carp::longmess_heavy called at /usr/lib/perl5/5.8.8/Carp.pm line 235.
It looks like, even though Carp was installed into my local directory with Moose, it is using the (outdated) version in /user/lib/perl5/5.8.8. I could upgrade Carp on my machine, but as soon as I check my code into the repository everyone else with their outdated Carps will run into the same issue. So how do I force Moose to use the locally installed Carp, rather than the one in /usr/lib/perl5/5.8.8?

You don't force Moose, you force perl. You've installed the module in a location perl doesn't know anything about, so you need to tell perl about it.
Since you want to affect all scripts, you'd want to place in your login script.
export PERL5LIB=/home/.../lib
If you wanted to only affect that one program, you'd launch the program using the following:
PERL5LIB=/home/.../lib script
or you'd add the following to your script:
use lib '/home/.../lib';

I managed to find a solution. It's messy, but that's the price I have to pay for joining a project that already has a messy system in place.
Before including Moose, I have to explicitly required the new Carp:
require "[path to Carp]/Carp.pm";
This generated a lot of warnings about redefining subroutines, so I had to (temporarily) suppress them:
my $restore_warn = $SIG{'__WARN__'};
$SIG{'__WARN__'} = sub {};
require "[path to Carp]/Carp.pm";
$SIG{'__WARN__'} = $restore_warn;

Perl to generate one executable file for a script which uses any number of modules and libraries

I am working on creating an agent in perl which does several actions. It uses several modules which are in .pm format and also few libraries. Now i want to convert it as one executable file so that i can install in n number of servers by copying that single file. Is it something i can achieve in perl? I am just a beginner in perl, perhaps my question might sound dumb but it will teach me something.

pp script provided with PAR::Packer is able to create single-file executables. An example from its page:
pp -o foo foo.pl bar.pl # Pack 'foo.pl' and 'bar.pl' into 'foo'

Some modules are included with Perl, so even though they're separate modules, they will work on other Perl installs without installing those modules. These include File::Copy, File::Find, Time::Piece.
You can see the listing of all standard modules on the Perldoc home page. Be sure to set the drop down version field (located on the left side) to the version that you're using. It goes all the way back to Perl 5.8.8 which is on Solaris.
It is entirely possible that the modules you need are already included in the standard Perl distribution, so there's no need to worry. Sometimes, you can substitute a non-standard module that's being used for one that's a standard module with little rewriting.
Some modules include compiled C code and can't be redistributed. They must be compiled on the machine they'r running on and installed. However, most modules are pure Perl modules, and can be redistributed with a program.
If a module isn't a standard module, and it's a pure Perl module, there are two ways it can be redistributed:
Perl has an #INC list that says what directories to search for when you search for modules. There's a Perl use lib pragma that allows you to add directories. You could include modules as sub directories for your program, and then zip up the entire structure. Users would unzip the entire directory tree which would include your program and the modules you need. By the way, the default #INC usually includes the current directory.
The other way is to append the modules to your program and then remove the use statement for that module (since it's now part of the file). This is a bit tricky, but it means a single program file.
Just remember that a module might require another module, so check thoroughly.
Another thing you can do is check for the module, and if it isn't there, download it via CPAN. Testing is easy:
BEGIN {
eval {
require My::Module; Module->import( LIST );
};
if ($#) {
die qq(Module doesn't exist);
}
}
Of course, doing a die is sort of silly because use would do that. However, it might be possible instead of dying to load the module via the CPAN module programmer's interface. I've never done that, and I don't know people who have. But, it is possible.
So, your best bet is to check to see if your program uses standard Perl modules, and if not, see if you can modify the program to use them. For example, if your program uses Archive::Zip, you might be able to modify it to use IO::Uncompress::Unzip and IO::Compress::Zip instead.
Otherwise, your choice is to try include those modules for installation (and watch for recursiveness and non-Pure Perl modules) or to try to detect that a module isn't installed, and programmatically install it.

The answer is a bit complicated.
The nature of Perl makes it practically impossible to compile a perl script in most use cases, so that a single executable could be distributed (with executable in the Windows sense). There are ways to do something similar, but sadly I don't know them.
But you can actually embed the Perl interpreter inside any C application, including the Perl source (your scripts + modules). When you statically link all C libraries, this should work as well. You can then use the Perl API to call your scripts.
If all of the servers you target are guaranteed to run the exact same OS, using the exact same libraries, and are preferably a *nix of some sort, it would be possible to pack all required files into an archive and write an install script. It is possible to write self-extracting shell scripts that contain the archive they are about to unpack. Same goes with perl, using the special __DATA__ command and the DATA filehandle:
#!/usr/bin/perl
print for <DATA>;
__DATA__
1
2
3
prints
1
2
3
Works great for piping data to tar as well.
You should include all dependent modules and all compiled libraries into the file and figure out a metadata system to install all files to the correct place.
As a general rule, software should rather be compiled on the target system itself, than just copying the binary files. It is too easy to overlook architecture differencies, configuration files or special registration entries hidden from view.
If you have to target different systems, it might be better to write a script that delegates the bulk of the installation to cpan or whatever perl package manager you prefer. This will be more flexible than hard-coding filepaths.
#!/bin/bash
cpan install Foo::Bar
cpan install Acme
cpan install ...
# etc.
I would stick with that.
The most elegant solution would be to create your own package or distribution like the ones you download from CPAN. As you would include a metadata file referencing all your dependencies, cpan would figure out everything by itself and do possibly neccessary compilation. I don't think this exactly is a beginners topic, but it would give you max flexibility and maintainability (easy upgrades!). This should make it fairly easy to include some installation tests.
This is just for starters, I am sure the internet or somebody else with more knowledge will elaborate.

Why doesn't my Perl script find my module even after I adjust #INC with FindBin?

I want to be able to use a module kept in the lib directory of my source code repository, and I want the only prerequisite for a developer to use the scripts that I'm writing is to have a standard Perl installation, but I'm not sure how to accomplish this.
In my scripts, I have
use FindBin qw($Bin);
use lib "$Bin/lib"; # store non standard modules here
use Term::ANSIColor;
use Win32::Console::ANSI;
print Term::ANSIColor::colored("this should be in color\n", "bold red");
and I put the module in ./lib. I verified that's the actual location where the module exists (by renaming it and causing it to fail). However, even if the module is in an arbitrary lib directory, it still seems to be a requirement that ppm be aware of the module.
I can not get my scripts to find/use it in lib without it being "installed" by ppm first. I would imagine that there should be some sort of way around this.
I know this may be an atypical request, but my goals are probably atypical. I just want a developer to do a checkout and immediately use some scripts without having to run some additional commands or use a package manager.
Thanks for any insight.
EDIT: I updated with a complete example. I also realized that if I uninstall it via ppm (but leave the pm in the referenced directory), I may have to change my syntax, and I was not considering that before. So maybe I have to give a full path or use require like jheddings or BipedalShark propose (ie. if it's not "installed", then I must use "require" and append ".pm" to it or use a BEGIN block.
If this is the case, then I have not found the correct syntax.
EDIT 2: Based on a comment below, I realize that I may have a flawed assumption. My reasoning is this: If I reference the actual code, the ".pm", directly then I should be able to use it without using a package manager. Maybe that's not the case, or if I want to do that maybe I have to do it a different way. Alternatively, I may have to refactor the code in the ".pm".
EDIT 3: I think that I was misunderstanding a few things. The error message in my IDE "Compilation failed in require", it's highlighting of the line that I was using to include the module, and the console error message of "Can't locate loadable object for module Win32::Console::ANSI"
I was reading that as a problem with loading the module itself, but it seems to be a problem that results from something the module itself is attempting to load. Interesting that this is only a problem since I didn't use ppm install though.
It is finding the actual module. I was able to verify that by commenting out the trouble lines.
Thanks for the help, but I'll have to spend some more time with it.

See perldoc perldiag under "Can't locate loadable object for module ...":
(F) The module you loaded is trying to load an external library,
like for example, "foo.so" or "bar.dll", but the DynaLoader module
was unable to locate this library. See DynaLoader.
You are correct that this problem is arising from something the module is trying to load -- that's what Dynaloader does. However, the documentation for Win32::Console::ANSI makes no mention of any external library requirements.

Are you preserving your module path structure in your lib directory?
i.e. your module should be in the path $Bin/lib/Some/Module.pm.

From perlfaq8's answer to How do I add the directory my program lives in to the module/library search path?
You appear to be doing it correctly, but you need to give us more if you expect to get help.
When you run that script, what ends up in #INC? Put in a debugging line like:
BEGIN {
use lib ...;
print "INC is \#INC\n";
}
Check that that output shows the directory that you expect. If it doesn't, start bisecting the problem from there.

Try this:
BEGIN {
use FindBin qw($Bin);
}
use lib "$Bin/lib"; # store non standard modules here

I manually install modules all the time and it seems to work. I just copy directories and files into a location and use the "use lib" directive like you've shown. Sometimes I miss a file and I get a runtime error that it's looking for a certain file and I go find the file on the Internet and put it in the right place and it works. Not sure what's going on with your setup. This should work.
I usually put the perl modules in the same directory as my script and then: use lib "."
But I don't know that it would matter.

How do I start a new Perl module distribution?

I'm trying to set up a large-ish project, written in Perl. The IBM MakeMaker tutorial has been very helpful so far, but I don't understand how to link all the modules into the main program. In my project root, I have MANIFEST, Makefile.PL, README, a bin directory, and a lib directory. In my bin directory, I have my main script (Main.pl). In the lib directory, I have each of my modules, divided up into their own respective directories (i.e. Utils::Util1 and Utils::Utils2 in the utils directory, etc). In each module directory, there is also a t directory, containing tests
My MANIFEST file has the following:
bin/Main.pl
lib/Utils/Util1.pm
lib/Utils/Util2.pm
lib/Utils/t/Utils1.t
lib/Utils/t/Utils2.t
Makefile.PL
MANIFEST
README
Makefile.PL is the following:
use ExtUtils::MakeMaker;
WriteMakefile(
'NAME'=>'Foo',
'VERSION_FROM'=>'bin/Main.pl',
'PREREQ_PM'=>{
"XML::Simple"=> 2.18}, #The libraries that we need and their
#minimum version numbers
'EXE_FILES' =>[("bin/Main.pl")]
);
After I make and run, the program crashes, complaining that it cannot find Utils::Util1, and when I run 'make test, it says no tests defined. Can anyone make any suggestions? I have never done a large scale project like this in perl, and I will need to add many more modules

If you are just starting to create Perl modules (which is also Perl's equivalent of a project), don't use Makemaker. Module::Build is the way to go, and it's now part of the standard library. Makemaker is for us old salts who haven't converted to Module::Build yet. :) I'll strike that now that Module::Build is unmaintained and out of favor; I still use MakeMaker.
You should never start off a Perl project by trying to create the structure yourself. It's too much work and you'll always forget something.
There's h2xs, a program that comes with perl and was supposed to be a tool to convert .h files into Perl's glue language XS. It works fine, but its advantage is that it comes with perl:
% h2xs -AXn Module::Name
Something like Module::Starter is a bit more sophisticated, although you have to get it from CPAN. It's the tool we use in Intermediate Perl because it's simple. It fills in some templates with your information:
% module-starter --author=... --email=... --module=...
If you are doing to do this quite a bit, you might then convert that to Distribution::Cooker so you can customize your files and contents. It's a dinky utility I wrote for myself so I could use my own templates.
% dist_cooker Module::Name
If you're really hard core, you might want Dist::Zilla, but that's more for people who already know what they are doing.

Might I also suggest module-starter? It'll automatically create a skeleton project which "Just Works". I learned what little I know about Perl modules organization by reading the generated skeleton files. It's all well-documented, and quite easy to use as a base for growing a larger project in. You can check out the getting-started docs to see what it gives you.
Running module-starter will give you a Perl distribution, consisting of a number of modules (use the command line option --module, such as:
module-starter --distro=Project --module=Project::Module::A,Project::Module::B [...]
to create multiple modules in a single distribution). It's then up to you whether you'd prefer to organize your project as a single distribution consisting of a number of modules working together, or as a number of distributions which can be released separately but which depend on each other (as configured in your Build or Makefile.PL file) to provide a complete system.

Try this structure:
bin/Main.pl
lib/Utils/Util1.pm
lib/Utils/Util2.pm
Makefile.PL
MANIFEST
README
t/Utils1.t
t/Utils2.t
As ysth said, make does not install your modules, it just builds them in a blib directory. (In your case it just copies them there, but if you had XS code, it would be compiled with a C compiler.) Use make install to install your modules for regular scripts to use.
If you want to run your script between make and make install, you can do:
perl -Mblib bin/Main.pl
The -Mblib instructs perl to temporarily add the appropriate directories to the search path, so you can try out an uninstalled module. (make test does that automatically.)

By default, tests are looked for in a top-level t directory (or a test.pl file, but that has some limitations, so should be avoided).
You say "After I make and run"...make puts things into a blib directory structure ready to be installed, but doesn't do anything special to make running a script access them. (make test is special; it does add appropriate paths from blib to perl's #INC to be able to run the tests.) You will need to do a "make install" to install the modules where your script will find them (or use a tool like PAR to package them together with your script).

How do I find the module dependencies of my Perl script?

I want another developer to run a Perl script I have written. The script uses many CPAN modules that have to be installed before the script can be run. Is it possible to make the script (or the perl binary) to dump a list of all the missing modules? Perl prints out the missing modules’ names when I attempt to run the script, but this is verbose and does not list all the missing modules at once. I’d like to do something like:
$ cpan -i `said-script --list-deps`
Or even:
$ list-deps said-script > required-modules # on my machine
$ cpan -i `cat required-modules` # on his machine
Is there a simple way to do it? This is not a show stopper, but I would like to make the other developer’s life easier. (The required modules are sprinkled across several files, so that it’s not easy for me to make the list by hand without missing anything. I know about PAR, but it seems a bit too complicated for what I want.)
Update: Thanks, Manni, that will do. I did not know about %INC, I only knew about #INC. I settled with something like this:
print join("\n", map { s|/|::|g; s|\.pm$||; $_ } keys %INC);
Which prints out:
Moose::Meta::TypeConstraint::Registry
Moose::Meta::Role::Application::ToClass
Class::C3
List::Util
Imager::Color
…
Looks like this will work.

Check out Module::ScanDeps and the "scandeps.pl" utility that comes with it. It can do a static (and recursive) analysis of your code for dependencies as well as the %INC dump either after compiling or running the program.
Please note that the static source scanning always errs on the side of including too many dependencies. (It is the dependency scanner used by PAR and aims at being easiest on the end-user.)
Finally, you could choose to distribute your script as a CPAN distribution. That sounds much more complicated than it really is. You can use something like Module::Starter to set up a basic skeleton of a tentative App::YourScript distribution. Put your script in the bin/ subdirectory and edit the Makefile.PL to reference all of your direct dependencies. Then, for distribution you do:
perl Makefile.PL
make
make dist
The last step generates a nice App-YourScript-VERSION.tar.gz
Now, when the client wants to install all dependencies, he does the following:
Set up the CPAN client correctly. Simply run it and answer the questions. But you're requiring that already anyway.
"tar -xz App-YourScript-VERSION.tar.gz && cd App-YourScript-VERSION"
Run "cpan ."
The CPAN client will now install all direct dependencies and the dependencies of those distributions automatically. Depending on how you set it up, it will either follow the prerequisites recursively automatically or prompt with a y/n each time.
As an example of this, you might check out a few of the App::* distributions on CPAN. I would think App::Ack is a good example. Maybe one of the App::* distributions from my CPAN directory (SMUELLER).

You could dump %INC at the end of your script. It will contain all used and required modules. But of course, this will only be helpful if you don't require modules conditionally (require Foo if $bar).

For quick-and-dirty, infrequent use, the %INC is the best way to go. If you have to do this with continuous integration testing or something more robust, there are some other tools to help.
Steffen already mentioned the Module::ScanDeps.
The code in Test::Prereq does this, but it has an additional layer that ensures that your Makefile.PL or Build.PL lists them as a dependency. If you make your scripts look like a normal Perl distribution, that makes it fairly easy to check for new dependencies; just run the test suite again.
Aside from that, you might use a tool such as Module::Extract::Use, which parses the static code looking for use and require statements (although it won't find them in string evals). That gets you just the modules you told your script to load.
Also, once you know which modules you loaded, you can combine that with David Cantrell's CPANdeps tool that has already created the dependency tree for most CPAN modules.
Note that you also have to think about optional features too. Your code in this case my not have them, but sometimes you don't load a module until you need it:
sub foo
{
require Bar; # don't load until we need to use it
....
}
If you don't exercise that feature in your trial run or test, you won't see that you need Bar for that feature. A similar problem comes up when a module loads a different set of dependency modules in a different environment (say, mod_perl or Windows, and so on).
There's not a good, automated way of testing optional features like that so you can get their dependencies. However, I think that should be on my To Do list since it sounds like an interesting problem.

Another tool in this area, which is used by Dist::Zilla and its AutoPrereqs plugin, is Perl::PrereqScanner. It installs a scan-perl-prereqs program that will use PPI and a few plugins to search for most kinds of prereq declaration, using the minimum versions you define. In general, I suggest this over scanning %INC, which can bring in bogus requirements and ignores versions.

Today I develop my Perl apps as CPAN-like distributions using Dist::Zilla that can take care of the dependencies through the AutoPrereq plugin. Another interesting piece of code in this area is carton.

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse