Is there a module that searches for superfluous code? - perl

Is there a module, which can find code not needed?
As an example a script with code not needed to run the script:
#!/usr/bin/env perl
use warnings;
use 5.12.0;
use utf8;
binmode STDOUT, ':utf8';
use DateTime;
use WWW::Mechanize;
sub my_print {
my ( $string, $tab, $color ) = #_;
say $string;
}
sub check {
my $string = shift;
return if length $string > 10;
return $string;
}
my_print( 'Hello World' );

Not categorically. Perl is notoriously difficult to analyze without actually executing, to the point that compiling a Perl program to be run later actually requires including a copy of the perl interpreter! As a result there are very few code analysis tools for Perl. What you can do is use a profiler, but this is a bit overkill (and as I mentioned, requires actually executing the program. I like Devel::NYTProf. This will spit out some HTML files showing how many times eaqch line or sub was executed, as well as how much time was spent there, but this only works for that specific execution of the program. It will allow you to see that WWW::Mechanize is loaded but never called, but it will not be able to tell you if warnings or binmode had any effect on execution.

Devel::Cover provides code coverage metrics that may be of some use here.

Related

How to execute a script from another so that it also sets variables for the caller script

I reviewed many examples on-line about running another process (either PERL or shell command or a program), but do not find any useful for my needs way.
(As by already received answers I see that my 'request' is not understood, I will try to say it in short, leaving all earlier printed as an example of what I already tried...)
I need:
- In a caller script set parameters for the second script before call the second script (thus, I could not use the do script2.pl s it executed before startin to run the first script)
- In the second script I need to set some variables that will be used in the caller script (therefore it is not useful to process the second script by system() or by back ticks);
- and, as I need to use those variables in the first script, I need come back to the first script after completting the second one
(I hope now it is more clear what I need...)
(Reviewed and not useful the system(), 'back ticks', exec() and open())
I would like to run another PERL-script from a first one, not exiting (as by exec()), not catching the STDOUT of the called script (as in the back tick processing,) but having it printed out, as in initial script (as it is by system()) while I do not need the return status (as by system());
but, I would like to have the called script to set some variables, that will be accessible in the calling s cript (sure, set by the our #set_var;)
My attempt (that I am not able to make do what I need) is:
Script1 is something, like:
...
if($condition)
{ local $0 = 'script2.pl';
local #ARGV = ('first-arg', 'second_arg');
do script2.pl;
}
print "set array is: '#set_var'\n";
...
The 'script2' would have something like:
#!/usr/bin/perl
...
print "having input parameters: '#ARGV'\n";
... # all script activities
our #set_var = ($val1, $val2, $val3);
exit 0;
The problem in my code is that the do ... command is executed on beginning of the first script run and is not in the place, where it is prepared for it (by setting some local .. vars!)
I did try to use the eval "do script2.pl" :
- now it is executed in the proper place, but it is not setting the #set_var into the first script process!
Is there any idea to do it as I would like to have it?
(I understand, that I can rewrite the script2.pl, including whole processing in some function (say, main()) and load it by require() and execute the function main(): that will do everything as I prefer it; but I would like to leave the second script as-is to be executable from shell by itself, as it is now.
... and I do not like the way to pass values by a flat file...)
Does anybody have an idea how to do my whim?
This works just fine:
script2.pl
use strict;
our #set_var = ("foo","bar");
script1.pl
use strict;
our #set_var;
do './script2.pl';
print "#set_var\n";
$ perl script1.pl
foo bar
But it does not if you use:
script2.pl
use strict;
our #set_var = ("foo","bar");
exit 0;
There is only a single perl process in this example, so calling exit, even from the second script, exits your program.
If you don't want to remove the exit call in the second script, we can work around that with some CORE::GLOBAL namespace hacking. The gist is to redirect the exit function to your own custom function that you can manipulate when the second script runs.
script1.pl
BEGIN { *CORE::GLOBAL::exit = *my_exit };
use strict;
sub my_exit { goto &CORE::exit }
our #set_var;
{
local *my_exit = sub { warn "Not exiting" };
do './script2.pl';
}
print "#set_var\n";
script2.pl
use strict;
our #set_var = ("foo","bar");
exit 0;
$ perl script1.pl
Not exiting at script1.pl line 7.
foo bar
(Ok, finally, asked by myself and ansvering by myself, too!)
( After additional reviewing, I am realized, that 'mod' solution does use it, but I did not understand advice!
I am sorry: my false to step over the real solution!
)
Solution to my question is simple! It is the:
do EXPR;
That way
- the second script executed in place where it placed; so, anything defined and set in the first one usefull in the second one;
- It is printing to STDOUT everything what it should print (the second script;)
- any variables or objects that are defined in the second script process, are accessible in the first one after coming back; and
- control is returned to position immediately after the second-script execution with continuation to process the first script commands!
Simple! I am just amazed, why I forget about that 'do...' command. I have used it already not once!
And I am disappointed by that forum much!
While it is badly designed to display communication, participants, instead of perl-issue reviewing, much concerned on moderating others, teaching them how to leave in such nice forum!
I am not really sure what you are trying to do exactly, but along these lines it should be very close.
test.pl
#!/usr/bin/perl
use strict;
use warnings;
use feature 'say';
use IPC::System::Simple qw(system);
say $0;
system($^X, "sample.pl", #ARGV);
$ perl test.pl first-arg second-arg
test.pl
sample.pl
$VAR1 = [
'first-arg',
'second-arg'
];
sample.pl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use feature 'say';
say $0;
print Dumper \#ARGV;
I used the module IPC::System::Simple. You can also capture the output of the script (sample.pl) through IPC::System::Simple::capture.
Update: Maybe you can use Storable. This way you can pass new parameters that you can use from script 2 (sample.pl) to script 1 (test.pl).
test.pl
#!/usr/bin/perl
use strict;
use warnings;
use Storable;
use Data::Dumper;
use feature 'say';
use IPC::System::Simple qw(system);
say $0;
system($^X, "sample.pl", #ARGV);
my $hashref = retrieve('sample');
print Dumper $hashref;
__END__
$ perl test.pl first-arg second-arg
test.pl
sample.pl
$VAR1 = [
'first-arg',
'second-arg'
];
$VAR1 = {
'arg1' => 'test1',
'arg2' => 'test2'
};
sample.pl
#!/usr/bin/perl
use strict;
use warnings;
use Storable;
use Data::Dumper;
use feature 'say';
say $0;
print Dumper \#ARGV;
my %hashArgs = ( arg1 => 'test1',
arg2 => 'test2', );
store \%hashArgs, 'sample';

Perl disable shell access

Certain builtins like system and exec (as well as backticks) will use the shell (I think sh by default) if passed a single argument containing shell metacharacters. If I want to write a portable program that avoids making any assumptions about the underlying shell, is there a pragma or some other option I can use to either disable shell access or trigger a fatal error immediately?
I write about this extensively in Mastering Perl. The short answer is to use system in it's list form.
system '/path/to/command', #args;
This doesn't interpret any special characters in #args.
At the same time, you should enable taint checking to help catch bad data before you pass it to the system. See the perlsec documentation for details.
There are limited options to do this, keep in mind that these are core routines and completely disabling them may have some unexpected consequences. You do have a few options.
Override Locally
You can override system and exec locally by using the subs pragma, this will only effect the package into which you have imported the sub routine:
#!/usr/bin/env perl
use subs 'system';
sub system { die('Do not use system calls!!'); }
# .. more code here, note this will runn
my $out = system('ls -lah'); # I will die at this point, but not before
print $out;
Override Globally
To override globally, in the current perl process, you need to import your function into the CORE::GLOBAL pseudo-namespace at compile time:
#!/usr/bin/env perl
BEGIN {
*CORE::GLOBAL::system = sub {
die('Do not use system calls.');
};
*CORE::GLOBAL::exec = sub {
die('Do not use exec.');
};
*CORE::GLOBAL::readpipe = sub {
die('Do not use back ticks.');
};
}
#...
my $out = system('ls -lah'); # I will die at this point, but not before
print $out;
Prevent anything form running if in source
If you want to prevent any code running before getting to a system call you can include the following, note this is fairly loose in it's matching, I've written it to be easy to modify or update:
package Acme::Noshell;
open 0 or print "Can't execute '$0'\n" and exit;
my $source = join "", <0>;
die("Found back ticks in '$0'") if($source =~ m/^.*[^#].*\`/g);
die("Found 'exec' in '$0'") if($source =~ / exec\W/g);
die("Found 'system' in '$0'") if($source =~ / system\W/g);
1;
Which can be used as follows:
#!/usr/bin/env perl
use strict;
use warnings;
use Acme::Noshell;
print "I wont print because of the call below";
my $out = system('ls -lah');

Piping Perl Commands?

I'm currently trying to take my program and have it take user input, usually a text file then call an external script to count the words. The script I'm working on is essentially a "middle man" and I'm trying to get more familiar with piping to external scripts/commands. It's currently not correctly executing the word counter script. Here's the code:
I'm still receiving error for ./word_counter.pl saying "no such file or directory at glue.pl (which is this script you see here)".
#!usr/bin/perl
use warnings;
use strict;
use IO::Handle qw();
open (PIPE_TO, "|-", "./word_counter.pl");
While(<>)
{
$PIPE_TO -> autoflush(1);
print PIPE_TO $_;
}
Suffering from buffering?
use IO::Handle qw( );
PIPE_TO->autoflush(1);
The reason it doesn't work is probably that you have syntax errors.
Otherwise: Other than introducing line-buffered semantics, you are really doing nothing here (you just pipe what you read to another program, which is in this case equivalent to just running the program)
Modulo the buffering (which you don't seem to explicitly need) an equivalent script would be:
#!/usr/bin/perl
exec ("./word_counter.pl");
Is this what you are trying to do?
#!/usr/bin/perl
use warnings;
use strict;
open (my $PIPE_TO, "|-", "./word_counter.pl") or die $!;
while(<>) {
print $PIPE_TO $_;
}

Perl - New definition of myprint() or Overload print command

I am a newb to Perl. I am writing some scripts and want to define my own print called myprint() which will print the stuff passed to it based on some flags (verbose/debug flag)
open(FD, "> /tmp/abc.txt") or die "Cannot create abc.txt file";
print FD "---Production Data---\n";
myprint "Hello - This is only a comment - debug data";
Can someone please help me with some sample code to for myprint() function?
Do you care more about writing your own logging system, or do you want to know how to put logging statements in appropriate parts of your program which you can turn off (and, incur little performance penalty when they are turned off)?
If you want a logging system that is easy to start using, but also offers a world of features which you can incrementally discover and use, Log::Log4perl is a good option. It has an easy mode, which allows you to specify the desired logging level, and emits only those logging messages that are above the desired level.
#!/usr/bin/env perl
use strict; use warnings;
use File::Temp qw(tempfile);
use Log::Log4perl qw(:easy);
Log::Log4perl->easy_init({level => $INFO});
my ($fh, $filename) = tempfile;
print $fh "---Production Data---\n";
WARN 'Wrote something somewhere somehow';
The snippet also shows a better way of opening a temporary file using File::Temp.
As for overriding the built-in print … It really isn't a good idea to fiddle with built-ins except in very specific circumstances. perldoc perlsub has a section on Overriding Built-in Functions. The accepted answer to this question lists the Perl built-ins that cannot be overridden. print is one of those.
But, then, one really does not need to override a built-in to write a logging system.
So, if an already-written logging system does not do it for you, you really seem to be asking "how do I write a function that prints stuff conditionally depending on the value of a flag?"
Here is one way:
#!/usr/bin/env perl
package My::Logger;
{
use strict; use warnings;
use Sub::Exporter -setup => {
exports => [
DEBUG => sub {
return sub {} unless $ENV{MYDEBUG};
return sub { print 'DEBUG: ' => #_ };
},
]
};
}
package main;
use strict; use warnings;
# You'd replace this with use My::Logger qw(DEBUG) if you put My::Logger
# in My/Logger.pm somewhere in your #INC
BEGIN {
My::Logger->import('DEBUG');
}
sub nicefunc {
print "Hello World!\n";
DEBUG("Isn't this a nice function?\n");
return;
}
nicefunc();
Sample usage:
$ ./yy.pl
Hello World!
$ MYDEBUG=1 ./yy.pl
Hello World!
DEBUG: Isn't this a nice function?
I wasn't going to answer this because Sinan already has the answer I'd recommend, but tonight I also happened to be working on the "Filehandle References" chapter to the upcoming Intermediate Perl. That are a couple of relevant paragraphs which I'll just copy directly without adapting them to your question:
IO::Null and IO::Interactive
Sometimes we don't want to send our output anywhere, but we are forced
to send it somewhere. In that case, we can use IO::Null to create
a filehandle that simply discards anything that we give it. It looks
and acts just like a filehandle, but does nothing:
use IO::Null;
my $null_fh = IO::Null->new;
some_printing_thing( $null_fh, #args );
Other times, we want output in some cases but not in others. If we are
logged in and running our program in our terminal, we probably want to
see lots of output. However, if we schedule the job through cron, we
probably don't care so much about the output as long as it does the job.
The IO::Interactive module is smart enough to tell the difference:
use IO::Interactive;
print { is_interactive } 'Bamboo car frame';
The is_interactive subroutine returns a filehandle. Since the
call to the subroutine is not a simple scalar variable, we surround
it with braces to tell Perl that it's the filehandle.
Now that you know about "do nothing" filehandles, you can replace some
ugly code that everyone tends to write. In some cases you want output
and in some cases you don't, so many people use a post-expression
conditional to turn off a statement in some cases:
print STDOUT "Hey, the radio's not working!" if $Debug;
Instead of that, you can assign different values to $debug_fh based
on whatever condition you want, then leave off the ugly if $Debug
at the end of every print:
use IO::Null;
my $debug_fh = $Debug ? *STDOUT : IO::Null->new;
$debug_fh->print( "Hey, the radio's not working!" );
The magic behind IO::Null might give a warning about "print() on
unopened filehandle GLOB" with the indirect object notation (e.g.
print $debug_fh) even though it works just fine. We don't get that
warning with the direct form.

Subroutines vs scripts in Perl

I'm fairly new to Perl and was wondering what the best practices regarding subroutines are with Perl. Can a subroutine be too big?
I'm working on a script right now, and it might need to call another script. Should I just integrate the old script into the new one in the form of a subroutine? I need to pass one argument to the script and need one return value.
I'm guessing I'd have to do some sort of black magic to get the output from the original script, so subroutine-ing it makes sense right?
Avoiding "black magic" is always a good idea when writing code. You never want to jump through hoops and come up with an unintuitive hack to solve a problem, especially if that code needs to be supported later. It happens, admittedly, and we're all guilty of it. Circumstances can weigh heavily on "just getting the darn thing to work."
The point is, the best practice is always to make the code clean and understandable. Remember, and this is especially true with Perl code in my experience, any code you wrote yourself more than a few months ago may as well have been written by someone else. So even if you're the only one who needs to support it, do yourself a favor and make it easy to read.
Don't cling to broad sweeping ideas like "favor more files over larger files" or "favor smaller methods/subroutines over larger ones" etc. Those are good guidelines to be sure, but apply the spirit of the guideline rather than the letter of it. Keep the code clean, understandable, and maintainable. If that means the occasional large file or large method/subroutine, so be it. As long as it makes sense.
A key design goal is separation of concerns. Ideally, each subroutine performs a single well-defined task. In this light, the main question revolves not around a subroutine's size but its focus. If your program requires multiple tasks, that implies multiple subroutines.
In more complex scenarios, you may end up with groups of subroutines that logically belong together. They can be organized into libraries or, even better, modules. If possible, you want to avoid a scenario where you end up with multiple scripts that need to communicate with each other, because the usual mechanism for one script to return data to another script is tedious: the first script writes to standard output and the second script must parse that output.
Several years ago I started work at a job requiring that I build a large number of command-line scripts (at least, that's how it turned out; in the beginning, it wasn't clear what we were building). I was quite inexperienced at the time and did not organize the code very well. In hindsight, I should have worked from the premise that I was writing modules rather than scripts. In other words, the real work would have been done by modules, and the scripts (the code executed by a user on the command line) would have remained very small front-ends to invoke the modules in various ways. This would have facilitated code reuse and all of that good stuff. Live and learn, right?
Another option that hasn't been mentioned yet for reusing the code in your scripts is to put common code in a module. If you put shared subroutines into a module or modules, you can keep your scripts short and focussed on what they do that is special, while isolating the common code in a easy to access and reuse form.
For example, here is a module with a few subroutines. Put this in a file called MyModule.pm:
package MyModule;
# Always do this:
use strict;
use warnings;
use IO::Handle; # For OOP filehandle stuff.
use Exporter qw(import); # This lets us export subroutines to other scripts.
# These may be exported.
our #EXPORT_OK = qw( gather_data_from_fh open_data_file );
# Automatically export everything allowed.
# Generally best to leave empty, but in some cases it makes
# sense to export a small number of subroutines automatically.
our #EXPORT = #EXPORT_OK;
# Array of directories to search for files.
our #SEARCH_PATH;
# Parse the contents of a IO::Handle object and return structured data
sub gather_data_from_fh {
my $fh = shift;
my %data;
while( my $line = $fh->readline );
# Parse the line
chomp $line;
my ($key, #values) = split $line;
$data{$key} = \#values;
}
return \%data;
}
# Search a list of directories for a file with a matching name.
# Open it and return a handle if found.
# Die otherwise
sub open_data_file {
my $file_name = shift;
for my $path ( #SEARCH_PATH, '.' ) {
my $file_path = "$path/$file_name";
next unless -e $file_path;
open my $fh, '<', $file_path
or die "Error opening '$file_path' - $!\n"
return $fh;
}
die "No matching file found in path\n";
}
1; # Need to have trailing TRUE value at end of module.
Now in script A, we take a filename to search for and process and then print formatted output:
use strict;
use warnings;
use MyModule;
# Configure which directories to search
#MyModule::SEARCH_PATH = qw( /foo/foo/rah /bar/bar/bar /eeenie/meenie/mynie/moe );
#get file name from args.
my $name = shift;
my $fh = open_data_file($name);
my $data = gather_data_from_fh($fh);
for my $key ( sort keys %$data ) {
print "$key -> ", join ', ', #{$data->{$key}};
print "\n";
}
Script B, searches for a file, parses it and then writes the parsed data structure into a YAML file.
use strict;
use warnings;
use MyModule;
use YAML qw( DumpFile );
# Configure which directories to search
#MyModule::SEARCH_PATH = qw( /da/da/da/dum /tutti/frutti/unruly /cheese/burger );
#get file names from args.
my $infile = shift;
my $outfile = shift;
my $fh = open_data_file($infile);
my $data = gather_data_from_fh($fh);
DumpFile( $outfile, $data );
Some related documentation:
perlmod - About Perl modules in general
perlmodstyle - Perl module style guide; this has very useful info.
perlnewmod - Starting a new module
Exporter - The module used to export functions in the sample code
use - the perlfunc article on use.
Some of these docs assume you will be sharing your code on CPAN. If you won't be publishing to CPAN, simply ignore the parts about signing up and uploading code.
Even if you aren't writing for CPAN, it is beneficial to use the standard tools and CPAN file structure for your module development. Following the standard allows you to use all of the tools CPAN authors use to simplify the development, testing and installation process.
I know that all this seems really complicated, but the standard tools make each step easy. Even adding unit tests to your module distribution is easy thanks to the great tools available. The payoff is huge, and well worth the time you will invest.
Sometimes it makes sense to have a separate script, sometimes it doesn't. The "black magic" isn't that complicated.
#!/usr/bin/perl
# square.pl
use strict;
use warnings;
my $input = shift;
print $input ** 2;
#!/usr/bin/perl
# sum_of_squares.pl
use strict;
use warnings;
my ($from, $to) = #ARGV;
my $sum;
for my $num ( $from .. $to ) {
$sum += `square.pl $num` // die "square.pl failed: $? $!";
}
print $sum, "\n";
Easier and better error reporting on failure is automatic with IPC::System::Simple:
#!/usr/bin/perl
# sum_of_squares.pl
use strict;
use warnings;
use IPC::System::Simple 'capture';
my ($from, $to) = #ARGV;
my $sum;
for my $num ( $from .. $to ) {
$sum += capture( "square.pl $num" );
}
print $sum, "\n";