Debugging Perl scripts without interactive debugger - perl

I've been tasked with understanding and debugging some legacy code written in Perl 5. The comments aren't great, and I have almost zero experience with the language (I'm a Python guy; some shell scripting experience). Basically, I need to understand it to the point where I can decide if it will be worth it to rewrite it in Python to match the rest of our code base and be more readable to those who aren't familiar with Perl's less obvious syntax (at least, to the uninitiated...) or just leave it as is and do our best to make small changes as necessary despite having an incomplete understanding of what's going on.
Perl is being invoked to process some data from within a script in our organization's internal scripting language (which is used to interact with our proprietary hardware). The part written in the proprietary scripting language is similar to a shell script in that every line sent to the interpreter will be written in the log, so while not as easy as debugging something like Python in a modern IDE, it's possible to figure out what's going on by checking the logs. But Perl, as a programming language, only prints/logs what you tell it to. So as of right now, it's kind of a black box.
I looked at the Perl docs and saw that there is a command line option to launch the debugger, -d, along with -Dtls to configure debugger behavior (those are the recommended options to "watch how perl executes your program"). But when I tried running that in the script, I got the following... warning? error?
Recompile perl with -DDEBUGGING to use -D switch (did you mean -d ?)
Because it's being called inside a proprietary scripting language script, if this debugger is just an interactive shell, I don't think it will suit this purpose (because I can't send things to stdin while the proprietary scripting language script is running). But if it's not interactive, adding a second Perl installation to the server for debugging wouldn't be out of the question, so if anyone has experience with this debugger mode and its options, I would appreciate some feedback about that.
I'm quite familiar with Python, so I know a lot of tricks to log everything or to set up a debug environment to use the VS Code debugger, but with Perl, I'm out of my comfort zone.
My question is: is there some sort of easy way (a flag or something) to call Perl in such a way where every command sent to the interpreter is written to the console/stdout/ or a logfile in the same way as a shell script? Or is there another good way to debug perl scripts (other than using the interactive debug shell)? Or do I have no better option than to take the time to go through this rather massive script and put print statements everywhere?
Thanks for reading a lengthy question.

You can use the Perl debugger non-interactively to print each statement as it is executed. For example, consider a Perl script p.pl:
use feature qw(say);
use strict;
use warnings;

$DB::trace=1; # <-- turn on AutoTrace from this point..
{
    my $bar = "xyz";
    $bar =~ s/y//;
    say "bar = $bar";
    func1();
}

sub func1 {
    for (1..3) {
        say "1 : $_";
        func2($_);
    }
    say "Done 1";
}

sub func2 {
    my $x = shift;
    my $bar = $x ** 2;
    say "2: $bar";
}
Then execute p.pl like this:
$ PERLDB_OPTS="NonStop=1 AutoTrace=0" perl -d p.pl
main::(p.pl:7): my $bar = "xyz";
main::(p.pl:7): my $bar = "xyz";
main::(p.pl:8): $bar =~ s/y//;
main::(p.pl:9): say "bar = $bar";
bar = xz
main::(p.pl:10): func1();
main::func1(p.pl:14): for (1..3) {
main::func1(p.pl:15): say "1 : $_";
1 : 1
main::func1(p.pl:16): func2($_);
main::func2(p.pl:22): my $x = shift;
main::func2(p.pl:23): my $bar = $x ** 2;
main::func2(p.pl:24): say "2: $bar";
2: 1
main::func1(p.pl:15): say "1 : $_";
1 : 2
main::func1(p.pl:16): func2($_);
main::func2(p.pl:22): my $x = shift;
main::func2(p.pl:23): my $bar = $x ** 2;
main::func2(p.pl:24): say "2: $bar";
2: 4
main::func1(p.pl:15): say "1 : $_";
1 : 3
main::func1(p.pl:16): func2($_);
main::func2(p.pl:22): my $x = shift;
main::func2(p.pl:23): my $bar = $x ** 2;
main::func2(p.pl:24): say "2: $bar";
2: 9
main::func1(p.pl:18): say "Done 1";
Done 1
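If the full trace is too noisy, the same variable can in principle be switched off again, so that only one region of the script is traced. The following is a minimal sketch under that assumption; the subroutines interesting and boring are hypothetical names made up for illustration. Run without -d, the assignments to $DB::trace are ordinary variable assignments and have no effect.

```perl
use strict;
use warnings;

# Hypothetical routine that is worth tracing.
sub interesting {
    my $total = 0;
    $total += $_ for 1 .. 3;
    return $total;
}

# Hypothetical routine that is not worth tracing.
sub boring {
    return 'skipped';
}

boring();                 # runs before tracing is switched on
$DB::trace = 1;           # start printing statements (only matters under perl -d)
my $sum = interesting();
$DB::trace = 0;           # assumption: clearing the flag stops the trace again
boring();

print "sum = $sum\n";
```

Because the calling script cannot show the debugger's output interactively, it may also help to send the trace to a file. perldebug documents a LineInfo option for this, so an invocation along the lines of PERLDB_OPTS="NonStop=1 LineInfo=trace.log" perl -d p.pl should write the line information to trace.log (the file name is an arbitrary choice).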

You can emulate bash -x ... in a Perl script with the Devel::DumpTrace module.
#!/usr/bin/perl
# demo.pl: a demonstration of Devel::DumpTrace
$a = 1;
$b = 3;
$c = 2 * $a + 7 * $b;
@d = ($a, $b, $c + $b);
$ perl -d:DumpTrace demo.pl
>>>>> demo.pl:3: $a:1 = 1;
>>>>> demo.pl:4: $b:3 = 3;
>>>>> demo.pl:5: $c:23 = 2 * $a:1 + 7 * $b:3;
>>>>> demo.pl:6: @d:(1,3,26) = ($a:1, $b:3, $c:23 + $b:3);
There are settings to produce more or less output, and ways to just turn tracing on for interesting sections of code. It works best if you also have PPI installed, but it also works without PPI.
For something lighter weight and which may already be installed on your system, also see Devel::Trace.

Related

Is there an interactive command line environment for Perl?

Hi, I'm wondering if there is something for Perl similar to RStudio? That is, the ability to run commands and retain all variables in memory without exiting the script.
For example, say I execute the command my $temp = 83; then, instead of ending the script, I change the value with $temp = 22; print "$temp \n"; and so on, and continue to work on it without ending the script. This would be extremely helpful when dealing with large datasets and for general workflow.
The closest thing I came across is Visual Studio Code using a plugin whereby I can execute specific chunks of code in my script. However I did not find a way to keep the variable persistently in memory.
thanks!
You want a REPL.
Take a look at Devel::REPL. It brings a script called re.pl that you can run.
$ re.pl
$ my $foo = 123;
123$ use feature 'say';
$ $foo + 1;
124$
A newer alternative is Reply with its reply script.
$ reply
0> my $foo = 123;
$res[0] = 123
1> $foo + 2
$res[1] = 125
2>
For a comparison, you can read this blog post by Matt Trout.

Perl disable shell access

Certain builtins like system and exec (as well as backticks) will use the shell (I think sh by default) if passed a single argument containing shell metacharacters. If I want to write a portable program that avoids making any assumptions about the underlying shell, is there a pragma or some other option I can use to either disable shell access or trigger a fatal error immediately?
I write about this extensively in Mastering Perl. The short answer is to use system in its list form.
system '/path/to/command', @args;
This doesn't interpret any special characters in @args.
At the same time, you should enable taint checking to help catch bad data before you pass it to the system. See the perlsec documentation for details.
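As a minimal sketch of the list form, the script below passes arguments full of shell metacharacters to a child process. The argument values are made up for illustration, and $^X (the path of the running perl) is used as the child program only so that the example does not depend on any particular external command.

```perl
use strict;
use warnings;

# Hypothetical arguments containing shell metacharacters.
my @args = ('a; echo injected', '$HOME', '*');

# List form: each element is handed to the program as-is and no shell is
# started, so the metacharacters are treated as ordinary text.
my $status = system($^X, '-e', 'print "[$_]\n" for @ARGV', @args);

die "child failed with status $status\n" if $status != 0;
```

The child should print each argument unchanged inside square brackets, with no second command run and no variable or glob expansion.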
The options for this are limited. Keep in mind that these are core routines, and completely disabling them may have unexpected consequences. That said, you do have a few options.
Override Locally
You can override system and exec locally by using the subs pragma. This only affects the package into which you have imported the subroutine:
#!/usr/bin/env perl
use subs 'system';
sub system { die('Do not use system calls!!'); }
# .. more code here, note this will run
my $out = system('ls -lah'); # I will die at this point, but not before
print $out;
Override Globally
To override globally, in the current perl process, you need to import your function into the CORE::GLOBAL pseudo-namespace at compile time:
#!/usr/bin/env perl
BEGIN {
*CORE::GLOBAL::system = sub {
die('Do not use system calls.');
};
*CORE::GLOBAL::exec = sub {
die('Do not use exec.');
};
*CORE::GLOBAL::readpipe = sub {
die('Do not use back ticks.');
};
}
#...
my $out = system('ls -lah'); # I will die at this point, but not before
print $out;
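A less drastic variant of the global override is sketched below: instead of banning system outright, it rejects only calls with a single argument (the form that may be handed to a shell) and forwards everything else to the real builtin via CORE::system. This is an illustration rather than a complete solution; exec and backticks would need the same treatment, and the error message is made up.

```perl
use strict;
use warnings;

BEGIN {
    no warnings 'once';   # silences a possible "used only once" warning
    *CORE::GLOBAL::system = sub {
        # A single argument may be interpreted by a shell, so refuse it.
        die "system() with a single argument may invoke a shell\n" if @_ < 2;
        # Two or more arguments: the list form, which bypasses the shell.
        return CORE::system(@_);
    };
}

# The single-argument call dies inside the eval, so $blocked becomes 1.
my $blocked = eval { system('echo hi | cat'); 1 } ? 0 : 1;
print "single-argument form blocked: $blocked\n";

# The list form is passed through; $^X is the running perl itself.
my $status = system($^X, '-e', 'exit 0');
print "list form exit status: $status\n";
```

Note that a one-element list such as system(@cmd) is still the single-argument case, which is why the check counts arguments rather than looking at how the call was written.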
Prevent anything from running if in source
If you want to prevent any code from running before it reaches a system call, you can include the following. Note that this is fairly loose in its matching; I've written it to be easy to modify or update:
package Acme::Noshell;
open 0 or print "Can't execute '$0'\n" and exit;
my $source = join "", <0>;
die("Found back ticks in '$0'") if($source =~ m/^.*[^#].*\`/g);
die("Found 'exec' in '$0'") if($source =~ / exec\W/g);
die("Found 'system' in '$0'") if($source =~ / system\W/g);
1;
Which can be used as follows:
#!/usr/bin/env perl
use strict;
use warnings;
use Acme::Noshell;
print "I won't print because of the call below";
my $out = system('ls -lah');

Declaring variables in Perl

I'm learning Perl, and facing some inconsistency between running a program from the command line and executing it interactively in the debugger.
Specifically, I invoke the Perl debugger with perl -d -e 1, and run this code line-by-line
my $a = 1;
print $a;
$b = 2;
print $b;
In the output I am only seeing the value of $b, while $a seems to be undefined. At the same time, when I execute the same statements with perl myscript.pl, both values are shown in the output. Why does this happen? What am I missing?
The debugger is a wholly different environment from run time Perl. Each line you enter behaves like a separate block, and if you declare a lexical variable like my $a then it will be deleted immediately after the command.
It is as if you had written
{ my $a = 1; }
{ print $a; }
{ $b = 2; }
{ print $b; }
Ordinarily you will declare lexical variables at an appropriate point in the program so that they don't disappear before you need them. But if you want to use the debugger to play with the language then you need to use only package variables, which never disappear and are what you get by default if you don't use my.
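The scoping described above can be imitated outside the debugger. The sketch below uses one string eval per statement, on the assumption that this is close enough to how each debugger command gets its own scope; the variable names $lex and $pkg are arbitrary.

```perl
# strict is deliberately left off, as at the debugger prompt.
# Each string eval is compiled as its own scope, much like one debugger command.
eval 'my $lex = 1;';    # lexical variable: gone as soon as this eval finishes
eval '$pkg = 2;';       # package variable $main::pkg: persists

# In a later eval, $lex refers to a different, unset package variable.
my $lex_defined = eval('defined $lex') ? 'yes' : 'no';
my $pkg_value   = eval '$pkg';

print "lex defined afterwards: $lex_defined\n";
print "pkg afterwards: $pkg_value\n";
```

Only the package variable keeps its value from one eval to the next, which matches what the question observed at the debugger prompt.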
Command line "one-liner" Perl programs usually do the same thing, but it's a lesson you will have to unlearn when you come to writing proper Perl programs. You will be using use strict and use warnings at the head of every program, and strict requires that you choose between lexical or package variables by using my or our respectively. If you try to use a variable that you haven't previously declared then your program won't compile.
Also never use $a or $b in your code. Apart from being dreadful variable names, they are reserved for use by the sort operator.
I hope that helps.

Usage of defined with Filehandle and while Loop

While reading a book on advanced Perl programming(1), I came across
this code:
while (defined($s = <>)) {
...
Is there any special reason for using defined here? The documentation for
perlop says:
In these loop constructs, the assigned value (whether assignment is
automatic or explicit) is then tested to see whether it is defined. The
defined test avoids problems where line has a string value that would be
treated as false by Perl, for example a "" or a "0" with no trailing
newline. If you really mean for such values to terminate the loop, they
should be tested for explicitly: [...]
So, would there be a corner case or that's simply because the book is too old
and the automatic defined test was added in a recent Perl version?
(1) Advanced Perl Programming, First Edition, Sriram Srinivasan. O'Reilly
(1997)
Perl has a lot of implicit behaviors, many more than most other languages. Perl's motto is There's More Than One Way To Do It, and because there is so much implicit behavior, there is often More Than One Way To express the exact same thing.
/foo/ instead of $_ =~ m/foo/
$x = shift instead of $x = shift @_
while (<>) instead of while (defined($_ = <ARGV>))
etc.
Which expressions to use are largely a matter of your local coding standards and personal preference. The more explicit expressions remind the reader what is really going on under the hood. This may or may not improve the readability of the code -- that depends on how knowledgeable the audience is and whether you are using well-known idioms.
In this case, the implicit behavior is a little more complicated than it seems. Sometimes perl will implicitly perform a defined(...) test on the result of the readline operator:
$ perl -MO=Deparse -e 'while($s=<>) { print $s }'
while (defined($s = <ARGV>)) {
print $s;
}
-e syntax OK
but sometimes it won't:
$ perl -MO=Deparse -e 'if($s=<>) { print $s }'
if ($s = <ARGV>) {
print $s;
}
-e syntax OK
$ perl -MO=Deparse -e 'while(some_condition() && ($s=<>)) { print $s }'
while (some_condition() and $s = <ARGV>) {
print $s;
}
-e syntax OK
Suppose that you are concerned about the corner cases that this implicit behavior is supposed to handle. Have you committed perlop to memory so that you understand when Perl uses this implicit behavior and when it doesn't? Do you understand the differences in this behavior between Perl v5.14 and Perl v5.6? Will the people reading your code understand?
Again, there's no right or wrong answer about when to use the more explicit expressions, but the case for using an explicit expression is stronger when the implicit behavior is more esoteric.
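The difference between the two cases can be observed directly. The sketch below reads the same in-memory data twice: once with the plain while-assignment, which gets the implicit defined test, and once with a compound condition, which does not. The data and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical input: the last line is "0" with no trailing newline.
my $data = "2\n1\n0";

# Plain while-assignment: Perl adds the defined() test implicitly,
# so the final "0" is still processed.
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my @implicit;
while (my $s = <$fh>) {
    chomp $s;
    push @implicit, $s;
}
close $fh;

# Compound condition: no implicit defined(), so the false string "0"
# ends the loop before the last line is processed.
open $fh, '<', \$data or die "cannot open in-memory file: $!";
my $keep_going = 1;
my $line;
my @compound;
while ($keep_going && ($line = <$fh>)) {
    chomp $line;
    push @compound, $line;
}
close $fh;

print "implicit: @implicit\n";
print "compound: @compound\n";
```

Writing defined($line = <$fh>) in the second loop would make both loops collect all three lines.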
Say you have the following file
4<LF>
3<LF>
2<LF>
1<LF>
0
(<LF> represents a line feed. Note the lack of newline on the last line.)
Say you use the code
while ($s = <>) {
chomp($s);
say $s;
}
If Perl didn't do anything magical, the output would be
4
3
2
1
Note the lack of 0, since the string 0 is false. defined is needed in the unlikely case that both of the following are true:
You have a non-standard text file (missing trailing newline).
The last line of the file consists of a single ASCII zero (0x30).
BUT WAIT A MINUTE! If you actually ran the above code with the above data, you would see 0 printed! What many don't know is that Perl automagically translates
while ($s = <>) {
to
while (defined($s = <>)) {
as seen here:
$ perl -MO=Deparse -e'while($s=<DATA>) {}'
while (defined($s = <DATA>)) {
();
}
__DATA__
-e syntax OK
So you technically don't even need to specify defined in this very specific circumstance.
That said, I can't blame someone for being explicit instead of relying on Perl automagically modifying their code. After all, Perl is (necessarily) quite specific as to which code sequences it will change. Note the lack of defined in the following even though it's supposedly equivalent code:
$ perl -MO=Deparse -e'while((), $s=<DATA>) {}'
while ((), $s = <DATA>) {
();
}
__DATA__
-e syntax OK
while($line=<DATA>){
chomp($line);
if(defined $line){
print "SEE:$line\n";
}
}
__DATA__
1
0
3
Try the code with defined removed and you will see the different result.

What Perl module can I use to test CGI output for common errors?

Is there a Perl module which can test the CGI output of another program? E.g. I have a program
x.cgi
(this program is not in Perl) and I want to run it from program
test_x_cgi.pl
So, e.g. test_x_cgi.pl is something like
#!perl
use IPC::Run3;
run3(["x.cgi"], ...);
So in test_x_cgi.pl I want to automatically check that the output of x.cgi doesn't do stupid things like, e.g. print messages before the HTTP header is fully outputted. In other words, I want to have a kind of "browser" in Perl which processes the output. Before I try to create such a thing myself, is there any module on CPAN which does this?
Please note that x.cgi here is not a Perl script; I am trying to write a test framework for it in Perl. So, specifically, I want to test a string of output for ill-formedness.
Edit: Thanks
I have already written a module which does what I want, so feel free to answer this question for the benefit of other people, but any further answers are academic as far as I'm concerned.
There's CGI::Test, which looks like what you're looking for. It specifically mentions the ability to test non-Perl CGI programs. It hasn't been updated for a while, but neither has the CGI spec.
There is Test::HTTP. I have not used it, but seems to have an interface that fits your requirements.
$test->header_is($header_name, $value [, $description]);
Compares the response header $header_name with the value $value using Test::Builder->is.
$test->header_like($header_name, $regex [, $description]);
Compares the response header $header_name with the regex $regex using Test::Builder->like.
Look at the examples in chapter 16 of the Perl Cookbook
16.9. Controlling the Input, Output, and Error of Another Program
It uses IPC::Open3.
From the Perl Cookbook; I may have modified it, see below.
Example 16.2
cmd3sel - control all three of kids in, out, and error.
use IPC::Open3;
use IO::Select;
$cmd = "grep vt33 /none/such - /etc/termcap";
my $pid = open3(*CMD_IN, *CMD_OUT, *CMD_ERR, $cmd);
$SIG{CHLD} = sub {
print "REAPER: status $? on $pid\n" if waitpid($pid, 0) > 0
};
#print CMD_IN "test test 1 2 3 \n";
close(CMD_IN);
my $selector = IO::Select->new();
$selector->add(*CMD_ERR, *CMD_OUT);
while (my @ready = $selector->can_read) {
foreach my $fh (@ready) {
if (fileno($fh) == fileno(CMD_ERR)) {print "STDERR: ", scalar <CMD_ERR>}
else {print "STDOUT: ", scalar <CMD_OUT>}
$selector->remove($fh) if eof($fh);
}
}
close(CMD_OUT);
close(CMD_ERR);
If you want to check that the output of x.cgi is properly formatted HTML/XHTML/XML/etc, why not run it through the W3 Validator?
You can download the source and find some way to call it from your Perl test script. Or, you might be able to leverage this Perl interface to calling the W3 Validator on the web.
If you want to write a testing framework, I'd suggest taking a look at Test::More from CPAN as a good starting point. It's powerful but fairly easy to use and is definitely going to be better than cobbling something together as a one-off.
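For the narrower goal stated in the question, checking a captured string of CGI output for an ill-formed header section, a hand-rolled check might look like the sketch below. The helper name cgi_header_problems and its rules (headers end at the first blank line, every header line looks like Name: value, a Content-Type header is present) are assumptions made for illustration, not the behavior of any of the modules mentioned above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of problems found in raw CGI output.
sub cgi_header_problems {
    my ($output) = @_;
    my @problems;

    # Headers end at the first blank line (CRLF or bare LF).
    my ($head, $body) = split /\r?\n\r?\n/, $output, 2;
    return ('no blank line separating headers from body') unless defined $body;

    my $saw_content_type = 0;
    for my $line (split /\r?\n/, $head) {
        if ($line =~ /^([A-Za-z0-9-]+):/) {
            $saw_content_type = 1 if lc($1) eq 'content-type';
        }
        else {
            # Anything else is stray output printed before the headers ended.
            push @problems, "not a header line: '$line'";
        }
    }
    push @problems, 'missing Content-Type header' unless $saw_content_type;
    return @problems;
}

my @good = cgi_header_problems("Content-Type: text/html\n\n<p>ok</p>\n");
my @bad  = cgi_header_problems("starting up\nContent-Type: text/html\n\n<p>hi</p>\n");

print "good output: ", scalar(@good), " problem(s)\n";
print "bad output: $_\n" for @bad;
```

The output string itself could come from run3 or open3 as shown earlier, and the returned list could be fed to Test::More (for example, asserting that it is empty) to turn the check into a test.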