Debugging perl script - perl

I don't know perl, but I need to debug a perl script which is needed by an application I am using, this is the error I get:
Unable to recognise encoding of this document at /usr/lib/perl5/vendor_perl/5.8.8/XML/SAX/PurePerl/EncodingDetect.pm line 9
The thing is, this script cannot figure out what is the encoding of a file. What I am trying to find out is, which file is that. I couldn't be able to find a way to stack trace. Here is the script trimmed a little:
package XML::SAX::PurePerl; # NB, not ::EncodingDetect!
use strict;
sub encoding_detect {
my ($parser, $reader) = #_;
my $error = "Invalid byte sequence at start of file";
my $data = $reader->data;
if ($data =~ /^\x00\x00\xFE\xFF/) {
# BO-UCS4-be
$reader->move_along(4);
$reader->set_encoding('UCS-4BE');
return;
} .. tons of if else statements
warn("Unable to recognise encoding of this document");
return;
I checked, but this reader object doesn't have a name, or path attribute. I have a control over this script, so I may modify it if necessary. Any help is appreciated.
Edit: I have tracked down the problem until this line in the application I'm trying to use:
my #array = SystemImager::HostRange::expand_groups($clients);

If you use the Carp module and the confess method, you get a stack backtrace:
use Carp;
confess "Something went horribly wrong" if ($something == $wrong);
This is of most use inside a function (in a module), but it helps. However, it sounds as if the error is being reported by code you're using, so you may not be able to get it to croak for you, but you should read the manual for Carp, which says in part:
Forcing a Stack Trace
As a debugging aid, you can force Carp to treat a croak as a confess and a carp as a cluck across all modules. In other words, force a detailed stack trace to be given. This can be very helpful when trying to understand why, or from where, a warning or error is being generated.
This feature is enabled by 'importing' the non-existent symbol 'verbose'. You would typically enable it by saying
perl -MCarp=verbose script.pl
or by including the string -MCarp=verbose in the PERL5OPT environment variable.
Alternat[iv]ely, you can set the global variable $Carp::Verbose to true.
As suggested by daxim in a comment, also consider Carp::Always:
use Carp::Always;
makes every warn() and die() complain loudly in the calling package and elsewhere. More often used on the command line:
perl -MCarp::Always script.pl
The pure Perl implementation of XML::SAX::PurePerl is marked as 'slow' by its maintainer. You should perhaps look at using one of the many other XS-based SAX modules, especially one that provides automatic encoding detection.

To debug perl script you may use:
perl -d:DebugHooks::Terminal script.pl
Look at this

Related

Using filehandles in Perl to alter actively running code

I've been learning about filehandles in Perl, and I was curious to see if there's a way to alter the source code of a program as it's running. For example, I created a script named "dynamic.pl" which contained the following:
use strict;
use warnings;
open(my $append, ">>", "dynamic.pl");
print $append "print \"It works!!\\n\";\n";
This program adds the line
print "It works!!\n";
to the end of it's own source file, and I hoped that once that line was added, it would then execute and output "It works!!"
Well, it does correctly append the line to the source file, but it doesn't execute it then and there.
So I assume therefore that when perl executes a program that it loads it to memory and runs it from there, but my question is, is there a way to access this loaded version of the program so you can have a program that can alter itself as you run it?
The missing piece you need is eval EXPR. This compiles, "evaluates", any string as code.
my $string = q[print "Hello, world!";];
eval $string;
This string can come from any source, including a filehandle.
It also doesn't have to be a single statement. If you want to modify how a program runs, you can replace its subroutines.
use strict;
use warnings;
use v5.10;
sub speak { return "Woof!"; }
say speak();
eval q[sub speak { return "Meow!"; }];
say speak();
You'll get a Subroutine speak redefined warning from that. It can be supressed with no warnings "redefine".
{
# The block is so this "no warnings" only affects
# the eval and not the entire program.
no warnings "redefine";
eval q[sub speak { return "Shazoo!"; }];
}
say speak();
Obviously this is a major security hole. There is many, many, many things to consider here, too long for an answer, and I strongly recommend you not do this and find a better solution to whatever problem you're trying to solve this way.
One way to mitigate the potential for damage is to use the Safe module. This is like eval but limits what built in functions are available. It is by no means a panacea for the security issues.
With a warning about all kinds of issues, you can reload modules.
There are packages for that, for example, Module::Reload. Then you can write code that you intend to change in a module, change the source at runtime, and have it reloaded.
By hand you would delete that from %INC and then require, like
# ... change source code in the module ...
delete $INC{'ModuleWithCodeThatChages.pm'};
require ModuleWithCodeThatChanges;
The only reason I can think of for doing this is experimentation and play. Otherwise, there are all kinds of concerns with doing something like this, and whatever your goal may be there are other ways to accomplish it.
Note The question does specify a filehandle. However, I don't see that to be really related to what I see to be the heart of the question, of modifying code at runtime.
The source file isn't used after it's been compiled.
You could just eval it.
use strict;
use warnings;
my $code = <<'__EOS__'
print "It works!!\n";
__EOS__
open(my $append_fh, ">>", "dynamic.pl")
or die($!);
print($append_fh $code);
eval("$code; 1")
or die($#);
There's almost definitely a better way to achieve your end goal here. BUT, you could recursively make exec() or system() calls -- latter if you need a return value. Be sure to setup some condition or the dominoes will keep falling. Again, you should rethink this, unless it's just practice of some sort, or maybe I don't get it!
Each call should execute the latest state of the file; also be sure to close the file before each call.
i.e.,
exec("dynamic.pl"); or
my retval;
retval = system("perl dynamic.pl");
Don't use eval ever.

How to create own module for reuse in Perl?

I wish to create my own module and I want to use that module name for further use. The main concept is: Create a module which should contain subroutines for line count, word-count and character count, and in main program I should use that module and I should read the file and the output should show me total number of lines, total number of words and total number of characters in that file.
package Countsample;
use strict;
use base"Exporter";
use 5.010;
our #EXPORT=qw/line_count/;
sub line_count
{my $line=#_;
return $line;
}
1;
This above doc is saved in Count.pm.
#!usr/bin/perl
use Countsample;
open FH,"text1.txt";
line_count();
print"$line\n";
The above code is saved in Count1.pl format.
I think this much information is enough to create, if you need any further information then let me know.
Kindly help me to create complete this task.
Let's start with the module. It was already mentioned, that it is common practice to name the file as the package it contains.
And there is a reason for this: using use with a name builds on the expectation that there will be a file somewhere in the path list (#INC) for modules by that name containing a matching namespace declaration (package whatever). This connection makes possible, what Exporter does in the first place.
http://perldoc.perl.org/Exporter.html
A more indepth but still succinct explanation of this can be found here
http://www.perlmonks.org/?node_id=102347
So the file would be named Countsample.pm:
package Countsample;
use strict;
use warnings;
use base "Exporter";
use 5.010;
our #EXPORT=qw/line_count/;
sub line_count
{
my ($fd) = (#_);
my $lines = 0;
while (<$fd>) {
$lines++
}
return $lines;
}
1;
I added use strict and use warnings to be notified about errors.
Then I changed the assignment of the arguments; the arguments in #_ are a list, so I assign those to a list on the left side (see the parentheses around $fd). You could use
my $fd = shift;
alternatively.
I chose to pass an open filehandle as an argument here, then count the lines, simply by reading the file linewise and returning the number of lines.
There are many ways to get the number of lines out of a file, just as a reference:
http://www.perlmonks.org/?node_id=538824
The main program then looks like this:
#!/usr/bin/perl
use v5.14;
use strict;
use warnings;
use Countsample;
open (my $fd, "<", "text1.txt");
say line_count($fd);
close($fd);
See http://perldoc.perl.org/functions/open.html for the ways to open a file that are available and preferable.
You really need to work on your question asking. You're not giving us enough to go on. You need to tell us:
Exactly what you want to do
Exactly what you have done
Exactly what expected behaviour you are seeing (including any error messages)
Let's step through getting past some of your errors.
Firstly, when I ran your program, I got this.
$ ./Count1.pl
bash: ./Count1.pl: usr/bin/perl: bad interpreter: No such file or directory
Ok, so that's just a stupid typo. But because you haven't explained what problems you are getting, we don't know if that's the problem you're seeing or whether you've introduced the typo when posting your question.
But it's easy to fix. The shebang line needs to be #!/usr/bin/perl. I'm pretty sure you had exactly the same typo in your last question!
Now what happens when I run your code.
$ ./Count1.pl
Can't locate Countsample.pm in #INC (you may need to install the Countsample module) (#INC contains: ...)
This is because your package doesn't have the same name as your module file. Why would you do that? It just complicates your life.
Ok, so let's fix the use statement so it's looking for the right thing - use Count.
Now I get a different error.
$ ./Count1.pl
Undefined subroutine &main::line_count called at ./Count1.pl line 5.
That's going to be a little harder to track down. So to make my life easier I'll turn on use strict and use warnings in both of the files.
I now get this:
$ ./Count1.pl
Global symbol "$line" requires explicit package name at ./Count1.pl line 9.
Execution of ./Count1.pl aborted due to compilation errors.
That means I'll need to declare the variable $line at some point so I'll add my $line just after the use Count line.
And now I get this:
$ ./Count1.pl
Name "main::FH" used only once: possible typo at ./Count1.pl line 8.
Undefined subroutine &main::line_count called at ./Count1.pl line 9.
At which point, I'm afraid, I get bored of digging. Had you presented us with this version of the code, then I might have had some energy left to investigate from here. But because I've spent ten minutes finding silly typos and fixing pointless bugs, I've lost all enthusiasm.
It's important to realise that the people here are all volunteers. We're happy to help you solve your problems, but you need to do some of the work yourself. You need to ensure that we don't waste time fixing obvious things that you could have found yourself. And you need to be a lot clearer than you have been so far when explaining what the problem is.
Here's the version of you code that I got to. Perhaps someone else will have the enthusiasm to take it to the next stage.
Count.pm
package Countsample;
use strict;
use warnings;
use base "Exporter";
use 5.010;
our #EXPORT = qw/line_count/;
sub line_count {
my $line = #_;
return $line;
}
1;
Count1.pl
#!/usr/bin/perl
use strict;
use warnings;
use Count;
my $line;
open FH,"text1.txt";
line_count();
print "$line\n";
Update: The "undefined subroutine" error was because I forgot to change the name of the package in Count.pm. Having done that I now get:
$ ./Count1.pl
Name "main::FH" used only once: possible typo at ./Count1.pl line 8.
Use of uninitialized value $line in concatenation (.) or string at ./Count1.pl line 10.
Which is the point at which you really need to start thinking about how your module works. What subroutines do you need? What parameters do they take? What do they return?

Confused about perl IPC::Run

I'm trying to use code like this:
run \#cmd, \$in, \$out, \$err;
As discussed in IPC::Run.
Of course, this complains about undefined variables.
So then I try this:
my $in;
my $out;
my $err;
run \#cmd, \$in, \$out, \$err;
print $in "Hello World";
But then on the print line I get issues with an undefined reference.
Am I doing something totally wrong here? And if so, what do I need to modify?
The sample code on the IPC::Run page is assuming that you already have those variables / descriptors declared and setup elsewhere hence why once you set them up, it stopped complaining about that.
Printing to $in when it isn't a valid file handle will kick that error. You want to either just leave the file handle off of the print statement, or open a filehandle to a file you want to write to, and pass that to print.
See the doc pages on open and print for more info on those functions:
http://perldoc.perl.org/functions/open.html
http://perldoc.perl.org/functions/print.html
Also, I highly recommend that you use strict and warnings in your perl scripts if you are not already as it will catch many errors for you.
As LeoNerd mentioned, if you are not setting up the array #cmd to contain an array of commands that you want run, nothing is actually going to be executed in that call to run.
If you are just getting started with Perl and using CPAN modules, I highly recommend that you also start using Data::Dumper (in core Perl, nothing to install to use it, just put use Dumper; up top with your other use statments) to print out your variables as a way of debugging your code to get visibility into what is going on.

Perl - New definition of myprint() or Overload print command

I am a newb to Perl. I am writing some scripts and want to define my own print called myprint() which will print the stuff passed to it based on some flags (verbose/debug flag)
open(FD, "> /tmp/abc.txt") or die "Cannot create abc.txt file";
print FD "---Production Data---\n";
myprint "Hello - This is only a comment - debug data";
Can someone please help me with some sample code to for myprint() function?
Do you care more about writing your own logging system, or do you want to know how to put logging statements in appropriate parts of your program which you can turn off (and, incur little performance penalty when they are turned off)?
If you want a logging system that is easy to start using, but also offers a world of features which you can incrementally discover and use, Log::Log4perl is a good option. It has an easy mode, which allows you to specify the desired logging level, and emits only those logging messages that are above the desired level.
#!/usr/bin/env perl
use strict; use warnings;
use File::Temp qw(tempfile);
use Log::Log4perl qw(:easy);
Log::Log4perl->easy_init({level => $INFO});
my ($fh, $filename) = tempfile;
print $fh "---Production Data---\n";
WARN 'Wrote something somewhere somehow';
The snippet also shows a better way of opening a temporary file using File::Temp.
As for overriding the built-in print … It really isn't a good idea to fiddle with built-ins except in very specific circumstances. perldoc perlsub has a section on Overriding Built-in Functions. The accepted answer to this question lists the Perl built-ins that cannot be overridden. print is one of those.
But, then, one really does not need to override a built-in to write a logging system.
So, if an already-written logging system does not do it for you, you really seem to be asking "how do I write a function that prints stuff conditionally depending on the value of a flag?"
Here is one way:
#!/usr/bin/env perl
package My::Logger;
{
use strict; use warnings;
use Sub::Exporter -setup => {
exports => [
DEBUG => sub {
return sub {} unless $ENV{MYDEBUG};
return sub { print 'DEBUG: ' => #_ };
},
]
};
}
package main;
use strict; use warnings;
# You'd replace this with use My::Logger qw(DEBUG) if you put My::Logger
# in My/Logger.pm somewhere in your #INC
BEGIN {
My::Logger->import('DEBUG');
}
sub nicefunc {
print "Hello World!\n";
DEBUG("Isn't this a nice function?\n");
return;
}
nicefunc();
Sample usage:
$ ./yy.pl
Hello World!
$ MYDEBUG=1 ./yy.pl
Hello World!
DEBUG: Isn't this a nice function?
I wasn't going to answer this because Sinan already has the answer I'd recommend, but tonight I also happened to be working on the "Filehandle References" chapter to the upcoming Intermediate Perl. That are a couple of relevant paragraphs which I'll just copy directly without adapting them to your question:
IO::Null and IO::Interactive
Sometimes we don't want to send our output anywhere, but we are forced
to send it somewhere. In that case, we can use IO::Null to create
a filehandle that simply discards anything that we give it. It looks
and acts just like a filehandle, but does nothing:
use IO::Null;
my $null_fh = IO::Null->new;
some_printing_thing( $null_fh, #args );
Other times, we want output in some cases but not in others. If we are
logged in and running our program in our terminal, we probably want to
see lots of output. However, if we schedule the job through cron, we
probably don't care so much about the output as long as it does the job.
The IO::Interactive module is smart enough to tell the difference:
use IO::Interactive;
print { is_interactive } 'Bamboo car frame';
The is_interactive subroutine returns a filehandle. Since the
call to the subroutine is not a simple scalar variable, we surround
it with braces to tell Perl that it's the filehandle.
Now that you know about "do nothing" filehandles, you can replace some
ugly code that everyone tends to write. In some cases you want output
and in some cases you don't, so many people use a post-expression
conditional to turn off a statement in some cases:
print STDOUT "Hey, the radio's not working!" if $Debug;
Instead of that, you can assign different values to $debug_fh based
on whatever condition you want, then leave off the ugly if $Debug
at the end of every print:
use IO::Null;
my $debug_fh = $Debug ? *STDOUT : IO::Null->new;
$debug_fh->print( "Hey, the radio's not working!" );
The magic behind IO::Null might give a warning about "print() on
unopened filehandle GLOB" with the indirect object notation (e.g.
print $debug_fh) even though it works just fine. We don't get that
warning with the direct form.

What are the best practices for error handling in Perl?

I'm learning Perl, and in a lot of the examples I see errors are handled like this
open FILE, "file.txt" or die $!;
Is die in the middle of a script really the best way to deal with an error?
Whether die is appropriate in the middle of the script really depends on what you're doing. If it's only tens of lines, then it's fine. A small tool with a couple hundred lines, then consider confess (see below). If it's a large object-oriented system with lots of classes and interconnected code, then maybe an exception object would be better.
confess in the Carp package:
Often the bug that led to the die isn't on the line that die reports.
Replacing die with confess (see Carp package) will give the stack trace (how we got to this line) which greatly aids in debugging.
For handling exceptions from Perl builtins, I like to use autodie. It catches failures from open and other system calls and will throw exceptions for you, without having to do the or die bit. These exceptions can be caught with a eval { }, or better yet, by using Try::Tiny.
Since I use Log::Log4perl almost everywhere, I use $logger->logdie instead of die. And if you want to have more control over your exceptions, consider Exception::Class.
It is better to catch your exceptions with Try::Tiny (see its documentation why).
Unless you've got a more specific idea, then yes you want to die when unexpected things happen.
Dying at the failure to open a file and giving the file name is better than the system telling you it can't read from or write to an anonymous undefined.
If you're talking about a "script", in general you're talking about a pretty simple piece of code. Not layers that need coordination (not usually). In a Perl Module, there is an attendant idea is that you don't own the execution environment, so either the main software cares and it catches things in an eval, OR it doesn't really care and dying would be fine. However, one you should try at a little more robustness as a module and just pass back undefs or something.
You can catch whatever dies (or croaks) in an eval block. And you can do your more specific handling there.
But if you want to inspect $! then write that code, and you'll have a more specific resolution.
Take a look at the near-universal standard of using strict. That's code that dies on questionable syntax, rather than letting you continue along.
So I think the general idea is: yes, DIE unless you have a better idea of how things should be handled. If you put enough foresight into it, you can be forgiven for the one or two times you don't die, because you know you don't need to.
The more modern approach is to use the Carp standard library.
use Carp;
my $fh;
open $fh, '<', "file.txt" or confess($!);
The main advantage is it gives a stack trace on death.
I use die but I wrap it in eval blocks for control over the error handling:
my $status = eval
{
# Some code...
};
If the 'eval' fails:
$status will be undefined.
$# will be set to whatever error message was produced (or the contents of
a die)
If the 'eval' succeeds:
$status will be the last returned value of the block.
$# will be set to ''.
As #friedo wrote, if this is a standalone script of a few lines, die is fine, but in my opinion using die in modules or require'd piece of code is not a good idea because it interrupts the flow of the program. I think the control of the flow should be a prerogative of the main part of the program.
Thus, I think, in module it is better to return undef such as return which would return undef in scalar context and an empty list in list context and set an Exception object to be retrieved for more details. This concept is implemented in the module Module::Generic like this:
# in some module
sub myfunc
{
my $self = shift( #_ );
# some code...
do{ # something } or return( $self->error( "Oops, something went wrong." ) );
}
Then, in the caller, you would write:
$obj->myfunc || die( "An error occurred at line ", $obj->error->line, " with stack trace: ", $obj->error->trace );
Here error will set an exception object and return undef.
However, because many module die or croak, you also need to catch those interruption using eval or try-catch blocks such as with Nice::Try
Full disclosure: I am the developer of both Module::Generic and Nice::Try.