module creation using perl script [duplicate] - perl

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
How do you create a Perl module?
I have the script that reads an xml file and creates hash table. its working properly but now i need to create module for that code, that i can call in my main function.In my main function file path as input and it gives output as hash. now i need to create module for this code.
#!/usr/bin/perl
use warnings;
use strict;
use XML::LibXML::Reader;
#Reading XML with a pull parser
my $file;
open( $file, 'formal.xml');
my $reader = XML::LibXML::Reader->new( IO => $file ) or die ("unable to open file");
my %nums;
while ($reader->nextElement( 'Data' ) ) {
my $des = $reader->readOuterXml();
$reader->nextElement( 'Number' );
my $desnode = $reader->readInnerXml();
$nums{$desnode}= $des;
print( " NUMBER: $desnode\n" );
print( " Datainfo: $des\n" );
}
how can i create module for this code?

You need to create a file with .pm extension, i.e. "MyModule.pm" with this code:
package MyModule;
use warnings;
use strict;
use XML::LibXML::Reader;
sub mi_function_name {
#Reading XML with a pull parser
my $file;
open( $file, 'formal.xml');
my $reader = XML::LibXML::Reader->new( IO => $file ) or die ("unable to open file");
my %nums;
while ($reader->nextElement( 'Data' ) ) {
my $des = $reader->readOuterXml();
$reader->nextElement( 'Number' );
my $desnode = $reader->readInnerXml();
$nums{$desnode}= $des;
print( " NUMBER: $desnode\n" );
print( " Datainfo: $des\n" );
}
}
1; #this is important
And in the file you want to use this module:
use MyModule;
#...
MyModule::mi_function_name;
This is a very simple and basic usage of a module, I recommend the lecture of better tutorials (http://www.perlmonks.org/?node_id=102347) to gain further knowledge on this

Take a look at the Perl Documentation. One of the tutorials included is perlmod. This offers a lot of good information.
First step: Make your program into a subroutine. That way, you can call it your code. I've taken the liberty to do that:
#!/usr/bin/perl
use warnings;
use strict;
use Carp;
use XML::LibXML::Reader;
#Reading XML with a pull parser
sub myFunction {
my $fh = shift; #File Handle (should be opened before calling
my $reader = XML::LibXML::Reader->new( IO => $fh )
or croak ("unable to open file");
my %nums;
while ($reader->nextElement( 'Data' ) ) {
my $des = $reader->readOuterXml();
$reader->nextElement( 'Number' );
my $desnode = $reader->readInnerXml();
$nums{$desnode} = $des;
}
return %nums;
}
1;
I've made a wee change. You notice that I no longer open a file. Instead, you'll pass a file handle to your MyFunction subroutine. Second, instead of printing out $desnode and $des, it now returns a hash that has these values in them. You don't want subroutines to output data. You want them to return the data, and let your program decide what to do with the information.
I've also put in a use Carp; line. Carp gives you two functions (as well as a few others). One is called carp which is a replacement for warning, and the other is called croak which is a replacement for die. What these two functions do is report the line number in the user's program which called your function. That way, the user doesn't see the error in your module, but their program.
I've also added the line 1; at the bottom of your program. When a module loads, if it returns a zero on load, the load fails. Thus, your last statement should return a non-zero value. The 1; guarantees it.
Now that we have a subroutine that you can return, let's make your program into a module.
To create a module, all you have to do is say package <moduleName> on top of your program. And, also make sure that the last statement executes with a non-zero value. The tradition is just to put a 1; as the last line of the program. Modules names end with a .pm suffix by default. Modules names can have components in the names separated by double colons. For example File::Basename. In that case, the module, Basename.pm lives in the directory File somewhere in the #INC list of directories (which, by default includes the current directory).
The package command simply creates a separate namespace, so your package variables and functions don't collide with the names of the variables and functions inside the program that uses your package.
If you use an object oriented interface, there's no reason why you need to export anything. The program that uses your module will simply use the object oriented syntax. If your module is function based, you probably want to export your function names into the main program.
For example, let's take File::Basename. This module imports the function basename and dirname into your program. This allows you to do this:
my $directoryName = dirname $fileName;
Instead of having to do this:
my $direcotryName = File::Basename::dirname $fileName;
To export a function, make sure your module uses the Exporter module, and then set the package variable #EXPORT_OK or #EXPORT to contain the list of functions you're allowing to be exported in the user's program. The difference is that if you say #EXPORT_OK, the functions will be exported, but the user must request each one. If you use #EXPORT, all those functions will automatically be exported.
Using your program as a basis, your module will be called Mypackage.pm and look like this:
#!/usr/bin/perl
package Mymodule;
use warnings;
use strict;
use Exporter qw(import);
use Carp;
use XML::LibXML::Reader;
our #EXPORT_OK(myFunction);
#Reading XML with a pull parser
sub MyFunction {
my $fh = shift; #File Handle (should be opened before calling
my $reader = XML::LibXML::Reader->new( IO => $fh )
or die ("unable to open file");
my %nums;
while ($reader->nextElement( 'Data' ) ) {
my $des = $reader->readOuterXml();
$reader->nextElement( 'Number' );
my $desnode = $reader->readInnerXml();
$nums{$desnode}= $des;
}
return %nums;
}
1;
The big thing is the use of:
package Mypackage
use Exporter qw(import)
our #EXPORT_OK qw(myFunction);
The package function sets up an independent name space, so your variables and function names don't override (or get overwritten) by the user's program.
The use Exporter says that your program is using the import function of the Exporter module. This allows you to import variables and functions into the main namespace of the user's program. That way, the user can simply refer to your function as mi_function_name instead of Mypackage::my_function_name. In theory, you don't have to export anything, and newer modules don't. These module are entirely object oriented or just don't want to bother with namespace issues.
The #EXPORT_OK array says what you're exporting. This is preferred over #EXPORT. With #EXPORT_OK, the developer must specify what functions he wants to import into their program. With #EXPORT, this is done automatically.
In the program that uses your module, you'll need to do this:
use Mypackage qw(myFunction);
Now, all you have to do in your program is
my %returnedHash = myFunction($fh);
Now, things are constantly evolving in Perl, and I've never received any formal training. I simply read the documentation and take a look at various examples and hope that I understand them correctly. So, if someone might say that I'm doing something wrong, they're probably correct. I've also didn't test any of the code. I might have screwed something in your program when I turned it into a subroutine.
However, the gist should be correct: You need to make your code into callable subroutines that return the information you need. Then, you can turn it into a module. It's not all that difficult to do.

Related

Call a subroutine defined as a variable

I am working on a program which uses different subroutines in separate files.
There are three parts
A text file with the name of the subroutine
A Perl program with the subroutine
The main program which extracts the name of the subroutine and launches it
The subroutine takes its data from a text file.
I need the user to choose the text file, the program then extracts the name of the subroutine.
The text file contains
cycle.name=cycle01
Here is the main program :
# !/usr/bin/perl -w
use strict;
use warnings;
use cycle01;
my $nb_cycle = 10;
# The user chooses a text file
print STDERR "\nfilename: ";
chomp($filename = <STDIN>);
# Extract the name of the cycle
open (my $fh, "<", "$filename.txt") or die "cannot open $filename";
while ( <$fh> ) {
if ( /cycle\.name/ ) {
(undef, $cycleToUse) = split /\s*=\s*/;
}
}
# I then try to launch the subroutine by passing variables.
# This fails because the subroutine is defined as a variable.
$cycleToUse($filename, $nb_cycle);
And here is the subroutine in another file
# !/usr/bin/perl
package cycle01;
use strict;
use warnings;
sub cycle01 {
# Get the total number of arguments passed
my ($filename, $nb_cycle) = #_;
print "$filename, $nb_cycle";
Your code doesn't compile, because in the final call, you have mistyped the name of $nb_cycle. It's helpful if you post code that actually runs :-)
Traditionally, Perl module names start with a capital letter, so you might want to rename your package to Cycle01.
The quick and dirty way to do this is to use the string version of eval. But evaluating an arbitrary string containing code is dangerous, so I'm not going to show you that. The best way is to use a dispatch table - basically a hash where the keys are valid subroutine names and the values are references to the subroutines themselves. The best place to add this is in the Cycle01.pm file:
our %subs = (
cycle01 => \&cycle01,
);
Then, the end of your program becomes:
if (exists $Cycle01::subs{$cycleToUse}) {
$Cycle01::subs{$cycleToUse}->($filename, $nb_cycle);
} else {
die "$cycleToUse is not a valid subroutine name";
}
(Note that you'll also need to chomp() the lines as you read them in your while loop.)
To build on Dave Cross' answer, I usually avoid the hash table, partly because, in perl, everything is a hash table anyway. Instead, I have all my entry-point subs start with a particular prefix, that prefix depends on what I'm doing, but here we'll just use ep_ for entry-point. And then I do something like this:
my $subname = 'ep_' . $cycleToUse;
if (my $func = Cycle01->can($subname))
{
$func->($filename, $nb_cycle);
}
else
{
die "$cycleToUse is not a valid subroutine name";
}
The can method in UNIVERSAL extracts the CODE reference for me from perl's hash tables, instead of me maintaining my own (and forgetting to update it). The prefix allows me to have other functions and methods in that same namespace that cannot be called by the user code directly, allowing me to still refactor code into common functions, etc.
If you want to have other namespaces as well, I would suggest having them all be in a single parent namespace, and potentially all prefixed the same way, and, ideally, don't allow :: or ' (single quote) in those names, so that you minimise the scope of what the user might call to only that which you're willing to test.
e.g.,
die "Invalid namespace $cycleNameSpaceToUse"
if $cycleNameSpaceToUse =~ /::|'/;
my $ns = 'UserCallable::' . $cycleNameSpaceToUse;
my $subname = 'ep_' . $cycleToUse;
if (my $func = $ns->can($subname))
# ... as before
There are definitely advantages to doing it the other way, such as being explicit about what you want to expose. The advantage here is in not having to maintain a separate list. I'm always horrible at doing that.

Prevent strings from being interpreted as a file handle

Perl has the feature that strings named like a file handle are taken to be a filehandle:
# let this be some nice class I wrote
package Input {
sub awesome { ... }
}
So when we do Input->awesome or extra-careful: 'Input'->awesome, the method will get called. Unless:
# now somewhere far, far away, in package main, somebody does this:
open Input, "<&", \*STDIN or die $!; # normally we'd open to a file
This code doesn't even have to be executed, but only be seen by the parser in order to have Perl interpret the string 'Input' as a file handle from now on. Therefore, a method call 'Input'->awesome will die because the object representing the file handle doesn't have awesome methods.
As I am only in control of my class but not of other code, I can't simply decide to only use lexical filehandles everywhere.
Is there any way I can force Input->awesome to always be a method call on the Input package, but never a file handle (at least in scopes controlled by me)? I'd think there shouldn't be any name clash because the Input package is actually the %Input:: stash.
Full code to reproduce the problem (see also this ideone):
use strict;
use warnings;
use feature 'say';
say "This is perl $^V";
package Input {
sub awesome {
say "yay, this works";
}
}
# this works
'Input'->awesome;
# the "open" is parsed, but not actually executed
eval <<'END';
sub red_herring {
open Input, "<&", \*STDIN or die $!;
}
END
say "eval failed: $#" if $#;
# this will die
eval {
'Input'->awesome;
};
say "Caught: $#" if $#;
Example output:
This is perl v5.16.2
yay, this works
Caught: Can't locate object method "awesome" via package "IO::File" at prog.pl line 27.
Using the same identifier for two different things (a used class and filehandle) begs for problems. If your class is used from a different class that's used in the code that uses the filehandle, the error does not appear:
My1.pm
package My1;
use warnings;
use strict;
sub new { bless [], shift }
sub awesome { 'My1'->new }
__PACKAGE__
My2.pm
package My2;
use warnings;
use strict;
use parent 'My1';
sub try {
my $self = shift;
return ('My1'->awesome, $self->awesome);
}
__PACKAGE__
script.pl
#!/usr/bin/perl
use warnings;
use strict;
use My2;
open My1, '<&', *STDIN;
my $o = 'My2'->new;
print $o->awesome, $o->try;
Using the bareword Input as a filehandle is a breach of the naming convention to have only uppercase barewords for FILEHANDLEs and Capitalized/CamelCased barewords for Classes and Packages.
Furthermore lexcial $filehandles have been introduced and encouraged already a very long time ago.
So the programmer using your class is clearly misbehaving, and since namespaces are per definition global this can hardly be addressed by Perl (supporting chorobas statement about begging for problems).
Some naming conventions are crucial for all (dynamic) languages.
Thanks for the interesting question though, the first time I see a Perl question in SO I would preferred to see on perlmonks! :)
UPDATE: The discussion has has been deepened here: http://www.perlmonks.org/?node_id=1083985

Writing subroutine to log file in perl

I am new to perl scripting. I want to write subroutine test to log file.
for e.g.
my ($logfile, $logpath);
$logpath = '/usr/bin';
$logfile = "$logpath/log.txt";
open (LOG,">>","$logfile") || die ("Error : can't open log file");
sub test
{
print "Hi\n";
my $date = `date`;
}
sub logFunc
{
print LOG "Writing log files\n";
print LOG test(); # we cannot do like this :)
}
logFunc();
Say their are 15+ subroutines. So to write commands in each subroutine to log file I have to write print LOG "[Command]\n"; which works fine but script length is huge. So using common subroutine is their any way to achieve this?
There are several problems with your code.
Are you sure you have (and want) write-access to /usr/bin/?
You don't ever call your log() or your test() subroutines. No one will call any of them automatically.
The name log clashes with the built-in log function. So you will either have to call it with a prepended ampersand &log() which is ugly or rename it.
Your test() sub only has an implicit return value. Rather return the value of $date explicitly.
You are using the deprecated 2-argument version of open using a bare-word global file handle. Please use the 3-arg version with a lexical filehandle: open my $log_fh, '>>', $logfile.
A few hints:
Always add use strict; and use warnings; at the top of your script.
Since you're dealing with reading and writing files, you should also add use autodie;. This will automatically kill your program if you cannot open a file, or you cannot write to an open file.
Don't use OS commands when Perl probably can do exactly what you want without calling an OS command.
A Subroutine usually takes arguments and returns a value of some sort. In your case, have your test subroutine return something to write to the log. Or, create just a log subroutine that writes to a log, and have your test subroutine call it.
Here I'm reversing your subroutine calls. I create a write_to_log subroutine to handle my subroutine calls. My write_to_log adds the date/time stamp and writes that and my message. My various subroutines now just call write_to_log for me.
Notice all of my subroutines return some sort of value. The say command (as well as print) returns a non-zero value on success and a 0 on failure. I can use this to test whether my call to my subroutine worked or not.
use strict;
use warnings;
use autodie;
use features qw(say); #Allows you to use `say` instead of `print:
my $log_file = "/usr/bin/log.txt"; #You have write permission to this directory?
open my $log_fh, ">", $log_file;
my test ( $log_fh ) or die qq(Can't write to the log); #Pass the file handle to log
my test2 ( $log_fh ) or die qq(Can't write to the log);
close $log_fh;
sub test {
return write_to_log ( $log_fh, "Hello World!" );
}
sub test2 {
return write_to_log ( $log_fh, "Goodbye World!" );
}
sub write_to_log {
my $file_handle = shift;
my $message = shift;
use Time::Piece;
my $time = localtime->cdate;
return say {$file_handle} "$time: $message";
}
Here's a webpage that lists good books for learning modern Perl and what to look for in those books. If you're beginning to learn Perl, use one of these books.

Which script initialized module?

In Perl, is there any way to tell which .pl script has initialized this instance of a module?
Specifically, I'd like to get the name of the script calling a module, which has a Log4perl object it. That way, I'll know which .log file I want to write to within the module.
Am I doing this wrong? If I define the $logger in my .pl script, will any $logger calls within the module write to the same .log file as the calling script?
I don't have any sample code yet, but have been reading up on Log4perl. Basically, if I set an Appender to a file, caller.log, which is the file appender for my calling script, caller.pl, I'd want any logging defined in the custom imported module, to also write to caller.log (implicitly, if possible -- obviously I could just pass the name of the log name when I initialize the module instance).
Is this possible without passing arguments specifying which File Appender the module should write to? Doesn't Log4perl use just one $logger instance?
Also, let me know if I'm way out, and if there's a different approach I should be considering.
Thank you
EDIT: Sorry, after I posted this, I looked at the Related Links, and I guess my search wording just wasn't correct. It looks like this is a pretty good solution: Self logging Perl modules (without Moose)
If anyone has any other ideas, though, please let me know.
EDIT 2: Finally tested, and got it to work as I had wanted -- was a lot easier than was making it out to be, too!
This is my setup, pretty much:
Module.pm
package Module;
use Log::Log4perl qw(get_logger :levels);
use Data::Dumper;
my $logger = get_logger("Module");
sub new {
my ($class, $name) = #_;
my #caller = caller(0);
$logger->debug("Creating new Module. Called by " . Dumper(\#caller));
my $object = { 'name' => $name };
return bless($object, $class);
}
caller.pl
use Module;
use Log::Log4perl qw(get_logger :levels);
use Data::Dumper;
my $PATH = "$ENV{'APPS'}/$ENV{'OUTDIR'}";
my $SCRIPT = "caller";
my $logger = get_logger("Module");
$logger->level($DEBUG);
my $file_appender = Log::Log4perl::Appender->new("Log::Dispatch::File",
filename=> "$PATH/$SCRIPT.log",
mode => "append",);
$logger->add_appender($file_appender);
my $layout = Log::Log4perl::Layout::PatternLayout->new("%d %p> %F{1}:%L %M - %m%n");
$file_appender->layout($layout);
my $lib = Module->new('Chris');
$logger->info(Dumper($lib));
You could subclass Log4perl, overriding its constructor. In your custom constructor, use caller() to get the filename that called the constructor and put it in $self.
You can put a subroutine hook into #INC that can run arbitrary code, as documented in perldoc -f require. For example:
# UseLogger.pm
package UseLogger;
sub import { unshift #INC, \&UseLogger::log_use }
sub log_use {
my ($self, $filename) = #_;
my #c = caller(0);
print "Module $filename required in file $c[1] line $c[2]\n";
return 0;
}
1;
$ perl -MUseLogger my_script.pl
Module feature.pm required in file my_script.pl line 2
Module Encode.pm required in file my_script.pl line 5
Module XSLoader.pm from /usr/lib/perl5/5.14.0/cygwin-thread-multi-64int/Encode.pm line 13
...
$0 contains the path to the script. You can use File::Basename's basename if you want to want the file name component.

How can I dynamically include Perl modules without using eval?

I need to dynamically include a Perl module, but if possible would like to stay away from eval due to work coding standards. This works:
$module = "My::module";
eval("use $module;");
But I need a way to do it without eval if possible. All google searches lead to the eval method, but none in any other way.
Is it possible to do it without eval?
Use require to load modules at runtime. It often a good idea to wrap this in a block (not string) eval in case the module can't be loaded.
eval {
require My::Module;
My::Module->import();
1;
} or do {
my $error = $#;
# Module load failed. You could recover, try loading
# an alternate module, die with $error...
# whatever's appropriate
};
The reason for the eval {...} or do {...} syntax and making a copy of $# is because $# is a global variable that can be set by many different things. You want to grab the value as atomically as possible to avoid a race condition where something else has set it to a different value.
If you don't know the name of the module until runtime you'll have to do the translation between module name (My::Module) and file name (My/Module.pm) manually:
my $module = 'My::Module';
eval {
(my $file = $module) =~ s|::|/|g;
require $file . '.pm';
$module->import();
1;
} or do {
my $error = $#;
# ...
};
How about using the core module Module::Load
With your example:
use Module::Load;
my $module = "My::module";
load $module;
"Module::Load - runtime require of both modules and files"
"load eliminates the need to know whether you are trying to require either a file or a module."
If it fails it will die with something of the like "Can't locate xxx in #INC (#INC contains: ...".
Well, there's always require as in
require 'My/Module.pm';
My::Module->import();
Note that you lose whatever effects you may have gotten from the import being called at compile time instead of runtime.
Edit: The tradeoffs between this and the eval way are: eval lets you use the normal module syntax and gives you a more explicit error if the module name is invalid (as opposed to merely not found). OTOH, the eval way is (potentially) more subject to arbitrary code injection.
No, it's not possible to without eval, as require() needs the bareword module name, as described at perldoc -f require. However, it's not an evil use of eval, as it doesn't allow injection of arbitrary code (assuming you have control over the contents of the file you are requireing, of course).
EDIT: Code amended below, but I'm leaving the first version up for completeness.
I use I used to use this little sugar module to do dynamic loads at runtime:
package MyApp::Util::RequireClass;
use strict;
use warnings;
use Exporter 'import'; # gives you Exporter's import() method directly
our #EXPORT_OK = qw(requireClass);
# Usage: requireClass(moduleName);
# does not do imports (wrong scope) -- you should do this after calling me: $class->import(#imports);
sub requireClass
{
my ($class) = #_;
eval "require $class" or do { die "Ack, can't load $class: $#" };
}
1;
PS. I'm staring at this definition (I wrote it quite a while ago) and I'm pondering adding this:
$class->export_to_level(1, undef, #imports);... it should work, but is not tested.
EDIT: version 2 now, much nicer without an eval (thanks ysth): :)
package MyApp::Util::RequireClass;
use strict;
use warnings;
use Exporter 'import'; # gives you Exporter's import() method directly
our #EXPORT_OK = qw(requireClass);
# Usage: requireClass(moduleName);
# does not do imports (wrong scope) -- you should do this after calling me: $class->import(#imports);
sub requireClass
{
my ($class) = #_;
(my $file = $class) =~ s|::|/|g;
$file .= '.pm';
require $file; # will die if there was an error
}
1;
Class::MOP on CPAN has a load_class method for this:
http://metacpan.org/pod/Class::MOP
i like doing things like..
require Win32::Console::ANSI if ( $^O eq "MSWin32" );