Perl portable serialization with only CORE modules

Is there any portable serialization method/module included in the CORE modules? I know there is Storable, but it is not truly portable nor "cross-platform-standardized". I'm looking for something like YAML, JSON, XML or the like...
I already checked http://perldoc.perl.org/index-modules-T.html - but maybe I missed something.
Motivation: I want to make a simple Perl script that will work with any perl (without CPAN) and can read some configuration (and data) from a file. Using require with the Data::Dumper format is not very "user friendly"...
So possible solutions:
include something like YAML directly in my script (can be a solution, but...)
forcing users to install CPAN modules (not a solution)
use native Perl and require - not very user-friendly syntax (for non-Perl users)
Any other suggested solution?
PS: I understand the need to keep the core as small as possible and reasonable, but reading data in some standardized format should maybe be in the core...

There is a YAML parser and serializer bundled with Perl, hidden away. It's called CPAN::Meta::YAML. It only handles a subset of YAML, but that may be sufficient for your purposes.
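For example, a minimal sketch of reading and writing a config file with it (the config.yml file name and its keys are made up for illustration; CPAN::Meta::YAML is essentially a bundled fork of YAML::Tiny, so only simple mappings, sequences and scalars are supported):
use strict;
use warnings;
use CPAN::Meta::YAML;

# Read a file; the object behaves like an arrayref of YAML documents.
my $yaml   = CPAN::Meta::YAML->read('config.yml');
my $config = $yaml->[0];                 # first document, here a hashref
print "host: $config->{host}\n";

# Writing works the same way in reverse.
my $out = CPAN::Meta::YAML->new( { host => 'example.com', port => 8080 } );
$out->write('config_out.yml');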

You can configure Data::Dumper's output to be JSON-like. For example:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Pair = ': ';
$Data::Dumper::Terse = 1;
$Data::Dumper::Useqq = 1;
$Data::Dumper::Indent = 1;
my $structure = {
    foo => 'bar',
    baz => {
        quux   => 'duck',
        badger => 'mythical',
    }
};
print Dumper( $structure );
This prints:
{
  "baz": {
    "quux": "duck",
    "badger": "mythical"
  },
  "foo": "bar"
}
That might get you most of the way towards interoperability? The module does have a bunch of options for controlling / changing output e.g. the Freezer and Toaster options.

Can you explain to me the problem with Storable again? If you look at perlport, after a discussion of big-endianness and little-endianness, it concludes:
One can circumnavigate both these problems in two ways. Either transfer and store numbers always in text format, instead of raw binary, or else consider using modules like Data::Dumper and Storable (included as of perl 5.8). Keeping all data as text significantly simplifies matters.
So, Storable is universal for storing and retrieving data in Perl, and it's not only easy to use, but it's a standard Perl module.
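For what it's worth, Storable's nstore/nfreeze variants write in network byte order, which sidesteps the endianness concern; a minimal sketch (the file name is arbitrary):
use strict;
use warnings;
use Storable qw(nstore retrieve);

my $config = { name => 'Bob', phones => [ '555-1234', '555-2345' ] };

nstore( $config, 'config.stor' );          # network byte order, portable between machines
my $read_back = retrieve('config.stor');
print "$read_back->{name}\n";              # prints "Bob"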
Is the issue that you want to be able to write the data without having a Perl program do it for you? You could simply write your own Perl module. In most Perl installations, that module could be placed in the same directory as your program.
package Some_data;  # Can be put in the same directory as the program like a config file

our $data;   # Module variable which makes it accessible to your local program

$data = {};  # I am making this complex data structure...
$data->{NAME}->{FIRST}        = "Bob";
$data->{NAME}->{LAST}         = "Smith";
$data->{PHONE}->[0]->{TYPE}   = "C";
$data->{PHONE}->[0]->{NUMBER} = "555-1234";
$data->{PHONE}->[1]->{TYPE}   = "H";
$data->{PHONE}->[1]->{NUMBER} = "555-2345";

# Or use subroutines
sub first {
    return "Bob";
}

sub last {
    return "Smith";
}
...
Now you can include this in your program:
use Some_data;
my $first_name = $Some_data::data->{NAME}->{FIRST}; # As a hash of hashes
# OR
my $first_name = Some_data::first; # As a constant
The nice thing about the subroutines is that you can't change the data in your program. They're constants. In fact, that's exactly how Perl constants work too.
Speaking about constants. You could use use constant too:
package Some_data;
use constant {
    FIRST  => "Bob",
    SECOND => "Smith",
};
And in your program:
use strict;
use warnings;
use Some_data;
my $first_name = &Some_data::FIRST; # Note the use of the ampersand!
Not quite as clean because you need to prefix the constant with an ampersand. There are ways of getting around that ampersand, but they're not all that pretty.
Now, you have a way of importing your data in your program, and it's really no harder to maintain than a JSON data structure. There's nothing your program has to do except to use Module; to get that data.
One final possibility
Here's one I've done before. I simply have a configuration file that looks like what you'd put on the command line, then use Getopt::Long to pull in the configuration:
Configfile
-first Bob -last Smith
-phone 555-1212
NOTE: It doesn't matter if you put it all on one line or not:
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromString);

my $param_file = shift;   # path to the config file

open my $param_fh, "<", $param_file
    or die "Cannot open '$param_file': $!\n";
my @parameters = <$param_fh>;
close $param_fh;

my $params = join " ", @parameters;   # One long string

my ( $first, $last, $phone );
GetOptionsFromString( $params,
    "first=s" => \$first,
    "last=s"  => \$last,
    "phone=s" => \$phone,
);
You can't get easier to maintain than that!


In Perl, how do I extract out the declaration of variables into a wrapper script?

Background
I have a Perl script, called main.pl, that is currently in several branched states in ClearCase, like so:
Branch 1:
my %hash
my $variable = "a"
my $variable2 = "c"
sub codeIsOtherwiseTheSame()
....
Branch 2:
my %hash2
my $variable = "b"
sub codeIsOtherwiseTheSame()
....
Branch 3
my %hash
my $variable2 = "d"
sub codeIsOtherwiseTheSame()
....
Right now, each branch of the script has the same code. The only differences are the kind of variables that are declared and what their initialized value is. What I want to do is extract these differing variables out to a wrapper script (for each variation) so that the main script does not have to be changed. I am doing this because several users will be using this script, but have only minor differences based on their use case. Thus I want each kind of user to have their own simplified interface. At the same time, I want the main script to still be aware of these variable once it is called. Below is an example of what I want:
Desired Solution
Wrapper Script 1:
my %hash;
my $variable = "a";
my $variable2 = "c";
system("main.pl");
Wrapper Script 2:
my %hash2;
my $variable = "b";
system("main.pl");
Wrapper Script 3:
my %hash;
my $variable2 = "d";
system("main.pl");
Main.pl
sub codeIsOtherwiseTheSame()
Question
How do I extract out a wrapper script to obtain the organization and behavior I want above?
Extract the common code into a module, not a script. Save it as e.g. MyCommon.pm.
Export a function from the module that does what you need:
package MyCommon;
use Exporter qw{ import };
our @EXPORT_OK = qw{ common_code };

sub common_code {
    my ($var1, $var2) = @_;
    # Common code goes here...
}

1;
Then, in various scripts, write
use MyCommon qw{ common_code };
common_code('a', 'b'); # <- insert the specific values here.
There are more advanced ways, e.g. you can use "object orientation": construct an object from the specific values, then run a method that implements the common code - but for simple use cases, you probably don't need it.
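For instance, a rough sketch of that object-oriented variant (class and attribute names are purely illustrative):
package MyCommon;
use strict;
use warnings;

sub new {
    my ( $class, %args ) = @_;
    return bless { var1 => $args{var1}, var2 => $args{var2} }, $class;
}

sub common_code {
    my $self = shift;
    # Common code goes here, using $self->{var1} and $self->{var2}...
    print "Running with $self->{var1} and $self->{var2}\n";
}

1;
A wrapper script would then just do MyCommon->new( var1 => 'a', var2 => 'c' )->common_code;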
The desired behavior for a simple case like yours can be achieved with Perl's require function.
Put the common code in a file, for example common.inc, and end the file with 1; (a requirement for modules and include files):
sub commonFunction {
    my $data = shift;
    print "DATA: $data\n";
}
1;
Copy/move the common.inc file into one of the @INC directories (the site directory is probably the best fit for this purpose).
Check your Perl @INC configuration with the following command:
perl -le "print for @INC"
Now you can reuse the common.inc file in your user-interface script:
#!/usr/bin/perl
require 'common.inc';
my $value = 7;
commonFunction($value);
It was already suggested above to place common code that will be reused multiple times in a .pm module.
By doing so you gain more control over which functions/variables are visible (exported), which helps you avoid namespace clashes/collisions [different modules can have functions/variables with the same name].
A short tutorial on how to create a module is available; the next natural step is OOP programming.
Book: Object Oriented Perl
perlootut, Writing Perl Modules, Chapter: Object Oriented Perl

In Perl, how can I access a scalar defined in another package?

I seem to be stuck trying to access a scalar which is defined in another package, and have narrowed down an example to a simple test case where I can reproduce the issue.
What I wish to be able to do is access a reference to a list which is defined within the package 'Example', using the our mechanism; however, Dumper shows that the variable is always undefined within example.pl:
Example.pm looks like the following:
#!/usr/bin/perl -w
use strict;
use warnings;
use diagnostics;
package Example;
use Data::Dumper;
my $exported_array = [ 'one', 'two', 'three' ];
print Dumper $exported_array;
1;
And the code which uses this package looks like this:
#!/usr/bin/perl -w
use strict;
use warnings;
use diagnostics;
use Data::Dumper;
use lib '.';
use Example;
{
    package Example;
    use Data::Dumper;
    our $exported_array;
    print Dumper $exported_array;
}
exit 0;
Upon running this code, the first Dumper (inside Example.pm) runs and things look normal; after that, the second Dumper (in example.pl) runs and the reference is undefined:
$VAR1 = [
          'one',
          'two',
          'three'
        ];
$VAR1 = undef;
A my declaration does not create a package-level variable and does not enter anything into the symbol table for any namespace.
To do what it looks like you are trying to do, you will have to change the declaration in the first file to
our $exported_array = [ ... ];
You can then access it in another file as
$Example::exported_array
Even if $exported_array weren't lexically scoped in the Example package, Example's $exported_array and main's $exported_array are two different things. The easiest way to change the example you've given is to change my to our in the Example declaration and explicitly qualify the variable name:
our $exported_array;
...
print Dumper $Example::exported_array;
Otherwise, you need to make Example an Exporter. (Or just write an Example::import routine--but I'm not going to cover that.)
package Example;
our $exported_array = ...;
our @EXPORT_OK = qw<$exported_array>;
use parent qw<Exporter>;
And in the script:
use Example qw<$exported_array>;
However, as you can actually export arrays (not just refs), I would make that:
our @exported_array = (...);
our @EXPORT_OK = qw<@exported_array>;
...
use Example qw<@exported_array>;
...
print Dumper( \@exported_array );
When you use the my operator, you are lexically scoping the variable name to either the scope you're in or to the file.
If you want something to be visible as a qualified package array, you need to use our like you do in the driver code. I believe you also need to declare a few special exporter variables in the .pm file, but on the plus side you won't need to declare our $exported_array; in the driver file.
Using Exporter is fine for smaller projects, but if you have lots of code handling data that is internal to a module, things can get ... messy. Object-orientation is a lot friendlier for this type of thing.
Why not construct a method to fetch this data? In fact, why not just use Moose?
In your Example.pm, just load Moose - this gives you a constructor and destructor for free, as well as a subroutine to fetch values, and turns on strict, etc. by default. Array references have to be declared a little differently, due to how Class::MOP (the engine under the antlers of Moose) initializes attributes - you have to wrap them in a code reference (aka sub {}). You would also use Data::Dumper in the script which calls the package, instead of in the package itself.
Example.pm
package Example;
use Moose;
has 'exported_array' => (is => 'rw', default => sub { [ 'one', 'two', 'three' ] });
1;
Then call this from a script:
example.pl
#!/usr/bin/env perl
use Modern::Perl '2013';
use lib '.';
use Example;
use Data::Dumper;
my $example = Example->new;
my $imported_array_ref = $example->exported_array;
my @imported_array = @{$imported_array_ref};
foreach my $element (@imported_array) { say $element; }
say Dumper(\@imported_array);
I made the dereferencing really explicit in the example.pl script above... it can be much more terse by dereferencing it directly into the array:
#!/usr/bin/env perl
use Modern::Perl '2013';
use lib '.';
use Example;
use Data::Dumper;
my $example = Example->new;
my @imported_array = @{$example->exported_array};
foreach my $element (@imported_array) { say $element; }
say Dumper(\@imported_array);
I think a lot more Perl programmers would embrace Moose if there were more simple examples that show how to get simple things done.
The Official Moose Manual is excellent, but it was really written for those who are already familiar with OOP.

Subroutines vs scripts in Perl

I'm fairly new to Perl and was wondering what the best practices regarding subroutines are with Perl. Can a subroutine be too big?
I'm working on a script right now, and it might need to call another script. Should I just integrate the old script into the new one in the form of a subroutine? I need to pass one argument to the script and need one return value.
I'm guessing I'd have to do some sort of black magic to get the output from the original script, so subroutine-ing it makes sense right?
Avoiding "black magic" is always a good idea when writing code. You never want to jump through hoops and come up with an unintuitive hack to solve a problem, especially if that code needs to be supported later. It happens, admittedly, and we're all guilty of it. Circumstances can weigh heavily on "just getting the darn thing to work."
The point is, the best practice is always to make the code clean and understandable. Remember, and this is especially true with Perl code in my experience, any code you wrote yourself more than a few months ago may as well have been written by someone else. So even if you're the only one who needs to support it, do yourself a favor and make it easy to read.
Don't cling to broad sweeping ideas like "favor more files over larger files" or "favor smaller methods/subroutines over larger ones" etc. Those are good guidelines to be sure, but apply the spirit of the guideline rather than the letter of it. Keep the code clean, understandable, and maintainable. If that means the occasional large file or large method/subroutine, so be it. As long as it makes sense.
A key design goal is separation of concerns. Ideally, each subroutine performs a single well-defined task. In this light, the main question revolves not around a subroutine's size but its focus. If your program requires multiple tasks, that implies multiple subroutines.
In more complex scenarios, you may end up with groups of subroutines that logically belong together. They can be organized into libraries or, even better, modules. If possible, you want to avoid a scenario where you end up with multiple scripts that need to communicate with each other, because the usual mechanism for one script to return data to another script is tedious: the first script writes to standard output and the second script must parse that output.
Several years ago I started work at a job requiring that I build a large number of command-line scripts (at least, that's how it turned out; in the beginning, it wasn't clear what we were building). I was quite inexperienced at the time and did not organize the code very well. In hindsight, I should have worked from the premise that I was writing modules rather than scripts. In other words, the real work would have been done by modules, and the scripts (the code executed by a user on the command line) would have remained very small front-ends to invoke the modules in various ways. This would have facilitated code reuse and all of that good stuff. Live and learn, right?
Another option that hasn't been mentioned yet for reusing the code in your scripts is to put common code in a module. If you put shared subroutines into a module or modules, you can keep your scripts short and focused on what they do that is special, while isolating the common code in an easy-to-access, reusable form.
For example, here is a module with a few subroutines. Put this in a file called MyModule.pm:
package MyModule;

# Always do this:
use strict;
use warnings;

use IO::Handle;           # For OOP filehandle stuff.
use Exporter qw(import);  # This lets us export subroutines to other scripts.

# These may be exported.
our @EXPORT_OK = qw( gather_data_from_fh open_data_file );

# Automatically export everything allowed.
# Generally best to leave empty, but in some cases it makes
# sense to export a small number of subroutines automatically.
our @EXPORT = @EXPORT_OK;

# Array of directories to search for files.
our @SEARCH_PATH;

# Parse the contents of an IO::Handle object and return structured data
sub gather_data_from_fh {
    my $fh = shift;
    my %data;

    while ( my $line = $fh->readline ) {
        # Parse the line
        chomp $line;
        my ($key, @values) = split ' ', $line;
        $data{$key} = \@values;
    }

    return \%data;
}

# Search a list of directories for a file with a matching name.
# Open it and return a handle if found.
# Die otherwise.
sub open_data_file {
    my $file_name = shift;

    for my $path ( @SEARCH_PATH, '.' ) {
        my $file_path = "$path/$file_name";
        next unless -e $file_path;

        open my $fh, '<', $file_path
            or die "Error opening '$file_path' - $!\n";

        return $fh;
    }

    die "No matching file found in path\n";
}

1; # Need to have trailing TRUE value at end of module.
Now in script A, we take a filename to search for and process and then print formatted output:
use strict;
use warnings;
use MyModule;

# Configure which directories to search
@MyModule::SEARCH_PATH = qw( /foo/foo/rah /bar/bar/bar /eeenie/meenie/mynie/moe );

# Get file name from args.
my $name = shift;

my $fh   = open_data_file($name);
my $data = gather_data_from_fh($fh);

for my $key ( sort keys %$data ) {
    print "$key -> ", join ', ', @{$data->{$key}};
    print "\n";
}
Script B searches for a file, parses it and then writes the parsed data structure into a YAML file:
use strict;
use warnings;
use MyModule;
use YAML qw( DumpFile );

# Configure which directories to search
@MyModule::SEARCH_PATH = qw( /da/da/da/dum /tutti/frutti/unruly /cheese/burger );

# Get file names from args.
my $infile  = shift;
my $outfile = shift;

my $fh   = open_data_file($infile);
my $data = gather_data_from_fh($fh);

DumpFile( $outfile, $data );
Some related documentation:
perlmod - About Perl modules in general
perlmodstyle - Perl module style guide; this has very useful info.
perlnewmod - Starting a new module
Exporter - The module used to export functions in the sample code
use - the perlfunc article on use.
Some of these docs assume you will be sharing your code on CPAN. If you won't be publishing to CPAN, simply ignore the parts about signing up and uploading code.
Even if you aren't writing for CPAN, it is beneficial to use the standard tools and CPAN file structure for your module development. Following the standard allows you to use all of the tools CPAN authors use to simplify the development, testing and installation process.
I know that all this seems really complicated, but the standard tools make each step easy. Even adding unit tests to your module distribution is easy thanks to the great tools available. The payoff is huge, and well worth the time you will invest.
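For instance, a minimal test file that the standard tooling (prove, make test) will pick up from a t/ directory - the file name here is arbitrary, and MyModule refers to the example module above:
# t/01-load.t
use strict;
use warnings;
use Test::More tests => 2;

use_ok('MyModule');                        # the module compiles and loads
ok( defined &MyModule::open_data_file,     # ...and provides the expected subroutine
    'open_data_file is defined' );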
Sometimes it makes sense to have a separate script, sometimes it doesn't. The "black magic" isn't that complicated.
#!/usr/bin/perl
# square.pl
use strict;
use warnings;
my $input = shift;
print $input ** 2;
#!/usr/bin/perl
# sum_of_squares.pl
use strict;
use warnings;
my ($from, $to) = @ARGV;
my $sum;
for my $num ( $from .. $to ) {
    $sum += `square.pl $num` // die "square.pl failed: $? $!";
}
print $sum, "\n";
Easier and better error reporting on failure is automatic with IPC::System::Simple:
#!/usr/bin/perl
# sum_of_squares.pl
use strict;
use warnings;
use IPC::System::Simple 'capture';
my ($from, $to) = @ARGV;
my $sum;
for my $num ( $from .. $to ) {
    $sum += capture( "square.pl $num" );
}
print $sum, "\n";

How do I dynamically discover packages from a partial namespace in perl?

I have a directory structure that looks like:
Foo::Bar::Baz::1
Foo::Bar::Baz::2 etc
Can I list the packages from something like:
use Foo::Bar::Baz;
Thanks!
Edit: Made it more clear what the modules are.
If you want to load all modules in your include path with a certain prefix (e.g. everything under Foo::Bar::Baz), you can use Module::Find.
For example:
use Module::Find 'useall';
my @loaded = useall 'Foo::Bar::Baz'; # loads everything under Foo::Bar::Baz
This depends on your @INC path being set up with the necessary directories, so do any required manipulation (e.g. with use lib) first.
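If you only want the names without actually loading the modules, Module::Find also provides findallmod; a small sketch:
use Module::Find 'findallmod';

my @found = findallmod 'Foo::Bar::Baz';    # e.g. Foo::Bar::Baz::1, Foo::Bar::Baz::2
print "$_\n" for @found;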
Normally a script such as a/b/c.pl won't have a namespace other than main. Perhaps you are thinking of discovering modules with names such as a/b/c.pm (which is a bad name, since lower-cased package names are generally reserved for Perl internals).
However, given a directory path, you can look for potential Perl modules using File::Find:
use strict;
use warnings;
use File::Find;
use Data::Dumper;
my @modules;

sub wanted {
    push @modules, $_ if m/\.pm$/;
}

find( \&wanted, 'A/B' );

print "possible modules found:\n";
print Dumper(\@modules);
This might be overkill, but you can inspect the symbol table before and after loading the module and see what changed:
use strict; use warnings;
my %original = map { $_ => 1 } get_namespaces("::");
require Inline;
print "New namespaces since 'require Inline' call are:\n";
my @new_namespaces = sort grep !defined $original{$_}, get_namespaces("::");
foreach my $new_namespace (@new_namespaces) {
    print "\t$new_namespace\n";
}

sub get_namespaces {
    # recursively inspect symbol table for known namespaces
    my $pkg = shift;
    my @namespace = ();
    my %s = eval "%" . $pkg;
    foreach my $key (grep /::$/, keys %s) {
        next if $key eq "main::";
        push @namespace, "$pkg$key", get_namespaces("$pkg$key");
    }
    return @namespace;
}
New namespaces since 'require Inline' call are:
::AutoLoader::
::Config::
::Digest::
::Digest::MD5::
::Dos::
::EPOC::
::Exporter::
::Exporter::Heavy::
::File::
::File::Spec::
::File::Spec::Cygwin::
::File::Spec::Unix::
::File::Spec::Win32::
::Inline::Files::
::Inline::denter::
::Scalar::
::Scalar::Util::
::Socket::
::VMS::
::VMS::Filespec::
::XSLoader::
::vars::
::warnings::register::
Just to be clear, are you looking at random packages in random Perl code?
Or for Perl modules, e.g. "a/b/c/d1.pm" with module "a::b::c::d1"?
In either case, you cannot use a single use statement to load them all.
What you need to do is to find all the appropriate files, using either glob or File::Find.
In the first case (modules), you can then load them either by require-ing each file, OR by converting filename into module name (s#/#::#g; s#\.pm$##;) and calling use on each module individually.
As far as actual packages nested in random Perl files, those packages can be:
Listed by grepping each file (again, found via glob or File::Find) for /^package (.*);/
Actually loaded by executing require $file for each file.
In this case, please note that the package name for each of those packages in a/b/c/1.pl will NOT necessarily be related to "a::b::c" - e.g. the file author CAN name them "p1", "a::p1" or "a::b::c::p1_something".
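Here is a small sketch of that file-based approach, assuming (purely for illustration) that the modules live under a lib/ directory:
use strict;
use warnings;
use lib 'lib';                                    # make lib/ part of @INC

for my $file ( glob 'lib/Foo/Bar/Baz/*.pm' ) {
    ( my $inc_name = $file ) =~ s{^lib/}{};       # e.g. Foo/Bar/Baz/1.pm
    ( my $package  = $inc_name ) =~ s{/}{::}g;    # the *likely* package name
    $package =~ s{\.pm$}{};

    if ( eval { require $inc_name; 1 } ) {
        print "loaded $inc_name (probably package $package)\n";
    }
    else {
        warn "could not load $inc_name: $@";
    }
}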

How can I share global values among different packages in Perl?

Is there a standard way to code a module to hold global application parameters to be included in every other package? For instance: use Config;?
A simple package that only contains our variables? What about readonly variables?
There's already a standard Config module, so choose a different name.
Say you have MyConfig.pm with the following contents:
package MyConfig;
our $Foo = "bar";
our %Baz = (quux => "potrzebie");
1;
Then other modules might use it as in
#! /usr/bin/perl
use warnings;
use strict;
use MyConfig;
print "Foo = $MyConfig::Foo\n";
print $MyConfig::Baz{quux}, "\n";
If you don't want to fully qualify the names, then use the standard Exporter module instead.
Add three lines to MyConfig.pm:
package MyConfig;
require Exporter;
our @ISA = qw/ Exporter /;
our @EXPORT = qw/ $Foo %Baz /;
our $Foo = "bar";
our %Baz = (quux => "potrzebie");
1;
Now the full package name is no longer necessary:
#! /usr/bin/perl
use warnings;
use strict;
use MyConfig;
print "Foo = $Foo\n";
print $Baz{quux}, "\n";
You could add a read-only scalar to MyConfig.pm with
our $READONLY;
*READONLY = \42;
This is documented in perlmod.
After adding it to @MyConfig::EXPORT, you might try
$READONLY = 3;
in a different module, but you'll get
Modification of a read-only value attempted at ./program line 12.
As an alternative, you could declare in MyConfig.pm constants using the constant module and then export those.
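A sketch of that alternative (the names and values are just placeholders):
package MyConfig;
use strict;
use warnings;

use constant {
    FOO     => "bar",
    TIMEOUT => 42,
};

use Exporter qw(import);
our @EXPORT_OK = qw(FOO TIMEOUT);

1;
A script can then use MyConfig qw(FOO); and refer to FOO directly; trying to assign to it is a compile-time error rather than a silent change.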
Don't use global variables for configuration, and don't store configuration as code. I have an entire chapter in Mastering Perl about this.
Instead, make a configuration class that any other package can use to access the configuration data. It will be much easier in the long run to provide an interface to something you may want to change later than to deal with the nuttiness you lock yourself into by scattering variable names you have to support for the rest of your life.
A config interface also gives you the benefit of composing new answers to configuration questions by combining the right bits of actual configuration data. You hide all of that behind a method and the higher levels don't have to see how it's implemented. For instance,
print "Hello!" unless $config->be_silent;
The be_silent answer can be triggered for multiple reasons that the higher-level code doesn't need to know about: it could come from a user switch, from the program detecting that it is not interactive, and so on. It can also be flipped by options such as a debugging switch, which overrides all other preferences. No matter what you decide, that line of code doesn't change, because the statement only cares about the answer, not how you got the answer.
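To make that concrete, here is a minimal hand-rolled sketch of such a configuration class - the class name, the fields and the way the answer is composed are all invented for illustration (Mastering Perl covers richer approaches):
package App::Config;
use strict;
use warnings;

sub new {
    my ( $class, %args ) = @_;
    return bless { silent => 0, debug => 0, %args }, $class;
}

# Higher-level code only ever asks the question; how the answer is
# composed can change without touching the callers.
sub be_silent {
    my $self = shift;
    return 0 if $self->{debug};     # a debugging switch overrides other preferences
    return 1 if $self->{silent};    # explicit user switch
    return 1 unless -t STDOUT;      # the program detected it is not interactive
    return 0;
}

1;
Elsewhere: my $config = App::Config->new( silent => 1 ); print "Hello!" unless $config->be_silent;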