Iterate directories in Perl, getting introspectable objects as result - perl

I'm about to start a script that may have some file lookups and manipulation, so I thought I'd look into some packages that would assist me; mostly, I'd like the results of the iteration (or search) to be returned as objects, which would have (base)name, path, file size, uid, modification time, etc as some sort of properties.
The thing is, I don't do this all that often, and tend to forget APIs; when that happens, I'd rather let the code run on an example directory, and dump all of the properties in an object, so I can remind myself what is available where (obviously, I'd like to "dump", in order to avoid having to code custom printouts). However, I'm aware of the following:
list out all methods of object - perlmonks.org
"Out of the box Perl doesn't do object introspection. Class wrappers like Moose provide introspection as part of their implementation, but Perl's built in object support is much more primitive than that."
Anyways, I looked into:
"Files and Directories Handling in Perl - Perl Beginners' Site" http://perl-begin.org/topics/files-and-directories/
... and started looking into the libraries referred there (also related link: rjbs's rubric: the speed of Perl file finders).
So, for one, File::Find::Object seems to work for me; this snippet:
use Data::Dumper;
#targetDirsToScan = ("./");
use File::Find::Object;
my $tree = File::Find::Object->new({}, #targetDirsToScan);
while (my $robh = $tree->next_obj()) {
#print $robh ."\n"; # prints File::Find::Object::Result=HASH(0xa146a58)}
print Dumper($robh) ."\n";
}
... prints this:
# $VAR1 = bless( {
# 'stat_ret' => [
# 2054,
# 429937,
# 16877,
# 5,
# 1000,
# 1000,
# 0,
# '4096',
# 1405194147,
# 1405194139,
# 1405194139,
# 4096,
# 8
# ],
# 'base' => '.',
# 'is_link' => '',
# 'is_dir' => 1,
# 'path' => '.',
# 'dir_components' => [],
# 'is_file' => ''
# }, 'File::Find::Object::Result' );
# $VAR1 = bless( {
# 'base' => '.',
# 'is_link' => '',
# 'is_dir' => '',
# 'path' => './test.blg',
# 'is_file' => 1,
# 'stat_ret' => [
# 2054,
# 423870,
# 33188,
# 1,
# 1000,
# 1000,
# 0,
# '358',
# 1404972637,
# 1394828707,
# 1394828707,
# 4096,
# 8
# ],
# 'basename' => 'test.blg',
# 'dir_components' => []
... which is mostly what I wanted, except the stat results are an array, and I'd have to know its layout (($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,$atime,$mtime,$ctime,$blksize,$blocks) stat - perldoc.perl.org) to make sense of the printout.
Then I looked into IO::All, which I like because of utf-8 handling (but also, say, socket functionality, which would be useful to me for an unrelated task in the same script); and I was thinking I'd use this package instead. The problem is, I have a very hard time discovering what the available fields in the object returned are; e.g. with this code:
use Data::Dumper;
#targetDirsToScan = ("./");
use IO::All -utf8;
$io = io(#targetDirsToScan);
#contents = $io->all(0);
for my $contentry ( #contents ) {
#print Dumper($contentry) ."\n";
# $VAR1 = bless( \*Symbol::GEN298, 'IO::All::File' );
# $VAR1 = bless( \*Symbol::GEN307, 'IO::All::Dir' ); ...
#print $contentry->uid . " -/- " . $contentry->mtime . "\n";
# https://stackoverflow.com/q/24717210/printing-ret-of-ioall-w-datadumper
print Dumper \%{*$contentry}; # doesn't list uid
}
... I get a printout like this:
# $VAR1 = {
# '_utf8' => 1,
# 'constructor' => sub { "DUMMY" },
# 'is_open' => 0,
# 'io_handle' => undef,
# 'name' => './test.blg',
# '_encoding' => 'utf8',
# 'package' => 'IO::All'
# };
# $VAR1 = {
# '_utf8' => 1,
# 'constructor' => sub { "DUMMY" },
# 'mode' => undef,
# 'name' => './testdir',
# 'package' => 'IO::All',
# 'is_absolute' => 0,
# 'io_handle' => undef,
# 'is_open' => 0,
# '_assert' => 0,
# '_encoding' => 'utf8'
... which clearly doesn't show attributes like mtime, etc. - even if they exist (which you can see if you uncomment the respective print line).
I've also tried Data::Printer's (How can I perform introspection in Perl?) p() function - it prints exactly the same fields as Dumper. I also tried to use print Dumper \%{ref ($contentry) . "::"}; (list out all methods of object - perlmonks.org), and this prints stuff like:
'O_SEQUENTIAL' => *IO::All::File::O_SEQUENTIAL,
'mtime' => *IO::All::File::mtime,
'DESTROY' => *IO::All::File::DESTROY,
...
'deep' => *IO::All::Dir::deep,
'uid' => *IO::All::Dir::uid,
'name' => *IO::All::Dir::name,
...
... but only if you use the print $contentry->uid ... line beforehand; else they are not listed! I guess that relates to this:
introspection - How do I list available methods on a given object or package in Perl? #911294
In general, you can't do this with a dynamic language like Perl. The package might define some methods that you can find, but it can also make up methods on the fly that don't have definitions until you use them. Additionally, even calling a method (that works) might not define it. That's the sort of things that make dynamic languages nice. :)
Still, that prints the name and type of the field - I'd want the name and value of the field instead.
So, I guess my main question is - how can I dump an IO::All result, so that all fields (including stat ones) are printed out with their names and values (as is mostly the case with File::Find::Object)?
(I noticed the IO::All results can be of type, say, IO::All::File, but its docs defer to "See IO::All", which doesn't discuss IO::All::File explicitly much at all. I thought, if I could "cast" \%{*$contentry} to a IO::All::File, maybe then mtime etc fields will be printed - but is such a "cast" possible at all?)
If that is problematic, are there other packages, that would allow introspective printout of directory iteration results - but with named fields for individual stat properties?

Perl does introspection in the fact that an object will tell you what type of object it is.
if ( $object->isa("Foo::Bar") ) {
say "Object is of a class of Foo::Bar, or is a subclass of Foo::Bar.";
}
if ( ref $object eq "Foo::Bar" ) {
say "Object is of the class Foo::Bar.";
}
else {
say "Object isn't a Foo::Bar object, but may be a subclass of Foo::Bar";
}
You can also see if an object can do something:
if ( $object->can("quack") ) {
say "Object looks like a duck!";
}
What Perl can't do directly is give you a list of all the methods that a particular object can do.
You might be able to munge some way.Perl objects are stored in package namespaces which are in the symbol table. Classes are implemented via Perl subroutines. It may be possible to go through the package namespace and then find all the subroutines.
However, I can see several issues. First private methods (the ones you're not suppose to use) and non-method subroutines would also be included. There's no way to know which is which. Also, parent methods won't be listed.
Many languages can generate such a list of methods for their objects (I believe both Python and Ruby can), but these usually give you a list without an explanation what these do. For example, File::Find::Object::Result (which is returned by the next_obj method of File::Find::Object) has a base method. What does it do? Maybe it's like basename and gives me the name of the file. Nope, it's like dirname and gives me the name of the directory.
Again, some languages could give a list of those methods for an object and a description. However, those descriptions depend upon the programmer to maintain and make sure they're correct. No guaranteed of that.
Perl doesn't have introspection, but all Perl modules stored in CPAN must be documented via POD embedded documentation, and this is printable from the command line:
$ perldoc File::Find::Object
This is the documentation you see in CPAN pages, in http://Perldoc.perl.org and in ActiveState's Perl documentation.
It's not bad. It's not true introspection, but the documentation is usually pretty good. After all, if the documentation stunk, I probably wouldn't have installed that module in the first place. I use perldoc all the time. I can barely remember my kids' names let alone the way to use Perl classes that I haven't used in a few months, but I find that using perldoc works pretty wall.
What you should not do is use Data::Dumper to dump out objects and try to figure out what they contain and possible methods. Some cleaver programmers are using Inside-Out Objects to thwart peeking toms.
So no, Perl doesn't list methods of a particular class like some languages can, but perldoc comes pretty close to doing what you need. I haven't use File::Find::Object in a long while, but going over the perldoc, I probably could write up such a program without much difficulty.

As I answered to your previous question, it is not a good idea to go relying on the guts of objects in Perl. Instead just call methods.
If IO::All doesn't offer a method that gives you the information that you need, you might be able to write your own method for it that assembles that information using just the documented methods provided by IO::All...
use IO::All;
# Define a new method for IO::All::Base to use, but
# define it in a lexical variable!
#
my $dump_info = sub {
use Data::Dumper ();
my $self = shift;
local $Data::Dumper::Terse = 1;
local $Data::Dumper::Sortkeys = 1;
return Data::Dumper::Dumper {
name => $self->name,
mtime => $self->mtime,
mode => $self->mode,
ctime => $self->ctime,
};
};
$io = io('/tmp');
for my $file ( $io->all(0) ) {
print $file->$dump_info();
}

Ok, this is more-less as an exercise (and reminder for me); below is some code, where I've tried to define a class (File::Find::Object::StatObj) with accessor fields for all of the stat fields. Then, I have the hack for IO::All::File from Replacing a class in Perl ("overriding"/"extending" a class with same name)?, where a mtimef field is added which corresponds to mtime, just as a reminder.
Then, just to see what sort of interface I could have between the two libraries, I have IO::All doing the iterating; and the current file path is passed to File::Find::Object, from which we obtain a File::Find::Object::Result - which has been "hacked" to also show the File::Find::Object::StatObj; but that one is only generated after a call to the hacked Result's full_components (that might as well have been a separate function). Notice that in this case, you won't get full_components/dir_components of File::Find::Object::Result -- because apparently it is not File::Find::Object doing the traversal here, but IO::All. Anyways, the result is something like this:
# $VAR1 = {
# '_utf8' => 1,
# 'mtimef' => 1403956165,
# 'constructor' => sub { "DUMMY" },
# 'is_open' => 0,
# 'io_handle' => undef,
# 'name' => 'img/test.png',
# '_encoding' => 'utf8',
# 'package' => 'IO::All'
# };
# img/test.png
# > - $VAR1 = bless( {
# 'base' => 'img/test.png',
# 'is_link' => '',
# 'is_dir' => '',
# 'path' => 'img/test.png',
# 'is_file' => 1,
# 'stat_ret' => [
# 2054,
# 426287,
# 33188,
# 1,
# 1000,
# 1000,
# 0,
# '37242',
# 1405023944,
# 1403956165,
# 1403956165,
# 4096,
# 80
# ],
# 'basename' => undef,
# 'stat_obj' => bless( {
# 'blksize' => 4096,
# 'ctime' => 1403956165,
# 'rdev' => 0,
# 'blocks' => 80,
# 'uid' => 1000,
# 'dev' => 2054,
# 'mtime' => 1403956165,
# 'mode' => 33188,
# 'size' => '37242',
# 'nlink' => 1,
# 'atime' => 1405023944,
# 'ino' => 426287,
# 'gid' => 1000
# }, 'File::Find::Object::StatObj' ),
# 'dir_components' => []
# }, 'File::Find::Object::Result' );
I'm not sure how correct this would be, but what I like about this is that I could forget where the fields are; then I could rerun the dumper, and see that I could get mtime via (*::Result)->stat_obj->size - and that seems to work (here I'd need just to read these, not to set them).
Anyways, here is the code:
use Data::Dumper;
my #targetDirsToScan = ("./");
use IO::All -utf8 ; # Turn on utf8 for all io
# try to "replace" the IO::All::File class
{ # https://stackoverflow.com/a/24726797/277826
package IO::All::File;
use IO::All::File; # -base; # just do not use `-base` here?!
# hacks work if directly in /usr/local/share/perl/5.10.1/IO/All/File.pm
# NB: field is a sub in /usr/local/share/perl/5.10.1/IO/All/Base.pm
field mtimef => undef; # hack
sub file {
my $self = shift;
bless $self, __PACKAGE__;
$self->name(shift) if #_;
$self->mtimef($self->mtime); # hack
#print("!! *haxx0rz'd* file() reporting in\n");
return $self->_init;
}
1;
}
use File::Find::Object;
# based on /usr/local/share/perl/5.10.1/File/Find/Object/Result.pm;
# but inst. from /usr/local/share/perl/5.10.1/File/Find/Object.pm
{
package File::Find::Object::StatObj;
use integer;
use Tie::IxHash;
#use Data::Dumper;
sub ordered_hash { # https://stackoverflow.com/a/3001400/277826
#my (#ar) = #_; #print("# ". join(",",#ar) . "\n");
tie my %hash => 'Tie::IxHash';
%hash = #_; #print Dumper(\%hash);
\%hash
}
my $fields = ordered_hash(
# from http://perldoc.perl.org/functions/stat.html
(map { $_ => $_ } (qw(
dev ino mode nlink uid gid rdev size
atime mtime ctime blksize blocks
)))
); #print Dumper(\%{$fields});
use Class::XSAccessor
#accessors => %{$fields}, # cannot - is seemingly late
# ordered_hash gets accepted, but doesn't matter in final dump;
#accessors => { (map { $_ => $_ } (qw(
accessors => ordered_hash( (map { $_ => $_ } (qw(
dev ino mode nlink uid gid rdev size
atime mtime ctime blksize blocks
))) ),
#))) },
;
use Fcntl qw(:mode);
sub new
{
#my $self = shift;
my $class = shift;
my #stat_arr = #_; # the rest
my $ic = 0;
my $self = {};
bless $self, $class;
for my $k (keys %{$fields}) {
$fld = $fields->{$k};
#print "$ic '$k' '$fld' ".join(", ",$stat_arr[$ic])." ; ";
$self->$fld($stat_arr[$ic]);
$ic++;
}
#print "\n";
return $self;
}
1;
}
# try to "replace" the File::Find::Object::Result
{
package File::Find::Object::Result;
use File::Find::Object::Result;
#use File::Find::Object::StatObj; # no, has no file!
use Class::XSAccessor replace => 1,
accessors => {
(map { $_ => $_ } (qw(
base
basename
is_dir
is_file
is_link
path
dir_components
stat_ret
stat_obj
)))
}
;
#use Fcntl qw(:mode);
#sub new # never gets called
sub full_components
{
my $self = shift; #print("NEWCOMP\n");
my $sobj = File::Find::Object::StatObj->new(#{$self->stat_ret()});
$self->stat_obj($sobj); # add stat_obj and its fields
return
[
#{$self->dir_components()},
($self->is_dir() ? () : $self->basename()),
];
}
1;
}
# main script start
my $io = io($targetDirsToScan[0]);
my #contents = $io->all(0); # Get all contents of dir
for my $contentry ( #contents ) {
print Dumper \%{*$contentry};
print $contentry->name . "\n"; # img/test.png
# get a File::Find::Object::Result - must instantiate
# a File::Find::Object; just item_obj() will return undef
# right after instantiation, so must give it "next";
# no instantition occurs for $tro, though!
#my $tffor = File::Find::Object->new({}, ($contentry->name))->next_obj();
my $tffo = File::Find::Object->new({}, ("./".$contentry->name));
my $tffos = $tffo->next(); # just a string!
$tffo->_calc_current_item_obj(); # unfortunately, this will not calculate dir_components ...
my $tffor = $tffo->item_obj();
# ->full_components doesn't call new, either!
# must call full_compoments, to generate the fields
# (assign to unused variable triggers it fine)
# however, $arrref_fullcomp will be empty, because
# File::Find::Object seemingly calcs dir_components only
# if it is traversing a tree...
$arrref_fullcomp = $tffor->full_components;
#print("# ".$tffor->stat_obj->size."\n"); # seems to work
print "> ". join(", ", #$arrref_fullcomp) ." - ". Dumper($tffor);
}

Related

Perl Moose accessors generated on the fly

See the following fragment of Perl code which is based on Moose:
$BusinessClass->meta->add_attribute($Key => { is => $rorw,
isa => $MooseType,
lazy => 0,
required => 0,
reader => sub { $_[0]->ORM->{$Key} },
writer => sub { $_[0]->ORM->newVal($Key, $_[1]) },
predicate => "has_$Key",
});
I receive the error:
bad accessor/reader/writer/predicate/clearer format, must be a HASH ref at /usr/local/lib/perl5/site_perl/mach/5.20/Class/MOP/Class.pm line 899
The reason of the error is clear: reader and writer must be string names of functions.
But what to do it in this specific case? I do not want to create a new function for each of a hundred ORM fields (ORM attribute here is a tied hash). So I can't pass a string here, I need a closure.
Thus my coding needs resulted in a contradiction. I don't know what to do.
The above was a fragment of real code. Now I present a minimal example:
#!/usr/bin/perl
my #Fields = qw( af sdaf gdsg ewwq fsf ); # pretend that we have 100 fields
# Imagine that this is a tied hash with 100 fields
my %Data = map { $_ => rand } #Fields;
package Test;
use Moose;
foreach my $Key (#Fields) {
__PACKAGE__->meta->add_attribute($Key => { is => 'rw',
isa => 'Str',
lazy => 0,
required => 0,
reader => sub { $Data{$Key} },
writer => sub { $Data{$Key} = $_[1] },
});
}
Running it results in:
$ ./test.pl
bad accessor/reader/writer/predicate/clearer format, must be a HASH ref at /usr/lib/i386-linux-gnu/perl5/5.22/Class/MOP/Class.pm line 899
Class::MOP::Class::try {...} at /usr/share/perl5/Try/Tiny.pm line 92
eval {...} at /usr/share/perl5/Try/Tiny.pm line 83
Try::Tiny::try('CODE(0x9dc6cec)', 'Try::Tiny::Catch=REF(0x9ea0c60)') called at /usr/lib/i386-linux-gnu/perl5/5.22/Class/MOP/Class.pm line 904
Class::MOP::Class::_post_add_attribute('Moose::Meta::Class=HASH(0x9dc13f4)', 'Moose::Meta::Attribute=HASH(0x9dc6b5c)') called at /usr/lib/i386-linux-gnu/perl5/5.22/Class/MOP/Mixin/HasAttributes.pm line 39
Class::MOP::Mixin::HasAttributes::add_attribute('Moose::Meta::Class=HASH(0x9dc13f4)', 'Moose::Meta::Attribute=HASH(0x9dc6b5c)') called at /usr/lib/i386-linux-gnu/perl5/5.22/Moose/Meta/Class.pm line 572
Moose::Meta::Class::add_attribute('Moose::Meta::Class=HASH(0x9dc13f4)', 'af', 'HASH(0x9ea13a4)') called at test.pl line 18
I don't know what to do (how to create "dynamic" (closure-like) accessors, without writing an individual function for each of the 100 fields?)
I think changing the reader and writer methods like that requires an unhealthy level of insanity. If you want to, take a look at the source code of Class::MOP::Method::Accessor, which is used under the hood to create the accessors.
Instead, I suggest to just overwrite (or attach) the functionality to the Moose-generated readers using an around method modifier. To get that to work with sub-classes, you can use Class::Method::Modifiers instead of the Moose around.
package Foo::Subclass;
use Moose;
extends 'Foo';
package Foo;
use Moose;
package main;
require Class::Method::Modifiers; # no import because it would overwrite Moose
my #Fields = qw( af sdaf gdsg ewwq fsf ); # pretend that we have 100 fields
# Imagine that this is a tied hash with 100 fields
my %Data = map { $_ => rand } #Fields;
my $class = 'Foo::Subclass';
foreach my $Key (#Fields) {
$class->meta->add_attribute(
$Key => {
is => 'rw',
isa => 'Str',
lazy => 0,
required => 0,
}
);
Class::Method::Modifiers::around( "${class}::$Key", sub {
my $orig = shift;
my $self = shift;
$self->$orig(#_); # just so Moose is up to speed
# writer
$Data{$Key} = $_[0] if #_;
return $Data{$Key};
});
}
And then run a test.
package main;
use Data::Printer;
use v5.10;
my $foo = Test->new;
say $foo->sdaf;
$foo->sdaf('foobar');
say $foo->sdaf;
p %Data;
p $foo;
Here's the STDOUT/STDERR from my machine.
{
af 0.972962507120432,
ewwq 0.959195914302605,
fsf 0.719139421719849,
gdsg 0.140205658312095,
sdaf "foobar"
}
Foo::Subclass {
Parents Foo
Linear #ISA Foo::Subclass, Foo, Moose::Object
public methods (6) : af, ewwq, fsf, gdsg, meta, sdaf
private methods (0)
internals: {
sdaf "foobar"
}
}
0.885114977459551
foobar
As you can see, Moose doesn't really know about the values inside of the hash, but if you use the accessors, it will read and write them. The Moose object will slowly fill up with new values when you use the writer, but otherwise the values inside of the Moose object do not really matter.

Unblessing Perl objects and constructing the TO_JSON method for convert_blessed

In this answer I found a recommendation for a simple TO_JSON method, which is needed for serializing blessed objects to JSON.
sub TO_JSON { return { %{ shift() } }; }
Could anybody please explain in detail how it works?
I changed it to:
sub TO_JSON {
my $self = shift; # the object itself – blessed ref
print STDERR Dumper $self;
my %h = %{ $self }; # Somehow unblesses $self. WHY???
print STDERR Dumper \%h; # same as $self, only unblessed
return { %h }; # Returns a hashref that contains a hash.
#return \%h; # Why not this? Works too…
}
Many questions… :( Simply, I’m unable to understand 3-liner Perl code. ;(
I need the TO_JSON but it will filter out:
unwanted attributes and
unset attributes too (e.g. for those the has_${attr} predicate returns false)
This is my code – it works but I really don't understand why the unblessing works…
use 5.010;
use warnings;
use Data::Dumper;
package Some;
use Moo;
has $_ => ( is => 'rw', predicate => 1,) for (qw(a1 a2 nn xx));
sub TO_JSON {
my $self = shift;
my $href;
$href->{$_} = $self->$_ for( grep {!/xx/} keys %$self );
# Same mysterious unblessing. The `keys` automagically filters out
# “unset” attributes without the need of call of the has_${attr}
# predicate… WHY?
return $href;
}
package main;
use JSON;
use Data::Dumper;
my #objs = map { Some->new(a1 => "a1-$_", a2 => "a2-$_", xx=>"xx-$_") } (1..2);
my $data = {arr => \#objs};
#say Dumper $data;
say JSON->new->allow_blessed->convert_blessed->utf8->pretty->encode($data);
EDIT: To clarify the questions:
The %{ $hRef } derefences the $hRef (getting the hash pointed to by the reference), but why get a plain hash from a blessed object reference $self?
In other words, why the $self is a hashref?
I tried to make a hash slice like #{$self}{ grep {!/xx/} keys %$self} but it didn't work. Therefore I created that horrible TO_JSON.
If the $self is a hashref, why the keys %$self returns only attributes having a value, and not all declared attributes (e.g. the nn too – see the has)?
sub TO_JSON { return { %{ shift() } }; }
| | |
| | L_ 1. pull first parameter from `#_`
| | (hashref/blessed or not)
| |
| L____ 2. dereference hash (returns key/value list)
|
L______ 3. return hashref assembled out of list
In your TO_JSON() function { %h } returns a shallow hash copy, while \%h returns a reference to %h (no copying).
Perl implemented object orientation by simply making it possible for a reference to know which package it came from (with bless). Knowing that a reference came from the Foo package means that methods are really functions defined in that package.
Perl allows any kind of reference to get blessed; not just hash references. It's very common to bless hash references; a lot of documentation shows doing exactly that; and Moose does it; but, it's possible to bless an array reference, or a subroutine reference, or a filehandle, or a reference to a scalar. The syntax %{$self} only works on hash references (blessed or not). It takes the hash reference, and dereferences it as a hash. The fact that the original reference may have been blessed is lost.
I need the TO_JSON but what will filter out:
unwanted attributes
and unset attributes too (e.g. for those the has_${attr} predicate returns false.
Pre-5.20, hash slices only give you the values and not the keys from the original hash. You want both keys and values.
Assuming you have a hash, and want to filter out undef values and keys not on a whitelist, there are a few options. Here's what I have, using the JSON module:
use strict; # well, I used "use v5.18", but I don't know which version of Perl you're using
use warnings;
use JSON;
my $foo = { foo => undef, bar => 'baz', quux => 5 };
my %whitelist = map { $_, 1 } qw{foo bar};
my %bar = map { $_ => $foo->{$_} }
grep { defined $foo->{$_} && exists $whitelist{$_} }
keys %$foo;
print to_json(\%bar) . "\n";
# well, I used say() instead of print(), but I don't know which version of Perl you're using
The maps and greps aren't necessarily pretty, but it's the simplest way I could think of to filter out keys not on the whitelist and elements without an undef value.
You could use an array slice:
use strict;
use warnings;
use JSON;
my $foo = { foo => undef, bar => 'baz', quux => 5 };
my #whitelist = qw{foo bar};
my %filtered_on_keys;
#filtered_on_keys{#whitelist} = #$foo{#whitelist};
my %bar = map { $_ => $filtered_on_keys{$_} }
grep { defined $filtered_on_keys{$_} }
keys %filtered_on_keys;
print to_json(\%bar) . "\n";
Or if you like loops:
use strict;
use warnings;
use JSON;
my $foo = { foo => undef, bar => 'baz', quux => 5 };
my %whitelist = map { $_ => 1 } qw{foo bar};
my %bar;
while (my ($key, $value) = each %$foo) {
if (defined $value && exists $whitelist{$key}) {
$bar{$key} = $value;
}
}
print to_json(\%bar) . "\n";
It seems like a good time to bring up Larry wall's quote, "Perl is designed to give you several ways to do anything, so consider picking the most readable one."
However, I made a big point that not all objects are hashes. The appropriate way to get data from an object is through its getter functions:
use strict;
use warnings;
use JSON;
my $foo = Foo->new({ foo => undef, bar => 'baz', quux => 5 }); # as an example
my %filtered_on_keys;
#filtered_on_keys{qw{foo bar}} = ($foo->get_foo(), $foo->get_bar());
my %bar = map { $_ => $filtered_on_keys{$_} }
grep { defined $filtered_on_keys{$_} }
keys %filtered_on_keys;
print to_json(\%bar) . "\n";

Turn data structures into perl objects (module recommendation)

I have an arbitrary data structure and I'd like to treat it as an object. I get this as a response from a REST app. Example below. There are some modules on CPAN which promise to do this. Data::Object looks best to me, but it's last updated 2011. Am I missing something? Is there perhaps an easy Moose way to do this? Thanks!
$o=$class->new($response);
$s=$o->success;
#i=$o->items;
{
'success' => bless( do{\(my $o = 1)}, 'JSON::XS::Boolean' ),
'requestNumber' => 5,
'itemsCount' => 1,
'action' => 'search.json',
'totalResults' => 161,
'items' => [
{
'link' => 'http://europeana.eu/api//v2/record/15503/E627F23EF13FA8E6584AF8706A95DB85908413BE.json?wskey=NpXXXX',
'provider' => [
'Kulturpool'
],
'europeanaCollectionName' => [
'15503_Ag_AT_Kulturpool_khm_fs'
],
# more fields omitted
}
],
'apikey' => 'Npxxxx'
};
Although I don't like using it, defining an AUTOLOAD subroutine is a way to create arbitrary classes on the fly. It's been a while since I used it, but it should look something like this:
package Local::Foo;
sub new {
my $class = shift;
my $self = {};
bless $self, $class;
return $self;
}
sub AUTOLOAD {
my $self = shift;
my $value = shift;
our $AUTOLOAD;
(my $method = $AUTOLOAD) = s/.*:://;
if ( defined $value ) {
$self->{$method} = $value;
}
return $self->{$method};
}
This class Local::Foo has an infinite amount of methods. For example, if I said
$foo->bar("fubar");
This would be the same as:
$foo->{bar} = "foobar";
If i called $foo->bar;, it will return the value of $foo->{bar};.
You probably want something to limit your method's style, and their values. For example, with this:
$foo->BAR;
$foo->Bar;
$foo->bar;
are all three valid and completely different methods. You probably want something to make sure your methods match a particular pattern (i.e., they're all lowercase, or the first letter is uppercase and the rest are lowercase. You probably want to make sure they start with a letter so, $foo->23diba; isn't a valid method.
One little problem: Once you define an AUTOLOAD subroutine, you also define DESTROY subroutine too. Perl calls the DESTROY subroutine before an object is destroyed. You need to handle the issue if $AUTOLOAD =~ /.*::DESTROY$/ too. You may need to add:
return if $AUTOLOAD =~ /.*::DESTROY$/;
somewhere in the AUTOLOAD subroutine, so you don't accidentally do something when DESTROY is called. Remember, it's automatically called whenever a class object falls out of scope if one exists, and with AUTOLOAD, you've defined one anyway.
This is an example:
use strict;
package Foo;
#define a simple Foo class with 3 properties
use base qw(Class::Accessor);
Foo->mk_accessors(qw(name role salary));
package main;
#define a perl hash with the same keys
my $hr = {'name'=>'john doe', 'role'=>'admin', 'salary'=>2500 };
#bless the object
my $obj = bless $hr, 'Foo';
print $obj->name, "\n"; #<-- prints: john doe
I am not saying this is necessarily a good idea, the best way to do the idea, or gotcha-free. I never tried it till 15 minutes ago. But it is fun and it is terse, so–
#!/usr/bin/env perl
BEGIN {
package Role::AutoVacca;
use Moo::Role;
use Scalar::Util "blessed";
sub BUILD {
my $self = shift;
for my $attr ( grep /\A[^_]/, keys %{$self} )
{
Method::Generate::Accessor
->generate_method( blessed($self),
$attr,
{ is => "rw" } );
}
}
package Fakey;
use Moo;
with "Role::AutoVacca";
}
my $fake = Fakey->new({
success => bless( do{\(my $o = 1)}, "JSON::XS::Boolean" ),
items => [ { link => "http://europeana.eu/o/haipi",
provider => [ "mememememe" ] } ],
apikey => "3k437" });
print "I CAN HAZ KEE? ", $fake->apikey, $/;
print "IZ GUD? ", $fake->success ? "YAH" : "ONOES", $/;
print "WUT DIZZYING? ", $fake->items, $/;

How to check undefined key in perl OO code?

I have use strict;use warnings; in my perl script;
But this error cannot be found:
sub new {
#....
my $self={};
$self->{databas}="..."; # 'e' is missing
#....
}
sub foo {
my $self=shift;
print $self->{database}; # undef
}
I have spend hours to found out that database in mispelled in sub new.
use strict;use warnings; didnt help.
How can I avoid this error?
Restrict/lock hashes with Hash::Util.
Alternatively, use Moose to describe your classes, making a misspelled attribute a run-time error.
package MyClass;
use Moose;
has 'database' => (isa => 'Str', is => 'rw', default => 'quux');
sub foo {
my ($self) = #_;
$self->database; # returns quux
$self->databas; # Can't locate object method "databas" via package…
use defined, or // operator (if you have perl 5.10/later)
print "not defined" if !defined $a; # check if $a is undef
print $a // 'undefed!'; # print a if availiable, "undefed!" otherwise
See http://perldoc.perl.org/functions/defined.html and http://perldoc.perl.org/perlop.html#C-style-Logical-Defined-Or
Use getters and setters instead of hash keys, or switch to Moose.
Do you think you would have spotted it if you saw the hash dumped out? Like this:
$self = bless( {
'anotherfield' => 'something else',
'databas' => '...',
'afield' => 'something'
}, 'MyClass' );
If you were wondering "How come 'database' isn't set?!?!" and you dumped this out, do you think that would help? "Oh it assigned 'databas' not 'database'!"
Then Data::Dumper is the minimal Perl debugging tool
use Data::Dumper;
...
# Why isn't database assigned?!?!
say Data::Dumper->Dump( [ $self ], [ '$self' ] );
Of course, the most convenient form of Data::Dumper tools is Smart:Comments.
use Smart::Comments;
...
### $self
Which outputs:
### $self: bless( {
### afield => 'something',
### anotherfield => 'something else',
### databas => '...'
### }, 'MyClass' )
It's not as preventative a tool as Moose but it will save hours, though. I think it even helps you learn Perl tricks and practices as you spill out the guts of CPAN objects. When you know the underlying structure, you have something to search for in CPAN modules.
Like I said, it solves the problem of hours tracking down bugs (often enough).
Another approach is to use the core module Class::Struct.
package MyObj;
use Class::Struct;
struct(
databas => '$',
# ...
);
1;
package main;
# create object
my $obj = MyObj->new(databas => 'MyDB');
# later
print $obj->database;
Running this results in the following error:
Can't locate object method "database" via package "MyObj" at ... .

How do I use an array as an object attribute in Perl?

I need some help regarding the arrays in Perl
This is the constructor I have.
BuildPacket.pm
sub new {
my $class = shift;
my $Packet = {
_PacketName => shift,
_Platform => shift,
_Version => shift,
_IncludePath => [#_],
};
bless $Packet, $class;
return $Packet;
}
sub SetPacketName {
my ( $Packet, $PacketName ) = #_;
$Packet->{_PacketName} = $PacketName if defined($PacketName);
return $Packet->{_PacketName};
}
sub SetIncludePath {
my ( $Packet, #IncludePath ) = #_;
$Packet->{_IncludePath} = \#IncludePath;
}
sub GetPacketName {
my( $Packet ) = #_;
return $Packet->{_PacketName};
}
sub GetIncludePath {
my( $Packet ) = #_;
#{ $Packet->{_IncludePath} };
}
(The code has been modified according to the suggestions from 'gbacon', thank you)
I am pushing the relative paths into 'includeobjects' array in a dynamic way. The includepaths are being read from an xml file and are pushed into this array.
# PacketInput.pm
if($element eq 'Include')
{
while( my( $key, $value ) = each( %attrs ))
{
if($key eq 'Path')
push(#includeobjects, $value);
}
}
So, the includeobject will be this way:
#includeobjects = (
"./input/myMockPacketName",
"./input/myPacket/my3/*.txt",
"./input/myPacket/in.html",
);
I am using this line for set include path
$newPacket->SetIncludePath(#includeobjects);
Also in PacketInput.pm, I have
sub CreateStringPath
{
my $packet = shift;
print "printing packet in CreateStringPath".$packet."\n";
my $append = "";
my #arr = #{$packet->GetIncludePath()};
foreach my $inc (#arr)
{
$append = $append + $inc;
print "print append :".$append."\n";
}
}
I have many packets, so I am looping through each packet
# PacketCreation.pl
my #packets = PacketInput::GetPackets();
foreach my $packet (PacketInput::GetPackets())
{
print "printing packet in loop packet".$packet."\n";
PacketInput::CreateStringPath($packet);
$packet->CreateTar($platform, $input);
$packet->GetValidateOutputFile($platform);
}
The get and set methods work fine for PacketName. But since IncludePath is an array, I could not get it to work, I mean the relative paths are not being printed.
If you enable the strict pragma, the code doesn't even compile:
Global symbol "#_IncludePath" requires explicit package name at Packet.pm line 15.
Global symbol "#_IncludePath" requires explicit package name at Packet.pm line 29.
Global symbol "#_IncludePath" requires explicit package name at Packet.pm line 30.
Global symbol "#_IncludePath" requires explicit package name at Packet.pm line 40.
Don't use # unquoted in your keys because it will confuse the parser. I recommend removing them entirely to avoid confusing human readers of your code.
You seem to want to pull all the attribute values from the arguments to the constructor, so continue peeling off the scalar values with shift, and then everything left must be the include path.
I assume that the components of the include path will be simple scalars and not references; if the latter is the case, then you'll want to make deep copies for safety.
sub new {
my $class = shift;
my $Packet = {
_PacketName => shift,
_Platform => shift,
_Version => shift,
_IncludePath => [ #_ ],
};
bless $Packet, $class;
}
Note that there's no need to store the blessed object in a temporary variable and then immediately return it because of the semantics of Perl subs:
If no return is found and if the last statement is an expression, its value is returned.
The methods below will also make use of this feature.
Given the constructor above, GetIncludePath becomes
sub GetIncludePath {
my( $Packet ) = #_;
my #path = #{ $Packet->{_IncludePath} };
wantarray ? #path : \#path;
}
There are a couple of things going on here. First, note that we're careful to return a copy of the include path rather than a direct reference to the internal array. This way, the user can modify the value returned from GetIncludePath without having to worry about mucking up the packet's state.
The wantarray operator allows a sub to determine the context of its call and respond accordingly. In list context, GetIncludePath will return the list of values in the array. Otherwise, it returns a reference to a copy of the array. This way, client code can call it either as in
foreach my $path (#{ $packet->GetIncludePath }) { ... }
or
foreach my $path ($packet->GetIncludePath) { ... }
SetIncludePath is then
sub SetIncludePath {
my ( $Packet, #IncludePath ) = #_;
$Packet->{_IncludePath} = \#IncludePath;
}
Note that you could have used similar code in the constructor rather than removing one parameter at a time with shift.
You might use the class defined above as in
#! /usr/bin/perl
use strict;
use warnings;
use Packet;
sub print_packet {
my($p) = #_;
print $p->GetPacketName, "\n",
map(" - [$_]\n", $p->GetIncludePath),
"\n";
}
my $p = Packet->new("MyName", "platform", "v1.0", qw/ foo bar baz /);
print_packet $p;
my #includeobjects = (
"./input/myMockPacketName",
"./input/myPacket/my3/*.txt",
"./input/myPacket/in.html",
);
$p->SetIncludePath(#includeobjects);
print_packet $p;
print "In scalar context:\n";
foreach my $path (#{ $p->GetIncludePath }) {
print $path, "\n";
}
Output:
MyName
- [foo]
- [bar]
- [baz]
MyName
- [./input/myMockPacketName]
- [./input/myPacket/my3/*.txt]
- [./input/myPacket/in.html]
In scalar context:
./input/myMockPacketName
./input/myPacket/my3/*.txt
./input/myPacket/in.html
Another way to reduce typing is to use Moose.
package Packet;
use Moose::Policy 'Moose::Policy::JavaAccessors';
use Moose;
has 'PacketName' => (
is => 'rw',
isa => 'Str',
required => 1,
);
has 'Platform' => (
is => 'rw',
isa => 'Str',
required => 1,
);
has 'Version' => (
is => 'rw',
isa => 'Int',
required => 1,
);
has 'IncludePath' => (
is => 'ro',
isa => 'ArrayRef[Str]',
default => sub {[]},
traits => [ 'Array' ],
handles => {
getIncludePath => 'elements',
getIncludePathMember => 'get',
setIncludePathMember => 'set',
},
);
__PACKAGE__->meta->make_immutable;
no Moose;
1;
Check out Moose::Manual::Unsweetened for another example of how Moose saves time.
If you are adamant in your desire to learn classical Perl OOP, read the following perldoc articles: perlboot, perltoot, perlfreftut and perldsc.
A great book about classical Perl OO is Damian Conway's Object Oriented Perl. It will give you a sense of the possibilities in Perl's object.
Once you understand #gbacon's answer, you can save some typing by using Class::Accessor::Fast:
#!/usr/bin/perl
package My::Class;
use strict; use warnings;
use base 'Class::Accessor::Fast';
__PACKAGE__->follow_best_practice;
__PACKAGE__->mk_accessors( qw(
IncludePath
PacketName
Platform
Version
));
use overload '""' => 'to_string';
sub to_string {
my $self = shift;
sprintf(
"%s [ %s:%s ]: %s",
$self->get_PacketName,
$self->get_Platform,
$self->get_Version,
join(':', #{ $self->get_IncludePath })
);
}
my $obj = My::Class->new({
PacketName => 'dummy', Platform => 'Linux'
});
$obj->set_IncludePath([ qw( /home/include /opt/include )]);
$obj->set_Version( '1.05b' );
print "$obj\n";