How to look for a node by id in the Perl Graph.pm module? - perl

I'm trying to use the Graph.pm module but all the examples I saw are keeping simple basic scalars as nodes. I'm trying to keep instances of three different classes as nodes in that graph. Consider:
sub new {
my ($class,$node_name) = #_;
my $self = {
"name" => $node_name,
# More fields
};
bless $self, $class;
return $self;
}
Now I can do something like:
$g->add_vertex(new DNode("/"));
where DNode is the constructor of one of the three classes. But how can I can look for that node? For example, I have:
$g->add_vertex(new DNode("/"));
$g->add_vertex(new DNode("/a"));
$g->add_vertex(new DNode("/a/b"));
my $node = $g->get_vertex("/a/b");
There is no get_vertex. I thought that add_vertex_by_id can be helpful here:
$g->add_vertex_by_id(new DNode("/"),"/");
$g->add_vertex_by_id(new DNode("/a"),"/a");
$g->add_vertex_by_id(new DNode("/a/b"),"/a/b");
But there is no get_vertex_by_id method. How can I do the lookup?

It's not 100% clear what you're really trying to achieve. I can try to explain the things you did ask about (I'm the module's current maintainer).
The Graph module is about things, and connections between things. The usual approach, which I think will work for you here, is to identify "things" (vertices) by string names, such as "/" or "/a", and set an attribute (e.g. object) as the Perl object. You can instead have the vertices identified by actual Perl objects, using refvertexed.
Once you've added a vertex, like this:
$g = Graph->new;
$g->set_vertex_attribute('/a', object => $node_class->new('/a')); # no need to separately add vertex
You can look it up:
$bool = $g->has_vertex('/a');
$obj = $g->get_vertex_attribute('/a', 'object'); # safely returns undef if no such vertex, or attribute not set
The by_id methods are part of Multiedges, Multivertices, Multigraphs, where you'd have several (or "multi") "aspects" of the same vertex, each identified by an ID. I don't think that's what you want here.
If this doesn't solve your problem, you'd need to explain your problem more/better :-)

Related

Perl dereferencing a subroutine

I have come across code with the following syntax:
$a -> mysub($b);
And after looking into it I am still struggling to figure out what it means. Any help would be greatly appreciated, thanks!
What you have encountered is object oriented perl.
it's documented in perlobj. The principle is fairly simple though - an object is a sort of super-hash, which as well as data, also includes built in code.
The advantage of this, is that your data structure 'knows what to do' with it's contents. At a basic level, that's just validate data - so you can make a hash that rejects "incorrect" input.
But it allows you to do considerably more complicated things. The real point of it is encapsulation, such that I can write a module, and you can make use of it without really having to care what's going on inside it - only the mechanisms for driving it.
So a really basic example might look like this:
#!/usr/bin/env perl
use strict;
use warnings;
package MyObject;
#define new object
sub new {
my ($class) = #_;
my $self = {};
$self->{count} = 0;
bless( $self, $class );
return $self;
}
#method within the object
sub mysub {
my ( $self, $new_count ) = #_;
$self->{count} += $new_count;
print "Internal counter: ", $self->{count}, "\n";
}
package main;
#create a new instance of `MyObject`.
my $obj = MyObject->new();
#call the method,
$obj->mysub(10);
$obj->mysub(10);
We define "class" which is a description of how the object 'works'. In this, class, we set up a subroutine called mysub - but because it's a class, we refer to it as a "method" - that is, a subroutine that is specifically tied to an object.
We create a new instance of the object (basically the same as my %newhash) and then call the methods within it. If you create multiple objects, they each hold their own internal state, just the same as it would if you created separate hashes.
Also: Don't use $a and $b as variable names. It's dirty. Both because single var names are wrong, but also because these two in particular are used for sort.
That's a method call. $a is the invocant (a class name or an object), mysub is the method name, and $b is an argument. You should proceed to read perlootut which explains all of this.

Perl OOP attribute manipulation best practice

Assume the following code:
package Thing;
sub new {
my $this=shift;
bless {#_},$this;
}
sub name {
my $this=shift;
if (#_) {
$this->{_name}=shift;
}
return $this->{_name};
}
Now assume we've instantiated an object thusly:
my $o=Thing->new();
$o->name('Harold');
Good enough. We could also instantiate the same thing more quickly with either of the following:
my $o=Thing->new(_name=>'Harold'); # poor form
my $o=Thing->new()->name('Harold');
To be sure, I allowed attributes to be passed in the constructor to allow "friendly" classes to create objects more completely. It could also allow for a clone-type operator with the following code:
my $o=Thing->new(%$otherthing); # will clone attrs if not deeper than 1 level
This is all well and good. I understand the need for hiding attributes behind methods to allow for validation, etc.
$o->name; # returns 'Harold'
$o->name('Fred'); # sets name to 'Fred' and returns 'Fred'
But what this doesn't allow is easy manipulation of the attribute based on itself, such as:
$o->{_name}=~s/old/ry/; # name is now 'Harry', but this "exposes" the attribute
One alternative is to do the following:
# Cumbersome, not syntactically sweet
my $n=$o->name;
$n=~s/old/ry/;
$o->name($n);
Another potential is the following method:
sub Name :lvalue { # note the capital 'N', not the same as name
my $this=shift;
return $this->{_name};
}
Now I can do the following:
$o->Name=~s/old/ry/;
So my question is this... is the above "kosher"? Or is it bad form to expose the attribute that way? I mean, doing that takes away any validation that might be found in the 'name' method. For example, if the 'name' method enforced a capital first letter and lowercase letters thereafter, the 'Name' (capital 'N') bypasses that and forces the user of the class to police herself in the use of it.
So, if the 'Name' lvalue method isn't exactly "kosher" are there any established ways to do such things?
I have considered (but get dizzy considering) things like tied scalars as attributes. To be sure, it may be the way to go.
Also, are there perhaps overloads that may help?
Or should I create replacement methods in the vein of (if it would even work):
sub replace_name {
my $this=shift;
my $repl=shift;
my $new=shift;
$this->{_name}=~s/$repl/$new/;
}
...
$o->replace_name(qr/old/,'ry');
Thanks in advance... and note, I am not very experienced in Perl's brand of OOP, even though I am fairly well-versed in OOP itself.
Additional info:
I guess I could get really creative with my interface... here's an idea I tinkered with, but I guess it shows that there really are no bounds:
sub name {
my $this=shift;
if (#_) {
my $first=shift;
if (ref($first) eq 'Regexp') {
my $second=shift;
$this->{_name}=~s/$first/$second/;
}
else {
$this->{_name}=$first;
}
}
return $this->{_name};
}
Now, I can either set the name attribute with
$o->name('Fred');
or I can manipulate it with
$o->name(qr/old/,'ry'); # name is now Harry
This still doesn't allow stuff like $o->name.=' Jr.'; but that's not too tough to add. Heck, I could allow calllback functions to be passed in, couldn't I?
Your first code example is abolutely fine. This is a standard method to write accessors. Of course this can get ugly when doing a substitution, the best solution might be:
$o->name($o->name =~ s/old/ry/r);
The /r flag returns the result of the substitution. Equivalently:
$o->name(do { (my $t = $o->name) =~ s/old/ry/; $t });
Well yes, this 2nd solution is admittedly ugly. But I am assuming that accessing the fields is a more common operation than setting them.
Depending on your personal style preferences, you could have two different methods for getting and setting, e.g. name and set_name. (I do not think get_ prefixes are a good idea – 4 unneccessary characters).
If substituting parts of the name is a central aspect of your class, then encapsulating this in a special substitute_name method sounds like a good idea. Otherwise this is just unneccessary ballast, and a bad tradeoff for avoiding occasional syntactic pain.
I do not advise you to use lvalue methods, as these are experimental.
I would rather not see (and debug) some “clever” code that returns tied scalars. This would work, but feels a bit too fragile for me to be comfortable with such solutions.
Operator overloading does not help with writing accessors. Especially assignment cannot be overloaded in Perl.
Writing accessors is boring, especially when they do no validation. There are modules that can handle autogeneration for us, e.g. Class::Accessor. This adds generic accessors get and set to your class, plus specific accessors as requested. E.g.
package Thing;
use Class::Accessor 'antlers'; # use the Moose-ish syntax
has name => (is => 'rw'); # declare a read-write attribute
# new is autogenerated. Achtung: this takes a hashref
Then:
Thing->new({ name => 'Harold'});
# or
Thing->new->name('Harold');
# or any of the other permutations.
If you want a modern object system for Perl, there is a row of compatible implementations. The most feature-rich of these is Moose, and allows you to add validation, type constraints, default values, etc. to your attributes. E.g.
package Thing;
use Moose; # this is now a Moose class
has first_name => (
is => 'rw',
isa => 'Str',
required => 1, # must be given in constructor
trigger => \&_update_name, # run this sub after attribute is set
);
has last_name => (
is => 'rw',
isa => 'Str',
required => 1, # must be given in constructor
trigger => \&_update_name,
);
has name => (
is => 'ro', # readonly
writer => '_set_name', # but private setter
);
sub _update_name {
my $self = shift;
$self->_set_name(join ' ', $self->first_name, $self->last_name);
}
# accessors are normal Moose methods, which we can modify
before first_name => sub {
my $self = shift;
if (#_ and $_[0] !~ /^\pU/) {
Carp::croak "First name must begin with uppercase letter";
}
};
The purpose of class interface is to prevent users from directly manipulating your data. What you want to do is cool, but not a good idea.
In fact, I design my classes, so even the class itself doesn't know it's own structure:
package Thingy;
sub new {
my $class = shift;
my $name = shift;
my $self = {};
bless, $self, $class;
$self->name($name);
return $self;
}
sub name {
my $self = shift;
my $name = shift;
my $attribute = "GLUNKENSPEC";
if ( defined $name ) {
$self->{$attribute} = $name;
}
return $self->{$attribute};
}
You can see by my new constructor that I could pass it a name for my Thingy. However, my constructor doesn't know how I store my name. Instead, it merely uses my name method to set the name. As you can see by my name method, it stores the name in an unusual way, but my constructor doesn't need to know or care.
If you want to manipulate the name, you have to work at it (as you showed):
my $name = $thingy->name;
$name =~ s/old/ry/;
$thingy->name( $name );
In fact, a lot of Perl developers use inside out classes just to prevent this direct object manipulation.
What if you want to be able to directly manipulate a class by passing in a regular expression? You have to write a method to do this:
sub mod_name {
my $self = shift;
my $pattern = shift;
my $replacement = shift;
if ( not defined $replacement ) {
croak qq(Some basic error checking: Need pattern and replacement string);
}
my $name = $self->name; # Using my name method for my class
if ( not defined $name ) {
croak qq(Cannot modify name: Name is not yet set.);
}
$name = s/$pattern/$replacement/;
return $self->name($name);
}
Now, the developer can do this:
my $thingy->new( "Harold" );
$thingy->mod_name( "old", "new" );
say $thingy->name; # Says "Harry"
Whatever time or effort you save by allowing for direct object manipulation is offset by the magnitude of extra effort it will take to maintain your program. Most methods don't take more than a few minutes to create. If I suddenly got an hankering to manipulate my object in a new and surprising way, it's easy enough to create a new method to do this.
1. No. I don't actually use random nonsense words to protect my class. This is purely for demo purposes to show that even my constructor doesn't have to know how methods actually store their data.
I understand the need for hiding attributes behind methods to allow for validation, etc.
Validation is not the only reason, although it is the only one you refer to. I mention this because another is that encapsulation like this leaves the implementation open. For example, if you have a class which needs to have a string "name" which can be get and set, you could just expose a member, name. However, if you instead use get()/set() subroutines, how "name" is stored and represented internally doesn't matter.
That can be very significant if you write bunches of code with uses the class and then suddenly realize that although the user may be accessing "name" as a string, it would be much better stored some other way (for whatever reason). If the user was accessing the string directly, as a member field, you now either have to compensate for this by including code that will change name when the real whatever is changed and...but wait, how can you then compensate for the client code that changed name...
You can't. You're stuck. You now have to go back and change all the code that uses the class -- if you can. I'm sure anyone who has done enough OOP has run into this situation in one form or another.
No doubt you've read all this before, but I'm bringing it up again because there are a few points (perhaps I've misunderstood you) where you seem to outline strategies for changing "name" based on your knowledge of the implementation, and not what was intended to be the API. That is very tempting in perl because of course there is no access control -- everything is essential public -- but it is still a very very bad practice for the reason just described.
That doesn't mean, of course, that you can't simply commit to exposing "name" as a string. That's a decision and it won't be the same in all cases. However, in this particular case, if what you are particularly concerned with is a simple way to transform "name", IMO you might as well stick with a get/set method. This:
# Cumbersome, not syntactically sweet
Maybe true (although someone else might say it is simple and straightforward), but your primary concern should not be syntactic sweetness, and neither should speed of execution. They can be concerns, but your primary concern has to be design, because no matter how sweet and fast your stuff is, if it is badly designed, it will all come down around you in time.
Remember, "Premature optimization is the root of all evil" (Knuth).
So my question is this... is the above "kosher"? Or is it bad form to expose the attribute that way?
It boils down to: Will this continue to work if the internals change? If the answer is yes, you can do many other things including but not limited to validation.)
The answer is yes. This can be done by having the method return a magical value.
{
package Lvalue;
sub TIESCALAR { my $class = shift; bless({ #_ }, $class) }
sub FETCH { my $self = shift; my $m = $self->{getter}; $self->{obj}->$m(#_) }
sub STORE { my $self = shift; my $m = $self->{setter}; $self->{obj}->$m(#_) }
}
sub new { my $class = shift; bless({}, $class) }
sub get_name {
my ($self) = #_;
return $self->{_name};
}
sub set_name {
my ($self, $val) = #_;
die "Invalid name" if !length($val);
$self->{_name} = $val;
}
sub name :lvalue {
my ($self) = #_;
tie my $rv, 'Lvalue', obj=>$self, getter=>'get_name', setter=>'set_name';
return $rv;
}
my $o = __PACKAGE__->new();
$o->name = 'abc';
print $o->name, "\n"; # abc
$o->name = ''; # Invalid name

Patching a class to count number of created objects

I have a Perl library that uses many objects of some (about 3 or 4) classes during its operation.
In testing code, I'd like to ensure it isn't too many (I'm not talking about memory leaks, I know how to check that). To this end, I thought I could count every object used and check the maximum used during the run on the test data. Then, I would compare the obtained number to some guess about how many objects the library should use.
However, I've got problems implementing this. I thought about two possible ways:
intercept Package::new and Package::DESTROY. However, this has a little catch that in that package, new doesn't always return a new object. Sometimes, it uses a preexisting one (the objects are used as immutables, so it shouldn't matter much). So I'd have to track each individual object to see if it existed before.
intercept Package::bless and Package::DESTROY. This should work, but seems a little unorthodox.
The question is, which of those ways is more likely to succeed (maybe what is commonly used in similar situations), and secondly, how would I even implement the second way (would I have to override Package::bless for all packages in question or only base classes etc.).
Try something like this (not tested):
my $Package_objects = {};
my $override_new = *Package::new{CODE};
*Package::new = sub {
my $self = $override_new->(#_);
# Interpolate $self as string to get "HASH(0x12345)"; save package name
$Package_objects->{ "$self" } = 'Package';
return $self;
};
my $override_dest = *Package::DESTROY{CODE};
*Package::DESTROY = sub {
delete $Package_objects->{ "$_[0]" };
$override_dest->(#_);
};
Probably this is most barbaric method, but must work without 3rd party modules ;)
With regard to how to intercept bless (not Package::bless, bless is a builtin, not some kind of method), most builtins are overridable (see http://perldoc.perl.org/perlsub.html#Overriding-Built-in-Functions). The replacement bless function would perform your tracking (if blessing an object into your target class) and then call CORE::bless to actually perform the bless.
Store a hash of seen object ids to make sure you count each object only once. You can do that using Hash::Util::FieldHash or Object::ID.
An idhash has the advantage that it will not artificially keep the object alive. As each object is destroyed its entry will be removed from the idhash. It also has the nice advantage of working across threads.
package Foo;
use strict;
use warnings;
use v5.10;
use Hash::Util::FieldHash qw(idhash register id);
idhash my %objects;
sub new {
my $self = bless {}, shift;
register $self, \%objects;
$objects{$self} = 1;
say "Creating ".id $self;
my $num_objects = keys %objects;
say "There are now $num_objects alive.";
return $self;
}
sub DESTROY {
my $self = shift;
my $num_objects = keys(%objects) - 1;
say "Destroying ".id $self;
say "There are $num_objects left alive.";
}
{
my $obj1 = Foo->new; # 1 object
my $obj2 = Foo->new; # 2 objects
{
my $obj3 = Foo->new; # 3 objects
} # 2 objects
my $obj4 = Foo->new; # 3 objects
} # 0 objects
__END__
Creating 4303384168
There are now 1 alive.
Creating 4303542768
There are now 2 alive.
Creating 4303545192
There are now 3 alive.
Destroying 4303545192
There are 2 left alive.
Creating 4303638136
There are now 3 alive.
Destroying 4303542768
There are 2 left alive.
Destroying 4303384168
There are 1 left alive.
Destroying 4303638136
There are 0 left alive.
Alternatively, since every object created will be destroyed, only count when an object is destroyed.
Have a look at the techniques used in
Devel-Leak-Object-1.01
I've used ADAMK's code as the basis for collecting various kinds of object creation/destroying statistics.

Catalyst dispatcher for arbitrary tree-structure

Greetings,
I'm new to Catalyst and I am attempting to implement some dispatch logic.
My database has a table of items, each with a unique url_part field, and every item has a parent in the same table, making a tree structure. If baz is a child of bar which is a child of foo which is a child of the root, I want the URL /foo/bar/baz to map to this object. The tree can be any depth, and users will need to be able to access any node whether branch or leaf.
I have been looking through the documentation for Chained dispatchers, but I'm not sure if this can do what I want. It seems like each step in a chained dispatcher must have a defined name for the PathPart attribute, but I want my URLs to be determined solely by the database structure.
Is this easy to implement with the existing Catalyst dispatcher, or will I need to write my own dispatch class?
Thanks! :)
ETA:
I figured out that I can use an empty Args attribute to catch an arbitrary number of arguments. The following seems to successfully catch every request under the root:
sub default :Path :Args() {
my ( $self, $c ) = #_;
my $path = $c->request->path;
$c->response->status( 200 );
$c->response->body( "Your path is $path" );
}
From there I can manually parse the path and get what I need, however, I don't know if this is the best way to accomplish what I'm after.
It depends on the structure of your data, which I'm not completely clear on from your question.
If there is a fixed number of levels (or at least a limited range of numbers of levels) with each level corresponding to a specific sort of thing, then Chained can do what you want -- it's valid (and downright common) to have a chained action with :CaptureArgs(1) PathPart('') which will create a /*/ segment in the path -- that is, it gobbles up one segment of the path without requiring any particular fixed string to show up.
If there's not any such thing -- e.g. you're chasing an unlimited number of levels down an arbitrary tree, then a variadic :Args action is probably exactly what you want, and there's nothing dirty in using it. But you don't need to be decoding $c->req->path yourself -- you can get the left-over path segments from $c->req->args, or simply do my ($self, $c, #args) = #_; in your action.
You can write a new DispatchType, but it's just not likely to be worth the payoff.
After playing around with various options, I believe I've arrived at an acceptable solution. Unfortunately, I couldn't get a recursive dispatch going with :Chained (Catalyst complains if you try to chain a handler to itself. That's no fun.)
So I ended up using a single handler with a large CaptureArgs, like this:
sub default : CaptureArgs(10) PathInfo('') {
my ( $self, $c, #args ) = #_;
foreach my $i( 0 .. $#args ) {
my $sub_path = join '/', #args[ 0 .. $i ];
if ( my $ent = $self->_lookup_entity( $c, $sub_path ) ) {
push #{ $c->stash->{ent_chain} }, $ent;
next;
}
$c->detach( 'error_not_found' );
}
my $chain = join "\n", map { $_->entity_id } #{ $c->stash->{ent_chain} };
$c->response->content_type( 'text/plain' );
$c->response->body( $chain );
}
If I do a GET on /foo/bar/baz I get
foo
foo/bar
foo/bar/baz
which is what I want. If any part of the URL doesn't correspond to an object in the DB, I get a 404.
This works fine for my application, which will never have things ten-levels deep, but I wish I could find a more general solution that could support an arbitrary-depth tree.

About using an array of functions in Perl

We are trying to build an API to support commit() and rollback() automatically, so that we don't have to bother with it anymore. By researching, we have found that using eval {} is the way to go.
For eval {} to know what to do, I have thought of giving the API an array of functions, which it can execute with a foreach without the API having to intepret anything. However, this function might be in a different package.
Let me clarify with an example:
sub handler {
use OSA::SQL;
use OSA::ourAPI;
my #functions = ();
push(#functions, OSA::SQL->add_page($date, $stuff, $foo, $bar));
my $API = OSA::ourAPI->connect();
$API->exec_multi(#functions);
}
The question is: Is it possible to execute the functions in #functions inside of OSA::ourAPI, even if ourAPI has no use OSA::SQL. If not, would it be possible if I use an array reference instead of an array, given that the pointer would point to the known function inside of the memory?
Note: This is the basic idea that we want to base the more complex final version on.
You are NOT adding a function pointer to your array. You are adding teh return value of calling the add_page() subroutine. You have 3 solutions to this:
A. You will need to store (in #functions) an array of arrayrefs of the form [\&OSA::SQL::add_page, #argument_values], meaning you pass in an actual reference to a subroutine (called statically); and then exec_multi will do something like (syntax may not be 100% correct as it's 4am here)
sub exec_multi {
my ($class, $funcs)= #_;
foreach my $f (#$funcs) {
my ($func, #args) = #$f;
my $res = &$func(#args);
print "RES:$res\n";
}
}
Just to re-iterate, this will call individual subs in static version (OSA::SQL::add_page), e.g. WITHOUT passing the package name as the first parameter as a class call OSA::SQL->add_page would. If you want the latter, see the next solution.
B. If you want to call your subs in class context (like in your example, in other words with the class name as a first parameter), you can use ysth's suggestion in the comment.
You will need to store (in #functions) an array of arrayrefs of the form [sub { OSA::SQL->add_page(#argument_values) }], meaning you pass in a reference to a subroutine which will in turn call what you need; and then exec_multi will do something like (syntax may not be 100% correct as it's 4am here)
sub exec_multi {
my ($class, $funcs)= #_;
foreach my $f (#$funcs) {
my ($func) = #$f;
my $res = &$func();
print "RES:$res\n";
}
}
C. You will need to store (in #functions) an array of arrayrefs of the form [ "OSA::SQL", "add_page", #argument_values], meaning you pass in a package and function name; and then exec_multi will do something like (syntax may not be 100% correct as it's 4am here)
my ($package, $sub, #args) = #{ $functions[$i] };
no strict 'refs';
$package->$sub(#args);
use strict 'refs';
If I understood your question correctly, then you don't need to worry about whether ourAPI uses OSA::SQL, since your main code imports it already.
However, since - in #1B - you will be passing a list of packages to exec_multi as first elements of each arrayref, you can do "require $package; $package->import();" in exec_multi. But again, it's completely un-necessary if your handler call already required and loaded each of those packages. And to do it right you need to pass in a list of parameters to import() as well. BUT WHYYYYYY? :)