Perl - how to check if a hashref is an object? - perl

My perl code makes a call to an API, passing a reference to an array of string IDs, and expects to get an array of objects back.
my #customers = $api->get_customer_details(\#customer_ids);
for my $customer (#customers) {
# debug
print $customer . "/n";
print ref($customer) . "/n";
print Dumper ($customer);
if (!$customer) {
print "No customer object\n";
# do stuff
last;
}
print "Got the customer object\n";
$info->{customer_objects}{$customer->customer_id} = $customer;
}
99% of the time, when I run this I get customer objects back. However, I'll occasionally get the output:
HASH(0x12345)
HASH
{}
Got the customer object
Can't call method "customer_id" on unblessed reference at ..
I've tried to edit my IF statement to check for an empty hashref, but it will always ignore the IF:
if (!$customer || !%$customer)
When I test this out via command line, the empty hashref works as expected:
$ perl -E 'my $hr = {}; if (!$hr || !%$hr) { say "empty" } else { "nonempty" }'
empty
I'm not understanding what the issue is. It appears that my debug output isn't accurate, and I am actually not getting an empty hashref. Can someone please explain what is going on, and how I might figure out exactly what $customer is and how to ignore this case in my IF statement?

You can check blessed, as you see in the comments to the question.
However, it looks like you might have some cases where you query a particular ID, but there's no record for that ID. Maybe that's not the case, so just ignore this if it's not even close to the cause.
In that case, you might get back something that's not a customer object (because there is no customer). I'd much rather push that complexity down so the application doesn't have to think that hard about it and I catch it sooner.
This work probably happens in get_customer_details.
The trick then is to figure out how to represent the customer that does not exist. One way is to simply have get_customer_details not return anything for a non-existent ID. The returned list simply does not have those entries. Then there's no problem with later method calls. However, you then have silent failures and you can get fewer items than the arguments you supplied.
Another way is to return undef in the list, then look for defined values. The undef positions would correlate to the bad customer ID in the #customer_ids, and you can handle that .
my #customers = $api->get_customer_details(\#customer_ids);
for( my $i = 0; $i < #customers; $i++ ) {
unless( defined $customers[$i] ) {
warn "No customer for $customer_ids[$i]";
next;
}
...
}
But I tend to favor something where there's a parallel object that acts like the null customer. That would be able to tell you that there was no customer, what the ID was, and so on. It might respond to all the normal customer methods, but warn and return undef (or die, or whatever). There might be some method to tell you which objects are real customers, like exists (or some better name):
my #customers = $api->get_customer_details(\#customer_ids);
foreach my $customer ( #customers ) {
unless( $customers->exists ) {
# this would be a Customer::Null or something
# that responds to the same methods
warn "No customer for " . $customer->id";
next;
}
...
}
I think this last approach because as these objects pass through the program, they carry with them the story of their creation. These things might even store the part of the API data structure so you can see what's going on. Maybe there is a customer but whatever turns it into a Customer doesn't understand something about it.

Related

Perl sub returns a subroutine

I haven't used Perl for around 20 years, and this is confusing me. I've g******d for it, but I obviously haven't used a suitable search string because I haven't found anything relating to this...
Why would I want to do the following? I understand what it's doing, but the "why" escapes me. Why not just return 0 or 1 to begin with?
I'm working on some code where a sub uses "return sub"; here's a very truncated example e.g.
sub test1 {
$a = shift #_;
if ($a eq "cat") {
return sub {
print("cat test OK\n");
return 0;
}
}
# default if "cat" wasn't the argument
return sub {
print("test for cat did not work\n");
return 1;
}
}
$c = test1("cat");
print ("received: $c\n");
print ("value is: ",&$c,"\n");
$c = test1("bat");
print ("received: $c\n");
print ("value is: ",&$c,"\n");
In your code there is no reason to return a sub. However, with a little tweak
sub test1 {
my $animal = shift #_;
if ($animal eq "cat" || $animal eq "dog") {
return sub {
print("$animal test OK\n");
return 0;
};
}
# default if "cat" or "dog" wasn't the argument
return sub {
print("test for cat or dog did not work\n");
return 1;
};
}
We now have a closure around $animal this saves memory as the test for cat and dog share the same code. Note that this only works with my variables. Also note that $a and $b are slightly special to Perl, they are used in the block of code that you can pass to the sort function and bypass some of the checks on visibility so it's best to avoid them for anything except sort.
You probably want to search "perl closures".
There are many reasons that you'd want to return a code reference, but it's not something I can shortly answer in a StackOverflow question. Mark Jason Dominus's Higher Order Perl is a good way to expand your mind, and we cover a little of that in Intermediate Perl.
I wrote File::Find::Closures as a way to demonstrate this is class. Each subroutine in that module returns two code references—one for the callback to File::Find and the other as a way to access the results. The two share a common variable which nothing else can access.
Notice in your case, you aren't merely calling a subroutine to "get a zero". It's doing other things. Even in your simple example there's some output. Some behavior is then deferred until you actually use the result for something.
Having said that, we have no chance of understanding why the programmer who wrote your particular code did it. One plausible guess was that the system was set up for more complex situations and you're looking at a trivial example that fits into that. Another plausible guess was that the programmer just learned that feature and fell in love with it then used it everywhere for a year. There's always a reason, but that doesn't mean there's always a good reason.

Evaluating the success/failure of a subroutine

There's something quite unclear to me about subs return value.
I like to test my modules, sub by sub, and check whether they issue the correct return value or the correct exception if the case arise.
For example, let's say I have the following code (X::Argument::BadFormat is an exception handler derived from Exception::Class):
package My::Module;
use strict;
use warnings;
sub new{#does things unrelated to the current question}
sub my_sub {
my ($self,$possible_value) = #_;
if ($possible_value =~ q{\w}) { #Affect value to current object
$self->{field} = $possible_value;
}else{ #throw an exception
X::Argument::BadFormat->throw(
arg => 'possible_value',
expected_format => 'something that looks like a word',
received_value => $possible_value,
);
}
}
In the test file, I will run tests such as:
my $object = My::Module->new();
throws_ok(sub {$object->my_sub('*')}, 'X::Argument::BadFormat', 'Faulty value will raise an exception');
ok($object->my_sub('turlututu'));
It is easy to test when:
the sub returns a value,
the test conditions must raise an exception,
However, when I just set the value of a field in the current object, I have no reason to return anything.
In that case:
is the simple execution of the code enough to evaluate the sub output as "true" ?
Shall I add an explicit "return 1;" ?
does the sub actually return the last evaluation, in this case the sucess of the
test in the "if"? Something else I did not think about but which is
obvious to everybody?
In this case, I'd just check to ensure that the object's attribute was set correctly. That's all this particular sub does. If it's set ok, the sub ended correctly. If it wasn't set, something went wrong before the sub ended.
my $p='blah';
$obj->my_sub($p);
is $obj->{field}, $p, "my_sub() set the field attr ok";
It would be better if the field attribute had a getter so you're not breaking encapsulation, but I digress.
A sub that has no need to return a value should end with
return;
In your case, without it, you will be returning the value of $possible_value, which is the last thing executed. This doesn't look like a useful thing to return.
Assuming you add the explicit return:
Your throws_ok test looks fine. You should then test that the field was correctly set. Your ok test isn't needed, since your sub won't be returning anything.
Perl returns the result of the last executed code by default.
For example:
print main();
sub main {
my $var = 9 * 7;
}
print will output 63. If your code may be affected by the output of a given subroutine, then you need to set a return value (it's generally considered a best practice to always set an explicit return at the end of a subroutine/method).
print main();
sub main {
my $var = 9 * 7;
return;
}
print will output nothing.
Personally, I always try to set a return value depending on the context of what the subroutine will be returning to, but if you're writing code other people will be using, then it's generally safest to just do return;.
An additional explanation from Perl::Critic (link to the specific policy):
Subroutine "main" does not end with "return" at line 8, near 'sub main {'.
Subroutines::RequireFinalReturn (Severity: 4)
Require all subroutines to terminate explicitly with one of the
following: return',carp', croak',die', exec',exit', goto', or
throw'.
Subroutines without explicit return statements at their ends can be
confusing. It can be challenging to deduce what the return value will
be.
Furthermore, if the programmer did not mean for there to be a
significant return value, and omits a return statement, some of the
subroutine's inner data can leak to the outside. Consider this case:
package Password;
# every time the user guesses the password wrong, its value
# is rotated by one character
my $password;
sub set_password {
$password = shift;
}
sub check_password {
my $guess = shift;
if ($guess eq $password) {
unlock_secrets();
} else {
$password = (substr $password, 1).(substr $password, 0, 1);
}
}
1;
In this case, the last statement in check_password() is the assignment.
The result of that assignment is the implicit return value, so a wrong
guess returns the right password! Adding a `return;' at the end of that
subroutine solves the problem.
The only exception allowed is an empty subroutine.
Be careful when fixing problems identified by this Policy; don't blindly
put a `return;' statement at the end of every subroutine.

Perl get parameter datatype

Am trying to make a subroutine that replaces data depending on datatype: the problem is i can't get the datatype of the parameter, i used this:
sub replace {
my ($search, $replacement, $subject) = #_;
if (ref($search) eq "HASH") {
print "r is a reference to a HASH.\n";
}
elsif (ref($search) eq "SCALAR") {
print "r is a reference to a SCALAR.\n";
}
elsif (ref($search) eq "ARRAY") {
print "r is a reference to a ARRAY.\n";
}
}
my $str = "Foo";
my #arr = ("Foo");
replace($str);
replace(#arr);
But none works. am really new to perl
ref() takes a reference to something, not the something itself. Here:
replace($str);
replace(#arr);
...you are sending in the something directly. Send in a reference to the something instead, by putting a \ in front of it (which says, "take a reference to this something"):
replace(\$str);
replace(\#arr);
Output:
r is a reference to a SCALAR.
r is a reference to a ARRAY.
Note also that in your replace() function, in this line:
my ($search, $replacement, $subject) = #_;
You are effectively asking for a scalar value as the thing to search, so passing in a list (array, hash etc) will clobber $replacement and $subject if the passed in list has more than one element, so you may want to do something like this to ensure you're getting the proper params, and nothing is clobbered unexpectedly:
sub replace {
my ($search, $replacement, $subject) = #_;
die "first arg must be a ref\n" if ! ref $search;
Of course, you can do further argument checking, but this'll ensure that the first parameter can only be a reference to something. Instead of die(), you can also just return so the program doesn't crash, or print or warn and then return.
It is not stated what you want to do with it, but here's what is wrong with what you show.
The ref function shows the datatype of the reference subtmitted to it, or it returns an empty string if its argument isn't a reference at all.
So to get the expected behavior you should do
replace(\$str);
replace(\#arr);
Also, you need to add the test to your function
else (not ref $search)
for when a submitted string is not a reference.
For completeness, I should also point out an issue, explained in the answer by stevieb. When you pass an array to a function, it receives it as a flat list of arguments. With your function you clearly do not want replace(#arr). They are assigned to your list of scalar variables in order, one element to each. (As soon as there is an array variable it all goes into it.) See, for example, this post.

Perl Using a hash as a reference is deprecated when used with package

I have a module called News (original name, I know) with a method called get_fields, this method returns all the fields that belong to the module like this
sub get_fields {
my $self = shift;
return $self;
}
Now when I call it like this in a different module where I need to do stuff to the fields
my %fields = %{ $news->get_fields };
I discovered doing it like this prevented this issue
Type of argument to keys on reference must be unblessed hashref or
arrayref
when I iterate other fields like this
foreach my $key ( keys %fields ) {
%pairs->{$key} = %fields->{$key} if %fields->{$key};
}
in order to use the values of the fields, I get this warning
Using a hash as a reference is deprecated
which is pointing back to the foreach loop.
How can I avoid this error message without getting the unbless warning back?
I think you're getting mixed up between objects and hashes. get_fields will return $self - which whilst I can't tell for certain, looks like it'll be returning a blessed object.
Now, blessed objects are quite similar to hashes, but they're not the same. You can test the difference with the ref function.
So the question is more - why are you doing this? Why are you trying to cast an object reference into a hash? Because that's what you're doing with:
my %fields = %{ $news->get_fields };
Because pretty fundamentally - even if that worked, it would be a horrible thing to do. The point, purpose and reason for objects is encapsulation - e.g. things outside the module don't meddle with stuff inside.
So why not instead have get_fields return a list of fields, which you can then iterate on and make method calls? This would really be the 'right' way to do something like this.
sub get_fields {
my ( $self ) = #_;
return keys %$self;
}
Or if you really must, embed a method within your object that returns as hash - rather than an object reference - that you can then manipulate externally.
Generally - you don't refer to hashes with a % prefix, unless you're manipulating the whole hash.
To extract a single element from %pairs you should do:
foreach my $key ( keys %pairs ) {
print $pairs{$key},"\n";
}
If the contents of $pairs{$key} is a reference, then you can use the -> to indicate that you should dereference, e.g. $pairs -> {$key}.

Catalyst dispatcher for arbitrary tree-structure

Greetings,
I'm new to Catalyst and I am attempting to implement some dispatch logic.
My database has a table of items, each with a unique url_part field, and every item has a parent in the same table, making a tree structure. If baz is a child of bar which is a child of foo which is a child of the root, I want the URL /foo/bar/baz to map to this object. The tree can be any depth, and users will need to be able to access any node whether branch or leaf.
I have been looking through the documentation for Chained dispatchers, but I'm not sure if this can do what I want. It seems like each step in a chained dispatcher must have a defined name for the PathPart attribute, but I want my URLs to be determined solely by the database structure.
Is this easy to implement with the existing Catalyst dispatcher, or will I need to write my own dispatch class?
Thanks! :)
ETA:
I figured out that I can use an empty Args attribute to catch an arbitrary number of arguments. The following seems to successfully catch every request under the root:
sub default :Path :Args() {
my ( $self, $c ) = #_;
my $path = $c->request->path;
$c->response->status( 200 );
$c->response->body( "Your path is $path" );
}
From there I can manually parse the path and get what I need, however, I don't know if this is the best way to accomplish what I'm after.
It depends on the structure of your data, which I'm not completely clear on from your question.
If there is a fixed number of levels (or at least a limited range of numbers of levels) with each level corresponding to a specific sort of thing, then Chained can do what you want -- it's valid (and downright common) to have a chained action with :CaptureArgs(1) PathPart('') which will create a /*/ segment in the path -- that is, it gobbles up one segment of the path without requiring any particular fixed string to show up.
If there's not any such thing -- e.g. you're chasing an unlimited number of levels down an arbitrary tree, then a variadic :Args action is probably exactly what you want, and there's nothing dirty in using it. But you don't need to be decoding $c->req->path yourself -- you can get the left-over path segments from $c->req->args, or simply do my ($self, $c, #args) = #_; in your action.
You can write a new DispatchType, but it's just not likely to be worth the payoff.
After playing around with various options, I believe I've arrived at an acceptable solution. Unfortunately, I couldn't get a recursive dispatch going with :Chained (Catalyst complains if you try to chain a handler to itself. That's no fun.)
So I ended up using a single handler with a large CaptureArgs, like this:
sub default : CaptureArgs(10) PathInfo('') {
my ( $self, $c, #args ) = #_;
foreach my $i( 0 .. $#args ) {
my $sub_path = join '/', #args[ 0 .. $i ];
if ( my $ent = $self->_lookup_entity( $c, $sub_path ) ) {
push #{ $c->stash->{ent_chain} }, $ent;
next;
}
$c->detach( 'error_not_found' );
}
my $chain = join "\n", map { $_->entity_id } #{ $c->stash->{ent_chain} };
$c->response->content_type( 'text/plain' );
$c->response->body( $chain );
}
If I do a GET on /foo/bar/baz I get
foo
foo/bar
foo/bar/baz
which is what I want. If any part of the URL doesn't correspond to an object in the DB, I get a 404.
This works fine for my application, which will never have things ten-levels deep, but I wish I could find a more general solution that could support an arbitrary-depth tree.