What does this mean in Perl?

I am converting some Perl code to PHP, and I've stumbled on a construct I can't identify for sure.
if(!$continenttxt_cached{$savedcontinentid.'_'.$savedcountrygroupid})
What does the {} bracket do here? Is this a standard array element accessed this way? Because I am converting only a small part of a rather large codebase, I can't find how $continenttxt_cached was defined, so I can only presume it is an array. Is {} used for something else in Perl?

{} in this context denotes a hash accessor - hashes are key-value pairs.
So you have a hash called %continenttxt_cached, from which you're extracting the value associated with the key $savedcontinentid.'_'.$savedcountrygroupid (the two IDs joined by an underscore).
See perldata for more information.
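As a minimal sketch of the lookup described above (the ID values and the cached text are made up for illustration; only the variable names come from the question):

```perl
use strict;
use warnings;

# Hypothetical data: the real %continenttxt_cached is defined elsewhere
# in the code being converted.
my %continenttxt_cached;
my ($savedcontinentid, $savedcountrygroupid) = (3, 7);

# "." is string concatenation, so the key is the string "3_7".
my $key = $savedcontinentid . '_' . $savedcountrygroupid;

# A missing key yields undef, which is false, so this branch runs
# the first time the key is seen.
my $was_cached = $continenttxt_cached{$key} ? 1 : 0;
if (!$continenttxt_cached{$key}) {
    $continenttxt_cached{$key} = 'some cached text';
}
my $is_cached = $continenttxt_cached{$key} ? 1 : 0;

print "key=$key before=$was_cached after=$is_cached\n";
```

In PHP the closest counterpart is an associative array indexed with square brackets, e.g. $continenttxt_cached[$savedcontinentid.'_'.$savedcountrygroupid]. Note that the Perl test is also true when the stored value is itself false (0, '0' or the empty string), not only when the key is missing.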

Related

Netlogo arrays need literal values

The array is expecting a literal value
set chrom [forage_min forage_rate share_min share_rate mating_treshold]
print chrom
How can I handle it? I really don't understand arrays in Netlogo.
(You speak of "arrays" in your question, but I think you mean "lists". It is possible to use arrays in NetLogo via the array extension, but unless you have very specific needs, that's probably not what you want. So, assuming that you are trying to create a list:)
The square bracket syntax for declaring lists only works with "literal" values, e.g., raw strings or numbers. If you want to build a list out of variables or more complex expressions, you need to use the list primitive. In your case, that would be something like:
set chrom (list forage_min forage_rate share_min share_rate mating_treshold)
I would encourage you to read the Lists section of the NetLogo programming guide.

Perl code! What does it do (hash of hashes)?

I'm currently working on some code written by the previous internship student. I'm not familiar with Perl, so I have some trouble understanding what his code actually does. It looks like this:
$Hash{Key1}{Key2}++;
The original code was:
$genotypes_parent2_array{$real_genotype}{$individu_depth}++;
I'm used to seeing hashes in the form $Hash{Key} to get a value, but I struggle with this one. Any help out there?
Thanks!
%genotypes_parent2_array is a hash (so that's not a very good name for the variable!) Each value in the hash is a hash reference. So effectively you have a hash of hashes.
$genotypes_parent2_array{$real_genotype} looks up the key $real_genotype in the hash. And that value is (as we said above) a hash reference. If you have a hash reference, then you can look up values in the referenced hash using an arrow. So we can get to a value in the second-level hash using code like this:
$genotypes_parent2_array{$real_genotype}->{$individu_depth}
However, Perl has a nice piece of syntactic sugar. When you have two pairs of "look-up brackets" next to each other (as we have here) you can omit the arrow. So you can get exactly the same effect with:
$genotypes_parent2_array{$real_genotype}{$individu_depth}
And that's what we have here. We look up the key $real_genotype in the hash %genotypes_parent2_array. This gives us a hash reference. We then look up the key $individu_depth in the referenced hash and that gives us the value in the second-level hash. Your code then increments that value.
The manual page perldoc perldsc is a good introduction to using references to build complex data structures in Perl. In addition, I find Data::Dumper very useful for showing what a complex data structure looks like.
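A small sketch of the same pattern, using made-up observations and a hypothetical hash name %count in place of %genotypes_parent2_array. It relies on Perl creating the inner hash automatically the first time a first-level key is used (autovivification):

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical input: [genotype, depth] pairs.
my @observations = ( [ 'AA', 10 ], [ 'AA', 10 ], [ 'AB', 12 ] );

my %count;    # will become a hash of hashes
for my $obs (@observations) {
    my ($real_genotype, $individu_depth) = @$obs;

    # The inner hash is created on first use, and ++ treats the
    # initially undefined value as 0.
    $count{$real_genotype}{$individu_depth}++;
}

# The arrow form reaches exactly the same element.
my $same = $count{AA}->{10};

$Data::Dumper::Sortkeys = 1;    # stable key order in the dump
print Dumper( \%count );
```

The dump shows one inner hash per genotype, each mapping a depth to the number of times that combination was seen.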
%Hash is a hash of hashes.
That code adds 1 to the value of $Hash{Key1}{Key2}, which is the value of a hash element.

How to use B::Hooks to manipulate the perl parser

I'm looking to play with perl parser manipulation. It looks like the various B::Hooks modules are what people use. I was wondering:
Best place to start for someone who has no XS experience (yet). Any relevant blog posts?
How much work would be involved in creating a new operator, for example:
$a~>one~>two~>three
~> would work like -> but it would not try to call on undef and would instead simply return undef to LHS.
Although a source filter would work -- I'm more interested in seeing how you can manipulate the parser at a deeper level.
I don't believe you can add infix operators (operators whose operands are before and after the operator), much less symbolic ones (as opposed to named operators), but you could write an op checker that replaces method calls. This means you could cause ->foo to behave differently. By writing your module as a pragma, you could limit the effect of your module to a lexical scope (e.g. { use mypragma; ...}).
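The op checker itself needs XS and is not shown here. The lexical scoping part, however, can be sketched in pure Perl with the %^H hint hash described in perlpragma. The pragma name mypragma comes from the example above; the key 'mypragma/enabled' and the in_effect helper are invented for this illustration, and the pragma is defined inline only so the sketch fits in one file:

```perl
use strict;
use warnings;

BEGIN {
    package mypragma;
    $INC{'mypragma.pm'} = __FILE__;    # tell "use" the module is already loaded

    # "use mypragma" calls import() while the enclosing scope is being
    # compiled; entries placed in %^H are saved with that lexical scope.
    sub import   { $^H{'mypragma/enabled'} = 1 }
    sub unimport { $^H{'mypragma/enabled'} = 0 }

    # At run time, element 10 of caller() is the hint hash that the
    # calling code was compiled under.
    sub in_effect {
        my $hints = ( caller(0) )[10];
        return $hints && $hints->{'mypragma/enabled'} ? 1 : 0;
    }
}

my ( $inside, $outside );
{
    use mypragma;
    $inside = mypragma::in_effect();
}
$outside = mypragma::in_effect();

print "inside block: $inside, outside block: $outside\n";
```

A real module would presumably consult the same hint from its op checker to decide whether a given method call should be rewritten.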

Perfect Hash Function for Perl (like gperf)?

I'm going to be using a key:value store and would like to create non-collidable hashes in Perl. Is there a Perl module, or function that I can use to generate a non-collidable hash function or table (maybe something like gperf)? I already know my range of input values.
I can't find a pure Perl solution, closest is Reini Urban's examinations of using perfect hashes with a type system. If you were to do it in XS, the CMPH (C Minimal Perfect Hashing Library) might be more apropos than gperf. CMPH seems to be optimized for non-trivial key sizes and run-time generation.
The cost of generating a perfect hash function at runtime in Perl might swamp the value of using it. In order to gain benefit, you'd want it compiled and cached. So again, writing an XS module which generates the function from a fixed key list at XS compile time might be the best way to go.
Out of curiosity, how big is your data and how many keys does the set contain?
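To illustrate what "generating the function from a fixed key list" means, here is a deliberately naive pure-Perl sketch: it brute-forces a seed for a simple string hash until no two keys share a slot. The hash function, the key list, the table size and the names slot_for and find_seed are all assumptions made for this example; this is not the algorithm gperf or CMPH use, and the resulting table is perfect but not minimal:

```perl
use strict;
use warnings;

# Simple seeded multiplicative string hash, reduced to a table slot.
sub slot_for {
    my ( $key, $seed, $size ) = @_;
    my $h = $seed;
    $h = ( $h * 31 + ord($_) ) % 4294967296 for split //, $key;
    return $h % $size;
}

# Try seeds in turn; return the first one that gives every key its
# own slot, or nothing if none is found below the limit.
sub find_seed {
    my ( $keys, $size, $max_seed ) = @_;
    SEED: for my $seed ( 1 .. $max_seed ) {
        my %used;
        for my $key (@$keys) {
            next SEED if $used{ slot_for( $key, $seed, $size ) }++;
        }
        return $seed;
    }
    return;
}

my @keys = qw(red blue green yellow);    # the known, fixed input set
my $size = 13;
my $seed = find_seed( \@keys, $size, 10_000 );

if ( defined $seed ) {
    printf "%-6s -> slot %d\n", $_, slot_for( $_, $seed, $size ) for @keys;
}
```

Even for this toy case the search runs the hash many times per candidate seed, which is the generation cost mentioned above; the seed would have to be computed once and stored to be worthwhile.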
You might be interested in Judy. It's not a hash table implementation, but it's supposedly a very efficient associative array implementation.
Mind you, Perl's hashes are very well tuned, and they automatically get rehashed when a bucket starts growing large.

Nested dereferencing arrows in Perl: to omit or not to omit?

In Perl, when you have a nested data structure, it is permissible to omit the dereferencing arrows at the second and deeper levels of nesting. In other words, the following two syntaxes are identical:
my $hash_ref = { 1 => [ 11, 12, 13 ], 3 => [31, 32] };
my $elem1 = $hash_ref->{1}->[1];
my $elem2 = $hash_ref->{1}[1]; # exactly the same as above
Now, my question is, is there a good reason to choose one style over the other?
It seems to be a popular bone of stylistic contention (Just on SO, I accidentally bumped into this and this in the space of 5 minutes).
So far, almost none of the usual suspects says anything definitive:
perldoc merely says "you are free to omit the pointer dereferencing arrow".
Conway's "Perl Best Practices" says "whenever possible, dereference with arrows", but that appears to apply only to dereferencing the main reference, not to the optional arrows at the second level of nested data structures.
"Mastering Perl for Bioinformatics" author James Tisdall doesn't give a very solid preference either:
"The sharp-witted reader may have noticed that we seem to be omitting arrow operators between array subscripts. (After all, these are anonymous arrays of anonymous arrays of anonymous arrays, etc., so shouldn't they be written $array->[$i]->[$j]->[$k]?) Perl allows this; only the arrow operator between the variable name and the first array subscript is required. It makes things easier on the eyes and helps avoid carpal tunnel syndrome. On the other hand, you may prefer to keep the dereferencing arrows in place, to make it clear you are dealing with references. Your choice."
UPDATED "Intermediate Perl", as per its co-author brian d foy, recommends omitting the arrows. See brian's full answer below.
Personally, I'm on the side of "always put arrows in, since it's more readable and obvious they're dealing with a reference".
UPDATE To be more specific re: readability, in case of a multi-nested expression where subscripts themselves are expressions, the arrows help to "visually tokenize" the expressions by more obviously separating subscripts from one another.
Unless you really enjoy typing or excessively long lines, don't use the arrows when you don't need them. Subscripts next to subscripts imply references, so the competent programmer doesn't need extra clues to figure that out.
I disagree that it's more readable to have extra arrows. It's definitely unconventional to have them moving the interesting parts of the term further away from each other.
In Intermediate Perl, where we actually teach references, we tell you to omit the unnecessary arrows.
Also, remember there is no such thing as "readability". There is only what you (and others) have trained your eyes to recognize as patterns. You don't read things character-by-character then figure out what they mean. You see groups of things that you've seen before and recognize them. At the base syntax level that you are talking about, your "readability" is just your ability to recognize patterns. It's easier to recognize patterns the more you use them, so it's not surprising that what you do now is more "readable" to you. New styles seem odd at first, but eventually become more recognizable, and thus more "readable".
The example you give in your comments isn't hard to read because it lacks arrows. It's still hard to read with arrows:
$expr1->[$sub1{$x}]{$sub2[$y]-33*$x3}{24456+myFunct($abc)}
$expr1->[$sub1{$x}]->{$sub2[$y]-33*$x3}->{24456+myFunct($abc)}
I write that sort of code like this, using these sorts of variable names to remind the next coder about the sort of container each level is:
my $index = $sub1{$x};
my $key1 = $sub2[$y]-33*$x3;
my $key2 = 24456+myFunct($abc);
$expr1->[ $index ]{ $key1 }{ $key2 };
To make that even better, hide the details in a subroutine (that's what they are there for :) so you never have to play with that mess of a data structure directly. This is more readable than any of them:
my $value = get_value( $index, $key1, $key2 );
my $value = get_value(
    $sub1{$x},
    $sub2[$y]-33*$x3,
    24456+myFunct($abc)
);
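The answer does not show get_value itself, so the following is only one possible shape for it. The structure held in $expr1 and its contents are hypothetical, chosen to match the array-of-hash-of-hash layout in the expression above:

```perl
use strict;
use warnings;

# Hypothetical structure: an array ref whose elements are hashes of hashes.
my $expr1 = [
    { alpha => { beta => 'first' } },
    { gamma => { delta => 'second' } },
];

# The nested lookup lives in one place; callers only pass the three
# subscripts and never touch the structure directly.
sub get_value {
    my ( $index, $key1, $key2 ) = @_;
    return $expr1->[$index]{$key1}{$key2};
}

my $value   = get_value( 1, 'gamma', 'delta' );
my $missing = get_value( 0, 'alpha', 'nope' );

print "value: $value\n";
print "missing key gives undef\n" if !defined $missing;
```

One caveat with this simple version: looking up a path whose intermediate levels do not exist creates those levels as a side effect (autovivification), so a stricter accessor might check each level with exists first.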
Since the -> arrow is non-optionally used for method calls, I prefer to only use it to call code. So I would use the following:
$object->method;
$coderef->();
$$dispatch{name}->();
$$arrayref[1];
$$arrayref[1][5];
@$arrayref[1 .. 5];
@$arrayref;
$$hashref{foo};
$$hashref{foo}{bar};
@$hashref{qw/foo bar/};
%$hashref;
Two sigils back to back always means a dereference, and the structure remains consistent across all forms of dereferencing (scalar, slice, all).
It also keeps all parts of the variable "together" which I find more readable, and it's shorter :)
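A short sketch with made-up data, showing that the doubled-sigil forms listed above reach the same elements as the arrow forms and also cover slices and whole-container dereferences:

```perl
use strict;
use warnings;

# Hypothetical data for illustration.
my $arrayref = [ 10, 20, [ 30, 31 ], 40 ];
my $hashref  = { foo => { bar => 'deep' }, baz => 2, qux => 3 };

# Single elements: sigil form and arrow form are equivalent.
my $elem_sigil = $$arrayref[1];
my $elem_arrow = $arrayref->[1];
my $nested     = $$arrayref[2][1];      # same as $arrayref->[2]->[1]
my $deep       = $$hashref{foo}{bar};   # same as $hashref->{foo}->{bar}

# Slices and whole containers use @ or % in front of the reference.
my @array_slice = @$arrayref[ 0 .. 1 ];
my @hash_slice  = @$hashref{qw/baz qux/};
my @whole_array = @$arrayref;
my %whole_hash  = %$hashref;

print "$elem_sigil $elem_arrow $nested $deep\n";
print "@array_slice | @hash_slice\n";
```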
I have always written all of the arrows. I agree with you, they separate better the different subscripts. Plus I use curly braces for regular expressions, so to me {foo}{bar} is a substitution: s{foo}{bar} stands out more from $s->{foo}->{bar} than from $s->{foo}{bar}.
I don't think it's a big thing though, reading code that omits the extra arrows is not a problem (as opposed to any indentation that's not the one I use ;--)