Perl code: what does it do (hash of hashes)?

I'm currently working on some code written by the previous intern. I'm not familiar with Perl, so I have some trouble understanding what his code actually does. It looks like this:
$Hash{Key1}{Key2}++;
The original code was:
$genotypes_parent2_array{$real_genotype}{$individu_depth}++;
I'm used to seeing hashes in the form $Hash{Key} to get a value, but I struggle with this one. Any help out there?
Thanks!

%genotypes_parent2_array is a hash (so that's not a very good name for the variable!). Each value in the hash is a hash reference. So effectively you have a hash of hashes.
$genotypes_parent2_array{$real_genotype} looks up the key $real_genotype in the hash. And that value is (as we said above) a hash reference. If you have a hash reference, then you can look up values in the referenced hash using an arrow. So we can get to a value in the second-level hash using code like this:
$genotypes_parent2_array{$real_genotype}->{$individu_depth}
However, Perl has a nice piece of syntactic sugar. When you have two pairs of "look-up brackets" next to each other (as we have here) you can omit the arrow. So you can get exactly the same effect with:
$genotypes_parent2_array{$real_genotype}{$individu_depth}
And that's what we have here. We look up the key $real_genotype in the hash %genotypes_parent2_array. This gives us a hash reference. We then look up the key $individu_depth in the referenced hash and that gives us the value in the second-level hash. Your code then increments that value.
The manual page perldoc perldsc is a good introduction to using references to build complex data structures in Perl. In addition, I find Data::Dumper very useful for showing what a complex data structure looks like.
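To make the explanation above concrete, here is a minimal sketch of the same counting pattern. The variable name %genotype_counts and the sample observations are made up for illustration; only the $hash{$key1}{$key2}++ idiom comes from the question. Note that the inner hashes are never created explicitly: Perl creates them on first use (autovivification).

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical data: (genotype, depth) pairs standing in for the real input.
my @observations = ( [ 'AA', 10 ], [ 'AA', 10 ], [ 'AB', 12 ], [ 'AA', 7 ] );

my %genotype_counts;    # hash of hashes: genotype => { depth => count }

for my $observation (@observations) {
    my ( $genotype, $depth ) = @$observation;

    # The inner hash and its element spring into existence on first use,
    # then the (initially undefined) value is incremented to 1.
    $genotype_counts{$genotype}{$depth}++;
}

# The explicit arrow form reads the very same element.
print "AA at depth 10: $genotype_counts{AA}->{10}\n";

# Sort keys so the dump is stable from run to run.
$Data::Dumper::Sortkeys = 1;
print Dumper( \%genotype_counts );
```

The Dumper output shows one inner hash per genotype, each mapping a depth to the number of times that pair was seen.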

%Hash is a hash of hashes.
That code adds 1 to the value of $Hash{Key1}{Key2}, which is an element of the inner hash.

Related

What does this mean in Perl?

I am converting some Perl code to PHP, and I've stumbled on something I can't identify for sure.
if(!$continenttxt_cached{$savedcontinentid.'_'.$savedcountrygroupid})
What does the {} bracket do here? Is this a standard array element accessed this way? Because I am converting only a small part of a rather large codebase, I can't find how $continenttxt_cached was defined, so I can only presume this is an array. Is the {} used for something else in Perl?
{} in this context denotes a hash accessor - hashes are key-value pairs.
So you have a hash called %continenttxt_cached from which you're trying to extract the value associated with the key $savedcontinentid.'_'.$savedcountrygroupid.
See perldata for more information.
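As a rough illustration of the lookup described above, the sketch below builds the same kind of composite key and tests the hash with it. The ID values and the cached text are invented for the example; only the hash name and the key expression come from the question. The closest PHP counterpart of a Perl hash is an associative array.

```perl
use strict;
use warnings;

my %continenttxt_cached;    # the hash; a single element is written $continenttxt_cached{...}

# Hypothetical values, chosen only for the demonstration.
my $savedcontinentid    = 3;
my $savedcountrygroupid = 17;

# The dot operator concatenates, so the key is the string "3_17".
my $key = $savedcontinentid . '_' . $savedcountrygroupid;

# A missing key yields undef, which is false, so this branch runs the first time.
if ( !$continenttxt_cached{$key} ) {
    $continenttxt_cached{$key} = "text for $key";
}

print "$key => $continenttxt_cached{$key}\n";
```

One caveat when porting: the test checks truthiness, so a cached value of 0 or an empty string would also count as "not cached"; exists $continenttxt_cached{$key} checks only whether the key is present.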

Perfect Hash Function for Perl (like gperf)?

I'm going to be using a key:value store and would like to create non-collidable hashes in Perl. Is there a Perl module, or function that I can use to generate a non-collidable hash function or table (maybe something like gperf)? I already know my range of input values.
I can't find a pure Perl solution; the closest is Reini Urban's examinations of using perfect hashes with a type system. If you were to do it in XS, the CMPH (C Minimal Perfect Hashing Library) might be more apropos than gperf. CMPH seems to be optimized for non-trivial key sizes and run-time generation.
The cost of generating a perfect hash function at runtime in Perl might swamp the value of using it. In order to gain benefit, you'd want it compiled and cached. So again, writing an XS module which generates the function from a fixed key list at XS compile time might be the best way to go.
Out of curiosity, how big is your data and how many keys does the set contain?
You might be interested in Judy. It's not a hash table implementation, but it's supposedly a very efficient associative array implementation.
Mind you, Perl's hashes are very well tuned, and they automatically get rehashed when a bucket starts growing large.

Hash randomization in Perl 5

When Perl 5.8.1 came out, it added hash randomization. When Perl 5.8.2 came out, I thought it removed hash randomization unless an environment variable (PERL_HASH_SEED) was present. It now seems as if I am gravely mistaken, as
PERL_HASH_SEED=$SEED perl -MData::Dumper -e 'print Dumper{map{$_,1}"a".."z"}'
always kicks back the same key ordering regardless of the value of $SEED.
Did hash randomization go completely away, am I doing something wrong, or is this a bug?
See Algorithmic Complexity Attacks:
In Perl 5.8.1 the hash function is randomly perturbed by a pseudorandom seed which makes generating such naughty hash keys harder. [...] but as of 5.8.2 it is only used on individual hashes if the internals detect the insertion of pathological data.
So randomization doesn't always happen, only when perl detects that it's needed.
At a minimum there have been some sloppy documentation updates. In the third paragraph of perlrun's entry for PERL_HASH_SEED it says:
The default behaviour is to randomise unless the PERL_HASH_SEED is set.
which was true only in 5.8.1 and contradicts the paragraph immediately preceding it:
Most hashes by default return elements in the same order as in Perl 5.8.0. On a hash by hash basis, if pathological data is detected during a hash key insertion, then that hash will switch to an alternative random hash seed.
perlsec's entry for Algorithmic Complexity Attacks gets this right:
In Perl 5.8.1 the random perturbation was done by default, but as of
5.8.2 it is only used on individual hashes if the internals detect the
insertion of pathological data.
perlsec goes on to say
If one wants for some reason emulate the old behaviour [...] set the
environment variable PERL_HASH_SEED to zero to disable the
protection (or any other integer to force a known perturbation, rather
than random).
[emphasis added]
Since setting PERL_HASH_SEED does not affect the hash order, I'd call it a bug. Searching for "PERL_HASH_SEED" on rt.perl.org didn't return any results, so it doesn't appear to be a "known" issue.
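To check the behaviour on a particular perl rather than by eye, a small script can run the same one-liner under several seeds and count how many distinct key orders come back. This is only a sketch: the seed values are arbitrary, and the result depends on the perl version in use, so no expected output is given.

```perl
use strict;
use warnings;

# Child program: build a 26-key hash and print its keys in native order.
my $child = 'my %h = map { $_ => 1 } "a".."z"; print join ",", keys %h;';

my %orders;    # key-order string => list of seeds that produced it

for my $seed ( 0, 1, 42 ) {    # arbitrary seeds, picked for illustration
    local $ENV{PERL_HASH_SEED} = $seed;

    # List-form pipe open runs the current perl binary ($^X) without a shell,
    # so the child code needs no extra quoting.
    open my $fh, '-|', $^X, '-e', $child or die "cannot run $^X: $!";
    my $order = <$fh>;
    close $fh;

    push @{ $orders{$order} }, $seed;
}

printf "%d distinct key order(s) across 3 seeds\n", scalar keys %orders;
```

A count of 1 matches what the question reports; a count above 1 means the seed does influence ordering on that build.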

Should all implementations of SHA512 give the same Hash?

I am working on writing a SHA512 function. When I check the file I am hashing against different sources (the Linux sha512sum tool, a couple of websites, and the old source code I have for SHA512), they all give different hash values. My thought going into this project was that all hash algorithms will output the same hash value if implemented correctly, so that it can be used as a checksum. Am I wrong in thinking this? If I am wrong, how would I really check to see if my work is correct?
Thanks in advance.
Yes, that's one of the basic building blocks of PKI: the same data block passed to a hash should always return the same hash value.
Beware of the interpretation, though: the result of a SHA-512 (SHA-2 family) hash is a block of 512 bits, not a string value, so it will first be encoded for human consumption. It is therefore possible to see what look like different results when it's simply a matter of different encodings.
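The encoding point can be seen with Perl's core Digest::SHA module, which returns the same 64-byte digest as raw bytes, hex, or base64. The sketch below also compares against the published SHA-512 test vector for the three-byte message "abc", which is one way to check a home-grown implementation; the variable names are just for illustration.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha512 sha512_hex sha512_base64);

my $message = 'abc';

my $raw    = sha512($message);           # 64 raw bytes (512 bits)
my $hex    = sha512_hex($message);       # the same digest as 128 hex digits
my $base64 = sha512_base64($message);    # the same digest, base64 encoded

# Published SHA-512 test vector for the message "abc".
my $expected = 'ddaf35a193617abacc417349ae20413112e6fa4e89a97ea2'
             . '0a9eeee64b55d39a2192992a274fc1a836ba3c23a3feebbd'
             . '454d4423643ce80e2a9ac94fa54ca49f';

print "hex:    $hex\n";
print "base64: $base64\n";
print "matches test vector: ", ( $hex eq $expected ? "yes" : "no" ), "\n";

# Hex-encoding the raw bytes by hand gives the same string as sha512_hex.
print "raw and hex agree:   ", ( unpack( 'H*', $raw ) eq $hex ? "yes" : "no" ), "\n";
```

When outputs differ between tools, it is also worth checking that the input bytes are identical: a trailing newline or different line endings in the file changes the digest completely.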

KRL: Replace HASH val

I can see plenty of examples of how to read from a hash or add to a hash (using put), but how do I replace a current value? Any simple examples would be cool.
I haven't worked much with hashes, but you might look at Mike Grace's example here:
http://kynetxappaday.wordpress.com/2010/12/29/day-28-updating-users-list-when-user-joins-app/
That will replace the entire hash. I'm not sure how to replace just one value in the hash.