smart match operator for hash of hash - perl

I want to match the keys of a hash of hashes against a regexp.
$line=" Cluster(A,B):A(T) M(S)";
$reg="Cluster";
my ( $cluster, $characters ) = split (/:/,$line);
$HoH{$cluster}={split /[( )]+/,$characters } ;
foreach $value(keys %HoH){
foreach $characters (keys %{$HoH{$cluster}}){
print "$value:$characters\n" if /$reg/ ~~ %HoH;
}
}
Now the output is:
Cluster(A,B):A
Cluster(A,B):M
This code works fine with this sample data, but not with my real data! My data is more complicated, but the structure is the same. I was wondering if there are other ways to do this.

Perhaps you want just
print "something\n" if exists $HoH{regexp}
or maybe
print "something\n" if grep /regexp/, keys %HoH
but if neither of these is correct then you need to explain better what you need, and give some examples.

This is underdocumented, and I don't grok exactly what the issue is, but the smart match operator works better with references to arrays and hashes. So you may have better luck with
/$reg/ ~~ \%HoH

SmartMatch is currently complicated, unwieldy and surprising. Don't use it, at least not now. There's talk by the main developers of perl to either greatly simplify it or remove it completely. Either way it won't do what you're asking it to do in the future, so don't rely on it doing that now.
Being more explicit about what you want is better anyway.

Most likely, your bug is here:
foreach $characters (keys %{$HoH{$cluster}}) {
which should read
foreach $characters (keys %{$HoH{$value}}) {
Probably.
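Putting the answers above together, a smartmatch-free version of the whole loop might look like the sketch below. It assumes the input format from the question; the second sample line and the names @lines, $inner and @out are invented for illustration.

```perl
use strict;
use warnings;

# Sample input: the first line is from the question, the second is invented.
my @lines = ( "Cluster(A,B):A(T) M(S)", "Other(C):X(Y)" );
my $reg   = qr/Cluster/;

my %HoH;
for my $line (@lines) {
    my ( $cluster, $characters ) = split /:/, $line;
    # "A(T) M(S)" becomes ( A => 'T', M => 'S' )
    $HoH{$cluster} = { split /[( )]+/, $characters };
}

# Pick the matching outer keys with grep, then walk each inner hash
# using the same key that selected it.
my @out;
for my $cluster ( sort grep { /$reg/ } keys %HoH ) {
    for my $inner ( sort keys %{ $HoH{$cluster} } ) {
        push @out, "$cluster:$inner";
    }
}
print "$_\n" for @out;
```

Filtering the outer keys once with grep keeps the regexp test out of the inner loop and makes it explicit that only the keys, not the values, are being matched.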


Perl: How to get keys on key "keys on reference is experimental"

I am looking for a solution to Perl's warning
"keys on reference is experimental at"
I get this from code like this:
foreach my $f (keys($normal{$nuc}{$e})) {#x, y, and z
I found something similar on StackOverflow here:
Perl throws "keys on reference is experimental"
but I don't see how I can apply it in my situation.
How can I get the keys to multiple keyed hashes without throwing this error?
keys %{$normal{$nuc}{$e}}
That is, dereference it first.
If you have a simple scalar reference to start with, you don't need the braces, e.g.:
my $ref = $normal{$nuc}{$e};
print keys %$ref;
The problem is that $normal{$nuc}{$e} is a hash reference, and keys will officially only accept a hash. The solution is simple: you must dereference the reference, which you can do by writing
for my $f ( keys %{ $normal{$nuc}{$e} } ) { ... }
but it may be wiser to extract the hash reference into an intermediate variable. This will make your code much clearer, like so
my $hr = $normal{$nuc}{$e};
for my $f ( keys %$hr ) { ... }
I would encourage you to write more meaningful variable names. $f and $e may tell you a lot while you're writing the code, but they are sure to cause others problems, and they may even come back to hit you a year or two down the line.
Likewise, I am sure that there is a better identifier than $hr, but I don't know the meaning of the various levels of your data structure so I couldn't do any better. Please call it something that's relevant to the data that it points to.
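For a concrete picture, here is a small sketch with an invented three-level hash (the keys chr1, exon, x, y and z are made up, as is the name $inner). It shows the two dereference forms described above, plus postfix dereference, which should work without warnings on Perl 5.24 or later.

```perl
use v5.24;
use warnings;

# Hypothetical data standing in for %normal from the question.
my %normal = ( chr1 => { exon => { x => 1, y => 2, z => 3 } } );
my ( $nuc, $e ) = ( 'chr1', 'exon' );

# Block dereference of the nested hash reference.
my @block = sort keys %{ $normal{$nuc}{$e} };

# The same thing through an intermediate variable.
my $inner = $normal{$nuc}{$e};
my @named = sort keys %$inner;

# Postfix dereference (Perl 5.24 or later).
my @postfix = sort keys $normal{$nuc}{$e}->%*;

print "@block\n";
```

All three forms hand keys a real hash, so none of them triggers the warning.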
keys $hash_ref
is the reason for the warning keys on references is experimental
Solution:
keys %{ $hash_ref }
Reference:
https://www.perl.com/pub/2005/07/14/bestpractices.html/

what does print for mean in Perl?

I need to edit some Perl script and I'm new to this language.
I encountered the following statement:
print for (@$result);
I know that $result is a reference to an array and @$result returns the whole array.
But what does print for mean?
Thank you in advance.
In Perl, there's such a thing as an implicit variable. You may have seen it already as $_. There are a lot of built-in functions in Perl that work on $_ by default.
$_ is set in a variety of places, such as loops. So you can do:
while ( <$filehandle> ) {
chomp;
tr/A-Z/a-z/;
s/oldword/newword/;
print;
}
Each of these lines operates on $_, and most of them modify it as they go. Your for loop does the same: each iteration sets $_ to the current element, and print with no arguments prints $_.
I would point out though - whilst useful and clever, it's also a really good way to make confusing and inscrutable code. In nested loops, for example, it can be quite unclear what's actually going on with $_.
So I'd typically:
avoid writing it explicitly - if you need to do that, you should consider actually naming your variable properly.
only use it in places where it makes it clearer what's going on. As a rule of thumb - if you use it more than twice, you should probably use a named variable instead.
I find it particularly useful if iterating on a file handle. E.g.:
while ( <$filehandle> ) {
next unless m/keyword/; #skips any line without 'keyword' in it.
my ( $wiggle, $wobble, $fronk ) = split ( /:/ ); #split $_ into 3 variables on ':'
print $wobble, "\n";
}
It would be redundant to assign a variable name to capture a line from <$filehandle>, only to immediately discard it - thus instead we use split which by default uses $_ to extract 3 values.
If it's hard to figure out what's going on, then one of the more useful ways is to use perl -MO=Deparse which'll re-print the 'parsed' version of the script. So in the example you give:
foreach $_ (@$result) {
print $_;
}
It is equivalent to for (@$result) { print; }, which is equivalent to for (@$result) { print $_; }. $_ refers to the current element.
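A minimal runnable sketch of the same idea, using invented sample data in $result:

```perl
use strict;
use warnings;

my $result = [ "alpha\n", "beta\n" ];    # sample data, not from the question

# Statement-modifier form: $_ is aliased to each element in turn.
print for @$result;

# The same loop with a named variable.
for my $item (@$result) {
    print $item;
}

# Because $_ is an alias, modifying it changes the array itself.
chomp for @$result;
print join( ",", @$result ), "\n";
```

The last two lines are worth noting: since $_ aliases the array elements rather than copying them, chomp for @$result strips the newlines from the original array.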

what does this Perl foreach do?

I am stuck while understanding what this foreach does, I am new to perl programming
-e && print ("$_\n") foreach $base_name, "build/$base_name";
Here build is a directory. Thanks.
Not very pretty code somebody left you with there. :(
Either of
for ($base_name, "build/$base_name") { say if -e }
or
for my $file ($base_name, "build/$base_name") {
say $file if -e $file;
}
would be clearer.
It checks whether each file exists and, if it does, prints its name.
It can be written in a clearer way.
It iterates over $base_name and then build/$base_name, with the current file name held in $_.
It is essentially the same as:
foreach( $base_name, "build/$base_name" ){
if( -e ){
print ("$_\n");
}
}
I hope whoever wrote that isn't prolific.
How many ways to do the same thing in a clearer fashion?
grep
grep filters a list and removes anything that returns false for the check condition.
The current value under consideration is set to $_, so it is very convenient to use file test operators.
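Since say and print both accept lists of strings, it makes sense to filter the output before passing it to them. Note that say appends a single newline after the whole list, so loop over the filtered list to get one name per line.
say for grep -e, $base_name, "build/$base_name";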
On older perls without say, map is sometimes used to apply newlines to the output before it is printed.
print map "$_\n", grep -e, $base_name, "build/$base_name";
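To try the grep approach without touching real files, something like the following sketch could be used. The temporary directory and the file name report.txt are invented for the demo; File::Temp is a core module.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Throwaway directory containing one file; it is removed at exit.
my $dir       = tempdir( CLEANUP => 1 );
my $base_name = "$dir/report.txt";
open my $fh, '>', $base_name or die "Cannot create $base_name: $!";
close $fh;

# Only the first candidate exists; "$dir/build/report.txt" does not.
my @found = grep { -e $_ } $base_name, "$dir/build/report.txt";

# One name per line.
print map { "$_\n" } @found;
```

The block form grep { -e $_ } is used here only because it is unambiguous to read; it does the same job as the expression form shown above.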
for
Here we safely use a statement modifier if. IMHO, statement modifier forms can be very handy as long as the LHS is kept very simple.
for my $dir ( $base_name, "build/$base_name" ) {
print "$dir\n" if -e $dir;
}
punctuation
Still ugly as sin, but breaking up the line and using some extra parens helps identify what is being looped over and separate it from the looping code.
-e && print("$_\n")
foreach( $base_name, "build/$base_name" );
It's better, but please don't do this. If you want to do ($foo if $bar) for @baz; DON'T. Use a full-sized block. I can count one time in my life where it felt OK to use a $bar and $foo for @baz construct like this one--and it was probably wrong to use it.
sub
The OP's mystery code is nicer if it is merely wrapped in a sub with a good name.
I'll write this in the ugliest way I know how:
sub print_existing_files { -e && print ("$_\n") foreach @_; }
The body of the sub remains a puzzle, but at least there's some kind of clue in the name.
Conclusion
I'm sure I left out many variations on how this could be done (I don't have anything that uses until, and damn near anything that isn't intentionally obfuscated would be clearer). But that is beside the point.
The point of this is really to say that in any language there are many ways to achieve any given task. It is, therefore, important that each programmer who works on a system remain mindful of those who follow.
Code exists to serve two purposes: to communicate with future programmers and to instruct the computer how to operate--in that order.

How do I create new variables with indexed names in Perl?

Hi, I've read some related questions, but none seems to help me.
This is code that explains what I want.
for ($i=0;$i<5;$i++) {
my $new$i=$i;
print "\$new$i is $new$i";
}
I am expecting variables named $new0, $new1, $new2, $new3 and $new4,
and the ability to use them in a loop, as the print command is trying to do.
Thanks
You want to use a hash or array instead. It allows a collection of data to remain together and will result in less pain down the line.
my %hash;
for my $i (0..4) {
$hash{$i} = $i;
print "\$hash{$i} is $hash{$i}\n";
}
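When the names differ only by a number, as in the question, an array is arguably the most direct fit. A minimal sketch, with arbitrary sample values:

```perl
use strict;
use warnings;

my @new;
for my $i ( 0 .. 4 ) {
    $new[$i] = $i * 10;    # arbitrary sample value
}

# The index plays the role of the number in the variable name.
for my $i ( 0 .. $#new ) {
    print "\$new[$i] is $new[$i]\n";
}
```

Here $new[3] takes the place of the hypothetical $new3, and the whole collection can be looped over, passed to a sub, or counted with scalar(@new).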
Can you describe why exactly you need to do this?
Mark Jason Dominus has written why doing this is not such a good idea in Why it's stupid to "use a variable as a variable name"
part 1, part 2 and part 3.
If you think your need is an exception to cases described there, do let us know how and somebody here might help.
You are asking a common question. The answer in 99.9% of cases is DON'T DO THAT.
The question is so common that it is in perlfaq7: How can I use a variable as a variable name?.
See "How do I use symbolic references in Perl?" for lots of discussion of the issues with symbolic references.
Use a hash instead.
my %h=();
$new="test";
for ($i=0;$i<5;$i++) {
$h{"$new$i"}=$i;
}
while( my( $key, $value ) = each( %h ) ) {
print "$key,$value\n";
}
If you really want to create variables like $newN, you can use eval:
eval("\$new$i=$i");
(Using a hash would probably be better.)

How can I identify and remove redundant code in Perl?

I have a Perl codebase, and there are a lot of redundant functions and they are spread across many files.
Is there a convenient way to identify those redundant functions in the codebase?
Is there any simple tool that can verify my codebase for this?
You could use the B::Xref module to generate cross-reference reports.
I've run into this problem myself in the past. I've slapped together a quick little program that uses PPI to find subroutines. It normalizes the code a bit (whitespace normalized, comments removed) and reports any duplicates. Works reasonably well. PPI does all the heavy lifting.
You could make the normalization a little smarter by normalizing all variable names in each routine to $a, $b, $c and maybe doing something similar for strings. Depends on how aggressive you want to be.
#!perl
use strict;
use warnings;
use PPI;
my %Seen;
for my $file (@ARGV) {
my $doc = PPI::Document->new($file);
$doc->prune("PPI::Token::Comment"); # strip comments
my $subs = $doc->find('PPI::Statement::Sub') || []; # find returns false when nothing matches
for my $sub (@$subs) {
my $code = $sub->block;
$code =~ s/\s+/ /g; # normalize every run of whitespace
next if $code =~ /^\{\s*\}$/; # ignore empty routines
if( $Seen{$code} ) {
printf "%s in $file is a duplicate of $Seen{$code}\n", $sub->name;
}
else {
$Seen{$code} = sprintf "%s in $file", $sub->name;
}
}
}
It may not be convenient, but the best tool for this is your brain. Go through all the code and get an understanding of its interrelationships. Try to see the common patterns. Then, refactor!
I've tagged your question with "refactoring". You may find some interesting material on this site filed under that subject.
If you are on Linux you might use grep to help you list all of the functions in your codebase. You will probably need to do what Ether suggests and really go through the code to understand it if you haven't already.
Here's an over-simplified example:
grep -r "sub " codebase/* > function_list
You can look for duplicates this way too. This idea may be less effective if you are using Perl's OOP capability.
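The same name-based check can be sketched in Perl itself. The function name find_duplicate_subs and the sample sources below are invented for illustration, and the regexp is as over-simplified as the grep above: it only sees named subs declared at the start of a line and compares names, not bodies.

```perl
use strict;
use warnings;

# Map each sub name to the files that define it.
# %$sources is filename => source text; in real use you would read the files.
sub find_duplicate_subs {
    my ($sources) = @_;
    my %defined_in;
    for my $file ( sort keys %$sources ) {
        while ( $sources->{$file} =~ /^\s*sub\s+(\w+)/mg ) {
            push @{ $defined_in{$1} }, $file;
        }
    }

    # Keep only the names that were defined more than once.
    my %dupes;
    for my $name ( keys %defined_in ) {
        $dupes{$name} = $defined_in{$name} if @{ $defined_in{$name} } > 1;
    }
    return \%dupes;
}

my $dupes = find_duplicate_subs(
    {
        'a.pl' => "sub trim { }\nsub load { }\n",
        'b.pl' => "sub trim { }\n",
    }
);
print "$_: @{ $dupes->{$_} }\n" for sort keys %$dupes;
```

A report like this only flags candidates; two subs with the same name in different packages may be perfectly legitimate, so each hit still needs a human look.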
It might also be worth mentioning NaturalDocs, a code documentation tool. This will help you going forward.