perl simple efficiency and readability - perl

I have a question on efficiency and readability in perl
I have a variable that can take on one of several values (5-6). Sometimes I want to check if this is a specific value, sometimes I want to check if it is one of several choices. I am making this sort of decision in many places in my code (in different functions), and I would like to make this as 'tight' as possible.
for example, say
my $mode; # can be any of qw(one two three four five six)
if ($mode eq 'one') {
#do code
}
if ($mode eq 'one' or $mode eq 'two' or $mode eq 'three') {
#do more code
}
This of course is not my real code, and with meaningful variable names, my if statements are getting quite long and wrap on several lines.
Any help is appreciated!

The List::MoreUtils module has the any function, which is like a short-circuiting grep:
use List::MoreUtils qw/any/;
say "correct mode" if any { $_ eq $mode } qw/one two three/;
That said, you can still use grep, but that always tests all elements, whereas any aborts after the first matching element:
say "correct mode" if grep { $_ eq $mode } qw/one two three/;

An idea.
my %please_name_me;
$please_name_me{$_}++ for qw(one two three);
if ($please_name_me{$mode}) {
#do something
}
Otherwise I like using whitespace:
if (
'one' eq $mode or
'two' eq $mode or
'three' eq $mode
) {
}

This would be a nice job for smart match, but the operator has been marked experimental and may be removed from later versions of Perl.
List::Util has an any function, but be careful. List::Util has been a standard Perl module since at least Perl 5.8.8, yet any was only added in List::Util 1.33, so on older installations you need to update the module before you can use it. The first function has been part of the module for much longer and may be good enough:
use strict;
use warnings;
use feature qw(say);
use autodie;
use List::Util qw(first);
use constant VALID_VALUES => qw(one two three four five);
for my $value ( qw(zero one two three four five six) ) {
if ( first { $value eq $_ } VALID_VALUES ) {
say "$value is in the list!";
}
else {
say "Nope. $value is not";
}
}
Because use List::Util qw(first); appears at the top of the program, a reader can tell that first comes from the List::Util package and can look it up with perldoc List::Util.
You can also just use grep and forget about List::Util:
for my $value ( qw(zero one two three four five six) ) {
if ( grep { $value eq $_ } VALID_VALUES ) {
say "$value is in the list!";
}
else {
say "Nope. $value is not";
}
}
What you shouldn't do is use a complex regular expression or a long chain of or conditions. These aren't necessarily any clearer and make your program harder to understand:
if ( $value =~ /^(one|two|three|four|five)$/ ) {
if ( $value eq "one" or $value eq "two" or $value eq "three" ... ) {
If you decide to change the valid list of values, you would have to go through your entire program to search for them. This is why I made them a constant. There's only one place in the program where they have to be modified.

There are a number of options (TMTOWTDI). Here are two of the simplest:
if ($mode =~ /^(?:one|two|three)$/) { ... }
if (grep { $mode eq $_ } qw(one two three)) { ... }
With a little setup beforehand you can make it rather more efficient. Do this once:
my @modes = qw(one two three);
my %modes;
@modes{@modes} = @modes;
Then your check becomes simply:
if ($modes{$mode}) { ... }
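If the same groups of modes are tested in many functions, the lookup hashes can be built once and given names. The following is only a minimal sketch under that assumption: the set names early and late and the helper mode_in are hypothetical, not taken from the question.

```perl
use strict;
use warnings;

# Hypothetical groupings of the six modes; pick names that fit your code.
my %MODE_SET = (
    early => { map { $_ => 1 } qw(one two three) },
    late  => { map { $_ => 1 } qw(four five six) },
);

# Returns true if $mode belongs to the named set, false otherwise
# (including when the set name itself is unknown).
sub mode_in {
    my ($mode, $set) = @_;
    return exists($MODE_SET{$set}) && exists($MODE_SET{$set}{$mode});
}

print "early mode\n" if mode_in('two', 'early');
```

Each call site then shrinks to a single short condition, and the membership of a set is changed in one place only.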

If you're using Perl 5.10 or later, you can use smart match operator:
if ($mode ~~ ['one', 'two', 'three'])

If you have many, many combinations to check, consider translating $mode to a numbered bit:
if 'one' -> $modeN = 1
if 'two' -> $modeN = 2
if 'three' -> $modeN = 4 etc.
To check just 'one', if ($modeN == 1) {...
To check 'one', 'two', or 'three', if ($modeN & 7) {...
To check 'two' or 'three', if ($modeN & 6) {...
To check 'one' OR ('two' AND 'three'), which is only meaningful if $modeN can hold several bits at once, if ($modeN & 1 || ($modeN & 6) == 6) {...
Does that work for you?
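As a rough sketch of the bit-flag idea in Perl: the %MODE_BIT table, the mask variable and the helper mode_matches are illustrative assumptions, and the bit values simply continue the 1, 2, 4 pattern described above.

```perl
use strict;
use warnings;
use 5.010;    # for the // (defined-or) operator

# Assumed mapping from mode name to a single bit.
my %MODE_BIT = (one => 1, two => 2, three => 4, four => 8, five => 16, six => 32);

# A mask is the bitwise OR of the bits for every mode it should accept.
my $ONE_TO_THREE = $MODE_BIT{one} | $MODE_BIT{two} | $MODE_BIT{three};

# Returns 1 if the bit for $mode is set in $mask, 0 otherwise.
sub mode_matches {
    my ($mode, $mask) = @_;
    my $bit = $MODE_BIT{$mode} // 0;    # unknown modes match nothing
    return ($bit & $mask) ? 1 : 0;
}

say "matched" if mode_matches('two', $ONE_TO_THREE);
```

Whether this reads better than a named hash of modes is a judgment call; it mainly pays off when many different combinations have to be tested.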

Related

Check specific value not in array

I have an array with two values and need to perform some operations if input is not in that array.
I tried like
if ($a ne ('value1' || 'value2')
if (($a ne 'value1' ) || ($a ne 'value2' ))
Both methods didn't work. Can anyone please help?
You could use the none function from List::MoreUtils.
If you really have an array as your subject line says then your code would look like this
use List::MoreUtils 'none';
if ( none { $_ eq $a } @array ) {
# Do stuff
}
or if you really have two constants then you could use this
if ( none { $_ eq $a } 'value1', 'value2' ) {
# Do stuff
}
but in this case I would prefer to see just
if ( $a ne 'value1' and $a ne 'value2' ) {
# Do stuff
}
$a is not in the array if it's different to the first element and it's different to the second one, too.
if ($x ne 'value1' and $x ne 'value2') {
For a real array of any size:
if (not grep $_ eq $x, @array) {
(I use $x instead of $a, as $a is special - see perlvar.)
if ($a ne ('value1' || 'value2')
evaluates to
if ($a ne 'value1')
and
if (($a ne 'value1' ) || ($a ne 'value2' ))
is always TRUE.
You might try
if ($a ne 'value1' and $a ne 'value2')
or
if (!grep{$a eq $_} 'value1', 'value2')
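To see why the || version can never be false, it helps to run both conditions side by side. This is a small illustrative sketch; the helper names broken_check and correct_check are made up for the example.

```perl
use strict;
use warnings;

# The asker's second attempt: a value cannot equal both strings at once,
# so at least one of the two "ne" tests always succeeds.
sub broken_check {
    my ($x) = @_;
    return ($x ne 'value1' || $x ne 'value2') ? 1 : 0;
}

# De Morgan: "not (A or B)" is "not A and not B".
sub correct_check {
    my ($x) = @_;
    return ($x ne 'value1' && $x ne 'value2') ? 1 : 0;
}

for my $x (qw(value1 value2 other)) {
    printf "%-6s broken=%d correct=%d\n", $x, broken_check($x), correct_check($x);
}
```

The broken column is 1 for every input, while the correct column is 1 only for the value that is in neither position.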
Building on the smartmatch solution by @Dilbertino (nice nick) using match::simple by @tobyink to ease the pain of smartmatch going away (I miss it already):
use match::simple;
my @array = qw(abcd.txt abcdeff.txt abcdweff.txt abcdefrgt.txt);
my $x="abcd.txt" ;
say "it's there" if ($x |M| \@array );
The |M| operator from match::simple can be replaced with a match function which speeds things up a bit (it is implemented with XS):
use match::simple qw(match);
my @array = qw(abcd.txt abcdeff.txt abcdweff.txt abcdefrgt.txt);
my $x="xyz.txt" ;
if ( match ( $x, \@array ) ) {
say "it's there!" ;
}
else {
say "no hay nada";
}
It's "simple" because the RHS controls the behavior. With match::simple if you are matching against an array on the RHS it should be an arrayref.
Smart::Match also has a none function. To use it you would do:
if ( $x ~~ none (@array) ) {
say "not here so do stuff ...";
}
Appendix
Discussion here on Stack Overflow (see: Perl 5.20 and the fate of smart matching and given-when?) and elsewhere (cf. the PerlMonks article by @ikegami from circa perl-5.18) gives the context for the smartmatch experiment. TL;DR: things might change in the future, but meanwhile you can go back in time and use match::smart qw(match); with perl-5.8.9, proving once again that perl never dies; it just returns to its ecosystem.
In the future something like Smart::Match (i.e. the non-core CPAN module not the concept) can help supercharge a simplified smart matching operator with helper functions that read like adverbs and adjectives and have the added bonus (as I understand it) of clarifying/simplifying things for perl itself since the ~~ operator will have a less ambiguous context for its operations.
I would do something like this using grep with a regex match
#!/usr/bin/perl
use warnings;
use strict;
my @array = ('value1','value2');
my $input = 'value3';
# \Q...\E quotes the input so it is matched literally
if(grep(/^\Q$input\E$/, @array)){
print "Found\n";
}
else {
print "do something\n";
}
You can also use the smart match operator:
unless( $x ~~ ['value1','value2'] )
Your variable $a is a scalar; without an [INDEX] subscript it is not treated as part of an array.
Two value array:
$array[0] = "X";
$array[1] = "Y";
or
@array = qw/X Y/;
Condition check using if:
if ( $array[0] ne "Your-String" && $array[1] ne "Your-String")

How to check if several variables are empty in Perl

I have a Perl script where variables must be initialized before the script can proceed. A lengthy if statement where I check each variable is the obvious choice. But maybe there is a more elegant or concise way to check several variables.
Edit:
I don't need to check for "defined", they are always defined with an empty string, I need to check that all are non-empty.
Example:
my ($a, $b, $c) = ("", "", "");
# If-clauses for setting the variables here
if( !$a || !$b || !$c) {
print "Init failed\n";
}
I am assuming that empty means the empty string, not just any false value. That is, if 0 or "0" are ever valid values post-initialization, the currently accepted answer will give you the wrong result:
use strict; use warnings;
my ($x, $y, $z) = ('0') x 3;
# my ($x, $y, $z) = ('') x 3;
for my $var ($x, $y, $z) {
die "Not properly initialized\n" unless defined($var) and length $var;
}
Now, this is pretty useless as a validation, because, more than likely, you would like to know which variable was not properly initialized if this situation occurs.
You would be better served by keeping your configuration parameters in a hash so you can easily check which ones were properly initialized.
use strict; use warnings;
my %params = (
x => 0,
y => '',
z => undef,
);
while ( my ($k, $v) = each %params ) {
validate_nonempty($v)
or die "'$k' was not properly initialized\n";
}
sub validate_nonempty {
my ($v) = @_;
defined($v) and length $v;
}
Or, if you want to list all that were not properly initialized:
my @invalid = grep is_not_initialized($params{$_}), keys %params;
die "Not properly initialized: @invalid\n" if @invalid;
sub is_not_initialized {
my ($v) = @_;
not ( defined($v) and length $v );
}
use List::MoreUtils 'all';
say 'Yes' if (all { defined } $var1, $var2, $var3);
What do you mean by "initialized"? Have values that are not "undef"?
For a small number of values, the straightforward if check is IMHO the most readable/maintainable.
if (!$var1 || !$var2 || !$var3) {
print "ERROR: Some are not defined!";
}
By the way, checking !$var is a possible bug in that "0" is false in Perl and thus a string initialized to "0" would fail this check. It's a lot better to use $var eq ""
Or better yet, space things out for >3 values
if (!$var1 # Use this if your values are guaranteed not to be "0"
|| $var2 eq "" # This is a LOT better since !$var fails on "0" value
|| $var3 eq "") {
print "ERROR: Some are not defined!";
}
If there are so many values to check that the above becomes hard to read (though with per-line check as in the second example, it doesn't really ever happen), or if the values are stored in an array, you can use grep to abstract away the checking:
# We use a "length" check instead of "$_ eq ''" as per tchrist's comment below
if (grep { !length } ($var1, $var2, $var3, $var4, $var5, @more_args) ) {
print "ERROR: Some are not defined!";
}
If you must know WHICH of the values are not defined, you can use a for loop (left as an obvious exercise for the reader), or a map trick:
my $i = -1; # we will be pre-incrementing
if (my @undefined_indexes = map { $i++; $_ ? () : $i }
($var1, $var2, $var3, $var4, $var5, @more_args) ) {
print "ERROR: Value # $_ not defined!\n" foreach @undefined_indexes;
}
use List::Util 'first';
if (defined first { $_ eq "" } $a, $b, $c) {
warn "empty";
}
Your way is readable and easy to understand which means it's easy to maintain. Restating your boolean using de Morgan's laws:
if (not($a and $b and $c)) {
warn(qq(Not all variables are initialized!))
}
That way, you're not prefixing not in front of every variable, and it doesn't affect readability. You can use List::Util or List::MoreUtils, but they don't really add to the legibility.
As Sinan Ünür stated, if you put the variables in a hash, you could parse through the hash and then list which variables weren't initialized. This might be best if there are a lot of these variables, and the list keeps changing.
foreach my $variable (qw(a b c d e f g h i j)) {
if (not $param{$variable}) {
warn qq(You didn't define $variable\n);
}
}
You can use Getopt::Long to put your parameter values inside a hash instead of separate variables. Plus, the latest versions of Getopt::Long can now operate on any array and not just @ARGV.
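A hedged sketch of that approach, using GetOptionsFromArray from Getopt::Long to fill a hash and then reporting which entries are still empty. The option names host and user and the helper parse_args are invented for the illustration.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parses the given argument list into a hash and returns the hash plus
# the names of all parameters that are still empty strings.
sub parse_args {
    my @args  = @_;
    my %param = (host => '', user => '');    # hypothetical parameters
    GetOptionsFromArray(\@args, \%param, 'host=s', 'user=s')
        or die "Bad options\n";
    my @missing = grep { !length($param{$_}) } sort keys %param;
    return (\%param, \@missing);
}

my ($param, $missing) = parse_args('--host', 'example.org');
print "Init failed, missing: @$missing\n" if @$missing;
```

Because the empty test uses length rather than plain truthiness, a value of "0" still counts as initialized.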

How do I search a Perl array for a matching string?

What is the smartest way of searching through an array of strings for a matching string in Perl?
One caveat, I would like the search to be case-insensitive
so "aAa" would be in ("aaa","bbb")
It depends on what you want the search to do:
if you want to find all matches, use the built-in grep:
my @matches = grep { /pattern/ } @list_of_strings;
if you want to find the first match, use first in List::Util:
use List::Util 'first';
my $match = first { /pattern/ } @list_of_strings;
if you want to find the count of all matches, use true in List::MoreUtils:
use List::MoreUtils 'true';
my $count = true { /pattern/ } @list_of_strings;
if you want to know the index of the first match, use first_index in List::MoreUtils:
use List::MoreUtils 'first_index';
my $index = first_index { /pattern/ } @list_of_strings;
if you want to simply know if there was a match, but you don't care which element it was or its value, use any in List::Util:
use List::Util 1.33 'any';
my $match_found = any { /pattern/ } @list_of_strings;
All these examples do similar things at their core, but their implementations have been heavily optimized to be fast, and will be faster than any pure-perl implementation that you might write yourself with grep, map or a for loop.
Note that the algorithm for doing the looping is a separate issue than performing the individual matches. To match a string case-insensitively, you can simply use the i flag in the pattern: /pattern/i. You should definitely read through perldoc perlre if you have not previously done so.
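Applied to the question as asked, a case-insensitive membership test might look like the following sketch. The helper name contains_ci is hypothetical; \Q...\E is used so that characters such as . or * in the search string are matched literally, and the \A and \z anchors make it a whole-string comparison rather than a substring search.

```perl
use strict;
use warnings;
use List::Util 1.33 qw(any);

# Returns 1 if $wanted equals some element of @$list when case is ignored.
sub contains_ci {
    my ($wanted, $list) = @_;
    my $re = qr/\A\Q$wanted\E\z/i;
    return (any { $_ =~ $re } @$list) ? 1 : 0;
}

print "found\n" if contains_ci('aAa', [qw(aaa bbb)]);
```

Dropping the anchors turns this back into a substring match, which may or may not be what "matching" means for a given list.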
I guess
@foo = ("aAa", "bbb");
@bar = grep(/^aaa/i, @foo);
print join ",",@bar;
would do the trick.
Perl 5.10+ contains the 'smart-match' operator ~~, which returns true if a certain element is contained in an array or hash, and false if it isn't (see perlfaq4).
The nice thing is that it also supports regexes, meaning that your case-insensitive requirement can easily be taken care of:
use strict;
use warnings;
use 5.010;
my @array = qw/aaa bbb/;
my $wanted = 'aAa';
say "'$wanted' matches!" if /$wanted/i ~~ @array; # Prints "'aAa' matches!"
If you will be doing many searches of the array, AND matching always is defined as string equivalence, then you can normalize your data and use a hash.
my @strings = qw( aAa Bbb cCC DDD eee );
my %string_lut;
# Init via slice:
@string_lut{ map uc, @strings } = ();
# or use a for loop:
# for my $string ( @strings ) {
# $string_lut{ uc($string) } = undef;
# }
# Look for a string:
my $search = 'AAa';
print "'$search' ",
( exists $string_lut{ uc $search } ? "IS" : "is NOT" ),
" in the array\n";
Let me emphasize that doing a hash lookup is good if you are planning on doing many lookups on the array. Also, it will only work if matching means that $foo eq $bar, or other requirements that can be met through normalization (like case insensitivity).
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my @bar = qw(aaa bbb);
my @foo = grep {/aAa/i} @bar;
print Dumper \@foo;
Perl string match can also be used for a simple yes/no.
my @foo=("hello", "world", "foo", "bar");
if ("@foo" =~ /\bhello\b/){
print "found";
}
else{
print "not found";
}
For just a boolean match result or for a count of occurrences, you could use:
use 5.014; use strict; use warnings;
my @foo=('hello', 'world', 'foo', 'bar', 'hello world', 'HeLlo');
my $patterns=join(',',@foo);
for my $str (qw(quux world hello hEllO)) {
my $count=map {m/^$str$/i} @foo;
if ($count) {
print "I found '$str' $count time(s) in '$patterns'\n";
} else {
print "I could not find '$str' in the pattern list\n"
};
}
Output:
I could not find 'quux' in the pattern list
I found 'world' 1 time(s) in 'hello,world,foo,bar,hello world,HeLlo'
I found 'hello' 2 time(s) in 'hello,world,foo,bar,hello world,HeLlo'
I found 'hEllO' 2 time(s) in 'hello,world,foo,bar,hello world,HeLlo'
This does not require a module.
Of course, it is less extensible and versatile than some of the code above.
I use this to match interactive user answers against a predefined set of case-insensitive answers.

How can I verify that a value is present in an array (list) in Perl?

I have a list of possible values:
@a = qw(foo bar baz);
How do I check in a concise way that a value $val is present or absent in @a?
An obvious implementation is to loop over the list, but I am sure TMTOWTDI.
Thanks to all who answered! The three answers I would like to highlight are:
The accepted answer - the most "built-in" and backward-compatible way.
RET's answer is the cleanest, but only good for Perl 5.10 and later.
draegtun's answer is (possibly) a bit faster, but requires using an additional module. I do not like adding dependencies if I can avoid them, and in this case do not need the performance difference, but if you have a 1,000,000-element list you might want to give this answer a try.
If you have perl 5.10, use the smart-match operator ~~
print "Exist\n" if $var ~~ @array;
It's almost magic.
Perl's built-in grep() function is designed to do this.
@matches = grep( /^MyItem$/, @someArray );
or you can insert any expression into the matcher
@matches = grep( $_ eq $val, @a );
This is answered in perlfaq4's answer to "How can I tell whether a certain element is contained in a list or array?".
To search the perlfaq, you could search through the list of all questions in perlfaq using your favorite browser.
From the command line, you can use the -q switch to perldoc to search for keywords. You would have found your answer by searching for "list":
perldoc -q list
(portions of this answer contributed by Anno Siegel and brian d foy)
Hearing the word "in" is an indication that you probably should have used a hash, not a list or array, to store your data. Hashes are designed to answer this question quickly and efficiently. Arrays aren't.
That being said, there are several ways to approach this. In Perl 5.10 and later, you can use the smart match operator to check that an item is contained in an array or a hash:
use 5.010;
if( $item ~~ @array )
{
say "The array contains $item"
}
if( $item ~~ %hash )
{
say "The hash contains $item"
}
With earlier versions of Perl, you have to do a bit more work. If you are going to make this query many times over arbitrary string values, the fastest way is probably to invert the original array and maintain a hash whose keys are the first array's values:
@blues = qw/azure cerulean teal turquoise lapis-lazuli/;
%is_blue = ();
for (@blues) { $is_blue{$_} = 1 }
Now you can check whether $is_blue{$some_color}. It might have been a good idea to keep the blues all in a hash in the first place.
If the values are all small integers, you could use a simple indexed array. This kind of an array will take up less space:
@primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31);
@is_tiny_prime = ();
for (@primes) { $is_tiny_prime[$_] = 1 }
# or simply @is_tiny_prime[@primes] = (1) x @primes;
Now you check whether $is_tiny_prime[$some_number].
If the values in question are integers instead of strings, you can save quite a lot of space by using bit strings instead:
@articles = ( 1..10, 150..2000, 2017 );
undef $read;
for (@articles) { vec($read,$_,1) = 1 }
Now check whether vec($read,$n,1) is true for some $n.
These methods guarantee fast individual tests but require a re-organization of the original list or array. They only pay off if you have to test multiple values against the same array.
If you are testing only once, the standard module List::Util exports the function first for this purpose. It works by stopping once it finds the element. It's written in C for speed, and its Perl equivalent looks like this subroutine:
sub first (&@) {
my $code = shift;
foreach (@_) {
return $_ if &{$code}();
}
undef;
}
If speed is of little concern, the common idiom uses grep in scalar context (which returns the number of items that passed its condition) to traverse the entire list. This does have the benefit of telling you how many matches it found, though.
my $is_there = grep $_ eq $whatever, @array;
If you want to actually extract the matching elements, simply use grep in list context.
my @matches = grep $_ eq $whatever, @array;
Use the first function from List::Util which comes as standard with Perl....
use List::Util qw/first/;
my @a = qw(foo bar baz);
if ( first { $_ eq 'bar' } @a ) { say "Found bar!" }
NB. first returns the first element it finds and so doesn't have to iterate through the complete list (which is what grep will do).
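One caveat when first is used as a truth test: because it returns the matching element itself, a match on a false value such as 0 or the empty string looks like "not found". The sketch below illustrates this; the helper names has_first, has_defined and has_any are made up for the example, and any requires List::Util 1.33 or later.

```perl
use strict;
use warnings;
use List::Util 1.33 qw(first any);

# Naive test: treats the returned element as a boolean.
sub has_first {
    my ($want, @list) = @_;
    my $hit = first { $_ == $want } @list;
    return $hit ? 1 : 0;
}

# Safer with first: a miss is undef, so test definedness instead.
sub has_defined {
    my ($want, @list) = @_;
    my $hit = first { $_ == $want } @list;
    return defined $hit ? 1 : 0;
}

# any returns a plain boolean and also stops at the first match.
sub has_any {
    my ($want, @list) = @_;
    return (any { $_ == $want } @list) ? 1 : 0;
}

printf "first=%d defined=%d any=%d\n",
    has_first(0, 3, 0, 7), has_defined(0, 3, 0, 7), has_any(0, 3, 0, 7);
```

The defined test still misreports a list that legitimately contains undef, so any is usually the simpler choice when only a yes/no answer is needed.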
One possible approach is to use List::MoreUtils 'any' function.
use List::MoreUtils qw/any/;
my @array = qw(foo bar baz);
print "Exist\n" if any {($_ eq "foo")} @array;
Update: corrected based on zoul's comment.
Interesting solution, especially for repeated searching:
my %hash;
map { $hash{$_}++ } @a;
print $hash{$val};
$ perl -e '@a = qw(foo bar baz);$val="bar";
if (grep{$_ eq $val} @a) {
print "found"
} else {
print "not found"
}'
found
$val='baq';
not found
If you don't like unnecessary dependency, implement any or first yourself
sub first (&@) {
my $code = shift;
$code->() and return $_ foreach @_;
undef
}
sub any (&@) {
my $code = shift;
$code->() and return 1 foreach @_;
undef
}

How do I find which elements in one array aren't in another?

I am new to programming and hence I am stuck on a basic level problem.
Following is code I wrote for comparison. But the result I get does not make sense to me. I would appreciate if someone could tell me what is going wrong.
There are two arrays, @array1 and @array2, of unequal length.
I wish to compare both and list down values not present in @array1.
my %temp = map {$_,$_}@array2;
for (@array1){
next if exists $temp{$_};
open (FILE, ">>/filename") or die "$!";
print FILE "$_\n";
close(FILE);
}
See the FAQ How do I compute the difference of two arrays? How do I compute the intersection of two arrays?
Adapting the code you posted:
#!/usr/bin/perl
use strict; use warnings;
my @x = 1 .. 10;
my @y = grep { $_ % 2 } @x;
my %lookup = map { $_ => undef } @y;
for my $x ( @x ) {
next if exists $lookup{$x};
print "$x\n";
}
If you're doing this for a test, which I assume you are, I would highly suggest is_deeply from the newer versions of Test::More.
You'll have to update Test::More
cpanp install Test::More
or if you're on perl 5.5
cpan Test::More
Then you'll have to use it
use Test::More tests => 1;
is_deeply ( \@arr1, \@arr2, 'test failed' );
If you're not doing this for testing, but you're doing this for introspective purposes and the arrays are small, I'd suggest using XXX:
cpanp install http://search.cpan.org/CPAN/authors/id/I/IN/INGY/XXX-0.12.tar.gz
Then you'll have to use it
use XXX;
YYY [ \@arr1, \@arr2 ];
That's some pretty clever code you've got there. Your code is more or less identical to what the Perl FAQ says. I might be tempted to do this, however:
my %tmp = map { $_ => 1 } @array2;
my @diff = grep { not exists $tmp{$_} } @array1;
This gets everything in @array1 that's not in @array2, while avoiding all of those out-of-style looping constructs (yay for functional programming). Though what I'd really do is this:
sub comp (\@\@) {
my %t = map { $_ => 1 } @{$_[1]};
return grep { not exists $t{$_} } @{$_[0]};
}
Then you can just do:
my @diff = comp(@array1, @array2); # get items in @array1 not in @array2
@diff = comp(@array2, @array1); # vice versa
Or you can go to CPAN. List::Compare::Functional::complement() does what you want, though the syntax is reversed.
Swap @array1 and @array2 in your code?
For simple values like strings or numbers, the following should work
use Data::Dumper;
my @result;
my $hosts = [qw(host1 host2 host3 host4 host5)];
my $site_obj = [qw(host1 host5 host6)];
# Keep each host that has no exact match in @$site_obj
@result = grep { my $h = $_; !grep { $_ eq $h } @$site_obj } @$hosts;
print Dumper (@result);
Should give :
$VAR1 = 'host2';
$VAR2 = 'host3';
$VAR3 = 'host4';