I need to search for a value in a perl array and if I find a match execute some code - perl

This is sort of what I am wanting to do. At present mm returns nothing, while searchname returns the expected value.
This is a perl script embedded in a web page.
I have tried numerous approaches to this code but nothing seems to provide the results I desire. I think it is just a case of syntax.
# search for an item
if ($modtype eq "search") {
$searchname=$modname;
print "Value of searchname $searchname\n";
my #mm = grep{$searchname} #names;
print "Value of mm #mm\n";
if ($mm eq $searchname) {
print "$searchname found!\n";
}
else {
print "$searchname not Found\n";
}
}

my #mm = grep { $_ eq $searchname } #names;
if (#mm) {
print "found\n";
}
grep takes a boolean expression, not just a variable. In that expression, $_ refers to the current list element. By using an equality comparison we get (in #mm) all elements of #names that are equal to $searchname, if any.
To check whether an array is empty, you can simply use it in boolean context, as in if (#mm).
If you don't care about the found elements themselves, just whether there are any, you can use grep in scalar context:
my $count = grep { $_ eq $searchname } #names;
if ($count > 0) {
print "found $count results\n";
}
This will give you the number of matching elements.
If you don't need to know that number, just whether there was any result at all, you can use any from List::Util:
use List::Util qw(any);
if (any { $_ eq $searchname } #names) {
...
}
If #names is big, this is potentially more efficient because it can stop after the first match is found.
I'm not sure what $mm refers to in your code. Did you start your code with use strict; use warnings;? If not, you should.

Looks like you misunderstand a couple of things.
my #mm = grep{$searchname} #names;
The grep() function takes two arguments. A block of code ({ $searchname }) and a list of values (#names). For each value in the list, it puts the value into $_ and executes the code block. If the code block returns a true value then the contents of $_ is added to the output list.
Your block of code ignores $_ and just checks for the value of $searchname. That is very likely to always be true, so all of the values from #names get copied into #mm.
I think it's more likely that you want:
my #mm = grep{ $_ eq $searchname } #names;
Secondly, you suddenly start using a new variable called $mm. I suspect you're getting confused between #mm and $mm which are completely different variables with no connection with each other.
I think what you're actually trying to do is to look at the first element of #mm so you want:
if ($mm[0] eq $searchname)
But, given that values only end up in #mm if they are equal to $searchname (because that's what your grep() does), I think you really just want to check whether or not anything ended up in #mm. So you should use:
if (#mm)
Which is, in my opinion, easier to understand.

Related

Explanation of Perl's syntax from module MoreUtils.pm

I am seeking explanation of the syntax of Perl's uniq and fidrstidx function from module MoreUtils.pm.
Having sought that, I already know other ways to get uniq array elements from an array having duplicate elements and finding the first index from an array by below ways :
## remove duplicate elements ##
my #arr = qw (2 4 2 8 3 4 6);
my #uniq = ();
my %hash = ();
#uniq = grep {!$hash{$_}++ } #arr;
### first index ###
#arr = qw (Java ooperl Ruby cgiperl Python);
my ($index) = grep {$arr[$_] =~ /perl/} 0..$#arr;
Can anybody please explain me second line of this below sub uniq function comprising map and ternary operator from MoreUtils.pm:
map {$h{$_}++ == 0 ? $_ : () } #;
and also
the &# passed to firstidx function and the below line in the body of the function :
local *_ = \$_[$i];
What I understand that sub routine ref is passed to firstidx. But a bit more detailed explanation will be much appreciated.
Thanks.
Your second question was answered in the comments.
Your first question asks about map {$h{$_}++ == 0 ? $_ : () } #; from List::MoreUtils. In recent versions, it's actually in List::MoreUtils::PP (for Pure Perl) since many of the subroutines are also implemented in C and XS. Here's the current version of the Pure Perl uniq:
sub uniq (#)
{
my %seen = ();
my $k;
my $seen_undef;
grep { defined $_ ? not $seen{ $k = $_ }++ : not $seen_undef++ } #_;
}
This has the same map technique although it's using grep instead. The grep goes through all of the elements in #_ and has to return either true or false for each of them. The elements which evaluate to true end up in the output list. The code then wants to make an element evaluate to true the first time it sees it and false the rest of the times.
In this code it handles undef separately. If the current element is not undef, it does the first branch of the conditional operator and the second branch otherwise. Now let's look at the branches.
The defined case adds an element to a hash. No one left code comments about the use of $k but it probably has something to do with not disturbing $_. That $k becomes the key for the hash:
not $seen{ $k = $_ }++
If that is the first time that key has been encountered the value of the hash is undef. That post-increment does its work after the value is used so hold off on thinking about that for a moment. The low-precendence not sees the value of $seen{$k}, which is undef. The not turns the false value of undef into true. That true indicates that the grep has seen $_ for the first time. It becomes part of the output list. Then the ++ does its work and increments the undef value to 1. On all subsequent encounters with the same value the hash value will be true. The not will turn the true value into false and that element won't be in the output list.
The map you show implements the grep. It returns an element when the condition is true and returns no elements when it is false:
map {$h{$_}++ == 0 ? $_ : () } #_;
For each element it adds it as the key in the hash and compares the value to 0. The first time an element is seen that value is undef. In numeric context an undef is 0. So, the == returns true and the first branch of the conditional operator fires, returning $_ to the output list. The ++ then increments the hash value from undef to 1. The next time it encounters the same value the hash value is not 0 and the second branch of the conditional operator returns the empty list. That adds no elements to the output list.
Newer version of List::MoreUtils don't use the construct any more, but as Сухой27 explained,
map { CONDITION ? $_ : () } LIST
is just a fancy alternative to
grep { CONDITION } LIST
I don't think there's any overarching reason the author chose map for this implementation, and in fact it was simplified to grep in later versions of List::MoreUtils.
The firstidx syntax is firstidx BLOCK LIST. Like the builtin map and grep, it is specified that the code in BLOCK will operate on the variable $_, and that the code is allowed to make changes to $_. So in the firstidx implementation, it is not sufficient to set $_ to each value in LIST. Rather, $_ must be aliased to each element of LIST so that a change in $_ inside BLOCK also results to a change in the element in the LIST. This is accomplished by manipulating the symbol table
local *_ = \$scalar # make $_ an alias of $scalar
And you use local so that when firstidx is done, we haven't clobbered any useful information that was previously in the $_ variable.

Perl: If two elements match print elements, else iterate until match and then print

I'm new to Perl and I'm trying to iterate over two elements of an array with multiple indices in each element and look for a match. If element2 matches element1, I want to print both and move to the next position in element1 and continue the loop looking for the next match. If I don't have a match, loop until I get a match. Here is what I have:
#array = split(',',$row);
foreach $element1(#array[1])
{
foreach $element2(#array[2])
{
if($element1 == $element2)
{
print "1 = $element1 : 2 = $element2 \n";
}
}
}
I'm not getting the the matched output. I've tried multiple iterations with different syntactical changes.
I can get both elements when I do this:
foreach $element1(#array[1])
{
foreach $element2(#array[2])
{
print "1 = $element1 : 2 = $element2 \n";
}
}
I thought I might not be dereferencing correctly. Any guidance or suggestions would be appreciated. Thanks.
There are a number of issues with your script. Briefly:
You should always use strict and warnings.
Array indices start at 0, not 1.
You get an element of an array with $array[0], not #array[0]. This is a common frustration for new Perl programmers. The thing to remember is that the sigil (the symbol preceding a variable name) indicates the type of value being passed (e.g. $scalar, #array, or %hash) to the left-hand side of the expression, not the type of datastructure being accessed on the right-hand side.
As #sp-asic pointed out in the comments on the OP, string comparisons are performed with eq, not ==.
References to datastructures are stored in scalars, and you dereference by prepending the sigil of the original datastructure. If $foo is a reference to an array, #$foo gets you the original array.
You apparently want to break out of your inner loops when you find a match, but you'll want to make it clear (for people who look at this code in the future, which may include yourself) which loop you're breaking out of.
Most critically, #array will be an array of strings after you split another string (the row) on commas, so it's not clear why you expect to be able to treat the strings in the first and second position as arrays that you can loop through. I have a few guesses about what you're actually trying to do, and what your inputs and expected outputs actually look like, but I'll wait for you to provide some additional information and leave the information above as general guidance in the meantime, along with a lightly-reworked version of your code below.
use strict;
use warnings;
my #array = split(',', $row);
foreach my $element1 (#$array[0]) {
foreach my $element2 (#$array[1]) {
if ($element1 eq $element2) {
print "1 = $element1 : 2 = $element2\n";
last;
}
}
}

Perl need the right grep operator to match value of variable

I want to see if I have repeated items in my array, there are over 16.000 so will automate it
There may be other ways but I started with this and, well, would like to finish it unless there is a straightforward command. What I am doing is shifting and pushing from one array into another and this way, check the destination array to see if it is "in array" (like there is such a command in PHP).
So, I got this sub routine and it works with literals, but it doesn't with variables. It is because of the 'eq' or whatever I should need. The 'sourcefile' will contain one or more of the words of the destination array.
// Here I just fetch my file
$listamails = <STDIN>;
# Remove the newlines filename
chomp $listamails;
# open the file, or exit
unless ( open(MAILS, $listamails) ) {
print "Cannot open file \"$listamails\"\n\n";
exit;
}
# Read the list of mails from the file, and store it
# into the array variable #sourcefile
#sourcefile = <MAILS>;
# Close the handle - we've read all the data into #sourcefile now.
close MAILS;
my #destination = ('hi', 'bye');
sub in_array
{
my ($destination,$search_for) = #_;
return grep {$search_for eq $_} #$destination;
}
for($i = 0; $i <=100; $i ++)
{
$elemento = shift #sourcefile;
if(in_array(\#destination, $elemento))
{
print "it is";
}
else
{
print "it aint there";
}
}
Well, if instead of including the $elemento in there I put a 'hi' it does work and also I have printed the value of $elemento which is also 'hi', but when I put the variable, it does not work, and that is because of the 'eq', but I don't know what else to put. If I put == it complains that 'hi' is not a numeric value.
When you want distinct values think hash.
my %seen;
#seen{ #array } = ();
if (keys %seen == #array) {
print "\#array has no duplicate values\n";
}
It's not clear what you want. If your first sentence is the only one that matters ("I want to see if I have repeated items in my array"), then you could use:
my %seen;
if (grep ++$seen{$_} >= 2, #array) {
say "Has duplicates";
}
You said you have a large array, so it might be faster to stop as soon as you find a duplicate.
my %seen;
for (#array) {
if (++$seen{$_} == 2) {
say "Has duplicates";
last;
}
}
By the way, when looking for duplicates in a large number of items, it's much faster to use a strategy based on sorting. After sorting the items, all duplicates will be right next to each other, so to tell if something is a duplicate, all you have to do is compare it with the previous one:
#sorted = sort #sourcefile;
for (my $i = 1; $i < #sorted; ++$i) { # Start at 1 because we'll check the previous one
print "$sorted[$i] is a duplicate!\n" if $sorted[$i] eq $sorted[$i - 1];
}
This will print multiple dupe messages if there are multiple dupes, but you can clean it up.
As eugene y said, hashes are definitely the way to go here. Here's a direct translation of the code you posted to a hash-based method (with a little more Perlishness added along the way):
my #destination = ('hi', 'bye');
my %in_array = map { $_ => 1 } #destination;
for my $i (0 .. 100) {
$elemento = shift #sourcefile;
if(exists $in_array{$elemento})
{
print "it is";
}
else
{
print "it aint there";
}
}
Also, if you mean to check all elements of #sourcefile (as opposed to testing the first 101 elements) against #destination, you should replace the for line with
while (#sourcefile) {
Also also, don't forget to chomp any values read from a file! Lines read from a file have a linebreak at the end of them (the \r\n or \n mentioned in comments on the initial question), which will cause both eq and hash lookups to report that otherwise-matching values are different. This is, most likely, the reason why your code is failing to work correctly in the first place and changing to use sort or hashes won't fix that. First chomp your input to make it work, then use sort or hashes to make it efficient.

Grep to find item in Perl array

Every time I input something the code always tells me that it exists. But I know some of the inputs do not exist. What is wrong?
#!/usr/bin/perl
#array = <>;
print "Enter the word you what to match\n";
chomp($match = <STDIN>);
if (grep($match, #array)) {
print "found it\n";
}
The first arg that you give to grep needs to evaluate as true or false to indicate whether there was a match. So it should be:
# note that grep returns a list, so $matched needs to be in brackets to get the
# actual value, otherwise $matched will just contain the number of matches
if (my ($matched) = grep $_ eq $match, #array) {
print "found it: $matched\n";
}
If you need to match on a lot of different values, it might also be worth for you to consider putting the array data into a hash, since hashes allow you to do this efficiently without having to iterate through the list.
# convert array to a hash with the array elements as the hash keys and the values are simply 1
my %hash = map {$_ => 1} #array;
# check if the hash contains $match
if (defined $hash{$match}) {
print "found it\n";
}
You seem to be using grep() like the Unix grep utility, which is wrong.
Perl's grep() in scalar context evaluates the expression for each element of a list and returns the number of times the expression was true.
So when $match contains any "true" value, grep($match, #array) in scalar context will always return the number of elements in #array.
Instead, try using the pattern matching operator:
if (grep /$match/, #array) {
print "found it\n";
}
This could be done using List::Util's first function:
use List::Util qw/first/;
my #array = qw/foo bar baz/;
print first { $_ eq 'bar' } #array;
Other functions from List::Util like max, min, sum also may be useful for you
In addition to what eugene and stevenl posted, you might encounter problems with using both <> and <STDIN> in one script: <> iterates through (=concatenating) all files given as command line arguments.
However, should a user ever forget to specify a file on the command line, it will read from STDIN, and your code will wait forever on input
I could happen that if your array contains the string "hello", and if you are searching for "he", grep returns true, although, "he" may not be an array element.
Perhaps,
if (grep(/^$match$/, #array)) more apt.
You can also check single value in multiple arrays like,
if (grep /$match/, #array, #array_one, #array_two, #array_Three)
{
print "found it\n";
}

What perl code samples can lead to undefined behaviour?

These are the ones I'm aware of:
The behaviour of a "my" statement modified with a statement modifier conditional or loop construct (e.g. "my $x if ...").
Modifying a variable twice in the same statement, like $i = $i++;
sort() in scalar context
truncate(), when LENGTH is greater than the length of the file
Using 32-bit integers, "1 << 32" is undefined. Shifting by a negative number of bits is also undefined.
Non-scalar assignment to "state" variables, e.g. state #a = (1..3).
One that is easy to trip over is prematurely breaking out of a loop while iterating through a hash with each.
#!/usr/bin/perl
use strict;
use warnings;
my %name_to_num = ( one => 1, two => 2, three => 3 );
find_name(2); # works the first time
find_name(2); # but fails this time
exit;
sub find_name {
my($target) = #_;
while( my($name, $num) = each %name_to_num ) {
if($num == $target) {
print "The number $target is called '$name'\n";
return;
}
}
print "Unable to find a name for $target\n";
}
Output:
The number 2 is called 'two'
Unable to find a name for 2
This is obviously a silly example, but the point still stands - when iterating through a hash with each you should either never last or return out of the loop; or you should reset the iterator (with keys %hash) before each search.
These are just variations on the theme of modifying a structure that is being iterated over:
map, grep and sort where the code reference modifies the list of items to sort.
Another issue with sort arises where the code reference is not idempotent (in the comp sci sense)--sort_func($a, $b) must always return the same value for any given $a and $b.