Grep to find item in Perl array - perl

Every time I input something the code always tells me that it exists. But I know some of the inputs do not exist. What is wrong?
#!/usr/bin/perl
@array = <>;
print "Enter the word you want to match\n";
chomp($match = <STDIN>);
if (grep($match, @array)) {
print "found it\n";
}

The first arg that you give to grep needs to evaluate as true or false to indicate whether there was a match. So it should be:
# note that grep returns a list, so $matched needs to be in brackets to get the
# actual value, otherwise $matched will just contain the number of matches
if (my ($matched) = grep $_ eq $match, @array) {
print "found it: $matched\n";
}
If you need to look up many different values, consider putting the array data into a hash, since a hash lets you check for a key efficiently without iterating through the whole list each time.
# convert the array to a hash with the array elements as keys and 1 as every value
my %hash = map {$_ => 1} @array;
# check if the hash contains $match
if (defined $hash{$match}) {
print "found it\n";
}
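A minimal sketch of the hash approach when several words need checking. The data and the name %lookup are made up for illustration; exists is used here because it tests for the key itself rather than for a defined value.

```perl
use strict;
use warnings;

my @array  = qw(foo bar baz);            # illustrative data
my %lookup = map { $_ => 1 } @array;     # build the hash once

# Each check is a single hash lookup, not a pass over @array.
for my $word (qw(bar qux)) {
    my $status = exists $lookup{$word} ? "found" : "not found";
    print "$word: $status\n";
}
```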

You seem to be using grep() like the Unix grep utility, which is wrong.
Perl's grep() in scalar context evaluates the expression for each element of a list and returns the number of times the expression was true.
So when $match contains any "true" value, grep($match, @array) in scalar context will always return the number of elements in @array.
Instead, try using the pattern matching operator:
if (grep /$match/, @array) {
print "found it\n";
}
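A small sketch of the difference, using a made-up word list (@words and the search term are illustrative only). The first grep never looks at $_, so it counts every element; the second compares each element with $match.

```perl
use strict;
use warnings;

my @words = qw(apple banana cherry);   # illustrative data
my $match = 'durian';                  # deliberately not in the list

# The expression is just $match, which is true for any string other
# than "" or "0", so every element is counted.
my $always = grep($match, @words);

# Comparing each element ($_) with $match counts only real matches.
my $exact = grep { $_ eq $match } @words;

print "expression ignoring \$_ counted: $always\n";
print "expression comparing \$_ counted: $exact\n";
```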

This could be done using List::Util's first function:
use List::Util qw/first/;
my @array = qw/foo bar baz/;
print first { $_ eq 'bar' } @array;
Other functions from List::Util, such as max, min and sum, may also be useful to you.

In addition to what eugene and stevenl posted, you might run into problems using both <> and <STDIN> in one script: <> reads through all files given as command-line arguments, treating them as one concatenated stream.
However, should a user ever forget to specify a file on the command line, <> will read from STDIN instead, and your code will wait for input until end-of-file is sent.

It could happen that if your array contains the string "hello" and you search for "he", grep returns true even though "he" is not an array element.
Perhaps
if (grep(/^$match$/, @array))
is more apt.
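One further caveat when interpolating user input into a pattern: regex metacharacters in the input keep their special meaning. The sketch below, on made-up data, compares a bare pattern, an anchored pattern, and an anchored pattern protected with \Q...\E. For whole-element matching, plain eq remains the simplest choice.

```perl
use strict;
use warnings;

my @array = ('hello', 'abc');   # illustrative data

my $substring = grep { /he/ } @array;      # matches inside "hello"
my $anchored  = grep { /^he$/ } @array;    # whole element only

my $match    = 'a.c';                               # "." is a regex wildcard
my $unquoted = grep { /^$match$/ } @array;          # pattern a.c matches "abc"
my $quoted   = grep { /^\Q$match\E$/ } @array;      # "." is taken literally

print "substring=$substring anchored=$anchored ",
      "unquoted=$unquoted quoted=$quoted\n";
```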

You can also check for a single value across multiple arrays, like this:
if (grep /$match/, @array, @array_one, @array_two, @array_Three)
{
print "found it\n";
}

Related

I need to search for a value in a perl array and if I find a match execute some code

This is sort of what I am wanting to do. At present mm returns nothing, while searchname returns the expected value.
This is a perl script embedded in a web page.
I have tried numerous approaches to this code but nothing seems to provide the results I desire. I think it is just a case of syntax.
# search for an item
if ($modtype eq "search") {
$searchname=$modname;
print "Value of searchname $searchname\n";
my @mm = grep{$searchname} @names;
print "Value of mm @mm\n";
if ($mm eq $searchname) {
print "$searchname found!\n";
}
else {
print "$searchname not Found\n";
}
}
my @mm = grep { $_ eq $searchname } @names;
if (@mm) {
print "found\n";
}
grep takes a boolean expression, not just a variable. In that expression, $_ refers to the current list element. By using an equality comparison we get (in @mm) all elements of @names that are equal to $searchname, if any.
To check whether an array is empty, you can simply use it in boolean context, as in if (@mm).
If you don't care about the found elements themselves, just whether there are any, you can use grep in scalar context:
my $count = grep { $_ eq $searchname } @names;
if ($count > 0) {
print "found $count results\n";
}
This will give you the number of matching elements.
If you don't need to know that number, just whether there was any result at all, you can use any from List::Util:
use List::Util qw(any);
if (any { $_ eq $searchname } @names) {
...
}
If @names is big, this is potentially more efficient because it can stop after the first match is found.
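To make the early exit visible, here is a sketch that counts how often each block runs. The name list and the two counters are invented for the illustration; any needs List::Util 1.33 or later.

```perl
use strict;
use warnings;
use List::Util qw(any);   # any() needs List::Util 1.33 or later

my @names      = qw(alice bob carol dave);   # illustrative data
my $searchname = 'bob';

my ($grep_calls, $any_calls) = (0, 0);

# grep always tests every element.
my $count = grep { $grep_calls++; $_ eq $searchname } @names;

# any stops as soon as the block returns true.
my $found = any { $any_calls++; $_ eq $searchname } @names;

print "grep tested $grep_calls elements, any tested $any_calls\n";
```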
I'm not sure what $mm refers to in your code. Did you start your code with use strict; use warnings;? If not, you should.
Looks like you misunderstand a couple of things.
my @mm = grep{$searchname} @names;
The grep() function takes two arguments. A block of code ({ $searchname }) and a list of values (@names). For each value in the list, it puts the value into $_ and executes the code block. If the code block returns a true value then the contents of $_ is added to the output list.
Your block of code ignores $_ and just checks for the value of $searchname. That is very likely to always be true, so all of the values from @names get copied into @mm.
I think it's more likely that you want:
my @mm = grep{ $_ eq $searchname } @names;
Secondly, you suddenly start using a new variable called $mm. I suspect you're getting confused between @mm and $mm which are completely different variables with no connection with each other.
I think what you're actually trying to do is to look at the first element of @mm so you want:
if ($mm[0] eq $searchname)
But, given that values only end up in @mm if they are equal to $searchname (because that's what your grep() does), I think you really just want to check whether or not anything ended up in @mm. So you should use:
if (@mm)
Which is, in my opinion, easier to understand.

Explanation of Perl's syntax from module MoreUtils.pm

I am seeking an explanation of the syntax of Perl's uniq and firstidx functions from the module MoreUtils.pm.
That said, I already know other ways to get the unique elements of an array containing duplicates, and to find the first matching index in an array, as shown below:
## remove duplicate elements ##
my @arr = qw (2 4 2 8 3 4 6);
my @uniq = ();
my %hash = ();
@uniq = grep {!$hash{$_}++ } @arr;
### first index ###
@arr = qw (Java ooperl Ruby cgiperl Python);
my ($index) = grep {$arr[$_] =~ /perl/} 0..$#arr;
Can anybody please explain the second line of the uniq sub below, which combines map and the ternary operator, from MoreUtils.pm:
map {$h{$_}++ == 0 ? $_ : () } @_;
and also
the &@ in the declaration of the firstidx function, and the line below in the body of the function:
local *_ = \$_[$i];
What I understand is that a subroutine reference is passed to firstidx, but a more detailed explanation would be much appreciated.
Thanks.
Your second question was answered in the comments.
Your first question asks about map {$h{$_}++ == 0 ? $_ : () } @_; from List::MoreUtils. In recent versions, it's actually in List::MoreUtils::PP (for Pure Perl) since many of the subroutines are also implemented in C and XS. Here's the current version of the Pure Perl uniq:
sub uniq (@)
{
my %seen = ();
my $k;
my $seen_undef;
grep { defined $_ ? not $seen{ $k = $_ }++ : not $seen_undef++ } @_;
}
This has the same map technique although it's using grep instead. The grep goes through all of the elements in @_ and has to return either true or false for each of them. The elements which evaluate to true end up in the output list. The code then wants to make an element evaluate to true the first time it sees it and false the rest of the times.
In this code it handles undef separately. If the current element is not undef, it does the first branch of the conditional operator and the second branch otherwise. Now let's look at the branches.
The defined case adds an element to a hash. No one left code comments about the use of $k but it probably has something to do with not disturbing $_. That $k becomes the key for the hash:
not $seen{ $k = $_ }++
If that is the first time that key has been encountered the value of the hash is undef. That post-increment does its work after the value is used so hold off on thinking about that for a moment. The low-precedence not sees the value of $seen{$k}, which is undef. The not turns the false value of undef into true. That true indicates that the grep has seen $_ for the first time. It becomes part of the output list. Then the ++ does its work and increments the undef value to 1. On all subsequent encounters with the same value the hash value will be true. The not will turn the true value into false and that element won't be in the output list.
The map you show implements the grep. It returns an element when the condition is true and returns no elements when it is false:
map {$h{$_}++ == 0 ? $_ : () } @_;
For each element it adds it as the key in the hash and compares the value to 0. The first time an element is seen that value is undef. In numeric context an undef is 0. So, the == returns true and the first branch of the conditional operator fires, returning $_ to the output list. The ++ then increments the hash value from undef to 1. The next time it encounters the same value the hash value is not 0 and the second branch of the conditional operator returns the empty list. That adds no elements to the output list.
Newer versions of List::MoreUtils don't use this construct any more, but as Сухой27 explained,
map { CONDITION ? $_ : () } LIST
is just a fancy alternative to
grep { CONDITION } LIST
I don't think there's any overarching reason the author chose map for this implementation, and in fact it was simplified to grep in later versions of List::MoreUtils.
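The equivalence can be checked with a short sketch. The input list is the one from the question; the variable names are illustrative.

```perl
use strict;
use warnings;

my @input = qw(2 4 2 8 3 4 6);

# map form: return the element on first sight, the empty list otherwise.
my %h;
my @via_map = map { $h{$_}++ == 0 ? $_ : () } @input;

# grep form: keep the element when the block is true.
my %seen;
my @via_grep = grep { !$seen{$_}++ } @input;

print "map:  @via_map\n";
print "grep: @via_grep\n";
```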
The firstidx syntax is firstidx BLOCK LIST. Like the builtin map and grep, it is specified that the code in BLOCK will operate on the variable $_, and that the code is allowed to make changes to $_. So in the firstidx implementation, it is not sufficient to set $_ to each value in LIST. Rather, $_ must be aliased to each element of LIST so that a change in $_ inside BLOCK also results in a change in the element in the LIST. This is accomplished by manipulating the symbol table:
local *_ = \$scalar # make $_ an alias of $scalar
And you use local so that when firstidx is done, we haven't clobbered any useful information that was previously in the $_ variable.
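Putting the pieces together, a firstidx-style function might look roughly like this. The name my_firstidx is hypothetical and the body is only a sketch of the aliasing technique described above; the real module's code may differ in detail. The (&@) prototype is what lets the caller pass a bare block followed by a list.

```perl
use strict;
use warnings;

# Hypothetical re-implementation for illustration only.
sub my_firstidx (&@) {
    my $code = shift;                # the BLOCK arrives as a code reference
    for my $i (0 .. $#_) {
        local *_ = \$_[$i];          # alias $_ to the i-th argument
        return $i if $code->();
    }
    return -1;                       # no element satisfied the block
}

my @arr   = qw(Java ooperl Ruby cgiperl Python);
my $index = my_firstidx { /perl/ } @arr;

# Because $_ is an alias, a change made inside the block reaches @arr.
my_firstidx { s/^Java$/JAVA/; 0 } @arr;

print "first index: $index, first element is now: $arr[0]\n";
```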

Perl significance of $#_ variable

I see when I loop through elements of an array, and test $#_ , I get -1 for each element. I am hoping someone can explain what this variable does, and what it is used for most often.
Just like $#foo is the last existing index of array @foo, $#_ is the last existing index of array @_. If @_ is empty, $#_ is -1.
It sounds like you mean to use $_. $_ is aliased by foreach, map and grep loops to the element currently being processed. while (<>) also sets $_ (as it gets rewritten to while (defined($_ = <>))). As a result, $_ is used as the default argument by many builtins (e.g. say).
# Print each element on its own line
say for @a;
is short for
# Print each element on its own line
say $_ for @a;
which is the terse form of
# Print each element on its own line
for my $ele (@a) {
say $ele;
}
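A short sketch contrasting the two variables. The sub name last_arg_index is invented for the example.

```perl
use strict;
use warnings;

# $#_ is the last index of @_, the current subroutine's argument list.
sub last_arg_index { return $#_ }

my $no_args    = last_arg_index();                # @_ is empty
my $three_args = last_arg_index('a', 'b', 'c');   # indices 0..2

# $_ is the variable that foreach aliases to each element in turn.
my @visited;
push @visited, $_ for qw(x y z);

print "no args: $no_args, three args: $three_args, visited: @visited\n";
```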
I believe you mean $_, which is a special variable in Perl. It holds the current element while looping through a list. For instance, the code below will print out each element of @foo, one at a time.
foreach (@foo) {
print $_;
}

Why does Perl's shift complain 'Type of arg 1 to shift must be array (not grep iterator).'?

I've got a data structure that is a hash that contains an array of hashes. I'd like to reach in there and pull out the first hash that matches a value I'm looking for. I tried this:
my $result = shift grep {$_->{name} eq 'foo'} @{$hash_ref->{list}};
But that gives me this error: Type of arg 1 to shift must be array (not grep iterator). I've re-read the perldoc for grep and I think what I'm doing makes sense. grep returns a list, right? Is it in the wrong context?
I'll use a temporary variable for now, but I'd like to figure out why this doesn't work.
A list isn't an array.
my ($result) = grep {$_->{name} eq 'foo'} @{$hash_ref->{list}};
… should do the job though. Take the return from grep in list context, but don't assign any of the values other than the first.
I think a better way to write this would be this:
use List::Util qw/first/;
my $result = first { $_->{name} eq 'foo' } @{ $hash_ref->{list} };
Not only will it be more clear what you're trying to do, it will also be faster because it will stop grepping your array once it has found the matching element.
Another way to do it:
my $result = (grep {$_->{name} eq 'foo'} @{$hash_ref->{list}})[0];
Note that the curlies around the first argument to grep are redundant in this case, so you can avoid block setup and teardown costs with
my $result = (grep $_->{name} eq 'foo', @{$hash_ref->{list}})[0];
“List value constructors” in perldata documents subscripting of lists:
A list value may also be subscripted like a normal array. You must put the list in parentheses to avoid ambiguity. For example:
# Stat returns list value.
$time = (stat($file))[8];
# SYNTAX ERROR HERE.
$time = stat($file)[8]; # OOPS, FORGOT PARENTHESES
# Find a hex digit.
$hexdigit = ('a','b','c','d','e','f')[$digit-10];
# A "reverse comma operator".
return (pop(@foo),pop(@foo))[0];
As I recall, we got this feature when Randal Schwartz jokingly suggested it, and Chip Salzenberg—who was a patching machine in those days—implemented it that evening.
Update: A bit of searching shows the feature I had in mind was $coderef->(@args). The commit message even logs the conversation!
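For comparison, the three working approaches from the answers above can be run side by side. The contents of $hash_ref are made up to match the shape described in the question.

```perl
use strict;
use warnings;
use List::Util qw(first);

# Hypothetical data shaped like the question's structure.
my $hash_ref = {
    list => [
        { name => 'bar', id => 1 },
        { name => 'foo', id => 2 },
        { name => 'foo', id => 3 },
    ],
};

# List assignment: keep the first result, discard the rest.
my ($by_assignment) = grep { $_->{name} eq 'foo' } @{ $hash_ref->{list} };

# List slice: subscript the parenthesised list directly.
my $by_slice = (grep { $_->{name} eq 'foo' } @{ $hash_ref->{list} })[0];

# first(): stops scanning at the first match.
my $by_first = first { $_->{name} eq 'foo' } @{ $hash_ref->{list} };

print "ids: $by_assignment->{id} $by_slice->{id} $by_first->{id}\n";
```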

How do I print unique elements in Perl array?

I'm pushing elements into an array during a while statement. Each element is a teacher's name. There ends up being duplicate teacher names in the array when the loop finishes. Sometimes they are not right next to each other in the array, sometimes they are.
How can I print only the unique values in that array after it has finished getting values pushed into it, without having to parse the entire array each time I want to print an element?
Here's the code after everything has been pushed into the array:
$faculty_len = @faculty;
$i=0;
while ($i != $faculty_len)
{
printf $fh '"'.$faculty[$i].'"';
$i++;
}
use List::MoreUtils qw/ uniq /;
my @unique = uniq @faculty;
foreach ( @unique ) {
print $_, "\n";
}
Your best bet would be to use a (basically) built-in tool, like uniq (as described by innaM).
If you don't have the ability to use uniq and want to preserve order, you can use grep to simulate that.
my %seen;
my @unique = grep { ! $seen{$_}++ } @faculty;
# printing, etc.
This walks the list once, using %seen to count how many times each entry has been encountered. grep keeps an element only the first time it is seen, when its count is still zero, so the original order is preserved. (Updated with comments by brian d foy)
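A runnable sketch of that idiom on made-up names, showing that the first occurrence of each name survives in its original position and that %seen ends up holding the per-name counts:

```perl
use strict;
use warnings;

my @faculty = ('Smith', 'Jones', 'Smith', 'Lee', 'Jones');   # illustrative data

my %seen;
my @unique = grep { !$seen{$_}++ } @faculty;

print "$_\n" for @unique;

# After the pass, %seen records how often each name occurred.
print "$_ appeared $seen{$_} time(s)\n" for sort keys %seen;
```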
I suggest putting the names into a hash, like this:
my %faculty_hash = ();
foreach my $facs (@faculty) {
$faculty_hash{$facs} = 1;
}
my @faculty_unique = keys(%faculty_hash);
@array1 = ("abc", "def", "abc", "def", "abc", "def", "abc", "def", "xyz");
@array1 = grep { ! $seen{ $_ }++ } @array1;
print "@array1\n";
This question is answered with multiple solutions in perldoc. Just type at command line:
perldoc -q duplicate
Please note: Some of the answers that use a hash will change the ordering of the array. Hashes don't have any kind of order, so getting the keys or values will produce a list in an undefined order.
This doesn't apply to grep { ! $seen{$_}++ } @faculty
This is a one-liner to print unique lines in the order in which they appear.
perl -ne '$seen{$_}++ || print $_' fileWithDuplicateValues
I just found a hackneyed three-liner, enjoy:
my %uniq;
@uniq{@non_uniq_array} = ();
my @uniq_array = keys %uniq;
Just another way to do it, useful only if you don't care about order:
my %hash;
@hash{@faculty}=1;
my @unique=keys %hash;
If you want to avoid declaring a new variable, you can use the somewhat underdocumented global variable %_
@_{@faculty}=1;
my @unique=keys %_;
If you need to process the faculty list in any way, a map over the array converted to a hash for key coalescing and then sorting keys is another good way:
my @deduped = sort keys %{{ map { /.*/? ($_,1):() } @faculty }};
print join("\n", @deduped)."\n";
You process the list by changing the /.*/ regex for selecting or parsing and capturing accordingly, and you can output one or more mutated, non-unique keys per pass by making ($_,1):() arbitrarily complex.
If you need to modify the data in-flight with a substitution regex, say to remove dots from the names (s/\.//g), then a substitution according to the above pattern will mutate the original @faculty array due to $_ aliasing. You can get around $_ aliasing by making an anonymous copy of the @faculty array (see the so-called "baby cart" operator):
my @deduped = sort keys %{{ map {/.*/? do{s/\.//g; ($_,1)}:()} @{[ @faculty ]} }};
print join("\n", @deduped)."\n";
print "Unmolested array:\n".join("\n", @faculty)."\n";
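In Perl 5.14 and later you can use the non-destructive substitution modifier /r, which returns the changed string and leaves $_ untouched, so no copy of the array is needed. (Passing keys a hashref directly was an experimental feature that was removed again in Perl 5.24, so keep the %{ } dereference.)
my @deduped = sort keys %{{ map { /.*/? (s/\.//gr,1):() } @faculty }};
Otherwise, the grep or $seen{$_}++ solutions elsewhere may be preferable.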