How can I determine if an element exists in an array (perl) - perl

I'm looping through an array, and I want to test if an element is found in another array.
In pseudo-code, what I'm trying to do is this:
foreach $term (#array1) {
if ($term is found in #array2) {
#do something here
}
}
I've got the "foreach" and the "do something here" parts down-pat ... but everything I've tried for the "if term is found in array" test does NOT work ...
I've tried grep:
if grep {/$term/} #array2 { #do something }
# this test always succeeds for values of $term that ARE NOT in #array2
if (grep(/$term/, #array2)) { #do something }
# this test likewise succeeds for values NOT IN the array
I've tried a couple different flavors of "converting the array to a hash" which many previous posts have indicated are so simple and easy ... and none of them have worked.
I am a long-time low-level user of perl, I understand just the basics of perl, do not understand all the fancy obfuscated code that comprises 99% of the solutions I read on the interwebs ... I would really, truly, honestly appreciate any answers that are explicit in the code and provide a step-by-step explanation of what the code is doing ...
... I seriously don't grok $_ and any other kind or type of hidden, understood, or implied value, variable, or function. I would really appreciate it if any examples or samples have all variables and functions named with clear terms ($term as opposed to $_) ... and describe with comments what the code is doing so I, in all my mentally deficient glory, may hope to possibly understand it some day. Please. :-)
...
I have an existing script which uses 'grep' somewhat succesfully:
$rc=grep(/$term/, #array);
if ($rc eq 0) { #something happens here }
but I applied that EXACT same code to my new script and it simply does NOT succeed properly ... i.e., it "succeeds" (rc = zero) when it tests a value of $term that I know is NOT present in the array being tested. I just don't get it.
The ONLY difference in my 'grep' approach between 'old' script and 'new' script is how I built the array ... in old script, I built array by reading in from a file:
#array=`cat file`;
whereas in new script I put the array inside the script itself (coz it's small) ... like this:
#array=("element1","element2","element3","element4");
How can that result in different output of the grep function? They're both bog-standard arrays! I don't get it!!!! :-(
########################################################################
addendum ... some clarifications or examples of my actual code:
########################################################################
The term I'm trying to match/find/grep is a word element, for example "word123".
This exercise was just intended to be a quick-n-dirty script to find some important info from a file full of junk, so I skip all the niceties (use strict, warnings, modules, subroutines) by choice ... this doesn't have to be elegant, just simple.
The term I'm searching for is stored in a variable which is instantiated via split:
foreach $line(#array1) {
chomp($line); # habit
# every line has multiple elements that I want to capture
($term1,$term2,$term3,$term4)=split(/\t/,$line);
# if a particular one of those terms is found in my other array 'array2'
if (grep(/$term2/, #array2) {
# then I'm storing a different element from the line into a 3rd array which eventually will be outputted
push(#known, $term1) unless $seen{$term1}++;
}
}
see that grep up there? It ain't workin right ... it is succeeding for all values of $term2 even if it is definitely NOT in array2 ... array1 is a file of a couple thousand lines. The element I'm calling $term2 here is a discrete term that may be in multiple lines, but is never repeated (or part of a larger string) within any given line. Array2 is about a couple dozen elements that I need to "filter in" for my output.
...
I just tried one of the below suggestions:
if (grep $_ eq $term2, #array2)
And this grep failed for all values of $term2 ... I'm getting an all or nothing response from grep ... so I guess I need to stop using grep. Try one of those hash solutions ... but I really could use more explanation and clarification on those.

This is in perlfaq. A quick way to do it is
my %seen;
$seen{$_}++ for #array1;
for my $item (#array2) {
if ($seen{$item}) {
# item is in array2, do something
}
}
If letter case is not important, you can set the keys with $seen{ lc($_) } and check with if ($seen{ lc($item) }).
ETA:
With the changed question: If the task is to match single words in #array2 against whole lines in #array1, the task is more complicated. Trying to split the lines and match against hash keys will likely be unsafe, because of punctuation and other such things. So, a regex solution will likely be the safest.
Unless #array2 is very large, you might do something like this:
my $rx = join "|", #array2;
for my $line (#array1) {
if ($line =~ /\b$rx\b/) { # use word boundary to avoid partial matches
# do something
}
}
If #array2 contains meta characters, such as *?+|, you have to make sure they are escaped, in which case you'd do something like:
my $rx = join "|", map quotemeta, #array2;
# etc

You could use the (infamous) "smart match" operator, provided you are on 5.10 or later:
#!/usr/bin/perl
use strict;
use warnings;
my #array1 = qw/a b c d e f g h/;
my #array2 = qw/a c e g z/;
print "a in \#array1\n" if 'a' ~~ #array1;
print "z in \#array1\n" if 'z' ~~ #array1;
print "z in \#array2\n" if 'z' ~~ #array2;
The example is very simple, but you can use an RE if you need to as well.
I should add that not everyone likes ~~ because there are some ambiguities and, um, "undocumented features". Should be OK for this though.

This should work.
#!/usr/bin/perl
use strict;
use warnings;
my #array1 = qw/a b c d e f g h/;
my #array2 = qw/a c e g z/;
for my $term (#array1) {
if (grep $_ eq $term, #array2) {
print "$term found.\n";
}
}
Output:
a found.
c found.
e found.
g found.

#!/usr/bin/perl
#ar = ( '1','2','3','4','5','6','10' );
#arr = ( '1','2','3','4','5','6','7','8','9' ) ;
foreach $var ( #arr ){
print "$var not found\n " if ( ! ( grep /$var/, #ar )) ;
}

Pattern matching is the most efficient way of matching elements. This would do the trick. Cheers!
print "$element found in the array\n" if ("#array" =~ m/$element/);

Your 'actual code' shouldn't even compile:
if (grep(/$term2/, #array2) {
should be:
if (grep (/$term2/, #array2)) {
You have unbalanced parentheses in your code. You may also find it easier to use grep with a callback (code reference) that operates on its arguments (the array.) It helps keep the parenthesis from blurring together. This is optional, though. It would be:
if (grep {/$term2/} #array2) {
You may want to use strict; and use warnings; to catch issues like this.

The example below might be helpful, it tries to see if any element in #array_sp is present in #my_array:
#! /usr/bin/perl -w
#my_array = qw(20001 20003);
#array_sp = qw(20001 20002 20004);
print "#array_sp\n";
foreach $case(#my_array){
if("#array_sp" =~ m/$case/){
print "My God!\n";
}
}
use pattern matching can solve this. Hope it helps
-QC

1. grep with eq , then
if (grep {$_ eq $term2} #array2) {
print "$term2 exists in the array";
}
2. grep with regex , then
if (grep {/$term2/} #array2) {
print "element with pattern $term2 exists in the array";
}

Related

Perl multi-order grep with $_

I'm trying to learn Perl's grep better.
I want to grep which keys of a hash are not in an array
my %args = ( fake => 1);
my #defined_args = ('color', 'colors', 'data', 'figheight', 'figwidth', 'filename', 'flip', 'grid', 'labelsize', 'logscale', 'minor_gridlines');
my #bad_args = grep { not grep {$_} #defined_args} keys %args;
where the list of bad args is in #bad_args The last line is obviously wrong.
I know that I can do the same thing with a hash, but I want to be able to do this with a multi-order grep, i.e. grep on grep.
How can I do this like the following?
my #bad_args = grep { not grep {$_ eq $_} #defined_args} keys %args;
I'm confused because there would be two $_, which I can't run an equality test on.
First the direct answer -- that block that grep takes, you can put any code in it. That's the point of the block, and an element passes/not based on the truthiness of the last statement that returns.
my #bad_args = grep {
my $key = $_;
#defined_args == grep { $key ne $_ } #defined_args
} keys %args;
Here we test whether a key is not-equal to array elements, and then test whether it was unequal to all of them, what decides. Another way would be to test whether it is equal to any one element,
not grep { $key eq $_ } #defined_args;
This is all a little convoluted, needing to work with negations.
But these are kinds of common things to do and there are libraries.
To directly improve on the above
use List::Util 1.33 qw(none); # before 1.33 it was in List::MoreUtils
my #bad_args = grep {
my $key = $_;
none { $key eq $_ } #defined_args
} keys %args;
Now the needed "negative" is absorbed in the library's function name, making this far easier to look at. Also, none will stop once it sees that it failed while grep always processes all elements so this is also more efficient.
These aren't terribly efficient in comparison with hash-based approaches (complexity O(NM-M2/2) or so) but that is completely irrelevant for small arrays. Use of hashes, mentioned in the question, for existence-related issues is a standard; see for example this post, or the source for methods used in all libraries discussed below (simplest example).
Finally, while the question is about (double) filtering it should be mentioned that we are looking for which elements of a list aren't in another; a "difference" between lists. Then other kinds of libraries come into play. Some examples
Using Set::Scalar
use Set::Scalar;
...
my $keys = Set::Scalar->new(keys %args);
my $good = Set::Scalar->new(#defined_args);
my $keys_not_in_good = $keys->difference($good);
say $keys_not_in_good;
Also note Set::Object in the same camp.
Then there are tools specifically for array comparison, like List::Compare
use List::Compare;
...
my $lc = List::Compare->new('-u', '-a', \#defined_args, [keys %args]);
my #only_in_second = $lc->get_complement();
say "#only_in_second";
Options -u and -a showcase some of modules capabilities, to speed things up; they are not necessary. This module has a lot, see docs.
On the other end is the simple Array::Utils.
There is more out there. See for example this page for plenty of ideas.
When you get into these sort of tangles, it's sometimes better to find a different way.
There are two things to think about. If you want a nested use of $_, you need to protect the outer one somehow. Since you want to use the outer and inner ones in the same expression, one of them needs a different name:
grep {
my $top = $_;
my $count = grep { $top eq $_ } ...;
...
} keys %args;
But, that inner grep is a bit weird. You want to check if something is (or isn't) in a list. That's the job for a hash and exists:
my %allowed_args = map { $_, 1 } #allowed_args;
my #found_bad_args = grep { ! exists $allowed_args{$_} } keys %args;

Search array of strings for specific words

Firstly, I don't know Perl at all, and need a reasonably quick answer on this. I have the result of running a command stored in an array:
my #result = `$command`;
What I need to do is search the array to see if any element contains the word "Merge" or the word "changed" (both case insensitive).
Can someone advise please?
The tool for the job here is grep - a function that allows you to specify a filter against a list. You can use it much like Unix grep, but it'll also allow for more complex tests (e.g. code to run).
In your case:
my #matches = grep { /merge|changed/i } #result;
if ( #matches ) {
print "One or more lines matched\n";
}
You can do this with a regular expression. In this example if the line match 'merge' or 'changed' (insensitive of course) the line is printed :
#!/usr/bin/perl
use strict;
use warnings;
my #result = `command`;
foreach my $line (#result){
if ($line =~ /merge|changed/i){
print $line;
}
}

Perl searching for string contained in array

I have an array with the following values:
push #fruitArray, "apple|0";
push #fruitArray, "apple|1";
push #fruitArray, "pear|0";
push #fruitArray, "pear|0";
I want to find out if the string "apple" exists in this array (ignoring the "|0" "|1")
I am using:
$fruit = 'apple';
if( $fruit ~~ #fruitArray ){ print "I found apple"; }
Which isn't working.
Don't use smart matching. It never worked properly for a number of reasons and it is now marked as experimental
In this case you can use grep instead, together with an appropriate regex pattern
This program tests every element of #fruitArray to see if it starts with the letters in $fruit followed by a pipe character |. grep returns the number of elements that matched the pattern, which is a true value if at least one matched
my #fruitArray = qw/ apple|0 apple|1 pear|0 pear|0 /;
my $fruit = 'apple';
print "I found $fruit\n" if grep /^$fruit\|/, #fruitArray;
output
I found apple
I - like #Borodin suggests, too - would simply use grep():
$fruit = 'apple';
if (grep(/^\Q$fruit\E\|/, #fruitArray)) { print "I found apple"; }
which outputs:
I found apple
\Q...\E converts your string into a regex pattern.
Looking for the | prevents finding a fruit whose name starts with the name of the fruit for which you are looking.
Simple and effective... :-)
Update: to remove elements from array:
$fruit = 'apple';
#fruitsArrayWithoutApples = grep ! /^\Q$fruit\E|/, #fruitArray;
If your Perl is not ancient, you can use the first subroutine from the List::Util module (which became a core module at Perl 5.8) to do the check efficiently:
use List::Util qw{ first };
my $first_fruit = first { /\Q$fruit\E/ } #fruitArray;
if ( defined $first_fruit ) { print "I found $fruit\n"; }
Don't use grep, that will loop the entire array, even if it finds what you are looking for in the first index, so it is inefficient.
this will return true if it finds the substring 'apple', then return and not finish iterating through the rest of the array
#takes a reference to the array as the first parameter
sub find_apple{
#array_input = #{$_[0]};
foreach $fruit (#array_input){
if (index($fruit, 'apple') != -1){
return 1;
}
}
}
You can get close to the smartmatch sun without melting your wings by using match::simple:
use match::simple;
my #fruits = qw/apple|0 apple|1 pear|0 pear|0/;
$fruit = qr/apple/ ;
say "found $fruit" if $fruit |M| \#fruits ;
There's also a match() function if the infix [M] doesn't read well.
I like the way match::simple does almost everything I expected from ~~ without any surprising complexity. If you're fluent in perl it probably isn't something you'd see as necessary, but - especially with match() - code can be made pleasantly readable ... at the cost of imposing the use of references, etc.

How can I convert a string number to a number in Perl? [duplicate]

Is there any way to replace multiple strings in a string?
For example, I have the string hello world what a lovely day and I want to replace what and lovely with something else..
$sentence = "hello world what a lovely day";
#list = ("what", "lovely"); # strings to replace
#replist = ("its", "bad"); # strings to replace with
($val = $sentence) =~ "tr/#list/#replist/d";
print "$val\n"; # should print "hello world its a bad day"..
Any ideas why it's not working?
Thanks.
First of all, tr doesn't work that way; consult perldoc perlop for details, but tr does transliteration, and is very different from substitution.
For this purpose, a more correct way to replace would be
# $val
$val =~ s/what/its/g;
$val =~ s/lovely/bad/g;
Note that "simultaneous" change is rather more difficult, but we could do it, for example,
%replacements = ("what" => "its", "lovely" => "bad");
($val = $sentence) =~ s/(#{[join "|", keys %replacements]})/$replacements{$1}/g;
(Escaping may be necessary to replace strings with metacharacters, of course.)
This is still only simultaneous in a very loose sense of the term, but it does, for most purposes, act as if the substitutions are done in one pass.
Also, it is more correct to replace "what" with "it's", rather than "its".
Well, mainly it's not working as tr///d has nothing to do with your request (tr/abc/12/d replaces a with 1, b with 2, and removes c). Also, by default arrays don't interpolate into regular expressions in a way that's useful for your task. Also, without something like a hash lookup or a subroutine call or other logic, you can't make decisions in the right-hand side of a s/// operation.
To answer the question in the title, you can perform multiple replaces simultaneously--er, in convenient succession--in this manner:
#! /usr/bin/env perl
use common::sense;
my $sentence = "hello world what a lovely day";
for ($sentence) {
s/what/it's/;
s/lovely/bad/
}
say $sentence;
To do something more like what you attempt here:
#! /usr/bin/env perl
use common::sense;
my $sentence = "hello world what a lovely day";
my %replace = (
what => "it's",
lovely => 'bad'
);
$sentence =~ s/(#{[join '|', map { quotemeta($_) } keys %replace]})/$replace{$1}/g;
say $sentence;
If you'll be doing a lot of such replacements, 'compile' the regex first:
my $matchkey = qr/#{[join '|', map { quotemeta($_) } keys %replace]}/;
...
$sentence =~ s/($matchkey)/$replace{$1}/g;
EDIT:
And to expand on my remark about array interpolation, you can change $":
local $" = '|';
$sentence =~ s/(#{[keys %replace]})/$replace{$1}/g;
# --> $sentence =~ s/(what|lovely)/$replace{$1}/g;
Which doesn't improve things here, really, although it may if you already had the keys in an array:
local $" = '|';
$sentence =~ s/(#keys)/$replace{$1}/g;

Simple multi-dimensional array with loop in perl

I'm trying to use an array and a loop to print out the following (basically for each letter of the alphabet, print each letter of the alphabet after it and then move on to the next letter). I'm new to perl, anyone have any quick words of :
aa
ab
ac
ad
...
ba
bb
bc
bd
...
ca
cb
...
Currently I have this, but it only prints a single character alphabet...
#arr = ("a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z");
$i = #arr;
while ($i)
{
print $arr[$i];
$i--;
}
Using the range operator and the ranges you want to target:
use strict;
use warnings;
my #elements = ("aa" .. "zz");
for my $combo (#elements)
{
print "$combo\n";
}
You can utilize the initial 2 letters till the ending 2 letters you want as ending and the for will take care of everything.
This really isn't multi-dimensional array work, if it were you'd be working with stuff like:
my #foo = (
[1,2,3],
[4,7,8,1,2,3],
[2,3],
);
This is really a very basic how do I make a nested loop that iterates over the same array. I'll bet this is homework.
So, I'll let you figure out the nesting bits, but give some help with Perl's loop operators.
!! for/foreach
for (the each is optional) is the real heavy hitter for looping in perl. Use it like so:
for my $var ( #array ) {
#do stuff with $var
}
Each element in #array will be aliased to the $var variable, and the block of code will be executed. The fact that we are aliasing, rather than copying means that if alter the value of $var, #array will be changed as well. The stuff between the parenthesis may be any expression. The expression will be evaluated in list context. So if you put a file handle in the parens, the entire file will be read into memory and processed.
You can also leave off naming the loop variable, and $_ will be used instead. In general, DO NOT DO THIS.
!! C-Style for
Every once in a while you need to keep track of indexes as you loop over an array. This is when a C style for loop comes in handy.
for( my $i=0; $i<#array; $i++ ) {
# do stuff with $array[$i]
}
!! While/Until
While and until operate with boolean loop conditions. That means that the loop will repeat as long as the appropriate boolean value if found for the condition ( TRUE for while, and FALSE for until). In addition to the obvious cases where you are looking for a particular condition, while is great for processing a file one line at a time.
while ( my $line = <$fh> ) {
# Do stuff with $line.
}
!! map
map is an amazingly useful bit of functional programming kung-fu. It is used to turn one list into another. You pass an anonymous code reference that is used to enact the transformation.
# Multiply all elements of #old by two and store them in #new.
my #new = map { $_ * 2 } #old;
So how do you solve your particular problem? There are many ways. Which is best depends on how you want to use the results. If you want to create a new array of the letter pairs, use map. If you are interested primarily in a side effect (say printing a variable) use for. If you need to work with really big lists that come from sort of interator (like lines from a filehandle) use while.
Here's a solution. I wouldn't turn it in to your professor until you understand how it works.
print map { my $letter=$_; map "$letter$_\n", "a".."z" } "a".."z";
Look at perldoc articles, perlsyn for info on the looping constructs, perlfunc for info on map and look at perlop for info on the range operator (..).
Good luck.
Use the range operator (..) for your initialization. The range operator basically grabs a range of values such as numbers or characters.
Then use a nested loop to go through the array one time per character for a total of 26^2 iterations.
Rather than a while loop I've used a foreach loop to go through each item in the array. You could also put 'a' .. 'z' instead of declared #arr as the argument to the foreach loop. The foreach loops below set $char or $char2 to each value in #arr in turn.
my #arr = ('a' .. 'z');
for my $char (#arr) {
for my $char2 (#arr) {
print "$char$char2\n";
}
}
If all you really want to do is print the 676 strings you describe, then:
#!/usr/bin/perl
use warnings;
use strict;
my $str = 'aa';
while (length $str < 3) {
print $str++, "\n";
}
But I smell an "XY problem"...