Matching a string with a special character in Perl - perl

I am having trouble with matching a string with a dollar sign ($) in it.
Here is my code:
if (index($ln, '$COMB') != -1)
{
[do some stuff]
}
I have tried '\$COMB', '\\$COMB' and '\\\$COMB'.
I need to match the exact string $COMB. The problem is that my code also matches $[some other stuff]COMB, which is not what I want.

Indeed, index checks if a string is contained in another. It won't find $COMB in $..COMB as you claim, but it would find it in ...$COMB....
If you want to check if a string is the same as another, use eq.
$ln eq '$COMB'
For example,
$ perl -e'
CORE::say qq{"$_" is }.( $_ eq "abc" ? "equal" : "not equal" ).qq{ to "abc"}
for qw( abc abcdef def );
'
"abc" is equal to "abc"
"abcdef" is not equal to "abc"
"def" is not equal to "abc"

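To make the difference between the two checks concrete, here is a small sketch. The sample lines and the two helper subs (contains, is_exactly) are made up for illustration; only index and eq come from the answer above.

```perl
use strict;
use warnings;

# Hypothetical sample lines; single quotes keep the $ literal.
my @lines = ('$COMB', 'x = $COMB + 1', '$fooCOMB');

# Returns 1 if $needle occurs anywhere in $ln, 0 otherwise.
sub contains   { my ($ln, $needle) = @_; return index($ln, $needle) != -1 ? 1 : 0 }

# Returns 1 only if $ln is exactly $needle, 0 otherwise.
sub is_exactly { my ($ln, $needle) = @_; return $ln eq $needle ? 1 : 0 }

for my $ln (@lines) {
    printf "%-15s contains=%d exact=%d\n",
        $ln, contains($ln, '$COMB'), is_exactly($ln, '$COMB');
}
```

Only the first line is an exact match; the second merely contains $COMB, and the third does not contain it at all because other characters sit between the $ and COMB.
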
Related

Why comparison operator == does not work for strings in perl?

I'm new to Perl. I want to understand why the == operator treats both of these strings alike. It works OK if I use eq instead of ==. Whether the name is kuldeep or rahul, it prints 'Right name'.
my $name="kuldeep";
if ($name == "rahul")
{
print 'Right name!',"\n";
}
else
{
print 'Wrong name!',"\n";
}
You are mistaken. The numerical equality comparison operator works perfectly fine with strings!
$ perl -e'CORE::say "123" == "123.0" ? "same" : "different"'
same
$ perl -e'CORE::say "123" == "123.1" ? "same" : "different"'
different
In your example, you are asking Perl to compare the numerical value of the string kuldeep (zero with a warning) with the numerical value of the string rahul (zero with a warning), and they are indeed equal.
ALWAYS USE use strict; use warnings;!!!
And use eq to compare strings.
The interpreter realizes (from the == operator) that it's doing a numeric comparison. The value of $name is converted to a number, which gets you 0. "rahul" is converted to a number, which is also 0. 0 == 0, so that's true, and thus "Right name" is chosen.
If I compare it against my name, it works the same way.
However, if one of the strings really does hold a number, say "12345" (specifically with the quotes), Perl will assume you knew what you were doing by requesting the == operator, and will dutifully auto-convert ("cast" in programmer-speak) that to numeric 12345. Then your comparison against a non-numeric string (which becomes 0) will fail.
TL/DR: use 'eq' for string comparison! :-)
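Here is a rough sketch of what the answers above describe, with use warnings switched on. The $SIG{__WARN__} handler and the variable names are only there for the demonstration; they are not part of the original question.

```perl
use strict;
use warnings;

my @warnings;
# Collect warnings instead of printing them, so they can be counted.
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $name = "kuldeep";
my $numeric_equal = ($name == "rahul") ? 1 : 0;   # both sides become 0
my $string_equal  = ($name eq "rahul") ? 1 : 0;   # compared character by character

print "== says: $numeric_equal\n";
print "eq says: $string_equal\n";
print "warnings seen: ", scalar(@warnings), "\n";
print $warnings[0] if @warnings;
```

The == test comes out true and eq comes out false, and the collected warnings are the "isn't numeric" messages that would have pointed at the mistake right away.
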

perl compare strings with "=="

In perl, one should compare two strings with "eq" or "ne" etc.
I am a little surprised the following code snippet will print "yes".
$str = "aJohn";
$x = substr($str, 1);
if ($x == "John") {
print "yes\n";
}
My perl has version v5.18.4 on Ubuntu.
Is there a case where the "==" on two strings produce a different result from "eq"?
Thanks.
"foo" == "bar" is true. "foo" eq "bar" is false.
The reason for this: == is numeric comparison. "foo" and "bar" both evaluate numerically to 0 (just as "17foo" evaluates numerically to 17); since 0 == 0, "foo" == "bar" is true. This is not normally the operation you are looking for.
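To answer the "is there a case where they differ" part directly, a few pairs can be run through both operators. The pairs and the compare_both helper below are my own picks for illustration, and the numeric warnings are silenced only to keep the output readable.

```perl
use strict;
use warnings;
no warnings 'numeric';   # silence "isn't numeric" for this demonstration only

# Pairs chosen for illustration; each is [left, right].
my @pairs = (["foo", "bar"], ["1.0", "1"], ["17foo", "17"], ["abc", "abc"]);

# Returns a two-element list: result of ==, result of eq (each as 1 or 0).
sub compare_both {
    my ($l, $r) = @_;
    return ($l == $r ? 1 : 0, $l eq $r ? 1 : 0);
}

for my $p (@pairs) {
    my ($num, $str) = compare_both(@$p);
    printf "%-7s vs %-5s  ==:%d  eq:%d\n", "\"$p->[0]\"", "\"$p->[1]\"", $num, $str;
}
```

The first three pairs are equal under == but not under eq; only the last pair agrees under both.
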

Does Perl's smartmatch work against arrays (or arrayrefs) of mixed strings and compiled regexes?

I'm hoping to use Perl's smart-matching to do look-ups against an array that contains both strings and compiled regexes:
do_something($file) unless ($file ~~ [ @global_excludes, $local_excludes ]);
(Both the @global_excludes array and the $local_excludes array reference can contain a mixture of strings or compiled regexes.)
Is smart-matching in Perl that smart? Currently, when I run the above with v5.10.1 I get:
Argument "script.sh" isn't numeric in smart match at test.pl line 422.
Argument "Debug.log" isn't numeric in smart match at test.pl line 422.
Argument "lib.pm" isn't numeric in smart match at test.pl line 422.
...
Why does smartmatch think that $file is a number?
For now, I'm just doing it manually:
do_something($file) unless exclude ($file, [ @global_excludes, $local_excludes ]);
where exclude looks like this:
sub exclude
{
my ($file, $list) = @_;
foreach my $lookup (@$list)
{
if (is_regexp($lookup))
{
return 1 if $file =~ $lookup;
}
else
{
return 1 if $file eq $lookup;
}
}
return 0;
}
Basically, I'm looking to make the solution more Perly.
Yes, this does work. The problem is that one of your excludes is a number, not a string. When the right-hand side of a smartmatch is a number, Perl does an == numeric comparison.
my $s = 'foo';
$s ~~ 2; # means $s == 2, warns "$s isn't numeric"
$s ~~ '2'; # means $s eq '2', no warning
If you intend to do a string comparison, make sure your excludes are strings. If necessary, stringify them first (e.g. @array = map { ref($_) ? $_ : "$_" } @array).
Bug found! It was simply an empty string in one of the elements of
[ @global_excludes, $local_excludes ]
I guess in that case Perl 5.10.1 treats the empty string as a number.
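Since smartmatch was marked experimental in later Perl releases, one way to get the same lookup without it is List::Util's any (in core since Perl 5.20). This is only a sketch: is_excluded and the sample exclude lists are hypothetical, and it tells regexes from strings with ref() instead of the is_regexp call used in the question.

```perl
use strict;
use warnings;
use List::Util qw(any);

# Returns 1 if $file equals any string in the list or matches any qr// in it.
sub is_excluded {
    my ($file, @excludes) = @_;
    return (any { ref($_) eq 'Regexp' ? $file =~ $_ : $file eq $_ } @excludes) ? 1 : 0;
}

# Hypothetical exclude lists, mixing plain strings and compiled regexes.
my @global_excludes = ('Debug.log', qr/\.sh\z/);
my $local_excludes  = ['lib.pm', ''];

for my $file ('script.sh', 'Debug.log', 'main.pl') {
    my $hit = is_excluded($file, @global_excludes, @$local_excludes);
    print "$file: ", ($hit ? "excluded" : "kept"), "\n";
}
```

Because every non-regex element is compared with eq, an empty string or a number in the list is never silently turned into a numeric comparison.
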

Exact pattern match using perl index() function

I am trying to use the index() function and I want to find the position of a word inside a string, only when it is an exact match. For example:
My string is STRING="CATALOG SCATTER CAT CATHARSIS"
And my search string is KEY=CAT
I want to say something like index($STRING, $KEY) and check match for CAT, and not CATALOG. How do I accomplish this? The documentation says
The index function searches for one string within another, but without the wildcard-like behavior of a full regular-expression pattern match.
which makes me think that it may not be that straight-forward, but my perl skills are limited :). Is it possible to do what I am trying to do?
Hopefully, I was able to articulate my question well. Thanks in advance for your help!
How about:
use feature 'say';
my $str = "CATALOG SCATTER CAT CATHARSIS";
my $key = "CAT";
if ($str =~ /\b$key\b/) {
say "match at char ", $-[0];
} else {
say "no match";
}
output:
match at char 16
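If you want something that behaves like index but only for whole words, the same regex can be wrapped in a small sub. word_index is a hypothetical name, and the \Q...\E quoting is my addition so that a key containing regex metacharacters is matched literally. The sketch assumes the key starts and ends with a word character, otherwise \b will not do what you expect.

```perl
use strict;
use warnings;

# Returns the 0-based offset of the first whole-word occurrence of $key
# in $str, or -1 if there is none (mirroring index's convention).
sub word_index {
    my ($str, $key) = @_;
    return $str =~ /\b\Q$key\E\b/ ? $-[0] : -1;
}

print word_index("CATALOG SCATTER CAT CATHARSIS", "CAT"), "\n";
print word_index("CATALOG SCATTER CATHARSIS", "CAT"), "\n";
```

The first call prints 16 and the second prints -1, because the second string only contains CAT as part of longer words.
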
You need to learn about Regular Expressions in Perl. Perl didn't invent Regular Expressions, but tremendously expanded upon the concept. In fact, many other programming languages talk specifically about using Perl Regular Expressions.
A regular expression matches a specific word pattern. For example, /cat/ matches the sequence cat in a string.
if ( $string =~ /cat/ ) {
print "String contains the letters 'cat' in a row\n";
}
In many ways, this does the same thing as:
my $location = index ( $string, "cat" );
if ( $location != -1 ) { # index returns -1 when substring isn't found
print "String contains the letters 'cat' in a row\n";
}
But, both of these would match:
"Don't let the cat out of the bag"
"The Sears catalog arrived in the mail"
You don't want to match the last. So, you could do this:
my $location = index $string, " cat ";
Now, index $string, " cat " won't match the word catalog. Case closed! Or is it? What about:
"cat and dog it doth rain."
Maybe you could check and say things are okay if a sentence starts with "cat":
if ( (index ($string, " cat ") != -1) or (index ($string, "cat") == 0) ) {
print "String contains the letters 'cat' in a row\n";
}
But, what about these?
"The word CAT in all uppercase"
"Stupid cat"
"Cat! Here Cat! Common Cat!": Punctuation after the word "cat"
"Don't let the 'cat' out of the 'bag'": Quotation Marks around "cat"
It could take dozens of lines to specify each and every one of these conditions.
However:
if ( $string =~ /\bcat\b/i ) {
print "String contains the word 'cat' in it\n";
}
Specifies each and every one -- and then some. The \b says this is a word boundary. This could be a space, a tab, a quote, the beginning or ending of a line. Thus /\bcat\b/ specifies that this should be the word cat and not catalog. The i on the end tells your regular expression to ignore case when matching, so you'll find Cat, cat, CAT, cAt, and all other possible combinations.
In fact, Perl's regular expressions are what made Perl such a popular language to begin with.
Fortunately, Perl comes with not one, but two tutorials on Regular Expressions:
perlretut: Perl Regular Expression Tutorial
perlrequick: Perl Regular Expression Quick Start.
Hope this helps.
That's (partial) solution of this problem with index:
use warnings;
use strict;
my $test = 'CATALOG SCATTER CAT CATHARSIS';
my $key = 'CAT';
my $k_length = length $key;
my $s_length = (length $test) - $k_length;
my $pos = -1;
while (($pos = index $test, $key, $pos + 1) > -1) {
if ($pos > 0) {
my $prev_char = substr $test, $pos - 1, 1;
### print "Previous character: '$prev_char'\n";
next if $prev_char ge 'A' && $prev_char le 'Z'
|| $prev_char ge 'a' && $prev_char le 'z';
}
if ($pos < $s_length) {
my $next_char = substr $test, $pos + $k_length, 1;
### print "Next character: '$next_char'\n";
next if $next_char ge 'A' && $next_char le 'Z'
|| $next_char ge 'a' && $next_char le 'z';
}
print "Word '$key' found at " . ($pos + 1) . "th position.\n";
}
As you see, it's kinda wordy, because it uses basic Perl string functions - index and substr - only. Checking whether the substring found is indeed a word is done via checking its next and previous characters (if they exist): if they belong to either A-Z or a-z range, it's not a word.
You can simplify it a bit by trying to lowercase these characters (with lc), then check against the single character range only:
my $lc_prev_char = lc( substr $test, $pos - 1, 1 );
next if $lc_prev_char ge 'a' && $lc_prev_char le 'z';
... but then again, it's quite a minor improvement (if improvement at all).
Now consider this:
my $test = 'CATALOG SCATTER CAT CATHARSIS CAT';
my $key = 'CAT';
while ($test =~ /(?<![A-Za-z])$key(?![A-Za-z])/g) {
print "Word '$key' found at " . ($-[0] + 1) . "th position.\n";
}
... and that's it! The pattern literally tests the given string ($test) for the given substring ($key) being neither preceded nor followed by a character in the A-Za-z range, and the supporting Perl regex magic (the @- array, in particular) makes it easy to get the starting position of such a substring.
The bottom line: use regexes to do the regexes' work.
Regular expressions allow for the search to contain word boundaries as well as distinct characters. While
my $string = "CATALOG SCATTER CAT CATHARSIS";
index($string, 'CAT');
will return zero or greater if $string contains the characters CAT, a regular expression like
$string =~ /\bCAT\b/;
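will match only where CAT is both preceded and followed by a word boundary. Here it matches the standalone word CAT, but it would not match a string that contains only CATALOG, SCATTER, or CATHARSIS. (A word boundary is the position between a word character and a non-word character, with the beginning and end of the string counting as non-word characters. A word character is any alphanumeric character or an underscore.)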
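The definition of a word character matters in practice, because digits and underscores count too. A quick sketch with some made-up test strings (has_word_cat is a hypothetical helper):

```perl
use strict;
use warnings;

# Hypothetical test strings showing what counts as a word character.
my @samples = ('CAT', 'a CAT!', "'CAT'", 'CAT_1', 'CAT2', 'CATALOG');

# Returns 1 if $s contains CAT as a whole word, 0 otherwise.
sub has_word_cat { my ($s) = @_; return $s =~ /\bCAT\b/ ? 1 : 0 }

printf "%-8s %d\n", $_, has_word_cat($_) for @samples;
```

The first three strings match, since spaces, punctuation, quotes, and the string edges all give a boundary. CAT_1, CAT2, and CATALOG do not match, because the character after CAT is itself a word character.
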
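Use \Q...\E around the interpolated value so that any regex metacharacters in it are matched literally (add \b on either side if only whole words should match).
So:
#!/usr/bin/perl
my $string = "Little Tony";
my $check = "Ton";
if ($string =~ m/\Q$check\E/)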
{
print "match";
}
else
{
die("No Match");
}

how to return the search results in perl

I would like to write a script that returns the result whenever the regex matches. I guess I am having some difficulty writing the regex.
The content of my input file is as below:
Number a123;
Number b456789 vit;
alphabet fty;
I would like it to return a123 and b456789, that is, the string after "Number " and before whitespace or ";".
I have tried the lines below:
my @result=grep /Number/,@input_file;
print "@result\n";
The result I obtained is shown below:
Number a123;
Number b456789 vit;
Whereas the expected result should be like below:
a123
b456789
Can anyone help on this?
Perl's grep function selects/filters all elements from a list that match a certain condition. In your case, you selected all elements that match the regex /Number/ from the @input_file array.
To select the non-whitespace string after Number, use this regex:
my $regex = qr{
Number # Match the literal string 'Number'
\s+ # match one or more whitespace characters
([^\s;]+) # Capture the following non-spaces-or-semicolons into $1
# using a negated character class
}x; # use /x modifier to allow whitespaces in pattern
# for better formatting
My suggestion would be to loop directly over the input file handle:
while (defined(my $line = <$input>)) {
# $1 is only meaningful after a successful match
print "Found: $1\n" if $line =~ /$regex/;
}
If you have to use an array, a foreach-loop would be preferable:
foreach my $line (@input_file) {
# $1 is only meaningful after a successful match
print "Found: $1\n" if $line =~ /$regex/;
}
If you don't want to print your matches directly but to store them in an array, push the values into the array inside your loop (both work) or use the map function. The map function replaces each input element by the value of the specified operation:
my @result = map { /$regex/ ? $1 : () } @input_file;
or
my @result = map { /$regex/ ? $1 : () } <$input>;
Inside the map block, we match the regex against the current array element. If we have a match, we return $1, else we return an empty list. This gets flattened into invisibility so we don't create an entry in @result. This is different from returning undef, which would create an undef element in your array.
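The empty-list versus undef difference is easy to check with the sample data from the question. In this sketch the regex is written without /x, and the two array names are made up for the comparison.

```perl
use strict;
use warnings;

my @input_file = ("Number a123;\n", "Number b456789 vit;\n", "alphabet fty;\n");
my $regex = qr/Number\s+([^\s;]+)/;

# Non-matching lines contribute an empty list, i.e. nothing at all.
my @with_empty_list = map { /$regex/ ? $1 : () } @input_file;

# Non-matching lines contribute an undef element.
my @with_undef = map { /$regex/ ? $1 : undef } @input_file;

print scalar(@with_empty_list), " elements: @with_empty_list\n";
print scalar(@with_undef), " elements\n";
```

The first array ends up with the two captured values only, while the second has three elements, the last of which is undef.
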
If your script is intended as a simple filter, you can use
$ cat FILE | perl -nle 'print $1 if /Number\s+([^\s;]+)/'
or
$ cat FILE | perl -nle 'for (/Number\s+([^\s;]+)/g) { print }'
if there can be multiple occurrences on the same line.
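The second one-liner relies on /g in list context returning every capture on the line. Spelled out as a script, with a made-up input line that holds two entries, it looks roughly like this:

```perl
use strict;
use warnings;

# Hypothetical line containing two "Number" entries.
my $line = "Number a123; Number b456789 vit;";

# In list context, /g returns every capture from every match on the line.
my @ids = $line =~ /Number\s+([^\s;]+)/g;

print "$_\n" for @ids;
```

This prints a123 and b456789 on separate lines, whereas the print $1 version would only report the first one.
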
perl -lne 'if(/Number/){s/.*\s([a-zA-Z])([\d]+).*$/$1$2/g;print}' your_file
tested below:
> cat temp
Number a123;
Number b456789 vit;
alphabet fty;
> perl -lne 'if(/Number/){s/.*\s([a-zA-Z])([\d]+).*$/$1$2/g;print}' temp
a123
b456789
>