What is the difference between m[] and m{} regular expressions in Perl? How do $1, $2, etc. match the pattern "(((Cu)(Na))(Hg))"?
thanks
There is no difference between m{} and m[]. Perl lets you change the delimiters of regexes to make them easier to read in a given context.
$var =~ m/.*\.zip/ and $var =~ m{.*\.zip} and $var =~ m[.*\.zip] and $var =~ m#.*\.zip# all match the same way.
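As a quick check of the delimiter claim, here is a minimal sketch that runs one pattern under four delimiters; the file name and the path are made-up sample values.

```perl
use strict;
use warnings;

my $file = 'archive.zip';   # sample value, chosen only for illustration

# The same pattern written with four different delimiters.
my @results = (
    scalar($file =~ m/\.zip\z/),
    scalar($file =~ m{\.zip\z}),
    scalar($file =~ m[\.zip\z]),
    scalar($file =~ m#\.zip\z#),
);
print "all four match\n" if 4 == grep { $_ } @results;

# One practical reason to switch: slashes in a path need no escaping.
print "path matches\n" if '/usr/local/bin' =~ m{^/usr/local/};
```

Picking a delimiter that does not occur in the pattern is purely a readability choice; the regex engine sees the same pattern either way.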
Capture groups are numbered by the position of their opening parenthesis, counting from left to right, so for your example:
my $foo = 'CuNaHg';
if ( $foo =~ m{(((Cu)(Na))(Hg))} ) {
print $1; # CuNaHg
print $2; # CuNa
print $3; # Cu
print $4; # Na
print $5; # Hg
}
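The numbering can also be checked by matching in list context, which returns the captures in the order $1, $2, and so on. A minimal sketch; the sub name nested_captures is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the captures of the nested pattern as a list.
# Groups are numbered by the position of their opening parenthesis.
sub nested_captures {
    my ($text) = @_;
    return $text =~ m{(((Cu)(Na))(Hg))};
}

my @groups = nested_captures('CuNaHg');
print join(', ', @groups), "\n";   # CuNaHg, CuNa, Cu, Na, Hg
```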
I don't understand why perl chomp isn't removing the whitespace surrounding my string. I've even tried to call chomp twice, for example, using bash:
$ perl -e 'use 5.22.4; chomp(my $extra=" lol "); chomp($extra); say "<$extra>"'
< lol >
I really expected to get
<lol>
chomp only removes the line ending (the value of $/, which you can change) from the end of the string. It does not trim the string. Perl has traditionally had no built-in trim function (Perl 5.36 added builtin::trim, initially as experimental). I usually spell out two substitutions instead:
s/^\s+//, s/\s+$// for $string;
Further reading:
perldoc -f chomp
Perl FAQ: How do I strip blank space from the beginning/end of a string?
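To make the difference concrete, here is a small sketch comparing chomp with the two-substitution trim; chomp_copy and trim_copy are hypothetical helper names, not built-ins.

```perl
use strict;
use warnings;

# chomp removes one trailing copy of $/ (a newline by default), nothing else.
sub chomp_copy {
    my ($text) = @_;
    chomp $text;
    return $text;
}

# Hypothetical helper: strip leading and trailing whitespace from a copy.
sub trim_copy {
    my ($text) = @_;
    s/^\s+//, s/\s+$// for $text;
    return $text;
}

print '<', chomp_copy(" lol \n"), ">\n";   # < lol >
print '<', trim_copy(" lol \n"), ">\n";    # <lol>
```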
To remove all whitespace:
$string =~ s/\s+//g;
Left trim:
$string =~ s/^\s+//;
Right Trim:
$string =~ s/\s+$//;
Left and Right trim:
$string =~ s/^\s+|\s+$//g;
We can also build trimming functions. This helps in bigger scripts, where you would not want to write out the full substitution each time: write it once, then call the function to do the work.
This simple function can be used in any script as trim($string);
sub trim {
$_[0] =~ s/^\s+|\s+$//g;
}
Similarly with a full strip of whitespace.
sub full_strip {
$_[0] =~ s/\s+//g;
}
in a script:
use strict;
use warnings;
my $string = " this is line with leading and trailing whitespaces ";
my $string2 = " another one of those lines ";
trim($string);
trim($string2);
print "$string\n";
print "$string2\n";
full_strip($string);
full_strip($string2);
print "$string\n";
print "$string2\n";
sub trim {
$_[0] =~ s/^\s+|\s+$//g;
}
sub full_strip {
$_[0] =~ s/\s+//g;
}
$string=~s/^\s+|\s+$//g;
This would work well for any generic string where you want to remove beginning and ending spaces.
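The substitutions above all modify the string in place. If the original value should be kept, one option is the /r modifier (Perl 5.14 or newer), which returns the changed copy instead. A minimal sketch; the name trimmed is an assumption for illustration.

```perl
use 5.014;   # the /r modifier needs Perl 5.14 or newer
use strict;
use warnings;

# Hypothetical helper: return a trimmed copy, leaving the argument untouched.
sub trimmed {
    my ($text) = @_;
    return $text =~ s/^\s+|\s+$//gr;
}

my $raw   = "  padded value  ";
my $clean = trimmed($raw);
print "[$raw]\n";     # [  padded value  ]
print "[$clean]\n";   # [padded value]
```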
I want my program to divide the string by the spaces between them
$string = "hello how are you";
The output should look like that:
hello
how
are
you
You can do this in a few different ways.
use strict;
use warnings;
my $string = "hello how are you";
my @first = $string =~ /\S+/g; # regex capture non-whitespace
my @second = split ' ', $string; # split on whitespace
my $third = $string;
$third =~ tr/ /\n/; # copy string, substitute space for newline
# $third =~ s/ /\n/g; # same thing, but with s///
The first two create arrays with the individual words; the last creates a single new string. If all you want is something to print, the last will suffice. To print an array, do something like:
print "$_\n" for @first;
Notes:
Normally, regex capture requires parentheses /(\S+)/, but when the /g modifier is used, and parentheses are omitted, the entire match is returned.
When capturing this way, you need to ensure list context on the assignment. If the left-hand side is a scalar, force list context with parentheses: my ($var) = ...
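The context notes can be illustrated with a short sketch that runs the same match in list and scalar context; the variable names are chosen only for this example.

```perl
use strict;
use warnings;

my $string = "hello how are you";

my @all     = $string =~ /\S+/g;        # list context: every match
my ($first) = $string =~ /\S+/g;        # list context, keep only the first match
my $count   = () = $string =~ /\S+/g;   # assign to an empty list, then count the matches
my $matched = $string =~ /\S+/;         # scalar context: only success or failure

print "@all\n";     # hello how are you
print "$first\n";   # hello
print "$count\n";   # 4
print "matched\n" if $matched;
```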
I like to keep it simple:
$string = "hello how are you";
print $_, "\n" for split ' ', $string;
@Array = split(" ", $string); then @Array contains the answer.
You need a split for dividing the string by spaces like
use strict;
my $string = "hello how are you";
my @substr = split(' ', $string); # split the string by space
{
local $, = "\n"; # set the output field separator so each value prints on its own line
print @substr;
}
Output:
hello
how
are
you
Split with regexp to account for extra spaces if any:
my $string = "hello how are you";
my @words = split /\s+/, $string; ## account for extra spaces if any
print join "\n", @words;
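The two forms of split shown in these answers differ in one detail worth knowing: the special pattern ' ' skips leading whitespace, while /\s+/ produces an empty first field when the string starts with whitespace. Both collapse runs of spaces. A small sketch with a made-up padded string:

```perl
use strict;
use warnings;

my $padded = "  hello   how are you ";   # sample input with extra spaces

my @awk   = split ' ', $padded;     # leading whitespace is skipped
my @regex = split /\s+/, $padded;   # leading whitespace yields an empty first field

print scalar(@awk), "\n";     # 4
print scalar(@regex), "\n";   # 5
print "first regex field is empty\n" if $regex[0] eq '';
```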
Can you assist me in determining the correct $string = line to end up with partial_phone containing 4165867111?
sub phoneno {
my ($string) = @_;
$string =~ s/^\+*0*1*//g;
return $string;
}
my $phone = '<sip:+4165867111@something;tag=somethingelse>';
my $partial_phone = phoneno($phone);
$string =~ s{
\A # beginning of string
.+ # any characters
\+ # literal +
( # begin capture to $1
\d{5,} # at least five digits
) # end capture to $1
\@ # literal @
.+ # any characters
\z # end of string
}{$1}xms;
Your substitution starts with a ^, which means it won't perform substitution unless the rest of your pattern matches the start of your string.
There are lots of ways to do this. How about
my ($partial) = $phone =~ /([2-9]\d+)/;
return $partial;
This returns the first run of two or more digits that begins with a digit from 2 to 9, so a leading 0 or 1 is skipped.
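To see the effect of the leading ^ on the original substitution, here is a minimal sketch using the sample string from the question. The second pattern, which captures the digits after a literal +, is just one possible approach and not necessarily what the poster needs for other inputs.

```perl
use strict;
use warnings;

my $phone = '<sip:+4165867111@something;tag=somethingelse>';

# The original pattern is anchored at the start, where the string has '<',
# so all three optional parts match zero characters and nothing is removed.
(my $anchored = $phone) =~ s/^\+*0*1*//g;
print $anchored eq $phone ? "unchanged\n" : "changed\n";   # unchanged

# Capturing the digits that follow the literal '+' instead.
my ($digits) = $phone =~ /\+(\d+)/;
print "$digits\n";   # 4165867111
```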
This will capture all digits preceding the @:
use strict;
use warnings;
sub phoneno {
my ($string) = @_;
my ($phoneNo) = $string =~ /(\d+)\@/;
return $phoneNo;
}
my $phone = '<sip:+4165867111#something;tag=somethingelse>';
my $partial_phone = phoneno($phone);
print $partial_phone;
Output:
4165867111
I am using Perl to replace all instances of
../../../../../../abc' and
in a string with
/ and , respectively.
The method I am using looks like this:
sub encode
{
my $result = $_[0];
$result =~ s/..\/..\/..\/..\/..\/..\//\//g;
$result =~ s/ / /g;
return $result;
}
Is this correct?
Essentially, yes, although the first regex has to be written differently: because . matches any character, we have to escape it as \. or put it in its own character class [.]. The first regex can also be written more cleanly as
...;
$result =~ s{ (?: [.][.]/ ){6} }
{/}gx;
...;
We look for the literal pattern ../ repeated 6 times and then replace it. Because I use curly braces as a delimiter I don't have to escape the slash. Because I use the /x modifier I can have these spaces inside the regex improving readability.
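The point about the unescaped dot can be shown with a short sketch; the second test string is invented to show a false positive.

```perl
use strict;
use warnings;

my $unescaped = qr{(?:../){6}};       # '.' matches any character
my $escaped   = qr{(?:[.][.]/){6}};   # only a literal "../" repeated six times

my $path  = '../../../../../../abc';
my $other = 'ab/cd/ef/gh/ij/kl/abc';   # made-up string with no dots at all

print "unescaped matches path\n"  if $path  =~ $unescaped;
print "unescaped matches other\n" if $other =~ $unescaped;   # false positive
print "escaped matches path\n"    if $path  =~ $escaped;
print "escaped matches other\n"   if $other =~ $escaped;     # not printed
```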
Try this. It will print /foo bar/baz.
#!/usr/bin/perl -w
use strict;
my $result = "../../../../../../foo bar/baz";
#$result =~ s/(\.\.\/)+/\//g; #for any number of ../
$result =~ s/(\.\.\/){6}/\//g; #for 6 exactly
$result =~ s/ / /g;
print $result . "\n";
You forgot the abc, I think:
sub encode
{
my $result = $_[0];
$result =~ s/(?:\.\.\/){6}abc/\//g;
$result =~ s/ / /g;
return $result;
}
In perl, I have to determine whether user input is a palindrome or not and it must display like this:
Enter in 7 characters: ghghghg #one line here #
Palindrome! #second line answer#
But instead this is what it does:
Enter in 7 characters: g #one line#
h #second line#
g #third line#
h #fourth line#
g #fifth line#
h #sixth line#
g Palindrom
e! #seventh line#
My problem seems to be on the chomp lines with all the variables, but I just can't figure out what to do and I've been at it for hours. I need a simple solution, but I have not progressed to arrays yet, so I need something simple to fix this. Thanks
And here is what I have so far. The formula seems to work, but it keeps printing a new line for each character:
use strict;
use warnings;
my ($a, $b, $c, $d, $e, $f, $g);
print "Enter in 7 characters:";
chomp ($a = <>); chomp ($b = <>); chomp ($c = <>); chomp ($d = <>); chomp ($e = <>); chomp ($f = <>); chomp ($g = <>);
if (($a eq $g) && ($b eq $f) && ($c eq $e) && ($d eq $d) && ($e eq $c) && ($f eq $b) && ($g eq $a))
{print "Palindrome! \n";}
else
{print "Not Palindrome! \n";}
If you're going to determine if a word is the same backwards, may I suggest using reverse and lc?
chomp(my $word = <>);
my $reverse = reverse $word;
if (lc($word) eq lc($reverse)) {
print "Palindrome!";
} else {
print "Not palindrome!";
}
Perl is famous for its TIMTOWTDI. Here are two more ways of doing it:
use feature 'say';
print "Enter 7 characters: ";
chomp(my $i= <STDIN>);
say "reverse: ", pal_reverse($i) ? "yes" : "no";
say "regex: ", pal_regex($i) ? "yes" : "no";
sub pal_reverse {
my $i = (@_ ? shift : $_);
return $i eq reverse $i;
}
sub pal_regex {
return (@_ ? shift() : $_) =~ /^(.?|(.)(?1)\2)$/ + 0;
}
use strict;
use warnings;
use feature 'say';
print "Please enter 7 characters : ";
my $input = <>; # Read in input
chomp $input; # To remove trailing "\n"
# Season with input validation
warn 'Expected 7 characters, got ', length $input, ' instead'
unless length $input == 7;
# Determine if it's palindromic or not
say $input eq reverse($input) # parentheses keep reverse from swallowing the ?: expression
? 'Palindrome'
: 'Not palindrome' ;
TIMTOWTDI for the recursion-prone:
sub is_palindrome {
my ($s) = @_; # Work on a copy so the caller's value is untouched
return 1 if length $s < 2; # Whole string is palindromic
return unless substr($s, 0, 1, '') eq substr($s, -1, 1, ''); # Outer chars differ
@_ = ($s); # Pass the shortened string on
goto &is_palindrome; # Check next chars
}
say is_palindrome( 'ghghghg' ) ? 'Palindromic' : 'Not palindromic' ;
And perldoc perlretut for those who aren't :)
Recursive patterns
This feature (introduced in Perl 5.10) significantly extends the power
of Perl's pattern matching. By referring to some other capture group
anywhere in the pattern with the construct (?group-ref), the pattern
within the referenced group is used as an independent subpattern in
place of the group reference itself. Because the group reference may
be contained within the group it refers to, it is now possible to
apply pattern matching to tasks that hitherto required a recursive
parser.
To illustrate this feature, we'll design a pattern that matches if a
string contains a palindrome. (This is a word or a sentence that,
while ignoring spaces, interpunctuation and case, reads the same
backwards as forwards.) We begin by observing that the empty string or
a string containing just one word character is a palindrome. Otherwise
it must have a word character up front and the same at its end, with
another palindrome in between.
/(?: (\w) (?...Here be a palindrome...) \g{-1} | \w? )/x
Adding \W* at either end to eliminate what is to be ignored, we
already have the full pattern:
my $pp = qr/^(\W* (?: (\w) (?1) \g{-1} | \w? ) \W*)$/ix;
for $s ( "saippuakauppias", "A man, a plan, a canal: Panama!" ){
print "'$s' is a palindrome\n" if $s =~ /$pp/;
}
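As a rough cross-check of the quoted pattern, the sketch below compares it with a plain strip-and-reverse test. The helper name is_palindrome_plain and the third sample string are assumptions added for illustration; the pattern itself is copied from the perlretut excerpt above.

```perl
use strict;
use warnings;

# Pattern copied from the perlretut excerpt.
my $pp = qr/^(\W* (?: (\w) (?1) \g{-1} | \w? ) \W*)$/ix;

# Hypothetical cross-check: drop non-word characters, fold case,
# then compare the result with its reverse.
sub is_palindrome_plain {
    my ($text) = @_;
    (my $letters = lc $text) =~ s/\W+//g;
    return $letters eq reverse $letters;
}

for my $s ("saippuakauppias", "A man, a plan, a canal: Panama!", "not one") {
    my $by_regex   = $s =~ $pp ? 1 : 0;
    my $by_reverse = is_palindrome_plain($s) ? 1 : 0;
    print "'$s': regex=$by_regex reverse=$by_reverse\n";
}
```

For these three inputs the two methods should agree; the reverse-based check is usually easier to read, while the recursive pattern is useful when the test has to live inside a larger regex.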