Perl: How to get all grouped patterns - perl

I have the following code.
Here I am matching the vowels characters words:
if ( /(a)+/ and /(e)+/ and /(i)+/ and /(o)+/ and /(u)+/ )
{
print "$1#$2#$3#$4#$5\n";
$number++;
}
I am trying to get the all matched patterns using grouping, but I am getting only the last expression pattern, which means the fifth expression of the if condition. Here I know that it is giving only one pattern because last pattern matching in if condition. I want to get all matched patterns, however. Can anyone help me out of this problem?

It is not quite clear what you want to do. Here are some thoughts.
Are you trying to count the number of vowels? In which case, tr will do the job:
my $count = tr/aeiou// ;
printf("string:%-20s count:%d\n" , $_ , $count ) ;
output :
string:book count:2
string:stackoverflow count:4
Or extract the vowels
my #array = / ( [aeiou] ) /xg ;
print Dumper \#array ;
Output from "stackoverflow question"
$VAR1 = [
'a',
'o',
'e',
'o',
'u',
'e',
'i',
'o'
];
Or extract sequences of vowels
my #array = / ( [aeiou]+ ) /xg ;
print Dumper \#array ;
Output from "stackoverflow question"
$VAR1 = [
'a',
'o',
'e',
'o',
'ue',
'io'
];

You could use
sub match_all {
my($s,#patterns) = #_;
my #matches = grep #$_ >= 1,
map [$s =~ /$_/g] => #patterns;
wantarray ? #matches : \#matches;
}
to create an array of non-empty matches.
For example:
my $string = "aaa e iiii oo uuuuu aa";
my #matches = match_all $string, map qr/$_+/ => qw/ a e i o u /;
if (#matches == 5) {
print "[", join("][", #$_), "]\n"
for #matches;
}
else {
my $es = #matches == 1 ? "" : "es";
print scalar(#matches), " match$es\n";
}
Output:
[aaa][aa]
[e]
[iiii]
[oo]
[uuuuu]
An input of, say, "aaa iiii oo uuuuu aa" produces
4 matches

You have 5 patterns with one matching group () each. Not 1 pattern with 5 groups.
(a)+ looks for a string containing a, aa, aaa, aaaa etc. The match will be multiple a's, not the word containing the group of a-s.
Your if( ...) is true if $_ contains one or more of 'a','e','i','o','u'.

Related

Sort Hash Key and Value simultaneously Perl

I have a hash that I want to sort the keys numerically in ascending order and
its values in ascending alphabetically manner.
#!/usr/bin/perl
use warnings;
use strict;
use List::MoreUtils;
use Tie::IxHash;
my %KEY_VALUE;
#tie %KEY_VALUE,'Tie::IxHash';
my %KEY_VALUE= (
0 => [ 'A', 'C', 'B', 'A' ,'D'],
5 => [ 'D', 'F', 'E', ],
2 => [ 'Z', 'X', 'Y' ],
4 => [ 'E', 'R', 'M' ],
3 => [ 'A', 'B', 'B', 'A' ],
1 => [ 'C', 'C', 'F', 'E' ],
);
#while (my ($k, $av) = each %KEY_VALUE)
#{
# print "$k #$av\n ";
#}
#Sort the key numerically
foreach my $key (sort keys %KEY_VALUE)
{
print "$key\n";
}
#To sort the value alphabetically
foreach my $key (sort {$KEY_VALUE{$a} cmp $KEY_VALUE{$b}} keys %KEY_VALUE){
print "$key: $KEY_VALUE{$key}\n";
}
The wanted input is like this, and I want to print out the sorted keys and values.
%KEY_VALUE= (
0 => [ 'A','A','B','C','D'],
1 => [ 'C','C','E','F' ],
2 => [ 'X','Y','Z' ],
3 => [ 'A', 'A', 'B', 'B' ],
4 => [ 'E','M','R' ],
5 => [ 'D','E','F', ],
);
Additional problem, how to print the key and the scalar value of the first different value
Wanted Output:
KEY= 0 VALUE:0 2 3 4 #The scalar value of first A B C D, start with 0
KEY= 1 VALUE:0 2 3 #The scalar value of first C E F
KEY= 2 VALUE:0 1 2 #The scalar value of first X Y Z
KEY= 3 VALUE:0 2 #The scalar value of first A B
KEY= 4 VALUE:0 1 2 #The scalar value of first E M R
KEY= 5 VALUE:0 1 2 #The scalar value of first D E F
Hash keys have no defined order. Generally you sort the keys as you're iterating through the hash.
The values can be sorted as you iterate through the hash.
# Iterate through the keys in numeric order.
for my $key (sort {$a <=> $b } keys %hash) {
# Get the value
my $val = $hash{$key};
# Sort it in place
#$val = sort { $a cmp $b } #$val;
# Display it
say "$key -> #$val";
}
Note that by default sort sorts in ASCII order as strings. That means sort keys %KEY_VALUE is not sorting as numbers but as strings. sort(2,3,10) is (10,2,3). "10" is less than "2" like "ah" is less than "b". Be sure to use sort { $a <=> $b } for numeric sorting and sort { $a cmp $b } for strings.
You could use a different data structure such as Tie::Ixhash though tying has a significant performance penalty. Generally it's better to sort in place unless your hash gets very large.
You can't sort a hash, you can at best print it sorted (or keep the sorted keys in another array). Finding the position of the first value can be done with first_index; we remove duplicates with uniq.
foreach my $key (sort keys %KEY_VALUE) {
my #value = #{$KEY_VALUE{$key}};
my #indices = map { my $e = $_; first_index { $_ eq $e } #value } (uniq (sort #value));
print "$key: " . (join ', ', #indices) . "\n";
}

Perl to sort words by user-defined alphabet sequence

I have an array of "words" (strings), which consist of letters from an "alphabet" with user-defined sequence. E.g my "alphabet" starts with "ʔ ʕ b g d", so a list of "words" (bʔd ʔbg ʕʔb bʕd) after sort by_my_alphabet should be ʔbd ʕʔb bʔd bʕd.
sort by_my_alphabet (bʔd ʔbg ʕʔb bʕd) # gives ʔbd ʕʔb bʔd bʕd
Is there a way to make a simple subroutine by_my_alphabet with $a and $b to solve this problem?
Simple, and very fast because it doesn't use a compare callback, but it needs to scan the entire string:
use utf8;
my #my_chr = split //, "ʔʕbgd";
my %my_ord = map { $my_chr[$_] => $_ } 0..$#my_chr;
my #sorted =
map { join '', #my_chr[ unpack 'W*', $_ ] } # "\x00\x01\x02\x03\x04" ⇒ "ʔʕbgd"
sort
map { pack 'W*', #my_ord{ split //, $_ } } # "ʔʕbgd" ⇒ "\x00\x01\x02\x03\x04"
#unsorted;
Optimized for long strings since it only scans a string up until a difference is found:
use utf8;
use List::Util qw( min );
my #my_chr = split //, "ʔʕbgd";
my %my_ord = map { $my_chr[$_] => $_ } 0..$#my_chr;
sub my_cmp($$) {
for ( 0 .. ( min map length($_), #_ ) - 1 ) {
my $cmp = $my_ord{substr($_[0], $_, 1)} <=> $my_ord{substr($_[1], $_, 1)};
return $cmp if $cmp;
}
return length($_[0]) <=> length($_[1]);
}
my #sorted = sort my_cmp #unsorted;
Both should be faster than Sobrique's. Theirs uses a compare callback, and it scans the entire strings being compared.
Yes.
sort can take any function that returns a relative sort position. All you need is a function that correctly looks up the 'sort value' of a string for comparing.
So all you need to do here is define a 'relative weight' of your extra letters, and then compare the two.
#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my #sort_order = qw ( B C A D );
my #array_to_sort = qw ( A B C D A B C D AB BB CCC ABC );
my $count = 0;
my %position_of;
$position_of{$_} = $count++ for #sort_order;
print Dumper \%position_of;
sub sort_by_pos {
my #a = split //, $a;
my #b = split //, $b;
#iterate one letter at a time, using 'shift' to take it off the front
#of the array.
while ( #a and #b ) {
my $result = $position_of{shift #a} <=> $position_of{shift #b};
#result is 'true' if it's "-1" or "1" which indicates relative position.
# 0 is false, and that'll cause the next loop iteration to test the next
#letter-pair
return $result if $result;
}
#return a value based on remaining length - longest 'string' will sort last;
#That's so "AAA" comparing with "AA" comparison actually work,
return scalar #a <=> scalar #b;
}
my #new = sort { sort_by_pos } #array_to_sort;
print Dumper \#new;
Bit of a simple case, but it sorts our array into:
$VAR1 = [
'B',
'B',
'BB',
'C',
'C',
'CCC',
'A',
'A',
'AB',
'ABC',
'D',
'D'
];

Printing values of an array from an array of array references

How can I print the values of an array. I have tried several ways but I am unable to get the required values out of the arrays:
#array;
Dumper output is as below :
$VAR1 = [
'a',
'b',
'c'
];
$VAR1 = [
'd',
'e',
'f'
];
$VAR1 = [
'g',
'h',
'i'
];
$VAR1 = [
'j',
'k',
'l'
];
for my $value (#array) {
my $ip = $value->[0];
DEBUG("DEBUG '$ip\n'");
}
I am getting output as below, which means foreach instance I am only getting the first value.
a
d
g
j
I have tried several approaches :
First option :
my $size = #array;
for ($n=0; $n < $size; $n++) {
my $value=$array[$n];
DEBUG( "DEBUG: Element is as $value" );
}
Second Option :
for my $value (#array) {
my $ip = $value->[$_];
DEBUG("DEBUG Element is '$ip\n'");
}
What is the best way to do this?
It is obvious that you have list of arrays. You only loop over top list and print first (0th) value in your first example. Barring any automatic dumpers, you need to loop over both levels.
for my $value (#array) {
for my $ip (#$value) {
DEBUG("DEBUG '$ip\n'");
}
}
You want to dereference here so you need to do something like:
my #array_of_arrays = ([qw/a b c/], [qw/d e f/ ], [qw/i j k/])
for my $anon_array (#array_of_arrays) { say for #{$anon_array} }
Or using your variable names:
use strict;
use warnings;
my #array = ([qw/a b c/], [qw/d e f/], [qw/i j k/]);
for my $ip (#array) {
print join "", #{$ip} , "\n"; # or "say"
}
Since there are anonymous arrays involved I have focused on dereferencing (using PPB style!) instead of nested loops, but print for is a loop in disguise really.
Cheers.

Why if/elsif in Perl execute only the first block?

I am new to Perl. I have an assignment to write a Perl program that accept a countable word from a command line and then generates its plural form. I have composed the following code below, and it shows no errors of compilation. When I execute it from the command line:
(perl plural.pl, for example), it prompts me to enter a noun, then whatever noun I feed as input, the plural form is the same. It doesn't execute the remaining if statements.
For example, if I enter the word "cat", the plural is generated as "cats". But when I enter the word 'church', for example, the plural is generated as 'churches', "fly" as "flys".
Here is the code:
#!/usr/bin/perl
$suffix1 = 's';
$suffix2 = 'es';
$suffix3 = 'ies';
print "Enter a countable noun to get plural: ";
$word = <STDIN>;
chomp($word);
if(substr $word, -1 == 'b' or 'd' or 'c' or 'g' or 'r' or 'j' or 'k' or 'l' or 'm' or 'n' or 'p' or 'q' or 'r' or 't' or 'v' or 'w' or 'e' or 'i' or 'o' or 'u') {
$temp = $word.$suffix1;
print "The plural form of the word \"$word\" is: $temp \n";
}
elsif (substr $word, -1 == 's' or 'sh' or 'ch' or 'x' or 'z') {
$temp = $word.$suffix2;
print "The plural form of the word \"$word\" is: $temp \n";
}
elsif (substr $word, -1 == 'y') {
chop($word);
$temp = $word.$suffix3;
print "The plural form of the word \"$word\" is: $temp \n";
}
Could you help me making the code execute the three statements.
First of all, always use use strict; use warnings;.
Strings are compared using eq, not ==.
substr $word, -1 eq 'b' means substr $word, (-1 eq 'b') when you meant substr($word, -1) eq 'b'. You'll face lots of problems if you omit parens around function calls.
substr($word, -1) eq 'b' or 'd' means the same as (substr($word, -1) eq 'b') or ('d'). 'd' is always true. You'd need to use substr($word, -1) eq 'b' or substr($word, -1) eq 'd'. (Preferably, you'd save substr $word, -1 in a variable to avoid doing it repeatedly.)
substr $word, -1 will never equal ch or sh.
The match operator makes this easy:
if ($word =~ /[bdcgrjklmnpqrtvweiou]\z/) {
...
}
elsif ($word =~ /(?:[sxz]|[sc]h)\z/) {
...
}
elsif ($word =~ /y\z/) {
...
}
In Perl, we use eq for string comparison instead of ==.
You can't use or like this. It should be like if (substr($word, -1) eq 'b' or substr ($word, -1) eq 'd'). Otherwise you could use an array containing all the string that you would like to compare and grep from that array.
Duskast is right. Perl uses symbols for numeric comparisons, and strings for string comparisons.
== eq
!= ne
< lt
<= le
> gt
>= ge
<=> cmp
Also, your use of or, though a good try, doesn't work. The keyword or has weak precedence, and so the expression
substr $word, -1 == 'b' or 'd' or 'c' or
'g' or 'r' or 'j' or
'k' or 'l' or 'm' or
'n' or 'p' or 'q' or
'r' or 't' or 'v' or
'w' or 'e' or 'i' or
'o' or 'u'
is interpreted as
substr ($word, (-1 == 'b')) or 'd' or 'c' or
'g' or 'r' or 'j' or
'k' or 'l' or 'm' or
'n' or 'p' or 'q' or
'r' or 't' or 'v' or
'w' or 'e' or 'i' or
'o' or 'u'
I'm not sure what the substr works out to, but if it's false, the expression continues to the or 'b', which is interpreted as true. Have you seen regular expressions yet? This is much more idiomatically done as
if ($word =~ /[bdcgrjklmnpqrtvweiou]$/) {...}
# Does $word match any of those characters followed by
# the end of the line or string?
Look in the Perl docs for string substitution and the s/.../.../ construct.
By the way, if you were paid to do this instead of being a student, you'd use the Lingua modules instead.
First of all, always, always include use strict; and use warnings;.
Second, use indentations. I've taught Perl courses at work and refuse to accept any assignment that was not indented correctly. In fact, I'm very, very strict about this because I want users to learn to code to the standard (4 space indent, etc.). It makes your program easier to read and to support.
While we're at it, break overly long lines -- especially on StackOverflow. It's hard to read a program when you have to scroll back and forth.
Quick look at your program:
In Perl, strings and numerics use two different sets of boolean operations. This is because strings can contain only digits, but still be strings. Imagine inventory item numbers like 1384 and 993. If I'm sorting these as strings, the 1384 item comes first. If I am sorting them numerically, 993 should come first. Your program has no way of knowing this except by the boolean operation you use:
Boolean Operation Numeric String
================= ======= ======
Equals == eq
Not Equals != ne
Greater Than > gt
Less Than < lt
Greater than/Equals >= ge
Less than/Equals <= le
THe other is that an or, and, || and && only work with two booleans. This won't work:
if ( $a > $b or $c ) {
What this is saying is this:
if ( ( $a > $b ) or $c ) {
So, if $c is a non-zero value, then $c will be true, and the whole statement would be true. You have to do your statement this way:
if ( $a > $b or $a > $c ) {
Another thing, use qq(..) and q() when quoting strings that contain quotation marks. This way, you don't have to put a backslash in front of them.
print "The word is \"swordfish\"\n";
print qq(The word is "swordfish"\n);
And, if you use use feature qw(say); at the top of your program, you get the bonus command of say which is like print, except the ending new line is assumed:
say qq(The word is "swordfish");
When you use substr, $foo, -1, you are only looking at the last character. It cannot ever be a two character string:
if ( substr $word, -1 eq "ch" ) {
will always be false.
Long ifs are hard to maintain. I would use a for loop (actually not, but let's pretend for now..):
#! /usr/bin/env perl
#
# Use these in ALL of your programs
#
use strict;
use warnings;
use feature qw(say);
#
# Use better, more descriptive names
#
my $standard_plural_suffix = 's';
my $uncommon_plural_suffix = 'es';
my $y_ending_plural_suffix = 'ies';
print "Enter a countable noun to get plural: ";
chomp (my $word = <STDIN>);
my $plural_form;
#
# Instead of a long, long "if", use a for loop for testing. Easier to maintain
#
for my $last_letter qw( b d c g r j k l m n p q r t v w e i o u) {
if ( substr($word, -1) eq $last_letter ) {
$plural_form = $word . $standard_plural_suffix;
last;
}
}
#
# Is it an "uncommon plural" test (single character)
#
if ( not $plural_form ) {
for my $last_letter qw(s x z) {
if ( substr($word, -1) eq $last_letter ) {
$plural_form = $word . $uncommon_plural_suffix;
last;
}
}
}
#
# Is it an "uncommon plural" test (double character)
#
if ( not $plural_form ) {
for my $last_two_letters qw(sh ch) {
if ( substr($word, -2) eq $last_two_letters ) {
$plural_form = $word . $uncommon_plural_suffix;
last;
}
}
}
if ( not $plural_form ) {
if ( substr($word, -1) eq 'y' ) {
chop ( my $chopped_word = $word );
$plural_form = $chopped_word . $y_ending_plural_suffix;
}
}
if ( $plural_form ) {
say qq(The plural of "$word" is "$plural_form");
}
else {
say qq(Could not find plural form of "$word");
}
Do you know about regular expressions? Those would work a lot better than using substr because you can test multiple things at once. Plus, I wouldn't use chop, but a regular expression substitution:
#! /usr/bin/env perl
#
# Use these in ALL of your programs
#
use strict;
use warnings;
use feature qw(say);
#
# Use better, more descriptive names
#
my $standard_plural_suffix = 's';
my $uncommon_plural_suffix = 'es';
my $y_ending_plural_suffix = 'ies';
print "Enter a countable noun to get plural: ";
chomp (my $word = <STDIN>);
my $plural_form;
#
# Standard plural (adding plain ol' 's'
#
if ( $word =~ /[bdcgrjklmnpqrtvweiou]$/ ) {
$plural_form = $word . $standard_plural_suffix;
}
#
# Uncommon plural (adding es)
#
elsif ( $word =~ /([sxz]|[sc]h)$/ ) {
$plural_form = $word . $uncommon_plural_suffix;
}
#
# Final 'y' rule: Replace y with ies
#
elsif ( $word =~ /y$/ ) {
$plural_form = $word;
$plural_form =~ s/y$/ies/;
}
if ( $plural_form ) {
say qq(The plural of "$word" is "$plural_form");
}
else {
say qq(Could not find plural form of "$word");
}
I have changed your code a bit. I'm using regular expression:
#!/usr/bin/perl
$suffix1 = 's';
$suffix2 = 'es';
$suffix3 = 'ies';
print "Enter a countable noun to get plural: ";
$word = <STDIN>;
chomp($word);
if ( $word =~ m/(s|sh|ch|x|z)$/) {
$temp = $word . $suffix2;
}
elsif ( substr( $word, -1 ) eq 'y' ) {
chop($word);
$temp = $word . $suffix3;
}
else {
$temp = $word . $suffix1;
}
print "The plural form of the word \"$word\" is: $temp \n";
Also I recommend you always use strict; and use warnings;

Perl: Using glob for permutation of values in an array with a userdefined length

I just read
How can I generate all permutations of an array in Perl?
http://www.perlmonks.org/?node_id=503904
and
https://metacpan.org/module/Algorithm::Permute
I want to create all possible combinations with a userdefined length of values in an array.
perlmonks did it like this:
#a= glob "{a,b,c,d,e,1,2,3,4,5}"x 2;
for(#a){print "$_ "}
and this works fine, but instead of "{a,b,c,d,e,1,2,3,4,5}" I would like to use an array
i tried this:
#a= glob #my_array x $userinput ;
for(#a){print "$_ "}
but it didn't work, how can I do that? Or how can I limit the length of permutation within Algorithm::Permute ?
Simply generate the string from the array:
my #array = ( 'a' .. 'e', 1 .. 5 );
my $stringified = join ',', #array;
my #a = glob "{$stringified}" x 2;
say 0+#a; # Prints '100';
say join ', ', #a; # 'aa, ab, ac, ad ... 53, 54, 55'
One could also use a CPAN module. Like List::Gen:
use List::Gen 'cartesian';
my #permutations = cartesian { join '', #_ } map [ $_ ], ( 'a'..'e', 1..5 ) ;