Using Perl hash table to delete vowels in string, but output is always empty - perl

I'm trying to write a short script in Perl to go through an array of strings provided by the user, check in a hash table to see if there are vowels in the strings, then return the strings minus the vowels. I know this would be easier to accomplish using regex, but the parameters for the problem state that a hash table, exists(), and split() must be used. This is the script I have so far:
my @vowels = qw(a e i o u A E I O U);
my %vowel;
foreach $v (@vowels) {
    $vowel{$v} = undef;
}
foreach $word (@ARGV) {
    my @letter_array = split(undef,$word);
}
foreach $letter (@letter_array) {
    print($letter) if !exists($vowel{$letter})
}
print "\n"
Input: hello
Expected output: hll
Actual output: nothing
There are no error messages, so I know it's not a syntax error.
Any ideas what I'm messing up? I'm much more comfortable with Python and this is one of my first attempts at Perl.

An alternative and more compact method of achieving the same thing is to use the substitute operator, "s" with a regular expression that matches the vowels.
Here is an example
use strict;
use warnings;
for my $word (@ARGV)
{
print $word =~ s/[aeiou]//gri;
}
or more succinctly like this
use strict;
use warnings;
for (@ARGV)
{
print s/[aeiou]//gri;
}
Key points to note
the regular expression uses the Character Class [aeiou] to match a single lower-case vowel.
the substitute operator has been given three options
the i option to force a case insensitive match. This means the Character Class [aeiou] will match both uppercase and lower-case vowels.
the g option to make the substitute match all instances of the regular expression -- in this instance it will match against all the vowels in the string.
the r option (added in Perl 5.14) to get the substitute operator to return the substituted string instead of modifying the original.
running that gives this
$ perl try.pl hello world
hllwrld

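You should use strict so that mistakes in the visibility of your variables are caught at compile time.
If you require Perl version 5.12 or higher (use 5.012;), strict is enabled automatically.
Your @letter_array is declared with my inside the foreach $word (@ARGV) loop, so it exists only there. The last loop iterates over a different, empty @letter_array, which is why nothing is printed.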
If you want to fix that you'll get the following code:
#!/usr/bin/env perl
use strict;
use warnings;
my @vowels = qw( a e i o u y A E I O U Y );
my %vowel;
foreach my $v (@vowels) {
    $vowel{$v} = undef;
}
my @letter_array;
foreach my $word (@ARGV) {
    @letter_array = split //, $word;
}
foreach my $letter (@letter_array) {
    print($letter) if !exists($vowel{$letter})
}
print "\n"
But this code is still not practical.
If you get more than one word in the input, you'll show only the last one, because @letter_array is overwritten on each iteration.
You can use map to build the hash of vowels much more easily, without extra variables.
You can use fewer loops if you handle each word right after reading it.
You can also use unless instead of if ! to make the check prettier and more Perl-style.
Don't use split on undef; use split //, $word instead.
You can use for instead of foreach because it's the same but shorter :)
So you can get an optimised solution.
#!/usr/bin/env perl
use 5.012;
use warnings;
my %vowels = map { $_ => undef } qw( a e i o u y A E I O U Y );
for my $word (@ARGV) {
    my @letters = split //, $word;
    for my $letter (@letters) {
        print $letter unless exists $vowels{$letter};
    }
    print ' ';
}
print "\n"
Result:
$ perl delete_vowels.pl hello world
hll wrld
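The same hash, exists and split approach can also be wrapped in a function that returns the stripped word instead of printing it letter by letter. This is only a sketch: strip_vowels is a hypothetical helper name, and the vowel list follows the question (no y).

```perl
use strict;
use warnings;

# Lookup table of vowels; only the keys matter, so the values stay undef.
my %is_vowel = map { $_ => undef } qw( a e i o u A E I O U );

# strip_vowels is a hypothetical name chosen for this illustration.
sub strip_vowels {
    my ($word) = @_;
    # split // breaks the word into single characters;
    # grep keeps only those that are not keys of %is_vowel.
    return join '', grep { !exists $is_vowel{$_} } split //, $word;
}

# Strip every command-line argument and print the results on one line.
print join(' ', map { strip_vowels($_) } @ARGV), "\n";
```

Returning a string keeps the filtering separate from the output, so the function can be reused or tested on its own.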

Related

Perl, search string for occurrence of items of array

For a file filter, I want to use an array of words, where lines are checked if they match any of the words.
I already have a rather straightforward approach to this (only the essential matching part):
# check if any of the @words is found in $term
@words= qw/one
two
three/;
$term= "too for the show";
# the following looks very C like
$size= @words;
$found= 0;
for ($i= 0; $i<$size && !$found; $i++) {
    $found|= $term=~ /$words[$i]/;
}
printf "found= %d\n", $found;
Having seen a lot of arcane syntax and solutions in Perl, I'm wondering if (or rather what) are more compact ways of writing this.
Create a regular expression from all the words and do just one match:
#!/usr/bin/perl
use warnings;
use strict;
my @words = qw( one two three );
my $regex = join '|', map quotemeta, @words;
for my $term ('too for the show', 'five four three', 'bones') {
my $found = $term =~ $regex;
printf "found = %d\n", $found;
}
Matching /\b(?:$regex)\b/ would prevent bones from matching one.
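To make the difference concrete, here is a small sketch that compares the plain alternation with the word-boundary version on the same terms. The variable names $anywhere and $whole_word are made up for this illustration.

```perl
use strict;
use warnings;

my @words = qw( one two three );
my $alternation = join '|', map quotemeta, @words;

# Compile both variants once with qr//.
my $anywhere   = qr/$alternation/;          # matches inside longer words
my $whole_word = qr/\b(?:$alternation)\b/;  # matches whole words only

for my $term ('too for the show', 'five four three', 'bones') {
    printf "%-20s substring=%d whole_word=%d\n",
        "'$term'",
        ($term =~ $anywhere   ? 1 : 0),
        ($term =~ $whole_word ? 1 : 0);
}
```

Here 'bones' is reported as a match only by the unanchored pattern, while 'five four three' matches both.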
Use Regexp::Assemble to turn the search into one regex. That way each string only has to be scanned once making it more efficient for large numbers of lines.
Regexp::Assemble is preferable to doing it manually. It has a full API of things you might want to do with such a regex, it can handle edge cases, and it can intelligently compile into a more efficient regex.
For example, this program produces (?^:\b(?:t(?:hree|wo)|one)\b) which will result in less backtracking. This becomes VERY important as your word list increases in size. Recent versions of Perl, about 5.14 and up, will do this for you.
use strict;
use warnings;
use v5.10;
use Regexp::Assemble;
# Wrap each word in \b (word break) so only the full word is
# matched. 'one' will match 'money' but '\bone\b' won't.
my @words = qw(
\bone\b
\btwo\b
\bthree\b
);
# These lines simulate reading from a file.
my @lines = (
"won for the money\n",
"two for the show\n",
"three to get ready\n",
"now go cat go!\n"
);
# Assemble all the words into one regex.
my $ra = Regexp::Assemble->new;
$ra->add(@words);
for my $line (@lines) {
    print $line if $line =~ $ra;
}
Also note the foreach style loop to iterate over an array, and the use of a statement modifier.
Finally, I used \b to ensure that only the actual words are matched, not substrings like money.
This is perhaps an overly simplistic "translation" of your C like code into perl.
Pro: It's compact
Con: It's not very efficient (the other answers are a ton better here).
@words= qw/one
two
three/;
$term= "too for the show";
my @found = grep { $term =~ /$_/; } @words;
printf "found= %d\n", scalar @found;

Not an ARRAY reference error in "pop($str)"

I am learning Perl for work and I'm trying to practise with some basic programs.
I want my program to take a string from STDIN and modify it by taking the last character and putting it at the start of the string.
I get an error when I use variable $str in $str = <STDIN>.
Here is my code:
my $str = "\0";
$str = <STDIN>;
sub last_to_first {
chomp($str);
pop($str);
print $str;
}
last_to_first;
Exec :
Matrix :hi
Not an ARRAY reference at matrix.pl line 13, <STDIN> line 1.
Why your approach doesn't work
The pop keyword does not work on strings. Strings in Perl are not automatically cast to character arrays, and those array keywords only work on arrays.
The error message is Not an ARRAY reference because pop sees a scalar variable. References are scalars in Perl (the scalar here is something like a reference to the address of the actual array in memory). The pop built-in takes array references in Perl versions between 5.14 and 5.22. It was experimental, but got removed in the (currently latest) 5.24.
Starting with Perl 5.14, an experimental feature allowed pop to take a scalar expression. This experiment has been deemed unsuccessful, and was removed as of Perl 5.24.
How to make it work
You have to split and join your string first.
my $str = 'foo';
# turn it into an array
my @chars = split //, $str;
# remove the last char and put it at the front
unshift @chars, pop @chars;
# turn it back into a string
$str = join '', @chars;
print $str;
That will give you ofo.
Now to use that as a sub, you should pass a parameter. Otherwise you do not need a subroutine.
sub last_to_first {
my $str = shift;
my @chars = split //, $str;
unshift @chars, pop @chars;
$str = join '', @chars;
return $str;
}
You can call that sub with any string argument. You should do the chomp to remove the trailing newline from STDIN outside of the sub, because it is not needed for switching the chars. Always build your subs in the smallest possible unit to make it easy to debug them. One piece of code should do exactly one functionality.
You also do not need to initialize a string with \0. In fact, that doesn't make sense.
Here's a full program.
use strict;
use warnings 'all';
my $str = <STDIN>;
chomp $str;
print last_to_first($str);
sub last_to_first {
my $str = shift;
my @chars = split //, $str;
unshift @chars, pop @chars;
$str = join '', @chars;
return $str;
}
Testing your program
Because you now have one unit in your last_to_first function, you can easily implement a unit test. Perl brings Test::Simple and Test::More (and other tools) for that purpose. Because this is simple, we'll go with Test::Simple.
You load it, tell it how many tests you are going to do, and then use the ok function. Ideally you would put the stuff you want to test into its own module, but for simplicity I'll have it all in the same program.
use strict;
use warnings 'all';
use Test::Simple tests => 3;
ok( last_to_first('foo') eq 'ofo' );
ok( last_to_first('123') eq '312' );
ok( last_to_first('qqqqqq') eq 'qqqqqq' );
sub last_to_first {
my $str = shift;
my @chars = split //, $str;
unshift @chars, pop @chars;
$str = join '', @chars;
return $str;
}
This will output the following:
1..3
ok 1
ok 2
ok 3
Run it with prove instead of perl to get a bit more comprehensive output.
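Test::More, mentioned above, provides is(), which compares the actual and expected values and prints both when they differ. A minimal sketch of the same three checks, with test descriptions that are only illustrative:

```perl
use strict;
use warnings;
use Test::More tests => 3;

sub last_to_first {
    my $str = shift;
    my @chars = split //, $str;
    unshift @chars, pop @chars;    # move the last character to the front
    return join '', @chars;
}

# is( got, expected, description )
is( last_to_first('foo'),    'ofo',    'last char of foo moves to the front' );
is( last_to_first('123'),    '312',    'digits are rotated, not reversed' );
is( last_to_first('qqqqqq'), 'qqqqqq', 'identical chars stay the same' );
```

On a failure, is() reports the value it got next to the value it expected, which is usually more helpful than a bare "not ok".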
Refactoring it
Now let's change the implementation of last_to_first to use a regular expression substitution with s/// instead of the array approach.
sub last_to_first {
my $str = shift;
$str =~ s/^(.+)(.)$/$2$1/;
return $str;
}
This code uses a pattern match with two groups (). The first one matches one or more characters after the beginning of the string ^, and the second one matches exactly one character, after which the string ends $. Those groups end up in $1 and $2, and all we need to do is switch them around.
If you replace your function in the program with the test, and then run it, the output will be the same. You have just refactored one of the units in your program.
You can also try the substr approach from zdim's answer with this test, and you will see that the tests still pass.
The core function pop takes an array, and removes and returns its last element.
To manipulate characters in a string you can use substr, for example
use warnings;
use strict;
my $str = <STDIN>;
chomp($str);
my $last_char = substr $str, -1, 1, '';
my $new_str = $last_char . $str;
The arguments to substr mean: search the variable $str, at offset -1 (one from the back), for a substring of length 1, and replace that with an empty string '' (thus removing it). The substring that is found, here the last character, is returned. See perldoc -f substr for details.
In the last line the returned character is concatenated with the remaining string, using the . operator.
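Wrapped in a function, the substr approach might look like the sketch below. The name rotate_last_to_first and the guard for short strings are additions for this illustration, not part of the answer above.

```perl
use strict;
use warnings;

# rotate_last_to_first is a hypothetical name used for illustration.
sub rotate_last_to_first {
    my ($str) = @_;
    # Nothing to rotate for empty or one-character strings; this also
    # avoids calling substr outside of an empty string.
    return $str if length($str) < 2;
    # Four-argument substr removes the last character from $str
    # and returns the removed character.
    my $last_char = substr $str, -1, 1, '';
    return $last_char . $str;
}

print rotate_last_to_first('hello'), "\n";
```

Because the function works on its own copy of the argument, the caller's string is not modified.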
You can browse the list of functions broken down by categories at Perl functions by category.
Perl documentation has a lot of goodies, please look around.
Strings are very often manipulated using regular expressions. See the tutorial perlretut, the quick start perlrequick, the quick reference perlreref, and the full reference perlre.
You can also split a string into a character array and work with that. This is shown in detail in the answer by simbabque, which packs a whole lot more of good advice.
The same idea works for the elements of an array, using pop and unshift:
my @arrays = qw(jan feb mar);
last_to_first(@arrays);
sub last_to_first
{
    my @lists = @_;
    my $last = pop(@lists);
    #print $last;
    unshift @lists, $last;
    print @lists;
}
For a scalar variable, substr with a negative offset returns the last character:
my $str = "";
$str = <STDIN>;
chomp ($str);
last_to_first($str);
sub last_to_first
{
my $chr = shift;
my $lastchar = substr($chr, -1);
print $lastchar;
}

Perl substitute not working after previous pattern match

I have a strange problem: the substitute operator s/// doesn't work after a preceding pattern match. For example
use strict;
use warnings;
my $var = "var";
$var =~ s||/|g;
print "$var\n";
The output is: /v/a/r/
But in this case
use strict;
use warnings;
my $a = "test";
if ($a =~ /te/) {
my $var = "var";
$var =~ s||/|g;
print "$var\n";
}
the output is: var, when it should be the same as the previous result.
What is going on here? How can I fix it?
perlop has this to say about the empty pattern //
If the PATTERN evaluates to the empty string, the last successfully matched regular expression is used instead. In this case, only the g and c flags on the empty pattern are honored; the other flags are taken from the original pattern. If no match has previously succeeded, this will (silently) act instead as a genuine empty pattern (which will always match).
So your first case does a substitution on the empty string because there have been no previous pattern matches, while the second comes after a successful match of te in test, so it replaces every te in var; there are none, so it has no effect.
This program demonstrates
use strict;
use warnings;
my $str = 'a/b/c';
if ($str =~ m{/}) {
$str =~ s//x/g;
}
print $str;
output
axbxc
The only exception to this is the pattern in the split command, which always matches an empty pattern if that is what you specify.
To get around this, if you really want to match the point before and after every character, you can use the /x modifier to use an insignificant space for your pattern, like this
use strict;
use warnings;
my $var_a = 'test';
if ($var_a =~ /te/) {
my $var_b = 'var';
$var_b =~ s| |/|gx;
print "$var_b\n";
}
output
/v/a/r/
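Another way to sketch the same workaround is an empty non-capturing group, (?:). It matches the empty string, but the pattern text itself is not empty, so the "last successful pattern" rule should not apply. The function name below is hypothetical.

```perl
use strict;
use warnings;

# slashes_around_chars is a hypothetical name used for illustration.
sub slashes_around_chars {
    my ($str) = @_;
    if ('test' =~ /te/) {    # a successful match first, as in the question
        # (?:) matches at every position without reusing /te/.
        $str =~ s{(?:)}{/}g;
    }
    return $str;
}

print slashes_around_chars('var'), "\n";
```

This avoids relying on the /x modifier, at the cost of a slightly more cryptic pattern.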

Using a char variable in tr///

I am trying to count the characters in a string and found an easy solution counting a single character using the tr operator. Now I want to do this with every character from a to z. The following solution doesn't work because tr/// matches every character.
my @chars = ('a' .. 'z');
foreach my $c (@chars)
{
$count{$c} = ($text =~ tr/$c//);
}
How do I correctly use the char variable in tr///?
tr/// doesn't work with variables unless you wrap it in an eval
But there is a nicer way to do this:
$count{$_} = () = $text =~ /$_/g for 'a' .. 'z';
For the TIMTOWTDI:
$count{$_}++ for grep /[a-z]/i, split //, $text;
tr doesn't support variable interpolation (neither in the search list nor in the replacement list). If you want to use variables, you must use eval():
$count{$c} = eval "\$text =~ tr/$c/$c/";
That said, a more efficient (and secure) approach would be to simply iterate over the characters in the string and increment counters for each character, e.g.:
my %count = map { $_ => 0 } 'a' .. 'z';
for my $char (split //, $text) {
$count{$char}++ if defined $count{$char};
}
If you look at the perldoc for tr/SEARCHLIST/REPLACEMENTLIST/cdsr, then you'll see, right at the bottom of the section, the following:
Because the transliteration table is built at compile time, neither the SEARCHLIST nor the REPLACEMENTLIST are subjected to double quote interpolation. That means that if you want to use variables, you must use an eval():
eval "tr/$oldlist/$newlist/";
die $@ if $@;
eval "tr/$oldlist/$newlist/, 1" or die $@;
Thus, you would need an eval to generate a new SEARCHLIST.
This is going to be very inefficient... the code might feel neat, but you're processing the complete string 26 times. You're also not counting uppercase characters.
You'd be better off stepping through the string once and just incrementing counters for each character found.
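A single pass that also folds uppercase letters into their lowercase counters could be sketched like this. count_letters is a hypothetical helper name, and the sample text is made up.

```perl
use strict;
use warnings;

# count_letters is a hypothetical helper: it returns a hash reference
# mapping each letter a..z to its case-insensitive count in $text.
sub count_letters {
    my ($text) = @_;
    my %count = map { $_ => 0 } 'a' .. 'z';
    # lc folds uppercase letters, so 'H' is counted under 'h'.
    for my $char (split //, lc $text) {
        $count{$char}++ if exists $count{$char};
    }
    return \%count;
}

my $count = count_letters('Hello, World');
# Print only the letters that actually occur.
print "$_: $count->{$_}\n" for grep { $count->{$_} } sort keys %$count;
```

The string is scanned once, regardless of how many letters are being counted.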
From the perlop documentation:
tr/AAA/XYZ/
will transliterate any A to X.
Because the transliteration table is built at compile time, neither
the SEARCHLIST nor the REPLACEMENTLIST are subjected to double quote
interpolation. That means that if you want to use variables, you must
use an eval()
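Alternatively, in your case you can use the s/// operator, which returns the number of substitutions made (note that it also removes the matched characters from $text):
foreach my $c (@chars) {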
$count{$c} += ($text =~ s/$c//g);
}
My solution with some modification based from http://www.perlmonks.org/?node_id=446003
sub lowerLetters {
my $string = shift;
my %table;
@table{split //, $letters_uc} = split //, $letters_lc;
my $table_re = join '|', map { quotemeta } reverse sort keys %table;
$string =~ s/($table_re)/$table{$1}/g;
return if not defined $string;
return $string;
}
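You may want to use s/// instead. Substitution is much more powerful than tr, and it interpolates variables.
My solution:
$text =~ s/\Q$search\E/$replace/g;
The g at the end means "replace every occurrence", and \Q...\E escapes any regex metacharacters in $search.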
See:
https://blog.james.rcpt.to/2010/10/25/perl-search-and-replace-using-variables/
https://docstore.mik.ua/orelly/perl3/lperl/ch09_06.htm

Perl's tr/// is not doing what I want

EDIT: tr/// does not support variable interpolation, so I went with s/\Q$_\E//g; instead
Or, more likely, I'm not doing something right...
I have the following code:
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
sub strip_invalid {
my ($str, @chars) = @_;
map { $str =~ tr/$_//; } @chars;
return $str;
}
my @invalid = qw( a e i o u );
print strip_invalid("This is the super sample with vowels.\n", @invalid);
I'd just like to pass a string to strip_invalid() and have tr/// remove the characters in @invalid through a map... Where did I go wrong? (by the way, using regular expressions it works).
Perl's tr feature doesn't support variables.
Note that because the translation table is built at compile time, neither the SEARCHLIST nor the REPLACEMENTLIST are subjected to double quote interpolation. That means that if you want to use variables, you must use an eval():
eval "tr/$oldlist/$newlist/";
(Source)
Since tr/// does not allow the use of variables, I would suggest something along these lines (rather than using eval, which raises other concerns):
sub strip_invalid {
my $str = shift;
my $chars = quotemeta(join '', @_);
$str =~ s/[$chars]//g;
return $str;
}
Note also that tr/// has a delete option, so it's not necessary to iterate across all characters that you want to delete. For example:
$str =~ tr/aeiou//d; # Delete all vowels from $str
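As a small sketch of how the flags change tr's behaviour with a fixed character list: without /d it only counts, and with /d plus /r (Perl 5.14 or later) it returns a stripped copy. The variable names are made up for this illustration.

```perl
use strict;
use warnings;

my $str = "This is the super sample with vowels.";

# Without /d and with an empty replacement list, tr leaves the string
# unchanged and returns how many characters matched.
my $vowel_count = ($str =~ tr/aeiou//);

# With /d the matched characters are deleted; /r (Perl 5.14+) returns
# the modified copy instead of editing $str in place.
my $stripped = ($str =~ tr/aeiou//dr);

print "$vowel_count vowels\n";
print "$stripped\n";
```

Since the search list is written literally, no interpolation and no eval is needed here.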
To delete with tr, you need to specify the /d flag. Otherwise, an empty REPLACEMENTLIST defaults to the SEARCHLIST, so tr leaves the string unchanged and just counts the matching characters (or squeezes repeats with /s).
And tr does not support variable interpolation.
To use tr, you'd need to do something like this:
sub strip_invalid {
my ($str, @chars) = @_;
my $strip = quotemeta join '', @chars;
eval "\$str =~ tr/$strip//d";
return $str;
}