Perl Function that validates input using if statements - perl

I want to create a Perl function as follows:
$iError = _validate( "$cVariable", "c" );
$cVariable – the input that I want to validate against a pre-defined standard
"c" – the type of data that is expected
c = character
a = alphanumeric
i = integer
x = decimal number
d = date, two default formats are YY_MM_DD and YYYY_MM_DD
f = file/dir name
e = characters that are valid in an email address (email address must have \# instead of just #)
Return values:
Success = 0
Failed = -1
This is what I have so far but there's definitely some formatting errors that I can't catch since I am completely new to Perl, can anyone point out where I am going wrong?
$iError=_validate($cVariable, c)
if ($c == c) {
if ($cVariable =~ ^.+$) {
$iError=0
} else {
$iError=-1
} # end if
elsif ($c == a) {
if ($cVariable =~ ^([a-z])|([0-9])+$) {
%iError=0
}
else {$iError=-1} # end if
}
elsif ($c == i) {
if ($cVariable =~ ^[-+]?\d+$) {
$iError=0
}
else {$iError = -1} # end if
}
elsif ($c == x) {
if ($cVariable =~ ^[0-9]+[.][0-9]+$) {
$iError=0
}
else {$iError=-1} # end if
}
elsif ($c == d) {
if ($cVariable =~ ^\d{2}_{1}\d{2}_{1}\d{2}$) {
$iError=0
}
elsif ($cVariable =~ ^\d{4}_{1}\d{2}_{1}\d{2}$) {
$iError=0
}
else {$iError=-1} # end if
}
elsif($c == f) {
if ($cVariable =~ ^.+$) {
$iError=0
}
else {$iError=-1} # end if
}
else($c == e) {
if ($cVariable =~ ^\S{0,50}\\#\S{0,20}[.]\S{1,10}$) {
$iError=0
}
else {$iError=0} # end if
} # end outer if
return($iError);

One way to develop a solution, while learning Perl, would be to take the Test Driven Development [TDD] approach, in which you start out writing one or more tests and then develop the code to enable the tests to pass.
For the problem you state, you could start with a file of tests (call it 'validate.t' and put it in a directory named 't'):
#!/usr/bin/env perl -I.
use Test::More ;
BEGIN { require_ok( "ValidateStarter.pl" ) ; }
my $cVariable = 'abc' ;
my $iError = validate( $cVariable, 'c' ) ;
is( $iError, 0, "correctly validated a character string ($cVariable)" ) ;
$cVariable = 'def456' ;
$iError = validate( $cVariable, 'c' ) ;
is( $iError, -1, "correctly validated a non-character string ($cVariable)" ) ;
$cVariable = 'def456' ;
$iError = validate( $cVariable, 'a' ) ;
is( $iError, 0, "correctly validated an alphanumeric string ($cVariable)" ) ;
$cVariable = '123' ;
$iError = validate( $cVariable, 'a' ) ;
is( $iError, -1, "correctly validated a non-alphanumeric string ($cVariable)" ) ;
$cVariable = '1' ;
$iError = validate( $cVariable, 'i' ) ;
is( $iError, 0, "correctly validated an integer ($cVariable)" ) ;
$cVariable = 'z' ;
$iError = validate( $cVariable, 'i' ) ;
is( $iError, -1, "correctly validated a non-integer ($cVariable)" ) ;
$cVariable = '123.456' ;
$iError = validate( $cVariable, 'x' ) ;
is( $iError, 0, "correctly validated a decimal number ($cVariable)" ) ;
$cVariable = '-0.1234567' ;
$iError = validate( $cVariable, 'x' ) ;
is( $iError, 0, "correctly validated a decimal number ($cVariable)" ) ;
$cVariable = '1234567' ;
$iError = validate( $cVariable, 'x' ) ;
is( $iError, 0, "correctly validated a decimal number ($cVariable)" ) ;
$cVariable = '0xDEADBEEF' ;
$iError = validate( $cVariable, 'x' ) ;
is( $iError, -1, "correctly validated a non-decimal number ($cVariable)" ) ;
done_testing ;
Next, in the directory 'above' t/, create a file called ValidateStarter.pl:
#!/usr/bin/env perl
use strict ;
use warnings ;
use Regexp::Common qw( number ) ;
sub validate {
my ( $cVar, $c ) = #_ ;
if ( 'c' eq $c ) {
if ( $cVar =~ /^[[:alpha:]]+$/ ) {
return 0 ;
}
}
elsif ( 'a' eq $c ) {
if ( $cVar =~ /^[[:alpha:]][[:alnum:]]+$/ ) {
return 0 ;
}
}
elsif ( 'i' eq $c ) {
if ( $cVar =~ /^$RE{num}{int}$/ ) {
return 0 ;
}
}
elsif ( 'x' eq $c ) {
if ( $cVar =~ /^$RE{num}{decimal}$/ ) {
return 0 ;
}
}
elsif ( 'a' eq $c ) {
if ( $cVar =~ /^\A\p{Alnum}+\z$/ ) {
return 0 ;
}
}
return -1 ;
}
1 ;
Execute the tests by changing to the directory containing ValidateStarter.pl and the t/ directory and typing (Note: '$' would be your console prompt -- don't type it):
$ perl t/validate.t
If you can figure out how to make 'validate.t' an executable file (hint: use 'chmod' on Linux), then you could just type:
$ t/validate.t
On Linux, you'd see:
$ t/validate.t
ok 1 - require 'ValidateStarter.pl';
ok 2 - correctly validated a character string (abc)
ok 3 - correctly validated a non-character string (def456)
ok 4 - correctly validated an alphanumeric string (def456)
ok 5 - correctly validated a non-alphanumeric string (123)
ok 6 - correctly validated an integer (1)
ok 7 - correctly validated a non-integer (z)
ok 8 - correctly validated a decimal number (123.456)
ok 9 - correctly validated a decimal number (-0.1234567)
ok 10 - correctly validated a decimal number (1234567)
ok 11 - correctly validated a non-decimal number (0xDEADBEEF)
1..11
'ok' on a line of output means a test passed, while 'not ok' would mean it failed.
Starting with these lines of working code, I would suggest further steps along these lines:
Read Test::Tutorial for more on how to write tests in Perl.
Read Regexp::Common::number to see about using some good regular expression utilities.
Browse the online text of 'Modern Perl' by chromatic to read more about Perl itself.
Study and tinker with the example code until you understand how it works.
Add more test cases applying to the sample code.
Add a test case applying to another of the line items in your problem spec. and follow-up with the code to enable the test to pass.
Debug & add more test cases until you're done.
Judging by the very specific nature of your question, it seems likely that it's based on an assignment with a specific deadline, so you may believe you don't have time to write automated tests; but TDD is a good way to make progress incrementally as you learn what you need to know to develop the solution.
Keep in mind that you will know that whatever parts you're able to finish by the deadline are working as proven by the tests that you've written.

When you do RegEx ("=~"), you need to surround the expression with "/". So your first RegEx if statement should look like:
if ($cVariable =~ /^.+$/) {

Your code is broken and incomplete enough that I'm reluctant to wade through all the error messages just to get it to compile, as you would still have a flawed solution in the end.
Syntax, just about anyone can learn. But the more important problem is that each of the things you're trying to validate (with the exception of a character) are not as trivial as you might expect. Given the complexities involved in validating an email address, or a date, well-proven code as provided by trusted CPAN modules is the way to go. You really don't want to waste a lot of time fleshing out the details of a date validator, for example.
Here's how I might get started on it. The following code provides exactly the behavior you've described (with the exception of file/dir, since I don't know what you want there). But it does so in a way that is probably more correct than simple regex matching alone:
#!/usr/bin/env perl
use strict;
use warnings;
use Scalar::Util 'looks_like_number';
use Date::Calc 'check_date';
use Email::Valid;
use constant VALIDATORS => {
'c' => \&char, # Validate a single character.
'a' => \&alnum, # Validate string is alphanumeric only.
'i' => \&intgr, # Validate input is an integer.
'x' => \&dec, # Validate input is a number.
'd' => \&date, # Validate input is a date in YYYY_MM_DD or YY_MM_DD
'f' => \&fdn, # Who knows?! Something to do with file validation.
'e' => \&eml, # Validate input is a legal email address.
};
sub validate {
my( $input, $mode ) = #_;
die "No mode provided." unless defined $mode;
die "Invalid mode: $mode." unless exists VALIDATORS->{$mode};
return -1 if not defined $input;
return VALIDATORS->{$mode}->($input) ? 0 : -1;
}
sub char {
my $c = shift;
return 1 == length $c;
}
sub alnum {
my $an = shift;
return $an =~ /\A\p{Alnum}+\z/;
}
sub intgr {
my $n = shift;
return looks_like_number($n) && $n == int($n);
}
sub dec {
return looks_like_number(shift);
}
sub date {
my $date = shift;
my( $y, $m, $d );
if( ( $y, $m, $d ) = $date =~ m/\A(\d{2}|\d{4})_(\d{2})_(\d{2})\z/ ) {
return check_date( $y, $m, $d );
}
else {
return 0;
}
}
sub fdn {
# I have no idea what you want to do in validating a filename or directory.
# Is this a matter of "legal characters" for a given OS?
# Or is it a matter of "This file / path exists"?
}
sub eml { return Email::Valid->address(shift) }
# ___________________________
use Test::More;
{
local $#;
eval{ validate('a') };
like ( $#, qr/^No mode provided/, 'Die if no mode provided.' );
}
{
local $#;
eval{ validate('a','invalid') };
like( $#, qr/^Invalid mode/, 'Die on invalid mode.' );
}
# 0 = success, -1 = failure.
ok( 0 == validate( 'a','c' ), 'Char accepted.' );
ok( -1 == validate( 'ab', 'c' ), 'More than one char rejected.' );
ok( -1 == validate( '', 'c' ), 'Empty string rejected.' );
ok( -1 == validate( undef, 'c' ), 'undefined value rejected.' );
# 0 = success, non-zero = failure (-1).
ok( !validate( 'aA10', 'a' ), 'Alnum accepted.' );
ok( validate( '.$', 'a' ), 'Non-alnum rejected.' );
ok( validate( undef,'a' ), 'undefined value rejected for alnum.' );
ok( !validate( '10', 'i' ), 'Positive integer.' );
ok( !validate( -5, 'i' ), 'Negative integer.' );
ok( validate( -0.5, 'i' ), 'Reject non-integer.' );
ok( validate( 'a', 'i' ), 'Reject non-numeric as int.' );
ok( !validate( '10', 'x' ), 'Positive integer as decimal number.' );
ok( !validate( '10.5', 'x' ), 'Positive floating point as decimal number.' );
ok( validate( '17f', 'x' ), 'Decimal rejects hex string.' );
ok( validate( '12.3.5', 'x' ), 'Malformed decimal rejected.' );
ok( !validate( '1600_12_15', 'd' ), 'Four digit year date accepted.' );
ok( !validate( '14_06_05', 'd' ), 'Two digit year date accepted.' );
ok( validate( '15000_12_15', 'd' ), 'Y10k bug ;)' );
ok( validate( '2000_02_30', 'd' ), 'Impossible date rejected.' );
ok( !validate( 'someone#example.com', 'e' ), 'Good email address accepted.' );
ok( validate( 'a"b(c)d,e:f;g<h>i[j\k]l#example.com', 'e' ),
'Bad email rejected.' );
ok( validate( 'a#b#example.com', 'e' ), 'Bad email rejected.' );
done_testing();
I don't particularly care for the "0 == success, -1 == failure" mode, but it's not unheard of, and is simple to achieve.
There are modules on CPAN that would probably do a better job of validating integers and numbers, and you're welcome to search them out and drop them into place. My tests for those categories are quick and dirty, and should be effective in most cases.
For more detailed info on numeric validation, have a look at perlfaq4.
I didn't attempt to validate a filename or directory, as I'm not sure what you're after in that regard. The File::Util module could be useful in verifying that a filename is legal for a given operating system. As for paths, I'm not sure if you want to know if the characters are legal, or if the path exists. You'll have to work on that yourself.
None of this is the sort of thing that someone new to Perl should be taking on oneself when a deadline looms. There's a lot of learning involved in something that on the face looks quite simple.

Related

Unit and integration testing in perl using Test::More - how to capture exit call?

I have this script that converts a file from csv delimited to pipe delimited (csv2pipe.pl).
The pipe delimited file then gets loaded into database.
I am trying to run unit testing for it (csv2pipe_unit_test.pl with test call like(main()...) .
Function sub main() in csv2pipe.pl has exit call if there is no parameter supplied.
Unfortunately it is not possible to capture exit call in testing call so the testing program just exits.
I cannot do eval { code }. It does not catch exit call. It will catch die and any errors but not exit.
The same with package Capture::Tiny from CPAN. It does not catch exit call.
Any ideas how to test this main() sub and catch exit call?
File: csv2pipe.pl
#!/bin/perl
use strict;
use Getopt::Std;
#--------------------------
# all work done in main sub
#--------------------------
unless (caller) { main(); exit; }
sub main {
printf( "START---$0---%s---\n", scalar( localtime(time) ) );
if ( scalar(#ARGV) < 4 ) {
print "Usage: $0 -i file.csv -o file.pipe\n";
exit;
}
my %options = ();
getopts( "i:o:", \%options );
my $file_out = $options{o} or die "Missing -o param";
my $file_inp = $options{i} or die "Missing -i param";
if ( !-f $file_inp ) {
die "Could not find file to parse ($file_inp)\n";
}
my $row_ctr = 0;
open( FH_inp, "<$file_inp" ) or die "Could not open file: $file_inp";
open( FH_out, ">$file_out" ) or die "Could not open file: $file_out";
my ( $str, $ret );
foreach $str (<FH_inp>) {
$row_ctr++;
#print "str=$str";
$ret = csv2pipe($str);
#print "ret=$ret";
print FH_out $ret;
}
close(FH_inp);
close(FH_out);
print "Processed $row_ctr rows\n";
printf( "END---$0---%s---\n", scalar( localtime(time) ) );
printf( "Program Run Time: %s second(s).\n", ( time - $^T ) );
}
# convert csv to pipe
sub csv2pipe {
my $str = shift;
return undef if !defined $str;
#print ''.(caller(0))[3]."\n";
$str =~ s/,/|/g;
if ( $str =~ /"/ ) {
while ( $str =~ /".*?"/ ) {
my $beg = $`;
my $match = $&;
my $end = $';
$match =~ s/"//g;
$match =~ s/\|/,/g;
$str = $beg . $match . $end;
}
} elsif ( $str =~ /'/ ) {
while ( $str =~ /'.*?'/ ) {
my $beg = $`;
my $match = $&;
my $end = $';
$match =~ s/'//g;
$match =~ s/\|/,/g;
$str = $beg . $match . $end;
}
}
return $str;
}
File: csv2pipe_unit_test.pl
#!/bin/perl
use strict;
use warnings;
use Test::More;
# enter main test loop
run_tests();
exit;
sub run_tests {
printf( "START---$0---%s---\n", scalar( localtime(time) ) );
require_ok "csv2pipe.pl";
ok( csv2pipe('') eq '', 'test empty string' );
ok( csv2pipe('a') eq 'a', 'test one char' );
ok( csv2pipe('a,b') eq 'a|b', 'test two chars' );
ok( csv2pipe('a,b,abc,def,') eq 'a|b|abc|def|', 'test multiple strings' );
ok( csv2pipe("abc,def,abc'xyz") eq "abc|def|abc'xyz", 'test single quote in string' );
ok( csv2pipe('abc,def,abc"xyz') eq 'abc|def|abc"xyz', 'test double quote in string' );
ok( csv2pipe('abc,def,"abc,xyz"') eq 'abc|def|abc,xyz', 'test double quoted comma does not get converted' );
ok( csv2pipe("abc,def,'abc,xyz'") eq 'abc|def|abc,xyz', 'test single quoted comma does not get converted' );
# this call does not work!!!
# exit in csv2pipe.pl function main() will exit this testing script as well
like(
main(), # first param = argument to program
qr/Usage:/, # second param = expected result
'test run program with no parameter'
); # 3rd parameter = test description
print "Tests Executed: ";
done_testing();
printf( "END---$0---%s---\n", scalar( localtime(time) ) );
printf( "Program Run Time: %s second(s).\n", ( time - $^T ) );
}
Normal output of testing:
C:\workspace\csv2pipe>c:\perl\bin\perl csv2pipe_unit_testing.pl
START---csv2pipe_unit_testing.pl---Fri Sep 12 15:15:47 2014---
ok 1 - require 'csv2pipe.pl';
ok 2 - test empty string
ok 3 - test one char
ok 4 - test two chars
ok 5 - test multiple strings
ok 6 - test single quote in string
ok 7 - test double quote in string
ok 8 - test double quoted comma does not get converted
ok 9 - test single quoted comma does not get converted
Tests Executed: 1..9
END---csv2pipe_unit_testing.pl---Fri Sep 12 15:15:48 2014---
Program Run Time: 1 second(s).
Output of testing when it fails in sub main() exit:
C:\workspace\csv2pipe>c:\perl\bin\perl csv2pipe_unit_testing.pl
START---csv2pipe_unit_testing.pl---Fri Sep 12 15:20:49 2014---
ok 1 - require 'csv2pipe.pl';
ok 2 - test empty string
ok 3 - test one char
ok 4 - test two chars
ok 5 - test multiple strings
ok 6 - test single quote in string
ok 7 - test double quote in string
ok 8 - test double quoted comma does not get converted
ok 9 - test single quoted comma does not get converted
START---csv2pipe_unit_testing.pl---Fri Sep 12 15:20:49 2014---
Usage: csv2pipe_unit_testing.pl -i file.csv -o file.pipe
# Tests were run but no plan was declared and done_testing() was not seen.
If you can't turn your exit into a die, try Test::Trap. From the synopsis:
use Test::More;
use Test::Trap;
my #r = trap { some_code(#some_parameters) };
is ( $trap->exit, 1, 'Expecting &some_code to exit with 1' );
is ( $trap->stdout, '', 'Expecting no STDOUT' );
like ( $trap->stderr, qr/^Bad parameters; exiting\b/, 'Expecting warnings.' );

Why if/elsif in Perl execute only the first block?

I am new to Perl. I have an assignment to write a Perl program that accept a countable word from a command line and then generates its plural form. I have composed the following code below, and it shows no errors of compilation. When I execute it from the command line:
(perl plural.pl, for example), it prompts me to enter a noun, then whatever noun I feed as input, the plural form is the same. It doesn't execute the remaining if statements.
For example, if I enter the word "cat", the plural is generated as "cats". But when I enter the word 'church', for example, the plural is generated as 'churches', "fly" as "flys".
Here is the code:
#!/usr/bin/perl
$suffix1 = 's';
$suffix2 = 'es';
$suffix3 = 'ies';
print "Enter a countable noun to get plural: ";
$word = <STDIN>;
chomp($word);
if(substr $word, -1 == 'b' or 'd' or 'c' or 'g' or 'r' or 'j' or 'k' or 'l' or 'm' or 'n' or 'p' or 'q' or 'r' or 't' or 'v' or 'w' or 'e' or 'i' or 'o' or 'u') {
$temp = $word.$suffix1;
print "The plural form of the word \"$word\" is: $temp \n";
}
elsif (substr $word, -1 == 's' or 'sh' or 'ch' or 'x' or 'z') {
$temp = $word.$suffix2;
print "The plural form of the word \"$word\" is: $temp \n";
}
elsif (substr $word, -1 == 'y') {
chop($word);
$temp = $word.$suffix3;
print "The plural form of the word \"$word\" is: $temp \n";
}
Could you help me making the code execute the three statements.
First of all, always use use strict; use warnings;.
Strings are compared using eq, not ==.
substr $word, -1 eq 'b' means substr $word, (-1 eq 'b') when you meant substr($word, -1) eq 'b'. You'll face lots of problems if you omit parens around function calls.
substr($word, -1) eq 'b' or 'd' means the same as (substr($word, -1) eq 'b') or ('d'). 'd' is always true. You'd need to use substr($word, -1) eq 'b' or substr($word, -1) eq 'd'. (Preferably, you'd save substr $word, -1 in a variable to avoid doing it repeatedly.)
substr $word, -1 will never equal ch or sh.
The match operator makes this easy:
if ($word =~ /[bdcgrjklmnpqrtvweiou]\z/) {
...
}
elsif ($word =~ /(?:[sxz]|[sc]h)\z/) {
...
}
elsif ($word =~ /y\z/) {
...
}
In Perl, we use eq for string comparison instead of ==.
You can't use or like this. It should be like if (substr($word, -1) eq 'b' or substr ($word, -1) eq 'd'). Otherwise you could use an array containing all the string that you would like to compare and grep from that array.
Duskast is right. Perl uses symbols for numeric comparisons, and strings for string comparisons.
== eq
!= ne
< lt
<= le
> gt
>= ge
<=> cmp
Also, your use of or, though a good try, doesn't work. The keyword or has weak precedence, and so the expression
substr $word, -1 == 'b' or 'd' or 'c' or
'g' or 'r' or 'j' or
'k' or 'l' or 'm' or
'n' or 'p' or 'q' or
'r' or 't' or 'v' or
'w' or 'e' or 'i' or
'o' or 'u'
is interpreted as
substr ($word, (-1 == 'b')) or 'd' or 'c' or
'g' or 'r' or 'j' or
'k' or 'l' or 'm' or
'n' or 'p' or 'q' or
'r' or 't' or 'v' or
'w' or 'e' or 'i' or
'o' or 'u'
I'm not sure what the substr works out to, but if it's false, the expression continues to the or 'b', which is interpreted as true. Have you seen regular expressions yet? This is much more idiomatically done as
if ($word =~ /[bdcgrjklmnpqrtvweiou]$/) {...}
# Does $word match any of those characters followed by
# the end of the line or string?
Look in the Perl docs for string substitution and the s/.../.../ construct.
By the way, if you were paid to do this instead of being a student, you'd use the Lingua modules instead.
First of all, always, always include use strict; and use warnings;.
Second, use indentations. I've taught Perl courses at work and refuse to accept any assignment that was not indented correctly. In fact, I'm very, very strict about this because I want users to learn to code to the standard (4 space indent, etc.). It makes your program easier to read and to support.
While we're at it, break overly long lines -- especially on StackOverflow. It's hard to read a program when you have to scroll back and forth.
Quick look at your program:
In Perl, strings and numerics use two different sets of boolean operations. This is because strings can contain only digits, but still be strings. Imagine inventory item numbers like 1384 and 993. If I'm sorting these as strings, the 1384 item comes first. If I am sorting them numerically, 993 should come first. Your program has no way of knowing this except by the boolean operation you use:
Boolean Operation Numeric String
================= ======= ======
Equals == eq
Not Equals != ne
Greater Than > gt
Less Than < lt
Greater than/Equals >= ge
Less than/Equals <= le
THe other is that an or, and, || and && only work with two booleans. This won't work:
if ( $a > $b or $c ) {
What this is saying is this:
if ( ( $a > $b ) or $c ) {
So, if $c is a non-zero value, then $c will be true, and the whole statement would be true. You have to do your statement this way:
if ( $a > $b or $a > $c ) {
Another thing, use qq(..) and q() when quoting strings that contain quotation marks. This way, you don't have to put a backslash in front of them.
print "The word is \"swordfish\"\n";
print qq(The word is "swordfish"\n);
And, if you use use feature qw(say); at the top of your program, you get the bonus command of say which is like print, except the ending new line is assumed:
say qq(The word is "swordfish");
When you use substr, $foo, -1, you are only looking at the last character. It cannot ever be a two character string:
if ( substr $word, -1 eq "ch" ) {
will always be false.
Long ifs are hard to maintain. I would use a for loop (actually not, but let's pretend for now..):
#! /usr/bin/env perl
#
# Use these in ALL of your programs
#
use strict;
use warnings;
use feature qw(say);
#
# Use better, more descriptive names
#
my $standard_plural_suffix = 's';
my $uncommon_plural_suffix = 'es';
my $y_ending_plural_suffix = 'ies';
print "Enter a countable noun to get plural: ";
chomp (my $word = <STDIN>);
my $plural_form;
#
# Instead of a long, long "if", use a for loop for testing. Easier to maintain
#
for my $last_letter qw( b d c g r j k l m n p q r t v w e i o u) {
if ( substr($word, -1) eq $last_letter ) {
$plural_form = $word . $standard_plural_suffix;
last;
}
}
#
# Is it an "uncommon plural" test (single character)
#
if ( not $plural_form ) {
for my $last_letter qw(s x z) {
if ( substr($word, -1) eq $last_letter ) {
$plural_form = $word . $uncommon_plural_suffix;
last;
}
}
}
#
# Is it an "uncommon plural" test (double character)
#
if ( not $plural_form ) {
for my $last_two_letters qw(sh ch) {
if ( substr($word, -2) eq $last_two_letters ) {
$plural_form = $word . $uncommon_plural_suffix;
last;
}
}
}
if ( not $plural_form ) {
if ( substr($word, -1) eq 'y' ) {
chop ( my $chopped_word = $word );
$plural_form = $chopped_word . $y_ending_plural_suffix;
}
}
if ( $plural_form ) {
say qq(The plural of "$word" is "$plural_form");
}
else {
say qq(Could not find plural form of "$word");
}
Do you know about regular expressions? Those would work a lot better than using substr because you can test multiple things at once. Plus, I wouldn't use chop, but a regular expression substitution:
#! /usr/bin/env perl
#
# Use these in ALL of your programs
#
use strict;
use warnings;
use feature qw(say);
#
# Use better, more descriptive names
#
my $standard_plural_suffix = 's';
my $uncommon_plural_suffix = 'es';
my $y_ending_plural_suffix = 'ies';
print "Enter a countable noun to get plural: ";
chomp (my $word = <STDIN>);
my $plural_form;
#
# Standard plural (adding plain ol' 's'
#
if ( $word =~ /[bdcgrjklmnpqrtvweiou]$/ ) {
$plural_form = $word . $standard_plural_suffix;
}
#
# Uncommon plural (adding es)
#
elsif ( $word =~ /([sxz]|[sc]h)$/ ) {
$plural_form = $word . $uncommon_plural_suffix;
}
#
# Final 'y' rule: Replace y with ies
#
elsif ( $word =~ /y$/ ) {
$plural_form = $word;
$plural_form =~ s/y$/ies/;
}
if ( $plural_form ) {
say qq(The plural of "$word" is "$plural_form");
}
else {
say qq(Could not find plural form of "$word");
}
I have changed your code a bit. I'm using regular expression:
#!/usr/bin/perl
$suffix1 = 's';
$suffix2 = 'es';
$suffix3 = 'ies';
print "Enter a countable noun to get plural: ";
$word = <STDIN>;
chomp($word);
if ( $word =~ m/(s|sh|ch|x|z)$/) {
$temp = $word . $suffix2;
}
elsif ( substr( $word, -1 ) eq 'y' ) {
chop($word);
$temp = $word . $suffix3;
}
else {
$temp = $word . $suffix1;
}
print "The plural form of the word \"$word\" is: $temp \n";
Also I recommend you always use strict; and use warnings;

Trying to Develop PostFix Notation in Tree Using Perl

I'm using Perl to run through a tree, and then calculate the leaf nodes of the tree using the internal nodes as operators. I want to be able to print this in a postfix manner, and I managed to this this fairly easily with the basic operands (simply call the left and right nodes respectively before calling the parent) but I am having trouble producing the desired output for an average function. I don't have any trouble printing the actual result of the calculation, but I want to be able to print the operators and operands in postfix notation.
For example, 1 + average(3, 4, 5) will be shown as 1 ; 3 4 5 average +.
Here is my code:
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Indent = 0;
$Data::Dumper::Terse = 1;
my $debug = 0;
# an arithmetic expression tree is a reference to a list, which can
# be of two kinds as follows:
# [ 'leaf', value ]
# [ 'internal', operation, leftarg, rightarg ]
# Evaluate($ex) takes an arithmetic expression tree and returns its
# evaluated value.
sub Evaluate {
my ($ex) = #_;
$debug and
print "evaluating: ", Dumper($ex), "\n";
# the kind of node is given in the first element of the array
my $node_type = $ex->[0];
# if a leaf, then the value is a number
if ( $node_type eq 'leaf' ) {
# base case
my $value = $ex->[1];
$debug and
print "returning leaf: $value\n";
return $value;
}
# if not a leaf, then is internal,
if ( $node_type ne 'internal' ) {
die "Eval: Strange node type '$node_type' when evaluating tree";
}
# should now have an operation and two arguments
my $operation = $ex->[1];
my $left_ex = $ex->[2];
my $right_ex = $ex->[3];
# evaluate the left and right arguments;
my $left_value = Evaluate($left_ex);
my $right_value = Evaluate($right_ex);
# if any arguments are undefined, our value is undefined.
return undef unless
defined($left_value) and defined($right_value);
my $result;
# or do it explicitly for the required operators ...
if ($operation eq 'average') {
$result = ($left_value + $right_value) / 2;
}
if ($operation eq '+') {
$result = $left_value + $right_value;
} elsif ($operation eq '-') {
$result = $left_value - $right_value;
} elsif ($operation eq '*') {
$result = $left_value * $right_value;
} elsif ($operation eq 'div') {
if ($right_value != 0 ) {
$result = int ($left_value / $right_value);
} else {
$result = undef;
}
} elsif ($operation eq 'mod') {
$result = $left_value % $right_value;
} elsif ($operation eq '/') {
if ( $right_value != 0 ) {
$result = $left_value / $right_value;
}
else {
$result = undef;
}
}
$debug and
print "returning '$operation' on $left_value and $right_value result: $result\n";
return $result;
}
# Display($ex, $style) takes an arithmetic expression tree and a style
# parameter ('infix' or 'postfix') and returns a string that represents
# printable form of the expression in the given style.
sub Display {
my ($ex, $style) = #_;
# the kind of node is given in the first element of the array
my $node_type = $ex->[0];
# if a leaf, then the value is a number
if ( $node_type eq 'leaf' ) {
# base case
my $value = $ex->[1];
return $value;
}
# if not a leaf, then is internal,
if ( $node_type ne 'internal' ) {
die "Display: Strange node type '$node_type' when evaluating tree";
}
# should now have an operation and two arguments
my $operation = $ex->[1];
my $left_ex = $ex->[2];
my $right_ex = $ex->[3];
# evaluate the left and right arguments;
my $left_value = Display($left_ex, $style);
my $right_value = Display($right_ex, $style);
my $result;
if ($operation ne 'average') {
$result = "($left_value $operation $right_value) \n $left_value $right_value $operation";
} else {
$result = "($left_value $operation $right_value) \n $left_value $right_value $operation";
}
return $result;
}
# module end;
1;
And here is a test:
use strict;
use warnings;
use Display;
use arith;
my $ex1 = [ 'leaf', 42];
my $ex2 = [ 'internal', '+', [ 'leaf', 42], [ 'leaf', 10 ] ];
my $ex3 = [ 'internal', 'average', $ex2, [ 'leaf', 1 ] ];
print "ex1 is ", Evaluate($ex1), "\n";
print "ex1: ", Display($ex1), "\n";
print "\n";
print "ex2 is ", Evaluate($ex2), "\n";
print "ex2: ", Display($ex2), "\n";
print "\n";
print "ex3 is ", Evaluate($ex3), "\n";
print "ex3: ", Display($ex3), "\n";
print "\n";
Display::Render(\$ex3);
In order to do this, I realize I will have to change the subroutine "Display", but I'm not sure how to get the output --> value value ; #to indicate values that aren't averaged# value value average operand etc.
Any ideas?
I am not 100% sure that I understand your problem, but here is a cleanup / improvement of your two functions:
my %ops = ( # dispatch table for operations
average => sub {my $acc; $acc += $_ for #_; $acc / #_},
'+' => sub {$_[0] + $_[1]},
'-' => sub {$_[0] - $_[1]},
'*' => sub {$_[0] * $_[1]},
'mod' => sub {$_[0] % $_[1]},
(map {$_ => sub {$_[1] ? $_[0] / $_[1] : undef}} qw (/ div)),
);
sub Evaluate {
my $ex = shift;
print "evaluating: ", Dumper($ex), "\n" if $debug;
my $node_type = $ex->[0];
if ( $node_type eq 'leaf' ) {
print "returning leaf: $$ex[1]\n" if $debug;
return $$ex[1];
}
elsif ( $node_type ne 'internal' ) {
die "Eval: Strange node type '$node_type' when evaluating tree";
}
my $operation = $ex->[1];
my #values = map {Evaluate($_)} #$ex[2 .. $#$ex];
defined or return for #values;
if (my $op = $ops{$operation}) {
return $op->(#values);
} else {
print "operation $operation not found\n";
return undef;
}
}
Here the large if/elsif block is replaced with a dispatch table. This allows you to separate the logic from the parser. I have also replaced the $left_value and $right_value variables with the #values array, allowing your code to scale to n-arity operations (like average).
The following Display function has also been updated to handle n-arity operations:
my %is_infix = map {$_ => 1} qw( * + / - );
sub Display {
my ($ex, $style) = #_;
my $node_type = $ex->[0];
# if a leaf, then the value is a number
if ( $node_type eq 'leaf' ) {
return $$ex[1];
}
# if not a leaf, then is internal,
if ( $node_type ne 'internal' ) {
die "Display: Strange node type '$node_type' when evaluating tree";
}
# should now have an operation and n arguments
my $operation = $ex->[1];
if ($style and $style eq 'infix') {
my #values = map {Display($_, $style)} #$ex[2 .. $#$ex];
if ($is_infix{$operation}) {
return "$values[0] $operation $values[1]"
} else {
local $" = ', '; # "
return "$operation( #values )"
}
} else { # postfix by default
my #out;
for (#$ex[2 .. $#$ex]) {
if (#out and $_->[0] eq 'internal') {
push #out, ';'
}
push #out, Display($_, $style)
}
return join ' ' => #out, $operation;
}
}
You can call Display as Display($tree) or Display($tree, 'postfix') for postfix notation. And Display($tree, 'infix') for the infix notation.
ex1 is 42
ex1: 42
ex1: 42
ex2 is 52
ex2: 42 10 +
ex2: 42 + 10
ex3 is 26.5
ex3: 42 10 + 1 average
ex3: average( 42 + 10, 1 )
Which I believe is what you are looking for.
Finally, using your first example 1 + average(3, 4, 5):
my $avg = ['internal', 'average', [leaf => 3], [leaf => 4], [leaf => 5] ];
my $ex4 = ['internal', '+', [leaf => 1], $avg ];
print "ex4 is ", Evaluate($ex4), "\n";
print "ex4: ", Display($ex4), "\n";
print "ex4: ", Display($ex4, 'infix'), "\n";
print "\n";
which prints:
ex4 is 5
ex4: 1 ; 3 4 5 average +
ex4: 1 + average( 3, 4, 5 )
Maybe try AlgebraicToRPN?

Differences beteween switch- and if-statements

Do these two statements behave equally or could they yield different results?
if ( ... ) {...}
elsif( ... ) {... }
elsif( ... ) { ... }
else { ... }
.
given ( ... ) {
when ( ... ) { ... }
when ( ... ) { ... }
default { ... }
}
I've found the problem - with a modified ninth "when" it works now.
...
no warnings qw(numeric);
my $c = &getch();
given ( $c ) {
when ( $c == $KEY_LEFT and 1 > 0 ) { say 1; say $c }
when ( $c == $KEY_RIGHT ) { say 2; say $c }
when ( $c eq "\cH" or $c eq "\c?" ) { say 3; say $c }
when ( $c eq "\cC" ) { say 4; say $c }
when ( $c eq "\cX" or $c eq "\cD" ) { say 5; say $c }
when ( $c eq "\cA" ) { say 6; say $c }
when ( $c eq "\cE" ) { say 7; say $c }
when ( $c eq "\cL" ) { say 8; say $c }
when ( not( not $SpecialKey{$c} ) ) { say 9; say $c }
when ( ord( $c ) >= 32 ) { say 10; say $c }
default { say 11; say $c }
}
if ( $c == $KEY_LEFT and 1 > 0 ) { say 1; say $c }
elsif ( $c == $KEY_RIGHT ) { say 2; say $c }
elsif ( $c eq "\cH" or $c eq "\c?" ) { say 3; say $c }
elsif ( $c eq "\cC" ) { say 4; say $c }
elsif ( $c eq "\cX" or $c eq "\cD" ) { say 5; say $c }
elsif ( $c eq "\cA" ) { say 6; say $c }
elsif ( $c eq "\cE" ) { say 7; say $c }
elsif ( $c eq "\cL" ) { say 8; say $c }
elsif ( $SpecialKey{$c} ) { say 9; say $c }
elsif ( ord( $c ) >= 32 ) { say 10; say $c }
else { say 11; say $c }
close TTYIN;
Your supposedly "fixed" version now does different things in the two versions of the code. Checking if a key exists in a hash is completely different to checking whether the associated value is true.
There are three different truth values you can get from a hash - whether the key exists in the hash, whether the associated value is defined and whether the associated value is true or false. This code should demonstrate the difference between the three:
#!/usr/bin/perl
use strict;
use warnings;
my %hash = (
key1 => undef,
key2 => 0,
key3 => 1,
);
foreach (qw(key1 key2 key3 key4)) {
check_key($_);
}
sub check_key {
my $k = shift;
print "Key $k ";
if (exists $hash{$k}) {
print 'exists. ';
} else {
print "doesn't exist. ";
}
print 'Value ';
if (defined $hash{$k}) {
print 'is defined ';
} else {
print 'is not defined ';
}
print 'and is ';
if ($hash{$k}) {
print "true\n";
} else {
print "false\n";
}
}
Whatever you can do with the given/when, you can do with if/elsif/else. The idea is that when/given is suppose to be easier to read, and can automatically use smartmatching by default while you have to specify smart matching with the if/else statement.
I didn't parse through your code to make sure they're exact equivalents, but it looks like you've got more or less the right idea about if/elsif/else and given/when.
I never really understood the desire for what was referred to as the switch statement. It was something Perl coders always complained about -- the lack of a switch statement in Perl. Maybe it's a C thing that most Perl developers fondly remember. But I never found the if/elsif/else statements that bad.
What really confuses me is that when they finally implemented the switch statement, they didn't call it switch. Why the heck not?
And why say instead of printnl?
And, why last and next instead of break and continue. Why not simply use the standard names that other languages already use?
But enough of this Seinfeld style whining...
As davorg said, there's a big difference between a hash key that doesn't exist, a hash key that exists, but isn't defined, and a hash key that's defined, but gets evaluated to false:
For example:
use strict;
use warnings;
no warnings qw(uninitialized);
my %hash;
$hash{FOO} = "bar";
if (not exists($hash{BAR})) {
print "\$hash{FOO} doesn't exist\n";
}
if (not defined($hash{BAR})) {
print "\$hash{BAR} is undefined\n";
}
if (not $hash{BAR}) {
print "\$hash{BAR} evaluates to false\n";
}
if ($hash{BAR} eq undef) {
print "\$hash{BAR} is equal to 'undef'\n";
}
You can see that $hash{BAR} doesn't even exist as a key in the %hash hash, but it is also undefined, and it evaluates as false. And, you can also see that it also evaluates as undef (notice I had to set no warnings qw(uninitialized); to prevent it from complaining about $hash{BAR} being uninitialized in my last if statement).
However, if I do this:
$hash{FOO} = bar;
$hash{BAR} = undef;
The first if statement no longer evaluates as true because the key BAR now does exist in the hash, but the value is still undefined and still evaluates as false.
And, if I did this:
$hash{FOO} = bar;
$hash{BAR} = 0;
$hash{BAR} now exists as a key in %hash, and it is no longer undefined, but it still evaluates as false.
I like simply being able to say this:
if (not $hash{BAR}) {
because it is short and sweet, and probably does what I want. However, I do have to understand the difference between existence of a key in a hash, evaluation to false, and not being defined as three separate things. It can be important if you have a subroutine that could return a NULL string or zero value:
if (not foo($bar)) {
die qq(Error of some sort\n);
}
sub foo {
$bar = <FOO> or return;
return chomp($bar);
}
If there's a blank line in my file, it'll return a NULL string, but the subroutine will still return a defined value. The above probably does not do what I want.
Adding another answer as I've just realised what the real problem is.
Your "when ($SpecialKey{$c})" is equivalent to "if ($_ ~~ $SpecialKey{$c})". And as "given" has set $_ to $c that's the same as "if ($c ~~ $SpecialKey{$c})". When comparing two scalar values (and I assume that's what we've got here) the smart match operator looks for values that are numbers and uses "eq" or "==" as appropriate.
So your code is effectively equivalent to "if ($c == $SpecialKey{$c})". And that's completely different to "if ($SpecialKey{$c})".
They behave totally equal.
given / when has implicit smart matching:
Most of the power comes from implicit smart matching:
when($foo)
is exactly equivalent to
when($_ ~~ $foo)
I don't think if/else does that(?)
See perlsyn
EDIT:
Guess this doesn't really matter when using given/when the way OP is, but it still answers the question :)

Why doesn't Perl's for() go through all of the elements in my array?

Have a perl brain-teaser:
my #l = ('a', 'b', 'c');
for (#l) {
my $n = 1;
print shift #l while (#l and $n --> 0);
print "\n";
}
What's it print? Should be a, b, and c, right? But oh wait actually there's a bug somewhere, it only prints a and b. Probably just some stupid off-by-one, should be easy to solve, right?
Ok so make a small code change to test things out and change #l to
my #l = ('a', 'b', 'c', 'd');
What's it print? Probably a, b, and c because of that stupid off by one, right? ...Wait a second, actually it still prints only a and b. Okay, so the bug is that it only prints the first two characters.
Change #l again to
my #l = ('a', 'b', 'c', 'd', 'e');
Uhm, now it prints a, b, and c. But not d or e. In fact, every 2 letters we add from now on will make it print the next letter in the sequence. So if we add f it'll still just print a, b, and c, but if we add f and g it'll print a, b, c, and d.
This also happens with similar results for different values of $n.
So what's going on here?
Dave Webb beat me to the problem, but here's a quote from perldoc perlsyn saying not to do it:
If any part of LIST is an array, foreach will get very confused if you add or remove elements within the loop body, for example with splice. So don't do that.
Note that, earlier in the text, the syntax of foreach was described as foreach LIST, which is the LIST they refer to in the documentation. Note also that foreach and for are equivalent.
What's going on is that you're using for and shift at the same time. So you're looping through the list whilst modifying it, not a good idea.
I think this is somebody's gadget code. It doesn't look like the way you would want to write anything. But what it might illustrate best is that (in at least some versions) Perl is really running a more basic for loop, where:
for ( #l ) {
#...
}
Is replaced by:
for ( my $i = 0; $i < #l; $i++ ) {
local $_ = $l[$i];
#...
}
Thus, because #l is ( 'c' ) when we've gone through twice, our trips through is already greater than scalar( #l ), so we're out. I've tested it out in a number of cases, and they seem to be equivalent.
Below is the code I wrote to test cases. From it we can see that because of the shift, as soon as we're halfway through, the loop will exit.
use strict;
use warnings;
use English qw<$LIST_SEPARATOR>;
use Test::More 'no_plan';
sub test_loops_without_shifts {
my #l = #_;
my #tests;
for ( #l ) {
push #tests, $_;
}
my #l2 = #_;
my $n = #tests;
my $i = 0;
for ( $i = 0; $i < #l2; $i++ ) {
local $_ = $l2[$i];
my $x = shift #tests;
my $g = $_;
is( $g, $x, "expected: $x, got: $g" );
}
is( $n, $i );
is_deeply( \#l, \#l2, do { local $LIST_SEPARATOR = .', '; "leftover: ( #l ) = ( #l2 )" } );
return $i;
}
sub test_loops {
my #l = #_;
my #tests;
for ( #l ) {
push #tests, shift #l;
}
my #l2 = #_;
my $n = #tests;
my $i = 0;
for ( $i = 0; $i < #l2; $i++ ) {
local $_ = $l2[$i];
my $x = shift #tests;
my $g = shift #l2;
is( $g, $x, "expected: $x, got: $g" );
}
is( $n, $i );
is_deeply( \#l, \#l2, do { local $LIST_SEPARATOR = ', 'c; "leftover: ( #l ) = ( #l2 )" } );
return $i;
}
is( test_loops( 'a'..'c' ), 2 );
is( test_loops( 'a'..'d' ), 2 );
is( test_loops( 'a'..'e' ), 3 );
is( test_loops( 'a'..'f' ), 3 );
is( test_loops( 'a'..'g' ), 4 );
is( test_loops_without_shifts( 'a'..'c' ), 3 );
is( test_loops_without_shifts( 'a'..'d' ), 4 );
is( test_loops_without_shifts( 'a'..'e' ), 5 );
is( test_loops_without_shifts( 'a'..'f' ), 6 );
is( test_loops_without_shifts( 'a'..'g' ), 7 );