Perl: basic question, function functionality

What does this function do?
sub MyDigit {
return <<END;
0030\t0039
END
}

That's called a "here-document", and is used for breaking strings up over multiple lines as an alternative to concatenation or list operations:
print "this is ",
"one line when printed, ",
"because print takes multiple ",
"arguments and prints them all!\n";
print "however, you can also " .
"concatenate strings together " .
"and print them all as one string.\n";
print <<DOC;
But if you have a lot of text to print,
you can use a "here document" and create
a literal string that runs until the
delimiter that was declared with <<.
DOC
print "..and now we're back to regular code.\n";
You can read more about here-documents in the manual: see perldoc perlop.

You’ve all missed the point!
It’s defining a user-defined property for use in \p{MyDigit} and \P{MyDigit} using regular expressions.
It’s like these:
sub InKana {
return <<'END';
3040 309F
30A0 30FF
END
}
Alternatively, you could define it in terms of existing property names:
sub InKana {
return <<'END';
+utf8::InHiragana
+utf8::InKatakana
END
}
You can also do set subtraction using a "-" prefix. Suppose you only
wanted the actual characters, not just the block ranges of characters.
You could weed out all the undefined ones like this:
sub IsKana {
return <<'END';
+utf8::InHiragana
+utf8::InKatakana
-utf8::IsCn
END
}
You can also start with a complemented character set using the "!" prefix:
sub IsNotKana {
return <<'END';
!utf8::InHiragana
-utf8::InKatakana
+utf8::IsCn
END
}
I figure I must be right, since I’m speaking ex camelis. :)
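A minimal sketch of how such a property sub is used in a match. perlunicode documents that user-defined property subs have names beginning with "In" or "Is", so the sketch uses a hypothetical IsMyDigit rather than the name from the question. The sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical property: the code points U+0030..U+0039 (ASCII digits).
# Each line of the returned string is a range: start, tab, end.
sub IsMyDigit {
    return "0030\t0039\n";
}

for my $s ('42', '4a') {
    my $verdict = $s =~ /^\p{IsMyDigit}+$/ ? 'all digits' : 'not all digits';
    print "$s: $verdict\n";
}
```

Perl calls the sub the first time the property is needed and uses the returned ranges as the character class.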

It uses something called a here-document to return the string "0030\t0039\n" (the newline that ends the line is part of the string).

It returns the string "0030\t0039\n" (\t being a tab and \n a newline that is being added because the line ends in a newline (obviously)).
<<FOO
sometext
FOO
Is a so-called heredoc, a way to conveniently write multi-line strings (though here it is used with only one line).

You can help yourself by trying a simple experiment:
C:\Temp> cat t.pl
#!/usr/bin/perl
use strict; use warnings;
print MyDigit();
sub MyDigit {
return <<END;
0030\t0039
END
}
Output:
C:\Temp> t | xxd
0000000: 2020 2020 3030 3330 0930 3033 390d 0a 0030.0039..
Now, in your case, the END is not lined up at the beginning of the line, so you should have gotten the message:
Can't find string terminator "END" anywhere before EOF at …

Related

print lines after finding a key word in perl

I have a variable $string and I want to print all the lines after I find a keyword in a line (including the line with the keyword).
$string=~ /apple /;
I'm using this regexp to find the keyword, but I do not know how to print the lines after it.
It's not really clear where your data is coming from. Let's assume it's a string containing newlines. Let's start by splitting it into an array.
my @string = split /\n/, $string;
We can then use the flip-flop operator to decide which lines to print. I'm using \0 as a regex that is very unlikely to match any string (so, effectively, it's always false).
for (@string) {
say if /apple / .. /\0/; # say needs "use feature 'say';" (Perl 5.10 or later)
}
Just keep a flag variable, set it to true when you see the string, print if the flag is true.
perl -ne 'print if $seen ||= /apple/'
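The same flag idea can be sketched as a small script for data held in a string. The helper name lines_from_keyword and the sample text are assumptions for illustration, not part of the original answer.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every line from the first one matching
# $pattern through the end of $text.
sub lines_from_keyword {
    my ($text, $pattern) = @_;
    my $seen = 0;
    my @kept;
    for my $line (split /\n/, $text) {
        $seen ||= $line =~ $pattern;   # once true, the flag stays true
        push @kept, $line if $seen;
    }
    return @kept;
}

my $string = "pear\nred apple pie\nplum\n";
print "$_\n" for lines_from_keyword($string, qr/apple /);
# prints:
# red apple pie
# plum
```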
If your data is in a scalar variable, there are several methods you can use.
Recommended method
($matching) = $string=~ /([^\n]*apple.+)/s;
print "$matching\n";
And there is another way to do it
$string=~ /[^\n]*apple.+/s;
print $&; # prints the text that matched
If you are reading the data from a file, try the following:
while (<$fh>)
{
if (/apple/)
{
print $_, <$fh>; # the matching line, then the rest of the file
}
}
Or else try the following one-liner:
perl -ne 'print $_, <> and exit if /apple/' file.txt

Date validator using subroutine in perl

I have dates say $date1, $date2, $date3.
I want to create an array of these dates, pass each one to a subroutine, and get back the status of each date. A regular expression inside the subroutine will evaluate the date format.
I have created the subroutine DateValidator as follows:
my @newDateArray = qw /$date1, $date2, $date3/;
foreach (@newDateArray) {
print "Date used $_ : ";
DateValidator($_);
}
# Subroutine to evaluate dates
sub DateValidator {
my $dateVal=shift;
if ($dateVal =~ /^20?\d{2}\-0?(:?[1-9]|10|11|12)\-(\d{2})$/) {
if ($2 <= 31) {
print "All DD's are correct\n";
} else {
print "Please verify the DD again !\n";
}
} else {
print "Please enter correct date !\n";
}
}
This does not work as expected. Any help would be appreciated.
The qw() function does not interpolate variables. So this code:
my @newDateArray = qw /$date1, $date2, $date3/;
Needs to be:
my @newDateArray = ($date1, $date2, $date3);
I.e. replace qw() with a simple pair of parentheses.
This is not explicitly mentioned in the documentation, but there is a rather subtle mention:
Evaluates to a list of the words extracted out of STRING, using embedded whitespace as the word delimiters. It can be understood as being roughly equivalent to:
split(" ", q/STRING/);
Where the observant people will notice that a single quoted STRING -- using q() -- will not interpolate variables. This could have been written quite a few hundred times clearer, I agree.
Also, you might notice that the documentation says:
A common mistake is to try to separate the words with comma or to put comments into a multi-line qw-string. For this reason, the use warnings pragma and the -w switch (that is, the $^W variable) produces warnings if the STRING contains the "," or the "#" character.
Which you have not noticed, which makes me suspect that you are not using warnings. This is a very, very bad idea. See Why use strict and warnings? for more information.
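A short sketch of the difference, assuming two made-up sample dates. The qw list holds the literal words, while the parenthesised list holds the variables' values.

```perl
use strict;
use warnings;

my ($date1, $date2) = ('2015-01-31', '2015-02-30');   # sample values, assumed

my @literal = qw($date1 $date2);   # no interpolation: the words '$date1' and '$date2'
my @values  = ($date1, $date2);    # the contents of the variables

print "qw:   @literal\n";   # qw:   $date1 $date2
print "list: @values\n";    # list: 2015-01-31 2015-02-30
```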

how can i fetch the whole word on the basis of index no of that string in perl

I have a line like this:
comments:[I#1278327] is related to office communicator.i fixed the bug to declare it null at first time.
Here I am searching for the index of I#, and then I want the whole word, that is, [I#1278327]. I'm doing it like this:
open(READ1,"<letter.txt");
while(<READ1>)
{
if(index($_,"I#")!=-1)
{
$indexof=index($_,"I#");
print $indexof,"\n";
$string=substr($_,$indexof);##i m cutting that string first from index of I# to end then...
$string=substr($string,0,index($string," "));
$lengthof=length($string);
print $lengthof,"\n";
print $string,"\n";
print $_,"\n";
}
}
Is there any API in Perl to find the word length directly after finding the index of I# in that line?
You could do something like:
$indexof=index($_,"I#");
$index2 = index($_,' ',$indexof);
$lengthof = $index2 - $indexof;
However, the bigger issue is you are using Perl as if it were BASIC. A more perlish approach to the task of printing selected lines:
use strict;
use warnings;
open my $read, '<', 'letter.txt' or die "Cannot open letter.txt: $!"; # safer version of open
LINE:
while (<$read>) {
print "$1 - $_" if (/(I#.*?) /);
}
I would use a regex instead, a regex will allow you to match a pattern ("I#") and also capture other data from the string:
$_ =~ m/I#(\d+)/;
The line above will match and set $1 to the number.
See perldoc perlre
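As a rough sketch of the regex approach applied to the sample line: capturing the whole bracketed token gives the word, its start offset (via the built-in @- array), and its length in one step. The \d+ assumes the ID is always numeric, which the question does not state.

```perl
use strict;
use warnings;

my $line = 'comments:[I#1278327] is related to office communicator.';

# Capture the whole bracketed token; \d+ assumes an all-digit ID.
if ($line =~ /(\[I#\d+\])/) {
    my $word = $1;
    # $-[1] is the offset at which capture group 1 starts.
    printf "%s starts at index %d and is %d characters long\n",
        $word, $-[1], length $word;
}
# prints: [I#1278327] starts at index 9 and is 11 characters long
```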

Why do printf and sprintf behave differently when only given an array?

sub do_printf { printf @_ }
sub do_sprintf { print sprintf @_ }
do_printf("%s\n", "ok"); # prints ok
do_sprintf("%s\n", "ok"); # prints 2
sprintf has the prototype $@, while printf has a prototype of @.
From the perldoc on sprintf:
Unlike printf, sprintf does not do
what you probably mean when you pass
it an array as your first argument.
The array is given scalar context, and
instead of using the 0th element of
the array as the format, Perl will use
the count of elements in the array as
the format, which is almost never
useful.
See codeholic's and Mark's answers for the explanation as to why they behave differently.
As a workaround, simply do:
sub do_sprintf { print sprintf(shift, @_) }
Then,
sub do_printf { printf @_ }
sub do_sprintf { print sprintf(shift, @_) }
do_printf("%s\n", "ok"); # prints ok
do_sprintf("%s\n", "ok2"); # prints ok2
They do different things. For printf the output is to a stream; for sprintf you want the string constructed. It handles the formatting (the f) of the print command. The main purpose for printf is to print out the value it constructs to a stream but with s(tring)printf(ormat) you're only trying to create the string, not print it.
In C, printf returns the number of characters written; Perl's printf simply returns true on success. Either way, once characters are printed to a stream they've passed out of the logical structure of the program. Meanwhile, sprintf needs to hand you back a string. The most convenient way is as a return value--which because it is within the program structure can be inspected for length, or whether it contains any 'e's, or whatever you want.
Why shouldn't they behave differently?
sprintf evaluates the array in scalar context. Your array has two elements, so it evaluates as "2" (without a trailing \n).
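The scalar-context effect can be seen without sprintf at all. The sketch below uses an assumed two-element array, then shows one way to split off the format so that sprintf receives it as its first argument.

```perl
use strict;
use warnings;

my @args = ("%s\n", "ok");

my $count = @args;              # scalar context yields the number of elements
print "count: $count\n";        # count: 2

# Separating the format from the remaining arguments restores the intent.
my ($format, @rest) = @args;
print sprintf($format, @rest);  # ok
```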

What's an easy way to print a multi-line string without variable substitution in Perl?

I have a Perl program that reads in a bunch of data, munges it, and then outputs several different file formats. I'd like to make Perl be one of those formats (in the form of a .pm package) and allow people to use the munged data within their own Perl scripts.
Printing out the data is easy using Data::Dump::pp.
I'd also like to print some helper functions to the resulting package.
What's an easy way to print a multi-line string without variable substitution?
I'd like to be able to do:
print <<EOL;
sub xyz {
my $var = shift;
}
EOL
But then I'd have to escape all of the $'s.
Is there a simple way to do this? Perhaps I can create an actual sub and have some magic pretty-printer print the contents? The printed code doesn't have to match the input or even be legible.
Enclose the name of the delimiter in single quotes and interpolation will not occur.
print <<'EOL';
sub xyz {
my $var = shift;
}
EOL
You could use a templating package like Template::Toolkit or Text::Template.
Or, you could roll your own primitive templating system that looks something like this:
my %vars = qw( foo 1 bar 2 );
Write_Code(\%vars);
sub Write_Code {
my $vars = shift;
my $code = <<'END';
sub baz {
my $foo = <%foo%>;
my $bar = <%bar%>;
return $foo + $bar;
}
END
while ( my ($key, $value) = each %$vars ) {
$code =~ s/<%$key%>/$value/g;
}
return $code;
}
This looks nice and simple, but there are various traps and tricks waiting for you if you DIY. Did you notice that I failed to use quotemeta on my key names in the substitution?
I recommend that you use a time-tested templating library, like the ones I mentioned above.
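For reference, a sketch of the substitution with the key quoted, so that regex metacharacters in a key match literally. The helper name fill_template and the key 'a.b' are made up for illustration; \Q...\E inside the pattern has the same effect as calling quotemeta on the key.

```perl
use strict;
use warnings;

# Hypothetical helper: fills <%name%> placeholders in $template.
# \Q...\E quotes the key so characters such as '.' are matched literally.
sub fill_template {
    my ($template, $vars) = @_;
    for my $key (keys %$vars) {
        $template =~ s/<%\Q$key\E%>/$vars->{$key}/g;
    }
    return $template;
}

print fill_template('my $n = <%a.b%>;' . "\n", { 'a.b' => 42 });
# prints: my $n = 42;
```

Without the quoting, the key 'a.b' would also match a placeholder such as <%axb%>, because the dot would act as a wildcard.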
You can actually continue a string literal on the next line, like this:
my $mail = "Hello!
Blah blah.";
Personally, I find that more readable than heredocs (the <<EOL thing mentioned elsewhere).
Double quote " interpolates variables, but you can use '. Note you'll need to escape any ' in your string for this to work.
Perl is actually quite rich in convenient things to make things more readable, e.g. other quote-operations. qq and q correspond to " and ' and you can use whatever delimiter makes sense:
my $greeting = qq/Hello there $name!
Nice to meet you/; # Interpolation
my $url = q|http://perlmonks.org/|; # No need to escape /
(note how the syntax coloring here didn't quite keep up)
Read perldoc perlop (find in page: "Quote and Quote-like Operators") for more information.
Use a data section to store the Perl code:
#!/usr/bin/perl
use strict;
use warnings;
print <DATA>;
#print munged data
__DATA__
package MungedData;
use strict;
use warnings;
sub foo {
print "foo\n";
}
Try writing your code as an actual perl subroutine, then using B::Deparse to get the source code at runtime.
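A minimal sketch of that idea, using the core B::Deparse module's coderef2text method. The sub xyz is a stand-in for whichever helper you want to emit. The regenerated source is equivalent to the original, but its formatting will generally differ and it may include pragma lines.

```perl
use strict;
use warnings;
use B::Deparse;

# Stand-in helper that should end up in the generated package.
sub xyz {
    my $var = shift;
    return $var * 2;
}

my $deparse = B::Deparse->new;
my $body    = $deparse->coderef2text(\&xyz);   # the sub's block, braces included

print "sub xyz $body\n";
```

Since the question says the printed code need not match the input or even be legible, the reformatting done by the deparser should be acceptable here.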