How do you override substitution operations? - perl

I'm playing around with Perl and creating a string object. I know that this is a very bad idea to do in the real world. I'm doing it purely for fun.
I'm using overload to overload standard Perl string operators with the standard operators you would find in most other languages.
use strict;
use warnings;
use feature qw(say);
my $obj_string1 = Object::String->new("foo");
my $obj_string2 = Object::String->new("bar");
my $reg_string1 = "foobar";
my $reg_string2 = "barfu";
# Object::String "stringifies" correctly inside quotes
say "$obj_string1 $obj_string2";
# Use "+" for concatenations
say $obj_string1 + $obj_string2; # Works
say $obj_string1 + $reg_string1 + $reg_string2; # Works
say $reg_string1 + $obj_string1; # Still works!
say $reg_string1 + $obj_string1 + $reg_string2; # Still works!
say $reg_string1 + $reg_string2 + $obj_string1; # Doesn't work, of course.
# Overload math booleans with their string boolean equivalents
my $forty = Object::String->new(40);
my $one_hundred = "100";
if ( $forty > $one_hundred ) { # Valid
say "$forty is bigger than $one_hundred (in strings!)";
}
if ( $one_hundred < $forty ) { # Also Valid
say "$one_hundred is less than $forty (In strings!)";
}
# Some standard "string" methods
say $forty->length; # Prints 2
say $forty->reverse; # Prints "04"
say $forty; # Prints "04"
Now comes the hard part:
my $string = Object::String->new("I am the best programmer around!");
say $string; # Prints "I am the best programmer around!"
say $string->get_value; # Prints "I am the best programmer around!" with the get_value method
# But, it's time to speak the truth...
$string =~ s/best programmer/biggest liar/;
say $string; # Prints "I am the biggest liar around!"
say $string->get_value; # Whoops, no get_value method on scalar strings
As you can see, when I do my substitution, it works correctly, but returns a regular scalar string instead of an Object::String.
I am trying to figure out how to override the substitution operation. I've looked in the Perldoc, and I've gone through various Perl books (Advanced Perl Programming, Intermediate Perl Programming, Perl Cookbook, etc.), but I haven't found a way to override the substitution operation so that it returns an Object::String.
How do I override the substitution operation?

Unfortunately, Perl's overload support isn't very universal in the area of strings. There are many operations that overloading isn't party to, and s/// is one of them.
I have started a module to fix this, overload::substr, but as yet it's incomplete. It allows you to overload the substr() function for your object, but so far it doesn't have the power to apply to m// or s///.
You might, however, be able to use lvalue (or 4-argument) substr() on your objects as a way to cheat this; if the objects at least stringify into regular strings that can be matched upon, the substitution can be done using the substr()-rewrite trick.
Turn
$string =~ s/pattern/replacement/;
into
$string =~ m/pattern/ and substr($string, $-[0], $+[0]-$-[0]) = "replacement";
and then you'll have some code which will respect a substr() overload on the $string object, if you use my module above.
At some point of course it would be nice if overload::substr can perform that itself; I just haven't got around to writing it yet.
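As a minimal sketch of the rewrite trick on a plain scalar (no overloading involved, and the sample sentence is borrowed from the question), the match offsets in @- and @+ can drive a 4-argument substr(). Whether the same call reaches an overloaded substr() on an object depends on overload::substr, which this sketch does not load.

```perl
use strict;
use warnings;

my $string = "I am the best programmer around!";

# Match first; @- and @+ then hold the start and end offsets of the match.
if ($string =~ m/best programmer/) {
    my ($start, $length) = ($-[0], $+[0] - $-[0]);

    # 4-argument substr replaces the matched span in place.
    substr($string, $start, $length, "biggest liar");
}

print "$string\n";   # I am the biggest liar around!
```

The 4-argument form and the lvalue form shown above are interchangeable here; the 4-argument form simply avoids assigning to a function call.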

Related

How to interpret replacement?

$test='abc="def"';
$replacement='$1="ghj"';
$test =~ s/(.+)="(.+)"/$replacement/;
print $test;
It prints:
$1="ghj"
How can I treat $replacement to be interpreted?
You add the /e modifier to your regex. You need to modify your replacement string too, so that it is evaluated correctly. Double evaluation is needed to interpolate the variable.
my $test='abc="def"';
my $replacement='"$1=ghj"';
$test =~ s/(.+)="(.+)"/$replacement/ee;
print $test;
Output:
abc=ghj
It should be noted that this is somewhat unsafe, especially if others can affect the value of your replacement. Then they can execute arbitrary code on your system.
There are approximately 3 answers to this question.
Your replacement "string" is actually code to be evaluated at match time to generate the replacement string. That is, it is better represented as a function:
my $test = 'abc="def"';
my $replacement = sub { $1 . '="ghj"' };
$test =~ s/(.+)="(.+)"/$replacement->()/e;
print $test;
If you don't need the full power of arbitrary Perl expressions (or if your replacement string comes from an external source), you can also treat it as a template to be filled in with the match results. There is a module that encapsulates this in the form of a JavaScript-like replace function, Data::Munge:
use Data::Munge qw(replace);
my $test = 'abc="def"';
my $replacement = '$1="ghj"';
$test = replace $test, qr/(.+)="(.+)"/, $replacement;
print $test;
Finally, you can represent Perl code as a string to be eval'd. This is not only inefficient but also fraught with quoting issues (you have to make sure everything in $replacement is syntactically valid Perl) and security holes (if $replacement is generated at runtime, especially if it comes from an external source). My least favorite approach:
my $test = 'abc="def"';
my $replacement = '$1 . "=\\"ghj\\""';
$test =~ s/(.+)="(.+)"/eval $replacement/e;
print $test;
(The s//eval $foo/e part can also be written as s//$foo/ee. I don't like to do that because eval is evil and shouldn't be more hidden than it already is.)
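If installing a module is not an option, the template idea can be sketched by hand without any string eval. The helper name expand_template is hypothetical and only understands $1, $2, ... placeholders; it is an illustration of the approach, not how Data::Munge is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper: fill $1, $2, ... in a template from a list of captures.
sub expand_template {
    my ($template, @captures) = @_;   # copy the captures before matching again
    (my $out = $template) =~ s{\$(\d+)}{ $captures[$1 - 1] // '' }ge;
    return $out;
}

my $test        = 'abc="def"';
my $replacement = '$1="ghj"';

# /e runs the helper once per match; no code from $replacement is executed.
$test =~ s/(.+)="(.+)"/expand_template($replacement, $1, $2)/e;
print "$test\n";   # abc="ghj"
```

Because the template is treated purely as data, a replacement string from an external source cannot run arbitrary Perl.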

Data types for parameters of subroutines / functions?

In Perl, can one specify data types for the parameters of subroutines? E.g. when using a dualvar in a numeric context like exit:
use constant NOTIFY_DIE_MAIL_SEND_FAILED => dualvar 3, 'NOTIFY_DIE_MAIL_SEND_FAILED';
exit NOTIFY_DIE_MAIL_SEND_FAILED;
How does Perl know, in that case, that exit expects a numeric parameter? I didn't see a way to define data types for the parameters of subroutines as you do in Java (where I can understand how the data type is known, since it is explicitly defined).
The whole point of the dualvar is that it behaves as a number or text depending on what you want. In cases where that's not obvious (to you more importantly than to perl) then make it clear.
exit 0 + NOTIFY_DIE_MAIL_SEND_FAILED;
As for explicitly typing parameters, that's not something built in. Perl is a much more dynamic language than Java so it's not common to check/force the type of every parameter or variable. In particular, a perl sub can accept different numbers of parameters and even different structures.
If you want to validate parameters (for an external API, for example), try something like Params::Validate.
In addition, Moose and Moo allow a certain level of attribute typing and even coercion.
In Perl, scalars are both numeric and stringy at the same time. It is not the variables themselves that distinguish between strings and numbers, but the operators you work with. While the addition + only uses a number, the concatenation . only uses strings.
In more strongly typed languages, e.g. Java, the addition operator doubles as the addition and concatenation operator, because it can access type information.
"1" + 2 + 3 is still confusing in Java (it yields "123"), whereas Perl can cleanly distinguish between "1" + 2 + 3 == 6 and "1" . 2 . 3 eq "123".
You can force numeric or stringy context of a variable by adding 0 or concatenating the empty string:
sub foo {
my ($var) = @_;
$var += 0; # $var is numeric
$var .= ""; # $var is stringy now
}
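To make the operator-driven behaviour concrete, here is a small sketch using the dualvar from the question. The variable names are made up for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

# A dualvar carries a separate numeric and string value.
my $code = dualvar 3, 'NOTIFY_DIE_MAIL_SEND_FAILED';

my $as_number = $code + 0;     # numeric operator: uses 3
my $as_string = $code . "";    # string operator: uses the name

print "$as_number\n";          # 3
print "$as_string\n";          # NOTIFY_DIE_MAIL_SEND_FAILED

# The operator, not the operand, selects the interpretation.
my $sum    = "1" + 2 + 3;      # 6
my $joined = "1" . 2 . 3;      # "123"
print "$sum $joined\n";        # 6 123
```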
Perl is quite different from Java in that Perl is a dynamically typed language: it does not require its variables to be typed at compile time.
Java, on the other hand, is statically typed (as you know already).
Perl determines how a variable is treated from the context in which it is used.
There are two main contexts:
List Context
Scalar Context
The context is defined by the operator or function that is used.
For example:
# Define a list
@arr = qw/rohit jain/;
# Define a scalar
$num = 2;
# Here perl will evaluate @arr in scalar context and take its length,
# so the code below will evaluate to: value = 2 / 2
$value = @arr / $num;
# Here, since it is used with a foreach loop, @arr will be taken in list context
foreach (@arr) {
say $_;
}
# The foreach loop above will output `rohit` \n `jain` to the console.
You can force the type by:
use feature qw(say);
use Scalar::Util qw(dualvar);
use constant NOTIFY_DIE_MAIL_SEND_FAILED => dualvar 3, 'NOTIFY_DIE_MAIL_SEND_FAILED';
say NOTIFY_DIE_MAIL_SEND_FAILED;
say int(NOTIFY_DIE_MAIL_SEND_FAILED);
output:
NOTIFY_DIE_MAIL_SEND_FAILED
3
How does Perl in that case know, that exit expects a numeric parameter?
exit expects a number as part of its specification, and its behaviour is more or less undefined if you pass it a non-integer value (i.e. you should not do it).
Now, in this particular case, how does dualvar manage to return either value type depending on the context?
I don't know how Scalar::Util's dualvar is implemented but you can write something similar with overload instead.
You certainly can modify the behaviour for a blessed object:
#!/usr/bin/env perl
use strict;
use warnings;
{package Dualvar;
use overload
fallback => 1,
'0+' => sub { $_[0]->{INT_VAL} },
'""' => sub { $_[0]->{STR_VAL} };
sub new {
my $class = shift;
my $self = { INT_VAL => shift, STR_VAL => shift };
bless($self,$class);
}
1;
}
my $x = Dualvar->new(31,'Thirty-One');
print $x . " + One = ",$x + 1,"\n"; # Thirty-One + One = 32
From the docs, it seems that overload actually changes the behaviour within the declaration scope so you should be able to change the behaviour of some common operators locally for any operand.
If exit does use one of those overloadable operations to evaluate its parameter into an integer, then this solution would do.
I didn't see a way to define data types for the parameters of subroutines like you do it in Java?
As already said by others... this is not the case in Perl, at least not at compilation time, except for subroutine prototypes but these don't offer much type granularity (like int vs strings or different object classes).
Richard has mentioned some run-time alternatives you may use. I personally would recommend Moose if you don't mind the performance penalty.
What Rohit Jain said is correct. A function that wants input to follow certain rules simply has to explicitly check that the input is valid.
For example
sub foo
{
my ($param1,$param2) = @_;
$param1 =~ /^\d+$/ or die "Parameter 1 must be a positive integer.";
$param2 =~ /^(bar|baz)$/ or die "Parameter 2 must be either 'bar' or 'baz'";
...
}
This may seem like a pain, but:
The extra flexibility gained generally outweighs the work involved in doing this.
Simply having the correct data type is often not enough to ensure that you have valid input, so you end up doing a lot of this anyway, even in a language like Java.

How can I get case-insensitive completion with Term::ReadLine::Gnu?

I can't seem to get case-insensitive completion when using Term::ReadLine::Gnu. Take this example script:
use strict;
use warnings;
use 5.010;
use Term::ReadLine;
my $term = Term::ReadLine->new('test');
say "Using " . $term->ReadLine;
if (my $attr = $term->Attribs) {
$term->ornaments(0);
$attr->{basic_word_break_characters} = ". \t\n";
$attr->{completer_word_break_characters} = " \t\n";
$attr->{completion_function} = \&complete_word;
} # end if attributes
my @words = qw(apple approve Adam America UPPER UPPERCASE UNUSED);
sub complete_word
{
my ($text, $line, $start) = @_;
return grep(/^$text/i, @words);
} # end complete_word
while (1) {
$_ = $term->readline(']');
last unless /\S/; # quit on empty input
} # end while 1
Note that complete_word does case-insensitive matching. If I run this with Term::ReadLine::Perl (by doing PERL_RL=Perl perl script.pl) it works as I expect. Typing a<TAB><TAB> lists all 4 words. Typing u<TAB><TAB> converts u to U and lists 3 words.
When I use Term::ReadLine::Gnu instead (PERL_RL=Gnu perl script.pl or just perl script.pl), it only does case-sensitive completion. Typing a<TAB> gives app. Typing u<TAB><TAB> doesn't list any completions.
I even have set completion-ignore-case on in my /etc/inputrc, but it still doesn't work here. (It works fine in bash, though.)
Is there any way to get Term::ReadLine::Gnu to do case-insensitive completion?
It would appear the problem is in the Term::ReadLine::Gnu::XS::_trp_completion_function() (a wrapper for the user-defined completion function).
Your matches are retrieved correctly from your complete_word() function, but then the following snippet from the wrapper does its own case-sensitive match:
for (; $_i <= $#_matches; $_i++) {
return $_matches[$_i] if ($_matches[$_i] =~ /^\Q$text/);
}
where @_matches is the result of your complete_word() and $text is the completed text so far.
So it looks like the answer is no, there is no supported way to get Term::ReadLine::Gnu to do case-insensitive completion. You would have to override the private Term::ReadLine::Gnu::XS::_trp_completion_function (an ugly hack to be sure) -- or modify XS.pm directly (arguably an even uglier hack).
EDIT: Term::ReadLine::Gnu version used: 1.20

How to validate number in perl?

I know that there is a library that does that:
use Scalar::Util qw(looks_like_number);
Yet I want to do it using a Perl regular expression, and I want it to work for floating-point (double) numbers, not only for integers.
So I want something better than this:
$var =~ /^[+-]?\d+$/
thanks.
Constructing a single regular expression to validate a number is really difficult. There simply are too many criteria to consider. Perlfaq4 contains a section "How do I determine whether a scalar is a number/whole/integer/float?"
The code from that documentation shows the following tests:
if (/\D/) {print "has nondigits\n" }
if (/^\d+$/) {print "is a whole number\n" }
if (/^-?\d+$/) {print "is an integer\n" }
if (/^[+-]?\d+$/) {print "is a +/- integer\n" }
if (/^-?\d+\.?\d*$/) {print "is a real number\n" }
if (/^-?(?:\d+(?:\.\d*)?|\.\d+)$/) {print "is a decimal number\n"}
if (/^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/) {
print "is a C float\n"
}
The first test disqualifies an unsigned integer.
The second test qualifies a whole number.
The third test qualifies an integer.
The fourth test qualifies a positive/negatively signed integer.
The fifth test qualifies a real number.
The sixth test qualifies a decimal number.
The seventh test qualifies a number in c-style scientific notation.
So if you were using those tests (excluding the first one) you would have to verify that one or more of the tests passes. Then you've got a number.
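As a rough sketch, the checks can be wrapped in one function. The name is_number is hypothetical, and the sketch relies on the observation that the "C float" pattern from the list above also appears to accept everything the integer and decimal tests accept, so it is used alone here (with non-capturing groups).

```perl
use strict;
use warnings;

# Hypothetical helper built on the perlfaq4 "C float" pattern.
sub is_number {
    my ($value) = @_;
    return 0 unless defined $value;
    return $value =~ /^[+-]?(?=\d|\.\d)\d*(?:\.\d*)?(?:[Ee][+-]?\d+)?$/ ? 1 : 0;
}

for my $candidate ('42', '-3.14', '.5', '1e5', 'abc', '1.2.3') {
    printf "%-6s => %s\n", $candidate,
        is_number($candidate) ? 'number' : 'not a number';
}
```

The first four candidates are reported as numbers and the last two are not.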
Another option, since you don't want to use the Scalar::Util module, is to learn from the code in Scalar::Util. The looks_like_number() function is set up like this:
sub looks_like_number {
local $_ = shift;
# checks from perlfaq4
return $] < 5.009002 unless defined;
return 1 if (/^[+-]?\d+$/); # is a +/- integer
return 1 if (/^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/); # a C float
return 1 if ($] >= 5.008 and /^(Inf(inity)?|NaN)$/i)
or ($] >= 5.006001 and /^Inf$/i);
0;
}
You should be able to use the portions of that function that are applicable to your situation.
I would like to point out, however, that Scalar::Util is a core Perl module; it ships with Perl, just like strict does. The best practice of all is probably to just use it.
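For comparison, a minimal usage sketch of the core function (the sample inputs are arbitrary):

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

my @candidates = ('42', '-3.14', '1e5', '.5', 'abc', '1,000');

for my $candidate (@candidates) {
    printf "%-6s => %s\n", $candidate,
        looks_like_number($candidate) ? 'number' : 'not a number';
}

# Collect only the values Perl itself would treat as numeric.
my @numbers = grep { looks_like_number($_) } @candidates;
print "@numbers\n";   # 42 -3.14 1e5 .5
```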
You should use Regexp::Common; most patterns are more complicated than you realize.
use Regexp::Common;
my $real = 3.14159;
print "Real" if $real =~ /$RE{num}{real}/;
However, the pattern is not anchored by default, so a stricter version is:
my $real_pat = $RE{num}{real};
my $real = 3.14159;
print "Real" if $real =~ /^$real_pat$/;
Well, first you should make sure that the number does not contain any commas, so do this:
$var =~ s/,//g; # remove all the commas
Then create another variable to do the rest of the comparison.
$var2=$var;
Then remove the . from the new variable, but only one occurrence.
$var2 =~ s/\.//; # remove the first . only
Now $var2 should look like an integer with no "."
So do this:
if($var2 !~ /^[+-]?\d+$/){
print "not valid";
}else{
# use $var
}
You can adapt this code and write it as a function if you need to use it more than once.
Cheers!

Is there an analogue of Ruby gsub method in Perl? [duplicate]

Possible Duplicate:
How do I perform a Perl substitution on a string while keeping the original?
How do I do one line replacements in Perl without modifying the string itself? I also want it to be usable inside expressions, much like I can do p s.gsub(/from/, 'to') in Ruby.
All I can think of is
do {my $r = $s; $r =~ s/from/to/; $r}
but surely there is a better way?
Starting on the day you feel comfortable writing use 5.14.0 at the top of all of your programs, you can use the s/foo/bar/r variant of the s/// operator, which returns the changed string instead of modifying the original in place (added in perl 5.13.2).
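A short sketch of the /r form, assuming a perl that is at least 5.14 (the variable names are illustrative):

```perl
use strict;
use warnings;
use 5.014;   # enables say and guarantees s///r is available

my $s = "from here";
my $t = $s =~ s/from/to/r;    # $t gets the changed copy
say $t;                        # to here
say $s;                        # from here (unchanged)

# /r also works inside expressions such as map.
my @new = map { s/foo/bar/r } qw(foot fool fooz);
say "@new";                    # bart barl barz
```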
The solution you found with do is not bad, but you can shorten it a little:
do {(my $r = $s) =~ s/from/to/; $r}
It still reveals the mechanics though. You can hide the implementation, and also apply substitutions to lists by writing a subroutine. In most implementations, this function is called apply which you could import from List::Gen or List::MoreUtils or a number of other modules. Or since it is so short, just write it yourself:
sub apply (&@) { # takes code block `&` and list `@`
my ($sub, @ret) = @_; # shallow copy of argument list
$sub->() for @ret; # apply code to each copy
wantarray ? @ret : pop @ret # list in list context, last elem in scalar
}
apply creates a shallow copy of the argument list, and then calls its code block, which is expected to modify $_. The block's return value is not used. apply behaves like the comma , operator. In list context, it returns the list. In scalar context, it returns the last item in the list.
To use it:
my $new = apply {s/foo/bar/} $old;
my @new = apply {s/foo/bar/} qw( foot fool fooz );
From Perl's docs: Regexp-like operators:
($foo = $bar) =~ s/this/that/g; # copy first, then change
would match gsub, while
$bar =~ s/this/that/g; # change
would match gsub!