How to overload operator in non-class package? - perl

In my situation I don't need warnings Use of uninitialized value in string while comparing string equality. So I tought that instead silencing all such warnings in the scope with no warnings 'uninitialized' would be better to overload eq-operator with my own subroutine, like:
use overload 'eq' => \&undefined_equal;
sub undefined_equal {
my ( $left, $right ) = #_;
no warnings 'uninitialized';
if ( $left eq $right ) {
return 1;
}
return 0;
}
Of course, overloading does not work, because according to the docs, overload is meant to use with classes, but I have plain procedural packages.
So I did try with overloading built-in functions, like:
package REeq;
use strict; use warnings; use 5.014;
BEGIN {
use Exporter ();
#REeq::ISA = qw( Exporter );
#REeq::EXPORT = qw( eq );
}
sub eq {
my ( $left, $right ) = #_;
no warnings 'uninitialized';
if ( $left CORE::eq $right ) {
return 1;
}
return 0;
}
1;
I can call my eq but can't use it as operator.
I need it because I want instead
if ( defined $some_scalar && $some_scalar eq 'literal string' ){
....
}
to use just
if ( $some_scalar eq 'literal string' ){
....
}
How could I achieve my goal?

Changing the behaviour of eq is possible, but it requires writing an XS modules that creates an op checker that replaces the code perl executes for the eq ops in scope. This is the approach used by no autovivification;, for example.

Seems you can - I haven't tried this but the perl monks have
sure, if you see it that way... you just have to bless your variable,
just like you did with your Number-package.
use overload ...;
my $n = bless {number => 23}, "main";
print $n >> 2;
i think that's not what you want, just wanted to make clear that it's
not a problem of the package name but that you must have a blessed
object.
Edit: taking zdim's onboard...
use strict;
use warnings;
use overload 'eq' => \&undefined_equal;
sub undefined_equal {
my ( $left, $right ) = #_;
no warnings 'uninitialized';
if ( ${$left} eq $right ) {
return 1;
}
return 0;
}
my $a = "abcde";
my $n = bless \$a, "main";
print "a eq undef -->";
print $a eq undef;
print "<--\nn eq undef -->";
print $n eq undef;
print "<--\n";
which gives
$ perl overload.pl
Use of uninitialized value in string eq at overload.pl line 20.
a eq undef --><--
n eq undef -->0<--
Don't forget the double $$ in the sub or you disappear into recursion. And the scalar reference for bless as you can only bless references, it seems
It still has a bless but hey

Related

Perl, Split string into Key:Value pairs for hash with lowercase keys without temporary array

Given a string of Key:Value pairs, I want to create a lookup hash but with lowercase values for the keys. I can do so with this code
my $a="KEY1|Value1|kEy2|Value2|KeY3|Value3";
my #a = split '\|', $a;
my %b = map { $a[$_] = ( !($_ % 2) ? lc($a[$_]) : $a[$_]) } 0 .. $#a ;
The resulting Hash would look like this Dumper output:
$VAR1 = {
'key3' => 'Value3',
'key2' => 'Value2',
'key1' => 'Value1'
};
Would it be possible to directly create hash %b without using temporary array #a or is there a more efficient way to achieve the same result?
Edit: I forgot to mention that I cannot use external modules for this. It needs to be basic Perl.
You can use pairmap from List::Util to do this without an intermediate array at all.
use strict;
use warnings;
use List::Util 1.29 'pairmap';
my $str="KEY1|Value1|kEy2|Value2|KeY3|Value3";
my %hash = pairmap { lc($a) => $b } split /\|/, $str;
Note: you should never use $a or $b outside of sort (or List::Util pair function) blocks. They are special global variables for sort, and just declaring my $a in a scope can break all sorts (and List::Util pair functions) in that scope. An easy solution is to immediately replace them with $x and $y whenever you find yourself starting to use them as example variables.
Since the key-value pair has to be around the | you can use a regex
my $v = "KEY1|Value1|kEy2|Value2|KeY3|Value3";
my %h = split /\|/, $v =~ s/([^|]+) \| ([^|]+)/lc($1).q(|).$2/xger;
use strict;
use warnings;
use Data::Dumper;
my $i;
my %hash = map { $i++ % 2 ? $_ : lc } split(/\|/, 'KEY1|Value1|kEy2|Value2|KeY3|Value3');
print Dumper(\%hash);
Output:
$VAR1 = {
'key1' => 'Value1',
'key2' => 'Value2',
'key3' => 'Value3'
};
For fun, here are two additional approaches.
A cheaper one than the original (since the elements are aliased rather than copied into #_):
my %hash = sub { map { $_ % 2 ? $_[$_] : lc($_[$_]) } 0..$#_ }->( ... );
A more expensive one than the original:
my %hash = ...;
#hash{ map lc, keys(%hash) } = delete( #hash{ keys(%hash) } );
More possible solutions using regexes to do all the work, but not very pretty unless you really like regex:
use strict;
use warnings;
my $str="KEY1|Value1|kEy2|Value2|KeY3|Value3";
my %hash;
my $copy = $str;
$hash{lc $1} = $2 while $copy =~ s/^([^|]*)\|([^|]*)\|?//;
use strict;
use warnings;
my $str="KEY1|Value1|kEy2|Value2|KeY3|Value3";
my %hash;
$hash{lc $1} = $2 while $str =~ m/\G([^|]*)\|([^|]*)\|?/g;
use strict;
use warnings;
my $str="KEY1|Value1|kEy2|Value2|KeY3|Value3";
my %hash = map { my ($k, $v) = split /\|/, $_, 2; (lc($k) => $v) }
$str =~ m/([^|]*\|[^|]*)\|?/g;
Here's a solution that avoids mutating the input string, constructing a new string of the same length as the input string, or creating an intermediate array in memory.
The solution here changes the split into looping over a match statement.
#! /usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;
my $a="KEY1|Value1|kEy2|Value2|KeY3|Value3";
sub normalize_alist_opt {
my ($input) = #_;
my %c;
my $last_key;
while ($input =~ m/([^|]*(\||\z)?)/g) {
my $s = $1;
next unless $s ne '';
$s =~ s/\|\z//g;
if (defined $last_key) {
$c{ lc($last_key) } = $s;
$last_key = undef;
} else {
$last_key = $s;
}
}
return \%c;
}
print Dumper(normalize_alist_opt($a));
A potential solution that operates over the split directly. Perl might recognize and optimize the special case. Although based on discussions here and here, I'm not sure.
sub normalize_alist {
my ($input) = #_;
my %c;
my $last_key;
foreach my $s (split /\|/, $input) {
if (defined $last_key) {
$c{ lc($last_key) } = $s;
$last_key = undef;
} else {
$last_key = $s;
}
}
return \%c;
}

Making Perl's "specific declarations" as variable

I was looking for some sort of solution if the following setup can be interpreted as formal declaration of the variables , if possible.
What i have is :
my $str_1 = "{cow}" ;
my $str_2 = "{cow}{black}{tasty_milk}";
what i want is :
(Based on above variable string is it possible to initialize a hash directly,
something like :)
my %hash=();
$hash."*some operator* on $str_i" = 'Initialized' ;
This "some operator" should make $hash to recognize as hash as it was declared earlier. i.e Input specific hash initialization.
PS: I don't want to write a function that will work on the string and get all information to initialize the hash.
Say you had the following input instead:
my #path = qw( cow black tasty_milk );
Then you can use the following:
use Data::Diver qw( DiveVal );
DiveVal(\%hash, map \$_, #path) = 'value';
So, with Data::Diver, we get:
use Data::Diver qw( DiveVal );
$str =~ /^(?:\{\w+\})+\z/
or die("Unrecognized format");
my #path = $str =~ /(\w+)/g;
DiveVal(\%hash, map \$_, #path) = 'value';
Without a module:
sub dive_val :lvalue { my $p = \shift; $p = \( $$p->{$_} ) for #_; $$p }
$str =~ /^(?:\{\w+\})+\z/
or die("Unrecognized format");
my #path = $str =~ /(\w+)/g;
dive_val(\%hash, #path) = 'value';
Try the following with anonymous hash
use Data::Dumper;
use warnings;
use strict;
my $str_2 = "{cow}{black}{tasty_milk}";
my %hash;
my $ref =\%hash;
my $val;
my $lp = () = $str_2=~m/\}/g; #count the number of }
my $i = 1;
while($str_2=~m/\{(\w+)\}/g)
{
$val = $1;
last if ($lp == $i);
$ref->{$val} = {}; #make the anonymous hash
$ref = $ref->{$val}; #add the data into anonymous hash
$i++;
}
$ref->{$val} = "value"; #add the last value
print Dumper \%hash;

How to convert an array into a hash, with variable names mapped as keys in Perl?

I find myself doing this pattern a lot in perl
sub fun {
my $line = $_[0];
my ( $this, $that, $the_other_thing ) = split /\t/, $line;
return { 'this' => $this, 'that' => $that, 'the_other_thing' => $the_other_thing};
}
Obviously I can simplify this pattern by returning the output of a function which transforms a given array of variables into a map, where the keys are the same names as the variables eg
sub fun {
my $line = $_[0];
my ( $this, $that, $the_other_thing ) = split /\t/, $line;
return &to_hash( $this, $that, $the_other_thing );
}
It helps as the quantity of elements get larger. How do I do this? It looks like I could combine PadWalker & closures, but I would like a way to do this using only the core language.
EDIT: thb provided a clever solution to this problem, but I've not checked it because it bypasses a lot of the hard parts(tm). How would you do it if you wanted to rely on the core language's destructuring semantics and drive your reflection off the actual variables?
EDIT2: Here's the solution I hinted at using PadWalker & closures:
use PadWalker qw( var_name );
# Given two arrays, we build a hash by treating the first set as keys and
# the second as values
sub to_hash {
my $keys = $_[0];
my $vals = $_[1];
my %hash;
#hash{#$keys} = #$vals;
return \%hash;
}
# Given a list of variables, and a callback function, retrieves the
# symbols for the variables in the list. It calls the function with
# the generated syms, followed by the original variables, and returns
# that output.
# Input is: Function, var1, var2, var3, etc....
sub with_syms {
my $fun = shift #_;
my #syms = map substr( var_name(1, \$_), 1 ), #_;
$fun->(\#syms, \#_);
}
sub fun {
my $line = $_[0];
my ( $this, $that, $other) = split /\t/, $line;
return &with_syms(\&to_hash, $this, $that, $other);
}
You could use PadWalker to try to get the name of the variables, but that's really not something you should do. It's fragile and/or limiting.
Instead, you could use a hash slice:
sub fun {
my ($line) = #_;
my %hash;
#hash{qw( this that the_other_thing )} = split /\t/, $line;
return \%hash;
}
You can hide the slice in a function to_hash if that's what you desire.
sub to_hash {
my $var_names = shift;
return { map { $_ => shift } #$var_names };
}
sub fun_long {
my ($line) = #_;
my #fields = split /\t/, $line;
return to_hash [qw( this that the_other_thing )] #fields;
}
sub fun_short {
my ($line) = #_;
return to_hash [qw( this that the_other_thing )], split /\t/, $line;
}
But if you insist, here's the PadWalker version:
use Carp qw( croak );
use PadWalker qw( var_name );
sub to_hash {
my %hash;
for (0..$#_) {
my $var_name = var_name(1, \$_[$_])
or croak("Can't determine name of \$_[$_]");
$hash{ substr($var_name, 1) } = $_[$_];
}
return \%hash;
}
sub fun {
my ($line) = #_;
my ($this, $that, $the_other_thing) = split /\t/, $line;
return to_hash($this, $that, $the_other_thing);
}
This does it:
my #part_label = qw( part1 part2 part3 );
sub fun {
my $line = $_[0];
my #part = split /\t/, $line;
my $no_part = $#part_label <= $#part ? $#part_label : $#part;
return map { $part_label[$_] => $part[$_] } (0 .. $no_part);
}
Of course, your code must name the parts somewhere. The code above does it by qw(), but you can have your code autogenerate the names if you like.
[If you anticipate a very large list of *part_labels,* then you should probably avoid the *(0 .. $no_part)* idiom, but for lists of moderate size it works fine.]
Update in response to OP's comment below: You pose an interesting challenge. I like it. How close does the following get to what you want?
sub to_hash ($$) {
my #var_name = #{shift()};
my #value = #{shift()};
$#var_name == $#value or die "$0: wrong number of elements in to_hash()\n";
return map { $var_name[$_] => $value[$_] } (0 .. $#var_name);
}
sub fun {
my $line = $_[0];
return to_hash [qw( this that the_other_thing )], [split /\t/, $line];
}
If I understand you properly you want to build a hash by assigning a given sequence of keys to values split from a data record.
This code seems to do the trick. Please explain if I have misunderstood you.
use strict;
use warnings;
use Data::Dumper;
$Data::Dumper::Terse++;
my $line = "1111 2222 3333 4444 5555 6666 7777 8888 9999\n";
print Dumper to_hash($line, qw/ class division grade group kind level rank section tier /);
sub to_hash {
my #fields = split ' ', shift;
my %fields = map {$_ => shift #fields} #_;
return \%fields;
}
output
{
'division' => '2222',
'grade' => '3333',
'section' => '8888',
'tier' => '9999',
'group' => '4444',
'kind' => '5555',
'level' => '6666',
'class' => '1111',
'rank' => '7777'
}
For a more general solution which will build a hash from any two lists, I suggest the zip_by function from List::UtilsBy
use strict;
use warnings;
use List::UtilsBy qw/zip_by/;
use Data::Dumper;
$Data::Dumper::Terse++;
my $line = "1111 2222 3333 4444 5555 6666 7777 8888 9999\n";
my %fields = zip_by { $_[0] => $_[1] }
[qw/ class division grade group kind level rank section tier /],
[split ' ', $line];
print Dumper \%fields;
The output is identical to that of my initial solution.
See also the pairwise function from List::MoreUtils which takes a pair of arrays instead of a list of array references.
Aside from parsing the Perl code yourself, a to_hash function isn't feasible using just the core language. The function being called doesn't know whether those args are variables, return values from other functions, string literals, or what have you...much less what their names are. And it doesn't, and shouldn't, care.

Why is my for loop an illegal declaration

I created two subs one to do Fibonacci and the other to test even numbers. When I call it though it is saying my for loop in line 7 the sub Fibonacci is illegal why?
#!/usr/bin/perl
use strict;
use warnings;
my ($x,$y);
my $num = 0;
sub Fibs($start,$stop){
for ($start..$stop){
($x, $y) = ($y, $x+$y);
my $total += $y;
}
print "$total \n"
}
sub even($num){
if ($num % 2 == 0){
return $num;}
}
my $big_total = Fibs(even($num), 3999999)
Edited from suggestions below.
Clearly I am missing something. From feedback updated to new version.
#!/usr/bin/perl
use strict;
use warnings;
my ($x,$y);
my $num = 0;
sub Fibs{
my ($start, $stop) = #_ ;
for ($start..$stop){
my ($x, $y) = (0,2);
if ($x % 2 == 0){
($x, $y) = ($y, $x+$y);
my $total += $y;
}
}
my $big_total = Fibs(0, 3999999)
In addition to the missing opening braces, Perl doesn't support that kind of declaration for subroutine parameters.
Rather than
sub Fibs($start, $stop) {
...
}
you need to write something like:
sub Fibs {
my($start, $stop) = #_;
...
}
(Perl does have prototypes, but they're not really intended for declaring the types of parameters, and they don't provide names. See this article for a discussion.)
Other problems:
You should add
use strict;
use warnings;
You never use the $x and $y that you declare in the outer scope.
Your even function appears to be incomplete. It doesn't (explicitly) return a value if its argument is an odd number. What exactly is it intended to do?

perl: iterate over a typeglob

Given a typeglob, how can I find which types are actually defined?
In my application, we user PERL as a simple configuration format.
I'd like to require() the user config file, then be able to see which variables are defined, as well as what types they are.
Code: (questionable quality advisory)
#!/usr/bin/env perl
use strict;
use warnings;
my %before = %main::;
require "/path/to/my.config";
my %after = %main::;
foreach my $key (sort keys %after) {
next if exists $before{$symbol};
local *myglob = $after{$symbol};
#the SCALAR glob is always defined, so we check the value instead
if ( defined ${ *myglob{SCALAR} } ) {
my $val = ${ *myglob{SCALAR} };
print "\$$symbol = '".$val."'\n" ;
}
if ( defined *myglob{ARRAY} ) {
my #val = #{ *myglob{ARRAY} };
print "\#$symbol = ( '". join("', '", #val) . "' )\n" ;
}
if ( defined *myglob{HASH} ) {
my %val = %{ *myglob{HASH} };
print "\%$symbol = ( ";
while( my ($key, $val) = each %val ) {
print "$key=>'$val', ";
}
print ")\n" ;
}
}
my.config:
#A = ( a, b, c );
%B = ( b=>'bee' );
$C = 'see';
output:
#A = ( 'a', 'b', 'c' )
%B = ( b=>'bee', )
$C = 'see'
$_<my.config = 'my.config'
In the fully general case, you can't do what you want thanks to the following excerpt from perlref:
*foo{THING} returns undef if that particular THING hasn't been used yet, except in the case of scalars. *foo{SCALAR} returns a reference to an anonymous scalar if $foo hasn't been used yet. This might change in a future release.
But if you're willing to accept the restriction that any scalar must have a defined value to be detected, then you might use code such as
#! /usr/bin/perl
use strict;
use warnings;
open my $fh, "<", \$_; # get DynaLoader out of the way
my %before = %main::;
require "my.config";
my %after = %main::;
foreach my $name (sort keys %after) {
unless (exists $before{$name}) {
no strict 'refs';
my $glob = $after{$name};
print "\$$name\n" if defined ${ *{$glob}{SCALAR} };
print "\#$name\n" if defined *{$glob}{ARRAY};
print "%$name\n" if defined *{$glob}{HASH};
print "&$name\n" if defined *{$glob}{CODE};
print "$name (format)\n" if defined *{$glob}{FORMAT};
print "$name (filehandle)\n" if defined *{$glob}{IO};
}
}
will get you there.
With my.config of
$JACKPOT = 3_756_788;
$YOU_CANT_SEE_ME = undef;
#OPTIONS = qw/ apple cherries bar orange lemon /;
%CREDITS = (1 => 1, 5 => 6, 10 => 15);
sub is_jackpot {
local $" = ""; # " fix Stack Overflow highlighting
"#_[0,1,2]" eq "barbarbar";
}
open FH, "<", \$JACKPOT;
format WinMessage =
You win!
.
the output is
%CREDITS
FH (filehandle)
$JACKPOT
#OPTIONS
WinMessage (format)
&is_jackpot
Printing the names takes a little work, but we can use the Data::Dumper module to take part of the burden. The front matter is similar:
#! /usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
sub _dump {
my($ref) = #_;
local $Data::Dumper::Indent = 0;
local $Data::Dumper::Terse = 1;
scalar Dumper $ref;
}
open my $fh, "<", \$_; # get DynaLoader out of the way
my %before = %main::;
require "my.config";
my %after = %main::;
We need to dump the various slots slightly differently and in each case remove the trappings of references:
my %dump = (
SCALAR => sub {
my($ref,$name) = #_;
return unless defined $$ref;
"\$$name = " . substr _dump($ref), 1;
},
ARRAY => sub {
my($ref,$name) = #_;
return unless defined $ref;
for ("\#$name = " . _dump $ref) {
s/= \[/= (/;
s/\]$/)/;
return $_;
}
},
HASH => sub {
my($ref,$name) = #_;
return unless defined $ref;
for ("%$name = " . _dump $ref) {
s/= \{/= (/;
s/\}$/)/;
return $_;
}
},
);
Finally, we loop over the set-difference between %before and %after:
foreach my $name (sort keys %after) {
unless (exists $before{$name}) {
no strict 'refs';
my $glob = $after{$name};
foreach my $slot (keys %dump) {
my $var = $dump{$slot}(*{$glob}{$slot},$name);
print $var, "\n" if defined $var;
}
}
}
Using the my.config from your question, the output is
$ ./prog.pl
#A = ('a','b','c')
%B = ('b' => 'bee')
$C = 'see'
Working code using a CPAN module that gets some of the hair out of the way, Package::Stash. As noted in my comment to gbacon's answer, this is blind to the config file doing $someval = undef but that seems to be unavoidable, and at least the other cases are caught. It also limits itself to the SCALAR, ARRAY, HASH, CODE, and IO types -- getting GLOB and FORMAT is possible but it makes the code less pretty and also creates noise in the output :)
#!perl
use strict;
use warnings;
use Package::Stash;
sub all_vars_in {
my ($package) = #_;
my #ret;
my $stash = Package::Stash->new($package);
for my $sym ($stash->list_all_package_symbols) {
for my $sigil (qw($ # % &), '') {
my $fullsym = "$sigil$sym";
push #ret, $fullsym if $stash->has_package_symbol($fullsym);
}
}
#ret;
}
my %before;
$before{$_} ++ for all_vars_in('main');
require "my.config";
for my $var (all_vars_in('main')) {
print "$var\n" unless exists $before{$var};
}
Beginning in 5.010, you can distinguish whether a SCALAR exists using the B introspection module; see Detecting declared package variables in perl
Update: example copied from that answer:
# package main;
our $f;
sub f {}
sub g {}
use B;
use 5.010;
if ( ${ B::svref_2object(\*f)->SV } ) {
say "f: Thar be a scalar tharrr!";
}
if ( ${ B::svref_2object(\*g)->SV } ) {
say "g: Thar be a scalar tharrr!";
}
1;
UPDATE:
gbacon is right. *glob{SCALAR} is defined.
Here is the output I get using your code:
Name "main::glob" used only once:
possible typo at
test_glob_foo_thing.pl line 13.
'FOO1' (SCALAR)
'FOO1' (GLOB)
'FOO2' (SCALAR)
'FOO2' (GLOB)
'_<my.config' (SCALAR)
'_<my.config' (GLOB)
This is despite FOO2 being defined as a hash, but not as a scalar.
ORIGINAL ANSWER:
If I understand you correctly, you simply need to use the defined built-in.
#!/usr/bin/env perl
use strict;
use warnings;
my %before = %main::;
require "/path/to/my.config";
my %after = %main::;
foreach my $key (sort keys %after) {
if (not exists $before{$key}) {
if(defined($after{$key}){
my $val = $after{$key};
my $what = ref($val);
print "'$key' ($what)\n";
}
}
}
I hate to ask, but instead of messing around with typeglobs, why not switch to a real configuration format? e.g. check out Config::Simple and YAML.
I wouldn't recommend messing around with typeglobs and symbol tables in normal cases (some CPAN modules do that, but only at the bottom levels of large systems - e.g. Moose in the lowest levels of Class::MOP). Perl gives you a lot of rope to work with, but that rope is also quite happy to self-noosify and self-tie-around-your-neck if you're not careful :)
See also: How do you manage configuration files in Perl?
no strict 'refs';
my $func_name = 'myfunc';
*{$func_name}{CODE}()
use strict 'refs';
If you don't mind parsing Data::Dump output, you could use it to tease out the differences.
use strict;
use warnings;
use Data::Dump qw{ dump };
my %before = %main::;
require "my.config";
my %after = %main::;
foreach my $key ( sort keys %after ) {
if ( not exists $before{$key} ) {
my $glob = $after{$key};
print "'$key' " . dump( $glob) . "\n";
}
}
Using this code with the following config file:
$FOO1 = 3;
$FOO2 = 'my_scalar';
%FOO2 = ( a=>'b', c=>'d' );
#FOO3 = ( 1 .. 5);
$FOO4 = [ 1 .. 5 ];
I believe that this output provides enough information to be able to figure out which parts of each type glob are defined:
'FOO1' do {
my $a = *main::FOO1;
$a = \3;
$a;
}
'FOO2' do {
my $a = *main::FOO2;
$a = \"my_scalar";
$a = { a => "b", c => "d" };
$a;
}
'FOO3' do {
my $a = *main::FOO3;
$a = [1 .. 5];
$a;
}
'FOO4' do {
my $a = *main::FOO4;
$a = \[1 .. 5];
$a;
}
'_<my.config' do {
my $a = *main::_<my.config;
$a = \"my.config";
$a;
}