How do I add new lines to a Perl script using PPI? - perl

What I'd ideally like to be able to do is scan a bunch of files that are implicitly importing a bunch of functions via Test::Most. I'd like to explicitly import the functions within the file. So basically I'll check the use statements to see if they already exist and, if they don't, I'd like to add an additional use statement for the function(s) in question. For example, I might add use Test::Differences qw( eq_or_diff ); if there's an eq_or_diff in the file, but no use Test::Differences. It'll get a little more complicated, but that's the basic idea.
As a proof of concept, I've tried to add just a single word into an existing script, but I can't figure it out. insert_after() returns true on success. I only ever get a false value, but I don't see any debugging info as to why the line could not be added.
use strict;
use warnings;
use PPI::Document ();
use PPI::Token::Word ();
use Test::More;
my $script = <<'EOF';
use strict;
use warnings;
use DateTime ();
use Git::Helpers qw( checkout_root );
use LWP::UserAgent ();
my $foo = 'bar';
EOF
my $doc = PPI::Document->new( \$script );
my $includes = $doc->find('PPI::Statement::Include');
my #use = grep { $_->type eq 'use' } #{$includes};
my $second_last = $use[-2];
diag 'Trying to insert after ' . $second_last->module;
my $word = PPI::Token::Word->new('use');
isa_ok( $word, 'PPI::Element', 'word is an Element' );
isa_ok( $second_last, 'PPI::Element', 'use is an Element' );
ok( $second_last->insert_after($word), 'word inserted' );
diag $doc->serialize;
done_testing();
The output of my script is as follows. You'll note that the document does not appear to have been altered:
# trying to insert after Git::Helpers
ok 1 - 'word is an Element' isa 'PPI::Element'
ok 2 - 'use is an Element' isa 'PPI::Element'
not ok 3 - word inserted
# failed test 'word inserted'
# at so.pl line 31.
# use strict;
# use warnings;
#
# use DateTime ();
# use Git::Helpers qw( checkout_root );
# use LWP::UserAgent ();
#
# my $foo = 'bar';
1..3
# looks like you failed 1 test of 3.

Looking at the source of PPI::Statement:
# As above, you can insert a statement, or a non-significant token
sub insert_after {
my $self = shift;
my $Element = _INSTANCE(shift, 'PPI::Element') or return undef;
if ( $Element->isa('PPI::Statement') ) {
return $self->__insert_after($Element);
} elsif ( $Element->isa('PPI::Token') and ! $Element->significant ) {
return $self->__insert_after($Element);
}
'';
}
A "non-significant token" is something like whitespace or a comment.
You are trying to insert a single, significant token at the top level (after a statement). That's not allowed.
You'll have to build a full PPI::Statement::Include element.
Here's some (rather ugly) proof-of-concept code:
# ...
diag 'Trying to insert after ' . $second_last->module;
{
my $insertion_point = $second_last;
for my $new_element (
do {
my $synthetic_use = PPI::Statement::Include->new;
for my $child (
PPI::Token::Word->new('use'),
PPI::Token::Whitespace->new(' '),
PPI::Token::Word->new('Test::Differences'),
PPI::Token::Whitespace->new(' '),
PPI::Token::Quote::Single->new("'eq_or_diff'"),
PPI::Token::Structure->new(';'),
) {
ok $synthetic_use->add_element($child);
}
$synthetic_use
},
PPI::Token::Whitespace->new("\n"),
) {
ok $insertion_point->insert_after($new_element);
}
}
diag $doc->serialize;
But it's much easier to let PPI parse a given fragment and just use those objects:
diag 'Trying to insert after ' . $second_last->module;
{
my $insertion_point = $second_last;
for my $new_element (
reverse PPI::Document->new(\ "\nuse Test::Differences qw( eq_or_diff );")->elements
) {
ok $insertion_point->insert_after($new_element->remove);
}
}
diag $doc->serialize;
Beware: Using $new_element->remove instead of just $new_element is crucial. You need to detach $new_element from its old containing document because otherwise the destruction of the temporary PPI::Document instance will wipe out all child elements, including those already added to $doc.

Related

List all the subroutine names in perl program

I am using more modules in my perl program.
example:
use File::copy;
so likewise File module contains Basename, Path, stat and etc..
i want to list all the subroutine(function) names which is in File Package module.
In python has dir(modulename)
It list all the function that used in that module....
example:
#!/usr/bin/python
# Import built-in module math
import math
content = dir(math)
print content
Like python tell any code for in perl
If you want to look at the contents of a namespace in perl, you can use %modulename::.
For main that's either %main:: or %::.
E.g.:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
sub fish {};
sub carrot {};
print "Stuff defined in Dumper:\n";
print Dumper \%Data::Dumper::;
print "Stuff defined:\n";
print Dumper \%::;
That covers a load of stuff though - including pragmas. But you can check for e.g. subroutines by simply testing it for being a code reference.
foreach my $thing ( keys %:: ) {
if ( defined &$thing ) {
print "sub $thing\n";
}
}
And with reference to the above sample, this prints:
sub Dumper
sub carrot
sub fish
So with reference to your original question:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use File::Copy;
print "File::Copy has subs of:\n";
foreach my $thing ( keys %File::Copy:: ) {
if ( defined &$thing ) {
print "sub $thing\n";
}
}
Unfortunately you can't do the same thing with the whole File:: namespace, because there's a whole bunch of different modules that could be installed/loaded, but might not be.
You'd have to use e.g. CPAN to check that -
perl -MCPAN -e shell
i /^File::/
Which will list you around 717 modules that are grouped into the File:: tree.
You could look this up on CPAN. Or if you're just after the core modules, then some variant of using Module::CoreList might do what you want.
Something like this:
#!/usr/bin/perl
use strict;
use warnings;
use Module::CoreList;
foreach my $module ( Module::CoreList->find_modules(qr/^File::/) ) {
if ( eval { require $module =~ s|::|/|gr . ".pm" } ) {
print "Module: $module contains\n";
my $key_str = "\%$module\:\:";
my %stuff = eval $key_str;
foreach my $thing ( sort keys %stuff ) {
my $full_sub_path = "$module::$thing";
if ( eval {"defined &$full_sub_path"} ) {
if ( defined &$thing ) {
print "$thing <- $full_sub_path imported by default\n";
}
else {
print "\t$full_sub_path might be loadable\n";
}
}
}
}
else {
print "Module: $module couldn't be loaded\n";
}
}
It's a bit messy because you have to eval various bits of it to test if a module is in fact present and loadable at runtime. Oddly enough, File::Spec::VMS wasn't present on my Win32 system. Can't think why.... :).
Should note - just because you could import a sub from a module (that isn't exported by default) doesn't make it a good idea. By convention, any sub prefixed with an _ is not supposed to be used externally, etc.
My Devel::Examine::Subs module can do this, plus much more. Note that whether it's a method or function is irrelevant, it'll catch both. It works purely on subroutines as found with PPI.
use warnings;
use strict;
use Devel::Examine::Subs;
my $des = Devel::Examine::Subs->new;
my $subs = $des->module(module => 'File::Copy');
for (#$subs){
print "$_\n";
}
Output:
_move
move
syscopy
carp
mv
_eq
_catname
cp
copy
croak
Or a file/full directory. For all Perl files in a directory (recursively), just pass the dir to file param without a file at the end of the path:
my $des = Devel::Examine::Subs->new(file => '/path/to/file.pm');
my $subs = $des->all;
If you just want to print it use the Data::Dumper module and the following method, CGI used as an example:
use strict;
use warnings;
use CGI;
use Data::Dumper;
my $object = CGI->new();
{
no strict 'refs';
print "Instance METHOD IS " . Dumper( \%{ref ($object)."::" }) ;
}
Also note, it's File::Copy, not File::copy.

Hook to provide a value for every Hash lookup in Perl

Is it possible to provide a hook in Perl to make sure no Hash key lookup fails ?
Example :
use strict;
use warnings;
my %hash_example = ( "a"=>"apple", "b"=>"ball" );
print $hash_example{"a"}; # Goes Fine.
print $hash_example{"c"}; # Throws Warning ( "Use of uninitialized value " ).
Codepad link
Whenever a hash lookup happens, some subroutine could get called which can provide a default value.
I mean, any hash lookup should call a sub ( say "get_hash_value (hash_ref, key) " ) and pass the hash and key to it. A sample of such a sub is shown below :
sub get_hash_value {
my $hash_ref = shift;
my $key = shift;
if ( exists $hash_ref->{$key} ) { # For Normal Lookup.
return $hash_ref->{$key};
}
else {
# This is the interesting place where we could provide our own values.
return "custom_value_based_on_certain_conditions"; # Some value
}
}
Another consequence would be the ability to alter the value returned against a key. We would be able to return a different value than what actually is stored against that key ( in that hash ).
There might not be a valid use case for this but am intrigued and would like to learn if such things are supported in Perl.
As said by Сухой27 in comment, this works fine:
my %hash_example = ( "a"=>"apple", "b"=>"ball" );
print $hash_example{"a"};
print $hash_example{"c"} // "custom_value_based_on_certain_conditions";
Doc on logical defined or
I would suggest that trying to alter how a hash lookup "works" is a really terrible idea, as a good way to create code that's hard to maintain.
However instead I would suggest you look at creating an object instead of a hash. They are basically the same thing, but an object includes code, and there is an expectation that the code within the object is 'doing it's own thing'.
So at a basic level:
#!/usr/bin/env perl
use strict;
use warnings;
package Hash_Ob;
sub new {
my ($class) = #_;
my $self = {};
bless( $self, $class );
return $self;
}
sub get_value {
my ( $self, $valuename ) = #_;
if ( $self->{$valuename} ) {
return $self->{$valuename};
}
else {
#generate your own value here!
$self->{$valuename} = 42;
return $self->{$valuename};
}
}
1;
Which you'd then 'call' using:
#!/usr/bin/env perl
use strict;
use warnings;
use Hash_Ob;
my $magic_hash = Hash_Ob -> new();
print $magic_hash -> get_value('new_value');
This avoids the problem of altering how a 'well known' mechanism actually works, and so future maintenance programmers will not curse your name.
Then maybe you want to use a tied hash. Tying is a mechanism to change the behavior of a builtin data type. See perltie for the gory details.
{
package HashWithDefault;
use Tie::StdHash;
our #ISA = qw(Tie::StdHash); # inherit STORE, FIRST, NEXT, etc.
sub TIEHASH {
my ($pkg,$default_val) = #_;
return bless { __default_val__ => $default_val}, $pkg;
}
sub FETCH {
my ($self,$key) = #_;
exists $self->{$key} ? $self->{$key} : $self->{__default_val__};
}
sub CLEAR { # don't clear the default val
my $self = shift;
%$self = ( __default_val__ => $self->{__default_val__} );
}
}
tie my %hash, 'HashWithDefault', "42";
%hash = (foo => 123, bar => 456);
print $hash{foo}; # 123
print $hash{quux}; # 42

Perl print line over Prompt

My script asks for download URLs and sends them to the download queue. The progress of the download should be printed back.
I don't find a way to keep the prompt on bottom and do the status over it.
I tried a search on CPAN, but I found no module for it.
#!/usr/bin/perl
use 5.14.0;
use strict;
use warnings;
use Term::UI;
use Term::ReadLine;
use threads;
use Thread::Queue;
sub rndStr{ join'', #_[ map{ rand #_ } 1 .. shift ] }
my $q = Thread::Queue->new(); # A new empty queue
my $thr = threads->create(
sub {
while (defined(my $item = $q->dequeue())) {
say "Downloading: ".$item;
sleep 1;
#$q->enqueue(1..10) if $item eq '10';
$q->enqueue(rndStr rand (15)+5, 'a'..'z', 0..9);
}
}
);
$q->enqueue(rndStr 10, 'a'..'z', 0..9);
my $url;
my $term = Term::ReadLine->new('brand');
while ($url ne 'end'){
$url = $term->get_reply(
prompt => 'URL to download',
default => 'end' );
$q->enqueue($url);
}
say "Finishing remaining downloads";
$q->enqueue(undef);
$thr->join();
The basic just of what you are trying to do is use ANSI codes to move the cursor around. Something such as ncurses (windows version) will allow you do this.
Alternatively you can do it yourself with raw ASCII/ANSI codes (as explained by these two links)
http://ascii-table.com/ansi-escape-sequences-vt-100.php
http://www.tldp.org/HOWTO/Bash-Prompt-HOWTO/x361.html
Or lastly you could use a Perl Module Win32::Console::ANSI which is designed to help you do this.
As this is a perl question I would suggest looking at Win32::Console::ANSI.
say adds a newline in the output; use print instead. Add a carriage return to write over previous output:
print "Downloading: ".$item."\r";

Can I define a subroutine that can be called like a built-in, everywhere?

Currently, during debugging, I tend to insert
carp Dumper \#foo, \%bar;
statements into my code, and regularly run into the problem that Carp and Data::Dumper are not imported in the current package. To avoid this problem, I defined the following sub in a central file that's always included:
sub main::bla {
use Carp; use Data::Dumper;
carp Dumper #_;
}
So now I can do
main::bla \#foo, \%bar;
everywhere, but the main:: annoys me. Is there something like a global package that's always imported, in any package, automatically (just like built-ins practically are imported everywhere)?
You could just call it
::bla( \#foo, \%bar );
In earlier days, I used to put util functions in a Ut package. And that was fine, but I noticed that my code wasn't really as modular as I thought of it. Each module that depended on Ut functions being there could only succeed if somebody took the trouble to build that namespace.
In the end, I considered use and require statements as simply documenting dependencies. There are more flexible ways to change what code is called by library modules, rather than changing their implementation in main.
For example, you could do this in a single use statement.
use Ut blah => sub {
use Carp;
use Data::Dumper;
carp Dumper #_;
};
And define the import:
sub import {
shift; # It's just me
my ( $name, $impl ) = #_;
if ( $name eq 'blah' && ref( $impl ) eq 'CODE' ) {
*blah = $_[1];
}
...
}
I still use the ut pattern, when I'm developing a lot of code at once. But I don't mind writing
ut:dump_var( $var )
as opposed to saving 3-4 characters (because sometimes I call it U::).
Now, it appears that you don't want to do this long term, and dumping out your variables is a useful thing for development. If you really want to do this, Smart::Comments does it like so:
### #foo
### %bar
All it takes is one use statement.
use Smart::Comments;
Maybe just better to make another Package with Export and needed things?
Like, MyTest.pm:
package MyTest;
use strict;
use Carp;
use Data::Dumper;
use base qw( Exporter );
our #EXPORT = qw(
debug
)
sub debug {
carp Dumper #_;
}
1;
So you can then just write in your script:
use MyTest;
debug {a => 'b', c => 'd' }
Fun fact: Some symbols are magic in that they always refer to their values in the main package. You can assign subroutines to these symbols that and they will be visible in any package.
{
package Foo;
# special names _ ARGV ARGVOUT ENV INC SIG STDERR STDIN STDOUT
sub ENV { print "In main::ENV ...\n" }
sub STDIN { print "In main::STDIN ...\n" }
sub _ { print "In main::_\n" }
# names that begin with ^ + upper case letter, or all digits
*{^T} = sub { scalar localtime };
*{^Gmtime} = sub { scalar gmtime };
*43 = sub { 42 };
use Data::Dumper;
*{^D} = \&Data::Dumper::Dumper;
}
{
package Bar;
&ENV;
STDIN();
print "The time is ", &^T, "\n";
print "In London it is ", &{^Gmtime}, "\n";
print "The answer is ", &43, "\n";
print "\#foo is ", &^D( \#foo );
}
None of this is recommended, unless you want to drive the next person who maintains your code insane.

How do you create a callback function (dispatch table) in Perl using hashes?

I want to call a main controller function that dispatches other function dynamically, something like this:
package Controller;
my %callback_funcs = ();
sub register_callback{
my ($class,$callback,$options) = _#;
#apppend to %callback_funcs hash ... ?
}
sub main{
%callback_funcs = ( add => 'add_func', rem => 'remove_func', edit => 'edit_func');
while(<STDIN>){
last if ($_ =~ /^\s*$/);
if($_ == 'add' || _$ == 'rem' || _$ == 'edit'){
$result = ${callback_funcs['add']['func']}(callback_funcs['add']['options']);
}
}
}
sub add_func{
...
}
One caveat is that the subs are defined in other Modules, so the callbacks would have to be able to reference them... plus
I'm having a hard time getting the hashes right!
So, it's possible to have a hash that contains anonymous subroutines that you can invoke from stdin.
my %callbacks = (
add => sub {
# do stuff
},
fuzzerbligh => sub {
# other stuff
},
);
And you can insert more hashvalues into the hash:
$callbacks{next} = sub {
...
};
And you would invoke one like this
$callbacks{next}->(#args);
Or
my $coderef = $callbacks{next};
$coderef->(#args);
You can get the hashkey from STDIN, or anywhere else.
You can also define them nonymously and then take a reference to them.
sub delete {
# regular sub definition
}
$callbacks{delete} = \&delete;
I wouldn't call these callbacks, however. Callbacks are subs that get called after another subroutine has returned.
Your code is also rife with syntax errors which may be obscuring the deeper issues here. It's also not clear to me what you're trying to do with the second level of arrays. When are you defining these subs, and who is using them when, and for what?
Perhaps this simplified example will help:
# Very important.
use strict;
use warnings;
# Define some functions.
sub multiply { $_[0] * $_[1] }
sub divide { $_[0] / $_[1] }
sub add { $_[0] + $_[1] }
sub subtract { $_[0] - $_[1] }
# Create a hash of references to those functions (dispatch table).
my %funcs = (
multiply => \&multiply,
divide => \&divide,
add => \&add,
subtract => \&subtract,
);
# Register some more functions.
sub register {
my ($key, $func) = #_;
$funcs{$key} = $func;
}
register('+', \&add); # As above.
register('sum', sub { # Or using an anonymous subroutine.
my $s = 0;
$s += $_ for #_;
return $s;
});
# Invoke them dynamically.
while (<>){
my ($op, #args) = split;
last unless $op and exists $funcs{$op}; # No need for equality tests.
print $funcs{$op}->(#args), "\n";
}
You've already got some good answers on how to build a dispatch table and call functions through it within a single file, but you also keep talking about wanting the functions to be defined in other modules. If that's the case, then wouldn't it be better to build the dispatch table dynamically based on what dispatchable functions each module says it has rather than having to worry about keeping it up to date manually? Of course it would!
Demonstrating this requires multiple files, of course, and I'm using Module::Pluggable from CPAN to find the modules which provide the function definitions.
dispatch_core.pl:
#!/usr/bin/env perl
use strict;
use warnings;
my %dispatch;
use lib '.'; # a demo is easier if I can put modules in the same directory
use Module::Pluggable require => 1, search_path => 'DTable';
for my $plugin (plugins) {
%dispatch = (%dispatch, $plugin->dispatchable);
}
for my $func (sort keys %dispatch) {
print "$func:\n";
$dispatch{$func}->(2, 5);
}
DTable/Add.pm:
package DTable::Add;
use strict;
use warnings;
sub dispatchable {
return (add => \&add);
}
sub add {
my ($num1, $num2) = #_;
print "$num1 + $num2 = ", $num1 + $num2, "\n";
}
1;
DTable/MultDiv.pm:
package DTable::MultDiv;
use strict;
use warnings;
sub dispatchable {
return (multiply => \&multiply, divide => \&divide);
}
sub multiply {
my ($num1, $num2) = #_;
print "$num1 * $num2 = ", $num1 * $num2, "\n";
}
sub divide {
my ($num1, $num2) = #_;
print "$num1 / $num2 = ", $num1 / $num2, "\n";
}
1;
Then, on the command line:
$ ./dispatch_core.pl
add:
2 + 5 = 7
divide:
2 / 5 = 0.4
multiply:
2 * 5 = 10
Adding new functions is now as simple as dropping a new file into the DTable directory with an appropriate dispatchable sub. No need to ever touch dispatch_core.pl just to add a new function again.
Edit: In response to the comment's question about whether this can be done without Module::Pluggable, here's a modified dispatch_core.pl which doesn't use any external modules other than the ones defining the dispatchable functions:
#!/usr/bin/env perl
use strict;
use warnings;
my %dispatch;
my #dtable = qw(
DTable::Add
DTable::MultDiv
);
use lib '.';
for my $plugin (#dtable) {
eval "use $plugin";
%dispatch = (%dispatch, $plugin->dispatchable);
}
for my $func (sort keys %dispatch) {
print "$func:\n";
$dispatch{$func}->(2, 5);
}