Sorting module subroutines alphabetically - perl

I would like to sort my module subroutines alphabetically (I have a lot of subroutines, and I think it will be easier to edit the file if the subroutines are ordered in the file). For example given A.pm:
package A;
use warnings;
use strict;
sub subA {
print "A\n";
}
sub subC {
print "C\n";
}
sub subB {
print "B\n";
}
1;
I would like to run a sortSub A.pm that gives:
package A;
use warnings;
use strict;
sub subA {
print "A\n";
}
sub subB {
print "B\n";
}
sub subC {
print "C\n";
}
1;
Is there any CPAN resource that can help with this task?

To parse and reformat Perl code, you should use PPI.
This is the same tool that Perl::Critic uses to accomplish all of its feats.
In this case, I studied the code for PPI::Dumper to get a sense of how to navigate the Document Tree that PPI returns.
The following parses the source code and separates out the sections containing subroutines and comments. It attaches the comments, pod, and whitespace that precede a subroutine to that subroutine, and then sorts each run of neighboring subs by name.
use strict;
use warnings;
use PPI;
use Data::Dump;
my $src = do { local $/; <DATA> };
# Load a document
my $doc = PPI::Document->new( \$src );
# Save Sub locations for later sorting
my @group = ();
my @subs = ();
for my $i ( 0 .. $#{ $doc->{children} } ) {
my $child = $doc->{children}[$i];
my ( $subtype, $subname )
= $child->isa('PPI::Statement::Sub')
? grep { $_->isa('PPI::Token::Word') } @{ $child->{children} }
: ( '', '' );
# Look for grouped subs, whitespace and comments. Sort each group separately.
my $is_related = ($subtype eq 'sub') || grep { $child->isa("PPI::Token::$_") } qw(Whitespace Comment Pod);
# State change or end of stream
if ( my $range = $is_related .. ( !$is_related || ( $i == $#{ $doc->{children} } ) ) ) {
if ($is_related) {
push @group, $child;
if ( $subtype ) {
push @subs, { name => "$subname", children => [@group] };
@group = ();
}
}
if ( $range =~ /E/ ) {
@group = ();
if (@subs) {
# Sort and Flatten
my @sorted = map { @{ $_->{children} } } sort { $a->{name} cmp $b->{name} } @subs;
# Assign back to document, and then reset group
my $min_index = $i - $range + 1;
@{ $doc->{children} }[ $min_index .. $min_index + $#sorted ] = @sorted;
@subs = ();
}
}
}
}
print $doc->serialize;
1;
__DATA__
package A;
use warnings;
use strict;
=comment
Pod describing subC
=cut
sub subC {
print "C\n";
}
INIT {
print "Hello World";
}
sub subB {
print "B\n";
}
# Hello subA comment
sub subA {
print "A\n";
}
1;
Output:
package A;
use warnings;
use strict;
=comment
Pod describing subC
=cut
sub subC {
print "C\n";
}
INIT {
print "Hello World";
}
# Hello subA comment
sub subA {
print "A\n";
}
sub subB {
print "B\n";
}
1;

First, here's my solution:
#!/bin/sh
TOKEN=sub
gsed -e ':a;N;$!ba;s/\n/__newline__/g' "$1" > "$1.out"
gsed -i "s/__newline__\\s*$TOKEN\W/\\nsub /g" "$1.out"
sort "$1.out" -o "$1.out"
gsed -i 's/__newline__/\n/g' "$1.out"
Usage: token_sort.sh myfile.pl
This is how it works:
Replace all newlines with a placeholder, __newline__.
Break out every $TOKEN, in this case sub, onto its own line.
Sort the lines using Unix sort.
Put the newlines back.
You should now have a sorted copy of your file in myfile.pl.out.
A few caveats:
Add a comment, "# Something", or "#!/usr/bin/env perl" to the top of the file; this ensures that the header block stays sorted at the top.
Each sorted block runs from the start of one sub to the start of the next, so comments placed above a sub get sorted with the previous sub.
You need GNU sed for this to work; on a Mac this means running "brew install gnu-sed".
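The same split-and-sort idea can be sketched in plain Perl, which avoids the GNU sed dependency. This is only a sketch under the same assumptions as the shell script: every sub starts with "sub NAME" at the beginning of a line, and whatever follows a sub (a comment meant for the next sub, or a trailing 1;) travels with it. The name sort_sub_chunks is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: reorder top-level "sub NAME" chunks by name.
sub sort_sub_chunks {
    my ($src) = @_;

    # Split before every line that starts with "sub NAME"; the lookahead
    # keeps the "sub" keyword at the start of its chunk.
    my @parts = split /^(?=sub\s+\w+)/m, $src;
    return $src unless @parts;

    # Anything before the first sub is the header and stays on top.
    my $header = $parts[0] =~ /^sub\s+\w+/ ? '' : shift @parts;

    # Sort the chunks by the sub name each one starts with.
    my @sorted = map  { $_->[1] }
                 sort { $a->[0] cmp $b->[0] }
                 map  { [ /^sub\s+(\w+)/, $_ ] } @parts;

    return join '', $header, @sorted;
}

# Usage: perl sort_subs.pl A.pm > A.sorted.pm
if (@ARGV) {
    local $/;    # slurp mode: read the whole file at once
    print sort_sub_chunks( scalar <> );
}
```

Because the text is split only at lines that begin with sub, indented or nested subs stay inside the chunk of the enclosing top-level sub.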

Related

List of subroutines current package declares

Need to gather a list of the subroutines that the current package itself declares - no imports.
I've seen Package::Stash, but it lists imported names (of course).
Came up with the following, but I don't like having to move the includes to the bottom of the file.
Does anyone see how I can gather the same list but still keep my includes near the top?
package Foo;
use common::sense;
use Function::Parameters;
# Must import at least "fun" and "method" first for them to work.
# See bottom of file for rest of includes.
our %package_functions;
say join q{, }, sort keys %package_functions;
sub foo_1 { ; }
fun foo_2 () { ; }
method foo_3 () { ; }
BEGIN {
# This block must be kept *after* the sub declarations, and *before* imports.
no strict 'refs';
%package_functions = map { $_ => 1 } # Hash offers more convenient lookups when/if checked often.
grep { !/^(can|fun|method)$|^_/ } # Exclude certain names or name patterns.
grep { ref __PACKAGE__->can($_) eq 'CODE' } # Pick out only CODEREFs.
keys %{__PACKAGE__ . '::'}; # Any functions above should have their names here.
}
use JSON;
use Data::Dumper;
# use ...
1;
Outputs (with perl -E 'use Foo;'):
foo_1, foo_2, foo_3
If BEGIN is moved after the other includes, we see Dumper, encode_json, etc..
Deparse from core is perfectly able to do that, so you can do what B::Deparse.pm is doing, namely use the B module to peek into perl's innards:
# usage: for_subs 'package', sub { my ($sub_name, $pkg, $type, $cv) = @_; ... }
sub for_subs {
my ($pkg, $sub) = (@_, sub { printf "%-15s %-15s %-15s%.0s\n", @_ });
use B (); no strict 'refs';
my %stash = B::svref_2object(\%{$pkg.'::'})->ARRAY;
while(my($k, $v) = each %stash){
if($v->FLAGS & B::SVf_ROK){
my $cv = $v->RV;
if($cv->isa('B::CV')){
$sub->($k, $pkg, sub => $cv);
}elsif(!$cv->isa('B::SPECIAL') and $cv->FLAGS & B::SVs_PADTMP){
$sub->($k, $pkg, const => $cv);
}
}elsif($v->FLAGS & B::SVf_POK){
$sub->($k, $pkg, proto => $v->PV);
}elsif($v->FLAGS & B::SVf_IOK){
$sub->($k, $pkg, proto => '');
}elsif($v->isa('B::GV')){
my $cv = $v->CV;
next if $cv->isa('B::SPECIAL');
next if ${$cv->GV} != $$v;
$sub->($k, $pkg, sub => $cv);
}
}
}
Sample usage:
package P::Q { sub foo {}; sub bar; sub baz(){ 13 } }
for_subs 'P::Q';
sub foo {}; sub bar; sub baz(){ 13 }
for_subs __PACKAGE__;
should result in:
foo P::Q sub
bar P::Q proto
baz P::Q sub
baz main const
for_subs main sub
bar main proto
foo main sub
If the package you're interested in is not main, you don't care about empty prototypes (like the bar in the example above) and you need just a list of names, you can cut it to:
# usage: @subs = get_subs 'package'
sub get_subs {
my @subs;
use B (); no strict 'refs';
my %stash = B::svref_2object(\%{shift.'::'})->ARRAY;
while(my($k, $v) = each %stash){
next unless $v->isa('B::GV');
my $cv = $v->CV;
next if $cv->isa('B::SPECIAL');
next if ${$cv->GV} != $$v;
push @subs, $k;
}
@subs
}
My Devel::Examine::Subs can do this. Review the documentation for methods (and parameters to new()) that allow you to exclude subs that are retrieved.
package TestLib;
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
use Devel::Examine::Subs;
use JSON;
my $des = Devel::Examine::Subs->new(file => __FILE__);
my $sub_names = $des->all;
say join ', ', @$sub_names;
sub one {}
sub two {}
sub three {}
Output:
perl -E 'use lib "."; use TestLib'
one, two, three

How to get right line number when Carp::croaked?

Is there a proper way to get the line number where croak was called?
In the following example, $stack contains:
line 22, where the last subroutine (l) was called
line 44, where the try block is terminated
all the other calls in the stack
but I'd like to know line 28, where I call croak (or confess):
#!/usr/bin/env perl
{
package Module;
use strict; use warnings;
use Carp qw(croak confess longmess);
our @CARP_NOT = qw(Try::Tiny);
use Try::Tiny;
sub i {
my ($x) = @_;
j($x);
}
sub j {
my ($x) = @_;
k($x);
}
sub k {
my ($x) = @_;
l($x);
}
sub l {
my ($x) = @_;
my $stack = longmess();
croak( { data => 1, stack => $stack } ) if $x =~ /\D/; # or confess
return $x;
}
1;
}
use strict; use warnings; use 5.014;
import Module;
use Try::Tiny;
use Data::Dumper;
try {
Module::i("x");
} catch {
say Dumper $_;
};
sub _lm { longmess() }
sub l {
my ($x) = @_;
die( { data => 1, stack => _lm() } ) if $x =~ /\D/;
return $x;
}
or
sub l {
my ($x) = @_;
local $Carp::CarpLevel = $Carp::CarpLevel - 1;
die( { data => 1, stack => longmess() } ) if $x =~ /\D/;
return $x;
}
or
sub mycroak { die( { @_, stack => longmess() } ); }
sub l {
my ($x) = @_;
mycroak( data => 1 ) if $x =~ /\D/;
return $x;
}
(Replaced croak with die because you didn't take advantage of any of croak's functionality.)
From the BUGS section of Carp documentation:
The Carp routines don't handle exception objects currently. If called with a first argument that is a reference, they simply call die() or warn(), as appropriate.
If you simply call confess() without an arg, the line number will be reported.
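Another way to record the raising line directly, sketched here with the built-in caller rather than Carp: a small wrapper asks caller for the file and line of its own call site and stores them in the exception hash. The names throw and check_number are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: die with a hash that records where throw() was called.
sub throw {
    my %fields = @_;
    # caller() with no arguments describes the immediate caller of throw().
    my ( $package, $file, $line ) = caller;
    my $error = { %fields, package => $package, file => $file, line => $line };
    die $error;
}

sub check_number {
    my ($x) = @_;
    throw( data => 1 ) if $x =~ /\D/;    # this line's number ends up in the hash
    return $x;
}

eval { check_number('x'); 1 } or do {
    my $err = $@;
    print "died at $err->{file} line $err->{line}\n";
};
```

Unlike longmess, this stores only the single call site, so it complements a full stack trace and does not replace one.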

Perl print out all subs arguments at every call at runtime

I'm looking for a way to debug-print each subroutine call in the Myapp::* namespace (i.e. without dumping the CPAN modules), but without having to edit every .pm file manually to insert some module or print statement.
I'm just learning (or rather, trying to understand) the DB package, which lets me trace execution (using the shebang #!/usr/bin/perl -d:Mytrace):
package DB;
use 5.010;
sub DB {
my( $package, $file, $line ) = caller;
my $code = \@{"::_<$file"};
print STDERR "--> $file $line $code->[$line]";
}
#sub sub {
# print STDERR "$sub\n";
# &$sub;
#}
1;
and I'm looking for a way to use the sub hook (DB::sub) to print the actual arguments of each called sub in the Myapp::* namespace.
Or is there some easier (common) method to
combine the execution line-tracer DB::DB
with a dump of each subroutine call's arguments (and its return values, if possible)?
I don't know if it counts as "easier" in any sane meaning of the word, but you can walk the symbol table and wrap all functions in code that prints their arguments and return values. Here's an example of how it might be done:
#!/usr/bin/env perl
use 5.14.2;
use warnings;
package Foo;
sub first {
my ( $m, $n ) = @_;
return $m+$n;
}
sub second {
my ( $m, $n ) = @_;
return $m*$n;
}
package main;
no warnings 'redefine';
for my $k (keys %{$::{'Foo::'}}) {
my $orig = *{$::{'Foo::'}{$k}}{CODE};
$::{'Foo::'}{$k} = sub {
say "Args: @_";
unless (wantarray) {
my $r = $orig->(@_);
say "Scalar return: $r";
return $r;
}
else {
my @r = $orig->(@_);
say "List return: @r";
return @r
}
}
}
say Foo::first(2,3);
say Foo::second(4,6);

My TOC script is not generating Strict html standard code

I've written a Perl script that generates a table of contents from HTML pages. It works fine (and generates valid HTML), except that the Perl output drops the closing tags of some elements, such as p, so the result does not validate against a Strict DOCTYPE.
Please scroll down the post to see the Perl code.
What should I do to correct it?
#!/usr/bin/perl -w
#Copyright anurag gupta ; free to use under GNU GPL License
use strict;
use feature "switch";
use Common;
use HTML::Element;
use HTML::TreeBuilder;
#"F:/anurag/work/indiacustomercare/airtel/recharge.html";
my $filename="F:/tmp/t9.html";
my $index=0;
my $labelprefix="anu555ltg-";
my $tocIndex=100001;
my $toc;
my @stack;
my $prevHtag="h2";
sub hTagEncountered($)
{
my $hTag=shift;
my $currLevel=(split //, $hTag)[1];
given($hTag)
{
when(/h1/)
{
break;
}
default{
my $countCurr= (split /h/,$hTag)[1];
my $countPrev= (split /h/,$prevHtag)[1];
if($countCurr>$countPrev)
{
push @stack,($currLevel);
$toc.="<ul>";
}
elsif($countCurr<$countPrev)
{
# Now check in the stack
while ( @stack and $currLevel < $stack[$#stack])
{
pop @stack;
$toc.="</ul>";
}
}
}
}
$prevHtag=$hTag;
}
sub getLabel
{
my $name=$labelprefix.++$tocIndex;
}
sub traversehtml
{
my $node=$_[0];
# $node->dump();
# print "-----------------\n";
# print $node->tag()."\n";
# print ref($node),"->\n";
if((ref(\$node) ne "SCALAR" )and ($node->tag() =~m/^h[2-7]$/i)) #it's an H Element!
{
my @h = $node->content_list();
if(@h==1 and ref(\$h[0]) eq "SCALAR") #H1 contains simple string and nothing else
{
hTagEncountered($node->tag());
my $label=getLabel();
my $a = HTML::Element->new('a', name => $label);
my $text=$node->as_trimmed_text();
$a->push_content($text);
$node->delete_content();
$text=HTML::Entities::encode_entities($text);
$node->push_content($a);
$toc.=<<EOF;
<li>$text
EOF
}
elsif ( @h==1 and ($h[0]->tag() eq "a")) # <h1><a>ttt</a></h1> case
{
#See if any previous label already exists
my $prevlabel = $h[0]->attr("name");
$h[0]->attr("name",undef) if(defined($prevlabel) and $prevlabel=~m/$labelprefix/); #delete previous name tag if any
#set the new label
my $label=getLabel();
$h[0]->attr("name",$label);
hTagEncountered($node->tag());
my $text=HTML::Entities::encode_entities($node->as_trimmed_text());
$toc.=<<EOF;
<li>$text
EOF
}
elsif (@h>1) #<h1>some text here<a>ttt</a></h1> case
{
die "h1 must not contain any html elements";
}
}
my @h = $node->content_list();
foreach my $item (@h)
{
if(ref(\$item) ne "SCALAR") {traversehtml($item); } #skip scalar items
}
}
die "File $filename not found" if !-r $filename;
my $tree = HTML::TreeBuilder->new();
$tree->parse_file($filename);
my @h = $tree->content_list();
traversehtml($h[1]);
while(pop @stack)
{
$toc.="</ul>";
}
$toc="<ul>$toc</ul>";
print qq{<div id="icctoc"><h2>TOC</h2>$toc</div>};
my @list1=$tree->content_list();
my @list2=$list1[1]->content_list();
for(my $i=0;$i<@list2;++$i){
if(ref(\$list2[$i]) eq "SCALAR")
{
print $list2[$i]
}
else{
print $list2[$i]->as_HTML();
}
}
# Finally:
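Try passing {} as the \%optional_end_tags argument (the third argument) to as_HTML, for example $list2[$i]->as_HTML(undef, undef, {}), so that no closing tags are treated as optional and omitted. See the documentation for details.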

How do you create a callback function (dispatch table) in Perl using hashes?

I want to call a main controller function that dispatches other function dynamically, something like this:
package Controller;
my %callback_funcs = ();
sub register_callback{
my ($class,$callback,$options) = _@;
#append to %callback_funcs hash ... ?
}
sub main{
%callback_funcs = ( add => 'add_func', rem => 'remove_func', edit => 'edit_func');
while(<STDIN>){
last if ($_ =~ /^\s*$/);
if($_ == 'add' || _$ == 'rem' || _$ == 'edit'){
$result = ${callback_funcs['add']['func']}(callback_funcs['add']['options']);
}
}
}
sub add_func{
...
}
One caveat is that the subs are defined in other Modules, so the callbacks would have to be able to reference them... plus
I'm having a hard time getting the hashes right!
So, it's possible to have a hash that contains anonymous subroutines that you can invoke from stdin.
my %callbacks = (
add => sub {
# do stuff
},
fuzzerbligh => sub {
# other stuff
},
);
And you can insert more hashvalues into the hash:
$callbacks{next} = sub {
...
};
And you would invoke one like this
$callbacks{next}->(@args);
Or
my $coderef = $callbacks{next};
$coderef->(@args);
You can get the hash key from STDIN, or anywhere else.
You can also define them as ordinary named subs and then take a reference to them.
sub delete {
# regular sub definition
}
$callbacks{delete} = \&delete;
I wouldn't call these callbacks, however. Callbacks are subs that get called after another subroutine has returned.
Your code is also rife with syntax errors which may be obscuring the deeper issues here. It's also not clear to me what you're trying to do with the second level of arrays. When are you defining these subs, and who is using them when, and for what?
Perhaps this simplified example will help:
# Very important.
use strict;
use warnings;
# Define some functions.
sub multiply { $_[0] * $_[1] }
sub divide { $_[0] / $_[1] }
sub add { $_[0] + $_[1] }
sub subtract { $_[0] - $_[1] }
# Create a hash of references to those functions (dispatch table).
my %funcs = (
multiply => \&multiply,
divide => \&divide,
add => \&add,
subtract => \&subtract,
);
# Register some more functions.
sub register {
my ($key, $func) = @_;
$funcs{$key} = $func;
}
register('+', \&add); # As above.
register('sum', sub { # Or using an anonymous subroutine.
my $s = 0;
$s += $_ for @_;
return $s;
});
# Invoke them dynamically.
while (<>){
my ($op, @args) = split;
last unless $op and exists $funcs{$op}; # No need for equality tests.
print $funcs{$op}->(@args), "\n";
}
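The question's register_callback, with per-callback options, could follow the same pattern by storing a small hash per entry instead of a bare code reference. This is a minimal sketch, not the asker's actual design: the dispatch helper, the start option, and the convention of passing the options as the first argument are all assumptions made for illustration.

```perl
use strict;
use warnings;

my %callback_funcs;

# Store a code reference together with its options under a name.
sub register_callback {
    my ( $name, $func, $options ) = @_;
    $callback_funcs{$name} = { func => $func, options => $options || {} };
    return;
}

# Hypothetical helper: look up a name and call its function with the
# stored options followed by the caller's arguments.
sub dispatch {
    my ( $name, @args ) = @_;
    my $entry = $callback_funcs{$name}
        or die "No callback registered for '$name'\n";
    return $entry->{func}->( $entry->{options}, @args );
}

register_callback(
    add => sub {
        my ( $options, @numbers ) = @_;
        my $sum = $options->{start};
        $sum += $_ for @numbers;
        return $sum;
    },
    { start => 10 },
);

my $result = dispatch( 'add', 1, 2 );
print "$result\n";
```

Since the table only holds code references, the registered subs can live in any module; \&Some::Module::func works just as well as an anonymous sub.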
You've already got some good answers on how to build a dispatch table and call functions through it within a single file, but you also keep talking about wanting the functions to be defined in other modules. If that's the case, then wouldn't it be better to build the dispatch table dynamically based on what dispatchable functions each module says it has rather than having to worry about keeping it up to date manually? Of course it would!
Demonstrating this requires multiple files, of course, and I'm using Module::Pluggable from CPAN to find the modules which provide the function definitions.
dispatch_core.pl:
#!/usr/bin/env perl
use strict;
use warnings;
my %dispatch;
use lib '.'; # a demo is easier if I can put modules in the same directory
use Module::Pluggable require => 1, search_path => 'DTable';
for my $plugin (plugins) {
%dispatch = (%dispatch, $plugin->dispatchable);
}
for my $func (sort keys %dispatch) {
print "$func:\n";
$dispatch{$func}->(2, 5);
}
DTable/Add.pm:
package DTable::Add;
use strict;
use warnings;
sub dispatchable {
return (add => \&add);
}
sub add {
my ($num1, $num2) = @_;
print "$num1 + $num2 = ", $num1 + $num2, "\n";
}
1;
DTable/MultDiv.pm:
package DTable::MultDiv;
use strict;
use warnings;
sub dispatchable {
return (multiply => \&multiply, divide => \&divide);
}
sub multiply {
my ($num1, $num2) = @_;
print "$num1 * $num2 = ", $num1 * $num2, "\n";
}
sub divide {
my ($num1, $num2) = @_;
print "$num1 / $num2 = ", $num1 / $num2, "\n";
}
1;
Then, on the command line:
$ ./dispatch_core.pl
add:
2 + 5 = 7
divide:
2 / 5 = 0.4
multiply:
2 * 5 = 10
Adding new functions is now as simple as dropping a new file into the DTable directory with an appropriate dispatchable sub. No need to ever touch dispatch_core.pl just to add a new function again.
Edit: In response to the comment's question about whether this can be done without Module::Pluggable, here's a modified dispatch_core.pl which doesn't use any external modules other than the ones defining the dispatchable functions:
#!/usr/bin/env perl
use strict;
use warnings;
my %dispatch;
my @dtable = qw(
DTable::Add
DTable::MultDiv
);
use lib '.';
for my $plugin (@dtable) {
eval "use $plugin; 1" or die "Could not load $plugin: $@";
%dispatch = (%dispatch, $plugin->dispatchable);
}
for my $func (sort keys %dispatch) {
print "$func:\n";
$dispatch{$func}->(2, 5);
}
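An alternative to string eval for loading a module by name is require inside a block eval, which makes it easy to react to a missing plugin. A minimal sketch: the helper name load_module is an assumption, and List::Util merely stands in for a plugin module.

```perl
use strict;
use warnings;

# Hypothetical helper: try to load a module by name; return 1 on success, 0 on failure.
sub load_module {
    my ($module) = @_;
    # require with a variable wants a file path, so turn Foo::Bar into Foo/Bar.pm.
    ( my $file = "$module.pm" ) =~ s{::}{/}g;
    return eval { require $file; 1 } ? 1 : 0;
}

for my $module (qw(List::Util No::Such::Plugin)) {
    if ( load_module($module) ) {
        print "loaded $module\n";
    }
    else {
        # $@ still holds the error message from the failed require.
        print "could not load $module: $@";
    }
}
```

A plugin that fails to load can then be skipped or reported, rather than surfacing later as a failed call to dispatchable.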