In Perl, is it better to use goto or local function, and why with an example?
For example, I am using code like this:
sub data {
data;
}
data();
or
goto L;
L: if ( $i == 0 ) {
    print "Hello!!!!";
}
Programmers don't die. They just GOSUB without RETURN.
That said, do not use goto in Perl.
If you want a statement to be executed once, just write that statement and don't use any flow control.
If you want your program to execute the statements several times, put them in a loop.
If you want them to be executed from different places, put them in a sub.
If you want the sub to be available in different programs, put it in a module.
Don't use goto.
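To make the list above concrete, here is a minimal sketch of the loop and sub alternatives. The sub name greet and the names in the list are made up for illustration; they are not from the question.

```perl
use strict;
use warnings;

# Code needed from several places goes in a sub (hypothetical example).
sub greet {
    my ($name) = @_;
    return "Hello, $name!";
}

# Code that has to run several times goes in a loop, not behind a label.
for my $name (qw(Alice Bob)) {
    print greet($name), "\n";
}
```

Neither piece needs a label or a jump, and greet can later be moved into a module unchanged.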
There is one place where a goto makes sense in Perl. Matt Trout is talking about that in his blog post No, not that goto, the other goto.
Conventional wisdom is that other constructs are better than goto.
If you want to do something repeatedly, use a loop.
If you want to do something conditionally, use a block with if.
If you want to go somewhere and come back, use a function.
If you want to bail out early and "jump straight to the end", this can often be better written using an exception. (Perl handily manages
Related
I'm reading about Perl, which is quite interesting. But while reading about goto here, I ran into a question.
I know that the goto statement has three forms:
goto LABEL.
goto EXPR.
goto &NAME.
But of these three forms, what is the use of the third one, goto &NAME?
It also seems to be like a function call.
So:
What is the real difference between goto &NAME and a normal function call in Perl?
When do we use goto &NAME?
Can anyone please explain with an example?
Thanks in advance.
It says in the goto page
The goto &NAME form is quite different from the other forms of
goto. In fact, it isn't a goto in the normal sense at all, and
doesn't have the stigma associated with other gotos.
Then follows the answer to your question
Instead, it
exits the current subroutine (losing any changes set by local())
and immediately calls in its place the named subroutine using the
current value of @_.
With a normal function call the execution continues on the next line after the function exits.
The rest of that paragraph is well worth reading as well, and answers your second question
This is used by AUTOLOAD subroutines that wish to load another subroutine and then pretend that the other subroutine had been called in the first place (except that any modifications to @_ in the current subroutine are propagated to the other subroutine.) After the goto, not even caller will be able to tell that this routine was called first.
A basic example. With a subroutine deeper defined somewhere, compare
sub func_top {
deeper( @_ ); # pass its own arguments
# The rest of the code here runs after deeper() returns
}
with
sub func_top {
goto &deeper; # @_ is passed to it, as it is at this point
# Control never returns here
}
At the statement goto &deeper the sub func_top is exited. So after deeper completes, control returns to the point just after the call to func_top.
In a sense, func_top is replaced by deeper.
Trying to pass arguments with goto &func results in errors, even just for goto &deeper().
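One way to see the difference for yourself is to ask caller who called deeper. This is only a sketch; the wrapper names via_call and via_goto are made up for the demonstration.

```perl
use strict;
use warnings;

# Report the name of the sub that called us, according to caller().
sub deeper {
    my @frame = caller(1);                 # empty if called from file level
    return $frame[3] // 'main program';    # element 3 is the calling sub's name
}

sub via_call { return deeper(@_) }   # normal call: via_call stays on the stack
sub via_goto { goto &deeper }        # via_goto's frame is replaced by deeper

print via_call(), "\n";   # prints main::via_call
print via_goto(), "\n";   # prints main program when run from file level
```

With the normal call, deeper sees via_call as its caller. With goto &deeper, the wrapper has vanished from the call stack.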
What I want to achieve:
###############CODE########
old_procedure(arg1, arg2);
#############CODE_END######
I have a huge code base which has an old procedure in it. I want calls to old_procedure to go to a new procedure (new_procedure(arg1, arg2)) with the same arguments.
Now I know the question seems pretty stupid, but the trick is that I am not allowed to change the code or the bad_function. So the only thing I can do is create a procedure externally which reads the code flow or something, and then whenever it finds the bad_function, replaces it with the new_function. They have a void type, so I don't have to worry about the return values.
I am using Perl. If someone knows how to at least start in this direction, please comment or answer. It would be nice if the new code can be done in Perl or C, but other known languages are good too: C++, Java.
EDIT: The code is written in shell script and Perl. I cannot edit the code and I don't have the location of the old_function. I mean I can find it, but it's really tough. So I can use the package approach pointed out, but is there a way around it, so that I could parse the thread with that function and replace the function calls? Please don't remove tags as I need suggestions from Java and C++ experts also.
EDIT: @mirod
So I tried it out and your answer made a new subroutine, and now there is no way of accessing the old one. I had created a variable which is checked to decide which way to go (old_sub or new_sub). Is there a way to add the variable in the new code, so that it sends control back to old_function if it is not set?
like:
use BadPackage; # sub is defined there
BEGIN
{ package BadPackage;
no warnings; # to avoid the "Subroutine bad_sub redefined" message
# check for the variable and send to old_sub if the var is not set
sub bad_sub
{ # good code
}
}
# Thanks @mirod
This is easier to do in Perl than in a lot of other languages, but that doesn't mean it's easy, and I don't know if it's what you want to hear. Here's a proof-of-concept:
Let's take some broken code:
# file name: Some/Package.pm
package Some::Package;
use base 'Exporter';
our @EXPORT = qw(forty_two nineteen);
sub forty_two { 19 }
sub nineteen { 19 }
1;
# file name: main.pl
use Some::Package;
print "forty-two plus nineteen is ", forty_two() + nineteen();
Running the program perl main.pl produces the output:
forty-two plus nineteen is 38
It is given that the files Some/Package.pm and main.pl are broken and immutable. How can we fix their behavior?
One way we can insert arbitrary code to a perl command is with the -M command-line switch. Let's make a repair module:
# file: MyRepairs.pm
CHECK {
no warnings 'redefine';
*forty_two = *Some::Package::forty_two = sub { 42 };
};
1;
Now running the program perl -MMyRepairs main.pl produces:
forty-two plus nineteen is 61
Our repair module uses a CHECK block to execute code in between the compile-time and run-time phase. We want our code to be the last code run at compile-time so it will overwrite some functions that have already been loaded. The -M command-line switch will run our code first, so the CHECK block delays execution of our repairs until all the other compile time code is run. See perlmod for more details.
This solution is fragile. It can't do much about modules loaded at run-time (with require ... or eval "use ...", which are common) or subroutines defined in other CHECK blocks (which are rare).
If we assume the shell script that runs main.pl is also immutable (i.e., we're not allowed to change perl main.pl to perl -MMyRepairs main.pl), then we move up one level and pass the -MMyRepairs in the PERL5OPT environment variable:
PERL5OPT="-I/path/to/MyRepairs -MMyRepairs" bash the_immutable_script_that_calls_main_pl.sh
These are called automated refactoring tools and are common for other languages. For Perl though you may well be in a really bad way because parsing Perl to find all the references is going to be virtually impossible.
Where is the old procedure defined?
If it is defined in a package, you can switch to the package, after it has been used, and redefine the sub:
use BadPackage; # sub is defined there
BEGIN
{ package BadPackage;
no warnings; # to avoid the "Subroutine bad_sub redefined" message
sub bad_sub
{ # good code
}
}
If the code is in the same package but in a different file (loaded through a require), you can do the same thing without having to switch package.
If all the code is in the same file, then change it.
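If you also need the old sub to stay reachable, as asked in the follow-up edit, one option is to save a reference to it before overwriting the glob. This is a minimal sketch, not a drop-in fix: the BadPackage body here is a stand-in for the immutable module, and the switch variable $USE_NEW is hypothetical.

```perl
use strict;
use warnings;

# Stand-in for the immutable module (hypothetical body).
package BadPackage;
sub bad_sub { return "old: @_" }

package main;

our $USE_NEW = 1;    # hypothetical switch between new and old behaviour

{
    no warnings 'redefine';
    my $old = \&BadPackage::bad_sub;     # keep the original code reachable
    *BadPackage::bad_sub = sub {
        return $USE_NEW ? "new: @_" : $old->(@_);
    };
}

print BadPackage::bad_sub(1, 2), "\n";   # prints new: 1 2
$USE_NEW = 0;
print BadPackage::bad_sub(1, 2), "\n";   # prints old: 1 2
```

Existing callers keep calling bad_sub by name. The replacement decides at call time whether to run the new code or hand off to the saved original.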
sed -i 's/old_procedure/new_procedure/g' codefile
Is this what you mean?
I wrote three small programs in perl.
I have Program1.pl, then Program2.pl (which uses output generated by Program1.pl as input) and then I have Program3.pl (which uses output generated by Program2.pl as input).
Now I want to write a program which 'calls' all three programs, so that the user only has to run one program, namely MainProgram.pl.
How do I go about doing this?
Thanks in advance! :)
It depends on what you mean by combine, but you can do this, for example, by creating a pipe:
open(PIPE, "perl Program1.pl | perl Program2.pl | perl Program3.pl |") or die "can't create pipe: $!";
while(<PIPE>){
print;
}
It sounds like there's a bit of a design problem here; it would make sense to be passing data from sub to sub, or be using an OO structure, rather than piping one script into another and then into another.
Unless the "Output[n]" files are needed by a human, you simply need to push your results into arrays, then read the arrays into hashes on subsequent steps. This only works for cases where the number of items will fit into memory. If you are processing gigabytes of data, you might need the intermediary files.
push is an array operator.
Pipes cannot be readily used because you have more than one output, and more than one condition on the output.
Depending on how much you want to change your code it could be very simple, like:
#!/usr/bin/perl
system('perl Program1.pl');
system('perl Program2.pl');
system('perl Program3.pl');
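Note that the bare system calls above carry on even if an earlier script fails. A slightly more defensive sketch checks each exit status and stops at the first failure. The helper name run_in_order is made up, and $^X is the path of the perl that is currently running.

```perl
use strict;
use warnings;

# Run each command in turn; die as soon as one exits with a non-zero status.
# Each command is an array ref: [ program, args... ].
sub run_in_order {
    my @commands = @_;
    for my $cmd (@commands) {
        system(@$cmd) == 0
            or die "'@$cmd' failed with status " . ($? >> 8) . "\n";
    }
    return scalar @commands;
}

# Demonstration with a trivial one-liner standing in for the real scripts.
print run_in_order( [ $^X, '-e', 'print "step ok\n"' ] ), " command(s) succeeded\n";
```

In MainProgram.pl the call would be something like run_in_order(map { [ $^X, $_ ] } qw(Program1.pl Program2.pl Program3.pl));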
Or you could rewrite your scripts into a single script with subroutines for each part you want to execute. Something like:
#!/usr/bin/perl
part_A();
part_B();
part_C();
exit;
sub part_A {
# code from Program1.pl goes in here
}
sub part_B {
# code from Program2.pl goes here
}
sub part_C {
# code from Program3.pl here
}
But honestly, it sounds like you just need to write a new script that does all the logic you need, and just get rid of the other scripts. It would be simpler to not write all those "temp" files just to read them in again to continue processing. You can pass hashes and arrays into subroutines and have them return hashes or arrays, or just have them modify the ones you passed in (if you pass in references).
Oh, and the mention of using pipes: pipes are only useful when taking standard out and/or standard error of one program and sending that to standard in of another program. And since you said that your scripts write to files and read from files, they are not printing to standard out nor reading from standard in. Thus piping them together won't benefit you any more than just calling each one in sequence. You could just as well execute them on one line like this (from a bash shell):
$ Program1.pl && Program2.pl && Program3.pl
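For the single-script route, here is a rough sketch of passing data between the parts by reference instead of through temp files. The data and what each part does are invented for illustration, since the question doesn't say what the three programs actually do.

```perl
use strict;
use warnings;

# Part A: produce records (stands in for what Program1.pl writes to a file).
sub part_A {
    return [ 'apple 3', 'pear 5', 'apple 4' ];
}

# Part B: fold the records into a hash of totals.
sub part_B {
    my ($lines) = @_;
    my %total;
    for my $line (@$lines) {
        my ($name, $count) = split ' ', $line;
        $total{$name} += $count;
    }
    return \%total;
}

# Part C: format a report from the hash.
sub part_C {
    my ($total) = @_;
    return map { "$_=$total->{$_}" } sort keys %$total;
}

print join(', ', part_C( part_B( part_A() ) )), "\n";   # prints apple=7, pear=5
```

Each part returns a reference, so nothing is copied and nothing touches the disk between steps.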
I am working on a moderately complex Perl program. As a part of its development, it has to go through modifications and testing. Due to certain environment constraints, running this program frequently is not an option that is easy to exercise.
What I want is a static call-graph generator for Perl. It doesn't have to cover every edge case (e.g., redefining variables to be functions or vice versa in an eval).
(Yes, I know there is a run-time call-graph generating facility with Devel::DProf and dprofpp, but run-time is not guaranteed to call every function. I need to be able to look at each function.)
Can't be done in the general case:
my $obj = Obj->new;
my $method = some_external_source();
$obj->$method();
However, it should be fairly easy to get a large number of the cases (run this program against itself):
#!/usr/bin/perl
use strict;
use warnings;
sub foo {
bar();
baz(quux());
}
sub bar {
baz();
}
sub baz {
print "foo\n";
}
sub quux {
return 5;
}
my %calls;
while (<>) {
next unless my ($name) = /^sub (\S+)/;
while (<>) {
last if /^}/;
next unless my @funcs = /(\w+)\(/g;
push @{$calls{$name}}, @funcs;
}
}
use Data::Dumper;
print Dumper \%calls;
Note, this misses
calls to functions that don't use parentheses (e.g. print "foo\n";)
calls to functions that are dereferenced (e.g. $coderef->())
calls to methods that are strings (e.g. $obj->$method())
calls that put the open parenthesis on a different line
other things I haven't thought of
It incorrectly catches
commented functions (e.g. #foo())
some strings (e.g. "foo()")
other things I haven't thought of
If you want a better solution than that worthless hack, it is time to start looking into PPI, but even it will have problems with things like $obj->$method().
Just because I was bored, here is a version that uses PPI. It only finds function calls (not method calls). It also makes no attempt to keep the names of the subroutines unique (i.e. if you call the same subroutine more than once it will show up more than once).
#!/usr/bin/perl
use strict;
use warnings;
use PPI;
use Data::Dumper;
use Scalar::Util qw/blessed/;
sub is {
my ($obj, $class) = @_;
# && rather than "and": with "and", return would fire before the isa check
return blessed($obj) && $obj->isa($class);
}
my $program = PPI::Document->new(shift);
my $subs = $program->find(
sub { $_[1]->isa('PPI::Statement::Sub') and $_[1]->name }
);
die "no subroutines declared?" unless $subs;
for my $sub (@$subs) {
print $sub->name, "\n";
next unless my $function_calls = $sub->find(
sub {
$_[1]->isa('PPI::Statement') and
$_[1]->child(0)->isa("PPI::Token::Word") and
not (
$_[1]->isa("PPI::Statement::Scheduled") or
$_[1]->isa("PPI::Statement::Package") or
$_[1]->isa("PPI::Statement::Include") or
$_[1]->isa("PPI::Statement::Sub") or
$_[1]->isa("PPI::Statement::Variable") or
$_[1]->isa("PPI::Statement::Compound") or
$_[1]->isa("PPI::Statement::Break") or
$_[1]->isa("PPI::Statement::Given") or
$_[1]->isa("PPI::Statement::When")
)
}
);
print map { "\t" . $_->child(0)->content . "\n" } @$function_calls;
}
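Both versions above list a callee once per call site. If you would rather see each caller/callee pair only once, a small post-processing step can do that. This is a sketch in plain Perl: the helper name unique_edges is made up, and the hard-coded %calls stands in for the hash built by the regex version.

```perl
use strict;
use warnings;

# Turn { caller => [callees...] } into a de-duplicated list of edges.
sub unique_edges {
    my ($calls) = @_;
    my @edges;
    for my $caller (sort keys %$calls) {
        my %seen;
        push @edges, map  { "$caller -> $_" }
                     grep { !$seen{$_}++ } @{ $calls->{$caller} };
    }
    return @edges;
}

# Sample data only; in practice this comes from the scanning step.
my %calls = ( foo => [qw(bar baz quux baz)], bar => ['baz'] );
print "$_\n" for unique_edges(\%calls);
```

The %seen hash is reset for every caller, so the same callee can still appear under different callers.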
I'm not sure it is 100% feasible (since Perl code can not be statically analyzed in theory, due to BEGIN blocks and such - see very recent SO discussion). In addition, subroutine references may make it very difficult to do even in places where BEGIN blocks don't come into play.
However, someone apparently made the attempt - I only know of it but never used it so buyer beware.
I don't think there is a "static" call-graph generator for Perl.
The next closest thing would be Devel::NYTProf.
Its main goal is profiling, but its output can tell you how many times a subroutine has been called, and from where.
If you need to make sure every subroutine gets called, you could also use Devel::Cover, which checks to make sure your test-suite covers every subroutine.
I recently stumbled across a script while trying to find an answer to this same question. The script (linked to below) uses GraphViz to create a call graph of a Perl program or module. The output can be in a number of image formats.
http://www.teragridforum.org/mediawiki/index.php?title=Perl_Static_Source_Code_Analysis
I solved a similar problem recently, and would like to share my solution.
This tool was born out of desperation, untangling an undocumented part of a 30,000-line legacy script, in order to implement an urgent bug fix.
It reads the source code(s), uses GraphViz to generate a png, and then displays the image on-screen.
Since it uses simple line-by-line regexes, the formatting must be "sane" so that nesting can be determined.
If the target code is badly formatted, run it through a linter first.
Also, don't expect miracles such as parsing dynamic function calls.
The silver lining of a simple regex engine is that it can be easily extended for other languages.
The tool now also supports awk, bash, basic, dart, fortran, go, lua, javascript, kotlin, matlab, pascal, perl, php, python, r, raku, ruby, rust, scala, swift, and tcl.
https://github.com/koknat/callGraph
I have heard that people shouldn't be using & to call Perl subs, i.e:
function($a,$b,...);
# opposed to
&function($a,$b,...);
I know for one the argument list becomes optional, but what are some cases where it is appropriate to use the & and the cases where you should absolutely not be using it?
Also, how does the performance increase come into play here when omitting the &?
I'm a frequent abuser of &, but mostly because I'm doing weird interface stuff. If you don't need one of these situations, don't use the &. Most of these are just to access a subroutine definition, not call a subroutine. It's all in perlsub.
Taking a reference to a named subroutine. This is probably the only common situation for most Perlers:
my $sub = \&foo;
Similarly, assigning to a typeglob, which allows you to call the subroutine with a different name:
*bar = \&foo;
Checking that a subroutine is defined, as you might in test suites:
if( defined &foo ) { ... }
Removing a subroutine definition, which shouldn't be common:
undef &foo;
Providing a dispatcher subroutine whose only job is to choose the right subroutine to call. This is the only situation I use & to call a subroutine, and when I expect to call the dispatcher many, many times and need to squeeze a little performance out of the operation:
sub figure_it_out_for_me {
# all of these re-use the current @_
if( ...some condition... ) { &foo }
elsif( ...some other... ) { &bar }
else { &default }
}
To jump into another subroutine using the current argument stack (and replacing the current subroutine in the call stack), an unrare operation in dispatching, especially in AUTOLOAD:
goto &sub;
Call a subroutine that you've named after a Perl built-in. The & always gives you the user-defined one. That's why we teach it in Learning Perl. You don't really want to do that normally, but it's one of the features of &.
There are some places where you could use them, but there are better ways:
To call a subroutine with the same name as a Perl built-in. Just don't have subroutines with the same name as a Perl built-in. Check perlfunc to see the list of built-in names you shouldn't use.
To disable prototypes. If you don't know what that means or why you'd want it, don't use the &. Some black magic code might need it, but in those cases you probably know what you are doing.
To dereference and execute a subroutine reference. Just use the -> notation.
IMO, the only time there's any reason to use & is if you're obtaining or calling a coderef, like:
sub foo() {
print "hi\n";
}
my $x = \&foo;
&$x();
The main time that you can use it that you absolutely shouldn't in most circumstances is when calling a sub that has a prototype that specifies any non-default call behavior. What I mean by this is that some prototypes allow reinterpretation of the argument list, for example converting @array and %hash specifications to references. So the sub will be expecting those reinterpretations to have occurred, and unless you go to whatever lengths are necessary to mimic them by hand, the sub will get inputs wildly different from those it expects.
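A small sketch of that failure mode, using a made-up sub count_items with a (\@) prototype:

```perl
use strict;
use warnings;

# The (\@) prototype makes the compiler pass a reference to the array.
sub count_items(\@) {
    my ($aref) = @_;
    return ref($aref) eq 'ARRAY'
        ? scalar @$aref
        : "not a reference: $aref";
}

my @list = (10, 20, 30);

print count_items(@list), "\n";    # prototype applied: prints 3
print &count_items(@list), "\n";   # prototype bypassed: prints not a reference: 10
```

With the &, the array is flattened into @_ as usual, so the sub receives 10 where it expected an array reference.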
I think mainly people are trying to tell you that you're still writing in Perl 4 style, and we have a much cleaner, nicer thing called Perl 5 now.
Regarding performance, there are various ways that Perl optimizes sub calls which & defeats, with one of the main ones being inlining of constants.
There is also one circumstance where using & provides a performance benefit: if you're forwarding a sub call with foo(@_). Using &foo is infinitesimally faster than foo(@_). I wouldn't recommend it unless you've definitively found by profiling that you need that micro-optimization.
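The reason &foo is cheaper is that no new @_ is built; the callee sees the caller's @_ directly. A sketch of what that means in practice (all sub names here are made up):

```perl
use strict;
use warnings;

sub show_args   { return join ',', @_ }
sub with_amp    { return &show_args }     # callee sees our @_
sub with_parens { return show_args() }    # callee gets an empty @_

print with_amp(1, 2, 3), "\n";            # prints 1,2,3
print with_parens(1, 2, 3), "\n";         # prints an empty line

# Because @_ is shared, changes made by the callee are visible afterwards.
sub drop_first       { shift @_; return }
sub count_after_drop { &drop_first; return scalar @_ }

print count_after_drop(1, 2, 3), "\n";    # prints 2
```

That sharing is also the hazard: a callee invoked as &foo can silently eat or alter the caller's arguments.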
The &subroutine() form disables prototype checking. This may or may not be what you want.
http://www.perl.com/doc/manual/html/pod/perlsub.html#Prototypes
Prototypes allow you to specify the numbers and types of your subroutine arguments, and have them checked at compile time. This can provide useful diagnostic assistance.
Prototypes don't apply to method calls, or calls made in the old-fashioned style using the & prefix.
The & is necessary to reference or dereference a subroutine or code reference
e.g.
sub foo {
# a subroutine
}
my $subref = \&foo; # take a reference to the subroutine
&$subref(@args); # make a subroutine call using the reference.
my $anon_func = sub { ... }; # anonymous code reference
&$anon_func(); # called like this
Prototypes aren't applicable to subroutine references either.
The &subroutine form is also used in the so-called magic goto form.
The expression goto &subroutine replaces the current calling context with a call to the named subroutine, using the current value of @_.
In essence, you can completely switch a call to one subroutine with a call to the named one. This is commonly seen in AUTOLOAD blocks, where a deferred subroutine call can be made, perhaps with some modification to @_, but it looks to the program entirely as if it was a call to the named sub.
e.g.
sub AUTOLOAD {
...
push @_, @extra_args; # add more arguments onto the parameter list
goto &subroutine; # call another subroutine, as if we were never here
}
Potentially this could be useful for tail call elimination, I suppose.
see detailed explanation of this technique here
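For a complete, runnable version of the AUTOLOAD pattern, here is a minimal sketch. The package name Lazy and the say_* naming convention are invented for the example; a real AUTOLOAD would load or generate whatever the program actually needs.

```perl
use strict;
use warnings;

package Lazy;
our $AUTOLOAD;

# Called for any undefined function in package Lazy.
sub AUTOLOAD {
    my $name = $AUTOLOAD;
    $name =~ s/.*:://;                  # strip the package qualifier
    return if $name eq 'DESTROY';       # never autoload destructors
    die "no such function: $name\n" unless $name =~ /^say_(\w+)$/;
    my $word = $1;
    no strict 'refs';
    *{"Lazy::$name"} = sub { return "$word: @_" };   # install the real sub
    goto &{"Lazy::$name"};              # re-dispatch with the current @_
}

package main;

print Lazy::say_hello('world'), "\n";   # prints hello: world
```

The first call goes through AUTOLOAD, which installs the sub and jumps to it. Later calls to Lazy::say_hello find the installed sub directly and never reach AUTOLOAD.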
I've read the arguments against using '&', but I nearly always use it. It saves me too much time not to. I spend a very large fraction of my Perl coding time looking for what parts of the code call a particular function. With a leading &, I can search and find them instantly. Without a leading &, I get the function definition, comments, and debug statements, usually tripling the amount of code I have to inspect to find what I'm looking for.
The main thing not using '&' buys you is it lets you use function prototypes. But Perl function prototypes may create errors as often as they prevent them, because they will take your argument list and reinterpret it in ways you might not expect, so that your function call no longer passes the arguments that it literally says it does.