Perl 6 - Subroutine taking "Bareword" input - perl

So I've been diving into Perl 6 and have been looking at interpreting another language using Perl 6's operator definitions. I understand that this could be done by parsing the code, but I'm looking to push Perl 6's capabilities to see what it can do. Having this functionality would also make the parsing a lot easier.
I'm trying to make a variable definition in a C-style format. (The language isn't important.)
Something like:
char foo;
Where the char represents the type and the foo is the variable name. From my understanding the char can be interpreted using an operator definition like so:
sub prefix:<char>($input) {
say $input;
}
Which calls a subroutine with the foo as $input. The idea from here would be to use foo as a string and hold its reference in a hash somewhere. The problem with this is that Perl 6 seems to see any bareword as a function call and will complain with "Undeclared routine" when it can't find it.
I've looked possibly everywhere for an answer to this and the only thing that makes me still think that this may be possible is the qw function from Perl 5 which is now < > in Perl 6. The < > is obviously an operator which leads me to believe that there is a subroutine defined somewhere that tells this operator how to work and how to deal with the bareword input.
So to my question:
Is there a way of accepting bareword input into a subroutine just like the < > operator does?
Cheers!

The best way to do that would be to create a Grammar that parses your language. If you additionally want it to run the DSL you have just created, combine it with Actions.

Related

what is the meaning of prototype in perl?

I was writing some code and got the error Prototype after '%' for main::compareHashes. I don't know what is a prototype, and I'm still confused after viewing numerous documents online. Can someone please explain?
You have something like
sub compareHashes(%$) { ... }
(%$) is the prototype. Prototypes affect how the call to the sub is parsed.
As for the error you received, perldiag provides the following explanation:
A character follows % or @ in a prototype. This is useless, since % and @ gobble the rest of the subroutine arguments.
Whatever prototype you used makes no sense, and this is Perl's way of telling you that. You need to fix or remove compareHashes's prototype.
Looking at your previous question, your prototype is this:
sub compareHashes(%$hash1, %$hash2) { ... }
As a prototype, this is nonsense. Perhaps you should read the documentation on prototypes.
Prototypes in Perl are completely unlike prototypes in pretty much any other language, which is why expert Perl programmers do not recommend their use (outside of a tiny number of use cases where they are essential). Recent versions of Perl have added a new feature called function signatures, which are far more like what most people expect prototypes to be, but they are currently marked as an experimental feature so not many people use them.
But let's look at your prototype and see what's wrong with it.
sub compareHashes(%$hash1, %$hash2) { ... }
Firstly, it contains what look like variable names. And Perl prototypes don't contain variable names. They are just a string of symbols describing the types of the arguments you are going to pass to the subroutine.
But the parser isn't complaining about the variable names. It doesn't get to the first variable name. It finds a problem before that. It doesn't like the %$. And that's because %$ makes no sense as a prototype.
%$ means "this subroutine takes two arguments: a hash followed by a scalar". But we (should!) know that it makes no sense to pass a scalar to a subroutine after a hash. That's because hash assignments are greedy and will eat up all of the remaining arguments in @_, leaving nothing to go into the scalar.
You are saying that you will call the subroutine like this:
some_sub(%hash, $scalar);
And within the subroutine, you'll do this:
my (%hash, $scalar) = @_;
And that just won't work. That's what the error is telling you. That your prototype is nonsense.
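The greedy assignment described above can be seen in a minimal sketch. The sub names greedy and by_reference are hypothetical and exist only for illustration; passing a hash reference is one common way to keep the two arguments separate.

```perl
use strict;
use warnings;

# Hypothetical helper: the hash swallows every argument in the list assignment.
sub greedy {
    no warnings 'misc';   # silence "Odd number of elements in hash assignment"
    my (%hash, $scalar) = @_;
    return defined $scalar ? 'scalar set' : 'scalar undef';
}

# Hypothetical helper: a hash reference is a single scalar, so nothing is swallowed.
sub by_reference {
    my ($hash_ref, $scalar) = @_;
    return scalar(keys %$hash_ref) . " keys, scalar=$scalar";
}

my %h = (a => 1, b => 2);
print greedy(%h, 'extra'), "\n";          # the scalar never receives a value
print by_reference(\%h, 'extra'), "\n";   # both arguments arrive intact
```

Neither sub needs a prototype; the second simply changes what the caller passes.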
Don't use prototypes. Prototypes don't work the way you think they do. They are an advanced Perl feature and should only be used in specific circumstances.
Update: I've just noticed this in your question:
I don't know what is a prototype
The prototype is the bit in parentheses between the subroutine name and the subroutine block. In your subroutine definition:
sub compareHashes(%$hash1, %$hash2) { ... }
The prototype is (%$hash1, %$hash2). I know why people use them - they look a lot like how subroutines work in other languages, but in Perl, they are usually far more trouble than they are worth. It's best to just drop them and just define the subroutine without a prototype:
sub compareHashes { ... }

What does the dollar character in the brackets of a Perl subroutine mean?

I've inherited some Perl code and occasionally I see subroutines defined like this:
sub do_it($) {
...
}
I can't find the docs that explain this. What does the dollar symbol in brackets mean?
It is a subroutine prototype.
The single $ means that the sub accepts a single scalar value and evaluates its argument in scalar context. For instance, if you pass an array as the parameter, e.g. do_it(@array), Perl will not expand @array into a list, but will instead pass the length of the array to the subroutine body.
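A small sketch of that difference, using two hypothetical subs (with_proto and without_proto) that are not part of the inherited code:

```perl
use strict;
use warnings;

# Hypothetical subs for illustration only.
sub with_proto($) { return $_[0] }   # ($) puts the argument in scalar context
sub without_proto { return $_[0] }   # no prototype: the array is flattened into @_

my @array = (10, 20, 30);

my $seen_with    = with_proto(@array);      # receives the array's length
my $seen_without = without_proto(@array);   # receives the first element

print "with prototype:    $seen_with\n";
print "without prototype: $seen_without\n";
```

The prototype only takes effect because with_proto is defined before the call is compiled.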
This is sometimes useful as Perl can give an error message when the subroutine is called incorrectly. Also, Perl's interpreter can use the prototypes to disambiguate method calls. I have seen the & symbol (for code block prototype) used quite neatly to write native-looking routines that call to anonymous code.
However, it only works in some situations - e.g. it doesn't work very well in OO Perl. Hence its use is a bit patchy. Perl Best Practices recommends against using them.
The ($) is called a subroutine prototype.
See the PerlSub man page for more information: http://perldoc.perl.org/perlsub.html#Prototypes
Prototyping isn't very common nowadays. Best practice is not to use it.

Why is parenthesis optional only after sub declaration?

(Assume use strict; use warnings; throughout this question.)
I am exploring the usage of sub.
sub bb { print @_; }
bb 'a';
This works as expected. The parentheses are optional, as with many other functions such as print, open etc.
However, this causes a compilation error:
bb 'a';
sub bb { print @_; }
String found where operator expected at t13.pl line 4, near "bb 'a'"
(Do you need to predeclare bb?)
syntax error at t13.pl line 4, near "bb 'a'"
Execution of t13.pl aborted due to compilation errors.
But this does not:
bb('a');
sub bb { print @_; }
Similarly, a sub without args, such as:
special_print;
sub special_print { print $some_stuff }
Will cause this error:
Bareword "special_print" not allowed while "strict subs" in use at t13.pl line 6.
Execution of t13.pl aborted due to compilation errors.
Ways to alleviate this particular error are:
Put & before the sub name, e.g. &special_print
Put empty parentheses after the sub name, e.g. special_print()
Predeclare special_print with sub special_print at the top of the script.
Call special_print after the sub declaration.
My question is, why this special treatment? If I can use a sub globally within the script, why can't I use it any way I want it? Is there a logic to sub being implemented this way?
ETA: I know how I can fix it. I want to know the logic behind this.
I think what you are missing is that Perl uses a strictly one-pass parser. It does not scan the file for subroutines, and then go back and compile the rest. Knowing this, the following describes how the one pass parse system works:
In Perl, the sub NAME syntax for declaring a subroutine is equivalent to the following:
sub name {...} === BEGIN {*name = sub {...}}
This means that the sub NAME syntax has a compile time effect. When Perl is parsing source code, it is working with a current set of declarations. By default, the set is the builtin functions. Since Perl already knows about these, it lets you omit the parentheses.
As soon as the compiler hits a BEGIN block, it compiles the inside of the block using the current rule set, and then immediately executes the block. If anything in that block changes the rule set (such as adding a subroutine to the current namespace), those new rules will be in effect for the remainder of the parse.
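As a rough sketch of this compile-time effect, the snippet below installs a sub by hand inside a BEGIN block and then calls it without parentheses. The name shout is hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical sub installed at compile time via a glob assignment,
# roughly what a named "sub shout {...}" declaration does.
BEGIN { *shout = sub { return uc "@_" } }

# The BEGIN block has already run by the time this line is parsed,
# so the parser treats shout as a known subroutine.
my $loud = shout 'hello';
print "$loud\n";
```

Moving the call above the BEGIN block would bring back the compilation error, since the parser would not yet know the name.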
Without a predeclared rule, an identifier will be interpreted as follows:
bareword === 'bareword' # a string
bareword LIST === syntax error, missing ','
bareword() === &bareword() # runtime execution of &bareword
&bareword === &bareword # same
&bareword() === &bareword() # same
When using strict and warnings as you have stated, barewords will not be converted into strings, so the first example is a syntax error.
When predeclared with any of the following:
sub bareword;
use subs 'bareword';
sub bareword {...}
BEGIN {*bareword = sub {...}}
Then the identifier will be interpreted as follows:
bareword === &bareword() # compile time binding to &bareword
bareword LIST === &bareword(LIST) # same
bareword() === &bareword() # same
&bareword === &bareword # same
&bareword() === &bareword() # same
So in order for the first example to not be a syntax error, one of the preceding subroutine declarations must be seen first.
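A minimal sketch of the forward-declaration form, assuming a hypothetical sub named greet whose body appears later in the file:

```perl
use strict;
use warnings;

sub greet;    # forward declaration: tells the parser that greet is a subroutine

my $message = greet 'world';    # parentheses may now be omitted
print "$message\n";

# The body can appear later in the file.
sub greet { return "Hello, $_[0]!" }
```

Without the forward declaration, the call line would fail to compile in the same way as the bb 'a' example from the question.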
As to the why behind all of this, Perl has a lot of legacy. One of the goals in developing Perl was complete backwards compatibility. A script that works in Perl 1 still works in Perl 5. Because of this, it is not possible to change the rules surrounding bareword parsing.
That said, you will be hard pressed to find a language that is more flexible in the ways it lets you call subroutines. This allows you to find the method that works best for you. In my own code, if I need to call a subroutine before it has been declared, I usually use name(...), but if that subroutine has a prototype, I will call it as &name(...) (and you will get a warning "subroutine called too early to check prototype" if you don't call it this way).
The best answer I can come up with is that's the way Perl is written. It's not a satisfying answer, but in the end, it's the truth. Perl 6 (if it ever comes out) won't have this limitation.
Perl has a lot of crud and cruft from five different versions of the language. Perl 4 and Perl 5 did some major changes which can cause problems with earlier programs written in a free flowing manner.
Because of the long history, and the various ways Perl has and can work, it can be difficult for Perl to understand what's going on. When you have this:
b $a, $c;
Perl has no way of knowing if b is a string and is simply a bareword (which was allowed in Perl 4) or if b is a function. If b is a function, it should be stored in the symbol table as the rest of the program is parsed. If b isn't a subroutine, you shouldn't put it in the symbol table.
When the Perl compiler sees this:
b($a, $c);
It doesn't know what the function b does, but it at least knows it's a function and can store it in the symbol table waiting for the definition to come later.
When you pre-declare your function, Perl can see this:
sub b; #Or use subs qw(b); will also work.
b $a, $c;
and know that b is a function. It might not know what the function does, but there's now a symbol table entry for b as a function.
One of the reasons for Perl 6 is to remove much of the baggage left from the older versions of Perl and to remove strange things like this.
By the way, never ever use Perl Prototypes to get around this limitation. Use use subs or predeclare a blank subroutine. Don't use prototypes.
Parentheses are optional only if the subroutine has been predeclared. This is documented in perlsub.
Perl needs to know at compile time whether the bareword is a subroutine name or a string literal. If you use parentheses, Perl will guess that it's a subroutine name. Otherwise you need to provide this information beforehand (e.g. using subs).
The reason is that Larry Wall is a linguist, not a computer scientist.
Computer scientist: The grammar of the language should be as simple & clear as possible.
Avoids complexity in the compiler
Eliminates sources of ambiguity
Larry Wall: People work differently from compilers. The language should serve the programmer, not the compiler. See also Larry Wall's outline of the three virtues of a programmer.

Origin of discouraged perl idioms: &x(...) and sub x($$) { ... }

In my perl code I've previously used the following two styles of writing which I've later found are being discouraged in modern perl:
# Style #1: Using & before calling a user-defined subroutine
&name_of_subroutine($something, $something_else);
# Style #2: Using ($$) to show the number of arguments in a user-defined sub
sub name_of_subroutine($$) {
# the body of a subroutine taking two arguments.
}
Since learning that those styles are not recommended I've simply stopped using them.
However, out of curiosity I'd like to know the following:
What is the origin of those two styles of writing? (I'm sure I've not dreamt up the styles myself.)
Why are those two styles of writing discouraged in modern perl?
Have the styles been considered best practice at some point in time?
The & sigil is not commonly used with function calls in modern Perl for two reasons. First, it is largely redundant since Perl will consider anything that looks like a function (followed by parens) a function. Secondly, there is a major difference between the way &function() and &function are executed, which may be confusing to less experienced Perl programmers. In the first case, the function is called with no arguments. In the second case, the function is called with the current @_ (and it can even make changes to the argument list which will be seen by later statements in that scope):
sub print_and_remove_first_arg {print 'first arg: ', shift, "\n"}
sub test {
&print_and_remove_first_arg;
print "remaining args: @_\n";
}
test 1, 2, 3;
prints
first arg: 1
remaining args: 2 3
So ultimately, using & for every function call ends up hiding the few &function; calls which can lead to hard to find bugs. In addition, using the & sigil prevents the honoring of function prototypes, which can be useful in some cases (if you know what you are doing), but also may lead to hard to track down bugs. Ultimately, & is a powerful modifier to function behavior, and should only be used when that behavior is desired.
Prototypes are similar, and their use should be limited in modern Perl. What must be stated explicitly is that prototypes in Perl are NOT function signatures. They are hints to the compiler that tell it to parse calls to those functions in a similar way as the built in functions. That is, each of the symbols in the prototype tells the compiler to impose that type of context on the argument. This functionality can be very helpful when defining functions that behave like map or push or keys which all treat their first argument differently than a standard list operator would.
sub my_map (&@) {...} # first arg is either a block or explicit code reference
my @ret = my_map {some_function($_)} 1 .. 10;
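For a runnable version of the same idea, here is a sketch of a map-like sub with a (&@) prototype. The name apply_to_each and its body are assumptions made for illustration, not the implementation the answer had in mind.

```perl
use strict;
use warnings;

# Hypothetical map-like sub: the (&@) prototype lets callers pass a bare block
# as the first argument, followed by an ordinary list.
sub apply_to_each (&@) {
    my ($code, @items) = @_;
    my @out;
    for my $item (@items) {
        local $_ = $item;       # expose the current item as $_ to the block
        push @out, $code->();
    }
    return @out;
}

my @doubled = apply_to_each { $_ * 2 } 1 .. 5;
print join(',', @doubled), "\n";
```

Note that there is no comma between the block and the list, just as with the builtin map.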
The reason sub ($$) {...} and similar uses of prototypes are discouraged is because 9 times out of 10 the author means "I want two args" and not "I want two args each with scalar context imposed on the call site". The former assertion is better written:
use Carp;
sub needs2 {
@_ == 2 or croak 'needs2 takes 2 arguments';
...
}
which would then allow the following calling style to work as expected:
my @array = (2, 4);
needs2 @array;
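For contrast, here is a sketch of what happens when the same call is made to a sub declared with ($$). The name proto2 is hypothetical, and a string eval is used only so that the compile-time failure can be captured instead of aborting the script.

```perl
use strict;
use warnings;

# Hypothetical sub using the discouraged ($$) prototype.
sub proto2($$) { return "got $_[0] and $_[1]" }

my @array = (2, 4);

# The string is compiled at run time, so the prototype error lands in $@.
my $result = eval q{ proto2(@array) };
my $error  = $@;

# Calling with & bypasses the prototype, so the array is flattened as usual.
my $bypassed = &proto2(@array);

print defined $result ? "ok: $result\n" : "rejected: $error";
print "with &: $bypassed\n";
```

The array counts as a single scalar-context argument under the prototype, so Perl reports that there are not enough arguments even though the array holds two values.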
To sum up, both the & sigil and function prototypes are useful and powerful tools, but they should only be used when that functionality is required. Their superfluous use (or misuse as argument validation) leads to unintended behavior and difficult to track down bugs.
The & in function-calls was mandatory in Perl 4, so maybe you have picked that up from Programming perl (1991) by Larry Wall and Randal L. Schwartz, as I did, or somewhere similar.
As for the function prototypes, my guess is less qualified. Maybe you have been mimicking languages where it makes sense and/or is mandatory to declare argument lists, and since function prototypes in Perl look a little like argument lists, you've started adding them?
&function is discouraged because it makes the code less readable and isn't necessary (the cases that &function is necessary are rare and often better avoided).
Function prototypes aren't argument lists, so most of the time they'll just confuse your reader or lull you into a false sense of rigidity, so no need to use those unless you know exactly why you are.
& was mandatory in Perl 4, so it has been best (indeed necessary) practice. I don't think function prototypes ever have been.
For style #1, the & before the subroutine is only necessary if you have a subroutine that shares a name with a builtin and you need to disambiguate which one you wish to call, so that the interpreter knows what's going on. Otherwise, it's equivalent to calling the subroutine without &.
Since that's the case, I'd say its use is discouraged since you shouldn't be naming your subroutines with the same names as builtins, and it's good practice to define all your subroutines before you call them, for the sake of reading comprehension. In addition to this, if you define your subroutines before you call them, you can omit the parentheses, like in a builtin. Plus, just speaking visually, sticking & in front of every subroutine unnecessarily clutters up the file.
As for function prototypes, they were stuck into Perl after the fact and don't really do what they were made to do. From an article on perl.com:
For the most part, prototypes are more trouble than they're worth. For one thing, Perl doesn't check prototypes for methods because that would require the ability to determine, at compile time, which class will handle the method. Because you can alter @ISA at runtime--you see the problem. The main reason, however, is that prototypes aren't very smart. If you specify sub foo ($$$), you cannot pass it an array of three scalars (this is the problem with vec()). Instead, you have to say foo( $x[0], $x[1], $x[2] ), and that's just a pain.
In the end, it's better to comment your code to indicate what you intend for a subroutine to accept and do parameter checking yourself. As the article states, this is actually necessary for class methods, since no parameter checking occurs for them.
For what it's worth, Perl 6 adds formal parameter lists to the language like this:
sub do_something(Str $thing, Int $other) {
...
}

Should I call Perl subroutines with no arguments as marine() or marine?

As per my sample code below, there are two styles to call a subroutine: subname and subname().
#!C:\Perl\bin\perl.exe
use strict;
use warnings;
use 5.010;
&marine(); # style 1
&marine; # style 2
sub marine {
state $n = 0; # private, persistent variable $n
$n += 1;
print "Hello, sailor number $n!\n";
}
Which one, &marine(); or &marine;, is the better choice if there are no arguments in the call?
In Learning Perl, where this example comes from, we're at the very beginning of showing you subroutines. We only tell you to use the & so that you, as the beginning Perler, don't run into a problem where you define a subroutine with the same name as a Perl built-in then wonder why it doesn't work. The & in front always calls your defined subroutine. Beginning students often create their own subroutine log to print a message because they are used to doing that in other technologies they use. In Perl, that's the math function builtin.
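The log collision can be sketched as follows. The @messages array and the body of the user-defined log are assumptions made for illustration.

```perl
use strict;
use warnings;
no warnings 'ambiguous';   # perl may otherwise warn that log(...) resolves to the builtin

my @messages;

# A user-defined sub that happens to share its name with the builtin log.
sub log { push @messages, "@_"; return scalar @messages }

my $builtin = log(1);             # calls the builtin natural logarithm
my $mine    = &log('started');    # the & forces the user-defined sub

print "builtin log(1) = $builtin\n";
print "messages stored: $mine (@messages)\n";
```

The plain call never reaches the user-defined sub, which is the surprise the & is meant to spare beginners.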
After you get used to using Perl and you know about the Perl built-ins (scan through perlfunc), drop the &. There's some special magic with & that you hardly ever need:
marine();
You can leave off the () if you've pre-declared the subroutine, but I normally leave the () there even for an empty argument list. It's a bit more robust since you're giving Perl the hint that the marine is a subroutine name. To me, I recognize that more quickly as a subroutine.
The side effect of using & without parentheses is that the subroutine is invoked with @_. This program
sub g {
print "g: @_\n";
}
sub f {
&g(); # g()
&g; # g(@_)
g(); # g()
g; # g()
}
f(1,2,3);
produces this output:
g:
g: 1 2 3
g:
g:
It's good style to declare your subroutines first with the sub keyword, then call them. (Of course there are ways around it, but why make things more complicated than necessary?)
Do not use the & syntax unless you know exactly what it does to @_ and to subroutines declared with prototypes. It is terribly obscure, rarely needed and a source of bugs through unintended behaviour. Just leave it out. Perl::Critic aptly says about it:
Since Perl 5, the ampersand sigil is completely optional when invoking subroutines.
Now, following these style hints, I prefer to call subroutines that require no parameters in style 1, that is to say marine();. The reasons are
visual consistency with subroutines that do require parameters
it cannot be confused with a different keyword.
As a general rule I recommend the following:
Unless you need the & because you're overriding a built-in function or you have no parameter list, omit it.
Always include the () as in marine().
I do both of these for code readability. The first rule makes it clear when I'm overriding internal Perl functions by making them distinct. The second makes it clear when I'm invoking functions.
Perl allows you to omit the parentheses in your function call.
So you can call your function with arguments in two different ways:
your_function( arg1,arg2,arg3);
or
your_function arg1, arg2, arg3;
Which form you prefer is a matter of choice. For users with a C background, the former is more intuitive.
I personally use the former for functions defined by me and latter for built in functions like:
print "something" instead of print("something")