In Scala 3 (3.1.3 right now, if it matters), I'm writing a parser that requires some rule/grammar processing, and I'd like to write a macro that lets me define the alternates for a non-terminal and define the non-terminal itself so that it's available for use in other rules. Everything is currently written with the non-terminals as Strings and the terminals as Chars, so a set of rules for a grammar of tuples of 'a's might be:
new Grammar() {
  val rules = Set(
    Rule("tuple", List('(', "as", ')')),
    Rule("as", List('a', "more")),
    Rule("as", Nil),
    Rule("more", List(',', "as")),
    Rule("more", Nil),
  )
}
What I'd like to do instead is use some macro magic to make it more like:
new Grammar() {
  rule(tuple, List('(', as, ')'))
  rule(as, List('a', more), Nil)
  rule(more, List(',', as), Nil)
}
where, instead of using Strings for non-terminals, I could use identifiers, and at compile time, the rule macro would do something like turn the second into
new Grammar() {
  val rules = mutable.Set[Rule]()
  val tuple = NonTerminal("tuple")
  val as = NonTerminal("as")
  val more = NonTerminal("more")
  rules.add(Rule(tuple, List('(', as, ')')))
  rules.add(Rule(as, List('a', more)))
  rules.add(Rule(as, Nil))
  rules.add(Rule(more, List(',', as)))
  rules.add(Rule(more, Nil))
}
Is such a thing possible with Scala 3 macros in their current state, or is it not possible for a macro to take a not-yet-defined identifier as an argument and provide a definition for it?
No, the argument you pass to the macro must be valid Scala code that typechecks, so an identifier that is not yet defined is rejected before the macro ever runs.
You can have a 'god-prefix' for non-terminals like (by using Dynamic):
new Grammar() {
  rule(?.tuple, List('(', ?.as, ')'))
  rule(?.as, List('a', ?.more), Nil)
  rule(?.more, List(',', ?.as), Nil)
}
But I'm not sure if that's better than Strings.
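A minimal sketch of the Dynamic-based prefix idea, under some assumptions: the prefix object is called nt here (a hypothetical name used instead of ?), and NonTerminal, Rule, Grammar and rule are illustrative stand-ins for the asker's types, not their actual definitions.

```scala
import scala.language.dynamics
import scala.collection.mutable

object DynamicGrammarSketch {
  // Hypothetical stand-ins for the asker's grammar types.
  final case class NonTerminal(name: String)
  final case class Rule(lhs: NonTerminal, rhs: List[Any])

  // Hypothetical prefix object: the compiler rewrites nt.foo
  // to nt.selectDynamic("foo") because nt extends Dynamic.
  object nt extends Dynamic {
    def selectDynamic(name: String): NonTerminal = NonTerminal(name)
  }

  class Grammar {
    val rules: mutable.Set[Rule] = mutable.LinkedHashSet.empty[Rule]

    // One Rule is recorded per alternative right-hand side.
    def rule(lhs: NonTerminal, alternatives: List[Any]*): Unit =
      alternatives.foreach(rhs => rules += Rule(lhs, rhs))
  }

  val tupleGrammar: Grammar = new Grammar {
    rule(nt.tuple, List('(', nt.as, ')'))
    rule(nt.as, List('a', nt.more), Nil)
    rule(nt.more, List(',', nt.as), Nil)
  }
}
```

The trade-off is the same as with Strings: a misspelled non-terminal such as nt.mroe still compiles, because every selection on the prefix object is accepted.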
The question is a little difficult to phrase so I'll try to provide an example instead:
def myThing(): (String, String, String) = ("", "", "")
// Illegal, this is a Match
val (`r-1`, `r-2`, `r-3`) = myThing()
// Legal
val `r-1` = myThing()._1
The first evaluation is invalid because this is technically a match expression, and in a match, backtick-quoted identifiers are assumed to be references to an existing val in scope.
Outside of a match though, I could freely define "r-1".
Is there a way to perform match extraction using complex variable names?
You can write out the full variable names explicitly:
def myThing(): (String, String, String) = ("a", "b", "c")
// legal, syntactic backtick-sugar replaced by explicit variable names
val (r$minus1, r$minus2, r$minus3) = myThing()
println(`r-1`, `r-2`, `r-3`)
But since variable names can be chosen freely (unlike methods in Java APIs that happen to be called yield etc.), I would suggest inventing simpler variable names; the r$minus1-style names really don't look pretty.
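The behaviour the question describes can be seen in a small sketch. It shows a backquoted name in a pattern being compared against an existing value, and an alternative that binds plain names first and then aliases them. The wrapper object and the names expected, r1, r2 and r3 are illustrative assumptions.

```scala
object BacktickPatterns {
  def myThing(): (String, String, String) = ("a", "b", "c")

  // In a pattern, a backquoted name is a stable identifier pattern:
  // it is compared against an existing value instead of binding a new one.
  val expected = "a"
  val firstIsExpected: Boolean = myThing() match {
    case (`expected`, _, _) => true
    case _                  => false
  }

  // Bind plain names in the pattern, then alias them outside the pattern,
  // where backquoted names are ordinary definitions.
  val (r1, r2, r3) = myThing()
  val `r-1` = r1
  val `r-2` = r2
  val `r-3` = r3
}
```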
In languages like Python and Ruby, to ask the language what index-related methods its string class supports (which methods' names contain the word "index"), you can do
"".methods.sort.grep /index/i
And in java
List<String> results = new ArrayList<String>();
Method[] methods = String.class.getMethods();
for (int i = 0; i < methods.length; i++) {
  Method m = methods[i];
  if (m.getName().toLowerCase().indexOf("index") != -1) {
    results.add(m.getName());
  }
}
// toArray() without an argument returns Object[], so pass a typed array
String[] names = results.toArray(new String[0]);
Arrays.sort(names);
return names;
How would you do the same thing in Scala?
Curious that no one tried a more direct translation:
""
.getClass.getMethods.map(_.getName) // methods
.sorted // sort
.filter(_ matches "(?i).*index.*") // grep /index/i
So, some random thoughts.
The difference between "methods" and the hoops above is striking, but no one ever said reflection was Java's strength.
I'm hiding something about sorted above: it actually takes an implicit parameter of type Ordering. If I wanted to sort the methods themselves instead of their names, I'd have to provide it.
A grep is actually a combination of filter and matches. It's made a bit more complex because of Java's decision to match whole strings even when ^ and $ are not specified. I think it would make some sense to have a grep method on Regex that took a Traversable as a parameter, but...
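Two of the points above can be checked with a short sketch: whole-string matching versus searching, and sorting Method objects rather than names. The wrapper object and value names are illustrative assumptions.

```scala
object GrepSemantics {
  val name = "lastIndexOf"

  // String.matches succeeds only if the whole string matches the pattern.
  val bareMatches: Boolean   = name.matches("(?i)index")     // false
  val paddedMatches: Boolean = name.matches("(?i).*index.*") // true

  // Regex.findFirstIn searches for the pattern anywhere in the string.
  val found: Option[String] = "(?i)index".r.findFirstIn(name) // Some("Index")

  // Method has no default Ordering, so sort by a key that has one.
  val sortedMethods: Array[java.lang.reflect.Method] =
    classOf[String].getMethods.sortBy(_.getName)
}
```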
So, here's what we could do about it:
implicit def toMethods(obj: AnyRef) = new {
def methods = obj.getClass.getMethods.map(_.getName)
}
implicit def toGrep[T <% Traversable[String]](coll: T) = new {
def grep(pattern: String) = coll filter (pattern.r.findFirstIn(_) != None)
def grep(pattern: String, flags: String) = {
val regex = ("(?"+flags+")"+pattern).r
coll filter (regex.findFirstIn(_) != None)
}
}
And now this is possible:
"".methods.sorted grep ("index", "i")
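On more recent Scala versions, where view bounds and structural `new { ... }` wrappers are discouraged, the same idea could be sketched with implicit classes. This is a rough equivalent and not the answer's original code; ReflectionSyntax, MethodsOps, GrepOps and ReflectionSyntaxDemo are hypothetical names.

```scala
object ReflectionSyntax {
  // Adds a Ruby-like `methods` to any reference.
  implicit class MethodsOps(obj: AnyRef) {
    def methods: Seq[String] = obj.getClass.getMethods.toSeq.map(_.getName)
  }

  // Adds `grep` to any collection of strings; flags are inline regex
  // flags such as "i" for case-insensitive matching.
  implicit class GrepOps(coll: Iterable[String]) {
    def grep(pattern: String, flags: String = ""): Seq[String] = {
      val source = if (flags.isEmpty) pattern else "(?" + flags + ")" + pattern
      val regex  = source.r
      coll.filter(s => regex.findFirstIn(s).isDefined).toSeq
    }
  }
}

object ReflectionSyntaxDemo {
  import ReflectionSyntax._

  // distinct removes the duplicates caused by overloaded methods.
  val indexMethods: Seq[String] = "".methods.distinct.sorted.grep("index", "i")
}
```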
You can use the Scala REPL prompt. To list the member methods of a string object, for instance, type "". and then press the TAB key (that's an empty string, or even a non-empty one if you like, followed by a dot, and then TAB). The REPL will list all member methods for you.
This applies to other variable types as well.
More or less the same way:
val names = classOf[String].getMethods.toSeq.
filter(_.getName.toLowerCase().indexOf("index") != -1).
map(_.getName).
sort(((e1, e2) => (e1 compareTo e2) < 0))
But all on one line.
To make it more readable,
val names = for(val method <- classOf[String].getMethods.toSeq
if(method.getName.toLowerCase().indexOf("index") != -1))
yield { method.getName }
val sorted = names.sort(((e1, e2) => (e1 compareTo e2) < 0))
This is as far as I got:
"".getClass.getMethods.map(_.getName).filter( _.indexOf("in")>=0)
It's strange that Scala's Array doesn't have a sort method.
edit
It would end up like this:
"".getClass.getMethods.map(_.getName).toList.sort(_<_).filter(_.indexOf("index")>=0)
Now, wait a minute.
I concede Java is verbose compared to Ruby for instance.
But that piece of code shouldn't have been so verbose in the first place.
Here's the equivalent:
Collection<String> mds = new TreeSet<String>();
for( Method m : "".getClass().getMethods()) {
if( m.getName().matches(".*index.*")){ mds.add( m.getName() ); }
}
Which has almost the same number of characters as the Scala version marked as correct.
Just using the Java code direct will get you most of the way there, as Scala classes are still JVM ones. You could port the code to Scala pretty easily as well, though, for fun/practice/ease of use in REPL.
In languages like SML, Erlang and a bunch of others, we can define functions like this:
fun reverse [] = []
| reverse (x :: xs) = reverse xs @ [x];
I know we can write an analog in Scala like this (and I know there are many flaws in the code below):
def reverse[T](lst: List[T]): List[T] = lst match {
case Nil => Nil
case x :: xs => reverse(xs) ++ List(x)
}
But I wonder whether we could write the former code in Scala, perhaps with desugaring to the latter.
Are there any fundamental limitations to such syntax being implemented in the future (I mean, really fundamental, e.g. the way type inference works in Scala, or something else, except the parser obviously)?
UPD
Here is a snippet of what it could look like:
type T
def reverse(Nil: List[T]) = Nil
def reverse(x :: xs: List[T]): List[T] = reverse(xs) ++ List(x)
It really depends on what you mean by fundamental.
If you are really asking whether "there is a technical showstopper that would prevent this feature from being implemented", then I would say the answer is no. You are talking about desugaring, and you are on the right track here. All there is to do is basically stitch several separate cases into one single function, and this can be done as a mere preprocessing step (this only requires syntactic knowledge, no need for semantic knowledge). But for this to even make sense, I would define a few rules:
The function signature is mandatory (in Haskell, for example, it would be optional, but there it is always optional, whether you define the function at once or in several parts). We could try to live without the signature and attempt to extract it from the different parts, but the lack of type information would quickly come back to bite us. A simpler argument is that if we are going to infer an implicit signature, we might as well do it for all methods. But the truth is that there are very good reasons to have explicit signatures in Scala, and I can't imagine changing that.
All the parts must be defined within the same scope. To start with, they must be declared in the same file because each source file is compiled separately, and thus a simple preprocessor would not be enough to implement the feature. Second, we still end up with a single method in the end, so it's only natural to have all the parts in the same scope.
Overloading is not possible for such methods (otherwise we would need to repeat the signature for each part just so the preprocessor knows which part belongs to which overload)
Parts are added (stitched) to the generated match in the order they are declared
So here is what it could look like:
def reverse[T](lst: List[T]): List[T] // Exactly like an abstract def (provides the signature)
// .... some unrelated code here...
def reverse(Nil) = Nil
// .... another bit of unrelated code here...
def reverse(x :: xs ) = reverse(xs) ++ List(x)
Which could be trivially transformed into:
def reverse[T](lst: List[T]): List[T] = lst match {
  case Nil => Nil
  case x :: xs => reverse(xs) ++ List(x)
}
// .... some unrelated code here...
// .... another bit of unrelated code here...
It is easy to see that the above transformation is very mechanical and can be done by just manipulating a source AST (the AST produced by the slightly modified grammar that accepts these new constructs) and transforming it into the target AST (the AST produced by the standard Scala grammar).
Then we can compile the result as usual.
So there you go, with a few simple rules we are able to implement a preprocessor that does all the work to implement this new feature.
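The stitching rule can be imitated at run time in today's Scala, which may help to picture what the preprocessor would generate. This sketch is not the preprocessor itself: it writes each part as a PartialFunction and combines them with orElse in declaration order. The names StitchedReverse, nilPart and consPart are illustrative assumptions.

```scala
object StitchedReverse {
  // Each "part" is one clause of the function, written separately.
  private def nilPart[T]: PartialFunction[List[T], List[T]] = {
    case Nil => Nil
  }

  private def consPart[T]: PartialFunction[List[T], List[T]] = {
    case x :: xs => reverse(xs) ++ List(x)
  }

  // The single signature; the parts are tried in the order they are stitched.
  def reverse[T](lst: List[T]): List[T] = {
    val stitched = nilPart[T].orElse(consPart[T])
    stitched(lst)
  }
}
```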
If by fundamental you are asking "is there anything that would make this feature out of place", then it can be argued that this does not feel very Scala-like. But more to the point, it does not bring that much to the table. Scala's author(s) actually tend toward making the language simpler (as in fewer built-in features, trying to move some built-in features into libraries), and adding a new syntax that is not really more readable goes against the goal of simplification.
In SML, your code snippet is literally just syntactic sugar (a "derived form" in the terminology of the language spec) for
val rec reverse = fn x =>
case x of [] => []
| x::xs => reverse xs @ [x]
which is very close to the Scala code you show. So, no there is no "fundamental" reason that Scala couldn't provide the same kind of syntax. The main problem is Scala's need for more type annotations, which makes this shorthand syntax far less attractive in general, and probably not worth the while.
Note also that the specific syntax you suggest would not fly well, because there is no way to distinguish one case-by-case function definition from two overloaded functions syntactically. You probably would need some alternative syntax, similar to SML using "|".
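For comparison, the closest existing Scala form to `fn x => case x of ...` is a pattern-matching function literal, sketched below. The type annotation on the val is required, which illustrates the point about Scala needing more annotations. The names FunctionLiteralReverse and reverseInts are illustrative assumptions.

```scala
object FunctionLiteralReverse {
  // A block of cases is a function literal; the expected type on the
  // left tells the compiler what the argument type is.
  val reverseInts: List[Int] => List[Int] = {
    case Nil     => Nil
    case x :: xs => reverseInts(xs) ++ List(x)
  }
}
```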
I don't know SML or Erlang, but I know Haskell. It is a language without method overloading. Method overloading combined with such pattern matching could lead to ambiguities. Imagine the following code:
def f(x: String) = "String "+x
def f(x: List[_]) = "List "+x
What should it mean? It can mean method overloading, i.e. the method is determined at compile time. It can also mean pattern matching: there would be just one f(x: AnyRef) method that would do the matching.
Scala also has named parameters, which would probably be broken as well.
I don't think Scala can offer a simpler syntax than the one you have shown in general. A simpler syntax may, IMHO, work in some special cases only.
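The two readings can be written out explicitly to see how they differ. In this sketch the object names Overloaded, Matched and DispatchDemo are illustrative, and the extra `other` case is an assumption added so the match is total.

```scala
object Overloaded {
  // Reading 1: two overloads, selected from the static argument type.
  def f(x: String): String  = "String " + x
  def f(x: List[_]): String = "List " + x
}

object Matched {
  // Reading 2: one method, selected from the runtime value.
  def f(x: AnyRef): String = x match {
    case s: String  => "String " + s
    case l: List[_] => "List " + l
    case other      => "Other " + other
  }
}

object DispatchDemo {
  val value: AnyRef = "hi"

  // Works: the runtime class of value is String.
  val viaMatch: String = Matched.f(value)

  // Works: the static type of the literal is String.
  val viaOverload: String = Overloaded.f("hi")

  // Overloaded.f(value) would not compile, because the static type
  // AnyRef fits neither overload.
}
```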
There are at least two problems:
[ and ] are reserved characters because they are used for type arguments. The compiler allows spaces around them, so that would not be an option.
The other problem is that = returns Unit, so the expression after the | would not return any result.
The closest I could come up with is this (note that it is very specialized towards your example):
// Define a class to hold the values left and right of the | sign
class |[T, S](val left: T, val right: PartialFunction[T, T])
// Create a class that contains the | operator
class OrAssoc[T](left: T) {
def |(right: PartialFunction[T, T]): T | T = new |(left, right)
}
// Add the | to any potential target
implicit def anyToOrAssoc[S](left: S): OrAssoc[S] = new OrAssoc(left)
object fun {
// Use the magic of the update method
def update[T, S](choice: T | S): T => T = { arg =>
if (choice.right.isDefinedAt(arg)) choice.right(arg)
else choice.left
}
}
// Use the above construction to define a new method
val reverse: List[Int] => List[Int] =
fun() = List.empty[Int] | {
case x :: xs => reverse(xs) ++ List(x)
}
// Call the method
reverse(List(3, 2, 1))
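The `fun() = ...` line relies on Scala's assignment sugar: an assignment whose left-hand side is an application, `f(args) = e`, is rewritten to `f.update(args, e)`. A small sketch of that rewriting follows; the objects scores and cell are hypothetical and have nothing to do with the answer's code.

```scala
import scala.collection.mutable

object UpdateSugar {
  object scores {
    private val data = mutable.Map.empty[String, Int]
    def update(key: String, value: Int): Unit = data(key) = value
    def apply(key: String): Int = data(key)
  }

  object cell {
    private var current = 0
    def update(value: Int): Unit = current = value
    def apply(): Int = current
  }

  val aliceScore: Int = {
    scores("alice") = 3 // rewritten to scores.update("alice", 3)
    scores("alice")
  }

  val cellValue: Int = {
    cell() = 42 // rewritten to cell.update(42), the shape used by fun() = ...
    cell()
  }
}
```

Because the rewritten call is an ordinary method call, update can return any type, which is how the answer gets a function back from what looks like an assignment.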
Methods are often declared with obvious parameter names, e.g.
def myMethod(s: String, image: BufferedImage, mesh: Mesh) { ... }
Parameter names correspond to parameter types.
1) "s" is often used for String
2) "i" for Int
3) lowercased class name for one word named classes (Mesh -> mesh)
4) lowercased last word from class name for long class names (BufferedImage -> image)
(Of course, it would not be convenient for ALL methods and arguments. Of course, somebody would prefer other rules…)
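The four naming rules above are mechanical enough to be written down as a plain function. This is only a sketch of the rules, not a macro, and defaultParamName is a hypothetical helper name.

```scala
object ParamNames {
  def defaultParamName(typeName: String): String = typeName match {
    case "String" => "s" // rule 1
    case "Int"    => "i" // rule 2
    case other =>
      // Rules 3 and 4: split the class name before each capital letter
      // and take the last word, lowercased.
      val words = other.split("(?=[A-Z])").filter(_.nonEmpty)
      words.lastOption.map(_.toLowerCase).getOrElse("")
  }
}
```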
Scala macros are intended to generate some expressions in code. I would like to write some specific macros to convert to correct Scala expressions something like this:
// "arguments interpolation" style
// like string interpolation
def myMethod s[String, BufferedImage, Mesh]
{ /* code using vars "s", "image", "mesh" */ }
// or even better:
mydef myMethod[String, BufferedImage, Mesh]
{ /* code using vars "s", "image", "mesh" */ }
Is it possible?
Currently it is not possible, and it probably never will be. Macros cannot introduce their own syntax: they must be represented through valid Scala code (which can be executed at compile time), and they must also generate valid Scala code (or rather, a valid Scala AST).
Neither of your examples is valid Scala code, so macros cannot handle them. Nevertheless, the current nightly build of Macro Paradise includes untyped macros. They allow you to write Scala code that is typechecked only after the macros are expanded, which means it is possible to write:
forM({i = 0; i < 10; i += 1}) {
println(i)
}
Notice that the curly braces inside the first parameter list are needed because, although the code is not typechecked when one writes it, it must represent a valid Scala AST.
The implementation of this macro looks like this:
def forM(header: _)(body: _) = macro __forM
def __forM(c: Context)(header: c.Tree)(body: c.Tree): c.Tree = {
import c.universe._
header match {
case Block(
List(
Assign(Ident(TermName(name)), Literal(Constant(start))),
Apply(Select(Ident(TermName(name2)), TermName(comparison)), List(Literal(Constant(end))))
),
Apply(Select(Ident(TermName(name3)), TermName(incrementation)), List(Literal(Constant(inc))))
) =>
// here one can generate the behavior of the loop
// but omit full implementation for clarity now ...
}
}
Instead of an already typechecked expression, the macro expects only a tree, which is typechecked after the expansion. The method call itself expects two parameter lists, whose parameter types can be deferred until after the expansion phase by using an underscore.
Currently there is a little documentation available, but because this is extremely beta, a lot of things will probably change in the future.
With type macros it is possible to write something like this:
object O extends M {
// traverse the body of O to find what you want
}
type M(x: _) = macro __M
def __M(c: Context)(x: c.Tree): c.Tree = {
// omit full implementation for clarity ...
}
This is nice for delaying the typechecking of the whole body, because it allows you to do cool things...
Macros that can change Scala's syntax are not planned at the moment and are probably not a good idea. I can't say whether they will happen one day; only the future can tell.
Aside from the "why" (no really, why do you want to do that?), the answer is no, because as far as I know macros cannot (in their current state) generate methods or types, only expressions.
I'm experimenting with the scala 2.10 macro features. I have trouble using LabelDef in some cases, though. To some extent I peeked in the compiler's code, read excerpts of Miguel Garcia's papers but I'm still stuck.
If my understanding is correct, a pseudo-definition would be:
LabelDef(labelName, listOfParameters, stmsAndApply) where the 3 arguments are Trees and:
- labelName is the identifier of the label $L being defined
- listOfParameters corresponds to the arguments passed when label-apply occurs, as in $L(a1,...,an), and can be empty
- stmsAndApply corresponds to the block of statements (possibly none) and the final apply-expression
label-apply meaning more-or-less a GOTO to a label
For instance, in the case of a simple loop, a LabelDef can eventually apply itself:
LabelDef($L, (), {...; $L()})
Now, if I want to define 2 LabelDef that jump to each other:
...
LabelDef($L1, (), $L2())
...
LabelDef($L2, (), $L1())
...
The 2nd LabelDef is fine, but the compiler outputs an error on the 1st, "not found: value $L2". I guess that is because $L2 isn't yet defined while there is an attempt to apply it. This is a tree being constructed so that would make sense to me. Is my understanding correct so far? Because if no error is expected, that means my macro implementation is probably buggy.
Anyway, I believe there must be a way to apply $L2 (i.e. Jumping to $L2) from $L1, somehow, but I just have no clue how to do it. Does someone have an example of doing that, or any pointer?
Other unclear points (but less of a concern right now) about using LabelDef in macros are:
- What is the 2nd argument, concretely, and how is it used when non-empty? In other words, what are the mechanisms of a label-apply with parameters?
- Is it valid to put anything other than a label-apply in the 3rd argument's final expression? (Not that I can't try, but macros are still experimental.)
- Is it possible to perform a forwarding label-apply outside a LabelDef? (Maybe this is a redundant question.)
Any macro implementation example in the answer is, of course, very welcome!
Cheers,
Because if no error is expected, that means my macro implementation is probably buggy.
Yes, it seems that was a bug (^^; Although I'm not sure whether or not the limitation with the Block/LabelDef combination exists on purpose.
def EVIL_LABELS_MACRO = macro EVIL_LABELS_MACRO_impl
def EVIL_LABELS_MACRO_impl(c:Context):c.Expr[Unit] = { // fails to expand
import c.universe._
val lt1 = newTermName("$L1"); val lt2 = newTermName("$L2")
val ld1 = LabelDef(lt1, Nil, Block(c.reify{println("$L1")}.tree, Apply(Ident(lt2), Nil)))
val ld2 = LabelDef(lt2, Nil, Block(c.reify{println("$L2")}.tree, Apply(Ident(lt1), Nil)))
c.Expr( Block(ld1, c.reify{println("ignored")}.tree, ld2) )
}
def FINE_LABELS_MACRO = macro FINE_LABELS_MACRO_impl
def FINE_LABELS_MACRO_impl(c:Context):c.Expr[Unit] = { // The End isn't near
import c.universe._
val lt1 = newTermName("$L1"); val lt2 = newTermName("$L2")
val ld1 = LabelDef(lt1, Nil, Block(c.reify{println("$L1")}.tree, Apply(Ident(lt2), Nil)))
val ld2 = LabelDef(lt2, Nil, Block(c.reify{println("$L2")}.tree, Apply(Ident(lt1), Nil)))
c.Expr( Block(ld1, c.reify{println("ignored")}.tree, ld2, c.reify{println("The End")}.tree) )
}
I think a Block is parsed into { statements; expression } thus the last argument is the expression. If a LabelDef "falls in" expression, e.g. the EVIL_LABELS_MACRO pattern, its expansion won't be visible in statements; hence error "not found: value $L2".
So it's better to make sure all LabelDef "fall in" statements. FINE_LABELS_MACRO does that and expands to:
{
$L1(){
scala.this.Predef.println("$L1");
$L2()
};
scala.this.Predef.println("ignored");
$L2(){
scala.this.Predef.println("$L2");
$L1()
};
scala.this.Predef.println("The End")
}