Why won't the Scala compiler apply tail call optimization unless a method is final?
For example, this:
import scala.annotation.tailrec

class C {
  @tailrec def fact(n: Int, result: Int): Int =
    if (n == 0)
      result
    else
      fact(n - 1, n * result)
}
results in
error: could not optimize @tailrec annotated method: it is neither private nor final so can be overridden
What exactly would go wrong if the compiler applied TCO in a case such as this?
Consider the following interaction with the REPL. First we define a class with a factorial method:
scala> class C {
def fact(n: Int, result: Int): Int =
if(n == 0) result
else fact(n - 1, n * result)
}
defined class C
scala> (new C).fact(5, 1)
res11: Int = 120
Now let's override it in a subclass to double the superclass's answer:
scala> class C2 extends C {
override def fact(n: Int, result: Int): Int = 2 * super.fact(n, result)
}
defined class C2
scala> (new C).fact(5, 1)
res12: Int = 120
scala> (new C2).fact(5, 1)
What result do you expect for this last call? You might be expecting 240. But no:
scala> (new C2).fact(5, 1)
res13: Int = 7680
That's because when the superclass's method makes a recursive call, the recursive call goes through the subclass.
If overriding worked such that 240 was the right answer, then it would be safe for tail-call optimization to be performed in the superclass here. But that isn't how Scala (or Java) works.
Unless a method is marked final, it might not be calling itself when it makes a recursive call.
And that's why @tailrec doesn't work unless a method is final (or private).
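For illustration (a minimal sketch, assuming you are free to forbid overriding), marking the method final satisfies the annotation:

import scala.annotation.tailrec

class C {
  @tailrec final def fact(n: Int, result: Int): Int =
    if (n == 0) result             // no subclass can redirect the recursive call now
    else fact(n - 1, n * result)
}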
UPDATE: I recommend reading the other two answers (John's and Rex's) as well.
Recursive calls might be to a subclass instead of to a superclass; final will prevent that. But why might you want that behavior? The Fibonacci series doesn't provide any clues. But this does:
class Pretty {
def recursivePrinter(a: Any): String = { a match {
case xs: List[_] => xs.map(recursivePrinter).mkString("L[",",","]")
case xs: Array[_] => xs.map(recursivePrinter).mkString("A[",",","]")
case _ => a.toString
}}
}
class Prettier extends Pretty {
override def recursivePrinter(a: Any): String = { a match {
case s: Set[_] => s.map(recursivePrinter).mkString("{",",","}")
case _ => super.recursivePrinter(a)
}}
}
scala> (new Prettier).recursivePrinter(Set(Set(0,1),1))
res8: String = {{0,1},1}
If the call in Pretty were tail-call-optimized, we'd print out {Set(0, 1),1} instead, since the extension wouldn't apply.
Since this sort of recursion is plausibly useful, and would be broken if tail calls on non-final methods were optimized into jumps, the compiler inserts a real call instead.
Let foo::fact(n, res) denote your routine. Let baz::fact(n, res) denote someone else's override of your routine.
The compiler is telling you that the semantics allow baz::fact() to be a wrapper that may call up to foo::fact() if it wants to. Under such a scenario, the rule is that foo::fact(), when it recurs, must activate baz::fact() rather than foo::fact(); and while foo::fact() is tail-recursive, baz::fact() may not be. At that point, rather than looping on the tail-recursive call, foo::fact() must return to baz::fact() so that it can unwind itself.
What exactly would go wrong if the compiler applied TCO in a case such as this?
Nothing would go wrong. Any language with proper tail call elimination will do this (SML, OCaml, F#, Haskell etc.). The only reason Scala does not is that the JVM does not support tail calls, and Scala's usual hack of replacing self-recursive calls in tail position with a goto does not work in this case. Scala on the CLR could do this, as F# does.
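As an aside (not part of the original answer): if you need stack safety for calls the compiler cannot rewrite into a loop, the standard library ships a trampoline in scala.util.control.TailCalls, which trades some speed for bounded stack use. A minimal sketch:

import scala.util.control.TailCalls._

def fact(n: Int, result: BigInt): TailRec[BigInt] =
  if (n == 0) done(result)
  else tailcall(fact(n - 1, result * n))   // each step is deferred, then run iteratively

fact(10000, 1).result                      // completes without a StackOverflowError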
The popular and accepted answer to this question is actually misleading, because the question itself is confusing. The OP does not make the distinction between tailrec and TCO, and the answer does not address this.
The key point is that the requirements for tailrec are more strict than the requirements for TCO.
The tailrec annotation requires that tail calls are made to the same function, whereas TCO can be used on tail calls to any function.
The compiler could use TCO on fact because there is a call in the tail position. Specifically, it could turn the call to fact into a jump to fact by adjusting the stack appropriately. It does not matter that this version of fact is not the same as the function making the call.
So the accepted answer correctly explains why a non-final method cannot be tailrec, because you cannot guarantee that the tail calls are to the same method and not to an overridden version of it. But it incorrectly implies that it is not safe to use TCO on this method, when in fact this would be perfectly safe and a good optimisation.
[ Note that, as explained by Jon Harrop, you cannot implement TCO on the JVM, but that is a restriction of the compiler, not the language, and is unrelated to tailrec ]
And for reference, here is how you can avoid the problem without making the method final:
import scala.annotation.tailrec

class C {
  def fact(n: Int): Int = {
    @tailrec
    def loop(n: Int, result: Int): Int =
      if (n == 0) {
        result
      } else {
        loop(n - 1, n * result)
      }
    loop(n, 1)
  }
}
This works because loop is a local method nested inside fact, so it cannot be overridden. This version also has the advantage of removing the spurious result parameter from fact.
This is the pattern I use for all recursive algorithms.
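As a quick check (a sketch, not part of the original answer), the C2-style override from the accepted answer now behaves the way you would expect, while the inner loop stays optimized:

class C2 extends C {
  override def fact(n: Int): Int = 2 * super.fact(n)   // wrapping is safe: loop is untouched
}

(new C).fact(5)    // 120
(new C2).fact(5)   // 240, not 7680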
Related
I'm currently doing a Scala course and recently I was introduced to different techniques of returning functions.
For example, given this function and method:
val simpleAddFunction = (x: Int, y: Int) => x + y
def simpleAddMethod(x: Int, y: Int) = x + y
I can return another function just doing this:
val add7_v1 = (x: Int) => simpleAddFunction(x, 7)
val add7_v2 = simpleAddFunction(_: Int, 7)
val add7_v3 = (x: Int) => simpleAddMethod(x, 7)
val add7_v4 = simpleAddMethod(_: Int, 7)
All the values add7_x accomplish the same thing, so what's the purpose of currying then?
Why do I have to write def simpleCurryMethod(x: Int)(y: Int) = x + y if all of the above achieve similar functionality?
That's it! I'm a newbie in functional programming and I don't know many use cases of currying apart from saving time by reducing the repeated use of parameters. So, if someone could explain to me the advantages of currying over the previous examples, or of currying in general, I would be very grateful.
That's it, have a nice day!
In Scala 2 there are only four pragmatic reasons for currying METHODS (as far as I can recall, if someone has another valid use case then please let me know).
1. (and probably the principal reason to use it) To drive type inference.
For example, when you want to accept a function or another kind of generic value whose generic type should be inferred from some plain data. For example:
def applyTwice[A](a: A)(f: A => A): A = f(f(a))
applyTwice(10)(_ + 1) // Here the compiler is able to infer that f is Int => Int
In the above example, if I hadn't curried the method, I would need to call it like applyTwice(10, (x: Int) => x + 1).
Which is redundant and looks worse (IMHO).
Note: In Scala 3 type inference is improved, so this reason is no longer valid there.
2. (and probably the main reason now in Scala 3) For the UX of callers.
For example, if you expect an argument to be a function or a block it is usually better as a single argument in its own (and last) parameter list so it looks nice in usage. For example:
def iterN(n: Int)(body: => Unit): Unit =
if (n > 0) {
body
iterN(n - 1)(body)
}
iterN(3) {
println("Hello")
// more code
println("World")
}
Again, if I hadn't curried the previous method, the usage would have looked like this:
iterN(3, {
println("Hello")
// more code
println("World")
})
Which doesn't look that nice :)
3. (in my experience weird, but valid) When you know that the majority of users will partially apply it to return a function.
Because val baz = foo(bar) _ looks better than val baz = foo(bar, _), and with the first one you sometimes don't even need the underscore, as in data.map(foo(bar)).
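Here is a small sketch of that point (foo is a hypothetical curried method and foo2 its uncurried twin):

def foo(bar: Int)(x: Int): Int = bar + x    // curried
def foo2(bar: Int, x: Int): Int = bar + x   // uncurried

val baz  = foo(10) _                        // clean partial application
val baz2 = foo2(10, _: Int)                 // needs a typed placeholder
List(1, 2, 3).map(foo(10))                  // often no underscore needed at all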
Note: Disclaimer, I personally think that if this is the case, it is better to just return a function right away instead of currying.
Edit
Thanks to @jwvh for pointing out this fourth use case.
4. (useful when using path-dependent types) When you need to refer to previous parameters. For example:
trait Foo {
type I
def bar(i: I): Baz
}
def run(foo: Foo)(i: foo.I): Baz =
foo.bar(i)
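A self-contained sketch of how that is used (Baz and IntFoo are placeholders invented for this example):

trait Baz
trait Foo {
  type I
  def bar(i: I): Baz
}

def run(foo: Foo)(i: foo.I): Baz = foo.bar(i)   // the type of i depends on the first argument

object IntFoo extends Foo {
  type I = Int
  def bar(i: Int): Baz = new Baz {}
}

run(IntFoo)(42)   // compiles: the second parameter list sees foo.I = Int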
Why is this tail recursion:
def navigate(myList : List[Int]) : (Int, List[Int]) = {
def navigate(step: Int, offset: Int, myList: List[Int]): (Int, scala.List[Int]) = {
if //some test and exit condition, then a definition of jump
else navigate(step + 1, offset + jump, myList)
}
navigate(0, 0, myList)
}
while this is not:
def navigate(myList : List[Int]) : (Int, List[Int]) = {
navigate(0, 0, myList)
}
def navigate(step: Int, offset: Int, myList: List[Int]): (Int, scala.List[Int]) = {
if //some test and exit condition, then a definition of jump
else navigate(step + 1, offset + jump, myList)
}
If myList is very long, the first version does not give any problem, while the second one causes a StackOverflowError.
Also, is there any way to tell the compiler that the latter should be compiled so that the recursion does not grow the stack?
In order for a method to be eligible for tail-recursion optimization, it must:
1. be tail-recursive (duh!)
2. not use return
3. be final
Both of your examples conform to #1 and #2, but only the first example conforms to #3 (local methods are implicitly final).
The reason why a method cannot be treated as tail-recursive if it is not final is that "tail-recursive" means "tail-calls itself", but if the method is virtual, then you cannot know whether it tail-calls itself or an overridden version of itself. Figuring out at compile time whether a method has been overridden requires Class Hierarchy Analysis, which is known to be equivalent to solving the Halting Problem; in other words, it is impossible.
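For illustration (a sketch with a placeholder exit condition, since the original code elides it): moving the standalone navigate into an object, or marking it final or private, makes it eligible again:

import scala.annotation.tailrec

object Navigation {
  @tailrec
  def navigate(step: Int, offset: Int, myList: List[Int]): (Int, List[Int]) =
    if (myList.isEmpty) (offset, myList)                       // placeholder exit condition
    else navigate(step + 1, offset + myList.head, myList.tail)
}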
Also, is there any way to tell the compiler that the latter should be compiled so that the recursion does not grow the stack?
No. There is no way to turn tail-recursion optimization on or off. Methods that are tail-recursive (according to the Scala Language Specification's definition of "tail-recursive", of course) are always optimized. Any implementation of Scala that does not do this is in violation of the Scala Language Specification.
There is, however, the scala.annotation.tailrec annotation, which guarantees that the compiler will generate an error if a method that is annotated with this annotation does not comply with the SLS's definition of tail-recursion.
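For example (a sketch; the wording of the error is paraphrased), annotating a method whose recursive call is not in tail position is rejected at compile time:

import scala.annotation.tailrec

object Lists {
  @tailrec def length(xs: List[Int]): Int =
    if (xs.isEmpty) 0
    else 1 + length(xs.tail)   // not a tail call: the addition happens after the call returns
}
// error: could not optimize @tailrec annotated method length:
// it contains a recursive call not in tail position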
Why are some methods in Scala's standard libraries implemented with mutable state?
For instance, the find method of the scala.Iterator class is implemented as
def find(p: A => Boolean): Option[A] = {
var res: Option[A] = None
while (res.isEmpty && hasNext) {
val e = next()
if (p(e)) res = Some(e)
}
res
}
Which could have been implemented as a @tailrec'd method, perhaps something like
def findNew(p: A => Boolean): Option[A] = {
@tailrec
def findRec(e: A): Option[A] = {
if (p(e)) Some(e)
else {
if (hasNext) findRec(next())
else None
}
}
if (hasNext) findRec(next())
else None
}
Now I suppose one argument could be that using mutable state and a while loop is more efficient, which is understandably very important in library code, but is that really the case compared to a @tailrec'd method?
There is no harm in having mutable state as long as it is not shared.
In your example there is no way the mutable var can be accessed from outside, so it's not possible for this mutable variable to change due to an external side effect.
It's always good to enforce immutability as much as possible, but when performance matters, there is nothing wrong with having some mutability as long as it's constrained in a safe way.
NOTE: Iterator is a data structure which is not side-effect free, and this can lead to some weird behavior, but that is another story and in no way the reason for designing the method this way. You'll find methods like that in immutable data structures too.
In this case the tail-recursive version quite possibly has the same performance as the while loop. I would say that the while-loop solution is shorter and more concise.
But iterators are a mutable abstraction anyway, so the gain of having a tail-recursive method to avoid that var, which is local to that short code snippet, is questionable.
Scala is not designed for functional purity but for broadly useful capability. Part of this includes trying to have the most efficient implementations of basic library routines (certainly not universally true, but it often is).
As such, if you have two possible interfaces:
trait Iterator[A] { def next: A }
trait FunctionalIterator[A] { def next: (A, FunctionalIterator[A]) }
and the second one is awkward and slower, it's quite sensible to choose the first.
When a functionally pure implementation is superior for the bulk of use cases, you'll typically find the functionally pure one.
And when it comes to simply using a while loop vs. recursion, either one is easy enough to maintain so it's really up to the preferences of the coder. Note that find would have to be marked final in the tailrec case, so while preserves more flexibility:
trait Foo {
def next: Int
def foo: Int = {
var a = next
while (a < 0) a = next
a
}
}
defined trait Foo
trait Bar {
def next: Int
@tailrec def bar: Int = {
val a = next
if (a < 0) bar else a
}
}
<console>:10: error: could not optimize @tailrec annotated method bar:
it is neither private nor final so can be overridden
@tailrec def bar: Int = {
^
There are ways to get around this (nested methods, final, redirect to private method, etc.), but it tends to add boilerplate to the point where the while loop is syntactically more compact.
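For instance, a sketch of the "redirect to a private method" workaround (Baz is a made-up trait name):

import scala.annotation.tailrec

trait Baz {
  def next: Int
  def baz: Int = bazLoop                  // public and still overridable
  @tailrec private def bazLoop: Int = {   // private, so @tailrec is accepted
    val a = next
    if (a < 0) bazLoop else a
  }
}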
I understand the difference between zero-parameter and parameterless methods, but what I don't really understand is the language design choice that made parameterless methods necessary.
Disadvantages I can think of:
It's confusing. Every week or two there are questions here or on the Scala mailing list about it.
It's complicated; we also have to distinguish between () => X and => X.
It's ambiguous: does x.toFoo(y) mean what it says, or x.toFoo.apply(y)? (Answer: it depends on what overloads there are on x's toFoo method and on Foo's apply method, but if there's a clash you don't see an error until you try to call it.)
It messes up operator style method calling syntax: there is no symbol to use in place of the arguments, when chaining methods, or at the end to avoid semicolon interference. With zero-arg methods you can use the empty parameter list ().
Currently, you can't have both defined in a class: you get an error saying the method is already defined. They also both convert to a Function0.
Why not just make methods def foo and def foo() exactly the same thing, and allow them to be called with or without parentheses? What are the upsides of how it is?
Currying, That's Why
Daniel did a great job at explaining why parameterless methods are necessary. I'll explain why they are regarded distinctly from zero-parameter methods.
Many people view the distinction between parameterless and zero-parameter functions as some vague form of syntactic sugar. In truth it is purely an artifact of how Scala supports currying (for completeness, see below for a more thorough explanation of what currying is, and why we all like it so much).
Formally, a function may have zero or more parameter lists, with zero or more parameters each.
This means the following are valid: def a, def b(), but also the contrived def c()() and def d(x: Int)()()(y: Int) etc...
A function def foo = ??? has zero parameter lists. A function def bar() = ??? has precisely one parameter list, with zero parameters. Introducing additional rules that conflate the two forms would have undermined currying as a consistent language feature: def a would be equivalent in form to def b() and def c()() both; def d(x: Int)()()(y: Int) would be equivalent to def e()(x: Int)(y: Int)()().
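To make that concrete, here is a small sketch of the forms just mentioned (d is the contrived shape from above):

def a: Int = 1                            // zero parameter lists
def b(): Int = 1                          // one empty parameter list
def d(x: Int)()()(y: Int): Int = x + y    // several lists, some of them empty

a              // ok; a() would not compile
b(); b         // both ok in Scala 2: the empty argument list may be dropped at the call site
d(1)()()(2)    // every parameter list must be supplied, in order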
One case where currying is irrelevant is when dealing with Java interop. Java does not support currying, so there's no problem with introducing syntactic sugar for zero-parameter methods like "test".length() (which directly invokes java.lang.String#length()) to also be invoked as "test".length.
A quick explanation of currying
Scala supports a language feature called 'currying', named after mathematician Haskell Curry.
Currying allows you to define functions with several parameter lists, e.g.:
def add(a: Int)(b: Int): Int = a + b
add(2)(3) // 5
This is useful, because you can now define inc in terms of a partial application of add:
def inc: Int => Int = add(1)
inc(2) // 3
Currying is most often seen as a way of introducing control structures via libraries, e.g.:
def repeat(n: Int)(thunk: => Any): Unit = (1 to n) foreach { _ => thunk }
repeat(2) {
println("Hello, world")
}
// Hello, world
// Hello, world
As a recap, see how repeat opens up another opportunity to use currying:
def twice: (=> Any) => Unit = repeat(2)
twice {
println("Hello, world")
}
// ... you get the picture :-)
One nice thing about an issue coming up periodically on the ML is that there are periodic answers.
Who can resist a thread called "What is wrong with us?"
https://groups.google.com/forum/#!topic/scala-debate/h2Rej7LlB2A
From: martin odersky Date: Fri, Mar 2, 2012 at
12:13 PM Subject: Re: [scala-debate] what is wrong with us...
What some people think is "wrong with us" is that we are trying to bend
over backwards to make Java idioms work smoothly in Scala. The
principled thing would have been to say def length() and def length
are different, and, sorry, String is a Java class so you have to write
s.length(), not s.length. We work really hard to paper over it by
admitting automatic conversions from s.length to s.length(). That's
problematic as it is. Generalizing that so that the two are identified
in the type system would be a sure way to doom. How then do you
disambiguate:
type Action = () => ()
def foo: Action
Is then foo of type Action or ()? What about foo()?
Martin
My favorite bit of paulp fiction from that thread:
On Fri, Mar 2, 2012 at 10:15 AM, Rex Kerr <ich...#gmail.com> wrote:
>This would leave you unable to distinguish between the two with
>structural types, but how often is the case when you desperately
>want to distinguish the two compared to the case where distinguishing
>between the two is a hassle?
/** Note to maintenance programmer: It is important that this method be
* callable by classes which have a 'def foo(): Int' but not by classes which
* merely have a 'def foo: Int'. The correctness of this application depends
* on maintaining this distinction.
*
* Additional note to maintenance programmer: I have moved to zambia.
* There is no forwarding address. You will never find me.
*/
def actOnFoo(...)
So the underlying motivation for the feature is to generate this sort of ML thread.
One more bit of googlology:
On Thu, Apr 1, 2010 at 8:04 PM, Rex Kerr <[hidden email]> wrote: On
Thu, Apr 1, 2010 at 1:00 PM, richard emberson <[hidden email]> wrote:
I assume "def getName: String" is the same as "def getName(): String"
No, actually, they are not. Even though they both call a method
without parameters, one is a "method with zero parameter lists" while
the other is a "method with one empty parameter list". If you want to
be even more perplexed, try def getName()(): String (and create a
class with that signature)!
Scala represents parameters as a list of lists, not just a list, and
List() != List(List())
It's kind of a quirky annoyance, especially since there are so few
distinctions between the two otherwise, and since both can be
automatically turned into the function signature () => String.
True. In fact, any conflation between parameterless methods and
methods with empty parameter lists is entirely due to Java interop.
They should be different but then dealing with Java methods would be
just too painful. Can you imagine having to write str.length() each
time you take the length of a string?
Cheers
First off, () => X and => X have absolutely nothing to do with parameterless methods.
Now, it looks pretty silly to write something like this:
var x() = 5
val y() = 2
x() = x() + y()
Now, if you don't follow what the above has to do with parameterless methods, then you should look up the uniform access principle. All of the above are method declarations, and all of them can be replaced by def. That is, assuming you remove their parentheses.
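A minimal sketch of the uniform access principle in action: callers cannot tell whether answer is a stored val or a computed def, so one can be swapped for the other without touching call sites:

class Stored   { val answer: Int = 42 }        // evaluated once and stored in a field
class Computed { def answer: Int = 21 + 21 }   // recomputed on each access

(new Stored).answer     // 42
(new Computed).answer   // 42, with identical call-site syntax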
Besides the convention already mentioned (side-effecting versus non-side-effecting), it helps in several cases:
Usefulness of having empty-paren
// short apply syntax
object A {
def apply() = 33
}
object B {
def apply = 33
}
A() // works
B() // does not work
// using in place of a curried function
object C {
def m()() = ()
}
val f: () => () => Unit = C.m
Usefulness of having no-paren
// val <=> def, var <=> two related defs
trait T { def a: Int; def a_=(v: Int): Unit }
trait U { def a(): Int; def a_=(v: Int): Unit }
def tt(t: T): Unit = t.a += 1 // works
def tu(u: U): Unit = u.a += 1 // does not work
// avoiding clutter with apply the other way round
object D {
def a = Vector(1, 2, 3)
def b() = Vector(1, 2, 3)
}
D.a(0) // works
D.b(0) // does not work
// object can stand for no-paren method
trait E
trait F { def f: E }
trait G { def f(): E }
object H extends F {
object f extends E // works
}
object I extends G {
object f extends E // does not work
}
Thus in terms of regularity of the language, it makes sense to have the distinction (especially for the last shown case).
I would say both are possible because you can access mutable state with a parameterless method:
class X(private var x: Int) {
def inc() { x += 1 }
def value = x
}
The method value does not have side effects (it only accesses mutable state). This behavior is explicitly mentioned in Programming in Scala:
Such parameterless methods are quite common in Scala. By contrast, methods defined with empty parentheses, such as def height(): Int, are called empty-paren methods. The recommended convention is to use a parameterless method whenever there are no parameters and the method accesses mutable state only by reading fields of the containing object (in particular, it does not change mutable state).
This convention supports the uniform access principle [...]
To summarize, it is encouraged style in Scala to define methods that take no parameters and have no side effects as parameterless methods, i.e., leaving off the empty parentheses. On the other hand, you should never define a method that has side-effects without parentheses, because then invocations of that method would look like a field selection.