Recursive call not in tail position when using for comprehensions - scala

I don't know if I am doing something wrong or it is just a property of the Scala compiler - I get the mentioned compilation error when I try to compile this code:
#tailrec
def shiftDown2(x: Int, bound: Int) {
val childOfX = chooseChild(x, bound)
for (child <- childOfX) {
swap(x, child)
shiftDown2(child, bound)
}
}
whilst the following code compiles without problems:
#tailrec
def shiftDown(x: Int, bound: Int) {
val childOfX = chooseChild(x, bound)
if (childOfX.isDefined) {
swap(x, childOfX.get)
shiftDown(childOfX.get, bound)
}
}
I believe that above code fragments are semantically the same and both should work with tail recursion.

Tail recursion optimization will not work with recursive invocations inside for loop, because for loop here is just a syntactic sugar for calling foreach higher-order method. So, your code is equivalent to:
#tailrec
def shiftDown2(x: Int, bound: Int) {
val childOfX = chooseChild(x, bound)
childOfX.foreach { child =>
swap(x, child)
shiftDown2(child, bound)
}
}
scalac can optimize tail calls only if recursive method is tail-calling itself directly - by translating it into something similar to while loop in bytecode.
Unfortunately, this is not the case here - shiftDown2 calls childOfX.foreach passing it an anonymous function. Then, foreach (potentially) calls apply on that anonymous function and that anonymous function finally calls shiftDown2 again. So this is an indirect recursion and cannot be optimized by scalac. This limitation has its roots in the JVM, which doesn't have native tail call support.

Related

Why is this method causing a StackOverflowError? [duplicate]

Why won't the Scala compiler apply tail call optimization unless a method is final?
For example, this:
class C {
#tailrec def fact(n: Int, result: Int): Int =
if(n == 0)
result
else
fact(n - 1, n * result)
}
results in
error: could not optimize #tailrec annotated method: it is neither private nor final so can be overridden
What exactly would go wrong if the compiler applied TCO in a case such as this?
Consider the following interaction with the REPL. First we define a class with a factorial method:
scala> class C {
def fact(n: Int, result: Int): Int =
if(n == 0) result
else fact(n - 1, n * result)
}
defined class C
scala> (new C).fact(5, 1)
res11: Int = 120
Now let's override it in a subclass to double the superclass's answer:
scala> class C2 extends C {
override def fact(n: Int, result: Int): Int = 2 * super.fact(n, result)
}
defined class C2
scala> (new C).fact(5, 1)
res12: Int = 120
scala> (new C2).fact(5, 1)
What result do you expect for this last call? You might be expecting 240. But no:
scala> (new C2).fact(5, 1)
res13: Int = 7680
That's because when the superclass's method makes a recursive call, the recursive call goes through the subclass.
If overriding worked such that 240 was the right answer, then it would be safe for tail-call optimization to be performed in the superclass here. But that isn't how Scala (or Java) works.
Unless a method is marked final, it might not be calling itself when it makes a recursive call.
And that's why #tailrec doesn't work unless a method is final (or private).
UPDATE: I recommend reading the other two answers (John's and Rex's) as well.
Recursive calls might be to a subclass instead of to a superclass; final will prevent that. But why might you want that behavior? The Fibonacci series doesn't provide any clues. But this does:
class Pretty {
def recursivePrinter(a: Any): String = { a match {
case xs: List[_] => xs.map(recursivePrinter).mkString("L[",",","]")
case xs: Array[_] => xs.map(recursivePrinter).mkString("A[",",","]")
case _ => a.toString
}}
}
class Prettier extends Pretty {
override def recursivePrinter(a: Any): String = { a match {
case s: Set[_] => s.map(recursivePrinter).mkString("{",",","}")
case _ => super.recursivePrinter(a)
}}
}
scala> (new Prettier).recursivePrinter(Set(Set(0,1),1))
res8: String = {{0,1},1}
If the Pretty call was tail-recursive, we'd print out {Set(0, 1),1} instead since the extension wouldn't apply.
Since this sort of recursion is plausibly useful, and would be destroyed if tail calls on non-final methods were allowed, the compiler inserts a real call instead.
Let foo::fact(n, res) denote your routine. Let baz::fact(n, res) denote someone else's override of your routine.
The compiler is telling you that the semantics allow baz::fact() to be a wrapper, that MAY upcall (?) foo::fact() if it wants to. Under such a scenario, the rule is that foo::fact(), when it recurs, must activate baz::fact() rather than foo::fact(), and, while foo::fact() is tail-recursive, baz::fact() may not be. At that point, rather than looping on the tail-recursive call, foo::fact() must return to baz::fact(), so it can unwind itself.
What exactly would go wrong if the compiler applied TCO in a case such as this?
Nothing would go wrong. Any language with proper tail call elimination will do this (SML, OCaml, F#, Haskell etc.). The only reason Scala does not is that the JVM does not support tail recursion and Scala's usual hack of replacing self-recursive calls in tail position with goto does not work in this case. Scala on the CLR could do this as F# does.
The popular and accepted answer to this question is actually misleading, because the question itself is confusing. The OP does not make the distinction between tailrec and TCO, and the answer does not address this.
The key point is that the requirements for tailrec are more strict than the requirements for TCO.
The tailrec annotation requires that tail calls are made to the same function, whereas TCO can be used on tail calls to any function.
The compiler could use TCO on fact because there is a call in the tail position. Specifically, it could turn the call to fact into a jump to fact by adjusting the stack appropriately. It does not matter that this version of fact is not the same as the function making the call.
So the accepted answer correctly explains why a non-final function cannot be tailrec because you cannot guarantee that the tail calls are to the same function and not to an overloaded version of the function. But it incorrectly implies that it is not safe to use TCO on this method, when in fact this would be perfectly safe and a good optimisation.
[ Note that, as explained by Jon Harrop, you cannot implement TCO on the JVM, but that is a restriction of the compiler, not the language, and is unrelated to tailrec ]
And for reference, here is how you can avoid the problem without making the method final:
class C {
def fact(n: Int): Int = {
#tailrec
def loop(n: Int, result: Int): Int =
if (n == 0) {
result
} else {
loop(n - 1, n * result)
}
loop(n, 1)
}
}
This works because loop is a concrete function rather than a method and cannot be overridden. This version also has the advantage of removing the spurious result parameter to fact.
This is the pattern I use for all recursive algorithms.

Tail recursion: comparing two cases

Why is this tail recursion:
def navigate(myList : List[Int]) : (Int, List[Int]) = {
def navigate(step: Int, offset: Int, myList: List[Int]): (Int, scala.List[Int]) = {
if //some test and exit condition, then a definition of jump
else navigate(step + 1, offset + jump, myList)
}
navigate(0, 0, myList)
}
while this is not:
def navigate(myList : List[Int]) : (Int, List[Int]) = {
navigate(0, 0, myList)
}
def navigate(step: Int, offset: Int, myList: List[Int]): (Int, scala.List[Int]) = {
if //some test and exit condition, then a definition of jump
else navigate(step + 1, offset + jump, myList)
}
If myList is very long, the first case does not give any problem, when the second one causes a StackOverflowError.
Also, is there any way to say the compiler that the latter should be compiled so that the recursion does not increase the stack?
In order for a method to be eligible for tail-recursion optimization, it must:
be tail-recursive (duh!)
not use return
be final
Both of your examples conform to #1 and #2, but only the first example conforms to #3 (local methods are implicitly final).
The reason why a method is not tail-recursive if it is not final is that "tail-recursive" means "tail-calls itself", but if the method is virtual, then you cannot know whether it tail-calls itself or an overridden version of itself. Figuring out at compile time whether a method has been overridden requires Class Hierarchy Analysis, which is known to be equivalent to solving the Halting Problem … IOW is impossible.
Also, is there any way to say the compiler that the latter should be compiled so that the recursion does not increase the stack?
No. There is no way to turn tail-recursion optimization on or off. Methods that are tail-recursive (according to the Scala Language Specification's definition of "tail-recursive", of course) are always optimized. Any implementation of Scala that does not do this is in violation of the Scala Language Specification.
There is, however, the scala.annotation.tailrec annotation, which guarantees that the compiler will generate an error if a method that is annotated with this annotation does not comply with the SLS's definition of tail-recursion.

What's the difference between using and no using a "=" in Scala defs ?

What the difference between the two defs below
def someFun(x:String) { x.length }
AND
def someFun(x:String) = { x.length }
As others already pointed out, the former is a syntactic shortcut for
def someFun(x:String): Unit = { x.length }
Meaning that the value of x.length is discarded and the function returns Unit (or () if you prefer) instead.
I'd like to stress out that this is deprecated since Oct 29, 2013 (https://github.com/scala/scala/pull/3076/), but the warning only shows up if you compile with the -Xfuture flag.
scala -Xfuture -deprecation
scala> def foo {}
<console>:1: warning: Procedure syntax is deprecated. Convert procedure `foo` to method by adding `: Unit =`.
def foo {}
foo: Unit
So you should never use the so-called procedure syntax.
Martin Odersky itself pointed this out in his Scala Day 2013 Keynote and it has been discussed in the scala mailing list.
The syntax is very inconsistent and it's very common for a beginner to hit this issue when learning the language. For this reasons it's very like that it will be removed from the language at some point.
Without the equals it is implicitly typed to return Unit (or "void"): the result of the body is fixed - not inferred - and any would-be return value is discarded.
That is, def someFun(x:String) { x.length } is equivalent to def someFun(x:String): Unit = { x.length }, neither of which are very useful here because the function causes no side-effects and returns no value.
The "equals form" without the explicit Unit (or other type) has the return type inferred; in this case that would be def someFun(x:String): Int = { x.length } which is more useful, albeit not very exciting.
I prefer to specify the return type for all exposed members, which helps to ensure API/contract stability and arguably adds clarity. For "void" methods this is trivial done by using the Procedure form, without the equals, although it is a stylistic debate of which is better - opponents might argue that having the different forms leads to needless questions about such ;-)
The former is
def someFun(x: String): Unit = {
x.length
() // return unit
}
And the latter is
def someFun(x: String): Int = {
x.length // returned
}
Note that the Scala Style guide always recommend using '=', both in
method declaration
Methods should be declared according to the following pattern:
def foo(bar: Baz): Bin = expr
function declaration
Function types should be declared with a space between the parameter type, the arrow and the return type:
def foo(f: Int => String) = ...
def bar(f: (Boolean, Double) => List[String]) = ...
As per Scala-2.10, using equals sign is preferred. Infact you must use equals sign in call declarations except the definitions returning Unit.
As such there is no difference but the first one is not recommended anymore and it should not be used.

Tail recursion and return statement in Scala

I was thinking about one of the questions asked here (Why does Scala require a return type for recursive functions?) and how to improve the code.
Anyway, I was thinking something like this:
def simpledb_update(name: String, metadata: Map[String, String]) = {
def inner_update(attempt: int): Unit = {
try {
db(config("simpledb_db")) += (name, metadata)
return
} catch {
case e =>
if (attempt >= 6) {
AUlog(name + ": SimpleDB Failed")
return
}
}
inner_update(attempt+1)
}
inner_update(0)
}
Or
def simpledb_update(name: String, metadata: Map[String, String]) {
def inner_update(attempt: int): Unit = {
try {
db(config("simpledb_db")) += (name, metadata)
} catch {
//Do I need the pattern match, since I don't
// care what exception is thrown???
if (attempt >= 6) {
AUlog(name + ": SimpleDB Failed")
} else {
inner_update(attempt+1)
}
}
}
inner_update(0)
}
Is the second implementation still tail recursive (is the first???). I'm still a bit hazy on when a function is tail recursive, and when it's not.
Yes, both examples are still tail-recursive, you're doing the checking/processing first, with the recursive call last.
The Jargon File put it succinctly: Tail Recursion (n): If you aren't sick of it already, see tail recursion.
In Scala, you can add tailrecursive annotation to the function, and then the compiler will check whether or not it can do tail call optimisation on the function.
Tail recursion is simple to understand - if the last expression, from all code paths of a function F, is simply a function call then it is tail recursive. But is it tail-recursive for Scala compiler to optimize - depends. The fact that JVM doesn't do TCO means scala compiler does the magic.
And the reason scalac might fail to optimize has to do with polymorphism. If the Function can be overridden in a subclass, scalac cannot optimize it out.
#Timo Rantalaiho, using the tailrec annotation is the safest way to verify if TCO can be done.

Why won't the Scala compiler apply tail call optimization unless a method is final?

Why won't the Scala compiler apply tail call optimization unless a method is final?
For example, this:
class C {
#tailrec def fact(n: Int, result: Int): Int =
if(n == 0)
result
else
fact(n - 1, n * result)
}
results in
error: could not optimize #tailrec annotated method: it is neither private nor final so can be overridden
What exactly would go wrong if the compiler applied TCO in a case such as this?
Consider the following interaction with the REPL. First we define a class with a factorial method:
scala> class C {
def fact(n: Int, result: Int): Int =
if(n == 0) result
else fact(n - 1, n * result)
}
defined class C
scala> (new C).fact(5, 1)
res11: Int = 120
Now let's override it in a subclass to double the superclass's answer:
scala> class C2 extends C {
override def fact(n: Int, result: Int): Int = 2 * super.fact(n, result)
}
defined class C2
scala> (new C).fact(5, 1)
res12: Int = 120
scala> (new C2).fact(5, 1)
What result do you expect for this last call? You might be expecting 240. But no:
scala> (new C2).fact(5, 1)
res13: Int = 7680
That's because when the superclass's method makes a recursive call, the recursive call goes through the subclass.
If overriding worked such that 240 was the right answer, then it would be safe for tail-call optimization to be performed in the superclass here. But that isn't how Scala (or Java) works.
Unless a method is marked final, it might not be calling itself when it makes a recursive call.
And that's why #tailrec doesn't work unless a method is final (or private).
UPDATE: I recommend reading the other two answers (John's and Rex's) as well.
Recursive calls might be to a subclass instead of to a superclass; final will prevent that. But why might you want that behavior? The Fibonacci series doesn't provide any clues. But this does:
class Pretty {
def recursivePrinter(a: Any): String = { a match {
case xs: List[_] => xs.map(recursivePrinter).mkString("L[",",","]")
case xs: Array[_] => xs.map(recursivePrinter).mkString("A[",",","]")
case _ => a.toString
}}
}
class Prettier extends Pretty {
override def recursivePrinter(a: Any): String = { a match {
case s: Set[_] => s.map(recursivePrinter).mkString("{",",","}")
case _ => super.recursivePrinter(a)
}}
}
scala> (new Prettier).recursivePrinter(Set(Set(0,1),1))
res8: String = {{0,1},1}
If the Pretty call was tail-recursive, we'd print out {Set(0, 1),1} instead since the extension wouldn't apply.
Since this sort of recursion is plausibly useful, and would be destroyed if tail calls on non-final methods were allowed, the compiler inserts a real call instead.
Let foo::fact(n, res) denote your routine. Let baz::fact(n, res) denote someone else's override of your routine.
The compiler is telling you that the semantics allow baz::fact() to be a wrapper, that MAY upcall (?) foo::fact() if it wants to. Under such a scenario, the rule is that foo::fact(), when it recurs, must activate baz::fact() rather than foo::fact(), and, while foo::fact() is tail-recursive, baz::fact() may not be. At that point, rather than looping on the tail-recursive call, foo::fact() must return to baz::fact(), so it can unwind itself.
What exactly would go wrong if the compiler applied TCO in a case such as this?
Nothing would go wrong. Any language with proper tail call elimination will do this (SML, OCaml, F#, Haskell etc.). The only reason Scala does not is that the JVM does not support tail recursion and Scala's usual hack of replacing self-recursive calls in tail position with goto does not work in this case. Scala on the CLR could do this as F# does.
The popular and accepted answer to this question is actually misleading, because the question itself is confusing. The OP does not make the distinction between tailrec and TCO, and the answer does not address this.
The key point is that the requirements for tailrec are more strict than the requirements for TCO.
The tailrec annotation requires that tail calls are made to the same function, whereas TCO can be used on tail calls to any function.
The compiler could use TCO on fact because there is a call in the tail position. Specifically, it could turn the call to fact into a jump to fact by adjusting the stack appropriately. It does not matter that this version of fact is not the same as the function making the call.
So the accepted answer correctly explains why a non-final function cannot be tailrec because you cannot guarantee that the tail calls are to the same function and not to an overloaded version of the function. But it incorrectly implies that it is not safe to use TCO on this method, when in fact this would be perfectly safe and a good optimisation.
[ Note that, as explained by Jon Harrop, you cannot implement TCO on the JVM, but that is a restriction of the compiler, not the language, and is unrelated to tailrec ]
And for reference, here is how you can avoid the problem without making the method final:
class C {
def fact(n: Int): Int = {
#tailrec
def loop(n: Int, result: Int): Int =
if (n == 0) {
result
} else {
loop(n - 1, n * result)
}
loop(n, 1)
}
}
This works because loop is a concrete function rather than a method and cannot be overridden. This version also has the advantage of removing the spurious result parameter to fact.
This is the pattern I use for all recursive algorithms.