I come from an object-oriented background, where I primarily wrote applications in Java. I recently started to explore Scala and have been reading some texts, and I came across something called tail recursion. I understood how to write tail-recursive methods.
For example, to add the elements of a List (of course this could be done with the reduce method, but for the sake of understanding) I wrote a tail-recursive method:
@scala.annotation.tailrec
def sum(l: List[Int], acc: Int): Int = l match {
  case Nil => acc
  case x :: xs => sum(xs, acc + x)
}
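For example, calling it with an initial accumulator of 0 returns the sum of the elements:

sum(List(1, 2, 3), 0)  // 6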
How is this recursion handled internally by the Scala run time?
How is this recursion handled internally by the Scala run time?
It isn't. It is handled by the compiler at compile time.
Tail-recursion is equivalent to a while loop. So, a tail-recursive method can be compiled to a while loop, or, more precisely, it can be compiled the same way a while loop is compiled. Of course, how exactly it is compiled depends on the compiler being used.
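As a rough sketch (not the compiler's literal output), the tail-recursive sum from the question behaves like this hand-written loop:

def sumLoop(l: List[Int], acc: Int): Int = {
  var rest  = l    // the "argument" that the recursive call would rebind
  var total = acc
  while (rest != Nil) {
    total += rest.head
    rest = rest.tail
  }
  total
}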
There are currently three major implementations of Scala: Scala Native (a compiler that targets native machine code and ships its own runtime), Scala.js (a compiler that targets the ECMAScript platform, sitting on top of an ECMAScript runtime), and the JVM implementation, which confusingly is also just called "Scala" like the language (it targets the JVM platform and uses the JVM runtime). There used to be a Scala.NET, but it is no longer actively maintained.
I will focus on Scala-JVM in this answer.
I'll use a slightly different example than yours, because the encoding of pattern matching is actually fairly complex. Let's start with the simplest possible tail-recursive function there is:
def foo(): Unit = foo()
This gets compiled by Scala-JVM to the following JVM bytecode:
public void foo()
0: goto 0
Remember how I said above that tail-recursion is equivalent to looping? Well, the JVM doesn't have loops, it only has GOTO. This is exactly the same as a while loop:
def bar(): Unit = while (true) {}
Gets compiled to:
public void bar()
0: goto 0
And for a slightly more interesting example:
def baz(n: Int): Int = if (n <= 0) n else baz(n-1)
gets compiled to:
public int baz(int);
0: iload_1
1: iconst_0
2: if_icmpgt 9
5: iload_1
6: goto 16
9: iload_1
10: iconst_1
11: isub
12: istore_1
13: goto 0
16: ireturn
As you can see, it is just a while loop.
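Writing the loop back out in Scala by hand (a sketch of what the bytecode does, not a decompilation), baz amounts to:

def bazLoop(n0: Int): Int = {
  var n = n0
  while (n > 0) n -= 1  // mirrors the iload/isub/goto cycle above
  n                     // 0 for positive input, n unchanged otherwise
}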
Related
As a Scala developer learning the IO monad, and therefore the technicalities of trampolining in general that are necessary for recursion where tail-call optimization is not possible, I wonder how Haskell seems to avoid the problem natively.
I get that Haskell is a lazy language, however I wonder if someone could elaborate a bit further.
For instance, why doesn't ForeverM stack overflow in Scala? Well, I can answer that for the trampoline, and I can find the actual code that does it in libraries and blog posts; I even implemented a basic trampoline myself to learn.
How does it happen in Haskell? Is there a way to unpack the laziness a bit, give some pointers, and maybe documentation that would help in understanding it better?
sealed trait IO[A] {
  .....
  def flatMap[B](f: A => IO[B]): IO[B] =
    FlatMap[A,B](this, f) // we do not interpret the `flatMap` here, just return it as a value
  def map[B](f: A => B): IO[B] =
    flatMap[B](f andThen (Return(_)))
}
case class Return[A](a: A) extends IO[A]
case class Suspend[A](resume: () => A) extends IO[A]
case class FlatMap[A,B](sub: IO[A], k: A => IO[B]) extends IO[B]
......
@annotation.tailrec
def run[A](io: IO[A]): A = io match {
  case Return(a)  => a
  case Suspend(r) => r()
  case FlatMap(x, f) => x match {
    case Return(a)     => run(f(a))
    case Suspend(r)    => run(f(r()))
    case FlatMap(y, g) => run(y flatMap (a => g(a) flatMap f))
  }
}
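To see why run does not blow the stack, here is a hypothetical example built on the snippet above (the name countdown is mine, not from the original code). Each step allocates a single FlatMap node, and the @tailrec-checked run unwinds them iteratively:

def countdown(n: Int): IO[Int] =
  if (n == 0) Return(0)
  else Suspend(() => n).flatMap(m => countdown(m - 1)) // builds a value, no recursion yet

// run(countdown(1000000)) evaluates step by step inside run's loop,
// so it returns 0 without a StackOverflowError.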
Functional programming in general requires tail-call elimination (otherwise the deep chains of function calls overflow the stack). For example, consider this (absurdly inefficient) implementation of an even/odd classifier:
def even(i: Int): Boolean =
  if (i == 0) true
  else if (i > 0) odd(i - 1)
  else odd(i + 1)

def odd(i: Int): Boolean =
  if (i == 0) false
  else if (i > 0) even(i - 1)
  else even(i + 1)
In both even and odd, every branch is either a simple expression (true or false in this case) that doesn't make a function call, or a tail call: the value of the called function is returned without being operated on further.
Without tail-call elimination, the (potentially recursive with an indefinite length of a cycle) calls have to be implemented using a stack which consumes memory, because the caller may do something with the result. Tail-call elimination relies on observing that the caller doesn't do anything with the result, therefore the called function can effectively replace the caller on the stack.
Haskell and essentially every other post-Scheme functional language runtime implements generalized tail-call elimination: tail-calls become an unconditional jump (think a GOTO). The famous series of Steele and Sussman papers (the PDFs unfortunately didn't get archived, but you can search for, e.g. AIM-443 (mit or steele or sussman might be required)) known as "Lambda: The Ultimate" (which inspired the name of a programming language forum) goes through the implications of tail-call elimination and how this means that functional programming is actually viable for solving real-world computing problems.
Scala, however, primarily targets the Java Virtual Machine, the specification of which effectively (by design) prohibits generalized tail-call elimination, and the instruction set of which constrains unconditional jumps to not cross the boundaries of a method. In certain limited contexts (basically recursive calls of a method where the compiler can be absolutely sure of what implementation is being called), the Scala compiler performs the tail-call elimination before emitting the Java bytecode (it's theoretically conceivable that Scala Native could perform generalized tail-call elimination, but that would entail some semantic break with JVM and JS Scala (some JavaScript runtimes perform generalized tail-call elimination, though not V8 to my knowledge)). The @tailrec annotation, with which you may have some familiarity, enforces a requirement that the compiler be able to perform tail-call elimination.
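As a small sketch of that restriction (the name countDownToZero is mine): the compiler accepts @tailrec on a directly self-recursive method, but it would reject it on the mutually recursive even/odd pair above, because those tail calls cross a method boundary.

import scala.annotation.tailrec

@tailrec
def countDownToZero(i: Int): Boolean =
  if (i <= 0) true else countDownToZero(i - 1) // direct self tail call: accepted

// Annotating `even` above with @tailrec would not compile: its tail calls go
// to `odd`, and the JVM gives the compiler no way to jump across methods.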
Trampolining is a low-level runtime technique for emulating compile-time tail-call elimination, especially in languages like C or Scala. Since Haskell has already performed the tail-call elimination at compile time, there's no need for the complexity of a trampoline (or for rewriting the high-level code into continuation-passing style).
You can arguably think of the CPU in a Haskell program (or the runtime itself if transpiling to, e.g. JS) as implementing a trampoline.
Trampolining is not the only solution for tail calls. Scala requires trampolining precisely because it runs on the JVM, with the Java runtime. The Scala language developers did not get to choose precisely how their runtime operates, nor their binary format. Because they use the JVM, they must endure every way that the JVM is optimized for Java and not for Scala.
Haskell does not have this limitation, because it has its own runtime, its own binary format, etc. It can choose precisely how to set up the stack at runtime based on language-level constructs of the Haskell language, not those of Java.
In Scala, the following two functions serve exactly the same purpose:
@tailrec
final def fn(str: String): Option[String] = {
  Option(str).filter(_.nonEmpty).flatMap { v =>
    fn(v.drop(1))
  }
}

@tailrec
final def fn2(str: String): Option[String] = {
  Option(str).filter(_.nonEmpty) match {
    case None    => None
    case Some(v) => fn2(v.drop(1))
  }
}
However, @tailrec only works in the second case; in the first case it generates the following error:
Error: could not optimize @tailrec annotated method fn: it contains a
recursive call not in tail position
Option(str).filter(_.nonEmpty).flatMap { v =>
Why is this error given? And why do these two snippets generate different JVM bytecode?
For fn to be tail-recursive, the recursive call must be the last action in the function. If you pass fn to another function such as flatMap then the other function is free to perform other actions after calling fn and therefore the compiler cannot be sure that it is tail recursive.
In some cases the compiler could detect that calling fn is the last action in the other function, but not in the general case. And this would rely on a specific implementation of that other function, so the @tailrec annotation might become invalid if that other function were changed, which is an undesirable dependency.
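To make that concrete, here is a rough sketch (my own naming, not the compiler's actual desugaring) of what fn looks like from the compiler's point of view: the recursive call sits inside an anonymous function handed to flatMap, so the last action of fn itself is the call to flatMap, not the call to fn.

final def fnSketch(str: String): Option[String] = {
  val k: String => Option[String] =
    v => fnSketch(v.drop(1))                 // tail position of k, not of fnSketch
  Option(str).filter(_.nonEmpty).flatMap(k)  // fnSketch's last action is flatMap
}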
Specifically for the last question:
And why do these two snippets generate different JVM bytecode?
Because on the JVM there's no guarantee that the JAR containing the Option class at runtime is the same as the one seen at compile time. This is good, because otherwise even minor versions of libraries (including the standard Java and Scala libraries) would be incompatible, and you'd need all dependencies to use the same minor version of their common dependencies.
If that class doesn't have a suitable flatMap method, you'll get an AbstractMethodError, but otherwise the semantics of Scala require that its flatMap method be called. So the compiler has to emit bytecode that actually calls the method.
Kotlin works around this with inline functions, and Scala 3 will support them too, but I don't know whether it will use them for such cases.
Consider the following:
List('a', 'b').flatMap(List(_,'g')) //res0: List[Char] = List(a, g, b, g)
It seems pretty obvious that flatMap() is doing some internal post-processing in order to achieve that result. How else would List('a','g') get combined with List('b','g')?
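As a sketch of the kind of post-processing involved (the real List#flatMap goes through a builder and is more general, but it behaves like this): apply the function to each element and concatenate the pieces in order.

def flatMapSketch[A, B](xs: List[A])(f: A => List[B]): List[B] = {
  val buf = scala.collection.mutable.ListBuffer.empty[B]
  for (x <- xs) buf ++= f(x)  // append each produced list in order
  buf.toList
}

flatMapSketch(List('a', 'b'))(c => List(c, 'g'))  // List(a, g, b, g)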
A mutable Set's retain method is implemented as follows:
def retain(p: A => Boolean): Unit =
  for (elem <- this.toList) // SI-7269 toList avoids ConcurrentModificationException
    if (!p(elem)) this -= elem
But if I implement my own method that doesn't make a copy for iterating, nothing blows up.
def dumbRetain[A](self: mutable.Set[A], p: A => Boolean): Unit =
  for (elem <- self)
    if (!p(elem)) self -= elem
dumbRetain(mutable.HashSet(1,2,3,4,5,6), Set(2,4,6))
// everything is ok
I see that SI-7269's test case uses the JavaConversions wrapper around a java Set/Map, and it seems like the issue arises from the underlying java collection.
I know there will never be a java collection passed to my algorithm, so can I use dumbRetain without worrying about the ConcurrentModificationException? Or is this "coincidental behavior" that I shouldn't rely on?
Edit: to clarify, I would be using dumbRetain as an implementation detail in an algorithm which would be in full control of what it passes to dumbRetain. And this would run in a single-threaded context.
This seems to rely on the specific implementation of mutable.HashSet, and there is nothing in the API that guarantees that it would work for all other implementations of mutable.Set, even if we exclude all wrappers for the Java collections.
The for-loop
for (elem <- self) {
...
}
is desugared into foreach, which for mutable.HashSet is implemented as follows:
override def foreach[U](f: A => U) {
  var i = 0
  val len = table.length
  while (i < len) {
    val curEntry = table(i)
    if (curEntry ne null) f(entryToElem(curEntry))
    i += 1
  }
}
Essentially, it simply loops through the Array of the underlying FlatHashTable and invokes the passed function f on every element. The whole foreach simply does not have any line which could throw anything; it doesn't check for concurrent [footnote-1] modifications at all.
A ConcurrentModificationException seems to be the less troubling case: at least your program fails fast, and even returns a detailed stack trace that points to the line in which the problem occurred. It would actually be much worse if it simply deteriorated into undefined behavior without throwing anything; that would be the worst case. However, this worst case shouldn't occur for collections from the standard library: Throw ConcurrentModificationException exception's in scala collections? #188
Quote:
In scala/scala#5295 (merged in to 2.12.x) I made sure that removing the element last returned from an iterator would not cause a problem for the iterator.
So, as long as you clearly state in the documentation that only the collections from standard library are supported, you will most likely not have any problems using it in your own code. But if you use it in a public interface, this would be an invitation for a bug analogous to "SI-7269" quoted in your question.
[footnote-1] "concurrent" as in "ConcurrentModificationException", not as in "concurrently executed threads".
EDIT: I've tried to choose less ambiguous formulations. Many thanks to @Dima for the feedback and the numerous suggestions.
Yeah, you can do it, as long as you are sure this is Scala's native HashSet implementation, not a wrapper around a Java one ... and with the understanding that this is not thread-safe and should never be used concurrently (the original HashSet.retain is that way too, as are the other mutators).
Better yet, just use immutable Set.filter, unless you actually have real hard evidence (not just intuition) demonstrating that your specific case absolutely requires a mutable container.
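For instance, a trivial sketch with the default immutable Set:

val s     = Set(1, 2, 3, 4, 5, 6)   // scala.collection.immutable.Set
val evens = s.filter(_ % 2 == 0)    // Set(2, 4, 6); s itself is untouched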
Here is minimal code that raises the compilation error "recursive call not in tail position". However, I'm using @inline and the recursive call is in tail position. The reason why I'm using @inline is that I have the code of the original reccall duplicated twice.
import scala.annotation._

object Test {
  @tailrec private def test(i: Int): Int = {
    @inline def reccall(i: Int): Int = test(i-1)
    i match {
      case 0 => 0
      case i => reccall(i)
    }
  }
}
I've looked at the answers to Recursive call not in tail position @tailrec and Why does this method not compile with 'contains a recursive call not in tail position'?, but they do not apply to my case. I'm using Scala 2.12.
It appears the way @inline is implemented is that it still passes the parameters via the stack. The jump is eliminated by inserting the code inline, but the stack is still used for the arguments. This makes it impossible for the call to be in tail position, because the stack needs to be cleaned up after the call is completed.
Besides, annotating a function with @inline does not guarantee that the optimizer will inline it, just that it will "try especially hard".
Well, the mechanism of how tail recursion is realized on the JVM is explained in the following way:
Scala, in the case of tail recursion, can eliminate the creation of a
new stack frame and just re-use the current stack frame. The stack
never gets any deeper, no matter how many times the recursive call is
made.
So in your case it cannot reuse the current stack frame belonging to the test method, since it MUST create a new stack frame for the reccall method anyway.
The recursive call is indirect in this case, made from another method. So I believe you cannot really have tail recursion implemented for such a case.
You may just remove the reccall method altogether and write case i => test(i-1), and then the compiler will not complain.
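That is, something along these lines (a sketch of that workaround, keeping the original names):

import scala.annotation.tailrec

object Test {
  @tailrec private def test(i: Int): Int = i match {
    case 0 => 0
    case i => test(i - 1) // direct self call in tail position, so @tailrec is satisfied
  }
}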
NOTE: I also believe @inline has nothing to do with it and is not essential in this example, since if I remove it the compiler still complains for the same reason.
The issue here is that @inline is strictly advisory: it doesn't guarantee that the compiler will inline the function. Since @tailrec only works if it's absolutely guaranteed that the tail calls can be eliminated, this means that using @tailrec has to assume no inlining.
Using Scala's command line REPL:
def foo(x: Int): Unit = {}
def foo(x: String): Unit = {println(foo(2))}
gives
error: type mismatch;
found: Int(2)
required: String
It seems that you can't define overloaded recursive methods in the REPL. I thought this was a bug in the Scala REPL and filed it, but it was almost instantly closed as "wontfix: I don't see any way this could be supported given the semantics of the interpreter, because these two methods must be compiled together." He recommended putting the methods in an enclosing object.
Is there a JVM language implementation or Scala expert who could explain why? I can see it would be a problem if the methods called each other for instance, but in this case?
Or if this is too large a question and you think I need more prerequisite knowledge, does someone have any good links to books or sites about language implementations, especially on the JVM? (I know about John Rose's blog, and the book Programming Language Pragmatics... but that's about it. :)
The issue is due to the fact that the interpreter most often has to replace existing elements with a given name, rather than overload them. For example, I will often be experimenting with something and create a method called test:
def test(x: Int) = x + x
A little later on, let's say that I'm running a different experiment and I create another method named test, unrelated to the first:
def test(ls: List[Int]) = (0 /: ls) { _ + _ }
This isn't an entirely unrealistic scenario. In fact, it's precisely how most people use the interpreter, often without even realizing it. If the interpreter arbitrarily decided to keep both versions of test in scope, that could lead to confusing semantic differences when using test. For example, we might make a call to test, accidentally passing an Int rather than a List[Int] (not the most unlikely accident in the world):
test(1 :: Nil) // => 1
test(2) // => 4 (expecting 2)
Over time, the root scope of the interpreter would get incredibly cluttered with various versions of methods, fields, etc. I tend to leave my interpreter open for days at a time, but if overloading like this were allowed, we would be forced to "flush" the interpreter every so often as things got to be too confusing.
It's not a limitation of the JVM or the Scala compiler, it's a deliberate design decision. As mentioned in the bug, you can still overload if you're within something other than the root scope. Enclosing your test methods within a class seems like the best solution to me.
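For example, a sketch of that workaround, wrapping both overloads in a single enclosing object so they are compiled together (the name Overloads is mine; a class works just as well):

object Overloads {
  def foo(x: Int): Unit = {}
  def foo(x: String): Unit = { println(foo(2)) } // foo(2) resolves to the Int overload
}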
% scala28
Welcome to Scala version 2.8.0.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_20).
Type in expressions to have them evaluated.
Type :help for more information.
scala> def foo(x: Int): Unit = () ; def foo(x: String): Unit = { println(foo(2)) }
foo: (x: String)Unit <and> (x: Int)Unit
foo: (x: String)Unit <and> (x: Int)Unit
scala> foo(5)
scala> foo("abc")
()
The REPL will accept it if you copy both lines and paste them at the same time.
As shown by extempore's answer, it is possible to overload. Daniel's comment about it being a design decision is correct but, I think, incomplete and a bit misleading. There's no outlawing of overloads (since they are possible), but they are not easily achieved.
The design decisions that lead to this are:
All previous definitions must be available.
Only newly entered code is compiled, instead of recompiling everything ever entered every time.
It must be possible to redefine definitions (as Daniel mentioned).
It must be possible to define members such as vals and defs, not only classes and objects.
The problem is... how to achieve all these goals? How do we process your example?
def foo(x: Int): Unit = {}
def foo(x: String): Unit = {println(foo(2))}
Starting with the 4th item: a val or def can only be defined inside a class, trait, object or package object. So the REPL puts the definitions inside objects, like this (not an actual representation!):
package $line1 { // input line
  object $read { // what was read
    object $iw { // definitions
      def foo(x: Int): Unit = {}
    }
    // val res1 would be here somewhere if this was an expression
  }
}
Now, due to how the JVM works, once you have defined one of these objects, you can't extend it. You could, of course, recompile everything, but we discarded that option. So you need to place the new definition somewhere else:
package $line2 { // input line
  object $read { // what was read
    object $iw { // definitions
      def foo(x: String): Unit = { println(foo(2)) }
    }
  }
}
And this explains why your examples are not overloads: they are defined in two different places. If you put them on the same line, they'd be defined together, which would make them overloads, as shown in extempore's example.
As for the other design decisions, each new package imports the definitions and "res" results from previous packages, and imports can shadow each other, which makes it possible to "redefine" stuff.
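For illustration, a sketch (again, not the actual encoding) of how the wrapper for the second line might import the first, so that the new foo shadows, rather than overloads, the earlier one:

package $line2 {
  object $read {
    import $line1.$read.$iw._ // brings the earlier foo(Int) into scope
    object $iw {
      // this local foo shadows the imported one instead of overloading it,
      // so foo(2) can no longer resolve to the Int version; hence the type error
      def foo(x: String): Unit = { println(foo(2)) }
    }
  }
}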