how to approach implementing TCO'ed recursion - scala

I have been looking into recursion and TCO. It seems that TCO can make the code verbose and also impact performance. For example, I have implemented code that takes in a 7-digit phone number and gives back all possible permutations of words, e.g. 464-7328 can be "GMGPDAS ... IMGREAT ... IOIRFCU". Here is the code.
/* Generate the alphabet table */
val alphabet = (for (ch <- 'a' to 'z') yield ch.toString).toList

/* Given a digit, return the possible letters as a List of String (instead of Char, for convenience) */
def getChars(num: Int): List[String] = {
  if (num > 1) return List[String](alphabet((num - 2) * 3), alphabet((num - 2) * 3 + 1), alphabet((num - 2) * 3 + 2))
  List[String](num.toString)
}
/* Recursion without TCO */
def getTelWords(input: List[Int]): List[String] = {
  if (input.length == 1) return getChars(input.head)
  getChars(input.head).foldLeft(List[String]()) { (l, ch) =>
    getTelWords(input.tail).foldLeft(List[String]()) { (ll, x) => ch + x :: ll } ++ l
  }
}
It is short and I didn't have to spend much time on it. However, when I try to rewrite it with tail calls so that it gets TCO'ed, I have to spend a considerable amount of time, and the code becomes very verbose. I won't be posting the whole code to save space; here is a link to the git repo. I'm sure quite a lot of you can write better and more concise tail-recursive code than mine. I still believe that in general TCO is more verbose (e.g. the tail-recursive factorial and Fibonacci take an extra accumulator parameter), yet TCO is needed to prevent stack overflow. I would like to know how you would approach TCO and recursion. The Scheme implementation of Ackermann with TCO in this thread epitomizes my problem statement.

Is it possible that you're using the term "tail call optimization" when in fact you really mean writing a function in iterative recursive style, or in continuation-passing style, so that all the recursive calls are tail calls?
Implementing TCO is the job of a language implementer; one paper that talks about how it can be done efficiently is the classic Lambda: the Ultimate GOTO paper.
Tail call optimization is something that your language's evaluator will do for you. Your question, on the other hand, sounds like you are asking how to express functions in a particular style so that the program's shape allows your evaluator to perform tail call optimization.

As sclv mentioned in the comments, tail recursion is pointless for this example in Haskell. A simple implementation of your problem can be written succinctly and efficiently using the list monad.
import Data.Char

getChars n | n > 1     = [chr (ord 'a' + 3 * (n - 2) + i) | i <- [0..2]]
           | otherwise = ""

getTelNum = mapM getChars

As said by others, I would not worry about tail calls for this case, as it does not recurse very deeply (the length of the input) compared to the size of the output. You should be out of memory (or patience) before you are out of stack.
I would probably implement it with something like:
def getTelWords(input: List[Int]): List[String] = input match {
  case Nil => List("")
  case x :: xs =>
    val heads = getChars(x)
    val tails = getTelWords(xs)
    for (c <- heads; cs <- tails) yield c + cs
}
If you insist on a tail recursive one, that might be based on
def helper(reversedPrefixes: List[String], input: List[Int]): List[String] = input match {
  case Nil => reversedPrefixes.map(_.reverse)
  case x :: xs =>
    helper(for (c <- getChars(x); rp <- reversedPrefixes) yield c + rp, xs)
}
(the actual routine should call helper(List(""), input))
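Putting the pieces together, a complete version might look like this (a sketch; getTelWordsTR is a made-up name, and @tailrec just asks scalac to verify the helper really is optimised):
import scala.annotation.tailrec

def getTelWordsTR(input: List[Int]): List[String] = {
  @tailrec
  def helper(reversedPrefixes: List[String], rest: List[Int]): List[String] =
    rest match {
      case Nil => reversedPrefixes.map(_.reverse)
      case x :: xs =>
        // extend every reversed prefix with every letter for the current digit
        helper(for (c <- getChars(x); rp <- reversedPrefixes) yield c + rp, xs)
    }
  helper(List(""), input)
}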

Related

Scala for expressions in tail recursive form

I have a bit of a problem trying to come up with a valid way to convert a for-expression N-queens solution to a tail-recursive form while still preserving the idiomatic nature achieved by using the for syntax. Any ideas are more than welcome.
def place(boardSize: Int, n: Int): Solutions = n match {
  case 0 => List(Nil)
  case _ =>
    for {
      queens <- place(boardSize, n - 1)
      y <- 1 to boardSize
      queen = (n, y)
      if isSafe(queen, queens)
    } yield queen :: queens
}
def isSafe(queen: Queen, others: List[Queen]) = {...}
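The isSafe body is elided in the question; purely for illustration, a minimal sketch of what it might look like, assuming Queen is a (row, column) pair (rows always differ here, since place adds one queen per row):
def isSafe(queen: Queen, others: List[Queen]): Boolean =
  others.forall { other =>
    queen._2 != other._2 &&                                        // distinct columns
    math.abs(queen._1 - other._1) != math.abs(queen._2 - other._2) // distinct diagonals
  }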
What you're writing basically corresponds to what's called Depth-First Search (DFS).
Although a recursive implementation of DFS is easily written, it is not tail-recursive. Here's a proposal for a tail-recursive one. Note that I did not test this code, but it should at least give you an idea of how to proceed.
def solve(): List[List[Int]] = {
  @annotation.tailrec
  def solver(fringe: List[List[Int]], solutions: List[List[Int]]): List[List[Int]] = fringe match {
    case Nil => solutions
    case potentialSol :: fringeTail =>
      if (potentialSol.length == n) // We found a solution
        solver(fringeTail, potentialSol.reverse :: solutions)
      else { // Keep looking
        val unused = (1 to n).toList filterNot potentialSol.contains
        val children = for (u <- unused; partial = u :: potentialSol if isValid(partial)) yield partial
        solver(children ++ fringeTail, solutions)
      }
  }
  solver((1 to n).toList.map(List(_)), Nil).map(_.reverse)
}
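Note that the snippet assumes an enclosing n (the board size) and an isValid predicate; a minimal sketch of those assumed surroundings, where a partial solution is a reversed list of column indices, one per placed row:
val n = 8

def isValid(partial: List[Int]): Boolean = partial match {
  case col :: rest =>
    // rest(i) was placed i + 1 rows earlier: reject shared columns and diagonals
    rest.zipWithIndex.forall { case (c, i) =>
      c != col && math.abs(c - col) != i + 1
    }
  case Nil => true
}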
If you're concerned about performance, note that this solution is very poor because it uses slow operations on immutable data structures, and because on the JVM you're better off using iteration where performance matters. This will start failing quite rapidly as n increases. Algorithmically, there are far better ways to solve NQueens than using DFS.

Is this definition of a tail recursive fibonacci function tail-recursive?

I've seen the following F# definition of a continuation-passing-style Fibonacci function around, and I always assumed it was tail recursive:
let fib k =
  let rec fib' k cont =
    match k with
    | 0 | 1 -> cont 1
    | k -> fib' (k-1) (fun a -> fib' (k-2) (fun b -> cont (a+b)))
  fib' k id
When trying out the equivalent code in Scala, I made use of the existing @tailrec annotation and was caught off guard when the Scala compiler informed me that the recursive calls are NOT in tail position:
def fib(k: Int): Int = {
  @annotation.tailrec def go(k: Int, cont: Int => Int): Int = {
    if (k == 0 || k == 1) cont(1)
    else go(k - 1, { a => go(k - 2, { b => cont(a + b) }) })
  }
  go(k, { x => x })
}
I believe my Scala implementation is equivalent to the F# one, so I'm left wondering why the function is not tail recursive?
The second call to go (the one wrapped inside the anonymous function in the else branch) is not in tail position for go itself. (It is in tail position for that anonymous function, but not for go.)
For continuation passing style you need Proper Tail Calls, which Scala unfortunately doesn't have. (In order to provide PTCs on the JVM, you need to manage your own stack and not use the JVM call stack which breaks interoperability with other languages, however, interoperability is a major design goal of Scala.)
The JVM's support for tail call elimination is limited.
I can't speak to the F# implementation, but in the Scala version you've got nested calls to go, so it's not in tail position. The simplest way to think about it is from the stack's point of view: is there any other information the stack needs to 'remember' when making a recursive call?
In the case of the nested go calls there obviously is, because the inner call has to complete before the computation 'goes back' and completes the outer call.
Fib can be defined recursively like so:
def fib(k: Int): Int = {
  @annotation.tailrec
  def go(k: Int, p: Int, c: Int): Int = {
    if (k == 0) p
    else go(k - 1, c, p + c)
  }
  go(k, 0, 1)
}
Unfortunately, the JVM does not support tail-call optimisation yet (to be fair, it can sometimes optimise some calls). Scala implements tail-recursion optimisation via program transformation (every tail-recursive function is equivalent to a loop). This is generally enough for simple recursive functions, but mutual recursion or continuation-passing style require the full optimisation.
This is indeed problematic when using advanced functional patterns like CPS or monadic style. To avoid blowing up the stack you need to use trampolines. This works, but it is neither as convenient nor as efficient as proper tail-call optimisation. Edward Kmett's comments on the subject are a good read.
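For illustration, here is a minimal trampoline sketch using the standard library's scala.util.control.TailCalls (assuming Scala 2.10+, where TailRec has map/flatMap): each recursive step is reified as a heap-allocated TailRec value, and result drives the loop, so the doubly recursive fib no longer consumes JVM stack:
import scala.util.control.TailCalls._

def fib(k: Int): TailRec[Int] =
  if (k <= 1) done(1)
  else for {
    a <- tailcall(fib(k - 1))   // suspended, not a JVM stack frame
    b <- tailcall(fib(k - 2))
  } yield a + b

fib(25).result  // 121393, computed without deep JVM stack usage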

Convert normal recursion to tail recursion

I was wondering if there is some general method to convert a "normal" recursion with foo(...) + foo(...) as the last call into a tail recursion.
For example (Scala):
def pascal(c: Int, r: Int): Int = {
  if (c == 0 || c == r) 1
  else pascal(c - 1, r - 1) + pascal(c, r - 1)
}
A general approach in functional languages for converting a recursive function into a tail-call equivalent:
A simple way is to wrap the non-tail-recursive function in the Trampoline monad.
def pascalM(c: Int, r: Int): Trampoline[Int] = {
  if (c == 0 || c == r) Trampoline.done(1)
  else for {
    a <- Trampoline.suspend(pascalM(c - 1, r - 1))
    b <- Trampoline.suspend(pascalM(c, r - 1))
  } yield a + b
}

val result = pascalM(10, 5).run
So pascalM is not a stack-recursive function anymore. Instead, the Trampoline monad builds a nested structure describing the computation that needs to be done. Finally, run is a tail-recursive function that walks through that tree-like structure, interpreting it, and at the base case returns the value.
A paper from Rúnar Bjarnason on the subject of trampolines: Stackless Scala With Free Monads.
In cases where there is a simple modification to the value of a recursive call, that operation can be moved to the front of the recursive function. The classic example of this is Tail recursion modulo cons, where a simple recursive function in this form:
def recur[A](...): List[A] = {
  ...
  x :: recur(...)
}
which is not tail recursive, is transformed into
def recur[A](...): List[A] = {
  def consRecur(..., consA: A): List[A] = {
    consA :: ...
    ...
    consRecur(..., ...)
  }
  ...
  consRecur(..., ...)
}
Alexlv's example is a variant of this.
This is such a well known situation that some compilers (I know of Prolog and Scheme examples but Scalac does not do this) can detect simple cases and perform this optimisation automatically.
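Since scalac does not do it, the usual hand-rolled substitute in Scala is to accumulate in reverse and flip the result once at the end. A minimal sketch with a made-up example (doubleAll is not from the thread):
// Not tail recursive: the :: happens after the recursive call returns.
def doubleAll(xs: List[Int]): List[Int] = xs match {
  case Nil => Nil
  case y :: ys => (2 * y) :: doubleAll(ys)
}

// Tail-recursive substitute: accumulate in reverse, then reverse once.
def doubleAllTR(xs: List[Int]): List[Int] = {
  @annotation.tailrec
  def loop(rest: List[Int], acc: List[Int]): List[Int] = rest match {
    case Nil => acc.reverse
    case y :: ys => loop(ys, (2 * y) :: acc)
  }
  loop(xs, Nil)
}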
Problems combining multiple calls to recursive functions have no such simple solution. TMRC optimisation is useless, as you are simply moving the first recursive call to another non-tail position. The only way to reach a tail-recursive solution is to remove all but one of the recursive calls; how to do this is entirely context dependent and requires finding an entirely different approach to solving the problem.
As it happens, in some ways your example is similar to the classic Fibonacci sequence problem; in that case the naive but elegant doubly-recursive solution can be replaced by one which loops forward from the 0th number.
def fib(n: Long): Long = n match {
  case 0 | 1 => n
  case _ => fib(n - 2) + fib(n - 1)
}

def fib(n: Long): Long = {
  def loop(current: Long, next: => Long, iteration: Long): Long = {
    if (n == iteration) current
    else loop(next, current + next, iteration + 1)
  }
  loop(0, 1, 0)
}
For the Fibonacci sequence, this is the most efficient approach (a stream-based solution is just a different expression of this solution that can cache results for subsequent calls). Now,
you can also solve your problem by looping forward from c0/r0 (well, c0/r2) and calculating each row in sequence - the difference being that you need to cache the entire previous row. So while this has a similarity to fib, it differs dramatically in the specifics and is also significantly less efficient than your original, doubly-recursive solution.
Here's an approach for your pascal triangle example which can calculate pascal(30,60) efficiently:
def pascal(column: Long, row: Long): Long = {
  type Point = (Long, Long)
  type Points = List[Point]
  type Triangle = Map[Point, Long]

  def above(p: Point) = (p._1, p._2 - 1)
  def aboveLeft(p: Point) = (p._1 - 1, p._2 - 1)

  def find(ps: Points, t: Triangle): Long = ps match {
    // Found the ultimate goal
    case (p :: Nil) if t contains p => t(p)
    // Found an intermediate point: pop the stack and carry on
    case (p :: rest) if t contains p => find(rest, t)
    // Hit a triangle edge, add it to the triangle
    case ((c, r) :: _) if (c == 0) || (c == r) => find(ps, t + ((c, r) -> 1))
    // Triangle contains (c - 1, r - 1)...
    case (p :: _) if t contains aboveLeft(p) =>
      if (t contains above(p))
        // And it contains (c, r - 1)! Add to the triangle
        find(ps, t + (p -> (t(aboveLeft(p)) + t(above(p)))))
      else
        // Does not contain (c, r - 1). So find that
        find(above(p) :: ps, t)
    // If we get here, we don't have (c - 1, r - 1). Find that.
    case (p :: _) => find(aboveLeft(p) :: ps, t)
  }

  require(column >= 0 && row >= 0 && column <= row)
  (column, row) match {
    case (c, r) if (c == 0) || (c == r) => 1
    case p => find(List(p), Map())
  }
}
It's efficient, but I think it shows how ugly complex recursive solutions can become as you deform them to become tail recursive. At this point, it may be worth moving to a different model entirely. Continuations or monadic gymnastics might be better.
You want a generic way to transform your function. There isn't one. There are helpful approaches, that's all.
I don't know how theoretical this question is, but a recursive implementation won't be efficient even with tail-recursion. Try computing pascal(30, 60), for example. I don't think you'll get a stack overflow, but be prepared to take a long coffee break.
Instead, consider using a Stream or memoization:
val pascal: Stream[Stream[Long]] =
  (Stream(1L)
    #:: (Stream from 1 map { i =>
      // compute row i
      (1L
        #:: (pascal(i - 1) // take the previous row
          sliding 2        // and add adjacent values pairwise
          collect { case Stream(a, b) => a + b }).toStream
        ++ Stream(1L))
    }))
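Indexing into that stream, with both row and column zero-based, looks like this (a usage sketch):
pascal(4)(2)  // 6 -- row 4 is 1 4 6 4 1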
The accumulator approach
def pascal(c: Int, r: Int): Int = {
  def pascalAcc(acc: Int, leftover: List[(Int, Int)]): Int = {
    if (leftover.isEmpty) acc
    else {
      val (c1, r1) = leftover.head
      // Edge.
      if (c1 == 0 || c1 == r1) pascalAcc(acc + 1, leftover.tail)
      // Safety checks.
      else if (c1 < 0 || r1 < 0 || c1 > r1) pascalAcc(acc, leftover.tail)
      // Add 2 other points to the work list.
      else pascalAcc(acc, (c1, r1 - 1) :: (c1 - 1, r1 - 1) :: leftover.tail)
    }
  }
  pascalAcc(0, List((c, r)))
}
It does not overflow the stack, but for a big row and column, as Aaron mentioned, it's not fast.
Yes, it's possible. Usually it's done with the accumulator pattern, via an internally defined helper function that takes an additional accumulator argument. A classic example is counting the length of a list.
The normal recursive version would look like this:
def length[A](xs: List[A]): Int = if (xs.isEmpty) 0 else 1 + length(xs.tail)
That's not a tail-recursive version; in order to eliminate the final addition operation we have to accumulate the value somehow along the way, for example with the accumulator pattern:
def length[A](xs: List[A]) = {
  def inner(ys: List[A], acc: Int): Int = {
    if (ys.isEmpty) acc else inner(ys.tail, acc + 1)
  }
  inner(xs, 0)
}
It's a bit longer to code, but I think the idea is clear. Of course you can do it without an inner function, but in that case you have to provide the initial accumulator value manually.
I'm pretty sure it's not possible in the simple way you're looking for in the general case, but it depends on how elaborate you permit the changes to be.
A tail-recursive function must be re-writable as a while loop, but try implementing, for example, a fractal tree using while loops. It's possible, but you need to use an array or collection to store the state for each point, which substitutes for the data otherwise stored in the call stack. A sketch of that technique follows below.
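To make that concrete, a small sketch (an in-order tree walk rather than a fractal tree, but the same idea): an explicit list of pending work replaces the call stack, so a doubly recursive traversal becomes a single while loop:
sealed trait Tree
case object Leaf extends Tree
case class Node(left: Tree, value: Int, right: Tree) extends Tree

def inorder(root: Tree): List[Int] = {
  // Left(t) means "still to visit subtree t"; Right(v) means "emit value v".
  var stack: List[Either[Tree, Int]] = List(Left(root))
  var out = List.empty[Int]
  while (stack.nonEmpty) {
    stack match {
      case Left(Leaf) :: rest          => stack = rest
      case Left(Node(l, v, r)) :: rest => stack = Left(l) :: Right(v) :: Left(r) :: rest
      case Right(v) :: rest            => out = v :: out; stack = rest
      case Nil                         => // unreachable: loop guard
    }
  }
  out.reverse
}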
It's also possible to use trampolining.
It is indeed possible. The way I'd do this is to begin with List(1) and keep recursing till you get to the row you want.
It's worth noting that you can optimize it: if c == 0 or c == r the value is one, and to calculate, say, column 3 of the 100th row you still only need to calculate the first three elements of the previous rows.
A working tail-recursive solution would be this:
def pascal(c: Int, r: Int): Int = {
  @annotation.tailrec
  def pascalAcc(c: Int, r: Int, acc: List[Int]): List[Int] = {
    if (r == 0) acc
    else pascalAcc(c, r - 1,
      // from, say, 1 3 3 1 builds 0 1 3 3 1 0, takes only the
      // subset that matters (if asking for col c, no cols after c are
      // used) and uses sliding to build (0 1) (1 3) (3 3) etc.
      (0 +: acc :+ 0).take(c + 2)
        .sliding(2, 1).map { x => x.reduce(_ + _) }.toList)
  }
  if (c == 0 || c == r) 1
  else pascalAcc(c, r, List(1))(c)
}
The @tailrec annotation makes the compiler check that the function actually is tail recursive.
It could probably be further optimized: given that the rows are symmetric, if c > r/2 then pascal(c, r) == pascal(r - c, r)... but that's left to the reader ;)

Is Scala idiomatic coding style just a cool trap for writing inefficient code?

I sense that the Scala community has a bit of an obsession with writing "concise", "cool", "Scala idiomatic", "one-liner" (if possible) code. This is immediately followed by a comparison to Java/imperative/ugly code.
While this (sometimes) leads to easy-to-understand code, it also leads to inefficient code for 99% of developers. And this is where Java/C++ is not easy to beat.
Consider this simple problem: Given a list of integers, remove the greatest element. Ordering does not need to be preserved.
Here is my version of the solution (It may not be the greatest, but it's what the average non-rockstar developer would do).
def removeMaxCool(xs: List[Int]) = {
  val maxIndex = xs.indexOf(xs.max)
  xs.take(maxIndex) ::: xs.drop(maxIndex + 1)
}
It's Scala idiomatic, concise, and uses a few nice list functions. It's also very inefficient. It traverses the list at least 3 or 4 times.
Here is my totally uncool, Java-like solution. It's also what a reasonable Java developer (or Scala novice) would write.
def removeMaxFast(xs: List[Int]) = {
  var res = ArrayBuffer[Int]()
  var max = xs.head
  var first = true
  for (x <- xs) {
    if (first) {
      first = false
    } else {
      if (x > max) {
        res.append(max)
        max = x
      } else {
        res.append(x)
      }
    }
  }
  res.toList
}
Totally non-Scala idiomatic, non-functional, non-concise, but it's very efficient. It traverses the list only once!
So, if 99% of Java developers write more efficient code than 99% of Scala developers, this is a huge
obstacle to cross for greater Scala adoption. Is there a way out of this trap?
I am looking for practical advice to avoid such "inefficiency traps" while keeping the implementation clear and concise.
Clarification: This question comes from a real-life scenario: I had to write a complex algorithm. First I wrote it in Scala, then I "had to" rewrite it in Java. The Java implementation was twice as long, and not as clear, but at the same time it was twice as fast. Rewriting the Scala code to be efficient would probably take some time and a somewhat deeper understanding of Scala's internal efficiencies (for vs. map vs. fold, etc.).
Let's discuss a fallacy in the question:
So, if 99% of Java developers write more efficient code than 99% of
Scala developers, this is a huge obstacle to cross for greater Scala
adoption. Is there a way out of this trap?
This is presumed, with absolutely no evidence backing it up. If false, the question is moot.
Is there evidence to the contrary? Well, let's consider the question itself -- it doesn't prove anything, but shows things are not that clear.
Totally non-Scala idiomatic, non-functional, non-concise, but it's
very efficient. It traverses the list only once!
Of the four claims in the first sentence, the first three are true, and the fourth, as shown by user unknown, is false! And why it is false? Because, contrary to what the second sentence states, it traverses the list more than once.
The code calls the following methods on it:
res.append(max)
res.append(x)
and
res.toList
Let's consider first append.
append takes a vararg parameter. That means max and x are first encapsulated into a sequence of some type (a WrappedArray, in fact), and then passed as parameter. A better method would have been +=.
Ok, append calls ++=, which delegates to +=. But, first, it calls ensureSize, which is the second mistake (+= calls that too -- ++= just optimizes that for multiple elements). Because an Array is a fixed size collection, which means that, at each resize, the whole Array must be copied!
So let's consider this. When you resize, Java first clears the memory by storing 0 in each element, then Scala copies each element of the previous array over to the new array. Since size doubles each time, this happens log(n) times, with the number of elements being copied increasing each time it happens.
Take for example n = 16. It does this four times, copying 1, 2, 4 and 8 elements respectively. Since Java has to clear each of these arrays, and each element must be read and written, each element copied represents 4 traversals of an element. Adding all we have (n - 1) * 4, or, roughly, 4 traversals of the complete list. If you count read and write as a single pass, as people often erroneously do, then it's still three traversals.
One can improve on this by initializing the ArrayBuffer with an initial size equal to the list that will be read, minus one, since we'll be discarding one element. To get this size, we need to traverse the list once, though.
Now let's consider toList. To put it simply, it traverses the whole list to create a new list.
So, we have 1 traversal for the algorithm, 3 or 4 traversals for resize, and 1 additional traversal for toList. That's 4 or 5 traversals.
The original algorithm is a bit difficult to analyse, because take, drop and ::: traverse a variable number of elements. Adding all together, however, it does the equivalent of 3 traversals. If splitAt was used, it would be reduced to 2 traversals. With 2 more traversals to get the maximum, we get 5 traversals -- the same number as the non-functional, non-concise algorithm!
So, let's consider improvements.
On the imperative algorithm, if one uses ListBuffer and +=, then all methods are constant-time, which reduces it to a single traversal.
On the functional algorithm, it could be rewritten as:
val max = xs.max
val (before, _ :: after) = xs span (max !=)
before ::: after
That reduces it to a worst case of three traversals. Of course, there are other alternatives presented, based on recursion or fold, that solve it in one traversal.
And, most interesting of all, all of these algorithms are O(n), and the only one which almost incurred (accidentally) in worst complexity was the imperative one (because of array copying). On the other hand, the cache characteristics of the imperative one might well make it faster, because the data is contiguous in memory. That, however, is unrelated to either big-Oh or functional vs imperative, and it is just a matter of the data structures that were chosen.
So, if we actually go to the trouble of benchmarking, analyzing the results, considering performance of methods, and looking into ways of optimizing it, then we can find faster ways to do this in an imperative manner than in a functional manner.
But all this effort is very different from saying the average Java programmer code will be faster than the average Scala programmer code -- if the question is an example, that is simply false. And even discounting the question, we have seen no evidence that the fundamental premise of the question is true.
EDIT
First, let me restate my point, because it seems I wasn't clear. My point is that the code the average Java programmer writes may seem to be more efficient, but actually isn't. Or, put another way, traditional Java style doesn't gain you performance -- only hard work does, be it Java or Scala.
Next, I have a benchmark and results too, including almost all solutions suggested. Two interesting points about it:
Depending on list size, the creation of objects can have a bigger impact than multiple traversals of the list. The original functional code by Adrian takes advantage of the fact that lists are persistent data structures by not copying the elements right of the maximum element at all. If a Vector was used instead, both left and right sides would be mostly unchanged, which might lead to even better performance.
Even though user unknown and paradigmatic have similar recursive solutions, paradigmatic's is way faster. The reason for that is that he avoids pattern matching. Pattern matching can be really slow.
The benchmark code is here, and the results are here.
def removeOneMax(xs: List[Int]): List[Int] = xs match {
  case x :: Nil => Nil
  case a :: b :: xs => if (a < b) a :: removeOneMax(b :: xs) else b :: removeOneMax(a :: xs)
  case Nil => Nil
}
Here is a recursive method, which only iterates once. If you need performance, you have to think about it, if not, not.
You can make it tail-recursive in the standard way: add an extra parameter carry, which defaults to the empty List and collects the result while iterating. That is, of course, a bit longer, but if you need performance, you have to pay for it:
import annotation.tailrec

@tailrec
def removeOneMax(xs: List[Int], carry: List[Int] = List.empty): List[Int] = xs match {
  case a :: b :: xs => if (a < b) removeOneMax(b :: xs, a :: carry) else removeOneMax(a :: xs, b :: carry)
  case x :: Nil => carry
  case Nil => Nil
}
I don't know what the chances are, that later compilers will improve slower map-calls to be as fast as while-loops. However: You rarely need high speed solutions, but if you need them often, you will learn them fast.
Do you know how big your collection has to be, to use a whole second for your solution on your machine?
As a one-liner, similar to Daniel C. Sobral's solution:
((Nil : List[Int], xs(0)) /: xs.tail) ((p, x)=> if (p._2 > x) (x :: p._1, p._2) else ((p._2 :: p._1), x))._1
but that is hard to read, and I didn't measure the effective performance. The normal pattern is (x /: xs)((a, b) => /* something */). Here, x and a are pairs of list-so-far and max-so-far, which solves the problem of bringing everything into one line of code, but it isn't very readable. However, you can earn reputation on CodeGolf this way, and maybe someone would like to make a performance measurement.
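For readability, the same fold can be written out like this (a sketch; removeMaxFold is a made-up name):
// Accumulator pair: (everything-but-the-max-so-far, max-so-far).
def removeMaxFold(xs: List[Int]): List[Int] =
  xs.tail.foldLeft((List.empty[Int], xs.head)) { (p, x) =>
    val (kept, max) = p
    if (max > x) (x :: kept, max) else (max :: kept, x)
  }._1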
And now, to our big surprise, some measurements:
An updated timing method (to get garbage collection out of the way and let the HotSpot compiler warm up), a main, and many methods from this thread, together in an object named PerfRemMax:
object PerfRemMax {

  def timed(name: String, xs: List[Int])(f: List[Int] => List[Int]) = {
    val a = System.currentTimeMillis
    val res = f(xs)
    val z = System.currentTimeMillis
    val delta = z - a
    println(name + ": " + (delta / 1000.0))
    res
  }

  def main(args: Array[String]): Unit = {
    val n = args(0).toInt
    val funs: List[(String, List[Int] => List[Int])] = List(
      "indexOf/take-drop" -> adrian1 _,
      "arraybuf" -> adrian2 _, /* out of memory */
      "paradigmatic1" -> pm1 _,
      "paradigmatic2" -> pm2 _,
      // "match" -> uu1 _, /* oom */
      "tailrec match" -> uu2 _,
      "foldLeft" -> uu3 _,
      "buf-=buf.max" -> soc1 _,
      "for/yield" -> soc2 _,
      "splitAt" -> daniel1,
      "ListBuffer" -> daniel2
    )
    val r = util.Random
    val xs = (for (x <- 1 to n) yield r.nextInt(n)).toList
    // With 1 Mio. as param, it starts with 100k, 200k, 300k, ... 1 Mio. cases.
    // a) warmup
    // b) look where the process gets linear to size
    funs.foreach { f =>
      (1 to 10) foreach { i =>
        timed(f._1, xs.take(n / 10 * i))(f._2)
        compat.Platform.collectGarbage
      }
      println()
    }
  }
}
I renamed all the methods, and had to modify uu2 a bit to fit the common method signature (List[Int] => List[Int]).
From the long result, I only provide the output for 1M invocations:
scala -Dserver PerfRemMax 2000000
indexOf/take-drop: 0.882
arraybuf: 1.681
paradigmatic1: 0.55
paradigmatic2: 1.13
tailrec match: 0.812
foldLeft: 1.054
buf-=buf.max: 1.185
for/yield: 0.725
splitAt: 1.127
ListBuffer: 0.61
The numbers aren't completely stable; they depend on the sample size and vary a bit from run to run. For example, for 100k to 1M runs, in steps of 100k, the timing for splitAt was as follows:
splitAt: 0.109
splitAt: 0.118
splitAt: 0.129
splitAt: 0.139
splitAt: 0.157
splitAt: 0.166
splitAt: 0.749
splitAt: 0.752
splitAt: 1.444
splitAt: 1.127
The initial solution is already pretty fast. splitAt is a modification from Daniel, often faster, but not always.
The measurement was done on a single core 2Ghz Centrino, running xUbuntu Linux, Scala-2.8 with Sun-Java-1.6 (desktop).
The two lessons for me are:
always measure your performance improvements; it is very hard to estimate if you don't do it on a daily basis
it is not only fun to write functional code - sometimes the result is even faster
Here is a link to my benchmark code, if anybody is interested.
First of all, the behavior of the methods you presented is not the same. The first one keeps the element ordering, while the second one doesn't.
Second, among all the possible solutions which could be qualified as "idiomatic", some are more efficient than others. Staying very close to your example, you can for instance use tail recursion to eliminate variables and manual state management:
def removeMax1(xs: List[Int]) = {
  def rec(max: Int, rest: List[Int], result: List[Int]): List[Int] = {
    if (rest.isEmpty) result
    else if (rest.head > max) rec(rest.head, rest.tail, max :: result)
    else rec(max, rest.tail, rest.head :: result)
  }
  rec(xs.head, xs.tail, List())
}
or fold the list:
def removeMax2(xs: List[Int]) = {
  val result = xs.tail.foldLeft(xs.head -> List[Int]()) { (acc, x) =>
    val (max, res) = acc
    if (x > max) x -> (max :: res)
    else max -> (x :: res)
  }
  result._2
}
If you want to keep the original insertion order, you can (at the expense of two passes rather than one) without any effort write something like:
def removeMax3(xs: List[Int]) = {
  val max = xs.max
  xs.filterNot(_ == max)
}
which is more clear than your first example.
The biggest inefficiency when you're writing a program is worrying about the wrong things, and this is usually the wrong thing to worry about. Why?
Developer time is generally much more expensive than CPU time — in fact, there is usually a dearth of the former and a surplus of the latter.
Most code does not need to be very efficient because it will never be running on million-item datasets multiple times every second.
Most code does need to be bug-free, and less code means less room for bugs to hide.
The example you gave is not very functional, actually. Here's what you are doing:
// Given a list of Int
def removeMaxCool(xs: List[Int]): List[Int] = {
  // Find the index of the biggest Int
  val maxIndex = xs.indexOf(xs.max)
  // Then take the ints before and after it, and concatenate them
  xs.take(maxIndex) ::: xs.drop(maxIndex + 1)
}
Mind you, it is not bad, but you know functional code is at its best when it describes what you want, instead of how you want it. As a minor criticism, if you used splitAt instead of take and drop you could improve it slightly.
Another way of doing it is this:
def removeMaxCool(xs: List[Int]): List[Int] = {
  // The result is the folding of the tail over the head
  // and an empty list
  xs.tail.foldLeft(xs.head -> List[Int]()) {
    // Where the accumulated list is increased by the
    // lesser of the current element and the accumulated
    // element, and the accumulated element is the maximum between them
    case ((max, ys), x) =>
      if (x > max) (x, max :: ys)
      else (max, x :: ys)
  // and of which we return only the accumulated list
  }._2
}
Now, let's discuss the main issue. Is this code slower than the Java one? Most certainly! Is the Java code slower than a C equivalent? You can bet it is, JIT or no JIT. And if you write it directly in assembler, you can make it even faster!
But the cost of that speed is that you get more bugs, you spend more time trying to understand the code to debug it, and you have less visibility of what the overall program is doing as opposed to what a little piece of code is doing -- which might result in performance problems of its own.
So my answer is simple: if you think the speed penalty of programming in Scala is not worth the gains it brings, you should program in assembler. If you think I'm being radical, then I counter that you just chose the familiar as being the "ideal" trade off.
Do I think performance doesn't matter? Not at all! I think one of the main advantages of Scala is leveraging gains often found in dynamically typed languages with the performance of a statically typed language! Performance matters, algorithm complexity matters a lot, and constant costs matters too.
But, whenever there is a choice between performance and readability and maintainability, the latter is preferable. Sure, if performance must be improved, then there isn't a choice: you have to sacrifice something to it. And if there's no loss in readability/maintainability -- such as Scala vs dynamically typed languages -- sure, go for performance.
Lastly, to gain performance out of functional programming you have to know functional algorithms and data structures. Sure, 99% of Java programmers with 5-10 years experience will beat the performance of 99% of Scala programmers with 6 months experience. The same was true for imperative programming vs object oriented programming a couple of decades ago, and history shows it didn't matter.
EDIT
As a side note, your "fast" algorithm suffers from a serious problem: you use ArrayBuffer. That collection does not have constant-time append, and has linear-time toList. If you use ListBuffer instead, you get constant-time append and toList.
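For illustration, a sketch of that ListBuffer variant (presumably close to what the earlier benchmark labels "ListBuffer"):
import scala.collection.mutable.ListBuffer

def removeMaxListBuffer(xs: List[Int]): List[Int] = {
  val res = ListBuffer[Int]()   // += and toList are both cheap on ListBuffer
  var max = xs.head
  for (x <- xs.tail) {
    if (x > max) { res += max; max = x }
    else res += x
  }
  res.toList
}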
For reference, here's how splitAt is defined in TraversableLike in the Scala standard library,
def splitAt(n: Int): (Repr, Repr) = {
  val l, r = newBuilder
  l.sizeHintBounded(n, this)
  if (n >= 0) r.sizeHint(this, -n)
  var i = 0
  for (x <- this) {
    (if (i < n) l else r) += x
    i += 1
  }
  (l.result, r.result)
}
It's not unlike your example code of what a Java programmer might come up with.
I like Scala because, where performance matters, mutability is a reasonable way to go. The collections library is a great example; especially how it hides this mutability behind a functional interface.
Where performance isn't as important, such as some application code, the higher order functions in Scala's library allow great expressivity and programmer efficiency.
Out of curiosity, I picked an arbitrary large file in the Scala compiler (scala.tools.nsc.typechecker.Typers.scala) and counted something like 37 for loops, 11 while loops, 6 concatenations (++), and 1 fold (it happens to be a foldRight).
What about this?
def removeMax(xs: List[Int]) = {
  val buf = xs.toBuffer
  buf -= buf.max
  buf.toList
}
A bit more ugly, but faster:
def removeMax(xs: List[Int]) = {
  var max = xs.head
  for (x <- xs.tail) yield {
    if (x > max) { val result = max; max = x; result }
    else x
  }
}
Try this:
(myList.foldLeft((List[Int](), None: Option[Int])) {
  case ((_, None), x) => (List(), Some(x))
  case ((Nil, Some(m)), x) => (List(Math.min(x, m)), Some(Math.max(x, m)))
  case ((l, Some(m)), x) => (Math.min(x, m) :: l, Some(Math.max(x, m)))
})._1
Idiomatic, functional, traverses only once. Maybe somewhat cryptic if you are not used to functional-programming idioms.
Let's try to explain what is happening here. I will try to make it as simple as possible, lacking some rigor.
A fold is an operation on a List[A] (that is, a list that contains elements of type A) that will take an initial state s0: S (that is, an instance of a type S) and a function f: (S, A) => S (that is, a function that takes the current state and an element from the list, and gives the next state, ie, it updates the state according to the next element).
The operation will then iterate over the elements of the list, using each one to update the state according to the given function. In Java, it would be something like:
interface Function<T, R> { R apply(T t); }
class Pair<A, B> { ... }

<A, State> State fold(List<A> list, State s0, Function<Pair<A, State>, State> f) {
  State s = s0;
  for (A a : list) {
    s = f.apply(new Pair<A, State>(a, s));
  }
  return s;
}
For example, if you want to add all the elements of a List[Int], the state would be the partial sum, that would have to be initialized to 0, and the new state produced by a function would simply add the current state to the current element being processed:
myList.fold(0)((partialSum, element) => partialSum + element)
Try to write a fold to multiply the elements of a list, then another one to find extreme values (max, min).
Now, the fold presented above is a bit more complex, since the state is composed of the new list being created along with the maximum element found so far. The function that updates the state is more or less straightforward once you grasp these concepts. It simply puts into the new list the minimum between the current maximum and the current element, while the other value goes to the current maximum of the updated state.
What is a bit more complex than to understand this (if you have no FP background) is to come up with this solution. However, this is only to show you that it exists, can be done. It's just a completely different mindset.
EDIT: As you can see, the first and second cases in the solution I proposed are used to set up the fold. It is equivalent to what you see in other answers when they do xs.tail.fold((xs.head, ...)) {...}. Note that the solutions proposed so far using xs.tail/xs.head don't cover the case in which xs is List(), and will throw an exception. The solution above will return List() instead. Since you didn't specify the behavior of the function on empty lists, both are valid.
Another option would be:
package code.array

object SliceArrays {
  def main(args: Array[String]): Unit = {
    println(removeMaxCool(Vector(1, 2, 3, 100, 12, 23, 44)))
  }
  def removeMaxCool(xs: Vector[Int]) = xs.filter(_ < xs.max)
}
Using Vector instead of List: the reason is that Vector is more versatile and has better general performance and time complexity compared to List.
Consider the following collections operations:
head, tail, apply, update, prepend, append
Vector takes an amortized constant time for all operations, as per Scala docs:
"The operation takes effectively constant time, but this might depend on some assumptions such as maximum length of a vector or distribution of hash keys"
While List takes constant time only for head, tail and prepend operations.
Using
scalac -print
generates:
package code.array {
object SliceArrays extends Object {
def main(args: Array[String]): Unit = scala.Predef.println(SliceArrays.this.removeMaxCool(scala.`package`.Vector().apply(scala.Predef.wrapIntArray(Array[Int]{1, 2, 3, 100, 12, 23, 44})).$asInstanceOf[scala.collection.immutable.Vector]()));
def removeMaxCool(xs: scala.collection.immutable.Vector): scala.collection.immutable.Vector = xs.filter({
((x$1: Int) => SliceArrays.this.$anonfun$removeMaxCool$1(xs, x$1))
}).$asInstanceOf[scala.collection.immutable.Vector]();
final <artifact> private[this] def $anonfun$removeMaxCool$1(xs$1: scala.collection.immutable.Vector, x$1: Int): Boolean = x$1.<(scala.Int.unbox(xs$1.max(scala.math.Ordering$Int)));
def <init>(): code.array.SliceArrays.type = {
SliceArrays.super.<init>();
()
}
}
}
Another contender. This uses a ListBuffer, like Daniel's second offering, but shares the post-max tail of the original list, avoiding copying it.
def shareTail(xs: List[Int]): List[Int] = {
  val res = ListBuffer[Int]()
  var maxTail = xs   // points at the sublist starting at the current maximum
  var x = xs
  while (x != Nil) {
    if (x.head > maxTail.head) {
      // Copy everything between the old maximum and the new one into res.
      while (maxTail.head != x.head) {
        res += maxTail.head
        maxTail = maxTail.tail
      }
    }
    x = x.tail
  }
  res.prependToList(maxTail.tail)
}

Selection sort in functional Scala

I'm making my way through "Programming in Scala" and wrote a quick implementation of the selection sort algorithm. However, since I'm still a bit green at functional programming, I'm having trouble translating it to a more Scala-ish style. For the Scala programmers out there: how can I do this using Lists and vals rather than falling back into my imperative ways?
http://gist.github.com/225870
As starblue already said, you need a function that calculates the minimum of a list and returns the list with that element removed. Here is my tail-recursive implementation of something similar (I believe foldl is tail recursive in the standard library), and I tried to make it as functional as possible :). It returns a list that contains all the elements of the original list (but kind of reversed - see the explanation below) with the minimum as the head.
def minimum(xs: List[Int]): List[Int] =
  (List(xs.head) /: xs.tail) { (ys, x) =>
    if (x < ys.head) x :: ys
    else ys.head :: x :: ys.tail
  }
This basically does a fold, starting with a list containing the first element of xs. If the current element of xs is smaller than the head of that list, we prepend it to the list ys. Otherwise, we add it to the list ys as the second element. And so on recursively: we've folded our list into a new list containing the minimum element as the head and, as the tail, a list containing all the elements of xs (not necessarily in the same order) with the minimum removed. Note that this function does not remove duplicates.
After creating this helper function, it's now easy to implement selection sort.
def selectionSort(xs: List[Int]): List[Int] =
  if (xs.isEmpty) List()
  else {
    val ys = minimum(xs)
    if (ys.tail.isEmpty) ys
    else ys.head :: selectionSort(ys.tail)
  }
Unfortunately this implementation is not tail recursive, so it will blow up the stack for large lists. Anyway, you shouldn't use an O(n^2) sort for large lists, but still... it would be nice if the implementation were tail recursive. I'll try to think of something... I think it will look like the implementation of a fold.
Tail Recursive!
To make it tail recursive, I use quite a common pattern in functional programming: an accumulator. It works a bit backward, as now I need a function called maximum, which basically does the same as minimum but with the maximum element - its implementation is exactly like minimum, but using > instead of <.
def selectionSort(xs: List[Int]) = {
  def selectionSortHelper(xs: List[Int], accumulator: List[Int]): List[Int] =
    if (xs.isEmpty) accumulator
    else {
      val ys = maximum(xs)
      selectionSortHelper(ys.tail, ys.head :: accumulator)
    }
  selectionSortHelper(xs, Nil)
}
EDIT: Changed the answer to have the helper function as a subfunction of the selection sort function.
It basically accumulates the maxima to a list, which it eventually returns as the base case. You can also see that it is tail recursive by replacing accumulator by throw new NullPointerException - and then inspect the stack trace.
Here's a step by step sorting using an accumulator. The left hand side shows the list xs while the right hand side shows the accumulator. The maximum is indicated at each step by a star.
64* 25 12 22 11 ------- Nil
11 22 12 25* ------- 64
22* 12 11 ------- 25 64
11 12* ------- 22 25 64
11* ------- 12 22 25 64
Nil ------- 11 12 22 25 64
The following shows a step by step folding to calculate the maximum:
maximum(25 12 64 22 11)
25 :: Nil /: 12 64 22 11 -- 25 > 12, so it stays as head
25 :: 12 /: 64 22 11 -- same as above
64 :: 25 12 /: 22 11 -- 25 < 64, so the new head is 64
64 :: 22 25 12 /: 11 -- and stays so
64 :: 11 22 25 12 /: Nil -- until the end
64 11 22 25 12
You're bound to have problems doing selection sort in functional style, as it is an in-place sort algorithm. In-place, by definition, isn't functional.
The main problem you'll face is that you can't swap elements. Here's why this is important. Suppose I have a list (a_0 ... a_x ... a_n), where a_x is the minimum value. You need to get a_x out, and then compose a list (a_0 ... a_(x-1) a_(x+1) ... a_n). The problem is that you'll necessarily have to copy the elements a_0 to a_(x-1) if you wish to remain purely functional. Other functional data structures, particularly trees, can have better performance than this, but the basic problem remains.
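A sketch of that copy cost in code: dropping the element at index x copies the prefix but shares the suffix with the original list (removeAt is just an illustrative name):
def removeAt[A](xs: List[A], x: Int): List[A] =
  xs.take(x) ::: xs.drop(x + 1)   // take copies x cells; the suffix is shared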
Here is another implementation of selection sort (a generic version).
def less[T <: Comparable[T]](i: T, j: T) = i.compareTo(j) < 0

def swap[T](xs: Array[T], i: Int, j: Int) { val tmp = xs(i); xs(i) = xs(j); xs(j) = tmp }

def selectiveSort[T <: Comparable[T]](xs: Array[T]) {
  val n = xs.size
  for (i <- 0 until n) {
    val min = List.range(i + 1, n).foldLeft(i)((a, b) => if (less(xs(a), xs(b))) a else b)
    swap(xs, i, min)
  }
}
You need a helper function which does the selection. It should return the minimal element and the rest of the list with the element removed.
I think it's reasonably feasible to do a selection sort in a functional style, but as Daniel indicated, it has a good chance of performing horribly.
I just tried my hand at writing a functional bubble sort, as a slightly simpler and degenerate case of selection sort. Here's what I did, and this hints at what you could do:
define bubble(data)
  if data is empty or just one element: return data;
  otherwise, if the first element < the second,
    return first element :: bubble(rest of data);
  otherwise, return second element :: bubble(
    first element :: (rest of data starting at 3rd element)).
Once that's finished recursing, the largest element is at the end of the list. Now,
define bubblesort [data]
  apply bubble to data as often as there are elements in data.
When that's done, your data is indeed sorted. Yes, it's horrible, but my Clojure implementation of this pseudocode works.
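A hedged Scala transcription of that pseudocode (the answer's own implementation was in Clojure):
def bubble(data: List[Int]): List[Int] = data match {
  case a :: b :: rest =>
    if (a < b) a :: bubble(b :: rest)
    else b :: bubble(a :: rest)
  case _ => data   // empty or single-element list
}

def bubblesort(data: List[Int]): List[Int] =
  (1 to data.length).foldLeft(data)((d, _) => bubble(d))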
Just concerning yourself with the first element or two and then leaving the rest of the work to a recursed activity is a lisp-y, functional-y way to do this kind of thing. But once you've gotten your mind accustomed to that kind of thinking, there are more sensible approaches to the problem.
I would recommend implementing a merge sort:
Break the list into two sub-lists, either by counting off half the elements into one sub-list and the rest into the other, or by copying every other element from the original list into either of the new lists.
Sort each of the two smaller lists (recursion here, obviously).
Assemble a new list by selecting the smaller element from the front of either sub-list until you've exhausted both sub-lists.
The recursion is in the middle of that, and I don't see a clever way of making the algorithm tail recursive. Still, the recursion depth is only O(log2 n), so it doesn't place an exorbitant load on the stack; a sketch follows below.
Have fun, good luck!
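A sketch of that merge sort in Scala, using the counting-off split (note that merge, as written, is itself not tail recursive, matching the caveat above):
def mergeSort(xs: List[Int]): List[Int] = {
  def merge(as: List[Int], bs: List[Int]): List[Int] = (as, bs) match {
    case (Nil, _) => bs
    case (_, Nil) => as
    case (a :: at, b :: bt) =>
      if (a <= b) a :: merge(at, bs)   // select the smaller front element
      else b :: merge(as, bt)
  }
  val half = xs.length / 2
  if (half == 0) xs
  else {
    val (left, right) = xs.splitAt(half)
    merge(mergeSort(left), mergeSort(right))
  }
}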
Thanks for the hints above, they were very inspiring. Here's another functional approach to the selection sort algorithm. I tried to base it on the following idea: finding the max/min can be done quite easily by min(A) = if A = Nil -> Int.MaxValue else min(A.head, min(A.tail)). The first min is the min of a list, the second the min of two numbers. This is easy to understand, but unfortunately not tail recursive. Using the accumulator method, the min definition can be transformed like this, now in correct Scala:
def min(x: Int, y: Int) = if (x < y) x else y

def min(xs: List[Int], accu: Int): Int = xs match {
  case Nil => accu
  case x :: ys => min(ys, min(accu, x))
}
(This is tail recursive)
Now a min version is needed which returns a list leaving out the min value. The following function returns a list whose head is the min value and whose tail contains the rest of the original list:
def minl(xs: List[Int]): List[Int] = minl(xs, List(Int.MaxValue))

def minl(xs: List[Int], accu: List[Int]): List[Int] = xs match {
  // accu always contains the min as its head
  case Nil => accu take accu.length - 1
  case x :: ys => minl(ys,
    if (x < accu.head) x :: accu else accu.head :: x :: accu.tail)
}
Using this, selection sort can be written tail recursively as:
def ssort(xs: List[Int], accu: List[Int]): List[Int] = minl(xs) match {
  case Nil => accu
  case min :: rest => ssort(rest, min :: accu)
}
(called as ssort(xs, Nil); it reverses the order). In a test with 10000 list elements this algorithm is only about 4 times slower than the usual imperative algorithm.
Even though, when coding Scala, I usually prefer functional programming style (via combinators or recursion) over imperative style (via variables and iteration), THIS TIME, for this specific problem, old-school imperative nested loops result in simpler and more performant code.
I don't think falling back to imperative style is a mistake for certain classes of problems, such as sorting algorithms, which usually transform the input buffer in place rather than producing a new collection.
My solution is:
package bitspoke.algo

import scala.math.Ordered
import scala.collection.mutable.Buffer

abstract class Sorter[T <% Ordered[T]] {
  // algorithm provided by subclasses
  def sort(buffer: Buffer[T]): Unit

  // check if the buffer is sorted
  def sorted(buffer: Buffer[T]) =
    buffer.isEmpty || buffer.view.zip(buffer.tail).forall { t => t._2 > t._1 }

  // swap elements in buffer
  def swap(buffer: Buffer[T], i: Int, j: Int) {
    val temp = buffer(i)
    buffer(i) = buffer(j)
    buffer(j) = temp
  }
}

class SelectionSorter[T <% Ordered[T]] extends Sorter[T] {
  def sort(buffer: Buffer[T]): Unit = {
    for (i <- 0 until buffer.length) {
      var min = i
      for (j <- i until buffer.length) {
        if (buffer(j) < buffer(min))
          min = j
      }
      swap(buffer, i, min)
    }
  }
}
As you can see, to achieve parametric polymorphism, rather than using java.lang.Comparable I preferred scala.math.Ordered, and view bounds rather than upper bounds. That certainly works thanks to Scala's implicit conversions of primitive types to their rich wrappers.
You can write a client program as follows:
import bitspoke.algo._
import scala.collection.mutable._
val sorter = new SelectionSorter[Int]
val buffer = ArrayBuffer(3, 0, 4, 2, 1)
sorter.sort(buffer)
assert(sorter.sorted(buffer))
A simple functional program for selection sort in Scala:
def selectionSort(list: List[Int]): List[Int] = {
  @annotation.tailrec
  def selectSortHelper(list: List[Int], accumList: List[Int] = List[Int]()): List[Int] = {
    list match {
      case Nil => accumList
      case _ =>
        val min = list.min
        val requiredList = list.filter(_ != min)
        selectSortHelper(requiredList, accumList ::: List.fill(list.length - requiredList.length)(min))
    }
  }
  selectSortHelper(list)
}
You may want to try replacing your while loops with recursion, so you have two places where you can create new recursive functions.
That would begin to get rid of some vars.
This was probably the toughest lesson for me, trying to move more toward FP.
I hesitate to show solutions here, as I think it would be better for you to try first.
But, if possible you should be using tail-recursion, to avoid problems with stack overflows (if you are sorting a very, very large list).
Here is my point of view on this problem: SelectionSort.scala
def selectionsort[A <% Ordered[A]](list: List[A]): List[A] = {
  def sort(as: List[A], bs: List[A]): List[A] = as match {
    case h :: t => select(h, t, Nil, bs)
    case Nil => bs
  }

  def select(m: A, as: List[A], zs: List[A], bs: List[A]): List[A] =
    as match {
      case h :: t =>
        if (m > h) select(m, t, h :: zs, bs)
        else select(h, t, m :: zs, bs)
      case Nil => sort(zs, m :: bs)
    }

  sort(list, Nil)
}
There are two inner functions, sort and select, which represent the two loops in the original algorithm. The first function, sort, iterates through the elements and calls select for each of them. When the source list is empty it returns the bs list as the result, which is initially Nil. The select function searches for the maximum (not the minimum, since we build the result list in reverse order) element in the source list. It assumes the maximum is the head by default and then just replaces it with the proper value.
This is 100% functional implementation of Selection Sort in Scala.
Here is my solution:
def sort(list: List[Int]): List[Int] = {
  @annotation.tailrec
  def pivotCompare(p: Int, l: List[Int], accList: List[Int] = List.empty): List[Int] = {
    l match {
      case Nil => p +: accList
      case x :: xs if p < x => pivotCompare(p, xs, accList :+ x)
      case x :: xs => pivotCompare(x, xs, accList :+ p)
    }
  }

  @annotation.tailrec
  def loop(list: List[Int], accList: List[Int] = List.empty): List[Int] = {
    list match {
      case x :: xs =>
        pivotCompare(x, xs) match {
          case Nil => accList
          case h :: tail => loop(tail, accList :+ h)
        }
      case Nil => accList
    }
  }

  loop(list)
}