Functional implementation of Tarjan's Strongly Connected Components algorithm - scala

I went ahead and implemented the textbook version of Tarjan's SCC algorithm in Scala. However, I dislike the code - it is very imperative/procedural with lots of mutating states and book-keeping indices. Is there a more "functional" version of the algorithm? I believe imperative versions of algorithms hide the core ideas behind the algorithm unlike the functional versions. I found someone else encountering the same problem with this particular algorithm but I have not been able to translate his Clojure code into idomatic Scala.
Note: If anyone wants to experiment, I have a good setup that generates random graphs and tests your SCC algorithm vs running Floyd-Warshall

See Lazy Depth-First Search and Linear Graph Algorithms in Haskell by David King and John Launchbury. It describes many graph algorithms in a functional style, including SCC.

The following functional Scala code generates a map that assigns a representative to each node of a graph. Each representative identifies one strongly connected component. The code is based on Tarjan's algorithm for strongly connected components.
In order to understand the algorithm it might suffice to understand the fold and the contract of the dfs function.
def scc[T](graph:Map[T,Set[T]]): Map[T,T] = {
//`dfs` finds all strongly connected components below `node`
//`path` holds the the depth for all nodes above the current one
//'sccs' holds the representatives found so far; the accumulator
def dfs(node: T, path: Map[T,Int], sccs: Map[T,T]): Map[T,T] = {
//returns the earliest encountered node of both arguments
//for the case both aren't on the path, `old` is returned
def shallowerNode(old: T,candidate: T): T =
(path.get(old),path.get(candidate)) match {
case (_,None) => old
case (None,_) => candidate
case (Some(dOld),Some(dCand)) => if(dCand < dOld) candidate else old
}
//handle the child nodes
val children: Set[T] = graph(node)
//the initially known shallowest back-link is `node` itself
val (newState,shallowestBackNode) = children.foldLeft((sccs,node)){
case ((foldedSCCs,shallowest),child) =>
if(path.contains(child))
(foldedSCCs, shallowerNode(shallowest,child))
else {
val sccWithChildData = dfs(child,path + (node -> path.size),foldedSCCs)
val shallowestForChild = sccWithChildData(child)
(sccWithChildData, shallowerNode(shallowest, shallowestForChild))
}
}
newState + (node -> shallowestBackNode)
}
//run the above function, so every node gets visited
graph.keys.foldLeft(Map[T,T]()){ case (sccs,nextNode) =>
if(sccs.contains(nextNode))
sccs
else
dfs(nextNode,Map(),sccs)
}
}
I've tested the code only on the example graph found on the Wikipedia page.
Difference to imperative version
In contrast to the original implementation, my version avoids explicitly unwinding the stack and simply uses a proper (non tail-) recursive function. The stack is represented by a persistent map called path instead. In my first version I used a List as stack; but this was less efficient since it had to be searched for containing elements.
Efficiency
The code is rather efficient. For each edge, you have to update and/or access the immutable map path, which costs O(log|N|), for a total of O(|E| log|N|). This is in contrast to O(|E|) achieved by the imperative version.
Linear Time implementation
The paper in Chris Okasaki's answer gives a linear time solution in Haskell for finding strongly connected components. Their implementation is based on Kosaraju's Algorithm for finding SCCs, which basically requires two depth-first traversals. The paper's main contribution appears to be a lazy, linear time DFS implementation in Haskell.
What they require to achieve a linear time solution is having a set with O(1) singleton add and membership test. This is basically the same problem that makes the solution given in this answer have a higher complexity than the imperative solution. They solve it with state-threads in Haskell, which can also be done in Scala (see Scalaz). So if one is willing to make the code rather complicated, it is possible to implement Tarjan's SCC algorithm to a functional O(|E|) version.

Have a look at https://github.com/jordanlewis/data.union-find, a Clojure implementation of the algorithm. It's sorta disguised as a data structure, but the algorithm is all there. And it's purely functional, of course.

Related

Scala uses mutable variables to implement its apis

I am in the process of learning Scala through Coursera course (progfun).
We are being learned to think functionally and use tail recursions when possible to implement functions/methods.
And as an example for foreach on a list function, we have taught to implement it like:
def foreach[T](list: List[T], f: [T] => Unit) {
if (!list.isEmpty) foreach(list.tail) else f(list.head)
}
Then I was surprised when I found the following implementation in some Scala apis:
override /*IterableLike*/
def foreach[B](f: A => B) {
var these = this
while (!these.isEmpty) {
f(these.head)
these = these.tail
}
}
So how come we are being learned to use recursion and avoid using mutable variables and the api is being implemented by opposite techniques?
Have a look at scala.collection.LinearSeqOptimized where scala.collection.immutable.List extend. (similar implementation found in the List class itself)
Don't forget that Scala is intended to be a multiparadigm language. For educational purposes, it's good to know how to read and write tail-call recursive functions. But when using the language day-to-day, it's important to remember that it's not pure FP.
It's possible that part of the library predated TCO and the #tailrec annotation. You'd have to look at commit history to find out.
That implementation of foreach might use a mutable var, but from the outside, it appears to be pure. Ultimately, this is exactly what TCO would do behind the scenes.
There are two parts to your question:
So how come we are being learned to use recursion and avoid using mutable variables
Because the teachers assume that you either already know about imperative programming with mutable state and loops or will be exposed to it sometime during your career anyway, so they would rather focus on teaching you the things you are less likely to pick up on your own.
Also, imperative programming with mutable state is much harder to reason about, much harder to understand and thus much harder to teach.
and the api is being implemented by opposite techniques?
Because the Scala standard library is intended to be a high-performance industrial-strength library, not a teaching example. Maybe the person who wrote that code profiled it and measured it to be 0.001% percent faster than the tail-recursive version. Maybe, when that code was written, the compiler couldn't yet reliably optimize the tail-recursive version.
Don't forget that Iterable and friends are the cornerstone of Scala's collections library, those methods you are looking at are probably among the most often executed methods in the entire Scala universe. Even the tiniest performance optimization pays out in a method that is executed billions of times.

for vs map in functional programming

I am learning functional programming using scala. In general I notice that for loops are not much used in functional programs instead they use map.
Questions
What are the advantages of using map over for loop in terms of performance, readablity etc ?
What is the intention of bringing in a map function when it can be achieved using loop ?
Program 1: Using For loop
val num = 1 to 1000
val another = 1000 to 2000
for ( i <- num )
{
for ( j <- another)
{
println(i,j)
}
}
Program 2 : Using map
val num = 1 to 1000
val another = 1000 to 2000
val mapper = num.map(x => another.map(y => (x,y))).flatten
mapper.map(x=>println(x))
Both program 1 and program 2 does the same thing.
The answer is quite simple actually.
Whenever you use a loop over a collection it has a semantic purpose. Either you want to iterate the items of the collection and print them. Or you want to transform the type of the elements to another type (map). Or you want to change the cardinality, such as computing the sum of the elements of a collection (fold).
Of course, all that can also be done using for - loops but to the reader of the code, it is more work to figure out which semantic purpose the loop has, compared to a well known named operation such as map, iter, fold, filter, ...
Another aspect is, that for loops lead to the dark side of using mutable state. How would you sum the elements of a collection in a for loop without mutable state? You would not. Instead you would need to write a recursive function. So, for good measure, it is best to drop the habit of thinking in for loops early and enjoy the brave new functional way of doing things.
I'll start by quoting Programming in Scala.
"Every for expression can be expressed in terms of the three higher-order functions map, flatMap and filter. This section describes the translation scheme, which is also used by the Scala compiler."
http://www.artima.com/pins1ed/for-expressions-revisited.html#23.4
So the reason that you noticed for-loops are not used as much is because they technically aren't needed, and any for expressions you do see are just syntactic sugar which the compiler will translate into some equivalent. The rules for translating a for expression into a map/flatMap/filter expression are listed in the link above.
Generally speaking, in functional programming there is no index variable to mutate. This means one typically makes heavy use of function calls (often in the form of recursion) such as list folds in place of a while or for loop.
For a good example of using list folds in place of while/for loops, I recommend "Explain List Folds to Yourself" by Tony Morris.
https://vimeo.com/64673035
If a function is tail-recursive (denoted with #tailrec) then it can be optimized so as to not incur the high use of the stack which is common in recursive functions. In this case the compiler can translate the tail-recursive function to the "while loop equivalent".
To answer the second part of Question 1, there are some cases where one could make an argument that a for expression is clearer (although certainly there are cases where the opposite is true too.) One such example is given in the Coursera.org course "Functional Programming with Scala" by Dr. Martin Odersky:
for {
i <- 1 until n
j <- 1 until i
if isPrime(i + j)
} yield (i, j)
is arguably more clear than
(1 until n).flatMap(i =>
(1 until i).withFilter(j => isPrime(i + j))
.map(j => (i, j)))
For more information check out Dr. Martin Odersky's "Functional Programming with Scala" course on Coursera.org. Lecture 6.5 "Translation of For" in particular discusses this in more detail.
Also, as a quick side note, in your example you use
mapper.map(x => println(x))
It is generally more accepted to use foreach in this case because you have the intent of side-effecting. Also, there is short hand
mapper.foreach(println)
As for Question 2, it is better to use the map function in place of loops (especially when there is mutation in the loop) because map is a function and it can be composed. Also, once one is acquainted and used to using map, it is very easy to reason about.
The two programs that you have provided are not the same, even if the output might suggest that they are. It is true that for comprehensions are de-sugared by the compiler, but the first program you have is actually equivalent to:
val num = 1 to 1000
val another = 1000 to 2000
num.foreach(i => another.foreach(j => println(i,j)))
It should be noted that the resultant type for the above (and your example program) is Unit
In the case of your second program, the resultant type of the program is, as determined by the compiler, Seq[Unit] - which is now a Seq that has the length of the product of the loop members. As a result, you should always use foreach to indicate an effect that results in a Unit result.
Think about what is happening at the machine-language level. Loops are still fundamental. Functional programming abstracts the loop that is implemented in conventional programming.
Essentially, instead of writing a loop as you would in conventional or imparitive programming, the use of chaining or pipelining in functional programming allows the compiler to optimize the code for the user, and map is simply mapping the function to each element as a list or collection is iterated through. Functional programming, is more convenient, and abstracts the mundane implementation of "for" loops etc. There are limitations to this convenience, particularly if you intend to use functional programming to implement parallel processing.
It is arguable depending on the Software Engineer or developer, that the compiler will be more efficient and know ahead of time the situation it is implemented in. IMHO, mid-level Software Engineers who are familiar with functional programming, well versed in conventional programming, and knowledgeable in parallel processing, will implement both conventional and functional.

Is recursion in scala very necessary?

In the coursera scala tutorial, most examples are using top-down iterations. Partially, as I can see, iterations are used to avoid for/while loops. I'm from C++ and feel a little confused about this.
Is iteration chosen over for/while loops? Is it practical in production? Any risk of stackoverflow? How about efficiency? How about bottom up Dynamic Programming (especially when they are not tail-recusions)?
Also, should I use less "if" conditions, instead use more "case" and subclasses?
Truly high-quality Scala will use very little iteration and only slightly more recursion. What would be done with looping in lower-level imperative languages is usually best done with higher-order combinators, map and flatmap most especially, but also filter, zip, fold, foreach, reduce, collect, partition, scan, groupBy, and a good few others. Iteration is best done only in performance critical sections, and recursion done only in a some deep edge cases where the higher-order combinators don't quite fit (which usually aren't tail recursive, fwiw). In three years of coding Scala in production systems, I used iteration once, recursion twice, and map about five times per day.
Hmm, several questions in one.
Necessity of Recursion
Recursion is not necessary, but it can sometimes provide a very elegant solution.
If the solution is tail recursive and the compiler supports tail call optimisation, then the solution can even be efficient.
As has been well said already, Scala has many combinator functions which can be used to perform the same tasks more expressively and efficiently.
One classic example is writing a function to return the nth Fibonacci number. Here's a naive recursive implementation:
def fib (n: Long): Long = n match {
case 0 | 1 => n
case _ => fib( n - 2) + fib( n - 1 )
}
Now, this is inefficient (definitely not tail recursive) but it is very obvious how its structure relates to the Fibonacci sequence. We can make it properly tail recursive, though:
def fib (n: Long): Long = {
def fibloop(current: Long, next: => Long, iteration: Long): Long = {
if (n == iteration)
current
else
fibloop(next, current + next, iteration + 1)
}
fibloop(0, 1, 0)
}
That could have been written more tersely, but it is an efficient recursive implementation. That said, it is not as pretty as the first and it's structure is less clearly related to the original problem.
Finally, stolen shamelessly from elsewhere on this site is Luigi Plinge's streams-based implementation:
val fibs: Stream[Int] = 0 #:: fibs.scanLeft(1)(_ + _)
Very terse, efficient, elegant and (if you understand streams and lazy evaluation) very expressive. It is also, in fact, recursive; #:: is a recursive function, but one that operates in a lazily-evaluated context. You certainly have to be able to think recursively to come up with this kind of solution.
Iteration compared to For/While loops
I'm assuming you mean the traditional C-Style for, here.
Recursive solutions can often be preferable to while loops because C/C++/Java-style while loops do not return a value and require side effects to achieve anything (this is also true for C-Style for and Java-style foreach). Frankly, I often wish Scala had never implemented while (or had implemented it as syntactic sugar for something like Scheme's named let), because it allows classically-trained Java developers to keep doing things the way they always did. There are situations where a loop with side effects, which is what while gives you, is a more expressive way of achieving something but I had rather Java-fixated devs were forced to reach a little harder for it (e.g. by abusing a for comprehension).
Simply, traditional while and for make clunky imperative coding much too easy. If you don't care about that, why are you using Scala?
Efficiency and risk of Stackoverflow
Tail optimisation eliminates the risk of stackoverflow. Rewriting recursive solutions to be properly tail recursive can make them very ugly (particularly in any language running on the JVM).
Recursive solutions can be more efficient than more imperative solutions, sometimes suprisingly so. One reason is that they often operate on lists, in a way that only involves head and tail access. Head and tail operations on lists are actually faster than random access operations on more structured collections.
Dynamic Programming
A good recursive algorithm typically reduces a complex problem to a small set of simpler problems, picks one to solve and delegates the rest to another function (usually a recursive call to itself). Now, to me this sounds like a great fit for dynamic programming. Certainly, if I am trying a recursive approach to a problem, I often start with a naive solution which I know can't solve every case, see where it fails, add that pattern to the solution and iterate towards success.
The Little Schemer has many examples of this iterative approach to recursive programming, particularly because it re-uses earlier solutions as sub-components for later, more complex ones. I would say it is the epitome of the Dynamic Programming approach. (It is also one of the best-written educational books about software ever produced). I can recommend it, not least because it teaches you Scheme at the same time. If you really don't want to learn Scheme (why? why would you not?), it has been adapted for a few other languages
If versus Match
if expressions, in Scala, return values (which is very useful and why Scala has no need for a ternary operator). There is no reason to avoid simple
if (something)
// do something
else
// do something else
expressions. The principle reason to match instead of a simple if...else is to use the power of case statements to extract information from complex objects. Here is one example.
On the other hand, if...else if...else if...else is a terrible pattern
There's no easy way to see if you covered all the possibilities properly, even with a final else in place.
Unintentionally nested if expressions are hard to spot
It is too easy to link unrelated conditions together (accidentally or through bone-headed design)
Wherever you find you have written else if, look for an alternative. match is a good place to start.
I'm assuming that, since you say "recursion" in your title, you also mean "recursion" in your question, and not "iteration" (which cannot be chosen "over for/while loops", because those are iterative :D).
You might be interested in reading Effective Scala, especially the section on control structures, which should mostly answer your question. In short:
Recursion isn't "better" than iteration. Often it is easier to write a recursive algorithm for a given problem, then it is to write an iterative algorithm (of course there are cases where the opposite applies). When "tail call optimization" can be applied to a problem, the compiler actually converts it to an iterative algorithm, thus making it impossible for a StackOverflow to happen, and without performance impact. You can read about tail call optimization in Effective Scala, too.
The main problem with your question is that it is very broad. There are many many resources available on functional programming, idiomatic scala, dynamic programming and so on, and no answer here on Stack Overflow would be able to cover all those topics. It'd be probably a good idea to just roam the interwebs for a while, and then come back with more concrete questions :)
One of the main benefits of recursion is that it lets you create solutions without mutation. for following example, you have to calculate the sum of all the elements of a List.
One of the many ways to solve this problem is as below. The imperative solution to this problem uses for loop as shown:
scala> var total = 0
scala> for(f <- List(1,2,3)) { total += f }
And recursion solution would look like following:
def total(xs: List[Int]): Int = xs match {
case Nil => 0
case x :: ys => x + total(ys)
}
The difference is that a recursive solution doesn’t use any mutable temporary variables by letting you break the problem into smaller pieces. Because Functional programming is all about writing side effect free programs it's always encourage to use recursion vs loops (that use mutating variables).
Head recursion is a traditional way of doing recursion, where you perform the recursive call first and then take the return value from the recursive function and calculate the result.
Generally when you call a function an entry is added to the call stack of a currently running thread. The downside is that the call stack has a defined size so quickly you may get StackOverflowError exception. This is why Java prefers to iterate rather than recurse. Because Scala runs on the JVM, Scala also suffers from this problem. But starting with Scala 2.8.1, Scala gets away this limitation by doing tail call optimization. you can do tail recursion in Scala.
To recap recursion is preferred way in functional programming to avoid using mutation and secondly tail recursion is supported in Scala so you don't get into StackOverFlow exceptions which you get in Java.
Hope this helps.
As for stack overflow, a lot of the time you can get away with it because of tail call elimination.
The reason scala and other function paradigms avoid for/while loops they are highly dependent on state and time. That makes it much harder to reason about complex "loops" in a formal and precise manor.

Why do immutable objects enable functional programming?

I'm trying to learn scala and I'm unable to grasp this concept. Why does making an object immutable help prevent side-effects in functions. Can anyone explain like I'm five?
Interesting question, a bit difficult to answer.
Functional programming is very much about using mathematics to reason about programs. To do so, one needs a formalism that describe the programs and how one can make proofs about properties they might have.
There are many models of computation that provide such formalisms, such as lambda calculus and turing machines. And there's a certain degree of equivalency between them (see this question, for a discussion).
In a very real sense, programs with mutability and some other side effects have a direct mapping to functional program. Consider this example:
a = 0
b = 1
a = a + b
Here are two ways of mapping it to functional program. First one, a and b are part of a "state", and each line is a function from a state into a new state:
state1 = (a = 0, b = ?)
state2 = (a = state1.a, b = 1)
state3 = (a = state2.a + state2.b, b = state2.b)
Here's another, where each variable is associated with a time:
(a, t0) = 0
(b, t1) = 1
(a, t2) = (a, t0) + (b, t1)
So, given the above, why not use mutability?
Well, here's the interesting thing about math: the less powerful the formalism is, the easier it is to make proofs with it. Or, to put it in other words, it's too hard to reason about programs that have mutability.
As a consequence, there's very little advance regarding concepts in programming with mutability. The famous Design Patterns were not arrived at through study, nor do they have any mathematical backing. Instead, they are the result of years and years of trial and error, and some of them have since proved to be misguided. Who knows about the other dozens "design patterns" seen everywhere?
Meanwhile, Haskell programmers came up with Functors, Monads, Co-monads, Zippers, Applicatives, Lenses... dozens of concepts with mathematical backing and, most importantly, actual patterns of how code is composed to make up programs. Things you can use to reason about your program, increase reusability and improve correctness. Take a look at the Typeclassopedia for examples.
It's no wonder people not familiar with functional programming get a bit scared with this stuff... by comparison, the rest of the programming world is still working with a few decades-old concepts. The very idea of new concepts is alien.
Unfortunately, all these patterns, all these concepts, only apply with the code they are working with does not contain mutability (or other side effects). If it does, then their properties cease to be valid, and you can't rely on them. You are back to guessing, testing and debugging.
In short, if a function mutates an object then it has side effects. Mutation is a side effect. This is just true by definition.
In truth, in a purely functional language it should not matter if an object is technically mutable or immutable, because the language will never "try" to mutate an object anyway. A pure functional language doesn't give you any way to perform side effects.
Scala is not a pure functional language, though, and it runs in the Java environment in which side effects are very popular. In this environment, using objects that are incapable of mutation encourages you to use a pure functional style because it makes a side-effect oriented style impossible. You are using data types to enforce purity because the language does not do it for you.
Now I will say a bunch of other stuff in the hope that it helps this make sense to you.
Fundamental to the concept of a variable in functional languages is referential transparency.
Referential transparency means that there is no difference between a value, and a reference to that value. In a language where this is true, it makes it much simpler to think about a program works, since you never have to stop and ask, is this a value, or a reference to a value? Anyone who's ever programmed in C recognizes that a great part of the challenge of learning that paradigm is knowing which is which at all times.
In order to have referential transparency, the value that a reference refers to can never change.
(Warning, I'm about to make an analogy.)
Think of it this way: in your cell phone, you have saved some phone numbers of other people's cell phones. You assume that whenever you call that phone number, you will reach the person you intend to talk to. If someone else wants to talk to your friend, you give them the phone number and they reach that same person.
If someone changes their cell phone number, this system breaks down. Suddenly, you need to get their new phone number if you want to reach them. Maybe you call the same number six months later and reach a different person. Calling the same number and reaching a different person is what happens when functions perform side effects: you have what seems to be the same thing, but you try to use it, it turns out it's different now. Even if you expected this, what about all the people you gave that number to, are you going to call them all up and tell them that the old number doesn't reach the same person anymore?
You counted on the phone number corresponding to that person, but it didn't really. The phone number system lacks referential transparency: the number isn't really ALWAYS the same as the person.
Functional languages avoid this problem. You can give out your phone number and people will always be able to reach you, for the rest of your life, and will never reach anybody else at that number.
However, in the Java platform, things can change. What you thought was one thing, might turn into another thing a minute later. If this is the case, how can you stop it?
Scala uses the power of types to prevent this, by making classes that have referential transparency. So, even though the language as a whole isn't referentially transparent, your code will be referentially transparent as long as you use immutable types.
Practically speaking, the advantages of coding with immutable types are:
Your code is simpler to read when the reader doesn't have to look out for surprising side effects.
If you use multiple threads, you don't have to worry about locking because shared objects can never change. When you have side effects, you have to really think through the code and figure out all the places where two threads might try to change the same object at the same time, and protect against the problems that this might cause.
Theoretically, at least, the compiler can optimize some code better if it uses only immutable types. I don't know if Java can do this effectively, though, since it allows side effects. This is a toss-up at best, anyway, because there are some problems that can be solved much more efficiently by using side effects.
I'm running with this 5 year old explanation:
class Account(var myMoney:List[Int] = List(10, 10, 1, 1, 1, 5)) {
def getBalance = println(myMoney.sum + " dollars available")
def myMoneyWithInterest = {
myMoney = myMoney.map(_ * 2)
println(myMoney.sum + " dollars will accru in 1 year")
}
}
Assume we are at an ATM and it is using this code to give us account information.
You do the following:
scala> val myAccount = new Account()
myAccount: Account = Account#7f4a6c40
scala> myAccount.getBalance
28 dollars available
scala> myAccount.myMoneyWithInterest
56 dollars will accru in 1 year
scala> myAccount.getBalance
56 dollars available
We mutated the account balance when we wanted to check our current balance plus a years worth of interest. Now we have an incorrect account balance. Bad news for the bank!
If we were using val instead of var to keep track of myMoney in the class definition, we would not have been able to mutate the dollars and raise our balance.
When defining the class (in the REPL) with val:
error: reassignment to val
myMoney = myMoney.map(_ * 2
Scala is telling us that we wanted an immutable value but are trying to change it!
Thanks to Scala, we can switch to val, re-write our myMoneyWithInterest method and rest assured that our Account class will never alter the balance.
One important property of functional programming is: If I call the same function twice with the same arguments I'll get the same result. This makes reasoning about code much easier in many cases.
Now imagine a function returning the attribute content of some object. If that content can change the function might return different results on different calls with the same argument. => no more functional programming.
First a few definitions:
A side effect is a change in state -- also called a mutation.
An immutable object is an object which does not support mutation, (side effects).
A function which is passed mutable objects (either as parameters or in the global environment) may or may not produce side effects. This is up to the implementation.
However, it is impossible for a function which is passed only immutable objects (either as parameters or in the global environment) to produce side effects. Therefore, exclusive use of immutable objects will preclude the possibility of side effects.
Nate's answer is great, and here is some example.
In functional programming, there is an important feature that when you call a function with same argument, you always get same return value.
This is always true for immutable objects, because you can't modify them after create it:
class MyValue(val value: Int)
def plus(x: MyValue) = x.value + 10
val x = new MyValue(10)
val y = plus(x) // y is 20
val z = plus(x) // z is still 20, plus(x) will always yield 20
But if you have mutable objects, you can't guarantee that plus(x) will always return same value for same instance of MyValue.
class MyValue(var value: Int)
def plus(x: MyValue) = x.value + 10
val x = new MyValue(10)
val y = plus(x) // y is 20
x.value = 30
val z = plus(x) // z is 40, you can't for sure what value will plus(x) return because MyValue.value may be changed at any point.
Why do immutable objects enable functional programming?
They don't.
Take one definition of "function," or "prodecure," "routine" or "method," which I believe applies to many programming languages: "A section of code, typically named, accepting arguments and/or returning a value."
Take one definition of "functional programming:" "Programming using functions." The ability to program with functions is indepedent of whether state is modified.
For instance, Scheme is considered a functional programming language. It features tail calls, higher-order functions and aggregate operations using functions. It also has mutable objects. While mutability destroys some nice mathematical qualities, it does not necessarily prevent "functional programming."
I've read all the answers and they don't satisfy me, because they mostly talk about "immutability", and not about its relation to FP.
The main question is:
Why do immutable objects enable functional programming?
So I've searched a bit more and I have another answer, I believe the easy answer to this question is: "Because Functional Programming is basically defined on the basis of functions that are easy to reason about". Here's the definition of Functional Programming:
The process of building software by composing pure functions.
If a function is not pure -- which means receiving the same input, it's not guaranteed to always produce the same output (e.g., if the function relies on a global object, or date and time, or a random number to compute the output) -- then that function is unpredictable, that's it! Now exactly the same story goes about "immutability" as well, if objects are not immutable, a function with the same object as its input may have different results (aka side effects) each time used, and this will make it hard to reason about the program.
I first tried to put this in a comment, but it got longer than the limit, I'm by no means a pro so please take this answer with a grain of salt.

why use foldLeft instead of procedural version?

So in reading this question it was pointed out that instead of the procedural code:
def expand(exp: String, replacements: Traversable[(String, String)]): String = {
var result = exp
for ((oldS, newS) <- replacements)
result = result.replace(oldS, newS)
result
}
You could write the following functional code:
def expand(exp: String, replacements: Traversable[(String, String)]): String = {
replacements.foldLeft(exp){
case (result, (oldS, newS)) => result.replace(oldS, newS)
}
}
I would almost certainly write the first version because coders familiar with either procedural or functional styles can easily read and understand it, while only coders familiar with functional style can easily read and understand the second version.
But setting readability aside for the moment, is there something that makes foldLeft a better choice than the procedural version? I might have thought it would be more efficient, but it turns out that the implementation of foldLeft is actually just the procedural code above. So is it just a style choice, or is there a good reason to use one version or the other?
Edit: Just to be clear, I'm not asking about other functions, just foldLeft. I'm perfectly happy with the use of foreach, map, filter, etc. which all map nicely onto for-comprehensions.
Answer: There are really two good answers here (provided by delnan and Dave Griffith) even though I could only accept one:
Use foldLeft because there are additional optimizations, e.g. using a while loop which will be faster than a for loop.
Use fold if it ever gets added to regular collections, because that will make the transition to parallel collections trivial.
It's shorter and clearer - yes, you need to know what a fold is to understand it, but when you're programming in a language that's 50% functional, you should know these basic building blocks anyway. A fold is exactly what the procedural code does (repeatedly applying an operation), but it's given a name and generalized. And while it's only a small wheel you're reinventing, but it's still a wheel reinvention.
And in case the implementation of foldLeft should ever get some special perk - say, extra optimizations - you get that for free, without updating countless methods.
Other than a distaste for mutable variable (even mutable locals), the basic reason to use fold in this case is clarity, with occasional brevity. Most of the wordiness of the fold version is because you have to use an explicit function definition with a destructuring bind. If each element in the list is used precisely once in the fold operation (a common case), this can be simplified to use the short form. Thus the classic definition of the sum of a collection of numbers
collection.foldLeft(0)(_+_)
is much simpler and shorter than any equivalent imperative construct.
One additional meta-reason to use functional collection operations, although not directly applicable in this case, is to enable a move to using parallel collection operations if needed for performance. Fold can't be parallelized, but often fold operations can be turned into commutative-associative reduce operations, and those can be parallelized. With Scala 2.9, changing something from non-parallel functional to parallel functional utilizing multiple processing cores can sometimes be as easy as dropping a .par onto the collection you want to execute parallel operations on.
One word I haven't seen mentioned here yet is declarative:
Declarative programming is often defined as any style of programming that is not imperative. A number of other common definitions exist that attempt to give the term a definition other than simply contrasting it with imperative programming. For example:
A program that describes what computation should be performed and not how to compute it
Any programming language that lacks side effects (or more specifically, is referentially transparent)
A language with a clear correspondence to mathematical logic.
These definitions overlap substantially.
Higher-order functions (HOFs) are a key enabler of declarativity, since we only specify the what (e.g. "using this collection of values, multiply each value by 2, sum the result") and not the how (e.g. initialize an accumulator, iterate with a for loop, extract values from the collection, add to the accumulator...).
Compare the following:
// Sugar-free Scala (Still better than Java<5)
def sumDoubled1(xs: List[Int]) = {
var sum = 0 // Initialized correctly?
for (i <- 0 until xs.size) { // Fenceposts?
sum = sum + (xs(i) * 2) // Correct value being extracted?
// Value extraction and +/* smashed together
}
sum // Correct value returned?
}
// Iteration sugar (similar to Java 5)
def sumDoubled2(xs: List[Int]) = {
var sum = 0
for (x <- xs) // We don't need to worry about fenceposts or
sum = sum + (x * 2) // value extraction anymore; that's progress
sum
}
// Verbose Scala
def sumDoubled3(xs: List[Int]) = xs.map((x: Int) => x*2). // the doubling
reduceLeft((x: Int, y: Int) => x+y) // the addition
// Idiomatic Scala
def sumDoubled4(xs: List[Int]) = xs.map(_*2).reduceLeft(_+_)
// ^ the doubling ^
// \ the addition
Note that our first example, sumDoubled1, is already more declarative than (most would say superior to) C/C++/Java<5 for loops, because we haven't had to micromanage the iteration state and termination logic, but we're still vulnerable to off-by-one errors.
Next, in sumDoubled2, we're basically at the level of Java>=5. There are still a couple things that can go wrong, but we're getting pretty good at reading this code-shape, so errors are quite unlikely. However, don't forget that a pattern that's trivial in a toy example isn't always so readable when scaled up to production code!
With sumDoubled3, desugared for didactic purposes, and sumDoubled4, the idiomatic Scala version, the iteration, initialization, value extraction and choice of return value are all gone.
Sure, it takes time to learn to read the functional versions, but we've drastically foreclosed our options for making mistakes. The "business logic" is clearly marked, and the plumbing is chosen from the same menu that everyone else is reading from.
It is worth pointing out that there is another way of calling foldLeft which takes advantages of:
The ability to use (almost) any Unicode symbol in an identifier
The feature that if a method name ends with a colon :, and is called infix, then the target and parameter are switched
For me this version is much clearer, because I can see that I am folding the expr value into the replacements collection
def expand(expr: String, replacements: Traversable[(String, String)]): String = {
(expr /: replacements) { case (r, (o, n)) => r.replace(o, n) }
}