Why is Scala's Either not a monad? - scala

I'm interested in the design decisions behind scala.Either not being a monad. There is already some discussion on right-biasing Either, for example:
right-biasing Either
fixing Either
Fixing scala.Either - unbiased vs biased
But those discussions go into too much detail, and I couldn't get a complete overview of why it was done the way it was. Can someone give an overview of the benefits of having a right-biased Either, the problems with doing it that way, and the benefits of not having a right-biased Either (if any exist)?

I think it just comes down to the Principle of Least Astonishment. Using Either to encode success or failure is clearly something people do, but it is not the only use of Either. In fact, there is not much reason to have Right be success and Left be failure other than tradition. As adelbertc's comment above mentions, scalaz has Validation, which specifically encodes this.
To further justify the POLA claim above, take this code:
def foo(): Either[Int, Int] = Right(1)
def bar(j: Int): Either[Int, Int] = Left(1)
def baz(z: Int): Either[Int, Int] = Right(3)
// Result is Left(1)
for (a <- foo().right; b <- bar(a).right; c <- baz(b).right) yield c
This compiles because I am using the .right projection in the for expression. It makes sense here that the Left(1) in bar is the failure case and so that is the result, but imagine if Either were right-biased. The above code would then compile without the .right projections, like:
for (a <- foo(); b <- bar(a); c <- baz(b)) yield c
If you used Either in your code to just be a "one-or-the-other" type, you would be surprised by (1) the fact that this would compile and (2) that it returns Left(1) and seemingly never executes baz.
In summary, use Validation if you want to use Either to encode success or failure.
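For what it's worth, the hypothetical above can be tried directly on Scala 2.12 or later, where the standard library's Either is right-biased (that version requirement is the assumption here). A minimal sketch, with a call counter added purely for illustration to show that baz is never reached:

```scala
object EitherBiasDemo {
  // Illustration only: counts how often baz is actually invoked
  var bazCalls = 0

  def foo(): Either[Int, Int] = Right(1)
  def bar(j: Int): Either[Int, Int] = Left(1)
  def baz(z: Int): Either[Int, Int] = { bazCalls += 1; Right(3) }

  // No .right projections needed on a right-biased Either (Scala 2.12+)
  val result: Either[Int, Int] =
    for (a <- foo(); b <- bar(a); c <- baz(b)) yield c
}
```

Here result is Left(1) and bazCalls stays at 0, which is exactly the short-circuiting that might surprise someone who treats Either as a neutral "one-or-the-other" type.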

I don't know if this was the original reason, but there is at least one good reason. In Scala, for comprehensions require more from the argument than that it is a monad in order to gain full functionality. In particular, there are constructs like
for (a <- ma if a > 7; /*...*/) yield /*...*/
which require that the monad be a monad-with-zero (so you can empty it if the condition fails).
Either cannot sensibly be a monad-with-zero (save for Either[Unit,B], where Left(()) can be the zero).
One way to go is to say: okay, fine, just don't use your for-comprehensions that way. Another way to go is to say: okay, fine, don't bother making Either a monad at all. It's a matter of personal preference, of course, but I can see a certain lack of elegance in having Either (uniquely among the common monads provided in Scala) fall flat on its face once you try to use the full power for gives you.
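To make the monad-with-zero point concrete, here is a small sketch. The Option lines should work on any Scala version. The Either line assumes Scala 2.12 or later, where filterOrElse exists and asks you to supply the "zero" (a Left value) yourself:

```scala
object GuardDemo {
  // Option has a natural zero (None), so a guard in a for comprehension works
  val kept: Option[Int] = for (a <- Option(9) if a > 7) yield a
  val dropped: Option[Int] = for (a <- Option(3) if a > 7) yield a

  val e: Either[String, Int] = Right(3)
  // `for (a <- e if a > 7) yield a` does not compile: Either has no withFilter,
  // because there is no sensible default Left to fall back on.
  // filterOrElse makes the caller provide that value explicitly.
  val guarded: Either[String, Int] = e.filterOrElse(_ > 7, "too small")
}
```

The guard on Option silently produces None, while for Either the failure value has to come from somewhere, which is the lack of a sensible zero described above.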

Related

Why do we need flatMap (in general)?

I have been looking into FP languages (off and on) for some time and have played with Scala, Haskell, F#, and some others. I like what I see and understand some of the fundamental concepts of FP (with absolutely no background in Category Theory - so don't talk Math, please).
So, given a type M[A] we have map which takes a function A=>B and returns a M[B]. But we also have flatMap which takes a function A=>M[B] and returns a M[B]. We also have flatten which takes a M[M[A]] and returns a M[A].
In addition, many of the sources I have read describe flatMap as map followed by flatten.
So, given that flatMap seems to be equivalent to flatten compose map, what is its purpose? Please don't say it is to support 'for comprehensions' as this question really isn't Scala-specific. And I am less concerned with the syntactic sugar than I am in the concept behind it. The same question arises with Haskell's bind operator (>>=). I believe they both are related to some Category Theory concept but I don't speak that language.
I have watched Brian Beckman's great video Don't Fear the Monad more than once and I think I see that flatMap is the monadic composition operator but I have never really seen it used the way he describes this operator. Does it perform this function? If so, how do I map that concept to flatMap?
BTW, I had a long writeup on this question with lots of listings showing experiments I ran trying to get to the bottom of the meaning of flatMap and then ran into this question which answered some of my questions. Sometimes I hate Scala implicits. They can really muddy the waters. :)
FlatMap, known as "bind" in some other languages, is as you said yourself for function composition.
Imagine for a moment that you have some functions like these:
def foo(x: Int): Option[Int] = Some(x + 2)
def bar(x: Int): Option[Int] = Some(x * 3)
The functions work great, calling foo(3) returns Some(5), and calling bar(3) returns Some(9), and we're all happy.
But now you've run into the situation that requires you to do the operation more than once.
foo(3).map(x => foo(x)) // or just foo(3).map(foo) for short
Job done, right?
Except not really. The output of the expression above is Some(Some(7)), not Some(7), and if you now want to chain another map on the end you can't because foo and bar take an Int, and not an Option[Int].
Enter flatMap
foo(3).flatMap(foo)
Will return Some(7), and
foo(3).flatMap(foo).flatMap(bar)
Returns Some(21), since foo(3) is Some(5), foo(5) is Some(7), and bar(7) is Some(7 * 3).
This is great! Using flatMap lets you chain functions of the shape A => M[B] to oblivion (in the previous example A and B are Int, and M is Option).
More technically speaking, flatMap and bind have the signature M[A] => (A => M[B]) => M[B], meaning they take a "wrapped" value, such as Some(3), Right('foo), or List(1,2,3), and shove it through a function that would normally take an unwrapped value, such as the aforementioned foo and bar. It does this by first "unwrapping" the value, and then passing it through the function.
I've seen the box analogy being used for this, so observe my expertly drawn MSPaint illustration:
This unwrapping and re-wrapping behavior means that if I were to introduce a third function that doesn't return an Option[Int] and tried to flatMap it onto the sequence, it wouldn't work, because flatMap expects you to return a monad (in this case an Option):
def baz(x: Int): String = x + " is a number"
foo(3).flatMap(foo).flatMap(bar).flatMap(baz) // <<< ERROR
To get around this, if your function doesn't return a monad, you just use the regular map function:
foo(3).flatMap(foo).flatMap(bar).map(baz)
Which would then return Some("21 is a number").
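Since the question describes flatMap as "map followed by flatten", here is a small sketch checking that the three spellings agree for the foo from this answer. The object and value names are just for illustration:

```scala
object FlatMapDemo {
  def foo(x: Int): Option[Int] = Some(x + 2)

  // One call
  val viaFlatMap: Option[Int] = foo(3).flatMap(foo)
  // map gives Option[Option[Int]]; flatten removes one layer
  val viaMapThenFlatten: Option[Int] = foo(3).map(foo).flatten
  // A for comprehension desugars to flatMap calls with a final map
  val viaFor: Option[Int] = for { a <- foo(3); b <- foo(a) } yield b
}
```

All three evaluate to Some(7), so flatMap adds no new power over map plus flatten. It just packages the pair into the one operation you need for chaining.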
It's the same reason you provide more than one way to do anything: it's a common enough operation that you may want to wrap it.
You could ask the opposite question: why have map and flatten when you already have flatMap and a way to store a single element inside your collection? That is,
x map f
x filter p
can be replaced by
x flatMap ( xi => x.take(0) :+ f(xi) )
x flatMap ( xi => if (p(xi)) x.take(0) :+ xi else x.take(0) )
so why bother with map and filter?
In fact, there are various minimal sets of operations you need to reconstruct many of the others (flatMap is a good choice because of its flexibility).
Pragmatically, it's better to have the tool you need. Same reason why there are non-adjustable wrenches.
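The same point can be sketched for List specifically, where "a way to store a single element" is simply List(x). The helper names mapVia, filterVia and flattenVia are made up for this example:

```scala
object DerivedOps {
  // map: wrap each result in a one-element list and let flatMap concatenate
  def mapVia[A, B](xs: List[A])(f: A => B): List[B] =
    xs.flatMap(x => List(f(x)))

  // filter: emit one element or none
  def filterVia[A](xs: List[A])(p: A => Boolean): List[A] =
    xs.flatMap(x => if (p(x)) List(x) else Nil)

  // flatten: flatMap with the identity function
  def flattenVia[A](xss: List[List[A]]): List[A] =
    xss.flatMap(xs => xs)
}
```

These behave like the built-in map, filter and flatten on List, which is the sense in which flatMap is a good minimal building block.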
The simplest reason is to compose an output set where each entry in the input set may produce more than one (or zero!) outputs.
For example, consider a program which outputs addresses for people to generate mailers. Most people have one address. Some have two or more. Some people, unfortunately, have none. Flatmap is a generalized algorithm to take a list of these people and return all of the addresses, regardless of how many come from each person.
The zero output case is particularly useful for monads, which often (always?) return exactly zero or one results (think Maybe- returns zero results if the computation fails, or one if it succeeds). In that case you want to perform an operation on "all of the results", which it just so happens may be one or many.
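A minimal sketch of the mailer example in Scala, with a hypothetical Person class and made-up data:

```scala
object AddressDemo {
  // Hypothetical model: a person has zero, one or several addresses
  final case class Person(name: String, addresses: List[String])

  val people: List[Person] = List(
    Person("Ann", List("1 Main St")),
    Person("Bob", List("2 Oak Ave", "3 Elm Rd")),
    Person("Cy", Nil)
  )

  // One flat list of addresses, however many each person contributes
  val allAddresses: List[String] = people.flatMap(_.addresses)
}
```

Cy contributes nothing and Bob contributes two entries, yet the caller never has to special-case either situation.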
The "flatMap", or "bind", method, provides an invaluable way to chain together methods that provide their output wrapped in a Monadic construct (like List, Option, or Future). For example, suppose you have two methods that produce a Future of a result (eg. they make long-running calls to databases or web service calls or the like, and should be used asynchronously):
def fn1(input1: A): Future[B] // (for some types A and B)
def fn2(input2: B): Future[C] // (for some types B and C)
How to combine these? With flatMap, we can do this as simply as:
def fn3(input3: A): Future[C] = fn1(input3).flatMap(b => fn2(b))
In this sense, we have "composed" a function fn3 out of fn1 and fn2 using flatMap, which has the same general structure (and so can be composed in turn with further similar functions).
The map method would give us a not-so-convenient - and not readily chainable - Future[Future[C]]. Certainly we can then use flatten to reduce this, but the flatMap method does it in one call, and can be chained as far as we wish.
This is so useful a way of working, in fact, that Scala provides the for-comprehension as essentially a short-cut for this (Haskell, too, provides a short-hand way of writing a chain of bind operations - I'm not a Haskell expert, though, and don't recall the details) - hence the talk you will have come across about for-comprehensions being "de-sugared" into a chain of flatMap calls (along with possible filter calls and a final map call for the yield).
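Here is a runnable sketch of that composition. The bodies of fn1 and fn2 are trivial stand-ins (assumptions, not real database or web calls), and the blocking Await is there only so the result can be inspected in a demo:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureChainDemo {
  // Hypothetical stand-ins for two slow asynchronous calls
  def fn1(n: Int): Future[String] = Future("x" * n)
  def fn2(s: String): Future[Int] = Future(s.length)

  // Composed with flatMap
  def fn3(n: Int): Future[Int] = fn1(n).flatMap(b => fn2(b))

  // The same thing as a for comprehension
  def fn3For(n: Int): Future[Int] =
    for { b <- fn1(n); c <- fn2(b) } yield c

  // map alone leaves the awkward nested type
  def nested(n: Int): Future[Future[Int]] = fn1(n).map(b => fn2(b))

  // Demo only: block until both variants complete
  def run(n: Int): (Int, Int) =
    (Await.result(fn3(n), 5.seconds), Await.result(fn3For(n), 5.seconds))
}
```

Calling FutureChainDemo.run(3) gives the same value from both variants, while nested(3) shows the Future[Future[Int]] you would otherwise have to flatten yourself.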
Well, one could argue, you don't need .flatten either. Why not just do something like
import scala.annotation.tailrec
@tailrec
def flatten[T](in: Seq[Seq[T]], out: Seq[T] = Nil): Seq[T] = in match {
  case Seq() => out
  case head +: tail => flatten(tail, out ++ head)
}
Same can be said about map:
@tailrec
def map[A,B](in: Seq[A], out: Seq[B] = Nil)(f: A => B): Seq[B] = in match {
  case Seq() => out
  case head +: tail => map(tail, out :+ f(head))(f)
}
So, why are .flatten and .map provided by the library? Same reason .flatMap is: convenience.
There is also .collect, which is really just
list.filter(f.isDefinedAt _).map(f)
.reduce is actually nothing more than list.tail.foldLeft(list.head)(f),
.headOption is
list match {
case Nil => None
case head :: _ => Some(head)
}
Etc ...
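A quick sketch to check a couple of these equivalences on a concrete list. The list and the partial function are made up for the example:

```scala
object ConvenienceDemo {
  val list: List[Int] = List(1, 2, 3, 4)

  // A partial function defined only for even numbers
  val f: PartialFunction[Int, String] = { case n if n % 2 == 0 => s"even $n" }

  val viaCollect: List[String] = list.collect(f)
  val viaFilterMap: List[String] = list.filter(f.isDefinedAt).map(f)

  val viaReduce: Int = list.reduce(_ + _)
  // Start from the head and fold over the remaining elements
  val viaFold: Int = list.tail.foldLeft(list.head)(_ + _)
}
```

Both pairs give identical results. The library versions are there because they are shorter to write and state the intent directly.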

Monad transformer in Scala for comprehension to handle Option and collect error messages

I've been looking at a lot of Scala monad transformer examples and haven't been able to figure out how to do what I think is probably something straightforward. I want to write a for comprehension that looks up something in a database (MongoDB), which returns an Option, then if that Option is a Some, looks at its contents and gets another Option, and so on. At each step, if I get a None, I want to abort the whole thing and produce an error message like "X not found". The for comprehension should yield an Either (or something similar), in which a Left contains the error message and a Right contains the successful result of the whole operation (perhaps just a string, or perhaps an object constructed using several of the values obtained along the way).
So far I've just been using the Option monad by itself, as in this trivial example:
val docContentOpt = for {
doc <- mongoCollection.findOne(MongoDBObject("_id" -> id))
content <- doc.getAs[String]("content")
} yield content
However, I'm stuck trying to integrate something like Either into this. What I'm looking for is a working code snippet, not just a suggestion to try \/ in Scalaz. I've tried to make sense of Scalaz, but it has very little documentation, and what little there is seems to be written for people who know all about lambda calculus, which I don't.
I'd "try" something like this:
def tryOption[T](option: Option[T], message:String ="" ):Try[T] = option match {
case Some(v) => Success(v)
case None => Failure(new Exception(message))
}
val docContentOpt = for {
doc <- tryOption(mongoCollection.findOne(MongoDBObject("_id" -> id)),s"$id not found")
content <- tryOption(doc.getAs[String]("content"), "content not found")
} yield content
Basically an Option to Try conversion that captures the error in an exception. Try is a specialized right-biased Either that is monadic (in contrast to Either, which is not).
Try may be what you're looking for, but it's also possible to do this using the "right projection" of the standard library's Either:
val docContentOpt: Either[String, String] = for {
doc <- mongoCollection.findOne(MongoDBObject("_id" -> id)).toRight(
s"$id not found"
).right
content <- doc.getAs[String]("content").toRight("Can't get as content").right
} yield content
This may make more sense if your error type doesn't extend Throwable, for example, or if you're stuck on 2.9.2 or earlier (or if you just prefer the generality of Either, etc.).
(As a side note, it'd be nice if the standard library provided toSuccess and toFailure methods on Option that would make converting Option into Try as convenient as converting Option into Either is here—maybe someday.)
(And as another side note, Scalaz doesn't actually buy you much here—it would allow you to write .toRightDisjunction("error") instead of .toRight("error").right, but that's about it. As Gabriel Claramunt points out in a comment, this isn't a case for monad transformers.)
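For anyone who wants to run the Either approach without a database, here is a self-contained sketch. The nested Map is a hypothetical in-memory stand-in for the MongoDB collection, and the code assumes Scala 2.12 or later, where Either is right-biased and the .right projections can be dropped (on older versions keep them as shown above):

```scala
object LookupDemo {
  // Hypothetical stand-in for the collection: id -> document fields
  val docs: Map[String, Map[String, String]] = Map(
    "a" -> Map("content" -> "hello"),
    "b" -> Map.empty[String, String]
  )

  // Each Option becomes an Either; the first None aborts with its message
  def content(id: String): Either[String, String] =
    for {
      doc <- docs.get(id).toRight(s"$id not found")
      c   <- doc.get("content").toRight("content not found")
    } yield c
}
```

content("a") gives Right("hello"), content("b") gives Left("content not found"), and content("z") gives Left("z not found").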

Either monadic operations

I am just starting to get used to dealing with monadic operations.
For the Option type, this Cheat Sheet of Tony Morris helped:
http://blog.tmorris.net/posts/scalaoption-cheat-sheet/
So in the end it seems easy to understand that:
map transforms the value inside an Option
flatten transforms an Option[Option[X]] into an Option[X]
flatMap is roughly a map operation that produces an Option[Option[X]], which is then flattened to Option[X]
At least that is what I understand so far.
For Either, it seems a bit more difficult to understand, since Either itself is not right-biased and does not have map / flatMap operations, so we use projections.
I can read the Scaladoc but it is not as clear as the Cheat Sheet on Options.
Can someone provide an Either cheat sheet to describe the basic monadic operations?
It seems to me that Either.joinRight is a bit like RightProjection.flatMap and seems to be the equivalent of Option.flatten for Either.
It seems to me that if Either were right-biased, then Either.flatten would be Either.joinRight, no?
In this question: Either, Options and for comprehensions I ask about for comprehensions with Either, and one of the answers says that we can't mix monads because of the way it is desugared into map/flatMap/filter.
When using this kind of code:
def updateUserStats(user: User): Either[Error,User] = for {
stampleCount <- stampleRepository.getStampleCount(user).right
userUpdated <- Right(copyUserWithStats(user,stampleCount)).right
userSaved <- userService.update(userUpdated).right
} yield userSaved
Does this mean that all my 3 method calls must always return Either[Error,Something]?
I mean if I have a method call Either[Throwable,Something] it won't work right?
Edit:
Is Try[Something] exactly the same as a right-biased Either[Throwable,Something]?
Either was never really meant to be an exception-handling structure. It was meant to represent a situation where a function really could return one of two distinct types, but people started the convention where the left type is supposed to be the failure case and the right is success. If you want to return a biased type for some pass/fail business-check logic, then Validation from scalaz works well. If you have a function that could return a value or a Throwable, then Try would be a good choice. Either should be used for situations where you really might get one of two possible types, and now that I am using Try and Validation (each for different types of situations), I never use Either any more.
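As a rough stand-in for the cheat sheet that was asked for, here is a sketch of the basic operations. It assumes Scala 2.12 or later, where Either is right-biased and Try has toEither. On older versions the same calls go through the .right projection. The values are made up:

```scala
import scala.util.Try

object EitherCheatSheet {
  val r: Either[String, Int] = Right(2)
  val l: Either[String, Int] = Left("boom")

  // map transforms a Right and leaves a Left untouched
  val mapped: Either[String, Int] = r.map(_ + 1)
  val mappedLeft: Either[String, Int] = l.map(_ + 1)

  // flatMap chains a step that can itself fail
  val chained: Either[String, Int] =
    r.flatMap(n => if (n > 0) Right(n * 10) else Left("non-positive"))

  // joinRight plays the role of Option.flatten
  val nested: Either[String, Either[String, Int]] = Right(Right(5))
  val joined: Either[String, Int] = nested.joinRight

  // left.map transforms the Left side, e.g. to unify error types
  val leftMapped: Either[Int, Int] = l.left.map(_.length)

  // fold handles both cases at once
  val folded: Int = l.fold(_ => -1, n => n)

  // Try converts to Either[Throwable, A]; left.map then adapts the error type
  val fromTry: Either[Throwable, Int] = Try("42".toInt).toEither
  val unified: Either[String, Int] = fromTry.left.map(_.getMessage)
}
```

The last two lines speak to the question about mixing Either[Error, X] with Either[Throwable, X]: all steps of one for comprehension need a compatible Left type, and left.map is one way to get there.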

What are practical uses of applicative style?

I am a Scala programmer, learning Haskell now. It's easy to find practical use cases and real world examples for OO concepts, such as decorators, strategy pattern etc. Books and interwebs are filled with it.
I came to the realization that this somehow is not the case for functional concepts. Case in point: applicatives.
I am struggling to find practical use cases for applicatives. Almost all of the tutorials and books I have come across so far provide the examples of [] and Maybe. I expected applicatives to be more applicable than that, seeing all the attention they get in the FP community.
I think I understand the conceptual basis for applicatives (maybe I am wrong), and I have waited long for my moment of enlightenment. But it doesn't seem to be happening. Never while programming, have I had a moment when I would shout with a joy, "Eureka! I can use applicative here!" (except again, for [] and Maybe).
Can someone please guide me how applicatives can be used in a day-to-day programming? How do I start spotting the pattern? Thanks!
Applicatives are great when you've got a plain old function of several variables, and you have the arguments but they're wrapped up in some kind of context. For instance, you have the plain old concatenate function (++) but you want to apply it to 2 strings which were acquired through I/O. Then the fact that IO is an applicative functor comes to the rescue:
Prelude Control.Applicative> (++) <$> getLine <*> getLine
hi
there
"hithere"
Even though you explicitly asked for non-Maybe examples, it seems like a great use case to me, so I'll give an example. You have a regular function of several variables, but you don't know if you have all the values you need (some of them may have failed to compute, yielding Nothing). So essentially because you have "partial values", you want to turn your function into a partial function, which is undefined if any of its inputs is undefined. Then
Prelude Control.Applicative> (+) <$> Just 3 <*> Just 5
Just 8
but
Prelude Control.Applicative> (+) <$> Just 3 <*> Nothing
Nothing
which is exactly what you want.
The basic idea is that you're "lifting" a regular function into a context where it can be applied to as many arguments as you like. The extra power of Applicative over just a basic Functor is that it can lift functions of arbitrary arity, whereas fmap can only lift a unary function.
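For readers coming from Scala, the same lifting can be sketched for Option without any library. map2 is a hypothetical helper name chosen for this example (libraries such as Scalaz ship their own equivalents), and it is only meant to mirror (+) <$> Just 3 <*> Just 5:

```scala
object LiftDemo {
  // Lift a plain two-argument function so it works on optional arguments.
  // The result is defined only when both inputs are.
  def map2[A, B, C](fa: Option[A], fb: Option[B])(f: (A, B) => C): Option[C] =
    (fa, fb) match {
      case (Some(a), Some(b)) => Some(f(a, b))
      case _                  => None
    }

  val both: Option[Int] = map2(Option(3), Option(5))(_ + _)
  val missing: Option[Int] = map2(Option(3), Option.empty[Int])(_ + _)
}
```

Here both is Some(8) and missing is None, matching the two GHCi results above.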
Since many applicatives are also monads, I feel there's really two sides to this question.
Why would I want to use the applicative interface instead of the monadic one when both are available?
This is mostly a matter of style. Although monads have the syntactic sugar of do-notation, using applicative style frequently leads to more compact code.
In this example, we have a type Foo and we want to construct random values of this type. Using the monad instance for IO, we might write
data Foo = Foo Int Double
randomFoo = do
x <- randomIO
y <- randomIO
return $ Foo x y
The applicative variant is quite a bit shorter.
randomFoo = Foo <$> randomIO <*> randomIO
Of course, we could use liftM2 to get similar brevity, however the applicative style is neater than having to rely on arity-specific lifting functions.
In practice, I mostly find myself using applicatives in much the same way I use point-free style: to avoid naming intermediate values when an operation is more clearly expressed as a composition of other operations.
Why would I want to use an applicative that is not a monad?
Since applicatives are more restricted than monads, this means that you can extract more useful static information about them.
An example of this is applicative parsers. Whereas monadic parsers support sequential composition using (>>=) :: Monad m => m a -> (a -> m b) -> m b, applicative parsers only use (<*>) :: Applicative f => f (a -> b) -> f a -> f b. The types make the difference obvious: In monadic parsers the grammar can change depending on the input, whereas in an applicative parser the grammar is fixed.
By limiting the interface in this way, we can for example determine whether a parser will accept the empty string without running it. We can also determine the first and follow sets, which can be used for optimization, or, as I've been playing with recently, constructing parsers that support better error recovery.
I think of Functor, Applicative and Monad as design patterns.
Imagine you want to write a Future[T] class. That is, a class that holds values that are to be calculated.
In a Java mindset, you might create it like
trait Future[T] {
def get: T
}
Where 'get' blocks until the value is available.
You might realize this, and rewrite it to take a callback:
trait Future[T] {
def foreach(f: T => Unit): Unit
}
But then what happens if there are two uses for the future? It means you need to keep a list of callbacks. Also, what happens if a method receives a Future[Int] and needs to return a calculation based on the Int inside? Or what do you do if you have two futures and you need to calculate something based on the values they will provide?
But if you know of FP concepts, you know that instead of working directly on T, you can manipulate the Future instance.
trait Future[T] {
def map[U](f: T => U): Future[U]
}
Now your application changes so that each time you need to work on the contained value, you just return a new Future.
Once you start in this path, you can't stop there. You realize that in order to manipulate two futures, you just need to model as an applicative, in order to create futures, you need a monad definition for future, etc.
UPDATE: As suggested by @Eric, I've written a blog post: http://www.tikalk.com/incubator/blog/functional-programming-scala-rest-us
I finally understood how applicatives can help in day-to-day programming with that presentation:
https://web.archive.org/web/20100818221025/http://applicative-errors-scala.googlecode.com/svn/artifacts/0.6/chunk-html/index.html
The author shows how applicatives can help with combining validations and handling failures.
The presentation is in Scala, but the author also provides the full code example for Haskell, Java and C#.
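A minimal sketch of the validation idea in plain Scala. This is not the code from the presentation: Validated, map2, validName, validAge and User are all hypothetical names, and Either[List[String], A] stands in for a real validation type. The point is that both arguments are inspected, so errors accumulate instead of stopping at the first one:

```scala
object ValidationDemo {
  // Hypothetical alias: a list of error messages or a valid value
  type Validated[A] = Either[List[String], A]

  // Applicative-style combination: collects errors from both sides
  def map2[A, B, C](va: Validated[A], vb: Validated[B])(f: (A, B) => C): Validated[C] =
    (va, vb) match {
      case (Right(a), Right(b)) => Right(f(a, b))
      case (Left(e1), Left(e2)) => Left(e1 ++ e2)
      case (Left(e), _)         => Left(e)
      case (_, Left(e))         => Left(e)
    }

  final case class User(name: String, age: Int)

  def validName(s: String): Validated[String] =
    if (s.nonEmpty) Right(s) else Left(List("name is empty"))

  def validAge(n: Int): Validated[Int] =
    if (n >= 0) Right(n) else Left(List("age is negative"))

  def mkUser(name: String, age: Int): Validated[User] =
    map2(validName(name), validAge(age))((n, a) => User(n, a))
}
```

mkUser("", -1) reports both problems at once. A flatMap-based chain would stop after the first failure, because the second step depends on the result of the first.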
Warning: my answer is rather preachy/apologetic. So sue me.
Well, how often in your day-to-day Haskell programming do you create new data types? Sounds like you want to know when to make your own Applicative instance, and in all honesty unless you are rolling your own parser, you probably won't need to do it very much. Using applicative instances, on the other hand, you should learn to do frequently.
Applicative is not a "design pattern" like decorators or strategies. It is an abstraction, which makes it much more pervasive and generally useful, but much less tangible. The reason you have a hard time finding "practical uses" is because the example uses for it are almost too simple. You use decorators to put scrollbars on windows. You use strategies to unify the interface for both aggressive and defensive moves for your chess bot. But what are applicatives for? Well, they're a lot more generalized, so it's hard to say what they are for, and that's OK. Applicatives are handy as parsing combinators; the Yesod web framework uses Applicative to help set up and extract information from forms. If you look, you'll find a million and one uses for Applicative; it's all over the place. But since it's so abstract, you just need to get the feel for it in order to recognize the many places where it can help make your life easier.
I think Applicatives ease the general usage of monadic code. How many times have you had the situation that you wanted to apply a function but the function was not monadic and the value you want to apply it to is monadic? For me: quite a lot of times!
Here is an example that I just wrote yesterday:
ghci> import Data.Time.Clock
ghci> import Data.Time.Calendar
ghci> getCurrentTime >>= return . toGregorian . utctDay
in comparison to this using Applicative:
ghci> import Control.Applicative
ghci> toGregorian . utctDay <$> getCurrentTime
This form looks "more natural" (at least to my eyes :)
Coming at Applicative from "Functor" it generalizes "fmap" to easily express acting on several arguments (liftA2) or a sequence of arguments (using <*>).
Coming at Applicative from "Monad" it does not let the computation depend on the value that is computed. Specifically you cannot pattern match and branch on a returned value, typically all you can do is pass it to another constructor or function.
Thus I see Applicative as sandwiched in between Functor and Monad. Recognizing when you are not branching on the values from a monadic computation is one way to see when to switch to Applicative.
Here is an example taken from the aeson package:
data Coord = Coord { x :: Double, y :: Double }
instance FromJSON Coord where
parseJSON (Object v) =
Coord <$>
v .: "x" <*>
v .: "y"
There are some ADTs like ZipList that can have applicative instances, but not monadic instances. This was a very helpful example for me when understanding the difference between applicatives and monads. Since so many applicatives are also monads, it's easy to not see the difference between the two without a concrete example like ZipList.
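To give a feel for the ZipList idea in Scala terms, here is a rough sketch that contrasts the two ways of combining lists. The function names are made up, and this only illustrates the two behaviours. It is not an implementation of Haskell's ZipList type:

```scala
object ZipVsCartesian {
  // ZipList-style: combine elements position by position
  def zipMap2[A, B, C](xs: List[A], ys: List[B])(f: (A, B) => C): List[C] =
    xs.zip(ys).map { case (a, b) => f(a, b) }

  // The usual list monad/applicative: every combination of elements
  def cartesianMap2[A, B, C](xs: List[A], ys: List[B])(f: (A, B) => C): List[C] =
    for { a <- xs; b <- ys } yield f(a, b)
}
```

With List(1, 2) and List(10, 20) and addition, the zipping version gives List(11, 22) while the cartesian one gives List(11, 21, 12, 22).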
I think it might be worthwhile to browse the sources of packages on Hackage, and see first-hand how applicative functors and the like are used in existing Haskell code.
I described an example of practical use of the applicative functor in a discussion, which I quote below.
Note the code examples are pseudo-code for my hypothetical language which would hide the type classes in a conceptual form of subtyping, so if you see a method call for apply just translate into your type class model, e.g. <*> in Scalaz or Haskell.
If we mark elements of an array or hashmap with null or none to
indicate their index or key is valid yet valueless, the Applicative
enables without any boilerplate skipping the valueless elements while
applying operations to the elements that have a value. And more
importantly it can automatically handle any Wrapped semantics that
are unknown a priori, i.e. operations on T over
Hashmap[Wrapped[T]] (any over any level of composition, e.g. Hashmap[Wrapped[Wrapped2[T]]] because applicative is composable but monad is not).
I can already picture how it will make my code easier to
understand. I can focus on the semantics, not on all the
cruft to get me there and my semantics will be open under extension of
Wrapped whereas all your example code isn’t.
Significantly, I forgot to point out before that your prior examples
do not emulate the return value of the Applicative, which will be a
List, not a Nullable, Option, or Maybe. So even my attempts to
repair your examples were not emulating Applicative.apply.
Remember the functionToApply is the input to the
Applicative.apply, so the container maintains control.
list1.apply( list2.apply( ... listN.apply( List.lift(functionToApply) ) ... ) )
Equivalently.
list1.apply( list2.apply( ... listN.map(functionToApply) ... ) )
And my proposed syntactical sugar which the compiler would translate
to the above.
funcToApply(list1, list2, ... list N)
It is useful to read that interactive discussion, because I can't copy it all here. I expect that url to not break, given who the owner of that blog is. For example, I quote from further down the discussion.
the conflation of out-of-statement control flow with assignment is probably not desired by most programmers
Applicative.apply is for generalizing the partial application of functions to parameterized types (a.k.a. generics) at any level of nesting (composition) of the type parameter. This is all about making more generalized composition possible. The generality can’t be accomplished by pulling it outside the completed evaluation (i.e. return value) of the function, analogous to the onion can’t be peeled from the inside-out.
Thus it isn’t conflation, it is a new degree-of-freedom that is not currently available to you. Per our discussion up thread, this is why you must throw exceptions or store them in a global variable, because your language doesn’t have this degree-of-freedom. And that is not the only application of these category theory functors (expounded in my comment in moderator queue).
I provided a link to an example abstracting validation in Scala, F#, and C#, which is currently stuck in moderator queue. Compare the obnoxious C# version of the code. And the reason is because the C# is not generalized. I intuitively expect that C# case-specific boilerplate will explode geometrically as the program grows.

Why should I avoid using local modifiable variables in Scala?

I'm pretty new to Scala; until now I've mostly used Java. Right now I have warnings all over my code saying that I should "Avoid mutable local variables", and I have a simple question: why?
Suppose I have small problem - determine max int out of four. My first approach was:
def max4(a: Int, b: Int,c: Int, d: Int): Int = {
var subMax1 = a
if (b > a) subMax1 = b
var subMax2 = c
if (d > c) subMax2 = d
if (subMax1 > subMax2) subMax1
else subMax2
}
After taking into account this warning message I found another solution:
def max4(a: Int, b: Int,c: Int, d: Int): Int = {
max(max(a, b), max(c, d))
}
def max(a: Int, b: Int): Int = {
if (a > b) a
else b
}
It looks prettier, but what is the idea behind this?
Whenever I approach a problem I think about it like: "OK, we start from this, then we incrementally change things and get the answer". I understand that the issue is that I change some initial state to get an answer, but I do not understand why changing things, at least locally, is bad. How do you iterate over a collection, then, in functional languages like Scala?
Like an example: Suppose we have a list of ints, how to write a function that returns sublist of ints which are divisible by 6? Can't think of solution without local mutable variable.
In your particular case there is another solution:
def max4(a: Int, b: Int,c: Int, d: Int): Int = {
val submax1 = if (a > b) a else b
val submax2 = if (c > d) c else d
if (submax1 > submax2) submax1 else submax2
}
Isn't it easier to follow? Of course I am a bit biased but I tend to think it is, BUT don't follow that rule blindly. If you see that some code might be written more readably and concisely in mutable style, do it that way. The great strength of Scala is that you don't need to commit to either the immutable or the mutable approach; you can swing between them (btw, the same applies to use of the return keyword).
Like an example: Suppose we have a list of ints, how to write a
function that returns the sublist of ints which are divisible by 6?
Can't think of solution without local mutable variable.
It is certainly possible to write such function using recursion, but, again, if mutable solution looks and works good, why not?
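For completeness, a sketch of what the recursive version could look like, with an accumulator parameter taking the place of the local mutable variable. The names are just for this example:

```scala
import scala.annotation.tailrec

object DivisibleBySix {
  def divisibleBy6(xs: List[Int]): List[Int] = {
    // acc collects matches in reverse order, so it is reversed once at the end
    @tailrec
    def loop(rest: List[Int], acc: List[Int]): List[Int] = rest match {
      case Nil    => acc.reverse
      case h :: t => loop(t, if (h % 6 == 0) h :: acc else acc)
    }
    loop(xs, Nil)
  }
}
```

divisibleBy6(List(1, 6, 10, 12, 18, 20)) returns List(6, 12, 18). Whether this reads better than a loop with a mutable buffer is, as said, a judgment call.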
This is not so much about Scala as about functional programming methodology in general. The idea is the following: if you have constant variables (final in Java), you can use them without any fear that they are going to change. In the same way, you can parallelize your code without worrying about race conditions or thread-unsafe code.
In your example it is not so important; however, imagine the following example:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
val variable = ...
Future { function1(variable) }
Future { function2(variable) }
Using final variables you can be sure that there will not be any problem. Otherwise, you would have to check the main thread and both function1 and function2.
Of course, it's possible to obtain the same result with mutable variables if you never change them. But using immutable ones you can be sure that this will be the case.
Edit to answer your edit:
Local mutable variables are not bad; that's the reason you can use them. However, if you try to think of approaches without them, you can arrive at solutions like the one you posted, which is cleaner and can be parallelized very easily.
How to iterate over collection then in functional languages like Scala?
You can always iterate over an immutable collection, as long as you do not change anything. For example:
val list = Seq(1,2,3)
for (n <- list)
  println(n)
With respect to the second thing that you said: you have to stop thinking in a traditional way. In functional programming the usage of Map, Filter, Reduce, etc. is normal; as well as pattern matching and other concepts that are not typical in OOP. For the example you give:
Like an example: Suppose we have a list of ints, how to write a function that returns sublist of ints which are divisible by 6?
val list = Seq(1,6,10,12,18,20)
val result = list.filter(_ % 6 == 0)
Firstly you could rewrite your example like this:
def max(first: Int, others: Int*): Int = {
  // With nothing left to compare, `first` is the maximum (this also covers max(x) with no varargs).
  if (others.isEmpty) first
  else max(Math.max(first, others.head), others.tail: _*)
}
This uses varargs and tail recursion to find the largest number. Of course there are many other ways of doing the same thing.
To answer your question: it's a good question and one that I thought about myself when I first started to use Scala. Personally I think the whole immutable/functional programming approach is somewhat overhyped. But for what it's worth, here are the main arguments in favour of it:
Immutable code is easier to read (subjective)
Immutable code is more robust - it's certainly true that changing mutable state can lead to bugs. Take this for example:
for (int i=0; i<100; i++) {
for (int j=0; j<100; i++) {
System.out.println("i is " + i + " and j is " + j);
}
}
This is an oversimplified example but it's still easy to miss the bug, and the compiler won't help you.
Mutable code is generally not thread safe. Even trivial and seemingly atomic operations are not safe. Take i++, for example: it looks like an atomic operation but it's actually equivalent to:
int i = 0;
int tempI = i + 1;
i = tempI;
Immutable data structures won't allow you to do something like this, so you would need to think explicitly about how to handle it. Of course, as you point out, local variables are generally thread safe, but there is no guarantee. It's possible, for example, to pass a ListBuffer instance variable as a parameter to a method.
However there are downsides to immutable and functional programming styles:
Performance. It is generally slower in both compilation and runtime. The compiler must enforce the immutability and the JVM must allocate more objects than would be required with mutable data structures. This is especially true of collections.
Most Scala examples show something like val numbers = List(1,2,3), but in the real world hard-coded values are rare. We generally build collections dynamically (from a database query etc). Whilst Scala lets you reassign a variable that holds a collection, it must still create a new collection object every time you modify it. If you want to add 1000 elements to a Scala List (immutable), the JVM will need to allocate (and then GC) 1000 objects.
Hard to maintain. Functional code can be very hard to read, it's not uncommon to see code like this:
val data = numbers.foreach(_.map(a => doStuff(a).flatMap(somethingElse)).foldleft("", (a : Int,b: Int) => a + b))
I don't know about you but I find this sort of code really hard to follow!
Hard to debug. Functional code can also be hard to debug. Try putting a breakpoint halfway into my (terrible) example above
My advice would be to use a functional/immutable style where it genuinely makes sense and you and your colleagues feel comfortable doing it. Don't use immutable structures because they're cool or it's "clever". Complex and challenging solutions will get you bonus points at Uni but in the commercial world we want simple solutions to complex problems! :)
Your two main questions:
Why warn against local state changes?
How can you iterate over collections without mutable state?
I'll answer both.
Warnings
The compiler warns against the use of mutable local variables because they are often a cause of error. That doesn't mean this is always the case. However, your sample code is pretty much a classic example of where mutable local state is used entirely unnecessarily, in a way that not only makes it more error prone and less clear but also less efficient.
Your first code example is less efficient than your second, functional solution. Why potentially make two assignments to submax1 when you only ever need to make one? You ask which of the two inputs is larger anyway, so why not ask that first and then make one assignment? Why was your first approach to temporarily store partial state only halfway through the process of asking such a simple question?
Your first code example is also inefficient because of unnecessary code duplication. You're repeatedly asking "which is the biggest of two values?" Why write out the code for that 3 times independently? Needlessly repeating code is a known bad habit in OOP every bit as much as FP and for precisely the same reasons. Each time you needlessly repeat code, you open a potential source of error. Adding mutable local state (especially when so unnecessary) only adds to the fragility and to the potential for hard to spot errors, even in short code. You just have to type submax1 instead of submax2 in one place and you may not notice the error for a while.
Your second, FP solution removes the code duplication, dramatically reducing the chance of error, and shows that there was simply no need for mutable local state. It's also, as you yourself say, cleaner and clearer - and better than the alternative solution in om-nom-nom's answer.
(By the way, the idiomatic Scala way to write such a simple function is
def max(a: Int, b: Int) = if (a > b) a else b
which terser style emphasises its simplicity and makes the code less verbose)
Your first solution was inefficient and fragile, but it was your first instinct. The warning caused you to find a better solution. The warning proved its value. Scala was designed to be accessible to Java developers and is taken up by many with a long experience of imperative style and little or no knowledge of FP. Their first instinct is almost always the same as yours. You have demonstrated how that warning can help improve code.
There are cases where using mutable local state can be faster but the advice of Scala experts in general (not just the pure FP true believers) is to prefer immutability and to reach for mutability only where there is a clear case for its use. This is so against the instincts of many developers that the warning is useful even if annoying to experienced Scala devs.
It's funny how often some kind of max function comes up in "new to FP/Scala" questions. The questioner is very often tripping up on errors caused by their use of local state... which link both demonstrates the often obtuse addiction to mutable state among some devs and leads me on to your other question.
Functional Iteration over Collections
There are three functional ways to iterate over collections in Scala:
For Comprehensions
Explicit Recursion
Folds and other Higher Order Functions
For Comprehensions
Your question:
Suppose we have a list of ints, how to write a function that returns sublist of ints which are divisible by 6? Can't think of solution without local mutable variable
Answer: assuming xs is a list (or some other sequence) of integers, then
for (x <- xs; if x % 6 == 0) yield x
will give you a sequence (of the same type as xs) containing only those items which are divisible by 6, if any. No mutable state required. Scala just iterates over the sequence for you and returns anything matching your criteria.
If you haven't yet learned the power of for comprehensions (also known as sequence comprehensions) you really should. They are a very expressive and powerful part of Scala syntax. You can even use them with side effects and mutable state if you want (look at the final example on the tutorial I just linked to). That said, there can be unexpected performance penalties and they are overused by some developers.
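As a small illustration of the for comprehension above, here is a sketch wrapped in a function. The object name, function names and sample values are made up for the example. The second function shows what the compiler roughly rewrites such a guarded for comprehension into, namely a withFilter followed by a map:

```scala
object ForComprehensionDemo {
  // Keep only the multiples of 6; works on any Seq[Int].
  def divisibleBySix(xs: Seq[Int]): Seq[Int] =
    for (x <- xs if x % 6 == 0) yield x

  // Roughly the desugared form of the comprehension above.
  def divisibleBySixDesugared(xs: Seq[Int]): Seq[Int] =
    xs.withFilter(x => x % 6 == 0).map(x => x)

  val fromList   = divisibleBySix(List(1, 6, 10, 12, 18, 20)) // List(6, 12, 18)
  val fromVector = divisibleBySix(Vector(5, 7, 11))           // empty: nothing matches
}
```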
Explicit Recursion
In the question I linked to at the end of the first section, I give in my answer a very simple, explicitly recursive solution to returning the largest Int from a list.
def max(xs: List[Int]): Option[Int] = xs match {
case Nil => None
case List(x: Int) => Some(x)
case x :: y :: rest => max( (if (x > y) x else y) :: rest )
}
I'm not going to explain how the pattern matching and explicit recursion work (read my other answer or this one). I'm just showing you the technique. Most Scala collections can be iterated over recursively, without any need for mutable state. If you need to keep track of what you've been up to along the way, you pass along an accumulator. (In my example code, I stick the accumulator at the front of the list to keep the code smaller but look at the other answers to those questions for more conventional use of accumulators).
But here is a (naive) explicitly recursive way of finding those integers divisible by 6
def divisibleByN(n: Int, xs: List[Int]): List[Int] = xs match {
case Nil => Nil
case x :: rest if x % n == 0 => x :: divisibleByN(n, rest)
case _ :: rest => divisibleByN(n, rest)
}
I call it naive because it isn't tail recursive and so could blow your stack. A safer version can be written using an accumulator list and an inner helper function but I leave that exercise to you. The result will be less pretty code than the naive version, no matter how you try, but the effort is educational.
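For comparison, one possible shape for that exercise is sketched below. The helper name loop and the wrapping object are my own choices for illustration. The accumulator collects matches in reverse order, so it is reversed once at the end; the @tailrec annotation makes the compiler reject the helper if it is not actually tail recursive.

```scala
import scala.annotation.tailrec

object DivisibleTailRec {
  def divisibleByN(n: Int, xs: List[Int]): List[Int] = {
    // Inner helper: `acc` holds the matches found so far, newest first.
    @tailrec
    def loop(rest: List[Int], acc: List[Int]): List[Int] = rest match {
      case Nil                     => acc.reverse
      case x :: tail if x % n == 0 => loop(tail, x :: acc)
      case _ :: tail               => loop(tail, acc)
    }
    loop(xs, Nil)
  }
}
```

Calling DivisibleTailRec.divisibleByN(6, List(1, 6, 10, 12, 18, 20)) gives List(6, 12, 18), and the function copes with very long lists without growing the stack.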
Recursion is a very important technique to learn. That said, once you have learned to do it, the next important thing to learn is that you can usually avoid using it explicitly yourself...
Folds and other Higher Order Functions
Did you notice how similar my two explicit recursion examples are? That's because most recursions over a list have the same basic structure. If you write a lot of such functions, you'll repeat that structure many times. Which makes it boilerplate; a waste of your time and a potential source of error.
Now, there are any number of sophisticated ways to explain folds but one simple concept is that they take the boilerplate out of recursion. They take care of the recursion and the management of accumulator values for you. All they ask is that you provide a seed value for the accumulator and the function to apply at each iteration.
For example, here is one way to use fold to extract the highest Int from the list xs
xs.tail.foldRight(xs.head) {(a, b) => if (a > b) a else b}
I know you aren't familiar with folds, so this may seem gibberish to you but surely you recognise the lambda (anonymous function) I'm passing in on the right. What I'm doing there is taking the first item in the list (xs.head) and using it as the seed value for the accumulator. Then I'm telling the rest of the list (xs.tail) to iterate over itself, comparing each item in turn to the accumulator value.
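To make the roles of the two lambda parameters concrete, here is a small sketch with made-up sample data. It assumes xs is non-empty, since xs.head throws on an empty list. With foldRight the list element is the first argument and the accumulator the second; foldLeft swaps them.

```scala
object FoldMaxDemo {
  val xs = List(3, 9, 2, 7) // sample data, assumed non-empty

  // Same expression as above: xs.head seeds the accumulator,
  // `a` is the current element and `b` is the accumulator.
  val viaFoldRight = xs.tail.foldRight(xs.head) { (a, b) => if (a > b) a else b }

  // foldLeft passes the accumulator as the first argument instead.
  val viaFoldLeft = xs.tail.foldLeft(xs.head) { (acc, x) => if (x > acc) x else acc }
}
```

Both values come out as 9 for this list, because taking the larger of two numbers gives the same answer whichever end of the list you start from.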
This kind of thing is a common case, so the Collections api designers have provided a shorthand version:
xs.reduce {(a, b) => if (a > b) a else b}
(If you look at the source code, you'll see they have implemented it using a fold).
Anything you might want to do iteratively to a Scala collection can be done using a fold. Often, the api designers will have provided a simpler higher-order function which is implemented, under the hood, using a fold. Want to find those divisible-by-six Ints again?
xs.foldRight(Nil: List[Int]) {(x, acc) => if (x % 6 == 0) x :: acc else acc}
That starts with an empty list as the accumulator, iterates over every item, only adding those divisible by 6 to the accumulator. Again, a simpler fold-based HoF has been provided for you:
xs filter { _ % 6 == 0 }
Folds and related higher-order functions are harder to understand than for comprehensions or explicit recursion, but very powerful and expressive (to anybody else who understands them). They eliminate boilerplate, removing a potential source of error. Because they are implemented by the core language developers, they can be more efficient (and that implementation can change, as the language progresses, without breaking your code). Experienced Scala developers use them in preference to for comprehensions or explicit recursion.
tl;dr
Learn For comprehensions
Learn explicit recursion
Don't use them if a higher-order function will do the job.
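It is generally nicer to use immutable variables since they make your code easier to read. Writing recursive code can help solve your problem.
def max(x: List[Int]): Int = {
  if (x.isEmpty) {
    // Sentinel for the empty list; returning 0 would be wrong for lists of negative numbers.
    Int.MinValue
  }
  else {
    Math.max(x.head, max(x.tail))
  }
}
// a, b, c and d stand for Int values defined elsewhere.
val a_list = List(a,b,c,d)
val max_value = max(a_list)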