I'm following the Scala course on Coursera.
I've started to read the Scala book of Odersky as well.
What I often hear is that it's not a good idea to throw exceptions in functional languages, because it breaks the control flow and we usually return an Either with the Failure or Success.
It seems also that Scala 2.10 will provide the Try which goes in that direction.
But in the book and the course, Martin Odersky doesn't seem to say (at least for now) that exceptions are bad, and he uses them a lot.
I also noticed the methods assert / require...
Finally I'm a bit confused because I'd like to follow the best practices but they are not clear and the language seems to go in both directions...
Can someone explain me what i should use in which case?
The basic guideline is to use exceptions for something really exceptional**. For an "ordinary" failure, it's far better to use Option or Either. If you are interfacing with Java where exceptions are thrown when someone sneezes the wrong way, you can use Try to keep yourself safe.
Let's take some examples.
Suppose you have a method that fetches something from a map. What could go wrong? Well, something dramatic and dangerous like a segfault* stack overflow, or something expected like the element isn't found. You'd let the segfault stack overflow throw an exception, but if you merely don't find an element, why not return an Option[V] instead of the value or an exception (or null)?
Now suppose you're writing a program where the user is supposed to enter a filename. Now, if you're not just going to instantly bail on the program when something goes wrong, an Either is the way to go:
def main(args: Array[String]) {
val f = {
if (args.length < 1) Left("No filename given")
else {
val file = new File(args(0))
if (!file.exists) Left("File does not exist: "+args(0))
else Right(file)
}
}
// ...
}
Now suppose you want to parse an string with space-delimited numbers.
val numbers = "1 2 3 fish 5 6" // Uh-oh
// numbers.split(" ").map(_.toInt) <- will throw exception!
val tried = numbers.split(" ").map(s => Try(s.toInt)) // Caught it!
val good = tried.collect{ case Success(n) => n }
So you have three ways (at least) to deal with different types of failure: Option for it worked / didn't, in cases where not working is expected behavior, not a shocking and alarming failure; Either for when things can work or not (or, really, any case where you have two mutually exclusive options) and you want to save some information about what went wrong; and Try when you don't want the whole headache of exception handling yourself, but still need to interface with code that is exception-happy.
Incidentally, exceptions make for good examples--so you'll find them more often in a textbook or learning material than elsewhere, I think: textbook examples are very often incomplete, which means that serious problems that normally would be prevented by careful design ought instead be flagged by throwing an exception.
*Edit: Segfaults crash the JVM and should never happen regardless of the bytecode; even an exception won't help you then. I meant stack overflow.
**Edit: Exceptions (without a stack trace) are also used for control flow in Scala--they're actually quite an efficient mechanism, and they enable things like library-defined break statements and a return that returns from your method even though the control has actually passed into one or more closures. Mostly, you shouldn't worry about this yourself, except to realize that catching all Throwables is not such a super idea since you might catch one of these control flow exceptions by mistake.
So this is one of those places where Scala specifically trades off functional purity for ease-of-transition-from/interoperability-with legacy languages and environments, specifically Java. Functional purity is broken by exceptions, as they break referential integrity and make it impossible to reason equationally. (Of course, non-terminating recursions do the same, but few languages are willing to enforce the restrictions that would make those impossible.) To keep functional purity, you use Option/Maybe/Either/Try/Validation, all of which encode success or failure as a referentially-transparent type, and use the various higher-order functions they provide or the underlying languages special monad syntax to make things clearer. Or, in Scala, you can simply decide to ditch functional purity, knowing that it might make things easier in the short term but more difficult in the long. This is similar to using "null" in Scala, or mutable collections, or local "var"s. Mildly shameful, and don't do to much of it, but everyone's under deadline.
Related
I'm wondering if this is a bad code style to replace map with return and get on Try for readability? Say I have some variable with Try inside it and then I need to do anything on it.
val myData: Try[String]
I can do:
myData.flatMap{
some long code
}
Or I can do:
if (myData.isFailure) return myData
val myString = myData.get
some long code that use myString
Yes I would regard this as bad style. One of the biggest gains I have made as a programmer have come from replacing statements with expressions. So I would rewrite your example using pattern matching.
myData match {
case Success(myString) =>
some long code that use myString
case Failure(_) => tFileDf
}
I understand that if-guards are very common in languages like java and C# and they even make the code easier to read in these cases. Since moving to Scala I find less cases where if-guards would improve my code. However the move to expressions has greatly improved the quality of all of my code.
Once you make this change you will gradually use less mutable state and side effects. Gradually your methods will become pure functions. They will probably become shorter. Then you will discover the benefits of total functions. These things improve all of your code where if-guards are local optimisations that will start to clash with modern scala code, and even make it error prone to change.
In the case above I would probably consider exposing the Try, this exposes the failure case more explicitly than returning a default or error value of the same return type.
Another reason that return is discouraged is that it does not alway play nicely in functions.
My advice is to embrace the more functional aspects of Scala and see where it takes you. It will take you more time to write than the equivalent in Java it pay dividends very quickly.
In the chapter "Handling errors without exceptions" of book "functional programming in Scala", the author gives:
The problem of throwing exceptions from the body of a function
Use Option if we don't care about the actual exception
Use Either if we care about the actual exception
But scala.util.Try is not mentioned. From my point of view, I think Try is very suitable when we care about the actual exception, why it's not mentioned? Is there any reason I have missed?
I'm neither of the authors of Functional Programming in Scala, but I can make a few guesses about why they don't mention Try.
Some people don't like the standard library's Try because they claim it violates the functor composition law. I personally think that this position is kind of silly, for the reasons Josh Suereth mentions in the comments of SI-6284, but the debate does highlight an important aspect of Try's design.
Try's map and flatMap are explicitly designed to work with functions that may throw exceptions. People from the FPiS school of thought (including me) would tend to suggest wrapping such functions (if you absolutely have to deal with them at all) in safe versions at a low level in your program, and then exposing an API that will never throw (non-fatal) exceptions.
Including Try in your API muddles up the layers in this model—you're guaranteeing that your API methods won't throw exceptions, but then you're handing people a type that's designed to be used with functions that throw exceptions.
That's only a complaint about the standard library's design and implementation of Try, though. It's easy enough to imagine a version of Try with different semantics, where the map and flatMap methods didn't catch exceptions, and there would still be good reasons to avoid this "improved" version of Try whenever possible.
One of these reasons is that using Either[MyExceptionType, A] instead of Try[A] makes it possible to get more mileage out of the compiler's exhaustivity checking. Suppose I'm using the following simple ADT for errors in my application:
sealed class FooAppError(message: String) extends Exception(message)
case class InvalidInput(message: String) extends FooAppError(message)
case class MissingField(fieldName: String) extends FooAppError(
s"$fieldName field is missing"
)
Now I'm trying to decide whether a method that can only fail in one of these two ways should return Either[FooAppError, A] or Try[A]. Choosing Try[A] means we're throwing away information that's potentially useful both to human users and to the compiler. Suppose I write a method like this:
def doSomething(result: Either[FooAppError, String]) = result match {
case Right(x) => x
case Left(MissingField(_)) => "bad"
}
I'll get a nice compile-time warning telling me that the match is not exhaustive. If I add a case for the missing error, the warning goes away.
If I had used Try[String] instead, I'd also get exhaustivity checking, but the only way to get rid of the warning would be to have a catch-all case—it's just not possible to enumerate all Throwables in the pattern match.
Sometimes we actually can't conveniently limit the kinds of ways an operation can fail to our own failure type (like FooAppError above), and in these cases we can always use Either[Throwable, A]. Scalaz's Task, for example, is essentially a wrapper for Future[Throwable \/ A]. The difference is that Either (or \/) supports this kind of signature, while Try requires it. And it's not always what you want, for reasons like useful exhaustivity checking.
Given a method in UserService: update, what's the best way to handle errors/exceptions here?
Option A:
def update(...): Try[User]
In this way, I need to define my custom exceptions and throw them in the function body when needed. Most of these exceptions are business errors (e.g. user_id cannot be changed, etc). The point here is no matter what exception(s) are thrown (business error, network exception, DB IO exception, etc), treat them the same way and just return a Failure(err) - let the upper layer handle them.
Option B:
def update(...): Either[Error, User]
This is the exception-free way. In the function body it catches all possible exceptions and turns them into Error, and for business errors just return Left[Error].
Using Try seems to be a more natural way to me as I want to handle errors. Either is a more generic thing - Either[Error, T] is just one special case and I think Try is invented for this special case. But I also read that we should avoid using exceptions for error handling...
So, which solution is better, and why?
There's no silver bullet.
As you noted already, Try is simply a more specialized version of Either, where the Left type is fixed to Throwable.
Try might be a good fit if you need to materialize exceptions thrown by external (perhaps java) libraries, as its constructor automatically catches them.
Another advantage of Try is that it has map and flatMap, so you can use it directly in for-comprehensions, whereas with Either you would have to explicitly project on the right case.
Anyway, there's plenty of alternative implementations with a "right-bias", and probably the scalaz \/ type is the most popular one.
That being said, I typically use \/ or the almost equivalent Validation (both from scalaz), as I like having the ability of returning errors that do not extend Throwable.
It also allows for more precise error types, which is a huge win.
So I'm learning functional Scala, and the book says exception breaks referential transparency, and thus Option should be used instead, like so:
def pattern(s: String): Option[Pattern] = {
try {
Some(Pattern.compile(s))
} catch {
case e: PatternSyntaxException => None
}
}
This seems pretty bad; I mean it seems equivalent to:
catch(Exception e){
return null;
}
Save for the fact that we can distinguish "null for error" from "null as genuine value". It seems it should at least return something that contains the error information like:
catch {
case e: Exception => Fail(e)
}
What am I missing?
At this specific section, Option is used mostly as an example because the operation used (calculating the mean) is a partial function, it doesn't produce a value for all possible values (the collection could be empty, thus there's no way to calculate the mean) and Option could be a valid case here. If you can't calculate the mean because the collection is empty just return a None.
But there are many other ways to solve this problem, you could use Either[L,R], with the Left being the error result and a Right as being the good result, you could still throw an exception and wrap it inside a Try object (which seems more common nowadays due to it's use in Promise and Future computations), you could use ScalaZ Validation if the error was actually a validation issue.
The main concept you should take a way from this part is that the error should be part of the return type of the function and not some magic operation (the exception) that can't be reasonably declared by the types.
And as a shameless plug, I did blog about Either and Try here.
It would be easier to answer this question if you weren't asking "why is Option better than exceptions?" and "why is Option better than null?" and "why is Option better than Try?" all at the same time.
The answer to the first of these questions is that using exceptions in situations that aren't truly exceptional muddles the control flow of your program. This is where referential transparency comes in—it's much easier for me (or you) to reason about your code if I can think in terms of values and don't have to keep track of where exceptions are being thrown and caught.
The answer to the second question (why not null?) is something like "Have you ever had to deal with NullPointerException in Java?".
For the third question, in general you're right—it's better to use a type like Either[Throwable, A] or Try[A] to represent computations that can fail, since they allow you to pass along more detailed information about the failure. In some cases, though, when a function can only fail in a single obvious way, it makes sense to use Option. For example, if I'm performing a lookup in a map, I probably don't really need or want something like an Either[NoSuchElementException, A], where the error is so abstract that I'd probably end up wrapping it in something more domain-specific anyway. So get on a map just returns an Option[A].
You should use util.Try:
scala> import java.util.regex.Pattern
import java.util.regex.Pattern
scala> def pattern(s: String): util.Try[Pattern] = util.Try(Pattern.compile(s))
pattern: (s: String)scala.util.Try[java.util.regex.Pattern]
scala> pattern("<?++")
res0: scala.util.Try[java.util.regex.Pattern] =
Failure(java.util.regex.PatternSyntaxException: Dangling meta character '+' near index 3
<?++
^)
scala> pattern("[.*]")
res1: scala.util.Try[java.util.regex.Pattern] = Success([.*])
The naive example
def pattern(s: String): Pattern = {
Pattern.compile(s)
}
has a sideeffect, it can influence the programm that uses it by other means than its result(it can cause a exception). This is discouraged in functional programming, because it increases the code complexity.
The code
def pattern(s: String): Option[Pattern] = {
try {
Some(Pattern.compile(s))
} catch {
case e: PatternSyntaxException => None
}
}
encapsulates the side effect producing part of the programm. The information why the Pattern failed is lost, but sometimes it only matters whether or not it fails. If it matters why the method failed one can use Try(http://www.scala-lang.org/files/archive/nightly/docs/library/index.html#scala.util.Try):
def pattern(s: String): Try[Pattern] = {
Try(Pattern.compile(s))
}
I think the other two answers give you good suggestions about how to proceed. I would still argue that throwing an exception is well represented in Scala's type system, using the bottom type Nothing. So it is well-typed, and I wouldn't exactly called it "magic operation".
However... if your method can quite commonly result in an invalid value, that is if your call side quite reasonably wants to handle such an invalid value straight away, then using Option, Either or Try is a good approach. In a scenario, where your call site doesn't really know what to do with such an invalid value, especially if it is an exceptional condition and not the common case, then you should use exceptions IMO.
The problem of exception is precisely not that they are not well working with functional programming, but that they can be difficult to reason about when you have side effects. Because then your call site must ensure to undo the side effects in the case of an exception. If your call site is purely functional, passing on an exception doesn't do any damage.
If any functions that does anything with integers would declare its return type a Try because of division-by-zero or overflow possibilities, this might totally clutter your code. Another very good reason to use exceptions is invalid argument ranges, or requirements. If you expect an argument to be an integer between 0 and x, you may well throw an IllegalArgumentException if it does not meet that property; conveniently in Scala: require(a >= 0 && a < x).
Why is the method get defined in Option and not in Some?
One could apply pattern matching or use foreach, map, flatMap, getOrElse which is preferred anyway without the danger of runtime exceptions if None.get is called.
Are there really cases where the get method is so much convinent that it justifies this? Using the get method reminds me to Java where "I don't have to check for null because I know the var is set" and then see a NullPointerException.
Option.get is occasionally useful when you know, by reasoning that isn't captured by the type system, that the value you have is not None. However, you should reach for it only when you've tried:
Arranging for this knowledge to be expressed by your types.
Using the Option processing methods mentioned (flatMap, getOrElse etc), and none of them fit elegantly.
It's also convenient for throwaway / interactive code (REPL, worksheets etc).
Foreword
Scala was created with compatibility for the JVM as a target.
So it's pretty natural that Exceptions and try/catch should be considered as part of the language, at least for java interoperability.
With this premise in mind, now to the point of including interfaces that can throw errors in algebraic data types (or case classes if you prefer).
To the Question
Assume we encapsulate the get method (or head for List) only in Some instance.
This implies that we're forced to pattern match on the Option every time we want to extract the value. It wouldn't look pretty neat, but it's still a possibility.
We could getOrElse everywhere but this means that I can't signal a failure through the error stack if using imperative style (which is not forbidden in scala, the last time I checked).
My point here is that getOrElse and headOption are available to enforce a functional style when desired, but get and head are good alternatives if imperative is a choice fit to the problem or the programmer's preference.
There is no good reason for it. Martin Odersky added it in this commit in 2007, so I guess we'd need to ask him his motivations. The earlier version had getOrElse semantics.
If you examine the commit provided. It's kind of obvious why and
when it is used. The get is used to unbox when you are sure there is Something in the
Option. It would also serve to be an inheritable method where None.get Throws an exception and Some.get unboxes the contents.
In some cases, when your Option turns out to be a None, there is no recourse and you want to throw an exception. So, instead of writing
myOption match {
case Some(value) => // use value
case None => throw new RuntimeException("I really needed that value")
}
you can simply write
myOption.get // might throw an exception
It also makes Option much nicer to use from Java, in conjunction with isDefined:
if (myOption.isDefined()) process(myOption.get());
else // it's a None!