Why should one prefer Option for error handling over exceptions in Scala? - scala

So I'm learning functional Scala, and the book says exception breaks referential transparency, and thus Option should be used instead, like so:
def pattern(s: String): Option[Pattern] = {
try {
Some(Pattern.compile(s))
} catch {
case e: PatternSyntaxException => None
}
}
This seems pretty bad; I mean it seems equivalent to:
catch(Exception e){
return null;
}
Save for the fact that we can distinguish "null for error" from "null as genuine value". It seems it should at least return something that contains the error information like:
catch {
case e: Exception => Fail(e)
}
What am I missing?

At this specific section, Option is used mostly as an example because the operation used (calculating the mean) is a partial function, it doesn't produce a value for all possible values (the collection could be empty, thus there's no way to calculate the mean) and Option could be a valid case here. If you can't calculate the mean because the collection is empty just return a None.
But there are many other ways to solve this problem, you could use Either[L,R], with the Left being the error result and a Right as being the good result, you could still throw an exception and wrap it inside a Try object (which seems more common nowadays due to it's use in Promise and Future computations), you could use ScalaZ Validation if the error was actually a validation issue.
The main concept you should take a way from this part is that the error should be part of the return type of the function and not some magic operation (the exception) that can't be reasonably declared by the types.
And as a shameless plug, I did blog about Either and Try here.

It would be easier to answer this question if you weren't asking "why is Option better than exceptions?" and "why is Option better than null?" and "why is Option better than Try?" all at the same time.
The answer to the first of these questions is that using exceptions in situations that aren't truly exceptional muddles the control flow of your program. This is where referential transparency comes in—it's much easier for me (or you) to reason about your code if I can think in terms of values and don't have to keep track of where exceptions are being thrown and caught.
The answer to the second question (why not null?) is something like "Have you ever had to deal with NullPointerException in Java?".
For the third question, in general you're right—it's better to use a type like Either[Throwable, A] or Try[A] to represent computations that can fail, since they allow you to pass along more detailed information about the failure. In some cases, though, when a function can only fail in a single obvious way, it makes sense to use Option. For example, if I'm performing a lookup in a map, I probably don't really need or want something like an Either[NoSuchElementException, A], where the error is so abstract that I'd probably end up wrapping it in something more domain-specific anyway. So get on a map just returns an Option[A].

You should use util.Try:
scala> import java.util.regex.Pattern
import java.util.regex.Pattern
scala> def pattern(s: String): util.Try[Pattern] = util.Try(Pattern.compile(s))
pattern: (s: String)scala.util.Try[java.util.regex.Pattern]
scala> pattern("<?++")
res0: scala.util.Try[java.util.regex.Pattern] =
Failure(java.util.regex.PatternSyntaxException: Dangling meta character '+' near index 3
<?++
^)
scala> pattern("[.*]")
res1: scala.util.Try[java.util.regex.Pattern] = Success([.*])

The naive example
def pattern(s: String): Pattern = {
Pattern.compile(s)
}
has a sideeffect, it can influence the programm that uses it by other means than its result(it can cause a exception). This is discouraged in functional programming, because it increases the code complexity.
The code
def pattern(s: String): Option[Pattern] = {
try {
Some(Pattern.compile(s))
} catch {
case e: PatternSyntaxException => None
}
}
encapsulates the side effect producing part of the programm. The information why the Pattern failed is lost, but sometimes it only matters whether or not it fails. If it matters why the method failed one can use Try(http://www.scala-lang.org/files/archive/nightly/docs/library/index.html#scala.util.Try):
def pattern(s: String): Try[Pattern] = {
Try(Pattern.compile(s))
}

I think the other two answers give you good suggestions about how to proceed. I would still argue that throwing an exception is well represented in Scala's type system, using the bottom type Nothing. So it is well-typed, and I wouldn't exactly called it "magic operation".
However... if your method can quite commonly result in an invalid value, that is if your call side quite reasonably wants to handle such an invalid value straight away, then using Option, Either or Try is a good approach. In a scenario, where your call site doesn't really know what to do with such an invalid value, especially if it is an exceptional condition and not the common case, then you should use exceptions IMO.
The problem of exception is precisely not that they are not well working with functional programming, but that they can be difficult to reason about when you have side effects. Because then your call site must ensure to undo the side effects in the case of an exception. If your call site is purely functional, passing on an exception doesn't do any damage.
If any functions that does anything with integers would declare its return type a Try because of division-by-zero or overflow possibilities, this might totally clutter your code. Another very good reason to use exceptions is invalid argument ranges, or requirements. If you expect an argument to be an integer between 0 and x, you may well throw an IllegalArgumentException if it does not meet that property; conveniently in Scala: require(a >= 0 && a < x).

Related

Is returning Either/Option/Try/Or considered a viable / idiomatic approach when function has preconditions for arguments?

First of all, I'm very new to Scala and don't have any experience writing production code with it, so I lack understanding of what is considered a good/best practice among community. I stumbled upon these resources:
https://github.com/alexandru/scala-best-practices
https://nrinaudo.github.io/scala-best-practices/
It is mentioned there that throwing exceptions is not very good practice, which made me think what would be a good way to define preconditions for function then, because
A function that throws is a bit of a lie: its type implies it’s total function when it’s not.
After a bit of research, it seems that using Option/Either/Try/Or(scalactic) is a better approach, since you can use something like T Or IllegalArgumentException as return type to clearly indicate that function is actually partial, using exception as a way to store message that can be wrapped in other exceptions.
However lacking Scala experience I don't quite understand if this is actually viable approach for a real project or using Predef.require is a way to go. I would appreciate if someone explained how things are usually done in Scala community and why.
I've also seen Functional assertion in Scala, but while the idea itself looks interesting, I think PartialFunction is not very suitable for the purpose as it is, because often more than one argument is passed and tuples look like a hack in this case.
Option or Either is definitely the way to go for functional programming.
With Option it is important to document why None might be returned.
With Either, the left side is the unsuccessful value (the "error"), while the right side is the successful value. The left side does not necessarily have to be an Exception (or a subtype of it), it can be a simple error message String (type aliases are your friend here) or a custom data type that is suitable for you application.
As an example, I usually use the following pattern when error handling with Either:
// Somewhere in a package.scala
type Error = String // Or choose something more advanced
type EitherE[T] = Either[Error, T]
// Somewhere in the program
def fooMaybe(...): EitherE[Foo] = ...
Try should only be used for wrapping unsafe (most of the time, plain Java) code, giving you the ability to pattern-match on the result:
Try(fooDangerous()) match {
case Success(value) => ...
case Failure(value) => ...
}
But I would suggest only using Try locally and then go with the above mentioned data types from there.
Some advanced datatypes like cats.effect.IO or monix.reactive.Observable contain error handling natively.
I would also suggest looking into cats.data.EitherT for typeclass-based error handling. Read the documentation, it's definitely worth it.
As a sidenote, for everyone coming from Java, Scala treats all Exceptions as Java treats RuntimeExceptions. That means, even when an unsafe piece of code from one of your dependencies throws a (checked) IOException, Scala will never require you to catch or otherwise handle the exception. So as a rule of thumb, when using Java - dependencies, almost always wrap them in a Try (or an IO if they execute side effects or block the thread).
I think your reasoning is correct. If you have a simple total (opposite of partial) function with arguments that can have invalid types then the most common and simple solution is to return some optional result like Option, etc.
It's usually not advisable to throw exceptions as they break FP laws. You can use any library that can return a more advanced type than Option like Scalaz Validation if you need to compose results in ways that are awkward with Option.
Another two alternatives I could offer is to use:
Type constrained arguments that enforce preconditions. Example: val i: Int Refined Positive = 5 based on https://github.com/fthomas/refined. You can also write your own types which wrap primitive types and assert some properties. The problem here is if you have arguments that have multiple interdependent valid values which are mutually exclusive per argument. For instance x > 1 and y < 1 or x < 1 and y > 1. In such case you can return an optional value instead of using this approach.
Partial functions, which in the essence resemble optional return types: case i: Int if i > 0 => .... Docs: https://www.scala-lang.org/api/2.12.1/scala/PartialFunction.html.
For example:
PF's def lift: (A) ⇒ Option[B] converts PF to your regular function.
Turns this partial function into a plain function returning an Option
result.
Which is similar to returning an option. The problem with partial functions that they are a bit awkward to use and not fully FP friendly.
I think Predef.require belongs to very rare cases where you don't want to allow any invalid data to be constructed and is more of a stop-everything-if-this-happens kind of measure. Example would be that you get arguments you never supposed to get.
You use the return type of the function to indicate the type of the result.
If you want to describe a function that can fail for whatever reason, of the types you mentioned you would probably return Try or Either: I am going to "try" to give your a result, or I am going to return "either" a success or an failure.
Now you can specify a custom exception
case class ConditionException(message: String) extends RuntimeException(message)
that you would return if your condition is not satisfied, e.g
import scala.util._
def myfunction(a: String, minLength: Int): Try[String] = {
if(a.size < minLength) {
Failure(ConditionException(s"string $a is too short")
} else {
Success(a)
}
}
and with Either you would get
import scala.util._
def myfunction(a: String, minLength: Int): Either[ConditionException,String] = {
if(a.size < minLength) {
Left(ConditionException(s"string $a is too short")
} else {
Right(a)
}
}
Not that the Either solution clearly indicates the error your function might return

Effects to use for wrapping impure methods?

I'm trying to understand how to use effect monads (cats.effect.IO or scalaz.IO does not matter). Imagine I have the following method:
def extract(str: String): String = {
if(str.contains("123"))
"123"
else
throw new IllegalArgumentException("Boom!")
}
Since this method is impure (throws exception) and I need to combine its result with another effectful computation (network-IO) is it a good practice to just wrap it into IO as follows:
def extract(str: String): IO[String] = IO {
if(str.contains("123"))
"123"
else
throw new IllegalArgumentException("Boom!")
}
Is it a common use case of Effect monads?
Whether this is appropriate or not depends on what you want to express.
It's not like your first def extract(str: String): String-method definition is somehow invalid: you are just sweeping all the exceptions and side-effects under the rug, so that they aren't visible in the signature. If this is irrelevant for your current project, and if it's acceptable that a program simply crashes with a long stack trace of the thrown exception, then do it, no problem (it's easy to imagine one-off throw-away scripts where this would be appropriate).
If instead you declare def extract(str: String): IO[String] = IO { ... }, then you at least can see in the signature that the function extract can do something impure (throw an exception, in this case). Now the question becomes: who is responsible for dealing with this exception, or where do you want to deal with this exception? Consider this: if the exception is thrown, it will emerge in your code in the line where something like yourProgram.unsafeRunSync() is invoked. Does it make sense to deal with this exception there? Maybe it does, maybe it doesn't: nobody can tell you. If you just want to catch the exception at the top level in your main, log it, and exit, then it's appropriate.
However, if you want to deal with the exception immediately, you have better options. For example, if you are writing a method that prompts for a file name, then tries to extract something from this name, and finally does some IO on files, then the return type IO[String] might be too opaque. You might want to use some other monad instead (Option, Either, Try etc.) to represent the failed extraction. For example, if you use Option[String] as return type, you no longer have to deal with an obscure exception thousand miles away in your main method, but instead you can deal with it immediately, and, for example, prompt for a new file name repeatedly.
The whole exercise is somewhat similar to coming up with a strategy how to deal with exceptions in general, only here you are expressing it explicitly in the type signatures of your methods.

What's the recommended way to deal with errors in Scala?

Let's say that I have a method addUser that adds a user to database. When called, the method might:
succeed
fail, because the input was invalid (i. e. the user name already exists)
fail, because the database crashed or whatever
The method would probably consist of a single database API call that would in case of failure throw an exception. If it was in plain Java, I'd probably catch the exception inside the method and examine the reason. If it fell in the second category (invalid input), I would throw a custom checked exception explaining the reason (for example UserAlreadyExistsException). In case of the second category, I'd just re-throw the original exception.
I know that there are strong opinions in Java about error handling so there might be people disagreeing with this approach but I'd like to focus on Scala now.
The advantage of the described approach is that when I call addUser I can choose to catch UserAlreadyExistsException and deal with it (because it's appropriate for my current level of abstraction) but at the same time I can choose to completely ignore any other low-level database exception that might be thrown and let other layers deal with it.
Now, how do I achieve the same thing in Scala? Or what would be the right Scala approach? Obviously, exceptions would work in Scala exactly the same way but I came across opinions that there are better and more suitable ways.
As far as I know, I could go either with Option, Either or Try. Neither of those, however, seem as elegant as good old exceptions.
For example, dealing with the Try result would look like this (borrowed from similar question):
addUser(user) match {
case Success(user) => Ok(user)
case Failure(t: PSQLException) if(e.getSQLState == "23505") => InternalServerError("Some sort of unique key violation..")
case Failure(t: PSQLException) => InternalServerError("Some sort of psql error..")
case Failure(_) => InternalServerError("Something else happened.. it was bad..")
}
Those last two lines are exactly something I'd like to avoid because I'd have to add them anywhere I make a database query (and counting on MatchError doesn't seem like a good idea).
Also dealing with multiple error sources seems a bit cumbersome:
(for {
u1 <- addUser(user1)
u2 <- addUser(user2)
u3 <- addUser(user3)
} yield {
(u1, u2, u3)
}) match {
case Success((u1, u2, u3)) => Ok(...)
case Failure(...) => ...
}
How is that better than:
try {
u1 = addUser(user1)
u2 = addUser(user2)
u3 = addUser(user3)
Ok(...)
} catch {
case (e: UserAlreadyExistsException) => ...
}
Has the former approach any advantages that I'm not aware of?
From what I understood, Try is very useful when passing exceptions between different threads but as long as I'm working within a single thread, it doesn't make much sense.
I'd like to hear some arguments and recommendations about this.
Much of this topic is of course a matter of opinion. Still, there are some concrete points that can be made:
You are correct to observe that Option, Either and Try are quite generic; the names do not provide much documentation. Therefore, you could consider a custom sealed trait:
sealed trait UserAddResult
case object UserAlreadyExists extends UserAddResult
case class UserSuccessfullyAdded(user: User) extends UserAddResult
This is functionally equivalent to an Option[User], with the added benefit of documentation.
Exceptions in Scala are always unchecked. Therefore, you would use them in the same cases you use unchecked exceptions in Java, and not for the cases where you would use checked exceptions.
There are monadic error handling mechanisms such as Try, scalaz's Validation, or the monadic projections of Either.
The primary purpose of these monadic tools is to be used by the caller to organize and handle several exceptions together. Therefore, there is not much benefit, either in terms of documentation or behavior, to having your method return these types. Any caller who wants to use Try or Validation can convert your method's return type to their desired monadic form.
As you can maybe guess from the way I phrased these points, I favor defining custom sealed traits, as this provides the best self-documenting code. But, this is a matter of taste.

What is the difference between Try and Either?

According to the documentation:
The Try type represents a computation that may either result in an
exception, or return a successfully computed value. It's similar to,
but semantically different from the scala.util.Either type.
The docs do not go into further detail as to what the semantic difference is. Both seem to be able to communicate successes and failures. Why would you use one over the other?
I covered the relationship between Try, Either, and Option in this answer. The highlights from there regarding the relationship between Try and Either are summarized below:
Try[A] is isomorphic to Either[Throwable, A]. In other words you can treat a Try as an Either with a left type of Throwable, and you can treat any Either that has a left type of Throwable as a Try. It is conventional to use Left for failures and Right for successes.
Of course, you can also use Either more broadly, not only in situations with missing or exceptional values. There are other situations where Either can help express the semantics of a simple union type (where value is one of two types).
Semantically, you might use Try to indicate that the operation might fail. You might similarly use Either in such a situation, especially if your "error" type is something other than Throwable (e.g. Either[ErrorType, SuccessType]). And then you might also use Either when you are operating over a union type (e.g. Either[PossibleType1, PossibleType2]).
Since Scala 2.12, the standard library does include the conversions from Either to Try or from Try to Either. For earlier versions, it is pretty simple to enrich Try, and Either as needed:
object TryEitherConversions {
implicit class EitherToTry[L <: Throwable, R](val e: Either[L, R]) extends AnyVal {
def toTry: Try[R] = e.fold(Failure(_), Success(_))
}
implicit class TryToEither[T](val t: Try[T]) extends AnyVal {
def toEither: Either[Throwable, T] =
t.map(Right(_)).recover(Left(_)).get
}
}
This would allow you to do:
import TryEitherConversions._
//Try to Either
Try(1).toEither //Either[Throwable, Int] = Right(1)
Try("foo".toInt).toEither //Either[Throwable, Int] = Left(java.lang.NumberFormatException)
//Either to Try
Right[Throwable, Int](1).toTry //Success(1)
Left[Throwable, Int](new Exception).toTry //Failure(java.lang.Exception)
To narrowly answer your question: "What's the semantic difference":
This probably refers to flatMap and map, which are non-existent in Either and either propagate failure or map the success value in Try. This allows, for instance, chaining like
for {
a <- Try {something}
b <- Try {somethingElse(a)}
c <- Try {theOtherThing(b)}
} yield c
which does just what you'd hope - returns a Try containing either the first exception, or the result.
Try has lots of other useful methods, and of course its companion apply method, that make it very convenient for its intended use - exception handling.
If you really want to be overwhelmed, there are two other classes out there which may be of interest for this kind of application. Scalaz has a class called "\/" (formerly known as Prince), pronounced "Either", which is mostly like Either, but flatMap and map work on the Right value. Similarly, and not, Scalactic has an "Or" which is also similar to Either, but flatMap and map work on the Left value.
I don't recommend Scalaz for beginners.
Either does not imply success and failure, it is just a container for either an A or a B. It is common to use it to represent successes and failures, the convention being to put the failure on the left side, and the success on the right.
A Try can be seen as an Either with the left-side type set to Throwable. Try[A] would be equivalent to Either[Throwable, A].
Use Try to clearly identify a potential failure in the computation, the failure being represented by an exception. If you want to represent the failure with a different type (like a String, or a set of case classes extending a sealed trait for example) use Either.
Either is more general, since it simply represents disjoint unions of types.
In particular, it can represent a union of valid return values of some type X and Exception. However, it does not attempt to catch any exceptions on its own. You have to add try-catch blocks around dangerous code, and then make sure that each branch returns an appropriate subclass of Either (usually: Left for errors, Right for successful computations).
Try[X] can be thought of as Either[Exception, X], but it also catches Exceptions on its own.
Either[X, Y] usage is more general. As its name say it can represent either an object of X type or of Y.
Try[X] has only one type and it might be either a Success[X] or a Failure (which contains a Throwable).
At some point you might see Try[X] as an Either[Throwable,X]
What is nice about Try[X] is that you can chain futher operations to it, if it is really a Success they will execute, if it was a Failure they won't
val connection = Try(factory.open())
val data = connection.flatMap(conn => Try(conn.readData()))
//At some point you can do
data matches {
Success(data) => print data
Failure(throwable) => log error
}
Of course, you can always oneline this like
Try(factory.open()).flatMap(conn => Try(conn.readData()) matches {
Success(data) => print data
Failure(throwable) => log error
}
As already have been mentioned, Either is more general, so it might not only wrap error/successful result, but also can be used as an alternative to Option, for branching the code path.
For abstracting the effect of an error, only for this purpose, I identified the following differences:
Either can be used to specify a description of the error, which can be shown to the client. Try - wraps an exception with a stack trace, less descriptive, less client oriented, more for internal usage.
Either allows us to specify error type, with existing monoid for this type. As a result, it allows us to combine errors (usually via applicative effects). Try abstraction with its exception, has no monoid defined. With Try we must spent more effort to extract error and handle it.
Based on it, here is my best practices:
When I want to abstract effect of error, I always use Either as the first choice, with List/Vector/NonEmptyList as error type.
Try is used only, when you invoke code, written in OOP. Good candidates for Try are methods, that might throw an exception, or methods, that sends request to external systems (rest/soap/database requests in case the methods return a raw result, not wrapped into FP abstractions, like Future, for instance.

Throwing exceptions in Scala, what is the "official rule"

I'm following the Scala course on Coursera.
I've started to read the Scala book of Odersky as well.
What I often hear is that it's not a good idea to throw exceptions in functional languages, because it breaks the control flow and we usually return an Either with the Failure or Success.
It seems also that Scala 2.10 will provide the Try which goes in that direction.
But in the book and the course, Martin Odersky doesn't seem to say (at least for now) that exceptions are bad, and he uses them a lot.
I also noticed the methods assert / require...
Finally I'm a bit confused because I'd like to follow the best practices but they are not clear and the language seems to go in both directions...
Can someone explain me what i should use in which case?
The basic guideline is to use exceptions for something really exceptional**. For an "ordinary" failure, it's far better to use Option or Either. If you are interfacing with Java where exceptions are thrown when someone sneezes the wrong way, you can use Try to keep yourself safe.
Let's take some examples.
Suppose you have a method that fetches something from a map. What could go wrong? Well, something dramatic and dangerous like a segfault* stack overflow, or something expected like the element isn't found. You'd let the segfault stack overflow throw an exception, but if you merely don't find an element, why not return an Option[V] instead of the value or an exception (or null)?
Now suppose you're writing a program where the user is supposed to enter a filename. Now, if you're not just going to instantly bail on the program when something goes wrong, an Either is the way to go:
def main(args: Array[String]) {
val f = {
if (args.length < 1) Left("No filename given")
else {
val file = new File(args(0))
if (!file.exists) Left("File does not exist: "+args(0))
else Right(file)
}
}
// ...
}
Now suppose you want to parse an string with space-delimited numbers.
val numbers = "1 2 3 fish 5 6" // Uh-oh
// numbers.split(" ").map(_.toInt) <- will throw exception!
val tried = numbers.split(" ").map(s => Try(s.toInt)) // Caught it!
val good = tried.collect{ case Success(n) => n }
So you have three ways (at least) to deal with different types of failure: Option for it worked / didn't, in cases where not working is expected behavior, not a shocking and alarming failure; Either for when things can work or not (or, really, any case where you have two mutually exclusive options) and you want to save some information about what went wrong; and Try when you don't want the whole headache of exception handling yourself, but still need to interface with code that is exception-happy.
Incidentally, exceptions make for good examples--so you'll find them more often in a textbook or learning material than elsewhere, I think: textbook examples are very often incomplete, which means that serious problems that normally would be prevented by careful design ought instead be flagged by throwing an exception.
*Edit: Segfaults crash the JVM and should never happen regardless of the bytecode; even an exception won't help you then. I meant stack overflow.
**Edit: Exceptions (without a stack trace) are also used for control flow in Scala--they're actually quite an efficient mechanism, and they enable things like library-defined break statements and a return that returns from your method even though the control has actually passed into one or more closures. Mostly, you shouldn't worry about this yourself, except to realize that catching all Throwables is not such a super idea since you might catch one of these control flow exceptions by mistake.
So this is one of those places where Scala specifically trades off functional purity for ease-of-transition-from/interoperability-with legacy languages and environments, specifically Java. Functional purity is broken by exceptions, as they break referential integrity and make it impossible to reason equationally. (Of course, non-terminating recursions do the same, but few languages are willing to enforce the restrictions that would make those impossible.) To keep functional purity, you use Option/Maybe/Either/Try/Validation, all of which encode success or failure as a referentially-transparent type, and use the various higher-order functions they provide or the underlying languages special monad syntax to make things clearer. Or, in Scala, you can simply decide to ditch functional purity, knowing that it might make things easier in the short term but more difficult in the long. This is similar to using "null" in Scala, or mutable collections, or local "var"s. Mildly shameful, and don't do to much of it, but everyone's under deadline.