Better to return None or throw an exception when fetching URL? - scala

I have a Scala helper method that currently tries to fetch a URL and return an Option[String] with the HTML of that webpage.
If there are any exceptions (malformed url, read timeouts, etc...) or if there are any problems it returns a None. The question is, would it be better to just throw the exception so the calling code can log the exception or is it preferable to return the None in this case?

Creating exceptions is expensive because the stack trace has to be filled. Throwing and catching exceptions is also more expensive than a normal return. Considering that, you might ask yourself the following questions:
Do you want to force handling of an error by the caller? If so, don't throw an exception, as Scala has no checked-exception mechanism that forces the caller to catch them.
In case of an error, do you want to include details on why it failed? If not, you can just return Option[A], where A is your return type, and then you'd either have Some(validContent) or None, with no additional explanation. If yes, you can return something like Either[E, A] or a Scalaz Validation[E, A]. All of this options force the caller to somehow unwrap the result while being free to process the error E as he wants. Now what should E be?
Do you want to provide a stack trace in case of failure? If so, you could return Either[Exception, A] or Validation[Exception, A]. If you really go with the exception, you'll want to use Try[A], whose two possible cases are Failure(exc: Throwable) and Success(value: A). Note that you will of course incur the costs of creating the throwable. If not, you can maybe just return Either[String, A] (and be extra careful to remember whether Right means success or failure here — Left is usually used for errors, and Right for the “right” value — Validation is maybe clearer). If you want to optionally return a stack trace, you could use Lift's Box[A], which could be Full(validContents), Empty with no additional explanation (very similar to Option[A] up to here), or indicate a Failure which can store an error string and/or a throwable (and more).
Do you maybe want to provide multiple indications as to why it failed? Then return Either[Seq[String], A]. If you do this frequently, you may probably want to use Scalaz and a Validation[NonEmptyList[String], A] instead, which provides some other nice goodies. Look it up for more info about it or check out these usage examples.

I think that in this case, if is important to log the exceptions then by all means throw the exceptions (and possibly just return String instead of the option). Otherwise you might as well just return the None. One warning- there may be exceptions for other reasons that you don't foresee, in which case it may be dangerous to make a catch-all bit of code.
One thing that you could do is something like Lift's Box system. A Box is essentially an Option, but with a couple features added in: A Full is like a Some, an Empty is like a None, but Lift goes a step further and has a Failure, which is like an Empty but with a reason/message.

The general rule of thumb is "if you can handle the exception, handle it". So there's not enough context to guess. You can use the tryFetchUrl/fetchUrl pair of methods.

Related

Why `scala.util.Try` is not mentioned in chapter "Handling errors without exceptions" of book "functional programming in Scala"?

In the chapter "Handling errors without exceptions" of book "functional programming in Scala", the author gives:
The problem of throwing exceptions from the body of a function
Use Option if we don't care about the actual exception
Use Either if we care about the actual exception
But scala.util.Try is not mentioned. From my point of view, I think Try is very suitable when we care about the actual exception, why it's not mentioned? Is there any reason I have missed?
I'm neither of the authors of Functional Programming in Scala, but I can make a few guesses about why they don't mention Try.
Some people don't like the standard library's Try because they claim it violates the functor composition law. I personally think that this position is kind of silly, for the reasons Josh Suereth mentions in the comments of SI-6284, but the debate does highlight an important aspect of Try's design.
Try's map and flatMap are explicitly designed to work with functions that may throw exceptions. People from the FPiS school of thought (including me) would tend to suggest wrapping such functions (if you absolutely have to deal with them at all) in safe versions at a low level in your program, and then exposing an API that will never throw (non-fatal) exceptions.
Including Try in your API muddles up the layers in this model—you're guaranteeing that your API methods won't throw exceptions, but then you're handing people a type that's designed to be used with functions that throw exceptions.
That's only a complaint about the standard library's design and implementation of Try, though. It's easy enough to imagine a version of Try with different semantics, where the map and flatMap methods didn't catch exceptions, and there would still be good reasons to avoid this "improved" version of Try whenever possible.
One of these reasons is that using Either[MyExceptionType, A] instead of Try[A] makes it possible to get more mileage out of the compiler's exhaustivity checking. Suppose I'm using the following simple ADT for errors in my application:
sealed class FooAppError(message: String) extends Exception(message)
case class InvalidInput(message: String) extends FooAppError(message)
case class MissingField(fieldName: String) extends FooAppError(
s"$fieldName field is missing"
)
Now I'm trying to decide whether a method that can only fail in one of these two ways should return Either[FooAppError, A] or Try[A]. Choosing Try[A] means we're throwing away information that's potentially useful both to human users and to the compiler. Suppose I write a method like this:
def doSomething(result: Either[FooAppError, String]) = result match {
case Right(x) => x
case Left(MissingField(_)) => "bad"
}
I'll get a nice compile-time warning telling me that the match is not exhaustive. If I add a case for the missing error, the warning goes away.
If I had used Try[String] instead, I'd also get exhaustivity checking, but the only way to get rid of the warning would be to have a catch-all case—it's just not possible to enumerate all Throwables in the pattern match.
Sometimes we actually can't conveniently limit the kinds of ways an operation can fail to our own failure type (like FooAppError above), and in these cases we can always use Either[Throwable, A]. Scalaz's Task, for example, is essentially a wrapper for Future[Throwable \/ A]. The difference is that Either (or \/) supports this kind of signature, while Try requires it. And it's not always what you want, for reasons like useful exhaustivity checking.

What is the difference between Try and Either?

According to the documentation:
The Try type represents a computation that may either result in an
exception, or return a successfully computed value. It's similar to,
but semantically different from the scala.util.Either type.
The docs do not go into further detail as to what the semantic difference is. Both seem to be able to communicate successes and failures. Why would you use one over the other?
I covered the relationship between Try, Either, and Option in this answer. The highlights from there regarding the relationship between Try and Either are summarized below:
Try[A] is isomorphic to Either[Throwable, A]. In other words you can treat a Try as an Either with a left type of Throwable, and you can treat any Either that has a left type of Throwable as a Try. It is conventional to use Left for failures and Right for successes.
Of course, you can also use Either more broadly, not only in situations with missing or exceptional values. There are other situations where Either can help express the semantics of a simple union type (where value is one of two types).
Semantically, you might use Try to indicate that the operation might fail. You might similarly use Either in such a situation, especially if your "error" type is something other than Throwable (e.g. Either[ErrorType, SuccessType]). And then you might also use Either when you are operating over a union type (e.g. Either[PossibleType1, PossibleType2]).
Since Scala 2.12, the standard library does include the conversions from Either to Try or from Try to Either. For earlier versions, it is pretty simple to enrich Try, and Either as needed:
object TryEitherConversions {
implicit class EitherToTry[L <: Throwable, R](val e: Either[L, R]) extends AnyVal {
def toTry: Try[R] = e.fold(Failure(_), Success(_))
}
implicit class TryToEither[T](val t: Try[T]) extends AnyVal {
def toEither: Either[Throwable, T] =
t.map(Right(_)).recover(Left(_)).get
}
}
This would allow you to do:
import TryEitherConversions._
//Try to Either
Try(1).toEither //Either[Throwable, Int] = Right(1)
Try("foo".toInt).toEither //Either[Throwable, Int] = Left(java.lang.NumberFormatException)
//Either to Try
Right[Throwable, Int](1).toTry //Success(1)
Left[Throwable, Int](new Exception).toTry //Failure(java.lang.Exception)
To narrowly answer your question: "What's the semantic difference":
This probably refers to flatMap and map, which are non-existent in Either and either propagate failure or map the success value in Try. This allows, for instance, chaining like
for {
a <- Try {something}
b <- Try {somethingElse(a)}
c <- Try {theOtherThing(b)}
} yield c
which does just what you'd hope - returns a Try containing either the first exception, or the result.
Try has lots of other useful methods, and of course its companion apply method, that make it very convenient for its intended use - exception handling.
If you really want to be overwhelmed, there are two other classes out there which may be of interest for this kind of application. Scalaz has a class called "\/" (formerly known as Prince), pronounced "Either", which is mostly like Either, but flatMap and map work on the Right value. Similarly, and not, Scalactic has an "Or" which is also similar to Either, but flatMap and map work on the Left value.
I don't recommend Scalaz for beginners.
Either does not imply success and failure, it is just a container for either an A or a B. It is common to use it to represent successes and failures, the convention being to put the failure on the left side, and the success on the right.
A Try can be seen as an Either with the left-side type set to Throwable. Try[A] would be equivalent to Either[Throwable, A].
Use Try to clearly identify a potential failure in the computation, the failure being represented by an exception. If you want to represent the failure with a different type (like a String, or a set of case classes extending a sealed trait for example) use Either.
Either is more general, since it simply represents disjoint unions of types.
In particular, it can represent a union of valid return values of some type X and Exception. However, it does not attempt to catch any exceptions on its own. You have to add try-catch blocks around dangerous code, and then make sure that each branch returns an appropriate subclass of Either (usually: Left for errors, Right for successful computations).
Try[X] can be thought of as Either[Exception, X], but it also catches Exceptions on its own.
Either[X, Y] usage is more general. As its name say it can represent either an object of X type or of Y.
Try[X] has only one type and it might be either a Success[X] or a Failure (which contains a Throwable).
At some point you might see Try[X] as an Either[Throwable,X]
What is nice about Try[X] is that you can chain futher operations to it, if it is really a Success they will execute, if it was a Failure they won't
val connection = Try(factory.open())
val data = connection.flatMap(conn => Try(conn.readData()))
//At some point you can do
data matches {
Success(data) => print data
Failure(throwable) => log error
}
Of course, you can always oneline this like
Try(factory.open()).flatMap(conn => Try(conn.readData()) matches {
Success(data) => print data
Failure(throwable) => log error
}
As already have been mentioned, Either is more general, so it might not only wrap error/successful result, but also can be used as an alternative to Option, for branching the code path.
For abstracting the effect of an error, only for this purpose, I identified the following differences:
Either can be used to specify a description of the error, which can be shown to the client. Try - wraps an exception with a stack trace, less descriptive, less client oriented, more for internal usage.
Either allows us to specify error type, with existing monoid for this type. As a result, it allows us to combine errors (usually via applicative effects). Try abstraction with its exception, has no monoid defined. With Try we must spent more effort to extract error and handle it.
Based on it, here is my best practices:
When I want to abstract effect of error, I always use Either as the first choice, with List/Vector/NonEmptyList as error type.
Try is used only, when you invoke code, written in OOP. Good candidates for Try are methods, that might throw an exception, or methods, that sends request to external systems (rest/soap/database requests in case the methods return a raw result, not wrapped into FP abstractions, like Future, for instance.

Scala error handling: Try or Either?

Given a method in UserService: update, what's the best way to handle errors/exceptions here?
Option A:
def update(...): Try[User]
In this way, I need to define my custom exceptions and throw them in the function body when needed. Most of these exceptions are business errors (e.g. user_id cannot be changed, etc). The point here is no matter what exception(s) are thrown (business error, network exception, DB IO exception, etc), treat them the same way and just return a Failure(err) - let the upper layer handle them.
Option B:
def update(...): Either[Error, User]
This is the exception-free way. In the function body it catches all possible exceptions and turns them into Error, and for business errors just return Left[Error].
Using Try seems to be a more natural way to me as I want to handle errors. Either is a more generic thing - Either[Error, T] is just one special case and I think Try is invented for this special case. But I also read that we should avoid using exceptions for error handling...
So, which solution is better, and why?
There's no silver bullet.
As you noted already, Try is simply a more specialized version of Either, where the Left type is fixed to Throwable.
Try might be a good fit if you need to materialize exceptions thrown by external (perhaps java) libraries, as its constructor automatically catches them.
Another advantage of Try is that it has map and flatMap, so you can use it directly in for-comprehensions, whereas with Either you would have to explicitly project on the right case.
Anyway, there's plenty of alternative implementations with a "right-bias", and probably the scalaz \/ type is the most popular one.
That being said, I typically use \/ or the almost equivalent Validation (both from scalaz), as I like having the ability of returning errors that do not extend Throwable.
It also allows for more precise error types, which is a huge win.

Why should one prefer Option for error handling over exceptions in Scala?

So I'm learning functional Scala, and the book says exception breaks referential transparency, and thus Option should be used instead, like so:
def pattern(s: String): Option[Pattern] = {
try {
Some(Pattern.compile(s))
} catch {
case e: PatternSyntaxException => None
}
}
This seems pretty bad; I mean it seems equivalent to:
catch(Exception e){
return null;
}
Save for the fact that we can distinguish "null for error" from "null as genuine value". It seems it should at least return something that contains the error information like:
catch {
case e: Exception => Fail(e)
}
What am I missing?
At this specific section, Option is used mostly as an example because the operation used (calculating the mean) is a partial function, it doesn't produce a value for all possible values (the collection could be empty, thus there's no way to calculate the mean) and Option could be a valid case here. If you can't calculate the mean because the collection is empty just return a None.
But there are many other ways to solve this problem, you could use Either[L,R], with the Left being the error result and a Right as being the good result, you could still throw an exception and wrap it inside a Try object (which seems more common nowadays due to it's use in Promise and Future computations), you could use ScalaZ Validation if the error was actually a validation issue.
The main concept you should take a way from this part is that the error should be part of the return type of the function and not some magic operation (the exception) that can't be reasonably declared by the types.
And as a shameless plug, I did blog about Either and Try here.
It would be easier to answer this question if you weren't asking "why is Option better than exceptions?" and "why is Option better than null?" and "why is Option better than Try?" all at the same time.
The answer to the first of these questions is that using exceptions in situations that aren't truly exceptional muddles the control flow of your program. This is where referential transparency comes in—it's much easier for me (or you) to reason about your code if I can think in terms of values and don't have to keep track of where exceptions are being thrown and caught.
The answer to the second question (why not null?) is something like "Have you ever had to deal with NullPointerException in Java?".
For the third question, in general you're right—it's better to use a type like Either[Throwable, A] or Try[A] to represent computations that can fail, since they allow you to pass along more detailed information about the failure. In some cases, though, when a function can only fail in a single obvious way, it makes sense to use Option. For example, if I'm performing a lookup in a map, I probably don't really need or want something like an Either[NoSuchElementException, A], where the error is so abstract that I'd probably end up wrapping it in something more domain-specific anyway. So get on a map just returns an Option[A].
You should use util.Try:
scala> import java.util.regex.Pattern
import java.util.regex.Pattern
scala> def pattern(s: String): util.Try[Pattern] = util.Try(Pattern.compile(s))
pattern: (s: String)scala.util.Try[java.util.regex.Pattern]
scala> pattern("<?++")
res0: scala.util.Try[java.util.regex.Pattern] =
Failure(java.util.regex.PatternSyntaxException: Dangling meta character '+' near index 3
<?++
^)
scala> pattern("[.*]")
res1: scala.util.Try[java.util.regex.Pattern] = Success([.*])
The naive example
def pattern(s: String): Pattern = {
Pattern.compile(s)
}
has a sideeffect, it can influence the programm that uses it by other means than its result(it can cause a exception). This is discouraged in functional programming, because it increases the code complexity.
The code
def pattern(s: String): Option[Pattern] = {
try {
Some(Pattern.compile(s))
} catch {
case e: PatternSyntaxException => None
}
}
encapsulates the side effect producing part of the programm. The information why the Pattern failed is lost, but sometimes it only matters whether or not it fails. If it matters why the method failed one can use Try(http://www.scala-lang.org/files/archive/nightly/docs/library/index.html#scala.util.Try):
def pattern(s: String): Try[Pattern] = {
Try(Pattern.compile(s))
}
I think the other two answers give you good suggestions about how to proceed. I would still argue that throwing an exception is well represented in Scala's type system, using the bottom type Nothing. So it is well-typed, and I wouldn't exactly called it "magic operation".
However... if your method can quite commonly result in an invalid value, that is if your call side quite reasonably wants to handle such an invalid value straight away, then using Option, Either or Try is a good approach. In a scenario, where your call site doesn't really know what to do with such an invalid value, especially if it is an exceptional condition and not the common case, then you should use exceptions IMO.
The problem of exception is precisely not that they are not well working with functional programming, but that they can be difficult to reason about when you have side effects. Because then your call site must ensure to undo the side effects in the case of an exception. If your call site is purely functional, passing on an exception doesn't do any damage.
If any functions that does anything with integers would declare its return type a Try because of division-by-zero or overflow possibilities, this might totally clutter your code. Another very good reason to use exceptions is invalid argument ranges, or requirements. If you expect an argument to be an integer between 0 and x, you may well throw an IllegalArgumentException if it does not meet that property; conveniently in Scala: require(a >= 0 && a < x).

Why is there an Option.get method

Why is the method get defined in Option and not in Some?
One could apply pattern matching or use foreach, map, flatMap, getOrElse which is preferred anyway without the danger of runtime exceptions if None.get is called.
Are there really cases where the get method is so much convinent that it justifies this? Using the get method reminds me to Java where "I don't have to check for null because I know the var is set" and then see a NullPointerException.
Option.get is occasionally useful when you know, by reasoning that isn't captured by the type system, that the value you have is not None. However, you should reach for it only when you've tried:
Arranging for this knowledge to be expressed by your types.
Using the Option processing methods mentioned (flatMap, getOrElse etc), and none of them fit elegantly.
It's also convenient for throwaway / interactive code (REPL, worksheets etc).
Foreword
Scala was created with compatibility for the JVM as a target.
So it's pretty natural that Exceptions and try/catch should be considered as part of the language, at least for java interoperability.
With this premise in mind, now to the point of including interfaces that can throw errors in algebraic data types (or case classes if you prefer).
To the Question
Assume we encapsulate the get method (or head for List) only in Some instance.
This implies that we're forced to pattern match on the Option every time we want to extract the value. It wouldn't look pretty neat, but it's still a possibility.
We could getOrElse everywhere but this means that I can't signal a failure through the error stack if using imperative style (which is not forbidden in scala, the last time I checked).
My point here is that getOrElse and headOption are available to enforce a functional style when desired, but get and head are good alternatives if imperative is a choice fit to the problem or the programmer's preference.
There is no good reason for it. Martin Odersky added it in this commit in 2007, so I guess we'd need to ask him his motivations. The earlier version had getOrElse semantics.
If you examine the commit provided. It's kind of obvious why and
when it is used. The get is used to unbox when you are sure there is Something in the
Option. It would also serve to be an inheritable method where None.get Throws an exception and Some.get unboxes the contents.
In some cases, when your Option turns out to be a None, there is no recourse and you want to throw an exception. So, instead of writing
myOption match {
case Some(value) => // use value
case None => throw new RuntimeException("I really needed that value")
}
you can simply write
myOption.get // might throw an exception
It also makes Option much nicer to use from Java, in conjunction with isDefined:
if (myOption.isDefined()) process(myOption.get());
else // it's a None!