Effects to use for wrapping impure methods? - scala

I'm trying to understand how to use effect monads (cats.effect.IO or scalaz.IO does not matter). Imagine I have the following method:
def extract(str: String): String = {
if(str.contains("123"))
"123"
else
throw new IllegalArgumentException("Boom!")
}
Since this method is impure (throws exception) and I need to combine its result with another effectful computation (network-IO) is it a good practice to just wrap it into IO as follows:
def extract(str: String): IO[String] = IO {
if(str.contains("123"))
"123"
else
throw new IllegalArgumentException("Boom!")
}
Is it a common use case of Effect monads?

Whether this is appropriate or not depends on what you want to express.
It's not like your first def extract(str: String): String-method definition is somehow invalid: you are just sweeping all the exceptions and side-effects under the rug, so that they aren't visible in the signature. If this is irrelevant for your current project, and if it's acceptable that a program simply crashes with a long stack trace of the thrown exception, then do it, no problem (it's easy to imagine one-off throw-away scripts where this would be appropriate).
If instead you declare def extract(str: String): IO[String] = IO { ... }, then you at least can see in the signature that the function extract can do something impure (throw an exception, in this case). Now the question becomes: who is responsible for dealing with this exception, or where do you want to deal with this exception? Consider this: if the exception is thrown, it will emerge in your code in the line where something like yourProgram.unsafeRunSync() is invoked. Does it make sense to deal with this exception there? Maybe it does, maybe it doesn't: nobody can tell you. If you just want to catch the exception at the top level in your main, log it, and exit, then it's appropriate.
However, if you want to deal with the exception immediately, you have better options. For example, if you are writing a method that prompts for a file name, then tries to extract something from this name, and finally does some IO on files, then the return type IO[String] might be too opaque. You might want to use some other monad instead (Option, Either, Try etc.) to represent the failed extraction. For example, if you use Option[String] as return type, you no longer have to deal with an obscure exception thousand miles away in your main method, but instead you can deal with it immediately, and, for example, prompt for a new file name repeatedly.
The whole exercise is somewhat similar to coming up with a strategy how to deal with exceptions in general, only here you are expressing it explicitly in the type signatures of your methods.

Related

scala programming practice with option

May be my design is flawed (most probably it is) but I have been thinking about the way Option is used in Scala and I am not so very happy about it. Let's say I have 3 methods calling one another like this:
def A(): reads a file and returns something
def B(): returns something
def C(): Side effect (writes into DB)
and C() calls B() and in turn B() calls A()
Now, as A() is dependent on I/O ops, I had to handle the exceptions and return and Option otherwise it won't compile (if A() does not return anything). As B() receives an Option from A() and it has to return something, it is bound to return another Option to C(). So, you can possibly imagine that my code is flooded with match/case Some/case None (don't have the liberty to use getOrElse() always). And, if C() is dependent on some other methods which also return Option, you would be scared to look at the definition of C().
So, am I missing something? Or how flawed is my design? How can I improve it?
Using match/case on type Option is often useful when you want to throw away the Option and produce some value after processing the Some(...) but a different value of the same type if you have a None. (Personally, I usually find fold to be cleaner for such situations.)
If, on the other hand, you're passing the Option along, then there are other ways to go about it.
def a():Option[DataType] = {/*read new data or fail*/}
def b(): Optioon[DataType] = {
... //some setup
a().map{ inData =>
... //inData is real, process it for output
}
}
def c():Unit = {
... //some setup
b().foreach{ outData =>
... //outData is real, write it to DB
}
}
am I missing something?
Option is one design decision, but there can be others. I.e what happens when you want to describe the error returned by the API? Option can only tell you two kinds of state, either I have successfully read a value, or I failed. But sometimes you really want to know why you failed. Or more so, If I return None, is it because the file isn't there or because I failed on an exception (i.e. I don't have permission to read the file?).
Whichever path you choose, you'll usually be dealing with one or more effects. Option is one such effect which representing a partial function, i.e. this operation may not yield a result. While using pattern matching with Option, as other said, is one way of handling it, there are other operations which decrease the verbosity.
For example, if you want to invoke an operation in case the value exists and another in case it isn't and they both have the same return type, you can use Option.fold:
scala> val maybeValue = Some(1)
maybeValue: Some[Int] = Some(1)
scala> maybeValue.fold(0)(x => x + 1)
res0: Int = 2
Generally, there are many such combinators defined on Option and other effects, and they might seem cumbersome at the beginning, later they come to grow on you and you see their real power when you want to compose operations one after the other.

What is the difference between Try and Either?

According to the documentation:
The Try type represents a computation that may either result in an
exception, or return a successfully computed value. It's similar to,
but semantically different from the scala.util.Either type.
The docs do not go into further detail as to what the semantic difference is. Both seem to be able to communicate successes and failures. Why would you use one over the other?
I covered the relationship between Try, Either, and Option in this answer. The highlights from there regarding the relationship between Try and Either are summarized below:
Try[A] is isomorphic to Either[Throwable, A]. In other words you can treat a Try as an Either with a left type of Throwable, and you can treat any Either that has a left type of Throwable as a Try. It is conventional to use Left for failures and Right for successes.
Of course, you can also use Either more broadly, not only in situations with missing or exceptional values. There are other situations where Either can help express the semantics of a simple union type (where value is one of two types).
Semantically, you might use Try to indicate that the operation might fail. You might similarly use Either in such a situation, especially if your "error" type is something other than Throwable (e.g. Either[ErrorType, SuccessType]). And then you might also use Either when you are operating over a union type (e.g. Either[PossibleType1, PossibleType2]).
Since Scala 2.12, the standard library does include the conversions from Either to Try or from Try to Either. For earlier versions, it is pretty simple to enrich Try, and Either as needed:
object TryEitherConversions {
implicit class EitherToTry[L <: Throwable, R](val e: Either[L, R]) extends AnyVal {
def toTry: Try[R] = e.fold(Failure(_), Success(_))
}
implicit class TryToEither[T](val t: Try[T]) extends AnyVal {
def toEither: Either[Throwable, T] =
t.map(Right(_)).recover(Left(_)).get
}
}
This would allow you to do:
import TryEitherConversions._
//Try to Either
Try(1).toEither //Either[Throwable, Int] = Right(1)
Try("foo".toInt).toEither //Either[Throwable, Int] = Left(java.lang.NumberFormatException)
//Either to Try
Right[Throwable, Int](1).toTry //Success(1)
Left[Throwable, Int](new Exception).toTry //Failure(java.lang.Exception)
To narrowly answer your question: "What's the semantic difference":
This probably refers to flatMap and map, which are non-existent in Either and either propagate failure or map the success value in Try. This allows, for instance, chaining like
for {
a <- Try {something}
b <- Try {somethingElse(a)}
c <- Try {theOtherThing(b)}
} yield c
which does just what you'd hope - returns a Try containing either the first exception, or the result.
Try has lots of other useful methods, and of course its companion apply method, that make it very convenient for its intended use - exception handling.
If you really want to be overwhelmed, there are two other classes out there which may be of interest for this kind of application. Scalaz has a class called "\/" (formerly known as Prince), pronounced "Either", which is mostly like Either, but flatMap and map work on the Right value. Similarly, and not, Scalactic has an "Or" which is also similar to Either, but flatMap and map work on the Left value.
I don't recommend Scalaz for beginners.
Either does not imply success and failure, it is just a container for either an A or a B. It is common to use it to represent successes and failures, the convention being to put the failure on the left side, and the success on the right.
A Try can be seen as an Either with the left-side type set to Throwable. Try[A] would be equivalent to Either[Throwable, A].
Use Try to clearly identify a potential failure in the computation, the failure being represented by an exception. If you want to represent the failure with a different type (like a String, or a set of case classes extending a sealed trait for example) use Either.
Either is more general, since it simply represents disjoint unions of types.
In particular, it can represent a union of valid return values of some type X and Exception. However, it does not attempt to catch any exceptions on its own. You have to add try-catch blocks around dangerous code, and then make sure that each branch returns an appropriate subclass of Either (usually: Left for errors, Right for successful computations).
Try[X] can be thought of as Either[Exception, X], but it also catches Exceptions on its own.
Either[X, Y] usage is more general. As its name say it can represent either an object of X type or of Y.
Try[X] has only one type and it might be either a Success[X] or a Failure (which contains a Throwable).
At some point you might see Try[X] as an Either[Throwable,X]
What is nice about Try[X] is that you can chain futher operations to it, if it is really a Success they will execute, if it was a Failure they won't
val connection = Try(factory.open())
val data = connection.flatMap(conn => Try(conn.readData()))
//At some point you can do
data matches {
Success(data) => print data
Failure(throwable) => log error
}
Of course, you can always oneline this like
Try(factory.open()).flatMap(conn => Try(conn.readData()) matches {
Success(data) => print data
Failure(throwable) => log error
}
As already have been mentioned, Either is more general, so it might not only wrap error/successful result, but also can be used as an alternative to Option, for branching the code path.
For abstracting the effect of an error, only for this purpose, I identified the following differences:
Either can be used to specify a description of the error, which can be shown to the client. Try - wraps an exception with a stack trace, less descriptive, less client oriented, more for internal usage.
Either allows us to specify error type, with existing monoid for this type. As a result, it allows us to combine errors (usually via applicative effects). Try abstraction with its exception, has no monoid defined. With Try we must spent more effort to extract error and handle it.
Based on it, here is my best practices:
When I want to abstract effect of error, I always use Either as the first choice, with List/Vector/NonEmptyList as error type.
Try is used only, when you invoke code, written in OOP. Good candidates for Try are methods, that might throw an exception, or methods, that sends request to external systems (rest/soap/database requests in case the methods return a raw result, not wrapped into FP abstractions, like Future, for instance.

Scala error handling: Try or Either?

Given a method in UserService: update, what's the best way to handle errors/exceptions here?
Option A:
def update(...): Try[User]
In this way, I need to define my custom exceptions and throw them in the function body when needed. Most of these exceptions are business errors (e.g. user_id cannot be changed, etc). The point here is no matter what exception(s) are thrown (business error, network exception, DB IO exception, etc), treat them the same way and just return a Failure(err) - let the upper layer handle them.
Option B:
def update(...): Either[Error, User]
This is the exception-free way. In the function body it catches all possible exceptions and turns them into Error, and for business errors just return Left[Error].
Using Try seems to be a more natural way to me as I want to handle errors. Either is a more generic thing - Either[Error, T] is just one special case and I think Try is invented for this special case. But I also read that we should avoid using exceptions for error handling...
So, which solution is better, and why?
There's no silver bullet.
As you noted already, Try is simply a more specialized version of Either, where the Left type is fixed to Throwable.
Try might be a good fit if you need to materialize exceptions thrown by external (perhaps java) libraries, as its constructor automatically catches them.
Another advantage of Try is that it has map and flatMap, so you can use it directly in for-comprehensions, whereas with Either you would have to explicitly project on the right case.
Anyway, there's plenty of alternative implementations with a "right-bias", and probably the scalaz \/ type is the most popular one.
That being said, I typically use \/ or the almost equivalent Validation (both from scalaz), as I like having the ability of returning errors that do not extend Throwable.
It also allows for more precise error types, which is a huge win.

Why should one prefer Option for error handling over exceptions in Scala?

So I'm learning functional Scala, and the book says exception breaks referential transparency, and thus Option should be used instead, like so:
def pattern(s: String): Option[Pattern] = {
try {
Some(Pattern.compile(s))
} catch {
case e: PatternSyntaxException => None
}
}
This seems pretty bad; I mean it seems equivalent to:
catch(Exception e){
return null;
}
Save for the fact that we can distinguish "null for error" from "null as genuine value". It seems it should at least return something that contains the error information like:
catch {
case e: Exception => Fail(e)
}
What am I missing?
At this specific section, Option is used mostly as an example because the operation used (calculating the mean) is a partial function, it doesn't produce a value for all possible values (the collection could be empty, thus there's no way to calculate the mean) and Option could be a valid case here. If you can't calculate the mean because the collection is empty just return a None.
But there are many other ways to solve this problem, you could use Either[L,R], with the Left being the error result and a Right as being the good result, you could still throw an exception and wrap it inside a Try object (which seems more common nowadays due to it's use in Promise and Future computations), you could use ScalaZ Validation if the error was actually a validation issue.
The main concept you should take a way from this part is that the error should be part of the return type of the function and not some magic operation (the exception) that can't be reasonably declared by the types.
And as a shameless plug, I did blog about Either and Try here.
It would be easier to answer this question if you weren't asking "why is Option better than exceptions?" and "why is Option better than null?" and "why is Option better than Try?" all at the same time.
The answer to the first of these questions is that using exceptions in situations that aren't truly exceptional muddles the control flow of your program. This is where referential transparency comes in—it's much easier for me (or you) to reason about your code if I can think in terms of values and don't have to keep track of where exceptions are being thrown and caught.
The answer to the second question (why not null?) is something like "Have you ever had to deal with NullPointerException in Java?".
For the third question, in general you're right—it's better to use a type like Either[Throwable, A] or Try[A] to represent computations that can fail, since they allow you to pass along more detailed information about the failure. In some cases, though, when a function can only fail in a single obvious way, it makes sense to use Option. For example, if I'm performing a lookup in a map, I probably don't really need or want something like an Either[NoSuchElementException, A], where the error is so abstract that I'd probably end up wrapping it in something more domain-specific anyway. So get on a map just returns an Option[A].
You should use util.Try:
scala> import java.util.regex.Pattern
import java.util.regex.Pattern
scala> def pattern(s: String): util.Try[Pattern] = util.Try(Pattern.compile(s))
pattern: (s: String)scala.util.Try[java.util.regex.Pattern]
scala> pattern("<?++")
res0: scala.util.Try[java.util.regex.Pattern] =
Failure(java.util.regex.PatternSyntaxException: Dangling meta character '+' near index 3
<?++
^)
scala> pattern("[.*]")
res1: scala.util.Try[java.util.regex.Pattern] = Success([.*])
The naive example
def pattern(s: String): Pattern = {
Pattern.compile(s)
}
has a sideeffect, it can influence the programm that uses it by other means than its result(it can cause a exception). This is discouraged in functional programming, because it increases the code complexity.
The code
def pattern(s: String): Option[Pattern] = {
try {
Some(Pattern.compile(s))
} catch {
case e: PatternSyntaxException => None
}
}
encapsulates the side effect producing part of the programm. The information why the Pattern failed is lost, but sometimes it only matters whether or not it fails. If it matters why the method failed one can use Try(http://www.scala-lang.org/files/archive/nightly/docs/library/index.html#scala.util.Try):
def pattern(s: String): Try[Pattern] = {
Try(Pattern.compile(s))
}
I think the other two answers give you good suggestions about how to proceed. I would still argue that throwing an exception is well represented in Scala's type system, using the bottom type Nothing. So it is well-typed, and I wouldn't exactly called it "magic operation".
However... if your method can quite commonly result in an invalid value, that is if your call side quite reasonably wants to handle such an invalid value straight away, then using Option, Either or Try is a good approach. In a scenario, where your call site doesn't really know what to do with such an invalid value, especially if it is an exceptional condition and not the common case, then you should use exceptions IMO.
The problem of exception is precisely not that they are not well working with functional programming, but that they can be difficult to reason about when you have side effects. Because then your call site must ensure to undo the side effects in the case of an exception. If your call site is purely functional, passing on an exception doesn't do any damage.
If any functions that does anything with integers would declare its return type a Try because of division-by-zero or overflow possibilities, this might totally clutter your code. Another very good reason to use exceptions is invalid argument ranges, or requirements. If you expect an argument to be an integer between 0 and x, you may well throw an IllegalArgumentException if it does not meet that property; conveniently in Scala: require(a >= 0 && a < x).

Better to return None or throw an exception when fetching URL?

I have a Scala helper method that currently tries to fetch a URL and return an Option[String] with the HTML of that webpage.
If there are any exceptions (malformed url, read timeouts, etc...) or if there are any problems it returns a None. The question is, would it be better to just throw the exception so the calling code can log the exception or is it preferable to return the None in this case?
Creating exceptions is expensive because the stack trace has to be filled. Throwing and catching exceptions is also more expensive than a normal return. Considering that, you might ask yourself the following questions:
Do you want to force handling of an error by the caller? If so, don't throw an exception, as Scala has no checked-exception mechanism that forces the caller to catch them.
In case of an error, do you want to include details on why it failed? If not, you can just return Option[A], where A is your return type, and then you'd either have Some(validContent) or None, with no additional explanation. If yes, you can return something like Either[E, A] or a Scalaz Validation[E, A]. All of this options force the caller to somehow unwrap the result while being free to process the error E as he wants. Now what should E be?
Do you want to provide a stack trace in case of failure? If so, you could return Either[Exception, A] or Validation[Exception, A]. If you really go with the exception, you'll want to use Try[A], whose two possible cases are Failure(exc: Throwable) and Success(value: A). Note that you will of course incur the costs of creating the throwable. If not, you can maybe just return Either[String, A] (and be extra careful to remember whether Right means success or failure here — Left is usually used for errors, and Right for the “right” value — Validation is maybe clearer). If you want to optionally return a stack trace, you could use Lift's Box[A], which could be Full(validContents), Empty with no additional explanation (very similar to Option[A] up to here), or indicate a Failure which can store an error string and/or a throwable (and more).
Do you maybe want to provide multiple indications as to why it failed? Then return Either[Seq[String], A]. If you do this frequently, you may probably want to use Scalaz and a Validation[NonEmptyList[String], A] instead, which provides some other nice goodies. Look it up for more info about it or check out these usage examples.
I think that in this case, if is important to log the exceptions then by all means throw the exceptions (and possibly just return String instead of the option). Otherwise you might as well just return the None. One warning- there may be exceptions for other reasons that you don't foresee, in which case it may be dangerous to make a catch-all bit of code.
One thing that you could do is something like Lift's Box system. A Box is essentially an Option, but with a couple features added in: A Full is like a Some, an Empty is like a None, but Lift goes a step further and has a Failure, which is like an Empty but with a reason/message.
The general rule of thumb is "if you can handle the exception, handle it". So there's not enough context to guess. You can use the tryFetchUrl/fetchUrl pair of methods.