scala programming practice with option - scala

May be my design is flawed (most probably it is) but I have been thinking about the way Option is used in Scala and I am not so very happy about it. Let's say I have 3 methods calling one another like this:
def A(): reads a file and returns something
def B(): returns something
def C(): Side effect (writes into DB)
and C() calls B() and in turn B() calls A()
Now, as A() is dependent on I/O ops, I had to handle the exceptions and return and Option otherwise it won't compile (if A() does not return anything). As B() receives an Option from A() and it has to return something, it is bound to return another Option to C(). So, you can possibly imagine that my code is flooded with match/case Some/case None (don't have the liberty to use getOrElse() always). And, if C() is dependent on some other methods which also return Option, you would be scared to look at the definition of C().
So, am I missing something? Or how flawed is my design? How can I improve it?

Using match/case on type Option is often useful when you want to throw away the Option and produce some value after processing the Some(...) but a different value of the same type if you have a None. (Personally, I usually find fold to be cleaner for such situations.)
If, on the other hand, you're passing the Option along, then there are other ways to go about it.
def a():Option[DataType] = {/*read new data or fail*/}
def b(): Optioon[DataType] = {
... //some setup
a().map{ inData =>
... //inData is real, process it for output
}
}
def c():Unit = {
... //some setup
b().foreach{ outData =>
... //outData is real, write it to DB
}
}

am I missing something?
Option is one design decision, but there can be others. I.e what happens when you want to describe the error returned by the API? Option can only tell you two kinds of state, either I have successfully read a value, or I failed. But sometimes you really want to know why you failed. Or more so, If I return None, is it because the file isn't there or because I failed on an exception (i.e. I don't have permission to read the file?).
Whichever path you choose, you'll usually be dealing with one or more effects. Option is one such effect which representing a partial function, i.e. this operation may not yield a result. While using pattern matching with Option, as other said, is one way of handling it, there are other operations which decrease the verbosity.
For example, if you want to invoke an operation in case the value exists and another in case it isn't and they both have the same return type, you can use Option.fold:
scala> val maybeValue = Some(1)
maybeValue: Some[Int] = Some(1)
scala> maybeValue.fold(0)(x => x + 1)
res0: Int = 2
Generally, there are many such combinators defined on Option and other effects, and they might seem cumbersome at the beginning, later they come to grow on you and you see their real power when you want to compose operations one after the other.

Related

Is returning Either/Option/Try/Or considered a viable / idiomatic approach when function has preconditions for arguments?

First of all, I'm very new to Scala and don't have any experience writing production code with it, so I lack understanding of what is considered a good/best practice among community. I stumbled upon these resources:
https://github.com/alexandru/scala-best-practices
https://nrinaudo.github.io/scala-best-practices/
It is mentioned there that throwing exceptions is not very good practice, which made me think what would be a good way to define preconditions for function then, because
A function that throws is a bit of a lie: its type implies it’s total function when it’s not.
After a bit of research, it seems that using Option/Either/Try/Or(scalactic) is a better approach, since you can use something like T Or IllegalArgumentException as return type to clearly indicate that function is actually partial, using exception as a way to store message that can be wrapped in other exceptions.
However lacking Scala experience I don't quite understand if this is actually viable approach for a real project or using Predef.require is a way to go. I would appreciate if someone explained how things are usually done in Scala community and why.
I've also seen Functional assertion in Scala, but while the idea itself looks interesting, I think PartialFunction is not very suitable for the purpose as it is, because often more than one argument is passed and tuples look like a hack in this case.
Option or Either is definitely the way to go for functional programming.
With Option it is important to document why None might be returned.
With Either, the left side is the unsuccessful value (the "error"), while the right side is the successful value. The left side does not necessarily have to be an Exception (or a subtype of it), it can be a simple error message String (type aliases are your friend here) or a custom data type that is suitable for you application.
As an example, I usually use the following pattern when error handling with Either:
// Somewhere in a package.scala
type Error = String // Or choose something more advanced
type EitherE[T] = Either[Error, T]
// Somewhere in the program
def fooMaybe(...): EitherE[Foo] = ...
Try should only be used for wrapping unsafe (most of the time, plain Java) code, giving you the ability to pattern-match on the result:
Try(fooDangerous()) match {
case Success(value) => ...
case Failure(value) => ...
}
But I would suggest only using Try locally and then go with the above mentioned data types from there.
Some advanced datatypes like cats.effect.IO or monix.reactive.Observable contain error handling natively.
I would also suggest looking into cats.data.EitherT for typeclass-based error handling. Read the documentation, it's definitely worth it.
As a sidenote, for everyone coming from Java, Scala treats all Exceptions as Java treats RuntimeExceptions. That means, even when an unsafe piece of code from one of your dependencies throws a (checked) IOException, Scala will never require you to catch or otherwise handle the exception. So as a rule of thumb, when using Java - dependencies, almost always wrap them in a Try (or an IO if they execute side effects or block the thread).
I think your reasoning is correct. If you have a simple total (opposite of partial) function with arguments that can have invalid types then the most common and simple solution is to return some optional result like Option, etc.
It's usually not advisable to throw exceptions as they break FP laws. You can use any library that can return a more advanced type than Option like Scalaz Validation if you need to compose results in ways that are awkward with Option.
Another two alternatives I could offer is to use:
Type constrained arguments that enforce preconditions. Example: val i: Int Refined Positive = 5 based on https://github.com/fthomas/refined. You can also write your own types which wrap primitive types and assert some properties. The problem here is if you have arguments that have multiple interdependent valid values which are mutually exclusive per argument. For instance x > 1 and y < 1 or x < 1 and y > 1. In such case you can return an optional value instead of using this approach.
Partial functions, which in the essence resemble optional return types: case i: Int if i > 0 => .... Docs: https://www.scala-lang.org/api/2.12.1/scala/PartialFunction.html.
For example:
PF's def lift: (A) ⇒ Option[B] converts PF to your regular function.
Turns this partial function into a plain function returning an Option
result.
Which is similar to returning an option. The problem with partial functions that they are a bit awkward to use and not fully FP friendly.
I think Predef.require belongs to very rare cases where you don't want to allow any invalid data to be constructed and is more of a stop-everything-if-this-happens kind of measure. Example would be that you get arguments you never supposed to get.
You use the return type of the function to indicate the type of the result.
If you want to describe a function that can fail for whatever reason, of the types you mentioned you would probably return Try or Either: I am going to "try" to give your a result, or I am going to return "either" a success or an failure.
Now you can specify a custom exception
case class ConditionException(message: String) extends RuntimeException(message)
that you would return if your condition is not satisfied, e.g
import scala.util._
def myfunction(a: String, minLength: Int): Try[String] = {
if(a.size < minLength) {
Failure(ConditionException(s"string $a is too short")
} else {
Success(a)
}
}
and with Either you would get
import scala.util._
def myfunction(a: String, minLength: Int): Either[ConditionException,String] = {
if(a.size < minLength) {
Left(ConditionException(s"string $a is too short")
} else {
Right(a)
}
}
Not that the Either solution clearly indicates the error your function might return

Throwaway values - what is the best practice in Scala?

WartRemover's NonUnitStatements requires that statements that aren't returning unit must have an assignment. OK, but sometimes we have to use annoying Java APIs that both mutate and return a value, and we don't actually hardly ever care about the returned value.
So I end up trying this:
val _ = mutateSomething(foo)
But if I have multiple of these, _ is actually a legit val that has been assigned to, so I cannot reassign. Wartremover also rightly will admonish for gratuitous var-usage, so I can't just do var _ =.
I can do the following (need the ; to avoid Scala thinking it is a continue definition unless I add a full newline everytime I do this).
;{val _ = mutateSomething(foo)}
Is there a better way?
My general thoughts about linting tools is that you should not jump through hoops to satisfy them.
The point is to make your code better, both with fewer bugs and stylistically. But just assigning to var _ = does not achieve this. I would first be sure I really really don't care about the return value, not even asserting that it is what I expect it to be. If I don't, I would just add a #SuppressWarnings(Array("org.wartremover.warts.NonUnitStatements")) and maybe a comment about why and be done with it.
Scala is somewhat unique in that it's a somewhat opinionated language that also tries to integrate with another not-so-opinionated language. This leads to pain points. There are different philosophies to dealing with this, but I tend to not sweat it, and just be aware of and try to isolate the boundaries.
I'm posting an answer, but the real credit goes to Shane Delmore for pointing this out:
def discard(evaluateForSideEffectOnly: Any): Unit = {
val _: Any = evaluateForSideEffectOnly
() //Return unit to prevent warning due to discarding value
}
Alternatively (or see #som-snytt's comments below):
#specialized def discard[A](evaluateForSideEffectOnly: A): Unit = {
val _: A = evaluateForSideEffectOnly
() //Return unit to prevent warning due to discarding value
}
Then use it like: discard{ badMutateFun(foo) }.
Unlike the ;{ val _ = ... } solution, it looks way better, and also works in the beginning of a block without needing to alter style (a ; can't come at the start of a block, it must come after a statement).

What return type should a Scala method have if it can throw/return errors but has Unit return type?

So usually when we run a method that can both fail and return a value, we can encode our method return type as Either[SomeErrorType, ReturnType]. But many times we're running a method for its side effects, so the return type is Unit.
I could of course return an Either[SomeErrorType, Unit] but that definitely looks odd.
I could also just return an Option[SomeErrorType] but it doesn't really look a lot better (and breaks a possibly existing symmetry with other Either[SomeErrorType, NonUnitReturnType]s.
What's your approach in these cases?
def m(): Unit // and implicitly know that exceptions can be thrown?;
def m(): Either[SomeErrorType, Unit] // this is odd;
def m(): Option[SomeErrorType] // this is odd, as it makes it look as the return type ofm()on a successful run is an error code.
Other that I can't think of?
Thanks
I use Try[Unit] for that case.
It encodes that the result of the method either succeeds or fails with some Exception, which can be further processed.
vs T => Unit Try lifts errors to the application level, encoding in the signature that some error can be expected and allowing the application to handle it as a value.
vs. Option[T] => Option is only able to encode that the operation had a value or not
vs. Either[SomeErrorType, Unit] => Try It's easier to work with using monadic constructions.
I've used something like this to implement checks. (imaginary example)
for {
entity <- receiveEntity // Try[Entity]
_ <- isRelational(entity)
_ <- isComplete(entity)
_ <- isStable(entity)
} yield entity
where each check is of the form: Entity => Try[Unit]
This will return the entity if all checks pass of the first error that failed the check.
One more option that hasn't been mentioned yet is Validated from cats. All the options mentioned so far (Try, Either, Option) are monads, while Validated is an applicative functor. In practice this means you can accumulate errors from multiple methods returning Validated, and you can do several validations in parallel. This might not be relevant to you, and this is a bit orthogonal to the original question, but I still feel it's worth mentioning in this context.
As for the original question, using Unit return type for a side-effecting function is perfectly fine. The fact this function can also return error shouldn't get in your way when you define the "real" (right, successful, etc.) return type. Therefore, if I were to select from your original options, I'd go for Either[Error, Unit]. It definitely doesn't look odd to me, and if anyone sees any drawbacks in it, I'd like to know them.

What is the difference between Try and Either?

According to the documentation:
The Try type represents a computation that may either result in an
exception, or return a successfully computed value. It's similar to,
but semantically different from the scala.util.Either type.
The docs do not go into further detail as to what the semantic difference is. Both seem to be able to communicate successes and failures. Why would you use one over the other?
I covered the relationship between Try, Either, and Option in this answer. The highlights from there regarding the relationship between Try and Either are summarized below:
Try[A] is isomorphic to Either[Throwable, A]. In other words you can treat a Try as an Either with a left type of Throwable, and you can treat any Either that has a left type of Throwable as a Try. It is conventional to use Left for failures and Right for successes.
Of course, you can also use Either more broadly, not only in situations with missing or exceptional values. There are other situations where Either can help express the semantics of a simple union type (where value is one of two types).
Semantically, you might use Try to indicate that the operation might fail. You might similarly use Either in such a situation, especially if your "error" type is something other than Throwable (e.g. Either[ErrorType, SuccessType]). And then you might also use Either when you are operating over a union type (e.g. Either[PossibleType1, PossibleType2]).
Since Scala 2.12, the standard library does include the conversions from Either to Try or from Try to Either. For earlier versions, it is pretty simple to enrich Try, and Either as needed:
object TryEitherConversions {
implicit class EitherToTry[L <: Throwable, R](val e: Either[L, R]) extends AnyVal {
def toTry: Try[R] = e.fold(Failure(_), Success(_))
}
implicit class TryToEither[T](val t: Try[T]) extends AnyVal {
def toEither: Either[Throwable, T] =
t.map(Right(_)).recover(Left(_)).get
}
}
This would allow you to do:
import TryEitherConversions._
//Try to Either
Try(1).toEither //Either[Throwable, Int] = Right(1)
Try("foo".toInt).toEither //Either[Throwable, Int] = Left(java.lang.NumberFormatException)
//Either to Try
Right[Throwable, Int](1).toTry //Success(1)
Left[Throwable, Int](new Exception).toTry //Failure(java.lang.Exception)
To narrowly answer your question: "What's the semantic difference":
This probably refers to flatMap and map, which are non-existent in Either and either propagate failure or map the success value in Try. This allows, for instance, chaining like
for {
a <- Try {something}
b <- Try {somethingElse(a)}
c <- Try {theOtherThing(b)}
} yield c
which does just what you'd hope - returns a Try containing either the first exception, or the result.
Try has lots of other useful methods, and of course its companion apply method, that make it very convenient for its intended use - exception handling.
If you really want to be overwhelmed, there are two other classes out there which may be of interest for this kind of application. Scalaz has a class called "\/" (formerly known as Prince), pronounced "Either", which is mostly like Either, but flatMap and map work on the Right value. Similarly, and not, Scalactic has an "Or" which is also similar to Either, but flatMap and map work on the Left value.
I don't recommend Scalaz for beginners.
Either does not imply success and failure, it is just a container for either an A or a B. It is common to use it to represent successes and failures, the convention being to put the failure on the left side, and the success on the right.
A Try can be seen as an Either with the left-side type set to Throwable. Try[A] would be equivalent to Either[Throwable, A].
Use Try to clearly identify a potential failure in the computation, the failure being represented by an exception. If you want to represent the failure with a different type (like a String, or a set of case classes extending a sealed trait for example) use Either.
Either is more general, since it simply represents disjoint unions of types.
In particular, it can represent a union of valid return values of some type X and Exception. However, it does not attempt to catch any exceptions on its own. You have to add try-catch blocks around dangerous code, and then make sure that each branch returns an appropriate subclass of Either (usually: Left for errors, Right for successful computations).
Try[X] can be thought of as Either[Exception, X], but it also catches Exceptions on its own.
Either[X, Y] usage is more general. As its name say it can represent either an object of X type or of Y.
Try[X] has only one type and it might be either a Success[X] or a Failure (which contains a Throwable).
At some point you might see Try[X] as an Either[Throwable,X]
What is nice about Try[X] is that you can chain futher operations to it, if it is really a Success they will execute, if it was a Failure they won't
val connection = Try(factory.open())
val data = connection.flatMap(conn => Try(conn.readData()))
//At some point you can do
data matches {
Success(data) => print data
Failure(throwable) => log error
}
Of course, you can always oneline this like
Try(factory.open()).flatMap(conn => Try(conn.readData()) matches {
Success(data) => print data
Failure(throwable) => log error
}
As already have been mentioned, Either is more general, so it might not only wrap error/successful result, but also can be used as an alternative to Option, for branching the code path.
For abstracting the effect of an error, only for this purpose, I identified the following differences:
Either can be used to specify a description of the error, which can be shown to the client. Try - wraps an exception with a stack trace, less descriptive, less client oriented, more for internal usage.
Either allows us to specify error type, with existing monoid for this type. As a result, it allows us to combine errors (usually via applicative effects). Try abstraction with its exception, has no monoid defined. With Try we must spent more effort to extract error and handle it.
Based on it, here is my best practices:
When I want to abstract effect of error, I always use Either as the first choice, with List/Vector/NonEmptyList as error type.
Try is used only, when you invoke code, written in OOP. Good candidates for Try are methods, that might throw an exception, or methods, that sends request to external systems (rest/soap/database requests in case the methods return a raw result, not wrapped into FP abstractions, like Future, for instance.

Why should one prefer Option for error handling over exceptions in Scala?

So I'm learning functional Scala, and the book says exception breaks referential transparency, and thus Option should be used instead, like so:
def pattern(s: String): Option[Pattern] = {
try {
Some(Pattern.compile(s))
} catch {
case e: PatternSyntaxException => None
}
}
This seems pretty bad; I mean it seems equivalent to:
catch(Exception e){
return null;
}
Save for the fact that we can distinguish "null for error" from "null as genuine value". It seems it should at least return something that contains the error information like:
catch {
case e: Exception => Fail(e)
}
What am I missing?
At this specific section, Option is used mostly as an example because the operation used (calculating the mean) is a partial function, it doesn't produce a value for all possible values (the collection could be empty, thus there's no way to calculate the mean) and Option could be a valid case here. If you can't calculate the mean because the collection is empty just return a None.
But there are many other ways to solve this problem, you could use Either[L,R], with the Left being the error result and a Right as being the good result, you could still throw an exception and wrap it inside a Try object (which seems more common nowadays due to it's use in Promise and Future computations), you could use ScalaZ Validation if the error was actually a validation issue.
The main concept you should take a way from this part is that the error should be part of the return type of the function and not some magic operation (the exception) that can't be reasonably declared by the types.
And as a shameless plug, I did blog about Either and Try here.
It would be easier to answer this question if you weren't asking "why is Option better than exceptions?" and "why is Option better than null?" and "why is Option better than Try?" all at the same time.
The answer to the first of these questions is that using exceptions in situations that aren't truly exceptional muddles the control flow of your program. This is where referential transparency comes in—it's much easier for me (or you) to reason about your code if I can think in terms of values and don't have to keep track of where exceptions are being thrown and caught.
The answer to the second question (why not null?) is something like "Have you ever had to deal with NullPointerException in Java?".
For the third question, in general you're right—it's better to use a type like Either[Throwable, A] or Try[A] to represent computations that can fail, since they allow you to pass along more detailed information about the failure. In some cases, though, when a function can only fail in a single obvious way, it makes sense to use Option. For example, if I'm performing a lookup in a map, I probably don't really need or want something like an Either[NoSuchElementException, A], where the error is so abstract that I'd probably end up wrapping it in something more domain-specific anyway. So get on a map just returns an Option[A].
You should use util.Try:
scala> import java.util.regex.Pattern
import java.util.regex.Pattern
scala> def pattern(s: String): util.Try[Pattern] = util.Try(Pattern.compile(s))
pattern: (s: String)scala.util.Try[java.util.regex.Pattern]
scala> pattern("<?++")
res0: scala.util.Try[java.util.regex.Pattern] =
Failure(java.util.regex.PatternSyntaxException: Dangling meta character '+' near index 3
<?++
^)
scala> pattern("[.*]")
res1: scala.util.Try[java.util.regex.Pattern] = Success([.*])
The naive example
def pattern(s: String): Pattern = {
Pattern.compile(s)
}
has a sideeffect, it can influence the programm that uses it by other means than its result(it can cause a exception). This is discouraged in functional programming, because it increases the code complexity.
The code
def pattern(s: String): Option[Pattern] = {
try {
Some(Pattern.compile(s))
} catch {
case e: PatternSyntaxException => None
}
}
encapsulates the side effect producing part of the programm. The information why the Pattern failed is lost, but sometimes it only matters whether or not it fails. If it matters why the method failed one can use Try(http://www.scala-lang.org/files/archive/nightly/docs/library/index.html#scala.util.Try):
def pattern(s: String): Try[Pattern] = {
Try(Pattern.compile(s))
}
I think the other two answers give you good suggestions about how to proceed. I would still argue that throwing an exception is well represented in Scala's type system, using the bottom type Nothing. So it is well-typed, and I wouldn't exactly called it "magic operation".
However... if your method can quite commonly result in an invalid value, that is if your call side quite reasonably wants to handle such an invalid value straight away, then using Option, Either or Try is a good approach. In a scenario, where your call site doesn't really know what to do with such an invalid value, especially if it is an exceptional condition and not the common case, then you should use exceptions IMO.
The problem of exception is precisely not that they are not well working with functional programming, but that they can be difficult to reason about when you have side effects. Because then your call site must ensure to undo the side effects in the case of an exception. If your call site is purely functional, passing on an exception doesn't do any damage.
If any functions that does anything with integers would declare its return type a Try because of division-by-zero or overflow possibilities, this might totally clutter your code. Another very good reason to use exceptions is invalid argument ranges, or requirements. If you expect an argument to be an integer between 0 and x, you may well throw an IllegalArgumentException if it does not meet that property; conveniently in Scala: require(a >= 0 && a < x).