I started using Scalaz 7 Validation and/or disjunction to process a list of possibly failing operations and manage their results.
There are two well-documented cases for that kind of use:
1/ You want to check a list of conditions on something and accumulate every error, if any. Here you always go to the end of the list, and if there is any error the global result is a failure.
And that's an applicative functor at work.
2/ You want to execute several steps that may fail, and stop on the first one failing.
Here, we have a monad that goes nicely in Scala for-comprehension.
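To make the two documented cases concrete, here is a minimal sketch using only the standard library's Either rather than Scalaz Validation (so it shows the shape of the two behaviours, not the Scalaz API). The checks validatePositive and validateEven are hypothetical, and the for-comprehension assumes Scala 2.12+, where Either is right-biased.

```scala
// Plain-Scala sketch (Scala 2.12+). Not Scalaz: Either stands in for Validation.
// validatePositive and validateEven are hypothetical checks.
object TwoStyles {
  def validatePositive(n: Int): Either[String, Int] =
    if (n > 0) Right(n) else Left(s"$n is not positive")

  def validateEven(n: Int): Either[String, Int] =
    if (n % 2 == 0) Right(n) else Left(s"$n is not even")

  // Case 1: run every check and accumulate all the errors.
  def checkAll(n: Int): Either[List[String], Int] = {
    val errors = List(validatePositive(n), validateEven(n)).collect { case Left(e) => e }
    if (errors.isEmpty) Right(n) else Left(errors)
  }

  // Case 2: stop at the first failing check.
  def checkFirst(n: Int): Either[String, Int] =
    for {
      a <- validatePositive(n)
      b <- validateEven(a)
    } yield b
}
```

With an input that fails both checks, checkAll reports both errors while checkFirst reports only the first one.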
So, I have two other use cases that are along the same lines but don't seem to fit either of the preceding cases:
I want to process a list of steps, each possibly failing, and accumulate both error and success results (e.g. it's a list of modifications to files; errors may happen because that's the outside world, and successes are patches that I want to keep for later).
The only difference between the two use cases is whether I want to stop early (on the first error) or go to the end of the list.
OK, so what is the correct thing for that?
(Writing the question leads me to think that it's just a simple foldLeft, isn't it? I will leave the question here for validation, and in case anybody else wonders.)
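For what it's worth, the foldLeft idea can be sketched roughly as follows with plain Either. This is only an illustration under assumptions: the names runAll and runUntilFailure are made up, and steps are modelled as thunks so that the early-stopping variant really does skip the remaining ones.

```scala
// Hypothetical helpers; steps are thunks so that unexecuted steps stay unexecuted.
object StepRunner {
  // Go to the end of the list, keeping errors and successes in their original order.
  def runAll[E, A](steps: List[() => Either[E, A]]): (List[E], List[A]) = {
    val (errs, oks) = steps.foldLeft((List.empty[E], List.empty[A])) {
      case ((es, as), step) =>
        step() match {
          case Left(e)  => (e :: es, as)
          case Right(a) => (es, a :: as)
        }
    }
    (errs.reverse, oks.reverse)
  }

  // Stop on the first error, but keep the successes obtained before it.
  def runUntilFailure[E, A](steps: List[() => Either[E, A]]): (Option[E], List[A]) = {
    @annotation.tailrec
    def loop(rest: List[() => Either[E, A]], acc: List[A]): (Option[E], List[A]) =
      rest match {
        case Nil => (None, acc.reverse)
        case step :: tail =>
          step() match {
            case Left(e)  => (Some(e), acc.reverse)
            case Right(a) => loop(tail, a :: acc)
          }
      }
    loop(steps, Nil)
  }
}
```

The early-stopping variant uses explicit recursion instead of foldLeft because a fold always visits every element.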
Take a look at Validation#append or its alias Validation#+|+. Given two validations, if both are success, it returns success of the values appended. If both are failures, it returns failure of the values appended. Otherwise, it returns the successful value. This requires an implicit Semigroup instance for the success type.
I'd do something like this:
scala> List(1.success[String], 2.success[String], "3".failure[Int], "4".failure[Int]).partition(_.isSuccess)
res2: (List[scalaz.Validation[java.lang.String,Int]], List[scalaz.Validation[java.lang.String,Int]]) = (List(Success(1), Success(2)),List(Failure(3), Failure(4)))
scala> val fun = (_:List[Validation[String, Int]]).reduceLeft(_ append _)
fun: List[scalaz.Validation[String,Int]] => scalaz.Validation[String,Int] = <function1>
scala> fun <-: res2 :-> fun
res3: (scalaz.Validation[String,Int], scalaz.Validation[String,Int]) = (Success(3),Failure(34))
UPD: With #129 and #130 merged, you can change fun to (_:List[Validation[String, Int]]).concatenate or (_:List[Validation[String, Int]]).suml
Or bimap like this:
scala> List(1.success[String], 2.success[String], "3".failure[Int], "4".failure[Int]).partition(_.isSuccess).bimap(_.suml, _.suml)
res6: (scalaz.Validation[java.lang.String,Int], scalaz.Validation[java.lang.String,Int]) = (Success(3),Failure(34))
What you need is approximately switching an Either[E, A] into a Writer[List[E], A]. The Writer monad logs the errors you encountered.
Sounds like you want a pair (SomeValue, List[T]) where T is your 'Failure', although I'd call it 'Warning' or 'Log' since you still get a result, so it's not really a failure.
Don't know if Scalaz has anything fancy for this though.
I do some data conversion on a list of string and I get a list of Either where Left represents an error and Right represents a successfully converted item.
val results: Seq[Either[String, T]] = ...
I partition my results with:
val (errors, items) = results.partition(_.isLeft)
After doing some error processing I want to return a Seq[T] of valid items. That means returning the value of all Right elements. Because of the partitioning I already know that all elements of items are Right. I have come up with five possible ways to do it. But which is best in terms of readability and performance? Is there an idiomatic way to do it in Scala?
// which variant is most scala like and still understandable?
items.map(_.right.get)
items.map(_.right.getOrElse(null))
items.map(_.asInstanceOf[Right[String, T]].value)
items.flatMap(_.toOption)
items.collect{case Right(item) => item}
Using .get is considered a "code smell": it will work in this case, but makes the reader of the code pause and spend a few extra "cycles" in order to "prove" that it is ok. It is better to avoid using things like .get on Either and Option or .apply on a Map or an IndexedSeq.
.getOrElse is ok ... but null is not something you often see in scala code. Again, makes the reader stop and think "why is this here? what will happen if it ends up returning null?" etc. Better to avoid as well.
.asInstanceOf is ... just bad. It breaks type safety, and is just ... not scala.
That leaves .flatMap(_.toOption) or .collect. Both are fine. I would personally prefer the latter as it is a bit more explicit (and does not make the reader stop to remember which way Either is biased).
You could also use foldRight to do both partition and extract in one "go":
val (errors, items) = results.foldRight[(List[String], List[T])]((Nil, Nil)) {
  case (Left(error), (e, i)) => (error :: e, i)
  case (Right(result), (e, i)) => (e, result :: i)
}
Starting in Scala 2.13, you'll probably prefer partitionMap to partition.
It partitions elements based on a function which returns either Right or Left. Which in your case, is simply the identity:
val (lefts, rights) = List(Right(1), Left("2"), Left("3")).partitionMap(identity)
// val lefts: List[String] = List(2, 3)
// val rights: List[Int] = List(1)
which lets you use lefts and rights independently and with the right types.
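partitionMap is also handy when the conversion itself is the function, so the intermediate Seq[Either[...]] never needs to exist. A small sketch, assuming Scala 2.13+ (for both partitionMap and toIntOption); parse is a hypothetical converter standing in for the real one:

```scala
// Requires Scala 2.13+. parse is a hypothetical conversion function.
object PartitionMapDemo {
  def parse(s: String): Either[String, Int] =
    s.toIntOption.toRight(s"not a number: $s")

  // Convert and split in one pass: errors on the left, converted items on the right.
  val (errors, items) = List("1", "x", "3").partitionMap(parse)
}
```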
Going through them one by one:
items.map(_.right.get)
You already know that these are all Rights. This will be absolutely fine.
items.map(_.right.getOrElse(null))
The .getOrElse is unnecessary here as you already know it should never happen. I would recommend throwing an exception if you find a Left (somehow) though, something like this: items.map(x => x.right.getOrElse(throw new Exception(s"Unexpected Left: [$x]"))) (or whatever exception you think is suitable), rather than meddling with null values.
items.map(_.asInstanceOf[Right[String, T]].value)
This is unnecessarily complicated. I also can't get this to compile, but I may be doing something wrong. Either way there's no need to use asInstanceOf here.
items.flatMap(_.toOption)
I also can't get this to compile. items.flatMap(_.right.toOption) compiles for me, but at that point it'll always be a Some and you'll still have to .get it.
items.collect{case Right(item) => item}
This is another case of "it works but why be so complicated?". It also isn't exhaustive in the case of a Left item being there, but this should never happen so there's no need to use .collect.
Another way you could get the right values out is with pattern matching:
items.map {
case Right(value) => value
case other => throw new Exception(s"Unexpected Left: $other")
}
But again, this is probably unnecessary as you already know that all values will be Right.
If you are going to partition results like this, I recommend the first option, items.map(_.right.get). Any other options either have unreachable code (code you'd never be able to hit through Unit tests or real-life operation) or are unnecessarily complicated for the sake of "looking functional".
Maybe my design is flawed (most probably it is), but I have been thinking about the way Option is used in Scala and I am not very happy about it. Let's say I have 3 methods calling one another like this:
def A(): reads a file and returns something
def B(): returns something
def C(): Side effect (writes into DB)
and C() calls B() and in turn B() calls A()
Now, as A() depends on I/O operations, I had to handle the exceptions and return an Option, otherwise it won't compile (if A() does not return anything). As B() receives an Option from A() and has to return something, it is bound to return another Option to C(). So you can imagine that my code is flooded with match/case Some/case None (I don't always have the liberty to use getOrElse()). And if C() depends on some other methods which also return Option, you would be scared to look at the definition of C().
So, am I missing something? Or how flawed is my design? How can I improve it?
Using match/case on type Option is often useful when you want to throw away the Option and produce some value after processing the Some(...) but a different value of the same type if you have a None. (Personally, I usually find fold to be cleaner for such situations.)
If, on the other hand, you're passing the Option along, then there are other ways to go about it.
def a():Option[DataType] = {/*read new data or fail*/}
def b(): Option[DataType] = {
... //some setup
a().map{ inData =>
... //inData is real, process it for output
}
}
def c():Unit = {
... //some setup
b().foreach{ outData =>
... //outData is real, write it to DB
}
}
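A runnable version of the same shape might look like the sketch below. Everything concrete in it is made up for illustration: a takes a path and fakes the file read, b computes a length, and c appends to an in-memory buffer instead of writing to a database.

```scala
// Hypothetical stand-ins for the three methods; no real file or DB access.
object OptionChain {
  // "Reads" a file: an empty path plays the role of a failed read.
  def a(path: String): Option[String] =
    if (path.nonEmpty) Some(s"contents of $path") else None

  // Passes the Option along: map only runs when a() produced a value.
  def b(path: String): Option[Int] =
    a(path).map(_.length)

  // Stand-in for the database.
  val written = scala.collection.mutable.ListBuffer.empty[Int]

  // Side effect at the end of the chain: foreach does nothing on None.
  def c(path: String): Unit =
    b(path).foreach(written += _)
}
```

No match/case appears anywhere: None simply flows through map and is ignored by foreach.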
am I missing something?
Option is one design decision, but there can be others. For example, what happens when you want to describe the error returned by the API? Option can only express two states: either a value was read successfully, or the read failed. But sometimes you really want to know why it failed. If I return None, is it because the file isn't there, or because an exception was thrown (e.g. I don't have permission to read the file)?
Whichever path you choose, you'll usually be dealing with one or more effects. Option is one such effect; it represents a partial function, i.e. an operation that may not yield a result. While pattern matching on Option, as others said, is one way of handling it, there are other operations which reduce the verbosity.
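One possible alternative, sketched here under assumptions, is to return an Either whose left side says why the read failed. The ReadError hierarchy and readFile are hypothetical names, and Try#toEither needs Scala 2.12+.

```scala
import scala.io.Source
import scala.util.Try

// Hypothetical error type: each case records a different reason for failure.
object ReadErrors {
  sealed trait ReadError
  final case class NotFound(path: String) extends ReadError
  final case class Unreadable(path: String, cause: Throwable) extends ReadError

  def readFile(path: String): Either[ReadError, String] = {
    val file = new java.io.File(path)
    if (!file.exists()) Left(NotFound(path))
    else
      Try {
        val src = Source.fromFile(file)
        try src.mkString finally src.close()
      }.toEither.left.map(Unreadable(path, _)) // keep the exception as the cause
  }
}
```

The caller can then pattern match on NotFound versus Unreadable instead of guessing what a None meant.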
For example, if you want to invoke an operation in case the value exists and another in case it isn't and they both have the same return type, you can use Option.fold:
scala> val maybeValue = Some(1)
maybeValue: Some[Int] = Some(1)
scala> maybeValue.fold(0)(x => x + 1)
res0: Int = 2
Generally, there are many such combinators defined on Option and other effects. They might seem cumbersome at first, but they grow on you, and you see their real power when you want to compose operations one after the other.
Say I have the following snippet
def testFailure2() = {
val f1 = Future.failed(new Exception("ex1"))
val f2 = Future.successful(2);
val f3 = Future.successful((5));
val f4 = Future.failed(new Exception("ex4"))
val l = List(f1, f2, f3, f4)
l
}
The return type is List[Future[Int]]. Normally, I can just call Future.sequence and get a Future[List[Int]]. But in this scenario it won't work, as I have failed Futures. So I want to convert this to a Future[List[Int]] by ignoring the failed Futures. How do I do that?
A second question on a similar topic: I understand filter, collect, partition, etc. on a List. In this scenario, say I wanted to filter/partition the list into two lists:
- Failed Futures in one
- Successfully done Futures in another.
How do I do that?
One way would be to first convert all Future[Int]s to Future[Option[Int]] that always succeed (but result in None if the original future fails). Then you can use Future.sequence and then flatten the result:
def sequenceIgnoringFailures[A](xs: List[Future[A]])(implicit ec: ExecutionContext): Future[List[A]] = {
val opts = xs.map(_.map(Some(_)).fallbackTo(Future(None)))
Future.sequence(opts).map(_.flatten)
}
The other answer is correct: you should use a Future[List[X]] where X is something that differentiates between failure and success. It can be an Option, an Either, a Try, or whatever you want.
It seems like you're bothered by this, and I suppose it's because you were hoping to find something like:
Do all these futures in parallel, ignore the failed ones during the process
And you're given
Do all these futures, wait for everything to finish, and discard based on the result
But actually, there is no special way to express "ignore the failed ones". Something has to acknowledge each future result since you're interested in it, otherwise starting it makes no sense in the first place. And this something has to wait for all futures to finish anyway. And as such, the flag for "you can now ignore me" is indeed the Option being None, the Either being Left, or the Try being Failure. There is not, afaik, a specific flag for futures for "this result being discarded", and I don't think scala would need one.
So, fear not, and go for Future[List[X]], because it actually expresses what you want ! :-)
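As a rough sketch of the Try flavour of this idea, which also covers the second question (splitting failures from successes): the helper names settle and partitionOutcomes are made up for illustration, and only standard-library calls are used.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}
import scala.util.control.NonFatal

// Hypothetical helpers built on Future.sequence.
object FutureOutcomes {
  // Turn each Future[A] into a Future[Try[A]] that always succeeds, then sequence.
  def settle[A](xs: List[Future[A]])(implicit ec: ExecutionContext): Future[List[Try[A]]] =
    Future.sequence(
      xs.map(f => f.map[Try[A]](Success(_)).recover { case NonFatal(e) => Failure(e) })
    )

  // Wait for everything, then split into (failures, successes).
  def partitionOutcomes[A](xs: List[Future[A]])(
      implicit ec: ExecutionContext): Future[(List[Throwable], List[A])] =
    settle(xs).map { results =>
      val failures  = results.collect { case Failure(e) => e }
      val successes = results.collect { case Success(a) => a }
      (failures, successes)
    }
}
```

Dropping the first component of the pair gives the "ignore the failed ones" behaviour from the original question.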
According to the documentation:
The Try type represents a computation that may either result in an
exception, or return a successfully computed value. It's similar to,
but semantically different from the scala.util.Either type.
The docs do not go into further detail as to what the semantic difference is. Both seem to be able to communicate successes and failures. Why would you use one over the other?
I covered the relationship between Try, Either, and Option in this answer. The highlights from there regarding the relationship between Try and Either are summarized below:
Try[A] is isomorphic to Either[Throwable, A]. In other words you can treat a Try as an Either with a left type of Throwable, and you can treat any Either that has a left type of Throwable as a Try. It is conventional to use Left for failures and Right for successes.
Of course, you can also use Either more broadly, not only in situations with missing or exceptional values. There are other situations where Either can help express the semantics of a simple union type (where value is one of two types).
Semantically, you might use Try to indicate that the operation might fail. You might similarly use Either in such a situation, especially if your "error" type is something other than Throwable (e.g. Either[ErrorType, SuccessType]). And then you might also use Either when you are operating over a union type (e.g. Either[PossibleType1, PossibleType2]).
Since Scala 2.12, the standard library does include the conversions from Either to Try or from Try to Either. For earlier versions, it is pretty simple to enrich Try, and Either as needed:
import scala.util.{Failure, Success, Try}
object TryEitherConversions {
implicit class EitherToTry[L <: Throwable, R](val e: Either[L, R]) extends AnyVal {
def toTry: Try[R] = e.fold(Failure(_), Success(_))
}
implicit class TryToEither[T](val t: Try[T]) extends AnyVal {
def toEither: Either[Throwable, T] =
t.map(Right(_)).recover(Left(_)).get
}
}
This would allow you to do:
import TryEitherConversions._
//Try to Either
Try(1).toEither //Either[Throwable, Int] = Right(1)
Try("foo".toInt).toEither //Either[Throwable, Int] = Left(java.lang.NumberFormatException)
//Either to Try
Right[Throwable, Int](1).toTry //Success(1)
Left[Throwable, Int](new Exception).toTry //Failure(java.lang.Exception)
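On Scala 2.12 and later the enrichment is not needed, since Try#toEither and Either#toTry are in the standard library. A short sketch of the built-in equivalents (the object and value names are only for illustration):

```scala
import scala.util.Try

// Scala 2.12+: the conversions are built in, no implicit classes required.
object BuiltInConversions {
  val ok: Either[Throwable, Int]  = Try("42".toInt).toEither
  val bad: Either[Throwable, Int] = Try("foo".toInt).toEither

  // toTry is only available when the left type is a Throwable.
  val back: Try[Int] = ok.toTry
}
```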
To narrowly answer your question: "What's the semantic difference":
This probably refers to flatMap and map, which did not exist on Either before Scala 2.12 (Either became right-biased in 2.12), and which in Try either propagate the failure or map the success value. This allows, for instance, chaining like
for {
a <- Try {something}
b <- Try {somethingElse(a)}
c <- Try {theOtherThing(b)}
} yield c
which does just what you'd hope - returns a Try containing either the first exception, or the result.
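Here is a self-contained version of that kind of chain, as a sketch. The functions parse and divideInto are hypothetical stand-ins for something, somethingElse and theOtherThing.

```scala
import scala.util.Try

// Hypothetical steps, each of which may throw.
object TryChain {
  def parse(s: String): Try[Int]    = Try(s.toInt)   // NumberFormatException on bad input
  def divideInto(n: Int): Try[Int]  = Try(100 / n)   // ArithmeticException when n == 0

  // Later steps run only if the earlier ones succeeded.
  def compute(s: String): Try[Int] =
    for {
      n <- parse(s)
      q <- divideInto(n)
    } yield q + 1
}
```

A bad string fails in the first step and the division is never attempted; a zero passes the first step and fails in the second.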
Try has lots of other useful methods, and of course its companion apply method, that make it very convenient for its intended use - exception handling.
If you really want to be overwhelmed, there are two other classes out there which may be of interest for this kind of application. Scalaz has a class called "\/" (formerly known as Prince), pronounced "Either", which is mostly like Either, but flatMap and map work on the Right value. Similarly, and not, Scalactic has an "Or" which is also similar to Either, but flatMap and map work on the Left value.
I don't recommend Scalaz for beginners.
Either does not imply success and failure, it is just a container for either an A or a B. It is common to use it to represent successes and failures, the convention being to put the failure on the left side, and the success on the right.
A Try can be seen as an Either with the left-side type set to Throwable. Try[A] would be equivalent to Either[Throwable, A].
Use Try to clearly identify a potential failure in the computation, the failure being represented by an exception. If you want to represent the failure with a different type (like a String, or a set of case classes extending a sealed trait for example) use Either.
Either is more general, since it simply represents disjoint unions of types.
In particular, it can represent a union of valid return values of some type X and Exception. However, it does not attempt to catch any exceptions on its own. You have to add try-catch blocks around dangerous code, and then make sure that each branch returns an appropriate subclass of Either (usually: Left for errors, Right for successful computations).
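A minimal sketch of what that manual wrapping looks like; divide is a made-up example, and the error is reported as a plain String just to show that the left type is up to you.

```scala
// Either does not catch anything by itself, so the try/catch is explicit.
object SafeDivide {
  def divide(a: Int, b: Int): Either[String, Int] =
    try Right(a / b)
    catch {
      // Each branch has to build the Left or Right itself.
      case e: ArithmeticException => Left(s"cannot divide $a by $b: ${e.getMessage}")
    }
}
```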
Try[X] can be thought of as Either[Exception, X], but it also catches Exceptions on its own.
Either[X, Y] usage is more general. As its name says, it can represent either an object of type X or an object of type Y.
Try[X] has only one type parameter, and it is either a Success[X] or a Failure (which contains a Throwable).
At some point you might see Try[X] as an Either[Throwable, X].
What is nice about Try[X] is that you can chain further operations onto it: if it is really a Success they will execute, and if it was a Failure they won't.
val connection = Try(factory.open())
val data = connection.flatMap(conn => Try(conn.readData()))
// At some point you can do
data match {
  case Success(d) => println(d)
  case Failure(throwable) => println(s"error: $throwable") // or log the error
}
Of course, you can always oneline this like
Try(factory.open()).flatMap(conn => Try(conn.readData())) match {
  case Success(d) => println(d)
  case Failure(throwable) => println(s"error: $throwable") // or log the error
}
As has already been mentioned, Either is more general: it can not only wrap an error/successful result, but can also be used as an alternative to Option for branching the code path.
For abstracting the effect of an error, and only for that purpose, I identified the following differences:
Either can be used to specify a description of the error, which can be shown to the client. Try wraps an exception with a stack trace; it is less descriptive, less client-oriented, and more suited to internal use.
Either allows us to specify the error type, and to choose one with an existing monoid. As a result, it allows us to combine errors (usually via applicative effects). Try, with its exception, has no monoid defined. With Try we must spend more effort to extract the error and handle it.
Based on this, here are my best practices:
When I want to abstract the effect of an error, I always use Either as the first choice, with List/Vector/NonEmptyList as the error type.
Try is used only when you invoke code written in OOP style. Good candidates for Try are methods that might throw an exception, or methods that send requests to external systems (REST/SOAP/database requests), in case the methods return a raw result not wrapped in FP abstractions such as Future.
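A rough sketch of those two practices together, using only the standard library (Scala 2.12+ for Try#toEither). The Result alias and the functions parsePort, nonEmpty and both are hypothetical, and both is a hand-written stand-in for what an applicative from a library such as Scalaz or Cats would provide.

```scala
import scala.util.Try

object Boundary {
  // Either with a List as the error type, so errors can be concatenated.
  type Result[A] = Either[List[String], A]

  // Try only at the boundary with throwing code, converted straight away.
  def parsePort(raw: String): Result[Int] =
    Try(raw.toInt).toEither.left.map(e => List(s"invalid port '$raw': ${e.getMessage}"))

  def nonEmpty(name: String, value: String): Result[String] =
    if (value.trim.nonEmpty) Right(value) else Left(List(s"$name must not be empty"))

  // Combine two independent results, keeping the errors of both sides.
  def both[A, B](ra: Result[A], rb: Result[B]): Result[(A, B)] =
    (ra, rb) match {
      case (Right(a), Right(b)) => Right((a, b))
      case (Left(e1), Left(e2)) => Left(e1 ++ e2)
      case (Left(e), _)         => Left(e)
      case (_, Left(e))         => Left(e)
    }
}
```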
Let's say I have a method..
def foo(b: Bar): Try[Bar] = ???
Try is just a placeholder here. foo does something with Bar, then returns a value to indicate success/failure. I want to return the original value with the success/failure indication, so when I have a collection, I can know which ones failed and succeeded, and do something with them. Try doesn't really work for me, because Failure wraps an exception (let's say I don't care about the reason why it failed).
I could maybe return Either[Bar, Bar], but it seems redundant to repeat the type parameter.
Are there better alternatives than this?
Either[Bar, Bar] and (Boolean, Bar) are isomorphic and the choice between them is a matter of taste.
I'd personally prefer Either because you get a nicer set of operations for mapping over the collection with pattern matching, etc., as well as a merge extension method that allows you to write results.map(_.merge) to get a Seq[Bar] if in some situation you no longer need to make a distinction between successful and failed results. I also find this:
val result: Either[Bar, Bar] = foo(input).toOption.toRight(input)
A little nicer than:
val result: (Boolean, Bar) =
foo(input).map((true, _)).getOrElse((false, input))
Or the alternatives, but your mileage may vary.
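Applied to a whole collection, the Either[Bar, Bar] approach could look like the sketch below. Bar's definition and the body of foo are invented for the example (foo fails on negative input); only toOption, toRight, collect and merge come from the standard library.

```scala
import scala.util.Try

object KeepInput {
  // Hypothetical Bar and foo: foo doubles n and fails on negative values.
  final case class Bar(n: Int)
  def foo(b: Bar): Try[Bar] = Try { require(b.n >= 0, "negative"); Bar(b.n * 2) }

  // Left holds the original input when foo failed, Right holds the result.
  def attempt(b: Bar): Either[Bar, Bar] = foo(b).toOption.toRight(b)

  val results: List[Either[Bar, Bar]] = List(Bar(1), Bar(-1), Bar(3)).map(attempt)

  val failedInputs: List[Bar] = results.collect { case Left(b) => b }
  val succeeded: List[Bar]    = results.collect { case Right(b) => b }
  // When the distinction no longer matters, merge collapses both sides.
  val all: List[Bar]          = results.map(_.merge)
}
```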
If I understood your question: Your solution is almost what you were about to achieve.
def foo(b: Bar): Try[Bar] = Try(...)
val succeed = foo(b).toOption
In succeed you have Some of the value you wanted, or None (since you don't care about the exception). A very good article is: http://danielwestheide.com/blog/2012/12/26/the-neophytes-guide-to-scala-part-6-error-handling-with-try.html