The traverse method from Future object stops at first failure. I want a tolerant/forgiving version of this method which on occurrence of errors carries on with the rest of the sequence.
Currently we have added the following method to our utils:
def traverseFilteringErrors[A, B <: AnyRef]
(seq: Seq[A])
(f: A => Future[B]): Future[Seq[B]] = {
val sentinelValue = null.asInstanceOf[B]
val allResults = Future.traverse(seq) { x =>
f(x) recover { case _ => sentinelValue }
}
val successfulResults = allResults map { result =>
result.filterNot(_ == sentinelValue)
}
successfulResults
}
Is there a better way to do this?
A genuinely useful thing (generally speaking) would be to be able to promote the error of a future into a proper value. Or in other words, transform a Future[T] into a Future[Try[T]] (the succesful return value becomes a Success[T] while the failure case becomes a Failure[T]). Here is how we might implement it:
// Can also be done more concisely (but less efficiently) as:
// f.map(Success(_)).recover{ case t: Throwable => Failure( t ) }
// NOTE: you might also want to move this into an enrichment class
def mapValue[T]( f: Future[T] ): Future[Try[T]] = {
val prom = Promise[Try[T]]()
f onComplete prom.success
prom.future
}
Now, if you do the following:
Future.traverse(seq)( f andThen mapValue )
You'll obtain a succesful Future[Seq[Try[A]]], whose eventual value contains a Success instance for each successful future, and a Failure instance for each failed future.
If needed, you can then use collect on this seq to drop the Failure instances and keep only the sucessful values.
In other words, you can rewrite your helper method as follows:
def traverseFilteringErrors[A, B](seq: Seq[A])(f: A => Future[B]): Future[Seq[B]] = {
Future.traverse( seq )( f andThen mapValue ) map ( _ collect{ case Success( x ) => x } )
}
Related
Background
I have been reading the book Functional Programming in Scala, and have some questions regarding the content in Chapter 7: Purely functional parallelism.
Here is the code for the answers in the book: Par.scala, but I am confused about certain part of it.
Here is the first part of the code of Par.scala, which stands for Parallelism:
import java.util.concurrent._
object Par {
type Par[A] = ExecutorService => Future[A]
def unit[A](a: A): Par[A] = (es: ExecutorService) => UnitFuture(a)
private case class UnitFuture[A](get: A) extends Future[A] {
def isDone = true
def get(timeout: Long, units: TimeUnit): A = get
def isCancelled = false
def cancel(evenIfRunning: Boolean): Boolean = false
}
def map2[A, B, C](a: Par[A], b: Par[B])(f: (A, B) => C): Par[C] =
(es: ExecutorService) => {
val af = a(es)
val bf = b(es)
UnitFuture(f(af.get, bf.get))
}
def fork[A](a: => Par[A]): Par[A] =
(es: ExecutorService) => es.submit(new Callable[A] {
def call: A = a(es).get
})
def lazyUnit[A](a: => A): Par[A] =
fork(unit(a))
def run[A](es: ExecutorService)(a: Par[A]): Future[A] = a(es)
def asyncF[A, B](f: A => B): A => Par[B] =
a => lazyUnit(f(a))
def map[A, B](pa: Par[A])(f: A => B): Par[B] =
map2(pa, unit(()))((a, _) => f(a))
}
The simplest possible model for Par[A] might be ExecutorService => Future[A], and run simply returns the Future.
unit promotes a constant value to a parallel computation by returning a UnitFuture, which is a simple implementation of Future that just wraps a constant value.
map2 combines the results of two parallel computations with a binary function.
fork marks a computation for concurrent evaluation. The evaluation won’t actually occur until forced by run. Here is with its simplest and most natural implementation of it. Even though it has its problems, let's first put them aside.
lazyUnit wraps its unevaluated argument in a Par and marks it for concurrent evaluation.
run extracts a value from a Par by actually performing the computation.
asyncF converts any function A => B to one that evaluates its result asynchronously.
Questions
The fork is the function confuses me a lot here, because it takes a lazy argument, which will be evaluated later when it is called. Then my questions are more about when we should use this fork, i.e., when we need lazy-evaluation and when we need to have the value directly.
Here is an exercise from the book:
EXERCISE 7.5
Hard: Write this function, called sequence. No additional primitives are required. Do not call run.
def sequence[A](ps: List[Par[A]]): Par[List[A]]
And here is the answers (offered here).
First
def sequence_simple[A](l: List[Par[A]]): Par[List[A]] =
l.foldRight[Par[List[A]]](unit(List()))((h, t) => map2(h, t)(_ :: _))
What is the different between above code and the following:
def sequence_simple[A](l: List[Par[A]]): Par[List[A]] =
l.foldLeft[Par[List[A]]](unit(List()))((t, h) => map2(h, t)(_ :: _))
Additionally
def sequenceRight[A](as: List[Par[A]]): Par[List[A]] =
as match {
case Nil => unit(Nil)
case h :: t => map2(h, fork(sequenceRight(t)))(_ :: _)
}
def sequenceBalanced[A](as: IndexedSeq[Par[A]]): Par[IndexedSeq[A]] = fork {
if (as.isEmpty) unit(Vector())
else if (as.length == 1) map(as.head)(a => Vector(a))
else {
val (l,r) = as.splitAt(as.length/2)
map2(sequenceBalanced(l), sequenceBalanced(r))(_ ++ _)
}
}
In sequenceRight, fork is used when recursive function is directly called. However, in sequenceBalanced, fork is used outside of the whole function body.
Then, what is the differences or above code and the following (where we switched the places of fork):
def sequenceRight[A](as: List[Par[A]]): Par[List[A]] = fork {
as match {
case Nil => unit(Nil)
case h :: t => map2(h, sequenceRight(t))(_ :: _)
}
}
def sequenceBalanced[A](as: IndexedSeq[Par[A]]): Par[IndexedSeq[A]] =
if (as.isEmpty) unit(Vector())
else if (as.length == 1) map(as.head)(a => Vector(a))
else {
val (l,r) = as.splitAt(as.length/2)
map2(fork(sequenceBalanced(l)), fork(sequenceBalanced(r)))(_ ++ _)
}
Finally, given the sequence defined above, we have the following function:
def parMap[A,B](ps: List[A])(f: A => B): Par[List[B]] = fork {
val fbs: List[Par[B]] = ps.map(asyncF(f))
sequence(fbs)
}
I would like to know, can I also implement the function in the following way, which is by applying the lazyUnit defined in the beginning? Is this implementation lazyUnit(ps.map(f)) lazy?
def parMapByLazyUnit[A, B](ps: List[A])(f: A => B): Par[List[B]] =
lazyUnit(ps.map(f))
I did not completely understand your doubt. But I see a major problem with the following solution,
def parMapByLazyUnit[A, B](ps: List[A])(f: A => B): Par[List[B]] =
lazyUnit(ps.map(f))
To understand the problem lets look at def lazyUnit,
def fork[A](a: => Par[A]): Par[A] =
(es: ExecutorService) => es.submit(new Callable[A] {
def call: A = a(es).get
})
def lazyUnit[A](a: => A): Par[A] =
fork(unit(a))
So... lazyUnit takes an expression of type => A and submits it to ExecutorService to get evaluated. And returns the wrapped result of this parallel computation as Par[A].
In parMap for every element of ps: List[A], we not only have to evaluate the corresponding mapping using the function f: A => B but we have to do these evaluations in parallel.
But our solution lazyUnit(ps.map(f)) will submit the whole { ps.map(f) } evaluation as a single task to our ExecutionService. Which means we are not doing it in parallel.
What we need to do is make sure that for each element a in ps: [A], the function f: A => B is executed as a separate task for our ExecutorService.
Now, as we learned from our implementation is that we can run an expression of type exp: => A by using lazyUnit(exp) to get a result: Par[A].
So, we will do exactly that for every a: A in ps: List[A],
val parMappedTmp = ps.map( a => lazyUnit(f(a) ) )
// or
val parMappedTmp = ps.map( a => asyncF(f)(a) )
// or
val parMappedTmp = ps.map(asyncF(f))
But, Now our parMappedTmp is a List[Par[B]] and whereas we needed a Par[List[B]]
So, you will need a function with the following signature to get what you wanted,
def sequence[A](ps: List[Par[A]]): Par[List[A]]
Once you have it,
val parMapped = sequence(parMappedTmp)
I want to apply a function f to each element of a List and not stop at the first error but throw the last error (if any) only:
#annotation.tailrec
def tryAll[A](xs: List[A])(f: A => Unit): Unit = {
xs match {
case x :: xt =>
try {
f(x)
} finally {
tryAll(xt)(f)
}
case _ =>
}
}
But, the above code does not compile - it is complaining that this function is not tail recursive. Why not?
This solution iterates over all elements and produces (throws) the last error if any:
def tryAll[A](xs: List[A])(f: A => Unit): Unit = {
val res = xs.foldLeft(Option.empty[Throwable]) {
case (maybeThrowable, a) =>
Try(f(a)) match {
case Success(_) => maybeThrowable
case Failure(e) => Option(e)
}
}
res.foreach(throwable => throw throwable)
}
As mentioned by #HristoIliev, your method cannot be tail recursive because the finally call is not guaranteed to be the tail call. This means that any method using try in this way will not be tail recursive. See this answer, also.
Calling the method again is a weird way of trying something repeatedly until it succeeds, because at each stage it's throwing an exception you are presumably not handling. Instead, I'd argue using a functional approach with Try, taking failures from a view until the operation succeeds. The only disadvantage to this approach is that it doesn't throw any exceptions for you to handle along the way (which can also be an advantage!).
def tryAll[A](xs: List[A])(f: A => Unit): Unit =
xs.view.map(x => Try(f(x))).takeWhile(_.isFailure).force
scala> val list = List(0, 0, 0, 4, 5, 0)
scala> tryAll(list)(a => println(10 / a))
2
If you really want to handle the exceptions (or just the last exception), you can change the return type of tryAll to List[Try[Unit]] (or simply Try[Unit] if you modify the code to only take the last one). It's better for the return type of the method to describe part of what it's actually doing--potentially returning errors.
Not sure the intention of the method, but you can something like that:
final def tryAll[A](xs: List[A])(f: A => Unit): Unit = {
xs match {
case x :: xt =>
try {
f(x)
} catch {
case e => tryAll(xt)(f)
}
case _ => //do something else
}
}
I know this way to use #annotation.tailrec
From this:
def fac(n:Int):Int = if (n<=1) 1 else n*fac(n-1)
You should have this:
#scala.annotation.tailrec
def facIter(f:Int, n:Int):Int = if (n<2) f else facIter(n*f, n-1)
def fac(n:Int) = facIter(1,n)
An example from the official scala documentation:
def first[T](f: Future[T], g: Future[T]): Future[T] = {
val p = promise[T]
f onSuccess {
case x => p.trySuccess(x)
}
g onSuccess {
case x => p.trySuccess(x)
}
p.future
}
Note that in this implementation, if neither f nor g succeeds, then
first(f, g) never completes (either with a value or with an
exception).
It gives us a warning, but no corresponding solution. How can you make first return a Failure if both f and g fail?
Something like this could do the job, although it's not based on the same logic as your example (this solution is wrong, check the Update)
def first[T](f: Future[T], g: Future[T]): Future[T] = {
val p = Promise[T]
p.completeWith(f.fallbackTo(g))
p.future
}
If both fails, first holds the throwable coming from f.
Update (based on #Jatin's comment):
My first solution is wrong, because if the f future never completes, then even if g does, first never completes either. This updated solution completes the Promise nondeterministically with the first completed value:
def first[T](f: Future[T], g: Future[T]): Future[T] = {
val p = Promise[T]
p.completeWith(Future.firstCompletedOf(Seq(f.fallbackTo(g),g.fallbackTo(f))))
p.future
}
Another solution: where you need to find the first successful one amongst a list. Or a failure if all fails.
The solution is a bit effectful by nature, would be happy to see pure functional implementation:
def second[T](ls: Seq[Future[T]]): Future[T] = {
val p = Promise[T]()
val size = ls.size
val count = new AtomicInteger(0)
ls.foreach(x => x onComplete{
case Success(a) => p.tryCompleteWith(x)
case Failure(y) => if(count.incrementAndGet() == size) p.tryCompleteWith(x)
})
p.future
}
The below prints 2 as expected:
second(List(Future{ Thread.sleep(5000); 1}, Future{Thread.sleep(3000) ;2}, Future{throw new RuntimeException})) onComplete{
case Success(x) => println(x)
case Failure(y) => y.printStackTrace()
}
I am trying to find a cleaner way to express code that looks similar to this:
def method1: Try[Option[String]] = ???
def method2: Try[Option[String]] = ???
def method3: Try[Option[String]] = ???
method1 match
{
case f: Failure[Option[String]] => f
case Success(None) =>
method2 match
{
case f:Failure[Option[String]] => f
case Success(None) =>
{
method3
}
case s: Success[Option[String]] => s
}
case s: Success[Option[String]] => s
}
As you can see, this tries each method in sequence and if one fails then execution stops and the base match resolves to that failure. If method1 or method2 succeeds but contains None then the next method in the sequence is tried. If execution gets to method3 its results are always returned regardless of Success or Failure. This works fine in code but I find it difficult to follow whats happening.
I would love to use a for comprehension
for
{
attempt1 <- method1
attempt2 <- method2
attempt3 <- method3
}
yield
{
List(attempt1, attempt2, attempt3).find(_.isDefined)
}
because its beautiful and what its doing is quite clear. However, if all methods succeed then they are all executed every time, regardless of whether an earlier method returns a usable answer. Unfortunately I can't have that.
Any suggestions would be appreciated.
scalaz can be of help here. You'll need scalaz-contrib which adds a monad instance for Try, then you can use OptionT which has nice combinators. Here is an example:
import scalaz.OptionT
import scalaz.contrib.std.utilTry._
import scala.util.Try
def method1: OptionT[Try, String] = OptionT(Try(Some("method1")))
def method2: OptionT[Try, String] = OptionT(Try(Some("method2")))
def method3: OptionT[Try, String] = { println("method 3 is never called") ; OptionT(Try(Some("method3"))) }
def method4: OptionT[Try, String] = OptionT(Try(None))
def method5: OptionT[Try, String] = OptionT(Try(throw new Exception("fail")))
println((method1 orElse method2 orElse method3).run) // Success(Some(method1))
println((method4 orElse method2 orElse method3).run) // Success(Some(method2))
println((method5 orElse method2 orElse method3).run) // Failure(java.lang.Exception: fail)
If you don't mind creating a function for each method, you can do the following:
(Try(None: Option[String]) /: Seq(method1 _, method2 _, method3 _)){ (l,r) =>
l match { case Success(None) => r(); case _ => l }
}
This is not at all idiomatic, but I would like to point out that there's a reasonably short imperative version also with a couple tiny methods:
def okay(tos: Try[Option[String]]) = tos.isFailure || tos.success.isDefined
val ans = {
var m = method1
if (okay(m)) m
else if ({m = method2; okay(m)}) m
method3
}
The foo method should do the same stuff as your code, I don't think it is possible to do it using the for comprehension
type tryOpt = Try[Option[String]]
def foo(m1: tryOpt, m2: tryOpt, m3: tryOpt) = m1 flatMap {
case x: Some[String] => Try(x)
case None => m2 flatMap {
case y: Some[String] => Try(y)
case None => m3
}
}
method1.flatMap(_.map(Success _).getOrElse(method2)).flatMap(_.map(Success _).getOrElse(method3))
How this works:
The first flatMap takes a Try[Option[String]], if it is a Failure it returns the Failure, if it is a Success it returns _.map(Success _).getOrElse(method2) on the option. If the option is Some then it returns the a Success of the Some, if it is None it returns the result of method2, which could be Success[None], Success[Some[String]] or Failure.
The second map works similarly with the result it gets, which could be from method1 or method2.
Since getOrElse takes a by-name paramater method2 and method3 are only called if they need to be.
You could also use fold instead of map and getOrElse, although in my opinion that is less clear.
From this blog:
def riskyCodeInvoked(input: String): Int = ???
def anotherRiskyMethod(firstOutput: Int): String = ???
def yetAnotherRiskyMethod(secondOutput: String): Try[String] = ???
val result: Try[String] = Try(riskyCodeInvoked("Exception Expected in certain cases"))
.map(anotherRiskyMethod(_))
.flatMap(yetAnotherRiskyMethod(_))
result match {
case Success(res) => info("Operation Was successful")
case Failure(ex: ArithmeticException) => error("ArithmeticException occurred", ex)
case Failure(ex) => error("Some Exception occurred", ex)
}
BTW, IMO, Option is no need here?
I've been trying to simplify the way I do futures in Scala. I got at one point a Future[Option[Future[Option[Boolean]] but I've simplified it further below. Is there a better way to simplify this?
Passing a future of "failed" doesn't seem like the best way to do this. i.e. in the sequential world I simply returned "FAIL!!" any time it failed rather than continuing to the end. Are there other ways?
val doSimpleWork = Future {
//Do any arbitrary work (can be a different function)
true //or false
}
val doComplexWork = Future {
//Do any arbitrary work (can be a different function)
Some("result") //or false
}
val failed = Future {
//Do no work at all!!! Just return
false
}
val fut1 = doSimpleWork
val fut2 = doSimpleWork
val fut3 = (fut1 zip fut2).map({
case (true, true) => true
case _ => false
})
val fut4 = fut3.flatMap({
case true =>
doComplexWork.flatMap({
case Some("result") =>
doSimpleWork
case None =>
failed
})
case false =>
failed
})
fut4.map({
case true =>
"SUCCESS!!!"
case _ =>
"FAIL!!"
})
Note that in your example, because you're eagerly instantiating the Futures to a val, all of them will start executing as soon as you declare them (val x = Future {...}). Using methods instead will make the Futures execute only when they're requested by the chain of execution.
One way to avoid the further computation would be to throw an exception, then handle it with onFailure:
def one = future { println("one") ; Some(1) }
def two = future { println("two") ; throw new Exception("no!"); 2 }
def three = future { println("three") ; 3 }
val f = one flatMap {
result1 => two flatMap {
result2 => three
}
}
f onFailure {
case e: Exception =>
println("failed somewhere in the chain")
}
You can see here that "three" isn't supposed to be printed out, because we fail on two. This is the case:
one
two
failed somewhere in the chain
a "Monad transformer" is a construct which lets you combine the "effects" of two monads, the scalaz project provides several different monad transformers. My suggestion is that you can use the OptionT monad transformer to simplify your code if you also make use of the fact that Option[Unit] is isomorphic to Boolean (Some(()) == true and None == false). Here's a complete example:
import scalaz._
import Scalaz._
import scala.concurrent._
import ExecutionContext.Implicits.global
import scala.concurrent.duration._
object Foo {
// We need a Monad instance for Future, here is a valid one, or you can use the implementation
// in the scalaz-contrib project, see http://typelevel.org
implicit def futureMonad(implicit executor: ExecutionContext): Monad[Future] = new Monad[Future] {
override def bind[A, B](fa: Future[A])(f: A ⇒ Future[B]) = fa flatMap f
override def point[A](a: ⇒ A) = Future(a)
override def map[A, B](fa: Future[A])(f: A ⇒ B) = fa map f
}
// OptionT allows you to combine the effects of the Future and Option monads
// to more easily work with a Future[Option[A]]
val doSimpleWork : OptionT[Future,Unit] = OptionT(Future {
// Option[Unit] is isomorphic to Boolean
Some(()) //or None
})
val simpleFail : OptionT[Future,Unit] = OptionT(Future {
None
})
val doComplexWork: OptionT[Future,String] = OptionT(Future {
Some("result") //or None
})
val f1 = doSimpleWork
val f2 = doSimpleWork
val f3 = doComplexWork
val f4 = doSimpleWork
def main(argv: Array[String]) {
val result = for {
_ <- f1
// we don't get here unless both the future succeeded and the result was Some
_ <- f2
_ <- f3
r <- f4
} yield(r)
result.fold((_ => println("SUCCESS!!")),println("FAIL!!"))
// "run" will get you to the Future inside the OptionT
Await.result(result.run, 1 second)
}
}
You could try something like this, using for comprehensions to clean up the code a bit:
def doSimpleWork = Future{
//do some simple work
true
}
def doComplexWork = Future{
//do something complex here
Some("result")
}
val fut1 = doSimpleWork
val fut2 = doSimpleWork
val fut = for{
f1Result <- fut1
f2Result <- fut2
if (f1Result && f2Result)
f3Result <- doComplexWork
if (f3Result.isDefined)
f4Result <- doSimpleWork
} yield "success"
fut onComplete{
case Success(value) => println("I succeeded")
case Failure(ex) => println("I failed: " + ex.getMessage)
}
And if you really just wanted to print out "success" or "failed" at the end, you could change that last piece of code to:
fut.recover{case ex => "failed"} onSuccess{
case value => println(value)
}
Now, to explain what's going on. For starters, we've defined two functions that return Futures that are doing some async work. The doSimpleWork function will do some simple work and return a boolean success/fail indicator. The doComplexWork function will do something more complex (and time consuming) and return an Option[String] representing a result. We then kick off two parallel invocations of doSimpleWork before entering the for comprehension. In for for comp, we get the results of fut1 and fut2 (in that order) before checking if they were both successful. If not, we would stop here, and the fut val would be failed with a NoSuchElementException which is what you get when a condition like this fails in a for comp. If both were successful, we would continue on and invoke the doComplexWork function and wait for its result. We would then check its result and if it was Some, we would kick off one last invocation of doSimpleWork. If that succeeds, we would yield the string "success". If you check the type of the fut val, its of type Future[String].
From there, we use one of the async callback functions to check if the whole sequence of calls either made it all the way through (the Success case), or failed somewhere in the process (the Failure case), printing out something related to which case it hit. In the alternate final code block, we recover from any possible failure by returning the String "failed" "and then use just the onSuccess callback which will print either "success" or "failed" depending on what happened.