Handling Option Inside For Comprehension of Futures - scala

Consider the following code inside a Play Framework controller:
val firstFuture = function1(id)
val secondFuture = function2(id)
val resultFuture = for {
first <- firstFuture
second <- secondFuture(_.get)
result <- function3(first, second)
} yield Ok(s"Processed $id")
resultFuture.map(result => result).recover { case t => InternalServerError(s"Error organizing files: $t.getMessage")}
Here are some details about the functions:
function1 returns Future[List]
function2 returns Future[Option[Person]]
function1 and function2 can run in parallel, but function3 needs the results for both.
Given this information, I have some questions:
Although the application is such that this code is very unlikely to be called with an improper id, I would like to handle this possibility. Basically, I would like to return NotFound if function2 returns None, but I can't figure out how to do that.
Will the recover call handle an Exception thrown any step of the way?
Is there a more elegant or idiomatic way to write this code?

Perhaps using collect, and then you can recover the NoSuchElementException--which yes, will recover a failure from any step of the way. resultFuture will either be successful with the mapped Result, or failed with the first exception that was thrown.
val firstFuture = function1(id)
val secondFuture = function2(id)
val resultFuture = for {
first <- firstFuture
second <- secondFuture.collect(case Some(x) => x)
result <- function3(first, second)
} yield Ok(s"Processed $id")
resultFuture.map(result => result)
.recover { case java.util.NoSuchElementException => NotFound }
.recover { case t => InternalServerError(s"Error organizing files: $t.getMessage")}

I would go with Scalaz OptionT. Maybe when you have only one function Future[Optipn[T]] it's overkill, but when you'll start adding more functions it will become super useful
import scala.concurrent.ExecutionContext.Implicits.global
import scalaz.OptionT
import scalaz.OptionT._
import scalaz.std.scalaFuture._
// Wrap 'some' result into OptionT
private def someOptionT[T](t: Future[T]): OptionT[Future, T] =
optionT[Future](t.map(Some.apply))
val firstFuture = function1(id)
val secondFuture = function2(id)
val action = for {
list <- someOptionT(firstFuture)
person <- optionT(secondFuture)
result = function3(list, person)
} yield result
action.run.map {
case None => NotFound
case Some(result) => Ok(s"Processed $id")
} recover {
case NonFatal(err) => InternalServerError(s"Error organizing files: ${err.getMessage}")
}

Related

Iterate data source asynchronously in batch and stop while remote return no data in Scala

Let's say we have a fake data source which will return data it holds in batch
class DataSource(size: Int) {
private var s = 0
implicit val g = scala.concurrent.ExecutionContext.global
def getData(): Future[List[Int]] = {
s = s + 1
Future {
Thread.sleep(Random.nextInt(s * 100))
if (s <= size) {
List.fill(100)(s)
} else {
List()
}
}
}
object Test extends App {
val source = new DataSource(100)
implicit val g = scala.concurrent.ExecutionContext.global
def process(v: List[Int]): Unit = {
println(v)
}
def next(f: (List[Int]) => Unit): Unit = {
val fut = source.getData()
fut.onComplete {
case Success(v) => {
f(v)
v match {
case h :: t => next(f)
}
}
}
}
next(process)
Thread.sleep(1000000000)
}
I have mine, the problem here is some portion is more not pure. Ideally, I would like to wrap the Future for each batch into a big future, and the wrapper future success when last batch returned 0 size list? My situation is a little from this post, the next() there is synchronous call while my is also async.
Or is it ever possible to do what I want? Next batch will only be fetched when the previous one is resolved in the end whether to fetch the next batch depends on the size returned?
What's the best way to walk through this type of data sources? Are there any existing Scala frameworks that provide the feature I am looking for? Is play's Iteratee, Enumerator, Enumeratee the right tool? If so, can anyone provide an example on how to use those facilities to implement what I am looking for?
Edit----
With help from chunjef, I had just tried out. And it actually did work out for me. However, there was some small change I made based on his answer.
Source.fromIterator(()=>Iterator.continually(source.getData())).mapAsync(1) (f=>f.filter(_.size > 0))
.via(Flow[List[Int]].takeWhile(_.nonEmpty))
.runForeach(println)
However, can someone give comparison between Akka Stream and Play Iteratee? Does it worth me also try out Iteratee?
Code snip 1:
Source.fromIterator(() => Iterator.continually(ds.getData)) // line 1
.mapAsync(1)(identity) // line 2
.takeWhile(_.nonEmpty) // line 3
.runForeach(println) // line 4
Code snip 2: Assuming the getData depends on some other output of another flow, and I would like to concat it with the below flow. However, it yield too many files open error. Not sure what would cause this error, the mapAsync has been limited to 1 as its throughput if I understood correctly.
Flow[Int].mapConcat[Future[List[Int]]](c => {
Iterator.continually(ds.getData(c)).to[collection.immutable.Iterable]
}).mapAsync(1)(identity).takeWhile(_.nonEmpty).runForeach(println)
The following is one way to achieve the same behavior with Akka Streams, using your DataSource class:
import scala.concurrent.Future
import scala.util.Random
import akka.actor.ActorSystem
import akka.stream._
import akka.stream.scaladsl._
object StreamsExample extends App {
implicit val system = ActorSystem("Sandbox")
implicit val materializer = ActorMaterializer()
val ds = new DataSource(100)
Source.fromIterator(() => Iterator.continually(ds.getData)) // line 1
.mapAsync(1)(identity) // line 2
.takeWhile(_.nonEmpty) // line 3
.runForeach(println) // line 4
}
class DataSource(size: Int) {
...
}
A simplified line-by-line overview:
line 1: Creates a stream source that continually calls ds.getData if there is downstream demand.
line 2: mapAsync is a way to deal with stream elements that are Futures. In this case, the stream elements are of type Future[List[Int]]. The argument 1 is the level of parallelism: we specify 1 here because DataSource internally uses a mutable variable, and a parallelism level greater than one could produce unexpected results. identity is shorthand for x => x, which basically means that for each Future, we pass its result downstream without transforming it.
line 3: Essentially, ds.getData is called as long as the result of the Future is a non-empty List[Int]. If an empty List is encountered, processing is terminated.
line 4: runForeach here takes a function List[Int] => Unit and invokes that function for each stream element.
Ideally, I would like to wrap the Future for each batch into a big future, and the wrapper future success when last batch returned 0 size list?
I think you are looking for a Promise.
You would set up a Promise before you start the first iteration.
This gives you promise.future, a Future that you can then use to follow the completion of everything.
In your onComplete, you add a case _ => promise.success().
Something like
def loopUntilDone(f: (List[Int]) => Unit): Future[Unit] = {
val promise = Promise[Unit]
def next(): Unit = source.getData().onComplete {
case Success(v) =>
f(v)
v match {
case h :: t => next()
case _ => promise.success()
}
case Failure(e) => promise.failure(e)
}
// get going
next(f)
// return the Future for everything
promise.future
}
// future for everything, this is a `Future[Unit]`
// its `onComplete` will be triggered when there is no more data
val everything = loopUntilDone(process)
You are probably looking for a reactive streams library. My personal favorite (and one I'm most familiar with) is Monix. This is how it will work with DataSource unchanged
import scala.concurrent.duration.Duration
import scala.concurrent.Await
import monix.reactive.Observable
import monix.execution.Scheduler.Implicits.global
object Test extends App {
val source = new DataSource(100)
val completed = // <- this is Future[Unit], completes when foreach is done
Observable.repeat(Observable.fromFuture(source.getData()))
.flatten // <- Here it's Observable[List[Int]], it has collection-like methods
.takeWhile(_.nonEmpty)
.foreach(println)
Await.result(completed, Duration.Inf)
}
I just figured out that by using flatMapConcat can achieve what I wanted to achieve. There is no point to start another question as I have had the answer already. Put my sample code here just in case someone is looking for similar answer.
This type of API is very common for some integration between traditional Enterprise applications. The DataSource is to mock the API while the object App is to demonstrate how the client code can utilize Akka Stream to consume the APIs.
In my small project the API was provided in SOAP, and I used scalaxb to transform the SOAP to Scala async style. And with the client calls demonstrated in the object App, we can consume the API with AKKA Stream. Thanks for all for the help.
class DataSource(size: Int) {
private var transactionId: Long = 0
private val transactionCursorMap: mutable.HashMap[TransactionId, Set[ReadCursorId]] = mutable.HashMap.empty
private val cursorIteratorMap: mutable.HashMap[ReadCursorId, Iterator[List[Int]]] = mutable.HashMap.empty
implicit val g = scala.concurrent.ExecutionContext.global
case class TransactionId(id: Long)
case class ReadCursorId(id: Long)
def startTransaction(): Future[TransactionId] = {
Future {
synchronized {
transactionId += transactionId
}
val t = TransactionId(transactionId)
transactionCursorMap.update(t, Set(ReadCursorId(0)))
t
}
}
def createCursorId(t: TransactionId): ReadCursorId = {
synchronized {
val c = transactionCursorMap.getOrElseUpdate(t, Set(ReadCursorId(0)))
val currentId = c.foldLeft(0l) { (acc, a) => acc.max(a.id) }
val cId = ReadCursorId(currentId + 1)
transactionCursorMap.update(t, c + cId)
cursorIteratorMap.put(cId, createIterator)
cId
}
}
def createIterator(): Iterator[List[Int]] = {
(for {i <- 1 to 100} yield List.fill(100)(i)).toIterator
}
def startRead(t: TransactionId): Future[ReadCursorId] = {
Future {
createCursorId(t)
}
}
def getData(cursorId: ReadCursorId): Future[List[Int]] = {
synchronized {
Future {
Thread.sleep(Random.nextInt(100))
cursorIteratorMap.get(cursorId) match {
case Some(i) => i.next()
case _ => List()
}
}
}
}
}
object Test extends App {
val source = new DataSource(10)
implicit val system = ActorSystem("Sandbox")
implicit val materializer = ActorMaterializer()
implicit val g = scala.concurrent.ExecutionContext.global
//
// def process(v: List[Int]): Unit = {
// println(v)
// }
//
// def next(f: (List[Int]) => Unit): Unit = {
// val fut = source.getData()
// fut.onComplete {
// case Success(v) => {
// f(v)
// v match {
//
// case h :: t => next(f)
//
// }
// }
//
// }
//
// }
//
// next(process)
//
// Thread.sleep(1000000000)
val s = Source.fromFuture(source.startTransaction())
.map { e =>
source.startRead(e)
}
.mapAsync(1)(identity)
.flatMapConcat(
e => {
Source.fromIterator(() => Iterator.continually(source.getData(e)))
})
.mapAsync(5)(identity)
.via(Flow[List[Int]].takeWhile(_.nonEmpty))
.runForeach(println)
/*
val done = Source.fromIterator(() => Iterator.continually(source.getData())).mapAsync(1)(identity)
.via(Flow[List[Int]].takeWhile(_.nonEmpty))
.runFold(List[List[Int]]()) { (acc, r) =>
// println("=======" + acc + r)
r :: acc
}
done.onSuccess {
case e => {
e.foreach(println)
}
}
done.onComplete(_ => system.terminate())
*/
}

How to get rid of nested future in scala?

I've got some code in my play framework app that parses a JSON request and uses it to update a user's data. The problem is that I need to return a Future[Result], but my userDAO.update function returns a Future[Int] so I have nested futures.
I've resorted to using Await which isn't very good. How can I rewrite this code to avoid the nested future?
def patchCurrentUser() = Action.async { request =>
Future {
request.body.asJson
}.map {
case Some(rawJson) => Json.fromJson[User](rawJson).map { newUser =>
val currentUserId = 1
logger.info(s"Retrieving users own profile for user ID $currentUserId")
val futureResult: Future[Result] = userDAO.findById(currentUserId).flatMap {
case Some(currentUser) =>
val mergedUser = currentUser.copy(
firstName = newUser.firstName // ... and the other fields
)
userDAO.update(mergedUser).map(_ => Ok("OK"))
case _ => Future { Status(404) }
}
import scala.concurrent.duration._
// this is bad. How can I get rid of this?
Await.result(futureResult, 1 seconds)
}.getOrElse(Status(400))
case _ => Status(400)
}
}
Update:
Sod's law: Just after posting this I worked it out:
Future {
request.body.asJson
}.flatMap {
case Some(rawJson) => Json.fromJson[User](rawJson).map { newUser =>
val currentUserId = 1
userDAO.findById(currentUserId).flatMap {
case Some(currentUser) =>
val updatedUser = currentUser.copy(
firstName = newUser.firstName
)
userDAO.update(updatedUser).map(_ => Ok("OK"))
case _ => Future { Status(404) }
}
}.getOrElse(Future(Status(400)))
case _ => Future(Status(400))
}
But, is there a more elegant way? It seems like I'm peppering Future() around quite liberally which seems like a code smell.
Use flatMap instead of map.
flatMap[A, B](f: A => Future[B])
map[A, B](f: A => B)
More elegant way is to use for comprehension
Using For comprehension Code looks like this
for {
jsonOpt <- Future (request.body.asJson)
result <- jsonOpt match {
case Some(json) =>
json.validate[User] match {
case JsSuccess(newUser, _ ) =>
for {
currentUser <- userDAO.findById(1)
_ <- userDAO.update(currentUser.copy(firstName = newUser.firstName))
} yield Ok("ok")
case JsError(_) => Future(Status(400))
}
case None => Future(Status(400))
}
} yield result
As #pamu said it might clear your code a bit if you would use a for comprehension.
Another interesting approach (and more pure in terms of Functional Programming) would be to use monad transformers (normally a type similar to Future[Option[T]] screams monad transformer).
You should take a look at libraries like cats (and or scalaz). I'll try to give a small "pseudo code" example using cats (because I don't have play framework locally):
import cats.data.OptionT
import cats.instances.future._
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
def convertJsonToUser(json: Json): Future[Option[User]] = Json.fromJson[User](json)
def convertBodyToJson(request: Request): Future[Option[Json]] = Future {request.body.asJson}
def updateUser(user: User): Future[HttpResult] = Future {
// update user
Ok("ok")
}
def myFunction: Future[HttpResult] = {
val resultOpt: OptionT[Future, HttpResult] = for {
json <- OptionT(convertBodyToJson(request))
user <- OptionT(convertJsonToUser(json))
result <- OptionT.lift(updateUser(user))
} yield result
result.getOrElseF(Future {Status(400)})
}
As you can see, in this case the monad transformers allow to treat a type like Future[Option[T]] as a single "short-circuiting" type (e.g. the for comprehension will stop if you have either a failed future, or a future containing a None).

How to spawn an unknown amount of Futures and combine the result even if one or more failed?

I want to turn the following sequential code into concurrent code with Futures and need advice on how to structure it.
sequential:
import java.net.URL
val providers = List(
new URL("http://www.cnn.com"),
new URL("http://www.bbc.co.uk"),
new URL("http://www.othersite.com")
)
def download(urls: URL*) = urls.flatMap(url => io.Source.fromURL(url).getLines).distinct
val res = download(providers:_*)
I want to download all sources that are coming in via the varargs of the download method and combine the results into one Seq/List/Set, whatever, together. When one Future failed, let's say because the server is unreachable, it should take all others and move on and return the result nonetheless. firstCompletedOf won't work because I need the results of all, except one failed due to error. I thought about using Future.sequence like below but I can't get it to work. Here is what I tried...
def download(urls: URL*) = Future.sequence {
urls.map { url =>
Future {
io.Source.fromURL(url).getLines
}
}
}
This produces a Seq[Future[Iterator[String]]] which is not compatible with M_[Future[A_]].
A Future[Iterator[String]] is what I want. (I thought I return an Iterator because I need to reuse it later on with reset method on Iterator.)
You can use parallel collections:
import java.net.URL
val providers = List(
new URL("http://www.cnn.com"),
new URL("http://www.bbc.co.uk"),
new URL("http://www.othersite.com")
)
def download(urls: URL*) = urls.par.flatMap(url => {
Try {
io.Source.fromURL(url).getLines
} match {
case Success(e) => e
case Failure(_) => Seq()
}
}).toSeq
val res: Seq[String] = download(providers:_*)
Or if you want the non blocking version with a Future:
def download(urls: URL*) = Future {
blocking {
urls.par.flatMap(url => {
Try {
io.Source.fromURL(url).getLines
} match {
case Success(e) => e
case Failure(_) => Seq()
}
})
}
}
val res: Future[Seq[String]] = download(providers:_*)

Scala Future error handling with Either

I am writing a wrapper for an API and I want to do error handling for applications problems. Each request returns a Future so in order to do this I see 2 options: using a Future[Either] or using exceptions to fail the future immediately.
Here is a snippet with both situations, response is a future with the return of the HTTP request:
def handleRequestEither: Future[Either[String, String]] = {
response.map {
case "good_string" => Right("Success")
case _ => Left("Failed")
}
}
def handleRequest: Future[String] = {
response.map {
case "good_string" => "Success"
case _ => throw new Exception("Failed")
}
}
And here is the snippet to get the result in both cases:
handleRequestEither.onComplete {
case Success(res) =>
res match {
case Right(rightRes) => println(s"Success $res")
case Left(leftRes) => println(s"Failure $res")
}
case Failure(ex) =>
println(s"Failure $ex")
}
handleRequest.onComplete {
case Success(res) => println(s"Success $res")
case Failure(ex) => println(s"Failure $ex")
}
I don't like to use exceptions, but using Future[Either] makes it much more verbose to get the response afterwards, and if I want to map the result into another object it gets even more complicated. Is this the way to go, or are there better alternatives?
Let me paraphrase Erik Meijer and consider the following table:
Consider then this two features of a language construct: arity (does it aggregate one or many items?) and mode (synchronous when blocking read operations until ready or asynchronous when not).
All of this imply that Try constructs and blocks manage the success or failure of the block generating the result synchronously. You'll control whether your resources provides the right answer without encountering problems (those described by exceptions).
On the other hand a Future is a kind of asynchronous Try. That means that it successfully completes when no problems (exceptions) has been found then notifying its subscribers. Hence, I don't think you should have a future of Either in this case, that is your second handleRequest implementation is the right way of using futures.
Finally, if what disturbs you is throwing an exception, you could follow the approach of Promises:
def handleRequest: Future[String] = {
val p = Promise[String]
response.map {
case "good_string" => p.success("Success")
case _ => p.failure(new Exception("Failed"))
}
p.future
}
Or:
case class Reason(msg: String) extends Exception
def handleRequest: Future[String] = {
val p = Promise[String]
response.map {
case "good_string" => p.success("Success")
case _ => p.failure(Reason("Invalid response"))
}
p.future
}
I'd rather use your second approach.
You could use special type for that: EitherT from the scalaz library.
It works with scalaz enhanced version of Either : \/
It could transform combination of any monad and \/ into a single monad. So using scalaz instances for scala.concurent.Future you could achieve the desired mix. And you could go further with monad transformers if you wish. Read this beautiful blog if you're interested.
Here not prettified but working with scalaz 7.1 example for you:
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, Future}
import scalaz._
import scalaz.std.scalaFuture._
import EitherT._
import scala.concurrent.ExecutionContext.Implicits.global
object EitherFuture {
type ETFS[X] = EitherT[Future, String, X]
val IntResponse = "result (\\d+)".r
def parse(response: Future[String]) =
eitherT(response map {
case IntResponse(num) ⇒ \/-(num.toInt)
case _ ⇒ -\/("bad response")
})
def divideBy2(x: Validation[String, Int]) =
x.ensure("non divisible by 2")(_ % 2 == 0).map(_ / 2)
def handleResponse(response: Future[String]) = for {
num ← parse(response).validationed(divideBy2)
} yield s"half is $num"
def main(args: Array[String]) {
Map(
'good → "result 10",
'proper → "result 11",
'bad → "bad_string"
) foreach { case (key, str) ⇒
val response = Future(str)
val handled = handleResponse(response)
val result = Await.result(handled.run, Duration.Inf)
println(s"for $key response we have $result")
}
}
}

waiting for "recursive" futures in scala

a simple code sample that describes my problem:
import scala.util._
import scala.concurrent._
import scala.concurrent.duration._
import ExecutionContext.Implicits.global
class LoserException(msg: String, dice: Int) extends Exception(msg) { def diceRoll: Int = dice }
def aPlayThatMayFail: Future[Int] = {
Thread.sleep(1000) //throwing a dice takes some time...
//throw a dice:
(1 + Random.nextInt(6)) match {
case 6 => Future.successful(6) //I win!
case i: Int => Future.failed(new LoserException("I did not get 6...", i))
}
}
def win(prefix: String): String = {
val futureGameLog = aPlayThatMayFail
futureGameLog.onComplete(t => t match {
case Success(diceRoll) => "%s, and finally, I won! I rolled %d !!!".format(prefix, diceRoll)
case Failure(e) => e match {
case ex: LoserException => win("%s, and then i got %d".format(prefix, ex.diceRoll))
case _: Throwable => "%s, and then somebody cheated!!!".format(prefix)
}
})
"I want to do something like futureGameLog.waitForRecursiveResult, using Await.result or something like that..."
}
win("I started playing the dice")
this simple example illustrates what i want to do. basically, if to put it in words, i want to wait for a result for some computation, when i compose different actions on previous success or failed attampts.
so how would you implement the win method?
my "real world" problem, if it makes any difference, is using dispatch for asynchronous http calls, where i want to keep making http calls whenever the previous one ends, but actions differ on wether the previous http call succeeded or not.
You can recover your failed future with a recursive call:
def foo(x: Int) = x match {
case 10 => Future.successful(x)
case _ => Future.failed[Int](new Exception)
}
def bar(x: Int): Future[Int] = {
foo(x) recoverWith { case _ => bar(x+1) }
}
scala> bar(0)
res0: scala.concurrent.Future[Int] = scala.concurrent.impl.Promise$DefaultPromise#64d6601
scala> res0.value
res1: Option[scala.util.Try[Int]] = Some(Success(10))
recoverWith takes a PartialFunction[Throwable,scala.concurrent.Future[A]] and returns a Future[A]. You should be careful though, because it will use quite some memory when it does lots of recursive calls here.
As drexin answered the part about exception handling and recovering, let me try and answer the part about a recursive function involving futures. I believe using a Promise will help you achieve your goal. The restructured code would look like this:
def win(prefix: String): String = {
val prom = Promise[String]()
def doWin(p:String) {
val futureGameLog = aPlayThatMayFail
futureGameLog.onComplete(t => t match {
case Success(diceRoll) => prom.success("%s, and finally, I won! I rolled %d !!!".format(prefix, diceRoll))
case Failure(e) => e match {
case ex: LoserException => doWin("%s, and then i got %d".format(prefix, ex.diceRoll))
case other => prom.failure(new Exception("%s, and then somebody cheated!!!".format(prefix)))
}
})
}
doWin(prefix)
Await.result(prom.future, someTimeout)
}
Now this won't be true recursion in the sense that it will be building up one long stack due to the fact that the futures are async, but it is similar to recursion in spirit. Using the promise here gives you something to block against while the recursion does it's thing, blocking the caller from what's happening behind the scene.
Now, if I was doing this, I would probable redefine things like so:
def win(prefix: String): Future[String] = {
val prom = Promise[String]()
def doWin(p:String) {
val futureGameLog = aPlayThatMayFail
futureGameLog.onComplete(t => t match {
case Success(diceRoll) => prom.success("%s, and finally, I won! I rolled %d !!!".format(prefix, diceRoll))
case Failure(e) => e match {
case ex: LoserException => doWin("%s, and then i got %d".format(prefix, ex.diceRoll))
case other => prom.failure(new Exception("%s, and then somebody cheated!!!".format(prefix)))
}
})
}
doWin(prefix)
prom.future
}
This way you can defer the decision on whether to block or use async callbacks to the caller of this function. This is more flexible, but it also exposes the caller to the fact that you are doing async computations and I'm not sure that is going to be acceptable for your scenario. I'll leave that decision up to you.
This works for me:
def retryWithFuture[T](f: => Future[T],retries:Int, delay:FiniteDuration) (implicit ec: ExecutionContext, s: Scheduler): Future[T] ={
f.recoverWith { case _ if retries > 0 => after[T](delay,s)(retryWithFuture[T]( f , retries - 1 , delay)) }
}