Scala Future flatMap implementation (chaining)

Trying to wrap my head around how async tasks are chained together, for example Futures and flatMap:
val a: Future[Int] = Future { 123 }
val b: Future[Int] = a.flatMap(a => Future { a + 321 })
How would I implement something similar, where I build a new Future by waiting for the first one and applying a function f to its result? I tried looking at Scala's source code but got stuck here: https://github.com/scala/scala/blob/2.13.x/src/library/scala/concurrent/Future.scala#L216
I would imagine some code like:
def myFlatMap(a: Future[Int])(f: Int => Future[Int]): Future[Int] = {
  Future {
    // somehow wait for 'a' to complete and then apply 'f'
    a.onComplete {
      case Success(x) => f(x)
    }
  }
}
but I guess the above will return a Future[Unit] instead. I would just like to understand the "pattern" of how async tasks are chained together.

(Disclaimer: I'm the main maintainer of Scala Futures)
You wouldn't want to implement flatMap/recoverWith/transformWith yourself; it is a very complex feature to implement safely (stack-safe, memory-safe, and concurrency-safe).
You can see what I am talking about here:
https://github.com/scala/scala/blob/2.13.x/src/library/scala/concurrent/impl/Promise.scala#L442
https://github.com/scala/scala/blob/2.13.x/src/library/scala/concurrent/impl/Promise.scala#L307
https://github.com/scala/scala/blob/2.13.x/src/library/scala/concurrent/impl/Promise.scala#L274
There is more reading on the topic here: https://viktorklang.com/blog/Futures-in-Scala-2.12-part-9.html
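To illustrate just the "pattern" the question asks about (and none of the guarantees above): chaining is built on a Promise. You create a Promise for the result, register a callback on the first Future, apply f when it completes, and forward the result of the Future that f returns into the Promise. A minimal, non-production sketch of that idea, with made-up names and no stack-safety or memory-safety:
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.util.control.NonFatal
import scala.util.{Failure, Success}

// minimal sketch only; the real implementation linked above is far more careful
def myFlatMap[A, B](fa: Future[A])(f: A => Future[B])(implicit ec: ExecutionContext): Future[B] = {
  val p = Promise[B]()
  fa.onComplete {
    case Success(a) =>
      // apply f, then forward whatever its Future produces into the promise
      try p.completeWith(f(a))
      catch { case NonFatal(t) => p.failure(t) }
    case Failure(t) =>
      p.failure(t)
  }
  p.future
}
With this, myFlatMap(a)(x => Future(x + 321)) behaves like a.flatMap(...) in the simple cases; what the standard library adds on top is exactly the stack-safety, memory-safety, and concurrency-safety discussed above.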

I found that looking at the source for Scala 2.12 was easier for me to understand how Future / Promise is implemented. I learned that a Future is in fact backed by a Promise, and by looking at the implementation I now understand how "chaining" async tasks can be implemented.
I'm posting the code that I wrote to help me better understand what was going on under the hood.
import java.util.concurrent.{ScheduledThreadPoolExecutor, TimeUnit}
import scala.collection.mutable.ListBuffer

trait MyAsyncTask[T] {
  type Callback = T => Unit

  var result: Option[T]

  // register a new callback to be called when the task is completed.
  def onComplete(callback: Callback): Unit

  // complete the given task, and invoke all registered callbacks.
  def complete(value: T): Unit

  // create a new task that waits for the current task to complete, applies
  // the provided function, and completes the new task with its result.
  def flatMap[B](f: T => MyAsyncTask[B]): MyAsyncTask[B] = {
    val wrapper = new DefaultAsyncTask[B]()
    onComplete { x =>
      f(x).onComplete { y =>
        wrapper.complete(y)
      }
    }
    wrapper
  }

  def map[B](f: T => B): MyAsyncTask[B] = flatMap(x => CompletedAsyncTask(f(x)))
}
/**
 * task with a fixed, pre-calculated result.
 */
case class CompletedAsyncTask[T](value: T) extends MyAsyncTask[T] {
  override var result: Option[T] = Some(value)

  override def onComplete(callback: Callback): Unit = {
    // we already have the result, just call the callback.
    callback(value)
  }

  override def complete(value: T): Unit = () // noop, nothing to complete.
}
class DefaultAsyncTask[T] extends MyAsyncTask[T] {
  override var result: Option[T] = None
  var isCompleted = false
  val listeners = new ListBuffer[Callback]()

  /**
   * register a callback, to be called when the task is completed.
   */
  override def onComplete(callback: Callback): Unit = {
    if (isCompleted) {
      // already completed, just invoke the callback
      callback(result.get)
    } else {
      // add the listener
      listeners.addOne(callback)
    }
  }

  /**
   * trigger all callbacks registered via `onComplete`
   */
  override def complete(value: T): Unit = {
    result = Some(value)
    isCompleted = true // mark as completed so callbacks registered later fire immediately
    listeners.foreach { listener =>
      listener(value)
    }
  }
}
object MyAsyncTask {
  def apply[T](body: (T => Unit) => Unit): MyAsyncTask[T] = {
    val task = new DefaultAsyncTask[T]()
    // pass in `complete` as the callback.
    body(task.complete)
    task
  }
}
object MyAsyncFlatMap {
  /**
   * helper to simulate an async task.
   */
  def delayedCall(body: => Unit, delay: Int): Unit = {
    val scheduler = new ScheduledThreadPoolExecutor(1)
    val run = new Runnable {
      override def run(): Unit = {
        body
      }
    }
    scheduler.schedule(run, delay, TimeUnit.SECONDS)
  }

  def main(args: Array[String]): Unit = {
    val getAge = MyAsyncTask[Int] { cb =>
      delayedCall({ cb(66) }, 1)
    }
    val getName = MyAsyncTask[String] { cb =>
      delayedCall({ cb("John") }, 2)
    }

    // same as: getAge.flatMap(age => getName.map(name => (age, name)))
    val result: MyAsyncTask[(Int, String)] = for {
      age <- getAge
      name <- getName
    } yield (age, name)

    result.onComplete {
      case (age, name) => println(s"hello $name $age")
    }
  }
}

Related

Why are Futures in scala realised using a collection in the companion object?

I'm not sure whether I chose the right title for my question...
I'm interested in why the collection in the companion object is needed. Am I mistaken that this collection will only ever have one f in it? What I am seeing is a collection with exactly one element.
Here's the Future I'm dealing with:
trait Future[+T] { self =>
  def onComplete(callback: Try[T] => Unit): Unit

  def map[U](f: T => U) = new Future[U] {
    def onComplete(callback: Try[U] => Unit) =
      self onComplete (t => callback(t.map(f)))
  }

  def flatMap[U](f: T => Future[U]) = new Future[U] {
    def onComplete(callback: Try[U] => Unit) =
      self onComplete { _.map(f) match {
        case Success(fu) => fu.onComplete(callback)
        case Failure(e) => callback(Failure(e))
      } }
  }

  def filter(p: T => Boolean) =
    map { t => if (!p(t)) throw new NoSuchElementException; t }
}
Its companion object:
object Future {
  def apply[T](f: => T) = {
    val handlers = collection.mutable.Buffer.empty[Try[T] => Unit]
    var result: Option[Try[T]] = None

    val runnable = new Runnable {
      def run = {
        val r = Try(f)
        handlers.synchronized {
          result = Some(r)
          handlers.foreach(_(r))
        }
      }
    }
    (new Thread(runnable)).start()

    new Future[T] {
      def onComplete(f: Try[T] => Unit) = handlers.synchronized {
        result match {
          case None => handlers += f
          case Some(r) => f(r)
        }
      }
    }
  }
}
In my head I was imagining something like the following instead of the above companion object (notice how I replaced the val handlers above with a var handler):
object Future {
  def apply[T](f: => T) = {
    var handler: Option[Try[T] => Unit] = None
    var result: Option[Try[T]] = None

    val runnable = new Runnable {
      val execute_when_ready: Try[T] => Unit = r => handler match {
        case None => execute_when_ready(r)
        case Some(f) => f(r)
      }
      def run = {
        val r = Try(f)
        handler.synchronized {
          result = Some(r)
          execute_when_ready(r)
        }
      }
    }
    (new Thread(runnable)).start()

    new Future[T] {
      def onComplete(f: Try[T] => Unit) = handler.synchronized {
        result match {
          case None => handler = Some(f)
          case Some(r) => f(r)
        }
      }
    }
  }
}
So why does the function execute_when_ready lead to a stack overflow, while that's not the case with handlers.foreach? What does the collection offer me that I can't do without it? And is it possible to replace the collection with something else in the companion object?
The collection is not in the companion object; it is in the apply method, so there is a new instance for each Future. It is there because there can be multiple pending onComplete handlers on the same Future.
Your implementation only allows a single handler and silently removes any existing handler in onComplete, which is a bad idea because the caller has no idea whether a previous function has added an onComplete handler or not.
As noted in the comments, the stack overflow is because execute_when_ready calls itself if handler is None with no mechanism to stop the recursion.
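If you did want to keep a single slot rather than a collection, it would have to compose callbacks instead of replacing them, roughly like the following sketch (names and the Int specialization are only for illustration; runnable in the REPL):
import scala.util.Try

// hypothetical single-slot variant: compose callbacks instead of overwriting them
var handler: Option[Try[Int] => Unit] = None

def onComplete(f: Try[Int] => Unit): Unit = {
  val previous = handler
  handler = Some { r =>
    previous.foreach(_(r)) // run whatever was registered before
    f(r)                   // then run the new callback
  }
}

onComplete(r => println(s"first: $r"))
onComplete(r => println(s"second: $r"))
handler.foreach(_(Try(42))) // prints both lines
Note that this still stores every handler, just as nested closures instead of buffer elements, which is why a plain collection is the usual choice.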

How to implement multiple thread pools in a Play application using cats-effect IO

In my Play application, I service my requests using cats-effect's IO instead of Future in the controller, like this (super-simplified):
def handleServiceResult(serviceResult: ServiceResult): Result = ...
def serviceMyRequest(request: Request): IO[ServiceResult] = ...
def myAction = Action { request =>
  handleServiceResult(
    serviceMyRequest(request).unsafeRunSync()
  )
}
Requests are then processed (asynchronously) on Play's default thread pool. Now, I want to implement multiple thread pools to handle different sorts of requests. Were I using Futures, I could do this:
val myCustomExecutionContext: ExecutionContext = ...
def serviceMyRequest(request: Request): Future[ServiceResult] = ...
def myAction = Action.async { request =>
  Future(serviceMyRequest(request))(myCustomExecutionContext)
    .map(handleServiceResult)(defaultExecutionContext)
}
But I'm not using Futures, I'm using IO, and I'm not sure about the right way to go about implementing it. This looks promising, but seems a bit clunky:
def serviceMyRequest(request: Request): IO[ServiceResult] = ...
def myAction = Action { request =>
  val ioServiceResult = for {
    _ <- IO.shift(myCustomExecutionContext)
    serviceResult <- serviceMyRequest(request)
    _ <- IO.shift(defaultExecutionContext)
  } yield {
    serviceResult
  }
  handleServiceResult(ioServiceResult.unsafeRunSync())
}
Is this the right way to implement it? Is there a best practice here? Am I screwing up badly? Thanks.
Ok, so since this doesn't seem to be well-trodden ground, this is what I ended up implementing:
trait PlayIO { self: BaseControllerHelpers =>

  implicit class IOActionBuilder[A](actionBuilder: ActionBuilder[Request, A]) {

    def io(block: Request[A] => IO[Result]): Action[A] = {
      actionBuilder.apply(block.andThen(_.unsafeRunSync()))
    }

    def io(executionContext: ExecutionContext)(block: Request[A] => IO[Result]): Action[A] = {
      val shiftedBlock = block.andThen(IO.shift(executionContext) *> _ <* IO.shift(defaultExecutionContext))
      actionBuilder.apply(shiftedBlock.andThen(_.unsafeRunSync()))
    }
  }
}
Then (using the framework from the question) if I mix PlayIO into the controller, I can do this,
val myCustomExecutionContext: ExecutionContext = ...
def handleServiceResult(serviceResult: ServiceResult): Result = ...
def serviceMyRequest(request: Request): IO[ServiceResult] = ...
def myAction = Action.io(myCustomExecutionContext) { request =>
  serviceMyRequest(request).map(handleServiceResult)
}
such that I execute the action's code block on myCustomExecutionContext and then, once complete, thread-shift back to Play's default execution context.
Update:
This is a bit more flexible:
trait PlayIO { self: BaseControllerHelpers =>

  implicit class IOActionBuilder[R[_], A](actionBuilder: ActionBuilder[R, A]) {

    def io(block: R[A] => IO[Result]): Action[A] = {
      actionBuilder.apply(block.andThen(_.unsafeRunSync()))
    }

    def io(executionContext: ExecutionContext)(block: R[A] => IO[Result]): Action[A] = {
      if (executionContext == defaultExecutionContext) io(block) else {
        val shiftedBlock = block.andThen(IO.shift(executionContext) *> _ <* IO.shift(defaultExecutionContext))
        io(shiftedBlock)
      }
    }
  }
}
Update2:
Per the comment above, this will ensure we always shift back to the default thread pool:
trait PlayIO { self: BaseControllerHelpers =>

  implicit class IOActionBuilder[R[_], A](actionBuilder: ActionBuilder[R, A]) {

    def io(block: R[A] => IO[Result]): Action[A] = {
      actionBuilder.apply(block.andThen(_.unsafeRunSync()))
    }

    def io(executionContext: ExecutionContext)(block: R[A] => IO[Result]): Action[A] = {
      if (executionContext == defaultExecutionContext) io(block) else {
        val shiftedBlock = block.andThen { ioResult =>
          IO.shift(executionContext).bracket(_ => ioResult)(_ => IO.shift(defaultExecutionContext))
        }
        io(shiftedBlock)
      }
    }
  }
}
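As a side note, IO.shift is cats-effect 2 API; it no longer exists in cats-effect 3, where the same "run on a dedicated pool, then return to the default" behaviour is expressed with evalOn, which restores the previous pool automatically. A rough, untested sketch of the core idea for newer versions:
import cats.effect.IO
import cats.effect.unsafe.implicits.global // provides the IORuntime needed by unsafeRunSync
import scala.concurrent.ExecutionContext

// hypothetical cats-effect 3 counterpart of the shift/bracket pattern above
def runOn[A](ec: ExecutionContext)(io: IO[A]): A =
  io.evalOn(ec).unsafeRunSync() // evalOn shifts back to the previous pool afterwards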

IO and Future[Option] monad transformers

I'm trying to figure out how to write this piece of code in an elegant pure-functional style using scalaz7 IO and monad transformers but just can't get my head around it.
Just imagine I have this simple API:
def findUuid(request: Request): Option[String] = ???
def findProfile(uuid: String): Future[Option[Profile]] = redisClient.get[Profile](uuid)
Using this API I can easily write an impure function with the OptionT transformer like this:
val profileT = for {
  uuid <- OptionT(Future.successful(findUuid(request)))
  profile <- OptionT(findProfile(uuid))
} yield profile

val profile: Future[Option[Profile]] = profileT.run
As you have noticed, this function contains findProfile() with a side effect. I want to isolate this effect inside the IO monad and interpret it outside of the pure function, but I don't know how to combine it all together.
def findProfileIO(uuid: String): IO[Future[Option[Profile]]] = IO(findProfile(uuid))

val profileT = for {
  uuid <- OptionT(Future.successful(findUuid(request)))
  profile <- OptionT(findProfileIO(uuid)) // ??? how to put the Option inside of the IO[Future[Option]]
} yield profile

val profile = profileT.run // how to run the transformer and interpret the IO with unsafePerformIO()???
Any pieces of advice on how it might be done?
IO is meant more for synchronous effects. Task is more what you want!
See this question and answer: What's the difference between Task and IO in Scalaz?
You can convert your Future to Task and then have an API like this:
def findUuid(request: Request): Option[String] = ???
def findProfile(uuid: String): Task[Option[Profile]] = ???
This works because Task can represent both synchronous and asynchronous operations, so findUuid can also be wrapped in Task instead of IO.
Then you can wrap these in OptionT:
val profileT = for {
  uuid <- OptionT(Task.now(findUuid(request)))
  profile <- OptionT(findProfile(uuid))
} yield profile
Then at the end somewhere you can run it:
profileT.run.attemptRun
Check out this link for converting Futures to Tasks and vice versa: Scalaz Task <-> Future
I ended up with this piece of code; thought it might be useful for someone (Play 2.6).
The controller's method is a pure function, since the Task evaluation takes place outside of the controller, inside the PureAction ActionBuilder. Thanks to Luka's answer!
Still struggling with the new paradigm of Action composition in Play 2.6, but that's another story.
FrontendController.scala:
def index = PureAction.pure { request =>
  val profileOpt = (for {
    uuid <- OptionT(Task.now(request.cookies.get("uuid").map(t => uuidKey(t.value))))
    profile <- OptionT(redis.get[Profile](uuid).asTask)
  } yield profile).run

  profileOpt.map { profileOpt =>
    Logger.info(profileOpt.map(p => s"User logged in - $p").getOrElse("New user, suggesting login"))
    Ok(views.html.index(profileOpt))
  }
}
Actions.scala
Convenient action with Task resolution at the end
class PureAction @Inject()(parser: BodyParsers.Default)(implicit ec: ExecutionContext) extends ActionBuilderImpl(parser) {
  self =>

  def pure(block: Request[AnyContent] => Task[Result]): Action[AnyContent] = composeAction(new Action[AnyContent] {
    override def parser: BodyParser[AnyContent] = self.parser
    override def executionContext: ExecutionContext = self.ec
    override def apply(request: Request[AnyContent]): Future[Result] = {
      val taskResult = block(request)
      taskResult.asFuture // End of the world lives here
    }
  })
}
Converters.scala
Task->Future and Future->Task implicit converters
implicit class FuturePimped[+T](root: => Future[T]) {
  import scalaz.Scalaz._

  def asTask(implicit ec: ExecutionContext): Task[T] = {
    Task.async { register =>
      root.onComplete {
        case Success(v) => register(v.right)
        case Failure(ex) => register(ex.left)
      }
    }
  }
}
implicit class TaskPimped[T](root: => Task[T]) {
  import scalaz._

  def asFuture: Future[T] = {
    // create the Promise per call, so repeated asFuture calls don't reuse an already-completed promise
    val p: Promise[T] = Promise()
    root.unsafePerformAsync {
      case -\/(ex) => p.failure(ex); ()
      case \/-(r) => p.success(r); ()
    }
    p.future
  }
}
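A small usage sketch of the two converters, assuming the implicit classes above and an implicit ExecutionContext are in scope:
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
import scalaz.concurrent.Task

// round-trip: Future -> Task -> Future
val task: Task[Int] = Future(41).asTask.map(_ + 1)
val future: Future[Int] = task.asFuture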

How to add a time based watcher to a Scala Future?

I want to add an after(d: FiniteDuration)(callback: => Unit) util to Scala Futures that would enable me to do this:
val f = Future(someTask)
f.after(30.seconds) {
  println("f has not completed in 30 seconds!")
}

f.after(60.seconds) {
  println("f has not completed in 60 seconds!")
}
How can I do this?
Usually I use a thread pool executor and promises:
import java.util.concurrent.{Executors, ScheduledThreadPoolExecutor, ThreadPoolExecutor}
import scala.concurrent.duration._
import scala.concurrent.{Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.Try

val f: Future[Int] = ???

val executor = new ScheduledThreadPoolExecutor(2, Executors.defaultThreadFactory(), new ThreadPoolExecutor.AbortPolicy)

def withDelay[T](operation: => T)(by: FiniteDuration): Future[T] = {
  val promise = Promise[T]()
  executor.schedule(new Runnable {
    override def run() = {
      promise.complete(Try(operation))
    }
  }, by.length, by.unit)
  promise.future
}

Future.firstCompletedOf(Seq(f, withDelay(println("still going"))(30.seconds)))
Future.firstCompletedOf(Seq(f, withDelay(println("still still going"))(60.seconds)))
One way is to use Future.firstCompletedOf (see this blogpost):
val timeoutFuture = Future { Thread.sleep(500); throw new TimeoutException }
val withTimeout = Future.firstCompletedOf(List(f, timeoutFuture))
withTimeout.recover { case e: TimeoutException => println("f has not completed in 0.5 seconds!") }
where TimeoutException is some exception or type.
Use import akka.pattern.after. If you want to implement it without akka here is the source code. The other (java) example is TimeoutFuture in com.google.common.util.concurrent.
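For illustration, a hedged sketch of the akka.pattern.after approach, assuming an ActorSystem is available (the classic signature is after(duration, scheduler)(value)):
import akka.actor.ActorSystem
import akka.pattern.after
import scala.concurrent.Future
import scala.concurrent.duration._

implicit val system: ActorSystem = ActorSystem("timers")
import system.dispatcher

val f: Future[Int] = Future { /* some long-running task */ 42 }

// fires after 30 seconds; only reports if f still hasn't completed by then
after(30.seconds, system.scheduler)(Future.successful(())).foreach { _ =>
  if (!f.isCompleted) println("f has not completed in 30 seconds!")
}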
Something like this, perhaps:
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object PimpMyFuture {
  implicit class PimpedFuture[T](val f: Future[T]) extends AnyVal {
    def after(delay: FiniteDuration)(callback: => Unit): Future[T] = {
      // block a secondary future on f; if the wait times out, run the callback
      Future {
        blocking { Await.ready(f, delay) }
      } recover { case _: TimeoutException => callback }
      f
    }
  }
}

import PimpMyFuture._

Future { Thread.sleep(10000); println("Done") }
  .after(5.seconds) { println("Still going") }
This implementation is simple, but it basically doubles the number of threads you need - each active future effectively occupies two threads - which is a bit wasteful. Alternatively, you could use scheduled tasks to make your waits non-blocking. I don't know of a "standard" scheduler in Scala (each lib has its own), but for a simple task like this you can use Java's TimerTask directly:
object PimpMyFutureNonBlocking {
  val timer = new java.util.Timer

  implicit class PimpedFuture[T](val f: Future[T]) extends AnyVal {
    def after(delay: FiniteDuration)(callback: => Unit): Future[T] = {
      val task = new java.util.TimerTask {
        def run(): Unit = if (!f.isCompleted) callback
      }
      timer.schedule(task, delay.toMillis)
      f.onComplete { _ => task.cancel() }
      f
    }
  }
}
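Usage is the same as with the blocking variant above, for example:
import PimpMyFutureNonBlocking._

Future { Thread.sleep(10000); println("Done") }
  .after(5.seconds) { println("Still going") }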

Tackling recursion while using Scala Futures

My working code involves recursion. I would like to avoid this, because the recursion could get too deep if a predicate doesn't hold.
case class Chan[T]() {
  private var promise: Promise[T] = Promise[T]()

  /** Binds a handler to the "write", i.e. update(). */
  def this(handler: (T => Unit) => Unit) {
    this()
    handler(update)
  }

  def filter(p: (T) => Boolean): Future[T] = {
    apply().flatMap(value => if (p(value)) Future(value) else filter(p))
  }

  def apply(): Future[T] = {
    promise = Promise[T]()
    promise.future
  }

  def update(t: T): Unit = {
    if (!promise.isCompleted) promise.success(t)
  }
}
The problem lies in the filter method. When the predicate is not met, filter will be called again, hunting for an event that satisfies the predicate. This fills up the stack, heading for a StackOverflowError :-).
How can the code be refactored, into a loop or tail-recursive calls, to avoid excessive stack usage?
So here's the solution as suggested above.
I hope that fits your purpose:
case class Chan[T]() {
  private var pendingFilters: List[(T => Boolean, Promise[T])] = List.empty

  /** Binds a handler to the "write", i.e. update(). */
  def this(handler: (T => Unit) => Unit) {
    this()
    handler(update)
  }

  def filter(p: (T) => Boolean): Future[T] = {
    val promise = Promise[T]()
    // add both the predicate and the promise to your pending filters
    synchronized(pendingFilters = p -> promise :: pendingFilters)
    promise.future
  }

  def update(t: T): Unit = {
    synchronized {
      // partition your pending filters into completed and uncompleted ones
      val (completed, pending) = pendingFilters.partition(_._1(t))
      pendingFilters = pending
      completed
    }.foreach { case (_, promise) =>
      // and finally complete the promises
      promise.trySuccess(t)
    }
  }
}
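A small usage sketch of the reworked Chan (the names below are just for illustration; an implicit ExecutionContext is assumed for the foreach callback):
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future

val chan = Chan[Int]()
val firstEven: Future[Int] = chan.filter(_ % 2 == 0)
firstEven.foreach(n => println(s"got an even value: $n"))

chan.update(3) // no pending predicate matches, nothing completes
chan.update(4) // completes firstEven with 4, without any recursion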