How do I compose a list of `Futures`? - scala

I have a collection of Promises. I was looking for an effective way to compose the resulting Futures. Currently the cleanest way i found to combine them was to use a scalaz Monoid, like so:
val promises = List(Promise[Unit], Promise[Unit], Promise[Unit])
promises.map(_.future).reduce(_ |+| _).onSuccess {
case a => println(a)
}
promises.foreach(_.success(()))
Is there a clean way to do this that doesn't require scalaz?
The number of Futures in the collection will vary, and lots of intermediate collections is undesirable.

You could use Future.traverse:
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.{ Future, Promise }
val promises = List(Promise[Unit], Promise[Unit], Promise[Unit])
Future.traverse(promises)(_.future)
This will give you a Future[List[Unit]], which doesn't exactly qualify as "lots of intermediate collections", but isn't necessarily ideal, either. Future.reduce also works:
Future.reduce(promises.map(_.future))((_, _) => ())
This returns a Future[Unit] that will be satisfied when all of the futures are satisfied.

Future.sequence is the method you want.

Related

Reverse of Future.sequence

I know that a Future.sequence call can convert a List[Future[T]] to a Future[List[T]], but what if I want to go the other way around?
I want to convert a Future[List[T]] into a List[Future[T]].
The reason why I want to do this is as follows:
I send a message to an actor which uses Slick 3 to query a database. The Slick query returns list: Future[List[T]]. If I could convert this list to list: List[Future[T]], then I would be able to do:
list.map(convertToMessage).foreach(m => m pipeTo sender())
So basically I want to convert each record extracted from the DB into a message and then send it to a calling actor.
I don't think this is possible, sorry.
A Future[List[T]] could complete with an empty list, a list with one element, or any number of elements.
So if you converted it to a List[Future[T]], how many Futures would the list contain?
instead of using akkas pipeTo pattern, you can just do something like:
// capture the sender, before the future is started - just like pipeTo does
val s = sender()
futureOfListOfFoo.foreach { listOfFoo =>
listOfFoo.map(convertToMessage).foreach { foo => s ! foo }
}
Why do you want to do this? What do you want to achieve? As soon as your future resolves, all of the items in the list are available so you gain little by lifting each of them into its own future.
Anyway:
import scala.concurrent._
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
// Assuming some starting future.
val foo: Future[List[String]] = Future.successful(List("foo"))
// This is pointless, because all we're doing is lifting each item in the list into its own already resolved future.
val bar: Future[List[Future[String]]] = foo.map(_.map(Future.successful))
// NB: you shouldn't use Await.
val baz: List[Future[String]] = Await.result(bar, 0.nanos)

How to stack applicative functors in Scala

Applicative functors are often mentioned as an alternative to monads when your computation steps are independent. One of their often-mentioned advantages is that you don't need transformers when you want to stack applicatives, because F[G[X]] is always also an applicative.
Let's say I have following functions:
def getDataOption(): Option[Data]
def getUserFuture(): Future[User]
def process(data: Data, user: User)
I would like to have elegant stacking in order to get a Future[Option[User]] and Future[Option[Data]] and map that with process.
So far I only came up with this (using Cats):
Applicative[Future]
.compose[Option]
.map2(
Applicative[Future].pure(getDataOption()),
getUserFuture().map(Applicative[Option].pure))(process)
but I'm sure it's far from ideal. Is there a more elegant and generic way to achieve the same?
The most difficult thing is type inference here. This is the best I could do
// for the Applicative[Future[Option[?]]
import cats.Applicative
implicit val fo = {
import cats.std.future._
import cats.std.option._
Applicative[Future].compose[Option]
}
// for the |#| syntax
import cats.syntax.cartesian._
// to guide type inference
type FutureOption[A] = Future[Option[A]]
((Future(getDataOption): FutureOption[Data]) |#|
getUserFuture.map(Option.apply)).map(process _)

Executing sequence of functions that return a future sequentially

I have a sequence of functions that return a future. I want to execute them sequentially i.e. after the first function future is complete, execute the next function and so on. Is there a way to do it?
ops: Seq[() => Future[Unit]]
You can combine all the futures into a single one with a foldLeft and flatMap:
def executeSequentially(ops: Seq[() => Future[Unit]])(
implicit exec: ExecutionContext
): Future[Unit] =
ops.foldLeft(Future.successful(()))((cur, next) => cur.flatMap(_ => next()))
foldLeft ensures the order from left to right and flatMap gives sequential execution. Functions are executed with the ExecutionContext, so calling executeSequentially is not blocking. And you can add callbacks or await on the resulting Future when/if you need it.
If you are using Twitter Futures, then I guess you won't need to pass ExecutionContext, but the general idea with foldLeft and flatMap should still work.
If given a Seq[Future[T]] you can convert it to a Future[Seq[T]] like so:
Val a: Seq[Future[T]] = ???
val resut: Future[Seq[T]] = Future.sequence(a)
a little less boilerplate than the above :)
I believe this should do it:
import scala.concurrent.{Await, Future}
import scala.concurrent.duration.Duration
def runSequentially(ops: Seq[() => Future[Unit]]): Unit = {
ops.foreach(f => Await.result(f(), Duration.Inf))
}
If you want to wait less then Duration.Inf, or stop at failure - should be easy to do.

Is Future in Scala a monad?

Why and how specifically is a Scala Future not a Monad; and would someone please compare it to something that is a Monad, like an Option?
The reason I'm asking is Daniel Westheide's The Neophyte's Guide to Scala Part 8: Welcome to the Future where I asked whether or not a Scala Future was a Monad, and the author responded that it wasn't, which threw off base. I came here to ask for a clarification.
A summary first
Futures can be considered monads if you never construct them with effectful blocks (pure, in-memory computation), or if any effects generated are not considered as part of semantic equivalence (like logging messages). However, this isn't how most people use them in practice. For most people using effectful Futures (which includes most uses of Akka and various web frameworks), they simply aren't monads.
Fortunately, a library called Scalaz provides an abstraction called Task that doesn't have any problems with or without effects.
A monad definition
Let's review briefly what a monad is. A monad must be able to define at least these two functions:
def unit[A](block: => A)
: Future[A]
def bind[A, B](fa: Future[A])(f: A => Future[B])
: Future[B]
And these functions must statisfy three laws:
Left identity: bind(unit(a))(f) ≡ f(a)
Right identity: bind(m) { unit(_) } ≡ m
Associativity: bind(bind(m)(f))(g) ≡ bind(m) { x => bind(f(x))(g) }
These laws must hold for all possible values by definition of a monad. If they don't, then we simply don't have a monad.
There are other ways to define a monad that are more or less the same. This one is popular.
Effects lead to non-values
Almost every usage of Future that I've seen uses it for asychronous effects, input/output with an external system like a web service or a database. When we do this, a Future isn't even a value, and mathematical terms like monads only describe values.
This problem arises because Futures execute immediately upon data contruction. This messes up the ability to substitute expressions with their evaluated values (which some people call "referential transparency"). This is one way to understand why Scala's Futures are inadequate for functional programming with effects.
Here's an illustration of the problem. If we have two effects:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits._
def twoEffects =
( Future { println("hello") },
Future { println("hello") } )
we will have two printings of "hello" upon calling twoEffects:
scala> twoEffects
hello
hello
scala> twoEffects
hello
hello
But if Futures were values, we should be able to factor out the common expression:
lazy val anEffect = Future { println("hello") }
def twoEffects = (anEffect, anEffect)
But this doesn't give us the same effect:
scala> twoEffects
hello
scala> twoEffects
The first call to twoEffects runs the effect and caches the result, so the effect isn't run the second time we call twoEffects.
With Futures, we end up having to think about the evaluation policy of the language. For instance, in the example above, the fact I use a lazy value rather than a strict one makes a difference in the operational semantics. This is exactly the kind of twisted reasoning functional programming is designed to avoid -- and it does it by programming with values.
Without substitution, laws break
In the presense of effects, monad laws break. Superficially, the laws appear to hold for simple cases, but the moment we begin to substitute expressions with their evaluated values, we end up with the same problems we illustrated above. We simply can't talk about mathematical concepts like monads when we don't have values in the first place.
To put it bluntly, if you use effects with your Futures, saying they're monads is not even wrong because they aren't even values.
To see how monad laws break, just factor out your effectful Future:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits._
def unit[A]
(block: => A)
: Future[A] =
Future(block)
def bind[A, B]
(fa: Future[A])
(f: A => Future[B])
: Future[B] =
fa flatMap f
lazy val effect = Future { println("hello") }
Again, it will only run one time, but you need it to run twice -- once for the right-hand side of the law, and another for the left. I'll illustrate the problem for the right identity law:
scala> effect // RHS has effect
hello
scala> bind(effect) { unit(_) } // LHS doesn't
The implicit ExecutionContext
Without putting an ExecutionContext in implicit scope, we can't define either unit or bind in our monad. This is because the Scala API for Futures has these signature:
object Future {
// what we need to define unit
def apply[T]
(body: ⇒ T)
(implicit executor: ExecutionContext)
: Future[T]
}
trait Future {
// what we need to define bind
flatMap[S]
(f: T ⇒ Future[S])
(implicit executor: ExecutionContext)
: Future[S]
}
As a "convenience" to the user, the standard library encourages users to define an execution context in implicit scope, but I think this is a huge hole in the API that just leads to defects. One scope of the computation may have one execution context defined while another scope can have another context defined.
Perhaps you can ignore the problem if you define an instance of unit and bind that pins both operations to a single context and use this instance consistently. But this is not what people do most of the time. Most of the time, people use Futures with for-yield comprehensions that become map and flatMap calls. To make for-yield comprehensions work, an execution context must be defined at some non-global implicit scope (because for-yield doesn't provide a way to specify additional parameters to the map and flatMap calls).
To be clear, Scala lets you use lots of things with for-yield comprehensions that aren't actually monads, so don't believe that you have a monad just because it works with for-yield syntax.
A better way
There's a nice library for Scala called Scalaz that has an abstraction called scalaz.concurrent.Task. This abstraction doesn't run effects upon data construction as the standard library Future does. Furthermore, Task actually is a monad. We compose Task monadically (we can use for-yield comprehensions if we like), and no effects run while we're composing. We have our final program when we have composed a single expression evaluating to Task[Unit]. This ends up being our equivalent of a "main" function, and we can finally run it.
Here's an example illustrating how we can substitute Task expressions with their respective evaluated values:
import scalaz.concurrent.Task
import scalaz.IList
import scalaz.syntax.traverse._
def twoEffects =
IList(
Task delay { println("hello") },
Task delay { println("hello") }).sequence_
We will have two printings of "hello" upon calling twoEffects:
scala> twoEffects.run
hello
hello
And if we factor out the common effect,
lazy val anEffect = Task delay { println("hello") }
def twoEffects =
IList(anEffect, anEffect).sequence_
we get what we'd expect:
scala> twoEffects.run
hello
hello
In fact, it doesn't really matter that whether we use a lazy value or a strict value with Task; we get hello printed out twice either way.
If you want to functionally program, consider using Task everywhere you may use Futures. If an API forces Futures upon you, you can convert the Future to a Task:
import concurrent.
{ ExecutionContext, Future, Promise }
import util.Try
import scalaz.\/
import scalaz.concurrent.Task
def fromScalaDeferred[A]
(future: => Future[A])
(ec: ExecutionContext)
: Task[A] =
Task
.delay { unsafeFromScala(future)(ec) }
.flatMap(identity)
def unsafeToScala[A]
(task: Task[A])
: Future[A] = {
val p = Promise[A]
task.runAsync { res =>
res.fold(p failure _, p success _)
}
p.future
}
private def unsafeFromScala[A]
(future: Future[A])
(ec: ExecutionContext)
: Task[A] =
Task.async(
handlerConversion
.andThen { future.onComplete(_)(ec) })
private def handlerConversion[A]
: ((Throwable \/ A) => Unit)
=> Try[A]
=> Unit =
callback =>
{ t: Try[A] => \/ fromTryCatch t.get }
.andThen(callback)
The "unsafe" functions run the Task, exposing any internal effects as side-effects. So try not to call any of these "unsafe" functions until you've composed one giant Task for your entire program.
I believe a Future is a Monad, with the following definitions:
def unit[A](x: A): Future[A] = Future.successful(x)
def bind[A, B](m: Future[A])(fun: A => Future[B]): Future[B] = fut.flatMap(fun)
Considering the three laws:
Left identity:
Future.successful(a).flatMap(f) is equivalent to f(a). Check.
Right identity:
m.flatMap(Future.successful _) is equivalent to m (minus some possible performance implications). Check.
Associativity
m.flatMap(f).flatMap(g) is equivalent to m.flatMap(x => f(x).flatMap(g)). Check.
Rebuttal to "Without substitution, laws break"
The meaning of equivalent in the monad laws, as I understand it, is you could replace one side of the expression with the other side in your code without changing the behavior of the program. Assuming you always use the same execution context, I think that is the case. In the example #sukant gave, it would have had the same issue if it had used Option instead of Future. I don't think the fact that the futures are evaluated eagerly is relevant.
As the other commenters have suggested, you are mistaken. Scala's Future type has the monadic properties:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits._
def unit[A](block: => A): Future[A] = Future(block)
def bind[A, B](fut: Future[A])(fun: A => Future[B]): Future[B] = fut.flatMap(fun)
This is why you can use for-comprehension syntax with futures in Scala.

Concurrent map/foreach in scala

I have an iteration vals: Iterable[T] and a long-running function without any relevant side effects: f: (T => Unit). Right now this is applied to vals in the obvious way:
vals.foreach(f)
I would like the calls to f to be done concurrently (within reasonable limits). Is there an obvious function somewhere in the Scala base library? Something like:
Concurrent.foreach(8 /* Number of threads. */)(vals, f)
While f is reasonably long running, it is short enough that I don't want the overhead of invoking a thread for each call, so I am looking for something based on a thread pool.
Many of the answers from 2009 still use the old scala.actors.Futures._, which are no longer in the newer Scala. While Akka is the preferred way, a much more readable way is to just use parallel (.par) collections:
vals.foreach { v => f(v) }
becomes
vals.par.foreach { v => f(v) }
Alternatively, using parMap can appear more succinct though with the caveat that you need to remember to import the usual Scalaz*. As usual, there's more than one way to do the same thing in Scala!
Scalaz has parMap. You would use it as follows:
import scalaz.Scalaz._
import scalaz.concurrent.Strategy.Naive
This will equip every functor (including Iterable) with a parMap method, so you can just do:
vals.parMap(f)
You also get parFlatMap, parZipWith, etc.
I like the Futures answer. However, while it will execute concurrently, it will also return asynchronously, which is probably not what you want. The correct approach would be as follows:
import scala.actors.Futures._
vals map { x => future { f(x) } } foreach { _() }
I had some issues using scala.actors.Futures in Scala 2.8 (it was buggy when I checked). Using java libs directly worked for me, though:
final object Parallel {
val cpus=java.lang.Runtime.getRuntime().availableProcessors
import java.util.{Timer,TimerTask}
def afterDelay(ms: Long)(op: =>Unit) = new Timer().schedule(new TimerTask {override def run = op},ms)
def repeat(n: Int,f: Int=>Unit) = {
import java.util.concurrent._
val e=Executors.newCachedThreadPool //newFixedThreadPool(cpus+1)
(0 until n).foreach(i=>e.execute(new Runnable {def run = f(i)}))
e.shutdown
e.awaitTermination(Math.MAX_LONG, TimeUnit.SECONDS)
}
}
I'd use scala.actors.Futures:
vals.foreach(t => scala.actors.Futures.future(f(t)))
The latest release of Functional Java has some higher-order concurrency features that you can use.
import fjs.F._
import fj.control.parallel.Strategy._
import fj.control.parallel.ParModule._
import java.util.concurrent.Executors._
val pool = newCachedThreadPool
val par = parModule(executorStrategy[Unit](pool))
And then...
par.parMap(vals, f)
Remember to shutdown the pool.
You can use the Parallel Collections from the Scala standard library.
They're just like ordinary collections, but their operations run in parallel. You just need to put a par call before you invoke some collections operation.
import scala.collection._
val array = new Array[String](10000)
for (i <- (0 until 10000).par) array(i) = i.toString