.zip three futures in Scala [duplicate]

This question already has answers here:
Return Future[(Int,Int)] instead of (Future[Int],Future[Int])
I need the result variable below to contain a Future[(String, String, String)] with the results of futures f1, f2 and f3, but instead I'm getting a Future[((String, String), String)]. I also need the three futures to run in parallel. How can I make this work?
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

def futureA = Future { "A" }
def futureB = Future { "B" }
def futureC = Future { "C" }

def futureFunc = {
  val cond1 = 1
  val cond2 = 0
  val f1 = if (cond1 > 0) futureA else Future { "" }
  val f2 = if (cond2 > 0) futureB else Future { "" }
  val f3 = futureC
  val fx = f1.zip(f2)
  val result = fx.zip(f3)
}

If you create your futures beforehand, you can combine them in a for comprehension and they will run in parallel:
for {
  a <- f1
  b <- f2
  c <- f3
} yield (a, b, c)
res0: scala.concurrent.Future[(String, String, String)]
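For example, applied to the code in the question (a sketch, assuming the same futureA/futureB/futureC definitions and an implicit global ExecutionContext), binding the conditional futures to vals first means all three are already running before the for comprehension combines them:

def futureFunc: Future[(String, String, String)] = {
  val cond1 = 1
  val cond2 = 0
  // All three futures are created (and therefore started) here, before being combined.
  val f1 = if (cond1 > 0) futureA else Future("")
  val f2 = if (cond2 > 0) futureB else Future("")
  val f3 = futureC
  for {
    a <- f1
    b <- f2
    c <- f3
  } yield (a, b, c)
}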

I tried to come up with a few more solutions; here is the result:
def futureFunc = {
  val cond1 = 1
  val cond2 = 0
  val f1 = if (cond1 > 0) futureA else Future { "" }
  val f2 = if (cond2 > 0) futureB else Future { "" }
  val f3 = futureC

  //#1
  Future.sequence(List(f1, f2, f3)).map {
    case List(a, b, c) => (a, b, c)
  }

  //#2
  for {
    f11 <- f1
    f22 <- f2
    f33 <- f3
  } yield (f11, f22, f33)

  //#3
  f1.zip(f2).zip(f3).map {
    case ((f11, f22), f33) => (f11, f22, f33)
  }
}
The first one uses Future.sequence to create a Future[List[_]] and then maps the list into a tuple (because of type safety there is no built-in method for turning a list into a tuple).
The second is the for comprehension described by Sascha; as you may know, it is syntactic sugar for map and flatMap, which is the preferred way to work with futures.
The last one uses zip, as you wanted, but you still need to map the final future to obtain the tuple you want.
All of these operations are non-blocking, but they all require you to know exactly which futures you are combining. You can use additional libraries for turning lists into tuples, and then the first solution also works when the number of futures is not known in advance. For readability, I think the for comprehension is best.
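For example, here is a minimal sketch of the Future.sequence approach when the number of futures is not known up front (assuming the same implicit ExecutionContext as above; the lookup method and ids are made up for illustration):

// Hypothetical: the number of futures depends on runtime data.
def lookup(id: Int): Future[String] = Future { s"value-$id" }

def lookupAll(ids: List[Int]): Future[List[String]] =
  Future.sequence(ids.map(lookup))  // List[Future[String]] becomes Future[List[String]]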

Related

How can I have 3 Futures run asynchronously and display each answer?

I'm learning Scala and I want to run 3 futures asynchronously and get an answer for each one, but I can't get it to work.
import scala.concurrent._
import scala.concurrent.duration._
import scala.concurrent.Future
import scala.util.{Success, Failure}
import scala.concurrent.ExecutionContext.Implicits.global

def f_Fail(): Future[String] = Future {
  Thread.sleep(1000)
  throw new RuntimeException("わざと失敗") // "deliberate failure"
}

def f_success(): Future[String] = Future {
  Thread.sleep(1000)
  "future!"
}

val f1 = f_success()
val f2 = f_success()
val f3 = f_Fail()

val future = for {
  n <- f1
  m <- f2
  p <- f3
} yield n + "\n" + m + "\n" + p

Await.ready(future, Duration.Inf).value.get match {
  case Success(v) => println(v)
  case Failure(e) => println("not found")
}
Expected results
"future!"
"future!"
"not found"
Actual result
"not found"
A for-comprehension is so-called "syntactic sugar" for nested flatMap calls and finally a map. In your case, the for-comprehension can be effectively re-written as:
val future = f1.flatMap(n => f2.flatMap(m => f3.map(p => n+"\n"+m+"\n"+p)))
In the context of Futures, the flatMap method requires the Future on which it's called to be completed before its result is available to the following step. Simplifying for clarity, consider the following:
f1.flatMap(n => f2.map(m => m + n))
In order for the lambda passed to flatMap to be invoked, the result of f1 must already be known.
Effectively, a flatMap call expresses a direct dependency on the previous computation completing successfully, and its "sugared" counterpart, the for-comprehension, expresses the same thing.
This means that if any step of the chain fails, the overall result will be failed, which is what you are experiencing in your example.
What you can do is the following:
val f1 = f_success().recover { case _ => "not found" }
val f2 = f_success().recover { case _ => "not found" }
val f3 = f_Fail().recover { case _ => "not found" }

for {
  n <- f1
  m <- f2
  p <- f3
} {
  println(n)
  println(m)
  println(p)
}
Notice that:
Futures are "eager", i.e. the computation is started immediately when the Future itself is instantiated. This means that the first three lines effectively start each computation independently and (given enough threads and resources) concurrently or even in parallel.
each Future defines how to be recovered by returning the "not found" string in case of error (which is the case for f3).
the side effect (println) is performed as part of the for-comprehension itself, allowing you to avoid blocking on the synchronous context -- in this case, the for-comprehension without yield is equivalent to executing the final step inside a foreach instead of a map, which is better suited to express side effects.
Now, let's say that you want to start your Futures lazily by simply wrapping their definition in a function. You'll notice that suddenly the execution doesn't necessarily take advantage of multiple threads and cores and takes 3 seconds instead of one, as in the following example:
def f1 = f_success().recover { case _ => "not found" }
def f2 = f_success().recover { case _ => "not found" }
def f3 = f_Fail().recover { case _ => "not found" }

for {
  n <- f1
  m <- f2
  p <- f3
} {
  println(n)
  println(m)
  println(p)
}
This is because of the semantics of flatMap. You can solve this problem by using a construct that doesn't imply some form of direct dependency between the steps of the calculation, like zip:
def f1 = f_success().recover { case _ => "not found" }
def f2 = f_success().recover { case _ => "not found" }
def f3 = f_Fail().recover { case _ => "not found" }

for (((n, m), p) <- f1.zip(f2).zip(f3)) {
  println(n)
  println(m)
  println(p)
}
This runs again in ~1 second as you might expect.
Alternatively, if you still want to return the result as a Future, you can of course use yield as follows:
val future = for (((n, m), p) <- f1.zip(f2).zip(f3)) yield s"$n\n$m\n$p"
You can read more about for-comprehensions here in the Scala book.

Difference between { zip map } and { flatMap map } in Future of Scala

I'm reading Hands-on Scala, and one of its exercises is parallelizing merge sort.
I want to know why the for-comprehension, which can be translated into flatMap and map, takes more time than zip and map.
My code:
def mergeSortParallel0[T: Ordering](items: IndexedSeq[T]): Future[IndexedSeq[T]] = {
  if (items.length <= 16) Future.successful(mergeSortSequential(items))
  else {
    val (left, right) = items.splitAt(items.length / 2)
    for (
      l <- mergeSortParallel0(left);
      r <- mergeSortParallel0(right)
    ) yield merge(l, r)
  }
}
The standard answer provided by the book:
def mergeSortParallel0[T: Ordering](items: IndexedSeq[T]): Future[IndexedSeq[T]] = {
  if (items.length <= 16) Future.successful(mergeSortSequential(items))
  else {
    val (left, right) = items.splitAt(items.length / 2)
    mergeSortParallel0(left).zip(mergeSortParallel0(right)).map {
      case (sortedLeft, sortedRight) => merge(sortedLeft, sortedRight)
    }
  }
}
flatMap and map are sequential operations on a Scala Future, and on their own they have nothing to do with running things in parallel. They can be viewed as simple callbacks executed when a Future completes. In other words, the code provided inside map(...) or flatMap(...) will start to execute only when the previous Future has finished.
zip, on the other hand, combines two Futures that are already running: the Future it is called on and the Future passed to it are both constructed (and therefore started) before zip combines them, so the two computations run in parallel, and the result is returned as a tuple when both are complete. Similarly, you could use zipWith, which takes a function to transform the results of the two Futures (it combines the zip and map operations):
mergeSortParallel0(left).zipWith(mergeSortParallel0(right)) {
  case (sortedLeft, sortedRight) => merge(sortedLeft, sortedRight)
}
Another way to achieve parallelism is to declare the Futures outside the for-comprehension. This works because Futures in Scala are 'eager': they start as soon as you declare them (assign them to a val):
def mergeSortParallel0[T: Ordering](items: IndexedSeq[T]): Future[IndexedSeq[T]] = {
  if (items.length <= 16) Future.successful(mergeSortSequential(items))
  else {
    val (left, right) = items.splitAt(items.length / 2)
    val leftF = mergeSortParallel0(left)
    val rightF = mergeSortParallel0(right)
    for {
      sortedLeft <- leftF
      sortedRight <- rightF
    } yield {
      merge(sortedLeft, sortedRight)
    }
  }
}

Running futures sequentially

The objective of the code below is to execute Future f3 or f4 depending on a condition. Note that the condition depends on the result of Future f1 or f2, so it has to wait for one of them. This seems to work; however, since f1 and f2 are futures, this code shouldn't be running sequentially. Is this code correct?
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

object TestFutures extends App {
  val f1 = Future { 1 }
  val f2 = Future { 2 }
  val f3 = Future { 3 }
  val f4 = Future { 4 }

  val y = 1
  for {
    condition <- if (y > 0) f1 else f2
    _ <- if (condition == 1) f3.map { a => println("333") } else f4.map { b => println("444") }
  } yield ()

  Thread.sleep(5000)
}
No, it is not correct. When you create a Future the way you do, it starts the computation immediately. Before the for comprehension is even reached, all four of your futures are already running. You need to create them later, depending on the conditions.
val y = 1
for {
  condition <- if (y > 0) Future { 1 } else Future { 2 }
  _ <- if (condition == 1)
         Future { 3 }.map(a => println("333"))
       else
         Future { 4 }.map(b => println("444"))
} yield ()
For the sake of readability, it is probably good to extract the creation of each of those futures into a method that you call when needed.
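For example, one possible way to extract it (a sketch; the method names are made up, and an implicit ExecutionContext is assumed as above):

// Hypothetical helpers: each future is created (and started) only when the method is called.
def conditionFuture(y: Int): Future[Int] =
  if (y > 0) Future { 1 } else Future { 2 }

def branchFuture(condition: Int): Future[Unit] =
  if (condition == 1) Future { 3 }.map(a => println("333"))
  else Future { 4 }.map(b => println("444"))

val y = 1
for {
  condition <- conditionFuture(y)
  _ <- branchFuture(condition)
} yield ()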
It should be obvious that they start running when they are created, because you can just write
Future(1).map(x => println(x))
and it works without any explicit triggering. Anyway, try running the following code:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

def printWhenCompleted[A](f: Future[A]): Future[A] = f.map { x =>
  println(x)
  x
}

val f1 = printWhenCompleted(Future { 1 })
val f2 = printWhenCompleted(Future { 2 })
val f3 = printWhenCompleted(Future { 3 })

for {
  r3 <- f3
  r2 <- f2
  r1 <- f1
} yield r1 + r2 + r3
It should print those numbers in a nondeterministic order, as each future completes, rather than sequentially as 3, 2, 1 in the order of the for comprehension.
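If the prints are too close together to tell, a variant with different sleep times (added here purely for illustration) makes the point clearer:

val slow = printWhenCompleted(Future { Thread.sleep(300); 1 })
val medium = printWhenCompleted(Future { Thread.sleep(200); 2 })
val fast = printWhenCompleted(Future { Thread.sleep(100); 3 })

for {
  r1 <- slow   // consumed first in the comprehension...
  r2 <- medium
  r3 <- fast
} yield r1 + r2 + r3
// ...but 3 is printed first, then 2, then 1, because each future prints as soon as
// it completes, regardless of the order in which the results are later combined.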
Edit
Here is an implementation of the first snippet (without the println calls) using flatMap:
val futureCondition = if (y > 0) Future(1) else Future(2)
futureCondition.flatMap(condition => if (condition == 1) Future(3) else Future(4))

Future composition in Scala with chunked response

I think I understand how future composition works, but I am confused about how to invoke the next future on a chunk of the response from the first future.
Say the first future returns a list of integers and the list is huge. I want to apply some function to that list two elements at a time. How do I do that?
This example summarizes my dilemma:
val a = Future(List(1, 2, 3, 4, 5, 6))
def f(a: List[Int]) = Future(a map (_ + 2))

val res = for {
  list <- a
  chunked <- list.grouped(2).toList
} yield f(chunked)

<console>:14: error: type mismatch;
 found   : List[scala.concurrent.Future[List[Int]]]
 required: scala.concurrent.Future[?]
         chunked <- list.grouped(2).toList
                    ^
The return type has to be Future[?], so I can fix it by moving the second future into the yield part:
val res = for {
  list <- a
} yield {
  val temp = for {
    chunked <- list.grouped(2).toList
  } yield f(chunked)
  Future.sequence(temp)
}
I feel it loses its elegance now, since it becomes nested (two for comprehensions instead of one in the first approach). Is there a better way to achieve the same thing?
Consider
a.map { _.grouped(2).toList }.flatMap { Future.traverse(_)(f) }
Or, if you are set on only using for comprehension for some reason, here is how, without "cheating" :)
for {
  b <- a
  c <- Future.traverse(b.grouped(2).toList)(f)
} yield c
Edit in response to the comment: It's not really that hard to add more processing to your chunked list if needed:
for {
  b <- a
  chunks = b.grouped(2).toList
  processedChunks = processChunks(chunks)
  c <- Future.traverse(processedChunks)(f)
} yield c
Or, without a for comprehension:
a
  .map { _.grouped(2).toList }
  .map(processChunks)
  .flatMap { Future.traverse(_)(f) }
You cannot mix Future and List generators in the same for-comprehension; all of the involved objects have to be of the same type. Also, in your working example, the result value res is of type Future[Future[List[List[Int]]]], which is probably not what you want.
scala> import scala.concurrent._
scala> import scala.concurrent.ExecutionContext.Implicits.global
scala> val a = Future(List(1, 2, 3, 4, 5, 6))
a: scala.concurrent.Future[List[Int]] = scala.concurrent.impl.Promise$DefaultPromise@3bd3cdc8
scala> def f(a: List[Int]) = Future(a map (_ + 2))
f: (a: List[Int])scala.concurrent.Future[List[Int]]
scala> val b: Future[List[List[Int]]] = a.map(list => list.grouped(2).toList)
b: scala.concurrent.Future[List[List[Int]]] = scala.concurrent.impl.Promise$DefaultPromise@74db196c
scala> val res: Future[List[List[Int]]] = b.flatMap(lists => Future.sequence(lists.map(f)))
res: scala.concurrent.Future[List[List[Int]]] = scala.concurrent.impl.Promise$DefaultPromise@28f9873c
With a for-comprehension:
for {
  b <- a.map(list => list.grouped(2).toList)
  res <- Future.sequence(b.map(f))
} yield res
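As a quick check on the res value defined above (a sketch that blocks only to inspect the result; the expected output is shown as a comment):

import scala.concurrent.Await
import scala.concurrent.duration._

// Blocking here only for demonstration; don't do this in production code.
Await.result(res, 1.second)
// List(List(3, 4), List(5, 6), List(7, 8))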

Scala - merging multiple iterators

I have multiple iterators which return items in a sorted manner according to some sorting criterion. Now I would like to merge (multiplex) the iterators into one combined iterator. I know how to do it Java-style (e.g. with a TreeMap), but I was wondering if there is a more functional approach? I want to preserve the laziness of the iterators as much as possible.
You can just do:
val it = iter1 ++ iter2
It creates another iterator and does not evaluate the elements, but wraps the two existing iterators.
It is fully lazy, so you are not supposed to use iter1 or iter2 once you do this.
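A quick way to see the laziness (a small sketch; the println is only there to make evaluation visible):

val iter1 = Iterator(1, 2).map { x => println(s"evaluating $x"); x }
val iter2 = Iterator(3, 4).map { x => println(s"evaluating $x"); x }

val it = iter1 ++ iter2 // nothing is printed yet
it.next()               // prints "evaluating 1" and returns 1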
In general, if you have more iterators to merge, you can use folding:
val iterators: Seq[Iterator[T]] = ???
val it = iterators.foldLeft(Iterator[T]())(_ ++ _)
If you have some ordering on the elements that you would like to maintain in the resulting iterator, but you still want laziness, you can convert them to streams:
def merge[T: Ordering](iter1: Iterator[T], iter2: Iterator[T]): Iterator[T] = {
  import Ordering.Implicits._ // brings the < operator into scope for the Ordering context bound
  val s1 = iter1.toStream
  val s2 = iter2.toStream
  def mergeStreams(s1: Stream[T], s2: Stream[T]): Stream[T] = {
    if (s1.isEmpty) s2
    else if (s2.isEmpty) s1
    else if (s1.head < s2.head) s1.head #:: mergeStreams(s1.tail, s2)
    else s2.head #:: mergeStreams(s1, s2.tail)
  }
  mergeStreams(s1, s2).iterator
}
Not necessarily faster though, you should microbenchmark this.
A possible alternative is to use buffered iterators to achieve the same effect.
Like @axel22 mentioned, you can do this with BufferedIterators. Here's one Stream-free solution:
def combine[T](rawIterators: List[Iterator[T]])(implicit cmp: Ordering[T]): Iterator[T] = {
  new Iterator[T] {
    private val iterators: List[BufferedIterator[T]] = rawIterators.map(_.buffered)

    def hasNext: Boolean = iterators.exists(_.hasNext)

    def next(): T = if (hasNext) {
      // Pick the non-exhausted iterator whose head is smallest and advance it.
      iterators.filter(_.hasNext).map(x => (x.head, x)).minBy(_._1)(cmp)._2.next()
    } else {
      throw new UnsupportedOperationException("Cannot call next on an exhausted iterator!")
    }
  }
}
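A possible usage sketch, with the default Int ordering and three already-sorted inputs:

val i1 = Iterator(1, 4, 7)
val i2 = Iterator(2, 5, 8)
val i3 = Iterator(3, 6, 9)

// Merges lazily; forcing it here just to show the result.
combine(List(i1, i2, i3)).toList // List(1, 2, 3, 4, 5, 6, 7, 8, 9)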
You could try:
(iterA ++ iterB).toStream.sorted.toIterator
For example:
val i1 = (1 to 100 by 3).toIterator
val i2 = (2 to 100 by 3).toIterator
val i3 = (3 to 100 by 3).toIterator
val merged = (i1 ++ i2 ++ i3).toStream.sorted.toIterator
merged.next // results in: 1
merged.next // results in: 2
merged.next // results in: 3