Yield a result from a list of Scala Future - scala

I am brand new to Scala Futures and I am working on a simple task.
I have the following function that returns a list of Futures, and all I want to do is read the results (blocking until all the futures have finished).
private def findAll(className: String): List[Future[Vector[ParseObject]]]= {
def find(query: ParseQuery[ParseObject], from: Int, limit: Int) = {
query.skip(from)
query.limit(limit)
Future(query.find().asScala.toVector)
}
val count = ParseQuery.getQuery(className).count()
val skip = 1000
val fromAndLimit = for (from <- 0 to count by skip) yield (from, if (from + skip < count) skip else count - from )
println("fromAndLimit: " + fromAndLimit)
(for((from, limit) <- fromAndLimit if limit > 0) yield find(ParseQuery.getQuery(className), from, limit)).toList
}
As you can see, the function tries to read all objects from Parse.com, and the goal is to return all the objects in one big Vector.
(A code snippet would be much appreciated; right now I am not trying to learn Futures, I just want a solution for this case.)

If you want to combine the futures into one, you can call:
val compositeFuture: Future[List[Vector[ParseObject]]] = Future.sequence(findAll(???))
If you want to wait for completion, then (with import scala.concurrent.duration._ in scope):
val result: List[Vector[ParseObject]] = Await.result(compositeFuture, 1.minute)
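Since the question asks for one big Vector, the two steps above can be followed by a flatten. The following is a minimal sketch, assuming a hypothetical findAllPages function that stands in for findAll and returns Int values instead of ParseObject so that it runs without the Parse SDK:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical stand-in for findAll: each Future yields one "page" of results.
def findAllPages(pages: Int): List[Future[Vector[Int]]] =
  (0 until pages).toList.map(p => Future(Vector(p * 2, p * 2 + 1)))

// Turn List[Future[Vector[A]]] into Future[List[Vector[A]]], then flatten the pages.
val combined: Future[Vector[Int]] =
  Future.sequence(findAllPages(3)).map(_.flatten.toVector)

// Block until every page has arrived (or the timeout expires).
val all: Vector[Int] = Await.result(combined, 1.minute)
```

Future.sequence keeps the order of the input list, so the pages are concatenated in the order they were requested.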

Related

Functional way of interrupting lazy iteration depending on timeout and comparison between previous and next: while, LazyList vs Stream

Background
I have the following scenario. I want to execute the method of a class from an external library, repeatedly, and I want to do so until a certain timeout condition and result condition (compared to the previous result) is met. Furthermore I want to collect the return values, even on the "failed" run (the run with the "failing" result condition that should interrupt further execution).
Thus far I have accomplished this by initializing an empty var result: Result and a var stop: Boolean, and using a while loop that runs while the conditions are true and modifies the outer state. I would like to get rid of this and use a functional approach.
Some context. Each run is expected to run from 0 to 60 minutes and the total time of iteration is capped at 60 minutes. Theoretically, there's no bound to how many times it executes in this period but in practice, it's generally 2-60 times.
The problem is, the runs take a long time so I need to stop the execution. My idea is to use some kind of lazy Iterator or Stream coupled with scanLeft and Option.
Code
Boilerplate
This code isn't particularly relevant, but it is used in my sample approaches and provides identical, though somewhat random, pseudo runtime results.
import scala.collection.mutable.ListBuffer
import scala.util.Random
val r = Random
r.setSeed(1)
val sleepingTimes: Seq[Int] = (1 to 601)
.map(x => Math.pow(2, x).toInt * r.nextInt(100))
.toList
.filter(_ > 0)
.sorted
val randomRes = r.shuffle((0 to 600).map(x => r.nextInt(10)).toList)
case class Result(val a: Int, val slept: Int)
class Lib() {
def run(i: Int) = {
println(s"running ${i}")
Thread.sleep(sleepingTimes(i))
Result(randomRes(i), sleepingTimes(i))
}
}
case class Baz(i: Int, result: Result)
val lib = new Lib()
val timeout = 10 * 1000
While approach
val iteratorStart = System.currentTimeMillis()
val iterator = for {
i <- (0 to 600).iterator
if System.currentTimeMillis() < iteratorStart + timeout
f = Baz(i, lib.run(i))
} yield f
val iteratorBuffer = ListBuffer[Baz]()
if (iterator.hasNext) iteratorBuffer.append(iterator.next())
var run = true
while (run && iterator.hasNext) {
val next = iterator.next()
run = iteratorBuffer.last.result.a < next.result.a
iteratorBuffer.append(next)
}
Stream approach (Scala 2.12)
Full example
val streamStart = System.currentTimeMillis()
val stream = for {
i <- (0 to 600).toStream
if System.currentTimeMillis() < streamStart + timeout
} yield Baz(i, lib.run(i))
var last: Option[Baz] = None
val head = stream.headOption
val tail = if (stream.nonEmpty) stream.tail else stream
val streamVersion = (tail
.scanLeft((head, true))((x, y) => {
if (x._1.exists(_.result.a > y.result.a)) (Some(y), false)
else (Some(y), true)
})
.takeWhile {
case (baz, continue) =>
if (!baz.eq(head)) last = baz
continue
}
.map(_._1)
.toList :+ last).flatten
LazyList approach (Scala 2.13)
Full example
val lazyListStart = System.currentTimeMillis()
val lazyList = for {
i <- (0 to 600).to(LazyList)
if System.currentTimeMillis() < lazyListStart + timeout
} yield Baz(i, lib.run(i))
var last: Option[Baz] = None
val head = lazyList.headOption
val tail = if (lazyList.nonEmpty) lazyList.tail else lazyList
val lazyListVersion = (tail
.scanLeft((head, true))((x, y) => {
if (x._1.exists(_.result.a > y.result.a)) (Some(y), false)
else (Some(y), true)
})
.takeWhile {
case (baz, continue) =>
if (!baz.eq(head)) last = baz
continue
}
.map(_._1)
.toList :+ last).flatten
Result
Both approaches appear to yield the correct end result:
List(Baz(0,Result(4,170)), Baz(1,Result(5,208)))
and they interrupt execution as desired.
Edit: The desired outcome is to not execute the next iteration but still return the result of the iteration that caused the interruption. Thus the desired result is
List(Baz(0,Result(4,170)), Baz(1,Result(5,208)), Baz(2,Result(2,256)))
and lib.run(i) should only run 3 times.
This is achieved by the while approach, as well as the LazyList approach but not the Stream approach which executes lib.run 4 times (Bad!).
Question
Is there another stateless approach, which is hopefully more elegant?
Edit
I realized my examples were faulty and not returning the "failing" result, which it should, and that they kept executing beyond the stop condition. I rewrote the code and examples but I believe the spirit of the question is the same.
I would use something higher level, like fs2.
(or any other high-level streaming library, like: monix observables, akka streams or zio zstreams)
def runUntilOrTimeout[F[_]: Concurrent: Timer, A](work: F[A], timeout: FiniteDuration)
(stop: (A, A) => Boolean): Stream[F, A] = {
val interrupt =
Stream.sleep_(timeout)
val run =
Stream
.repeatEval(work)
.zipWithPrevious
.takeThrough {
case (Some(p), c) if stop(p, c) => false
case _ => true
} map {
case (_, c) => c
}
run mergeHaltBoth interrupt
}
You can see it working here.
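If pulling in a streaming library is not an option, the same "stop after the failing run, but keep its result" behaviour can be sketched with a tail-recursive loop in plain Scala. This is only a sketch: collectUntil and its parameters are hypothetical names, and the deadline is checked between runs only, so a run that is already in progress is not interrupted.

```scala
import scala.annotation.tailrec

// Hypothetical helper: calls step(0), step(1), ... until the deadline passes,
// maxRuns is reached, or stop(previous, current) becomes true.
// The result that triggered the stop is still included in the output.
def collectUntil[A](deadlineMillis: Long, maxRuns: Int)(step: Int => A)(stop: (A, A) => Boolean): List[A] = {
  @tailrec
  def loop(i: Int, acc: List[A]): List[A] =
    if (i >= maxRuns || System.currentTimeMillis() >= deadlineMillis) acc.reverse
    else {
      val next = step(i)
      acc.headOption match {
        case Some(prev) if stop(prev, next) => (next :: acc).reverse
        case _                              => loop(i + 1, next :: acc)
      }
    }
  loop(0, Nil)
}

// Usage with fixed sample values: stops once a value is smaller than the previous one.
val sample = Vector(4, 5, 2, 9)
val collected = collectUntil(Long.MaxValue, sample.length)(i => sample(i))(_ > _)
// collected == List(4, 5, 2); the fourth step is never executed
```

With the question's setup, step would be i => Baz(i, lib.run(i)) and stop would compare result.a of the previous and current Baz.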

Scala how to decrease execution time

I have a method that generates UUIDs; the code is below:
def generate(number : Int): List[String] = {
List.fill(number)(Generators.randomBasedGenerator().generate().toString.replaceAll("-",""))
}
and I call it as follows:
for(i <-0 to 100) {
val a = generate(1000000)
println(a)
}
But running the above for loop takes almost 8-9 minutes. Is there any way to reduce the execution time?
Note: I added the for loop here for illustration; in the real situation the generate method will be called thousands of times by other requests at the same time.
The problem is the List. Filling a List with 1,000,000 generated and processed elements is going to take time (and memory) because every one of those elements has to be materialized.
You can generate an infinite number of processed UUID strings instantly if you don't have to materialize them until they are actually needed.
def genUUID :Stream[String] = Stream.continually {
Generators.randomBasedGenerator().generate().toString.filterNot(_ == '-')
}
val next5 = genUUID.take(5) //only the 1st (head) is materialized
next5.length //now all 5 are materialized
You can use Stream or Iterator for the infinite collection, whichever you find most conducive (or least annoying) to your work flow.
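For comparison, here is a minimal sketch of the Iterator variant. To keep it self-contained it uses java.util.UUID.randomUUID() from the JDK instead of the java-uuid-generator library from the question, which is an assumption on my part; the name uuidStrings is made up for the example.

```scala
import java.util.UUID

// Each call returns a fresh, infinite, lazy iterator of dash-free UUID strings.
// Nothing is generated until an element is actually requested.
def uuidStrings: Iterator[String] =
  Iterator.continually(UUID.randomUUID().toString.filterNot(_ == '-'))

// Only these five values are ever generated.
val firstFive: List[String] = uuidStrings.take(5).toList
```

Unlike Stream, an Iterator does not memoize the elements it has produced, so it avoids holding on to already consumed values.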
Basically, you are not using the fastest implementation. The faster one is the overload that takes a Random in its constructor: Generators.randomBasedGenerator(new Random(System.currentTimeMillis())). Here is what I did:
Used Array instead of List (Array is faster)
Removed the string replacement, to measure the pure performance of generation
Dependency: "com.fasterxml.uuid" % "java-uuid-generator" % "3.1.5"
Result:
Generators.randomBasedGenerator(). Per iteration: 1579.6 ms
Generators.randomBasedGenerator() with passing Random Per iteration: 59.2 ms
Code:
import java.util.{Random, UUID}
import com.fasterxml.uuid.impl.RandomBasedGenerator
import com.fasterxml.uuid.{Generators, NoArgGenerator}
import org.scalatest.{FunSuiteLike, Matchers}
import scala.concurrent.duration.Deadline
class GeneratorTest extends FunSuiteLike
with Matchers {
val nTimes = 10
// Let's use Array instead of List - Array is faster!
// and use pure UUID generators
def generate(uuidGen: NoArgGenerator, number: Int): Seq[UUID] = {
Array.fill(number)(uuidGen.generate())
}
test("Generators.randomBasedGenerator() without passed Random (secure one)") {
// Slow generator
val uuidGen = Generators.randomBasedGenerator()
// Warm up JVM
benchGeneration(uuidGen, 3)
val startTime = Deadline.now
benchGeneration(uuidGen, nTimes)
val endTime = Deadline.now
val perItermTimeMs = (endTime - startTime).toMillis / nTimes.toDouble
println(s"Generators.randomBasedGenerator(). Per iteration: $perItermTimeMs ms")
}
test("Generators.randomBasedGenerator() with passing Random (not secure)") {
// Fast generator
val uuidGen = Generators.randomBasedGenerator(new Random(System.currentTimeMillis()))
// Warm up JVM
benchGeneration(uuidGen, 3)
val startTime = Deadline.now
benchGeneration(uuidGen, nTimes)
val endTime = Deadline.now
val perItermTimeMs = (endTime - startTime).toMillis / nTimes.toDouble
println(s"Generators.randomBasedGenerator() with passing Random Per iteration: $perItermTimeMs ms")
}
private def benchGeneration(uuidGen: RandomBasedGenerator, nTimes: Int) = {
var r: Long = 0
for (i <- 1 to nTimes) {
val a = generate(uuidGen, 1000000)
r += a.length
}
println(r)
}
}
You could use scala's parallel collections to split the load on multiple cores/threads.
You could also avoid creating a new generator every time:
class Generator {
val gen = Generators.randomBasedGenerator()
def generate(number : Int): List[String] = {
List.fill(number)(gen.generate().toString.replaceAll("-",""))
}
}
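As one possible way to split the load across cores without the parallel collections module, the work can be divided into chunks that each run in their own Future. This is a rough sketch under several assumptions: it uses java.util.UUID.randomUUID() from the JDK rather than the java-uuid-generator library, generateParallel and chunks are hypothetical names, and whether it actually speeds things up depends on how much the underlying random source is contended between threads.

```scala
import java.util.UUID
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical helper: generates `number` dash-free UUID strings using `chunks` Futures.
def generateParallel(number: Int, chunks: Int)(implicit ec: ExecutionContext): Future[Vector[String]] = {
  val base  = number / chunks
  val extra = number % chunks
  // Chunk sizes differ by at most one and sum to `number`.
  val sizes = Vector.tabulate(chunks)(i => if (i < extra) base + 1 else base)
  Future
    .traverse(sizes)(n => Future(Vector.fill(n)(UUID.randomUUID().toString.replace("-", ""))))
    .map(_.flatten)
}

// Usage: the execution context is passed explicitly here.
val ids: Vector[String] =
  Await.result(generateParallel(1000, 4)(ExecutionContext.global), 30.seconds)
```

It is worth measuring this against the single-threaded version before adopting it.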

Scala Futures with DB

I'm writing code in scala/play with anorm/postgres for match generation based on users profiles. The following code works, but I've commented out the section that is causing problems, the while loop. I noticed while running it that the first 3 Futures seem to work synchronously but the problem comes when I'm retrieving the count of rows in the table in the fourth step.
The fourth step returns the count before the above inserts have actually happened. As far as I can tell, steps 1-3 are being queued up for postgres synchronously, but the call to retrieve the count seems to return BEFORE the first 3 steps complete, which makes no sense to me. If the first 3 steps get queued up in the correct order, why wouldn't the fourth step wait to return the count until after the inserts happen?
When I uncomment the while loop, the match generation and insert functions are called until memory runs out, as the count returned is continually below the desired threshold.
I know the format itself is subpar, but my question is not about how to write the most elegant scala code, but merely how to get it to work for now.
def matchGeneration(email:String,itNum:Int) = {
var currentIterationNumber = itNum
var numberOfMatches = MatchData.numberOfCurrentMatches(email)
while(numberOfMatches < 150){
Thread.sleep(25000)//delay while loop execution time
generateUsers(email) onComplete {
case(s) => {
print(s">>>>>>>>>>>>>>>>>>>>>>>>>>>STEP 1")
Thread.sleep(5000)//Time for initial user generation to take place
genDemoMatches(email, currentIterationNumber) onComplete {
case (s) => {
print(s">>>>>>>>>>>>>>>>>>>>>>>>>>>STEP 2")
genIntMatches(email,currentIterationNumber) onComplete {
case(s) => {
print(s">>>>>>>>>>>>>>>>>>>>>>>>>>>STEP 3")
genSchoolWorkMatches(email,currentIterationNumber) onComplete {
case(s) => {
Thread.sleep(10000)
print(s">>>>>>>>>>>>>>>>>>>>>>>>>>>STEP 4")
incrementNumberOfMatches(email) onComplete {
case(s) => {
currentIterationNumber+=1
println(s"current number of matches: $numberOfMatches")
println(s"current Iteration: $currentIterationNumber")
}
}
}
}
}
}
}
}
}
}
//}
}
The match functions are defined as futures, such as :
def genSchoolWorkMatches(email:String,currentIterationNumber:Int):Future[Unit]=
Future(genUsersFromSchoolWorkData(email, currentIterationNumber))
genUsersFromSchoolWorkData(email:String) follows the same form as the other two. It is a function that initially gets all the school/work fields that a user has filled out in their profile ( SELECT major FROM school_work where email='$email') and it generates a dummyUser that contains one of those fields in common with this user of email:String. It would take about 30-40 lines of code to print this function so I can explain it further if need be.
I have edited my code; the only way I have found so far to get this to work is by hacking it with Thread.sleep(). I think the problem may lie with anorm, as my Future logic constructs did work as I expected; the problem lies in the inconsistency between when writes occur and what the read returns. The numberOfCurrentMatches(email:String) function returns the number of matches; it is a simple SELECT count(email) from table where email='$email'. The problem is that sometimes after inserting 23 matches the count returns 0, then after a second iteration it returns 46. I assumed that onComplete() would bind to the underlying anorm function defined with DB.withConnection(), but apparently it may be too far removed to accomplish this. I am not really sure at this point what to research or look up further to get around this problem, other than writing a separate sort of supervisor function to return at a value closer to 150.
UPDATE
Thanks to the advice of users here, and trying to understand Scala's documentation at this link: Scala Futures and Promises
I have updated my code to be a bit more readable and scala-esque:
def genMatchOfTypes(email:String,iterationNumber:Int) = {
genDemoMatches(email,iterationNumber)
genIntMatches(email,iterationNumber)
genSchoolWorkMatches(email,iterationNumber)
}
def matchGeneration(email:String) = {
var currentIterationNumber = 0
var numberOfMatches = MatchData.numberOfCurrentMatches(email)
while (numberOfMatches < 150) {
println(s"current number of matches: $numberOfMatches")
Thread.sleep(30000)
generateUsers(email)
.flatMap(users => genMatchOfTypes(email,currentIterationNumber))
.flatMap(matches => incrementNumberOfMatches(email))
.map{
result =>
currentIterationNumber += 1
println(s"current Iteration2: $currentIterationNumber")
numberOfMatches = MatchData.numberOfCurrentMatches(email)
println(s"current number of matches2: $numberOfMatches")
}
}
}
I am still heavily dependent on the Thread.sleep(30000) to provide enough time to run through the while loop before it tries to loop back again. It's still an unwieldy hack. When I comment out the Thread.sleep(), my output in bash looks like this:
users for match generation createdcurrent number of matches: 0
[error] c.MatchDataController - here is the list: jnkj
[error] c.MatchDataController - here is the list: hbhjbjjnkjn
current number of matches: 0
current number of matches: 0
current number of matches: 0
current number of matches: 0
current number of matches: 0
This of course is a truncated output. It runs like this over and over until I get errors about too many open files and the JVM/play server crashes entirely.
One solution is to use Future.traverse when the iteration count is known in advance.
Assuming the following definitions:
object MatchData {
def numberOfCurrentMatches(email: String) = ???
}
def generateUsers(email: String): Future[Unit] = ???
def incrementNumberOfMatches(email: String): Future[Int] = ???
def genDemoMatches(email: String, it: Int): Future[Unit] = ???
def genIntMatches(email: String, it: Int): Future[Unit] = ???
def genSchoolWorkMatches(email: String, it: Int): Future[Unit] = ???
You can write code like
def matchGeneration(email: String, itNum: Int) = {
val numberOfMatches = MatchData.numberOfCurrentMatches(email)
Future.traverse(Stream.range(itNum, 150 - numberOfMatches + itNum)) { currentIterationNumber => for {
_ <- generateUsers(email)
_ = print(s">>>>>>>>>>>>>>>>>>>>>>>>>>>STEP 1")
_ <- genDemoMatches(email, currentIterationNumber)
_ = print(s">>>>>>>>>>>>>>>>>>>>>>>>>>>STEP 2")
_ <- genIntMatches(email, currentIterationNumber)
_ = print(s">>>>>>>>>>>>>>>>>>>>>>>>>>>STEP 3")
_ <- genSchoolWorkMatches(email, currentIterationNumber)
_ = Thread.sleep(15000)
_ = print(s">>>>>>>>>>>>>>>>>>>>>>>>>>>STEP 4")
numberOfMatches <- incrementNumberOfMatches(email)
_ = println(s"current number of matches: $numberOfMatches")
_ = println(s"current Iteration: $currentIterationNumber")
} yield ()
}
}
Update
If you need to check some condition on each iteration, one way is to use the monadic tools from the scalaz library. It defines a Monad instance for scala.concurrent.Future, so here you can read "monadic" as "asynchronous".
For example, StreamT.unfoldM can create a conditional monadic (asynchronous) loop; even if we don't need the elements of the resulting collection, we can still use it just for iteration.
Let's define your combined generation:
def generateAll(email: String, iterationNumber: Int): Future[Unit] = for {
_ <- generateUsers(email)
_ <- genDemoMatches(email, iterationNumber)
_ <- genIntMatches(email, iterationNumber)
_ <- genSchoolWorkMatches(email, iterationNumber)
} yield ()
Then iteration step
def generateStep(email: String, limit: Int)(iterationNumber: Int): Future[Option[(Unit, Int)]] =
if (MatchData.numberOfCurrentMatches(email) >= limit) Future(None)
else for {
_ <- generateAll(email, iterationNumber)
_ <- incrementNumberOfMatches(email)
next = iterationNumber + 1
} yield Some((), next)
Now our resulting function simplifies to
import scalaz._
import scalaz.std.scalaFuture._
def matchGeneration(email: String, itNum: Int): Future[Unit] =
StreamT.unfoldM(0)(generateStep(email, 150) _).toStream.map(_.force: Unit)
It looks like the synchronous method MatchData.numberOfCurrentMatches is reacting to your asynchronous modifications inside incrementNumberOfMatches. Note that in general this can lead to disastrous results, and you probably need to move that state inside an actor or something similar.
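The same conditional loop can also be sketched without scalaz, using a recursive function that chains Futures with flatMap, so that the count is only read after the previous step has completed. This is a minimal sketch: repeatUntil, currentCount and step are hypothetical names, and in the question's setting currentCount would wrap MatchData.numberOfCurrentMatches(email) while step would run the generation and increment for one iteration.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical helper: runs `step` repeatedly until `currentCount` reaches `limit`.
// The returned Future completes with the number of iterations performed.
def repeatUntil(limit: Int)(currentCount: () => Future[Int])(step: Int => Future[Unit])(
    implicit ec: ExecutionContext): Future[Int] = {
  def loop(iteration: Int): Future[Int] =
    currentCount().flatMap { count =>
      if (count >= limit) Future.successful(iteration)
      else step(iteration).flatMap(_ => loop(iteration + 1))
    }
  loop(0)
}

// Usage with an in-memory counter standing in for the database table:
// every step "inserts" 23 matches, as in the question.
val table = new AtomicInteger(0)
val iterations: Int = Await.result(
  repeatUntil(150)(() => Future.successful(table.get))(_ =>
    Future { table.addAndGet(23); () }(ExecutionContext.global)
  )(ExecutionContext.global),
  10.seconds
)
```

Because each iteration starts only inside the flatMap of the previous one, no Thread.sleep is needed and iterations never overlap.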

Efficient way to fold list in scala, while avoiding allocations and vars

I have a bunch of items in a list, and I need to analyze the content to find out how many of them are "complete". I started out with partition, but then realized that I didn't need two lists back, so I switched to a fold:
val counts = groupRows.foldLeft( (0,0) )( (pair, row) =>
if(row.time == 0) (pair._1+1,pair._2)
else (pair._1, pair._2+1)
)
but I have a lot of rows to go through for a lot of parallel users, and it is causing a lot of GC activity (assumption on my part...the GC could be from other things, but I suspect this since I understand it will allocate a new tuple on every item folded).
for the time being, I've rewritten this as
var complete = 0
var incomplete = 0
list.foreach(row => if(row.time != 0) complete += 1 else incomplete += 1)
which fixes the GC, but introduces vars.
I was wondering if there was a way of doing this without using vars while also not abusing the GC?
EDIT:
Hard call on the answers I've received. A var implementation seems to be considerably faster on large lists (like by 40%) than even a tail-recursive optimized version that is more functional but should be equivalent.
The first answer from dhg seems to be on-par with the performance of the tail-recursive one, implying that the size pass is super-efficient...in fact, when optimized it runs very slightly faster than the tail-recursive one on my hardware.
The cleanest two-pass solution is probably to just use the built-in count method:
val complete = groupRows.count(_.time == 0)
val counts = (complete, groupRows.size - complete)
But you can do it in one pass if you use partition on an iterator:
val (complete, incomplete) = groupRows.iterator.partition(_.time == 0)
val counts = (complete.size, incomplete.size)
This works because the new returned iterators are linked behind the scenes and calling next on one will cause it to move the original iterator forward until it finds a matching element, but it remembers the non-matching elements for the other iterator so that they don't need to be recomputed.
Example of the one-pass solution:
scala> val groupRows = List(Row(0), Row(1), Row(1), Row(0), Row(0)).view.map{x => println(x); x}
scala> val (complete, incomplete) = groupRows.iterator.partition(_.time == 0)
Row(0)
Row(1)
complete: Iterator[Row] = non-empty iterator
incomplete: Iterator[Row] = non-empty iterator
scala> val counts = (complete.size, incomplete.size)
Row(1)
Row(0)
Row(0)
counts: (Int, Int) = (3,2)
I see you've already accepted an answer, but you rightly mention that that solution will traverse the list twice. The way to do it efficiently is with recursion.
def counts(xs: List[...], complete: Int = 0, incomplete: Int = 0): (Int,Int) =
xs match {
case Nil => (complete, incomplete)
case row :: tail =>
if (row.time == 0) counts(tail, complete + 1, incomplete)
else counts(tail, complete, incomplete + 1)
}
This is effectively just a customized fold, except we use 2 accumulators which are just Ints (primitives) instead of tuples (reference types). It should also be just as efficient as a while-loop with vars; in fact, the bytecode should be identical.
Maybe it's just me, but I prefer using the various specialized folds (.size, .exists, .sum, .product) if they are available. I find it clearer and less error-prone than the heavy-duty power of general folds.
val complete = groupRows.view.filter(_.time==0).size
(complete, groupRows.length - complete)
How about this one? No import tax.
import scala.collection.generic.CanBuildFrom
import scala.collection.Traversable
import scala.collection.mutable.Builder
case class Count(n: Int, total: Int) {
def not = total - n
}
object Count {
implicit def cbf[A]: CanBuildFrom[Traversable[A], Boolean, Count] = new CanBuildFrom[Traversable[A], Boolean, Count] {
def apply(): Builder[Boolean, Count] = new Counter
def apply(from: Traversable[A]): Builder[Boolean, Count] = apply()
}
}
class Counter extends Builder[Boolean, Count] {
var n = 0
var ttl = 0
override def +=(b: Boolean) = { if (b) n += 1; ttl += 1; this }
override def clear() { n = 0 ; ttl = 0 }
override def result = Count(n, ttl)
}
object Counting extends App {
val vs = List(4, 17, 12, 21, 9, 24, 11)
val res: Count = vs map (_ % 2 == 0)
Console println s"${vs} have ${res.n} evens out of ${res.total}; ${res.not} were odd."
val res2: Count = vs collect { case i if i % 2 == 0 => i > 10 }
Console println s"${vs} have ${res2.n} evens over 10 out of ${res2.total}; ${res2.not} were smaller."
}
OK, inspired by the answers above, but really wanting to only pass over the list once and avoid GC, I decided that, in the face of a lack of direct API support, I would add this to my central library code:
class RichList[T](private val theList: List[T]) {
def partitionCount(f: T => Boolean): (Int, Int) = {
var matched = 0
var unmatched = 0
theList.foreach(r => { if (f(r)) matched += 1 else unmatched += 1 })
(matched, unmatched)
}
}
object RichList {
implicit def apply[T](list: List[T]): RichList[T] = new RichList(list)
}
Then in my application code (if I've imported the implicit), I can write var-free expressions:
val (complete, incomplete) = groupRows.partitionCount(_.time != 0)
and get what I want: an optimized GC-friendly routine that prevents me from polluting the rest of the program with vars.
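On Scala 2.10 and later, the wrapper class and its implicit conversion can be folded into a single implicit class. This is a sketch of that variant, not a change to the approach above; ListSyntax and PartitionCountOps are made-up names.

```scala
object ListSyntax {
  // Adds partitionCount to any List[T]; the vars stay local to the method.
  implicit class PartitionCountOps[T](private val xs: List[T]) {
    def partitionCount(p: T => Boolean): (Int, Int) = {
      var matched = 0
      var unmatched = 0
      xs.foreach(x => if (p(x)) matched += 1 else unmatched += 1)
      (matched, unmatched)
    }
  }
}

import ListSyntax._

// Single pass over the list, one tuple allocated at the end.
val (zeros, nonZeros) = List(0, 1, 1, 0, 0).partitionCount(_ == 0)
```

Application code only needs the import to use partitionCount as if it were a method on List.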
However, I then saw Luigi's benchmark, and updated it to:
Use a longer list so that multiple passes on the list were more obvious in the numbers
Use a boolean function in all cases, so that we are comparing things fairly
http://pastebin.com/2XmrnrrB
The var implementation is definitely considerably faster, even though Luigi's routine should be identical (as one would expect with optimized tail recursion). Surprisingly, dhg's dual-pass original is just as fast (slightly faster if compiler optimization is on) as the tail-recursive one. I do not understand why.
It is slightly tidier to use a mutable accumulator pattern, like so, especially if you can re-use your accumulator:
case class Accum(var complete: Int = 0, var incomplete: Int = 0) {
def inc(compl: Boolean): this.type = {
if (compl) complete += 1 else incomplete += 1
this
}
}
val counts = groupRows.foldLeft( Accum() ){ (a, row) => a.inc( row.time == 0 ) }
If you really want to, you can hide your vars as private; if not, you still are a lot more self-contained than the pattern with vars.
You could just calculate it using the difference like so:
def counts(groupRows: List[Row]) = {
val complete = groupRows.foldLeft(0){ (count, row) =>
if(row.time == 0) count + 1 else count
}
(complete, groupRows.length - complete)
}

Quicksort using Future ends up in a deadlock

I have written a quicksort (method quicksortF()) that uses Scala's Future to sort the partitions recursively and concurrently. I also have implemented a regular quicksort (method quicksort()). Unfortunately, the Future version ends up in a deadlock (apparently blocks forever) when the list to sort is greater than about 1000 elements (900 would work). The source is shown below.
I am relatively new to Actors and Futures. What is going wrong here?
Thanks!
import util.Random
import actors.Futures._
/**
* Quicksort with and without using the Future pattern.
* @author Markus Gumbel
*/
object FutureQuickSortProblem {
def main(args: Array[String]) {
val n = 1000 // works for n = 900 but not for 1000 anymore.
// Create a random list of size n:
val list = (1 to n).map(c => Random.nextInt(n * 10)).toList
println(list)
// Sort it with regular quicksort:
val sortedList = quicksort(list)
println(sortedList)
// ... and with quicksort using Future (which hangs):
val sortedListF = quicksortF(list)
println(sortedListF)
}
// This one works.
def quicksort(list: List[Int]): List[Int] = {
if (list.length <= 1) list
else {
val pivot = list.head
val leftList = list.filter(_ < pivot)
val middleList = list.filter(pivot == _)
val rightList = list.filter(_ > pivot)
val sortedLeftList = quicksort(leftList)
val sortedRightList = quicksort(rightList)
sortedLeftList ::: middleList ::: sortedRightList
}
}
// Almost the same as quicksort() except that Future is used.
// However, this one hangs.
def quicksortF(list: List[Int]): List[Int] = {
if (list.length <= 1) list
else {
val pivot = list.head
val leftList = list.filter(_ < pivot)
val middleList = list.filter(pivot == _)
val rightList = list.filter(_ > pivot)
// Same as quicksort() but here we are using a Future
// to sort the left and right partitions independently:
val sortedLeftListFuture = future {
quicksortF(leftList)
}
val sortedRightListFuture = future {
quicksortF(rightList)
}
sortedLeftListFuture() ::: middleList ::: sortedRightListFuture()
}
}
}
class FutureQuickSortProblem // If not defined, Intellij won't find the main method.?!
Disclaimer: I've never personally used the (pre-2.10) standard library's actors or futures in any serious way, and there are a number of things I don't like (or at least don't understand) about the API there, compared for example to the implementations in Scalaz or Akka or Play 2.0.
But I can tell you that the usual approach in a case like this is to combine your futures monadically instead of claiming them immediately and combining the results. For example, you could write something like this (note the new return type):
import scala.actors.Futures._
def quicksortF(list: List[Int]): Responder[List[Int]] = {
if (list.length <= 1) future(list)
else {
val pivot = list.head
val leftList = list.filter(_ < pivot)
val middleList = list.filter(pivot == _)
val rightList = list.filter(_ > pivot)
for {
left <- quicksortF(leftList)
right <- quicksortF(rightList)
} yield left ::: middleList ::: right
}
}
Like your vanilla implementation, this won't necessarily be very efficient, and it will also blow the stack pretty easily, but it shouldn't run out of threads.
As a side note, why does flatMap on a Future return a Responder instead of a Future? I don't know, and neither do some other folks. For reasons like this I'd suggest skipping the now-deprecated pre-2.10 standard library actor-based concurrency stuff altogether.
As I understand, calling apply on the Future (as you do when concatenating the results of the recursive calls) will block until the result is retrieved.
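To illustrate the difference between blocking on each future and composing them, here is a sketch that uses scala.concurrent.Future (Scala 2.10 and later) rather than the scala.actors API from the question. The name quicksortAsync is made up for the example. No task ever waits on another task; the only blocking call is the single Await at the very end.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Returns a Future instead of blocking for the sub-results.
def quicksortAsync(list: List[Int])(implicit ec: ExecutionContext): Future[List[Int]] =
  if (list.length <= 1) Future.successful(list)
  else {
    val pivot = list.head
    val middle = list.filter(_ == pivot)
    // Both partitions are scheduled before either result is needed.
    val leftF = Future(list.filter(_ < pivot)).flatMap(xs => quicksortAsync(xs))
    val rightF = Future(list.filter(_ > pivot)).flatMap(xs => quicksortAsync(xs))
    // Combine the results without blocking a thread.
    for {
      left <- leftF
      right <- rightF
    } yield left ::: middle ::: right
  }

// Usage: block once, at the outermost level only.
val sortedSample: List[Int] =
  Await.result(quicksortAsync(List(5, 3, 8, 1, 9, 2, 8))(ExecutionContext.global), 10.seconds)
```

Because the recursive calls happen inside flatMap callbacks, each level runs as its own task on the execution context instead of holding a thread while waiting for its children.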