akka split task into smaller and fold results - scala

The question is about Akka actors library. A want to split one big task into smaller tasks and then fold the result of them into one 'big' result. This will give me faster computation profit. Smaller tasks can be computed in parallel if they are independent.
Assume that we need to compute somethig like this. Function count2X is time consuming, so using it several times in one thread is not optimal.
//NOT OPTIMAL
def count2X(x: Int) = {
Thread.sleep(1000)
x * 2
}
val sum = count2X(1) + count2X(2) + count2X(3)
println(sum)
And here goes the question.
How to dispatch tasks and collect results and then fold them, all using akka actors?
Is such functionality already provided by Akka or do I need to implement it myself? What are best practisies in such approach.
Here is 'visual' interpretation of my question:
/-> [SMALL_TASK_1] -\
[BIG_TASK] -+--> [SMALL_TASK_1] --> [RESULT_FOLD]
\-> [SMALL_TASK_1] -/
Below is my scaffold implementation with missing/bad implementation :)
case class Count2X(x: Int)
class Count2XActor extends Actor {
def receive = {
case Count2X(x) => count2X(x); // AND NOW WHAT ?
}
}
case class CountSumOf2X(a: Int, b: Int, c: Int)
class SumOf2XActor extends Actor {
val aCounter = context.actorOf(Props[Count2XActor])
val bCounter = context.actorOf(Props[Count2XActor])
val cCounter = context.actorOf(Props[Count2XActor])
def receive = {
case CountSumOf2X(a, b, c) => // AND NOW WHAT ? aCounter ! Count2X(a); bCounter ! Count2X(b); cCounter ! Count2X(c);
}
}
val aSystem = ActorSystem("mySystem")
val actor = aSystem.actorOf(Props[SumOf2XActor])
actor ! CountSumOf2X(10, 20, 30)
Thanks for any help.

In Akka I would do something like this:
val a = aCounter ? Count2X(10) mapTo[Int]
val b = bCounter ? Count2X(10) mapTo[Int]
val c = cCounter ? Count2X(10) mapTo[Int]
Await.result(Future.sequence(a, b, c) map (_.sum), 1 second).asInstanceOf[Int]
I'm sure there is a better way - here you start summing results after all Future-s are complete in parallel, for simple task it's ok, but generally you shouldn't wait so long

Two things you could do:
1) Use Akka futures. These allow you to dispatch operations and fold on them in an asynchronous manner. Check out http://doc.akka.io/docs/akka/2.0.4/scala/futures.html for more information.
2) You can dispatch work to multiple "worker" actors and then have a "master" actor aggregate them, keeping track of which messages are pending/processed by storing information in the messages themselves. I have a simple stock quote example of this using Akka actors here: https://github.com/ryanlecompte/quotes

Related

Implementing a real "forall" on a list of scala futures

I am using scala (2.12) futures to apply a concurrent divide and conquer approach for a complex problem. Here is some (simplified) context:
def solve(list: List[Int], constraints: Con): Future[Boolean] =
Future.unit.flatMap{ _ =>
//positive case
if(list.isEmpty) Future.successful(true)
//negative case
else if(someTest(constraints)) Future.successful(false)
//divide and conquer
else {
//split to independent problems, according to constraints
val components: List[List[Int]] = split(list,constraints)
//update the constraints accordingly (heavy computation here)
val newConstr: Con = updateConstr(...)
val futureList = components.map(c => solve(c,newConstr))
allTrue(Future.successful(true), futureList)
}
}
This recursive function takes a list of integer variables and a Con object representing the problem constraints, and spawns multiple independent sub-problems during each call.
The relevant part for my question is the call to allTrue. If I was solving the problem sequentially, I would have written components.forall(c => solve(c,newConstr)). In the concurrent version however, I have something like
this, which doesn't stop computation at the first false case encountered.
//async continuation passing style "forall"
def allTrue(acc: Future[Boolean], remaining: List[Future[Boolean]]):
Future[Boolean] = {
remaining match {
case Nil => acc
case r :: tail => acc.flatMap{ b =>
if(b) allTrue(r,tail)
else{
//here, it would be more efficient to stop all other Futures
Future.successful(false)
}
}
}
}
I have read multiple blog posts and forum threads talking about how stopping scala futures is generally not a good idea, but in this case I think it would be very useful.
Any ideas on how to get the forall behaviour on a list of futures?
Simple approach without stopping futures would be Future.traverse
val all:Future[List[Boolean]] = Future.traverse(components)(c => solve(c, newConstr)
val forAll:Future[Boolean] = all.map(_.forall(identity))
For cancelable list of futures I would recommend to look at Observable pattern. In your case subscriber can unsubscribe as soon as it see False value and producer will stop calculations when no subscriber listens

Splitting a scalaz-stream process into two child streams

Using scalaz-stream is it possible to split/fork and then rejoin a stream?
As an example, let's say I have the following function
val streamOfNumbers : Process[Task,Int] = Process.emitAll(1 to 10)
val sumOfEvenNumbers = streamOfNumbers.filter(isEven).fold(0)(add)
val sumOfOddNumbers = streamOfNumbers.filter(isOdd).fold(0)(add)
zip( sumOfEven, sumOfOdd ).to( someEffectfulFunction )
With scalaz-stream, in this example the results would be as you expect - a tuple of numbers from 1 to 10 passed to a sink.
However if we replace streamOfNumbers with something that requires IO, it will actually execute the IO action twice.
Using a Topic I'm able create a pub/sub process that duplicates elements in the stream correctly, however it does not buffer - it simply consumers the entire source as fast as possible regardless of the pace sinks consume it.
I can wrap this in a bounded Queue, however the end result feels a lot more complex than it needs to be.
Is there a simpler way of splitting a stream in scalaz-stream without duplicate IO actions from the source?
Also to clarify the previous answer delas with the "splitting" requirement. The solution to your specific issue may be without the need of splitting streams:
val streamOfNumbers : Process[Task,Int] = Process.emitAll(1 to 10)
val oddOrEven: Process[Task,Int\/Int] = streamOfNumbers.map {
case even if even % 2 == 0 => right(even)
case odd => left(odd)
}
val summed = oddOrEven.pipeW(sump1).pipeO(sump1)
val evenSink: Sink[Task,Int] = ???
val oddSink: Sink[Task,Int] = ???
summed
.drainW(evenSink)
.to(oddSink)
You can perhaps still use topic and just assure that the children processes will subscribe before you will push to topic.
However please note this solution does not have any bounds on it, i.e. if you will be pushing too fast, you may encounter OOM error.
def split[A](source:Process[Task,A]): Process[Task,(Process[Task,A], Proces[Task,A])]] = {
val topic = async.topic[A]
val sub1 = topic.subscribe
val sub2 = topic.subscribe
merge.mergeN(Process(emit(sub1->sub2),(source to topic.publish).drain))
}
I likewise needed this functionality. My situation was quite a bit trickier disallowing me to work around it in this manner.
Thanks to Daniel Spiewak's response in this thread, I was able to get the following to work. I improved on his solution by adding onHalt so my application would exit once the Process completed.
def split[A](p: Process[Task, A], limit: Int = 10): Process[Task, (Process[Task, A], Process[Task, A])] = {
val left = async.boundedQueue[A](limit)
val right = async.boundedQueue[A](limit)
val enqueue = p.observe(left.enqueue).observe(right.enqueue).drain.onHalt { cause =>
Process.await(Task.gatherUnordered(Seq(left.close, right.close))){ _ => Halt(cause) }
}
val dequeue = Process((left.dequeue, right.dequeue))
enqueue merge dequeue
}

Factorial calculation using Scala actors

How to compute the factorial using Scala actors ?
And would it prove more time efficient compared to for instance
def factorial(n: Int): BigInt = (BigInt(1) to BigInt(n)).par.product
Many Thanks.
Problem
You have to split up your input in partial products. This partial products can then be calculated in parallel. The partial products are then multiplied to get the final product.
This can be reduced to a broader class of problems: The so called Parallel prefix calculation. You can read up about it on Wikipedia.
Short version: When you calculate a*b*c*d with an associative operation _ * _, you can structure the calculation a*(b*(c*d)) or (a*b)*(c*d). With the second approach, you can then calculate a*b and c*d in parallel and then calculate the final result from these partial results. Of course you can do this recursively, when you have a bigger number of input values.
Solution
Disclaimer
This sounds a little bit like a homework assignment. So I will provide a solution that has two properties:
It contains a small bug
It shows how to solve parallel prefix in general, without solving the problem directly
So you can see how the solution should be structured, but no one can use it to cheat on her homework.
Solution in detail
First I need a few imports
import akka.event.Logging
import java.util.concurrent.TimeUnit
import scala.concurrent.duration.FiniteDuration
import akka.actor._
Then I create some helper classes for the communication between the actors
case class Calculate[T](values : Seq[T], segment : Int, parallelLimit : Int, fn : (T,T) => T)
trait CalculateResponse
case class CalculationResult[T](result : T, index : Int) extends CalculateResponse
case object Busy extends CalculateResponse
Instead of telling the receiver you are busy, the actor could also use the stash or implement its own queue for partial results. But in this case I think the sender shoudl decide how much parallel calculations are allowed.
Now I create the actor:
class ParallelPrefixActor[T] extends Actor {
val log = Logging(context.system, this)
val subCalculation = Props(classOf[ParallelPrefixActor[BigInt]])
val fanOut = 2
def receive = waitForCalculation
def waitForCalculation : Actor.Receive = {
case c : Calculate[T] =>
log.debug(s"Start calculation for ${c.values.length} values, segment nr. ${c.index}, from ${c.values.head} to ${c.values.last}")
if (c.values.length < c.parallelLimit) {
log.debug("Calculating result direct")
val result = c.values.reduceLeft(c.fn)
sender ! CalculationResult(result, c.index)
}else{
val groupSize: Int = Math.max(1, (c.values.length / fanOut) + Math.min(c.values.length % fanOut, 1))
log.debug(s"Splitting calculation for ${c.values.length} values up to ${fanOut} children, ${groupSize} elements each, limit ${c.parallelLimit}")
def segments=c.values.grouped(groupSize)
log.debug("Starting children")
segments.zipWithIndex.foreach{case (values, index) =>
context.actorOf(subCalculation) ! c.copy(values = values, index = index)
}
val partialResults: Vector[T] = segments.map(_.head).to[Vector]
log.debug(s"Waiting for ${partialResults.length} results (${partialResults.indices})")
context.become(waitForResults(segments.length, partialResults, c, sender), discardOld = true)
}
}
def waitForResults(outstandingResults : Int, partialResults : Vector[T], originalRequest : Calculate[T], originalSender : ActorRef) : Actor.Receive = {
case c : Calculate[_] => sender ! Busy
case r : CalculationResult[T] =>
log.debug(s"Putting result ${r.result} on position ${r.index} in ${partialResults.length}")
val updatedResults = partialResults.updated(r.index, r.result)
log.debug("Killing sub-worker")
sender ! PoisonPill
if (outstandingResults==1) {
log.debug("Calculating result from partial results")
val result = updatedResults.reduceLeft(originalRequest.fn)
originalSender ! CalculationResult(result, originalRequest.index)
context.become(waitForCalculation, discardOld = true)
}else{
log.debug(s"Still waiting for ${outstandingResults-1} results")
// For fanOut > 2 one could here already combine consecutive partial results
context.become(waitForResults(outstandingResults-1, updatedResults, originalRequest, originalSender), discardOld = true)
}
}
}
Optimizations
Using parallel prefix calculation is not optimal. The actors calculating the the product of the bigger numbers will do much more work than the actors calculating the product of the smaller numbers (e.g. when calculating 1 * ... * 100 , it is faster to calculate 1 * ... * 10 than 90 * ... * 100). So it might be a good idea to shuffle the numbers, so big numbers will be mixed with small numbers. This works in this case, because we use an commutative operation. Parallel prefix calculation in general only needs an associative operation to work.
Performance
In theory
Performance of the actor solution is worse than the "naive" solution (using parallel collections) for small amounts of data. The actor solution will shine, when you make complex calculations or distribute your calculation on specialized hardware (e.g. graphics card or FPGA) or on multiple machines. With the actor you can control, who does which calculation and you can even restart "hanging calculations". This can give a big speed up.
On a single machine, the actor solution might help when you have a non-uniform memory architecture. You could then organize the actors in a way that pins memory to a certain processor.
Some measurement
I did some real performance measurement using a Scala worksheet in IntelliJ IDEA.
First I set up the actor system:
// Setup the actor system
val system = ActorSystem("root")
// Start one calculation actor
val calculationStart = Props(classOf[ParallelPrefixActor[BigInt]])
val calcolon = system.actorOf(calculationStart, "Calcolon-BigInt")
val inbox = Inbox.create(system)
Then I defined a helper method to measure time:
// Helper function to measure time
def time[A] (id : String)(f: => A) = {
val start = System.nanoTime()
val result = f
val stop = System.nanoTime()
println(s"""Time for "${id}": ${(stop-start)*1e-6d}ms""")
result
}
And then I did some performance measurement:
// Test code
val limit = 10000
def testRange = (1 to limit).map(BigInt(_))
time("par product")(testRange.par.product)
val timeOut = FiniteDuration(240, TimeUnit.SECONDS)
inbox.send(calcolon, Calculate[BigInt]((1 to limit).map(BigInt(_)), 0, 10, _ * _))
time("actor product")(inbox.receive(timeOut))
time("par sum")(testRange.par.sum)
inbox.send(calcolon, Calculate[BigInt](testRange, 0, 5, _ + _))
time("actor sum")(inbox.receive(timeOut))
I got the following results
> Time for "par product": 134.38289ms
res0: scala.math.BigInt = 284625968091705451890641321211986889014805140170279923
079417999427441134000376444377299078675778477581588406214231752883004233994015
351873905242116138271617481982419982759241828925978789812425312059465996259867
065601615720360323979263287367170557419759620994797203461536981198970926112775
004841988454104755446424421365733030767036288258035489674611170973695786036701
910715127305872810411586405612811653853259684258259955846881464304255898366493
170592517172042765974074461334000541940524623034368691540594040662278282483715
120383221786446271838229238996389928272218797024593876938030946273322925705554
596900278752822425443480211275590191694254290289169072190970836905398737474524
833728995218023632827412170402680867692104515558405671725553720158521328290342
799898184493136...
Time for "actor product": 1310.217247ms
res2: Any = CalculationResult(28462596809170545189064132121198688901480514017027
992307941799942744113400037644437729907867577847758158840621423175288300423399
401535187390524211613827161748198241998275924182892597878981242531205946599625
986706560161572036032397926328736717055741975962099479720346153698119897092611
277500484198845410475544642442136573303076703628825803548967461117097369578603
670191071512730587281041158640561281165385325968425825995584688146430425589836
649317059251717204276597407446133400054194052462303436869154059404066227828248
371512038322178644627183822923899638992827221879702459387693803094627332292570
555459690027875282242544348021127559019169425429028916907219097083690539873747
452483372899521802363282741217040268086769210451555840567172555372015852132829
034279989818449...
> Time for "par sum": 6.488620999999999ms
res3: scala.math.BigInt = 50005000
> Time for "actor sum": 657.752832ms
res5: Any = CalculationResult(50005000,0)
You can easily see that the actor version is much slower than using parallel collections.

how to solve an equation in scala using actors?

I want to know how an actor returns a value to the sender and how to store it in a variable.
For example, consider that we have to find the sum of squares of 2 numbers and print it.
i.e., sum = a2 + b2
I have 2 actors. 1 actor computes square of any number passed to it (say, SquareActor). The other actor sends the two numbers (a , b) to the SquareActor and computes their sum (say, SumActor)
/** Actor to find the square of a number */
class SquareActor (x: Int) extends Actor
{
def act()
{
react{
case x : Int => println (x * x)
// how to return the value of x*x to "SumActor" ?
}
}
}
/** Actor to find the sum of squares of a and b */
class SumActor (a: Int, b:Int) extends Actor
{
def act()
{
var a2 = 0
var b2 = 0
val squareActor = new SquareActor (a : Int)
squareActor.start
// call squareActor to get a*a
squareActor ! a
// How to get the value returned by SquareActor and store it in the variable 'a2' ?
// call squareActor to get b*b
squareActor ! b
// How to get the value returned by SquareActor and store it in the variable 'b2' ?
println ("Sum: " + a2+b2)
}
}
Pardon me if the above is not possible; I think my basic understanding of actors may itself be wrong.
Use Akka
Note that from Scala 2.10, the Akka actor library is an integrated part of the standard library. It is generally considered superior to the standard actor library, so getting familiar with that would benefit you.
Use Futures
Also note that what you want to achieve is easier and nicer (composes better) using Futures. A Future[A] represents a possibly concurrent computation, eventually yielding a result of type A.
def asyncSquare(x: Int): Future[Int] = Future(x * x)
val sq1 = asyncSquare(2)
val sq2 = asyncSquare(3)
val asyncSum =
for {
a <- sq1
b <- sq2
}
yield (a + b)
Note that the asyncSquare results are queried in advance to start their (independent) computations as soon as possible. Putting the calls inside the for comprehension would have serialized their execution, not using the possible concurrency.
You use Future-s in for comprehensions, map, flatMap, zip, sequence them, and in the very end, you can get the computed value using Await, which is a blocking operation, or using registered callbacks.
Use Futures with actors
It is handy that you can ask from actors, which results in a Future:
val futureResult: Future[Int] = (someActor ? 5).mapTo[Int]
Note the need to use of mapTo because the message passing interface of actors is not typed (there are however typed actors).
Bottom line
If you want to perform stateless computations in parallel, stick to plain Futures. If you need stateful but local computations, you can still use Future and thread the state yourself (or use scalaz StateT monad transformer + Future as monad, if you are on to that business). If you need computations which require global state, then isolate that state into an actor, and interact with that actor, possibly using Futures.
Remember that actors work by message passing. So to get the response from the SquareActor back to the SumActor, you need to send it as a message from the SquareActor, and add a handler to the SumActor.
Also, your SquareActor constructor doesn't need an integer parameter.
That is, in your SquareActor, instead of just printing x * x, pass it to the SumActor:
class SquareActor extends Actor
{
def act()
{
react{
case x : Int => sender ! (x * x)
}
}
}
(sender causes it to send the message to the actor that sent the message it is reacting to.)
In your SumActor, after you send a and b to the SquareActor, handle the received reply messages:
react {
case a2 : Int => react {
case b2 : Int => println ("Sum: " + (a2+b2))
}
}

Executing a simple task on another thread in scala

I was wondering if there was a way to execute very simple tasks on another thread in scala that does not have a lot of overhead?
Basically I would like to make a global 'executor' that can handle executing an arbitrary number of tasks. I can then use the executor to build up additional constructs.
Additionally it would be nice if blocking or non-blocking considerations did not have to be considered by the clients.
I know that the scala actors library is built on top of the Doug Lea FJ stuff, and also that they support to a limited degree what I am trying to accomplish. However from my understanding I will have to pre-allocate an 'Actor Pool' to accomplish.
I would like to avoid making a global thread pool for this, as from what I understand it is not all that good at fine grained parallelism.
Here is a simple example:
import concurrent.SyncVar
object SimpleExecutor {
import actors.Actor._
def exec[A](task: => A) : SyncVar[A] = {
//what goes here?
//This is what I currently have
val x = new concurrent.SyncVar[A]
//The overhead of making the actor appears to be a killer
actor {
x.set(task)
}
x
}
//Not really sure what to stick here
def execBlocker[A](task: => A) : SyncVar[A] = exec(task)
}
and now an example of using exec:
object Examples {
//Benchmarks a task
def benchmark(blk : => Unit) = {
val start = System.nanoTime
blk
System.nanoTime - start
}
//Benchmarks and compares 2 tasks
def cmp(a: => Any, b: => Any) = {
val at = benchmark(a)
val bt = benchmark(b)
println(at + " " + bt + " " +at.toDouble / bt)
}
//Simple example for simple non blocking comparison
import SimpleExecutor._
def paraAdd(hi: Int) = (0 until hi) map (i=>exec(i+5)) foreach (_.get)
def singAdd(hi: Int) = (0 until hi) foreach (i=>i+5)
//Simple example for the blocking performance
import Thread.sleep
def paraSle(hi : Int) = (0 until hi) map (i=>exec(sleep(i))) foreach (_.get)
def singSle(hi : Int) = (0 until hi) foreach (i=>sleep(i))
}
Finally to run the examples (might want to do it a few times so HotSpot can warm up):
import Examples._
cmp(paraAdd(10000), singAdd(10000))
cmp(paraSle(100), singSle(100))
That's what Futures was made for. Just import scala.actors.Futures._, use future to create new futures, methods like awaitAll to wait on the results for a while, apply or respond to block until the result is received, isSet to see if it's ready or not, etc.
You don't need to create a thread pool either. Or, at least, not normally you don't. Why do you think you do?
EDIT
You can't gain performance parallelizing something as simple as an integer addition, because that's even faster than a function call. Concurrency will only bring performance by avoiding time lost to blocking i/o and by using multiple CPU cores to execute tasks in parallel. In the latter case, the task must be computationally expensive enough to offset the cost of dividing the workload and merging the results.
One other reason to go for concurrency is to improve the responsiveness of the application. That's not making it faster, that's making it respond faster to the user, and one way of doing that is getting even relatively fast operations offloaded to another thread so that the threads handling what the user sees or does can be faster. But I digress.
There's a serious problem with your code:
def paraAdd(hi: Int) = (0 until hi) map (i=>exec(i+5)) foreach (_.get)
def singAdd(hi: Int) = (0 until hi) foreach (i=>i+5)
Or, translating into futures,
def paraAdd(hi: Int) = (0 until hi) map (i=>future(i+5)) foreach (_.apply)
def singAdd(hi: Int) = (0 until hi) foreach (i=>i+5)
You might think paraAdd is doing the tasks in paralallel, but it isn't, because Range has a non-strict implementation of map (that's up to Scala 2.7; starting with Scala 2.8.0, Range is strict). You can look it up on other Scala questions. What happens is this:
A range is created from 0 until hi
A range projection is created from each element i of the range into a function that returns future(i+5) when called.
For each element of the range projection (i => future(i+5)), the element is evaluated (foreach is strict) and then the function apply is called on it.
So, because future is not called in step 2, but only in step 3, you'll wait for each future to complete before doing the next one. You can fix it with:
def paraAdd(hi: Int) = (0 until hi).force map (i=>future(i+5)) foreach (_.apply)
Which will give you better performance, but never as good as a simple immediate addition. On the other hand, suppose you do this:
def repeat(n: Int, f: => Any) = (0 until n) foreach (_ => f)
def paraRepeat(n: Int, f: => Any) =
(0 until n).force map (_ => future(f)) foreach (_.apply)
And then compare:
cmp(repeat(100, singAdd(100000)), paraRepeat(100, singAdd(100000)))
You may start seeing gains (it will depend on the number of cores and processor speed).