Handling an infinite number of messages (akka) [closed] - scala

I have to write code in which Actor A produces an infinite stream of numbers that Actor B consumes. Actor A outputs the sequence x, f(x), g(f(x)), etc., where f(x) = 10 if x is 0 and 3x otherwise, and where g(x) is x/2, i.e.:
Output: x = 0, f(x) = 10, g(f(x)) = 5 (3 messages); the next 3 messages should then be f(g(f(x))), g(f(g(f(x)))), f(g(f(g(f(x))))) and their values, where each result becomes the input of the next function application.
Actor B deals with numbers 3 at a time and it should print each triple tabulated on the same line with the average of the 3 numbers.
The value (0) is passed to ActorA from the main method.
My attempt:
import akka.actor._
class ActorA(processB:ActorRef) extends Actor with ActorLogging{
def f(x : Int) = if(x == 0) 10 else 3 * x
def g(x : Int) = x / 2
def receive = {
case 0 =>
val x = 0
//processB ! (x)
// processB ! (f(x))
// processB ! (g(f(x)))
println( (f(x)))
println((g(f(x))))
/*case Stop =>
Console.println("Stop")
processB ! Stop
context stop self */
}
}
class ActorB extends Actor with ActorLogging{
def receive = {
case ActorA =>
case Stop =>
Console.println("Stop")
exit()
}
}
case object ActorA
case object ActorB
case object Stop
object messages {
def main(args: Array[String]) :Unit = {
val system = ActorSystem("actors")
val processB = system.actorOf(Props[ActorB])
val actorA = system.actorOf(Props(new ActorA(processB)))
actorA ! 0
}
}
How to produce an INFINITE number of messages and can I deal with them 3 at a time? Thanks

To get an infinite sequence you can use a Stream.
Derek Wyatt has a good blog article on them and how generating Fibonacci numbers works:
http://www.derekwyatt.org/2011/07/29/understanding-scala-streams-through-fibonacci/
You can use the same basic principle for your sequence which, if I understand correctly, is alternately applying the f and g function on the previous value in the stream.
You can write that as follows:
lazy val stream: Stream[Int] = x #:: stream.zipWithIndex.map {
case (p,i) => if (i%2 == 0) f(p) else g(p)
}
You can then split the stream into chunks of 3 using grouped.
Here I've done that and then converted each resulting Stream[Int] of size 3 into a tuple for convenience:
val chunks: Iterator[(Int,Int,Int)] = stream.grouped(3).map { s =>
(s.head, s.tail.head, s.tail.tail.head)
}
You can then make use of that however you want, sending the tuple to the other actor if you so wish.
At the other side you can match that tuple as follows:
case (a:Int, b:Int, c:Int) => ...
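As a minimal sketch of the same idea that avoids Stream (deprecated since Scala 2.13 in favor of LazyList), the sequence can also be built with Iterator.iterate. The names TripleDemo, sequence and triples are made up for this illustration; f and g are the functions from the question, and the average is added because Actor B is supposed to print it.

```scala
object TripleDemo {
  def f(x: Int): Int = if (x == 0) 10 else 3 * x
  def g(x: Int): Int = x / 2

  // Lazily yields x, f(x), g(f(x)), f(g(f(x))), ... by carrying the
  // position alongside the value and alternating f and g on it.
  def sequence(x: Int): Iterator[Int] =
    Iterator
      .iterate((x, 0)) { case (p, i) => (if (i % 2 == 0) f(p) else g(p), i + 1) }
      .map(_._1)

  // Groups the infinite sequence into triples plus their average.
  def triples(x: Int): Iterator[(Int, Int, Int, Double)] =
    sequence(x).grouped(3).map(s => (s(0), s(1), s(2), s.sum / 3.0))
}
```

Because the iterator is lazy, only as many triples are computed as are requested, for example TripleDemo.triples(0).take(2) for the first six numbers.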

Actors have inboxes that will fill up, and there's no inherent backpressure to tell the producer to wait. See: How to use Akka BoundedMailBox to throttle a producer
In general, you'll probably want B to explicitly send messages to A asking for data.
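To illustrate what backpressure buys you, here is a small plain-JVM sketch that is deliberately not Akka: a bounded queue stands in for a bounded mailbox, so the producer blocks whenever the consumer falls behind. BoundedHandoff and its parameters are hypothetical names for this example only.

```scala
import java.util.concurrent.ArrayBlockingQueue

object BoundedHandoff {
  // Sends the numbers 1..total through a queue holding at most `capacity`
  // elements and returns them in the order the consumer received them.
  def run(total: Int, capacity: Int): List[Int] = {
    val queue = new ArrayBlockingQueue[Int](capacity)
    val producer = new Thread(new Runnable {
      // put() blocks while the queue is full: this is the backpressure.
      def run(): Unit = (1 to total).foreach(i => queue.put(i))
    })
    producer.start()
    // take() blocks while the queue is empty.
    val received = List.fill(total)(queue.take())
    producer.join()
    received
  }
}
```

With actors, the same effect is usually achieved without blocking, by letting B tell A how many elements it is ready to receive.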

Related

Implementing a real "forall" on a list of scala futures

I am using scala (2.12) futures to apply a concurrent divide and conquer approach for a complex problem. Here is some (simplified) context:
def solve(list: List[Int], constraints: Con): Future[Boolean] =
Future.unit.flatMap{ _ =>
//positive case
if(list.isEmpty) Future.successful(true)
//negative case
else if(someTest(constraints)) Future.successful(false)
//divide and conquer
else {
//split to independent problems, according to constraints
val components: List[List[Int]] = split(list,constraints)
//update the constraints accordingly (heavy computation here)
val newConstr: Con = updateConstr(...)
val futureList = components.map(c => solve(c,newConstr))
allTrue(Future.successful(true), futureList)
}
}
This recursive function takes a list of integer variables and a Con object representing the problem constraints, and spawns multiple independent sub-problems during each call.
The relevant part for my question is the call to allTrue. If I was solving the problem sequentially, I would have written components.forall(c => solve(c,newConstr)). In the concurrent version however, I have something like this, which doesn't stop computation at the first false case encountered.
//async continuation passing style "forall"
def allTrue(acc: Future[Boolean], remaining: List[Future[Boolean]]):
Future[Boolean] = {
remaining match {
case Nil => acc
case r :: tail => acc.flatMap{ b =>
if(b) allTrue(r,tail)
else{
//here, it would be more efficient to stop all other Futures
Future.successful(false)
}
}
}
}
I have read multiple blog posts and forum threads talking about how stopping scala futures is generally not a good idea, but in this case I think it would be very useful.
Any ideas on how to get the forall behaviour on a list of futures?
A simple approach that does not stop futures would be Future.traverse:
val all: Future[List[Boolean]] = Future.traverse(components)(c => solve(c, newConstr))
val forAll: Future[Boolean] = all.map(_.forall(identity))
For a cancelable list of futures, I would recommend looking at the Observable pattern. In your case the subscriber can unsubscribe as soon as it sees a false value, and the producer will stop calculating when no subscriber is listening.
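A middle ground, sketched here under the assumption that it is enough for the result to become available early, is a forall that completes as soon as any future yields false. It does not cancel the remaining futures, which keep running in the background. FutureForall and demo are names invented for this sketch.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object FutureForall {
  // Completes with false on the first false result, with true once every
  // future has returned true, and with a failure if any future fails first.
  def forall(futures: List[Future[Boolean]])(implicit ec: ExecutionContext): Future[Boolean] =
    if (futures.isEmpty) Future.successful(true)
    else {
      val promise = Promise[Boolean]()
      val remaining = new AtomicInteger(futures.size)
      futures.foreach(_.onComplete {
        case Success(false) => promise.trySuccess(false)
        case Success(true)  => if (remaining.decrementAndGet() == 0) promise.trySuccess(true)
        case Failure(e)     => promise.tryFailure(e)
      })
      promise.future
    }

  // One input never completes, yet the overall result is still available.
  def demo(): Boolean = {
    import ExecutionContext.Implicits.global
    val never = Promise[Boolean]().future
    Await.result(forall(List(never, Future.successful(false))), 2.seconds)
  }
}
```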

How to log flow rate in Akka Stream?

I have an Akka Stream application with a single flow/graph. I want to measure the flow rate at the source and log it every 5 seconds, like 'received 3 messages in the last 5 seconds'. I tried with,
someOtherFlow
.groupedWithin(Integer.MAX_VALUE, 5 seconds)
.runForeach(seq =>
log.debug(s"received ${seq.length} messages in the last 5 seconds")
)
but it only outputs when there are messages, no empty list when there are 0 messages. I want the 0's as well. Is this possible?
You could try something like
src
.conflateWithSeed(_ ⇒ 1){ case (acc, _) ⇒ acc + 1 }
.zip(Source.tick(5.seconds, 5.seconds, NotUsed))
.map(_._1)
which should batch your elements until the tick releases them. This is inspired by an example in the docs.
On a different note, if you need this for monitoring purposes, you could leverage a 3rd party tool for this purpose - e.g. Kamon.
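To make the desired output concrete, here is a plain-Scala sketch (not Akka Streams) of the counts such a pipeline is meant to report, including the empty windows. WindowCounts and its parameters are hypothetical; timestamps are assumed to be milliseconds measured from the start of the stream.

```scala
object WindowCounts {
  // Counts how many timestamps fall into each of the first `windows`
  // consecutive windows of length `windowMs`, reporting 0 for empty ones.
  def counts(timestampsMs: Seq[Long], windowMs: Long, windows: Int): Vector[Int] = {
    val byWindow = timestampsMs.groupBy(_ / windowMs).map { case (k, v) => k -> v.size }
    Vector.tabulate(windows)(i => byWindow.getOrElse(i.toLong, 0))
  }
}
```

For instance, messages at 100, 200, 4900 and 11000 ms with 5-second windows give the counts 3, 0, 1.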
A sample of Akka Streams logging:
implicit val system: ActorSystem = ActorSystem("StreamLoggingActorSystem")
implicit val materializer: ActorMaterializer = ActorMaterializer()
implicit val adapter: LoggingAdapter = Logging(system, "customLogger")
implicit val ec: ExecutionContextExecutor = system.dispatcher
def randomInt = Random.nextInt()
val source = Source.repeat(NotUsed).map(_ ⇒ randomInt)
val logger = source
.groupedWithin(Integer.MAX_VALUE, 5.seconds)
.log(s"in the last 5 seconds number of messages received : ", _.size)
.withAttributes(
Attributes.logLevels(
onElement = Logging.WarningLevel,
onFinish = Logging.InfoLevel,
onFailure = Logging.DebugLevel
)
)
val sink = Sink.ignore
val result: Future[Done] = logger.runWith(sink)
result.onComplete{
case Success(_) =>
println("end of stream")
case Failure(_) =>
println("stream ended with failure")
}
source code is here.
Extending Stefano's answer a little I created the following flows:
def flowRate[T](metric: T => Int = (_: T) => 1, outputDelay: FiniteDuration = 1 second): Flow[T, Double, NotUsed] =
Flow[T]
.conflateWithSeed(metric(_)){ case (acc, x) ⇒ acc + metric(x) }
.zip(Source.tick(outputDelay, outputDelay, NotUsed))
.map(_._1.toDouble / outputDelay.toUnit(SECONDS))
def printFlowRate[T](name: String, metric: T => Int = (_: T) => 1,
outputDelay: FiniteDuration = 1 second): Flow[T, T, NotUsed] =
Flow[T]
.alsoTo(flowRate[T](metric, outputDelay)
.to(Sink.foreach(r => log.info(s"Rate($name): $r"))))
The first converts the flow into a rate per second. You can supply a metric which gives a value to each object passing through. Say you want to measure the rate of characters in a flow of strings then you could pass _.length. The second parameter is the delay between flow rate reports (defaults to one second).
The second flow can be used inline to print the flow rate for debugging purposes without modifying the value passing through the stream. eg
stringFlow
.via(printFlowRate[String]("Char rate", _.length, 10 seconds))
.map(_.toLowerCase) // still a string
...
which will show, every 10 seconds, the average rate (per second) of characters.
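The arithmetic behind the reported number can be sketched without Akka as follows. RateSketch is a made-up name, and batch stands for the elements that passed through during one reporting window.

```scala
object RateSketch {
  // Sum of the metric over one window, divided by the window length in seconds.
  def ratePerSecond[T](batch: Seq[T], metric: T => Int, windowSeconds: Double): Double =
    batch.map(metric).sum / windowSeconds
}
```

Two strings with 4 and 6 characters seen within a 10-second window therefore give a rate of 1.0 characters per second.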
N.B. The above flowRate would however be lagging one outputDelay period behind, because the zip will consume from the conflate and then wait for a tick (which can be easily verified by putting a log after the conflateWithSeed). To obtain a non lagging flow rate (metric), one could duplicate the tick, in order to force the zip to consume a second fresh element from the conflate, and then aggregate both ticks, i.e.:
Flow[T]
.conflateWithSeed(metric(_)){case (acc, x) => acc + metric(x) }
.zip(Source.tick(outputDelay, outputDelay, NotUsed)
.mapConcat(_ => List(NotUsed, NotUsed))
)
.grouped(2).map {
case Seq((a, _), (b, _)) => a + b
}
.map(_.toDouble / outputDelay.toUnit(SECONDS))

Factorial calculation using Scala actors

How to compute the factorial using Scala actors ?
And would it prove more time efficient compared to for instance
def factorial(n: Int): BigInt = (BigInt(1) to BigInt(n)).par.product
Many Thanks.
Problem
You have to split up your input into partial products. These partial products can then be calculated in parallel. The partial products are then multiplied to get the final product.
This can be reduced to a broader class of problems: the so-called parallel prefix calculation. You can read up on it on Wikipedia.
Short version: When you calculate a*b*c*d with an associative operation _ * _, you can structure the calculation a*(b*(c*d)) or (a*b)*(c*d). With the second approach, you can then calculate a*b and c*d in parallel and then calculate the final result from these partial results. Of course you can do this recursively, when you have a bigger number of input values.
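The regrouping described above can be sketched with plain Futures instead of actors. This is only an illustration of the (a*b)*(c*d) idea for an arbitrary associative operation; ChunkedReduce, reduceInChunks and demo are hypothetical names, and the input is assumed to be non-empty.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ChunkedReduce {
  // Reduces each chunk in its own Future, then combines the partial results
  // in their original order. `op` must be associative; `values` must be non-empty.
  def reduceInChunks[T](values: Seq[T], chunkSize: Int)(op: (T, T) => T)(
      implicit ec: ExecutionContext): Future[T] = {
    val partials = values.grouped(chunkSize).toList.map(chunk => Future(chunk.reduceLeft(op)))
    Future.sequence(partials).map(_.reduceLeft(op))
  }

  // Example: the product 1 * 2 * ... * 10, computed in chunks of three.
  def demo(): BigInt = {
    import ExecutionContext.Implicits.global
    Await.result(reduceInChunks((1 to 10).map(BigInt(_)), 3)(_ * _), 10.seconds)
  }
}
```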
Solution
Disclaimer
This sounds a little bit like a homework assignment. So I will provide a solution that has two properties:
It contains a small bug
It shows how to solve parallel prefix in general, without solving the problem directly
So you can see how the solution should be structured, but no one can use it to cheat on her homework.
Solution in detail
First I need a few imports
import akka.event.Logging
import java.util.concurrent.TimeUnit
import scala.concurrent.duration.FiniteDuration
import akka.actor._
Then I create some helper classes for the communication between the actors
case class Calculate[T](values : Seq[T], index : Int, parallelLimit : Int, fn : (T,T) => T)
trait CalculateResponse
case class CalculationResult[T](result : T, index : Int) extends CalculateResponse
case object Busy extends CalculateResponse
Instead of telling the receiver you are busy, the actor could also use the stash or implement its own queue for partial results. But in this case I think the sender should decide how many parallel calculations are allowed.
Now I create the actor:
class ParallelPrefixActor[T] extends Actor {
val log = Logging(context.system, this)
val subCalculation = Props(classOf[ParallelPrefixActor[BigInt]])
val fanOut = 2
def receive = waitForCalculation
def waitForCalculation : Actor.Receive = {
case c : Calculate[T] =>
log.debug(s"Start calculation for ${c.values.length} values, segment nr. ${c.index}, from ${c.values.head} to ${c.values.last}")
if (c.values.length < c.parallelLimit) {
log.debug("Calculating result direct")
val result = c.values.reduceLeft(c.fn)
sender ! CalculationResult(result, c.index)
}else{
val groupSize: Int = Math.max(1, (c.values.length / fanOut) + Math.min(c.values.length % fanOut, 1))
log.debug(s"Splitting calculation for ${c.values.length} values up to ${fanOut} children, ${groupSize} elements each, limit ${c.parallelLimit}")
def segments=c.values.grouped(groupSize)
log.debug("Starting children")
segments.zipWithIndex.foreach{case (values, index) =>
context.actorOf(subCalculation) ! c.copy(values = values, index = index)
}
val partialResults: Vector[T] = segments.map(_.head).to[Vector]
log.debug(s"Waiting for ${partialResults.length} results (${partialResults.indices})")
context.become(waitForResults(segments.length, partialResults, c, sender), discardOld = true)
}
}
def waitForResults(outstandingResults : Int, partialResults : Vector[T], originalRequest : Calculate[T], originalSender : ActorRef) : Actor.Receive = {
case c : Calculate[_] => sender ! Busy
case r : CalculationResult[T] =>
log.debug(s"Putting result ${r.result} on position ${r.index} in ${partialResults.length}")
val updatedResults = partialResults.updated(r.index, r.result)
log.debug("Killing sub-worker")
sender ! PoisonPill
if (outstandingResults==1) {
log.debug("Calculating result from partial results")
val result = updatedResults.reduceLeft(originalRequest.fn)
originalSender ! CalculationResult(result, originalRequest.index)
context.become(waitForCalculation, discardOld = true)
}else{
log.debug(s"Still waiting for ${outstandingResults-1} results")
// For fanOut > 2 one could here already combine consecutive partial results
context.become(waitForResults(outstandingResults-1, updatedResults, originalRequest, originalSender), discardOld = true)
}
}
}
Optimizations
Using parallel prefix calculation is not optimal. The actors calculating the product of the bigger numbers will do much more work than the actors calculating the product of the smaller numbers (e.g. when calculating 1 * ... * 100, it is faster to calculate 1 * ... * 10 than 90 * ... * 100). So it might be a good idea to shuffle the numbers, so that big numbers are mixed with small numbers. This works in this case because we use a commutative operation. Parallel prefix calculation in general only needs an associative operation to work.
Performance
In theory
Performance of the actor solution is worse than the "naive" solution (using parallel collections) for small amounts of data. The actor solution will shine when you make complex calculations or distribute your calculation on specialized hardware (e.g. graphics card or FPGA) or on multiple machines. With actors you can control who does which calculation, and you can even restart "hanging calculations". This can give a big speed-up.
On a single machine, the actor solution might help when you have a non-uniform memory architecture. You could then organize the actors in a way that pins memory to a certain processor.
Some measurement
I did some real performance measurement using a Scala worksheet in IntelliJ IDEA.
First I set up the actor system:
// Setup the actor system
val system = ActorSystem("root")
// Start one calculation actor
val calculationStart = Props(classOf[ParallelPrefixActor[BigInt]])
val calcolon = system.actorOf(calculationStart, "Calcolon-BigInt")
val inbox = Inbox.create(system)
Then I defined a helper method to measure time:
// Helper function to measure time
def time[A] (id : String)(f: => A) = {
val start = System.nanoTime()
val result = f
val stop = System.nanoTime()
println(s"""Time for "${id}": ${(stop-start)*1e-6d}ms""")
result
}
And then I did some performance measurement:
// Test code
val limit = 10000
def testRange = (1 to limit).map(BigInt(_))
time("par product")(testRange.par.product)
val timeOut = FiniteDuration(240, TimeUnit.SECONDS)
inbox.send(calcolon, Calculate[BigInt]((1 to limit).map(BigInt(_)), 0, 10, _ * _))
time("actor product")(inbox.receive(timeOut))
time("par sum")(testRange.par.sum)
inbox.send(calcolon, Calculate[BigInt](testRange, 0, 5, _ + _))
time("actor sum")(inbox.receive(timeOut))
I got the following results
> Time for "par product": 134.38289ms
res0: scala.math.BigInt = 284625968091705451890641321211986889014805140170279923
079417999427441134000376444377299078675778477581588406214231752883004233994015
351873905242116138271617481982419982759241828925978789812425312059465996259867
065601615720360323979263287367170557419759620994797203461536981198970926112775
004841988454104755446424421365733030767036288258035489674611170973695786036701
910715127305872810411586405612811653853259684258259955846881464304255898366493
170592517172042765974074461334000541940524623034368691540594040662278282483715
120383221786446271838229238996389928272218797024593876938030946273322925705554
596900278752822425443480211275590191694254290289169072190970836905398737474524
833728995218023632827412170402680867692104515558405671725553720158521328290342
799898184493136...
Time for "actor product": 1310.217247ms
res2: Any = CalculationResult(28462596809170545189064132121198688901480514017027
992307941799942744113400037644437729907867577847758158840621423175288300423399
401535187390524211613827161748198241998275924182892597878981242531205946599625
986706560161572036032397926328736717055741975962099479720346153698119897092611
277500484198845410475544642442136573303076703628825803548967461117097369578603
670191071512730587281041158640561281165385325968425825995584688146430425589836
649317059251717204276597407446133400054194052462303436869154059404066227828248
371512038322178644627183822923899638992827221879702459387693803094627332292570
555459690027875282242544348021127559019169425429028916907219097083690539873747
452483372899521802363282741217040268086769210451555840567172555372015852132829
034279989818449...
> Time for "par sum": 6.488620999999999ms
res3: scala.math.BigInt = 50005000
> Time for "actor sum": 657.752832ms
res5: Any = CalculationResult(50005000,0)
You can easily see that the actor version is much slower than using parallel collections.

akka split task into smaller and fold results

The question is about the Akka actors library. I want to split one big task into smaller tasks and then fold their results into one 'big' result. This should make the computation faster, since smaller tasks can be computed in parallel if they are independent.
Assume that we need to compute something like this. The function count2X is time-consuming, so calling it several times in one thread is not optimal.
//NOT OPTIMAL
def count2X(x: Int) = {
Thread.sleep(1000)
x * 2
}
val sum = count2X(1) + count2X(2) + count2X(3)
println(sum)
And here goes the question.
How to dispatch tasks and collect results and then fold them, all using akka actors?
Is such functionality already provided by Akka or do I need to implement it myself? What are the best practices for such an approach?
Here is 'visual' interpretation of my question:
/-> [SMALL_TASK_1] -\
[BIG_TASK] -+--> [SMALL_TASK_1] --> [RESULT_FOLD]
\-> [SMALL_TASK_1] -/
Below is my scaffold implementation with missing/bad implementation :)
case class Count2X(x: Int)
class Count2XActor extends Actor {
def receive = {
case Count2X(x) => count2X(x); // AND NOW WHAT ?
}
}
case class CountSumOf2X(a: Int, b: Int, c: Int)
class SumOf2XActor extends Actor {
val aCounter = context.actorOf(Props[Count2XActor])
val bCounter = context.actorOf(Props[Count2XActor])
val cCounter = context.actorOf(Props[Count2XActor])
def receive = {
case CountSumOf2X(a, b, c) => // AND NOW WHAT ? aCounter ! Count2X(a); bCounter ! Count2X(b); cCounter ! Count2X(c);
}
}
val aSystem = ActorSystem("mySystem")
val actor = aSystem.actorOf(Props[SumOf2XActor])
actor ! CountSumOf2X(10, 20, 30)
Thanks for any help.
In Akka I would do something like this:
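// requires akka.pattern.ask, an implicit akka.util.Timeout and an implicit ExecutionContext in scope
val a = (aCounter ? Count2X(10)).mapTo[Int]
val b = (bCounter ? Count2X(20)).mapTo[Int]
val c = (cCounter ? Count2X(30)).mapTo[Int]
// Future.sequence takes a collection; wait longer than the 1 second that each count2X call sleeps
Await.result(Future.sequence(List(a, b, c)).map(_.sum), 3.seconds)
I'm sure there is a better way: here the summing only starts after all the Futures, which run in parallel, have completed. For a simple task that's OK, but generally you shouldn't block waiting for so long.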
Two things you could do:
1) Use Akka futures. These allow you to dispatch operations and fold on them in an asynchronous manner. Check out http://doc.akka.io/docs/akka/2.0.4/scala/futures.html for more information.
2) You can dispatch work to multiple "worker" actors and then have a "master" actor aggregate them, keeping track of which messages are pending/processed by storing information in the messages themselves. I have a simple stock quote example of this using Akka actors here: https://github.com/ryanlecompte/quotes
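As a rough sketch of option 1 using only scala.concurrent Futures (no actors), the split-and-fold from the question could look like the following. SplitAndFold, sumOf2X and demo are names invented here, and the sleep is shortened to 100 ms so the example finishes quickly.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object SplitAndFold {
  // Stand-in for the slow function from the question (sleep shortened).
  def count2X(x: Int): Int = {
    Thread.sleep(100)
    x * 2
  }

  // Starts one Future per input so the small tasks can run in parallel,
  // then folds the individual results into a single sum.
  def sumOf2X(xs: List[Int])(implicit ec: ExecutionContext): Future[Int] =
    Future.traverse(xs)(x => Future(count2X(x))).map(_.sum)

  def demo(): Int = {
    import ExecutionContext.Implicits.global
    Await.result(sumOf2X(List(10, 20, 30)), 5.seconds)
  }
}
```

Blocking with Await is only done here to obtain a value for the demonstration; inside an application you would normally keep composing the Future instead.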

how to solve an equation in scala using actors?

I want to know how an actor returns a value to the sender and how to store it in a variable.
For example, consider that we have to find the sum of squares of 2 numbers and print it.
i.e., sum = a² + b²
I have 2 actors. 1 actor computes square of any number passed to it (say, SquareActor). The other actor sends the two numbers (a , b) to the SquareActor and computes their sum (say, SumActor)
/** Actor to find the square of a number */
class SquareActor (x: Int) extends Actor
{
def act()
{
react{
case x : Int => println (x * x)
// how to return the value of x*x to "SumActor" ?
}
}
}
/** Actor to find the sum of squares of a and b */
class SumActor (a: Int, b:Int) extends Actor
{
def act()
{
var a2 = 0
var b2 = 0
val squareActor = new SquareActor (a : Int)
squareActor.start
// call squareActor to get a*a
squareActor ! a
// How to get the value returned by SquareActor and store it in the variable 'a2' ?
// call squareActor to get b*b
squareActor ! b
// How to get the value returned by SquareActor and store it in the variable 'b2' ?
println ("Sum: " + a2+b2)
}
}
Pardon me if the above is not possible; I think my basic understanding of actors may itself be wrong.
Use Akka
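Note that as of Scala 2.10, the original scala.actors library is deprecated in favor of Akka actors, which ship with the Scala distribution as a separate library rather than as part of the standard library itself. Akka is generally considered superior to the old actor library, so getting familiar with it would benefit you.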
Use Futures
Also note that what you want to achieve is easier and nicer (composes better) using Futures. A Future[A] represents a possibly concurrent computation, eventually yielding a result of type A.
def asyncSquare(x: Int): Future[Int] = Future(x * x)
val sq1 = asyncSquare(2)
val sq2 = asyncSquare(3)
val asyncSum =
for {
a <- sq1
b <- sq2
}
yield (a + b)
Note that the asyncSquare results are queried in advance to start their (independent) computations as soon as possible. Putting the calls inside the for comprehension would have serialized their execution, not using the possible concurrency.
You use Future-s in for comprehensions, map, flatMap, zip, sequence them, and in the very end, you can get the computed value using Await, which is a blocking operation, or using registered callbacks.
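Putting the pieces together, a self-contained version of the sum-of-squares example might look like this. It is a sketch only: SumOfSquares and demo are illustrative names, the global execution context is an assumption, and Await is used purely to extract the value at the very end.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object SumOfSquares {
  def asyncSquare(x: Int)(implicit ec: ExecutionContext): Future[Int] = Future(x * x)

  def sumOfSquares(a: Int, b: Int)(implicit ec: ExecutionContext): Future[Int] = {
    // Start both computations before the for comprehension so they can run concurrently.
    val sq1 = asyncSquare(a)
    val sq2 = asyncSquare(b)
    for {
      x <- sq1
      y <- sq2
    } yield x + y
  }

  // Blocks for the result of 2 * 2 + 3 * 3.
  def demo(): Int = {
    import ExecutionContext.Implicits.global
    Await.result(sumOfSquares(2, 3), 5.seconds)
  }
}
```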
Use Futures with actors
It is handy that you can ask from actors, which results in a Future:
val futureResult: Future[Int] = (someActor ? 5).mapTo[Int]
Note the need to use mapTo, because the message-passing interface of actors is not typed (there are, however, typed actors).
Bottom line
If you want to perform stateless computations in parallel, stick to plain Futures. If you need stateful but local computations, you can still use Future and thread the state yourself (or use the scalaz StateT monad transformer with Future as the monad, if you are into that business). If you need computations which require global state, then isolate that state into an actor, and interact with that actor, possibly using Futures.
Remember that actors work by message passing. So to get the response from the SquareActor back to the SumActor, you need to send it as a message from the SquareActor, and add a handler to the SumActor.
Also, your SquareActor constructor doesn't need an integer parameter.
That is, in your SquareActor, instead of just printing x * x, pass it to the SumActor:
class SquareActor extends Actor
{
def act()
{
react{
case x : Int => sender ! (x * x)
}
}
}
(sender causes it to send the message to the actor that sent the message it is reacting to.)
In your SumActor, after you send a and b to the SquareActor, handle the received reply messages:
react {
case a2 : Int => react {
case b2 : Int => println ("Sum: " + (a2+b2))
}
}