How can I implement simple actors without Akka? I don't need high performance for many (non-fixed count) actor instances, green threads, IoC (lifecycle, Props-based factories, ActorRefs), supervision, backpressure, etc. I only need sequentiality (a queue) + a handler + state + message passing.
As a side effect, I actually need a small actor-based pipeline (with recursive links) plus some parallel actors to optimize a DSP algorithm calculation. It will live inside a library without transitive dependencies, so I don't want (and can't, as it's a jar-plugin) to push the user to create and pass an akkaSystem; the library should have as simple and lightweight an interface as possible. I don't need IoC as it's just a library (a set of functions), not a framework - so it has more algorithmic than structural complexity. However, I see actors as a good instrument for describing protocols, and I can actually decompose the algorithm into a small number of asynchronously interacting entities, so the model fits my needs.
Why not Akka
Akka is heavy, which means that:
it's an external dependency;
it has a complex interface and implementation;
it's non-transparent to the library's user - for example, all instances are managed by Akka's IoC, so there is no guarantee that one logical actor is always backed by the same instance: a restart will create a new one;
it requires additional migration support, comparable in effort to tracking Scala's own migrations.
It can also be harder to debug Akka's green threads using jstack/jconsole/jvisualvm, as one actor may run on any thread.
Sure, Akka's jar (1.9 MB) and memory consumption (2.5 million actors per GB) aren't heavy at all, so you can run it even on Android. But it's also known that you should use specialized tools to watch and analyze actors (like Typesafe Activator/Console), which the user may not be familiar with (and I wouldn't push them to learn them). This is all fine for an enterprise project, which almost always has IoC, a set of specialized tools and continuous migration, but it isn't a good approach for a simple library.
P.S. About dependencies. I don't have any and I don't want to add any (I'm even avoiding scalaz, which actually fits here a little), as that would lead to heavy maintenance - I'd have to keep my simple library up to date with Akka.
Here is the most minimal and efficient actor in the JVM world, with an API based on the Minimalist Scala actor from Viktor Klang:
https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/src/test/scala/com/github/gist/viktorklang/Actor.scala
It is handy and safe to use, but it isn't type-safe in message receiving and cannot send messages between processes or hosts.
Main features:
simplest FSM-like API with just 3 states (Stay, Become and Die): https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/src/test/scala/com/github/gist/viktorklang/Actor.scala#L28-L30
minimalistic error handling - just proper forwarding to the default exception handler of the executor threads: https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/src/test/scala/com/github/gist/viktorklang/Actor.scala#L52-L53
fast async initialization that takes ~200 ns to complete, so there is no need for additional futures/actors for time-consuming actor initialization: https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/out0.txt#L447
smallest memory footprint: ~40 bytes in a passive state (BTW, a new String() takes the same amount of bytes in the JVM heap): https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/out0.txt#L449
very efficient message processing, with throughput of ~90M msg/sec on a 4-core CPU: https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/out0.txt#L466
very efficient message sending/receiving, with ~100 ns latency: https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/out0.txt#L472
per-actor tuning of fairness via the batch parameter: https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/src/test/scala/com/github/gist/viktorklang/Actor.scala#L32
Example of stateful counter:
def process(self: Address, msg: Any, state: Int): Effect =
  if (state > 0) {
    println(msg + " " + state)
    self ! msg // re-send the same message to itself
    Become { msg => process(self, msg, state - 1) } // continue with the decremented counter
  } else Die

val actor = Actor(self => msg => process(self, msg, 5))
Results:
scala> actor ! "a"
a 5
a 4
a 3
a 2
a 1
This will use a FixedThreadPool (and thus its internal task queue):
import scala.concurrent._

trait Actor[T] {
  // a single-threaded pool: messages are processed one at a time, in order
  implicit val context = ExecutionContext.fromExecutor(
    java.util.concurrent.Executors.newFixedThreadPool(1))

  def receive: T => Unit

  def !(m: T) = Future { receive(m) } // enqueue the message for processing
}
A FixedThreadPool of size 1 guarantees sequentiality here. Of course it's NOT the best way to manage your threads if you need huge numbers of dynamically created actors, but it's fine if you need some fixed number of actors per application to implement your protocol.
Usage:
class Ping(pong: => Actor[Int]) extends Actor[Int] {
  def receive = {
    case m: Int =>
      println(m)
      if (m > 0) pong ! (m - 1)
  }
}

object System {
  // be careful with mutual lazy val links between different systems (objects);
  // that's why people prefer ActorRef
  lazy val ping: Actor[Int] = new Ping(pong)
  lazy val pong: Actor[Int] = new Ping(ping)
}
System.ping ! 5
Results:
import scala.concurrent._
defined trait Actor
defined class Ping
defined object System
res17: scala.concurrent.Future[Unit] = scala.concurrent.impl.Promise$DefaultPromise@6be61f2c
5
4
3
2
1
0
scala> System.ping ! 5; System.ping ! 7
5
7
4
6
3
5
2
res19: scala.concurrent.Future[Unit] = scala.concurrent.impl.Promise$DefaultPromise@54b053b1
4
1
3
0
2
1
0
This implementation uses two Java threads, so the two counters run in parallel - roughly "twice" as fast as counting without parallelization.
Related
From this tutorial https://github.com/slouc/concurrency-in-scala-with-ce#threading
async operations are divided into 3 groups and require significantly different thread pools to run on:
Non-blocking asynchronous operations:
Bounded pool with a very low number of threads (maybe even just one), with a very high priority. These threads will basically just sit idle most of the time and keep polling whether there is a new async IO notification. Time that these threads spend processing a request directly maps into application latency, so it's very important that no other work gets done in this pool apart from receiving notifications and forwarding them to the rest of the application.
Blocking asynchronous operations:
Unbounded cached pool. Unbounded, because a blocking operation can (and will) block a thread for some time, and we want to be able to serve other I/O requests in the meantime. Cached, because we could run out of memory by creating too many threads, so it's important to reuse existing threads.
CPU-heavy operations:
Fixed pool in which the number of threads equals the number of CPU cores. This is pretty straightforward. Back in the day the "golden rule" was: number of threads = number of CPU cores + 1, but the "+1" came from the fact that one extra thread was always reserved for I/O (as explained above, we now have separate pools for that).
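As a rough illustration (my sketch, not from the tutorial; names and sizes are illustrative), here is how the three pools might be constructed on the JVM:

import java.util.concurrent.{Executors, ThreadFactory}
import scala.concurrent.ExecutionContext

// Hypothetical factory producing high-priority daemon threads.
val highPriorityFactory: ThreadFactory = new ThreadFactory {
  def newThread(r: Runnable): Thread = {
    val t = new Thread(r)
    t.setPriority(Thread.MAX_PRIORITY)
    t.setDaemon(true)
    t
  }
}

// 1. Non-blocking async IO notifications: tiny, high-priority pool.
val ioNotifications: ExecutionContext =
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(1, highPriorityFactory))

// 2. Blocking operations: unbounded but cached, so threads get reused.
val blockingPool: ExecutionContext =
  ExecutionContext.fromExecutor(Executors.newCachedThreadPool())

// 3. CPU-heavy operations: one thread per core.
val cpuPool: ExecutionContext =
  ExecutionContext.fromExecutor(
    Executors.newFixedThreadPool(Runtime.getRuntime.availableProcessors))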
In my Cats Effect application, I use Scala Future-based ReactiveMongo lib to access MongoDB, which does NOT block threads when talking with MongoDB, e.g. performs non-blocking IO.
It needs execution context.
Cats effect provides default execution context IOApp.executionContext
My question is: which execution context should I use for non-blocking io?
IOApp.executionContext?
But, from the IOApp.executionContext documentation:
Provides a default ExecutionContext for the app.
The default on top of the JVM is lazily constructed as a fixed thread pool based on the number of available CPUs (see PoolUtils).
It seems like this execution context falls into the 3rd group I listed above - CPU-heavy operations (a fixed pool in which the number of threads equals the number of CPU cores) - which makes me think that IOApp.executionContext is not a good candidate for non-blocking IO.
Am I right, and should I create a separate context with a fixed thread pool (1 or 2 threads) for non-blocking IO, so that it falls into the first group I listed above (non-blocking asynchronous operations: a bounded pool with a very low number of threads, maybe even just one, with a very high priority)?
Or is IOApp.executionContext designed for both CPU-bound and non-blocking IO operations?
The function I use to convert a Scala Future to F, which expects an execution context:
import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}
import cats.effect.Async

def scalaFutureToF[F[_]: Async, A](
    future: => Future[A]
)(implicit ec: ExecutionContext): F[A] =
  Async[F].async { cb =>
    future.onComplete {
      case Success(value)     => cb(Right(value))
      case Failure(exception) => cb(Left(exception))
    }
  }
In Cats Effect 3, each IOApp has a Runtime:
final class IORuntime private[effect] (
  val compute: ExecutionContext,
  private[effect] val blocking: ExecutionContext,
  val scheduler: Scheduler,
  val shutdown: () => Unit,
  val config: IORuntimeConfig,
  private[effect] val fiberErrorCbs: FiberErrorHashtable = new FiberErrorHashtable(16)
)
You will almost always want to keep the default values and not fiddle around declaring your own runtime, except in perhaps tests or educational examples.
Inside your IOApp you can then access the compute pool via:
runtime.compute
If you want to execute a blocking operation, then you can use the blocking construct:
IO.blocking(println("foo!")) >> IO.unit
This way, you're telling the CE3 runtime that this operation could be blocking and hence should be dispatched to a dedicated pool. See here.
What about CE2? Well, it had similar mechanisms but they were very clunky and also contained quite a few surprises. Blocking calls, for example, were scheduled using Blocker which then had to be somehow summoned out of thin air or threaded through the whole app, and thread pool definitions were done using the awkward ContextShift. If you have any choice in the matter, I highly recommend investing some effort into migrating to CE3.
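For contrast, a minimal CE2 sketch of the Blocker dance (assuming cats-effect 2.x and the implicit ContextShift[IO] that CE2's IOApp provides; none of this is needed in CE3):

import cats.effect.{Blocker, ContextShift, IO}

// CE2: Blocker wraps a dedicated blocking pool and has to be threaded
// to every call site that needs it.
def blockingPrint(implicit cs: ContextShift[IO]): IO[Unit] =
  Blocker[IO].use { blocker =>
    blocker.delay[IO, Unit](println("foo!")) // runs on the blocker's pool
  }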
Fine, but what about Reactive Mongo?
ReactiveMongo uses Netty (which is based on Java NIO API). And Netty has its own thread pool. This is changed in Netty 5 (see here), but ReactiveMongo seems to still be on Netty 4 (see here).
However, the ExecutionContext you're asking about is the thread pool that will perform the callback. This can be your compute pool.
Let's see some code. First, your translation method. I just changed async to async_ because I'm using CE3, and I added a println that shows the current thread:
def scalaFutureToF[F[_]: Async, A](future: => Future[A])(implicit ec: ExecutionContext): F[A] =
  Async[F].async_ { cb =>
    future.onComplete {
      case Success(value) =>
        println(s"Inside Callback: [${Thread.currentThread.getName}]")
        cb(Right(value))
      case Failure(exception) => cb(Left(exception))
    }
  }
Now let's pretend we have two execution contexts - one from our IOApp and another one that's going to represent whatever ReactiveMongo uses to run the Future. This is the made-up ReactiveMongo one:
val reactiveMongoContext: ExecutionContext =
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(1))
and the other one is simply runtime.compute.
Now let's define the Future like this:
def myFuture: Future[Unit] = Future {
  println(s"Inside Future: [${Thread.currentThread.getName}]")
}(reactiveMongoContext)
Note how we are pretending that this Future runs inside ReactiveMongo by passing the reactiveMongoContext to it.
Finally, let's run the app:
override def run: IO[Unit] = {
  val myContext: ExecutionContext = runtime.compute
  scalaFutureToF(myFuture)(implicitly[Async[IO]], myContext)
}
Here's the output:
Inside Future: [pool-1-thread-1]
Inside Callback: [io-compute-6]
The execution context we provided to scalaFutureToF merely ran the callback. Future itself ran on our separate thread pool that represents ReactiveMongo's pool. In reality, you will have no control over this pool, as it's coming from within ReactiveMongo.
Extra info
By the way, if you're not working with the type class hierarchy (F), but with IO values directly, then you could use this simplified method:
def scalaFutureToIo[A](future: => Future[A]): IO[A] =
  IO.fromFuture(IO(future))
See how this one doesn't even require you to pass an ExecutionContext - it simply uses your compute pool. Or more specifically, it uses whatever is defined as def executionContext: F[ExecutionContext] for the Async[IO], which turns out to be the compute pool. Let's check:
override def run: IO[Unit] = {
  IO.executionContext.map(ec => println(ec == runtime.compute))
}
// prints true
Last, but not least:
If we really had a way of specifying which thread pool ReactiveMongo's underlying Netty should be using, then yes, in that case we should definitely use a separate pool. We should never be providing our runtime.compute pool to other runtimes.
We have a fairly complex system developed using Akka HTTP and the actor model. Until now, we have used the ask pattern extensively and mixed Futures and actors.
For example, an actor gets a message, needs to execute 3 operations in parallel, combine the results, and return them to the sender. What we did was:
declare a new variable in the actor's receive callback to store the sender (since we use Future.map, sender may point elsewhere by the time the callback runs);
execute all 3 futures in parallel using Future.sequence (sometimes each is a call to a function that returns a future, and sometimes an ask to another actor);
combine the results of all 3 futures using map or flatMap on the Future.sequence result;
pipe the final result to the stored sender using pipeTo.
Here is the code, simplified:
case RetrieveData(userId, `type`, id, lang, paging, timeRange, platform) => {
  val sen = sender
  val result: Future[Seq[Map[String, Any]]] =
    if (paging.getOrElse(Paging(0, 0)) == Paging(0, 0)) Future.successful(Seq.empty)
    else {
      val start = System.currentTimeMillis()
      val profileF = profileActor ? Get(userId)
      Future.sequence(Seq(profileF, getSymbols(`type`, id), getData(paging, timeRange, platform)))
        .map { result =>
          logger.info(s"Got ${result.size} news in ${System.currentTimeMillis() - start} ms")
          result
        }
        .recover { case ex: Throwable =>
          logger.error(s"Failure on getting data: ${ex.getMessage}", ex)
          Seq.empty
        }
    }
  result.pipeTo(sen)
}
The function getAndProcessData contains the Future.sequence that executes the 3 futures in parallel.
Now, as I read more and more on Akka, I see that using ask creates another listener actor. My questions are:
As we use ask extensively, can it lead to too many threads being used in the system, and perhaps occasional thread starvation?
Using Future.map a lot also often means running on a different thread. I've read about the single-thread actor illusion, which can easily be broken by mixing in Futures.
Also, can this hurt performance?
Do we need to store the sender in a temp variable sen, since we're using pipeTo? Could we just do pipeTo(sender)? Also, does declaring sen in almost every receive callback waste too many resources? I would expect its reference to be released once the operation completes.
Is there a way to design such a system better, meaning that we don't use map or ask so much? I've looked at examples where you just pass a replyTo reference to some actor and then use tell instead of ask. Also, sending a message to self and then replying to the original sender can replace working with Future.map in some scenarios. But how can this be designed, given that we want to perform 3 async operations in parallel and return formatted data to the sender? We need all 3 operations to complete before we can format the data.
I tried not to include too many examples; I hope you understand our concerns and problems. Many questions, but I would really love to understand how this works, simply and clearly.
Thanks in advance
If you want to do 3 things in parallel you are going to need to create 3 Future values which will potentially use 3 threads, and that can't be avoided.
I'm not sure what the issue with map is, but there is only one call in this code and that is not necessary.
Here is one way to clean up the code to avoid creating unnecessary Future values (untested!):
case RetrieveData(userId, `type`, id, lang, paging, timeRange, platform) =>
  if (paging.forall(_ == Paging(0, 0))) {
    sender ! Seq.empty
  } else {
    val sen = sender
    val start = System.currentTimeMillis()
    val resF = Seq(
      profileActor ? Get(userId),
      getSymbols(`type`, id),
      getData(paging, timeRange, platform),
    )
    Future.sequence(resF).onComplete {
      case Success(result) =>
        val dur = System.currentTimeMillis() - start
        logger.info(s"Got ${result.size} news in $dur ms")
        sen ! result
      case Failure(ex) =>
        logger.error(s"Failure on getting data: ${ex.getMessage}", ex)
        sen ! Seq.empty
    }
  }
You can avoid ask by creating your own worker thread that collects the different results and then sends the result to the sender, but that is probably more complicated than is needed here.
An actor only consumes a thread in the dispatcher when it is processing a message. Since the number of messages the actor spawned to manage the ask will process is one, it's very unlikely that the ask pattern by itself will cause thread starvation. If you're already very close to thread starvation, an ask might be the straw that breaks the camel's back.
Mixing Futures and actors can break the single-thread illusion, if and only if the code executing in the Future accesses actor state (meaning, basically, vars or mutable objects defined outside of a receive handler).
Request-response and at-least-once (between them, they cover at least most of the motivations for the ask pattern) will in general limit throughput compared to at-most-once tells. Implementing request-response or at-least-once without the ask pattern might in some situations (e.g. using a replyTo ActorRef for the ultimate recipient) be less overhead than piping asks, but probably not significantly. Asks as the main entry-point to the actor system (e.g. in the streams handling HTTP requests or processing messages from some message bus) are generally OK, but asks from one actor to another are a good opportunity to streamline.
Note that, especially if your actor imports context.dispatcher as its implicit ExecutionContext, transformations on Futures are basically identical to single-use actors.
Situations where you want multiple things to happen (especially when you need to manage partial failure - a Future.sequence(...).recover(...) is a possible sign of this, particularly if the recover gets nontrivial) are potential candidates for a saga actor that organizes one particular request/response, as sketched below.
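A hedged sketch of that idea for this case (classic untyped Akka; the message types are hypothetical): a short-lived aggregator actor collects the three results and answers the original sender with a tell, so no ask is needed:

import akka.actor.{Actor, ActorRef, Props}

// Hypothetical reply messages; substitute your real protocol.
case class ProfileResult(data: Map[String, Any])
case class SymbolsResult(data: Map[String, Any])
case class DataResult(data: Map[String, Any])

// One aggregator per request: it waits for all three replies, formats
// them, replies to the original requester, and stops itself.
class Aggregator(replyTo: ActorRef) extends Actor {
  private var profile: Option[Map[String, Any]] = None
  private var symbols: Option[Map[String, Any]] = None
  private var data: Option[Map[String, Any]] = None

  def receive: Receive = {
    case ProfileResult(p) => profile = Some(p); maybeReply()
    case SymbolsResult(s) => symbols = Some(s); maybeReply()
    case DataResult(d)    => data = Some(d); maybeReply()
  }

  private def maybeReply(): Unit =
    for (p <- profile; s <- symbols; d <- data) {
      replyTo ! Seq(p, s, d) // all three arrived: format and reply
      context.stop(self)
    }
}

The parent spawns one per request with context.actorOf(Props(new Aggregator(sender()))) and passes that ref as the replyTo of the three worker messages; in practice you would also add a timeout (e.g. via context.setReceiveTimeout) for the partial-failure path.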
Instead of Future.sequence, I would suggest using a Source from Akka Streams, which can run all the futures with a parallelism you control.
Here is the sample code:
// Wrap each call in a thunk so that mapAsync controls how many run concurrently.
Source(List(
  () => profileActor ? Get(userId),
  () => getSymbols(`type`, id),
  () => getData(paging, timeRange, platform)))
  .mapAsync(parallelism = 3)(task => task())
  .runWith(Sink.seq)
This will return Future[Seq[Map[String, Any]]]
I'm working with the akka/scala/play stack.
Usually I'm using streams to perform certain tasks. For example, I have a stream that wakes every minute, picks something up from the DB, calls another service to enrich its data via an API, and saves the enrichment back to the DB.
Something like this:
class FetcherAndSaveStream @Inject()(fetcherAndSaveGraph: FetcherAndSaveGraph, dbElementsSource: DbElementsSource)
                                    (implicit val mat: Materializer,
                                     val exec: ExecutionContext) extends LazyLogging {
def graph[M1, M2](source: Source[BDElement, M1],
sink: Sink[BDElement, M2],
switch: SharedKillSwitch): RunnableGraph[(M1, M2)] = {
val fetchAndSaveDataFromExternalService: Flow[BDElement, BDElement, NotUsed] =
fetcherAndSaveGraph.fetchEndSaveEnrichment
source.viaMat(switch.flow)(Keep.left)
.via(fetchAndSaveDataFromExternalService)
.toMat(sink)(Keep.both).withAttributes(supervisionStrategy(resumingDecider))
}
def runGraph(switchSharedKill: SharedKillSwitch): (NotUsed, Future[Done]) = {
logger.info("FetcherAndSaveStream is now running")
graph(dbElementsSource.dbElements(), Sink.ignore, switchSharedKill).run()
}
}
I wonder: is this better than just using an actor that ticks every minute and does something like that? How do actors and streams compare for this use case?
I'm still trying to figure out when to choose which approach (streams vs. actors). Thanks!!
You can use both, depending on the requirements of your solution (which are not listed here). The general concern to take into consideration: actors are lower-level than streams, so they require more code and more debugging.
Basically, streams are good for tasks where you have a relatively large amount of data to process with low memory consumption. With streams you don't need to kick off a new stream every n seconds; you can have one stream running for the lifetime of the application, which makes the code more concise by omitting the scheduler logic.
I will omit your DI and architecture stuff and sketch the solution in pseudocode:
val yourConsumer: Sink[YourDBRecord, _] = ???

val runnableGraph =
  Source.repeat(())
    .throttle(1, n.seconds)
    .mapAsync(yourParallelism) { _ =>
      fetchReasonableAmountOfRecordsFromDB // returns Future[Seq[YourDBRecord]]
    }
    .mapConcat(identity)
    .to(yourConsumer)
This stream will do your stuff. You can even enhance it with more sophisticated logic, e.g. adapting the polling rate to the workload via a feedback loop in the graph API, and you can add whatever error-handling strategy you need to resume the stream from the point where it crashed.
Moreover, there are Alpakka connectors for DBs capable of doing this, so you can check whether the solutions there fit your purpose, or look at them for implementation details.
What you get by doing so: backpressure, the ability to interop with other streams, and clean, concise code with no timed automata managed directly by you.
https://doc.akka.io/docs/akka/current/stream/stream-rate.html
You can also create an actor, but then you have to do by hand everything Akka Streams does for you: backpressure (in case you want to interop with streams), scheduling, chunking, and memory management (so as not to load 100000 or so entries into memory in one batch), etc. A rough sketch of the actor variant follows.
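For comparison, a hedged sketch of the actor-based variant using Akka's built-in Timers (classic actors; the names are illustrative, not from the question):

import akka.actor.{Actor, Props, Timers}
import scala.concurrent.duration._

object TickingFetcher {
  case object Tick
  private case object TickKey
  def props: Props = Props(new TickingFetcher)
}

class TickingFetcher extends Actor with Timers {
  import TickingFetcher._
  timers.startTimerAtFixedRate(TickKey, Tick, 1.minute)

  def receive: Receive = {
    case Tick =>
      // fetch from the DB, call the enrichment API, save the results;
      // chunking, backpressure and error handling are all manual here
  }
}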
A Future is good at representing a single asynchronous task that will / should be completed within some fixed amount of time.
However, there exists another kind of asynchronous task: one where it's not possible, or very hard, to know exactly when it will finish. For example, the time taken by a particular string-processing task might depend on various factors such as the input size.
For this kind of task, it may be better to detect failure by checking whether the task can make progress within a reasonable amount of time, rather than by setting a hard timeout value as with Future.
Are there any libraries providing suitable monadic abstraction of such kind of task in Scala?
You could use a stream of values like this:
sealed trait Update[T]
case class Progress[T](value: Double) extends Update[T]
case class Finished[T](result: T) extends Update[T]
Let your task emit Progress values when convenient (e.g. every time a chunk of the computation has finished), and emit one Finished value once the complete computation is done. The consumer can check the Progress values to ensure that the task is still making progress; a consumer not interested in progress updates can simply filter them out. I think this is more composable than an actor-based approach.
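For instance, a minimal sketch (illustrative names, not from any library) of a task that processes its input in chunks and reports progress after each one:

// Counts words chunk by chunk, emitting Progress after each chunk and
// Finished with the total at the end.
def chunkedWordCount(chunks: Seq[String]): Iterator[Update[Int]] = {
  val total = chunks.size
  var count = 0
  chunks.iterator.zipWithIndex.map { case (chunk, i) =>
    count += chunk.split("\\s+").count(_.nonEmpty)
    if (i + 1 < total) Progress[Int]((i + 1).toDouble / total)
    else Finished(count)
  }
}

// A consumer that only wants the result filters the progress out:
// chunkedWordCount(data).collectFirst { case Finished(n) => n }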
Depending on how much performance or purity you need, you might want to look at akka streams or scalaz streams. Akka streams has a pure DSL for building flow graphs, but allows mutability in processing stages. Scalaz streams is more functional, but has lower performance last I heard.
You can break your work into chunks. Each chunk is a Future with a timeout - your notion of reasonable progress. Chain those Futures together to get a complete task.
Example 1 - both chunks can run in parallel and don't depend on each other (an embarrassingly parallel task):
val chunk1 = Future { ... } // chunk 1 starts execution here
val chunk2 = Future { ... } // chunk 2 starts execution here

val result = for {
  c1 <- chunk1
  c2 <- chunk2
} yield combine(c1, c2)
Example 2 - second chunk depends on the first:
val chunk1 = Future { ... } // chunk 1 starts execution here

val result = for {
  c1 <- chunk1
  c2 <- Future { ... } // chunk 2 starts here, after chunk 1 completes, and can use c1
} yield combine(c1, c2)
There are obviously other constructs to help you when you have many Futures like sequence.
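The per-chunk timeout isn't built into Future itself, but it can be approximated by racing each chunk against a scheduled failure. A hedged sketch (scheduler here is an assumed akka.actor.Scheduler; any timer facility would do):

import java.util.concurrent.TimeoutException
import scala.concurrent.duration._
import scala.concurrent.{ExecutionContext, Future, Promise}

// Fails the chunk if it doesn't complete within `limit` - the caller's
// notion of "reasonable progress" for that chunk.
def withTimeout[A](chunk: Future[A], limit: FiniteDuration)
                  (implicit ec: ExecutionContext,
                   scheduler: akka.actor.Scheduler): Future[A] = {
  val timeout = Promise[A]()
  scheduler.scheduleOnce(limit) {
    timeout.tryFailure(new TimeoutException(s"no progress within $limit"))
  }
  Future.firstCompletedOf(Seq(chunk, timeout.future))
}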
The article "The worst thing in our Scala code: Futures" by Ken Scambler points the need for a separation of concerns:
scala.concurrent.Future is built to work naturally with scala.util.Try, so our code often ends up clumsily using Try everywhere to represent failure, using raw exceptions as failure values even where no exceptions get thrown.
To do anything with a scala.concurrent.Future, you need to lug around an implicit ExecutionContext. This obnoxious dependency needs to be threaded through everywhere they are used.
So if your code does not depend directly on Future, but on simple Monad properties, you can abstract it with a Monad type:
trait Monad[F[_]] {
  def flatMap[A, B](fa: F[A], f: A => F[B]): F[B]
  def pure[A](a: A): F[A]
}
// read the type parameter as "for all F, constrained by Monad".
final def load[F[_]: Monad](pageUrl: URL): F[Page]
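To illustrate (these instances are mine, not the article's): load can then run against Future in production and against a trivial Id monad in tests, with no ExecutionContext at the call sites:

import scala.concurrent.{ExecutionContext, Future}

// Production instance: delegates to Future's own combinators.
implicit def futureMonad(implicit ec: ExecutionContext): Monad[Future] =
  new Monad[Future] {
    def flatMap[A, B](fa: Future[A], f: A => Future[B]): Future[B] = fa.flatMap(f)
    def pure[A](a: A): Future[A] = Future.successful(a)
  }

// Test instance: runs synchronously, so tests need no async machinery.
type Id[A] = A
implicit val idMonad: Monad[Id] = new Monad[Id] {
  def flatMap[A, B](fa: Id[A], f: A => Id[B]): Id[B] = f(fa)
  def pure[A](a: A): Id[A] = a
}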
Suppose I have to execute several CPU-bound tasks. If I have 4 CPUs, for example, I would probably create a fixed-size thread pool of 4-5 worker threads waiting on a queue and put the tasks in the queue. In Java I can use java.util.concurrent (maybe ThreadPoolExecutor) to implement this mechanism.
How would you implement it with Scala actors?
All actors are basically threads which are executed by a scheduler under the hood. The scheduler creates a thread pool to execute actors roughly bound to your number of cores. This means that you can just create an actor per task you need to execute and leave the rest to Scala:
import scala.actors.Actor._

for (i <- 1 to 20) {
  actor {
    print(i)
    Thread.sleep(1000)
  }
}
The disadvantage here is that, depending on the number of tasks, creating a thread per task may be quite expensive, since threads are not cheap on the JVM.
A simple way to create a bounded pool of worker actors and then distribute the tasks to them via messaging would be something like:
import scala.actors.Actor._

val numWorkers = 4
val pool = (1 to numWorkers).map { i =>
  actor {
    loop {
      react {
        case x: String => println(x)
      }
    }
  }
}

// pick a worker at random for each task (reuse one Random instance)
val rand = new util.Random
for (i <- 1 to 20) {
  pool(rand.nextInt(numWorkers)) ! "task " + i
}
The reason we want to create multiple actors is that a single actor processes only one message (i.e. task) at a time, so to get parallelism across your tasks you need to create several.
A side note: the default scheduler becomes particularly important when it comes to I/O bound tasks, as you will definitely want to change the size of the thread pool in that case. Two good blog posts which go into details about this are: Explore the Scheduling of Scala Actors and Scala actors thread pool pitfall.
With that said, Akka is an Actor framework that provides tools for more advanced workflows with Actors, and it is what I would use in any real application. Here is a load balancing (rather than random) task executor:
import akka.actor.Actor
import Actor._
import akka.routing.{LoadBalancer, CyclicIterator}

class TaskHandler extends Actor {
  def receive = {
    case t: Task =>
      // some computationally expensive thing
      t.execute
    case _ => println("default case is required in Akka...")
  }
}

class TaskRouter(numWorkers: Int) extends Actor with LoadBalancer {
  val workerPool = Vector.fill(numWorkers)(actorOf[TaskHandler].start())
  val seq = new CyclicIterator(workerPool)
}

val router = actorOf(new TaskRouter(4)).start()

for (i <- 1 to 20) {
  router ! Task(..)
}
You can have different types of Load Balancing (CyclicIterator is round-robin distribution), so you can check the docs here for more info.
Well, you usually don't. Part of the attraction of using actors is that they handle such details for you.
If, however, you insist on managing that yourself, you'll need to override the protected scheduler method on your Actor class to return an appropriate IScheduler. See also the scala.actors.scheduler package and the comments on the Actor trait concerning schedulers.
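A hedged sketch of that override against the old scala.actors API (factory and method names from memory - verify them against your Scala version before relying on this):

import java.util.concurrent.Executors
import scala.actors.{Actor, IScheduler}
import scala.actors.scheduler.ExecutorScheduler

// One shared scheduler over a fixed pool sized to the CPU count.
object FixedPoolScheduler {
  val instance: IScheduler =
    ExecutorScheduler(
      Executors.newFixedThreadPool(Runtime.getRuntime.availableProcessors))
}

class PooledWorker extends Actor {
  override def scheduler: IScheduler = FixedPoolScheduler.instance

  def act(): Unit = loop {
    react {
      case task: String => println("processing " + task)
    }
  }
}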