Concurrent Akka Agents in Scala - scala

I'm working on a Scala project right now, and I've decided to use Akka's agent library over the actor model because it allows a more functional approach to concurrency. However, I'm having a problem running many different agents at a time: it seems I'm capped at only three or four agents running at once.
import akka.actor._
import akka.agent._
import scala.concurrent.ExecutionContext.Implicits.global

object AgentTester extends App {
  // Create the actor system that powers the agents
  implicit val system = ActorSystem("ActorSystem")

  // Create an agent for each Int between 1 and 10
  val agents = Vector.tabulate[Agent[Int]](10)(x => Agent[Int](1 + x))

  // Define a function for each agent to execute
  def printRecur(a: Agent[Int])(x: Int): Int = {
    // Print the stored number and sleep
    println(x)
    Thread.sleep(250)

    // Recur the agent
    a sendOff printRecur(a) _

    // Keep the agent's value the same
    x
  }

  // Start each agent
  for (a <- agents) {
    Thread.sleep(10)
    a sendOff printRecur(a) _
  }
}
The above code creates an agent holding each integer between 1 and 10. The loop at the bottom sends the printRecur function to every agent. The program should print the numbers 1 through 10 every quarter of a second (although not in any particular order). However, for some reason my output only shows the numbers 1 through 4.
Is there a more canonical way to use agents in Akka that will work? I come from a clojure background and have used this pattern successfully there before, so I naively used the same pattern in Scala.

My guess is that you are running on a 4-core box, and that is part of the reason you only ever see the numbers 1-4. The big thing at play here is that you are using the default execution context, which on your system probably uses a thread pool with only 4 threads (one per core). With the recursive way you've coded this, my guess is that the first 4 agents never relinquish their threads, so they are the only ones that will ever print anything.
You can easily fix this by removing this line:
import scala.concurrent.ExecutionContext.Implicits.global
and adding this line after you create the ActorSystem:
import system.dispatcher
This will use the actor system's default dispatcher, which is a fork-join dispatcher and does not seem to have the same issue as the default execution context you imported in your sample.
You could also consider using send instead of sendOff, as send uses the execution context that was available when the agent was constructed. sendOff is for cases where you explicitly want to run the update on a different execution context.
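If you stay on the global execution context, another mitigation is to mark blocking calls with scala.concurrent.blocking, which lets the global fork-join pool spawn replacement threads for blocked ones. A sketch using plain Futures instead of agents, purely for illustration:

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// `blocking` tells the global fork-join pool that the task is blocked,
// so it can spawn replacement threads instead of starving the rest.
val futures = (1 to 10).map { i =>
  Future {
    blocking { Thread.sleep(100) } // without `blocking`, a 4-thread pool would serialize these
    i
  }
}
val results = Await.result(Future.sequence(futures), 5.seconds)
println(results.sum) // 55
```

All ten futures complete in roughly one sleep interval rather than three, even on a 4-core box.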

Related

Is it possible to have a while loop in chisel based on a condition of Chisel data types?

Here's what I'm trying to accomplish: I have a Chisel accelerator which calls another Chisel accelerator and passes in a value. I want the second one to have a while loop in it where the condition is partially based on the input value. Here's some sample code:
class Module1 extends Module {
  val in = 0.U
  val Module2Module = Module(new Module2)
  Module2Module.io.in := in
}

class Module2 extends Module {
  val io = IO(new Bundle {
    val in = Input(UInt(32.W))
  })

  val test = 0.U
  while (test < io.in) {
  }
}
I'm getting the error that "test < io.in" is a chisel.Bool, not a Boolean. Am I right that I can't convert that to a Scala type?
What is the proper way to implement this? Is it by having signals sent to/from Module1 to Module2 to indicate that the accelerator isn't done yet and to only proceed when it is? If so, wouldn't this get complex quickly, if you have several functions, each in different modules?
You will need to use registers, created by the Reg family of constructors, and control the flow with when, .elsewhen, and .otherwise. A good example for you is in 2.6_testers2.ipynb of the chisel-bootcamp: the GCD circuit there is equivalent to a while loop. The circuit continues until the y register is decremented to zero, and each clock cycle corresponds to a single iteration of a software while loop. The circuit uses the ready and valid fields of its Decoupled inputs and outputs to coordinate ingesting new data and reporting when a GCD value has been computed. Take a look at that example and see if you have more questions.
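The cycle-per-iteration idea can be modelled in plain Scala with no chisel3 dependency. This sketch mirrors the subtraction-based GCD circuit described above; all names here are illustrative:

```scala
// One `step` call models one clock cycle of the GCD circuit:
// the case class fields play the role of the x and y registers.
case class GcdState(x: Int, y: Int)

def step(s: GcdState): GcdState =
  if (s.x > s.y) GcdState(s.x - s.y, s.y) // when (x > y)  { x := x - y }
  else GcdState(s.x, s.y - s.x)           // .otherwise    { y := y - x }

// The circuit reports a valid result once y reaches zero.
def done(s: GcdState): Boolean = s.y == 0

var s = GcdState(48, 36)
var cycles = 0
while (!done(s)) { s = step(s); cycles += 1 } // the software while loop...
println(s.x + " after " + cycles + " cycles") // ...costs one clock tick per iteration in hardware
```

Running it prints "12 after 4 cycles": the hardware version needs four clock cycles to compute gcd(48, 36).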
Just to elaborate on why you can't use a while loop with hardware values like chisel3.Bool: you can think of a chisel3 design as a Scala program that constructs a hardware design as it executes. When chisel3 runs, it is just running a program whose output is your circuit (ultimately emitted as Verilog). while is a Scala construct, so it is only available during the execution of the program; it doesn't exist in the actual hardware. There's a similar question and answer about for loops on the chisel-users mailing list.
Now to answer your question, as Chick mentioned you can use the chisel3 constructs when, .elsewhen, and .otherwise to handle control flow in the actual hardware:
class Module2 extends Module {
  val io = IO(new Bundle {
    val in = Input(UInt(32.W)) // note: Reg is not legal inside an IO Bundle
  })

  val test = 0.U
  when (test < io.in) {
    // Logic that applies when (or while) the condition is true
  } .otherwise {
    // Logic that applies when it isn't
  }
}
Also as Chick mentioned, you'll likely need some state (using Regs) since you may need to do things over multiple clock cycles. It's hard to advise beyond this simple example without more info, but please expand on your question or ask more questions if you need more help.
If so, wouldn't this get complex quickly, if you have several functions, each in different modules?
I'm not sure how to answer this bit without more context, but the whole purpose of Chisel is to make it easier to create abstractions that allow you to handle complexity. Chisel enables software engineering when designing hardware.

How to throttle the execution of future?

I have a list of unique ids, and for every id I call a function that returns a Future. The problem is that the number of Futures created in a single call is variable:
list.map(id => futureCall(id))
Too much parallelism can affect my system, so I want to configure how many Futures execute in parallel. I also want a testable design, so I can't hard-code a limit. After a lot of searching I found ThrottledExecutionContext, but I didn't understand how to use it; I tried it and it didn't work. I imported it in the class where I make the call, used the same snippet with maxConcurrent left at its default of 4, and replaced the import of the global execution context with the ThrottledExecutionContext.
You have to wrap your ExecutionContext with ThrottledExecutionContext.
Here is a little sample:
object TestApp extends App {
  implicit val ec = ThrottledExecutionContext(maxConcurrents = 10)(scala.concurrent.ExecutionContext.global)

  def futureCall(id: Int) = Future {
    println(s"executing $id")
    Thread.sleep(500)
    id
  }

  val list = 1 to 1000
  val results = list.map(futureCall)
  Await.result(Future.sequence(results), 100.seconds)
}
Alternatively you can also try a FixedThreadPool:
implicit val ec = ExecutionContext.fromExecutor(java.util.concurrent.Executors.newFixedThreadPool(10))
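A fixed pool caps concurrency by construction: at most as many Futures run at once as the pool has threads. A sketch that verifies this with a high-water-mark counter (pool size and task counts are illustrative):

```scala
import java.util.concurrent.Executors
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

val poolSize = 2
val pool = Executors.newFixedThreadPool(poolSize)
implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)

val running = new AtomicInteger(0) // futures currently executing
val maxSeen = new AtomicInteger(0) // high-water mark of concurrency

val fs = (1 to 8).map { i =>
  Future {
    val now = running.incrementAndGet()
    maxSeen.updateAndGet(m => math.max(m, now)) // record the peak
    Thread.sleep(50)
    running.decrementAndGet()
    i
  }
}
Await.result(Future.sequence(fs), 10.seconds)
pool.shutdown() // the pool's threads are non-daemon; shut down so the JVM can exit
println(maxSeen.get) // never exceeds poolSize
```

Note the explicit shutdown: unlike the global context, Executors.newFixedThreadPool creates non-daemon threads that would otherwise keep the JVM alive.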
I am not sure what you are trying to do here. The default global ExecutionContext uses as many threads as you have CPU cores, so that is your parallelism. If that's still "too many" for you, you can control the number with the system property "scala.concurrent.context.maxThreads", setting it to a lower value.
That will be the maximum number of Futures executed in parallel at any given time. You should not need to throttle anything explicitly.
Alternatively, you can create your own executor and give it a BlockingQueue with a limited capacity. That would block on the producer side (when a work item is submitted), like your implementation does, but I would strongly advise against doing that: it is dangerous, prone to deadlocks, and much less efficient than the default ForkJoinPool implementation.

How to limit number of unprocessed Futures in Scala?

I cannot find whether there is a way to limit the number of unprocessed Futures in Scala.
For example in following code:
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future

for (i <- 1 to N) {
  val f = Future {
    // Some work with lots of object creation
  }
}
If N is too big, it will eventually throw an OutOfMemoryError.
Is there a way to limit the number of unprocessed Futures, either with a queue-like wait or with an exception?
So, the simplest answer is that you can create an ExecutionContext that blocks or throttles the submission of new tasks beyond a certain limit. See this blog post. For a more fleshed-out example of a blocking Java ExecutorService, here is an example. [You can use it directly if you want; the library is on Maven Central here.] It wraps some nonblocking ExecutorService, which you can create using the factory methods of java.util.concurrent.Executors.
Converting a Java ExecutorService into a Scala ExecutionContext is just ExecutionContext.fromExecutorService( executorService ). So, using the library linked above, you might have code like...
import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext
import com.mchange.v3.concurrent.BoundedExecutorService

val executorService = new BoundedExecutorService(
  Executors.newFixedThreadPool( 10 ), // a pool of ten Threads
  100, // block new tasks when 100 are in process
  50   // restart accepting tasks when the number of in-process tasks falls below 50
)

implicit val executionContext = ExecutionContext.fromExecutorService( executorService )

// do stuff that creates lots of futures here...
That's fine if you want a bounded ExecutorService that lasts as long as your whole application. But if you are creating lots of futures at one localized point in your code, you will want to shut the ExecutorService down when you are done with it. I define loan-pattern methods in Scala [maven central] that both create the context and shut it down after I'm done. The code ends up looking like...
import com.mchange.sc.v2.concurrent.ExecutionContexts

ExecutionContexts.withBoundedFixedThreadPool( size = 10, blockBound = 100, restartBeneath = 50 ) { implicit executionContext =>
  // do stuff that creates lots of futures here...
  // make sure the Futures have completed before the scope ends!
  // that's important! otherwise, some Futures will never get to run
}
Rather than an ExecutorService that blocks outright, you can use one that slows things down by forcing the task-scheduling (Future-creating) thread to execute the task itself instead of running it asynchronously. You'd make a java.util.concurrent.ThreadPoolExecutor using ThreadPoolExecutor.CallerRunsPolicy, but ThreadPoolExecutor is fairly complex to build directly.
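For the "complex to build" part, here is a sketch of that CallerRunsPolicy setup (pool and queue sizes are illustrative): a bounded queue plus CallerRunsPolicy makes the submitting thread run overflow tasks itself, which naturally throttles Future creation.

```scala
import java.util.concurrent.{ArrayBlockingQueue, ThreadPoolExecutor, TimeUnit}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

val executor = new ThreadPoolExecutor(
  2, 2,                                    // core and max pool size
  0L, TimeUnit.MILLISECONDS,               // keep-alive for excess threads
  new ArrayBlockingQueue[Runnable](100),   // at most 100 queued tasks
  new ThreadPoolExecutor.CallerRunsPolicy  // caller runs tasks rejected by the full queue
)
implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(executor)

// Creating 1000 futures never holds more than ~100 unprocessed tasks:
// once the queue fills, the loop itself executes tasks, slowing submission.
val fs = (1 to 1000).map(i => Future(i * 2))
val total = Await.result(Future.sequence(fs), 30.seconds).sum
executor.shutdown()
println(total) // 1001000
```

Unlike a blocking queue alone, this policy cannot deadlock on submission: rejected work is simply done inline by the caller.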
A newer, sexier, more Scala-centric alternative to all of this would be to check out Akka Streams as an alternative to Future for concurrent execution with "back-pressure" to prevent OutOfMemoryErrors.

Locking read/write operations on a data structure in Scala/akka

I have multiple actors (in the form of Futures) firing off other futures based on what they read from a single object's cache. I want to make sure that no work overlaps, and thus want to lock all read/modify/write operations. How do I do this in Scala?
I tried this, but I don't want every method/function that accesses the cache to have to be synchronized; rather, anything that tries to access the cache should understand that it needs to wait until it is its turn.
import scala.collection.mutable.HashMap
import scala.concurrent.Future

// The cache
object certCache {
  var cache = new HashMap[Char, Future[Boolean]]
}

def someMethod = synchronized {
  if (certCache ... )
    certCache.do(...)
}
Any tips?
Agents
The akka library has a perfect solution for your question: Agents. From the documentation:
import scala.concurrent.ExecutionContext.Implicits.global
import akka.agent.Agent
val agent = Agent(42)
To read from an Agent you can dereference it or call its get method; both return immediately, synchronously:
val agentResult = agent()
val agentResult2 = agent.get
Updates are asynchronous:
agent send (_ + 10) //updates value to 52, eventually
Similarly, you can get a Future of the Agent's value which completes after the currently queued updates have completed:
val futureValue = agent.future
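If you'd rather avoid the akka-agent dependency, the same serialized read/modify/write discipline can be sketched with an AtomicReference from the standard library (Cell is an illustrative name, not an akka API):

```scala
import java.util.concurrent.atomic.AtomicReference

// An AtomicReference serializes updates via compare-and-set,
// so no explicit lock is needed. Note: `f` may be retried under
// contention, so it should be a pure function, as with Agents.
final class Cell[A](init: A) {
  private val ref = new AtomicReference[A](init)
  def get: A = ref.get()
  def send(f: A => A): Unit = {
    var done = false
    while (!done) {
      val cur = ref.get()
      done = ref.compareAndSet(cur, f(cur))
    }
  }
}

val cell = new Cell(42)
cell.send(_ + 10)
println(cell.get) // 52
```

Unlike an Agent, updates here run on the caller's thread rather than asynchronously, but reads and updates are equally lock-free.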
Actors
Of course you can always go with a "home-grown" solution and write an Actor that caches your values and responds to queries. BUT, this is a much more manual and inefficient solution than Agents.
Actors should only be considered as a last resort when other akka/scala solutions do not apply, because Actors are very low-level and the receive method is not composable.

How to implement actor model without Akka?

How can I implement simple actors without Akka? I don't need high performance for many (dynamically created) actor instances, green threads, IoC (lifecycle, Props-based factories, ActorRefs), supervision, back-pressure, etc. I need only sequentiality (a queue) + a handler + state + message passing.
As a side effect, I actually need a small actor-based pipeline (with recursive links) plus some parallel actors to optimize a DSP algorithm's calculation. It will live inside a library without transitive dependencies, so I don't want (and, as it's a jar plugin, can't) push the user to create and pass an ActorSystem; the library should have as simple and lightweight an interface as possible. I don't need IoC, as it's just a library (a set of functions), not a framework, so its complexity is more algorithmic than structural. However, I see actors as a good instrument for describing protocols, and I can decompose the algorithm into a small number of asynchronously interacting entities, so the model fits my needs.
Why not Akka
Akka is heavy, which means that:
it's an external dependency;
it has a complex interface and implementation;
it's non-transparent to the library's user: all instances are managed by Akka's IoC, so there is no guarantee that one logical actor is always backed by the same instance; a restart creates a new one;
it requires additional migration support, comparable to tracking Scala's own migrations.
It also might be harder to debug akka's green threads using jstack/jconsole/jvisualvm, as one actor may act on any thread.
Sure, Akka's jar (1.9 MB) and memory consumption (2.5 million actors per GB) aren't heavy at all, so you can run it even on Android. But it's also known that you should use specialized tools to watch and analyze actors (like Typesafe Activator/Console), which users may not be familiar with (and I wouldn't push them to learn). That is all fine for an enterprise project, which almost always has IoC, a set of specialized tools, and continuous migration, but it isn't a good approach for a simple library.
P.S. About dependencies. I don't have them and I don't want to add any (I'm even avoiding the scalaz, which actually fits here a little bit), as it will lead to heavy maintenance - I'll have to keep my simple library up-to-date with Akka.
Here is most minimal and efficient actor in the JVM world with API based on Minimalist Scala actor from Viktor Klang:
https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/src/test/scala/com/github/gist/viktorklang/Actor.scala
It is handy and safe to use, but it isn't type-safe in message receiving and cannot send messages between processes or hosts.
Main features:
simplest FSM-like API with just 3 states (Stay, Become and Die): https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/src/test/scala/com/github/gist/viktorklang/Actor.scala#L28-L30
minimalistic error handling - just proper forwarding to the default exception handler of executor threads: https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/src/test/scala/com/github/gist/viktorklang/Actor.scala#L52-L53
fast async initialization that takes ~200 ns to complete, so no need for additional futures/actors for time consuming actor initialization: https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/out0.txt#L447
smallest memory footprint, ~40 bytes in a passive state (BTW a new String() occupies about the same number of bytes in the JVM heap): https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/out0.txt#L449
very efficient in message processing with throughput ~90M msg/sec for 4 core CPU: https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/out0.txt#L466
very efficient in message sending/receiving with latency ~100 ns: https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/out0.txt#L472
per actor tuning of fairness by the batch parameter: https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/src/test/scala/com/github/gist/viktorklang/Actor.scala#L32
Example of stateful counter:
def process(self: Address, msg: Any, state: Int): Effect = if (state > 0) {
  println(msg + " " + state)
  self ! msg
  Become { msg =>
    process(self, msg, state - 1)
  }
} else Die

val actor = Actor(self => msg => process(self, msg, 5))
Results:
scala> actor ! "a"
a 5
scala> a 4
a 3
a 2
a 1
This will use FixedThreadPool (and so its internal task queue):
import scala.concurrent._

trait Actor[T] {
  implicit val context = ExecutionContext.fromExecutor(java.util.concurrent.Executors.newFixedThreadPool(1))
  def receive: T => Unit
  def !(m: T) = Future { receive(m) }
}
A FixedThreadPool of size 1 guarantees sequentiality here. Of course it's NOT the best way to manage your threads if you need huge numbers of dynamically created actors, but it's fine if you need a fixed number of actors per application to implement your protocol.
Usage:
class Ping(pong: => Actor[Int]) extends Actor[Int] {
  def receive = {
    case m: Int =>
      println(m)
      if (m > 0) pong ! (m - 1)
  }
}

object System {
  // Be careful with mutual links between lazy vals in different systems (objects);
  // that's why people prefer ActorRef
  lazy val ping: Actor[Int] = new Ping(pong)
  lazy val pong: Actor[Int] = new Ping(ping)
}

System.ping ! 5
Results:
import scala.concurrent._
defined trait Actor
defined class Ping
defined object System
res17: scala.concurrent.Future[Unit] = scala.concurrent.impl.Promise$DefaultPromise#6be61f2c
5
4
3
2
1
0
scala> System.ping ! 5; System.ping ! 7
5
7
4
6
3
5
2
res19: scala.concurrent.Future[Unit] = scala.concurrent.impl.Promise$DefaultPromise#54b053b1
4
1
3
0
2
1
0
This implementation uses two Java threads, so it's roughly "twice" as fast as counting without parallelization.
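One caveat with the trait above: each actor creates its own non-daemon single-thread pool that is never shut down, so a short-lived program won't exit. A shutdown-aware sketch (MiniActor and stop are illustrative names, not part of the original trait), with a latch to await message processing:

```scala
import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}
import scala.concurrent.{ExecutionContext, Future}

trait MiniActor[T] {
  // One thread per actor guarantees messages are processed sequentially
  private val pool = Executors.newFixedThreadPool(1)
  implicit val context: ExecutionContext = ExecutionContext.fromExecutorService(pool)
  def receive: T => Unit
  def !(m: T): Future[Unit] = Future { receive(m) }
  def stop(): Unit = pool.shutdown() // release the non-daemon thread
}

val latch = new CountDownLatch(5)
var sum = 0 // safe: only the actor's single thread ever touches it
val counter = new MiniActor[Int] {
  def receive = { m => sum += m; latch.countDown() }
}

(1 to 5).foreach(counter ! _)
latch.await(5, TimeUnit.SECONDS) // wait until all 5 messages are processed
counter.stop()
println(sum) // 15
```

The latch also gives the main thread a happens-before edge on sum, so the unsynchronized var is safe to read after await returns.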