object GetDetails{
def apply(outStream: java.io.Writer, someObject: SomeClass){
outStream.write(someObject.detail1) //detail1 is string
outStream.write(someObject.detail2) //detail2 is also string
outStream.flush()
}
}
If this implementation is not thread-safe how do I make it thread-safe?
This function is going to be called simultaneously with different inputs.
One thing to think about is how the "state" of your application might change when two different threads call the same function, or interact with the same object. In this case, your "state" might be "what's been written to outStream".
Consider the following scenario:
Thread 1 Thread 2
-------------------- -------------------------------
GetDetails(outStream, object1)
GetDetails(outStream, object2)
outStream.write(object1.detail1)
outStream.write(object2.detail1)
outStream.write(object1.detail2)
outStream.write(object2.detail2)
outStream.flush()
outStream.flush()
Two separate threads both call GetDetails, sharing the same outStream. This is a potential concurrency problem, as the data that gets written to outStream is not guaranteed to be in any particular order. You might get [object1.detail1, object2.detail1, object1.detail2, object2.detail2], or [object2.detail1, object1.detail1, object1.detail2, object2.detail2], and so on.
GetDetails.apply does not change any of GetDetails's state, but it does change the state of the Writer that you pass; in order to ensure thread-safety you must take efforts to avoid using the same Writer at the same time (i.e. the scenario above).
As a counter-point, here's a pretty thread-unsafe method:
object NotThreadSafe {
// mutable state
private var currentPrefix = ""
def countUp(prefix: String) = {
// red flag: changing mutable state, then referring to it
currentPrefix = prefix
for(i <- 1 to 5) println(s"$currentPrefix $i")
}
}
If thread 1 calls NotThreadSafe.countUp("hello") and thread 2 calls NotThreadSafe.countUp("goodbye"), the output will depend on which currentPrefix = prefix happens last.
You could end up with
hello 1
hello 2
goodbye 3 // should have been hello 3, but currentPrefix got changed
goodbye 1
goodbye 4 // should have been hello 4
goodbye 5 // should have been hello 5
goodbye 2
goodbye 3
goodbye 4
goodbye 5
or one of many other permutations.
This is one reason why Scala tends to prefer "immutability" and "statelessness", because those are tools that simply don't need to worry about this kind of issue.
If you do find yourself forced to deal with mutable state, often the simplest way to ensure thread safety is to ensure your method can only be called once at a time, i.e. by using synchronized. At a finer-grained level, you want to ensure that a specific sequence of steps is not interleaved with another copy of that sequence on another thread (e.g. the GetDetails scenario I described at the beginning).
You could also look into semaphores but that's beyond the scope of this answer.
Related
So, I was trying to learn about Continuation. I came across with the following saying (link):
Say you're in the kitchen in front of the refrigerator, thinking about a sandwich. You take a continuation right there and stick it in your pocket. Then you get some turkey and bread out of the refrigerator and make yourself a sandwich, which is now sitting on the counter. You invoke the continuation in your pocket, and you find yourself standing in front of the refrigerator again, thinking about a sandwich. But fortunately, there's a sandwich on the counter, and all the materials used to make it are gone. So you eat it. :-) — Luke Palmer
Also, I saw a program in Scala:
var k1 : (Unit => Sandwich) = null
reset {
shift { k : Unit => Sandwich) => k1 = k }
makeSandwich
}
val x = k1()
I don't really know the syntax of Scala (looks similar to Java and C mixed together) but I would like to understand the concept of Continuation.
Firstly, I tried to run this program (by adding it into main). But it fails, I think that it has a syntax error due to the ) near Sandwich but I'm not sure. I removed it but it still does not compile.
How to create a fully compiled example that shows the concept of the story above?
How this example shows the concept of Continuation.
In the link above there was the following saying: "Not a perfect analogy in Scala because makeSandwich is not executed the first time through (unlike in Scheme)". What does it mean?
Since you seem to be more interested in the concept of the "continuation" rather than specific code, let's forget about that code for a moment (especially because it is quite old and I don't really like those examples because IMHO you can't understand them correctly unless you already know what a continuation is).
Note: this is a very long answer with some attempts to describe what a continuations is and why it is useful. There are some examples in Scala-like pseudo-code none of which can actually be compiled and run (there is just one compilable example at the very end and it references another example from the middle of the answer). Expect to spend a significant amount of time just reading this answer.
Intro to continuations
Probably the first thing you should do to understand a continuation is to forget about how modern compilers for most of the imperative languages work and how most of the modern CPUs work and particularly the idea of the call stack. This is actually implementation details (although quite popular and quite useful in practice).
Assume you have a CPU that can execute some sequence of instructions. Now you want to have a high level languages that support the idea of methods that can call each other. The obvious problem you face is that the CPU needs some "forward only" sequence of commands but you want some way to "return" results from a sub-program to the caller. Conceptually it means that you need to have some way to store somewhere before the call all the state of the caller method that is required for it to continue to run after the result of the sub-program is computed, pass it to the sub-program and then ask the sub-program at the end to continue execution from that stored state. This stored state is exactly a continuation. In most of the modern environments those continuations are stored on the call stack and often there are some assembly instructions specifically designed to help handling it (like call and return). But again this is just implementation details. Potentially they might be stored in an arbitrary way and it will still work.
So now let's re-iterate this idea: a continuation is a state of the program at some point that is enough to continue its execution from that point, typically with no additional input or some small known input (like a return value of the called method). Running a continuation is different from a method call in that usually continuation never explicitly returns execution control back to the caller, it can only pass it to another continuation. Potentially you can create such a state yourself, but in practice for the feature to be useful you need some support from the compiler to build continuations automatically or emulate it in some other way (this is why the Scala code you see requires a compiler plugin).
Asynchronous calls
Now there is an obvious question: why continuations are useful at all? Actually there are a few more scenarios besides the simple "return" case. One such scenario is asynchronous programming. Actually if you do some asynchronous call and provide a callback to handle the result, this can be seen as passing a continuation. Unfortunately most of the modern languages do not support automatic continuations so you have to grab all the relevant state yourself. Another problem appears if you have some logic that needs a sequence of many async calls. And if some of the calls are conditional, you easily get to the callbacks hell. The way continuations help you avoid it is by allowing you build a method with effectively inverted control flow. With typical call it is the caller that knows the callee and expects to get a result back in a synchronous way. With continuations you can write a method with several "entry points" (or "return to points") for different stages of the processing logic that you can just pass to some other method and that method can still return to exactly that position.
Consider following example (in pseudo-code that is Scala-like but is actually far from the real Scala in many details):
def someBusinessLogic() = {
val userInput = getIntFromUser()
val firstServiceRes = requestService1(userInput)
val secondServiceRes = if (firstServiceRes % 2 == 0) requestService2v1(userInput) else requestService2v2(userInput)
showToUser(combineUserInputAndResults(userInput,secondServiceRes))
}
If all those calls a synchronous blocking calls, this code is easy. But assume all those get and request calls are asynchronous. How to re-write the code? The moment you put the logic in callbacks you loose the clarity of the sequential code. And here is where continuations might help you:
def someBusinessLogicCont() = {
// the method entry point
val userInput
getIntFromUserAsync(cont1, captureContinuationExpecting(entry1, userInput))
// entry/return point after user input
entry1:
val firstServiceRes
requestService1Async(userInput, captureContinuationExpecting(entry2, firstServiceRes))
// entry/return point after the first request to the service
entry2:
val secondServiceRes
if (firstServiceRes % 2 == 0) {
requestService2v1Async(userInput, captureContinuationExpecting(entry3, secondServiceRes))
// entry/return point after the second request to the service v1
entry3:
} else {
requestService2v2Async(userInput, captureContinuationExpecting(entry4, secondServiceRes))
// entry/return point after the second request to the service v2
entry4:
}
showToUser(combineUserInputAndResults(userInput, secondServiceRes))
}
It is hard to capture the idea in a pseudo-code. What I mean is that all those Async method never return. The only way to continue execution of the someBusinessLogicCont is to call the continuation passed into the "async" method. The captureContinuationExpecting(label, variable) call is supposed to create a continuation of the current method at the label with the input (return) value bound to the variable. With such a re-write you still has a sequential-looking business logic even with all those asynchronous calls. So now for a getIntFromUserAsync the second argument looks like just another asynchronous (i.e. never-returning) method that just requires one integer argument. Let's call this type Continuation[T]
trait Continuation[T] {
def continue(value: T):Nothing
}
Logically Continuation[T] looks like a function T => Unit or rather T => Nothing where Nothing as the return type signifies that the call actually never returns (note, in actual Scala implementation such calls do return, so no Nothing there, but I think conceptually it is easy to think about no-return continuations).
Internal vs external iteration
Another example is a problem of iteration. Iteration can be internal or external. Internal iteration API looks like this:
trait CollectionI[T] {
def forEachInternal(handler: T => Unit): Unit
}
External iteration looks like this:
trait Iterator[T] {
def nextValue(): Option[T]
}
trait CollectionE[T] {
def forEachExternal(): Iterator[T]
}
Note: often Iterator has two method like hasNext and nextValue returning T but it will just make the story a bit more complicated. Here I use a merged nextValue returning Option[T] where the value None means the end of the iteration and Some(value) means the next value.
Assuming the Collection is implemented by something more complicated than an array or a simple list, for example some kind of a tree, there is a conflict here between the implementer of the API and the API user if you use typical imperative language. And the conflict is over the simple question: who controls the stack (i.e. the easy to use state of the program)? The internal iteration is easier for the implementer because he controls the stack and can easily store whatever state is needed to move to the next item but for the API user the things become tricky if she wants to do some aggregation of the stored data because now she has to save the state between the calls to the handler somewhere. Also you need some additional tricks to let the user stop the iteration at some arbitrary place before the end of the data (consider you are trying to implement find via forEach). Conversely the external iteration is easy for the user: she can store all the state necessary to process data in any way in local variables but the API implementer now has to store his state between calls to the nextValue somewhere else. So fundamentally the problem arises because there is only one place to easily store the state of "your" part of the program (the call stack) and two conflicting users for that place. It would be nice if you could just have two different independent places for the state: one for the implementer and another for the user. And continuations provide exactly that. The idea is that we can pass execution control between two methods back and forth using two continuations (one for each part of the program). Let's change the signatures to:
// internal iteration
// continuation of the iterator
type ContIterI[T] = Continuation[(ContCallerI[T], ContCallerLastI)]
// continuation of the caller
type ContCallerI[T] = Continuation[(T, ContIterI[T])]
// last continuation of the caller
type ContCallerLastI = Continuation[Unit]
// external iteration
// continuation of the iterator
type ContIterE[T] = Continuation[ContCallerE[T]]
// continuation of the caller
type ContCallerE[T] = Continuation[(Option[T], ContIterE[T])]
trait Iterator[T] {
def nextValue(cont : ContCallerE[T]): Nothing
}
trait CollectionE[T] {
def forEachExternal(): Iterator[T]
}
trait CollectionI[T] {
def forEachInternal(cont : ContCallerI[T]): Nothing
}
Here ContCallerI[T] type, for example, means that this is a continuation (i.e. a state of the program) the expects two input parameters to continue running: one of type T (the next element) and another of type ContIterI[T] (the continuation to switch back). Now you can see that the new forEachInternal and the new forEachExternal+Iterator have almost the same signatures. The only difference in how the end of the iteration is signaled: in one case it is done by returning None and in other by passing and calling another continuation (ContCallerLastI).
Here is a naive pseudo-code implementation of a sum of elements in an array of Int using these signatures (an array is used instead of something more complicated to simplify the example):
class ArrayCollection[T](val data:T[]) : CollectionI[T] {
def forEachInternal(cont0 : ContCallerI[T], lastCont: ContCallerLastI): Nothing = {
var contCaller = cont0
for(i <- 0 to data.length) {
val contIter = captureContinuationExpecting(label, contCaller)
contCaller.continue(data(i), contIter)
label:
}
}
}
def sum(arr: ArrayCollection[Int]): Int = {
var sum = 0
val elem:Int
val iterCont:ContIterI[Int]
val contAdd0 = captureContinuationExpecting(labelAdd, elem, iterCont)
val contLast = captureContinuation(labelReturn)
arr.forEachInternal(contAdd0, contLast)
labelAdd:
sum += elem
val contAdd = captureContinuationExpecting(labelAdd, elem, iterCont)
iterCont.continue(contAdd)
// note that the code never execute this line, the only way to jump out of labelAdd is to call contLast
labelReturn:
return sum
}
Note how both implementations of the forEachInternal and of the sum methods look fairly sequential.
Multi-tasking
Cooperative multitasking also known as coroutines is actually very similar to the iterations example. Cooperative multitasking is an idea that the program can voluntarily give up ("yield") its execution control either to the global scheduler or to another known coroutine. Actually the last (re-written) example of sum can be seen as two coroutines working together: one doing iteration and another doing summation. But more generally your code might yield its execution to some scheduler that then will select which other coroutine to run next. And what the scheduler does is manages a bunch of continuations deciding which to continue next.
Preemptive multitasking can be seen as a similar thing but the scheduler is run by some hardware interruption and then the scheduler needs a way to create a continuation of the program being executed just before the interruption from the outside of that program rather than from the inside.
Scala examples
What you see is a really old article that is referring to Scala 2.8 (while current versions are 2.11, 2.12, and soon 2.13). As #igorpcholkin correctly pointed out, you need to use a Scala continuations compiler plugin and library. The sbt compiler plugin page has an example how to enable exactly that plugin (for Scala 2.12 and #igorpcholkin's answer has the magic strings for Scala 2.11):
val continuationsVersion = "1.0.3"
autoCompilerPlugins := true
addCompilerPlugin("org.scala-lang.plugins" % "scala-continuations-plugin_2.12.2" % continuationsVersion)
libraryDependencies += "org.scala-lang.plugins" %% "scala-continuations-library" % continuationsVersion
scalacOptions += "-P:continuations:enable"
The problem is that plugin is semi-abandoned and is not widely used in practice. Also the syntax has changed since the Scala 2.8 times so it is hard to get those examples running even if you fix the obvious syntax bugs like missing ( here and there. The reason of that state is stated on the GitHub as:
You may also be interested in https://github.com/scala/async, which covers the most common use case for the continuations plugin.
What that plugin does is emulates continuations using code-rewriting (I suppose it is really hard to implement true continuations over the JVM execution model). And under such re-writings a natural thing to represent a continuation is some function (typically called k and k1 in those examples).
So now if you managed to read the wall of text above, you can probably interpret the sandwich example correctly. AFAIU that example is an example of using continuation as means to emulate "return". If we re-sate it with more details, it could go like this:
You (your brain) are inside some function that at some points decides that it wants a sandwich. Luckily you have a sub-routine that knows how to make a sandwich. You store your current brain state as a continuation into the pocket and call the sub-routine saying to it that when the job is done, it should continue the continuation from the pocket. Then you make a sandwich according to that sub-routine messing up with your previous brain state. At the end of the sub-routine it runs the continuation from the pocket and you return to the state just before the call of the sub-routine, forget all your state during that sub-routine (i.e. how you made the sandwich) but you can see the changes in the outside world i.e. that the sandwich is made now.
To provide at least one compilable example with the current version of the scala-continuations, here is a simplified version of my asynchronous example:
case class RemoteService(private val readData: Array[Int]) {
private var readPos = -1
def asyncRead(callback: Int => Unit): Unit = {
readPos += 1
callback(readData(readPos))
}
}
def readAsyncUsage(rs1: RemoteService, rs2: RemoteService): Unit = {
import scala.util.continuations._
reset {
val read1 = shift(rs1.asyncRead)
val read2 = if (read1 % 2 == 0) shift(rs1.asyncRead) else shift(rs2.asyncRead)
println(s"read1 = $read1, read2 = $read2")
}
}
def readExample(): Unit = {
// this prints 1-42
readAsyncUsage(RemoteService(Array(1, 2)), RemoteService(Array(42)))
// this prints 2-1
readAsyncUsage(RemoteService(Array(2, 1)), RemoteService(Array(42)))
}
Here remote calls are emulated (mocked) with a fixed data provided in arrays. Note how readAsyncUsage looks like a totally sequential code despite the non-trivial logic of which remote service to call in the second read depending on the result of the first read.
For full example you need prepare Scala compiler to use continuations and also use a special compiler plugin and library.
The simplest way is a create a new sbt.project in IntellijIDEA with the following files: build.sbt - in the root of the project, CTest.scala - inside main/src.
Here is contents of both files:
build.sbt:
name := "ContinuationSandwich"
version := "0.1"
scalaVersion := "2.11.6"
autoCompilerPlugins := true
addCompilerPlugin(
"org.scala-lang.plugins" % "scala-continuations-plugin_2.11.6" % "1.0.2")
libraryDependencies +=
"org.scala-lang.plugins" %% "scala-continuations-library" % "1.0.2"
scalacOptions += "-P:continuations:enable"
CTest.scala:
import scala.util.continuations._
object CTest extends App {
case class Sandwich()
def makeSandwich = {
println("Making sandwich")
new Sandwich
}
var k1 : (Unit => Sandwich) = null
reset {
shift { k : (Unit => Sandwich) => k1 = k }
makeSandwich
}
val x = k1()
}
What the code above essentially does is calling makeSandwich function (in a convoluted manner). So execution result would be just printing "Making sandwich" into console. The same result would be achieved without continuations:
object CTest extends App {
case class Sandwich()
def makeSandwich = {
println("Making sandwich")
new Sandwich
}
val x = makeSandwich
}
So what's the point? My understanding is that we want to "prepare a sandwich", ignoring the fact that we may be not ready for that. We mark a point of time where we want to return to after all necessary conditions are met (i.e. we have all necessary ingredients ready). After we fetch all ingredients we can return to the mark and "prepare a sandwich", "forgetting that we were unable to do that in past". Continuations allow us to "mark point of time in past" and return to that point.
Now step by step. k1 is a variable to hold a pointer to a function which should allow to "create sandwich". We know it because k1 is declared so: (Unit => Sandwich).
However initially the variable is not initialized (k1 = null, "there are no ingredients to make a sandwich yet"). So we can't call the function preparing sandwich using that variable yet.
So we mark a point of execution where we want to return to (or point of time in past we want to return to) using "reset" statement.
makeSandwich is another pointer to a function which actually allows to make a sandwich. It's the last statement of "reset block" and hence it is passed to "shift" (function) as argument (shift { k : (Unit => Sandwich).... Inside shift we assign that argument to k1 variable k1 = k thus making k1 ready to be called as a function. After that we return to execution point marked by reset. The next statement is execution of function pointed to by k1 variable which is now properly initialized so finally we call makeSandwich which prints "Making sandwich" to a console. It also returns an instance of sandwich class which is assigned to x variable.
Not sure, probably it means that makeSandwich is not called inside reset block but just afterwards when we call it as k1().
I have basically list of unique ids, and for every id, i make a call to function which returns future.
Problem is number of futures in a single call is variable.
list.map(id -> futureCall)
There will be too much parallelism which can affect my system. I want to configure number of futures execution in parallel.
I want testable design so i can't do this
After searching alot, i found this
I didn't get it how to use it. I tried but it didn't work.
After that i have just imported it in my class where i am making call.
I have used same snippet and set default maxConcurrent to 4.
I replaced import global execution context with ThrottledExecutionContext
You have to wrap your ExecutionContext with ThrottledExecutionContext.
Here is a little sample:
object TestApp extends App {
implicit val ec = ThrottledExecutionContext(maxConcurrents = 10)(scala.concurrent.ExecutionContext.global)
def futureCall(id:Int) = Future {
println(s"executing $id")
Thread.sleep(500)
id
}
val list = 1 to 1000
val results = list.map(futureCall)
Await.result(Future.sequence(results), 100.seconds)
}
Alternatively you can also try a FixedThreadPool:
implicit val ec = ExecutionContext.fromExecutor(java.util.concurrent.Executors.newFixedThreadPool(10))
I am not sure what you are trying to do here. Default global ExecutionContext uses as many threads as you have CPU cores. So, that would be your parallelism. If that's still "too many" for you, you can control that number with a system property: "scala.concurrent.context.maxThreads", and set that to a lower number.
That will be the maximum number of futures that are executed in parallel at any given time. You should not need to throttle anything explicitly.
Alternatively, you can create your own executor, and give it a BlockingQueue with a limited capacity. That would block on the producer side (when a work item is being submitted), like your implementation does, but I would very strongly advice you from doing that as it is very dangerous and prone to deadlocks, and also much less efficient, that the default ForkJoinPool implementation.
What is the cost of a scala Future? Is it bad practice to spin up, say, 1000 of them only to flatMap them away again right away?
In my case, I don't need 1000 futures - I could actually get away with about 10 or so, but it makes my code cleaner to use more futures, and I'm trying to get a sense of tradeoffs between code elegance and abusing resources. Obviously if I had blocking code, they'd be expensive, but if not, how many should I feel free to spin up to save a few lines of code?
You say you create some of them just to deal with a homogeneous list of Future[T]. In that case, if you just want to lift some T to a Future[T], you can do Future.successful(myValue). This causes no asynchronous background operations to be performed. It's a ready value, just wrapped in Future context.
EDIT: After re-reading your question and comments, I believe this is enough for an answer. Continue reading for extra info.
Regarding flatMapping, be aware that if you create 1000 futures beforehand as 1000 different vals, they will start right away (well, whenever JVM execution context decides that it's a good time to start, but definitely as soon as possible). However, if you create them in-place inside the flatMap, they will be chained (whole point of M-word's flatMap is to chain stuff in sequential series, with each step possibly depending on the result of previous one).
Demo:
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
val f1 = Future({ Thread.sleep(2000); println("one") })
val f2 = Future({ Thread.sleep(2000); println("two") })
val result1 = f1.flatMap(f => f2)
Await.result(result1, 5 seconds)
val result2 = Future({ Thread.sleep(2000); println("one") })
.flatMap(f => Future({ Thread.sleep(2000); println("two") }))
Await.result(result2, 5 seconds)
In the case of result1, you will get "one" and "two" printed out together after two seconds (sometimes even first "two" then "one"). But in the second case where we created the futures in-place inside the flatMap, first "one" is printed after two seconds, and then after another two seconds "two". This means that if you chain 1000 Futures like this, but the chain breaks at step 42, the rest 958 futures will not be calculated.
By combining these two facts about Futures you can avoid creating unnecessary ones.
I hope I helped at least a little, because regarding your main question - how much memory and other overhead does a Future cost - I don't have the numbers. That really depends on the settings of your JVM and the machine that's running the code. I do think however that even if your system can take anything you throw at it, you shouldn't be doing (a lot of) unnecessary background Future computations. And even though there are such things as sensible timeouts and cancelling via their respective Promises, creating an extra million of Futures you won't need sounds like a bad design IMHO.
(Note that I said "background computations". If you mainly need all these Futures to keep all types "in the future level" so that the whole code is easier to work with (e.g. with for comprehensions), in that case aforementioned Future.successful is your friend since it's not a computation, just an already computed value stored in a Future context)
I might have misunderstood your question. Correct me if I am mistaken.
What is the cost of a scala Future?
Whenever you wrap expression(s) in future, a java runnable is created under the hood. And a Runnable is just an interface with a run(). Your block of codes is then wrapper inside the run method of the runnable. This runnable is then submitted to the execution context.
In a very general sense, a future is nothing more than a runnable with bunch of helper methods. An instance of future is no different from other objects. You may reference this thread to get a rough idea on what's the memory consumption of a single java object.
If you are interested, you can trace the whole chain of action starting from the creation of a future
How to implement simple actors without Akka? I don't need high-performance for many (non-fixed count) actor instances, green-threads, IoC (lifecycle, Props-based factories, ActorRef's), supervising, backpressure etc. Need only sequentiality (queue) + handler + state + message passing.
As a side-effect I actually need small actor-based pipeline (with recursive links) + some parallell actors to optimize the DSP algorithm calculation. It will be inside library without transitive dependencies, so I don't want (and can't as it's a jar-plugin) to push user to create and pass akkaSystem, the library should have as simple and lightweight interface as possible. I don't need IoC as it's just a library (set of functions), not a framework - so it has more algorithmic complexity than structural. However, I see actors as a good instrument for describing protocols and I actually can decompose the algorithm to small amount of asynchronously interacting entities, so it fits to my needs.
Why not Akka
Akka is heavy, which means that:
it's an external dependency;
has complex interface and implementation;
non-transparent for library's user, for example - all instances are managed by akka's IoC, so there is no guarantee that one logical actor is always maintained by same instance, restart will create a new one;
requires additional support for migration which is comparable with scala's migration support itself.
It also might be harder to debug akka's green threads using jstack/jconsole/jvisualvm, as one actor may act on any thread.
Sure, Akka's jar (1.9Mb) and memory consumption (2.5 million actors per GB) aren't heavy at all, so you can run it even on Android. But it's also known that you should use specialized tools to watch and analyze actors (like Typesafe Activator/Console), which user may not be familiar with (and I wouldn't push them to learn it). It's all fine for enterprise project as it almost always has IoC, some set of specialized tools and continuous migration, but this isn't good approach for a simple library.
P.S. About dependencies. I don't have them and I don't want to add any (I'm even avoiding the scalaz, which actually fits here a little bit), as it will lead to heavy maintenance - I'll have to keep my simple library up-to-date with Akka.
Here is most minimal and efficient actor in the JVM world with API based on Minimalist Scala actor from Viktor Klang:
https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/src/test/scala/com/github/gist/viktorklang/Actor.scala
It is handy and safe in usage but isn't type safe in message receiving and cannot send messages between processes or hosts.
Main features:
simplest FSM-like API with just 3 states (Stay, Become and Die): https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/src/test/scala/com/github/gist/viktorklang/Actor.scala#L28-L30
minimalistic error handling - just proper forwading to the default exception handler of executor threads: https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/src/test/scala/com/github/gist/viktorklang/Actor.scala#L52-L53
fast async initialization that takes ~200 ns to complete, so no need for additional futures/actors for time consuming actor initialization: https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/out0.txt#L447
smallest memory footprint, that is ~40 bytes in a passive state (BTW the new String() spends the same amout of bytes in the JVM heap): https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/out0.txt#L449
very efficient in message processing with throughput ~90M msg/sec for 4 core CPU: https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/out0.txt#L466
very efficient in message sending/receiving with latency ~100 ns: https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/out0.txt#L472
per actor tuning of fairness by the batch parameter: https://github.com/plokhotnyuk/actors/blob/41eea0277530f86e4f9557b451c7e34345557ce3/src/test/scala/com/github/gist/viktorklang/Actor.scala#L32
Example of stateful counter:
def process(self: Address, msg: Any, state: Int): Effect = if (state > 0) {
println(msg + " " + state)
self ! msg
Become { msg =>
process(self, msg, state - 1)
}
} else Die
val actor = Actor(self => msg => process(self, msg, 5))
Results:
scala> actor ! "a"
a 5
scala> a 4
a 3
a 2
a 1
This will use FixedThreadPool (and so its internal task queue):
import scala.concurrent._
trait Actor[T] {
implicit val context = ExecutionContext.fromExecutor(java.util.concurrent.Executors.newFixedThreadPool(1))
def receive: T => Unit
def !(m: T) = Future { receive(m) }
}
FixedThreadPool with size 1 guarantees sequentiality here. Of course it's NOT the best way to manage your threads if you need 100500 dynamically created actors, but it's fine if you need some fixed amount of actors per application to implement your protocol.
Usage:
class Ping(pong: => Actor[Int]) extends Actor[Int] {
def receive = {
case m: Int =>
println(m)
if (m > 0) pong ! (m - 1)
}
}
object System {
lazy val ping: Actor[Int] = new Ping(pong) //be careful with lazy vals mutual links between different systems (objects); that's why people prefer ActorRef
lazy val pong: Actor[Int] = new Ping(ping)
}
System.ping ! 5
Results:
import scala.concurrent._
defined trait Actor
defined class Ping
defined object System
res17: scala.concurrent.Future[Unit] = scala.concurrent.impl.Promise$DefaultPromise#6be61f2c
5
4
3
2
1
0
scala> System.ping ! 5; System.ping ! 7
5
7
4
6
3
5
2
res19: scala.concurrent.Future[Unit] = scala.concurrent.impl.Promise$DefaultPromise#54b053b1
4
1
3
0
2
1
0
This implementation is using two Java threads, so it's "twice" faster than counting without parallelization.
I'm working on a scala project right now, and I've decided to use Akka's agent library over the actor model, because it allows a more functional approach to concurrency.However, I'm having a problem running many different agents at a time. It seems like I'm capping at only having three or four agents running at once.
import akka.actor._
import akka.agent._
import scala.concurrent.ExecutionContext.Implicits.global
object AgentTester extends App {
// Create the system for the actors that power the agents
implicit val system = ActorSystem("ActorSystem")
// Create an agent for each int between 1 and 10
val agents = Vector.tabulate[Agent[Int]](10)(x=>Agent[Int](1+x))
// Define a function for each agent to execute
def printRecur(a: Agent[Int])(x: Int): Int = {
// Print out the stored number and sleep.
println(x)
Thread.sleep(250)
// Recur the agent
a sendOff printRecur(a) _
// Keep the agent's value the same
x
}
// Start each agent
for(a <- agents) {
Thread.sleep(10)
a sendOff printRecur(a) _
}
}
The above code creates an agent holding each integer between 1 and 10. The loop at the bottom sends the printRecur function to every agent. The output of the program should show the numbers 1 through 10 being printed out every quarter of a second (although not in any order). However, for some reason my output only shows the numbers 1 through 4 being outputted.
Is there a more canonical way to use agents in Akka that will work? I come from a clojure background and have used this pattern successfully there before, so I naively used the same pattern in Scala.
My guess is that you are running on a 4 core box and that is part of the reason why you only ever see the numbers 1-4. The big thing at play here is that you are using the default execution context which I'm guessing on your system uses a thread pool with only 4 threads on it (one for each core). With the way you've coded this in this sort of recursive manner, my guess is that the first 4 agents never relinquish the threads and they are the only ones that will ever print anything.
You can easily fix this by removing this line:
import scala.concurrent.ExecutionContext.Implicits.global
And adding this line after you create the ActorSystem
import system.dispatcher
This will use the default dispatcher for the actor system which is a fork join dispatcher which does not seem to have the same issue as the default execution context you imported in your sample.
You could also consider using send as opposed to sendOff as that will use the execution context that was available when you constructed the agent. I would think one would use sendOff when they had a case where they explicitly wanted to use another execution context.