Some help needed to help the type inferring engine - scala

I have trouble understanding where to put type information in Scala, and how to write it. Here I create several sequences of actors without annotating their types. Even if I had to annotate them, I wouldn't know which sequence type map produces.
Later, when the compiler complains that I'm trying to sum Anys, I have no idea where to begin filling in the gaps.
Here is my code; I tried to minimize it while keeping the necessary information.
object Actors {
  def main(args: Array[String]) {
    val array = randomArray(5)
    val master = new Master(array, 5)
  }

  def randomArray(length: Int): Array[Int] = {
    val generator = new Random
    new Array[Int](length) map((_:Int) => generator nextInt)
  }
}

class Master(array: Array[Int], slavesNumber: Int) extends Actor {
  def act () {
    val slaves = (1 to slavesNumber).map(_ => new Slave)
    slaves.foreach(s => s.start)
    val futures = slaves.map(s => s !! Work(array))
    val results = awaitAll(3000, futures:_*)
    val res2 = results.flatMap(x => x)
    println((0 /: res2)(_+_))
  }
}

class Slave() extends Actor {
  def act () {
    Actor.loop {
      receive {
        case Work(slice) =>
          reply((slice :\ 0)(_+_))
      }
    }
  }
}
I'd also appreciate some good pointers to comprehensive documentation on the matter.

The objects passed between actors are not typed; actors have to filter the messages themselves, as you already do in the Slave actor. As you can see, !! is defined as
def !!(msg: Any): Future[Any]
so there is no type information in the returned Future. Probably the easiest solution is to replace the line val res2 = ... with
val res2 = results collect {case Some(y:Int) => y}
This keeps only those Some results whose content is an Int.
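To see the effect of collect without the actors library, here is a minimal sketch. The results list is made-up stand-in data with the same shape as what awaitAll returns (a List[Option[Any]]); CollectInts and its members are hypothetical names, not part of the original code.

```scala
object CollectInts {
  // Stand-in for the List[Option[Any]] returned by awaitAll:
  // Some(reply) for answered futures, None for timed-out ones.
  val results: List[Option[Any]] = List(Some(3), None, Some("oops"), Some(4))

  // collect keeps only the elements the partial function matches,
  // so the result is statically typed as List[Int].
  val ints: List[Int] = results.collect { case Some(y: Int) => y }

  // With a List[Int] the fold type-checks; foldLeft is the named form of /:
  val total: Int = ints.foldLeft(0)(_ + _)
}
```

Because the pattern carries the Int type test, the compiler no longer sees Any when summing.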


Iterate a data source asynchronously in batches and stop when the remote returns no data in Scala

Let's say we have a fake data source that returns the data it holds in batches
class DataSource(size: Int) {
private var s = 0
implicit val g =
def getData(): Future[List[Int]] = {
s = s + 1
Future {
Thread.sleep(Random.nextInt(s * 100))
if (s <= size) {
} else {
object Test extends App {
val source = new DataSource(100)
implicit val g =
def process(v: List[Int]): Unit = {
def next(f: (List[Int]) => Unit): Unit = {
val fut = source.getData()
fut.onComplete {
case Success(v) => {
v match {
case h :: t => next(f)
I have my own version, but the problem is that part of it is not pure. Ideally, I would like to wrap the Future for each batch into one big Future that succeeds when the last batch returns an empty list. My situation is a little different from this post: the next() there is a synchronous call, while mine is also async.
Or is it even possible to do what I want? The next batch should only be fetched once the previous one has resolved, and whether to fetch the next batch depends on the size returned.
What's the best way to walk through this type of data source? Are there any existing Scala frameworks that provide the feature I am looking for? Are Play's Iteratee, Enumerator, and Enumeratee the right tools? If so, can anyone provide an example of how to use them to implement what I am looking for?
With help from chunjef, I tried it out, and it actually worked for me. However, I made a small change based on his answer.
Source.fromIterator(()=>Iterator.continually(source.getData())).mapAsync(1) (f=>f.filter(_.size > 0))
However, can someone give a comparison between Akka Streams and Play Iteratees? Is it worth also trying out Iteratees?
Code snip 1:
Source.fromIterator(() => Iterator.continually(ds.getData)) // line 1
.mapAsync(1)(identity) // line 2
.takeWhile(_.nonEmpty) // line 3
.runForeach(println) // line 4
Code snip 2: Assume getData depends on the output of another flow, and I would like to concatenate it with the flow below. However, it yields a "too many open files" error. I'm not sure what causes this error; mapAsync has been limited to a parallelism of 1, if I understood correctly.
Flow[Int].mapConcat[Future[List[Int]]](c => {
The following is one way to achieve the same behavior with Akka Streams, using your DataSource class:
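import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.Source
import scala.concurrent.Future
import scala.util.Random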
object StreamsExample extends App {
implicit val system = ActorSystem("Sandbox")
implicit val materializer = ActorMaterializer()
val ds = new DataSource(100)
Source.fromIterator(() => Iterator.continually(ds.getData)) // line 1
.mapAsync(1)(identity) // line 2
.takeWhile(_.nonEmpty) // line 3
.runForeach(println) // line 4
class DataSource(size: Int) {
A simplified line-by-line overview:
line 1: Creates a stream source that continually calls ds.getData if there is downstream demand.
line 2: mapAsync is a way to deal with stream elements that are Futures. In this case, the stream elements are of type Future[List[Int]]. The argument 1 is the level of parallelism: we specify 1 here because DataSource internally uses a mutable variable, and a parallelism level greater than one could produce unexpected results. identity is shorthand for x => x, which basically means that for each Future, we pass its result downstream without transforming it.
line 3: Essentially, ds.getData is called as long as the result of the Future is a non-empty List[Int]. If an empty List is encountered, processing is terminated.
line 4: runForeach here takes a function List[Int] => Unit and invokes that function for each stream element.
Ideally, I would like to wrap the Future for each batch into a big future, and the wrapper future success when last batch returned 0 size list?
I think you are looking for a Promise.
You would set up a Promise before you start the first iteration.
This gives you promise.future, a Future that you can then use to follow the completion of everything.
In your onComplete, you add a case _ => promise.success(()).
Something like
def loopUntilDone(f: (List[Int]) => Unit): Future[Unit] = {
  val promise = Promise[Unit]()
  def next(): Unit = source.getData().onComplete {
    case Success(v) =>
      f(v)
      v match {
        case h :: t => next()
        case _ => promise.success(())
      }
    case Failure(e) => promise.failure(e)
  }
  // get going
  next()
  // return the Future for everything
  promise.future
}
// future for everything, this is a `Future[Unit]`
// its `onComplete` will be triggered when there is no more data
val everything = loopUntilDone(process)
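The Promise-based loop can be exercised end to end with only the standard library. The sketch below is an illustration under assumptions: FakeSource is a hypothetical stand-in for DataSource (it serves a fixed number of non-empty batches, then empty lists), and unlike the question's code it does not pass the final empty batch to the callback.

```scala
import scala.collection.mutable.ListBuffer
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.{Failure, Success}

object PromiseLoop {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical stand-in for DataSource: serves `batches` non-empty
  // batches, then only empty lists.
  class FakeSource(batches: Int) {
    private val served = new java.util.concurrent.atomic.AtomicInteger(0)
    def getData(): Future[List[Int]] = Future {
      val n = served.incrementAndGet()
      if (n <= batches) List.fill(3)(n) else Nil
    }
  }

  // The returned Future completes once an empty batch arrives,
  // or fails as soon as any fetch fails.
  def loopUntilDone(source: FakeSource)(f: List[Int] => Unit): Future[Unit] = {
    val promise = Promise[Unit]()
    def next(): Unit = source.getData().onComplete {
      case Success(batch) if batch.nonEmpty =>
        f(batch)
        next() // the next fetch starts only after this batch was handled
      case Success(_) => promise.success(())
      case Failure(e) => promise.failure(e)
    }
    next()
    promise.future
  }

  // Runs the loop and returns the batches in the order they were processed.
  def run(batches: Int): List[List[Int]] = {
    val seen = ListBuffer.empty[List[Int]]
    val done = loopUntilDone(new FakeSource(batches)) { batch =>
      seen.synchronized { seen += batch }
      ()
    }
    Await.result(done, 10.seconds)
    seen.synchronized { seen.toList }
  }
}
```

Note that if the callback f throws, the promise in this sketch is never completed; production code would want to catch that and fail the promise.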
You are probably looking for a reactive streams library. My personal favorite (and the one I'm most familiar with) is Monix. This is how it would work with DataSource unchanged:
import scala.concurrent.duration.Duration
import scala.concurrent.Await
import monix.reactive.Observable
object Test extends App {
val source = new DataSource(100)
val completed = // <- this is Future[Unit], completes when foreach is done
.flatten // <- Here it's Observable[List[Int]], it has collection-like methods
Await.result(completed, Duration.Inf)
I just figured out that flatMapConcat achieves what I wanted. There is no point in starting another question since I already have the answer, so I'm putting my sample code here in case someone is looking for a similar answer.
This type of API is very common in integrations between traditional enterprise applications. The DataSource mocks the API, while the Test object (which extends App) demonstrates how client code can use Akka Streams to consume it.
In my small project the API was provided over SOAP, and I used scalaxb to turn the SOAP interface into async-style Scala. With the client calls demonstrated in the Test object, we can consume the API with Akka Streams. Thanks to all for the help.
class DataSource(size: Int) {
private var transactionId: Long = 0
private val transactionCursorMap: mutable.HashMap[TransactionId, Set[ReadCursorId]] = mutable.HashMap.empty
private val cursorIteratorMap: mutable.HashMap[ReadCursorId, Iterator[List[Int]]] = mutable.HashMap.empty
implicit val g =
case class TransactionId(id: Long)
case class ReadCursorId(id: Long)
def startTransaction(): Future[TransactionId] = {
Future {
synchronized {
transactionId += transactionId
val t = TransactionId(transactionId)
transactionCursorMap.update(t, Set(ReadCursorId(0)))
def createCursorId(t: TransactionId): ReadCursorId = {
synchronized {
val c = transactionCursorMap.getOrElseUpdate(t, Set(ReadCursorId(0)))
val currentId = c.foldLeft(0l) { (acc, a) => acc.max( }
val cId = ReadCursorId(currentId + 1)
transactionCursorMap.update(t, c + cId)
cursorIteratorMap.put(cId, createIterator)
def createIterator(): Iterator[List[Int]] = {
(for {i <- 1 to 100} yield List.fill(100)(i)).toIterator
def startRead(t: TransactionId): Future[ReadCursorId] = {
Future {
def getData(cursorId: ReadCursorId): Future[List[Int]] = {
synchronized {
Future {
cursorIteratorMap.get(cursorId) match {
case Some(i) =>
case _ => List()
object Test extends App {
val source = new DataSource(10)
implicit val system = ActorSystem("Sandbox")
implicit val materializer = ActorMaterializer()
implicit val g =
// def process(v: List[Int]): Unit = {
// println(v)
// }
// def next(f: (List[Int]) => Unit): Unit = {
// val fut = source.getData()
// fut.onComplete {
// case Success(v) => {
// f(v)
// v match {
// case h :: t => next(f)
// }
// }
// }
// }
// next(process)
// Thread.sleep(1000000000)
val s = Source.fromFuture(source.startTransaction())
.map { e =>
e => {
Source.fromIterator(() => Iterator.continually(source.getData(e)))
val done = Source.fromIterator(() => Iterator.continually(source.getData())).mapAsync(1)(identity)
.runFold(List[List[Int]]()) { (acc, r) =>
// println("=======" + acc + r)
r :: acc
done.onSuccess {
case e => {
done.onComplete(_ => system.terminate())

Explanation on the error with for comprehension and co-variance

I would like help understanding the cause of this error. The original code is from the Coursera Scala design course (Functional Random Generators).
Given the factories for random Int and random Boolean values, I am trying to implement a random tree factory.
trait Factory[+T] {
  self => // alias of 'this'
  def generate: T
  def map[S](f: T => S): Factory[S] = new Factory[S] {
    def generate = f(self.generate)
  }
  def flatMap[S](f: T => Factory[S]): Factory[S] = new Factory[S] {
    def generate = f(self.generate).generate
  }
}

val intFactory = new Factory[Int] {
  val rand = new java.util.Random
  def generate = rand.nextInt()
}
val boolFactory = intFactory.map(i => i > 0)
The implementation in the first block causes the error, but the second block does not. I believe Factory[+T] means that Factory[Inner] and Factory[Leaf] can both be treated as Factory[Tree].
I have no idea why the same if expression is OK in the for block but not in the yield block. I would appreciate an explanation.
trait Tree
case class Inner(left: Tree, right: Tree) extends Tree
case class Leaf(x: Int) extends Tree

def leafFactory: Factory[Leaf] = intFactory.map(i => new Leaf(i))
def innerFactory: Factory[Inner] = new Factory[Inner] {
  def generate = new Inner(treeFactory.generate, treeFactory.generate)
}
def treeFactory: Factory[Tree] = for {
  isLeaf <- boolFactory
} yield if (isLeaf) leafFactory else innerFactory
                    ^^^^^^^^^^^      ^^^^^^^^^^^^
type mismatch; found : Factory[Inner] required: Tree
type mismatch; found : Factory[Leaf] required: Tree
However, the version below works.
def treeFactory: Factory[Tree] = for {
  isLeaf <- boolFactory
  tree <- if (isLeaf) leafFactory else innerFactory
} yield tree
I have no idea why the same if expression in for block is OK but it is not OK in yield block
Because the compiler translates them differently. The version with the if expression inside the for block is translated into:
boolFactory.flatMap((isLeaf: Boolean) => if (isLeaf) leafFactory else innerFactory)
which yields the expected Factory[Tree], while the version with the if expression after yield is translated into:
boolFactory.map((isLeaf: Boolean) => if (isLeaf) leafFactory else innerFactory)
which yields a Factory[Factory[Tree]], not a Factory[Tree], and thus does not conform to your method signature. This isn't about covariance, but about how the for comprehension translates these two forms differently.
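The difference between the two translations shows up directly in the types. The following is a small sketch that reuses the Factory trait from the question; constant, flag, yes, and no are hypothetical helpers introduced only to make the result deterministic.

```scala
object ForDesugar {
  trait Factory[+T] { self =>
    def generate: T
    def map[S](f: T => S): Factory[S] = new Factory[S] {
      def generate: S = f(self.generate)
    }
    def flatMap[S](f: T => Factory[S]): Factory[S] = new Factory[S] {
      def generate: S = f(self.generate).generate
    }
  }

  // Deterministic factory, used only for this illustration.
  def constant[T](value: T): Factory[T] = new Factory[T] {
    def generate: T = value
  }

  val flag: Factory[Boolean] = constant(true)
  val yes: Factory[String] = constant("yes")
  val no: Factory[String] = constant("no")

  // Generator position: translated roughly to
  //   flag.flatMap(b => (if (b) yes else no).map(s => s))
  // so the inner Factory is flattened away.
  val flat: Factory[String] = for {
    b <- flag
    s <- if (b) yes else no
  } yield s

  // yield position: translated to
  //   flag.map(b => if (b) yes else no)
  // so the inner Factory stays wrapped.
  val nested: Factory[Factory[String]] = for {
    b <- flag
  } yield if (b) yes else no
}
```

Getting a plain value out of nested takes two generate calls, which is exactly the extra layer the compiler complained about.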

cache using functional callbacks/ proxy pattern implementation scala

How can I implement a cache using functional programming?
A few days ago I came across callbacks and a proxy pattern implementation in Scala.
This code should apply the inner function only if the value is not already in the map.
But the map is reinitialized every time and the values are gone (which seems obvious).
How can I reuse the same cache across different function calls?
class Aggregator{
def memoize(function: Function[Int, Int] ):Function[Int,Int] = {
val cache = HashMap[Int, Int]()
(t:Int) => {
if (!cache.contains(t)) {
val r = function.apply(t);
def memoizedDoubler = memoize( (key:Int) => {
object Aggregator {
def main( args: Array[String] ) {
val agg = new Aggregator()
agg.memoizedDoubler(2)// It should not evaluate again but does
agg.memoizedDoubler(3)// It should not evaluate again but does
I see what you're trying to do here. It's not working because every call to memoizedDoubler first calls memoize, which creates a fresh cache. You need to declare memoizedDoubler as a val instead of a def if you want memoize to be called only once.
val memoizedDoubler = memoize( (key:Int) => {
This answer has a good explanation on the difference between def and val.
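The def versus val difference can be observed by counting real evaluations. This is a minimal sketch, not the original code: the body of the doubler is elided in the question, so key * 2 is an assumption, and MemoDemo, viaDef, and viaVal are hypothetical names.

```scala
import scala.collection.mutable

object MemoDemo {
  // Counts how often the wrapped function really runs.
  var evaluations = 0

  def memoize(function: Int => Int): Int => Int = {
    val cache = mutable.HashMap[Int, Int]() // one cache per memoize call
    (t: Int) => cache.getOrElseUpdate(t, function(t))
  }

  // Assumed body; the original doubler is not shown in full.
  private def doubler(key: Int): Int = { evaluations += 1; key * 2 }

  // def: memoize runs on every access, so each call starts with an empty cache.
  def viaDef: Int => Int = memoize(doubler)

  // val: memoize runs once, so all calls share the same cache.
  val viaVal: Int => Int = memoize(doubler)
}
```

Calling viaDef(2) twice evaluates the doubler twice, while calling viaVal(2) twice evaluates it only once.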
Aren't you declaring a new Map per invocation?
def memoize(function: Function[Int, Int] ):Function[Int,Int] = {
val cache = HashMap[Int, Int]()
rather than specifying one per instance of Aggregator?
class Aggregator{
private val cache = HashMap[Int, Int]()
def memoize(function: Function[Int, Int] ):Function[Int,Int] = {
To answer your question:
How to implement cache using functional programming
In functional programming there is no concept of mutable state. If you want to change something (like a cache), you need to return the updated cache instance along with the result and use it for the next call.
Here is a modification of your code that follows that approach. The function that calculates values and the cache are both incorporated into Aggregator. When memoize is called, it returns a tuple that contains the calculation result (possibly taken from the cache) and the new Aggregator that should be used for the next call.
class Aggregator(function: Function[Int, Int], cache:Map[Int, Int] = Map.empty) {
def memoize:Int => (Int, Aggregator) = {
t:Int =>
cache.get(t).map {
res =>
(res, Aggregator.this)
}.getOrElse {
val res = function(t)
(res, new Aggregator(function, cache + (t -> res)))
object Aggregator {
def memoizedDoubler = new Aggregator((key:Int) => {
println("Evaluating..." + key)
def main(args: Array[String]) {
val (res, doubler1) = memoizedDoubler.memoize(2)
val (res1, doubler2) = doubler1.memoize(2)
val (res2, doubler3) = doubler2.memoize(3)
val (res3, doubler4) = doubler3.memoize(3)
This prints:

Scala Future[A] and Future[Option[B]] composition

I have an app that manages Items. When the client queries an item by some info, the app first tries to find an existing item in the db with the info. If there isn't one, the app would
1. Check if info is valid. This is an expensive operation (much more so than a db lookup), so the app only performs this when there isn't an existing item in the db.
2. If info is valid, insert a new Item into the db with info.
There are two more classes, ItemDao and ItemService:
object ItemDao {
  def findByInfo(info: Info): Future[Option[Item]] = ...
  // This DOES NOT validate info; it assumes info is valid
  def insertIfNotExists(info: Info): Future[Item] = ...
}

object ItemService {
  // Very expensive
  def isValidInfo(info: Info): Future[Boolean] = ...
  // Ugly
  def findByInfo(info: Info): Future[Option[Item]] = {
    ItemDao.findByInfo(info) flatMap { maybeItem =>
      if (maybeItem.isDefined)
        Future.successful(maybeItem)
      else
        isValidInfo(info) flatMap {
          if (_) ItemDao.insertIfNotExists(info) map (Some(_))
          else Future.successful(None)
        }
    }
  }
}
The ItemService.findByInfo(info: Info) method is pretty ugly. I've been trying to clean it up for a while, but it's difficult since there are three types involved (Future[Boolean], Future[Item], and Future[Option[Item]]). I've tried to use scalaz's OptionT to clean it up but the non-optional Futures make it not very easy either.
Any ideas on a more elegant implementation?
To expand on my comment.
Since you've already indicated a willingness to go down the route of monad transformers, this should do what you want. There is unfortunately quite a bit of line noise due to Scala's less than stellar typechecking here, but hopefully you find it elegant enough.
import scalaz._
import Scalaz._
object ItemDao {
def findByInfo(info: Info): Future[Option[Item]] = ???
// This DOES NOT validate info; it assumes info is valid
def insertIfNotExists(info: Info): Future[Item] = ???
object ItemService {
// Very expensive
def isValidInfo(info: Info): Future[Boolean] = ???
def findByInfo(info: Info): Future[Option[Item]] = {
lazy val nullFuture = OptionT(Future.successful(none[Item]))
lazy val insert = ItemDao.insertIfNotExists(info).liftM[OptionT]
lazy val validation =
.ifM(insert, nullFuture)
val maybeItem = OptionT(ItemDao.findByInfo(info))
val result = maybeItem <+> validation
Two comments about the code:
We are using the OptionT monad transformer here to capture the Future[Option[_]] stuff and anything that just lives inside Future[_] we're liftMing up to our OptionT[Future, _] monad.
<+> is an operation provided by MonadPlus. In a nutshell, as the name suggests, MonadPlus captures the intuition that often times monads have an intuitive way of being combined (e.g. List(1, 2, 3) <+> List(4, 5, 6) = List(1, 2, 3, 4, 5, 6)). Here we're using it to short-circuit when findByInfo returns Some(item) rather than the usual behavior to short-circuit on None (this is roughly analogous to List(item) <+> List() = List(item)).
Other small note, if you actually wanted to go down the monad transformers route, often times you end up building everything in your monad transformer (e.g. ItemDao.findByInfo would return an OptionT[Future, Item]) so that you don't have extraneous OptionT.apply calls and then .run everything at the end.
You don't need scalaz for this. Just break your flatMap into two steps:
first, find and validate, then insert if necessary. Something like this:
ItemDao.findByInfo(info).flatMap {
case None => isValidInfo(info).map(None -> _)
case x => Future.successful(x -> true)
}.flatMap {
case (_, true) => ItemDao.insertIfNotExists(info).map(Some(_))
case (x, _) => Future.successful(x)
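For experimentation, the two-step flatMap can be run against in-memory stand-ins. Everything below other than the shape of the two flatMap steps is an assumption: Info and Item are hypothetical minimal models, the DAO is a TrieMap, and validity is simply "non-empty". One deliberate deviation: a hit is paired with false, so insertIfNotExists is not called again for items that already exist.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.concurrent.TrieMap
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ItemLookup {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical minimal models; the original Info and Item are not shown.
  case class Info(value: String)
  case class Item(info: Info)

  // In-memory stand-ins for ItemDao and the expensive validity check.
  private val db = TrieMap.empty[Info, Item]
  val validations = new AtomicInteger(0)

  def daoFind(info: Info): Future[Option[Item]] = Future(db.get(info))
  def daoInsertIfNotExists(info: Info): Future[Item] =
    Future(db.getOrElseUpdate(info, Item(info)))
  def isValidInfo(info: Info): Future[Boolean] = Future {
    validations.incrementAndGet()
    info.value.nonEmpty
  }

  def findByInfo(info: Info): Future[Option[Item]] =
    daoFind(info).flatMap {
      // Step 1: validate only on a miss; a hit needs no validation.
      case None  => isValidInfo(info).map(valid => (Option.empty[Item], valid))
      case found => Future.successful((found, false))
    }.flatMap {
      // Step 2: insert only for a validated miss.
      case (None, true) => daoInsertIfNotExists(info).map(Some(_))
      case (found, _)   => Future.successful(found)
    }

  // Blocking helper, for demonstration only.
  def lookup(info: Info): Option[Item] = Await.result(findByInfo(info), 5.seconds)
}
```

Looking up the same valid info twice triggers the validation only once, because the second lookup is a db hit.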
Doesn't look too bad, does it? If you don't mind running validation in parallel with retrieval (marginally more expensive resource-wise, but likely faster on average), you could further simplify it like this:
ItemDao.findByInfo(info).zip(isValidInfo(info))
  .flatMap {
    case (None, true) => ItemDao.insertIfNotExists(info).map(Some(_))
    case (x, _) => Future.successful(x)
  }
Also, what does insertIfNotExists return if the item does exist? If it returned the existing item, things could be even simpler:
isValidInfo(info)
  .filter(identity)
  .flatMap { _ => ItemDao.insertIfNotExists(info) }
  .map { item => Some(item) }
  .recover { case _: NoSuchElementException => None }
If you are comfortable with path-dependent types and higher-kinded types, something like the following can be an elegant solution:
type Const[A] = A
sealed trait Request {
type F[_]
type A
type FA = F[A]
def query(client: Client): Future[FA]
case class FindByInfo(info: Info) extends Request {
type F[x] = Option[x]
type A = Item
def query(client: Client): Future[Option[Item]] = ???
case class CheckIfValidInfo(info: Info) extends Request {
type F[x] = Const[x]
type A = Boolean
def query(client: Client): Future[Boolean] = ???
class DB {
private val dbClient: Client = ???
def exec(request: Request): Future[request.FA] = request.query(dbClient)
What this does is abstract over both the wrapper type (e.g. Option[_]) and the inner type. For types without a wrapper, we use the Const[_] type, which is basically an identity type.
In Scala, many problems like this can be solved elegantly using algebraic data types and the advanced type system (i.e. path-dependent types and higher-kinded types). Note that we now have a single point of entry, exec(request: Request), for executing db requests instead of something like a DAO.

Thread-safely transforming a value in a mutable map

Suppose I want to use a mutable map in Scala to keep track of the number of times I've seen some strings. In a single-threaded context, this is easy:
import scala.collection.mutable.{ Map => MMap }
class Counter {
val counts = MMap.empty[String, Int].withDefaultValue(0)
def add(s: String): Unit = counts(s) += 1
Unfortunately this isn't thread-safe, since the get and the update don't happen atomically.
Concurrent maps add a few atomic operations to the mutable map API, but not the one I need, which would look something like this:
def replace(k: A, f: B => B): Option[B]
I know I can use ScalaSTM's TMap:
import scala.concurrent.stm._
class Counter {
val counts = TMap.empty[String, Int]
def add(s: String): Unit = atomic { implicit txn =>
counts(s) = counts.get(s).getOrElse(0) + 1
But (for now) that's still an extra dependency. Other options would include actors (another dependency), synchronization (potentially less efficient), or Java's atomic references (less idiomatic).
In general I'd avoid mutable maps in Scala, but I've occasionally needed this kind of thing, and most recently I've used the STM approach (instead of just crossing my fingers and hoping I don't get bitten by the naïve solution).
I know there are a number of trade-offs here (extra dependencies vs. performance vs. clarity, etc.), but is there anything like a "right" answer to this problem in Scala 2.10?
How about this one? Assuming you don't really need a general replace method right now, just a counter.
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger
object CountedMap {
  private val counts = new ConcurrentHashMap[String, AtomicInteger]
  def add(key: String): Int = {
    val zero = new AtomicInteger(0)
    val value = Option(counts.putIfAbsent(key, zero)).getOrElse(zero)
    value.incrementAndGet()
  }
}
You get better performance than synchronizing on the whole map, and you also get atomic increments.
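A quick way to convince yourself that no increments are lost is to hammer one key from several threads. This is a sketch along the lines of the answer; AtomicCounter, get, and stress are hypothetical names added for the demonstration.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger

object AtomicCounter {
  private val counts = new ConcurrentHashMap[String, AtomicInteger]

  // putIfAbsent returns null when our fresh counter was installed, or the
  // counter that was already there; either way exactly one counter wins.
  def add(key: String): Int = {
    val fresh = new AtomicInteger(0)
    val counter = Option(counts.putIfAbsent(key, fresh)).getOrElse(fresh)
    counter.incrementAndGet()
  }

  def get(key: String): Int =
    Option(counts.get(key)).map(_.get()).getOrElse(0)

  // Increments `key` from several threads and returns the final count.
  def stress(key: String, threads: Int, perThread: Int): Int = {
    val workers = (1 to threads).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = (1 to perThread).foreach(_ => add(key))
      })
    }
    workers.foreach(_.start())
    workers.foreach(_.join())
    get(key)
  }
}
```

The map only guards the installation of the counter; the increments themselves are handled by AtomicInteger, so threads working on different keys never contend.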
The simplest solution is definitely synchronization. If there is not too much contention, performance might not be that bad.
Otherwise, you could try to roll up your own STM-like replace implementation. Something like this might do:
object ConcurrentMapOps {
private val rng = new util.Random
private val MaxReplaceRetryCount = 10
private val MinReplaceBackoffTime: Long = 1
private val MaxReplaceBackoffTime: Long = 20
implicit class ConcurrentMapOps[A, B]( val m: collection.concurrent.Map[A,B] ) {
import ConcurrentMapOps._
private def replaceBackoff() {
Thread.sleep( (MinReplaceBackoffTime + rng.nextFloat * (MaxReplaceBackoffTime - MinReplaceBackoffTime) ).toLong ) // A bit crude, I know
def replace(k: A, f: B => B): Option[B] = {
m.get( k ) match {
case None => return None
case Some( old ) =>
var retryCount = 0
while ( retryCount <= MaxReplaceRetryCount ) {
val done = m.replace( k, old, f( old ) )
if ( done ) {
return Some( old )
else {
retryCount += 1
sys.error("Could not concurrently modify map")
Note that collision issues are localized to a given key. If two threads access the same map but work on distinct keys, you'll have no collisions and the replace operation will always succeed the first time. If a collision is detected, we wait a bit (a random amount of time, so as to minimize the likeliness of threads fighting forever for the same key) and try again.
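The retry idea can also be written as a compare-and-swap loop that re-reads the current value before every attempt, since a retry with a stale old value cannot succeed once another thread has changed the entry. The sketch below is a simplified variant, not the code above: it has no backoff and no retry limit, and CasReplace, replaceWith, and stress are hypothetical names.

```scala
import scala.annotation.tailrec
import scala.collection.concurrent.TrieMap

object CasReplace {
  // Applies f to the value at k via compare-and-swap. Returns the previous
  // value, or None if the key is absent (or was removed while retrying).
  // f may run more than once, so it should be free of side effects.
  @tailrec
  def replaceWith[A, B](m: scala.collection.concurrent.Map[A, B], k: A)(f: B => B): Option[B] =
    m.get(k) match {
      case None => None
      case Some(old) =>
        // replace(k, old, new) succeeds only if k still maps to old.
        if (m.replace(k, old, f(old))) Some(old)
        else replaceWith(m, k)(f) // lost the race: re-read and try again
    }

  // Increments one key from several threads and returns the final value.
  def stress(threads: Int, perThread: Int): Int = {
    val m = TrieMap("k" -> 0)
    val workers = (1 to threads).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = (1 to perThread).foreach(_ => replaceWith(m, "k")(_ + 1))
      })
    }
    workers.foreach(_.start())
    workers.foreach(_.join())
    m("k")
  }
}
```

Under heavy contention on a single key an unbounded loop like this can spin for a while, which is the situation the random backoff described above is meant to soften.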
I cannot guarantee that this is production-ready (I just threw it together), but it might do the trick.
UPDATE: Of course (as Ionuț G. Stan pointed out), if all you want is to increment/decrement a value, Java's ConcurrentHashMap already provides those operations in a lock-free manner.
My above solution applies if you need a more general replace method that would take the transformation function as a parameter.
You're asking for trouble if your map is just sitting there as a val. If it meets your use case, I'd recommend something like
class Counter {
  private[this] val myCounts = MMap.empty[String, Int].withDefaultValue(0)
  def counts(s: String) = myCounts.synchronized { myCounts(s) }
  def add(s: String) = myCounts.synchronized { myCounts(s) += 1 }
  def getCounts = myCounts.synchronized { Map[String,Int]() ++ myCounts }
}
for low-contention usage. For high-contention, you should use a concurrent map designed to support such use (e.g. java.util.concurrent.ConcurrentHashMap) and wrap the values in AtomicWhatever.
If you are OK with a Future-based interface:
trait SingleThreadedExecutionContext {
val ec = ExecutionContext.fromExecutor(Executors.newSingleThreadExecutor())
class Counter extends SingleThreadedExecutionContext {
private val counts = MMap.empty[String, Int].withDefaultValue(0)
def get(s: String): Future[Int] = future(counts(s))(ec)
def add(s: String): Future[Unit] = future(counts(s) += 1)(ec)
A test would look like:
class MutableMapSpec extends Specification {
"thread safe" in {
val c = new Counter
val testData = Seq.fill(16)("1")
await(c.get("1")) mustEqual 16
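The single-threaded executor approach can also be tried without a test framework. The following is a sketch under assumptions: it uses Future.apply in place of the older future helper, a daemon thread so the JVM can exit without an explicit shutdown, and SingleThreadCounter and demo are hypothetical names.

```scala
import java.util.concurrent.{Executors, ThreadFactory}
import scala.collection.mutable
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object SingleThreadCounter {
  // Daemon thread so that a forgotten shutdown does not keep the JVM alive.
  private val executor = Executors.newSingleThreadExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r)
      t.setDaemon(true)
      t
    }
  })

  // Every map access runs on this one thread, so accesses never overlap.
  private val ec = ExecutionContext.fromExecutor(executor)

  private val counts = mutable.Map.empty[String, Int].withDefaultValue(0)

  def get(s: String): Future[Int] = Future(counts(s))(ec)
  def add(s: String): Future[Unit] = Future { counts(s) += 1 }(ec)

  // Issues 16 concurrent adds for one key, then reads the count back.
  def demo(key: String): Int = {
    implicit val callbacks: ExecutionContext = ExecutionContext.global
    val adds = Future.sequence(Seq.fill(16)(add(key)))
    Await.result(adds.flatMap(_ => get(key)), 10.seconds)
  }
}
```

The map itself stays a plain mutable map; thread safety comes entirely from confining every access to the single executor thread.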