Based on my previous question about locks based on value equality rather than reference equality, I came up with the following implementation:
/**
* A utility that provides synchronization based on value equality rather than referential equality.
* It is guaranteed that if two objects are value-equal, their corresponding blocks are invoked mutually exclusively.
* But the converse may not be true: even if two objects are not value-equal, their blocks may still be invoked exclusively (when their hashes land on the same stripe).
* Note: Typically there is no need to create instances of this class. The default instance in the companion object can be safely reused.
*
* @param size There is a roughly 1/size probability that two invocations that could have run concurrently are not run concurrently.
*
* Example usage:
* import EquivalenceLock.{defaultInstance => lock}
* def run(person: Person) = lock(person) { .... }
*/
class EquivalenceLock(val size: Int) {
private[this] val locks = IndexedSeq.fill(size)(new Object())
def apply[U](lock: Any)(f: => U) = locks(lock.hashCode().abs % size).synchronized(f)
}
object EquivalenceLock {
implicit val defaultInstance = new EquivalenceLock(1 << 10)
}
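One edge case worth noting (not raised in the question): hashCode().abs is itself negative for Int.MinValue, since the absolute value overflows, so the stripe lookup above can throw for that one pathological hash. A minimal sketch of a safer index computation, using java.lang.Math.floorMod:
// Math.floorMod is always in [0, size) for a positive size, unlike .abs,
// which returns Int.MinValue unchanged.
def stripeIndex(key: Any, size: Int): Int = Math.floorMod(key.hashCode(), size)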
I wrote some tests to verify that my lock functions as expected:
import EquivalenceLock.{defaultInstance => lock}
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.collection.mutable
val journal = mutable.ArrayBuffer.empty[String]
def log(msg: String) = journal.synchronized {
println(msg)
journal += msg
}
def test(id: String, napTime: Int) = Future {
lock(id) {
log(s"Entering $id=$napTime")
Thread.sleep(napTime * 1000L)
log(s"Exiting $id=$napTime")
}
}
test("foo", 5)
test("foo", 2)
Thread.sleep(20 * 1000L)
val validAnswers = Set(
Seq("Entering foo=5", "Exiting foo=5", "Entering foo=2", "Exiting foo=2"),
Seq("Entering foo=2", "Exiting foo=2", "Entering foo=5", "Exiting foo=5")
)
println(s"Final state = $journal")
assert(validAnswers(journal))
The above test works as expected (verified over millions of runs). But when I change the following line:
def apply[U](lock: Any)(f: => U) = locks(lock.hashCode().abs % size).synchronized(f)
to this:
def apply[U](lock: Any) = locks(lock.hashCode().abs % size).synchronized _
the tests fail.
Expected:
Entering foo=5
Exiting foo=5
Entering foo=2
Exiting foo=2
OR
Entering foo=2
Exiting foo=2
Entering foo=5
Exiting foo=5
Actual:
Entering foo=5
Entering foo=2
Exiting foo=2
Exiting foo=5
The above two pieces of code should behave the same, and yet the tests fail for the second flavor (the one with partial application): lock(id) always enters concurrently for the same id. Why?
By default, function parameters are evaluated eagerly (by value), and eta-expanding synchronized _ yields a function with a plain by-value parameter. So
def apply[U](lock: Any) = locks(lock.hashCode().abs % size).synchronized _
is equivalent to
def apply[U](lock: Any)(f: U) = locks(lock.hashCode().abs % size).synchronized(f)
Note the parameter is f: U rather than f: => U: the block is fully evaluated before synchronized is entered, so only its already-computed result is passed in, and the actual work runs outside the lock.
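To make the by-value vs. by-name difference concrete, here is a minimal self-contained sketch (the names are mine, not from the question):
object EvalDemo extends App {
  // By-value: the argument is evaluated at the call site, before the method body runs.
  def byValue[U](f: U): U = { println("entering"); f }
  // By-name: the argument is evaluated inside the method body, at its point of use.
  def byName[U](f: => U): U = { println("entering"); f }
  byValue { println("this block runs BEFORE 'entering'") }
  byName { println("this block runs AFTER 'entering'") }
}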
Related
import scala.io._
object Sum {
def main(args: Array[String]): Unit = {
println("Enter some numbers and press ctrl-c")
val input = Source.fromInputStream(System.in)
val lines = input.getLines.toList
println("Sum " + sum(lines))
}
def toInt(in: String): Option[Int] =
try {
Some(Integer.parseInt(in.trim))
}
catch {
case e: NumberFormatException => None
}
def sum(in: Seq[String]) = {
val ints = in.flatMap(s => toInt(s))
ints.foldLeft(0)((a, b) => a + b)
}
}
I am trying to run this program. After entering the input I press Ctrl + C, but it gives this message:
E:\Scala>scala HelloWord.scala
Enter some numbers and press ctrl-c
1
2
3
Terminate batch job (Y/N)?
An additional observation: note that extending the App trait makes an object executable, hence there is no need to declare a main(...) method, for instance like this:
object Sum extends App {
import scala.io._
import scala.util._
val nums = Source.stdin.getLines.flatMap(v => Try(v.toInt).toOption)
println(s"Sum: ${nums.sum}")
}
Using Try, unsuccessful conversions from String to Int are turned into None and flattened out.
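For illustration, here is that conversion in isolation:
import scala.util.Try
Seq("1", "two", "3").flatMap(v => Try(v.toInt).toOption) // Seq(1, 3): "two" becomes None and is dropped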
Also note that object and class names are capitalized by convention; hence instead of object sum we write object Sum.
You can also use an external library; I really like the scallop API.
Try this piece of code. It should work as intended.
object Sum {
def main(args: Array[String]): Unit = {
val lines = io.Source.stdin.getLines
val numbers = lines.map(_.toInt)
println(s"Sum: ${numbers.sum}")
}
}
Plus, the correct shortcut to end the input stream is Ctrl + D (on a Windows console, use Ctrl + Z followed by Enter).
I am new to both ScalaMock and mocking in general. I am trying to test a method which calls a method in another (mocked) class and then calls a method on the returned object.
Detailed information:
So I am using ScalaTest and there are five classes involved in this test...
SubInstruction which I am testing
class SubInstruction(label: String, val result: Int, val op1: Int, val op2: Int) extends Instruction(label, "sub") {
override def execute(m: Machine) {
val value1 = m.regs(op1)
val value2 = m.regs(op2)
m.regs(result) = value1 - value2
}
}
object SubInstruction {
def apply(label: String, result: Int, op1: Int, op2: Int) =
new SubInstruction(label, result, op1, op2)
}
Machine which must be mocked for the test
case class Machine(labels: Labels, prog: Vector[Instruction]) {
private final val NUMBEROFREGISTERS = 32
val regs: Registers = new Registers(NUMBEROFREGISTERS)
override def toString(): String = {
prog.foldLeft("")(_ + _)
}
def execute(start: Int) =
start.until(prog.length).foreach(x => prog(x) execute this)
}
object Machine extends App {
if (args.length == 0) {
println("Machine: args should be sml code file to execute")
} else {
println("SML interpreter - Scala version")
val m = Translator(args(0)).readAndTranslate(new Machine(Labels(), Vector()))
println("Here is the program; it has " + m.prog.size + " instructions.")
println(m)
println("Beginning program execution.")
m.execute(0)
println("Ending program execution.")
println("Values of registers at program termination:")
println(m.regs + ".")
}
}
Registers which is required to construct a Machine object
case class Registers(size: Int) {
val registers: Array[Int] = new Array(size)
override def toString(): String =
registers.mkString(" ")
def update(k: Int, v: Int) = registers(k) = v
def apply(k: Int) = registers(k)
}
MockableMachine which I have created, as the original Machine class does not have an empty constructor and therefore (as I understand) cannot be mocked
class MockableMachine extends Machine(Labels(), Vector()){
}
and finally my test class SubInstructionTest which compiles but throws the exception below.
class SubInstructionTest extends FlatSpec with MockFactory with Matchers {
val label1 = "f0"
val result1 = 25
val op1_1 = 24
val op2_1 = 20
val sub1 = SubInstruction(label1, result1, op1_1, op2_1)
"A SubInstruction" should "retrieve the operands from the correct registers in the given machine " +
"when execute(m: Machine) is called, and perform the operation saving the " +
"result in the correct register." in {
val mockMachine = mock[MockableMachine]
inSequence {
(mockMachine.regs.apply _).expects(op1_1).returning(50)
(mockMachine.regs.apply _).expects(op2_1).returning(16)
(mockMachine.regs.update _).expects(result1, 34)
}
sub1.execute(mockMachine)
}
}
Throws:
java.lang.NoSuchMethodException: Registers.mock$apply$0()
I have been searching for a straightforward way to mock this class for hours, but have found nothing. For the time being I have settled on the workaround detailed below, but I was under the impression that mocking would offer a less convoluted solution to the problem of testing my SubInstruction class.
The workaround:
Delete the MockableMachine class and create a CustomMachine class which extends Machine and overrides the regs value with a mocked Registers instance provided at construction time:
class CustomMachine (mockedRegister: Registers) extends Machine(Labels(), Vector()) {
override val regs: Registers = mockedRegister
}
and a MockableRegisters class, which I have created as the original does not have an empty constructor and therefore (as I understand) cannot be mocked:
class MockableRegisters extends Registers(32) {
}
and the SubInstructionTest class written in a slightly different way
class SubInstructionTest extends FlatSpec with MockFactory with Matchers {
val label1 = "f0"
val result1 = 25
val op1_1 = 24
val op2_1 = 20
val sub1 = SubInstruction(label1, result1, op1_1, op2_1)
"A SubInstruction" should "retrieve the operands from the correct registers in the given machine " +
"when execute(m: Machine) is called, and perform the operation saving the " +
"result in the correct register." in {
val mockRegisters = mock[MockableRegisters]
val machine = new CustomMachine(mockRegisters)
inSequence {
(mockRegisters.apply _).expects(op1_1).returning(50)
(mockRegisters.apply _).expects(op2_1).returning(16)
(mockRegisters.update _).expects(result1, 34)
}
sub1.execute(machine)
}
}
As indicated, this feels like a workaround to me, is there not a simpler way to do this (perhaps similar to my original attempt)?
I have just included the essential code to ask the question, but you can find the full code on my GitHub account.
I don't think mocking nested objects is supported by ScalaMock out of the box. You'll have to mock the object returned by the first call, which is what your working example does.
FWIW, Mockito supports this. Search for RETURNS_DEEP_STUBS.
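For reference, a rough sketch of what the Mockito deep-stubs approach could look like for this test (untested against this codebase; it assumes Mockito is on the classpath and that Machine's members are mockable):
import org.mockito.Mockito.{mock, verify, when, RETURNS_DEEP_STUBS}
// With deep stubs, machine.regs automatically returns a mock Registers,
// whose apply/update calls can then be stubbed and verified directly.
val machine = mock(classOf[Machine], RETURNS_DEEP_STUBS)
when(machine.regs(24)).thenReturn(50)
when(machine.regs(20)).thenReturn(16)
SubInstruction("f0", 25, 24, 20).execute(machine)
verify(machine.regs).update(25, 34)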
I have an Iterable[String] and I want to stream that to an external Process and return an Iterable[String] for the output.
I feel like this should work, since it compiles:
import scala.sys.process._
object PipeUtils {
implicit class IteratorStream(s: TraversableOnce[String]) {
def pipe(cmd: String) = s.toStream.#>(cmd).lines
def run(cmd: String) = s.toStream.#>(cmd).!
}
}
However, Scala tries to execute the contents of s instead of passing them to standard input. Can anyone please tell me what I'm doing wrong?
UPDATE:
I think that my original problem was that s.toStream was being implicitly converted to a ProcessBuilder (via stringSeqToProcess, which treats a Seq[String] as a command and its arguments) and then executed. This is incorrect, as it's meant to be the input to the process.
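To illustrate the conversion that bites here, in isolation (this is standard scala.sys.process behavior):
import scala.sys.process._
// A Seq[String] is implicitly a ProcessBuilder: the head is the command, the tail its arguments.
// So a collection of data lines can silently become a command invocation:
val out = Seq("echo", "hello", "world").!! // runs `echo hello world` and returns "hello world\n"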
I have come up with the following solution. This feels very hacky and wrong but it seems to work for now. I'm not writing this as an answer because I feel like the answer should be one line and not this gigantic thing.
import java.io.OutputStream
import scala.collection.mutable.ListBuffer
import scala.io.Source
import scala.sys.process._
object PipeUtils {
/**
* This class feels wrong. I think that for the pipe command it actually loads all of the output
* into memory. This could blow up the machine if used wrong; however, I cannot figure out how to get it to
* work properly. Hopefully http://stackoverflow.com/questions/28095469/stream-input-to-external-process-in-scala
* will get some good responses.
* @param s the lines to feed to the process's standard input
*/
implicit class IteratorStream(s: TraversableOnce[String]) {
// Writes each input line to the process's stdin, then closes it.
val in = (in: OutputStream) => {
s.foreach(x => in.write((x + "\n").getBytes))
in.close()
}
def pipe(cmd: String) = {
val output = ListBuffer[String]()
val io = new ProcessIO(in,
out => {Source.fromInputStream(out).getLines.foreach(output += _)},
err => {Source.fromInputStream(err).getLines.foreach(println)})
cmd.run(io).exitValue
output.toIterable
}
def run(cmd: String) = {
cmd.run(BasicIO.standard(in)).exitValue
}
}
}
EDIT
The motivation for this comes from using Spark's .pipe function on an RDD. I want the exact same functionality in my local code.
Assuming Scala 2.11+, you should use lineStream as suggested by @edi. The reason is that you get a streaming response as it becomes available instead of a batched response. Let's say I have a shell script echo-sleep.sh:
#!/usr/bin/env bash
# echo-sleep.sh
while read line; do echo $line; sleep 1; done
and we want to call it from Scala using code like the following:
import scala.sys.process._
import scala.language.postfixOps
import java.io.ByteArrayInputStream
implicit class X(in: TraversableOnce[String]) {
// Don't do the BAOS construction in real code. Just for illustration.
def pipe(cmd: String) =
cmd #< new ByteArrayInputStream(in.mkString("\n").getBytes) lineStream
}
Then if we do a final call like:
1 to 10 map (_.toString) pipe "echo-sleep.sh" foreach println
a number in the sequence appears on STDOUT every 1 second. If you buffer, and convert to an Iterable as in your example, you will lose this responsiveness.
Here's a solution demonstrating how to write the process code so that it streams both the input and output. The key is to produce a java.io.PipedInputStream that is passed to the input of the process. This stream is filled from the iterator asynchronously via a java.io.PipedOutputStream. Obviously, feel free to change the input type of the implicit class to an Iterable.
Here's an iterator used to show this works.
/**
* An iterator with pauses used to illustrate data streaming to the process to be run.
*/
class PausingIterator[A](zero: A, until: A, pauseMs: Int)(subsequent: A => A)
extends Iterator[A] {
private[this] var current = zero
def hasNext = current != until
def next(): A = {
if (!hasNext) throw new NoSuchElementException
val r = current
current = subsequent(current)
Thread.sleep(pauseMs)
r
}
}
Here's the actual code you want
import java.io.PipedOutputStream
import java.io.PipedInputStream
import java.io.InputStream
import java.io.PrintWriter
// For process stuff
import scala.sys.process._
import scala.language.postfixOps
// For asynchronous stream writing.
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
/**
* A streaming version of the original class. This does not block to wait for the entire
* input or output to be constructed. This allows the process to get data ASAP and allows
* the process to return information back to the scala environment ASAP.
*
* NOTE: Don't forget about error handling in the final production code.
*/
implicit class X(it: Iterator[String]) {
def pipe(cmd: String) = cmd #< iter2is(it) lineStream
/**
* Convert an iterator to an InputStream for use in the pipe function.
* #param it an iterator to convert
*/
private[this] def iter2is[A](it: Iterator[A]): InputStream = {
// What is written to the output stream will appear in the input stream.
val pos = new PipedOutputStream
val pis = new PipedInputStream(pos)
val w = new PrintWriter(pos, true)
// Scala 2.11 (in Scala 2.10, use 'future'). Executes asynchronously.
// Fill the stream, then close.
Future {
it foreach w.println
w.close
}
// Return possibly before pis is fully written to.
pis
}
}
The final call will display 0 through 9, pausing for 3 seconds between numbers (a 2-second pause on the Scala side plus a 1-second pause on the shell script side).
// echo-sleep.sh is the same script as in my previous post
new PausingIterator(0, 10, 2000)(_ + 1)
.map(_.toString)
.pipe("echo-sleep.sh")
.foreach(println)
Output
0 [ pause 3 secs ]
1 [ pause 3 secs ]
...
8 [ pause 3 secs ]
9 [ pause 3 secs ]
Background: I have a function:
def doWork(symbol: String): Future[Unit]
which initiates some side-effects to fetch data and store it, and completes a Future when it's done. However, the back-end infrastructure has usage limits, such that no more than 5 of these requests can be made in parallel. I have a list of N symbols that I need to get through:
var symbols = Array("MSFT",...)
but I want to sequence them such that no more than 5 are executing simultaneously. Given:
val allowableParallelism = 5
my current solution is (assuming I'm working with async/await):
val symbolChunks = symbols.toList.grouped(allowableParallelism).toList
def toThunk(x: List[String]) = () => Future.sequence(x.map(doWork))
val symbolThunks = symbolChunks.map(toThunk)
val done = Promise[Unit]()
def procThunks(x: List[() => Future[List[Unit]]]): Unit = x match {
case Nil => done.success(())
case x::xs => x().onComplete(_ => procThunks(xs))
}
procThunks(symbolThunks)
await { done.future }
but, for obvious reasons, I'm not terribly happy with it. I feel like this should be possible with folds, but every time I try, I end up eagerly creating the Futures. I also tried out a version with RxScala Observables, using concatMap, but that also seemed like overkill.
Is there a better way to accomplish this?
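For reference, a fold can avoid eagerly creating the Futures if each chunk is built inside flatMap; a minimal sketch of that idea (using the same doWork and allowableParallelism as above):
import scala.concurrent.{ExecutionContext, Future}
// The Futures for each chunk are only created once the previous chunk's Future
// has completed, because the chunk is mapped inside flatMap.
def processChunked(symbols: Seq[String], parallelism: Int)(doWork: String => Future[Unit])(implicit ec: ExecutionContext): Future[Unit] =
  symbols.grouped(parallelism).foldLeft(Future.successful(())) { (acc, chunk) =>
    acc.flatMap(_ => Future.sequence(chunk.map(doWork)).map(_ => ()))
  }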
I have an example of how to do it with scalaz-stream. It's quite a lot of code, because it's necessary to convert a Scala Future to a scalaz Task (an abstraction for deferred computation); however, you only need to add that to a project once. Another option is to use Task for defining doWork. I personally prefer Task for building async programs.
import scala.concurrent.{Future => SFuture}
import scala.util.Random
import scala.concurrent.ExecutionContext.Implicits.global
import scalaz.stream._
import scalaz.concurrent._
val P = scalaz.stream.Process
val rnd = new Random()
def doWork(symbol: String): SFuture[Unit] = SFuture {
Thread.sleep(rnd.nextInt(1000))
println(s"Symbol: $symbol. Thread: ${Thread.currentThread().getName}")
}
val symbols = Seq("AAPL", "MSFT", "GOOGL", "CVX").
flatMap(s => Seq.fill(5)(s).zipWithIndex.map(t => s"${t._1}${t._2}"))
implicit class Transformer[+T](fut: => SFuture[T]) {
def toTask(implicit ec: scala.concurrent.ExecutionContext): Task[T] = {
import scala.util.{Failure, Success}
import scalaz.syntax.either._
Task.async {
register =>
fut.onComplete {
case Success(v) => register(v.right)
case Failure(ex) => register(ex.left)
}
}
}
}
implicit class ConcurrentProcess[O](val process: Process[Task, O]) {
def concurrently[O2](concurrencyLevel: Int)(f: Channel[Task, O, O2]): Process[Task, O2] = {
val actions =
process.
zipWith(f)((data, f) => f(data))
val nestedActions =
actions.map(P.eval)
merge.mergeN(concurrencyLevel)(nestedActions)
}
}
val workChannel = io.channel((s: String) => doWork(s).toTask)
val process = Process.emitAll(symbols).concurrently(5)(workChannel)
process.run.run
Once you have all these transformations in scope, basically all you need is:
val workChannel = io.channel((s: String) => doWork(s).toTask)
val process = Process.emitAll(symbols).concurrently(5)(workChannel)
Quite short and self-describing.
Although you've already got an excellent answer, I thought I might still offer an opinion or two about these matters.
I remember seeing somewhere (on someone's blog) "use actors for state and use futures for concurrency".
So my first thought would be to utilize actors somehow. To be precise, I would have a master actor with a router launching multiple worker actors, with the number of workers limited according to allowableParallelism. So, assuming I have
def doWorkInternal (symbol: String): Unit
which does the work of your doWork taken 'outside of the future', I would have something along these lines (very rudimentary, not taking many details into consideration, and practically copying code from the Akka documentation):
import akka.actor._
case class WorkItem (symbol: String)
case class WorkItemCompleted (symbol: String)
case class WorkLoad (symbols: Array[String])
case class WorkLoadCompleted ()
class Worker extends Actor {
def receive = {
case WorkItem (symbol) =>
doWorkInternal (symbol)
sender () ! WorkItemCompleted (symbol)
}
}
class Master extends Actor {
var pending = Set[String] ()
var originator: Option[ActorRef] = None
var router = {
val routees = Vector.fill (allowableParallelism) {
val r = context.actorOf(Props[Worker])
context watch r
ActorRefRoutee(r)
}
Router (RoundRobinRoutingLogic(), routees)
}
def receive = {
case WorkLoad (symbols) =>
originator = Some (sender ())
context become processing
for (symbol <- symbols) {
router.route (WorkItem (symbol), self)
pending += symbol
}
}
def processing: Receive = {
case Terminated (a) =>
router = router.removeRoutee(a)
val r = context.actorOf(Props[Worker])
context watch r
router = router.addRoutee(r)
case WorkItemCompleted (symbol) =>
pending -= symbol
if (pending.size == 0) {
context become receive
originator.get ! WorkLoadCompleted ()
}
}
}
You could query the master actor with ask and receive a WorkLoadCompleted in a future.
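To make that concrete, a hypothetical wiring (names assumed, not from the code above) using the ask pattern:
import akka.actor.{ActorSystem, Props}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Future
import scala.concurrent.duration._
implicit val timeout: Timeout = Timeout(10.minutes)
val system = ActorSystem("work")
val master = system.actorOf(Props[Master], "master")
// Completes once the master replies with WorkLoadCompleted.
val done: Future[Any] = master ? WorkLoad(symbols)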
But thinking more about the 'state' (the number of requests currently in processing) that needs to be hidden somewhere, together with the code for not exceeding it, here's something of the 'future gateway intermediary' sort, if you don't mind imperative style and mutable (though used internally only) structures:
import scala.concurrent.{Future, Promise}
import scala.util.Try
object Guardian
{
private val incoming = new collection.mutable.HashMap[String, Promise[Unit]]()
private val outgoing = new collection.mutable.HashMap[String, Future[Unit]]()
private val pending = new collection.mutable.Queue[String]
def doWorkGuarded (symbol: String): Future[Unit] = {
synchronized {
val p = Promise[Unit] ()
incoming(symbol) = p
if (incoming.size <= allowableParallelism)
launchWork (symbol)
else
pending.enqueue (symbol)
p.future
}
}
private def completionHandler (t: Try[Unit]): Unit = {
synchronized {
for (symbol <- outgoing.keySet.toList) { // iterate over a snapshot; outgoing is mutated below
val f = outgoing (symbol)
if (f.isCompleted) {
incoming (symbol).completeWith (f)
incoming.remove (symbol)
outgoing.remove (symbol)
}
}
for (i <- outgoing.size until allowableParallelism) { // fill the remaining free slots
if (pending.nonEmpty) {
val symbol = pending.dequeue()
launchWork (symbol)
}
}
}
}
private def launchWork (symbol: String): Unit = {
val f = doWork(symbol)
outgoing(symbol) = f
f.onComplete(completionHandler)
}
}
doWork is now exactly like yours, returning Future[Unit], and the idea is that instead of using something like
val futures = symbols.map (doWork (_)).toSeq
val future = Future.sequence(futures)
which would launch futures not regarding allowableParallelism at all, I would instead use
val futures = symbols.map (Guardian.doWorkGuarded (_)).toSeq
val future = Future.sequence(futures)
Think about some hypothetical database access driver with a non-blocking interface, i.e. returning futures on requests, which is limited in concurrency by being built over some connection pool, for example: you wouldn't want it to return futures that disregard the parallelism level and require you to juggle them to keep parallelism under control.
This example is more illustrative than practical, since I wouldn't normally expect an 'outgoing' interface to be utilizing futures like this (which is quite ok for an 'incoming' interface).
First, obviously some purely functional wrapper around Scala's Future is needed, because it's side-effecting and runs as soon as it can. Let's call it Deferred:
import scala.concurrent.Future
import scala.util.control.Exception.nonFatalCatch
class Deferred[+T](f: () => Future[T]) {
def run(): Future[T] = f()
}
object Deferred {
def apply[T](future: => Future[T]): Deferred[T] =
new Deferred(() => nonFatalCatch.either(future).fold(Future.failed, identity))
}
And here is the routine:
import java.util.concurrent.CopyOnWriteArrayList
import java.util.concurrent.atomic.AtomicInteger
import scala.collection.immutable.Seq
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.util.control.Exception.nonFatalCatch
import scala.util.{Failure, Success}
trait ConcurrencyUtils {
def runWithBoundedParallelism[T](parallelism: Int = Runtime.getRuntime.availableProcessors())
(operations: Seq[Deferred[T]])
(implicit ec: ExecutionContext): Deferred[Seq[T]] =
if (parallelism > 0) Deferred {
val indexedOps = operations.toIndexedSeq // index for faster access
val promise = Promise[Seq[T]]()
val acc = new CopyOnWriteArrayList[(Int, T)] // concurrent acc
val nextIndex = new AtomicInteger(parallelism) // keep track of the next index atomically
def run(operation: Deferred[T], index: Int): Unit = {
operation.run().onComplete {
case Success(value) =>
acc.add((index, value)) // accumulate result value
if (acc.size == indexedOps.size) { // we're done
import scala.collection.JavaConversions._
// in concurrent setting next line may be called multiple times, that's why trySuccess instead of success
promise.trySuccess(acc.view.sortBy(_._1).map(_._2).toList)
} else {
val next = nextIndex.getAndIncrement() // get and inc atomically
if (next < indexedOps.size) { // run next operation if exists
run(indexedOps(next), next)
}
}
case Failure(t) =>
promise.tryFailure(t) // same here (may be called multiple times, let's prevent stdout pollution)
}
}
if (operations.nonEmpty) {
indexedOps.view.take(parallelism).zipWithIndex.foreach((run _).tupled) // run as much as allowed
promise.future
} else {
Future.successful(Seq.empty)
}
} else {
throw new IllegalArgumentException("Parallelism must be positive")
}
}
In a nutshell, we initially run as many operations as allowed, and then on each operation's completion we run the next available operation, if any. So the only difficulty here is to maintain the next operation index and the results accumulator in a concurrent setting. I'm not an absolute concurrency expert, so let me know if there are any potential problems in the code above. Notice that the returned value is also a deferred computation that should be run.
Usage and test:
import org.scalatest.{Matchers, FlatSpec}
import org.scalatest.concurrent.ScalaFutures
import org.scalatest.time.{Seconds, Span}
import scala.collection.immutable.Seq
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
import scala.concurrent.duration._
class ConcurrencyUtilsSpec extends FlatSpec with Matchers with ScalaFutures with ConcurrencyUtils {
"runWithBoundedParallelism" should "return results in correct order" in {
val comp1 = mkDeferredComputation(1)
val comp2 = mkDeferredComputation(2)
val comp3 = mkDeferredComputation(3)
val comp4 = mkDeferredComputation(4)
val comp5 = mkDeferredComputation(5)
val compoundComp = runWithBoundedParallelism(2)(Seq(comp1, comp2, comp3, comp4, comp5))
whenReady(compoundComp.run()) { result =>
result should be (Seq(1, 2, 3, 4, 5))
}
}
// increase default ScalaTest patience
implicit val defaultPatience = PatienceConfig(timeout = Span(10, Seconds))
private def mkDeferredComputation[T](result: T, sleepDuration: FiniteDuration = 100.millis): Deferred[T] =
Deferred {
Future {
Thread.sleep(sleepDuration.toMillis)
result
}
}
}
Use Monix Task. An example from the Monix documentation for parallelism = 10 (note that in recent Monix versions Task.gather has been renamed to Task.parSequence):
val items = 0 until 1000
// The list of all tasks needed for execution
val tasks = items.map(i => Task(i * 2))
// Building batches of 10 tasks to execute in parallel:
val batches = tasks.sliding(10,10).map(b => Task.gather(b))
// Sequencing batches, then flattening the final result
val aggregate = Task.sequence(batches).map(_.flatten.toList)
// Evaluation:
aggregate.foreach(println)
//=> List(0, 2, 4, 6, 8, 10, 12, 14, 16,...
Akka Streams allows you to do the following:
import akka.NotUsed
import akka.stream.Materializer
import akka.stream.scaladsl.Source
import scala.concurrent.Future
def sequence[A: Manifest, B](items: Seq[A], func: A => Future[B], parallelism: Int)(
implicit mat: Materializer
): Future[Seq[B]] = {
val futures: Source[B, NotUsed] =
Source[A](items.toList).mapAsync(parallelism)(x => func(x))
futures.runFold(Seq.empty[B])(_ :+ _)
}
sequence(symbols, doWork, allowableParallelism)
Is it possible to remove some types from the following code:
import util.continuations._
object TrackingTest extends App {
implicit def trackable(x: Int) = new {
def tracked[R] = shift { cf: (Int => (R, Set[Int])) =>
cf(x) match {
case (r, ints) => (r, ints + x)
}
}
}
def track[R](body: => R @cpsParam[(R, Set[Int]), (R, Set[Int])]) = reset {
(body, Set[Int]())
}
val result = track(7.tracked[Int] + 35.tracked[Int])
assert(result == (42, Set(7, 35)))
val differentTypes = track(9.tracked[String].toString)
assert(differentTypes == ("9", Set(9)))
}
The track function tracks calls of tracked on Int instances (e.g. 7.tracked).
Is it possible to infer the type parameter on the tracked implicit, so the following would compile:
track(7.tracked + 35.tracked)
Your question made me think of how continuations can track state. So I adapted that to your case and came up with this:
import util.continuations._
object TrackingTest extends App {
type State = Set[Int]
type ST = State => State
implicit class Tracked(val i: Int) extends AnyVal {
def tracked = shift{ (k: Int=>ST) => (state:State) => k(i)(state + i) }
}
def track[A](thunk: => A @cps[ST]): (A, State) = {
var result: A = null.asInstanceOf[A]
val finalState = (reset {
result = thunk
(state:State) => state
}).apply(Set[Int]())
(result, finalState)
}
val result = track(7.tracked + 35.tracked)
assert(result == (42, Set(7, 35)))
val differentTypes = track(9.tracked.toString)
assert(differentTypes == ("9", Set(9)))
}
This uses 2.10.1, but it works fine with 2.9.1 as well, provided you replace the 2.10.x implicit value class with:
implicit def tracked(i: Int) = new {
def tracked = shift{ (k: Int=>ST) => (state:State) => k(i)(state + i) }
}
The key change I made is to have tracked not use any type inference, fixing it to Int @cps[ST]. The CPS plugin then maps the computation to the right type (like String @cps[ST]) as appropriate. The state is threaded by the continuation returning a State => State function that takes the current state (the set of ints) and returns the next state. The return type of reset is a function from state to state (of type ST) that takes the initial state and returns the final state.
The final trick is to use a var to capture the result while still keeping the expected type for reset.
While the exact answer to this question can be given only by the authors of the compiler, we can guess that it is not possible by taking a look at the continuations plugin source code.
If you look at the source of the continuations plugin you can see this:
val anfPhase = new SelectiveANFTransform() {
val global = SelectiveCPSPlugin.this.global
val runsAfter = List("pickler")
}
val cpsPhase = new SelectiveCPSTransform() {
val global = SelectiveCPSPlugin.this.global
val runsAfter = List("selectiveanf")
}
The anfPhase is executed after the pickler phase, and the cpsPhase after selectiveanf. If you look at SelectiveANFTransform.scala:
abstract class SelectiveANFTransform extends PluginComponent with Transform with
TypingTransformers with CPSUtils {
// inherits abstract value `global' and class `Phase' from Transform
import global._ // the global environment
import definitions._ // standard classes and methods
import typer.atOwner // methods to type trees
/** the following two members override abstract members in Transform */
val phaseName: String = "selectiveanf"
If we use scalac -Xshow-phases, we can see the phases during the compilation process:
parser
namer
packageobjects
typer
superaccessors
pickler
refchecks
selectiveanf
liftcode
selectivecps
uncurry
......
As you can see, the typer phase is applied before the selectiveanf and selectivecps phases. Type inference happens in the typer phase, i.e. before the continuations plugin transforms the code, so it should now be clear why you can't omit the Int type on 7.tracked and 35.tracked.
Now, if you are not satisfied yet, you should know that the compiler works by performing a set of transformations on "trees", which you can inspect using the following options:
-Xprint:<phase> shows your Scala code after a certain phase has been executed
-Xprint:<phase> -Yshow-trees shows your Scala code and the trees after the phase has been executed
-Ybrowse:<phase> opens a GUI to browse both.
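For example, to inspect the code right after type checking and after the ANF transform (assuming the continuations plugin is enabled, as it must be for this code to compile):
scalac -P:continuations:enable -Xprint:typer,selectiveanf TrackingTest.scala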