Creating a time-based chunking Enumeratee (Scala)

I want to create a Play 2 Enumeratee that takes in values and outputs them, chunked together, every x seconds/milliseconds. That way, in a multi-user websocket environment with lots of user input, one could limit the number of received frames per second.
I know that it's possible to group a set number of items together like this:
val chunker = Enumeratee.grouped(
  Traversable.take[Array[Double]](5000) &>> Iteratee.consume()
)
Is there a built-in way to do this based on time rather than based on the number of items?
I was thinking about doing this somehow with a scheduled Akka job, but at first sight this seems inefficient, and I'm not sure whether concurrency issues would arise.

How about something like this? I hope it's helpful.
package controllers

import play.api._
import play.api.Play.current
import play.api.mvc._
import play.api.libs.iteratee._
import play.api.libs.concurrent.Akka
import play.api.libs.concurrent.Promise

object Application extends Controller {
  def index = Action {
    // note: this mutable Queue is shared between two threads without
    // synchronization; fine as a sketch, but not production-safe
    val queue = new scala.collection.mutable.Queue[String]
    // producer: pushes a timestamp onto the queue every 100 ms
    Akka.future {
      while (true) {
        Logger.info("hogehogehoge")
        queue += System.currentTimeMillis.toString
        Thread.sleep(100)
      }
    }
    // drain the queue every 200 ms, so each chunk groups roughly two items
    val timeStream = Enumerator.fromCallback { () =>
      Promise.timeout(Some(queue), 200)
    }
    Ok.stream(timeStream.through(Enumeratee.map[scala.collection.mutable.Queue[String]] { queue =>
      var str = ""
      while (queue.nonEmpty) {
        str += queue.dequeue + ", "
      }
      str
    }))
  }
}
This documentation page is also helpful:
http://www.playframework.com/documentation/2.0/Enumerators
UPDATE
This version is for Play 2.1:
package controllers

import play.api._
import play.api.Play.current
import play.api.mvc._
import play.api.libs.iteratee._
import play.api.libs.concurrent.Akka
import play.api.libs.concurrent.Promise
import scala.concurrent._
import ExecutionContext.Implicits.global

object Application extends Controller {
  def index = Action {
    val queue = new scala.collection.mutable.Queue[String]
    Akka.future {
      while (true) {
        Logger.info("hogehogehoge")
        queue += System.currentTimeMillis.toString
        Thread.sleep(100)
      }
    }
    val timeStream = Enumerator.repeatM {
      Promise.timeout(queue, 200)
    }
    Ok.stream(timeStream.through(Enumeratee.map[scala.collection.mutable.Queue[String]] { queue =>
      var str = ""
      while (queue.nonEmpty) {
        str += queue.dequeue + ", "
      }
      str
    }))
  }
}

Here I've quickly defined an iteratee that will take values from an input for a fixed length of time t, measured in milliseconds, and an enumeratee that lets you group and further process an input stream divided into segments of that length t. It relies on JodaTime (org.joda.time.{Instant, Interval}) to keep track of how much time has passed since the iteratee began.
import org.joda.time.{Instant, Interval}

def throttledTakeIteratee[E](timeInMillis: Long): Iteratee[E, List[E]] = {
  var startTime = new Instant()
  def step(state: List[E])(input: Input[E]): Iteratee[E, List[E]] = {
    val timePassed = new Interval(startTime, new Instant()).toDurationMillis
    input match {
      case Input.EOF => { startTime = new Instant; Done(state, Input.EOF) }
      case Input.Empty => Cont[E, List[E]](i => step(state)(i))
      case Input.El(e) =>
        // once the window has elapsed, emit everything accumulated so far
        if (timePassed >= timeInMillis) { startTime = new Instant; Done(e :: state, Input.Empty) }
        else Cont[E, List[E]](i => step(e :: state)(i))
    }
  }
  Cont(step(List[E]()))
}

def throttledTake[T](timeInMillis: Long) = Enumeratee.grouped(throttledTakeIteratee[T](timeInMillis))
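For example, here is a usage sketch under the assumption that an Enumerator[Array[Double]] named frames is what feeds the websocket (the name is hypothetical, and the 5000 ms window matches the chunker from the question):

// each emitted element is the List of frames received during one 5000 ms window
val chunked: Enumerator[List[Array[Double]]] = frames &> throttledTake[Array[Double]](5000)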

Related

Akka Streams: State in a flow

I want to read multiple big files using Akka Streams to process each line. Imagine that each line consists of an (identifier -> value) pair. If a new identifier is found, I want to save it and its value in the database; otherwise, if the identifier has already been found while processing the stream of lines, I want to save only the value. For that, I think I need some kind of recursive stateful flow in order to keep the identifiers that have already been found in a Map. I think I'd receive in this flow a pair of (newLine, contextWithIdentifiers).
I've just started to look into Akka Streams. I guess I can manage the stateless processing myself, but I have no clue how to keep the contextWithIdentifiers. I'd appreciate any pointers in the right direction.
Maybe something like statefulMapConcat can help you:
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.util.Random._
import scala.math.abs
import scala.concurrent.ExecutionContext.Implicits.global

implicit val system = ActorSystem()
implicit val materializer = ActorMaterializer()

// encapsulating your input
case class IdentValue(id: Int, value: String)

// some randomly generated input
val identValues = List.fill(20)(IdentValue(abs(nextInt()) % 5, "valueHere"))

val stateFlow = Flow[IdentValue].statefulMapConcat { () =>
  // state with already processed ids
  var ids = Set.empty[Int]
  identValue =>
    if (ids.contains(identValue.id)) {
      // identifier already seen: save only the value to the DB
      println(identValue.value)
      List(identValue)
    } else {
      // new identifier: save both to the database
      println(identValue)
      ids = ids + identValue.id
      List(identValue)
    }
}

Source(identValues)
  .via(stateFlow)
  .runWith(Sink.seq)
  .onSuccess { case identValues => println(identValues) }
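A follow-up note with a small sketch (my addition): the () => factory given to statefulMapConcat is invoked once per materialization, so each run of the flow starts with an empty ids set and state is never shared between runs.

// two independent materializations; each gets its own fresh `ids` set
val run1 = Source(identValues).via(stateFlow).runWith(Sink.seq)
val run2 = Source(identValues).via(stateFlow).runWith(Sink.seq)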
A few years later, here is an implementation I wrote if you only need a 1-to-1 mapping (not 1-to-N):
import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler, OutHandler}
import akka.stream.{Attributes, FlowShape, Inlet, Outlet}

object StatefulMap {
  def apply[T, O](converter: => T => O) = new StatefulMap[T, O](converter)
}

class StatefulMap[T, O](converter: => T => O) extends GraphStage[FlowShape[T, O]] {
  val in = Inlet[T]("StatefulMap.in")
  val out = Outlet[O]("StatefulMap.out")
  val shape = FlowShape.of(in, out)

  override def createLogic(inheritedAttributes: Attributes): GraphStageLogic = new GraphStageLogic(shape) {
    // the by-name converter is evaluated once per materialization,
    // so every stream instance gets its own state
    val f = converter
    setHandler(in, new InHandler {
      override def onPush(): Unit = push(out, f(grab(in)))
    })
    setHandler(out, new OutHandler {
      override def onPull(): Unit = pull(in)
    })
  }
}
Test (and demo):
behavior of "StatefulMap"

class Counter extends (Any => Int) {
  var count = 0
  override def apply(x: Any): Int = {
    count += 1
    count
  }
}

it should "not share state among substreams" in {
  val result = await {
    Source(0 until 10)
      .groupBy(2, _ % 2)
      .via(StatefulMap(new Counter()))
      .fold(Seq.empty[Int])(_ :+ _)
      .mergeSubstreams
      .runWith(Sink.seq)
  }
  result.foreach(_ should be(1 to 5))
}
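A standalone usage sketch (my addition, assuming an implicit materializer is in scope): because the converter parameter is by-name, each materialization evaluates it again and receives a fresh Counter.

// numbers each element as it passes through; completes with Seq(1, 2, 3)
val counted = Source(List("a", "b", "c"))
  .via(StatefulMap(new Counter()))
  .runWith(Sink.seq)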

Scala: Parallel execution with ListBuffer appends doesn't produce expected outcome

I know I'm doing something wrong with mutable.ListBuffer, but I can't figure out how to fix it (or find a proper explanation of the issue).
I simplified the code below to reproduce the behavior.
I'm basically trying to run functions in parallel to add elements to a list as my first list gets processed. I end up "losing" elements.
import java.util.Properties
import scala.collection.mutable.ListBuffer
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext
import ExecutionContext.Implicits.global

object MyTestObject {
  var listBufferOfInts = new ListBuffer[Int]() // files that are processed

  def runFunction(): Int = {
    listBufferOfInts = new ListBuffer[Int]()
    val inputListOfInts = 1 to 1000
    val fut = Future.traverse(inputListOfInts) { i =>
      Future {
        appendElem(i)
      }
    }
    Await.ready(fut, Duration.Inf)
    listBufferOfInts.length
  }

  def appendElem(elem: Int): Unit = {
    listBufferOfInts ++= List(elem)
  }
}
MyTestObject.runFunction()
MyTestObject.runFunction()
MyTestObject.runFunction()
which returns:
res0: Int = 937
res1: Int = 992
res2: Int = 997
Obviously I would expect 1000 to be returned every time. How can I fix my code to keep the "architecture" but make my ListBuffer "synchronized"?
I don't know what your exact problem is, since you said you simplified the code, but you still have an obvious race condition: multiple threads modify a single mutable collection, and that is very bad. As other answers pointed out, you need some locking so that only one thread can modify the collection at a time. If your calculations are heavy, appending results to a buffer in a synchronized way shouldn't notably affect performance, but when in doubt, always measure.
Synchronization is not needed, though; you can do something else instead, without vars and mutable state. Let each Future return your partial result and then merge them into a list; in fact, Future.traverse does just that.
import scala.concurrent.duration._
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global

def runFunction: Int = {
  val inputListOfInts = 1 to 1000
  val fut: Future[List[Int]] = Future.traverse(inputListOfInts.toList) { i =>
    Future {
      // some heavy calculations on i
      i * 4
    }
  }
  val listOfInts = Await.result(fut, Duration.Inf)
  listOfInts.size
}
Future.traverse already gives you an immutable list with all your results combined, no need to append them to a mutable buffer.
Needless to say, you will always get 1000 back.
# List.fill(10000)(runFunction).exists(_ != 1000)
res18: Boolean = false
I'm not sure the code above shows what you are trying to do correctly. Maybe the issue is that you are actually sharing a var ListBuffer which you reinitialise within runFunction.
When I take this out, I collect all the events I'm expecting correctly:
import java.util.Properties
import scala.collection.mutable.ListBuffer
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext
import ExecutionContext.Implicits.global

object BrokenTestObject extends App {
  var listBufferOfInts = new ListBuffer[Int]()

  def runFunction(): Int = {
    val inputListOfInts = 1 to 1000
    val fut = Future.traverse(inputListOfInts) { i =>
      Future {
        appendElem(i)
      }
    }
    Await.ready(fut, Duration.Inf)
    listBufferOfInts.length
  }

  def appendElem(elem: Int): Unit = {
    listBufferOfInts.append(elem)
  }

  BrokenTestObject.runFunction()
  BrokenTestObject.runFunction()
  BrokenTestObject.runFunction()
  println(s"collected ${listBufferOfInts.length} elements")
}
If you really have a synchronisation issue you can use something like the following:
import java.util.Properties
import scala.collection.mutable.ListBuffer
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext
import ExecutionContext.Implicits.global

class WrappedListBuffer(val lb: ListBuffer[Int]) {
  def append(i: Int) {
    this.synchronized {
      lb.append(i)
    }
  }
}

object MyTestObject extends App {
  var listBufferOfInts = new WrappedListBuffer(new ListBuffer[Int]())

  def runFunction(): Int = {
    val inputListOfInts = 1 to 1000
    val fut = Future.traverse(inputListOfInts) { i =>
      Future {
        appendElem(i)
      }
    }
    Await.ready(fut, Duration.Inf)
    listBufferOfInts.lb.length
  }

  def appendElem(elem: Int): Unit = {
    listBufferOfInts.append(elem)
  }

  MyTestObject.runFunction()
  MyTestObject.runFunction()
  MyTestObject.runFunction()
  println(s"collected ${listBufferOfInts.lb.size} elements")
}
Changing
listBufferOfInts ++= List(elem)
to
synchronized {
  listBufferOfInts ++= List(elem)
}
makes it work. Could this become a performance issue? I'm still interested in an explanation, and maybe a better way of doing things!
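If you do want to keep the append-as-you-go shape without explicit locking, here is a sketch (my suggestion, not from the thread above) using a lock-free java.util.concurrent collection:

import java.util.concurrent.ConcurrentLinkedQueue
import scala.concurrent.duration.Duration
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global

def runFunction(): Int = {
  // thread-safe, lock-free queue instead of a synchronized ListBuffer
  val ints = new ConcurrentLinkedQueue[Int]()
  val fut = Future.traverse(1 to 1000) { i => Future { ints.add(i) } }
  Await.ready(fut, Duration.Inf)
  ints.size // always 1000
}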

Akka-http process requests with Stream

I'm trying to write a simple akka-http and akka-streams based application that handles HTTP requests, always with one pre-materialized stream, because I plan to do long-running processing with back-pressure in my requestProcessor stream.
My application code:
import akka.actor.{ActorSystem, Props}
import akka.http.scaladsl._
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server._
import akka.stream.ActorFlowMaterializer
import akka.stream.actor.ActorPublisher
import akka.stream.scaladsl.{Sink, Source}
import scala.annotation.tailrec
import scala.concurrent.Future

object UserRegisterSource {
  def props: Props = Props[UserRegisterSource]

  final case class RegisterUser(username: String)
}

class UserRegisterSource extends ActorPublisher[UserRegisterSource.RegisterUser] {
  import UserRegisterSource._
  import akka.stream.actor.ActorPublisherMessage._

  val MaxBufferSize = 100
  var buf = Vector.empty[RegisterUser]

  override def receive: Receive = {
    case request: RegisterUser =>
      if (buf.isEmpty && totalDemand > 0)
        onNext(request)
      else {
        buf :+= request
        deliverBuf()
      }
    case Request(_) =>
      deliverBuf()
    case Cancel =>
      context.stop(self)
  }

  @tailrec final def deliverBuf(): Unit =
    if (totalDemand > 0) {
      if (totalDemand <= Int.MaxValue) {
        val (use, keep) = buf.splitAt(totalDemand.toInt)
        buf = keep
        use foreach onNext
      } else {
        val (use, keep) = buf.splitAt(Int.MaxValue)
        buf = keep
        use foreach onNext
        deliverBuf()
      }
    }
}

object Main extends App {
  val host = "127.0.0.1"
  val port = 8094

  implicit val system = ActorSystem("my-testing-system")
  implicit val fm = ActorFlowMaterializer()
  implicit val executionContext = system.dispatcher

  val serverSource: Source[Http.IncomingConnection, Future[Http.ServerBinding]] =
    Http(system).bind(interface = host, port = port)

  val mySource = Source.actorPublisher[UserRegisterSource.RegisterUser](UserRegisterSource.props)

  val requestProcessor = mySource
    .mapAsync(1)(fakeSaveUserAndReturnCreatedUserId)
    .to(Sink.head[Int])
    .run()

  val route: Route =
    get {
      path("test") {
        parameter('test) { case t: String =>
          requestProcessor ! UserRegisterSource.RegisterUser(t)
          ???
        }
      }
    }

  def fakeSaveUserAndReturnCreatedUserId(param: UserRegisterSource.RegisterUser): Future[Int] =
    Future.successful {
      1
    }

  serverSource.to(Sink.foreach { connection =>
    connection handleWith Route.handlerFlow(route)
  }).run()
}
I found a solution for how to create a Source that can dynamically accept new items to process, but I can't find any solution for how to then obtain the result of the stream's execution in my route.
The direct answer to your question is to materialize a new Stream for each HttpRequest and use Sink.head to get the value you're looking for. Modifying your code:
val requestStream =
  mySource
    .mapAsync(1)(fakeSaveUserAndReturnCreatedUserId)
    // Keep.both (from akka.stream.scaladsl.Keep) retains both the Source's
    // ActorRef and Sink.head's Future[Int] as the materialized value
    .toMat(Sink.head[Int])(Keep.both)
    //.run() - don't materialize here

val route: Route =
  get {
    path("test") {
      parameter('test) { case t: String =>
        // materialize a new Stream here
        val (requestProcessor, userIdFut) = requestStream.run()
        requestProcessor ! UserRegisterSource.RegisterUser(t)
        // get the result of the Stream
        userIdFut onSuccess { case userId: Int => ... }
      }
    }
  }
However, I think your question is ill-posed. In your code example, the only thing you're using an akka Stream for is to create a new UserId. Futures readily solve this problem without the need for a materialized Stream (and all the accompanying overhead):
val route: Route =
  get {
    path("test") {
      parameter('test) { case t: String =>
        val user = RegisterUser(t)
        fakeSaveUserAndReturnCreatedUserId(user) onSuccess { case userId: Int =>
          ...
        }
      }
    }
  }
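To actually complete the HTTP request with the saved id, here is a minimal sketch using akka-http's onSuccess directive (the response text is my assumption):

val route: Route =
  get {
    path("test") {
      parameter('test) { t =>
        val user = UserRegisterSource.RegisterUser(t)
        // onSuccess extracts the Future's value and completes the request with it
        onSuccess(fakeSaveUserAndReturnCreatedUserId(user)) { userId =>
          complete(s"created user $userId")
        }
      }
    }
  }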
If you want to limit the number of concurrent calls to fakeSaveUserAndReturnCreatedUserId, you can create an ExecutionContext with a fixed thread pool size, as explained in the answer to this question, and use that ExecutionContext to create the Futures:
import java.util.concurrent.Executors
import scala.concurrent.ExecutionContext

val ThreadCount = 10 // concurrent queries
val limitedExecutionContext =
  ExecutionContext.fromExecutor(Executors.newFixedThreadPool(ThreadCount))

def fakeSaveUserAndReturnCreatedUserId(param: UserRegisterSource.RegisterUser): Future[Int] =
  Future { 1 }(limitedExecutionContext)

Akka Dead Letters with Ask Pattern

I apologize in advance if this seems at all confusing, as I'm dumping quite a bit here. Basically, I have a small service grabbing some JSON, parsing and extracting it to case class(es), then writing it to a database. This service needs to run on a schedule, which an Akka scheduler handles well. My database doesn't like it when Slick tries to ask for a new AutoInc id at the same time, so I built in an Await.result to block that from happening. All of this works quite well, but my issue starts here: there are 7 of these services running, so I would like to block each one using a similar Await.result system. Every time I try to send the end time of the request back as a response (at the end of the else block), it gets sent to dead letters instead of to the Distributor. Basically: why does sender ! time go to dead letters and not to Distributor? This is a long question for a simple problem, but that's how development goes...
ClickActor.scala
import java.text.SimpleDateFormat
import java.util.Date
import Message._
import akka.actor.{Actor, ActorLogging, Props}
import akka.util.Timeout
import com.typesafe.config.ConfigFactory
import net.liftweb.json._
import spray.client.pipelining._
import spray.http.{BasicHttpCredentials, HttpRequest, HttpResponse, Uri}
import akka.pattern.ask
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

case class ClickData(recipient: String, geolocation: Geolocation, tags: Array[String],
                     url: String, timestamp: Double, campaigns: Array[String],
                     `user-variables`: JObject, ip: String,
                     `client-info`: ClientInfo, message: ClickedMessage, event: String)
case class Geolocation(city: String, region: String, country: String)
case class ClientInfo(`client-name`: String, `client-os`: String, `user-agent`: String,
                      `device-type`: String, `client-type`: String)
case class ClickedMessage(headers: ClickHeaders)
case class ClickHeaders(`message-id`: String)

class ClickActor extends Actor with ActorLogging {

  implicit val formats = DefaultFormats
  implicit val timeout = new Timeout(3 minutes)
  import context.dispatcher

  val con = ConfigFactory.load("connection.conf")
  val countries = ConfigFactory.load("country.conf")
  val regions = ConfigFactory.load("region.conf")

  val df = new SimpleDateFormat("EEE, dd MMM yyyy HH:mm:ss -0000")
  var time = System.currentTimeMillis()
  var begin = new Date(time - (12 hours).toMillis)
  var end = new Date(time)

  val pipeline: HttpRequest => Future[HttpResponse] = (
    addCredentials(BasicHttpCredentials("api", con.getString("mailgun.key")))
      ~> sendReceive
  )

  def get(lastrun: Long): Future[String] = {
    if (lastrun != 0) {
      begin = new Date(lastrun)
      end = new Date(time)
    }
    val uri = Uri(con.getString("mailgun.uri")) withQuery ("begin" -> df.format(begin), "end" -> df.format(end),
      "ascending" -> "yes", "limit" -> "100", "pretty" -> "yes", "event" -> "clicked")
    val request = Get(uri)
    val futureResponse = pipeline(request)
    futureResponse.map(_.entity.asString)
  }

  def receive = {
    case lastrun: Long => {
      val start = System.currentTimeMillis()
      val responseFuture = get(lastrun)
      responseFuture.onSuccess {
        case payload: String =>
          val json = parse(payload)
          //println(pretty(render(json)))
          val elements = (json \\ "items").children
          if (elements.length == 0) {
            log.info("[ClickActor: " + this.hashCode() + "] did not find new events between " +
              begin.toString + " and " + end.toString)
            sender ! time // runs inside the future's callback: `sender` is no longer valid here
            context.stop(self)
          }
          else {
            for (item <- elements) {
              val data = item.extract[ClickData]
              var tags = ""
              if (data.tags.length != 0) {
                for (tag <- data.tags)
                  tags += (tag + ", ")
              }
              var campaigns = ""
              if (data.campaigns.length != 0) {
                for (campaign <- data.campaigns)
                  campaigns += (campaign + ", ")
              }
              val timestamp = (data.timestamp * 1000).toLong
              val msg = new ClickMessage(
                data.recipient, data.geolocation.city,
                regions.getString(data.geolocation.country + "." + data.geolocation.region),
                countries.getString(data.geolocation.country), tags, data.url, timestamp,
                campaigns, data.ip, data.`client-info`.`client-name`,
                data.`client-info`.`client-os`, data.`client-info`.`user-agent`,
                data.`client-info`.`device-type`, data.`client-info`.`client-type`,
                data.message.headers.`message-id`, data.event, compactRender(item))
              val csqla = context.actorOf(Props[ClickSQLActor])
              val future = csqla.ask(msg)
              val result = Await.result(future, timeout.duration).asInstanceOf[Int]
              if (result == 1) {
                log.error("[ClickSQLActor: " + csqla.hashCode() + "] shutting down due to lack of system environment variables")
                context.stop(csqla)
              }
              else if (result == 0) {
                log.info("[ClickSQLActor: " + csqla.hashCode() + "] successfully wrote to the DB")
              }
            }
            sender ! time // same problem here: see the answer below
            log.info("[ClickActor: " + this.hashCode() + "] processed |" + elements.length + "| new events in " +
              (System.currentTimeMillis() - start) + " ms")
          }
      }
    }
  }
}
Distributor.scala
import akka.actor.{Props, ActorSystem}
import akka.event.Logging
import akka.util.Timeout
import akka.pattern.ask
import scala.concurrent.duration._
import scala.concurrent.Await

class Distributor {
  implicit val timeout = new Timeout(10 minutes)
  var lastClick: Long = 0

  def distribute(system: ActorSystem) = {
    val log = Logging(system, getClass)
    val clickFuture = (system.actorOf(Props[ClickActor]) ? lastClick)
    lastClick = Await.result(clickFuture, timeout.duration).asInstanceOf[Long]
    log.info(lastClick.toString)
    // repeat process with other events (open, unsub, etc)
  }
}
The reason is that the value of sender (which is actually a method that retrieves the value) is only valid while the actor is inside its receive block. The future in the example above is still running when the actor leaves the receive block, and by the time it finishes, sender no longer points at the original requester; an invalid sender results in the message going to the dead letter queue.
The fix is either not to use a future or, when combining futures, actors, and sender, to capture the value of sender before you trigger the future:
val s = sender
val responseFuture = get(lastrun)
responseFuture.onSuccess {
  ....
  s ! time
}
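An equivalent approach, sketched here as a suggestion rather than taken from the answer above, is the pipe pattern: pipeTo(sender()) evaluates sender() synchronously, inside receive, so the correct reference is captured before the future completes (this assumes an implicit ExecutionContext such as context.dispatcher is in scope, as it is in ClickActor):

import akka.pattern.pipe
// reply with `time` to the original requester once the HTTP call finishes;
// a failed future is delivered as akka.actor.Status.Failure instead
get(lastrun).map(_ => time).pipeTo(sender())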

Improve this actor calling futures

I have an actor (Worker) which basically asks 3 other actors (Filter1, Filter2 and Filter3) for a result. If any of them returns false, it's unnecessary to wait for the others, like an "and" operation over the results. When a false response is received, a Cancel message is sent to the actors so that queued work is cancelled, making execution more efficient.
Filters aren't children of Worker; they are a common pool of actors used by all Worker actors. I use an Agent to maintain the collection of cancelled Works. Then, before a particular work is processed, I check the cancel agent to see whether that work was cancelled, and if so skip its execution. Cancel has a higher priority than Work, so it is always processed first.
The code is something like this
Proxy, which creates the actor tree:
import scala.collection.mutable.HashSet
import scala.concurrent.ExecutionContext.Implicits.global
import com.typesafe.config.Config
import akka.actor.Actor
import akka.actor.ActorLogging
import akka.actor.ActorSystem
import akka.actor.PoisonPill
import akka.actor.Props
import akka.agent.Agent
import akka.routing.RoundRobinRouter

class Proxy extends Actor with ActorLogging {

  val agent1 = Agent(new HashSet[Work])
  val agent2 = Agent(new HashSet[Work])
  val agent3 = Agent(new HashSet[Work])

  val filter1 = context.actorOf(Props(Filter1(agent1)).withDispatcher("priorityMailBox-dispatcher")
    .withRouter(RoundRobinRouter(24)), "filter1")
  val filter2 = context.actorOf(Props(Filter2(agent2)).withDispatcher("priorityMailBox-dispatcher")
    .withRouter(RoundRobinRouter(24)), "filter2")
  val filter3 = context.actorOf(Props(Filter3(agent3)).withDispatcher("priorityMailBox-dispatcher")
    .withRouter(RoundRobinRouter(24)), "filter3")

  //val workerRouter = context.actorOf(Props[SerialWorker].withRouter(RoundRobinRouter(24)), name = "workerRouter")
  val workerRouter = context.actorOf(Props(new Worker(filter1, filter2, filter3)).withRouter(RoundRobinRouter(24)), name = "workerRouter")

  def receive = {
    case w: Work =>
      workerRouter forward w
  }
}
Worker:
import scala.concurrent.Await
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
import scala.concurrent.duration.DurationInt
import akka.actor.Actor
import akka.actor.ActorLogging
import akka.actor.Props
import akka.actor.actorRef2Scala
import akka.pattern.ask
import akka.pattern.pipe
import akka.util.Timeout
import akka.actor.ActorRef
import akka.routing.RoundRobinRouter
import akka.agent.Agent
import scala.collection.mutable.HashSet

class Worker(filter1: ActorRef, filter2: ActorRef, filter3: ActorRef) extends Actor with ActorLogging {

  implicit val timeout = Timeout(30.seconds)

  def receive = {
    case w: Work =>
      val start = System.currentTimeMillis()
      val futureF3 = (filter3 ? w).mapTo[Response]
      val futureF2 = (filter2 ? w).mapTo[Response]
      val futureF1 = (filter1 ? w).mapTo[Response]
      // find the first filter that answers false, if any
      val aggResult = Future.find(List(futureF3, futureF2, futureF1)) { res => !res.reponse }
      Await.result(aggResult, timeout.duration) match {
        case None =>
          Nqueen.fact(10500000L)
          log.info(s"[${w.message}] Processed message TRUE in ${System.currentTimeMillis() - start} ms")
          sender ! WorkResponse(w, true)
        case _ =>
          filter1 ! Cancel(w)
          filter2 ! Cancel(w)
          filter3 ! Cancel(w)
          log.info(s"[${w.message}] Processed message FALSE in ${System.currentTimeMillis() - start} ms")
          sender ! WorkResponse(w, false)
      }
  }
}
and Filters:
import scala.collection.mutable.HashSet
import scala.util.Random
import akka.actor.Actor
import akka.actor.ActorLogging
import akka.actor.actorRef2Scala
import akka.agent.Agent

trait CancellableFilter { this: Actor with ActorLogging =>

  //val canceledJobs = new HashSet[Int]
  val agent: Agent[HashSet[Work]]

  def cancelReceive: Receive = {
    case Cancel(w) =>
      agent.send(_ += w)
      //log.info(s"[$t] The work will be cancelled (if it ever arrives...)")
  }

  def cancelled(w: Work): Boolean =
    if (agent.get.contains(w)) {
      agent.send(_ -= w)
      true
    } else {
      false
    }
}

abstract class Filter extends Actor with ActorLogging { this: CancellableFilter =>

  val random = new Random(System.currentTimeMillis())

  def response: Boolean
  val timeToWait: Int
  val timeToExecutor: Long

  def receive = cancelReceive orElse {
    case w: Work if !cancelled(w) =>
      //log.info(s"[$t] Work arrived")
      Thread.sleep(timeToWait)
      Nqueen.fact(timeToExecutor)
      val r = Response(response)
      //log.info(s"[$t] Responded ${r.reponse}")
      sender ! r
  }
}

object Filter1 {
  def apply(agente: Agent[HashSet[Work]]) = new Filter with CancellableFilter {
    val timeToWait = 74
    val timeToExecutor = 42000000L
    val agent = agente
    def response = true //random.nextBoolean
  }
}

object Filter2 {
  def apply(agente: Agent[HashSet[Work]]) = new Filter with CancellableFilter {
    val timeToWait = 47
    val timeToExecutor = 21000000L
    val agent = agente
    def response = true //random.nextBoolean
  }
}

object Filter3 {
  def apply(agente: Agent[HashSet[Work]]) = new Filter with CancellableFilter {
    val timeToWait = 47
    val timeToExecutor = 21000000L
    val agent = agente
    def response = true //random.nextBoolean
  }
}
Basically, I think the Worker code is ugly and I want to make it better. Could you help me improve it?
Another point I want to improve is the cancel message. As I don't know which of the filters have already finished, I have to send Cancel to all of them, so at least one Cancel is redundant (since that filter's work is already complete).
It is minor, but why don't you store the filters as a sequence? filters.foreach(_ ! Cancel(w)) is nicer than
filter1 ! Cancel(w)
filter2 ! Cancel(w)
filter3 ! Cancel(w)
Same for other cases:
class Worker(filter1: ActorRef, filter2: ActorRef, filter3: ActorRef) extends Actor with ActorLogging {

  private val filters = Seq(filter1, filter2, filter3)

  implicit val timeout = Timeout(30.seconds)

  def receive = {
    case w: Work =>
      val start = System.currentTimeMillis()
      val futures = filters.map { f =>
        (f ? w).mapTo[Response]
      }
      val aggResult = Future.find(futures) { res => !res.reponse }
      Await.result(aggResult, timeout.duration) match {
        case None =>
          Nqueen.fact(10500000L)
          log.info(s"[${w.message}] Processed message TRUE in ${System.currentTimeMillis() - start} ms")
          sender ! WorkResponse(w, true)
        case _ =>
          filters.foreach(_ ! Cancel(w))
          log.info(s"[${w.message}] Processed message FALSE in ${System.currentTimeMillis() - start} ms")
          sender ! WorkResponse(w, false)
      }
  }
}
You may also consider writing the constructor as Worker(filters: ActorRef*) if you do not enforce exactly three filters. I think it is okay to send off one redundant cancel (the alternatives I see are overly complicated). I'm not sure, but if the filters are created very quickly, they may get their Randoms initialized with the same seed value.
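On that last point, a small sketch of one way to decorrelate the seeds (my suggestion, not from the answer above): draw each seed from ThreadLocalRandom, which is seeded independently per thread, instead of from the wall clock.

import java.util.concurrent.ThreadLocalRandom
import scala.util.Random

// filters created within the same millisecond no longer share
// System.currentTimeMillis() as their seed
val random = new Random(ThreadLocalRandom.current().nextLong())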