"Spawn" concurrent effect in a WebSocket endpoint - scala

I have the following code:
class ApiRoutes2[F[_]](implicit F: ConcurrentEffect[F]) extends Http4sDsl[F] {
var queue = Queue.bounded[F, String](100)
def createService(queue: Queue[F, String]): F[Unit] = ???
val service: HttpRoutes[F] = HttpRoutes.of[F] {
case PUT -> Root / "services" =>
val toClientF: F[Stream[F, WebSocketFrame]] = queue.map(_.dequeue.map(t => Text(t)))
val fromClient: Pipe[F, WebSocketFrame, Unit] = _.evalMap {
case Text(t, _) => F.delay(println(t))
case f => F.delay(println(s"Unknown type: $f"))
}
// How to "spawn" createService?
toClientF.flatMap { toClient =>
WebSocketBuilder[F].build(toClient, fromClient)
}
}
}
createService is a function which creates a new service. Creating a new service is a very complicated process, it envolves triggering CI pipelines, waiting for them to finish and then trigger more CI pipelines in the same fashion. The queue it receives will be used to report back to the browser the current operations being performed.
I wanna concurrently "spawn" the createService and let it run until it finishes. However at the same time I want to immediately return the WebSocket to the client. Aka I cannot block while "spawning" createService.
I'm stuck. I can only think of using shift but that would mean the next line in the for comprehension would block waiting for createService to finish only to then return the websocket to the client.
Is my approach wrong? What am I doing wrong?

Since F is an instance of ConcurrentEffect, you also have a Concurrent instance.
You can therefore use Concurrent[F].start which returns a Fiber to the running operation (you can just ignore the Fibre if you don't need to cancel/ensure completion though).
val service: HttpRoutes[F] = HttpRoutes.of[F] {
case PUT -> Root / "services" =>
val toClientF: F[Stream[F, WebSocketFrame]] = queue.map(_.dequeue.map(t => Text(t)))
val fromClient: Pipe[F, WebSocketFrame, Unit] = _.evalMap {
case Text(t, _) => F.delay(println(t))
case f => F.delay(println(s"Unknown type: $f"))
}
for {
toClient <- toClientF
_ <- Concurrent[F].start(createService)
websocket <- WebSocketBuilder[F].build(toClient, fromClient)
} yield websocket
}

Related

Scala Futures for-comprehension with a list of values

I need to execute a Future method on some elements I have in a list simultaneously. My current implementation works sequentially, which is not optimal for saving time. I did this by mapping my list and calling the method on each element and processing the data this way.
My manager shared a link with me showing how to execute Futures simultaneously using for-comprehension but I cannot see/understand how I can implement this with my List.
The link he shared with me is https://alvinalexander.com/scala/how-use-multiple-scala-futures-in-for-comprehension-loop/
Here is my current code:
private def method1(id: String): Tuple2[Boolean, List[MyObject]] = {
val workers = List.concat(idleWorkers, activeWorkers.keys.toList)
var ready = true;
val workerStatus = workers.map{ worker =>
val option = Await.result(method2(worker), 1 seconds)
var status = if (option.isDefined) {
if (option.get._2 == id) {
option.get._1.toString
} else {
"INVALID"
}
} else "FAILED"
val status = s"$worker: $status"
if (option.get._1) {
ready = false
}
MyObject(worker.toString, status)
}.toList.filterNot(s => s. status.contains("INVALID"))
(ready, workerStatus)
}
private def method2(worker: ActorRef): Future[Option[(Boolean, String)]] = Future{
implicit val timeout: Timeout = 1 seconds;
Try(Await.result(worker ? GetStatus, 1 seconds)) match {
case Success(extractedVal) => extractedVal match {
case res: (Boolean, String) => Some(res)
case _ => None
}
case Failure(_) => { None }
case _ => { None }
}
}
If someone could suggest how to implement for-comprehension in this scenario, I would be grateful. Thanks
For method2 there is no need for the Future/Await mix. Just map the Future:
def method2(worker: ActorRef): Future[Option[(Boolean, String)]] =
(worker ? GetStatus).map{
case res: (Boolean, String) => Some(res)
case _ => None
}
For method1 you likewise need to map the result of method2 and do the processing inside the map. This will make workerStatus a List[Future[MyObject]] and means that everything runs in parallel.
Then use Future.sequence(workerStatus) to turn the List[Future[MyObject]] into a Future[List[MyObject]]. You can then use map again to do the filtering/ checking on that List[MyObject]. This will happen when all the individual Futures have completed.
Ideally you would then return a Future from method1 to keep everything asynchronous. You could, if absolutely necessary, use Await.result at this point which would wait for all the asynchronous operations to complete (or fail).

Iterate data source asynchronously in batch and stop while remote return no data in Scala

Let's say we have a fake data source which will return data it holds in batch
class DataSource(size: Int) {
private var s = 0
implicit val g = scala.concurrent.ExecutionContext.global
def getData(): Future[List[Int]] = {
s = s + 1
Future {
Thread.sleep(Random.nextInt(s * 100))
if (s <= size) {
List.fill(100)(s)
} else {
List()
}
}
}
object Test extends App {
val source = new DataSource(100)
implicit val g = scala.concurrent.ExecutionContext.global
def process(v: List[Int]): Unit = {
println(v)
}
def next(f: (List[Int]) => Unit): Unit = {
val fut = source.getData()
fut.onComplete {
case Success(v) => {
f(v)
v match {
case h :: t => next(f)
}
}
}
}
next(process)
Thread.sleep(1000000000)
}
I have mine, the problem here is some portion is more not pure. Ideally, I would like to wrap the Future for each batch into a big future, and the wrapper future success when last batch returned 0 size list? My situation is a little from this post, the next() there is synchronous call while my is also async.
Or is it ever possible to do what I want? Next batch will only be fetched when the previous one is resolved in the end whether to fetch the next batch depends on the size returned?
What's the best way to walk through this type of data sources? Are there any existing Scala frameworks that provide the feature I am looking for? Is play's Iteratee, Enumerator, Enumeratee the right tool? If so, can anyone provide an example on how to use those facilities to implement what I am looking for?
Edit----
With help from chunjef, I had just tried out. And it actually did work out for me. However, there was some small change I made based on his answer.
Source.fromIterator(()=>Iterator.continually(source.getData())).mapAsync(1) (f=>f.filter(_.size > 0))
.via(Flow[List[Int]].takeWhile(_.nonEmpty))
.runForeach(println)
However, can someone give comparison between Akka Stream and Play Iteratee? Does it worth me also try out Iteratee?
Code snip 1:
Source.fromIterator(() => Iterator.continually(ds.getData)) // line 1
.mapAsync(1)(identity) // line 2
.takeWhile(_.nonEmpty) // line 3
.runForeach(println) // line 4
Code snip 2: Assuming the getData depends on some other output of another flow, and I would like to concat it with the below flow. However, it yield too many files open error. Not sure what would cause this error, the mapAsync has been limited to 1 as its throughput if I understood correctly.
Flow[Int].mapConcat[Future[List[Int]]](c => {
Iterator.continually(ds.getData(c)).to[collection.immutable.Iterable]
}).mapAsync(1)(identity).takeWhile(_.nonEmpty).runForeach(println)
The following is one way to achieve the same behavior with Akka Streams, using your DataSource class:
import scala.concurrent.Future
import scala.util.Random
import akka.actor.ActorSystem
import akka.stream._
import akka.stream.scaladsl._
object StreamsExample extends App {
implicit val system = ActorSystem("Sandbox")
implicit val materializer = ActorMaterializer()
val ds = new DataSource(100)
Source.fromIterator(() => Iterator.continually(ds.getData)) // line 1
.mapAsync(1)(identity) // line 2
.takeWhile(_.nonEmpty) // line 3
.runForeach(println) // line 4
}
class DataSource(size: Int) {
...
}
A simplified line-by-line overview:
line 1: Creates a stream source that continually calls ds.getData if there is downstream demand.
line 2: mapAsync is a way to deal with stream elements that are Futures. In this case, the stream elements are of type Future[List[Int]]. The argument 1 is the level of parallelism: we specify 1 here because DataSource internally uses a mutable variable, and a parallelism level greater than one could produce unexpected results. identity is shorthand for x => x, which basically means that for each Future, we pass its result downstream without transforming it.
line 3: Essentially, ds.getData is called as long as the result of the Future is a non-empty List[Int]. If an empty List is encountered, processing is terminated.
line 4: runForeach here takes a function List[Int] => Unit and invokes that function for each stream element.
Ideally, I would like to wrap the Future for each batch into a big future, and the wrapper future success when last batch returned 0 size list?
I think you are looking for a Promise.
You would set up a Promise before you start the first iteration.
This gives you promise.future, a Future that you can then use to follow the completion of everything.
In your onComplete, you add a case _ => promise.success().
Something like
def loopUntilDone(f: (List[Int]) => Unit): Future[Unit] = {
val promise = Promise[Unit]
def next(): Unit = source.getData().onComplete {
case Success(v) =>
f(v)
v match {
case h :: t => next()
case _ => promise.success()
}
case Failure(e) => promise.failure(e)
}
// get going
next(f)
// return the Future for everything
promise.future
}
// future for everything, this is a `Future[Unit]`
// its `onComplete` will be triggered when there is no more data
val everything = loopUntilDone(process)
You are probably looking for a reactive streams library. My personal favorite (and one I'm most familiar with) is Monix. This is how it will work with DataSource unchanged
import scala.concurrent.duration.Duration
import scala.concurrent.Await
import monix.reactive.Observable
import monix.execution.Scheduler.Implicits.global
object Test extends App {
val source = new DataSource(100)
val completed = // <- this is Future[Unit], completes when foreach is done
Observable.repeat(Observable.fromFuture(source.getData()))
.flatten // <- Here it's Observable[List[Int]], it has collection-like methods
.takeWhile(_.nonEmpty)
.foreach(println)
Await.result(completed, Duration.Inf)
}
I just figured out that by using flatMapConcat can achieve what I wanted to achieve. There is no point to start another question as I have had the answer already. Put my sample code here just in case someone is looking for similar answer.
This type of API is very common for some integration between traditional Enterprise applications. The DataSource is to mock the API while the object App is to demonstrate how the client code can utilize Akka Stream to consume the APIs.
In my small project the API was provided in SOAP, and I used scalaxb to transform the SOAP to Scala async style. And with the client calls demonstrated in the object App, we can consume the API with AKKA Stream. Thanks for all for the help.
class DataSource(size: Int) {
private var transactionId: Long = 0
private val transactionCursorMap: mutable.HashMap[TransactionId, Set[ReadCursorId]] = mutable.HashMap.empty
private val cursorIteratorMap: mutable.HashMap[ReadCursorId, Iterator[List[Int]]] = mutable.HashMap.empty
implicit val g = scala.concurrent.ExecutionContext.global
case class TransactionId(id: Long)
case class ReadCursorId(id: Long)
def startTransaction(): Future[TransactionId] = {
Future {
synchronized {
transactionId += transactionId
}
val t = TransactionId(transactionId)
transactionCursorMap.update(t, Set(ReadCursorId(0)))
t
}
}
def createCursorId(t: TransactionId): ReadCursorId = {
synchronized {
val c = transactionCursorMap.getOrElseUpdate(t, Set(ReadCursorId(0)))
val currentId = c.foldLeft(0l) { (acc, a) => acc.max(a.id) }
val cId = ReadCursorId(currentId + 1)
transactionCursorMap.update(t, c + cId)
cursorIteratorMap.put(cId, createIterator)
cId
}
}
def createIterator(): Iterator[List[Int]] = {
(for {i <- 1 to 100} yield List.fill(100)(i)).toIterator
}
def startRead(t: TransactionId): Future[ReadCursorId] = {
Future {
createCursorId(t)
}
}
def getData(cursorId: ReadCursorId): Future[List[Int]] = {
synchronized {
Future {
Thread.sleep(Random.nextInt(100))
cursorIteratorMap.get(cursorId) match {
case Some(i) => i.next()
case _ => List()
}
}
}
}
}
object Test extends App {
val source = new DataSource(10)
implicit val system = ActorSystem("Sandbox")
implicit val materializer = ActorMaterializer()
implicit val g = scala.concurrent.ExecutionContext.global
//
// def process(v: List[Int]): Unit = {
// println(v)
// }
//
// def next(f: (List[Int]) => Unit): Unit = {
// val fut = source.getData()
// fut.onComplete {
// case Success(v) => {
// f(v)
// v match {
//
// case h :: t => next(f)
//
// }
// }
//
// }
//
// }
//
// next(process)
//
// Thread.sleep(1000000000)
val s = Source.fromFuture(source.startTransaction())
.map { e =>
source.startRead(e)
}
.mapAsync(1)(identity)
.flatMapConcat(
e => {
Source.fromIterator(() => Iterator.continually(source.getData(e)))
})
.mapAsync(5)(identity)
.via(Flow[List[Int]].takeWhile(_.nonEmpty))
.runForeach(println)
/*
val done = Source.fromIterator(() => Iterator.continually(source.getData())).mapAsync(1)(identity)
.via(Flow[List[Int]].takeWhile(_.nonEmpty))
.runFold(List[List[Int]]()) { (acc, r) =>
// println("=======" + acc + r)
r :: acc
}
done.onSuccess {
case e => {
e.foreach(println)
}
}
done.onComplete(_ => system.terminate())
*/
}

How would you "connect" many independent graphs maintaining backpressure between them?

Continuing series of questions about akka-streams I have another problem.
Variables:
Single http client flow with throttling
Multiple other flows that want to use first flow simultaneously
Goal:
Single http flow is flow that makes requests to particular API that limits number of calls to it. Otherwise it bans me. Thus it's very important to maintain rate of request regardless of how many clients in my code use it.
There are number of other flows that want to make requests to mentioned API but I'd like to have backpressure from http flow. Normally you connect whole thing to one graph and it works. But it my case I have multiple graphs.
How would you solve it ?
My attempt to solve it:
I use Source.queue for http flow so that I can queue http requests and have throttling. Problem is that Future from SourceQueue.offer fails if I exceed number of requests. Thus somehow I need to "reoffer" when previously offered event completes. Thus modified Future from SourceQueue would backpressure other graphs (inside their mapAsync) that make http requests.
Here is how I implemented it
object Main {
implicit val system = ActorSystem("root")
implicit val executor = system.dispatcher
implicit val materializer = ActorMaterializer()
private val queueHttp = Source.queue[(String, Promise[String])](2, OverflowStrategy.backpressure)
.throttle(1, FiniteDuration(1000, MILLISECONDS), 1, ThrottleMode.Shaping)
.mapAsync(4) {
case (text, promise) =>
// Simulate delay of http request
val delay = (Random.nextDouble() * 1000 / 2).toLong
Thread.sleep(delay)
Future.successful(text -> promise)
}
.toMat(Sink.foreach({
case (text, p) =>
p.success(text)
}))(Keep.left)
.run
val futureDeque = new ConcurrentLinkedDeque[Future[String]]()
def sendRequest(value: String): Future[String] = {
val p = Promise[String]()
val offerFuture = queueHttp.offer(value -> p)
def addToQueue(future: Future[String]): Future[String] = {
futureDeque.addLast(future)
future.onComplete {
case _ => futureDeque.remove(future)
}
future
}
offerFuture.flatMap {
case QueueOfferResult.Enqueued =>
addToQueue(p.future)
}.recoverWith {
case ex =>
val first = futureDeque.pollFirst()
if (first != null)
addToQueue(first.flatMap(_ => sendRequest(value)))
else
sendRequest(value)
}
}
def main(args: Array[String]) {
val allFutures = for (v <- 0 until 15)
yield {
val res = sendRequest(s"Text $v")
res.onSuccess {
case text =>
println("> " + text)
}
res
}
Future.sequence(allFutures).onComplete {
case Success(text) =>
println(s">>> TOTAL: ${text.length} [in queue: ${futureDeque.size()}]")
system.terminate()
case Failure(ex) =>
ex.printStackTrace()
system.terminate()
}
Await.result(system.whenTerminated, Duration.Inf)
}
}
Disadvantage of this solution is that I have locking on ConcurrentLinkedDeque which is probably not that bad for rate of 1 request per second but still.
How would you solve this task?
We have an open ticket (https://github.com/akka/akka/issues/19478) and some ideas for a "Hub" stage which would allow for dynamically combining streams, but I'm afraid I cannot give you any estimate for when it will be done.
So that is how we, in the Akka team, would solve the task. ;)

Combining Spray Routing + Actor Pattern Matching

Following the Akka Cluster documentation, I have the Worker Dial-in example running.
http://doc.akka.io/docs/akka/snapshot/java/cluster-usage.html
So I've trying to integrate that with a spray routing.
My idea is to have a cluster behind the scenes and through a http rest, call that service.
So I have the following code.
object Boot extends App {
val port = if (args.isEmpty) "0" else args(0)
val config =
ConfigFactory
.parseString(s"akka.remote.netty.tcp.port=$port")
.withFallback(ConfigFactory.parseString("akka.cluster.roles = [frontend]"))
.withFallback(ConfigFactory.load())
val system = ActorSystem("ClusterSystem", config)
val frontend = system.actorOf(Props[TransformationFrontend], name = "frontend")
implicit val actSystem = ActorSystem()
IO(Http) ! Http.Bind(frontend, interface = config.getString("http.interface"), port = config.getInt("http.port"))
}
class TransformationFrontend extends Actor {
var backends = IndexedSeq.empty[ActorRef]
var jobCounter = 0
implicit val timeout = Timeout(5 seconds)
override def receive: Receive = {
case _: Http.Connected => sender ! Http.Register(self)
case HttpRequest(GET, Uri.Path("/job"), _, _, _) =>
jobCounter += 1
val backend = backends(jobCounter % backends.size)
val originalSender = sender()
val future : Future[TransformationResult] = (backend ? new TransformationJob(jobCounter + "-job")).mapTo[TransformationResult]
future onComplete {
case Success(s) =>
println("received from backend: " + s.text)
originalSender ! s.text
case Failure(f) => println("error found: " + f.getMessage)
}
case job: TransformationJob if backends.isEmpty =>
sender() ! JobFailed("Service unavailable, try again later", job)
case job: TransformationJob =>
jobCounter += 1
backends(jobCounter % backends.size) forward job
case BackendRegistration if !backends.contains(sender()) =>
println("backend registered")
context watch sender()
backends = backends :+ sender()
case Terminated(a) =>
backends = backends.filterNot(_ == a)
}
}
But what I really want to do is to combining the spray routing with those pattern matching.
Instead of writing my GET like the above, I would like to write like this:
path("job") {
get {
respondWithMediaType(`application/json`) {
complete {
(backend ? new TransformationJob(jobCounter + "-job")).mapTo[TransformationResult]
}
}
}
}
But extending my Actor with this class, I have to do the following
def receive = runRoute(defaultRoute)
How can I combine this approach with my TransformationFrontend Actor pattern matching methods? BackendRegistration, Terminated, TransformationJob?
You can compose PartialFunctions like Receive with PartialFunction.orElse:
class TransformationFrontend extends Actor {
// ...
def myReceive: Receive = {
case job: TransformationJob => // ...
// ...
}
def defaultRoute: Route =
get {
// ...
}
override def receive: Receive = runRoute(defaultRoute) orElse myReceive
}
That said, it often makes sense to split up functionality into several actors (as suggested in the comment above) if possible.

waiting for "recursive" futures in scala

a simple code sample that describes my problem:
import scala.util._
import scala.concurrent._
import scala.concurrent.duration._
import ExecutionContext.Implicits.global
class LoserException(msg: String, dice: Int) extends Exception(msg) { def diceRoll: Int = dice }
def aPlayThatMayFail: Future[Int] = {
Thread.sleep(1000) //throwing a dice takes some time...
//throw a dice:
(1 + Random.nextInt(6)) match {
case 6 => Future.successful(6) //I win!
case i: Int => Future.failed(new LoserException("I did not get 6...", i))
}
}
def win(prefix: String): String = {
val futureGameLog = aPlayThatMayFail
futureGameLog.onComplete(t => t match {
case Success(diceRoll) => "%s, and finally, I won! I rolled %d !!!".format(prefix, diceRoll)
case Failure(e) => e match {
case ex: LoserException => win("%s, and then i got %d".format(prefix, ex.diceRoll))
case _: Throwable => "%s, and then somebody cheated!!!".format(prefix)
}
})
"I want to do something like futureGameLog.waitForRecursiveResult, using Await.result or something like that..."
}
win("I started playing the dice")
this simple example illustrates what i want to do. basically, if to put it in words, i want to wait for a result for some computation, when i compose different actions on previous success or failed attampts.
so how would you implement the win method?
my "real world" problem, if it makes any difference, is using dispatch for asynchronous http calls, where i want to keep making http calls whenever the previous one ends, but actions differ on wether the previous http call succeeded or not.
You can recover your failed future with a recursive call:
def foo(x: Int) = x match {
case 10 => Future.successful(x)
case _ => Future.failed[Int](new Exception)
}
def bar(x: Int): Future[Int] = {
foo(x) recoverWith { case _ => bar(x+1) }
}
scala> bar(0)
res0: scala.concurrent.Future[Int] = scala.concurrent.impl.Promise$DefaultPromise#64d6601
scala> res0.value
res1: Option[scala.util.Try[Int]] = Some(Success(10))
recoverWith takes a PartialFunction[Throwable,scala.concurrent.Future[A]] and returns a Future[A]. You should be careful though, because it will use quite some memory when it does lots of recursive calls here.
As drexin answered the part about exception handling and recovering, let me try and answer the part about a recursive function involving futures. I believe using a Promise will help you achieve your goal. The restructured code would look like this:
def win(prefix: String): String = {
val prom = Promise[String]()
def doWin(p:String) {
val futureGameLog = aPlayThatMayFail
futureGameLog.onComplete(t => t match {
case Success(diceRoll) => prom.success("%s, and finally, I won! I rolled %d !!!".format(prefix, diceRoll))
case Failure(e) => e match {
case ex: LoserException => doWin("%s, and then i got %d".format(prefix, ex.diceRoll))
case other => prom.failure(new Exception("%s, and then somebody cheated!!!".format(prefix)))
}
})
}
doWin(prefix)
Await.result(prom.future, someTimeout)
}
Now this won't be true recursion in the sense that it will be building up one long stack due to the fact that the futures are async, but it is similar to recursion in spirit. Using the promise here gives you something to block against while the recursion does it's thing, blocking the caller from what's happening behind the scene.
Now, if I was doing this, I would probable redefine things like so:
def win(prefix: String): Future[String] = {
val prom = Promise[String]()
def doWin(p:String) {
val futureGameLog = aPlayThatMayFail
futureGameLog.onComplete(t => t match {
case Success(diceRoll) => prom.success("%s, and finally, I won! I rolled %d !!!".format(prefix, diceRoll))
case Failure(e) => e match {
case ex: LoserException => doWin("%s, and then i got %d".format(prefix, ex.diceRoll))
case other => prom.failure(new Exception("%s, and then somebody cheated!!!".format(prefix)))
}
})
}
doWin(prefix)
prom.future
}
This way you can defer the decision on whether to block or use async callbacks to the caller of this function. This is more flexible, but it also exposes the caller to the fact that you are doing async computations and I'm not sure that is going to be acceptable for your scenario. I'll leave that decision up to you.
This works for me:
def retryWithFuture[T](f: => Future[T],retries:Int, delay:FiniteDuration) (implicit ec: ExecutionContext, s: Scheduler): Future[T] ={
f.recoverWith { case _ if retries > 0 => after[T](delay,s)(retryWithFuture[T]( f , retries - 1 , delay)) }
}