Stream Future in Play 2.5 - scala

Once again I am attempting to update some pre Play 2.5 code (based on this vid). For example the following used to be how to stream a Future:
Ok.chunked(Enumerator.generateM(Promise.timeout(Some("hello"), 500)))
I have created the following method for the work-around for Promise.timeout (deprecated) using Akka:
private def keepResponding(data: String, delay: FiniteDuration, interval: FiniteDuration): Future[Result] = {
val promise: Promise[Result] = Promise[Result]()
actorSystem.scheduler.schedule(delay, interval) { promise.success(Ok(data)) }
promise.future
}
According to the Play Framework Migration Guide, Enumerators should be rewritten as Sources, and Source.unfoldAsync is apparently the equivalent of Enumerator.generateM, so I was hoping that this would work (where str is a Future[Result]):
def inf = Action { request =>
val str = keepResponding("stream me", 1.second, 2.second)
Ok.chunked(Source.unfoldAsync(str))
}
Of course I'm getting a type mismatch error, and when looking at the class signature of UnfoldAsync:
final class UnfoldAsync[S, E](s: S, f: S ⇒ Future[Option[(S, E)]])
I can see that the parameters are not correct but I'm not fully understanding what/how I should pass this through.

unfoldAsync is even more generic than Play's own generateM, as it allows you to pass through a state (S) value. This lets each emitted value depend on the previously emitted value(s).
The example below will load values by an increasing id, until the loading fails:
val source: Source[String, NotUsed] = Source.unfoldAsync(0){ id ⇒
loadFromId(id)
.map(s ⇒ Some((id + 1, s)))
.recover{case _ ⇒ None}
}
def loadFromId(id: Int): Future[String] = ???
In your case an internal state is not really needed, therefore you can just pass dummy values whenever required, e.g.
val source: Source[Result, NotUsed] = Source.unfoldAsync(NotUsed) { _ ⇒
schedule("stream me", 2.seconds).map(x ⇒ Some(NotUsed → x))
}
def schedule(data: String, delay: FiniteDuration): Future[Result] = {
akka.pattern.after(delay, system.scheduler){Future.successful(Ok(data))}
}
Note that your original implementation of keepResponding is incorrect, as you cannot complete a Promise more than once: the second scheduled call to promise.success throws an IllegalStateException. Akka's after pattern offers a simpler way to achieve what you need.
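To see why the scheduled callback in keepResponding cannot work, here is a minimal sketch that uses only the Scala standard library (the object name PromiseOnce and the method demo are made up for illustration). It completes a Promise once and then tries twice more: a second success call throws, while trySuccess just reports that the value was rejected.

```scala
import scala.concurrent.Promise
import scala.util.Try

object PromiseOnce {
  // Returns (did the second `success` throw?, did `trySuccess` accept the value?, final value).
  def demo(): (Boolean, Boolean, String) = {
    val promise = Promise[String]()
    promise.success("first")
    // A Promise can be completed only once; a second `success` throws IllegalStateException.
    val secondThrew = Try(promise.success("second")).isFailure
    // `trySuccess` does not throw; it returns false when the Promise is already completed.
    val tryAccepted = promise.trySuccess("third")
    (secondThrew, tryAccepted, promise.future.value.get.get)
  }
}
```

Calling PromiseOnce.demo() yields (true, false, "first"): the first value wins and later completions are rejected.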
However, note that in your specific case, Akka Streams offers a more idiomatic solution with Source.tick:
val source: Source[String, Cancellable] = Source.tick(1.second, 2.seconds, NotUsed).mapAsync(1){ _ ⇒
loadSomeFuture()
}
def loadSomeFuture(): Future[String] = ???
or, even simpler, in case you don't actually need asynchronous computation (as in your example):
val source: Source[String, Cancellable] = Source.tick(1.second, 2.seconds, "stream me")

Related

How to Promise.allSettled with Scala futures?

I have two scala futures. I want to perform an action once both are completed, regardless of whether they were completed successfully. (Additionally, I want the ability to inspect those results at that time.)
In Javascript, this is Promise.allSettled.
Does Scala offer a simple way to do this?
One last wrinkle, if it matters: I want to do this in a JRuby application.
You can use the transform method to create a Future that will always succeed and return the result or the error as a Try object.
def toTry[A](future: Future[A])(implicit ec: ExecutionContext): Future[Try[A]] =
future.transform(x => Success(x))
To combine two Futures into one, you can use zip:
def settle2[A, B](fa: Future[A], fb: Future[B])(implicit ec: ExecutionContext)
: Future[(Try[A], Try[B])] =
toTry(fa).zip(toTry(fb))
If you want to combine an arbitrary number of Futures this way, you can use Future.traverse:
def allSettled[A](futures: List[Future[A]])(implicit ec: ExecutionContext)
: Future[List[Try[A]]] =
Future.traverse(futures)(toTry(_))
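Putting the pieces above together, a small self-contained sketch might look like the following. The object name, the describe helper and the sample futures are illustrative only, and the blocking Await is there purely to make the demo observable.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object AllSettledExample {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Lift the outcome into a Try so the resulting Future always succeeds.
  def toTry[A](future: Future[A]): Future[Try[A]] =
    future.transform(x => Success(x))

  def allSettled[A](futures: List[Future[A]]): Future[List[Try[A]]] =
    Future.traverse(futures)(toTry(_))

  // Demo helper: settles three sample futures and renders each outcome as text.
  def describe(): List[String] = {
    val futures = List(
      Future(1),
      Future[Int](throw new IllegalStateException("boom")),
      Future(3)
    )
    Await.result(allSettled(futures), 5.seconds).map {
      case Success(v) => s"ok: $v"
      case Failure(e) => s"failed: ${e.getMessage}"
    }
  }
}
```

AllSettledExample.describe() returns List("ok: 1", "failed: boom", "ok: 3"), so the failed Future neither hides the other results nor fails the combined Future.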
Normally in this case we would use Future.sequence to transform a collection of Futures into a single Future that you can map over. However, Future.sequence fails as soon as it reaches a failed Future and does not wait for the remaining ones (one failure is treated as a failure for all), which doesn't fit your case.
Instead, you need to map the failed Futures to successful ones and then call sequence, e.g.
val settledFuture = Future.sequence(List(future1, future2, ...).map(_.recoverWith { case _ => Future.unit }))
settledFuture.map { _ => /* here it is all settled */ }
EDIT
Since the results need to be kept, instead of mapping to Future.unit, we map the actual result into another layer of Try:
val settledFuture = Future.sequence(
List(Future(1), Future(throw new Exception))
.map(_.map(Success(_)).recover(Failure(_)))
)
settledFuture.map(println(_))
//Output: List(Success(1), Failure(java.lang.Exception))
EDIT2
It can be further simplified with transform:
Future.sequence(listOfFutures.map(_.transform(Success(_))))
Perhaps you could use a concurrent counter to keep track of the number of completed Futures, and then complete the Promise once all Futures have completed:
def allSettled[T](futures: List[Future[T]]): Future[List[Future[T]]] = {
val p = Promise[List[Future[T]]]()
val length = futures.length
val completedCount = new AtomicInteger(0)
futures foreach {
_.onComplete { _ =>
if (completedCount.incrementAndGet == length) p.trySuccess(futures)
}
}
p.future
}
val futures = List(
Future(-11),
Future(throw new Exception("boom")),
Future(42)
)
allSettled(futures).andThen(println(_))
// Success(List(Future(Success(-11)), Future(Failure(java.lang.Exception: boom)), Future(Success(42))))

Chaining context through akka streams

I'm converting some C# code to Scala and Akka Streams.
My C# code looks something like this:
Task<Result1> GetPartialResult1Async(Request request) ...
Task<Result2> GetPartialResult2Async(Request request) ...
async Task<Result> GetResultAsync(Request request)
{
var result1 = await GetPartialResult1Async(request);
var result2 = await GetPartialResult2Async(request);
return new Result(request, result1, result2);
}
Now for the akka streams. Instead of having a function from Request to a Task of a result, I have flows from a Request to a Result.
So I already have the following two flows:
val partialResult1Flow: Flow[Request, Result1, NotUsed] = ...
val partialResult2Flow: Flow[Request, Result2, NotUsed] = ...
However I can't see how to combine them into a complete flow, since by calling via on the first flow we lose the original request, and by calling via on the second flow we lose the result of the first flow.
So I've created a WithState monad which looks something like this:
case class WithState[+TState, +TValue](value: TValue, state: TState) {
def map[TResult](func: TValue => TResult): WithState[TState, TResult] = {
WithState(func(value), state)
}
// ... bunch more helper functions go here
}
Then I'm rewriting my original flows to look like this:
def partialResult1Flow[TState]: Flow[WithState[TState, Request], WithState[TState, Result1], NotUsed] = ...
def partialResult2Flow[TState]: Flow[WithState[TState, Request], WithState[TState, Result2], NotUsed] = ...
and using them like this:
val flow = Flow[Request]
.map(x => WithState(x, x))
.via(partialResult1Flow)
.map(x => WithState(x.state, (x.state, x.value)))
.via(partialResult2Flow)
.map(x => Result(x.state._1, x.state._2, x.value))
Now this works, but of course I can't guarantee how flow will be used. So I really ought to make it take a State parameter:
def flow[TState] = Flow[WithState[TState, Request]]
.map(x => WithState(x.value, (x.state, x.value)))
.via(partialResult1Flow)
.map(x => WithState(x.state._2, (x.state, x.value)))
.via(partialResult2Flow)
.map(x => WithState(Result(x.state._1._2, x.state._2, x.value), x.state._1._1))
Now at this stage my code is getting extremely hard to read. I could clean it up by naming the functions, and using case classes instead of tuples etc. but fundamentally there's a lot of incidental complexity here, which is hard to avoid.
Am I missing something? Is this not a good use case for Akka streams? Is there some inbuilt way of doing this?
I don't have any fundamentally different way to do this than I described in the question.
However the current flow can be significantly improved:
Stage 1: FlowWithContext
Instead of using a custom WithState monad, it's possible to use the built in FlowWithContext.
The advantage of this is that you can use the standard operators on the flow, without needing to worry about transforming the WithState monad. Akka takes care of this for you.
So instead of
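def partialResult1Flow[TState]: Flow[WithState[TState, Request], WithState[TState, Result1], NotUsed] =
// The outer mapAsync is Akka's and needs a parallelism (1 is used here for illustration); the inner one is a WithState helper
Flow[WithState[TState, Request]].mapAsync(1)(_.mapAsync(doRequest(_)))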
We can write:
def partialResult1Flow[TState]: FlowWithContext[Request, TState, Result1, TState, NotUsed] =
// mapAsync requires a parallelism argument; 1 is used here for illustration
FlowWithContext[Request, TState].mapAsync(1)(doRequest(_))
Unfortunately though, whilst FlowWithContext is quite easy to write when you don't need to change the context, it's a little fiddly to use when you need to go via a stream which requires you to move some of your current data into the context (as ours does). In order to do that you need to convert to a Flow (using asFlow), and then back to a FlowWithContext using asFlowWithContext.
I found it easiest to just write the whole thing as a Flow in such cases, and convert to a FlowWithContext at the end.
For example:
def flow[TState]: FlowWithContext[Request, TState, Result, TState, NotUsed] =
Flow[(Request, TState)]
.map(x => (x._1, (x._1, x._2)))
.via(partialResult1Flow)
.map(x => (x._2._1, (x._2._1, x._1, x._2._2)))
.via(partialResult2Flow)
.map(x => (Result(x._2._1, x._2._2, x._1), x._2._3))
.asFlowWithContext((a: Request, b: TState) => (a,b))(_._2)
.map(_._1)
Is this any better?
In this particular case it's probably worse. In other cases, where you rarely need to change the context, it would be better. Either way, I would recommend using it because it's built in, rather than relying on a custom monad.
Stage 2: viaUsing
In order to make this a bit more user friendly I created a viaUsing extension method for Flow and FlowWithContext:
import akka.stream.{FlowShape, Graph}
import akka.stream.scaladsl.{Flow, FlowWithContext}
object FlowExtensions {
implicit class FlowViaUsingOps[In, Out, Mat](val f: Flow[In, Out, Mat]) extends AnyVal {
def viaUsing[Out2, Using, Mat2](func: Out => Using)(flow: Graph[FlowShape[(Using, Out), (Out2, Out)], Mat2]) : Flow[In, (Out2, Out), Mat] =
f.map(x => (func(x), x)).via(flow)
}
implicit class FlowWithContextViaUsingOps[In, CtxIn, Out, CtxOut, Mat](val f: FlowWithContext[In, CtxIn, Out, CtxOut, Mat]) extends AnyVal {
def viaUsing[Out2, Using, Mat2](func: Out => Using)(flow: Graph[FlowShape[(Using, (Out, CtxOut)), (Out2, (Out, CtxOut))], Mat2]):
FlowWithContext[In, CtxIn, (Out2, Out), CtxOut, Mat] =
f
.asFlow
.map(x => (func(x._1), (x._1, x._2)))
.via(flow)
.asFlowWithContext((a: In, b: CtxIn) => (a,b))(_._2._2)
.map(x => (x._1, x._2._1))
}
}
The purpose of viaUsing is to create the input for a FlowWithContext from the current output, whilst preserving your current output by passing it through the context. It results in a Flow whose output is a tuple of the output from the nested flow and the output of the original flow.
With viaUsing our example simplifies to:
def flow[TState]: FlowWithContext[Request, TState, Result, TState, NotUsed] =
FlowWithContext[Request, TState]
.viaUsing(x => x)(partialResult1Flow)
.viaUsing(x => x._2)(partialResult2Flow)
.map(x => Result(x._2._2, x._2._1, x._1))
I think this is a significant improvement. I've made a request to add viaUsing to Akka instead of relying on extension methods here.
I agree using Akka Streams for backpressure is useful. However, I'm not convinced that modelling the calculation of the partialResults as streams is useful here. Having the 'inner' logic based on Futures and wrapping those in the mapAsync of your flow, so that backpressure applies to the whole operation as one unit, seems simpler and perhaps even better.
This is basically a boiled-down refactoring of Levi Ramsey's earlier excellent answer:
import scala.concurrent.{ ExecutionContext, Future }
import akka.NotUsed
import akka.stream._
import akka.stream.scaladsl._
case class Request()
case class Result1()
case class Result2()
case class Response(r: Request, r1: Result1, r2: Result2)
def partialResult1(req: Request): Future[Result1] = ???
def partialResult2(req: Request): Future[Result2] = ???
val system = akka.actor.ActorSystem()
implicit val ec: ExecutionContext = system.dispatcher
val flow: Flow[Request, Response, NotUsed] =
Flow[Request]
.mapAsync(parallelism = 12) { req =>
for {
res1 <- partialResult1(req)
res2 <- partialResult2(req)
} yield (Response(req, res1, res2))
}
I would start with this, and only if you know you have reason to split partialResult1 and partialResult2 into separate stages introduce an intermediate step in the Flow. Depending on your requirements mapAsyncUnordered might be more suitable.
Disclaimer, I'm not totally familiar with C#'s async/await.
From what I've been able to glean from a quick perusal of the C# docs, Task<T> is a strictly (i.e. eager, not lazy) evaluated computation which will if successful eventually contain a T. The Scala equivalent of this is Future[T], where the equivalent of the C# code would be:
import scala.concurrent.{ ExecutionContext, Future }
def getPartialResult1Async(req: Request): Future[Result1] = ???
def getPartialResult2Async(req: Request): Future[Result2] = ???
def getResultAsync(req: Request)(implicit ectx: ExecutionContext): Future[Result] = {
val result1 = getPartialResult1Async(req)
val result2 = getPartialResult2Async(req)
result1.zipWith(result2) { (r1, r2) =>
new Result(req, r1, r2)
}
/* Could also:
* for {
* r1 <- result1
* r2 <- result2
* } yield { new Result(req, r1, r2) }
*
* Note that both the `result1.zipWith(result2)` and the above `for`
* construction may compute the two partial results simultaneously. If you
* want to ensure that the second partial result is computed after the first
* partial result is successfully computed:
* for {
* r1 <- getPartialResult1Async(req)
* r2 <- getPartialResult2Async(req)
* } yield new Result(req, r1, r2)
*/
}
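The difference between the two for-comprehension variants in the comment above can be made visible with a small standard-library sketch (all names here are illustrative). The first partial result deliberately waits until the second computation has started, which can only happen if both Futures were created before the for-comprehension.

```scala
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._

object EagerFutures {
  implicit val ec: ExecutionContext = ExecutionContext.global

  def bothStartedUpFront(): (String, String) = {
    val secondStarted = Promise[Unit]()
    // Completes only after the second computation has begun.
    val first: Future[String] = secondStarted.future.map(_ => "r1")
    // Already running at this point, before the for-comprehension below.
    val second: Future[String] = Future { secondStarted.success(()); "r2" }
    val combined = for { a <- first; b <- second } yield (a, b)
    // Blocking is only for the demo.
    Await.result(combined, 5.seconds)
  }
}
```

EagerFutures.bothStartedUpFront() returns ("r1", "r2"). Had the second Future been created inside the for-comprehension, after `a <- first`, the combined Future would never complete, because the first one would be waiting for a computation that had not been started yet.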
No Akka Streams is required for this particular case, but if you have some other need to use Akka Streams, you could express this as:
val actorSystem = ??? // In Akka Streams 2.6, you'd probably have this as an implicit val
val parallelism = ??? // Controls requests in flight
val flow = Flow[Request]
.mapAsync(parallelism) { req =>
import actorSystem.dispatcher
getPartialResult1Async(req).map { r1 => (req, r1) }
}
.mapAsync(parallelism) { tup =>
import actorSystem.dispatcher
getPartialResult2Async(tup._1).map { r2 =>
new Result(tup._1, tup._2, r2)
}
}
/* Given the `getResultAsync` function in the previous snippet, you could also:
* val flow = Flow[Request].mapAsync(parallelism) { req =>
* getResultAsync(req)(actorSystem.dispatcher)
* }
*/
One advantage of the Future-based implementation is that it's pretty easy to integrate with whatever Scala abstraction of concurrency/parallelism you want to use in a given context (e.g. cats, akka stream, akka). My general instinct to an Akka Streams integration would be in the direction of the three-liner in my comment in the second code block.

How to throttle Futures with one-second delay with Akka

I have a list of URIs, each of which I want to request with a one-second delay in between. How can I do that?
val uris: List[String] = List()
// How to make these URIs resolve 1 second apart?
val responses: List[Future[Response]] = uris.map(httpRequest(_))
You could create an Akka Streams Source from the list of URIs, then throttle the conversion of each URI to a Future[Response]:
def httpRequest(uri: String): Future[Response] = ???
val uris: List[String] = ???
val responses: Future[Seq[Response]] =
Source(uris)
.throttle(1, 1 second)
.mapAsync(parallelism = 1)(httpRequest)
.runWith(Sink.seq[Response])
Something like this perhaps:
@tailrec
def withDelay(
uris: Seq[String],
delay: FiniteDuration = 1.second,
result: List[Future[Response]] = Nil
): Seq[Future[Response]] = uris match {
case Seq() => result.reverse
case Seq(head, tail @ _*) =>
// Wait for the previous request (if any), then fire the next one after the delay
val v = result.headOption.getOrElse(Future.successful(null))
.flatMap { _ =>
akka.pattern.after(delay, context.system.scheduler)(httpRequest(head))
}
withDelay(tail, delay, v :: result)
}
This has a delay before the first execution as well, but I hope it's clear enough how to get rid of it if necessary.
Another caveat is that this assumes that all futures succeed. As soon as one fails, all subsequent processing is aborted.
If you need a different behavior, you may want to replace the .flatMap with .transform or add a .recover etc.
You can also write the same with .foldLeft if preferred:
uris.foldLeft(List.empty[Future[Response]]) { case (results, next) =>
results.headOption.getOrElse(Future.successful(null))
.flatMap { _ =>
akka.pattern.after(delay, context.system.scheduler)(httpRequest(next))
} :: results
}.reverse
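For the failure caveat mentioned above, a hedged sketch of the transformWith variant follows. It uses only the standard library: fetch is a made-up stand-in for httpRequest, and the one-second delay is left out to keep the example short. Each request starts only after the previous one has completed, whether that one succeeded or failed.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import scala.util.{Success, Try}

object SequentialRequests {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical stand-in for httpRequest: fails for the URI "bad".
  def fetch(uri: String): Future[String] =
    if (uri == "bad") Future.failed(new RuntimeException(s"cannot fetch $uri"))
    else Future.successful(s"response from $uri")

  // Chains the requests one after another; transformWith ignores the previous
  // outcome, so a failed request does not abort the ones that follow it.
  def inSequence(uris: List[String]): List[Future[String]] =
    uris.foldLeft(List.empty[Future[String]]) { (results, next) =>
      val previous = results.headOption.getOrElse(Future.successful(""))
      previous.transformWith(_ => fetch(next)) :: results
    }.reverse

  // Demo helper: blocks for each outcome and lifts it into a Try.
  def outcomes(uris: List[String]): List[Try[String]] =
    inSequence(uris).map(f => Await.result(f.transform(Success(_)), 5.seconds))
}
```

With the input List("a", "bad", "c"), the second outcome is a Failure while the first and third are Successes, so the failure in the middle does not stop the last request.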
Akka Streams has this out of the box with the throttle function (taking into account that you are using akka-http and added the akka-stream tag).

Akka streams: dealing with futures within graph stage

Within an akka stream stage FlowShape[A, B] , part of the processing I need to do on the A's is to save/query a datastore with a query built with A data. But that datastore driver query gives me a future, and I am not sure how best to deal with it (my main question here).
case class Obj(a: String, b: Int, c: String)
case class Foo(myobject: Obj, name: String)
case class Bar(st: String)
//
class SaveAndGetId extends GraphStage[FlowShape[Foo, Bar]] {
val dao = new DbDao // some dao with an async driver
override def createLogic(inheritedAttributes: Attributes) = new GraphStageLogic(shape) {
setHandlers(in, out, new InHandler with OutHandler {
override def onPush() = {
val foo = grab(in)
val add = foo.record.value()
val result: Future[String] = dao.saveAndGetRecord(add.myobject)//saves and returns id as string
//the naive approach
val record = Await.result(result, Duration.Inf)
push(out, Bar(record))// ***tests pass every time
//mapping the future approach
result.map { x=>
push(out, Bar(x))
} //***tests fail every time
The next stage depends on the id of the db record returned from query, but I want to avoid Await. I am not sure why mapping approach fails:
"it should work" in {
val source = Source.single(Foo(Obj("hello", 1, "world")))
val probe = source
.via(new SaveAndGetId)
.runWith(TestSink.probe)
probe
.request(1)
.expectBarwithId("one")//say we know this will be
.expectComplete()
}
private implicit class RichTestProbe(probe: Probe[Bar]) {
def expectBarwithId(expected: String): Probe[Bar] =
probe.expectNextChainingPF{
case r @ Bar(str) if str == expected => r
}
}
When run with mapping future, I get failure:
should work ***FAILED***
java.lang.AssertionError: assertion failed: expected: message matching partial function but got unexpected message OnComplete
at scala.Predef$.assert(Predef.scala:170)
at akka.testkit.TestKitBase$class.expectMsgPF(TestKit.scala:406)
at akka.testkit.TestKit.expectMsgPF(TestKit.scala:814)
at akka.stream.testkit.TestSubscriber$ManualProbe.expectEventPF(StreamTestKit.scala:570)
The async side channels example in the docs has the future in the constructor of the stage, as opposed to building the future within the stage, so doesn't seem to apply to my case.
I agree with Ramon. Constructing a new FlowShape is not necessary in this case and is too complicated. It is much more convenient to use the mapAsync method here.
Here is a code snippet that uses mapAsync:
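import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
object MapAsyncExample {
val parallelism: Int = 10
// Akka 2.6+: the implicit ActorSystem supplies the materializer (older versions also need an ActorMaterializer)
implicit val system: ActorSystem = ActorSystem("MapAsyncExample")
def main(args: Array[String]): Unit = {
Source.repeat(5)
.mapAsync(parallelism)(x => asyncSquare(x))
.runWith(Sink.foreach(println))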
}
//This method returns a Future
//You can replace this part with your database operations
def asyncSquare(value: Int): Future[Int] = Future {
value * value
}
}
In the snippet above, Source.repeat(5) is a dummy source that emits 5 indefinitely. The sample function asyncSquare takes an integer and calculates its square in a Future. The line .mapAsync(parallelism)(x => asyncSquare(x)) uses that function and emits the result of each Future to the next stage. In this snippet, the next stage is a sink which prints every item.
parallelism is the maximum number of asyncSquare calls that can run concurrently.
I think your GraphStage is unnecessarily overcomplicated. The below Flow performs the same actions without the need to write a custom stage:
val dao = new DbDao
val parallelism = 10 //number of parallel db queries
val SaveAndGetId : Flow[Foo, Bar, _] =
Flow[Foo]
.map(foo => foo.record.value().myobject)
.mapAsync(parallelism)(rec => dao.saveAndGetRecord(rec))
.map(Bar.apply)
I generally try to treat GraphStage as a last resort, there is almost always an idiomatic way of getting the same Flow by using the methods provided by the akka-stream library.
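If a custom stage really is needed, the failing test in the question is most likely explained by two things: push is called from the Future's thread rather than from inside the stage, and the stage completes as soon as Source.single finishes, before the Future has resolved. A minimal sketch of the usual remedy, assuming Akka Streams 2.5 or later, is to hand the result back through getAsyncCallback and to postpone completion while a lookup is pending. AsyncLookup and lookup are hypothetical names; lookup stands in for the asynchronous DAO call.

```scala
import akka.stream.{Attributes, FlowShape, Inlet, Outlet}
import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler, OutHandler}
import scala.concurrent.Future
import scala.util.{Failure, Success, Try}

// Hypothetical stage: `lookup` stands in for the asynchronous DAO call.
class AsyncLookup[A, B](lookup: A => Future[B]) extends GraphStage[FlowShape[A, B]] {
  val in: Inlet[A] = Inlet("AsyncLookup.in")
  val out: Outlet[B] = Outlet("AsyncLookup.out")
  override val shape: FlowShape[A, B] = FlowShape(in, out)

  override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
    new GraphStageLogic(shape) with InHandler with OutHandler {
      private var inFlight = false

      // The callback body runs inside the stage, so calling push here is safe.
      private val onResult = getAsyncCallback[Try[B]] {
        case Success(b) =>
          inFlight = false
          push(out, b)
          if (isClosed(in)) completeStage()
        case Failure(e) => failStage(e)
      }

      override def onPush(): Unit = {
        inFlight = true
        // Only hand the result over; never touch the stage from the Future's thread.
        lookup(grab(in)).onComplete(onResult.invoke)(materializer.executionContext)
      }

      override def onPull(): Unit = pull(in)

      // Postpone completion while a lookup is still pending.
      override def onUpstreamFinish(): Unit = if (!inFlight) completeStage()

      setHandlers(in, out, this)
    }
}
```

This is roughly what mapAsync with a parallelism of 1 already provides, which is why the built-in operator remains the simpler choice when it fits.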

Composing BodyParser in Play 2.5

Given a function with this signature:
def parser[A](otherParser: BodyParser[A]): BodyParser[A]
How can I write the function in such a way that the request body is examined and verified before it is passed to otherParser?
For simplicity let's say that I want to verify that a header ("Some-Header", perhaps) has a value that matches the body exactly. So if I have this action:
def post() = Action(parser(parse.tolerantText)) { request =>
Ok(request.body)
}
When I make a request like curl -H "Some-Header: hello" -d "hello" http://localhost:9000/post it should return "hello" in the response body with a status of 200. If my request is curl -H "Some-Header: hello" -d "hi" http://localhost:9000/post it should return a 400 with no body.
Here's what I've tried.
This one does not compile because otherParser(request).through(flow) expects flow to output a ByteString. The idea here was that the flow could notify the accumulator whether or not to continue processing via the Either output. I'm not sure how to let the accumulator know the status of the previous step.
def parser[A](otherParser: BodyParser[A]): BodyParser[A] = BodyParser { request =>
val flow: Flow[ByteString, Either[Result, ByteString], NotUsed] = Flow[ByteString].map { bytes =>
if (request.headers.get("Some-Header").contains(bytes.utf8String)) {
Right(bytes)
} else {
Left(BadRequest)
}
}
val acc: Accumulator[ByteString, Either[Result, A]] = otherParser(request)
// This fails to compile because flow needs to output a ByteString
acc.through(flow)
}
I also attempted to use filter. This one does compile and the response body that gets written is correct. However it always returns a 200 Ok response status.
def parser[A](otherParser: BodyParser[A]): BodyParser[A] = BodyParser { request =>
val flow: Flow[ByteString, ByteString, akka.NotUsed] = Flow[ByteString].filter { bytes =>
request.headers.get("Some-Header").contains(bytes.utf8String)
}
val acc: Accumulator[ByteString, Either[Result, A]] = otherParser(request)
acc.through(flow)
}
I came up with a solution using a GraphStageWithMaterializedValue. This concept was borrowed from Play's maxLength body parser. The key difference from the first attempt in my question (the one that doesn't compile) is that, instead of attempting to mutate the stream, I use the materialized value to convey information about the state of processing. While I had created a Flow[ByteString, Either[Result, ByteString], NotUsed], it turns out what I needed was a Flow[ByteString, ByteString, Future[Boolean]].
So with that, my parser function ends up looking like this:
def parser[A](otherParser: BodyParser[A]): BodyParser[A] = BodyParser { request =>
val flow: Flow[ByteString, ByteString, Future[Boolean]] = Flow.fromGraph(new BodyValidator(request.headers.get("Some-Header")))
val parserSink: Sink[ByteString, Future[Either[Result, A]]] = otherParser.apply(request).toSink
Accumulator(flow.toMat(parserSink) { (statusFuture: Future[Boolean], resultFuture: Future[Either[Result, A]]) =>
statusFuture.flatMap { success =>
if (success) {
resultFuture.map {
case Left(result) => Left(result)
case Right(a) => Right(a)
}
} else {
Future.successful(Left(BadRequest))
}
}
})
}
The key line is this one:
val flow: Flow[ByteString, ByteString, Future[Boolean]] = Flow.fromGraph(new BodyValidator(request.headers.get("Some-Header")))
The rest kind of falls into place once you are able to create this flow. Unfortunately BodyValidator is pretty verbose and feels somewhat boilerplate-heavy. In any case, it's mostly pretty easy to read. GraphStageWithMaterializedValue expects you to implement def shape: S (S is FlowShape[ByteString, ByteString] here) to specify the input type and output type of this graph. It also expects you to implement def createLogicAndMaterializedValue(inheritedAttributes: Attributes): (GraphStageLogic, M) (M is a Future[Boolean] here) to define what the graph should actually do. Here's the full code of BodyValidator (I'll explain in more detail below):
class BodyValidator(expected: Option[String]) extends GraphStageWithMaterializedValue[FlowShape[ByteString, ByteString], Future[Boolean]] {
val in = Inlet[ByteString]("BodyValidator.in")
val out = Outlet[ByteString]("BodyValidator.out")
override def shape: FlowShape[ByteString, ByteString] = FlowShape.of(in, out)
override def createLogicAndMaterializedValue(inheritedAttributes: Attributes): (GraphStageLogic, Future[Boolean]) = {
val status = Promise[Boolean]()
val bodyBuffer = new ByteStringBuilder()
val logic = new GraphStageLogic(shape) {
setHandler(out, new OutHandler {
override def onPull(): Unit = pull(in)
})
setHandler(in, new InHandler {
def onPush(): Unit = {
val chunk = grab(in)
bodyBuffer.append(chunk)
push(out, chunk)
}
override def onUpstreamFinish(): Unit = {
val fullBody = bodyBuffer.result()
status.success(expected.map(ByteString(_)).contains(fullBody))
completeStage()
}
override def onUpstreamFailure(e: Throwable): Unit = {
status.failure(e)
failStage(e)
}
})
}
(logic, status.future)
}
}
You first want to create an Inlet and Outlet to set up the inputs and outputs for your graph
val in = Inlet[ByteString]("BodyValidator.in")
val out = Outlet[ByteString]("BodyValidator.out")
Then you use these to define shape.
def shape: FlowShape[ByteString, ByteString] = FlowShape.of(in, out)
Inside createLogicAndMaterializedValue you need to initialize the value you intend to materialize. Here I've used a promise that can be resolved when I have the full data from the stream. I also create a ByteStringBuilder to track the data between iterations.
val status = Promise[Boolean]()
val bodyBuffer = new ByteStringBuilder()
Then I create a GraphStageLogic to set up what this graph does at each point of processing. Two handlers are set. One is an InHandler for dealing with data as it comes from the upstream source. The other is an OutHandler for dealing with data to send downstream. There's nothing really interesting in the OutHandler, so I'll ignore it here, other than to say that it is necessary boilerplate in order to avoid an IllegalStateException. Three methods are overridden in the InHandler: onPush, onUpstreamFinish, and onUpstreamFailure. onPush is called when new data is ready from upstream. In this method I simply grab the next chunk of data, write it to bodyBuffer and push the data downstream.
def onPush(): Unit = {
val chunk = grab(in)
bodyBuffer.append(chunk)
push(out, chunk)
}
onUpstreamFinish is called when the upstream finishes (surprise). This is where the business logic of comparing the body with the header happens.
override def onUpstreamFinish(): Unit = {
val fullBody = bodyBuffer.result()
status.success(expected.map(ByteString(_)).contains(fullBody))
completeStage()
}
onUpstreamFailure is implemented so that when something goes wrong, I can mark the materialized future as failed as well.
override def onUpstreamFailure(e: Throwable): Unit = {
status.failure(e)
failStage(e)
}
Then I just return the GraphStageLogic I've created and status.future as a tuple.