Composing BodyParser in Play 2.5 - scala

Given a function with this signature:
def parser[A](otherParser: BodyParser[A]): BodyParser[A]
How can I write the function in such a way that the request body is examined and verified before it is passed to otherParser?
For simplicity let's say that I want to verify that a header ("Some-Header", perhaps) has a value that matches the body exactly. So if I have this action:
def post() = Action(parser(parse.tolerantText)) { request =>
Ok(request.body)
}
When I make a request like curl -H "Some-Header: hello" -d "hello" http://localhost:9000/post it should return "hello" in the response body with a status of 200. If my request is curl -H "Some-Header: hello" -d "hi" http://localhost:9000/post it should return a 400 with no body.
Here's what I've tried.
This one does not compile because otherParser(request).through(flow) expects flow to output a ByteString. The idea here was that the flow could notify the accumulator whether or not to continue processing via the Either output. I'm not sure how to let the accumulator know the status of the previous step.
def parser[A](otherParser: BodyParser[A]): BodyParser[A] = BodyParser { request =>
val flow: Flow[ByteString, Either[Result, ByteString], NotUsed] = Flow[ByteString].map { bytes =>
if (request.headers.get("Some-Header").contains(bytes.utf8String)) {
Right(bytes)
} else {
Left(BadRequest)
}
}
val acc: Accumulator[ByteString, Either[Result, A]] = otherParser(request)
// This fails to compile because flow needs to output a ByteString
acc.through(flow)
}
I also attempted to use filter. This one does compile and the response body that gets written is correct. However it always returns a 200 Ok response status.
def parser[A](otherParser: BodyParser[A]): BodyParser[A] = BodyParser { request =>
val flow: Flow[ByteString, ByteString, akka.NotUsed] = Flow[ByteString].filter { bytes =>
request.headers.get("Some-Header").contains(bytes.utf8String)
}
val acc: Accumulator[ByteString, Either[Result, A]] = otherParser(request)
acc.through(flow)
}

I came up with a solution using a GraphStageWithMaterializedValue. The concept was borrowed from Play's maxLength body parser. The key difference from my first attempt in the question (the one that doesn't compile) is that instead of attempting to mutate the stream, I use the materialized value to convey information about the state of processing. While I had created a Flow[ByteString, Either[Result, ByteString], NotUsed], it turns out what I needed was a Flow[ByteString, ByteString, Future[Boolean]].
So with that, my parser function ends up looking like this:
def parser[A](otherParser: BodyParser[A]): BodyParser[A] = BodyParser { request =>
val flow: Flow[ByteString, ByteString, Future[Boolean]] = Flow.fromGraph(new BodyValidator(request.headers.get("Some-Header")))
val parserSink: Sink[ByteString, Future[Either[Result, A]]] = otherParser.apply(request).toSink
Accumulator(flow.toMat(parserSink) { (statusFuture: Future[Boolean], resultFuture: Future[Either[Result, A]]) =>
statusFuture.flatMap { success =>
if (success) {
resultFuture.map {
case Left(result) => Left(result)
case Right(a) => Right(a)
}
} else {
Future.successful(Left(BadRequest))
}
}
})
}
The key line is this one:
val flow: Flow[ByteString, ByteString, Future[Boolean]] = Flow.fromGraph(new BodyValidator(request.headers.get("Some-Header")))
The rest kind of falls into place once you are able to create this flow. Unfortunately BodyValidator is pretty verbose and feels somewhat boilerplate-heavy. In any case, it's mostly easy to read. GraphStageWithMaterializedValue expects you to implement def shape: S (S is FlowShape[ByteString, ByteString] here) to specify the input and output types of the graph. It also expects you to implement def createLogicAndMaterializedValue(inheritedAttributes: Attributes): (GraphStageLogic, M) (M is Future[Boolean] here) to define what the graph should actually do. Here's the full code of BodyValidator (I'll explain in more detail below):
class BodyValidator(expected: Option[String]) extends GraphStageWithMaterializedValue[FlowShape[ByteString, ByteString], Future[Boolean]] {
val in = Inlet[ByteString]("BodyValidator.in")
val out = Outlet[ByteString]("BodyValidator.out")
override def shape: FlowShape[ByteString, ByteString] = FlowShape.of(in, out)
override def createLogicAndMaterializedValue(inheritedAttributes: Attributes): (GraphStageLogic, Future[Boolean]) = {
val status = Promise[Boolean]()
val bodyBuffer = new ByteStringBuilder()
val logic = new GraphStageLogic(shape) {
setHandler(out, new OutHandler {
override def onPull(): Unit = pull(in)
})
setHandler(in, new InHandler {
def onPush(): Unit = {
val chunk = grab(in)
bodyBuffer.append(chunk)
push(out, chunk)
}
override def onUpstreamFinish(): Unit = {
val fullBody = bodyBuffer.result()
status.success(expected.map(ByteString(_)).contains(fullBody))
completeStage()
}
override def onUpstreamFailure(e: Throwable): Unit = {
status.failure(e)
failStage(e)
}
})
}
(logic, status.future)
}
}
You first want to create an Inlet and an Outlet to set up the inputs and outputs for your graph:
val in = Inlet[ByteString]("BodyValidator.in")
val out = Outlet[ByteString]("BodyValidator.out")
Then you use these to define shape.
def shape: FlowShape[ByteString, ByteString] = FlowShape.of(in, out)
Inside createLogicAndMaterializedValue you need to initialize the value you intend to materialize. Here I've used a Promise that can be resolved once I have the full data from the stream. I also create a ByteStringBuilder to track the data between iterations.
val status = Promise[Boolean]()
val bodyBuffer = new ByteStringBuilder()
Then I create a GraphStageLogic to set up what this graph does at each point of processing. Two handlers are set. One is an InHandler for dealing with data as it comes from the upstream source. The other is an OutHandler for dealing with data to send downstream. There's nothing really interesting in the OutHandler, so I'll ignore it here except to say that it is necessary boilerplate to avoid an IllegalStateException. Three methods are overridden in the InHandler: onPush, onUpstreamFinish, and onUpstreamFailure. onPush is called when new data is ready from upstream. In this method I simply grab the next chunk of data, append it to bodyBuffer, and push the data downstream.
def onPush(): Unit = {
val chunk = grab(in)
bodyBuffer.append(chunk)
push(out, chunk)
}
onUpstreamFinish is called when the upstream finishes (surprise). This is where the business logic of comparing the body with the header happens.
override def onUpstreamFinish(): Unit = {
val fullBody = bodyBuffer.result()
status.success(expected.map(ByteString(_)).contains(fullBody))
completeStage()
}
onUpstreamFailure is implemented so that when something goes wrong, I can mark the materialized future as failed as well.
override def onUpstreamFailure(e: Throwable): Unit = {
status.failure(e)
failStage(e)
}
Then I just return the GraphStageLogic I've created and status.future as a tuple.
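To round this out, here is a minimal sketch of how the finished parser can be wired into a Play 2.5 controller. EchoController and the /post route are names made up for illustration; parser is the function defined above and is assumed to be in scope:
import play.api.mvc._
class EchoController extends Controller {
  // parse.tolerantText makes A = String, so the verified body is echoed back as-is
  def post(): Action[String] = Action(parser(parse.tolerantText)) { request =>
    Ok(request.body)
  }
}
With a route such as POST /post controllers.EchoController.post(), the two curl commands from the question should return 200 with "hello" and 400 with an empty body respectively.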

Related

Akka Streams: handling Future inside GraphStage Source

I am trying to build an Akka Streams Source which receives data by making Future-based API calls (the API is a scrolling one, which fetches results incrementally). To build such a Source, I am using GraphStage.
I have modified the NumbersSource example, which simply pushes one Int at a time. The only change I made was to replace that Int with getvalue(): Future[Int] (to simulate the API call):
class NumbersSource extends GraphStage[SourceShape[Int]] {
val out: Outlet[Int] = Outlet("NumbersSource")
override val shape: SourceShape[Int] = SourceShape(out)
// simple example of future API call
private def getvalue(): Future[Int] = Future.successful(Random.nextInt())
override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
new GraphStageLogic(shape) {
setHandler(out, new OutHandler {
override def onPull(): Unit = {
// Future API call
getvalue().onComplete{
case Success(value) =>
println("Pushing value received..") // this is currently being printed just once
push(out, value)
case Failure(exception) =>
}
}
})
}
}
// Using the Source and Running the stream
val sourceGraph: Graph[SourceShape[Int], NotUsed] = new NumbersSource
val mySource: Source[Int, NotUsed] = Source.fromGraph(sourceGraph)
val done: Future[Done] = mySource.runForeach{
num => println(s"Received: $num") // This is currently not printed
}
done.onComplete(_ => system.terminate())
The above code doesn't work. The println statement inside setHandler is executed just once and nothing is pushed downstream.
How should such Future calls be handled? Thanks.
UPDATE
I tried to use getAsyncCallback by making the changes as follows:
class NumbersSource(futureNum: Future[Int]) extends GraphStage[SourceShape[Int]] {
val out: Outlet[Int] = Outlet("NumbersSource")
override val shape: SourceShape[Int] = SourceShape(out)
override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
new GraphStageLogic(shape) {
override def preStart(): Unit = {
val callback = getAsyncCallback[Int] { (_) =>
completeStage()
}
futureNum.foreach(callback.invoke)
}
setHandler(out, new OutHandler {
override def onPull(): Unit = {
val value: Int = ??? // How to get this value ??
push(out, value)
}
})
}
}
// Using the Source and Running the Stream
def random(): Future[Int] = Future.successful(Random.nextInt())
val sourceGraph: Graph[SourceShape[Int], NotUsed] = new NumbersSource(random())
val mySource: Source[Int, NotUsed] = Source.fromGraph(sourceGraph)
val done: Future[Done] = mySource.runForeach{
num => println(s"Received: $num") // This is currently not printed
}
done.onComplete(_ => system.terminate())
But now I am stuck on how to grab the value computed from the Future. In the case of a Flow GraphStage, I could use:
val value = grab(in) // where in is Inlet of a Flow
But what I have is a Source GraphStage, so I have no idea how to grab the Int value of the computed Future above.
I'm not sure if I understand correctly, but if you are trying to implement an infinite source out of elements computed in Futures then there is really no need to do it with your own GraphStage. You can do it simply as below:
Source.repeat(())
.mapAsync(parallelism) { _ => Future.successful(Random.nextInt()) }
The Source.repeat(()) is simply an infinite source of some arbitrary values (of type Unit in this case, but you can change () to whatever you want since it's ignored here). mapAsync then is used to integrate the asynchronous computations into the flow.
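For completeness, here is a self-contained sketch of that approach. The object name and the take(5) are mine, added only so the demo terminates, and it assumes Akka 2.6+, where an implicit ActorSystem provides the materializer:
import akka.actor.ActorSystem
import akka.stream.scaladsl.Source
import scala.concurrent.Future
import scala.util.Random
object RepeatMapAsyncExample extends App {
  implicit val system: ActorSystem = ActorSystem("example")
  import system.dispatcher
  Source.repeat(())                                                      // infinite source of arbitrary elements
    .mapAsync(parallelism = 1)(_ => Future.successful(Random.nextInt())) // the asynchronous "API call"
    .take(5)                                                             // limit so the demo terminates
    .runForeach(n => println(s"Received: $n"))
    .onComplete(_ => system.terminate())
}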
I would join the other answer in trying to avoid creating your own GraphStage. After some experimentation, this is what seems to work for me:
type Data = Int
trait DbResponse {
// this is just a callback for a compact solution
def nextPage: Option[() => Future[DbResponse]]
def data: List[Data]
}
def createSource(dbCall: DbResponse): Source[Data, NotUsed] = {
val thisPageSource = Source.apply(dbCall.data)
val nextPageSource = dbCall.nextPage match {
case Some(dbCallBack) => Source.lazySource(() => Source.future(dbCallBack()).flatMapConcat(createSource))
case None => Source.empty
}
thisPageSource.concat(nextPageSource)
}
val dataSource: Source[Data, NotUsed] = Source
.future(???: Future[DbResponse]) // the first db call
.flatMapConcat(createSource)
I tried it out and it works almost perfectly. I couldn't find out why, but the second page is requested immediately; the rest works as expected (with backpressure and so on).
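To make the paging concrete, here is a small hypothetical in-memory DbResponse that exercises createSource; fakePage and its page contents are made up purely for illustration (imports as in the snippets above):
// Pages of three numbers each; `remaining` counts how many pages are still left.
def fakePage(from: Int, remaining: Int): DbResponse = new DbResponse {
  val data: List[Data] = List(from, from + 1, from + 2)
  val nextPage: Option[() => Future[DbResponse]] =
    if (remaining > 0) Some(() => Future.successful(fakePage(from + 3, remaining - 1)))
    else None
}
val pagedSource: Source[Data, NotUsed] =
  Source
    .future(Future.successful(fakePage(0, remaining = 2)))
    .flatMapConcat(createSource)
// pagedSource.runForeach(println) should emit 0 through 8, one page at a time.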

MVar tryPut returns true and isEmpty also returns true

I wrote a simple callback (handler) function which I pass to an async API, and I want to wait for the result:
object Handlers {
val logger: Logger = Logger("Handlers")
implicit val cs: ContextShift[IO] =
IO.contextShift(ExecutionContext.Implicits.global)
class DefaultHandler[A] {
val response: IO[MVar[IO, A]] = MVar.empty[IO, A]
def onResult(obj: Any): Unit = {
obj match {
case obj: A =>
println(response.flatMap(_.tryPut(obj)).unsafeRunSync())
println(response.flatMap(_.isEmpty).unsafeRunSync())
case _ => logger.error("Wrong expected type")
}
}
def getResponse: A = {
response.flatMap(_.take).unsafeRunSync()
}
}
}
But for some reason both tryPut and isEmpty return true (when I manually call the onResult method), so when I call getResponse it sleeps forever.
This is my test:
class HandlersTest extends FunSuite {
test("DefaultHandler.test") {
val handler = new DefaultHandler[Int]
handler.onResult(3)
val response = handler.getResponse
assert(response != 0)
}
}
Can somebody explain why tryPut returns true but nothing is actually put? And what is the right way to use MVar/channels in Scala?
IO[X] means that you have a recipe to create some X. So in your example, you are putting into one MVar and then taking from another.
Here is how I would do it.
object Handlers {
trait DefaultHandler[A] {
def onResult(obj: Any): IO[Unit]
def getResponse: IO[A]
}
object DefaultHandler {
def apply[A : ClassTag]: IO[DefaultHandler[A]] =
MVar.empty[IO, A].map { response =>
new DefaultHandler[A] {
override def onResult(obj: Any): IO[Unit] = obj match {
case obj: A =>
for {
r1 <- response.tryPut(obj)
_ <- IO(println(r1))
r2 <- response.isEmpty
_ <- IO(println(r2))
} yield ()
case _ =>
IO(logger.error("Wrong expected type"))
}
override def getResponse: IO[A] =
response.take
}
}
}
}
The "unsafe" is sort of a hint, but every time you call unsafeRunSync, you should basically think of it as an entire new universe. Before you make the call, you can only describe instructions for what will happen, you can't actually change anything. During the call is when all the changes occur. Once the call completes, that universe is destroyed, and you can read the result but no longer change anything. What happens in one unsafeRunSync universe doesn't affect another.
You need to call it exactly once in your test code. That means your test code needs to look something like:
val test = for {
handler <- Handlers.DefaultHandler[Int]
_ <- handler.onResult(3)
response <- handler.getResponse
} yield response
assert(test.unsafeRunSync() == 3)
Note this doesn't really buy you much over just using the MVar directly. I think you're trying to mix side effects inside IO and outside it, but that doesn't work. All the side effects need to be inside.
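And to illustrate "using the MVar directly": a minimal sketch under the same cats-effect 2.x assumptions (imports and implicit ContextShift as in the sketch above), where the whole interaction is one IO program and unsafeRunSync is called exactly once at the end:
val program: IO[Int] = for {
  mvar <- MVar.empty[IO, Int] // created inside the program
  _    <- mvar.put(3)         // put and take see the same instance
  out  <- mvar.take
} yield out
assert(program.unsafeRunSync() == 3)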

How to deal with future inside a customized akka Sink?

I am trying to implement a customized Akka Sink, but I could not find a way to handle a Future inside it.
class EventSink(...) extends GraphStage[SinkShape[EventEnvelope2]] {
val in: Inlet[EventEnvelope2] = Inlet("EventSink")
override val shape: SinkShape[EventEnvelope2] = SinkShape(in)
override def createLogic(inheritedAttributes: Attributes): GraphStageLogic = {
new GraphStageLogic(shape) {
// This requests one element at the Sink startup.
override def preStart(): Unit = pull(in)
setHandler(in, new InHandler {
override def onPush(): Unit = {
val future = handle(grab(in))
Await.ready(future, Duration.Inf)
/*
future.onComplete {
case Success(_) =>
logger.info("pulling next events")
pull(in)
case Failure(failure) =>
logger.error(failure.getMessage, failure)
throw failure
}*/
pull(in)
}
})
}
}
private def handle(envelope: EventEnvelope2): Future[Unit] = {
val EventEnvelope2(query.Sequence(offset), _/*persistenceId*/, _/*sequenceNr*/, event) = envelope
...
db.run(statements.transactionally)
}
}
I have to go with a blocking Future at the moment, which does not look good. The non-blocking version I commented out only works for the first event. Could anyone please help?
Update: Thanks @ViktorKlang. It seems to be working now.
override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
{
new GraphStageLogic(shape) {
val callback = getAsyncCallback[Try[Unit]] {
case Success(_) =>
//completeStage()
pull(in)
case Failure(error) =>
failStage(error)
}
// This requests one element at the Sink startup.
override def preStart(): Unit = {
pull(in)
}
setHandler(in, new InHandler {
override def onPush(): Unit = {
val future = handle(grab(in))
future.onComplete { result =>
callback.invoke(result)
}
}
})
}
}
I am trying to implement a relational DB event sink connecting to ReadJournal.eventsByTag. So this is a continuous stream, which will never end unless there is an error - this is what I want. Is my approach correct?
Two more questions:
Will the GraphStage never end unless I manually invoke completeStage or failStage?
Is it correct (or normal) to declare the callback outside the preStart method? And am I right to invoke pull(in) in preStart in this case?
Thanks,
Cheng
Avoid Custom Stages
In general, you should try to exhaust all possibilities with the given methods of the library's Source, Flow, and Sink. Custom stages are almost never necessary and make your code difficult to maintain.
Writing Your "Custom" Stage Using Standard Methods
Based on the details of your question's example code I don't see any reason why you would use a custom Sink to begin with.
Given your handle method, you could slightly modify it to do the logging that you specified in the question:
val loggedHandle : (EventEnvelope2) => Future[Unit] =
envelope => handle(envelope) transform {
case Success(_) => {
logger.info("pulling next events")
Success(())
}
case Failure(failure) => {
logger.error(failure.getMessage, failure)
Failure(failure)
}
}
Then just use Sink.foreachParallel to handle the envelopes:
val createEventEnvelope2Sink : (Int) => Sink[EventEnvelope2, Future[Done]] =
(parallelism) =>
Sink.foreachParallel[EventEnvelope2](parallelism)(handle _)
Now, even if you want each EventEnvelope2 to be sent to the db in order you can just use 1 for parallelism:
val inOrderDBInsertSink : Sink[EventEnvelope2, Future[Done]] =
createEventEnvelope2Sink(1)
Also, if the database throws an exception you can still get a hold of it when the foreachParallel completes:
val someEnvelopeSource : Source[EventEnvelope2, _] = ???
someEnvelopeSource
.runWith(createEventEnvelope2Sink(1))
.andThen {
case Failure(throwable) => { /*deal with db exception*/ }
case Success(_) => { /*all inserts succeeded*/ }
}

Closing an Akka stream from inside a GraphStage (Akka 2.4.2)

In Akka Stream 2.4.2, PushStage has been deprecated. For Streams 2.0.3 I was using the solution from this answer:
How does one close an Akka stream?
which was:
import akka.stream.stage._
val closeStage = new PushStage[Tpe, Tpe] {
override def onPush(elem: Tpe, ctx: Context[Tpe]) = elem match {
case elem if shouldCloseStream ⇒
// println("stream closed")
ctx.finish()
case elem ⇒
ctx.push(elem)
}
}
How would I close a stream in 2.4.2 immediately, from inside a GraphStage / onPush() ?
Use something like this:
val closeStage = new GraphStage[FlowShape[Tpe, Tpe]] {
val in = Inlet[Tpe]("closeStage.in")
val out = Outlet[Tpe]("closeStage.out")
override val shape = FlowShape.of(in, out)
override def createLogic(inheritedAttributes: Attributes) = new GraphStageLogic(shape) {
setHandler(in, new InHandler {
override def onPush() = grab(in) match {
case elem if shouldCloseStream ⇒
// println("stream closed")
completeStage()
case msg ⇒
push(out, msg)
}
})
setHandler(out, new OutHandler {
override def onPull() = pull(in)
})
}
}
It is more verbose, but on the one hand you can define this logic in a reusable way, and on the other hand you no longer have to worry about differences between the stream elements, because the GraphStage can be handled in the same way a flow would be handled:
val flow: Flow[Tpe, Tpe, NotUsed] = ???
val newFlow = flow.via(closeStage)
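For example, with Tpe fixed to Int and shouldCloseStream chosen as "the element is zero" (both choices are mine, purely for illustration), the stage instantiated that way plugs into a pipeline like any other flow:
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
import akka.stream.scaladsl.{Sink, Source}
implicit val system: ActorSystem = ActorSystem("close-example")
implicit val mat: ActorMaterializer = ActorMaterializer()
// closeStage here is the GraphStage above built for Int, with shouldCloseStream meaning elem == 0
Source(List(1, 2, 0, 3))
  .via(closeStage)
  .runWith(Sink.foreach(println)) // prints 1 and 2, then the stream completes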
Posting for other people's reference. sschaef's answer is correct procedurally, but the connection was kept open for a minute and eventually would time out and throw a "no activity" exception, closing the connection.
In reading the docs further, I noticed that the connection was closed when all upstreams flows completed. In my case, I had more than one upstream.
For my particular use case, the fix was to add eagerComplete=true to close stream as soon as any (rather than all) upstream completes. Something like:
... = builder.add(Merge[MyObj](3,eagerComplete = true))
Hope this helps someone.
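For reference, here is roughly where such a Merge sits in a graph. This is only a sketch: the three sources are stand-ins for the real upstreams, and with eagerComplete = true the merged stream (and hence the downstream connection) completes as soon as the first of them finishes:
import akka.stream.ClosedShape
import akka.stream.scaladsl.{GraphDSL, Merge, RunnableGraph, Sink, Source}
val graph = RunnableGraph.fromGraph(GraphDSL.create() { implicit builder =>
  import GraphDSL.Implicits._
  // completes downstream as soon as ANY of the three inputs completes
  val merge = builder.add(Merge[Int](3, eagerComplete = true))
  Source(1 to 3)    ~> merge.in(0) // finite: finishing this closes the whole merge
  Source.repeat(0)  ~> merge.in(1) // would otherwise keep the stream open forever
  Source.single(42) ~> merge.in(2)
  merge.out ~> Sink.foreach(println)
  ClosedShape
})
// running this graph terminates promptly instead of idling until a timeout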

Asynchronous Iterable over remote data

There is some data that I have pulled from a remote API, for which I use a Future-style interface. The data is structured as a linked-list. A relevant example data container is shown below.
case class Data(information: Int) {
def hasNext: Boolean = ??? // Implemented
def next: Future[Data] = ??? // Implemented
}
Now I'm interested in adding some functionality to the data class, such as map, foreach, reduce, etc. To do so I want to implement some form of IterableLike such that it inherits these methods.
Given below is the trait that Data may extend so that it gets this property.
trait AsyncIterable[+T]
extends IterableLike[Future[T], AsyncIterable[T]]
{
def hasNext : Boolean
def next : Future[T]
// How to implement?
override def iterator: Iterator[Future[T]] = ???
override protected[this] def newBuilder: mutable.Builder[Future[T], AsyncIterable[T]] = ???
override def seq: TraversableOnce[Future[T]] = ???
}
It should be a non-blocking implementation, which when acted on, starts requesting the next data from the remote data source.
It is then possible to do cool stuff such as
case class Data(information: Int) extends AsyncIterable[Data]
val data = Data(1) // And more, of course
// Asynchronously print all the information.
data.foreach(data => println(data.information))
It is also acceptable for the interface to be different. But the result should in some way represent asynchronous iteration over the collection. Preferably in a way that is familiar to developers, as it will be part of an (open source) library.
In production I would use one of the following:
Akka Streams (a quick sketch of this option is shown right after this list)
Reactive Extensions
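For the Akka Streams option, one possibility (my suggestion, not something from the original answer) is to turn the Future-based linked list into a Source with Source.unfoldAsync; asSource is a hypothetical helper name and Data is the class from the question:
import akka.NotUsed
import akka.stream.scaladsl.Source
import scala.concurrent.{ExecutionContext, Future}
def asSource(first: Data)(implicit ec: ExecutionContext): Source[Data, NotUsed] =
  Source.unfoldAsync[Option[Data], Data](Some(first)) {
    case Some(d) =>
      // emit d, then continue from d.next if there is a next element
      val nextState: Future[Option[Data]] =
        if (d.hasNext) d.next.map(Some(_)) else Future.successful(None)
      nextState.map(state => Some((state, d)))
    case None => Future.successful(None) // end of the linked list
  }
// asSource(Data(1)).runForeach(d => println(d.information)) then prints the whole chain without blocking.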
For private tests I would implement something similar to the following.
(Explanations are below.)
I have modified your Data a little bit:
abstract class AsyncIterator[T] extends Iterator[Future[T]] {
def hasNext: Boolean
def next(): Future[T]
}
For it we can implement this Iterable:
class AsyncIterable[T](sourceIterator: AsyncIterator[T])
extends IterableLike[Future[T], AsyncIterable[T]]
{
private def stream(): Stream[Future[T]] =
if(sourceIterator.hasNext) {sourceIterator.next #:: stream()} else {Stream.empty}
val asStream = stream()
override def iterator = asStream.iterator
override def seq = asStream.seq
override protected[this] def newBuilder = throw new UnsupportedOperationException()
}
And you can see it in action using the following code:
object Example extends App {
val source = "Hello World!";
val iterator1 = new DelayedIterator[Char](100L, source.toCharArray)
new AsyncIterable(iterator1).foreach(_.foreach(print)) //prints 1 char per 100 ms
pause(2000L)
val iterator2 = new DelayedIterator[String](100L, source.toCharArray.map(_.toString))
new AsyncIterable(iterator2).reduceLeft((fl: Future[String], fr) =>
for(l <- fl; r <- fr) yield {println(s"$l+$r"); l + r}) //prints 1 line per 100 ms
pause(2000L)
def pause(duration: Long) = {println("->"); Thread.sleep(duration); println("\n<-")}
}
class DelayedIterator[T](delay: Long, data: Seq[T]) extends AsyncIterator[T] {
private val dataIterator = data.iterator
private var nextTime = System.currentTimeMillis() + delay
override def hasNext = dataIterator.hasNext
override def next = {
val thisTime = math.max(System.currentTimeMillis(), nextTime)
val thisValue = dataIterator.next()
nextTime = thisTime + delay
Future {
val now = System.currentTimeMillis()
if(thisTime > now) Thread.sleep(thisTime - now) //Your implementation will be better
thisValue
}
}
}
Explanation
AsyncIterable uses Stream because it's calculated lazily and it's simple.
Pros:
simplicity
multiple calls to the iterator and seq methods return the same iterable with all items.
Cons:
could lead to memory overflow because the stream keeps all previously obtained values.
the first value is fetched eagerly during creation of AsyncIterable
DelayedIterator is a very simplistic implementation of AsyncIterator; don't blame me for the quick and dirty code here.
It still seems strange to me to see a synchronous hasNext and an asynchronous next().
Using Twitter Spool I've implemented a working example.
To implement spool I modified the example in the documentation.
import com.twitter.concurrent.Spool
import com.twitter.util.{Await, Return, Promise}
import scala.concurrent.{ExecutionContext, Future}
trait AsyncIterable[+T <: AsyncIterable[T]] { self : T =>
def hasNext : Boolean
def next : Future[T]
def spool(implicit ec: ExecutionContext) : Spool[T] = {
def fill(currentPage: Future[T], rest: Promise[Spool[T]]) {
currentPage foreach { cPage =>
if(cPage.hasNext) {
val nextSpool = new Promise[Spool[T]]
rest() = Return(cPage *:: nextSpool)
fill(cPage.next, nextSpool)
} else {
val emptySpool = new Promise[Spool[T]]
emptySpool() = Return(Spool.empty[T])
rest() = Return(cPage *:: emptySpool)
}
}
}
val rest = new Promise[Spool[T]]
if(hasNext) {
fill(next, rest)
} else {
rest() = Return(Spool.empty[T])
}
self *:: rest
}
}
Data is the same as before, and now we can use it.
// Cool stuff
implicit val ec = scala.concurrent.ExecutionContext.global
val data = Data(1) // And others
// Print all the information asynchronously
val fut = data.spool.foreach(data => println(data.information))
Await.ready(fut)
It will throw an exception on the second element, because the implementation of next was not provided.
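For completeness, here is a hypothetical finite implementation of Data (hasNext and next are made up purely for illustration) with which the snippet above should run to completion instead of throwing:
import scala.concurrent.Future
// A three-element chain: Data(1) -> Data(2) -> Data(3)
case class Data(information: Int) extends AsyncIterable[Data] {
  def hasNext: Boolean = information < 3
  def next: Future[Data] = Future.successful(Data(information + 1))
}
// data.spool.foreach(...) from above should then print 1, 2 and 3.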