Akka Stream use HttpResponse in Flow - scala

I would like to use a simple Flow to fetch some extra data from an HTTP service and enhance my data object with the results. The following illustrates the idea:
val httpClient = Http().superPool[User]()
val cityRequest = Flow[User].map { user=>
(HttpRequest(uri=Uri(config.getString("cityRequestEndpoint"))), user)
}
val cityResponse = Flow[(Try[HttpResponse], User)].map {
case (Failure(ex), user) => user
case (Success(resp), user) => {
// << What to do here to get the value >> //
val responseData = processResponseSomehowToGetAValue?
val enhancedUser = new EnhancedUser(user.data, responseData)
enhancedUser
}
}
val processEnhancedUser = Flow[EnhancedUser].map {
// e.g.: Asynchronously save user to a database
}
val useEnhancementGraph = userSource
.via(cityRequest)
.via(httpClient)
.via(cityResponse)
.via(processEnhancedUser)
.to(Sink.foreach(println))
I have trouble understanding the mechanics of, and the difference between,
the streaming nature and materialization / Futures inside the Flow.
The following resources did not explain it to me:
http://doc.akka.io/docs/akka-http/current/scala/http/implications-of-streaming-http-entity.html
akka HttpResponse read body as String scala
How do I get the value from the response into the new user object,
so I can handle that object in the following steps?
Thanks for the help.
Update:
I evaluated the code against a remote akka-http server that answers requests after anywhere between zero and 10 seconds, using the code below for parsing.
As a result, some "EnhancedUser" instances showed up at the end, but the ones whose responses took too long were missing their values.
At some point I added .async to the end of the cityResponse parser; the output then took longer but was correct.
What is the reason for that behaviour, and how does it fit together with the accepted answer?
val cityResponse = Flow[(Try[HttpResponse], User)].map {
case (Failure(ex), member) => member
case (Success(response), member) => {
Unmarshal(response.entity).to[String] onComplete {
case Success(s) => member.city = Some(s)
case Failure(ex) => member.city = None
}
}
member
}.async // <<-- This changed the behavior to be correct, why?

There are two different strategies you could use depending on the nature of the entity you are getting from "cityRequestEndpoint":
Stream Based
The typical way to handle this situation is to always assume that the entity coming from the source endpoint can contain N pieces of data, where N is not known in advance. This is usually the pattern to follow because it is the most generic and therefore "safest" in the real world.
The first step is to convert the HttpResponse coming from the endpoint into a Source of data:
// takes a single tuple argument so it can be passed directly to flatMapConcat
val convertResponseToByteStrSource : ((Try[HttpResponse], User)) => Source[(Option[ByteString], User), _] = {
case (Failure(_), user) => Source.single(None -> user)
case (Success(r), user) => r.entity.dataBytes.map(byteStr => Some(byteStr) -> user)
}
The code above makes no assumption about the size of N: r.entity.dataBytes could be a Source of 0 ByteString values or a potentially infinite number of values. Our logic doesn't care!
Now we need to combine the data coming from the Source. This is a good use case for Flow.flatMapConcat which takes a Flow of Sources and converts it into a Flow of values (similar to flatMap for Iterables):
val cityByteStrFlow : Flow[(Try[HttpResponse], User), (Option[ByteString], User), _] =
Flow[(Try[HttpResponse], User)] flatMapConcat convertResponseToByteStrSource
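The flatMap analogy can be made concrete with plain collections. The sketch below is only an illustration, not Akka code: a "response" is modelled as a hypothetical Try[List[String]] of body chunks, and User is a made-up case class. It shows that every chunk becomes its own (chunk, user) element, while a failed response yields a single (None, user) element:

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical stand-ins: a "response" is just the list of its body chunks.
final case class User(name: String)

// Same shape as convertResponseToByteStrSource, but on Lists instead of Sources.
def toChunks(in: (Try[List[String]], User)): List[(Option[String], User)] =
  in match {
    case (Failure(_), user)      => List(None -> user)
    case (Success(chunks), user) => chunks.map(c => Some(c) -> user)
  }

val responses: List[(Try[List[String]], User)] = List(
  Success(List("Ber", "lin")) -> User("a"),
  Failure(new RuntimeException("timeout")) -> User("b")
)

// flatMap flattens the per-response lists into one list, in order.
val flattened: List[(Option[String], User)] = responses.flatMap(toChunks)
```

Note that a body split into two chunks produces two elements for the same user, which is exactly the behaviour the stream-based variant has to be prepared for.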
All that is left to do is convert the tuples of (Option[ByteString], User) into EnhancedUser. Note: I am assuming below that User is a subclass of EnhancedUser, which is inferred from the question logic:
val convertByteStringToUser : ((Option[ByteString], User)) => EnhancedUser = {
case (byteStr, user) =>
byteStr
.map(s => EnhancedUser(user.data, s))
.getOrElse(user)
}
val cityUserFlow : Flow[(Option[ByteString], User), EnhancedUser, _] =
Flow[(Option[ByteString], User)] map convertByteStringToUser
These components can now be combined:
val useEnhancementGraph =
userSource
.via(cityRequest)
.via(httpClient)
.via(cityByteStrFlow)
.via(cityUserFlow)
.via(processEnhancedUser)
.to(Sink foreach println)
Future Based
We can use Futures to solve the problem, similar to the Stack Overflow question you referenced in your original question. I don't recommend this approach for 2 reasons:
1. It assumes only 1 ByteString is coming from the endpoint. If the endpoint sends multiple values as ByteStrings then they all get concatenated together and you could get an error when creating EnhancedUser.
2. It places an artificial timeout on the materialization of the ByteString data, similar to Async.await (which should almost always be avoided).
To use the Future based approach the only big change to your original code is to use Flow.mapAsync instead of Flow.map to handle the fact that a Future is being created in the function:
val parallelism = 10
val timeout : FiniteDuration = ??? //you need to specify the timeout limit
// toStrict needs an implicit Materializer, and map an implicit ExecutionContext, in scope
val convertResponseToFutureByteStr : ((Try[HttpResponse], User)) => Future[EnhancedUser] = {
case (Failure(ex), user) =>
Future successful user
case (Success(resp), user) =>
resp
.entity
.toStrict(timeout)
.map(strict => new EnhancedUser(user.data, strict.data)) // strict.data is the ByteString
}
val cityResponse : Flow[(Try[HttpResponse], User), EnhancedUser, _] =
Flow[(Try[HttpResponse], User)].mapAsync(parallelism)(convertResponseToFutureByteStr)
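This also suggests a likely reading of the update in the question: a map stage that calls onComplete and then returns member emits the element immediately, while the callback mutates it at some later time, so slow responses may not have arrived when the element is printed. Returning the Future itself, as mapAsync expects, means the element is only available once the value is there. The sketch below shows that idea with plain scala.concurrent Futures; Member and the Promise standing in for a slow HTTP entity are hypothetical:

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

final case class Member(name: String, city: Option[String] = None)

// Build the enhanced value inside the Future instead of mutating from a callback.
def enhance(member: Member, body: Future[String]): Future[Member] =
  body
    .map(city => member.copy(city = Some(city)))
    .recover { case _ => member } // keep the plain member if the lookup fails

val slowBody = Promise[String]() // stands in for a slow HTTP entity
val pending  = enhance(Member("ann"), slowBody.future)

// Nothing can be handed downstream yet: the result is not completed.
val notReadyYet = !pending.isCompleted

slowBody.success("Berlin")
// Blocking here only to show the result; inside a stream, mapAsync does the waiting.
val enhanced = Await.result(pending, 2.seconds)
```

Using an immutable copy instead of assigning to member.city also avoids sharing mutable state between the stream and the callback thread.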

Related

How to retrieve value from the output of a scala Future?

I am trying to query a table, store values of the query in a Scala Map & return the same map.
To do that, I came up with the following code:
def getBounds(incLogIdMap:scala.collection.mutable.Map[String, String]): Future[scala.collection.mutable.Map[String, String]] = Future {
var boundsMap = scala.collection.mutable.Map[String, String]()
incLogIdMap.keys.foreach(table => if(!incLogIdMap(table).contains("INVALID")) {
val minMax = s"select max(cast(to_char(update_tms,'yyyyddmmhhmmss') as bigint)) maxTms, min(cast(to_char(update_tms,'yyyyddmmhhmmss') as bigint)) minTms from queue.${table} where key_ids in (${incLogIdMap(table)})"
val boundsDF = spark.read.format("jdbc").option("url", commonParams.getGpConUrl()).option("dbtable", s"(${minMax}) as ctids")
.option("user", commonParams.getGpUserName()).option("password", commonParams.getGpPwd()).load()
val maxTms = boundsDF.select("minTms").head.getLong(0).toString + "," + boundsDF.select("maxTms").head.getLong(0).toString
boundsMap += (table -> maxTms)
}
)
boundsMap
}
In order to receive the value from the method getBounds, I used onComplete as below:
val tmsobj = new MinMaxVals(spark, commonParams)
val boundsMap = tmsobj.getBounds(incLogIds)
boundsMap.onComplete({
case Success(value) =>
case Failure(value) =>
})
I have coded in Scala before, but I am new to Futures in Scala. Could anyone let me know how I can retrieve the value returned by getBounds into val boundsMap?
You can use Await (not the best approach):
val boundsMap = Await.result(tmsobj.getBounds(incLogIds),Duration.Inf)
Or use the value only when you need it:
val boundsMap = tmsobj.getBounds(incLogIds)
boundsMap.map(value => Smth_To_Do(value))
Accessing a value from a Future directly is not recommended, as it defeats the purpose of asynchronous computation. However, there may be cases where you are dealing with legacy code or some other situation where fetching the value from the Future is the way forward. For such situations there are two approaches:
Using Await, which blocks the thread:
Await.result(getBounds, 10.seconds)
Here Await waits up to 10 seconds for the getBounds future to complete. If it completes within this time, you get the value; otherwise you get an exception. The biggest drawback of this method is that it blocks the current thread of execution.
Using the onComplete callback, as you have done:
getBounds onComplete {
case Success(someOption) => myMethod(someOption)
case Failure(t) => println("Error")
}
onComplete registers a callback function that is executed whenever the future completes. This is comparatively safer than Await.
You can refer to Accessing value returned by scala futures for further details.
I hope that this answers your question.
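The two approaches can be compared side by side in a small sketch. The getBounds below is a hypothetical stand-in that needs no Spark or database and returns a fixed "min,max" string per table; only the way the Future is consumed matters here:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical stand-in for getBounds: a fixed "min,max" string per table.
def getBounds(tables: List[String]): Future[Map[String, String]] =
  Future(tables.map(t => t -> "20190101000000,20190131235959").toMap)

// Non-blocking: keep transforming the value inside the Future.
val tableCount: Future[Int] = getBounds(List("orders", "items")).map(_.size)

// Blocking: pull the value out with a finite timeout, ideally only at the edge of the program.
val bounds: Map[String, String] = Await.result(getBounds(List("orders")), 10.seconds)
```

With map, the rest of the computation stays asynchronous and composes with other Futures; Await.result ties up the calling thread until the value arrives or the timeout expires.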

Chaining together operations on an Option to a Future, then back to an Option?

I'm writing an authentication client that takes an Option[Credentials] as a parameter. This Credentials object has a .token method on it which I will then use to construct an HTTP request to post to an endpoint. This returns a Future[HttpResponse], which I then need to validate, unmarshal, and then convert back to my return type, which is an Option[String].
My first thought was to use a for comprehension like this:
val resp = for {
c <- creds
req <- buildRequest(c.token)
resp <- Http().singleRequest(req)
} yield resp
but then I found out that monads cannot be composed like that. My next thought is to do something like this:
val respFut = Http().singleRequest(buildRequest(token))
respFut.onComplete {
case Success(resp) => Some("john.doe")//do stuff
case Failure(_) => None
}
Unfortunately onComplete returns Unit, and map leaves me with a Future[Option[String]], and the only way I currently know to strip off the future wrapper is using the pipeTo methods in the akka framework. How can I convert this back to just an Option[String]?
Once you've got a Future[T], it's usually good practice to not try to unbox it until you absolutely have to. Can you change your method to return a Future[Option[String]]? How far up the call stack can you deal with futures? Ideally it's all the way.
Something like this will give you a Future[Option[String]] as a result:
val futureResult = creds match {
case Some(c) => {
val req = buildRequest(c.token)
val futureResponse = Http().singleRequest(req)
futureResponse.map(res => Option(convertResponseToString(res)))
}
case None => Future.successful(None)
}
If you really need to block and wait on the result, you can do Await.result as described here.
And if you want to do it in a more monadic style (in a for-comprehension, like you tried), cats has an OptionT type that will help with that, and I think scalaz does as well. But whether you want to get into either of those libraries is up to you.
It's easy to "upgrade" an Option to a Future[Option[...]], so use Future as your main monad. And deal with the simpler case first:
val f: Future[Option[String]] =
// no credential? just wrap a `None` in a successful future
credsOpt.fold(Future.successful(Option.empty[String])) {creds =>
Http()
.singleRequest(buildRequest(creds.token))
.map(resp => Option(convertResponseToString(resp))) // assuming convertResponseToString returns a String
.recover {case _ => Option.empty[String]}
}
The only way to turn that future into Option[String] is to wait for it with Await.result(...)... but it's better if that future can be passed along to the next caller (no blocking).
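A self-contained sketch of this "upgrade the Option to a Future[Option]" idea follows. Credentials, fetchUserName (standing in for Http().singleRequest plus unmarshalling) and the "good" token are all hypothetical, chosen only so the example runs without Akka:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

final case class Credentials(token: String)

// Hypothetical stand-in for the HTTP call plus unmarshalling.
def fetchUserName(token: String): Future[String] =
  if (token == "good") Future.successful("john.doe")
  else Future.failed(new RuntimeException("rejected"))

def authenticate(creds: Option[Credentials]): Future[Option[String]] =
  creds match {
    // No credentials: nothing to ask the server, so answer None right away.
    case None => Future.successful(None)
    case Some(c) =>
      fetchUserName(c.token)
        .map(name => Option(name))
        .recover { case _ => None } // a failed request becomes None
  }
```

Callers that can stay asynchronous simply map over the result; only a caller that truly needs a bare Option[String] has to block with Await.result.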
I'm not 100% certain about what all your types are, but it seems like you want a for comprehension that mixes option and futures. I've often been in that situation and I find I can just chain my for comprehensions as a way to make the code look a bit better.
val resp = for {
c <- creds
req = buildRequest(c.token)
} yield for {
resp <- Http().singleRequest(req)
} yield resp
resp becomes an Option[Future[HttpResponse]] which you can match / partial func around with None meaning the code never got to execute because it failed its conditions. This is a dumb little trick I use to make comprehensions look better and I hope it gives you a hint towards your solution.

Aliasing objects from expensive statements in Scala pattern match

I have an expensive case statement which needs to hit the database to determine a complete match. If there is a match, the result from the aforementioned call must be used to perform further operations:
def intent = {
case request @ GET(Path(Seg(database :: Nil))) if recordsFrom(database) != Nil =>
renderOutput(recordsFrom(database))
case ...
}
I would like to call recordsFrom(database) only once. In the above example, it is called twice. It seems like I should be able to apply some alias to the statement?
Lawrence, from what I'm seeing you're using Unfiltered to handle a RESTful request but you've also combined a database lookup with that response filtering. I would advise you not to do that. Instead I'd arrange things as follows:
val dbReqCommand = new DBRequestCommand(myDbConPool)
def intent ={
case req @ GET(Path(Seg(database :: Nil))) => dbReqCommand(req, database)
}
Wherein you've encapsulated the db requests in an object that you could substitute out for testing purposes (think integration tests without a DB backend.) Within the request handler you might then put in the response:
Option(recordsFrom(database)) match {
case Some(value) => OK ~> renderOutput(value)
case None => //an error response or Pass
}
That way you might have something along the lines of:
trait DBReqPlan{
def dbReqCommand: RequestCommand[String]
def intent ={
case req @ GET(Path(Seg(database :: Nil))) => dbReqCommand(req, database)
}
}
which is easier to test against and work with.
What's wrong with:
def intent = {
case request @ GET(Path(Seg(database :: Nil))) =>
val records = recordsFrom(database)
if (records.nonEmpty) {
renderOutput(records)
} else {
...
}
case ...
}
You can move the body of the first case to a different function if you want to avoid having too many nested blocks.
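Another option, not taken from the answers above, is a custom extractor: its unapply performs the expensive lookup once, matches only when records come back, and binds them for the case body. The sketch below uses a hypothetical recordsFrom that counts its calls and a plain String match instead of an Unfiltered request, so it runs on its own:

```scala
// Hypothetical stand-in for the expensive database call; counts its invocations.
var lookups = 0
def recordsFrom(database: String): List[String] = {
  lookups += 1
  if (database == "users") List("ann", "bob") else Nil
}

// Extractor: matches only when the lookup returns records, and binds them.
object NonEmptyRecords {
  def unapply(database: String): Option[List[String]] =
    Some(recordsFrom(database)).filter(_.nonEmpty)
}

def render(database: String): String = database match {
  case NonEmptyRecords(records) => records.mkString(",") // lookup ran exactly once
  case _                        => "no records"
}
```

In principle the extractor should nest inside the request pattern in place of the database variable, but that is untested here against Unfiltered; the point is only that the guard and the binding share a single call.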

How to create a play.api.libs.iteratee.Enumerator which inserts some data between the items of a given Enumerator?

I use Play framework with ReactiveMongo. Most of ReactiveMongo APIs are based on the Play Enumerator. As long as I fetch some data from MongoDB and return it "as-is" asynchronously, everything is fine. Also the transformation of the data, like converting BSON to String, using Enumerator.map is obvious.
But today I faced a problem which, at the bottom line, narrowed down to the following code. I wasted half of the day trying to create an Enumerator which would consume items from the given Enumerator and insert some items between them. It is important not to load all the items at once, as there could be many of them (the code example has only two items, "1" and "2"). But semantically it is similar to mkString on collections. I am sure it can be done very easily, but the best I could come up with was this code. Very similar code creating an Enumerator using Concurrent.broadcast serves me well for WebSockets. But here even that does not work. The HTTP response never comes back. When I look at Enumeratee, it looks like it is supposed to provide such functionality, but I could not find a way to do the trick.
P.S. I tried calling chan.eofAndEnd in Iteratee.mapDone, and chunked(enums >>> Enumerator.eof) instead of chunked(enums); neither helped. Sometimes the response comes back, but does not contain the correct data. What am I missing?
def trans(in:Enumerator[String]):Enumerator[String] = {
val (res, chan) = Concurrent.broadcast[String]
val iter = Iteratee.fold(true) { (isFirst, curr:String) =>
if (!isFirst)
chan.push("<-------->")
chan.push(curr)
false
}
in.apply(iter)
res
}
def enums:Enumerator[String] = {
val en12 = Enumerator[String]("1", "2")
trans(en12)
//en12 //if I comment the previous line and uncomment this, it prints "12" as expected
}
def enum = Action {
Ok.chunked(enums)
}
Here is my solution which I believe to be correct for this type of problem. Comments are welcome:
def fill[From](
prefix: From => Enumerator[From],
infix: (From, From) => Enumerator[From],
suffix: From => Enumerator[From]
)(implicit ec:ExecutionContext) = new Enumeratee[From, From] {
override def applyOn[A](inner: Iteratee[From, A]): Iteratee[From, Iteratee[From, A]] = {
//type of the state we will use for fold
case class State(prev:Option[From], it:Iteratee[From, A])
Iteratee.foldM(State(None, inner)) { (prevState, newItem:From) =>
val toInsert = prevState.prev match {
case None => prefix(newItem)
case Some(prevItem) => infix (prevItem, newItem)
}
for(newIt <- toInsert >>> Enumerator(newItem) |>> prevState.it)
yield State(Some(newItem), newIt)
} mapM {
case State(None, it) => //this is possible when our input was empty
Future.successful(it)
case State(Some(lastItem), it) =>
suffix(lastItem) |>> it
}
}
}
// if there are missing integers between from and to, fill that gap with 0
def fillGap(from:Int, to:Int)(implicit ec:ExecutionContext) = Enumerator enumerate List.fill(to-from-1)(0)
def fillFrom(x:Int)(input:Int)(implicit ec:ExecutionContext) = fillGap(x, input)
def fillTo(x:Int)(input:Int)(implicit ec:ExecutionContext) = fillGap(input, x)
val ints = Enumerator(10, 12, 15)
val toStr = Enumeratee.map[Int] (_.toString)
val infill = fill(
fillFrom(5),
fillGap,
fillTo(20)
)
val res = ints &> infill &> toStr // res will have 0,0,0,0,10,0,12,0,0,15,0,0,0,0
You wrote that you are working with WebSockets, so why don't you use a dedicated solution for that? What you wrote is better suited to Server-Sent Events than to WebSockets. As I understand it, you want to filter your results before sending them back to the client? If that is correct, then you need an Enumeratee instead of an Enumerator. An Enumeratee is a from-to transformation. This is a very good piece of code showing how to use Enumeratee. It may not be exactly what you need, but I found inspiration there for my project. Maybe when you analyze that code you will find the best solution.
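The "mkString without loading everything" semantics asked for in the question can be sketched on a plain Scala Iterator. This is only an analogy and not Play iteratee code; intersperse is a hypothetical helper name. Items are pulled one at a time and the separator is emitted before every item except the first:

```scala
// Lazily inserts `sep` between consecutive items of `in`, like a streaming mkString.
def intersperse[A](in: Iterator[A], sep: A): Iterator[A] =
  in.zipWithIndex.flatMap {
    case (item, 0) => Iterator(item)      // first item: no separator in front
    case (item, _) => Iterator(sep, item) // later items: separator, then item
  }

val out = intersperse(Iterator("1", "2", "3"), "<-------->").toList
```

The same from-to shape (one input item in, zero or more output items out, with a little state about what came before) is what an Enumeratee expresses for Enumerators.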

How should I handle Filter and Futures in play2 and Scala

I'm trying to learn Futures and ReactiveMongo.
In my case I have a couple of invite objects and want to filter out the ones that already exist in the db. I do not want to update or upsert the ones already in the db. Therefore I have created a filter method:
filter method:
def isAllowedToReview(invite: Invite): Future[Boolean] = {
ReviewDAO.findById(invite.recoId, invite.invitedUserId).map {
maybeReview => {
maybeReview match {
case Some(review) => false
case None => true
}
}
}
}
DAO:
def findById(rId: Long, userId: Long): Future[Option[Review]] = findOne(Json.obj("rId" -> rId, "userId" -> userId))
def findOne(query: JsObject)(implicit reader: Reads[T]): Future[Option[T]] = {
collection.find(query).one[T]
}
and then call:
val futureOptionSet: Set[Future[Option[Invite]]] = smsSet.filter(isAllowedToReview)
save the filtered set somehow...
This doesn't work, since filter expects Invite => Boolean in this case but I'm passing Invite => Future[Boolean]. How would you filter and save this?
smsSet.map(sms => isAllowedToReview(sms).map(b => sms -> b)) will have type Set[Future[(Invite, Boolean)]]. You should be able to call Future.sequence to turn it into a Future[Set[(Invite, Boolean)]]. Then you can collect the results .map(_.collect{ case (sms, true) => sms}).
So putting everything together a solution may look like this:
val futures = smsSet.map(sms => isAllowedToReview(sms).map(b => sms -> b))
val future = Future.sequence(futures)
val result = future.map(_.collect{ case (sms, true) => sms})
When you see map and sequence you may be able to refactor to:
val filteredSet = Future.traverse(smsSet){ sms =>
isAllowedToReview(sms).map(b => sms -> b)
}.map(_.collect{ case (sms, true) => sms})
Note that instead of returning the set, you may just want to save your sms there. But the way I wrote this, all will be wrapped in a Future and you can still compose with other operations.
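The traverse-then-collect pattern can be wrapped in a small generic helper. The sketch below is self-contained: Invite is a simplified case class, isAllowedToReview is a hypothetical stand-in for the DAO lookup (it pretends user 2 already has a review), and filterAsync is a made-up name:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

final case class Invite(recoId: Long, invitedUserId: Long)

// Hypothetical stand-in for the DAO lookup: user 2 already has a review.
def isAllowedToReview(invite: Invite): Future[Boolean] =
  Future.successful(invite.invitedUserId != 2L)

// Runs the asynchronous predicate on every item and keeps those that pass.
def filterAsync[A](items: Set[A])(p: A => Future[Boolean]): Future[Set[A]] =
  Future
    .traverse(items)(item => p(item).map(ok => item -> ok))
    .map(_.collect { case (item, true) => item })

val smsSet  = Set(Invite(1L, 1L), Invite(1L, 2L), Invite(1L, 3L))
// Blocking only to show the result; normally you would keep mapping on the Future.
val allowed = Await.result(filterAsync(smsSet)(isAllowedToReview), 5.seconds)
```

Because the result stays a Future[Set[Invite]], the save step can be chained with flatMap instead of blocking.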
You could try something like this:
val revsFut = Future.sequence(smsSet.map(invite => ReviewDAO.findById(invite.recoId, invite.invitedUserId)))
val toSave = for(revs <- revsFut) yield {
val flatRevs = revs.flatten
smsSet.filter{ invite =>
flatRevs.find(review => /*Add filter code here */).isDefined
}
}
What I'm doing here is first fetching the Set of reviews matching the invites by mapping over the smsSet, fetching each individually and then sequencing that into one single Future. Then, in the for-comprehension, I flatten the Set of Option[Review] and filter down the smsSet based on what's in that flatRevs Set. Since I don't know your object model, I had to leave the implementation of the flatRevs.find up to you, but it should be pretty easy at that point.