Convert into a For Comprehension in Scala

Testing this I can see that it works:
def twoHtmlFutures = Action { request =>
  val async1 = as1.index(embed = true)(request)             // Future[Result]
  val async2 = as2.index(embed = true)(request)             // Future[Result]
  val async1Html = async1.flatMap(x => Pagelet.readBody(x)) // Future[Html]
  val async2Html = async2.flatMap(x => Pagelet.readBody(x)) // Future[Html]
  val source1 = Source.fromFuture(async1Html)               // Source[Html, NotUsed]
  val source2 = Source.fromFuture(async2Html)               // Source[Html, NotUsed]
  val merged = source1.merge(source2)                       // Source[Html, NotUsed]
  Ok.chunked(merged)
}
But trying to put it into a For Comprehension is not working for me. This is what I tried:
def twoHtmlFutures2 = Action.async { request =>
  val async1 = as1.index(embed = true)(request)
  val async2 = as2.index(embed = true)(request)
  for {
    async1Res <- async1                       // from Future[Result] to Result
    async2Res <- async2                       // from Future[Result] to Result
    async1Html <- Pagelet.readBody(async1Res) // from Result to Html
    async2Html <- Pagelet.readBody(async2Res) // from Result to Html
  } yield {
    val source1 = single(async1Html)          // from Html to Source[Html, NotUsed]
    val source2 = single(async2Html)          // from Html to Source[Html, NotUsed]
    val merged = source1.merge(source2)       // Source[Html, NotUsed]
    Ok.chunked(merged)
  }
}
But this just appears on-screen all at once, rather than at different times (streamed) as the first example does. Any helpers out there to widen my eyelids? Thanks

Monads are a sequencing shape, and Future models this as causal dependence (first this future completes, then that future completes):
val x = Future(something).map(_ => somethingElse) or Future(something).flatMap(_ => Future(somethingElse))
However, there's a little trick one can do in for comprehensions:
def twoHtmlFutures = Action { request =>
  Ok.chunked(
    Source.fromFutureSource(
      for {
        _ <- Future.unit // For Scala version <= 2.11 use Future.successful(())
        async1 = as1.index(embed = true)(request)             // Future[Result]
        async2 = as2.index(embed = true)(request)             // Future[Result]
        async1Html = async1.flatMap(x => Pagelet.readBody(x)) // Future[Html]
        async2Html = async2.flatMap(x => Pagelet.readBody(x)) // Future[Html]
        source1 = Source.fromFuture(async1Html)               // Source[Html, NotUsed]
        source2 = Source.fromFuture(async2Html)               // Source[Html, NotUsed]
      } yield source1.merge(source2)                          // Source[Html, NotUsed]
    )
  )
}
I describe this technique in greater detail in this blogpost.
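Here is a minimal, self-contained sketch of why the trick works (the Futures below are made-up placeholders): the = bindings in a for comprehension are evaluated eagerly, inside a single map step, so both Futures are started before either one is awaited.
import scala.concurrent.{ExecutionContext, Future}
import ExecutionContext.Implicits.global

val parallel: Future[Int] = for {
  _ <- Future.unit   // gets us "inside" the comprehension immediately
  fa = Future { 1 }  // = bindings run eagerly, so this starts now...
  fb = Future { 2 }  // ...and so does this, in parallel with fa
  a <- fa            // only the awaiting is sequenced
  b <- fb
} yield a + b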
An alternate solution to your problem could be:
def twoHtmlFutures = Action { request =>
  Ok.chunked(
    Source.fromFuture(as1.index(embed = true)(request))
      .merge(Source.fromFuture(as2.index(embed = true)(request)))
      .mapAsyncUnordered(2)(b => Pagelet.readBody(b))
  )
}

A for comprehension, and the flatMap chain it desugars into, are used to sequence things.
In the context of Future, this means that in a for comprehension, each statement is started only once the previous one has successfully completed.
In your case, you want the two Futures to run in parallel, which is not what flatMap (or a for comprehension) does.
What your code does is the following:
do the first index call
when that's over, do the second index call
when that's over, do the first readBody
when that's over, do the second readBody
when that's over create two (synchronous) sources with the values from the two previous steps, merge them, and start returning the merged source as chunked response.
What your previous code did was:
do the first index call
when that's over, do the first readBody
in the meantime, do the same for the second index and readBody
in the meantime, create a source that will output an element when the first readBody yields a result
in the meantime, do the same for the second
merge these two sources, and start all at once to give the merged output as a chunked response.
So, in this case, you start your chunked response just after receiving the request (with nothing inside it yet, while the Futures are being resolved), whereas in the former case, you wait at each step for the previous computation to finish, even when you don't need its result to go on.
What you should remember is this: flatMap a Future only if you need the result of a previous computation, or if you want another computation to finish before doing something else. The same goes for a for comprehension, which is just a nice-looking way of chaining flatMaps.
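To make the difference concrete, here is a minimal sketch (slow is a made-up helper standing in for the index/readBody calls):
import scala.concurrent.{ExecutionContext, Future}
import ExecutionContext.Implicits.global

def slow(x: Int): Future[Int] = Future { Thread.sleep(500); x } // stand-in for a slow call

// Sequential: the second Future is not even created until the first completes (~1s total).
val sequential: Future[Int] = for {
  a <- slow(1)
  b <- slow(2)
} yield a + b

// Parallel: both Futures are started eagerly, then merely awaited in order (~0.5s total).
val fa = slow(1)
val fb = slow(2)
val parallel: Future[Int] = for {
  a <- fa
  b <- fb
} yield a + b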

Here's what your proposed for comprehension looks like after it's been desugared (courtesy of IntelliJ's "Desugar Scala code ..." menu option):
async1.flatMap((async1Res: Nothing) =>
  async2.flatMap((async2Res: Nothing) =>
    Pagelet.readBody(async1Res).flatMap((async1Html: Nothing) =>
      Pagelet.readBody(async2Res).map((async2Html: Nothing) =>
        Ok.chunked(merged)))))
As you can see, the nesting, and the concluding flatMap/map pair, are very different from your original code plan.
As a general rule, every <- in a single for comprehension is turned into a flatMap() except for the final one, which is a map(), and each is nested inside the previous.
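In schematic form (fa, fb and g are placeholders):
// A for comprehension like this ...
for {
  a <- fa
  b <- fb
} yield g(a, b)

// ... desugars to:
fa.flatMap(a => fb.map(b => g(a, b)))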

Related

Best way to get List[String] or Future[List[String]] from List[Future[List[String]]] Scala

I have a flow that returns List[Future[List[String]]] and I want to convert it to List[String] .
Here's what I am doing currently to achieve it -
val functionReturnedValue: List[Future[List[String]]] = functionThatReturnsListOfFutureList()
val listBuffer = new ListBuffer[String]
functionReturnedValue.map { futureList =>
  val list = Await.result(futureList, Duration(10, "seconds"))
  list.map(string => listBuffer += string)
}
listBuffer.toList
Waiting inside the loop is not good, and I also need to avoid using ListBuffer.
Alternatively, is it possible to get a Future[List[String]] from the List[Future[List[String]]]?
Could someone please help with this?
There is no way to get a value from an asynchronous context into a synchronous context without blocking the synchronous context while it waits for the asynchronous one.
But, yes, you can delay that blocking as much as possible to get better results.
val listFutureList: List[Future[List[String]]] = ???
val listListFuture: Future[List[List[String]]] = Future.sequence(listFutureList)
val listFuture: Future[List[String]] = listListFuture.map(_.flatten)
val list: List[String] = Await.result(listFuture, Duration.Inf)
Using Await.result invokes a blocking operation, which you should avoid if you can.
Just as a side note: in your code you are using .map, but as you are only interested in filling the (mutable) ListBuffer, you can just use foreach, which has Unit as its return type.
Instead of mapping and adding item by item, you can use .appendAll:
functionReturnedValue.foreach(fl =>
  listBuffer.appendAll(Await.result(fl, Duration(10, "seconds")))
)
As you don't want to use ListBuffer, another way could be to use .sequence within a for comprehension and then .flatten:
val fls: Future[List[String]] = for (
  lls <- Future.sequence(functionReturnedValue)
) yield lls.flatten
You can transform a List[Future[In]] into a Future[List[In]] safely as follows:
import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}
import scala.util.control.NonFatal

def aggregateSafeSequence[In](futures: List[Future[In]])(implicit ec: ExecutionContext): Future[List[In]] = {
  val futureTries = futures.map(_.map(Success(_)).recover { case NonFatal(ex) => Failure(ex) })
  Future.sequence(futureTries).map {
    _.foldRight(List.empty[In]) {
      case (Success(res), acc) => res :: acc
      case (Failure(ex), acc) =>
        println(s"Failure occurred: $ex")
        acc
    }
  }
}
Then you can use Await.result in order to wait if you like, but it's not recommended and you should avoid it if possible.
Note that with plain Future.sequence, if one of the futures fails, the resulting future fails as well, so I went with a slightly different approach.
You can apply the same approach to List[Future[List[String]]] and so on.
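For instance, a hypothetical usage for the type in the question (functionThatReturnsListOfFutureList is the asker's function):
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future

val nested: List[Future[List[String]]] = functionThatReturnsListOfFutureList()
val flat: Future[List[String]] = aggregateSafeSequence(nested).map(_.flatten) // failed inner futures are dropped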

What is the best way to merge two Future[Map[T1, T2]] in Scala

I have a list of fileNames and I want to load the correlated pages in batches (and not all at once). In order to do so, I'm using foldLeft and writing an aggregate function that accumulates a Future[Map[T1, T2]].
def loadPagesInBatches[T1, T2](fileNames: Set[FileName]): Future[Map[T1, T2]] = {
  val fileNameToPageId: Map[FileName, PageId] = ... // invokes a function that returns the pageId correlated to the fileName
  val batches: Iterator[Set[FileName]] = fileNames.grouped(10) // batches of 10
  batches.foldLeft(Future(Map.empty[T1, T2]))(aggregate(fileNameToPageId))
}
And the signature of aggregate is as follows:
def aggregate(fileNameToPageId: Map[FileName, PageId]): (Future[Map[T1, T2]], Set[FileName]) => Future[Map[T1, T2]] = {..}
I'm trying to work out the best way to merge these Future[Map]s.
Thanks ahead!
P.S.: FileName and PageId are just type aliases for String.
In case you have exactly 2 futures, zipWith would probably be the most idiomatic.
val future1 = ???
val future2 = ???
future1.zipWith(future2)(_ ++ _)
Which is a shorter way of writing a for comprehension:
for {
  map1 <- future1
  map2 <- future2
} yield map1 ++ map2
Although zipWith could potentially implement some kind of optimization.
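For example, a minimal self-contained sketch (the maps here are made-up placeholders):
import scala.concurrent.{ExecutionContext, Future}
import ExecutionContext.Implicits.global

val future1 = Future(Map("a" -> 1))
val future2 = Future(Map("b" -> 2))

// ++ merges the maps; keys present in both are taken from the right-hand side.
val merged: Future[Map[String, Int]] = future1.zipWith(future2)(_ ++ _)
// merged eventually holds Map("a" -> 1, "b" -> 2)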
My solution was putting the two maps into a list and using Future.reduceLeft.
def aggregate(fileNameToPageId: Map[FileName, PageId]): (Future[Map[T1, T2]], Set[FileName]) => Future[Map[T1, T2]] = {
  case (all, filesBatch) =>
    val mapOfPages: Future[Map[T1, T2]] = for {
      ... // some logic
    } yield theBatchMap
    Future.reduceLeft(List(all, mapOfPages))(_ ++ _)
}

request timeout from flatMapping over cats.effect.IO

I am attempting to transform some data that is encapsulated in cats.effect.IO with a Map that also is in an IO monad. I'm using http4s with blaze server and when I use the following code the request times out:
def getScoresByUserId(userId: Int): IO[Response[IO]] = {
  implicit val formats = DefaultFormats + ShiftJsonSerializer() + RawShiftSerializer()
  implicit val shiftJsonReader = new Reader[ShiftJson] {
    def read(value: JValue): ShiftJson = value.extract[ShiftJson]
  }
  implicit val shiftJsonDec = jsonOf[IO, ShiftJson]
  // get the shifts
  var getDbShifts: IO[List[Shift]] = shiftModel.findByUserId(userId)
  // use the userRoleId to get the RoleId then get the tasks for this role
  val taskMap: IO[Map[String, Double]] = taskModel.findByUserId(userId).flatMap {
    case tskLst: List[Task] => IO(tskLst.map((task: Task) => (task.name -> task.standard)).toMap)
  }
  val traversed: IO[List[Shift]] = for {
    shifts <- getDbShifts
    traversed <- shifts.traverse((shift: Shift) => {
      val lstShiftJson: IO[List[ShiftJson]] = read[List[ShiftJson]](shift.roleTasks)
        .map((sj: ShiftJson) =>
          taskMap.flatMap((tm: Map[String, Double]) =>
            IO(ShiftJson(sj.name, sj.taskType, sj.label, sj.value.toString.toDouble / tm.get(sj.name).get)))
        ).sequence
      // TODO: this flatMap is bricking my request
      lstShiftJson.flatMap((sjLst: List[ShiftJson]) => {
        IO(Shift(shift.id, shift.shiftDate, shift.shiftStart, shift.shiftEnd,
          shift.lunchDuration, shift.shiftDuration, shift.breakOffProd, shift.systemDownOffProd,
          shift.meetingOffProd, shift.trainingOffProd, shift.projectOffProd, shift.miscOffProd,
          write[List[ShiftJson]](sjLst), shift.userRoleId, shift.isApproved, shift.score, shift.comments
        ))
      })
    })
  } yield traversed
  traversed.flatMap((sLst: List[Shift]) => Ok(write[List[Shift]](sLst)))
}
As you can see from the TODO comment, I've narrowed this method down to the flatMap below it. If I remove that flatMap and merely return IO(shift) to the traversed variable, the request does not time out; however, that doesn't help me much, because I need to make use of the lstShiftJson variable, which holds my transformed JSON.
My intuition tells me I'm abusing the IO monad somehow, but I'm not quite sure how.
Thank you for your time in reading this!
So with the guidance of Luis's comment I refactored my code to the following. I don't think it is optimal (i.e. the flatMap at the end seems unnecessary, but I couldn't figure out how to remove it), BUT it's the best I've got.
def getScoresByUserId(userId: Int): IO[Response[IO]] = {
  implicit val formats = DefaultFormats + ShiftJsonSerializer() + RawShiftSerializer()
  implicit val shiftJsonReader = new Reader[ShiftJson] {
    def read(value: JValue): ShiftJson = value.extract[ShiftJson]
  }
  implicit val shiftJsonDec = jsonOf[IO, ShiftJson]
  // FOR EACH SHIFT
  // - read the shift.roleTasks into a ShiftJson object
  // - divide each task value by the task.standard where task.name = shiftJson.name
  // - write the list of shiftJson back to a string
  val traversed = for {
    taskMap <- taskModel.findByUserId(userId).map((tList: List[Task]) => tList.map((task: Task) => (task.name -> task.standard)).toMap)
    shifts <- shiftModel.findByUserId(userId)
    traversed <- shifts.traverse((shift: Shift) => {
      val lstShiftJson: List[ShiftJson] = read[List[ShiftJson]](shift.roleTasks)
        .map((sj: ShiftJson) => ShiftJson(sj.name, sj.taskType, sj.label, sj.value.toString.toDouble / taskMap.get(sj.name).get))
      shift.roleTasks = write[List[ShiftJson]](lstShiftJson)
      IO(shift)
    })
  } yield traversed
  traversed.flatMap((t: List[Shift]) => Ok(write[List[Shift]](t)))
}
Luis mentioned that mapping my List[Task] to a Map[String, Double] is a pure operation, so we want to use map instead of flatMap.
He also mentioned that I was wrapping every operation that comes from the database in IO, which caused a great deal of recomputation (including DB transactions).
To solve this issue I moved all of the database operations inside my for comprehension; using the <- operator to flatMap each of the return values lets the variables reside within the IO monad, preventing the recomputation experienced before.
I do think there must be a better way of returning my result: flatMapping the traversed variable to get back inside the IO monad seems like unnecessary recomputation, so please, anyone, correct me.
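A minimal, self-contained sketch of the recomputation issue Luis pointed out (the names are made up; assuming cats-effect):
import cats.effect.IO

val expensive: IO[Int] = IO { println("querying..."); 42 } // stands in for a DB call

// Referencing `expensive` twice re-runs its effect twice: "querying..." prints twice.
val twice: IO[Int] = expensive.flatMap(a => expensive.map(b => a + b))

// Binding the result once with <- runs the effect once and reuses the value.
val once: IO[Int] = for {
  a <- expensive
} yield a + a

// twice.unsafeRunSync() // prints "querying..." twice, returns 84
// once.unsafeRunSync()  // prints "querying..." once, returns 84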

Elegant way of reusing akka-stream flows

I am looking for a way to easily reuse akka-stream flows.
I treat the Flow I intend to reuse as a function, so I would like to keep its signature like:
Flow[Input, Output, NotUsed]
Now when I use this flow, I would like to be able to 'call' it and keep the result aside for further processing.
So I want to start with a Flow emitting [Input], apply my flow, and proceed with a Flow emitting [(Input, Output)].
example:
val s: Source[Int, NotUsed] = Source(1 to 10)
val stringIfEven = Flow[Int].filter(_ % 2 == 0).map(_.toString)
val via: Source[(Int, String), NotUsed] = ???
Now this is not possible in a straightforward way, because combining the flow with .via() would give me a Flow emitting just [Output]:
val via: Source[String, NotUsed] = s.via(stringIfEven)
An alternative is to make my reusable flow emit [(Input, Output)], but this requires every flow to push its input through all of its stages and makes my code look bad.
So I came up with a combiner like this:
def tupledFlow[In, Out](flow: Flow[In, Out, _]): Flow[In, (In, Out), NotUsed] = {
  Flow.fromGraph(GraphDSL.create() { implicit b =>
    import GraphDSL.Implicits._
    val broadcast = b.add(Broadcast[In](2))
    val zip = b.add(Zip[In, Out])
    broadcast.out(0) ~> zip.in0
    broadcast.out(1) ~> flow ~> zip.in1
    FlowShape(broadcast.in, zip.out)
  })
}
It broadcasts the input both to the flow and, via a parallel edge, directly to the 'Zip' stage, where I join the values into a tuple. It can then be applied elegantly:
val tupled: Source[(Int, String), NotUsed] = s.via(tupledFlow(stringIfEven))
Everything is great, but when the given flow performs a 'filter' operation, this combiner gets stuck and stops processing further events.
I guess that is due to the behaviour of 'Zip', which requires all of its inputs to produce an element: in my case one branch passes the given object through directly, so the other branch cannot drop an element with .filter(); when it does, the flow stops because Zip keeps waiting for a push.
Is there a better way to achieve flow composition?
Is there anything I can do in my tupledFlow to get desired behaviour when 'flow' ignores elements with 'filter' ?
Two possible approaches - with debatable elegance - are:
1) avoid using filtering stages, mutating your filter into a Flow[Int, Option[String], NotUsed]. This way you can apply your zipping wrapper around your whole graph, as was your original plan. However, the code looks more tainted, and there is added overhead from passing around Nones.
val stringIfEvenOrNone = Flow[Int].map {
  case x if x % 2 == 0 => Some(x.toString)
  case _ => None
}
val tupled: Source[(Int, String), NotUsed] = s.via(tupledFlow(stringIfEvenOrNone)).collect {
  case (num, Some(str)) => (num, str)
}
2) separate the filtering and transforming stages, and apply the filtering ones before your zipping wrapper. Probably a more lightweight and better compromise.
val filterEven = Flow[Int].filter(_ % 2 == 0)
val toString = Flow[Int].map(_.toString)
val tupled: Source[(Int, String), NotUsed] = s.via(filterEven).via(tupledFlow(toString))
EDIT
3) Posting another solution here for clarity, as per the discussions in the comments.
This flow wrapper allows to emit each element from a given flow, paired with the original input element that generated it. It works for any kind of inner flow (emitting 0, 1 or more elements for each input).
def tupledFlow[In, Out](flow: Flow[In, Out, _]): Flow[In, (In, Out), NotUsed] =
  Flow[In].flatMapConcat(in => Source.single(in).via(flow).map(out => in -> out))
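For instance, applied to the stringIfEven flow from the question, this emits only the even elements, each paired with its input:
val tupled: Source[(Int, String), NotUsed] = s.via(tupledFlow(stringIfEven))
// emits (2, "2"), (4, "4"), (6, "6"), (8, "8"), (10, "10")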
I came up with an implementation of tupledFlow that works when the wrapped Flow uses filter() or mapAsync(), and when the wrapped Flow emits 0, 1 or N elements for every input:
def tupledFlow[In, Out](flow: Flow[In, Out, _])(implicit materializer: Materializer, executionContext: ExecutionContext): Flow[In, (In, Out), NotUsed] = {
  val v: Flow[In, Seq[(In, Out)], NotUsed] = Flow[In].mapAsync(4) { in: In =>
    val outFuture: Future[Seq[Out]] = Source.single(in).via(flow).runWith(Sink.seq)
    val bothFuture: Future[Seq[(In, Out)]] = outFuture.map(seqOfOut => seqOfOut.map((in, _)))
    bothFuture
  }
  val onlyDefined: Flow[In, (In, Out), NotUsed] = v.mapConcat[(In, Out)](seq => seq.to[scala.collection.immutable.Iterable])
  onlyDefined
}
The only drawback I see here is that I am instantiating and materializing a flow for a single entity, just to get the notion of 'calling a flow as a function'.
I didn't do any performance tests on that; however, since the heavy lifting is done in the wrapped Flow, which is executed in a Future, I believe this will be OK.
This implementation passes all the tests from https://gist.github.com/kretes/8d5f2925de55b2a274148b69f79e55ac#file-tupledflowspec-scala

How to create a play.api.libs.iteratee.Enumerator which inserts some data between the items of a given Enumerator?

I use the Play framework with ReactiveMongo. Most of the ReactiveMongo APIs are based on the Play Enumerator. As long as I fetch some data from MongoDB and return it as-is asynchronously, everything is fine. Transformation of the data, like converting BSON to String using Enumerator.map, is also straightforward.
But today I faced a problem which boils down to the following code. I wasted half a day trying to create an Enumerator which would consume items from a given Enumerator and insert some items between them. It is important not to load all the items at once, as there could be many of them (the code example has only two items, "1" and "2"), but semantically it is similar to mkString on collections. I am sure it can be done very easily, but the best I could come up with was this code. Very similar code creating an Enumerator using Concurrent.broadcast serves me well for WebSockets, but here even that does not work: the HTTP response never comes back. When I look at Enumeratee, it seems it should provide exactly this functionality, but I could not find the way to do the trick.
P.S. I tried calling chan.eofAndEnd in Iteratee.mapDone, and chunked(enums >>> Enumerator.eof) instead of chunked(enums); that did not help. Sometimes the response comes back, but does not contain the correct data. What do I miss?
def trans(in: Enumerator[String]): Enumerator[String] = {
  val (res, chan) = Concurrent.broadcast[String]
  val iter = Iteratee.fold(true) { (isFirst, curr: String) =>
    if (!isFirst)
      chan.push("<-------->")
    chan.push(curr)
    false
  }
  in.apply(iter)
  res
}
def enums: Enumerator[String] = {
  val en12 = Enumerator[String]("1", "2")
  trans(en12)
  //en12 // if I comment the previous line and uncomment this, it prints "12" as expected
}
def enum = Action {
  Ok.chunked(enums)
}
Here is my solution which I believe to be correct for this type of problem. Comments are welcome:
def fill[From](
  prefix: From => Enumerator[From],
  infix: (From, From) => Enumerator[From],
  suffix: From => Enumerator[From]
)(implicit ec: ExecutionContext) = new Enumeratee[From, From] {
  override def applyOn[A](inner: Iteratee[From, A]): Iteratee[From, Iteratee[From, A]] = {
    // type of the state we will use for fold
    case class State(prev: Option[From], it: Iteratee[From, A])
    Iteratee.foldM(State(None, inner)) { (prevState, newItem: From) =>
      val toInsert = prevState.prev match {
        case None => prefix(newItem)
        case Some(prevItem) => infix(prevItem, newItem)
      }
      for (newIt <- toInsert >>> Enumerator(newItem) |>> prevState.it)
        yield State(Some(newItem), newIt)
    } mapM {
      case State(None, it) => // this is possible when our input was empty
        Future.successful(it)
      case State(Some(lastItem), it) =>
        suffix(lastItem) |>> it
    }
  }
}
// if there are missing integers between from and to, fill that gap with 0
def fillGap(from: Int, to: Int)(implicit ec: ExecutionContext) = Enumerator enumerate List.fill(to - from - 1)(0)
def fillFrom(x: Int)(input: Int)(implicit ec: ExecutionContext) = fillGap(x, input)
def fillTo(x: Int)(input: Int)(implicit ec: ExecutionContext) = fillGap(input, x)
val ints = Enumerator(10, 12, 15)
val toStr = Enumeratee.map[Int](_.toString)
val infill = fill(
  fillFrom(5),
  fillGap,
  fillTo(20)
)
val res = ints &> infill &> toStr // res will emit 0,0,0,0,10,0,12,0,0,15,0,0,0,0
You wrote that you are working with WebSockets, so why don't you use a dedicated solution for that? What you wrote is better suited to Server-Sent Events than to WS. As I understood you, you want to filter your results before sending them back to the client? If that's correct, then you need an Enumeratee instead of an Enumerator. An Enumeratee is a from-to transformation. This is a very good piece of code showing how to use Enumeratee. It may not be directly about what you need, but I found inspiration there for my project; maybe when you analyze the given code you will find the best solution.
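A minimal sketch of that idea, using only the standard play.api.libs.iteratee API:
import play.api.libs.iteratee.{Enumeratee, Enumerator}

// An Enumeratee transforms a stream from one element type to another;
// &> ("through") applies it to an Enumerator.
val ints: Enumerator[Int] = Enumerator(1, 2, 3)
val evens: Enumeratee[Int, Int] = Enumeratee.filter[Int](_ % 2 == 0)
val toStr: Enumeratee[Int, String] = Enumeratee.map[Int](_.toString)
val strs: Enumerator[String] = ints &> evens &> toStr // emits "2"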