How to get a result from Enumerator/Iteratee? - scala

I am using play2 and reactivemongo to fetch a result from mongodb. Each item of the result needs to be transformed to add some metadata. Afterwards I need to apply some sorting to it.
To deal with the transformation step I use enumerate():
def ideasEnumerator = collection.find(query)
  .options(QueryOpts(skipN = page))
  .sort(Json.obj(sortField -> -1))
  .cursor[Idea]
  .enumerate()
Then I create an Iteratee as follows:
val processIdeas: Iteratee[Idea, Unit] =
  Iteratee.foreach[Idea] { idea =>
    resolveCrossLinks(idea) flatMap { idea =>
      addMetaInfo(idea.copy(history = None))
    }
  }
Finally I feed the Iteratee:
ideasEnumerator(processIdeas)
And now I'm stuck. Every example I saw does some println inside foreach, but seems not to care about a final result.
So when all documents are returned and transformed how do I get a Sequence, a List or some other datatype I can further deal with?

Change the signature of your Iteratee from Iteratee[Idea, Unit] to Iteratee[Idea, Seq[A]], where A is the element type you want to collect. The first type parameter of Iteratee is the input type and the second is the output type. In your case you declared the output type as Unit, so no result is produced.
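Take a look at the code below. It is untested against your types, but it shows the basic usage. Because resolveCrossLinks and addMetaInfo return Futures, it uses Iteratee.foldM, whose step function returns a Future of the next accumulator.
ideasEnumerator.run(
  Iteratee.foldM(List.empty[MyObject]) { (accumulator, next: Idea) =>
    resolveCrossLinks(next)
      .flatMap(idea => addMetaInfo(idea.copy(history = None)))
      .map(result => accumulator :+ result) // append the transformed item
  }
) // returns Future[List[MyObject]]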
As you can see, an Iteratee is simply a state machine. Just extract the Iteratee part and assign it to a val:
val iteratee = Iteratee.foldM(List.empty[MyObject]) { (accumulator, next: Idea) =>
  // foldM rather than fold, because each step produces a Future
  resolveCrossLinks(next)
    .flatMap(idea => addMetaInfo(idea.copy(history = None)))
    .map(result => accumulator :+ result)
}
and feel free to use it wherever you need to convert from your Idea to List[MyObject].

With the help of your answers I ended up with
val processIdeas: Iteratee[Idea, Future[Vector[Idea]]] =
  Iteratee.fold(Future(Vector.empty[Idea])) { (accumulator: Future[Vector[Idea]], next: Idea) =>
    resolveCrossLinks(next) flatMap { next =>
      addMetaInfo(next.copy(history = None))
    } flatMap (ideaWithMeta => accumulator map (acc => acc :+ ideaWithMeta))
  }
val ideas = collection.find(query)
  .options(QueryOpts(page, perPage))
  .sort(Json.obj(sortField -> -1))
  .cursor[Idea]
  .enumerate(perPage).run(processIdeas)
This later needs an ideas.flatMap(identity) to flatten the resulting Future of a Future, but I'm fine with that, and I think everything looks idiomatic and elegant.
The performance gain compared to creating a list and iterating over it afterwards is negligible, though.
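For readers without a Play project at hand, the accumulation pattern above can be tried with the standard library alone. The following is a minimal sketch, not the ReactiveMongo or Iteratee API: `Idea`, `addMetaInfo`, `processAll` and `flattenNested` are simplified, hypothetical stand-ins, and a plain `Seq` with `foldLeft` takes the place of the Enumerator and Iteratee. It shows an accumulator being threaded through `Future`s and `flatMap(identity)` collapsing a nested `Future`.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FoldSketch {
  // Hypothetical, simplified stand-in for the question's Idea type.
  final case class Idea(title: String, meta: Option[String] = None)

  // Hypothetical async step standing in for resolveCrossLinks + addMetaInfo.
  def addMetaInfo(idea: Idea): Future[Idea] =
    Future(idea.copy(meta = Some("meta:" + idea.title)))

  // Thread a Future[Vector[Idea]] through the fold,
  // appending each transformed item in input order.
  def processAll(ideas: Seq[Idea]): Future[Vector[Idea]] =
    ideas.foldLeft(Future.successful(Vector.empty[Idea])) { (accF, next) =>
      for {
        acc      <- accF
        withMeta <- addMetaInfo(next)
      } yield acc :+ withMeta
    }

  // A Future[Future[A]] collapses to Future[A] with flatMap(identity).
  def flattenNested[A](nested: Future[Future[A]]): Future[A] =
    nested.flatMap(identity)

  // Blocking here is for demonstration only; real code would keep composing the Future.
  def demo(): Vector[Idea] =
    Await.result(processAll(Seq(Idea("a"), Idea("b"))), 5.seconds)
}
```

Because each step waits for the accumulator before transforming the next item, this sketch processes items one after another, which keeps the result in input order.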

Related

request timeout from flatMapping over cats.effect.IO

I am attempting to transform some data that is encapsulated in cats.effect.IO with a Map that also is in an IO monad. I'm using http4s with blaze server and when I use the following code the request times out:
def getScoresByUserId(userId: Int): IO[Response[IO]] = {
  implicit val formats = DefaultFormats + ShiftJsonSerializer() + RawShiftSerializer()
  implicit val shiftJsonReader = new Reader[ShiftJson] {
    def read(value: JValue): ShiftJson = value.extract[ShiftJson]
  }
  implicit val shiftJsonDec = jsonOf[IO, ShiftJson]
  // get the shifts
  var getDbShifts: IO[List[Shift]] = shiftModel.findByUserId(userId)
  // use the userRoleId to get the RoleId then get the tasks for this role
  val taskMap : IO[Map[String, Double]] = taskModel.findByUserId(userId).flatMap {
    case tskLst: List[Task] => IO(tskLst.map((task: Task) => (task.name -> task.standard)).toMap)
  }
  val traversed: IO[List[Shift]] = for {
    shifts <- getDbShifts
    traversed <- shifts.traverse((shift: Shift) => {
      val lstShiftJson: IO[List[ShiftJson]] = read[List[ShiftJson]](shift.roleTasks)
        .map((sj: ShiftJson) =>
          taskMap.flatMap((tm: Map[String, Double]) =>
            IO(ShiftJson(sj.name, sj.taskType, sj.label, sj.value.toString.toDouble / tm.get(sj.name).get)))
        ).sequence
      //TODO: this flatMap is bricking my request
      lstShiftJson.flatMap((sjLst: List[ShiftJson]) => {
        IO(Shift(shift.id, shift.shiftDate, shift.shiftStart, shift.shiftEnd,
          shift.lunchDuration, shift.shiftDuration, shift.breakOffProd, shift.systemDownOffProd,
          shift.meetingOffProd, shift.trainingOffProd, shift.projectOffProd, shift.miscOffProd,
          write[List[ShiftJson]](sjLst), shift.userRoleId, shift.isApproved, shift.score, shift.comments
        ))
      })
    })
  } yield traversed
  traversed.flatMap((sLst: List[Shift]) => Ok(write[List[Shift]](sLst)))
}
As the TODO comment shows, I've narrowed the problem down to the flatMap below it. If I remove that flatMap and merely return IO(shift) to the traversed variable, the request does not time out. However, that doesn't help me much, because I need to make use of the lstShiftJson variable, which holds my transformed JSON.
My intuition tells me I'm abusing the IO monad somehow, but I'm not quite sure how.
Thank you for your time in reading this!
So, with the guidance of Luis's comment, I refactored my code to the following. I don't think it is optimal (the flatMap at the end seems unnecessary, but I couldn't figure out how to remove it), but it's the best I've got.
def getScoresByUserId(userId: Int): IO[Response[IO]] = {
  implicit val formats = DefaultFormats + ShiftJsonSerializer() + RawShiftSerializer()
  implicit val shiftJsonReader = new Reader[ShiftJson] {
    def read(value: JValue): ShiftJson = value.extract[ShiftJson]
  }
  implicit val shiftJsonDec = jsonOf[IO, ShiftJson]
  // FOR EACH SHIFT
  // - read the shift.roleTasks into a ShiftJson object
  // - divide each task value by the task.standard where task.name = shiftJson.name
  // - write the list of shiftJson back to a string
  val traversed = for {
    taskMap <- taskModel.findByUserId(userId).map((tList: List[Task]) => tList.map((task: Task) => (task.name -> task.standard)).toMap)
    shifts <- shiftModel.findByUserId(userId)
    traversed <- shifts.traverse((shift: Shift) => {
      val lstShiftJson: List[ShiftJson] = read[List[ShiftJson]](shift.roleTasks)
        .map((sj: ShiftJson) => ShiftJson(sj.name, sj.taskType, sj.label, sj.value.toString.toDouble / taskMap.get(sj.name).get ))
      shift.roleTasks = write[List[ShiftJson]](lstShiftJson)
      IO(shift)
    })
  } yield traversed
  traversed.flatMap((t: List[Shift]) => Ok(write[List[Shift]](t)))
}
Luis mentioned that mapping my List[Task] to a Map[String, Double] is a pure operation, so we want to use map instead of flatMap.
He also mentioned that I was wrapping every operation that comes from the database in IO, which caused a great deal of recomputation (including DB transactions).
To solve this, I moved all of the database operations inside my for comprehension. Using the "<-" operator to flatMap each of the return values binds each result once and keeps the values in use within the IO monad, which prevents the recomputation experienced before.
I do think there must be a better way of producing my return value. flatMapping the "traversed" variable to get back inside the IO monad seems to be unnecessary recomputation, so please, anyone, correct me.
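The recomputation described above can be reproduced without cats-effect. The sketch below is an illustration under stated assumptions, not how cats.effect.IO is implemented: `Thunk` is a hypothetical, minimal lazy effect that reruns its computation every time it is executed, and the `queries` counter stands in for a database call. Calling flatMap on the lookup once per item runs the query once per item, while binding it once and doing the rest as pure code runs it a single time.

```scala
object RecomputeSketch {
  // Hypothetical minimal lazy effect: it only describes a computation,
  // and that computation runs again on every call to unsafeRun().
  final class Thunk[A](val unsafeRun: () => A) {
    def map[B](f: A => B): Thunk[B] =
      new Thunk(() => f(unsafeRun()))
    def flatMap[B](f: A => Thunk[B]): Thunk[B] =
      new Thunk(() => f(unsafeRun()).unsafeRun())
  }

  // Returns how many times the simulated database query ran.
  def countQueries(names: List[String], bindOnce: Boolean): Int = {
    var queries = 0
    // Stand-in for the task lookup mapped to a name -> standard table.
    val taskMap: Thunk[Map[String, Double]] =
      new Thunk(() => { queries += 1; Map("a" -> 2.0, "b" -> 4.0) })

    val program: Thunk[List[Double]] =
      if (bindOnce)
        // Bind the lookup once, then do the per-item work as pure code.
        taskMap.map(tm => names.map(name => tm(name)))
      else
        // flatMap on taskMap for every item: the query reruns once per item.
        names.foldRight(new Thunk(() => List.empty[Double])) { (name, acc) =>
          taskMap.flatMap(tm => acc.map(rest => tm(name) :: rest))
        }

    program.unsafeRun()
    queries
  }
}
```

With three items, the per-item variant runs the simulated query three times and the bind-once variant runs it once.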

functional scala- how to avoid deep nesting on optional mappings

I have a set of operations that are completed in sequence, but if an intermediate step returns "null" I would like to abort the operation early (skip the subsequent steps).
I conjured up a function like this which given an input parameter, performs several operations against Redis and will return a product if it exists. Since it is possible that one of the intermediate requests returns a null value, the complete operation could "fail" and I would like to short circuit the unnecessary steps that come afterwards.
The nesting here is becoming crazy, and I would like to make it more legible. Is there a proper "functional" way to perform this type of "if/else" short circuiting?
def getSingleProduct(firstSku: String): Option[Product] = {
  val jedis = pool.getResource
  val sid: Array[Byte] = jedis.get(Keys.sidForSku(firstSku, sectionId, feedId).getBytes)
  Option(sid).flatMap {
    sid: Array[Byte] =>
      Option(jedis.get(Keys.latestVersionForSid(sectionId, feedId, sid))) flatMap {
        version: Array[Byte] =>
          Option(Keys.dataForSid(sectionId, feedId, version, sid)) flatMap {
            getDataKey: Array[Byte] =>
              Option(jedis.get(getDataKey)) flatMap {
                packedData: Array[Byte] =>
                  val data = doSomeStuffWith(packedData)
                  Option(Product(data, "more interesting things"))
              }
          }
      }
  }
}
The way to do this is to use a for comprehension:
for {
  sid <- Option(jedis.get(...))
  version <- Option(jedis.get(..., sid, ...))
  getDataKey <- Option(jedis.get(...version,...))
  packedData <- Option(jedis.get(getDataKey))
} yield {
  // Do stuff with packedData
}
This will return None if any of the get calls returns null (which Option(...) turns into None); otherwise it will return Some(x), where x is the result of the yield expression.
You might also want to consider writing a wrapper for jedis.get which returns Option(x) rather than using null as the error result.
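The wrapper idea can be sketched as follows. This is only an illustration under assumptions: `FakeClient` is a hypothetical in-memory stand-in for a client whose `get` returns null on a miss (it is not the Jedis API), and the key formats are made up for the example.

```scala
object OptionWrapperSketch {
  // Hypothetical stand-in for a client whose get returns null when the key is missing.
  final class FakeClient(data: Map[String, String]) {
    def get(key: String): String = data.getOrElse(key, null)
  }

  // The wrapper: Option(null) is None, so a miss becomes None instead of null.
  def getOpt(client: FakeClient, key: String): Option[String] =
    Option(client.get(key))

  // Each step runs only if the previous one produced a value.
  def lookup(client: FakeClient, sku: String): Option[String] =
    for {
      sid     <- getOpt(client, "sid:" + sku)
      version <- getOpt(client, "version:" + sid)
      data    <- getOpt(client, "data:" + sid + ":" + version)
    } yield data.toUpperCase
}
```

With the null check hidden inside `getOpt`, the for comprehension no longer needs an `Option(...)` call at every step.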

Adding Futures to an immutable Seq and returning it

I feel like this is not that difficult but I'm struggling with the futures and adding objects or Ints to an immutable Seq.
def createCopyOfProcessTemplate(processTemplateId: Int): Action[AnyContent] = Action.async {
//val copies = Seq()
processTemplateDTO.createCopyOfProcessTemplate(processTemplateId).flatMap { process =>
processTemplateDTO.getProcessStepTemplates(processTemplateId).map { steps =>
steps.foreach(processStep =>
copy: Future[Option[ProcessTemplateModel] = processTemplateDTO.createCopyOfStepTemplates(processTemplateId, process.get.id.get, processStep))
//Seq should look something like this: [{processStep.id, copy.id},{processStep.id, copy.id},...] or [[processStep.id, copy.id],[processStep.id, copy.id],...]
}
Ok(Json.obj("copies" -> copies))
}
Where do I have to define the seq and how should I return it since it's handling Futures ?
Any ideas? Thanks in advance!
You can use Future.sequence to convert a List[Future[A]] into a Future[List[A]] and return that as the result. First, do not use steps.foreach with a copy variable defined inside it. Instead, use steps.map to collect the result of each processTemplateDTO.createCopyOfStepTemplates call; map returns a list of futures, List[Future[Option[ProcessTemplateModel]]]. Then convert that list with Future.sequence and finally return it as a Json object.
val copies = processTemplateDTO.createCopyOfProcessTemplate(processTemplateId).flatMap { process =>
  processTemplateDTO.getProcessStepTemplates(processTemplateId).flatMap { steps =>
    // one Future per step, combined into a single Future of all results
    Future.sequence(steps.map { processStep =>
      processTemplateDTO.createCopyOfStepTemplates(processTemplateId, process.get.id.get, processStep)
    })
  }
}
copies.map { result =>
  Ok(Json.obj("copies" -> result))
}
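To try the Future.sequence step in isolation, here is a small standard-library sketch. `Step`, `Copy`, `createCopy` and `copyAll` are hypothetical, simplified stand-ins for the question's models and DTO calls; the sketch pairs each step id with the id of its copy, which is the shape the question asks for, and drops steps whose copy is missing.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequenceSketch {
  // Hypothetical, simplified stand-ins for the step template and its copy.
  final case class Step(id: Int)
  final case class Copy(id: Int)

  // Hypothetical async copy; returns None for step 2 to show a missing copy.
  def createCopy(step: Step): Future[Option[Copy]] =
    Future(if (step.id == 2) None else Some(Copy(step.id + 100)))

  // map yields List[Future[...]]; Future.sequence turns it into Future[List[...]].
  def copyAll(steps: List[Step]): Future[List[(Int, Int)]] =
    Future
      .sequence(steps.map(step => createCopy(step).map(_.map(copy => (step.id, copy.id)))))
      .map(_.flatten) // drop the steps whose copy came back as None

  // Blocking here is for demonstration only; a Play action would return the Future.
  def demo(): List[(Int, Int)] =
    Await.result(copyAll(List(Step(1), Step(2), Step(3))), 5.seconds)
}
```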

Using a Future's response

Hoping someone can offer an opinion on a solution for this issue I'm having.
I'll try to simplify the issue to avoid bringing in domain details.
I have a list of optional strings. I'm using the collect method to filter out the strings that don't exist.
names collect {
case Some(value) => value
}
Simple enough. I'm hoping to go one step further: if a value is a None, I'd like to call a function and use its response in place of the None. For example:
names collect {
case Some(value) => value
case _ => getData(_)
}
The catch is that the getData method returns a Future. I understand that the conventions for futures advise accessing the value within a callback, using something like the map method or onComplete, but the issue is that I don't know whether I need to call getData until I'm in the collect and have the value, so I can't simply wrap all my logic in a map on getData. Using Await and blocking doesn't feel like a good idea.
Any idea how I could reasonably handle this would be greatly appreciated. Very new to Scala, so I'd love to hear opinions and options.
EDIT:
I was trying to simplify the problem but I think I've instead missed out on key information.
Below is the actual implementation of my method:
def calculateTracksToExport()(
  implicit exportRequest: ExportRequest,
  lastExportOption: Option[String]
): Future[List[String]] = {
  val vendorIds = getAllFavouritedTracks().flatMap { favTracks =>
    Future.sequence {
      favTracks.map { track =>
        musicClient.getMusicTrackDetailsExternalLinks(
          track,
          exportRequest.vendor.toString.toLowerCase
        ).map { details =>
          details.data.flatMap { data =>
            data.`external-links`.map { link =>
              link.map(_.value).collect {
                case Some(value) => value
                case None => getData(track)
              }
            }
          }.getOrElse(List())
        }
      }
    }.map(_.flatten)
  }
  vendorIds
}
You can use Future.sequence for collecting values:
def collect(list:List[Option[String]]):Future[List[String]] = Future.sequence(
  list.map {
    case Some(item) => Future.successful(item)
    case _ => getData()
  }
)
If a value may come from a Future, you always have to treat it as a Future. So return a sequence of Futures:
def resolve[T](input: Seq[Option[T]], supplier: => Future[T]): Seq[Future[T]] = {
  input.map(option => option.map(Future.successful).getOrElse(supplier))
}
Usage example:
// Input to process
val data = Seq(Some(1), None, Some(2), None, Some(5))
//Imitates long-running background process producing data
var count = 6
def getData: Future[Int] = Future( {
  Thread sleep (1000)
  count += 1
  count
})
resolve(data, getData) // Resolve Nones
  .map(Await.result(_, 10.second)).foreach( println ) // Use result
Outputs:
1
8
2
7
5
http://ideone.com/aa8nJ9
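The usage example above blocks on each Future with Await. A variation that stays asynchronous, and passes the key of each missing value to the fallback (as getData(track) does in the question), might look like the sketch below. `resolveAll` and `fetchLink` are hypothetical names introduced for illustration, and `fetchLink` is a stand-in for the real lookup.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FallbackSketch {
  // Keep present values as already-completed Futures, call the fallback
  // for missing ones, then combine everything into a single Future.
  def resolveAll[K, T](input: Seq[(K, Option[T])])(fallback: K => Future[T]): Future[Seq[T]] =
    Future.sequence(input.map {
      case (_, Some(value)) => Future.successful(value)
      case (key, None)      => fallback(key)
    })

  // Hypothetical stand-in for the question's getData(track).
  def fetchLink(track: String): Future[String] =
    Future("fetched-" + track)

  val input: Seq[(String, Option[String])] =
    Seq("t1" -> Some("x"), "t2" -> None, "t3" -> Some("y"))

  // Blocking here is for demonstration only; callers would normally map over the Future.
  def demo(): Seq[String] =
    Await.result(resolveAll(input)(fetchLink), 5.seconds)
}
```

Future.sequence keeps the input order, so each resolved value ends up in the position of the original option.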

How to create a play.api.libs.iteratee.Enumerator which inserts some data between the items of a given Enumerator?

I use Play framework with ReactiveMongo. Most of ReactiveMongo APIs are based on the Play Enumerator. As long as I fetch some data from MongoDB and return it "as-is" asynchronously, everything is fine. Also the transformation of the data, like converting BSON to String, using Enumerator.map is obvious.
But today I faced a problem which, at the bottom line, narrowed down to the following code. I wasted half a day trying to create an Enumerator which would consume items from a given Enumerator and insert some items between them. It is important not to load all the items at once, as there could be many of them (the code example has only two items, "1" and "2"). Semantically it is similar to mkString on collections. I am sure it can be done very easily, but the best I could come up with was this code. Very similar code creating an Enumerator using Concurrent.broadcast serves me well for WebSockets, but here even that does not work: the HTTP response never comes back. When I look at Enumeratee, it looks like it is supposed to provide such functionality, but I could not find a way to do the trick.
P.S. I tried calling chan.eofAndEnd in Iteratee.mapDone, and chunked(enums >>> Enumerator.eof) instead of chunked(enums); neither helped. Sometimes the response comes back, but it does not contain the correct data. What am I missing?
def trans(in:Enumerator[String]):Enumerator[String] = {
  val (res, chan) = Concurrent.broadcast[String]
  val iter = Iteratee.fold(true) { (isFirst, curr:String) =>
    if (!isFirst)
      chan.push("<-------->")
    chan.push(curr)
    false
  }
  in.apply(iter)
  res
}
def enums:Enumerator[String] = {
  val en12 = Enumerator[String]("1", "2")
  trans(en12)
  //en12 //if I comment the previous line and uncomment this, it prints "12" as expected
}
def enum = Action {
  Ok.chunked(enums)
}
Here is my solution which I believe to be correct for this type of problem. Comments are welcome:
def fill[From](
  prefix: From => Enumerator[From],
  infix: (From, From) => Enumerator[From],
  suffix: From => Enumerator[From]
)(implicit ec:ExecutionContext) = new Enumeratee[From, From] {
  override def applyOn[A](inner: Iteratee[From, A]): Iteratee[From, Iteratee[From, A]] = {
    //type of the state we will use for fold
    case class State(prev:Option[From], it:Iteratee[From, A])
    Iteratee.foldM(State(None, inner)) { (prevState, newItem:From) =>
      val toInsert = prevState.prev match {
        case None => prefix(newItem)
        case Some(prevItem) => infix (prevItem, newItem)
      }
      for(newIt <- toInsert >>> Enumerator(newItem) |>> prevState.it)
        yield State(Some(newItem), newIt)
    } mapM {
      case State(None, it) => //this is possible when our input was empty
        Future.successful(it)
      case State(Some(lastItem), it) =>
        suffix(lastItem) |>> it
    }
  }
}
// if there are missing integers between from and to, fill that gap with 0
def fillGap(from:Int, to:Int)(implicit ec:ExecutionContext) = Enumerator enumerate List.fill(to-from-1)(0)
def fillFrom(x:Int)(input:Int)(implicit ec:ExecutionContext) = fillGap(x, input)
def fillTo(x:Int)(input:Int)(implicit ec:ExecutionContext) = fillGap(input, x)
val ints = Enumerator(10, 12, 15)
val toStr = Enumeratee.map[Int] (_.toString)
val infill = fill(
  fillFrom(5),
  fillGap,
  fillTo(20)
)
val res = ints &> infill &> toStr // res will have 0,0,0,0,10,0,12,0,0,15,0,0,0,0
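The prefix/infix/suffix logic can be checked without Play by swapping the Enumeratee for a lazy standard-library Iterator. This is a sketch of the same idea using a different technique, not the Iteratee API: `FillSketch.fill` is a hypothetical Iterator-based analogue that also avoids loading all items at once, and it reproduces the expected output from the example above.

```scala
object FillSketch {
  // Lazily emit prefix(first), each item, infix(previous, item) between items,
  // and suffix(last) at the end, without materializing the whole input.
  def fill[A](
      prefix: A => Iterator[A],
      infix: (A, A) => Iterator[A],
      suffix: A => Iterator[A]
  )(input: Iterator[A]): Iterator[A] = {
    var prev: Option[A] = None // last item seen so far
    val body = input.flatMap { item =>
      val inserted = prev match {
        case None           => prefix(item)
        case Some(previous) => infix(previous, item)
      }
      prev = Some(item)
      inserted ++ Iterator.single(item)
    }
    // ++ takes its argument by name, so the suffix is chosen only after body is exhausted.
    body ++ (prev match {
      case Some(last) => suffix(last)
      case None       => Iterator.empty
    })
  }

  // Zeros for the integers strictly between from and to.
  def fillGap(from: Int, to: Int): Iterator[Int] =
    Iterator.fill(to - from - 1)(0)

  def demo(): List[Int] =
    fill[Int](x => fillGap(5, x), fillGap, x => fillGap(x, 20))(Iterator(10, 12, 15)).toList
}
```

An empty input produces an empty result, matching the State(None, it) case in the Enumeratee version.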
You wrote that you are working with WebSockets, so why don't you use a dedicated solution for that? What you wrote is better suited to Server-Sent Events than to WebSockets. As I understand it, you want to filter your results before sending them back to the client? If that's correct, then you want an Enumeratee instead of an Enumerator. An Enumeratee is a from-to transformation. This is a very good piece of code showing how to use Enumeratee. It may not be directly about what you need, but I found inspiration there for my project. Maybe when you analyze the given code you will find the best solution.