How to retrieve value from the output of a scala Future? - scala

I am trying to query a table, store values of the query in a Scala Map & return the same map.
To do that, I came up with the following code:
def getBounds(incLogIdMap:scala.collection.mutable.Map[String, String]): Future[scala.collection.mutable.Map[String, String]] = Future {
var boundsMap = scala.collection.mutable.Map[String, String]()
incLogIdMap.keys.foreach(table => if(!incLogIdMap(table).contains("INVALID")) {
val minMax = s"select max(cast(to_char(update_tms,'yyyyddmmhhmmss') as bigint)) maxTms, min(cast(to_char(update_tms,'yyyyddmmhhmmss') as bigint)) minTms from queue.${table} where key_ids in (${incLogIdMap(table)})"
val boundsDF = spark.read.format("jdbc").option("url", commonParams.getGpConUrl()).option("dbtable", s"(${minMax}) as ctids")
.option("user", commonParams.getGpUserName()).option("password", commonParams.getGpPwd()).load()
val maxTms = boundsDF.select("minTms").head.getLong(0).toString + "," + boundsDF.select("maxTms").head.getLong(0).toString
boundsMap += (table -> maxTms)
}
)
boundsMap
}
In order to receive the value from the method: getBounds, I used the method onCompletion as below:
val tmsobj = new MinMaxVals(spark, commonParams)
val boundsMap = tmsobj.getBounds(incLogIds)
boundsMap.onComplete({
case Success(value) =>
case Failure(value) =>
})
I have coded in Scala before but I am new to Futures in Scala. Could anyone let me know how can I retrieve the value returned by getBounds into val boundsMap

You can use Awaits ( not the best aproach)
val boundsMap = Await.result(tmsobj.getBounds(incLogIds),Duration.Inf)
Or use the value only when you need
val boundsMap = tmsobj.getBounds(incLogIds)
booundsMap.map(value => Smth_To_Do(value))

Accessing a value from a Future is not recommended as it defeats the purpose of asynchronous computation. However, there may be cases where you are dealing with the legacy code or some situation where fetching the value from the future is the way forward. To deal with such situations, there are two approaches
Using await that will block the thread
Await.result(getBounds, 10 seconds)
So, here what await does is, it will wait for 10 seconds for the getBounds future to complete. If it completes within this time, then you have the value, else you get an exception here. The biggest drawback of this method is that it blocks the current thread of execution.
Using a callback method onComplete as you have used
getBounds onComplete {
case Success(someOption) => myMethod(someOption)
case Failure(t) => println("Error)
}
So what onComplete does is to register a callback function that will get executed whenever the future is completed. This is comparatively safer that await.
You can refer to Accessing value returned by scala futures for further details.
I hope that this answers your question.

Related

Best way to get List[String] or Future[List[String]] from List[Future[List[String]]] Scala

I have a flow that returns List[Future[List[String]]] and I want to convert it to List[String] .
Here's what I am doing currently to achieve it -
val functionReturnedValue: List[Future[List[String]]] = functionThatReturnsListOfFutureList()
val listBuffer = new ListBuffer[String]
functionReturnedValue.map{futureList =>
val list = Await.result(futureList, Duration(10, "seconds"))
list.map(string => listBuffer += string)
}
listBuffer.toList
Waiting inside loop is not good, also need to avoid use of ListBuffer.
Or, if it is possible to get Future[List[String]] from List[Future[List[String]]]
Could someone please help with this?
There is no way to get a value from an asynchronus context to the synchronus context wihtout blocking the sysnchronus context to wait for the asynchronus context.
But, yes you can delay that blocking as much as you can do get better results.
val listFutureList: List[Future[List[String]]] = ???
val listListFuture: Future[List[List[String]]] = Future.sequence(listFutureList)
val listFuture: Future[List[String]] = listListFuture.map(_.flatten)
val list: List[String] = Await.result(listFuture, Duration.Inf)
Using Await.result invokes a blocking operation, which you should avoid if you can.
Just as a side note, in your code you are using .map but as you are only interested in the (mutable) ListBuffer you can just use foreach which has Unit as a return type.
Instead of mapping and adding item per item, you can use .appendAll
functionReturnedValue.foreach(fl =>
listBuffer.appendAll(Await.result(fl, Duration(10, "seconds")))
)
As you don't want to use ListBuffer, another way could be using .sequence is with a for comprehension and then .flatten
val fls: Future[List[String]] = for (
lls <- Future.sequence(functionReturnedValue)
) yield lls.flatten
You can transform List[Future[In]] to Future[List[In]] safetly as follows:
def aggregateSafeSequence[In](futures: List[Future[In]])(implicit ec: ExecutionContext): Future[List[In]] = {
val futureTries = futures.map(_.map(Success(_)).recover { case NonFatal(ex) => Failure(ex)})
Future.sequence(futureTries).map {
_.foldRight(List[In]()) {
case (curr, acc) =>
curr match {
case Success(res) => res :: acc
case Failure(ex) =>
println("Failure occurred", ex)
acc
}
}
}
}
Then you can use Await.result In order to wait if you like but it's not recommended and you should avoid it if possible.
Note that in general Future.sequence, if one the futures fails all the futures will fail together, so i went to a little different approach.
You can use the same way from List[Future[List[String]]] and etc.

request timeout from flatMapping over cats.effect.IO

I am attempting to transform some data that is encapsulated in cats.effect.IO with a Map that also is in an IO monad. I'm using http4s with blaze server and when I use the following code the request times out:
def getScoresByUserId(userId: Int): IO[Response[IO]] = {
implicit val formats = DefaultFormats + ShiftJsonSerializer() + RawShiftSerializer()
implicit val shiftJsonReader = new Reader[ShiftJson] {
def read(value: JValue): ShiftJson = value.extract[ShiftJson]
}
implicit val shiftJsonDec = jsonOf[IO, ShiftJson]
// get the shifts
var getDbShifts: IO[List[Shift]] = shiftModel.findByUserId(userId)
// use the userRoleId to get the RoleId then get the tasks for this role
val taskMap : IO[Map[String, Double]] = taskModel.findByUserId(userId).flatMap {
case tskLst: List[Task] => IO(tskLst.map((task: Task) => (task.name -> task.standard)).toMap)
}
val traversed: IO[List[Shift]] = for {
shifts <- getDbShifts
traversed <- shifts.traverse((shift: Shift) => {
val lstShiftJson: IO[List[ShiftJson]] = read[List[ShiftJson]](shift.roleTasks)
.map((sj: ShiftJson) =>
taskMap.flatMap((tm: Map[String, Double]) =>
IO(ShiftJson(sj.name, sj.taskType, sj.label, sj.value.toString.toDouble / tm.get(sj.name).get)))
).sequence
//TODO: this flatMap is bricking my request
lstShiftJson.flatMap((sjLst: List[ShiftJson]) => {
IO(Shift(shift.id, shift.shiftDate, shift.shiftStart, shift.shiftEnd,
shift.lunchDuration, shift.shiftDuration, shift.breakOffProd, shift.systemDownOffProd,
shift.meetingOffProd, shift.trainingOffProd, shift.projectOffProd, shift.miscOffProd,
write[List[ShiftJson]](sjLst), shift.userRoleId, shift.isApproved, shift.score, shift.comments
))
})
})
} yield traversed
traversed.flatMap((sLst: List[Shift]) => Ok(write[List[Shift]](sLst)))
}
as you can see the TODO comment. I've narrowed down this method to the flatmap below the TODO comment. If I remove that flatMap and merely return "IO(shift)" to the traversed variable the request does not timeout; However, that doesn't help me much because I need to make use of the lstShiftJson variable which has my transformed json.
My intuition tells me I'm abusing the IO monad somehow, but I'm not quite sure how.
Thank you for your time in reading this!
So with the guidance of Luis's comment I refactored my code to the following. I don't think it is optimal (i.e. the flatMap at the end seems unecessary, but I couldnt' figure out how to remove it. BUT it's the best I've got.
def getScoresByUserId(userId: Int): IO[Response[IO]] = {
implicit val formats = DefaultFormats + ShiftJsonSerializer() + RawShiftSerializer()
implicit val shiftJsonReader = new Reader[ShiftJson] {
def read(value: JValue): ShiftJson = value.extract[ShiftJson]
}
implicit val shiftJsonDec = jsonOf[IO, ShiftJson]
// FOR EACH SHIFT
// - read the shift.roleTasks into a ShiftJson object
// - divide each task value by the task.standard where task.name = shiftJson.name
// - write the list of shiftJson back to a string
val traversed = for {
taskMap <- taskModel.findByUserId(userId).map((tList: List[Task]) => tList.map((task: Task) => (task.name -> task.standard)).toMap)
shifts <- shiftModel.findByUserId(userId)
traversed <- shifts.traverse((shift: Shift) => {
val lstShiftJson: List[ShiftJson] = read[List[ShiftJson]](shift.roleTasks)
.map((sj: ShiftJson) => ShiftJson(sj.name, sj.taskType, sj.label, sj.value.toString.toDouble / taskMap.get(sj.name).get ))
shift.roleTasks = write[List[ShiftJson]](lstShiftJson)
IO(shift)
})
} yield traversed
traversed.flatMap((t: List[Shift]) => Ok(write[List[Shift]](t)))
}
Luis mentioned that mapping my List[Shift] to a Map[String, Double] is a pure operation so we want to use a map instead of flatMap.
He mentioned that I'm wrapping every operation that comes from the database in IO which is causing a great deal of recomputation. (including DB transactions)
To solve this issue I moved all of the database operations inside of my for loop, using the "<-" operator to flatMap each of the return values allows the variables being used to preside within the IO monads, hence preventing the recomputation experienced before.
I do think there must be a better way of returning my return value. flatMapping the "traversed" variable to get back inside of the IO monad seems to be unnecessary recomputation, so please anyone correct me.

How to traverse a Set[Future[Option[User]]] and mutate a map

I have a mutable map that contains users:
val userMap = mutable.Map.empty[Int, User] // Int is user.Id
Now I need to load the new users, and add them to the map. I have the following api methods:
def getNewUsers(): Seq[Int]
def getUser(userId: Int): Future[Option[User]]
So I first get all the users I need to load:
val newUserIds: Set[Int] = api.getNewUsers
I now need to load each user, but not sure how to do getUser returns a Future[Option[User]].
I tried this:
api.getNewUsers().map( getUser(_) )
But that returns a Set[Future[Option[User]]]
I'm not sure how to use Set[Future[Option[User]]] to update my userMap now.
You'll have to wait for all of the Futures to finish. You can use Future.sequence to transform your Set[Future[_]] into a Future[Set], so you can wait for them all to finish:
val s: Set[scala.concurrent.Future[Some[User]]] = Set(Future(Some(User(1))), Future(Some(User(2))))
val f: Future[Set[Some[User]]] = Future.sequence(s)
f.map(users => users.foreach(u => /* your code here */))
However, using a mutable Map may be dangerous because it's possible to open yourself up to race conditions. Futures are executed in different threads, and if you altering a mutable object's state in different threads, bad things will happen.
You can use Future.sequence:
transforms a TraversableOnce[Future[A]] into a
Future[TraversableOnce[A]]. Useful for reducing many Futures into a
single Future
from Scala Future
You can try:
val result : Future[Seq[Option[User]]] =
Future.sequence(
api.getNewUsers().map( getUser )
)
result.andThen {
case Success(users) =>
users.flatten.foreach(u => yourMap += u.id -> u)
}

Akka Stream use HttpResponse in Flow

I would like to utilize a simple Flow to gather some extra data from a http service and enhance my data object with the results. The following illustrates the Idea:
val httpClient = Http().superPool[User]()
val cityRequest = Flow[User].map { user=>
(HttpRequest(uri=Uri(config.getString("cityRequestEndpoint"))), User)
}
val cityResponse = Flow[(Try[HttpResponse], User)].map {
case (Failure(ex), user) => user
case (Success(resp), user) => {
// << What to do here to get the value >> //
val responseData = processResponseSomehowToGetAValue?
val enhancedUser = new EnhancedUser(user.data, responseData)
enhancedUser
}
}
val processEnhancedUser = Flow[EnhancedUser].map {
// e.g.: Asynchronously save user to a database
}
val useEnhancementGraph = userSource
.via(getRequest)
.via(httpClient)
.via(getResponse)
.via(processEnhancedUser)
.to(Sink.foreach(println))
I have a problem to understand the mechanics and difference between
the streaming nature and materialization / Futures inside the Flow.
Following ideas did not explain it to me:
http://doc.akka.io/docs/akka-http/current/scala/http/implications-of-streaming-http-entity.html
akka HttpResponse read body as String scala
How do i get the value from the response into the new user object,
so i can handle that object in the following steps.
Thanks for help.
Update:
I was evaluating the code with a remote akka http server answering to requests between immediately and 10 seconds using the code below for parsing.
This led to the effect that some "EnhancedUser" Instances showed up at the end, but the ones who took too long to answer were missing their values.
I added .async to the end of the cityResponse parser at some time and the result output took longer, but was correct.
What is the reason for that behaviour and how does it fit together with the accepted Answer?
val cityResponse = Flow[(Try[HttpResponse], User)].map {
case (Failure(ex), member) => member
case (Success(response), member) => {
Unmarshal(response.entity).to[String] onComplete {
case Success(s) => member.city = Some(s)
case Failure(ex) => member.city = None
}
}
member
}.async // <<-- This changed the behavior to be correct, why?
There are two different strategies you could use depending on the nature of the entity you are getting from "cityRequestEndpoint":
Stream Based
The typical way to handle this situation is to always assume that the entity coming from the source endpoint can contain N pieces of data, where N is not known in advance. This is usually the pattern to follow because it is the most generic and therefore "safest" in the real world.
The first step is to convert the HttpResponse coming from the endpoint into a Source of data:
val convertResponseToByteStrSource : (Try[HttpResponse], User) => Source[(Option[ByteString], User), _] =
(response, user) => response match {
case Failure(_) => Source single (None -> user)
case Success(r) => r.entity.dataBytes map (byteStr => Some(byteStr) -> user)
}
The above code is where we don't assume the size of N, r.entity.dataBytes could be a Source of 0 ByteString values, or potentially an infinite number values. But our logic doesn't care!
Now we need to combine the data coming from the Source. This is a good use case for Flow.flatMapConcat which takes a Flow of Sources and converts it into a Flow of values (similar to flatMap for Iterables):
val cityByteStrFlow : Flow[(Try[HttpResponse], User), (Option[ByteString], User), _] =
Flow[(Try[HttpResponse], User)] flatMapConcat convertResponseToByteStrSource
All that is left to do is convert the tuples of (ByteString, User) into EnhancedUser. Note: I am assuming below that User is a subclass of EnhancedUser which is inferred from the question logic:
val convertByteStringToUser : (Option[ByteString], User) => EnhancedUser =
(byteStr, user) =>
byteStr
.map(s => EnhancedUser(user.data, s))
.getOrElse(user)
val cityUserFlow : Flow[(Option[ByteString], User), EnhancedUser, _] =
Flow[(ByteString, User)] map convertByteStringToUser
These components can now be combined:
val useEnhancementGraph =
userSource
.via(cityRequest)
.via(httpClient)
.via(cityByteStrFlow)
.via(cityUserFlow)
.via(processEnhancedUser)
.to(Sink foreach println)
Future Based
We can use Futures to solve the problem, similar to the stack question you referenced in your original question. I don't recommend this approach for 2 reasons:
It assumes only 1 ByteString is coming from the endpoint. If the endpoint sends multiple values as ByteStrings then they all get concatenated together and you could get an error when creating EnhancedUser.
It places an artificial timeout on the materialization of the ByteString data, similar to Async.await (which should almost always be avoided).
To use the Future based approach the only big change to your original code is to use Flow.mapAsync instead of Flow.map to handle the fact that a Future is being created in the function:
val parallelism = 10
val timeout : FiniteDuration = ??? //you need to specify the timeout limit
val convertResponseToFutureByteStr : (Try[HttpResponse], User) => Future[EnhancedUser] =
_ match {
case (Failure(ex), user) =>
Future successful user
case (Success(resp), user) =>
resp
.entity
.toStrict(timeout)
.map(byteStr => new EnhancedUser(user.data, byteStr))
}
val cityResponse : Flow[(Try[HttpResponse], User), EnhancedUser, _] =
Flow[(Try[HttpResponse], User)].mapAsync(parallelism)(convertResponseToFutureByteStr)

Create Future without starting it

This is a follow-up to my previous question
Suppose I want to create a future with my function but don't want to start it immediately (i.e. I do not want to call val f = Future { ... // my function}.
Now I see it can be done as follows:
val p = promise[Unit]
val f = p.future map { _ => // my function here }
Is it the only way to create a future with my function w/o executing it?
You can do something like this
val p = Promise[Unit]()
val f = p.future
//... some code run at a later time
p.success {
// your function
}
LATER EDIT:
I think the pattern you're looking for can be encapsulated like this:
class LatentComputation[T](f: => T) {
private val p = Promise[T]()
def trigger() { p.success(f) }
def future: Future[T] = p.future
}
object LatentComputation {
def apply[T](f: => T) = new LatentComputation(f)
}
You would use it like this:
val comp = LatentComputation {
// your code to be executed later
}
val f = comp.future
// somewhere else in the code
comp.trigger()
You could always defer creation with a closure, you'll not get the future object right ahead, but you get a handle to call later.
type DeferredComputation[T,R] = T => Future[R]
def deferredCall[T,R](futureBody: T => R): DeferredComputation[T,R] =
t => future {futureBody(t)}
def deferredResult[R](futureBody: => R): DeferredComputation[Unit,R] =
_ => future {futureBody}
If you are getting too fancy with execution control, maybe you should be using actors instead?
Or, perhaps, you should be using a Promise instead of a Future: a Promise can be passed on to others, while you keep it to "fulfill" it at a later time.
It's also worth giving a plug to Promise.completeWith.
You already know how to use p.future onComplete mystuff.
You can trigger that from another future using p completeWith f.
You can also define a function that creates and returns the Future, and then call it:
val double = (value: Int) => {
val f = Future { Thread.sleep(1000); value * 2 }
f.onComplete(x => println(s"Future return: $x"))
f
}
println("Before future.")
double(2)
println("After future is called, but as the future takes 1 sec to run, it will be printed before.")
I used this to executes futures in batches of n, something like:
// The functions that returns the future.
val double = (i: Int) => {
val future = Future ({
println(s"Start task $i")
Thread.sleep(1000)
i * 2
})
future.onComplete(_ => {
println(s"Task $i ended")
})
future
}
val numbers = 1 to 20
numbers
.map(i => (i, double))
.grouped(5)
.foreach(batch => {
val result = Await.result( Future.sequence(batch.map{ case (i, callback) => callback(i) }), 5.minutes )
println(result)
})
Or just use regular methods that return futures, and fire them in series using something like a for comprehension (sequential call-site evaluation)
This well known problem with standard libraries Future: they are designed in such a way that they are not referentially transparent, since they evaluate eagerly and memoize their result. In most use cases, this is totally fine and Scala developers rarely need to create non-evaluated future.
Take the following program:
val x = Future(...); f(x, x)
is not the same program as
f(Future(...), Future(...))
because in the first case the future is evaluated once, in the second case it is evaluated twice.
The are libraries which provide the necessary abstractions to work with referentially transparent asynchronous tasks, whose evaluation is deferred and not memoized unless explicitly required by the developer.
Scalaz Task
Monix Task
fs2
If you are looking to use Cats, Cats effects works nicely with both Monix and fs2.
this is a bit of a hack, since it have nothing to do with how future works but just adding lazy would suffice:
lazy val f = Future { ... // my function}
but note that this is sort of a type change as well, because whenever you reference it you will need to declare the reference as lazy too or it will be executed.