Losing types on sequencing Futures - scala

I'm trying to do this:
case class ConversationData(members: Seq[ConversationMemberModel], messages: Seq[MessageModel])
val membersFuture: Future[Seq[ConversationMemberModel]] = ConversationMemberPersistence.searchByConversationId(conversationId)
val messagesFuture: Future[Seq[MessageModel]] = MessagePersistence.searchByConversationId(conversationId)
Future.sequence(List(membersFuture, messagesFuture)).map{ result =>
// some magic here
self ! ConversationData(members, messages)
}
But when I'm sequencing the two futures compiler is losing types. The compiler says that type of result is List[Seq[Product with Serializable]] At the beginning I expect to do something like
Future.sequence(List(membersFuture, messagesFuture)).map{ members, messages => ...
But it looks like sequencing futures don't work like this... I also tried to using a collect inside the map but I get similar errors.
Thanks for your help

When using Future.sequence, the assumption is that the underlying types produced by the multiple Futures are the same (or extend from the same parent type). With sequence, you basically invert a Seq of Futures for a particular type to a single Future for a Seq of that particular type. A concrete example is probably more illustrative of that point:
val f1:Future[Foo] = ...
val f2:Future[Foo] = ...
val f3:Future[Foo] = ...
val futures:List[Future[Foo]] = List(f1, f2, f3)
val aggregateFuture:Future[List[Foo]] = Future.sequence(futures)
So you can see that I went from a List of Future[Foo] to a single Future wrapping a List[Foo]. You use this when you already have a bunch of Futures for results of the same type (or base type) and you want to aggregate all of the results for the next processing step. The sequence method product a new Future that won't be completed until all of the aggregated Futures are done and it will then contain the aggregated results of all of those Futures. This works especially well when you have an indeterminate or variable number of Futures to process.
For your case, it seems that you have a fixed number of Futures to handle. As #Zoltan suggested, a simple for comprehension is probably a better fit here because the number of Futures is known. So solving your problem like so:
for{
members <- membersFuture
messages <- messagesFuture
} {
self ! ConversationData(members, messages)
}
is probably the best way to go for this specific example.

What are you trying to achieve with the sequence call? I'd just use a for-comprehension instead:
val membersFuture: Future[Seq[ConversationMemberModel]] = ConversationMemberPersistence.searchByConversationId(conversationId)
val messagesFuture: Future[Seq[MessageModel]] = MessagePersistence.searchByConversationId(conversationId)
for {
members <- membersFuture
messages <- messagesFuture
} yield (self ! ConversationData(members, messages))
Note that it is important that you declare the two futures outside the for-comprehension, because otherwise your messagesFuture wouldn't be submitted until the membersFuture is completed.
You could also use zip:
membersFuture.zip(messagesFuture).map {
case (members, messages) => self ! ConversationData(members, messages)
}
but I'd prefer the for-comprehension.

Related

Order of execution of Future - Making sequential inserts in a db non-blocking

A simple scenario here. I am using akka streams to read from kafka and write into an external source, in my case: cassandra.
Akka streams(reactive-kafka) library equips me with backpressure and other nifty things to make this possible.
kafka being a Source and Cassandra being a Sink, when I get bunch of events which are, for example be cassandra queries here through Kafka which are supposed to be executed sequentially (ex: it could be a INSERT, UPDATE and a DELETE and must be sequential).
I cannot use mayAsync and execute both the statement, Future is eager and there is a chance that DELETE or UPDATE might get executed first before INSERT.
I am forced to use Cassandra's execute as opposed to executeAsync which is non-blocking.
There is no way to make a complete async solution to this issue, but how ever is there a much elegant way to do this?
For ex: Make the Future lazy and sequential and offload it to a different execution context of sorts.
mapAsync gives a parallelism option as well.
Can Monix Task be of help here?
This a general design question and what are the approaches one can take.
UPDATE:
Flow[In].mapAsync(3)(input => {
input match {
case INSERT => //do insert - returns future
case UPDATE => //do update - returns future
case DELETE => //delete - returns future
}
The scenario is a little more complex. There could be thousands of insert, update and delete coming in order for specific key(s)(in kafka)
I would ideally want to execute the 3 futures of a single key in sequence. I believe Monix's Task can help?
If you process things with parallelism of 1, they will get executed in strict sequence, which will solve your problem.
But that's not interesting. If you want, you can run operations for different keys in parallel - if processing for different keys is independent, which, I assume from your description, is possible. To do this, you have to buffer the incoming values and then regroup it. Let's see some code:
import monix.reactive.Observable
import scala.concurrent.duration._
import monix.eval.Task
// Your domain logic - I'll use these stubs
trait Event
trait Acknowledgement // whatever your DB functions return, if you need it
def toKey(e: Event): String = ???
def processOne(event: Event): Task[Acknowledgement] = Task.deferFuture {
event match {
case _ => ??? // insert/update/delete
}
}
// Monix Task.traverse is strictly sequential, which is what you need
def processMany(evs: Seq[Event]): Task[Seq[Acknowledgement]] =
Task.traverse(evs)(processOne)
def processEventStreamInParallel(source: Observable[Event]): Observable[Acknowledgement] =
source
// Process a bunch of events, but don't wait too long for whole 100. Fine-tune for your data source
.bufferTimedAndCounted(2.seconds, 100)
.concatMap { batch =>
Observable
.fromIterable(batch.groupBy(toKey).values) // Standard collection methods FTW
.mapAsync(3)(processMany) // processing up to 3 different keys in parallel - tho 3 is not necessary, probably depends on your DB throughput
.flatMap(Observable.fromIterable) // flattening it back
}
The concatMap operator here will ensure that your chunks are processed sequentially as well. So even if one buffer has key1 -> insert, key1 -> update and the other has key1 -> delete, that causes no problems. In Monix, this is the same as flatMap, but in other Rx libraries flatMap might be an alias for mergeMap which has no ordering guarantee.
This can be done with Futures too, tho there's no standard "sequential traverse", so you have to roll your own, something like:
def processMany(evs: Seq[Event]): Future[Seq[Acknowledgement]] =
evs.foldLeft(Future.successful(Vector.empty[Acknowledgement])){ (acksF, ev) =>
for {
acks <- acksF
next <- processOne(ev)
} yield acks :+ next
}
You can use akka-streams subflows, to group by key, then merge substreams if you want to do something with what you get from your database operations:
def databaseOp(input: In): Future[Out] = input match {
case INSERT => ...
case UPDATE => ...
case DELETE => ...
}
val databaseFlow: Flow[In, Out, NotUsed] =
Flow[In].groupBy(Int.maxValues, _.key).mapAsync(1)(databaseOp).mergeSubstreams
Note that order from input source won't be kept in output as it is done in mapAsync, but all operations on the same key will still be in order.
You are looking for Future.flatMap:
def doSomething: Future[Unit]
def doSomethingElse: Future[Unit]
val result = doSomething.flatMap { _ => doSomethingElse }
This executes the first function, and then, when its Future is satisfied, starts the second one. The result is a new Future that completes when the result of the second execution is satisfied.
The result of the first future is passed into the function you give to .flatMap, so the second function can depend on the result of the first one. For example:
def getUserID: Future[Int]
def getUser(id: Int): Future[User]
val userName: Future[String] = getUserID.flatMap(getUser).map(_.name)
You can also write this as a for-comprehension:
for {
id <- getUserID
user <- getUser(id)
} yield user.name

How do I to flatMap a Try[Option] in an idiomatic way

I want to flatMap a Try[Option[A]] using some function that uses the value inside the Option to create another Try, and I want the solution to be simple and idiomatic. I have illustrated the problem with an example. The goal is to create a Option[Group] with members and events wrapped in a single Try that can contain errors from any of the three functions.
def getGroup(id: Long): Try[Option[Group]]
def getMembersForGroup(groupId: Long): Try[Seq[Member]]
def getMeetingsForGroup(groupId: Long): Try[Seq[Meeting]]
I find it difficult to flatMap from the Try returned by getGroup to the Try from the member- and meeting-functions because there's an Option "in the way". This is what i have come up with so far:
getGroup(id).flatMap(
groupOpt => groupOpt.map(
group => addStuff(group).map(group => Some(group))
).getOrElse(Success(None))
)
def addStuff(g: Group): Try[Group] =
for {
members <- getMembersForGroup(g.id)
meetings <- getMeetingsForGroup(g.id)
} yield g.copy(members = members, meetings = meetings)
What I don't like about my solution is that I have to wrap the group returned by addStuff in an Option to perform the getOrElse. At this point the type is Option[Try[Option[Group]]] which I think makes the solution difficult to understand at first glance.
Is there a simpler solution to this problem?
Cats has an OptionT type that might simplify this: documentation here and source here.
Your example would be:
def getGroupWithStuff(id: Long): OptionT[Try, Group] = {
for {
g <- OptionT(getGroup(id))
members <- OptionT.liftF(getMembersForGroup(g.id))
meetings <- OptionT.liftF(getMeetingsForGroup(g.id))
} yield g.copy(members = members, meetings = meetings)
}
You could use .fold instead of .map.getOrElse ... That makes it a little bit nicer:
getGroup(id)
.flatMap {
_.fold(Try(Option.empty[Group])){
addStuff(_).map(Option.apply)
}
}
or write the two cases explicitly - that may look a little clearer in this case, because you can avoid having to spell out the ugly looking type signature:
getGroup(id).flatMap {
case None => Success(None)
case Some(group) => addStuff(group).map(Option.apply)
}
You probably could simplify your getGroup call to:
getGroup(id).map(
groupOpt => groupOpt.flatMap(
group => addStuff(group).toOption
)
)
, however that would be at cost of ignoring potential failure info from addStuff call. If it is not acceptable then it is unlikely you could simplify your code further.
Try this. You get to keep your for comprehension syntax as well as Failure information from any of the three calls (whichever fails first).
def getFullGroup(id: Long): Try[Option[Group]] =
getGroup(id).flatMap[Option[Group]] { _.map[Try[Group]]{ group =>
for {
meetings <- getMeetingsForGroup(id)
members <- getMembersForGroup
} yield group.copy(meetings = meetings, members = members)
}.fold[Try[Option[Group]]](Success(None))(_.map(Some(_)))
}
Note the type acrobatics at the end:
fold[Try[Option[Group]]](Success(None))(_.map(Some(_)))
It's hard to get right without type annotations and an IDE. In this particular case, that's not too bad, but imagine meetings and members depended on another nested try option which in turn depended on the original. Or imagine if you wanted to a comprehension on individual Meetings and Groups rather than using the entire list.
You can try using an OptionT monad transformer from cats or scalaz to stack Try[Option[Group]] into a non-nested OptionT[Try, Group]. If you use a monad transformer, it can look like this:
def getFullGroup(id: Long): OptionT[Try, Group] =
OptionT(getGroup(id)).flatMapF { group =>
for {
meetings <- getMeetingsForGroup(id)
members <- getMembersForGroup(id)
} yield group.copy(meetings = meetings, members = members)
}
}
For this particular case, there's not really much gain. But do look into it if you have a lot of this kind of code.
By the way, the boilerplate at the end of the first example that flips the Try and Option is called a sequence. When it follows a map, the whole thing is called traverse. It's a pattern that comes up often and is abstracted away by functional programming libraries. Instead of using OptionT, you can do something like:
def getFullGroup(id: Long): Try[Option[Group]] =
getGroup(id).flatMap[Option[Group]] { _.traverse { group =>
for {
meetings <- getMeetingsForGroup(id)
members <- getMembersForGroup
} yield group.copy(meetings = meetings, members = members)
}
}
(Generally, if you're mapping f then flipping monads, you want to traverse with f.)

Scala adding elements to seq and handling futures, maps, and async behavior

I'm still a newbie in scala and don't quite yet understand the concept of Futures/Maps/Flatmaps/Seq and how to use them properly.
This is what I want to do (pseudo code):
def getContentComponents: Action[AnyContent] = Action.async {
contentComponentDTO.list().map( //Future[Seq[ContentComponentModel]] Get all contentComponents
contentComponents => contentComponents.map( //Iterate over [Seq[ContentComponentModel]
contentComponent => contentComponent.typeOf match { //Match the type of the contentComponent
case 1 => contentComponent.pictures :+ contentComponentDTO.getContentComponentPicture(contentComponent.id.get) //Future[Option[ContentComponentPictureModel]] add to _.pictures seq
case 2 => contentComponent.videos :+ contentComponentDTO.getContentComponentVideo(contentComponent.id.get) //Future[Option[ContentComponentVideoModel]] add to _.videos seq
}
)
Ok(Json.toJson(contentComponents)) //Return all the contentComponents in the end
)
}
I want to add a Future[Option[Foo]] to contentComponent.pictures: Option[Seq[Foo]] like so:
case 2 => contentComponent.pictures :+ contentComponentDTO.getContentComponentPicture(contentComponent.id.get) //contentComponent.pictures is Option[Seq[Foo]]
and return the whole contentComponent back to the front-end via json in the end.
I know this might be far away from the actual code in the end, but I hope you got the idea. Thanks!
I'll ignore your code and focus on what is short and makes sense:
I want to add a Future[Option[Foo]] to contentComponent.pictures: Option[Seq[Foo]] like so:
Let's do this, focusing on code readability:
// what you already have
val someFuture: Future[Option[Foo]] = ???
val pics: Option[Seq[Foo]] = contentComponent.pictures
// what I'm adding
val result: Future[Option[Seq[Foo]]] = someFuture.map {
case None => pics
case Some(newElement) =>
pics match {
case None => Some(Seq(newElement)) // not sure what you want here if pics is empty...
case Some(picsSequence) => Some(picsSequence :+ newElement)
}
}
And to show an example of flatMap let's say you need the result of result future in another future, just do:
val otherFuture: Future[Any] = ???
val everything: Future[Option[Seq[Foo]]] = otherFuture.flatmap { otherResult =>
// do something with otherResult i.e., the code above could be pasted in here...
result
}
My answer will attempt to help with some of the conceptual sub-questions which form parts of your overall larger question.
flatMap and for-yield
One of the points of flatMap is to help with the problem of the Pyramid of Doom. This happens when you have
structures nested within structures nested within structures ...
doA().map { resultOfA =>
doB(resultOfA).map { resolutOfB =>
doC(resultOfB).map { resultOfC =>
...
}
}
}
If you use for-yield you get flatMap out of the box and it allows you to
flatten the pyramid
so that your code looks more like a linear structure
for {
resultOfA <- doA
resultOfB <- doB(resultOfA)
resultOfC <- doC(resultOfB)
...
} yield {...}
There is a rule of thumb in software engineering that deeply nested structures are harder to debug and reason about, so
we strive to minimise the nesting. You will hit this issue especially when dealing with Futures.
Mapping over Future vs. mapping over sequence
Mapping is usually first thought in terms of iteration over a sequence, which might lead to understanding of
mapping over a Future in terms of iterating over a sequence of one. My advice would be not to use the iteration concept when
trying to understand mapping over Futures, Options etc. In these cases it might be better to think of mapping as a process of destructing the structure
so that you get at the element inside the structure. One could visualise mapping as
breaking the shell of a walnut so you get at the delicious kernel inside and then rebuilding the shell.
Futures and monads
As you try to learn more about Futures and when you begin to deal with types like Future[Option[SomeType]] you will inevitably
stumble upon documentation about monads and its cryptic terminology might scare you away. If this happens, it might help to think of monads (of which Future is a particular instance) as simply
something you can stick into a for-yield so that you can get at the
delicious walnut kernels whilst avoiding the pyramid of doom.

What is the advantage of using Option.map over Option.isEmpty and Option.get?

I am a new to Scala coming from Java background, currently confused about the best practice considering Option[T].
I feel like using Option.map is just more functional and beautiful, but this is not a good argument to convince other people. Sometimes, isEmpty check feels more straight forward thus more readable. Is there any objective advantages, or is it just personal preference?
Example:
Variation 1:
someOption.map{ value =>
{
//some lines of code
}
} orElse(foo)
Variation 2:
if(someOption.isEmpty){
foo
} else{
val value = someOption.get
//some lines of code
}
I intentionally excluded the options to use fold or pattern matching. I am simply not pleased by the idea of treating Option as a collection right now, and using pattern matching for a simple isEmpty check is an abuse of pattern matching IMHO. But no matter why I dislike these options, I want to keep the scope of this question to be the above two variations as named in the title.
Is there any objective advantages, or is it just personal preference?
I think there's a thin line between objective advantages and personal preference. You cannot make one believe there is an absolute truth to either one.
The biggest advantage one gains from using the monadic nature of Scala constructs is composition. The ability to chain operations together without having to "worry" about the internal value is powerful, not only with Option[T], but also working with Future[T], Try[T], Either[A, B] and going back and forth between them (also see Monad Transformers).
Let's try and see how using predefined methods on Option[T] can help with control flow. For example, consider a case where you have an Option[Int] which you want to multiply only if it's greater than a value, otherwise return -1. In the imperative approach, we get:
val option: Option[Int] = generateOptionValue
var res: Int = if (option.isDefined) {
val value = option.get
if (value > 40) value * 2 else -1
} else -1
Using collections style method on Option, an equivalent would look like:
val result: Int = option
.filter(_ > 40)
.map(_ * 2)
.getOrElse(-1)
Let's now consider a case for composition. Let's say we have an operation which might throw an exception. Additionaly, this operation may or may not yield a value. If it returns a value, we want to query a database with that value, otherwise, return an empty string.
A look at the imperative approach with a try-catch block:
var result: String = _
try {
val maybeResult = dangerousMethod()
if (maybeResult.isDefined) {
result = queryDatabase(maybeResult.get)
} else result = ""
}
catch {
case NonFatal(e) => result = ""
}
Now let's consider using scala.util.Try along with an Option[String] and composing both together:
val result: String = Try(dangerousMethod())
.toOption
.flatten
.map(queryDatabase)
.getOrElse("")
I think this eventually boils down to which one can help you create clear control flow of your operations. Getting used to working with Option[T].map rather than Option[T].get will make your code safer.
To wrap up, I don't believe there's a single truth. I do believe that composition can lead to beautiful, readable, side effect deferring safe code and I'm all for it. I think the best way to show other people what you feel is by giving them examples as we just saw, and letting them feel for themselves the power they can leverage with these sets of tools.
using pattern matching for a simple isEmpty check is an abuse of pattern matching IMHO
If you do just want an isEmpty check, isEmpty/isDefined is perfectly fine. But in your case you also want to get the value. And using pattern matching for this is not abuse; it's precisely the basic use-case. Using get allows to very easily make errors like forgetting to check isDefined or making the wrong check:
if(someOption.isEmpty){
val value = someOption.get
//some lines of code
} else{
//some other lines
}
Hopefully testing would catch it, but there's no reason to settle for "hopefully".
Combinators (map and friends) are better than get for the same reason pattern matching is: they don't allow you to make this kind of mistake. Choosing between pattern matching and combinators is a different question. Generally combinators are preferred because they are more composable (as Yuval's answer explains). If you want to do something covered by a single combinator, I'd generally choose them; if you need a combination like map ... getOrElse, or a fold with multi-line branches, it depends on the specific case.
It seems similar to you in case of Option but just consider the case of Future. You will not be able to interact with the future's value after going out of Future monad.
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Promise
import scala.util.{Success, Try}
// create a promise which we will complete after sometime
val promise = Promise[String]();
// Now lets consider the future contained in this promise
val future = promise.future;
val byGet = if (!future.value.isEmpty) {
val valTry = future.value.get
valTry match {
case Success(v) => v + " :: Added"
case _ => "DEFAULT :: Added"
}
} else "DEFAULT :: Added"
val byMap = future.map(s => s + " :: Added")
// promise was completed now
promise.complete(Try("PROMISE"))
//Now lets print both values
println(byGet)
// DEFAULT :: Added
println(byMap)
// Success(PROMISE :: Added)

Converting thunk to sequence upon iteration

I have a server API that returns a list of things, and does so in chunks of, let's say, 25 items at a time. With every response, we get a list of items, and a "token" that we can use for the following server call to return the next 25, and so on.
Please note that we're using a client library that has been written in stodgy old mutable Java, and doesn't lend itself nicely to all of Scala's functional compositional patterns.
I'm looking for a way to return a lazily evaluated sequence of all server items, by doing a server call with the latest token whenever the local list of items has been exhausted. What I have so far is:
def fetchFromServer(uglyStateObject: StateObject): Seq[Thing] = {
val results = server.call(uglyStateObject)
uglyStateObject.update(results.token())
results.asScala.toList ++ (if results.moreAvailable() then
fetchFromServer(uglyStateObject)
else
List())
}
However, this function does eager evaluation. What I'm looking for is to have ++ concatenate a "strict sequence" and a "lazy sequence", where a thunk will be used to retrieve the next set of items from the server. In effect, I want something like this:
results.asScala.toList ++ Seq.lazy(() => fetchFromServer(uglyStateObject))
Except I don't know what to use in place of Seq.lazy.
Things I've seen so far:
SeqView, but I've seen comments that it shouldn't be used because it re-evaluates all the time?
Streams, but they seem like the abstraction is supposed to generate elements at a time, whereas I want to generate a bunch of elements at a time.
What should I use?
I also suggest you to take a look at scalaz-strem. Here is small example how it may look like
import scalaz.stream._
import scalaz.concurrent.Task
// Returns updated state + fetched data
def fetchFromServer(uglyStateObject: StateObject): (StateObject, Seq[Thing]) = ???
// Initial state
val init: StateObject = new StateObject
val p: Process[Task, Thing] = Process.repeatEval[Task, Seq[Thing]] {
var state = init
Task(fetchFromServer(state)) map {
case (s, seq) =>
state = s
seq
}
} flatMap Process.emitAll
As a matter of fact, in the meantime I already found a slightly different answer that I find more readable (indeed using Streams):
def fetchFromServer(uglyStateObject: StateObject): Stream[Thing] = {
val results = server.call(uglyStateObject)
uglyStateObject.update(results.token())
results.asScala.toStream #::: (if results.moreAvailable() then
fetchFromServer(uglyStateObject)
else
Stream.empty)
}
Thanks everyone for