Use explicit execution context with OptionT[Future, _] - scala

I am trying to write a custom Akka SnapshotStore plugin.
I am at the point where I want to implement this method:
def loadAsync(persistenceId: String, criteria: SnapshotSelectionCriteria): Future[Option[SelectedSnapshot]]
This is what I have so far:
import cats.data.OptionT
import cats.implicits._
...
override def loadAsync(
persistenceId: String,
criteria: SnapshotSelectionCriteria
): Future[Option[SelectedSnapshot]] = {
// same as in the original plugin
val metadata = snapshotMetadatas(persistenceId, criteria).sorted.takeRight(maxLoadAttempts)
// need to get rid of this one!
import scala.concurrent.ExecutionContext.Implicits.global
val getSnapshotAndReportMetric = for {
snapshot <- OptionT.fromOption[Future](getMaybeSnapshotFromMetadata(metadata))
_ <- OptionT.liftF(Future {
observabilityService
.recordMetric(
LongCounterMetric(readVehicleSnapshotFromDiskCounter, SnapShotDirectoryScannerCommandOptions)
)
}(directorySnapshotScanningDispatcher))
} yield snapshot
getSnapshotAndReportMetric.value.recoverWith {
// retry if we listed an older snapshot that was deleted before loading
case _: NoSuchFileException => loadAsync(persistenceId, criteria)
}(streamDispatcher)
}
For reference here is the signature of getMaybeSnapshotFromMetadata. It's signature can be modified to add a Future, but ultimately the wrapped response must be of type Option[SelectedSnapshot].
private def getMaybeSnapshotFromMetadata(metadata: Seq[SnapshotMetadata]): Option[SelectedSnapshot]
The code as is compiles, but only because I have imported the implicit global ExecutionContext. My goal is to use different explicit execution contexts (different configurable dispatchers), but I don't see how to do it for the first line (the one that calls getMaybeSnapshotFromMetadata) within for-comprehensions. If I used OptionT.liftF then I could do it, e.g.
...
snapshot <- OptionT.liftF( Future { getMaybeSnapshotFromMetadata(metadata)}(streamDispatcher))
...but then I get a OptionT[Future, Option[SelectedSnapshot] as a result.
Is there a solution to what I want to achieve? If not I can work around with mere Futures and it's andThen chain method:
Future {
getMaybeSnapshotFromMetadata(metadata)
}(streamDispatcher)
.andThen(selectedSnapshot => {
observabilityService
.recordMetric(
LongCounterMetric(readVehicleSnapshotFromDiskCounter, SnapShotDirectoryScannerCommandOptions)
)
selectedSnapshot
})(opentelemetryDispatcher)
.recoverWith {
// retry if we listed an older snapshot that was deleted before loading
case _: NoSuchFileException => loadAsync(persistenceId, criteria)
}(streamDispatcher)
Update For the second line within for-comprehension liftF is actually a viable solution - I updated the code block.

Do not overuse cats. It's a nice tool for some things, but when used improperly, it just adds complexity and hurts readability without any benefit.
Future(getMaybeSnapshotFromMetadata(metadata))(streamDispatcher)
.andThen { case _ =>
observabilityService.recordWiseMetric(...)
}(directorySnapshotScanningDispatcher)
.recoverWith(...)(streamDispatcher)
This is equivalent to your code without all the complication ...
It is not a "workaround", but an actual proper way to write this.

Related

Multiple futures that may fail - returning both successes and failures?

I have a situation where I need to run a bunch of operations in parallel.
All operations have the same return value (say a Seq[String]).
Its possible that some of the operations may fail, and others successfully return results.
I want to return both the successful results, and any exceptions that happened, so I can log them for debugging.
Is there a built-in way, or easy way through any library (cats/scalaz) to do this, before I go and write my own class for doing this?
I was thinking of doing each operation in its own future, then checking each future, and returning a tuple of Seq[String] -> Seq[Throwable] where left value is the successful results (flattened / combined) and right is a list of any exceptions that occurred.
Is there a better way?
Using Await.ready, which you mention in a comment, generally loses most benefits from using futures. Instead you can do this just using the normal Future combinators. And let's do the more generic version, which works for any return type; flattening the Seq[String]s can be added easily.
def successesAndFailures[T](futures: Seq[Future[T]]): Future[(Seq[T], Seq[Throwable])] = {
// first, promote all futures to Either without failures
val eitherFutures: Seq[Future[Either[Throwable, T]]] =
futures.map(_.transform(x => Success(x.toEither)))
// then sequence to flip Future and Seq
val futureEithers: Future[Seq[Either[Throwable, T]]] =
Future.sequence(eitherFutures)
// finally, Seq of Eithers can be separated into Seqs of Lefts and Rights
futureEithers.map { seqOfEithers =>
val (lefts, rights) = seqOfEithers.partition(_.isLeft)
val failures = lefts.map(_.left.get)
val successes = rights.map(_.right.get)
(successes, failures)
}
}
Scalaz and Cats have separate to simplify the last step.
The types can be inferred by the compiler, they are shown just to help you see the logic.
Calling value on your Future returns an Option[Try[T]]. If the Future has not completed then the Option is None. If it has completed then it's easy to unwrap and process.
if (myFutr.isCompleted)
myFutr.value.map(_.fold( err: Throwable => //log the error
, ss: Seq[String] => //process results
))
else
// do something else, come back later
Sounds like a good use-case for the Try idiom (it's basically similar to the Either monad).
Example of usage from the doc:
import scala.util.{Success, Failure}
val f: Future[List[String]] = Future {
session.getRecentPosts
}
f onComplete {
case Success(posts) => for (post <- posts) println(post)
case Failure(t) => println("An error has occurred: " + t.getMessage)
}
It actually does a little bit more than what you asked because it is fully asynchronous. Does it fit your use-case?
I'd do it this way:
import scala.concurrent.{Future, ExecutionContext}
import scala.util.Success
def eitherify[A](f: Future[A])(implicit ec: ExecutionContext): Future[Either[Throwable, A]] = f.transform(tryResult => Success(tryResult.toEither))
def eitherifyF[A, B](f: A => Future[B])(implicit ec: ExecutionContext): A => Future[Either[Throwable, B]] = { a => eitherify(f(a)) }
// here we need some "cats" magic for `traverse` and `separate`
// instead of `traverse` you can use standard `Future.sequence`
// there is no analogue for `separate` in the standard library
import cats.implicits._
def myProgram[A, B](values: List[A], asyncF: A => Future[B])(implicit ec: ExecutionContext): Future[(List[Throwable], List[B])] = {
val appliedTransformations: Future[List[Either[Throwable, B]]] = values.traverse(eitherifyF(asyncF))
appliedTransformations.map(_.separate)
}

Error value flatMap is not a member of Product with Serializable in for with Future[Option]

The following block of code fails to build with error :
value flatMap is not a member of Product with Serializable
[error] if (matchingUser.isDefined) {
Here's the code:
for {
matchingUser <- userDao.findOneByEmail(email)
user <- {
if (matchingUser.isDefined) {
matchingUser.map(u => {
// update u with new values...
userDao.save(u)
u
})
} else {
val newUser = new User(email)
userDao.create(newUser)
newUser
}
}
} yield user
Method userDao.findOneByEmail(email) returns anFuture[Option[User]]object.
My Google searches are only aboutEitherwithRightandLeft` types.
Maybe I'm not doing this the proper way, please teach me how to properly do this.
The first branch of the if statement returns Option[User], the other one returns User. So, the result of the entire statement is inferred to have type Product with Serializable because it is the only common supertype of the two.
You could wrap the last statement inside the if into an Option (just do Option(newUser) instead of newUser) or, better yet, use fold instead of the whole if(matchingUser.isDefined) {...} thingy:
matchingUser.fold {
val u = new User(email)
userDao.create(u)
u
} { u =>
userDao.save(u)
u
}
This will make the result of that statement to be Option[User] as you probably intended ... but it still will not compile.
The problem is that you cannot mix different types of monads in the for-comprehension: since the first one was Future, all the others have to be as well. You can't have an Option there.
How to get around that? Well, one possibility is to make userDao.create and userDao.save return a future of the object they just saved. That is, probably a better thing to do in general, then what you have, because now you are returning the user before it was actually stored ... What if the create operation fails afterwards? Then you can just rewrite your for-comprehension like this:
for {
matchingUser <- userDao.findOneByEmail(email)
user <- matchingUser.fold(userDao.create(new User(email)))(userDao.save)
} yield user
Or just get rid of it entirely (for-comprehension is an overkill for simple cases like this):
userDao
.findOneByEmail(email)
.flatMap(_.fold(usrDao.create(new User(email)))(userDao.save))
Or, it may look a little nicer with pattern matching instead of fold in this case:
userDao
.findOneByEmail(email)
.flatMap {
case Some(u) => userDao.save(u)
case None => userDao.create(new User(email))
}
Here is my test example to reproduce your issue with solution.
Basically your user returns wrapper around Future (Option of Future), but it expected to be Future, as the first statement in for-comprehension.
That is why I've applied some addition unwrapping. See sample below.
Note: it does not looks so nice, I'd prefer to rewrite it with flatMap map.
object T {
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
def future1: Future[Option[Int]] = ???
def future2(i: Int): Future[Double] = ???
for {
matchingUser <- future1
user <- {
if (matchingUser.isDefined) {
matchingUser.map { i =>
future2(i)
}
} else {
Some(future2(42))
}
} match {
case None => Future.successful(-42.0)
case Some(x) => x
}
} yield user
}
And the same implemented with flatMap:
val userFuture = future1.flatMap {
case Some(i) => future2(i)
case None => future2(42)
}

Trying to understand Scala enumerator/iteratees

I am new to Scala and Play!, but have a reasonable amount of experience of building webapps with Django and Python and of programming in general.
I've been doing an exercise of my own to try to improve my understanding - simply pull some records from a database and output them as a JSON array. I'm trying to use the Enumarator/Iteratee functionality to do this.
My code follows:
TestObjectController.scala:
def index = Action {
db.withConnection { conn=>
val stmt = conn.createStatement()
val result = stmt.executeQuery("select * from datatable")
logger.debug(result.toString)
val resultEnum:Enumerator[TestDataObject] = Enumerator.generateM {
logger.debug("called enumerator")
result.next() match {
case true =>
val obj = TestDataObject(result.getString("name"), result.getString("object_type"),
result.getString("quantity").toInt, result.getString("cost").toFloat)
logger.info(obj.toJsonString)
Future(Some(obj))
case false =>
logger.warn("reached end of iteration")
stmt.close()
null
}
}
val consume:Iteratee[TestDataObject,Seq[TestDataObject]] = {
Iteratee.fold[TestDataObject,Seq[TestDataObject]](Seq.empty[TestDataObject]) { (result,chunk) => result :+ chunk }
}
val newIteree = Iteratee.flatten(resultEnum(consume))
val eventuallyResult:Future[Seq[TestDataObject]] = newIteree.run
eventuallyResult.onSuccess { case x=> println(x)}
Ok("")
}
}
TestDataObject.scala:
package models
case class TestDataObject (name: String, objtype: String, quantity: Int, cost: Float){
def toJsonString: String = {
val mapper = new ObjectMapper()
mapper.registerModule(DefaultScalaModule)
mapper.writeValueAsString(this)
}
}
I have two main questions:
How do i signal that the input is complete from the Enumerator callback? The documentation says "this method takes a callback function e: => Future[Option[E]] that will be called each time the iteratee this Enumerator is applied to is ready to take some input." but I am unable to pass any kind of EOF that I've found because it;s the wrong type. Wrapping it in a Future does not help, but instinctively I am not sure that's the right approach.
How do I get the final result out of the Future to return from the controller view? My understanding is that I would effectively need to pause the main thread to wait for the subthreads to complete, but the only examples I've seen and only things i've found in the future class is the onSuccess callback - but how can I then return that from the view? Does Iteratee.run block until all input has been consumed?
A couple of sub-questions as well, to help my understanding:
Why do I need to wrap my object in Some() when it's already in a Future? What exactly does Some() represent?
When I run the code for the first time, I get a single record logged from logger.info and then it reports "reached end of iteration". Subsequent runs in the same session call nothing. I am closing the statement though, so why do I get no results the second time around? I was expecting it to loop indefinitely as I don't know how to signal the correct termination for the loop.
Thanks a lot in advance for any answers, I thought I was getting the hang of this but obviously not yet!
How do i signal that the input is complete from the Enumerator callback?
You return a Future(None).
How do I get the final result out of the Future to return from the controller view?
You can use Action.async (doc):
def index = Action.async {
db.withConnection { conn=>
...
val eventuallyResult:Future[Seq[TestDataObject]] = newIteree.run
eventuallyResult map { data =>
OK(...)
}
}
}
Why do I need to wrap my object in Some() when it's already in a Future? What exactly does Some() represent?
The Future represents the (potentially asynchronous) processing to obtain the next element. The Option represents the availability of the next element: Some(x) if another element is available, None if the enumeration is completed.

Jedis in scala and handling errors

I am trying to find the best way to handle jedis commands from scala. I am trying to implement a finally block, and prevent the java exceptions from bubbling up to my caller.
Does the following code make sense, and is it the best I can do performance wise, if I want to ensure that I handle exceptions when redis may be down temporarily? This trait would be extended by an object, and I'd call objectname.del(key). I feel like I'm combining too many concepts (Either, Option, Try, feels like there should be a cleaner way)
trait MyExperiment {
implicit class TryOps[T](val t: Try[T]) {
def eventually[Ignore](effect: => Ignore): Try[T] = {
val ignoring = (_: Any) => { effect; t }
t transform (ignoring, ignoring)
}
}
val jpool:JedisPool = initialize()
// init the pool at object creation
private def initialize(): JedisPool =
{
val poolConfig = new JedisPoolConfig()
poolConfig.setMaxIdle(10)
poolConfig.setMinIdle(2)
poolConfig.setTestWhileIdle(true)
poolConfig.setTestOnBorrow(true)
poolConfig.setTestOnReturn(true)
poolConfig.setNumTestsPerEvictionRun(10)
new JedisPool( poolConfig , "localhost" )
}
// get a resource from pool. This can throw an error if redis is
// down
def getFromPool: Either[Throwable,Jedis] =
Try(jpool.getResource) match {
case Failure(m) => Left(m)
case Success(m) => Right(m)
}
// return an object to pool
// i believe this may also throw an error if redis is down?
def returnToPool(cache:Jedis): Unit =
Try(jpool.returnResource(cache))
// execute a command -- "del" in this case, (wrapped by
// the two methods above)
def del(key: String) : Option[Long] = {
getFromPool match {
case Left(m) => None
case Right(m) => Try(m.del(key)) eventually returnToPool(m) match {
case Success(r) => Option(r)
case Failure(r) => None
}
}
}
}
Not an exact answer, but I moved on after doing some performance testing. Using the standard java-ish exception blocks ended up being much faster at high iterations (at 10,000 iterations, it was about 2.5x faster than the (bad) code above). That also cleaned up my code, although it's more verbose.
So the answer I arrived at is to use the Java-style exception blocks which provide for the finally construct. I believe it should be significantly faster, as long as exceptions are a very rare occurance.

Simple Scala actor question

I'm sure this is a very simple question, but embarrassed to say I can't get my head around it:
I have a list of values in Scala.
I would like to use use actors to make some (external) calls with each value, in parallel.
I would like to wait until all values have been processed, and then proceed.
There's no shared values being modified.
Could anyone advise?
Thanks
There's an actor-using class in Scala that's made precisely for this kind of problem: Futures. This problem would be solved like this:
// This assigns futures that will execute in parallel
// In the example, the computation is performed by the "process" function
val tasks = list map (value => scala.actors.Futures.future { process(value) })
// The result of a future may be extracted with the apply() method, which
// will block if the result is not ready.
// Since we do want to block until all results are ready, we can call apply()
// directly instead of using a method such as Futures.awaitAll()
val results = tasks map (future => future.apply())
There you go. Just that.
Create workers and ask them for futures using !!; then read off the results (which will be calculated and come in in parallel as they're ready; you can then use them). For example:
object Example {
import scala.actors._
class Worker extends Actor {
def act() { Actor.loop { react {
case s: String => reply(s.length)
case _ => exit()
}}}
}
def main(args: Array[String]) {
val arguments = args.toList
val workers = arguments.map(_ => (new Worker).start)
val futures = for ((w,a) <- workers zip arguments) yield w !! a
val results = futures.map(f => f() match {
case i: Int => i
case _ => throw new Exception("Whoops--didn't expect to get that!")
})
println(results)
workers.foreach(_ ! None)
}
}
This does a very inexpensive computation--calculating the length of a string--but you can put something expensive there to make sure it really does happen in parallel (the last thing that case of the act block should be to reply with the answer). Note that we also include a case for the worker to shut itself down, and when we're all done, we tell the workers to shut down. (In this case, any non-string shuts down the worker.)
And we can try this out to make sure it works:
scala> Example.main(Array("This","is","a","test"))
List(4, 2, 1, 4)