I have a question about running async operations and acting on their results by priority.
Consider the following code:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
// mock database call
def isSiteExcludedAtList(listId: Int, siteId: Int): Future[Some[Int]] = Future {
Some(1)
}
// main logic
def isExcluded(listIds: List[Int], subdomainHash: Int, domainHash: Int): Future[String] = {
val domainFutures: Future[List[Option[Int]]] = Future.traverse(listIds)(listId => isSiteExcludedAtList(listId, domainHash))
val subDomainFutures: Future[List[Option[Int]]] = Future.traverse(listIds)(listId => isSiteExcludedAtList(listId, subdomainHash))
// Is there other way?
for {
res <- Future.sequence(
List(
subDomainFutures.map(res => "subdomain" -> res),
domainFutures.map(res => "domain" -> res)
)
)
} yield {
val subdomainExclusion: List[Int] = res.filter(_._1 == "subdomain").flatMap(_._2).flatten
val domainExclusion: List[Int] = res.filter(_._1 == "domain").flatMap(_._2).flatten
if (subdomainExclusion.nonEmpty) {
s"its subdomain exclusion with results: ${subdomainExclusion}"
}
else {
s"its domain exclusion with results: ${domainExclusion}"
}
}
}
What I want to achieve:
isSiteExcludedAtList returns an Int from the database. It is mocked in my example, but it is actually an async call that looks up an int value under a key built from both listId and siteId.
I want to create subDomainFutures and domainFutures and start running them together.
I want to check whether subDomainFutures produced any result; if so, it is a subdomain exclusion and I want to return that.
If none of the subDomainFutures return a result, I want to check domainFutures and return a result based on those.
Note: returning as soon as a single subdomain result arrives is an optional optimization.
Is there a cleaner way to achieve this?
Thanks!
So you want to fetch both domains and subdomains in parallel, you also want to concurrently execute as many isSiteExcludedAtList as possible. And additionally, if there is at least one subdomain, you want to cancel the domains.
That can easily be represented using cats-effect and fs2 by taking advantage of IO
(The following code assumes that isSiteExcludedAtList returns an IO[Option[Int]])
import cats.effect.IO
import cats.syntax.all._
import fs2.Stream
import fs2.concurrent.SignallingRef
def isExcluded(listIds: List[Int], subdomainHash: Int, domainHash: Int): IO[Unit] = {
def parallelStreamFor(siteId: Int): Stream[IO, Int] =
Stream
.emits(listIds)
.covary[IO]
.parEvalMapUnordered(maxConcurrent = 2)(listId => isSiteExcludedAtList(listId, siteId))
.collect {
case Some(result) => result
}
SignallingRef[IO].of(false).flatMap { signal =>
val processSubdomains =
parallelStreamFor(siteId = subdomainHash)
.evalTap(_ => signal.set(true))
.compile
.toList
val processDomains =
parallelStreamFor(siteId = domainHash)
.interruptWhen(signal)
.compile
.toList
(processSubdomains,processDomains).parTupled
} flatMap {
case (subdomainExclusions, domainExclusions) =>
if (subdomainExclusions.nonEmpty)
IO.println(s"Its subdomain exclusion with result: ${subdomainExclusions}")
else if (domainExclusions.nonEmpty)
IO.println(s"Its domain exclusion with result: ${domainExclusions}")
else
IO.println("All subdomains and domains are included!")
}
}
A couple of considerations:
If the order of elements matters, then replace parEvalMapUnordered with just parEvalMap which is a little bit less efficient.
Adjust the value of maxConcurrent so it makes sense given your workload.
If you would rather keep each stream sequential and just run the two streams concurrently, you can replace the parEvalMapUnordered + collect pair with a single call to evalMapFilter.
You can easily integrate this in your codebase without needing to refactor too much thanks to IO.fromFuture and IO.unsafeToFuture()
You can see the code running here.
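The IO.fromFuture / unsafeToFuture() bridge mentioned above could look roughly like the sketch below. It assumes cats-effect 3 and uses a hypothetical Future-based function, isSiteExcludedAtListLegacy, as a stand-in for the existing database call; neither the name nor the mock logic comes from the original code.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global // supplies the implicit IORuntime for unsafeToFuture
import scala.concurrent.Future

// Hypothetical stand-in for the existing Future-based database call.
def isSiteExcludedAtListLegacy(listId: Int, siteId: Int): Future[Option[Int]] =
  Future.successful(if (listId == 2) Some(siteId) else None)

// Future -> IO: the Future is wrapped in IO(...) so it only starts when the IO runs.
def isSiteExcludedAtList(listId: Int, siteId: Int): IO[Option[Int]] =
  IO.fromFuture(IO(isSiteExcludedAtListLegacy(listId, siteId)))

// IO -> Future: for callers that still expect a Future.
def asFuture(listId: Int, siteId: Int): Future[Option[Int]] =
  isSiteExcludedAtList(listId, siteId).unsafeToFuture()
```

This keeps the IO-based logic internal while the rest of the codebase continues to see plain Futures.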
Edit
OLD AND WRONG ANSWER
If I understood the problem correctly, you want to stop processing at the first result to return a Some
If you are open to using cats-effect, that is pretty easy to achieve like this:
import cats.effect.IO
import cats.syntax.all._
def isSiteExcludedAtList(listId: Int, siteId: Int): IO[Option[Int]] =
IO.println(s"Computing for ${listId} - ${siteId}").as(Some(10))
def isExcluded(listIds: List[Int], subdomainHash: Int, domainHash: Int): IO[Unit] = {
val processSubdomains =
listIds.collectFirstSomeM(listId => isSiteExcludedAtList(listId, siteId = subdomainHash))
val processDomains =
listIds.collectFirstSomeM(listId => isSiteExcludedAtList(listId, siteId = domainHash))
processSubdomains.flatMap {
case Some(subdomainExclusion) =>
IO.println(s"Its subdomain exclusion with result: ${subdomainExclusion}")
case None =>
processDomains.flatMap {
case Some(domainExclusion) =>
IO.println(s"Its domain exclusion with result: ${domainExclusion}")
case None =>
IO.println("All subdomains and domains are included!")
}
}
}
You can see the code running here
Note: Another approach would be to tag each computation with its origin (domain or subdomain), combine them all in one big list, and perform a single collectFirstSomeM; both approaches are equivalent.
Something like this maybe?
subDomainFutures.map(_.flatten).flatMap {
case sds if (sds.nonEmpty) => Future.successful(sds -> Nil)
case _ => domainFutures.map(_.flatten).map(Nil -> _)
}.map {
case (sds, _) if (sds.nonEmpty) => s"subdomain exclusion $sds"
case (_, ds) if (ds.nonEmpty) => s"domain exclusion $ds"
case _ => "no exclusion"
}
Or, maybe, pull domain queries up to the same level too:
subDomainFutures.zip(domainFutures)
.map { case (s, d) => (s.flatten, d.flatten) }
.map {
case (sds, _) if (sds.nonEmpty) => s"subdomain exclusion $sds"
case (_, ds) if (ds.nonEmpty) => s"domain exclusion $ds"
case _ => "no exclusion"
}
I think it's more or less the same thing you are doing, just expressed a little more straightforwardly.
One downside is it will wait for all subdomain queries to come back even if the very first one returns a result (the second variant looks a little "slicker", but it also waits for all domain queries unconditionally, which is an additional inefficiency).
There are ways to optimize that (nothing is impossible!) but I can't think of any that wouldn't look excessively complicated for the use case to me.
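For reference, one possible shape of such an optimization with plain Futures is sketched below. It is only an illustration, not something the code above relies on: firstSome is a hypothetical helper that completes as soon as any future yields a Some. Note that it cannot cancel the remaining futures; they keep running, and their results are simply ignored.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.util.{Failure, Success}

// Completes with the first Some(...) produced by any future, or with None
// once every future has completed with None. A failed future fails the
// result unless the promise has already been completed.
def firstSome[A](futures: List[Future[Option[A]]])(implicit ec: ExecutionContext): Future[Option[A]] = {
  val promise = Promise[Option[A]]()
  val pending = new AtomicInteger(futures.size)
  if (futures.isEmpty) promise.trySuccess(None)
  futures.foreach(_.onComplete {
    case Success(hit @ Some(_)) => promise.trySuccess(hit)
    case Success(None)          => if (pending.decrementAndGet() == 0) promise.trySuccess(None)
    case Failure(error)         => promise.tryFailure(error)
  })
  promise.future
}

// Usage sketch: one miss and one hit.
def firstSomeDemo(implicit ec: ExecutionContext): Future[Option[Int]] =
  firstSome(List(Future.successful(Option.empty[Int]), Future.successful(Option(7))))
```

Whether the extra moving parts are worth it depends on how slow the individual queries are.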
I'd like to describe how to improve your code a bit while still using futures, but I'm a bit confused about what this code is doing. What is the number that isSiteExcludedAtList returns? Is it an identifier, and you want to collect identifiers for all list ids, and your only concern is to avoid querying with domainHash when querying with subdomainHash is enough? That's what your code seems to be doing. But if I understand the answer above correctly (the one with cats-effect and collectFirstSomeM), that code looks only for the first result that is Some(number) and then stops. For example, if the very first call to isSiteExcludedAtList returns Some(1), nothing more is called.
So, I have three answers for you.
The first applies if you want to collect a list of ints and only want to avoid calling isSiteExcludedAtList with domainHash when the calls with subdomainHash already gave you some results. In this case you can chain the two Future.traverse calls and run the second one only if the first returns no results.
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
// mock database call
def isSiteExcludedAtList(listId: Int, siteId: Int): Future[Option[Int]] =
Future { Some(1) }
// main logic
def isExcluded(listIds: List[Int], subdomainHash: Int, domainHash: Int): Future[String] =
for {
res1 <- Future.traverse(listIds)(isSiteExcludedAtList(_, subdomainHash))
subIds = res1.flatten
res2 <- if (subIds.isEmpty)
Future.traverse(listIds)(isSiteExcludedAtList(_, domainHash))
else
Future.successful(Nil)
domIds = res2.flatten
} yield
if (subIds.nonEmpty)
s"its subdomain exclusion with results: ${subIds}"
else if (domIds.nonEmpty)
s"its domain exclusion with results: ${domIds}"
else
"no exclusion"
The second applies if you are looking for the first result indicating that the listId is excluded and want to stop querying after that. In that case, all calls to isSiteExcludedAtList must be chained, i.e. you call the next one only when you get no result from the previous one. It can be done with recursion:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
// mock database call
def isSiteExcludedAtList(listId: Int, siteId: Int): Future[Option[Int]] =
Future { Some(1) }
def isSiteExcludedAtList(listIds: List[Int], hash: Int): Future[Option[Int]] =
listIds match {
case Nil =>
Future.successful(None)
case head :: tail =>
isSiteExcludedAtList(head, hash).flatMap {
case Some(id) => Future.successful(Some(id))
case None => isSiteExcludedAtList(tail, hash)
}
}
// if you use Scala 3, change this to an enum
sealed trait Exclusion
final case class SubdomainExclusion(id: Int) extends Exclusion
final case class DomainExclusion(id: Int) extends Exclusion
case object NoExclusion extends Exclusion
// main logic
def isExcluded(listIds: List[Int], subdomainHash: Int, domainHash: Int): Future[String] =
isSiteExcludedAtList(listIds, subdomainHash).flatMap {
case Some(id) =>
Future.successful(SubdomainExclusion(id))
case None =>
isSiteExcludedAtList(listIds, domainHash).map {
case Some(id) => DomainExclusion(id)
case None => NoExclusion
}
}.map {
case SubdomainExclusion(id) => s"subdomain exclusion $id"
case DomainExclusion(id) => s"domain exclusion: $id"
case NoExclusion => "no exclusion"
}
And the third possibility is that instead of using Future.traverse and asking for each listId separately, you will implement a query that will return all excluded ids for a given hash - subdomainHash or domainHash, and then you will just check if a common set of your listIds and ids returned by that query is non-empty. The code will be similar to that from my first answer, but it will make only two calls to the database. I'm writing about it because from my experience it's a common pattern in dealing with databases: we have some already written queries and as our code becomes more complex we start to use those queries in loops, which leads to sub-optimal performance, while instead we could write a bit more complex query which we would call only once.
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
// mock database call
def isSiteExcludedAtListBulk(siteId: Int): Future[Set[Int]] =
Future { Set(10, 20, 30) }
// main logic
def isExcluded(listIds: List[Int], subdomainHash: Int, domainHash: Int): Future[String] =
for {
excludedSubIds <- isSiteExcludedAtListBulk(subdomainHash)
subIds = listIds.filter(excludedSubIds)
excludedDomIds <- if (subIds.isEmpty)
isSiteExcludedAtListBulk(domainHash)
else
Future.successful(Set.empty[Int])
domIds = listIds.filter(excludedDomIds)
} yield
if (subIds.nonEmpty)
s"its subdomain exclusion with results: ${subIds}"
else if (domIds.nonEmpty)
s"its domain exclusion with results: ${domIds}"
else
"no exclusion"
Related
I have a sequence of parameters. For each parameter I have to perform a DB query, which may or may not return a result. Simply put, I need to stop after the first non-empty result. Of course, I would like to avoid unnecessary calls. The caveat is that I need this whole operation to be contained in another Future, or whatever the "most reactive" approach is.
Speaking of code:
// what I have
def dbQuery(p: Param): Future[Option[Result]] = ???
// my list of params
val input = Seq(p1, p2, p3)
// what I need to implement
def getFirstNonEmpty(params: Seq[Param]): Future[Option[Result]] = ???
I know I could just wrap the entire function in yet another Future and execute the code sequentially (Await? Brrr...), but that is not the cleanest solution.
Can I somehow create a lazily initialized collection of futures, like
params.map ( p => FutureWhichWontStartUnlessAskedWhichWrapsOtherFuture { dbQuery(p) }).findFirst(!_.isEmpty())
I believe it's possible!
What do you think about something like this?
def getFirstNonEmpty(params: Seq[Param]): Future[Option[Result]] = {
params.foldLeft(Future.successful(Option.empty[Result])) { (accuFtrOpt, param) =>
accuFtrOpt.flatMap {
case None => dbQuery(param)
case result => Future.successful(result)
}
}
}
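To check that the fold really skips the remaining queries, here is a small self-contained sketch. It assumes a slightly generalized signature in which the query function is passed in as a parameter, and uses a mock query (hypothetical, not from the question) that records which parameters were actually queried.

```scala
import scala.collection.mutable.ListBuffer
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// Same fold as above, with the query function passed in explicitly.
def getFirstNonEmpty[P, R](params: Seq[P])(query: P => Future[Option[R]])(
    implicit ec: ExecutionContext): Future[Option[R]] =
  params.foldLeft(Future.successful(Option.empty[R])) { (acc, param) =>
    acc.flatMap {
      case None   => query(param)              // nothing found yet: run the next query
      case result => Future.successful(result) // already found: pass it through unchanged
    }
  }

// Runs the fold against a mock query that only "finds" something for params >= 2.
def demo(): (Option[String], List[Int]) = {
  import ExecutionContext.Implicits.global
  val queried = ListBuffer.empty[Int]
  def mockQuery(p: Int): Future[Option[String]] = Future {
    queried.synchronized { queried += p }
    if (p >= 2) Some(s"row-$p") else None
  }
  val found = Await.result(getFirstNonEmpty(Seq(1, 2, 3))(mockQuery), 5.seconds)
  (found, queried.synchronized { queried.toList })
}
// demo() returns (Some("row-2"), List(1, 2)): parameter 3 is never queried.
```

The fold still visits every element of params, but once a result has been found each remaining step only wraps the existing value.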
This might be overkill, but if you are open to using scalaz we can do this using OptionT and foldMap.
With OptionT we sort of combine Future and Option into one structure. We can get the first of two Futures with a non-empty result using OptionT.orElse.
import scalaz._, Scalaz._
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
val someF: Future[Option[Int]] = Future.successful(Some(1))
val noneF: Future[Option[Int]] = Future.successful(None)
val first = OptionT(noneF) orElse OptionT(someF)
first.run // Future[Option[Int]] = Success(Some(1))
We could now get the first non-empty Future from a List with reduce from the standard library (this will however run all the Futures) :
List(noneF, noneF, someF).map(OptionT.apply).reduce(_ orElse _).run
But with a List (or other collection) we can't be sure that there is at least one element, so we need to use fold and pass a start value. Scalaz can do this work for us by using a Monoid. The Monoid[OptionT[Future, Int]] we will use will supply the start value and combine the Futures with the orElse used above.
type Param = Int
type Result = Int
type FutureO[x] = OptionT[Future, x]
def query(p: Param): Future[Option[Result]] =
Future.successful{ println(p); if (p > 2) Some(p) else None }
def getFirstNonEmpty(params: List[Param]): Future[Option[Result]] = {
implicit val monoid = PlusEmpty[FutureO].monoid[Result]
params.foldMap(p => OptionT(query(p))).run
}
val result = getFirstNonEmpty(List(1,2,3,4))
// prints 1, 2, 3
result.foreach(println) // Some(3)
This is an old question, but if someone comes looking for an answer, here is my take. I solved it for a use case that required me to loop through a limited number of futures sequentially and stop when the first of them returned a result.
I did not need a library for my use-case, a light-weight combination of recursion and pattern matching was sufficient. Although the question here does not have the same problem as a sequence of futures, looping through a sequence of parameters would be similar.
Here is a sketch based on recursion. I have not compiled it against a real dbQuery, so adjust the types as needed.
def getFirstNonEmpty(params: Seq[Param]): Future[Option[Result]] = {
if (params.isEmpty) {
Future.successful(None)
} else {
// dbQuery returns a Future, so chain on it with flatMap instead of matching on it directly
dbQuery(params.head).flatMap {
case Some(v) => Future.successful(Some(v))
case None => getFirstNonEmpty(params.tail)
}
}
}
I've got an ADT that's essentially a cross between Option and Try:
sealed trait Result[+T]
case object Empty extends Result[Nothing]
case class Error(cause: Throwable) extends Result[Nothing]
case class Success[T](value: T) extends Result[T]
(assume common combinators like map, flatMap etc are defined on Result)
Given an Iteratee[A, Result[B]] called inner, I want to create a new Iteratee[Result[A], Result[B]] with the following behavior:
If the input is a Success(a), feed a to inner
If the input is an Empty, no-op
If the input is an Error(err), I want inner to be completely ignored, instead returning a Done iteratee with the Error(err) as its result.
Example Behavior:
// inner: Iteratee[Int, Result[List[Int]]]
// inputs:
1
2
3
// output:
Success(List(1,2,3))
// wrapForResultInput(inner): Iteratee[Result[Int], Result[List[Int]]]
// inputs:
Success(1)
Success(2)
Error(Exception("uh oh"))
Success(3)
// output:
Error(Exception("uh oh"))
This sounds to me like the job for an Enumeratee, but I haven't been able to find anything in the docs that looks like it'll do what I want, and the internal implementations are still voodoo to me.
How can I implement wrapForResultInput to create the behavior described above?
Adding some more detail that won't really fit in a comment:
Yes it looks like I was mistaken in my question. I described it in terms of Iteratees but it seems I really am looking for Enumeratees.
At a certain point in the API I'm building, there's a Transformer[A] class that is essentially an Enumeratee[Event, Result[A]]. I'd like to allow clients to transform that object by providing an Enumeratee[Result[A], Result[B]], which would result in a Transformer[B] aka an Enumeratee[Event, Result[B]].
For a more complex example, suppose I have a Transformer[AorB] and want to turn that into a Transformer[(A, List[B])]:
// the Transformer[AorB] would give
a, b, a, b, b, b, a, a, b
// but the client wants to have
a -> List(b),
a -> List(b, b, b),
a -> Nil
a -> List(b)
The client could implement an Enumeratee[AorB, Result[(A, List[B])]] without too much trouble using Enumeratee.grouped, but they are required to provide an Enumeratee[Result[AorB], Result[(A, List[B])]] which seems to introduce a lot of complication that I'd like to hide from them if possible.
val easyClientEnumeratee = Enumeratee.grouped[AorB]{
for {
_ <- Enumeratee.dropWhile(_ != a) ><> Iteratee.ignore
headResult <- Iteratee.head.map{ Result.fromOption }
bs <- Enumeratee.takeWhile(_ == b) ><> Iteratee.getChunks
} yield headResult.map{_ -> bs}
}
val harderEnumeratee = ??? ><> easyClientEnumeratee
val oldTransformer: Transformer[AorB] = ... // assume it already exists
val newTransformer: Transformer[(A, List[B])] = oldTransformer.andThen(harderEnumeratee)
So what I'm looking for is the ??? to define the harderEnumeratee in order to ease the burden on the user who already implemented easyClientEnumeratee.
I guess the ??? should be an Enumeratee[Result[AorB], AorB], but if I try something like
Enumeratee.collect[Result[AorB]] {
case Success(ab) => ab
case Error(err) => throw err
}
the error will actually be thrown; I actually want the error to come back out as an Error(err).
The simplest implementation would use the Iteratee.fold2 method, which can collect elements until a stop condition is reached.
Since you return a single result and can't really return anything until you have verified that there are no errors, an Iteratee is enough for this task.
def listResults[E] = Iteratee.fold2[Result[E], Either[Throwable, List[E]]](Right(Nil)) { (state, elem) =>
val Right(list) = state
val next = elem match {
case Empty => (Right(list), false)
case Success(x) => (Right(x :: list), false)
case Error(t) => (Left(t), true)
}
Future(next)
} map {
case Right(list) => Success(list.reverse)
case Left(th) => Error(th)
}
Now, if we prepare a little playground
import scala.concurrent.ExecutionContext.Implicits._
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
val good = Enumerator.enumerate[Result[Int]](
Seq(Success(1), Empty, Success(2), Success(3)))
val bad = Enumerator.enumerate[Result[Int]](
Seq(Success(1), Success(2), Error(new Exception("uh oh")), Success(3)))
def runRes[X](e: Enumerator[Result[X]]) : Result[List[X]] = Await.result(e.run(listResults), 3.seconds)
we can verify those results
runRes(good) //res0: Result[List[Int]] = Success(List(1, 2, 3))
runRes(bad) //res1: Result[List[Int]] = Error(java.lang.Exception: uh oh)
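For comparison, the same stop-at-first-Error semantics can be sketched on a plain in-memory List, without Play iteratees. This is only an illustration of the intended behavior under the Result ADT from the question; collectResults is a hypothetical name, and it does not replace the streaming version above.

```scala
sealed trait Result[+T]
case object Empty extends Result[Nothing]
final case class Error(cause: Throwable) extends Result[Nothing]
final case class Success[T](value: T) extends Result[T]

// Collects Success values in order, skips Empty, and stops at the first Error.
def collectResults[A](inputs: List[Result[A]]): Result[List[A]] = {
  @annotation.tailrec
  def loop(rest: List[Result[A]], acc: List[A]): Result[List[A]] = rest match {
    case Nil                => Success(acc.reverse)
    case Success(a) :: tail => loop(tail, a :: acc)
    case Empty :: tail      => loop(tail, acc)
    case Error(err) :: _    => Error(err) // everything after the error is ignored
  }
  loop(inputs, Nil)
}
```

The accumulator plays the role of the Right(list) state in the fold2 version, and returning Error(err) early corresponds to the `true` "done" flag.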
I have to get a list of issues for each file of a given list from a REST API with Scala. I want to do the requests in parallel, and use the Dispatch library for this. My method is called from a Java framework and I have to wait at the end of this method for the result of all the futures to yield the overall result back to the framework. Here's my code:
def fetchResourceAsJson(filePath: String): dispatch.Future[json4s.JValue]
def extractLookupId(json: org.json4s.JValue): Option[String]
def findLookupId(filePath: String): Future[Option[String]] =
for (json <- fetchResourceAsJson(filePath))
yield extractLookupId(json)
def searchIssuesJson(lookupId: String): Future[json4s.JValue]
def extractIssues(json: org.json4s.JValue): Seq[Issue]
def findIssues(lookupId: String): Future[Seq[Issue]] =
for (json <- searchIssuesJson(lookupId))
yield extractIssues(json)
def getFilePathsToProcess: List[String]
def thisIsCalledByJavaFramework(): java.util.Map[String, java.util.List[Issue]] = {
val finalResultPromise = Promise[Map[String, Seq[Issue]]]()
// (1) inferred type of issuesByFile not as expected, cannot get
// the type system happy, would like to have Seq[Future[(String, Seq[Issue])]]
val issuesByFile = getFilePathsToProcess map { f =>
findLookupId(f).flatMap { lookupId =>
(f, findIssues(lookupId)) // I want to yield a tuple (String, Seq[Issue]) here
}
}
Future.sequence(issuesByFile) onComplete {
case Success(x) => finalResultPromise.success(x) // (2) how to return x here?
case Failure(x) => // (3) how to return null from here?
}
//TODO transform finalResultPromise to Java Map
}
This code snippet has several issues. First, I'm not getting the type I would expect for issuesByFile (1). I would like to just ignore the result of findLookupId if it is not able to find the lookup ID (i.e., None). I've read in various tutorials that Future[Option[X]] is not easy to handle in function compositions and for expressions in Scala. So I'm also curious what the best practices are to handle these properly.
Second, I somehow have to wait for all futures to finish, but don't know how to return the result to the calling Java framework (2). Can I use a promise here to achieve this? If yes, how can I do it?
And last but not least, in case of any errors, I would just like to return null from thisIsCalledByJavaFramework but don't know how (3).
Any help is much appreciated.
Thanks,
Michael
Several points:
The first problem at (1) is that you don't handle the case where findLookupId returns None. You need to decide what to do in this case. Fail the whole process? Exclude that file from the list?
The second problem at (1) is that findIssues will itself return a Future, which you need to map before you can build the result tuple
There's a shortcut for map and then Future.sequence: Future.traverse
If you cannot change the result type of the method because the Java interface is fixed and cannot be changed to support Futures itself, you must wait for the Future to complete. Use Await.ready or Await.result to do that.
Taking all that into account and choosing to ignore files for which no id could be found results in this code:
// `None` in an entry for a file means that no id could be found
def entryForFile(file: String): Future[(String, Option[Seq[Issue]])] =
findLookupId(file).flatMap {
// the need for this kind of pattern match shows
// the difficulty of working with `Future[Option[T]]`
case Some(id) ⇒ findIssues(id).map(issues ⇒ file -> Some(issues))
case None ⇒ Future.successful(file -> None)
}
def thisIsCalledByJavaFramework(): java.util.Map[String, java.util.List[Issue]] = {
val issuesByFile: Future[Seq[(String, Option[Seq[Issue]])]] =
Future.traverse(getFilePathsToProcess)(entryForFile)
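import scala.collection.JavaConverters._
import scala.concurrent.Await
import scala.concurrent.duration._
import scala.util.control.NonFatal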
try
Await.result(issuesByFile, 10.seconds)
.collect {
// here we choose to ignore entries where no id could be found
case (f, Some(issues)) ⇒ f -> issues
}
.toMap.mapValues(_.asJava).asJava
catch {
case NonFatal(_) ⇒ null
}
}
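The Future.traverse shortcut mentioned in the points above can be seen in isolation in the following sketch. issueCount is a hypothetical stand-in for the per-file lookup (it just pairs a file name with its length); the two functions produce the same result.

```scala
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical per-file lookup: pairs each file name with its length.
def issueCount(file: String)(implicit ec: ExecutionContext): Future[(String, Int)] =
  Future(file -> file.length)

// map followed by Future.sequence ...
def viaSequence(files: List[String])(implicit ec: ExecutionContext): Future[List[(String, Int)]] =
  Future.sequence(files.map(f => issueCount(f)))

// ... is what Future.traverse does in one step.
def viaTraverse(files: List[String])(implicit ec: ExecutionContext): Future[List[(String, Int)]] =
  Future.traverse(files)(f => issueCount(f))
```

Both variants start all lookups eagerly and preserve the order of the input list.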
Suppose I have few case classes and functions to test them:
case class PersonName(...)
case class Address(...)
case class Phone(...)
def testPersonName(pn: PersonName): Either[String, PersonName] = ...
def testAddress(a: Address): Either[String, Address] = ...
def testPhone(p: Phone): Either[String, Phone] = ...
Now I define a new case class Person and a test function, which fails fast.
case class Person(name: PersonName, address: Address, phone: Phone)
def testPerson(person: Person): Either[String, Person] = for {
pn <- testPersonName(person.name).right
a <- testAddress(person.address).right
p <- testPhone(person.phone).right
} yield person;
Now I would like function testPerson to accumulate the errors rather than just fail fast.
I would like testPerson to always execute all those test* functions and return Either[List[String], Person]. How can I do that?
You want to isolate the test* methods and stop using a comprehension!
Assuming (for whatever reason) that scalaz isn't an option for you... it can be done without having to add the dependency.
Unlike a lot of scalaz examples, this is one where the library doesn't reduce verbosity much more than "regular" scala can:
def testPerson(person: Person): Either[List[String], Person] = {
val name = testPersonName(person.name)
val addr = testAddress(person.address)
val phone = testPhone(person.phone)
val errors = List(name, addr, phone) collect { case Left(err) => err }
if(errors.isEmpty) Right(person) else Left(errors)
}
Scala's for-comprehensions (which desugar to a combination of calls to flatMap and map) are designed to allow you to sequence monadic computations in such a way that you have access to the result of earlier computations in subsequent steps. Consider the following:
def parseInt(s: String) = try Right(s.toInt) catch {
case _: Throwable => Left("Not an integer!")
}
def checkNonzero(i: Int) = if (i == 0) Left("Zero!") else Right(i)
def inverse(s: String): Either[String, Double] = for {
i <- parseInt(s).right
v <- checkNonzero(i).right
} yield 1.0 / v
This won't accumulate errors, and in fact there's no reasonable way that it could. Suppose we call inverse("foo"). Then parseInt will obviously fail, which means there's no way we can have a value for i, which means there's no way we could move on to the checkNonzero(i) step in the sequence.
In your case your computations don't have this kind of dependency, but the abstraction you're using (monadic sequencing) doesn't know that. What you want is an Either-like type that isn't monadic, but that is applicative. See my answer here for some details about the difference.
For example, you could write the following with Scalaz's Validation without changing any of your individual validation methods:
import scalaz._, syntax.apply._, syntax.std.either._
def testPerson(person: Person): Either[List[String], Person] = (
testPersonName(person.name).validation.toValidationNel |@|
testAddress(person.address).validation.toValidationNel |@|
testPhone(person.phone).validation.toValidationNel
)(Person).leftMap(_.list).toEither
Although of course this is more verbose than necessary and is throwing away some information, and using Validation throughout would be a little cleaner.
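To make the applicative idea concrete without Scalaz, here is a minimal standard-library sketch. validateAll is a hypothetical helper (not a Scalaz API) that evaluates three independent Either values and collects every Left; Contact and the three test functions are simplified stand-ins for the case classes in the question.

```scala
// Combines three independent checks; all Lefts are collected, in order.
def validateAll[E, A, B, C, R](a: Either[E, A], b: Either[E, B], c: Either[E, C])(
    build: (A, B, C) => R): Either[List[E], R] =
  (a, b, c) match {
    case (Right(x), Right(y), Right(z)) => Right(build(x, y, z))
    case _                              => Left(List(a, b, c).collect { case Left(e) => e })
  }

// Simplified stand-ins for the question's types and validators.
final case class Contact(name: String, age: Int, phone: String)

def testName(n: String): Either[String, String] =
  if (n.nonEmpty) Right(n) else Left("empty name")
def testAge(a: Int): Either[String, Int] =
  if (a >= 0) Right(a) else Left("negative age")
def testPhone(p: String): Either[String, String] =
  if (p.nonEmpty && p.forall(_.isDigit)) Right(p) else Left("bad phone")

def testContact(c: Contact): Either[List[String], Contact] =
  validateAll(testName(c.name), testAge(c.age), testPhone(c.phone))(Contact.apply)
```

Because the three arguments are evaluated before validateAll runs, no check can depend on the result of another, which is exactly the restriction that makes accumulation possible.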
As @TravisBrown is telling you, for comprehensions don't really mix with error accumulation. In fact, you generally use them when you don't want fine-grained error control.
A for comprehension will "short-circuit" itself on the first error found, and this is almost always what you want.
The bad thing you are doing is using String to do flow control of exceptions. You should at all times use Either[Exception, Whatever] and fine-tune logging with scala.util.control.NoStackTrace and scala.util.control.NonFatal.
There are much better alternatives, specifically:
scalaz.EitherT and scalaz.ValidationNel.
Update:(this is incomplete, I don't know exactly what you want). You have better options than matching, such as getOrElse and recover.
def testPerson(person: Person): Person = {
val attempt = Try {
val pn = testPersonName(person.name)
val a = testAddress(person.address)
testPhone(person.phone)
}
attempt match {
case Success(person) => //..
case Failure(exception) => //..
}
}
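The earlier advice in this answer about Either[Exception, Whatever] combined with NoStackTrace and NonFatal could be sketched as follows. ValidationError and parsePort are hypothetical names chosen for illustration; they are not part of the question's code.

```scala
import scala.util.control.{NoStackTrace, NonFatal}

// A lightweight error type: NoStackTrace skips the cost of filling in a stack trace.
final case class ValidationError(message: String) extends Exception(message) with NoStackTrace

// Returns the parsed port, or the exception describing why parsing failed.
def parsePort(raw: String): Either[Exception, Int] =
  try {
    val port = raw.trim.toInt
    if (port >= 1 && port <= 65535) Right(port)
    else Left(ValidationError(s"port out of range: $port"))
  } catch {
    // NonFatal lets fatal errors (e.g. OutOfMemoryError) propagate instead of being captured.
    case NonFatal(e: Exception) => Left(e)
  }
```

With this shape, callers can still pattern match on the concrete exception type when they need finer control.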
Starting in Scala 2.13, we can partitionMap a List of Eithers in order to partition elements based on their Either's side.
// def testName(pn: Name): Either[String, Name] = ???
// def testAddress(a: Address): Either[String, Address] = ???
// def testPhone(p: Phone): Either[String, Phone] = ???
List(testName(Name("name")), testAddress(Address("address")), testPhone(Phone("phone")))
.partitionMap(identity) match {
case (Nil, List(name: Name, address: Address, phone: Phone)) =>
Right(Person(name, address, phone))
case (left, _) =>
Left(left)
}
// Either[List[String], Person] = Left(List("wrong name", "wrong phone"))
// or
// Either[List[String], Person] = Right(Person(Name("name"), Address("address"), Phone("phone")))
If the left side is empty, then no elements were Left and thus we can build a Person out of the Right elements.
Otherwise, we return a Left List of the Left values.
Details of the intermediate step (partitionMap):
List(Left("bad name"), Right(Address("addr")), Left("bad phone"))
.partitionMap(identity)
// (List[String], List[Any]) = (List("bad name", "bad phone"), List[Any](Address("addr")))
I have a pattern to process web service requests using chained partial functions (this is a chain of responsibility pattern, I think?). In my example, let's say there are two parameters for the request, a string Id and a date. There's a verification step involving the id, a verification step checking the date, and finally some business logic that uses both. So I have them implemented like so:
object Controller {
val OK = 200
val BAD_REQUEST = 400
type ResponseGenerator = PartialFunction[(String, DateTime), (String, Int)]
val errorIfInvalidId:ResponseGenerator = {
case (id, _) if (id == "invalid") => ("Error, Invalid ID!", BAD_REQUEST)
}
val errorIfFutureDate:ResponseGenerator = {
case (_, date) if (date.isAfter(DateTime.now)) => ("Error, date in future!", BAD_REQUEST)
}
val businessLogic:ResponseGenerator = {
case (id, date) => {
// ... do stuff
("Success!", OK)
}
}
def handleRequest(id:String, date:DateTime) = {
val chained = errorIfInvalidId orElse errorIfFutureDate orElse businessLogic
val result: (String, Int) = chained(id, date)
// make some sort of a response out of the message and status code
// e.g. in the Play framework...
Status(result._2)(result._1)
}
}
I like this pattern because it's very expressive - you can easily grasp what the controller method logic is just by looking at the chained functions. And, I can easily mix and match different verification steps for different requests.
The problem is that as I try to expand this pattern it starts to break down. Suppose my next controller takes an id I want to validate, but does not have the date parameter, and maybe it has some new parameter of a third type that does need validation. I don't want to keep expanding that tuple to (String, DateTime, Other) and have to pass in a dummy DateTime or Other. I want to have partial functions that accept different types of arguments (they can still return the same type). But I can't figure out how to compose them.
For a concrete question - suppose the example validator methods are changed to look like this:
val errorIfInvalidId:PartialFunction[String, (String, Int)] = {
case id if (id == "invalid") => ("Error, Invalid ID!", BAD_REQUEST)
}
val errorIfInvalidDate:PartialFunction[DateTime, (String, Int)] = {
case date if (date.isAfter(DateTime.now)) => ("Error, date in future!", BAD_REQUEST)
}
Can I still chain them together? It seems like I should be able to map the tuples to them, but I can't figure out how.
I'm a big fan of using scalaz's Validation for things like this. It gives you quite a bit of control over what you want to do with errors and how to handle them. Here's an example using your controller:
import scalaz._
import Scalaz._
object Controller {
val OK = 200
val BAD_REQUEST = 400
case class Response(response: String, status: Int)
def validateIfInvalidId(id: String) = (id == "invalid") ?
Response("Error, Invalid ID!", BAD_REQUEST).fail[String] |
id.success[Response]
def validateIfFutureDate(date: DateTime, currentDate: DateTime = DateTime.now) = (date.isAfter(currentDate)) ?
Response("Error, date in future!", BAD_REQUEST).fail[DateTime] |
date.success[Response]
def handleRequest(id: String, date: DateTime) = {
val response = for {
validatedId <- validateIfInvalidId(id)
validatedDate <- validateIfFutureDate(date)
} yield {
// ... do stuff
Response("Success!", OK)
}
// make some sort of a response out of the message and status code
// e.g. in the Play framework...
response.fold(
failure => Status(failure.response, failure.status),
success => Status(success.response, success.status)
)
}
}
You can move the different validation functions off into their own world and then compose them anytime you want with the for comprehension in scala.
Okay, I found a way to do this which seems not too bad. Originally I was thinking it might work to wrap the "base" version of the partial function in another partial function that takes the tuple. But I couldn't figure out how to do it, until I hit on the obvious-in-retrospect idea of using isDefinedAt in a case guard statement. Like so:
// "base" version
val errorIfInvalidId:PartialFunction[String, (String, Int)] = {
case id if (id == "invalid") => ("Error, Invalid ID!", BAD_REQUEST)
}
// wrapped to take tuple as parameter
val wrappedErrorIfInvalidId:PartialFunction[(String, DateTime), (String, Int)] = {
case (id, _) if (errorIfInvalidId.isDefinedAt(id)) => errorIfInvalidId(id)
}
This approach is serviceable, though I still wonder if there isn't a more direct way of accomplishing it. (I also may switch over to the Scalaz validation suggested by Noah after I get a chance to play with it a bit.)
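One way to avoid writing a wrapper by hand for every validator is a small generic adapter. The sketch below is an assumption-laden illustration: `on` is a hypothetical helper name, java.time.LocalDate stands in for DateTime so the example has no external dependencies, and the response is kept as a plain (String, Int) tuple.

```scala
import java.time.LocalDate

object ControllerSketch {
  val OK = 200
  val BAD_REQUEST = 400
  type Response = (String, Int)

  // Adapts a validator over one field (A) to a partial function over the whole request (T).
  // The result is defined for a request exactly when the validator is defined for the field.
  def on[T, A](extract: T => A)(validator: PartialFunction[A, Response]): PartialFunction[T, Response] =
    Function.unlift((request: T) => validator.lift(extract(request)))

  val errorIfInvalidId: PartialFunction[String, Response] = {
    case "invalid" => ("Error, Invalid ID!", BAD_REQUEST)
  }
  val errorIfFutureDate: PartialFunction[LocalDate, Response] = {
    case date if date.isAfter(LocalDate.now()) => ("Error, date in future!", BAD_REQUEST)
  }
  val businessLogic: PartialFunction[(String, LocalDate), Response] = {
    case _ => ("Success!", OK)
  }

  def handleRequest(id: String, date: LocalDate): Response = {
    type Request = (String, LocalDate)
    val chained =
      on[Request, String](_._1)(errorIfInvalidId) orElse
        on[Request, LocalDate](_._2)(errorIfFutureDate) orElse
        businessLogic
    chained((id, date))
  }
}
```

Each base validator stays independent of the request shape, so a controller with different parameters only needs different extractor functions.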
You can make the PartialFunction more generic by typing it as PartialFunction[Any, (String, Int)].
It may be slower, though; I don't know the matching mechanics underneath PartialFunction.