How to accumulate errors in Either? - scala

Suppose I have a few case classes and functions to test them:
case class PersonName(...)
case class Address(...)
case class Phone(...)
def testPersonName(pn: PersonName): Either[String, PersonName] = ...
def testAddress(a: Address): Either[String, Address] = ...
def testPhone(p: Phone): Either[String, Phone] = ...
Now I define a new case class Person and a test function, which fails fast.
case class Person(name: PersonName, address: Address, phone: Phone)
def testPerson(person: Person): Either[String, Person] = for {
pn <- testPersonName(person.name).right
a <- testAddress(person.address).right
p <- testPhone(person.phone).right
} yield person;
Now I would like function testPerson to accumulate the errors rather than just fail fast.
I would like testPerson to always execute all those test* functions and return Either[List[String], Person]. How can I do that?

You want to isolate the test* methods and stop using a comprehension!
Assuming (for whatever reason) that scalaz isn't an option for you... it can be done without having to add the dependency.
Unlike a lot of scalaz examples, this is one where the library doesn't reduce verbosity much more than "regular" scala can:
def testPerson(person: Person): Either[List[String], Person] = {
val name = testPersonName(person.name)
val addr = testAddress(person.address)
val phone = testPhone(person.phone)
val errors = List(name, addr, phone) collect { case Left(err) => err }
if(errors.isEmpty) Right(person) else Left(errors)
}

Scala's for-comprehensions (which desugar to a combination of calls to flatMap and map) are designed to allow you to sequence monadic computations in such a way that you have access to the result of earlier computations in subsequent steps. Consider the following:
def parseInt(s: String) = try Right(s.toInt) catch {
case _: Throwable => Left("Not an integer!")
}
def checkNonzero(i: Int) = if (i == 0) Left("Zero!") else Right(i)
def inverse(s: String): Either[String, Double] = for {
i <- parseInt(s).right
v <- checkNonzero(i).right
} yield 1.0 / v
This won't accumulate errors, and in fact there's no reasonable way that it could. Suppose we call inverse("foo"). Then parseInt will obviously fail, which means there's no way we can have a value for i, which means there's no way we could move on to the checkNonzero(i) step in the sequence.
In your case your computations don't have this kind of dependency, but the abstraction you're using (monadic sequencing) doesn't know that. What you want is an Either-like type that isn't monadic, but that is applicative. See my answer here for some details about the difference.
For example, you could write the following with Scalaz's Validation without changing any of your individual validation methods:
import scalaz._, syntax.apply._, syntax.std.either._
def testPerson(person: Person): Either[List[String], Person] = (
testPersonName(person.name).validation.toValidationNel |@|
testAddress(person.address).validation.toValidationNel |@|
testPhone(person.phone).validation.toValidationNel
)(Person).leftMap(_.list).toEither
Although of course this is more verbose than necessary and is throwing away some information, and using Validation throughout would be a little cleaner.
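The applicative idea described above can also be sketched without any library. The following is a minimal illustration, assuming Scala 2.12 or later; the helpers zipAcc and lift, the simplified case classes, and the validation rules are all hypothetical stand-ins, not part of the question or of Scalaz.

```scala
// Hypothetical, simplified stand-ins for the question's types and validators.
case class PersonName(value: String)
case class Address(value: String)
case class Phone(value: String)
case class Person(name: PersonName, address: Address, phone: Phone)

def testPersonName(pn: PersonName): Either[String, PersonName] =
  if (pn.value.nonEmpty) Right(pn) else Left("empty name")
def testAddress(a: Address): Either[String, Address] =
  if (a.value.nonEmpty) Right(a) else Left("empty address")
def testPhone(p: Phone): Either[String, Phone] =
  if (p.value.forall(_.isDigit)) Right(p) else Left("bad phone")

// Applicative-style combination: both arguments are already evaluated,
// and errors from both sides are kept.
def zipAcc[E, A, B](a: Either[List[E], A], b: Either[List[E], B]): Either[List[E], (A, B)] =
  (a, b) match {
    case (Right(x), Right(y)) => Right((x, y))
    case (Left(e1), Left(e2)) => Left(e1 ++ e2)
    case (Left(e1), Right(_)) => Left(e1)
    case (Right(_), Left(e2)) => Left(e2)
  }

// Lift a single-error Either into the accumulating shape.
def lift[A](e: Either[String, A]): Either[List[String], A] = e.left.map(List(_))

def testPerson(person: Person): Either[List[String], Person] =
  zipAcc(
    zipAcc(lift(testPersonName(person.name)), lift(testAddress(person.address))),
    lift(testPhone(person.phone))
  ).map { case ((n, a), p) => Person(n, a, p) }
```

Because zipAcc never needs the result of its first argument to compute the second, nothing forces it to stop at the first Left, which is exactly what flatMap cannot offer.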

As @TravisBrown is telling you, for comprehensions don't really mix with error accumulation. In fact, you generally use them when you don't want fine-grained error control.
A for comprehension will "short-circuit" itself on the first error found, and this is almost always what you want.
The bad thing you are doing is using String for error flow control. You should at all times use Either[Exception, Whatever] and fine-tune logging with scala.util.control.NoStackTrace and scala.util.control.NonFatal.
There are much better alternatives, specifically:
scalaz.EitherT and scalaz.ValidationNel.
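As a rough sketch of the Either[Exception, Whatever] suggestion, an error type can extend Exception with NoStackTrace, and NonFatal can guard the catch. The names InvalidInput and parsePositive are hypothetical and only serve the illustration.

```scala
import scala.util.control.{NoStackTrace, NonFatal}

// Hypothetical error type: an exception that skips stack trace collection.
final case class InvalidInput(message: String)
    extends Exception(message) with NoStackTrace

// Returns the parsed number, or an exception describing what went wrong.
def parsePositive(s: String): Either[Exception, Int] =
  try {
    val n = s.trim.toInt
    if (n > 0) Right(n) else Left(InvalidInput(s"not positive: $n"))
  } catch {
    // NonFatal lets fatal errors (e.g. OutOfMemoryError) propagate.
    case NonFatal(_) => Left(InvalidInput(s"not a number: $s"))
  }
```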
Update:(this is incomplete, I don't know exactly what you want). You have better options than matching, such as getOrElse and recover.
def testPerson(person: Person): Person = {
val attempt = Try {
val pn = testPersonName(person.name)
val a = testAddress(person.address)
testPhone(person.phone)
}
attempt match {
case Success(person) => //..
case Failure(exception) => //..
}
}

Starting in Scala 2.13, we can partitionMap a List of Eithers in order to partition elements based on their Either's side.
// def testName(pn: Name): Either[String, Name] = ???
// def testAddress(a: Address): Either[String, Address] = ???
// def testPhone(p: Phone): Either[String, Phone] = ???
List(testName(Name("name")), testAddress(Address("address")), testPhone(Phone("phone")))
.partitionMap(identity) match {
case (Nil, List(name: Name, address: Address, phone: Phone)) =>
Right(Person(name, address, phone))
case (left, _) =>
Left(left)
}
// Either[List[String], Person] = Left(List("wrong name", "wrong phone"))
// or
// Either[List[String], Person] = Right(Person(Name("name"), Address("address"), Phone("phone")))
If the left side is empty, then no elements were Left and thus we can build a Person out of the Right elements.
Otherwise, we return a Left List of the Left values.
Details of the intermediate step (partitionMap):
List(Left("bad name"), Right(Address("addr")), Left("bad phone"))
.partitionMap(identity)
// (List[String], List[Any]) = (List("bad name", "bad phone"), List[Any](Address("addr")))

Related

Making Calculation on Future requests based on priority

I have a question regarding handling making async operations and taking action based on priority.
Consider the following code:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
// mock database call
def isSiteExcludedAtList(listId: Int, siteId: Int): Future[Some[Int]] = Future {
Some(1)
}
// main logic
def isExcluded(listIds: List[Int], subdomainHash: Int, domainHash: Int): Future[String] = {
val domainFutures: Future[List[Option[Int]]] = Future.traverse(listIds)(listId => isSiteExcludedAtList(listId, domainHash))
val subDomainFutures: Future[List[Option[Int]]] = Future.traverse(listIds)(listId => isSiteExcludedAtList(listId, subdomainHash))
// Is there other way?
for {
res <- Future.sequence(
List(
subDomainFutures.map(res => "subdomain" -> res),
domainFutures.map(res => "domain" -> res)
)
)
} yield {
val subdomainExclusion: List[Int] = res.filter(_._1 == "subdomain").flatMap(_._2).flatten
val domainExclusion: List[Int] = res.filter(_._1 == "domain").flatMap(_._2).flatten
if (subdomainExclusion.nonEmpty) {
s"its subdomain exclusion with results: ${subdomainExclusion}"
}
else {
s"its domain exclusion with results: ${domainExclusion}"
}
}
}
What I want to achieve:
isSiteExcludedAtList returns an Int from the database. It is mocked in my example, but it is actually an async call that fetches an int value from the database for a key containing both listId and siteId.
I want to create subdomainFutures and domainFutures and start running them together.
I want to check whether there is a result from subdomainFutures; if so, it's a subdomain exclusion and I want to return it.
If none of the subdomainFutures return a result, I want to check domainFutures and return a result based on those.
Note: waiting for only one subdomain result is an optional optimization.
Is there a prettier way to achieve this?
Thanks!
So you want to fetch both domains and subdomains in parallel, you also want to concurrently execute as many isSiteExcludedAtList as possible. And additionally, if there is at least one subdomain, you want to cancel the domains.
That can easily be represented using cats-effect and fs2 by taking advantage of IO
(The following code assumes that isSiteExcludedAtList returns an IO[Option[Int]])
import cats.effect.IO
import cats.syntax.all._
import fs2.Stream
import fs2.concurrent.SignallingRef
def isExcluded(listIds: List[Int], subdomainHash: Int, domainHash: Int): IO[Unit] = {
def parallelStreamFor(siteId: Int): Stream[IO, Int] =
Stream
.emits(listIds)
.covary[IO]
.parEvalMapUnordered(maxConcurrent = 2)(listId => isSiteExcludedAtList(listId, siteId))
.collect {
case Some(result) => result
}
SignallingRef[IO].of(false).flatMap { signal =>
val processSubdomains =
parallelStreamFor(siteId = subdomainHash)
.evalTap(_ => signal.set(true))
.compile
.toList
val processDomains =
parallelStreamFor(siteId = domainHash)
.interruptWhen(signal)
.compile
.toList
(processSubdomains,processDomains).parTupled
} flatMap {
case (subdomainExclusions, domainExclusions) =>
if (subdomainExclusions.nonEmpty)
IO.println(s"Its subdomain exclusion with result: ${subdomainExclusions}")
else if (domainExclusions.nonEmpty)
IO.println(s"Its domain exclusion with result: ${domainExclusions}")
else
IO.println("All subdomains and domains are included!")
}
}
A couple of considerations:
If the order of elements matters, then replace parEvalMapUnordered with just parEvalMap which is a little bit less efficient.
Adjust the value of maxConcurrent so it makes sense given your workload.
If you would rather keep each stream synchronous and just run both concurrently, we can replace the parEvalMapUnordered + collect with a single call to evalMapFilter.
You can easily integrate this in your codebase without needing to refactor too much thanks to IO.fromFuture and IO.unsafeToFuture()
You can see the code running here.
Edit
OLD AND WRONG ANSWER
If I understood the problem correctly, you want to stop processing at the first result to return a Some
If you are open to using cats-effect, that is pretty easy to achieve like this:
import cats.effect.IO
import cats.syntax.all._
def isSiteExcludedAtList(listId: Int, siteId: Int): IO[Option[Int]] =
IO.println(s"Computing for ${listId} - ${siteId}").as(Some(10))
def isExcluded(listIds: List[Int], subdomainHash: Int, domainHash: Int): IO[Unit] = {
val processSubdomains =
listIds.collectFirstSomeM(listId => isSiteExcludedAtList(listId, siteId = subdomainHash))
val processDomains =
listIds.collectFirstSomeM(listId => isSiteExcludedAtList(listId, siteId = domainHash))
processSubdomains.flatMap {
case Some(subdomainExclusion) =>
IO.println(s"Its subdomain exclusion with result: ${subdomainExclusion}")
case None =>
processDomains.flatMap {
case Some(domainExclusion) =>
IO.println(s"Its domain exclusion with result: ${domainExclusion}")
case None =>
IO.println("All subdomains and domains are included!")
}
}
}
You can see the code running here
Note: Another approach would be to tag each computation with its origin (domain or subdomain), combine them all in one big list, and perform a single collectFirstSomeM. Both approaches are equivalent.
Something like this maybe?
subDomainFutures.map(_.flatten).flatMap {
case sds if (sds.nonEmpty) => Future.successful(sds -> Nil)
case _ => domainFutures.map(_.flatten).map(Nil -> _)
}.map {
case (sds, _) if (sds.nonEmpty) => s"subdomain exclusion $sds"
case (_, ds) if (ds.nonEmpty) => s"domain exclusion $ds"
case _ => "no exclusion"
}
Or, maybe, pull domain queries up to the same level too:
subDomainFutures.zip(domainFutures)
.map { case (s, d) => (s.flatten, d.flatten) }
.map {
case (sds, _) if (sds.nonEmpty) => s"subdomain exclusion $sds"
case (_, ds) if (ds.nonEmpty) => s"domain exclusion $ds"
case _ => "no exclusion"
}
I think, it's more or less the same thing you are doing, just expressed in a little bit more straightforward way IMO.
One downside is it will wait for all subdomain queries to come back even if the very first one returns a result (the second variant looks a little "slicker", but it also waits for all domain queries unconditionally, which is an additional inefficiency).
There are ways to optimize that (nothing is impossible!) but I can't think of any that wouldn't look excessively complicated for the use case to me.
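One possible shape for such an optimization, staying with plain Futures, is Future.find from the standard library. This is only a sketch: the mock data and the helper firstExclusion are hypothetical, it returns a single id rather than the full list, and it does not cancel the remaining lookups (they all start eagerly). As far as I can tell, in Scala 2.12 and later Future.find checks the futures in list order and does not wait for those after the first match.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical mock: only list 3 excludes site 42.
def isSiteExcludedAtList(listId: Int, siteId: Int): Future[Option[Int]] =
  Future { if (listId == 3 && siteId == 42) Some(listId) else None }

// All lookups start immediately; the first defined result (if any) is returned.
def firstExclusion(listIds: List[Int], siteId: Int): Future[Option[Int]] =
  Future.find(listIds.map(isSiteExcludedAtList(_, siteId)))(_.isDefined).map(_.flatten)

def isExcluded(listIds: List[Int], subdomainHash: Int, domainHash: Int): Future[String] = {
  val sub = firstExclusion(listIds, subdomainHash)
  val dom = firstExclusion(listIds, domainHash) // started in parallel with `sub`
  sub.flatMap {
    case Some(id) => Future.successful(s"subdomain exclusion $id")
    case None =>
      dom.map {
        case Some(id) => s"domain exclusion $id"
        case None     => "no exclusion"
      }
  }
}
```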
I'd like to describe how to improve your code a bit while still using futures, but I'm a bit confused about what this code is doing. What is the number that isSiteExcludedAtList returns? Is it an identifier, and you want to collect identifiers for all list ids, and your only concern is that you don't want to query using domainHash if subdomainHash is enough? That's what your code seems to be doing. But if I understand the answer above correctly (the one with cats-effect and collectFirstSomeM), that code looks only for the first result that is Some(number) and then stops. For example, if the very first call to isSiteExcludedAtList returns Some(1), nothing more is called.
So, I have three answers for you.
This is if you want to collect a list of ints and you only want to avoid calling isSiteExcludedAtList with domainHash if the calls with subdomainHash already give you some results. In this case you can chain both Future.traverse calls and make the second one only if the first one returns no results.
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
// mock database call
def isSiteExcludedAtList(listId: Int, siteId: Int): Future[Some[Int]] =
Future { Some(1) }
// main logic
def isExcluded(listIds: List[Int], subdomainHash: Int, domainHash: Int): Future[String] =
for {
res1 <- Future.traverse(listIds)(isSiteExcludedAtList(_, subdomainHash))
subIds = res1.flatten
res2 <- if (subIds.isEmpty)
Future.traverse(listIds)(isSiteExcludedAtList(_, domainHash))
else
Future.successful(Nil)
domIds = res2.flatten
} yield
if (subIds.nonEmpty)
s"its subdomain exclusion with results: ${subIds}"
else if (domIds.nonEmpty)
s"its domain exclusion with results: ${domIds}"
else
"no exclusion"
This is if you look for the first result that indicates that the listId is excluded and then you want to query no more. In that case, all calls to isSiteExcludedAtList must be chained, i.e. you call a next one only when you get no result from the previous one. It can be done with recursion:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
// mock database call
def isSiteExcludedAtList(listId: Int, siteId: Int): Future[Option[Int]] =
Future { Some(1) }
def isSiteExcludedAtList(listIds: List[Int], hash: Int): Future[Option[Int]] =
listIds match {
case Nil =>
Future.successful(None)
case head :: tail =>
isSiteExcludedAtList(head, hash).flatMap {
case Some(id) => Future.successful(Some(id))
case None => isSiteExcludedAtList(tail, hash)
}
}
// if you use Scala 3, change this to an enum
sealed trait Exclusion
final case class SubdomainExclusion(id: Int) extends Exclusion
final case class DomainExclusion(id: Int) extends Exclusion
case object NoExclusion extends Exclusion
// main logic
def isExcluded(listIds: List[Int], subdomainHash: Int, domainHash: Int): Future[String] =
isSiteExcludedAtList(listIds, subdomainHash).flatMap {
case Some(id) =>
Future.successful(SubdomainExclusion(id))
case None =>
isSiteExcludedAtList(listIds, domainHash).map {
case Some(id) => DomainExclusion(id)
case None => NoExclusion
}
}.map {
case SubdomainExclusion(id) => s"subdomain exclusion $id"
case DomainExclusion(id) => s"domain exclusion: $id"
case NoExclusion => "no exclusion"
}
And the third possibility is that instead of using Future.traverse and asking for each listId separately, you will implement a query that will return all excluded ids for a given hash - subdomainHash or domainHash, and then you will just check if a common set of your listIds and ids returned by that query is non-empty. The code will be similar to that from my first answer, but it will make only two calls to the database. I'm writing about it because from my experience it's a common pattern in dealing with databases: we have some already written queries and as our code becomes more complex we start to use those queries in loops, which leads to sub-optimal performance, while instead we could write a bit more complex query which we would call only once.
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
// mock database call
def isSiteExcludedAtListBulk(siteId: Int): Future[Set[Int]] =
Future { Set(10, 20, 30) }
// main logic
def isExcluded(listIds: List[Int], subdomainHash: Int, domainHash: Int): Future[String] =
for {
excludedSubIds <- isSiteExcludedAtListBulk(subdomainHash)
subIds = listIds.filter(excludedSubIds)
excludedDomIds <- if (subIds.isEmpty)
isSiteExcludedAtListBulk(domainHash)
else
Future.successful(Set.empty[Int])
domIds = listIds.filter(excludedDomIds)
} yield
if (subIds.nonEmpty)
s"its subdomain exclusion with results: ${subIds}"
else if (domIds.nonEmpty)
s"its domain exclusion with results: ${domIds}"
else
"no exclusion"

Scala Cats Accumulating Errors and Successes with Ior

I am trying to use Cats datatype Ior to accumulate both errors and successes of using a service (which can return an error).
def find(key: String): F[Ior[NonEmptyList[Error], A]] = {
(for {
b <- service.findByKey(key)
} yield b.rightIor[NonEmptyList[Error]])
.recover {
case e: Error => Ior.leftNel(AnotherError)
}
}
def findMultiple(keys: List[String]): F[Ior[NonEmptyList[Error], List[A]]] = {
keys map find reduce (_ |+| _)
}
My confusion lies in how to combine the errors/successes. I am trying to use the Semigroup combine (infix syntax) to combine with no success. Is there a better way to do this? Any help would be great.
I'm going to assume that you want both all errors and all successful results. Here's a possible implementation:
class Foo[F[_]: Applicative, A](find: String => F[IorNel[Error, A]]) {
def findMultiple(keys: List[String]): F[IorNel[Error, List[A]]] = {
keys.map(find).sequence.map { nelsList =>
nelsList.map(nel => nel.map(List(_)))
.reduceOption(_ |+| _).getOrElse(Nil.rightIor)
}
}
}
Let's break it down:
We will be trying to "flip" a List[IorNel[Error, A]] into IorNel[Error, List[A]]. However, from doing keys.map(find) we get List[F[IorNel[...]]], so we need to also "flip" it in a similar fashion first. That can be done by using .sequence on the result, and is what forces F[_]: Applicative constraint.
N.B. Applicative[Future] is available whenever there's an implicit ExecutionContext in scope. You can also get rid of F and use Future.sequence directly.
Now, we have F[List[IorNel[Error, A]]], so we want to map the inner part to transform the nelsList we got. You might think that sequence could be used there too, but it can not - it has the "short-circuit on first error" behavior, so we'd lose all successful values. Let's try to use |+| instead.
Ior[X, Y] has a Semigroup instance when both X and Y have one. Since we're using IorNel, X = NonEmptyList[Z], and that is satisfied. For Y = A - your domain type - it might not be available.
But we don't want to combine all results into a single A, we want Y = List[A] (which also always has a semigroup). So, we take every IorNel[Error, A] we have and map A to a singleton List[A]:
nelsList.map(nel => nel.map(List(_)))
This gives us List[IorNel[Error, List[A]]], which we can reduce. Unfortunately, since Ior does not have a Monoid, we can't quite use convenient syntax. So, with stdlib collections, one way is to do .reduceOption(_ |+| _).getOrElse(Nil.rightIor).
This can be improved by doing a few things:
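To make the combining behaviour concrete without depending on Cats, here is a hand-rolled miniature of the idea. MiniIor, OnlyLeft, OnlyRight, Both and combine are hypothetical names invented for this sketch; it is not the Cats implementation, only an illustration of how combining keeps every error and every success when both sides are lists.

```scala
// A tiny Ior-like type with lists on both sides (hypothetical, for illustration).
sealed trait MiniIor[+E, +A]
final case class OnlyLeft[+E, +A](errors: List[E]) extends MiniIor[E, A]
final case class OnlyRight[+E, +A](values: List[A]) extends MiniIor[E, A]
final case class Both[+E, +A](errors: List[E], values: List[A]) extends MiniIor[E, A]

// Semigroup-style combine: errors are appended to errors, values to values.
def combine[E, A](x: MiniIor[E, A], y: MiniIor[E, A]): MiniIor[E, A] =
  (x, y) match {
    case (OnlyLeft(e1), OnlyLeft(e2))   => OnlyLeft(e1 ++ e2)
    case (OnlyLeft(e1), OnlyRight(v2))  => Both(e1, v2)
    case (OnlyLeft(e1), Both(e2, v2))   => Both(e1 ++ e2, v2)
    case (OnlyRight(v1), OnlyLeft(e2))  => Both(e2, v1)
    case (OnlyRight(v1), OnlyRight(v2)) => OnlyRight(v1 ++ v2)
    case (OnlyRight(v1), Both(e2, v2))  => Both(e2, v1 ++ v2)
    case (Both(e1, v1), OnlyLeft(e2))   => Both(e1 ++ e2, v1)
    case (Both(e1, v1), OnlyRight(v2))  => Both(e1, v1 ++ v2)
    case (Both(e1, v1), Both(e2, v2))   => Both(e1 ++ e2, v1 ++ v2)
  }

val results: List[MiniIor[String, Int]] =
  List(OnlyRight(List(1)), OnlyLeft(List("boom")), OnlyRight(List(3)))

// A failed lookup in the middle does not discard the successes around it.
val combined: MiniIor[String, Int] = results.reduce((x, y) => combine(x, y))
```

Note that reduce throws on an empty list, which is the same gap the reduceOption(...).getOrElse(...) above works around.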
x.map(f).sequence is equivalent to doing x.traverse(f)
We can demand that keys are non-empty upfront, and give nonempty result back too.
The latter step gives us Reducible instance for a collection, letting us shorten everything by doing reduceMap
class Foo2[F[_]: Applicative, A](find: String => F[IorNel[Error, A]]) {
def findMultiple(keys: NonEmptyList[String]): F[IorNel[Error, NonEmptyList[A]]] = {
keys.traverse(find).map { nelsList =>
nelsList.reduceMap(nel => nel.map(NonEmptyList.one))
}
}
}
Of course, you can make a one-liner out of this:
keys.traverse(find).map(_.reduceMap(_.map(NonEmptyList.one)))
Or, you can do the non-emptiness check inside:
class Foo3[F[_]: Applicative, A](find: String => F[IorNel[Error, A]]) {
def findMultiple(keys: List[String]): F[IorNel[Error, List[A]]] = {
NonEmptyList.fromList(keys)
.map(_.traverse(find).map { _.reduceMap(_.map(List(_))) })
.getOrElse(List.empty[A].rightIor.pure[F])
}
}
Ior is a good choice for warning accumulation, that is errors and a successful value. But, as mentioned by Oleg Pyzhcov, Ior.Left case is short-circuiting. This example illustrates it:
scala> val shortCircuitingErrors = List(
Ior.leftNec("error1"),
Ior.bothNec("warning2", 2),
Ior.bothNec("warning3", 3)
).sequence
shortCircuitingErrors: Ior[Nec[String], List[Int]] = Left(Chain(error1))
One way to accumulate both errors and successes is to convert all your Left cases into Both. One approach is using Option as right type and converting Left(errs) values into Both(errs, None). After calling .traverse, you end up with optList: List[Option] on the right side and you can flatten it with optList.flatMap(_.toList) to filter out None values.
class Error
class KeyValue
def find(key: String): Ior[Nel[Error], KeyValue] = ???
def findMultiple(keys: List[String]): Ior[Nel[Error], List[KeyValue]] =
keys
.traverse { k =>
val ior = find(k)
ior.putRight(ior.right)
}
.map(_.flatMap(_.toList))
Or more succinctly:
def findMultiple(keys: List[String]): Ior[Nel[Error], List[KeyValue]] =
keys.flatTraverse { k =>
val ior = find(k)
ior.putRight(ior.toList) // Ior[A,B].toList: List[B]
}

Combine multiple extractor objects to use in one match statement

Is it possible to run multiple extractors in one match statement?
object CoolStuff {
def unapply(thing: Thing): Option[SomeInfo] = ...
}
object NeatStuff {
def unapply(thing: Thing): Option[OtherInfo] = ...
}
// is there some syntax similar to this?
thing match {
case t @ CoolStuff(someInfo) @ NeatStuff(otherInfo) => process(someInfo, otherInfo)
case _ => // neither Cool nor Neat
}
The intent here being that there are two extractors, and I don't have to do something like this:
object CoolNeatStuff {
def unapply(thing: Thing): Option[(SomeInfo, OtherInfo)] = thing match {
case CoolStuff(someInfo) => thing match {
case NeatStuff(otherInfo) => Some(someInfo -> otherInfo)
case _ => None // Cool, but not Neat
}
case _ => None // neither Cool nor Neat
}
}
Can try
object ~ {
def unapply[T](that: T): Option[(T,T)] = Some(that -> that)
}
def too(t: Thing) = t match {
case CoolStuff(a) ~ NeatStuff(b) => ???
}
I've come up with a very similar solution, but I was a bit too slow, so I didn't post it as an answer. However, since @userunknown asks to explain how it works, I'll dump my similar code here anyway, and add a few comments. Maybe someone finds it a valuable addition to cchantep's minimalistic solution (it looks... calligraphic? for some reason, in a good sense).
So, here is my similar, aesthetically less pleasing proposal:
object && {
def unapply[A](a: A) = Some((a, a))
}
// added some definitions to make your question-code work
type Thing = String
type SomeInfo = String
type OtherInfo = String
object CoolStuff {
def unapply(thing: Thing): Option[SomeInfo] = Some(thing.toLowerCase)
}
object NeatStuff {
def unapply(thing: Thing): Option[OtherInfo] = Some(thing.toUpperCase)
}
def process(a: SomeInfo, b: OtherInfo) = s"[$a, $b]"
val res = "helloworld" match {
case CoolStuff(someInfo) && NeatStuff(otherInfo) =>
process(someInfo, otherInfo)
case _ =>
}
println(res)
This prints
[helloworld, HELLOWORLD]
The idea is that identifiers (in particular, && and ~ in cchantep's code) can be used as infix operators in patterns. Therefore, the match-case
case CoolStuff(someInfo) && NeatStuff(otherInfo) =>
will be desugared into
case &&(CoolStuff(someInfo), NeatStuff(otherInfo)) =>
and then the unapply method of && will be invoked, which simply duplicates its input.
In my code, the duplication is achieved by a straightforward Some((a, a)). In cchantep's code, it is done with fewer parentheses: Some(t -> t). The arrow -> comes from ArrowAssoc, which in turn is provided as an implicit conversion in Predef. This is just a quick way to create pairs, usually used in maps:
Map("hello" -> 42, "world" -> 58)
Another remark: notice that && can be used multiple times:
case Foo(a) && Bar(b) && Baz(c) => ...
So... I don't know whether it's an answer or an extended comment to cchantep's answer, but maybe someone finds it useful.
For those who might miss the details on how this magic actually works, I just want to expand on the answers by @cchantep and @Andrey Tyukin (the comment section does not allow me to do that).
Running scalac with -Xprint:parser option will give something along those lines (scalac 2.11.12)
def too(t: String) = t match {
case $tilde(CoolStuff((a @ _)), NeatStuff((b @ _))) => $qmark$qmark$qmark
}
This basically shows you the initial steps the compiler takes while parsing the source into an AST.
An important note here: the rules behind this transformation are described in the Infix Operation Patterns and Extractor Patterns sections of the language specification. In particular, they allow you to use any object as long as it has an unapply method, for example CoolStuff(a) AndAlso NeatStuff(b). In the previous answers, && and ~ were picked as possible, but not the only, valid identifiers.
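A small runnable sketch of the alphanumeric variant mentioned above. The extractors Even and Positive and the function describe are hypothetical examples made up for this illustration; only AndAlso comes from the text.

```scala
// Duplicates its input so that two patterns can be applied to the same value.
object AndAlso {
  def unapply[A](a: A): Option[(A, A)] = Some((a, a))
}

// Hypothetical extractors for the demonstration.
object Even {
  def unapply(n: Int): Option[Int] = if (n % 2 == 0) Some(n / 2) else None
}
object Positive {
  def unapply(n: Int): Option[Int] = if (n > 0) Some(n) else None
}

def describe(n: Int): String = n match {
  // Parsed as AndAlso(Even(half), Positive(value)), like any infix pattern.
  case Even(half) AndAlso Positive(value) => s"even and positive: half=$half, value=$value"
  case _                                  => "no match"
}
```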
If running scalac with option -Xprint:patmat which is a special phase for translating pattern matching one can see something similar to this
def too(t: String): Nothing = {
case <synthetic> val x1: String = t;
case9(){
<synthetic> val o13: Option[(String, String)] = main.this.~.unapply[String](x1);
if (o13.isEmpty.unary_!)
{
<synthetic> val p3: String = o13.get._1;
<synthetic> val p4: String = o13.get._2;
{
<synthetic> val o12: Option[String] = main.this.CoolStuff.unapply(p3);
if (o12.isEmpty.unary_!)
{
<synthetic> val o11: Option[String] = main.this.NeatStuff.unapply(p4);
if (o11.isEmpty.unary_!)
matchEnd8(scala.this.Predef.???)
Here ~.unapply will be called on input parameter t which will produce Some((t,t)). The tuple values will be extracted into variables p3 and p4. Then, CoolStuff.unapply(p3) will be called and if the result is not None NeatStuff.unapply(p4) will be called and also checked if it is not empty. If both are not empty then according to Variable Patterns a and b will be bound to returned results inside corresponding Some.

Avoiding nested Ifs when working with multiple Options and Eithers

When I am coding with options I find the fold method very useful. Instead of writing if defined statements I can do
opt.fold(<not_defined>){ defined => }
This is good, but what should I do when working with multiple Options, or multiple Eithers? Now I have to resort to writing code like
if (x.isDefined && y.isRight) {
val z = getSomething(x.get)
if (z.isDefined) {
....
Depending on the number of things involved, this code becomes very nested.
Is there a functional trick to make this code a little less nested and more concise, like the fold operation above?
Have you tried a for comprehension? Assuming you don't want to treat individual errors or empty Options:
import scala.util._
val opt1 = Some("opt1")
val either2: Either[Error, String] = Right("either2")
val try3: Try[String] = Success("try3")
for {
v1 <- opt1
v2 <- either2.right.toOption
v3 <- try3.toOption
} yield {
println(s"$v1 $v2 $v3")
}
Note that before Scala 2.12 Either is not right-biased, so you need to call the .right method in the for comprehension (I think cats or scalaz have a right-biased Either); from Scala 2.12 on, Either is right-biased and .right is no longer needed. Also, we are converting the Either and the Try to Options, discarding the errors.
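If the errors should be kept rather than discarded, one option is to go the other way and turn the Options into Eithers with toRight. A minimal sketch, assuming Scala 2.12 or later; getSomething, combine and the error messages are hypothetical:

```scala
// Hypothetical lookup used only for this sketch.
def getSomething(s: String): Option[String] =
  if (s.isEmpty) None else Some(s.toUpperCase)

// Each step either continues with a value or stops with a description of what failed.
def combine(x: Option[String], y: Either[String, Int]): Either[String, String] =
  for {
    a <- x.toRight("x is not defined")                  // Option -> Either
    b <- y                                              // already an Either
    z <- getSomething(a).toRight("nothing found for x") // Option -> Either
  } yield s"$z $b"
```

The nesting disappears in the same way as with Options, but the Left value tells the caller which step failed.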
Cases when .isDefined is followed by .get call can be refactored using custom extractors for pattern matching:
def getSomething(s: String): Option[String] = if (s.isEmpty) None else Some(s.toUpperCase)
object MyExtractor {
def unapply(t: (Option[String], Either[Int, String])): Option[String] =
t match {
case (Some(x), Right(y)) => getSomething(x)
case _ => None
}
}
val x: Option[String] = Some("hello world")
val y: Either[Int, String] = Right("ok")
(x, y) match {
case MyExtractor(z) => z // let's do something with z
case _ => "world"
}
// HELLO WORLD
We managed to get rid of all .isDefined, .get and even .right calls by replacing them by explicit pattern matching thanks to our custom extractor MyExtractor.

Wait for a list of futures with composing Option in Scala

I have to get a list of issues for each file of a given list from a REST API with Scala. I want to do the requests in parallel, and use the Dispatch library for this. My method is called from a Java framework and I have to wait at the end of this method for the result of all the futures to yield the overall result back to the framework. Here's my code:
def fetchResourceAsJson(filePath: String): dispatch.Future[json4s.JValue]
def extractLookupId(json: org.json4s.JValue): Option[String]
def findLookupId(filePath: String): Future[Option[String]] =
for (json <- fetchResourceAsJson(filePath))
yield extractLookupId(json)
def searchIssuesJson(lookupId: String): Future[json4s.JValue]
def extractIssues(json: org.json4s.JValue): Seq[Issue]
def findIssues(lookupId: String): Future[Seq[Issue]] =
for (json <- searchIssuesJson(lookupId))
yield extractIssues(json)
def getFilePathsToProcess: List[String]
def thisIsCalledByJavaFramework(): java.util.Map[String, java.util.List[Issue]] = {
val finalResultPromise = Promise[Map[String, Seq[Issue]]]()
// (1) inferred type of issuesByFile not as expected, cannot get
// the type system happy, would like to have Seq[Future[(String, Seq[Issue])]]
val issuesByFile = getFilePathsToProcess map { f =>
findLookupId(f).flatMap { lookupId =>
(f, findIssues(lookupId)) // I want to yield a tuple (String, Seq[Issue]) here
}
}
Future.sequence(issuesByFile) onComplete {
case Success(x) => finalResultPromise.success(x) // (2) how to return x here?
case Failure(x) => // (3) how to return null from here?
}
//TODO transform finalResultPromise to Java Map
}
This code snippet has several issues. First, I'm not getting the type I would expect for issuesByFile (1). I would like to just ignore the result of findLookUpId if it is not able to find the lookUp ID (i.e., None). I've read in various tutorials that Future[Option[X]] is not easy to handle in function compositions and for expressions in Scala. So I'm also curious what the best practices are to handle these properly.
Second, I somehow have to wait for all futures to finish, but don't know how to return the result to the calling Java framework (2). Can I use a promise here to achieve this? If yes, how can I do it?
And last but not least, in case of any errors, I would just like to return null from thisIsCalledByJavaFramework but don't know how (3).
Any help is much appreciated.
Thanks,
Michael
Several points:
The first problem at (1) is that you don't handle the case where findLookupId returns None. You need to decide what to do in this case. Fail the whole process? Exclude that file from the list?
The second problem at (1) is that findIssues will itself return a Future, which you need to map before you can build the result tuple
There's a shortcut for map and then Future.sequence: Future.traverse
If you cannot change the result type of the method because the Java interface is fixed and cannot be changed to support Futures itself you must wait for the Future to be completed. Use Await.ready or Await.result to do that.
Taking all that into account and choosing to ignore files for which no id could be found results in this code:
// `None` in an entry for a file means that no id could be found
def entryForFile(file: String): Future[(String, Option[Seq[Issue]])] =
findLookupId(file).flatMap {
// the need for this kind of pattern match shows
// the difficulty of working with `Future[Option[T]]`
case Some(id) ⇒ findIssues(id).map(issues ⇒ file -> Some(issues))
case None ⇒ Future.successful(file -> None)
}
def thisIsCalledByJavaFramework(): java.util.Map[String, java.util.List[Issue]] = {
val issuesByFile: Future[Seq[(String, Option[Seq[Issue]])]] =
Future.traverse(getFilePathsToProcess)(entryForFile)
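import scala.collection.JavaConverters._
import scala.concurrent.Await
import scala.concurrent.duration._
import scala.util.control.NonFatal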
try
Await.result(issuesByFile, 10.seconds)
.collect {
// here we choose to ignore entries where no id could be found
case (f, Some(issues)) ⇒ f -> issues
}
.toMap.mapValues(_.asJava).asJava
catch {
case NonFatal(_) ⇒ null
}
}
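A variant of the same approach, sketched with stand-in functions so that it runs on its own: if files without an id are always ignored, the Option can wrap the whole entry, and flatten drops the missing ones in one step. The stubbed findLookupId and findIssues, the use of String in place of Issue, and the name issuesByFile are assumptions made for this sketch.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Hypothetical stand-ins for the REST calls; issues are plain Strings here.
def findLookupId(file: String): Future[Option[String]] =
  Future.successful(if (file.endsWith(".scala")) Some(file.toUpperCase) else None)
def findIssues(id: String): Future[Seq[String]] =
  Future.successful(Seq(s"issue in $id"))

// None means "no id found for this file, skip it".
def entryForFile(file: String): Future[Option[(String, Seq[String])]] =
  findLookupId(file).flatMap {
    case Some(id) => findIssues(id).map(issues => Some(file -> issues))
    case None     => Future.successful(None)
  }

// Runs all lookups, then keeps only the files for which an id was found.
def issuesByFile(files: List[String]): Future[Map[String, Seq[String]]] =
  Future.traverse(files)(entryForFile).map(_.flatten.toMap)
```

The blocking Await and the conversion to Java collections would still happen once, at the boundary with the Java framework, as in the code above.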