Cleanest way in Scala to avoid nested ifs when transforming collections and checking for error conditions in each step - scala

I have some code for validating ip addresses that looks like the following:
sealed abstract class Result
case object Valid extends Result
case class Malformatted(val invalid: Iterable[IpConfig]) extends Result
case class Duplicates(val dups: Iterable[Inet4Address]) extends Result
case class Unavailable(val taken: Iterable[Inet4Address]) extends Result
def result(ipConfigs: Iterable[IpConfig]): Result = {
val invalidIpConfigs: Iterable[IpConfig] =
ipConfigs.filterNot(ipConfig => {
(isValidIpv4(ipConfig.address)
&& isValidIpv4(ipConfig.gateway))
})
if (!invalidIpConfigs.isEmpty) {
Malformatted(invalidIpConfigs)
} else {
val ipv4it: Iterable[Inet4Address] = ipConfigs.map { ipConfig =>
InetAddress.getByName(ipConfig.address).asInstanceOf[Inet4Address]
}
val dups = ipv4it.groupBy(identity).filter(_._2.size != 1).keys
if (!dups.isEmpty) {
Duplicates(dups)
} else {
val ipAvailability: Map[Inet4Address, Boolean] =
ipv4it.map(ip => (ip, isIpAvailable(ip)))
val taken: Iterable[Inet4Address] = ipAvailability.filter(!_._2).keys
if (!taken.isEmpty) {
Unavailable(taken)
} else {
Valid
}
}
}
}
I don't like the nested ifs because it makes the code less readable. Is there a nice way to linearize this code? In java, I might use return statements, but this is discouraged in scala.

I personally advocate using a match everywhere you can, as it in my opinion usually makes code very readable
def result(ipConfigs: Iterable[IpConfig]): Result =
ipConfigs.filterNot(ipc => isValidIpv4(ipc.address) && isValidIpv4(ipc.gateway)) match {
case Nil =>
val ipv4it = ipConfigs.map { ipc =>
InetAddress.getByName(ipc.address).asInstanceOf[Inet4Address]
}
ipv4it.groupBy(identity).filter(_._2.size != 1).keys match {
case Nil =>
val taken = ipv4it.map(ip => (ip, isIpAvailable(ip))).filter(!_._2).keys
if (taken.nonEmpty) Unavailable(taken) else Valid
case dups => Duplicates(dups)
}
case invalid => Malformatted(invalid)
}
Note that I've chosen to match on the else part first, since you generally go from specific to generic in matches, since Nil is a subclass of Iterable I put that as the first case, eliminating the need for an i if i.nonEmpty in the other case, since it would be a given if it didn't match Nil
Also a thing to note here, all your vals don't need the type explicitly defined, it significantly declutters the code if you write something like
val ipAvailability: Map[Inet4Address, Boolean] =
ipv4it.map(ip => (ip, isIpAvailable(ip)))
as simply
val ipAvailability = ipv4it.map(ip => (ip, isIpAvailable(ip)))
I've also taken the liberty of removing many one-off variables I didn't find remotely necessary, as all they did was add more lines to the code
A thing to note here about using match over nested ifs, is that is that it's easier to add a new case than it is to add a new else if 99% of the time, thereby making it more modular, and modularity is always a good thing.
Alternatively, as suggested by Nathaniel Ford, you can break it up into several smaller methods, in which case the above code would look like so:
def result(ipConfigs: Iterable[IpConfig]): Result =
ipConfigs.filterNot(ipc => isValidIpv4(ipc.address) && isValidIpv4(ipc.gateway)) match {
case Nil => wellFormatted(ipConfigs)
case i => Malformatted(i)
}
def wellFormatted(ipConfigs: Iterable[IpConfig]): Result = {
val ipv4it = ipConfigs.map(ipc => InetAddress.getByName(ipc.address).asInstanceOf[Inet4Address])
ipv4it.groupBy(identity).filter(_._2.size != 1).keys match {
case Nil => noDuplicates(ipv4it)
case dups => Duplicates(dups)
}
}
def noDuplicates(ipv4it: Iterable[IpConfig]): Result =
ipv4it.map(ip => (ip, isIpAvailable(ip))).filter(!_._2).keys match {
case Nil => Valid
case taken => Unavailable(taken)
}
This has the benefit of splitting it up into smaller more manageable chunks, while keeping to the FP ideal of having functions that only do one thing, but do that one thing well, rather than having god-methods that do everything.
Which style you prefer, of course is up to you.

This has some time now but I will add my 2 cents. The proper way to handle this is with Either. You can create a method like:
def checkErrors[T](errorList: Iterable[T], onError: Result) : Either[Result, Unit] = if(errorList.isEmpty) Right() else Left(onError)
so you can use for comprehension syntax
val invalidIpConfigs = getFormatErrors(ipConfigs)
val result = for {
_ <- checkErrors(invalidIpConfigs, Malformatted(invalidIpConfigs))
dups = getDuplicates(ipConfigs)
_ <- checkErrors(dups, Duplicates(dups))
taken = getAvailability(ipConfigs)
_ <- checkErrors(taken, Unavailable(taken))
} yield Valid
If you don't want to return an Either use
result.fold(l => l, r => r)
In case of the check methods uses Futures (could be the case for getAvailability, for example), you can use cats library to be able of use it in a clean way: https://typelevel.org/cats/datatypes/eithert.html

I think it's pretty readable and I wouldn't try to improve it from there, except that !isEmpty equals to nonEmpty.

Related

Rewriting mutable/imperative loop as functional

var lastSize = 0
var all = entries
while(lastSize != all.size){
entries = entries.flatMap(
_.instructions.iterator.asScala
.flatMap(_ match {
case m:MethodInsnNode => Some(MethodCall(m.owner, m.name, m.desc))
case _ => None
})
).toSet.flatMap(findMethod _)
lastSize = all.size
all = all.union(entries)
}
My code uses ASM to generate a Set of all potential MethodNodes that could be called from a given starting set, entries. However, I want to rewrite it in a functional manner. The repeated mapping of sets definitely seems recursive, though I can't entirely wrap my head around how to go about it.
I came up with this, though it uses slightly different (but working?) logic. Would it be possible to write it with tail-call recursion?
private def gatherMethods(current: Set[MethodNode]): Set[MethodNode] = {
val next = current.flatMap(
_.instructions.iterator.asScala
.flatMap(_ match {
case m:MethodInsnNode => Some(MethodCall(m.owner, m.name, m.desc))
case _ => None
})
).toSet.flatMap(findMethod _)
if(next == current) Set()
else current.union(gatherMethods(next))
}

flattening future of option after mapping with a function that return future of option

I have a collection of type Future[Option[String]] and I map it to a function that returns Future[Option[Profile]], but this create a return type of Future[Option[Future[Option[Profile]]]] because queryProfile return type is `Future[Option[Profile]]'
val users: Future[Option[User]] = someQuery
val email: Future[Option[String]] = users map(opSL => opSL map(_.email) )
val userProfile = email map {opE => opE map {E => queryProfile(E)}}
I need to use the Profile object contained deep inside val userProfile without unpacking all these levels, what would be the right way to use flatMap or `flatten', or is there a better approach all together ?
You can get a "partial Future" with something like this:
val maybeProfile: Future[Profile] = users
.collect { case Some(u) => u.email }
.flatMap { email => queryProfile(email) }
.collect { case Some(p) => p }
Now maybeProfile contains the (completely "naked"/unwrapped) Profile instance, but only if it was able to find it. You can .map it as usual to do something else with it, that'll work in the usual ways.
If you want to ever block and wait for completion, you will have to handle the missing case at some point. For example:
val optionalProfile: Option[Profile] = Await.result(
maybeProfile
.map { p => Some(p) } // Or just skip the last `collect` above
.recover { case _:NoSuchElementException => None },
1 seconds
)
If you are happy with just having Future[Option[Profile]], and would prefer to have the "unwrapping" magic, and handling the missing case localized in one place, you can put the two fragments from above together like this:
val maybeProfile: Future[Option[Profile]] = users
.collect { case Some(u) => u.email }
.flatMap { email => queryProfile(email) }
.recover { case _:NoSuchElementException => None }
Or use Option.fold like the other answer suggested:
val maybeProfile: Future[Option[Profile]] = users
.map { _.map(_.email) }
.flatMap { _.fold[Future[Option[Profile]]](Future.successful(None))(queryProfile) }
Personally, I find the last option less readable though.
Personally I think a monad transformer such as OptionT provided by scalaz/cats would be the cleanest approach:
val users = OptionT[Future,User](someQuery)
def queryProfile(email:String) : OptionT[Future,Profile] = ...
for {
u <- users
p <- queryProfile(u.email)
} yield p
I'd just create a helper method like this:
private def resolveProfile(optEmail: Option[String]): Future[Option[Profile] =
optEmail.fold(Future.successful(None)) { email =>
queryProfile(email).map(Some(_))
}
which then allows you to just flatMap your original email future like so:
val userProfile = email.flatMap(resolveProfile)

How to better partition valid or invalid inputs

Given a list of inputs that could be valid or invalid, is there a nice way to transform the list but to fail given one or more invalid inputs and, if necessary, to return information about those invalid inputs? I have something like this, but it feels very inelegant.
def processInput(inputList: List[Input]): Try[List[Output]] = {
inputList map { input =>
if (isValid(input)) Left(Output(input))
else Right(input)
} partition { result =>
result.isLeft
} match {
case (valids, Nil) =>
val outputList = valids map { case Left(output) => output }
Success(outputList)
case (_, invalids) =>
val errList = invalids map { case Right(invalid) => invalid }
Failure(new Throwable(s"The following inputs were invalid: ${errList.mkString(",")}"))
}
}
Is there a better way to do this?
I think you can simplify your current solution quite a bit with standard scala:
def processInput(inputList: List[Input]): Try[List[Output]] =
inputList.partition(isValid) match {
case (valids, Nil) => Success(valids.map(Output))
case (_, invalids) => Failure(new Throwable(s"The following inputs were invalid: ${invalids.mkString(",")}"))
}
Or, you can have a quite elegant solution with scalactic's Or.
import org.scalactic._
def processInputs(inputList: List[Input]): List[Output] Or List[Input] =
inputList.partition(isValid) match {
case (valid, Nil) => Good(valid.map(Output))
case (_, invalid) => Bad(invalid)
}
The result is of type org.scalactic.Or, which you then have to match to Good or Bad. This approach is more useful if you want the list of invalid inputs, you can match it out of Bad.
scalaz's validation is designed exactly for this. Try reading the tale of three nightclubs for how this would work, but the body of your function would probably end up just consisting of something like:
def processInput(inputList: List[Input]): Validation[List[Output]] = {
inputList foldMap { input =>
if (isValid(input)) Failure(Output(input))
else Success(List(input))
}

using if instead of match on Option[X]

I want to replace the below match with an if statement, preferably less verbose than this. I personally find if's to be easier to parse in the mind.
val obj = referencedCollection match{
case None => $set(nextColumn -> MongoDBObject("name" -> name))
case Some( collection) =>....}
Is there an equivalent if statement or some other method that produces the equivalent result?
You can replace the pattern match by a combination of map and getOrElse:
ox match {
case None => a
case Some(x) => f(x)
}
can be replaced by
ox.map(f).getOrElse(a)
You probably ought not indulge yourself in continuing to use if in conditions like this. It's not idiomatic, rarely is faster, is more verbose, requires intermediate variables so is more error-prone, etc..
Oh, and it's a bad idea to use anything with $ signs!
Here are some other patterns that you might use in addition to match:
val obj = referenceCollection.fold( $set(nextColumn -> MongoDBObject("name" -> name) ){
collection => ...
}
val obj = (for (collection <- referenceCollection) yield ...).getOrElse{
$set(nextColumn -> MongoDBObject("name" -> name)
}
val obj = referenceCollection.map{ collection => ... }.getOrElse{
$set(nextColumn -> MongoDBObject("name" -> name)
}
You can basically think of map as the if (x.isDefined) x.get... branch of the if and the getOrElse branch as the else $set... branch. It's not exactly the same, of course, as leaving off the else gives you an expression that doesn't return a value, while leaving off the getOrElse leaves you with an unpacked Option. But the thought-flow is very similar.
Anyway, fold is the most compact. Note that both of these have a bit of runtime overhead above a match or if statement.
There is huge number of options (pun intended):
if (referencedCollection != None) { ... } else { ... }
if (referencedCollection.isDefined) { ... } else { ... } // #Kigyo variant
if (referencedCollection.isEmpty) { /* None processing */ } else { ... }
You could do it like this:
val obj = if(referencedCollection.isDefined) { (work with referencedCollection.get) } else ...

Chaining validation in Scala

I have a Scala case class containing command-line configuration information:
case class Config(emailAddress: Option[String],
firstName: Option[String]
lastName: Option[String]
password: Option[String])
I am writing a validation function that checks that each of the values is a Some:
def validateConfig(config: Config): Try[Config] = {
if (config.emailAddress.isEmpty) {
Failure(new IllegalArgumentException("Email Address")
} else if (config.firstName.isEmpty) {
Failure(new IllegalArgumentException("First Name")
} else if (config.lastName.isEmpty) {
Failure(new IllegalArgumentException("Last Name")
} else if (config.password.isEmpty) {
Failure(new IllegalArgumentException("Password")
} else {
Success(config)
}
}
but if I understand monads from Haskell, it seems that I should be able to chain the validations together (pseudo syntax):
def validateConfig(config: Config): Try[Config] = {
config.emailAddress.map(Success(config)).
getOrElse(Failure(new IllegalArgumentException("Email Address")) >>
config.firstName.map(Success(config)).
getOrElse(Failure(new IllegalArgumentException("First Name")) >>
config.lastName.map(Success(config)).
getOrElse(Failure(new IllegalArgumentException("Last Name")) >>
config.password.map(Success(config)).
getOrElse(Failure(new IllegalArgumentException("Password"))
}
If any of the config.XXX expressions returns Failure, the whole thing (validateConfig) should fail, otherwise Success(config) should be returned.
Is there some way to do this with Try, or maybe some other class?
It's pretty straightforward to convert each Option to an instance of the right projection of Either:
def validateConfig(config: Config): Either[String, Config] = for {
_ <- config.emailAddress.toRight("Email Address").right
_ <- config.firstName.toRight("First Name").right
_ <- config.lastName.toRight("Last Name").right
_ <- config.password.toRight("Password").right
} yield config
Either isn't a monad in the standard library's terms, but its right projection is, and will provide the behavior you want in the case of failure.
If you'd prefer to end up with a Try, you could just convert the resulting Either:
import scala.util._
val validate: Config => Try[Config] = (validateConfig _) andThen (
_.fold(msg => Failure(new IllegalArgumentException(msg)), Success(_))
)
I wish that the standard library provided a nicer way to make this conversion, but it doesn't.
It's a case class, so why aren't you doing this with pattern matching?
def validateConfig(config: Config): Try[Config] = config match {
case Config(None, _, _, _) => Failure(new IllegalArgumentException("Email Address")
case Config(_, None, _, _) => Failure(new IllegalArgumentException("First Name")
case Config(_, _, None, _) => Failure(new IllegalArgumentException("Last Name")
case Config(_, _, _, None) => Failure(new IllegalArgumentException("Password")
case _ => Success(config)
}
In your simple example, my priority would be to forget monads and chaining, just get rid of that nasty if...else smell.
However, while a case class works perfectly well for a short list, for a large number of configuration options, this becomes tedious and the risk of error increases. In this case, I would consider something like this:
Add a method that returns a key->value map of the configuration options, using the option names as the keys.
Have the Validate method check if any value in the map is None
If no such value, return success.
If at least one value matches, return that value name with the error.
So assuming that somewhere is defined
type OptionMap = scala.collection.immutable.Map[String, Option[Any]]
and the Config class has a method like this:
def optionMap: OptionMap = ...
then I would write Config.validate like this:
def validate: Either[List[String], OptionMap] = {
val badOptions = optionMap collect { case (s, None) => s }
if (badOptions.size > 0)
Left(badOptions)
else
Right(optionMap)
}
So now Config.validate returns either a Left containing the name of all the bad options or a Right containing the full map of options and their values. Frankly, it probably doesn't matter what you put in the Right.
Now, anything that wants to validate a Config just calls Config.validate and examines the result. If it's a Left, it can throw an IllegalArgumentException containing one or more of the names of bad options. If it's a Right, it can do whatever it wanted to do, knowing the Config is valid.
So we could rewrite your validateConfig function as
def validateConfig(config: Config): Try[Config] = config.validate match {
case Left(l) => Failure(new IllegalArgumentException(l.toString))
case _ => Success(config)
}
Can you see how much more functional and OO this is getting?
No imperative chain of if...else
The Config object validates itself
The consequences of a Config object being invalid are left to the larger program.
I think a real example would be more complex yet, though. You are validating options by saying "Does it contain Option[String] or None?" but not checking the validity of the string itself. Really, I think your Config class should contain a map of options where the name maps to the value and to an anonymous function that validates the string. I could describe how to extend the above logic to work with that model, but I think I'll leave that as an exercise for you. I will give you a hint: you might want to return not just the list of failed options, but also the reason for failure in each case.
Oh, by the way... I hope none of the above implies that I think you should actually store the options and their values as an optionMap inside the object. I think it's useful to be able to retrieve them like that, but I wouldn't ever encourage such exposure of the actual internal representation ;)
Here's a solution that I came up with after some searching and scaladocs reading:
def validateConfig(config: Config): Try[Config] = {
for {
_ <- Try(config.emailAddress.
getOrElse(throw new IllegalArgumentException("Email address missing")))
_ <- Try(config.firstName.
getOrElse(throw new IllegalArgumentException("First name missing")))
_ <- Try(config.lastName.
getOrElse(throw new IllegalArgumentException("Last name missing")))
_ <- Try(config.password.
getOrElse(throw new IllegalArgumentException("Password missing")))
} yield config
}
Similar to Travis Brown's answer.