Parser combinator handling alternation with an optional parser - scala

I have a parser p of type Parser[Option[X]] and another q of type Parser[Y]. (X and Y are concrete types but that's not important here).
I'd like to combine them in such a way that the resulting parser returns a Parser[Either[X, Y]]. This parser will succeed with Left(x) if p yields Some(x) or, failing that, it will succeed with Right(y) if q yields a y. Otherwise, it will fail. Input will be consumed in the successful cases but not in the unsuccessful case.
I'd appreciate any help with this as I can't quite figure out how to make it work.

A little more perseverance after taking a break and I was able to solve this. I don't think my solution is the most elegant and would appreciate feedback:
def compose[X, Y](p: Parser[Option[X]], q: Parser[Y]): Parser[Either[X, Y]] = Parser {
in =>
p(in) match {
case s#this.Success(Some(_), _) => s map (xo => Left(xo.get))
case _ => q(in) match {
case s#this.Success(_, _) => s map (x => Right(x))
case _ => this.Failure("combine: failed", in)
}
}
}
implicit class ParserOps[X](p: Parser[Option[X]]) {
def ?|[Y](q: => Parser[Y]): Parser[Either[X, Y]] = compose(p, q)
}
// Example of usage
def anadicTerm: Parser[AnadicTerm] = (maybeNumber ?| anadicOperator) ^^ {
case Left(x: Number) => debug("anadicTerm (Number)", AnadicTerm(Right(x)))
case Right(x: String) => debug("anadicTerm (String)", AnadicTerm(Left(x)))
}

Related

Optimization of foldLeft

I'm using the following code, and I'm looking for some ideas to make some optimizations.
analyzePayload:
Input: payload which is JsObject and list of rules, each rule has several conditions.
Output: MyReport of all the rules which succeed, notApplicable or failed on this specific payload.
The size of the list can be pretty big, also each Rule has a big amount of conditions.
I am looking for some ideas on how to optimize that code, maybe with a lazy collection? view? stream? tailrec? and why - Thanks!
Also, note that I have anaylzeMode which can run only until one rule succeeds for ex.
def analyzePayload(payload: JsObject, rules: List[Rule]): MyReport = {
val analyzeMode = appConfig.analyzeMode
val (succeed, notApplicable, failed) = rules.foldLeft((List[Rule](), List[Rule](), List[Rule]())) { case ( seed # (succeedRules,notApplicableRules,failedRules), currRule) =>
// Evaluate Single Rule
def step(): (List[Rule], List[Rule], List[Rule]) = evalService.eval(currRule, payload) match {
// If the result is succeed
case EvalResult(true, _, _) => (currRule :: succeedRules, notApplicableRules, failedRules)
// If the result is notApplicable
case EvalResult(_, missing # _ :: _, _) => (succeedRules, currRule :: notApplicableRules, failedRules
)
// If the result is unmatched
case EvalResult(_, _, unmatched # _ :: _) => (succeedRules, notApplicableRules, currRule :: failedRules)
}
analyzeMode match {
case UNTIL_FIRST_SUCCEED => if(succeedRules.isEmpty) step() else seed
case UNTIL_FIRST_NOT_APPLICABLE => if(notApplicableRules.isEmpty) step() else seed
case UNTIL_FIRST_FAILED => if(failedRules.isEmpty) step() else seed
case DEFAULT => step()
case _ => throw new IllegalArgumentException(s"Unknown mode = ${analyzeMode}")
}
}
MyReport(succeed.reverse, notApplicable.reverse, failed.reverse)
}
First Edit:
Changed the code to use tailrec from #Tim Advise, any other suggestions? or some suggestions to make the code a little prettier?
Also, i wanted to ask if there any difference to use view before the foldLeft on the previous implementation.
Also maybe use other collection such as ListBuffer or Vector
def analyzePayload(payload: JsObject, actionRules: List[ActionRule]): MyReport = {
val analyzeMode = appConfig.analyzeMode
def isCompleted(succeed: List[Rule], notApplicable: List[Rule], failed: List[Rule]) = ((succeed, notApplicable, failed), analyzeMode) match {
case (( _ :: _, _, _), UNTIL_FIRST_SUCCEED) | (( _,_ :: _, _), UNTIL_FIRST_NOT_APPLICABLE) | (( _, _, _ :: _), UNTIL_FIRST_FAILED) => true
case (_, DEFAULT) => false
case _ => throw new IllegalArgumentException(s"Unknown mode on analyzePayload with mode = ${analyzeMode}")
}
#tailrec
def _analyzePayload(actionRules: List[ActionRule])(succeed: List[Rule], notApplicable: List[Rule], failed: List[Rule]): (List[Rule], List[Rule] ,List[Rule]) = actionRules match {
case Nil | _ if isCompleted(succeed, notApplicable, failed) => (succeed, notApplicable, failed)
case actionRule :: tail => actionRuleService.eval(actionRule, payload) match {
// If the result is succeed
case EvalResult(true, _, _) => _analyzePayload(tail)(actionRule :: succeed, notApplicable, failed)
// If the result is notApplicable
case EvalResult(_, missing # _ :: _, _) => _analyzePayload(tail)(succeed, actionRule :: notApplicable, failed)
// If the result is unmatched
case EvalResult(_, _, unmatched # _ :: _) => _analyzePayload(tail)(succeed, notApplicable, actionRule :: failed)
}
}
val res = _analyzePayload(actionRules)(Nil,Nil,Nil)
MyReport(res._1, res._2, res._3)
}
Edit 2: (Questions)
If there result will be forwarded to the Client - There no meaning for do it as view? since all the data will be evaluated right?
Maybe should I use ParSeq instead? or this will be just slower since the operation of the evalService.eval(...) is not a heavy operation?
Two obvious optimisations:
Use a tail-recursive function rater than foldLeft so that the compiler can generate an optimised loop and terminate as soon as the appropriate rule is found.
Since analyzeMode is constant, take the match outside the foldLeft. Either have separate code paths for each mode, or use analyzeMode to select a function that is used inside the loop to check for termination.
The code is rather fine, the main thing to revisit would be to make evalService.eval evaluate multiple rules in a single traversal of the json object, assuming the size of the json is not negligible

scala using calculations from pattern matching's guard (if) in body

I'm using pattern matching in scala a lot. Many times I need to do some calculations in guard part and sometimes they are pretty expensive. Is there any way to bind calculated values to separate value?
//i wan't to use result of prettyExpensiveFunc in body safely
people.collect {
case ...
case Some(Right((x, y))) if prettyExpensiveFunc(x, y) > 0 => prettyExpensiveFunc(x)
}
//ideally something like that could be helpful, but it doesn't compile:
people.collect {
case ...
case Some(Right((x, y))) if {val z = prettyExpensiveFunc(x, y); y > 0} => z
}
//this sollution works but it isn't safe for some `Seq` types and is risky when more cases are used.
var cache:Int = 0
people.collect {
case ...
case Some(Right((x, y))) if {cache = prettyExpensiveFunc(x, y); cache > 0} => cache
}
Is there any better solution?
ps: Example is simplified and I don't expect anwers that shows that I don't need pattern matching here.
You can use cats.Eval to make expensive calculations lazy and memoizable, create Evals using .map and extract .value (calculated at most once - if needed) in .collect
values.map { value =>
val expensiveCheck1 = Eval.later { prettyExpensiveFunc(value) }
val expensiveCheck2 = Eval.later { anotherExpensiveFunc(value) }
(value, expensiveCheck1, expensiveCheck2)
}.collect {
case (value, lazyResult1, _) if lazyResult1.value > 0 => ...
case (value, _, lazyResult2) if lazyResult2.value > 0 => ...
case (value, lazyResult1, lazyResult2) if lazyResult1.value > lazyResult2.value => ...
...
}
I don't see a way of doing what you want without creating some implementation of lazy evaluation, and if you have to use one, you might as well use existing one instead of rolling one yourself.
EDIT. Just in case you haven't noticed - you aren't losing the ability to pattern match by using tuple here:
values.map {
// originial value -> lazily evaluated memoized expensive calculation
case a # Some(Right((x, y)) => a -> Some(Eval.later(prettyExpensiveFunc(x, y)))
case a => a -> None
}.collect {
// match type and calculation
...
case (Some(Right((x, y))), Some(lazyResult)) if lazyResult.value > 0 => ...
...
}
Why not run the function first for every element and then work with a tuple?
Seq(1,2,3,4,5).map(e => (e, prettyExpensiveFunc(e))).collect {
case ...
case (x, y) if y => y
}
I tried own matchers and effect is somehow OK, but not perfect. My matcher is untyped, and it is bit ugly to make it fully typed.
class Matcher[T,E](f:PartialFunction[T, E]) {
def unapply(z: T): Option[E] = if (f.isDefinedAt(z)) Some(f(z)) else None
}
def newMatcherAny[E](f:PartialFunction[Any, E]) = new Matcher(f)
def newMatcher[T,E](f:PartialFunction[T, E]) = new Matcher(f)
def prettyExpensiveFunc(x:Int) = {println(s"-- prettyExpensiveFunc($x)"); x%2+x*x}
val x = Seq(
Some(Right(22)),
Some(Right(10)),
Some(Left("Oh now")),
None
)
val PersonAgeRank = newMatcherAny { case Some(Right(x:Int)) => (x, prettyExpensiveFunc(x)) }
x.collect {
case PersonAgeRank(age, rank) if rank > 100 => println("age:"+age + " rank:" + rank)
}
https://scalafiddle.io/sf/hFbcAqH/3

How to better partition valid or invalid inputs

Given a list of inputs that could be valid or invalid, is there a nice way to transform the list but to fail given one or more invalid inputs and, if necessary, to return information about those invalid inputs? I have something like this, but it feels very inelegant.
def processInput(inputList: List[Input]): Try[List[Output]] = {
inputList map { input =>
if (isValid(input)) Left(Output(input))
else Right(input)
} partition { result =>
result.isLeft
} match {
case (valids, Nil) =>
val outputList = valids map { case Left(output) => output }
Success(outputList)
case (_, invalids) =>
val errList = invalids map { case Right(invalid) => invalid }
Failure(new Throwable(s"The following inputs were invalid: ${errList.mkString(",")}"))
}
}
Is there a better way to do this?
I think you can simplify your current solution quite a bit with standard scala:
def processInput(inputList: List[Input]): Try[List[Output]] =
inputList.partition(isValid) match {
case (valids, Nil) => Success(valids.map(Output))
case (_, invalids) => Failure(new Throwable(s"The following inputs were invalid: ${invalids.mkString(",")}"))
}
Or, you can have a quite elegant solution with scalactic's Or.
import org.scalactic._
def processInputs(inputList: List[Input]): List[Output] Or List[Input] =
inputList.partition(isValid) match {
case (valid, Nil) => Good(valid.map(Output))
case (_, invalid) => Bad(invalid)
}
The result is of type org.scalactic.Or, which you then have to match to Good or Bad. This approach is more useful if you want the list of invalid inputs, you can match it out of Bad.
scalaz's validation is designed exactly for this. Try reading the tale of three nightclubs for how this would work, but the body of your function would probably end up just consisting of something like:
def processInput(inputList: List[Input]): Validation[List[Output]] = {
inputList foldMap { input =>
if (isValid(input)) Failure(Output(input))
else Success(List(input))
}

map expression in case clause in scala pattern matching

I have a configuration value that matches to one of the values in a map and depending on to which it matches i take an action. Here is some sample code of what i am trying to do
val x = 1 // or 2 or 3
val config = Map("c1"-> 1, "c2"-> 2, "c3"-> 3)
x match {
case config("c1") =>
println("1")
case config("c2") =>
println("2")
case config("c3") =>
println("3")
}
Now this should print 1 because config("c1") evaluates to 1 but it gives error
error: value config is not a case class, nor does it have an unapply/unapplySeq member
case config("c1") =>
Similarly for the other 2 cases. Why should i have an unapply here? Any pointers?
An expression like that looks like an extractor, hence the message about unapply/unapplySeq methods. If you don't want to use an extractor but just want to match against a plain value, you need to store that value in a stable identifier - you can't use an arbitrary expression as a match case:
val case1 = config("c1")
x match {
case case1 => println("1")
...
}
To the best of my knowledge, in Scala, x match {case config("c1") gets translated to config.unapply(x) with the branching dependent on the result of the unapply method. As Imm already mentioned in his answer, this isn't the case for stable identifiers (literals and val), and I'd encourage you to use his solution.
Nevertheless, to show you how you could solve the problem using extractors, I'd like to post a different solution:
def main(args: Array[String]): Unit = {
object config {
val configData = Map("c1" -> 1, "c2" -> 2, "c3" -> 3)
def unapply(value: Int): Option[String] = configData find (_._2 == value) map (_._1)
}
1 to 4 foreach {
case config("c1") => println("1")
case config("c2") => println("2")
case config("c3") => println("3")
case _ => println("no match")
}
}
I changed the match for a foreach to show the different results, but this has no effect on the implementation. This would print:
1
2
3
no match
As you can see, case config("c1") now calls the unapply method and checks whether the result is Some("c1"). Note that this is inverse to how you'd use a map: The key is searched according to the value. However, this makes sense: If in the map, "c1" and "c2" both map to 1, then 1 matches both, the same way _ matches everything, in our case even 4 which is not configured.
Here's also a very brief tutorial on extractors. I don't find it particularly good, because both, the returned type and the argument type are Int, but it might help you understand what's going on.
As others have stated, with x match { case config("c1") => ..., scala looks for an extractor by the name of config (something with an unapply method that takes a single value and returns an Optional value); Making pattern matching work this way seems like an abuse of the pattern, and I would not use an extractor for this.
Personally, I would recommend one of the following:
if (x == config("c1"))
println("1")
else if (x == config("c2"))
println("2")
else ...
Or, if you're set on using a match statement, you can use conditionals like this:
x match {
case _ if x == config("c1") =>
println("1")
case _ if x == config("c2") =>
println("2")
case _ if x == config("c3") =>
println("3")
}
Not as clean; unfortunately, there isn't a way to invoke a method call literally where the extractor goes. You can use back-ticks to tell scala "match against the value of this variable" (rather than default behavior, which would yield the value named as that variable):
val (c1,c2,c3) = (config("c1"), config("c2"), config("c3"))
x match {
case `c1` =>
println("1")
case `c2` =>
println("2")
case `c3` =>
println("3")
}
Finally, if your goal is to reverse-apply a map, maybe try this instead?
scala> Map("a" -> 1).map { case (k,v) => (v,k) }
res0: scala.collection.immutable.Map[Int,String] = Map(1 -> a)

Using Either to process failures in Scala code

Option monad is a great expressive way to deal with something-or-nothing things in Scala. But what if one needs to log a message when "nothing" occurs? According to the Scala API documentation,
The Either type is often used as an
alternative to scala.Option where Left
represents failure (by convention) and
Right is akin to Some.
However, I had no luck to find best practices using Either or good real-world examples involving Either for processing failures. Finally I've come up with the following code for my own project:
def logs: Array[String] = {
def props: Option[Map[String, Any]] = configAdmin.map{ ca =>
val config = ca.getConfiguration(PID, null)
config.properties getOrElse immutable.Map.empty
}
def checkType(any: Any): Option[Array[String]] = any match {
case a: Array[String] => Some(a)
case _ => None
}
def lookup: Either[(Symbol, String), Array[String]] =
for {val properties <- props.toRight('warning -> "ConfigurationAdmin service not bound").right
val logsParam <- properties.get("logs").toRight('debug -> "'logs' not defined in the configuration").right
val array <- checkType(logsParam).toRight('warning -> "unknown type of 'logs' confguration parameter").right}
yield array
lookup.fold(failure => { failure match {
case ('warning, msg) => log(LogService.WARNING, msg)
case ('debug, msg) => log(LogService.DEBUG, msg)
case _ =>
}; new Array[String](0) }, success => success)
}
(Please note this is a snippet from a real project, so it will not compile on its own)
I'd be grateful to know how you are using Either in your code and/or better ideas on refactoring the above code.
Either is used to return one of possible two meaningful results, unlike Option which is used to return a single meaningful result or nothing.
An easy to understand example is given below (circulated on the Scala mailing list a while back):
def throwableToLeft[T](block: => T): Either[java.lang.Throwable, T] =
try {
Right(block)
} catch {
case ex => Left(ex)
}
As the function name implies, if the execution of "block" is successful, it will return "Right(<result>)". Otherwise, if a Throwable is thrown, it will return "Left(<throwable>)". Use pattern matching to process the result:
var s = "hello"
throwableToLeft { s.toUpperCase } match {
case Right(s) => println(s)
case Left(e) => e.printStackTrace
}
// prints "HELLO"
s = null
throwableToLeft { s.toUpperCase } match {
case Right(s) => println(s)
case Left(e) => e.printStackTrace
}
// prints NullPointerException stack trace
Hope that helps.
Scalaz library has something alike Either named Validation. It is more idiomatic than Either for use as "get either a valid result or a failure".
Validation also allows to accumulate errors.
Edit: "alike" Either is complettly false, because Validation is an applicative functor, and scalaz Either, named \/ (pronounced "disjonction" or "either"), is a monad.
The fact that Validation can accumalate errors is because of that nature. On the other hand, / has a "stop early" nature, stopping at the first -\/ (read it "left", or "error") it encounters. There is a perfect explanation here: http://typelevel.org/blog/2014/02/21/error-handling.html
See: http://scalaz.googlecode.com/svn/continuous/latest/browse.sxr/scalaz/example/ExampleValidation.scala.html
As requested by the comment, copy/paste of the above link (some lines removed):
// Extracting success or failure values
val s: Validation[String, Int] = 1.success
val f: Validation[String, Int] = "error".fail
// It is recommended to use fold rather than pattern matching:
val result: String = s.fold(e => "got error: " + e, s => "got success: " + s.toString)
s match {
case Success(a) => "success"
case Failure(e) => "fail"
}
// Validation is a Monad, and can be used in for comprehensions.
val k1 = for {
i <- s
j <- s
} yield i + j
k1.toOption assert_≟ Some(2)
// The first failing sub-computation fails the entire computation.
val k2 = for {
i <- f
j <- f
} yield i + j
k2.fail.toOption assert_≟ Some("error")
// Validation is also an Applicative Functor, if the type of the error side of the validation is a Semigroup.
// A number of computations are tried. If the all success, a function can combine them into a Success. If any
// of them fails, the individual errors are accumulated.
// Use the NonEmptyList semigroup to accumulate errors using the Validation Applicative Functor.
val k4 = (fNel <**> fNel){ _ + _ }
k4.fail.toOption assert_≟ some(nel1("error", "error"))
The snippet you posted seems very contrived. You use Either in a situation where:
It's not enough to just know the data isn't available.
You need to return one of two distinct types.
Turning an exception into a Left is, indeed, a common use case. Over try/catch, it has the advantage of keeping the code together, which makes sense if the exception is an expected result. The most common way of handling Either is pattern matching:
result match {
case Right(res) => ...
case Left(res) => ...
}
Another interesting way of handling Either is when it appears in a collection. When doing a map over a collection, throwing an exception might not be viable, and you may want to return some information other than "not possible". Using an Either enables you to do that without overburdening the algorithm:
val list = (
library
\\ "books"
map (book =>
if (book \ "author" isEmpty)
Left(book)
else
Right((book \ "author" toList) map (_ text))
)
)
Here we get a list of all authors in the library, plus a list of books without an author. So we can then further process it accordingly:
val authorCount = (
(Map[String,Int]() /: (list filter (_ isRight) map (_.right.get)))
((map, author) => map + (author -> (map.getOrElse(author, 0) + 1)))
toList
)
val problemBooks = list flatMap (_.left.toSeq) // thanks to Azarov for this variation
So, basic Either usage goes like that. It's not a particularly useful class, but if it were you'd have seen it before. On the other hand, it's not useless either.
Cats has a nice way to create an Either from exception-throwing code:
val either: Either[NumberFormatException, Int] =
Either.catchOnly[NumberFormatException]("abc".toInt)
// either: Either[NumberFormatException,Int] = Left(java.lang.NumberFormatException: For input string: "abc")
in https://typelevel.org/cats/datatypes/either.html#working-with-exception-y-code