Signaling failure via the function application parser combinator - scala

I need to do more complex syntax checking on a parser match than the standard notation permits and am currently doing it in the function application ^^. A sample, simplified scenario is checking for duplicate keywords:
def keywords: Parser[List[String]] = "[" ~ repsep(keyword, ",") ~ "]" ^^ {
case _ ~ ks ~ _ =>
ks.groupBy(x => x).filter(_._2.length > 1).keys.toList match {
case Nil => ks
case x => throw new DuplicateKeywordsException(x)
}
}
This works in the sense that my parser throws an exception, but I want the failure to be captured as a ParseResult.Failure that records the Input where it happened. I can't figure out how to signal this from within a ^^ block, or which other construct to use to achieve the same end.

OK, I followed Erik Meijer's advice of following the types down the happy path. Looking at how ^^ is defined in Programming in Scala (which differs from the actual code), I realized it is basically just a map function:
def ^^ [U](f: T => U): Parser[U] = new Parser[U] {
def apply(in: Input) = p(in) match {
case Success(x, in1) => Success(f(x), in1)
case failure => failure
}
}
Basically it's Parser[T] => Parser[U].
Parser[T] itself is a function of Input => ParseResult[T] and ^^ just defines a new parser by supplying the apply method, which on invocation either transforms Success[T] to Success[U] or just passes along Failure.
To achieve the goal of injecting a new Failure during mapping, I need a new mapping function that takes a function like f: T => Either[String,U], so I can signal either an error message or a successful mapping. I chose Either with String, since Failure just takes a string message. This new mapping function is then added to Parser[T] via an implicit class:
implicit class RichParser[+T](p: Parser[T]) {
def ^^? [U](f: T => Either[String,U]): Parser[U] = new Parser[U] {
def apply(in: Input) = p(in) match {
case Success(x, in1) => f(x) match {
case Left(error) => Failure(error,in1)
case Right(x1) => Success(x1,in1)
}
case failure:Failure => failure
case error:Error => error
}
}
}
And now keywords can be defined as:
def keywords: Parser[List[String]] = "[" ~ repsep(keyword, ",") ~ "]" ^^? {
case _ ~ ks ~ _ =>
ks.groupBy(x => x).filter(_._2.length > 1).keys.toList match {
case Nil => Right(ks)
case x => Left("found duplicate keywords: " + x.mkString(", "))
}
}
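To see the idea in isolation, here is a minimal sketch that does not use the real scala.util.parsing.combinator classes. Result, Ok, Fail, P, mapEither, words and noDuplicates are all made-up names for illustration; only the shape of the Either-based mapping mirrors ^^? above.

```scala
object EitherMapSketch {
  // Toy stand-ins for ParseResult, Success and Failure (assumptions, not the library types).
  sealed trait Result[+T]
  final case class Ok[+T](value: T, rest: String) extends Result[T]
  final case class Fail(msg: String, rest: String) extends Result[Nothing]

  // A toy parser: just a function from the input to a result.
  final case class P[T](run: String => Result[T]) {
    // Plain mapping, like ^^: it can only transform a success.
    def map[U](f: T => U): P[U] = P[U](in => run(in) match {
      case Ok(v, rest) => Ok(f(v), rest)
      case fail: Fail  => fail
    })

    // Either-based mapping, like ^^?: a Left turns a success into a failure.
    def mapEither[U](f: T => Either[String, U]): P[U] = P[U](in => run(in) match {
      case Ok(v, rest) => f(v) match {
        case Right(u)  => Ok(u, rest)
        case Left(msg) => Fail(msg, rest)
      }
      case fail: Fail => fail
    })
  }

  // Hypothetical parser that consumes the whole input and splits it on commas.
  val words: P[List[String]] = P[List[String]](in => Ok(in.split(",").toList, ""))

  // The duplicate check becomes a failure value instead of an exception.
  val noDuplicates: P[List[String]] = words.mapEither { ws =>
    val dups = ws.diff(ws.distinct).distinct
    if (dups.isEmpty) Right(ws)
    else Left("found duplicate keywords: " + dups.mkString(", "))
  }
}
```

Running noDuplicates on "a,b" yields an Ok, while "a,b,a" yields a Fail carrying the message and the remaining input, so the caller can report the problem without catching anything.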

Related

Scala multi-level pattern matching

I'm stuck with multi-level pattern matching. In the code below I want to match one particular case that is checked at several levels ("cfe is an Assignment, assignmentCfe.getRight is a BinaryExpression", and so on). The solution looks ugly, and I hope Scala can offer me something better. :)
def findAll(cfg: Cfg, consumer: Consumer[Violation]): Unit = {
val fa = new FlowAnalyzer
val states = fa.analyze(cfg)
states.foreach { case (cfe, state) => cfe match {
case assign: Assignment => assign.getRight match {
case expression: BinaryExpression => expression.getOperator match {
case Operator.Div | Operator.Rem => processDivisions()
case _ =>
}
case _ =>
}
case _ =>
}
case _ =>
}
}
How can I get rid of these empty default cases at the end?
Another approach would be to use nested conditions, but IntelliJ IDEA then suggests replacing these conditions with pattern matching again:
states.foreach { case (cfe, state) => if (cfe.isInstanceOf[Assignment]) {
val assignment = cfe.asInstanceOf[Assignment]
if (assignment.getRight.isInstanceOf[BinaryExpression]) {
val expression = assignment.getRight.asInstanceOf[BinaryExpression]
if (expression.getOperator == Operator.Div || expression.getOperator == Operator.Rem) processDivisions()
}
}}
Are Assignment and BinaryExpression themselves case classes? Or do they have corresponding unapply methods? If so, then you can nest pattern matches and ignore fields you don't care about. For example, something like:
def findAll(cfg: Cfg, consumer: Consumer[Violation]): Unit = {
val fa = new FlowAnalyzer
val states = fa.analyze(cfg)
states.foreach {
case (Assignment(_, BinaryExpression(_, _, Operator.Div | Operator.Rem)), _) => processDivisions()
case _ =>
}
}
This will at least cut the number of default matches down to 1.
If these are not case classes or don't have extractors, then you could consider writing your own if this is a common enough (anti)pattern in your code: http://docs.scala-lang.org/tutorials/tour/extractor-objects.html
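As a rough sketch of what hand-written extractors could look like: the classes below are hypothetical stand-ins for the question's Assignment, BinaryExpression and Operator (the real ones are not shown), and AssignedValue, BinaryOp and isDivision are names invented for this example.

```scala
object ExtractorSketch {
  // Hypothetical stand-ins for the Java-style classes in the question.
  sealed trait Operator
  object Operator {
    case object Div extends Operator
    case object Rem extends Operator
    case object Add extends Operator
  }
  class BinaryExpression(op: Operator) { def getOperator: Operator = op }
  class Assignment(right: AnyRef) { def getRight: AnyRef = right }

  // Extractors that expose only the part we want to match on.
  object AssignedValue {
    def unapply(a: Assignment): Option[AnyRef] = Some(a.getRight)
  }
  object BinaryOp {
    def unapply(e: BinaryExpression): Option[Operator] = Some(e.getOperator)
  }

  // One nested pattern and a single default case replace the three nested matches.
  def isDivision(cfe: AnyRef): Boolean = cfe match {
    case AssignedValue(BinaryOp(Operator.Div | Operator.Rem)) => true
    case _                                                    => false
  }
}
```

The compiler inserts the type tests for you: a value that is not an Assignment, or whose right-hand side is not a BinaryExpression, simply falls through to the default case.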
One other idea is you could use the "pimp my library" pattern to define an implicit conversion from any object into a class that can do a kind of partial matching:
class PartialMatcher[A](a: A) {
def partialMatch(f: PartialFunction[A, Unit]): Unit = if (f.isDefinedAt(a)) f(a)
}
implicit def makePartialMatcher[A](a: A) = new PartialMatcher(a)
Then just replace all of those matches with partialMatch:
def findAll(cfg: Cfg, consumer: Consumer[Violation]): Unit = {
val fa = new FlowAnalyzer
val states = fa.analyze(cfg)
states.foreach { case (cfe, state) => cfe partialMatch {
case assign: Assignment => assign.getRight partialMatch {
case expression: BinaryExpression => expression.getOperator partialMatch {
case Operator.Div | Operator.Rem => processDivisions()
}
}
}}
}
Note that there are other reasons why you might avoid this kind of thing... overusing implicit conversions can make understanding code a lot more difficult. It's a stylistic choice.
Use .collect:
def findAll(cfg: Cfg, consumer: Consumer[Violation]): Unit = {
val fa = new FlowAnalyzer
val states = fa.analyze(cfg)
states.collect { case (assign: Assignment, _) =>
assign.getRight
}.collect { case expression: BinaryExpression =>
expression.getOperator
}.collect { case Operator.Div | Operator.Rem =>
processDivisions()
}
}

Scala Argument Types of Anonymous Function in Overload

There are quite a few questions about this error message on SO, but none of them seem to be about this issue.
The argument types of an anonymous function must be fully known. (SLS 8.5)
The offending block of code is attempting to emulate Ruby's block functionality, with the added benefit that an argument can be pattern matched in the process.
object Block {
def apply(f: => Unit) = apply((_: String) => f)
def apply(f: String => Unit) = ???
}
def example() = {
Block { // Error!
case "A" => println("First letter of the alphabet")
case _ => println("Not the first letter of the alphabet")
}
}
Even though, one line down, Scala can clearly see that I'm matching against a string, it can't infer the argument type.
The trouble here is that there are two apply methods. If there were only one:
object Block {
def apply(f: String => Unit) = ???
}
Then everything would work fine, as Scala would see the application and immediately understand the required type of the anonymous function. However, when there are two or more different methods:
object Block {
def apply(f: => Unit) = apply((_: String) => f)
def apply(f: String => Unit) = ???
}
Scala cannot deduce the type of the argument from the application of apply, and it cannot deduce which apply to use from the type of the argument, so each inference depends on the other and neither can proceed. The simplest solution, it seems, is to rename one of the methods.
object Block {
def apply(f: => Unit) = apply((_: String) => f)
def branchOff(f: String => Unit) = ???
}
It's not much more difficult to call now.
Block { println("This is a regular application.") }
Block.branchOff {
case "A" => println("A is for aardvark")
case "B" => println("B is for beaver")
case _ => println("Huh?")
}
And you don't have to specify any type arguments, or any explicit arguments at all for that matter.
More details on this in a thread over on GitHub: https://github.com/ReactiveX/RxScala/issues/160.
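For anyone who wants to try the renamed version, here is a runnable sketch. The method bodies are assumptions (the original leaves them as ???): branchOff simply feeds the fixed string "A" to the block, and log is a made-up buffer that records what ran.

```scala
import scala.collection.mutable.ListBuffer

object BlockSketch {
  // Made-up buffer so the effect of each call can be inspected.
  val log = ListBuffer.empty[String]

  object Block {
    // Hypothetical body: pass a fixed input to the block.
    def branchOff(f: String => Unit): Unit = f("A")
    // The by-name variant ignores the input and just runs the body.
    def apply(f: => Unit): Unit = branchOff(_ => f)
  }

  // With only one apply, the plain block needs no annotations.
  Block { log += "plain block" }

  // The parameter type of branchOff tells the compiler the cases match on a String.
  Block.branchOff {
    case "A" => log += "A is for aardvark"
    case _   => log += "Huh?"
  }
}
```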
If you really like the idea of having two different apply() methods then you have to offer some help to the inference engine.
def example() = {
Block{s:String => s match {
case "A" => println("First letter of the alphabet")
case _ => println("Not the first letter of the alphabet")
}}
}

What is the reference of the object that a partial function matches on?

Looking at this function as an example:
def receive = {
case "test" => log.info("received test")
case _ => log.info("received unknown message")
}
What object is being matched on? On the right hand side of the arrows, how can I refer to the object being matched on?
You can do it with an if-guard:
def receive: String => Unit = {
case str if str == "test" => println(str)
case _ => println("other")
}
Option("test").map(receive) // prints "test"
Option("foo").map(receive) // prints "other"
Note that if you have an object that you want to refer to, a pattern such as foo: Foo(s) won't work (foo: Foo will, but then you lose the binding to Foo's value s). In that case you need the @ pattern binder:
case class Foo(s: String)
def receive: Foo => Unit = {
case foo @ Foo(s) => println(foo.s) // could've referred to just "s" too
case _ => println("other")
}
Option(Foo("test")).map(receive) // prints "test"
If you want a case to match on anything and still have a reference to it, use a variable name instead of an underscore:
def receive = {
case "test" => log.info("received test")
case other => log.info("received unknown message: " + other)
}
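Putting the options side by side, a small self-contained sketch (describe and Foo are names made up for this example, and it returns strings instead of logging so the results are easy to check):

```scala
object BinderSketch {
  final case class Foo(s: String)

  def describe(msg: Any): String = msg match {
    // A literal pattern: there is nothing to bind, the value is known.
    case "test" => "the literal test"
    // name @ pattern binds the whole matched object as well as its parts.
    case foo @ Foo(s) => s"$foo carries $s"
    // A lower-case variable pattern matches anything and names it.
    case other => s"unknown: $other"
  }
}
```

In every case the object being matched is the argument passed to the function; the pattern only decides which names are bound to it or to its parts.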

Can we have an array of by-name-parameter functions?

In Scala we have by-name parameters, where we can write
def foo[T](f: => T):T = {
f // invokes f
}
// use as:
foo(println("hello"))
I now want to do the same with an array of methods, that is I want to use them as:
def foo[T](f:Array[ => T]):T = { // does not work
f(0) // invokes f(0) // does not work
}
foo(println("hi"), println("hello")) // does not work
Is there any way to do what I want? The best I have come up with is:
def foo[T](f:() => T *):T = {
f(0)() // invokes f(0)
}
// use as:
foo(() => println("hi"), () => println("hello"))
or
def foo[T](f:Array[() => T]):T = {
f(0)() // invokes f(0)
}
// use as:
foo(Array(() => println("hi"), () => println("hello")))
EDIT: The proposed SIP-24 is not very useful as pointed out by Seth Tisue in a comment to this answer.
An example where this will be problematic is the following code of a utility function trycatch:
type unitToT[T] = ()=>T
def trycatch[T](list:unitToT[T] *):T = list.size match {
case i if i > 1 =>
try list.head()
catch { case t:Any => trycatch(list.tail: _*) }
case 1 => list(0)()
case _ => throw new Exception("call list must be non-empty")
}
Here trycatch takes a list of methods of type ()=>T and applies each element successively until it succeeds or the end is reached.
Now suppose I have two methods:
def getYahooRate(currencyA:String, currencyB:String):Double = ???
and
def getGoogleRate(currencyA:String, currencyB:String):Double = ???
that convert one unit of currencyA to currencyB and output Double.
I use trycatch as:
val usdEuroRate = trycatch(() => getYahooRate("USD", "EUR"),
() => getGoogleRate("USD", "EUR"))
I would have preferred:
val usdEuroRate = trycatch(getYahooRate("USD", "EUR"),
getGoogleRate("USD", "EUR")) // does not work
In the example above, I would like getGoogleRate("USD", "EUR") to be invoked only if getYahooRate("USD", "EUR") throws an exception. This is not the intended behavior of SIP-24.
Here is a solution, although with a few restrictions compared to direct call-by-name:
import scala.util.control.NonFatal
object Main extends App {
implicit class Attempt[+A](f: => A) {
def apply(): A = f
}
def tryCatch[T](attempts: Attempt[T]*): T = attempts.toList match {
case a :: b :: rest =>
try a()
catch {
case NonFatal(e) =>
tryCatch(b :: rest: _*)
}
case a :: Nil =>
a()
case Nil => throw new Exception("call list must be non-empty")
}
def a = println("Hi")
def b: Unit = sys.error("one")
def c = println("bye")
tryCatch(a, b, c)
def d: Int = sys.error("two")
def e = { println("here"); 45 }
def f = println("not here")
val result = tryCatch(d, e, f)
println("Result is " + result)
}
The restrictions are:
Using a block as an argument won't work; only the last expression of the block will be wrapped in an Attempt.
If the expression is of type Nothing (e.g., if b and d weren't annotated), the conversion to Attempt is not inserted since Nothing is a subtype of every type, including Attempt. Presumably the same would apply for an expression of type Null.
As of Scala 2.11.7, the answer is no. However, there is SIP-24, so in some future version your f: => T* version may be possible.
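A different technique that gives the same lazy fallback without by-name varargs is to chain scala.util.Try values with orElse, whose argument is itself by-name. This is not the trycatch function above, just a sketch: primaryRate, backupRate and unusedRate are made-up stand-ins for the rate lookups, and calls only records which of them actually ran.

```scala
import scala.util.Try

object FallbackSketch {
  // Records which lookups were evaluated (most recent first).
  var calls: List[String] = Nil

  // Hypothetical lookups: the first one always fails.
  def primaryRate(): Double = { calls ::= "primary"; sys.error("primary source down") }
  def backupRate(): Double = { calls ::= "backup"; 0.9 }
  def unusedRate(): Double = { calls ::= "unused"; 1.0 }

  // orElse takes its argument by name, so each fallback is evaluated
  // only if everything before it has failed.
  val rate: Double =
    Try(primaryRate())
      .orElse(Try(backupRate()))
      .orElse(Try(unusedRate()))
      .get
}
```

Note that Try only catches non-fatal exceptions, and that .get rethrows the last failure if every alternative fails.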

How to pattern match on Scala's parser combinator result

We have a multithreaded RPC server that parses input strings. We've run into an issue where Scala's parser combinator library is not thread-safe: the var lastNoSuccess in Parsers.scala is used by every parse. We get a NullPointerException on this line:
if (!(lastNoSuccess != null && next.pos < lastNoSuccess.next.pos))
The default way to implement the parser is to make an object that extends one of the Parsers traits, but I want to construct a parser on demand so that each one has its own internal state, so I'm using a class instead of an object. However, I can't get it to compile, since I need to pattern match on the result:
import scala.util.parsing.combinator.RegexParsers
class SqlParserImpl
extends RegexParsers
{
val term: Parser[String] = """(?i)term\b""".r
}
object Test
{
def main(args: Array[String]): Unit =
{
val parser = new SqlParserImpl
parser.parseAll(parser.term, "term") match {
// How do I match?
case SqlParserImpl#Success(result, _) => true
case SqlParserImpl#NoSuccess => false
}
}
}
Fails with
t.scala:16: error: '=>' expected but '#' found.
case SqlParserImpl#Success(result, _) => true
^
t.scala:17: error: '=>' expected but '#' found.
case SqlParserImpl#NoSuccess => false
^
two errors found
Use this:
val parser = new SqlParserImpl
parser.parseAll(parser.term, "term") match {
case parser.Success(result, _) => true
case parser.NoSuccess(_, _) => false
}
The # sign is used to designate a type member. In your case the pattern is a constructor or extractor pattern, which needs to refer to an object or to something that looks like a constructor.
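The same distinction can be reproduced without the parser library. In this sketch Toolkit, Good, Bad and run are made-up names; the point is only that Outer#Inner is valid where a type is expected, while a pattern has to go through a concrete instance.

```scala
object PathDependentSketch {
  // Hypothetical class with nested result types, shaped like a Parsers instance.
  class Toolkit {
    sealed trait Result
    case class Good(value: String) extends Result
    case class Bad(msg: String) extends Result
    def run(in: String): Result = if (in.nonEmpty) Good(in) else Bad("empty input")
  }

  val kit = new Toolkit

  // Toolkit#Result is a type projection, so it is fine in a type position.
  val anyResult: Toolkit#Result = kit.run("term")

  // A pattern needs a stable path to the companion object, hence kit.Good and kit.Bad.
  def succeeded(in: String): Boolean = kit.run(in) match {
    case kit.Good(_) => true
    case kit.Bad(_)  => false
  }
}
```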
Hmm. I don't have a 2.7 handy. Try this:
parser.parseAll(parser.term, "term") match {
case parser.Success(result, _) => true
case parser.Failure(_, _) => false
case parser.Error(_, _) => false
}
I was able to compile the following:
object Test {
def main(args: Array[String]): Unit = {
val parser = new SqlParserImpl
println(parser.parseAll(parser.term, "term") match {
case x: parser.Success[_] => true
case x: parser.NoSuccess => false
})
}
}
The NoSuccess object (with extractor) was added back in 2009, at a time when no code was being backported to 2.7 anymore. Its implementation, however, is pretty simple:
object NoSuccess {
def unapply[T](x: ParseResult[T]) = x match {
case Failure(msg, next) => Some((msg, next))
case Error(msg, next) => Some((msg, next))
case _ => None
}
}
So you can replace the parser.NoSuccess(_, _) match with one parser.Failure(_, _) and one parser.Error(_, _) match. But if you are not interested in what is being returned, then it's simpler to match against the type:
case _: parser.Success[_] => true
case _: parser.NoSuccess => false
As Eugene suggested.