How to pattern match on Scala's parser combinator result - scala

We have a multithreaded RPC server that parses input strings. We've run into an issue where Scala's parser combinator library is not thread-safe: the var lastNoSuccess in Parsers.scala is shared mutable state that every parse reads and writes. We get a NullPointerException on this line:
if (!(lastNoSuccess != null && next.pos < lastNoSuccess.next.pos))
The usual way to implement a parser is to make an object that extends one of the Parsers traits, but I want to construct a parser on demand so that each one has its own internal state, so I'm using a class instead of an object. However, I can't get it to compile, since I need to pattern match on the result:
import scala.util.parsing.combinator.RegexParsers
class SqlParserImpl
extends RegexParsers
{
val term: Parser[String] = """(?i)term\b""".r
}
object Test
{
def main(args: Array[String]): Unit =
{
val parser = new SqlParserImpl
parser.parseAll(parser.term, "term") match {
// How do I match?
case SqlParserImpl#Success(result, _) => true
case SqlParserImpl#NoSuccess => false
}
}
}
Fails with
t.scala:16: error: '=>' expected but '#' found.
case SqlParserImpl#Success(result, _) => true
^
t.scala:17: error: '=>' expected but '#' found.
case SqlParserImpl#NoSuccess => false
^
two errors found

Use this:
val parser = new SqlParserImpl
parser.parseAll(parser.term, "term") match {
case parser.Success(result, _) => true
case parser.NoSuccess(_, _) => false
}
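The # sign designates a type member (a type projection such as SqlParserImpl#Success), so it is only valid where a type is expected. Your cases use constructor or extractor patterns, which need to refer to an object, or to something that looks like a constructor, through a stable path such as parser.Success.
Hmm. I don't have Scala 2.7 handy. If NoSuccess has no extractor in your version, try this: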
parser.parseAll(parser.term, "term") match {
case parser.Success(result, _) => true
case parser.Failure(_, _) => false
case parser.Error(_, _) => false
}
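
The difference between a type projection (#) and a stable path (parser.) can be tried out without the parser library. The following is only a minimal sketch: Outer, Ok, Bad and run are hypothetical stand-ins for a Parsers-style class with nested result types, not part of any real API.

```scala
// Hypothetical stand-in for a Parsers-style class with nested result types.
class Outer {
  sealed trait Result
  case class Ok(value: Int) extends Result
  case class Bad(msg: String) extends Result
  def run(n: Int): Result = if (n >= 0) Ok(n) else Bad("negative")
}

object PathDependentSketch {
  // Extractor patterns need a stable path to the companion: `o.Ok`, not `Outer#Ok`.
  def viaPath(n: Int): String = {
    val o = new Outer
    o.run(n) match {
      case o.Ok(v)  => s"ok $v"
      case o.Bad(m) => s"bad $m"
    }
  }

  // A type projection is fine where a type is expected, e.g. in a typed pattern.
  def viaProjection(r: Outer#Result): Boolean = r match {
    case _: Outer#Ok  => true
    case _: Outer#Bad => false
  }
}
```

The typed-pattern form is what makes it possible to write helper code that accepts results from any instance, while the extractor form always goes through one concrete instance.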

I was able to compile the following:
object Test {
def main(args: Array[String]): Unit = {
val parser = new SqlParserImpl
println(parser.parseAll(parser.term, "term") match {
case x: parser.Success[_] => true
case x: parser.NoSuccess => false
})
}
}

The NoSuccess object (with extractor) was added back in 2009, at a time when no code was being backported to 2.7 anymore. Its implementation, however, is pretty simple:
object NoSuccess {
def unapply[T](x: ParseResult[T]) = x match {
case Failure(msg, next) => Some(msg, next)
case Error(msg, next) => Some(msg, next)
case _ => None
}
}
So you can replace the parser.NoSuccess(_, _) match with one parser.Failure(_, _) and one parser.Error(_, _) match. But if you are not interested in what is being returned, then it's simpler to match against the type:
case _: parser.Success[_] => true
case _: parser.NoSuccess => false
as suggested by Eugene.

Related

How to create an empty JsResult in play json?

I am writing an implicit reads returning JsResult[Seq[T]] based on a condition:
some_condition match{
case Some(value) => // JsResult[Seq[T]]
case _ => // returning an empty JsResult here of similar type?
}
What would be the way to return such a JsResult?
You will need to use the validate method, which returns a JsResult. There are two case classes that extend JsResult:
JsSuccess
JsError
Therefore you can consider the following example:
case class T()
implicit val reads = Json.reads[T]
val jsValue: JsValue = ...
jsValue.validate[Seq[T]] match {
case JsSuccess(t, path) =>
println(t)
case JsError(errors) =>
println(errors)
}

Scala multi-level pattern matching

I'm stuck with multi-level pattern matching. In the code below I want to match one particular case that is checked at several levels (cfe is an Assignment, assignmentCfe.getRight is a BinaryExpression, and so on). The solution looks ugly, and I hope there is something better Scala can offer me. :)
def findAll(cfg: Cfg, consumer: Consumer[Violation]): Unit = {
  val fa = new FlowAnalyzer
  val states = fa.analyze(cfg)
  states.foreach { case (cfe, state) => cfe match {
      case assign: Assignment => assign.getRight match {
        case expression: BinaryExpression => expression.getOperator match {
          case Operator.Div | Operator.Rem => processDivisions()
          case _ =>
        }
        case _ =>
      }
      case _ =>
    }
    case _ =>
  }
}
How can I get rid of these empty default cases at the end?
Another approach would be to use nested conditions, but IntelliJ IDEA suggests replacing these conditions with pattern matching:
states.foreach { case (cfe, state) => if (cfe.isInstanceOf[Assignment]) {
val assignment = cfe.asInstanceOf[Assignment]
if (assignment.getRight.isInstanceOf[BinaryExpression]) {
val expression = assignment.getRight.asInstanceOf[BinaryExpression]
if (expression.getOperator == Operator.Div || expression.getOperator == Operator.Rem) processDivisions()
}
}}
Are Assignment and BinaryExpression themselves case classes? Or do they have corresponding unapply methods? If so, then you can nest pattern matches and ignore fields you don't care about. For example, something like:
def findAll(cfg: Cfg, consumer: Consumer[Violation]): Unit = {
val fa = new FlowAnalyzer
val states = fa.analyze(cfg)
states.foreach {
case (Assignment(_, BinaryExpression(_, _, Operator.Div | Operator.Rem)), _) => processDivisions()
case _ =>
}
}
This will at least cut the number of default matches down to 1.
If these are not case classes or don't have extractors, then you could consider writing your own if this is a common enough (anti)pattern in your code: http://docs.scala-lang.org/tutorials/tour/extractor-objects.html
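
As a rough sketch of what such hand-written extractors could look like: the node classes below are hypothetical Java-style stand-ins (the real Assignment, BinaryExpression and Operator are not shown in the question), and AssignmentOf and BinaryOf are names invented for this example.

```scala
// Hypothetical Java-style AST nodes (not case classes), standing in for the real ones.
sealed trait Operator
object Operator {
  case object Div extends Operator
  case object Rem extends Operator
  case object Add extends Operator
}
trait Node
class Literal(val value: Int) extends Node
class BinaryExpression(left: Node, right: Node, op: Operator) extends Node {
  def getLeft: Node = left
  def getRight: Node = right
  def getOperator: Operator = op
}
class Assignment(left: Node, right: Node) extends Node {
  def getLeft: Node = left
  def getRight: Node = right
}

// Hand-written extractors expose the getters to pattern matching.
object AssignmentOf {
  def unapply(n: Node): Option[(Node, Node)] = n match {
    case a: Assignment => Some((a.getLeft, a.getRight))
    case _             => None
  }
}
object BinaryOf {
  def unapply(n: Node): Option[(Node, Node, Operator)] = n match {
    case b: BinaryExpression => Some((b.getLeft, b.getRight, b.getOperator))
    case _                   => None
  }
}

object ExtractorSketch {
  // One nested pattern replaces three levels of match and their default cases.
  def isDivision(n: Node): Boolean = n match {
    case AssignmentOf(_, BinaryOf(_, _, Operator.Div | Operator.Rem)) => true
    case _                                                            => false
  }
}
```

The extractors are kept in separate objects here so that the existing classes do not have to be modified; if you own the classes, companion objects with unapply would read more naturally.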
One other idea is you could use the "pimp my library" pattern to define an implicit conversion from any object into a class that can do a kind of partial matching:
class PartialMatcher[A](a: A) {
def partialMatch(f: PartialFunction[A, Unit]): Unit = if (f.isDefinedAt(a)) f(a)
}
implicit def makePartialMatcher[A](a: A) = new PartialMatcher(a)
Then just replace all of those matches with partialMatch:
def findAll(cfg: Cfg, consumer: Consumer[Violation]): Unit = {
val fa = new FlowAnalyzer
val states = fa.analyze(cfg)
states.foreach { case (cfe, state) => cfe partialMatch {
case assign: Assignment => assign.getRight partialMatch {
case expression: BinaryExpression => expression.getOperator partialMatch {
case Operator.Div | Operator.Rem => processDivisions()
}
}
}}
}
Note that there are other reasons why you might avoid this kind of thing... overusing implicit conversions can make understanding code a lot more difficult. It's a stylistic choice.
Use .collect:
def findAll(cfg: Cfg, consumer: Consumer[Violation]): Unit = {
  val fa = new FlowAnalyzer
  val states = fa.analyze(cfg)
  states.collect { case (assign: Assignment, _) =>
    assign.getRight
  }.collect { case expression: BinaryExpression =>
    expression.getOperator
  }.collect { case Operator.Div | Operator.Rem =>
    processDivisions()
  }
}

What's the most idiomatic way in Scala to pattern match on a Seq holding enum values converted to strings?

I'm trying to match an enum value converted to a string held in a collection. Here's the code:
object Foo extends Enumeration {
val ONE = Value("ONE")
val TWO = Value("TWO")
}
def check(seq: Seq[String]): Unit = seq match {
case Seq(Foo.ONE.toString) => println("match")
case _ => println("no match")
}
This results in a compilation error:
error: stable identifier required, but Foo.ONE.toString found.
case Seq(Foo.ONE.toString) => println("match")
What is the proper way to use my Foo enumerated values as elements of my pattern matching case statements?
Map it back to the enum first:
import scala.util.Try
val enumSeq = seq map (x => Try(Foo.withName(x)))
Then you can either filter out the Failures or match on Seq(Success(ONE)), Seq(Success(TWO)), ..., Seq(Failure(_)), etc.
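
Spelled out, that approach might look like the sketch below. It reuses the Foo enumeration from the question; the wrapper object EnumMatchSketch and the returned strings are only there to make the example checkable and are not part of the original code.

```scala
import scala.util.{Failure, Success, Try}

object Foo extends Enumeration {
  val ONE = Value("ONE")
  val TWO = Value("TWO")
}

object EnumMatchSketch {
  // Convert each string back to an enum value first, then match on stable identifiers.
  def check(seq: Seq[String]): String =
    seq.map(x => Try(Foo.withName(x))) match {
      case Seq(Success(Foo.ONE)) => "match"
      case Seq(Failure(_))       => "unknown name" // withName threw for this string
      case _                     => "no match"
    }
}
```

Foo.ONE is a stable identifier, so it is allowed inside the pattern, whereas the method call Foo.ONE.toString is not.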
def check(seq: Seq[String]): Unit = seq match {
case Seq(s @ _) if s == Foo.ONE.toString => println("match")
case _ => println("no match")
}
I like the response from @cchantep, which was to avoid calling .toString inside the pattern match and implement the check method like so:
def check(seq: Seq[Foo.Value]): Unit = seq match {
case Seq(Foo.ONE) => println("match")
case _ => println("no match")
}

Signaling failure via the function application parser combinator

I need to do more complex syntax checking on a parser match than the standard notation permits and am currently doing it in the function application ^^. A sample, simplified scenario is checking for duplicate keywords:
def keywords: Parser[List[String]] = "[" ~ repsep(keyword, ",") ~ "]" ^^ {
case _ ~ ks ~ _ =>
ks.groupBy(x => x).filter(_._2.length > 1).keys.toList match {
case Nil => ks
case x => throw new DuplicateKeywordsException(x)
}
}
This works, as in my parser will throw an exception, but I want the failure to be captured as a ParseResult.Failure capturing the Input of where it happened. I can't figure out how to signal this from within a ^^ block or use some other construct to achieve the same end.
OK, I followed Erik Meijer's advice of following the types down the happy path. Looking at how ^^ is defined in Programming in Scala (which differs from the actual code), I realized it is basically just a map function:
def ^^ [U](f: T => U): Parser[U] = new Parser[U] {
def apply(in: Input) = p(in) match {
case Success(x, in1) => Success(f(x), in1)
case failure => failure
}
}
Basically it's Parser[T] => Parser[U].
Parser[T] itself is a function of Input => ParseResult[T] and ^^ just defines a new parser by supplying the apply method, which on invocation either transforms Success[T] to Success[U] or just passes along Failure.
To achieve the goal of injecting a new Failure during mapping, I need a new mapping function that takes a function like f: T => Either[String,U], so I can signal either an error message or a successful mapping. I chose Either with a String on the left, since Failure just takes a string message. This new mapping function is then added to Parser[T] via an implicit class:
implicit class RichParser[+T](p: Parser[T]) {
def ^^? [U](f: T => Either[String,U]): Parser[U] = new Parser[U] {
def apply(in: Input) = p(in) match {
case Success(x, in1) => f(x) match {
case Left(error) => Failure(error,in1)
case Right(x1) => Success(x1,in1)
}
case failure:Failure => failure
case error:Error => error
}
}
}
And now keywords can be defined as:
def keywords: Parser[List[String]] = "[" ~ repsep(keyword, ",") ~ "]" ^^? {
case _ ~ ks ~ _ =>
ks.groupBy(x => x).filter(_._2.length > 1).keys.toList match {
case Nil => Right(ks)
case x => Left("found duplicate keywords: "+x.reduce[String] { case (a, b) => s"$a, $b"})
}
}
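
The idea behind ^^ and ^^? can also be studied in isolation. The following is a toy model and not the parser combinator library: MiniParser, Ok, Fail, map and mapEither are all names made up for this sketch, and the input is a plain String rather than the library's Input.

```scala
// Toy model, not the library: a parser is a function from input to a result.
sealed trait Result[+T]
case class Ok[+T](value: T, rest: String) extends Result[T]
case class Fail(msg: String, rest: String) extends Result[Nothing]

case class MiniParser[+T](run: String => Result[T]) {
  // Counterpart of ^^: transform a success, pass failures through unchanged.
  def map[U](f: T => U): MiniParser[U] = MiniParser { in =>
    run(in) match {
      case Ok(x, rest) => Ok(f(x), rest)
      case fail: Fail  => fail
    }
  }

  // Counterpart of ^^?: the mapping function may itself report a failure.
  def mapEither[U](f: T => Either[String, U]): MiniParser[U] = MiniParser { in =>
    run(in) match {
      case Ok(x, rest) => f(x) match {
        case Left(error) => Fail(error, rest)
        case Right(x1)   => Ok(x1, rest)
      }
      case fail: Fail => fail
    }
  }
}

object MiniParserSketch {
  // Splits a comma-separated line into trimmed words and consumes all input.
  val words: MiniParser[List[String]] =
    MiniParser(in => Ok(in.split(",").map(_.trim).toList, ""))

  // Rejects the word list if any word occurs more than once.
  val uniqueWords: MiniParser[List[String]] = words.mapEither[List[String]] { ks =>
    val dups = ks.groupBy(identity).filter(_._2.length > 1).keys.toList
    if (dups.isEmpty) Right(ks)
    else Left("found duplicate keywords: " + dups.sorted.mkString(", "))
  }
}
```

Because the failure is built from the remaining input at the point of the check, the caller gets an ordinary failed result instead of an exception.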

Optional function parameter with generic return type

How would you implement a class that parses some input via a regex and transforms the found string to some other type? My approach is:
class ARegex[T](regex:Regex, reform:Option[String => T]){
def findFirst(input:String):Option[T] = {
(regex.findFirstIn(input), reform) match{
case (None, _) => None
case (Some(s), None) => Some(s) // this won't compile because of type mismatch
case (Some(s), Some(fun)) => Some(fun(s))
}
}
}
class BRegex[T](regex:Regex, reform:Option[String => T]) {
def findFirst(input:String) = { //returns Option[Any] - erasure
(regex.findFirstIn(input), reform) match{
case (None, _) => None
case (Some(s), None) => Some(s)
case (Some(s), Some(fun)) => Some(fun(s))
}
}
}
We can solve this problem by eliminating the Option part of the reform's type, and using a different mechanism to indicate that we don't want to change the match in any way. This mechanism is to use identity as a default parameter or pass identity when you don't want the type to change.
class ARegex[T](regex:Regex, reform:String => T = identity[String](_)){
def findFirst(input:String):Option[T] = {
regex.findFirstIn(input) match{
case None => None
case Some(s) => Some(reform(s))
}
}
}
new ARegex("something".r).findFirst("something else") //returns Option[String]
new ARegex("3".r, {x=>x.toInt}).findFirst("number 3") //returns Option[Int]
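
Once reform is a plain function, the match on the Option can be collapsed into Option.map. A small sketch of that variant follows; the class name MappedRegex is invented for this example, and reform is passed explicitly here rather than through a default argument.

```scala
import scala.util.matching.Regex

// Same shape as ARegex above, with the explicit match replaced by Option.map.
class MappedRegex[T](regex: Regex, reform: String => T) {
  def findFirst(input: String): Option[T] = regex.findFirstIn(input).map(reform)
}
```

For example, new MappedRegex("3".r, (s: String) => s.toInt).findFirst("number 3") has type Option[Int], and passing identity with T fixed to String keeps the matched text unchanged.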
Well, the problem is the type mismatch, because you are returning either a String or a T, which, of course, are unified at Any. You can't say you are going to return Option[T] and then return Option[String].
Other than that, a simplified version of that code is this:
class ARegex[T](regex: Regex, reform: Option[String => T]) {
def findFirst(input: String): Option[Any] =
regex findFirstIn input map { s => reform map (_(s)) getOrElse s }
}
You could return an Option[Either[String, T]], though. The code would look like this:
class ARegex[T](regex: Regex, reform: Option[String => T]) {
def findFirst(input: String): Option[Either[String, T]] =
regex findFirstIn input map { s => reform map (_(s)) toRight s }
}
Why is reform Option[String => T] instead of just String => T? If you don't pass in a mechanism for creating an instance of your desired type, there's no mechanism for the runtime system to actually create the appropriate object. If you really need to pass in an Option[String => T] then your second case should simply return None.
Also, flatMap is your friend, and will give you the correct behavior (i.e. if reform is None, the method returns None):
class RegexExtractor[T](regex: Regex, reform: Option[String => T]) {
def findFirst(input: String): Option[T] = reform.flatMap(f => regex.findFirstIn(input).map(f))
}