I have the program below. I can parse a pattern like convert(a.ACCOUNT_ID, string) into an expression, but I want to replace this pattern with CAST(a.ACCOUNT_ID AS VARCHAR). I could parse the result expression and replace the strings with the one above, but there are expressions like this hence I don't want to do that way. Is there any way to do a pattern replace? That is, if I find the pattern convert(a.ACCOUNT_ID, string), replace it with CAST(a.ACCOUNT_ID AS VARCHAR).
import scala.util.parsing.combinator._
import scala.util.parsing.combinator.lexical._
import scala.util.parsing.combinator.syntactical._
import scala.util.parsing.combinator.token._
import scala.util.parsing.input.CharSequenceReader
trait QParser extends RegexParsers with JavaTokenParsers {
  def knownFunction: Parser[Any] = ident ~ "(" ~ ident ~ ("." ~ ident <~ "," ~ ident ~ ")")

  def parse(inputString: String): Any = synchronized {
    phrase(knownFunction)(new CharSequenceReader(inputString)) match {
      case Success(result, _) => result
      case Failure(msg, _) => throw new DataTypeException(msg)
      case Error(msg, _) => throw new DataTypeException(msg)
    }
  }

  class DataTypeException(message: String) extends Exception(message)
}

object Parser extends QParser {
  def main(args: Array[String]): Unit = {
    println(parse("convert(a.ACCOUNT_ID, string)"))
  }
}
Output: (((convert~()~a)~(.~ACCOUNT_ID))
I am not exactly sure what you mean with "there are expressions like this hence I don't want to do that way", but you can transform the result of your parser function using the ^^ operator.
A transformation function for your parser could be:
def knownFunction: Parser[String] =
  ident ~ "(" ~ ident ~ "." ~ ident ~ "," ~ ident ~ ")" ^^ {
    case func ~ "(" ~ obj ~ "." ~ value ~ "," ~ castType ~ ")" =>
      val sqlFunc = Map("convert" -> "CAST")
      val sqlType = Map("string" -> "VARCHAR")
      s"${sqlFunc(func)}($obj.$value AS ${sqlType(castType)})"
  }
Using this updated function, the output of your application would be:
CAST(a.ACCOUNT_ID AS VARCHAR)
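For reference, here is a self-contained sketch of the same idea. The names ConvertRewriter and rewrite, the two lookup tables, and the choice of ^? (which turns an unmapped function or type into a parse failure instead of a thrown exception) are my own assumptions, not part of the answer above.

```scala
import scala.util.parsing.combinator.JavaTokenParsers

object ConvertRewriter extends JavaTokenParsers {
  // Hypothetical lookup tables; extend them with the mappings you need.
  private val sqlFunc = Map("convert" -> "CAST")
  private val sqlType = Map("string" -> "VARCHAR")

  // Matches func(obj.value, castType). The punctuation is dropped with ~> and <~,
  // so only the four identifiers reach the pattern match.
  def knownFunction: Parser[String] =
    ident ~ ("(" ~> ident) ~ ("." ~> ident) ~ ("," ~> ident <~ ")") ^? {
      case func ~ obj ~ value ~ castType
          if sqlFunc.contains(func) && sqlType.contains(castType) =>
        s"${sqlFunc(func)}($obj.$value AS ${sqlType(castType)})"
    }

  // Returns the rewritten SQL on the right, or the parser's message on the left.
  def rewrite(input: String): Either[String, String] =
    parseAll(knownFunction, input) match {
      case Success(result, _) => Right(result)
      case Failure(msg, _)    => Left(msg)
      case Error(msg, _)      => Left(msg)
    }
}
```

Calling ConvertRewriter.rewrite("convert(a.ACCOUNT_ID, string)") should give Right("CAST(a.ACCOUNT_ID AS VARCHAR)"), while a function name that is not in the table ends up as a Left.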
More information about combinator parsing in Scala can be found in the corresponding chapter of Programming in Scala, 1st edition.
Related
I am trying to write a Scala Parser combinator for the following input.
The input can be
10
(10)
((10))
(((10)))
Here the number of brackets can keep on growing, but they must always match, so parsing should fail for ((((10))).
The result of parsing should always be the number at the center.
I wrote the following parser:
import scala.util.parsing.combinator._
class MyParser extends RegexParsers {
  def i = "[0-9]+".r ^^ (_.toInt)
  def n = "(" ~ i ~ ")" ^^ {case _ ~ b ~ _ => b.toInt}
  def expr = i | n
}
val parser = new MyParser
parser.parseAll(parser.expr, "10")
parser.parseAll(parser.expr, "(10)")
But now how do I handle the case where the number of brackets keeps growing but stays matched?
Easy, just make the parser recursive:
class MyParser extends RegexParsers {
  def i = "[0-9]+".r ^^ (_.toInt)
  def expr: Parser[Int] = i | "(" ~ expr ~ ")" ^^ {case _ ~ b ~ _ => b.toInt}
}
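(But note that scala-parser-combinators has trouble with left-recursive definitions; see the question "Recursive definitions with scala-parser-combinators". The definition above is fine because the recursive call only happens after a "(" has been consumed.)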
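To check the recursive definition against the inputs from the question, a small runnable sketch could look like this. ParenParser, number and evaluate are names I made up for illustration, and I used ~> and <~ to drop the brackets instead of pattern matching on them.

```scala
import scala.util.parsing.combinator.RegexParsers

object ParenParser extends RegexParsers {
  def number: Parser[Int] = "[0-9]+".r ^^ (_.toInt)

  // Either a bare number, or an expr wrapped in one more pair of brackets.
  // ~> and <~ keep only the inner result, so no pattern match is needed.
  def expr: Parser[Int] = number | "(" ~> expr <~ ")"

  // parseAll requires the whole input to be consumed, so unbalanced
  // brackets on either side make the parse fail.
  def evaluate(input: String): Option[Int] =
    parseAll(expr, input) match {
      case Success(n, _) => Some(n)
      case _             => None
    }
}
```

With this, evaluate("(((10)))") yields Some(10), and evaluate("((((10)))") yields None because the fourth closing bracket is missing.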
I want to be able to parse strings like the one below with Scala parser combinators.
aaa22[bbb33[ccc]ddd]eee44[fff]
Before every open square bracket an integer literal is guaranteed to exist.
The code I have so far:
import scala.util.parsing.combinator.RegexParsers
trait AST
case class LetterSeq(value: String) extends AST
case class IntLiteral(value: String) extends AST
case class Repeater(count: AST, content: List[AST]) extends AST
class ExprParser extends RegexParsers {
  def intLiteral: Parser[AST] = "[0-9]+".r ^^ IntLiteral
  def letterSeq: Parser[AST] = "[a-f]+".r ^^ LetterSeq
  def term: Parser[AST] = letterSeq | repeater
  def expr: Parser[List[AST]] = rep1(term)
  def repeater: Parser[AST] = intLiteral ~> "[" ~> expr <~ "]" ^^ {
    case intLiteral ~ expr => Repeater(intLiteral, expr)
  }
}
The message I get:
<console>:25: error: constructor cannot be instantiated to expected type;
found : ExprParser.this.~[a,b]
required: List[AST]
case intLiteral ~ expr => Repeater(intLiteral, expr)
Any ideas?
Later Edit: After making the change suggested by @sepp2k I still get the same error. The change being:
def repeater: Parser[AST] = intLiteral ~ "[" ~> expr <~ "]" ^^ {
The error message is telling you that you're pattern matching a list against the ~ constructor, which isn't allowed. In order to use ~ in your pattern, you need to have used ~ in the parser.
It looks like in this case the problem is simply that you discarded the value of intLiteral using ~> when you did not mean to. If you use ~ instead of ~> here and add parentheses [1], that should fix your problem.
[1] The parentheses are required so that the following ~> only throws away the bracket instead of the result of intLiteral ~ "[". Without them, intLiteral ~ "[" ~> expr <~ "]" is parsed as (intLiteral ~ "[") ~> expr <~ "]", which still throws away the intLiteral. You want intLiteral ~ ("[" ~> expr <~ "]"), which only throws away the [ and ].
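Putting that together, a complete version of the parser with the parenthesized repeater might look like the sketch below. I renamed the parser to RepeaterParser and made it an object only so the example is easy to call; the AST classes are the ones from the question.

```scala
import scala.util.parsing.combinator.RegexParsers

sealed trait AST
case class LetterSeq(value: String) extends AST
case class IntLiteral(value: String) extends AST
case class Repeater(count: AST, content: List[AST]) extends AST

object RepeaterParser extends RegexParsers {
  def intLiteral: Parser[AST] = "[0-9]+".r ^^ (s => IntLiteral(s))
  def letterSeq: Parser[AST] = "[a-f]+".r ^^ (s => LetterSeq(s))
  def term: Parser[AST] = letterSeq | repeater
  def expr: Parser[List[AST]] = rep1(term)

  // ~ keeps the count; the parentheses make ~> and <~ discard only the brackets.
  def repeater: Parser[AST] = intLiteral ~ ("[" ~> expr <~ "]") ^^ {
    case count ~ content => Repeater(count, content)
  }
}
```

For example, RepeaterParser.parseAll(RepeaterParser.expr, "aaa22[bbb33[ccc]ddd]eee44[fff]") should succeed and produce nested Repeater nodes for 22[...] and 33[...].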
class ExprParser extends RegexParsers {
  val number = "[0-9]+".r
  def expr: Parser[Int] = term ~ rep(
    ("+" | "-") ~ term ^^ {
      case "+" ~ t => t
      case "-" ~ t => -t
    }) ^^ { case t ~ r => t + r.sum }
  def term: Parser[Int] = factor ~ (("*" ~ factor)*) ^^ {
    case f ~ r => f * r.map(_._2).product
  }
  def factor: Parser[Int] = number ^^ { _.toInt } | "(" ~> expr <~ ")"
}
I get the following warning when compiling:
warning: match may not be exhaustive.
It would fail on the following input: ~((x: String forSome x not in ("+", "-")), _)
("+" | "-") ~ term ^^ {
^
one warning found
I heard that the @unchecked annotation can help. But in this case, where should I put it?
The issue here is that with ("+" | "-") you are creating a parser that accepts only two possible strings. However when you map on the resulting parser to extract the value, the result you're going to extract will just be String.
In your pattern matching you only have cases for the strings "+" and "-", but the compiler has no way of knowing that those are the only possible strings that will show up, so it's telling you here that your match may not be exhaustive since it can't know any better.
You could use an unchecked annotation to suppress the warning, but there are better, more idiomatic ways to eliminate the issue. One way to solve this is to replace those strings with some kind of structured type as soon as possible. For example, create an ADT:
sealed trait Operation
case object Plus extends Operation
case object Minus extends Operation

// then in your parser
("+" ^^^ Plus | "-" ^^^ Minus) ~ term ^^ {
  case Plus ~ t => t
  case Minus ~ t => -t
}
Now the compiler should be able to realize that the only possible cases are Plus and Minus, because Operation is sealed.
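Here is how the whole parser might look with the ADT wired in, as a rough sketch. ArithParser and the separate op parser are my own naming, and I also simplified term to use "*" ~> factor so the list holds plain Ints.

```scala
import scala.util.parsing.combinator.RegexParsers

sealed trait Operation
case object Plus extends Operation
case object Minus extends Operation

object ArithParser extends RegexParsers {
  val number = "[0-9]+".r

  // Map each operator string to a case object right away.
  def op: Parser[Operation] = "+" ^^^ Plus | "-" ^^^ Minus

  // The match covers every subtype of the sealed trait, so it is exhaustive.
  def expr: Parser[Int] = term ~ rep(op ~ term ^^ {
    case Plus ~ t  => t
    case Minus ~ t => -t
  }) ^^ { case t ~ r => t + r.sum }

  // "*" ~> factor drops the operator, leaving a List[Int] to multiply.
  def term: Parser[Int] = factor ~ rep("*" ~> factor) ^^ {
    case f ~ r => f * r.product
  }

  def factor: Parser[Int] = number ^^ (_.toInt) | "(" ~> expr <~ ")"
}
```

A quick check such as ArithParser.parseAll(ArithParser.expr, "2+3*4") should evaluate to 14.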
Add a catch-all case to remove the warning:
class ExprParser extends RegexParsers {
  val number = "[0-9]+".r
  def expr: Parser[Int] = term ~ rep(
    ("+" | "-") ~ term ^^ {
      case "+" ~ t => t
      case "-" ~ t => -t
      case _ ~ t => t
    }) ^^ { case t ~ r => t + r.sum }
  def term: Parser[Int] = factor ~ (("*" ~ factor)*) ^^ {
    case f ~ r => f * r.map(_._2).product
  }
  def factor: Parser[Int] = number ^^ { _.toInt } | "(" ~> expr <~ ")"
}
I am writing a parser in Scala and got stuck at this point:
private def expression : Parser[Expression] = cond | variable | integer | liste | function
private def cond : Parser[Expression] = "if" ~ predicate ~ "then" ~ expression ~ "else" ~ expression ^^ {case _~i~_~t~_~el => Cond(i,t,el)}
private def predicate: Parser[Predicate] = identifier ~ "?" ~ "(" ~ repsep(expression, ",") ~ ")" ^^{case n~_~_~el~_ => Predicate(n,el)}
private def function: Parser[Expression] = identifier ~ "(" ~ repsep(expression, ",") ~ ")" ^^{case n~_~el~_ => Function(n,el)}
private def liste: Parser[Expression] = "[" ~ repsep(expression, ",") ~ "]" ^^ {case _~ls~_ => Liste(ls)}
private def variable: Parser[Expression] = identifier ^^ {case v => Variable(v)}
def identifier: Parser[String] = """[a-zA-Z0-9]+""".r ^^ { _.toString }
def integer: Parser[Integer] = num ^^ { case i => Integer(i)}
def num: Parser[String] = """(-?\d*)""".r ^^ {_.toString}
My problem is that when it comes to an "expression", the parser does not always take the right path. For example, given funk(x,y) it tries to parse it as a variable and not as a function.
Any ideas?
Change the order of the parsers in your expression parser: put function before variable and after cond. In general, when you compose parsers using the alternative A | B, parser A shouldn't be able to parse input that is a prefix of input parsable by parser B. The | combinator only tries B when A fails, so if A succeeds on a prefix, B is never tried.
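To see the effect of the ordering in isolation, here is a cut-down sketch with just variable and function. The Expr, Variable and Call classes and the parser names wrongOrder and rightOrder are hypothetical stand-ins for the types in the question.

```scala
import scala.util.parsing.combinator.RegexParsers

sealed trait Expr
case class Variable(name: String) extends Expr
case class Call(name: String, args: List[Expr]) extends Expr

object OrderingDemo extends RegexParsers {
  def identifier: Parser[String] = "[a-zA-Z0-9]+".r

  def variable: Parser[Expr] = identifier ^^ (n => Variable(n))

  def function: Parser[Expr] =
    identifier ~ ("(" ~> repsep(rightOrder, ",") <~ ")") ^^ {
      case name ~ args => Call(name, args)
    }

  // variable succeeds on the prefix "funk", so function is never tried.
  def wrongOrder: Parser[Expr] = variable | function

  // function is tried first; if there is no "(", it fails and variable runs.
  def rightOrder: Parser[Expr] = function | variable
}
```

Parsing "funk(x,y)" with parseAll fails for wrongOrder (the "(x,y)" part is left over) and gives Call("funk", List(Variable("x"), Variable("y"))) for rightOrder.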
I want to parse a String with Scala parser combinators. Let's take
abcd,123,ghijk
as an example. So we have two words and an integer joined by commas.
I can do it like this:
import scala.util.parsing.combinator._
case class MyObject(field1:String, field2:Integer, field3:String)
object Test3 extends RegexParsers {
  def main(args: Array[String]): Unit = {
    val testRow = "abcd,123,ghijk"
    val parseResult = Test3.parse(Test3.myObject, testRow)
    println(parseResult)
  }

  def word = "\\w+".r ^^ { _.toString }
  def int = """\d+""".r ^^ { _.toInt }
  def comma = "," ^^ { _.toString }
  def myObject = word ~ comma ~ int ~ comma ~ word ^^ {
    case wordfield1 ~ sep1 ~ intfield ~ sep2 ~ wordfield2 =>
      MyObject(wordfield1, intfield, wordfield2)
  }
}
However, I want to express the logic "joined by comma" directly. So rather than explicitly writing word ~ comma ~ int ~ comma ~ word, it should look more like
List(word, int, word) someFunctionIDontKnow {
  (resultParser, nextParser) => resultParser ~ comma ~ nextParser
}
I am a little stuck here because I'm not sure how to store my parsers (which have different types: Parser[Int] and Parser[String]) in a List while maintaining type safety, or what function to use to combine them the way I did manually. Is what I want even possible, or am I on the wrong track here?