Building variations of nested case classes - scala

So I got something like this:
abstract class Term
case class App(f:Term,x:Term) extends Term
case class Var(s:String) extends Term
case class Amb(a:Term, b:Term) extends Term //ambiguity
And a Term may look like this:
App(Var(f),Amb(Var(x),Amb(Var(y),Var(z))))
So what I need is all variations that are indicated by the Amb class.
This is used to represent an ambiguous parse forest, and I want to type check each possible variation and select the right one.
In this example I would need:
App(Var(f),Var(x))
App(Var(f),Var(y))
App(Var(f),Var(z))
What's the best way to create these variations in Scala?
Efficiency would be nice, but is not really a requirement.
If possible, I would like to refrain from using reflection.

Scala provides pattern matching to solve these kinds of problems. A solution would look like:
def matcher(term: Term): List[Term] = {
term match {
case Amb(a, b) => matcher(a) ++ matcher(b)
case App(a, b) => for { va <- matcher(a); vb <- matcher(b) } yield App(va, vb)
case v: Var => List(v)
}
}

You can do this pretty cleanly with a recursive function that traverses the tree and expands ambiguities:
sealed trait Term
case class App(f: Term, x: Term) extends Term
case class Var(s: String) extends Term
case class Amb(a: Term, b: Term) extends Term
def det(term: Term): Stream[Term] = term match {
case v: Var => Stream(v)
case App(f, x) => det(f).flatMap(detf => det(x).map(App(detf, _)))
case Amb(a, b) => det(a) ++ det(b)
}
Note that I'm using a sealed trait instead of an abstract class in order to take advantage of the compiler's ability to check exhaustivity.
It works as expected:
scala> val app = App(Var("f"), Amb(Var("x"), Amb(Var("y"), Var("z"))))
app: App = App(Var(f),Amb(Var(x),Amb(Var(y),Var(z))))
scala> det(app) foreach println
App(Var(f),Var(x))
App(Var(f),Var(y))
App(Var(f),Var(z))
If you can change the Term API, you could more or less equivalently add a def det: Stream[Term] method there.
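As a minimal sketch of that last suggestion, the expansion can live on the trait itself. This version returns a List rather than a Stream purely to keep the example simple (that substitution is my assumption, not part of the answer above); the logic is otherwise the same as the standalone det function.

```scala
// Sketch: det as a method on Term. List is used instead of Stream
// for simplicity; this is an assumption of the example.
sealed trait Term {
  def det: List[Term] = this match {
    case v: Var    => List(v)
    // every unambiguous function combined with every unambiguous argument
    case App(f, x) => for { df <- f.det; dx <- x.det } yield App(df, dx)
    // an ambiguity contributes the alternatives of both branches
    case Amb(a, b) => a.det ++ b.det
  }
}
final case class App(f: Term, x: Term) extends Term
final case class Var(s: String) extends Term
final case class Amb(a: Term, b: Term) extends Term
```

Calling .det on the example term from the question should then yield the three App variations in the same order as above.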

Since my abstract syntax is fairly large (and I have several of them), I tried my luck with Kiama.
So here is the version Travis Brown and Mark posted, rewritten with Kiama.
It's not pretty, but I hope it works. Comments are welcome.
def disambiguateRule: Strategy = rule {
case Amb(a: Term, b: Term) =>
rewrite(disambiguateRule)(a).asInstanceOf[List[_]] ++
rewrite(disambiguateRule)(b).asInstanceOf[List[_]]
case x =>
val ch = getChildren(x)
if(ch.isEmpty) {
List(x)
}
else {
val chdis = ch.map({ rewrite(disambiguateRule)(_) }) // get all disambiguate children
//create all combinations of the disambiguated children
val p = combinations(chdis.asInstanceOf[List[List[AnyRef]]])
//use dup from Kiama to recreate the term with every combination
val xs = for { newchildren <- p } yield dup(x.asInstanceOf[Product], newchildren.toArray)
xs
}
}
def combinations(ll: List[List[AnyRef]]): List[List[AnyRef]] = ll match {
case Nil => Nil
case x :: Nil => x.map { List(_) }
case x :: xs => combinations(xs).flatMap({ ys => x.map({ xx => xx :: ys }) })
}
def getChildren(x: Any): List[Any] = {
val l = new ListBuffer[Any]()
all(queryf {
case a => l += a
})(x)
l.toList
}
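The combinations helper above is essentially a Cartesian product over a list of lists. A generic, Kiama-independent sketch of that idea follows; the name cartesian is hypothetical, and unlike the helper above it returns List(Nil) for an empty input and may enumerate the combinations in a different order.

```scala
// Sketch: Cartesian product of a list of lists (hypothetical helper name).
// Each result picks exactly one element from every inner list, in order.
def cartesian[A](ll: List[List[A]]): List[List[A]] = ll match {
  case Nil     => List(Nil) // one way to pick from no lists: pick nothing
  case x :: xs => for { h <- x; t <- cartesian(xs) } yield h :: t
}
```

If any inner list is empty, the result is empty, which matches the intuition that a node with a child that has no valid reading has no valid reading itself.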

Related

Clever way to break a Seq[Any] into a case class

I've been parsing a proprietary file format that has sections and each section has a number of records. The sections can be in any order and the records can be in any order. The order is not significant. While sections should not be duplicated, I can't guarantee that.
I've been using parboiled2 to generate the AST using a format like the following:
oneOrMore( Section1 | Section2 | Section3 )
where every section generates a case class. The case classes don't inherit from anything, so the result is a Seq[Any].
These section case classes also contain a Seq[T] of records specific to the section type.
I would like to transform the Seq[Any] into a
case class (section1:Seq[T1], section2:Seq[T2], section3:Seq[T3] )
Does someone have a clever and easy to read technique for that or should I make some mutable collections and use a foreach with a match?
I always feel like I am missing some Scala magic when I fall back to a foreach with vars.
EDIT 1:
It was brought up that I should extend a common base class, and it is true that I could. But I don't see what that changes about the solution if I still have to use match to identify the type. I want to separate out the different case class types; for instance, below I want to collect all the B's, C's, E's, and F's into a Seq[B], Seq[C], Seq[E], and Seq[F].
class A()
case class B(v:Int) extends A
case class C(v:String) extends A
case class E(v:Int)
case class F(v:String)
val a:Seq[A] = B(1) :: C("2") :: Nil
val d:Seq[Any] = E(3) :: F("4") :: Nil
a.head match {
case B(v) => v should equal (1)
case _ => fail()
}
a.last match {
case C(v) => v should equal ("2")
case _ => fail()
}
d.head match {
case E(v) => v should equal (3)
case _ => fail()
}
d.last match {
case F(v) => v should equal ("4")
case _ => fail()
}
EDIT 2: Folding solution
case class E(v:Int)
case class F(v:String)
val d:Seq[Any] = E(3) :: F("4") :: Nil
val Ts = d.foldLeft((Seq[E](), Seq[F]()))(
(c,r) => r match {
case e:E => c.copy(_1=c._1 :+ e)
case e:F => c.copy(_2=c._2 :+ e)
}
)
Ts should equal ( (E(3) :: Nil, F("4") :: Nil) )
EDIT 3: Exhaustivity
sealed trait A //sealed is important
case class E(v:Int) extends A
case class F(v:String) extends A
val d:Seq[A] = E(3) :: F("4") :: Nil // must be Seq[A], not Seq[Any], for the exhaustivity check to apply
val Ts = d.foldLeft((Seq[E](), Seq[F]()))(
(c,r) => r match {
case e:E => c.copy(_1=c._1 :+ e)
case e:F => c.copy(_2=c._2 :+ e)
}
)
Ts should equal ( (E(3) :: Nil, F("4") :: Nil) )
While this could be done with shapeless to make a solution that is more terse (As Travis pointed out) I chose to go with a pure Scala solution based on Travis' feedback.
Here is an example of using foldLeft to manipulate a tuple housing strongly typed Seq[]. Unfortunately every type that is possible requires a case in the match which can become tedious if there are many types.
Also note, that if the base class is sealed, then the match will give an exhaustivity warning in the event a type was missed making this operation type safe.
sealed trait A //sealed is important
case class E(v:Int) extends A
case class F(v:String) extends A
val d:Seq[A] = E(3) :: F("4") :: Nil
val Ts = d.foldLeft((Seq[E](), Seq[F]()))(
(c,r) => r match {
case e:E => c.copy(_1=c._1 :+ e)
case e:F => c.copy(_2=c._2 :+ e)
}
)
Ts should equal ( (E(3) :: Nil, F("4") :: Nil) )
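For comparison, here is a sketch of an alternative that fills a result case class directly using collect, one call per section type. The names Section and Sections are hypothetical. It traverses the input once per type and gives up the exhaustivity warning that the match-based fold provides, so it is a trade-off rather than a strict improvement.

```scala
// Sketch: split a mixed sequence into typed sequences with collect.
// Section and Sections are hypothetical names for illustration.
sealed trait Section
final case class E(v: Int) extends Section
final case class F(v: String) extends Section

final case class Sections(es: Seq[E], fs: Seq[F])

def group(d: Seq[Section]): Sections =
  Sections(
    d.collect { case e: E => e }, // keeps only the E records, in order
    d.collect { case f: F => f }  // keeps only the F records, in order
  )
```

Duplicate sections simply end up as additional elements of the corresponding sequence.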

Scala, pattern matching on a tuple of generic trait, checking if types are equal

I know a lot of questions exist about type erasure and pattern matching on generic types, but I could not work out from their answers what I should do in my case, and I could not explain it better in the title.
Following code pieces are simplified to present my case.
So I have a trait
trait Feature[T] {
val value: T
def sub(other: Feature[T]): Double
}
// implicits for int,float,double etc to Feature with sub mapped to - function
...
Then I have a class
class Data(val features: IndexedSeq[Feature[_]]) {
def sub(other: Data): IndexedSeq[Double] = {
features.zip(other.features).map {
case(e1: Feature[t], e2: Feature[y]) => e1 sub e2.asInstanceOf[Feature[t]]
}
}
}
And I have a test case like this
case class TestFeature(val value: String) extends Feature[String] {
def sub(other: Feature[String]): Double = value.length - other.value.length
}
val testData1 = new Data(IndexedSeq(8, 8.3f, 8.232d, TestFeature("abcd")))
val testData2 = new Data(IndexedSeq(10, 10.1f, 10.123d, TestFeature("efg")))
testData1.sub(testData2).zipWithIndex.foreach {
case (res, 0) => res should be (8 - 10)
case (res, 1) => res should be (8.3f - 10.1f)
case (res, 2) => res should be (8.232d - 10.123d)
case (res, 3) => res should be (1)
}
This somehow works. If I try the sub operation with instances of Data that have different types at the same index of features, I get a ClassCastException. This actually satisfies my requirements, but if possible I would like to use Option instead of throwing an exception. How can I make the following code work?
class Data(val features: IndexedSeq[Feature[_]]) {
def sub(other: Data): IndexedSeq[Double] = {
features.zip(other.features).map {
// of course this does not work, just to give idea
case(e1: Feature[t], e2: Feature[y]) if t == y => e1 sub e2.asInstanceOf[Feature[t]]
}
}
}
Also, I am really inexperienced in Scala, so I would like to get feedback on this type of structure. Are there other ways to do this, and which way would make the most sense?
Generics don't exist at runtime, and an IndexedSeq[Feature[_]] has forgotten what the type parameter is even at compile time (@Jatin's answer won't allow you to construct a Data with a list of mixed types of Feature[_]). The easiest answer might be just to catch the exception (using catching and opt from scala.util.control.Exception). But, to answer the question as written:
You could check the classes at runtime:
case (e1: Feature[t], e2: Feature[y]) if e1.value.getClass ==
e2.value.getClass => ...
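A self-contained sketch of this runtime class check is shown here. IntFeature, StrFeature and maybeSub are hypothetical names for illustration, and the sketch assumes that values are never null and that two values of the same runtime class imply compatible Feature types.

```scala
// Sketch: compare the runtime classes of the two values before subtracting.
// IntFeature, StrFeature and maybeSub are hypothetical names.
trait Feature[T] {
  def value: T
  def sub(other: Feature[T]): Double
}
final case class IntFeature(value: Int) extends Feature[Int] {
  def sub(other: Feature[Int]): Double = (value - other.value).toDouble
}
final case class StrFeature(value: String) extends Feature[String] {
  def sub(other: Feature[String]): Double = (value.length - other.value.length).toDouble
}

def maybeSub[T](a: Feature[T], b: Feature[_]): Option[Double] =
  if (a.value.getClass == b.value.getClass)
    Some(a.sub(b.asInstanceOf[Feature[T]])) // cast is guarded by the class check
  else
    None
```

Note that this only inspects the outermost class of the value, so it cannot tell a List[Int] from a List[String].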
Or include the type information in the Feature:
trait Feature[T] {
val value: T
val valueType: ClassTag[T] // supply scala.reflect.classTag[T] in subclasses, e.g. classTag[String]
def maybeSub(other: Feature[_]) = other.value match {
case valueType(v) => Some(actual subtraction)
case _ => None
}
}
The more complex "proper" solution is probably to use Shapeless HList to preserve the type information in your lists:
// note the type includes the type of all the elements
val l1: Feature[Int] :: Feature[String] :: HNil = f1 :: f2 :: HNil
val l2 = ...
// a 2-argument function that's defined for particular types
// this can be applied to `Feature[T], Feature[T]` for any `T`
object subtract extends Poly2 {
implicit def caseFeatureT[T] =
at[Feature[T], Feature[T]]{_ sub _}
}
// apply our function to the given HLists, getting a HList
// you would probably inline this
// could follow up with .toList[Double]
// since the resulting HList is going to be only Doubles
def subAll[L1 <: HList, L2 <: HList](l1: L1, l2: L2)(
implicit zw: ZipWith[L1, L2, subtract.type]) =
l1.zipWith(l2)(subtract)
That way subAll can only be called for l1 and l2 all of whose elements match, and this is enforced at compile time. (If you really want to do Options you can have two ats in the subtract, one for same-typed Feature[T]s and one for different-typed Feature[_]s, but ruling it out entirely seems like a better solution)
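The simpler route mentioned at the start of this answer, catching the exception, could look roughly like the following sketch. IntFeature, StrFeature and safeSub are hypothetical names; the sketch relies on the mismatched cast surfacing as a ClassCastException when the other feature's value is actually read, which is the behaviour the question describes.

```scala
import scala.util.control.Exception.catching

// Sketch: turn the ClassCastException into None (hypothetical names).
trait Feature[T] {
  def value: T
  def sub(other: Feature[T]): Double
}
final case class IntFeature(value: Int) extends Feature[Int] {
  def sub(other: Feature[Int]): Double = (value - other.value).toDouble
}
final case class StrFeature(value: String) extends Feature[String] {
  def sub(other: Feature[String]): Double = (value.length - other.value.length).toDouble
}

def safeSub(a: Feature[_], b: Feature[_]): Option[Double] =
  catching(classOf[ClassCastException]).opt {
    // unchecked casts; a type mismatch fails inside sub and is caught here
    a.asInstanceOf[Feature[Any]].sub(b.asInstanceOf[Feature[Any]])
  }
```

A sub implementation that never touches other.value would not throw, so this approach is only as reliable as the implementations it wraps.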
You could do something like this:
class Data[T: TypeTag](val features: IndexedSeq[Feature[T]]) {
val t = implicitly[TypeTag[T]]
def sub[E: TypeTag](other: Data[E]): IndexedSeq[Double] = {
val e = implicitly[TypeTag[E]]
features.zip(other.features).flatMap{
case(e1, e2: Feature[y]) if e.tpe == t.tpe => Some(e1 sub e2.asInstanceOf[Feature[T]])
case _ => None
}
}
}
And then:
case class IntFeature(val value: Int) extends Feature[Int] {
def sub(other: Feature[Int]): Double = value - other.value
}
val testData3 = new Data(IndexedSeq(TestFeature("abcd")))
val testData4 = new Data(IndexedSeq(IntFeature(1)))
println(testData3.sub(testData4).zipWithIndex)
gives Vector()

More efficient Solution with tailrecursion?

I have the following ADT for Formulas. (shortened to the important ones)
sealed trait Formula
case class Variable(id: String) extends Formula
case class Negation(f: Formula) extends Formula
abstract class BinaryConnective(val f0: Formula, val f1: Formula) extends Formula
Note that the following methods are defined in an implicit class for formulas.
Let's say I want to get all variables from a formula.
My first approach was:
Solution 1
def variables: Set[Variable] = formula match {
case v: Variable => HashSet(v)
case Negation(f) => f.variables
case BinaryConnective(f0, f1) => f0.variables ++ f1.variables
case _ => HashSet.empty
}
This approach is very simple to understand, but not tail-recursive. So I wanted to try something different. I implemented a foreach on my tree-like formulas.
Solution 2
def foreach(func: Formula => Unit) = {
@tailrec // requires: import scala.annotation.tailrec
def foreach(list: List[Formula]): Unit = list match {
case Nil =>
case _ => foreach(list.foldLeft(List.empty[Formula])((next, formula) => {
func(formula)
formula match {
case Negation(f) => f :: next
case BinaryConnective(f0, f1) => f0 :: f1 :: next
case _ => next
}
}))
}
foreach(List(formula))
}
Now I can implement many methods with the help of the foreach.
def variables2 = {
val builder = Set.newBuilder[Variable]
formula.foreach {
case v: Variable => builder += v
case _ =>
}
builder.result
}
Now finally to the question. Which solution is preferable in terms of efficiency? At least I find my simple first solution more aesthetic.
I would expect Solution 2 to be more efficient, because you aren't creating many different HashSet instances and combining them together. It is also more general.
You can simplify your Solution 2, removing the foldLeft:
def foreach(func: Formula => Unit) = {
@tailrec // requires: import scala.annotation.tailrec
def foreach(list: List[Formula]): Unit = list match {
case Nil =>
case formula :: next => {
func(formula)
foreach {
formula match {
case Negation(f) => f :: next
case BinaryConnective(f0, f1) => f0 :: f1 :: next
case _ => next
}
}
}
}
foreach(List(formula))
}
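If only the set of variables is needed, the work list can also carry an accumulator directly, which avoids the mutable builder. The following is a sketch under simplifying assumptions: a single hypothetical Conjunction case class stands in for the BinaryConnective hierarchy, and variables is a plain function rather than a method on an implicit class.

```scala
import scala.annotation.tailrec

// Sketch: tail-recursive variable collection with an explicit work list.
// Conjunction is a hypothetical stand-in for the binary connectives.
sealed trait Formula
final case class Variable(id: String) extends Formula
final case class Negation(f: Formula) extends Formula
final case class Conjunction(f0: Formula, f1: Formula) extends Formula

def variables(formula: Formula): Set[Variable] = {
  @tailrec
  def loop(todo: List[Formula], acc: Set[Variable]): Set[Variable] = todo match {
    case Nil                         => acc
    case (v: Variable) :: rest       => loop(rest, acc + v)
    case Negation(f) :: rest         => loop(f :: rest, acc)
    case Conjunction(f0, f1) :: rest => loop(f0 :: f1 :: rest, acc)
  }
  loop(List(formula), Set.empty)
}
```

Because the pending subformulas live on the heap in todo rather than on the call stack, deeply nested formulas should not cause a stack overflow.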

Improving Pattern-matching Code

Assume the following data-structure.
sealed abstract class Formula {...}
//... some other case classes
sealed abstract class BinaryConnective(f0: Formula, f1: Formula) extends Formula {
def getf0 = f0
def getf1 = f1
}
object BinaryConnective {
def unapply(bc : BinaryConnective) = Some((bc.getf0, bc.getf1))
}
final case class Conjunction(f0: Formula, f1: Formula) extends BinaryConnective(f0,f1)
final case class Disjunction(f0: Formula, f1: Formula) extends BinaryConnective(f0,f1)
final case class Implication(f0: Formula, f1: Formula) extends BinaryConnective(f0,f1)
final case class Equivalence(f0: Formula, f1: Formula) extends BinaryConnective(f0,f1)
I now wrote a function that has a lot of pattern-matching:
The return-type of getCondition is Formula => Option[HashMap[Variable, Formula]]
formula match {
//.. irrelevant cases not shown
case Conjunction(f0, f1) => (g : Formula) => {
g match {
case conj @ Conjunction(g0, g1) => {
getCondition(f0)(conj.f0) match {
case Some(map0) => {
getCondition(f1)(conj.f1) match {
case Some(map1) if map0.forall{case (key, value) => map1.get(key).map(_ == value).getOrElse(true)} => {
Some(map0 ++ map1)
}
case _ => None
}
}
case None => None
}
}
case _ => None
}
}
}
Now to my question.
1) Is there a nicer way to express this code? A lot of matches going on.
Edit 1: I could not think of a nice-looking way to use things like map, filter, etc., but it becomes very compact with for-comprehensions. I've also noticed that conj @ was not necessary at all, which made it a little simpler.
case Conjunction(f0, f1) => (g: Formula) => g match {
case Conjunction(g0, g1) => for {
map0 <- getCondition(f0)(g0)
map1 <- getCondition(f1)(g1)
if map0.forall {case (key, value) => map1.get(key).map(_ == value).getOrElse(true)}
} yield map0 ++ map1
case _ => None
}
2) This is the match for Conjunction. I would have to repeat it for Disjunction, Implication and Equivalence. g has to be of the same class as formula. The only thing that would change is case conj @ Conjunction(g0, g1). I would have to adjust it to case disj @ Disjunction(g0, g1) if formula is a Disjunction, etc.
Is there a way to do it combined for all cases?
Option should provide a lot of useful functions to simplify your code.
For example, when you write something like:
o match {
case Some(e) => Some(transform(e))
case _ => None
}
You could just call map: o.map(transform)
I also invite you to look at the filter function for the cases including a condition.
EDIT: great suggestion by @om-nom-nom: for-comprehensions can also be used (they are actually sugar relying on map, flatMap, filter, etc.):
for{
e <- o
} yield transform(e)
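To illustrate how these Option combinators replace nested matches, here is a small sketch that merges two optional maps only when they agree on their shared keys, which is the shape of the condition in the question. The helper names mergeConsistent and combine are hypothetical.

```scala
// Sketch: Option.filter and a for-comprehension instead of nested matches.
// mergeConsistent and combine are hypothetical helper names.
def mergeConsistent[K, V](m0: Map[K, V], m1: Map[K, V]): Option[Map[K, V]] =
  Some(m0 ++ m1).filter { _ =>
    // every key of m0 is either absent from m1 or bound to the same value
    m0.forall { case (k, v) => m1.get(k).forall(_ == v) }
  }

def combine[K, V](o0: Option[Map[K, V]], o1: Option[Map[K, V]]): Option[Map[K, V]] =
  for {
    m0     <- o0
    m1     <- o1
    merged <- mergeConsistent(m0, m1)
  } yield merged
```

Any None along the way short-circuits the whole expression, which is what the explicit case None => None branches were doing by hand.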

Alternative pattern matching with variable binding?

I try to implement an equivalence relation over terms which I also would like to match against some patterns. However my relation is symmetric and therefore, the pattern matching must reflect this too.
Have a look at the following example:
abstract class Term
case class Constructor(txt:String) extends Term
case class Variable(txt:String) extends Term
case class Equality(t1:Term, t2:Term)
def foobar(e:Equality) = e match {
case Equality(Variable(x),Constructor(y)) => "do something rather complicated with x and y"
case Equality(Constructor(y),Variable(x)) => "do it all over again"
}
In fact, I would like to do something like this:
def foobar(e:Equality) = e match {
case Equality(Variable(x),Constructor(y)) | Equality(Constructor(y),Variable(x))
=> "yeah! this time we need to write the code only one time ;-)"
}
However, as noted elsewhere, this is not allowed: alternative patterns cannot bind variables. Does someone have a nice solution for this kind of problem? Any help or pointer is highly appreciated.
You could create your own unapply method like this:
object CVEquality {
def unapply(e: Equality): Option[(String, String)] = e match {
case Equality(Variable(v), Constructor(c)) => Some(c -> v)
case Equality(Constructor(c), Variable(v)) => Some(c -> v)
case _ => None
}
}
Usage:
def foobar(e:Equality) = e match {
case CVEquality(c, v) => "do something rather complicated with c and v"
}
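Put together as a self-contained sketch, the extractor approach could look like this. The result strings and the fallback case in foobar are my additions for illustration, and Term is declared as a sealed trait here rather than an abstract class.

```scala
// Sketch: a custom extractor that normalises both argument orders.
sealed trait Term
final case class Constructor(txt: String) extends Term
final case class Variable(txt: String) extends Term
final case class Equality(t1: Term, t2: Term)

object CVEquality {
  // Always yields (constructor text, variable text), whichever side they are on
  def unapply(e: Equality): Option[(String, String)] = e match {
    case Equality(Variable(v), Constructor(c)) => Some((c, v))
    case Equality(Constructor(c), Variable(v)) => Some((c, v))
    case _                                     => None
  }
}

def foobar(e: Equality): String = e match {
  case CVEquality(c, v) => s"constructor=$c, variable=$v"
  case _                => "no match" // illustrative fallback
}
```

Both Equality(Variable("x"), Constructor("C")) and its mirrored form go through the same case, so the complicated logic is written only once.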
The easiest way is to create a method for the "something rather complicated" part:
def complicated(c: String, v: String) = "do something rather complicated with c and v"
def foobar(e:Equality) = e match {
case Equality(Variable(x),Constructor(y)) => complicated(y, x)
case Equality(Constructor(y),Variable(x)) => complicated(y, x)
}