Preferred way of collecting variables from a formula - scala

I'm dealing with propositional logic at the moment and I wrote two algorithms for collecting all variables in a formula. I want the output to be immutable. Which one should be preferred in terms of speed/elegance? Is there an even better way? Thanks in advance.
def getVariables(formula: Formula): Set[Variable] = formula match {
case v: Variable => HashSet(v)
case Negation(f) => getVariables(f)
case BinaryConnective(f0, f1) => getVariables(f0) ++ getVariables(f1)
case _ => HashSet.empty[Variable]
}
def getVariables2(formula: Formula): Set[Variable] = {
def getVariables2(formula: Formula, set: mutable.HashSet[Variable]): Unit = formula match {
case v: Variable => set += v
case Negation(f) => getVariables2(f, set)
case BinaryConnective(f0, f1) => getVariables2(f0, set); getVariables2(f1, set)
case _ =>
}
val set = mutable.HashSet.empty[Variable]
getVariables2(formula, set)
set.toSet
}

The fastest way is almost always to use a builder. So, assuming no stack overflows:
def getVars(formula: Formula): Set[Variable] = {
val sb = Set.newBuilder[Variable]
def inner(formula: Formula): Unit = formula match {
case v: Variable => sb += v
case Negation(f) => inner(f)
case BinaryConnective(f0, f1) => inner(f0); inner(f1)
case _ =>
}
inner(formula)
sb.result
}
Your first version is probably the most elegant, however.
Note that if you may have very large formulas, this recursive solution could be in danger of stack overflows. The fix is relatively straightforward:
def getVars2(formula: Formula): Set[Variable] = {
val sb = Set.newBuilder[Variable]
def inner(formulas: List[Formula]): Unit = {
var more: List[Formula] = Nil
formulas.foreach{ _ match {
case v: Variable => sb += v
case Negation(f) => more = f :: more
case BinaryConnective(f0, f1) => more = f1 :: f0 :: more
case _ =>
}}
if (!more.isEmpty) inner(more)
}
inner(formula :: Nil)
sb.result
}
Your names are way too long to allow convenient typing of an interestingly non-tiny expression, but if we abbreviate to the capital letters, then:
BC(N(V('x)), BC(BC(V('a),V('x)),V('y)))
will run about 7x faster with getVars than your first solution; getVars2 is a little slower (only 4x faster).
(Benchmark timings are:
getVariables 1380 ns +- 20 ns
getVars 190 ns +- 10 ns
getVars2 360 ns +- 10 ns
)
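To try the comparison yourself, here is a minimal self-contained sketch. The abbreviated ADT (V, N, BC) is an assumption reconstructed from the expression above, with String ids instead of symbols; the question's real classes are not shown, so treat the names as hypothetical.

```scala
object FormulaVars {
  // Hypothetical abbreviated ADT, assumed for illustration only.
  sealed trait Formula
  final case class V(id: String) extends Formula
  final case class N(f: Formula) extends Formula
  final case class BC(f0: Formula, f1: Formula) extends Formula

  // The question's first version: builds and merges intermediate sets.
  def getVariables(formula: Formula): Set[V] = formula match {
    case v: V       => Set(v)
    case N(f)       => getVariables(f)
    case BC(f0, f1) => getVariables(f0) ++ getVariables(f1)
  }

  // Builder version: a single builder is shared by the whole traversal.
  def getVars(formula: Formula): Set[V] = {
    val sb = Set.newBuilder[V]
    def inner(f: Formula): Unit = f match {
      case v: V       => sb += v
      case N(g)       => inner(g)
      case BC(f0, f1) => inner(f0); inner(f1)
    }
    inner(formula)
    sb.result()
  }

  // The test expression from the answer, with String ids.
  val example: Formula = BC(N(V("x")), BC(BC(V("a"), V("x")), V("y")))
}
```

Both functions should return the same set for FormulaVars.example; only the amount of intermediate allocation differs.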

Related

Scala: Partitioning by case (not by filter)

I have a list of mixed values:
val list = List("A", 2, 'c', 4)
I know how to collect the chars, or strings, or ints, in a single operation:
val strings = list collect { case s:String => s }
==> List(A)
val chars = list collect { case c:Char => c }
==> List(c)
val ints = list collect { case i:Int => i }
==> List(2,4)
Can I do it all in one shot somehow? I'm looking for:
val (strings, chars, ints) = list ??? {
case s:String => s
case c:Char => c
case i:Int => i
}
EDIT
Confession -- An example closer to my actual use case:
I have a list of things, that I want to partition according to some conditions:
val list2 = List("Word", " ", "", "OtherWord")
val (empties, whitespacesonly, words) = list2 ??? {
case s:String if s.isEmpty => s
case s:String if s.trim.isEmpty => s
case s:String => s
}
N.B. partition would be great for this if I only had 2 cases (one where the condition was met and one where it wasn't) but here I have multiple conditions to split on.
Based on your second example: you can use groupBy and a key-ing function. I prefer to use those techniques in conjunction with a discriminated union to make the intention of the code more obvious:
val list2 = List("Word", " ", "", "OtherWord")
sealed trait Description
object Empty extends Description
object Whitespaces extends Description
object Words extends Description
def strToDesc(str : String) : Description = str match {
case _ if str.isEmpty() => Empty
case _ if str.trim.isEmpty() => Whitespaces
case _ => Words
}
val descMap = (list2 groupBy strToDesc) withDefaultValue List.empty[String]
val (empties, whitespaceonly, words) =
(descMap(Empty),descMap(Whitespaces),descMap(Words))
This extends well if you want to add another Description later, e.g. AllCaps...
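As a rough sketch of that extension, assuming "all caps" means every character is upper case (the AllCaps rule and the describe helper are my own illustration, not part of the original answer):

```scala
object DescribeStrings {
  sealed trait Description
  case object Empty extends Description
  case object Whitespaces extends Description
  case object AllCaps extends Description // hypothetical extra category
  case object Words extends Description

  // Order matters: the empty and whitespace checks must come first,
  // because forall is trivially true on an empty string.
  def strToDesc(str: String): Description =
    if (str.isEmpty) Empty
    else if (str.trim.isEmpty) Whitespaces
    else if (str.forall(_.isUpper)) AllCaps
    else Words

  // Missing categories fall back to an empty list.
  def describe(strings: List[String]): Map[Description, List[String]] =
    strings.groupBy(strToDesc).withDefaultValue(List.empty[String])
}
```

Only strToDesc and the sealed hierarchy change; the groupBy call stays the same.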
Hope this helps:
list.foldLeft((List[String](), List[String](), List[String]())) {
case ((e,s,w),str:String) if str.isEmpty => (str::e,s,w)
case ((e,s,w),str:String) if str.trim.isEmpty => (e,str::s,w)
case ((e,s,w),str:String) => (e,s,str::w)
case (acc, _) => acc
}
You could use partition twice:
def partitionWords(list: List[String]) = {
val (emptyOrSpaces, words) = list.partition(_.trim.isEmpty)
val (empty, spaces) = emptyOrSpaces.partition(_.isEmpty)
(empty, spaces, words)
}
Which gives, for your example:
partitionWords(list2)
// (List(""),List(" "),List(Word, OtherWord))
In general you can use foldLeft with a tuple as accumulator.
def partitionWords2(list: List[String]) = {
val nilString = List.empty[String]
val (empty, spaces, words) = list.foldLeft((nilString, nilString, nilString)) {
case ((empty, spaces, words), elem) =>
elem match {
case s if s.isEmpty => (s :: empty, spaces, words)
case s if s.trim.isEmpty => (empty, s :: spaces, words)
case s => (empty, spaces, s :: words)
}
}
(empty.reverse, spaces.reverse, words.reverse)
}
Which will give you the same result.
A tail recursive method,
def partition(list: List[Any]): (List[Any], List[Any], List[Any]) = {
@annotation.tailrec
def inner(map: Map[String, List[Any]], innerList: List[Any]): Map[String, List[Any]] = innerList match {
case x :: xs => x match {
case s: String => inner(insertValue(map, "str", s), xs)
case c: Char => inner(insertValue(map, "char", c), xs)
case i: Int => inner(insertValue(map, "int", i), xs)
}
case Nil => map
}
def insertValue(map: Map[String, List[Any]], key: String, value: Any) = {
map + (key -> (value :: map.getOrElse(key, Nil)))
}
val partitioned = inner(Map.empty[String, List[Any]], list)
(partitioned.get("str").getOrElse(Nil), partitioned.get("char").getOrElse(Nil), partitioned.get("int").getOrElse(Nil))
}
val list1 = List("A", 2, 'c', 4)
val (strs, chars, ints) = partition(list1)
I wound up with this, based on @Nyavro's answer:
val list2 = List("Word", " ", "", "OtherWord")
val(empties, spaces, words) =
list2.foldRight((List[String](), List[String](), List[String]())) {
case (str, (e, s, w)) if str.isEmpty => (str :: e, s, w)
case (str, (e, s, w)) if str.trim.isEmpty => (e, str :: s, w)
case (str, (e, s, w)) => (e, s, str :: w)
}
==> empties: List[String] = List("")
==> spaces: List[String] = List(" ")
==> words: List[String] = List(Word, OtherWord)
I understand the risks of using foldRight: mainly that in order to start on the right, the runtime needs to recurse and that this may blow the stack on large inputs. However, my inputs are small and this risk is acceptable.
Having said that, if there's a quick way to _.reverse three lists of a tuple that I haven't thought of, I'm all ears.
Thanks all!
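On the closing question about reversing the three lists: one option is to stay with foldLeft, which is stack-safe on List, and reverse each component of the tuple once at the end. The sketch below applies that to the original mixed-type list and keeps a precise element type per bucket. The names PartitionByCase and split are hypothetical.

```scala
object PartitionByCase {
  // One left-to-right pass; each accumulator keeps its own element type.
  def split(list: List[Any]): (List[String], List[Char], List[Int]) = {
    val (ss, cs, is) =
      list.foldLeft((List.empty[String], List.empty[Char], List.empty[Int])) {
        case ((ss, cs, is), s: String) => (s :: ss, cs, is)
        case ((ss, cs, is), c: Char)   => (ss, c :: cs, is)
        case ((ss, cs, is), i: Int)    => (ss, cs, i :: is)
        case (acc, _)                  => acc // ignore any other type
      }
    // Prepending builds each list backwards, so restore the input order.
    (ss.reverse, cs.reverse, is.reverse)
  }
}
```

For example, `val (strings, chars, ints) = PartitionByCase.split(List("A", 2, 'c', 4))` binds three lists with their specific element types rather than List[Any].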

Difference between any and parametric polymorphism scala?

I know that parametric polymorphism is what actually works, but I'm curious why using Any in its place does not. For example, how is the first function
def len[T] (l:List[T]):Int =
l match {
case Nil => 0
case _ :: t => 1 + len(t)
}
different from this one?
def len (l:List[Any]):Int =
l match {
case Nil => 0
case _ :: t => 1 + len(t)
}
What do you mean it doesn't work? This seems fine:
len(List('a,'b,'c))
// res0: Int = 3
In your example there really isn't a difference, since you're not actually using the contents of the list for anything, but imagine a slightly different function:
def second[T](l: List[T]): Option[T] =
l match {
case Nil => None
case _ :: Nil => None
case _ :: x :: _ => Some(x)
}
println(second(List(1,2,3)).map(_ + 5)) // Some(7)
println(second(List(List('a,'b,'c), List('d,'e))).map(_.head)) // Some('d)
If you tried this with Any, you wouldn't be able to get anything except Option[Any] in return, so the compiler wouldn't let you do anything useful with the result (like add it to an Int or call .head, as in the examples, respectively).
In this case there really isn't a difference, because you aren't relying on the contained type at all, just the structure of List itself. It doesn't matter what T is, the length will be the same either way.
The type parameter would be important if you wanted to return another List[T]. For example:
def takeEveryOther[T](l: List[T]): List[T] =
l.zipWithIndex.collect { case (a, i) if(i % 2 == 0) => a }
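To make the contrast concrete, here is a small sketch that puts a generic and an Any-based version side by side. The names secondGeneric and secondAny are hypothetical, chosen only for this illustration.

```scala
object AnyVsGeneric {
  // The element type T flows through to the result.
  def secondGeneric[T](l: List[T]): Option[T] = l match {
    case _ :: x :: _ => Some(x)
    case _           => None
  }

  // The element type is erased to Any at the signature.
  def secondAny(l: List[Any]): Option[Any] = l match {
    case _ :: x :: _ => Some(x)
    case _           => None
  }

  // Compiles: the compiler knows the result is an Option[Int].
  val typed: Option[Int] = secondGeneric(List(1, 2, 3)).map(_ + 5)

  // secondAny(List(1, 2, 3)).map(_ + 5) does not compile, because Any has
  // no numeric +. The static type has to be recovered with a pattern match.
  val recovered: Option[Int] =
    secondAny(List(1, 2, 3)).collect { case i: Int => i + 5 }
}
```

Both values end up equal, but the Any version pushes a runtime type test onto every caller.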

accessing list.head, deconstruction vs method call

I am trying to learn a bit of Scala and got stuck on a small oddity: as far as I can tell, I can write the same thing in two supposedly equivalent ways, but one runs and the other does not.
val test_array = Array(1,2,3,4,5,6,7,8,9,10,3,4)
val it = test_array.sliding(2).toList
def iter(lst: List[Array[Int]]): List[Boolean] = lst match {
case h :: Nil => List(false)
case h :: tail => tail.map(x => x.sameElements(lst.head)) ++ iter(tail)
}
if(iter(it).contains(true)) ...
and
val test_array = Array(1,2,3,4,5,6,7,8,9,10,3,4)
val it = test_array.sliding(2).toList
def iter(lst: List[Array[Int]]): List[Boolean] = lst match {
case h :: Nil => List(false)
case h :: tail => tail.map(x => x.sameElements(h)) ++ iter(tail)
}
if(iter(it).contains(true)) ...
The first example runs; the second throws a NoSuchMethodError: scala.collection.immutable.$colon$colon.hd$1()
The only difference is how I access head. In one case I use the deconstruction way and the other I use list.head. Why does one run and the other does not?

Group List elements with a distance less than x

I'm trying to figure out a way to group all the objects in a list depending on an x distance between the elements.
For instance, if distance is 1 then
List(2,3,1,6,10,7,11,12,14)
would give
List(List(1,2,3), List(6,7), List(10,11,12), List(14))
I can only come up with tricky approaches and loops but I guess there must be a cleaner solution.
You could sort your list and then use a foldLeft on it. Something like this:
def sort = {
val l = List(2,3,1,6,10,7,11,12,14)
val dist = 1
l.sorted.foldLeft(List(List.empty[Int]))((list, n) => {
val last = list.head
last match {
case h::q if Math.abs(last.head-n) > dist=> List(n) :: list
case _ => (n :: last ) :: list.tail
}
}
)
}
The result is correct but reversed: both the outer list and the elements inside each group come out backwards. Reversing each group and then the outer list at the end fixes that; the code becomes
val l = List(2,3,1,6,10,7,11,12,14)
val dist = 1
val res = l.sorted.foldLeft(List(List.empty[Int]))((list, n) => {
val last = list.head
last match {
case h :: _ if Math.abs(h - n) > dist => List(n) :: list
case _ => (n :: last) :: list.tail
}
}
).map(_.reverse).reverse
The cleanest answer would rely upon a method that probably should be called groupedWhile which would split exactly where a condition was true. If you had this method, then it would just be
def byDist(xs: List[Int], d: Int) = groupedWhile(xs.sorted)((l,r) => r - l <= d)
But we don't have groupedWhile.
So let's make one:
def groupedWhile[A](xs: List[A])(p: (A,A) => Boolean): List[List[A]] = {
val yss = List.newBuilder[List[A]]
val ys = List.newBuilder[A]
(xs.take(1) ::: xs, xs).zipped.foreach{ (l,r) =>
if (!p(l,r)) {
yss += ys.result
ys.clear
}
ys += r
}
ys.result match {
case Nil =>
case zs => yss += zs
}
yss.result.dropWhile(_.isEmpty)
}
Now that you have the generic capability, you can get the specific one easily.
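For completeness, here is a compact sketch of the same idea wired up to byDist. It swaps the builder-and-zipped implementation for a foldLeft over an accumulator of groups (my substitution, not the answer's code), which also avoids `zipped`, deprecated since Scala 2.13. The enclosing object name is hypothetical.

```scala
object GroupByDistance {
  // Splits xs between neighbours l and r whenever p(l, r) is false.
  def groupedWhile[A](xs: List[A])(p: (A, A) => Boolean): List[List[A]] =
    xs.foldLeft(List.empty[List[A]]) {
      // x continues the current group (groups are built back to front)
      case ((last :: group) :: done, x) if p(last, x) =>
        (x :: last :: group) :: done
      // first element, or the predicate failed: start a new group
      case (done, x) =>
        List(x) :: done
    }.map(_.reverse).reverse

  def byDist(xs: List[Int], d: Int): List[List[Int]] =
    groupedWhile(xs.sorted)((l, r) => r - l <= d)
}
```

With the question's input, `GroupByDistance.byDist(List(2,3,1,6,10,7,11,12,14), 1)` yields the four groups listed there.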

More efficient Solution with tailrecursion?

I have the following ADT for Formulas. (shortened to the important ones)
sealed trait Formula
case class Variable(id: String) extends Formula
case class Negation(f: Formula) extends Formula
abstract class BinaryConnective(val f0: Formula, val f1: Formula) extends Formula
Note that the following methods are defined in an implicit class for formulas.
Let's say I want to get all variables from a formula.
My first approach was:
Solution 1
def variables: Set[Variable] = formula match {
case v: Variable => HashSet(v)
case Negation(f) => f.variables
case BinaryConnective(f0, f1) => f0.variables ++ f1.variables
case _ => HashSet.empty
}
This approach is very simple to understand, but not tail-recursive. So I wanted to try something different. I implemented a foreach on my tree-like formulas.
Solution 2
def foreach(func: Formula => Unit) = {
@tailrec
def foreach(list: List[Formula]): Unit = list match {
case Nil =>
case _ => foreach(list.foldLeft(List.empty[Formula])((next, formula) => {
func(formula)
formula match {
case Negation(f) => f :: next
case BinaryConnective(f0, f1) => f0 :: f1 :: next
case _ => next
}
}))
}
foreach(List(formula))
}
Now I can implement many methods with the help of the foreach.
def variables2 = {
val builder = Set.newBuilder[Variable]
formula.foreach {
case v: Variable => builder += v
case _ =>
}
builder.result
}
Now finally to the question. Which solution is preferable in terms of efficiency? At least I find my simple first solution more aesthetic.
I would expect Solution 2 to be more efficient, because you aren't creating many different HashSet instances and combining them together. It is also more general.
You can simplify your Solution 2, removing the foldLeft:
def foreach(func: Formula => Unit) = {
@tailrec
def foreach(list: List[Formula]): Unit = list match {
case Nil =>
case formula :: next => {
func(formula)
foreach {
formula match {
case Negation(f) => f :: next
case BinaryConnective(f0, f1) => f0 :: f1 :: next
case _ => next
}
}
}
}
foreach(List(formula))
}
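Put together, the simplified traversal and the builder-based variables method might look like the sketch below. It is self-contained only because it assumes a simplified ADT in which BinaryConnective is a plain case class (the question uses an abstract class, presumably with its own extractor); FormulaTraversal and FormulaOps are hypothetical names.

```scala
import scala.annotation.tailrec

object FormulaTraversal {
  // Hypothetical simplified ADT, assumed for illustration only.
  sealed trait Formula
  final case class Variable(id: String) extends Formula
  final case class Negation(f: Formula) extends Formula
  final case class BinaryConnective(f0: Formula, f1: Formula) extends Formula

  implicit class FormulaOps(private val formula: Formula) {
    // Visits every node, using an explicit work list instead of the call stack.
    def foreach(func: Formula => Unit): Unit = {
      @tailrec
      def loop(pending: List[Formula]): Unit = pending match {
        case Nil => ()
        case current :: next =>
          func(current)
          current match {
            case Negation(f)              => loop(f :: next)
            case BinaryConnective(f0, f1) => loop(f0 :: f1 :: next)
            case _: Variable              => loop(next)
          }
      }
      loop(List(formula))
    }

    // Collects variables into one builder during a single traversal.
    def variables: Set[Variable] = {
      val builder = Set.newBuilder[Variable]
      foreach {
        case v: Variable => builder += v
        case _           => ()
      }
      builder.result()
    }
  }
}
```

Because loop is tail-recursive, even a formula nested tens of thousands of levels deep is traversed without growing the stack.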