How to generate transitive closure of set of tuples? - scala

What is the best way to generate transitive closure of a set of tuples?
Example:
Input Set((1, 2), (2, 3), (3, 4), (5, 0))
Output Set((1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (5, 0))

//one transitive step
def addTransitive[A, B](s: Set[(A, B)]) = {
s ++ (for ((x1, y1) <- s; (x2, y2) <- s if y1 == x2) yield (x1, y2))
}
//repeat until we don't get a bigger set
def transitiveClosure[A,B](s:Set[(A,B)]):Set[(A,B)] = {
val t = addTransitive(s)
if (t.size == s.size) s else transitiveClosure(t)
}
println(transitiveClosure(Set((1,2), (2,3), (3,4))))
This is not a very efficient implementation, but it is simple.

With the help of unfold,
def unfoldRight[A, B](seed: B)(f: B => Option[(A, B)]): List[A] = f(seed) match {
case Some((a, b)) => a :: unfoldRight(b)(f)
case None => Nil
}
def unfoldLeft[A, B](seed: B)(f: B => Option[(B, A)]) = {
def loop(seed: B)(ls: List[A]): List[A] = f(seed) match {
case Some((b, a)) => loop(b)(a :: ls)
case None => ls
}
loop(seed)(Nil)
}
it becomes rather simple:
def transitiveClosure(input: Set[(Int, Int)]) = {
val map = input.toMap
def closure(seed: Int) = unfoldLeft(map get seed) {
case Some(`seed`) => None
case Some(n) => Some((map get n) -> (seed -> n)) // unfoldLeft expects (next state, element)
case _ => None
}
map.keySet flatMap closure
}
Another way of writing closure is this:
def closure(seed: Int) = unfoldRight(seed) {
case n if map.get(n) != Some(seed) => map get n map (x => seed -> x -> x)
case _ => None
}
I'm not sure which way I like best, myself. I like the elegance of testing for Some(seed) to avoid loops, but, by the same token, I also like the elegance of mapping the result of map get n.
Neither version returns seed -> seed for loops, so you'll have to add that if needed. Here:
def closure(seed: Int) = unfoldRight(map get seed) {
case Some(`seed`) => Some(seed -> seed -> None)
case Some(n) => Some(seed -> n -> (map get n))
case _ => None
}

Model the problem as a directed graph as follows:
Represent the numbers in the tuples as vertices in a graph.
Then each tuple (x, y) represents a directed edge from x to y. After that, use Warshall's Algorithm to find the transitive closure of the graph.
For the resulting graph, each directed edge is then converted to an (x, y) tuple. That is the transitive closure of the set of tuples.
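The three steps above can be sketched as follows. This is only a minimal illustration: the name warshallClosure and the boolean-matrix representation are assumptions made for the example, and the running time is cubic in the number of distinct vertices.

```scala
// Sketch of Warshall's algorithm over a set of (from, to) pairs.
// `warshallClosure` is a hypothetical name, not taken from the answers here.
def warshallClosure[T](edges: Set[(T, T)]): Set[(T, T)] = {
  // Give every vertex a dense index so the relation fits in a boolean matrix.
  val vertices = edges.flatMap { case (x, y) => Set(x, y) }.toVector
  val index = vertices.zipWithIndex.toMap
  val n = vertices.length
  val reach = Array.fill(n, n)(false)
  for ((x, y) <- edges) reach(index(x))(index(y)) = true

  // After processing k, reach(i)(j) is true when j can be reached from i
  // using only vertices 0..k as intermediate stops.
  for (k <- 0 until n; i <- 0 until n if reach(i)(k); j <- 0 until n)
    if (reach(k)(j)) reach(i)(j) = true

  // Convert every true cell of the matrix back into an (x, y) tuple.
  (for (i <- 0 until n; j <- 0 until n if reach(i)(j))
    yield (vertices(i), vertices(j))).toSet
}

val closed = warshallClosure(Set((1, 2), (2, 3), (3, 4), (5, 0)))
```

Unlike the DAG-only approaches, this version also handles cycles: for Set((1, 2), (2, 1)) it adds (1, 1) and (2, 2).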

Assuming that what you have is a DAG (there are no cycles in your example data), you could use the code below. It expects the DAG as a Map from T to List[T], which you could get from your input using
input.groupBy(_._1).map { case (k, pairs) => k -> pairs.map(_._2).toList }
Here's the transitive closure:
def transitiveClosure[T]( dag: Map[ T, List[T] ] ) = {
var tc = Map.empty[ T, List[T] ]
def getItemTC( item:T ): List[T] = tc.get(item) match {
case None =>
val itemTC = dag.getOrElse(item, Nil) flatMap ( x => x::getItemTC(x) ) // vertices without outgoing edges are not keys
tc += ( item -> itemTC )
itemTC
case Some(itemTC) => itemTC
}
dag.keys foreach getItemTC
tc
}
This code figures out the closure for each element just once. However:
This code can cause a stack overflow if there are long enough paths through the DAG (the recursion is not tail recursion).
For a large graph, you would probably be better off making tc a mutable Map and then converting it at the end if you wanted an immutable Map.
If your elements were really small integers as in your example, you could improve performance significantly by using Arrays rather than Maps, although doing so would complicate some things.
To eliminate the stack overflow problem (for DAGs), you could do a topological sort, reverse it, and process the items in order. But see also this page:
best known transitive closure algorithm for graph
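The topological-sort idea mentioned above could look roughly like the sketch below. It assumes Scala 2.13 or later and an acyclic input. The name transitiveClosureIterative, the use of Kahn's algorithm for the ordering, and the Set-valued result are choices made for this example, not part of the answer's code.

```scala
import scala.collection.mutable

// Hypothetical helper: closure of a DAG without recursion.
// Assumes the graph is acyclic; vertices on a cycle never enter `order`
// and would be missing from the result.
def transitiveClosureIterative[T](dag: Map[T, List[T]]): Map[T, Set[T]] = {
  val vertices: Set[T] = dag.keySet ++ dag.values.flatten

  // Kahn's algorithm: repeatedly emit vertices with no remaining incoming edges.
  val inDegree = mutable.Map.from(vertices.map(v => v -> 0))
  for (targets <- dag.values; t <- targets) inDegree(t) += 1
  val ready = mutable.Queue.from(vertices.filter(v => inDegree(v) == 0))
  val order = mutable.ArrayBuffer.empty[T]
  while (ready.nonEmpty) {
    val v = ready.dequeue()
    order += v
    for (t <- dag.getOrElse(v, Nil)) {
      inDegree(t) -= 1
      if (inDegree(t) == 0) ready.enqueue(t)
    }
  }

  // Walk the order backwards so that every successor is finished
  // before any vertex that points to it.
  val tc = mutable.Map.empty[T, Set[T]]
  for (v <- order.reverseIterator) {
    val direct = dag.getOrElse(v, Nil)
    tc(v) = (direct ++ direct.flatMap(t => tc.getOrElse(t, Set.empty[T]))).toSet
  }
  tc.toMap
}

val result = transitiveClosureIterative(
  Map(1 -> List(2), 2 -> List(3), 3 -> List(4), 5 -> List(0)))
```

Because both loops are iterative, the depth of the graph no longer affects stack usage, although the result can still be large for long chains.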

Related

Merging list of tuples in scala based on key

I have a list of tuples that looks like this:
Seq("ptxt"->"how","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
Based on the keys, I want to merge each ptxt value with all the list values that follow it.
For example, create a new Seq that looks like this:
Seq("how you doing", "whats up", "this is cool")
You could fold your Seq with foldLeft:
val s = Seq("ptxt"->"how ","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
val r: Seq[String] = s.foldLeft(List[String]()) {
case (xs, ("ptxt", s)) => s :: xs
case (x :: xs, ("list", s)) => (x + s) :: xs
}.reverse
If you don't care about the order, you can omit reverse.
foldLeft takes two arguments: the first is the initial value, and the second is a function of two arguments, the previous result and the current element of the sequence. The result of each call is then passed to the next call as its first argument.
For example, foldLeft over a list of numbers with addition as the function produces the sum of all elements, working from the left:
List(5, 4, 8, 6, 2).foldLeft(0) { (result, i) =>
result + i
} // 25
In our case, we start with an empty list. Then we provide a function that handles two cases using pattern matching.
The first case is when the key is "ptxt". Here we just prepend the value to the list.
case (xs, ("ptxt", s)) => s :: xs
The second case is when the key is "list". Here we take the first string from the list (using pattern matching), concatenate the value to it, and put the result back in front of the rest of the list.
case (x :: xs, ("list", s)) => (x + s) :: xs
At the end, since we were prepending elements, we need to reverse the list. Why prepend rather than append? Because appending to an immutable List is O(n) while prepending is O(1), so prepending is more efficient.
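To watch the accumulator change step by step, one option is to swap foldLeft for scanLeft, which applies the same function but keeps every intermediate result. The sketch below is only for illustration. The third case, which ignores a leading "list" entry, is an assumption added to make the match total.

```scala
val s = Seq("ptxt" -> "how ", "list" -> "you doing", "ptxt" -> "whats up",
            "ptxt" -> "this ", "list" -> "is ", "list" -> "cool")

// scanLeft returns the initial value followed by the accumulator after
// each element, so the prepend and merge steps become visible.
val steps = s.scanLeft(List.empty[String]) {
  case (xs, ("ptxt", v))      => v :: xs        // start a new entry
  case (x :: xs, ("list", v)) => (x + v) :: xs  // extend the newest entry
  case (Nil, ("list", _))     => Nil            // assumption: skip a leading "list"
}
steps.foreach(println)
// List()
// List(how )
// List(how you doing)
// List(whats up, how you doing)
// List(this , whats up, how you doing)
// List(this is , whats up, how you doing)
// List(this is cool, whats up, how you doing)
```

The last line is the folded result before reverse is applied.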
Here is another solution:
val data = Seq("ptxt"->"how","list"->"you doing","ptxt"->"whats", "list" -> "up","ptxt"-> "this ", "list"->"is cool")
First group Keys and Values:
val grouped = data.groupBy(_._1)
.map{case (k, l) => k -> l.map{case (_, v) => v.trim}}
// > Map(list -> List(you doing, up, is cool), ptxt -> List(how, whats, this))
Then zip and concatenate the two values:
grouped("ptxt").zip(grouped("list"))
.map{case (a, b) => s"$a $b"}
// > List(how you doing, whats up, this is cool)
Disclaimer: This only works if there is always key, value, key, value, ... in the list - I had to adjust the input data.
If you change Seq for List, you can solve that with a simple tail-recursive function.
(The code uses Scala 2.13, but can be rewritten to use older Scala versions if needed)
def mergeByKey[K](list: List[(K, String)]): List[String] = {
@annotation.tailrec
def loop(remaining: List[(K, String)], acc: Map[K, StringBuilder]): List[String] =
remaining match {
case Nil =>
acc.valuesIterator.map(_.result()).toList
case (key, value) :: tail =>
loop(
remaining = tail,
acc.updatedWith(key) {
case None => Some(new StringBuilder(value))
case Some(oldValue) => Some(oldValue.append(value))
}
)
}
loop(remaining = list, acc = Map.empty)
}
val data = List("ptxt"->"how","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
mergeByKey(data)
// res: List[String] = List("howwhats upthis ", "you doingis cool")
Or a one liner using groupMap.
(inspired by pme's answer)
data.groupMap(_._1)(_._2).view.mapValues(_.mkString).valuesIterator.toList
I'm adding another answer since I don't have enough reputation points to comment. This is just an improvement on Krzysztof Atłasik's answer: to handle the case where the Seq starts with a "list" entry, you might want to add another case:
case (xs,("list", s)) if xs.isEmpty=>xs
So the final code could be something like:
val s = Seq("list"->"how ","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
val r: Seq[String] = s.foldLeft(List[String]()) {
case (xs,("list", s)) if xs.isEmpty=>xs
case (xs, ("ptxt", s)) => s :: xs
case (x :: xs, ("list", s)) => (x + s) :: xs
}.reverse

Lists Merge and reduce without using inbuilt functions

Consider an example
val a= List(1,2,3)
val b= List(4,5,6)
A mergeReduce function takes two lists and two functions: one function merges the lists element by element, and the other reduces the merged list to a single Int, in a more general form.
For example, merge by multiplying the heads of the two lists and then reduce using add,
or merge using max and then take the min of the generated list:
mergeReduce(a,b,product,add) = 32
mergeReduce(a,b,max,min) = 4
This can be achieved using inbuilt functions, but is there a better way to do it recursively, without using those functions?
Here is your mergeReduce() (as I understand it).
def mergeReduce(a :List[Int], b :List[Int]
,f :(Int,Int)=>Int, g :(Int,Int)=>Int) :Int =
a.zip(b).map(f.tupled).reduce(g)
val a= List(1,2,3)
val b= List(4,5,6)
mergeReduce(a,b,_*_,_+_) // 32
mergeReduce(a,b,math.max,math.min) // 4
So, what are the "inbuilt" functions you want to replace? And why do you want to replace them?
Here then is a version without map, reduce, zip, and tupled.
def mergeReduce(lsta :List[Int], lstb :List[Int]
,f :(Int,Int)=>Int, g :(Int,Int)=>Int) :Int = {
def merg(x :List[Int], y :List[Int], acc :List[Int] = Nil) :List[Int] =
if (x.isEmpty || y.isEmpty) acc.reverse
else merg(x.tail, y.tail, f(x.head,y.head) :: acc)
def reduc(z: List[Int]) :Int = z match {
case Nil => -1 //error
case i :: Nil => i
case a::b::c => reduc(g(a,b) :: c)
}
reduc(merg(lsta, lstb))
}
This uses .isEmpty, .reverse, .head, .tail, and .unapply (the method by which pattern matching is accomplished). Still too much "inbuilt"?
I think this is what you are looking for. It performs merge and reduce in a single pass, using only the basic List operations:
def mergeReduce[T](a: List[T], b: List[T], merge: (T, T) => T, reduce: (T, T) => T): T = {
@annotation.tailrec
def loop(a: List[T], b: List[T], res: T): T =
(a, b) match {
case (a :: at, b :: bt) => loop(at, bt, reduce(res, merge(a, b)))
case _ => res
}
loop(a.tail, b.tail, merge(a.head, b.head))
}
This will fail if either list is Nil and will silently discard the values from the longer list if the lengths are not the same.
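If those two failure modes matter, one possible variation is to return an Option instead. The sketch below is an assumption about how that might look: the name mergeReduceOpt and the choice to treat mismatched lengths as None are not part of the answer above.

```scala
import scala.annotation.tailrec

// Hypothetical variant: None for empty or mismatched-length input instead of
// throwing or silently dropping elements.
def mergeReduceOpt[T](a: List[T], b: List[T],
                      merge: (T, T) => T, reduce: (T, T) => T): Option[T] = {
  @tailrec
  def loop(xs: List[T], ys: List[T], acc: T): Option[T] =
    (xs, ys) match {
      case (x :: xt, y :: yt) => loop(xt, yt, reduce(acc, merge(x, y)))
      case (Nil, Nil)         => Some(acc)
      case _                  => None // one list is longer than the other
    }

  (a, b) match {
    case (x :: xt, y :: yt) => loop(xt, yt, merge(x, y))
    case _                  => None // at least one list is empty
  }
}

val a = List(1, 2, 3)
val b = List(4, 5, 6)
val productSum = mergeReduceOpt[Int](a, b, _ * _, _ + _)         // Some(32)
val maxThenMin = mergeReduceOpt[Int](a, b, math.max, math.min)   // Some(4)
```

The explicit [Int] type argument is there so that the parameter types of the lambdas are known up front; all four arguments share one parameter list.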

Scala grouping a collection by a finite sequence of values

For
val xs = (1 to 9).toArray
we can group for instance every two consecutive items like this,
xs.grouped(2)
Yet, given a finite sequence of group sizes, for instance
val gr = Seq(3,2,1)
how can we group xs based on gr so that
xs.grouped(gr)
res: Array(Array(1,2,3), Array(4,5), Array(6), Array(7,8,9))
Please, consider the following solution:
def groupBySeq[T](arr:Array[T], gr:Seq[Int]) = {
val r = gr.foldLeft((arr, List[Array[T]]())) {
case ((rest, acc), item) => (rest.drop(item), rest.take(item)::acc)
}
(r._1::r._2).reverse
}
The following function generates the result you are looking for, although I suspect there may be a better way:
def grouped[T](what: Seq[T], by: Seq[Int]) = {
def go(left: Seq[T], nextBy: Int, acc: List[Seq[T]]): List[Seq[T]] = (left.length, by(nextBy % by.length)) match {
case (n, sz) if n <= sz => left :: acc
case (n, sz) => go(left.drop(sz), nextBy+1, left.take(sz) :: acc)
}
go(what, 0, Nil).reverse
}
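As one possible alternative to the two answers above, the group boundaries can be computed up front with scanLeft and then used to slice the input. This is only a sketch: groupBySizes is a hypothetical name, and it assumes that the sizes are applied once, with any leftover elements forming a final group (the second answer instead cycles through the sizes).

```scala
// Hypothetical helper: split `xs` into consecutive groups of the given sizes,
// with whatever is left over forming a last group.
def groupBySizes[T](xs: Seq[T], sizes: Seq[Int]): List[Seq[T]] = {
  // Running totals of the sizes give the slice boundaries,
  // e.g. Seq(3, 2, 1) becomes Seq(0, 3, 5, 6).
  val offsets = sizes.scanLeft(0)(_ + _)
  val groups = offsets.zip(offsets.tail).map { case (from, until) => xs.slice(from, until) }
  val rest = xs.drop(offsets.last)
  // Drop empty groups, which appear when `xs` runs out early or nothing is left over.
  (groups :+ rest).filter(_.nonEmpty).toList
}

val grouped9 = groupBySizes((1 to 9).toVector, Seq(3, 2, 1))
// List(Vector(1, 2, 3), Vector(4, 5), Vector(6), Vector(7, 8, 9))
```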

Scala: Elegant solution to iterate over (List, List)

I'm trying to come up with an "elegant" solution to iterate over two lists (pairs of values), and perform some tests on the resulting values.
Any ideas? Here's what I have so far, but I get "value filter is not a member of (List[Int], List[Int])", which surprises me; I thought this would work. Also, I feel like there must be a much cleaner way to express this in Scala.
val accounts = random(count = 100, minimum = 1, maximum = GPDataTypes.integer._2)
val ids = random(count = 100, minimum = 1, maximum = GPDataTypes.integer._2)
for ((id, accountId) <- (ids, accounts)) {
val g = new GPGlimple(Some(id), Some(timestamp), accountId, false, false, 2)
println(g)
g.accountId mustEqual accountId
g.id mustEqual id
g.created.get must beLessThan(System.currentTimeMillis)
g.layers must beNone
g.version must be equalTo 2
}
The simplest solution for this is zip:
(ids zip accounts)
The documentation for zip says:
Returns a list formed from this list and another iterable collection by combining corresponding elements in pairs.
In other words, zip will return a list of tuples.
The zipped method could also work here:
(ids, accounts).zipped
You can find the zipped source for 2-tuples here. Note that this is made available through an enrichment of (T, U) where T is implicitly viewable as a TraversableLike and U is implicitly viewable as an IterableLike. That method returns a ZippedTraversable2, which is a minimal interface that encapsulates this sort of zipped return, and behaves more efficiently for large sequences by inhibiting the creation of intermediary collections. These are generally more performant because they use iterators internally, as can be seen in the source.
Note that the returns here are of different types, which could affect downstream behavior. One important difference is that the normal combinator methods on ZippedTraversable2 are slightly different than those on a Traversable of tuples. The methods on ZippedTraversable2 generally expect a function of 2 arguments, while those on a Traversable of tuples will expect a function with a single argument that is a tuple. For example, you can check this in the REPL for the foreach method:
val s1 = List(1, 2, 3)
val s2 = List('a', 'b', 'c')
(s1 -> s2).zipped.foreach _
// ((Int, Char) => Any) => Unit = <function1>
(s1 zip s2).foreach _
// (((Int, Char)) => Any) => Unit = <function1>
//Notice the extra parens here, signifying a method with a tuple argument
This difference means that you sometimes have to use a different syntax when using zip and zipped:
(s1 zip s2).map { x => x._1 + x._2 }
(s1, s2).zipped.map { x => x._1 + x._2 } //This won't work! The method shouldn't expect a tuple argument
//conversely
(s1, s2).zipped.map { (x, y) => x + y }
(s1 zip s2).map { (x, y) => x + y } //This won't work! The method shouldn't expect 2 arguments
//Added note: methods with 2 arguments can often use the more concise underscore notation:
(s1, s2).zipped.map { _ + _ }
Note that if you use the case notation, you're covered either way:
//case works for both syntaxes
(s1, s2).zipped.map { case (x, y) => x + y }
(s1 zip s2).map { case (x, y) => x + y }
This works since the compiler understands this notation for methods with either two arguments, or a single tuple argument, as explained in section 8.5 of the spec:
val f: (Int, Int) => Int = { case (a, b) => a + b }
val g: ((Int, Int)) => Int = { case (a, b) => a + b }
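For readers on newer versions: the sketch below assumes Scala 2.13 or later, where zipped on tuples is deprecated in favour of lazyZip. The values xs and ys are made up for the example.

```scala
// Assumes Scala 2.13+; on older versions use (xs, ys).zipped instead.
val xs = List(1, 2, 3)
val ys = List(10, 20, 30)

// Like zipped, lazyZip avoids building an intermediate list of tuples
// and hands the paired elements to the function as two arguments.
val sums = xs.lazyZip(ys).map(_ + _)                        // List(11, 22, 33)

// The case notation works here as well.
val products = xs.lazyZip(ys).map { case (x, y) => x * y }  // List(10, 40, 90)
```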
Use zip:
for ((id, accountId) <- ids.zip(accounts)) {
// ...
}

Cartesian Product and Map Combined in Scala

This is a followup to: Expand a Set of Sets of Strings into Cartesian Product in Scala
The idea is you want to take:
val sets = Set(Set("a","b","c"), Set("1","2"), Set("S","T"))
and get back:
Set("a&1&S", "a&1&T", "a&2&S", ..., "c&2&T")
A general solution is:
def combine[A](f:(A, A) => A)(xs:Iterable[Iterable[A]]) =
xs.reduceLeft { (x, y) => x.view.flatMap {a => y.map(f(a, _)) } }
used as follows:
val expanded = combine{(x:String, y:String) => x + "&" + y}(sets).toSet
Theoretically, there should be a way to take input of type Set[Set[A]] and get back a Set[B]. That is, to convert the type while combining the elements.
An example usage would be to take in sets of strings (as above) and output the lengths of their concatenations. The f function in combine would be something of the form:
(a:Int, b:String) => a + b.length
I was not able to come up with an implementation. Does anyone have an answer?
If you really want your combiner function to do the mapping, you can use a fold but as Craig pointed out you'll have to provide a seed value:
def combine[A, B](f: B => A => B, zero: B)(xs: Iterable[Iterable[A]]) =
xs.foldLeft(Iterable(zero)) {
(x, y) => x.view flatMap { y map f(_) }
}
The fact that you need such a seed value follows from the combiner/mapper function type (B, A) => B (or, as a curried function, B => A => B). Clearly, to map the first A you encounter, you're going to need to supply a B.
You can make it somewhat simpler for callers by using a Zero type class:
trait Zero[T] {
def zero: T
}
object Zero {
implicit object IntHasZero extends Zero[Int] {
val zero = 0
}
// ... etc ...
}
Then the combine method can be defined as:
def combine[A, B : Zero](f: B => A => B)(xs: Iterable[Iterable[A]]) =
xs.foldLeft(Iterable(implicitly[Zero[B]].zero)) {
(x, y) => x.view flatMap { y map f(_) }
}
Usage:
combine((b: Int) => (a: String) => b + a.length)(sets)
Scalaz provides a Zero type class, along with a lot of other goodies for functional programming.
The problem that you're running into is that reduce(Left|Right) takes a function (A, A) => A which doesn't allow you to change the type. You want something more like foldLeft which takes (B, A) ⇒ B, allowing you to accumulate an output of a different type. folds need a seed value though, which can't be an empty collection here. You'd need to take xs apart into a head and tail, map the head iterable to be Iterable[B], and then call foldLeft with the mapped head, the tail, and some function (B, A) => B. That seems like more trouble than it's worth though, so I'd just do all the mapping up front.
def combine[A, B](f: (B, B) => B)(g: (A) => B)(xs:Iterable[Iterable[A]]) =
xs.map(_.map(g)).reduceLeft { (x, y) => x.view.flatMap {a => y.map(f(a, _)) } }
val sets = Set(Set(1, 2, 3), Set(3, 4), Set(5, 6, 7))
val expanded = combine{(x: String, y: String) => x + "&" + y}{(i: Int) => i.toString}(sets).toSet
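For comparison, the head-and-tail approach described above (map the first collection into B, then fold the rest with a (B, A) => B function) might be sketched like this. The name combineFold and its parameter order are assumptions made for the example.

```scala
// Hypothetical name `combineFold`: `g` lifts the first collection into B,
// and `f` folds each element of the later collections into a running B.
def combineFold[A, B](g: A => B)(f: (B, A) => B)(xs: Iterable[Iterable[A]]): Iterable[B] =
  if (xs.isEmpty) Iterable.empty
  else xs.tail.foldLeft(xs.head.map(g)) { (acc, ys) =>
    // Pair every partial result with every element of the next collection.
    acc.flatMap(b => ys.map(a => f(b, a)))
  }

// Lengths of all concatenations. Lists are used here so that equal
// intermediate results are not collapsed, as they would be with Sets.
val lengths = combineFold((s: String) => s.length)((n, s) => n + s.length)(
  List(List("a", "bb"), List("c", "ddd"))).toList
// List(2, 4, 3, 5)
```

No seed value is needed because the first collection supplies the initial Bs, at the cost of passing two functions instead of one.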