Transposing arbitrary collection-of-collections in Scala

I often have to transpose a "rectangular" collection-of-collections in Scala, e.g. a list of maps, a map of lists, a map of maps, a set of lists, a map of sets etc. Since collections can be uniformly viewed as a mapping from a specific domain to a co-domain (e.g. a List[A]/Array[A] is a mapping from the Int domain to the A co-domain, Set[A] is a mapping from the A domain to the Boolean co-domain etc.), I'd like to write a clean, generic function to do a transpose operation (e.g. turn a map of lists into the transposed list of maps). However, I'm having trouble because, other than the () operator, Scala doesn't seem to have a unified API to view collections abstractly as mappings.
So I end up writing a separate transpose for each type of collection-of-collections as follows:
def transposeMapOfLists[A,B]( mapOfLists: Map[A,List[B]] ) : List[Map[A,B]] = {
val k = ( mapOfLists keys ) toList
val l = ( k map { mapOfLists(_) } ) transpose;
l map { v => ( k zip v ) toMap }
}
def transposeListOfMaps[A,B]( listOfMaps: List[Map[A,B]]) : Map[A,List[B]] = {
val k = ( listOfMaps(0) keys ) toList
val l = ( listOfMaps map { m => k map { m(_) } } ) transpose;
( k zip l ) toMap
}
def transposeMapOfMaps[A,B,C]( mapOfMaps: Map[A,Map[B,C]] ) : Map[B,Map[A,C]] = {
val k = ( mapOfMaps keys ) toList
val listOfMaps = k map { mapOfMaps(_) }
val mapOfLists = transposeListOfMaps( listOfMaps )
mapOfLists map { p => ( p._1, ( k zip p._2 ) toMap ) }
}
Can someone help me unify these methods into one generic collection-of-collections transpose? It would also help me (and, I am sure, others) learn some useful Scala features in the process.
ps: I have ignored exception handling and have assumed the input collection-of-collections is rectangular, i.e., all of the inner collections' domain elements constitute the same set.
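The "collections as mappings" view described in the question can be checked directly against the standard library. The following is a minimal sketch, assuming Scala 2 collections where Seq and Map extend PartialFunction and Set extends a function to Boolean; the value names are made up for illustration.

```scala
// A Seq can be used wherever a function from index to element is expected.
val xs = List("a", "b", "c")
val asFunction: Int => String = xs

// A Set can be used as a membership test: element => Boolean.
val s = Set(1, 2, 3)
val membership: Int => Boolean = s

// A Map is a partial function from key to value.
val m = Map("x" -> 1)
val partial: PartialFunction[String, Int] = m

val second = asFunction(1)              // element at index 1
val hasFour = membership(4)             // false, 4 is not in the set
val definedAtY = partial.isDefinedAt("y") // false, "y" is not a key
```

What these interfaces do not provide is a uniform way to enumerate the domain or to rebuild a collection from pairs, which is the gap the question runs into.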

I'm sure the following messy version using type classes could be cleaned up a lot, but it works as a quick proof-of-concept. I don't see an easy way to get the return types right without dependent method types (I'm sure it's possible), so you'll have to use -Xexperimental:
trait Mapping[A, B, C] {
  type M[D] <: PartialFunction[A, D]
  def domain(c: C): Seq[A]
  def fromPairs[D](ps: Seq[(A, D)]): M[D]
  def codomain(c: C)(implicit ev: C <:< PartialFunction[A, B]) =
    domain(c).map(c)
  def toPairs(c: C)(implicit ev: C <:< PartialFunction[A, B]) =
    domain(c).map(a => (a, c(a)))
}
implicit def seqMapping[A, B <: Seq[A]] = new Mapping[Int, A, B] {
  type M[C] = Seq[C]
  def domain(c: B) = 0 until c.size
  def fromPairs[C](ps: Seq[(Int, C)]) = ps.sortBy(_._1).map(_._2)
}
implicit def mapMapping[A, B, C <: Map[A, B]] = new Mapping[A, B, C] {
  type M[D] = Map[A, D]
  def domain(c: C) = c.keys.toSeq
  def fromPairs[D](ps: Seq[(A, D)]) = ps.toMap
}
def transpose[A, B, C, M, N](m: M)(implicit
  pev: M <:< PartialFunction[A, N],
  qev: N <:< PartialFunction[B, C],
  mev: Mapping[A, N, M],
  nev: Mapping[B, C, N]
) = nev.fromPairs(nev.domain(mev.codomain(m).head).map(b =>
  b -> mev.fromPairs(mev.toPairs(m).map { case (a, c) => a -> c(b) })
))
And now for some tests:
scala> println(transpose(List(Map("a" -> 1, "b" -> 13), Map("b" -> 99, "a" -> 14))))
Map(a -> Vector(1, 14), b -> Vector(13, 99))
scala> println(transpose(Map('a' -> List(1, 2, 3), 'z' -> List(4, 5, 6))))
Vector(Map(a -> 1, z -> 4), Map(a -> 2, z -> 5), Map(a -> 3, z -> 6))
scala> println(transpose(Map("x" -> Map(4 -> 'a, 99 -> 'z), "y" -> Map(4 -> 'b, 99 -> 's))))
Map(4 -> Map(x -> 'a, y -> 'b), 99 -> Map(x -> 'z, y -> 's))
So it's working as desired.
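For comparison, here is a much less generic sketch that avoids type classes by asking the caller to present both levels as (domain, codomain) pairs first. The name transposePairs is hypothetical, and the result is always a Map of Maps rather than the "natural" transposed collection type that the type-class version recovers.

```scala
// Hypothetical helper: transpose a pair-encoded collection-of-collections.
// Outer entries are (a, inner); inner entries are (b, c). Result is keyed by b, then a.
def transposePairs[A, B, C](outer: Seq[(A, Seq[(B, C)])]): Map[B, Map[A, C]] =
  outer
    .flatMap { case (a, inner) => inner.map { case (b, c) => (b, a, c) } }
    .groupBy(_._1)
    .map { case (b, triples) => b -> triples.map { case (_, a, c) => a -> c }.toMap }

// A map of lists, with each list encoded as (index, element) pairs.
val mapOfLists = Map('a' -> List(1, 2, 3), 'z' -> List(4, 5, 6))
val asPairs = mapOfLists.toSeq.map { case (k, xs) => k -> xs.zipWithIndex.map(_.swap) }
val transposed = transposePairs(asPairs)
```

Unlike the rectangular assumption in the question, this version simply omits missing cells instead of failing on them.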

Related

Inverse a nested Map in Scala

I have a Map of type Map[A, Map[B, C]].
How can I inverse it to have a Map of type Map[B, Map[A, C]]?
There are lots of ways you could define this operation. I'll walk through a couple of the ones that I find the clearest. For the first implementation I'll start with a helper method:
def flattenNestedMap[A, B, C](nested: Map[A, Map[B, C]]): Map[(A, B), C] =
  for {
    (a, innerMap) <- nested
    (b, c) <- innerMap
  } yield (a, b) -> c
This flattens the nested map to a map from pairs to values. Next we can define another helper operation that gets us almost what we need.
def groupByBs[A, B, C](flattened: Map[(A, B), C]): Map[B, Map[(A, B), C]] =
  flattened.groupBy(_._1._2)
Now we just need to remove the redundant B from the keys in the inner map:
def invert[A, B, C](nested: Map[A, Map[B, C]]): Map[B, Map[A, C]] =
  groupByBs(flattenNestedMap(nested)).mapValues(
    _.map {
      case ((a, _), c) => a -> c
    }
  )
(Note that mapValues is lazy, which means that the result will be recomputed every time you use it. In general this isn't a problem, and there are easy workarounds, but they're not really relevant to the question.)
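One possible workaround, sketched here as an assumption rather than as the author's intended fix, is to use a plain map over the key-value pairs, which builds a strict Map whose values are computed exactly once. The counter below exists only to make that visible.

```scala
var calls = 0 // counts how many times the value transformation runs
val base = Map(1 -> 10, 2 -> 20)

// map over (key, value) pairs builds a new strict Map immediately.
val strict = base.map { case (k, v) =>
  calls += 1
  k -> (v + 1)
}

// Repeated lookups do not re-run the transformation.
val first = strict(1)
val again = strict(1)
```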
And we're done:
scala> invert(Map(1 -> Map(2 -> 3), 10 -> Map(2 -> 4)))
res0: Map[Int,Map[Int,Int]] = Map(2 -> Map(1 -> 3, 10 -> 4))
You could also skip the helper methods and just chain the operations in invert. I find breaking them up a little clearer, but that's a matter of style.
Alternatively you could use a couple of folds:
def invert[A, B, C](nested: Map[A, Map[B, C]]): Map[B, Map[A, C]] =
  nested.foldLeft(Map.empty[B, Map[A, C]]) {
    case (acc, (a, innerMap)) =>
      innerMap.foldLeft(acc) {
        case (innerAcc, (b, c)) =>
          // explicit type arguments so the default does not widen the inner map's type
          innerAcc.updated(b, innerAcc.getOrElse(b, Map.empty[A, C]).updated(a, c))
      }
  }
Which does the same thing:
scala> invert(Map(1 -> Map(2 -> 3), 10 -> Map(2 -> 4)))
res1: Map[Int,Map[Int,Int]] = Map(2 -> Map(1 -> 3, 10 -> 4))
The foldLeft version has more of the shape of the straightforward imperative version—we're (functionally) iterating through the key-value pairs of the outer and inner maps and building up the result. Off the top of my head I'd guess it's also a little more efficient, but I'm not sure about that, and it's unlikely to matter much, so I'd suggest choosing the one you personally find clearer.
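You can get part of the way using the map operation on the given Map. Note that the result is an Iterable of single-entry maps rather than one merged Map[B, Map[A, C]]: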
scala> Map("A" -> Map("B" -> "C"), "X" -> Map("Y" -> "Z"))
res1: scala.collection.immutable.Map[String,scala.collection.immutable.Map[String,String]] = Map(A -> Map(B -> C), X -> Map(Y -> Z))
scala> res1.map{ case (key, valueMap) => valueMap.map{ case (vmKey, vmValue) => (vmKey -> Map(key -> vmValue)) } }
res2: scala.collection.immutable.Iterable[scala.collection.immutable.Map[String,scala.collection.immutable.Map[String,String]]] = List(Map(B -> Map(A -> C)), Map(Y -> Map(X -> Z)))
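Since the result above is a collection of partial maps, one more step is needed when the same inner key occurs under several outer keys. A minimal sketch of that merge, using made-up sample data and a hypothetical name merged:

```scala
val nested = Map("A" -> Map("B" -> "C"), "X" -> Map("B" -> "Z", "Y" -> "W"))

// Same shape as the snippet above: one small inverted map per outer entry.
val pieces: Iterable[Map[String, Map[String, String]]] =
  nested.map { case (key, valueMap) =>
    valueMap.map { case (vmKey, vmValue) => vmKey -> Map(key -> vmValue) }
  }

// Fold the pieces together, combining inner maps that share a key.
val merged = pieces.foldLeft(Map.empty[String, Map[String, String]]) { (acc, piece) =>
  piece.foldLeft(acc) { case (inner, (b, ac)) =>
    inner.updated(b, inner.getOrElse(b, Map.empty[String, String]) ++ ac)
  }
}
```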

Scala troubles with sorting

I am still learning Scala and have run into a problem that I would like to solve.
What I have at the moment is a Seq of items of type X. Now I want to write a function that returns a map from each count to the set of items that appear in the original seq that many times.
Here is small example what I want to do:
val exampleSeq = Seq('a', 'b', 'd', 'd', 'c', 'b', 'd')
val exampleSeq2 = Seq('a', 'a', 'a', 'c', 'c', 'b', 'b', 'c')
myMagicalFunction(exampleSeq) // returns Map(1 -> Set(a, c), 2 -> Set(b), 3 -> Set(d))
myMagicalFunction(exampleSeq2) // returns Map(2 -> Set(b), 3 -> Set(a, c))
So far I have been able to create a function that maps the item with the times it appears:
def function[X](seq: Seq[X]) = seq.groupBy(item => item).mapValues(_.size)
Return for my exampleSeq from that one is
Map(a -> 1, b -> 2, c -> 1, d -> 3)
Thank you for answers :)
One approach, for
val a = Seq('a', 'b', 'd', 'd', 'c', 'b', 'd')
this
val b = for ( (k,v) <- a.groupBy(identity).mapValues(_.size).toArray )
yield (v,k)
delivers
Array((2,b), (3,d), (1,a), (1,c))
and so
b.groupBy(_._1).mapValues(_.map(_._2).toSet)
res: Map(2 -> Set(b), 1 -> Set(a, c), 3 -> Set(d))
Note seq.groupBy(item => item) is equivalent to seq.groupBy(identity).
You are almost there! Starting from the collection element -> count, you only need one more transformation to get to count -> Col[elem].
Let's say that freqItem = Map(a -> 1, b -> 2, c -> 1, d -> 3); you would do something like:
val freqSet = freqItem.toSeq.map(_.swap).groupBy(_._1).mapValues(_.map(_._2).toSet)
Note that we transform the Map into a Seq before swapping (k,v) into (v,k), because mapping over a Map preserves key uniqueness and you'd otherwise lose one of (1 -> a), (1 -> c).
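Putting the two steps together, a complete function might look like the sketch below. The name byFrequency is hypothetical, and plain map is used instead of mapValues so that the result is a strict Map. The last line illustrates the key-collision problem mentioned above.

```scala
// Hypothetical name; maps each occurrence count to the set of items with that count.
def byFrequency[X](seq: Seq[X]): Map[Int, Set[X]] =
  seq.groupBy(identity)
    .toSeq                                              // leave Map semantics before swapping
    .map { case (item, occurrences) => occurrences.size -> item }
    .groupBy(_._1)
    .map { case (count, pairs) => count -> pairs.map(_._2).toSet }

val result = byFrequency(Seq('a', 'b', 'd', 'd', 'c', 'b', 'd'))

// Swapping directly on a Map collapses entries that share the new key.
val collapsed = Map('a' -> 1, 'c' -> 1).map(_.swap)
```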
You can write your function as :
def f[T](l: Seq[T]): Map[Int, Set[T]] = {
l.map {
x => (x, l.count(_ == x))
}.distinct.groupBy(_._2).mapValues(_.map(_._1).toSet)
}
val l = List("a","a","a","b","b","b","b","c","c","d","e")
f(l)
res0: Map[Int,Set[String]] = Map(2 -> Set(c), 4 -> Set(b), 1 -> Set(d, e), 3 -> Set(a))
scala> case class A(name:String,age:Int)
defined class A
scala> val l = List(new A("a",1),new A("b",2),new A("a",1),new A("c",1) )
l: List[A] = List(A(a,1), A(b,2), A(a,1), A(c,1))
scala> f[A](l)
res1: Map[Int,Set[A]] = Map(2 -> Set(A(a,1)), 1 -> Set(A(b,2), A(c,1)))

Flattening a map of sets

I am trying to flatten a map where the keys are traversables, in the sense that:
Map( Set(1, 2, 3) -> 'A', Set(4, 5, 6) -> 'B')
should flatten to:
Map(5 -> B, 1 -> A, 6 -> B, 2 -> A, 3 -> A, 4 -> B)
Here is what I did:
def fuse[A, B, T <: Traversable[A]](mapOfTravs: Map[T, B]): Map[A, B] = {
val pairs = for {
trav <- mapOfTravs.keys
key <- trav
} yield (key, mapOfTravs(trav))
pairs.toMap
}
It works. But:
Is there a simpler way to do this?
I'm not very comfortable with the Scala type system and I'm sure this can be improved. I have to specify the types explicitly whenever I use my function:
val map2 = Map( Set(1, 2, 3) -> 'A', Set(4, 5, 6) -> 'B')
val fused2 = fuse[Int, Char, Set[Int]](map2)
val map1: Map[Traversable[Int], Char] = Map( Set(1, 2, 3) -> 'A', Set(4, 5, 6) -> 'B')
val fused1 = fuse[Int, Char, Traversable[Int]](map1)
P.S.: this fuse function does not make much sense when the key traversables have a non-empty intersection.
This is basically what you're doing in the for comprehension, but simplified a little bit:
def fuse[A, B, T <: Traversable[A]](mapOfTravs: Map[T, B]): Map[A, B] = {
mapOfTravs.flatMap({ case (s, c) => s.map(i => i -> c) })
}
There's not much you can do about the types. I'm sure there are some type-lambda shenanigans that would help, I'm just not sure how to do them...
UPDATE
Here's a slightly better for version, same as the above flatMap:
def fuse2[A, B, T <: Traversable[A]](mapOfTravs: Map[T, B]): Map[A, B] = {
for {
(keys, value) <- mapOfTravs
key <- keys
} yield key -> value
}
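On the question's complaint about having to spell out the type arguments: one commonly used trick, sketched here as an assumption rather than as part of the original answer, is to move the bound into an implicit <:< evidence parameter so that A is inferred after T is known. Iterable is used in place of Traversable, and the name fuseInferred is hypothetical.

```scala
// T and B are inferred from the argument; A is then solved from the evidence T <:< Iterable[A].
def fuseInferred[T, A, B](mapOfTravs: Map[T, B])(implicit ev: T <:< Iterable[A]): Map[A, B] =
  mapOfTravs.flatMap { case (keys, value) => ev(keys).map(_ -> value) }

// No explicit type arguments needed at the call site.
val fused = fuseInferred(Map(Set(1, 2, 3) -> 'A', Set(4, 5, 6) -> 'B'))
```

This is the same mechanism the standard library uses for methods such as toMap, where the element type must be shown to be a pair.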
Like @Azzie, I was thinking zip, but maybe Azzie has the advantage with those zees.
scala> val m = Map( Set(1, 2, 3) -> 'A', Set(4, 5, 6) -> 'B')
m: scala.collection.immutable.Map[scala.collection.immutable.Set[Int],Char] = Map(Set(1, 2, 3) -> A, Set(4, 5, 6) -> B)
scala> (m map { case (k, v) => k zip (Stream continually v) }).flatten.toMap
res0: scala.collection.immutable.Map[Int,Char] = Map(5 -> B, 1 -> A, 6 -> B, 2 -> A, 3 -> A, 4 -> B)
scala> (m map { case (k, v) => k zipAll (Nil, null, v) }).flatten.toMap
res1: scala.collection.immutable.Map[Any,Char] = Map(5 -> B, 1 -> A, 6 -> B, 2 -> A, 3 -> A, 4 -> B)
scala> m flatMap { case (k, v) => k zip (Stream continually v) }
res2: scala.collection.immutable.Map[Int,Char] = Map(5 -> B, 1 -> A, 6 -> B, 2 -> A, 3 -> A, 4 -> B)
It's not obvious how to generalize it nicely.
This looks horrible, and using 0 is kind of cheating, but it does the job:
m.map( {case (s,c) => s.zipAll(Set(c),0,c)} ).flatten.toMap
Since I'm apparently on a "hideously generic implicits" kick lately:
import scala.collection.MapLike
import scala.collection.TraversableLike
import scala.collection.generic.CanBuildFrom
implicit class Map_[
A,
B,
T1 : ({type L[X] = X => TraversableLike[A, T2]})#L,
T2,
M1 : ({type L[X] = X => MapLike[T1, B, M2]})#L,
M2 <: MapLike[T1, B, M2] with Map[T1, B]
](map: M1) {
def fuse[M3](implicit cbfM: CanBuildFrom[M2, (A, B), M3]) : M3 =
map.flatMap({ case (k, v) => k.toTraversable.map((_, v)) })
}
Examples:
scala> Map(Set(1, 2, 3) -> 'A', Set(4, 5, 6) -> 'B').fuse
res: scala.collection.immutable.Map[Int,Char] =
Map(5 -> B, 1 -> A, 6 -> B, 2 -> A, 3 -> A, 4 -> B)
scala> Map(Array(1, 2, 3) -> 'A', Array(4, 5, 6) -> 'B').fuse
res: scala.collection.immutable.Map[Int,Char] =
Map(5 -> B, 1 -> A, 6 -> B, 2 -> A, 3 -> A, 4 -> B)

How to map 2 maps with a function in Scala?

When a map c is a function of a single map a, I can calculate it as
val a: Map[T, U] = ...
def f(aValue: U): V = ...
val c: Map[T, V] = a.mapValues(f)
But what if c is a function of both a and b? For example, what if a, b and c are all Map[String, Int], and each value of c should equal the corresponding value of a raised to the power given by the corresponding value of b?
Something like this?
val a: Map[String, Int] = Map("a" -> 10, "b" -> 20)
val b: Map[String, Int] = Map("a" -> 2, "b" -> 3)
def f(a: Int, b: Int): Int = math.pow(a,b).toInt // math.pow returns a Double
val c = for {
(ak, av) <- a // for all key-value pairs from a
bv <- b.get(ak) // for any matching value from b
} yield (ak, f(av,bv)) // yield a new key-value pair that results from applying f
// c: scala.collection.immutable.Map[String,Int] = Map(a -> 100, b -> 8000)
Is this what you're after?
val a = Map('a -> 2, 'b -> 3)
val b = Map('a -> 4, 'b -> 5)
a.map{ case (k, aVal) => (k, aVal + b(k)) } // Map('a -> 6, 'b -> 8)
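Both answers can be folded into one reusable helper. The sketch below uses a hypothetical name, zipWithKeys, and keeps only keys present in both maps, so it avoids the exception that b(k) would throw for a key missing from b.

```scala
// Hypothetical helper: combine the values of two maps on their shared keys.
def zipWithKeys[K, A, B, C](a: Map[K, A], b: Map[K, B])(f: (A, B) => C): Map[K, C] =
  a.collect { case (k, av) if b.contains(k) => k -> f(av, b(k)) }

val bases = Map("a" -> 10, "b" -> 20, "z" -> 1) // "z" has no exponent and is dropped
val exponents = Map("a" -> 2, "b" -> 3)
val powers = zipWithKeys(bases, exponents)((x, y) => math.pow(x, y).toInt)
```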

Map a single entry of a Map

I want to achieve something like the following:
(_ : Map[K,Int]).mapKey(k, _ + 1)
And the mapKey function applies its second argument (Int => Int) only to the value stored under k. Is there something inside the standard lib? If not I bet there's something in Scalaz.
Of course I can write this function myself (m.updated(k, f(m(k)))), and it's simple to do so. But I've come across this problem several times, so maybe it's already been done?
For Scalaz I imagine something along the following code:
(m: Map[A,B]).project(k: A).map(f: B => B): Map[A,B]
You could of course add
def changeForKey[A,B](a: A, fun: B => B): Tuple2[A, B] => Tuple2[A, B] = { kv =>
kv match {
case (`a`, b) => (a, fun(b))
case x => x
}
}
val theMap = Map('a -> 1, 'b -> 2)
theMap map changeForKey('a, (_: Int) + 1)
res0: scala.collection.immutable.Map[Symbol,Int] = Map('a -> 2, 'b -> 2)
But this would circumvent any optimisation regarding memory re-use and access.
I also came up with a rather verbose and inefficient scalaz solution using a zipper for your proposed project method:
theMap.toStream.toZipper.flatMap(_.findZ(_._1 == 'a).flatMap(elem => elem.delete.map(_.insert((elem.focus._1, fun(elem.focus._2)))))).map(_.toStream.toMap)
or
(for {
z <- theMap.toStream.toZipper
elem <- z.findZ(_._1 == 'a)
z2 <- elem.delete
} yield z2.insert((elem.focus._1, fun(elem.focus._2)))).map(_.toStream.toMap)
Probably of little use. I’m just posting for reference.
Here is one way:
scala> val m = Map(2 -> 3, 5 -> 11)
m: scala.collection.immutable.Map[Int,Int] = Map(2 -> 3, 5 -> 11)
scala> m ++ (2, m.get(2).map(1 +)).sequence
res53: scala.collection.immutable.Map[Int,Int] = Map(2 -> 4, 5 -> 11)
scala> m ++ (9, m.get(9).map(1 +)).sequence
res54: scala.collection.immutable.Map[Int,Int] = Map(2 -> 3, 5 -> 11)
This works because (A, Option[B]).sequence gives Option[(A, B)]. (sequence in general turns types inside out, i.e. F[G[A]] => G[F[A]], given F : Traverse and G : Applicative.)
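To make the "inside out" step concrete without Scalaz, here is a plain-Scala sketch of the same idea for the specific case of a pair holding an Option. The names sequencePair and bump are hypothetical, and this is not the Scalaz implementation.

```scala
// (A, Option[B]) => Option[(A, B)]: the Option moves from inside the pair to outside it.
def sequencePair[A, B](pair: (A, Option[B])): Option[(A, B)] =
  pair._2.map(b => pair._1 -> b)

// Adds the updated entry only when the key exists; otherwise appends nothing.
def bump(m: Map[Int, Int], k: Int): Map[Int, Int] =
  m ++ sequencePair((k, m.get(k).map(_ + 1))).toList

val m0 = Map(2 -> 3, 5 -> 11)
val hit = bump(m0, 2)
val miss = bump(m0, 9)
```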
You can pimp it with this so that it creates a new map based on the old one:
class MapUtils[A, B](map: Map[A, B]) {
def mapValueAt(a: A)(f: (B) => B) = map.get(a) match {
case Some(b) => map + (a -> f(b))
case None => map
}
}
implicit def toMapUtils[A, B](map: Map[A, B]) = new MapUtils(map)
val m = Map(1 -> 1)
m.mapValueAt(1)(_ + 1)
// Map(1 -> 2)
m.mapValueAt(2)(_ + 1)
// Map(1 -> 1)