mapValues on innermost map of nested maps - scala

The inspiration for this question came when I tried to answer this one.
Say you have a sequence of data (maybe from a CSV file, for instance). groupBy can be used to analyze certain aspects of the data, grouping by a column or a combination of columns. For instance:
val groups0: Map[String, Array[String]] =
seq.groupBy(row => row(0) + "-" + row(4))
If I then want to create sub-groups within the groups I can do
val groups1: Map[String, Map[String, Array[String]]] =
groups0.mapValues(row => row.groupBy(_(1)))
If I want to do this more than once, it gets really cumbersome:
val groups2 =
groups1.mapValues(groups => groups.mapValues(row => row.groupBy(_(2))))
So here is my question: given an arbitrary nesting of Map[K0, Map[K1, ..., Map[Kn, V]]], how do you write a mapValues function that takes an f: (V) => B and applies it to the innermost V to return a Map[K0, Map[K1, ..., Map[Kn, B]]]?
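To make the problem concrete, here is a minimal sketch of the manual approach on made-up data. The sample rows, the choice of grouping columns, and the name NestedGroupingSketch are assumptions for illustration only. It uses a strict map over key/value pairs rather than mapValues so that it behaves the same across Scala versions.

```scala
object NestedGroupingSketch {
  // Hypothetical sample rows standing in for parsed CSV lines.
  val rows: Seq[Array[String]] = Seq(
    Array("eu", "fr", "paris"),
    Array("eu", "fr", "lyon"),
    Array("eu", "de", "berlin"),
    Array("us", "ny", "albany")
  )

  // Level 0: group the rows by column 0.
  val groups0: Map[String, Seq[Array[String]]] =
    rows.groupBy(row => row(0))

  // Level 1: sub-group each group by column 1.
  val groups1: Map[String, Map[String, Seq[Array[String]]]] =
    groups0.map { case (k0, rs) => k0 -> rs.groupBy(row => row(1)) }

  // Level 2: every extra level needs one more hand-written nested traversal.
  val groups2: Map[String, Map[String, Map[String, Seq[Array[String]]]]] =
    groups1.map { case (k0, inner) =>
      k0 -> inner.map { case (k1, rs) => k1 -> rs.groupBy(row => row(2)) }
    }
}
```

Each additional level repeats the same shape one layer deeper, which is the boilerplate the question asks to abstract away.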

My first instinct said that handling arbitrary nesting in a type-safe way would be impossible, but it seems that it IS possible if you define a few implicits that tell the compiler how to do it.
Essentially, the "simple" mapper tells it how to handle the plain non-nested case, while "wrappedMapper" tells it how to drill down through one Map layer:
// trait to tell us how to map inside of a container.
trait CanMapInner[WrappedV, WrappedB,V,B] {
def mapInner(in: WrappedV, f: V => B): WrappedB
}
// simple base case (no nesting involved).
implicit def getSimpleMapper[V,B] = new CanMapInner[V,B,V,B] {
def mapInner(in: V, f: (V) => B): B = f(in)
}
// drill down one level of "Map".
implicit def wrappedMapper[K,V,B,InnerV,InnerB]
(implicit innerMapper: CanMapInner[InnerV,InnerB,V,B]) =
new CanMapInner[Map[K,InnerV], Map[K,InnerB],V,B] {
def mapInner(in: Map[K, InnerV], f: (V) => B): Map[K, InnerB] =
in.mapValues(innerMapper.mapInner(_, f))
}
// the actual implementation.
def deepMapValues[K,V,B,WrappedV,WrappedB](map: Map[K,WrappedV], f: V => B)
(implicit mapper: CanMapInner[WrappedV,WrappedB,V,B]) = {
map.mapValues(inner => mapper.mapInner(inner, f))
}
// testing with a simple map
{
val initMap = Map(1 -> "Hello", 2 -> "Goodbye")
val newMap = deepMapValues(initMap, (s: String) => s.length)
println(newMap) // Map(1 -> 5, 2 -> 7)
}
// testing with a nested map
{
val initMap = Map(1 -> Map("Hi" -> "Hello"), 2 -> Map("Bye" -> "Goodbye"))
val newMap = deepMapValues(initMap, (s: String) => s.length)
println(newMap) // Map(1 -> Map(Hi -> 5), 2 -> Map(Bye -> 7))
}
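On Scala 2.13 and later, Map.mapValues is deprecated and returns a lazy MapView rather than a Map, so the code above would likely need adjusting there. The following is a sketch of the same idea using a strict map over key/value pairs, with explicit result types on the implicits. The names DeepMapSketch, simpleMapper, flat and nested are illustrative, and placing the implicits in the companion object is a choice made here, not part of the original answer.

```scala
object DeepMapSketch {
  // Evidence that a WrappedV can be turned into a WrappedB by applying
  // f: V => B to its innermost values.
  trait CanMapInner[WrappedV, WrappedB, V, B] {
    def mapInner(in: WrappedV, f: V => B): WrappedB
  }

  object CanMapInner {
    // Base case: no nesting, just apply f.
    implicit def simpleMapper[V, B]: CanMapInner[V, B, V, B] =
      new CanMapInner[V, B, V, B] {
        def mapInner(in: V, f: V => B): B = f(in)
      }

    // Recursive case: drill down through one Map layer.
    implicit def wrappedMapper[K, V, B, InnerV, InnerB](
        implicit inner: CanMapInner[InnerV, InnerB, V, B]
    ): CanMapInner[Map[K, InnerV], Map[K, InnerB], V, B] =
      new CanMapInner[Map[K, InnerV], Map[K, InnerB], V, B] {
        def mapInner(in: Map[K, InnerV], f: V => B): Map[K, InnerB] =
          in.map { case (k, v) => k -> inner.mapInner(v, f) }
      }
  }

  // Strict variant of deepMapValues: builds a new Map eagerly.
  def deepMapValues[K, V, B, WrappedV, WrappedB](map: Map[K, WrappedV], f: V => B)(
      implicit mapper: CanMapInner[WrappedV, WrappedB, V, B]
  ): Map[K, WrappedB] =
    map.map { case (k, v) => k -> mapper.mapInner(v, f) }

  // Same test data as in the answer above.
  val flat = deepMapValues(Map(1 -> "Hello", 2 -> "Goodbye"), (s: String) => s.length)
  val nested = deepMapValues(
    Map(1 -> Map("Hi" -> "Hello"), 2 -> Map("Bye" -> "Goodbye")),
    (s: String) => s.length
  )
}
```

As in the original, the function's parameter type has to be written out (s: String) so the compiler knows which level counts as the innermost V.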
Of course, in real code the pattern-matching dynamic solution is awfully tempting thanks to its simplicity. Type-safety isn't everything :)

I'm sure there is a better way using Manifest, but pattern matching seems to distinguish Seq and Map, so here it is:
object Foo {
def mapValues[A <: Map[_, _], C, D](map: A)(f: C => D): Map[_, _] = map.mapValues {
case seq: Seq[C] => seq.groupBy(f)
case innerMap: Map[_, _] => mapValues(innerMap)(f)
}
}
scala> val group0 = List("fooo", "bar", "foo") groupBy (_(0))
group0: scala.collection.immutable.Map[Char,List[java.lang.String]] = Map((f,List(fooo, foo)), (b,List(bar)))
scala> val group1 = Foo.mapValues(group0)((x: String) => x(1))
group1: scala.collection.immutable.Map[_, Any] = Map((f,Map(o -> List(fooo, foo))), (b,Map(a -> List(bar))))
scala> val group2 = Foo.mapValues(group1)((x: String) => x(2))
group2: scala.collection.immutable.Map[_, Any] = Map((f,Map(o -> Map(o -> List(fooo, foo)))), (b,Map(a -> Map(r -> List(bar)))))
Edit:
Here's a typed version using a higher-kinded type member.
trait NestedMapValue[Z] {
type Next[X] <: NestedMapValue[Z]
def nextValues[D](f: Z => D): Next[D]
}
trait NestedMap[Z, A, B <: NestedMapValue[Z]] extends NestedMapValue[Z] { self =>
type Next[D] = NestedMap[Z, A, B#Next[D]]
val map: Map[A, B]
def nextValues[D](f: Z => D): Next[D] = self.mapValues(f)
def mapValues[D](f: Z => D): NestedMap[Z, A, B#Next[D]] = new NestedMap[Z, A, B#Next[D]] { val map = self.map.mapValues {
case x: B => x.nextValues[D](f)
}}
override def toString = "NestedMap(%s)" format (map.toString)
}
trait Bottom[A] extends NestedMapValue[A] {
type Next[D] = NestedMap[A, D, Bottom[A]]
val seq: Seq[A]
def nextValues[D](f: A => D): Next[D] = seq match {
case seq: Seq[A] => groupBy[D](f)
}
def groupBy[D](f: A => D): Next[D] = seq match {
case seq: Seq[A] =>
new NestedMap[A, D, Bottom[A]] { val map = seq.groupBy(f).map { case (key, value) => (key, new Bottom[A] { val seq = value })} }
}
override def toString = "Bottom(%s)" format (seq.toString)
}
object Bottom {
def apply[A](aSeq: Seq[A]) = new Bottom[A] { val seq = aSeq }
}
scala> val group0 = Bottom(List("fooo", "bar", "foo")).groupBy(x => x(0))
group0: NestedMap[java.lang.String,Char,Bottom[java.lang.String]] = NestedMap(Map(f -> Bottom(List(fooo, foo)), b -> Bottom(List(bar))))
scala> val group1 = group0.mapValues(x => x(1))
group1: NestedMap[java.lang.String,Char,Bottom[java.lang.String]#Next[Char]] = NestedMap(Map(f -> NestedMap(Map(o -> Bottom(List(fooo, foo)))), b -> NestedMap(Map(a -> Bottom(List(bar))))))
scala> val group2 = group1.mapValues(x => x.size)
group2: NestedMap[java.lang.String,Char,Bottom[java.lang.String]#Next[Char]#Next[Int]] = NestedMap(Map(f -> NestedMap(Map(o -> NestedMap(Map(4 -> Bottom(List(fooo)), 3 -> Bottom(List(foo)))))), b -> NestedMap(Map(a -> NestedMap(Map(3 -> Bottom(List(bar))))))))

Related

Using lambda with generic type in Scala

The second argument of myFunc is a function with complex arguments:
def myFunc(list : List[String],
combine: (Map[String, ListBuffer[String]], String, String) => Unit) = {
// body of myFunc is just a stub and doesn't matter
val x = Map[String, ListBuffer[String]]()
list.foreach ((e:String) => {
val spl = e.split(" ")
combine(x, spl(0), spl(1))
})
x
}
I need to pass second argument to myFunc, so it can be used with various types A, B instead of specific String, ListBuffer[String].
def myFunc(list : List[A], combine: (Map[A, B], A, A) => Unit) = {
val x = Map[A, B]()
list.foreach(e => {
combine(x, e)
})
}
How to declare and call such construct?
You can do the following,
def myFunc[A, B](list : List[A], combine: (Map[A, B], A, A) => Unit) = {
val x = Map[A, B]()
list.foreach (e => combine(x, e, e))
x
}
And use it like
myFunc[String, Int](List("1","2","3"), (obj, k, v) => obj.put(k, v.toInt) )
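Note that put is only available on a mutable map, so the snippet above presumably relies on scala.collection.mutable.Map being in scope (the question's use of ListBuffer suggests mutable collections). A self-contained sketch under that assumption, with MyFuncSketch and parsed as illustrative names:

```scala
import scala.collection.mutable

object MyFuncSketch {
  // Generic in both the element type A and the stored value type B.
  def myFunc[A, B](
      list: List[A],
      combine: (mutable.Map[A, B], A, A) => Unit
  ): mutable.Map[A, B] = {
    val acc = mutable.Map.empty[A, B]
    // As in the answer above, each element is passed as both key and value.
    list.foreach(e => combine(acc, e, e))
    acc
  }

  // Usage: store each string under itself, parsed as an Int.
  val parsed: mutable.Map[String, Int] =
    myFunc[String, Int](List("1", "2", "3"), (m, k, v) => m(k) = v.toInt)
}
```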
It seems that you are looking to generalise the container being used. Were you looking for something like this? Here we import scala.language.higherKinds so that addPair can take Container, a type constructor with a single type parameter, as one of its own type parameters.
import scala.language.higherKinds
def addPair[K, V, Container[_]](map: Map[K, Container[V]],
addToContainer: (Container[V], V) => Container[V],
emptyContainer: => Container[V],
pair: (K, V)): Map[K, Container[V]] = {
val (key, value) = pair
val existingValues = map.getOrElse(key, emptyContainer)
val newValues = addToContainer(existingValues, value)
map + (key -> newValues)
}
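A possible way to call addPair, sketched with Vector as the container. The helper groupPairs and the "key value" input format are assumptions borrowed from the question's stub, not part of the answer. The import of scala.language.higherKinds is left out here because it is no longer required on Scala 2.13 and later; on older versions its absence only produces a feature warning.

```scala
object AddPairSketch {
  def addPair[K, V, Container[_]](
      map: Map[K, Container[V]],
      addToContainer: (Container[V], V) => Container[V],
      emptyContainer: => Container[V],
      pair: (K, V)
  ): Map[K, Container[V]] = {
    val (key, value) = pair
    val existingValues = map.getOrElse(key, emptyContainer)
    map + (key -> addToContainer(existingValues, value))
  }

  // Hypothetical helper: fold "key value" strings into a Map of Vectors.
  def groupPairs(lines: List[String]): Map[String, Vector[String]] =
    lines.foldLeft(Map.empty[String, Vector[String]]) { (acc, line) =>
      val parts = line.split(" ")
      addPair[String, String, Vector](acc, _ :+ _, Vector.empty, parts(0) -> parts(1))
    }
}
```

Because the map is immutable, each call returns a new map, so the accumulation is done with foldLeft instead of a side-effecting foreach.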

Defining a scala Map of functions with variable return types

Is it possible to generalize the return types of a Map of functions with variable return types to a common signature and then use the actual return type of each function in the Map at runtime?
Explanation:
I have a Scala Map of String -> functions defined as:
Map[String, (String) => Seq[Any]] = Map("1" -> foo, "2" -> bar, "3" -> baz)
where foo, bar and baz are defined as:
foo(string: String): Seq[A]
bar(string: String): Seq[B]
baz(string: String): Seq[C]
The compilation works fine but at runtime Seq[A or B or C] types returned by the functions are being treated as Seq[Any] thereby giving me a reflection exception.
Let's imagine a Map-like workaround.
Suppose we define a type like this:
import scala.reflect.runtime.universe._
class PolyMap[K, V[+_]] {
var backing = Map[(K, TypeTag[_]), V[Any]]()
def apply[T: TypeTag](key: K) =
backing.get(key, typeTag[T]).asInstanceOf[Option[V[T]]]
def update[T: TypeTag](key: K, f: V[T]): this.type = {
backing += (key, typeTag[T]) → f
this
}
}
now having
type String2Seq[+X] = String ⇒ Seq[X]
val polyMap = new PolyMap[String, String2Seq]
polyMap("foo") = foo
polyMap("bar") = bar
you could ensure that
polyMap[String]("foo").map(_("x")) == Some(foo("x"))
polyMap[Int]("foo").map(_("x")) == None
polyMap[Int]("bar").map(_("x")) == Some(bar("x"))
I think you can try this variant
def foo(string: String): Seq[String] = {
Seq(string)
}
def bar(string: String): Seq[Int] = {
Seq(1)
}
val map = Map(1 -> foo _, 2 -> bar _)
val res = map(1) match {
case f: (String => Seq[String]) => f
case f: (String => Seq[Int]) => f
case _ => throw new NotImplementedError()
}
println(res("Hello"))
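It works for me, although the compiler warns that both patterns are unchecked: because of type erasure the first case matches any function stored in the map, so the match effectively acts as a cast.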

Pretty print a nested Map in scala

Suppose I have a nested Map in Scala as below:
type MapType = Map[String, Map[String, Map[String, (String, String)]]]
val m: MapType = Map("Alphabet" -> Map( "Big Boss" -> Map("Clandestine Mssion" -> ("Dracula Returns", "Enemy at the Gates"))))
println(m)
This would output the Map as shown below:
Map(Alphabet -> Map(Big Boss -> Map(Clandestine Mssion -> (Dracula Returns,Enemy at the Gates))))
How can I print it like this instead?
Map(
Alphabet -> Map(Big Boss -> Map(
Clandestine Mssion -> (Dracula Returns,Enemy at the Gates)
)
)
)
Or in a way that is similar to pretty nested JSON.
This should do the trick
object App {
def main(args : Array[String]) {
type MapType = Map[String, Map[String, Map[String, (String, String)]]]
val m: MapType = Map("Alphabet" -> Map( "Big Boss" -> Map("Clandestine Mssion" -> ("Dracula Returns", "Enemy at the Gates"))))
println(m.prettyPrint)
}
implicit class PrettyPrintMap[K, V](val map: Map[K, V]) {
def prettyPrint: PrettyPrintMap[K, V] = this
override def toString: String = {
val valuesString = toStringLines.mkString("\n")
"Map (\n" + valuesString + "\n)"
}
def toStringLines = {
map
.flatMap{ case (k, v) => keyValueToString(k, v)}
.map(indentLine(_))
}
def keyValueToString(key: K, value: V): Iterable[String] = {
value match {
case v: Map[_, _] => Iterable(key + " -> Map (") ++ v.prettyPrint.toStringLines ++ Iterable(")")
case x => Iterable(key + " -> " + x.toString)
}
}
def indentLine(line: String): String = {
"\t" + line
}
}
}
The output is
Map (
Alphabet -> Map (
Big Boss -> Map (
Clandestine Mssion -> (Dracula Returns,Enemy at the Gates)
)
)
)
Here's an approach using type classes that avoids casting and toString.
trait Show[A] {
def apply(a: A): String
}
object Show {
def apply[A](f: A => String): Show[A] =
new Show[A] {
def apply(a: A): String = f(a)
}
}
implicit def showString: Show[String] = Show(identity)
implicit def show2[A, B](implicit sA: Show[A], sB: Show[B]): Show[(A, B)] =
Show({ case (a, b) => s"(${sA(a)}, ${sB(b)})" })
implicit def showMap[A, B](implicit sA: Show[A], sB: Show[B]): Show[Map[A, B]] =
Show(m => s"Map(\n${
m.map({
case (a, b) => s" ${sA(a)} -> ${sB(b).replace("\n", "\n ")}"
}).mkString(",\n")
}\n)")
def show[A](a: A)(implicit s: Show[A]): String = s(a)
val m = Map("Alphabet" -> Map("Big Boss" -> Map("Clandestine Mission" ->
("Dracula Returns", "Enemy at the Gates"))))
show(m)
Result:
Map(
Alphabet -> Map(
Big Boss -> Map(
Clandestine Mission -> (Dracula Returns, Enemy at the Gates)
)
)
)

How can I combine a tuple of values with a tuple of functions?

I have scalaZ available.
I have an (A, B) and a (A => C, B => D), I'd like to get a (C, D) in a simple and readable way.
I feel like there's something I can do with applicatives but I can't find the right methods.
Edit
I didn't realize at first that the OP has a tuple of functions. In that case, as suggested in the comments, this should work:
val in = ("1", 2)
val fnT = ((s: String) => s.toInt, (i: Int) => i.toString)
val out = (in.bimap[Int, String] _).tupled(fnT)
Old
If you have two functions and want to apply them on tuple, you should be able to do:
import scalaz._
import Scalaz._
val in = ("1", 2)
val sToi = (s: String) => s.toInt
val iTos = (i: Int) => i.toString
val out = sToi <-: in :-> iTos
// or
val out1 = in.bimap(sToi, iTos)
// or
val out2 = (sToi *** iTos)(in)
Arrows? Something like:
(f *** g)(a, b)
http://eed3si9n.com/learning-scalaz/Arrow.html
I don't find scalaz more readable. What's wrong with defining your own function?
def biFunc[A, B, C, D](valTup: (A, B), funTup: (A => C, B => D)): (C, D) = (funTup._1(valTup._1), funTup._2(valTup._2))
I agree with Lionel Port, but you could make it more readable via:
case class BiFun[A,B,C,D](f1:A=>C, f2: B=>D){
def applyTo(a: (A,B)) = (f1(a._1), f2(a._2))
}
object BiFun{
implicit def toBiFun[A, B, C, D](a: (A => C, B => D)): BiFun[A, B, C, D] = BiFun(a._1, a._2)
}
used like:
import BiFun._
val ab = (A(1), B(2))
val ac = (x: A) => C(x.i+2)
val bd = (x: B) => D(x.i+2)
val bifn = (ac, bd)
bifn applyTo ab
So, in the end, you end up with funTuple applyTo tuple and gain your top-level readability.
Writing this method yourself might be the best bet:
def bimap[A,B,C,D](vals:(A, B), funcs:(A=>C, B=>D)):(C,D) = {
val ((func1, func2), (val1, val2)) = funcs -> vals
func1(val1) -> func2(val2)
}
And if you're doing this a lot, you might even enhance the tuple class:
implicit class EnhancedTuple2[A, B](val vals: (A, B)) extends AnyVal {
def bimap[C, D](funcs: (A=>C, B=>D)) = {
val ((func1, func2), (val1, val2)) = funcs -> vals
func1(val1) -> func2(val2)
}
}
So that you can do:
val func1: Int => Int = x => x * x
val func2: Int => String = x => x.toString
val tupledFuncs = func1 -> func2
(1, 2).bimap(tupledFuncs)

Generating change set between two maps

What is the best/cleanest/most efficient way to detect changes between two Map instances? For example:
val before = Map(1 -> "foo", 2 -> "bar", 3 -> "baz")
val after = Map(1 -> "baz", 2 -> "bar", 4 -> "boo")
// not pretty:
val removed = before.keySet diff after.keySet
val added = after.filterNot { case (key, _) => before contains key }
val changed = (before.keySet intersect after.keySet).flatMap { key =>
val a = before(key)
val b = after (key)
if (a == b) None else Some(key -> (a, b))
}
Here is an idea. It probably takes O(N * log N) with N = max(before.size, after.size):
sealed trait Change[+K, +V]
case class Removed[K ](key: K) extends Change[K, Nothing]
case class Added [K, V](key: K, value : V) extends Change[K, V]
case class Updated[K, V](key: K, before: V, after: V) extends Change[K, V]
def changes[K, V](before: Map[K, V], after: Map[K, V]): Iterable[Change[K, V]] ={
val b = Iterable.newBuilder[Change[K, V]]
before.foreach { case (k, vb) =>
after.get(k) match {
case Some(va) if vb != va => b += Updated(k, vb, va)
case None => b += Removed(k)
case _ =>
}
}
after.foreach { case (k, va) =>
if (!before.contains(k)) b += Added(k, va)
}
b.result()
}
changes(before, after).foreach(println)
// Updated(1,foo,baz)
// Removed(3)
// Added(4,boo)
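If the three separate collections from the question (removed, added, changed) are preferred over a stream of Change events, a sketch along these lines may do. MapDiff, diff and MapDiffSketch are hypothetical names introduced here, not part of the answer above.

```scala
object MapDiffSketch {
  // Result holder mirroring the three collections from the question.
  final case class MapDiff[K, V](
      removed: Set[K],
      added: Map[K, V],
      changed: Map[K, (V, V)]
  )

  def diff[K, V](before: Map[K, V], after: Map[K, V]): MapDiff[K, V] = {
    // Keys present before but missing afterwards.
    val removed = before.keySet.filterNot(k => after.contains(k))
    // Entries whose key did not exist before.
    val added = after.filter { case (k, _) => !before.contains(k) }
    // Keys present in both maps whose value differs: key -> (old, new).
    val changed = before.collect {
      case (k, vb) if after.get(k).exists(_ != vb) => (k, (vb, after(k)))
    }
    MapDiff(removed, added, changed)
  }

  // Same sample data as in the question.
  val before = Map(1 -> "foo", 2 -> "bar", 3 -> "baz")
  val after = Map(1 -> "baz", 2 -> "bar", 4 -> "boo")
  val result = diff(before, after)
}
```

This makes one pass over each map with a lookup per entry in the other, so the overall cost depends on the lookup cost of the Map implementation in use.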