I have a map like this:
val dummy = Map("1992" -> List("1", "2"), "1993" -> List("4", "5"))
I am trying to merge the lists as I go along from left to right in the map. So, my desired result would look like:
val result = Map("1992" -> List("1", "2"), "1993" -> List("1", "2", "4", "5"))
I tried to use scanLeft to achieve this, but I can't get the code to run correctly:
def mergeNodes(tuple1: (String, List[String]), tuple2: (String, List[String])): (String, List[String]) =
(tuple2._1, tuple1._2 ++ tuple2._2)
val dummy = Map("1992" -> List("1", "2"), "1993" -> List("4", "5"))
val res = dummy.scanLeft[(String, List[String])](("", List[String].empty))(mergeNodes)
But it gives the following error:
missing argument list for method mergeNodes
What am I doing wrong here?
I believe the pattern-matching approach to be a bit more legible:
val m = Map("1992" -> List("1", "2"), "1993" -> List("4", "5"))
val res = m
.scanLeft("not-a-key" -> List.empty[String]){
case ((_, acc), (k, v)) => (k, acc ++ v)
}
.tail
.toMap
In each step, you drop the previous key, take the new key, and combine the accumulated entries with the new value.
Produces:
Map(1992 -> List(1, 2), 1993 -> List(1, 2, 4, 5))
I'm not too convinced about the order, though: a Map does not guarantee any particular iteration order, and a Map with only two entries is too small to show it being scrambled. It may be safer to convert it to a sorted list first.
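The sorted-list idea could look roughly like the sketch below. It assumes the keys should be processed in ascending string order and that the result should keep that order, which is why it ends in a ListMap; both choices are assumptions, not something the question states.

```scala
import scala.collection.immutable.ListMap

// Deliberately inserted "out of order" to show that the sort decides the order.
val m = Map("1993" -> List("4", "5"), "1992" -> List("1", "2"))

// Fix the traversal order explicitly by sorting on the key.
val sorted: List[(String, List[String])] = m.toList.sortBy(_._1)

// Same scanLeft as above: drop the old key, keep the new one, accumulate the values.
val cumulative: List[(String, List[String])] =
  sorted
    .scanLeft("not-a-key" -> List.empty[String]) {
      case ((_, acc), (k, v)) => (k, acc ++ v)
    }
    .tail // discard the dummy start element

// ListMap keeps the insertion order of the pairs.
val res: ListMap[String, List[String]] = ListMap(cumulative: _*)
```

A plain `.toMap` would also work if the order of the final result does not matter; only the order in which the lists are accumulated affects the values.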
I have a Map[String, List[String]] and I want to invert it. For example, if I have something like
"1" -> List("a","b","c")
"2" -> List("a","j","k")
"3" -> List("a","c")
The result should be
"a" -> List("1","2","3")
"b" -> List("1")
"c" -> List("1","3")
"j" -> List("2")
"k" -> List("2")
I've tried this:
m.map(_.swap)
But it returns a Map[List[String], String]:
List("a","b","c") -> "1"
List("a","j","k") -> "2"
List("a","c") -> "3"
Map inversion is a little more complicated.
val m = Map("1" -> List("a","b","c")
,"2" -> List("a","j","k")
,"3" -> List("a","c"))
m flatten {case(k, vs) => vs.map((_, k))} groupBy (_._1) mapValues {_.map(_._2)}
//res0: Map[String,Iterable[String]] = Map(j -> List(2), a -> List(1, 2, 3), b -> List(1), c -> List(1, 3), k -> List(2))
Flatten the Map into a collection of tuples. groupBy will create a new Map with the old values as the new keys. Then un-tuple the values by removing the key (previously value) elements.
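The three steps can also be spelled out with named intermediate values. This is only a sketch; the names `pairs`, `grouped` and `inverted` are made up for illustration, and it uses `map` instead of `mapValues` because `mapValues` produces a lazy view in newer Scala versions.

```scala
val m = Map(
  "1" -> List("a", "b", "c"),
  "2" -> List("a", "j", "k"),
  "3" -> List("a", "c")
)

// Step 1: flatten into (value, key) pairs.
val pairs: List[(String, String)] =
  m.toList.flatMap { case (k, vs) => vs.map(v => (v, k)) }

// Step 2: group the pairs by their first element (the old value).
val grouped: Map[String, List[(String, String)]] = pairs.groupBy(_._1)

// Step 3: keep only the old keys in each group.
val inverted: Map[String, List[String]] =
  grouped.map { case (v, ps) => (v, ps.map(_._2)) }
```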
An alternative that does not rely on strange implicit arguments of flatten, as requested by yishaiz:
val m = Map(
"1" -> List("a","b","c"),
"2" -> List("a","j","k"),
"3" -> List("a","c"),
)
val res = (for ((digit, chars) <- m.toList; c <- chars) yield (c, digit))
.groupBy(_._1) // group by characters
.mapValues(_.unzip._2) // drop redundant digits from lists
res foreach println
gives:
(j,List(2))
(a,List(1, 2, 3))
(b,List(1))
(c,List(1, 3))
(k,List(2))
A simple nested for-comprehension can be used to invert the map, so that each element of the value lists becomes a key in the inverted map, with the original key as its value:
implicit class MapInverter[T](map: Map[T, List[T]]) {
  def invert: Map[T, T] = {
    val result = collection.mutable.Map.empty[T, T]
    for ((key, values) <- map) {
      for (v <- values) {
        // a value listed under several keys keeps only the last key seen
        result += (v -> key)
      }
    }
    result.toMap
  }
}
Usage:
Map(10 -> List(3, 2), 20 -> List(16, 17, 18, 19)).invert
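Note that a Map[T, T] result can hold only one key per value. The sketch below shows this with a plain function instead of the implicit class; the name `invertFlat` is hypothetical and the fold is just one immutable way of writing the same loop.

```scala
// Same idea as invert, written as an immutable fold.
def invertFlat[T](m: Map[T, List[T]]): Map[T, T] =
  m.foldLeft(Map.empty[T, T]) { case (acc, (key, values)) =>
    // add value -> key for every value; an existing entry is replaced
    values.foldLeft(acc)((a, v) => a + (v -> key))
  }

// No value is shared between keys, so nothing is lost.
val simple = invertFlat(Map(10 -> List(3, 2), 20 -> List(16, 17, 18, 19)))

// The value 3 appears under both keys, so only one of them survives.
val lossy = invertFlat(Map(10 -> List(3), 20 -> List(3)))
```

If all original keys must be kept, the result type needs to be something like Map[T, List[T]], as in the groupBy-based answers.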
I am scratching my head vigorously, to understand the logic that produces the value out of a flatMap() operation:
val ys = Map("a" -> List(1 -> 11,1 -> 111), "b" -> List(2 -> 22,2 -> 222)).flatMap(e => {
| println("e =" + e)
| (e._2)
| })
e =(a,List((1,11), (1,111)))
e =(b,List((2,22), (2,222)))
ys: scala.collection.immutable.Map[Int,Int] = Map(1 -> 111, 2 -> 222)
The println clearly shows that flatMap is taking in one entry out of the input Map. So, e._2 is a List of Pairs. I can't figure out what exactly happens after that!
I am missing a very important and subtle step somewhere. Please enlighten me.
It can be thought of as:
First we map:
val a = Map("a" -> List(1 -> 11,1 -> 111), "b" -> List(2 -> 22,2 -> 222)).map(e => e._2)
// List(List((1, 11), (1, 111)), List((2, 22), (2, 222)))
Then we flatten:
val b = a.flatten
// List((1, 11), (1, 111), (2, 22), (2, 222))
Then we convert back to a map:
b.toMap
// Map(1 -> 111, 2 -> 222)
Since a map cannot have 2 values for 1 key, the value is overwritten.
What's really going on is that the flatMap is being converted into a loop like so:
for (x <- m0) b ++= f(x)
where:
m0 is our original map
b is a collection builder that has to build a Map, aka, MapBuilder
f is our function being passed into the flatMap (it returns a List[(Int, Int)])
x is an element in our original map
The ++= function takes the list we got from calling f(x), and calls += on every element, to add it to our map. For a Map, += just calls the original + operator for a Map, which updates the value if the key already exists.
Finally we call result on our builder which just returns us our Map.
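The loop can be imitated by hand with a Map builder. This is a simplified sketch of the idea, not the library's actual implementation; the helper name `flatMapViaBuilder` is made up for illustration.

```scala
// Hand-written imitation of flatMap on a Map whose function returns lists of pairs.
def flatMapViaBuilder[K, A, B](m0: Map[K, List[(A, B)]])(
    f: ((K, List[(A, B)])) => List[(A, B)]
): Map[A, B] = {
  val b = Map.newBuilder[A, B] // builder that produces an immutable Map
  for (x <- m0) b ++= f(x)     // pairs are added one by one; a repeated key is overwritten
  b.result()
}

val ys = flatMapViaBuilder(
  Map("a" -> List(1 -> 11, 1 -> 111), "b" -> List(2 -> 22, 2 -> 222))
)(_._2)
```

Within each list the later pair wins, which is why 1 ends up mapped to 111 and 2 to 222.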
How can I merge a Seq of Maps to a single Map i.e.
Seq[Map[String, String]] => Map[String, String]
For example:
val someSeq = rdd.map(_._2).flatMap(...) //some transformation to produce the sequence of maps
where someSeq is Seq(student1, student2), and student1 and student2 are Maps:
var student1 = Map(a -> "1", b -> "1")
var student2 = Map(c -> "1", d -> "1")
I need a result like this:
val apps = Map(a -> "1", b -> "1", c -> "1", d -> "1")
Any ideas?
Unrelated to Spark, but one approach would be to fold over the sequence as follows:
val student1 = Map("a" -> "1", "b" -> "1")
val student2 = Map("c" -> "1", "d" -> "1")
val students = Seq(student1, student2)
students.foldLeft(Map[String, String]())(_ ++ _)
Returns
Map(a -> 1, b -> 1, c -> 1, d -> 1)
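An equivalent way to write this, as a sketch, is to flatten the sequence into pairs and build one Map from them. Either way, when the same key occurs in more than one Map, the value from the later Map wins; the `clash` value below is a made-up example to show that.

```scala
val student1 = Map("a" -> "1", "b" -> "1")
val student2 = Map("c" -> "1", "d" -> "1")
val students = Seq(student1, student2)

// Flatten into a Seq of (key, value) pairs, then build a single Map.
val merged: Map[String, String] = students.flatten.toMap

// Duplicate keys: the value from the later Map overwrites the earlier one.
val clash: Map[String, String] =
  Seq(Map("a" -> "1"), Map("a" -> "2")).flatten.toMap

// The fold also handles an empty sequence, returning an empty Map.
val emptyMerge: Map[String, String] =
  Seq.empty[Map[String, String]].foldLeft(Map.empty[String, String])(_ ++ _)
```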
With regard to "undoing" a flatMap, I don't believe this is really possible. To see why, consider the notion of undoing a "flatten".
For example:
val x = Seq(1, 2)
val y = Seq(3, 4)
val combined = Seq(x, y)
val flattened = combined.flatten
val b = Seq(1, 2, 3)
val c = Seq(4)
val combined2 = Seq(b, c)
val flattened2 = combined2.flatten
flattened == flattened2
Returns true.
So basically, in this instance, you can go from unflattened to flattened, but not vice versa, because vice versa would yield multiple answers.
I've a Map where the key is a String and the value is an Int but represented as a String.
scala> val m = Map( "a" -> "1", "b" -> "2", "c" -> "3" )
m: scala.collection.immutable.Map[String,String] = Map(a -> 1, b -> 2, c -> 3)
Now I want to convert this into a Map[String, Int]
scala> m.mapValues(_.toInt)
res0: scala.collection.immutable.Map[String,Int] = Map(a -> 1, b -> 2, c -> 3)
As shown in Brian's answer, mapValues is the best way to do this.
You can achieve the same effect using pattern matching, which would look like this:
m.map{ case (k, v) => (k, v.toInt)}
and is useful in other situations (e.g. if you want to change the key as well).
Remember that you are pattern matching against each entry in the Map, represented as a Tuple2, not against the Map as a whole.
You also have to use curly braces {} around the case statement, to keep the compiler happy.
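The pattern-matching form also makes it easy to deal with values that are not valid integers. The sketch below silently drops such entries, which is an assumption about the desired behaviour; the function name `toIntValues` is made up for illustration.

```scala
import scala.util.Try

// Convert the values to Int, skipping entries whose value does not parse.
def toIntValues(m: Map[String, String]): Map[String, Int] =
  m.toList.flatMap { case (k, v) =>
    // Try(...).toOption is None when toInt throws NumberFormatException
    Try(v.toInt).toOption.map(n => k -> n).toList
  }.toMap
```

For example, `toIntValues(Map("a" -> "1", "b" -> "x"))` keeps only the entry for "a". With plain `v.toInt`, as in the answers above, the same input would throw a NumberFormatException instead.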
What is the best way to turn a Map[A, Set[B]] into a Map[B, Set[A]]?
For example, how do I turn a
Map(1 -> Set("a", "b"),
2 -> Set("b", "c"),
3 -> Set("c", "d"))
into a
Map("a" -> Set(1),
"b" -> Set(1, 2),
"c" -> Set(2, 3),
"d" -> Set(3))
(I'm using immutable collections only here. And my real problem has nothing to do with strings or integers. :)
With help from aioobe and Moritz:
def reverse[A, B](m: Map[A, Set[B]]) =
m.values.toSet.flatten.map(v => (v, m.keys.filter(m(_)(v)))).toMap
It's a bit more readable if you explicitly call contains:
def reverse[A, B](m: Map[A, Set[B]]) =
m.values.toSet.flatten.map(v => (v, m.keys.filter(m(_).contains(v)))).toMap
Best I've come up with so far is
val intToStrs = Map(1 -> Set("a", "b"),
2 -> Set("b", "c"),
3 -> Set("c", "d"))
def mappingFor(key: String) =
intToStrs.keys.filter(intToStrs(_) contains key).toSet
val newKeys = intToStrs.values.flatten
val inverseMap = newKeys.map(newKey => (newKey -> mappingFor(newKey))).toMap
Or another one using folds:
def reverse2[A,B](m:Map[A,Set[B]])=
m.foldLeft(Map[B,Set[A]]()){case (r,(k,s)) =>
s.foldLeft(r){case (r,e)=>
r + (e -> (r.getOrElse(e, Set()) + k))
}
}
Here's a one statement solution
originalMap
.map{case (k, v)=>v.map{v2=>(v2,k)}}
.flatten
.groupBy{_._1}
.transform {(k, v)=>v.unzip._2.toSet}
This bit rather neatly (*) produces the tuples needed to construct the reverse map
Map(1 -> Set("a", "b"),
2 -> Set("b", "c"),
3 -> Set("c", "d"))
.map{case (k, v)=>v.map{v2=>(v2,k)}}.flatten
produces
List((a,1), (b,1), (b,2), (c,2), (c,3), (d,3))
Converting it directly to a map overwrites the values corresponding to duplicate keys, though.
Adding .groupBy{_._1} gets this
Map(c -> List((c,2), (c,3)),
a -> List((a,1)),
d -> List((d,3)),
b -> List((b,1), (b,2)))
which is closer. To turn those lists into Sets of the second halves of the pairs, add
.transform {(k, v)=>v.unzip._2.toSet}
gives
Map(c -> Set(2, 3), a -> Set(1), d -> Set(3), b -> Set(1, 2))
QED :)
(*) YMMV
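The same chain can be wrapped in a generic function. This is a sketch of the steps walked through above, using `map` for the last step instead of `transform`; the input is the example Map from the question.

```scala
def reverse[A, B](m: Map[A, Set[B]]): Map[B, Set[A]] =
  m.toList
    .flatMap { case (k, vs) => vs.toList.map(v => v -> k) }  // (value, key) pairs
    .groupBy(_._1)                                            // group by the old value
    .map { case (v, pairs) => v -> pairs.map(_._2).toSet }    // keep only the old keys

val reversed = reverse(
  Map(1 -> Set("a", "b"), 2 -> Set("b", "c"), 3 -> Set("c", "d"))
)
```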
A simple, but maybe not super-elegant solution:
def reverse[A,B](m:Map[A,Set[B]])={
var r = Map[B,Set[A]]()
m.keySet foreach { k=>
m(k) foreach { e =>
r = r + (e -> (r.getOrElse(e, Set()) + k))
}
}
r
}
The easiest way I can think of is:
// unfold values to tuples (v,k)
// for all values v in the Set referenced by key k
def vk = for {
(k,vs) <- m.iterator
v <- vs.iterator
} yield (v -> k)
// fold iterator back into a map
(Map[String,Set[Int]]() /: vk) {
// alternative syntax: vk.foldLeft(Map[String,Set[Int]]()) {
case (m,(k,v)) if m contains k =>
// Map already contains a Set, so just add the value
m updated (k, m(k) + v)
case (m,(k,v)) =>
// key not in the map - wrap value in a Set and return updated map
m updated (k, Set(v))
}
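A self-contained variant of the same idea is sketched below. It assumes `m` is the example Map from the question, uses `foldLeft` rather than the `/:` operator (which is deprecated in Scala 2.13), and merges the two cases with `getOrElse`.

```scala
val m = Map(1 -> Set("a", "b"), 2 -> Set("b", "c"), 3 -> Set("c", "d"))

// All (value, key) pairs, produced lazily.
def vk: Iterator[(String, Int)] =
  for {
    (k, vs) <- m.iterator
    v       <- vs.iterator
  } yield v -> k

// Fold the pairs back into a Map, creating or extending the Set for each value.
val inverted: Map[String, Set[Int]] =
  vk.foldLeft(Map.empty[String, Set[Int]]) { case (acc, (v, k)) =>
    acc.updated(v, acc.getOrElse(v, Set.empty[Int]) + k)
  }
```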