Merge a sequence of tuples into another sequence in scala - scala

I have a sequence of tuples like the one below. It records how many times "x" and "y" occur in the documents "abc" and "xyz".
Seq(("abc", Map("x" -> 1, "y" -> 2)), ("xyz", Map("x" -> 2, "y" -> 1)))
How can I produce the output below from this sequence?
Seq(("x", Map("abc" -> 1, "xyz" -> 2)), ("y", Map("abc" -> 2, "xyz" -> 1)))

Here is one possibility:
val s = Seq(
("abc", Map("x" -> 1, "y" -> 2)),
("xyz", Map("x" -> 2, "y" -> 1))
)
val t = (for {
(x, yvs) <- s
(y, v) <- yvs
} yield (y, (x, v)))
.groupBy(_._1)
.mapValues(_.unzip._2.toMap)
println(t)
This produces (up to random reordering of the unsorted keys):
Map(
x -> Map(abc -> 1, xyz -> 2),
y -> Map(abc -> 2, xyz -> 1)
)
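The same regrouping can also be written as a nested fold that builds the result map directly, which avoids groupBy and mapValues. This is only a sketch: the names TransposeDemo and transpose are made up for illustration, and the final toSeq.sortBy step is one assumed way to get the sorted Seq of tuples the question asks for.

```scala
object TransposeDemo {
  // Same input as in the question: document -> (term -> count)
  val docs: Seq[(String, Map[String, Int])] = Seq(
    "abc" -> Map("x" -> 1, "y" -> 2),
    "xyz" -> Map("x" -> 2, "y" -> 1)
  )

  // Hypothetical helper: turns doc -> (term -> count) into term -> (doc -> count)
  def transpose(in: Seq[(String, Map[String, Int])]): Map[String, Map[String, Int]] =
    in.foldLeft(Map.empty[String, Map[String, Int]]) { case (acc, (doc, counts)) =>
      counts.foldLeft(acc) { case (inner, (term, n)) =>
        // Fetch the per-document counts collected so far for this term, then add this document
        val perDoc = inner.getOrElse(term, Map.empty[String, Int])
        inner.updated(term, perDoc.updated(doc, n))
      }
    }

  def main(args: Array[String]): Unit =
    // Convert to a Seq of tuples sorted by term, as in the requested output
    println(transpose(docs).toSeq.sortBy(_._1))
}
```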

Related

Reduce rdd of maps

I have an RDD like this:
Map(A -> Map(A1 -> 1))
Map(A -> Map(A2 -> 2))
Map(A -> Map(A3 -> 3))
Map(B -> Map(B1 -> 4))
Map(B -> Map(B2 -> 5))
Map(B -> Map(B3 -> 6))
Map(C -> Map(C1 -> 7))
Map(C -> Map(C2 -> 8))
Map(C -> Map(C3 -> 9))
I need the same RDD reduced by key, keeping all of the original values:
Map(A -> Map(A1 -> 1, A2 -> 2, A3 -> 3))
Map(B -> Map(B1 -> 4, B2 -> 5, B3 -> 6))
Map(C -> Map(C1 -> 7, C2 -> 8, C3 -> 9))
I tried with a reduce:
val prueba = replacements_2.reduce((x,y) => x ++ y)
But only the value of the last element evaluated for each key remains:
(A,Map(A3 -> 3))
(C,Map(C3 -> 9))
(B,Map(B3 -> 6))
I think you should model your data differently, your Map approach seems a bit awkward. Why represent 1 entry by a Map with 1 element? A Tuple2 is more suitable for this... Anyway, you need reduceByKey. To do this, you first need to convert your rdd to a key-value RDD:
rdd
.map(m => (m.keys.head,m.values.head)) // create key-value RDD
.reduceByKey((a,b) => a++b) // merge maps
.map{case (k,v) => Map(k -> v)} // create Map again
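To see what the reduceByKey step does without a Spark cluster, the same merge can be imitated on a plain local collection. This is only an analogy, not Spark code: the Seq named records stands in for the RDD, and groupBy followed by reduce plays the role of reduceByKey.

```scala
object MergeByKeyDemo {
  // Stand-in for the RDD contents: a local Seq of single-entry maps
  val records: Seq[Map[String, Map[String, Int]]] = Seq(
    Map("A" -> Map("A1" -> 1)),
    Map("A" -> Map("A2" -> 2)),
    Map("B" -> Map("B1" -> 4)),
    Map("B" -> Map("B2" -> 5))
  )

  val merged: Map[String, Map[String, Int]] =
    records
      .map(m => m.head)           // (key, inner map) pair, like the key-value RDD
      .groupBy(_._1)              // collect all pairs that share a key
      .map { case (k, pairs) =>   // merge the inner maps, as reduceByKey(_ ++ _) would
        k -> pairs.map(_._2).reduce(_ ++ _)
      }

  def main(args: Array[String]): Unit =
    merged.foreach(println)
}
```

The original reduce((x, y) => x ++ y) loses data because ++ on the outer maps replaces the whole inner map whenever two records share a key; merging has to happen on the inner maps, per key.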

How to make merge or intercalate two maps in scala one by one?

I want to merge two maps into a list of maps as follows:
val map1 = {"a" -> 1, "b" -> 2, "c" -> 3}
val map2 = {"x" -> 10, "y" -> 20, "z" -> 30}
val res = [{"a" ->1, "x" -> 10},{"b" -> 2, "y" -> 20},{"c" -> 3, "z" -> 30}]
Maybe something like this:
val map1 = Map("a" -> 1, "b" -> 2, "c" -> 3)
val map2 = Map("x" -> 10, "y" -> 20, "z" -> 30)
(map1.toList, map2.toList).zipped.map{
case (a,b) => Map(a,b)
}
You can also try this:
val map1 = Map("a" -> 1, "b" -> 2, "c" -> 3)
val map2 = Map("x" -> 10, "y" -> 20, "z" -> 30)
val res = for ((i, j) <- map1 zip map2) yield Map(i, j)
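Both snippets depend on the iteration order of the two maps. The default immutable Map keeps insertion order only for up to four entries; larger maps are hash-based and may iterate in a different order. If the pairing order matters, one option is to keep the inputs as sequences of pairs, as in this sketch (the names pairs1 and pairs2 are made up for illustration):

```scala
object InterleaveDemo {
  // Ordered inputs: a Seq of pairs keeps the order it was written in
  val pairs1: Seq[(String, Int)] = Seq("a" -> 1, "b" -> 2, "c" -> 3, "d" -> 4, "e" -> 5)
  val pairs2: Seq[(String, Int)] = Seq("v" -> 10, "w" -> 20, "x" -> 30, "y" -> 40, "z" -> 50)

  // Pair the i-th entry of each sequence and wrap both entries in a two-element Map
  val res: Seq[Map[String, Int]] =
    pairs1.zip(pairs2).map { case (p, q) => Map(p, q) }

  def main(args: Array[String]): Unit =
    res.foreach(println)
}
```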

Invert a Map (String -> List) in Scala

I have a Map[String, List[String]] and I want to invert it. For example, if I have something like
"1" -> List("a","b","c")
"2" -> List("a","j","k")
"3" -> List("a","c")
The result should be
"a" -> List("1","2","3")
"b" -> List("1")
"c" -> List("1","3")
"j" -> List("2")
"k" -> List("2")
I've tried this:
m.map(_.swap)
But it returns a Map[List[String], String]:
List("a","b","c") -> "1"
List("a","j","k") -> "2"
List("a","c") -> "3"
Map inversion is a little more complicated.
val m = Map("1" -> List("a","b","c")
,"2" -> List("a","j","k")
,"3" -> List("a","c"))
m flatten {case(k, vs) => vs.map((_, k))} groupBy (_._1) mapValues {_.map(_._2)}
//res0: Map[String,Iterable[String]] = Map(j -> List(2), a -> List(1, 2, 3), b -> List(1), c -> List(1, 3), k -> List(2))
Flatten the Map into a collection of tuples. groupBy will create a new Map with the old values as the new keys. Then un-tuple the values by removing the key (previously value) elements.
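The three steps described above can be spelled out with named intermediate values. This is a sketch that uses toList.flatMap and a plain map over the grouped entries instead of flatten and mapValues; the names pairs, grouped and inverted are chosen only for illustration, and the final sorted call is an added assumption to make the lists deterministic.

```scala
object InvertStepsDemo {
  val m: Map[String, List[String]] = Map(
    "1" -> List("a", "b", "c"),
    "2" -> List("a", "j", "k"),
    "3" -> List("a", "c")
  )

  // Step 1: flatten into (value, key) pairs
  val pairs: List[(String, String)] =
    m.toList.flatMap { case (k, vs) => vs.map(v => (v, k)) }

  // Step 2: group the pairs by their first element (the old value)
  val grouped: Map[String, List[(String, String)]] = pairs.groupBy(_._1)

  // Step 3: drop the redundant first element from every pair and sort the old keys
  val inverted: Map[String, List[String]] =
    grouped.map { case (v, ps) => v -> ps.map(_._2).sorted }

  def main(args: Array[String]): Unit =
    inverted.foreach(println)
}
```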
An alternative that does not rely on strange implicit arguments of flatten, as requested by yishaiz:
val m = Map(
"1" -> List("a","b","c"),
"2" -> List("a","j","k"),
"3" -> List("a","c"),
)
val res = (for ((digit, chars) <- m.toList; c <- chars) yield (c, digit))
.groupBy(_._1) // group by characters
.mapValues(_.unzip._2) // drop redundant digits from lists
res foreach println
gives:
(j,List(2))
(a,List(1, 2, 3))
(b,List(1))
(c,List(1, 3))
(k,List(2))
A simple nested for-comprehension can invert the map so that each element of a value list becomes a key in the inverted map, with the original key as its value. Note that the result is a Map[T, T], so an element that appears under several keys keeps only the last key processed:
implicit class MapInverter[T] (map: Map[T, List[T]]) {
def invert: Map[T, T] = {
val result = collection.mutable.Map.empty[T, T]
for ((key, values) <- map) {
for (v <- values) {
result += (v -> key) // a later key overwrites an earlier one for the same v
}
}
result.toMap
}
}
Usage:
Map(10 -> List(3, 2), 20 -> List(16, 17, 18, 19)).invert
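Because the inverter above returns a Map[T, T], a value that appears under several keys ends up with a single key. For the example in the question, where "a" belongs to "1", "2" and "3", a list-valued result is needed. The following is a hedged sketch of such a variant; MultiMapInverter and invertAll are hypothetical names, not part of the answer above.

```scala
object MultiInvertDemo {
  // Hypothetical variant of the inverter that keeps every key per value
  implicit class MultiMapInverter[K, V](private val map: Map[K, List[V]]) {
    def invertAll: Map[V, List[K]] = {
      // One (value, key) pair for every element of every value list
      val pairs = for ((key, values) <- map.toList; v <- values) yield (v, key)
      // Group by value and keep only the keys in each group
      pairs.groupBy(_._1).map { case (v, ps) => v -> ps.map(_._2) }
    }
  }

  val result: Map[String, List[String]] =
    Map("1" -> List("a", "b"), "2" -> List("a")).invertAll

  def main(args: Array[String]): Unit =
    result.foreach(println)
}
```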

Reverse a map of type [Int, Seq[Int]]

I need to reverse a map
customerIdToAccountIds:Map[Int, Seq[Int]]
such that each account ID is a key to a list of all the customer IDs of the account (many-to-many relationship):
accountIdToCustomerIds:Map[Int, Seq[Int]]
What is a good idiomatic way to accomplish this? Thanks!
Input:
val customerIdToAccountIds:Map[Int, Seq[Int]] = Map(1 -> Seq(5,6,7), 2 -> Seq(5,6,7), 3 -> Seq(5,7,8))
val accountIdToCustomerIds:Map[Int, Seq[Int]] = ???
1 -> Seq(5,6,7)
2 -> Seq(5,6,7)
3 -> Seq(5,7,8)
Output:
5 -> Seq(1,2,3)
6 -> Seq(1,2)
7 -> Seq(1,2,3)
8 -> Seq(3)
val m = Map( 1 -> Seq(5,6,7)
, 2 -> Seq(5,6,7)
, 3 -> Seq(5,7,8) )
// Map inverter: from (k -> List(vs)) to (v -> List(ks))
m flatten {case(k, vs) => vs.map((_, k))} groupBy (_._1) mapValues {_.map(_._2)}
//result: Map(8 -> List(3), 5 -> List(1, 2, 3), 7 -> List(1, 2, 3), 6 -> List(1, 2))
val customerIdToAccountIds = Map(1 -> Seq(5, 6, 7), 2 -> Seq(5, 6, 7), 3 -> Seq(5, 7, 8))
val accountIdToCustomerIds = customerIdToAccountIds.toSeq.flatMap {
case (customerId, accountIds) => accountIds.map { accountId => (customerId, accountId) } // one (customerId, accountId) pair per account
}.groupBy(_._2).mapValues(_.map(_._1)) // groupBy accountId and extract customerId from tuples

reduce variable number of tuples Sequences to Map[Key, List[Value]] in Scala

I have two sequences:
Seq("a" -> 1, "b" -> 2)
Seq("a" -> 3, "b" -> 4)
What I want is a result Map that looks like this:
Map(a -> List(3, 1), b -> List(4, 2))
val s1 = Seq("a" -> 1, "b" -> 2)
val s2 = Seq("a" -> 3, "b" -> 4)
val ss = s1 ++ s2
val toMap = ss.groupBy(x => x._1).map { case (k,v) => (k, v.map(_._2))}
res0: scala.collection.immutable.Map[String,Seq[Int]] = Map(b -> List(2, 4), a -> List(1, 3))
You can then sort the lists if you need a particular order.
You can try
scala> val seq = Seq("a" -> 1, "b" -> 2) ++ Seq("a" -> 3, "b" -> 4)
seq: Seq[(String, Int)] = List((a,1), (b,2), (a,3), (b,4))
scala> seq groupBy(_._1) mapValues(_ map(_._2))
res9: scala.collection.immutable.Map[String,Seq[Int]] = Map(b -> List(2, 4), a -> List(1, 3))
def reduceToMap[K, V](seqs: Seq[(K, V)]*): Map[K, List[V]] = {
seqs.reduce(_ ++ _).foldLeft(Map.empty[K, List[V]])((memo, next) =>
memo.get(next._1) match {
case None => memo.updated(next._1, next._2 :: Nil)
case Some(xs) => memo.updated(next._1, next._2 :: xs)
}
)
}
scala> reduceToMap(Seq("a" -> 1, "b" -> 2), Seq("a" -> 3, "b" -> 4))
res0: Map[String,List[Int]] = Map(a -> List(3, 1), b -> List(4, 2))
scala> reduceToMap(Seq.empty)
res1: Map[Nothing,List[Nothing]] = Map()
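One caveat with reduceToMap: seqs.reduce throws when the function is called with no arguments at all, because there is nothing to reduce. A possible alternative, sketched here under the assumption that encounter order (List(1, 3) rather than List(3, 1)) is acceptable, flattens the inputs first; groupAll is a hypothetical name.

```scala
object GroupAllDemo {
  // Hypothetical helper; flatten tolerates zero input sequences, unlike reduce
  def groupAll[K, V](seqs: Seq[(K, V)]*): Map[K, List[V]] =
    seqs.flatten                  // concatenate all input sequences
      .groupBy(_._1)              // group the pairs by key
      .map { case (k, kvs) => k -> kvs.map(_._2).toList } // keep only the values

  def main(args: Array[String]): Unit =
    println(groupAll(Seq("a" -> 1, "b" -> 2), Seq("a" -> 3, "b" -> 4)))
}
```

If the reversed order from the question is required, each list can be reversed in the final map step.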