GroupBy in scala - scala

I have
val a = List((1,2), (1,3), (3,4), (3,5), (4,5))
I am using a.groupBy(_._1), which groups by the first element, but it gives me this output:
Map(1 -> List((1,2), (1,3)), 3 -> List((3,4), (3,5)), 4 -> List((4,5)))
But I want this result:
Map(1 -> List(2, 3), 3 -> List(4, 5), 4 -> List(5))
How can I do this?

You can do that by following up with mapValues (and a map over each value to extract the second element):
scala> a.groupBy(_._1).mapValues(_.map(_._2))
res2: scala.collection.immutable.Map[Int,List[Int]] = Map(4 -> List(5), 1 -> List(2, 3), 3 -> List(4, 5))
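On Scala 2.13 and later, mapValues called directly on a Map is deprecated, and the view-based version returns a lazy MapView rather than a Map. The following is a minimal sketch for those versions; the names GroupSecond and groupSecond are made up for illustration.

```scala
object GroupSecond {
  // Group pairs by their first element, keeping only the second elements.
  // Assumes Scala 2.13+: .view.mapValues is lazy, so .toMap materializes it.
  def groupSecond(pairs: List[(Int, Int)]): Map[Int, List[Int]] =
    pairs.groupBy(_._1).view.mapValues(_.map(_._2)).toMap

  val a = List((1, 2), (1, 3), (3, 4), (3, 5), (4, 5))
  val grouped: Map[Int, List[Int]] = groupSecond(a)
}
```

groupBy keeps the elements of each group in their original order, so the values for key 1 come out as List(2, 3).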

Make life easier with pattern matching and Map#withDefaultValue:
scala> a.foldLeft(Map.empty[Int, List[Int]].withDefaultValue(Nil)) {
  case (r, (x, y)) => r.updated(x, r(x) :+ y)
}
res0: scala.collection.immutable.Map[Int,List[Int]] =
Map(1 -> List(2, 3), 3 -> List(4, 5), 4 -> List(5))
Two points to note:
Map#withDefaultValue returns a map with the given default value, so you don't need to check whether the map contains a key.
Wherever Scala expects a function value (x1, x2, ..., xn) => y, you can use a pattern-matching block { case (x1, x2, ..., xn) => y } instead; the compiler translates it into a function automatically. See section 8.5, "Pattern Matching Anonymous Functions", of the Scala Language Specification for more information.
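To make the two points concrete, here is a small sketch that separates them. The names DefaultValueFold, empty and collectByKey are hypothetical and only serve the illustration.

```scala
object DefaultValueFold {
  // Point 1: with a default value, apply() returns Nil for a missing key
  // instead of throwing NoSuchElementException.
  val empty: Map[Int, List[Int]] =
    Map.empty[Int, List[Int]].withDefaultValue(Nil)

  // Point 2: the { case (acc, (k, v)) => ... } block is a pattern-matching
  // anonymous function; it destructures the accumulator and the pair at once.
  def collectByKey(pairs: List[(Int, Int)]): Map[Int, List[Int]] =
    pairs.foldLeft(empty) { case (acc, (k, v)) => acc.updated(k, acc(k) :+ v) }
}
```

Note that :+ on a List copies the list on every append, which is fine for small inputs but linear in the length of each group.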

As of Scala 2.13, you can use groupMap, so you can write just:
// val list = List((1, 2), (1, 3), (3, 4), (3, 5), (4, 5))
list.groupMap(_._1)(_._2)
// Map(1 -> List(2, 3), 3 -> List(4, 5), 4 -> List(5))
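As a rough comparison, groupMap(key)(f) can be read as "group by key, then apply f to every grouped element". The sketch below assumes Scala 2.13+ and uses a made-up object name, GroupMapDemo, to put both forms side by side.

```scala
object GroupMapDemo {
  val list = List((1, 2), (1, 3), (3, 4), (3, 5), (4, 5))

  // One call: group by the first element, keep the second element.
  val viaGroupMap: Map[Int, List[Int]] = list.groupMap(_._1)(_._2)

  // The two-step equivalent, for comparison.
  val viaGroupBy: Map[Int, List[Int]] =
    list.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2) }
}
```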

As a variant (values are prepended, so each list ends up in reverse order):
a.foldLeft(Map[Int, List[Int]]()) { case (acc, (k, v)) => acc + (k -> (v :: acc.getOrElse(k, List()))) }

You can also do it with a foldLeft to have only one iteration.
a.foldLeft(Map.empty[Int, List[Int]])((map, t) =>
  if (map.contains(t._1)) map + (t._1 -> (t._2 :: map(t._1)))
  else map + (t._1 -> List(t._2)))
scala.collection.immutable.Map[Int,List[Int]] = Map(1 -> List(3, 2), 3 -> List(5, 4), 4 -> List(5))
If the order of the elements in the lists matters, reverse each list once after the fold. Reversing inside the fold on every update only restores the order for keys with at most two values.
a.foldLeft(Map.empty[Int, List[Int]])((map, t) =>
  if (map.contains(t._1)) map + (t._1 -> (t._2 :: map(t._1)))
  else map + (t._1 -> List(t._2))
).map { case (k, vs) => k -> vs.reverse }
scala.collection.immutable.Map[Int,List[Int]] = Map(1 -> List(2, 3), 3 -> List(4, 5), 4 -> List(5))

Related

Find common elements in a map of sequences - scala

I have something like this:
val myMap: Map[Int, Seq[Int]] = Map(1 -> Seq(1, 2, 3), 2 -> Seq(2, 3, 4), 3 -> Seq(3, 4, 5), 4 -> Seq(4, 5, 6))
I am trying to find a way to relate all the keys and their common elements in the sequence they are mapped to.
For example:
1 and 2 share (2, 3)
1 and 3 share (3)
2 and 3 share (3, 4)
2 and 4 share (4)
3 and 4 share (4, 5)
I suspect I need to use intersect, but I am not sure how to approach the problem. I am brand new to Scala and functional programming and need a little help getting started. I know there are probably easier ways to do this with Spark, but I am trying to stick to plain Scala.
Any help is greatly appreciated!
Here's one way using flatMap and collect to generate the shared values from every combination of the key pairs via intersect:
val myMap: Map[Int, List[Int]] = Map(
  1 -> List(1, 2, 3), 2 -> List(2, 3, 4), 3 -> List(3, 4, 5), 4 -> List(4, 5, 6)
)
val keys = myMap.keys.toList
keys.flatMap { i =>
  keys.collect {
    case j if j > i => (i, j, myMap(i) intersect myMap(j))
  }
}
// res1: List[(Int, Int, List[Int])] = List(
// (1,2,List(2, 3)),
// (1,3,List(3)),
// (1,4,List()),
// (2,3,List(3, 4)),
// (2,4,List(4)),
// (3,4,List(4, 5))
// )
The above is essentially the same as the following for comprehension:
for {
  i <- keys
  j <- keys
  if j > i
} yield (i, j, myMap(i) intersect myMap(j))
How do you want the results returned? Do you just want to print them to STDOUT?
myMap.keys.toList.combinations(2).foreach { case List(a, b) =>
  println(s"$a,$b --> ${myMap(a) intersect myMap(b)}")
}
Pretty similar to @jwvh's solution, but with fewer lookups in the map, in case it is big:
val myMap: Map[Int, Seq[Int]] = Map(1 -> Seq(1, 2, 3), 2 -> Seq(2, 3, 4), 3 -> Seq(3, 4, 5), 4 -> Seq(4, 5, 6))
myMap.toList.combinations(2).foreach {
  case List((i1, s1), (i2, s2)) =>
    val ints = s1.intersect(s2)
    if (ints.nonEmpty) {
      println(s"$i1 and $i2 share (${ints.mkString(", ")})")
    }
  case _ => ???
}
Code run at Scastie.
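If the pairs should be returned rather than printed, the same combinations-plus-intersect idea can be wrapped in a function. This is only a sketch: the names SharedElements and shared are invented here, keys are sorted first so the output order is stable, and pairs with nothing in common are dropped to match the example in the question.

```scala
object SharedElements {
  // For every unordered pair of keys, compute the elements both sequences share.
  def shared(m: Map[Int, Seq[Int]]): List[(Int, Int, Seq[Int])] =
    m.toList
      .sortBy(_._1)        // stable key order, independent of Map iteration order
      .combinations(2)     // every pair of (key, seq) entries exactly once
      .toList
      .flatMap {
        case List((k1, s1), (k2, s2)) =>
          val common = s1.intersect(s2)
          if (common.nonEmpty) Some((k1, k2, common)) else None
        case _ => None     // unreachable for combinations(2); keeps the match total
      }
}
```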

scala - sum up map's value by position

I have a map like this
var kk: scala.collection.mutable.Map[Int, Array[Int]] = scala.collection.mutable.Map(2 -> Array(1, 3), 1 -> Array(2, 8), 3 -> Array(4, 5))
What I need is to sum the value arrays by position, like NumPy array addition. The result should be Array(1+2+4, 3+8+5).
Or a little shorter, using .values to get an Iterable of the arrays and .transpose so that .map can be applied to each "column":
scala> val data = Map(2 -> Array(1, 3), 1 -> Array(2, 8), 3 -> Array(4, 5))
data: scala.collection.immutable.Map[Int,Array[Int]] = Map(2 -> Array(1, 3), 1 -> Array(2, 8), 3 -> Array(4, 5))
scala> data.values.transpose.map(_.sum)
res4: Iterable[Int] = List(7, 16)
And add .toArray if you'd like to get an Array:
scala> data.values.transpose.map(_.sum).toArray
res5: Array[Int] = Array(7, 16)
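The transpose approach can be wrapped in a small helper. The sketch below uses made-up names (ColumnSums, sumByPosition) and converts the arrays to lists first, so it does not rely on the implicit Array-to-Iterable conversion. It assumes all arrays have the same length; transpose fails otherwise.

```scala
object ColumnSums {
  // Sum arrays position by position: transpose turns rows into columns,
  // then each column is summed.
  def sumByPosition(arrays: Iterable[Array[Int]]): Array[Int] =
    arrays.toList.map(_.toList).transpose.map(_.sum).toArray
}
```

For example, ColumnSums.sumByPosition(data.values) would be called with the map from the question.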
You need to iterate over the map's values to pair up the first and second element of each array, then fold over those pairs to sum them. This assumes every array has exactly two elements.
scala> val data = Map(2 -> Array(1, 3), 1 -> Array(2, 8), 3 -> Array(4, 5))
data: scala.collection.mutable.Map[Int,Array[Int]] = Map(2 -> Array(1, 3), 1 -> Array(2, 8), 3 -> Array(4, 5))
scala> data.values.map(arr => arr.head -> arr.last).foldLeft((0, 0))((a, b) => (a._1 + b._1) -> (a._2 + b._2))
res11: (Int, Int) = (7,16)
If you want the result to be explicitly an Array, call .productIterator.toArray on the result above (note that this yields an Array[Any]):
scala> data.values.map(arr => arr.head -> arr.last).foldLeft((0, 0))((a, b) => (a._1 + b._1) -> (a._2 + b._2)).productIterator.toArray
res13: Array[Any] = Array(7, 16)
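A fold over tuples is tied to two-element arrays. One possible generalization, sketched here with hypothetical names (ZipSums, add, sumByPosition), is to reduce the arrays with an element-wise addition built on zip, which also keeps the result typed as Array[Int].

```scala
object ZipSums {
  // Add two arrays element-wise; zip truncates to the shorter array.
  private def add(x: Array[Int], y: Array[Int]): Array[Int] =
    x.zip(y).map { case (p, q) => p + q }

  // reduceOption returns None for an empty map instead of throwing.
  def sumByPosition(data: Map[Int, Array[Int]]): Option[Array[Int]] =
    data.values.reduceOption(add)
}
```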

Convert List of Maps to Map of Lists based on Map Key

Let's say I have the following list:
val myList = List(Map(1 -> 1), Map(2 -> 2), Map(2 -> 7))
I want to convert this list to a single Map of Int -> List(Int) such that if we have duplicate keys then both values should be included in the resulting value list:
Map(2 -> List(7, 2), 1 -> List(1))
I came up with this working solution but it seems excessive and clunky:
myList.foldLeft(scala.collection.mutable.Map[Int, List[Int]]()) { (result, element) =>
  for ((k, v) <- element) {
    if (result.keySet.contains(k)) {
      result(k) = result(k).::(v)
    } else {
      result += (k -> List(v))
    }
  }
  result
}
Is there a better or more efficient approach here?
myList
  .flatten
  .groupBy(_._1)
  .mapValues(_.map(_._2))
You can use simpler (but probably less efficient) code:
val myList = List(Map(1 -> 1), Map(2 -> 2), Map(2 -> 7))
val grouped = myList.flatMap(_.toList).groupBy(_._1).mapValues(l => l.map(_._2))
println(grouped)
Map(2 -> List(2, 7), 1 -> List(1))
The idea is to first get List of all tuples from all inner Maps and then group them.
Starting with Scala 2.13, we can use groupMap, which is a one-pass equivalent of a groupBy followed by mapValues (as its name suggests):
// val maps = List(Map(1 -> 1), Map(2 -> 2), Map(2 -> 7))
maps.flatten.groupMap(_._1)(_._2) // Map(1 -> List(1), 2 -> List(2, 7))
This:
flattens the list of maps into a list of tuples (List((1, 1), (2, 2), (2, 7)))
groups elements based on their first tuple part (_._1) (group part of groupMap)
maps grouped values to their second tuple part (_._2) (map part of groupMap)
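The steps above can be made visible by naming the intermediate value. This is a sketch assuming Scala 2.13+; the object name MergeMapsSteps and the val names are chosen only for illustration.

```scala
object MergeMapsSteps {
  val maps = List(Map(1 -> 1), Map(2 -> 2), Map(2 -> 7))

  // Step 1: flatten the list of maps into a list of (key, value) tuples.
  val pairs: List[(Int, Int)] = maps.flatten

  // Steps 2 and 3: group by the first tuple part, keep the second tuple part.
  val merged: Map[Int, List[Int]] = pairs.groupMap(_._1)(_._2)
}
```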

Scala Map: Combine keys with the same value?

Suppose I have a Map like
val x = Map(1 -> List("a", "b"), 2 -> List("a"),
            3 -> List("a", "b"), 4 -> List("a"),
            5 -> List("c"))
How would I create from this a new Map where the keys are Lists of keys from x having the same value, e.g., how can I implement
def someFunction(m: Map[Int, List[String]]): Map[List[Int], List[String]] =
// stuff that would turn x into
// Map(List(1, 3) -> List("a", "b"), List(2, 4) -> List("a"), List(5) -> List("c"))
?
You can convert the Map to a List and then use groupBy to aggregate the first element of each tuple:
x.toList.groupBy(_._2).mapValues(_.map(_._1)).map{ case (x, y) => (y, x) }
// res37: scala.collection.immutable.Map[List[Int],List[String]] =
// Map(List(2, 4) -> List(a), List(1, 3) -> List(a, b), List(5) -> List(c))
Or, as @Dylan commented, use _.swap to switch the tuples' elements:
x.toList.groupBy(_._2).mapValues(_.map(_._1)).map(_.swap)
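On Scala 2.13+, where mapValues on a Map is deprecated, the same idea can be written with groupMap. The sketch below fills in the someFunction signature from the question; the wrapping object CombineKeys is a made-up name, and sorting the key lists is an added assumption to make the result independent of Map iteration order.

```scala
object CombineKeys {
  // Group keys by their value, then flip each entry so the list of keys
  // becomes the map key and the shared value becomes the map value.
  def someFunction(m: Map[Int, List[String]]): Map[List[Int], List[String]] =
    m.toList
      .groupMap(_._2)(_._1)
      .map { case (value, keys) => keys.sorted -> value }
}
```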