Rolling up data with nested maps in Scala

I'm new to programming in a more "functional" style. Normally I would write a series of nested foreach loops and accumulate totals with +=.
I have a data structure that looks like:
Map(
"team1" ->
Map(
"2015" -> Map("wins" -> 30, "losses" -> 5),
"2016" -> Map("wins" -> 3, "losses" -> 7)
),
"team2" ->
Map(
"2015" -> Map("wins" -> 22, "losses" -> 1),
"2016" -> Map("wins" -> 17, "losses" -> 4)
)
)
What I want is a data structure that simply throws away the year information and adds wins/losses together by team.
Map(
"team1" -> Map("wins" -> 33, "losses" -> 12),
"team2" -> Map("wins" -> 39, "losses" -> 5)
)
I've been looking at groupBy, but that only seems to be useful if I don't have this nested structure.
Any ideas? Or is the more traditional imperative/foreach approach preferable here?

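myMap.map { case (team, years) =>
  team -> years.values.flatMap(_.toList).groupBy(_._1).map { case (stat, pairs) => stat -> pairs.map(_._2).sum }
}
For each team:
get all the per-year maps (the values)
flatMap them into one list of (key, value) pairs
groupBy the key ("wins" or "losses")
take the grouped values for each key and sum them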
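The steps above can be spelled out one intermediate value at a time. This is only a sketch: the object name RollUpSteps, the type alias Stats and the helper rollUp are made up for illustration, and the data is the map from the question.

```scala
object RollUpSteps {
  // Hypothetical alias for one season's (or one team's total) stats.
  type Stats = Map[String, Int]

  val myMap: Map[String, Map[String, Stats]] = Map(
    "team1" -> Map(
      "2015" -> Map("wins" -> 30, "losses" -> 5),
      "2016" -> Map("wins" -> 3, "losses" -> 7)
    ),
    "team2" -> Map(
      "2015" -> Map("wins" -> 22, "losses" -> 1),
      "2016" -> Map("wins" -> 17, "losses" -> 4)
    )
  )

  // Hypothetical helper: collapse one team's per-year maps into a single totals map.
  def rollUp(years: Map[String, Stats]): Stats = {
    val perYear = years.values                     // 1. all values, year keys dropped
    val pairs   = perYear.flatMap(_.toList).toList // 2. one flat list of (stat, count) pairs
    val grouped = pairs.groupBy(_._1)              // 3. pairs grouped by stat name
    grouped.map { case (stat, entries) =>          // 4. sum the counts of each group
      stat -> entries.map(_._2).sum
    }
  }

  val totals: Map[String, Stats] =
    myMap.map { case (team, years) => team -> rollUp(years) }

  def main(args: Array[String]): Unit = println(totals)
}
```

Note that the flattening is done on years.values (an Iterable) rather than on the Map itself. Flattening a Map of pairs directly would build another Map and silently keep only one entry per key.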

Define a helper method that adds two Maps key by key, then reduce each team's per-year maps with it:
def addMap(x: Map[String, Int], y: Map[String, Int]) =
x ++ y.map{ case (k, v) => (k, v + x.getOrElse(k, 0))}
m.mapValues(_.values.reduce(addMap(_, _)))
// res16: scala.collection.immutable.Map[String,scala.collection.immutable.Map[String,Int]] =
// Map(team1 -> Map(wins -> 33, losses -> 12), team2 -> Map(wins -> 39, losses -> 5))
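One thing to keep in mind is that reduce throws on an empty collection, so a team with no seasons would fail. A possible variation, sketched below, folds from an empty map instead. The object name SafeAddMap and the "team3" entry are hypothetical and only there to show the empty case; the sketch also uses map rather than mapValues, which is deprecated on Map in Scala 2.13.

```scala
object SafeAddMap {
  // Same key-wise addition as addMap in the answer, with an explicit return type.
  def addMap(x: Map[String, Int], y: Map[String, Int]): Map[String, Int] =
    x ++ y.map { case (k, v) => k -> (v + x.getOrElse(k, 0)) }

  // Hypothetical input: team3 has no seasons recorded.
  val m: Map[String, Map[String, Map[String, Int]]] = Map(
    "team1" -> Map(
      "2015" -> Map("wins" -> 30, "losses" -> 5),
      "2016" -> Map("wins" -> 3, "losses" -> 7)
    ),
    "team3" -> Map.empty[String, Map[String, Int]]
  )

  // foldLeft starts from an empty map, so a team without seasons
  // yields an empty totals map instead of an exception.
  val totals: Map[String, Map[String, Int]] =
    m.map { case (team, years) =>
      team -> years.values.foldLeft(Map.empty[String, Int])(addMap)
    }

  def main(args: Array[String]): Unit = println(totals)
}
```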

Using Cats, you could do:
import cats.implicits._
// or
// import cats.instances.map._
// import cats.instances.int._
// import cats.syntax.foldable._
teams.mapValues(_.combineAll)
// Map(
// team1 -> Map(wins -> 33, losses -> 12),
// team2 -> Map(wins -> 39, losses -> 5)
// )
combineAll combines the wins/losses maps of every year using a Monoid[Map[String, Int]] instance (also provided by the Cats library, see Monoid documentation), which sums the Ints for every key.
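To make the idea concrete without pulling in the library, here is a rough, dependency-free sketch of what a map monoid does. SimpleMonoid, intAddition, mapMonoid and this combineAll are hypothetical stand-ins written for illustration; they are not the Cats API, which resolves its instances implicitly.

```scala
object MonoidSketch {
  // Hypothetical, simplified stand-in for a monoid type class (not cats.Monoid).
  trait SimpleMonoid[A] {
    def empty: A
    def combine(x: A, y: A): A
  }

  // Ints under addition, with 0 as the identity.
  val intAddition: SimpleMonoid[Int] = new SimpleMonoid[Int] {
    def empty: Int = 0
    def combine(x: Int, y: Int): Int = x + y
  }

  // A map monoid built from a monoid for its values:
  // keys are merged, and values under the same key are combined.
  def mapMonoid[K, V](values: SimpleMonoid[V]): SimpleMonoid[Map[K, V]] =
    new SimpleMonoid[Map[K, V]] {
      def empty: Map[K, V] = Map.empty
      def combine(x: Map[K, V], y: Map[K, V]): Map[K, V] =
        y.foldLeft(x) { case (acc, (k, v)) =>
          acc.updated(k, values.combine(acc.getOrElse(k, values.empty), v))
        }
    }

  // Fold a whole collection with the monoid, starting from its identity.
  def combineAll[A](items: Iterable[A])(m: SimpleMonoid[A]): A =
    items.foldLeft(m.empty)(m.combine)

  val seasons = List(
    Map("wins" -> 30, "losses" -> 5),
    Map("wins" -> 3, "losses" -> 7)
  )

  val total: Map[String, Int] =
    combineAll(seasons)(mapMonoid[String, Int](intAddition))

  def main(args: Array[String]): Unit = println(total)
}
```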

.mapValues { _.toSeq
.flatMap(_._2.toSeq)
.groupBy(_._1)
.mapValues(_.foldLeft(0)(_ + _._2)) }

scala> val sourceMap = Map(
| "team1" ->
| Map(
| "2015" -> Map("wins" -> 30, "losses" -> 5),
| "2016" -> Map("wins" -> 3, "losses" -> 7)
| ),
| "team2" ->
| Map(
| "2015" -> Map("wins" -> 22, "losses" -> 1),
| "2016" -> Map("wins" -> 17, "losses" -> 4)
| )
| )
sourceMap: scala.collection.immutable.Map[String,scala.collection.immutable.Map[String,scala.collection.immutable.Map[String,Int]]] = Map(team1 -> Map(2015 -> Map(wins -> 30, losses -> 5), 2016 -> Map(wins -> 3, losses -> 7)), team2 -> Map(2015 -> Map(wins -> 22, losses -> 1), 2016 -> Map(wins -> 17, losses -> 4)))
scala> sourceMap.map { case (team, innerMap) =>
| val outcomeGroups = innerMap.values.flatten.groupBy(_._1)
| team -> outcomeGroups.map { case (outcome, xs) =>
| val scores = xs.map(_._2).sum
| outcome -> scores
| }
| }
res0: scala.collection.immutable.Map[String,scala.collection.immutable.Map[String,Int]] = Map(team1 -> Map(losses -> 12, wins -> 33), team2 -> Map(losses -> 5, wins -> 39))

Related

Reduce rdd of maps

I have an RDD like this:
Map(A -> Map(A1 -> 1))
Map(A -> Map(A2 -> 2))
Map(A -> Map(A3 -> 3))
Map(B -> Map(B1 -> 4))
Map(B -> Map(B2 -> 5))
Map(B -> Map(B3 -> 6))
Map(C -> Map(C1 -> 7))
Map(C -> Map(C2 -> 8))
Map(C -> Map(C3 -> 9))
I need the same RDD reduced by key, with each key keeping all of the values it had before:
Map(A -> Map(A1 -> 1, A2 -> 2, A3 -> 3))
Map(B -> Map(B1 -> 4, B2 -> 5, B3 -> 6))
Map(C -> Map(C1 -> 7, C2 -> 8, C3 -> 9))
I tried with a reduce:
val prueba = replacements_2.reduce((x,y) => x ++ y)
But only the value of the last element evaluated for each key remains:
(A,Map(A3 -> 3))
(C,Map(C3 -> 9))
(B,Map(B3 -> 6))
I think you should model your data differently; the Map approach seems a bit awkward. Why represent a single entry as a Map with one element? A Tuple2 is more suitable for that. In any case, what you need is reduceByKey, and to use it you first have to convert your RDD into a key-value RDD:
rdd
.map(m => (m.keys.head,m.values.head)) // create key-value RDD
.reduceByKey((a,b) => a++b) // merge maps
.map{case (k,v) => Map(k -> v)} // create Map again
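The reason the plain reduce loses data is that ++ on the outer maps replaces the whole inner map whenever the key already exists. The sketch below illustrates this, and the per-key merge, with ordinary Scala collections instead of Spark, so it only mirrors the logic of reduceByKey and not its distributed execution. The object name and the sample records are assumptions.

```scala
object ReduceByKeySketch {
  // Hypothetical local stand-in for the RDD's contents: one single-entry map per record.
  val records: List[Map[String, Map[String, Int]]] = List(
    Map("A" -> Map("A1" -> 1)),
    Map("A" -> Map("A2" -> 2)),
    Map("B" -> Map("B1" -> 4)),
    Map("A" -> Map("A3" -> 3)),
    Map("B" -> Map("B2" -> 5))
  )

  // ++ on the outer maps: the right-hand value for "A" replaces the left-hand one.
  val overwritten: Map[String, Map[String, Int]] =
    Map("A" -> Map("A1" -> 1)) ++ Map("A" -> Map("A2" -> 2))

  // Local analogue of map + reduceByKey: merge the inner maps per key.
  val merged: Map[String, Map[String, Int]] =
    records
      .map(m => (m.keys.head, m.values.head)) // key-value pairs, as in the RDD version
      .groupBy(_._1)                          // collect all pairs that share a key
      .map { case (k, pairs) => k -> pairs.map(_._2).reduce(_ ++ _) } // merge inner maps

  def main(args: Array[String]): Unit = {
    println(overwritten)
    println(merged)
  }
}
```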

How to merge or intercalate two maps in Scala one by one?

I want to merge two maps into a list of maps as follows:
val map1 = {"a" -> 1, "b" -> 2, "c" -> 3}
val map2 = {"x" -> 10, "y" -> 20, "z" -> 30}
val res = [{"a" ->1, "x" -> 10},{"b" -> 2, "y" -> 20},{"c" -> 3, "z" -> 30}]
Maybe something like this:
val map1 = Map("a" -> 1, "b" -> 2, "c" -> 3)
val map2 = Map("x" -> 10, "y" -> 20, "z" -> 30)
(map1.toList, map2.toList).zipped.map{
case (a,b) => Map(a,b)
}
You can also try this:
val map1 = Map("a" -> 1, "b" -> 2, "c" -> 3)
val map2 = Map("x" -> 10, "y" -> 20, "z" -> 30)
val res = for ((i, j) <- map1 zip map2) yield Map(i, j)
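Both versions pair entries in whatever order the maps iterate, and a general immutable Map makes no promise about that order. Small maps like these happen to iterate in insertion order, but that is an implementation detail. If the pairing should be deterministic, one option is to sort the entries first. The sketch below assumes that sorting by key gives the intended pairing; the object name is made up.

```scala
object ZipMapsSketch {
  val map1 = Map("a" -> 1, "b" -> 2, "c" -> 3)
  val map2 = Map("x" -> 10, "y" -> 20, "z" -> 30)

  // Sort both entry lists by key so the pairing does not depend on Map iteration order,
  // then build one two-entry map per pair.
  val res: List[Map[String, Int]] =
    map1.toList.sortBy(_._1)
      .zip(map2.toList.sortBy(_._1))
      .map { case (a, b) => Map(a, b) }

  def main(args: Array[String]): Unit = println(res)
}
```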

Merge a sequence of tuples into another sequence in scala

I have a sequence of tuples like the one below. It gives the number of times "x" and "y" occur in the documents "abc" and "xyz":
Seq(("abc", Map("x" -> 1, "y" -> 2)), ("xyz", Map("x" -> 2, "y" -> 1)))
How can I create an output like the one below from the sequence above?
Seq(("x", Map("abc" -> 1, "xyz" -> 2)), ("y", Map("abc" -> 2, "xyz" -> 1)))
Here is one possibility:
val s = Seq(
("abc", Map("x" -> 1, "y" -> 2)),
("xyz", Map("x" -> 2, "y" -> 1))
)
val t = (for {
(x, yvs) <- s
(y, v) <- yvs
} yield (y, (x, v)))
.groupBy(_._1)
.mapValues(_.unzip._2.toMap)
println(t)
This produces (up to random reordering of the unsorted keys):
Map(
x -> Map(abc -> 1, xyz -> 2),
y -> Map(abc -> 2, xyz -> 1)
)

Scala - filter out null values in a Array[Map[String,Int]]

I have an Array[Map[String,Int]] like this:
val orArray = Array(Map("x" -> 24, "y" -> 25, "z" -> 26), null, Map("x" -> 11, "y" -> 22, "z" -> 33), null, Map("x" -> 111, "y" -> 222, "z" -> 333))
I want to remove the null elements in this array, to get something like:
Array[Map[String,Int]] = (Map("x" -> 24, "y" -> 25, "z" -> 26), Map("x" -> 11, "y" -> 22, "z" -> 33), Map("x" -> 111, "y" -> 222, "z" -> 333))
So far I have tried this:
orArray.filterNot(p => p.isEmpty)
But it generates a NullPointerException. How could I filter out those two null values?
You can simply filter on a null check:
orArray.filter(map => map != null)
Output:
Map(x -> 24, y -> 25, z -> 26), Map(x -> 11, y -> 22, z -> 33), Map(x -> 111, y -> 222, z -> 333)
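Another common way to drop nulls is to wrap each element in Option, since Option(null) is None. The following is a small sketch of that alternative with a shortened version of the array from the question; the object name and the val cleaned are made up.

```scala
object DropNullsSketch {
  val orArray: Array[Map[String, Int]] = Array(
    Map("x" -> 24, "y" -> 25, "z" -> 26),
    null,
    Map("x" -> 11, "y" -> 22, "z" -> 33),
    null
  )

  // Option(m) is None for null and Some(m) otherwise,
  // so flatMap keeps only the non-null maps.
  val cleaned: Array[Map[String, Int]] = orArray.flatMap(m => Option(m))

  def main(args: Array[String]): Unit = println(cleaned.toList)
}
```

The original attempt failed because p.isEmpty is called on each element, including the null ones, which is what raises the NullPointerException.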
Hope this helps!

How to reorder a List containing Maps based on Keys using Scala?

How to reorder
List(Map(d -> 4, a -> 1, c -> 3, b -> 2), Map(d -> 8, a -> 2, c -> 6, b -> 4))
to
List(Map(a -> 1, b -> 2, c -> 3, d -> 4), Map(a -> 2, b -> 4, c -> 6, d -> 8))
using Scala?
import scala.collection.immutable.SortedMap
val a = Map('d' -> 4, 'a' -> 1, 'c' -> 3, 'b' -> 2)
val b = Map('d' -> 8, 'a' -> 2, 'c' -> 6, 'b' -> 4)
val c = List(a, b)
val d = c.map(SortedMap[Char, Int]() ++ _)
You can map over the contents of c and create a new SortedMap from the contents of each map.