Scala - reducing a map based on changed key

Let's say I have the following Map data:
val testMap: Map[String, Int] = Map("AAA_abc" -> 1,
"AAA_anghesh" -> 2,
"BBB_wfejw" -> 3,
"BBB_qgqwe" -> 4,
"C_fkee" -> 5)
Now I want to reduce the map by key.split("_").head and add up the values of all keys that become equal. For this example, the result should be:
Map(AAA -> 3, BBB -> 7, C -> 5)
What would be the correct way to do so in Scala?
I tried constructions with groupBy and reduceLeft but could not find a solution.

Here's a way to do it:
testMap.groupBy(_._1.split("_").head).mapValues(_.values.sum)

A variation in one pass:
testMap.foldLeft(Map[String, Int]()) { (map, kv) =>
  val key = kv._1.split("_").head
  val previous = map.getOrElse(key, 0)
  map.updated(key, previous + kv._2)
}
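As a hedged sketch, the grouping approach can also be written with a strict map over the grouped entries instead of mapValues, which avoids depending on how mapValues behaves in a given Scala version. The names PrefixSums and sumByPrefix are made up for this illustration.

```scala
object PrefixSums {
  val testMap: Map[String, Int] = Map(
    "AAA_abc" -> 1,
    "AAA_anghesh" -> 2,
    "BBB_wfejw" -> 3,
    "BBB_qgqwe" -> 4,
    "C_fkee" -> 5)

  // Group entries by the part of the key before the first "_",
  // then sum the values of each group into a new, fully built Map.
  def sumByPrefix(m: Map[String, Int]): Map[String, Int] =
    m.groupBy { case (k, _) => k.split("_").head }
      .map { case (prefix, group) => prefix -> group.values.sum }

  val result: Map[String, Int] = sumByPrefix(testMap)
}
```

Here PrefixSums.result contains AAA -> 3, BBB -> 7 and C -> 5, matching the result asked for in the question.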

Related

How to add all the values of a map without using recursion or var

I want to add all the values in a map without using var or any mutable structures. I have tried something like this, but it doesn't work:
val mymap = ("a" -> 1, "b" -> 2)
val sum_of_alcohol_consumption =
for ((k,v) <- mymap ) yield (sum_of_alcohol_consumption += v)
I have been told that I can use .sum on a list
Please help
Thanks
You can use the .values method of a Map to get an Iterable of its values (here, all of the Ints) and then call .sum on that:
val myMap = Map("a" -> 1, "b" -> 2)
val sum = myMap.values.sum
println(sum) // Outputs: 3
An alternative to the more elegant sum is a fold operation. sum is implemented in a manner similar to this:
val myMap = Map("a" -> 1, "b" -> 2)
val sumAlcoholConsumption = myMap.values.foldLeft(0)(_ + _)
values returns an Iterable containing only the values in the map. The first foldLeft argument is the zero value (think of it as the initial value of an accumulator) for the operation. The second argument is a function that adds the current element to the accumulator and returns the sum of the two; it is applied to each value in turn. That said, sum is a lot more convenient.
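To make the accumulator steps visible, here is a small sketch on a plain List (chosen only so that the element order is obvious). scanLeft is used purely for illustration: it keeps every intermediate accumulator value that foldLeft would pass along. The object name FoldTrace is hypothetical.

```scala
object FoldTrace {
  val values: List[Int] = List(1, 2, 3)

  // foldLeft starts from the zero value 0 and adds each element in turn:
  // ((0 + 1) + 2) + 3
  val total: Int = values.foldLeft(0)(_ + _)

  // scanLeft applies the same function but keeps every intermediate
  // accumulator, starting with the zero value itself.
  val steps: List[Int] = values.scanLeft(0)(_ + _)
}
```

FoldTrace.steps is List(0, 1, 3, 6), and its last element equals FoldTrace.total.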
To get only the values of a map, use its values method, which returns an Iterable; you can apply sum to it directly.
scala> val mymap = Map("a" -> 1, "b" -> 2)
mymap: scala.collection.immutable.Map[String,Int] = Map(a -> 1, b -> 2)
scala> mymap.values.sum
res7: Int = 3

Sum of Values based on key in scala

I am new to Scala. I have a List of tuples of integers:
val list = List((1,2,3),(2,3,4),(1,2,3))
val sum = list.groupBy(_._1).mapValues(_.map(_._2)).sum
val sum2 = list.groupBy(_._1).mapValues(_.map(_._3)).sum
How can I do this for N values? The code above works, but it is not a good way to sum N values based on a key.
I have also tried the following:
val sum = list.groupBy(_._1).values.sum // error
val sum = list.groupBy(_._1).mapvalues(_.map(_._2).sum (_._3).sum) // error
It's easier to convert these tuples to List[Int] with shapeless and then work with them. Your tuples are actually more like lists anyways. Also, as a bonus, you don't need to change your code at all for lists of Tuple4, Tuple5, etc.
import shapeless._, syntax.std.tuple._
val list = List((1,2,3),(2,3,4),(1,2,3))
list.map(_.toList) // convert tuples to list
.groupBy(_.head) // group by first element of list
.mapValues(_.map(_.tail).map(_.sum).sum) // sums elements of all tails
Result is Map(2 -> 7, 1 -> 10).
val sum = list.groupBy(_._1).map(i => (i._1, i._2.map(j => j._1 + j._2 + j._3).sum))
> sum: scala.collection.immutable.Map[Int,Int] = Map(2 -> 9, 1 -> 12)
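Since a tuple cannot be converted to a List in a type-safe way, the elements have to be added one by one as j._1 + j._2 + j._3. Note that this version also adds the key (j._1) into each sum, which is why its totals differ from those in the other answers.
Using the first element of the tuple as the key and the remaining elements as the values to sum, you could do something like this: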
val list = List((1,2,3),(2,3,4),(1,2,3))
list: List[(Int, Int, Int)] = List((1, 2, 3), (2, 3, 4), (1, 2, 3))
val sum = list.groupBy(_._1).map { case (k, v) => (k -> v.flatMap(_.productIterator.toList.drop(1).map(_.asInstanceOf[Int])).sum) }
sum: Map[Int, Int] = Map(2 -> 7, 1 -> 10)
I know it's a bit dirty to use asInstanceOf[Int], but productIterator gives you an Iterator of Any.
This will work for any tuple size.
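A possible variation on the productIterator idea, sketched here under the assumption that every element after the key is an Int: collect with a type pattern keeps only Int elements, so no asInstanceOf cast is needed. TupleTailSums and sumTailsByKey are illustrative names, and the key type is left as Any because productElement is untyped.

```scala
object TupleTailSums {
  val list = List((1, 2, 3), (2, 3, 4), (1, 2, 3))

  // Group rows by their first element, then sum every Int that follows it.
  // Works for tuples of any arity because all tuples extend Product.
  def sumTailsByKey(rows: List[Product]): Map[Any, Int] =
    rows.groupBy(_.productElement(0)).map { case (key, group) =>
      key -> group.flatMap(_.productIterator.drop(1).collect { case i: Int => i }).sum
    }

  val result: Map[Any, Int] = sumTailsByKey(list)
}
```

For the list above, the result maps 1 to 10 and 2 to 7, the same totals as the shapeless answer. Elements that are not Ints would be silently skipped, which may or may not be what you want.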

Fold from Map[String,List[Int]] to Map[String,Int]

I'm fairly new to Scala and functional approaches in general. I have a Map that looks something like this:
val myMap: Map[String, List[Int]]
I want to end up with something that maps each key to the total of the associated list:
val totalsMap: Map[String, Int]
My initial hunch was to use a for comprehension:
val totalsMap = for (kvPair <- myMap) {
kvPair._2.foldLeft(0)(_+_)
}
But I have no idea what I would put in the yield() clause in order to get a map out of the for comprehension.
You can use mapValues for this,
val totalMap = myMap.mapValues(_.sum)
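But mapValues returns a lazy view, so it recalculates the sum every time you look up a key. For example, if you call totalMap("a") multiple times, the sum is recomputed each time. (In Scala 2.13 and later, mapValues on a Map is deprecated in favor of .view.mapValues, which makes this laziness explicit.)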
If you don't want this, you should use
val totalMap = myMap map {
case (k, v) => k -> v.sum
}
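The strict behavior of map can be checked with a small sketch. The counter sumCalls is a mutable variable added only to observe how often a list is summed; it is not something you would keep in real code, and the object name StrictTotals is made up.

```scala
object StrictTotals {
  // Counts how often a list is summed (for illustration only).
  var sumCalls = 0

  val myMap: Map[String, List[Int]] =
    Map("a" -> List(1, 2, 3), "b" -> List(4, 5, 6))

  // map builds a new, fully evaluated Map: each list is summed exactly once here.
  val totalMap: Map[String, Int] = myMap.map { case (k, v) =>
    sumCalls += 1
    k -> v.sum
  }

  // Repeated lookups read the stored totals and trigger no further sums.
  val lookups: List[Int] = List(totalMap("a"), totalMap("a"), totalMap("b"))
}
```

After the three lookups, sumCalls is still 2: one sum per key, performed when totalMap was built.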
mapValues would be more suited for this case:
val m = Map[String, List[Int]]("a" -> List(1,2,3), "b" -> List(4,5,6))
m.mapValues(_.foldLeft(0)(_+_))
res1: scala.collection.immutable.Map[String,Int] = Map(a -> 6, b -> 15)
Or without foldLeft:
m.mapValues(_.sum)
val m = Map("hello" -> Seq(1, 1, 1, 1), "world" -> Seq(1, 1))
for ((k, v) <- m) yield (k, v.sum)
yields
Map(hello -> 4, world -> 2)
The for comprehension will return whatever monadic type you give it. In this case, m is a Map, so that's what's going to come out. The yield must return a tuple. The first element (which becomes the key in each Map entry) is the word you're counting, and the second element (you guessed it, the value in each Map entry) becomes the sum of the original sequence of counts.
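As a rough sketch of what this means in practice: the comprehension is translated into a call to map on the original Map, so yielding a pair gives a Map back, while yielding anything else gives a plain collection. ForYieldDemo is an illustrative name.

```scala
object ForYieldDemo {
  val m: Map[String, Seq[Int]] =
    Map("hello" -> Seq(1, 1, 1, 1), "world" -> Seq(1, 1))

  // Yielding a (key, value) pair produces a Map again.
  val viaFor: Map[String, Int] = for ((k, v) <- m) yield (k, v.sum)

  // Roughly what the comprehension above is translated to.
  val viaMap: Map[String, Int] = m.map { case (k, v) => (k, v.sum) }

  // Yielding a non-pair value gives an ordinary collection, not a Map.
  val sumsOnly: Iterable[Int] = for ((_, v) <- m) yield v.sum
}
```

Both viaFor and viaMap contain hello -> 4 and world -> 2.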

Different Representations of Scala HashMap

I've been playing around with the Scala HashMap and I've noticed two different representations of the HashMap. I was wondering if somebody could explain the difference of:
Map(123 -> 1)
and
{123=1}
Thanks!
Where have you seen {123=1}? It's not a standard representation in Scala, but it is the way Java defines toString for its Maps.
val sm = Map(1->1, 2->2) // Map(1 -> 1, 2 -> 2)
val jm = new java.util.HashMap[Int,Int]()
jm.put(1,1)
jm.put(2,2)
jm
// java.util.HashMap[Int,Int] = {1=1, 2=2}
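The two representations from the question can be reproduced side by side with a minimal sketch that only calls toString on each kind of map. MapToStrings is a made-up name.

```scala
object MapToStrings {
  // Scala immutable Map with the single entry from the question.
  val scalaMap: Map[Int, Int] = Map(123 -> 1)

  // Java HashMap holding the same entry.
  val javaMap = new java.util.HashMap[Int, Int]()
  javaMap.put(123, 1)

  val scalaText: String = scalaMap.toString // Scala's own format
  val javaText: String = javaMap.toString   // format defined by java.util.AbstractMap
}
```

scalaText is "Map(123 -> 1)" and javaText is "{123=1}", so seeing the second form usually means a Java map is being printed somewhere.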
-> is a method that creates tuples. By itself it doesn't directly have anything to do with maps. So for example 123 -> 1 returns a tuple (123, 1). You can try this in the REPL:
scala> 123 -> 1
res1: (Int, Int) = (123,1)
You can create a map by supplying tuples to object Map's apply method, which is what you are doing when you do this:
val m = Map(123 -> 1, 456 -> 2)
is the same as
val m = Map.apply(123 -> 1, 456 -> 2)
is the same as
val m = Map.apply((123, 1), (456, 2))
which creates a Map with two entries, one with key 123 and value 1, the other one with key 456 and value 2.
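A short sketch confirming that these forms are interchangeable. The last line is an extra illustration not taken from the answer: because entries are ordinary tuples, a list of pairs can be turned into a Map with toMap. ArrowDemo is a hypothetical name.

```scala
object ArrowDemo {
  // -> builds an ordinary Tuple2.
  val pair: (Int, Int) = 123 -> 1
  val sameAsTuple: Boolean = pair == ((123, 1))

  // Three equivalent ways of building the same Map.
  val a: Map[Int, Int] = Map(123 -> 1, 456 -> 2)
  val b: Map[Int, Int] = Map.apply(123 -> 1, 456 -> 2)
  val c: Map[Int, Int] = Map.apply((123, 1), (456, 2))

  // Any collection of pairs can be converted to a Map as well.
  val d: Map[Int, Int] = List((123, 1), (456, 2)).toMap
}
```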

Scala: How to create a Map[K,V] from a Set[K] and a function from K to V?

What is the best way to create a Map[K,V] from a Set[K] and function from K to V?
For example, suppose I have
scala> val s = Set(2, 3, 5)
s: scala.collection.immutable.Set[Int] = Set(2, 3, 5)
and
scala> def func(i: Int) = "" + i + i
func: (i: Int)java.lang.String
What is the easiest way of creating a Map[Int, String](2 -> "22", 3 -> "33", 5 -> "55")?
You can use foldLeft:
val func2 = (r: Map[Int,String], i: Int) => r + (i -> func(i))
s.foldLeft(Map.empty[Int,String])(func2)
This will perform better than Jesper's solution, because foldLeft constructs the Map in one pass. Jesper's code creates an intermediate data structure first, which then needs to be converted to the final Map.
Update: I wrote a micro benchmark testing the speed of each of the answers:
Jesper (original): 35s 738ms
Jesper (improved): 11s 618ms
dbyrne: 11s 906ms
Rex Kerr: 12s 206ms
Eastsun: 11s 988ms
Looks like they are all pretty much the same as long as you avoid constructing an intermediate data structure.
What about this:
(s map { i => i -> func(i) }).toMap
This maps the elements of s to tuples (i, func(i)) and then converts the resulting collection to a Map.
Note: i -> func(i) is the same as (i, func(i)).
dbyrne suggests creating a view of the set first (see his answer and comments), which prevents an intermediate collection from being made, improving performance:
(s.view map { i => i -> func(i) }).toMap
scala> import collection.breakOut
import collection.breakOut
scala> val set = Set(2,3,5)
set: scala.collection.immutable.Set[Int] = Set(2, 3, 5)
scala> def func(i: Int) = ""+i+i
func: (i: Int)java.lang.String
scala> val map: Map[Int,String] = set.map(i => i -> func(i))(breakOut)
map: Map[Int,String] = Map(2 -> 22, 3 -> 33, 5 -> 55)
In addition to the existing answers,
Map() ++ set.view.map(i => i -> func(i))
is pretty short and performs as well as the faster answers (fold/breakOut).
(Note the view to prevent creation of a new collection; it does the remapping as it goes.)
The other solutions lack creativity. Here's my own version, though I'd really like to get rid of the _.head map.
s groupBy identity mapValues (_.head) mapValues func
As with all great languages, there's a million ways to do everything.
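Here's a strategy that zips the set's elements with their mapped values. A Set has no guaranteed iteration order, so convert it to a sequence first; otherwise the two sides of the zip may not line up.
val s = Set(1,2,3,4,5)
val keys = s.toSeq
Map(keys.zip(keys.map(_.toString)) : _*)
EDIT: (_.toString) could be replaced with any function that returns something of type V.
Without defining func(i: Int), using the string-repeating operator *: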
scala> s map { x => x -> x.toString*2 } toMap
res2: scala.collection.immutable.Map[Int,String] = Map(2 -> 22, 3 -> 33, 5 -> 55)
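For reference, here is a sketch of three of the approaches above written so that they should compile on both older and newer Scala 2 releases. As far as I know, breakOut is no longer available from Scala 2.13 on, so it is left out. SetToMap is an illustrative name, and the iterator variant is my substitution for the view-based answers, not one of the original solutions.

```scala
object SetToMap {
  val s: Set[Int] = Set(2, 3, 5)
  def func(i: Int): String = "" + i + i

  // Map each element to a (key, value) pair, then convert the pairs to a Map.
  val viaMap: Map[Int, String] = s.map(i => i -> func(i)).toMap

  // Same idea through an iterator, so no intermediate Set of pairs is built.
  val viaIterator: Map[Int, String] = s.iterator.map(i => i -> func(i)).toMap

  // Build the Map directly, adding one entry per element.
  val viaFold: Map[Int, String] =
    s.foldLeft(Map.empty[Int, String])((acc, i) => acc + (i -> func(i)))
}
```

All three values hold the same entries: 2 -> "22", 3 -> "33" and 5 -> "55".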