Reduce/fold over scala sequence with grouping - scala

In Scala, given an Iterable of pairs, say Iterable[(String, Int)],
is there a way to accumulate or fold over the ._2s based on the ._1s? For example, in the following list, add up all the numbers paired with A and, separately, the numbers paired with B:
List(("A", 2), ("B", 1), ("A", 3))
I could do this in two steps with groupBy:
val mapBy1 = list.groupBy( _._1 )
for ((key,sublist) <- mapBy1) yield (key, sublist.foldLeft(0) (_+_._2))
but then I would be allocating the sublists, which I would rather avoid.

You could build the Map as you go and convert it back to a List after the fact.
listOfPairs.foldLeft(Map[String,Int]().withDefaultValue(0)){
case (m,(k,v)) => m + (k -> (v + m(k)))
}.toList

You could do something like:
list.foldLeft(Map[String, Int]()) {
case (map, (k,v)) => map + (k -> (map.getOrElse(k, 0) + v))
}

You could also use groupBy with mapValues:
list.groupBy(_._1).mapValues(_.map(_._2).sum).toList
res1: List[(String, Int)] = List((A,5), (B,1))
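For comparison, here is a self-contained sketch of the single-pass fold next to groupMapReduce. It assumes Scala 2.13 or later, since groupMapReduce does not exist in earlier versions. The object and value names are made up for illustration.

```scala
object SumByKey {
  val pairs: List[(String, Int)] = List(("A", 2), ("B", 1), ("A", 3))

  // Single pass: the Map is the accumulator, so no per-key sublists are built.
  val viaFold: Map[String, Int] =
    pairs.foldLeft(Map.empty[String, Int]) { case (acc, (k, v)) =>
      acc.updated(k, acc.getOrElse(k, 0) + v)
    }

  // Scala 2.13+: group by key, project the value, and combine values in one call.
  val viaGroupMapReduce: Map[String, Int] =
    pairs.groupMapReduce(_._1)(_._2)(_ + _)
}
```

Both values should hold the same totals per key. Call .toList on either if a List of pairs is needed.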

Related

Sum of Values based on key in scala

I am new to Scala. I have a list of integer tuples:
val list = List((1,2,3),(2,3,4),(1,2,3))
val sum = list.groupBy(_._1).mapValues(_.map(_._2).sum)
val sum2 = list.groupBy(_._1).mapValues(_.map(_._3).sum)
How can I do this for N values? The approach above is not a good way to sum N values based on the key.
I have also tried the following:
val sum =list.groupBy(_._1).values.sum => error
val sum =list.groupBy(_._1).mapvalues(_.map(_._2).sum (_._3).sum) error
It's easier to convert these tuples to List[Int] with shapeless and then work with them. Your tuples are actually more like lists anyways. Also, as a bonus, you don't need to change your code at all for lists of Tuple4, Tuple5, etc.
import shapeless._, syntax.std.tuple._
val list = List((1,2,3),(2,3,4),(1,2,3))
list.map(_.toList) // convert tuples to list
.groupBy(_.head) // group by first element of list
.mapValues(_.map(_.tail).map(_.sum).sum) // sums elements of all tails
Result is Map(2 -> 7, 1 -> 10).
val sum = list.groupBy(_._1).map(i => (i._1, i._2.map(j => j._1 + j._2 + j._3).sum))
> sum: scala.collection.immutable.Map[Int,Int] = Map(2 -> 9, 1 -> 12)
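Since a tuple cannot be converted to a List in a type-safe way without a library, the elements have to be added one by one as j._1 + j._2 + j._3. Note that this sum includes the key j._1 itself, which is why the totals differ from those in the other answers.
Using the first element of the tuple as the key and the remaining elements as the values to sum, you could do something like this: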
val list = List((1,2,3),(2,3,4),(1,2,3))
list: List[(Int, Int, Int)] = List((1, 2, 3), (2, 3, 4), (1, 2, 3))
val sum = list.groupBy(_._1).map { case (k, v) => (k -> v.flatMap(_.productIterator.toList.drop(1).map(_.asInstanceOf[Int])).sum) }
sum: Map[Int, Int] = Map(2 -> 7, 1 -> 10)
I know it's a bit dirty to use asInstanceOf[Int], but .productIterator gives you an Iterator of Any.
This will work for any tuple size.
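As a variation on the productIterator idea, the cast can be replaced with a pattern match. The sketch below uses collect to keep only the Int elements, which silently skips any non-Int element instead of throwing. The object and value names are hypothetical.

```scala
object SumRestByKey {
  val rows: List[(Int, Int, Int)] = List((1, 2, 3), (2, 3, 4), (1, 2, 3))

  // Group by the first element, then sum every remaining Int in each group.
  val sums: Map[Int, Int] =
    rows.groupBy(_._1).map { case (key, group) =>
      // drop(1) skips the key; collect keeps only elements that are Ints.
      key -> group.flatMap(_.productIterator.drop(1).collect { case i: Int => i }).sum
    }
}
```

Whether skipping non-Int elements is preferable to failing fast depends on the data. With mixed-type tuples, the asInstanceOf version would throw a ClassCastException instead.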

How to sum a List[(Char,Int)] into a Map[Char,Int] in Scala?

I've got list of pairs:
List(('a',3),('b',3),('a',1))
and I would like to transform it by grouping by _1 and summing _2. The result should be like
Map('a'->4, 'b' -> 3)
I'm very new to Scala, so please be kind :)
More direct version. We fold over the list, using a Map as the accumulator. The withDefaultValue means we don't have to test if we have the entry in the map already.
val xs = List(('a',3),('b',3),('a',1))
xs.foldLeft(Map[Char, Int]().withDefaultValue(0)) {
case (m, (c, i)) => m.updated(c, m(c) + i)
}
//> res0: scala.collection.immutable.Map[Char,Int] = Map(a -> 4, b -> 3)
list.groupBy(_._1).mapValues(_.map(_._2).sum)
which can be written as
list.groupBy(_._1).mapValues { tuples =>
val ints = tuples.map { case (c, i) => i }
ints.sum
}

Functional Creating a list based on values in Scala

I have a task to traverse a sequence of tuples and, based on the last value in each tuple, make one or more copies of a case class Item. I can solve this task with foreach and a mutable list. As I'm learning FP and Scala collections, could it be done in a more functional way with immutable collections and higher-order functions?
For example, input:
List(("A", 2), ("B", 3), ...)
Output:
List(Item("A"), Item("A"), Item("B"), Item("B"), Item("B"), ...)
For each tuple, flatMap using List.fill[A](n: Int)(elem: ⇒ A), which produces a List containing elem n times.
scala> val xs = List(("A", 2), ("B", 3), ("C", 4))
xs: List[(String, Int)] = List((A,2), (B,3), (C,4))
scala> case class Item(s: String)
defined class Item
scala> xs.flatMap(x => List.fill(x._2)(Item(x._1)))
res2: List[Item] = List(Item(A), Item(A), Item(B), Item(B), Item(B), Item(C), Item(C), Item(C), Item(C))
Using flatten for case class Item(v: String) as follows
myList.map{ case(s,n) => List.fill(n)(Item(s)) }.flatten
Also with a for comprehension like this,
for ( (s,n) <- myList ; l <- List.fill(n)(Item(s)) ) yield l
which is syntax sugar for a call to flatMap.
In addition to List.fill consider List.tabulate for initialising lists, for instance in this way,
for ( (s,n) <- myList ; l <- List.tabulate(n)(_ => Item(s)) ) yield l

Play Scala - groupBy remove repetitive values

I apply groupBy function to my List collection, however I want to remove the repetitive values in the value part of the Map. Here is the initial List collection:
PO_ID PRODUCT_ID RETURN_QTY
1 1 10
1 1 20
1 2 30
1 2 10
When I apply groupBy to that List, it will produce something like this:
(1, 1) -> (1, 1, 10),(1, 1, 20)
(1, 2) -> (1, 2, 30),(1, 2, 10)
What I really want is something like this:
(1, 1) -> (10),(20)
(1, 2) -> (30),(10)
So, is there any way to remove the repeated part [(1,1),(1,2)] from the Map's values?
Thanks..
For
val a = Seq( (1,1,10), (1,1,20), (1,2,30), (1,2,10) )
consider
a.groupBy( v => (v._1,v._2) ).mapValues( _.map (_._3) )
which delivers
Map((1,1) -> List(10, 20), (1,2) -> List(30, 10))
Note that mapValues is applied to each group's List of triplets obtained from groupBy, whereas the inner map extracts the third element of each triplet.
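On Scala 2.13 or later (an assumption; the method is not available in older versions), groupMap combines the grouping and the projection into one call. A minimal sketch with hypothetical object and value names:

```scala
object GroupThird {
  val a: Seq[(Int, Int, Int)] = Seq((1, 1, 10), (1, 1, 20), (1, 2, 30), (1, 2, 10))

  // First function picks the key, second picks what to keep from each element.
  val byKey: Map[(Int, Int), Seq[Int]] =
    a.groupMap(v => (v._1, v._2))(_._3)
}
```

The result is a strict Map, so there is no lazy view to worry about.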
Is it easier to pull the tuple apart first?
scala> val ts = Seq( (1,1,10), (1,1,20), (1,2,30), (1,2,10) )
ts: Seq[(Int, Int, Int)] = List((1,1,10), (1,1,20), (1,2,30), (1,2,10))
scala> ts map { case (a,b,c) => (a,b) -> c }
res0: Seq[((Int, Int), Int)] = List(((1,1),10), ((1,1),20), ((1,2),30), ((1,2),10))
scala> ((Map.empty[(Int, Int), List[Int]] withDefaultValue List.empty[Int]) /: res0) { case (m, (k,v)) => m + ((k, m(k) :+ v)) }
res1: scala.collection.immutable.Map[(Int, Int),List[Int]] = Map((1,1) -> List(10, 20), (1,2) -> List(30, 10))
Guess not.

Fold from Map[String,List[Int]] to Map[String,Int]

I'm fairly new to Scala and functional approaches in general. I have a Map that looks something like this:
val myMap: Map[String, List[Int]]
I want to end up something that maps the key to the total of the associated list:
val totalsMap: Map[String, Int]
My initial hunch was to use a for comprehension:
val totalsMap = for (kvPair <- myMap) {
kvPair._2.foldLeft(0)(_+_)
}
But I have no idea what I would put in the yield() clause in order to get a map out of the for comprehension.
You can use mapValues for this,
val totalMap = myMap.mapValues(_.sum)
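But in Scala 2.12 and earlier, mapValues returns a lazy view, so the sum is recalculated every time you look up a key: if you call totalMap("a") several times, the sum is recomputed each time. (In Scala 2.13, mapValues on a Map is deprecated in favor of .view.mapValues, which is explicitly lazy.)
If you don't want this, use map, which builds a strict Map: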
val totalMap = myMap map {
case (k, v) => k -> v.sum
}
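The recomputation can be observed with a counter. The sketch below assumes Scala 2.13 or later and uses .view.mapValues to make the laziness explicit. The object name, the counter, and the other value names are hypothetical.

```scala
object MapValuesLaziness {
  val myMap: Map[String, List[Int]] =
    Map("a" -> List(1, 2, 3), "b" -> List(4, 5, 6))

  // Counts how many times the summing function actually runs.
  var calls = 0

  // A lazy view: nothing is summed at this point.
  val lazyTotals = myMap.view.mapValues { xs => calls += 1; xs.sum }

  val first = lazyTotals("a")  // runs the summing function
  val second = lazyTotals("a") // runs it again for the same key
  val callsAfterLookups = calls

  // Forcing the view into a strict Map computes the sums up front.
  val strictTotals: Map[String, Int] = lazyTotals.toMap
  val callsAfterToMap = calls

  val third = strictTotals("a") // plain lookup, no recomputation
  val callsAtEnd = calls
}
```

Here callsAfterLookups ends up at 2 because each lookup on the view re-runs the function, while lookups on strictTotals leave the counter unchanged.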
mapValues would be more suited for this case:
val m = Map[String, List[Int]]("a" -> List(1,2,3), "b" -> List(4,5,6))
m.mapValues(_.foldLeft(0)(_+_))
res1: scala.collection.immutable.Map[String,Int] = Map(a -> 6, b -> 15)
Or without foldLeft:
m.mapValues(_.sum)
val m = Map("hello" -> Seq(1, 1, 1, 1), "world" -> Seq(1, 1))
for ((k, v) <- m) yield (k, v.sum)
yields
Map(hello -> 4, world -> 2)
The for comprehension will return whatever monadic type you give it. In this case, m is a Map, so that's what's going to come out. The yield must return a tuple. The first element (which becomes the key in each Map entry) is the word you're counting, and the second element (you guessed it, the value in each Map entry) becomes the sum of the original sequence of counts.