Scala: idiomatic way to merge list of maps with the greatest value of each key? - scala

I have a List of Map[Int, Int], that all have the same keys (from 1 to 20) and I'd like to merge their contents into a single Map[Int, Int].
I've read another post on stack overflow about merging maps that uses |+| from the scalaz library.
I've come up with the following solution, but it seems clunky to me.
val defaultMap = (2 to ceiling).map((_,0)).toMap
val factors: Map[Int, Int] = (2 to ceiling). map(primeFactors(_)).
foldRight(defaultMap)(mergeMaps(_, _))
def mergeMaps(xm: Map[Int, Int], ym: Map[Int, Int]): Map[Int,Int] = {
def iter(acc: Map[Int,Int], other: Map[Int,Int], i: Int): Map[Int,Int] = {
if (other.isEmpty) acc
else iter(acc - i + (i -> math.max(acc(i), other(i))), other - i, i + 1)
}
iter(xm, ym, 2)
}
def primeFactors(number: Int): Map[Int, Int] = {
def iter(factors: Map[Int,Int], rem: Int, i: Int): Map[Int,Int] = {
if (i > number) factors
else if (rem % i == 0) iter(factors - i + (i -> (factors(i)+1)), rem / i, i)
else iter(factors, rem, i + 1)
}
iter((2 to ceiling).map((_,0)).toMap, number, 2)
}
Explanation: val factors creates a list of maps that each represent the prime factors for the numbers from 2-20; then these 18 maps are folded into a single map containing the greatest value for each key.
UPDATE
Using the suggestion of #folone, I end up with the following code (a definite improvement over my original version, and I don't have to change the Maps to HashMaps):
import scalaz._
import Scalaz._
import Tags._
/**
* Smallest Multiple
*
* 2520 is the smallest number that can be divided by each of the numbers
* from 1 to 10 without any remainder. What is the smallest positive number
* that is evenly divisible by all of the numbers from 1 to 20?
*
* User: Alexandros Bantis
* Date: 1/29/13
* Time: 8:07 PM
*/
object Problem005 {
def findSmallestMultiple(ceiling: Int): Int = {
val factors = (2 to ceiling).map(primeFactors(_).mapValues(MaxVal)).reduce(_ |+| _)
(1 /: factors.map(m => intPow(m._1, m._2)))(_ * _)
}
private def primeFactors(number: Int): Map[Int, Int] = {
def iter(factors: Map[Int,Int], rem: Int, i: Int): Map[Int,Int] = {
if (i > number) factors.filter(_._2 > 0).mapValues(MaxVal)
else if (rem % i == 0) iter(factors - i + (i -> (factors(i)+1)), rem / i, i)
else iter(factors, rem, i + 1)
}
iter((2 to number).map((_,0)).toMap, number, 2)
}
private def intPow(x: Int, y: Int): Int = {
def iter(acc: Int, rem: Int): Int = {
if (rem == 0) acc
else iter(acc * x, rem -1)
}
if (y == 0) 1 else iter(1, y)
}
}

This solution does not work for general Maps, but if you are using immutable.HashMaps you may consider the merged method:
def merged[B1 >: B](that: HashMap[A, B1])(mergef: ((A, B1), (A, B1)) ⇒ (A, B1)): HashMap[A, B1]
Creates a new map which is the merge of this and the argument hash
map.
Uses the specified collision resolution function if two keys are the
same. The collision resolution function will always take the first
argument from this hash map and the second from that.
The merged method is on average more performant than doing a traversal
and reconstructing a new immutable hash map from scratch, or ++.
Use case:
val m1 = immutable.HashMap[Int, Int](1 -> 2, 2 -> 3)
val m2 = immutable.HashMap[Int, Int](1 -> 3, 4 -> 5)
m1.merged(m2) {
case ((k1, v1), (k2, v2)) => ((k1, math.max(v1, v2)))
}

As your tags suggest, you might be interested in a scalaz solution. Here goes:
> console
[info] Starting scala interpreter...
[info]
Welcome to Scala version 2.10.0 (OpenJDK 64-Bit Server VM, Java 1.7.0_15).
Type in expressions to have them evaluated.
Type :help for more information.
scala> import scalaz._, Scalaz._, Tags._
import scalaz._
import Scalaz._
import Tags._
There exists a Semigroup instance for Ints under a maximum operation:
scala> Semigroup[Int ## MaxVal]
res0: scalaz.Semigroup[scalaz.##[Int,scalaz.Tags.MaxVal]] = scalaz.Semigroup$$anon$9#15a9a9c6
Let's just use it:
scala> val m1 = Map(1 -> 2, 2 -> 3) mapValues MaxVal
m1: scala.collection.immutable.Map[Int,scalaz.##[Int,scalaz.Tags.MaxVal]] = Map(1 -> 2, 2 -> 3)
scala> val m2 = Map(1 -> 3, 4 -> 5) mapValues MaxVal
m2: scala.collection.immutable.Map[Int,scalaz.##[Int,scalaz.Tags.MaxVal]] = Map(1 -> 3, 4 -> 5)
scala> m1 |+| m2
res1: scala.collection.immutable.Map[Int,scalaz.##[Int,scalaz.Tags.MaxVal]] = Map(1 -> 3, 4 -> 5, 2 -> 3)
If you're interested in how this "tagging" (the ## thing) works, here's a good explanation: http://etorreborre.blogspot.de/2011/11/practical-uses-for-unboxed-tagged-types.html

Starting Scala 2.13, another solution only based on the standard library consists in merging the Maps as sequences before applying a groupMapReduce which (as its name suggests) is an equivalent of a groupBy followed by a mapping and a reduce step on values:
// val map1 = Map(1 -> 2, 2 -> 3)
// val map2 = Map(1 -> 3, 4 -> 5)
(map1.toSeq ++ map2).groupMapReduce(_._1)(_._2)(_ max _)
// Map[Int,Int] = Map(2 -> 3, 4 -> 5, 1 -> 3)
This:
concatenates the two maps as a sequence of tuples (List((1,2), (2,3), (1,3), (4,5))). For conciseness, map2 is implicitly converted to Seq to adopt the type of map1.toSeq - but you could choose to make it explicit by using map2.toSeq.
groups elements based on their first tuple part (group part of groupMapReduce)
maps grouped values to their second tuple part (map part of groupMapReduce)
reduces mapped values (_ max _) by taking their max (reduce part of groupMapReduce)

Related

Most efficient way to create matrix from Map's [duplicate]

val map1 = Map(1 -> 9 , 2 -> 20)
val map2 = Map(1 -> 100, 3 -> 300)
I want to merge them, and sum the values of same keys. So the result will be:
Map(2->20, 1->109, 3->300)
Now I have 2 solutions:
val list = map1.toList ++ map2.toList
val merged = list.groupBy ( _._1) .map { case (k,v) => k -> v.map(_._2).sum }
and
val merged = (map1 /: map2) { case (map, (k,v)) =>
map + ( k -> (v + map.getOrElse(k, 0)) )
}
But I want to know if there are any better solutions.
The shortest answer I know of that uses only the standard library is
map1 ++ map2.map{ case (k,v) => k -> (v + map1.getOrElse(k,0)) }
Scalaz has the concept of a Semigroup which captures what you want to do here, and leads to arguably the shortest/cleanest solution:
scala> import scalaz._
import scalaz._
scala> import Scalaz._
import Scalaz._
scala> val map1 = Map(1 -> 9 , 2 -> 20)
map1: scala.collection.immutable.Map[Int,Int] = Map(1 -> 9, 2 -> 20)
scala> val map2 = Map(1 -> 100, 3 -> 300)
map2: scala.collection.immutable.Map[Int,Int] = Map(1 -> 100, 3 -> 300)
scala> map1 |+| map2
res2: scala.collection.immutable.Map[Int,Int] = Map(1 -> 109, 3 -> 300, 2 -> 20)
Specifically, the binary operator for Map[K, V] combines the keys of the maps, folding V's semigroup operator over any duplicate values. The standard semigroup for Int uses the addition operator, so you get the sum of values for each duplicate key.
Edit: A little more detail, as per user482745's request.
Mathematically a semigroup is just a set of values, together with an operator that takes two values from that set, and produces another value from that set. So integers under addition are a semigroup, for example - the + operator combines two ints to make another int.
You can also define a semigroup over the set of "all maps with a given key type and value type", so long as you can come up with some operation that combines two maps to produce a new one which is somehow the combination of the two inputs.
If there are no keys that appear in both maps, this is trivial. If the same key exists in both maps, then we need to combine the two values that the key maps to. Hmm, haven't we just described an operator which combines two entities of the same type? This is why in Scalaz a semigroup for Map[K, V] exists if and only if a Semigroup for V exists - V's semigroup is used to combine the values from two maps which are assigned to the same key.
So because Int is the value type here, the "collision" on the 1 key is resolved by integer addition of the two mapped values (as that's what Int's semigroup operator does), hence 100 + 9. If the values had been Strings, a collision would have resulted in string concatenation of the two mapped values (again, because that's what the semigroup operator for String does).
(And interestingly, because string concatenation is not commutative - that is, "a" + "b" != "b" + "a" - the resulting semigroup operation isn't either. So map1 |+| map2 is different from map2 |+| map1 in the String case, but not in the Int case.)
Quick solution:
(map1.keySet ++ map2.keySet).map {i=> (i,map1.getOrElse(i,0) + map2.getOrElse(i,0))}.toMap
Well, now in scala library (at least in 2.10) there is something you wanted - merged function. BUT it's presented only in HashMap not in Map. It's somewhat confusing. Also the signature is cumbersome - can't imagine why I'd need a key twice and when I'd need to produce a pair with another key. But nevertheless, it works and much cleaner than previous "native" solutions.
val map1 = collection.immutable.HashMap(1 -> 11 , 2 -> 12)
val map2 = collection.immutable.HashMap(1 -> 11 , 2 -> 12)
map1.merged(map2)({ case ((k,v1),(_,v2)) => (k,v1+v2) })
Also in scaladoc mentioned that
The merged method is on average more performant than doing a
traversal and reconstructing a new immutable hash map from
scratch, or ++.
This can be implemented as a Monoid with just plain Scala. Here is a sample implementation. With this approach, we can merge not just 2, but a list of maps.
// Monoid trait
trait Monoid[M] {
def zero: M
def op(a: M, b: M): M
}
The Map based implementation of the Monoid trait that merges two maps.
val mapMonoid = new Monoid[Map[Int, Int]] {
override def zero: Map[Int, Int] = Map()
override def op(a: Map[Int, Int], b: Map[Int, Int]): Map[Int, Int] =
(a.keySet ++ b.keySet) map { k =>
(k, a.getOrElse(k, 0) + b.getOrElse(k, 0))
} toMap
}
Now, if you have a list of maps that needs to be merged (in this case, only 2), it can be done like below.
val map1 = Map(1 -> 9 , 2 -> 20)
val map2 = Map(1 -> 100, 3 -> 300)
val maps = List(map1, map2) // The list can have more maps.
val merged = maps.foldLeft(mapMonoid.zero)(mapMonoid.op)
map1 ++ ( for ( (k,v) <- map2 ) yield ( k -> ( v + map1.getOrElse(k,0) ) ) )
I wrote a blog post about this , check it out :
http://www.nimrodstech.com/scala-map-merge/
basically using scalaz semi group you can achieve this pretty easily
would look something like :
import scalaz.Scalaz._
map1 |+| map2
You can also do that with Cats.
import cats.implicits._
val map1 = Map(1 -> 9 , 2 -> 20)
val map2 = Map(1 -> 100, 3 -> 300)
map1 combine map2 // Map(2 -> 20, 1 -> 109, 3 -> 300)
Starting Scala 2.13, another solution only based on the standard library consists in replacing the groupBy part of your solution with groupMapReduce which (as its name suggests) is an equivalent of a groupBy followed by mapValues and a reduce step:
// val map1 = Map(1 -> 9, 2 -> 20)
// val map2 = Map(1 -> 100, 3 -> 300)
(map1.toSeq ++ map2).groupMapReduce(_._1)(_._2)(_+_)
// Map[Int,Int] = Map(2 -> 20, 1 -> 109, 3 -> 300)
This:
Concatenates the two maps as a sequence of tuples (List((1,9), (2,20), (1,100), (3,300))). For conciseness, map2 is implicitly converted to Seq to adapt to the type of map1.toSeq - but you could choose to make it explicit by using map2.toSeq,
groups elements based on their first tuple part (group part of groupMapReduce),
maps grouped values to their second tuple part (map part of groupMapReduce),
reduces mapped values (_+_) by summing them (reduce part of groupMapReduce).
Andrzej Doyle's answer contains a great explanation of semigroups which allows you to use the |+| operator to join two maps and sum the values for matching keys.
There are many ways something can be defined to be an instance of a typeclass, and unlike the OP you might not want to sum your keys specifically. Or, you might want to do operate on a union rather than an intersection. Scalaz also adds extra functions to Map for this purpose:
https://oss.sonatype.org/service/local/repositories/snapshots/archive/org/scalaz/scalaz_2.11/7.3.0-SNAPSHOT/scalaz_2.11-7.3.0-SNAPSHOT-javadoc.jar/!/index.html#scalaz.std.MapFunctions
You can do
import scalaz.Scalaz._
map1 |+| map2 // As per other answers
map1.intersectWith(map2)(_ + _) // Do things other than sum the values
The fastest and simplest way:
val m1 = Map(1 -> 1.0, 3 -> 3.0, 5 -> 5.2)
val m2 = Map(0 -> 10.0, 3 -> 3.0)
val merged = (m2 foldLeft m1) (
(acc, v) => acc + (v._1 -> (v._2 + acc.getOrElse(v._1, 0.0)))
)
By this way, each of element's immediately added to map.
The second ++ way is:
map1 ++ map2.map { case (k,v) => k -> (v + map1.getOrElse(k,0)) }
Unlike the first way, In a second way for each element in a second map a new List will be created and concatenated to the previous map.
The case expression implicitly creates a new List using unapply method.
Here's what I ended up using:
(a.toSeq ++ b.toSeq).groupBy(_._1).mapValues(_.map(_._2).sum)
This is what I came up with...
def mergeMap(m1: Map[Char, Int], m2: Map[Char, Int]): Map[Char, Int] = {
var map : Map[Char, Int] = Map[Char, Int]() ++ m1
for(p <- m2) {
map = map + (p._1 -> (p._2 + map.getOrElse(p._1,0)))
}
map
}
Using the typeclass pattern, we can merge any Numeric type:
object MapSyntax {
implicit class MapOps[A, B](a: Map[A, B]) {
def plus(b: Map[A, B])(implicit num: Numeric[B]): Map[A, B] = {
b ++ a.map { case (key, value) => key -> num.plus(value, b.getOrElse(key, num.zero)) }
}
}
}
Usage:
import MapSyntax.MapOps
map1 plus map2
Merging a sequence of maps:
maps.reduce(_ plus _)
I've got a small function to do the job, it's in my small library for some frequently used functionality which isn't in standard lib.
It should work for all types of maps, mutable and immutable, not only HashMaps
Here is the usage
scala> import com.daodecode.scalax.collection.extensions._
scala> val merged = Map("1" -> 1, "2" -> 2).mergedWith(Map("1" -> 1, "2" -> 2))(_ + _)
merged: scala.collection.immutable.Map[String,Int] = Map(1 -> 2, 2 -> 4)
https://github.com/jozic/scalax-collection/blob/master/README.md#mergedwith
And here's the body
def mergedWith(another: Map[K, V])(f: (V, V) => V): Repr =
if (another.isEmpty) mapLike.asInstanceOf[Repr]
else {
val mapBuilder = new mutable.MapBuilder[K, V, Repr](mapLike.asInstanceOf[Repr])
another.foreach { case (k, v) =>
mapLike.get(k) match {
case Some(ev) => mapBuilder += k -> f(ev, v)
case _ => mapBuilder += k -> v
}
}
mapBuilder.result()
}
https://github.com/jozic/scalax-collection/blob/master/src%2Fmain%2Fscala%2Fcom%2Fdaodecode%2Fscalax%2Fcollection%2Fextensions%2Fpackage.scala#L190
For anyone coming across an AnyVal error, convert the values as follows.
Error:
"could not find implicit value for parameter num: Numeric[AnyVal]"
(m1.toSeq ++ m2.toSeq).groupBy(_._1).mapValues(_.map(_._2.asInstanceOf[Number].intValue()).sum)

Understanding Scala -> syntax

I am getting a taste of Scala through the artima "Programming in Scala" book.
While presenting the Map traits, the authors go to some lengths to describe the -> syntax as a method that can be applied to any type to get a tuple.
And indeed:
scala> (2->"two")
res1: (Int, String) = (2,two)
scala> (2,"two")
res2: (Int, String) = (2,two)
scala> (2->"two") == (2, "two")
res3: Boolean = true
But those are not equivalent:
scala> Map(1->"one") + (2->"two")
res4: scala.collection.immutable.Map[Int,String] = Map(1 -> one, 2 -> two)
scala> Map(1->"one") + (2, "two")
<console>:8: error: type mismatch;
found : Int(2)
required: (Int, ?)
Map(1->"one") + (2, "two")
Why is this so, since my first tests seem to show that both "pair" syntaxes build a tuple?
Regards.
They are exactly the same, thanks to this class in Predef (only partly reproduced here):
final class ArrowAssoc[A](val __leftOfArrow: A) extends AnyVal {
#inline def -> [B](y: B): Tuple2[A, B] = Tuple2(__leftOfArrow, y)
}
#inline implicit def any2ArrowAssoc[A](x: A): ArrowAssoc[A] = new ArrowAssoc(x)
So now the question is when will (a,b) syntax be ambiguous where (a -> b) is not? And the answer is in function calls, especially when they're overloaded:
def f[A](a: A) = a.toString
def f[A,B](a: A, b: B) = a.hashCode + b.hashCode
f(1,2) // Int = 3
f(1 -> 2) // String = (1,2)
f((1, 2)) // String = (1,2)
Map + in particular gets confused because it's overloaded with a multiple-argument version, so you could
Map(1 -> 2) + (3 -> 4, 4 -> 5, 5 -> 6)
and it thus interprets
Map(1 -> 2) + (3, 4)
as trying to add both 3 to the map, and then 4 to the map. Which of course makes no sense, but it doesn't try the other interpretation.
With -> there is no such ambiguity.
However, you can't
Map(1 -> 2) + 3 -> 4
because + and - have the same precedence. Thus it is interpreted as
(Map(1 -> 2) + 3) -> 4
which again fails because you're trying to add 3 in place of a key-value pair.

scalaz List[StateT].sequence - could not find implicit value for parameter n: scalaz.Applicative

I'm trying to figure out how to use StateT to combine two State state transformers based on a comment on my Scalaz state monad examples answer.
It seems I'm very close but I got an issue when trying to apply sequence.
import scalaz._
import Scalaz._
import java.util.Random
val die = state[Random, Int](r => (r, r.nextInt(6) + 1))
val twoDice = for (d1 <- die; d2 <- die) yield (d1, d2)
def freqSum(dice: (Int, Int)) = state[Map[Int,Int], Int]{ freq =>
val s = dice._1 + dice._2
val tuple = s -> (freq.getOrElse(s, 0) + 1)
(freq + tuple, s)
}
type StateMap[x] = State[Map[Int,Int], x]
val diceAndFreqSum = stateT[StateMap, Random, Int]{ random =>
val (newRandom, dice) = twoDice apply random
for (sum <- freqSum(dice)) yield (newRandom, sum)
}
So I got as far as having a StateT[StateMap, Random, Int] that I can unwrap with initial random and empty map states:
val (freq, sum) = diceAndFreqSum ! new Random(1L) apply Map[Int,Int]()
// freq: Map[Int,Int] = Map(9 -> 1)
// sum: Int = 9
Now I'd like to generate a list of those StateT and use sequence so that I can call list.sequence ! new Random(1L) apply Map[Int,Int](). But when trying this I get:
type StT[x] = StateT[StateMap, Random, x]
val data: List[StT[Int]] = List.fill(10)(diceAndFreqSum)
data.sequence[StT, Int]
//error: could not find implicit value for parameter n: scalaz.Applicative[StT]
data.sequence[StT, Int]
^
Any idea? I can use some help for the last stretch - assuming it's possible.
Ah looking at the scalaz Monad source, I noticed there was an implicit def StateTMonad that confirms that StateT[M, A, x] is a monad for type parameter x. Also monads are applicatives, which was confirmed by looking at the definition of the Monad trait and by poking in the REPL:
scala> implicitly[Monad[StT] <:< Applicative[StT]]
res1: <:<[scalaz.Monad[StT],scalaz.Applicative[StT]] = <function1>
scala> implicitly[Monad[StT]]
res2: scalaz.Monad[StT] = scalaz.MonadLow$$anon$1#1cce278
So this gave me the idea of defining an implicit Applicative[StT] to help the compiler:
type StT[x] = StateT[StateMap, Random, x]
implicit val applicativeStT: Applicative[StT] = implicitly[Monad[StT]]
That did the trick:
val data: List[StT[Int]] = List.fill(10)(diceAndFreqSum)
val (frequencies, sums) =
data.sequence[StT, Int] ! new Random(1L) apply Map[Int,Int]()
// frequencies: Map[Int,Int] = Map(10 -> 1, 6 -> 3, 9 -> 1, 7 -> 1, 8 -> 2, 4 -> 2)
// sums: List[Int] = List(9, 6, 8, 8, 10, 4, 6, 6, 4, 7)

Best way to merge two maps and sum the values of same key?

val map1 = Map(1 -> 9 , 2 -> 20)
val map2 = Map(1 -> 100, 3 -> 300)
I want to merge them, and sum the values of same keys. So the result will be:
Map(2->20, 1->109, 3->300)
Now I have 2 solutions:
val list = map1.toList ++ map2.toList
val merged = list.groupBy ( _._1) .map { case (k,v) => k -> v.map(_._2).sum }
and
val merged = (map1 /: map2) { case (map, (k,v)) =>
map + ( k -> (v + map.getOrElse(k, 0)) )
}
But I want to know if there are any better solutions.
The shortest answer I know of that uses only the standard library is
map1 ++ map2.map{ case (k,v) => k -> (v + map1.getOrElse(k,0)) }
Scalaz has the concept of a Semigroup which captures what you want to do here, and leads to arguably the shortest/cleanest solution:
scala> import scalaz._
import scalaz._
scala> import Scalaz._
import Scalaz._
scala> val map1 = Map(1 -> 9 , 2 -> 20)
map1: scala.collection.immutable.Map[Int,Int] = Map(1 -> 9, 2 -> 20)
scala> val map2 = Map(1 -> 100, 3 -> 300)
map2: scala.collection.immutable.Map[Int,Int] = Map(1 -> 100, 3 -> 300)
scala> map1 |+| map2
res2: scala.collection.immutable.Map[Int,Int] = Map(1 -> 109, 3 -> 300, 2 -> 20)
Specifically, the binary operator for Map[K, V] combines the keys of the maps, folding V's semigroup operator over any duplicate values. The standard semigroup for Int uses the addition operator, so you get the sum of values for each duplicate key.
Edit: A little more detail, as per user482745's request.
Mathematically a semigroup is just a set of values, together with an operator that takes two values from that set, and produces another value from that set. So integers under addition are a semigroup, for example - the + operator combines two ints to make another int.
You can also define a semigroup over the set of "all maps with a given key type and value type", so long as you can come up with some operation that combines two maps to produce a new one which is somehow the combination of the two inputs.
If there are no keys that appear in both maps, this is trivial. If the same key exists in both maps, then we need to combine the two values that the key maps to. Hmm, haven't we just described an operator which combines two entities of the same type? This is why in Scalaz a semigroup for Map[K, V] exists if and only if a Semigroup for V exists - V's semigroup is used to combine the values from two maps which are assigned to the same key.
So because Int is the value type here, the "collision" on the 1 key is resolved by integer addition of the two mapped values (as that's what Int's semigroup operator does), hence 100 + 9. If the values had been Strings, a collision would have resulted in string concatenation of the two mapped values (again, because that's what the semigroup operator for String does).
(And interestingly, because string concatenation is not commutative - that is, "a" + "b" != "b" + "a" - the resulting semigroup operation isn't either. So map1 |+| map2 is different from map2 |+| map1 in the String case, but not in the Int case.)
Quick solution:
(map1.keySet ++ map2.keySet).map {i=> (i,map1.getOrElse(i,0) + map2.getOrElse(i,0))}.toMap
Well, now in scala library (at least in 2.10) there is something you wanted - merged function. BUT it's presented only in HashMap not in Map. It's somewhat confusing. Also the signature is cumbersome - can't imagine why I'd need a key twice and when I'd need to produce a pair with another key. But nevertheless, it works and much cleaner than previous "native" solutions.
val map1 = collection.immutable.HashMap(1 -> 11 , 2 -> 12)
val map2 = collection.immutable.HashMap(1 -> 11 , 2 -> 12)
map1.merged(map2)({ case ((k,v1),(_,v2)) => (k,v1+v2) })
Also in scaladoc mentioned that
The merged method is on average more performant than doing a
traversal and reconstructing a new immutable hash map from
scratch, or ++.
This can be implemented as a Monoid with just plain Scala. Here is a sample implementation. With this approach, we can merge not just 2, but a list of maps.
// Monoid trait
trait Monoid[M] {
def zero: M
def op(a: M, b: M): M
}
The Map based implementation of the Monoid trait that merges two maps.
val mapMonoid = new Monoid[Map[Int, Int]] {
override def zero: Map[Int, Int] = Map()
override def op(a: Map[Int, Int], b: Map[Int, Int]): Map[Int, Int] =
(a.keySet ++ b.keySet) map { k =>
(k, a.getOrElse(k, 0) + b.getOrElse(k, 0))
} toMap
}
Now, if you have a list of maps that needs to be merged (in this case, only 2), it can be done like below.
val map1 = Map(1 -> 9 , 2 -> 20)
val map2 = Map(1 -> 100, 3 -> 300)
val maps = List(map1, map2) // The list can have more maps.
val merged = maps.foldLeft(mapMonoid.zero)(mapMonoid.op)
map1 ++ ( for ( (k,v) <- map2 ) yield ( k -> ( v + map1.getOrElse(k,0) ) ) )
I wrote a blog post about this , check it out :
http://www.nimrodstech.com/scala-map-merge/
basically using scalaz semi group you can achieve this pretty easily
would look something like :
import scalaz.Scalaz._
map1 |+| map2
You can also do that with Cats.
import cats.implicits._
val map1 = Map(1 -> 9 , 2 -> 20)
val map2 = Map(1 -> 100, 3 -> 300)
map1 combine map2 // Map(2 -> 20, 1 -> 109, 3 -> 300)
Starting Scala 2.13, another solution only based on the standard library consists in replacing the groupBy part of your solution with groupMapReduce which (as its name suggests) is an equivalent of a groupBy followed by mapValues and a reduce step:
// val map1 = Map(1 -> 9, 2 -> 20)
// val map2 = Map(1 -> 100, 3 -> 300)
(map1.toSeq ++ map2).groupMapReduce(_._1)(_._2)(_+_)
// Map[Int,Int] = Map(2 -> 20, 1 -> 109, 3 -> 300)
This:
Concatenates the two maps as a sequence of tuples (List((1,9), (2,20), (1,100), (3,300))). For conciseness, map2 is implicitly converted to Seq to adapt to the type of map1.toSeq - but you could choose to make it explicit by using map2.toSeq,
groups elements based on their first tuple part (group part of groupMapReduce),
maps grouped values to their second tuple part (map part of groupMapReduce),
reduces mapped values (_+_) by summing them (reduce part of groupMapReduce).
Andrzej Doyle's answer contains a great explanation of semigroups which allows you to use the |+| operator to join two maps and sum the values for matching keys.
There are many ways something can be defined to be an instance of a typeclass, and unlike the OP you might not want to sum your keys specifically. Or, you might want to do operate on a union rather than an intersection. Scalaz also adds extra functions to Map for this purpose:
https://oss.sonatype.org/service/local/repositories/snapshots/archive/org/scalaz/scalaz_2.11/7.3.0-SNAPSHOT/scalaz_2.11-7.3.0-SNAPSHOT-javadoc.jar/!/index.html#scalaz.std.MapFunctions
You can do
import scalaz.Scalaz._
map1 |+| map2 // As per other answers
map1.intersectWith(map2)(_ + _) // Do things other than sum the values
The fastest and simplest way:
val m1 = Map(1 -> 1.0, 3 -> 3.0, 5 -> 5.2)
val m2 = Map(0 -> 10.0, 3 -> 3.0)
val merged = (m2 foldLeft m1) (
(acc, v) => acc + (v._1 -> (v._2 + acc.getOrElse(v._1, 0.0)))
)
By this way, each of element's immediately added to map.
The second ++ way is:
map1 ++ map2.map { case (k,v) => k -> (v + map1.getOrElse(k,0)) }
Unlike the first way, In a second way for each element in a second map a new List will be created and concatenated to the previous map.
The case expression implicitly creates a new List using unapply method.
Here's what I ended up using:
(a.toSeq ++ b.toSeq).groupBy(_._1).mapValues(_.map(_._2).sum)
This is what I came up with...
def mergeMap(m1: Map[Char, Int], m2: Map[Char, Int]): Map[Char, Int] = {
var map : Map[Char, Int] = Map[Char, Int]() ++ m1
for(p <- m2) {
map = map + (p._1 -> (p._2 + map.getOrElse(p._1,0)))
}
map
}
Using the typeclass pattern, we can merge any Numeric type:
object MapSyntax {
implicit class MapOps[A, B](a: Map[A, B]) {
def plus(b: Map[A, B])(implicit num: Numeric[B]): Map[A, B] = {
b ++ a.map { case (key, value) => key -> num.plus(value, b.getOrElse(key, num.zero)) }
}
}
}
Usage:
import MapSyntax.MapOps
map1 plus map2
Merging a sequence of maps:
maps.reduce(_ plus _)
I've got a small function to do the job, it's in my small library for some frequently used functionality which isn't in standard lib.
It should work for all types of maps, mutable and immutable, not only HashMaps
Here is the usage
scala> import com.daodecode.scalax.collection.extensions._
scala> val merged = Map("1" -> 1, "2" -> 2).mergedWith(Map("1" -> 1, "2" -> 2))(_ + _)
merged: scala.collection.immutable.Map[String,Int] = Map(1 -> 2, 2 -> 4)
https://github.com/jozic/scalax-collection/blob/master/README.md#mergedwith
And here's the body
def mergedWith(another: Map[K, V])(f: (V, V) => V): Repr =
if (another.isEmpty) mapLike.asInstanceOf[Repr]
else {
val mapBuilder = new mutable.MapBuilder[K, V, Repr](mapLike.asInstanceOf[Repr])
another.foreach { case (k, v) =>
mapLike.get(k) match {
case Some(ev) => mapBuilder += k -> f(ev, v)
case _ => mapBuilder += k -> v
}
}
mapBuilder.result()
}
https://github.com/jozic/scalax-collection/blob/master/src%2Fmain%2Fscala%2Fcom%2Fdaodecode%2Fscalax%2Fcollection%2Fextensions%2Fpackage.scala#L190
For anyone coming across an AnyVal error, convert the values as follows.
Error:
"could not find implicit value for parameter num: Numeric[AnyVal]"
(m1.toSeq ++ m2.toSeq).groupBy(_._1).mapValues(_.map(_._2.asInstanceOf[Number].intValue()).sum)

Simple question about tuple of scala

I'm new to scala, and what I'm learning is tuple.
I can define a tuple as following, and get the items:
val tuple = ("Mike", 40, "New York")
println("Name: " + tuple._1)
println("Age: " + tuple._2)
println("City: " + tuple._3)
My question is:
How to get the length of a tuple?
Is tuple mutable? Can I modify its items?
Is there any other useful operation we can do on a tuple?
Thanks in advance!
1] tuple.productArity
2] No.
3] Some interesting operations you can perform on tuples: (a short REPL session)
scala> val x = (3, "hello")
x: (Int, java.lang.String) = (3,hello)
scala> x.swap
res0: (java.lang.String, Int) = (hello,3)
scala> x.toString
res1: java.lang.String = (3,hello)
scala> val y = (3, "hello")
y: (Int, java.lang.String) = (3,hello)
scala> x == y
res2: Boolean = true
scala> x.productPrefix
res3: java.lang.String = Tuple2
scala> val xi = x.productIterator
xi: Iterator[Any] = non-empty iterator
scala> while(xi.hasNext) println(xi.next)
3
hello
See scaladocs of Tuple2, Tuple3 etc for more.
One thing that you can also do with a tuple is to extract the content using the match expression:
def tupleview( tup: Any ){
tup match {
case (a: String, b: String) =>
println("A pair of strings: "+a + " "+ b)
case (a: Int, b: Int, c: Int) =>
println("A triplet of ints: "+a + " "+ b + " " +c)
case _ => println("Unknown")
}
}
tupleview( ("Hello", "Freewind"))
tupleview( (1,2,3))
Gives:
A pair of strings: Hello Freewind
A triplet of ints: 1 2 3
Tuples are immutable, but, like all cases classes, they have a copy method that can be used to create a new Tuple with a few changed elements:
scala> (1, false, "two")
res0: (Int, Boolean, java.lang.String) = (1,false,two)
scala> res0.copy(_2 = true)
res1: (Int, Boolean, java.lang.String) = (1,true,two)
scala> res1.copy(_1 = 1f)
res2: (Float, Boolean, java.lang.String) = (1.0,true,two)
Concerning question 3:
A useful thing you can do with Tuples is to store parameter lists for functions:
def f(i:Int, s:String, c:Char) = s * i + c
List((3, "cha", '!'), (2, "bora", '.')).foreach(t => println((f _).tupled(t)))
//--> chachacha!
//--> borabora.
[Edit] As Randall remarks, you'd better use something like this in "real life":
def f(i:Int, s:String, c:Char) = s * i + c
val g = (f _).tupled
List((3, "cha", '!'), (2, "bora", '.')).foreach(t => println(g(t)))
In order to extract the values from tuples in the middle of a "collection transformation chain" you can write:
val words = List((3, "cha"),(2, "bora")).map{ case(i,s) => s * i }
Note the curly braces around the case, parentheses won't work.
Another nice trick ad question 3) (as 1 and 2 are already answered by others)
val tuple = ("Mike", 40, "New York")
tuple match {
case (name, age, city) =>{
println("Name: " + name)
println("Age: " + age)
println("City: " + city)
}
}
Edit: in fact it's rather a feature of pattern matching and case classes, a tuple is just a simple example of a case class...
You know the size of a tuple, it's part of it's type. For example if you define a function def f(tup: (Int, Int)), you know the length of tup is 2 because values of type (Int, Int) (aka Tuple2[Int, Int]) always have a length of 2.
No.
Not really. Tuples are useful for storing a fixed amount of items of possibly different types and passing them around, putting them into data structures etc. There's really not much you can do with them, other than creating tuples, and getting stuff out of tuples.
1 and 2 have already been answered.
A very useful thing that you can use tuples for is to return more than one value from a method or function. Simple example:
// Get the min and max of two integers
def minmax(a: Int, b: Int): (Int, Int) = if (a < b) (a, b) else (b, a)
// Call it and assign the result to two variables like this:
val (x, y) = minmax(10, 3) // x = 3, y = 10
Using shapeless, you easily get a lot of useful methods, that are usually available only on collections:
import shapeless.syntax.std.tuple._
val t = ("a", 2, true, 0.0)
val first = t(0)
val second = t(1)
// etc
val head = t.head
val tail = t.tail
val init = t.init
val last = t.last
val v = (2.0, 3L)
val concat = t ++ v
val append = t :+ 2L
val prepend = 1.0 +: t
val take2 = t take 2
val drop3 = t drop 3
val reverse = t.reverse
val zip = t zip (2.0, 2, "a", false)
val (unzip, other) = zip.unzip
val list = t.toList
val array = t.toArray
val set = t.to[Set]
Everything is typed as one would expect (that is first has type String, concat has type (String, Int, Boolean, Double, Double, Long), etc.)
The last method above (.to[Collection]) should be available in the next release (as of 2014/07/19).
You can also "update" a tuple
val a = t.updatedAt(1, 3) // gives ("a", 3, true, 0.0)
but that will return a new tuple instead of mutating the original one.