Scala sum Map values - scala

I have a List
val l : List[Map[String,Any]] = List(Map("a" -> 1, "b" -> 2.8), Map("a" -> 3, "c" -> 4), Map("c" -> 5, "d" -> "abc"))
and I used the following code to find the sum for the keys "a" (Int), "b" (Double) and "c" (Int). "d" is included as noise.
l.map(n => n.mapValues( v => if (v.isInstanceOf[Number]) {v match {
case x:Int => x.asInstanceOf[Int]
case x:Double => x.asInstanceOf[Double]
}} else 0)).foldLeft((0,0.0,0))((t, m) => (
t._1 + m.get("a").getOrElse(0),
t._2 + m.get("b").getOrElse(0.0),
t._3 + m.get("c").getOrElse(0)))
I expected the output to be (4, 2.8, 9), but instead I got
<console>:10: error: overloaded method value + with alternatives:
(x: Int)Int <and>
(x: Char)Int <and>
(x: Short)Int <and>
(x: Byte)Int
cannot be applied to (AnyVal)
I think the error is trying to tell me that '+' doesn't work with AnyVal. How do I get this to work so that I get the result I want? Thanks

m.foldLeft(0)(_+_._2)
it's a very clear and simple solution.
reference: http://ktuman.blogspot.com/2009/10/how-to-simply-sum-values-in-map-in.html
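The one-liner above only type-checks when every value in the map is already an Int. A minimal sketch, assuming a hypothetical Map[String, Int] named prices (not part of the question's data):

```scala
object FoldSum {
  // Hypothetical sample data: every value is an Int, so + is well-typed.
  val prices: Map[String, Int] = Map("a" -> 1, "b" -> 2, "c" -> 3)

  // Start from 0 and add the value (_2) of each (key, value) entry.
  val total: Int = prices.foldLeft(0)(_ + _._2)
}
```

With a Map[String, Any], as in the question, the same expression does not compile, because an Int cannot be added to an Any.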

You can use the foldLeft function:
scala> val l : List[Map[String,Any]] = List(Map("a" -> 1, "b" -> 2.8), Map("a" -> 3, "c" -> 4), Map("c" -> 5, "d" -> "abc"))
l: List[Map[String,Any]] = List(Map(a -> 1, b -> 2.8), Map(a -> 3, c -> 4), Map(c -> 5, d -> abc))
scala> val (sa, sb, sc) = l.foldLeft((0: Int, 0: Double, 0: Int)){
| case ((a, b, c), m) => (
| a + m.get("a").collect{case i: Int => i}.getOrElse(0),
| b + m.get("b").collect{case i: Double => i}.getOrElse(0.0),
| c + m.get("c").collect{case i: Int => i}.getOrElse(0)
| )
| }
sa: Int = 4
sb: Double = 2.8
sc: Int = 9
Updated using incrop's idea of collect instead of match.

First, you are missing the point of pattern matching:
{case i: Int => i
case d: Double => d
case _ => 0}
is the proper replacement for the whole function you pass to mapValues. That is not the problem, though: your version, while more complex, does the same thing.
Your function in mapValues returns Double, because some branches return Int and others return Double, and in this case Int is promoted to Double. (If it were not, the function would return AnyVal.)
So you get a List[Map[String, Double]]. At this point, you have lost the Ints.
When you call m.get("a"), it returns Option[Double]. Option[A] has a method getOrElse(default: A): A (actually getOrElse[B >: A](default: => B): B; the by-name parameter makes no difference here).
If you call getOrElse(0.0) instead of getOrElse(0), you get a Double. Your code still fails, because your fold starts with (Int, Double, Int) and you would return (Double, Double, Double). If you start your fold with (0.0, 0.0, 0.0), it works, but you have lost your Ints: you get (4.0, 2.8, 9.0).
Now, about the error message. You pass an Int to a method expecting a Double (getOrElse). Normally the Int would be converted to Double, and it would be as if you had called getOrElse(0.0). Except that Option is covariant (declared as trait Option[+A]): if X is an ancestor of A, an Option[A] is also an Option[X]. So an Option[Double] is also an Option[AnyVal] and an Option[Any]. The call getOrElse(0) works if the option is considered an Option[AnyVal], and the result is then AnyVal (it would work with Any too, but AnyVal is more precise and is the one the compiler chooses). Because the expression compiles as is, there is no need to promote the 0 to 0.0. Thus m.get("a").getOrElse(0) is of type AnyVal, which cannot be added to t._1. This is what your error message says.
You know that "a" is associated with Int and "b" with Double, but you don't pass this knowledge to the compiler.
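The widening described in this answer can be checked directly, and the fix is to state the expected type of each key. This is only a sketch: the names opt, intAt and doubleAt are hypothetical helpers introduced for illustration, and the type annotations are there to tell the compiler what the caller already knows.

```scala
object WideningDemo {
  val opt: Option[Double] = Some(1.0)

  // Compiles: getOrElse may return a supertype of Double, such as AnyVal.
  val widened: AnyVal = opt.getOrElse(0)

  // With Double as the expected type, the literal 0 is treated as 0.0.
  val asDouble: Double = opt.getOrElse(0)

  val l: List[Map[String, Any]] = List(
    Map("a" -> 1, "b" -> 2.8), Map("a" -> 3, "c" -> 4), Map("c" -> 5, "d" -> "abc"))

  // Hypothetical helper: read an Int under key, or 0 if absent or not an Int.
  def intAt(m: Map[String, Any], key: String): Int =
    m.get(key).collect { case i: Int => i }.getOrElse(0)

  // Hypothetical helper: read a Double under key, or 0.0 if absent or not a Double.
  def doubleAt(m: Map[String, Any], key: String): Double =
    m.get(key).collect { case d: Double => d }.getOrElse(0.0)

  // The accumulator keeps the (Int, Double, Int) shape throughout the fold.
  val sums: (Int, Double, Int) = l.foldLeft((0, 0.0, 0)) {
    case ((a, b, c), m) =>
      (a + intAt(m, "a"), b + doubleAt(m, "b"), c + intAt(m, "c"))
  }
}
```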

A nifty one-liner:
l.map(_.filterKeys(_ != "d")).flatten.groupBy(_._1).map { case (k, v) => v.map { case (k2, v2: Number) => v2.doubleValue }.sum }
res0: scala.collection.immutable.Iterable[Double] = List(9.0, 4.0, 2.8)

In general, if you don't know the keys but just want to sum the values, you can do
val filtered = for {
map <- l
(k, v) <- map
if v.isInstanceOf[Number]
} yield k -> v.asInstanceOf[Number].doubleValue
val summed = filtered.groupBy(_._1) map { case (k, v) => k -> v.map(_._2).sum }
scala> l
res1: List[Map[String,Any]] = List(Map(a -> 1, b -> 2.8), Map(a -> 3, c -> 4), Map(c -> 5, d -> abc))
scala> filtered
res2: List[(String, Double)] = List((a,1.0), (b,2.8), (a,3.0), (c,4.0), (c,5.0))
scala> summed
res3: Map[String,Double] = Map(c -> 9.0, a -> 4.0, b -> 2.8)
Update
You can filter each map by the type you want, for example
scala> val intMap = for (x <- l) yield x collect { case (k, v: Int) => k -> v }
intMap: List[scala.collection.immutable.Map[String,Int]] = List(Map(a -> 1), Map(a -> 3, c -> 4), Map(c -> 5))
and then sum the values with a semigroup append operator such as the |+| that Scalaz provides (see linked question)
scala> intMap reduce { _ |+| _ }
res4: scala.collection.immutable.Map[String,Int] = Map(a -> 4, c -> 9)
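If no library providing |+| is available, the same merge can be sketched with the standard library alone. The name mergeSum is hypothetical; it adds the values of matching keys using two nested folds.

```scala
object MergeSum {
  val l: List[Map[String, Any]] = List(
    Map("a" -> 1, "b" -> 2.8), Map("a" -> 3, "c" -> 4), Map("c" -> 5, "d" -> "abc"))

  // Keep only the Int-valued entries of each map.
  val intMaps: List[Map[String, Int]] =
    l.map(_.collect { case (k, v: Int) => k -> v })

  // Hypothetical helper: merge all maps, summing values that share a key.
  def mergeSum(maps: List[Map[String, Int]]): Map[String, Int] =
    maps.foldLeft(Map.empty[String, Int]) { (acc, m) =>
      m.foldLeft(acc) { case (a, (k, v)) => a.updated(k, a.getOrElse(k, 0) + v) }
    }

  val summed: Map[String, Int] = mergeSum(intMaps)
}
```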

Am I missing something or can you not just do:
map.values.sum
?
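map.values.sum works when the value type has a Numeric instance, which Any does not. A hedged sketch: for mixed values like those in the question, the numeric ones have to be collected first. Converting everything to Double here is an assumption made for brevity, and it loses the Int/Double distinction the question wanted to keep.

```scala
object ValuesSum {
  // All values are Int, so sum is available directly.
  val ints: Map[String, Int] = Map("a" -> 1, "c" -> 4)
  val intTotal: Int = ints.values.sum

  // Mixed values: keep only the numbers and convert them to Double first.
  val mixed: Map[String, Any] = Map("a" -> 1, "b" -> 2.5, "d" -> "abc")
  val mixedTotal: Double =
    mixed.values.collect { case n: Number => n.doubleValue }.sum
}
```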

Related

In Scala, what does x=> x._1._1 denotes

In the following snippet of code, I am aware that x._1 denotes the first element of the tuple, but I couldn't understand what x._1._1 represents. I am not so familiar with Scala, so sorry if this is a relatively naive question. Thank you!
val a = b.groupBy(x=> x._1._1)
Here is a quick example in the REPL of a nested tuple
scala> val t = ((1, 2), 3)
t: ((Int, Int), Int) = ((1,2),3)
scala> t._1 // Get the first part of the tuple
res0: (Int, Int) = (1,2)
scala> t._2 // Get the second part of the tuple
res1: Int = 3
scala> t._1._1 // Get the first part of the first part
res2: Int = 1
And here is an example with a sequence to demonstrate the groupBy:
scala> val s = Seq(((1, 2), 3), ((1, 5), 6), ((2, 4), 32))
s: Seq[((Int, Int), Int)] = List(((1,2),3), ((1,5),6), ((2,4),32))
scala> s.groupBy
def groupBy[K](f: (((Int, Int), Int)) => K): scala.collection.immutable.Map[K,Seq[((Int, Int), Int)]]
scala> s.groupBy(x => x._1._1)
res3: scala.collection.immutable.Map[Int,Seq[((Int, Int), Int)]] = Map(2 -> List(((2,4),32)), 1 -> List(((1,2),3), ((1,5),6)))
In this case the first element of the first element is the key for the grouping. Here's the result in an easier-to-read format:
Map(
2 -> List(
((2,4),32)),
1 -> List(
((1,2),3),
((1,5),6))
)
It means x._1 itself is a tuple.
Example:
val b = Seq((("subTuple_1", "subTuple_2"), "tuple_2"))
val a = b.groupBy(x=> x._1._1)
As you mentioned, ._1 gives you the first element of your tuple, and if that first element is itself a tuple, you can call ._1 on it again.
E.g.
scala> Map(("a" -> "b") -> 100, ("c" -> "d") -> 200).map(_._1)
res31: scala.collection.immutable.Map[String,String] = Map(a -> b, c -> d)
scala> Map(("a" -> "b") -> 100, ("c" -> "d") -> 200).map(_._1._1)
res32: scala.collection.immutable.Iterable[String] = List(a, c)
With groupBy:
scala> Map(("a" -> "b") -> 100, ("a" -> "c") -> 200).groupBy(_._1._1)
res19: scala.collection.immutable.Map[String,scala.collection.immutable.Map[(String, String),Int]] = Map(a -> Map((a,b) -> 100, (a,c) -> 200))
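The x._1._1 accessor chain can also be written as a pattern match, which names the nested parts instead of chaining ._1. A small sketch reusing the sequence s from the earlier answer; the binder name first is arbitrary.

```scala
object NestedTupleGroup {
  val s: Seq[((Int, Int), Int)] = Seq(((1, 2), 3), ((1, 5), 6), ((2, 4), 32))

  // Accessor style: first element of the first element.
  val byAccessor = s.groupBy(x => x._1._1)

  // Pattern style: destructure the nested tuple and pick the same part.
  val byPattern = s.groupBy { case ((first, _), _) => first }
}
```

Both forms produce the same grouping; the pattern version tends to be easier to read once tuples are nested more than one level deep.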

scala type mismatch error, GenTraversableOnce[?] required

Why does this code result in the compilation error
type mismatch; found : (Int, Char) required:
scala.collection.GenTraversableOnce[?]
?
val n = Map(1 -> 'a', 4 -> 'a')
def f(i: Int, c: Char) = (i -> c)
n.flatMap (e => f(e._1, e._2))
Use map() instead:
n.map (e => f(e._1, e._2))
flatMap() assumes you are returning a collection of values rather than a single element. Thus these would work:
n.flatMap (e => List(f(e._1, e._2)))
n.flatMap (e => List(f(e._1, e._2), f(e._1 * 10, e._2)))
The second example is interesting. For each [key, value] pair we return two pairs which are then merged, so the result is:
Map(1 -> a, 10 -> a, 4 -> a, 40 -> a)
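To make the difference concrete, here is a short sketch built on the n and f from the question. The third value, filtered, is an added illustration (not from the answer) of returning an empty collection to drop an entry.

```scala
object FlatMapVsMap {
  val n: Map[Int, Char] = Map(1 -> 'a', 4 -> 'a')
  def f(i: Int, c: Char): (Int, Char) = i -> c

  // map: exactly one output pair per input pair.
  val mapped = n.map(e => f(e._1, e._2))

  // flatMap: each input pair yields a collection, and the results are merged.
  val expanded = n.flatMap(e => List(f(e._1, e._2), f(e._1 * 10, e._2)))

  // flatMap can also drop entries by returning an empty collection.
  val filtered = n.flatMap(e => if (e._1 > 1) List(f(e._1, e._2)) else Nil)
}
```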

How to map 2 maps with a function in Scala?

When map c is a function of a single map a, I can compute it as
val a: Map[T, U] = ...
def f(aValue: U): V = ...
val c: Map[T, V] = a.map { case (k, v) => k -> f(v) }
but what if c is a function of both a and b? For example, suppose a, b and c are all Map[String, Int], and each value in c should equal the corresponding value in a raised to the power given by the corresponding value in b.
Something like this?
val a: Map[String, Int] = Map("a" -> 10, "b" -> 20)
val b: Map[String, Int] = Map("a" -> 2, "b" -> 3)
def f(a: Int, b: Int): Int = math.pow(a,b).toInt // math.pow returns a Double
val c = for {
(ak, av) <- a // for all key-value pairs from a
bv <- b.get(ak) // for any matching value from b
} yield (ak, f(av,bv)) // yield a new key-value pair that results from applying f
// c: scala.collection.immutable.Map[String,Int] = Map(a -> 100, b -> 8000)
Is this what you're after?
val a = Map('a -> 2, 'b -> 3)
val b = Map('a -> 4, 'b -> 5)
a.map{ case (k, aVal) => (k, aVal + b(k)) } // Map('a -> 6, 'b -> 8)
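Note that b(k) throws a NoSuchElementException when a key of a is missing from b. The two answers can be generalised into one helper that skips such keys; this is a sketch, and the name combine and its signature are assumptions made for illustration.

```scala
object CombineMaps {
  // Hypothetical helper: apply f to the values of every key present in both maps.
  def combine[K, A, B, C](a: Map[K, A], b: Map[K, B])(f: (A, B) => C): Map[K, C] =
    a.collect { case (k, av) if b.contains(k) => k -> f(av, b(k)) }

  val a = Map("a" -> 10, "b" -> 20, "z" -> 1) // "z" has no partner in b
  val b = Map("a" -> 2, "b" -> 3)

  // math.pow works on Doubles, so convert explicitly and truncate back to Int.
  val c: Map[String, Int] =
    combine(a, b)((x, y) => math.pow(x.toDouble, y.toDouble).toInt)
}
```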

How to convert a mutable HashMap into an immutable equivalent in Scala?

Inside a function of mine I construct a result set by filling a new mutable HashMap with data (if there is a better way - I'd appreciate comments). Then I'd like to return the result set as an immutable HashMap. How to derive an immutable from a mutable?
Discussion about returning immutable.Map vs. immutable.HashMap notwithstanding, what about simply using the toMap method:
scala> val m = collection.mutable.HashMap(1 -> 2, 3 -> 4)
m: scala.collection.mutable.HashMap[Int,Int] = Map(3 -> 4, 1 -> 2)
scala> m.toMap
res22: scala.collection.immutable.Map[Int,Int] = Map(3 -> 4, 1 -> 2)
As of 2.9, this uses the method toMap in TraversableOnce, which is implemented as follows:
def toMap[T, U](implicit ev: A <:< (T, U)): immutable.Map[T, U] = {
val b = immutable.Map.newBuilder[T, U]
for (x <- self)
b += x
b.result
}
scala> val m = collection.mutable.HashMap(1->2,3->4)
m: scala.collection.mutable.HashMap[Int,Int] = Map(3 -> 4, 1 -> 2)
scala> collection.immutable.HashMap() ++ m
res1: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 3 -> 4)
or
scala> collection.immutable.HashMap(m.toSeq:_*)
res2: scala.collection.immutable.HashMap[Int,Int] = Map(1 -> 2, 3 -> 4)
If you have a map, e.g. logMap: Map[String, String],
you just need to call logMap.toMap (without parentheses)
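Putting the pieces together, the pattern from the question (fill a mutable map inside a function, return an immutable one) might look like the sketch below. The function names and the word-length example are hypothetical; the second version shows that a mutable intermediate is often not needed at all.

```scala
import scala.collection.mutable

object BuildThenFreeze {
  // Fill a mutable HashMap locally, then return an immutable copy via toMap.
  def wordLengths(words: Seq[String]): Map[String, Int] = {
    val acc = mutable.HashMap.empty[String, Int]
    words.foreach(w => acc(w) = w.length)
    acc.toMap
  }

  // Same result without any mutable state: build pairs, then call toMap.
  def wordLengthsImmutable(words: Seq[String]): Map[String, Int] =
    words.map(w => w -> w.length).toMap
}
```

The mutable map never escapes wordLengths, so callers only ever see the immutable result.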

Map a single entry of a Map

I want to achieve something like the following:
(_ : Map[K,Int]).mapKey(k, _ + 1)
The mapKey function would apply its second argument (Int => Int) only to the value stored under k. Is there something like this in the standard library? If not, I bet there's something in Scalaz.
Of course I can write this function myself (m.updated(k, f(m(k)))), and it's simple to do so. But I've come across this problem several times, so maybe it's already been done?
For Scalaz I imagine something along the following code:
(m: Map[A,B]).project(k: A).map(f: B => B): Map[A,B]
You could of course add
def changeForKey[A,B](a: A, fun: B => B): Tuple2[A, B] => Tuple2[A, B] = { kv =>
kv match {
case (`a`, b) => (a, fun(b))
case x => x
}
}
val theMap = Map('a -> 1, 'b -> 2)
theMap map changeForKey('a, (_: Int) + 1)
res0: scala.collection.immutable.Map[Symbol,Int] = Map('a -> 2, 'b -> 2)
But this would circumvent any optimisation regarding memory re-use and access.
I also came up with a rather verbose and inefficient Scalaz solution using a zipper for your proposed project method:
theMap.toStream.toZipper.flatMap(_.findZ(_._1 == 'a).flatMap(elem => elem.delete.map(_.insert((elem.focus._1, fun(elem.focus._2)))))).map(_.toStream.toMap)
or
(for {
z <- theMap.toStream.toZipper
elem <- z.findZ(_._1 == 'a)
z2 <- elem.delete
} yield z2.insert((elem.focus._1, fun(elem.focus._2)))).map(_.toStream.toMap)
Probably of little use. I’m just posting for reference.
Here is one way:
scala> val m = Map(2 -> 3, 5 -> 11)
m: scala.collection.immutable.Map[Int,Int] = Map(2 -> 3, 5 -> 11)
scala> m ++ (2, m.get(2).map(1 +)).sequence
res53: scala.collection.immutable.Map[Int,Int] = Map(2 -> 4, 5 -> 11)
scala> m ++ (9, m.get(9).map(1 +)).sequence
res54: scala.collection.immutable.Map[Int,Int] = Map(2 -> 3, 5 -> 11)
This works because (A, Option[B]).sequence gives Option[(A, B)]. (sequence in general turns types inside out, i.e. F[G[A]] => G[F[A]], given F : Traverse and G : Applicative.)
You can pimp it with this so that it creates a new map based on the old one:
class MapUtils[A, B](map: Map[A, B]) {
def mapValueAt(a: A)(f: (B) => B) = map.get(a) match {
case Some(b) => map + (a -> f(b))
case None => map
}
}
implicit def toMapUtils[A, B](map: Map[A, B]) = new MapUtils(map)
val m = Map(1 -> 1)
m.mapValueAt(1)(_ + 1)
// Map(1 -> 2)
m.mapValueAt(2)(_ + 1)
// Map(1 -> 1)
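On Scala 2.13 and later, the standard library's immutable Map has an updatedWith method that covers this case without a custom wrapper. A brief sketch, assuming that version is available (it postdates the answers above):

```scala
object UpdatedWithDemo {
  val m: Map[Int, Int] = Map(2 -> 3, 5 -> 11)

  // The function receives Option[V]; mapping over it only touches existing keys.
  val bumped: Map[Int, Int] = m.updatedWith(2)(_.map(_ + 1))

  // A missing key yields None, which stays None, so the map is unchanged.
  val unchanged: Map[Int, Int] = m.updatedWith(9)(_.map(_ + 1))
}
```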