Scala: using foldl to add pairs from list to a map?

I am trying to add pairs from list to a map using foldl. I get the following error:
"missing arguments for method /: in trait TraversableOnce; follow this method with `_' if you want to treat it as a partially applied function"
code:
val pairs = List(("a", 1), ("a", 2), ("c", 3), ("d", 4))
def lstToMap(lst: List[(String, Int)], map: Map[String, Int]) = {
  (map /: lst) addToMap ( _, _)
}
def addToMap(pair: (String, Int), map: Map[String, Int]): Map[String, Int] = {
  map + (pair._1 -> pair._2)
}
What is wrong?

scala> val pairs = List(("a", 1), ("a", 2), ("c", 3), ("d", 4))
pairs: List[(String, Int)] = List((a,1), (a,2), (c,3), (d,4))
scala> (Map.empty[String, Int] /: pairs)(_ + _)
res9: scala.collection.immutable.Map[String,Int] = Map(a -> 2, c -> 3, d -> 4)
But you know, you could just do:
scala> pairs.toMap
res10: scala.collection.immutable.Map[String,Int] = Map(a -> 2, c -> 3, d -> 4)

You need to swap the parameters of addToMap, because the fold passes the accumulated map first and the current element second, and pass the function in parentheses for this to work:
def addToMap(map: Map[String, Int], pair: (String, Int)): Map[String, Int] = {
  map + (pair._1 -> pair._2)
}
def lstToMap(lst: List[(String, Int)], map: Map[String, Int]) = {
  (map /: lst)(addToMap)
}
missingfaktor's answer is much more concise, reusable, and scala-like.
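For readers on newer Scala versions, where the symbolic /: operator is deprecated, the same fix can be sketched with foldLeft. The object name FoldPairsIntoMap is only there to make the snippet self-contained; it is not part of the original answer.

```scala
object FoldPairsIntoMap {
  val pairs = List(("a", 1), ("a", 2), ("c", 3), ("d", 4))

  // The accumulator (the map) comes first, the current element second,
  // which is the parameter order foldLeft expects.
  def addToMap(map: Map[String, Int], pair: (String, Int)): Map[String, Int] =
    map + pair

  // foldLeft(map)(addToMap) is the named equivalent of (map /: lst)(addToMap).
  def lstToMap(lst: List[(String, Int)], map: Map[String, Int]): Map[String, Int] =
    lst.foldLeft(map)(addToMap)

  // For duplicate keys, the pair that appears later in the list wins.
  val result: Map[String, Int] = lstToMap(pairs, Map.empty)
}
```

Note that map + pair works directly because a (String, Int) tuple is already a key/value pair, so there is no need to rebuild it with pair._1 -> pair._2.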

If you already have a collection of Tuple2s, you don't need to implement this yourself: there is already a toMap method, and it is only available when the elements are tuples.
The full signature is:
def toMap[T, U](implicit ev: <:<[A, (T, U)]): Map[T, U]
It works by requiring an implicit A <:< (T, U) which is essentially a function that can take the element type A and cast/convert it to tuples of type (T, U). Another way of saying this is that it requires an implicit witness that A is-a (T, U). Therefore, this is completely type-safe.
Update: which is what @missingfaktor said
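To make the role of the implicit evidence concrete, here is a minimal sketch of a standalone helper with the same shape as toMap. The name toMapSketch and the enclosing object are hypothetical, and this is an illustration of the technique rather than the library's actual implementation.

```scala
object ToMapEvidence {
  // Hypothetical helper: compiles only when the element type A
  // can be shown to be a pair (T, U) via the implicit evidence ev.
  def toMapSketch[A, T, U](xs: Iterable[A])(implicit ev: A <:< (T, U)): Map[T, U] = {
    val b = Map.newBuilder[T, U]
    // ev(x) views each element as a (T, U) tuple; no runtime check is needed.
    for (x <- xs) b += ev(x)
    b.result()
  }

  val m: Map[String, Int] = toMapSketch(List(("a", 1), ("c", 3)))

  // toMapSketch(List(1, 2, 3)) would be rejected at compile time,
  // because no evidence Int <:< (T, U) can be found.
}
```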

This is not a direct answer to the question, which is about folding correctly on the map, but I deem it important to emphasize that
a Map can be treated as a generic Traversable of pairs
and you can easily combine the two!
scala> val pairs = List(("a", 1), ("a", 2), ("c", 3), ("d", 4))
pairs: List[(String, Int)] = List((a,1), (a,2), (c,3), (d,4))
scala> Map.empty[String, Int] ++ pairs
res1: scala.collection.immutable.Map[String,Int] = Map(a -> 2, c -> 3, d -> 4)
scala> pairs.toMap
res2: scala.collection.immutable.Map[String,Int] = Map(a -> 2, c -> 3, d -> 4)

Related

Scala - Reduce list of tuples by key

I have list of tuples which contains userId and point. I want to combine or reduce this list by the key.
val points: List[(Int, Double)] = List(
  (1, 1.0),
  (2, 3.2),
  (4, 2.0),
  (1, 4.0),
  (2, 6.8)
)
The expected result should look like:
List((1, 5.0), (2, 10.0), (4, 2.0))
I tried with groupBy and mapValues, but got an error:
val aggrPoint: Map[Int, Double] = points.groupBy(_._1).mapValues(seq => seq.reduce(_._2 + _._2))
Error:(16, 180) type mismatch;
found : Double
required: (Int, Double)
What am I doing wrong, and is there an idiomatic way to achieve this?
P.S. I found that in Spark, aggregateByKey does this job. Is there a built-in method in plain Scala?
Let's go step by step to see what you are doing wrong. (I am going to use the REPL.)
First of all, let's define the points:
scala> val points: List[(Int, Double)] = List(
| (1, 1.0),
| (2, 3.2),
| (4, 2.0),
| (1, 4.0),
| (2, 6.8)
| )
points: List[(Int, Double)] = List((1,1.0), (2,3.2), (4,2.0), (1,4.0), (2,6.8))
As you can see, you have a List[Tuple2[Int, Double]], so when you do groupBy and mapValues as
scala> points.groupBy(_._1).mapValues(seq => println(seq))
List((2,3.2), (2,6.8))
List((4,2.0))
List((1,1.0), (1,4.0))
res1: scala.collection.immutable.Map[Int,Unit] = Map(2 -> (), 4 -> (), 1 -> ())
You can see that seq is again a List[Tuple2[Int, Double]], containing only the tuples that share one key.
So when you apply seq.reduce(_._2 + _._2), the function passed to reduce takes two Tuple2[Int, Double] inputs but returns a Double. That result cannot be fed into the next step over seq, which again expects a Tuple2[Int, Double]. That's the main issue: for reduce, the input and output types of the function must match.
One way is to make the function return a Tuple2[Int, Double]:
scala> points.groupBy(_._1).mapValues(seq => seq.reduce{(x,y) => (x._1, x._2 + y._2)})
res6: scala.collection.immutable.Map[Int,(Int, Double)] = Map(2 -> (2,10.0), 4 -> (4,2.0), 1 -> (1,5.0))
But this isn't your desired output, so you can extract the Double value from the reduced Tuple2[Int, Double]:
scala> points.groupBy(_._1).mapValues(seq => seq.reduce{(x,y) => (x._1, x._2 + y._2)}._2)
res8: scala.collection.immutable.Map[Int,Double] = Map(2 -> 10.0, 4 -> 2.0, 1 -> 5.0)
Or you can map each tuple to its Double before applying reduce:
scala> points.groupBy(_._1).mapValues(seq => seq.map(_._2).reduce(_ + _))
res3: scala.collection.immutable.Map[Int,Double] = Map(2 -> 10.0, 4 -> 2.0, 1 -> 5.0)
I hope this makes the mistake clear and shows how reduce works.
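As a further illustration of the type constraint described above, here is a small sketch using foldLeft instead of reduce. Unlike reduce, foldLeft lets the accumulator type (Double) differ from the element type ((Int, Double)), so no intermediate tuple is needed. The object name ReduceVsFold is only for packaging the example.

```scala
object ReduceVsFold {
  val points: List[(Int, Double)] =
    List((1, 1.0), (2, 3.2), (4, 2.0), (1, 4.0), (2, 6.8))

  // foldLeft starts from 0.0 and adds the second element of each tuple,
  // so the function has type (Double, (Int, Double)) => Double.
  val totals: Map[Int, Double] =
    points.groupBy(_._1).map { case (id, seq) =>
      id -> seq.foldLeft(0.0)((acc, p) => acc + p._2)
    }
}
```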
You can map the tuples in the mapValues to their 2nd elements then sum them as follows:
points.groupBy(_._1).mapValues( _.map(_._2).sum ).toList
// res1: List[(Int, Double)] = List((2,10.0), (4,2.0), (1,5.0))
Using collect
points.groupBy(_._1).collect {
  case e => e._1 -> e._2.map(_._2).sum
}.toList
//res1: List[(Int, Double)] = List((2,10.0), (4,2.0), (1,5.0))
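On the question of a built-in method: assuming Scala 2.13 or later, the collections library offers groupMapReduce, which groups, projects, and combines in one pass. The sketch below also sorts by key, since the order of entries coming out of a grouped Map is not guaranteed and the expected result in the question is ordered by userId. The object name is hypothetical.

```scala
object GroupMapReduceSketch {
  val points: List[(Int, Double)] =
    List((1, 1.0), (2, 3.2), (4, 2.0), (1, 4.0), (2, 6.8))

  // Assumes Scala 2.13+: group by the key, keep only the Double, sum per key.
  val summed: List[(Int, Double)] =
    points
      .groupMapReduce(_._1)(_._2)(_ + _)
      .toList
      .sortBy(_._1) // make the output order deterministic
}
```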

In Scala, what does x => x._1._1 denote?

In the following snippet of code, I am aware that x._1 denotes the first element of the tuple, but I couldn't understand what x._1._1 represents. I am not so familiar with Scala, sorry if it is a relatively naive question. Thank you!
val a = b.groupBy(x=> x._1._1)
Here is a quick example in the REPL of a nested tuple
scala> val t = ((1, 2), 3)
t: ((Int, Int), Int) = ((1,2),3)
scala> t._1 // Get the first part of the tuple
res0: (Int, Int) = (1,2)
scala> t._2 // Get the second part of the tuple
res1: Int = 3
scala> t._1._1 // Get the first part of the first part
res2: Int = 1
And here is an example with a sequence to demonstrate the groupBy:
scala> val s = Seq(((1, 2), 3), ((1, 5), 6), ((2, 4), 32))
s: Seq[((Int, Int), Int)] = List(((1,2),3), ((1,5),6), ((2,4),32))
scala> s.groupBy
def groupBy[K](f: (((Int, Int), Int)) => K): scala.collection.immutable.Map[K,Seq[((Int, Int), Int)]]
scala> s.groupBy(x => x._1._1)
res3: scala.collection.immutable.Map[Int,Seq[((Int, Int), Int)]] = Map(2 -> List(((2,4),32)), 1 -> List(((1,2),3), ((1,5),6)))
In this case the first element of the first element is the target for the grouping. Here's the result in an easier-to-read format:
Map(
2 -> List(
((2,4),32)),
1 -> List(
((1,2),3),
((1,5),6))
)
It means x._1 itself is a tuple.
Example:
val b = Seq((("subTuple_1", "subTuple_2"), "tuple_2"))
val a = b.groupBy(x=> x._1._1)
As you mentioned, ._1 gives you the first element of your tuple, and if that first element is itself a tuple, you can call ._1 on it again.
eg.
scala> Map(("a" -> "b") -> 100, ("c" -> "d") -> 200).map(_._1)
res31: scala.collection.immutable.Map[String,String] = Map(a -> b, c -> d)
scala> Map(("a" -> "b") -> 100, ("c" -> "d") -> 200).map(_._1._1)
res32: scala.collection.immutable.Iterable[String] = List(a, c)
groupBy,
scala> Map(("a" -> "b") -> 100, ("a" -> "c") -> 200).groupBy(_._1._1)
res19: scala.collection.immutable.Map[String,scala.collection.immutable.Map[(String, String),Int]] = Map(a -> Map((a,b) -> 100, (a,c) -> 200))
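Chained accessors such as x._1._1 can be hard to read. As an alternative sketch, the same grouping can be written with a pattern match that names the nested parts; the binding name first and the object name are illustrative choices, not from the original answers.

```scala
object NestedTupleGrouping {
  val s = Seq(((1, 2), 3), ((1, 5), 6), ((2, 4), 32))

  // Accessor style: first element of the first element.
  val byAccessor = s.groupBy(x => x._1._1)

  // Pattern-match style: destructure the nested tuple and name the part used.
  val byPattern = s.groupBy { case ((first, _), _) => first }
}
```

Both forms produce the same Map; the pattern-match version simply makes the shape of the tuple visible at the call site.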

How to merge two Seq[String], Seq[Double] to Seq[(String,Double)]

I have two Seqs.
One is a Seq[String] and the other is a Seq[Double]:
a -> ["a","b","c"] and
b-> [1,2,3]
I want to create output as
[("a",1),("b",2),("c",3)]
I have this code:
a.zip(b), which creates a Seq of pairs of those two elements instead of creating a map.
Can anyone suggest how to do that in Scala?
You simply need .toMap to transform a List[(String, Int)] into a Map[String, Int]:
scala> val seq1 = List("a", "b", "c")
seq1: List[String] = List(a, b, c)
scala> val seq2 = List(1, 2, 3)
seq2: List[Int] = List(1, 2, 3)
scala> seq1.zip(seq2)
res0: List[(String, Int)] = List((a,1), (b,2), (c,3))
scala> seq1.zip(seq2).toMap
res1: scala.collection.immutable.Map[String,Int] = Map(a -> 1, b -> 2, c -> 3)
also see
How to convert a Seq[A] to a Map[Int, A] using a value of A as the key in the map?
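The same approach, sketched with a Seq[Double] as in the question title. The values are made up for illustration. One behavior worth knowing is that zip stops at the end of the shorter sequence, so unmatched elements are silently dropped.

```scala
object ZipSketch {
  val a: Seq[String] = Seq("a", "b", "c")
  val b: Seq[Double] = Seq(1.0, 2.0, 3.0)

  // zip pairs elements by position.
  val pairs: Seq[(String, Double)] = a.zip(b)

  // toMap turns the pairs into a Map; for duplicate keys the last pair wins.
  val asMap: Map[String, Double] = pairs.toMap

  // If the lengths differ, the result is truncated to the shorter input.
  val truncated: Seq[(String, Double)] = a.zip(Seq(1.0))
}
```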

Composing two maps

Is there a function in Scala to compose two maps or is flatMap a sensible approach?
scala> val caps: Map[String, Int] = Map(("A", 1), ("B", 2))
caps: Map[String,Int] = Map(A -> 1, B -> 2)
scala> val lower: Map[Int, String] = Map((1, "a"), (2, "b"))
lower: Map[Int,String] = Map(1 -> a, 2 -> b)
scala> caps.flatMap {
| case (cap, idx) => Map((cap, lower(idx)))
| }
res1: scala.collection.immutable.Map[String,String] = Map(A -> a, B -> b)
Some syntactic sugar would be great!
If you know lower will contain keys for all the values in caps, you can use mapValues:
scala> caps mapValues lower
res0: scala.collection.immutable.Map[String,String] = Map(A -> a, B -> b)
If you don't want or need a new collection, just a mapping, it's a little more idiomatic to use andThen:
scala> val composed = caps andThen lower
composed: PartialFunction[String,String] = <function1>
scala> composed("A")
res1: String = a
This also assumes there aren't values in caps that aren't mapped in lower.
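When that assumption may not hold, one possible sketch is to use collect with a guard, which drops entries whose value has no counterpart in lower instead of throwing a NoSuchElementException. The extra "C" -> 3 entry and the object name are added here purely for illustration.

```scala
object ComposeMapsSafely {
  val caps: Map[String, Int] = Map("A" -> 1, "B" -> 2, "C" -> 3)
  val lower: Map[Int, String] = Map(1 -> "a", 2 -> "b")

  // Keep only the entries whose value is a key of lower;
  // "C" -> 3 is skipped because lower has no entry for 3.
  val composed: Map[String, String] =
    caps.collect { case (cap, idx) if lower.contains(idx) => cap -> lower(idx) }
}
```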

How to convert a mutable HashMap into an immutable equivalent in Scala?

Inside a function of mine I construct a result set by filling a new mutable HashMap with data (if there is a better way - I'd appreciate comments). Then I'd like to return the result set as an immutable HashMap. How to derive an immutable from a mutable?
Discussion about returning immutable.Map vs. immutable.HashMap notwithstanding, what about simply using the toMap method:
scala> val m = collection.mutable.HashMap(1 -> 2, 3 -> 4)
m: scala.collection.mutable.HashMap[Int,Int] = Map(3 -> 4, 1 -> 2)
scala> m.toMap
res22: scala.collection.immutable.Map[Int,Int] = Map(3 -> 4, 1 -> 2)
As of 2.9, this uses the method toMap in TraversableOnce, which is implemented as follows:
def toMap[T, U](implicit ev: A <:< (T, U)): immutable.Map[T, U] = {
  val b = immutable.Map.newBuilder[T, U]
  for (x <- self)
    b += x
  b.result
}
scala> val m = collection.mutable.HashMap(1->2,3->4)
m: scala.collection.mutable.HashMap[Int,Int] = Map(3 -> 4, 1 -> 2)
scala> collection.immutable.HashMap() ++ m
res1: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 3 -> 4)
or
scala> collection.immutable.HashMap(m.toSeq:_*)
res2: scala.collection.immutable.HashMap[Int,Int] = Map(1 -> 2, 3 -> 4)
If you have a mutable map, e.g. logMap: mutable.Map[String, String],
you just need to call logMap.toMap (without parentheses, since toMap takes only an implicit parameter).
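Putting this in the context of the question, here is a minimal sketch of a function that fills a mutable HashMap internally and returns an immutable Map. The function wordLengths and its data are hypothetical, chosen only to have something to accumulate.

```scala
import scala.collection.mutable

object MutableToImmutable {
  // Hypothetical example function: builds the result in a local mutable map,
  // then hands back an immutable copy so callers cannot modify it.
  def wordLengths(words: Seq[String]): Map[String, Int] = {
    val acc = mutable.HashMap.empty[String, Int]
    words.foreach(w => acc(w) = w.length)
    acc.toMap
  }

  val result: Map[String, Int] = wordLengths(Seq("fold", "map"))

  // The same result without any mutable state:
  val direct: Map[String, Int] = Seq("fold", "map").map(w => w -> w.length).toMap
}
```

The declared return type is the general immutable Map rather than immutable.HashMap, which keeps the signature independent of the concrete implementation.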