How to merge Maps on keys and combine their values in Scala [duplicate]

I have two Maps:
val map1 = Map("col_1" -> "data_1", "col_2" -> "data_2", "col_3" -> "data_3")
val map2 = Map("col_1" -> "myval_1", "col_2" -> "myval_2", "col_3" -> "myval_3")
Required output:
res = Map("col_1" -> ("data_1", "myval_1"), "col_2" -> ("data_2", "myval_2"),
"col_3" -> ("data_3", "myval_3"))
Basically, keep the keys of map1 and pair up the values of both maps.
Each output value must be a Tuple, not a List or Seq.

Use map (throws if a key of map1 is missing from map2):
val res = map1.map { case (k, v) => (k, (v, map2(k))) }
Or use collect (skips the keys not present in both maps):
val res = map1.collect { case (k, v) if map2.contains(k) => (k, (v, map2(k))) }
Or with a default value for keys missing from map2:
val res = map1.map { case (k, v) => (k, (v, map2.getOrElse(k, ""))) }
For the symmetric case, I'd go with the Scalaz version from my other answer.
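The three variants above only differ when map2 lacks a key. A minimal sketch of that difference, assuming a hypothetical map2 (called partial here) that is missing "col_2"; the object and value names are illustrative only:

```scala
import scala.util.Try

object MergeVariants {
  val map1 = Map("col_1" -> "data_1", "col_2" -> "data_2")
  // Hypothetical second map that is missing "col_2"
  val partial = Map("col_1" -> "myval_1")

  // map + apply: throws NoSuchElementException on the missing key (wrapped in Try here)
  val viaMap = Try(map1.map { case (k, v) => (k, (v, partial(k))) })

  // collect: silently drops keys that are absent from the second map
  val viaCollect = map1.collect { case (k, v) if partial.contains(k) => (k, (v, partial(k))) }

  // getOrElse: keeps every key of map1 and fills in a default value
  val viaDefault = map1.map { case (k, v) => (k, (v, partial.getOrElse(k, ""))) }
}
```

Which one fits depends on whether a missing key should be an error, be skipped, or be replaced by a placeholder.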

Keep every key of map1, pairing its value with the one from map2 when the key is present there and with an empty string otherwise:
val res = map1.map { case (k, v) => if (map2.contains(k)) (k, (v, map2(k))) else (k, (v, "")) }

This is a simple solution you can apply in your case:
// map2(key) returns the value directly and throws if the key is absent,
// so comparing it with None is never useful
val resultMap = map1.map(kv => kv._1 -> (kv._2, map2(kv._1)))
resultMap should be
Map(col_1 -> (data_1,myval_1), col_2 -> (data_2,myval_2), col_3 -> (data_3,myval_3))
The code above fails when a key of map1 is missing from map2. For that case, look the key up with get and match on the result; entries without a counterpart are kept unchanged, which widens the value type of the resulting Map:
val resultMap = map1.map(kv => map2.get(kv._1) match {
  case Some(x) => kv._1 -> (kv._2, x)
  case None => kv
})

Related

Group, map & reduce with two different reducer-operators

I have these tuples:
("T1",2,"x1"),
("T1",2,"x2"),
// … etc
And I want to reduce them to ("T1", 4, List("x1", "x2")). How can I do this?
I tried something like .groupBy(_._1).map{case (key,list) => key-> list.map(_._2).reduce(_+_)}
but this only sums the numbers and does not collect the strings into a list.
With groupMapReduce:
val xs = List(
("T1",40,"x1"),
("T1",2,"x2"),
("T2",58,"x3")
)
println(xs.groupMapReduce(_._1)
(e => (e._2, List(e._3)))
({ case ((x, y), (z, w)) => (x + z, y ++ w)})
)
with groupBy:
val xs = List(
("T1",40,"x1"),
("T1",2,"x2"),
("T2",58,"x3")
)
println(xs.groupBy(_._1)
.view
.mapValues(ys => (ys.view.map(_._2).sum, ys.map(_._3)))
.toMap
)
If you want to do it in one pass per list, without using ++, you could try something like this:
xs.groupBy(_._1)
.view
.mapValues(ys =>
ys.foldRight((0, List.empty[String])){
case ((_, n, x), (sum, acc)) => (n + sum, x :: acc)
}
)
.toMap
All three variants give
Map(T2 -> (58,List(x3)), T1 -> (42,List(x1, x2)))
Note that combining many lists with ++ might become very inefficient if the number of lists becomes large. It depends on your use-case whether this is acceptable or not.
Using foldLeft
val tuples = List(
("T1",2,"x1"),
("T1",2,"x2"),
("T2",2,"x1"),
("T2",2,"x2"),
("T3",2,"x1")
)
tuples.foldLeft(Map.empty[String, (Int, List[String])]){ (acc, curr) =>
acc.get(curr._1).fold(acc + (curr._1 -> (curr._2, List(curr._3)))) { case (int, ls) =>
acc + (curr._1 -> (int + curr._2, ls :+ curr._3)) // append keeps input order; reversing on every step scrambles it once a key has three or more entries
}
}
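A variation on the fold above, as a sketch: prepend to each list inside the fold (constant time per element) and reverse every list exactly once at the end, so the original order is restored without repeated appends. The names FoldDemo and combine are made up for this illustration.

```scala
object FoldDemo {
  val tuples = List(
    ("T1", 2, "x1"),
    ("T1", 2, "x2"),
    ("T1", 2, "x3"),
    ("T2", 2, "x1")
  )

  def combine(data: List[(String, Int, String)]): Map[String, (Int, List[String])] =
    data
      .foldLeft(Map.empty[String, (Int, List[String])]) { case (acc, (key, n, x)) =>
        // Start from (0, Nil) the first time a key is seen
        val (sum, xs) = acc.getOrElse(key, (0, Nil))
        acc.updated(key, (sum + n, x :: xs)) // lists are built in reverse order
      }
      .map { case (key, (sum, xs)) => (key, (sum, xs.reverse)) } // restore input order once
}
```

For the sample data this yields 6 with List(x1, x2, x3) for T1 and 2 with List(x1) for T2.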
If you have cats in scope, all you need to do is this:
import cats.data.Chain
import cats.syntax.all._
def combineTripletes(data: List[(String, Int, String)]): Map[String, (Int, List[String])] =
data.foldMap {
case (key, i, str) =>
Map(key -> (i, Chain.one(str)))
} fmap {
case (sum, chain) =>
sum -> chain.toList
}

Handling of Nested Maps

I have a Map in which one of the keys maps to another Map, i.e.:
val myDetailsMap = Map("name" -> "abc",
"class" -> "10",
"section" -> "A",
"marksPerSubjectId" -> Map(101 -> "Physics= '70' AND Chemistry='80'",
102 -> "History= '60' AND Civics = '67'"),
"status" -> "pass")
Now I want to iterate through the Map stored under the marksPerSubjectId key using foreach. How should I proceed?
I am running this on Databricks.
What about using pattern matching?
in Scala 2.13:
myDetailsMap.foreachEntry{ (k, v) =>
v match {
case map: Map[_, _] => map.foreachEntry{ (k, v) => println(v)}
case other => println(other)
}
}
in Scala 2.11:
myDetailsMap.foreach{ case (k, v) =>
v match {
case map: Map[_, _] => map.foreach{ case (k, v) => println(v)}
case other => println(other)
}
}
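If only the Map stored under one key is of interest, the same pattern match can be applied to a single lookup instead of the whole outer Map. A minimal sketch, assuming the outer Map is typed Map[String, Any]; the helper name nestedEntries is hypothetical:

```scala
object NestedMapDemo {
  val myDetailsMap: Map[String, Any] = Map(
    "name" -> "abc",
    "marksPerSubjectId" -> Map(
      101 -> "Physics= '70' AND Chemistry='80'",
      102 -> "History= '60' AND Civics = '67'"
    ),
    "status" -> "pass"
  )

  // Type parameters are erased at runtime, so only Map[_, _] can be checked here
  def nestedEntries(key: String): List[(Any, Any)] =
    myDetailsMap.get(key) match {
      case Some(m: Map[_, _]) => m.toList
      case _                  => Nil // key missing, or value is not a Map
    }
}
```

NestedMapDemo.nestedEntries("marksPerSubjectId").foreach { case (id, marks) => println(s"$id: $marks") } then prints one line per subject id.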

How do I access "filtered" items from collection?

I have a string val trackingHeader = "k1=v1, k2=v2, k3=v3, k4=v4" which I would like to parse into a Map(k1 -> v1, k2 -> v2, k3 -> v3, k4 -> v4). This is the code I used:
val trackingHeadersMap = trackingHeader
.replaceAll("\\s", "")
.split(",")
.map(_ split "=")
.map { case Array(k, v) => (k, v) }
.toMap
This gives the desired output. However, I also need to handle malformed input such as val trackingHeader = "k1=v1, k2=v2, k3=v3, k4=". Notice there is no value for key k4. The code above then fails with scala.MatchError: [Ljava.lang.String;@622a1e0c (of class [Ljava.lang.String;), so I changed it to:
val trackingHeadersMap = trackingHeader
.replaceAll("\\s", "")
.split(",")
.map(_ split "=")
.collect { case Array(k, v) => (k, v) }
.toMap
Using collect handles the malformed case, but I would also like to know which key had the issue and log it (in this example, k4). I tried the following and got the desired result, but I am not sure it is the right way to do it:
val badKeys = trackingHeader
.replaceAll("\\s", "")
.split(",")
.map(_ split "=")
.filterNot(_.length == 2)
Now I can iterate over the badKeys and print them out. Is there a better way to do this?
You could make the result optional, and use flatMap instead of map
.flatMap {
case Array(k, v) => Some(k -> v)
case Array(k) => println(s"Bad entry: $k"); None
}
One solution would be to add a map step that prints the bad key for elements matching a one-element array before the call to collect:
val trackingHeadersMap = trackingHeader
.replaceAll("\\s", "")
.split(",")
.map(_ split "=")
.map {
case v @ Array(k) => println(s"bad key: $k"); v
case v => v
}.collect {
case Array(k, v) => (k, v)
}
.toMap
A better solution (which separates side-effects from transformations) would be to use partition which would split this into two collections ("good" and "bad"), and handle each one separately:
val (good, bad) = trackingHeader
.replaceAll("\\s", "")
.split(",")
.map(_ split "=")
.partition(_.length == 2)
val trackingHeadersMap = good.map { case Array(k, v) => (k, v) }.toMap
bad.map(_(0)).foreach(k => println(s"bad key: $k"))
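The partition approach can be wrapped in a small function that returns both the parsed Map and the offending keys, leaving logging to the caller. This is a sketch only; the names HeaderParser and parse are made up for the illustration.

```scala
object HeaderParser {
  // Returns the well-formed pairs plus the keys of malformed entries
  def parse(header: String): (Map[String, String], List[String]) = {
    val (good, bad) = header
      .replaceAll("\\s", "")
      .split(",")
      .map(_.split("="))
      .partition(_.length == 2)

    val pairs = good.collect { case Array(k, v) => k -> v }.toMap
    // "=".split("=") yields an empty array, so headOption is safer than indexing with (0)
    val badKeys = bad.map(_.headOption.getOrElse("")).toList
    (pairs, badKeys)
  }
}
```

The caller can then decide what to do with the second element, for example badKeys.foreach(k => println(s"bad key: $k")).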

Converting List to Map using keys in Scala

I am struggling with finding an elegant FP approach to solving the following problem in Scala:
Say I have a set of candidate keys
val validKeys = Set("key1", "key2", "key3")
And a list that starts with a key, has some number of non-keys (> 0) between each pair of keys, and does not end with a key.
For example:
val myList = List("key3", "foo", "bar", "key1", "baz")
I'd like to transform this list into a map by using the valid keys as keys and aggregating the non-keys that follow each one as its value. So, in the example above:
("key3" -> "foo\nbar", "key1" -> "baz")
Thanks in advance.
Short and simple:
def create(a: List[String]): Map[String, String] = a match {
case Nil => Map()
case head :: tail =>
val (vals, rest) = tail.span(!validKeys(_))
create(rest) + (head -> vals.mkString("\n"))
}
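The recursion above relies on span, which splits a list at the first element that fails the predicate. A small sketch of that single step on the tail of the example list (a Set can be applied like a function, so !validKeys(_) tests for non-keys); the object name is illustrative:

```scala
object SpanDemo {
  val validKeys = Set("key1", "key2", "key3")
  // Everything after the leading "key3" in the example list
  val tail = List("foo", "bar", "key1", "baz")

  // vals: the leading run of non-keys; rest: starts at the next key
  val (vals, rest) = tail.span(!validKeys(_))
}
```

Here vals is List(foo, bar) and rest is List(key1, baz), so the recursive call continues from the next key.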
Traversing a list from left to right, accumulating a result should suggest foldLeft
myList.foldLeft((Map[String, String](), "")) {
case ((m, lk), s) =>
if (validKeys contains s)
(m updated (s, ""), s)
else (m updated (lk, if (m(lk) == "") s else m(lk) + "\n" + s), lk)
}._1
// Map(key3 -> foo\nbar, key1 -> baz)
As a first approximation:
import scala.annotation.tailrec
def group(list:List[String]):List[(String, List[String])] = {
@tailrec
def grp(list:List[String], key:String, acc:List[String]):List[(String, List[String])] =
list match {
case Nil => List((key, acc.reverse))
case x :: xs if validKeys(x) => (key, acc.reverse)::group(x::xs)
case x :: xs => grp(xs, key, x::acc)
}
list match {
case Nil => Nil
case x::xs => grp(xs, x, List())
}
}
val map = group(myList).toMap
Another option:
myList.foldLeft((Map[String, String](), "")) {
case ((map, key), item) if validKeys(item) => (map, item)
case ((map, key), item) =>
(map.updated(key, map.get(key).map(v => v + "\n" + item).getOrElse(item)), key)
}._1

How to find out common tuples from list of tuples using scala?

I have two lists, as follows:
val list1 = List(("192.168.0.1","A"),("192.168.0.2","B"),("192.168.0.3","C"))
val list2 = List(("192.168.0.104",2), ("192.168.0.119",2), ("205.251.0.185",24), ("192.168.0.1",153))
I want to match on the first value of each tuple in both lists, giving the following:
outputList = List(("192.168.0.1","A",153))
Currently I am using the following to get the output:
list1.map{
ajson =>
val findHost = list2.filter(_._1.contains(ajson._1.trim))
if(findHost.nonEmpty) {
(ajson._1,ajson._2,findHost.head._2)
} else ("NA","NA",0)
}.filterNot(p => p._1.equals("NA") || p._2.equals("NA"))
Is this the right approach?
I also tried
(list1 ::: list2).groupBy(_._1).map{.......}
but it returns all the elements from list1.
Can anyone help me get the expected output?
You can try this:
val res = for(
(k,v) <- list1;
n <- list2.toMap.get(k)
) yield (k,v,n)
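Note that in the for comprehension above, list2.toMap is evaluated again for every element of list1. A sketch of the same join with the lookup Map built once up front (the names JoinDemo and lookup are illustrative):

```scala
object JoinDemo {
  val list1 = List(("192.168.0.1", "A"), ("192.168.0.2", "B"), ("192.168.0.3", "C"))
  val list2 = List(("192.168.0.104", 2), ("192.168.0.119", 2), ("205.251.0.185", 24), ("192.168.0.1", 153))

  // Build the lookup table once instead of once per element of list1
  val lookup: Map[String, Int] = list2.toMap

  val res: List[(String, String, Int)] =
    for {
      (k, v) <- list1
      n <- lookup.get(k) // entries without a match in list2 are skipped
    } yield (k, v, n)
}
```

Keep in mind that toMap keeps only the last value for a repeated key, so this assumes the first elements in list2 are unique.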
Probably most performant would be
val m1 = list1.toMap
val m2 = list2.toMap
m1.keySet.intersect(m2.keySet).map(key => (key, m1(key), m2(key)))
UPDATE
If the tuples in your lists have more complex shapes than Tuple2, for example
val list1 = List(("192.168.0.1", "A", true, 'C'), ("192.168.0.2", "B", false, 'D'), ("192.168.0.3", "C", true, 'E'))
val list2 = List(("192.168.0.104", 2, 5.7), ("192.168.0.119", 2, 13.4), ("205.251.0.185", 24, 11.2), ("192.168.0.1", 153, 34.8))
you may need additional reshaping, like
val m1 = list1.view.map { case (key, v1, v2, v3) => (key, (v1, v2, v3)) }.toMap
val m2 = list2.view.map { case (key, v1, v2) => (key, (v1, v2)) }.toMap
m1.keySet.intersect(m2.keySet).map(key => (key, m1(key), m2(key)))
Or you could use enhanced johny's version with such reshaping :
val m2 = list2.view.map { case (key, v1, v2) => (key, (v1, v2)) }.toMap
list1.collect { case (ip, x1, x2, x3) if m2 contains ip => (ip, (x1, x2, x3), m2(ip)) }
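The following code should do the trick (map is not defined in the original snippet; it is assumed here to be list1 grouped by its first element):
val map = list1.groupBy(_._1) // assumption: lookup table built from list1
val result = list2.flatMap {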
entry =>
map.get(entry._1.trim) match {
case Some(list) =>
Some(list.map {
l =>
(l._1, l._2, entry._2)
})
case None => None
}
}.flatten
val list2Map = list2.toMap
list1.withFilter(x => list2Map.contains(x._1)).map(s => (s._1, s._2, list2Map(s._1)))