Find unique elements among the values of a map in Scala

I have a Map[String,Seq[String]].
For each key, I want to find the elements that appear in no other key's value. I want to do this in Scala.
Say I have
Map('a' -> Seq(1,2,3),
    'b' -> Seq(2,3),
    'c' -> Seq(4))
I want the result to be
Map('a' -> Seq(1), 'c' -> Seq(4))
Any idea how to do this?
Thanks!

If you are looking for the unique elements in each list, you can use currentList.diff(rest_of_the_list).
Given
scala> val input = Map('a' -> Seq(1,2,3), 'b' -> Seq(2,3), 'c' -> Seq(4))
input: scala.collection.immutable.Map[Char,Seq[Int]] = Map(a -> List(1, 2, 3), b -> List(2, 3), c -> List(4))
Find the rest of the elements for each key,
scala> val unions = input.map(elem => elem._1 -> input.filter(!_._1.equals(elem._1)).flatMap(_._2).toSet)
unions: scala.collection.immutable.Map[Char,scala.collection.immutable.Set[Int]] = Map(a -> Set(2, 3, 4), b -> Set(1, 2, 3, 4), c -> Set(1, 2, 3))
Then iterate over the input map and find the unique elements in each list:
scala> input.map(x => x._1 -> x._2.diff(unions(x._1).toList))
res18: scala.collection.immutable.Map[Char,Seq[Int]] = Map(a -> List(1), b -> List(), c -> List(4))
If you don't want keys with empty values (b in the example above), filter them out:
scala> input.map(x => x._1 -> x._2.diff(unions(x._1).toList)).filter(_._2.nonEmpty)
res21: scala.collection.immutable.Map[Char,Seq[Int]] = Map(a -> List(1), c -> List(4))
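One caveat with the diff-based approach: diff is a multiset difference, so it removes only one occurrence per matching element. If a value sequence repeats an element that also appears under another key, one copy survives. Below is a minimal sketch that filters against the set of the other keys' elements instead, assuming that leftover copy is unwanted. The names UniquePerKey and uniquePerKey are hypothetical.

```scala
object UniquePerKey {
  // For each key, keep only the elements that appear under no other key.
  // filterNot drops every occurrence, whereas diff drops one per match.
  def uniquePerKey[K, V](input: Map[K, Seq[V]]): Map[K, Seq[V]] =
    input.map { case (key, values) =>
      // All elements stored under the other keys
      val others: Set[V] = (input - key).values.flatten.toSet
      key -> values.filterNot(others.contains)
    }.filter { case (_, values) => values.nonEmpty } // drop keys left empty
}
```

For Map('a' -> Seq(2, 2, 1), 'b' -> Seq(2)), the diff-based version leaves a -> List(2, 1), while this sketch returns only a -> List(1).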

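Find the non-unique elements by flattening all values, grouping them, and keeping those that occur more than once. Then remove every non-unique element from each key's value and drop the keys that are left empty.
val input = Map('a' -> Seq(1,2,3),
                'b' -> Seq(2,3),
                'c' -> Seq(4))
// Elements that occur more than once across all values
val nonUnique = input.values.flatten
  .groupBy(identity)
  .filter(_._2.size > 1)
  .keys.toSeq
// Remove them from every value and keep only keys that still have elements
input.map { case (k, v) => k -> v.diff(nonUnique) }.filter(_._2.nonEmpty)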
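A variant of this idea can be wrapped in a reusable function. It counts how many keys each element appears under and keeps the elements seen under exactly one key. This is only a sketch with hypothetical names (UniqueByCount, uniqueValues). Unlike the snippet above, it assumes that a repeat inside a single sequence should not make an element non-unique.

```scala
object UniqueByCount {
  def uniqueValues[K, V](input: Map[K, Seq[V]]): Map[K, Seq[V]] = {
    // Number of keys each element appears under; distinct makes a repeat
    // inside one sequence count only once.
    val keyCount: Map[V, Int] =
      input.values.toSeq
        .flatMap(_.distinct)
        .groupBy(identity)
        .map { case (elem, occurrences) => elem -> occurrences.size }

    input
      .map { case (k, vs) => k -> vs.filter(v => keyCount(v) == 1) }
      .filter { case (_, vs) => vs.nonEmpty } // drop keys left empty
  }
}
```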

Related

Reduce rdd of maps

I have an RDD like this:
Map(A -> Map(A1 -> 1))
Map(A -> Map(A2 -> 2))
Map(A -> Map(A3 -> 3))
Map(B -> Map(B1 -> 4))
Map(B -> Map(B2 -> 5))
Map(B -> Map(B3 -> 6))
Map(C -> Map(C1 -> 7))
Map(C -> Map(C2 -> 8))
Map(C -> Map(C3 -> 9))
I need the same RDD reduced by key, keeping all the values it had before:
Map(A -> Map(A1 -> 1, A2 -> 2, A3 -> 3))
Map(B -> Map(B1 -> 4, B2 -> 5, B3 -> 6))
Map(C -> Map(C1 -> 7, C2 -> 8, C3 -> 9))
I tried with a reduce:
val prueba = replacements_2.reduce((x,y) => x ++ y)
But only the value of the last element evaluated for each key remains:
(A,Map(A3 -> 3))
(C,Map(C3 -> 9))
(B,Map(B3 -> 6))
I think you should model your data differently; your Map approach seems a bit awkward. Why represent one entry as a Map with one element? A Tuple2 is more suitable for this. Anyway, you need reduceByKey. To use it, first convert your RDD to a key-value RDD:
rdd
  .map(m => (m.keys.head, m.values.head)) // create key-value RDD
  .reduceByKey((a, b) => a ++ b) // merge maps
  .map { case (k, v) => Map(k -> v) } // create Map again
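To see what the reduceByKey step does without a Spark cluster, here is a minimal local sketch on plain Scala collections. It is an analog, not Spark code: groupBy followed by reduce stands in for reduceByKey. The sample data and the names MergeByKey, records and mergeByKey are assumptions modelled on the question.

```scala
object MergeByKey {
  // Sample records mirroring the question: one single-entry map per record
  val records: Seq[Map[String, Map[String, Int]]] = Seq(
    Map("A" -> Map("A1" -> 1)),
    Map("A" -> Map("A2" -> 2)),
    Map("B" -> Map("B1" -> 4)),
    Map("B" -> Map("B2" -> 5))
  )

  def mergeByKey(rs: Seq[Map[String, Map[String, Int]]]): Map[String, Map[String, Int]] =
    rs.flatten          // every (outerKey, innerMap) pair
      .groupBy(_._1)    // collect the pairs that share an outer key
      .map { case (k, pairs) => k -> pairs.map(_._2).reduce(_ ++ _) } // merge inner maps
}
```

The plain reduce in the question loses data because ++ on the outer maps replaces the whole inner map whenever both sides contain the same outer key. Merging has to happen on the inner maps, per key.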

Convert List of Maps to Map of Lists based on Map Key

Let's say I have the following list:
val myList = List(Map(1 -> 1), Map(2 -> 2), Map(2 -> 7))
I want to convert this list to a single Map of Int -> List(Int) such that if we have duplicate keys then both values should be included in the resulting value list:
Map(2 -> List(7, 2), 1 -> List(1))
I came up with this working solution but it seems excessive and clunky:
myList.foldLeft(scala.collection.mutable.Map[Int,List[Int]]()) {(result,element) =>
  for((k,v) <- element) {
    if (result.keySet.contains(k)) {
      result(k) = result(k).:: (v)
    } else {
      result += (k -> List(v))
    }
  }
  result
}
Is there a better or more efficient approach here?
myList
  .flatten
  .groupBy(_._1)
  .mapValues(_.map(_._2))
You can use simpler (but probably less efficient) code:
val myList = List(Map(1 -> 1), Map(2 -> 2), Map(2 -> 7))
val grouped = myList.flatMap(_.toList).groupBy(_._1).mapValues(l => l.map(_._2))
println(grouped)
Map(2 -> List(2, 7), 1 -> List(1))
The idea is to first get List of all tuples from all inner Maps and then group them.
Starting with Scala 2.13, we can use groupMap, which is a one-pass equivalent of a groupBy followed by mapValues (as its name suggests):
// val maps = List(Map(1 -> 1), Map(2 -> 2), Map(2 -> 7))
maps.flatten.groupMap(_._1)(_._2) // Map(1 -> List(1), 2 -> List(2, 7))
This:
- flattens the list of maps into a list of tuples (List((1, 1), (2, 2), (2, 7)))
- groups elements based on their first tuple part (_._1) (the group part of groupMap)
- maps grouped values to their second tuple part (_._2) (the map part of groupMap)
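If a single pass is wanted on Scala versions before 2.13, the fold from the question can also be written with an immutable accumulator. This is a minimal sketch, and the names GroupValues and groupValues are hypothetical. Like the original fold, it prepends, so the values for a key come out in reverse encounter order.

```scala
object GroupValues {
  def groupValues[K, V](maps: List[Map[K, V]]): Map[K, List[V]] =
    maps.foldLeft(Map.empty[K, List[V]]) { (acc, m) =>
      m.foldLeft(acc) { case (inner, (k, v)) =>
        // Prepend v to the list already stored for k, or start a new list
        inner.updated(k, v :: inner.getOrElse(k, Nil))
      }
    }
}
```

For List(Map(1 -> 1), Map(2 -> 2), Map(2 -> 7)) this yields Map(1 -> List(1), 2 -> List(7, 2)), matching the result requested in the question.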

Scala Map: Combine keys with the same value?

Suppose I have a Map like
val x = Map(1 -> List("a", "b"), 2 -> List("a"),
3 -> List("a", "b"), 4 -> List("a"),
5 -> List("c"))
How would I create from this a new Map where the keys are Lists of keys from x having the same value, e.g., how can I implement
def someFunction(m: Map[Int, List[String]]): Map[List[Int], List[String]] =
// stuff that would turn x into
// Map(List(1, 3) -> List("a", "b"), List(2, 4) -> List("a"), List(5) -> List("c"))
?
You can convert the Map to a List and then use groupBy to aggregate the first element of each tuple:
x.toList.groupBy(_._2).mapValues(_.map(_._1)).map{ case (x, y) => (y, x) }
// res37: scala.collection.immutable.Map[List[Int],List[String]] =
// Map(List(2, 4) -> List(a), List(1, 3) -> List(a, b), List(5) -> List(c))
Or, as @Dylan commented, use _.swap to switch the tuples' elements:
x.toList.groupBy(_._2).mapValues(_.map(_._1)).map(_.swap)
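Putting this together, the someFunction signature from the question could be filled in roughly as follows. This is a sketch rather than a definitive version: it folds the grouping and the swap into a single map step, and it sorts the collected keys, which is an added assumption so that the resulting key lists do not depend on the map's iteration order.

```scala
object CombineKeys {
  def someFunction(m: Map[Int, List[String]]): Map[List[Int], List[String]] =
    m.toList
      .groupBy { case (_, value) => value } // group entries by their value
      .map { case (value, entries) =>
        // The sorted keys of the group become the new key
        entries.map(_._1).sorted -> value
      }
}
```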

How to create a map out of two lists?

I have two lists
val a = List(1,2,3)
val b = List(5,6,7)
I'd like to create a Map like:
val h = Map(1->5, 2->6, 3->7)
basically iterating through both lists and assigning key-value pairs.
How do I do this properly in Scala?
You can zip the lists together into a list of tuples, then call toMap:
(a zip b).toMap
Note that if one list is longer than the other, it will be truncated.
Example:
val a = List(1, 2, 3)
val b = List(5, 6, 7)
scala> (a zip b).toMap
res2: scala.collection.immutable.Map[Int,Int] = Map(1 -> 5, 2 -> 6, 3 -> 7)
With truncation:
val c = List("a", "b", "c", "d", "e")
scala> (a zip c).toMap
res3: scala.collection.immutable.Map[Int,String] = Map(1 -> a, 2 -> b, 3 -> c)
scala> (c zip a).toMap
res4: scala.collection.immutable.Map[String,Int] = Map(a -> 1, b -> 2, c -> 3)
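If silent truncation is not acceptable, one option is to check the lengths first. The sketch below assumes a mismatch should be reported as None rather than tolerated. SafeZip and zipToMap are hypothetical names, not part of the standard library.

```scala
object SafeZip {
  // Returns None when the lists differ in length instead of truncating
  def zipToMap[K, V](keys: List[K], values: List[V]): Option[Map[K, V]] =
    if (keys.length == values.length) Some(keys.zip(values).toMap)
    else None
}
```

Note that toMap keeps the last value when a key repeats, so duplicate entries in the key list also shrink the result even when the lengths match.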

Remove an entry from a Map and return a new Map

I want to make sure a Map contains no empty values: if a value is empty, its entry shouldn't be included in the new Map.
I tried something like:
val newmap = map.map{ entry => if(!entry._2.isEmpty()) Map(entry._1 -> entry._2)}
This does what I want, but it is not very nice. Is there a better solution?
scala> Map(1 -> List(3, 4), 2 -> Nil, 3 -> List(11))
res2: scala.collection.immutable.Map[Int,List[Int]] = Map(1 -> List(3, 4), 2 -> List(), 3 -> List(11))
scala> res2.filter(_._2.nonEmpty)
res3: scala.collection.immutable.Map[Int,List[Int]] = Map(1 -> List(3, 4), 3 -> List(11))
You mean empty as in null?
scala> val map = collection.immutable.HashMap[Int, String] (1 -> "a", 2-> "b", 3 -> null)
map: scala.collection.immutable.HashMap[Int,String] = Map(1 -> a, 2 -> b, 3 -> null)
scala> val newmap=map filter (_._2 != null)
newmap: scala.collection.immutable.HashMap[Int,String] = Map(1 -> a, 2 -> b)
EDIT: dang... @missingfaktor beat me to it... :)
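The two readings of "empty" can be combined in a single filter. This is a minimal sketch, assuming the values are sequences and that both null and empty values should be dropped. DropEmpty and dropEmpty are hypothetical names.

```scala
object DropEmpty {
  // Keeps only entries whose value is neither null nor an empty sequence
  def dropEmpty[K, V](m: Map[K, Seq[V]]): Map[K, Seq[V]] =
    m.filter { case (_, v) => v != null && v.nonEmpty }
}
```

Because filter returns a new immutable Map, the original map is left unchanged.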