Inverting a Map[Long, Set[Long]] to a Map[Long, Long] - scala

I'm trying to convert a Map[Long, Set[Long]] to a Map[Long, Long].
I tried this, but I'm getting compile errors:
m.map(_.swap).map(k => k._1.map((_, k._2)))
Example:
Map(10 -> Set(1,2,3), 11 -> Set(4,5))
Should become:
Map(1 -> 10,
2 -> 10,
3 -> 10,
4 -> 11,
5 -> 11)

flatMap on Map[A,B] will "just work" with collections of tuples:
m.flatMap {case (k,v) => v.map(_ -> k)} // Map[Long,Long]
This goes from a Map[Long,Set[Long]] to a series of Set[(Long,Long)] that gets flattened into a Map[Long,Long].
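As a quick check, here is a minimal sketch of that flatMap applied to the sample data from the question. The names m and inverted are only illustrative.

```scala
// Sample data taken from the question.
val m: Map[Long, Set[Long]] = Map(10L -> Set(1L, 2L, 3L), 11L -> Set(4L, 5L))

// Each (key, set) entry becomes a Set of (element -> key) pairs;
// flatMap concatenates those pairs into a new Map.
val inverted: Map[Long, Long] = m.flatMap { case (k, vs) => vs.map(_ -> k) }

println(inverted)
```

Because the sets in this sample are disjoint, every element ends up as exactly one key in the result.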

Assuming in is your Map[Long, Set[Long]]:
in.foldLeft(Map.empty[Long, Long]) { case (acc, (key, values)) => acc ++ values.map(_ -> key) }

To clarify, it seems like you have this:
Map(10 -> Set(1,2,3), 11 -> Set(4,5))
And you want to convert this map into another map, something like this:
Map(1 -> 10,
2 -> 10,
3 -> 10,
4 -> 11,
5 -> 11)
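Note that if the sets are not disjoint, some entries will be missing from the resulting map, because each key can map to only one value.
With this in mind, the code looks like this:
val m: Map[Long, Set[Long]] = Map(10L -> Set(1L, 2L, 3L), 11L -> Set(4L, 5L))
// The original attempt, m.map(_.swap).map(k => k._1.map((_, k._2))), yields a
// collection of Sets of pairs rather than a Map[Long, Long].
// Instead, pair every element of each set with the key that set belongs to.
val foo: Iterable[(Long, Long)] = m.flatMap { t =>
  val (key, value) = t
  value.map(_ -> key)
}
val result: Map[Long, Long] = foo.toMap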
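To make the caveat about non-disjoint sets concrete, here is a small sketch with overlapping sets. The input values are made up for illustration. Which key survives for the shared element depends on the map's iteration order, so the sketch does not rely on it.

```scala
// Hypothetical input where the element 2 appears under both keys.
val overlapping: Map[Long, Set[Long]] = Map(8L -> Set(1L, 2L), 9L -> Set(2L, 3L))

val flattened: Map[Long, Long] =
  overlapping.flatMap { case (key, values) => values.map(_ -> key) }

// Four (element, key) pairs are produced, but there are only three distinct
// elements, so one of the two pairs for element 2 is overwritten.
println(flattened.size)    // 3
println(flattened.get(2L)) // Some(8) or Some(9), depending on iteration order
```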

This will invert your Map m from Map[Long, Set[Long]] to Map[Long, List[Long]].
m flatten {case(k, vs) => vs.map((_, k))} groupBy (_._1) mapValues {_.map(_._2)}
You haven't specified what should happen when different Set values contain some of the same Longs (e.g. Map(8 -> Set(1,2), 9 -> Set(2,3))). If you're sure that won't happen, you can use the following adjustment.
m flatten {case(k, vs) => vs.map((_, k))} groupBy (_._1) mapValues {_.head._2}
Or even more simply:
m.flatten {case(k, vs) => vs.map((_, k))}.toMap
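If the sets can overlap and no key should be dropped, one option is to keep every original key per element. The following is a sketch assuming Scala 2.13+ (for groupMap); invertKeepingAll is a hypothetical helper name, not something from the answers above.

```scala
// Inverts the map while keeping all original keys for each element.
def invertKeepingAll(m: Map[Long, Set[Long]]): Map[Long, Set[Long]] =
  m.toSeq
    .flatMap { case (k, vs) => vs.toSeq.map(v => v -> k) } // all (element, key) pairs
    .groupMap(_._1)(_._2)                                  // element -> Seq of keys
    .view
    .mapValues(_.toSet)
    .toMap

println(invertKeepingAll(Map(8L -> Set(1L, 2L), 9L -> Set(2L, 3L))))
```

Going through a Seq first matters here: flattening directly into a Map would already discard one of the colliding pairs before grouping.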

Related

How to reverse Map

I'm trying to reverse a Map, but the output has only 2 elements.
val occurrences: Map[String, Int] = arr.groupMapReduce(identity)(_ => 1)(_ + _)
Output: HashMap(world -> 2, Hello, -> 1, hello, -> 1, hello -> 2, and -> 1, world, -> 1)
val reversed = for ((k,v) <- occurrences) yield (v, k)
Output: HashMap(1 -> world,, 2 -> hello)
How did I lose the other entries?
Similar to #user's proposal, but trying to be a little bit more efficient.
def invertMap[K, V](map: Map[K, V]): Map[V, List[K]] =
  map
    .view
    .groupMap(_._2)(_._1)
    .view
    .mapValues(_.toList)
    .toMap
The performance difference would probably be negligible so go with the one you find more readable.
As #Luis Miguel Mejía Suárez said, you can't have duplicate keys in a Map, so when you try to make the values the keys, some of the entries are lost.
You can instead do this to obtain a Map[Int, List[String]]
val occurrences = Map("world" -> 2, "Hello," -> 1, "hello," -> 1, "hello" -> 2, "and" -> 1, "world," -> 1)
val x: Map[Int, List[String]] =
  occurrences.toList
    .groupBy { case (k, v) => v }
    .view.mapValues(v => v.map(_._1))
    .toMap
Output:
Map(1 -> List(Hello,, hello,, and, world,), 2 -> List(world, hello))
P.S. The .view and .toMap stuff is because mapValues on MapOps is deprecated for now. There'll be a proper strict version later, though.
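To see where the entries go, here is a small sketch with a trimmed-down version of the data (the three entries are chosen for illustration). Yielding tuples from a Map builds another Map, so swapped pairs with the same key collapse; going through a Seq keeps them all.

```scala
val occurrences: Map[String, Int] = Map("world" -> 2, "hello" -> 2, "and" -> 1)

// Yielding tuples from a Map produces a Map, so equal keys collapse.
val asMap: Map[Int, String] = for ((k, v) <- occurrences) yield (v, k)

// Converting to a Seq first keeps every swapped pair.
val asPairs: Seq[(Int, String)] = for ((k, v) <- occurrences.toSeq) yield (v, k)

println(asMap.size)   // 2
println(asPairs.size) // 3
```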

How to find duplicate values in Map

I have the following Map[String, Int]:
val m = Map[String, Int](
"[LOGIN-011]" -> 0,
"[LOGIN-103]" -> 3,
"[LOGIN-222]" -> 10,
"[ERROR-110]" -> 1,
"[ERROR-012]" -> 3,
...
)
How can I find the duplicated values in the Map and print each one with its keys as a List[String], as follows:
3 -> List("[LOGIN-103]", "[ERROR-012]")
Try
m
  .toSeq
  .groupBy { case (key, value) => value }
  .collect { case (key, values: List[(String, Int)]) if values.size > 1 => (key, values.map(_._1)) }
which outputs
HashMap(3 -> List([ERROR-012], [LOGIN-103]))
Here is Luis' one-liner:
m.groupBy(_._2).collect { case (key, group: Map[String, Int]) if group.size > 1 => (key, group.keySet) }
The following works in Scala 2.13+ only:
val map = Map (
"[LOGIN-011]" -> 0,
"[LOGIN-103]" -> 3,
"[LOGIN-222]" -> 10,
"[ERROR-110]" -> 1,
"[ERROR-012]" -> 3
)
val duplicateValues = map.groupMap(_._2)(_._1).filterNot(_._2.sizeIs == 1)
//Map(3 -> List([ERROR-012], [LOGIN-103]))
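The same idea can be wrapped in a generic helper. This is only a sketch assuming Scala 2.13+; keysByDuplicatedValue is a hypothetical name, and it returns the keys as a Set so the result does not depend on iteration order.

```scala
// Groups keys by value and keeps only the values shared by more than one key.
def keysByDuplicatedValue[K, V](m: Map[K, V]): Map[V, Set[K]] =
  m.groupMap(_._2)(_._1) // value -> all keys that carry it
    .collect { case (value, keys) if keys.sizeIs > 1 => value -> keys.toSet }

val logCodes = Map(
  "[LOGIN-011]" -> 0,
  "[LOGIN-103]" -> 3,
  "[LOGIN-222]" -> 10,
  "[ERROR-110]" -> 1,
  "[ERROR-012]" -> 3
)

println(keysByDuplicatedValue(logCodes))
```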

How to calculate running total for a SortedMap / TreeMap

So I have the following SortedMap:
val mySortedMap: SortedMap[Double, Int] = SortedMap(1.1 -> 7, 2.4 -> 3, 6.5 -> 12)
Now I need to calculate the running total for each key, so the output should look like this:
val result: SortedMap[Double, Int] = SortedMap(1.1 -> 7, 2.4 -> 10, 6.5 -> 22)
I know I can do something similar by using scanLeft:
val result: Iterable[Int] = mySortedMap.scanLeft(0)((c, e) => c + e._2)
But this returns Iterable whereas I need to keep my SortedMap as specified above. What is the most functional / efficient way to do so?
You can use foldLeft where your accumulator is both, a new SortedMap and the running total.
import scala.collection.immutable.SortedMap
def runningTotal(map: SortedMap[Double, Int]): SortedMap[Double, Int] = {
  val (transformed, _) = map.foldLeft((SortedMap.empty[Double, Int], 0)) {
    (acc, element) =>
      val (mapAcc, totalAcc) = acc
      val (key, value) = element
      val newTotal = totalAcc + value
      val newMap = mapAcc + (key -> newTotal)
      (newMap, newTotal)
  }
  transformed
}
Bonus, here is your solution but using Iterators instead, thus it would be a little bit more efficient.
def runningTotal(map: SortedMap[Double, Int]): SortedMap[Double, Int] = {
  val newValues = map.valuesIterator.scanLeft(0) {
    (acc, value) => acc + value
  }.drop(1)
  map.keysIterator.zip(newValues).to(SortedMap)
}
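The iterator approach also generalizes beyond Double keys and Int values. Here is a sketch assuming Scala 2.13+; runningTotalOf is a hypothetical name, and it reuses the input map's own ordering rather than looking one up implicitly.

```scala
import scala.collection.immutable.SortedMap

// Running total over the values of any SortedMap whose values are Numeric.
def runningTotalOf[K, V](map: SortedMap[K, V])(implicit num: Numeric[V]): SortedMap[K, V] = {
  // scanLeft emits the seed first, so drop it to line totals up with keys.
  val totals = map.valuesIterator.scanLeft(num.zero)((acc, v) => num.plus(acc, v)).drop(1)
  SortedMap.from(map.keysIterator.zip(totals))(map.ordering)
}

println(runningTotalOf(SortedMap(1.1 -> 7, 2.4 -> 3, 6.5 -> 12)))
```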
Ok, I think I might have found one possible solution:
val input: SortedMap[Double, Int] = SortedMap(1.1 -> 7, 2.4 -> 3, 6.5 -> 12)
val aggregates: Iterable[Int] = input.scanLeft(0)((c, e) => c + e._2).tail
val sortedAggregates: SortedMap[Double, Int] = input.keySet.zip(aggregates).to(SortedMap)
println(sortedAggregates)
gives TreeMap(1.1 -> 7, 2.4 -> 10, 6.5 -> 22).
I wonder if there is a better way?

How to merge Maps in Scala with tuples as key

I have maps of this type:
m: Map[(String, String, String), Double]
and I would like to merge them in a way to get a final Map with the following type:
mm: Map[(String, String, String), Seq[Double]]
So for example:
val m1 = Map (("a","b","c") -> 2.0, ("a","b","d") -> 3.0)
val m2 = Map (("a","b","c") -> 5.0, ("a","b","k") -> 3.0)
// after the merge
Map (("a","b","c") -> Seq(2.0, 5.0), ("a","b","d") -> Seq(3.0), ("a","b","k") -> Seq(3.0))
How can I get that with Scala?
You can do:
(m1.toSeq ++ m2.toSeq)
  .groupBy { case (k, v) => k }
  .mapValues(_.map { case (k, v) => v })
If you have already imported scalaz then you can do:
m1.mapValues(_.point[List]) |+| m2.mapValues(_.point[List])
You can convert Maps to Seq and then group the Seq by the key:
(m1.toSeq ++ m2.toSeq).groupBy(_._1).mapValues(_.map(_._2))
// res80: scala.collection.immutable.Map[(String, String, String),Seq[Double]] = Map((a,b,k) -> ArrayBuffer(3.0), (a,b,c) -> ArrayBuffer(2.0, 5.0), (a,b,d) -> ArrayBuffer(3.0))
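On Scala 2.13+, where mapValues on a Map is deprecated, the groupBy and mapValues steps can be combined with groupMap. A sketch using the sample maps from the question:

```scala
val m1 = Map(("a", "b", "c") -> 2.0, ("a", "b", "d") -> 3.0)
val m2 = Map(("a", "b", "c") -> 5.0, ("a", "b", "k") -> 3.0)

// Group all entries by key and keep only the values in each group.
val merged: Map[(String, String, String), Seq[Double]] =
  (m1.toSeq ++ m2.toSeq).groupMap(_._1)(_._2)

println(merged)
```

Values from m1 come before values from m2 within each Seq, because the two sequences are concatenated in that order before grouping.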

Scala: how to merge a collection of Maps

I have a List of Map[String, Double], and I'd like to merge their contents into a single Map[String, Double]. How should I do this in an idiomatic way? I imagine that I should be able to do this with a fold. Something like:
val newMap = Map[String, Double]() /: listOfMaps { (accumulator, m) => ... }
Furthermore, I'd like to handle key collisions in a generic way. That is, if I add a key to the map that already exists, I should be able to specify a function that returns a Double (in this case) and takes the existing value for that key, plus the value I'm trying to add. If the key does not yet exist in the map, then just add it and its value unaltered.
In my specific case I'd like to build a single Map[String, Double] such that if the map already contains a key, then the Double will be added to the existing map value.
I'm working with mutable maps in my specific code, but I'm interested in more generic solutions, if possible.
Well, you could do:
mapList reduce (_ ++ _)
except for the special requirement for collision.
Since you do have that special requirement, perhaps the best would be doing something like this (2.8):
// Map needs type parameters to compile; Map[Symbol, Double] matches the CombiningMap below.
def combine(m1: Map[Symbol, Double], m2: Map[Symbol, Double]): Map[Symbol, Double] = {
  val k1 = Set(m1.keysIterator.toList: _*)
  val k2 = Set(m2.keysIterator.toList: _*)
  val intersection = k1 & k2
  val r1 = for(key <- intersection) yield (key -> (m1(key) + m2(key)))
  val r2 = m1.filterKeys(!intersection.contains(_)) ++ m2.filterKeys(!intersection.contains(_))
  r2 ++ r1
}
You can then add this method to the map class through the Pimp My Library pattern, and use it in the original example instead of "++":
class CombiningMap(m1: Map[Symbol, Double]) {
  def combine(m2: Map[Symbol, Double]) = {
    val k1 = Set(m1.keysIterator.toList: _*)
    val k2 = Set(m2.keysIterator.toList: _*)
    val intersection = k1 & k2
    val r1 = for(key <- intersection) yield (key -> (m1(key) + m2(key)))
    val r2 = m1.filterKeys(!intersection.contains(_)) ++ m2.filterKeys(!intersection.contains(_))
    r2 ++ r1
  }
}
// Then use this:
implicit def toCombining(m: Map[Symbol, Double]) = new CombiningMap(m)
// And finish with:
mapList reduce (_ combine _)
This was written for 2.8. For 2.7, keysIterator becomes keys, filterKeys might need to be written in terms of filter and map, & becomes **, and so on, but it shouldn't be too different.
How about this one:
def mergeMap[A, B](ms: List[Map[A, B]])(f: (B, B) => B): Map[A, B] =
  (Map[A, B]() /: (for (m <- ms; kv <- m) yield kv)) { (a, kv) =>
    a + (if (a.contains(kv._1)) kv._1 -> f(a(kv._1), kv._2) else kv)
  }
val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
val mm = mergeMap(ms)((v1, v2) => v1 + v2)
println(mm) // prints Map(hello -> 5.5, world -> 2.2, goodbye -> 3.3)
And it works in both 2.7.5 and 2.8.0.
I'm surprised no one's come up with this solution yet:
myListOfMaps.flatten.toMap
Does exactly what you need:
Merges the list to a single Map
Weeds out any duplicate keys
Example:
scala> List(Map('a -> 1), Map('b -> 2), Map('c -> 3), Map('a -> 4, 'b -> 5)).flatten.toMap
res7: scala.collection.immutable.Map[Symbol,Int] = Map('a -> 4, 'b -> 5, 'c -> 3)
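flatten turns the list of maps into a flat list of tuples, and toMap turns that list of tuples into a map with all the duplicate keys removed. Note that for a duplicated key the last value wins (as with 'a -> 4 above); the values are not combined.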
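A short sketch with Double values (the numbers are made up for illustration) shows what flatten.toMap does on a key collision: the later value replaces the earlier one instead of being added to it.

```scala
val ms = List(Map("hello" -> 1.0, "world" -> 2.0), Map("goodbye" -> 3.0, "hello" -> 4.0))

// toMap inserts the pairs in order, so a later pair overwrites an earlier one.
val lastWins: Map[String, Double] = ms.flatten.toMap

println(lastWins("hello")) // 4.0
```

So this only fits the original question if overwriting, rather than summing, is the desired collision behaviour.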
Starting with Scala 2.13, another solution that handles duplicate keys using only the standard library is to merge the Maps into a sequence of tuples (flatten) and then apply the new groupMapReduce operator, which (as its name suggests) is equivalent to a groupBy followed by a mapping step and a reduce step over the grouped values:
List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
  .flatten
  .groupMapReduce(_._1)(_._2)(_ + _)
// Map("world" -> 2.2, "goodbye" -> 3.3, "hello" -> 5.5)
This:
flattens (concatenates) the maps as a sequence of tuples (List(("hello", 1.1), ("world", 2.2), ("goodbye", 3.3), ("hello", 4.4))), which keeps all key/values (even duplicate keys)
groups elements based on their first tuple part (_._1) (group part of groupMapReduce)
maps grouped values to their second tuple part (_._2) (map part of groupMapReduce)
reduces mapped grouped values (_+_) by taking their sum (but it can be any reduce: (T, T) => T function) (reduce part of groupMapReduce)
The groupMapReduce step can be seen as a one-pass version equivalent of:
list.groupBy(_._1).mapValues(_.map(_._2).reduce(_ + _))
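Since the question asks for a caller-supplied collision function, the groupMapReduce approach can be wrapped in a small helper. This is a sketch assuming Scala 2.13+; mergeAll is a hypothetical name and the sample values are chosen for illustration.

```scala
// Merges any number of maps, resolving key collisions with `combine`.
def mergeAll[K, V](maps: Iterable[Map[K, V]])(combine: (V, V) => V): Map[K, V] =
  maps.toList
    .flatMap(_.toList)                   // every (key, value) pair, duplicates kept
    .groupMapReduce(_._1)(_._2)(combine) // group by key, reduce the values

val ms = List(Map("hello" -> 1.0, "world" -> 2.0), Map("goodbye" -> 3.0, "hello" -> 4.0))

val summed = mergeAll(ms)(_ + _)
val maxed  = mergeAll(ms)((a, b) => math.max(a, b))
println(summed("hello")) // 5.0
println(maxed("hello"))  // 4.0
```

A key that appears in only one map is never passed to combine; its value is carried over unchanged, which matches the requirement in the question.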
Interesting, noodling around with this a bit, I got the following (on 2.7.5):
General Maps:
def mergeMaps[A,B](collisionFunc: (B,B) => B)(listOfMaps: Seq[scala.collection.Map[A,B]]): Map[A, B] = {
  listOfMaps.foldLeft(Map[A, B]()) { (m, s) =>
    Map(
      s.projection.map { pair =>
        if (m contains pair._1)
          (pair._1, collisionFunc(m(pair._1), pair._2))
        else
          pair
      }.force.toList:_*)
  }
}
But man, that is hideous with the projection and forcing and toList and whatnot. Separate question: what's a better way to deal with that within the fold?
For mutable Maps, which is what I was dealing with in my code, and with a less general solution, I got this:
def mergeMaps[A,B](collisionFunc: (B,B) => B)(listOfMaps: List[mutable.Map[A,B]]): mutable.Map[A, B] = {
  listOfMaps.foldLeft(mutable.Map[A,B]()) {
    (m, s) =>
      for (k <- s.keys) {
        if (m contains k)
          m(k) = collisionFunc(m(k), s(k))
        else
          m(k) = s(k)
      }
      m
  }
}
That seems a little bit cleaner, but will only work with mutable Maps as it's written. Interestingly, I first tried the above (before I asked the question) using /: instead of foldLeft, but I was getting type errors. I thought /: and foldLeft were basically equivalent, but the compiler kept complaining that I needed explicit types for (m, s). What's up with that?
I read this question quickly, so I'm not sure if I'm missing something (like it has to work for 2.7.x, or without scalaz):
import scalaz._
import Scalaz._
val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
ms.reduceLeft(_ |+| _)
// returns Map(goodbye -> 3.3, hello -> 5.5, world -> 2.2)
You can change the monoid definition for Double and get another way to accumulate the values, here getting the max:
implicit val dbsg: Semigroup[Double] = semigroup((a,b) => math.max(a,b))
ms.reduceLeft(_ |+| _)
// returns Map(goodbye -> 3.3, hello -> 4.4, world -> 2.2)
I wrote a blog post about this; check it out:
http://www.nimrodstech.com/scala-map-merge/
Basically, using the scalaz Semigroup you can achieve this pretty easily.
It would look something like:
import scalaz.Scalaz._
listOfMaps reduce(_ |+| _)
A one-liner helper function whose usage reads almost as cleanly as using scalaz:
def mergeMaps[K,V](m1: Map[K,V], m2: Map[K,V])(f: (V,V) => V): Map[K,V] =
(m1 -- m2.keySet) ++ (m2 -- m1.keySet) ++ (for (k <- m1.keySet & m2.keySet) yield { k -> f(m1(k), m2(k)) })
val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
ms.reduceLeft(mergeMaps(_,_)(_ + _))
// returns Map(goodbye -> 3.3, hello -> 5.5, world -> 2.2)
For ultimate readability, wrap it in an implicit custom type:
class MyMap[K,V](m1: Map[K,V]) {
  def merge(m2: Map[K,V])(f: (V,V) => V) =
    (m1 -- m2.keySet) ++ (m2 -- m1.keySet) ++ (for (k <- m1.keySet & m2.keySet) yield { k -> f(m1(k), m2(k)) })
}
implicit def toMyMap[K,V](m: Map[K,V]) = new MyMap(m)
val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
ms reduceLeft { _.merge(_)(_ + _) }
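On newer Scala versions the wrapper class and the implicit def can be folded into a single implicit class. The following is a sketch, not the answer's original code: MapMergeSyntax, MergeOps and mergeWith are hypothetical names, and the merge is written as a fold over the right-hand map rather than with set operations.

```scala
object MapMergeSyntax {
  // Adds a mergeWith method to any immutable Map.
  implicit class MergeOps[K, V](private val left: Map[K, V]) {
    def mergeWith(right: Map[K, V])(f: (V, V) => V): Map[K, V] =
      right.foldLeft(left) { case (acc, (k, v)) =>
        // Combine with the existing value if present, otherwise insert as-is.
        acc.updated(k, acc.get(k).fold(v)(existing => f(existing, v)))
      }
  }
}
import MapMergeSyntax._

val ms = List(Map("hello" -> 1.0, "world" -> 2.0), Map("goodbye" -> 3.0, "hello" -> 4.0))

// foldLeft from an empty map also copes with an empty list of maps.
val merged = ms.foldLeft(Map.empty[String, Double])((a, b) => a.mergeWith(b)(_ + _))
println(merged("hello")) // 5.0
```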