Scala groupBy all elements in the item's list - scala

I have a list of tuples, where the first element is a string and the second is a list of strings.
For example (quotation marks omitted):
val p = List((a, List(x,y,z)), (b, List(x)), (c, List(y,z)))
My goal is to group this list into a map with the elements of the nested lists acting as keys.
val q = Map(x -> List(a,b), y -> List(a,c), z-> List(a,c))
My initial thought was to group by the second elements of p but this assigns the entire lists to the keys.
I'm a beginner to Scala so any advice is appreciated. Should I expect to be able to complete this with higher order functions or would for loops be useful here?
Thanks in advance :)

Here are two variants:
val p = List(("a", List("x","y","z")), ("b", List("x")), ("c", List("y","z")))
// 1. "Transducers"
p.flatMap{ case (k, v) => v.map { _ -> k } } // List((x,a), (y,a), (z,a), (x,b), (y,c), (z,c))
.groupBy(_._1) // Map(z -> List((z,a), (z,c)), y -> List((y,a), (y,c)), x -> List((x,a), (x,b)))
.mapValues(_.map(_._2)) // Map(z -> List(a, c), y -> List(a, c), x -> List(a, b))
// 2. For-loop
var res = Map[String, List[String]]()
for ( (k, vs) <- p; v <- vs) {
res += v -> (k :: res.getOrElse(v, List())) // parentheses needed: -> binds tighter than ::
}
res // Map(x -> List(b, a), y -> List(c, a), z -> List(c, a))
// Note: the values of `res` are in reverse order,
// because the efficient "cons" operator (::) prepends each value to its list.
// You can reverse the lists afterwards like this:
res.mapValues(_.reverse) // Map(x -> List(a, b), y -> List(a, c), z -> List(a, c))
The second variant is more performant because no intermediate collections are created, but it could be considered "less idiomatic" because it uses the mutable variable res. However, a mutable approach is perfectly fine when it is confined to a private method.
UPD. Per @LuisMiguelMejíaSuárez's suggestions:
In (1), since Scala 2.13, groupBy followed by mapValues can be replaced by groupMap, so the whole chain becomes:
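As a rough sketch of that last point (the method name invertPairs is made up for illustration and does not come from any library), the mutable variable can be confined to a method body so that callers only ever see an immutable Map:

```scala
// Hypothetical helper: the `var` never escapes the method.
def invertPairs(pairs: List[(String, List[String])]): Map[String, List[String]] = {
  var res = Map.empty[String, List[String]]
  for ((k, vs) <- pairs; v <- vs) {
    // Prepend with :: (cheap), so each list is built in reverse order
    res += v -> (k :: res.getOrElse(v, Nil))
  }
  // Restore the original order of the keys inside each list
  res.map { case (v, ks) => v -> ks.reverse }
}

val p = List(("a", List("x", "y", "z")), ("b", List("x")), ("c", List("y", "z")))
val q = invertPairs(p)
```

Here q should be equal to the map asked for in the question.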
p.flatMap{ case (k, v) => v.map { _ -> k } }
.groupMap(_._1)(_._2)
Another functional variant without intermediate collections can be achieved using foldLeft:
p.foldLeft(Map[String, List[String]]()) {
case (acc, (k, vs)) =>
vs.foldLeft(acc) { (acc1, v) =>
acc1 + (v -> (k :: acc1.getOrElse(v, List())))
}
}
Or slightly more efficiently with updatedWith (scala 2.13):
p.foldLeft(Map[String, List[String]]()) {
case (acc, (k, vs)) =>
vs.foldLeft(acc) { (acc1, v) =>
acc1.updatedWith(v) {
case Some(list) => Some(k :: list)
case None => Some(List(k))
}
}
}
... or same thing slightly shorter:
p.foldLeft(Map[String, List[String]]()) {
case (acc, (k, vs)) =>
vs.foldLeft(acc) { (acc1, v) =>
acc1.updatedWith(v)(_.map(k :: _).orElse(Some(List(k))))
}
}
Overall, I'd suggest either the foldLeft variant (the most performant, and still functional) or the first, groupMap variant (shorter and arguably more readable, but less performant), depending on your goals.

Your input list p is one step away from being a Map. From there all you need is a general purpose Map inverter.
import scala.collection.generic.IsIterableOnce
import scala.collection.Factory
// from Map[K,C[V]] to Map[V,C[K]] (Scala 2.13.x)
implicit class MapInverter[K,V,C[_]](m: Map[K,C[V]]) {
def invert(implicit iio: IsIterableOnce[C[V]] {type A = V}
, fac: Factory[K,C[K]]): Map[V,C[K]] =
m.foldLeft(Map.empty[V, List[K]]) {
case (acc, (k, vs)) =>
iio(vs).iterator.foldLeft(acc) {
case (a, v) =>
a + (v -> (k::a.getOrElse(v,Nil)))
}
}.map{case (k,v) => k -> v.to(fac)}
}
usage:
val p = List(("a", List("x","y","z")), ("b", List("x")), ("c", List("y","z")))
val q = p.toMap.invert
//Map(x -> List(b, a), y -> List(c, a), z -> List(c, a))

Related

Reverse a Map which has A Set as its value using HOF

I am trying to reverse a map that has a String as the key and a set of numbers as its value
My goal is to create a list that contains a tuple of a number and a list of strings that had the same number in the value set
I have this so far:
def flipMap(toFlip: Map[String, Set[Int]]): List[(Int, List[String])] = {
toFlip.flatMap(_._2).map(x => (x, toFlip.keys.toList)).toList
}
but it is only assigning every String to every Int
val map = Map(
"A" -> Set(1,2),
"B" -> Set(2,3)
)
should produce:
List((1, List(A)), (2, List(A, B)), (3, List(B)))
but is producing:
List((1, List(A, B)), (2, List(A, B)), (3, List(A, B)))
This works too, but it doesn't return exactly the type you asked for, so you may need some conversions to get there:
toFlip.foldLeft(Map.empty[Int, Set[String]]) {
case (acc, (key, numbersSet)) =>
numbersSet.foldLeft(acc) {
(updatingMap, newNumber) =>
updatingMap.updatedWith(newNumber) {
case Some(existingSet) => Some(existingSet + key)
case None => Some(Set(key))
}
}
}
I used a Set to avoid duplicate key insertions in the inner List, and a Map instead of the outer List for faster lookup.
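If you do need the exact List[(Int, List[String])] from the question, one possible conversion is sketched below. The helper names flipToMap and flipToList are hypothetical, updatedWith assumes Scala 2.13 or later, and the sorting is only there to make the output order deterministic.

```scala
// Scala 2.13+ (uses Map.updatedWith)
def flipToMap(toFlip: Map[String, Set[Int]]): Map[Int, Set[String]] =
  toFlip.foldLeft(Map.empty[Int, Set[String]]) { case (acc, (key, numbers)) =>
    numbers.foldLeft(acc) { (m, n) =>
      m.updatedWith(n) {
        case Some(existing) => Some(existing + key) // number seen before: add this key
        case None           => Some(Set(key))       // first time we see this number
      }
    }
  }

// Convert to the List shape from the question; sorting is an assumption
// made here so that the result does not depend on Set/Map iteration order.
def flipToList(toFlip: Map[String, Set[Int]]): List[(Int, List[String])] =
  flipToMap(toFlip).toList
    .map { case (n, keys) => n -> keys.toList.sorted }
    .sortBy(_._1)

val flipped = flipToList(Map("A" -> Set(1, 2), "B" -> Set(2, 3)))
// List((1,List(A)), (2,List(A, B)), (3,List(B)))
```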
You can do something like this:
def flipMap(toFlip: Map[String, Set[Int]]): List[(Int, List[String])] =
toFlip
.toList
.flatMap {
case (key, values) =>
values.map(value => value -> key)
}.groupMap(_._1)(_._2)
.view
.mapValues(_.distinct)
.toList
Note, I personally would return a Map instead of a List
Or, if you have cats in scope (for example via import cats.syntax.all._):
def flipMap(toFlip: Map[String, Set[Int]]): Map[Int, Set[String]] =
toFlip.view.flatMap {
case (key, values) =>
values.map(value => Map(value -> Set(key)))
}.toList.combineAll
// both scala2 & scala3
scala> map.flatten{ case(k, s) => s.map(v => (k, v)) }.groupMapReduce{ case(k, v) => v }{case(k, v) => List(k)}{ _ ++ _ }
val res0: Map[Int, List[String]] = Map(1 -> List(A), 2 -> List(A, B), 3 -> List(B))
// scala3 only
scala> map.flatten((k, s) => s.map(v => (k, v))).groupMapReduce((k, v) => v)((k, v) => List(k))( _ ++ _ )
val res1: Map[Int, List[String]] = Map(1 -> List(A), 2 -> List(A, B), 3 -> List(B))

flatmapping a nested Map in scala

Suppose I have val someMap: Map[String, Map[String, String]] defined as such:
val someMap =
Map(
("a1" -> Map( ("b1" -> "c1"), ("b2" -> "c2") ) ),
("a2" -> Map( ("b3" -> "c3"), ("b4" -> "c4") ) ),
("a3" -> Map( ("b5" -> "c5"), ("b6" -> "c6") ) )
)
and I would like to flatten it to something that looks like
List(
("a1","b1","c1"),("a1","b2","c2"),
("a2","b3","c3"),("a2","b4","c4"),
("a3","b5","c5"),("a3","b6","c6")
)
What is the most efficient way of doing this? I was thinking about creating a helper function that processes each (a_i -> Map(String,String)) key-value pair and returns the corresponding triples:
def helper(key: String, values: Map[String, String]): Iterable[(String, String, String)] =
values.map { case (k, v) => (key, k, v) }
then flatmap this function over someMap. But this seems somewhat unnecessary to my novice scala eyes, so I was wondering if there was a more efficient way to parse this Map.
There is no need to create a helper function; just write a nested lambda:
val result = someMap.flatMap { case (k, v) => v.map { case (k1, v1) => (k, k1, v1) } }
Or
val y = someMap.flatMap(x => x._2.map(y => (x._1, y._1, y._2)))
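The same nested flatMap/map can also be written as a for comprehension, which some people find easier to read. A small sketch using the data from the question (the name triples is illustrative, and converting to a List first is an assumption about the desired result type):

```scala
val someMap = Map(
  "a1" -> Map("b1" -> "c1", "b2" -> "c2"),
  "a2" -> Map("b3" -> "c3", "b4" -> "c4"),
  "a3" -> Map("b5" -> "c5", "b6" -> "c6")
)

// The compiler rewrites this into a flatMap over the outer entries
// and a map over each inner Map, just like the lambdas above.
val triples: List[(String, String, String)] =
  for {
    (a, inner) <- someMap.toList
    (b, c)     <- inner
  } yield (a, b, c)
```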
Since you're asking about efficiency, the most efficient yet functional approach I can think of is using foldLeft and foldRight.
You need foldRight since :: constructs the immutable list in reverse.
someMap.foldRight(List.empty[(String, String, String)]) { case ((a, m), acc) =>
m.foldRight(acc) {
case ((b, c), acc) => (a, b, c) :: acc
}
}
Here, assuming Map.iterator.reverse is implemented efficiently, no intermediate collections are created.
Alternatively, you can use foldLeft and then reverse the result:
someMap.foldLeft(List.empty[(String, String, String)]) { case (acc, (a, m)) =>
m.foldLeft(acc) {
case (acc, (b, c)) => (a, b, c) :: acc
}
}.reverse
This way a single intermediate List is created, but you don't rely on the implementation of the reversed iterator (foldLeft uses forward iterator).
Note: one liners, such as someMap.flatMap(x => x._2.map(y => (x._1, y._1, y._2))) are less efficient, as, in addition to the temporary buffer to hold intermediate results of flatMap, they create and discard additional intermediate collections for each inner map.
UPD
Since there seems to be some confusion, I'll clarify what I mean. Here is an implementation of map, flatMap, foldLeft and foldRight from TraversableLike (Scala 2.12 and earlier):
def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That = {
def builder = { // extracted to keep method size under 35 bytes, so that it can be JIT-inlined
val b = bf(repr)
b.sizeHint(this)
b
}
val b = builder
for (x <- this) b += f(x)
b.result
}
def flatMap[B, That](f: A => GenTraversableOnce[B])(implicit bf: CanBuildFrom[Repr, B, That]): That = {
def builder = bf(repr) // extracted to keep method size under 35 bytes, so that it can be JIT-inlined
val b = builder
for (x <- this) b ++= f(x).seq
b.result
}
def foldLeft[B](z: B)(op: (B, A) => B): B = {
var result = z
this foreach (x => result = op(result, x))
result
}
def foldRight[B](z: B)(op: (A, B) => B): B =
reversed.foldLeft(z)((x, y) => op(y, x))
It's clear that map and flatMap each create an intermediate buffer using the corresponding builder, while foldLeft and foldRight reuse the same user-supplied accumulator object and only use iterators.
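If you prefer the shape of the one-liner, one option (my own sketch, not part of the library code quoted above) is to go through iterators. Iterator.map and Iterator.flatMap are lazy, so no collection is built per inner map; only the final toList materializes a result. The name flattened is made up for this example.

```scala
val someMap = Map(
  "a1" -> Map("b1" -> "c1", "b2" -> "c2"),
  "a2" -> Map("b3" -> "c3", "b4" -> "c4"),
  "a3" -> Map("b5" -> "c5", "b6" -> "c6")
)

val flattened: List[(String, String, String)] =
  someMap.iterator.flatMap { case (a, inner) =>
    // inner.iterator.map is lazy: triples are produced one at a time
    inner.iterator.map { case (b, c) => (a, b, c) }
  }.toList // the only place where a collection is actually built
```

Whether this beats the explicit folds in practice is something you would have to measure for your own data.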

Scala - Merge two lists of tuples by common elements

How can I merge two lists of tuples in a way that simulates Chasles' relation?
(a, b), (b, c) => (a, c)
Here is an example:
val l1 = List(("Dan", "b"), ("Dan","a"), ("Bart", "c"))
val l2 = List(("a", "1"), ("c", "1"), ("b", "3"), ("a", "2"))
Expected result would be:
val result = List(("Dan", "3"), ("Dan", "1"), ("Dan", "2"), ("Bart", "1"))
You basically want to consider all pairs of one element from the first list and one from the second and keep those where the "b" elements match.
In other words, we want to map over l1 and, inside that map, map over l2, meaning we consider all the pairs of an element from each list, so something like:
l1.map(x => l2.map(y => (x,y)))
That's not quite right, though, since we now have a List[List[((String, String),(String,String))]]--we needed to flatmap:
l1.flatMap(x => l2.map(y => (x,y)))
Now we have to filter to keep just the pairs we want and tidy up:
l1.flatMap(x => l2.map(y => (x,y)))
.filter{ case ((_,y),(b,_)) => y == b }
.map {case ((x, _),(_,c)) => (x,c) }
which gives us
List((Dan,3), (Dan,1), (Dan,2), (Bart,1))
That was kind of an ugly mess, so we can tidy it up a bit. Let's filter l2 inside the flatMap and build the result there, so we don't have to juggle the tuple of tuples:
l1.flatMap{ case (x,y) =>
l2.filter{ case (b, _) => y == b}
.map{ case (_, c) => (x, c)} }
This is one of those cases where it's easier to read a for comprehension:
for {
(x, y) <- l1
(b, c) <- l2
if y == b
} yield (x,c)
For each tuple in l1 you can filter l2 to select the tuples with the matching first element:
def join[A, B, C](l1: List[(A, B)], l2: List[(B, C)]): List[(A, C)] = {
for {
(key, subkey) <- l1
value <- l2.collect { case (`subkey`, value) => value }
} yield key -> value
}
You could also convert l2 into a Map beforehand for better selection performance:
def join[A, B, C](l1: List[(A, B)], l2: List[(B, C)]): List[(A, C)] = {
val valuesMap = l2.groupBy(_._1)
for {
(key, subkey) <- l1
(_, value) <- valuesMap.getOrElse(subkey, Nil)
} yield key -> value
}
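To check the approach against the data from the question, here is a self-contained sketch of the Map-based variant. The intermediate name byKey is mine, and grouping directly to the values rather than to the pairs is a small variation on the code above, not a different algorithm.

```scala
def join[A, B, C](l1: List[(A, B)], l2: List[(B, C)]): List[(A, C)] = {
  // Index l2 once by its first element, keeping only the values
  val byKey: Map[B, List[C]] =
    l2.groupBy(_._1).map { case (b, pairs) => b -> pairs.map(_._2) }
  // For each (a, b) in l1, emit (a, c) for every c stored under b
  l1.flatMap { case (a, b) => byKey.getOrElse(b, Nil).map(c => a -> c) }
}

val l1 = List(("Dan", "b"), ("Dan", "a"), ("Bart", "c"))
val l2 = List(("a", "1"), ("c", "1"), ("b", "3"), ("a", "2"))
val result = join(l1, l2)
// List((Dan,3), (Dan,1), (Dan,2), (Bart,1))
```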

Inverting mapping from list elements to String in Scala

I am new to Scala, and I am trying to flatten the lists and invert the mapping. For example, I have a map as below:
Map("abc"->List(1,2,3),"def"->List(1,5,6))
I want the result to be :
Map(1->List("abc","def"),2->List("abc"),3->List("abc"),5->List("def"),6->List("def"))
What is the best way to achieve this?
scala> val mm = Map("abc"->List(1,2,3),"def"->List(1,5,6))
mm: scala.collection.immutable.Map[String,List[Int]] = Map(abc -> List(1, 2, 3), def -> List(1, 5, 6))
scala> mm.toList.flatMap{ case (s, l) => l.map(ll => (ll, s))}.groupBy(_._1).map{ case (i, l) => (i, l.map(_._2))}
res9: scala.collection.immutable.Map[Int,List[String]] = Map(5 -> List(def), 1 -> List(abc, def), 6 -> List(def), 2 -> List(abc), 3 -> List(abc))
UPDATE:
A slightly different solution I like better:
mm.toList.flatMap{ case (s, l) =>
l.map(li => (li, s))
}.foldLeft(Map.empty[Int, List[String]]){
case (m, (i, s)) => m.updated(i, s :: m.getOrElse(i, List.empty))
}
Here is how you can do it in a simple way:
val data = Map("abc"->List(1,2,3),"def"->List(1,5,6))
val list = data.toList.flatMap(x => {
x._2.map(y => (y, x._1))
}).groupBy(_._1).map(x => (x._1, x._2.map(_._2)))
Output:
(5,List(def))
(1,List(abc, def))
(6,List(def))
(2,List(abc))
(3,List(abc))
Hope this helps!
Here is one more way of doing this:
Map("abc" -> List(1,2,3), "def"-> List(1,5,6)).flatMap {
case (key, values) => values.map(elem => Map(elem -> key))
}.flatten.foldRight(Map.empty[Int, List[String]]) { (elem, acc) =>
val (key, value) = elem
if (acc.contains(key)) {
val newValues = acc(key) ++ List(value)
(acc - key) ++ Map(key -> newValues)
} else {
acc ++ Map(key -> List(value))
}
}
So basically what I do is go over the initial Map, transform it into tuples, and then do a foldRight that groups identical keys in the accumulator.
This is a bit more verbose than the other solutions posted here, but I prefer to avoid using underscores in my implementations as much as possible.
Another way to invert the Map:
val m = Map("abc" -> List(1, 2, 3), "def" -> List(1, 5, 6))
m.map{ case (k, v) => v.map((_, k)) }.flatten.
groupBy(_._1).mapValues( _.map(_._2) )
// res1: scala.collection.immutable.Map[Int,scala.collection.immutable.Iterable[String]] = Map(
// 5 -> List(def), 1 -> List(abc, def), 6 -> List(def), 2 -> List(abc), 3 -> List(abc)
// )
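On Scala 2.13 and later, mapValues on a Map is deprecated in favor of .view.mapValues, and the groupBy plus mapValues pair can be collapsed into a single groupMap call. A minimal sketch, assuming 2.13+ (the name inverted is illustrative):

```scala
// Scala 2.13+ (uses groupMap)
val m = Map("abc" -> List(1, 2, 3), "def" -> List(1, 5, 6))

val inverted: Map[Int, List[String]] =
  m.toList
    .flatMap { case (k, vs) => vs.map(_ -> k) } // List((1,abc), (2,abc), ..., (6,def))
    .groupMap(_._1)(_._2)                       // group by the Int, keep only the String
```

Unlike mapValues, groupMap returns a strict Map, so there is no view to force afterwards.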

Is there a cleaner way to pattern-match in Scala anonymous functions?

I find myself writing code like the following:
val b = a map (entry =>
entry match {
case ((x,y), u) => ((y,x), u)
}
)
I would like to write it differently, if only this worked:
val c = a map (((x,y) -> u) =>
(y,x) -> u
)
Is there any way I can get something close to this?
Believe it or not, this works:
val b = List(1, 2)
b map {
case 1 => "one"
case 2 => "two"
}
You can skip the p => p match in simple cases. So this should work:
val c = a map {
case ((x,y) -> u) => (y,x) -> u
}
In your example, there are three subtly different semantics that you may be going for.
Map over the collection, transforming each element that matches a pattern. Throw an exception if any element does not match. These semantics are achieved with
val b = a map { case ((x, y), u) => ((y, x), u) }
Map over the collection, transforming each element that matches a pattern. Silently discard elements that do not match:
val b = a collect { case ((x, y), u) => ((y, x), u) }
Map over the collection, safely destructuring and then transforming each element. These are the semantics that I would expect for an expression like
val b = a map (((x, y), u) => ((y, x), u))
Unfortunately, there is no concise syntax to achieve these semantics in Scala.
Instead, you have to destructure yourself:
val b = a map { p => ((p._1._2, p._1._1), p._2) }
One might be tempted to use a value definition for destructuring:
val b = a map { p => val ((x,y), u) = p; ((y, x), u) }
However, this version is no safer than the one that uses explicit pattern matching. For this reason, if you want the safe destructuring semantics, the most concise solution is to explicitly type your collection to prevent unintended widening and use explicit pattern matching:
val a: List[((Int, Int), Int)] = // ...
// ...
val b = a map { case ((x, y), u) => ((y, x), u) }
If a's definition appears far from its use (e.g. in a separate compilation unit), you can minimize the risk by ascribing its type in the map call:
val b = (a: List[((Int, Int), Int)]) map { case ((x, y), u) => ((y, x), u) }
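To make the difference between the first two semantics concrete, here is a small sketch using a deliberately widened List[Any] (the value names mixed, kept and attempted are made up for this example). collect skips the element that does not match, while map with a case block throws a scala.MatchError on it:

```scala
import scala.util.Try

// Deliberately widened to Any so that one element does not fit the pattern
val mixed: List[Any] = List(((1, 2), 3), "oops", ((4, 5), 6))

// collect: elements that do not match the pattern are silently dropped
val kept = mixed.collect { case ((x, y), u) => ((y, x), u) }

// map + case: the first non-matching element raises a MatchError,
// captured here in a Try so the example does not abort
val attempted = Try(mixed.map { case ((x, y), u) => ((y, x), u) })
```

With a properly typed List[((Int, Int), Int)] the compiler rules out the non-matching element up front, which is why the type ascription above is the safer option.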
In your quoted example, the cleanest solution is:
val xs = List((1,2)->3,(4,5)->6,(7,8)->9)
xs map { case (a,b) => (a.swap, b) }
val b = a map { case ((x,y), u) => ((y,x), u) }