Scala concatenate maps from a list - scala

Having val mapList: List[Map[String, Int]], I want to do something like:
val map = mapList foldLeft (Map[String, Int]()) ( _ ++ _ )
or
val map = mapList foldLeft (Map[String, Int]())
( (m1: Map[String, Int], m2: Map[String, Int]) => m1 ++ m2 )
Neither option compiles (the first says "missing parameter type for expanded function (x, y) => x ++ y" and the second says "type mismatch; found (Map[String, Int], Map[String, Int]) => Map[String, Int]; required: String").
I want a classical solution for concatenating a list of immutable maps, such that List( Map("apple" -> 5, "pear" -> 7), Map("pear" -> 3, "apricot" -> 0) ) would produce Map("apple" -> 5, "pear" -> 10, "apricot" -> 0).
Using scala 2.10.5.

You need to add a dot before foldLeft. You can only use spaces instead of dots under specialized conditions, such as for methods with exactly 1 parameter (arity-1 methods):
val map = mapList.foldLeft(Map[String, Int]()) ( _ ++ _ )
You can read more about method invocation best practices here.
You might also be interested in the reduce methods, which are specialized versions of the fold methods, where the return type is the same as the type of the elements of the collection. For example reduceLeft uses the first element of the collection as a seed for the foldLeft. Of course, since this relies on the first element's existence, it will throw an exception if the collection is empty. Since reduceLeft takes only 1 parameter, you can more easily use a space to invoke the method:
mapList.reduceLeft( _ ++ _)
mapList reduceLeft(_ ++ _)
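To make the empty-collection caveat concrete, here is a minimal sketch. The object name ReduceVsFold and its value names are made up for illustration; foldLeft, reduceLeft and reduceLeftOption are standard collection methods.

```scala
object ReduceVsFold {
  val maps: List[Map[String, Int]] =
    List(Map("apple" -> 5, "pear" -> 7), Map("pear" -> 3, "apricot" -> 0))
  val empty: List[Map[String, Int]] = Nil

  // foldLeft takes an explicit seed, so an empty list simply yields the seed.
  val foldedEmpty: Map[String, Int] =
    empty.foldLeft(Map.empty[String, Int])(_ ++ _)

  // reduceLeftOption returns None for an empty list instead of throwing,
  // which is what plain reduceLeft would do.
  val reducedEmpty: Option[Map[String, Int]] = empty.reduceLeftOption(_ ++ _)

  // On a non-empty list, ++ keeps the right-hand value for the shared key "pear".
  val reduced: Map[String, Int] = maps.reduceLeft(_ ++ _)
}
```

If the list may be empty, foldLeft with an empty Map as the seed or reduceLeftOption is probably the safer choice.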
Finally, you should note that all you are doing here is merging the maps. When using ++ to merge the maps, you will just overwrite the values of keys that are already present in the map – you won't be adding the values of duplicate keys. If you wanted to do that, you could follow the answers provided here, and apply them to the foldLeft or reduceLeft. For example:
mapList reduceLeft { (acc, next) =>
(acc.toList ++ next.toList).groupBy(_._1).toMap.mapValues(_.map(_._2).sum)
}
Or slightly differently:
mapList.map(_.toSeq).reduceLeft(_ ++ _).groupBy(_._1).toMap.mapValues(_.map(_._2).sum)
And, if you're using Scalaz, then most concisely:
mapList reduceLeft { _ |+| _ }
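If you would rather avoid both Scalaz and the intermediate lists, a plain standard-library fold can add the values of duplicate keys. This is only a sketch: addMaps and the SumMerge object are hypothetical names, not part of any library.

```scala
object SumMerge {
  // Merge two maps, adding the values of keys that appear in both.
  def addMaps(m1: Map[String, Int], m2: Map[String, Int]): Map[String, Int] =
    m2.foldLeft(m1) { case (acc, (key, value)) =>
      acc.updated(key, acc.getOrElse(key, 0) + value)
    }

  val mapList: List[Map[String, Int]] =
    List(Map("apple" -> 5, "pear" -> 7), Map("pear" -> 3, "apricot" -> 0))

  // The empty Map seed keeps this safe for an empty list as well.
  val map: Map[String, Int] = mapList.foldLeft(Map.empty[String, Int])(addMaps)
}
```

With the sample input from the question, map comes out as Map("apple" -> 5, "pear" -> 10, "apricot" -> 0).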

Related

Convert List[(Int,String)] into List[Int] in scala

My goal is to map every word in a text (Index, line) to a list containing the indices of every line the word occurs in. I managed to write a function that returns a list of all words assigned to an index.
The following function should do the rest (map a list of indices to every word):
def mapIndicesToWords(l:List[(Int,String)]):Map[String,List[Int]] = ???
If I do this:
l.groupBy(x => x._2)
it returns a Map[String, List[(Int,String)]]. Now I just want to change the value to type List[Int].
I thought of using .mapValues(...) and fold the list somehow, but I'm new to scala and don't know the correct approach for this.
So how do I convert the list?
You can also use foldLeft: you just need to specify an accumulator (in your case a Map[String, List[Int]]), which will be returned as the result, and write some logic inside. Here is my implementation.
def mapIndicesToWords(l:List[(Int,String)]): Map[String,List[Int]] =
l.foldLeft(Map[String, List[Int]]())((map, entry) =>
map.get(entry._2) match {
case Some(list) => map + (entry._2 -> (entry._1 :: list))
case None => map + (entry._2 -> List(entry._1))
}
)
But with foldLeft, the indices in each list end up in reversed order, so you can use foldRight instead. Just change foldLeft to foldRight and swap the input parameters from (map, entry) to (entry, map).
Be careful, though: foldRight on a List can be noticeably slower (roughly a factor of two), because it is implemented by reversing the list and then folding from the left.
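The foldRight variant described above could look roughly like this. The WordIndex object and the sample input are made up for illustration, and getOrElse is used in place of the explicit match.

```scala
object WordIndex {
  def mapIndicesToWords(l: List[(Int, String)]): Map[String, List[Int]] =
    l.foldRight(Map.empty[String, List[Int]]) { case ((index, word), map) =>
      // Walking from the right and prepending keeps indices in input order.
      map.updated(word, index :: map.getOrElse(word, Nil))
    }

  val result: Map[String, List[Int]] =
    mapIndicesToWords(List((1, "a"), (1, "b"), (2, "a"), (3, "b")))
}
```

Here result is Map("a" -> List(1, 2), "b" -> List(1, 3)), with each list in ascending line order.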
scala> val myMap: Map[String,List[(Int, String)]] = Map("a" -> List((1,"line1"), (2, "line")))
myMap: Map[String,List[(Int, String)]] = Map(a -> List((1,line1), (2,line)))
scala> myMap.mapValues(lst => lst.map(pair => pair._1))
res0: scala.collection.immutable.Map[String,List[Int]] = Map(a -> List(1, 2))
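Putting the groupBy from the question together with that value mapping, the whole function might be sketched as below. The GroupByIndex wrapper object is hypothetical, and a plain map over the entries is used instead of mapValues so the result is a strict Map on any Scala version.

```scala
object GroupByIndex {
  def mapIndicesToWords(l: List[(Int, String)]): Map[String, List[Int]] =
    // Group the (index, word) pairs by word, then keep only the indices.
    l.groupBy(_._2).map { case (word, pairs) => word -> pairs.map(_._1) }
}
```

Since groupBy keeps the elements of each group in their original order, the indices stay in input order.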

How to make a nested map as dot separated strings of key and value in scala

I have a Map[String, Any]. The value can be another Map and so on.
val m: Map[String, Any] = Map("a" -> Map("b" -> Map("c" -> 1, "d" -> 4)))
What is the best way to convert this nested Map into another Map with values like
Map("a.b.c" -> 1, "a.b.d" -> 4)
Just use recursion as in any other programming language (it's not something Scala-specific).
val m: Map[String, Any] = Map("a" -> Map("b" -> Map("c" -> 1, "d" -> 4)))
def traverse(el: Any, acc: List[String] = List.empty[String]): Map[String, Int] = el match {
case leaf: Int => Map(acc.reverse.mkString(".") -> leaf)
case m: Map[String, Any] => m flatMap {
case (k, v) => traverse(v, k :: acc)
}
}
traverse(m)
res2_2: Map[String, Int] = Map("a.b.c" -> 1, "a.b.d" -> 4)
Btw,
1) the presented solution is not tail-recursive (so it might throw a stack overflow on very deep trees) - you could write a tail-recursive version as an exercise. Hint - you would need an accumulator for the resulting collection (or use a mutable buffer, I don't know, maybe you're not really into functional programming and were forced to use Scala by your employer :) ).
2) List is an inappropriate structure for the accumulator (because of performance); I was just too lazy to use a less common one, as you were too lazy to try to at least somehow implement this trivial algorithm. I bet there is a duplicate question on SO, but again, I'm too lazy to look for it :).
3) An @unchecked annotation would be appropriate at some place in my code (guess where?). So would a default case for the pattern match (you can build a test case that breaks my function to figure out why).
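For what it's worth, one possible way to address point 3 is sketched below. It is not the original function: it assumes any non-map value counts as a leaf, so the result type widens to Map[String, Any], and the Flatten object name is made up. It is still not tail-recursive.

```scala
object Flatten {
  def traverse(el: Any, path: List[String] = Nil): Map[String, Any] = el match {
    // Key and value types are erased at runtime, so match on Map[_, _]
    // and turn each key into a String explicitly.
    case m: Map[_, _] =>
      m.toList.flatMap { case (k, v) => traverse(v, k.toString :: path) }.toMap
    // Default case: anything that is not a map is treated as a leaf.
    case leaf =>
      Map(path.reverse.mkString(".") -> leaf)
  }

  val m: Map[String, Any] = Map("a" -> Map("b" -> Map("c" -> 1, "d" -> 4)))
  val flat: Map[String, Any] = traverse(m)
}
```

With the sample map, flat is Map("a.b.c" -> 1, "a.b.d" -> 4), and a String or Double leaf no longer causes a MatchError.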

In Scala, is there an equivalent of Haskell's "fromListWith" for Map?

In Haskell, there is a function called fromListWith which can generate a Map from a function (used to merge values with the same key) and a list:
fromListWith :: Ord k => (a -> a -> a) -> [(k, a)] -> Map k a
The following expression will be evaluated to true:
fromListWith (++) [(5,"a"), (5,"b"), (3,"b"), (3,"a"), (5,"a")] == fromList [(3, "ab"), (5, "aba")]
In Scala, there is a similar function called toMap on List objects, which can also convert a list to a Map, but it can't take a function parameter to deal with duplicate keys.
Does anyone have ideas about this?
Apart from using scalaz you could also define one yourself:
implicit class ListToMapWith[K, V](list: List[(K, V)]) {
def toMapWith(op: (V, V) => V) =
list groupBy (_._1) mapValues (_ map (_._2) reduce op)
}
Here is a usage example:
scala> val testList = List((5,"a"), (5,"b"), (3,"b"), (3,"a"), (5,"a"))
scala> testList toMapWith (_ + _)
res1: scala.collection.immutable.Map[Int,String] = Map(5 -> aba, 3 -> ba)
The stdlib doesn't have such a feature; however, there is a port of Data.Map available in scalaz that does have this function.
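Note that the groupBy version above combines values in list order, so it gives 3 -> ba where the Haskell example gives "ab". If the Haskell argument order matters, a fold that passes the new value first might be closer. This is a sketch; the fromListWith name here is just borrowed from Haskell, not a Scala library method.

```scala
object FromListWith {
  // Like Haskell's fromListWith, the combining function receives the newly
  // seen value first and the value accumulated so far second.
  def fromListWith[K, V](pairs: List[(K, V)])(f: (V, V) => V): Map[K, V] =
    pairs.foldLeft(Map.empty[K, V]) { case (acc, (k, v)) =>
      acc.updated(k, acc.get(k).map(old => f(v, old)).getOrElse(v))
    }

  val result: Map[Int, String] =
    fromListWith(List((5, "a"), (5, "b"), (3, "b"), (3, "a"), (5, "a")))(_ + _)
}
```

Here result is Map(3 -> "ab", 5 -> "aba"), matching the Haskell expression in the question.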

scala: type equality of two variables

I have two Map[String, T]s, where T is a subtype of Fruit. I need to construct a new Map from the two Maps, where the keys are the key names common to both maps, and the value is a Seq[Fruit] iff the values from the two maps share the same type.
class Fruit
case class Apple() extends Fruit
case class Banana(num: Int) extends Fruit
case class Orange() extends Fruit
For example, if I have following two maps:
val map1 = Map("first" -> Apple(),
"second" -> Banana(3),
"third" -> Orange())
val map2 = Map("first" -> Orange(),
"second" -> Banana(4),
"third" -> Orange())
I need the result map, map3 which has following members:
generateMap(map1: Map[String, Fruit], map2: Map[String, Fruit]): Map[String, Seq[Fruit]]
=> results in a map that looks like
Map("second" -> Seq(Banana(3), Banana(4)),
"third" -> Seq(Orange(), Orange()))
I'm not sure how to write a function, generateMap. Could anyone help me to implement that? (using Scala 2.11.x)
Note that the class definitions (Fruit and others) are fixed, so I cannot modify them.
scala> val r: Map[String, Seq[Fruit]] = (map1.toList ++ map2.toList).
groupBy(x => x._1).
mapValues(lst => lst.map(x => x._2)).
filter {
case (key, lst) => lst.forall(x =>
x.getClass == lst.head.getClass)
}
r: Map[String, Seq[Fruit]] = Map(third -> List(Orange(), Orange()),
second -> List(Banana(3), Banana(4)))
val m3 = (map1.toSeq ++ map2.toSeq). // Combine the maps
groupBy (x=>x._1). //Group by the original keys
map{case (k,lst)=> (k, lst.map(x=> x._2))}. //Strip the keys from the grouped sequences
filter{case (_, lst) => lst.forall(i => lst.head.getClass == i.getClass)}. //Filter out heterogeneous seqs
toMap // Make a map
Without forall:
(map1.toList ++ map2.toList).groupBy(_._1).mapValues(_.map(_._2))
.filter(_._2.map(_.getClass).toSet.tail.isEmpty)
Map(third -> List(Orange(), Orange()), second -> List(Banana(3), Banana(4)))
This version requires a little more CPU and memory (though still linear inside filter) than the version with forall, so you should use it only for small collections.
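One thing to watch: the groupBy-based versions also keep a key that exists in only one of the maps, because a single-element list trivially passes the class check. If only common keys should survive, as the question asks, a sketch along these lines may be closer. The FruitMerge wrapper object and the value names are assumptions for illustration; the class definitions are copied from the question.

```scala
object FruitMerge {
  class Fruit
  case class Apple() extends Fruit
  case class Banana(num: Int) extends Fruit
  case class Orange() extends Fruit

  def generateMap(map1: Map[String, Fruit], map2: Map[String, Fruit]): Map[String, Seq[Fruit]] = {
    // Only keys present in both maps are considered.
    val common = map1.keySet.intersect(map2.keySet)
    common.toSeq.collect {
      // Keep a key only when both values have the same runtime class.
      case key if map1(key).getClass == map2(key).getClass =>
        key -> Seq(map1(key), map2(key))
    }.toMap
  }

  val map1: Map[String, Fruit] =
    Map("first" -> Apple(), "second" -> Banana(3), "third" -> Orange())
  val map2: Map[String, Fruit] =
    Map("first" -> Orange(), "second" -> Banana(4), "third" -> Orange())
  val result: Map[String, Seq[Fruit]] = generateMap(map1, map2)
}
```

With the maps from the question, result contains "second" and "third" but not "first".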

Scala: how to merge a collection of Maps

I have a List of Map[String, Double], and I'd like to merge their contents into a single Map[String, Double]. How should I do this in an idiomatic way? I imagine that I should be able to do this with a fold. Something like:
val newMap = Map[String, Double]() /: listOfMaps { (accumulator, m) => ... }
Furthermore, I'd like to handle key collisions in a generic way. That is, if I add a key to the map that already exists, I should be able to specify a function that returns a Double (in this case) and takes the existing value for that key, plus the value I'm trying to add. If the key does not yet exist in the map, then just add it and its value unaltered.
In my specific case I'd like to build a single Map[String, Double] such that if the map already contains a key, then the Double will be added to the existing map value.
I'm working with mutable maps in my specific code, but I'm interested in more generic solutions, if possible.
Well, you could do:
mapList reduce (_ ++ _)
except for the special requirement for collision.
Since you do have that special requirement, perhaps the best would be doing something like this (2.8):
def combine[K](m1: Map[K, Double], m2: Map[K, Double]): Map[K, Double] = {
val k1 = Set(m1.keysIterator.toList: _*)
val k2 = Set(m2.keysIterator.toList: _*)
val intersection = k1 & k2
val r1 = for(key <- intersection) yield (key -> (m1(key) + m2(key)))
val r2 = m1.filterKeys(!intersection.contains(_)) ++ m2.filterKeys(!intersection.contains(_))
r2 ++ r1
}
You can then add this method to the map class through the Pimp My Library pattern, and use it in the original example instead of "++":
class CombiningMap(m1: Map[Symbol, Double]) {
def combine(m2: Map[Symbol, Double]) = {
val k1 = Set(m1.keysIterator.toList: _*)
val k2 = Set(m2.keysIterator.toList: _*)
val intersection = k1 & k2
val r1 = for(key <- intersection) yield (key -> (m1(key) + m2(key)))
val r2 = m1.filterKeys(!intersection.contains(_)) ++ m2.filterKeys(!intersection.contains(_))
r2 ++ r1
}
}
// Then use this:
implicit def toCombining(m: Map[Symbol, Double]) = new CombiningMap(m)
// And finish with:
mapList reduce (_ combine _)
This was written for 2.8; on 2.7, keysIterator becomes keys, filterKeys might need to be written in terms of filter and map, & becomes **, and so on, but it shouldn't be too different.
How about this one:
def mergeMap[A, B](ms: List[Map[A, B]])(f: (B, B) => B): Map[A, B] =
(Map[A, B]() /: (for (m <- ms; kv <- m) yield kv)) { (a, kv) =>
a + (if (a.contains(kv._1)) kv._1 -> f(a(kv._1), kv._2) else kv)
}
val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
val mm = mergeMap(ms)((v1, v2) => v1 + v2)
println(mm) // prints Map(hello -> 5.5, world -> 2.2, goodbye -> 3.3)
And it works in both 2.7.5 and 2.8.0.
I'm surprised no one's come up with this solution yet:
myListOfMaps.flatten.toMap
Does exactly what you need:
Merges the list to a single Map
Weeds out any duplicate keys
Example:
scala> List(Map('a -> 1), Map('b -> 2), Map('c -> 3), Map('a -> 4, 'b -> 5)).flatten.toMap
res7: scala.collection.immutable.Map[Symbol,Int] = Map('a -> 4, 'b -> 5, 'c -> 3)
flatten turns the list of maps into a flat list of tuples, and toMap turns that list into a map, keeping only the last value seen for each duplicate key (so colliding values are overwritten rather than combined).
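A small sketch of what that means for the Double example used elsewhere in this thread. The FlattenToMap object and its value names are made up; the second value shows one way to sum collisions instead, using only foldLeft.

```scala
object FlattenToMap {
  val ms: List[Map[String, Double]] =
    List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))

  // Later entries win: "hello" ends up as 4.4, not 1.1 + 4.4.
  val overwritten: Map[String, Double] = ms.flatten.toMap

  // Folding over the flattened pairs lets colliding values be added instead.
  val summed: Map[String, Double] =
    ms.flatten.foldLeft(Map.empty[String, Double]) { case (acc, (k, v)) =>
      acc.updated(k, acc.getOrElse(k, 0.0) + v)
    }
}
```

So flatten.toMap is fine when "last one wins" is acceptable, but it does not cover the collision-function requirement from the question.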
Starting with Scala 2.13, another solution that handles duplicate keys using only the standard library is to merge the Maps as sequences (flatten) and then apply the new groupMapReduce method, which (as its name suggests) is equivalent to a groupBy followed by a map and a reduce step over the grouped values:
List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
.flatten
.groupMapReduce(_._1)(_._2)(_ + _)
// Map("world" -> 2.2, "goodbye" -> 3.3, "hello" -> 5.5)
This:
flattens (concatenates) the maps as a sequence of tuples (List(("hello", 1.1), ("world", 2.2), ("goodbye", 3.3), ("hello", 4.4))), which keeps all key/values (even duplicate keys)
groups elements based on their first tuple part (_._1) (group part of groupMapReduce)
maps grouped values to their second tuple part (_._2) (map part of groupMapReduce)
reduces mapped grouped values (_+_) by taking their sum (but it can be any reduce: (T, T) => T function) (reduce part of groupMapReduce)
The groupMapReduce step can be seen as a one-pass version equivalent of:
list.groupBy(_._1).mapValues(_.map(_._2).reduce(_ + _))
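To get the generic collision function the question asks for, the same idea can be wrapped in a small helper. This is a sketch that assumes Scala 2.13 or later (groupMapReduce does not exist before that); mergeWith and GroupMapReduceSketch are hypothetical names.

```scala
object GroupMapReduceSketch {
  // Requires Scala 2.13+: group pairs by key, keep the values,
  // and combine colliding values with the caller-supplied function.
  def mergeWith[K, V](maps: List[Map[K, V]])(f: (V, V) => V): Map[K, V] =
    maps.flatMap(_.toList).groupMapReduce(_._1)(_._2)(f)

  val ms: List[Map[String, Double]] =
    List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))

  val summed: Map[String, Double] = mergeWith(ms)(_ + _)
  val maxed: Map[String, Double]  = mergeWith(ms)((a, b) => math.max(a, b))
}
```

Swapping the function changes the collision policy without touching the merge itself, e.g. summing versus taking the maximum.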
Interesting, noodling around with this a bit, I got the following (on 2.7.5):
General Maps:
def mergeMaps[A,B](collisionFunc: (B,B) => B)(listOfMaps: Seq[scala.collection.Map[A,B]]): Map[A, B] = {
listOfMaps.foldLeft(Map[A, B]()) { (m, s) =>
Map(
s.projection.map { pair =>
if (m contains pair._1)
(pair._1, collisionFunc(m(pair._1), pair._2))
else
pair
}.force.toList:_*)
}
}
But man, that is hideous with the projection and forcing and toList and whatnot. Separate question: what's a better way to deal with that within the fold?
For mutable Maps, which is what I was dealing with in my code, and with a less general solution, I got this:
def mergeMaps[A,B](collisionFunc: (B,B) => B)(listOfMaps: List[mutable.Map[A,B]]): mutable.Map[A, B] = {
listOfMaps.foldLeft(mutable.Map[A,B]()) {
(m, s) =>
for (k <- s.keys) {
if (m contains k)
m(k) = collisionFunc(m(k), s(k))
else
m(k) = s(k)
}
m
}
}
That seems a little bit cleaner, but will only work with mutable Maps as it's written. Interestingly, I first tried the above (before I asked the question) using /: instead of foldLeft, but I was getting type errors. I thought /: and foldLeft were basically equivalent, but the compiler kept complaining that I needed explicit types for (m, s). What's up with that?
I'm reading this question quickly, so I'm not sure if I'm missing something (like it has to work for 2.7.x or without scalaz):
import scalaz._
import Scalaz._
val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
ms.reduceLeft(_ |+| _)
// returns Map(goodbye -> 3.3, hello -> 5.5, world -> 2.2)
You can change the monoid definition for Double and get another way to accumulate the values, here getting the max:
implicit val dbsg: Semigroup[Double] = semigroup((a,b) => math.max(a,b))
ms.reduceLeft(_ |+| _)
// returns Map(goodbye -> 3.3, hello -> 4.4, world -> 2.2)
I wrote a blog post about this, check it out:
http://www.nimrodstech.com/scala-map-merge/
Basically, using the scalaz Semigroup you can achieve this pretty easily.
It would look something like:
import scalaz.Scalaz._
listOfMaps reduce(_ |+| _)
A one-liner helper function, whose usage reads almost as cleanly as using scalaz:
def mergeMaps[K,V](m1: Map[K,V], m2: Map[K,V])(f: (V,V) => V): Map[K,V] =
(m1 -- m2.keySet) ++ (m2 -- m1.keySet) ++ (for (k <- m1.keySet & m2.keySet) yield { k -> f(m1(k), m2(k)) })
val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
ms.reduceLeft(mergeMaps(_,_)(_ + _))
// returns Map(goodbye -> 3.3, hello -> 5.5, world -> 2.2)
For ultimate readability, wrap it in an implicit custom type:
class MyMap[K,V](m1: Map[K,V]) {
def merge(m2: Map[K,V])(f: (V,V) => V) =
(m1 -- m2.keySet) ++ (m2 -- m1.keySet) ++ (for (k <- m1.keySet & m2.keySet) yield { k -> f(m1(k), m2(k)) })
}
implicit def toMyMap[K,V](m: Map[K,V]) = new MyMap(m)
val ms = List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))
ms reduceLeft { _.merge(_)(_ + _) }
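On Scala 2.10 and later, the class plus implicit def pair can be collapsed into an implicit class. A rough sketch follows, assuming it lives inside an object (implicit classes cannot be top-level in Scala 2). MapSyntax, MergeOps and mergeWith are made-up names, and the body uses a fold rather than the key-set arithmetic above.

```scala
object MapSyntax {
  implicit class MergeOps[K, V](m1: Map[K, V]) {
    // Fold m2 into m1, combining values with f whenever a key collides.
    def mergeWith(m2: Map[K, V])(f: (V, V) => V): Map[K, V] =
      m2.foldLeft(m1) { case (acc, (k, v)) =>
        acc.updated(k, acc.get(k).map(f(_, v)).getOrElse(v))
      }
  }

  val ms: List[Map[String, Double]] =
    List(Map("hello" -> 1.1, "world" -> 2.2), Map("goodbye" -> 3.3, "hello" -> 4.4))

  val merged: Map[String, Double] =
    ms.reduceLeft[Map[String, Double]]((a, b) => a.mergeWith(b)(_ + _))
}
```

Outside the object, an import MapSyntax._ brings the extension method into scope.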