Zip two HashMaps(or dictionaries) - scala

What would be a functional way to zip two dictionaries in Scala?
map1 = new HashMap("A"->1,"B"->2)
map2 = new HashMap("B"->22,"D"->4) // B is the only common key
zipper(map1,map2) should give something similar to
Seq( ("A",1,0), // no A in second map, so third value is zero
("B",2,22),
("D",0,4)) // no D in first map, so second value is zero
If not functional, any other style is also appreciated

def zipper(map1: Map[String, Int], map2: Map[String, Int]) = {
for(key <- map1.keys ++ map2.keys)
yield (key, map1.getOrElse(key, 0), map2.getOrElse(key, 0))
}
scala> val map1 = scala.collection.immutable.HashMap("A" -> 1, "B" -> 2)
map1: scala.collection.immutable.HashMap[String,Int] = Map(A -> 1, B -> 2)
scala> val map2 = scala.collection.immutable.HashMap("B" -> 22, "D" -> 4)
map2: scala.collection.immutable.HashMap[String,Int] = Map(B -> 22, D -> 4)
scala> :load Zipper.scala
Loading Zipper.scala...
zipper: (map1: Map[String,Int], map2: Map[String,Int])Iterable[(String, Int, Int)]
scala> zipper(map1, map2)
res1: Iterable[(String, Int, Int)] = Set((A,1,0), (B,2,22), (D,0,4))
Note using get is probably preferable to getOrElse in this case. None is used to specify that a value does not exist instead of using 0.

As an alternative to Brian's answer, this can be used to enhance the map class by way of implicit methods:
implicit class MapUtils[K, +V](map: collection.Map[K, V]) {
def zipAllByKey[B >: V, C >: V](that: collection.Map[K, C], thisElem: B, thatElem: C): Iterable[(K, B, C)] =
for (key <- map.keys ++ that.keys)
yield (key, map.getOrElse(key, thisElem), that.getOrElse(key, thatElem))
}
The naming and API are similar to the sequence zipAll.

Related

Return type in scala generic

I am pretty new to scala, and i have hard time with generic in some case.
I try to find simple example to illustrate my issue
let's start by setting 3 simple maps
scala> val map1 = Map[String, String]("A" -> "toto", "B" -> "tutu")
map1: scala.collection.immutable.Map[String,String] = Map(A -> toto, B -> tutu)
scala> val map2 = Map[String, Int]("B" -> 2)
map2: scala.collection.immutable.Map[String,Int] = Map(B -> 2)
scala> val map3 = Map[String, String]("A" -> "foo")
map3: scala.collection.immutable.Map[String,String] = Map(A -> foo)
if i merge using ++
scala> map1 ++ map2
res0: scala.collection.immutable.Map[String,Any] = Map(A -> toto, B -> 2)
My map became [String, Any] which is what i want to reproduce
I create new method in Map class
object MapUtils {
implicit class utils[K, V](map: Map[K, V]) {
def myUselessMethod(merge: Map[K, V]): Map[K, V] = {
map ++ merge
}
}
}
import MapUtils._
scala> map1 myUselessMethod map3
res4: Map[String,String] = Map(A -> foo, B -> tutu)
scala> map1 myUselessMethod map2
<console>:18: error: type mismatch;
found : scala.collection.immutable.Map[String,Int]
required: Map[String,String]
map1 myUselessMethod map2
How should i define my method so that it can return [String, Any]
I also checked scala MAP API to see if i can have some indications
https://www.scala-lang.org/api/2.12.3/scala/collection/Map.html
but
++[B >: (K, V), That](that: GenTraversableOnce[B])(implicit bf:
CanBuildFrom[Map[K, V], B, That]): That
is somehow indecipherable for me...
Thank you
Just give the compiler some slack in the return type:
object MapUtils {
implicit class utils[K, V](map: Map[K, V]) {
def myUselessMethod[L >: V](merge: Map[K, L]): Map[K, L] = {
map ++ merge
}
}
}
import MapUtils._
val map1 = Map[String, String]("A" -> "toto", "B" -> "tutu")
val map2 = Map[String, Int]("B" -> 2)
map1 myUselessMethod map2
Now it will infer the least upper bound L of V and the value type of the merge-argument map. In case of Int and String, L would be inferred to be Any.

How to tell if a Map has a default value?

Is there a way to check if a Map has a defined default value? What I would like is some equivalent of myMap.getOrElse(x, y) where if the key x is not in the map,
if myMap has a default value, return that value
else return y
A contrived example of the issue:
scala> def f(m: Map[String, String]) = m.getOrElse("hello", "world")
f: (m: Map[String,String])String
scala> val myMap = Map("a" -> "A").withDefaultValue("Z")
myMap: scala.collection.immutable.Map[String,String] = Map(a -> A)
scala> f(myMap)
res0: String = world
In this case, I want res0 to be "Z" instead of "world", because myMap was defined with that as a default value. But getOrElse doesn't work that way.
I could use m.apply instead of m.getOrElse, but the map is not guaranteed to have a default value, so it could throw an exception (I could catch the exception, but this is nonideal).
scala> def f(m: Map[String, String]) = try {
| m("hello")
| } catch {
| case e: java.util.NoSuchElementException => "world"
| }
f: (m: Map[String,String])String
scala> val myMap = Map("a" -> "A").withDefaultValue("Z")
myMap: scala.collection.immutable.Map[String,String] = Map(a -> A)
scala> f(myMap)
res0: String = Z
scala> val mapWithNoDefault = Map("a" -> "A")
mapWithNoDefault: scala.collection.immutable.Map[String,String] = Map(a -> A)
scala> f(mapWithNoDefault)
res1: String = world
The above yields the expected value but seems messy. I can't pattern match and call apply or getOrElse based on whether or not the map had a default value, because the type is the same (scala.collection.immutable.Map[String,String]) regardless of default-ness.
Is there a way to do this that doesn't involve catching exceptions?
You can check whether the map is an instance of Map.WithDefault:
implicit class EnrichedMap[K, V](m: Map[K, V]) {
def getOrDefaultOrElse(k: K, v: => V) =
if (m.isInstanceOf[Map.WithDefault[K, V]]) m(k) else m.getOrElse(k, v)
}
And then:
scala> val myMap = Map("a" -> "A").withDefaultValue("Z")
myMap: scala.collection.immutable.Map[String,String] = Map(a -> A)
scala> myMap.getOrDefaultOrElse("hello", "world")
res11: String = Z
scala> val myDefaultlessMap = Map("a" -> "A")
myDefaultlessMap: scala.collection.immutable.Map[String,String] = Map(a -> A)
scala> myDefaultlessMap.getOrDefaultOrElse("hello", "world")
res12: String = world
Whether this kind of reflection is any better than using exceptions for non-exceptional control flow is an open question.
You could use Try instead of try/catch, and it would look a little cleaner.
val m = Map(1 -> 2, 3 -> 4)
import scala.util.Try
Try(m(10)).getOrElse(0)
res0: Int = 0
val m = Map(1 -> 2, 3 -> 4).withDefaultValue(100)
Try(m(10)).getOrElse(0)
res1: Int = 100

Scala: Merge map

How can I merge maps like below:
Map1 = Map(1 -> Class1(1), 2 -> Class1(2))
Map2 = Map(2 -> Class2(1), 3 -> Class2(2))
After merged.
Merged = Map( 1 -> List(Class1(1)), 2 -> List(Class1(2), Class2(1)), 3 -> Class2(2))
Can be List, Set or any other collection who has size attribute.
Using the standard lib, you can do it as follows:
// convert maps to seq, to keep duplicate keys and concat
val merged = Map(1 -> 2).toSeq ++ Map(1 -> 4).toSeq
// merged: Seq[(Int, Int)] = ArrayBuffer((1,2), (1,4))
// group by key
val grouped = merged.groupBy(_._1)
// grouped: scala.collection.immutable.Map[Int,Seq[(Int, Int)]] = Map(1 -> ArrayBuffer((1,2), (1,4)))
// remove key from value set and convert to list
val cleaned = grouped.mapValues(_.map(_._2).toList)
// cleaned: scala.collection.immutable.Map[Int,List[Int]] = Map(1 -> List(2, 4))
This is the simplest implementation i could come up with,
val m1 = Map(1 -> "1", 2 -> "2")
val m2 = Map(2 -> "21", 3 -> "3")
def merge[K, V](m1:Map[K, V], m2:Map[K, V]):Map[K, List[V]] =
(m1.keySet ++ m2.keySet) map { i => i -> (m1.get(i).toList ::: m2.get(i).toList) } toMap
merge(m1, m2) // Map(1 -> List(1), 2 -> List(2, 21), 3 -> List(3))
You could use scalaz:
import scalaz._, Scalaz._
val m1 = Map('a -> 1, 'b -> 2)
val m2 = Map('b -> 3, 'c -> 4)
m1.mapValues{List(_)} |+| m2.mapValues{List(_)}
// Map('b -> List(2, 3), 'c -> List(4), 'a -> List(1))
You could use Set(_) instead of List(_) to get Sets as values in Map.
See Semigroup in scalaz cheat sheet (or in learning scalaz) for details about |+| operator.
For Int |+| works as +, for List - as ++, for Map it applies |+| to values of same keys.
One clean way to do it, with cats:
import cats.implicits._
Map(1 -> "Hello").combine(Map(2 -> "Goodbye"))
//Map(2 -> Goodbye, 1 -> Hello)
It's important to note that both maps have to be of the same type (in this case, Map[Int, String]).
Long explanation:
combine isn't really a member of Map. By importing cats.implicits you're bringing into scope cats's Map built-in monoid instances, along with some implicit classes which enable the terse syntax.
The above is equivalent to this:
Monoid[Map[Int, String]].combine(Map(1 -> "Hello"), Map(2 -> "Goodbye"))
Where we're using the Monoid "summoner" function to get the Monoid[Map[Int, String]] instance in scope and using its combine function.
Starting Scala 2.13, another solution only based on the standard library consists in using groupMap which (as its name suggests) is an equivalent of a groupBy followed by mapValues:
// val m1 = Map(1 -> "a", 2 -> "b")
// val m2 = Map(2 -> "c", 3 -> "d")
(m1.toSeq ++ m2).groupMap(_._1)(_._2)
// Map[Int,Seq[String]] = Map(2 -> List("b", "c"), 1 -> List("a"), 3 -> List("d"))
This:
Concatenates the two maps as a sequence of tuples (List((1,"a"), (2,"b"), (2,"c"), (3,"d"))). For conciseness, m2 is implicitly converted to Seq to adapt to the type of m1.toSeq - but you could choose to make it explicit by using m2.toSeq.
groups elements based on their first tuple part (_._1) (group part of groupMap)
maps grouped values to their second tuple part (_._2) (map part of groupMap)
I wrote a blog post about this , check it out :
http://www.nimrodstech.com/scala-map-merge/
basically using scalaz semi group you can achieve this pretty easily
would look something like :
import scalaz.Scalaz._
Map1 |+| Map2
You can use foldLeft to merge two Maps of the same type
def merge[A, B](a: Map[A, B], b: Map[A, B])(mergef: (B, Option[B]) => B): Map[A, B] = {
val (big, small) = if (a.size > b.size) (a, b) else (b, a)
small.foldLeft(big) { case (z, (k, v)) => z + (k -> mergef(v, z.get(k))) }
}
def mergeIntSum[A](a: Map[A, Int], b: Map[A, Int]): Map[A, Int] =
merge(a, b)((v1, v2) => v2.map(_ + v1).getOrElse(v1))
Example:
val a = Map("a" -> 1, "b" -> 5, "c" -> 6)
val b = Map("a" -> 4, "z" -> 8)
mergeIntSum(a, b)
res0: Map[String,Int] = Map(a -> 5, b -> 5, c -> 6, z -> 8)
There is a Scala module called scala-collection-contrib, which offers very useful methods like mergeByKey.
First, we need to add an additional dependency to build.sbt:
libraryDependencies += "org.scala-lang.modules" %% "scala-collection-contrib" % "0.1.0"
and then it's possible to do merge like this:
import scala.collection.decorators._
val map1 = Map(1 -> Class1(1), 2 -> Class1(2))
val map2 = Map(2 -> Class2(1), 3 -> Class2(2))
map1.mergeByKeyWith(map2){
case (a,b) => a.toList ++ b.toList
}
Solution to combine two maps: Map[A,B], the result type: Map[A,List[B]] via the Scala Cats (slightly improved version, offered by #David Castillo)
//convert each original map to Map[A,List[B]].
//Add an instance of Monoid[List] into the scope to combine lists:
import cats.instances.map._ // for Monoid
import cats.syntax.semigroup._ // for |+|
import cats.instances.list._
val map1 = Map("a" -> 1, "b" -> 2)
.mapValues(List(_))
val map2 = Map("b" -> 3, "d" -> 4)
.mapValues(List(_))
map1 |+| map2
If you don't want to mess around with original maps you could do something like following
val target = map1.clone()
val source = map2.clone()
source.foreach(e => target += e._1 -> e._2)
left.keys map { k => k -> List(left(k),right(k)) } toMap
Is concise and will work, assuming your two maps are left and right. Not sure about efficiency.
But your question is a bit ambiguous, for two reasons. You don't specify
The subtyping relationship between the values (i.e. class1,class2),
What happens if the maps have different keys
For the first case, consider the following example:
val left = Map("foo" ->1, "bar" ->2)
val right = Map("bar" -> 'a', "foo" -> 'b')
Which results in
res0: Map[String,List[Int]] = Map(foo -> List(1, 98), bar -> List(2, 97))
Notice how the Chars have been converted to Ints, because of the scala type hierarchy. More generally, if in your example class1 and class2 are not related, you would get back a List[Any]; this is probably not what you wanted.
You can work around this by dropping the List constructor from my answer; this will return Tuples which preserve the type:
res0: Map[String,(Int, Char)] = Map(foo -> (1,b), bar -> (2,a))
The second problem is what happens when you have maps that don't have the same keys. This will result in a key not found exception. Put in another way, are you doing a left, right, or inner join of the two maps? You can disambiguate the type of join by switching to right.keys or right.keySet ++ left.keySet for right/inner joins respectively. The later will work around the missing key problem, but maybe that's not what you want i.e. maybe you want a left or right join instead. In that case you can consider using the withDefault method of Map to ensure every key returns a value, e.g. None, but this needs a bit more work.
m2.foldLeft(m1.mapValues{List[CommonType](_)}) { case (acc, (k, v)) =>
acc.updated(k, acc.getOrElse(k, List.empty) :+ v)
}
As noted by jwvh, List type should be specified explicitly if Class1 is not upper type bound for Class2. CommonType is a type which is upper bound for both Class1 and Class2.
This answer does not solve the initial question directly although it solves a common/related scenario which is merging two maps by the common keys.
Based on #Drexin's answer I wrote a generic method to extend the existing Map functionality by providing a join method for Maps:
object implicits {
type A = Any
implicit class MapExt[K, B <: A, C <: A](val left: immutable.Map[K, B]) {
def join(right: immutable.Map[K, C]) : immutable.Map[K, Seq[A]] = {
val inter = left.keySet.intersect(right.keySet)
val leftFiltered = left.filterKeys{inter.contains}
val rightFiltered = right.filterKeys{inter.contains}
(leftFiltered.toSeq ++ rightFiltered.toSeq)
.groupBy(_._1)
.mapValues(_.map{_._2}.toList)
}
}
}
Notes:
Join is based on intersection of the keys, which resembles an "inner join" from the SQL world.
It works with Scala <= 2.12, for Scala 2.13 consider using groupMap as #Xavier Guihot suggested.
Consider replacing type A with your own base type.
Usage:
import implicits._
val m1 = Map("k11" -> "v11", "k12" -> "v12")
val m2 = Map("k11" -> "v21", "k12" -> "v22", "k13" -> "v23")
println (m1 join m2)
// Map(k11 -> List(v11, v21), k12 -> List(v12, v22))
this's two map merge
def mergeMap[A, B](map1: Map[A, B], map2: Map[A, B], op: (B, B) => B, default: => B): Map[A, B] = (map1.keySet ++ map2.keySet).map(x => (x, op(map1.getOrElse(x, default), map2.getOrElse(x, default)))).toMap
this’s multiple map merge
def mergeMaps[A, B](maps: Seq[Map[A, B]], op: (B, B) => B, default: => B): Map[A, B] = maps.reduce((a, b) => mergeMap(a, b, op, default))

Use map elements as method arguments

I've been stuck on this for a while now: I have this method
Method(a: (A,B)*): Unit
and I have a map of type Map[A,B], is there a way to convert this map so that I can directly use it as an argument?
Something like:
Method(map.convert)
Thank you.
You can build a sequence of pairs from a Map by using .toSeq.
You can also pass a sequence of type Seq[T] as varargs by "casting" it with : _*.
Chain the conversions to achieve what you want:
scala> val m = Map('a' -> 1, 'b' -> 2, 'z' -> 26)
m: scala.collection.immutable.Map[Char,Int] = Map(a -> 1, b -> 2, z -> 26)
scala> def foo[A,B](pairs : (A,B)*) = pairs.foreach(println)
foo: [A, B](pairs: (A, B)*)Unit
scala> foo(m.toSeq : _*)
(a,1)
(b,2)
(z,26)
I'm not sure but you can try converting map into array of tuples and then cast it to vararg type, like so:
Method(map.toArray: _*)
Just convert the map to a sequence of pairs using ".toSeq" and pass it to the var-args method postfixed with ":_*" (which is Scala syntax to allow passing a sequence as arguments to a var-args method).
Example:
def m(a: (String, Int)*) { for ((k, v) <- a) println(k+"->"+v) }
val x= Map("a" -> 1, "b" -> 2)
m(x.toSeq:_*)
In repl:
scala> def m(a: (String, Int)*) { for ((k, v) <- a) println(k+"->"+v) }
m: (a: (String, Int)*)Unit
scala> val x= Map("a" -> 1, "b" -> 2)
x: scala.collection.immutable.Map[java.lang.String,Int] = Map(a -> 1, b -> 2)
scala> m(x.toSeq:_*)
a->1
b->2

Cleaner tuple groupBy

I have a sequence of key-value pairs (String, Int), and I want to group them by key into a sequence of values (i.e. Seq[(String, Int)]) => Map[String, Iterable[Int]])).
Obviously, toMap isn't useful here, and groupBy maintains the values as tuples. The best I managed to come up with is:
val seq: Seq[( String, Int )]
// ...
seq.groupBy( _._1 ).mapValues( _.map( _._2 ) )
Is there a cleaner way of doing this?
Here's a pimp that adds a toMultiMap method to traversables. Would it solve your problem?
import collection._
import mutable.Builder
import generic.CanBuildFrom
class TraversableOnceExt[CC, A](coll: CC, asTraversable: CC => TraversableOnce[A]) {
def toMultiMap[T, U, That](implicit ev: A <:< (T, U), cbf: CanBuildFrom[CC, U, That]): immutable.Map[T, That] =
toMultiMapBy(ev)
def toMultiMapBy[T, U, That](f: A => (T, U))(implicit cbf: CanBuildFrom[CC, U, That]): immutable.Map[T, That] = {
val mutMap = mutable.Map.empty[T, mutable.Builder[U, That]]
for (x <- asTraversable(coll)) {
val (key, value) = f(x)
val builder = mutMap.getOrElseUpdate(key, cbf(coll))
builder += value
}
val mapBuilder = immutable.Map.newBuilder[T, That]
for ((k, v) <- mutMap)
mapBuilder += ((k, v.result))
mapBuilder.result
}
}
implicit def commomExtendTraversable[A, C[A] <: TraversableOnce[A]](coll: C[A]): TraversableOnceExt[C[A], A] =
new TraversableOnceExt[C[A], A](coll, identity)
Which can be used like this:
val map = List(1 -> 'a', 1 -> 'à', 2 -> 'b').toMultiMap
println(map) // Map(1 -> List(a, à), 2 -> List(b))
val byFirstLetter = Set("abc", "aeiou", "cdef").toMultiMapBy(elem => (elem.head, elem))
println(byFirstLetter) // Map(c -> Set(cdef), a -> Set(abc, aeiou))
If you add the following implicit defs, it will also work with collection-like objects such as Strings and Arrays:
implicit def commomExtendStringTraversable(string: String): TraversableOnceExt[String, Char] =
new TraversableOnceExt[String, Char](string, implicitly)
implicit def commomExtendArrayTraversable[A](array: Array[A]): TraversableOnceExt[Array[A], A] =
new TraversableOnceExt[Array[A], A](array, implicitly)
Then:
val withArrays = Array(1 -> 'a', 1 -> 'à', 2 -> 'b').toMultiMap
println(withArrays) // Map(1 -> [C#377653ae, 2 -> [C#396fe0f4)
val byLowercaseCode = "Mama".toMultiMapBy(c => (c.toLower.toInt, c))
println(byLowercaseCode) // Map(97 -> aa, 109 -> Mm)
There's no method or data structure in the standard library to do this, and your solution looks about as concise as you'll get. If you use this in more than one place, you might like to factor it out into a utility method
def groupTuples[A, B](seq: Seq[(A, B)]) =
seq groupBy (_._1) mapValues (_ map (_._2))
which you then obviously just call with groupTuples(seq). This might not be the most efficient possible in terms of CPU clock cycles, but I don't think it's particularly inefficient either.
I did a rough benchmark against Jean-Philippe's solution on a list of 9 tuples and this is marginally faster. Both were about twice as fast as folding the sequence into a map (effectively re-implementing groupBy to give the output you want).
I don't know if you consider it cleaner:
seq.groupBy(_._1).map { case (k,v) => (k,v.map(_._2))}
Starting Scala 2.13, most collections are provided with the groupMap method which is (as its name suggests) an equivalent (more efficient) of a groupBy followed by mapValues:
List(1 -> 'a', 1 -> 'b', 2 -> 'c').groupMap(_._1)(_._2)
// Map[Int,List[Char]] = Map(2 -> List(c), 1 -> List(a, b))
This:
groups elements based on the first part of tuples (Map(2 -> List((2,c)), 1 -> List((1,a), (1,b))))
maps grouped values (List((1,a), (1,b))) by taking their second tuple part (List(a, b)).