Using filterNot with a Map - scala

I have a simple map
val m = Map("a1" -> "1", "b2" -> "2", "c3" -> "3")
val someList = List("b", "d")
m.filterNot( (k,v) => someList.exist(l => k.startsWith(l)) )
I'm getting an error:
error: missing parameter type
I'm doing something silly here I'm sure, why isn' this compiling?

filterNot needs a case keyword and {} when you extract k, v from tuple.
note that its not exist its exists
m.filterNot { case (k,v) => someList.exists(l => k.startsWith(l)) }
or
m.filterNot(pair => someList.exists(l => pair._1.startsWith(l)))
Explanation
As you are extracting k, v from the tuple using extractor syntax you have to use case keyword and {}
Without extractor syntax you can do
m.filterNot(pair => someList.exists(l => pair._1.startsWith(l)))
Scala REPL
scala> val m = Map("a1" -> "1", "b2" -> "2", "c3" -> "3")
m: scala.collection.immutable.Map[String,String] = Map(a1 -> 1, b2 -> 2, c3 -> 3)
scala> val someList = List("b", "d")
someList: List[String] = List(b, d)
scala> m.filterNot { case (k,v) => someList.exists(l => k.startsWith(l)) }
res15: scala.collection.immutable.Map[String,String] = Map(a1 -> 1, c3 -> 3)
Without extractor syntax
Now you need not use case keyword and {} as we are not using extracting the key and value using extractor syntax
scala> m.filterNot(pair => someList.exists(l => pair._1.startsWith(l)))
res18: scala.collection.immutable.Map[String,String] = Map(a1 -> 1, c3 -> 3)
scala> val m = Map("a1" -> "1", "b2" -> "2", "c3" -> "3")
m: scala.collection.immutable.Map[String,String] = Map(a1 -> 1, b2 -> 2, c3 -> 3)
scala> val someList = List("b", "d")
someList: List[String] = List(b, d)
scala> m.filterNot(pair => someList.exists(l => pair._1.startsWith(l)))
res19: scala.collection.immutable.Map[String,String] = Map(a1 -> 1, c3 -> 3)

The issue is this: the filterNot method accepts one parameter, while you are defining a list of two parameters. This confuses the compiler and that message is the result.
In order to solve it, you can use the following syntax (notice the usage of pattern matching with the case keyword):
m.filterNot { case (k,v) => someList.exist(l => k.startsWith(l)) }
Using the pattern matching like this creates a PartialFunction that will be decompose key and value and be applied like a normal function to your Map.

Using for comprehension syntax, extraction and filtering may be achieved as follows,
for ( pair#(k,v) <- m; l <- someList if !k.startsWith(l)) yield pair

Related

Scala groupBy for a list

I'd like to create a map on which the key is the string and the value is the number of how many times the string appears on the list. I tried the groupBy method, but have been unsuccessful with that.
Required Answer
scala> val l = List("abc","abc","cbe","cab")
l: List[String] = List(abc, abc, cbe, cab)
scala> l.groupBy(identity).mapValues(_.size)
res91: scala.collection.immutable.Map[String,Int] = Map(cab -> 1, abc -> 2, cbe -> 1)
Suppose you have a list as
scala> val list = List("abc", "abc", "bc", "b", "abc")
list: List[String] = List(abc, abc, bc, b, abc)
You can write a function
scala> def generateMap(list: List[String], map:Map[String, Int]) : Map[String, Int] = list match {
| case x :: y => if(map.keySet.contains(x)) generateMap(y, map ++ Map(x -> (map(x)+1))) else generateMap(y, map ++ Map(x -> 1))
| case Nil => map
| }
generateMap: (list: List[String], map: Map[String,Int])Map[String,Int]
Then call the function as
scala> generateMap(list, Map.empty)
res1: Map[String,Int] = Map(abc -> 3, bc -> 1, b -> 1)
This also works:
scala> val l = List("abc","abc","cbe","cab")
val l: List[String] = List(abc, abc, cbe, cab)
scala> l.groupBy(identity).map(x => (x._1, x._2.length))
val res1: Map[String, Int] = HashMap(cbe -> 1, abc -> 2, cab -> 1)

Difference between mapValues and transform in Map

In Scala Map (see API) what is the difference in semantics and performance between mapValues and transform ?
For any given map, for instance
val m = Map( "a" -> 2, "b" -> 3 )
both
m.mapValues(_ * 5)
m.transform( (k,v) => v * 5 )
deliver the same result.
Let's say we have a Map[A,B]. For clarification: I'm always referring to an immutable Map.
mapValues takes a function B => C, where C is the new type for the values.
transform takes a function (A, B) => C, where this C is also the type for the values.
So both will result in a Map[A,C].
However with the transform function you can influence the result of the new values by the value of their keys.
For example:
val m = Map( "a" -> 2, "b" -> 3 )
m.transform((key, value) => key + value) //Map[String, String](a -> a2, b -> b3)
Doing this with mapValues will be quite hard.
The next difference is that transform is strict, whereas mapValues will give you only a view, which will not store the updated elements. It looks like this:
protected class MappedValues[C](f: B => C) extends AbstractMap[A, C] with DefaultMap[A, C] {
override def foreach[D](g: ((A, C)) => D): Unit = for ((k, v) <- self) g((k, f(v)))
def iterator = for ((k, v) <- self.iterator) yield (k, f(v))
override def size = self.size
override def contains(key: A) = self.contains(key)
def get(key: A) = self.get(key).map(f)
}
(taken from https://github.com/scala/scala/blob/v2.11.2/src/library/scala/collection/MapLike.scala#L244)
So performance-wise it depends what is more effective. If f is expensive and you only access a few elements of the resulting map, mapValues might be better, since f is only applied on demand. Otherwise I would stick to map or transform.
transform can also be expressed with map. Assume m: Map[A,B] and f: (A,B) => C, then
m.transform(f) is equivalent to m.map{case (a, b) => (a, f(a, b))}
collection.Map doesn't provide transform: it has a different signature for mutable and immutable Maps.
$ scala
Welcome to Scala version 2.11.2 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_11).
Type in expressions to have them evaluated.
Type :help for more information.
scala> val im = Map('a -> 1, 'b -> 2, 'c -> 3)
im: scala.collection.immutable.Map[Symbol,Int] = Map('a -> 1, 'b -> 2, 'c -> 3)
scala> im.mapValues(_ * 7) eq im
res0: Boolean = false
scala> im.transform { case (k,v) => v*7 } eq im
res2: Boolean = false
scala> val mm = collection.mutable.Map('a -> 1, 'b -> 2, 'c -> 3)
mm: scala.collection.mutable.Map[Symbol,Int] = Map('b -> 2, 'a -> 1, 'c -> 3)
scala> mm.mapValues(_ * 7) eq mm
res3: Boolean = false
scala> mm.transform { case (k,v) => v*7 } eq mm
res5: Boolean = true
Mutable transform mutates in place:
scala> mm.transform { case (k,v) => v*7 }
res6: mm.type = Map('b -> 98, 'a -> 49, 'c -> 147)
scala> mm.transform { case (k,v) => v*7 }
res7: mm.type = Map('b -> 686, 'a -> 343, 'c -> 1029)
So mutable transform doesn't change the type of the map:
scala> im mapValues (_ => "hi")
res12: scala.collection.immutable.Map[Symbol,String] = Map('a -> hi, 'b -> hi, 'c -> hi)
scala> mm mapValues (_ => "hi")
res13: scala.collection.Map[Symbol,String] = Map('b -> hi, 'a -> hi, 'c -> hi)
scala> mm.transform { case (k,v) => "hi" }
<console>:9: error: type mismatch;
found : String("hi")
required: Int
mm.transform { case (k,v) => "hi" }
^
scala> im.transform { case (k,v) => "hi" }
res15: scala.collection.immutable.Map[Symbol,String] = Map('a -> hi, 'b -> hi, 'c -> hi)
...as can happen when constructing a new map.
Here's a couple of unmentioned differences:
mapValues creates a Map that is NOT serializable, without any indication that it's just a view (the type is Map[_, _], but just try to send one across the wire).
Since mapValues is just a view, every instance contains the real Map - which could be another result of mapValues. Imagine you have an actor with some state, and every mutation of the state sets the new state to be a mapValues on the previous state...in the end you have deeply nested maps with a copy of each previous state of the actor (and, yes, both of these are from experience).

Filter Map by key set

Is there a shortcut to filter a Map keeping only the entries where the key is contained in a given Set?
Here is some example code
scala> val map = Map("1"->1, "2"->2, "3"->3)
map: scala.collection.immutable.Map[java.lang.String,Int] = Map(1 -> 1, 2 -> 2, 3 -> 3)
scala> map.filterKeys(Set("1","2").contains)
res0: scala.collection.immutable.Map[java.lang.String,Int] = Map(1 -> 1, 2 -> 2)
I am searching for something shorter than this.
Answering the Question
You can take advantage of the fact that a Set[A] is a predicate; i.e. A => Boolean
map filterKeys set
Here it is at work:
scala> val map = Map("1" -> 1, "2" -> 2, "3" -> 3)
map: scala.collection.immutable.Map[java.lang.String,Int] = Map(1 -> 1, 2 -> 2, 3 -> 3)
scala> val set = Set("1", "2")
set: scala.collection.immutable.Set[java.lang.String] = Set(1, 2)
scala> map filterKeys set
res0: scala.collection.immutable.Map[java.lang.String,Int] = Map(1 -> 1, 2 -> 2)
Or if you prefer:
scala> map filterKeys Set("1", "2")
res1: scala.collection.immutable.Map[java.lang.String,Int] = Map(1 -> 1, 2 -> 2)
Predicates
It's actually really useful to have some wrapper around a predicate. Like so:
scala> class PredicateW[A](self: A => Boolean) {
| def and(other: A => Boolean): A => Boolean = a => self(a) && other(a)
| def or(other: A => Boolean): A => Boolean = a => self(a) || other(a)
| def unary_! : A => Boolean = a => !self(a)
| }
defined class PredicateW
And an implicit conversion:
scala> implicit def Predicate_Is_PredicateW[A](p: A => Boolean) = new PredicateW(p)
Predicate_Is_PredicateW: [A](p: A => Boolean)PredicateW[A]
And then you can use it:
scala> map filterKeys (Set("1", "2") and Set("2", "3"))
res2: scala.collection.immutable.Map[java.lang.String,Int] = Map(2 -> 2)
scala> map filterKeys (Set("1", "2") or Set("2", "3"))
res3: scala.collection.immutable.Map[java.lang.String,Int] = Map(1 -> 1, 2 -> 2, 3 -> 3)
scala> map filterKeys !Set("2", "3")
res4: scala.collection.immutable.Map[java.lang.String,Int] = Map(1 -> 1)
This can be extended to xor, nand etc etc and if you include symbolic unicode can make for amazingly readable code:
val mustReport = trades filter (uncoveredShort ∨ exceedsDollarMax)
val european = {
val Europe = (_ : Market).exchange.country.region == Region.EU
trades filter (_.market ∈: Europe)
}
Sorry, not a direct answer to your question, but if you know which keys you want to remove (instead of which ones you want to keep), you could do this:
map -- Set("3")
A tangential tip, in case you are going to follow the PredicateW idea in #oxbow_lakes' answer:
In functional programming, instead of defining ad hoc functions, we aim for more generalized and composable abstractions. For this particular case, Applicative fits the bill.
Set themselves are functions, and the Applicative instance for [B]Function1[A, B] lets us lift functions to context. In other words, you can lift functions of type (Boolean, Boolean) => Boolean (such as ||, && etc.) to (A => Boolean, A => Boolean) => (A => Boolean). (Here you can find a great explanation on this concept of lifting.)
However the data structure Set itself has an Applicative instance available, which will be favored over [B]Applicative[A => B] instance. To prevent that, we will have to explicitly tell the compiler to treat the given set as a function. We define a following enrichment for that:
scala> implicit def setAsFunction[A](set: Set[A]) = new {
| def f: A => Boolean = set
| }
setAsFunction: [A](set: Set[A])java.lang.Object{def f: A => Boolean}
scala> Set(3, 4, 2).f
res144: Int => Boolean = Set(3, 4, 2)
And now put this Applicative goodness into use.
scala> val map = Map("1" -> 1, "2" -> 2, "3" -> 3)
map: scala.collection.immutable.Map[java.lang.String,Int] = Map(1 -> 1, 2 -> 2, 3 -> 3)
scala> map filterKeys ((Set("1", "2").f |#| Set("2", "3").f)(_ && _))
res150: scala.collection.immutable.Map[java.lang.String,Int] = Map(2 -> 2)
scala> map filterKeys ((Set("1", "2").f |#| Set("2", "3").f)(_ || _))
res151: scala.collection.immutable.Map[java.lang.String,Int] = Map(1 -> 1, 2 -> 2, 3 -> 3)
scala> map filterKeys (Set("2", "3").f map (!_))
res152: scala.collection.immutable.Map[java.lang.String,Int] = Map(1 -> 1)
Note: All of the above requires Scalaz.

Convert List of tuple to map (and deal with duplicate key ?)

I was thinking about a nice way to convert a List of tuple with duplicate key [("a","b"),("c","d"),("a","f")] into map ("a" -> ["b", "f"], "c" -> ["d"]). Normally (in python), I'd create an empty map and for-loop over the list and check for duplicate key. But I am looking for something more scala-ish and clever solution here.
btw, actual type of key-value I use here is (Int, Node) and I want to turn into a map of (Int -> NodeSeq)
For Googlers that don't expect duplicates or are fine with the default duplicate handling policy:
List("a" -> 1, "b" -> 2, "a" -> 3).toMap
// Result: Map(a -> 3, c -> 2)
As of 2.12, the default policy reads:
Duplicate keys will be overwritten by later keys: if this is an unordered collection, which key is in the resulting map is undefined.
Group and then project:
scala> val x = List("a" -> "b", "c" -> "d", "a" -> "f")
//x: List[(java.lang.String, java.lang.String)] = List((a,b), (c,d), (a,f))
scala> x.groupBy(_._1).map { case (k,v) => (k,v.map(_._2))}
//res1: scala.collection.immutable.Map[java.lang.String,List[java.lang.String]] = Map(c -> List(d), a -> List(b, f))
More scalish way to use fold, in the way like there (skip map f step).
Here's another alternative:
x.groupBy(_._1).mapValues(_.map(_._2))
For Googlers that do care about duplicates:
implicit class Pairs[A, B](p: List[(A, B)]) {
def toMultiMap: Map[A, List[B]] = p.groupBy(_._1).mapValues(_.map(_._2))
}
> List("a" -> "b", "a" -> "c", "d" -> "e").toMultiMap
> Map("a" -> List("b", "c"), "d" -> List("e"))
Starting Scala 2.13, most collections are provided with the groupMap method which is (as its name suggests) an equivalent (more efficient) of a groupBy followed by mapValues:
List("a" -> "b", "c" -> "d", "a" -> "f").groupMap(_._1)(_._2)
// Map[String,List[String]] = Map(a -> List(b, f), c -> List(d))
This:
groups elements based on the first part of tuples (group part of groupMap)
maps grouped values by taking their second tuple part (map part of groupMap)
This is an equivalent of list.groupBy(_._1).mapValues(_.map(_._2)) but performed in one pass through the List.
Below you can find a few solutions. (GroupBy, FoldLeft, Aggregate, Spark)
val list: List[(String, String)] = List(("a","b"),("c","d"),("a","f"))
GroupBy variation
list.groupBy(_._1).map(v => (v._1, v._2.map(_._2)))
Fold Left variation
list.foldLeft[Map[String, List[String]]](Map())((acc, value) => {
acc.get(value._1).fold(acc ++ Map(value._1 -> List(value._2))){ v =>
acc ++ Map(value._1 -> (value._2 :: v))
}
})
Aggregate Variation - Similar to fold Left
list.aggregate[Map[String, List[String]]](Map())(
(acc, value) => acc.get(value._1).fold(acc ++ Map(value._1 ->
List(value._2))){ v =>
acc ++ Map(value._1 -> (value._2 :: v))
},
(l, r) => l ++ r
)
Spark Variation - For big data sets (Conversion to a RDD and to a Plain Map from RDD)
import org.apache.spark.rdd._
import org.apache.spark.{SparkContext, SparkConf}
val conf: SparkConf = new
SparkConf().setAppName("Spark").setMaster("local")
val sc: SparkContext = new SparkContext (conf)
// This gives you a rdd of the same result
val rdd: RDD[(String, List[String])] = sc.parallelize(list).combineByKey(
(value: String) => List(value),
(acc: List[String], value) => value :: acc,
(accLeft: List[String], accRight: List[String]) => accLeft ::: accRight
)
// To convert this RDD back to a Map[(String, List[String])] you can do the following
rdd.collect().toMap
Here is a more Scala idiomatic way to convert a list of tuples to a map handling duplicate keys. You want to use a fold.
val x = List("a" -> "b", "c" -> "d", "a" -> "f")
x.foldLeft(Map.empty[String, Seq[String]]) { case (acc, (k, v)) =>
acc.updated(k, acc.getOrElse(k, Seq.empty[String]) ++ Seq(v))
}
res0: scala.collection.immutable.Map[String,Seq[String]] = Map(a -> List(b, f), c -> List(d))
You can try this
scala> val b = new Array[Int](3)
// b: Array[Int] = Array(0, 0, 0)
scala> val c = b.map(x => (x -> x * 2))
// c: Array[(Int, Int)] = Array((1,2), (2,4), (3,6))
scala> val d = Map(c : _*)
// d: scala.collection.immutable.Map[Int,Int] = Map(1 -> 2, 2 -> 4, 3 -> 6)

Scala Map conversion

I'm a Scala newbie I'm afraid:
I'm trying to convert a Map to a new Map based on some simple logic:
val postVals = Map("test" -> "testing1", "test2" -> "testing2", "test3" -> "testing3")
I want to test for value "testing1" and change the value (while creating a new Map)
def modMap(postVals: Map[String, String]): Map[String, String] = {
postVals foreach {case(k, v) => if(v=="testing1") postVals.update(k, "new value")}
}
You could use the 'map' method. That returns a new collection by applying the given function to all elements of it:
scala> def modMap(postVals: Map[String, String]): Map[String, String] = {
postVals map {case(k, v) => if(v == "a") (k -> "other value") else (k ->v)}
}
scala> val m = Map[String, String]("1" -> "a", "2" -> "b")
m: scala.collection.immutable.Map[String,String] = Map((1,a), (2,b))
scala> modMap(m)
res1: Map[String,String] = Map((1,other value), (2,b))
Alternative to Arjan's answer: (just a slight change)
scala> val someMap = Map("a" -> "apple", "b" -> "banana")
someMap: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(a -> apple, b -> banana)
scala> val newMap = someMap map {
| case(k , v # "apple") => (k, "alligator")
| case pair => pair
| }
newMap: scala.collection.immutable.Map[java.lang.String,java.lang.String] = Map(a -> alligator, b -> banana)
even easier:
val myMap = Map("a" -> "apple", "b" -> "banana")
myMap map {
case (k, "apple") => (k, "apfel")
case pair => pair
}