Is there a cleaner way to pattern-match in Scala anonymous functions?

I find myself writing code like the following:
val b = a map (entry =>
entry match {
case ((x,y), u) => ((y,x), u)
}
)
I would like to write it differently, if only this worked:
val c = a map (((x,y) -> u) =>
(y,x) -> u
)
Is there any way I can get something close to this?

Believe it or not, this works:
val b = List(1, 2)
b map {
case 1 => "one"
case 2 => "two"
}
You can skip the p => p match in simple cases. So this should work:
val c = a map {
case ((x,y) -> u) => (y,x) -> u
}
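A minimal sketch of the equivalence described above, assuming `a` is a list of `((Int, Int), Int)` values (the question does not show its definition, so the sample data and the object name are made up for illustration):

```scala
object CaseLambdaDemo {
  // Hypothetical sample data; the original question does not define `a`.
  val a: List[((Int, Int), Int)] = List(((1, 2), 3), ((4, 5), 6))

  // Long form: name the argument, then match on it.
  val viaMatch = a map (entry => entry match {
    case ((x, y), u) => ((y, x), u)
  })

  // Short form: a pattern-matching anonymous function.
  val viaCase = a map { case ((x, y), u) => ((y, x), u) }
}
```

Both values should hold the same list, with the inner pair of each element swapped.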

In your example, there are three subtly different semantics that you may be going for.
Map over the collection, transforming each element that matches a pattern. Throw an exception if any element does not match. These semantics are achieved with
val b = a map { case ((x, y), u) => ((y, x), u) }
Map over the collection, transforming each element that matches a pattern. Silently discard elements that do not match:
val b = a collect { case ((x, y), u) => ((y, x), u) }
Map over the collection, safely destructuring and then transforming each element. These are the semantics that I would expect for an expression like
val b = a map (((x, y), u) => ((y, x), u))
Unfortunately, there is no concise syntax to achieve these semantics in Scala.
Instead, you have to destructure yourself:
val b = a map { p => ((p._1._2, p._1._1), p._2) }
One might be tempted to use a value definition for destructuring:
val b = a map { p => val ((x,y), u) = p; ((y, x), u) }
However, this version is no safer than the one that uses explicit pattern matching. For this reason, if you want the safe destructuring semantics, the most concise solution is to explicitly type your collection to prevent unintended widening and use explicit pattern matching:
val a: List[((Int, Int), Int)] = // ...
// ...
val b = a map { case ((x, y), u) => ((y, x), u) }
If a's definition appears far from its use (e.g. in a separate compilation unit), you can minimize the risk by ascribing its type in the map call:
val b = (a: List[((Int, Int), Int)]) map { case ((x, y), u) => ((y, x), u) }
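To make the difference between the first two semantics concrete, here is a small sketch that assumes a deliberately widened `List[Any]` containing one element that does not match the pattern. The data and the object name are hypothetical, chosen only to trigger the failure case:

```scala
import scala.util.Try

object MapVsCollectDemo {
  // Hypothetical widened collection: the middle element is not a nested tuple.
  val mixed: List[Any] = List(((1, 2), 3), "oops", ((4, 5), 6))

  // collect silently drops the element that does not match.
  val collected = mixed collect { case ((x, y), u) => ((y, x), u) }

  // map with the same pattern throws a MatchError on the non-matching element;
  // Try captures the exception so it can be inspected.
  val mapped = Try(mixed map { case ((x, y), u) => ((y, x), u) })
}
```

With a precisely typed collection such as `List[((Int, Int), Int)]`, the pattern cannot fail, which is why the type ascription above is a reasonable safeguard.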

In your quoted example, the cleanest solution is:
val xs = List((1,2)->3,(4,5)->6,(7,8)->9)
xs map { case (a,b) => (a.swap, b) }

val b = a map { case ((x,y), u) => ((y,x), u) }

Related

flatmapping a nested Map in scala

Suppose I have val someMap: Map[String, Map[String, String]] defined as such:
val someMap =
Map(
("a1" -> Map( ("b1" -> "c1"), ("b2" -> "c2") ) ),
("a2" -> Map( ("b3" -> "c3"), ("b4" -> "c4") ) ),
("a3" -> Map( ("b5" -> "c5"), ("b6" -> "c6") ) )
)
and I would like to flatten it to something that looks like
List(
("a1","b1","c1"),("a1","b2","c2"),
("a2","b3","c3"),("a2","b4","c4"),
("a3","b5","c5"),("a3","b6","c6")
)
What is the most efficient way of doing this? I was thinking about creating some helper function that processes each (a_i -> Map(String,String)) key-value pair and returns the corresponding triples:
def helper(key: String, values: Map[String, String]): Iterable[(String, String, String)] = {
// one (outerKey, innerKey, innerValue) triple per entry of the inner map
values.map(x => (key, x._1, x._2))
}
then flatmap this function over someMap. But this seems somewhat unnecessary to my novice scala eyes, so I was wondering if there was a more efficient way to parse this Map.
There is no need to create a helper function; just write a nested lambda:
val result = someMap.flatMap { case (k, v) => v.map { case (k1, v1) => (k, k1, v1) } }
Or
val y = someMap.flatMap(x => x._2.map(y => (x._1, y._1, y._2)))
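The nested flatMap/map can also be spelled as a for-comprehension, which the compiler rewrites into the same kind of flatMap/map chain. This is only an alternative notation, not something the answer above proposes. The sketch uses a shortened, hypothetical version of `someMap` and converts to `List` first so that the result type is unambiguous:

```scala
object ForComprehensionDemo {
  // Shortened, hypothetical sample data in the same shape as someMap.
  val someMap: Map[String, Map[String, String]] = Map(
    "a1" -> Map("b1" -> "c1", "b2" -> "c2"),
    "a2" -> Map("b3" -> "c3")
  )

  // Two generators: the outer entries, then the entries of each inner map.
  val triples: List[(String, String, String)] =
    for {
      (a, inner) <- someMap.toList
      (b, c)     <- inner.toList
    } yield (a, b, c)
}
```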
Since you're asking about efficiency, the most efficient yet functional approach I can think of is using foldLeft and foldRight.
You need foldRight since :: constructs the immutable list in reverse.
someMap.foldRight(List.empty[(String, String, String)]) { case ((a, m), acc) =>
m.foldRight(acc) {
case ((b, c), acc) => (a, b, c) :: acc
}
}
Here, assuming Map.iterator.reverse is implemented efficiently, no intermediate collections are created.
Alternatively, you can use foldLeft and then reverse the result:
someMap.foldLeft(List.empty[(String, String, String)]) { case (acc, (a, m)) =>
m.foldLeft(acc) {
case (acc, (b, c)) => (a, b, c) :: acc
}
}.reverse
This way a single intermediate List is created, but you don't rely on the implementation of the reversed iterator (foldLeft uses forward iterator).
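As a sanity check, the following sketch runs both fold-based versions next to a plain flatMap on the `someMap` from the question and compares the results. It says nothing about performance; it only illustrates that the three approaches should produce the same triples in the same order. The object and value names are made up for this example:

```scala
object FlattenDemo {
  val someMap: Map[String, Map[String, String]] = Map(
    "a1" -> Map("b1" -> "c1", "b2" -> "c2"),
    "a2" -> Map("b3" -> "c3", "b4" -> "c4"),
    "a3" -> Map("b5" -> "c5", "b6" -> "c6")
  )

  type Triple = (String, String, String)

  // Baseline: flatMap over the entries, building intermediate lists.
  val viaFlatMap: List[Triple] =
    someMap.toList.flatMap { case (a, m) => m.toList.map { case (b, c) => (a, b, c) } }

  // foldRight prepends while walking backwards, so the result is in forward order.
  val viaFoldRight: List[Triple] =
    someMap.foldRight(List.empty[Triple]) { case ((a, m), acc) =>
      m.foldRight(acc) { case ((b, c), inner) => (a, b, c) :: inner }
    }

  // foldLeft prepends while walking forwards, so one final reverse is needed.
  val viaFoldLeft: List[Triple] =
    someMap.foldLeft(List.empty[Triple]) { case (acc, (a, m)) =>
      m.foldLeft(acc) { case (inner, (b, c)) => (a, b, c) :: inner }
    }.reverse
}
```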
Note: one liners, such as someMap.flatMap(x => x._2.map(y => (x._1, y._1, y._2))) are less efficient, as, in addition to the temporary buffer to hold intermediate results of flatMap, they create and discard additional intermediate collections for each inner map.
UPD
Since there seems to be some confusion, I'll clarify what I mean. Here is an implementation of map, flatMap, foldLeft and foldRight from TraversableLike:
def map[B, That](f: A => B)(implicit bf: CanBuildFrom[Repr, B, That]): That = {
def builder = { // extracted to keep method size under 35 bytes, so that it can be JIT-inlined
val b = bf(repr)
b.sizeHint(this)
b
}
val b = builder
for (x <- this) b += f(x)
b.result
}
def flatMap[B, That](f: A => GenTraversableOnce[B])(implicit bf: CanBuildFrom[Repr, B, That]): That = {
def builder = bf(repr) // extracted to keep method size under 35 bytes, so that it can be JIT-inlined
val b = builder
for (x <- this) b ++= f(x).seq
b.result
}
def foldLeft[B](z: B)(op: (B, A) => B): B = {
var result = z
this foreach (x => result = op(result, x))
result
}
def foldRight[B](z: B)(op: (A, B) => B): B =
reversed.foldLeft(z)((x, y) => op(y, x))
It's clear that map and flatMap create an intermediate buffer using the corresponding builder, while foldLeft and foldRight reuse the same user-supplied accumulator object and only use iterators.

Scala groupBy all elements in the item's list

I have a list of tuples, where the first element is a string and the second is a list of strings.
For example...(ignoring speech marks)
val p = List((a, List(x,y,z)), (b, List(x)), (c, List(y,z)))
My goal is to group this list into a map with the elements of the nested lists acting as keys.
val q = Map(x -> List(a,b), y -> List(a,c), z-> List(a,c))
My initial thought was to group by the second elements of p but this assigns the entire lists to the keys.
I'm a beginner to Scala so any advice is appreciated. Should I expect to be able to complete this with higher order functions or would for loops be useful here?
Thanks in advance :)
Here are two variants:
val p = List(("a", List("x","y","z")), ("b", List("x")), ("c", List("y","z")))
// 1. "Transducers"
p.flatMap{ case (k, v) => v.map { _ -> k } } // List((x,a), (y,a), (z,a), (x,b), (y,c), (z,c))
.groupBy(_._1) // Map(z -> List((z,a), (z,c)), y -> List((y,a), (y,c)), x -> List((x,a), (x,b)))
.mapValues(_.map(_._2)) // Map(z -> List(a, c), y -> List(a, c), x -> List(a, b))
// 2. For-loop
var res = Map[String, List[String]]()
for ( (k, vs) <- p; v <- vs) {
res += v -> (k :: res.getOrElse(v, List())) // parentheses needed: -> binds tighter than ::
}
res // Map(x -> List(b, a), y -> List(c, a), z -> List(c, a))
// Note: the values of `res` are reversed,
// because the efficient "cons" operator (::) was used to add values to the lists.
// You can reverse the lists afterwards like this:
res.mapValues(_.reverse) // Map(x -> List(a, b), y -> List(a, c), z -> List(a, c))
The second variant is more performant because no intermediate collections are created, but it could also be considered "less idiomatic" because the mutable variable res is used. However, it's totally fine to use a mutable approach inside a private method.
UPD. Per @LuisMiguelMejíaSuárez's suggestions:
In (1), since scala 2.13, groupBy followed by mapValues can be replaced by groupMap, so the whole chain becomes:
p.flatMap{ case (k, v) => v.map { _ -> k } }
.groupMap(_._1)(_._2)
Another functional variant without intermediate collections can be achieved using foldLeft:
p.foldLeft(Map[String, List[String]]()) {
case (acc, (k, vs)) =>
vs.foldLeft(acc) { (acc1, v) =>
acc1 + (v -> (k :: acc1.getOrElse(v, List())))
}
}
Or slightly more efficiently with updatedWith (scala 2.13):
p.foldLeft(Map[String, List[String]]()) {
case (acc, (k, vs)) =>
vs.foldLeft(acc) { (acc1, v) =>
acc1.updatedWith(v) {
case Some(list) => Some(k :: list)
case None => Some(List(k))
}
}
}
... or same thing slightly shorter:
p.foldLeft(Map[String, List[String]]()) {
case (acc, (k, vs)) =>
vs.foldLeft(acc) { (acc1, v) =>
acc1.updatedWith(v)(_.map(k :: _).orElse(Some(List(k))))
}
}
Overall, I'd suggest either using foldLeft variant (most performant and functional), or the first, groupMap variant (shorter, and arguably more readable, but less performant), depending on your goals.
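For reference, a sketch that puts the groupBy variant and the foldLeft variant side by side and checks that they agree. It assumes Scala 2.13, where calling mapValues directly on a Map is deprecated in favour of `.view.mapValues(...).toMap`. The object and value names are illustrative only:

```scala
object InvertDemo {
  val p = List(("a", List("x", "y", "z")), ("b", List("x")), ("c", List("y", "z")))

  // Variant 1: flatten to (value, key) pairs, group by value, keep only the keys.
  val viaGroupBy: Map[String, List[String]] =
    p.flatMap { case (k, vs) => vs.map(v => v -> k) }
      .groupBy(_._1)
      .view.mapValues(_.map(_._2)).toMap

  // Variant 2: nested foldLeft that prepends keys, followed by one reverse per list.
  val viaFold: Map[String, List[String]] =
    p.foldLeft(Map.empty[String, List[String]]) { case (acc, (k, vs)) =>
      vs.foldLeft(acc)((m, v) => m.updated(v, k :: m.getOrElse(v, Nil)))
    }.view.mapValues(_.reverse).toMap
}
```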
Your input list p is one step away from being a Map. From there all you need is a general purpose Map inverter.
import scala.collection.generic.IsIterableOnce
import scala.collection.Factory
// from Map[K,C[V]] to Map[V,C[K]] (Scala 2.13.x)
implicit class MapInverter[K,V,C[_]](m: Map[K,C[V]]) {
def invert(implicit iio: IsIterableOnce[C[V]] {type A = V}
, fac: Factory[K,C[K]]): Map[V,C[K]] =
m.foldLeft(Map.empty[V, List[K]]) {
case (acc, (k, vs)) =>
iio(vs).iterator.foldLeft(acc) {
case (a, v) =>
a + (v -> (k::a.getOrElse(v,Nil)))
}
}.map{case (k,v) => k -> v.to(fac)}
}
usage:
val p = List(("a", List("x","y","z")), ("b", List("x")), ("c", List("y","z")))
val q = p.toMap.invert
//Map(x -> List(b, a), y -> List(c, a), z -> List(c, a))

Scala lists with existential types: `map{ case t => ... }` works, `map{ t => ... }` doesn't?

Suppose that we have defined an existential type:
type T = (X => X, X) forSome { type X }
and then defined a list of type List[T]:
val list = List[T](
((x: Int) => x * x, 42),
((_: String).toUpperCase, "foo")
)
It is well known [1], [2] that the following attempt to map does not work:
list.map{ x => x._1(x._2) }
But then, why does the following work?:
list.map{ case x => x._1(x._2) }
Note that answers to both linked questions assumed that a type variable is required in the pattern matching, but it also works without the type variable. The emphasis of the question is more on why { case x => ... } works.
(My own attempt to answer the question; it should not be too wrong, but it may be a bit superficial.)
First, observe that
list.map{ x => x._1(x._2) }
list.map{ case x => x._1(x._2) }
is essentially the same as
list map f1
list map f2
with
val f1: T => Any = t => t._1(t._2)
val f2: T => Any = _ match {
case q => q._1(q._2)
}
Indeed, compilation of f1 fails, whereas f2 succeeds.
We can see why the compilation of f1 has to fail:
t is of type (X => X, X) forSome { type X }
Therefore, the first component t._1 is inferred to have the type (X => X) forSome { type X }.
Likewise, the second component t._2 is inferred to have the type X forSome { type X }, which is just Any.
We cannot apply an (X => X) forSome { type X } to Any, because it actually could turn out to be (SuperSpecialType => SuperSpecialType) for some SuperSpecialType.
Therefore, compilation of f1 should fail, and it indeed does fail.
To see why f2 compiles successfully, one can look at the output of the typechecker. If we save this as someFile.scala:
class O {
type T = (X => X, X) forSome { type X }
def f2: T => Any = t => t match {
case q => q._1(q._2)
}
def f2_explicit_func_arg: T => Any = t => t match {
case q => {
val f = q._1
val x = q._2
f(x)
}
}
}
and then generate the output of the typechecker with
$ scalac -Xprint:typer someFile.scala
we obtain essentially (with some noise removed):
class O extends scala.AnyRef {
type T = (X => X, X) forSome { type X };
def f2: O.this.T => Any = ((t: O.this.T) => t match {
case (q @ _) => q._1.apply(q._2)
});
def f2_explicit_func_arg: O.this.T => Any = ((t: O.this.T) => t match {
case (q @ _) => {
val f: X => X = q._1;
val x: X = q._2;
f.apply(x)
}
})
}
The second f2_explicit_func_arg version (equivalent to f2) is more enlightening than the shorter original f2-version. In the desugared and type-checked code of f2_explicit_func_arg, we see that the type X miraculously reappears, and the typechecker indeed infers:
f: X => X
x: X
so that f(x) is indeed valid.
In the more obvious work-around with an explicitly named type variable, we do manually what the compiler does for us in this case.
We could also have written:
type TypeCons[X] = (X => X, X)
list.map{ case t: TypeCons[x] => t._1(t._2) }
or even more explicitly:
list.map{ case t: TypeCons[x] => {
val func: x => x = t._1
val arg: x = t._2
func(arg)
}}
and both versions would compile for very much the same reasons as f2.
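The pieces above can be assembled into one small sketch. It assumes Scala 2, since the forSome syntax was removed in Scala 3, and relies on the behaviour reported in the question, namely that the `case x` form type-checks. The object name and the explicit `List[Any]` annotation are additions for illustration:

```scala
import scala.language.existentials

object ExistentialDemo {
  // Each element pairs a function with an argument of the same (hidden) type X.
  type T = (X => X, X) forSome { type X }

  val list = List[T](
    ((x: Int) => x * x, 42),
    ((_: String).toUpperCase, "foo")
  )

  // The pattern-matching form lets the compiler recover X for each element.
  // The result type is annotated as List[Any] because X differs per element.
  val results: List[Any] = list.map { case x => x._1(x._2) }
}
```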

Using Tuples in map, flatmap,... partial functions

If I do:
val l = Seq(("un", ""), ("deux", "hehe"), ("trois", "lol"))
l map { t => t._1 + t._2 }
It's ok.
If I do:
val l = Seq(("un", ""), ("deux", "hehe"), ("trois", "lol"))
l map { case (b, n) => b + n }
It's ok too.
But if I do:
val l = Seq(("un", ""), ("deux", "hehe"), ("trois", "lol"))
l map { (b, n) => b + n }
It will not work.
Why do I need the "case" keyword to name the elements of the tuple?
The error message with 2.11 is more explanatory:
scala> l map { (b, n) => b + n }
<console>:9: error: missing parameter type
Note: The expected type requires a one-argument function accepting a 2-Tuple.
Consider a pattern matching anonymous function, `{ case (b, n) => ... }`
l map { (b, n) => b + n }
^
<console>:9: error: missing parameter type
l map { (b, n) => b + n }
^
For an apply, you get "auto-tupling":
scala> def f(p: (Int, Int)) = p._1 + p._2
f: (p: (Int, Int))Int
scala> f(1,2)
res0: Int = 3
where you supplied two args instead of one.
But you don't get auto-untupling.
People have always wanted it to work that way.
This situation can be understood by looking at the types of the functions involved.
First, the type of the function parameter that map expects is as follows.
Tuple2[Int,Int] => B //Function1[Tuple2[Int, Int], B]
The first function expands to this.
(t:(Int,Int)) => t._1 + t._2 // type : Tuple2[Int,Int] => Int
This is ok. Then the second function.
(t:(Int, Int)) => t match {
case (a:Int, b:Int) => a + b
}
This is also ok. In the failure scenario,
(a:Int, b:Int) => a + b
Let's check the type of this function:
(Int, Int) => Int // Function2[Int, Int, Int]
So the parameter function type is wrong.
As a solution, you can convert functions of multiple arity to tupled form and back with the helper functions in the Function object. You can do the following:
val l = Seq(("un", ""), ("deux", "hehe"), ("trois", "lol"))
l map(Function.tupled((b, n) => b + n ))
Please refer to the Function API for further information.
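A compact sketch of the working alternatives, using the `l` from the question. The two-argument function `concat` is a hypothetical name introduced here, and it is given an explicit type so that the example does not depend on parameter type inference:

```scala
object TupledDemo {
  val l = Seq(("un", ""), ("deux", "hehe"), ("trois", "lol"))

  // A genuine two-argument function: Function2[String, String, String].
  val concat: (String, String) => String = (b, n) => b + n

  // One-argument functions over the tuple: both accepted by map directly.
  val viaAccessors = l.map(t => t._1 + t._2)
  val viaCase = l.map { case (b, n) => b + n }

  // Converting the two-argument function into a one-argument function on tuples.
  val viaTupledMethod = l.map(concat.tupled)
  val viaFunctionTupled = l.map(Function.tupled(concat))
}
```

All four results should be equal. Passing `concat` itself to map is the case that fails to compile in Scala 2, because map expects a function of one argument.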
The type of a function argument passed to map function applied to a sequence is inferred by the type of elements in the sequence. In particular,
scenario 1: l map { t => t._1 + t._2 } is the same as l map { (t: (String, String)) => t._1 + t._2 } but shorter, which is possible because of type inference. The Scala compiler automatically infers the type of the function to be ((String, String)) => String.
scenario 2: you can also write in longer form
l map { t => t match {
case(b, n) => b + n
}
}
scenario 3: a function of wrong type is passed to map, which is similar to
def f1 (a: String, b: String) = a + b
def f2 (t: (String, String)) = t match { case (a, b) => a + b }
l map f1 // won't work
l map f2

Example in Scala of hashmap forall() method?

Can someone please give an example of how to use the HashMap forall() method? I find the Scala docs to be impenetrable.
What I want is something like this:
val myMap = HashMap[Int, Int](1 -> 10, 2 -> 20)
val areAllValuesTenTimesTheKey = myMap.forall((k, v) => k * 10 == v)
but this gives:
error: wrong number of parameters; expected = 1
You need instead
val myMap = HashMap[Int, Int](1 -> 10, 2 -> 20)
val areAllValuesTenTimesTheKey = myMap.forall { case (k, v) => k * 10 == v }
The problem is that forall wants a function that takes a single Tuple2, rather than two arguments. (We're thinking of a Map[A,B] as an Iterable[(A,B)] when we use forall.) Using a case statement is a nice workaround; it's really using pattern matching here to break apart the Tuple2 and give the parts names.
If you don't want to use pattern matching, you could have also written
val areAllValuesTenTimesTheKey = myMap.forall(p => p._1 * 10 == p._2)
but I think that's less helpful.
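A short sketch that runs both forms and adds a failing case for contrast. The extra entry `3 -> 31` and the object name are made up for illustration:

```scala
import scala.collection.immutable.HashMap

object ForallDemo {
  val myMap = HashMap[Int, Int](1 -> 10, 2 -> 20)

  // Pattern-matching form: destructures each (key, value) tuple.
  val viaCase = myMap.forall { case (k, v) => k * 10 == v }

  // Accessor form: works on the tuple directly.
  val viaAccessors = myMap.forall(p => p._1 * 10 == p._2)

  // Hypothetical counterexample: one entry breaks the rule, so forall is false.
  val withBadEntry = (myMap + (3 -> 31)).forall { case (k, v) => k * 10 == v }
}
```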
forall is passed a single (Int, Int) Tuple (as opposed to multiple parameters). Consider this (which explicitly shows a single tuple value is decomposed):
val areAllValuesTenTimesTheKey = myMap.forall(t => t match { case (k, v) => k * 10 == v })
Or, the short-hand (a pattern-matching anonymous function):
val areAllValuesTenTimesTheKey = myMap.forall {case (k, v) => k * 10 == v}
(Both of these decompose the tuple they take in.)
Additionally, the function can be "tupled":
val myMap = Map((1,10), (2,20))
val fn = (k: Int, v: Int) => k * 10 == v
val tupled_fn = fn.tupled
val areAllValuesTenTimesTheKey = myMap.forall(tupled_fn)
myMap: scala.collection.immutable.Map[Int,Int] = Map((1,10), (2,20))
fn: (Int, Int) => Boolean = // takes in two parameters
tupled_fn: ((Int, Int)) => Boolean = // note that it now takes in a single Tuple
areAllValuesTenTimesTheKey: Boolean = true
Happy coding.
The problem with your code is that you give the forall method a function that accepts two arguments and returns a Boolean, in other words (Int, Int) => Boolean. If you look in the documentation, you will find this signature:
def forall (p: ((A, B)) => Boolean): Boolean
In this case the forall method expects Tuple2[A, B] => Boolean, so the signature can also be written like this:
def forall (p: Tuple2[A, B] => Boolean): Boolean
To fix your example, you can either call forall with a function that accepts a single tuple argument:
myMap.forall(keyVal => keyVal._1 * 10 == keyVal._2)
or pattern-match to extract the key and value:
myMap.forall {case (k, v) => k * 10 == v}
In this case you are passing a pattern-matching anonymous function, which the compiler accepts where ((Int, Int)) => Boolean is expected.