Anonymous functions and Maps in Scala

I'm not sure why this doesn't work:
scala> case class Loader(n: String, x: String, l: List[String])
scala> val m: Map[String, (List[String])=>Loader] =
| Map("x" -> Loader("x", "x1", _:List[String]))
<console>:8: error: type mismatch;
found : (List[String]) => (java.lang.String, Loader)
required: (String, (List[String]) => Loader)
Map("x" -> Loader("x", "x1", _:List[String]))
but this does?
scala> Loader("t", "x", _:List[String])
res7: (List[String]) => Loader = <function1>
scala> val m = Map("x" -> res7)
m: scala.collection.immutable.Map[java.lang.String,(List[String]) => Loader] =
Map((String,<function1>))

One more victim of the overload of _ in Scala. Consider this:
f(_, 5) + 1 // Partial function application
f(_ + 1, 5) // Closure
In the first case, _ is replacing the entire parameter. In this case, it stands for a partial application of f. In practice, it's equivalent to x => f(x, 5) + 1, as the whole expression that contains f is turned into a closure.
In the second case, _ is part of an expression. In this case, the whole expression is turned into a closure, up to any expression delimiter -- by which I mean that if the expression is nested inside another, only the inner expression is turned into a closure. In practice, it is equivalent to f(x => x + 1, 5).
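The two expansions described above can be checked with a small sketch. The helpers add and applyTo are hypothetical names introduced only for this illustration; they stand in for the two different shapes of f in the answer.

```scala
object UnderscoreScope {
  // Hypothetical helpers, named only for this illustration.
  def add(a: Int, b: Int): Int = a + b
  def applyTo(g: Int => Int, b: Int): Int = g(b)

  // `_` replaces a whole argument: the closure wraps the enclosing expression,
  // so this reads as x => add(x, 5) + 1.
  val wholeArg: Int => Int = add(_, 5) + 1

  // `_` is part of the argument expression: only that argument becomes a closure,
  // so this reads as applyTo(x => x + 1, 5).
  val innerExpr: Int = applyTo(_ + 1, 5)

  // The same rule explains the question: the closure swallows the `->` as well,
  // giving a function that returns a tuple rather than a tuple holding a function.
  val pairFn: Int => (String, Int) = "x" -> add(_, 5)

  def main(args: Array[String]): Unit = {
    println(wholeArg(2))
    println(innerExpr)
    println(pairFn(1))
  }
}
```

The type annotations on the vals are there so the compiler can infer the parameter type of each placeholder.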

The parser was not sure where to put the beginning of the anonymous function. Sometimes you can solve this by adding another pair of parentheses (though not always):
val m: Map[String, (List[String])=>Loader] =
Map("x" -> (Loader("x", "x1", _:List[String])))
I don’t see any ambiguity here, so it might just not have been smart enough to figure it out. I think the parser overlooked the possibility of an anonymous function starting right after the -> (which is also a library construct and uses implicit magic and all the wicked stuff that makes the little parser’s mind loop).
When you write it as an explicit tuple, it’ll work fine.
val m: Map[String, (List[String])=>Loader] =
Map(("x", Loader("x", "x1", _:List[String])))
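Both workarounds can be put side by side in a self-contained sketch. The object and val names are made up for the example; Loader is the case class from the question.

```scala
object LoaderMap {
  case class Loader(n: String, x: String, l: List[String])

  // Extra parentheses stop the closure from extending past Loader(...).
  val viaParens: Map[String, List[String] => Loader] =
    Map("x" -> (Loader("x", "x1", _: List[String])))

  // An explicit tuple has the same effect: each tuple element is its own expression.
  val viaTuple: Map[String, List[String] => Loader] =
    Map(("x", Loader("x", "x1", _: List[String])))

  def main(args: Array[String]): Unit = {
    println(viaParens("x")(List("a", "b")))
    println(viaTuple("x")(List("a", "b")))
  }
}
```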

Related

Scala: Is it possible to get partially applied function from leftfold?

I'm currently learning Scala, and I was wondering about fold-left.
Since fold-left is curried, you should be able to get a partially applied function (PAF) by supplying only the first parameter, as below.
(0 /: List(1, 2, 3)) _
But I actually got an error:
<console>:8: error: missing arguments for method /: in trait TraversableOnce;
follow this method with `_' if you want to treat it as a partially applied function
Then I tried the same thing with fold-right, as below:
(List(1, 2, 3) :\ 0) _
This worked, and I got a PAF of type ((Int, Int) => Int) => Int.
I know I can get a PAF by using foldLeft method, but I wonder whether it is possible to express it with '/:' or not.
The underscore syntax does not work well with right-associative methods that take multiple parameter lists. Here are the options I see:
Declare a variable type:
val x: ((Int, Int) => Int) => Int = 0 /: List(1, 2, 3)
Similarly, use type ascription:
val x = (0 /: List(1,2,3)) : ((Int, Int) => Int) => Int
Use regular method-call (dot) notation:
val x = List(1,2,3)./:(0) _
Use the foldLeft synonym:
val x = List(1,2,3).foldLeft(0) _
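As a rough illustration of the first and last options combined, here is a sketch that declares the expected function type and uses foldLeft. The names are invented for the example, and it assumes the compiler eta-expands the remaining parameter list when a function type is expected, which is what the "declare a variable type" option relies on.

```scala
object FoldPartial {
  val xs: List[Int] = List(1, 2, 3)

  // The declared function type asks the compiler to turn the remaining
  // parameter list of foldLeft into a function value.
  val foldFromZero: ((Int, Int) => Int) => Int = xs.foldLeft(0)

  // Supplying the operator later completes the fold.
  val sum: Int = foldFromZero(_ + _)
  val difference: Int = foldFromZero((acc, x) => acc - x)

  def main(args: Array[String]): Unit = {
    println(sum)
    println(difference)
  }
}
```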
I played around with it, and couldn't find a configuration that works.
There's always the more explicit:
val f = List(1,2,3,4,5).foldLeft(0)_
Which is arguably neater. I'll keep poking around though.
Edit:
There's this:
val f2 = (0 /: List(1,2,3,4,5))(_: (Int,Int) => Int)
val x = f2(_+_)
But that's getting pretty ugly. Without the type annotation, it complains. That's the best I could do though.

Scala groupBy + mapValues vs. groupBy + map + breakOut

Let's say I have data like this:
scala> case class Foo(a: Int, b: Int)
defined class Foo
scala> val data: List[Foo] = Foo(1,10) :: Foo(2, 20) :: Foo(3,30) :: Nil
data: List[Foo] = List(Foo(1,10), Foo(2,20), Foo(3,30))
I know that in my data, there will be no instances of Foo with the same value of field a - and I want to transform it to Map[Int, Foo] (I don't want Map[Int, List[Foo]])
I can either:
scala> val m: Map[Int,Foo] = data.groupBy(_.a).mapValues(_.head)
m: Map[Int,Foo] = Map(2 -> Foo(2,20), 1 -> Foo(1,10), 3 -> Foo(3,30))
or:
scala> val m: Map[Int,Foo] = data.groupBy(_.a).map(e => e._1 -> e._2.head)(collection.breakOut)
m: Map[Int,Foo] = Map(2 -> Foo(2,20), 1 -> Foo(1,10), 3 -> Foo(3,30))
My questions:
1) How could I make the implementation with breakOut more concise / idiomatic?
2) What should I be aware of "under the covers" in each of the above-two solutions? I.e. hidden memory / compute costs. In particular, I am looking for a "layperson's" explanation of breakOut that does not necessarily involve an in-depth discussion of the signature of map.
3) Are there any other solutions I should be aware of (including, for example, using libraries such as ScalaZ)?
1) As pointed out by @Kigyo, the right answer, given that there are no duplicate values of a, wouldn't use groupBy:
val m: Map[Int,Foo] = data.map(e => e.a -> e)(breakOut)
Using groupBy is good when there could be duplicate values of a, but it is totally unnecessary given your problem.
2) First, don't use mapValues if you plan on accessing values multiple times. The .mapValues method does not create a new Map (like the .map method does). Instead, it creates a view of a Map that recomputes the function (_.head in your case) every time it is accessed. If you plan on accessing things a lot, consider map{case (a,b) => a -> ??} instead.
Second, passing the breakOut function as the CanBuildFrom parameter does not incur additional costs. The reason for this is that the CanBuildFrom parameter is always present, just sometimes it's implicit. The true signature is this:
def map[B, That](f: (A) ⇒ B)(implicit bf: CanBuildFrom[List[A], B, That]): That
The purpose of the CanBuildFrom is to tell scala how to make a That out of the result of mapping (which is a collection of Bs). If you leave off breakOut, then it uses an implicit CanBuildFrom, but either way, there must be a CanBuildFrom so that there is some object that is able to build the That out of the Bs.
Finally, in your example with breakOut, the breakOut is completely redundant since groupBy produces a Map, so .map on a Map gives you back a Map by default.
val m: Map[Int,Foo] = data.groupBy(_.a).map(e => e._1 -> e._2.head)
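The points above can be sketched without breakOut, which was removed from the standard library in Scala 2.13; map followed by toMap is used here as a stand-in that also works on older versions. The counters are hypothetical instrumentation added only to make the evaluation behaviour of mapValues visible, and mapValues on a Map is deprecated from 2.13 onward, so expect a warning there.

```scala
object GroupByDemo {
  case class Foo(a: Int, b: Int)
  val data: List[Foo] = List(Foo(1, 10), Foo(2, 20), Foo(3, 30))

  // Keys are unique, so pair each element with its key and build the Map directly.
  val byA: Map[Int, Foo] = data.map(e => e.a -> e).toMap

  // Hypothetical counters that record how often each mapping function runs.
  var lazyCalls = 0
  var strictCalls = 0

  // mapValues wraps the grouped map; the function runs on every lookup.
  val lazyHeads = data.groupBy(_.a).mapValues { xs => lazyCalls += 1; xs.head }

  // map builds a new Map up front; the function runs once per entry.
  val strictHeads: Map[Int, Foo] =
    data.groupBy(_.a).map { case (k, xs) => strictCalls += 1; k -> xs.head }

  def main(args: Array[String]): Unit = {
    lazyHeads(1); lazyHeads(1)     // each lookup re-runs the function
    strictHeads(1); strictHeads(1) // lookups only read stored values
    println(s"lazyCalls=$lazyCalls strictCalls=$strictCalls")
  }
}
```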

In scala, compile error goes away when adding irrelevant line after toMap

This is cross-posted from the coursera functional programming course because there's a lot less activity on that forum.
I wrote the following code (parts are redacted because it's homework):
type Occurrences = List[(Char, Int)]
def subtract(x: Occurrences, y: Occurrences): Occurrences = {
val mx: Map[Char, Int] = x toMap
y.foldLeft (redacted) (redacted => simple expression using updated and -)) toList
}
This produces the following compile error:
type mismatch; found : Map[Char,Int] required: <:<[(Char, Int), (?, ?)]
However if I add a copy of the third line, without the toList, in between via a val statement, the error goes away:
type Occurrences = List[(Char, Int)]
def subtract(x: Occurrences, y: Occurrences): Occurrences = {
val mx: Map[Char, Int] = x toMap
val foo: Map[Char, Int] = y.foldLeft (redacted) (redacted => simple expression using updated and -))
y.foldLeft (redacted) (redacted => simple expression using updated and -)) toList
}
I'm guessing this has something to do with giving some kind of extra hint to the type checker, but does anyone know specifically why this happens?
Below are a few examples and an explanation of why this happens.
First, a working and a non-working case:
scala> { List('a -> 1, 'b -> 2).toMap
| println("aaa") }
aaa
scala> { List('a -> 1, 'b -> 2) toMap
| println("aaa") }
<console>:9: error: type mismatch;
found : Unit
required: <:<[(Symbol, Int),(?, ?)]
println("aaa") }
^
This happens because the syntax "obj method arg" is parsed as "obj.method(arg)", and so is "obj method \n arg": the argument may be written on the next line. Notice below:
scala> { val x = List('a -> 1, 'b -> 2) map
| identity
|
| println(x) }
List(('a,1), ('b,2))
It's the same as List('a -> 1, 'b -> 2).map(identity).
Now for the weird error message found : Unit, required: <:<[(Symbol, Int),(?, ?)]. It turns out that toMap actually takes one argument; here is its signature:
def toMap[T, U](implicit ev: <:<[A,(T, U)]): Map[T,U]
It is an implicit argument, so it normally doesn't need to be provided explicitly. But when you use the obj method \n arg syntax, the expression on the next line is passed as that argument. In the non-working example above, the argument is println("aaa"), which has type Unit, hence the compiler rejects it.
One workaround is to have two \n to separate the lines:
scala> { List('a -> 1, 'b -> 2) toMap
|
| println("aaa") }
aaa
You can also use a ; to separate the lines.
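To make the hidden parameter concrete, here is a small sketch (names invented for the example) that calls toMap with dot notation and then passes the evidence argument by hand. It uses Char keys rather than the symbol literals from the REPL session, since those are not available in every Scala version.

```scala
object ToMapEvidence {
  val pairs: List[(Char, Int)] = List('a' -> 1, 'b' -> 2)

  // Dot notation: the call ends with the line, so the next line
  // cannot be mistaken for an argument.
  val viaDot: Map[Char, Int] = pairs.toMap

  // The implicit evidence that the compiler normally supplies,
  // looked up and passed explicitly to show that toMap has a parameter.
  val ev = implicitly[(Char, Int) <:< (Char, Int)]
  val viaExplicit: Map[Char, Int] = pairs.toMap(ev)

  def main(args: Array[String]): Unit = {
    println(viaDot)
    println(viaExplicit == viaDot)
  }
}
```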
@RexKerr and @DidierDupont are right: you're having issues because you called toMap like a binary operator, so the compiler freaked out.
My two cents: you should probably read the Suffix Notation section of the Scala Style Guide.

How does this recursive List flattening work?

A while back this was asked and answered on the Scala mailing list:
Kevin:
Given some nested structure: List[List[...List[T]]]
what's the best (preferably type-safe) way to flatten it to a List[T]
Jesper:
A combination of implicits and default arguments works:
case class Flat[T, U](fn : T => List[U])
implicit def recFlattenFn[T, U](implicit f : Flat[T, U] = Flat((l : T) => List(l))) =
Flat((l : List[T]) => l.flatMap(f.fn))
def recFlatten[T, U](l : List[T])(implicit f : Flat[List[T], U]) = f.fn(l)
Examples:
scala> recFlatten(List(1, 2, 3))
res0: List[Int] = List(1, 2, 3)
scala> recFlatten(List(List(1, 2, 3), List(4, 5)))
res1: List[Int] = List(1, 2, 3, 4, 5)
scala> recFlatten(List(List(List(1, 2, 3), List(4, 5)), List(List(6, 7))))
res2: List[Int] = List(1, 2, 3, 4, 5, 6, 7)
I have been looking at this code for a while. I cannot figure out how it works. There seems to be some recursion involved... Can anybody shed some light? Are there other examples of this pattern and does it have a name?
Oh wow, this is an old one! I'll start by cleaning up the code a bit and pulling it into line with current idiomatic conventions:
case class Flat[T, U](fn: T => List[U])
implicit def recFlattenFn[T, U](
implicit f: Flat[T, U] = Flat((xs: T) => List(xs))
) = Flat((xs: List[T]) => xs flatMap f.fn)
def recFlatten[T, U](xs: List[T])(implicit f: Flat[List[T], U]) = f fn xs
Then, without further ado, break down the code. First, we have our Flat class:
case class Flat[T, U](fn: T => List[U])
This is nothing more than a named wrapper for the function T => List[U], a function that will build a List[U] when given an instance of type T. Note that T here could also be a List[U], or a U, or a List[List[List[U]]], etc. Normally, such a function could be directly specified as the type of a parameter. But we're going to be using this one in implicits, so the named wrapper avoids any risk of an implicit conflict.
Then, working backwards from recFlatten:
def recFlatten[T, U](xs: List[T])(implicit f: Flat[List[T], U]) = f fn xs
This method will take xs (a List[T]) and convert it to a List[U]. To achieve this, it locates an implicit instance of Flat[List[T], U] and invokes the enclosed function, fn.
Then, the real magic:
implicit def recFlattenFn[T, U](
implicit f: Flat[T, U] = Flat((xs: T) => List(xs))
) = Flat((xs: List[T]) => xs flatMap f.fn)
This satisfies the implicit parameter required by recFlatten, and it also takes another implicit parameter. Most crucially:
recFlattenFn can act as its own implicit parameter
it returns a Flat[List[X], X], so recFlattenFn will only be implicitly resolved as a Flat[T,U] if T is a List
the implicit f can fallback to a default value if implicit resolution fails (i.e. T is NOT a List)
Perhaps this is best understood in the context of one of the examples:
recFlatten(List(List(1, 2, 3), List(4, 5)))
The type T is inferred as List[Int], since the argument xs is a List[T] and here it is a List[List[Int]]
implicit lookup is attempted for a Flat[List[List[Int]], U]
this is matched by a recursively defined recFlattenFn
Broadly speaking:
recFlattenFn[List[List[Int]], U] ( f =
recFlattenFn[List[Int], U] ( f =
Flat[Int,U]((xs: T) => List(xs)) //default value
)
)
Note that recFlattenFn will only match an implicit search for a Flat[List[X], X] and the type params [Int,_] fail this match because Int is not a List. This is what triggers the fallback to the default value.
Type inference also works backwards up that structure, resolving the U param at each level of recursion:
recFlattenFn[List[List[Int]], Int] ( f =
recFlattenFn[List[Int], Int] ( f =
Flat[Int,Int]((xs: T) => List(xs)) //default value
)
)
Which is just a nesting of Flat instances, each one (except the innermost) performing a flatMap operation to unroll one level of the nested List structure. The innermost Flat simply wraps all the individual elements back up in a single List.
Q.E.D.
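For readers who want to see the nesting without any implicits, the following sketch builds by hand the chain of Flat instances that the explanation above attributes to implicit resolution. The val names are hypothetical, and this is only an illustration of the resulting structure, not of how the compiler searches for it.

```scala
object FlatNesting {
  case class Flat[T, U](fn: T => List[U])

  // Innermost level: the default argument, wrapping a single element in a List.
  val leaf: Flat[Int, Int] = Flat((x: Int) => List(x))

  // One level up: flatMap over a List[Int] using the leaf.
  val oneDeep: Flat[List[Int], Int] =
    Flat((xs: List[Int]) => xs.flatMap(leaf.fn))

  // Two levels up: flatMap over a List[List[Int]] using the level below.
  val twoDeep: Flat[List[List[Int]], Int] =
    Flat((xs: List[List[Int]]) => xs.flatMap(oneDeep.fn))

  def main(args: Array[String]): Unit =
    println(twoDeep.fn(List(List(1, 2, 3), List(4, 5))))
}
```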
Maybe a good approach is to look at how the types are inferred. To avoid ambiguity, let us rename the type parameters:
case class Flat[T, U](fn : T => List[U])
implicit def recFlattenFn[T2, U2](implicit f : Flat[T2, U2] =
Flat((l : T2) => List(l))) =
Flat((l : List[T2]) => l.flatMap(f.fn))
def recFlatten[T3, U3](l : List[T3])(implicit f : Flat[List[T3], U3]) = f.fn(l)
In the first case, res0, the type of T3 is Int. You cannot yet infer the type of U3, but you know that you will need a Flat[List[Int], U3] object that will be provided implicitly. There is only one "implicit candidate": the result of the recFlattenFn function, whose type is Flat[List[T2], U2]. Thus T2 = Int and U2 = U3 (which we still need to infer).
Now, if we want to be able to use recFlattenFn, we must provide it a parameter f. Here is the trick: you can either use an implicit of type Flat[Int, U2] or the default value, which wraps a function of type Int => List[Int]. Let us look at the available implicits. As explained before, recFlattenFn can provide a Flat[List[T2], U2] object (for a new T2 and U2). That does not fit the expected type of f at this point. Thus no implicit is a good candidate here, and we must use the default argument. As the type of the default argument is Int => List[Int], U2 and U3 are Int, and there we go.
I hope this long explanation helps. I leave the resolution of res1 and res2 to you.

Why is there a difference in behavior between these two pattern matches in a for comprehension?

Consider this Map[String, Any]:
val m1 = Map(("k1" -> "v1"), ("k2" -> 10))
Now let's write a for:
scala> for ((a, b) <- m1) println(a + b)
k1v1
k210
So far so good.
Now let's specify the type of the second member:
scala> for ((a, b: String) <- m1) println(a + b)
k1v1
scala> for ((a, b: Integer) <- m1) println(a + b)
k210
Here, as I specify a type, filtering takes place, which is great.
Now say I want to use an Array[Any] instead:
val l1 = Array("a", 2)
Here, things break:
scala> for (v: String <- l1) println(v)
<console>:7: error: type mismatch;
found : (String) => Unit
required: (Any) => ?
My double question is:
why doesn't the second match filter as well?
is there a way to express such filtering in the second scenario without using a dirty isInstanceOf?
Well, the latter example doesn't work because it isn't spec'ed to. There's some discussion as to what the reasonable behavior would be. Personally, I'd expect it to work, just as you do. The thing is that:
val v: String = (10: Any) // is a compile error
(10: Any) match {
case v: String =>
} // throws an exception
If you are not convinced by this, join the club. :-) Here's a workaround:
for (va @ (v: String) <- l1) println(v)
Note that in Scala 3, you can:
for (case v: String <- l1) println(v)
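Another way to get the same filtering without isInstanceOf, and without depending on which for-comprehension syntax a given Scala version accepts, is collect with a typed pattern. This is an alternative to the workarounds above rather than something the answer itself proposes; the object and val names are made up for the sketch.

```scala
object TypeFilter {
  val l1: Array[Any] = Array("a", 2)

  // collect keeps only the elements the partial function is defined for,
  // so the typed pattern acts as both the filter and the cast.
  val strings: Array[String] = l1.collect { case v: String => v }
  val ints: Array[Int] = l1.collect { case i: Int => i }

  def main(args: Array[String]): Unit = {
    strings.foreach(println)
    ints.foreach(println)
  }
}
```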
The main reason for the spec'ed behavior is that we want to encourage people to add type annotations for clarity. If, in for comprehensions, those annotations silently turned into potentially very costly filter operations, that would be a trap we want to avoid. However, I agree that we should make it easier to specify that something is a pattern. Probably a single pair of parens should suffice.
val x: String = y // type check, can fail at compile time
val (x: String) = y // pattern match, can fail at run time
for (x: String <- ys) // type check, can fail at compile time
for ((x: String) <- ys) // pattern match, can filter at run time