Partial Function Application in Scala

I'm learning Functional Programming by following the book Functional Programming in Scala by Paul Chiusano and Rúnar Bjarnason. I'm specifically on chapter 3, where I am implementing some companion functions for a class representing a singly-linked list, which the authors provide.
package fpinscala.datastructures

sealed trait List[+A]
case object Nil extends List[Nothing]
case class Cons[+A](head: A, tail: List[A]) extends List[A]

object List {
  def sum(ints: List[Int]): Int = ints match {
    case Nil => 0
    case Cons(x, xs) => x + sum(xs)
  }

  def product(ds: List[Double]): Double = ds match {
    case Nil => 1.0
    case Cons(0.0, _) => 0.0
    case Cons(x, xs) => x * product(xs)
  }

  def apply[A](as: A*): List[A] =
    if (as.isEmpty) Nil
    else Cons(as.head, apply(as.tail: _*))

  def tail[A](ls: List[A]): List[A] = ls match {
    case Nil => Nil
    case Cons(x, xs) => xs
  }

  ... (more functions)
}
The functions I am implementing go inside the object List, as companion functions.
While implementing dropWhile, whose method signature is:
def dropWhile[A](l: List[A])(f: A => Boolean): List[A]
I came across some questions regarding partial function application:
In the book, the authors say that the predicate, f, is passed in a separate argument group to help the Scala compiler with type inference: if we do this, Scala can determine the type of f without any annotation, based on what it knows about the type of the List, which makes the function more convenient to use.
So, if we passed f in the same argument group, Scala would force the call to become something like this: val total = List.dropWhile(example, (x: Int) => 6 % x == 0), where we define the type of x explicitly, and we would "lose" the possibility of partial function application. Am I right?
However, why is partial function application useful in this case? Only to allow for type inference? Does it make sense to "partially apply" a function like dropWhile without applying the predicate f? It seems to me that the computation is "halted" before becoming useful if we don't apply f...
So... why is partial function application useful? And is this how it's always done, or is it something specific to Scala? I know Haskell has something called "complete inference", but I don't know exactly what its implications are...
Thanks in advance

There are a couple of questions scattered in there, so I'll try to answer them separately.
About the type inference: yes, separating the parameter lists helps the compiler infer the type of f.
This is because Scala has linear local type inference (from left to right) and it uses the first parameter list to infer A (from the type of l). Then it can use this information to infer the type of f.
Given for example
dropWhile(List(1, 2, 3))(x => x < 3)
the compiler will perform these steps:
First parameter list:
- A is unknown
- a List[A] is expected
- a List[Int] is provided (inferred from the type of the elements of the List)
- therefore A is Int
Second parameter list:
- we know A = Int
- so we're expecting a function Int => Boolean as f
If you don't separate the two parameter lists, the compiler can't "stop" and decide the type of A before type-checking f: f would be part of the "conversation" while deciding the type of A, so you would need to annotate it.
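To make the contrast concrete, here is a minimal sketch (dropWhile1 and dropWhile2 are hypothetical names; both simply delegate to the standard library):

def dropWhile1[A](l: List[A], f: A => Boolean): List[A] = l.dropWhile(f) // one parameter list
def dropWhile2[A](l: List[A])(f: A => Boolean): List[A] = l.dropWhile(f) // two parameter lists

// dropWhile1(List(1, 2, 3), x => x < 3)     // error: missing parameter type for x
dropWhile1(List(1, 2, 3), (x: Int) => x < 3) // annotation required: A and f are inferred together
dropWhile2(List(1, 2, 3))(x => x < 3)        // A is fixed to Int first, so x needs no annotation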
This is something Haskell can do better, since it uses a different type system (Hindley-Milner) that can also use information deriving from the context in which the function is applied. This is why it's also called "complete" or "universal".
Why doesn't Scala feature a Hindley-Milner type system? Long story short: because Scala also supports subtyping, which does not coexist easily with such a powerful type system. More on the subject:
Why is Scala's type inference not as powerful as Haskell's?
http://www.codecommit.com/blog/scala/what-is-hindley-milner-and-why-is-it-cool
http://www.scala-lang.org/old/node/4654
About partial application, the question "why is it useful" is definitely too broad to be answered here. However, in the specific dropWhile case, suppose you have a list of functions representing different "drop" conditions. Using a partially applied function you could do:
val list = List(1, 2, 3)
val conditions: List[Int => Boolean] = List(_ < 1, _ < 2, _ < 3)
conditions.map(dropWhile(list)) // List(List(1, 2, 3), List(2, 3), List(3))
Obviously, with a non-curried function (i.e. a single parameter list) you could have achieved the same with
val list = List(1, 2, 3)
val conditions: List[Int => Boolean] = List(_ < 1, _ < 2, _ < 3)
conditions.map(cond => dropWhile(list, cond))
but currying allows for more flexibility while composing functions.
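For instance (a small sketch, with dropWhile again delegating to the standard library), the partially applied function is a first-class value that can be named and composed:

def dropWhile[A](l: List[A])(f: A => Boolean): List[A] = l.dropWhile(f)

val dropFrom: (Int => Boolean) => List[Int] = dropWhile(List(1, 2, 3))
val dropThenSum: (Int => Boolean) => Int = dropFrom.andThen(_.sum)
dropThenSum(_ < 3) // 3: drops 1 and 2, leaving List(3)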
More on the subject:
https://softwareengineering.stackexchange.com/questions/185585/what-is-the-advantage-of-currying
What's the difference between multiple parameters lists and multiple parameters per list in Scala?

Related

Curry in scala with parametric types

The authors of Functional Programming in Scala give this as the definition of curry in scala:
def curry[A, B, C](f: (A, B) => C): A => (B => C) =
  a => b => f(a, b)
However, if we apply it to a function taking parametric types, e.g.:
def isSorted[A](as: Array[A], ordered: (A, A) => Boolean): Boolean =
  if (as.size < 2) true
  else as.zip(as.drop(1)).map(ordered.tupled).reduce(_ && _)
Then the result wants A (in isSorted) to be Nothing:
scala> curry(isSorted)
res29: Array[Nothing] => (((Nothing, Nothing) => Boolean) => Boolean) = <function1>
This is obviously not what is desired. Should curry be defined differently, or called differently, or is it not practical to implement curry in Scala?
You're running into two separate problems here. The first is that isSorted, when passed to curry, is forced to become monomorphic. The second is that Scala's type inference is failing you here.
This is one of those times where the difference between a function and a method matters in Scala. isSorted is eta-expanded into a function, which is a Scala value, not a method. Scala values are always monomorphic; only methods can be polymorphic. For any method of type (A, B)C (this is the syntax for a method type, and is different from (A, B) => C, which is a function type and therefore a value type), the default eta-expansion is going to result in the supertype of all functions of that arity, namely (Nothing, Nothing) => Any. This is responsible for all the Nothings you see (you don't see any Anys because isSorted is monomorphic in its return value).
You might imagine, despite the monomorphic nature of Scala values, that you could ideally do something like
def first[A, B](x: A, y: B): A = x
curry(first)(5)(6) // This doesn't compile
This is Scala's local type inference biting you. It works on separate parameter lists from left to right: first is the first thing to get a type inferred and, as mentioned above, it gets inferred to be (Nothing, Nothing) => Any. This clashes with the Ints that follow.
As you've realized, one way of getting around this is annotating your polymorphic method that you pass to curry so that it eta-expands into the correct type. This is almost certainly the way to go.
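For example, a minimal sketch of that approach, reusing the hypothetical first from above:

def curry[A, B, C](f: (A, B) => C): A => B => C = a => b => f(a, b)
def first[A, B](x: A, y: B): A = x

// curry(first)(5)(6)        // fails: first eta-expands before A and B are known
curry(first[Int, Int])(5)(6) // compiles and returns 5: annotating fixes A and B up front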
Another thing you could possibly do (although I don't think it'll serve anything except pedagogical purposes) is to curry curry itself and define it as follows:
def curryTwo[A, B, C](x: A)(y: B)(f: (A, B) => C): C = f(x, y)
On the one hand, the below works now because of the left-to-right type inference.
curryTwo(5)(6)(first) // 5
On the other hand, to use curryTwo in the scenarios where you'd want to use curry, you're going to need to provide types to Scala's type inference engine anyway.
It turns out I can call curry like this:
curry(isSorted[Int])
Which yields:
scala> curry(isSorted[Int])
res41: Array[Int] => (((Int, Int) => Boolean) => Boolean) = <function1>
See https://stackoverflow.com/a/4593509/21640

Reason for type inference limitations in Scala compiler when dealing with partially applied functions

In Scala, when using partially applied functions vs curried functions, we have to deal with a different way of handling type inference. Let me show it with an example, using a basic filtering function (examples taken from the excellent Functional Programming in Scala book):
1) Partially applied function
def dropWhile[A](l: List[A], f: A => Boolean): List[A] = l match {
  case Nil => Nil
  case x :: xs if f(x) => dropWhile(xs, f)
  case _ => l
}
2) Curried partially applied function
def dropWhileCurried[A](l: List[A])(f: A => Boolean): List[A] = l match {
  case Nil => Nil
  case x :: xs if f(x) => dropWhileCurried(xs)(f)
  case _ => l
}
Now, while the implementation is identical in both versions, the difference comes when we call these functions. While the curried version can be simply called like:
dropWhileCurried(List(1,2,3,4,5))(x => x < 3)
This same form (omitting the type of x) cannot be used with the non-curried one:
dropWhile(List(1,2,3,4,5), x => x < 3)
<console>:9: error: missing parameter type
dropWhile(List(1,2,3,4,5), x => x < 3)
So this form must be used instead:
dropWhile(List(1,2,3,4,5), (x: Int) => x < 3)
I understand this is the case, and I know there are other questions on SO regarding this fact, but what I am trying to understand is why this is the case. What is the reason for the Scala compiler to treat these two kinds of function definitions differently when it comes to type inference?
Firstly, neither of your examples is a partially applied function. A partially applied function (not to be confused with a PartialFunction) is a function to which only some of its arguments have been applied; in your examples, all the arguments are in place.
But you can easily turn the 2nd example into a partially applied (and curried) function: val a = dropWhileCurried(List(new B, new B)) _. Now a has only the first argument applied, and you need to apply the 2nd to execute it: println(a(x => true)). You can do the same with the 1st example: val a = dropWhile(List(new B, new B), _: B => Boolean).
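Spelled out as a compiling sketch (using Int elements instead of the undefined class B, and delegating to the standard library's dropWhile):

def dropWhileCurried[A](l: List[A])(f: A => Boolean): List[A] = l.dropWhile(f)

val a = dropWhileCurried(List(1, 2)) _ // only the first argument list is applied
println(a(_ => true))                  // List(): supplying the second list runs it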
Now as for the inference and why it works like that: I can only guess, but it sounds quite reasonable to me. You can think of each argument in the function as equal in importance, but if inference worked there and you wrote dropWhile(List(new B, new B), _ => true), you'd assume that _ is of type B; however, dropWhile(List(new B, new B), (_: A) => true) is also possible if B extends A. In that case, changing the order of the arguments would change the inference or break it entirely: dropWhile(_ => true, List(new B, new B)). It would definitely make inference quite complicated for the compiler, as it would have to scan the definition several times.
Now if you get back to partial application and think of the call dropWhileCurried(xs)(f) as always being a partial application of xs to dropWhileCurried, followed by an application of f to the result of the previous operation, it starts to sound reasonable. The compiler needs to infer the type once you have written dropWhileCurried(xs), because this is a partial application (even though the trailing _ is missing). Then, once the type is inferred, it can continue and apply (f) to the result.
This is at least how I perceive the question. There might be more reasons, but this should help you understand some of the background if you don't receive a better answer.

Behavior of flatMap when applied to List[Option[T]]

Let's take a look at this code:
scala> val a = List(Some(4), None)
a: List[Option[Int]] = List(Some(4), None)
scala> a.flatMap(e => e)
res0: List[Int] = List(4)
Why would applying flatMap with the function { e => e } on a List[Option[T]] return a List[T] with the None elements removed?
Specifically, what is the conceptual reasoning behind it -- is it based on some existing theory in functional programming? Is this behavior common in other functional languages?
This, while indeed useful, does feel a bit magical and arbitrary at the same time.
EDIT:
Thank you for your feedbacks and answer.
I have rewritten my question to put more emphasis on the conceptual nature of the question. Rather than the Scala specific implementation details, I'm more interested in knowing the formal concepts behind it.
Let's first look at the Scaladoc for Option's companion object. There we see an implicit conversion:
implicit def option2Iterable[A](xo: Option[A]): Iterable[A]
This means that any option can be implicitly converted to an Iterable, resulting in a collection with zero or one elements. If you have an Option[A] where you need an Iterable[A], the compiler will add the conversion for you.
In your example:
val a = List(Some(4), None)
a.flatMap(e => e)
We are calling List.flatMap, which takes a function A => GenTraversableOnce[B]. In this case, A is Option[Int] and B will be inferred as Int because, through the magic of implicit conversion, e, when returned from that function, will be converted from an Option[Int] to an Iterable[Int] (which is a subtype of GenTraversableOnce).
At this point, we've essentially done the following:
List(List(1), Nil).flatMap(e => e)
Or, to make our implicit explicit:
List(Option(1), None).flatMap(e => e.toList)
flatMap then works on Option as it does for any linear collection in Scala: take a function of A => List[B] (again, simplifying) and produce a flattened List[B], un-nesting the nested collections in the process.
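To see the conversion in isolation (a small sketch; the compiler inserts option2Iterable wherever an Iterable is expected):

val one: Iterable[Int] = Some(4)    // converted to an Iterable with one element
val zero: Iterable[Int] = None      // converted to an empty Iterable
List(Some(4), None).flatMap(e => e) // List(4): None contributes nothing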
I assume you mean the support for mapping and filtering at the same time with flatMap:
scala> List(1, 2).flatMap {
| case i if i % 2 == 0 => Some(i)
| case i => None
| }
res0: List[Int] = List(2)
This works because Option's companion object includes an implicit conversion from Option[A] to Iterable[A], which is a GenTraversableOnce[A], which is what flatMap expects as the return type for its argument function.
It's a convenient idiom, but it doesn't really exist in other functional languages (at least the ones I'm familiar with), since it relies on Scala's peculiar mix of subtyping, implicit conversions, etc. Haskell, for example, does provide similar functionality for lists through mapMaybe.
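In Scala itself the same map-and-filter step is often written with collect, which takes a partial function (a minimal sketch):

List(1, 2).collect { case i if i % 2 == 0 => i } // List(2)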
A short answer to your question is: the flatMap method of the List type is defined to work with a more general function type, not just a function that only produces a List[B] result type.
The general result type is IterableOnce[B], as shown in the flatMap method signature: final def flatMap[B](f: (A) => IterableOnce[B]): List[B]. The flatMap implementation is rather simple: it applies the f function to each element and iterates over the result in a nested loop. All results from the nested loop are added to a result of type List[B].
Therefore flatMap works with any function that produces an IterableOnce[B] from each list element. IterableOnce is a trait that defines a minimal iteration interface and is inherited by all the collection types (Set, Map, etc.); Option values are adapted to it via the implicit conversion described in the other answers.
The Option implementation returns collection.Iterator.empty for None and collection.Iterator.single(x) for Some(x). Therefore the flatMap method skips None elements.
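The shape of that implementation can be sketched as follows (a simplification for illustration, not the actual standard-library code):

def flatMapSketch[A, B](xs: List[A])(f: A => IterableOnce[B]): List[B] = {
  val buf = List.newBuilder[B]
  for (x <- xs) {          // outer loop over the list elements
    val it = f(x).iterator // each element yields zero or more results
    while (it.hasNext)     // inner loop adds them to the builder
      buf += it.next()
  }
  buf.result()
}

flatMapSketch(List(Some(4), None))(e => e) // List(4): None contributes nothing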
The question uses the identity function. It is better to use the flatten method when the purpose is simply to flatten nested iterable elements.
scala> val a = List(Some(4), None)
a: List[Option[Int]] = List(Some(4), None)
scala> a.flatten
res0: List[Int] = List(4)

Multiple brackets in function invocation

I'm a bit confused by this Scala notation:
List(1, 2, 3).foldLeft(0)((x, acc) => acc+x)
Both "0" and the function are arguments for foldLeft, why are they passed in two adjacent brackets groups? I'd aspect this to work:
List(1, 2, 3).foldLeft(0, ((x, acc) => acc+x))
But it doesn't. Can anyone explain this to me? Also, how and why does one declare such a function? Thanks
Scala allows you to have multiple argument lists:
def foo(a: Int)(b: String) = ???
def bar(a: Int)(b: String)(c: Long) = ???
The reason for using such syntax for foldLeft is the way the compiler does type inference: types already inferred in a previous group of arguments are used to infer types in subsequent argument groups. In the case of foldLeft this allows you to drop the type ascription next to (x, acc), so instead of:
List(1, 2, 3).foldLeft(0)((x: Int, acc: Int) => acc+x)
you can write just
List(1, 2, 3).foldLeft(0)((x, acc) => acc+x)
This is an example of multiple parameter lists in Scala. They're really just syntactic sugar for a normal method call (if you look at the class file's method signatures with javap, you'll see that when compiled to Java bytecode they're all combined into a single argument list). The reasons for supporting multiple parameter lists are twofold:
Passing functions as arguments: Scala will allow you to replace a parameter list that takes a single argument with a function literal in curly braces {}. For example, your code could be re-written as List(1, 2, 3).foldLeft(0) { (x, acc) => acc+x }, which might be considered more readable. (Then again, I'd just use List(1, 2, 3).foldLeft(0)(_+_) in this case...) Being able to use curly braces like this makes it possible for the user to declare new functions that look more like native syntax. A good example of this is the react function for Actors.
Type inference: There are some details of the type inference process (which I admit I don't fully understand) that make it easier to infer the types used in a later list based on the types in an earlier list. For example, the initial z value passed to foldLeft is used to infer the result type (and left argument type) of the function parameter.
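For example (a small sketch), the type of z alone fixes B, so the function literal in the second list needs no annotations; note that in foldLeft's op: (B, A) => B the accumulator comes first:

List(1, 2, 3).foldLeft(0)((acc, x) => acc + x)  // B inferred as Int; result: 6
List(1, 2, 3).foldLeft("")((acc, x) => acc + x) // B inferred as String; result: "123"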
Because in Scala you can define function arguments in multiple groups separated by ()
def test(a: String)(b: String)(implicit ev: Something) { }
The most practical scenario is where a context bound or currying is required, e.g. when a specific implicit definition must be available in scope.
For instance, Future expects an implicit executor.
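A small sketch of that pattern (fetch is a hypothetical method name):

import scala.concurrent.{ExecutionContext, Future}

def fetch(id: Int)(implicit ec: ExecutionContext): Future[Int] =
  Future(id * 2) // the implicit parameter list is filled by whatever executor is in scope

import scala.concurrent.ExecutionContext.Implicits.global
fetch(21) // runs on the global executor without passing it explicitly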
If you look at the definition of the foldLeft method, you will see that the first parameter list takes the accumulator and the second takes the folding function used in the curried application.
def foldLeft[B](z: B)(op: (B, A) ⇒ B): B
The parentheses thing is a very useful separation of concerns.
Also, once you define a method with:
def test(a: String)(b: String)
You can't call it with: test("a", "b");
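It must be called with each argument in its own group instead:

test("a")("b") // one set of parentheses per parameter list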

Function syntax puzzler in scalaz

After watching Nick Partridge's presentation on deriving Scalaz, I got to looking at this example, which is just awesome:
import scalaz._
import Scalaz._
def even(x: Int): Validation[NonEmptyList[String], Int] =
  if (x % 2 == 0) x.success else "not even: %d".format(x).wrapNel.fail
println( even(3) <|*|> even(5) ) //prints: Failure(NonEmptyList(not even: 3, not even: 5))
I was trying to understand what the <|*|> method was doing, here is the source code:
def <|*|>[B](b: M[B])(implicit t: Functor[M], a: Apply[M]): M[(A, B)] =
  <**>(b, (_: A, _: B))
OK, that is fairly confusing (!) - but it references the <**> method, which is declared thus:
def <**>[B, C](b: M[B], z: (A, B) => C)(implicit t: Functor[M], a: Apply[M]): M[C] =
  a(t.fmap(value, z.curried), b)
So I have a few questions:
How come the method appears to take a higher-kinded type of one type parameter (M[B]) but can be passed a Validation (which has two type parameters)?
The syntax (_: A, _: B) defines the function (A, B) => Pair[A,B] which the 2nd method expects: what is happening to the Tuple2/Pair in the failure case? There's no tuple in sight!
Type Constructors as Type Parameters
M is a type parameter to one of Scalaz's main pimps, MA, which represents the type constructor (aka higher-kinded type) of the pimped value. This type constructor is used to look up the appropriate instances of Functor and Apply, which are implicit requirements of the method <**>.
trait MA[M[_], A] {
  val value: M[A]
  def <**>[B, C](b: M[B], z: (A, B) => C)(implicit t: Functor[M], a: Apply[M]): M[C] = ...
}
What is a Type Constructor?
From the Scala Language Reference:
We distinguish between first-order types and type constructors, which take type parameters and yield types. A subset of first-order types called value types represents sets of (first-class) values. Value types are either concrete or abstract. Every concrete value type can be represented as a class type, i.e. a type designator (§3.2.3) that refers to a class (§5.3), or as a compound type (§3.2.7) representing an intersection of types, possibly with a refinement (§3.2.7) that further constrains the types of its members. Abstract value types are introduced by type parameters (§4.4) and abstract type bindings (§4.3). Parentheses in types are used for grouping. We assume that objects and packages also implicitly define a class (of the same name as the object or package, but inaccessible to user programs).
Non-value types capture properties of identifiers that are not values (§3.3). For example, a type constructor (§3.3.3) does not directly specify the type of values. However, when a type constructor is applied to the correct type arguments, it yields a first-order type, which may be a value type. Non-value types are expressed indirectly in Scala. E.g., a method type is described by writing down a method signature, which in itself is not a real type, although it gives rise to a corresponding function type (§3.3.1). Type constructors are another example, as one can write type Swap[m[_, _], a, b] = m[b, a], but there is no syntax to write the corresponding anonymous type function directly.
List is a type constructor. You can apply the type Int to get a Value Type, List[Int], which can classify a value. Other type constructors take more than one parameter.
The trait scalaz.MA requires that its first type parameter be a type constructor that takes a single type and returns a value type, with the syntax trait MA[M[_], A] {}. The type parameter definition describes the shape of the type constructor, which is referred to as its kind. List is said to have the kind * -> *.
Partial Application of Types
But how can MA wrap a value of type Validation[X, Y]? The type Validation has kind (*, *) -> *, and could only be passed as a type argument to a type parameter declared like M[_, _].
This implicit conversion in object Scalaz converts a value of type Validation[X, Y] to a MA:
object Scalaz {
  implicit def ValidationMA[A, E](v: Validation[E, A]): MA[PartialApply1Of2[Validation, E]#Apply, A] =
    ma[PartialApply1Of2[Validation, E]#Apply, A](v)
}
Which in turn uses a trick with a type alias in PartialApply1Of2 to partially apply the type constructor Validation, fixing the type of the errors, but leaving the type of the success unapplied.
PartialApply1Of2[Validation, E]#Apply would be better written as [X] => Validation[E, X]. I recently proposed to add such a syntax to Scala, it might happen in 2.9.
Think of this as a type level equivalent of this:
def validation[A, B](a: A, b: B) = ...
def partialApply1Of2[A, B, C](f: (A, B) => C, a: A): B => C = (b: B) => f(a, b)
This lets you combine a Validation[String, Int] with a Validation[String, Boolean], because they both share the type constructor [A] Validation[String, A].
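The same trick can be sketched with a plain type alias, using the standard library's Either in place of Validation (both take two type parameters); MA2 here is a hypothetical stand-in for scalaz.MA:

object KindDemo {
  trait MA2[M[_], A]                   // M must have kind * -> *

  // type Bad = MA2[Either, Int]       // does not compile: Either has kind (*, *) -> *

  type StringOr[X] = Either[String, X] // partially apply Either: fix the error type
  type Ok = MA2[StringOr, Int]         // now the kinds line up
}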
Applicative Functors
<**> demands that the type constructor M have associated instances of Apply and Functor. This constitutes an Applicative Functor, which, like a Monad, is a way to structure a computation through some effect. In this case the effect is that the sub-computations can fail (and when they do, we accumulate the failures).
The container Validation[NonEmptyList[String], A] can wrap a pure value of type A in this 'effect'. The <**> operator takes two effectful values, and a pure function, and combines them with the Applicative Functor instance for that container.
Here's how it works for the Option applicative functor. The 'effect' here is the possibility of failure.
val os: Option[String] = Some("a")
val oi: Option[Int] = Some(2)
val result1 = (os <**> oi) { (s: String, i: Int) => s * i }
assert(result1 == Some("aa"))
val result2 = (os <**> (None: Option[Int])) { (s: String, i: Int) => s * i }
assert(result2 == None)
In both cases, there is a pure function of type (String, Int) => String, being applied to effectful arguments. Notice that the result is wrapped in the same effect (or container, if you like), as the arguments.
You can use the same pattern across a multitude of containers that have an associated Applicative Functor. All Monads are automatically Applicative Functors, but there are even more, like ZipStream.
Option and [A]Validation[X, A] are both Monads, so you could also use Bind (aka flatMap):
val result3 = oi flatMap { i => os map { s => s * i } }
val result4 = for {i <- oi; s <- os} yield s * i
Tupling with `<|*|>`
<|*|> is really similar to <**>, but it provides the pure function for you, simply building a Tuple2 from the results. (_: A, _: B) is shorthand for (a: A, b: B) => Tuple2(a, b)
And beyond
Here are our bundled examples for Applicative and Validation. I used a slightly different syntax to use the Applicative Functor: (fa ⊛ fb ⊛ fc ⊛ fd) { (a, b, c, d) => ... }
UPDATE: But what happens in the Failure Case?
what is happening to the Tuple2/Pair in the failure case?
If any of the sub-computations fails, the provided function is never run. It is only run if all sub-computations (in this case, the two arguments passed to <**>) are successful. If so, it combines them into a Success. Where is this logic? This defines the Apply instance for [A] Validation[X, A]. We require that the type X have a Semigroup available, which is the strategy for combining the individual errors, each of type X, into an aggregated error of the same type. If you choose String as your error type, Semigroup[String] concatenates the strings; if you choose NonEmptyList[String], the error(s) from each step are concatenated into a longer NonEmptyList of errors. This concatenation happens below when two Failures are combined, using the ⊹ operator (which expands with implicits to, for example, Scalaz.IdentityTo(e1).⊹(e2)(Semigroup.NonEmptyListSemigroup(Semigroup.StringSemigroup))).
implicit def ValidationApply[X: Semigroup]: Apply[PartialApply1Of2[Validation, X]#Apply] =
  new Apply[PartialApply1Of2[Validation, X]#Apply] {
    def apply[A, B](f: Validation[X, A => B], a: Validation[X, A]) = (f, a) match {
      case (Success(f), Success(a)) => success(f(a))
      case (Success(_), Failure(e)) => failure(e)
      case (Failure(e), Success(_)) => failure(e)
      case (Failure(e1), Failure(e2)) => failure(e1 ⊹ e2)
    }
  }
Monad or Applicative, how shall I choose?
Still reading? (Yes. Ed)
I've shown that sub-computations based on Option or [A] Validation[E, A] can be combined with either Apply or Bind. When would you choose one over the other?
When you use Apply, the structure of the computation is fixed. All sub-computations will be executed; the results of one can't influence the others. Only the 'pure' function has an overview of what happened. Monadic computations, on the other hand, allow the first sub-computation to influence the later ones.
If we used a Monadic validation structure, the first failure would short-circuit the entire validation, as there would be no Success value to feed into the subsequent validation. However, we are happy for the sub-validations to be independent, so we can combine them through the Applicative, and collect all the failures we encounter. The weakness of Applicative Functors has become a strength!
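A rough sketch of the contrast, using the standard library's Either in place of Validation (Either's flatMap short-circuits, while the hand-rolled applicative-style combiner below accumulates; zipV and the aliases are hypothetical names):

object ValidationDemo {
  type V[A] = Either[List[String], A]

  def even(x: Int): V[Int] =
    if (x % 2 == 0) Right(x) else Left(List(s"not even: $x"))

  // Monadic: the first failure short-circuits, so only one error is reported
  val monadic: V[(Int, Int)] =
    for { a <- even(3); b <- even(5) } yield (a, b) // Left(List("not even: 3"))

  // Applicative-style: both sub-computations run and their failures are combined
  def zipV[A, B](va: V[A], vb: V[B]): V[(A, B)] = (va, vb) match {
    case (Right(a), Right(b)) => Right((a, b))
    case (Left(e1), Left(e2)) => Left(e1 ++ e2)
    case (Left(e), _)         => Left(e)
    case (_, Left(e))         => Left(e)
  }

  val applicative: V[(Int, Int)] =
    zipV(even(3), even(5)) // Left(List("not even: 3", "not even: 5"))
}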