Are polymorphic functions "restrictive" in Scala? - scala

In the book Functional Programming in Scala MEAP v10, the author mentions
Polymorphic functions are often so constrained by their type that they only have one implementation!
and gives the example
def partial1[A,B,C](a: A, f: (A,B) => C): B => C = (b: B) => f(a, b)
What does he mean by this statement? Are polymorphic functions restrictive?

Here's a simpler example:
def mysteryMethod[A, B](somePair: (A, B)): B = ???
What does this method do? It turns out, that there is only one thing this method can do! You don't need the name of the method, you don't need the implementation of the method, you don't need any documentation. The type tells you everything it could possibly do, and it turns out that "everything" in this case is exactly one thing.
So, what does it do? It takes a pair (A, B) and returns some value of type B. What value does it return? Can it construct a value of type B? No, it can't, because it doesn't know what B is! Can it return a random value of type B? No, because randomness is a side-effect and thus would have to appear in the type signature. Can it go out in the universe and fetch some B? No, because that would be a side-effect and would have to appear in the type signature!
In fact, the only thing it can do is return the value of type B that was passed into it, the second element of the pair. So, this mysteryMethod is really the second method, and its only sensible implementation is:
def second[A, B](somePair: (A, B)): B = somePair._2
Note that in reality, since Scala is neither pure nor total, there are in fact a couple of other things the method could do: throw an exception (i.e. return abnormally), go into an infinite loop (i.e. not return at all), use reflection to figure out the actual type of B and reflectively invoke the constructor to fabricate a new value, etc.
However, assuming purity (the return value may only depend on the arguments), totality (the method must return a value normally) and parametricity (it really doesn't know anything about A and B), then there is in fact an awful lot you can tell about a method by only looking at its type.
Here's another example:
def mysteryMethod(someBoolean: Boolean): Boolean = ???
What could this do? It could always return false and ignore its argument. But then it would be overly constrained: if it always ignores its argument, then it doesn't care that it is a Boolean and its type would rather be
def alwaysFalse[A](something: A): Boolean = false // same for true, obviously
It could always just return its argument, but again, then it wouldn't actually care about booleans, and its type would rather be
def identity[A](something: A): A = something
So, really, the only thing it can do is return a different boolean than the one that was passed in, and since there are only two booleans, we know that our mysteryMethod is, in fact, not:
def not(someBoolean: Boolean): Boolean = if (someBoolean) false else true
So, here, we have an example, where the types don't give us the implementation, but at least, they give as a (small) set of 4 possible implementations, only one of which makes sense.
(By the way: it turns out that there is only one possible implementation of a method which takes an A and returns an A, and it is the identity method shown above.)
So, to recap:
purity means that you can only use the building blocks that were handed to you (the arguments)
a strong, strict, static type system means that you can only use those building blocks in such a way that their types line up
totality means that you can't do stupid things (like infinite loops or throwing exceptions)
parametricity means that you cannot make any assumptions at all about your type variables
Think about your arguments as parts of a machine and your types as connectors on those machine parts. There will only be a limited number of ways that you can connect those machine parts together in a way that you only plug together compatible connectors and you don't have any leftover parts. Often enough, there will be only one way, or if there are multiple ways, then often one will be obviously the right one.
What this means is that, once you have designed the types of your objects and methods, you won't even have to think about how to implement those methods, because the types will already dictate the only possible way to implement them! Considering how many questions on StackOverflow are basically "how do I implement this?", can you imagine how freeing it must be not having to think about that at all, because the types already dictate the one (or one of a few) possible implementation?
Now, look at the signature of the method in your question and try playing around with different ways to combine a and f in such a way that the types line up and you use both a and f and you will indeed see that there is only one way to do that. (As Chris and Paul have shown.)

def partial1[A,B,C](a: A, f: (A,B) => C): B => C = (b: B) => f(a, b)
Here, partial1 takes as parameters value of type A, and a function that takes a parameter of type A and a parameter of type B, returning a value of type C.
partial1 must return a function taking a value of type B and returning a C. Given A, B, and C are arbitary, we cannot apply any functions to their values. So the only possibility is to apply the function f to the value a passed to partial, and the value of type B that is a parameter to the function we return.
So you end up with the single possibility that's in the definition f(a,b)

To take a simpler example, consider the type Option[A] => Boolean. There's only a couple ways to implement this:
def foo1(x: Option[A]): Boolean = x match { case Some(_) => true
case None => false }
def foo2(x: Option[A]): Boolean = !foo1(x)
def foo3(x: Option[A]): Boolean = true
def foo4(x: Option[A]): Boolean = false
The first two choices are pretty much the same, and the last two are trivial, so essentially there's only one useful thing this function could do, which is tell you whether the Option is Some or None.
The space of possible implementation is "restricted" by the abstractness of the function type. Since A is unconstrained, the option's value could be anything, so the function can't depend on that value in any way because you know nothing about what it. The only "understanding" the function may have about its parameter is the structure of Option[_].
Now, back to your example. You have no idea what C is, so there's no way you can construct one yourself. Therefore the function you create is going to have to call f to get a C. And in order to call f, you need to provide an arguments of types A and B. Again, since there's no way to create an A or a B yourself, the only thing you can do is use the arguments that are given to you. So there's no other possible function you could write.

Related

Type constraint on input function parameter in Scala

I'm trying to create a higher order function that has an upper bound on the type of the parameter accepted by the input function.
A toy example of a naive attempt that demonstrates the problem:
class A
class B extends A
def AFunc(f: A => Unit): Unit = {}
AFunc((b: B) => {})
//<console>:14: error: type mismatch;
// found : B => Unit
// required: A => Unit
// AFunc((b: B) => {})
This doesn't work, I assume, because functions are contravariant in their parameters so B => Unit is a supertype of A => Unit.
It's possible to get it working with polymorphic functions like so:
class A
class B extends A
def AFunc[T <: A](f: T => Unit): Unit = {}
AFunc[B]((b: B) => {})
This solution has the drawback of requiring the caller to know the exact type of the passed function's parameter, despite the fact that all AFunc cares about the type is that it is a subtype of A.
Is there a type-safe way to enforce the type constraint without explicitly passing the type?
Your problem here is not due to language construct limitations, it's conceptual. There is a reason why functions are contravariant in their arguments.
What would you expect to happen if we take your code and extend it slightly into a following situation:
class A
class B extends A
def AFunc(f: A => Unit): Unit = {
f(new A) // let's use the function f
}
AFunc((b: B) => {}) // ???
How would this compile? Method AFunc clearly needs a way to handle values of type A, but you only provided it with a way of handling one subset of those values (= those of type B).
Sure, you can make your method polymorphic the way you did, and btw you won't even need to specify the argument type at call-site because it will be inferred. BUT, this is not the "solution". It's simply a different thing. Now instead of saying "I need a function that handles values of type A", your method is saying "I need a function that handles one particular subtype of A, but I will be fine with whatever subtype you decide on".
So, as it often happens on StackOverflow, I have to ask you - what are you trying to achieve? Is your method AFunc going to be handling all kinds of As, or just one particular subtype at a time?
If it's the latter, parameterising with an upper bound (as you did) should be fine. Just be careful with the inference if you decide not to specify the type at call site, because sometimes your code might compile just because the compiler inferred something other than what you intended (I see from the comments in your question that you already discovered this behaviour). But, if parametric method is indeed what you need, then I don't see why would specifying the type at call-site be a problem.
despite the fact that all AFunc cares about the type is that it is a subtype of A.
Sure, AFunc doesn't care. But the caller does. I mean, someone must care, right? :) Again, it's hard to continue discussion without knowing more about the actual problem you're solving, but if you want to provide more details I'd be happy to help.

What is "value" in pure functional programming?

What constitutes a value in pure functional programming?
I am asking myself these questions after seeing a sentence:
Task(or IO) has a constructor that captures side-effects as values.
Is a function a value?
If so, what does it mean when equating two functions: assert(f == g). For two functions that are equivalent but defined separately => f != g, why don't they work as 1 == 1?
Is an object with methods a value? (for example IO { println("") })
Is an object with setter methods and mutable state a value?
Is an object with mutable state which works as a state machine a value?
How do we test whether something is a value? Is immutability a sufficient condition?
UPDATE:
I'm using Scala.
I'll try to explain what a value is by contrasting it with things that are not values.
Roughly speaking, values are structures produced by the process of evaluation which correspond to terms that cannot be simplified any further.
Terms
First, what are terms? Terms are syntactic structures that can be evaluated. Admittedly, this is a bit circular, so let's look at a few examples:
Constant literals are terms:
42
Functions applied to other terms are terms:
atan2(123, 456 + 789)
Function literals are terms
(x: Int) => x * x
Constructor invocations are terms:
Option(42)
Contrast this to:
Class declarations / definitions are not terms:
case class Foo(bar: Int)
that is, you cannot write
val x = (case class Foo(bar: Int))
this would be illegal.
Likewise, trait and type definitions are not terms:
type Bar = Int
sealed trait Baz
Unlike function literals, method definitions are not terms:
def foo(x: Int) = x * x
for example:
val x = (a: Int) => a * 2 // function literal, ok
val y = (def foo(a: Int): Int = a * 2) // no, not a term
Package declarations and import statements are not terms:
import foo.bar.baz._ // ok
List(package foo, import bar) // no
Normal forms, values
Now, when it is hopefully somewhat clearer what a term is, what was meant by "cannot be simplified any further*? In idealized functional programming languages, you can define what a normal form, or rather weak head normal form is. Essentially, a term is in a (wh-) normal form if no reduction rules can be applied to the term to make it any simpler. Again, a few examples:
This is a term, but it's not in normal form, because it can be reduced to 42:
40 + 2
This is not in weak head normal form:
((x: Int) => x * 2)(3)
because we can further evaluate it to 6.
This lambda is in weak head normal form (it's stuck, because the computation cannot proceed until an x is supplied):
(x: Int) => x * 42
This is not in normal form, because it can be simplified further:
42 :: List(10 + 20, 20 + 30)
This is in normal form, no further simplifications possible:
List(42, 30, 50)
Thus,
42,
(x: Int) => x * 42,
List(42, 30, 50)
are values, whereas
40 + 2,
((x: Int) => x * 2)(3),
42 :: List(10 + 20, 20 + 30)
are not values, but merely non-normalized terms that can be further simplified.
Examples and non-examples
I'll just go through your list of sub-questions one-by-one:
Is a function a value
Yes, things like (x: T1, ..., xn: Tn) => body are considered to be stuck terms in WHNF, in functional languages they can actually be represented, so they are values.
If so, what does it mean when equating two functions: assert(f == g) for two functions that are equivalent but defined separately => f != g, why don't they work as 1 == 1?
Function extensionality is somewhat unrelated to the question whether something is a value or not. In the above "definition by example", I talked only about the shape of the terms, not about the existence / non-existence of some computable relations defined on those terms. The sad fact is that you can't even really determine whether a lambda-expression actually represents a function (i.e. whether it terminates for all inputs), and it is also known that there cannot be an algorithm that could determine whether two functions produce the same output for all inputs (i.e. are extensionally equal).
Is an object with methods a value? (for example IO { println("") })
Not quite clear what you're asking here. Objects don't have methods. Classes have methods. If you mean method invocations, then, no, they are terms that can be further simplified (by actually running the method), so they are not values.
Is an object with setter methods and mutable state a value?
Is an object with mutable state which works as a state machine a value?
There is no such thing in pure functional programming.
What constitutes a value in pure functional programming?
Background
In pure functional programming there is no mutation. Hence, code such as
case class C(x: Int)
val a = C(42)
val b = C(42)
would become equivalent to
case class C(x: Int)
val a = C(42)
val b = a
since, in pure functional programming, if a.x == b.x, then we would have a == b. That is, a == b would be implemented comparing the values inside.
However, Scala is not pure, since it allows mutation, like Java. In such case, we do NOT have the equivalence between the two snippets above, when we declare case class C(var x: Int). Indeed, performing a.x += 1 afterwords does not affect b.x in the first snippet, but does in the second one, where a and b point to the same object. In such case, it is useful to have a comparison a == b which compares the object references, rather than its inner integer value.
When using case class C(x: Int), Scala comparisons a == b behave closer to pure functional programming, comparing the integers values. With regular (non case) classes, Scala instead compares object references breaking the equivalence between the two snippets. But, again, Scala is not pure. By comparison, in Haskell
data C = C Int deriving (Eq)
a = C 42
b = C 42
is indeed equivalent to
data C = C Int deriving (Eq)
a = C 42
b = a
since there are no "references" or "object identities" in Haskell. Note that the Haskell implementation likely will allocate two "objects" in the first snippet, and only one object in the second one, but since there is no way to tell them apart inside Haskell, the program output will be the same.
Answer
Is a function a value ? (then what it means when equating two function: assert(f==g). For two function that is equivalent but defined separately => f!=g, why not they work like 1==1)
Yes, functions are values in pure functional programming.
Above, when you mention "function that is equivalent but defined separately", you are assuming that we can compare the "references" or "object identities" for these two functions. In pure functional programming we can not.
Pure functional programming should compare functions making f == g equivalent to f x == g x for all possible arguments x. This is feasible when there is only a few values for x, e.g. if f,g :: Bool -> Int we only need to check x=True, x=False. For functions having infinite domains, this is much harder. For instance, if f,g :: String -> Int we can not check infinitely many strings.
Theoretical computer science (computability theory) also proved that there is no algorithm to compare two functions String -> Int, not even an inefficient algorithm, not even if we have access to the source code of the two functions. For this mathematical reason, we must accept that functions are values that can not be compared. In Haskell, we express this through the Eq typeclass, stating that almost all the standard types are comparable, functions being the exception.
Is an object with methods a value ? (for example, IO{println("")})
Yes. Roughly speaking, "everything is a value", including IO actions.
Is an object with setter methods and mutable states a value ?
Is an object with mutable states and works as a state machine a value ?
There is no mutable state in pure functional programming.
At best, the setters can produce a "new" object with the modified fields.
And yes, the object would be a value.
How do we test if it is a value, is that immutable can be a sufficient condition to be a value ?
In pure functional programming, we can only have immutable data.
In impure functional programming, I think we can call most immutable objects "values", when we do not compare object references. If the "immutable" object contains a reference to a mutable object, e.g.
case class D(var x: Int)
case class C(c: C)
val a = C(D(42))
then things are more tricky. I guess we could still call a "immutable", since we can not alter a.c, but we should be careful since a.c.x can be mutated.
Depending on the intent, I think that some would not call a immutable. I would not consider a to be a value.
To make things more muddy, in impure programming, there are objects which use mutation to present a "pure" interface in an efficient way. For instance one can write a pure function that, before returning, stores its result in a cache. When called again on the same argument, it will return the previously computed result
(this is usually called memoization). Here, mutation happens, but it is not observable from outside, where at most we can observe a faster implementation. In this case, we can simply pretend the that function is pure (even if it performs mutation) and consider it a "value".
The contrast with imperative languages is stark. In inperitive languages, like Python, the output of a function is directed. It can be assigned to a variable, explicitly returned, printed or written to a file.
When I compose a function in Haskell, I never consider output. I never use "return" Everything has "a" value. This is called "symbolic" programming. By "everything", is meant "symbols". Like human language, the nouns and verbs represent something. That something is their value. The "value" of "Pete" is Pete. The name "Pete" is not Pete but is a representation of Pete, the person. The same is true of functional programming. The best analogy is math or logic When you do pages of calculations, do you direct the output of each function? You even "assign" variables to be replaced by their "value" in functions or expressions.
Values are
Immutable/Timeless
Anonymous
Semantically Transparent
What is the value of 42? 42. What is the "value" of new Date()? Date object at 0x3fa89c3. What is the identity of 42? 42. What is the identity of new Date()? As we saw in the previous example, it's the thing that lives at the place. It may have many different "values" in different contexts but it has only one identity. OTOH, 42 is sufficient unto itself. It's semantically meaningless to ask where 42 lives in the system. What is the semantic meaning of 42? Magnitude of 42. What is the semantic meaning of new Foo()? Who knows.
I would add a fourth criterion (see this in some contexts in the wild but not others) which is: values are language agnostic (I'm not certain the first 3 are sufficient to guarantee this nor that such a rule is entirely consistent with most people's intuition of what value means).
Values are things that
functions can take as inputs and return as outputs, that is, can be computed, and
are members of a type, that is, elements of some set, and
can be bound to a variable, that is, can be named.
First point is really the crucial test whether something is a value. Perhaps the word value, due to conditioning, might immediately make us think of just numbers, but the concept is very general. Essentially anything we can give to and get out of a function can be considered a value. Numbers, strings, booleans, instances of classes, functions themselves, predicates, and even types themselves, can be inputs and outputs of functions, and thus are values.
IO monad is a great example of how general this concept is. When we say IO monad models side-effects as values, we mean a function can take a side-effect (say println) as input and return as output. IO(println(...)) separates the idea of the effect of an action of println from the actual execution of an action, and allows these effects to be considered as first class values that can be computed with using the same language facilities as for any other values such as numbers.

how many implementations a scala method can have?

I'm new in Scala. I come across with this question:
If side effects (outputs, global variables ...) are not taken into account, how many implementations can the below method have? how can I figure it out in general?
def g[A,B,C](x: A,y: B,f:(A,B) => C): C
You can figure it out in general by going backwards from your return type, so you look at C and try to look for where it might come from, we can see that f itself returns C, but it requires us to give it an A and a B, so now we look for those and we find them only once, which gives us the answer that there's only one way to produce a C and therefore only one way to implement this function. If we had two values of A, we'd get two different implementations.
Now of course this kind of ignores the fact that you also implement it using null or an infinite number of ways using asInstanceOf casts. So if you have to be super exact there are 1* implementations of this function.
This question is ill-posed even without nulls, asInstanceOfs and throwing exceptions. For every natural number n, you can simply append .swap.swap to (x, y) n-times in the following implementation (example for n = 1):
def g[A, B, C](x: A, y: B, f: (A, B) => C): C = {
val (xx, yy) = (x,y).swap.swap
f(xx, yy)
}
You can do the same thing by inserting arbitrarily many identity functions at various places. Or by using _1, _2 projections and (-,-)-constructors (or, in general, repeatedly applying non-productive eta-expansions and beta-reductions of your choice).

Why T is in the covariance position or contravariance position

Given the following class definition
class X[+T] {
def get[U>:T](default:U):U
}
why T in the method def get[U>:T](default:U) is in the covariance position
Given the following class definition
class X[-T] {
def get[U<:T](default:U):U
}
why T in the method def get[U<:T](default:U) is in the cotravariance position
It is hard to answer "Why?" question without you providing more details but I'll try. I assume that your question is really "why type restriction on U are not inverted?". The short answer: because this is type-safe and covers some cases that are not covered otherwise.
Your first example is probably inspired by Option[T] and its getOrElse method. Although I'm not sure why anybody needs getOrElse with U different from T, logic why type restriction can be only U>:T seems obvious to me. Let's assume you have 3 classes: C which inherits B which inherits A and you have an Option[B]. If your default value is already B or C you don't need anything beyond U = T and thus simpler signature without additional generic U would suffice. The only case when you can't pass default value to the getOrElse method is if you have it of some type which is not a subtype of B (such as A). Let's extend this signature even more for a moment
def getOrElse[U, R](default:U): R
How types U, T and R should be related? Obviously R should be some common super-type of U and T because it should be able to contain both T and U. However such definition would be an overkill. First of all it is really weird to have default value of a type that is not related to the T at all. Secondly, even if it is such a strange case, you (and compiler) still can calculate a common super-type and assign it to some new U' = R'. Thus you don't need R but adding U adds some more flexibility (at least theoretically). But U still has to be some super-type of T because it is also the return type.
So to sum up: adding U with U<:T will
either produce wrong code if you use U as the result type (U as a result type can't hold T)
or if you use T as the result type would not extend applicability of this method i.e. would not allow any code that does not compile without U to compile with this additional U.
But adding U with U>:T will allow some more code which is actually type-safe to compile such as (yes, stupid example but as I said I don't know any real life examples):
val opt = Option[Int](1)
val orElse: Any = opt.getOrElse(List())

def layout[A](x: A) = ... syntax in Scala

I'm a beginner of Scala who is struggling with Scala syntax.
I got the line of code from https://www.tutorialspoint.com/scala/higher_order_functions.htm.
I know (x: A) is an argument of layout function
( which means argument x of Type A)
But what is [A] between layout and (x: A)?
I've been googling scala function syntax, couldn't find it.
def layout[A](x: A) = "[" + x.toString() + "]"
It's a type parameter, meaning that the method is parameterised (some also say "generic"). Without it, compiler would think that x: A denotes a variable of some concrete type A, and when it wouldn't find any such type it would report a compile error.
This is a fairly common thing in statically typed languages; for example, Java has the same thing, only syntax is <A>.
Parameterized methods have rules where the types can occur which involve concepts of covariance and contravariance, denoted as [+A] and [-A]. Variance is definitely not in the scope of this question and is probably too much for you too handle right now, but it's an important concept so I figured I'd just mention it, at least to let you know what those plus and minus signs mean when you see them (and you will).
Also, type parameters can be upper or lower bounded, denoted as [A <: SomeType] and [A >: SomeType]. This means that generic parameter needs to be a subtype/supertype of another type, in this case a made-up type SomeType.
There are even more constructs that contribute extra information about the type (e.g. context bounds, denoted as [A : Foo], used for typeclass mechanism), but you'll learn about those later.
This means that the method is using a generic type as its parameter. Every type you pass that has the definition for .toString could be passed through layout.
For example, you could pass both int and string arguments to layout, since you could call .toString on both of them.
val i = 1
val s = "hi"
layout(i) // would give "[1]"
layout(s) // would give "[hi]"
Without the gereric parameter, for this example you would have to write two definitions for layout: one that accepts integers as param, and one that accepts string as param. Even worse: every time you need another type you'd have to write another definition that accepts it.
Take a look at this example here and you'll understand it better.
I also recomend you to take a look at generic classes here.
A is a type parameter. Rather than being a data type itself (Ex. case class A), it is generic to allow any data type to be accepted by the function. So both of these will work:
layout(123f) [Float datatype] will output: "[123]"
layout("hello world") [String datatype] will output: "[hello world]"
Hence, whichever datatype is passed, the function will allow. These type parameters can also specify rules. These are called contravariance and covariance. Read more about them here!