Related
I read the definition of a State Monad as follows, which includes a definition of flatMap. The literal definition of flatMap is clear to me, but what would be a typical use case of it?
trait State[S,A]{
def run (initial:S):(S,A)
def flatMap[B] (f:A=>State[S,B]):State[S,B] =
State{ s =>
val (s1, a) = run(s)
f(a).run(s1)
}
}
object State{
def apply[S,A](f:S => (S,A)):State[S,A] =
new State[S,A] {
def run(initial:S): (S,A) = f(initial)
}
}
According to cats State monad documentation
The flatMap method on State[S, A] lets you use the result of one State
in a subsequent State
It means we can have state transitions nicely lined up within a for-comprehension like so
val createRobot: State[Seed, Robot] =
for {
id <- nextLong
sentient <- nextBoolean
isCatherine <- nextBoolean
name = if (isCatherine) "Catherine" else "Carlos"
isReplicant <- nextBoolean
model = if (isReplicant) "replicant" else "borg"
} yield Robot(id, sentient, name, model)
In general, purpose of flatMap is chaining of monadic computations, so whatever monad we have we can stick it within a for-comprehension.
This quote is taken from wikibooks about state monad in haskell.
If you have programmed in any other language before, you likely wrote some functions that "kept state". For those new to the concept, a state is one or more variables that are required to perform some computation but are not among the arguments of the relevant function. Object-oriented languages like C++ make extensive use of state variables (in the form of member variables inside classes and objects). Procedural languages like C on the other hand typically use global variables declared outside the current scope to keep track of state.
In Haskell, however, such techniques are not as straightforward to apply. Doing so will require mutable variables which would mean that functions will have hidden dependencies, which is at odds with Haskell's functional purity. Fortunately, often it is possible to keep track of state in a functionally pure way. We do so by passing the state information from one function to the next, thus making the hidden dependencies explicit.
Basically its purpose is the to write purely functional programs that manipulate state, having the API compute the next state rather than actually mutate anything.
The most common examples for the state monad are:
Generate random numbers.
Building games
Parsers
Data Stuctures
Any finite state machine program.
You can also check the cats page for the state monad
Note: There is an additional more complexed state monad which is called IndexedState monad, which basically gives you the option to change the state.
What constitutes a value in pure functional programming?
I am asking myself these questions after seeing a sentence:
Task(or IO) has a constructor that captures side-effects as values.
Is a function a value?
If so, what does it mean when equating two functions: assert(f == g). For two functions that are equivalent but defined separately => f != g, why don't they work as 1 == 1?
Is an object with methods a value? (for example IO { println("") })
Is an object with setter methods and mutable state a value?
Is an object with mutable state which works as a state machine a value?
How do we test whether something is a value? Is immutability a sufficient condition?
UPDATE:
I'm using Scala.
I'll try to explain what a value is by contrasting it with things that are not values.
Roughly speaking, values are structures produced by the process of evaluation which correspond to terms that cannot be simplified any further.
Terms
First, what are terms? Terms are syntactic structures that can be evaluated. Admittedly, this is a bit circular, so let's look at a few examples:
Constant literals are terms:
42
Functions applied to other terms are terms:
atan2(123, 456 + 789)
Function literals are terms
(x: Int) => x * x
Constructor invocations are terms:
Option(42)
Contrast this to:
Class declarations / definitions are not terms:
case class Foo(bar: Int)
that is, you cannot write
val x = (case class Foo(bar: Int))
this would be illegal.
Likewise, trait and type definitions are not terms:
type Bar = Int
sealed trait Baz
Unlike function literals, method definitions are not terms:
def foo(x: Int) = x * x
for example:
val x = (a: Int) => a * 2 // function literal, ok
val y = (def foo(a: Int): Int = a * 2) // no, not a term
Package declarations and import statements are not terms:
import foo.bar.baz._ // ok
List(package foo, import bar) // no
Normal forms, values
Now, when it is hopefully somewhat clearer what a term is, what was meant by "cannot be simplified any further*? In idealized functional programming languages, you can define what a normal form, or rather weak head normal form is. Essentially, a term is in a (wh-) normal form if no reduction rules can be applied to the term to make it any simpler. Again, a few examples:
This is a term, but it's not in normal form, because it can be reduced to 42:
40 + 2
This is not in weak head normal form:
((x: Int) => x * 2)(3)
because we can further evaluate it to 6.
This lambda is in weak head normal form (it's stuck, because the computation cannot proceed until an x is supplied):
(x: Int) => x * 42
This is not in normal form, because it can be simplified further:
42 :: List(10 + 20, 20 + 30)
This is in normal form, no further simplifications possible:
List(42, 30, 50)
Thus,
42,
(x: Int) => x * 42,
List(42, 30, 50)
are values, whereas
40 + 2,
((x: Int) => x * 2)(3),
42 :: List(10 + 20, 20 + 30)
are not values, but merely non-normalized terms that can be further simplified.
Examples and non-examples
I'll just go through your list of sub-questions one-by-one:
Is a function a value
Yes, things like (x: T1, ..., xn: Tn) => body are considered to be stuck terms in WHNF, in functional languages they can actually be represented, so they are values.
If so, what does it mean when equating two functions: assert(f == g) for two functions that are equivalent but defined separately => f != g, why don't they work as 1 == 1?
Function extensionality is somewhat unrelated to the question whether something is a value or not. In the above "definition by example", I talked only about the shape of the terms, not about the existence / non-existence of some computable relations defined on those terms. The sad fact is that you can't even really determine whether a lambda-expression actually represents a function (i.e. whether it terminates for all inputs), and it is also known that there cannot be an algorithm that could determine whether two functions produce the same output for all inputs (i.e. are extensionally equal).
Is an object with methods a value? (for example IO { println("") })
Not quite clear what you're asking here. Objects don't have methods. Classes have methods. If you mean method invocations, then, no, they are terms that can be further simplified (by actually running the method), so they are not values.
Is an object with setter methods and mutable state a value?
Is an object with mutable state which works as a state machine a value?
There is no such thing in pure functional programming.
What constitutes a value in pure functional programming?
Background
In pure functional programming there is no mutation. Hence, code such as
case class C(x: Int)
val a = C(42)
val b = C(42)
would become equivalent to
case class C(x: Int)
val a = C(42)
val b = a
since, in pure functional programming, if a.x == b.x, then we would have a == b. That is, a == b would be implemented comparing the values inside.
However, Scala is not pure, since it allows mutation, like Java. In such case, we do NOT have the equivalence between the two snippets above, when we declare case class C(var x: Int). Indeed, performing a.x += 1 afterwords does not affect b.x in the first snippet, but does in the second one, where a and b point to the same object. In such case, it is useful to have a comparison a == b which compares the object references, rather than its inner integer value.
When using case class C(x: Int), Scala comparisons a == b behave closer to pure functional programming, comparing the integers values. With regular (non case) classes, Scala instead compares object references breaking the equivalence between the two snippets. But, again, Scala is not pure. By comparison, in Haskell
data C = C Int deriving (Eq)
a = C 42
b = C 42
is indeed equivalent to
data C = C Int deriving (Eq)
a = C 42
b = a
since there are no "references" or "object identities" in Haskell. Note that the Haskell implementation likely will allocate two "objects" in the first snippet, and only one object in the second one, but since there is no way to tell them apart inside Haskell, the program output will be the same.
Answer
Is a function a value ? (then what it means when equating two function: assert(f==g). For two function that is equivalent but defined separately => f!=g, why not they work like 1==1)
Yes, functions are values in pure functional programming.
Above, when you mention "function that is equivalent but defined separately", you are assuming that we can compare the "references" or "object identities" for these two functions. In pure functional programming we can not.
Pure functional programming should compare functions making f == g equivalent to f x == g x for all possible arguments x. This is feasible when there is only a few values for x, e.g. if f,g :: Bool -> Int we only need to check x=True, x=False. For functions having infinite domains, this is much harder. For instance, if f,g :: String -> Int we can not check infinitely many strings.
Theoretical computer science (computability theory) also proved that there is no algorithm to compare two functions String -> Int, not even an inefficient algorithm, not even if we have access to the source code of the two functions. For this mathematical reason, we must accept that functions are values that can not be compared. In Haskell, we express this through the Eq typeclass, stating that almost all the standard types are comparable, functions being the exception.
Is an object with methods a value ? (for example, IO{println("")})
Yes. Roughly speaking, "everything is a value", including IO actions.
Is an object with setter methods and mutable states a value ?
Is an object with mutable states and works as a state machine a value ?
There is no mutable state in pure functional programming.
At best, the setters can produce a "new" object with the modified fields.
And yes, the object would be a value.
How do we test if it is a value, is that immutable can be a sufficient condition to be a value ?
In pure functional programming, we can only have immutable data.
In impure functional programming, I think we can call most immutable objects "values", when we do not compare object references. If the "immutable" object contains a reference to a mutable object, e.g.
case class D(var x: Int)
case class C(c: C)
val a = C(D(42))
then things are more tricky. I guess we could still call a "immutable", since we can not alter a.c, but we should be careful since a.c.x can be mutated.
Depending on the intent, I think that some would not call a immutable. I would not consider a to be a value.
To make things more muddy, in impure programming, there are objects which use mutation to present a "pure" interface in an efficient way. For instance one can write a pure function that, before returning, stores its result in a cache. When called again on the same argument, it will return the previously computed result
(this is usually called memoization). Here, mutation happens, but it is not observable from outside, where at most we can observe a faster implementation. In this case, we can simply pretend the that function is pure (even if it performs mutation) and consider it a "value".
The contrast with imperative languages is stark. In inperitive languages, like Python, the output of a function is directed. It can be assigned to a variable, explicitly returned, printed or written to a file.
When I compose a function in Haskell, I never consider output. I never use "return" Everything has "a" value. This is called "symbolic" programming. By "everything", is meant "symbols". Like human language, the nouns and verbs represent something. That something is their value. The "value" of "Pete" is Pete. The name "Pete" is not Pete but is a representation of Pete, the person. The same is true of functional programming. The best analogy is math or logic When you do pages of calculations, do you direct the output of each function? You even "assign" variables to be replaced by their "value" in functions or expressions.
Values are
Immutable/Timeless
Anonymous
Semantically Transparent
What is the value of 42? 42. What is the "value" of new Date()? Date object at 0x3fa89c3. What is the identity of 42? 42. What is the identity of new Date()? As we saw in the previous example, it's the thing that lives at the place. It may have many different "values" in different contexts but it has only one identity. OTOH, 42 is sufficient unto itself. It's semantically meaningless to ask where 42 lives in the system. What is the semantic meaning of 42? Magnitude of 42. What is the semantic meaning of new Foo()? Who knows.
I would add a fourth criterion (see this in some contexts in the wild but not others) which is: values are language agnostic (I'm not certain the first 3 are sufficient to guarantee this nor that such a rule is entirely consistent with most people's intuition of what value means).
Values are things that
functions can take as inputs and return as outputs, that is, can be computed, and
are members of a type, that is, elements of some set, and
can be bound to a variable, that is, can be named.
First point is really the crucial test whether something is a value. Perhaps the word value, due to conditioning, might immediately make us think of just numbers, but the concept is very general. Essentially anything we can give to and get out of a function can be considered a value. Numbers, strings, booleans, instances of classes, functions themselves, predicates, and even types themselves, can be inputs and outputs of functions, and thus are values.
IO monad is a great example of how general this concept is. When we say IO monad models side-effects as values, we mean a function can take a side-effect (say println) as input and return as output. IO(println(...)) separates the idea of the effect of an action of println from the actual execution of an action, and allows these effects to be considered as first class values that can be computed with using the same language facilities as for any other values such as numbers.
From my understanding, a partially applied function are functions, which we can invoke without
passing all/some of the required arguments.
def add(x:Int, y:Int) = x + y
val paf = add(_ :Int, 3)
val paf1 = add(_ :Int, _ :Int)
In the above example, paf1 refers to partially applied function with all the arguments missing and I can invoke is using: paf1(10,20) and the original function can be invoked using add(10,20)
My question is, what is the extra benefit of creating a partially applied function with all the arguments missing, since the invocation syntax is pretty much the same? Is it just to convert methods into first class functions?
Scala's def keyword is how you define methods and methods are not functions (in Scala). So your add is not a first-class function entity the way your paf1 is, even if they're semantically equivalent in what they do with their arguments in producing a result.
Scala will automatically use partial application to turn a method into an equivalent function, which you can see by extending your example a bit:
def add(x: Int, y: Int) = x + y
...
val pa2: (Int, Int) => Int = add
pa2: (Int, Int) => Int = <function2>
This may seem of little benefit in this example, but in many cases there are non-explicit constraints indicating a function is required (or more accurately, constraints specified explicitly elsewhere) that allow you to simply give a (type-compatible) method name in a place where you need a function.
There is a difference between Methods and functions.
If you look at the declaration of List.map for example, it really expects a function. But the Scala compiler is smart enough to accept both methods and functions.
A quote from here
this trick ... for coercing a method into something where a function is expected, is so easy that even the compiler can detect and do it. In fact, this automatic coercion got an own name – it’s called Eta expansion.
On the other hand, have a look at Java 8; as far as I can tell, it's not that easy there.
Update: the question was, Why would I ever want an eta-expanded method? One of the great rhetorical strategies in the Scala bible is that they lead you to an example over many pages. I employ the Exodus metaphor below because I just saw "The Ten Commandments" with Charlton Heston. I'm not pretending that this answer is more explanatory than Randall's.
You might need someone with lesser rep to note that the big build-up in the bible's book of Exodus is to:
http://www.artima.com/pins1ed/first-steps-in-scala.html#step6
args foreach println.
The previous step among the "first steps" is, indeed,
args foreach (arg => println(arg))
but I'm guessing no one does it that way if the type inference gods are kind.
From the change log in the spec: "a partially unapplied method is now designated m _ instead
of the previous notation &m." That is, at a certain point, the notion of a "function ptr" became a partial function with no args supplied. Which is what it is. Update: "metaphorically."
I'm sure others can synthesize a bunch of use-cases for this, but really it's just a consequence of the fact that functions are values. It would be much weirder if you couldn't pass a function around as a normal value.
Scala's handling of this stuff is a little clunky. Partial application is usually combined with currying. I consider it a quirk that you can basically eta-expand any expression using _. What you're effectively doing (modulo Scala's syntactic quirks around currying) by writing add(_ : Int, _ : Int) is writing (x : Int) => (y : Int) => add(x, y). I'm sure you can think of instances where the latter definition might be useful in a program.
I understand different notions of functional programming by itself: side effects, immutability, pure functions, referential transparency. But I can't connect them together in my head. For example, I have the following questions:
What is the relation between ref. transparency and immutability. Does one implies the other?
Sometimes side effects and immutability is used interchangeably. Is it correct?
This question requires some especially nit-picky answers, since it's about defining common vocabulary.
First, a function is a kind of mathematical relation between a "domain" of inputs and a "range" (or codomain) of outputs. Every input produces an unambiguous output. For example, the integer addition function + accepts input in the domain Int x Int and produces outputs in the range Int.
object Ex0 {
def +(x: Int, y: Int): Int = x + y
}
Given any values for x and y, clearly + will always produce the same result. This is a function. If the compiler were extra clever, it could insert code to cache the results of this function for each pair of inputs, and perform a cache lookup as an optimization. That's clearly safe here.
The problem is that in software, the term "function" has been somewhat abused: although functions accept arguments and return values as declared in their signature, they can also read and write to some external context. For example:
class Ex1 {
def +(x: Int): Int = x + Random.nextInt
}
We can't think of this as a mathematical function anymore, because for a given value of x, + can produce different results (depending on a random value, which doesn't appear anywhere in +'s signature). The result of + can not be safely cached as described above. So now we have a vocabulary problem, which we solve by saying that Ex0.+ is pure, and Ex1.+ isn't.
Okay, since we've now accepted some degree of impurity, we need to define what kind of impurity we're talking about! In this case, we've said the difference is that we can cache Ex0.+'s results associated with its inputs x and y, and that we can't cache Ex1.+'s results associated with its input x. The term we use to describe cacheability (or, more properly, substitutability of a function call with its output) is referential transparency.
All pure functions are referentially transparent, but some referentially transparent functions aren't pure. For example:
object Ex2 {
var lastResult: Int
def +(x: Int, y: Int): Int = {
lastResult = x + y
lastResult
}
}
Here we're not reading from any external context, and the value produced by Ex2.+ for any inputs x and y will always be cacheable, as in Ex0. This is referentially transparent, but it does have a side effect, which is to store the last value computed by the function. Somebody else can come along later and grab lastResult, which will give them some sneaky insight into what's been happening with Ex2.+!
A side note: you can also argue that Ex2.+ is not referentially transparent, because although caching is safe with respect to the function's result, the side effect is silently ignored in the case of a cache "hit." In other words, introducing a cache changes the program's meaning, if the side effect is important (hence Norman Ramsey's comment)! If you prefer this definition, then a function must be pure in order to be referentially transparent.
Now, one thing to observe here is that if we call Ex2.+ twice or more in a row with the same inputs, lastResult will not change. The side effect of invoking the method n times is equivalent to the side effect of invoking the method only once, and so we say that Ex2.+ is idempotent. We could change it:
object Ex3 {
var history: Seq[Int]
def +(x: Int, y: Int): Int = {
result = x + y
history = history :+ result
result
}
}
Now, each time we call Ex3.+, the history changes, so the function is no longer idempotent.
Okay, a recap so far: a pure function is one that neither reads from nor writes to any external context. It is both referentially transparent and side effect free. A function that reads from some external context is no longer referentially transparent, whereas a function that writes to some external context is no longer free of side effects. And finally, a function which when called multiple times with the same input has the same side effect as calling it only once, is called idempotent. Note that a function with no side effects, such as a pure function, is also idempotent!
So how does mutability and immutability play into all this? Well, look back at Ex2 and Ex3. They introduce mutable vars. The side effects of Ex2.+ and Ex3.+ is to mutate their respective vars! So mutability and side effects go hand-in-hand; a function that operates only on immutable data must be side effect free. It might still not be pure (that is, it might not be referentially transparent), but at least it will not produce side effects.
A logical follow-up question to this might be: "what are the benefits of a pure functional style?" The answer to that question is more involved ;)
"No" to the first - one implies the other, but not the converse, and a qualified "Yes" to the second.
"An expression is said to be referentially transparent if it can be replaced with its value without changing the behavior of a program".
Immutable input suggests that an expression (function) will always evaluate to the same value, and therefore be referentially transparent.
However, (mergeconflict has kindly correct me on this point) being referentially transparent does not necessarily require immutability.
By definition, a side-effect is an aspect of a function; meaning that when you call a function, it changes something.
Immutability is an aspect of data; it cannot be changed.
Calling a function on such does imply that there can be no side-effects. (in Scala, that's limited to "no changes to the immutable object(s)" - the developer has responsibilities and decisions).
While side-effect and immutability don't mean the same thing, they are closely related aspects of a function and the data the function is applied to.
Since Scala is not a pure functional programming language, one must be careful when considering the meaning of such statements as "immutable input" - the scope of the input to a function may include elements other than those passed as parameters. Similarly for considering side-effects.
It rather depends on the specific definitions you use (there can be disagreement, see for example Purity vs Referential transparency), but I think this is a reasonable interpretation:
Referential transparency and 'purity' are properties of functions/expressions. A function/expression may or may not have side-effects. Immutability, on the other hand, is a property of objects, not functions/expressions.
Referential transparency, side-effects and purity are closely related: 'pure' and 'referentially transparent' are equivalent, and those notions are equivalent to the absence of side-effects.
An immutable object may have methods which are not referentially transparent: those methods will not change the object itself (as that would make the object mutable), but may have other side-effects such as performing I/O or manipulating their (mutable) parameters.
This is just one of those "I was wondering..." questions.
Scala has immutable data structures and (optional) lazy vals etc.
How close can a Scala program be to one that is fully pure (in a functional programming sense) and fully lazy (or as Ingo points out, can it be sufficiently non-strict)? What values are unavoidably mutable and what evaluation unavoidably greedy?
Regarding lazyness - currently, passing a parameter to a method is by default strict:
def square(a: Int) = a * a
but you use call-by-name parameters:
def square(a: =>Int) = a * a
but this is not lazy in the sense that it computes the value only once when needed:
scala> square({println("calculating");5})
calculating
calculating
res0: Int = 25
There's been some work into adding lazy method parameters, but it hasn't been integrated yet (the below declaration should print "calculating" from above only once):
def square(lazy a: Int) = a * a
This is one piece that is missing, although you could simulate it with a local lazy val:
def square(ap: =>Int) = {
lazy val a = ap
a * a
}
Regarding mutability - there is nothing holding you back from writing immutable data structures and avoid mutation. You can do this in Java or C as well. In fact, some immutable data structures rely on the lazy primitive to achieve better complexity bounds, but the lazy primitive can be simulated in other languages as well - at the cost of extra syntax and boilerplate.
You can always write immutable data structures, lazy computations and fully pure programs in Scala. The problem is that the Scala programming model allows writing non pure programs as well, so the type checker can't always infer some properties of the program (such as purity) which it could infer given that the programming model was more restrictive.
For example, in a language with pure expressions the a * a in the call-by-name definition above (a: =>Int) could be optimized to evaluate a only once, regardless of the call-by-name semantics. If the language allows side-effects, then such an optimization is not always applicable.
Scala can be as pure and lazy as you like, but a) the compiler won't keep you honest with regards to purity and b) it will take a little extra work to make it lazy. There's nothing too profound about this; you can even write lazy and pure Java code if you really want to (see here if you dare; achieving laziness in Java requires eye-bleeding amounts of nested anonymous inner classes).
Purity
Whereas Haskell tracks impurities via the type system, Scala has chosen not to go that route, and it's difficult to tack that sort of thing on when you haven't made it a goal from the beginning (and also when interoperability with a thoroughly impure language like Java is a major goal of the language).
That said, some believe it's possible and worthwhile to make the effort to document effects in Scala's type system. But I think purity in Scala is best treated as a matter of self-discipline, and you must be perpetually skeptical about the supposed purity of third-party code.
Laziness
Haskell is lazy by default but can be made stricter with some annotations sprinkled in your code... Scala is the opposite: strict by default but with the lazy keyword and by-name parameters you can make it as lazy as you like.
Feel free to keep things immutable. On the other hand, there's no side effect tracking, so you can't enforce or verify it.
As for non-strictness, here's the deal... First, if you choose to go completely non-strict, you'll be forsaking all of Scala's classes. Even Scalaz is not non-strict for the most part. If you are willing to build everything yourself, you can make your methods non-strict and your values lazy.
Next, I wonder if implicit parameters can be non-strict or not, or what would be the consequences of making them non-strict. I don't see a problem, but I could be wrong.
But, most problematic of all, function parameters are strict, and so are closures parameters.
So, while it is theoretically possible to go fully non-strict, it will be incredibly inconvenient.