Explaining flatMap associativity - scala

I am reading Functional Programming in Scala, and the book notes that flatMap for monads must obey the associativity law shown below.
x.flatMap(f).flatMap(g) == x.flatMap(a => f(a).flatMap(g))
I normally take associativity to mean something like (a+(b+c)) == ((a+b)+c), but I am failing to translate the equation here into something similar.
The two sides seem equivalent to me. Assuming x is of type M[A], they both seem to apply f first and then apply flatMap(g) to the result of f(a).
What is the significance of this law?

If you're confused by the syntax and have trouble seeing the analogy to (a+(b+c))==((a+b)+c), consider composing functions of type A => M[B], where A and B can change while M stays the same. Now consider an operation which composes these functions like this:
def compose[A, B, C](f: A => M[B], g: B => M[C]): A => M[C] =
  a => f(a).flatMap(g)
Now the associativity law reads like:
compose(compose(f, g), h) == compose(f, compose(g, h))
If we had some infix operator for compose, it could look like this:
(f comp g) comp h == f comp (g comp h)
BTW: in functional programming terminology, functions of type A => M[B] are called Kleisli arrows.
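To see the analogy in running code, here is a minimal sketch with Option standing in for M (the functions are made up for illustration):

type M[A] = Option[A]

def compose[A, B, C](f: A => M[B], g: B => M[C]): A => M[C] =
  a => f(a).flatMap(g)

val f: Int => M[Int]    = x => if (x != 0) Some(100 / x) else None
val g: Int => M[Int]    = x => if (x > 0) Some(x - 1) else None
val h: Int => M[String] = x => Some(x.toString)

// both groupings yield the same result, as the law promises
assert(compose(compose(f, g), h)(4) == compose(f, compose(g, h))(4))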

Related

What kind of morphism is `filter` in category theory?

In category theory, is the filter operation considered a morphism? If yes, what kind of morphism is it? Example (in Scala):
val myNums: Seq[Int] = Seq(-1, 3, -4, 2)
myNums.filter(_ > 0)
// Seq[Int] = List(3, 2) // result = subset, same type
myNums.filter(_ > -99)
// Seq[Int] = List(-1, 3, -4, 2) // result = identical to original
myNums.filter(_ > 99)
// Seq[Int] = List() // result = empty, same type
One interesting way of looking at this matter involves not picking filter as a primitive notion. There is a Haskell type class called Filterable which is aptly described as:
Like Functor, but it [includes] Maybe effects.
Formally, the class Filterable represents a functor from Kleisli Maybe to Hask.
The morphism mapping of the "functor from Kleisli Maybe to Hask" is captured by the mapMaybe method of the class, which is indeed a generalisation of the homonymous Data.Maybe function:
mapMaybe :: Filterable f => (a -> Maybe b) -> f a -> f b
The class laws are simply the appropriate functor laws (note that Just and (<=<) are, respectively, identity and composition in Kleisli Maybe):
mapMaybe Just = id
mapMaybe (g <=< f) = mapMaybe g . mapMaybe f
The class can also be expressed in terms of catMaybes...
catMaybes :: Filterable f => f (Maybe a) -> f a
... which is interdefinable with mapMaybe (cf. the analogous relationship between sequenceA and traverse)...
catMaybes = mapMaybe id
mapMaybe g = catMaybes . fmap g
... and amounts to a natural transformation between the Hask endofunctors Compose f Maybe and f.
What does all of that have to do with your question? Firstly, a functor is a morphism between categories, and a natural transformation is a morphism between functors. That being so, it is possible to talk of morphisms here in a sense that is less boring than the "morphisms in Hask" one. You won't necessarily want to do so, but in any case it is an existing vantage point.
Secondly, filter is, unsurprisingly, also a method of Filterable, its default definition being:
filter :: Filterable f => (a -> Bool) -> f a -> f a
filter p = mapMaybe $ \a -> if p a then Just a else Nothing
Or, to spell it using another cute combinator:
filter p = mapMaybe (ensure p)
That indirectly gives filter a place in this particular constellation of categorical notions.
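If you find the Haskell easier to follow with a Scala counterpart, here is a minimal sketch of mapMaybe and the derived filter, specialised to List (an assumption; the Haskell class abstracts over the container f):

def mapMaybe[A, B](f: A => Option[B])(fa: List[A]): List[B] =
  fa.flatMap(a => f(a).toList)

def filter[A](p: A => Boolean)(fa: List[A]): List[A] =
  mapMaybe((a: A) => if (p(a)) Some(a) else None)(fa)

filter[Int](_ > 0)(List(-1, 3, -4, 2)) // List(3, 2)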
To answer a question like this, I'd first like to understand what the essence of filtering is.
For instance, does it matter that the input is a list? Could you filter a tree? I don't see why not! You'd apply a predicate to each node of the tree and discard the ones that fail the test.
But what would be the shape of the result? Node deletion is not always defined or it's ambiguous. You could return a list. But why a list? Any data structure that supports appending would work. You also need an empty member of your data structure to start the appending process. So any unital magma would do. If you insist on associativity, you get a monoid. Looking back at the definition of filter, the result is a list, which is indeed a monoid. So we are on the right track.
So filter is just a special case of what's called Foldable: a data structure over which you can fold while accumulating the results in a monoid. In particular, you can use the predicate to output either a singleton list (if it's true) or an empty list, the identity element (if it's false).
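A minimal Scala sketch of this idea, with List serving as both the structure being folded and the target monoid:

// each element contributes either a singleton list or the empty list
// (the monoid identity), and the contributions are appended
def filterViaFold[A](p: A => Boolean)(xs: List[A]): List[A] =
  xs.foldRight(List.empty[A])((a, acc) => (if (p(a)) List(a) else Nil) ++ acc)

filterViaFold[Int](_ > 0)(List(-1, 3, -4, 2)) // List(3, 2)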
If you want a categorical answer, then a fold is an example of a catamorphism, an example of a morphism in the category of algebras. The (recursive) data structure you're folding over (a list, in the case of filter) is an initial algebra for some functor (the list functor, in this case), and your predicate is used to define an algebra for this functor.
In this answer, I will assume that you are talking about filter on Set (the situation seems messier for other datatypes).
Let's first fix what we are talking about. I will talk specifically about the following function (in Scala):
def filter[A](p: A => Boolean): Set[A] => Set[A] =
  s => s filter p
When we write it down this way, we see clearly that it's a polymorphic function with type parameter A that maps predicates A => Boolean to functions from Set[A] to Set[A]. To make it a "morphism", we would first have to find some categories in which this thing could be a "morphism". One might hope that it's a natural transformation, and therefore a morphism in the category of endofunctors on the "default ambient category-esque structure" usually referred to as "Hask" (or "Scal"? "Scala"?). To show that it's natural, we would have to check that the following diagram commutes for every f: B => A:
                         - o f
   Hom[A, Boolean] ---------------------> Hom[B, Boolean]
          |                                      |
          | filter[A]                            | filter[B]
          |                                      |
          V                  ???                 V
Hom[Set[A], Set[A]] -------------------> Hom[Set[B], Set[B]]
However, here we fail immediately, because it's not clear what to even put on the bottom horizontal arrow: the assignment A -> Hom[Set[A], Set[A]] doesn't even seem functorial (for the same reasons why A -> End[A] is not functorial, see here and also here).
The only "categorical" structure that I see here for a fixed type A is the following:
Predicates on A can be considered to be a partially ordered set with implication, that is p LEQ q if p implies q (i.e. either p(x) must be false, or q(x) must be true for all x: A).
Analogously, on functions Set[A] => Set[A], we can define a partial order with f LEQ g whenever for each set s: Set[A] it holds that f(s) is subset of g(s).
Then filter[A] would be monotonic, and therefore a functor between poset-categories. But that's somewhat boring.
Of course, for each fixed A, it (or rather its eta-expansion) is also just a function from A => Boolean to Set[A] => Set[A], so it's automatically a "morphism" in the "Hask-category". But that's even more boring.
filter can be written in terms of foldRight as:
def filter[A](p: A => Boolean, ys: List[A]): List[A] =
  ys.foldRight(List.empty[A])((x, xs) => if (p(x)) x :: xs else xs)
foldRight on lists is a map of T-algebras (where here T is the List datatype functor), so filter is a map of T-algebras.
The two algebras in question here are the initial list algebra
[nil, cons]: 1 + A x List(A) ----> List(A)
and, let's say the "filter" algebra,
[nil, f]: 1 + A x List(A) ----> List(A)
where f(x, xs) = if (p(x)) x :: xs else xs.
Let's call filter(p, _) the unique map from the initial algebra to the filter algebra in this case (it is called fold in the general case). The fact that it is a map of algebras means that the following equations are satisfied:
filter(p, nil) = nil
filter(p, x::xs) = f(x, filter(p, xs))
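A quick Scala check of these two equations on a throwaway predicate (not part of the original answer, just a sanity test):

def f[A](p: A => Boolean)(x: A, xs: List[A]): List[A] =
  if (p(x)) x :: xs else xs

val p: Int => Boolean = _ % 2 == 0

assert(List.empty[Int].filter(p) == Nil)                             // filter(p, nil) = nil
assert((3 :: List(2, 4)).filter(p) == f(p)(3, List(2, 4).filter(p))) // filter(p, x::xs) = f(x, filter(p, xs))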

Functor for sequence computation

I am reading the Scala with Cats book from underscore.io. It says the following about Monad and Functor:
While monads and functors are the most widely used sequencing data types...
I can see that Monad is used for sequencing data, but not Functor. Could someone please show how computation is sequenced with functors?
Seq(1, 2, 3).map(_ * 2).map(_.toString).foreach(println)
Here: you have a sequence of operations on a sequence of data.
Every monad is actually a functor, because you can implement map with flatMap and unit/pure/whatever your implementation calls it. So if you agree that monads are "sequencing data types", then you should agree that functors are too.
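A minimal sketch of that derivation, with Option standing in for the monad and Some playing the role of pure:

// map derived from flatMap and pure: every monad is a functor
def map[A, B](fa: Option[A])(f: A => B): Option[B] =
  fa.flatMap(a => Some(f(a)))

map(Option(21))(_ * 2) // Some(42)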
Taken out of context, this statement is less clear than it could be.
A fuller version of the quote is:
While monads and functors are the most widely used sequencing data types
[...], semigroupals and applicatives are the most general.
The goal of this statement is not to erase the difference between functorial and monadic notions of "sequencing", but rather to contrast them with obviously non-sequential operations provided by Semigroupal.
Both Functor and Monad do support (different) kinds of "sequencing".
Given a value x of type F[X] for some functor F and some type X, we can "sequence" pure functions
f: X => Y
g: Y => Z
like this:
x map f map g
You can call this "sequencing", at least "elementwise". The point is that g has to wait until f produces at least a single y of type Y in order to do anything useful. However, this does not mean that all invocations of f must be finished before g is invoked for the first time, e.g. if x is a long list, one could process each element in parallel - that's why I called it "elementwise".
With monads that represent monadic effects, the notion of "sequencing" is usually taken a bit more seriously. For example, if you are working with a value x of type M[X] = Writer[Vector[String], X], and you have two functions with the "writing"-effect
f: X => M[Y]
g: Y => M[Z]
and then you sequence them like this:
x flatMap f flatMap g
you really want f to finish completely before g begins to write anything into the Vector[String]-typed log. So here, this is literally just "sequencing", without any fine print.
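A runnable sketch of this, assuming cats 2.x on the classpath (the log messages and values are made up for illustration):

import cats.data.Writer

type Logged[A] = Writer[Vector[String], A]

val f: Int => Logged[Int] = x => Writer(Vector(s"f saw $x"), x + 1)
val g: Int => Logged[Int] = y => Writer(Vector(s"g saw $y"), y * 2)

val x: Logged[Int] = Writer(Vector("start"), 3)

// f's log entry is committed before g writes anything
(x flatMap f flatMap g).run
// (Vector("start", "f saw 3", "g saw 4"), 8)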
Now, contrast this with Semigroupal.
Suppose that F is semigroupal, that is, we can always form a F[(A,B)] from F[A] and F[B]. Given two functions
f: X => F[A]
g: X => F[B]
we can build a function
(x: X) => product(f(x), g(x))
that returns results of type F[(A, B)]. Note that now f and g can process x completely independently: whatever it is, it is definitely not sequencing.
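For instance, product for Option can be written without flatMap at all, which underlines the independence (a sketch, with Option standing in for F):

// both sides are inspected symmetrically; neither waits on the other
def product[A, B](fa: Option[A], fb: Option[B]): Option[(A, B)] =
  (fa, fb) match {
    case (Some(a), Some(b)) => Some((a, b))
    case _                  => None
  }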
Similarly, for an Applicative F and functions
f: A => X
g: B => Y
c: (X, Y) => Z
and two independent values a: F[A], b: F[B], you can process a and b completely independently with f and g, and then combine the results in the end with c into a single F[Z]:
map2(a, b){ (a, b) => c(f(a), g(b)) }
Again: f and g don't know anything about each other's inputs and outputs, they work completely independently until the very last step c, so this is again not "sequencing".
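Again with Option standing in for F, a minimal map2 makes that last-step combination visible:

def map2[A, B, Z](fa: Option[A], fb: Option[B])(c: (A, B) => Z): Option[Z] =
  (fa, fb) match {
    case (Some(a), Some(b)) => Some(c(a, b))
    case _                  => None
  }

// c runs only at the very end, on independently produced results
map2(Some(1), Some("!")) { (a, b) => s"$a$b" } // Some("1!")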
I hope it clarifies the distinction somewhat.

Why is there no >=> semigroup for A => M[A] in Scalaz?

This is a follow-up to my previous question.
Kleisli defines two operators: <=< (compose) and >=> (andThen). The >=> looks very natural to me, and I don't understand how <=< can be useful.
Moreover, it looks like there is no >=> semigroup for A => M[A], but the <=< semigroup does exist.
What is the rationale behind this?
compose (or <=<) is a little more natural when translating between point-free and non-point-free styles. For example, if we have these functions:
val f: Int => Int = _ + 1
val g: Int => Int = _ * 10
We get the following equivalences:
scala> (f andThen g)(3) == g(f(3))
res0: Boolean = true
scala> (f compose g)(3) == f(g(3))
res1: Boolean = true
In the compose case the f and g are in the same order on both sides of the equation.
Unfortunately Scala's type inference often makes andThen (or >=>) more convenient, and it tends to be more widely used than compose. So this is a case where mathematical conventions and the quirks of Scala's type inference system are at odds. Scalaz (not too surprisingly, given the culture of the project) chooses the math side.
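To see the same ordering story for Kleisli arrows, here is a dependency-free sketch of both operators for Option (the real Scalaz versions live on Kleisli):

// andThen-style (>=>): runs f first, then g; reads left to right
def kleisliAndThen[A, B, C](f: A => Option[B], g: B => Option[C]): A => Option[C] =
  a => f(a).flatMap(g)

// compose-style (<=<): f stays on the left in both the call and the
// result, mirroring mathematical composition
def kleisliCompose[A, B, C](f: B => Option[C], g: A => Option[B]): A => Option[C] =
  a => g(a).flatMap(f)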

How to concisely express function iteration?

Is there a concise, idiomatic way to express function iteration? That is, given a number n and a function f :: a -> a, I'd like to express \x -> f(...(f(x))...) where f is applied n times.
Of course, I could make my own, recursive function for that, but I'd be interested if there is a way to express it shortly using existing tools or libraries.
So far, I have these ideas:
\n f x -> foldr (const f) x [1..n]
\n -> appEndo . mconcat . replicate n . Endo
but they all use intermediate lists, and aren't very concise.
The shortest one I found so far uses semigroups:
\n f -> appEndo . times1p (n - 1) . Endo,
but it works only for positive numbers (not for 0).
Primarily I'm focused on solutions in Haskell, but I'd be also interested in Scala solutions or even other functional languages.
Because Haskell is influenced by mathematics so much, the definition from the Wikipedia page you've linked to almost directly translates to the language.
Just check this out. The definition from Wikipedia is:

f^0 = id
f^n = f ∘ f^(n-1)

Now in Haskell:
iterateF 0 _ = id
iterateF n f = f . iterateF (n - 1) f
Pretty neat, huh?
So what is this? It's a typical recursion pattern. And how do Haskellers usually treat that? We treat that with folds! So after refactoring we end up with the following translation:
iterateF :: Int -> (a -> a) -> (a -> a)
iterateF n f = foldr (.) id (replicate n f)
or point-free, if you prefer:
iterateF :: Int -> (a -> a) -> (a -> a)
iterateF n = foldr (.) id . replicate n
As you can see, there is no notion of the subject function's arguments either in the Wikipedia definition or in the solutions presented here. It is a function on another function, i.e. the subject function is treated as a value. This is a higher-level approach to the problem than an implementation involving the arguments of the subject function.
Now, concerning your worries about the intermediate lists: from the source-code perspective this solution turns out to be very similar to the Scala solution posted by @jmcejuela, but with a key difference: GHC's optimizer throws away the intermediate list entirely, turning the function into a simple recursive loop over the subject function. I don't think it could be optimized any better.
To comfortably inspect the intermediate compiler results for yourself, I recommend using ghc-core.
In Scala:
Function chain Seq.fill(n)(f)
See scaladoc for Function. Lazy version: Function chain Stream.fill(n)(f)
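For example (a throwaway function, just to show the call):

val addOne: Int => Int = _ + 1

val addFive: Int => Int = Function.chain(Seq.fill(5)(addOne))
addFive(0) // 5

// n = 0 works too: chaining no functions yields the identity
Function.chain(Seq.fill(0)(addOne))(42) // 42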
Although this is not as concise as jmcejuela's answer (which I prefer), there is another way in Scala to express such a function without the Function module. It also works when n = 0.
def iterate[T](f: T => T, n: Int) = (x: T) => (1 to n).foldLeft(x)((res, _) => f(res))
To avoid creating an intermediate list, one can use explicit recursion, which in turn requires an explicit result type.
def iterate[T](f: T=>T, n: Int): T=>T = (x: T) => (if(n == 0) x else iterate(f, n-1)(f(x)))
There is an equivalent solution using pattern matching like the solution in Haskell:
def iterate[T](f: T => T, n: Int): T => T = (x: T) => n match {
  case 0 => x
  case _ => iterate(f, n - 1)(f(x))
}
Finally, I prefer the short way of writing it in Caml, where there is no need to define the types of the variables at all.
let rec iterate f n x = match n with 0 -> x | n -> iterate f (n - 1) (f x);;
let f5 = iterate f 5 in ...
I like pigworker's/tauli's ideas the best, but since they only gave them as comments, I'm making a CW answer out of them.
\n f x -> iterate f x !! n
or
\n f -> (!! n) . iterate f
perhaps even:
\n -> ((!! n) .) . iterate

How can I write f(g(h(x))) in Scala with fewer parentheses?

Expressions like
ls map (_ + 1) sum
are lovely because they are left-to-right and not nested. But if the functions in question are defined outside the class, it is less pretty.
Following an example, I tried
import scala.math.sqrt

final class DoublePlus(val self: Double) {
  def hypot(x: Double) = sqrt(self * self + x * x)
}

implicit def doubleToDoublePlus(x: Double) =
  new DoublePlus(x)
which works fine as far as I can tell, other than
A lot of typing for one method
You need to know in advance that you want to use it this way
Is there a trick that will solve those two problems?
You can call andThen on a function object:
(h andThen g andThen f)(x)
You can't call it on methods directly though, so maybe your h needs to become (h _) to transform the method into a partially applied function. The compiler will translate subsequent method names to functions automatically because the andThen method accepts a Function parameter.
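A small sketch of that, with throwaway defs (the names are assumptions, not from the question):

def h(x: Int): Int    = x + 1
def g(x: Int): Int    = x * 2
def f(x: Int): String = x.toString

// (h _) turns the method into a Function1; g and f are converted
// automatically because andThen expects function arguments
val pipeline = (h _) andThen g andThen f
pipeline(5) // "12"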
You could also use the pipe operator |> to write something like this:
x |> h |> g |> f
Enriching an existing class/interface with an implicit conversion (which is what you did with doubleToDoublePlus) is all about API design when some classes aren't under your control. I don't recommend doing that lightly just to save a few keystrokes or a few parentheses. So if it's important to be able to type val h = d hypot x, then the extra keystrokes should not be a concern (there may be object-allocation concerns, but that's different).
The title and your example also don't match:
f(g(h(x))) can be rewritten as f _ compose g _ compose h _ apply x if your concern is about parentheses, or f compose g compose h apply x if f, g, h are function objects rather than defs.
But ls map (_ + 1) sum isn't a nested call, as you say, so I'm not sure how it relates to the title. And although it's lovely to use, the library/language designers went through a lot of effort to make it easy to use, and under the hood it is not simple (much more complex than your hypot example).
def fgh(n: N) = f(g(h(n)))
val m = fgh(n)
Maybe this; observe how a is provided:
def compose[A, B, C](f: B => C, g: A => B): A => C = (a: A) => f(g(a))
Basically, like the answer above, combine the desired functions into an intermediate one, which you can then use easily with map.
Starting with Scala 2.13, the standard library provides the chaining operation pipe, which can be used to convert/pipe a value with a function of interest.
Using multiple pipes, we can thus build a pipeline which, as mentioned in the title of your question, minimizes the number of parentheses:
import scala.util.chaining._
x pipe h pipe g pipe f
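For instance, with throwaway functions standing in for f, g, and h:

import scala.util.chaining._

val h: Int => Int    = _ + 1
val g: Int => Int    = _ * 2
val f: Int => String = _.toString

5 pipe h pipe g pipe f // "12"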