I am reading the Scala Cats book from underscore.io. It says the following about Monad and Functor:
While monads and functors are the most widely used sequencing data
types..
I can see that Monad is used for sequencing computations, but I don't see how Functor is. Could someone please show how computations are sequenced with functors?
Seq(1, 2, 3).map(_ * 2).map(_.toString).foreach(println)
Here: you have a sequence of operations on a sequence of data.
Every monad is actually a functor, because you could implement map with flatMap and unit/pure/whatever your implementation calls it. So if you agree that monads are "sequencing data types", then you should agree that functors are too.
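To make the "map from flatMap and pure" claim concrete, here is a minimal sketch. The Monad trait and OptionMonad instance below are hand-rolled for illustration (they are not the Cats definitions), and everything is wrapped in small objects only to keep the snippet self-contained.

```scala
// A hypothetical, stripped-down Monad type class (not the Cats one).
trait Monad[F[_]] {
  def pure[A](a: A): F[A]
  def flatMap[A, B](fa: F[A])(f: A => F[B]): F[B]

  // map is derived from flatMap and pure, so every Monad is also a Functor.
  def map[A, B](fa: F[A])(f: A => B): F[B] =
    flatMap(fa)(a => pure(f(a)))
}

// Illustrative instance for Option.
object OptionMonad extends Monad[Option] {
  def pure[A](a: A): Option[A] = Some(a)
  def flatMap[A, B](fa: Option[A])(f: A => Option[B]): Option[B] = fa.flatMap(f)
}

object MapFromFlatMapDemo {
  // Two "sequenced" map steps, both expressed through flatMap + pure.
  val doubled: Option[Int] = OptionMonad.map(Option(21))(_ * 2)
  val shown: Option[String] = OptionMonad.map(doubled)(_.toString)
}
```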
Taken out of context, this statement is less clear than it could be.
A fuller version of the quote is:
While monads and functors are the most widely used sequencing data types
[...], semigroupals and applicatives are the most general.
The goal of this statement is not to erase the difference between functorial and monadic notions of "sequencing", but rather to contrast them with obviously non-sequential operations provided by Semigroupal.
Both Functor and Monad do support (different) kinds of "sequencing".
Given a value x of type F[X] for some functor F and some type X, we can "sequence" pure functions
f: X => Y
g: Y => Z
like this:
x map f map g
You can call this "sequencing", at least "elementwise". The point is that g has to wait until f produces at least a single y of type Y in order to do anything useful. However, this does not mean that all invocations of f must be finished before g is invoked for the first time, e.g. if x is a long list, one could process each element in parallel - that's why I called it "elementwise".
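One way to see what "elementwise" means is to log the calls. The sketch below is only an illustration: the log buffer and the concrete f and g are made up for the demo. With a strict List every f runs before any g, while with a lazy Iterator the calls are interleaved per element; in both cases g never sees an element before f has produced it.

```scala
import scala.collection.mutable.ListBuffer

object ElementwiseDemo {
  // Mutable log used only to observe the order of calls.
  private val log = ListBuffer.empty[String]

  val f: Int => Int = x => { log += s"f($x)"; x + 1 }
  val g: Int => String = y => { log += s"g($y)"; y.toString }

  // Strict collection: the whole first map finishes before the second starts.
  val strictResult: List[String] = List(1, 2).map(f).map(g)
  val strictLog: List[String] = log.toList
  log.clear()

  // Lazy iterator: each element flows through f and then g before the next one.
  val lazyResult: List[String] = Iterator(1, 2).map(f).map(g).toList
  val lazyLog: List[String] = log.toList
}
```

Both runs produce the same result; only the interleaving of f and g differs.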
With monads that represent monadic effects, the notion of "sequencing" is usually taken a bit more seriously. For example, if you are working with a value x of type M[X] = Writer[Vector[String], X], and you have two functions with the "writing"-effect
f: X => M[Y]
g: Y => M[Z]
and then you sequence them like this:
x flatMap f flatMap g
you really want f to finish completely before g begins to write anything into the Vector[String]-typed log. So here, this is literally just "sequencing", without any fine print.
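As a rough sketch of that behaviour, here is a tiny hand-written stand-in for Writer[Vector[String], A]. The name Logged and the sample functions are assumptions made for this example; this is not the Cats Writer.

```scala
// Hypothetical minimal writer: a value together with the log produced so far.
final case class Logged[A](log: Vector[String], value: A) {
  // Runs f on the current value and appends its log after the existing one.
  def flatMap[B](f: A => Logged[B]): Logged[B] = {
    val next = f(value)
    Logged(log ++ next.log, next.value)
  }
}

object WriterDemo {
  val x: Logged[Int] = Logged(Vector("x = 1"), 1)
  val f: Int => Logged[Int] = n => Logged(Vector(s"f saw $n"), n + 1)
  val g: Int => Logged[String] = n => Logged(Vector(s"g saw $n"), n.toString)

  // f's entries always precede g's entries in the combined log.
  val result: Logged[String] = x.flatMap(f).flatMap(g)
}
```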
Now, contrast this with Semigroupal.
Suppose that F is semigroupal, that is, we can always form an F[(A, B)] from F[A] and F[B]. Given two functions
f: X => F[A]
g: X => F[B]
we can build a function
(x: X) => product(f(x), g(x))
that returns results of type F[(A, B)]. Note that now f and g can process x completely independently: whatever it is, it is definitely not sequencing.
Similarly, for an Applicative F and functions
f: A => X
g: B => Y
c: (X, Y) => Z
and two independent values a: F[A], b: F[B], you can process a and b completely independently with f and g, and then combine the results in the end with c into a single F[Z]:
map2(a, b){ (a, b) => c(f(a), g(b)) }
Again: f and g don't know anything about each other's inputs and outputs, they work completely independently until the very last step c, so this is again not "sequencing".
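For a concrete feel, here is a small sketch with Option playing the role of F. The product and map2 functions are written by hand for this example (they mirror the shape of the Semigroupal/Applicative operations but are not the Cats API), and f, g, c are arbitrary sample functions.

```scala
object IndependentDemo {
  // Semigroupal-style product for Option: neither side depends on the other.
  def product[A, B](fa: Option[A], fb: Option[B]): Option[(A, B)] =
    (fa, fb) match {
      case (Some(a), Some(b)) => Some((a, b))
      case _                  => None
    }

  // map2 built on top of product.
  def map2[A, B, Z](fa: Option[A], fb: Option[B])(c: (A, B) => Z): Option[Z] =
    product(fa, fb).map { case (a, b) => c(a, b) }

  val f: Int => Int = _ + 1
  val g: String => Int = _.length
  val c: (Int, Int) => Int = _ * _

  // f and g work on unrelated inputs; only c sees both results.
  val combined: Option[Int] = map2(Option(2), Option("abcd"))((a, b) => c(f(a), g(b)))
  val missing: Option[Int] = map2(Option(2), Option.empty[String])((a, b) => c(f(a), g(b)))
}
```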
I hope it clarifies the distinction somewhat.
The Scala 3 reference at https://docs.scala-lang.org/scala3/reference/metaprogramming/compiletime-ops.html mentions a "Prolog-like programming style" that is possible with Scala 3 metaprogramming:
The problem so far was that the Prolog-like programming style of
implicit search becomes viral: Once some construct depends on implicit
search it has to be written as a logic program itself.
But they all keep the viral nature of implicit search programs based
on logic programming.
I did some searching, but understood only that it somehow abuses Scala's compile-time behavior, and that something in it resembles Prolog.
What is that "Prolog-like programming style" and how does it work? What exactly resembles Prolog? Does it work in Scala 3?
Here is a basic correspondence to get you started:
Prolog-Atoms correspond to ordinary types.
Prolog-Variables correspond to type parameters.
What's called "Functors" here corresponds to type constructors.
Stating facts corresponds to providing constant givens.
Making queries corresponds to summoning corresponding proof-terms.
Let's start with a very simple example where we have just one atom, one prolog-"functor", one fact, and one query: this example from Wikipedia
cat(tom).
?- cat(tom).
>> Yes
can be directly translated into
trait Tom // Atom 'Tom'
trait Cat[X] // Functor 'Cat' with arity = 1
// Fact: cat(tom).
given Cat[Tom] with {}
// Query: ?- cat(tom). (is Tom a Cat?)
val query = summon[Cat[Tom]] // type-checks = Yes
Prolog rules correspond to parameterized given-definitions, the antecedents on the right side correspond to the using-parameters in the parameter list. For example, the classical syllogism example expressed in Prolog as
man(socrates).
mortal(X) :- man(X).
?- mortal(socrates).
>> True
can be encoded with a parameterized given that uses Man[X] and produces a proof of Mortal[X]:
trait Socrates
trait Man[X]
trait Mortal[X]
given socratesIsAMan: Man[Socrates] with {}
given allMenAreMortal[X](using m: Man[X]): Mortal[X] with {}
val query = summon[Mortal[Socrates]]
You can use scala3-compiler -Xprint:typer to see the composite proof term that the compiler generated to prove that Socrates is Mortal:
allMenAreMortal[Socrates](socratesIsAMan)
Knowing how to encode rules allows you to encode the more complicated example from Wikipedia:
mother_child(trude, sally).
father_child(tom, sally).
father_child(tom, erica).
father_child(mike, tom).
sibling(X, Y) :- parent_child(Z, X), parent_child(Z, Y).
parent_child(X, Y) :- father_child(X, Y).
parent_child(X, Y) :- mother_child(X, Y).
?- sibling(sally, erica).
>> Yes
as follows:
trait Trude
trait Sally
trait Tom
trait Erica
trait Mike
trait FatherChild[F, C]
trait MotherChild[M, C]
trait ParentChild[P, C]
trait Sibling[X, Y]
given MotherChild[Trude, Sally] with {}
given FatherChild[Tom, Sally] with {}
given FatherChild[Tom, Erica] with {}
given FatherChild[Mike, Tom] with {}
given [X, Y, Z](using pczx: ParentChild[Z, X], pczy: ParentChild[Z, Y])
: Sibling[X, Y] with {}
given fatherhoodImpliesParentship[X, Y](using fc: FatherChild[X, Y])
: ParentChild[X, Y] with {}
given motherhoodImpliesParentship[X, Y](using mc: MotherChild[X, Y])
: ParentChild[X, Y] with {}
val query = summon[Sibling[Erica, Sally]] // Yes
Here, the compiler will generate a proof term that explains that Erica and Sally are Siblings because they have the same father Tom:
given_Sibling_X_Y[Erica, Sally, Tom](
fatherhoodImpliesParentship[Tom, Erica](given_FatherChild_Tom_Erica),
fatherhoodImpliesParentship[Tom, Sally](given_FatherChild_Tom_Sally)
)
More generally, conjunctions are encoded by multiple using-parameters, and disjunctions are encoded by multiple givens with the same result type:
// We can write "X /\ Y" as infix operator for conjunction
case class /\[A, B](a: A, b: B)
// We can write "X \/ Y" as infix operator for disjunctions
enum \/[+A, +B]:
  case Inl(a: A)
  case Inr(b: B)
// Inference for conjunctions: multiple parameters in `using`
given [A, B](using a:A, b: B): (A /\ B) = /\(a, b)
// Inference for disjunctions: multiple rules
given [A](using a: A): \/[A, Nothing] = \/.Inl(a)
given [B](using b: B): \/[Nothing, B] = \/.Inr(b)
// Example:
trait X
trait Y
trait Z
trait W
given X with { override def toString = "X" }
given W with { override def toString = "W" }
@main def query(): Unit =
  println(summon[(X \/ Y) /\ (Z \/ W)])
  // Finds a proof and prints `/\(Inl(X), Inr(W))`.
Since Scala 3, there is even negation available through util.NotGiven:
import scala.util.NotGiven
trait X
trait Y
trait /\[X, Y]
given X with {}
given [A, B](using a: A, b: B): /\[A, B] with {}
// Fails if we add `given Y with {}`
val query = summon[X /\ NotGiven[Y]]
Scala 3 adds a whole bunch of stuff on top of that, such as tuples (which are basically type-level lists) or computing with numeric / boolean / string singleton types, but I don't want to go too deeply into the details here.
Instead, I'd like to conclude by briefly sketching how it all fits into the landscape. The interesting difference between Prolog and Scala's type system is that the Scala compiler actually generates proof terms, and unlike in Prolog (where you get a simple "Yes"/"No"), those proof terms can carry around arbitrarily complicated computational content.
You might have noticed that in the examples above, the with {} mostly remained empty. This is usually not the case in the real code, quite the contrary: in the real code, you usually have some non-trivial definitions in the body of every given ... with { ... }. The reason why one is writing all those facts and inference rules is not for solving logical puzzles and obtaining "Yes" / "No" answers, but for generating huge complicated proof terms that do useful stuff.
The way it works is usually as follows:
Suppose you want to obtain a thing X
You import some big-&-smart library that knows how to construct a variety of things similar to the desired X
You use the "predicates" ( = typeclasses) / facts / rules ( = givens) of that library to express very precisely the properties that you want the thing X to have
If your description is precise enough, the library & the Scala compiler are able to summon the thing X purely from its type description.
So, in your average programming language, you have to write out all the terms manually. In Scala 3, you can instead specify the desired properties of the desired term through types, and the compiler will use this Prolog-like term inference system to summon a term with the right properties (given the right libraries, that is).
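Here is a small sketch of that workflow, assuming Scala 3. The Show type class and all given names are invented for this example (they stand in for the "big & smart library"); the point is that the facts and rules now have non-empty bodies, so the proof term the compiler assembles actually computes something.

```scala
object ShowDemo {
  // A hypothetical "predicate": A can be rendered as a String.
  trait Show[A] {
    def show(a: A): String
  }

  // Facts: constant givens with real content in their bodies.
  given showInt: Show[Int] with {
    def show(a: Int): String = a.toString
  }
  given showBoolean: Show[Boolean] with {
    def show(a: Boolean): String = if (a) "yes" else "no"
  }

  // Rules: parameterized givens whose antecedents are using-parameters.
  given showPair[A, B](using sa: Show[A], sb: Show[B]): Show[(A, B)] with {
    def show(p: (A, B)): String = s"(${sa.show(p._1)}, ${sb.show(p._2)})"
  }
  given showList[A](using sa: Show[A]): Show[List[A]] with {
    def show(xs: List[A]): String = xs.map(sa.show).mkString("[", ", ", "]")
  }

  // Query: the compiler assembles showList(showPair(showInt, showBoolean)).
  val rendered: String =
    summon[Show[List[(Int, Boolean)]]].show(List((1, true), (2, false)))
}
```

We only wrote down the type List[(Int, Boolean)]; the term that knows how to render it was put together by the implicit search.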
I was recently reading Category Theory for Programmers and in one of the challenges, Bartosz proposed to write a function called memoize which takes a function as an argument and returns the same one with the difference that, the first time this new function is called, it stores the result of the argument and then returns this result each time it is called again.
def memoize[A, B](f: A => B): A => B = ???
The problem is, I can't think of any way to implement this function without resorting to mutability. Moreover, the implementations I have seen use mutable data structures to accomplish the task.
My question is, is there a purely functional way of accomplishing this? Maybe without mutability or by using some functional trick?
Thanks for reading my question and for any future help. Have a nice day!
is there a purely functional way of accomplishing this?
No. Not in the narrowest sense of pure functions and using the given signature.
TLDR: Use mutable collections, it's okay!
Impurity of g
val g = memoize(f)
// state 1
g(a)
// state 2
What would you expect to happen for the call g(a)?
If g(a) memoizes the result, an (internal) state has to change, so the state is different after the call g(a) than before.
As this could be observed from the outside, the call to g has side effects, which makes your program impure.
From the book you referenced, 2.5 Pure and Dirty Functions:
[...] functions that
always produce the same result given the same input and
have no side effects
are called pure functions.
Is this really a side effect?
Normally, at least in Scala, internal state changes are not considered side effects.
See the definition in the Scala Book
A pure function is a function that depends only on its declared inputs and its internal algorithm to produce its output. It does not read any other values from “the outside world” — the world outside of the function’s scope — and it does not modify any values in the outside world.
The following examples of lazy computations both change their internal states, but are normally still considered purely functional as they always yield the same result and have no side effects apart from internal state:
lazy val x = 1
// state 1: x is not computed
x
// state 2: x is 1
val ll = LazyList.continually(0)
// state 1: ll = LazyList(<not computed>)
ll(0)
// state 2: ll = LazyList(0, <not computed>)
In your case, the equivalent would be something using a private, mutable Map (like the implementations you may have found):
import scala.collection.mutable

def memoize[A, B](f: A => B): A => B = {
  // The cache is local to the closure and never escapes.
  val cache = mutable.Map.empty[A, B]
  (a: A) => cache.getOrElseUpdate(a, f(a))
}
Note that the cache is not public.
So, for a pure function f and without looking at memory consumption, timings, reflection or other evil stuff, you won't be able to tell from the outside whether f was called twice or g cached the result of f.
In this sense, side effects are only things like printing output, writing to public variables, files etc.
Thus, this implementation is considered pure (at least in Scala).
Avoiding mutable collections
If you really want to avoid var and mutable collections, you need to change the signature of your memoize method.
This is because if g cannot change its internal state, it won't be able to memoize anything new after it was initialized.
An (inefficient but simple) example would be
def memoizeOneValue[A, B](f: A => B)(a: A): (B, A => B) = {
  val b = f(a)
  val g = (v: A) => if (v == a) b else f(v)
  (b, g)
}
val (b1, g) = memoizeOneValue(f)(a1)
val (b2, h) = memoizeOneValue(g)(a2)
// ...
The result of f(a1) would be cached in g, but nothing else. Then, you could chain this and always get a new function.
If you are interested in a faster version of that, see @esse's answer, which does the same thing more efficiently (using an immutable map, so O(log(n)), instead of the linked list of functions above, which is O(n)).
Let's try (note: I have changed the return type of memoize to store the cached data):
import scala.language.existentials
type M[A, B] = A => T forSome { type T <: (B, A => T) }
def memoize[A, B](f: A => B): M[A, B] = {
import scala.collection.immutable
def withCache(cache: immutable.Map[A, B]): M[A, B] = a => cache.get(a) match {
case Some(b) => (b, withCache(cache))
case None =>
val b = f(a)
(b, withCache(cache + (a -> b)))
}
withCache(immutable.Map.empty)
}
def f(i: Int): Int = { print(s"Invoke f($i)"); i }
val (i0, m0) = memoize(f)(1) // f only invoked at first time
val (i1, m1) = m0(1)
val (i2, m2) = m1(1)
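The forSome existential type used above is Scala 2 syntax and was dropped in Scala 3. As a hedged alternative sketch of the same state-passing idea, the recursive function type can be replaced by a small case class. The name Memo and the call counter in the demo are assumptions for illustration only.

```scala
// Hypothetical immutable memo table: each call returns the result plus a new table.
final case class Memo[A, B](cache: Map[A, B], f: A => B) {
  def apply(a: A): (B, Memo[A, B]) = cache.get(a) match {
    case Some(b) => (b, this) // cache hit: f is not invoked
    case None =>
      val b = f(a)
      (b, copy(cache = cache + (a -> b)))
  }
}

object MemoDemo {
  // Mutable counter used only to observe how often f runs.
  var calls = 0
  val slowSquare: Int => Int = n => { calls += 1; n * n }

  val m0: Memo[Int, Int] = Memo(Map.empty[Int, Int], slowSquare)
  val (r1, m1) = m0(3) // computes 3 * 3
  val (r2, m2) = m1(3) // served from the cache
  val (r3, m3) = m2(4) // computes 4 * 4
}
```

The caller has to thread the returned Memo through subsequent calls, which is exactly the price of giving up internal mutable state.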
Yes, there are purely functional ways to implement polymorphic function memoization. The topic is surprisingly deep and even summons the Yoneda Lemma, which is likely what Bartosz had in mind with this exercise.
The blog post Memoization in Haskell gives a nice introduction by simplifying the problem a bit: instead of looking at arbitrary functions it restricts the problem to functions from the integers.
The following memoize function takes a function of type Int -> a and
returns a memoized version of the same function. The trick is to turn
a function into a value because, in Haskell, functions are not
memoized but values are. memoize converts a function f :: Int -> a
into an infinite list [a] whose nth element contains the value of f n.
Thus each element of the list is evaluated when it is first accessed
and cached automatically by the Haskell runtime thanks to lazy
evaluation.
memoize :: (Int -> a) -> (Int -> a)
memoize f = (map f [0 ..] !!)
Apparently the approach can be generalised to functions of arbitrary domains. The trick is to come up with a way to use the type of the domain as an index into a lazy data structure used for "storing" previous values. And this is where the Yoneda Lemma comes in and my own understanding of the topic becomes flimsy.
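A rough Scala analogue of the Haskell trick, assuming Scala 2.13+ (for LazyList) and a domain of non-negative Ints: turn the function into a lazy structure and index into it. The name memoizeNat and the call counter are made up for this sketch. Unlike the Haskell list, walking a LazyList up to index n may also evaluate the elements before n, so the sketch only relies on elements not being recomputed.

```scala
object LazyMemoDemo {
  // The function becomes a lazily evaluated table; LazyList caches forced elements.
  // Only meaningful for n >= 0.
  def memoizeNat[B](f: Int => B): Int => B = {
    val table: LazyList[B] = LazyList.from(0).map(f)
    n => table(n)
  }

  // Mutable counter used only to observe how often the underlying function runs.
  var calls = 0
  val square: Int => Int = n => { calls += 1; n * n }

  val memoSquare: Int => Int = memoizeNat(square)

  val first: Int = memoSquare(5)
  val callsAfterFirst: Int = calls
  val second: Int = memoSquare(5) // no further evaluation of square
  val callsAfterSecond: Int = calls
}
```

The user-level code contains no mutation; the caching is done by the runtime representation of the lazy structure, just as in the Haskell version.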
I am reading Functional Programming in Scala and the book comments that flatMap for monads must follow the associativity law as per below.
x.flatMap(f).flatMap(g) == x.flatMap(a => f(a).flatMap(g))
I normally take associativity to mean something like (a+(b+c)) == ((a+b)+c) but I am failing to translate the equation here to something similar.
The two sides seem equivalent to me. Assuming x is of type M[A], they both seem to be applying f first to a and subsequently applying flatMap(g) to the result of f(a).
What is the significance of this law?
If you're confused by the syntax and have trouble seeing the analogy to (a+(b+c))==((a+b)+c), consider composing functions of type A => M[B] where A and B can change while M stays the same. Now consider an operation which composes these functions like this:
def compose[A,B,C](f: A => M[B], g: B => M[C]): A => M[C] =
a => f(a).flatMap(g)
Now the associativity law reads like:
compose(compose(f, g), h) == compose(f, compose(g, h))
If we had some infix operator for compose, it could look like this:
(f comp g) comp h == f comp (g comp h)
BTW: In functional programming terminology, these functions are called Kleisli arrows.
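To try this out, here is a small sketch with Option standing in for M. The three sample functions parse, positive and half are arbitrary choices for the demo; checking a few inputs is of course only a sanity check, not a proof of the law.

```scala
object KleisliDemo {
  // compose from above, specialised to M = Option.
  def compose[A, B, C](f: A => Option[B], g: B => Option[C]): A => Option[C] =
    a => f(a).flatMap(g)

  val parse: String => Option[Int] = s => scala.util.Try(s.trim.toInt).toOption
  val positive: Int => Option[Int] = n => if (n > 0) Some(n) else None
  val half: Int => Option[Int] = n => if (n % 2 == 0) Some(n / 2) else None

  // The two ways of bracketing the composition.
  val leftNested: String => Option[Int] = compose(compose(parse, positive), half)
  val rightNested: String => Option[Int] = compose(parse, compose(positive, half))

  val inputs: List[String] = List("8", "7", "-4", "oops")
  val agree: Boolean = inputs.forall(s => leftNested(s) == rightNested(s))
}
```

Because of associativity you can chain such steps without caring how they are grouped, which is what makes for-comprehensions safe to refactor.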
In category theory, is the filter operation considered a morphism? If yes, what kind of morphism is it? Example (in Scala)
val myNums: Seq[Int] = Seq(-1, 3, -4, 2)
myNums.filter(_ > 0)
// Seq[Int] = List(3, 2) // result = subset, same type
myNums.filter(_ > -99)
// Seq[Int] = List(-1, 3, -4, 2) // result = identical to original
myNums.filter(_ > 99)
// Seq[Int] = List() // result = empty, same type
One interesting way of looking at this matter involves not picking filter as a primitive notion. There is a Haskell type class called Filterable which is aptly described as:
Like Functor, but it [includes] Maybe effects.
Formally, the class Filterable represents a functor from Kleisli Maybe to Hask.
The morphism mapping of the "functor from Kleisli Maybe to Hask" is captured by the mapMaybe method of the class, which is indeed a generalisation of the homonymous Data.Maybe function:
mapMaybe :: Filterable f => (a -> Maybe b) -> f a -> f b
The class laws are simply the appropriate functor laws (note that Just and (<=<) are, respectively, identity and composition in Kleisli Maybe):
mapMaybe Just = id
mapMaybe (g <=< f) = mapMaybe g . mapMaybe f
The class can also be expressed in terms of catMaybes...
catMaybes :: Filterable f => f (Maybe a) -> f a
... which is interdefinable with mapMaybe (cf. the analogous relationship between sequenceA and traverse)...
catMaybes = mapMaybe id
mapMaybe g = catMaybes . fmap g
... and amounts to a natural transformation between the Hask endofunctors Compose f Maybe and f.
What does all of that have to do with your question? Firstly, a functor is a morphism between categories, and a natural transformation is a morphism between functors. That being so, it is possible to talk of morphisms here in a sense that is less boring than the "morphisms in Hask" one. You won't necessarily want to do so, but in any case it is an existing vantage point.
Secondly, filter is, unsurprisingly, also a method of Filterable, its default definition being:
filter :: Filterable f => (a -> Bool) -> f a -> f a
filter p = mapMaybe $ \a -> if p a then Just a else Nothing
Or, to spell it using another cute combinator:
filter p = mapMaybe (ensure p)
That indirectly gives filter a place in this particular constellation of categorical notions.
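Since the question was asked in Scala, here is a hedged transcription of these definitions, specialised to List rather than written as a general type class. The function names follow the Haskell ones; nothing here is a standard-library or Cats API.

```scala
object FilterableDemo {
  // mapMaybe: apply a partial step to every element and keep the successes.
  def mapMaybe[A, B](xs: List[A])(f: A => Option[B]): List[B] =
    xs.flatMap(a => f(a).toList)

  // catMaybes = mapMaybe id
  def catMaybes[A](xs: List[Option[A]]): List[A] =
    mapMaybe[Option[A], A](xs)(oa => oa)

  // filter p = mapMaybe (\a -> if p a then Just a else Nothing)
  def filter[A](xs: List[A])(p: A => Boolean): List[A] =
    mapMaybe[A, A](xs)(a => if (p(a)) Some(a) else None)

  val myNums: List[Int] = List(-1, 3, -4, 2)

  // Sample Kleisli arrows for checking the composition law on this input.
  val f: Int => Option[Int] = n => if (n > -2) Some(n * 10) else None
  val g: Int => Option[String] = n => if (n > 0) Some(n.toString) else None
  val composedOnce: List[String] = mapMaybe(myNums)(n => f(n).flatMap(g))
  val composedTwice: List[String] = mapMaybe(mapMaybe(myNums)(f))(g)
}
```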
To answer a question like this, I'd like to first understand what the essence of filtering is.
For instance, does it matter that the input is a list? Could you filter a tree? I don't see why not! You'd apply a predicate to each node of the tree and discard the ones that fail the test.
But what would be the shape of the result? Node deletion is not always defined or it's ambiguous. You could return a list. But why a list? Any data structure that supports appending would work. You also need an empty member of your data structure to start the appending process. So any unital magma would do. If you insist on associativity, you get a monoid. Looking back at the definition of filter, the result is a list, which is indeed a monoid. So we are on the right track.
So filter is just a special case of what's called Foldable: a data structure over which you can fold while accumulating the results in a monoid. In particular, you could use the predicate to either output a singleton list, if it's true; or an empty list (identity element), if it's false.
If you want a categorical answer, then a fold is an example of a catamorphism, an example of a morphism in the category of algebras. The (recursive) data structure you're folding over (a list, in the case of filter) is an initial algebra for some functor (the list functor, in this case), and your predicate is used to define an algebra for this functor.
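As a sketch of the "filter a tree by folding into a monoid" idea: the Tree type, foldMap and filterTree below are hand-written for this example, and the monoid is passed explicitly as an empty value plus a combine function instead of using a Monoid type class.

```scala
object TreeFilterDemo {
  sealed trait Tree[+A]
  case object Leaf extends Tree[Nothing]
  final case class Node[A](left: Tree[A], value: A, right: Tree[A]) extends Tree[A]

  // Fold the tree in order, mapping each element into a monoid (empty, combine).
  def foldMap[A, M](t: Tree[A])(empty: M, combine: (M, M) => M)(f: A => M): M =
    t match {
      case Leaf => empty
      case Node(l, v, r) =>
        val leftAndValue = combine(foldMap(l)(empty, combine)(f), f(v))
        combine(leftAndValue, foldMap(r)(empty, combine)(f))
    }

  // The predicate yields a singleton list or the identity element (empty list).
  def filterTree[A](t: Tree[A])(p: A => Boolean): List[A] =
    foldMap[A, List[A]](t)(Nil, _ ++ _)(a => if (p(a)) List(a) else Nil)

  val tree: Tree[Int] = Node(Node(Leaf, -1, Leaf), 3, Node(Leaf, 2, Leaf))
  val positives: List[Int] = filterTree(tree)(_ > 0)
}
```

Note that the result is a list, not a tree: the shape of the input is forgotten, which sidesteps the ambiguity of node deletion.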
In this answer, I will assume that you are talking about filter on Set (the situation seems messier for other datatypes).
Let's first fix what we are talking about. I will talk specifically about the following function (in Scala):
def filter[A](p: A => Boolean): Set[A] => Set[A] =
s => s filter p
When we write it down this way, we see clearly that it's a polymorphic function with type parameter A that maps predicates A => Boolean to functions that map Set[A] to other Set[A]. To make it a "morphism", we would have to find some categories first, in which this thing could be a "morphism". One might hope that it's a natural transformation, and therefore a morphism in the category of endofunctors on the "default ambient category-esque structure" usually referred to as "Hask" (or "Scal"? "Scala"?). To show that it's natural, we would have to check that the following diagram commutes for every f: B => A:
                        - o f
  Hom[A, Boolean] ---------------------> Hom[B, Boolean]
         |                                      |
         |                                      |
         |                                      |
         | filter[A]                            | filter[B]
         |                                      |
         V                ???                   V
Hom[Set[A], Set[A]] ---------------> Hom[Set[B], Set[B]]
however, here we fail immediately, because it's not clear what to even put on the horizontal arrow at the bottom, since the assignment A -> Hom[Set[A], Set[A]] doesn't even seem functorial (for the same reasons why A -> End[A] is not functorial, see here and also here).
The only "categorical" structure that I see here for a fixed type A is the following:
Predicates on A can be considered to be a partially ordered set with implication, that is p LEQ q if p implies q (i.e. either p(x) must be false, or q(x) must be true for all x: A).
Analogously, on functions Set[A] => Set[A], we can define a partial order with f LEQ g whenever for each set s: Set[A] it holds that f(s) is subset of g(s).
Then filter[A] would be monotonic, and therefore a functor between poset-categories. But that's somewhat boring.
Of course, for each fixed A, it (or rather its eta-expansion) is also just a function from A => Boolean to Set[A] => Set[A], so it's automatically a "morphism" in the "Hask-category". But that's even more boring.
filter can be written in terms of foldRight as:
filter p ys = foldRight(nil)( (x, xs) => if (p(x)) x::xs else xs ) ys
foldRight on lists is a map of T-algebras (where here T is the List datatype functor), so filter is a map of T-algebras.
The two algebras in question here are the initial list algebra
[nil, cons]: 1 + A x List(A) ----> List(A)
and, let's say the "filter" algebra,
[nil, f]: 1 + A x List(A) ----> List(A)
where f(x, xs) = if p(x) x::xs else xs.
Let's call filter(p, _) the unique map from the initial algebra to the filter algebra in this case (it is called fold in the general case). The fact that it is a map of algebras means that the following equations are satisfied:
filter(p, nil) = nil
filter(p, x::xs) = f(x, filter(p, xs))
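In runnable Scala, this can be sketched as follows. The helper name step plays the role of f from the "filter" algebra, and the names are chosen for this example only.

```scala
object FoldFilterDemo {
  // The non-nil part of the "filter" algebra: f(x, xs) = if p(x) x::xs else xs
  def step[A](p: A => Boolean)(x: A, xs: List[A]): List[A] =
    if (p(x)) x :: xs else xs

  // filter as the fold determined by [nil, step(p)].
  def filter[A](p: A => Boolean, ys: List[A]): List[A] =
    ys.foldRight(List.empty[A])((x, acc) => step(p)(x, acc))

  val isPositive: Int => Boolean = _ > 0
  val sample: List[Int] = List(-1, 3, -4, 2)
}
```

The two equations can then be checked directly on sample data, e.g. that filter(p, x :: xs) equals step(p)(x, filter(p, xs)).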
Expressions like
ls map (_ + 1) sum
are lovely because they are left-to-right and not nested. But if the functions in question are defined outside the class, it is less pretty.
Following an example I tried
import scala.math.sqrt

final class DoublePlus(val self: Double) {
  def hypot(x: Double) = sqrt(self*self + x*x)
}
implicit def doubleToDoublePlus(x: Double) =
  new DoublePlus(x)
which works fine as far as I can tell, other than
A lot of typing for one method
You need to know in advance that you want to use it this way
Is there a trick that will solve those two problems?
You can call andThen on a function object:
(h andThen g andThen f)(x)
You can't call it on methods directly though, so maybe your h needs to become (h _) to transform the method into a partially applied function. The compiler will translate subsequent method names to functions automatically because the andThen method accepts a Function parameter.
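A small sketch of this, with made-up methods h, g and f. Here the first method is turned into a function value through a type ascription (an alternative to writing h _), and the later ones are converted automatically because andThen expects a function.

```scala
object AndThenDemo {
  // Hypothetical methods defined outside any class of interest.
  def h(x: Double): Double = x + 1
  def g(x: Double): Double = x * 2
  def f(x: Double): String = s"result=$x"

  // The expected function type triggers the method-to-function conversion.
  val hf: Double => Double = h

  // Left-to-right pipeline; g and f are converted to functions automatically.
  val pipeline: Double => String = hf.andThen(g).andThen(f)

  val nested: String = f(g(h(1.0)))
  val chained: String = pipeline(1.0)
}
```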
You could also use the pipe operator |> to write something like this:
x |> h |> g |> f
Enriching an existing class/interface with an implicit conversion (which is what you did with doubleToDoublePlus) is all about API design when some classes aren't under your control. I don't recommend doing that lightly just to save a few keystrokes or a few parentheses. So if it's important to be able to type val h = d hypot x, then the extra keystrokes should not be a concern. (There may be object allocation concerns, but that's a different matter.)
The title and your example also don't match:
f(g(h(x))) can be rewritten as f _ compose g _ compose h _ apply x if your concern is about parentheses, or f compose g compose h apply x if f, g, h are function objects rather than defs.
But ls map (_ + 1) sum isn't a nested call as you say, so I'm not sure how that relates to the title. And although it's lovely to use, the library/language designers went to a lot of effort to make it easy to use, and what happens under the hood is not simple (much more complex than your hypot example).
def fgh (n: N) = f(g(h(n)))
val m = fgh (n)
Maybe this, observe how a is provided:
def compose[A, B, C](f: B => C, g: A => B): A => C = (a: A) => f(g(a))
Basically, like the answer above, combine the desired functions into an intermediate one, which you can then use easily with map.
Starting with Scala 2.13, the standard library provides the chaining operation pipe, which can be used to convert/pipe a value with a function of interest.
Using multiple pipes, we can thus build a pipeline which, as mentioned in the title of your question, minimizes the number of parentheses:
import scala.util.chaining._
x pipe h pipe g pipe f
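For a self-contained version you can paste and run (Scala 2.13 or later), here is a sketch with three sample functions; h, g and f are placeholders chosen for the demo, with h being a hypot-like step.

```scala
import scala.util.chaining._

object PipeDemo {
  // Placeholder functions for the pipeline.
  val h: Double => Double = x => math.sqrt(x * x + 16.0) // hypot of x and 4
  val g: Double => Double = _ * 2
  val f: Double => String = _.toString

  // Nested form versus left-to-right form.
  val nested: String = f(g(h(3.0)))
  val piped: String = 3.0.pipe(h).pipe(g).pipe(f)
}
```

Both expressions evaluate the same calls in the same order; pipe only changes how the code reads.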