Why does fold left expect (a -> b -> a) instead of (b -> a -> a)? - scala

I wonder why the function expected by fold left has type signature a -> b -> a instead of b -> a -> a. Is there a design decision behind this?
In Haskell, for example, I have to write foldl (\xs x -> x:xs) [] xs to reverse a list instead of the shorter foldl (:) [] xs (which would be possible with b -> a -> a). On the other hand, there are use cases which require the standard a -> b -> a. In Scala, this could be appending: xs.foldLeft(List.empty[Int]) ((xs, x) => xs:+x) which can be written as xs.foldLeft(List.empty[Int]) (_:+_).
Do proportionately more use cases occur requiring the given type signature instead of the alternative one, or are there other decisions which led to the design that fold left has in Haskell and Scala (and probably lots of other languages)?

Conceptually speaking, a right fold, say foldr f z [1..4] replaces a list of the following form
  :
 / \
1   :
   / \
  2   :
     / \
    3   :
       / \
      4   []
with the value of an expression of the following form
  f
 / \
1   f
   / \
  2   f
     / \
    3   f
       / \
      4   z
If we were to represent this expression on a single line, all parentheses would associate to the right, hence the name right fold: (1 `f` (2 `f` (3 `f` (4 `f` z)))). A left fold is dual in some sense to a right fold. In particular, we would like the shape of the corresponding diagram for a left fold to be a mirror image of that for a right fold, as in the following:
        f
       / \
      f   4
     / \
    f   3
   / \
  f   2
 / \
z   1
If we were to write out this diagram on a single line, we would get an expression where all parentheses associate to the left, which jibes well with the name of a left fold:
((((z `f` 1) `f` 2) `f` 3) `f` 4)
But notice that in this mirror diagram, the recursive result of the fold is fed to f as the first argument, while each element of the list is fed as the second argument, i.e. the arguments are fed to f in reverse order compared to right folds.
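To make the two shapes concrete, here is a minimal Haskell sketch that builds the parenthesised expressions as strings. The helper names paren, rightShape and leftShape are made up for illustration; only foldr and foldl come from the Prelude.

```haskell
-- Wrap two sub-expressions in one application of a symbolic operator `f`.
paren :: String -> String -> String
paren l r = "(" ++ l ++ " `f` " ++ r ++ ")"

-- foldr passes the element first and the folded rest second.
rightShape :: String
rightShape = foldr (\x rest -> paren (show x) rest) "z" [1 .. 4 :: Int]

-- foldl passes the accumulated result first and the element second.
leftShape :: String
leftShape = foldl (\acc x -> paren acc (show x)) "z" [1 .. 4 :: Int]
```

Evaluating rightShape and leftShape should reproduce the two one-line expressions given in the answer, with the accumulator sitting on the side where the parentheses nest.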

The type signature is foldl :: (a -> b -> a) -> a -> [b] -> a; it's natural for the combining function to have the initial value on the left, because that's the way it combines with the elements of the list. Similarly, you'll notice foldr has it the other way round. The complication in your definition of reverse is because you're using a lambda expression where flip would have been nicer: foldl (flip (:)) [] xs, which also has the pleasant similarity between the concepts of flip and reverse.
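As a small sketch of the point about flip, the two definitions below should behave identically. The names reverseWithLambda and reverseWithFlip are hypothetical.

```haskell
-- The version from the question: an explicit lambda swaps the arguments.
reverseWithLambda :: [a] -> [a]
reverseWithLambda = foldl (\acc x -> x : acc) []

-- The same fold, with flip doing the argument swap.
reverseWithFlip :: [a] -> [a]
reverseWithFlip = foldl (flip (:)) []
```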

Because you write (a /: bs) for foldLeft in short form; this is an operator which pushes a through all the bs, so it is natural to write the function the same way (i.e. (A,B) => A). Note that foldRight does it in the other order.

Say you have this:
List(4, 2, 1).foldLeft(8)(_ / _)
That's the same as:
((8 / 4) / 2) / 1
See how the first parameter is always the accumulator? Having the parameters in this order makes placeholder syntax (the underscore) a direct translation to the expanded expression.
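For comparison, the same idea can be sketched in Haskell. The names quotient and digitsToNumber are invented for this example; the second one shows a case where the accumulator type (Int) differs from the element type (Char), which is where the a -> b -> a shape becomes visible.

```haskell
import Data.Char (digitToInt)

-- Same computation as the Scala example: ((8 / 4) / 2) / 1 with integer division.
quotient :: Int
quotient = foldl div 8 [4, 2, 1]

-- Accumulator (Int) on the left, element (Char) on the right.
digitsToNumber :: String -> Int
digitsToNumber = foldl (\acc c -> acc * 10 + digitToInt c) 0
```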

Related

Understand Either as a Functor

Looking into how Either is defined as a functor, I can see that
derive instance functorEither :: Functor (Either a)
which reads to me as "You can map an Either so long as you can map its element."
But Either doesn't have just one element. How would this be implemented without derive? Here's what I've tried:
data Either a b = Left a | Right b
instance functorEither :: Functor (Either a)
  where
    map f (Right b) = Right $ f b
    map _ a = a
Of course, the types don't work here:
The Right has this signature: map :: forall a b. (a -> b) -> f a -> f b
The Left however, isn't okay: map :: forall a b. (a -> b) -> f a -> f a
Part of my intuition is saying that Either a b isn't a functor, only Either a is a functor, which is why map works over Right and ignores Left.
That doesn't really give me any intuition for how this is implemented. I still need a way of matching both constructors, don't I?
On the other hand, I think an implementation of map that replaces the inner function with identity is technically law-abiding for functor? The law of composition is met if you just ignore it?
While your proposed definition of the Functor instance indeed fails to compile, it isn't for the reason you say. And it's also "essentially" correct, just not written in a way that will satisfy the compiler.
For convenience, here's your definition again:
data Either a b = Left a | Right b
instance functorEither :: Functor (Either a)
  where
    map f (Right b) = Right $ f b
    map _ a = a
and here's the actual error that you get when trying to compile it:
  Could not match type
    a02
  with type
    b1
while trying to match type Either a0 a02
  with type Either a0 b1
while checking that expression a
  has type Either a0 b1
in value declaration functorEither
where a0 is a rigid type variable
        bound at (line 0, column 0 - line 0, column 0)
      b1 is a rigid type variable
        bound at (line 0, column 0 - line 0, column 0)
      a02 is a rigid type variable
        bound at (line 0, column 0 - line 0, column 0)
I admit that's a little hard to interpret, if you're not expecting it. But it has to do with the fact that map for Either a needs to have type forall b c. (b -> c) -> Either a b -> Either a c. So the a on the left of map _ a = a has type Either a b, while the one on the right has type Either a c - these are different types (in general), since b and c can be anything, so you can't use the same variable, a, to denote a value of each type.
(This question, although about Haskell rather than Purescript, goes deeper into explanation of exactly this error.)
To fix it, as implied in the question above, you have to explicitly mention that the value you're mapping over is a Left value:
data Either a b = Left a | Right b
instance functorEither :: Functor (Either a)
  where
    map f (Right b) = Right $ f b
    map _ (Left a) = Left a
which is fine because Left a can be interpreted on the left as of type Either a b and on the right as an Either a c.
As for what the instance "does": you are correct that "Either a b isn't a functor, only Either a is a functor" - because a functor must take one type variable, which Either a does but Either a b doesn't. And yes, because the type variable that actually "varies" between Either a b and Either a c is the one that is used in Right, map must only map over the Right values, and leave the Left ones alone - that's the only thing that will satisfy the types needed.
Either a b is often interpreted as representing the result of a computation, where Left values represent failure while Right ones represent success. In this sense it's a slightly "expanded" version of Maybe - the difference is that rather than failure being represented by a single value (Nothing), you get a piece of data (the a type in Either a b) which can tell you information about the error. But the Functor instance works identically to that for Maybe: it maps over any success, and leaves failures alone.
(But there's no logical reason why you can't "map over" the Left values as well. The Bifunctor class is an extension of Functor which can do exactly that.)
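The same reasoning carries over to Haskell, where it can be sketched with a stand-in type so as not to clash with the Prelude's Either. The type Result and its constructors Failure and Success are hypothetical names; Functor and Bifunctor are the classes from base.

```haskell
import Data.Bifunctor (Bifunctor (bimap))

-- A stand-in for Either, with the "error" type first and the "success" type last.
data Result e a = Failure e | Success a
  deriving (Eq, Show)

-- Only the last type parameter varies, so only Success can be mapped.
-- Failure e is rebuilt so that it can take the new type Result e b.
instance Functor (Result e) where
  fmap f (Success a) = Success (f a)
  fmap _ (Failure e) = Failure e

-- Bifunctor lets both sides vary, one function per constructor.
instance Bifunctor Result where
  bimap f _ (Failure e) = Failure (f e)
  bimap _ g (Success a) = Success (g a)
```

Writing fmap _ r = r for the second clause would be rejected for the reason given in the answer: r has type Result e a, while the result must have type Result e b.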

Is there a function that transforms/maps both Either's Left and Right cases taking two transformation functions respectively?

I have not found a function in Scala or Haskell that can transform/map both Either's Left and Right cases taking two transformation functions at the same time, namely a function that is of the type
(A => C, B => D) => Either[C, D]
for Either[A, B] in Scala, or the type
(a -> c, b -> d) -> Either a b -> Either c d
in Haskell. In Scala, it would be equivalent to calling fold like this:
def mapLeftOrRight[A, B, C, D](e: Either[A, B], fa: A => C, fb: B => D): Either[C, D] =
e.fold(a => Left(fa(a)), b => Right(fb(b)))
Or in Haskell, it would be equivalent to calling either like this:
mapLeftOrRight :: (a -> c) -> (b -> d) -> Either a b -> Either c d
mapLeftOrRight fa fb = either (Left . fa) (Right . fb)
Does a function like this exist in the library? If not, I think something like this is quite practical, why do the language designers choose not to put it there?
Don't know about Scala, but Haskell has a search engine for type signatures. It doesn't give results for the one you wrote, but that's just because you take a tuple argument while Haskell functions are by convention curried†. https://hoogle.haskell.org/?hoogle=(a -> c) -> (b -> d) -> Either a b -> Either c d does give matches, the most obvious being:
mapBoth :: (a -> c) -> (b -> d) -> Either a b -> Either c d
...actually, even Google finds that, because the type variables happen to be exactly as you thought. (Hoogle also finds it if you write it (x -> y) -> (p -> q) -> Either x p -> Either y q.)
But actually, as Martijn said, this behaviour for Either is only a special case of a bifunctor, and indeed Hoogle also gives you the more general form, which is defined in the base library:
bimap :: Bifunctor p => (a -> b) -> (c -> d) -> p a c -> p b d
†TBH I'm a bit disappointed that Hoogle doesn't by itself figure out to curry the signature or to swap arguments. Pretty sure it actually used to do that automatically, but at some point they simplified the algorithm because with the huge number of libraries the time it took and number of results got out of hand.
Cats provides Bifunctor, for example
import cats.implicits._
val e: Either[String, Int] = Right(41)
e.bimap(e => s"boom: $e", v => 1 + v)
// res0: Either[String,Int] = Right(42)
The behaviour you are talking about is a bifunctor behaviour, and would commonly be called bimap. In Haskell, a bifunctor for either is available: https://hackage.haskell.org/package/bifunctors-5/docs/Data-Bifunctor.html
Apart from the fold you show, another implementation in scala would be either.map(fb).left.map(fa)
There isn't such a method in the scala stdlib, probably because it wasn't found useful or fundamental enough. I can somewhat relate to that: mapping both sides in one operation instead of mapping each side individually doesn't come across as fundamental or useful enough to warrant inclusion in the scala stdlib to me either. The bifunctor is available in Cats though.
In Haskell, the function exists for Either as mapBoth (in Data.Either.Combinators from the either package), and Bifunctor is in base.
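As a quick check that the hand-written function from the question and bimap from Data.Bifunctor agree on Either, here is a small sketch. The names viaBimap and viaEither are made up for illustration.

```haskell
import Data.Bifunctor (bimap)

-- The definition from the question, built on the Prelude's either.
mapLeftOrRight :: (a -> c) -> (b -> d) -> Either a b -> Either c d
mapLeftOrRight fa fb = either (Left . fa) (Right . fb)

-- The same transformation expressed with bimap from base.
viaBimap :: Either String Int -> Either String Int
viaBimap = bimap ("boom: " ++) (+ 1)

viaEither :: Either String Int -> Either String Int
viaEither = mapLeftOrRight ("boom: " ++) (+ 1)
```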
In Haskell, you can use Control.Arrow.(+++), which works on any ArrowChoice:
(+++) :: (ArrowChoice arr) => arr a b -> arr c d -> arr (Either a c) (Either b d)
infixr 2 +++
Specialised to the function arrow arr ~ (->), that is:
(+++) :: (a -> b) -> (c -> d) -> Either a c -> Either b d
Hoogle won’t find +++ if you search for the type specialised to functions, but you can find generalised operators like this by replacing -> in the signature you want with a type variable: x a c -> x b d -> x (Either a b) (Either c d).
An example of usage:
renderResults
  :: FilePath
  -> Int
  -> Int
  -> [Either String Int]
  -> [Either String String]
renderResults file line column
  = fmap ((prefix ++) +++ show)
  where
  prefix = concat [file, ":", show line, ":", show column, ": error: "]
renderResults "test" 12 34 [Right 1, Left "beans", Right 2, Left "bears"]
==
[ Right "1"
, Left "test:12:34: error: beans"
, Right "2"
, Left "test:12:34: error: bears"
]
There is also the related operator Control.Arrow.(|||) which does not tag the result with Either:
(|||) :: (ArrowChoice arr) => arr a c -> arr b c -> arr (Either a b) c
infixr 2 |||
Specialised to (->):
(|||) :: (a -> c) -> (b -> c) -> Either a b -> c
Example:
assertRights :: [Either String a] -> [a]
assertRights = fmap (error ||| id)
sum $ assertRights [Right 1, Right 2]
==
3
sum $ assertRights [Right 1, Left "oh no"]
==
error "oh no"
(|||) is a generalisation of the either function in the Haskell Prelude for matching on Eithers. It’s used in the desugaring of if and case in arrow proc notation.
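For plain functions, the two operators should coincide with bimap and either respectively. The sketch below illustrates this; tagged and untagged are hypothetical names chosen for the example.

```haskell
import Control.Arrow ((+++), (|||))
import Data.Bifunctor (bimap)

-- (+++) keeps the Left/Right tag and transforms each side.
tagged :: Either String Int -> Either Int String
tagged = length +++ show

-- (|||) merges both sides into one untagged result.
untagged :: Either String Int -> String
untagged = id ||| show
```

On functions, tagged behaves like bimap length show and untagged like either id show.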

What does >>= mean in purescript?

I was reading the purescript wiki and found following section which explains do in terms of >>=.
What does >>= mean?
Do notation
The do keyword introduces simple syntactic sugar for monadic expressions.
Here is an example, using the monad for the Maybe type:
maybeSum :: Maybe Number -> Maybe Number -> Maybe Number
maybeSum a b = do
  n <- a
  m <- b
  let result = n + m
  return result
maybeSum takes two values of type Maybe Number and returns their sum if neither number is Nothing.
When using do notation, there must be a corresponding instance of the Monad type class for the return type. Statements can have the following form:
a <- x which desugars to x >>= \a -> ...
x which desugars to x >>= \_ -> ... or just x if this is the last statement.
A let binding let a = x. Note the lack of the in keyword.
The example maybeSum desugars to:
maybeSum a b =
  a >>= \n ->
  b >>= \m ->
  let result = n + m
  in return result
>>= is a function, nothing more. It resides in the Prelude module and has type (>>=) :: forall m a b. (Bind m) => m a -> (a -> m b) -> m b, being an alias for the bind function of the Bind type class. You can find the definitions of the Prelude module in this link, found in the Pursuit package index.
This is closely related to the Monad type class in Haskell, for which resources are a bit easier to find. There's a famous question on SO about this concept, which is a good starting point if you're looking to improve your knowledge of the bind function (if you're just starting out with functional programming, you can skip it for a while).
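To give a feel for what bind does in the Maybe case, here is a hand-written Haskell sketch. The name bindMaybe is made up; it is meant to mirror what >>= does for Maybe, not to reproduce any library's source.

```haskell
-- If the first computation failed, stop; otherwise feed its value to the next step.
bindMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe Nothing _ = Nothing
bindMaybe (Just x) k = k x

-- The desugared maybeSum, written with the hand-rolled bind.
maybeSum :: Maybe Int -> Maybe Int -> Maybe Int
maybeSum a b =
  a `bindMaybe` \n ->
    b `bindMaybe` \m ->
      Just (n + m)
```

Replacing bindMaybe with >>= should give the same results, which is all the do block abbreviates.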

Why does Haskell's foldr NOT stackoverflow while the same Scala implementation does?

I am reading FP in Scala.
Exercise 3.10 says that foldRight overflows (See images below).
As far as I know, however, foldr in Haskell does not.
http://www.haskell.org/haskellwiki/
-- if the list is empty, the result is the initial value z; else
-- apply f to the first element and the result of folding the rest
foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
-- if the list is empty, the result is the initial value; else
-- we recurse immediately, making the new initial value the result
-- of combining the old initial value with the first element.
foldl f z [] = z
foldl f z (x:xs) = foldl f (f z x) xs
How is this different behaviour possible?
What is the difference between the two languages/compilers that cause this different behaviour?
Where does this difference come from? The platform? The language? The compiler?
Is it possible to write a stack-safe foldRight in Scala? If yes, how?
Haskell is lazy. The definition
foldr f z (x:xs) = f x (foldr f z xs)
tells us that the behaviour of foldr f z xs with a non-empty list xs is determined by the laziness of the combining function f.
In particular the call foldr f z (x:xs) allocates just one thunk on the heap, {foldr f z xs} (writing {...} for a thunk holding an expression ...), and calls f with two arguments - x and the thunk. What happens next, is f's responsibility.
In particular, if it's a lazy data constructor (like e.g. (:)), it will immediately be returned to the caller of the foldr call (with the constructor's two slots filled by (references to) the two values).
And if f does demand its value on the right, with minimal compiler optimizations no thunks should be created at all (or one, at the most - the current one), as the value of foldr f z xs is immediately needed and the usual stack-based evaluation can be used:
foldr f z [a,b,c,....,n] ==
a `f` (b `f` (c `f` (... (n `f` z)...)))
So foldr can indeed cause a stack overflow when used with a strict combining function on extremely long input lists. But if the combining function doesn't immediately demand its value on the right, or demands only a part of it, the evaluation is suspended in a thunk, and the partial result created by f is returned immediately. The same holds for the argument on the left, but those potentially already come as thunks in the input list.
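A small sketch of the two cases: a combining function that is a lazy constructor application, and one that is strict in its right argument. The names doubleAll, firstThree and total are hypothetical.

```haskell
-- The combining function returns a cons cell without forcing `rest`,
-- so the fold can produce output incrementally, even from an infinite list.
doubleAll :: [Int] -> [Int]
doubleAll = foldr (\x rest -> 2 * x : rest) []

firstThree :: [Int]
firstThree = take 3 (doubleAll [1 ..])

-- (+) needs both arguments evaluated, so this fold must reach the end of the
-- list before any addition completes; very long lists may exhaust the stack.
total :: Int
total = foldr (+) 0 [1 .. 100]
```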
Haskell is lazy. So foldr allocates on the heap, not the stack. Depending on the strictness of the argument function, it may allocate a single (small) result, or a large structure.
You're still losing space, compared to a strict, tail-recursive implementation, but it doesn't look as obvious, since you've traded stack for heap.
Note that the authors here are not referring to any foldRight definition in the scala standard library, such as the one defined on List. They are referring to the definition of foldRight they gave above in section 3.4.
The Scala standard library defines foldRight in terms of foldLeft by reversing the list (which can be done in constant stack space) and then calling foldLeft with the arguments of the passed function reversed. This works for lists, but won't work for a structure which cannot be safely reversed, for example:
scala> Stream.continually(false)
res0: scala.collection.immutable.Stream[Boolean] = Stream(false, ?)
scala> res0.reverse
java.lang.OutOfMemoryError: GC overhead limit exceeded
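The reverse-then-fold-left strategy described above for finite lists can be sketched in Haskell as follows. The name foldrViaReverse is invented here, and this is an illustration of the idea rather than the Scala library's actual code.

```haskell
import Data.List (foldl')

-- Reverse the list, then run a strict left fold with the arguments of the
-- combining function swapped. Works only for finite lists, because reverse
-- must reach the end of its input.
foldrViaReverse :: (a -> b -> b) -> b -> [a] -> b
foldrViaReverse f z xs = foldl' (flip f) z (reverse xs)
```

On finite input this should agree with foldr, including for non-commutative operators such as subtraction; on an infinite list it never returns, which is the limitation the stream example demonstrates.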
Now let's think about what the result of this operation should be:
Stream.continually(false).foldRight(true)(_ && _)
The answer should be false, it doesn't matter how many false values are in the stream or if it is infinite, if we are going to combine them with a conjunction, the result will be false.
Haskell of course gets this with no problem:
Prelude> foldr (&&) True (repeat False)
False
And that is because of two important things: Haskell's foldr will traverse the stream from left to right, not right to left, and Haskell is lazy by default. The first item here, that foldr actually traverses the list from left to right, might surprise or confuse some people who think of a right fold as starting from the right, but the important feature of a right fold is not which end of a structure it starts on, but in which direction the associativity is. So given a list [1,2,3,4] and an op named op, a left fold is
(((1 op 2) op 3) op 4)
and a right fold is
(1 op (2 op (3 op 4)))
But the order of evaluation shouldn't matter. So what the authors have done here in chapter 3 is to give you a fold which traverses the list from left to right. Because Scala is strict by default, we still will not be able to traverse our stream of infinite falses, but have some patience, they will get to that in chapter 5 :) I'll give you a sneak peek: let's look at the difference between foldRight as it is defined in the standard library and as it is defined in the Foldable typeclass in scalaz:
Here's the implementation from the scala standard library:
def foldRight[B](z: B)(op: (A, B) => B): B
Here's the definition from scalaz's Foldable:
def foldRight[B](z: => B)(f: (A, => B) => B): B
The difference is that the Bs are all lazy, and now we get to fold our infinite stream again, as long as we give a function which is sufficiently lazy in its second parameter:
scala> Foldable[Stream].foldRight(Stream.continually(false),true)(_ && _)
res0: Boolean = false
One easy way to demonstrate this in Haskell is to use equational reasoning to demonstrate lazy evaluation. Let's write the find function in terms of foldr:
-- Return the first element of the list that satisfies the predicate, or `Nothing`.
find :: (a -> Bool) -> [a] -> Maybe a
find p = foldr (step p) Nothing
  where step pred x next = if pred x then Just x else next
foldr :: (a -> b -> b) -> b -> [a] -> b
foldr f z [] = z
foldr f z (x:xs) = f x (foldr f z xs)
In an eager language, if you wrote find with foldr it would traverse the whole list and use O(n) space. With lazy evaluation, it stops at the first element that satisfies the predicate, and uses only O(1) space (modulo garbage collection):
find odd [0..]
== foldr (step odd) Nothing [0..]
== step odd 0 (foldr (step odd) Nothing [1..])
== if odd 0 then Just 0 else (foldr (step odd) Nothing [1..])
== if False then Just 0 else (foldr (step odd) Nothing [1..])
== foldr (step odd) Nothing [1..]
== step odd 1 (foldr (step odd) Nothing [2..])
== if odd 1 then Just 1 else (foldr (step odd) Nothing [2..])
== if True then Just 1 else (foldr (step odd) Nothing [2..])
== Just 1
This evaluation stops in a finite number of steps, in spite of the fact that the list [0..] is infinite, so we know that we're not traversing the whole list. In addition, there is an upper bound on the complexity of the expressions at each step, which translates into a constant upper bound on the memory required to evaluate this.
The key here is that the step function that we're folding with has this property: no matter what the values of x and next are, it will either:
Evaluate to Just x, without invoking the next thunk, or
Tail-call the next thunk (in effect, if not literally).
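One way to observe that the rest of the list is never touched is to put undefined after the first match, as in the sketch below. The names findFirst and hit are hypothetical; findFirst is the same fold as find above, renamed to avoid clashing with Data.List.find.

```haskell
-- The same fold as `find` above: return the first element satisfying p.
findFirst :: (a -> Bool) -> [a] -> Maybe a
findFirst p = foldr step Nothing
  where
    step x next = if p x then Just x else next

-- The tail after 1 is undefined; forcing it would raise an exception,
-- so getting a result shows the fold stopped at the first odd element.
hit :: Maybe Int
hit = findFirst odd (0 : 1 : undefined)
```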

How to concisely express function iteration?

Is there a concise, idiomatic way how to express function iteration? That is, given a number n and a function f :: a -> a, I'd like to express \x -> f(...(f(x))...) where f is applied n-times.
Of course, I could make my own, recursive function for that, but I'd be interested if there is a way to express it shortly using existing tools or libraries.
So far, I have these ideas:
\n f x -> foldr (const f) x [1..n]
\n -> appEndo . mconcat . replicate n . Endo
but they all use intermediate lists, and aren't very concise.
The shortest one I found so far uses semigroups:
\n -> appEndo . times1p (n - 1) . Endo
but it works only for positive numbers (not for 0).
Primarily I'm focused on solutions in Haskell, but I'd be also interested in Scala solutions or even other functional languages.
Because Haskell is influenced by mathematics so much, the definition from the Wikipedia page you've linked to almost directly translates to the language.
Just check this out: f^0 = id and f^(n+1) = f ∘ f^n.
Now in Haskell:
iterateF 0 _ = id
iterateF n f = f . iterateF (n - 1) f
Pretty neat, huh?
So what is this? It's a typical recursion pattern. And how do Haskellers usually treat that? We treat that with folds! So after refactoring we end up with the following translation:
iterateF :: Int -> (a -> a) -> (a -> a)
iterateF n f = foldr (.) id (replicate n f)
or point-free, if you prefer:
iterateF :: Int -> (a -> a) -> (a -> a)
iterateF n = foldr (.) id . replicate n
As you see, the subject function's arguments appear neither in the Wikipedia definition nor in the solutions presented here. It is a function on another function, i.e. the subject function is being treated as a value. This is a higher-level approach to the problem than an implementation involving the arguments of the subject function.
Now, concerning your worries about the intermediate lists. From the source code perspective this solution turns out to be very similar to a Scala solution posted by @jmcejuela, but there's a key difference: the GHC optimizer throws away the intermediate list entirely, turning the function into a simple recursive loop over the subject function. I don't think it could be optimized any better.
To comfortably inspect the intermediate compiler results for yourself, I recommend using ghc-core.
In Scala:
Function chain Seq.fill(n)(f)
See scaladoc for Function. Lazy version: Function chain Stream.fill(n)(f)
Although this is not as concise as jmcejuela's answer (which I prefer), there is another way in scala to express such a function without the Function module. It also works when n = 0.
def iterate[T](f: T=>T, n: Int) = (x: T) => (1 to n).foldLeft(x)((res, n) => f(res))
To avoid creating a list, one can use explicit recursion, which in turn requires more type annotations.
def iterate[T](f: T=>T, n: Int): T=>T = (x: T) => (if(n == 0) x else iterate(f, n-1)(f(x)))
There is an equivalent solution using pattern matching like the solution in Haskell:
def iterate[T](f: T=>T, n: Int): T=>T = (x: T) => n match {
case 0 => x
case _ => iterate(f, n-1)(f(x))
}
Finally, I prefer the short way of writing it in Caml, where there is no need to define the types of the variables at all.
let rec iterate f n x = match n with 0 -> x | n -> iterate f (n-1) (f x);;
let f5 = iterate f 5 in ...
I like pigworker's/tauli's ideas the best, but since they only gave them as comments, I'm making a CW answer out of them.
\n f x -> iterate f x !! n
or
\n f -> (!! n) . iterate f
perhaps even:
\n -> ((!! n) .) . iterate
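To compare the approaches from this thread side by side, here is a small sketch with three of them given names and type signatures. The names iterateFold, iterateIndex and iterateRec are made up for illustration; all three use only Prelude functions.

```haskell
-- Compose n copies of f with a fold; n <= 0 gives id.
iterateFold :: Int -> (a -> a) -> a -> a
iterateFold n f = foldr (.) id (replicate n f)

-- Index into the lazy list x, f x, f (f x), ...; a negative n raises an error.
iterateIndex :: Int -> (a -> a) -> a -> a
iterateIndex n f x = iterate f x !! n

-- Explicit recursion, following the f^0 = id, f^(n+1) = f . f^n definition.
iterateRec :: Int -> (a -> a) -> a -> a
iterateRec n f
  | n <= 0 = id
  | otherwise = f . iterateRec (n - 1) f
```

For non-negative n the three should agree; they differ only in how they treat negative counts.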