Is foldRight equivalent to foldLeft given a noncommutative associative operation? - scala

In an online course it was stated that foldLeft and foldRight are equivalent for operators that are associative and commutative.
One of the students is adamant that such operators only need to be associative. So this property should be true for operations like function composition and matrix multiplication.
As far as I can tell, an associative operation that isn't commutative will not produce equivalent results for foldLeft and foldRight unless z is a neutral element and the operation accumulates the operands without changing their order. IMO the operation has to be commutative in the general case.
list.foldLeft(z)(operation) == list.foldRight(z)(operation)
So, for foldLeft and foldRight to be equivalent should operation be simultaneously associative and commutative or is it enough for operation to be associative?

String concatenation ("abc" + "xyz") is associative but not commutative and foldLeft/foldRight will place the initial/zero element at opposite ends of the resulting string. If that zero element is not the empty string then the results are different.
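This zero-placement difference is easy to see directly (a small sketch; the strings are arbitrary):

```scala
val parts = List("a", "b", "c")

// foldLeft nests to the left, so z ends up on the far left:
// (("z" + "a") + "b") + "c"
val left = parts.foldLeft("z")(_ + _) // "zabc"

// foldRight nests to the right, so z ends up on the far right:
// "a" + ("b" + ("c" + "z"))
val right = parts.foldRight("z")(_ + _) // "abcz"

// With the neutral element "" as z the two folds agree:
assert(parts.foldLeft("")(_ + _) == parts.foldRight("")(_ + _))
```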

The function must be both commutative and associative.
If our function is f, and our elements are x1 to x4, then:
foldLeft is f(f(f(f(z, x1), x2), x3), x4)
foldRight is f(x1, f(x2, f(x3, f(x4, z))))
Let's use the average function, which is commutative but not associative ((a + b) / 2 == (b + a) / 2):
scala> def avg(a: Double, b: Double): Double = (a + b) / 2
avg: (a: Double, b: Double)Double
scala> (0 until 10).map(_.toDouble).foldLeft(0d)(avg)
res4: Double = 8.001953125
scala> (0 until 10).map(_.toDouble).foldRight(0d)(avg)
res5: Double = 0.9892578125
EDIT: I missed the boat on only associative vs only commutative. See jwvy's example of string concatenation for an associative but not commutative function.

foldLeft is (...(z op x1)... op xn)
foldRight is x1 op (x2 op (... xn op z)...)
So the op needs to be commutative and associative for the two to be equivalent in the general case

There are at least three relevant cases with separate answers:
In the general case where op: (B, A) -> B or op: (A, B) -> B, as in the signatures of foldLeft and foldRight, neither associativity nor commutativity is even defined.
If B >: A and z is a two-sided identity of op: (B, B) -> B and op is associative then for all L of type List[A], L.foldLeft(z)(op) returns the same result as L.foldRight(z)(op).
This is closely related to the fact that if B >: A and op: (B, B) -> B then, if op is associative, for all L of type List[A] L.reduceLeft(op) returns the same result as L.reduceRight(op).
If B >: A, and op: (B, B) -> B is both associative and commutative then for all L of type List[A] and z of type B, L.foldLeft(z)(op) returns the same result as L.foldRight(z)(op).
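The second case can be checked with function composition, which is associative but not commutative and has `identity` as a two-sided identity (a small sketch; the functions are arbitrary):

```scala
// Function composition is associative but not commutative.
val fs: List[Int => Int] = List(_ + 1, _ * 2, _ - 3)
val id: Int => Int = x => x

// With the two-sided identity `id` as z, both folds build
// the same composite f1 andThen f2 andThen f3:
val l = fs.foldLeft(id)(_ andThen _)
val r = fs.foldRight(id)(_ andThen _)

assert(l(10) == r(10)) // both compute ((10 + 1) * 2) - 3 == 19
```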

Related

Monoid homomorphism and isomorphism

I am reading the book 'Programming in Scala' (The red book).
In the chapter about Monoids, I understand what a Monoid homomorphism is, for example: the length function f maps the String Monoid M (with concatenation) to the Int Monoid N (with addition) while preserving the monoid structure, and hence is homomorphic.
N.op(f(x), f(y)) == f(M.op(x, y))
// "Lorem".length + "ipsum".length == ("Lorem" + "ipsum").length
Quoting the book (from memory, so correct me if I am wrong):
When this happens in both directions, it is named Monoid isomorphism, which means that for monoids M, N and functions f, g, both f andThen g and g andThen f are the identity function. For example the String Monoid and List[Char] Monoid with concatenation are isomorphic.
But I can't see an actual example for seeing this, I can only think of f as the length function, but what happens with g?
Note: I have seen this question: What are isomorphism and homomorphisms.
To see the isomorphism between String and List[Char] we have toList: String -> List[Char] and mkString: List[Char] -> String.
length is a homomorphism from the String monoid to the monoid of natural numbers with addition.
A couple of examples of endo-homomorphism of the String monoid are toUpperCase and toLowerCase.
For lists, we have a lot of homomorphisms, many of which are just versions of fold.
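One such fold-shaped homomorphism is map(f), which preserves both concatenation and the empty list (a small sketch; f and the lists are arbitrary):

```scala
val f: Int => Int = _ * 10
val xs = List(1, 2)
val ys = List(3, 4)

// map(f) maps the (List, ++, Nil) monoid to itself homomorphically:
assert((xs ++ ys).map(f) == xs.map(f) ++ ys.map(f))
assert(List.empty[Int].map(f) == List.empty[Int])
```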
Here is siyopao's answer expressed as a ScalaCheck program:
import org.scalacheck.Prop.forAll
import org.scalacheck.Properties

object IsomorphismSpecification extends Properties("f and g") {
  val f: String => List[Char] = _.toList
  val g: List[Char] => String = _.mkString

  property("isomorphism") = forAll { (a: String, b: List[Char]) =>
    (f andThen g)(a) == a && (g andThen f)(b) == b
  }
}
which outputs
+ f and g.isomorphism: OK, passed 100 tests.

What is monoid homomorphism exactly?

I've read about monoid homomorphism from Monoid Morphisms, Products, and Coproducts and could not understand 100%.
The author says (emphasis original):
The length function maps from String to Int while preserving the monoid structure. Such a function, that maps from one monoid to another in such a preserving way, is called a monoid homomorphism. In general, for monoids M and N, a homomorphism f: M => N, and all values x: M, y: M, the following equations hold:
f(x |+| y) == (f(x) |+| f(y))
f(mzero[M]) == mzero[N]
Does he mean that, since the datatypes String and Int are monoids, and the function length maps String => Int preserving the monoid structure (Int is a monoid), it is called monoid homomorphism, right?
Does he mean, the datatype String and Int are monoid.
No, neither String nor Int are monoids. A monoid is a 3-tuple (S, ⊕, e) where ⊕ is a binary operator ⊕ : S×S → S, such that for all elements a, b, c∈S it holds that (a⊕b)⊕c=a⊕(b⊕c), and e∈S is an "identity element" such that a⊕e=e⊕a=a. String and Int are types, so basically sets of values, but not 3-tuples.
The article says:
Let's take the String concatenation and Int addition as
example monoids that have a relationship.
So the author clearly also mentions the binary operators ((++) in case of String, and (+) in case of Int). The identities (empty string in case of String and 0 in case of Int) are left implicit; leaving the identities as an exercise for the reader is common in informal English discourse.
Now given that we have two monoid structures (M, ⊕, em) and (N, ⊗, en), a function f : M → N (like length) is then called a monoid homomorphism [wiki] given it holds that f(m1⊕m2)=f(m1)⊗f(m2) for all elements m1, m2∈M and that mapping also preserves the identity element: f(em)=en.
For example length :: String -> Int is a monoid homomorphism, since we can consider the monoids (String, (++), "") and (Int, (+), 0). It holds that:
length (s1 ++ s2) == length s1 + length s2 (for all Strings s1 and s2); and
length "" == 0.
Datatype cannot be a monoid on its own. For a monoid, you need a data type T and two more things:
an associative binary operation, let's call it |+|, that takes two elements of type T and produces an element of type T
an identity element of type T, let's call it i, such that for every element t of type T the following holds: t |+| i = i |+| t = t
Here are some examples of a monoid:
set of integers with operation = addition and identity = zero
set of integers with operation = multiplication and identity = one
set of lists with operation = appending and identity = empty list
set of strings with operation = concatenation and identity = empty string
Monoid homomorphism
String concatenation monoid can be transformed into integer addition monoid by applying .length to all its elements. Both those sets form a monoid. By the way, remember that we can't just say "set of integers forms a monoid"; we have to pick an associative operation and a corresponding identity element. If we take e.g. division as operation, we break the first rule (instead of producing an element of type integer, we might produce an element of type float/double).
Method length allows us to go from a monoid (string concatenation) to another monoid (integer addition). If such operation also preserves monoid structure, it is considered to be a monoid homomorphism.
Preserving the structure means:
length(t1 |+| t2) = length(t1) |+| length(t2)
and
length(i) = i'
where t1 and t2 represent elements of the "source" monoid, i is the identity of the "source" monoid, and i' is the identity of the "destination" monoid. You can try it out yourself and see that length indeed is a structure-preserving operation on a string concatenation monoid, while e.g. indexOf("a") isn't.
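Both laws are easy to check directly; length satisfies them, while indexOf("a") already fails the first one (the strings here are arbitrary):

```scala
val t1 = "ba"
val t2 = "ca"

// length preserves the operation and the identity:
assert((t1 + t2).length == t1.length + t2.length)
assert("".length == 0)

// indexOf("a") does not: "baca".indexOf("a") is 1,
// but t1.indexOf("a") + t2.indexOf("a") is 1 + 1 = 2
assert((t1 + t2).indexOf("a") != t1.indexOf("a") + t2.indexOf("a"))
```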
Monoid isomorphism
As demonstrated, length maps all strings to their corresponding integers and forms a monoid with addition as operation and zero as identity. But we can't go back - for every string, we can figure out its length, but given a length we can't reconstruct the "original" string. If we could, then the operation of "going forward" combined with the operation of "going back" would form a monoid isomorphism.
Isomorphism means being able to go back and forth without any loss of information. For example, as stated earlier, list forms a monoid under appending as operation and empty list as identity element. We could go from "list under appending" monoid to "vector under appending" monoid and back without any loss of information, which means that operations .toVector and .toList together form an isomorphism. Another example of an isomorphism, which Runar mentioned in his text, is String ⟷ List[Char].
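The round trip can be checked directly (a small sketch with arbitrary values):

```scala
val xs = List(1, 2, 3)
val v  = Vector(4, 5)

// No information is lost in either direction:
assert(xs.toVector.toList == xs)
assert(v.toList.toVector == v)

// and toVector preserves the monoid operation (appending):
assert((xs ++ List(4)).toVector == xs.toVector ++ Vector(4))
```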
Colloquially a homomorphism is a function that preserves structure. In the example of the length function the preserved structure is the sum of the lengths of two strings being equal to the length of the concatenation of the same strings. Since both strings and integers can be regarded as monoids (when equipped with an identity and an associative binary operation obeying the monoid laws) length is called a monoid homomorphism.
See also the other answers for a more technical explanation.
trait Monoid[T] {
  def op(a: T, b: T): T
  def zero: T
}

val strMonoid = new Monoid[String] {
  def op(a: String, b: String): String = a ++ b
  def zero: String = ""
}

val lcMonoid = new Monoid[List[Char]] {
  def op(a: List[Char], b: List[Char]): List[Char] = a ::: b
  def zero: List[Char] = List.empty[Char]
}
homomorphism via function f
f(M.op(x, y)) = N.op(f(x), f(y))
for example, using toList available on String
//in REPL
scala> strMonoid.op("abc","def").toList == lcMonoid.op("abc".toList,"def".toList)
res4: Boolean = true
isomorphism via functions f and g
given bi-directional homomorphism between monoids M and N,
f(M.op(x, y)) = N.op(f(x), f(y))
g(N.op(x, y)) = M.op(g(x), g(y))
And if both (f andThen g) and (g andThen f) are identity functions, then monoids M and N are isomorphic via f and g:
g(f(M.op(x, y))) = g(N.op(f(x), f(y))) = M.op(g(f(x)), g(f(y))) = M.op(x, y)
for example, using toList available on String and mkString available on List[Char] (where toList andThen mkString and mkString andThen toList are identity functions)
scala> ( strMonoid.op("abc","def").toList ).mkString == ( lcMonoid.op("abc".toList,"def".toList) ).mkString
res7: Boolean = true

What are isomorphism and homomorphisms

I tried to understand isomorphism and homomorphisms in the context of programming and need some help.
In the book FPiS it explains:
Let's start with a homomorphism:
"foo".length + "bar".length == ("foo" + "bar").length
Here, length is a function from String to Int that preserves the monoid structure.
Why is that a homomorphism?
Why does it preserve the monoid structure?
Is, for example, the map function on lists a homomorphism?
About isomorphism, I have following explaination that I took it from a book:
A monoid isomorphism between M and N has two homomorphisms
f and g, where both f andThen g and g andThen f are an identity function.
For example, the String and List[Char] monoids with concatenation are isomorphic.
The two Boolean monoids (false, ||) and (true, &&) are also isomorphic,
via the ! (negation) function.
Why are the (false, ||) and (true, &&) monoids, and the String and List[Char] monoids with concatenation, isomorphic?
Why is that a homomorphism?
By definition.
Why does it preserve the monoid structure?
Because of the == in the expression above.
Is, for example, the map function on lists a homomorphism?
Yes. Replace "foo" and "bar" by two lists, and .length by .map(f). It's then easy to see (and prove) that the equation holds.
Why are the (false, ||) and (true, &&) monoids, and the String and List[Char] monoids with concatenation, isomorphic?
By definition. The proof is trivial, left as an exercise. (Hint: take the definition of an isomorphism, replace all abstract objects with concrete objects, prove that the resulting mathematical expression is correct)
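For the Boolean case the "exercise" is short enough to spell out; negation maps each monoid onto the other and is its own inverse:

```scala
// Negation maps (false, ||) to (true, &&) and back:
for (a <- List(true, false); b <- List(true, false)) {
  assert(!(a || b) == (!a && !b)) // ! maps || to && (De Morgan)
  assert(!(a && b) == (!a || !b)) // and && to || in the other direction
}
assert(!false == true) // the identity of (false, ||) maps to the identity of (true, &&)
assert(List(true, false).forall(x => !(!x) == x)) // ! andThen ! is the identity
```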
Edit: Here are the few definitions you asked in comments:
Homomorphism: a transformation of one set into another that preserves in the second set the relations between elements of the first. Formally f: A → B where both A and B have a * operation such that f(x * y) = f(x) * f(y).
Monoid: algebraic structure with a single associative binary operation and an identity element. Formally (M, *, id) is a Monoid iff (a * b) * c == a * (b * c) && a * id == a && id * a == a for all a, b, c in M.
"foo".length + "bar".length == ("foo" + "bar").length
To be precise, this is saying that length is a monoid homomorphism between the monoid of strings with concatenation, and the monoid of natural numbers with addition. That these two structures are monoids is easy to see if you put in the effort.
The reason length is a monoid homomorphism is because it has the properties that "".length = 0 and x.length ⊕ y.length = (x ⊗ y).length. Here, I've deliberately used two different symbols for the two monoid operations, to stress the fact that we are either applying the addition operation on the results of applying length vs the string concatenation operation on the two arguments before applying length. It is just unfortunate notation that the example you're looking at uses the same symbol + for both operations.
Edited to add: The question poster asked for some further details on what exactly is a monoid homomorphism.
OK, so suppose we have two monoids (A, ⊕, a) and (B, ⊗, b), meaning A and B are our two carriers, ⊕ : A × A → A and ⊗ : B × B → B are our two binary operators, and a ∈ A and b ∈ B are our two neutral elements. A monoid homomorphism between these two monoids is a function f : A → B with the following properties:
f(a) = b, i.e. if you apply f on the neutral element of A, you get the neutral element of B
f(x ⊕ y) = f(x) ⊗ f(y), i.e. if you apply f on the result of the operator of A on two elements, it is the same as applying it twice, on the two A elements, and then combining the results using the operator of B.
The point is that the monoid homomorphism is a structure-preserving mapping (which is what a homomorphism is; break down the word to its ancient Greek roots and you will see that it means 'same-shaped-ness').
OK, you asked for examples, here are some examples!
The one from the above example: length is a monoid homomorphism from the free monoid (A*, ·, ε) to (ℕ, +, 0)
Negation is a monoid homomorphism from (Bool, ∨, false) to (Bool, ∧, true) and vice versa.
exp is a monoid homomorphism from (ℝ, +, 0) to (ℝ\{0}, *, 1)
In fact, every group homomorphism is also, of course, a monoid homomorphism.
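The exp example can be checked numerically (up to floating-point error; the inputs are arbitrary):

```scala
val a = 1.5
val b = -0.3

// exp maps (ℝ, +, 0) to (ℝ \ {0}, *, 1):
assert(math.abs(math.exp(a + b) - math.exp(a) * math.exp(b)) < 1e-12)
assert(math.exp(0.0) == 1.0)
```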

What is the point of the /: function?

Documentation for /: includes
Note: might return different results for different runs, unless the underlying collection type is ordered or the operator
is associative and commutative.
(src)
Does this only apply when the par version of the collection is used; otherwise is the result deterministic (the same as foldLeft)?
Also, this function calls foldLeft under the hood: def /:[B](z: B)(op: (B, A) => B): B = foldLeft(z)(op)
Their definitions are the same (except for the parameter name, "op" instead of "f"):
def /:[B](z: B)(op: (B, A) ⇒ B): B
def foldLeft[B](z: B)(f: (B, A) ⇒ B): B
For these reasons, what is the point of the /: function, and when should it be used in favour of foldLeft?
Is my reasoning incorrect?
It's just an alternative syntax. Methods ending in : are called on the right hand side.
Instead of
list.foldLeft(0) { op(_, _) }
or
list./:(0) { op(_, _) }
you can
( z /: list ) { op(_, _) }
For example,
scala> val a = List(1,2,3,4)
a: List[Int] = List(1, 2, 3, 4)
scala> ( 0 /: a ) { _ + _ }
res5: Int = 10
Yes, those are aliases originating from dark times when people liked their operators like this:
val x = y |#<#|: z.
The point of the note is to remind you that for collections with unspecified iteration order the result of folds might differ. Consider a Set {1,2,3} that doesn't guarantee the same access order even if left unmodified, and apply an operation that is not associative (like /). Even without a par call, this might result in the following (pseudocode):
{1,2,3} foldLeft / ==> (1 / 2) / 3 ==> 1/6 = 0.1(6)
{3,1,2} foldLeft / ==> (3 / 1) / 2 ==> 3/2 = 1.5
In terms of consistency this is similar to applying non-parallelizable operations to parallel collections, though.
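The same order-dependence is easy to reproduce deterministically with lists, since / is neither associative nor commutative:

```scala
// Same elements, different iteration order, different result:
val a = List(1.0, 2.0, 3.0).reduceLeft(_ / _) // (1 / 2) / 3 = 0.1666...
val b = List(3.0, 1.0, 2.0).reduceLeft(_ / _) // (3 / 1) / 2 = 1.5
assert(a != b)
```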

Scala : fold vs foldLeft

I am trying to understand how fold and foldLeft and the respective reduce and reduceLeft work. I used fold and foldLeft as my example
scala> val r = List((ArrayBuffer(1, 2, 3, 4),10))
scala> r.foldLeft(ArrayBuffer(1,2,4,5))((x,y) => x -- y._1)
res28: scala.collection.mutable.ArrayBuffer[Int] = ArrayBuffer(5)
scala> r.fold(ArrayBuffer(1,2,4,5))((x,y) => x -- y._1)
<console>:11: error: value _1 is not a member of Serializable with Equals
r.fold(ArrayBuffer(1,2,4,5))((x,y) => x -- y._1)
Why didn't fold work like foldLeft? What is Serializable with Equals? I understand fold and foldLeft have slightly different API signatures in terms of parameter generic types. Please advise. Thanks.
The method fold (originally added for parallel computation) is less powerful than foldLeft in terms of types it can be applied to. Its signature is:
def fold[A1 >: A](z: A1)(op: (A1, A1) => A1): A1
This means that the type over which the folding is done has to be a supertype of the collection element type.
def foldLeft[B](z: B)(op: (B, A) => B): B
The reason is that fold can be implemented in parallel, while foldLeft cannot. This is not only because of the *Left part which implies that foldLeft goes from left to right sequentially, but also because the operator op cannot combine results computed in parallel -- it only defines how to combine the aggregation type B with the element type A, but not how to combine two aggregations of type B. The fold method, in turn, does define this, because the aggregation type A1 has to be a supertype of the element type A, that is A1 >: A. This supertype relationship allows in the same time folding over the aggregation and elements, and combining aggregations -- both with a single operator.
But, this supertype relationship between the aggregation and the element type also means that the aggregation type A1 in your example should be the supertype of (ArrayBuffer[Int], Int). Since the zero element of your aggregation is ArrayBuffer(1, 2, 4, 5) of the type ArrayBuffer[Int], the aggregation type is inferred to be the supertype of both of these, and that's Serializable with Equals, the least upper bound of a tuple and an array buffer.
In general, if you want to allow parallel folding for arbitrary types (which is done out of order) you have to use the method aggregate which requires defining how two aggregations are combined. In your case:
r.aggregate(ArrayBuffer(1, 2, 4, 5))({ (x, y) => x -- y._1 }, (x, y) => x intersect y)
Btw, try writing your example with reduce/reduceLeft -- because of the supertype relationship between the element type and the aggregation type that both these methods have, you will find that it leads to a similar error as the one you've described.