Is Monoid[String] really a Monoid in Scala?

I am currently learning about category theory in Scala, and the law of associativity says
(x + y) + z = x + (y + z)
That's just fine when working with more than two values:
("Foo" + "Bar") + "Test" == "Foo" + ("Bar" + "Test") // true
In that case the order doesn't matter. But what if there are only two values? For numbers it still works (commutative), but doing the same thing with strings fails.
3 + 1 == 1 + 3 // true
("Foo" + "Bar") == ("Bar" + "Foo") // false, not commutative
So is it legal to say that associativity requires commutativity to fulfill the monoid law? And is a String Monoid valid anyway?

So is it legal to say that associativity requires commutativity to fulfill the monoid law?
No. A binary operation does not need to be commutative to be associative. The fact that ("Foo" + "Bar") == ("Bar" + "Foo") is false is irrelevant to the fact that + is associative for String.
And so is a String Monoid valid anyway?
Yes, you can have a Monoid[String].
By definition:
A monoid is a set S that is closed under an associative binary operation + and has an identity element I in S such that for all a in S, I + a = a + I = a.
A monoid must contain at least one element.
Let + be the binary operation for a Monoid[String]. For any two strings, a and b, a + b is also a String, so the binary operation is closed over the type String. Without rigorous proof, we also know that it is associative.
i.e. for all strings a, b, and c:
(a + b) + c == a + (b + c)
We also have an identity element "" (the empty string), because for any string a, a + "" == a and "" + a == a.
A monoid whose binary operation is also commutative is called a commutative monoid. You clearly cannot have a commutative monoid for String using the + operation.
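The three properties discussed above can be spot-checked in plain Scala. This is only a sketch: the Monoid trait, stringMonoid and the StringMonoidLaws object are hypothetical names defined here for illustration (not taken from any library), and passing on a few sample values is evidence, not a proof.

```scala
object StringMonoidLaws {
  // Hypothetical minimal type class, defined here only for illustration.
  trait Monoid[T] {
    def op(a: T, b: T): T
    def zero: T
  }

  val stringMonoid: Monoid[String] = new Monoid[String] {
    def op(a: String, b: String): String = a + b
    def zero: String = ""
  }

  // A handful of sample values to test the laws against.
  val samples: List[String] = List("", "Foo", "Bar", "Test")

  // Associativity: (a + b) + c == a + (b + c)
  val associative: Boolean =
    samples.forall { a =>
      samples.forall { b =>
        samples.forall { c =>
          stringMonoid.op(stringMonoid.op(a, b), c) ==
            stringMonoid.op(a, stringMonoid.op(b, c))
        }
      }
    }

  // Identity: "" + a == a and a + "" == a
  val hasIdentity: Boolean =
    samples.forall { a =>
      stringMonoid.op(stringMonoid.zero, a) == a &&
        stringMonoid.op(a, stringMonoid.zero) == a
    }

  // Commutativity: a + b == b + a (fails, e.g. for "Foo" and "Bar")
  val commutative: Boolean =
    samples.forall { a =>
      samples.forall(b => stringMonoid.op(a, b) == stringMonoid.op(b, a))
    }
}
```

Here associative and hasIdentity come out true while commutative comes out false, which matches the argument: the monoid laws hold even though the operation is not commutative.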

Related

What is monoid homomorphism exactly?

I've read about monoid homomorphism from Monoid Morphisms, Products, and Coproducts and could not understand 100%.
The author says (emphasis original):
The length function maps from String to Int while preserving the
monoid structure. Such a function, that maps from one monoid to
another in such a preserving way, is called a monoid homomorphism. In
general, for monoids M and N, a homomorphism f: M => N, and all values
x:M, y:M, the following equations hold:
f(x |+| y) == (f(x) |+| f(y))
f(mzero[M]) == mzero[N]
Does he mean that, since the datatypes String and Int are monoids and the function length maps String => Int while preserving the monoid structure, length is called a monoid homomorphism?
Does he mean the datatypes String and Int are monoids?
No, neither String nor Int are monoids. A monoid is a 3-tuple (S, ⊕, e) where ⊕ is a binary operator ⊕ : S×S → S, such that for all elements a, b, c∈S it holds that (a⊕b)⊕c=a⊕(b⊕c), and e∈S is an "identity element" such that a⊕e=e⊕a=a. String and Int are types, so basically sets of values, but not 3-tuples.
The article says:
Let's take the String concatenation and Int addition as
example monoids that have a relationship.
So the author clearly also mentions the binary operators ((++) in case of String, and (+) in case of Int). The identities (empty string in case of String and 0 in case of Int) are left implicit; leaving the identities as an exercise for the reader is common in informal English discourse.
Now given that we have two monoid structures (M, ⊕, em) and (N, ⊗, en), a function f : M → N (like length) is called a monoid homomorphism if f(m1⊕m2)=f(m1)⊗f(m2) holds for all elements m1, m2∈M and the mapping also preserves the identity element: f(em)=en.
For example length :: String -> Int is a monoid homomorphism, since we can consider the monoids (String, (++), "") and (Int, (+), 0). It holds that:
length (s1 ++ s2) == length s1 + length s2 (for all Strings s1 and s2); and
length "" == 0.
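For readers following along in Scala rather than Haskell, the two conditions can be spot-checked as below. This is a sketch over a few sample strings; the object and value names are made up for this example.

```scala
object LengthHomomorphism {
  val samples: List[String] = List("", "a", "foo", "monoid")

  // Operation is preserved: concatenate then measure == measure then add.
  val preservesOperation: Boolean =
    samples.forall { s1 =>
      samples.forall(s2 => (s1 + s2).length == s1.length + s2.length)
    }

  // Identity is preserved: the empty string maps to 0.
  val preservesIdentity: Boolean = "".length == 0
}
```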
A datatype cannot be a monoid on its own. For a monoid, you need a data type T and two more things:
an associative binary operation, let's call it |+|, that takes two elements of type T and produces an element of type T
an identity element of type T, let's call it i, such that for every element t of type T the following holds: t |+| i = i |+| t = t
Here are some examples of a monoid:
set of integers with operation = addition and identity = zero
set of integers with operation = multiplication and identity = one
set of lists with operation = appending and identity = empty list
set of strings with operation = concatenation and identity = empty string
Monoid homomorphism
The string concatenation monoid can be mapped to the integer addition monoid by applying .length to all its elements. Both of those sets form a monoid. By the way, remember that we can't just say "the set of integers forms a monoid"; we have to pick an associative operation and a corresponding identity element. If we take e.g. division as the operation, we break the first rule (instead of producing an element of type integer, we might produce an element of type float/double).
The length method lets us go from one monoid (string concatenation) to another monoid (integer addition). If such an operation also preserves the monoid structure, it is considered to be a monoid homomorphism.
Preserving the structure means:
length(t1 |+| t2) = length(t1) |+| length(t2)
and
length(i) = i'
where t1 and t2 represent elements of the "source" monoid, i is the identity of the "source" monoid, and i' is the identity of the "destination" monoid. You can try it out yourself and see that length indeed is a structure-preserving operation on a string concatenation monoid, while e.g. indexOf("a") isn't.
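To see concretely why indexOf("a") is not structure-preserving, here is a small counterexample in Scala. The object name and the helper f are chosen only for this illustration.

```scala
object IndexOfCounterexample {
  // Candidate mapping from (String, +, "") to (Int, +, 0).
  def f(s: String): Int = s.indexOf("a")

  // Identity is not preserved: f("") is -1, not 0.
  val identityPreserved: Boolean = f("") == 0

  // Operation is not preserved: f("b" + "a") is 1,
  // but f("b") + f("a") is -1 + 0 = -1.
  val operationPreserved: Boolean = f("b" + "a") == f("b") + f("a")
}
```

Both values are false, so this mapping fails both homomorphism conditions.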
Monoid isomorphism
As demonstrated, length maps every string to an integer, and those integers form a monoid with addition as the operation and zero as the identity. But we can't go back: for every string we can figure out its length, but given a length we can't reconstruct the "original" string. If we could, then the operation of "going forward" combined with the operation of "going back" would form a monoid isomorphism.
Isomorphism means being able to go back and forth without any loss of information. For example, as stated earlier, list forms a monoid under appending as operation and empty list as identity element. We could go from "list under appending" monoid to "vector under appending" monoid and back without any loss of information, which means that operations .toVector and .toList together form an isomorphism. Another example of an isomorphism, which Runar mentioned in his text, is String ⟷ List[Char].
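The list/vector case can be checked directly. The sketch below uses sample values picked for illustration and tests that both directions preserve appending and that the round trips lose nothing.

```scala
object ListVectorIsomorphism {
  val xs: List[Int] = List(1, 2, 3)
  val ys: List[Int] = List(4, 5)
  val vs: Vector[Int] = Vector(1, 2, 3)
  val ws: Vector[Int] = Vector(4, 5)

  // toVector preserves the operation and the identity.
  val toVectorPreservesOp: Boolean = (xs ++ ys).toVector == xs.toVector ++ ys.toVector
  val toVectorPreservesEmpty: Boolean = List.empty[Int].toVector == Vector.empty[Int]

  // toList preserves the operation and the identity.
  val toListPreservesOp: Boolean = (vs ++ ws).toList == vs.toList ++ ws.toList
  val toListPreservesEmpty: Boolean = Vector.empty[Int].toList == List.empty[Int]

  // Going forward and back returns the original value in both directions.
  val roundTripList: Boolean = xs.toVector.toList == xs
  val roundTripVector: Boolean = vs.toList.toVector == vs
}
```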
Colloquially a homomorphism is a function that preserves structure. In the example of the length function the preserved structure is the sum of the lengths of two strings being equal to the length of the concatenation of the same strings. Since both strings and integers can be regarded as monoids (when equipped with an identity and an associative binary operation obeying the monoid laws) length is called a monoid homomorphism.
See also the other answers for a more technical explanation.
trait Monoid[T] {
  def op(a: T, b: T): T
  def zero: T
}
val strMonoid = new Monoid[String] {
  def op(a: String, b: String): String = a ++ b
  def zero: String = ""
}
val lcMonoid = new Monoid[List[Char]] {
  def op(a: List[Char], b: List[Char]): List[Char] = a ::: b
  def zero = List.empty[Char]
}
homomorphism via function f
f{M.op(x,y)} = N.op(f(x),f(y))
for example, using toList available on String
//in REPL
scala> strMonoid.op("abc","def").toList == lcMonoid.op("abc".toList,"def".toList)
res4: Boolean = true
isomorphism via functions f and g
given bi-directional homomorphism between monoids M and N,
f{M.op(x,y)} = N.op(f(x),f(y))
g{N.op(x,y)} = M.op(g(x),g(y))
And if both (f andThen g) and (g andThen f) are identity functions, then monoids M and N are isomorphic via f and g
g{f{M.op(x,y)}} = g{N.op(f(x),f(y))} = M.op(g(f(x)),g(f(y))) = M.op(x,y)
for example, using toList available on String and mkString available on List[Char] (where toList andThen mkString and mkString andThen toList are identity functions; toString on a List[Char] would render "List(a, b, c)" instead of rebuilding the string)
scala> lcMonoid.op("abc".toList, "def".toList).mkString == strMonoid.op("abc".toList.mkString, "def".toList.mkString)
res7: Boolean = true

What are isomorphism and homomorphisms

I am trying to understand isomorphisms and homomorphisms in the context of programming and need some help.
The book FPiS explains it as follows.
Let's start with a homomorphism:
"foo".length + "bar".length == ("foo" + "bar").length
Here, length is a function from String to Int that preserves the monoid structure.
Why is that a homomorphism?
Why does it preserve the monoid structure?
Is, for example, the map function on lists a homomorphism?
About isomorphisms, I have the following explanation that I took from a book:
A monoid isomorphism between M and N has two homomorphisms
f and g, where both f andThen g and g andThen f are an identity function.
For example, the String and List[Char] monoids with concatenation are isomorphic.
The two Boolean monoids (false, ||) and (true, &&) are also isomorphic,
via the ! (negation) function.
Why are (false, ||) and (true, &&), and the String and List[Char] monoids with concatenation, isomorphic?
Why is that a homomorphism?
By definition.
Why does it preserve the monoid structure?
Because of the == in the expression above.
Is, for example, the map function on lists a homomorphism?
Yes. Replace "foo" and "bar" by two lists, and .length by .map(f). It's then easy to see (and prove) that the equation holds.
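Spelled out in Scala, the substitution looks like the sketch below. The function f and the sample lists are arbitrary choices for illustration; any f would do.

```scala
object MapHomomorphism {
  // An arbitrary function to map with; chosen only for this example.
  val f: Int => String = n => s"#$n"

  val xs: List[Int] = List(1, 2)
  val ys: List[Int] = List(3)

  // Operation is preserved: mapping a concatenation equals
  // concatenating the mapped lists.
  val preservesConcat: Boolean = (xs ++ ys).map(f) == xs.map(f) ++ ys.map(f)

  // Identity is preserved: the empty list maps to the empty list.
  val preservesEmpty: Boolean = List.empty[Int].map(f) == List.empty[String]
}
```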
Why are (false, ||) and (true, &&), and the String and List[Char] monoids with concatenation, isomorphic?
By definition. The proof is trivial, left as an exercise. (Hint: take the definition of an isomorphism, replace all abstract objects with concrete objects, prove that the resulting mathematical expression is correct)
Edit: Here are the few definitions you asked in comments:
Homomorphism: a transformation of one set into another that preserves in the second set the relations between elements of the first. Formally f: A → B where both A and B have a * operation such that f(x * y) = f(x) * f(y).
Monoid: algebraic structure with a single associative binary operation and an identity element. Formally (M, *, id) is a Monoid iff (a * b) * c == a * (b * c) && a * id == a && id * a == a for all a, b, c in M.
"foo".length + "bar".length == ("foo" + "bar").length
To be precise, this is saying that length is a monoid homomorphism between the monoid of strings with concatenation, and the monoid of natural numbers with addition. That these two structures are monoids is easy to see if you put in the effort.
The reason length is a monoid homomorphism is because it has the properties that "".length = 0 and x.length ⊕ y.length = (x ⊗ y).length. Here, I've deliberately used two different symbols for the two monoid operations, to stress the fact that we are either applying the addition operation on the results of applying length vs the string concatenation operation on the two arguments before applying length. It is just unfortunate notation that the example you're looking at uses the same symbol + for both operations.
Edited to add: The question poster asked for some further details on what exactly is a monoid homomorphism.
OK, so suppose we have two monoids (A, ⊕, a) and (B, ⊗, b), meaning A and B are our two carriers, ⊕ : A × A → A and ⊗ : B × B → B are our two binary operators, and a ∈ A and b ∈ B are our two neutral elements. A monoid homomorphism between these two monoids is a function f : A → B with the following properties:
f(a) = b, i.e. if you apply f on the neutral element of A, you get the neutral element of B
f(x ⊕ y) = f(x) ⊗ f(y), i.e. if you apply f on the result of the operator of A on two elements, it is the same as applying it twice, on the two A elements, and then combining the results using the operator of B.
The point is that the monoid homomorphism is a structure-preserving mapping (which is what a homomorphism is; break down the word to its ancient Greek roots and you will see that it means 'same-shaped-ness').
OK, you asked for examples, here are some examples!
The one from the above example: length is a monoid homomorphism from the free monoid (A*, ·, ε) to (ℕ, +, 0)
Negation is a monoid homomorphism from (Bool, ∨, false) to (Bool, ∧, true) and vice versa.
exp is a monoid homomorphism from (ℝ, +, 0) to (ℝ\{0}, *, 1)
In fact, every group homomorphism is also, of course, a monoid homomorphism.
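The negation example is small enough to check exhaustively, since Boolean has only two values. A sketch in Scala (object and value names are made up for this example):

```scala
object NegationHomomorphism {
  val bools: List[Boolean] = List(true, false)

  // From (Bool, ||, false) to (Bool, &&, true): !(a || b) == !a && !b
  val orToAnd: Boolean =
    bools.forall(a => bools.forall(b => !(a || b) == (!a && !b)))
  val orIdentityToAndIdentity: Boolean = !false == true

  // From (Bool, &&, true) to (Bool, ||, false): !(a && b) == !a || !b
  val andToOr: Boolean =
    bools.forall(a => bools.forall(b => !(a && b) == (!a || !b)))
  val andIdentityToOrIdentity: Boolean = !true == false

  // Negating twice is the identity function, so the two
  // homomorphisms are inverse to each other.
  val roundTrip: Boolean = bools.forall(a => !(!a) == a)
}
```

Because negation is a homomorphism in both directions and is its own inverse, the two Boolean monoids are isomorphic.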

Right associative with operator :

I can merge two lists together as follows:
List(1,2,3) ::: List(4,5,6)
and the result is:
res2: List[Int] = List(1, 2, 3, 4, 5, 6)
The operator ::: is right associative. What does that mean?
In math, right associative is:
5 + ( 5 - ( 2 * 3 ) )
Right associative means the operator (in our case, the ::: method) is applied on the right operand while using the left operand as the argument. This means that the actual method invocation is done like this:
List(4,5,6).:::(List(1,2,3))
Since ::: prepends the list, the result is List(1,2,3,4,5,6).
In the most general sense, right-associative means that if you don't put any parentheses, they will be assumed to be on the right:
a ::: b ::: c == a ::: (b ::: c)
whereas left-associative operators (such as +) would have
a + b + c == (a + b) + c
However, according to the spec (6.12.3 Infix Operations),
A left-associative binary operation e1 op e2 is interpreted as e1.op(e2). If op is right-associative, the same operation is interpreted as { val x=e1; e2.op(x) }, where x is a fresh name.
So right-associative operators in Scala are considered as methods of their right operand with their left operand as parameter (as explained in @Yuval's answer).
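Both aspects can be seen in a short sketch. Since ::: is itself associative, grouping makes no visible difference for lists, so the example also defines a hypothetical Num wrapper with a non-associative operator -: (not part of any library) purely to make the implicit right grouping observable.

```scala
object RightAssociativeDemo {
  val a: List[Int] = List(1, 2, 3)
  val b: List[Int] = List(4, 5, 6)

  // The infix form and the method call it stands for
  // (the method is invoked on the right operand).
  val viaOperator: List[Int] = a ::: b
  val viaMethodCall: List[Int] = b.:::(a)

  // Hypothetical wrapper; the operator name ends in ':',
  // so it is right-associative and called on the right operand.
  final case class Num(value: Int) {
    def -:(left: Num): Num = Num(left.value - value)
  }

  // Without parentheses, grouping is assumed on the right.
  val ungrouped: Num = Num(10) -: Num(4) -: Num(1)
  val groupedRight: Num = Num(10) -: (Num(4) -: Num(1)) // 10 - (4 - 1) = 7
  val groupedLeft: Num = (Num(10) -: Num(4)) -: Num(1)  // (10 - 4) - 1 = 5
}
```

Here ungrouped equals groupedRight (Num(7)) and differs from groupedLeft (Num(5)).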

Fold left and fold right

I am trying to learn how to use fold left and fold right. This is my first time learning functional programming. I am having trouble understanding when to use fold left and when to use fold right. It seems to me that a lot of the time the two functions are interchangeable. For example (in Scala), the two functions:
val nums = List(1, 2, 3, 4, 5)
val sum1 = nums.foldLeft(0) { (total, n) =>
  total + n
}
val sum2 = nums.foldRight(0) { (total, n) =>
  total + n
}
both yield the same result. Why and when would I choose one or the other?
foldLeft and foldRight differ in the way the function is nested.
foldLeft: (((...) + a) + a) + a
foldRight: a + (a + (a + (...)))
Since the function you are using is addition, both of them give the same result. Try using subtraction.
Moreover, the motivation for choosing foldLeft or foldRight is not the result; in most cases, both yield the same result. It depends on which way you want your function to be aggregated.
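Trying subtraction on the list from the question shows the difference in nesting. A minimal sketch (the object name is made up for this example):

```scala
object FoldSubtraction {
  val nums: List[Int] = List(1, 2, 3, 4, 5)

  // ((((0 - 1) - 2) - 3) - 4) - 5 = -15
  val left: Int = nums.foldLeft(0)(_ - _)

  // 1 - (2 - (3 - (4 - (5 - 0)))) = 3
  val right: Int = nums.foldRight(0)(_ - _)
}
```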
The operator you are using is both associative and commutative (a + b = b + a), which is why foldLeft and foldRight are equivalent here. They are not equivalent in general, as the examples below show: string concatenation with + is associative but not commutative, i.e. "a" + "b" != "b" + "a".
val listString = List("a", "b", "c") // : List[String] = List(a,b,c)
val leftFoldValue = listString.foldLeft("z")((acc, el) => acc + el) // : String = zabc
val rightFoldValue = listString.foldRight("z")((el, acc) => el + acc) // : String = abcz
Or, in shorthand:
val leftFoldValue = listString.foldLeft("z")(_ + _) // : String = zabc
val rightFoldValue = listString.foldRight("z")(_ + _) // : String = abcz
Explanation:
foldLeft is evaluated as ((("z" + "a") + "b") + "c") = (("za" + "b") + "c") = ("zab" + "c") = "zabc"
and foldRight as ("a" + ("b" + ("c" + "z"))) = ("a" + ("b" + "cz")) = ("a" + "bcz") = "abcz"
So, in short, for operators that are associative and commutative, foldLeft and foldRight are equivalent (even though there may be a difference in efficiency).
But sometimes only one of the two folds is appropriate.
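One illustration of a case where the choice matters is rebuilding a list with ::. This example is not from the answer above; it is a sketch using the prepend operator on a sample list.

```scala
object FoldListBuilding {
  val nums: List[Int] = List(1, 2, 3, 4, 5)

  // foldRight visits elements from the right, so prepending each one
  // reproduces the original order.
  val rebuilt: List[Int] =
    nums.foldRight(List.empty[Int])((n, acc) => n :: acc)

  // foldLeft visits elements from the left, so prepending each one
  // reverses the list.
  val reversed: List[Int] =
    nums.foldLeft(List.empty[Int])((acc, n) => n :: acc)
}
```

Note also that the parameter order differs: foldLeft passes (accumulator, element), while foldRight passes (element, accumulator).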

Why do I have to explicitly state Tuple2(a, b) to be able to use Map add in a foldLeft?

I wish to create a Map keyed by name containing the count of things with that name. I have a list of the things with name, which may contain more than one item with the same name. Coded like this I get an error "type mismatch; found : String required: (String, Int)":
//variation 0, produces error
(Map[String, Int]() /: entries)((r, c) => { r + (c.name, if (r.contains(c.name)) (c.name) + 1 else 1) })
This confuses me, as I thought (a, b) was a Tuple2 and therefore suitable for use with Map add. Either of the following variations works as expected:
//variation 1, works
(Map[String, Int]() /: entries)((r, c) => { r + Tuple2(c.name, if (r.contains(c.name)) (c.name) + 1 else 1) })
//variation 2, works
(Map[String, Int]() /: entries)((r, c) => {
  val e = (c.name, if (r.contains(c.name)) (c.name) + 1 else 1)
  r + e })
I'm unclear on why there is a problem with my first version; can anyone advise? I am using Scala-IDE 2.0.0 beta 2 to edit the source; the error is from the Eclipse Problems window.
When passing a single tuple argument to a method used with operator notation, like your + method, you should use double parentheses:
(Map[String, Int]() /: entries)((r, c) => { r + ((c.name, r.get(c.name).map(_ + 1).getOrElse(1) )) })
I've also changed the computation of the Int, which looks funny in your example…
Because + is also used to concatenate strings with other things. In this case, the parentheses are not taken to mean a tuple, but to delimit a parameter list.
Scala has used + for other stuff, which resulted in all sorts of problems, just like the one you mention.
Replace + with updated, or use -> instead of ,.
r + (c.name, if (r.contains(c.name)) (c.name) + 1 else 1)
is parsed as
r.+(c.name, if (r.contains(c.name)) (c.name) + 1 else 1)
So the compiler looks for a + method with 2 arguments on Map and doesn't find it. The form I prefer over double parentheses (as Jean-Philippe Pellet suggests) is
r + (c.name -> (if (r.contains(c.name)) (c.name) + 1 else 1))
UPDATE:
If Pellet is correct, it's better to write
r + (c.name -> (r.getOrElse(c.name, 0) + 1))
(and of course James Iry's solution expresses the same intent even better).
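The suggestions in these answers can be compared side by side. The sketch below assumes a hypothetical Entry case class with a name field and some sample data, since the question does not show the real type, and it uses foldLeft instead of the /: operator. Note the extra parentheses around the value in the -> form: + and -> have the same precedence, so without them the tuple would be built first and + 1 applied to it.

```scala
object NameCounts {
  // Hypothetical element type and sample data, for illustration only.
  final case class Entry(name: String)
  val entries: List[Entry] = List(Entry("a"), Entry("b"), Entry("a"))

  // Double parentheses: the inner pair is the tuple, the outer pair
  // is the argument list of +.
  val viaDoubleParens: Map[String, Int] =
    entries.foldLeft(Map.empty[String, Int]) { (r, c) =>
      r + ((c.name, r.getOrElse(c.name, 0) + 1))
    }

  // Arrow syntax builds the tuple explicitly.
  val viaArrow: Map[String, Int] =
    entries.foldLeft(Map.empty[String, Int]) { (r, c) =>
      r + (c.name -> (r.getOrElse(c.name, 0) + 1))
    }

  // updated takes key and value as two separate arguments.
  val viaUpdated: Map[String, Int] =
    entries.foldLeft(Map.empty[String, Int]) { (r, c) =>
      r.updated(c.name, r.getOrElse(c.name, 0) + 1)
    }
}
```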