What exactly is a category? - scala

I am reading Category Theory for Programmers, and I cannot figure out what exactly a category is.
Let's consider the Bool type. Is Bool a category and True or False are objects (inhabitants)?

One reason you're getting a lot of potentially confusing answers is that your question is a little like asking: "Let's consider a soccer ball. Is the soccer ball a 'game' with the black and white polygons its 'pieces'?"
The answer might be @arrowd's answer: "No, you've confused the game of soccer (Hask) with its ball (Bool), and the polygons (True and False) don't matter." Or, it might be @freestyle's answer: "Yes, we could certainly create a game using the soccer ball and assign one player to the black polygons and the other player the white polygons, but what would the rules be?" Or, it might be @Yuval Itzchakov's answer: "Formally, a 'game' is a collection of one or more players, zero or more pieces, and a set of rules such that, etc., etc."
So, let me add to the confusion with one more (very long) answer, but maybe it will answer your question a little more directly.
Yes, but it's a boring category
Instead of talking about the Haskell type Bool, let's just talk about the abstract concept of boolean logic and the boolean values true and false. Can we form a category with the abstract values "true" and "false" as the objects?
The answer is definitely yes. In fact, we can form (infinitely) many such categories. All we need to do is explain what the "objects" are, what the "arrows" (sometimes called morphisms) are, and make sure that they obey the formal mathematical rules for categories.
Here is one category: Let the "objects" be "true" and "false", and let there be two "arrows":
true -> true
false -> false
Note: Don't confuse this -> notation with Haskell functions. These arrows don't "mean" anything yet, they are just abstract connections between objects.
Now, I know this is a category because it includes both identity arrows (from an object to itself), and it satisfies the composition property which basically says that if I can follow two arrows from a -> b -> c , then there must be a direct arrow a -> c representing their "composition". (Again, when I write a -> b -> c, I'm not talking about a function type here -- these are abstract arrows connecting a to b and then b to c.) Anyway, I don't have enough arrows to worry too much about composition for this category because I don't have any "paths" between different objects. I will call this the "Discrete Boolean" category. I agree that it is mostly useless, just like a game based on the polygons of a soccer ball would be pretty stupid.
Yes, but it has nothing to do with boolean values
Here's a slightly more interesting category. Let the objects be "true" and "false", and let the arrows be the two identity arrows above plus:
false -> true
This is a category, too. It has all the identity arrows, and it satisfies composition because, ignoring the identity arrows, the only interesting "path" I can follow is from "false" to "true", and there's nowhere else to go, so I still don't really have enough arrows to worry about the composition rule.
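The two small categories above can be checked mechanically. Here is a minimal sketch, assuming a "thin" category (at most one arrow between any two objects), so that an arrow is fully described by its source and target. The names Obj, Arrow, and isThinCategory are made up for this illustration. In a thin category the unit and associative laws hold automatically, so only identities and closure under composition need checking.

```haskell
-- Sketch only: a thin category has at most one arrow per (source, target)
-- pair, so we can represent each arrow by that pair.
data Obj = F | T deriving (Eq, Show, Enum, Bounded)

type Arrow = (Obj, Obj)

objects :: [Obj]
objects = [minBound .. maxBound]

-- The "Discrete Boolean" category: identity arrows only.
discrete :: [Arrow]
discrete = [(F, F), (T, T)]

-- The identities plus the single arrow false -> true.
ordered :: [Arrow]
ordered = [(F, F), (T, T), (F, T)]

-- Every object needs an identity arrow.
hasIdentities :: [Arrow] -> Bool
hasIdentities arrows = all (\o -> (o, o) `elem` arrows) objects

-- Whenever a -> b and b -> c exist, a -> c must exist too.
closedUnderComposition :: [Arrow] -> Bool
closedUnderComposition arrows =
  and [ (a, c) `elem` arrows | (a, b) <- arrows, (b', c) <- arrows, b == b' ]

isThinCategory :: [Arrow] -> Bool
isThinCategory arrows = hasIdentities arrows && closedUnderComposition arrows
```

Both discrete and ordered pass the check, while a lone arrow such as [(F, T)] fails because the identities are missing.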
There are a couple more categories you could write down here. See if you can find one.
Unfortunately, these last two categories don't really have anything to do with the properties of boolean logic. It's true that false -> true looks a little like a not operation, but then how could we explain false -> false or true -> true, and why isn't true -> false there, too?
Ultimately, we could just as easily have called these objects "foo" and "bar" or "A" and "B" or not even bothered to name them, and the categories would be just as valid. So, while technically these are categories with "true" and "false" as objects, they don't capture anything interesting about boolean logic.
Quick aside: multiple arrows
Something I haven't mentioned yet is that categories can contain multiple, distinct arrows between two objects, so there could be two arrows from a to b. To differentiate them, I might give them names, like:
u : a -> b
v : a -> b
I could even have an arrow separate from the identity from b to itself:
w : b -> b -- some non-identity arrow
The composition rule would have to be satisfied by all the different paths. So, because there's a path u : a -> b and a path w : b -> b (even though it doesn't "go" anywhere else), there would have to be an arrow representing the composition of u followed by w from a -> b. Its value might be equal to "u" again, or it might be "v", or it might be some other arrow from a -> b. Part of describing a category is explaining how all the arrows compose and demonstrating that they obey the category laws (the unit law and the associative law, but let's not worry about those laws here).
Armed with this knowledge, you can create an infinite number of boolean categories just by adding more arrows wherever you want and inventing any rules you'd like about how they should compose, subject to the category laws.
Sort of, if you use more complicated objects
Here's a more interesting category that captures some of the "meaning" of boolean logic. It's kind of complicated to explain, so bear with me.
Let the objects be boolean expressions with zero or more boolean variables:
true
false
not x
x and y
(not (y or false)) and x
We'll consider expressions that are "always the same" to be the same object, so y or false and y are the same object, since no matter what the value of y is, they have the same boolean value. That means that the last expression above could have been written (not y) and x instead.
Let the arrows represent the act of setting zero or more boolean variables to specific values. We'll label these arrows with little annotations, so that the arrow {x=false,y=true} represents the act of setting two variables as indicated. We'll assume that the settings are applied in order, so the arrow {x=false,x=true} would have the same effect on an expression as {x=false}, even though they're different arrows. That means that we have arrows like:
{x=false} : not x -> true
{x=true,y=true} : x and y -> true
We also have:
{x=false} : x and y -> false and y -- or just "false", the same thing
Technically, the two arrows labelled {x=false} are different arrows. (They can't be the same arrow because they're arrows between different objects.) It's very common in category theory to use the same name for different arrows like this if they have the same "meaning" or "interpretation", like these ones do.
We'll define composition of arrows to be the act of applying the sequence of settings in the first arrow and then applying the settings from the second arrow, so the composition of:
{x=false}: x or y -> y and {y=true} : y -> true
is the arrow:
{x=false,y=true}: x or y -> true
This is a category. It has identity arrows for every expression, consisting of not setting any variables:
{} : true -> true
{} : not (x or y) and (u or v) -> not (x or y) and (u or v)
It defines composition for every pair of arrows, and the compositions obey the unit and associative laws (again, let's not worry about that detail here).
And, it represents a particular aspect of boolean logic, specifically the act of calculating the value of a boolean expression by substituting boolean values into variables.
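If it helps to see this category as code, here is a rough sketch. The types Expr and Setting and the functions applySetting, compose, and sameObject are names invented for this illustration. An arrow is an ordered list of variable settings, composition is list concatenation, and two expressions count as the same object when they agree under every assignment of their variables.

```haskell
import Data.List (nub)

-- Objects: boolean expressions (up to logical equivalence, see sameObject).
data Expr
  = Lit Bool
  | Var String
  | Not Expr
  | And Expr Expr
  | Or Expr Expr
  deriving Show

-- Arrows: settings applied in order, e.g. [("x", False), ("y", True)].
type Setting = [(String, Bool)]

-- Applying an arrow substitutes values for variables. `lookup` returns the
-- first match, so {x=false,x=true} acts like {x=false}.
applySetting :: Setting -> Expr -> Expr
applySetting s = go
  where
    go (Lit b)   = Lit b
    go (Var v)   = maybe (Var v) Lit (lookup v s)
    go (Not e)   = Not (go e)
    go (And a b) = And (go a) (go b)
    go (Or a b)  = Or (go a) (go b)

-- Composition: apply the first arrow's settings, then the second's.
compose :: Setting -> Setting -> Setting
compose first second = first ++ second

vars :: Expr -> [String]
vars (Lit _)   = []
vars (Var v)   = [v]
vars (Not e)   = vars e
vars (And a b) = vars a ++ vars b
vars (Or a b)  = vars a ++ vars b

-- Evaluate an expression with no remaining variables.
evalClosed :: Expr -> Maybe Bool
evalClosed (Lit b)   = Just b
evalClosed (Var _)   = Nothing
evalClosed (Not e)   = not <$> evalClosed e
evalClosed (And a b) = (&&) <$> evalClosed a <*> evalClosed b
evalClosed (Or a b)  = (||) <$> evalClosed a <*> evalClosed b

-- Two expressions are the same object if they agree under every assignment.
sameObject :: Expr -> Expr -> Bool
sameObject a b = all agree assignments
  where
    names       = nub (vars a ++ vars b)
    assignments = mapM (\v -> [(v, False), (v, True)]) names
    agree s     = evalClosed (applySetting s a) == evalClosed (applySetting s b)
```

With these definitions, applying compose [("x", False)] [("y", True)] to x or y lands on the same object as true, matching the composition example above.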
Hey, look! A Functor!
It also has a somewhat interesting functor which we might call "Negate". I won't explain what a functor is here. I'll just say that Negate maps this category to itself by:
taking each object (boolean expression) to its logical negation
taking each arrow to a new arrow representing the same variable substitutions
So, the arrow:
{a=false} : (not a) and b -> b
gets mapped by the Negate functor to:
{a=false} : not ((not a) and b) -> not b
or more simply, using the rules of boolean logic:
{a=false} : a or (not b) -> not b
which is a valid arrow in the original category.
This functor captures the idea that "negating a boolean expression" is equivalent to "negating its final result", or maybe more generally that the process of substituting variables in a negated expression has the same structure as doing it to the original expression. Maybe that's not too exciting, but this is just a long Stack Overflow answer, not a 500-page textbook on Category Theory, right?
Bool as part of the Hask category
Now, let's switch from talking about abstract boolean categories to your specific question, whether the Bool Haskell type is a category with objects True and False.
The answers above still apply, to the extent that this Haskell type can be used as a model of boolean logic.
However, when people talk about categories in Haskell, they are usually talking about a specific category Hask where:
the objects are types (like Bool, Int, etc.)
the arrows are Haskell functions (like f :: Int -> Double). Finally, the Haskell syntax and our abstract category syntax coincide -- the Haskell function f can be thought of as an arrow from the object Int to the object Double.
composition is regular function composition
If we are talking about this category, then the answer to your question is: no, in the Hask category, Bool is one of the objects, and the arrows are Haskell functions like:
id :: Bool -> Bool
not :: Bool -> Bool
(==0) :: Int -> Bool
foo :: Bool -> Int
foo b = if b then 10 else 15
To make things more complicated, the objects also include types of functions, so Bool -> Bool is one of the objects. One example of an arrow that uses this object is:
and :: Bool -> (Bool -> Bool)
which is an arrow from the object Bool to the object Bool -> Bool.
In this scenario, True and False aren't part of the category. Haskell values of function types, like sqrt or length, are part of the category because they're arrows, but True and False are values of a non-function type, so we just leave them out of the definition.
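To make "arrows compose" concrete in Hask, here is a small sketch. The names isZero, fooOfIsZero, and andArrow are made up here; foo is the function from the list above. Composing an arrow Int -> Bool with an arrow Bool -> Int gives an arrow Int -> Int, and id acts as the identity arrow on every object.

```haskell
-- An arrow from the object Int to the object Bool.
isZero :: Int -> Bool
isZero = (== 0)

-- An arrow from the object Bool to the object Int.
foo :: Bool -> Int
foo b = if b then 10 else 15

-- Their composite: an arrow from Int to Int that passes through Bool.
fooOfIsZero :: Int -> Int
fooOfIsZero = foo . isZero

-- An arrow from the object Bool to the object (Bool -> Bool).
andArrow :: Bool -> (Bool -> Bool)
andArrow = (&&)
```

Note that nothing here needs True or False to be objects or arrows; they only show up when we run an arrow on an input.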
Category Theory
Note that this last category, like the first categories we looked at, has absolutely nothing to do with boolean logic, even though Bool is one of the objects. In fact, in this category, Bool and Int look about the same -- they're just two types that can have arrows leaving or entering them, and you'd never know that Bool was about true and false or that Int represented integers, if you were just looking at the Hask category.
This is a fundamental aspect of category theory. You use a specific category to study a specific aspect of some system. Whether or not Bool is a category or a part of a category is sort of a vague question. The better question would be, "is this particular aspect of Bool that I'm interested in something that can be represented as a useful category?"
The categories I gave above roughly correspond to these potentially interesting aspects of Bool:
The "Discrete Boolean" category represents Bool as a plain mathematical set of two objects, "true" and "false", with no additional interesting features.
The "false -> true" category represents an ordering of boolean values, false < true, where each arrow represents the operator '<='.
The boolean expression category represents an evaluation model for simple boolean expressions.
Hask represents the composition of functions whose input and output types may be a boolean type or a functional type involving boolean and other types.

If you are talking about the Hask category, then no. Hask is the category and the objects are Haskell types. That is, Bool is an object, and True/False aren't even talked about here. The description of Hask can be found on the Haskell wiki. There are also arguments that Hask isn't even a proper category.

Related

Why doesn't a prism set function return an Option/Maybe

In functional optics, a well-behaved prism (called a partial lens in scala, I believe) is supposed to have a set function of type 'subpart -> 'parent -> 'parent, where if the prism "succeeds" and is structurally compatible with the 'parent argument given, then it returns the 'parent given with the appropriate subpart modified to have the 'subpart value given. If the prism "fails" and is structurally incompatible with the 'parent argument, then it returns the 'parent given unmodified.
I'm wondering why the prism doesn't return a 'parent option (Maybe for Haskellers) to represent the pass/fail nature of the set function? Shouldn't the programmer be able to tell from the return type whether the set was "successful" or not?
I know there's been a lot of research and thought put into the realm of functional optics, so I'm sure there must be a definitive answer that I just can't seem to find.
(I'm from an F# background, so I apologize if the syntax I've used is a bit opaque for Haskell or Scala programmers).
I doubt there's one definitive answer, so I'll give you two here.
Origin
I believe prisms were first imagined (by Dan Doel, if my vague recollection is correct) as "co-lenses". Whereas a lens from s to a offers
get :: s -> a
set :: (s, a) -> s
a prism from s to a offers
coget :: a -> s
coset :: s -> Either s a
All the arrows are reversed, and the product, (,), is replaced by a coproduct, Either. So a prism in the category of types and functions is a lens in the dual category.
For simple prisms, that s -> Either s a seems a bit weird. Why would you want the original value back? But the lens package also offers type-changing optics. So we end up with
get :: s -> a
set :: (s, b) -> t
coget :: a -> s
coset :: t -> Either s b
Suddenly what we're getting back in the non-matching case may actually be a bit different! What's that about? Here's an example:
cogetLeft :: a -> Either a x
cogetLeft = Left
cosetLeft :: Either b x -> Either (Either a x) b
cosetLeft (Left b) = Right b
cosetLeft (Right x) = Left (Right x)
In the second (non-matching) case, the value we get back is the same, but its type has been changed.
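For the simple (non-type-changing) case, here is a dependency-free sketch of how the familiar set falls out of coget and coset. The SimplePrism record and the names leftP, setP, and trySetP are invented for this illustration and are not the lens library's representation. It also shows that the Maybe-returning variant the question asks about is a one-liner on top of the same two functions.

```haskell
-- Hypothetical record for a simple prism, following the co-lens view above.
data SimplePrism s a = SimplePrism
  { coget :: a -> s           -- build the whole from the part
  , coset :: s -> Either s a  -- the untouched whole, or the matching part
  }

matchLeft :: Either a x -> Either (Either a x) a
matchLeft (Left a)  = Right a
matchLeft (Right x) = Left (Right x)

-- A prism focusing on the Left constructor of Either.
leftP :: SimplePrism (Either a x) a
leftP = SimplePrism { coget = Left, coset = matchLeft }

-- The usual "set": on a mismatch, hand back the original unchanged.
setP :: SimplePrism s a -> a -> s -> s
setP p x s = case coset p s of
  Left unchanged -> unchanged
  Right _        -> coget p x

-- The variant that reports failure in the return type.
trySetP :: SimplePrism s a -> a -> s -> Maybe s
trySetP p x s = case coset p s of
  Left _  -> Nothing
  Right _ -> Just (coget p x)
```

So no information is lost by the usual design: anyone who wants the pass/fail result can recover it from the match function.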
Nice hierarchy
For both Van Laarhoven (as in lens) and profunctor style frameworks, both lenses and prisms can also stand in for traversals. To do that, they need to have similar forms, and this design accomplishes that. leftaroundabout's answer gives more detail on this aspect.
To answer the “why” – lenses etc. are pretty rigidly derived from category theory, so this is actually quite clear-cut – the behaviour you describe just drops out of the maths, it's not something anybody defined for any purpose but follows from far more general ideas.
Ok, that's not really satisfying.
Not sure if other languages' type systems are powerful enough to express this, but in principle and in Haskell, a prism is a special case of a traversal.
A traversal is a way to “visit” all occurrences of “elements” within some “container”. The classical example is
mapM :: Monad m => (a -> m b) -> [a] -> m [b]
This is typically used like
Prelude> mapM print [1..4]
1
2
3
4
[(),(),(),()]
The focus here is on: sequencing the actions/side-effects, and gathering back the result in a container with the same structure as the one we started with.
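The same shape-preserving behaviour shows up with a pure effect such as Maybe. A small sketch, where halve and halveAll are names made up for this example: the result list has the same length as the input when every step succeeds, and the whole traversal fails if any single step does.

```haskell
-- Succeeds only on even numbers.
halve :: Int -> Maybe Int
halve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

-- Visit every element, sequencing the Maybe effects left to right.
halveAll :: [Int] -> Maybe [Int]
halveAll = mapM halve
```

For example, halveAll [2, 4, 6] gives Just [1, 2, 3], while halveAll [2, 3] gives Nothing.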
What's special about a prism is simply that the containers are restricted to contain either one or zero elements† (whereas a general traversal can go over any number of elements). But the set operator doesn't know about that because it's strictly more general. The nice thing is that you can therefore use this on a lens, or a prism, or on mapM, and always get a sensible behaviour. But it's not the behaviour of “insert exactly once into the structure or else tell me if it failed”.
Not that this isn't a sensible operation, just it's not what lens libraries call “setting”. You can do it by explicitly matching and re-building:
-- needs RankNTypes, since Prism' is a rank-2 type synonym
set₁ :: Prism' s a -> a -> s -> Maybe s
set₁ p x s = case matching p s of
    Left _ -> Nothing
    Right _ -> Just $ x ^. re p
†More precisely: a prism separates the cases: a container may either contain one element, and nothing else apart from that, or it may have no element but possibly something unrelated.

General Advice about When to Use Prop and When to use bool

I am formalizing a grammar which is essentially one over boolean expressions. In Coq, you can get boolean-like things in Prop or more explicitly in bool.
So for example, I could write:
true && true
Or
True /\ True
The problem is that in proofs (which is what I really care about) I can do a case analysis in domain bool, but in Prop this is not possible (since all members are not enumerable, I suppose). Giving up this tactic and similar rewriting tactics seems like a huge drawback even for very simple proofs.
In general, what situations would one choose Prop over bool for formalizing? I realize this is a broad question, but I feel like this is not addressed in the Coq manual sufficiently. I am interested in real world experience people have had going down both routes.
There are lots of different opinions on this. My personal take is that you are often better off not making this choice: it makes sense to have two versions of a property, one in Prop, the other one in bool.
Why would you want this? As you pointed out, booleans support case analysis in proofs and functions, which general propositions do not. However, Prop is more convenient to use in certain cases. Suppose you have a type T with finitely many values. We can write a procedure
all : (T -> bool) -> bool
that decides whether a boolean property P : T -> bool holds of all elements of T. Imagine that we know that all P = true, for some property P. We might want to use this fact to conclude that P x = true for some value x. To do this, we need to prove a lemma about all:
allP : forall P : T -> bool,
all P = true <-> (forall x : T, P x = true)
This lemma connects two different formulations of the same property: a boolean one and a propositional one. When reasoning about all in a proof, we can invoke allP to convert freely between the two. We can also have different conversion lemmas:
allPn : forall P,
all P = false <-> (exists x, P x = false)
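The boolean side of this picture can be sketched outside Coq as well. Here is a rough Haskell analogue, assuming a finite type with Bounded and Enum instances; Color, allT, and counterexample are names invented for this illustration. Haskell cannot state the propositional side, so this only shows the decision procedure and the witness that allPn talks about, not the lemmas themselves.

```haskell
-- A small finite type standing in for T.
data Color = Red | Green | Blue deriving (Eq, Show, Enum, Bounded)

-- Decide whether a boolean property holds of every value of a finite type.
allT :: (Bounded t, Enum t) => (t -> Bool) -> Bool
allT p = all p [minBound .. maxBound]

-- When allT p is False, some value must falsify p; this finds the first one.
counterexample :: (Bounded t, Enum t) => (t -> Bool) -> Maybe t
counterexample p = case filter (not . p) [minBound .. maxBound] of
  (x : _) -> Just x
  []      -> Nothing
```

Here allT (/= Red) is False and counterexample (/= Red) is Just Red, which mirrors the "exists x, P x = false" reading.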
In fact, we are free to choose any Coq proposition whatsoever to relate to a boolean computation (as long, of course, as we can prove that the two are logically equivalent). For instance, if we would like to have a custom induction principle associated with a boolean property, we can look for an equivalent formulation as an inductively defined proposition.
The Mathematical Components library is a good example of a development that follows this style. Indeed, because it is so pervasive there, the library provides a special view mechanism for writing conversion lemmas like the one above and applying them. In plain Coq, we can also use the rewrite tactic to apply logical equivalences more conveniently.
Of course, there are many situations where it does not make sense to have two formulations of the same property. Sometimes, you are forced to use Prop, because the property you care about is undecidable. Sometimes, you might feel that you wouldn't gain anything by writing your property in Prop, and may keep it only as a boolean.
In addition to the Software Foundations chapter linked above, this answer discusses the difference between bool and Prop in more depth.

What's the difference between a lens and a partial lens?

A "lens" and a "partial lens" seem rather similar in name and in concept. How do they differ? In what circumstances do I need to use one or the other?
Tagging Scala and Haskell, but I'd welcome explanations related to any functional language that has a lens library.
To describe partial lenses—which I will henceforth call, according to the Haskell lens nomenclature, prisms (excepting that they're not! See the comment by Ørjan)—I'd like to begin by taking a different look at lenses themselves.
A lens Lens s a indicates that given an s we can "focus" on a subcomponent of s at type a, viewing it, replacing it, and (if we use the lens family variation Lens s t a b) even changing its type.
One way to look at this is that Lens s a witnesses an isomorphism, an equivalence, between s and the tuple type (r, a) for some unknown type r.
Lens s a ====== exists r . s ~ (r, a)
This gives us what we need since we can pull the a out, replace it, and then run things back through the equivalence backward to get a new s with our updated a.
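As a concrete sketch of that idea, take a two-field record and let the residual type r be "everything except the focused field". Person, split, unsplit, getAge, and setAge are names made up for this example. The getter and setter are both derived from the two halves of the isomorphism.

```haskell
data Person = Person { name :: String, age :: Int } deriving (Eq, Show)

-- One direction of the isomorphism s ~ (r, a), with r = String and a = Int.
split :: Person -> (String, Int)
split (Person n a) = (n, a)

-- The other direction.
unsplit :: (String, Int) -> Person
unsplit (n, a) = Person n a

-- Viewing: go forward and keep the focused part.
getAge :: Person -> Int
getAge = snd . split

-- Replacing: go forward, swap in the new part, and come back.
setAge :: Int -> Person -> Person
setAge a p = unsplit (fst (split p), a)
```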
Now let's take a minute to refresh our high school algebra via algebraic data types. Two key operations in ADTs are multiplication and summation. We write the type a * b when we have a type consisting of items which have both an a and a b and we write a + b when we have a type consisting of items which are either a or b.
In Haskell we write a * b as (a, b), the tuple type. We write a + b as Either a b, the either type.
Products represent bundling data together, sums represent bundling options together. Products can represent the idea of having many things only one of which you'd like to choose (at a time) whereas sums represent the idea of failure because you were hoping to take one option (on the left side, say) but instead had to settle for the other one (along the right).
Finally, sums and products are categorical duals. They fit together and having one without the other, as most PLs do, puts you in an awkward place.
So let's take a look at what happens when we dualize (part of) our lens formulation above.
exists r . s ~ (r + a)
This is a declaration that s is either a type a or some other thing r. We've got a lens-like thing that embodies the notion of option (and of failure) deep at its core.
This is exactly a prism (or partial lens)
Prism s a ====== exists r . s ~ (r + a)
exists r . s ~ Either r a
So how does this work concerning some simple examples?
Well, consider the prism which "unconses" a list:
uncons :: Prism [a] (a, [a])
it's equivalent to this
head :: exists r . [a] ~ (r + (a, [a]))
and it's relatively obvious what r entails here: total failure since we have an empty list!
To substantiate the type a ~ b we need to write a way to transform an a into a b and a b into an a such that they invert one another. Let's write that in order to describe our prism via the mythological function
prism :: (s ~ exists r . Either r a) -> Prism s a
uncons = prism (iso fwd bck) where
  fwd [] = Left () -- failure!
  fwd (a:as) = Right (a, as)
  bck (Left ()) = []
  bck (Right (a, as)) = a:as
This demonstrates how to use this equivalence (at least in principle) to create prisms and also suggests that they ought to feel really natural whenever we're working with sum-like types such as lists.
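Since prism and iso above are "mythological", here is a version of the same idea that runs with nothing but the Prelude. The names previewUncons and reviewUncons are made up here (they echo, but are not, the lens library's preview and review). The two directions of the equivalence are ordinary functions, and the prism-like operations are derived from them.

```haskell
-- Forward direction of [a] ~ Either () (a, [a]).
fwd :: [a] -> Either () (a, [a])
fwd []       = Left ()        -- failure: nothing to uncons
fwd (a : as) = Right (a, as)

-- Backward direction.
bck :: Either () (a, [a]) -> [a]
bck (Left ())       = []
bck (Right (a, as)) = a : as

-- Try to focus on the head and tail.
previewUncons :: [a] -> Maybe (a, [a])
previewUncons = either (const Nothing) Just . fwd

-- Rebuild the whole list from the focused part.
reviewUncons :: (a, [a]) -> [a]
reviewUncons = bck . Right
```

The two functions invert one another: bck (fwd xs) gives back xs for any list, and fwd (bck e) gives back e for any e.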
A lens is a "functional reference" that allows you to extract and/or update a generalized "field" in a larger value. For an ordinary, non-partial lens that field is always required to be there, for any value of the containing type. This presents a problem if you want to look at something like a "field" which might not always be there. For example, in the case of "the nth element of a list" (as listed in the Scalaz documentation @ChrisMartin pasted), the list might be too short.
Thus, a "partial lens" generalizes a lens to the case where a field may or may not always be present in a larger value.
There are at least three things in the Haskell lens library that you could think of as "partial lenses", none of which corresponds exactly to the Scala version:
An ordinary Lens whose "field" is a Maybe type.
A Prism, as described by @J.Abrahamson.
A Traversal.
They all have their uses, but the first two are too restricted to include all cases, while Traversals are "too general". Of the three, only Traversals support the "nth element of list" example.
For the "Lens giving a Maybe-wrapped value" version, what breaks is the lens laws: to have a proper lens, you should be able to set it to Nothing to remove the optional field, then set it back to what it was, and then get back the same value. This works fine for a Map say (and Control.Lens.At.at gives such a lens for Map-like containers), but not for a list, where deleting e.g. the 0th element cannot avoid disturbing the later ones.
A Prism is in a sense a generalization of a constructor (approximately case class in Scala) rather than a field. As such the "field" it gives when present should contain all the information to regenerate the whole structure (which you can do with the review function.)
A Traversal can do "nth element of a list" just fine, in fact there are at least two different functions ix and element that both work for this (but generalize slightly differently to other containers).
Thanks to the typeclass magic of lens, any Prism or Lens automatically works as a Traversal, while a Lens giving a Maybe-wrapped optional field can be turned into a Traversal of a plain optional field by composing with traverse.
However, a Traversal is in some sense too general, because it is not restricted to a single field: A Traversal can have any number of "target" fields. E.g.
elements odd
is a Traversal that will happily go through all the odd-indexed elements of a list, updating and/or extracting information from them all.
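To see what "any number of target fields" means without installing lens, here is a hand-rolled sketch in the van Laarhoven style using only base. The names elementsWhere, overT, and firstOf are defined here for illustration; they imitate, but are not, the library's elements, over, and (^?).

```haskell
import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))
import Data.Monoid (First (..))

-- A traversal over the list elements whose index satisfies the predicate.
elementsWhere :: Applicative f => (Int -> Bool) -> (a -> f a) -> [a] -> f [a]
elementsWhere p f = traverse step . zip [0 ..]
  where
    step (i, x)
      | p i       = f x
      | otherwise = pure x

-- Update every target by running the traversal in Identity.
overT :: ((a -> Identity a) -> s -> Identity s) -> (a -> a) -> s -> s
overT t g = runIdentity . t (Identity . g)

-- Extract only the first target by running the traversal in Const (First a).
firstOf :: ((a -> Const (First a) a) -> s -> Const (First a) s) -> s -> Maybe a
firstOf t = getFirst . getConst . t (Const . First . Just)
```

With these, overT (elementsWhere odd) (* 10) [1, 2, 3, 4] gives [1, 20, 3, 40], and firstOf (elementsWhere odd) [1, 2, 3, 4] gives Just 2. A traversal with no targets simply yields Nothing from firstOf, which is the "partial" behaviour.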
In theory, you could define a fourth variant (the "affine traversals" @J.Abrahamson mentions) that I think might correspond more closely to Scala's version, but due to a technical reason outside the lens library itself they would not fit well with the rest of the library - you would have to explicitly convert such a "partial lens" to use some of the Traversal operations with it.
Also, it would not buy you much over ordinary Traversals, since there's e.g. a simple operator (^?) to extract just the first element traversed.
(As far as I can see, the technical reason is that the Pointed typeclass which would be needed to define an "affine traversal" is not a superclass of Applicative, which ordinary Traversals use.)
Scalaz documentation
Below are the scaladocs for Scalaz's LensFamily and PLensFamily, with emphasis added on the diffs.
Lens:
A Lens Family, offering a purely functional means to access and retrieve a field transitioning from type B1 to type B2 in a record simultaneously transitioning from type A1 to type A2. scalaz.Lens is a convenient alias for when A1 =:= A2, and B1 =:= B2.
The term "field" should not be interpreted restrictively to mean a member of a class. For example, a lens family can address membership of a Set.
Partial lens:
Partial Lens Families, offering a purely functional means to access and retrieve an optional field transitioning from type B1 to type B2 in a record that is simultaneously transitioning from type A1 to type A2. scalaz.PLens is a convenient alias for when A1 =:= A2, and B1 =:= B2.
The term "field" should not be interpreted restrictively to mean a member of a class. For example, a partial lens family can address the nth element of a List.
Notation
For those unfamiliar with scalaz, we should point out the symbolic type aliases:
type #>[A, B] = Lens[A, B]
type #?>[A, B] = PLens[A, B]
In infix notation, this means the type of a lens that retrieves a field of type B from a record of type A is expressed as A #> B, and a partial lens as A #?> B.
Argonaut
Argonaut (a JSON library) provides a lot of examples of partial lenses, because the schemaless nature of JSON means that attempting to retrieve something from an arbitrary JSON value always has the possibility of failure. Here are a few examples of lens-constructing functions from Argonaut:
def jArrayPL: Json #?> JsonArray — Retrieves a value only if the JSON value is an array
def jStringPL: Json #?> JsonString — Retrieves a value only if the JSON value is a string
def jsonObjectPL(f: JsonField): JsonObject #?> Json — Retrieves a value only if the JSON object has the field f
def jsonArrayPL(n: Int): JsonArray #?> Json — Retrieves a value only if the JSON array has an element at index n

Sets, Functors and Eq confusion

A discussion came up at work recently about Sets, which in Scala support the zip method and how this can lead to bugs, e.g.
scala> val words = Set("one", "two", "three")
scala> words zip (words map (_.length))
res1: Set[(java.lang.String, Int)] = Set((one,3), (two,5))
I think it's pretty clear that Sets shouldn't support a zip operation, since the elements are not ordered. However, it was suggested that the problem is that Set isn't really a functor, and shouldn't have a map method. Certainly, you can get yourself into trouble by mapping over a set. Switching to Haskell now,
data AlwaysEqual a = Wrap { unWrap :: a }
instance Eq (AlwaysEqual a) where
    _ == _ = True
instance Ord (AlwaysEqual a) where
    compare _ _ = EQ
and now in ghci
ghci> import Data.Set as Set
ghci> let nums = Set.fromList [1, 2, 3]
ghci> Set.map unWrap $ Set.map Wrap $ nums
fromList [3]
ghci> Set.map (unWrap . Wrap) nums
fromList [1, 2, 3]
So Set fails to satisfy the functor law
fmap f . fmap g = fmap (f . g)
It can be argued that this is not a failing of the map operation on Sets, but a failing of the Eq instance that we defined, because it doesn't respect the substitution law, namely that for two instances of Eq on A and B and a mapping f : A -> B then
if x == y (on A) then f x == f y (on B)
which doesn't hold for AlwaysEqual (e.g. consider f = unWrap).
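The substitution law can be spelled out as a checkable predicate for a given function and pair of values. A small sketch follows; respectsSubstitution is a name invented here, and AlwaysEqual is repeated (as a newtype) so the block stands alone.

```haskell
newtype AlwaysEqual a = Wrap { unWrap :: a }

-- Deliberately lawless with respect to substitution: everything is equal.
instance Eq (AlwaysEqual a) where
  _ == _ = True

-- "If x == y then f x == f y", checked for one particular f, x and y.
respectsSubstitution :: (Eq a, Eq b) => (a -> b) -> a -> a -> Bool
respectsSubstitution f x y = not (x == y) || f x == f y
```

Here respectsSubstitution unWrap (Wrap 1) (Wrap 2) is False, because the two wrapped values compare equal while 1 and 2 do not.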
Is the substitution law a sensible law for the Eq type that we should try to respect? Certainly, other equality laws are respected by our AlwaysEqual type (symmetry, transitivity and reflexivity are trivially satisfied) so substitution is the only place that we can get into trouble.
To me, substitution seems like a very desirable property for the Eq class. On the other hand, some comments on a recent Reddit discussion include
"Substitution seems stronger than necessary, and is basically quotienting the type, putting requirements on every function using the type."
-- godofpumpkins
"I also really don't want substitution/congruence since there are many legitimate uses for values which we want to equate but are somehow distinguishable."
-- sclv
"Substitution only holds for structural equality, but nothing insists Eq is structural."
-- edwardkmett
These three are all pretty well known in the Haskell community, so I'd be hesitant to go against them and insist on substitutability for my Eq types!
Another argument against Set being a Functor - it is widely accepted that being a Functor allows you to transform the "elements" of a "collection" while preserving the shape. For example, this quote on the Haskell wiki (note that Traversable is a generalization of Functor)
"Where Foldable gives you the ability to go through the structure processing the elements but throwing away the shape, Traversable allows you to do that whilst preserving the shape and, e.g., putting new values in."
"Traversable is about preserving the structure exactly as-is."
and in Real World Haskell
"...[A] functor must preserve shape. The structure of a collection should not be affected by a functor; only the values that it contains should change."
Clearly, any functor instance for Set has the possibility to change the shape, by reducing the number of elements in the set.
But it seems as though Sets really should be functors (ignoring the Ord requirement for the moment - I see that as an artificial restriction imposed by our desire to work efficiently with sets, not an absolute requirement for any set. For example, sets of functions are a perfectly sensible thing to consider. In any case, Oleg has shown how to write efficient Functor and Monad instances for Set that don't require an Ord constraint). There are just too many nice uses for them (the same is true for the non-existent Monad instance).
Can anyone clear up this mess? Should Set be a Functor? If so, what does one do about the potential for breaking the Functor laws? What should the laws for Eq be, and how do they interact with the laws for Functor and the Set instance in particular?
Another argument against Set being a Functor - it is widely accepted that being a Functor allows you to transform the "elements" of a "collection" while preserving the shape. [...] Clearly, any functor instance for Set has the possibility to change the shape, by reducing the number of elements in the set.
I'm afraid that this is a case of taking the "shape" analogy as a defining condition when it is not. Mathematically speaking, there is such a thing as the power set functor. From Wikipedia:
Power sets: The power set functor P : Set → Set maps each set to its power set and each function f : X → Y to the map which sends U ⊆ X to its image f(U) ⊆ Y.
The function P(f) (fmap f in the power set functor) does not preserve the size of its argument set, yet this is nonetheless a functor.
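As a rough illustration of that point, here is a minimal sketch using Data.Set from the containers package as a stand-in for mathematical sets. The name image is made up for this example, and Set.map needs an Ord constraint that the mathematical power set functor does not.

```haskell
import qualified Data.Set as Set
import Data.Set (Set)

-- | Action of the power set functor on a function f:
--   send a subset U to its image f(U).
--   (Hypothetical name; this is just Set.map.)
image :: Ord b => (a -> b) -> Set a -> Set b
image = Set.map

main :: IO ()
main = do
  let u = Set.fromList [1, 2, 3, 4 :: Int]
      parity n = n `mod` 2
  print (Set.size u)                 -- 4
  print (Set.size (image parity u))  -- 2: the image is smaller than u
  -- The functor laws still hold here, because Int has a well-behaved Eq/Ord.
  print (image id u == u)                                             -- True
  print (image ((+ 1) . parity) u == (image (+ 1) . image parity) u)  -- True
```

The size drops from 4 to 2, yet identity and composition are both preserved for this input, which is all the functor laws ask for.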
If you want an ill-considered intuitive analogy, we could say this: in a structure like a list, each element "cares" about its relationship to the other elements, and would be "offended" if a false functor were to break that relationship. But a set is the limiting case: a structure whose elements are indifferent to each other, so there is very little you can do to "offend" them; the only thing is if a false functor were to map a set that contains that element to a result that doesn't include its "voice."
(Ok, I'll shut up now...)
EDIT: I truncated the following bits when I quoted you at the top of my answer:
For example, this quote on the Haskell wiki (note that Traversable is a generalization of Functor)
"Where Foldable gives you the ability to go through the structure processing the elements but throwing away the shape, Traversable allows you to do that whilst preserving the shape and, e.g., putting new values in."
"Traversable is about preserving the structure exactly as-is."
Here I'd remark that Traversable is a kind of specialized Functor, not a "generalization" of it. One of the key facts about any Traversable (or, actually, about Foldable, which Traversable extends) is that it requires that the elements of any structure have a linear order: you can turn any Traversable into a list of its elements (with Foldable.toList).
Another, less obvious fact about Traversable is that the following functions exist (adapted from Gibbons & Oliveira, "The Essence of the Iterator Pattern"):
import qualified Data.Foldable as Foldable  -- needed for Foldable.toList below
-- | A "shape" is a Traversable structure with "no content,"
-- i.e., () at all locations.
type Shape t = t ()
-- | "Contents" without a shape are lists of elements.
type Contents a = [a]
shape :: Traversable t => t a -> Shape t
shape = fmap (const ())
contents :: Traversable t => t a -> Contents a
contents = Foldable.toList
-- | This function reconstructs any Traversable from its Shape and
-- Contents. Law:
--
-- > reassemble (shape xs) (contents xs) == Just xs
--
-- See Gibbons & Oliveira for implementation. Or do it as an exercise.
-- Hint: use the State monad...
--
reassemble :: Traversable t => Shape t -> Contents a -> Maybe (t a)
A Traversable instance for sets would violate the proposed law, because all non-empty sets would have the same Shape—the set whose Contents is [()]. From this it should be easy to prove that whenever you try to reassemble a set you would only ever get the empty set or a singleton back.
Lesson? Traversable "preserves shape" in a very specific, stronger sense than Functor does.
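For readers who want to experiment, here is one possible way to fill in reassemble. It is only a sketch: it uses mapAccumL from Data.Traversable instead of the State monad mentioned in the hint, and it treats leftover elements as a failure. Both choices are assumptions of this sketch and are not necessarily what Gibbons & Oliveira do. The last line of main shows the problem with sets: erasing the elements of a non-empty set always leaves the same one-element set.

```haskell
import Data.Foldable (toList)
import Data.Traversable (mapAccumL)
import qualified Data.Set as Set

type Shape t = t ()
type Contents a = [a]

shape :: Functor t => t a -> Shape t
shape = fmap (const ())

contents :: Foldable t => t a -> Contents a
contents = toList

-- | Fill each () slot, left to right, with the next element of the list.
--   Returns Nothing if there are too few or too many elements
--   (rejecting leftovers is an assumption of this sketch).
reassemble :: Traversable t => Shape t -> Contents a -> Maybe (t a)
reassemble sh xs =
  case mapAccumL step xs sh of
    ([], filled) -> sequenceA filled  -- Nothing if any slot went unfilled
    _            -> Nothing           -- elements left over
  where
    step (y : ys) () = (ys, Just y)
    step []       () = ([], Nothing)

main :: IO ()
main = do
  let xs = [1, 2, 3 :: Int]
  print (reassemble (shape xs) (contents xs))  -- Just [1,2,3]
  print (reassemble (shape xs) [10, 20 :: Int]) -- Nothing
  -- Set is not a Functor, so the "shape" has to be computed with Set.map:
  print (Set.map (const ()) (Set.fromList "abc")) -- fromList [()]
```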
Set is "just" a functor (not a Functor) from the subcategory of Hask where Eq is "nice" (i.e. the subcategory where congruence, also called substitution, holds: x == y implies f x == f y). If constraint kinds had been around from the start, then perhaps Set would be a Functor of some kind.
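To make the "nice Eq" condition concrete, here is a small sketch of what goes wrong without it. The Parity type and the unwrap function are made up for this example: Parity's Eq and Ord only look at whether the wrapped number is even or odd, and unwrap does not respect that equality.

```haskell
import Data.Function (on)
import qualified Data.Set as Set

-- | Hypothetical type whose Eq/Ord instances only compare parity.
newtype Parity = Parity Int

parityOf :: Parity -> Int
parityOf (Parity n) = n `mod` 2

instance Eq Parity where
  (==) = (==) `on` parityOf

instance Ord Parity where
  compare = compare `on` parityOf

-- | Breaks substitution: Parity 1 == Parity 3, but unwrap yields 1 and 3.
unwrap :: Parity -> Int
unwrap (Parity n) = n

main :: IO ()
main = do
  let s = Set.fromList [1, 2, 3 :: Int]
  -- Composition law candidate: map (f . g) versus map f . map g
  print (Set.size (Set.map (unwrap . Parity) s))          -- 3
  print (Set.size ((Set.map unwrap . Set.map Parity) s))  -- 2
```

The intermediate Set Parity can hold only one odd and one even element, so the two sides of the composition law disagree. Restricting to functions that respect == rules out unwrap and restores the law.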
Well, Set can be treated as a covariant functor, and as a contravariant functor; usually it's a covariant functor. For it to behave properly with respect to equality, you have to make sure that the implementation, whatever it is, actually does.
Regarding Set.zip: it is nonsense, and so is Set.head (which you do have in Scala). Neither should exist.

Kind vs Rank in type theory

I'm having a hard time understanding Higher Kind vs Higher Rank types. Kind is pretty simple (thanks to the Haskell literature for that), and I used to think rank is like kind when talking about types, but apparently not! I read the Wikipedia article to no avail. So can someone please explain what a Rank is, and what is meant by Higher Rank and Higher Rank Polymorphism? How does that relate to Kinds (if at all)? Comparing Scala and Haskell would be awesome too.
The concept of rank is not really related to the concept of kinds.
The rank of a polymorphic type system describes where foralls may appear in types. In a rank-1 type system foralls may only appear at the outermost level; in a rank-2 type system they may also appear one level down, inside the argument type of a function; and so on.
So for example forall a. Show a => (a -> String) -> a -> String would be a rank-1 type and forall a. Show a => (forall b. Show b => b -> String) -> a -> String would be a rank-2 type. The difference between those two types is that in the first case, the first argument to the function can be any function that takes one showable argument and returns a String. So a function of type Int -> String would be a valid first argument (like a hypothetical function intToString), so would a function of type forall a. Show a => a -> String (like show). In the second case only a function of type forall a. Show a => a -> String would be a valid argument, i.e. show would be okay, but intToString wouldn't be. As a consequence the following function would be a legal function of the second type, but not the first (where ++ is supposed to represent string concatenation):
higherRankedFunction(f, x) = f("hello") ++ f(x) ++ f(42)
Note that here the function f is applied to (potentially) three different types of arguments. So if f were the function intToString this would not work.
Both Haskell and Scala are rank-1 by default, so the above function cannot be written in those languages as it stands. GHC, however, provides language extensions for this: Rank2Types to enable rank-2 polymorphism and RankNTypes to enable rank-n polymorphism for arbitrary n (in recent GHC versions Rank2Types is a deprecated synonym for RankNTypes).
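As a sketch of how the example above might look in GHC with the RankNTypes extension turned on (the names rank1 and higherRanked are made up here, and intToString is the hypothetical helper from the explanation):

```haskell
{-# LANGUAGE RankNTypes #-}

-- Rank-1: the caller picks a, so f only has to work at that one type.
rank1 :: forall a. Show a => (a -> String) -> a -> String
rank1 f x = f x

-- Rank-2: f must itself be polymorphic, so the body can use it at
-- String, at a, and at Int.
higherRanked :: forall a. Show a => (forall b. Show b => b -> String) -> a -> String
higherRanked f x = f "hello" ++ f x ++ f (42 :: Int)

-- Hypothetical monomorphic helper, standing in for intToString.
intToString :: Int -> String
intToString = show

main :: IO ()
main = do
  putStrLn (rank1 intToString 7)     -- 7
  putStrLn (rank1 show True)         -- True
  putStrLn (higherRanked show True)  -- "hello"True42
  -- putStrLn (higherRanked intToString 7)
  --   ^ rejected by the type checker: intToString is not polymorphic
```

Uncommenting the last line gives a type error, which is the difference described above: show is accepted as the rank-2 argument, intToString is not.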