Encoding of inferrable records - encoding

As you probably know, records are somewhat special in ocaml, as each label has to be uniquely assigned to a nominal record type, i.e. the following function cannot be typed without context:
let f r = r.x
Proper first class records (i.e. things that behave like tuples with labels) are trivially encoded using objects, e.g.
let f r = r#x
when creating the objects in the right way (i.e. no self-recursion, no mutation), they behave just like records.
I am however, somewhat unhappy with this solution for two reasons:
when making records updatetable (i.e. by adding an explicit "with_l" method for each label l), the type is somewhat too loose (it should be the same as the original record). Admitted, one can enforce this equality, but this is still inconvenient.
I have the suspicion that the OCaml compiler does not infer that these records are actually immutable: In a function
let f r = r#x + r#x
would the compiler be able to run a common subexpression elimination?
For these reasons, I wonder if there is a better encoding:
Is there another (aside from using objects) type-safe encoding (e.g. using polymorphic variants) of records with inferrable type in OCaml?
Can this encoding avoid the problems mentioned above?

If I understand you correctly you're looking for a very special kind of polymorphism. You want to write a function that will work for all types, such that the type is a record with certain fields. This sounds more like a syntactic polymorphism in a C++ style, not as semantic polymorphism in ML style. If we will slightly rephrase the task, by capturing the idea that a field accessing is just a syntactic sugar for a field projection function, then we can say, that you want to write a function that is polymorphic over all types that provide a certain set of operations. This kind of polymorphism can be captured by OCaml using one of the following mechanisms:
functors
first class modules
objects
I think that functors are obvious, so I will show an example with first class modules. We will write a function print_student that will work on any type that satisfies the Student signature:
module type Student = sig
type t
val name : t -> string
val age : t -> int
end
let print_student (type t)
(module S : Student with type t = t) (s : t) =
Printf.printf "%s %d" (S.name s) (S.age s)
The type of print_student function is (module Student with type t = 'a) -> 'a -> unit. So it works for any type that satisfies the Student interface, and thus it is polymorphic. This is a very powerful polymorphism that comes with a price, you need to pass the module structure explicitly when you're invoking the function, so it is a System F style polymorphism. Functors will also require you to specify concrete module structure. So both are not inferrable (i.e., not an implicit Hindley-Milner-like style polymorphism, that you are looking for). For the latter, only objects will work (there are also modular implicits, that relax the explicitness requirement, but they are still not in the trunk, but they will actually answer your requirements).
With object-style row polymorphism it is possible to write a function that is polymorphic over a set of types conforming to some signature, and to infer this signature implicitly from the function definintion. However, such power comes with a price. Since object operations are encoded with methods and methods are just function pointers that are assigned dynamically in the runtime, you shouldn't expect any compile time optimizations. It is not possible to perform any static analysis on something that is bound dynamically. So, of course, no Common Subexpression elimination, nor inlining. For functors and first class modules, the optimization is possible on a newer branch of the compiler with flamba (See 4.03.0+flambda opam switch). But on a regular compiler installation no inlining will be performed.
Different approaches
What concerning other techniques. First of all we can use camlp{4,5}, or ppx or even m4 and cpp to preprocess code, but this would be hardly idiomatic and of doubtful usefulness.
Another way, is instead of writing a function that is polymorphic, we can try to find a suitable monomorphic data type. A direct approach would be to use a list of polymorphic variants, e.g.,
type attributes = [`name of string | `age of int]
type student = attribute list
In fact we even don't need to specify all these types ahead, and our function can require only those fields that are needed, a form of a row polymorphism:
let rec name = function
| [] -> raise Not_found
| `name n -> n
| _ :: student -> name student
The only problem with this encoding, is that you cannot guarantee that the same named attribute can occur once and only once. So it is possible that a student doesn't have a name at all, or, that is worser, it can have more then one names. Depending on your problem domain it can be acceptable.
If it is not, then we can use GADT and extensible variants to encode heterogenous maps, i.e., an associative data structures that map keys to
different type (in a regular (homogenous) map or assoc list value type is unified). How to construct such containers is beyond the scope of the answer, but fortunately there're at least two available implementations. One, that I use personally is called universal map (Univ_map) and is provided by a Core library (Core_kernel in fact). It allows you to specify two kinds of heterogenous maps, with and without a default values. The former corresponds to a record with optional field, the latter has default for each field, so an accessor is a total function. For example,
open Core_kernel.Std
module Dict = Univ_map.With_default
let name = Dict.Key.create ~name:"name" ~default:"Joe" sexp_of_string
let age = Dict.Key.create ~name:"age" ~default:18 sexp_of_int
let print student =
printf "%s %d"
(Dict.get student name) (Dict.get age name)
You can hide that you're using universal map using abstract type, as there is only one Dict.t that can be used across different abstractions, that may break modularity. Another example of heterogeneous map implementation is from Daniel Bunzli. It doesn't provide With_default kind of map, but has much less dependencies.
P.S. Of course for such a redundant case, where this only one operation it is much easier to just pass this operation explicitly as function, instead of packing it into a structure, so we can write function f from your example as simple as let f x r = x r + x r. But this would be the same kind of polymoprism as with first class modules/functors, just simplified. And I assume, that your example was specifically reduced to one field, and in your real use case you have more complex set of fields.

Very roughly speaking, an OCaml object is a hash table whose keys are its method name hash. (The hash of a method name can be obtained by Btype.hash_variant of OCaml compiler implementation.)
Just like objects, you can encode polymorphic records using (int, Obj.t) Hashtbl.t. For example, a function to get a value of a field l can be written as follows:
(** [get r "x"] is poly-record version of [r.x] *)
let get r k = Hashtbl.find t (Btype.hash_variant k))
Since it is easy to access the internals unlike objects, the encoding of {r with l = e} is trivial:
(** [copy_with r [(k1,v1);..;(kn,vn)]] is poly-record version of
[{r with k1 = v1; ..; kn = vn}] *)
let copy_with r fields =
let r = Hashtbl.copy r in
List.iter (fun (k,v) -> Hashtbl.replace r (Btype.hash_variant k) v) fields
and the creation of poly-records:
(** [create [(k1,v1);..(kn,vn)]] is poly-record version of [{k1=v1;..;kn=vn}] *)
let create fields = copy_with fields (Hashtbl.create (List.length fields))
Since all the types of the fields are squashed into one Obj.t, you have to use Obj.magic to store various types into this implementation and therefore this is not type-safe by itself. However, we can make it type-safe wrapping (int, Obj.t) Hashtbl.t with phantom type whose parameter denotes the fields and their types of a poly-record. For example,
<x : int; y : float> Poly_record.t
is a poly-record whose fields are x : int and y : float.
Details of this phantom type wrapping for the type safety is too long to explain here. Please see my implementation https://bitbucket.org/camlspotter/ppx_poly_record/src . To tell short, it uses PPX preprocessor to generate code for type-safety and to provide easier syntax sugar.
Compared with the encoding by objects, this approach has the following properties:
The same type safety and the same field access efficiency as objects
It can enjoy structural subtyping like objects, what you want for poly-records.
{r with l = e} is possible
Streamable outside of a program safely, since hash tables themselves have no closure in it. Objects are always "contaminated" with closures therefore they are not safely streamable.
Unfortunately it lacks efficient pattern matching, which is available for mono-records. (And this is why I do not use my implementation :-( ) I feel for it PPX reprocessing is not enough and some compiler modification is required. It will not be really hard though since we can make use of typing of objects.
Ah and of course, this encoding is very side effective therefore no CSE optimization can be expected.

Is there another (aside from using objects) type-safe encoding (e.g. using polymorphic variants) of records with inferrable type in OCaml?
For immutable records, yes. There is a standard theoretical duality between polymorphic records ("inferrable" records as you describe) and polymorphic variants. In short, a record { l_1 = v_1; l_2 = v_2; ...; l_n = v_n } can be implemented by
function `l_1 k -> k v_1 | `l_2 k -> k v_2 | ... | `l_n k -> k v_n
and then the projection r.l_i becomes r (`l_i (fun v -> v)). For instance, the function fun r -> r.x is encoded as fun r -> r (`x (fun v -> v)). See also the following example session:
# let myRecord = (function `field1 k -> k 123 | `field2 k -> k "hello") ;;
(* encodes { field1 = 123; field2 = "hello" } *)
val myRecord : [< `field1 of int -> 'a | `field2 of string -> 'a ] -> 'a = <fun>
# let getField1 r = r (`field1 (fun v -> v)) ;;
(* fun r -> r.field1 *)
val getField1 : ([> `field1 of 'a -> 'a ] -> 'b) -> 'b = <fun>
# getField1 myRecord ;;
- : int = 123
# let getField2 r = r (`field2 (fun v -> v)) ;;
(* fun r -> r.field2 *)
val getField2 : ([> `field2 of 'a -> 'a ] -> 'b) -> 'b = <fun>
# getField2 myRecord ;;
- : string = "hello"
For mutable records, we can add setters like:
let ref1 = ref 123
let ref2 = ref "hello"
let myRecord =
function
| `field1 k -> k !ref1
| `field2 k -> k !ref2
| `set_field1(v1, k) -> k (ref1 := v1)
| `set_field2(v2, k) -> k (ref2 := v2)
and use them like myRecord (`set_field1(456, fun v -> v)) and myRecord (`set_field2("world", fun v -> v)) for example. However, localizing ref1 and ref2 like
let myRecord =
let ref1 = ref 123 in
let ref2 = ref "hello" in
function
| `field1 k -> k !ref1
| `field2 k -> k !ref2
| `set_field1(v1, k) -> k (ref1 := v1)
| `set_field2(v2, k) -> k (ref2 := v2)
causes a value restriction problem and requires a little more polymorphic typing trick (which I omit here).
Can this encoding avoid the problems mentioned above?
The "common subexpression elimination" for (the encoding of) r.x + r.x can be done only if OCaml knows the definition of r and inlines it. (Sorry my previous answer was inaccurate here.)

Related

Is it an example of a Monad? If not, why?

I am trying to study functional programming applied to Swift (a multi-paradigm language). One of the exercises I proposed to myself was trying to do a declarative Poker Hand evaluator.
Here are some code excerpts and my question at the end:
typealias Rule = ([Card]) -> Result
Where Result is a type that holds the current evaluation state (cards already evaluated on a rank, remaining cards, if the last rule evaluation was successful or not and the evaluated ranks). The input is an array (could be a set of) cards to be evaluated.
I also created this function:
func id(_ hand:[Card]) -> Result ...
That creates a mininum Result from a card set.
Result also have a set of functions to chain rule evaluations (simplified here):
func apply(_ rule:Rule) -> Result
func andThen(_ rule:Rule) -> Result
func andAlso(_ rule:Rule) -> Result
func otherwise(_ rule:Rule) -> Result
func continueWith(_ rule:Rule) -> Result
Whats allowed me to declare the poker rank rules as:
let fullHouse = { (hand) in
threeOfAKind(hand).andThen(pair)
}
or
let royalStraightFlush = { (hand) in
straightFlush(hand).andAlso(straightAceHigh)
}
and chaining all rank rules as:
let evaluate = { (hand) in
//id(hand)
royalStraightFlush(hand)
.otherwise(straightFlush)
.otherwise(fourOfAKind)
.otherwise(fullHouse)
.otherwise(flush)
.otherwise(straight)
.otherwise(threeOfAKind)
.otherwise(twoPair)
.otherwise(pair)
.continueWith(highCard) }
Result, as it is:
Encompasses a type ([Card]) into a broader context (R a)
Has an id (result) function that put an object of type a into a minimum "result" context (a -> R a)
But...
It has not just one generic >>= bind function, but several specific ones, that takes R a -> (a -> R a) -> R a that could chain rules and get the card set to be evaluated from the previous partial result state.
It is not (as it is implemented) generic enough to handle other types instead of Card or [Card]. In the other hand I think the same chaining logic could be used on other rule systems with some changes...
My question is: is Result a monad? Otherwise, why it is not? My two concerns are those presented above.
I think understanding these points (or knowing that are some other blind spots) on this concrete example will clarify a little bit the monad concept to me.
Thanks!
Monads have to obey the three Monad Laws:
Left identity: return a >>= f ≡ f a
Right identity: m >>= return ≡ m
Associativity: (m >>= f) >>= g ≡ m >>= (\x -> f x >>= g)
This also implies the implementation of return (or equivalent) and >>= (variously called bind, flatmap, etc.) for your monad.. Haskell (and other languages) also adds the map function (>>) even though it can be implemented in terms of >>=.
In addition, as #chepner points out, monads are abstractions that are not concrete types in themselves. They are generic and can only be instantiated via a type constructor. List is a monad but you cannot instantiate a List. You can, however, instantiate a List[Int].
If you want to get a good conceptual picture of what makes a monad a monad, take a look at Brian Beckman's video Don't Fear the Monad. Also, there is a good series of blog posts, Monads are Elephants.

Method types incompatiblity

I tried recently play with a streaming liquidsoap-like stuff... There's some code which uses OCaml classes and C libraries for encoding like lame (via ocaml-lame) etc.
(* Lame module *)
type encoder
(* ... *)
external encode_buffer_float_part : encoder -> float array -> float array -> int -> int -> string = "ocaml_lame_encode_buffer_float"
(* Otherencoder module *)
type encoder
(* ... *)
external encode_buffer_float_part : encoder -> float array -> float array -> int -> int -> string = "ocaml_otherencoder_encode_buffer_float"
(=same interface)
Somewhere there's a two high-level classes that inherit from two separated encoderbase virtual classes:
(* Mp3_output module *)
class virtual encoderbase =
object (self)
method encode ncoder channels buf offset size =
if channels = 1 then
Lame.encode_buffer_float_part ncoder buf.(0) buf.(0) offset size
else
Lame.encode_buffer_float_part ncoder buf.(0) buf.(1) offset size
end
(* somewhere in the code *)
class to_shout sprop =
(* some let-s *)
object (self)
inherit
[Lame.encoder] Icecast2.output ~format:Format_mp3 (* more params *) as super
inherit base
(* ... *)
end
and
(* Other_output module *)
class virtual encoderbase =
object (self)
method encode ncoder channels buf offset size =
if channels = 1 then
Otherencoder.encode_buffer_float_part ncoder buf.(0) buf.(0) offset size
else
Otherencoder.encode_buffer_float_part ncoder buf.(0) buf.(1) offset size
end
(* somewhere in the code *)
class to_shout sprop =
(* some let-s *)
object (self)
inherit
[Otherencoder.encoder] Icecast2.output ~format:Format_other (* more params *) as super
inherit base
(* ... *)
end
All things work fine with:
let icecast_out source format =
let sprop =
new Mp3_output.shout_sprop
in
(* some code here *)
new Mp3_output.to_shout sprop
but when I try something like this:
let icecast_out source format =
let sprop =
if format = Format_other then
new Other_output.shout_sprop
else
new Mp3_output.shout_sprop
in
(* some code here *)
if format = Format_mp3 then
new Mp3_output.to_shout sprop
else
new Other_output.to_shout sprop
compilation breaks with an error # new Other_output.to_shout sprop:
Error: This expression has type Other_output.to_shout
but an expression was expected of type Mp3_output.to_shout
Types for method encode are incompatible
Is there any way to "convince" OCaml (common ancestor? wrapping class? type casting?) to compile with that two different classes/bindings at once?
Update (2015.12.15):
Code sample: https://gist.github.com/soutys/22b67a5df9ae0a6f1f72
Is there any way to "convince" OCaml (common ancestor? wrapping class? type casting?) to compile with that two different classes/bindings at once?
It is believed that OCaml is a type safe language, so it is not possible to convince OCaml to compile a program that will crash, unless you're using some sinister methods.
The root of your misunderstanding is illustrated by the following code snippet from your example:
type 'a term = Save of 'a
let enc_t =
if format_num = 1
then Save Lame.encoder
else Save Other.encoder
An expression Save Lame.encoder has a type Lame.encoder term, while an expression Save Other.encoder has a type Other.encoder term. From a type system perspective this are two completly different types, albeit they were built by the same type constructor term. C.f., int list and float list are different types, and you can't assign to the same variable values of this two different types. That is not a property of OCaml per se, this is a property of any parametric polymorphism, e.g., std::vector<int> and std::vector<float> are different types with different representation and their values can't be used interchangeably, although strictly speaking templates in c++ are not true parametric polymorphism, they are just macro.
But back to OCaml. The idea behind the polymorphic function is that a parameter which has a polymorphic data type has the same representation, and function can be applied to any instance of this type without any runtime check, since all type information is removed. For example,
let rec length = function
| [] -> 0
| _ :: xs -> 1 + length xs
is polymorphic because it can be applied to any instance of polymorphic data type 'a list, including int list, float list, person list, etc.
On the other hand, a function
let rec sum = function
| [] -> 0
| x :: xs -> x + sum xs
is not polymorphic and can be applied only to values of type int list. The reason is because the implementation of this function relies on a fact, that every element is an integer. If you will be able to convince the type system, to apply this function to float list, you will get a segmentation fault.
But you may say, that we miss something, as a sum function for float list looks basically the same:
let rec fsum = function
| [] -> 0.
| x :: xs -> x +. fsum xs
So there is an opportunity to abstract a summation. When we abstract something we find the things that differ between different implementations and abstract them out. The simplest abstraction primitive in OCaml is a function, so lets do it:
let rec gsum zero plus xs =
let (+) = plus in
let rec sum = function
| [] -> zero
| x :: xs -> x + sum xs in
sum xs
We abstracted out the zero element and a plus function. So that we get an abstraction of summation, that works for any type for which you can provide a plus operation and element neutral to this opperation (called a Ring data stucture in abstract algebra). The type of the gsum is
'a -> ('b -> 'a -> 'a) -> 'b list -> 'a
It is even too generic, and we can specialize it a little bit, as this type
'a -> ('a -> 'a -> 'a) -> 'a list -> 'a
suits better. Instead of passing elements of the ring structure one by one, we can aggregate it into a record type:
type 'a ring = {
zero : 'a;
plus : 'a -> 'a -> 'a
}
and implement our generic sum as follows:
let rec gsum ring xs =
let (+) = ring.plus in
let rec sum = function
| [] -> ring.zero
| x :: xs -> x + sum xs in
sum xs
In that case we will have a nice type sum : 'a ring -> 'a list -> 'a. At some point of time, you will find yourself extending this record with new fields, and implementing more and more functions, that accepts this ring structure as a first parameter. And this would be a good time to use a more heavy abstraction, called functor. The functor is actually a record on steroids, that is implicitly passed to each function of the functor implementation. Other then functions, records of functions and functors there're other abstraction techniques: first-class modules, objects, classes and, soon, implicits (aka type classes). It is a part of the programming expertise to learn how to choose an abstraction technique that suits better in each particular case. The general advice would be to use least heavy one if possible. And indeed in 95% using functions or records of functions is enough.
Now let's go back to your example. Here you're hitting in the same hole,
you're confusing polymorpism with abstraction:
let fmt =
if format_num = 1
then new Lamefmt.sout sp
else new Otherfmt.sout sp in
Lamefmt.sout and Otherfmt.sout are different types, the first one has type:
type sout = <
doenc : Lame.encoder -> float array array -> int -> int -> string;
encode : Lame.encoder -> float array array -> int -> int -> string
>
and the second:
type sout = <
doenc : Other.encoder -> float array array -> int -> int -> string;
encode : Other.encoder -> float array array -> int -> int -> string
>
This are two different objects, although they have resembling scheme, and that means that we have some abstraction opportunity here.
In your case we can start with a simple observation, that both encoder functions has the same type modulo the encoder object itself. Using Occam's razor principle, we're trying to catch this with the simplest abstraction possible -- a function:
type encoder = buffer -> buffer -> int -> int -> string
where
type buffer = float array array
Then we can construct different encoders:
let lame : encoder =
let encoder = Lame.create_encoder () in
Lame.encode_buffer_float_part encoder
let other : encoder =
let encoder = Other.create_encoder () in
Other.encode_buffer_float_part encoder
And then you can use this two values interchangaebly. Sometimes, different encoders will require different parameters, in that case our task is to dispatch them as soon as possible, e.g.,
let very_customizable_encoder x y z : encoder =
let encoder = VCE.create_encoder x y z in
Other.encode_buffer_float_part encoder
In that case, you should resolve the customization issue as close to user as possible, and later work with the abstraction.
It is quite often to use hashtables or other associative data structures to store encoders. Such approach will even allow you to represent a plugin architecture, where plugins are loaded dynamically and register themselves (the value of type encoder) in some table.
Conclusion. It is enough to use a function to represent your problem. Maybe at some point of time you will need to use records of functions. So far I don't see the need for using classes. Usually, they are required when you're interested in open recursion, i.e., when your problem is represented by a set of mutual recursive functions, and you want to leave some of the functions unspecified, i.e., to parametrize the implementation.
The first thing is to simplify your example. e.g. removing irrelevant bits and adding dummy stubs so we don't need real encoders:
module Lame = struct
type encoder
let encode_buffer_float_part : encoder -> float -> unit = fun _ -> failwith "ocaml_lame_encode_buffer_float"
end
module Otherencoder = struct
type encoder
let encode_buffer_float_part : encoder -> float -> unit = fun _ -> failwith "ocaml_otherencoder_encode_buffer_float"
end
module Mp3_output = struct
class to_shout = object
method encode ncoder x =
Lame.encode_buffer_float_part ncoder x
end
end
module Other_output = struct
class to_shout = object
method encode ncoder x =
Otherencoder.encode_buffer_float_part ncoder x
end
end
type format = Format_other | Format_mp3
let icecast_out source format =
if format = Format_mp3 then new Mp3_output.to_shout
else new Other_output.to_shout
gives
Error: This expression has type Other_output.to_shout
but an expression was expected of type Mp3_output.to_shout
Types for method encode are incompatible
Which is correct. If I give you "a value returned by icecast_out" then you wouldn't know how to call it, because you wouldn't know what the first argument should be.
Since the encoder is different for each class, it should probably be a constructor argument. e.g.
module Mp3_output = struct
class to_shout ncoder = object
method encode x =
Lame.encode_buffer_float_part ncoder x
end
end
module Other_output = struct
class to_shout ncoder = object
method encode x =
Otherencoder.encode_buffer_float_part ncoder x
end
end
type format = Format_other | Format_mp3
let icecast_out source format =
if format = Format_mp3 then new Mp3_output.to_shout Lame.ncoder
else new Other_output.to_shout Otherencoder.ncoder

Gentle Intro to Haskell: " .... there is no single type that contains both 2 and 'b'." Can I not make such a type ?

I am currently learning Haskell, so here are a beginner's questions:
What is meant by single type in the text below ?
Is single type a special Haskell term ? Does it mean atomic type here ?
Or does it mean that I can never make a list in Haskell in which I can put both 1 and 'c' ?
I was thinking that a type is a set of values.
So I cannot define a type that contains Chars and Ints ?
What about algebraic data types ?
Something like: data IntOrChar = In Int | Ch Char ? (I guess that should work but I am confused what the author meant by that sentence.)
Btw, is that the only way to make a list in Haskell in which I can put both Ints and Chars? Or is there a more tricky way ?
A Scala analogy: in Scala it would be possible to write implicit conversions to a type that represents both Ints and Chars (like IntOrChar) and then it would be possible to put seemlessly Ints and Chars into List[IntOrChar], is that not possible with Haskell ? Do I always have to explicitly wrap every Int or Char into IntOrChar if I want to put them into a list of IntOrChar ?
From Gentle Intro to Haskell:
Haskell also incorporates polymorphic types---types that are
universally quantified in some way over all types. Polymorphic type
expressions essentially describe families of types. For example,
(forall a)[a] is the family of types consisting of, for every type a,
the type of lists of a. Lists of integers (e.g. [1,2,3]), lists of
characters (['a','b','c']), even lists of lists of integers, etc., are
all members of this family. (Note, however, that [2,'b'] is not a
valid example, since there is no single type that contains both 2 and
'b'.)
Short answer.
In Haskell there are no implicit conversions. Also there are no union types - only disjoint unions(which are algebraic data types). So you can only write:
someList :: [IntOrChar]
someList = [In 1, Ch 'c']
Longer and certainly not gentle answer.
Note: This is a technique that's very rarely used. If you need it you're probably overcomplicating your API.
There are however existential types.
{-# LANGUAGE ExistentialQuantification, RankNTypes #-}
class IntOrChar a where
intOrChar :: a -> Either Int Char
instance IntOrChar Int where
intOrChar = Left
instance IntOrChar Char where
intOrChar = Right
data List = Nil
| forall a. (IntOrChar a) => Cons a List
someList :: List
someList = (1 :: Int) `Cons` ('c' `Cons` Nil)
Here I have created a typeclass IntOrChar with only function intOrChar. This way you can convert anything of type forall a. (IntOrChar a) => a to Either Int Char.
And also a special kind of list that uses existential type in its second constructor.
Here type variable a is bound(with forall) at the constructor scope. Therefore every time
you use Cons you can pass anything of type forall a. (IntOrChar a) => a as a first argument. Consequently during a destruction(i.e. pattern matching) the first argument will
still be forall a. (IntOrChar a) => a. The only thing you can do with it is either pass it on or call intOrChar on it and convert it to Either Int Char.
withHead :: (forall a. (IntOrChar a) => a -> b) -> List -> Maybe b
withHead f Nil = Nothing
withHead f (Cons x _) = Just (f x)
intOrCharToString :: (IntOrChar a) => a -> String
intOrCharToString x =
case intOrChar of
Left i -> show i
Right c -> show c
someListHeadString :: Maybe String
someListHeadString = withHead intOrCharToString someList
Again note that you cannot write
{- Wont compile
safeHead :: IntOrChar a => List -> Maybe a
safeHead Nil = Nothing
safeHead (Cons x _) = Just x
-}
-- This will
safeHead2 :: List -> Maybe (Either Int Char)
safeHead2 Nil = Nothing
safeHead2 (Cons x _) = Just (intOrChar x)
safeHead will not work because you want a type of IntOrChar a => Maybe a with a bound at safeHead scope and Just x will have a type of IntOrChar a1 => Maybe a1 with a1 bound at Cons scope.
In Scala there are types that include both Int and Char such as AnyVal and Any, which are both supertypes of Char and Int. In Haskell there is no such hierarchy, and all the basic types are disjoint.
You can create your own union types which describe the concept of 'either an Int or a Char (or you could use the built-in Either type), but there are no implicit conversions in Haskell to transparently convert an Int into an IntOrChar.
You could emulate the concept of 'Any' using existential types:
data AnyBox = forall a. (Show a, Hashable a) => AB a
heteroList :: [AnyBox]
heteroList = [AB (1::Int), AB 'b']
showWithHash :: AnyBox -> String
showWithHash (AB v) = show v ++ " - " ++ (show . hash) v
let strs = map showWithHash heteroList
Be aware that this pattern is discouraged however.
I think that the distinction that is being made here is that your algebraic data type IntOrChar is a "tagged union" - that is, when you have a value of type IntOrChar you will know if it is an Int or a Char.
By comparison consider this anonymous union definition (in C):
typedef union { char c; int i; } intorchar;
If you are given a value of type intorchar you don't know (apriori) which selector is valid. That's why most of the time the union constructor is used in conjunction with a struct to form a tagged-union construction:
typedef struct {
int tag;
union { char c; int i; } intorchar_u
} IntOrChar;
Here the tag field encodes which selector of the union is valid.
The other major use of the union constructor is to overlay two structures to get an efficient mapping between sub-structures. For example, this union is one way to efficiently access the individual bytes of a int (assuming 8-bit chars and 32-bit ints):
union { char b[4]; int i }
Now, to illustrate the main difference between "tagged unions" and "anonymous unions" consider how you go about defining a function on these types.
To define a function on an IntOrChar value (the tagged union) I claim you need to supply two functions - one which takes an Int (in the case that the value is an Int) and one which takes a Char (in case the value is a Char). Since the value is tagged with its type, it knows which of the two functions it should use.
If we let F(a,b) denote the set of functions from type a to type b, we have:
F(IntOrChar,b) = F(Int,b) \times F(Char,b)
where \times denotes the cross product.
As for the anonymous union intorchar, since a value doesn't encode anything bout its type the only functions which can be applied are those which are valid for both Int and Char values, i.e.:
F(intorchar,b) = F(Int,b) \cap F(Char,b)
where \cap denotes intersection.
In Haskell there is only one function (to my knowledge) which can be applied to both integers and chars, namely the identity function. So there's not much you could do with a list like [2, 'b'] in Haskell. In other languages this intersection may not be empty, and then constructions like this make more sense.
To summarize, you can have integers and characters in the same list if you create a tagged-union, and in that case you have to tag each of the values which will make you list look like:
[ I 2, C 'b', ... ]
If you don't tag your values then you are creating something akin to an anonymous union, but since there aren't any (useful) functions which can be applied to both integers and chars there's not really anything you can do with that kind of union.

Convert datatype to string (SML)

I have a function that returns a (char * int) list list, like [[(#"D", 3)], [(#"F", 7)]], and now I'm wondering if it's posssible to convert this to a string, so that I can use I/O and read it to another file?
First of all, I assume you meant a value like [[(#"D", 3)], [(#"F", 7)]] (note the extra parens) since SML requires parentheses around tuple construction. OCaml uses a slightly different syntax, and allows just commas, like a, b, to construct tuples. I mention this because what follows is totally specific to Standard ML, and doesn't apply to OCaml, because I believe that in OCaml your best bet is an entirely different approach, which I don't know much about (macros, i.e. ocamlp4/5). So I assume that was just a typo and that you're interested in Standard ML.
Now, unfortunately there is no general toString function in Standard ML. Something like that would have to have some kind special support in the language and implementation, since it's not possible to write a function with the type 'a -> string. You basically have to write your own toString : t -> string for each type t.
As you can imagine, this gets tedious fast. I've spent a little time researching the options (for this and other boilerplate functions like compare : 't * 't -> order) and there is one very interesting technique outlined in the paper "Generics for the working ML'er" (http://dl.acm.org/citation.cfm?id=1292547) but it's pretty advanced and I could never actually get the code to compile (that said the paper is very interesting) The full generics library described in that paper is in the MLton lib repo (https://github.com/MLton/mltonlib/tree/master/com/ssh/generic/unstable). Maybe you'll have better luck?
Here's a slightly lighter weight approach that is less powerful but easier to understand, IMHO. I wrote this after reading that paper and struggling to get it to work. The idea is to write building blocks for toString functions (called show in this case) and compose them with other functions for your own types.
structure Show =
struct
(* Show.t is the type of toString functions *)
type 'a t = 'a -> string
val int: int t = Int.toString
val char: char t = Char.toString
val list: 'a t -> 'a list t =
fn show => fn xs => "[" ^ concat (ExtList.interleave (map show xs) ",") ^ "]"
val pair: 'a t * 'b t -> ('a * 'b) t =
fn (showa,showb) => fn (a,b) => "(" ^ showa a ^ "," ^ showb b ^ ")"
(* ... *)
end
Since your type doesn't actually have any user defined datatypes, it's very easy to write the toString function using this structure:
local
open Show
in
val show : (char * int) list list -> string = list (list (pair (char, int)))
end
- show [[(#"D", 3)], [(#"F", 7)]] ;
val it = "[[(D,3)],[(F,7)]]" : string
What I like about this is that the composed functions read like the type turned inside out. It's a quite an elegant style, which I cannot take credit for as I took it from the generics paper linked above.
The rest of the code for Show (and a related module Eq for equality comparison) is here: https://github.com/spacemanaki/lib.sml

What does "coalgebra" mean in the context of programming?

I have heard the term "coalgebras" several times in functional programming and PLT circles, especially when the discussion is about objects, comonads, lenses, and such. Googling this term gives pages that give mathematical description of these structures which is pretty much incomprehensible to me. Can anyone please explain what coalgebras mean in the context of programming, what is their significance, and how they relate to objects and comonads?
Algebras
I think the place to start would be to understand the idea of an algebra. This is just a generalization of algebraic structures like groups, rings, monoids and so on. Most of the time, these things are introduced in terms of sets, but since we're among friends, I'll talk about Haskell types instead. (I can't resist using some Greek letters though—they make everything look cooler!)
An algebra, then, is just a type τ with some functions and identities. These functions take differing numbers of arguments of type τ and produce a τ: uncurried, they all look like (τ, τ,…, τ) → τ. They can also have "identities"—elements of τ that have special behavior with some of the functions.
The simplest example of this is the monoid. A monoid is any type τ with a function mappend ∷ (τ, τ) → τ and an identity mzero ∷ τ. Other examples include things like groups (which are just like monoids except with an extra invert ∷ τ → τ function), rings, lattices and so on.
All the functions operate on τ but can have different arities. We can write these out as τⁿ → τ, where τⁿ maps to a tuple of n τ. This way, it makes sense to think of identities as τ⁰ → τ where τ⁰ is just the empty tuple (). So we can actually simplify the idea of an algebra now: it's just some type with some number of functions on it.
An algebra is just a common pattern in mathematics that's been "factored out", just like we do with code. People noticed that a whole bunch of interesting things—the aforementioned monoids, groups, lattices and so on—all follow a similar pattern, so they abstracted it out. The advantage of doing this is the same as in programming: it creates reusable proofs and makes certain kinds of reasoning easier.
F-Algebras
However, we're not quite done with factoring. So far, we have a bunch of functions τⁿ → τ. We can actually do a neat trick to combine them all into one function. In particular, let's look at monoids: we have mappend ∷ (τ, τ) → τ and mempty ∷ () → τ. We can turn these into a single function using a sum type—Either. It would look like this:
op ∷ Monoid τ ⇒ Either (τ, τ) () → τ
op (Left (a, b)) = mappend (a, b)
op (Right ()) = mempty
We can actually use this transformation repeatedly to combine all the τⁿ → τ functions into a single one, for any algebra. (In fact, we can do this for any number of functions a → τ, b → τ and so on for any a, b,….)
This lets us talk about algebras as a type τ with a single function from some mess of Eithers to a single τ. For monoids, this mess is: Either (τ, τ) (); for groups (which have an extra τ → τ operation), it's: Either (Either (τ, τ) τ) (). It's a different type for every different structure. So what do all these types have in common? The most obvious thing is that they are all just sums of products—algebraic data types. For example, for monoids, we could create a monoid argument type that works for any monoid τ:
data MonoidArgument τ = Mappend τ τ -- here τ τ is the same as (τ, τ)
| Mempty -- here we can just leave the () out
We can do the same thing for groups and rings and lattices and all the other possible structures.
What else is special about all these types? Well, they're all Functors! E.g.:
instance Functor MonoidArgument where
fmap f (Mappend τ τ) = Mappend (f τ) (f τ)
fmap f Mempty = Mempty
So we can generalize our idea of an algebra even more. It's just some type τ with a function f τ → τ for some functor f. In fact, we could write this out as a typeclass:
class Functor f ⇒ Algebra f τ where
op ∷ f τ → τ
This is often called an "F-algebra" because it's determined by the functor F. If we could partially apply typeclasses, we could define something like class Monoid = Algebra MonoidArgument.
Coalgebras
Now, hopefully you have a good grasp of what an algebra is and how it's just a generalization of normal algebraic structures. So what is an F-coalgebra? Well, the co implies that it's the "dual" of an algebra—that is, we take an algebra and flip some arrows. I only see one arrow in the above definition, so I'll just flip that:
class Functor f ⇒ CoAlgebra f τ where
coop ∷ τ → f τ
And that's all it is! Now, this conclusion may seem a little flippant (heh). It tells you what a coalgebra is, but does not really give any insight on how it's useful or why we care. I'll get to that in a bit, once I find or come up with a good example or two :P.
Classes and Objects
After reading around a bit, I think I have a good idea of how to use coalgebras to represent classes and objects. We have a type C that contains all the possible internal states of objects in the class; the class itself is a coalgebra over C which specifies the methods and properties of the objects.
As shown in the algebra example, if we have a bunch of functions like a → τ and b → τ for any a, b,…, we can combine them all into a single function using Either, a sum type. The dual "notion" would be combining a bunch of functions of type τ → a, τ → b and so on. We can do this using the dual of a sum type—a product type. So given the two functions above (called f and g), we can create a single one like so:
both ∷ τ → (a, b)
both x = (f x, g x)
The type (a, a) is a functor in the straightforward way, so it certainly fits with our notion of an F-coalgebra. This particular trick lets us package up a bunch of different functions—or, for OOP, methods—into a single function of type τ → f τ.
The elements of our type C represent the internal state of the object. If the object has some readable properties, they have to be able to depend on the state. The most obvious way to do this is to make them a function of C. So if we want a length property (e.g. object.length), we would have a function C → Int.
We want methods that can take an argument and modify state. To do this, we need to take all the arguments and produce a new C. Let's imagine a setPosition method which takes an x and a y coordinate: object.setPosition(1, 2). It would look like this: C → ((Int, Int) → C).
The important pattern here is that the "methods" and "properties" of the object take the object itself as their first argument. This is just like the self parameter in Python and like the implicit this of many other languages. A coalgebra essentially just encapsulates the behavior of taking a self parameter: that's what the first C in C → F C is.
So let's put it all together. Let's imagine a class with a position property, a name property and setPosition function:
class C
private
x, y : Int
_name : String
public
name : String
position : (Int, Int)
setPosition : (Int, Int) → C
We need two parts to represent this class. First, we need to represent the internal state of the object; in this case it just holds two Ints and a String. (This is our type C.) Then we need to come up with the coalgebra representing the class.
data C = Obj { x, y ∷ Int
, _name ∷ String }
We have two properties to write. They're pretty trivial:
position ∷ C → (Int, Int)
position self = (x self, y self)
name ∷ C → String
name self = _name self
Now we just need to be able to update the position:
setPosition ∷ C → (Int, Int) → C
setPosition self (newX, newY) = self { x = newX, y = newY }
This is just like a Python class with its explicit self variables. Now that we have a bunch of self → functions, we need to combine them into a single function for the coalgebra. We can do this with a simple tuple:
coop ∷ C → ((Int, Int), String, (Int, Int) → C)
coop self = (position self, name self, setPosition self)
The type ((Int, Int), String, (Int, Int) → c)—for any c—is a functor, so coop does have the form we want: Functor f ⇒ C → f C.
Given this, C along with coop form a coalgebra which specifies the class I gave above. You can see how we can use this same technique to specify any number of methods and properties for our objects to have.
This lets us use coalgebraic reasoning to deal with classes. For example, we can bring in the notion of an "F-coalgebra homomorphism" to represent transformations between classes. This is a scary sounding term that just means a transformation between coalgebras that preserves structure. This makes it much easier to think about mapping classes onto other classes.
In short, an F-coalgebra represents a class by having a bunch of properties and methods that all depend on a self parameter containing each object's internal state.
Other Categories
So far, we've talked about algebras and coalgebras as Haskell types. An algebra is just a type τ with a function f τ → τ and a coalgebra is just a type τ with a function τ → f τ.
However, nothing really ties these ideas to Haskell per se. In fact, they're usually introduced in terms of sets and mathematical functions rather than types and Haskell functions. Indeed,we can generalize these concepts to any categories!
We can define an F-algebra for some category C. First, we need a functor F : C → C—that is, an endofunctor. (All Haskell Functors are actually endofunctors from Hask → Hask.) Then, an algebra is just an object A from C with a morphism F A → A. A coalgebra is the same except with A → F A.
What do we gain by considering other categories? Well, we can use the same ideas in different contexts. Like monads. In Haskell, a monad is some type M ∷ ★ → ★ with three operations:
map ∷ (α → β) → (M α → M β)
return ∷ α → M α
join ∷ M (M α) → M α
The map function is just a proof of the fact that M is a Functor. So we can say that a monad is just a functor with two operations: return and join.
Functors form a category themselves, with morphisms between them being so-called "natural transformations". A natural transformation is just a way to transform one functor into another while preserving its structure. Here's a nice article helping explain the idea. It talks about concat, which is just join for lists.
With Haskell functors, the composition of two functors is a functor itself. In pseudocode, we could write this:
instance (Functor f, Functor g) ⇒ Functor (f ∘ g) where
fmap fun x = fmap (fmap fun) x
This helps us think about join as a mapping from f ∘ f → f. The type of join is ∀α. f (f α) → f α. Intuitively, we can see how a function valid for all types α can be thought of as a transformation of f.
return is a similar transformation. Its type is ∀α. α → f α. This looks different—the first α is not "in" a functor! Happily, we can fix this by adding an identity functor there: ∀α. Identity α → f α. So return is a transformation Identity → f.
Now we can think about a monad as just an algebra based around some functor f with operations f ∘ f → f and Identity → f. Doesn't this look familiar? It's very similar to a monoid, which was just some type τ with operations τ × τ → τ and () → τ.
So a monad is just like a monoid, except instead of having a type we have a functor. It's the same sort of algebra, just in a different category. (This is where the phrase "A monad is just a monoid in the category of endofunctors" comes from as far as I know.)
Now, we have these two operations: f ∘ f → f and Identity → f. To get the corresponding coalgebra, we just flip the arrows. This gives us two new operations: f → f ∘ f and f → Identity. We can turn them into Haskell types by adding type variables as above, giving us ∀α. f α → f (f α) and ∀α. f α → α. This looks just like the definition of a comonad:
class Functor f ⇒ Comonad f where
coreturn ∷ f α → α
cojoin ∷ f α → f (f α)
So a comonad is then a coalgebra in a category of endofunctors.
F-algebras and F-coalgebras are mathematical structures which are instrumental in reasoning about inductive types (or recursive types).
F-algebras
We'll start first with F-algebras. I will try to be as simple as possible.
I guess you know what is a recursive type. For example, this is a type for a list of integers:
data IntList = Nil | Cons (Int, IntList)
It is obvious that it is recursive - indeed, its definition refers to itself. Its definition consists of two data constructors, which have the following types:
Nil :: () -> IntList
Cons :: (Int, IntList) -> IntList
Note that I have written type of Nil as () -> IntList, not simply IntList. These are in fact equivalent types from the theoretical point of view, because () type has only one inhabitant.
If we write signatures of these functions in a more set-theoretical way, we will get
Nil :: 1 -> IntList
Cons :: Int × IntList -> IntList
where 1 is a unit set (set with one element) and A × B operation is a cross product of two sets A and B (that is, set of pairs (a, b) where a goes through all elements of A and b goes through all elements of B).
Disjoint union of two sets A and B is a set A | B which is a union of sets {(a, 1) : a in A} and {(b, 2) : b in B}. Essentially it is a set of all elements from both A and B, but with each of this elements 'marked' as belonging to either A or B, so when we pick any element from A | B we will immediately know whether this element came from A or from B.
We can 'join' Nil and Cons functions, so they will form a single function working on a set 1 | (Int × IntList):
Nil|Cons :: 1 | (Int × IntList) -> IntList
Indeed, if Nil|Cons function is applied to () value (which, obviously, belongs to 1 | (Int × IntList) set), then it behaves as if it was Nil; if Nil|Cons is applied to any value of type (Int, IntList) (such values are also in the set 1 | (Int × IntList), it behaves as Cons.
Now consider another datatype:
data IntTree = Leaf Int | Branch (IntTree, IntTree)
It has the following constructors:
Leaf :: Int -> IntTree
Branch :: (IntTree, IntTree) -> IntTree
which also can be joined into one function:
Leaf|Branch :: Int | (IntTree × IntTree) -> IntTree
It can be seen that both of this joined functions have similar type: they both look like
f :: F T -> T
where F is a kind of transformation which takes our type and gives more complex type, which consists of x and | operations, usages of T and possibly other types. For example, for IntList and IntTree F looks as follows:
F1 T = 1 | (Int × T)
F2 T = Int | (T × T)
We can immediately notice that any algebraic type can be written in this way. Indeed, that is why they are called 'algebraic': they consist of a number of 'sums' (unions) and 'products' (cross products) of other types.
Now we can define F-algebra. F-algebra is just a pair (T, f), where T is some type and f is a function of type f :: F T -> T. In our examples F-algebras are (IntList, Nil|Cons) and (IntTree, Leaf|Branch). Note, however, that despite that type of f function is the same for each F, T and f themselves can be arbitrary. For example, (String, g :: 1 | (Int x String) -> String) or (Double, h :: Int | (Double, Double) -> Double) for some g and h are also F-algebras for corresponding F.
Afterwards we can introduce F-algebra homomorphisms and then initial F-algebras, which have very useful properties. In fact, (IntList, Nil|Cons) is an initial F1-algebra, and (IntTree, Leaf|Branch) is an initial F2-algebra. I will not present exact definitions of these terms and properties since they are more complex and abstract than needed.
Nonetheless, the fact that, say, (IntList, Nil|Cons) is F-algebra allows us to define fold-like function on this type. As you know, fold is a kind of operation which transforms some recursive datatype in one finite value. For example, we can fold a list of integer into a single value which is a sum of all elements in the list:
foldr (+) 0 [1, 2, 3, 4] -> 1 + 2 + 3 + 4 = 10
It is possible to generalize such operation on any recursive datatype.
The following is a signature of foldr function:
foldr :: ((a -> b -> b), b) -> [a] -> b
Note that I have used braces to separate first two arguments from the last one. This is not real foldr function, but it is isomorphic to it (that is, you can easily get one from the other and vice versa). Partially applied foldr will have the following signature:
foldr ((+), 0) :: [Int] -> Int
We can see that this is a function which takes a list of integers and returns a single integer. Let's define such function in terms of our IntList type.
sumFold :: IntList -> Int
sumFold Nil = 0
sumFold (Cons x xs) = x + sumFold xs
We see that this function consists of two parts: first part defines this function's behavior on Nil part of IntList, and second part defines function's behavior on Cons part.
Now suppose that we are programming not in Haskell but in some language which allows usage of algebraic types directly in type signatures (well, technically Haskell allows usage of algebraic types via tuples and Either a b datatype, but this will lead to unnecessary verbosity). Consider a function:
reductor :: () | (Int × Int) -> Int
reductor () = 0
reductor (x, s) = x + s
It can be seen that reductor is a function of type F1 Int -> Int, just as in definition of F-algebra! Indeed, the pair (Int, reductor) is an F1-algebra.
Because IntList is an initial F1-algebra, for each type T and for each function r :: F1 T -> T there exist a function, called catamorphism for r, which converts IntList to T, and such function is unique. Indeed, in our example a catamorphism for reductor is sumFold. Note how reductor and sumFold are similar: they have almost the same structure! In reductor definition s parameter usage (type of which corresponds to T) corresponds to usage of the result of computation of sumFold xs in sumFold definition.
Just to make it more clear and help you see the pattern, here is another example, and we again begin from the resulting folding function. Consider append function which appends its first argument to second one:
(append [4, 5, 6]) [1, 2, 3] = (foldr (:) [4, 5, 6]) [1, 2, 3] -> [1, 2, 3, 4, 5, 6]
This how it looks on our IntList:
appendFold :: IntList -> IntList -> IntList
appendFold ys () = ys
appendFold ys (Cons x xs) = x : appendFold ys xs
Again, let's try to write out the reductor:
appendReductor :: IntList -> () | (Int × IntList) -> IntList
appendReductor ys () = ys
appendReductor ys (x, rs) = x : rs
appendFold is a catamorphism for appendReductor which transforms IntList into IntList.
So, essentially, F-algebras allow us to define 'folds' on recursive datastructures, that is, operations which reduce our structures to some value.
F-coalgebras
F-coalgebras are so-called 'dual' term for F-algebras. They allow us to define unfolds for recursive datatypes, that is, a way to construct recursive structures from some value.
Suppose you have the following type:
data IntStream = Cons (Int, IntStream)
This is an infinite stream of integers. Its only constructor has the following type:
Cons :: (Int, IntStream) -> IntStream
Or, in terms of sets
Cons :: Int × IntStream -> IntStream
Haskell allows you to pattern match on data constructors, so you can define the following functions working on IntStreams:
head :: IntStream -> Int
head (Cons (x, xs)) = x
tail :: IntStream -> IntStream
tail (Cons (x, xs)) = xs
You can naturally 'join' these functions into single function of type IntStream -> Int × IntStream:
head&tail :: IntStream -> Int × IntStream
head&tail (Cons (x, xs)) = (x, xs)
Notice how the result of the function coincides with algebraic representation of our IntStream type. Similar thing can also be done for other recursive data types. Maybe you already have noticed the pattern. I'm referring to a family of functions of type
g :: T -> F T
where T is some type. From now on we will define
F1 T = Int × T
Now, F-coalgebra is a pair (T, g), where T is a type and g is a function of type g :: T -> F T. For example, (IntStream, head&tail) is an F1-coalgebra. Again, just as in F-algebras, g and T can be arbitrary, for example,(String, h :: String -> Int x String) is also an F1-coalgebra for some h.
Among all F-coalgebras there are so-called terminal F-coalgebras, which are dual to initial F-algebras. For example, IntStream is a terminal F-coalgebra. This means that for every type T and for every function p :: T -> F1 T there exist a function, called anamorphism, which converts T to IntStream, and such function is unique.
Consider the following function, which generates a stream of successive integers starting from the given one:
nats :: Int -> IntStream
nats n = Cons (n, nats (n+1))
Now let's inspect a function natsBuilder :: Int -> F1 Int, that is, natsBuilder :: Int -> Int × Int:
natsBuilder :: Int -> Int × Int
natsBuilder n = (n, n+1)
Again, we can see some similarity between nats and natsBuilder. It is very similar to the connection we have observed with reductors and folds earlier. nats is an anamorphism for natsBuilder.
Another example, a function which takes a value and a function and returns a stream of successive applications of the function to the value:
iterate :: (Int -> Int) -> Int -> IntStream
iterate f n = Cons (n, iterate f (f n))
Its builder function is the following one:
iterateBuilder :: (Int -> Int) -> Int -> Int × Int
iterateBuilder f n = (n, f n)
Then iterate is an anamorphism for iterateBuilder.
Conclusion
So, in short, F-algebras allow to define folds, that is, operations which reduce recursive structure down into a single value, and F-coalgebras allow to do the opposite: construct a [potentially] infinite structure from a single value.
In fact in Haskell F-algebras and F-coalgebras coincide. This is a very nice property which is a consequence of presence of 'bottom' value in each type. So in Haskell both folds and unfolds can be created for every recursive type. However, theoretical model behind this is more complex than the one I have presented above, so I deliberately have avoided it.
Going through the tutorial paper A tutorial on (co)algebras and (co)induction should give you some insight about co-algebra in computer science.
Below is a citation of it to convince you,
In general terms, a program in some programming language manipulates data. During the
development of computer science over the past few decades it became clear that an abstract
description of these data is desirable, for example to ensure that one's program does not depend on the particular representation of the data on which it operates. Also, such abstractness facilitates correctness proofs.
This desire led to the use of algebraic methods in computer science, in a branch called algebraic specification or abstract data type theory. The object of study are data types in themselves, using notions of techniques which are familiar from algebra. The data types used by computer scientists are often generated from a given collection of (constructor) operations,and it is for this reason that "initiality" of algebras plays such an important role.
Standard algebraic techniques have proved useful in capturing various essential aspects of data structures used in computer science. But it turned out to be difficult to algebraically describe some of the inherently dynamical structures occurring in computing. Such structures usually involve a notion of state, which can be transformed in various ways. Formal approaches to such state-based dynamical systems generally make use of automata or transition systems, as classical early references.
During the last decade the insight gradually grew that such state-based systems should not be described as algebras, but as so-called co-algebras. These are the formal dual of algebras, in a way which will be made precise in this tutorial. The dual property of "initiality" for algebras, namely finality turned out to be crucial for such co-algebras. And the logical reasoning principle that is needed for such final co-algebras is not induction but co-induction.
Prelude, about Category theory.
Category theory should be rename theory of functors.
As categories are what one must define in order to define functors.
(Moreover, functors are what one must define in order to define natural transformations.)
What's a functor?
It's a transformation from one set to another which preserving their structure.
(For more detail there is a lot of good description on the net).
What's is an F-algebra?
It's the algebra of functor.
It's just the study of the universal propriety of functor.
How can it be link to computer science ?
Program can be view as a structured set of information.
Program's execution correspond to modification of this structured set of information.
It sound good that execution should preserve the program structure.
Then execution can be view as the application of a functor over this set of information.
(The one defining the program).
Why F-co-algebra ?
Program are dual by essence as they are describe by information and they act on it.
Then mainly the information which compose program and make them changed can be view in two way.
Data which can be define as the information being processed by the program.
State which can be define as the information being shared by the program.
Then at this stage, I'd like to say that,
F-algebra is the study of functorial transformation acting over Data's Universe (as been defined here).
F-co-algebras is the study of functorial transformation acting on State's Universe (as been defined here).
During the life of a program, data and state co-exist, and they complete each other.
They are dual.
I'll start with stuff that is obviously programming-related and then add on some mathematics stuff, to keep it as concrete and down-to-earth as I can.
Let's quote some computer-scientists on coinduction…
http://www.cs.umd.edu/~micinski/posts/2012-09-04-on-understanding-coinduction.html
Induction is about finite data, co-induction is about infinite data.
The typical example of infinite data is the type of a lazy list (a
stream). For example, lets say that we have the following object in
memory:
let (pi : int list) = (* some function which computes the digits of
π. *)
The computer can’t hold all of π, because it only has a finite amount
of memory! But what it can do is hold a finite program, which will
produce any arbitrarily long expansion of π that you desire. As long
as you only use finite pieces of the list, you can compute with that
infinite list as much as you need.
However, consider the following program:
let print_third_element (k : int list) = match k with
| _ :: _ :: thd :: tl -> print thd
print_third_element pi
This program should print the
third digit of pi. But in some languages, any argument to a function is evaluated before being passed
into a function (strict, not lazy, evaluation). If we use this
reduction order, then our above program will run forever computing the
digits of pi before it can be passed to our printer function (which
never happens). Since the machine does not have infinite memory, the
program will eventually run out of memory and crash. This might not be the best evaluation order.
http://adam.chlipala.net/cpdt/html/Coinductive.html
In lazy functional programming languages like Haskell, infinite data structures
are everywhere. Infinite lists and more exotic datatypes provide convenient
abstractions for communication between parts of a program. Achieving similar
convenience without infinite lazy structures would, in many cases, require
acrobatic inversions of control flow.
http://www.alexandrasilva.org/#/talks.html
Relating the ambient mathematical context to usual programming tasks
What is "an algebra"?
Algebraic structures generally look like:
Stuff
What the stuff can do
This should sound like objects with 1. properties and 2. methods. Or even better, it should sound like type signatures.
Standard mathematical examples include monoid ⊃ group ⊃ vector-space ⊃ "an algebra". Monoids are like automata: sequences of verbs (eg, f.g.h.h.nothing.f.g.f). A git log that always adds history and never deletes it would be a monoid but not a group. If you add inverses (eg negative numbers, fractions, roots, deleting accumulated history, un-shattering a broken mirror) you get a group.
Groups contain things that can be added or subtracted together. For example Durations can be added together. (But Dates cannot.) Durations live in a vector-space (not just a group) because they can also be scaled by outside numbers. (A type signature of scaling :: (Number,Duration) → Duration.)
Algebras ⊂ vector-spaces can do yet another thing: there’s some m :: (T,T) → T. Call this "multiplication" or don't, because once you leave Integers it’s less obvious what "multiplication" (or "exponentiation") should be.
(This is why people look to (category-theoretic) universal properties: to tell them what multiplication should do or be like:
)
Algebras → Coalgebras
Comultiplication is easier to define in a way that feels non-arbitrary, than is multiplication, because to go from T → (T,T) you can just repeat the same element. ("diagonal map" – like diagonal matrices/operators in spectral theory)
Counit is usually the trace (sum of diagonal entries), although again what's important is what your counit does; trace is just a good answer for matrices.
The reason to look at a dual space, in general, is because it's easier to think in that space. For example it's sometimes easier to think about a normal vector than about the plane it's normal to, but you can control planes (including hyperplanes) with vectors (and now I'm speaking of the familiar geometric vector, like in a ray-tracer).
Taming (un)structured data
Mathematicians might be modelling something fun like TQFT's, whereas programmers have to wrestle with
dates/times (+ :: (Date,Duration) → Date),
places (Paris ≠ (+48.8567,+2.3508)! It's a shape, not a point.),
unstructured JSON which is supposed to be consistent in some sense,
wrong-but-close XML,
incredibly complex GIS data which should satisfy loads of sensible relations,
regular expressions which meant something to you, but mean considerably less to perl.
CRM that should hold all the executive's phone numbers and villa locations, his (now ex-) wife and kids' names, birthday and all the previous gifts, each of which should satisfy "obvious" relations (obvious to the customer) which are incredibly hard to code up,
.....
Computer scientists, when talking about coalgebras, usually have set-ish operations in mind, like Cartesian product. I believe this is what people mean when they say like "Algebras are coalgebras in Haskell". But to the extent that programmers have to model complex data-types like Place, Date/Time, and Customer—and make those models look as much like the real world (or at least the end-user's view of the real world) as possible—I believe duals, could be useful beyond only set-world.