Scala only language with overloaded extractors? - scala

In at least some of the ML family languages, you can define records on which you can perform pattern matching e.g. http://learnyouahaskell.com/making-our-own-types-and-typeclasses - the basic idea is that you define a record type with named fields, a constructor is automatically created with those fields as parameters so you can create records of that type, and an extractor is automatically created with those fields as parameters so you can pattern match on records of that type.
Scala goes a step further and allows the fields stored in the record, the constructor parameters and the extractor parameters to be decoupled from each other e.g. http://daily-scala.blogspot.com/2009/11/overloaded-unapply.html - in this it is living up to its goal of supporting both object-oriented and functional programming. (Object-oriented languages of course normally allow stored fields and constructor parameters to be decoupled, though they don't normally have extractors.)
Are there any other languages that have pattern matching and allow such decoupling?
Has anything been written about the pros and cons of such decoupling?

I admit that I don't have 100% of the background required to understand your question, but I can say that F# has a feature called "Active Patterns" that it seems could be used to build the same functionality that your daily-scala link demonstrates.
Is that in the neighborhood of what you're looking for?

No, F# also provides that feature.
Examples in the second article can be implemented using Partial Active Patterns:
let (|String|_|) = function "s" -> Some "yay" | _ -> None
let (|Int|_|) = function 1 -> Some "hmm" | _ -> None
let (|StringList|_|) = function "x" -> Some [1; 2; 3] | _ -> None
let (|IntList|_|) = function 1 -> Some ["one"; "two"] | _ -> None
match 1 with
| Int s -> printfn "%O" s
| _ -> printfn "Unmatched"
match "s" with
| String s -> printfn "%O" s
| _ -> printfn "Unmatched"
match "x" with
| StringList [x; y; z] -> printfn "%O" (x, y, z)
| _ -> printfn "Unmatched"
match 1 with
| IntList [x; y] -> printfn "%O" (x, y)
| _ -> printfn "Unmatched"
Active Patterns are a powerful technique; you can even define them recursively. Combined with pattern matching, they provide a convenient toolkit for destructuring data. However, a match on partial active patterns is not exhaustive, so you have to use a wildcard (_) as the last pattern.

For reference, Don Syme (the inventor of F#) wrote a paper about F#'s "Active Patterns": Extensible Pattern Matching Via a Lightweight Language Extension – Syme, et al.

There is a long history of first class patterns in typed functional languages.
In Haskell land, we use the -XViewPatterns extension for programmatic patterns.
The first true view patterns go back to Phil Wadler's 1987 paper on views.
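As a rough illustration of the decoupling the question asks about, here is a minimal Haskell sketch using view patterns. The Person type and the functions initials, adultAge and describe are hypothetical names invented for this example; the only point is that what you match on need not mirror the stored fields or the constructor.

```haskell
{-# LANGUAGE ViewPatterns #-}

-- Hypothetical record type, invented for illustration.
data Person = Person { firstName :: String, lastName :: String, age :: Int }

-- "Extractors" are ordinary functions; neither mirrors the constructor.
initials :: Person -> String
initials p = take 1 (firstName p) ++ take 1 (lastName p)

-- A partial extractor: succeeds only for adults.
adultAge :: Person -> Maybe Int
adultAge p
  | age p >= 18 = Just (age p)
  | otherwise   = Nothing

-- A view pattern applies the function and matches on its result.
describe :: Person -> String
describe (adultAge -> Just n) = "adult, aged " ++ show n
describe (initials -> i)      = "minor with initials " ++ i
```

Several such functions can coexist for one type, which is roughly what overloaded unapply methods give you in Scala.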


Is it an example of a Monad? If not, why?

I am trying to study functional programming applied to Swift (a multi-paradigm language). One of the exercises I proposed to myself was trying to do a declarative Poker Hand evaluator.
Here are some code excerpts and my question at the end:
typealias Rule = ([Card]) -> Result
Where Result is a type that holds the current evaluation state (cards already evaluated on a rank, remaining cards, if the last rule evaluation was successful or not and the evaluated ranks). The input is an array (could be a set of) cards to be evaluated.
I also created this function:
func id(_ hand:[Card]) -> Result ...
That creates a minimum Result from a card set.
Result also has a set of functions to chain rule evaluations (simplified here):
func apply(_ rule:Rule) -> Result
func andThen(_ rule:Rule) -> Result
func andAlso(_ rule:Rule) -> Result
func otherwise(_ rule:Rule) -> Result
func continueWith(_ rule:Rule) -> Result
This allowed me to declare the poker rank rules as:
let fullHouse = { (hand) in
threeOfAKind(hand).andThen(pair)
}
or
let royalStraightFlush = { (hand) in
straightFlush(hand).andAlso(straightAceHigh)
}
and chaining all rank rules as:
let evaluate = { (hand) in
//id(hand)
royalStraightFlush(hand)
.otherwise(straightFlush)
.otherwise(fourOfAKind)
.otherwise(fullHouse)
.otherwise(flush)
.otherwise(straight)
.otherwise(threeOfAKind)
.otherwise(twoPair)
.otherwise(pair)
.continueWith(highCard) }
Result, as it is:
Encompasses a type ([Card]) into a broader context (R a)
Has an id (result) function that put an object of type a into a minimum "result" context (a -> R a)
But...
It has not just one generic >>= bind function but several specific ones, each of type R a -> (a -> R a) -> R a, which chain rules and take the card set to be evaluated from the previous partial result state.
It is not (as implemented) generic enough to handle types other than Card or [Card]. On the other hand, I think the same chaining logic could be used in other rule systems with some changes...
My question is: is Result a monad? If not, why not? My two concerns are those presented above.
I think understanding these points (or learning that there are other blind spots) in this concrete example will clarify the monad concept a little for me.
Thanks!
Monads have to obey the three Monad Laws:
Left identity: return a >>= f ≡ f a
Right identity: m >>= return ≡ m
Associativity: (m >>= f) >>= g ≡ m >>= (\x -> f x >>= g)
This also implies an implementation of return (or an equivalent) and >>= (variously called bind, flatMap, etc.) for your monad. Haskell (and other languages) also adds the sequencing operator (>>), even though it can be implemented in terms of >>=.
In addition, as @chepner points out, monads are abstractions that are not concrete types in themselves. They are generic and can only be instantiated via a type constructor. List is a monad but you cannot instantiate a List. You can, however, instantiate a List[Int].
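To make the three laws concrete, the following sketch spot-checks them for Haskell's Maybe monad. The functions safeHalf and safePred are hypothetical helpers made up for this example, and checking a handful of values illustrates the laws without proving them.

```haskell
-- Two hypothetical functions of shape a -> m a, used only to exercise the laws.
safeHalf :: Int -> Maybe Int
safeHalf n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

safePred :: Int -> Maybe Int
safePred n
  | n > 0     = Just (n - 1)
  | otherwise = Nothing

-- Left identity: return a >>= f  is the same as  f a
leftIdentity :: Int -> Bool
leftIdentity a = (return a >>= safeHalf) == safeHalf a

-- Right identity: m >>= return  is the same as  m
rightIdentity :: Maybe Int -> Bool
rightIdentity m = (m >>= return) == m

-- Associativity: regrouping the binds does not change the result
associativity :: Maybe Int -> Bool
associativity m =
  ((m >>= safeHalf) >>= safePred) == (m >>= (\x -> safeHalf x >>= safePred))
```

For a type like Result, the same three equations would have to hold for whichever single bind function is chosen, and bind would need the general shape R a -> (a -> R b) -> R b rather than being fixed to one element type.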
If you want to get a good conceptual picture of what makes a monad a monad, take a look at Brian Beckman's video Don't Fear the Monad. Also, there is a good series of blog posts, Monads are Elephants.

Encoding of inferrable records

As you probably know, records are somewhat special in ocaml, as each label has to be uniquely assigned to a nominal record type, i.e. the following function cannot be typed without context:
let f r = r.x
Proper first class records (i.e. things that behave like tuples with labels) are trivially encoded using objects, e.g.
let f r = r#x
when creating the objects in the right way (i.e. no self-recursion, no mutation), they behave just like records.
I am however, somewhat unhappy with this solution for two reasons:
when making records updatable (i.e. by adding an explicit "with_l" method for each label l), the type is somewhat too loose (it should be the same as the original record). Admittedly, one can enforce this equality, but this is still inconvenient.
I have the suspicion that the OCaml compiler does not infer that these records are actually immutable: In a function
let f r = r#x + r#x
would the compiler be able to run a common subexpression elimination?
For these reasons, I wonder if there is a better encoding:
Is there another (aside from using objects) type-safe encoding (e.g. using polymorphic variants) of records with inferrable type in OCaml?
Can this encoding avoid the problems mentioned above?
If I understand you correctly, you're looking for a very special kind of polymorphism. You want to write a function that works for all types, as long as the type is a record with certain fields. This sounds more like syntactic polymorphism in the C++ style than semantic polymorphism in the ML style. If we slightly rephrase the task, capturing the idea that field access is just syntactic sugar for a field projection function, then we can say that you want to write a function that is polymorphic over all types that provide a certain set of operations. This kind of polymorphism can be captured in OCaml using one of the following mechanisms:
functors
first class modules
objects
I think that functors are obvious, so I will show an example with first class modules. We will write a function print_student that will work on any type that satisfies the Student signature:
module type Student = sig
type t
val name : t -> string
val age : t -> int
end
let print_student (type t)
(module S : Student with type t = t) (s : t) =
Printf.printf "%s %d" (S.name s) (S.age s)
The type of print_student function is (module Student with type t = 'a) -> 'a -> unit. So it works for any type that satisfies the Student interface, and thus it is polymorphic. This is a very powerful polymorphism that comes with a price, you need to pass the module structure explicitly when you're invoking the function, so it is a System F style polymorphism. Functors will also require you to specify concrete module structure. So both are not inferrable (i.e., not an implicit Hindley-Milner-like style polymorphism, that you are looking for). For the latter, only objects will work (there are also modular implicits, that relax the explicitness requirement, but they are still not in the trunk, but they will actually answer your requirements).
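The explicit-dictionary style is not specific to OCaml. As a loose analogy only, sketched in Haskell rather than OCaml, a record of functions can play the role of the packed module. StudentOps, showStudent, tupleOps, Anonymous and anonOps are all hypothetical names introduced for this sketch.

```haskell
-- A record of operations standing in for the Student signature.
data StudentOps t = StudentOps
  { name :: t -> String
  , age  :: t -> Int
  }

-- Works for any representation t, but the dictionary is passed by hand.
showStudent :: StudentOps t -> t -> String
showStudent ops s = name ops s ++ " " ++ show (age ops s)

-- Two unrelated representations, each with its own dictionary.
tupleOps :: StudentOps (String, Int)
tupleOps = StudentOps { name = fst, age = snd }

newtype Anonymous = Anonymous Int

anonOps :: StudentOps Anonymous
anonOps = StudentOps { name = const "anonymous", age = \(Anonymous n) -> n }
```

As with the first class module version, the caller has to pick the dictionary explicitly, e.g. showStudent tupleOps ("Joe", 18); nothing here is inferred from the use of name and age.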
With object-style row polymorphism it is possible to write a function that is polymorphic over a set of types conforming to some signature, and to infer this signature implicitly from the function definition. However, such power comes at a price. Since object operations are encoded as methods, and methods are just function pointers that are assigned dynamically at run time, you shouldn't expect any compile-time optimizations. It is not possible to perform static analysis on something that is bound dynamically. So, of course, there is no common subexpression elimination and no inlining. For functors and first class modules, the optimization is possible on a newer branch of the compiler with flambda (see the 4.03.0+flambda opam switch). But on a regular compiler installation no inlining will be performed.
Different approaches
As for other techniques: first of all, we can use camlp{4,5}, ppx, or even m4 and cpp to preprocess the code, but this would hardly be idiomatic and is of doubtful usefulness.
Another way is, instead of writing a polymorphic function, to look for a suitable monomorphic data type. A direct approach would be to use a list of polymorphic variants, e.g.,
type attribute = [`name of string | `age of int]
type student = attribute list
In fact we don't even need to specify all these types ahead of time, and our function can require only the fields it needs, a form of row polymorphism:
let rec name = function
| [] -> raise Not_found
| `name n :: _ -> n
| _ :: student -> name student
The only problem with this encoding is that you cannot guarantee that an attribute with a given name occurs once and only once. So it is possible that a student has no name at all or, worse, has more than one name. Depending on your problem domain, this may be acceptable.
If it is not, then we can use GADTs and extensible variants to encode heterogeneous maps, i.e., associative data structures that map keys to values of different types (in a regular, homogeneous map or assoc list the value type is unified). How to construct such containers is beyond the scope of this answer, but fortunately there are at least two available implementations. One, which I use personally, is called universal map (Univ_map) and is provided by the Core library (Core_kernel, in fact). It allows you to specify two kinds of heterogeneous maps, without and with default values. The former corresponds to a record with optional fields; the latter has a default for each field, so an accessor is a total function. For example,
open Core_kernel.Std
module Dict = Univ_map.With_default
let name = Dict.Key.create ~name:"name" ~default:"Joe" sexp_of_string
let age = Dict.Key.create ~name:"age" ~default:18 sexp_of_int
let print student =
printf "%s %d"
(Dict.get student name) (Dict.get student age)
You can hide the fact that you're using a universal map behind an abstract type, since there is only one Dict.t that can be used across different abstractions, which may break modularity. Another example of a heterogeneous map implementation is from Daniel Bünzli. It doesn't provide a With_default kind of map, but has far fewer dependencies.
P.S. Of course, for such a reduced case, where there is only one operation, it is much easier to just pass this operation explicitly as a function instead of packing it into a structure, so we can write the function f from your example as simply as let f x r = x r + x r. But this would be the same kind of polymorphism as with first class modules/functors, just simplified. And I assume that your example was specifically reduced to one field, and that your real use case has a more complex set of fields.
Very roughly speaking, an OCaml object is a hash table whose keys are the hashes of its method names. (The hash of a method name can be obtained with Btype.hash_variant from the OCaml compiler implementation.)
Just like objects, you can encode polymorphic records using (int, Obj.t) Hashtbl.t. For example, a function to get a value of a field l can be written as follows:
(** [get r "x"] is poly-record version of [r.x] *)
let get r k = Hashtbl.find r (Btype.hash_variant k)
Since it is easy to access the internals unlike objects, the encoding of {r with l = e} is trivial:
(** [copy_with r [(k1,v1);..;(kn,vn)]] is poly-record version of
[{r with k1 = v1; ..; kn = vn}] *)
let copy_with r fields =
let r = Hashtbl.copy r in
List.iter (fun (k,v) -> Hashtbl.replace r (Btype.hash_variant k) v) fields;
r
and the creation of poly-records:
(** [create [(k1,v1);..(kn,vn)]] is poly-record version of [{k1=v1;..;kn=vn}] *)
let create fields = copy_with (Hashtbl.create (List.length fields)) fields
Since all the types of the fields are squashed into one Obj.t, you have to use Obj.magic to store various types into this implementation and therefore this is not type-safe by itself. However, we can make it type-safe wrapping (int, Obj.t) Hashtbl.t with phantom type whose parameter denotes the fields and their types of a poly-record. For example,
<x : int; y : float> Poly_record.t
is a poly-record whose fields are x : int and y : float.
The details of this phantom type wrapping for type safety are too long to explain here. Please see my implementation at https://bitbucket.org/camlspotter/ppx_poly_record/src . In short, it uses a PPX preprocessor to generate code for type safety and to provide easier syntactic sugar.
Compared with the encoding by objects, this approach has the following properties:
The same type safety and the same field access efficiency as objects
It can enjoy structural subtyping like objects, what you want for poly-records.
{r with l = e} is possible
Streamable outside of a program safely, since hash tables themselves have no closure in it. Objects are always "contaminated" with closures therefore they are not safely streamable.
Unfortunately it lacks efficient pattern matching, which is available for mono-records. (And this is why I do not use my implementation :-( ) I feel that PPX preprocessing is not enough for this and some compiler modification is required. It would not be really hard, though, since we can make use of the typing of objects.
Ah, and of course, this encoding relies heavily on side effects, so no CSE optimization can be expected.
Is there another (aside from using objects) type-safe encoding (e.g. using polymorphic variants) of records with inferrable type in OCaml?
For immutable records, yes. There is a standard theoretical duality between polymorphic records ("inferrable" records as you describe) and polymorphic variants. In short, a record { l_1 = v_1; l_2 = v_2; ...; l_n = v_n } can be implemented by
function `l_1 k -> k v_1 | `l_2 k -> k v_2 | ... | `l_n k -> k v_n
and then the projection r.l_i becomes r (`l_i (fun v -> v)). For instance, the function fun r -> r.x is encoded as fun r -> r (`x (fun v -> v)). See also the following example session:
# let myRecord = (function `field1 k -> k 123 | `field2 k -> k "hello") ;;
(* encodes { field1 = 123; field2 = "hello" } *)
val myRecord : [< `field1 of int -> 'a | `field2 of string -> 'a ] -> 'a = <fun>
# let getField1 r = r (`field1 (fun v -> v)) ;;
(* fun r -> r.field1 *)
val getField1 : ([> `field1 of 'a -> 'a ] -> 'b) -> 'b = <fun>
# getField1 myRecord ;;
- : int = 123
# let getField2 r = r (`field2 (fun v -> v)) ;;
(* fun r -> r.field2 *)
val getField2 : ([> `field2 of 'a -> 'a ] -> 'b) -> 'b = <fun>
# getField2 myRecord ;;
- : string = "hello"
For mutable records, we can add setters like:
let ref1 = ref 123
let ref2 = ref "hello"
let myRecord =
function
| `field1 k -> k !ref1
| `field2 k -> k !ref2
| `set_field1(v1, k) -> k (ref1 := v1)
| `set_field2(v2, k) -> k (ref2 := v2)
and use them like myRecord (`set_field1(456, fun v -> v)) and myRecord (`set_field2("world", fun v -> v)) for example. However, localizing ref1 and ref2 like
let myRecord =
let ref1 = ref 123 in
let ref2 = ref "hello" in
function
| `field1 k -> k !ref1
| `field2 k -> k !ref2
| `set_field1(v1, k) -> k (ref1 := v1)
| `set_field2(v2, k) -> k (ref2 := v2)
causes a value restriction problem and requires a little more polymorphic typing trick (which I omit here).
Can this encoding avoid the problems mentioned above?
The "common subexpression elimination" for (the encoding of) r.x + r.x can be done only if OCaml knows the definition of r and inlines it. (Sorry my previous answer was inaccurate here.)

What does >>= mean in purescript?

I was reading the purescript wiki and found the following section, which explains do in terms of >>=.
What does >>= mean?
Do notation
The do keyword introduces simple syntactic sugar for monadic
expressions.
Here is an example, using the monad for the Maybe type:
maybeSum :: Maybe Number -> Maybe Number -> Maybe Number
maybeSum a b = do
  n <- a
  m <- b
  let result = n + m
  return result
maybeSum takes two values of type Maybe Number and returns their sum if neither number is Nothing.
When using do notation, there must be a corresponding instance of the Monad type class for the return type. Statements can have the following form:
a <- x which desugars to x >>= \a -> ...
x which desugars to x >>= \_ -> ... or just x if this is the last statement.
A let binding let a = x. Note the lack of the in keyword.
The example maybeSum desugars to:
maybeSum a b =
  a >>= \n ->
  b >>= \m ->
  let result = n + m
  in return result
>>= is a function, nothing more. It resides in the Prelude module and has type (>>=) :: forall m a b. (Bind m) => m a -> (a -> m b) -> m b, being an alias for the bind function of the Bind type class. You can find the definitions of the Prelude module in this link, found in the Pursuit package index.
This is closely related to the Monad type class in Haskell, for which resources are a bit easier to find. There's a famous question on SO about this concept, which is a good starting point if you're looking to improve your knowledge of the bind function (if you're just starting with functional programming, you can skip it for a while).
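To make the desugaring concrete, here is a small sketch written in Haskell rather than PureScript, on the assumption that >>= behaves the same way for Maybe in both. bindMaybe, maybeSumDo and maybeSumBind are hypothetical names; bindMaybe is a hand-written stand-in for >>= specialised to Maybe, so you can see that it is an ordinary function.

```haskell
-- A hand-written bind for Maybe: stop at Nothing, otherwise feed the value on.
bindMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe Nothing  _ = Nothing
bindMaybe (Just x) f = f x

-- The do-notation version from the wiki, transcribed to Haskell.
maybeSumDo :: Maybe Int -> Maybe Int -> Maybe Int
maybeSumDo a b = do
  n <- a
  m <- b
  let result = n + m
  return result

-- The same function with the sugar removed and bindMaybe in place of >>=.
maybeSumBind :: Maybe Int -> Maybe Int -> Maybe Int
maybeSumBind a b =
  a `bindMaybe` \n ->
  b `bindMaybe` \m ->
  let result = n + m
  in Just result
```

Both versions return Just 3 for Just 1 and Just 2, and Nothing as soon as either argument is Nothing.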

Convert datatype to string (SML)

I have a function that returns a (char * int) list list, like [[(#"D", 3)], [(#"F", 7)]], and now I'm wondering if it's possible to convert this to a string, so that I can use I/O and write it to another file?
First of all, I assume you meant a value like [[(#"D", 3)], [(#"F", 7)]] (note the extra parens) since SML requires parentheses around tuple construction. OCaml uses a slightly different syntax, and allows just commas, like a, b, to construct tuples. I mention this because what follows is totally specific to Standard ML, and doesn't apply to OCaml, because I believe that in OCaml your best bet is an entirely different approach, which I don't know much about (macros, i.e. ocamlp4/5). So I assume that was just a typo and that you're interested in Standard ML.
Now, unfortunately there is no general toString function in Standard ML. Something like that would have to have some kind of special support in the language and implementation, since it's not possible to write a function with the type 'a -> string. You basically have to write your own toString : t -> string for each type t.
As you can imagine, this gets tedious fast. I've spent a little time researching the options (for this and other boilerplate functions like compare : 't * 't -> order) and there is one very interesting technique outlined in the paper "Generics for the working ML'er" (http://dl.acm.org/citation.cfm?id=1292547), but it's pretty advanced and I could never actually get the code to compile (that said, the paper is very interesting). The full generics library described in that paper is in the MLton lib repo (https://github.com/MLton/mltonlib/tree/master/com/ssh/generic/unstable). Maybe you'll have better luck?
Here's a slightly lighter weight approach that is less powerful but easier to understand, IMHO. I wrote this after reading that paper and struggling to get it to work. The idea is to write building blocks for toString functions (called show in this case) and compose them with other functions for your own types.
structure Show =
struct
(* Show.t is the type of toString functions *)
type 'a t = 'a -> string
val int: int t = Int.toString
val char: char t = Char.toString
val list: 'a t -> 'a list t =
fn show => fn xs => "[" ^ concat (ExtList.interleave (map show xs) ",") ^ "]"
val pair: 'a t * 'b t -> ('a * 'b) t =
fn (showa,showb) => fn (a,b) => "(" ^ showa a ^ "," ^ showb b ^ ")"
(* ... *)
end
Since your type doesn't actually have any user defined datatypes, it's very easy to write the toString function using this structure:
local
open Show
in
val show : (char * int) list list -> string = list (list (pair (char, int)))
end
- show [[(#"D", 3)], [(#"F", 7)]] ;
val it = "[[(D,3)],[(F,7)]]" : string
What I like about this is that the composed functions read like the type turned inside out. It's quite an elegant style, which I cannot take credit for as I took it from the generics paper linked above.
The rest of the code for Show (and a related module Eq for equality comparison) is here: https://github.com/spacemanaki/lib.sml
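For comparison only, the same combinator style carries over to other typed functional languages. The sketch below restates it in Haskell rather than Standard ML. Shower, showInt, showChr, showListOf, showPair and showTable are hypothetical names, and intercalate from Data.List stands in for the ExtList.interleave helper used above.

```haskell
import Data.List (intercalate)

-- A "shower" is just a function to String, like Show.t in the SML code.
type Shower a = a -> String

showInt :: Shower Int
showInt = show

-- Renders the bare character, e.g. D rather than 'D'.
showChr :: Shower Char
showChr c = [c]

showListOf :: Shower a -> Shower [a]
showListOf sh xs = "[" ++ intercalate "," (map sh xs) ++ "]"

showPair :: Shower a -> Shower b -> Shower (a, b)
showPair sa sb (a, b) = "(" ++ sa a ++ "," ++ sb b ++ ")"

-- The composed function mirrors the type [[(Char, Int)]] read inside out.
showTable :: Shower [[(Char, Int)]]
showTable = showListOf (showListOf (showPair showChr showInt))
```

Applied to [[('D', 3)], [('F', 7)]], showTable produces the string [[(D,3)],[(F,7)]].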

Defining a function a -> String, which works for types without Show?

I'd like to define a function which can "show" values of any type, with special behavior for types which actually do define a Show instance:
magicShowCast :: ?
debugShow :: a -> String
debugShow x = case magicShowCast x of
  Just x' -> show x'
  Nothing -> "<unprintable>"
This would be used to add more detailed information to error messages when something goes wrong:
-- needs to work on non-Showable types
assertEq :: Eq a => a -> a -> IO ()
assertEq x y = when (x /= y)
  (throwIO (AssertionFailed (debugShow x) (debugShow y)))
data CanShow = CanShow1
             | CanShow2
  deriving (Eq, Show)
data NoShow = NoShow1
            | NoShow2
  deriving (Eq)
-- AssertionFailed "CanShow1" "CanShow2"
assertEq CanShow1 CanShow2
-- AssertionFailed "<unprintable>" "<unprintable>"
assertEq NoShow1 NoShow2
Is there any way to do this? I tried using various combinations of GADTs, existential types, and template haskell, but either these aren't enough or I can't figure out how to apply them properly.
The real answer: You can't. Haskell intentionally doesn't define a generic "serialize to string" function, and being able to do so without some type class constraint would violate parametricity all over town. Dreadful, just dreadful.
If you don't see why this poses a problem, consider the following type signature:
something :: (a, a) -> b -> a
How would you implement this function? The generic type means it has to be either const . fst or const . snd, right? Hmm.
something (x,y) z = if debugShow z == debugShow y then y else x
> something ('a', 'b') ()
'a'
> something ('a', 'b') 'b'
'b'
Oooooooops! So much for being able to reason about your program in any sane way. That's it, show's over, go home, it was fun while it lasted.
The terrible, no good, unwise answer: Sure, if you don't mind shamelessly cheating. Did I mention that example above was an actual GHCi session? Ha, ha.
import Control.Exception
import Control.Monad
import GHC.Vacuum
debugShow :: a -> String
debugShow = show . nameGraph . vacuumLazy
assertEq :: Eq a => a -> a -> IO ()
assertEq x y = when (x /= y) . throwIO . AssertionFailed $
  unlines ["assertEq failed:", '\t':debugShow x, "=/=", '\t':debugShow y]
data NoShow = NoShow1
            | NoShow2
  deriving (Eq)
> assertEq NoShow1 NoShow2
*** Exception: assertEq failed:
[("|0",["NoShow1|1"]),("NoShow1|1",[])]
=/=
[("|0",["NoShow2|1"]),("NoShow2|1",[])]
Oh. Ok. That looks like a fantastic idea, doesn't it.
Anyway, this doesn't quite give you what you want, since there's no obvious way to fall back to a proper Show instance when available. On the other hand, this lets you do a lot more than show can do, so perhaps it's a wash.
Seriously, though. Don't do this in actual, working code. Ok, error reporting, debugging, logging... that makes sense. But otherwise, it's probably very ill-advised.
I asked this question a while ago on the haskell-cafe list, and the experts said no. Here are some good responses,
http://www.haskell.org/pipermail/haskell-cafe/2011-May/091744.html
http://www.haskell.org/pipermail/haskell-cafe/2011-May/091746.html
The second one mentions GHC advanced overlap, but my experience was that it doesn't really work.
For your particular problem, I'd introduce a typeclass
class MaybeShow a where mshow :: a -> String
make anything that is showable do the logical thing
instance Show a => MaybeShow a where mshow = show
and then, if you have a fixed number of types which wouldn't be showable, say
instance MaybeShow NotShowableA where mshow _ = "<unprintable>"
of course you could abstract it a little,
class NotShowable a
instance NotShowable a => MaybeShow a where mshow _ = "<unprintable>"
instance NotShowable NotShowableA -- etc.
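As a hedged sketch of how the first form can be made to compile with GHC: the catch-all instance needs FlexibleInstances and UndecidableInstances, and it has to be marked OVERLAPPABLE so that the per-type instances win. Note that GHC rejects two instances with the identical head MaybeShow a as duplicates, whatever their contexts, so this sketch keeps one explicit instance per non-showable type. CanShow and NoShow are the types from the question; nothing here has been tested against the asker's real code.

```haskell
{-# LANGUAGE FlexibleInstances, UndecidableInstances #-}

class MaybeShow a where
  mshow :: a -> String

-- Catch-all: any type with a Show instance, unless a more specific
-- MaybeShow instance exists for that type.
instance {-# OVERLAPPABLE #-} Show a => MaybeShow a where
  mshow = show

data CanShow = CanShow1 | CanShow2 deriving (Eq, Show)
data NoShow  = NoShow1  | NoShow2  deriving (Eq)

-- Each non-showable type still has to be listed by hand.
instance MaybeShow NoShow where
  mshow _ = "<unprintable>"
```

With these definitions mshow CanShow1 gives "CanShow1" and mshow NoShow1 gives "<unprintable>". Callers still need a MaybeShow constraint in their own signatures, so this does not provide the unconstrained a -> String that the question asks for.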
You shouldn't be able to. The simplest way to implement type classes is by having them compile into an extra parameter
foo :: Show s => a -> s
turns into
foo :: show -> a -> s
The program just passes around the type class instances (like v-tables in C++) as ordinary data. This is why you can trivially use things that look not just like multiple dispatch in OO languages, but can dispatch off the return type.
The problem is that a signature
foo :: a -> String
has no way of getting the implementation of Show that goes for a in cases when it has one.
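The dictionary-passing picture can be written out by hand. This is a simplified model, not what GHC literally generates; ShowDict, describeWith, showDictInt, showDictBool and describeBlind are hypothetical names used only for this sketch.

```haskell
-- Hypothetical explicit dictionary standing in for the Show class.
newtype ShowDict a = ShowDict { showWith :: a -> String }

-- Roughly what a function with a Show constraint becomes:
-- the dictionary arrives as an ordinary argument.
describeWith :: ShowDict a -> a -> String
describeWith dict x = "value: " ++ showWith dict x

-- Hand-built "instances".
showDictInt :: ShowDict Int
showDictInt = ShowDict show

showDictBool :: ShowDict Bool
showDictBool = ShowDict (\b -> if b then "yes" else "no")

-- With no dictionary parameter there is nothing to call, so a total
-- definition has no option but to ignore its argument.
describeBlind :: a -> String
describeBlind _ = "<unprintable>"
```

A function of type a -> String is in the position of describeBlind: no dictionary was passed in, so it cannot reach the Show implementation even when one exists for the type at hand.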
You might be able to get something like this to work in particular implementations, with the correct language extensions (overlapping instances, etc.) turned on, but I haven't tried it:
class MyShow a where
  myShow :: a -> String
instance (Show a) => MyShow a where
  myShow = show
instance MyShow a where
  myShow = ...
One trick that might help is to enable type families. It can let you write code like
instance (a' ~ a, Show a') => MyShow a
which can sometimes help you get code past the compiler that it doesn't think looks okay.