From reading about FP, my understanding of the benefits of removing side-effects is that if all our functions are pure/have referential transparency (something that can only be achieved without side-effects), then our functions are easier to test, debug, reuse, and are more modular.
Since exceptions are a form of side-effect, we need to avoid throwing exceptions. Obviously we still need to be able to terminate processes when something goes wrong so FP uses monads to achieve both referential transparency and the ability to handle exceptions.
What I'm confused about is how exactly monads achieve this. Suppose I have this code using scalaz
def couldThrowException: Exception \/ Boolean = ??
val f = couldThrowException()
val g = couldThrowException()
Since couldThrowException may return an exception, there is no gaurantee that f and g will be the same. f could be \/-(true) and g be a -\/(NullPointerException). Since couldThrowException can return different values with the same input, it is not a pure function. Wasn't the point of using monads to keep our functions pure?
f() and g() should evaluate to the same value, given the same input.
In pure FP a function with no arguments must necessarily evaluate to the same result every time it's called. So it's not pure FP if your couldThrowException sometimes returns \/-(true) and sometimes -\/(NullPointerException).
It makes more sense to return an Either if couldThrowException takes a parameter. If it's a pure function, it will have referential transparency, so some inputs will always result in \/-(true) and some inputs will always result in -\/(NullPointerException).
In Scala you may well be using a function that is not pure, and not referentially transparent. Perhaps it's a Java class. Perhaps it's using a part of Scala that's not pure.
But I guess you're interested in bridging between the pure FP world and impure libraries. A classic example of this is IO. println could fail for all kinds of reasons - permissions, filesystem full, etc.
The classic way to handle this in FP is to use an IO function that takes a "world" state as an input parameter, and returns both the result of the IO call, and the new "world" state. The state can be a "fake" value backed by library code in an impure language, but what it means is that each time you call the function, you're passing a different state, so it's referentially transparent.
Often, a monad is used to encapsulate the "world".
You can find out a lot about this approach by reading about Haskell's IO monad.
Core Scala IO isn't wholly pure, so println might throw and exception, and hence, as you've spotted, isn't fully referentially transparent. Scalaz provides an IO monad similar to Haskell's.
One thing to note, because it trips up a lot of beginners: there's nothing about the "World" approach that requires a monad, and IO isn't easiest the monad to look at when first learning what a monad is and why they're useful.
Suppose I allocation some large object (e.g. a vector of size N, which might be very large) and perform a sequence of m operations on it:
fm( .. f3( f2( f1( vec ) ) ) )
with each returning a collection of size N.
For simplicity let's assume each f is quite simple
def f5(vec: Vector[Int]) = { gc(); f6( vec.map(_+1) ) }
So, vec no longer has future references at the point where each subsequent call is made. (f1's vec parameter is never used after f2 is entered, and so forth for each call)
However, because most JVMs don't decrement references until the stack unwinds (AFAIK), isn't my program required to consume NxM memory. By comparison in the following style only 2xM is required (and less in other implementations)
var vec:Vector[Int] = ...
for ( f <- F ) {
vec = f(vec)
gc()
}
Does the same issue exist for tail recursive methods?
This isn't just an academic exercise - in some types of big-data type problems, we might to choose N so that our program is fits fully into RAM. In this case, should I be concerned that one style of pipelining is preferable to another?
First of all, your question contains a serious misconception, and an example of disastrously bad coding.
However, because most JVMs don't decrement references until the stack unwinds (AFAIK) ...
Actually there are no mainstream JVMs that use reference counting on references at all. Instead, they all use mark-sweep, copying or generational collection algorithms of some kind that do not rely on reference counting.
Next this:
def f5(vec: Vector[Int]) = { gc(); f6( vec.map(_+1) ) }
I think you are trying to "force" a garbage collection with the gc() call. Don't do this: it is horribly inefficient. And even if you are only doing to investigate memory management behavior, you are most likely distorting that behavior to the extent that what you are seeing is NOT representative of normal Scala code.
Having said that, the answer is basically yes. If your Scala function cannot be tail-call optimized, then there is the potential for a deep recursion to cause garbage retention problems. The only "get out" would be if the JIT compiler was able to tell the GC that certain variables were "dead" at particular points in a method call. I don't know if HotSpot JITs / GCs can do that.
(I guess, another way to do that would be for the Scala compiler to explicitly assign null to dead reference variables. But that has potential performance issues when you don't have a garbage retention problem!)
To add to #StephenC's answer
I don't know if HotSpot JITs / GCs can do that.
The hotspot jit can do liveness analysis within a method and deem local variables as unreachable even while a frame is still on the stack. This is why JDK9 introduces Reference.reachabilityFence, under some conditions even this can become unreachable while executing a member method of that instance.
But that optimization only applies when there really nothing in the control flow that can still read that local variable, e.g. no finally blocks or monitor exits. So it would depend on the bytecode generated by scala.
The calls in your example are tail calls. They really shouldn't have a stack frame allocated at all. However, for various unfortunate reasons, the Scala Language Specification does not mandate Proper Tail Calls, and for similarly unfortunate reasons, the Scala-JVM implementation does not perform Tail Call Optimization.
However, some JVMs have TCO, e.g. the J9 JVM performs TCO, and thus there shouldn't be any additional stack frames allocated, making the intermediate objects unreachable as soon as the next tail call happens. Even JVMs that do not have TCO are able to perform various static (escape analysis, liveness analysis) or dynamic (escape detection, e.g. the Azul Zing JVM does this) analysis that may or may not help with this case.
There are also other implementations of Scala: Scala.js does not perform TCO, as far as I know, but it compiles to ECMAScript, and as of ECMAScript 2015, ECMAScript does have Proper Tail Calls, so as long as the encoding of Scala method calls ends up as ECMAScript function calls, an standards-conforming ECMAScript 2015 engine should eliminate Scala tail calls.
Scala-native currently does not perform TCO, but it will in the future.
This is just one of those "I was wondering..." questions.
Scala has immutable data structures and (optional) lazy vals etc.
How close can a Scala program be to one that is fully pure (in a functional programming sense) and fully lazy (or as Ingo points out, can it be sufficiently non-strict)? What values are unavoidably mutable and what evaluation unavoidably greedy?
Regarding lazyness - currently, passing a parameter to a method is by default strict:
def square(a: Int) = a * a
but you use call-by-name parameters:
def square(a: =>Int) = a * a
but this is not lazy in the sense that it computes the value only once when needed:
scala> square({println("calculating");5})
calculating
calculating
res0: Int = 25
There's been some work into adding lazy method parameters, but it hasn't been integrated yet (the below declaration should print "calculating" from above only once):
def square(lazy a: Int) = a * a
This is one piece that is missing, although you could simulate it with a local lazy val:
def square(ap: =>Int) = {
lazy val a = ap
a * a
}
Regarding mutability - there is nothing holding you back from writing immutable data structures and avoid mutation. You can do this in Java or C as well. In fact, some immutable data structures rely on the lazy primitive to achieve better complexity bounds, but the lazy primitive can be simulated in other languages as well - at the cost of extra syntax and boilerplate.
You can always write immutable data structures, lazy computations and fully pure programs in Scala. The problem is that the Scala programming model allows writing non pure programs as well, so the type checker can't always infer some properties of the program (such as purity) which it could infer given that the programming model was more restrictive.
For example, in a language with pure expressions the a * a in the call-by-name definition above (a: =>Int) could be optimized to evaluate a only once, regardless of the call-by-name semantics. If the language allows side-effects, then such an optimization is not always applicable.
Scala can be as pure and lazy as you like, but a) the compiler won't keep you honest with regards to purity and b) it will take a little extra work to make it lazy. There's nothing too profound about this; you can even write lazy and pure Java code if you really want to (see here if you dare; achieving laziness in Java requires eye-bleeding amounts of nested anonymous inner classes).
Purity
Whereas Haskell tracks impurities via the type system, Scala has chosen not to go that route, and it's difficult to tack that sort of thing on when you haven't made it a goal from the beginning (and also when interoperability with a thoroughly impure language like Java is a major goal of the language).
That said, some believe it's possible and worthwhile to make the effort to document effects in Scala's type system. But I think purity in Scala is best treated as a matter of self-discipline, and you must be perpetually skeptical about the supposed purity of third-party code.
Laziness
Haskell is lazy by default but can be made stricter with some annotations sprinkled in your code... Scala is the opposite: strict by default but with the lazy keyword and by-name parameters you can make it as lazy as you like.
Feel free to keep things immutable. On the other hand, there's no side effect tracking, so you can't enforce or verify it.
As for non-strictness, here's the deal... First, if you choose to go completely non-strict, you'll be forsaking all of Scala's classes. Even Scalaz is not non-strict for the most part. If you are willing to build everything yourself, you can make your methods non-strict and your values lazy.
Next, I wonder if implicit parameters can be non-strict or not, or what would be the consequences of making them non-strict. I don't see a problem, but I could be wrong.
But, most problematic of all, function parameters are strict, and so are closures parameters.
So, while it is theoretically possible to go fully non-strict, it will be incredibly inconvenient.
What are the key differences between the approaches taken by Scala and F# to unify OO and FP paradigms?
EDIT
What are the relative merits and demerits of each approach? If, in spite of the support for subtyping, F# can infer the types of function arguments then why can't Scala?
I have looked at F#, doing low level tutorials, so my knowledge of it is very limited. However, it was apparent to me that its style was essentially functional, with OO being more like an add on -- much more of an ADT + module system than true OO. The feeling I get can be best described as if all methods in it were static (as in Java static).
See, for instance, any code using the pipe operator (|>). Take this snippet from the wikipedia entry on F#:
[1 .. 10]
|> List.map fib
(* equivalent without the pipe operator *)
List.map fib [1 .. 10]
The function map is not a method of the list instance. Instead, it works like a static method on a List module which takes a list instance as one of its parameters.
Scala, on the other hand, is fully OO. Let's start, first, with the Scala equivalent of that code:
List(1 to 10) map fib
// Without operator notation or implicits:
List.apply(Predef.intWrapper(1).to(10)).map(fib)
Here, map is a method on the instance of List. Static-like methods, such as intWrapper on Predef or apply on List, are much more uncommon. Then there are functions, such as fib above. Here, fib is not a method on int, but neither it is a static method. Instead, it is an object -- the second main difference I see between F# and Scala.
Let's consider the F# implementation from the Wikipedia, and an equivalent Scala implementation:
// F#, from the wiki
let rec fib n =
match n with
| 0 | 1 -> n
| _ -> fib (n - 1) + fib (n - 2)
// Scala equivalent
def fib(n: Int): Int = n match {
case 0 | 1 => n
case _ => fib(n - 1) + fib(n - 2)
}
The above Scala implementation is a method, but Scala converts that into a function to be able to pass it to map. I'll modify it below so that it becomes a method that returns a function instead, to show how functions work in Scala.
// F#, returning a lambda, as suggested in the comments
let rec fib = function
| 0 | 1 as n -> n
| n -> fib (n - 1) + fib (n - 2)
// Scala method returning a function
def fib: Int => Int = {
case n # (0 | 1) => n
case n => fib(n - 1) + fib(n - 2)
}
// Same thing without syntactic sugar:
def fib = new Function1[Int, Int] {
def apply(param0: Int): Int = param0 match {
case n # (0 | 1) => n
case n => fib.apply(n - 1) + fib.apply(n - 2)
}
}
So, in Scala, all functions are objects implementing the trait FunctionX, which defines a method called apply. As shown here and in the list creation above, .apply can be omitted, which makes function calls look just like method calls.
In the end, everything in Scala is an object -- and instance of a class -- and every such object does belong to a class, and all code belong to a method, which gets executed somehow. Even match in the example above used to be a method, but has been converted into a keyword to avoid some problems quite a while ago.
So, how about the functional part of it? F# belongs to one of the most traditional families of functional languages. While it doesn't have some features some people think are important for functional languages, the fact is that F# is function by default, so to speak.
Scala, on the other hand, was created with the intent of unifying functional and OO models, instead of just providing them as separate parts of the language. The extent to which it was succesful depends on what you deem to be functional programming. Here are some of the things that were focused on by Martin Odersky:
Functions are values. They are objects too -- because all values are objects in Scala -- but the concept that a function is a value that can be manipulated is an important one, with its roots all the way back to the original Lisp implementation.
Strong support for immutable data types. Functional programming has always been concerned with decreasing the side effects on a program, that functions can be analysed as true mathematical functions. So Scala made it easy to make things immutable, but it did not do two things which FP purists criticize it for:
It did not make mutability harder.
It does not provide an effect system, by which mutability can be statically tracked.
Support for Algebraic Data Types. Algebraic data types (called ADT, which confusingly also stands for Abstract Data Type, a different thing) are very common in functional programming, and are most useful in situations where one commonly use the visitor pattern in OO languages.
As with everything else, ADTs in Scala are implemented as classes and methods, with some syntactic sugars to make them painless to use. However, Scala is much more verbose than F# (or other functional languages, for that matter) in supporting them. For example, instead of F#'s | for case statements, it uses case.
Support for non-strictness. Non-strictness means only computing stuff on demand. It is an essential aspect of Haskell, where it is tightly integrated with the side effect system. In Scala, however, non-strictness support is quite timid and incipient. It is available and used, but in a restricted manner.
For instance, Scala's non-strict list, the Stream, does not support a truly non-strict foldRight, such as Haskell does. Furthermore, some benefits of non-strictness are only gained when it is the default in the language, instead of an option.
Support for list comprehension. Actually, Scala calls it for-comprehension, as the way it is implemented is completely divorced from lists. In its simplest terms, list comprehensions can be thought of as the map function/method shown in the example, though nesting of map statements (supports with flatMap in Scala) as well as filtering (filter or withFilter in Scala, depending on strictness requirements) are usually expected.
This is a very common operation in functional languages, and often light in syntax -- like in Python's in operator. Again, Scala is somewhat more verbose than usual.
In my opinion, Scala is unparalled in combining FP and OO. It comes from the OO side of the spectrum towards the FP side, which is unusual. Mostly, I see FP languages with OO tackled on it -- and it feels tackled on it to me. I guess FP on Scala probably feels the same way for functional languages programmers.
EDIT
Reading some other answers I realized there was another important topic: type inference. Lisp was a dynamically typed language, and that pretty much set the expectations for functional languages. The modern statically typed functional languages all have strong type inference systems, most often the Hindley-Milner1 algorithm, which makes type declarations essentially optional.
Scala can't use the Hindley-Milner algorithm because of Scala's support for inheritance2. So Scala has to adopt a much less powerful type inference algorithm -- in fact, type inference in Scala is intentionally undefined in the specification, and subject of on-going improvements (it's improvement is one of the biggest features of the upcoming 2.8 version of Scala, for instance).
In the end, however, Scala requires all parameters to have their types declared when defining methods. In some situations, such as recursion, return types for methods also have to be declared.
Functions in Scala can often have their types inferred instead of declared, though. For instance, no type declaration is necessary here: List(1, 2, 3) reduceLeft (_ + _), where _ + _ is actually an anonymous function of type Function2[Int, Int, Int].
Likewise, type declaration of variables is often unnecessary, but inheritance may require it. For instance, Some(2) and None have a common superclass Option, but actually belong to different subclases. So one would usually declare var o: Option[Int] = None to make sure the correct type is assigned.
This limited form of type inference is much better than statically typed OO languages usually offer, which gives Scala a sense of lightness, and much worse than statically typed FP languages usually offer, which gives Scala a sense of heavyness. :-)
Notes:
Actually, the algorithm originates from Damas and Milner, who called it "Algorithm W", according to the wikipedia.
Martin Odersky mentioned in a comment here that:
The reason Scala does not have Hindley/Milner type inference is
that it is very difficult to combine with features such as
overloading (the ad-hoc variant, not type classes), record
selection, and subtyping
He goes on to state that it may not be actually impossible, and it came down to a trade-off. Please do go to that link for more information, and, if you do come up with a clearer statement or, better yet, some paper one way or another, I'd be grateful for the reference.
Let me thank Jon Harrop for looking this up, as I was assuming it was impossible. Well, maybe it is, and I couldn't find a proper link. Note, however, that it is not inheritance alone causing the problem.
F# is functional - It allows OO pretty well, but the design and philosophy is functional nevertheless. Examples:
Haskell-style functions
Automatic currying
Automatic generics
Type inference for arguments
It feels relatively clumsy to use F# in a mainly object-oriented way, so one could describe the main goal as to integrate OO into functional programming.
Scala is multi-paradigm with focus on flexibility. You can choose between authentic FP, OOP and procedural style depending on what currently fits best. It's really about unifying OO and functional programming.
There are quite a few points that you can use for comparing the two (or three). First, here are some notable points that I can think of:
Syntax
Syntactically, F# and OCaml are based on the functional programming tradition (space separated and more lightweight), while Scala is based on the object-oriented style (although Scala makes it more lightweight).
Integrating OO and FP
Both F# and Scala very smoothly integrate OO with FP (because there is no contradiction between these two!!) You can declare classes to hold immutable data (functional aspect) and provide members related to working with the data, you can also use interfaces for abstraction (object-oriented aspects). I'm not as familiar with OCaml, but I would think that it puts more emphasis on the OO side (compared to F#)
Programming style in F#
I think that the usual programming style used in F# (if you don't need to write .NET library and don't have other limitations) is probably more functional and you'd use OO features only when you need to. This means that you group functionality using functions, modules and algebraic data types.
Programming style in Scala
In Scala, the default programming style is more object-oriented (in the organization), however you still (probably) write functional programs, because the "standard" approach is to write code that avoids mutation.
What are the key differences between the approaches taken by Scala and F# to unify OO and FP paradigms?
The key difference is that Scala tries to blend the paradigms by making sacrifices (usually on the FP side) whereas F# (and OCaml) generally draw a line between the paradigms and let the programmer choose between them for each task.
Scala had to make sacrifices in order to unify the paradigms. For example:
First-class functions are an essential feature of any functional language (ML, Scheme and Haskell). All functions are first-class in F#. Member functions are second-class in Scala.
Overloading and subtypes impede type inference. F# provides a large sublanguage that sacrifices these OO features in order to provide powerful type inference when these features are not used (requiring type annotations when they are used). Scala pushes these features everywhere in order to maintain consistent OO at the cost of poor type inference everywhere.
Another consequence of this is that F# is based upon tried and tested ideas whereas Scala is pioneering in this respect. This is ideal for the motivations behind the projects: F# is a commercial product and Scala is programming language research.
As an aside, Scala also sacrificed other core features of FP such as tail-call optimization for pragmatic reasons due to limitations of their VM of choice (the JVM). This also makes Scala much more OOP than FP. Note that there is a project to bring Scala to .NET that will use the CLR to do genuine TCO.
What are the relative merits and demerits of each approach? If, in spite of the support for subtyping, F# can infer the types of function arguments then why can't Scala?
Type inference is at odds with OO-centric features like overloading and subtypes. F# chose type inference over consistency with respect to overloading. Scala chose ubiquitous overloading and subtypes over type inference. This makes F# more like OCaml and Scala more like C#. In particular, Scala is no more a functional programming language than C# is.
Which is better is entirely subjective, of course, but I personally much prefer the tremendous brevity and clarity that comes from powerful type inference in the general case. OCaml is a wonderful language but one pain point was the lack of operator overloading that required programmers to use + for ints, +. for floats, +/ for rationals and so on. Once again, F# chooses pragmatism over obsession by sacrificing type inference for overloading specifically in the context of numerics, not only on arithmetic operators but also on arithmetic functions such as sin. Every corner of the F# language is the result of carefully chosen pragmatic trade-offs like this. Despite the resulting inconsistencies, I believe this makes F# far more useful.
From this article on Programming Languages:
Scala is a rugged, expressive,
strictly superior replacement for
Java. Scala is the programming
language I would use for a task like
writing a web server or an IRC client.
In contrast to OCaml [or F#], which was a
functional language with an
object-oriented system grafted to it,
Scala feels more like an true hybrid
of object-oriented and functional
programming. (That is, object-oriented
programmers should be able to start
using Scala immediately, picking up
the functional parts only as they
choose to.)
I first learned about Scala at POPL
2006 when Martin Odersky gave an
invited talk on it. At the time I saw
functional programming as strictly
superior to object-oriented
programming, so I didn't see a need
for a language that fused functional
and object-oriented programming. (That
was probably because all I wrote back
then were compilers, interpreters and
static analyzers.)
The need for Scala didn't become
apparent to me until I wrote a
concurrent HTTPD from scratch to
support long-polled AJAX for yaplet.
In order to get good multicore
support, I wrote the first version in
Java. As a language, I don't think
Java is all that bad, and I can enjoy
well-done object-oriented programming.
As a functional programmer, however,
the lack of (or needlessly verbose)
support of functional programming
features (like higher-order functions)
grates on me when I program in Java.
So, I gave Scala a chance.
Scala runs on the JVM, so I could
gradually port my existing project
into Scala. It also means that Scala,
in addition to its own rather large
library, has access to the entire Java
library as well. This means you can
get real work done in Scala.
As I started using Scala, I became
impressed by how cleverly the
functional and object-oriented worlds
blended together. In particular, Scala
has a powerful case
class/pattern-matching system that
addressed pet peeves lingering from my
experiences with Standard ML, OCaml
and Haskell: the programmer can decide
which fields of an object should be
matchable (as opposed to being forced
to match on all of them), and
variable-arity arguments are
permitted. In fact, Scala even allows
programmer-defined patterns. I write a
lot of functions that operate on
abstract syntax nodes, and it's nice
to be able to match on only the
syntactic children, but still have
fields for things such as annotations
or lines in the original program. The
case class system lets one split the
definition of an algebraic data type
across multiple files or across
multiple parts of the same file, which
is remarkably handy.
Scala also
supports well-defined multiple
inheritance through class-like devices
called traits.
Scala also allows a
considerable degree of overloading;
even function application and array
update can be overloaded. In my
experience, this tends to make my
Scala programs more intuitive and
concise.
One feature that turns out to save a
lot of code, in the same way that type
classes save code in Haskell, is
implicits. You can imagine implicits
as an API for the error-recovery phase
of the type-checker. In short, when
the type checker needs an X but got a
Y, it will check to see if there's an
implicit function in scope that
converts Y into X; if it finds one, it
"casts" using the implicit. This makes
it possible to look like you're
extending just about any type in
Scala, and it allows for tighter
embeddings of DSLs.
From the above excerpt it is clear that Scala's approach to unify OO and FP paradigms is far more superior to that of OCaml or F#.
HTH.
Regards,
Eric.
The syntax of F# was taken from OCaml but the object model of F# was taken from .NET. This gives F# a light and terse syntax that is characteristic of functional programming languages and at the same time allows F# to interoperate with the existing .NET languages and .NET libraries very smoothly through its object model.
Scala does a similar job on the JVM that F# does on the CLR. However Scala has chosen to adopt a more Java-like syntax. This may assist in its adoption by object-oriented programmers but to a functional programmer it can feel a bit heavy. Its object model is similar to Java's allowing for seamless interoperation with Java but has some interesting differences such as support for traits.
If functional programming means programming with functions, then Scala bends that a bit. In Scala, if I understand correctly, you're programming with methods instead of functions.
When the class (and the object of that class) behind the method don't matter, Scala will let you pretend it's just a function. Perhaps a Scala language lawyer can elaborate on this distinction (if it even is a distinction), and any consequences.