F# has tail call elimination?

In this talk, in the first 8 minutes, Runar explains that Scala has problems with tail call elimination. This makes me wonder whether F# has similar problems. If not, why not?

The problem with Proper Tail Calls in Scala is one of engineering trade-offs. It would be easy to add PTCs to Scala: just add a sentence to the SLS. Voilà: PTCs in Scala. From a language design perspective, we are done.
Now the poor compiler writers need to implement that spec. Well, compiling into a language with PTCs is easy … but unfortunately, the JVM byte code isn't such a language. Okay, so what about GOTO? Nope. Continuations? Nope. Exceptions (which are known to be equivalent to Continuations)? Ah, now we are getting somewhere! So, we could use exceptions to implement PTCs. Or, alternatively, we could just not use the JVM call stack at all and implement our own stack.
After all, there are multiple Scheme implementations on the JVM, all of them support PTCs just fine. It's a myth that you cannot have PTCs on the JVM, just because the JVM doesn't support them. After all, x86 doesn't have them either, but nonetheless, there are languages running on x86 that have them.
So, if implementing PTCs on the JVM is possible, then why doesn't Scala have them? Like I said above, you could use exceptions or your own stack to implement them. But using exceptions for control flow or implementing your own stack means that everything which expects the JVM call stack to look a certain way would no longer work.
In particular, you would lose pretty much all interoperability with the Java tooling ecosystem (debuggers, visualizers, static analyzers). You would also have to build bridges to interoperate with Java libraries, which would be slow, so you lose interop with the Java library ecosystem as well.
But that is a major design goal of Scala! And that's why Scala doesn't have PTCs.
I call this "Hickey's Theorem", after Rich Hickey, the designer of Clojure who once said in a talk "Tail Calls, Interop, Performance – Pick Two."
You would also present the JIT compiler with some very unusual byte code patterns that it may not know how to optimize well.
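To make the exception technique concrete, here is a deliberately crude sketch (the names TailJump and run are made up for illustration) of unwinding the stack before every tail call, so that frames never pile up:

import scala.util.control.ControlThrowable

// Thrown instead of making a tail call; carries the next step as a thunk.
final class TailJump(val thunk: () => Int) extends ControlThrowable

// The driver catches each TailJump and runs the next step in a flat loop,
// so the JVM call stack never grows.
def run(start: () => Int): Int = {
  var next = start
  while (true) {
    try return next()
    catch { case tj: TailJump => next = tj.thunk }
  }
  sys.error("unreachable")
}

// Mutual recursion written in this style:
def even(n: Int): Int = if (n == 0) 1 else throw new TailJump(() => odd(n - 1))
def odd(n: Int): Int  = if (n == 0) 0 else throw new TailJump(() => even(n - 1))

run(() => even(100000))  // 1, with no StackOverflowError

It works, but a debugger attached to run sees one flat loop instead of a call stack, which is exactly the interop cost described above.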
If you were to port F# to the JVM, you would basically have to make exactly that choice: do you give up Tail Calls (you can't, because they are required by the Language Spec), do you give up Interop or do you give up Performance? On .NET, you can have all three, because Tail Calls in F# can simply be compiled into Tail Calls in MSIL. (Although the actual translation is more complex than that, and the implementation of Tail Calls in MSIL is buggy in some corner cases.)
This raises the question: why not add Tail Calls to the JVM? Well, this is very hard, due to a design flaw in the JVM byte code. The designers wanted the JVM byte code to have certain safety properties. But instead of designing the JVM byte code language in such a way that you cannot write an unsafe program in the first place (like, say, in Java, where you cannot write a program that violates pointer safety, because the language just doesn't give you access to pointers), JVM byte code in itself is unsafe and needs a separate byte code verifier to make it safe.
That byte code verifier is based on stack inspection, and Tail Calls change the stack. So, the two are very hard to reconcile, but the JVM simply doesn't work without the byte code verifier. It took a long time and some very smart people to finally figure out how to implement Tail Calls on the JVM without losing the byte code verifier (see A Tail-Recursive Machine with Stack Inspection by Clements and Felleisen and tail calls in the VM by John Rose (JVM lead designer)), so we have now moved from the stage where it was an open research problem to the stage where it is "just" an open engineering problem.
Note that Scala and some other languages do have intra-method direct tail-recursion. However, that is pretty boring, implementation-wise: it is just a while loop. Most targets have while loops or something equivalent; e.g. the JVM has intra-method GOTO. Scala also has the scala.util.control.TailCalls object, which is kind of a reified trampoline. (See Stackless Scala With Free Monads by Rúnar Óli Bjarnason for a more general version of this idea, which can eliminate all use of the stack, not just in tail calls.) This can be used to implement a tail-calling algorithm in Scala, but it is then not compatible with the JVM stack, i.e. it doesn't look like a recursive method call to other languages or to a debugger:
import scala.util.control.TailCalls._

def isEven(xs: List[Int]): TailRec[Boolean] =
  if (xs.isEmpty) done(true) else tailcall(isOdd(xs.tail))

def isOdd(xs: List[Int]): TailRec[Boolean] =
  if (xs.isEmpty) done(false) else tailcall(isEven(xs.tail))

isEven((1 to 100000).toList).result

def fib(n: Int): TailRec[Int] =
  if (n < 2) done(n) else for {
    x <- tailcall(fib(n - 1))
    y <- tailcall(fib(n - 2))
  } yield (x + y)

fib(40).result
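For contrast, the boring intra-method case mentioned above needs no trampoline. A minimal sketch, where the @tailrec annotation merely asks the compiler to verify that the rewrite into a loop actually happens:

import scala.annotation.tailrec

@tailrec
def sumList(xs: List[Int], acc: Int = 0): Int =
  if (xs.isEmpty) acc else sumList(xs.tail, acc + xs.head)

sumList((1 to 100000).toList)  // constant stack space (the Int sum wraps around, but there is no StackOverflowError)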
Clojure has the recur special form for explicit self-tail-recursion (it compiles to a loop), and a clojure.core/trampoline function that plays a role similar to TailCalls above.

F# does not have a problem with tail calls. Here is what it does:
If you have a single tail-recursive function, the compiler generates a loop with mutation, because this is faster than emitting the .tail instruction.
In other tail-call positions (e.g. when using continuations or two mutually recursive functions), it emits the .tail instruction, and so the tail call is handled by the CLR.
By default, tail-call optimization is turned off in Debug mode in Visual Studio, because this makes debugging easier (you can inspect the stack), but you can turn it on in project properties if needed.
In the old days, there were problems with the .tail instruction on some runtimes (CLR x64 and Mono), but those have since been fixed and everything works fine.

It turns out that for proper tail calls, you have to either compile in "Release Mode" as opposed to the default "Debug Mode", or open your project properties and, in the Build menu, scroll down and check "Generate tail calls". Thanks to Arnavion on IRC.freenode.net #fsharp for the tip.

Related

What can I do to my scala code so it will compile faster?

I have a large scala code base. (https://opensource.ncsa.illinois.edu/confluence/display/DFDL/Daffodil%3A+Open+Source+DFDL)
It's about 70K lines of Scala code. We are on Scala 2.11.7.
Development is getting difficult because the edit-compile-test-debug cycle is too long for small changes.
Incremental recompile times can be a minute, sometimes longer, and this is without optimization turned on. And that's after editing only a handful of files. Sometimes a very small change causes a huge recompilation.
So my question: What can I do by way of organizing the code, that will improve compilation time?
E.g., decomposing code into smaller files? Will this help?
E.g., more smaller libraries?
E.g., avoiding use of implicits? (we have very few)
E.g., avoiding use of traits? (we have tons)
E.g., avoiding lots of imports? (we have tons - package boundaries are pretty chaotic at this point)
Or is there really nothing much I can do about this?
I feel like this very long compilation is somehow due to an immense amount of recompiling caused by dependencies, and I am thinking about how to reduce false dependencies... but that's just a theory.
I'm hoping someone else can shed some light on something we might do which would improve compilation speed for incremental changes.
Here are the phases of the Scala compiler, along with slightly edited versions of their comments from the source code. Note that this compiler is unusual in being heavily weighted towards type checking and towards transformations that are more like desugarings. Other compilers include a lot of code for optimization, register allocation, and translation to IR.

Some top-level points:
There is a lot of tree rewriting. Each phase tends to read in a tree from the previous phase and transform it to a new tree. Symbols, in contrast, remain meaningful throughout the life of the compiler. So trees hold pointers to symbols, and not vice versa. Instead of rewriting symbols, new information gets attached to them as the phases progress.
Here is the list of phases from Global:
analyzer.namerFactory: SubComponent,
analyzer.typerFactory: SubComponent,
superAccessors,           // add super accessors
pickler,                  // serializes symbol tables
refchecks,                // perform reference and override checking, translate nested objects
liftcode,                 // generate reified trees
uncurry,                  // uncurry, translate function values to anonymous classes
tailCalls,                // replace tail calls by jumps
explicitOuter,            // replace C.this by explicit outer pointers, eliminate pattern matching
erasure,                  // erase generic types to Java 1.4 types, add interfaces for traits
lambdaLift,               // move nested functions to top level
constructors,             // move field definitions into constructors
flatten,                  // get rid of inner classes
mixer,                    // do mixin composition
cleanup,                  // some platform-specific cleanups
genicode,                 // generate portable intermediate code
inliner,                  // optimization: do inlining
inlineExceptionHandlers,  // optimization: inline exception handlers
closureElimination,       // optimization: get rid of uncalled closures
deadCode,                 // optimization: get rid of dead code
if (forMSIL) genMSIL else genJVM, // generate .class files
Some workarounds with the Scala compiler
Thus the Scala compiler has to do a lot more work than the Java compiler. In particular, some things make the Scala compiler drastically slower, including:
Implicit resolution. When scalac tries to find an implicit value for an implicit declaration, the search bubbles up through every enclosing scope, and this search time can be massive (particularly if you reference the same implicit value many times, and it is declared in some library all the way down your dependency chain). Compile time gets even worse when you take into account implicit trait resolution and type classes, which are used heavily by libraries such as scalaz and shapeless. (See the sketch after the next point.)
Using a huge number of anonymous classes (i.e. lambdas, blocks, anonymous functions) also adds to compile time, as do macros, obviously.
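As a small illustration of the first point (Show, render and the instances are made-up names), resolving a single type-class-style implicit can trigger a whole tree of further searches:

trait Show[A] { def show(a: A): String }

object Show {
  implicit val intShow: Show[Int] =
    new Show[Int] { def show(a: Int) = a.toString }
  implicit def listShow[A](implicit s: Show[A]): Show[List[A]] =
    new Show[List[A]] { def show(xs: List[A]) = xs.map(s.show).mkString("[", ",", "]") }
  implicit def pairShow[A, B](implicit sa: Show[A], sb: Show[B]): Show[(A, B)] =
    new Show[(A, B)] { def show(p: (A, B)) = "(" + sa.show(p._1) + "," + sb.show(p._2) + ")" }
}

def render[A](a: A)(implicit s: Show[A]): String = s.show(a)

// One call site; the compiler must find Show[List[(Int, List[Int])]],
// which recursively requires listShow -> pairShow -> intShow and listShow -> intShow.
render(List((1, List(2, 3))))  // "[(1,[2,3])]"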
A very nice writeup by Martin Odersky
Further, the Java and Scala compilers convert source code into JVM bytecode and do very little optimization. On most modern JVMs, once the program bytecode is run, it is converted into machine code for the computer architecture on which it is being run. This is called just-in-time compilation. The level of code optimization is, however, low with just-in-time compilation, since it has to be fast. To avoid recompiling, the so-called HotSpot compiler only optimizes parts of the code which are executed frequently.
A program might have different performance each time it is run. Executing the same piece of code (e.g. a method) multiple times in the same JVM instance might give very different performance results depending on whether the particular code was optimized in between the runs. Additionally, measuring the execution time of some piece of code may include the time during which the JIT compiler itself was performing the optimization, thus giving inconsistent results.
One common cause of performance deterioration is the boxing and unboxing that happens implicitly when passing a primitive type as an argument to a generic method, as well as frequent GC. There are several approaches to avoid these effects during measurement. The program should be run using the server version of the HotSpot JVM, which does more aggressive optimizations. VisualVM is a great choice for profiling a JVM application: it is a visual tool integrating several command-line JDK tools and lightweight profiling capabilities (however, Scala's abstractions are very complex, and unfortunately VisualVM does not yet present them well). Also watch out for code that spends a long time in collection methods which take predicates, such as exists and forall: each call may traverse an entire sequence.
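For example, a trivial sketch of fusing predicates so a sequence is traversed once instead of twice:

val xs = (1 to 1000000).toList
xs.exists(_ < 0) || xs.exists(_ > 2000000)   // worst case: two full traversals
xs.exists(x => x < 0 || x > 2000000)         // worst case: one full traversal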
Making the modules cohesive and less dependent on each other is also a viable solution. Mind that intermediate code generation is sometimes machine dependent, and various architectures give varied results.
An alternative: Typesafe has released Zinc, which separates the fast incremental compiler from sbt and lets Maven and other build tools use it. Using Zinc with the scala-maven-plugin has made compiling a lot faster.
A simple problem: Given a list of integers, remove the greatest one. Ordering is not necessary.
Below is one version of a solution (an average one, I guess).
def removeMaxCool(xs: List[Int]) = {
  val maxIndex = xs.indexOf(xs.max)
  xs.take(maxIndex) ::: xs.drop(maxIndex + 1)
}
It's Scala idiomatic, concise, and uses a few nice list functions. It's also very inefficient. It traverses the list at least 3 or 4 times.
Now consider this Java-like solution. It's also what a reasonable Java developer (or Scala novice) would write.
import scala.collection.mutable.ArrayBuffer

def removeMaxFast(xs: List[Int]) = {
  var res = ArrayBuffer[Int]()
  var max = xs.head
  var first = true
  for (x <- xs) {
    if (first) {
      first = false
    } else {
      if (x > max) {
        res.append(max)
        max = x
      } else {
        res.append(x)
      }
    }
  }
  res.toList
}
Totally non-Scala idiomatic, non-functional, non-concise, but it's very efficient. It traverses the list only once!
So trade-offs should also be prioritized, and sometimes you may have to write code like a Java developer if nothing else works.
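For what it's worth, there is often a middle ground. Here is a sketch (removeMaxFold is a made-up name) of a still-functional version that tracks the current maximum in a single foldLeft, plus one reverse at the end:

def removeMaxFold(xs: List[Int]): List[Int] = xs match {
  case Nil => Nil
  case h :: t =>
    // Carry (current max, everything else in reverse order) through one pass.
    val (_, rest) = t.foldLeft((h, List.empty[Int])) {
      case ((m, acc), x) => if (x > m) (x, m :: acc) else (m, x :: acc)
    }
    rest.reverse
}

removeMaxFold(List(3, 1, 5, 2))  // List(1, 3, 2)

Note that the relative order of the remaining elements can differ from removeMaxCool's, which the problem statement explicitly allows.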
Some ideas that might help - depends on your case and style of development:
Use incremental compilation: ~compile in SBT, or as provided by your IDE.
Use sbt-revolver and maybe JRebel to reload your app faster. Better suited for web apps.
Use TDD - rather than running and debugging the whole app write tests and only run those.
Break your project down into libraries/JARs. Use them as dependencies via your build tool: SBT/Maven/etc. Or a variation of this next...
Break your project into subprojects (SBT). Compile separately what's needed, or the root project if you need everything. Incremental compilation is still available. (See the build.sbt sketch after this list.)
Break your project down to microservices.
Wait for Dotty to solve your problem to some degree.
If everything fails don't use advanced Scala features that make compilation slower: implicits, metaprogramming, etc.
Don't forget to check that you are allocating enough memory and CPU for your Scala compiler. I haven't tried it, but maybe you can use RAM disk instead of HDD for your sources and compile artifacts (easy on Linux).
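As a sketch of the subproject idea (the module names core and parser are hypothetical), an sbt build where editing one module only recompiles that module and its dependents:

// build.sbt -- hypothetical module layout
lazy val core = project.in(file("core"))

lazy val parser = project
  .in(file("parser"))
  .dependsOn(core)   // editing parser code recompiles only parser, not core

lazy val root = project
  .in(file("."))
  .aggregate(core, parser)  // `sbt compile` from the root still builds everything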
You are touching on one of the main problems of object-oriented design (over-engineering). In my opinion you have to flatten your class-object-trait hierarchy and reduce the dependencies between classes. Break packages into different JAR files and use them as mini-libraries which are "frozen", and concentrate on new code.
Check out some videos from Brian Will, who makes a case against OO over-engineering,
e.g. https://www.youtube.com/watch?v=IRTfhkiAqPw (you can take the good points).
I don't agree with him 100% but it makes a good case against over-engineering.
Hope that helps.
You can try to use the Fast Scala Compiler.
Aside from minor code improvements (e.g. @tailrec annotations), depending on how brave you feel, you could also play around with Dotty, which boasts faster compile times among other things.

Is it possible/useful to transpile Scala to golang?

Scala Native has recently been released, but the garbage collector they use (for now) is extremely rudimentary and makes it unsuitable for serious use.
So I wonder: why not just transpile Scala to Go (a la Scala.js)? It's going to be a fast, portable runtime. And their GC is getting better and better. Not to mention the inheritance of a great concurrency model: channels and goroutines.
So why did scala-native choose to go so low level with LLVM?
What would be the catch with a golang transpiler?
There are two kinds of languages that are good targets for compilers:
Languages whose semantics closely match the source language's semantics.
Languages which have very low-level and thus very general semantics (or one might argue: no semantics at all).
Examples for #1 include: compiling ECMAScript 2015 to ECMAScript 5 (most language additions were specifically designed as syntactic sugar for existing features, you just have to desugar them), compiling CoffeeScript to ECMAScript, compiling TypeScript to ECMAScript (basically, after type checking, just erase the types and you are done), compiling Java to JVM byte code, compiling C♯ to CLI CIL bytecode, compiling Python to CPython bytecode, compiling Python to PyPy bytecode, compiling Ruby to YARV bytecode, compiling Ruby to Rubinius bytecode, compiling ECMAScript to SpiderMonkey bytecode.
Examples for #2 include: machine code for a general purpose CPU (RISC even more so), C--, LLVM.
Compiling Scala to Go fits neither of the two. Their semantics are very different.
You need either a language with powerful low-level semantics as the target language, so that you can build your own semantics on top, or you need a language with closely matching semantics, so that you can map your own semantics into the target language.
In fact, even JVM bytecode is already too high-level! It has constructs such as classes that do not match constructs such as Scala's traits, so there has to be a fairly complex encoding of traits into classes and interfaces. Likewise, before invokedynamic, it was actually pretty much impossible to represent dynamic dispatch on structural types in JVM bytecode. The Scala compiler had to resort to reflection, or in other words, deliberately stepping outside of the semantics of JVM bytecode (which resulted in a terrible performance overhead for method dispatch on structural types compared to method dispatch on other class types, even though both are the exact same thing).
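For instance, a structural-type dispatch looks like any other call in the source, but compiles to reflection under the hood (makeItQuack is a made-up example):

import scala.language.reflectiveCalls

// A structural type: "anything with a quack(): String method".
def makeItQuack(d: { def quack(): String }): String =
  d.quack()  // compiles to a reflective method lookup and invocation, not invokevirtual

class Duck { def quack(): String = "quack" }
makeItQuack(new Duck)  // "quack", but noticeably slower than an ordinary call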
Proper Tail Calls are another example: we would like to have them in Scala, but because JVM bytecode is not powerful enough to express them without a very complex mapping (basically, you have to forego using the JVM's call stack altogether and manage your own stack, which destroys both performance and Java interoperability), it was decided to not have them in the language.
Go has some of the same problems: in order to implement Scala's expressive non-local control-flow constructs such as exceptions or threads, we need an equally expressive non-local control-flow construct to map to. For typical target languages, this "expressive non-local control-flow construct" is either continuations or the venerable GOTO. Go has GOTO, but it is deliberately limited in its "non-localness". For writing code by humans, limiting the expressive power of GOTO is a good thing, but for a compiler target language, not so much.
It is very likely possible to rig up powerful control-flow using goroutines and channels, but now we are already leaving the comfortable confines of just mapping Scala semantics to Go semantics, and start building Scala high-level semantics on top of Go high-level semantics that weren't designed for such usage. Goroutines weren't designed as a general control-flow construct to build other kinds of control-flow on top of. That's not what they're good at!
So why did scala-native choose to go so low level with LLVM?
Because that's precisely what LLVM was designed for and is good at.
What would be the catch with a golang transpiler?
The semantics of the two languages are too different for a direct mapping and Go's semantics are not designed for building different language semantics on top of.
their GC is getting better and better
So can Scala Native's. As far as I understand, the current choice of the Boehm–Demers–Weiser collector is basically one of laziness: it's there, it works, and you can drop it into your code and it'll just do its thing.
Note that changing the GC is under discussion. There are other GCs which are designed as drop-ins rather than being tightly coupled to the host VM's object layout. E.g. IBM is currently in the process of re-structuring J9, their high-performance JVM, into a set of loosely coupled, independently re-usable "runtime building blocks" components and releasing them under a permissive open source license.
The project is called "Eclipse OMR" (source on GitHub) and it is already production-ready: the Java 8 implementation of IBM J9 was built completely out of OMR components. There is a Ruby + OMR project which demonstrates how the components can easily be integrated into an existing language runtime, because the components themselves assume no language semantics and no specific memory or object layout. The commit which swaps out the GC and adds a JIT and a profiler clocks in at just over 10000 lines. It isn't production-ready, but it boots and runs Rails. They also have a similar project for CPython (not public yet).
why not just transpile Scala to Go (a la Scala.js)?
Note that Scala.js has a lot of the same problems I mentioned above. But they are doing it anyway, because the gain is huge: you get access to every web browser on the planet. There is no comparable gain for a hypothetical Scala.go.
There's a reason why there are initiatives for getting low-level semantics into the browser such as asm.js and WebAssembly, precisely because compiling a high-level language to another high-level language always has this "semantic gap" you need to overcome.
In fact, note that even for lowish-level languages that were specifically designed as compilation targets for a specific language, you can still run into trouble. E.g. Java has generics, JVM bytecode doesn't. Java has inner classes, JVM bytecode doesn't. Java has anonymous classes, JVM bytecode doesn't. All of these have to be encoded somehow, and specifically the encoding (or rather non-encoding) of generics has caused all sorts of pain.

Why is Scala's Type system not a Library in Clojure

I've heard people claim that:
Scala's type system is amazing (existential types, variant, co-variant)
Because of the power of macros, everything is a library in Clojure (pattern matching, logic programming, non-determinism, ...)
Question:
If both assertions are true, why is Scala's type system not a library in Clojure? Is it because:
types are one of those things that do not work well as a library? [i.e. the changes would somehow have to be threaded through every existing Clojure library, including clojure.core?]
is Scala's notion of types fundamentally incompatible with clojure protocol / records?
... ?
It's an interesting question.
You are certainly right about Scala having an amazing type system, and about Clojure being phenomenal for meta-programming and extension of the language (although that is about more than just macros....).
A few reasons I can think of:
Clojure is a dynamically typed language, while Scala is a statically typed language. Powerful type inference isn't of much use in a language where you can assume relatively little about the types of your inputs.
Clojure already has a very interesting project to add typing as a library (Typed Clojure) which looks very promising - however it's very different in approach to Scala as it is designed for a dynamic language from the start (inspired more by Typed Racket, I believe).
Clojure philosophy actually discourages certain OOP concepts (particularly implementation inheritance, mutable objects, and data encapsulation). A type system that supports these things (as Scala does) wouldn't be a good fit for Clojure idioms - at best they would be ignored, but they could easily encourage a style of development that would cause people to run into severe problems later.
Clojure already provides tools that solve many of the problems you would typically solve with types in other languages - e.g. the use of protocols for polymorphism.
There's a strong focus in the Clojure community on simplicity (in the sense of the excellent video "Simple Made Easy" - see particularly the slide at 39:30). While Scala's type system is certainly amazing, I think it's a stretch to describe it as "simple".
Putting in a Scala-style type system would probably require a complete rewrite of the Clojure compiler and make it substantially more complex. Nobody seems to have signed up so far to take on that particular challenge... and there's a risk that even if someone were willing and able to do this then the changes could be rejected for the various cultural / technical reasons covered above.
In the absence of a major change to Clojure itself (which I think would be unlikely), one interesting possibility would be to create a DSL within Clojure that provided Scala-style type inference for a specific domain and compiled this DSL directly to optimised Java bytecode. I could see that being a useful approach for specific problem domains (large-scale numerical data crunching with big matrices, for example).
To simply answer your question "... why is Scala's type system not a library in Clojure?":
Because the type system is part of the Scala compiler, not of the Scala library. The whole power of Scala's type system exists only at compile time. The JVM has no support for things like that, because of type erasure, and also because it would simply slow down execution. And there is no need for it: if you have a statically typed language, you don't need type information at runtime, unless you want to do dirty stuff.
edit:
@mikera: the JVM is surely capable of running the Scala compiler; I did not say anything like that. I just said that the JVM has no support for type systems like that. It does not even support generics. At runtime all these types are gone. The compiler checks the correctness of a program and removes all the higher-kinded types / generics.
example:
val xs: List[Int] = List(1,2,3,4)
val x1: Int = xs.head
will at runtime look like this:
val xs: List = List.apply(1,2,3,4)
val x1: Int = xs.head.asInstanceOf[Int]
But it doesn't matter, because the compiler checked it before. You can only get in trouble here when you use reflection or unchecked casts, because then you could put any value in the list, and it would break at runtime exactly where the value is cast to Int.
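To see that in action, a single unchecked cast is enough (a deliberately broken sketch):

val strings = List("boom")
val xs: List[Int] = strings.asInstanceOf[List[Int]]  // compiles and succeeds: after erasure both are just List
val x1: Int = xs.head  // throws ClassCastException here, at the compiler-inserted cast to Int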
And this is one of the reasons why the Scala type system is not part of the Scala library, but built into the compiler.
Also, the question of the OP was "... why is Scala's type system not a library in Clojure?" and not "Is it possible to create a type system such as Scala's for Clojure?", and I answered precisely that question.

Why doesn't Scala have an IO Monad?

I'm wondering why Scala does not have an IO Monad like Haskell.
So, in Scala the return type of method readLine is String whereas in Haskell the comparable function getLine has the return type IO String.
There is a similar question about this topic, but its answer is not satisfying:
Using IO is certainly not the dominant style in scala.
Can someone explain this a bit further? What was the design decision for not including IO Monads to Scala?
Because Scala is not pure (and has no means to enforce that a function is pure, like D has) and allows side effects. It interoperates closely with Java (e.g. reuses big parts of the Java libraries). Scala is not lazy, so there is no problem regarding execution order like in Haskell (e.g. no need for >> or seq). Under these circumstances introducing the IO Monad would make life harder without gaining much.
But if you really have applications where the IO monad has significant advantages, nothing stops you from writing your own implementation or to use scalaz. See e.g. http://apocalisp.wordpress.com/2011/12/19/towards-an-effect-system-in-scala-part-2-io-monad/
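To illustrate what such an implementation boils down to, here is a minimal hand-rolled sketch (not scalaz's actual IO): a value of type IO[A] merely describes an effect, and nothing runs until you call unsafeRun:

// A value of IO[A] is just a suspended computation producing an A.
final class IO[A](val unsafeRun: () => A) {
  def map[B](f: A => B): IO[B] = new IO(() => f(unsafeRun()))
  def flatMap[B](f: A => IO[B]): IO[B] = new IO(() => f(unsafeRun()).unsafeRun())
}
object IO {
  def apply[A](a: => A): IO[A] = new IO(() => a)
}

val program: IO[Unit] = for {
  line <- IO(scala.io.StdIn.readLine())
  _    <- IO(println("You said: " + line))
} yield ()

program.unsafeRun()  // only here do the effects actually happen

Composing IO values is pure; only unsafeRun at the edge of the program executes them.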
[Edit]
Why wasn't it done as a lazy and pure language?
This would have been perfectly possible (e.g. look at Frege, a JVM language very similar to Haskell). Of course this would make Java interoperability more complicated, but I don't think that is the main reason. I think a lazy and pure language is a totally cool thing, but simply too alien to most Java programmers, who are the target audience of Scala. Scala was designed to cooperate with Java's object model (which is the exact opposite of pure and lazy), allowing functional and mixed functional-OO programming, but not enforcing it (which would have chased away almost all Java programmers). In fact there is no point in having yet another completely functional language: there are Haskell, Erlang, F# (and other MLs) and Clojure (and other Schemes / Lisps), which are all very sophisticated, stable and successful, and won't be easily replaced by a newcomer.

Debunking Scala myths [closed]

What are the most commonly held misconceptions about the Scala language, and what counter-examples exist to these?
UPDATE
I was thinking more about various claims I've seen, such as "Scala is dynamically typed" and "Scala is a scripting language".
I accept that "Scala is [Simple/Complex]" might be considered a myth, but it's also a viewpoint that's very dependent on context. My personal belief is that it's the very same features that can make Scala appear either simple or complex depending oh who's using them. Ultimately, the language just offers abstractions, and it's the way that these are used that shapes perceptions.
Not only that, but it has a certain tendency to inflame arguments, and I've not yet seen anyone change a strongly-held viewpoint on the topic...
Myth: That Scala’s “Option” and Haskell’s “Maybe” types won’t save you from null. :-)
Debunked: Why Scala's "Option" and Haskell's "Maybe" types will save you from null by James Iry.
Myth: Scala supports operator overloading.
Actually, Scala just has very flexible method naming rules and infix syntax for method invocation, with special rules for determining method precedence when the infix syntax is used with 'operators'. This subtle distinction has critical implications for the utility and potential for abuse of this language feature compared to true operator overloading (a la C++), as explained more thoroughly in James Iry's answer to this question.
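A quick illustration: the + below is an ordinary method, and the infix call is just sugar for a method invocation:

class Vec(val x: Double, val y: Double) {
  def +(that: Vec): Vec = new Vec(x + that.x, y + that.y)  // just a method named +
}

val v = new Vec(1, 2) + new Vec(3, 4)   // sugar for (new Vec(1, 2)).+(new Vec(3, 4))

Precedence is then determined by the first character of the method name, which is the special rule referred to above.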
Myth: methods and functions are the same thing.
In fact, a function is a value (an instance of one of the FunctionN classes), while a method is not. Jim McBeath explains the differences in greater detail. The most important practical distinctions are:
Only methods can have type parameters
Only methods can take implicit arguments
Only methods can have named and default parameters
When referring to a method, an underscore is often necessary to distinguish method invocation from partial function application (e.g. str.length evaluates to a number, while str.length _ evaluates to a zero-argument function).
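A short REPL-style sketch of that last point:

val s = "hello"
val n = s.length    // invokes the method: 5
val f = s.length _  // eta-expansion: f is a function value of type () => Int
f()                 // 5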
I disagree with the argument that Scala is hard because you can use very advanced features to do hard stuff with it. The scalability of Scala means that you can write DSL abstractions and high-level APIs in Scala itself that would otherwise need a language extension. So to be fair you need to compare Scala libraries to other languages' compilers. People don't say that C# is hard because (I assume; I don't have first-hand knowledge of this) the C# compiler is pretty impenetrable. For Scala it's all out in the open. But we need to get to a point where we make clear that most people don't need to write code at this level, nor should they.
I think a common misconception amongst many scala developers, those at EPFL (and yourself, Kevin) is that "scala is a simple language". The argument usually goes something like this:
scala has few keywords
scala reuses the same few constructs (e.g. PartialFunction syntax is used as the body of a catch block)
scala has a few simple rules which allow you to create library code (which may appear as if the language has special keywords/constructs). I'm thinking here of implicits; methods containing colons; allowed identifier symbols; the equivalence of X(a, b) and a X b with extractors. And so on
scala's declaration-site variance means that the type system just gets out of your way. No more wildcards and ? super T
My personal opinion is that this argument is completely and utterly bogus. Scala's type system taken together with implicits allows one to write frankly impenetrable code for the average developer. Any suggestion otherwise is just preposterous, regardless of what the above "metrics" might lead you to think. (Note here that those who I've seen scoffing at the non-complexity of Java on Twitter and elsewhere happen to be uber-clever types who, it sometimes seems, had a grasp of monads, functors and arrows before they were out of short pants).
The obvious arguments against this are (of course):
you don't have to write code like this
you don't have to pander to the average developer
Of these, it seems to me that only #2 is valid. Whether or not you write code quite as complex as scalaz, I think it's just silly to use the language (and continue to use it) with no real understanding of the type system. How else can one get the best out of the language?
There is a myth that Scala is difficult because Scala is a complex language.
This is false--by a variety of metrics, Scala is no more complex than Java. (Size of grammar, lines of code or number of classes or number of methods in the standard API, etc..)
But it is undeniably the case that Scala code can be ferociously difficult to understand. How can this be, if Scala is not a complex language?
The answer is that Scala is a powerful language. Unlike Java, which has many special constructs (like enums) that accomplish one particular thing--and requires you to learn specialized syntax that applies just to that one thing, Scala has a variety of very general constructs. By mixing and matching these constructs, one can express very complex ideas with very little code. And, unsurprisingly, if someone comes along who has not had the same complex idea and tries to figure out what you're doing with this very compact code, they may find it daunting--more daunting, even, than if they saw a couple of pages of code to do the same thing, since then at least they'd realize how much conceptual stuff there was to understand!
There is also an issue of whether things are more complex than they really need to be. For example, some of the type gymnastics present in the collections library make the collections a joy to use but perplexing to implement or extend. The goals here are not particularly complicated (e.g. subclasses should return their own types), but the methods required (higher-kinded types, implicit builders, etc.) are complex. (So complex, in fact, that Java just gives up and doesn't try, rather than doing it "properly" as in Scala. Also, in principle, there is hope that this will improve in the future, since the method can evolve to more closely match the goal.) In other cases, the goals are complex; list.filter(_<5).sorted.grouped(10).flatMap(_.tail.headOption) is a bit of a mess, but if you really want to take all numbers less than 5, and then take every 2nd number out of 10 in the remaining list, well, that's just a somewhat complicated idea, and the code pretty much says what it does if you know the basic collections operations.
Summary: Scala is not complex, but it allows you to compactly express complex ideas. Compact expression of complex ideas can be daunting.
There is a myth that Scala is non-deployable, whereas a wide range of third-party Java libraries can be deployed without a second thought.
To the extent that this myth exists, I suspect it exists among people who are not accustomed to separating a virtual machine and API from a language and compiler. If java == javac == Java API in your mind, you might get a little nervous if someone suggests using scalac instead of javac, because you see how nicely your JVM runs.
Scala ends up as JVM bytecode, plus its own custom library. There's no reason to be any more worried about deploying Scala on a small scale or as part of some other large project than there is in deploying any other library that may or may not stay compatible with whichever JVM you prefer. Granted, the Scala development team is not backed by quite as much force as the Google collections or Apache Commons, but it's got at least as much weight behind it as things like the Java Advanced Imaging project.
Myth:
def foo() = "something"
and
def bar = "something"
are the same.
It is not: you can call foo(), but bar() tries to call the apply method of StringLike with no arguments, which results in an error.
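In fact, since bar evaluates to a String, and StringLike has apply(n: Int): Char, calling bar with an argument even compiles, which makes the asymmetry easy to see:

foo()    // "something"
bar      // "something"
bar(0)   // 's': sugar for bar.apply(0), i.e. "something".apply(0)
// bar() // does not compile: the apply method needs an index argument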
Some common misconceptions related to the Actors library:
Actors handle incoming messages in parallel, in multiple threads / against a thread pool. (In fact, handling messages in multiple threads is contrary to the actor concept and may lead to race conditions. All messages are handled sequentially in one thread: thread-based actors use one thread both for mailbox processing and execution, while event-based actors may share one VM thread for execution, using a multi-threaded executor to schedule mailbox processing.)
Uncaught exceptions don't change an actor's behavior/state. (In fact, all uncaught exceptions terminate the actor.)
Myth: You can replace a fold with a reduce when computing something like a sum from zero.
This is a common mistake/misconception among new users of Scala, particularly those without prior functional programming experience. The following expressions are not equivalent:
seq.foldLeft(0)(_+_)
seq.reduceLeft(_+_)
The two expressions differ in how they handle the empty sequence: the fold produces a valid result (0), while the reduce throws an exception.
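Concretely:

Seq(1, 2, 3).foldLeft(0)(_ + _)    // 6
Seq(1, 2, 3).reduceLeft(_ + _)     // 6
Seq.empty[Int].foldLeft(0)(_ + _)  // 0
Seq.empty[Int].reduceLeft(_ + _)   // throws UnsupportedOperationException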
Myth: Pattern matching doesn't fit well with the OO paradigm.
Debunked here by Martin Odersky himself. (Also see this paper - Matching Objects with Patterns - by Odersky et al.)
Myth: this.type refers to the same type represented by this.getClass.
As an example of this misconception, one might assume that in the following code the type of v.me is B:
trait A { val me: this.type = this }
class B extends A
val v = new B
In reality, this.type refers to the type whose only instance is this. In general, x.type is the singleton type whose only instance is x. So in the example above, the type of v.me is v.type. The following session demonstrates the principle:
scala> val s = "a string"
s: java.lang.String = a string
scala> var v: s.type = s
v: s.type = a string
scala> v = "another string"
<console>:7: error: type mismatch;
found : java.lang.String("another string")
required: s.type
v = "another string"
Scala has type inference and refinement types (structural types), whereas Java does not.
The myth is busted by James Iry.
Myth: that Scala is highly scalable, without qualifying what forms of scalability.
Scala may indeed be highly scalable in terms of the ability to express higher-level denotational semantics, and this makes it a very good language for experimentation and even for scaling production at the project-level scale of top-down coordinated compositionality.
However, every referentially opaque language (i.e. one that allows mutable data structures) is imperative (not declarative) and will not scale to WAN-level, bottom-up, uncoordinated compositionality and security. In other words, imperative languages are compositional (and security) spaghetti w.r.t. uncoordinated development of modules. I realize such uncoordinated development is perhaps currently considered by most to be a "pipe dream" and thus perhaps not a high priority. And this is not to disparage the benefit to compositionality (i.e. eliminating corner cases) that higher-level semantic unification can provide, e.g. a category theory model for the standard library.
There will possibly be significant cognitive dissonance for many readers, especially since there are popular misconceptions about imperative vs. declarative (i.e. mutable vs. immutable) and eager vs. lazy: e.g. monadic semantics are never inherently imperative, yet it is often claimed that they are. Yes, in Haskell the IO monad is imperative, but its being imperative has nothing to do with its being a monad.
I explained this in more detail in the "Copute Tutorial" and "Purity" sections, which is either at the home page or temporarily at this link.
My point is that I am very grateful Scala exists, but I want to clarify what Scala scales and what it does not. I need Scala for what it does well; for me it is the ideal platform to prototype a new declarative language, but Scala itself is not exclusively declarative, and afaik referential transparency can't be enforced by the Scala compiler, other than by remembering to use val everywhere.
I think my point applies to the complexity debate about Scala. I have found (so far, and mostly conceptually, since my actual experience with my new language is limited) that removing mutability and loops, while retaining diamond multiple-inheritance subtyping (which Haskell doesn't have), radically simplifies the language. For example, the Unit fiction disappears, and afaics a slew of other issues and constructs become unnecessary, e.g. a non-category-theory standard library, for comprehensions, etc.