Compiler cache for Scala? - scala

Compilation in Scala is fairly slow. Are there any hopes to make it faster?
One thing that comes to mind is a Scala equivalent of ccache: a cache that lets the compiler skip recompiling some parts. I know that type inference makes things more complicated, but I wonder whether it is feasible at all. Perhaps caching should be done at a different level (e.g. the AST), or it needs some kind of preprocessing.
I would be happy to see some estimates of how much time could potentially be saved if that kind of tool existed. What challenges would need to be solved to build it?

As well as SBT which only recompiles what's needed, JRebel helps to solve this problem and has Scala support.

Related

How to package/publish add-on for two incompatible versions of underlying scala library?

I'm developing a Scala project at https://github.com/jonaskoelker/equate/ which gives you equality assertions for ScalaTest which print a diff of observed vs. expected if they're unequal. This is particularly useful for long strings and large case classes.
I'd like to publish one version of equate for ScalaTest v3.0.8 and one for ScalaTest v3.1.1.
What are best practices for doing so? My web searches came up empty. My own first idea is to publish two things with different names, where the name says which version of ScalaTest each thing is compatible with. Is there a better way?
My way seems rather low-tech. It seems to me that some grunt work could be automated away if the ScalaTest version information was encoded some other way. This seems obvious enough that someone else has probably thought about it and done something about it. I'd like to know what, such that I can release my code the smart way rather than the dumb way.

When does IntelliJ's Scala incremental compilation happen?

When does IntelliJ's Scala incremental compilation happen? I notice that making changes to a file does not cause the corresponding .class files (in /target) to be updated. When does this happen?
I think you misunderstand how Scala incremental compilation works.
There are 2 different things that might be called "IntelliJ's Scala incremental compilation":
1) Proper Scala incremental compilation, which is more or less a set of typical strategies, applicable to different programming languages, for not recompiling everything from scratch when you hit the Compile button. The main idea is that the build system might notice that certain files and all their dependencies haven't changed since the last compilation, so you don't have to recompile them and can reuse the result of the last compilation instead. Those heuristics are actually complicated for Scala as it is a complicated language. Some ideas on what can be done are described in the SBT document "Understanding Incremental Recompilation". At some point JetBrains decided that they are smarter and implemented their own set of heuristics, which they claim are better (i.e. incremental compilation is faster), so now you choose between SBT-based and IDEA-based incremental compilation under the Scala Compiler settings. But it still only works when you hit Compile (or Run or Debug or something similar). This is not something IDEA does in the background.
2) There is another thing specific to IntelliJ IDEA that also requires a kind of incremental recompilation, and this one works in almost real time. It is the syntax highlighting feature implemented by IDEA's Scala plugin, and it requires immediate re-processing of all the files you change in a way similar to, but not exactly the same as, what the real compiler does. You are not supposed to look into the details of that process (unless you are going to develop the Scala plugin itself). What that process provides is some syntax structure of the code, but not the actual .class files.
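To make the first idea concrete, here is a deliberately naive sketch of dependency-based invalidation. It is not how SBT or IDEA actually decide what to rebuild (their heuristics are finer-grained); the object name, the function name and the file names are all hypothetical. Given the set of changed files and a map from each file to the files it depends on, it marks everything that transitively depends on a change:

```scala
object IncrementalSketch {
  // deps maps each source file to the files it depends on (hypothetical data).
  def needsRecompile(changed: Set[String], deps: Map[String, Set[String]]): Set[String] = {
    // Repeatedly add any file that depends on something already marked dirty,
    // until no new file is added.
    @annotation.tailrec
    def loop(dirty: Set[String]): Set[String] = {
      val next = dirty ++ deps.collect { case (file, ds) if ds.exists(dirty) => file }
      if (next == dirty) dirty else loop(next)
    }
    loop(changed)
  }
}
```

Everything outside the returned set can keep its .class files from the previous compilation.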

When, exactly, does Scala code perform heap allocations?

I've written a lot of high-performance Java code, and past experience has taught me that one of the most important things for getting peak performance is to be fully aware of the load you're putting on the garbage collector, and work to reduce that load in critical loops. For example, cryptographic hash functions should be written in such a way that they never allocate any memory.
Prior to Java 1.5 it was very easy to know exactly when Java code would be compiled into JVM bytecodes that performed heap allocation: only the "new" keyword in Java source code would result in a heap allocation. Post-Java-1.5 you can just look at the "desugaring" of the new language features (like int->Integer autoconversion).
I'm having a really, really, really hard time finding an equivalent answer for Scala, and this is a problem. Did I miss it? Where can I find a clear, comprehensive explanation of exactly when the Scala compiler will produce heap-allocating bytecodes? And no, the scala compiler source code isn't an answer since it changes over time -- I'm asking about the language rather than one specific compiler.
Edit: forgot to mention that the Java primitive "+" applied to Strings is a language-level construct that incurs heap allocation. So pre-Java-1.5 it should be "new and +-on-String".
I highly recommend getting to know the program JD-GUI, which decompiles .class files into java-ish code and lets you inspect it. I've learned a lot about how Scala code is compiled from looking at that.
http://jd.benow.ca/
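As a small illustration of the kind of code worth decompiling, the sketch below contrasts a loop over a primitive array with code that goes through generic collections. The object and method names are made up for this example, and the comments describe typical behaviour of scalac on the JVM rather than anything guaranteed by the language:

```scala
object AllocationSketch {
  // Array[Int] is a JVM int[] and the locals are primitive ints,
  // so this loop is not expected to allocate.
  def sumPrimitive(xs: Array[Int]): Int = {
    var total = 0
    var i = 0
    while (i < xs.length) {
      total += xs(i)
      i += 1
    }
    total
  }

  // List[Int] is generic: elements are stored as boxed java.lang.Integer
  // values, and every cons cell is a heap object. The function literal
  // may also be materialised as an object.
  def sumBoxed(xs: List[Int]): Int = xs.foldLeft(0)(_ + _)

  // Widening an Int to Any boxes it, with no "new" in the source.
  // (Small values may come from the Integer cache instead of a fresh allocation.)
  def boxed(n: Int): Any = n
}
```

Compiling this and opening the resulting .class files in a decompiler shows where boxing calls appear.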

intellij idea 11, scala slow execution [duplicate]

I've been programming in Scala for a while and I like it, but one thing I'm annoyed by is the time it takes to compile programs. It seems like a small thing, but with Java I could make small changes to my program, click the run button in NetBeans, and BOOM, it's running; over time, compiling in Scala seems to consume a lot of time. I hear that with many large projects a scripting language becomes very important because of the time compiling takes, a need that I didn't see arising when I was using Java.
But I'm coming from Java, which, as I understand it, is faster than any other compiled language, and is fast for the very reason I switched to Scala (it's a very simple language).
So I wanted to ask: can I make Scala compile faster, and will scalac ever be as fast as javac?
There are two aspects to the (lack of) speed for the Scala compiler.
1) Greater startup overhead
- Scalac itself consists of a LOT of classes which have to be loaded and JIT-compiled.
- Scalac has to search the classpath for all root packages and files. Depending on the size of your classpath this can take one to three extra seconds.
Overall, expect a startup overhead of scalac of 4-8 seconds, longer if you run it the first time so disk-caches are not filled.
Scala's answer to startup overhead is to either use fsc or to do continuous building with sbt. IntelliJ needs to be configured to use either option, otherwise its overhead even for small files is unreasonably large.
2) Slower compilation speed. Scalac manages about 500 up to 1000 lines/sec. Javac manages about 10 times that. There are several reasons for this.
- Type inference is costly, in particular if it involves implicit search.
- Scalac has to do type checking twice; once according to Scala's rules and a second time after erasure according to Java's rules.
- Besides type checking there are about 15 transformation steps to go from Scala to Java, which all take time.
- Scala typically generates many more classes per given file size than Java, in particular if functional idioms are heavily used. Bytecode generation and class writing takes time.
On the other hand, a 1000 line Scala program might correspond to a 2-3K line Java program, so some of the slower speed when counted in lines per second has to be balanced against more functionality per line.
We are working on speed improvements (for instance by generating class files in parallel), but one cannot expect miracles on this front. Scalac will never be as fast as javac.
I believe the solution will lie in compile servers like fsc in conjunction with good dependency analysis so that only the minimal set of files has to be recompiled. We are working on that, too.
The Scala compiler is more sophisticated than Java's, providing type inference, implicit conversion, and a much more powerful type system. These features don't come for free, so I wouldn't expect scalac to ever be as fast as javac. This reflects a trade-off between the programmer doing the work and the compiler doing the work.
That said, compile times have already improved noticeably going from Scala 2.7 to Scala 2.8, and I expect the improvements to continue now that the dust has settled on 2.8. This page documents some of the ongoing efforts and ideas to improve the performance of the Scala compiler.
Martin Odersky provides much more detail in his answer.
You should be aware that Scala code takes at least an order of magnitude longer to compile than Java. The reasons for this are as follows:
- Naming conventions (a file XY.scala need not contain a class called XY and may contain multiple top-level classes). The compiler may therefore have to search more source files to find a given class/trait/object identifier.
- Implicits - heavy use of implicits means the compiler needs to search every in-scope implicit conversion for a given method and rank them to find the "right" one (i.e. the compiler has a massively increased search domain when locating a method).
- The type system - the Scala type system is way more complicated than Java's and hence takes more CPU time.
- Type inference - type inference is computationally expensive and a job that javac does not need to do at all.
- scalac includes an 8-bit simulator of a fully armed and operational battle station, viewable using the magic key combination CTRL-ALT-F12 during the GenICode compilation phase.
The best way to do Scala is with IDEA and SBT. Set up an elementary SBT project (which it'll do for you, if you like) and run it in automatic compile mode (command ~compile) and when you save your project, SBT will recompile it.
You can also use the SBT plug-in for IDEA and attach an SBT action to each of your Run Configurations. The SBT plug-in also gives you an interactive SBT console within IDEA.
Either way (SBT running externally or SBT plug-in), SBT stays running and thus all the classes used in building your project get "warmed up" and JIT-ed and the start-up overhead is eliminated. Additionally, SBT compiles only source files that need it. It is by far the most efficient way to build Scala programs.
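For reference, an elementary SBT project can be as small as a single build.sbt file. The project name and Scala version below are placeholders, not values taken from this thread:

```scala
// build.sbt (minimal sketch; name and version are placeholder values)
name := "hello"

// Pick whichever Scala version your project actually targets.
scalaVersion := "2.13.12"
```

With that file in the project root, start sbt once and type ~compile at its prompt; it then stays resident and recompiles whenever a source file is saved.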
The latest revisions of Scala-IDE (Eclipse) are much better at managing incremental compilation.
See "What’s the best Scala build system?" for more.
The other solution is to integrate fsc - Fast offline compiler for the Scala 2 language - (as illustrated in this blog post) as a builder in your IDE.
But not directly in Eclipse, though, as Daniel Spiewak mentions in the comments:
You shouldn't be using FSC within Eclipse directly, if only because Eclipse is already using FSC under the surface.
FSC is basically a thin layer on top of the resident compiler which is precisely the mechanism used by Eclipse to compile Scala projects.
Finally, as Jackson Davis reminds me in the comments:
sbt (Simple Build Tool) also includes some kind of "incremental" compilation (through triggered execution), even though it is not perfect, and enhanced incremental compilation is in the works for the upcoming 0.9 sbt version.
Use fsc - it is a fast Scala compiler that runs as a background task, so it does not need to be loaded every time. It can reuse the previous compiler instance.
I'm not sure if Netbeans scala plugin supports fsc (documentation says so), but I couldn't make it work. Try nightly builds of the plugin.
You can use the JRebel plugin which is free for Scala. So you can kind of "develop in the debugger" and JRebel would always reload the changed class on the spot.
I read a statement somewhere by Martin Odersky himself in which he says that the searches for implicits (the compiler must make sure there is no more than one implicit for the same conversion, to rule out ambiguities) can keep the compiler busy. So it might be a good idea to handle implicits with care.
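A tiny sketch of what such a search looks like from the programmer's side. All names here are invented for the example. The call describe(2.5) does not type-check on its own, so the compiler goes looking for an implicit conversion from Double to Meters among everything in scope:

```scala
import scala.language.implicitConversions

object ImplicitSketch {
  final case class Meters(value: Double)

  // The only Double => Meters conversion in scope.
  implicit def doubleToMeters(d: Double): Meters = Meters(d)

  def describe(m: Meters): String = s"${m.value} m"

  // Compiles only because implicit search finds doubleToMeters.
  val s: String = describe(2.5)
}
```

If a second, equally specific Double => Meters conversion were in scope, the call would be rejected as ambiguous. The more candidates there are in scope, the more work this search is at every such call site.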
If it doesn't have to be 100% Scala, but also something similar, you might give Kotlin a try.
-- Oliver
I'm sure this will be down-voted, but extremely rapid turn-around is not always conducive to quality or productivity.
Take time to think more carefully and execute fewer development micro-cycles. Good Scala code is denser and more essential (i.e., free from incidental details and complexity). It demands more thought and that takes time (at least at first). You can progress well with fewer code / test / debug cycles that are individually a little longer and still improve your productivity and the quality of your work.
In short: Seek an optimum working pattern better suited to Scala.

AOT compilation or native code compilation of Scala?

My scala application needs to perform simple operations over large arrays of integers & doubles, and performance is a bottleneck. I've struggled to put my finger on exactly when certain optimizations kick in (e.g. escape analysis) although I can observe their results through various benchmarking. I'd love to do some AOT compilation of my scala application, so I can see or enforce (or implement) certain optimizations ... or compile to native code, if possible, so I can cut corners like bounds checking and observe if it makes a difference.
My question: what alternative compilation methods work for scala? I'm interested in tools like llvm, vmkit, soot, gcj, etc. Who is using those successfully with scala at this point, or are none of these methods currently compatible or maintained?
GCJ can compile JVM classes to native code. This blog describes tests done with Scala code: http://lampblogs.epfl.ch/b2evolution/blogs/index.php/2006/10/02/scala_goes_native_almost?blog=7
To answer my own question, there is no alternative backend for Scala except for the JVM. The .NET backend has been in development for a long time, but its status is unclear. The LLVM backend is also not yet ready for use, and it's not clear what its future is.