Graphing sbt's incremental compilation logic

sbt maintains dependencies between tasks, and the resulting graph can be reasoned about fairly easily. On the other hand, skimming the source code, it seems like the incremental compilation logic is a lot more opaque. I'd like to be able to do the following things:
1. Say the equivalent of "if I modified this interface [in this way], what would get invalidated?"
2. Build a graph of how modifying different class interfaces affects the rest of the build. Graphing Scala import dependencies isn't a particularly good approximation of this, given how complicated implicit dependencies can get in Scala. It seems like sbt must maintain this information in some form or another to do incremental compilation, so I "just" need to figure out how to access it and hope that it's in a form suitable for my use case.
Are either of these feasible? I'm not opposed to writing sbt plugins, but would appreciate hints about how to proceed.
Edit: it looks like Relation's usesInternalSrc(dep: File): Set[File] could be promising. Does that capture all of sbt's dependency knowledge?
Edit 2: even more promising, there's a DotGraph object inside the sbt source tree. It has no documentation and google doesn't have any human-readable text about it. If I can figure out how to use it I'll post an answer.

Sample console-project session:
> val (s, a) = runTask(compile in Compile, currentState)
> DotGraph.sources(a.relations, file("source-graph"), Nil)
source-graph is a directory that will contain two dot files, one with source dependencies and one with binary dependencies. Alternatively, you can interact directly with a.relations, of type Relations, as suggested in the question; it does capture all of sbt's dependency knowledge. In 0.13 there will also be information about which dependencies are due to inheriting from something in another source file.
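If you only need to query the dependency relation rather than render it, you can inspect a.relations from the same session. A minimal sketch, assuming the sbt 0.12-era Relations API (the source path is a made-up example):
> val rel = a.relations
> // sources whose compilation used the given source (reverse lookup)
> rel.usesInternalSrc(file("src/main/scala/Foo.scala"))
> // internal source dependencies of the given source
> rel.internalSrcDeps(file("src/main/scala/Foo.scala"))
> // binary (classpath) dependencies of the given source
> rel.binaryDeps(file("src/main/scala/Foo.scala"))
The generated dot files can be rendered with Graphviz, e.g. dot -Tsvg -O <file>.dot.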
As for how modifying a source file affects invalidation, the tracking is very coarse-grained. Any change to any non-private signature marks a source as changed. In 0.12 and earlier, this will at least invalidate direct dependencies and maybe more. In 0.13, this will invalidate direct dependencies only, except for inherited dependencies, which are transitively invalidated. There is currently no way to see what will be invalidated when a source file's non-private API is modified except by doing it.

Related

Why does scalac need a transitive dependency on the classpath

I'm running into unexpected (only for me?) scalac behavior.
The TL;DR is that the following is a recreation of an issue I saw while trying to migrate a codebase from Maven to Bazel. One of the main focuses of this migration is to minimize the dependencies each class needs for compilation, so that builds are triggered only when needed.
Unfortunately, what I saw is that given the dependency chain ClassIndirectlyNeedingFoo -> (uses) ClassUsingFoo -> (uses) Supplier, the compilation of ClassIndirectlyNeedingFoo breaks if Supplier is not on the classpath.
The full details are here (https://github.com/ittaiz/scalac-troubleshooting).
If anyone knows why scalac behaves like this I'd really appreciate it.
Thanks!
BTW, Supplier is not in the source or bytecode of ClassIndirectlyNeedingFoo...
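For reference, here's a minimal sketch of the shape of the problem (hypothetical file contents; the actual recreation in the linked repository may differ):
// Supplier.scala, compiled separately into its own jar
trait Supplier[T] { def get(): T }

// ClassUsingFoo.scala, compiled against the Supplier jar
class ClassUsingFoo {
  def value: String = "foo"
  def supplier: Supplier[String] = new Supplier[String] { def get(): String = value }
}

// ClassIndirectlyNeedingFoo.scala: only calls value and never mentions
// Supplier, yet compiling it without the Supplier jar on the classpath
// fails with a "missing from the classpath" stub error
class ClassIndirectlyNeedingFoo(u: ClassUsingFoo) {
  def indirect: String = u.value
}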
OK, so the short answer is that the "why" isn't totally clear to anyone (see proposal 4 below). What is clear is that scalac is known to sometimes need more dependencies than one would think, and it's also clear that sometimes when this happens it's a bug.
Furthermore, from a discussion with Jason Zaugg on Gitter, he seems to think my above issue is just part of a family of bugs like the one linked above.
As Seth linked in the comments, the Scala Center has accepted a proposal (original PR) for clarifying this area.
Most related to this issue are the four proposals there:
1. Improvements to the user experience of stub errors, centered around the statement that they are an expected common case, rather than a rare, unexpected, or fatal condition.
2. Reduction of the number of cases that result in stub errors, i.e. allowing more use cases that currently result in a stub error to successfully compile, and thus allowing for fewer direct dependencies.
3. A compiler flag to require import statements for all symbols used during compilation (including those not otherwise mentioned in the source). For symmetry with -Ywarn-unused-imports, this option might potentially be called -Ywarn-undeclared-imports. It would primarily assist with making the transition from transitive to direct dependencies, rather than helping to maintain a build that is already using direct dependencies. (As suggested by @posco and @adriaanm.)
4. An expansion of the Scala Language Specification to list all cases in which a symbol from another compilation unit must be present on the classpath, including: 1) subclassing, 2) return types of superclasses' public methods, 3) direct reference, 4) etc.
It was agreed to go ahead with proposal 3, though I don't know when the work will commence.
Eugene Burmako, who co-authored the proposal, started prototyping the solution, and I've made a small change on top of that.
For now this will have to do for my problem.

How to generate control flow graph from Scala code?

I want to see the control flow graph generated by the Scala compiler. Is there a way to do so? I tried searching online but only found Eclipse plugins for Java, like the one from www.drgarbage.com, but none for Scala.
Thanks
EDIT: I took the .class file generated by scalac and opened it with the Dr. Garbage plugin to see the bytecode visualized as a control flow graph. But scalac produces three different .class files: Foo, Foo$, and Foo$delayedInit$body. I see a bunch of disconnected graphs, and only one of the graphs in Foo$ looks reasonable. I tried searching online for the difference between the three .class files but couldn't find anything.
I didn't realize that the IR (intermediate representation) for Scala in the compiler backend is actually called icode. The compiler option -Xprint-icode shows the IR separated into basic blocks. This was what I was looking for.
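For example, assuming a 2.10-era scalac (the icode backend was removed in later versions):
> scalac -Xprint-icode Foo.scala
This dumps each method's icode as basic blocks; depending on the version, the output may also be written to *.icode files alongside the class files.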
A compiler plugin can do just that. However, it requires abstracting away some internals introduced by the compiler, which are more specific than what you'd expect from how a project's source code looks. You can use this plugin to get the raw information extracted for you; the re-abstraction from the raw data is still a work in progress, and you'd have to run sbt publishLocal before you can include the plugin in your sbt definition.

How to test build.sbt code?

I've got some moderately simple but slightly tricky sbt setup code that I want to write a few tests for. I would, however, like it to stay in build.sbt.
For the time being, I've moved it to an object (in project/) and referenced it in build.sbt as well as in the tests in project/src/test/scala.
But is it possible for code in build.sbt to be tested?
According to the advice in the SBT docs:
The recommended approach is to define most configuration in .sbt files, using .scala files for task implementations or to share values, such as keys, across .sbt files.
So since this sounds like testable code, and not static configuration, your best bet is to leave it in a Scala file.
I'm sure there's a way to manually evaluate the SBT DSL if you really, really want to, but you'll probably have to dig deep into the docs or the SBT source. I can't help you there, though.
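To illustrate, the layout from the question could look like this (hypothetical names; a test framework such as ScalaTest would have to be added to the meta-build, e.g. in project/build.sbt, and the tests run after reload plugins):
// project/BuildUtil.scala: plain Scala, so it can be unit tested
object BuildUtil {
  def versionFromTag(tag: String): String = tag.stripPrefix("v")
}

// build.sbt: uses the shared object
version := BuildUtil.versionFromTag("v1.2.3")

// project/src/test/scala/BuildUtilSpec.scala
import org.scalatest.FunSuite
class BuildUtilSpec extends FunSuite {
  test("strips the v prefix") {
    assert(BuildUtil.versionFromTag("v1.2.3") == "1.2.3")
  }
}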

Macro project - macro in its own configuration

The SBT documentation about Macro Projects starts with the following:
The current macro implementation in the compiler requires that macro implementations be compiled before they are used. The solution is typically to put the macros in a subproject or in their own configuration.
What exactly does it mean to put the macros "in their own configuration"? Does it mean there is an alternative to putting macro source in a subproject? If so, what would that alternative be? I'm looking for an option where I don't have to separate macro source from invocations, mainly because I don't want yet another subproject for common code.
I believe the author refers to the Configuration scope. In short: you already know that different sources can reside under src/main/scala and src/test/scala, and that code in the test configuration can use code from the compile (main) configuration. So why not have a custom configuration, e.g. macro, whose sources reside under src/macro/scala?
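A rough, untested sketch of what that could look like in an sbt 0.13 build definition (the configuration name and classpath wiring are my assumptions):
lazy val Macro = config("macro")

lazy val root = (project in file("."))
  .configs(Macro)
  // gives the macro config its own compile pipeline, with sources under src/macro/scala
  .settings(inConfig(Macro)(Defaults.configSettings): _*)
  .settings(
    // let main sources see the compiled macros, forcing them to compile first
    unmanagedClasspath in Compile ++= (exportedProducts in Macro).value
  )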
There's a great answer on this matter; I suggest you take a look.
Also, you can find useful examples here; it's an explanation of defining new configurations for tests, but you can exploit it for your needs.
I don't know what the author meant by own configuration.
An alternative is to use sbt's ability to generate sources (cf. sourceGenerators) before they get compiled. "Generating" in fact allows you, for instance, to copy macro source files from elsewhere.
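A minimal sketch of that approach in sbt 0.13 syntax (all paths are made up):
sourceGenerators in Compile += Def.task {
  val out = (sourceManaged in Compile).value / "Macros.scala"
  // "generate" by copying a macro source file from elsewhere in the repository
  IO.copyFile(baseDirectory.value / "macro-src" / "Macros.scala", out)
  Seq(out)
}.taskValue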
However, that's convoluted, so I'd recommend separating macros into a subproject. Besides, separating macros still allows you to build even if you switch to an IDE that doesn't support SBT (you then have a macro-defining project and a macro-using project).

How do I generate new source code in text form in a Scala compiler plugin?

I have just finished the first version of a Java 6 compiler plugin that automatically generates wrappers (proxy, adapter, delegate, call it what you like) based on an annotation.
Since I am doing mixed Java/Scala projects, I would like to be able to use the same annotation inside my Scala code, and get the same generated code (except of course in Scala). That basically means starting from scratch.
What I would like to do, and for which I haven't found an example yet, is to generate the code inside a Scala compiler plugin in the same way as in the Java compiler plugin. That is, I match/find where my annotation is used, get the AST for the annotated interface, and then ask the API to give me a Stream/Writer in which I output the generated Scala source code, using string manipulation.
That last part is what I could not find. So how do I tell the API to create a new Scala source file, and give me a Stream/Writer/File/Handle, so I can just write in it, and when I'm done, the Scala compiler compiles it, within the same run in which the plugin was invoked?
Why would I want to do that? Firstly, because then both plugins have the same structure, so maintenance is easy. Secondly, I want to open source it, and there is just no way to support every option that anyone would want, so I expect potential users to want to extend the generation with their own code. This will be a lot easier for them if they just have to do some printf(), instead of learning the AST API (this also applies to me).
Short answer:
It can't be done
Long answer:
You could conceivably generate your source file and push that through a parser instance within your plugin. But not in any way that's likely to be of any use to you, because you'd now have a bigger problem to contend with:
In order to grab all the type/name information for generating the delegate/proxy, you'll have to pick up the annotated type's AST after it has run through both the namer and typer phases (which are inseparable). The catch is that any attempts to call your generated code will already have failed typechecking, the compiler will have thrown an error, and any further bets are off.
Method synthesis is possible in limited cases, so long as you can somehow fool the typechecker for just long enough to get your code generated, which is the trick I pulled with my Autoproxy 'lite' plugin. Even then, you're far better off working with TreeDSL to generate code instead of pumping out raw source.
Kevin is entirely correct, but just for completeness it's worth mentioning that there is another alternative: write a compiler plugin that generates source. This is the approach that I've adopted in Borachio. It's not a very satisfactory solution, but it can be made to work.
Edit: I just reread your question and realised that you're actually asking about generating source anyway.
So there is no support for this directly, but it's basically just a question of opening a file and writing the relevant "print" statements. There's no way to invoke the compiler "inside" a plugin AFAIK, but I've written an sbt plugin which hides most of the complexity of invoking the compiler twice.
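To make the shape of that concrete, here is a rough, untested skeleton of a plugin component that runs after the typer and dumps generated wrapper source as plain text (all names are made up, and the generated file is not compiled in the same run; it has to go through a second compiler invocation, e.g. driven by a build tool):
import java.io.PrintWriter
import scala.tools.nsc.{Global, Phase}
import scala.tools.nsc.plugins.{Plugin, PluginComponent}

class WrapperGen(val global: Global) extends Plugin {
  import global._
  val name = "wrappergen"
  val description = "writes wrapper source for annotated classes"
  val components = List[PluginComponent](Component)

  private object Component extends PluginComponent {
    val global: WrapperGen.this.global.type = WrapperGen.this.global
    val runsAfter = List("typer")
    val phaseName = "wrappergen"
    def newPhase(prev: Phase): Phase = new StdPhase(prev) {
      def apply(unit: CompilationUnit): Unit = {
        // crude string-based annotation check, for illustration only
        val annotated = unit.body.collect {
          case cd: ClassDef if cd.symbol.annotations.exists(_.toString.contains("GenerateWrapper")) => cd
        }
        for (cd <- annotated) {
          // printf-style emission, as the question asks for
          val out = new PrintWriter(s"${cd.name}Wrapper.scala")
          try out.println(s"class ${cd.name}Wrapper /* generated */")
          finally out.close()
        }
      }
    }
  }
}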