sbt doesn't correctly deal with resources in multi-module projects when running integration tests. Why?

I have the following configuration on an sbt project:
moduleA: contains a bunch of integration tests.
moduleB (depends on moduleA): contains a reference.conf file.
moduleC (aggregates moduleA and moduleB): this is the root.
When I try to run it:test I get errors, as the tests cannot find the values defined in reference.conf. Manually copying the reference.conf to moduleA makes it work.
The issue clearly seems to be that, for some reason, when running it:test (at the root), sbt is not smart enough to add the reference.conf to the classpath.
Can anyone theorize why that is the case? How does sbt work with classpaths and classloaders? Does it just dump everything into a single classloader? It certainly doesn't seem to.
Thanks

In order to address your question and comment, let me break down what SBT is doing with your project.
ModuleC is the root project, and it aggregates ModuleA and ModuleB. In the context of SBT, aggregation means that any command that is run on the root project is also run on the aggregated sub-projects. So, for example, if you run integration tests on the root module, then you will also run the integration tests for its aggregated modules. However, it's important to understand that this is not done all-at-once: the command is run on each sub-project individually.
The next question that SBT has to address is the order in which the sub-projects should be processed. In this case, since ModuleB depends on ModuleA, it has to process ModuleA before processing ModuleB. Otherwise, if there was no dependency between them, then the order wouldn't matter, and SBT would most likely stick with the order that they were specified in ModuleC's aggregation list.
But what does it mean for one sub-project to depend upon another? It's akin to the relationship between an SBT project and one of its libraryDependencies: the dependent library must be available to the sub-project, and its resources and classes are made available on the classpath during the indicated phases (compile, test, run, etc.).
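For illustration, the layout described in the question corresponds roughly to a root build.sbt like the following. This is only a sketch; the directory names and settings are assumptions:

lazy val moduleA = (project in file("moduleA"))
  .configs(IntegrationTest)
  .settings(Defaults.itSettings)   // enables the it:test configuration for this module

lazy val moduleB = (project in file("moduleB"))
  .dependsOn(moduleA)              // moduleB sees moduleA's classes and resources, not the other way around

lazy val moduleC = (project in file("."))
  .aggregate(moduleA, moduleB)     // commands run on the root fan out to each sub-project in turn

Note that dependsOn is one-directional: nothing here puts moduleB's reference.conf on moduleA's integration-test classpath.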
So, when integration tests are run on ModuleC, SBT will first run ModuleA's integration tests. Since ModuleA has no other dependencies within the project, it will be processed without any of the other sub-projects available on its classpath. (This is why it cannot access the reference.conf file that is part of ModuleB.) This makes sense, if you think about it, because otherwise—if ModuleA and ModuleB are dependent upon each other—you would have an unresolvable chicken-and-egg situation, in which neither project could be built.
(BTW, if ModuleA has sources that have not yet been compiled, then they will be compiled, on a separate compile pass, before running the integration tests.)
Next it will try to process ModuleB, adding ModuleA's resources and classes to its classpath, since it is dependent upon them.
From your description, it seems that at least some of the configuration settings in ModuleB's reference.conf file should belong to ModuleA, since it needs access to them during its integration tests. Whether this means that the whole of the file should belong to ModuleA is up to you. However, it's possible for each sub-project to have its own reference.conf file resource (that's a design feature of the Typesafe Config library that I'm assuming you're using). Any configuration settings that belong to ModuleA's reference.conf file will also be available to ModuleB, since it is dependent upon ModuleA. (If you have multiple reference.conf files, the only issue you have will depend upon how you package and release ModuleC. If you package everything in all of your sub-projects into a single JAR file, then you would need to merge the various reference.conf files together, for example.)
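If you do keep separate reference.conf files per sub-project and later publish everything as a single JAR, one common way to merge them is the sbt-assembly plugin's concat strategy. A minimal sketch, assuming sbt-assembly is already added in project/plugins.sbt:

assemblyMergeStrategy in assembly := {
  case "reference.conf" => MergeStrategy.concat   // concatenate every reference.conf into one file
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)                                 // fall back to the default strategy for everything else
}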
Another possibility is that some or all of the integration tests should actually belong to ModuleC rather than either ModuleA or ModuleB. Again, making this determination will depend upon your requirements. If it makes sense for each sub-project to perform integration tests in all cases, then place them in the sub-projects. If they only make sense for the completed project as a whole, then put them in ModuleC.
You might want to read the documentation for SBT multi-project builds for further details.

Related

SBT build structure for a mono repository

I'd like to use SBT for a build structured around a single Git repository with tens of projects. I'd like to have the following possibilities from the build:
One-click build/test/release of all projects in the build.
Define and apply common settings for all the projects in the build.
Define project-specific settings in the project sub-directories, keeping the root of the build clean.
Projects should be able to depend on each other, but...
Most of the projects will not depend on each other in the classpath sense.
Some of the projects will be SBT plugins that should be included in other projects (but not in all of them).
Now, with these requirements in mind, what should the structure of the build be? Because of requirement 3, I can't just go with a single build.sbt in the root of the build, because I don't want to put all the projects' settings there: it would be a lot of text, and every change to a single project would be reflected at the top level.
I've also heard that using *.sbt files both for the root project and for sub-projects is error-prone and not generally recommended (Producing no artifact for root project with package under multi-project build in SBT, or How can I use an sbt plugin as a dependency in a multi-project build?, SBT: plugins.sbt in subproject is ignored? etc.). I've only tried simple multi-project builds with *.sbt files at different levels, and it just worked. Which pitfalls do I need to keep in mind if I go for a multi-*.sbt-file approach, given the requirements above?
OK, since no one has posted anything so far, I figured I'd post what I've learned from running a mono repository with SBT so far.
Some facts first: our SBT build consists of 40+ projects, with 10+ common projects (in the sense that other projects depend on them), and 5 groups of projects related to a single product (5-7 projects in each group). In each group, there's typically one group-common project.
Build organization
We have the following build structure:
One main build.sbt for the whole build.
One build.sbt per project.
Several local SBT plugins in project directory.
Let's talk about each of these items.
1. Main build.sbt
In this file, the general build structure is defined. Namely, we keep cross-project dependencies in there. We don't use the standard commonSettings approach, as in:
val commonSettings = Seq(scalaVersion := "2.12.3", ...)
...
val proj1 = (project in file("p1")).settings(commonSettings)
val proj2 = (project in file("p2")).settings(commonSettings)
...
This would be too wordy and too easy to get wrong for a new project. Instead we use a local SBT plugin that automatically applies to every project in the build (more on this later).
2. Per-project build.sbt
In those files, we generally define all the project settings (non-auto plugins, library dependencies, etc.). We don't define cross-project dependencies in these files, because that doesn't really work as expected: SBT loads all the *.sbt files in a certain order, and the project definition in each file overrides the previously found ones. In other words, if you avoid (re-)defining projects in per-project *.sbt files, things will work well. All the other settings can be kept there, to avoid too much clutter in the main build.sbt.
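For illustration, a per-project build.sbt under this scheme contains only settings, never project definitions. The dependencies and versions below are just assumptions:

// p1/build.sbt -- settings only, no lazy val project definitions
libraryDependencies ++= Seq(
  "com.typesafe"  %  "config"    % "1.3.1",
  "org.scalatest" %% "scalatest" % "3.0.5" % Test
)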
3. Local SBT plugins
We use a trick to define SBT auto-plugins in the <root_dir>/project/ directory and make them load automatically for all the projects in the build. We use those plugins to automatically define settings and tasks for all the projects (for things like Scalastyle, scalafmt, Sonar, deployment, etc.). We also keep common settings there (scalaVersion, etc.). Another thing we keep in <root_dir>/project/ is common dependency versions (not in a plugin, just a plain *.scala file).
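A minimal sketch of that trick (the object name and the particular settings are assumptions): an auto-plugin under project/ that triggers for every project, so each sub-project picks up the common settings without listing the plugin anywhere.

// project/CommonSettingsPlugin.scala
import sbt._
import Keys._

object CommonSettingsPlugin extends AutoPlugin {
  override def requires = plugins.JvmPlugin
  override def trigger  = allRequirements      // activate automatically for every project in the build

  override lazy val projectSettings = Seq(
    scalaVersion := "2.12.3",
    scalacOptions ++= Seq("-deprecation", "-feature")
  )
}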
Summary
Using SBT for a mono repository seems to work, and has certain advantages and disadvantages.
Advantages: It's super easy to re-use code between products. Also, common SBT stuff like Scalastyle, scalafmt, etc. is defined once, and all new projects get it for free. Upgrading a dependency version is done in one place for all the projects, so when someone upgrades the version, every project gets it at once and different teams benefit. This requires certain discipline between teams, but it has worked for us so far.
Another advantage is use of common tooling. We have a Gerrit+Jenkins continuous integration, and we have a single job for e.g. pre-submit verification. New projects get a lot of this machinery pretty much for free, again.
Disadvantages: For one, the build load time. On a top-spec 13" MacBook Pro it can easily take 30+ seconds (that's the time from starting SBT to getting to SBT's command prompt). This is not that bad if you keep SBT constantly running, though. It's much worse for IntelliJ refreshing the build information, which can take around 15 minutes. I don't know why it takes so much longer than SBT, but there it is. It can be mitigated by avoiding refreshing IntelliJ unless absolutely necessary, but it's a real pain.
Yet another problem is that you can't load an individual project or group of projects into IntelliJ IDEA; you are forced to load the build of the whole mono repository. If that were possible, IntelliJ's situation could, I guess, be better.
Another disadvantage is the fact that one can't use different versions of the same SBT plugin for different projects. When one project can't be upgraded to a new plugin version for some reason, the whole repository has to wait. Sometimes this is useful: it expedites maintenance work and forces us to keep projects in maintenance mode up to date. But for legacy projects it can sometimes be challenging.
Conclusion
All in all, we have worked in this mode for around a year, and we intend to keep doing so for the foreseeable future. What concerns us is the long IntelliJ IDEA refresh time, and it only gets worse as we add more projects to the build. We might later evaluate alternative build systems that would let us load projects in isolation to help with IntelliJ performance, but SBT seems to be up to the task.
Here is an update on this discussion 5 years later.
For those who understand French, there is this presentation from ScalaIO 2019:
François Sarradin: SBT monorepo et livraison, https://www.youtube.com/watch?v=nT8YhC5iRco
There is also this video, not specific to sbt, which gives good tips and was recommended in this gitter discussion: https://gitter.im/sbt/sbt?at=63584746f00b697fec5b7184
Gil Tayar: FOUR PILLARS AND A BASE: THE NUTS AND BOLTS OF A MONOREPO jsday 2020, https://www.youtube.com/watch?v=cIGFyv1KuGI

Adding library dependency in play 2.3.8

I'm trying to add the apache commons email library to my Play project and I'm having trouble.
Firstly, I have both build.sbt and plugins.sbt in my project, and I'm not sure which one I should be putting the dependency into. Does anyone know?
Also, I'm not sure why there is even a separate project module in my project; IntelliJ created it as part of the project. Could anyone explain the purpose of the two separate modules and why they are there?
Thanks!
So, in sbt, you have your project. This is specified in build.sbt (or, more correctly, in any *.sbt file in your project's base directory). Any libraries that your application code needs (for example, the commons email library, if your application needs to send emails) go into the libraryDependencies setting in there.
But build.sbt is itself Scala code that needs to be compiled, and it's not part of your application's runtime. So in sbt, your project's build is a project itself, one that has to be compiled. It has its own classpath, which consists of the sbt plugins you're using. For example, if you need a less compiler to compile your less files, that's not something that gets done at runtime, so you don't want your application code depending on it; it goes into your build project's libraryDependencies, which are specified in project/plugins.sbt (or in fact in any *.sbt file in the project directory). Once you add it there, you can use the Scala code it provides from build.sbt. IntelliJ imports this build project for you so that you can have syntax highlighting and other IDE features in build.sbt.
But it doesn't stop there. How does project/plugins.sbt get compiled, and where is its classpath? Well, your project's build's project's build is itself an sbt project too... it keeps going down. IntelliJ stops at that point, though; it doesn't keep importing these meta sbt projects, because it's actually very rare to need additional sbt plugins that deep, so it just uses the same classpath as your project's build for syntax highlighting in project/plugins.sbt.
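Concretely, for the commons email case in the question, the split looks roughly like this (the version numbers and the example plugin are assumptions, not taken from your project):

// build.sbt -- libraries your application code uses
libraryDependencies += "org.apache.commons" % "commons-email" % "1.3.3"

// project/plugins.sbt -- dependencies of the build itself, i.e. sbt plugins
addSbtPlugin("com.typesafe.sbt" % "sbt-less" % "1.0.6")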

Using sbt with custom Scala builds

I occasionally play with Scala forks and sometimes need to debug these forks on SBT projects. In general, scalaHome works great, but there are a few things that I'd like to find better ways to achieve.
1) Is it possible to have SBT pick up custom scalac class files produced by the ant quick build rather than jar files emitted by the ant pack build? The latter implies 5-10 seconds of additional delay per build, so it'd be great to avoid it.
2) Even in big projects, problems exhibited by scalac usually manifest themselves when compiling single files. Is there a way to tell sbt to ignore its change-tracking heuristics and recompile just a single file? What I would particularly like to prevent is recompilation of the whole world when I recompile scalaHome or change scalac flags.
3) Would it be possible to have sbt hot-reload scalac classes coming from scalaHome when scalaHome gets recompiled? Currently I have to shut down and restart sbt to apply the changes.
1) No, this would make sbt depend on the details of the Scala build. If Scala were built with sbt, you might be able to depend on Scala as a source dependency or at least this could probably be supported without too many changes.
2) No, see https://github.com/sbt/sbt/issues/604
3) sbt 0.13 should check the last modified times of the jars coming from scalaHome and use a new class loader. It is a bug if it does not.
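For reference, the scalaHome wiring the question relies on is usually just a setting pointing at the pack directory of the fork; the path below is hypothetical:

// build.sbt
scalaHome := Some(file("/path/to/scala-fork/build/pack"))   // sbt picks up the compiler and library jars from <scalaHome>/lib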

OpenWrap: test-wrap, how does it work?

I am using the beta of OpenWrap 2.0. OpenWrap contains support to run unit-tests, my question is how exactly does this work?
Should I see it as a test-runner that takes a built wrap, searches for the tests included in the wrap, and tries to run them? Is it required to include the tests inside the wrap?
How does the dependency resolving work in the context of tests? I can specify a tests-scope which adds extra dependencies required for the tests. When are those dependencies used? I assume it is used to build the test-projects, and to run the tests with test-wrap? However, when I do include the tests in the wrap, shouldn't those test-scoped dependencies also be considered dependencies for the wrap, or are they only used as dependencies when I try to execute "test-wrap"?
Another thing I was wondering about in the context of the tests, is the difference between compile-time and run-time dependencies.
As an example, I have a project API that specifies an API. Next to that project, I have 2 other projects Impl1 and Impl2 that each specify a different implementation of that API. And next to that I have a test project API.Tests that contains tests against the API. The tests use dependency injection to inject either Impl1 or Impl2 to run the tests.
In this case, the API.Tests project only has a compile time dependency on the API (and should only have that available as a compile time dependency). When running the tests however, the project has a run-time dependency on Impl1 or Impl2. Any suggestions on how to package this?
test-wrap will be able to run a test-runner for tests that ship as part of a package (in /tests).
The implementation right now is not up to date any more, mostly because packages do not include the testdriven.net test runner, which makes running those tests rather complicated. I've not re-evaluated our plans for this yet for that reason.
OpenWrap 2 uses scopes to define dependencies that only apply to a certain subset of your code. In the case of tests, provided you have the correct directory-structure instruction in the descriptor, your project will pull in those dependencies in the correct scope.
That said we don't preserve that information in the assembly, so when you run those tests we don't load up the dependencies for the test scope, which we should probably do (at least for tests). All assemblies in your package are however injected in the current appdomain, so for your scenario, provided you have your tests in /tests, you just need to package all those assemblies in the same package and it should just work.
The same mechanism will

Selectively include dependencies in JAR

I have a library that I wrote in Scala that uses Bouncy Castle and has a whole bunch of dependencies. When I roll a jar, I can either roll a "fat" jar that has all the dependencies (including scala), which weighs in around 19 MB, or I can roll a skinny jar, which doesn't have dependencies, but is only a few hundred KB.
The problem is that I need to include the Bouncy Castle classes/jar with my library, because if its not on the classpath at runtime, all kinds of exceptions get thrown.
So, I think the ideal situation is if there is some way that I can get either Maven or SBT to include some but not all dependencies in the jar that gets rolled. Some dependencies are needed at compile-time, but not at run time, such as the Scala standard libraries. Is there some way to get that to happen?
Thanks!
I would try out the sbt proguard plugin from https://github.com/nuttycom/sbt-proguard-plugin . It should be able to weed out the classes that are not in use.
If it is sufficient to explicitly define which dependencies should be added (on the artifact level, i.e., single JARs), you can define an assembly (in the case of a single project) or an additional assembly project (in the case of a multi-module project). Assembly descriptors can explicitly exclude/include artifacts from the dependencies.
Here is some good documentation on this topic (section 8.5.4), here is the official documentation.
Note that you can include all artifacts that belong to one group by using the wildcard notation in dependencySets, e.g. hibernate:*:jar would include all JAR files belonging to the hibernate group.
Covering maven...
Because you declare your project to be dependent upon bouncy castle in your maven pom, anybody using maven to depend upon your library will by default pull in bouncy castle as a transitive dependency.
You should set the appropriate scope on your dependencies, e.g. compile for stuff needed at compile time and runtime, test for dependencies only needed in testing, and provided for stuff you expect to be provided by the environment.
Whether your library's dependencies are packaged into dependent projects when they are built is a question of how those projects are configured, and setting the scopes will influence the default behaviour.
For example, jar type packaging by default does not include dependencies, whereas war will include those in compile scope (but not test or provided). The design aim here was to have packaging plugins behave in the most commonly required way without needing configuration, but of course packaging plugins in maven can be configured to have different behaviour if needed. The plugins themselves which do packaging are well documented at the apache maven site.
If users of your library are unlikely to be using maven to build their projects, an option is to use the shade plugin which will allow you to produce an "uber-jar" which contains all the dependencies you wish. You can configure particular includes or excludes.
This can be a problematic way to deliver, for example where your library includes dependencies whose versions clash with the direct dependencies of the projects using it, i.e. they use a different version of the same libraries that yours does.
However, if you can, it is best to leave this to maven to manage, so that projects using your library can decide whether they want your dependencies or to specify particular versions, giving them more flexibility. This is the idiomatic approach.
For more information on dependencies and scopes in maven, see the reference guide published by Sonatype.
I'm not a scala guy, but I have played around with assembling stuff in Java + Maven.
Have you tried looking into creating your own assembly descriptor for the assembly plugin? https://maven.apache.org/plugins/maven-assembly-plugin/assembly.html
You can copy/paste the jar-with-dependencies descriptor and then just add some excludes to your <dependencySet>. I'm not a Maven expert, but you should be able to configure it so that different profiles kick off different assembly builds.