The question "An easy way to get rid of *everything* generated by SBT?" asks for an easy way to clean up all the files generated by sbt, and didn't find one. I'll ask for a hard one: how do I get the equivalent of make cleanall with sbt?
UPDATE: For a definition of *everything*, you have to know why you'd ever need to clean everything. There are two common reasons:
To distribute something: You need to confirm that someone can take a fresh machine, download your code, and build it without problem. (Or, someone can't, and you want to debug).
When something has gone horribly wrong. Some funny version of some jar somewhere is breaking something somehow. Sometimes it's easiest to just start fresh, rather than try to debug... (See Jim Gray's Why Computers Stop)
Here are the usual suspects of things you need to clean:
~/.sbt/boot - Cache of sbt itself, kept to avoid disappearing-JAR oddities
~/.sbt/**/target - Compiled global plugin/build definitions.
~/.ivy2/cache - Ivy cache of resolved dependencies
~/.ivy2/local - publishLocal files
<cwd>/project/**/target - Compiled build/plugin definitions
<cwd>/**/target - Artifacts from building your project, including compiler caches, classfiles, etc.
sbt is unable to provide a "clean everything" task directly, because deleting the JARs/classes sbt is actively using to run your build leads to very odd and evil behavior. But you could write a simple script which accomplishes this if desired; a sketch follows.
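A minimal sketch of such a cleanup, written here as a self-contained Scala program rather than a bash script so it runs anywhere a JVM does. Run it from outside sbt, since it deletes the caches sbt itself boots from; the object name is arbitrary and the paths are exactly the ones listed above.

```scala
import java.nio.file.{Files, Path, Paths}
import java.util.Comparator
import java.util.stream.Collectors

object SbtCleanAll {
  // Depth-first delete: files first, then their (now empty) parent directories.
  def deleteRecursively(p: Path): Unit =
    if (Files.exists(p))
      Files.walk(p)
        .sorted(Comparator.reverseOrder[Path]())
        .forEach(f => Files.deleteIfExists(f))

  def main(args: Array[String]): Unit = {
    val home = Paths.get(sys.props("user.home"))
    val cwd  = Paths.get(".").toAbsolutePath.normalize

    // Global caches: sbt re-creates or re-downloads these on the next run.
    Seq(".sbt/boot", ".ivy2/cache", ".ivy2/local")
      .map(rel => home.resolve(rel))
      .foreach(deleteRecursively)

    // Every target/ directory under the project, which covers both
    // <cwd>/**/target and <cwd>/project/**/target. Collect before deleting.
    val targets = Files.walk(cwd)
      .filter(p => Files.isDirectory(p) && p.getFileName.toString == "target")
      .collect(Collectors.toList[Path])
    targets.forEach(t => deleteRecursively(t))
  }
}
```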
Related
To speed up our development workflow we split the tests and run each part on multiple agents in parallel. However, compiling the test sources seems to take most of the time in the testing steps.
To avoid this, we pre-compile the tests using sbt test:compile and build a docker image with compiled targets.
Later, this image is used on each agent to run the tests. However, sbt seems to recompile the tests and application sources even though the compiled classes exist.
Is there a way to make sbt use existing compiled targets?
Update: To give more context
The question strictly relates to scala and sbt (hence the sbt tag).
Our CI process is broken down into multiple phases. It's roughly something like this:
Stage 1: Use SBT to compile the Scala project into Java bytecode using sbt compile. We compile the test sources in the same stage using sbt test:compile. The targets are bundled into a docker image and pushed to the remote repository.
Stage 2: We use multiple agents to split and run the tests in parallel. The tests run from the built docker image, so the environment is the same. However, running sbt test causes the project to recompile even though the compiled bytecode exists.
To make this clear: I basically want to compile on one machine and run the compiled test sources on another without recompiling.
Update
I don't think https://stackoverflow.com/a/37440714/8261 is the same problem because, unlike it, I don't mount volumes or build on the host machine. Everything is compiled and run within docker, just in two build stages. Because of this, the file modification times and paths stay the same.
The debug output has something like this
Initial source changes:
removed:Set()
added: Set()
modified: Set()
Invalidated products: Set(/app/target/scala-2.12/classes/Class1.class, /app/target/scala-2.12/classes/graph/Class2.class, ...)
External API changes: API Changes: Set()
Modified binary dependencies: Set()
Initial directly invalidated classes: Set()
Sources indirectly invalidated by:
product: Set(/app/Class4.scala, /app/Class5.scala, ...)
binary dep: Set()
external source: Set()
All initially invalidated classes: Set()
All initially invalidated sources:Set(/app/Class4.scala, /app/Class5.scala, ...)
Recompiling all 304 sources: invalidated sources (266) exceeded 50.0% of all sources
Compiling 302 Scala sources and 2 Java sources to /app/target/scala-2.12/classes ...
The log shows no initial source changes, but the products are invalidated.
Update: Minimal project to reproduce
I created a minimal sbt project to reproduce the issue.
https://github.com/pulasthibandara/sbt-docker-recomplile
As you can see, nothing changes between the build stages, other than the second stage running in a new step (a new container).
While https://stackoverflow.com/a/37440714/8261 pointed in the right direction, the underlying issue and the solution here were different.
Issue
SBT seems to recompile everything when it is run in different stages of a docker build. This is because docker compresses the images created in each stage, which strips out the millisecond portion of the lastModifiedDate of the sources.
SBT depends on lastModifiedDate when determining whether sources have changed, and since it differs (in the milliseconds part), the build triggers a full recompilation.
Solution
Java 8:
Set -Dsbt.io.jdktimestamps=true when running SBT, as recommended in https://github.com/sbt/sbt/issues/4168#issuecomment-417655678, to work around this issue.
Newer:
Follow the recommendation in https://github.com/sbt/sbt/issues/4168#issuecomment-417658294
I solved the issue by setting the SBT_OPTS environment variable in the Dockerfile like so:
ENV SBT_OPTS="${SBT_OPTS} -Dsbt.io.jdktimestamps=true"
The test project has been updated with this workaround.
Using SBT:
I think there is already an answer to this here: https://stackoverflow.com/a/37440714/8261
It looks tricky to get exactly right. Good luck!
Avoiding SBT:
If the above approach is too difficult (i.e. getting sbt test to accept that your test classes do not need re-compiling), you could avoid sbt and instead run your test suite using java directly.
If you can get sbt to log the java command that it is using to run your test suite (e.g. using debug logging), then you could run that command on your test runner agents directly, which would completely preclude sbt re-compiling things.
(You might need to write the java command into a script file, if the classpath is too long to pass as a command-line argument in your shell. I have previously had to do that for a large project.)
This would be a much hackier approach than the one above, but might be quicker to get working.
A possible solution might be defining your own sbt task without dependencies, or trying to change the test task. For example, you could create a task to run a JUnit runner if that is your testing framework. To define a task, see the sbt documentation on Implementing Tasks.
You could even go as far as compiling, sending the code, and running it on the remote agents from the same task, since a task body is just whatever Scala code you want. From the sbt reference manual:
You could be defining your own task, or you could be planning to redefine an existing task. Either way looks the same; use := to associate some code with the task key
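As a rough illustration of that pattern (the task name and its body are hypothetical, not the asker's actual build), a custom task in build.sbt might look like this:

```scala
// build.sbt -- minimal sketch of the `:=` pattern quoted above.
lazy val runRemoteTests = taskKey[Unit]("Ship compiled test classes to the agents and run them there")

runRemoteTests := {
  val log = streams.value.log
  // Any Scala code can go here: copy the contents of target/ to the agents,
  // launch a JUnit (or other framework) runner against it, collect results, etc.
  log.info("Shipping compiled classes to the test agents would happen here")
}
```

From the sbt shell this would then be invoked as runRemoteTests, just like any built-in task.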
I'm working on a project in a very security-conscious place with no proxy access to the online repositories SBT usually requires. We'd like to fetch the dependencies and transitive dependencies we need just once.
How can sbt be forced to fetch all the dependencies a project needs once and, from then on, work only offline? I have tried doing exactly that from home. I then copied over everything under:
~/.ivy2/cache
~/.ivy2/local
$ACTIVATOR_HOME/repository
but even when executed with sbt "set offline := true" run, SBT still goes and tries to fetch everything online, which is a pain. Then it finally breaks and complains it cannot find some dependency.
UPDATE: I noticed another source of trouble but can't yet conclude it is the culprit of the broken build described above. I build and fetch the dependencies for the project on a Linux (Ubuntu) box and then copy all the files to the corporate Windows 7 Pro environment. I found that many property files under ~/.ivy2/cache refer to the absolute path of the activator repository directory on Ubuntu, which is of course incorrect in the Windows environment, e.g.
#ivy cached data file for ch.qos.logback#logback-classic;1.1.3
#Fri Mar 10 08:39:37 CET 2017
artifact\:ivy\#ivy.original\#xml\#-1844423371.location=/opt/dev/activator/1.3.12/repository/ch.qos.logback/logback-classic/1.1.3/ivys/ivy.xml
artifact\:ivy\#ivy\#xml\#1016118566.is-local=true
artifact\:ivy\#ivy\#xml\#1016118566.location=/opt/dev/activator/1.3.12/repository/ch.qos.logback/logback-classic/1.1.3/ivys/ivy.xml
artifact\:ivy\#ivy.original\#xml\#-1844423371.is-local=true
artifact\:ivy\#ivy\#xml\#1016118566.exists=true
artifact\:logback-classic\#jar\#jar\#804750561.is-local=true
artifact\:logback-classic\#jar\#jar\#804750561.location=/opt/dev/activator/1.3.12/repository/ch.qos.logback/logback-classic/1.1.3/jars/logback-classic.jar
artifact\:ivy\#ivy.original\#xml\#-1844423371.exists=true
artifact\:logback-classic\#jar\#jar\#804750561.exists=true
So I did a find-and-replace, but the build still doesn't work. It doesn't look like a brilliant idea to have thousands of property files hardcoding an absolute path to the activator location; I would prefer they used an environment variable for that.
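As an aside on the offline flag itself: rather than passing set offline := true on every invocation, it can be pinned in the build. A minimal sketch, where the resolver list is an assumption and would need to match whatever local repositories were actually copied over:

```scala
// build.sbt -- sketch: force offline resolution and replace the default
// repositories (including Maven Central) with local-only ones.
offline := true
externalResolvers := Seq(
  Resolver.mavenLocal,   // ~/.m2/repository
  Resolver.defaultLocal  // ~/.ivy2/local
)
```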
Maybe you could try coursier?
Not only does it offer
a better offline mode - one can safely work with snapshot dependencies if these are in the cache (SBT tends to try and fail if it cannot check for updates)
but it is also much faster than Ivy, thanks to parallel artifact downloading. The project is young but promising.
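For older sbt versions, coursier is added as a plugin; a sketch of project/plugins.sbt, where the version number is an assumption to check against the plugin's releases:

```scala
// project/plugins.sbt -- sketch; pick the sbt-coursier release that matches
// your sbt version (not needed on sbt 1.3+, which uses coursier by default).
addSbtPlugin("io.get-coursier" % "sbt-coursier" % "1.0.3")
```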
We are using Jenkins and the SBT plugin to build, test and deploy our application. I see everywhere that people use the "clean" command before "test" or "stage", which leads to very long compile times.
Is it necessary to clean everything for each build? What is the risk of using the incremental compiler for production builds?
I think it's a matter of taste. For continuous integration builds, it's ok to rely on the incremental compiler; if something fails I can investigate quite quickly.
But if, for whatever reason, the incremental compiler decides to ignore some changes (and it already happened to me in development), you may get some difficult bugs to resolve.
So, if you want something that's reproducible from a fresh code checkout, just run "clean" before everything else.
Most of the time, I configure Jenkins with two build tasks: test builds, triggered by code changes, and packaging builds targeted for deployment, triggered periodically (or manually in some cases, with automatic deployment).
If the build script and/or command line to sbt have not changed then it should not be necessary to do clean.
I have been using Spark for several years - and it is one of the most complex sbt builds you can find. It works just fine to avoid doing clean as long as the build process itself (as embodied in the project/sbt build artifacts and the sbt command line) is the same as in previous runs.
I often have to run time-consuming experiments in Scala, and I usually run a second sbt instance for the same project, where I make changes to the code that is running in the other instance and compile.
The reason I do this is so that I don't have to wait for a long running process to finish before I make progress with my code.
My question is: Is it safe to do that, or is there a possibility that recompiling parts of currently running code in sbt/scala will cause problems in my running process?
What I have observed so far is that most of the time it is fine, but I did run into a class not defined error once when refactoring my code while running.
As @marcus mentioned, if the compiler writes a .class file that has not yet been loaded by your running JVM, that file stands a chance of being loaded later and not matching the other compiled classes. In many instances you'll be fine, but it could cause problems. There are a few things you can do in this situation:
Compile in separate directories. Check your code out into two completely different directories and do local commits (assuming you're using git) to push/pull from one copy of the repository to another. This will ensure that your testing doesn't get the compilation changes until you're ready (when you "pull" from the development repository).
Use an automated CI system like Jenkins or Travis to run your tests on each commit. This will, similarly to #1, not conflict with your development work since it is a separate checkout of the code.
Use sbt-revolver, which runs the program in a separate JVM with the re-start command and will restart it whenever there are changes (see the sketch after this list). This would interrupt your testing, however.
Use JRebel which does a better job of reloading classes than the JVM or most IDEs.
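For option 3, wiring in sbt-revolver is a one-line plugin addition; a sketch, with the version being an assumption to check against the plugin's README:

```scala
// project/plugins.sbt -- sketch: enable sbt-revolver.
addSbtPlugin("io.spray" % "sbt-revolver" % "0.9.1")
```

Then reStart launches the app in a forked JVM, reStop stops it, and ~reStart restarts it on every change.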
As I understand it, when I do nrepl-jack-in, a REPL is loaded along with all the dependencies defined in project.clj. If I then update project.clj to add a new dependency, do I need to kill the server and re-run nrepl-jack-in, or is there a way to update the dependencies in the current REPL?
Update: Maybe there is some hope:
See https://github.com/cemerick/pomegranate
Previously:
The short answer is yes - you do have to restart the JVM process.
I am aware of no good way to update dependencies in a live repl. Leiningen (called by nrepl-jack-in) will manage dependencies and set up the classpath only upon restarting. Trying to do something dynamic and clever is horribly fragile.
The struck-out text below is factually true, but on a moment's reflection it seemed such bad advice that I have marked it up as such...
If you have a local dependency (e.g. a jar file) you might use the long-deprecated function add-classpath at the repl. But you will be entering the dragon-infested swamp of Java classloaders.
Restarting the REPL seems to be the simplest way. This can be done with:
M-x cider-restart
That also appears to accomplish a lein deps. So the whole process of adding a new dependency simply involves adding the dependency to your project.clj and then invoking cider-restart.
Another (very convenient) way is to use clj-refactor. Adding the artifact (C-c m a p or cljr-add-project-dependency) will prompt for the version you want, automatically put the new dependency into your project.clj file, and reload your session.
Before pomegranate existed, I wrote my own library to dynamically load dependencies.
https://github.com/bmillare/dj
After the release of lein2, which can use pomegranate under the covers, I rewrote dj to use it underneath as well. So even if you don't use dj, it might be useful as a reference to see what it's doing.