How do I build a project that uses sbt as its build system?

I have downloaded a project which uses sbt as its build system and I want to build it. You'd think it would be as simple as typing "sbt" or something, but no.
I thought I'd add a question for this because it can take literally hours to figure this out on your own. I'm not joking.

tl;dr:
sbt compile
If you want to run it:
sbt run
To see what other targets are available:
sbt tasks
To get some (other) help, mostly targeted at commands typed from the sbt console (i.e., running sbt without parameters):
sbt help
This all assumes sbt version >= 0.10.0. To see what version of sbt is in use, do:
grep sbt.version project/build.properties
If there's no such file, and there's a file with the extension ".sbt" in the base directory (not the project directory), then it's >= 0.10.0. Of course, if the grep works, it should tell you the version.
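For example, project/build.properties normally contains just a single line along these lines (the exact version number will of course vary by project):
sbt.version=0.13.7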

First, you'll want to use sbt-extras, because that automatically downloads and uses the right version of sbt. Trying to use the wrong version of sbt (newer or older than what the project you're trying to build says it requires) won't necessarily work, and may cause strange errors.
Run it:
~/path/to/sbt-extras/sbt
Wait for it to start up and download everything. If you need to use an authenticated proxy, you'll need to edit the script to specify the username and password for the proxy.
Check the version of Scala that sbt thinks it needs to build against (at the end of the output, if everything worked). If this is OK, fine, you don't need to do anything. If it isn't, you can temporarily specify a version explicitly with ++, e.g.:
++2.8.1
(If you want to make this permanent, you can edit the build definition files, but as that involves making a change to files under version control, that might not be what you want to do.)
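For reference, with sbt >= 0.10 the permanent form is a one-line setting in the build definition; a minimal sketch, reusing the example version from above:
scalaVersion := "2.8.1"
placed in the project's build.sbt.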
Now, if you are using an older version of sbt, don't skip the next step! You could get strange errors if you do.
update
Now you can build and test what you've built:
test
If you get a "Filename too long" error, this is not an sbt-specific problem but a Scala one, and it most frequently affects Ubuntu users (technically, it's generally related to home directories encrypted with encfs). If you are using Scala >= 2.9, edit the build to use the scalac command-line option that lets you specify a maximum filename length (a build.sbt sketch follows the commands below). Otherwise, if you are on Linux, you can redirect the build to /dev/shm or /tmp by running these commands at a shell prompt (don't background sbt with CTRL+Z on Unix, because it may appear to stop working properly):
rm -rf target
ln -s /dev/shm target
(you may have to execute these commands in project/build instead or as well.)
Actually, it's probably better, and may even be more secure, to create a subdirectory of /dev/shm or /tmp and use that instead.
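For the scalac filename-length option mentioned above, a minimal build.sbt sketch (this assumes a Scala version whose compiler supports -Xmax-classfile-name; the limit of 128 is only an illustrative value):
scalacOptions ++= Seq("-Xmax-classfile-name", "128")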
The compilation result should appear in target. You might then want to run it, if it's something you can run:
run
If everything looks OK, you can optionally publish the result locally so that the result can then be picked up automatically by other sbt builds:
publish-local

I don't think I could explain it better than the Getting Started Guide does. Please read the first six parts of it, which shouldn't take too long, to get up and running.

Related

How to vendor dependencies in an SBT project?

After ~8 years of using Scala, I took a detour to a small language you might have heard of called Go. It's not without its flaws (grow a real type system, boi!), but it does some things much better than Scala could ever hope to.
Go manages dependencies in source form, which any sensible engineer would consider terrifying until she discovers that storing one's dependencies in a vendor/ directory under source control is a "get out of jail free" card for cases when dependency resolution either becomes too complicated for its own good, or depends on flaky 3rd party resources, such as the network.
The latest version of Go's CLI tooling comes with a command called go mod vendor, which does the legwork of downloading the current module's dependencies into a vendor/ directory inside the project, which can subsequently be checked into source control. Setting aside discussions regarding the merits of aggressively and preemptively caching dependencies in this fashion, I would like to state for the record that this command is very convenient.
SBT is notorious for downloading dependencies into ~/.ivy2, which is more of a free-for-all cache shared by all of a user's projects rather than just one. There's a smaller cache in ~/.sbt, which is used by SBT itself as a Humpty Dumpty / Mr Potato Head scratch space. Both directories will be created & populated automatically if they don't exist, but neither is intended to be explicitly managed by the user. Both are internal implementation details of SBT and/or Ivy, and should not be messed with "unless you know what you're doing".
What I want (and now I'll be asking for things) is a sbt vendor command that would do the legwork of populating the unmanaged classpath with all of my project's dependencies. If it can also download all that's needed to run SBT itself into the same directory, that would be just peachy.
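(The closest built-in knob I know of is pointing the unmanaged jar directory at a folder inside the project, e.g. unmanagedBase := baseDirectory.value / "vendor" in build.sbt, but that only covers jars I copy there by hand; it does none of the downloading.)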
Is there a SBT plugin or some sequence of arcane incantations that can be used to accomplish that which I seek?
Before this question gets closed forever by an overzealous posse of moderators, I'm going to post here the hack which got me over this particular hump. As usual, the solution ended up being a shell script:
#!/bin/bash
# Wrapper around the real SBT launcher that keeps all caches inside the project.
root="$(readlink -f "$(dirname "$0")")"
sbt="$root/.sbt-launcher"
if [ ! -x "$sbt" ]; then
  echo "$sbt does not exist or is not executable"
  exit 1
fi
exec "$sbt" \
  -ivy "$root/.ivy2" \
  -sbt-dir "$root/.sbt" \
  -sbt-boot "$root/.sbt/boot" \
  -sbt-launch-dir "$root/.sbt/launchers" \
  "$@"
Let's unpack this really quickly. First, the shell script is a wrapper for the real SBT launcher, which is located in the same directory and named .sbt-launcher. Any recent version should work; you too can download one from http://git.io/sbt.
My wrapper ensures that four flags are always passed to the real SBT launcher:
-ivy specifies a custom location for the Ivy cache.
-sbt-dir, -sbt-boot, and -sbt-launch-dir together force SBT to stop using the account-wide ~/.sbt directory as a dumping ground for SBT JARs and other things.
I saved this shell script as sbt inside my project, placed .sbt-launcher from http://git.io/sbt right next to it, and began using the wrapper instead of the real SBT. I then checked into source control the directories .ivy2 and .sbt which were created inside my project.
That's all it took. It's not an elegant solution, but it does a good job of isolating my CI/CD process from volatility in Internet artifact repositories.

How do I get the commands executed by Bazel

I was wondering if there is a way to get Bazel to list, output, display, etc. all of the commands it runs during a build after a clean, in a form that can be executed from a command line. I do not care if the output goes to the screen, to a file, etc. I will massage it into a usable form if necessary.
I have captured the screen output during a run of Bazel, which gives me an idea of what is being done, but it does not give me commands I can execute on the command line. The commands would have to include all of the options, with variables expanded rather than displayed.
If this is not possible: since Bazel is open source, where in the code are the lines that represent the commands to be run, so that I can modify Bazel to output the executable commands?
I am aware of the query command within Bazel, and used it to generate the dependency diagram. If this could be done as a query command it would be even better.
TLDR;
My goal is to build TensorFlow using Bazel on Windows. Yes, I know of all the problems and reasons NOT to do it, and I have successfully installed TensorFlow on Windows via a virtual machine or Docker. I did take a shot at building Bazel on Windows starting with Cygwin, but that started to get out of hand, as I am used to installing with packages and Cygwin doesn't play nice with packages; then I tried to build Bazel by hand, and that was turning into a quagmire. So I am now trying to just build TensorFlow by hand on Windows by duplicating what Bazel would do to build TensorFlow on Linux.
You are correct, you can use the -s (--subcommands) option:
bazel build -s //foo
See https://docs.bazel.build/versions/master/user-manual.html#flag--subcommands.
For your use case, you'd probably want to redirect the output to a file and then globally replace any library/binary paths with their Windows equivalents.
You might want to track https://github.com/bazelbuild/bazel/issues/276 (Windows support), although it'll probably be a while.
(Disclaimer: This solution does not print the commands that currently get executed but the commands that would get or got executed.)
I'd use aquery (action graph query) (forget about "graph"):
bazel aquery //foo
Advantages:
It's very fast, because it prints the actions without executing the build.
It's a query. It does not have side effects.
You don't have to do a bazel clean before in order to find out the build steps for a library that has already been built.
It prints information about the specific build step that you request. It does not print all the build commands required for the dependencies.

How to debug %post with rpmbuild

I'm building an RPM that needs to run a number of scripts after it's been installed to complete the configuration. I have to run the scripts in the %post section because the configuration depends on the type of host. All this is well and good, but every time I run into a bug in the %post section, I have to rebuild the entire package, which takes about 20 minutes. Is there a way to skip recompiling everything and build a new package with just the changes to %post?
If your spec file doesn't create a random build directory and doesn't delete the build tree afterwards, make can skip the time-consuming compilation, i.e. similar to not using the --clean option in rpmbuild.
You can then also use the --short-circuit flag to rpmbuild to skip the first stages in building.
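For example (a sketch; the spec path is illustrative, and --short-circuit is traditionally documented as valid only with the -bc and -bi stages):
rpmbuild -bi --short-circuit SPECS/mypackage.spec
This resumes at the %install stage without redoing %prep and %build.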
Is the script doing something different when run manually vs. run from the RPM install? You can make a test RPM that has no %post, then manually run the script under test (as root). Do this on every host type that needs to be tested until you think you've got it. Then add it as the %post and give it a try.

Achieve SBT Run startup speed while executing through command line

I've been working on a small set of command-line programs in Scala. While developing I used SBT and tested the programs with run within the console. At this point the programs had a fast startup time (when re-run after the initial compilation): nearly instant, even with additional dependencies.
Now that I'm trying to actually use them on my system outside of sbt, there is noticeable startup lag. I'm looking for ways to reduce this, since the nature of these utilities requires little to no delay.
The best startup times I've achieved so far have been through Drip. I include all dependencies in a lib directory using Pack and then run by executing a shell script like this:
#!/bin/sh
SCRIPT=$(readlink -f "$0")
SCRIPT_PATH=$(dirname "$SCRIPT")
PROG_HOME=$(cd "$SCRIPT_PATH/.." && pwd)
CLASSPATH_SUFFIX=""
# Path separator used in EXTRA_CLASSPATH
PSEP=":"
# Add the lib directory to the classpath; TagWorkspace is the main class
exec drip \
  -cp "${PROG_HOME}/lib/*${CLASSPATH_SUFFIX}" \
  TagWorkspace "$@"
This is still noticeably slower than invoking run from within SBT.
I'm curious as to why SBT is able to start up the application so much faster, and whether there is some way for me to leverage its strategy, or SBT itself, even if that means keeping a long-lived process around to actually run a command through.
Unless you have forking turned on for your run task, this is likely due to VM startup time. When you run from inside an active SBT session, you have an already initialized VM pointing at your classes - all SBT needs to do is create a new ClassLoader and point it at your build output directory. This bypasses all of the other (not insignificant) stuff that happens when you fire up a new VM.
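For reference, forking of the run task is controlled from the build definition; a minimal sketch (sbt 1.x syntax, the older 0.13 equivalent being fork in run := true):
Compile / run / fork := true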
Have you tried using the client VM to start your utility from the command line? Sadly, this isn't an option with 64-bit Java, since Oracle apparently doesn't want to support it, but if you're using a 32-bit VM, try adding the -client argument to the list that you give the VM from the command line.
If you are using a 64-bit VM, some googling will find you some unofficial forks of OpenJDK that have the client VM re-enabled. It's really just a #define in the JVM build itself - it works fine once it's been compiled in.
The only slowness I have is launching SBT. Running a hello-world Scala app with java (no Drip) version 1.8 on a 7381-bogomips CPU takes only 0.2 seconds.
If you're not in that magnitude, I suspect your application startup requires loading thousands of classes, and creating instances of them.
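(If you want to verify that, a quick check is to start the app once with the standard JVM flag -verbose:class, which prints every class as it is loaded and gives you a rough count.)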

Does SBT use the Fast Scala Compiler (fsc)?

Does SBT make use of fsc?
For test purposes I am compiling a 500-line program on a fairly slow Ubuntu machine (Atom N270). Three successive compile times were 77s, 66s, and 66s.
I then compiled the file with fsc from the command line. Now my times were 80s, 25s, 18s. Better! That implies to me sbt is not using fsc. Am I right? If so, why doesn't it use it?
I may try getting sbt to explicitly use fsc to compile, though I am not sure I will figure out the config. Has anyone done this?
This discussion made me realize I have been using sbt the wrong way.
Instead of (from the command line):
$ sbt compile
$ sbt test
...one should keep sbt running and treat it as a command prompt.
$ sbt
> compile
...
> test
It has command history and even the ability to dive back into the OS command line. I wrote this 'answer' for others like me (coming from a Makefile mindset) that might not realize we're taking the pill all wrong. :)
(It's still slow, though.)
SBT gains little from the Fast Scala Compiler when you run it interactively (with or without its continuous build mode), because inside a running sbt session the Scala compiler classes are already loaded, "warmed up" and JIT-ed, which is the bulk of fsc's advantage.
At least for SBT 0.7.x the authors explain that it is not as fast as fsc, which caches the compiler instance (including the loaded libraries), rather than just the JITted compiler classes:
http://code.google.com/p/simple-build-tool/wiki/ChangeDetectionAndTesting
My experience also confirms that fsc is faster for full compiles, but does not automatically select what to recompile.
For SBT 0.10, I can find no docs whatsoever on this issue.
You don't need fsc for this anymore. sbt now has its own client/server mode:
https://github.com/sbt/sbt/wiki/Client-server-split
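(In recent sbt 1.x releases this is exposed as a thin client: sbt --client <command>, or the sbtn binary, sends the command to an already-running sbt server instead of spinning up a fresh JVM each time.)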