How do I resolve my own test artifacts in SBT?

One of my projects provides a jar that is supposed to be used for unit testing in several other projects. So far I have managed to make sbt produce an objects-commons_2.10-0.1-SNAPSHOT-test.jar and publish it to my repository.
However, I can't find a way to tell sbt to use that artifact in the test scope of other projects.
Adding the following dependencies to my build.scala does not get the test artifact loaded:
"com.company" %% "objects-commons" % "0.1-SNAPSHOT",
"com.company" %% "objects-commons" % "0.1-SNAPSHOT-test" % "test",
What I need is to use the default .jar file as a compile and runtime dependency and the -test.jar as a dependency in my test scope. But somehow sbt never tries to resolve the test jar.

How to use test artifacts
To publish the test artifact whenever the main artifact is published, add the following to the library's build.sbt:
publishArtifact in (Test, packageBin) := true
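For context, the library's build.sbt might then look roughly like this (a minimal sketch; the Scala version and the name/version settings are assumptions based on the question, only the last line comes from this answer):
// hypothetical build.sbt of the library
name := "objects-commons"
organization := "com.company"
version := "0.1-SNAPSHOT"
scalaVersion := "2.10.4" // assumption; the question's artifact is built for 2.10

// publish the test jar (classifier "tests") alongside the main artifact
publishArtifact in (Test, packageBin) := true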
Publish your artifact. There should be at least two JARs: objects-commons_2.10.jar and objects-commons_2.10-test.jar.
To use the library at runtime and the test library in the test scope, add the following lines to the build.sbt of the main application:
libraryDependencies ++= Seq(
  "com.company" % "objects-commons_2.10" % "0.1-SNAPSHOT",
  "com.company" % "objects-commons_2.10" % "0.1-SNAPSHOT" % "test" classifier "tests" // for sbt 0.12: classifier "test" (without the s)
)
The first entry loads the runtime libraries and the second makes the "tests" artifact available only in the test scope.
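Equivalently, you can let sbt append the Scala binary version for you with %% (a sketch using the same coordinates as above):
libraryDependencies ++= Seq(
  "com.company" %% "objects-commons" % "0.1-SNAPSHOT",
  "com.company" %% "objects-commons" % "0.1-SNAPSHOT" % "test" classifier "tests"
)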
I created an example project:
git clone git@github.com:schleichardt/stackoverflow-answers.git --branch so15290881-how-do-i-resolve-my-own-test-artifacts-in-sbt
Or you can view the example directly on GitHub.

Your problem is that sbt thinks your two jars are the same artifact with different versions. It takes the "latest", which is 0.1-SNAPSHOT, and ignores 0.1-SNAPSHOT-test. This is the same behaviour you would see if, for instance, you had 0.1-SNAPSHOT and 0.2-SNAPSHOT.
I don't know what is in these two jars, but if you want them both on the classpath, which is what you seem to want, then you'll need to change the name of the test artifact to objects-commons-test, as Kazuhiro suggested. That should be easy enough for you, since you're already putting it in the repo yourself.

It will work fine if you change the name like this:
"com.company" %% "objects-commons" % "0.1-SNAPSHOT",
"com.company" %% "objects-commons-test" % "0.1-SNAPSHOT" % "test",

Related

How to externalize protobuf files in the JVM ecosystem?

I stumbled upon this Akka gRPC tutorial, which suggests that we can create a jar from a project that has .proto files under src/main/proto and add it as a dependency in client and server projects to build their respective stubs.
libraryDependencies += "com.example" %% "my-grpc-service" % "1.0.0" % "protobuf-src"
But this doesn't seem to work! Are there any example projects that demonstrate how this would work in action? How can we externalize protobuf sources and use them in a JVM-based project?
I was able to figure out how to externalise protobuf files properly, following the suggestion in the akka-grpc docs.
The problem was that I was not adding the sbt-akka-grpc plugin, which sbt needs in order to recognise .proto files and include them in the packaged jar. Without this plugin no .proto files are made available in the packaged jar.
addSbtPlugin("com.lightbend.akka.grpc" % "sbt-akka-grpc" % "1.1.0")
Make sure to set organization in your build.sbt so the jar is published under the correct coordinates:
organization := "com.iamsmkr"
Also, if you wish to cross-compile this jar for multiple Scala versions, add the following entries to your build.sbt:
scalaVersion := "2.13.3"
crossScalaVersions := Seq(scalaVersion.value, "2.12.14")
and then to publish:
$ sbt +publishLocal
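Putting those pieces together, the library's build might look roughly like this (a sketch; the name and version values are assumptions, the plugin and settings are the ones mentioned above):
// project/plugins.sbt
addSbtPlugin("com.lightbend.akka.grpc" % "sbt-akka-grpc" % "1.1.0")

// build.sbt
organization := "com.iamsmkr"
name := "prime-protobuf" // assumed artifact name, matching the dependency shown below
version := "1.0.0"       // assumed version
scalaVersion := "2.13.3"
crossScalaVersions := Seq(scalaVersion.value, "2.12.14")
enablePlugins(AkkaGrpcPlugin) // enable the plugin on this project, per the akka-grpc docs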
With appropriate jars published you can now add them as dependencies in your client and server projects like so:
libraryDependencies +=
  "com.iamsmkr" %% "prime-protobuf" % protobufSourceVersion % "protobuf-src"
You can check out this project I am working on to see this in action.
Alternate Way
An alternative way I figured out is to keep your .proto files in a root directory and then refer to them in the client and server build.sbt like so:
PB.protoSources.in(Compile) := Seq(sourceDirectory.value / ".." / ".." / "proto")
Check out this project to see it in action.

sbt assembly, including my jar

I want to build a 'fat' jar of my code. I mostly understand how to do this, but all the examples I have assume the jar is not local, and I am not sure how to include another JAR that I built, and that the Scala code uses, into my assembled jar. What folder does that JAR have to reside in?
Normally when I run my current code as a test using spark-shell it looks like this:
spark-shell --jars magellan_2.11-1.0.6-SNAPSHOT.jar -i st_magellan_abby2.scala
(the jar file is right in the same path as the .scala file)
So now I want to build a build.sbt file that does the same and includes that SNAPSHOT.jar file?
name := "PSGApp"
version := "1.0"
scalaVersion := "2.11.8"
resolvers += "Spark Packages Repo" at "http://dl.bintray.com/spark-packages/maven"
// "provided" means don't include it in the assembly; it's already on the cluster?
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.2.0" % "provided",
  "org.apache.spark" %% "spark-sql" % "2.2.0" % "provided",
  "org.apache.spark" %% "spark-streaming" % "2.2.0" % "provided"
  // add magellan here somehow?
)
So where would I put the jar in the SBT project folder structure so it gets picked up when I run sbt assembly? Is that the main/resources folder, which the reference manual says is where 'files to include in the main jar' go?
What would I put in libraryDependencies so sbt knows to add that specific jar and not go out onto the internet to get it?
One last thing: I was also doing some imports in my test code that don't seem to fly now that I have put this code in an object with a def main attached to it.
I had things like:
import sqlContext.implicits._, which was right in the code above where it was about to be used, like so:
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._
import org.apache.spark.sql.functions.udf
val distance = udf { (a: Point, b: Point) =>
  a.withinCircle(b, .001f) // current radius set to .0001
}
I am not sure whether I can just keep these imports inside def main, or whether I have to move them somewhere else. (Still learning Scala and wrangling with the scoping, I guess.)
One way is to build your fat jar using the assembly plugin (https://github.com/sbt/sbt-assembly) locally and run publishLocal to store the resulting jar in your local ivy2 cache.
This makes it available for inclusion in your other project, based on the build.sbt settings of this project, e.g.:
name := "My Project"
organization := "org.me"
version := "0.1-SNAPSHOT"
It will then be locally available as "org.me" %% "my-project" % "0.1-SNAPSHOT".
SBT searches the local cache before trying to download from an external repo.
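In the consuming project that is just a normal dependency line, e.g.:
libraryDependencies += "org.me" %% "my-project" % "0.1-SNAPSHOT"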
However, this is considered bad practice, because only the final project should ever be a fat jar. You should never include one as a dependency (many headaches).
There is no reason to make the magellan project a fat jar if the library is included in PSGApp. Just publishLocal without assembly.
Another way is to make the projects depend on each other as code, not as a library:
lazy val projMagellan = RootProject("../magellan")
lazy val projPSGApp = project.in(file(".")).dependsOn(projMagellan)
This makes compilation in projPSGApp trigger compilation in projMagellan.
It depends on your use case though.
Just don't get into a situation where you have to manage your .jar files manually.
The other question:
import sqlContext.implicits._ should always be included in the scope where DataFrame operations are required, so you shouldn't put that import with the other ones at the top of the file.
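A rough sketch of what that looks like inside main (the context setup here is an assumption, not the asker's actual code):
def main(args: Array[String]): Unit = {
  val sc = new org.apache.spark.SparkContext(new org.apache.spark.SparkConf().setAppName("PSGApp"))
  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
  import sqlContext.implicits._ // in scope right where the DataFrame conversions are used
  // ... DataFrame code that needs the implicits goes here
}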
Update
Based on the discussion in the comments, my advice would be:
Get the magellan repo
git clone git@github.com:harsha2010/magellan.git
Create a branch to work on, e.g.
git checkout -b new-stuff
Change the code you want
Then update the version number, e.g.
version := "1.0.7-SNAPSHOT"
Publish locally
sbt publishLocal
You'll see something like (after a while):
[info] published ivy to /Users/tomlous/.ivy2/local/harsha2010/magellan_2.11/1.0.7-SNAPSHOT/ivys/ivy.xml
Go to your other project
Change build.sbt to include
"harsha2010" %% "magellan" % "1.0.7-SNAPSHOT" in your libraryDependencies
Now you have a good (temp) reference to your library.
Your PSGApp should then be built as a fat jar assembly to pass to Spark:
sbt clean assembly
This will pull in the custom-built jar.
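For completeness, the assembly task itself comes from the sbt-assembly plugin, which needs to be declared in project/plugins.sbt; a sketch (the version shown is only an example):
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")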
If the change in the magellan project is useful for the rest of the world, you should push your changes and create a pull request, so that in the future you can just include the latest build of this library.

How to activate the sbt DockerPlugin in scala?

I have two Scala projects; one is already set up to build its Docker container through the sbt Docker plugin. The other one I want to dockerize as well.
The working one has in its build.sbt the following lines relevant to the docker config:
organization := "com.namespace"
name := "dockerized-app"
version := sys.env.getOrElse("PIPELINE_VERSION", "0.1.0_local")
scalaVersion := "2.12.4"
enablePlugins(JavaAppPackaging)
enablePlugins(DockerPlugin)
packageName in Docker := packageName.value
dockerRepository := Some("our-docker.io:5001")
dockerExposedPorts := Seq(8080)
I thought that I could copy-paste the relevant lines to the new project, change the name, and make it work.
Yet when I add this line to the to-be-dockerized Scala project:
enablePlugins(DockerPlugin)
I get the error:
Cannot resolve symbol DockerPlugin
I've looked through the pre-existing project's libraryDependencies, yet it doesn't seem to be configured that way. In the pre-configured project, IntelliJ somehow knows the plugin; I can track that DockerPlugin comes from com.typesafe.sbt.packager.docker. This made me assume that sbt ships with it by default.
Yet apparently I have to activate it somehow.
Digging deeper, I also tried adding this to my plugins.sbt, to no avail:
addSbtPlugin("com.typesafe.sbt" % "sbt-native-packager" % "1.3.2")
How to activate DockerPlugin using sbt in scala?
To make it work properly you need to add the following line:
addSbtPlugin("com.typesafe.sbt" % "sbt-native-packager" % "1.3.2")
in your project/plugins.sbt file.
Then refresh your project and it should work.
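Put together, the new project then mirrors the working one (a sketch reusing the settings shown in the question; the repository and port are simply the values from there):
// project/plugins.sbt
addSbtPlugin("com.typesafe.sbt" % "sbt-native-packager" % "1.3.2")

// build.sbt
enablePlugins(JavaAppPackaging)
enablePlugins(DockerPlugin)
packageName in Docker := packageName.value
dockerRepository := Some("our-docker.io:5001")
dockerExposedPorts := Seq(8080)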
For further information, please check the sbt-native-packager documentation.

How to specify a dependency that does not appear in the published artifact?

I have a build that contains two modules, a and b. b depends on a. a contains only resources (think: logging configuration). Locally, if I run the sbt task b/console, I want a to be on the classpath. However, I don't want to publish a and hence don't want the dependency to appear in b's published artifact. How can I configure this?
You can use the 'provided' scope when adding the library dependency
libraryDependencies += "javax.servlet" % "javax.servlet-api" % "3.0.1" % "provided"

Shading over third party classes

I'm currently facing a problem deploying an uber JAR to a Spark Streaming application: the same JAR is present in different versions, which causes Spark to throw run-time exceptions. The library in question is Typesafe Config.
After attempting many things, my solution was to resort to shading the provided dependency so it won't clash with the JAR provided by Spark at run time.
Hence, I went to the documentation for sbt-assembly and under shading, I saw the following example:
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("org.apache.commons.io.**" -> "shadeio.#1")
    .inLibrary("commons-io" % "commons-io" % "2.4", ...).inProject
)
Attempting to shade over com.typesafe.config, I tried applying the following solution to my build.sbt:
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.typesafe.config.**" -> "shadeio.#1").inProject
)
I assumed it was supposed to rename any reference to Typesafe Config in my project. But this doesn't work: it matches multiple classes in my project and causes them to be removed from the uber JAR. I see this when trying to run sbt assembly:
Fully-qualified classname does not match jar entry:
jar entry: ***/Identifier.class
class name: **/Identifier.class
Omitting ***/OtherIdentifier.class.
Fully-qualified classname does not match jar entry:
jar entry: ***\SparkBaseJobRunner$$anonfun$1.class
class name: ***/SparkBaseJobRunner$$anonfun$1.class
I also attempted using:
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.typesafe.config.**" -> "shadeio.#1")
    .inLibrary("com.typesafe" % "config" % "1.3.0")
)
This did finish the assembly process of the uber JAR, but didn't have the desired run-time effect.
I'm not sure I fully comprehend the effect shading has on my build process with sbt.
How can I shade over references to com.typesafe.config in my project so that when I invoke the library at run time, Spark will load my shaded copy and avoid the clash caused by the versioning?
I'm running sbt-assembly v0.14.1
It turns out this was a bug in sbt-assembly where shading was completely broken on Windows. This caused class files to be removed from the uber JAR and tests to fail because those classes were unavailable.
I created a pull request to fix this. Starting with version 0.14.3 of sbt-assembly, the shading feature works properly. All you need to do is update to the relevant version in plugins.sbt:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
In order to shade a specific JAR in your project, you do the following:
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.typesafe.config.**" -> "my_conf.#1")
    .inLibrary("com.typesafe" % "config" % "1.3.0")
    .inProject
)
This will rename the com.typesafe.config package so that it is bundled inside my_conf. You can then verify this by running jar -tf on your assembly (irrelevant parts omitted for brevity):
***> jar -tf myassembly.jar
my_conf/
my_conf/impl/
my_conf/parser/
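Note that your source keeps importing com.typesafe.config as before; the rename is applied to the bytecode while assembling, so the shaded classes in the uber JAR no longer collide with the version Spark ships. Roughly:
import com.typesafe.config.{Config, ConfigFactory} // in source; references are rewritten to my_conf.* inside the uber JAR

val conf: Config = ConfigFactory.load() // at run time this resolves to the shaded copy, not Spark's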
Edit
I wrote a blog post describing the issue and the process that led to it for anyone interested in a more in-depth explanation.