Suppress main class in sbt-assembly - scala

In a multi-project SBT build, how should one explicitly suppress the mainClass attribute in SBT-assembly?
I've searched extensively but cannot seem to find out how to prevent a main class from being set in the jar built by sbt-assembly in a multi-project build.
In a single-project build, this seems to work as long as there are at least two classes that might potentially be invoked by command line:
import sbt._
import Keys._
import sbtassembly.Plugin._
import sbtassembly.AssemblyUtils._
import AssemblyKeys._
object TestBuild extends Build {
lazy val buildSettings = Defaults.defaultSettings ++ assemblySettings ++ Seq(
version := "0.1-SNAPSHOT",
organization := "com.organization",
scalaVersion := "2.10.2",
mergeStrategy in assembly := mergeFirst
)
lazy val root = Project(id = "test-assembly",
base = file("."),
settings = buildSettings) settings(
mainClass in assembly := None
)
lazy val mergeFirst: String => MergeStrategy = {
case "reference.conf" | "rootdoc.txt" => MergeStrategy.concat
case PathList("META-INF", xs # _*) =>
(xs map {_.toLowerCase}) match {
case ("manifest.mf" :: Nil) | ("index.list" :: Nil) | ("dependencies" :: Nil) =>
MergeStrategy.discard
case ps # (x :: xs) if ps.last.endsWith(".sf") || ps.last.endsWith(".dsa") =>
MergeStrategy.discard
case "plexus" :: xs =>
MergeStrategy.discard
case "services" :: xs =>
MergeStrategy.filterDistinctLines
case ("spring.schemas" :: Nil) | ("spring.handlers" :: Nil) =>
MergeStrategy.filterDistinctLines
case _ => MergeStrategy.first
}
case _ => MergeStrategy.first
}
}
However, it appears that the mainClass := None isn't even necessary. In fact, even if it is left there, a main class will still be set in the manifest if there is only one candidate class.
Unforunately, in a multi-project build, I could not prevent a main class from being set even by including an additional class as a dummy entry-point.
After a lot of fiddling around, I found that I could prevent a main class from being set by setting mainClass to None in several scopes independently. In particular, this does the trick:
mainClass in assembly := None
mainClass in packageBin := None
nainClass in Compile := None
mainClass in run := None
I could not find this requirement stated in the documentation and cannot figure out why this is necessary. Setting mainClass in (Compile, run) := None does not work. They really need to be scoped separately.
Is this the proper way to manually suppress a main class or am I missing something? If not a bug, I think this should be documented somewhere, especially given that the behavior is not consistent between single- and multi-project builds.

Main-Class and all other attributes in jar are ultimately determined by packageOptions in assembly, so all you could remove Package.MainClass from it as follows:
lazy val root = Project(id = "test-assembly",
base = file("."),
settings = buildSettings) settings(
packageOptions in assembly ~= { os => os filterNot {_.isInstanceOf[Package.MainClass]} }
)
This should work for multi-project builds too.
Note: sbt-assembly caches the output jar based on the metadata of the source input, so testing settings around assembly requires clearing the cache by calling clean.

Related

Generate Executable Jar for a Scala Project

I have a scala project which contain multiple main methods. I want to generate a fat jar so that I can run one of the main method related code.
build.sbt
lazy val commonSettings = Seq(
version := "0.1-SNAPSHOT",
organization := "my.company",
scalaVersion := "2.11.12",
test in assembly := {}
)
lazy val app = (project in file("app")).
settings(commonSettings: _*).
settings(
mainClass in assembly := Some("my.test.Category")
)
assemblyMergeStrategy in assembly := {
case "reference.conf" => MergeStrategy.concat
case "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister" => MergeStrategy.concat
case PathList("META-INF", xs#_*) => MergeStrategy.discard
case _ => MergeStrategy.first
}
plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.15.0")
By this Manifest file is generated successfully in my resource folder.
next I run sbt assembly and executable jar is generated successfully.
When i run java -jar category-assembly-0.1.jar i get the following error
no main manifest attribute, in category-assembly-0.1.jar
I tried many steps given in the internet but i keep getting this error
UPDATE
Currently following is included in my build.sbt.
lazy val spark: Project = project
.in(file("./spark"))
.settings(
mainClass in assembly := Some("my.test.Category"),
mainClass in (Compile, packageBin) := Some("my.test.Category"),
mainClass in (Compile, run) := Some("my.test.Category")
)
assemblyMergeStrategy in assembly := {
case "reference.conf" => MergeStrategy.concat
case "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister" => MergeStrategy.concat
case PathList("META-INF", xs#_*) => MergeStrategy.discard
case _ => MergeStrategy.first
}
After building the artifacts and run the command sbt assembly and tried running the genrated jar im still getting the same error as follows
no main manifest attribute, in category-assembly-0.1.jar
I had the same issue when i was deploying on EMR.
Add the following to your setting in build.sbt and it will be alright.
lazy val spark: Project = project
.in(file("./spark"))
.settings(
...
// set the main class for packaging the main jar
mainClass in (Compile, packageBin) := Some("com.orgname.name.ClassName"),
mainClass in (Compile, run) := Some("com.orgname.name.ClassName"),
...
)
This essentially sets the name of your existing class in your project as default mainClass. i set both for packageBin and run so you should be alright.
Just don't forget to rename the com.orgname.name.ClassName to your classname.
(This is just to refresh your memory)
Classname consists of <PackageName>/<ClassName>.
For example:
package com.orgname.name
object ClassName {}

Is it possible to extend build.sbt syntax using implicit classes?

I have a big root sbt project that contains several sub-projects. Among those project definitions there is a lot of code duplication that I am trying to remove.
For example every assembly project contains the following code:
project
.enablePlugins(sbtassembly.AssemblyPlugin)
.settings(
mainClass in Compile := Some(mainClassName),
assemblyJarName in assembly := jarName,
assemblyMergeStrategy in assembly := {
case PathList("META-INF", xs#_*) => MergeStrategy.discard
case PathList("sandbox.sc") => MergeStrategy.discard
case PathList("org", "joda", "time", xs#_*) => MergeStrategy.first
case PathList("reference.conf") => MergeStrategy.concat
case x => MergeStrategy.deduplicate
}
)
Instead i would like to write something like that:
project
.assembly(className, jarName)
Is it possible to achieve such syntax? I know it is possible to achieve this syntax in a typical scala file using an implicit class. Is these a way to do it in sbt ?
Yes, it's possible. create object(SbtProjectImplicits.scala) under the project directory(depend on your project structure), like:
your-project/
project/SbtProjectImplicits.scala
src/
...
and the SbtProjectImplicits.scala object content maybe like:
object SbtProjectImplicits {
implicit class ProjectSettings(p: sbt.Project) {
def assembly(className: Class, jarName): sbt.Project = {
p.settings(
mainClass in Compile := Some(mainClassName),
assemblyJarName in assembly := jarName,
...
)
p
}
}
}
so you can in build.sbt do it like:
import SbtProjectImplicits._
project
.assembly(className, jarName)

How to add merge strategy to my build settings

Currently my build fails because the mergeStrategy isn't correct.
How can I fix this?
object MyAppBuild extends Build {
import Resolvers._
import Dependency._
import BuildSettings._
lazy val myApp = Project(
id = "myApp",
base = file("."),
settings = buildSettings ++ Seq(
resolvers := allResolvers,
exportJars := true,
libraryDependencies ++= Dependencies.catalogParserDependencies,
parallelExecution in Test := false,
//mergeStrategy in assembly := {
// ....
//}
)
)
}
If I had my settings in the build.sbt file it works like this:
assemblyMergeStrategy in assembly := {
case PathList("META-INF", xs # _*) => MergeStrategy.discard
case x => MergeStrategy.first
}
I want to move this logic to my Build.scala file now.
Please migrate to build.sbt style. http://www.scala-sbt.org/0.13/docs/Basic-Def.html
lazy val myApp = Project(
id = "myApp",
base = file("."),
settings = buildSettings ++ ... // this is likely the problem
The *.scala style has been discouraged in the docs and, sbt 0.13.13 officially deprecates it. One of the reasons is that Project(...)'s settings parameter is not compatible with auto plugin initialization order. If you migrate to build.sbt style it should be resolved.

SBT - Multi project merge strategy and build sbt structure when using assembly

I have a project that consists of multiple smaller projects, some with dependencies upon each other, for example, there is a utility project that depends upon commons project.
Other projects may or may not depend upon utilities or commons or neither of them.
In the build.sbt I have the assembly merge strategy at the end of the file, along with the tests in assembly being {}.
My question is: is this correct, should each project have its own merge strategy and if so, will the others that depend on it inherit this strategy from them? Having the merge strategy contained within all of the project definitions seems clunky and would mean a lot of repeated code.
This question applied to the tests as well, should each project have the line for whether tests should be carried out or not, or will that also be inherited?
Thanks in advance. If anyone knows of a link to a sensible (relatively complex) example that'd also be great.
In my day job I currently work on a large multi-project. Unfortunately its closed source so I can't share specifics, but I can share some guidance.
Create a rootSettings used only by the root/container project, since it usually isn't part of an assembly or publish step. It would contain something like:
lazy val rootSettings := Seq(
publishArtifact := false,
publishArtifact in Test := false
)
Create a commonSettings shared by all the subprojects. Place the base/shared assembly settings here:
lazy val commonSettings := Seq(
// We use a common directory for all of the artifacts
assemblyOutputPath in assembly := baseDirectory.value /
"assembly" / (name.value + "-" + version.value + ".jar"),
// This is really a one-time, global setting if all projects
// use the same folder, but should be here if modified for
// per-project paths.
cleanFiles <+= baseDirectory { base => base / "assembly" },
test in assembly := {},
assemblyMergeStrategy in assembly := {
case "BUILD" => MergeStrategy.discard
case "logback.xml" => MergeStrategy.first
case other: Any => MergeStrategy.defaultMergeStrategy(other)
},
assemblyExcludedJars in assembly := {
val cp = (fullClasspath in assembly).value
cp filter { _.data.getName.matches(".*finatra-scalap-compiler-deps.*") }
}
)
Each subproject uses commonSettings, and applies project-specific overrides:
lazy val fubar = project.in(file("fubar-folder-name"))
.settings(commonSettings: _*)
.settings(
// Project-specific settings here.
assemblyMergeStrategy in assembly := {
// The fubar-specific strategy
case "fubar.txt" => MergeStrategy.discard
case other: Any =>
// Apply inherited "common" strategy
val oldStrategy = (assemblyMergeStrategy in assembly).value
oldStrategy(other)
}
)
.dependsOn(
yourCoreProject,
// ...
)
And BTW, if using IntelliJ. don't name your root project variable root, as this is what appears as the project name in the recent projects menu.
lazy val myProjectRoot = project.in(file("."))
.settings(rootSettings: _*)
.settings(
// ...
)
.dependsOn(
// ...
)
.aggregate(
fubar,
// ...
)
You may also need to add a custom strategy for combining reference.conf files (for the Typesafe Config library):
val reverseConcat: sbtassembly.MergeStrategy = new sbtassembly.MergeStrategy {
val name = "reverseConcat"
def apply(tempDir: File, path: String, files: Seq[File]): Either[String, Seq[(File, String)]] =
MergeStrategy.concat(tempDir, path, files.reverse)
}
assemblyMergeStrategy in assembly := {
case "reference.conf" => reverseConcat
case other => MergeStrategy.defaultMergeStrategy(other)
}

assembly-merge-strategy issues using sbt-assembly

I am trying to convert a scala project into a deployable fat jar using sbt-assembly. When I run my assembly task in sbt I am getting the following error:
Merging 'org/apache/commons/logging/impl/SimpleLog.class' with strategy 'deduplicate'
:assembly: deduplicate: different file contents found in the following:
[error] /Users/home/.ivy2/cache/commons-logging/commons-logging/jars/commons-logging-1.1.1.jar:org/apache/commons/logging/impl/SimpleLog.class
[error] /Users/home/.ivy2/cache/org.slf4j/jcl-over-slf4j/jars/jcl-over-slf4j-1.6.4.jar:org/apache/commons/logging/impl/SimpleLog.class
Now from the sbt-assembly documentation:
If multiple files share the same relative path (e.g. a resource named
application.conf in multiple dependency JARs), the default strategy is
to verify that all candidates have the same contents and error out
otherwise. This behavior can be configured on a per-path basis using
either one of the following built-in strategies or writing a custom one:
MergeStrategy.deduplicate is the default described above
MergeStrategy.first picks the first of the matching files in classpath order
MergeStrategy.last picks the last one
MergeStrategy.singleOrError bails out with an error message on conflict
MergeStrategy.concat simply concatenates all matching files and includes the result
MergeStrategy.filterDistinctLines also concatenates, but leaves out duplicates along the way
MergeStrategy.rename renames the files originating from jar files
MergeStrategy.discard simply discards matching files
Going by this I setup my build.sbt as follows:
import sbt._
import Keys._
import sbtassembly.Plugin._
import AssemblyKeys._
name := "my-project"
version := "0.1"
scalaVersion := "2.9.2"
crossScalaVersions := Seq("2.9.1","2.9.2")
//assemblySettings
seq(assemblySettings: _*)
resolvers ++= Seq(
"Typesafe Releases Repository" at "http://repo.typesafe.com/typesafe/releases/",
"Typesafe Snapshots Repository" at "http://repo.typesafe.com/typesafe/snapshots/",
"Sonatype Repository" at "http://oss.sonatype.org/content/repositories/releases/"
)
libraryDependencies ++= Seq(
"org.scalatest" %% "scalatest" % "1.6.1" % "test",
"org.clapper" %% "grizzled-slf4j" % "0.6.10",
"org.scalaz" % "scalaz-core_2.9.2" % "7.0.0-M7",
"net.databinder.dispatch" %% "dispatch-core" % "0.9.5"
)
scalacOptions += "-deprecation"
mainClass in assembly := Some("com.my.main.class")
test in assembly := {}
mergeStrategy in assembly := mergeStrategy.first
In the last line of the build.sbt, I have:
mergeStrategy in assembly := mergeStrategy.first
Now, when I run SBT, I get the following error:
error: value first is not a member of sbt.SettingKey[String => sbtassembly.Plugin.MergeStrategy]
mergeStrategy in assembly := mergeStrategy.first
Can somebody point out what I might be doing wrong here?
Thanks
As for the current version 0.11.2 (2014-03-25), the way to define the merge strategy is different.
This is documented here, the relevant part is:
NOTE:
mergeStrategy in assembly expects a function, you can't do
mergeStrategy in assembly := MergeStrategy.first
The new way is (copied from the same source):
mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
{
case PathList("javax", "servlet", xs # _*) => MergeStrategy.first
case PathList(ps # _*) if ps.last endsWith ".html" => MergeStrategy.first
case "application.conf" => MergeStrategy.concat
case "unwanted.txt" => MergeStrategy.discard
case x => old(x)
}
}
This is possibly applicable to earlier versions as well, I don't know exactly when it has changed.
I think it should be MergeStrategy.first with a capital M, so mergeStrategy in assembly := MergeStrategy.first.
this is the proper way to merge most of the common java/scala projects.
it takes care of META-INF and classes.
also the service registration in META-INF is taken care of.
assemblyMergeStrategy in assembly := {
case x if Assembly.isConfigFile(x) =>
MergeStrategy.concat
case PathList(ps # _*) if Assembly.isReadme(ps.last) || Assembly.isLicenseFile(ps.last) =>
MergeStrategy.rename
case PathList("META-INF", xs # _*) =>
(xs map {_.toLowerCase}) match {
case ("manifest.mf" :: Nil) | ("index.list" :: Nil) | ("dependencies" :: Nil) =>
MergeStrategy.discard
case ps # (x :: xs) if ps.last.endsWith(".sf") || ps.last.endsWith(".dsa") =>
MergeStrategy.discard
case "plexus" :: xs =>
MergeStrategy.discard
case "services" :: xs =>
MergeStrategy.filterDistinctLines
case ("spring.schemas" :: Nil) | ("spring.handlers" :: Nil) =>
MergeStrategy.filterDistinctLines
case _ => MergeStrategy.first
}
case _ => MergeStrategy.first}
I have just setup a little sbt project that needs to rewire some mergeStrategies, and found the answer a little outdated, let me add my working code for versions (as of 4-7-2015)
sbt 0.13.8
scala 2.11.6
assembly 0.13.0
mergeStrategy in assembly := {
case x if x.startsWith("META-INF") => MergeStrategy.discard // Bumf
case x if x.endsWith(".html") => MergeStrategy.discard // More bumf
case x if x.contains("slf4j-api") => MergeStrategy.last
case x if x.contains("org/cyberneko/html") => MergeStrategy.first
case PathList("com", "esotericsoftware", xs#_ *) => MergeStrategy.last // For Log$Logger.class
case x =>
val oldStrategy = (mergeStrategy in assembly).value
oldStrategy(x)
}
For the new sbt version (sbt-version :0.13.11), I was getting the error for slf4j; for the time being took the easy way out : Please also check the answer here Scala SBT Assembly cannot merge due to de-duplication error in StaticLoggerBinder.class where sbt-dependency-graph tool is mentioned which is pretty cool to do this manually
assemblyMergeStrategy in assembly <<= (assemblyMergeStrategy in assembly) {
(old) => {
case PathList("META-INF", xs # _*) => MergeStrategy.discard
case x => MergeStrategy.first
}
}
Quick update: mergeStrategy is deprecated. Use assemblyMergeStrategy. Apart from that, earlier responses are still solid
Add following to build.sbt to add kafka as source or destination
assemblyMergeStrategy in assembly := {
case PathList("META-INF", xs # _*) => MergeStrategy.discard
//To add Kafka as source
case "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister" =>
MergeStrategy.concat
case x => MergeStrategy.first
}