SBT assembly falis - scala

I am running a spark job through intellij. Job executes and gives me output. i need to take this job as jar file to server and run, but when i try to do sbt assembly it throws below error:
[error] Not a valid command: assembly
[error] Not a valid project ID: assembly
[error] Expected ':' (if selecting a configuration)
[error] Not a valid key: assembly
[error] assembly
my sbt version is 0.13.8
below is my build.sbt file:
import sbt._, Keys._
name := "mobilewalla"
version := "1.0"
scalaVersion := "2.11.7"
libraryDependencies ++= Seq("org.apache.spark" %% "spark-core" % "2.0.0",
"org.apache.spark" %% "spark-sql" % "2.0.0")
i added a file assembly.sbt under project dir. it contains:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
what am i missing here

Add these lines in your build.sbt
assemblyMergeStrategy in assembly := {
case PathList("META-INF", xs # _*) => MergeStrategy.discard
case x => MergeStrategy.first
}
mainClass in assembly := Some("com.SparkMain")
resolvers += "spray repo" at "http://repo.spray.io"
assemblyJarName in assembly := "streaming-api.jar"
and include these lines in your plugins.sbt file
addSbtPlugin("io.spray" % "sbt-revolver" % "0.7.2")
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.0")

To assemble the multiple jars to one u need add below plugin in plugins.sbt under project directory.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
If u need to customize the assembled jar to trigger specific MainClass take example assembly.sbt
import sbtassembly.Plugin.AssemblyKeys._
Project.inConfig(Compile)(baseAssemblySettings)
mainClass in (Compile, assembly) := Some("<main application name with package path>")
jarName in (Compile, assembly) := s"${name.value}-${version.value}-dist.jar"
//below is merge strategy to make what all file need to exclude or include
mergeStrategy in (Compile, assembly) <<= (mergeStrategy in (Compile, assembly)) {
(old) => {
case PathList(ps # _*) if ps.last endsWith ".html" =>MergeStrategy.first
case "META-INF/MANIFEST.MF" => MergeStrategy.discard
case x => old(x)
}
}

Related

Spark / scala : publish a single fat jar with sbt

I am trying to deploy a single Fat Jar from a spark/scala project to a private nexus repository. The publish works fine but I get a large amount of files instead of one single assembly Jar :
*-assembly.jar
*-assembly.jar.md5
*-assembly.jar.sha1
*-javadoc.jar
*-javadoc.jar.md5
*-javadoc.jar.sha1
*-source.jar
*-source.jar.md5
*-source.jar.sha1
*.jar
*.jar.md5
*.jar.sha1
*-pom.jar
*-pom.jar.md5
*-pom.jar.sha1
My built.sbt is :
name := "SampleApp"
version := "0.1-SNAPSHOT"
scalaVersion := "2.12.14"
ThisBuild / useCoursier := false
libraryDependencies ++= Seq(
"com.github.scopt" %% "scopt" % "4.0.1",
"org.apache.spark" %% "spark-sql" % "3.1.1" % "provided"
)
resolvers ++= Seq(
"confluent" at "https://packages.confluent.io/maven/"
)
assembly / assemblyMergeStrategy := {
case PathList("META-INF","services",xs # _*) => MergeStrategy.filterDistinctLines
case PathList("META-INF",xs # _*) => MergeStrategy.discard
case "application.conf" => MergeStrategy.concat
case _ => MergeStrategy.first
}
assembly / assemblyExcludedJars := {
val cp = (assembly / fullClasspath).value
cp filter { f =>
f.data.getName.contains("hadoop-hdfs-2")
f.data.getName.contains("hadoop-client")
}
}
assembly / artifact := {
val art = (assembly / artifact).value
art.withClassifier(Some("assembly"))
}
assembly / assemblyJarName := s"${name.value}-${version.value}.jar"
addArtifact(assembly / artifact, assembly)
resolvers += ("Sonatype Nexus Repository Manager" at "http://localhost:8081/repository/app/").withAllowInsecureProtocol(true)
credentials += Credentials("Sonatype Nexus Repository Manager", "localhost:8081", "user", "pass")
publishTo := {
val nexus = "http://localhost:8081/repository/app/"
if (isSnapshot.value) {
Some("snapshots" at nexus + "test-SNAPSHOT")
} else
Some("releases" at nexus + "test-RELEASE")
}
Is there a way to filter files with sbt before the publish in order to get only the *-assembly.jar ? Thanks a lot.
This should disable the publish of unnecessary files in my case :
publishArtifact in (Compile, packageBin) := false
publishArtifact in (Compile, packageDoc) := false
publishArtifact in (Compile, packageSrc) := false

sbt is not including the main class in the manifest after assembly

I am using sbt 1.2.8 with assembly plugin. This is my sbt file:
name := "my-project"
version := "0.1"
scalaVersion := "2.11.8"
libraryDependencies ++= Seq(
... some dependency ...
)
mainClass in (Compile, assembly) := Some("some.package.MyMainClass")
assemblyMergeStrategy in assembly := {
case PathList("META-INF", xs # _*) => MergeStrategy.discard
case x => MergeStrategy.first
}
After I run the command sbt assembly configured in assembly.sbt with:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.7")
I get the following content from the extracted file:
scala-2.11 $ cat META-INF/MANIFEST.MF
Manifest-Version: 1.0
Implementation-Title: my-project
Implementation-Version: 0.1
Specification-Vendor: default
Specification-Title: my-project
Implementation-Vendor-Id: default
Specification-Version: 0.1
Implementation-Vendor: default
but I cannot see where my main class is specified. Any idea?
It works for me. The only change I did is rename assembly.sbt to plugins.sbt. Try it out.

Sbt ( new version 1.0.4) assembly failure

I have been trying to build fat jar for some time now. I got assembly.sbt in project folder and it looks like below
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.6")
and my build.sbt looks like below
name := "cool"
version := "0.1"
scalaVersion := "2.11.8"
resolvers += "Hortonworks Repository" at
"http://repo.hortonworks.com/content/repositories/releases/"
resolvers += "Hortonworks Jetty Maven Repository" at
"http://repo.hortonworks.com/content/repositories/jetty-hadoop/"
libraryDependencies ++= Seq(
"org.apache.spark" % "spark-streaming_2.10" % "1.6.1.2.4.2.0-258" %
"provided",
"org.apache.spark" % "spark-streaming-kafka-assembly_2.10" %
"1.6.1.2.4.2.0-258"
)
assemblyMergeStrategy in assembly := {
case PathList("META-INF", xs # _*) => MergeStrategy.discard
case x => MergeStrategy.first
}
i get below error
assemblyMergeStrategy in assembly := {
^
C:\Users\sreer\Desktop\workspace\cool\build.sbt:14: error: not found:
value assembly
assemblyMergeStrategy in assembly := {
^
C:\Users\sreer\Desktop\workspace\cool\build.sbt:15: error: not found:
value PathList
case PathList("META-INF", xs # _*) => MergeStrategy.discard
^
C:\Users\sreer\Desktop\workspace\cool\build.sbt:15: error: star patterns
must correspond with varargs parameters
case PathList("META-INF", xs # _*) => MergeStrategy.discard
^
C:\Users\sreer\Desktop\workspace\cool\build.sbt:15: error: not found:
value MergeStrategy
case PathList("META-INF", xs # _*) => MergeStrategy.discard
^
C:\Users\sreer\Desktop\workspace\cool\build.sbt:16: error: not found:
value MergeStrategy
case x => MergeStrategy.first
^
[error] Type error in expression
I get this Type error and seems like it won't recognize keys like "assemblyMergeStrategy". I use sbt new version 1.0.4 and latest version of eclipse IDE for scala.
I have tried changing version of sbt and still no result, went through whole document at https://github.com/sbt/sbt-assembly, made sure there were no typos and suggestions mentioned in other threads weren't of much help ( mostly questions are about older versions of sbt). If some one could guide me that would be very helpful. Thanks.

How to include test dependencies in sbt-assembly jar?

I am unable to package my test dependencies in my test assembly jar. Here is an excerpt from my build.sbt:
...
name := "project"
scalaVersion := "2.10.6"
assemblyOption in (Compile, assembly) := (assemblyOption in (Compile, assembly)).value.copy(includeScala = false)
fork in Test := true
parallelExecution in IntegrationTest := false
lazy val root = project.in(file(".")).configs(IntegrationTest.extend(Test)).settings(Defaults.itSettings: _ *)
Project.inConfig(Test)(baseAssemblySettings)
test in (Test, assembly) := {}
assemblyOption in (Test, assembly) := (assemblyOption in (Test, assembly)).value.copy(includeScala = false, includeDependency = true)
assemblyJarName in (Test, assembly) := s"${name.value}-test.jar"
fullClasspath in (Test, assembly) := {
val cp = (fullClasspath in Test).value
cp.filter{ file => (file.data.name contains "classes") || (file.data.name contains "test-classes")} ++ (fullClasspath in Runtime).value
}
libraryDependencies ++= Seq(
...
"com.typesafe.play" %% "play-json" % "2.3.10" % "test" excludeAll ExclusionRule(organization = "joda-time"),
...
)
...
When I assemble my fat jar using sbt test:assembly, is produces the fat jar project-test.jar, but the play-json dependencies aren't being packaged in:
$ jar tf /path/to/project-test.jar | grep play
$
However, if I remove the "test" configuration from the play-json dep (i.e. "com.typesafe.play" %% "play-json" % "2.3.10" excludeAll ExclusionRule(organization = "joda-time")), I can see it being included:
$ jar tf /path/to/project-test.jar | grep play
...
play/libs/Json.class
...
$
Am I doing anything wrong and/or missing anything? My goal here is to include the play-json library in ONLY the test:assembly jar and NOT the assembly jar
I have left out a crucial part in the original build.sbt excerpt I posted above which turned out to be the cause of the issuse:
fullClasspath in (Test, assembly) := {
val cp = (fullClasspath in Test).value
cp.filter{ file => (file.data.name contains "classes") || (file.data.name contains "test-classes")} ++ (fullClasspath in Runtime).value
}
This code block was essentially filter out deps from the test classpath. We include this to avoid painful merge conflicts. I fixed this by adding logic to include the play-json dep that was needed:
fullClasspath in (Test, assembly) := {
val cp = (fullClasspath in Test).value
cp.filter{ file =>
(file.data.name contains "classes") ||
(file.data.name contains "test-classes") ||
// sorta hacky
(file.data.name contains "play")
} ++ (fullClasspath in Runtime).value
}

How to package an akka project for a netlogo extension?

I am trying to make a simple NetLogo extension that is based on akka.
However, whenever I try to load the extension in NetLogo, I get the error:
Caused by: com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'akka.version'
Which obviously means that some configuration is missing. I then proceded to add reference.conf to my resources folder but with no luck.
The last thing I tried was to use the sbt-assemblty plugin, but I keep getting the same error. So this is my build.sbt:
name := "TestAkka"
version := "1.0"
scalaVersion := "2.11.7"
scalaSource in Compile <<= baseDirectory(_ / "src")
scalacOptions ++= Seq("-deprecation", "-unchecked", "-Xfatal-warnings",
"-encoding", "us-ascii")
libraryDependencies ++= Seq(
"org.nlogo" % "NetLogo" % "5.3.0" from
"http://ccl.northwestern.edu/devel/NetLogo-5.3-17964bb.jar",
"asm" % "asm-all" % "3.3.1",
"org.picocontainer" % "picocontainer" % "2.13.6",
"com.typesafe" % "config" % "1.3.0",
"com.typesafe.akka" %% "akka-actor" % "2.4.1",
"com.typesafe.akka" %% "akka-remote" % "2.4.1"
)
artifactName := { (_, _, _) => "sample-scala.jar" }
packageOptions := Seq(
Package.ManifestAttributes(
("Extension-Name", "sample-scala"),
("Class-Manager", "main.scala.akkatest.TestClassManager"),
("NetLogo-Extension-API-Version", "5.3")))
packageBin in Compile <<= (packageBin in Compile, baseDirectory, streams) map {
(jar, base, s) =>
IO.copyFile(jar, base / "sample-scala.jar")
Process("pack200 --modification-time=latest --effort=9 --strip-debug " +
"--no-keep-file-order --unknown-attribute=strip " +
"sample-scala.jar.pack.gz sample-scala.jar").!!
if(Process("git diff --quiet --exit-code HEAD").! == 0) {
Process("git archive -o sample-scala.zip --prefix=sample-scala/ HEAD").!!
IO.createDirectory(base / "sample-scala")
IO.copyFile(base / "sample-scala.jar", base / "sample-scala" / "sample-scala.jar")
IO.copyFile(base / "sample-scala.jar.pack.gz", base / "sample-scala" / "sample-scala.jar.pack.gz")
Process("zip sample-scala.zip sample-scala/sample-scala.jar sample-scala/sample-scala.jar.pack.gz").!!
IO.delete(base / "sample-scala")
}
else {
s.log.warn("working tree not clean; no zip archive made")
IO.delete(base / "sample-scala.zip")
}
jar
}
cleanFiles <++= baseDirectory { base =>
Seq(base / "sample-scala.jar",
base / "sample-scala.jar.pack.gz",
base / "sample-scala.zip") }
I have an project/assembly.sbt with the contents:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.1")
I have a assembly.sbt in root with the contents:
import sbtassembly.AssemblyKeys._
baseAssemblySettings
In my scala code I have:
val configString = ConfigFactory.parseString(
"""
akka {
loglevel = "INFO"
actor {
provider = "akka.remote.RemoteActorRefProvider"
}
remote {
enabled-transports = ["akka.remote.netty.tcp"]
netty.tcp {
hostname = "127.0.0.1"
port = "9500"
}
log-sent-messages = on
log-received-messages = on
}
}
""".stripMargin)
val config = ConfigFactory.load(configString)
The resources folder contains an application.conf which I don't use at the moment. Greping the output of jar tf command with the expression "reference", clearly shows that reference.conf is present:
How do I package this akka example for a netlogo extension?
Note: I have included akka-actor and akka-remote as library dependencies. I am using Intellij and SBT 0.13.8 on a OS X platform.
EDIT:
After taking the advice from Ayush, I get the following output from the command sbt assembly, however the same exception is still present:
I think the problem is that while using sbt:assembly the default merge strategy excludes all the reference.conf files. This is what i found in documentation.
If multiple files share the same relative path (e.g. a resource named
application.conf in multiple dependency JARs), the default strategy is
to verify that all candidates have the same contents and error out
otherwise.
Can you try adding a MergeStrategy as follows
assemblyMergeStrategy in assembly := {
case PathList("reference.conf") => MergeStrategy.concat
}
there's an extra trick to solving this with newer akka libraries as akka does not include all their default configurations in the resource.conf
https://stackoverflow.com/a/72325132/1286583