Where do you put assemblyMergeStrategy in build.sbt?

I have a MergeStrategy problem. How do I resolve it, and why does my IDE fill the build file with squiggly error lines?
The error message is: Type mismatch, expected: String => MergeStrategy, actual: String => Any
I am new to Scala, so I have no idea what that syntax means. I have tried copying different merge strategies from all over Stack Overflow and none of them work.
I am on Scala 2.12.7 and sbt 1.2.6.
My build.sbt looks like this:
lazy val root = (project in file("."))
  .settings(
    name := "bigdata-mx-2",
    version := "0.1",
    scalaVersion := "2.12.7",
    mainClass in Compile := Some("Main")
  )

libraryDependencies ++= Seq(
  "org.apache.hadoop" % "hadoop-core" % "1.2.1",
  "org.apache.parquet" % "parquet-hadoop" % "1.10.0",
  "junit" % "junit" % "4.12" % Test,
  "org.scalatest" %% "scalatest" % "3.2.0-SNAP10" % Test,
  "org.scalacheck" %% "scalacheck" % "1.14.0" % Test,
  "org.scala-lang" % "scala-library" % "2.12.7"
)

// Where do I put this thing:
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}
Maybe I'm not putting it in the right place. Where does it go?
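For reference, a minimal sketch of one common arrangement (the plugin version below is illustrative, not from the question): assemblyMergeStrategy can sit at the top level of build.sbt or inside .settings(...), but either way the sbt-assembly plugin must be declared in project/plugins.sbt, because the assemblyMergeStrategy and assembly keys do not exist without it.

// project/plugins.sbt (version is an assumption; pick one matching your sbt)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.9")

// build.sbt -- the merge strategy placed inside the project's settings
lazy val root = (project in file("."))
  .settings(
    name := "bigdata-mx-2",
    assemblyMergeStrategy in assembly := {
      case PathList("META-INF", xs @ _*) => MergeStrategy.discard
      case x => MergeStrategy.first
    }
  )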

Related

No TypeTag available for String

I'm trying to run my fat jar using scala -classpath "target/scala-2.13/Capstone-assembly-0.1.0-SNAPSHOT.jar" src/main/scala/project/Main.scala, but I get an error caused by .toString in val generateUUID: UserDefinedFunction = udf((str: String) => nameUUIDFromBytes(str.getBytes).toString): No TypeTag available for String. When I run from the IDE everything works, but not from the jar.
My build.sbt:
ThisBuild / version := "0.1.0-SNAPSHOT"
ThisBuild / scalaVersion := "2.13.8"

lazy val root = (project in file("."))
  .settings(
    name := "Capstone"
  )

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.2.0",
  "org.apache.spark" %% "spark-sql" % "3.2.0",
  "org.scalatest" %% "scalatest" % "3.2.12" % "test",
  "org.rogach" %% "scallop" % "4.1.0"
)

compileOrder := CompileOrder.JavaThenScala

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}
If I delete .toString I get an error: Schema for type java.util.UUID is not supported.
I tried to change String to java.util.String or scala.Predef.String, but this didn't work.
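The thread doesn't carry an accepted answer, but one detail in the command stands out (an observation, not a confirmed fix): passing src/main/scala/project/Main.scala to scala asks the launcher to re-compile that source on the spot against its own Scala and classpath, rather than running the class already compiled into the fat jar, and Spark's udf needs scala-reflect TypeTags available at exactly that compilation step. Launching the packaged entry point instead, e.g. scala -classpath target/scala-2.13/Capstone-assembly-0.1.0-SNAPSHOT.jar project.Main (assuming Main lives in package project), avoids the on-the-fly compilation entirely.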

Dependency duplication when creating a fat jar for playframework with play-ws

I noticed that when trying to build a fat jar for Play Framework 2.7 with play-ws (sbt assembly), dependency duplications occur. I get a lot of errors related to javax.activation-api and shaded-asynchttpclient, e.g.
[error] deduplicate: different file contents found in the following:
[error] /home/user/.cache/coursier/v1/https/repo1.maven.org/maven2/javax/activation/javax.activation-api/1.2.0/javax.activation-api-1.2.0.jar:javax/activation/UnsupportedDataTypeException.class
[error] /home/user/.cache/coursier/v1/https/repo1.maven.org/maven2/com/typesafe/play/shaded-asynchttpclient/2.0.6/shaded-asynchttpclient-2.0.6.jar:javax/activation/UnsupportedDataTypeException.class
The problem turns out to be play-ws: without it, sbt assembly completes correctly. The only place in my code where I explicitly use javax is dependency injection. Using guice instead gives the same result. Here is my build.sbt (which is based on https://www.playframework.com/documentation/2.7.x/Deploying):
name := "My-App"
version := "1.0"
scalaVersion := "2.11.12"
lazy val root = (project in file(".")).enablePlugins(PlayScala)
scalaSource in ThisScope := baseDirectory.value
mainClass in assembly := Some("play.core.server.ProdServerStart")
fullClasspath in assembly += Attributed.blank(PlayKeys.playPackageAssets.value)
libraryDependencies ++= Seq(
ws,
specs2 % Test,
"com.typesafe.play" %% "play-json" % "2.7.4",
"com.typesafe.play" %% "play-slick" % "4.0.2",
"com.typesafe.play" %% "play-slick-evolutions" % "4.0.2",
"com.typesafe" % "config" % "1.4.0",
"org.mindrot" % "jbcrypt" % "0.4",
"mysql" % "mysql-connector-java" % "8.0.17",
"org.mindrot" % "jbcrypt" % "0.4",
"com.iheart" %% "ficus" % "1.4.7",
"com.typesafe.scala-logging" % "scala-logging_2.11" % "3.9.0"
)
assemblyMergeStrategy in assembly := {
case manifest if manifest.contains("MANIFEST.MF") =>
// We don't need manifest files since sbt-assembly will create
// one with the given settings
MergeStrategy.discard
case referenceOverrides if referenceOverrides.contains("reference-overrides.conf") =>
// Keep the content for all reference-overrides.conf files
MergeStrategy.concat
case x =>
// For all the other files, use the default sbt-assembly merge strategy
val oldStrategy = (assemblyMergeStrategy in assembly).value
oldStrategy(x)
}
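The thread doesn't include an accepted answer, but a common way out for exactly this pair of artifacts (a sketch, under the assumption that either copy of the duplicated classes works at runtime) is to pick one copy of the javax.activation classes before the other rules run: shaded-asynchttpclient bundles its own copy of the same classes as javax.activation-api, so both land in the assembly and collide.

assemblyMergeStrategy in assembly := {
  // shaded-asynchttpclient ships its own javax/activation/* classes;
  // keep whichever copy comes last so the two jars stop colliding
  case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
  case manifest if manifest.contains("MANIFEST.MF") => MergeStrategy.discard
  case referenceOverrides if referenceOverrides.contains("reference-overrides.conf") =>
    MergeStrategy.concat
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}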

assemblyMergeStrategy type error

I am trying to set up a small project for an AWS Lambda written in Scala:
javacOptions ++= Seq("-source", "1.8", "-target", "1.8", "-Xlint")

lazy val root = (project in file("."))
  .settings(
    name := "xxx",
    version := "0.1",
    scalaVersion := "2.12.3",
    retrieveManaged := true
  )

libraryDependencies ++= Seq(
  "com.amazonaws" % "aws-lambda-java-core" % "1.1.0" % Provided,
  "com.amazonaws" % "aws-lambda-java-events" % "1.1.0" % Provided,
  "org.scalatest" % "scalatest" % "2.2.6" % Test
)

scalacOptions += "-deprecation"

assemblyMergeStrategy in assembly <<= (assemblyMergeStrategy in assembly) {
  (old) => {
    case PathList("META-INF", xs @ _*) => MergeStrategy.discard
    case x => MergeStrategy.first
  }
}
This results in:
xxx/build.sbt:25: error: not found: value assemblyMergeStrategy
assemblyMergeStrategy in assembly <<= (assemblyMergeStrategy in assembly) {
^
[error] Type error in expression
The source of inspiration was this blog.
I also tried the version provided there, since mergeStrategy might have been replaced by assemblyMergeStrategy.
Did you reference the assembly plugin in your project/plugins.sbt file? assemblyMergeStrategy is defined by the plugin:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
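Once the plugin is on the build classpath, note that the <<= operator was removed in sbt 1.x, so the blog's snippet only compiles on old sbt. Since the pattern match already ends in a catch-all and never uses the old strategy, the same intent can be written with plain := (a sketch of the equivalent modern form):

assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}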

scala.MatchError: org\apache\commons\io\IOCase.class (of class java.lang.String) in sbt+assembly

When I run sbt assembly, it prints an error like this:
[error] (*:assembly) scala.MatchError: org\apache\commons\io\IOCase.class (of class java.lang.String)
and these are my configurations:
1. assembly.sbt:
import AssemblyKeys._

assemblySettings

mergeStrategy in assembly := {
  case PathList("org", "springframework", xs @ _*) => MergeStrategy.last
}
2. build.sbt:
import AssemblyKeys._

lazy val root = (project in file("."))
  .settings(
    name := "DmpRealtimeFlow",
    version := "1.0",
    scalaVersion := "2.11.8",
    libraryDependencies += "com.jd.ads.index" % "ad_index_dmp_common" % "0.0.4-SNAPSHOT",
    libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.1.0" % "provided",
    libraryDependencies += "org.apache.spark" % "spark-sql_2.11" % "2.1.0" % "provided",
    libraryDependencies += "org.apache.spark" % "spark-streaming_2.11" % "2.1.0" % "provided",
    libraryDependencies += "mysql" % "mysql-connector-java" % "5.1.8",
    libraryDependencies += "org.springframework" % "spring-beans" % "3.1.0.RELEASE",
    libraryDependencies += "org.springframework" % "spring-context" % "3.1.0.RELEASE",
    libraryDependencies += "org.springframework" % "spring-core" % "3.1.0.RELEASE",
    libraryDependencies += "org.springframework" % "spring-orm" % "3.1.0.RELEASE",
    libraryDependencies += "org.mybatis" % "mybatis" % "3.2.1" % "compile",
    libraryDependencies += "org.mybatis" % "mybatis-spring" % "1.2.2",
    libraryDependencies += "c3p0" % "c3p0" % "0.9.1.2"
  )
3. Project tools:
sbt: 0.13.5
sbt-assembly: 0.11.2
Java: 1.7
Scala: 2.11.8
Any help?
The problem is likely the missing default case in the mergeStrategy in assembly block:
case x =>
  val oldStrategy = (assemblyMergeStrategy in assembly).value
  oldStrategy(x)
Also, mergeStrategy is deprecated and assemblyMergeStrategy should be used instead.
Basically, the block
{
  case PathList("org", "springframework", xs @ _*) => MergeStrategy.last
}
is a partial function String => MergeStrategy defined for only one kind of input, i.e. classes with the package prefix "org\springframework". However, it is applied to all class files in the project, and the first one that doesn't match that prefix (org\apache\commons\io\IOCase.class) causes the MatchError.
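A tiny standalone illustration of the mechanism (hypothetical values, nothing from the build above): a bare block of case clauses assigned to a function type compiles fine, but throws the same scala.MatchError the moment it receives an input no clause covers.

// defined only for paths under org/springframework
val strategyFor: String => String = {
  case path if path.startsWith("org/springframework") => "last"
}

strategyFor("org/springframework/Bean.class")     // "last"
strategyFor("org/apache/commons/io/IOCase.class") // throws scala.MatchError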

Why does sbt assembly in Spark project fail with "Please add any Spark dependencies by supplying the sparkVersion and sparkComponents"?

I work on an sbt-managed Spark project with a spark-cloudant dependency. The code is available on GitHub (on the spark-cloudant-compile-issue branch).
I've added the following line to build.sbt:
"cloudant-labs" % "spark-cloudant" % "1.6.4-s_2.10" % "provided"
And so build.sbt looks as follows:
name := "Movie Rating"
version := "1.0"
scalaVersion := "2.10.5"
libraryDependencies ++= {
val sparkVersion = "1.6.0"
Seq(
"org.apache.spark" %% "spark-core" % sparkVersion % "provided",
"org.apache.spark" %% "spark-sql" % sparkVersion % "provided",
"org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
"org.apache.spark" %% "spark-streaming-kafka" % sparkVersion % "provided",
"org.apache.spark" %% "spark-mllib" % sparkVersion % "provided",
"org.apache.kafka" % "kafka-log4j-appender" % "0.9.0.0",
"org.apache.kafka" % "kafka-clients" % "0.9.0.0",
"org.apache.kafka" %% "kafka" % "0.9.0.0",
"cloudant-labs" % "spark-cloudant" % "1.6.4-s_2.10" % "provided"
)
}
assemblyMergeStrategy in assembly := {
case PathList("org", "apache", "spark", xs # _*) => MergeStrategy.first
case PathList("scala", xs # _*) => MergeStrategy.discard
case PathList("META-INF", "maven", "org.slf4j", xs # _* ) => MergeStrategy.first
case x =>
val oldStrategy = (assemblyMergeStrategy in assembly).value
oldStrategy(x)
}
unmanagedBase <<= baseDirectory { base => base / "lib" }
assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false)
When I execute sbt assembly I get the following error:
java.lang.RuntimeException: Please add any Spark dependencies by
supplying the sparkVersion and sparkComponents. Please remove:
org.apache.spark:spark-core:1.6.0:provided
Probably related: https://github.com/databricks/spark-csv/issues/150
Can you try adding spIgnoreProvided := true to your build.sbt?
(This might not be the answer; I could have just posted a comment, but I don't have enough reputation.)
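For context (an inference from the error text, since the posted build.sbt doesn't show it): that RuntimeException is raised by the sbt-spark-package plugin, which expects Spark to be declared through its own keys rather than as plain libraryDependencies. A sketch of the two options it leaves you:

// build.sbt, assuming sbt-spark-package is enabled in project/plugins.sbt
// option 1: let the plugin add the Spark artifacts itself
sparkVersion := "1.6.0"
sparkComponents ++= Seq("core", "sql", "streaming", "mllib")

// option 2: keep the explicit "provided" dependencies and silence the check
spIgnoreProvided := true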
NOTE I still can't reproduce the issue, but I think it does not really matter.
java.lang.RuntimeException: Please add any Spark dependencies by supplying the sparkVersion and sparkComponents.
In your case, your build.sbt is missing an sbt resolver to find the spark-cloudant dependency. You should add the following line to build.sbt:
resolvers += "spark-packages" at "https://dl.bintray.com/spark-packages/maven/"
PROTIP I strongly recommend using spark-shell first, and switching to sbt only once you're comfortable with the package (especially if you're new to sbt and perhaps other libraries/dependencies too). It's too much to digest in one bite. Follow https://spark-packages.org/package/cloudant-labs/spark-cloudant.