Build.sbt breaks when adding GraphFrames build with scala 2.11 - scala

I'm trying to add GraphFrames to my scala spark application, and this was going fine when I added the one based on 2.10. However, as soon as I tried to build it with GraphFrames build with scala 2.11, it breaks.
The problem would be that there are conflicting versions of scala used (2.10 and 2.11). I'm getting the following error:
[error] Modules were resolved with conflicting cross-version suffixes in {file:/E:/Documents/School/LSDE/hadoopcryptoledger/examples/scala-spark-graphx-bitcointransaction/}root:
[error] org.apache.spark:spark-launcher _2.10, _2.11
[error] org.json4s:json4s-ast _2.10, _2.11
[error] org.apache.spark:spark-network-shuffle _2.10, _2.11
[error] com.twitter:chill _2.10, _2.11
[error] org.json4s:json4s-jackson _2.10, _2.11
[error] com.fasterxml.jackson.module:jackson-module-scala _2.10, _2.11
[error] org.json4s:json4s-core _2.10, _2.11
[error] org.apache.spark:spark-unsafe _2.10, _2.11
[error] org.apache.spark:spark-core _2.10, _2.11
[error] org.apache.spark:spark-network-common _2.10, _2.11
However, I can't troubleshoot what causes this.. This is my full build.sbt:
import sbt._
import Keys._
import scala._
lazy val root = (project in file("."))
.settings(
name := "example-hcl-spark-scala-graphx-bitcointransaction",
version := "0.1"
)
.configs( IntegrationTest )
.settings( Defaults.itSettings : _*)
scalacOptions += "-target:jvm-1.7"
crossScalaVersions := Seq("2.11.8")
resolvers += Resolver.mavenLocal
fork := true
jacoco.settings
itJacoco.settings
assemblyJarName in assembly := "example-hcl-spark-scala-graphx-bitcointransaction.jar"
libraryDependencies += "com.github.zuinnote" % "hadoopcryptoledger-fileformat" % "1.0.7" % "compile"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.0" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-graphx" % "1.5.0" % "provided"
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "2.7.0" % "provided"
libraryDependencies += "javax.servlet" % "javax.servlet-api" % "3.0.1" % "it"
libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "2.7.0" % "it" classifier "" classifier "tests"
libraryDependencies += "org.apache.hadoop" % "hadoop-hdfs" % "2.7.0" % "it" classifier "" classifier "tests"
libraryDependencies += "org.apache.hadoop" % "hadoop-minicluster" % "2.7.0" % "it"
libraryDependencies += "org.apache.spark" % "spark-sql_2.11" % "2.2.0" % "provided"
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.1" % "test,it"
libraryDependencies += "graphframes" % "graphframes" % "0.5.0-spark2.1-s_2.11"
Can anyone pinpoint which dependency is based on scala 2.10 causing the build to fail?

I found out what the problem was. Apparently, if you use:
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.0" % "provided"
It uses the 2.10 version by default. It all worked once I changed the dependencies of spark core and spark graphx to:
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.2.0"
libraryDependencies += "org.apache.spark" % "spark-graphx_2.11" % "2.2.0" % "provided"

Related

SCALA and Elastic Search: Add symbol to classpath (Databricks)

:)
I'm getting an error that I have no idea how to fix.. I could not really find outstanding documentation for this SchemaRDD type, and how to use it.
build.sbt contains:
scalaVersion := "2.11.12"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.0"
libraryDependencies += "org.scalaj" %% "scalaj-http" % "2.4.1"
libraryDependencies += "io.spray" %% "spray-json" % "1.3.5"
libraryDependencies += "com.amazonaws" % "aws-java-sdk-core" % "1.11.534"
libraryDependencies += "com.amazonaws" % "aws-encryption-sdk-java" % "1.3.6"
libraryDependencies += "com.amazonaws" % "aws-java-sdk" % "1.11.550"
libraryDependencies += "com.typesafe" % "config" % "1.3.4"
libraryDependencies += "org.elasticsearch" %% "elasticsearch-spark-1.2" % "2.4.4"
Error:
Symbol 'type org.apache.spark.sql.SchemaRDD' is missing from the classpath.
[error] This symbol is required by 'value org.elasticsearch.spark.sql.package.rdd'.
[error] Make sure that type SchemaRDD is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
[error] A full rebuild may help if 'package.class' was compiled against an incompatible version of org.apache.spark.sql.
Thank you a lot for all kind of support! :)
Dependency elasticsearch-spark-1.2 is for Spark 1.x, need to use elasticsearch-spark-20 instead. The latest version is built for Spark 2.3
libraryDependencies += "org.elasticsearch" %% "elasticsearch-spark-20" % "7.1.1"

Scala: object profile is not a member of package com.amazonaws.auth

I am having a build problem. Here is my sbt file:
name := "SparkPi"
version := "0.2.15"
scalaVersion := "2.11.8"
// https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.10
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "2.0.1"
// old:
//libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.1"
// https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk
libraryDependencies += "com.amazonaws" % "aws-java-sdk" % "1.0.002"
scalacOptions ++= Seq("-feature")
Here is the full error message I am seeing:
[info] Set current project to SparkPi (in build file:/Users/xxx/prog/yyy/)
[info] Updating {file:/Users/xxx/prog/yyy/}yyy...
[info] Resolving jline#jline;2.12.1 ...
[info] Done updating.
[info] Compiling 2 Scala sources to /Users/xxx/prog/yyy/target/scala-2.11/classes...
[error] /Users/xxx/prog/yyy/src/main/scala/PiSpark.scala:6: object profile is not a member of package com.amazonaws.auth
[error] import com.amazonaws.auth.profile._
[error] ^
[error] /Users/xxx/prog/yyy/src/main/scala/PiSpark.scala:87: not found: type ProfileCredentialsProvider
[error] val creds = new ProfileCredentialsProvider(profile).getCredentials()
[error] ^
[error] two errors found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 14 s, completed Nov 3, 2016 1:43:34 PM
And here are the imports I am trying to use:
import com.amazonaws.services.s3._
import com.amazonaws.auth.profile._
How do I import com.amazonaws.auth.profile.ProfileCredentialsProvider in Scala?
EDIT
Changed sbt file so spark core version corresponds to Scala version, new contents:
name := "SparkPi"
version := "0.2.15"
scalaVersion := "2.11.8"
// https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.0.1"
// old:
//libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.1"
// https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk
libraryDependencies += "com.amazonaws" % "aws-java-sdk" % "1.0.002"
scalacOptions ++= Seq("-feature")
You are using scalaVersion := "2.11.8" but library dependency has underscore 2.10 spark-core_2.10 which is bad.
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "2.0.1"
^
change 2.10 to 2.11
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.0.1"
`

SBT Assembly plugin Errors out

I have written the following sbt file
name := "Test"
version := "1.0"
scalaVersion := "2.11.7"
libraryDependencies ++= Seq(
"org.apache.hadoop" % "hadoop-client" % "2.7.1",
"org.apache.spark" % "spark-core_2.10" % "1.3.0",
"org.apache.avro" % "avro" % "1.7.7",
"org.apache.avro" % "avro-mapred" % "1.7.7"
)
mainClass := Some("com.test.Foo")
I also have the following assembly.sbt file in projects folder
resolvers += Resolver.url("bintray-sbt-plugins", url("http://dl.bintray.com/sbt/sbt-plugin-releases"))(Resolver.ivyStylePatterns)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.0")
when i do sbt assembly i get a huge list of errors
[error] (*:assembly) deduplicate: different file contents found in the following:
[error] /Users/abhishek.srivastava/.ivy2/cache/com.esotericsoftware.kryo/kryo/bundles/kryo-2.21.jar:com/esotericsoftware/minlog/Log$Logger.class
[error] /Users/abhishek.srivastava/.ivy2/cache/com.esotericsoftware.minlog/minlog/jars/minlog-1.2.jar:com/esotericsoftware/minlog/Log$Logger.class
[error] deduplicate: different file contents found in the following:
[error] /Users/abhishek.srivastava/.ivy2/cache/com.esotericsoftware.kryo/kryo/bundles/kryo-2.21.jar:com/esotericsoftware/minlog/Log.class
[error] /Users/abhishek.srivastava/.ivy2/cache/com.esotericsoftware.minlog/minlog/jars/minlog-1.2.jar:com/esotericsoftware/minlog/Log.class
[error] deduplicate: different file contents found in the following:
I was able to resolve the problem. actually there is no need to build a fat jar because the "spark-submit" tool will have everything in the classpath anyway.
thus the right way to build the jar file is
name := "Test"
version := "1.0"
scalaVersion := "2.11.7"
libraryDependencies ++= Seq(
"org.apache.hadoop" % "hadoop-client" % "2.7.1" % "provided",
"org.apache.spark" % "spark-core_2.10" % "1.3.0" % "provided",
"org.apache.avro" % "avro" % "1.7.7" % "provided",
"org.apache.avro" % "avro-mapred" % "1.7.7" % "provided"
)
mainClass := Some("com.test.Foo")
1. use the MergeStrategy, see sbt-assembly
2. exclude the duplicated jars, such as:
lazy val hbaseLibSeq = Seq(
("org.apache.hbase" % "hbase" % hbaseVersion).
excludeAll(
ExclusionRule(organization = "org.slf4j"),
ExclusionRule(organization = "org.mortbay.jetty"),
ExclusionRule(organization = "javax.servlet")),
("net.java.dev.jets3t" % "jets3t" % "0.6.1" % "provided").
excludeAll(ExclusionRule(organization = "javax.servlet"))
)
3. use the provided scope
show dependency tree:
➜ cat ~/.sbt/0.13/plugins/plugins.sbt
addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.7.5")
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.2")
➜ cat ~/.sbt/0.13/global.sbt
net.virtualvoid.sbt.graph.Plugin.graphSettings
➜ sbt dependency-graph

Why is the error Conflicting cross-version suffixes?

I'm getting this error when I try to compile a Scala project in sbt.
Modules were resolved with conflicting cross-version suffixes in {file:/home/seven3n/caja/Flujo_de_caja/}flujo_de_caja:
[error] com.typesafe.akka:akka-actor _2.11, _2.10
[error] org.scalaz:scalaz-effect _2.10, _2.11
[error] org.scalaz:scalaz-core _2.10, _2.11
[trace] Stack trace suppressed: run last *:update for the full output.
[error] (*:update) Conflicting cross-version suffixes in: com.typesafe.akka:akka-actor, org.scalaz:scalaz-effect, org.scalaz:scalaz-core
This is my build.sbt file:
scalaVersion := "2.11.0"
resolvers ++= Seq(
"Sonatype snapshots repository" at "https://oss.sonatype.org/content/repositories/snapshots/",
"Spray repository" at "http://repo.spray.io/",
"Typesafe repository" at "http://repo.typesafe.com/typesafe/releases/"
)
libraryDependencies ++= {
val akkaVersion = "2.3.2"
val sprayVersion = "1.3.1-20140423"
val sprayJsonVersion = "1.2.6"
val reactiveMongoVersion = "0.11.0-SNAPSHOT"
val scalaTestVersion = "2.1.5"
val specs2Version = "2.3.11"
val foloneVersion = "0.12-SNAPSHOT"
Seq(
"com.typesafe.akka" %% "akka-actor" % akkaVersion,
"com.typesafe.akka" %% "akka-testkit" % akkaVersion,
"io.spray" %% "spray-can" % sprayVersion,
"io.spray" %% "spray-routing" % sprayVersion,
"io.spray" %% "spray-testkit" % sprayVersion,
"io.spray" %% "spray-json" % sprayJsonVersion,
"org.reactivemongo" % "reactivemongo_2.10" % reactiveMongoVersion,
"org.scalatest" %% "scalatest" % scalaTestVersion % "test",
"org.specs2" %% "specs2" % specs2Version % "test",
"info.folone" % "poi-scala_2.10" % foloneVersion
)
}
Any suggestions?
The conflicts appear because:
you've specified your Scala version to be 2.11
you've explicitly specified the Scala version (2.10) for the reactivemongo and poi-scala libraries.
The fix is to use the %% operator for those two libraries as well.
"org.reactivemongo" %% "reactivemongo" % reactiveMongoVersion,
"info.folone" %% "poi-scala" % foloneVersion
That's the purpose of the %% operator. To append the declared Scala version (2.11 in your case) to the artifact name.
I had the same problem and I simply removed the scalaVersion tag from my sbt file and modified the line
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0"
to
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "1.6.0"
and the problem went away.
I have tried using %% but it didn't work. I have manually excluded it using
("org.reactivemongo" % "reactivemongo" % reactiveMongoVersion)
.exclude("com.typesafe.akka", "akka-actor_2.10")
.exclude("org.scalaz", "scalaz-effect")
.exclude("org.scalaz", "scalaz-core")
To investigate who is caller you can use a plugin but an easier way to look into target/scala-2.*/resolution-cache/reports/.
There is Ivy's resolution report for each configuration.
Look for *-compile.xml and *-test.xml and search for conflicting library.
You can see similar with
<module organisation="com.github.nscala-time" name="nscala-time_2.11">
...
<caller organisation="com.tumblr" name="colossus-metrics_2.11" conf="compile, runtime" rev="1.2.0" rev-constraint-default="1.2.0" rev-constraint-dynamic="1.2.0" callerrev="0.7.2-RC1"/>
...
</module>
This should tell you the caller of the module.

play-json breaks sbt build

Suddenly as of today my project has stopped compiling successfuly. Upon further investigation I've found out the reason is play-json library that I include in dependencies.
Here's my build.sbt:
name := """project-name"""
version := "1.0"
scalaVersion := "2.10.2"
libraryDependencies ++= Seq(
"com.typesafe.akka" %% "akka-actor" % "2.2.1",
"com.typesafe.akka" %% "akka-testkit" % "2.2.1",
"org.scalatest" %% "scalatest" % "1.9.1" % "test",
"org.bouncycastle" % "bcprov-jdk16" % "1.46",
"com.sun.mail" % "javax.mail" % "1.5.1",
"com.typesafe.slick" %% "slick" % "2.0.1",
"org.postgresql" % "postgresql" % "9.3-1101-jdbc41",
"org.slf4j" % "slf4j-nop" % "1.6.4",
"com.drewnoakes" % "metadata-extractor" % "2.6.2",
"com.typesafe.play" %% "play-json" % "2.2.2"
)
resolvers += "Typesafe repository" at "http://repo.typesafe.com/typesafe/releases/"
If I try to create a new project in activator with all the lines except "com.typesafe.play" %% "play-json" % "2.2.2" then it compiles successfully. But once I add play-json I get the folloing error:
[error] References to undefined settings:
[error]
[error] *:playCommonClassloader from echo:run
[error]
[error] docs:managedClasspath from echo:run
[error]
[error] *:playReloaderClassloader from echo:run
[error]
[error] echo:playVersion from echo:echoTracePlayVersion
[error]
[error] *:playRunHooks from echo:playRunHooks
[error] Did you mean echo:playRunHooks ?
[error]
And I keep getting this error even if I remove play-json line. Why is it so? What should I do to fix it?