Scala and Elasticsearch: Add symbol to classpath (Databricks)

I'm getting an error that I have no idea how to fix. I could not find good documentation for this SchemaRDD type or how to use it.
build.sbt contains:
scalaVersion := "2.11.12"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.0"
libraryDependencies += "org.scalaj" %% "scalaj-http" % "2.4.1"
libraryDependencies += "io.spray" %% "spray-json" % "1.3.5"
libraryDependencies += "com.amazonaws" % "aws-java-sdk-core" % "1.11.534"
libraryDependencies += "com.amazonaws" % "aws-encryption-sdk-java" % "1.3.6"
libraryDependencies += "com.amazonaws" % "aws-java-sdk" % "1.11.550"
libraryDependencies += "com.typesafe" % "config" % "1.3.4"
libraryDependencies += "org.elasticsearch" %% "elasticsearch-spark-1.2" % "2.4.4"
Error:
Symbol 'type org.apache.spark.sql.SchemaRDD' is missing from the classpath.
[error] This symbol is required by 'value org.elasticsearch.spark.sql.package.rdd'.
[error] Make sure that type SchemaRDD is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
[error] A full rebuild may help if 'package.class' was compiled against an incompatible version of org.apache.spark.sql.
Thanks a lot for any kind of support! :)

The dependency elasticsearch-spark-1.2 is for Spark 1.x; you need to use elasticsearch-spark-20 instead. The latest version is built for Spark 2.3:
libraryDependencies += "org.elasticsearch" %% "elasticsearch-spark-20" % "7.1.1"

Related

How to set up a spark build.sbt file?

I have been trying all day and cannot figure out how to make it work.
I have a common library that will be my core lib for Spark.
My build.sbt file is not working:
name := "CommonLib"
version := "0.1"
scalaVersion := "2.12.5"
// addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.6")
// resolvers += "bintray-spark-packages" at "https://dl.bintray.com/spark-packages/maven/"
// resolvers += Resolver.sonatypeRepo("public")
libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.10" % "1.6.0" exclude("org.apache.hadoop", "hadoop-yarn-server-web-proxy"),
  "org.apache.spark" % "spark-sql_2.10" % "1.6.0" exclude("org.apache.hadoop", "hadoop-yarn-server-web-proxy"),
  "org.apache.hadoop" % "hadoop-common" % "2.7.0" exclude("org.apache.hadoop", "hadoop-yarn-server-web-proxy"),
  // "org.apache.spark" % "spark-sql_2.10" % "1.6.0" exclude("org.apache.hadoop", "hadoop-yarn-server-web-proxy"),
  "org.apache.spark" % "spark-hive_2.10" % "1.6.0" exclude("org.apache.hadoop", "hadoop-yarn-server-web-proxy"),
  "org.apache.spark" % "spark-yarn_2.10" % "1.6.0" exclude("org.apache.hadoop", "hadoop-yarn-server-web-proxy"),
  "com.github.scopt" %% "scopt" % "3.7.0"
)
//addSbtPlugin("org.spark-packages" % "sbt-spark-package" % "0.2.6")
//libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.0"
//libraryDependencies ++= {
// val sparkVer = "2.1.0"
// Seq(
// "org.apache.spark" %% "spark-core" % sparkVer % "provided" withSources()
// )
//}
The commented-out lines are all the attempts I've made, and I don't know what to do anymore.
My goal is to get Spark 2.3 working and to have scopt available too.
For my sbt version, I have 1.1.1 installed.
Thank you.
I think I had two main issues.
First, Spark is not compatible with Scala 2.12 yet, so moving to 2.11.12 solved one issue.
Second, for the IntelliJ sbt console to pick up changes to build.sbt, you either need to kill and restart the console or use the reload command, which I didn't know, so I was not actually using the latest build.sbt file.
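For reference, a minimal build.sbt matching those conclusions (Spark 2.3 on Scala 2.11, scopt included; the versions and the provided scoping are illustrative, not from the original answer):

name := "CommonLib"
version := "0.1"
scalaVersion := "2.11.12"

libraryDependencies ++= Seq(
  // %% picks the _2.11 artifacts automatically, matching scalaVersion
  "org.apache.spark" %% "spark-core" % "2.3.0" % "provided",
  "org.apache.spark" %% "spark-sql"  % "2.3.0" % "provided",
  "com.github.scopt" %% "scopt"      % "3.7.0"
)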
There's a Giter8 template that should work nicely:
https://github.com/holdenk/sparkProjectTemplate.g8
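Assuming sbt 0.13.13 or newer, the template can be instantiated directly with the new command:

sbt new holdenk/sparkProjectTemplate.g8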

Error when running jar: Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;

I'm working on a Spark application (Spark 2.0.0 and Scala 2.11.8), and the application works fine within the IntelliJ IDEA environment. I've packaged the application as a jar file, but when I try to run the Spark application from the jar, this error is raised in the terminal:
Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
at org.apache.spark.util.Utils$.getSystemProperties(Utils.scala:1632)
at org.apache.spark.SparkConf.loadFromSystemProperties(SparkConf.scala:65)
at org.apache.spark.SparkConf.<init>(SparkConf.scala:60)
at org.apache.spark.SparkConf.<init>(SparkConf.scala:55)
at Main$.main(Main.scala:26)
at Main.main(Main.scala)
I've read discussions and similar questions, but all of them talk about mismatched Scala versions; however, my sbt file is this:
name := "BaiscFM"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.0.0"
libraryDependencies += "org.apache.spark" % "spark-streaming_2.11" % "2.0.0"
libraryDependencies += "org.apache.spark" % "spark-streaming-kafka-0-8_2.11" % "2.0.0"
libraryDependencies += "com.datastax.spark" % "spark-cassandra-connector_2.11" % "2.0.0"
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.0.0"
libraryDependencies += "org.apache.spark" % "spark-graphx_2.11" % "2.0.0"
libraryDependencies += "com.typesafe.akka" % "akka-actor_2.11" % "2.4.17"
libraryDependencies += "net.liftweb" % "lift-json_2.11" % "2.6"
libraryDependencies += "com.typesafe.play" % "play-json_2.11" % "2.4.0-M2"
libraryDependencies += "org.json" % "json" % "20090211"
libraryDependencies += "org.scalaj" % "scalaj-http_2.11" % "2.3.0"
libraryDependencies += "org.drools" % "drools-core" % "6.3.0.Final"
libraryDependencies += "org.drools" % "drools-compiler" % "6.3.0.Final"
How to fix this problem?
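As a diagnostic sketch (not from the original thread): Predef.$conforms was introduced in Scala 2.11, so this NoSuchMethodError typically means a Scala 2.10 standard library ended up on the runtime classpath, even though the build targets 2.11.8. One way to check which Scala version the jar actually runs with:

object VersionCheck {
  // Prints the Scala standard-library version visible at runtime;
  // anything other than 2.11.x would explain the missing method.
  def main(args: Array[String]): Unit =
    println(scala.util.Properties.versionString)
}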

Error: java.lang.NoSuchMethodError: com.google.common.util.concurrent.MoreExecutors.directExecutor()Ljava/util/concurrent/Executor;

I get an error, just like the title says. I've already done some research and found some similar issues, but their fixes are not working for me:
NoSuchMethodError: com.google.common.util.concurrent.MoreExecutors.directExecutor conflits on Elastic Search jar
Java elasticsearch client always null
https://github.com/elastic/elasticsearch/pull/7593
java.lang.NoSuchMethodError during Elastic search start
https://discuss.elastic.co/t/transportclient-in-2-1-x/38818/6
I'm using Scala as the programming language to create an API, and Elasticsearch as the database.
Here is my build.sbt:
name := "LearningByDoing"
version := "1.0"
scalaVersion := "2.10.5"
resolvers += "spray repo" at "http://repo.spray.io"
resolvers += "spray nightlies repo" at "http://nightlies.spray.io"
libraryDependencies += "io.spray" % "spray-json_2.10" % "1.3.2"
libraryDependencies += "io.spray" % "spray-can_2.10" % "1.3.2"
libraryDependencies += "io.spray" % "spray-client_2.10" % "1.3.2"
libraryDependencies += "io.spray" % "spray-testkit_2.10" % "1.3.2"
libraryDependencies += "io.spray" % "spray-routing_2.10" % "1.3.2"
libraryDependencies += "io.spray" % "spray-http_2.10" % "1.3.2"
libraryDependencies += "io.spray" % "spray-httpx_2.10" % "1.3.2"
libraryDependencies += "io.spray" % "spray-util_2.10" % "1.3.2"
libraryDependencies += "io.spray" % "spray-can_2.10" % "1.3.2"
libraryDependencies += "mysql" % "mysql-connector-java" % "5.1.12"
libraryDependencies += "org.elasticsearch" % "elasticsearch" % "2.3.1"
libraryDependencies += "com.sksamuel.elastic4s" % "elastic4s-streams_2.10" % "2.3.1"
libraryDependencies += "org.elasticsearch" % "elasticsearch-mapper-attachments" % "2.3.1"
libraryDependencies += "com.typesafe" % "config" % "1.2.1"
libraryDependencies += "com.typesafe.akka" % "akka-actor_2.10" % "2.3.1"
Here is my plugins.sbt:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.0")
addSbtPlugin("com.typesafe.sbt" % "sbt-native-packager" % "1.0.0-M4")
addSbtPlugin("com.typesafe.sbt" % "sbt-multi-jvm" % "0.3.9")
addSbtPlugin("org.scalastyle" %% "scalastyle-sbt-plugin" % "0.8.0")
In the terminal, I ran sbt clean compile test update package and everything works normally, but when I hit the API this error always comes up.
It seems like you have the wrong Guava version, just like in the first link you mentioned. With a dependency-graph sbt plugin you can see the dependency tree and figure out which dependencies are clashing.
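For example (a sketch using the community sbt-dependency-graph plugin, which the original answer does not name; the Guava version shown is hypothetical):

// project/plugins.sbt
addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.8.2")

// then, from the sbt shell:
//   dependencyTree
//   whatDependsOn com.google.guava guava 18.0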
The issue is that the TCP client for Elasticsearch since 5.0 uses Netty 4.1, which is incompatible with Spray, which uses Netty 4. There is no workaround other than waiting for Spray to upgrade or switching to an Elasticsearch HTTP client.

Phantom Cassandra driver dependency error

I would like to use the Phantom Cassandra wrapper in my Scala project, but when I try to update my sbt build I get a dependency error.
My build.sbt:
version := "1.0"
scalaVersion := "2.11.2"
seq(lsSettings :_*)
libraryDependencies ++= Seq(
  "org.clapper" %% "grizzled-scala" % "1.2",
  "commons-io" % "commons-io" % "2.4",
  "org.rauschig" % "jarchivelib" % "0.6.0",
  "com.google.code.findbugs" % "jsr305" % "3.0.0",
  "org.scalatest" % "scalatest_2.11" % "2.2.0" % "test",
  "com.github.nscala-time" %% "nscala-time" % "1.2.0",
  "org.json4s" %% "json4s-native" % "3.2.10",
  "org.scala-lang" % "scala-library" % "2.11.2",
  "com.websudos" % "phantom-dsl_2.10" % "1.2.0"
)
resolvers += "grizzled-scala-resolver-0" at "https://oss.sonatype.org/content/repositories/releases"
resolvers += "Typesafe repository releases" at "http://repo.typesafe.com/typesafe/releases/"
I get the following error:
[warn] Note: Some unresolved dependencies have extra attributes. Check that these dependencies exist with the requested attributes.
[warn] com.typesafe.sbt:sbt-pgp:0.8.1 (sbtVersion=0.13, scalaVersion=2.10)
I don't know what I have to do...
Edit:
Answer from https://github.com/websudosuk/phantom/issues/119:
The error is on the POM side; a new version 1.2.1 is coming soon...
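Independently of the POM problem, note that the build mixes phantom-dsl_2.10 with scalaVersion := "2.11.2". Letting sbt pick the suffix avoids that class of mismatch (a sketch, assuming the fixed 1.2.1 artifact is published for Scala 2.11):

"com.websudos" %% "phantom-dsl" % "1.2.1"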

Why the error "Conflicting cross-version suffixes"?

I'm getting this error when I try to compile a Scala project in sbt.
Modules were resolved with conflicting cross-version suffixes in {file:/home/seven3n/caja/Flujo_de_caja/}flujo_de_caja:
[error] com.typesafe.akka:akka-actor _2.11, _2.10
[error] org.scalaz:scalaz-effect _2.10, _2.11
[error] org.scalaz:scalaz-core _2.10, _2.11
[trace] Stack trace suppressed: run last *:update for the full output.
[error] (*:update) Conflicting cross-version suffixes in: com.typesafe.akka:akka-actor, org.scalaz:scalaz-effect, org.scalaz:scalaz-core
This is my build.sbt file:
scalaVersion := "2.11.0"
resolvers ++= Seq(
  "Sonatype snapshots repository" at "https://oss.sonatype.org/content/repositories/snapshots/",
  "Spray repository" at "http://repo.spray.io/",
  "Typesafe repository" at "http://repo.typesafe.com/typesafe/releases/"
)
libraryDependencies ++= {
  val akkaVersion = "2.3.2"
  val sprayVersion = "1.3.1-20140423"
  val sprayJsonVersion = "1.2.6"
  val reactiveMongoVersion = "0.11.0-SNAPSHOT"
  val scalaTestVersion = "2.1.5"
  val specs2Version = "2.3.11"
  val foloneVersion = "0.12-SNAPSHOT"
  Seq(
    "com.typesafe.akka" %% "akka-actor" % akkaVersion,
    "com.typesafe.akka" %% "akka-testkit" % akkaVersion,
    "io.spray" %% "spray-can" % sprayVersion,
    "io.spray" %% "spray-routing" % sprayVersion,
    "io.spray" %% "spray-testkit" % sprayVersion,
    "io.spray" %% "spray-json" % sprayJsonVersion,
    "org.reactivemongo" % "reactivemongo_2.10" % reactiveMongoVersion,
    "org.scalatest" %% "scalatest" % scalaTestVersion % "test",
    "org.specs2" %% "specs2" % specs2Version % "test",
    "info.folone" % "poi-scala_2.10" % foloneVersion
  )
}
Any suggestions?
The conflicts appear because:
you've specified your Scala version to be 2.11
you've explicitly specified the Scala version (2.10) for the reactivemongo and poi-scala libraries.
The fix is to use the %% operator for those two libraries as well.
"org.reactivemongo" %% "reactivemongo" % reactiveMongoVersion,
"info.folone" %% "poi-scala" % foloneVersion
That's the purpose of the %% operator: it appends the declared Scala version (2.11 in your case) to the artifact name.
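In other words, with scalaVersion := "2.11.0" the two forms below resolve to the same artifact:

"org.reactivemongo" %% "reactivemongo"      % reactiveMongoVersion
"org.reactivemongo" %  "reactivemongo_2.11" % reactiveMongoVersion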
I had the same problem, and I simply removed the scalaVersion setting from my sbt file and modified the line
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0"
to
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "1.6.0"
and the problem went away.
I tried using %% but it didn't work, so I manually excluded the conflicting modules:
("org.reactivemongo" % "reactivemongo" % reactiveMongoVersion)
.exclude("com.typesafe.akka", "akka-actor_2.10")
.exclude("org.scalaz", "scalaz-effect")
.exclude("org.scalaz", "scalaz-core")
To investigate which module is the caller, you can use a plugin, but an easier way is to look into target/scala-2.*/resolution-cache/reports/.
There you will find Ivy's resolution report for each configuration.
Look for *-compile.xml and *-test.xml and search for the conflicting library.
You will see something like:
<module organisation="com.github.nscala-time" name="nscala-time_2.11">
  ...
  <caller organisation="com.tumblr" name="colossus-metrics_2.11" conf="compile, runtime" rev="1.2.0" rev-constraint-default="1.2.0" rev-constraint-dynamic="1.2.0" callerrev="0.7.2-RC1"/>
  ...
</module>
This should tell you the caller of the module.