SBT Verify Error caused by multiple protobuf 2/3 dependencies in spite of shading - scala

I am struggling with Verify Errors with this below sample project using Scio/Bigtable/HBase. The dependency tree requires protobuf version (2.5, 2.6.1, 3.0, 3.1) and seems to default to 3.2. I used the shading component of sbt-assembly, not sure I am right with it.
My build.sbt:
name := "scalalab"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies ++= Seq(
"com.spotify" %% "scio-core" % "0.3.2",
"com.google.cloud.bigtable" % "bigtable-hbase-1.2" % "0.9.2",
"org.apache.hbase" % "hbase-client" % "1.2.5",
"org.apache.hbase" % "hbase-common" % "1.2.5",
"org.apache.hadoop" % "hadoop-common" % "2.7.3"
)
assemblyShadeRules in assembly := Seq(
ShadeRule.rename("com.google.**" -> "shadeio.#1").inAll
)
assemblyMergeStrategy in assembly := {
case x => MergeStrategy.first
}
My Main.scala:
import com.google.cloud.bigtable.hbase.adapters.Adapters
import com.google.cloud.bigtable.hbase.adapters.read.DefaultReadHooks
import org.apache.hadoop.hbase.client.Scan
object Main {
def main(args: Array[String]): Unit = {
val scan = new Scan
scan.setMaxVersions()
scan.addFamily("family".getBytes)
scan.setRowPrefixFilter("prefix".getBytes)
val builder = Adapters.SCAN_ADAPTER.adapt(scan, new DefaultReadHooks)
System.out.println(builder)
}
}
outputs to:
scalalab git:(master) ? java -cp .:target/scala-2.11/scalalab-assembly-1.0.jar Main
Exception in thread "main" java.lang.VerifyError: Bad return type
Exception Details:
Location:
shadeio/cloud/bigtable/hbase/adapters/AppendAdapter.adapt(Lorg/apache/hadoop/hbase/client/Operation;)Lshadeio/bigtable/repackaged/com/google/protobuf/GeneratedMessageV3$Builder; #8: areturn
Reason:
Type 'shadeio/bigtable/v2/ReadModifyWriteRowRequest$Builder' (current frame, stack[0]) is not assignable to 'shadeio/bigtable/repackaged/com/google/protobuf/GeneratedMessageV3$Builder' (from method signature)
Current Frame:
bci: #8
flags: { }
locals: { 'shadeio/cloud/bigtable/hbase/adapters/AppendAdapter', 'org/apache/hadoop/hbase/client/Operation' }
stack: { 'shadeio/bigtable/v2/ReadModifyWriteRowRequest$Builder' }
Bytecode:
0x0000000: 2a2b c000 28b6 00a0 b0
at shadeio.cloud.bigtable.hbase.adapters.Adapters.<clinit>(Adapters.java:40)
at Main$.main(Main.scala:12)
at Main.main(Main.scala)
What am I doing wrong ?
Thanks for your help

The dependency issue is quit a pain for the Cloud Bigtable HBase client. We'll be fixing it in the next release. To fix your current problem, import "bigtable-hbase" instead of "bigtable-hbase-1.2"
Also, I would suggest using the latest version of the client is possible, which is 0.9.7.1.

Scio has a bigtable artifact, which you can include via:
libraryDependencies ++= Seq(
"com.spotify" %% "scio-core" % "0.3.2",
"com.spotify" %% "scio-bigtable" % "0.3.2"
)
Usually, there is no need for the other dependencies to use Bigtable, in fact they might cause issues.
Make sure to then import bigtable package via:
import com.spotify.scio.bigtable._
Check this BigtableExample.
Side note you might want to give sbt-pack a try instead of sbt-assembly to fully leverage artifact caching. For an example of sbt setup check the template here.

Related

is it necessary to add my custom scala library dependencies in new scala project?

I am new to Scala and I am trying to develop a small project which uses a custom library. I have created a mysql connection pool inside the library. Here's my build.sbt for library
organization := "com.learn"
name := "liblearn-scala"
version := "0.1"
scalaVersion := "2.12.6"
libraryDependencies += "mysql" % "mysql-connector-java" % "6.0.6"
libraryDependencies += "org.apache.tomcat" % "tomcat-dbcp" % "8.5.0"
I have published the same to local ivy repo using sbt publishLocal
Now I have a project which will be making use of the above library with following build.sbt
name := "SBT1"
version := "0.1"
scalaVersion := "2.12.6"
libraryDependencies += "com.learn" % "liblearn-scala_2.12" % "0.1"
I am able to compile the new project but when I run it I get
java.lang.ClassNotFoundException: org.apache.tomcat.dbcp.dbcp2.BasicDataSource
But if I add
libraryDependencies += "mysql" % "mysql-connector-java" % "6.0.6"
libraryDependencies += "org.apache.tomcat" % "tomcat-dbcp" % "8.5.0"
in the project's build.sbt it works without any issues.
Is this the actual way of doing things with scala - sbt? ie : I have to mention dependencies of custom library also inside the project?
Here is my library code (I have just 1 file)
package com.learn.scala.db
import java.sql.Connection
import org.apache.tomcat.dbcp.dbcp2._
object MyMySQL {
private val dbUrl = s"jdbc:mysql://localhost:3306/school?autoReconnect=true"
private val connectionPool = new BasicDataSource()
connectionPool.setUsername("root")
connectionPool.setPassword("xyz")
connectionPool.setDriverClassName("com.mysql.cj.jdbc.Driver")
connectionPool.setUrl(dbUrl)
connectionPool.setInitialSize(3)
def getConnection: Connection = connectionPool.getConnection
}
This is my project code:
try {
val conn = MyMySQL.getConnection
val ps = conn.prepareStatement("select * from school")
val rs = ps.executeQuery()
while (rs.next()) {
print(rs.getString("name"))
print(rs.getString("rank"))
println("----------------------------------")
}
rs.close()
ps.close()
conn.close()
} catch {
case ex: Exception => {
println(ex.printStackTrace())
}
}
By default SBT fetches all project dependencies, transitively. This means it should be necessary to explicitly declare only liblearn-scala, and not also the transitive dependencies mysql-connector-java and tomcat-dbcp. Transitivity can be disabled, and transitive dependencies can be excluded, however unless this has been done explicitly, then it should not be the cause of the problem.
Without seeing your whole build.sbt, I believe you are doing the right thing. If sbt clean publishLocal is not solving the problem, you could try the nuclear option and clear the whole ivy cache (note this will force all projects to re-fetch dependencies).

Specs2 test within plays gives me "could not find implicit value for evidence parameter of type org.specs2.main.CommandLineAsResult

I'm trying to write a test case for a simple REST API in Play2/Scala that send/receives JSON. My test looks like the following:
import org.junit.runner.RunWith
import org.specs2.matcher.JsonMatchers
import org.specs2.mutable._
import org.specs2.runner.JUnitRunner
import play.api.libs.json.{Json, JsArray, JsValue}
import play.api.test.Helpers._
import play.api.test._
import play.test.WithApplication
/**
* Add your spec here.
* You can mock out a whole application including requests, plugins etc.
* For more information, consult the wiki.
*/
#RunWith(classOf[JUnitRunner])
class APIv1Spec extends Specification with JsonMatchers {
val registrationJson = Json.parse("""{"device":"576b9cdc-d3c3-4a3d-9689-8cd2a3e84442", |
"firstName":"", "lastName":"Johnny", "email":"justjohnny#test.com", |
"pass":"myPassword", "acceptTermsOfService":true}
""")
def dropJsonElement(json : JsValue, element : String) = (json \ element).get match {
case JsArray(items) => util.dropAt(items, 1)
}
def invalidRegistrationData(remove : String) = {
dropJsonElement(registrationJson,remove)
}
"API" should {
"Return Error on missing first name" in new WithApplication {
val result= route(
FakeRequest(
POST,
"/api/v1/security/register",
FakeHeaders(Seq( ("Content-Type", "application/json") )),
invalidRegistrationData("firstName").toString()
)
).get
status(result) must equalTo(BAD_REQUEST)
contentType(result) must beSome("application/json")
}
...
However when I attempt to run sbt test, I get the following error:
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=384M; support was removed in 8.0
[info] Loading project definition from /home/cassius/brentspace/esalestracker/project
[info] Set current project to eSalesTracker (in build file:/home/cassius/brentspace/esalestracker/)
[info] Compiling 3 Scala sources to /home/cassius/brentspace/esalestracker/target/scala-2.11/test-classes...
[error] /home/cassius/brentspace/esalestracker/test/APIv1Spec.scala:34: could not find implicit value for evidence parameter of type org.specs2.main.CommandLineAsResult[play.test.WithApplication{val result: scala.concurrent.Future[play.api.mvc.Result]}]
[error] "Return Error on missing first name" in new WithApplication {
[error] ^
[error] one error found
[error] (test:compileIncremental) Compilation failed
[error] Total time: 3 s, completed 18/01/2016 9:30:42 PM
I have similar tests in other applications, but it looks like the new version of specs adds a lot of support for Futures and other things that invalidate previous tutorials. I'm on Scala 2.11.6, Activator 1.3.6 and my build.sbt looks like the following:
name := """eSalesTracker"""
version := "1.0-SNAPSHOT"
lazy val root = (project in file(".")).enablePlugins(PlayScala)
scalaVersion := "2.11.6"
libraryDependencies ++= Seq(
jdbc,
cache,
ws,
"com.typesafe.slick" %% "slick" % "3.1.0",
"org.postgresql" % "postgresql" % "9.4-1206-jdbc42",
"org.slf4j" % "slf4j-api" % "1.7.13",
"ch.qos.logback" % "logback-classic" % "1.1.3",
"ch.qos.logback" % "logback-core" % "1.1.3",
evolutions,
specs2 % Test,
"org.specs2" %% "specs2-matcher-extra" % "3.7" % Test
)
resolvers += "scalaz-bintray" at "http://dl.bintray.com/scalaz/releases"
resolvers += Resolver.url("Typesafe Ivy releases", url("https://repo.typesafe.com/typesafe/ivy-releases"))(Resolver.ivyStylePatterns)
// Play provides two styles of routers, one expects its actions to be injected, the
// other, legacy style, accesses its actions statically.
routesGenerator := InjectedRoutesGenerator
I think you are using the wrong WithApplication import.
Use this one:
import play.api.test.WithApplication
The last line of the testcase should be the assertion/evaluation statement.
e.g. before the last } of the failing testcase method put the statement false must beEqualTo(true) and run it again.

updateStateByKey, noClassDefFoundError

I have problem with using updateStateByKey() function. I have following, simple code (written base on book: "Learning Spark - Lighting Fast Data Analysis"):
object hello {
def updateStateFunction(newValues: Seq[Int], runningCount: Option[Int]): Option[Int] = {
Some(runningCount.getOrElse(0) + newValues.size)
}
def main(args: Array[String]) {
val conf = new SparkConf().setMaster("local[5]").setAppName("AndrzejApp")
val ssc = new StreamingContext(conf, Seconds(4))
ssc.checkpoint("/")
val lines7 = ssc.socketTextStream("localhost", 9997)
val keyValueLine7 = lines7.map(line => (line.split(" ")(0), line.split(" ")(1).toInt))
val statefullStream = keyValueLine7.updateStateByKey(updateStateFunction _)
ssc.start()
ssc.awaitTermination()
}
}
My build.sbt is:
name := "stream-correlator-spark"
version := "1.0"
scalaVersion := "2.11.4"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % "1.3.1" % "provided",
"org.apache.spark" %% "spark-streaming" % "1.3.1" % "provided"
)
When I build it with sbt assembly command everything goes fine. When I run this on spark cluster in local mode I got error:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/streaming/dstream/DStream$
at hello$.main(helo.scala:25)
...
line 25 is:
val statefullStream = keyValueLine7.updateStateByKey(updateStateFunction _)
I feel this might be some compatibility version problem but I don't know what might be the reason and how to resolve this.
I would be really grateful for help!
When you are writing "provided" in the SBT this means exactly that your library is provided by the environment and need no to be included in the package.
Try to remove "provided" mark from "spark-streaming" library.
You can add "provided" back when you need to submit your app to a spark cluster to run. The benefit of having "provided" is that the result fat jar will not include classes from the provided dependencies, which will yield a much smaller fat jar, comparing to not having "provided". In my case, the result jar will be around 90M without "provided" and then shrink to 30+M with "provided".

Compilation errors with spark cassandra connector and SBT

I'm trying to get the DataStax spark cassandra connector working. I've created a new SBT project in IntelliJ, and added a single class. The class and my sbt file is given below. Creating spark contexts seem to work, however, the moment I uncomment the line where I try to create a cassandraTable, I get the following compilation error:
Error:scalac: bad symbolic reference. A signature in CassandraRow.class refers to term catalyst
in package org.apache.spark.sql which is not available.
It may be completely missing from the current classpath, or the version on
the classpath might be incompatible with the version used when compiling CassandraRow.class.
Sbt is kind of new to me, and I would appreciate any help in understanding what this error means (and of course, how to resolve it).
name := "cassySpark1"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.1.0"
libraryDependencies += "com.datastax.spark" % "spark-cassandra-connector" % "1.1.0" withSources() withJavadoc()
libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector-java" % "1.1.0-alpha2" withSources() withJavadoc()
resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
And my class:
import org.apache.spark.{SparkConf, SparkContext}
import com.datastax.spark.connector._
object HelloWorld { def main(args:Array[String]): Unit ={
System.setProperty("spark.cassandra.query.retry.count", "1")
val conf = new SparkConf(true)
.set("spark.cassandra.connection.host", "cassandra-hostname")
.set("spark.cassandra.username", "cassandra")
.set("spark.cassandra.password", "cassandra")
val sc = new SparkContext("local", "testingCassy", conf)
> //val foo = sc.cassandraTable("keyspace name", "table name")
val rdd = sc.parallelize(1 to 100)
val sum = rdd.reduce(_+_)
println(sum) } }
You need to add spark-sql to dependencies list
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.1.0"
Add library dependency in your project's pom.xml file. It seems they have changed the Vector.class dependencies location in the new refactoring.

Error in hello world spray app with scala 2.11

I'm trying to get a simple "hello world" server running using spray with scala 2.11:
import spray.routing.SimpleRoutingApp
import akka.actor.ActorSystem
object SprayTest extends App with SimpleRoutingApp {
implicit val system = ActorSystem("my-system")
startServer(interface = "localhost", port = 8080) {
path("hello") {
get {
complete {
<h1>Say hello to spray</h1>
}
}
}
}
}
However, I receive the following compile errors:
Multiple markers at this line
- not found: value port
- bad symbolic reference to spray.can encountered in class file 'SimpleRoutingApp.class'. Cannot
access term can in package spray. The current classpath may be missing a definition for spray.can, or
SimpleRoutingApp.class may have been compiled against a version that's incompatible with the one
found on the current classpath.
- not found: value interface
Does anyone know what might be the issue? BTW, I'm very new to spray and actors, so I lack a lot of intuition for how spray and actors work (that's why I'm doing this simple tutorial).
Finally found the answer myself. I needed to add the spray-can dependency to my pom file. Leaving this question and answer in case anyone else runs into the same problem.
SBT example:
scalaVersion := "2.10.4"
val akkaVersion = "2.3.6"
val sprayVersion = "1.3.2"
resolvers ++= Seq(
"Spray Repository" at "http://repo.spray.io/"
)
libraryDependencies ++= Seq(
"com.typesafe.akka" %% "akka-actor" % akkaVersion,
"io.spray" %% "spray-can" % sprayVersion,
"io.spray" %% "spray-routing" % sprayVersion
)