Issue in resolving vulnerability checks - Scala

We ran a vulnerability check on our sbt project using Anchore Engine.
Most of the reported issues relate to jackson-databind. We are not even using it directly, since we use spray-json for serialization. After searching I found it is used internally by sbt, so I cannot upgrade its version directly. I tried upgrading sbt from 1.2.6 to 1.4.0 to resolve this issue, but it didn't work.
object Versions {
  val guice = "4.2.1"
  val slick = "3.3.2"
  val hikariCP = "3.3.0"
  val postgres = "42.2.5"
  val rabbitMQClient = "5.5.1"
  val logbackClassic = "1.2.3"
  val sprayJson = "1.3.5"
  val akkaHttp = "10.1.5"
  val akkaActor = "2.5.19"
  val akkaStream = "2.5.19"
  val scalaTest = "3.0.1"
  val h2 = "1.4.197"
  val rabbitmqMock = "1.0.8"
  val mockito = "1.9.5"
}

object CompileDeps {
  val guice = "com.google.inject" % "guice" % Versions.guice
  val scalaGuice = "net.codingwell" %% "scala-guice" % Versions.guice
  val postgresql = "org.postgresql" % "postgresql" % Versions.postgres
  val slick = "com.typesafe.slick" %% "slick" % Versions.slick
  val hikariCP = "com.typesafe.slick" %% "slick-hikaricp" % Versions.hikariCP
  val rabbitMQClient = "com.rabbitmq" % "amqp-client" % Versions.rabbitMQClient exclude("com.fasterxml.jackson.core", "jackson-databind")
  val logbackClassic = "ch.qos.logback" % "logback-classic" % Versions.logbackClassic
  val sprayJson = "io.spray" %% "spray-json" % Versions.sprayJson
  val akkaHttp = "com.typesafe.akka" %% "akka-http" % Versions.akkaHttp
  val akkaActor = "com.typesafe.akka" %% "akka-actor" % Versions.akkaActor
  val akkaStream = "com.typesafe.akka" %% "akka-stream" % Versions.akkaStream
  val akkaHttpSprayJson = "com.typesafe.akka" %% "akka-http-spray-json" % Versions.akkaHttp
}
(screenshot of the dependencyBrowseGraph output omitted)
So can anyone please guide me on how I can resolve these security findings?
Thanks

You are fetching Jackson via the RabbitMQ client dependency; see the compile dependencies of your amqp-client version on the Maven repository.
That dependency is marked as optional, so you can probably safely remove it using exclude("com.fasterxml.jackson.core", "jackson-databind"). Test it! If that doesn't work, add the dependency explicitly to bump it to some newer, safer version, or find a way to suppress the warning.
For the future: use sbt-dependency-graph to generate a visual dependency graph (dependencyBrowseGraph); then you'll be able to see which libraries fetch and evict your dependencies.
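For illustration, both options might look like this in the build above (a sketch only: the plugin coordinates apply to sbt versions before 1.4, after which a version of the plugin ships with sbt itself, and 2.9.10.8 is a placeholder version, not a recommendation; use whatever version your scanner reports as fixed):
// project/plugins.sbt (only needed on sbt < 1.4)
addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.9.2")

// In the file defining CompileDeps: exclude the optional jackson-databind
// pulled in by amqp-client...
val rabbitMQClient = "com.rabbitmq" % "amqp-client" % Versions.rabbitMQClient exclude("com.fasterxml.jackson.core", "jackson-databind")

// ...and only if something still needs Jackson at runtime, pin a patched
// version explicitly:
val jacksonDatabind = "com.fasterxml.jackson.core" % "jackson-databind" % "2.9.10.8" // placeholder version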

Related

Can you import a separate version of the same dependency into one build file for test?

I was thinking this would work, but it did not for me:
libraryDependencies += "org.json4s" %% "json4s-core" % "3.6.7" % "test"
libraryDependencies += "org.json4s" %% "json4s-core" % "3.7.0" % "compile"
Any idea?
The test classpath includes the compile classpath.
So create different subprojects for the different versions of the dependency if you need that:
lazy val forJson4s370 = project
  .settings(
    libraryDependencies += "org.json4s" %% "json4s-core" % "3.7.0" % "compile"
  )

lazy val forJson4s367 = project
  .settings(
    libraryDependencies += "org.json4s" %% "json4s-core" % "3.6.7" % "test"
  )
If you don't want to create different subprojects, you can try custom sbt configurations:
https://www.scala-sbt.org/1.x/docs/Advanced-Configurations-Example.html
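A rough sketch of that approach, loosely following the linked example (untested; the configuration name OldJson4s is invented for illustration):
lazy val OldJson4s = config("oldjson4s").hide

lazy val root = (project in file("."))
  .configs(OldJson4s)
  .settings(
    inConfig(OldJson4s)(Defaults.compileSettings), // give the config its own classpath
    libraryDependencies += "org.json4s" %% "json4s-core" % "3.7.0",            // main build
    libraryDependencies += "org.json4s" %% "json4s-core" % "3.6.7" % OldJson4s // only on the custom classpath
  )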
An exotic solution is to manage dependencies and compile/run code programmatically, just for the exceptional class. Then that code can use dependency versions different from the ones specified in build.sbt.
import java.net.URLClassLoader
import coursier.{Dependency, Module, Organization, ModuleName, Fetch}
import scala.reflect.runtime.universe
import scala.reflect.runtime.universe.Quasiquote
import scala.tools.reflect.ToolBox

val files = Fetch()
  .addDependencies(
    Dependency(Module(Organization("org.json4s"), ModuleName("json4s-core_2.13")), "3.6.7"),
  )
  .run()

val depClassLoader = new URLClassLoader(
  files.map(_.toURI.toURL).toArray,
  /*getClass.getClassLoader*/ null // ignoring current classpath
)

val rm = universe.runtimeMirror(depClassLoader)
val tb = rm.mkToolBox()

tb.eval(q"""
  import org.json4s._
  // some exceptional json4s 3.6.7 code
  println("hi")
""")
// hi
build.sbt
libraryDependencies ++= Seq(
  scalaOrganization.value % "scala-compiler" % scalaVersion.value % "test",
  "io.get-coursier" %% "coursier" % "2.1.0-M7-39-gb8f3d7532" % "test",
  "org.json4s" %% "json4s-core" % "3.7.0" % "compile",
)

Not able to read parquet files in Spark: java.lang.NoSuchMethodError: org.json4s.jackson.JsonMethods

I am trying to read a snappy-compressed parquet file but keep getting the exception below. I am not able to find the root cause of this exception; can someone please guide me here?
val sparkSession: SparkSession = SparkSession.builder()
  .master("local[2]")
  .config("spark.ui.enabled", false)
  .appName("local-intellij")
  .getOrCreate()

val df = sparkSession.read.parquet("C:\\data\\parquet\\part-00000-4ce5708f-2f50-485d-8ae4-7c5ea440fda6.c000.snappy.parquet")
My dependencies are:
lazy val json4sVersion = "3.5.0"
lazy val json4sDeps = Seq(
  "org.json4s" %% "json4s-core" % json4sVersion,
  "org.json4s" %% "json4s-native" % json4sVersion,
  "org.json4s" %% "json4s-ast" % json4sVersion,
  "org.json4s" %% "json4s-jackson" % json4sVersion)

lazy val sparkVersionCore = "2.3.0.cloudera2"
lazy val sparkDeps = Seq(
  "org.apache.spark" %% "spark-hive" % sparkVersionCore,
  "org.apache.spark" %% "spark-core" % sparkVersionCore)
java.lang.NoSuchMethodError: org.json4s.jackson.JsonMethods$.parse(Lorg/json4s/JsonInput;Z)Lorg/json4s/JsonAST$JValue;
at org.apache.spark.sql.types.DataType$.fromJson(DataType.scala:113)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$org$apache$spark$sql$execution$datasources$parquet$ParquetFileFormat$$deserializeSchemaString$3.apply(ParquetFileFormat.scala:650)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$org$apache$spark$sql$execution$datasources$parquet$ParquetFileFormat$$deserializeSchemaString$3.apply(ParquetFileFormat.scala:650)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$.org$apache$spark$sql$execution$datasources$parquet$ParquetFileFormat$$deserializeSchemaString(ParquetFileFormat.scala:650)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$readSchemaFromFooter$1.apply(ParquetFileFormat.scala:643)
at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$$anonfun$readSchemaFromFooter$1.apply(ParquetFileFormat.scala:643)
This is a clear jar version mismatch for Spark 2.3.0. AFAIK you have to use org.json4s json4s-jackson_2.11 3.2.11:
// https://mvnrepository.com/artifact/org.json4s/json4s-jackson
libraryDependencies += "org.json4s" %% "json4s-jackson" % "3.2.11"
AFAIK this entry is not required; I think it will be downloaded automatically once you specify the Spark version in sbt. Just try removing the explicit json4s entries and see; if that doesn't work, add the aforementioned entry.
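Either way, it's worth confirming which json4s version actually ends up on your classpath. With the sbt-dependency-graph plugin (linked further down this page) you can inspect it from the sbt shell, for example:
> dependencyTree
> whatDependsOn org.json4s json4s-jackson_2.11 3.2.11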

Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.8.9

When I do df.show() to print the content of DataFrame rows, I get this error:
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.8.9
at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:64)
at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:747)
at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
This is how I create df:
object Test extends App {
  val spark = SparkSession.builder()
    .config("es.nodes", "XXX.XX.XX.XX")
    .config("es.port", "9200")
    .config("es.nodes.wan.only", "false")
    .config("es.resource", "myIndex")
    .appName("Test")
    .master("local[*]")
    .getOrCreate()

  val df_source = spark
    .read.format("org.elasticsearch.spark.sql")
    .option("pushdown", "true")
    .load("myIndex")

  df_source.show(5)
}
I do not use the Jackson library in my build.sbt.
UPDATE:
import sbtassembly.AssemblyPlugin.autoImport.assemblyOption

name := "test"

lazy val spark = "org.apache.spark"
lazy val typesafe = "com.typesafe.akka"

val sparkVersion = "2.2.0"
val elasticSparkVersion = "6.2.4"
val scalaLoggingVersion = "3.7.2"
val slf4jVersion = "1.7.5"
val kafkaVersion = "0.8.0.0"
val akkaVersion = "2.5.9"
val playVersion = "2.6.8"
val sprayVersion = "1.3.2"
val opRabbitVersion = "2.1.0"
val orientdbVersion = "2.2.34"
val livyVersion = "0.5.0-incubating"
val scalaHttpVersion = "2.3.0"
val scoptVersion = "3.3.0"

resolvers ++= Seq(
  // repo for op-rabbit client
  "SpinGo OSS" at "http://spingo-oss.s3.amazonaws.com/repositories/releases",
  "SparkPackagesRepo" at "http://dl.bintray.com/spark-packages/maven",
  "cloudera.repo" at "https://repository.cloudera.com/artifactory/cloudera-repos"
)

lazy val commonSettings = Seq(
  organization := "org.test",
  version := "0.1",
  scalaVersion := "2.11.8",
  assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = true),
  assemblyMergeStrategy in assembly := {
    case PathList("META-INF", xs @ _*) => MergeStrategy.discard
    case PathList("reference.conf") => MergeStrategy.concat
    case x => MergeStrategy.first
  }
)

val sparkSQL = spark %% "spark-sql" % sparkVersion
val sparkGraphx = spark %% "spark-graphx" % sparkVersion
val sparkMLLib = spark %% "spark-mllib" % sparkVersion
val elasticSpark = "org.elasticsearch" % "elasticsearch-hadoop" % elasticSparkVersion
val livyAPI = "org.apache.livy" % "livy-api" % livyVersion
val livyScalaAPI = "org.apache.livy" %% "livy-scala-api" % livyVersion
val livyClientHttp = "org.apache.livy" % "livy-client-http" % livyVersion
val spingoCore = "com.spingo" %% "op-rabbit-core" % opRabbitVersion
val spingoPlayJson = "com.spingo" %% "op-rabbit-play-json" % opRabbitVersion
val spingoJson4s = "com.spingo" %% "op-rabbit-json4s" % opRabbitVersion
val spingoAirbrake = "com.spingo" %% "op-rabbit-airbrake" % opRabbitVersion
val spingoAkkaStream = "com.spingo" %% "op-rabbit-akka-stream" % opRabbitVersion

val orientDB = "com.orientechnologies" % "orientdb-graphdb" % orientdbVersion excludeAll(
  ExclusionRule("commons-beanutils", "commons-beanutils-core"),
  ExclusionRule("commons-collections", "commons-collections"),
  ExclusionRule("commons-logging", "commons-logging"),
  ExclusionRule("stax", "stax-api")
)

val scopt = "com.github.scopt" %% "scopt" % scoptVersion
val spray = "io.spray" %% "spray-json" % sprayVersion
val scalaHttp = "org.scalaj" %% "scalaj-http" % scalaHttpVersion

lazy val graph = (project in file("./app"))
  .settings(
    commonSettings,
    libraryDependencies ++= Seq(sparkSQL, sparkGraphx, sparkMLLib, orientDB,
      livyAPI, livyScalaAPI, livyClientHttp, scopt,
      spingoCore, scalaHttp,
      spray, spingoCore, spingoPlayJson, spingoJson4s,
      spingoAirbrake, spingoAkkaStream, elasticSpark)
  )

dependencyOverrides += "com.typesafe.akka" %% "akka-stream" % akkaVersion
I tried to add Jackson libraries for Spark, but it didn't solve the problem:
val jacksonCore = "com.fasterxml.jackson.core" % "jackson-core" % "2.6.5"
val jacksonDatabind = "com.fasterxml.jackson.core" % "jackson-databind" % "2.6.5"
val jacksonAnnotations = "com.fasterxml.jackson.core" %% "jackson-annotations" % "2.6.5"
val jacksonScala = "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.6.5"
Finally, I did this (the last two dependencies cannot be resolved for some reason):
dependencyOverrides += "com.typesafe.akka" %% "akka-stream" % akkaVersion
dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-core" % "2.8.9"
dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-databind" % "2.8.9"
dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-annotations" % "2.8.9"
dependencyOverrides += "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.8.9"
dependencyOverrides += "com.fasterxml.jackson.module" % "jackson-module-paranamer" % "2.8.9"
But now I get the error:
Exception in thread "main" java.lang.NoClassDefFoundError: com/fasterxml/jackson/module/scala/DefaultScalaModule$
Caused by: java.lang.ClassNotFoundException: com.fasterxml.jackson.module.scala.DefaultScalaModule$
The Jackson version for Spark 2.2.0 is 2.6.5; it looks like one of your other dependencies is using Jackson 2.8.9. These two versions are not compatible, so you need to align them to the same Jackson version.
This build.sbt looks very problematic, since you have mixed a lot of things that probably don't align on Jackson and on other dependencies.
For example, op-rabbit-json4s expects json4s 3.5.3 (which brings its own Jackson), while orientdb-graphdb, I think, expects a third Jackson version (2.2.3).
In summary, you need to align your dependencies as much as possible to make sure there are no conflicts.
Here you can find a useful plugin to check dependencies: https://github.com/jrudolph/sbt-dependency-graph
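As a concrete starting point, pinning every Jackson module to the version Spark 2.2.0 expects might look like the following (a sketch, untested against this build; note jackson-module-scala is cross-built, hence %%):
dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-core" % "2.6.5"
dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-databind" % "2.6.5"
dependencyOverrides += "com.fasterxml.jackson.core" % "jackson-annotations" % "2.6.5"
dependencyOverrides += "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.6.5"
Note that the question's own attempt used %% for jackson-annotations, which is a plain Java artifact and therefore needs a single %.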

Build sbt for Spark with JanusGraph and gremlin-scala

I was trying to set up an IntelliJ build for Spark with JanusGraph using gremlin-scala, but I am running into errors.
My build.sbt file is:
version := "1.0"
scalaVersion := "2.11.11"
libraryDependencies += "com.michaelpollmeier" % "gremlin-scala" % "2.3.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.1"
// https://mvnrepository.com/artifact/org.apache.spark/spark-sql
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.1"
// https://mvnrepository.com/artifact/org.apache.spark/spark-mllib
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.2.1"
// https://mvnrepository.com/artifact/org.apache.spark/spark-hive
libraryDependencies += "org.apache.spark" %% "spark-hive" % "2.2.1"
// https://mvnrepository.com/artifact/org.janusgraph/janusgraph-core
libraryDependencies += "org.janusgraph" % "janusgraph-core" % "0.2.0"
libraryDependencies ++= Seq(
  "ch.qos.logback" % "logback-classic" % "1.2.3" % Test,
  "org.scalatest" %% "scalatest" % "3.0.3" % Test
)
resolvers ++= Seq(
  Resolver.mavenLocal,
  "Sonatype OSS" at "https://oss.sonatype.org/content/repositories/public"
)
But I am getting errors when I try to compile code that uses the gremlin-scala libraries or io.Source. Can someone share their build file or tell me what I should modify to fix it?
Thanks in advance.
So, I was trying to compile this code:
import gremlin.scala._
import org.apache.commons.configuration.BaseConfiguration
import org.janusgraph.core.JanusGraphFactory

class Test1() {
  val conf = new BaseConfiguration()
  conf.setProperty("storage.backend", "inmemory")
  val gr = JanusGraphFactory.open(conf)
  val graph = gr.asScala()
  graph.close
}

object Test {
  def main(args: Array[String]) {
    val t = new Test1()
    println("in Main")
  }
}
The errors I get are:
Error:(1, 8) not found: object gremlin
import gremlin.scala._
Error:(10, 18) value asScala is not a member of org.janusgraph.core.JanusGraph
val graph = gr.asScala()
If you go to the Gremlin-Scala GitHub page you'll see that the current version is "3.3.1.1" and that
Typically you just need to add a dependency on "com.michaelpollmeier" %% "gremlin-scala" % "SOME_VERSION" and one for the graph db of your choice to your build.sbt (this readme assumes tinkergraph). The latest version is displayed at the top of this readme in the maven badge.
It is not a surprise that the API has changed when the major version of the library is different. If I change your first dependency as follows
//libraryDependencies += "com.michaelpollmeier" % "gremlin-scala" % "2.3.0" //old!
libraryDependencies += "com.michaelpollmeier" %% "gremlin-scala" % "3.3.1.1"
then your example code compiles for me.

SBT cannot append Seq[Object] to Seq[ModuleID]

SBT keeps failing with an improper-append error. I'm using the exact format of build files I have seen numerous times.
build.sbt:
lazy val backend = (project in file("backend")).settings(
  name := "backend",
  libraryDependencies ++= (Dependencies.backend)
).dependsOn(api).aggregate(api)
dependencies.scala:
import sbt._

object Dependencies {
  lazy val backend = common ++ metrics

  val common = Seq(
    "com.typesafe.akka" %% "akka-actor" % Version.akka,
    "com.typesafe.akka" %% "akka-cluster" % Version.akka,
    "org.scalanlp.breeze" %% "breeze" % Version.breeze,
    "com.typesafe.akka" %% "akka-contrib" % Version.akka,
    "org.scalanlp.breeze-natives" % Version.breeze,
    "com.google.guava" % "guava" % "17.0"
  )

  val metrics = Seq("org.fusesource" % "sigar" % "1.6.4")
}
I'm not quite sure why SBT is complaining:
error: No implicit for Append.Values[Seq[sbt.ModuleID], Seq[Object]] found,
so Seq[Object] cannot be appended to Seq[sbt.ModuleID]
libraryDependencies ++= (Dependencies.backend)
^
Short Version (TL;DR)
There's an error in common: you want to replace this line
"org.scalanlp.breeze-natives" % Version.breeze,
with this line
"org.scalanlp" %% "breeze-natives" % Version.beeze,
Long Version
"org.scalanlp.breeze-natives" % Version.breeze is a GroupArtifactID not a ModuleID.
This causes common to become a Seq[Object] instead of a Seq[ModuleID].
And therefore also Dependencies.backend to be a Seq[Object]
Which ultimately can't be appended (via ++=) to libraryDependencies (defined as a SettingKey[Seq[ModuleID]]) because there is no available Append.Values[Seq[sbt.ModuleID], Seq[Object]].
One of common or metrics is not a Seq[sbt.ModuleID]. You could find out which with a type ascription:
val common: Seq[sbt.ModuleID] = ...
val metrics: Seq[sbt.ModuleID] = ...
My money is on common; this line doesn't have enough %s in it:
"org.scalanlp.breeze-natives" % Version.breeze