AssertionError on Scala 2.13.10 project when not using any assert statement

I'm getting the following error when running sbt clean assembly on a Scala 2.13.10 project, using JVM 1.8 and IntelliJ IDEA 2022.3.1 (Community Edition). I don't think it is caused by any assert in my code.
java.lang.AssertionError: assertion failed:
List(method apply$mcI$sp, method apply$mcI$sp)
while compiling: D:\repo\git\external\project9\src\main\scala\es\package\graph\KNNGraphBuilder.scala
during phase: globalPhase=specialize, enteringPhase=explicitouter
library version: version 2.13.10
compiler version: version 2.13.10
reconstructed args:
last tree to typer: Ident(x1)
tree position: line 270 of D:\repo\git\external\project9\src\main\scala\es\package\graph\KNNGraphBuilder.scala
tree tpe: x1.type
symbol: case value x1
symbol definition: case val x1: (Int, es.package.graph.NeighborsForElement) (a TermSymbol)
symbol package: es.package.graph
symbol owners: value x1 -> method $anonfun$addElements -> method addElements -> class GroupedNeighborsForElement
call site: method neighborsWithComparisonCountOfGroup in class GroupedNeighborsForElementWithComparisonCount in package graph
== Source file context for tree position ==
267
268 def addElements(n:GroupedNeighborsForElement):Unit=
269 {
270 for ((k,v) <- n.neighbors)
271 addElementsOfGroup(k,v)
272 }
273
Here, n.neighbors returns a Map[Int,NeighborsForElement]. Other relevant code is shown below:
def addElementsOfGroup(groupId:Int, n:NeighborsForElement):Unit=
{
  getOrCreateGroup(groupId).addElements(n)
}

private def getOrCreateGroup(groupId:Int):NeighborsForElement=
{
  val g=neighbors.get(groupId)
  if (g.isDefined) return g.get
  val newGroup=new NeighborsForElement(numNeighbors)
  neighbors(groupId)=newGroup
  return newGroup
}
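For what it's worth, one rewrite I've been considering (I have no idea whether it would actually sidestep the specialize-phase assertion) is to avoid pattern-matching on the (Int, NeighborsForElement) tuple in the for comprehension:
def addElements(n: GroupedNeighborsForElement): Unit =
{
  // same behaviour as the for comprehension above, but without the tuple pattern
  n.neighbors.foreach { pair =>
    addElementsOfGroup(pair._1, pair._2)
  }
}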
This is the build.sbt file:
name := "project9"
version := "0.1"
organization := "es.package"
scalaVersion := "2.13.10"
val sparkVersion = "3.3.1"
resolvers ++= Seq(
  "apache-snapshots" at "https://repository.apache.org/snapshots/"
)
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion,
  "org.apache.spark" %% "spark-mllib" % sparkVersion,
  "org.scalatest" %% "scalatest" % "3.2.0"
)
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}
Any ideas what's wrong with this code? Thanks

Related

java.lang.VerifyError: Operand stack overflow for google-ads API and SBT

I am trying to migrate from Google-AdWords to the google-ads-v10 API in Spark 3.1.1 on EMR.
I am facing some dependency issues due to conflicts with existing jars.
Initially, we were facing a dependency issue related to the Protobuf jar:
Exception in thread "grpc-default-executor-0" java.lang.IllegalAccessError: tried to access field com.google.protobuf.AbstractMessage.memoizedSize from class com.google.ads.googleads.v10.services.SearchGoogleAdsRequest
at com.google.ads.googleads.v10.services.SearchGoogleAdsRequest.getSerializedSize(SearchGoogleAdsRequest.java:394)
at io.grpc.protobuf.lite.ProtoInputStream.available(ProtoInputStream.java:108)
To resolve this, I tried to shade the Protobuf jar and build an uber-jar instead. After the shading, running my project locally in IntelliJ works fine, but when I try to run the executable jar I created, I get the following error:
Exception in thread "main" io.grpc.ManagedChannelProvider$ProviderNotFoundException: No functional channel service provider found. Try adding a dependency on the grpc-okhttp, grpc-netty, or grpc-netty-shaded artifact
I tried adding all of those libraries via --spark.jars.packages, but it didn't help.
java.lang.VerifyError: Operand stack overflow
Exception Details:
Location:
io/grpc/internal/TransportTracer.getStats()Lio/grpc/InternalChannelz$TransportStats; ...
...
...
at io.grpc.netty.shaded.io.grpc.netty.NettyChannelBuilder.<init>(NettyChannelBuilder.java:96)
at io.grpc.netty.shaded.io.grpc.netty.NettyChannelBuilder.forTarget(NettyChannelBuilder.java:169)
at io.grpc.netty.shaded.io.grpc.netty.NettyChannelBuilder.forAddress(NettyChannelBuilder.java:152)
at io.grpc.netty.shaded.io.grpc.netty.NettyChannelProvider.builderForAddress(NettyChannelProvider.java:38)
at io.grpc.netty.shaded.io.grpc.netty.NettyChannelProvider.builderForAddress(NettyChannelProvider.java:24)
at io.grpc.ManagedChannelBuilder.forAddress(ManagedChannelBuilder.java:39)
at com.google.api.gax.grpc.InstantiatingGrpcChannelProvider.createSingleChannel(InstantiatingGrpcChannelProvider.java:348)
Has anyone ever encountered such an issue?
Build.sbt
lazy val dependencies = new {
  val sparkRedshift = "io.github.spark-redshift-community" %% "spark-redshift" % "5.0.3" % "provided" excludeAll (ExclusionRule(organization = "com.amazonaws"))
  val jsonSimple = "com.googlecode.json-simple" % "json-simple" % "1.1" % "provided"
  val googleAdsLib = "com.google.api-ads" % "google-ads" % "17.0.1"
  val jedis = "redis.clients" % "jedis" % "3.0.1" % "provided"
  val sparkAvro = "org.apache.spark" %% "spark-avro" % sparkVersion % "provided"
  val queryBuilder = "com.itfsw" % "QueryBuilder" % "1.0.4" % "provided" excludeAll (ExclusionRule(organization = "com.fasterxml.jackson.core"))
  val protobufForGoogleAds = "com.google.protobuf" % "protobuf-java" % "3.18.1"
  val guavaForGoogleAds = "com.google.guava" % "guava" % "31.1-jre"
}
libraryDependencies ++= Seq(
  dependencies.sparkRedshift, dependencies.jsonSimple, dependencies.googleAdsLib,
  dependencies.guavaForGoogleAds, dependencies.protobufForGoogleAds,
  dependencies.jedis, dependencies.sparkAvro, dependencies.queryBuilder
)
dependencyOverrides ++= Set(
  dependencies.guavaForGoogleAds
)
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.google.protobuf.**" -> "repackaged.protobuf.@1").inAll
)
assemblyMergeStrategy in assembly := {
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  case PathList("module-info.class", xs @ _*) => MergeStrategy.discard
  case x => MergeStrategy.first
}
I had a similar issue and I changed the assembly merge strategy to this:
assemblyMergeStrategy in assembly := {
  case x if x.contains("io.netty.versions.properties") => MergeStrategy.discard
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
I solved this by using the google-ads-shadowjar as an external jar rather than having a dependency on the google-ads library. This avoids having to deal with the dependencies manually, but it makes your jar bigger.
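For reference, using the shadow jar as an external (unmanaged) jar is mostly a matter of where the file lives; a minimal build.sbt sketch (the folder name below is only an illustration, not taken from the answer above):
// assumed setup: the downloaded google-ads shadow jar is copied into the project's lib/ folder,
// which sbt already scans for unmanaged jars, so no libraryDependencies entry is needed.
// If the jar is kept somewhere else, point sbt at that folder instead:
unmanagedBase := baseDirectory.value / "external-jars"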

SBT[1.1.1] Different libraryDependencies for different Scala Versions

I have tried the solution from SBT cross building - choosing a different library version for different scala version; however, this results in:
build.sbt:27: error: No implicit for Append.Value[Seq[sbt.librarymanagement.ModuleID], sbt.Def.Initialize[sbt.librarymanagement.ModuleID]] found,
so sbt.Def.Initialize[sbt.librarymanagement.ModuleID] cannot be appended to Seq[sbt.librarymanagement.ModuleID]
libraryDependencies += scalaVersion(jsonDependency(_)),
^
[error] sbt.compiler.EvalException: Type error in expression
[error] sbt.compiler.EvalException: Type error in expression
[error] Use 'last' for the full log.
What is the correct way of forcing library dependencies for different Scala versions in sbt 1.1.1?
build.sbt:
libraryDependencies += scalaVersion(jsonDependency(_))
def jsonDependency(scalaVersion: String) = scalaVersion match {
  case "2.11.7" => "com.typesafe.play" %% "play-json" % "2.4.2"
  case "2.12.4" => "com.typesafe.play" %% "play-json" % "2.6.9"
}
The first line should be:
libraryDependencies += jsonDependency(scalaVersion.value)
As for the rest, it's unnecessarily sensitive to exact Scala version numbers. Consider using CrossVersion.partialVersion to be sensitive to the Scala major version only, as follows:
def jsonDependency(scalaVersion: String) =
  "com.typesafe.play" %% "play-json" %
    (CrossVersion.partialVersion(scalaVersion) match {
      case Some((2, 11)) => "2.4.2"
      case _ => "2.6.9"
    })
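Putting the two pieces together, the relevant build.sbt fragment would look roughly like this (a sketch reusing the versions from the question):
def jsonDependency(scalaVersion: String): ModuleID =
  "com.typesafe.play" %% "play-json" %
    (CrossVersion.partialVersion(scalaVersion) match {
      case Some((2, 11)) => "2.4.2" // the play-json release built for Scala 2.11
      case _ => "2.6.9"
    })

libraryDependencies += jsonDependency(scalaVersion.value)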

Run Scala Spark with SBT

The code below causes Spark to become unresponsive:
System.setProperty("hadoop.home.dir", "H:\\winutils");

val sparkConf = new SparkConf().setAppName("GroupBy Test").setMaster("local[1]")
val sc = new SparkContext(sparkConf)

def main(args: Array[String]) {
  val text_file = sc.textFile("h:\\data\\details.txt")
  val counts = text_file
    .flatMap(line => line.split(" "))
    .map(word => (word, 1))
    .reduceByKey(_ + _)
  println(counts);
}
I'm setting hadoop.home.dir in order to avoid the error mentioned here: Failed to locate the winutils binary in the hadoop binary path
This is what my build.sbt file looks like:
lazy val root = (project in file(".")).
  settings(
    name := "hello",
    version := "1.0",
    scalaVersion := "2.11.0"
  )

libraryDependencies ++= Seq(
  "org.apache.spark" % "spark-core_2.11" % "1.6.0"
)
Should the Spark/Scala code be compilable and runnable using the build.sbt shown above?
I think the code itself is fine; it was taken verbatim from http://spark.apache.org/examples.html, but I am not sure whether the Hadoop WinUtils path is required.
Update: "The solution was to use fork := true in the main build.sbt"
Here is the reference: Spark: ClassNotFoundException when running hello world example in scala 2.11
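For anyone wondering, that change is a one-line addition to build.sbt (a sketch; scoping it to just the run task should work as well):
// run the application in a forked JVM instead of inside the sbt JVM
fork := true

// or, limited to the run task only:
// fork in run := true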
This is the content of my build.sbt. Note that if your internet connection is slow, fetching the dependencies might take some time.
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.1",
  "org.apache.spark" %% "spark-mllib" % "1.6.1",
  "org.apache.spark" %% "spark-sql" % "1.6.1",
  "org.slf4j" % "slf4j-api" % "1.7.12"
)
run in Compile <<= Defaults.runTask(fullClasspath in Compile, mainClass in (Compile, run), runner in (Compile, run))
In main I added the line below; however, the path depends on where you placed the winutils folder.
System.setProperty("hadoop.home.dir", "c:\\winutil")

Conflicting files in uber-jar creation in SBT using sbt-assembly

I am trying to compile and package a fat jar using SBT, and I keep running into the following error. I have tried everything from library-dependency excludes to merge strategies.
[trace] Stack trace suppressed: run last *:assembly for the full output.
[error] (*:assembly) deduplicate: different file contents found in the following:
[error] /Users/me/.ivy2/cache/org.slf4j/slf4j-api/jars/slf4j-api-1.7.10.jar:META-INF/maven/org.slf4j/slf4j-api/pom.properties
[error] /Users/me/.ivy2/cache/com.twitter/parquet-format/jars/parquet-format-2.2.0-rc1.jar:META-INF/maven/org.slf4j/slf4j-api/pom.properties
[error] Total time: 113 s, completed Jul 10, 2015 1:57:21 AM
The current incarnation of my build.sbt file is below:
import AssemblyKeys._
assemblySettings
name := "ldaApp"
version := "0.1"
scalaVersion := "2.10.4"
mainClass := Some("myApp")
libraryDependencies +="org.scalanlp" %% "breeze" % "0.11.2"
libraryDependencies +="org.scalanlp" %% "breeze-natives" % "0.11.2"
libraryDependencies += "org.apache.spark" % "spark-mllib_2.10" % "1.3.1"
libraryDependencies +="org.ini4j" % "ini4j" % "0.5.4"
jarName in assembly := "myApp"
net.virtualvoid.sbt.graph.Plugin.graphSettings
libraryDependencies += "org.slf4j" %% "slf4j-api"" % "1.7.10" % "provided"
I realize I am doing something wrong...I just have no idea what.
Here is how you can handle these merge issues.
import sbtassembly.Plugin._

lazy val assemblySettings = sbtassembly.Plugin.assemblySettings ++ Seq(
  publishArtifact in packageScala := false, // Remove scala from the uber jar
  mergeStrategy in assembly <<= (mergeStrategy in assembly) { (old) =>
    {
      case PathList("META-INF", "CHANGES.txt") => MergeStrategy.first
      // ...
      case PathList(ps @ _*) if ps.last endsWith "pom.properties" => MergeStrategy.first
      case x => old(x)
    }
  }
)
Then add these settings to your project.
lazy val projectToJar = Project(id = "MyApp", base = file(".")).settings(assemblySettings: _*)
I got your assembly build running by removing Spark from the fat jar (mllib is already included in Spark).
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.3.1" % "provided"
Like vitalii said in a comment, this solution was already posted here. I understand that spending hours on a problem without finding the fix can be frustrating, but please be nice.

Suggestions needed to improve packaging all sources and javadoc of sbt projects

To avoid version-related problems with Scala (2.9, 2.10, 2.11, …), we want to include all the jar files necessary to use Scala in a Java application. To facilitate debugging and development, we want to include the sources and javadocs of all such libraries too.
I know this topic has been asked many times before; however, I haven't found a solution that works for us (Scala 2.11 & sbt 0.13.5).
I managed to prototype an approximate solution with an sbt project configured as follows:
./build.sbt:
val packAllCommand = Command.command("packAll") { state =>
  "clean" :: "update" :: "updateClassifiers" ::
    "pack" :: "dependencyGraph" :: "dependencyDot" ::
    state
}

commands += packAllCommand
./project/plugins.sbt:
resolvers +=
  "sonatype-releases" at "https://oss.sonatype.org/content/repositories/releases/"
addSbtPlugin("org.xerial.sbt" % "sbt-pack" % "0.6.1")
addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.7.4")
./project/Build.scala
import sbt._
import Keys._
import net.virtualvoid.sbt.graph.Plugin.graphSettings
import xerial.sbt.Pack._
/**
 * Goal:
 *
 * use sbt to package all the jars/sources/javadoc for scala & related libraries needed to use scala in a java application
 * without requiring scala to be installed on the system.
 *
 * @author Nicolas.F.Rouquette@jpl.nasa.gov
 */
object BuildWithSourcesAndJavadocs extends Build {

  object Versions {
    val scala = "2.11.2"
    val config = "1.2.1"
    val scalaCheck = "1.11.5"
    val scalaTest = "2.2.1"
    val specs2 = "2.4"
    val parboiled = "2.0.0"
  }
  lazy val scalaLibs: Project = Project(
    "scalaLibs",
    file( "scalaLibs" ),
    settings = Defaults.coreDefaultSettings ++ Defaults.runnerSettings ++ Defaults.baseTasks ++ graphSettings ++ packSettings ++ Seq(
      scalaVersion := Versions.scala,
      packExpandedClasspath := true,
      libraryDependencies ++= Seq(
        "org.scala-lang" % "scala-library" % scalaVersion.value % "compile" withSources () withJavadoc (),
        "org.scala-lang" % "scala-compiler" % scalaVersion.value % "compile" withSources () withJavadoc (),
        "org.scala-lang" % "scala-reflect" % scalaVersion.value % "compile" withJavadoc () withJavadoc () ),
      ( mappings in pack ) := { extraPackFun.value } ) )
  lazy val otherLibs: Project = Project(
    "otherLibs",
    file( "otherLibs" ),
    settings = Defaults.coreDefaultSettings ++ Defaults.runnerSettings ++ Defaults.baseTasks ++ graphSettings ++ packSettings ++ Seq(
      scalaVersion := Versions.scala,
      packExpandedClasspath := true,
      libraryDependencies ++= Seq(
        "org.scala-lang" % "scala-library" % Versions.scala % "provided",
        "org.scala-lang" % "scala-compiler" % Versions.scala % "provided",
        "org.scala-lang" % "scala-reflect" % Versions.scala % "provided",
        "com.typesafe" % "config" % Versions.config % "compile" withSources () withJavadoc (),
        "org.scalacheck" %% "scalacheck" % Versions.scalaCheck % "compile" withSources () withJavadoc (),
        "org.scalatest" %% "scalatest" % Versions.scalaTest % "compile" withSources () withJavadoc (),
        "org.specs2" %% "specs2" % Versions.specs2 % "compile" withSources () withJavadoc (),
        "org.parboiled" %% "parboiled" % Versions.parboiled % "compile" withSources () withJavadoc () ),
      ( mappings in pack ) := { extraPackFun.value } ) ).dependsOn( scalaLibs )

  lazy val root: Project = Project( "root", file( "." ) ) aggregate ( scalaLibs, otherLibs )
  val extraPackFun: Def.Initialize[Task[Seq[( File, String )]]] = Def.task[Seq[( File, String )]] {
    def getFileIfExists( f: File, where: String ): Option[( File, String )] = if ( f.exists() ) Some( ( f, s"${where}/${f.getName()}" ) ) else None
    val ivyHome: File = Classpaths.bootIvyHome( appConfiguration.value ) getOrElse sys.error( "Launcher did not provide the Ivy home directory." )

    // this is a workaround; how should it be done properly in sbt?
    // goal: process the list of library dependencies of the project.
    // that is, we should be able to tell the classification of each library dependency module as shown in sbt:
    //
    // > show libraryDependencies
    // [info] List(
    //   org.scala-lang:scala-library:2.11.2,
    //   org.scala-lang:scala-library:2.11.2:provided,
    //   org.scala-lang:scala-compiler:2.11.2:provided,
    //   org.scala-lang:scala-reflect:2.11.2:provided,
    //   com.typesafe:config:1.2.1:compile,
    //   org.scalacheck:scalacheck:1.11.5:compile,
    //   org.scalatest:scalatest:2.2.1:compile,
    //   org.specs2:specs2:2.4:compile,
    //   org.parboiled:parboiled:2.0.0:compile)
    // but... libraryDependencies is a SettingKey (see ld below)
    // I haven't figured out how to get the sequence of modules from it.
    val ld: SettingKey[Seq[ModuleID]] = libraryDependencies

    // workaround... I found this API that I managed to call...
    // this overrides the classification of all jars -- i.e., it is as if all library dependencies had been classified as "compile".
    // for now... it's a reasonable approximation of the goal...
    val managed: Classpath = Classpaths.managedJars( Compile, classpathTypes.value, update.value )

    val result: Seq[( File, String )] = managed flatMap { af: Attributed[File] =>
      af.metadata.entries.toList flatMap { e: AttributeEntry[_] =>
        e.value match {
          case null => Seq()
          case m: ModuleID => Seq() ++
            getFileIfExists( new File( ivyHome, s"cache/${m.organization}/${m.name}/srcs/${m.name}-${m.revision}-sources.jar" ), "lib.srcs" ) ++
            getFileIfExists( new File( ivyHome, s"cache/${m.organization}/${m.name}/docs/${m.name}-${m.revision}-javadoc.jar" ), "lib.javadoc" )
          case _ => Seq()
        }
      }
    }
    result
  }
}
Thanks to the sbt-pack and sbt-dependency-graph plugins, the above produces what I need:
scalaLibs/target/dependencies-compile.dot
scalaLibs/target/pack/lib
scalaLibs/target/pack/lib.srcs
scalaLibs/target/pack/lib.javadoc
otherLibs/target/dependencies-compile.dot
otherLibs/target/pack/lib
otherLibs/target/pack/lib.srcs
otherLibs/target/pack/lib.javadoc
The dot files can be visualized with GraphViz; it helps explain why a particular library is included…
I would like to improve this approach in terms of the following:
some libraries in scalaLibs are duplicated in otherLibs,
this approach ignores library dependency classification & overrides (not used here)
Suggestions?
Nicolas.