What does "xs @ _*" mean in PathList of the sbt-assembly plugin? - scala

What do these phrases mean?
xs @ _*
ps @ _*
Copied from documentation:
assemblyMergeStrategy in assembly := {
case PathList("javax", "servlet", xs @ _*) => MergeStrategy.first
case PathList(ps @ _*) if ps.last endsWith ".html" => MergeStrategy.first
case "application.conf" => MergeStrategy.concat
case "unwanted.txt" => MergeStrategy.discard
case x =>
val oldStrategy = (assemblyMergeStrategy in assembly).value
oldStrategy(x)
}

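In your example, _* matches all remaining elements and @ binds what was matched to a name.
In the first case, xs @ _* binds xs to everything that follows javax/servlet in the path, so the case matches javax/servlet/*.
In the second case, ps @ _* binds ps to all segments of the path, and the guard keeps only paths whose last segment ends with .html.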

The @ operator binds the matched value to a variable in pattern matching.
In an expression, <somevar> : _* passes a sequence as varargs; in a pattern, _* matches the remaining elements of a sequence.
In this example, ps @ _* tells the pattern match to bind all path segments extracted by PathList to ps as a sequence.
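To see the binding in isolation, here is a minimal sketch that does not need sbt. Segments is a hypothetical stand-in for PathList (an assumption for illustration: it simply splits a path on "/"), and describe is a made-up helper; only the name @ _* pattern itself is standard Scala.

```scala
object PathPatternSketch {
  // Hypothetical stand-in for sbt-assembly's PathList: splits a path on '/'.
  object Segments {
    def unapplySeq(path: String): Option[Seq[String]] =
      Some(path.split("/").toList)
  }

  def describe(path: String): String = path match {
    // xs is bound to every segment after "javax", "servlet" (possibly none)
    case Segments("javax", "servlet", xs @ _*) => s"servlet: ${xs.mkString("/")}"
    // ps is bound to all segments of the path
    case Segments(ps @ _*) if ps.last.endsWith(".html") => s"html: ${ps.last}"
    case other => s"other: $other"
  }
}
```

For example, PathPatternSketch.describe("javax/servlet/http/HttpServlet.class") takes the first case with xs bound to the two trailing segments, while "docs/index.html" falls through to the second case.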

Related

Run Drools Kie project from fat jar

I am trying to run a drools based (and KieServices based) project from a fat jar that has been generated using sbt assembly.
[main] INFO org.drools.compiler.kie.builder.impl.ClasspathKieProject - Found kmodule: jar:file:/.../myJar.jar!/META-INF/kmodule.xml
[main] ERROR org.drools.compiler.kie.builder.impl.ClasspathKieProject - Unable to build index of kmodule.xml url=jar:file:/.../myJar.jar/META-INF/kmodule.xml
You're trying to perform a xml related operation without the necessary xml support for drools. Please add the module org.drools:drools-xml-support to your classpath.
[main] ERROR org.drools.compiler.kie.builder.impl.KieContainerImpl - Unknown KieSession name: DroolDummyKS
This error is triggered by the following code, last line to be specific:
val kieServices: KieServices = KieServices.Factory.get
val kieContainer: KieContainer = kieServices.getKieClasspathContainer
// Apply the correct kie session from the ./resources/META-INF/kmodule.xml configuration
val kieSession: KieSession = kieContainer.newKieSession("DroolDummyKS")
The project is a Scala sbt project (compiled with Java 11).
I couldn't reproduce Unable to build index of kmodule.xml. For me, kieContainer.newKieSession("DroolDummyKS") just returns null for the assembly jar (java -jar myJar.jar), as opposed to sbt run, where it returns KieSession[0].
Notice that it's written in your error:
Please add the module org.drools:drools-xml-support to your classpath
Do two things in build.sbt:
add to libraryDependencies
"org.drools" % "drools-xml-support" % "8.31.1.Final"
and
unignore kmodule.xml in the assembly strategy (so that kmodule.xml is included in the assembly jar), for example with singleOrError. I'm not sure that concatenation makes sense for XML, as opposed to service files; singleOrError makes sure that it is your kmodule.xml that is included, and throws otherwise.
assembly / assemblyMergeStrategy := {
case PathList("META-INF", "services", xs @ _*) => MergeStrategy.concat
case PathList("META-INF", "kmodule.xml") => MergeStrategy.singleOrError
case PathList("META-INF", xs @ _*) => MergeStrategy.discard
case _ => MergeStrategy.first
}
Update. With
assembly / assemblyMergeStrategy := {
case _ => MergeStrategy.singleOrError
}
you can check what files have duplicates.
You should add the dependency "org.drools" % "drools-xml-support" % "8.31.1.Final" as I advised earlier. Otherwise there is an NPE even for sbt run. I see that you added it in your update.
You should remove the file src/main/resources/META-INF/services/org.kie.api.KieServices. It's already present in the dependency drools-compiler-8.31.1.Final.jar anyway.
Try a strategy that ignores as little as possible:
assembly / assemblyMergeStrategy := {
case xs if xs.endsWith("LICENSE") => MergeStrategy.discard
case xs if xs.endsWith("LICENSE.txt") => MergeStrategy.discard
case xs if xs.endsWith("INDEX.LIST") => MergeStrategy.discard
case xs if xs.endsWith("MANIFEST.MF") => MergeStrategy.discard
case xs if xs.endsWith("NOTICE") => MergeStrategy.discard
case xs if xs.endsWith("NOTICE.txt") => MergeStrategy.discard
case xs if xs.endsWith("module-info.class") => MergeStrategy.discard
case PathList("META-INF", "services", "org.apache.poi.sl.draw.ImageRenderer") => MergeStrategy.filterDistinctLines
case PathList("META-INF", "services", "org.apache.poi.ss.usermodel.WorkbookProvider") => MergeStrategy.filterDistinctLines
case PathList("META-INF", "services", "org.apache.poi.extractor.ExtractorProvider") => MergeStrategy.filterDistinctLines
case PathList("META-INF", "services", "org.drools.wiring.api.ComponentsSupplier") => MergeStrategy.filterDistinctLines
case _ => MergeStrategy.singleOrError
}
Duplicates should be resolved then.
filterDistinctLines is similar to concat, except that it does not add the same line twice.
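The difference can be sketched in plain Scala. This only illustrates the idea and is not sbt-assembly's implementation; the helper names concatLines and distinctLines and the provider names in the usage note are made up.

```scala
object MergeSketch {
  // Each inner Seq holds the lines of one copy of the same service file.
  def concatLines(files: Seq[Seq[String]]): Seq[String] =
    files.flatten

  // Like concatLines, but a line that already appeared is not added again.
  def distinctLines(files: Seq[Seq[String]]): Seq[String] =
    files.flatten.distinct
}
```

With two copies of a service file that both list com.example.ProviderB, concatLines keeps that line twice, while distinctLines keeps it once.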
The strategy can be simplified
assembly / assemblyMergeStrategy := {
case xs if Seq(
"LICENSE",
"LICENSE.txt",
"INDEX.LIST",
"MANIFEST.MF",
"NOTICE",
"NOTICE.txt",
"module-info.class"
).exists(xs.endsWith) => MergeStrategy.discard
case PathList("META-INF", "services", xs @ _*) => MergeStrategy.filterDistinctLines
case _ => MergeStrategy.singleOrError
}
Update 2. I looked into the cause of the latest NPE:
Exception in thread "main" java.lang.NullPointerException:
Cannot invoke
"org.drools.compiler.compiler.Dialect.getId()"
because the return value of
"org.drools.compiler.rule.builder.RuleBuildContext.getDialect()"
is null
It turns out that the cause was the file META-INF/kie.default.properties.conf from one of the dependencies, so it was enough to additionally unignore it:
assembly / assemblyMergeStrategy := {
case x if x.endsWith("module-info.class") => MergeStrategy.discard
case PathList("META-INF", "services", xs @ _*) => MergeStrategy.concat
case PathList("META-INF", "kmodule.xml") => MergeStrategy.singleOrError
case PathList("META-INF", "kie.default.properties.conf") => MergeStrategy.singleOrError
case PathList("META-INF", xs @ _*) => MergeStrategy.discard
case _ => MergeStrategy.first
}
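Unignoring a file this way relies on the cases being tried from top to bottom: the specific META-INF entries have to come before the catch-all META-INF case. A minimal sketch of that ordering, using a hypothetical Segments extractor in place of PathList and plain strings in place of the real MergeStrategy values (both are assumptions for illustration):

```scala
object OrderSketch {
  // Hypothetical stand-in for PathList: splits a path on '/'.
  object Segments {
    def unapplySeq(path: String): Option[Seq[String]] =
      Some(path.split("/").toList)
  }

  // Strategy names are plain strings here; cases are tried top to bottom.
  def strategyFor(path: String): String = path match {
    case Segments("META-INF", "kmodule.xml") => "singleOrError"
    case Segments("META-INF", _*)            => "discard"
    case _                                   => "first"
  }
}
```

If the two META-INF cases were swapped, META-INF/kmodule.xml would be caught by the catch-all case and discarded.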
But the main conclusion should be that ignoring the whole of META-INF can be dangerous. It may contain other files that are important for some of the dependencies. There may be no NPE now, but other issues can show up later.
It turns out that the default assembly strategy
assembly / assemblyMergeStrategy := MergeStrategy.defaultMergeStrategy
or
assembly / assemblyMergeStrategy := {
case x =>
val oldStrategy = (ThisBuild / assemblyMergeStrategy).value
oldStrategy(x)
}
or, spelled out (this is the definition from the sbt-assembly source),
val defaultMergeStrategy: String => MergeStrategy = {
case x if Assembly.isConfigFile(x) =>
MergeStrategy.concat
case PathList(ps @ _*) if Assembly.isReadme(ps.last) || Assembly.isLicenseFile(ps.last) =>
MergeStrategy.rename
case PathList("META-INF", xs @ _*) =>
(xs map {_.toLowerCase}) match {
case ("manifest.mf" :: Nil) | ("index.list" :: Nil) | ("dependencies" :: Nil) =>
MergeStrategy.discard
case ps @ (x :: xs) if ps.last.endsWith(".sf") || ps.last.endsWith(".dsa") =>
MergeStrategy.discard
case "plexus" :: xs =>
MergeStrategy.discard
case "services" :: xs =>
MergeStrategy.filterDistinctLines
case ("spring.schemas" :: Nil) | ("spring.handlers" :: Nil) =>
MergeStrategy.filterDistinctLines
case _ => MergeStrategy.deduplicate
}
case _ => MergeStrategy.deduplicate
}
does the job pretty well. You only had to additionally discard the module-info.class files. So you may prefer:
assembly / assemblyMergeStrategy := {
case x if x.endsWith("module-info.class") => MergeStrategy.discard
case x =>
val oldStrategy = (ThisBuild / assemblyMergeStrategy).value
oldStrategy(x)
}
https://github.com/ThijmenL98/DroolsMCVE/pull/1

Remove warning from scala: "It would fail on the following input: Nil"

I have some Scala code:
val lines = Source
.fromResource("doc-topics-new.txt")
.getLines
.toList
.drop(1) match {
case x :: xs => x.split(" ").drop(2).mkString(" ") :: xs
}
When I run the code it works; however, there is a warning message:
Warning:(81, 14) match may not be exhaustive.
It would fail on the following input: Nil
.drop(1) match {
Please suggest how to remove this warning.
Just add a case of Nil:
val lines = Source
.fromResource("doc-topics-new.txt")
.getLines
.toList
.drop(1) match {
case Nil => List.empty // Add this line
case x :: xs => x.split(" ").drop(2).mkString(" ") :: xs
}
You can just add the Nil case to your pattern match.
If you really want to suppress the warning, you could use the unchecked annotation:
val lines = (Source
.fromResource("doc-topics-new.txt")
.getLines
.toList
.drop(1): @unchecked) match {
case x :: xs => x.split(" ").drop(2).mkString(" ") :: xs
}
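Keep in mind that the annotation only silences the compiler; the match still fails at runtime when the list is empty. A minimal sketch on a plain list instead of a resource file (the names firstLineTail and attempt are made up for illustration):

```scala
import scala.util.Try

object UncheckedSketch {
  // Same shape as the match above, with the exhaustivity warning suppressed.
  def firstLineTail(lines: List[String]): List[String] =
    (lines: @unchecked) match {
      case x :: xs => x.split(" ").drop(2).mkString(" ") :: xs
    }

  // Wraps the call so the failure on Nil can be inspected instead of crashing.
  def attempt(lines: List[String]): Try[List[String]] =
    Try(firstLineTail(lines))
}
```

Here UncheckedSketch.attempt(Nil) is a Failure wrapping a scala.MatchError, which is why adding the Nil case is usually the safer fix.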

Packaging Jar using SBT assembly

How can I package a jar from selected Scala classes using sbt-assembly? With sbt, the whole project is packaged into the jar. Is there a way to exclude some classes? Please help.
You can use assemblyMergeStrategy with MergeStrategy.discard to discard the classes that you want to exclude, like:
assemblyMergeStrategy in assembly := {
case PathList("javax", "servlet", xs @ _*) => MergeStrategy.first
case PathList(ps @ _*) if ps.last endsWith ".html" => MergeStrategy.first
case "application.conf" => MergeStrategy.concat
case "unwanted.txt" => MergeStrategy.discard
case x =>
val oldStrategy = (assemblyMergeStrategy in assembly).value
oldStrategy(x)
}
Reference:
https://github.com/sbt/sbt-assembly#merge-strategy

Resolve spark-avro error = Failed to load class for data source: com.databricks.spark.avro

I am trying to use the spark-avro library to process avro files. I am using SBT:
build.sbt:
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-sql" % "1.3.0",
"com.databricks" %% "spark-avro" % "1.0.0")
tester.scala:
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark.sql._
import com.databricks.spark.avro._
object tester {
def main(args: Array[String]) {
val conf = new SparkConf().setAppName("SimpleApplication").setMaster("local")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
// Creates a DataFrame from a specified file
val df = sqlContext.load("episodes.avro", "com.databricks.spark.avro")
}
}
When I run tester in the IntelliJ IDE, I get the following stack trace:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/avro/mapred/FsInput
at com.databricks.spark.avro.AvroRelation.newReader(AvroRelation.scala:111)
at com.databricks.spark.avro.AvroRelation.<init>(AvroRelation.scala:53)
at com.databricks.spark.avro.DefaultSource.createRelation(DefaultSource.scala:41)
at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:290)
When I run:
$ sbt package
$ ~/spark-1.3.1/bin/spark-submit --class "tester" target/scala-2.10/project_2.10-0.1-SNAPSHOT.jar
I get the following stack trace:
Exception in thread "main" java.lang.RuntimeException: Failed to load class for data source: com.databricks.spark.avro
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.sources.ResolvedDataSource$.lookupDataSource(ddl.scala:194)
at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:205)
at org.apache.spark.sql.SQLContext.load(SQLContext.scala:697)
What can I do to resolve this error? Any help is greatly appreciated. Thanks!!
"sbt package" will not include your dependencies, try sbt-assembly instead.
I changed the build.sbt file to:
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-sql" % "1.3.0",
"com.databricks" %% "spark-avro" % "1.0.0",
"org.apache.avro" % "avro" % "1.7.7",
"org.apache.avro" % "avro-mapred" % "1.7.7")
assemblyMergeStrategy in assembly := {
case PathList("org", "slf4j", xs @ _*) => MergeStrategy.first
case PathList("org", "apache", "spark", xs @ _*) => MergeStrategy.first
case PathList("com", "esotericsoftware", "minlog", xs @ _*) => MergeStrategy.first
case PathList("javax", "activation", xs @ _*) => MergeStrategy.first
case PathList("javax", "servlet", xs @ _*) => MergeStrategy.first
case PathList("javax", "xml", "stream", xs @ _*) => MergeStrategy.first
case PathList("org", "apache", "commons", xs @ _*) => MergeStrategy.first
case PathList("com", "google", "common", xs @ _*) => MergeStrategy.first
case "org/apache/hadoop/yarn/factories/package-info.class" => MergeStrategy.first
case "org/apache/hadoop/yarn/factory/providers/package-info.class" => MergeStrategy.first
case "org/apache/hadoop/yarn/util/package-info.class" => MergeStrategy.first
case x if x.startsWith("META-INF") => MergeStrategy.discard
case x if x.startsWith("plugin.properties") => MergeStrategy.discard
case x => {
val oldStrategy = (assemblyMergeStrategy in assembly).value
oldStrategy(x)
}
}
and used the command
$ sbt assembly
to build the jar. Everything works now.

Why does the empty string not match as Seq.empty?

EDIT: This was an old bug long since fixed in Scala 2.8 and later
During some experimentation around the question "Pattern matching a String as Seq[Char]", I ran across another weird matching phenomenon. Consider the following code that treats a string as a sequence of characters:
def %%&#(input: String) : String = {
val uha : Seq[Char] = input
uha match {
case Seq() => "Empty"
case Seq(first @ _, 'o', 'o') => "Bar"
case _ => "Oh"
}
}
Calling this function on the empty String "" correctly yields "Empty".
However, if I rewrite the first match clause as
case Seq.empty => "Empty"
the matching of "" fails and matches the default clause instead.
Walking through the Scala library source code (which you shouldn't have to do in an ideal world :-) ) I believe that both Seq() and Seq.empty will result in RandomAccessSeq.empty. Apparently, this does not concur with the phenomenon described above because only Seq() matches the empty String.
UPDATE: Upon some further experimentation this question can be narrowed down to the following:
val list = List()
>>> list2: List[Nothing] = List()
val emptySeq = Seq.empty
list == emptySeq
>>> res1: Boolean = false
This basically means that an empty Seq does not automatically equal Seq.empty.
So when matching against a constant (as opposed to using an extractor as suggested by starblue), this inequality leads to the failing match.
The same is true when interpreting the empty String as a sequence.
This appears to be a bug in the library. Do you want to file the bug or shall I?
scala> Seq.empty match {case Seq() => "yup"; case _ => "nope"}
res0: java.lang.String = yup
scala> Seq() match {case Seq.empty => "yup"; case _ => "nope"}
res1: java.lang.String = yup
scala> ("" : Seq[Char]) match {case Seq() => "yup"; case _ => "nope"}
res2: java.lang.String = yup
scala> ("" : Seq[Char]) match {case Seq.empty => "yup"; case _ => "nope"}
res3: java.lang.String = nope
In pattern matching, the unapply or unapplySeq functions are used, not apply as you seem to believe.
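The distinction can be made visible with a small sketch. Loud below is a hypothetical extractor that counts each call to its unapplySeq; Empty is a plain value, so case Empty is a stable identifier pattern that is checked with == and involves no extractor at all. All names are made up for illustration, and the sketch assumes a Scala version where the bug mentioned in the question is fixed.

```scala
object ExtractorSketch {
  var calls = 0 // counts unapplySeq invocations

  // Hypothetical extractor: exposes the characters of a sequence.
  object Loud {
    def unapplySeq(s: Seq[Char]): Option[Seq[Char]] = {
      calls += 1
      Some(s)
    }
  }

  val Empty: Seq[Char] = Seq.empty

  // Extractor pattern: calls Loud.unapplySeq and requires zero elements.
  def viaExtractor(s: Seq[Char]): Boolean = s match {
    case Loud() => true
    case _      => false
  }

  // Stable identifier pattern: no extractor involved, compares with ==.
  def viaEquality(s: Seq[Char]): Boolean = s match {
    case Empty => true
    case _     => false
  }
}
```

Calling viaExtractor increments calls, while viaEquality leaves it untouched, which shows that the two patterns take different routes even when they agree on the result.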