Create executable jar file using maven on Scala - scala

I have been trying to create a simple JAR file from a scala script with maven and pass it to spark-submit. For that, I am using the following version of maven
> mvn -version
Apache Maven 3.6.3
Maven home: /usr/share/maven
Java version: 1.8.0_312, vendor: Private Build, runtime: /usr/lib/jvm/java-8-openjdk-amd64/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "5.4.0-100-generic", arch: "amd64", family: "unix"
To start, I create the working directory with the following command
mvn archetype:generate -DgroupId=com.xom -DartifactId=workflow -DarchetypeArtifactId=maven-archetype-quickstart -DarchetypeVersion=1.4 -DinteractiveMode=false
After creating the directory with Maven, I add the MainTest.scala file to src/main/java/com/xom/. This Scala file is the demo code from the Spark quickstart:
package com.xom
import org.apache.spark.sql.SparkSession
object MainTest {
def main(args: Array[String]): Unit = {
val logFile = "/opt/spark/README.md" // Should be some file on your system
val spark = SparkSession.builder.appName("Simple Application").config("spark.master", "local").getOrCreate()
val logData = spark.read.textFile(logFile).cache()
val numAs = logData.filter(line => line.contains("a")).count()
val numBs = logData.filter(line => line.contains("b")).count()
println(s"Lines with a: $numAs, Lines with b: $numBs")
spark.stop()
}
}
Following the Maven website's instructions for creating an executable JAR file, I replaced the default <build> </build> section in pom.xml with the one provided there, passing in the fully qualified name of my class.
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.2.4</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>com.xom.MainTest</mainClass>
</transformer>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
Then I run
mvn clean install
which creates the jar file. When I try to access it with java -jar, I get the following error message:
Error: Could not find or load main class com.xom.MainTest
Caused by: java.lang.ClassNotFoundException: com.xom.MainTest
I cannot launch it with spark-submit either (I get Error: Failed to load class MainTest.).
I am really new to the world of Scala and Maven. I have searched everywhere for a workaround, but so far I have not been able to figure out why this is happening. Most of the solutions mention changing some configuration in pom.xml, but nothing I have tried has worked.

Related

Project with gmaven-plugin in POM compiles at command line but fails to compile in Eclipse

I'm trying to get a project that uses gmaven-plugin to compile in Eclipse. When I import the project into Eclipse using the Maven Import I get the error shown below.
After the project finishes importing, I'm left with Java compile errors. (The errors persist if I select one of the MVN profiles.)
I'm not sure whether the two are related.
I get errors if I try to install this:
http://dist.springsource.org/release/GRECLIPSE/e4.2/
This installs but I still get the same import and compile errors for my project:
https://marketplace.eclipse.org/category/free-tagging/groovy
The full project is at
https://github.com/OHDSI/WebAPI
A snippet from pom.xml is below.
<plugin>
<groupId>org.codehaus.gmaven</groupId>
<artifactId>gmaven-plugin</artifactId>
<version>1.5</version>
<executions>
<execution>
<id>add-git-branch-info</id>
<phase>generate-resources</phase>
<goals>
<goal>execute</goal>
</goals>
<configuration>
<source>
if (project.properties.getProperty("git.branch") == null) project.properties.setProperty("git.branch", "*");
if (project.properties.getProperty("git.commit.id.abbrev") == null) project.properties.setProperty("git.commit.id.abbrev", "*");
</source>
</configuration>
</execution>
</executions>
</plugin>
How can these issues be resolved?

Vertx config with spring-config-server: unknown configuration store implementation: spring-config-server

I would like to fetch my config from a spring cloud config server as described in https://vertx.io/docs/vertx-config/java/#_spring_config_server_store
Used imports:
import io.vertx.config.ConfigRetrieverOptions;
import io.vertx.config.ConfigStoreOptions;
import io.vertx.core.DeploymentOptions;
import io.vertx.core.VertxOptions;
import io.vertx.core.buffer.Buffer;
import io.vertx.core.dns.AddressResolverOptions;
import io.vertx.core.json.JsonObject;
import io.vertx.reactivex.config.ConfigRetriever;
import io.vertx.reactivex.core.Vertx;
// relevant section:
final ConfigStoreOptions storeOptions = new ConfigStoreOptions()
.setType("spring-config-server")
.setConfig(new JsonObject().put("url", "url-to-server"));
final ConfigRetrieverOptions options = new ConfigRetrieverOptions()
.addStore(storeOptions);
I use Maven to build the jar, and I can run the application in IntelliJ. The jar contains all needed dependencies. However, if I start the jar from the CLI with "java -jar artifact.jar", I get the following error:
2020-01-17 10:54:04.121 INFO [main] c.e.Runner - Bootstrapping application...
Exception in thread "main" java.lang.IllegalArgumentException: unknown configuration store implementation: spring-config-server (known implementations are: [event-bus, file, json, http, env, sys, directory])
at io.vertx.config.impl.ConfigRetrieverImpl.<init>(ConfigRetrieverImpl.java:111)
at io.vertx.config.ConfigRetriever.create(ConfigRetriever.java:53)
at com.example.Runner.main(Runner.java:41)
I'm using Vert.x version 3.8.4.
Available config stores are found using the Java Service Loader utility. It means that the service loader will look for all files in the classpath like:
META-INF/services/io.vertx.config.spi.ConfigStoreFactory
These files contain the names of the available config store factories.
Since you build a FAT jar, it is likely that your build process keeps only the core service file and drops the service file that comes with the spring config server module.
You must configure your build to merge the content of all these files.
The Vert.x Maven plugin does it by default, but you can also do it with the Maven shade plugin:
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.2.1</version>
<executions>
<execution>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
<resource>META-INF/services/io.vertx.core.spi.VerticleFactory</resource>
</transformer>
<transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
<resource>META-INF/services/io.vertx.config.spi.ConfigStoreFactory</resource>
</transformer>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
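To check whether the merge actually happened, one option is to list the providers that the service files on the classpath declare. The following is a minimal sketch, not part of Vert.x: the class name ServiceFileCheck and its methods are hypothetical, and it relies only on the standard ClassLoader.getResources call and the service file format (one provider class name per line, with # starting a comment).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

/** Hypothetical helper: lists provider names declared in service files on the classpath. */
public class ServiceFileCheck {

    /** Drops comments and blank lines from the content of one service file. */
    public static List<String> parseProviders(String content) {
        List<String> providers = new ArrayList<>();
        for (String line : content.split("\\R")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // everything after '#' is a comment
            }
            line = line.trim();
            if (!line.isEmpty()) {
                providers.add(line);
            }
        }
        return providers;
    }

    /** Reads every copy of META-INF/services/<service> visible to the given class loader. */
    public static List<String> providersOnClasspath(ClassLoader loader, String service)
            throws IOException {
        List<String> providers = new ArrayList<>();
        Enumeration<URL> urls = loader.getResources("META-INF/services/" + service);
        while (urls.hasMoreElements()) {
            StringBuilder content = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                    urls.nextElement().openStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    content.append(line).append('\n');
                }
            }
            providers.addAll(parseProviders(content.toString()));
        }
        return providers;
    }

    public static void main(String[] args) throws IOException {
        // Service name taken from the answer above.
        String service = "io.vertx.config.spi.ConfigStoreFactory";
        ClassLoader loader = ServiceFileCheck.class.getClassLoader();
        for (String provider : providersOnClasspath(loader, service)) {
            System.out.println(provider);
        }
    }
}
```

Run with the fat jar on the classpath, the output should include a factory from the spring config server module if the service files were merged. If only the core factories appear, the module's service file was most likely overwritten during packaging.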

Compiling Scala Using Maven

I want to create a hello world application using Maven.
Here is my pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>ColossusPlay</groupId>
<artifactId>ColossusPlay</artifactId>
<version>0.0.1-SNAPSHOT</version>
<build>
<sourceDirectory>src</sourceDirectory>
<plugins>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.3</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
</plugins>
</build>
<dependencies>
<dependency>
<groupId>com.tumblr</groupId>
<artifactId>colossus-metrics_2.10</artifactId>
<version>0.8.1-RC1</version>
</dependency>
</dependencies>
</project>
and here is my scala code:
object Main extends App{
println( "Helo World" )
}
When I run
mvn package
it generates a jar file in the target directory. I then want to run that jar file using
scala target/ColossusPlay-0.0.1-SNAPSHOT.jar
However, I get a NullPointerException like this:
java.lang.NullPointerException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at scala.reflect.internal.util.ScalaClassLoader$$anonfun$tryClass$1.apply(ScalaClassLoader.scala:43)
at scala.reflect.internal.util.ScalaClassLoader$$anonfun$tryClass$1.apply(ScalaClassLoader.scala:43)
at scala.util.control.Exception$Catch$$anonfun$opt$1.apply(Exception.scala:119)
at scala.util.control.Exception$Catch$$anonfun$opt$1.apply(Exception.scala:119)
at scala.util.control.Exception$Catch.apply(Exception.scala:103)
at scala.util.control.Exception$Catch.opt(Exception.scala:119)
at scala.reflect.internal.util.ScalaClassLoader$class.tryClass(ScalaClassLoader.scala:42)
at scala.reflect.internal.util.ScalaClassLoader$class.tryToInitializeClass(ScalaClassLoader.scala:39)
at scala.reflect.internal.util.ScalaClassLoader$URLClassLoader.tryToInitializeClass(ScalaClassLoader.scala:101)
at scala.reflect.internal.util.ScalaClassLoader$class.run(ScalaClassLoader.scala:63)
at scala.reflect.internal.util.ScalaClassLoader$URLClassLoader.run(ScalaClassLoader.scala:101)
at scala.tools.nsc.CommonRunner$class.run(ObjectRunner.scala:22)
at scala.tools.nsc.JarRunner$.run(MainGenericRunner.scala:13)
at scala.tools.nsc.CommonRunner$class.runAndCatch(ObjectRunner.scala:29)
at scala.tools.nsc.JarRunner$.runJar(MainGenericRunner.scala:25)
at scala.tools.nsc.MainGenericRunner.runTarget$1(MainGenericRunner.scala:69)
at scala.tools.nsc.MainGenericRunner.run$1(MainGenericRunner.scala:87)
at scala.tools.nsc.MainGenericRunner.process(MainGenericRunner.scala:98)
at scala.tools.nsc.MainGenericRunner$.main(MainGenericRunner.scala:103)
at scala.tools.nsc.MainGenericRunner.main(MainGenericRunner.scala)
What am I missing?
Update:
The problem appears to be that the Maven build does not see the source files. I tried to force a build error by writing nonsense into the source file, but mvn package still reports build success. Additionally, when I examine the jar file, there aren't any class files inside. How can I make Maven see the source files?
You have to add a Scala compiler plugin, such as the SBT compiler plugin.
Example pom:
<plugin>
<groupId>com.google.code.sbt-compiler-maven-plugin</groupId>
<artifactId>sbt-compiler-maven-plugin</artifactId>
<version>1.0.0-beta9</version>
<executions>
<execution>
<id>default-sbt-compile</id>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
</execution>
</executions>
</plugin>
Using Maven with Scala in specific use cases (profiles, complex deployments, deep hierarchies) is even better than pure sbt, but it is a bit tricky at the beginning.
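To confirm that a compiler plugin is really producing classes, it can help to list the .class entries in the packaged jar. The following is a small sketch using the standard java.util.jar API; the class name JarClassLister and its methods are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper: lists the classes packaged in a jar file. */
public class JarClassLister {

    /** Converts an entry name such as "com/example/Main.class" to "com.example.Main". */
    public static String toClassName(String entryName) {
        String withoutSuffix = entryName.substring(0, entryName.length() - ".class".length());
        return withoutSuffix.replace('/', '.');
    }

    /** Returns the class names of all .class entries in the given jar. */
    public static List<String> listClasses(String jarPath) throws IOException {
        List<String> classes = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".class")) {
                    classes.add(toClassName(name));
                }
            }
        }
        return classes;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the jar to inspect, e.g. the one under target/
        List<String> classes = listClasses(args[0]);
        if (classes.isEmpty()) {
            System.out.println("No .class files found in " + args[0]);
        }
        for (String className : classes) {
            System.out.println(className);
        }
    }
}
```

An empty result for the jar in the question would match the symptom described in the update: the Scala sources were never compiled, so there is no Main class for the runner to load.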

Scalatest Maven Plugin "no tests were executed"

I'm trying to use scalatest and spark-testing-base on Maven for integration testing Spark. The Spark job reads in a CSV file, validates the results, and inserts the data into a database. I'm trying to test the validation by putting in files of known format and seeing if and how they fail. This particular test just makes sure the validation passes. Unfortunately, scalatest can't find my tests.
Relevant pom plugins:
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<configuration>
<skipTests>true</skipTests>
</configuration>
</plugin>
<!-- enable scalatest -->
<plugin>
<groupId>org.scalatest</groupId>
<artifactId>scalatest-maven-plugin</artifactId>
<version>1.0</version>
<configuration>
<reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
<wildcardSuites>com.cainc.data.etl.schema.proficiency</wildcardSuites>
</configuration>
<executions>
<execution>
<id>test</id>
<goals>
<goal>test</goal>
</goals>
</execution>
</executions>
</plugin>
And here's the test class:
class ProficiencySchemaITest extends FlatSpec with Matchers with SharedSparkContext with BeforeAndAfter {
private var schemaStrategy: SchemaStrategy = _
private var dataReader: DataFrameReader = _
before {
val sqlContext = new SQLContext(sc)
import sqlContext._
import sqlContext.implicits._
val dataInReader = sqlContext.read.format("com.databricks.spark.csv")
.option("header", "true")
.option("nullValue", "")
schemaStrategy = SchemaStrategyChooser("dim_state_test_proficiency")
dataReader = schemaStrategy.applySchema(dataInReader)
}
"Proficiency Validation" should "pass with the CSV file proficiency-valid.csv" in {
val dataIn = dataReader.load("src/test/resources/proficiency-valid.csv")
val valid: Try[DataFrame] = Try(schemaStrategy.validateCsv(dataIn))
valid match {
case Success(v) => ()
case Failure(e) => fail("Validation failed on what should have been a clean file: ", e)
}
}
}
When I run mvn test, it can't find any tests and outputs this message:
[INFO] --- scalatest-maven-plugin:1.0:test (test) @ load-csv-into-db ---
Discovery starting.
Discovery completed in 54 milliseconds.
Run starting. Expected test count is: 0
DiscoverySuite:
Run completed in 133 milliseconds.
Total number of tests run: 0
Suites: completed 1, aborted 0
Tests: succeeded 0, failed 0, canceled 0, ignored 0, pending 0
No tests were executed.
UPDATE
By using:
<suites>com.cainc.data.etl.schema.proficiency.ProficiencySchemaITest</suites>
Instead of:
<wildcardSuites>com.cainc.data.etl.schema.proficiency</wildcardSuites>
I can get that one Test to run. Obviously, this is not ideal. It's possible wildcardSuites is broken; I'm going to open a ticket on GitHub and see what happens.
This is probably because there are space characters in the project path.
Remove the spaces from the project path and the tests should be discovered successfully.
Hope this helps.
Try excluding junit as a transitive dependency. Works for me. Example below, but note the Scala and Spark versions are specific to my environment.
<dependency>
<groupId>com.holdenkarau</groupId>
<artifactId>spark-testing-base_2.10</artifactId>
<version>1.5.0_0.6.0</version>
<scope>test</scope>
<exclusions>
<!-- junit is not compatible with scalatest -->
<exclusion>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
</exclusion>
</exclusions>
</dependency>
In my case, it was because I wasn't using the following plugin:
<plugin>
<groupId>org.scala-tools</groupId>
<artifactId>maven-scala-plugin</artifactId>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
</execution>
</executions>
<configuration>
<scalaVersion>${scala.version}</scalaVersion>
<args>
<arg>-target:jvm-1.8</arg>
</args>
</configuration>
</plugin>
The issue I had with tests not being discovered came down to the fact that tests are discovered from the class files, so to get them discovered I needed to add <goal>testCompile</goal> to the scala-maven-plugin goals.
In my case it was because the tests were nested inside the test directory and I was using the <memberOnlySuites> configuration. <memberOnlySuites> only looks for tests in the given package / directory. Instead, use <wildcardSuites>, which looks into a package / directory and all its subdirectories.
This happens quite often when you add more tests to your test suite and organise them in a more structured manner.
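The difference between the two settings can be illustrated with a small sketch. This is only a model of the behaviour described above, not ScalaTest's actual discovery code; the class SuiteMatch and its methods are hypothetical, and the nested package name in main is made up for the example.

```java
/** Hypothetical illustration of member-only versus wildcard package matching. */
public class SuiteMatch {

    /** Returns the package part of a fully qualified class name ("" for the default package). */
    public static String packageOf(String className) {
        int lastDot = className.lastIndexOf('.');
        return lastDot < 0 ? "" : className.substring(0, lastDot);
    }

    /** Member-only: the class must sit directly in the given package. */
    public static boolean memberOnly(String pkg, String className) {
        return packageOf(className).equals(pkg);
    }

    /** Wildcard: the class may sit in the given package or in any of its subpackages. */
    public static boolean wildcard(String pkg, String className) {
        String classPackage = packageOf(className);
        return classPackage.equals(pkg) || classPackage.startsWith(pkg + ".");
    }

    public static void main(String[] args) {
        String pkg = "com.cainc.data.etl.schema";
        // Assumed example: a suite nested one package below the configured one.
        String nested = "com.cainc.data.etl.schema.proficiency.ProficiencySchemaITest";
        System.out.println("memberOnly: " + memberOnly(pkg, nested));
        System.out.println("wildcard:   " + wildcard(pkg, nested));
    }
}
```

Under this model, a suite that has been moved into a subpackage stops matching a member-only setting but still matches a wildcard setting for the parent package.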
Cause: the Maven plugins do not compile your test code whenever you run mvn commands.
Workaround:
Run the Scala tests from your IDE, which compiles the test code and saves it in the target directory. The next time you run mvn test, or any Maven command that triggers Maven's test phase, it should run the Scala tests.

running a maven scala project

I'm starting to learn Scala and Mongo, and my IDE is IntelliJ IDEA. I created a Scala project using
mvn archetype:generate
and typed a simple hello world program in IDEA with some arithmetic operations, such as
println(5)
val i = 1+2
println(i)
Then I compiled it using
mvn compile
It said
build success
But how should I now execute my application and verify the output? I couldn't find a single article that explains how to get started with Scala, Maven, and IDEA, and I am entirely new to all of this. Any help would be useful.
maven-exec-plugin
Try with this code:
package com.example
object Main {
def main(args: Array[String]): Unit = {
println(5)
val i = 1 + 2
println(i)
}
}
Place it under /src/main/scala/com/example/Main.scala and run it using:
$ mvn package exec:java -Dexec.mainClass=com.example.Main
If you don't want to pass mainClass manually, you can do this in plugin configuration:
<plugins>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>exec-maven-plugin</artifactId>
<version>1.1</version>
<configuration>
<mainClass>com.example.Main</mainClass>
</configuration>
</plugin>
</plugins>
There are other possibilities, but this is the easiest one. Of course, in IntelliJ you should be able to run the program directly.
maven-jar-plugin
If you want to ship the application, use maven-jar-plugin to add Main-Class and Class-Path entries to the manifest:
Main-Class: com.example.Main
Class-Path: lib/scala-library-2.9.0-1.jar lib/slf4j-api-1.6.1.jar ...
The following configuration does that and also copies all the dependencies (including Scala runtime library) to target/lib.
<plugin>
<artifactId>maven-jar-plugin</artifactId>
<version>2.3.1</version>
<configuration>
<archive>
<manifest>
<mainClass>com.example.Main</mainClass>
<addClasspath>true</addClasspath>
<classpathLayoutType>custom</classpathLayoutType>
<customClasspathLayout>lib/$${artifact.artifactId}-$${artifact.version}$${dashClassifier?}.$${artifact.extension}
</customClasspathLayout>
</manifest>
</archive>
</configuration>
</plugin>
<plugin>
<artifactId>maven-dependency-plugin</artifactId>
<version>2.3</version>
<configuration>
<outputDirectory>${project.build.directory}/lib</outputDirectory>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>copy-dependencies</goal>
</goals>
</execution>
</executions>
</plugin>
Now you can simply run your application by (note the target/lib directory is required):
$ java -jar target/your_app-VERSION.jar
You can ship your application simply by copying your JAR file along with /lib subdirectory.
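To verify that the manifest entries ended up in the packaged jar, the Main-Class and Class-Path attributes can be read back with the standard java.util.jar API. The sketch below is one way to do that; the class name ManifestCheck and its methods are hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

/** Hypothetical helper: prints the Main-Class and Class-Path entries of a jar manifest. */
public class ManifestCheck {

    /** Returns the Main-Class value, or null if the manifest does not declare one. */
    public static String mainClass(Manifest manifest) {
        return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
    }

    /** Splits the space-separated Class-Path value into entries (empty list if absent). */
    public static List<String> classPath(Manifest manifest) {
        String value = manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
        if (value == null || value.trim().isEmpty()) {
            return new ArrayList<>();
        }
        return Arrays.asList(value.trim().split("\\s+"));
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the jar to inspect, e.g. target/your_app-VERSION.jar
        try (JarFile jar = new JarFile(args[0])) {
            Manifest manifest = jar.getManifest(); // null when the jar has no manifest
            if (manifest == null) {
                System.out.println("No manifest found");
                return;
            }
            System.out.println("Main-Class: " + mainClass(manifest));
            for (String entry : classPath(manifest)) {
                System.out.println("Class-Path entry: " + entry);
            }
        }
    }
}
```

Class-Path entries are resolved relative to the jar's location, which is why the lib directory has to travel with the JAR file.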
Also see Exec Maven Plugin and Playing with Scala and Maven.