Spark Scala : Failed to find data source: kafka - scala

I'm trying to use an example from sparkByExamples.com .
But for some reason, spark program doesn't read data from kafka topic.
Code is here
Error msg below:
Exception in thread "main" org.apache.spark.sql.AnalysisException: Failed to find data source: kafka. Please deploy the application as per the deployment section of "Structured Streaming + Kafka Integration Guide".;
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:666)
at org.apache.spark.sql.streaming.DataStreamReader.load(DataStreamReader.scala:194)
at com.sparkbyexamples.spark.streaming.kafka.json.SparkStreamingConsumerKafkaJson$.main(SparkStreaming.scala:22)
at com.sparkbyexamples.spark.streaming.kafka.json.SparkStreamingConsumerKafkaJson.main(SparkStreaming.scala)
Process finished with exit code 1
Basically, I have a kafka topic1 < data.json and reading from beginning.
I recently started spark + scala + kafka activities and would appreciate your help.
pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<groupId>com.sparkbyexamples</groupId>
<modelVersion>4.0.0</modelVersion>
<artifactId>spark-scala-examples</artifactId>
<version>1.0-SNAPSHOT</version>
<inceptionYear>2008</inceptionYear>
<packaging>jar</packaging>
<properties>
<scala.version>2.12.12</scala.version>
<spark.version>3.0.0</spark.version>
</properties>
<repositories>
<repository>
<id>scala-tools.org</id>
<name>Scala-Tools Maven2 Repository</name>
<url>http://scala-tools.org/repo-releases</url>
</repository>
</repositories>
<pluginRepositories>
<pluginRepository>
<id>scala-tools.org</id>
<name>Scala-Tools Maven2 Repository</name>
<url>http://scala-tools.org/repo-releases</url>
</pluginRepository>
</pluginRepositories>
<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.specs</groupId>
<artifactId>specs</artifactId>
<version>1.2.5</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.thoughtworks.xstream</groupId>
<artifactId>xstream</artifactId>
<version>1.4.11</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>${spark.version}</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
<version>${spark.version}</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>com.databricks</groupId>
<artifactId>spark-xml_2.11</artifactId>
<version>0.4.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-avro_2.12</artifactId>
<version>${spark.version}</version>
</dependency>
</dependencies>
<build>
<sourceDirectory>src/main/scala</sourceDirectory>
<resources><resource><directory>src/main/resources</directory></resource></resources>
<plugins>
<plugin>
<groupId>org.scala-tools</groupId>
<artifactId>maven-scala-plugin</artifactId>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
</execution>
</executions>
<configuration>
<scalaVersion>${scala.version}</scalaVersion>
<args>
<arg>-target:jvm-1.5</arg>
</args>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-eclipse-plugin</artifactId>
<configuration>
<downloadSources>true</downloadSources>
<buildcommands>
<buildcommand>ch.epfl.lamp.sdt.core.scalabuilder</buildcommand>
</buildcommands>
<additionalProjectnatures>
<projectnature>ch.epfl.lamp.sdt.core.scalanature</projectnature>
</additionalProjectnatures>
<classpathContainers>
<classpathContainer>org.eclipse.jdt.launching.JRE_CONTAINER</classpathContainer>
<classpathContainer>ch.epfl.lamp.sdt.launching.SCALA_CONTAINER</classpathContainer>
</classpathContainers>
</configuration>
</plugin>
</plugins>
</build>
<reporting>
<plugins>
<plugin>
<groupId>org.scala-tools</groupId>
<artifactId>maven-scala-plugin</artifactId>
<configuration>
<scalaVersion>${scala.version}</scalaVersion>
</configuration>
</plugin>
</plugins>
</reporting>
</project>

As the error indicates, you are missing the spark-sql-kafka-0-10 dependency
Refer docs - https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html
And the correct pom file - https://github.com/sparkbyexamples/spark-examples/blob/master/spark-streaming/pom.xml#L64
Unrelated, your spark-xml Scala version needs to be 2.12

Related

How to generate source code from avro schema in maven(eclipse)?

I know this is not the first time this question has been asked. As i have gone through several links, but unable to understand why avro schema is not generating the source code when no error is being thrown in maven(eclipse). I have tried deleting 'pluginManagement' from pom.xml file but it throws error.
Tried deleting .m2 repo,mvn clean, mvn install, nothing seems to work. Not sure what is the issue.
Also, could you please tell me how to add 'fieldVisibility=private' argument while generating the source code from the command line.
Please guide! Thank you!
This is the pom.xml file
http://maven.apache.org/xsd/maven-4.0.0.xsd">
4.0.0
<groupId>kafka</groupId>
<artifactId>ProducerConsumer</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>ProducerConsumer</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>1.0.0</version>
</dependency>
<dependency>
<groupId>io.confluent</groupId>
<artifactId>kafka-avro-serializer</artifactId>
<version>4.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
<version>1.8.2</version>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro-tools</artifactId>
<version>1.8.2</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.19</version>
</dependency>
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-classic</artifactId>
<version>1.1.6</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-core</artifactId>
<version>2.11.2</version>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-api</artifactId>
<version>2.11.2</version>
</dependency>
</dependencies>
<repositories>
<repository>
<id>confluent</id>
<url>http://packages.confluent.io/maven/</url>
</repository>
</repositories>
<build>
<pluginManagement>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.2</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
<version>1.9.2</version>
<executions>
<execution>
<phase>generate-sources</phase>
<goals>
<goal>schema</goal>
</goals>
<configuration>
<sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
<outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
<fieldVisibility>PRIVATE</fieldVisibility>
<includes>
<include>**/*.avro</include>
</includes>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</pluginManagement>
</build>
I would suggest to use the correct setup like this:
There are two things: you are using not the avro-maven-plugin and also you have configured it in the pluginManagement instead of in using correctly located in <build><plugins>...</plugins></build>.
<project..>
<build>
<plugins>
<plugin>
<groupId>org.apache.avro</groupId>
<artifactId>avro-maven-plugin</artifactId>
<version>1.9.2</version>
<executions>
<execution>
<phase>generate-sources</phase>
<goals>
<goal>schema</goal>
</goals>
<configuration>
<sourceDirectory>${project.basedir}/src/main/avro/</sourceDirectory>
<outputDirectory>${project.basedir}/src/main/java/</outputDirectory>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>

spark kafka streaming code not working on livy

I've been working in some scala codes to run spark streaming for kafka data acquisition. Once updated all dependencies and adding new ones to the compiled jar, I am able to to run it with spark-submit with the code spark-submit /codigos/codigos/adquisicion.jar. But when running it via livy2 I'm getting an exception when creating stream in:
val stream = KafkaUtils.createDirectStream[String, String](
ssc,
PreferConsistent,
Subscribe[String, String](topics, kafkaParams)
)
Code used to run it via livy2 is:
uri<-"http://192.168.0.28:8999/batches"
jsonBody2 = paste0('{ "file": "/user/bigdata/adquisicion.jar",
"numExecutors":1,
"executorCores":3,
"name": "Adquisición Datos",
"args": [""],
"className": "codigos.adquisicionDatos" }')
request <- httr::POST(url = uri, body = jsonBody2, httr::add_headers(.headers = c('Content-Type' = 'application/json', 'X-Requested-by'='user')))
And received exception is the following one:
18/11/06 08:01:00 ERROR ApplicationMaster: User class threw exception: java.lang.NoClassDefFoundError: org/apache/kafka/clients/consumer/Consumer
java.lang.NoClassDefFoundError: org/apache/kafka/clients/consumer/Consumer
at org.apache.spark.streaming.kafka010.ConsumerStrategies$.Subscribe(ConsumerStrategy.scala:256)
at codigos.adquisicionDatos$.main(adquisicionDatos.scala:251)
at codigos.adquisicionDatos.main(adquisicionDatos.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:646)
Caused by: java.lang.ClassNotFoundException: org.apache.kafka.clients.consumer.Consumer
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 8 more
EDIT:
POM file:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>codigos</groupId>
<artifactId>adquisicionDatos</artifactId>
<version>1.2</version>
<packaging>jar</packaging>
<name>Adquisicion Datos Kafka</name>
<properties>
<app.main.class>adquisicionDatos.SparkKMeansApp</app.main.class>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<spark.core.version>2.3.2</spark.core.version>
<!--hbase.version>0.98.7-hadoop1</hbase.version-->
<hbase.version>1.0.1</hbase.version>
<slf4j.version>1.7.5</slf4j.version>
<guava.version>13.0.1</guava.version>
<gson.version>2.2.4</gson.version>
<spark.version>2.3.2</spark.version>
</properties>
<repositories>
<repository>
<id>scalanlp.org</id>
<name>ScalaNLP Maven2 Repository</name>
<url>http://repo.scalanlp.org/repo</url>
</repository>
<repository>
<id>apache release</id>
<url>https://repository.apache.org/content/repositories/releases/</url>
</repository>
<repository>
<id>scala-tools.org</id>
<name>Scala-tools Maven2 Repository</name>
<url>http://scala-tools.org/repo-releases</url>
</repository>
<repository>
<id>cloudera.repo</id>
<url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
<name>Cloudera Repositories</name>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
</repositories>
<pluginRepositories>
<pluginRepository>
<id>scala-tools.org</id>
<name>Scala-tools Maven2 Repository</name>
<url>http://scala-tools.org/repo-releases</url>
</pluginRepository>
</pluginRepositories>
<dependencies>
<!-- https://mvnrepository.com/artifact/org.sameersingh.scalaplot/scalaplot -->
<dependency>
<groupId>org.sameersingh.scalaplot</groupId>
<artifactId>scalaplot</artifactId>
<version>0.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/javax.mail/javax.mail-api -->
<dependency>
<groupId>javax.mail</groupId>
<artifactId>javax.mail-api</artifactId>
<version>1.5.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>${spark.version}</version>
</dependency>
<!--
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
<version>${spark.version}</version>
</dependency>
-->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
<version>${spark.version}</version>
<scope>compile</scope>
<exclusions>
<exclusion>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka_2.11</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka_2.11</artifactId>
<version>0.10.2.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.11 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>${spark.version}</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector_2.11 -->
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.11</artifactId>
<version>${spark.version}</version>
<!--<version>2.0.5</version>-->
</dependency>
<!-- https://mvnrepository.com/artifact/org.postgresql/postgresql -->
<dependency>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>42.1.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.kafka/kafka-clients -->
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>0.10.2.1</version>
</dependency>
</dependencies>
<build>
<pluginManagement>
<plugins>
<plugin>
<groupId>org.scala-tools</groupId>
<artifactId>maven-scala-plugin</artifactId>
<version>2.15.2</version>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<source>1.6</source>
<target>1.6</target>
</configuration>
</plugin>
</plugins>
</pluginManagement>
<plugins>
<plugin>
<groupId>org.scala-tools</groupId>
<artifactId>maven-scala-plugin</artifactId>
<executions>
<execution>
<id>scala-compile-first</id>
<phase>process-resources</phase>
<goals>
<goal>add-source</goal>
<goal>compile</goal>
</goals>
</execution>
<execution>
<id>scala-test-compile</id>
<phase>process-test-resources</phase>
<goals>
<goal>testCompile</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.3</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
<configuration>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
<finalName>${project.artifactId}-${project.version}-jar-with-dependencies</finalName>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<version>2.4</version>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
<archive>
<manifest>
<mainClass>codigos.adquisicionDatos</mainClass>
</manifest>
</archive>
</configuration>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
I compile it in server and then I copy it to docker (via shared volume cp ~/adquisicionDatos/target/adquisicionDatos1.2-jar-with-dependencies.jar /mnt/codigos/adquisicion.jar) and upload it to HDFS using incrond script to automatically upload files when updated (working tested).
If running with spark-submit under root user everything is working fine but using livy or running under livy user is not working and throwing above's error.

Error: Could not find or load main class scala Ecllipse IDE

I have clone the project through my GitHub repository.
Whenever trying to run, it is giving the error:
Could not find main class.
I have my pom.xml as:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<artifactId>Scalable ML</artifactId>
<groupId>org.big-mint</groupId>
<version>0.0.1-SNAPSHOT</version>
<name>Scalable ML</name>
<description>
Scalable Machine learning algorithms to process big data.
</description>
<inceptionYear>2017</inceptionYear>
<licenses>
<license>
<name>GNU GENERAL PUBLIC LICENSE, Version 3</name>
<url>http://www.gnu.org/licenses/gpl-3.0.txt</url>
<distribution>repo</distribution>
</license>
</licenses>
<properties>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<scala.version>2.11.8</scala.version>
<scala.binary.version>2.11</scala.binary.version>
<spark.version>2.1.1</spark.version>
<sansa.version>0.2.0</sansa.version>
<jena.version>3.1.1</jena.version>
</properties>
<dependencies>
<!-- Scala -->
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<!-- Apache Spark Core -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.binary.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<!-- Apache Spark SQL -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_${scala.binary.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<!-- Apache JENA 3.x -->
<dependency>
<groupId>org.apache.jena</groupId>
<artifactId>apache-jena-libs</artifactId>
<type>pom</type>
<version>${jena.version}</version>
</dependency>
<!-- SANSA RDF -->
<dependency>
<groupId>net.sansa-stack</groupId>
<artifactId>sansa-rdf-spark-bundle_${scala.binary.version}</artifactId>
<version>${sansa.version}</version>
</dependency>
<!-- SANSA OWL -->
<dependency>
<groupId>net.sansa-stack</groupId>
<artifactId>sansa-owl-spark_${scala.binary.version}</artifactId>
<version>${sansa.version}</version>
</dependency>
<!-- SANSA Inference -->
<!-- <dependency> <groupId>${project.groupId}</groupId> <artifactId>sansa-inference-parent_${scala.binary.version}</artifactId>
<version>${sansa.version}</version> </dependency> -->
<dependency>
<groupId>net.sansa-stack</groupId>
<artifactId>sansa-inference-spark_${scala.binary.version}</artifactId>
<version>${sansa.version}</version>
</dependency>
<!-- SANSA Querying -->
<dependency>
<groupId>net.sansa-stack</groupId>
<artifactId>sansa-query-spark-bundle_${scala.binary.version}</artifactId>
<version>${sansa.version}</version>
</dependency>
<!-- SANSA ML -->
<dependency>
<groupId>net.sansa-stack</groupId>
<artifactId>sansa-ml-spark_${scala.binary.version}</artifactId>
<version>${sansa.version}</version>
</dependency>
<!-- Test -->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.8.1</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_${scala.binary.version}</artifactId>
<version>2.2.6</version>
<scope>test</scope>
</dependency>
<!-- Logging -->
<dependency>
<groupId>com.typesafe.scala-logging</groupId>
<artifactId>scala-logging-slf4j_${scala.binary.version}</artifactId>
<version>2.1.2</version>
</dependency>
</dependencies>
<build>
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/scala</testSourceDirectory>
<plugins>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.2.1</version>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
<configuration>
<args>
<!--<arg>-make:transitive</arg> -->
<arg>-dependencyfile</arg>
<arg>${project.build.directory}/.scala_dependencies</arg>
</args>
</configuration>
</execution>
</executions>
<configuration>
<scalaVersion>${scala.version}</scalaVersion>
<recompileMode>incremental</recompileMode>
</configuration>
</plugin>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>2.3.2</version>
<configuration>
<source>${maven.compiler.source}</source>
<target>${maven.compiler.target}</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.4.3</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<artifactSet>
<excludes>
<exclude>asm:asm</exclude>
<exclude>com.clearspring.analytics:stream</exclude>
<exclude>com.esotericsoftware:kryo*</exclude>
<exclude>com.esotericsoftware:minlog</exclude>
<exclude>com.fasterxml.jackson.core:jackson*</exclude>
<exclude>com.fasterxml.jackson.module:jackson-module*</exclude>
<exclude>com.google.code.findbugs:jsr305</exclude>
<exclude>com.google.code.gson:gson</exclude>
<exclude>com.google.inject.extensions:guice-servlet</exclude>
<exclude>com.google.guava:guava</exclude>
<exclude>com.google.protobuf:protobuf-java</exclude>
<exclude>com.jcraft:jsch</exclude>
<exclude>com.ning:compress-lzf</exclude>
<exclude>com.sun.jersey:jersey-*</exclude>
<exclude>com.sun.jersey.contribs:jersey-guice</exclude>
<exclude>com.sun.xml.bind:jaxb-impl</exclude>
<exclude>com.thoughtworks.paranamer:paranamer</exclude>
<exclude>com.twitter:chill*</exclude>
<exclude>com.univocity:univocity-parsers</exclude>
<exclude>commons-beanutils:commons-beanutils*</exclude>
<exclude>commons-cli:commons-cli</exclude>
<exclude>commons-codec:commons-codec</exclude>
<exclude>commons-collections:commons-collections</exclude>
<exclude>commons-configuration:commons-configuration</exclude>
<exclude>commons-digester:commons-digester</exclude>
<exclude>commons-httpclient:commons-httpclient</exclude>
<exclude>commons-io:commons-io</exclude>
<exclude>commons-lang:commons-lang</exclude>
<exclude>commons-logging:commons-logging</exclude>
<exclude>commons-net:commons-net</exclude>
<exclude>io.dropwizard.metrics:metrics*</exclude>
<exckude>io.netty:netty*</exckude>
<exclude>javax.activation:activation</exclude>
<exclude>javax.annotation:javax.annotation-api</exclude>
<exclude>javax.servlet:javax.servlet-api</exclude>
<exclude>javax.servlet.jsp:jsp-api</exclude>
<exclude>javax.servlet:servlet-api</exclude>
<exclude>javax.validation:validation-api</exclude>
<exclude>javax.ws.rs:javax.ws.rs-api</exclude>
<exclude>javax.xml.bind:jaxb-api</exclude>
<exclude>javax.xml.stream:stax-api</exclude>
<exclude>jdk.tools:jdk.tools</exclude>
<exclude>net.java.dev.jets3t:jets3t</exclude>
<exclude>net.jpountz.lz4:lz4</exclude>
<exclude>net.razorvine:pyrolite</exclude>
<exclude>net.sf.py4j:py4j</exclude>
<exclude>org.antlr:antlr4-runtime</exclude>
<exclude>org.apache.avro:avro*</exclude>
<exclude>org.apache.commons:commons-lang3</exclude>
<exclude>org.apache.commons:commons-math3</exclude>
<exclude>org.apache.commons:commons-compress</exclude>
<exclude>org.apache.curator:curator*</exclude>
<exclude>org.apache.directory.api:*</exclude>
<exclude>org.apache.directory.server:*</exclude>
<exclude>org.apache.hadoop:*</exclude>
<exclude>org.apache.htrace:htrace-core</exclude>
<exclude>org.apache.httpcomponents:*</exclude>
<exclude>org.apache.ivy:ivy</exclude>
<exclude>org.apache.mesos:mesos</exclude>
<exclude>org.apache.parquet:parquet*</exclude>
<exclude>org.apache.spark:*</exclude>
<exclude>org.apache.xbean:xbean-asm5-shaded</exclude>
<exclude>org.apache.zookeeper:zookeeper</exclude>
<exclude>org.codehaus.jackson:jackson-*</exclude>
<exclude>org.codehaus.janino:*</exclude>
<exclude>org.codehaus.jettison:jettison</exclude>
<exclude>org.fusesource.leveldbjni:leveldbjni-all</exclude>
<exckude>org.glassfish.hk2*</exckude>
<exclude>org.glassfish.jersey*</exclude>
<exclude>org.javassist:javassist</exclude>
<exclude>org.json4s:json4s*</exclude>
<exclude>org.mortbay.jetty:jetty*</exclude>
<exclude>org.objenesis:objenesis</exclude>
<exclude>org.roaringbitmap:RoaringBitmap</exclude>
<exclude>org.scala-lang:*</exclude>
<exclude>org.slf4j:jul-to-slf4j</exclude>
<exclude>org.slf4j:jcl-over-slf4j</exclude>
<exclude>org.spark-project.spark:unused</exclude>
<exclude>org.xerial.snappy:snappy-java</exclude>
<exclude>oro:oro</exclude>
<exclude>xmlenc:xmlenc</exclude>
</excludes>
</artifactSet>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
<createDependencyReducedPom>false</createDependencyReducedPom>
</configuration>
</execution>
</executions>
</plugin>
<!-- disable surefire -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.7</version>
<configuration>
<useFile>false</useFile>
<disableXmlReport>true</disableXmlReport>
<!-- If you have classpath issue like NoDefClassError,... -->
<!-- useManifestOnlyJar>false</useManifestOnlyJar -->
<includes>
<include>**/*Test.*</include>
<include>**/*Suite.*</include>
</includes>
<skipTests>true</skipTests>
</configuration>
</plugin>
<!-- enable scalatest -->
<plugin>
<groupId>org.scalatest</groupId>
<artifactId>scalatest-maven-plugin</artifactId>
<version>1.0</version>
<configuration>
<reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
<junitxml>.</junitxml>
<filereports>WDF TestSuite.txt</filereports>
</configuration>
<executions>
<execution>
<id>test</id>
<goals>
<goal>test</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
<repositories>
<repository>
<id>oss-sonatype</id>
<name>oss-sonatype</name>
<url>https://oss.sonatype.org/content/repositories/snapshots/</url>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
<repository>
<id>apache-snapshot</id>
<name>Apache repository (snapshots)</name>
<url>https://repository.apache.org/content/repositories/snapshots/</url>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
<repository>
<id>maven.aksw.internal</id>
<name>AKSW Release Repository</name>
<url>http://maven.aksw.org/archiva/repository/internal</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<repository>
<id>maven.aksw.snapshots</id>
<name>AKSW Snapshot Repository</name>
<url>http://maven.aksw.org/archiva/repository/snapshots</url>
<releases>
<enabled>false</enabled>
</releases>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
</repositories>
Have several maven and other dependencies.
I Don't know why it is not loading the main class.
I have tried many solutions as mentioned by others i.e. in run configuration giving complete class name and etc. But nothing worked.
try to add this property :
<properties>
<jobMainClass>package.mainclass</jobMainClass>
</properties>
If it doesn't work, give me the command that you use to run your jarfile.

java.lang.NoClassDefFoundError when running Scala-Maven project

Trying to run a new project for the first time. mvn install goes smoothly, but when I try to run it, I get:
Exception in thread "main" java.lang.NoClassDefFoundError: scala/Tuple2
Now, something is terribly wrong if it can't even find the Scala basics. To be honest, it doesn't recognize anything outside of the code itself.
I''m attaching my POM file, hope someone can help..
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.foo</groupId>
<artifactId>bar</artifactId>
<version>1.0.0</version>
<packaging>jar</packaging>
<name>Foo Bar</name>
<properties>
<app.main.class>com.foo.main</app.main.class>
<spark.version>1.6.1</spark.version>
<scala.version>2.10.4</scala.version>
<scala.binary.version>2.10</scala.binary.version>
</properties>
<pluginRepositories>
<pluginRepository>
<id>scala-tools.org</id>
<name>Scala-tools Maven2 Repository</name>
<url>http://scala-tools.org/repo-releases</url>
</pluginRepository>
</pluginRepositories>
<repositories>
<repository>
<id>maven-repo</id>
<name>Maven Repository</name>
<url>http://repo1.maven.apache.org/maven2</url>
<releases>
<enabled>true</enabled>
</releases>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<repository>
<id>apache-repo</id>
<name>Apache release repo</name>
<url>https://github.com/adatao/mvnrepos/tree/master/releases/</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_${scala.binary.version}</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_${scala.binary.version}</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
</dependencies>
<build>
<extensions>
<extension>
<groupId>org.springframework.build</groupId>
<artifactId>aws-maven</artifactId>
<version>5.0.0.RELEASE</version>
</extension>
</extensions>
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/scala</testSourceDirectory>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>2.4</version>
<configuration>
<archive>
<manifest>
<addClasspath>true</addClasspath>
<classpathPrefix>lib/</classpathPrefix>
<mainClass>com.foo.main</mainClass>
</manifest>
</archive>
</configuration>
</plugin>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.2.0</version>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
</execution>
</executions>
<configuration>
<jvmArgs>
<jvmArg>-Xms256m</jvmArg>
<jvmArg>-Xmx2048m</jvmArg>
</jvmArgs>
</configuration>
</plugin>
</plugins>
</build>
</project>

Scala SuperSafe Community Plugin artifact, sbt 0.13, scala 2.11.8 not resolving

I want to use ScalaTest in my Scala project more specially the SuperSafe Community Edition. I followed the installation instructions and I am using sbt 0.13 and scala 2.11.8.
I get the following error:
[error] (*:update) sbt.ResolveException: unresolved dependency: com.artima.supersafe#sbtplugin;1.1.0-RC6: not found
I have tried to use the other artifacts related to the scala 2.11.8 but with no luck.
Can I use the SuperSafe Community Edition with sbt 0.3 and scala 2.11.8?
I was facing the same issue while I was getting started with scalatest. I used version 1.0.6-M2 while resolved the error for me. The below 2 lines in plugins.sbt did the trick
resolvers += "Artima Maven Repository" at "http://repo.artima.com/releases"
addSbtPlugin("com.artima.supersafe" % "sbtplugin" % "1.0.6-M2")
Additionally, you should remove scoverage-plugin and scoverage-runtime artifacts from your dependencies. Scoverage plugin will add them, when needed.
I switched over to a Maven POM based project for internal reasons. Here are the dependencies I used:
<?xml version='1.0' encoding='UTF-8'?>
<project xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/xsd/maven-4.0.0.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://maven.apache.org/POM/4.0.0">
<modelVersion>4.0.0</modelVersion>
<groupId>default</groupId>
<artifactId>my-project</artifactId>
<packaging>jar</packaging>
<description>My Project</description>
<version>1.0</version>
<name>data-transformer</name>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<scala.binary.version>2.11</scala.binary.version>
<scala.version>2.11.8</scala.version>
<spark.version>1.6.1</spark.version>
<junit.version>4.12</junit.version>
<compiler.plugin.version>3.5</compiler.plugin.version>
<project-info-reports.plugin.version>2.9</project-info-reports.plugin.version>
<site.plugin.version>3.4</site.plugin.version>
<surefire.plugin.version>2.19.1</surefire.plugin.version>
<scala.plugin.version>3.2.2</scala.plugin.version>
<scoverage.plugin.version>1.1.0</scoverage.plugin.version>
</properties>
<build>
<sourceDirectory>${basedir}/src/main/scala</sourceDirectory>
<testSourceDirectory>${basedir}/src/test/scala</testSourceDirectory>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>${compiler.plugin.version}</version>
<configuration>
<skipMain>true</skipMain> <!-- skip compile -->
<skip>true</skip> <!-- skip testCompile -->
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>${surefire.plugin.version}</version>
</plugin>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>${scala.plugin.version}</version>
<configuration>
<scalaCompatVersion>${scala.binary.version}</scalaCompatVersion>
<scalaVersion>${scala.version}</scalaVersion>
</configuration>
<executions>
<execution>
<id>default-sbt-compile</id>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.scoverage</groupId>
<artifactId>scoverage-maven-plugin</artifactId>
<version>${scoverage.plugin.version}</version>
<configuration>
<scalaVersion>2.11.8</scalaVersion>
<highlighting>true</highlighting>
</configuration>
</plugin>
</plugins>
<pluginManagement>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-site-plugin</artifactId>
<version>${site.plugin.version}</version>
</plugin>
</plugins>
</pluginManagement>
</build>
<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>2.11.8</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>1.6.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_2.11</artifactId>
<version>1.6.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>1.6.1</version>
</dependency>
<dependency>
<groupId>com.databricks</groupId>
<artifactId>spark-csv_2.11</artifactId>
<version>1.4.0</version>
</dependency>
<!-- Test Dependencies -->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>${junit.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scalacheck</groupId>
<artifactId>scalacheck_2.11</artifactId>
<version>1.12.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_2.11</artifactId>
<version>2.2.6</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.holdenkarau</groupId>
<artifactId>spark-testing-base_2.11</artifactId>
<version>1.6.0_0.3.1</version>
<scope>test</scope>
</dependency>
</dependencies>
<repositories>
<repository>
<id>ArtimaMavenRepository</id>
<name>Artima Maven Repository</name>
<url>http://repo.artima.com/releases/</url>
<layout>default</layout>
</repository>
</repositories>
<reporting>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-project-info-reports-plugin</artifactId>
<reportSets>
<reportSet>
<reports>
<report>index</report>
</reports>
</reportSet>
</reportSets>
</plugin>
<plugin>
<groupId>org.scoverage</groupId>
<artifactId>scoverage-maven-plugin</artifactId>
<version>${scoverage.plugin.version}</version>
<reportSets>
<reportSet>
<reports>
<report>report</report> <!-- select only one report from: report, integration-report and report-only reporters -->
</reports>
</reportSet>
</reportSets>
</plugin>
</plugins>
</reporting>