Spark Maven fail to find ml classes - scala

I create code spark with SparkSession but i can't run this code.
I think I am missing some dependencies in my pom.xml or something else-
import org.apache.spark.sql.SparkSession
val spark = SparkSession
.builder
.appName("loader")
.master("local")
.getOrCreate()
pom.xml for scala 2.11
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>info.daviot</groupId>
<version>0.1-SNAPSHOT</version>
<artifactId>demo</artifactId>
<packaging>jar</packaging>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<scala.version>2.11.5</scala.version>
<java.version>1.7</java.version>
</properties>
<dependencies>
<dependency>
<artifactId>scala-library</artifactId>
<groupId>org.scala-lang</groupId>
<version>${scala.version}</version>
</dependency>
<!-- optional dependencies -->
<dependency>
<groupId>com.softwaremill.macwire</groupId>
<artifactId>macros_2.11</artifactId>
<version>0.8.0</version>
</dependency>
<dependency>
<groupId>com.typesafe.akka</groupId>
<artifactId>akka-actor_2.11</artifactId>
<version>2.3.9</version>
</dependency>
<dependency>
<groupId>com.github.nscala-time</groupId>
<artifactId>nscala-time_2.11</artifactId>
<version>1.4.0</version>
</dependency>
<dependency>
<groupId>com.propensive</groupId>
<artifactId>rapture-json-jawn_2.11</artifactId>
<version>1.1.0</version>
</dependency>
<!-- logs -->
<dependency>
<groupId>org.clapper</groupId>
<artifactId>grizzled-slf4j_2.11</artifactId>
<version>1.0.2</version>
</dependency>
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-classic</artifactId>
<version>1.1.2</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.6</version>
</dependency>
<!-- tests -->
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_2.11</artifactId>
<version>2.2.2</version>
<scope>test</scope>
</dependency>
<dependency>
<artifactId>junit</artifactId>
<groupId>junit</groupId>
<version>4.10</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.powermock</groupId>
<artifactId>powermock-api-mockito</artifactId>
<version>1.5.5</version>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/scala</testSourceDirectory>
<plugins>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.1.6</version>
<executions>
<execution>
<phase>compile</phase>
<goals>
<goal>compile</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>2.0.2</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
</configuration>
<executions>
<execution>
<phase>compile</phase>
<goals>
<goal>compile</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
I got the same error when I tried to add :
<repositories>
<repository>
<id>cloudera</id>
<url>https://repository.cloudera.com/artifactory/clouderarepos/</url>
</repository>
</repositories>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.0.0-cloudera1-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.0.0-cloudera1-SNAPSHOT</version>
</dependency>
import works :
import org.apache.spark.ml.feature.Tokenizer
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.Word2VecModel
import dosn't work :
import org.apache.spark.ml.feature.CountVectorizerModel
import org.apache.spark.ml.feature.StopWordsRemover
with error Cannot resolve symbol

you would need these dependencies
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.11</artifactId>
<version>2.2.0</version>
</dependency>
for more information checkout https://mvnrepository.com/artifact/org.apache.spark

Add below dependency-
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.11</artifactId>
<version>2.0.0</version>
<scope>provided</scope>
</dependency>

Related

Conflict versions when running spark application on Databricks runtime 10.3

I'm running scala/spark application on Databricks cluster (it tries to read data from an Event Hub topic using EventHub schemaregistry) and getting the following error:
22/02/17 07:36:14 ERROR JacksonVersion: Version '1.0.19-SNAPSHOT' of package 'jackson-dataformat-xml' is not supported (older than earliest supported version - `2.10.0`), please upgrade.
22/02/17 07:36:14 ERROR JacksonVersion: Version '1.0.19-SNAPSHOT' of package 'jackson-datatype-jsr310' is not supported (older than earliest supported version - `2.10.0`), please upgrade.
22/02/17 07:36:14 INFO JacksonVersion: Package versions: jackson-annotations=2.12.3, jackson-core=2.12.3, jackson-databind=2.12.3, jackson-dataformat-xml=1.0.19-SNAPSHOT, jackson-datatype-jsr310=1.0.19-SNAPSHOT, azure-core=1.8.1, Troubleshooting version conflicts: https://aka.ms/azsdk/java/dependency/troubleshoot
22/02/17 07:36:14 ERROR ScalaDriverLocal: User Code Stack Trace:
java.lang.NoSuchMethodError: reactor.netty.http.client.HttpClient.resolver(Lio/netty/resolver/AddressResolverGroup;)Lreactor/netty/transport/ClientTransport;
Looks that maven is trying to use main artificat version (1.0.19-SNAPSHOT) to 'jackson-dataformat-xml' and 'jackson-datatype-jsr310':
22/02/17 07:36:14 ERROR JacksonVersion: Version '1.0.19-SNAPSHOT' of package 'jackson-dataformat-xml' is not supported (older than earliest supported version - `2.10.0`), please upgrade.
22/02/17 07:36:14 ERROR JacksonVersion: Version '1.0.19-SNAPSHOT' of package 'jackson-datatype-jsr310' is not supported (older than earliest supported version - `2.10.0`), please upgrade.
The project uses Databricks 10.3 runtime and this pom:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>group</groupId>
<artifactId>parent</artifactId>
<version>3.1-SNAPSHOT</version>
</parent>
<artifactId>ingestion</artifactId>
<version>${revision}</version>
<packaging>jar</packaging>
<name>ingestion</name>
<organization>
<name>company</name>
</organization>
<properties>
<revision>1.0.19-SNAPSHOT</revision>
<encoding>UTF-8</encoding>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-core</artifactId>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.compact.version}</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_${scala.compact.version}</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_${scala.compact.version}</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-avro_${scala.compact.version}</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>io.delta</groupId>
<artifactId>delta-core_${scala.compact.version}</artifactId>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_${scala.compact.version}</artifactId>
<version>3.2.11</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-scala_${scala.compact.version}</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.yaml</groupId>
<artifactId>snakeyaml</artifactId>
</dependency>
<dependency>
<groupId>io.spray</groupId>
<artifactId>spray-json_${scala.compact.version}</artifactId>
</dependency>
<dependency>
<groupId>com.databricks</groupId>
<artifactId>dbutils-api_${scala.compact.version}</artifactId>
</dependency>
<dependency>
<groupId>com.github.mrpowers</groupId>
<artifactId>spark-daria_${scala.compact.version}</artifactId>
</dependency>
<dependency>
<groupId>com.github.mrpowers</groupId>
<artifactId>spark-fast-tests_${scala.compact.version}</artifactId>
</dependency>
<dependency>
<groupId>joda-time</groupId>
<artifactId>joda-time</artifactId>
</dependency>
<dependency>
<groupId>io.circe</groupId>
<artifactId>circe-yaml_${scala.compact.version}</artifactId>
</dependency>
<dependency>
<groupId>io.circe</groupId>
<artifactId>circe-generic_${scala.compact.version}</artifactId>
</dependency>
<dependency>
<groupId>io.jvm.uuid</groupId>
<artifactId>scala-uuid_2.12</artifactId>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
</dependency>
<dependency>
<groupId>com.typesafe</groupId>
<artifactId>config</artifactId>
</dependency>
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-identity</artifactId>
</dependency>
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-core</artifactId>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.dataformat</groupId>
<artifactId>jackson-dataformat-xml</artifactId>
<version>2.12.3</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.datatype</groupId>
<artifactId>jackson-datatype-jsr310</artifactId>
<version>2.12.3</version>
</dependency>
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-data-schemaregistry</artifactId>
</dependency>
<dependency>
<groupId>commons-cli</groupId>
<artifactId>commons-cli</artifactId>
</dependency>
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-core-http-netty</artifactId>
</dependency>
</dependencies>
<build>
<resources>
<resource>
<directory>src/main/resources</directory>
<filtering>false</filtering>
</resource>
<resource>
<directory>src/main/resources-filtered</directory>
<filtering>true</filtering>
</resource>
</resources>
<plugins>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<executions>
<execution>
<id>scala-compile</id>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
</execution>
</executions>
<configuration>
<scalaVersion>${scala.version}</scalaVersion>
<jvmArgs>
<jvmArg>-Xms64m</jvmArg>
<jvmArg>-Xmx1024m</jvmArg>
</jvmArgs>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>3.0.2</version>
<configuration>
<archive>
<manifestEntries>
<Implementation-Title>${project.artifactId}</Implementation-Title>
<Implementation-Version>${project.version}</Implementation-Version>
<Specification-Vendor>${project.groupId}</Specification-Vendor>
<Specification-Title>${project.artifactId}</Specification-Title>
<Implementation-Vendor-Id>${project.groupId}</Implementation-Vendor-Id>
<Specification-Version>${project.version}</Specification-Version>
<Implementation-Vendor>${project.groupId}</Implementation-Vendor>
</manifestEntries>
</archive>
</configuration>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>flatten-maven-plugin</artifactId>
<version>1.2.7</version>
<configuration>
<updatePomFile>true</updatePomFile>
<flattenMode>resolveCiFriendliesOnly</flattenMode>
</configuration>
<executions>
<execution>
<id>flatten</id>
<phase>process-resources</phase>
<goals>
<goal>flatten</goal>
</goals>
</execution>
<execution>
<id>flatten.clean</id>
<phase>clean</phase>
<goals>
<goal>clean</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-assembly-plugin</artifactId>
<version>3.3.0</version>
<executions>
<execution>
<id>make-assembly</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
<configuration>
<finalName>${project.artifactId}-${project.version}-assembly</finalName>
<appendAssemblyId>false</appendAssemblyId>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
<archive>
<manifestEntries>
<Implementation-Title>${project.artifactId}</Implementation-Title>
<Implementation-Version>${project.version}</Implementation-Version>
<Specification-Vendor>${project.groupId}</Specification-Vendor>
<Specification-Title>${project.artifactId}</Specification-Title>
<Implementation-Vendor-Id>${project.groupId}</Implementation-Vendor-Id>
<Specification-Version>${project.version}</Specification-Version>
<Implementation-Vendor>${project.groupId}</Implementation-Vendor>
</manifestEntries>
</archive>
</configuration>
</plugin>
<plugin>
<groupId>org.scalatest</groupId>
<artifactId>scalatest-maven-plugin</artifactId>
<version>2.0.0</version>
<configuration>
<reportsDirectory>${project.build.directory}/surefire-reports</reportsDirectory>
</configuration>
<executions>
<execution>
<id>test</id>
<goals>
<goal>test</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.scalastyle</groupId>
<artifactId>scalastyle-maven-plugin</artifactId>
<executions>
<execution>
<phase>compile</phase>
<goals>
<goal>check</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
and this is the parent project:
<groupId>group</groupId>
<artifactId>parent</artifactId>
<version>3.1-SNAPSHOT</version>
<packaging>pom</packaging>
<properties>
<scala.version>2.12.14</scala.version>
<scala.compact.version>2.12</scala.compact.version>
<spark.version>3.2.1</spark.version>
<databricks.dbutils.api.version>0.0.5</databricks.dbutils.api.version>
<delta.version>1.1.0</delta.version>
<postgresql.version>42.2.9</postgresql.version>
<typesafe.config.version>1.4.0</typesafe.config.version>
<snakeyaml.version>1.25</snakeyaml.version>
<spray-json.version>1.3.5</spray-json.version>
<h2.version>1.4.200</h2.version>
<mockito.version>1.16.46</mockito.version>
<scalacheck.version>1.14.0</scalacheck.version>
<adal4j.version>1.6.6</adal4j.version>
<scalastyle.version>0.8.0</scalastyle.version>
<maven-surefire-plugin.version>2.7</maven-surefire-plugin.version>
<scalatest-maven-plugin.version>1.0</scalatest-maven-plugin.version>
<spark-daria.version>1.2.3</spark-daria.version>
<spark-fast-tests.version>1.1.0</spark-fast-tests.version>
<kafka.version>3.1.0</kafka.version>
<scala-maven-plugin.version>4.4.0</scala-maven-plugin.version>
<cats-core.version>2.0.0</cats-core.version>
<pureconfig_2.11.version>0.12.3</pureconfig_2.11.version>
<jgitflow-maven-plugin.version>1.0-m5.1</jgitflow-maven-plugin.version>
<joda-time.version>2.10.6</joda-time.version>
<mssql-jdbc.version>8.2.2.jre8</mssql-jdbc.version>
<circe-yaml.artifact>circe-yaml_2.12</circe-yaml.artifact>
<circe-yaml.version>0.13.1</circe-yaml.version>
<circe-generic.artifact>circe-generic_2.12</circe-generic.artifact>
<circe-generic.version>0.13.0</circe-generic.version>
<scala-uuid_2.12.version>0.3.1</scala-uuid_2.12.version>
<spark-testing-base.version>0.14.0</spark-testing-base.version>
<scalatest.version>3.0.5</scalatest.version>
<httpclient.version>4.5.6</httpclient.version>
<azure-identity.version>1.4.0</azure-identity.version>
<azure-core.version>1.23.0</azure-core.version>
<azure-data-schemaregistry.version>1.0.0</azure-data-schemaregistry.version>
<log4j-core.version>2.17.1</log4j-core.version>
<commons-cli.version>1.2</commons-cli.version>
<reactor-netty.version>1.0.15</reactor-netty.version>
<azure-core-http-netty.version>1.11.7</azure-core-http-netty.version>
</properties>
<dependencyManagement>
<dependencies>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-core</artifactId>
<version>${log4j-core.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.compact.version}</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_${scala.compact.version}</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_${scala.compact.version}</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_${scala.compact.version}</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>${kafka.version}</version>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-streams</artifactId>
<version>${kafka.version}</version>
<exclusions>
<exclusion>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-avro_${scala.compact.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.microsoft.azure</groupId>
<artifactId>adal4j</artifactId>
<version>${adal4j.version}</version>
</dependency>
<dependency>
<groupId>com.databricks</groupId>
<artifactId>dbutils-api_2.12</artifactId>
<version>${databricks.dbutils.api.version}</version>
<exclusions>
<exclusion>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
</exclusion>
</exclusions>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>joda-time</groupId>
<artifactId>joda-time</artifactId>
<version>${joda-time.version}</version>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>${httpclient.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>io.delta</groupId>
<artifactId>delta-core_${scala.compact.version}</artifactId>
<version>${delta.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>com.typesafe</groupId>
<artifactId>config</artifactId>
<version>${typesafe.config.version}</version>
</dependency>
<dependency>
<groupId>org.yaml</groupId>
<artifactId>snakeyaml</artifactId>
<version>${snakeyaml.version}</version>
</dependency>
<dependency>
<groupId>io.circe</groupId>
<artifactId>${circe-yaml.artifact}</artifactId>
<version>${circe-yaml.version}</version>
</dependency>
<dependency>
<groupId>io.circe</groupId>
<artifactId>${circe-generic.artifact}</artifactId>
<version>${circe-generic.version}</version>
</dependency>
<dependency>
<groupId>io.jvm.uuid</groupId>
<artifactId>scala-uuid_2.12</artifactId>
<version>${scala-uuid_2.12.version}</version>
</dependency>
<dependency>
<groupId>io.spray</groupId>
<artifactId>spray-json_${scala.compact.version}</artifactId>
<version>${spray-json.version}</version>
</dependency>
<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-scala_${scala.compact.version}</artifactId>
<version>${mockito.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.typelevel</groupId>
<artifactId>cats-core_2.12</artifactId>
<version>${cats-core.version}</version>
</dependency>
<dependency>
<groupId>org.scalacheck</groupId>
<artifactId>scalacheck_${scala.compact.version}</artifactId>
<version>${scalacheck.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.holdenkarau</groupId>
<artifactId>spark-testing-base_${scala.compact.version}
<version>2.4.5_${spark-testing-base.version}</version>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_${scala.compact.version}</artifactId>
<version>${scalatest.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.github.mrpowers</groupId>
<artifactId>spark-fast-tests_${scala.compact.version}</artifactId>
<version>${spark-fast-tests.version}</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>com.github.mrpowers</groupId>
<artifactId>spark-daria_${scala.compact.version}</artifactId>
<version>${spark-daria.version}</version>
</dependency>
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-identity</artifactId>
<version>${azure-identity.version}</version>
<exclusions>
<exclusion>
<groupId>com.azure</groupId>
<artifactId>azure-core-http-netty</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-core</artifactId>
<version>${azure-core.version}</version>
<exclusions>
<exclusion>
<groupId>com.fasterxml.jackson.dataformat</groupId>
<artifactId>jackson-dataformat-xml</artifactId>
</exclusion>
<exclusion>
<groupId>com.fasterxml.jackson.datatype</groupId>
<artifactId>jackson-datatype-jsr310</artifactId>
</exclusion>
<exclusion>
<groupId>io.projectreactor.netty</groupId>
<artifactId>reactor-netty</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-core-http-netty</artifactId>
<version>${azure-core-http-netty.version}</version>
</dependency>
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-data-schemaregistry</artifactId>
<version>${azure-data-schemaregistry.version}</version>
<exclusions>
<exclusion>
<groupId>com.azure</groupId>
<artifactId>azure-core-http-netty</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>commons-cli</groupId>
<artifactId>commons-cli</artifactId>
<version>${commons-cli.version}</version>
</dependency>
</dependencies>
</dependencyManagement>
<build>
<plugins>
<plugin>
<groupId>external.atlassian.jgitflow</groupId>
<artifactId>jgitflow-maven-plugin</artifactId>
<version>${jgitflow-maven-plugin.version}</version>
<configuration>
<pushFeatures>true</pushFeatures>
<pushHotfixes>true</pushHotfixes>
<pushReleases>true</pushReleases>
<enableSshAgent>true</enableSshAgent>
<noDeploy>true</noDeploy>
<flowInitContext>
<masterBranchName>master</masterBranchName>
<developBranchName>develop</developBranchName>
<featureBranchPrefix>feature/</featureBranchPrefix>
<releaseBranchPrefix>release/</releaseBranchPrefix>
<hotfixBranchPrefix>hotfix/</hotfixBranchPrefix>
<versionTagPrefix>${project.artifactId}-</versionTagPrefix>
</flowInitContext>
</configuration>
</plugin>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>${scala-maven-plugin.version}</version>
<executions>
<execution>
<phase>compile</phase>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
</execution>
</executions>
<configuration>
<jvmArgs>
<jvmArg>-Xms1024M</jvmArg>
<jvmArg>-Xmx2048M</jvmArg>
<jvmArg>-XX:+CMSClassUnloadingEnabled</jvmArg>
</jvmArgs>
</configuration>
</plugin>
<plugin>
<groupId>org.scalastyle</groupId>
<artifactId>scalastyle-maven-plugin</artifactId>
<version>${scalastyle.version}</version>
<configuration>
<verbose>true</verbose>
<failOnViolation>true</failOnViolation>
<includeTestSourceDirectory>true</includeTestSourceDirectory>
<failOnWarning>false</failOnWarning>
<sourceDirectory>${project.basedir}/src/main/scala</sourceDirectory>
<testSourceDirectory>${project.basedir}/src/test/scala</testSourceDirectory>
<configLocation>${project.basedir}/scalastyle_config.xml</configLocation>
<outputFile>${project.basedir}/scalastyle-output.xml</outputFile>
<outputEncoding>UTF-8</outputEncoding>
</configuration>
<executions>
<execution>
<goals>
<goal>check</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>${maven-surefire-plugin.version}</version>
<configuration>
<skipTests>true</skipTests>
</configuration>
</plugin>
</plugins>
</build>
<repositories>
<repository>
<id>jitpack.io</id>
<url>https://jitpack.io</url>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
<repository>
<id>dl.binatry.com</id>
<url>http://dl.bintray.com/spark-packages/maven</url>
<snapshots>
<enabled>false</enabled>
</snapshots>
</repository>
</repositories>
When I run it locally it works fine, the issue arrives when trying to run it in the cluster.
I try to exclude some conflict versions artificats but the issue persists.
Any help?

Maven Setup is not generating class files

I have been trying to run the one of the scala-spark code in Eclipse (windows), But I could not create the class files when I try maven install. Please note the same code is running in mac eclipse. But I try to run with the same code in windows eclipse. Below is the pom file content,
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.mask</groupId>
<artifactId>**masked**</artifactId>
<version>${revision}</version>
<packaging>jar</packaging>
<name>**masked**</name>
<url>**masked**</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<revision>1.0.0</revision>
<log4j-version>1.2.17</log4j-version>
<slf4j-version>1.7.25</slf4j-version>
<typesafe-version>1.3.3</typesafe-version>
<json4s-version>3.6.3</json4s-version>
<commons-io-version>2.6</commons-io-version>
<httpcomponents-version>4.5.8</httpcomponents-version>
<spark-version>2.4.0</spark-version>
<kafka-version>2.0.0</kafka-version>
<es-version>7.3.1</es-version>
<!-- <es-version>7.3.1</es-version> -->
<zoo-version>3.4.13</zoo-version>
<java-version>1.8</java-version>
<scala-version>2.11.11</scala-version>
<scalatest-version>3.0.5</scalatest-version>
<scala-maven-version>3.4.6</scala-maven-version>
<maven-compiler-version>3.8.0</maven-compiler-version>
<maven-assembly-version>2.5.3</maven-assembly-version>
<artifact-scala-version>2.11</artifact-scala-version>
<deploy-root>nda-apps</deploy-root>
<common-lib-dir>common-lib</common-lib-dir>
<source-lib-dir>source-lib</source-lib-dir>
</properties>
<!-- Project Dependencies -->
<dependencies>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>${log4j-version}</version>
</dependency>
<!-- <dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>${slf4j-version}</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-simple</artifactId>
<version>${slf4j-version}</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
<version>${slf4j-version}</version>
</dependency> -->
<dependency>
<groupId>com.typesafe</groupId>
<artifactId>config</artifactId>
<version>${typesafe-version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>${commons-io-version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>${httpcomponents-version}</version>
<scope>provided</scope>
</dependency>
<!-- <dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-simple</artifactId>
<version>${slf4j-version}</version>
<scope>provided</scope>
</dependency> -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>${spark-version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>${spark-version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
<version>${spark-version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>${spark-version}</version>
<scope>provided</scope>
<exclusions>
<exclusion>
<groupId>org.json4s</groupId>
<artifactId>json4s-native_2.11</artifactId>
</exclusion>
<exclusion>
<groupId>org.json4s</groupId>
<artifactId>json4s-jackson_2.11</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.kafka</groupId>
<artifactId>kafka-clients</artifactId>
<version>${kafka-version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch-spark-20_2.11</artifactId>
<version>${es-version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-client</artifactId>
<!-- <version>${es-version}</version> -->
<version>6.5.4</version>
<scope>provided</scope>
<exclusions>
<exclusion>
<groupId>org.json4s</groupId>
<artifactId>json4s-native_2.11</artifactId>
</exclusion>
<exclusion>
<groupId>org.json4s</groupId>
<artifactId>json4s-jackson_2.11</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.zookeeper</groupId>
<artifactId>zookeeper</artifactId>
<version>${zoo-version}</version>
</dependency>
</dependencies>
<build>
<finalName>${project.artifactId}-${project.version}_${artifact-scala-version}</finalName>
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/scala</testSourceDirectory>
<pluginManagement>
<plugins>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>${scala-maven-version}</version>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
</execution>
</executions>
<configuration>
<scalaVersion>${scala-version}</scalaVersion>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>${maven-compiler-version}</version>
<configuration>
<source>${java-version}</source>
<target>${java-version}</target>
</configuration>
</plugin>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<version>${maven-assembly-version}</version>
<configuration>
<descriptor>src/assembly/assembly.xml</descriptor>
</configuration>
<executions>
<execution>
<id>create-archive</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</pluginManagement>
</build>
I have setup it in JDK 1.8. Tried lot of time to remove and reconfigure setup from the begining but not working.

How to use a sub module of spark (say spark_core) in the project? [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 3 years ago.
Improve this question
I know i can add the jar in my project and i can use it, but that's something i don't want to do. I need to use the source code of core Module which we have in the spark_parent_2.12 on github.
I am able to extract the core project from the spark and add it into my project as dependency here is my pom.xml.
project's pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.12</artifactId>
<version>3.0.0-SNAPSHOT</version>
<modules>
<module>core</module>
</modules>
<properties>
<sbt.project.name>core</sbt.project.name>
<sbt.project.name>core</sbt.project.name><avro.mapred.classifier>hadoop2</avro.mapred.classifier>
<javaxservlet.version>3.1.0</javaxservlet.version>
<scala.binary.version>2.12</scala.binary.version>
<oro.version>2.0.8</oro.version>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<scala.version>2.12.8</scala.version>
<scala.plugin.version>3.2.2</scala.plugin.version>
<scoverage.plugin.version>1.3.0</scoverage.plugin.version>
<project-info-reports.plugin.version>3.0.0</project-info-reports.plugin.version>
<json4s.version>3.6.5</json4s.version>
<scalaxml.version>1.1.1</scalaxml.version>
</properties>
<dependencies>
<dependency>
<groupId>com.thoughtworks.paranamer</groupId>
<artifactId>paranamer</artifactId>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro</artifactId>
</dependency>
<dependency>
<groupId>org.apache.avro</groupId>
<artifactId>avro-mapred</artifactId>
<classifier>${avro.mapred.classifier}</classifier>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
</dependency>
<dependency>
<groupId>com.twitter</groupId>
<artifactId>chill_${scala.binary.version}</artifactId>
</dependency>
<dependency>
<groupId>com.twitter</groupId>
<artifactId>chill-java</artifactId>
</dependency>
<dependency>
<groupId>org.apache.xbean</groupId>
<artifactId>xbean-asm7-shaded</artifactId>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-launcher_${scala.binary.version}</artifactId>
<version>2.4.3</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-kvstore_${scala.binary.version}</artifactId>
<version>2.4.3</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-network-common_${scala.binary.version}</artifactId>
<version>2.4.3</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-network-shuffle_${scala.binary.version}</artifactId>
<version>2.4.3</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-unsafe_${scala.binary.version}</artifactId>
<version>2.4.3</version>
</dependency>
<dependency>
<groupId>javax.activation</groupId>
<artifactId>activation</artifactId>
</dependency>
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-recipes</artifactId>
</dependency>
<!-- With curator 2.12 SBT/Ivy doesn't get ZK on the build classpath.
Explicitly declaring it as a dependency fixes this. -->
<dependency>
<groupId>org.apache.zookeeper</groupId>
<artifactId>zookeeper</artifactId>
</dependency>
<!-- Jetty dependencies promoted to compile here so they are shaded
and inlined into spark-core jar -->
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-plus</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-security</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-util</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-server</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-http</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-continuation</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-servlet</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-proxy</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-client</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-servlets</artifactId>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>javax.servlet</groupId>
<artifactId>javax.servlet-api</artifactId>
<version>${javaxservlet.version}</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-math3</artifactId>
</dependency>
<dependency>
<groupId>com.google.code.findbugs</groupId>
<artifactId>jsr305</artifactId>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>jul-to-slf4j</artifactId>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>jcl-over-slf4j</artifactId>
</dependency>
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
</dependency>
<dependency>
<groupId>com.ning</groupId>
<artifactId>compress-lzf</artifactId>
</dependency>
<dependency>
<groupId>org.xerial.snappy</groupId>
<artifactId>snappy-java</artifactId>
</dependency>
<dependency>
<groupId>org.lz4</groupId>
<artifactId>lz4-java</artifactId>
</dependency>
<dependency>
<groupId>com.github.luben</groupId>
<artifactId>zstd-jni</artifactId>
</dependency>
<dependency>
<groupId>org.roaringbitmap</groupId>
<artifactId>RoaringBitmap</artifactId>
</dependency>
<dependency>
<groupId>commons-net</groupId>
<artifactId>commons-net</artifactId>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-reflect</artifactId>
</dependency>
<dependency>
<groupId>org.json4s</groupId>
<artifactId>json4s-jackson_${scala.binary.version}</artifactId>
</dependency>
<dependency>
<groupId>org.glassfish.jersey.core</groupId>
<artifactId>jersey-client</artifactId>
</dependency>
<dependency>
<groupId>org.glassfish.jersey.core</groupId>
<artifactId>jersey-common</artifactId>
</dependency>
<dependency>
<groupId>org.glassfish.jersey.core</groupId>
<artifactId>jersey-server</artifactId>
</dependency>
<dependency>
<groupId>org.glassfish.jersey.containers</groupId>
<artifactId>jersey-container-servlet</artifactId>
</dependency>
<dependency>
<groupId>org.glassfish.jersey.containers</groupId>
<artifactId>jersey-container-servlet-core</artifactId>
</dependency>
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty-all</artifactId>
</dependency>
<dependency>
<groupId>io.netty</groupId>
<artifactId>netty</artifactId>
</dependency>
<dependency>
<groupId>com.clearspring.analytics</groupId>
<artifactId>stream</artifactId>
</dependency>
<dependency>
<groupId>io.dropwizard.metrics</groupId>
<artifactId>metrics-core</artifactId>
</dependency>
<dependency>
<groupId>io.dropwizard.metrics</groupId>
<artifactId>metrics-jvm</artifactId>
</dependency>
<dependency>
<groupId>io.dropwizard.metrics</groupId>
<artifactId>metrics-json</artifactId>
</dependency>
<dependency>
<groupId>io.dropwizard.metrics</groupId>
<artifactId>metrics-graphite</artifactId>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.module</groupId>
<artifactId>jackson-module-scala_${scala.binary.version}</artifactId>
</dependency>
<dependency>
<groupId>org.apache.derby</groupId>
<artifactId>derby</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.ivy</groupId>
<artifactId>ivy</artifactId>
</dependency>
<dependency>
<groupId>oro</groupId>
<!-- oro is needed by ivy, but only listed as an optional dependency, so we include it. -->
<artifactId>oro</artifactId>
<version>${oro.version}</version>
</dependency>
<dependency>
<groupId>org.seleniumhq.selenium</groupId>
<artifactId>selenium-java</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.seleniumhq.selenium</groupId>
<artifactId>selenium-htmlunit-driver</artifactId>
<scope>test</scope>
</dependency>
<!-- Added for selenium: -->
<dependency>
<groupId>xml-apis</groupId>
<artifactId>xml-apis</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.hamcrest</groupId>
<artifactId>hamcrest-core</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.hamcrest</groupId>
<artifactId>hamcrest-library</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.mockito</groupId>
<artifactId>mockito-core</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-test</artifactId>
<scope>test</scope>
</dependency>
<dependency>
<groupId>net.razorvine</groupId>
<artifactId>pyrolite</artifactId>
<version>4.23</version>
<exclusions>
<exclusion>
<groupId>net.razorvine</groupId>
<artifactId>serpent</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>net.sf.py4j</groupId>
<artifactId>py4j</artifactId>
<version>0.10.8.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-tags_${scala.binary.version}</artifactId>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-launcher_${scala.binary.version}</artifactId>
<version>2.4.3</version>
<classifier>tests</classifier>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-network-shuffle_${scala.binary.version}</artifactId>
<version>2.4.3</version>
<classifier>tests</classifier>
<scope>test</scope>
</dependency>
<!--
This spark-tags test-dep is needed even though it isn't used in this module, otherwise testing-cmds that exclude
them will yield errors.
-->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-tags_${scala.binary.version}</artifactId>
<type>test-jar</type>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-crypto</artifactId>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>3.0.0-SNAPSHOT</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_2.12</artifactId>
<version>3.2.0-SNAP10</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scalacheck</groupId>
<artifactId>scalacheck_2.12</artifactId>
<version>1.14.0</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.json4s/json4s-ast -->
<dependency>
<groupId>org.json4s</groupId>
<artifactId>json4s-ast_2.12</artifactId>
<version>${json4s.version}</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.scala-lang.modules/scala-xml -->
<dependency>
<groupId>org.scala-lang.modules</groupId>
<artifactId>scala-xml_2.12</artifactId>
<version>${scalaxml.version}</version>
</dependency>
</dependencies>
<build>
<sourceDirectory>
${project.basedir}/src/main/scala
</sourceDirectory>
<plugins>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.1.3</version>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
<goal>add-source</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.1.1</version>
<executions>
<execution>
<phase>install</phase>
<goals>
<goal>shade</goal>
</goals>
</execution>
</executions>
<configuration>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
<finalName>${project.artifactId}-${project.version}</finalName>
</configuration>
</plugin>
<plugin>
<groupId>org.scoverage</groupId>
<artifactId>scoverage-maven-plugin</artifactId>
<version>${scoverage.plugin.version}</version>
<configuration>
<scalaVersion>${scala.version}</scalaVersion>
<aggregate>true</aggregate>
<highlighting>true</highlighting>
</configuration>
<executions>
<execution>
<goals>
<goal>report</goal> <!-- or integration-check -->
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
<reporting>
<plugins>
<plugin>
<groupId>org.scoverage</groupId>
<artifactId>scoverage-maven-plugin</artifactId>
<version>${scoverage.plugin.version}</version>
<reportSets>
<reportSet>
<reports>
<report>report</report>
<!-- or <report>integration-report</report> -->
<!-- or <report>report-only</report> -->
</reports>
</reportSet>
</reportSets>
</plugin>
</plugins>
</reporting>
</project>
spark_core's pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<!--
~ Licensed to the Apache Software Foundation (ASF) under one or more
~ contributor license agreements. See the NOTICE file distributed with
~ this work for additional information regarding copyright ownership.
~ The ASF licenses this file to You under the Apache License, Version 2.0
~ (the "License"); you may not use this file except in compliance with
~ the License. You may obtain a copy of the License at
~
~ http://www.apache.org/licenses/LICENSE-2.0
~
~ Unless required by applicable law or agreed to in writing, software
~ distributed under the License is distributed on an "AS IS" BASIS,
~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~ See the License for the specific language governing permissions and
~ limitations under the License.
-->
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>org.apache.spark</groupId>
<artifactId>spark-parent_2.12</artifactId>
<version>3.0.0-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>
<artifactId>spark-core_2.12</artifactId>
<packaging>jar</packaging>
<name>Spark Project Core</name>
<url>http://spark.apache.org/</url>
<build>
<outputDirectory>target/scala-${scala.binary.version}/classes</outputDirectory>
<testOutputDirectory>target/scala-${scala.binary.version}/test-classes</testOutputDirectory>
<resources>
<resource>
<directory>${project.basedir}/src/main/resources</directory>
</resource>
<resource>
<!-- Include the properties file to provide the build information. -->
<directory>${project.build.directory}/extra-resources</directory>
<filtering>true</filtering>
</resource>
</resources>
<sourceDirectory>
${project.basedir}/src/main/scala
</sourceDirectory>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-antrun-plugin</artifactId>
<executions>
<execution>
<phase>generate-resources</phase>
<configuration>
<!-- Execute the shell script to generate the spark build information. -->
<target>
<exec executable="bash">
<arg value="${project.basedir}/../build/spark-build-info"/>
<arg value="${project.build.directory}/extra-resources"/>
<arg value="${project.version}"/>
</exec>
</target>
</configuration>
<goals>
<goal>run</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-dependency-plugin</artifactId>
<executions>
<!-- When using SPARK_PREPEND_CLASSES Spark classes compiled locally don't use
shaded deps. So here we store jars in their original form which are added
when the classpath is computed. -->
<!-- See similar execution in mllib/pom.xml -->
<execution>
<id>copy-dependencies</id>
<phase>package</phase>
<goals>
<goal>copy-dependencies</goal>
</goals>
<configuration>
<outputDirectory>${project.build.directory}</outputDirectory>
<overWriteReleases>false</overWriteReleases>
<overWriteSnapshots>false</overWriteSnapshots>
<overWriteIfNewer>true</overWriteIfNewer>
<useSubDirectoryPerType>true</useSubDirectoryPerType>
<includeArtifactIds>
guava,jetty-io,jetty-servlet,jetty-servlets,jetty-continuation,jetty-http,jetty-plus,jetty-util,jetty-server,jetty-security,jetty-proxy,jetty-client
</includeArtifactIds>
<silent>true</silent>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
<profiles>
<profile>
<id>Windows</id>
<activation>
<os>
<family>Windows</family>
</os>
</activation>
<properties>
<script.extension>.bat</script.extension>
</properties>
</profile>
<profile>
<id>unix</id>
<activation>
<os>
<family>unix</family>
</os>
</activation>
<properties>
<script.extension>.sh</script.extension>
</properties>
</profile>
<profile>
<id>sparkr</id>
<build>
<plugins>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>exec-maven-plugin</artifactId>
<version>1.6.0</version>
<executions>
<execution>
<id>sparkr-pkg</id>
<phase>compile</phase>
<goals>
<goal>exec</goal>
</goals>
</execution>
</executions>
<configuration>
<executable>${project.basedir}${file.separator}..${file.separator}R${file.separator}install-dev${script.extension}</executable>
</configuration>
</plugin>
</plugins>
</build>
</profile>
</profiles>
</project>
I am able to access org.apache.spark.SparkConf or SparkContext but i am not able to access org.apache.spark.annotation.DeveloperApi in the Aggregator.scala file.
any help will be much appreciated.
You should add spark-tags dependency :
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-tags_2.12</artifactId>
<version>2.4.3</version>
</dependency>

Scala Maven Compiler: NullPointerException

I have just moved a project from scala 2.11.11 to 2.12.6
My build now throws the following exception:
error: java.lang.NullPointerException [INFO] at
scala.tools.nsc.transform.patmat.MatchTranslation$MatchTranslator.translateMatch(MatchTranslation.scala:215)
My pom.xml is below. Any advice on how to troubleshoot would be appreciated. The internet does not have any discussion of this issue
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>ikoda</groupId>
<artifactId>scalaML</artifactId>
<version>1.0-SNAPSHOT</version>
<inceptionYear>2008</inceptionYear>
<properties>
<scala.version>2.12.6</scala.version>
</properties>
<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.12</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.specs</groupId>
<artifactId>specs</artifactId>
<version>1.2.5</version>
<scope>compile</scope>
</dependency>
<dependency>
<groupId>org.clapper</groupId>
<artifactId>grizzled-slf4j_2.12</artifactId>
<version>1.3.2</version>
</dependency>
<!---spark-->
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.11 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.2.0</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.2.0</version>
<scope>provided</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.scala-lang/scala-reflect -->
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-reflect</artifactId>
<version>2.11.12</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.google.guava/guava -->
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>15.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming_2.11 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>2.2.0</version>
<scope>provided</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-mllib_2.11 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.11</artifactId>
<version>2.2.0</version>
<scope>provided</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-graphx_2.11 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-graphx_2.11</artifactId>
<version>2.2.0</version>
<scope>provided</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/com.typesafe/config -->
<dependency>
<groupId>com.typesafe</groupId>
<artifactId>config</artifactId>
<version>1.3.2</version>
</dependency>
<dependency>
<groupId>org.json4s</groupId>
<artifactId>json4s-core_2.12</artifactId>
<version>3.6.0-M4</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.github.scopt/scopt -->
<dependency>
<groupId>com.github.scopt</groupId>
<artifactId>scopt_2.10</artifactId>
<version>3.2.0</version>
</dependency>
<dependency>
<groupId>ikoda</groupId>
<artifactId>ikoda-utils</artifactId>
<version>0.0.1-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>ikoda</groupId>
<artifactId>ikodatext-textAnalysis</artifactId>
<version>0.0.1-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>com.datastax.spark</groupId>
<artifactId>spark-cassandra-connector_2.11</artifactId>
<version>2.0.7</version>
</dependency>
<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-corenlp</artifactId>
<version>3.8.0</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-corenlp</artifactId>
<version>3.8.0</version>
<classifier>models</classifier>
<scope>test</scope>
</dependency>
<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-corenlp</artifactId>
<version>3.8.0</version>
<classifier>models-chinese</classifier>
<scope>test</scope>
</dependency>
<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-corenlp</artifactId>
<version>3.8.0</version>
<classifier>models-english</classifier>
<scope>test</scope>
</dependency>
<!-- http://mvnrepository.com/artifact/edu.stanford.nlp/stanford-parser -->
<dependency>
<groupId>edu.stanford.nlp</groupId>
<artifactId>stanford-parser</artifactId>
<version>3.8.0</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.commons/commons-math3 -->
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-math3</artifactId>
<version>3.6.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.github.pathikrit/better-files -->
<dependency>
<groupId>com.github.pathikrit</groupId>
<artifactId>better-files_2.11</artifactId>
<version>3.4.0</version>
</dependency>
</dependencies>
<build>
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/scala</testSourceDirectory>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>3.1.0</version>
<configuration>
<!-- exclude log4j.properties -->
<excludes>
<exclude>**/log4j.properties</exclude>
</excludes>
</configuration>
</plugin>
<plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.3.3</version>
<executions>
<execution>
<phase>compile</phase>
<goals>
<goal>compile</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<version>3.1.0</version>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
<executions>
<execution>
<id>create-archive</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>

Error: Could not find or load main class in scala

After installing eclipse scala plugins and eclipse maven plugin for scala .
I am new to scala , so i tried to so ensured that the enviorment was working after testing a scala hello world project. It works as expected.
But i am facing difficulty while trying to execute the project that i had checked out from the company's repository. No matter what I do (clean,build, clean-install via mave etc) I am getting a "Error: Could not find or load main class com.company.team.spark.sqlutil.testQuery" while trying to run even a small hello world program inside the project. My hunch says eclipse is unable to create class files for the project due to a pom issse, but I am unable to nail it down even after several tries. Please help me to figure this out
Version: Eclipse Luna Release (4.4.0)
Build id: 20140612-0600
scala - 2.10.6
Scalacode - testQuery.scala
package com.company.team.spark.sqlutil
object testQuery {
def main(args: Array[String]): Unit = {
print ("Hello")
}
}
Below is the POM I used.
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.company.team.spark</groupId>
<artifactId>HomeSpark</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>HomeSpark</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<lib.dir>${project.basedir}\lib\</lib.dir>
</properties>
<dependencies>
<!-- <dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>1.2.1</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>system</scope>
<systemPath>${lib.dir}junit-3.8.1.jar</systemPath>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>2.1.0</version>
<scope>system</scope>
<systemPath>${lib.dir}spark-core_2.10-2.1.0.jar</systemPath>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>2.1.0</version>
<scope>system</scope>
<systemPath>${lib.dir}spark-sql_2.10-2.1.0.jar</systemPath>
</dependency>
<dependency>
<groupId>com.databricks</groupId>
<artifactId>spark-csv_2.10</artifactId>
<version>1.5.0</version>
<scope>system</scope>
<systemPath>${lib.dir}spark-csv_2.10-1.5.0.jar</systemPath>
</dependency> -->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.10 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>2.1.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql_2.10 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>2.1.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.databricks/spark-csv_2.10 -->
<dependency>
<groupId>com.databricks</groupId>
<artifactId>spark-csv_2.10</artifactId>
<version>1.5.0</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk</artifactId>
<version>1.9.2</version>
</dependency>
</dependencies>
<build>
<sourceDirectory>${project.basedir}/src/main/scala</sourceDirectory>
<testOutputDirectory>${project.build.directory}/test-classes</testOutputDirectory>
<plugins><plugin>
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.1.3</version>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
</execution>
</executions>
</plugin></plugins>
</build>
</project>
Link to image of project structure
Use compile install as required for scala-maven-plugin. You might be using clean install which is deleting generated .class files from /bin, eclipse could not find or load main class.
I was able to resolve the issues after opted for scala IDE over eclipse integrated with scala IDE plugin.
Also changed the pom.xml to the following:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.company.fuison</groupId>
<artifactId>SomeCloud</artifactId>
<version>0.0.1-SNAPSHOT</version>
<name>${project.artifactId}</name>
<description>My wonderfull scala app</description>
<inceptionYear>2015</inceptionYear>
<licenses>
<license>
<name>My License</name>
<url>http://....</url>
<distribution>repo</distribution>
</license>
</licenses>
<properties>
<maven.compiler.source>1.6</maven.compiler.source>
<maven.compiler.target>1.6</maven.compiler.target>
<encoding>UTF-8</encoding>
<scala.version>2.11.5</scala.version>
<scala.compat.version>2.11</scala.compat.version>
</properties>
<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.0.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core_2.11 -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.0.0</version>
</dependency>
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-java-sdk</artifactId>
<version>1.9.2</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.databricks/spark-csv_2.11 -->
<dependency>
<groupId>com.databricks</groupId>
<artifactId>spark-csv_2.11</artifactId>
<version>1.5.0</version>
</dependency>
<!-- Test -->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.12</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.specs2</groupId>
<artifactId>specs2-core_${scala.compat.version}</artifactId>
<version>2.4.16</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_${scala.compat.version}</artifactId>
<version>2.2.4</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.specs2</groupId>
<artifactId>specs2-junit_${scala.compat.version}</artifactId>
<version>2.4.16</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-api-scala_2.11</artifactId>
<version>2.8.1</version>
</dependency>
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-core</artifactId>
<version>2.8.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.scala-lang/scala-library
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>2.12.1</version>
</dependency>
-->
<!-- https://mvnrepository.com/artifact/com.typesafe.scala-logging/scala-logging_2.11 -->
<dependency>
<groupId>com.typesafe.scala-logging</groupId>
<artifactId>scala-logging_2.11</artifactId>
<version>3.5.0</version>
</dependency>
</dependencies>
<build>
<resources>
<resource>
<directory>${project.basedir}/config/log4j</directory>
</resource>
</resources>
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/scala</testSourceDirectory>
<plugins>
<plugin>
<!-- see http://davidb.github.com/scala-maven-plugin -->
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.2.0</version>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
<configuration>
<args>
<arg>-dependencyfile</arg>
<arg>${project.build.directory}/.scala_dependencies</arg>
</args>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.18.1</version>
<configuration>
<useFile>false</useFile>
<disableXmlReport>true</disableXmlReport>
<!-- If you have classpath issue like NoDefClassError,... -->
<!-- useManifestOnlyJar>false</useManifestOnlyJar -->
<includes>
<include>**/*Test.*</include>
<include>**/*Suite.*</include>
</includes>
</configuration>
</plugin>
</plugins>
</build>
</project>