I'm using Maven with the Scala archetype, and I'm getting this error:
“value $ is not a member of StringContext”
I have already tried adding several things to pom.xml, but nothing worked.
My code:
import org.apache.spark.ml.evaluation.RegressionEvaluator
import org.apache.spark.ml.regression.LinearRegression
import org.apache.spark.ml.tuning.{ParamGridBuilder, TrainValidationSplit}
// To see less warnings
import org.apache.log4j._
Logger.getLogger("org").setLevel(Level.ERROR)
// Start a simple Spark Session
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder().getOrCreate()
// Prepare training and test data.
val data = spark.read.option("header","true").option("inferSchema","true").format("csv").load("USA_Housing.csv")
// Check out the Data
data.printSchema()
// See an example of what the data looks like
// by printing out a Row
val colnames = data.columns
val firstrow = data.head(1)(0)
println("\n")
println("Example Data Row")
for(ind <- Range(1,colnames.length)){
println(colnames(ind))
println(firstrow(ind))
println("\n")
}
////////////////////////////////////////////////////
//// Setting Up DataFrame for Machine Learning ////
//////////////////////////////////////////////////
// A few things we need to do before Spark can accept the data!
// It needs to be in the form of two columns
// ("label","features")
// This will allow us to join multiple feature columns
// into a single column of an array of feature values
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.linalg.Vectors
// Rename Price to label column for naming convention.
// Grab only numerical columns from the data
val df = data.select(data("Price").as("label"),$"Avg Area Income",$"Avg Area House Age",$"Avg Area Number of Rooms",$"Area Population")
// An assembler converts the input values to a vector
// A vector is what the ML algorithm reads to train a model
// Set the input columns from which we are supposed to read the values
// Set the name of the column where the vector will be stored
val assembler = new VectorAssembler().setInputCols(Array("Avg Area Income","Avg Area House Age","Avg Area Number of Rooms","Area Population")).setOutputCol("features")
// Use the assembler to transform our DataFrame to the two columns
val output = assembler.transform(df).select($"label",$"features")
// Create a Linear Regression Model object
val lr = new LinearRegression()
// Fit the model to the data
// Note: Later we will see why we should split
// the data first, but for now we will fit to all the data.
val lrModel = lr.fit(output)
// Print the coefficients and intercept for linear regression
println(s"Coefficients: ${lrModel.coefficients} Intercept: ${lrModel.intercept}")
// Summarize the model over the training set and print out some metrics!
// Explore this in the spark-shell for more methods to call
val trainingSummary = lrModel.summary
println(s"numIterations: ${trainingSummary.totalIterations}")
println(s"objectiveHistory: ${trainingSummary.objectiveHistory.toList}")
trainingSummary.residuals.show()
println(s"RMSE: ${trainingSummary.rootMeanSquaredError}")
println(s"MSE: ${trainingSummary.meanSquaredError}")
println(s"r2: ${trainingSummary.r2}")
And this is my pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>test</groupId>
<artifactId>outrotest</artifactId>
<version>1.0-SNAPSHOT</version>
<name>${project.artifactId}</name>
<description>My wonderfull scala app</description>
<inceptionYear>2015</inceptionYear>
<licenses>
<license>
<name>My License</name>
<url>http://....</url>
<distribution>repo</distribution>
</license>
</licenses>
<properties>
<maven.compiler.source>1.6</maven.compiler.source>
<maven.compiler.target>1.6</maven.compiler.target>
<encoding>UTF-8</encoding>
<scala.version>2.11.5</scala.version>
<scala.compat.version>2.11</scala.compat.version>
</properties>
<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.11</artifactId>
<version>2.0.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.0.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.0.2</version>
</dependency>
<dependency>
<groupId>com.databricks</groupId>
<artifactId>spark-csv_2.11</artifactId>
<version>1.5.0</version>
</dependency>
<!-- Test -->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.11</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.specs2</groupId>
<artifactId>specs2-junit_${scala.compat.version}</artifactId>
<version>2.4.16</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.specs2</groupId>
<artifactId>specs2-core_${scala.compat.version}</artifactId>
<version>2.4.16</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_${scala.compat.version}</artifactId>
<version>2.2.4</version>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<sourceDirectory>src/main/scala</sourceDirectory>
<testSourceDirectory>src/test/scala</testSourceDirectory>
<plugins>
<plugin>
<!-- see http://davidb.github.com/scala-maven-plugin -->
<groupId>net.alchim31.maven</groupId>
<artifactId>scala-maven-plugin</artifactId>
<version>3.2.0</version>
<executions>
<execution>
<goals>
<goal>compile</goal>
<goal>testCompile</goal>
</goals>
<configuration>
<args>
<!--<arg>-make:transitive</arg>-->
<arg>-dependencyfile</arg>
<arg>${project.build.directory}/.scala_dependencies</arg>
</args>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.18.1</version>
<configuration>
<useFile>false</useFile>
<disableXmlReport>true</disableXmlReport>
<!-- If you have classpath issue like NoDefClassError,... -->
<!-- useManifestOnlyJar>false</useManifestOnlyJar -->
<includes>
<include>**/*Test.*</include>
<include>**/*Suite.*</include>
</includes>
</configuration>
</plugin>
</plugins>
</build>
</project>
I have no idea how to fix it. Does anybody have any idea?
Add the import below, right after the SparkSession is created, and it will work:
val spark = SparkSession.builder().getOrCreate()
import spark.implicits._ // << add this
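For context on why that import matters: the $"..." syntax is not built into Scala. The compiler rewrites it into a call to a method named $ on StringContext, and that method only exists once an implicit class providing it is in scope, which is what spark.implicits._ supplies in Spark. The following is a minimal sketch of the mechanism in plain Scala with no Spark dependency. The names DollarDemo, Col, ToyImplicits, StringToCol and selectIncome are hypothetical, made up for this illustration, and are not Spark's API.

```scala
object DollarDemo {
  // Hypothetical stand-in for Spark's Column type (illustration only).
  final case class Col(name: String)

  // Hypothetical analogue of what `import spark.implicits._` brings into scope.
  object ToyImplicits {
    // Adds a method named `$` to StringContext, so that $"..." can compile.
    implicit class StringToCol(val sc: StringContext) {
      def $(args: Any*): Col = Col(sc.s(args: _*))
    }
  }

  def selectIncome(): Col = {
    // Remove this import and the next line fails to compile with
    // "value $ is not a member of StringContext".
    import ToyImplicits._
    $"Avg Area Income"
  }

  def main(args: Array[String]): Unit =
    println(selectIncome())
}
```

In the question's code the same reasoning applies: the import has to appear after val spark is defined and before the first $"..." expression.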
You can use the col function instead. Just import it like this:
import org.apache.spark.sql.functions.col
And then change $"column" to col("column").
Hope it helps.
@Apurva's answer initially worked for me, in that the error vanished from IntelliJ.
But it then resulted in "Could not find implicit value for spark" during the sbt compile phase.
I found a strange workaround: importing implicits._ from the SparkSession referenced by a DataFrame, instead of from the one obtained by getOrCreate:
import df.sparkSession.implicits._
where df is a DataFrame.
This could be because my code was placed inside a case class that received an implicit val spark: SparkSession parameter, but I'm not really sure why this fix worked for me.
I'm using Spark 1.6. The above answers are great, but unfortunately they don't work in 1.6.
The way I solved it was by using df.col("column-name"):
val df = df_mid
.withColumn("dt", date_format(df_mid.col("timestamp"), "yyyy-MM-dd"))
.filter("dt != 'null'")
I am working on compile-time weaving with AspectJ, because load-time weaving was causing extra overhead on server startup. At compile time all the classes are being woven. However, when the application runs on the server, execution never reaches any of the aspect classes.
Some of my classes use Lombok, so I set it up like this and added the compile-time Maven plugin:
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.x.rgx</groupId>
<artifactId>web</artifactId>
<version>10.0</version>
<packaging>war</packaging>
<properties>
<runSuite>**/AllTests.class</runSuite>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
<spring-framework.version>5.0.4.RELEASE</spring-framework.version>
<lombok.version>1.18.2</lombok.version>
<aspectj.version>1.8.13</aspectj.version>
</properties>
<dependencies>
<dependency>
<groupId>org.projectlombok</groupId>
<artifactId>lombok</artifactId>
<version>${lombok.version}</version>
<scope>provided</scope>
<optional>true</optional>
</dependency>
<dependency>
<groupId>org.aspectj</groupId>
<artifactId>aspectjrt</artifactId>
<version>${aspectj.version}</version>
</dependency>
<dependency>
<groupId>org.aspectj</groupId>
<artifactId>aspectjtools</artifactId>
<version>${aspectj.version}</version>
</dependency>
<dependency>
<groupId>org.springframework</groupId>
<artifactId>spring-aspects</artifactId>
<version>${spring-framework.version}</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<executions>
<execution>
<id>default-compile</id>
<configuration>
<compilerArguments>
<d>${project.build.directory}/classes</d>
</compilerArguments>
</configuration>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>aspectj-maven-plugin</artifactId>
<version>1.11</version>
<dependencies>
<dependency>
<groupId>org.aspectj</groupId>
<artifactId>aspectjrt</artifactId>
<version>${aspectj.version}</version>
</dependency>
<dependency>
<groupId>org.aspectj</groupId>
<artifactId>aspectjtools</artifactId>
<version>${aspectj.version}</version>
</dependency>
</dependencies>
<configuration>
<complianceLevel>${maven.compiler.target}</complianceLevel>
<source>${maven.compiler.target}</source>
<target>${maven.compiler.target}</target>
<showWeaveInfo>true</showWeaveInfo>
<verbose>true</verbose>
<Xlint>ignore</Xlint>
<encoding>${project.build.sourceEncoding}</encoding>
<forceAjcCompile>true</forceAjcCompile>
<sources />
<weaveDirectories>
<weaveDirectory>${project.build.directory}/classes</weaveDirectory>
</weaveDirectories>
<aspectLibraries>
<aspectLibrary>
<groupId>org.springframework</groupId>
<artifactId>spring-aspects</artifactId>
</aspectLibrary>
</aspectLibraries>
</configuration>
<executions>
<execution>
<phase>process-classes</phase>
<goals>
<goal>compile</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-surefire-plugin</artifactId>
<version>2.22.1</version>
<configuration>
<includes>
<include>${runSuite}</include>
</includes>
</configuration>
</plugin>
</plugins>
</build>
</project>
package com.x.aspect.config;
@Configuration
@ComponentScan(basePackages = { "com.x" })
public class AspectConfig {
}
package com.x.login;
@Component
@Scope("session")
public class LoginMBean extends AbstractMbean {
@Autowired
LoginService loginService ;
public void loginUserData(){
LoginInfo info= new LoginInfo();
//setter for info object
//some nested method calls
loginService.insertLoginData(info);
}
}
package com.x.aspects;
@Component
@Aspect
public class Aspects {
private static Logger Logger= LoggerFactory.getLogger(Aspects.class);
@Pointcut("execution(* *(..)) && cflow(execution(* com.x.login..*(..)))")
public void methodsToBeProfiled() {}
@Around("methodsToBeProfiled()")
public Object methodsToBeProfiled(ProceedingJoinPoint point) throws Throwable {
StopWatch sw = new StopWatch(getClass().getSimpleName());
try {
sw.start(point.getSignature().getName());
return point.proceed();
} finally {
sw.stop();
Logger.info("Elapsed Time, Package Name, Method Name");
Logger.info(sw.prettyPrint());
Logger.info("Package Name: " + point.getStaticPart());
}
}
}
[INFO] Join point 'method-execution(java.lang.String com.x.login.LoginMBean.getArisgPersistenceUnitName(java.lang.String))' in Type 'com.x.login.LoginMBean' (LoginMBean.java:258) advised by around advice from 'com.x.aspects.Aspects' (Aspects.class(from Aspects.java)) [with runtime test]
[INFO] Join point 'method-execution(java.lang.String com.x.login.LoginMBean.getMultiDb())' in Type 'com.x.login.LoginMBean' (LoginMBean.java:269) advised by around advice from 'com.x.aspects.Aspects' (Aspects.class(from Aspects.java)) [with runtime test]
[INFO] Join point 'method-execution(void com.x.login.LoginMBean.setMultiDb(java.lang.String))' in Type 'com.x.login.LoginMBean' (LoginMBean.java:273) advised by around advice from 'com.x.aspects.Aspects' (Aspects.class(from Aspects.java)) [with runtime test]
[INFO] Join point 'method-execution(boolean com.x.login.LoginMBean.isDbListStatus())' in Type 'com.x.login.LoginMBean' (LoginMBean.java:277) advised by around advice from 'com.x.aspects.Aspects' (Aspects.class(from Aspects.java)) [with runtime test]
So at compile time it has woven all the classes, but at runtime execution never reaches Aspects.java. Is there anything else I need to add to the configuration? Do I need to add configuration in spring-config.xml?
It worked after changing the pointcut to:
@Pointcut("execution(* *(..)) && cflow(execution(* com.x.login.LoginMBean.*(..)))")
I don't understand why I get an exception in this very basic iText test:
package com.itextpdf.testpdf4;
import com.itextpdf.io.font.FontConstants;
import com.itextpdf.kernel.font.PdfFont;
import com.itextpdf.kernel.font.PdfFontFactory;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfWriter;
import com.itextpdf.layout.Document;
import com.itextpdf.layout.element.List;
import com.itextpdf.layout.element.ListItem;
import com.itextpdf.layout.element.Paragraph;
import com.itextpdf.text.DocumentException;
import com.itextpdf.licensekey.LicenseKey;
import com.itextpdf.test.annotations.WrapToTest;
import java.io.File;
import java.io.IOException;
@WrapToTest
public class HelloWorld {
public static final String DEST = "result/hello.pdf";
public static void main(String[] args)
throws DocumentException, IOException {
LicenseKey.loadLicenseFile("C:\\dev\\testPDF4\\src\\main\\java\\com\\itextpdf\\testpdf4\\itextkey1544447451310_0.xml");
File file = new File(DEST);
file.getParentFile().mkdirs();
new HelloWorld().createPdf(DEST);
}
public void createPdf(String dest) throws DocumentException, IOException {
PdfWriter writer = new PdfWriter(dest);
//Initialize PDF document
PdfDocument pdf = new PdfDocument(writer);
// Initialize document
Document document = new Document(pdf);
// Create a PdfFont
PdfFont font = PdfFontFactory.createFont(FontConstants.TIMES_ROMAN);
// Add a Paragraph
document.add(new Paragraph("iText is:").setFont(font));
// Create a List
List list = new List()
.setSymbolIndent(12)
.setListSymbol("\u2022")
.setFont(font);
// Add ListItem objects
list.add(new ListItem("Never gonna give you up"))
.add(new ListItem("Never gonna let you down"))
.add(new ListItem("Never gonna run around and desert you"))
.add(new ListItem("Never gonna make you cry"))
.add(new ListItem("Never gonna say goodbye"))
.add(new ListItem("Never gonna tell a lie and hurt you"));
// Add the list
document.add(list);
//Close document
document.close();
}
}
Exception in thread "main" java.lang.NoClassDefFoundError:
com/itextpdf/kernel/pdf/tagutils/DefaultAccessibilityProperties at
com.itextpdf.testpdf4.HelloWorld.createPdf(HelloWorld.java:56)
(Line 56 is: document.add(new Paragraph("iText is:").setFont(font));)
This code comes from here: https://developers.itextpdf.com/fr/content/itext-7-jump-start-tutorial/examples/chapter-1 -> C01E02_RickAstley.java
In the POM.XML :
<modelVersion>4.0.0</modelVersion>
<groupId>com.itextpdf</groupId>
<artifactId>testPDF4</artifactId>
<version>1.0</version>
(package is : package com.itextpdf.testpdf4;)
Here is the complete POM.XML :
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.itextpdf</groupId>
<artifactId>testPDF4</artifactId>
<version>1.0</version>
<properties>
<itext.version>7.1.4</itext.version>
<java.version>1.8</java.version>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<junit.version>4.12</junit.version>
</properties>
<repositories>
<repository>
<id>itext</id>
<name>iText Repository - releases</name>
<url>https://repo.itextsupport.com/releases</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>kernel</artifactId>
<version>7.0.4</version>
</dependency>
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>io</artifactId>
<version>7.0.4</version>
</dependency>
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>layout</artifactId>
<version>7.1.4</version>
<type>jar</type>
</dependency>
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>forms</artifactId>
<version>7.0.4</version>
</dependency>
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>pdfa</artifactId>
<version>7.0.4</version>
</dependency>
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>pdftest</artifactId>
<version>7.0.4</version>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-log4j12</artifactId>
<version>1.7.18</version>
</dependency>
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>itext-licensekey</artifactId>
<version>2.0.1</version>
</dependency>
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>itextpdf</artifactId>
<version>5.5.13</version>
<type>jar</type>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-sandbox-parent</artifactId>
<version>2</version>
<type>pom</type>
</dependency>
</dependencies>
<build>
<resources>
<resource>
<directory>src/main/resources</directory>
<excludes>
<exclude>**/*.p12</exclude>
</excludes>
</resource>
</resources>
<plugins>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.6.0</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-javadoc-plugin</artifactId>
<version>2.10.4</version>
<configuration>
<excludePackageNames>com.itextpdf.xml</excludePackageNames>
</configuration>
<executions>
<execution>
<id>attach-javadocs</id>
<phase>package</phase>
<goals>
<goal>jar</goal>
</goals>
</execution>
</executions>
</plugin>
<plugin>
<groupId>external.atlassian.jgitflow</groupId>
<artifactId>jgitflow-maven-plugin</artifactId>
<version>1.0-m5.1</version>
<configuration>
<!-- see goals wiki page for configuration options -->
<flowInitContext>
<masterBranchName>master</masterBranchName>
<developBranchName>develop</developBranchName>
<featureBranchPrefix>feature/</featureBranchPrefix>
<releaseBranchPrefix>release/</releaseBranchPrefix>
<hotfixBranchPrefix>hotfix/</hotfixBranchPrefix>
<versionTagPrefix />
</flowInitContext>
<allowUntracked>true</allowUntracked>
<autoVersionSubmodules>true</autoVersionSubmodules>
<updateDependencies>true</updateDependencies>
</configuration>
</plugin>
</plugins>
</build>
<profiles>
<profile>
<id>public</id>
<activation>
<activeByDefault>true</activeByDefault>
</activation>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>3.0.2</version>
<configuration>
<excludes>
<exclude>com/itextpdf/xml/**</exclude>
<exclude>**/*.p12</exclude>
</excludes>
</configuration>
</plugin>
</plugins>
</build>
</profile>
<profile>
<id>internal</id>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-jar-plugin</artifactId>
<version>3.0.2</version>
<configuration>
<excludes>
<exclude>**/*.p12</exclude>
</excludes>
<classifier>INTERNAL</classifier>
</configuration>
</plugin>
</plugins>
</build>
</profile>
</profiles>
</project>
Does anyone see something wrong? I don't.
Thanks
You're mixing different versions of the core iText artifacts, 7.0.4 and 7.1.4:
...
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>io</artifactId>
<version>7.0.4</version>
</dependency>
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>layout</artifactId>
<version>7.1.4</version>
<type>jar</type>
</dependency>
...
Don't mix these. Use the same version for all your core iText artifacts.
By the way, you put your test project into the iText group:
<groupId>com.itextpdf</groupId>
<artifactId>testPDF4</artifactId>
You shouldn't do that, particularly not in projects intended for production use.
Thanks a lot, mkl.
The problems were a bad group ID, bad versions, and a wrong nbaction.xml.
Trying to read a simple CSV file and load it into a DataFrame throws a java.lang.ArrayIndexOutOfBoundsException.
As I am new to Scala I may have missed something trivial; however, a thorough search on both Google and Stack Overflow turned up nothing.
The code is the following:
import org.apache.spark.sql.SparkSession
object TransformInitial {
def main(args: Array[String]): Unit = {
val session = SparkSession.builder.master("local").appName("test").getOrCreate()
val df = session.read.format("csv").option("header", "true").option("inferSchema", "true").option("delimiter",",").load("data_sets/small_test.csv")
df.show()
}
}
small_test.csv is as simple as possible:
v1,v2,v3
0,1,2
3,4,5
Here is the actual pom of this Maven project:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>Scala_tests</groupId>
<artifactId>Scala_tests</artifactId>
<version>0.0.1-SNAPSHOT</version>
<build>
<sourceDirectory>src</sourceDirectory>
<resources>
<resource>
<directory>src</directory>
<excludes>
<exclude>**/*.java</exclude>
</excludes>
</resource>
</resources>
<plugins>
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.8.0</version>
<configuration>
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
</plugins>
</build>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.12</artifactId>
<version>2.4.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.12</artifactId>
<version>2.4.0</version>
</dependency>
</dependencies>
</project>
Executing the code throws the following java.lang.ArrayIndexOutOfBoundsException:
18/11/09 12:03:31 INFO FileSourceStrategy: Pruning directories with:
18/11/09 12:03:31 INFO FileSourceStrategy: Post-Scan Filters: (length(trim(value#0, None)) > 0)
18/11/09 12:03:31 INFO FileSourceStrategy: Output Data Schema: struct<value: string>
18/11/09 12:03:31 INFO FileSourceScanExec: Pushed Filters:
18/11/09 12:03:31 INFO CodeGenerator: Code generated in 413.859722 ms
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 10582
at com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.accept(BytecodeReadingParanamer.java:563)
at com.thoughtworks.paranamer.BytecodeReadingParanamer$ClassReader.access$200(BytecodeReadingParanamer.java:338)
at com.thoughtworks.paranamer.BytecodeReadingParanamer.lookupParameterNames(BytecodeReadingParanamer.java:103)
at com.thoughtworks.paranamer.CachingParanamer.lookupParameterNames(CachingParanamer.java:90)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.getCtorParams(BeanIntrospector.scala:44)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$1(BeanIntrospector.scala:58)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$1$adapted(BeanIntrospector.scala:58)
at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:241)
at scala.collection.Iterator.foreach(Iterator.scala:929)
at scala.collection.Iterator.foreach$(Iterator.scala:929)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1417)
at scala.collection.IterableLike.foreach(IterableLike.scala:71)
at scala.collection.IterableLike.foreach$(IterableLike.scala:70)
at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
at scala.collection.TraversableLike.flatMap(TraversableLike.scala:241)
at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:238)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.findConstructorParam$1(BeanIntrospector.scala:58)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$19(BeanIntrospector.scala:176)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:234)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:32)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:29)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:191)
at scala.collection.TraversableLike.map(TraversableLike.scala:234)
at scala.collection.TraversableLike.map$(TraversableLike.scala:227)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:191)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$14(BeanIntrospector.scala:170)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.$anonfun$apply$14$adapted(BeanIntrospector.scala:169)
at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:241)
at scala.collection.immutable.List.foreach(List.scala:389)
at scala.collection.TraversableLike.flatMap(TraversableLike.scala:241)
at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:238)
at scala.collection.immutable.List.flatMap(List.scala:352)
at com.fasterxml.jackson.module.scala.introspect.BeanIntrospector$.apply(BeanIntrospector.scala:169)
at com.fasterxml.jackson.module.scala.introspect.ScalaAnnotationIntrospector$._descriptorFor(ScalaAnnotationIntrospectorModule.scala:22)
at com.fasterxml.jackson.module.scala.introspect.ScalaAnnotationIntrospector$.fieldName(ScalaAnnotationIntrospectorModule.scala:30)
at com.fasterxml.jackson.module.scala.introspect.ScalaAnnotationIntrospector$.findImplicitPropertyName(ScalaAnnotationIntrospectorModule.scala:78)
at com.fasterxml.jackson.databind.introspect.AnnotationIntrospectorPair.findImplicitPropertyName(AnnotationIntrospectorPair.java:467)
at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector._addFields(POJOPropertiesCollector.java:351)
at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector.collectAll(POJOPropertiesCollector.java:283)
at com.fasterxml.jackson.databind.introspect.POJOPropertiesCollector.getJsonValueMethod(POJOPropertiesCollector.java:169)
at com.fasterxml.jackson.databind.introspect.BasicBeanDescription.findJsonValueMethod(BasicBeanDescription.java:223)
at com.fasterxml.jackson.databind.ser.BasicSerializerFactory.findSerializerByAnnotations(BasicSerializerFactory.java:348)
at com.fasterxml.jackson.databind.ser.BeanSerializerFactory._createSerializer2(BeanSerializerFactory.java:210)
at com.fasterxml.jackson.databind.ser.BeanSerializerFactory.createSerializer(BeanSerializerFactory.java:153)
at com.fasterxml.jackson.databind.SerializerProvider._createUntypedSerializer(SerializerProvider.java:1203)
at com.fasterxml.jackson.databind.SerializerProvider._createAndCacheUntypedSerializer(SerializerProvider.java:1157)
at com.fasterxml.jackson.databind.SerializerProvider.findValueSerializer(SerializerProvider.java:481)
at com.fasterxml.jackson.databind.SerializerProvider.findTypedValueSerializer(SerializerProvider.java:679)
at com.fasterxml.jackson.databind.ser.DefaultSerializerProvider.serializeValue(DefaultSerializerProvider.java:107)
at com.fasterxml.jackson.databind.ObjectMapper._configAndWriteValue(ObjectMapper.java:3559)
at com.fasterxml.jackson.databind.ObjectMapper.writeValueAsString(ObjectMapper.java:2927)
at org.apache.spark.rdd.RDDOperationScope.toJson(RDDOperationScope.scala:52)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:142)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247)
at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:339)
at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
at org.apache.spark.sql.Dataset.collectFromPlan(Dataset.scala:3384)
at org.apache.spark.sql.Dataset.$anonfun$head$1(Dataset.scala:2545)
at org.apache.spark.sql.Dataset.$anonfun$withAction$2(Dataset.scala:3365)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:78)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3365)
at org.apache.spark.sql.Dataset.head(Dataset.scala:2545)
at org.apache.spark.sql.Dataset.take(Dataset.scala:2759)
at org.apache.spark.sql.execution.datasources.csv.TextInputCSVDataSource$.infer(CSVDataSource.scala:232)
at org.apache.spark.sql.execution.datasources.csv.CSVDataSource.inferSchema(CSVDataSource.scala:68)
at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.inferSchema(CSVFileFormat.scala:63)
at org.apache.spark.sql.execution.datasources.DataSource.$anonfun$getOrInferFileFormatSchema$12(DataSource.scala:183)
at scala.Option.orElse(Option.scala:289)
at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:180)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:373)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
at TransformInitial$.main(TransformInitial.scala:9)
at TransformInitial.main(TransformInitial.scala)
For the record, the Eclipse version is 2018-09 (4.9.0).
I've hunted for special characters in the CSV with cat -A. It yielded nothing.
I'm out of options; something trivial must be missing, but I can't put my finger on it.
I'm not sure exactly what is causing your error, since the code works for me. It could be related to the version of the Scala compiler that you are using, since there's no information about that in your Maven file.
I have posted my complete solution, which uses SBT, to GitHub. To execute the code, you'll need to install SBT, cd to the root folder of the checked-out source, then run the following command:
$ sbt run
BTW, I changed your code to follow more idiomatic Scala conventions, and also used the csv function to load your file. The new Scala code looks like this:
import org.apache.spark.sql.SparkSession
// Extending App is more idiomatic than writing a "main" function.
object TransformInitial
extends App {
val session = SparkSession.builder.master("local").appName("test").getOrCreate()
// As of Spark 2.0, it's easier to read CSV files.
val df = session.read.option("header", "true").option("inferSchema", "true").csv("data_sets/small_test.csv")
df.show()
// Shutdown gracefully.
session.stop()
}
Note that I also removed the redundant delimiter option.
Downgrading the Scala version to 2.11 fixed it for me:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.4.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.4.0</version>
</dependency>
The following code is meant to read messages from Kafka; it is submitted with spark-submit.
The code executes and terminates without errors but reads no messages (the output file is empty and the log statement inside rdd.foreachPartition prints nothing). Please point out what I am missing.
package hive;
import java.net.URI;
import java.util.*;
import org.apache.spark.SparkConf;
import org.apache.spark.TaskContext;
import org.apache.spark.api.java.*;
import org.apache.spark.api.java.function.*;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.StreamingContext;
import org.apache.spark.streaming.api.java.*;
import org.apache.spark.streaming.kafka010.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.apache.hadoop.fs.FileSystem;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.TopicPartition;
import org.apache.kafka.common.serialization.StringDeserializer;
import scala.Tuple2;
public class SparkKafka1 {
private static final Logger logger = LoggerFactory.getLogger(SparkKafka1.class);
public static void main(String[] args) {
Map<String, Object> kafkaParams = new HashMap<>();
kafkaParams.put("bootstrap.servers", "http://192.168.1.214:9092,http://192.168.1.214:9093");
kafkaParams.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
kafkaParams.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
//kafkaParams.put("group.id", "StreamingGroup");
kafkaParams.put("auto.offset.reset", "smallest");
kafkaParams.put("enable.auto.commit", false);
String user = "ankit";
String password = "noida#123";
Collection<String> topics = Arrays.asList("StreamingTopic");
SparkConf conf = new SparkConf().setMaster("spark://192.168.1.214:7077")
.set("spark.deploy.mode", "cluster").set("user",user)
.set("password",password).set("spark.driver.memory", "1g").set("fs.defaultFS", "hdfs://192.168.1.214:9000")
.setAppName("NetworkWordCount");
JavaStreamingContext streamingContext = new JavaStreamingContext(conf,new Duration(500));
JavaInputDStream<ConsumerRecord<String, String>> stream =
KafkaUtils.createDirectStream(
streamingContext,
LocationStrategies.PreferConsistent(),
ConsumerStrategies.<String, String>Subscribe(topics, kafkaParams)
);
stream.mapToPair(record -> new Tuple2<>(record.key(), record.value()));
stream.foreachRDD(rdd ->{
rdd.foreachPartition(item ->{
while (item.hasNext()) {
System.out.println(">>>>>>>>>>>>>>>>>>>>>>>>>>>"+item.next());
logger.info("next item="+item.next());
}
});
});
logger.info("demo log="+stream.count());
stream.foreachRDD(rdd -> {
OffsetRange[] offsetRanges = ((HasOffsetRanges) rdd.rdd()).offsetRanges();
rdd.foreachPartition(consumerRecords -> {
OffsetRange o = offsetRanges[TaskContext.get().partitionId()];
System.out.println(
o.topic() + " " + o.partition() + " " + o.fromOffset() + " " + o.untilOffset());
rdd.saveAsTextFile("/home/ankit/work/warehouse/Manish.txt");
logger.info("tokenizing inside processElement method");
});
});
}
}
The following is the pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>SparkTest</groupId>
<artifactId>SparkTest</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>SparkTest</name>
<url>http://maven.apache.org</url>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<dependencies>
<!-- https://mvnrepository.com/artifact/org.scala-lang/scala-library -->
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>2.11.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.1.0</version>
<scope>provided </scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.1.0</version>
<scope>provided </scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_2.11</artifactId>
<version>2.1.0</version>
<scope>provided </scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>2.1.0</version>
<scope>provided </scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-flume_2.11</artifactId>
<version>2.1.0</version>
<scope>provided </scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-jdbc</artifactId>
<version>1.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>2.6.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-auth</artifactId>
<version>2.6.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
<version>2.6.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.6.0</version>
</dependency>
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>3.8.1</version>
<scope>test</scope>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<!-- or whatever version you use -->
<source>1.8</source>
<target>1.8</target>
</configuration>
</plugin>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.0.0</version>
<executions>
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<filters>
<filter>
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/LICENSE</exclude>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
<filter>
<artifact>org.apache.spark:spark-streaming-kafka-0-10_2.11</artifact>
<includes> <include>org/apache/spark/streaming/kafka010/**</include>
</includes>
</filter>
</filters>
<transformers>
<transformer
implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
</transformers>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
</project>
The following command submits the job:
./spark-submit --class hive.SparkKafka1 --master spark://192.168.1.214:6066 --deploy-mode cluster --supervise --executor-memory 2G --total-executor-cores 4 hdfs://192.168.1.214:9000/input/SparkTest-0.0.1-SNAPSHOT.jar
I haven't run this program, but it seems you are using Kafka 0.10.2, and the value smallest comes from the old consumer API; please use earliest instead.
You need to add these two calls:
streamingContext.start();//start this app.
streamingContext.awaitTermination();//prevent this app close.
I also see that you use an http:// prefix in the bootstrap.servers value. Remove the prefix and keep only host:port.
By the way, if you set the Spark conf in the code, there is no need to set the same values on the command line.
Please check it. If the error persists, let me know.
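The two Kafka parameter fixes suggested in the answers above (drop the http:// prefix, use earliest) can be sketched in plain Java without Spark on the classpath. The class and method names (KafkaParamsSketch, stripScheme, kafkaParams) are made up for illustration, and re-enabling group.id is my assumption, since a consumer that subscribes to topics normally needs one.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the original program.
final class KafkaParamsSketch {

    // Removes a leading "http://" or "https://" from every comma-separated
    // entry, because bootstrap.servers expects plain host:port pairs.
    public static String stripScheme(String servers) {
        return Arrays.stream(servers.split(","))
                .map(String::trim)
                .map(s -> s.replaceFirst("^https?://", ""))
                .collect(Collectors.joining(","));
    }

    // Builds the consumer parameters from the question with the suggested fixes.
    public static Map<String, Object> kafkaParams(String servers, String groupId) {
        Map<String, Object> params = new HashMap<>();
        params.put("bootstrap.servers", stripScheme(servers));
        params.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        params.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        params.put("group.id", groupId);             // assumption: re-enabled
        params.put("auto.offset.reset", "earliest"); // instead of "smallest"
        params.put("enable.auto.commit", false);
        return params;
    }

    public static void main(String[] args) {
        Map<String, Object> params = kafkaParams(
                "http://192.168.1.214:9092,http://192.168.1.214:9093",
                "StreamingGroup");
        System.out.println(params.get("bootstrap.servers"));
    }
}
```

The resulting map can be passed to ConsumerStrategies.Subscribe in place of the one built inline in main; the streaming context still has to be started as described above.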
java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaMirrors$JavaMirror;
at org.elasticsearch.spark.serialization.ReflectionUtils$.org$elasticsearch$spark$serialization$ReflectionUtils$$checkCaseClass(ReflectionUtils.scala:42)
at org.elasticsearch.spark.serialization.ReflectionUtils$$anonfun$checkCaseClassCache$1.apply(ReflectionUtils.scala:84)
It seems to be a Scala version incompatibility, but according to the Spark documentation, Spark 2.1.0 and Scala 2.11.8 should work together.
Here is my pom.xml. This is just a test for writing from Spark to Elasticsearch with es-hadoop, and I have no idea how to solve this exception.
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>cn.jhTian</groupId>
<artifactId>sparkLink</artifactId>
<version>0.0.1-SNAPSHOT</version>
<packaging>jar</packaging>
<name>${project.artifactId}</name>
<description>My wonderfull scala app</description>
<inceptionYear>2015</inceptionYear>
<licenses>
<license>
<name>My License</name>
<url>http://....</url>
<distribution>repo</distribution>
</license>
</licenses>
<properties>
<encoding>UTF-8</encoding>
<scala.version>2.11.8</scala.version>
<scala.compat.version>2.11</scala.compat.version>
</properties>
<repositories>
<repository>
<id>ainemo</id>
<name>xylink</name>
<url>http://10.170.209.180:8081/nexus/content/groups/public/</url>
</repository>
</repositories>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>2.6.4</version><!-- 2.64 -->
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<!--<dependency>-->
<!--<groupId>org.scala-lang</groupId>-->
<!--<artifactId>scala-compiler</artifactId>-->
<!--<version>${scala.version}</version>-->
<!--</dependency>-->
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-reflect</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>2.6.4</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka-0-8_2.11</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
<version>3.1.0</version>
</dependency>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch-hadoop</artifactId>
<version>5.3.0 </version>
</dependency>
<!-- Test -->
<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.10</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.specs2</groupId>
<artifactId>specs2-core_${scala.compat.version}</artifactId>
<version>2.4.16</version>
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_${scala.compat.version}</artifactId>
<version>2.2.4</version>
<scope>test</scope>
</dependency>
</dependencies>
</project>
This is my code:
import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._
/**
* Created by jhTian on 2017/4/19.
*/
object EsWrite {
def main(args: Array[String]) {
val sparkConf = new SparkConf()
.set("es.nodes", "1.1.1.1")
.set("es.port", "9200")
.set("es.index.auto.create", "true")
.setAppName("es-spark-demo")
val sc = new SparkContext(sparkConf)
val job1 = Job("C开发工程师","http://job.c.com","c公司","10000")
val job2 = Job("C++开发工程师","http://job.c++.com","c++公司","10000")
val job3 = Job("C#开发工程师","http://job.c#.com","c#公司","10000")
val job4 = Job("Java开发工程师","http://job.java.com","java公司","10000")
val job5 = Job("Scala开发工程师","http://job.scala.com","java公司","10000")
// val numbers = Map("one" -> 1, "two" -> 2, "three" -> 3)
// val airports = Map("arrival" -> "Otopeni", "SFO" -> "San Fran")
// val rdd=sc.makeRDD(Seq(numbers,airports))
val rdd=sc.makeRDD(Seq(job1,job2,job3,job4,job5))
rdd.saveToEs("job/info")
sc.stop()
}
}
case class Job(jobName:String, jobUrl:String, companyName:String, salary:String)
Generally NoSuchMethodError implies the caller was compiled with a different version than was found on the classpath at runtime (or you have multiple versions on the CP).
In your case, I'd guess that es-hadoop is built against a different version of Scala. I've not used Maven in a little while, but I think the command you need to get some useful info is mvn dependency:tree. Use the output to see which version of Scala es-hadoop is built with, and then configure your project to use the same Scala version.
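As a runtime complement to the dependency tree, it can help to print which jar a suspicious class was actually loaded from. This is only a sketch using standard JDK calls; the class name WhichJar and the fallback string are made up for illustration, and in the real job you would pass a class from the library in question rather than the ones shown here.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of the original program.
final class WhichJar {

    // Returns the jar or directory a class was loaded from, or a fallback
    // string when the JVM does not expose a code source (e.g. JDK classes).
    public static String locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        if (source == null) {
            return "(bootstrap or unknown)";
        }
        URL location = source.getLocation();
        return location == null ? "(bootstrap or unknown)" : location.toString();
    }

    public static void main(String[] args) {
        // Application class: typically a jar or a classes directory.
        System.out.println(locationOf(WhichJar.class));
        // JDK class: no code source is reported.
        System.out.println(locationOf(String.class));
    }
}
```

If the printed location for a Scala class is not the jar you expected, there are probably two versions on the classpath.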
To get stable/reproducible builds I'd recommend using something like the maven-enforcer-plugin:
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-enforcer-plugin</artifactId>
<version>1.4.1</version>
<executions>
<execution>
<id>enforce</id>
<configuration>
<rules>
<dependencyConvergence />
</rules>
</configuration>
<goals>
<goal>enforce</goal>
</goals>
</execution>
</executions>
</plugin>
It can be annoying initially, but once you have all your dependencies sorted out you shouldn't get issues like this anymore.
Use a dependency like this:
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch-spark-20_2.11</artifactId>
<version>5.2.2</version>
</dependency>
This artifact is for Spark 2.0 and Scala 2.11.
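The _2.11 suffix on these artifact IDs is the Scala binary version, so one quick sanity check is that every suffixed artifact in the pom carries the same suffix. A rough sketch follows; ScalaSuffixCheck and example-lib_2.10 are made-up names for illustration. Artifacts without a suffix, such as elasticsearch-hadoop, cannot be classified this way and are simply skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original project.
final class ScalaSuffixCheck {

    // Matches a trailing Scala binary version such as "_2.11".
    private static final Pattern SUFFIX = Pattern.compile("_(\\d+\\.\\d+)$");

    // Groups artifact IDs by Scala suffix; IDs without a suffix are ignored.
    public static Map<String, List<String>> groupByScalaVersion(List<String> artifactIds) {
        Map<String, List<String>> groups = new TreeMap<>();
        for (String id : artifactIds) {
            Matcher m = SUFFIX.matcher(id);
            if (m.find()) {
                groups.computeIfAbsent(m.group(1), k -> new ArrayList<>()).add(id);
            }
        }
        return groups;
    }

    // True when the suffixed artifacts disagree on the Scala binary version.
    public static boolean hasMixedScalaVersions(List<String> artifactIds) {
        return groupByScalaVersion(artifactIds).size() > 1;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList(
                "spark-core_2.11",
                "spark-streaming_2.11",
                "elasticsearch-spark-20_2.11",
                "example-lib_2.10"); // hypothetical offender
        System.out.println(groupByScalaVersion(ids));
        System.out.println("mixed: " + hasMixedScalaVersions(ids));
    }
}
```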