Error while using SparkSession or sqlcontext - scala

I am new to spark. I am just trying to parse a json file using sparksession or sqlcontext.
But whenever I run them, I am getting the following error.
Exception in thread "main" java.lang.NoSuchMethodError:
org.apache.spark.internal.config.package$.CATALOG_IMPLEMENTATION()Lorg/apache/spark/internal/config/ConfigEntry; at
org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$sessionStateClassName(SparkSession.scala:930) at
org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:112) at
org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:110) at
org.apache.spark.sql.DataFrameReader.<init>(DataFrameReader.scala:535) at
org.apache.spark.sql.SparkSession.read(SparkSession.scala:595) at
org.apache.spark.sql.SQLContext.read(SQLContext.scala:504) at
joinAssetsAndAd$.main(joinAssetsAndAd.scala:21) at
joinAssetsAndAd.main(joinAssetsAndAd.scala)
As of now I created a scala project in eclipse IDE and configured it as Maven project and added the spark and sql dependencies.
My dependencies :
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>2.0.0</version>
</dependency>
</dependencies>
Could you please explain why I am getting this error and how to correct them?

Try to use the same version for spark-core and spark-sql. Change version of spark-sql to 2.1.0

Related

Exception in thread "main" java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$

Hi I try to run spark on my local laptop.
I created a mvn project in intelijidea and in my main class I have one line like bellow and when I try to run a project I got the error like below
val spark = SparkSession.builder().master("local").getOrCreate()
21/11/02 18:02:35 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
Exception in thread "main" java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$ (in unnamed module #0x34e9fd99) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module #0x34e9fd99
at org.apache.spark.storage.StorageUtils$.(StorageUtils.scala:213)
at org.apache.spark.storage.BlockManagerMasterEndpoint.(BlockManagerMasterEndpoint.scala:110)
at org.apache.spark.SparkEnv$.$anonfun$create$9(SparkEnv.scala:348)
at org.apache.spark.SparkEnv$.registerOrLookupEndpoint$1(SparkEnv.scala:287)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:336)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:191)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:277)
at org.apache.spark.SparkContext.(SparkContext.scala:460)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2690)
at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:949)
at scala.Option.getOrElse(Option.scala:201)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:943)
at Main$.main(Main.scala:8)
at Main.main(Main.scala)
My dependency in pom
<dependencies>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.13</artifactId>
<version>3.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.13</artifactId>
<version>3.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-hive -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_2.11</artifactId>
<version>2.1.3</version>
<scope>provided</scope>
</dependency>
Any idea how to resolve this problem ?
At the time of writing this answer, Spark does not support Java 17 - only Java 8/11 (source: https://spark.apache.org/docs/latest/).
In my case uninstalling Java 17 and installing Java 8 (e.g. OpenJDK 8) fixed the problem and I started using Spark on my laptop.
UPDATE:
Spark runs on Java 8/11/17, Scala 2.12/2.13, Python 3.7+ and R 3.5+.

Getting dependency error for sparksession and SQLContext

I am getting dependency error for my SQLContext and sparksession in my spark program
val sqlContext = new SQLContext(sc)
val spark = SparkSession.builder()
Error for SQLCOntext
Symbol 'type org.apache.spark.Logging' is missing from the classpath. This symbol is required by 'class org.apache.spark.sql.SQLContext'. Make sure that type Logging is in your classpath and check for conflicting dependencies with -Ylog-classpath. A full rebuild may help if 'SQLContext.class' was compiled against an incompatible version of org.apache.spark.
Error for SparkSession:
not found: value SparkSession
Below are the spark dependencies in my pom.xml
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>1.6.0-cdh5.15.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>2.0.0-cloudera1-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-catalyst_2.10</artifactId>
<version>1.6.0-cdh5.15.1</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-test-tags_2.10</artifactId>
<version>1.6.0-cdh5.15.1</version>
</dependency>
You can't have both Spark 2 and Spark 1.6 dependencies defined in your project.
org.apache.spark.Logging is not available in Spark 2 anymore.
Change
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>2.0.0-cloudera1-SNAPSHOT</version>
</dependency>
to
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.6.0-cdh5.15.1</version>
</dependency>

Creating a kafka consumer using spark streaming

I have started kafka, created a topic and a producer. Now I want to read messages sent from that producer. My code
def main(args: Array[String]){
val sparkConf = new SparkConf()
val spark = new SparkContext(sparkConf)
val streamingContext = new StreamingContext(spark, Seconds(5))
val kafkaStream = KafkaUtils.createStream(streamingContext
, "localhost:2181"
, "test-group"
, Map("test" -> 1))
kafkaStream.print
streamingContext.start
streamingContext.awaitTermination
}
The dependencies I use
<properties>
<spark.version>1.6.2</spark.version>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.11</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.11</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka_2.11</artifactId>
<version>${spark.version}</version>
</dependency>
</dependencies>
But every time I try to run it in idea I get
Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
at org.apache.spark.util.Utils$.getSystemProperties(Utils.scala:1582)
at org.apache.spark.SparkConf.<init>(SparkConf.scala:59)
at org.apache.spark.SparkConf.<init>(SparkConf.scala:53)
at com.mypackage.KafkaConsumer$.main(KafkaConsumer.scala:10)
at com.mypackage.KafkaConsumer.main(KafkaConsumer.scala)
Other questions here point to conflicts between dependencies.
I use scala 2.10.5 and spark 1.6.2. I tried them in other projects, they worked fine.
Line 10 in this case is val sparkConf = new SparkConf()
I try to run the app in the IDEA without packaging it.
What can be the reason for this problem?
It's an error with Scala version. You're are using different versions of scala in your code and dependencies.
You said you're using scala 2.10 but you import spark-XX_2.11 dependencies. Unify your scala version.

Exception while running Spark program with SQL context in Scala

I am trying to run a simple Spark scala program built with Maven
Below is the source code:
case class Person(name:String,age:Int)
object parquetoperations {
def main(args:Array[String]){
val sparkconf=new SparkConf().setAppName("spark1").setMaster("local")
val sc=new SparkContext(sparkconf);
val sqlContext= new SQLContext(sc)
import sqlContext.implicits._
val peopleRDD = sc.textFile(args(0));
val peopleDF=peopleRDD.map(_.split(",")).map(attributes=>Person(attributes(0),attributes(1).trim.toInt)).toDF()
peopleDF.createOrReplaceTempView("people")
val adultsDF=sqlContext.sql("select * from people where age>18")
//adultsDF.map(x => "Name: "+x.getAs[String]("name")+ " age is: "+x.getAs[Int]("age")).show();
}
}
and below are the maven dependencies I have.
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>2.10.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.10</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-xml</artifactId>
<version>2.11.0-M4</version>
</dependency>
It throws the below error. Tried to debug in various ways with no luck
Exception in thread "main" java.lang.NoSuchMethodError:
scala.Predef$.$scope()Lscala/xml/TopScope$;
Looks like this is an error related to loading the spark web ui
All your dependencies are on Scala 2.10, but your scala-xml dependency is on Scala 2.11.
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-xml</artifactId>
<version>2.11.0-M4</version>
</dependency>
Btw, unless you really have a strong reason to do so, I would suggest you move to Scala 2.11.8. Everything is so much better with 2.11 compared to 2.10.

Getting exception : java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;) while using data frames

I am receiving "java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)" error while using dataframes in scala app and running it using spark. However if I work using only RDD's and not dataframes, no such error comes up with same pom and settings. Also while going through other posts with same error, it is mentioned that scala version has to be 2.10 as spark is not compatible with 2.11 scala, and i am using 2.10 scala version with 2.0.0 spark.
Below is the snip from pom:
<properties>
<spark-assembly>/usr/lib/spark/lib/spark-assembly.jar</spark-assembly>
<encoding>UTF-8</encoding>
<hadoop.version>2.7.1</hadoop.version>
<hbase.version>1.1.1</hbase.version>
<scala.version>2.10.5</scala.version>
<scala.tools.version>2.10</scala.tools.version>
<spark.version>2.0.0</spark.version>
<phoenix.version>4.7.0-HBase-1.1</phoenix.version>
</properties>
<dependencies>
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-library</artifactId>
<version>${scala.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-client</artifactId>
<version>${hadoop.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-client</artifactId>
<version>${hbase.version}</version>
</dependency>
<dependency>
<groupId>org.apache.hbase</groupId>
<artifactId>hbase-server</artifactId>
<version>${hbase.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.tools.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_${scala.tools.version}</artifactId>
<version>${spark.version}</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_${scala.tools.version}</artifactId>
<version>${spark.version}</version>
</dependency>
</dependencies>
Error:
16/10/19 02:57:26 ERROR yarn.ApplicationMaster: User class threw exception: java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaMirrors$JavaMirror;
java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaMirrors$JavaMirror;
at com.abc.xyz.Compare$.main(Compare.scala:64)
at com.abc.xyz.Compare.main(Compare.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:627)
16/10/19 02:57:26 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: java.lang.NoSuchMethodError: scala.reflect.api.JavaUniverse.runtimeMirror(Ljava/lang/ClassLoader;)Lscala/reflect/api/JavaMirrors$JavaMirror;)
16/10/19 02:57:26 INFO spark.SparkContext: Invoking stop() from shutdown hook
Change scala version
<scala.version>2.11.8</scala.version>
<scala.tools.version>2.11</scala.tools.version>
and add
<dependency>
<groupId>org.scala-lang</groupId>
<artifactId>scala-reflect</artifactId>
<version>${scala.version}</version>
</dependency>
I also faced this error and this is purely a version issue.
Your scala version is not compatible or may be you are using the correct version but the intellij libraries has the old version.
Quick fix :
I as using spark 2.2.0 and scala 2.10.4 , which I then changed to scala version 2.11.8.After that do the below:
1) right click on intellij module
2) open module-settings.
3) go to libraries and clear all of them
4) Rebuild
Doing above for me issue is resolved.