NoSuchMethodException: scala.tools.nsc.interpreter.ILoop when running PySpark code in Zeppelin - scala

I am trying to run a PySpark cell in Zeppelin, but I get a NoSuchMethodException on scala.tools.nsc.interpreter.ILoop.
Cell code:
%pyspark
print("hello")
Error:
java.lang.NoSuchMethodException: scala.tools.nsc.interpreter.ILoop.scala$tools$nsc$interpreter$ILoop$$loopPostInit()
at java.lang.Class.getMethod(Class.java:1786)
at org.apache.zeppelin.spark.BaseSparkScalaInterpreter.callMethod(BaseSparkScalaInterpreter.scala:268)
at org.apache.zeppelin.spark.BaseSparkScalaInterpreter.callMethod(BaseSparkScalaInterpreter.scala:262)
at org.apache.zeppelin.spark.SparkScala211Interpreter.open(SparkScala211Interpreter.scala:84)
at org.apache.zeppelin.spark.NewSparkInterpreter.open(NewSparkInterpreter.java:102)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:62)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
at org.apache.zeppelin.spark.PySparkInterpreter.getSparkInterpreter(PySparkInterpreter.java:664)
at org.apache.zeppelin.spark.PySparkInterpreter.createGatewayServerAndStartScript(PySparkInterpreter.java:260)
at org.apache.zeppelin.spark.PySparkInterpreter.open(PySparkInterpreter.java:194)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:69)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:617)
at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:140)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
zeppelin version: 0.8.0
spark version: 2.4.0
scala version: 2.11.12

The issue is described in ZEPPELIN-3810 (Support Spark 2.4): Spark 2.4 bundles Scala 2.11.12, whose ILoop internals changed, which breaks Zeppelin's reflective call to loopPostInit. The Zeppelin 0.8.1 release fixes the issue.

There is a compatibility issue between the Spark version and Zeppelin. I had the same error; I changed the Spark version to 2.3.2 and it worked.
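If you go that route, pointing Zeppelin at a Spark 2.3.2 install is just a matter of setting SPARK_HOME in conf/zeppelin-env.sh - a minimal sketch, where the install path is an assumption to adjust to your machine:
# conf/zeppelin-env.sh - assumed install path, adjust as needed
export SPARK_HOME=/opt/spark-2.3.2-bin-hadoop2.7
Restart the Spark interpreter (or Zeppelin itself) afterwards so the setting is picked up.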

Related

Spark & Zeppelin integration problems

I want to connect my locally installed Zeppelin 0.10.0 to a locally installed Spark 3.2.0 (I tried the same procedure with Spark 2.3.0 and it worked). But it looks like Zeppelin ships with an internal Spark, and it falls back to that internal one every time I try. I have gone through the Spark interpreter settings to no avail.
I just want to know whether there is any way to change the default internal Spark that Zeppelin uses to the Spark 3.2.0 install I want to use.
I set SPARK_HOME to the documented value and spark.master to local[*], and received the following error:
org.apache.zeppelin.interpreter.InterpreterException: java.lang.NoSuchMethodError: scala.tools.nsc.Settings.usejavacp()Lscala/tools/nsc/settings/AbsSettings$AbsSetting;
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:76)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:833)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:741)
at org.apache.zeppelin.scheduler.Job.run(Job.java:172)
at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:132)
at org.apache.zeppelin.scheduler.FIFOScheduler.lambda$runJobInScheduler$0(FIFOScheduler.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoSuchMethodError: scala.tools.nsc.Settings.usejavacp()Lscala/tools/nsc/settings/AbsSettings$AbsSetting;
at org.apache.zeppelin.spark.SparkScala212Interpreter.open(SparkScala212Interpreter.scala:66)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:121)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
... 8 more
org.apache.zeppelin.interpreter.InterpreterException: java.lang.NoSuchMethodError: scala.tools.nsc.Settings.usejavacp()Lscala/tools/nsc/settings/AbsSettings$AbsSetting;
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:76)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:833)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:741)
at org.apache.zeppelin.scheduler.Job.run(Job.java:172)
at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:132)
at org.apache.zeppelin.scheduler.FIFOScheduler.lambda$runJobInScheduler$0(FIFOScheduler.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoSuchMethodError: scala.tools.nsc.Settings.usejavacp()Lscala/tools/nsc/settings/AbsSettings$AbsSetting;
at org.apache.zeppelin.spark.SparkScala212Interpreter.open(SparkScala212Interpreter.scala:66)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:121)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
... 8 more
I've run into the same issue myself - you won't run Spark 3.2.0 on Zeppelin 0.10.0. Spark 3.1.2 works without any issues, and Zeppelin ships with Spark 2.4.5 included - this is a problem with the tool itself.
According to the ticket ZEPPELIN-5565, version 0.10.0 does NOT support Spark 3.2.0. This should be fixed in 0.10.1 and 0.11.0 (info from the mentioned ticket; I've also checked the GitHub repo).
The pull request that fixes this issue is much longer, but in Zeppelin 0.10.0 there is this strategic line:
public static final SparkVersion UNSUPPORTED_FUTURE_VERSION = SPARK_3_2_0;
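Until you are on a Zeppelin release that lifts that cap, the practical workaround is to point Zeppelin at a supported Spark, e.g. Spark 3.1.2, via SPARK_HOME in conf/zeppelin-env.sh (a sketch; the install path below is an assumption):
# conf/zeppelin-env.sh - assumed path to a Spark 3.1.2 install
export SPARK_HOME=/opt/spark-3.1.2-bin-hadoop3.2
Keep spark.master set to local[*] in the interpreter settings, then restart the interpreter.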

How to add Delta Lake support to Zeppelin's spark interpreter?

I'm trying to add Delta Lake support to Zeppelin.
So far I've tried adding the io.delta:delta-core_2.12:0.7.0 dependency to the Spark interpreter, as well as a couple of other related actions within the interpreters view... but nothing has worked.
When I add the io.delta:delta-core_2.12:0.7.0 dependency, I get errors within my notebooks such as:
org.apache.zeppelin.interpreter.InterpreterException: java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:76)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:668)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:577)
at org.apache.zeppelin.scheduler.Job.run(Job.java:172)
at org.apache.zeppelin.scheduler.AbstractScheduler.runJob(AbstractScheduler.java:130)
at org.apache.zeppelin.scheduler.FIFOScheduler.lambda$runJobInScheduler$0(FIFOScheduler.java:39)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NoSuchMethodError: scala.Predef$.refArrayOps([Ljava/lang/Object;)Lscala/collection/mutable/ArrayOps;
at org.apache.spark.util.Utils$.stringToSeq(Utils.scala:2664)
at org.apache.spark.internal.config.ConfigHelpers$.stringToSeq(ConfigBuilder.scala:49)
at org.apache.spark.internal.config.TypedConfigBuilder$$anonfun$toSequence$1.apply(ConfigBuilder.scala:125)
at org.apache.spark.internal.config.TypedConfigBuilder$$anonfun$toSequence$1.apply(ConfigBuilder.scala:125)
at org.apache.spark.internal.config.TypedConfigBuilder.createWithDefault(ConfigBuilder.scala:143)
at org.apache.spark.internal.config.package$.<init>(package.scala:172)
at org.apache.spark.internal.config.package$.<clinit>(package.scala)
at org.apache.spark.SparkConf$.<init>(SparkConf.scala:716)
at org.apache.spark.SparkConf$.<clinit>(SparkConf.scala)
at org.apache.spark.SparkConf.set(SparkConf.scala:95)
at org.apache.spark.SparkConf$$anonfun$loadFromSystemProperties$3.apply(SparkConf.scala:77)
at org.apache.spark.SparkConf$$anonfun$loadFromSystemProperties$3.apply(SparkConf.scala:76)
at scala.collection.TraversableLike$WithFilter.$anonfun$foreach$1(TraversableLike.scala:877)
at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:234)
at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:468)
at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:468)
at scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:876)
at org.apache.spark.SparkConf.loadFromSystemProperties(SparkConf.scala:76)
at org.apache.spark.SparkConf.<init>(SparkConf.scala:71)
at org.apache.spark.SparkConf.<init>(SparkConf.scala:58)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:80)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
... 8 more
My goal is to read/write from/to Delta Lake tables using Scala + Spark.
Thanks!
The most probable reason for this is that you're using Delta Lake with Spark 2.x - the package that you're using is supposed to work with Spark 3.0+ (it is compiled with Scala 2.12, while Spark 2.x is typically built with Scala 2.11). The latest version of Delta that supports Spark 2.4 (minimum 2.4.2) is 0.6.1 (see this answer).
So you need to upgrade your Spark version if you want to use this specific package, or use another version of Delta if you want to keep your Spark installation.
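For example, on a Scala 2.11 build of Spark 2.4.x you would try the matching Delta artifact via the Spark interpreter settings (a sketch; verify the coordinates match your Scala build):
spark.jars.packages   io.delta:delta-core_2.11:0.6.1
Once the interpreter restarts, a quick smoke test in a Scala paragraph (the path is just an example):
%spark
spark.range(5).write.format("delta").save("/tmp/delta-table")
val df = spark.read.format("delta").load("/tmp/delta-table")
df.show()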

Exception while running StreamingContext.start()

I get an exception while running Python code on Windows 10. I am using Apache Kafka and PySpark.
Python code snippet to read data from Kafka:
import sys

from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

sc = SparkContext(appName="KafkaStreamReader")  # context created here for a self-contained script
ssc = StreamingContext(sc, 60)  # 60-second batch interval
zkQuorum, topic = sys.argv[1:]
kvs = KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer", {topic: 1})
lines = kvs.map(lambda x: [x[0], x[1]])
lines.pprint()
lines.foreachRDD(SaveRecord)  # SaveRecord is defined elsewhere in the script
ssc.start()
ssc.awaitTermination()
Exception while running the code
Exception in thread "streaming-start" java.lang.NoClassDefFoundError: org/apache/spark/internal/Logging$class
at org.apache.spark.streaming.kafka.KafkaReceiver.<init>(KafkaInputDStream.scala:69)
at org.apache.spark.streaming.kafka.KafkaInputDStream.getReceiver(KafkaInputDStream.scala:60)
at org.apache.spark.streaming.scheduler.ReceiverTracker.$anonfun$launchReceivers$1(ReceiverTracker.scala:441)
at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)
at scala.collection.TraversableLike.map(TraversableLike.scala:237)
at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:198)
at org.apache.spark.streaming.scheduler.ReceiverTracker.launchReceivers(ReceiverTracker.scala:440)
at org.apache.spark.streaming.scheduler.ReceiverTracker.start(ReceiverTracker.scala:160)
at org.apache.spark.streaming.scheduler.JobScheduler.start(JobScheduler.scala:102)
at org.apache.spark.streaming.StreamingContext.$anonfun$start$1(StreamingContext.scala:583)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.ThreadUtils$$anon$1.run(ThreadUtils.scala:145)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.internal.Logging$class
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 16 more
This may be due to an incompatible Scala version. Make sure the Scala version in your project configuration matches the version your Spark build supports.
Spark 3 requires Scala 2.12; support for Scala 2.11 was removed in Spark 3.0.0.
It is also possible that a third-party jar (like dstream-twitter for a Twitter streaming application, or your Kafka streaming jar) is built for a Scala version your application does not support.
For instance, dstream-twitter_2.11-2.3.0-SNAPSHOT didn't work for me with Spark 3.0 (it gave Exception in thread "streaming-start" java.lang.NoClassDefFoundError: org/apache/spark/internal/Logging$class), but updating to a Scala 2.12 build of the dstream-twitter jar solved the issue.
Make sure all the Scala versions line up.
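Concretely, the Scala suffix of every Spark artifact has to match the Scala version of your Spark install. On a Spark 2.4 / Scala 2.11 setup, the Kafka receiver used above would be pulled in like this (a hedged example; match both version numbers to your own install, and the script name is a placeholder):
spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.4.0 your_script.py
Note that spark-streaming-kafka-0-8 (the module providing KafkaUtils.createStream) was, as far as I can tell, only published for Scala 2.11 and was removed in Spark 3.x, so on Spark 3 you would migrate to Structured Streaming's Kafka source instead.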

Scala Worksheet IntelliJ -- Internal Error

I'm having issues getting Scala worksheets to work in IntelliJ.
I've already reinstalled JDK 8 and installed the latest sbt version.
The settings in IntelliJ seem fine to me, but I might have missed something. Does anyone know how to resolve this?
Internal error: Could not initialize class com.sun.jna.platform.win32.WinBase$FILETIME
org.jetbrains.jps.incremental.scala.remote.ClientEventProcessor.process(ClientEventProcessor.scala:22)
org.jetbrains.jps.incremental.scala.remote.RemoteResourceOwner.handle(RemoteResourceOwner.scala:47)
org.jetbrains.jps.incremental.scala.remote.RemoteResourceOwner.handle$(RemoteResourceOwner.scala:37)
org.jetbrains.plugins.scala.compiler.RemoteServerRunner.handle(RemoteServerRunner.scala:16)
org.jetbrains.jps.incremental.scala.remote.RemoteResourceOwner.$anonfun$send$5(RemoteResourceOwner.scala:30)
org.jetbrains.jps.incremental.scala.remote.RemoteResourceOwner.$anonfun$send$5$adapted(RemoteResourceOwner.scala:29)
org.jetbrains.jps.incremental.scala.package$.using(package.scala:21)
org.jetbrains.jps.incremental.scala.remote.RemoteResourceOwner.$anonfun$send$3(RemoteResourceOwner.scala:29)
org.jetbrains.jps.incremental.scala.remote.RemoteResourceOwner.$anonfun$send$3$adapted(RemoteResourceOwner.scala:25)
org.jetbrains.jps.incremental.scala.package$.using(package.scala:21)
org.jetbrains.jps.incremental.scala.remote.RemoteResourceOwner.$anonfun$send$2(RemoteResourceOwner.scala:25)
org.jetbrains.jps.incremental.scala.remote.RemoteResourceOwner.$anonfun$send$2$adapted(RemoteResourceOwner.scala:24)
org.jetbrains.jps.incremental.scala.package$.using(package.scala:21)
org.jetbrains.jps.incremental.scala.remote.RemoteResourceOwner.send(RemoteResourceOwner.scala:24)
org.jetbrains.jps.incremental.scala.remote.RemoteResourceOwner.send$(RemoteResourceOwner.scala:22)
org.jetbrains.plugins.scala.compiler.RemoteServerRunner.send(RemoteServerRunner.scala:16)
org.jetbrains.plugins.scala.compiler.RemoteServerRunner$$anon$1.$anonfun$run$1(RemoteServerRunner.scala:36)
scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:156)
org.jetbrains.plugins.scala.compiler.RemoteServerRunner$$anon$1.run(RemoteServerRunner.scala:32)
org.jetbrains.plugins.scala.worksheet.server.RemoteServerConnector.compileAndRun(RemoteServerConnector.scala:111)
org.jetbrains.plugins.scala.worksheet.processor.WorksheetCompiler$$anon$3.run(WorksheetCompiler.scala:66)
com.intellij.compiler.progress.CompilerTask.run(CompilerTask.java:192)
com.intellij.openapi.progress.impl.CoreProgressManager$TaskRunnable.run(CoreProgressManager.java:750)
com.intellij.openapi.progress.impl.CoreProgressManager.lambda$runProcess$1(CoreProgressManager.java:157)
com.intellij.openapi.progress.impl.CoreProgressManager.registerIndicatorAndRun(CoreProgressManager.java:580)
com.intellij.openapi.progress.impl.CoreProgressManager.executeProcessUnderProgress(CoreProgressManager.java:525)
com.intellij.openapi.progress.impl.ProgressManagerImpl.executeProcessUnderProgress(ProgressManagerImpl.java:85)
com.intellij.openapi.progress.impl.CoreProgressManager.runProcess(CoreProgressManager.java:144)
com.intellij.openapi.progress.impl.CoreProgressManager$4.run(CoreProgressManager.java:395)
com.intellij.openapi.application.impl.ApplicationImpl$1.run(ApplicationImpl.java:305)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
java.lang.Thread.run(Thread.java:745)
IntelliJ settings: [screenshot omitted]

NoSuchMethod when trying to execute HelloWorld in Scala on Zeppelin with local JAR as dependency

I have a problem executing a local .jar on Zeppelin. I'm adding the dependency jar via this guide, but when I go to a notebook and try to execute
println("Hi")
I get the stack trace listed below:
java.lang.NoSuchMethodError: scala.collection.immutable.$colon$colon.hd$1()Ljava/lang/Object;
at scala.tools.nsc.settings.MutableSettings.loop$1(MutableSettings.scala:64)
at scala.tools.nsc.settings.MutableSettings.processArguments(MutableSettings.scala:91)
at org.apache.zeppelin.spark.SparkInterpreter.open(SparkInterpreter.java:706)
at org.apache.zeppelin.interpreter.LazyOpenInterpreter.open(LazyOpenInterpreter.java:70)
at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:491)
at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
When there are no dependencies in the interpreter, everything works fine.
I know this might be a Scala dependency issue, but I've tried different Scala versions and that didn't help.
Also, util.Properties.versionString in a Zeppelin notebook returns res1: String = version 2.11.8 - the same version as in my test .jar file.
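A common cause of this particular NoSuchMethodError is a fat jar that bundles its own (different) scala-library classes, which then shadow the interpreter's when the jar is added as a dependency. You can check for bundled Scala classes from any Scala REPL - a minimal sketch, with a hypothetical jar path:
import java.util.jar.JarFile
import scala.collection.JavaConverters._

// Hypothetical path - point this at the jar you add as a dependency
val jar = new JarFile("/path/to/your-test.jar")
// Count class files under scala/ - a fat jar that packages Scala itself will have many
val bundled = jar.entries.asScala.map(_.getName)
  .filter(n => n.startsWith("scala/") && n.endsWith(".class"))
  .toList
println(s"bundled Scala classes: ${bundled.size}")
If the count is non-zero, rebuild the jar with the Scala library (and any Spark artifacts) marked as provided so they aren't packaged.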