Run a simple spark code in Scala IDE - scala

I want to use Scala IDE and run Spark code on Windows 7. I have already installed Scala IDE and started by creating a Scala project. So I need to know:
Are there any instructions for running the following code in Scala IDE?
/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "D:/Spark_Installation/eclipse-ws/Scala/README.md" // Should be some file on your system
    val conf = new SparkConf().setAppName("Simple Application")
      .setMaster("spark://myhost:7077")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
When I run this code, I get the following errors:
15/03/26 11:59:55 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster#myhost:7077/user/Master...
15/03/26 11:59:58 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster#myhost:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster#myhost:7077
15/03/26 11:59:58 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster#myhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: myhost
15/03/26 12:00:15 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster#myhost:7077/user/Master...
15/03/26 12:00:17 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster#myhost:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster#myhost:7077
15/03/26 12:00:17 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster#myhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: myhost
15/03/26 12:00:35 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster#myhost:7077/user/Master...
15/03/26 12:00:37 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster#myhost:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster#myhost:7077
15/03/26 12:00:37 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster#myhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: myhost
15/03/26 12:00:55 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
15/03/26 12:00:55 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up.
15/03/26 12:00:55 WARN SparkDeploySchedulerBackend: Application ID is not initialized yet.

Do you have a Spark master set up? If not, have a look at this:
http://spark.apache.org/docs/1.2.1/submitting-applications.html#master-urls
You would most likely want to use
local[*]
which uses every core your local computer has, instead of:
spark://myhost:7077
The spark:// URL assumes you have a Spark master set up at myhost:7077.

If you are running locally rather than on a standalone cluster, use local[*], which means use all the cores your machine has. Your SparkConf object creation would then look like the following:
val conf = new SparkConf().setAppName("Simple Application").setMaster("local[*]")
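For reference, here is a minimal sketch of the whole SimpleApp from the question with the master switched to local[*] and an sc.stop() added at the end; the log file path is the one from the question and just needs to point at some text file on your machine.

/* SimpleApp.scala -- same program, but run against the local master */
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "D:/Spark_Installation/eclipse-ws/Scala/README.md" // any text file on your system
    val conf = new SparkConf()
      .setAppName("Simple Application")
      .setMaster("local[*]") // use all local cores instead of a remote master
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
    sc.stop() // release the context when done
  }
}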

Related

Remote Spark Connection - Scala: NullPointerException on registerBlockManager

Spark Master and Worker are both running on localhost. I have started the Master and Worker nodes with the command:
sbin/start-all.sh
Logs for Master node invocation:
Spark Command: /Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk/Contents/Home/jre/bin/java -cp /Users/gaurishi/spark/spark-2.3.1-bin-hadoop2.7/conf/:/Users/gaurishi/spark/spark-2.3.1-bin-hadoop2.7/jars/* -Xmx1g org.apache.spark.deploy.master.Master --host 186590dbe5bd.ant.abc.com --port 7077 --webui-port 8080
Logs for Worker node invocation:
Spark Command: /Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk/Contents/Home/jre/bin/java -cp /Users/gaurishi/spark/spark-2.3.1-bin-hadoop2.7/conf/:/Users/gaurishi/spark/spark-2.3.1-bin-hadoop2.7/jars/* -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 spark://186590dbe5bd.ant.abc.com:7077
I have following configuration in conf/spark-env.sh
SPARK_MASTER_HOST=186590dbe5bd.ant.abc.com
Content of /etc/hosts:
127.0.0.1 localhost
255.255.255.255 broadcasthost
::1 localhost
127.0.0.1 186590dbe5bd.ant.abc.com
The Scala code that I am running to establish the remote Spark connection:
val sparkConf = new SparkConf()
  .setAppName(AppConstants.AppName)
  .setMaster("spark://186590dbe5bd.ant.abc.com:7077")

val sparkSession = SparkSession.builder()
  .appName(AppConstants.AppName)
  .config(sparkConf)
  .enableHiveSupport()
  .getOrCreate()
While executing the code from the IDE, I get the following exception in the console:
2018-10-04 18:58:38,488 INFO [main] storage.BlockManagerMaster (Logging.scala:logInfo(54)) - Registering BlockManager BlockManagerId(driver, 192.168.0.38, 56083, None)
2018-10-04 18:58:38,491 ERROR [main] spark.SparkContext (Logging.scala:logError(91)) - Error initializing SparkContext.
java.lang.NullPointerException
at org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:64)
at org.apache.spark.storage.BlockManager.initialize(BlockManager.scala:227)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:518)
………
2018-10-04 18:58:38,496 INFO [main] spark.SparkContext (Logging.scala:logInfo(54)) - SparkContext already stopped.
2018-10-04 18:58:38,492 INFO [dispatcher-event-loop-3] scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint (Logging.scala:logInfo(54)) - OutputCommitCoordinator stopped!
Exception in thread "main" java.lang.NullPointerException
at org.apache.spark.storage.BlockManagerMaster.registerBlockManager(BlockManagerMaster.scala:64)
at org.apache.spark.storage.BlockManager.initialize(BlockManager.scala:227)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:518)
………
The logs from /logs/master show the following error:
18/10/04 18:58:18 ERROR TransportRequestHandler: Error while invoking RpcHandler#receive() for one-way message.
java.io.InvalidClassException: org.apache.spark.rpc.RpcEndpointRef; local class incompatible: stream classdesc serialVersionUID = 1835832137613908542, local class serialVersionUID = -1329125091869941550
at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1885)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1885)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2042)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
…………
Spark Versions:
Spark: spark-2.3.1-bin-hadoop2.7
Build dependencies:
Scala: 2.11
Spark-hive: 2.2.2
Maven-org-spark-project-hive hive-metastore = 1.x;
What changes should be made to connect to Spark remotely? Thanks
Complete Log:
Console.log
Spark Master-Node.log
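The serialVersionUID mismatch on org.apache.spark.rpc.RpcEndpointRef in the master log typically means the driver's Spark jars come from a different release than the spark-2.3.1-bin-hadoop2.7 build the master is running (the listed Spark-hive 2.2.2 dependency, for instance, does not match). Purely as an illustration, assuming an sbt build (the actual build file is not shown in the question), aligning every Spark dependency to the cluster version would look roughly like this:

// build.sbt (hypothetical sketch): pin all Spark artifacts to the cluster's version
scalaVersion := "2.11.12" // any 2.11.x patch release; the question only states 2.11

val sparkVersion = "2.3.1" // must match spark-2.3.1-bin-hadoop2.7 running on the master

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql"  % sparkVersion % "provided",
  "org.apache.spark" %% "spark-hive" % sparkVersion % "provided"
)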

Why did Worker kill executor?

I'm programming a Spark application on a Spark standalone cluster. When I run the following code, I get the ClassNotFoundException below (see the referenced screenshot), so I followed the worker's (192.168.111.202) log.
package main

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object mavenTest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("stream test").setMaster("spark://192.168.111.201:7077")
    val sc = new SparkContext(conf)
    val input = sc.textFile("file:///root/test")
    val words = input.flatMap { line => line.split(" ") }
    val counts = words.map(word => (word, 1)).reduceByKey { case (x, y) => x + y }
    counts.saveAsTextFile("file:///root/mapreduce")
  }
}
The following logs are the worker's logs. They show the worker killing the executor, after which an error occurs. Why did the worker kill the executor? Could you give me any clue?
16/03/24 20:16:48 INFO Worker: Asked to launch executor app-20160324201648-0011/0 for stream test
16/03/24 20:16:48 INFO SecurityManager: Changing view acls to: root
16/03/24 20:16:48 INFO SecurityManager: Changing modify acls to: root
16/03/24 20:16:48 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/03/24 20:16:48 INFO ExecutorRunner: Launch command: "/usr/java/jdk1.8.0_73/jre/bin/java" "-cp" "/opt/spark-1.5.2-bin-hadoop2.6/sbin/../conf/:/opt/spark-1.5.2-bin-hadoop2.6/lib/spark-assembly-1.5.2-hadoop2.6.0.jar:/opt/spark-1.5.2-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar:/opt/spark-1.5.2-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/opt/spark-1.5.2-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/etc/hadoop" "-Xms1024M" "-Xmx1024M" "-Dspark.driver.port=40243" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "akka.tcp://sparkDriver#192.168.111.201:40243/user/CoarseGrainedScheduler" "--executor-id" "0" "--hostname" "192.168.111.202" "--cores" "1" "--app-id" "app-20160324201648-0011" "--worker-url" "akka.tcp://sparkWorker#192.168.111.202:53363/user/Worker"
16/03/24 20:16:54 INFO Worker: Asked to kill executor app-20160324201648-0011/0
16/03/24 20:16:54 INFO ExecutorRunner: Runner thread for executor app-20160324201648-0011/0 interrupted
16/03/24 20:16:54 INFO ExecutorRunner: Killing process!
16/03/24 20:16:54 ERROR FileAppender: Error writing stream to file /opt/spark-1.5.2-bin-hadoop2.6/work/app-20160324201648-0011/0/stderr
java.io.IOException: Stream closed
at java.io.BufferedInputStream.getBufIfOpen(BufferedInputStream.java:170)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:283)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
at org.apache.spark.util.logging.FileAppender.appendStreamToFile(FileAppender.scala:70)
at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply$mcV$sp(FileAppender.scala:39)
at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
at org.apache.spark.util.logging.FileAppender$$anon$1$$anonfun$run$1.apply(FileAppender.scala:39)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
at org.apache.spark.util.logging.FileAppender$$anon$1.run(FileAppender.scala:38)
16/03/24 20:16:54 INFO Worker: Executor app-20160324201648-0011/0 finished with state KILLED exitStatus 143
16/03/24 20:16:54 INFO Worker: Cleaning up local directories for application app-20160324201648-0011
16/03/24 20:16:54 INFO ExternalShuffleBlockResolver: Application app-20160324201648-0011 removed, cleanupLocalDirs = true
I found it was a memory problem, but I don't understand why it happens. I just added the following property to the yarn-site.xml file. The Apache Hadoop documentation says this setting decides whether virtual memory limits will be enforced for containers.
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
What's your Spark version? This is a known Spark bug, fixed in version 1.6.
For more detail, see [SPARK-9844].

Using Scala IDE and Apache Spark on Windows

I want to start working on a project that uses Spark with Scala on Windows 7.
I downloaded the Apache Spark pre-built for Hadoop 2.4 (download page) and I can run it from the command prompt (cmd). I can run all of the code on the Spark quick start page up to the self-contained applications section.
Then I downloaded Scala IDE 4.0.0 from its download page (sorry, it's not possible to post more than 2 links).
I then created a new Scala project and imported the Spark assembly jar file into the project. When I try to run the example from the self-contained applications section of the Spark quick start page, I get the following errors:
15/03/26 11:59:55 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster#myhost:7077/user/Master...
15/03/26 11:59:58 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster#myhost:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster#myhost:7077
15/03/26 11:59:58 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster#myhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: myhost
15/03/26 12:00:15 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster#myhost:7077/user/Master...
15/03/26 12:00:17 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster#myhost:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster#myhost:7077
15/03/26 12:00:17 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster#myhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: myhost
15/03/26 12:00:35 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster#myhost:7077/user/Master...
15/03/26 12:00:37 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster#myhost:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster#myhost:7077
15/03/26 12:00:37 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster#myhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: myhost
15/03/26 12:00:55 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
15/03/26 12:00:55 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up.
15/03/26 12:00:55 WARN SparkDeploySchedulerBackend: Application ID is not initialized yet.
The only line of code that I added to the example is .setMaster("spark://myhost:7077") in the SparkConf definition. I think I need to configure Scala IDE to use the pre-built Spark on my computer, but I don't know how and I couldn't find anything by googling.
Could you help me get Scala IDE working with Spark on Windows 7?
Thanks in advance
I found the answer:
I should correct the master definition in my code as follows:
replace:
.setMaster("spark://myhost:7077")
with:
.setMaster("local[*]")
Hope that it helps you as well.

Monitoring a task in apache Spark

I start the Spark master using: ./sbin/start-master.sh
as described at:
http://spark.apache.org/docs/latest/spark-standalone.html
I then submit the Spark job:
sh ./bin/spark-submit \
--class simplespark.Driver \
--master spark://`localhost`:7077 \
C:\\Users\\Adrian\\workspace\\simplespark\\target\\simplespark-0.0.1-SNAPSHOT.jar
How can I run a simple app that demonstrates a parallel task running?
When I view http://localhost:4040/executors/ and http://localhost:8080/, there are no tasks running:
The .jar I'm running (simplespark-0.0.1-SNAPSHOT.jar) just contains a single Scala object :
package simplespark

import org.apache.spark.SparkContext

object Driver {
  def main(args: Array[String]) {
    val conf = new org.apache.spark.SparkConf()
      .setMaster("local")
      .setAppName("knn")
      .setSparkHome("C:\\spark-1.1.0-bin-hadoop2.4\\spark-1.1.0-bin-hadoop2.4")
      .set("spark.executor.memory", "2g");
    val sc = new SparkContext(conf);
    val l = List(1)
    sc.parallelize(l)
    while (true) {}
  }
}
Update: When I change --master spark://localhost:7077 \ to --master spark://Adrian-PC:7077 \
I can see an update on the Spark UI:
I have also updated Driver.scala to use the default context, as I'm not sure I set it correctly for submitting Spark jobs:
package simplespark

import org.apache.spark.SparkContext

object Driver {
  def main(args: Array[String]) {
    System.setProperty("spark.executor.memory", "2g")
    val sc = new SparkContext();
    val l = List(1)
    val c = sc.parallelize(List(2, 3, 5, 7)).count()
    println(c)
    sc.stop
  }
}
On the Spark console I receive the same message repeated multiple times:
14/12/26 20:08:32 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory
So it appears that the Spark job is not reaching the master?
Update 2: After I start the worker (thanks to Lomig Mégard's comment below) using:
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://Adrian-PC:7077
I receive error :
14/12/27 21:23:52 INFO SparkDeploySchedulerBackend: Executor app-20141227212351-0003/8 removed: java.io.IOException: Cannot run program "C:\cygdrive\c\spark-1.1.0-bin-hadoop2.4\spark-1.1.0-bin-hadoop2.4/bin/compute-classpath.cmd" (in directory "."): CreateProcess error=2, The system cannot find the file specified
14/12/27 21:23:52 INFO AppClient$ClientActor: Executor added: app-20141227212351-0003/9 on worker-20141227211411-Adrian-PC-58199 (Adrian-PC:58199) with 4 cores
14/12/27 21:23:52 INFO SparkDeploySchedulerBackend: Granted executor ID app-20141227212351-0003/9 on hostPort Adrian-PC:58199 with 4 cores, 2.0 GB RAM
14/12/27 21:23:52 INFO AppClient$ClientActor: Executor updated: app-20141227212351-0003/9 is now RUNNING
14/12/27 21:23:52 INFO AppClient$ClientActor: Executor updated: app-20141227212351-0003/9 is now FAILED (java.io.IOException: Cannot run program "C:\cygdrive\c\spark-1.1.0-bin-hadoop2.4\spark-1.1.0-bin-hadoop2.4/bin/compute-classpath.cmd" (in directory "."): CreateProcess error=2, The system cannot find the file specified)
14/12/27 21:23:52 INFO SparkDeploySchedulerBackend: Executor app-20141227212351-0003/9 removed: java.io.IOException: Cannot run program "C:\cygdrive\c\spark-1.1.0-bin-hadoop2.4\spark-1.1.0-bin-hadoop2.4/bin/compute-classpath.cmd" (in directory "."): CreateProcess error=2, The system cannot find the file specified
14/12/27 21:23:52 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: Master removed our application: FAILED
14/12/27 21:23:52 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: Master removed our application: FAILED
14/12/27 21:23:52 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (ParallelCollectionRDD[0] at parallelize at Driver.scala:14)
14/12/27 21:23:52 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
Java HotSpot(TM) Client VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
I'm running the scripts on Windows using Cygwin. To fix this error I copied the Spark installation to the Cygwin C:\ drive, but then I receive a new error:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Master removed our application: FAILED
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
Java HotSpot(TM) Client VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
You have to start the actual computation to see the job.
val c = sc.parallelize(List(2, 3, 5, 7)).count()
println(c)
Here, count is called an action; you need at least one action to begin a job. You can find the list of available actions in the Spark docs.
The other methods are called transformations. They are lazily executed.
Don't forget to stop the context at the end, instead of your infinite loop, with sc.stop().
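To make that distinction concrete, here is a minimal, self-contained sketch (the object name and the local[*] master are illustrative, not from the question): the map is a transformation and does nothing on its own; only count triggers a job that will show up in the UI.

package simplespark

import org.apache.spark.{SparkConf, SparkContext}

object ActionDemo {
  def main(args: Array[String]): Unit = {
    // local[*] keeps the sketch self-contained; on a cluster set the master to your spark:// URL instead
    val sc = new SparkContext(new SparkConf().setAppName("action-demo").setMaster("local[*]"))

    val squares = sc.parallelize(List(2, 3, 5, 7)).map(x => x * x) // transformation: lazy, no job yet
    val c = squares.count()                                        // action: this starts a job
    println(c)

    sc.stop() // stop the context instead of an infinite loop
  }
}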
Edit: For the updated question, you allocate more memory to the executor than there is available in the worker. The defaults should be fine for simple tests.
You also need to have a running worker linked to your master. See this doc to start it.
./sbin/start-master.sh
./bin/spark-class org.apache.spark.deploy.worker.Worker spark://IP:PORT

Spark 0.9.0: worker keeps dying in standalone mode when job fails

I am new to Spark. I am running Spark in standalone mode on my Mac. I bring up the master and the worker, and they both come up fine. The log file of the master looks like:
...
14/02/25 18:52:43 INFO Slf4jLogger: Slf4jLogger started
14/02/25 18:52:43 INFO Remoting: Starting remoting
14/02/25 18:52:43 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkMaster#Shirishs-MacBook-Pro.local:7077]
14/02/25 18:52:43 INFO Master: Starting Spark master at spark://Shirishs-MacBook-Pro.local:7077
14/02/25 18:52:43 INFO MasterWebUI: Started Master web UI at http://192.168.1.106:8080
14/02/25 18:52:43 INFO Master: I have been elected leader! New state: ALIVE
14/02/25 18:53:03 INFO Master: Registering worker Shirishs-MacBook-Pro.local:53956 with 4 cores, 15.0 GB RAM
The worker log looks like:
14/02/25 18:53:02 INFO Slf4jLogger: Slf4jLogger started
14/02/25 18:53:02 INFO Remoting: Starting remoting
14/02/25 18:53:02 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkWorker#192.168.1.106:53956]
14/02/25 18:53:02 INFO Worker: Starting Spark worker 192.168.1.106:53956 with 4 cores, 15.0 GB RAM
14/02/25 18:53:02 INFO Worker: Spark home: /Users/shirish_kumar/Developer/spark-0.9.0-incubating
14/02/25 18:53:02 INFO WorkerWebUI: Started Worker web UI at http://192.168.1.106:8081
14/02/25 18:53:02 INFO Worker: Connecting to master spark://Shirishs-MacBook-Pro.local:7077...
14/02/25 18:53:03 INFO Worker: Successfully registered with master spark://Shirishs-MacBook-Pro.local:7077
Now, when I submit a job, the job fails to execute (because of a class-not-found error), but the worker also dies. Here is the master log:
14/02/25 18:55:52 INFO Master: Driver submitted org.apache.spark.deploy.worker.DriverWrapper
14/02/25 18:55:52 INFO Master: Launching driver driver-20140225185552-0000 on worker worker-20140225185302-192.168.1.106-53956
14/02/25 18:55:55 INFO Master: Registering worker Shirishs-MacBook-Pro.local:53956 with 4 cores, 15.0 GB RAM
14/02/25 18:55:55 INFO Master: Attempted to re-register worker at same address: akka.tcp://sparkWorker#192.168.1.106:53956
14/02/25 18:55:55 WARN Master: Got heartbeat from unregistered worker worker-20140225185555-192.168.1.106-53956
14/02/25 18:55:57 INFO Master: akka.tcp://driverClient#192.168.1.106:53961 got disassociated, removing it.
14/02/25 18:55:57 INFO Master: akka.tcp://driverClient#192.168.1.106:53961 got disassociated, removing it.
14/02/25 18:55:57 INFO LocalActorRef: Message [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from Actor[akka://sparkMaster/deadLetters] to Actor[akka://sparkMaster/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkMaster%40192.168.1.106%3A53962-2#-21389169] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
14/02/25 18:55:57 INFO Master: akka.tcp://driverClient#192.168.1.106:53961 got disassociated, removing it.
14/02/25 18:55:57 ERROR EndpointWriter: AssociationError [akka.tcp://sparkMaster#Shirishs-MacBook-Pro.local:7077] -> [akka.tcp://driverClient#192.168.1.106:53961]: Error [Association failed with [akka.tcp://driverClient#192.168.1.106:53961]] [
akka.remote.EndpointAssociationException: Association failed with [akka.tcp://driverClient#192.168.1.106:53961]
Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: /192.168.1.106:53961
]
...
...
14/02/25 18:55:57 INFO Master: akka.tcp://driverClient#192.168.1.106:53961 got disassociated, removing it.
14/02/25 18:56:03 WARN Master: Got heartbeat from unregistered worker worker-20140225185555-192.168.1.106-53956
14/02/25 18:56:10 WARN Master: Got heartbeat from unregistered worker worker-20140225185555-192.168.1.106-53956
14/02/25 18:56:18 WARN Master: Got heartbeat from unregistered worker worker-20140225185555-192.168.1.106-53956
14/02/25 18:56:25 WARN Master: Got heartbeat from unregistered worker worker-20140225185555-192.168.1.106-53956
14/02/25 18:56:33 WARN Master: Got heartbeat from unregistered worker worker-20140225185555-192.168.1.106-53956
14/02/25 18:56:40 WARN Master: Got heartbeat from unregistered worker worker-20140225185555-192.168.1.106-53956
14/
The worker log looks like this:
14/02/25 18:55:52 INFO Worker: Asked to launch driver driver-20140225185552-0000
2014-02-25 18:55:52.534 java[11415:330b] Unable to load realm info from SCDynamicStore
14/02/25 18:55:52 INFO DriverRunner: Copying user jar file:/Users/shirish_kumar/Developer/spark_app/SimpleApp to /Users/shirish_kumar/Developer/spark-0.9.0-incubating/work/driver-20140225185552-0000/SimpleApp
14/02/25 18:55:53 INFO DriverRunner: Launch Command: "/Library/Java/JavaVirtualMachines/jdk1.7.0_40.jdk/Contents/Home/bin/java" "-cp" ":/Users/shirish_kumar/Developer/spark-0.9.0-incubating/work/driver-20140225185552-0000/SimpleApp:/Users/shirish_kumar/Developer/spark-0.9.0-incubating/conf:/Users/shirish_kumar/Developer/spark-0.9.0-incubating/assembly/target/scala-2.10/spark-assembly-0.9.0-incubating-hadoop1.0.4.jar" "-Xms512M" "-Xmx512M" "org.apache.spark.deploy.worker.DriverWrapper" "akka.tcp://sparkWorker#192.168.1.106:53956/user/Worker" "SimpleApp"
14/02/25 18:55:55 ERROR OneForOneStrategy: FAILED (of class scala.Enumeration$Val)
scala.MatchError: FAILED (of class scala.Enumeration$Val)
at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:277)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
14/02/25 18:55:55 INFO Worker: Starting Spark worker 192.168.1.106:53956 with 4 cores, 15.0 GB RAM
14/02/25 18:55:55 INFO Worker: Spark home: /Users/shirish_kumar/Developer/spark-0.9.0-incubating
14/02/25 18:55:55 INFO WorkerWebUI: Started Worker web UI at http://192.168.1.106:8081
14/02/25 18:55:55 INFO Worker: Connecting to master spark://Shirishs-MacBook-Pro.local:7077...
14/02/25 18:55:55 INFO Worker: Successfully registered with master spark://Shirishs-MacBook-Pro.local:7077
After this, the worker is shown as dead in the web UI.
My question is: has anyone encountered this problem? The worker should not die if a job fails.
Check your /Spark/work folder.
You can see the exact error for that particular driver.
For me it was a class-not-found exception. Just give the fully qualified class name for the application's main class (include the package name too).
Then clear out the work directory and launch your application again in standalone mode.
This will work!
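As an illustration (the package name below is hypothetical), if the main object is declared like this, the main class you hand to Spark must be the fully qualified name simplespark.SimpleApp, not just SimpleApp:

package simplespark // the package is part of the fully qualified class name

object SimpleApp {
  def main(args: Array[String]): Unit = {
    // application code goes here
  }
}

// main class to pass when submitting the job: simplespark.SimpleApp (not SimpleApp)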
You have to specify the path to your JAR files.
Pragmatically, you can do it this way:
sparkConf.set("spark.jars", "file:/myjar1, file:/myjarN")
This implies you have to first compile a JAR file.
You also have to link dependent JARs; there are multiple ways to automate this, but they are well beyond the scope of this question.
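A minimal sketch of that suggestion (the jar paths are placeholders; point them at the application jar you compiled plus any dependent jars):

import org.apache.spark.{SparkConf, SparkContext}

val sparkConf = new SparkConf()
  .setAppName("SimpleApp")
  .setMaster("spark://Shirishs-MacBook-Pro.local:7077")
  // placeholder paths: replace with your compiled application jar and its dependencies
  .set("spark.jars", "file:/path/to/SimpleApp.jar, file:/path/to/dependency.jar")

val sc = new SparkContext(sparkConf)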