I want to save and read a Spark DataFrame from AWS S3. I have googled a lot but found nothing of much use.
The code that I have written is as follows:
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local").appName("test").getOrCreate()
spark.sparkContext.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", "**********")
spark.sparkContext.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", "********************")
import spark.implicits._
spark.read.textFile("s3n://myBucket/testFile").show(false)
List(1,2,3,4).toDF.write.parquet("s3n://myBucket/test/abc.parquet")
But when I run it, I get the following error:
org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: S3 HEAD request failed for '/myBucket/testFile' - ResponseCode=403, ResponseMessage=Forbidden
[info] at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleServiceException(Jets3tNativeFileSystemStore.java:245)
[info] at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:119)
[info] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[info] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[info] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[info] at java.lang.reflect.Method.invoke(Method.java:498)
[info] at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
[info] at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
[info] at org.apache.hadoop.fs.s3native.$Proxy15.retrieveMetadata(Unknown Source)
[info] at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:414)
[info] ...
[info] Cause: org.jets3t.service.S3ServiceException: S3 HEAD request failed for '/myBucket/testFile' - ResponseCode=403, ResponseMessage=Forbidden
[info] at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:477)
[info] at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRestHead(RestS3Service.java:718)
[info] at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1599)
[info] at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectDetailsImpl(RestS3Service.java:1535)
[info] at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1987)
[info] at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1332)
[info] at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:111)
[info] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[info] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[info] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[info] ...
[info] Cause: org.jets3t.service.impl.rest.HttpException:
[info] at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:475)
[info] at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRestHead(RestS3Service.java:718)
[info] at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1599)
[info] at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectDetailsImpl(RestS3Service.java:1535)
[info] at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1987)
[info] at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1332)
[info] at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:111)
[info] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[info] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[info] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[info] ...
I am using:
Spark: 2.1.0
Scala: 2.11.2
AWS Java SDK: 1.11.126
Any help is appreciated!
I have tried the following on Spark version 2.1.1 and it worked fine for me.
Step 1: Download the following jars:
-- hadoop-aws-2.7.3.jar
-- aws-java-sdk-1.7.4.jar
Note:
If you are not able to find these jars, you can get them from the hadoop-2.7.3 distribution.
Step 2: Place the above jars into $SPARK_HOME/jars/
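Once the jars are in place, a quick sanity check from spark-shell (a sketch, not one of the original steps) is to load the connector class:
// Throws ClassNotFoundException if hadoop-aws is not on the classpath
Class.forName("org.apache.hadoop.fs.s3a.S3AFileSystem")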
Step 3: Code:
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

val conf = new SparkConf().setMaster("local").setAppName("My App")
val sc = new SparkContext(conf)
sc.hadoopConfiguration.set("fs.s3a.access.key", "***********")
sc.hadoopConfiguration.set("fs.s3a.secret.key", "******************")
val input = sc.textFile("s3a://mybucket/*.txt")
// toDF needs a SparkSession and its implicits, not just the SparkContext
val spark = SparkSession.builder().config(conf).getOrCreate()
import spark.implicits._
List(1, 2, 3, 4).toDF.write.parquet("s3a://mybucket/abc.parquet")
Alternatively, set the secrets in the Spark conf itself with an option like "spark.hadoop.fs.s3n...", so that Spark propagates them around with the work.
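For example, a minimal sketch of that approach using the s3a connector (the environment-variable names are an assumption; any source of the credentials works):
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local")
  .appName("test")
  // "spark.hadoop."-prefixed options are copied into the Hadoop configuration,
  // so the credentials travel to the executors along with the job
  .config("spark.hadoop.fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
  .config("spark.hadoop.fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))
  .getOrCreate()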
Related
I am running the example provided by the official Akka samples: https://github.com/akka/akka-samples/tree/2.5/akka-sample-cluster-scala.
My OS is Linux Mint 19 with the latest kernel.
For the Worker Dial-in Example (the Transformation example), I cannot fully run it because there is not enough space in /dev/shm, although I have more than 2GB of space available.
The problem is that when I launch the first frontend node, it eats some KBs of space. When I launch the second one, it eats some MBs. When I launch the third one, it eats some hundreds of MBs. I cannot even launch the fourth one; it throws an error that brings the whole cluster down:
[info] Warning: space is running low in /dev/shm (tmpfs) threshold=167,772,160 usable=95,424,512
[info] Warning: space is running low in /dev/shm (tmpfs) threshold=167,772,160 usable=45,088,768
[info] [ERROR] [11/05/2018 21:03:56.156] [ClusterSystem-akka.actor.default-dispatcher-12] [akka://ClusterSystem#127.0.0.1:57246/] swallowing exception during message send
[info] io.aeron.exceptions.RegistrationException: IllegalStateException : Insufficient usable storage for new log of length=50335744 in /dev/shm (tmpfs)
[info] at io.aeron.ClientConductor.onError(ClientConductor.java:174)
[info] at io.aeron.DriverEventsAdapter.onMessage(DriverEventsAdapter.java:81)
[info] at org.agrona.concurrent.broadcast.CopyBroadcastReceiver.receive(CopyBroadcastReceiver.java:100)
[info] at io.aeron.DriverEventsAdapter.receive(DriverEventsAdapter.java:56)
[info] at io.aeron.ClientConductor.service(ClientConductor.java:660)
[info] at io.aeron.ClientConductor.awaitResponse(ClientConductor.java:696)
[info] at io.aeron.ClientConductor.addPublication(ClientConductor.java:371)
[info] at io.aeron.Aeron.addPublication(Aeron.java:259)
[info] at akka.remote.artery.aeron.AeronSink$$anon$1.<init>(AeronSink.scala:103)
[info] at akka.remote.artery.aeron.AeronSink.createLogicAndMaterializedValue(AeronSink.scala:100)
[info] at akka.stream.impl.GraphStageIsland.materializeAtomic(PhasedFusingActorMaterializer.scala:630)
[info] at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:450)
[info] at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:415)
[info] at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:406)
[info] at akka.stream.scaladsl.RunnableGraph.run(Flow.scala:588)
[info] at akka.remote.artery.Association.runOutboundOrdinaryMessagesStream(Association.scala:726)
[info] at akka.remote.artery.Association.runOutboundStreams(Association.scala:657)
[info] at akka.remote.artery.Association.associate(Association.scala:649)
[info] at akka.remote.artery.AssociationRegistry.association(Association.scala:989)
[info] at akka.remote.artery.ArteryTransport.association(ArteryTransport.scala:724)
[info] at akka.remote.artery.ArteryTransport.send(ArteryTransport.scala:710)
[info] at akka.remote.RemoteActorRef.$bang(RemoteActorRefProvider.scala:591)
[info] at akka.actor.ActorRef.tell(ActorRef.scala:124)
[info] at akka.actor.ActorSelection$.rec$1(ActorSelection.scala:265)
[info] at akka.actor.ActorSelection$.deliverSelection(ActorSelection.scala:269)
[info] at akka.actor.ActorSelection.tell(ActorSelection.scala:46)
[info] at akka.actor.ScalaActorSelection.$bang(ActorSelection.scala:280)
[info] at akka.actor.ScalaActorSelection.$bang$(ActorSelection.scala:280)
[info] at akka.actor.ActorSelection$$anon$1.$bang(ActorSelection.scala:198)
[info] at akka.cluster.ClusterCoreDaemon.gossipTo(ClusterDaemon.scala:1330)
[info] at akka.cluster.ClusterCoreDaemon.gossip(ClusterDaemon.scala:1047)
[info] at akka.cluster.ClusterCoreDaemon.gossipTick(ClusterDaemon.scala:1010)
[info] at akka.cluster.ClusterCoreDaemon$$anonfun$initialized$1.applyOrElse(ClusterDaemon.scala:496)
[info] at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
[info] at akka.actor.Actor.aroundReceive(Actor.scala:517)
[info] at akka.actor.Actor.aroundReceive$(Actor.scala:515)
[info] at akka.cluster.ClusterCoreDaemon.aroundReceive(ClusterDaemon.scala:295)
[info] at akka.actor.ActorCell.receiveMessage(ActorCell.scala:588)
[info] at akka.actor.ActorCell.invoke(ActorCell.scala:557)
[info] at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
[info] at akka.dispatch.Mailbox.run(Mailbox.scala:225)
[info] at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
[info] at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
[info] at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
[info] at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
[info] at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
It seems it is sending a huge message (48MB+?) to every node.
So what is going on here? What is the root cause, and how should I fix it?
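One possible mitigation, as a sketch: point Artery's embedded Aeron media driver at a directory with more space than /dev/shm (assuming Akka 2.5's akka.remote.artery.advanced.aeron-dir setting; this trades some latency for room):
import akka.actor.ActorSystem
import com.typesafe.config.ConfigFactory

// Assumption: aeron-dir controls where Aeron allocates its log buffers
val config = ConfigFactory.parseString(
  """akka.remote.artery.advanced.aeron-dir = "/var/tmp/aeron""""
).withFallback(ConfigFactory.load())
val system = ActorSystem("ClusterSystem", config)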
When I try to build Hive from source by following these commands:
$git clone https://git-wip-us.apache.org/repos/asf/hive.git
$cd hive
$mvn clean package -Pdist
I receive these errors after running "mvn clean package -Pdist":
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test (default-test) on project hive-exec: There are test failures.
[ERROR]
[ERROR] Please refer to /home/elbasir/hive/ql/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :hive-exec
Can anybody help solve this? I have been searching the internet for a week but with no success so far.
Here is the failure I got:
Running org.apache.hadoop.hive.ql.parse.repl.load.message.PrimaryToReplicaResourceFunctionTest
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.573 sec <<< FAILURE! - in
org.apache.hadoop.hive.ql.parse.repl.load.message.PrimaryToReplicaResourceFunctionTest
createDestinationPath(org.apache.hadoop.hive.ql.parse.repl.load.message.PrimaryToReplicaResourceFunctionTest) Time elapsed: 1.919 sec <<< FAILURE!
java.lang.AssertionError:
Expected: is "hdfs://somehost:9000/someBasePath/withADir/replicaDbName/somefunctionname/9223372036854775807/ab.jar"
but: was "hdfs://somehost:9000/someBasePath/withADir/replicadbname/somefunctionname/0/ab.jar" at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.junit.Assert.assertThat(Assert.java:865)
at org.junit.Assert.assertThat(Assert.java:832)
at org.apache.hadoop.hive.ql.parse.repl.load.message.PrimaryToReplicaResourceFunctionTest.createDestinationPath(PrimaryToReplicaResourceFunctionTest.java:100)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.internal.runners.TestMethod.invoke(TestMethod.java:68)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:316)
at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:88)
at org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:96)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.executeTest(PowerMockJUnit44RunnerDelegateImpl.java:300)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTestInSuper(PowerMockJUnit47RunnerDelegateImpl.java:131)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.access$100(PowerMockJUnit47RunnerDelegateImpl.java:59)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner$TestExecutorStatement.evaluate(PowerMockJUnit47RunnerDelegateImpl.java:147)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.evaluateStatement(PowerMockJUnit47RunnerDelegateImpl.java:107)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTest(PowerMockJUnit47RunnerDelegateImpl.java:82)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runBeforesThenTestThenAfters(PowerMockJUnit44RunnerDelegateImpl.java:288)
at org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:86)
at org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:49)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.invokeTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:208)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.runMethods(PowerMockJUnit44RunnerDelegateImpl.java:147)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$1.run(PowerMockJUnit44RunnerDelegateImpl.java:121)
at org.junit.internal.runners.ClassRoadie.runUnprotected(ClassRoadie.java:33)
at org.junit.internal.runners.ClassRoadie.runProtected(ClassRoadie.java:45)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.run(PowerMockJUnit44RunnerDelegateImpl.java:123)
at org.powermock.modules.junit4.common.internal.impl.JUnit4TestSuiteChunkerImpl.run(JUnit4TestSuiteChunkerImpl.java:121)
at org.powermock.modules.junit4.common.internal.impl.AbstractCommonPowerMockRunner.run(AbstractCommonPowerMockRunner.java:53)
at org.powermock.modules.junit4.PowerMockRunner.run(PowerMockRunner.java:59)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
Results :
Failed tests:
PrimaryToReplicaResourceFunctionTest.createDestinationPath:100 Expected: is "hdfs://somehost:9000/someBasePath/withADir/replicaDbName/somefunctionname/9223372036854775807/ab.jar"
but: was "hdfs://somehost:9000/someBasePath/withADir/replicadbname/somefunctionname/0/ab.jar"
Tests run: 3305, Failures: 1, Errors: 0, Skipped: 14
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Hive ............................................... SUCCESS [ 3.223 s]
[INFO] Hive Shims Common .................................. SUCCESS [ 10.496 s]
[INFO] Hive Shims 0.23 .................................... SUCCESS [ 7.779 s]
[INFO] Hive Shims Scheduler ............................... SUCCESS [ 3.569 s]
[INFO] Hive Shims ......................................... SUCCESS [ 1.963 s]
[INFO] Hive Storage API ................................... SUCCESS [01:32 min]
[INFO] Hive Common ........................................ SUCCESS [01:07 min]
[INFO] Hive Service RPC ................................... SUCCESS [ 5.318 s]
[INFO] Hive Serde ......................................... SUCCESS [03:32 min]
[INFO] Hive Standalone Metastore .......................... SUCCESS [ 37.830 s]
[INFO] Hive Metastore ..................................... SUCCESS [02:29 min]
[INFO] Hive Vector-Code-Gen Utilities ..................... SUCCESS [ 0.298 s]
[INFO] Hive Llap Common ................................... SUCCESS [ 7.068 s]
[INFO] Hive Llap Client ................................... SUCCESS [ 22.578 s]
[INFO] Hive Llap Tez ...................................... SUCCESS [ 16.250 s]
[INFO] Spark Remote Client ................................ SUCCESS [ 26.667 s]
[INFO] Hive Query Language ................................ FAILURE [ 01:09 h]
[INFO] Hive Llap Server ................................... SKIPPED
[INFO] Hive Service ....................................... SKIPPED
[INFO] Hive Accumulo Handler .............................. SKIPPED
[INFO] Hive JDBC .......................................... SKIPPED
[INFO] Hive Beeline ....................................... SKIPPED
[INFO] Hive CLI ........................................... SKIPPED
[INFO] Hive Contrib ....................................... SKIPPED
[INFO] Hive Druid Handler ................................. SKIPPED
[INFO] Hive HBase Handler ................................. SKIPPED
[INFO] Hive JDBC Handler .................................. SKIPPED
[INFO] Hive HCatalog ...................................... SKIPPED
[INFO] Hive HCatalog Core ................................. SKIPPED
[INFO] Hive HCatalog Pig Adapter .......................... SKIPPED
[INFO] Hive HCatalog Server Extensions .................... SKIPPED
[INFO] Hive HCatalog Webhcat Java Client .................. SKIPPED
[INFO] Hive HCatalog Webhcat .............................. SKIPPED
[INFO] Hive HCatalog Streaming ............................ SKIPPED
[INFO] Hive HPL/SQL ....................................... SKIPPED
[INFO] Hive Llap External Client .......................... SKIPPED
[INFO] Hive Shims Aggregator .............................. SKIPPED
[INFO] Hive TestUtils ..................................... SKIPPED
[INFO] Hive Packaging ..................................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
After switching the Spark framework from 1.6 to 2.1, running my tests results in many occurrences of the following error:
[info] java.lang.AssertionError: assertion failed: copyAndReset must return a zero value copy
[info] at scala.Predef$.assert(Predef.scala:170)
[info] at org.apache.spark.util.AccumulatorV2.writeReplace(AccumulatorV2.scala:163)
[info] at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
[info] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[info] at java.lang.reflect.Method.invoke(Method.java:497)
[info] at java.io.ObjectStreamClass.invokeWriteReplace(ObjectStreamClass.java:1075)
[info] at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1136)
[info] at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
[info] at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
[info] at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
[info] ...
Any ideas what's happening and how I can fix this?
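In Spark 2.x, accumulators implement AccumulatorV2, and copyAndReset (by default, copy() followed by reset() on the copy) must produce an accumulator whose isZero is true; the assertion fires when it doesn't. A minimal sketch of an implementation that satisfies the contract (the class and its types are illustrative, not from the original tests):
import org.apache.spark.util.AccumulatorV2

// reset() and isZero agree on what "zero" means, so the default
// copyAndReset returns a zero-valued copy and the assertion passes
class LongListAccumulator extends AccumulatorV2[Long, List[Long]] {
  private var list: List[Long] = Nil
  override def isZero: Boolean = list.isEmpty
  override def copy(): LongListAccumulator = {
    val acc = new LongListAccumulator
    acc.list = list
    acc
  }
  override def reset(): Unit = list = Nil
  override def add(v: Long): Unit = list = v :: list
  override def merge(other: AccumulatorV2[Long, List[Long]]): Unit =
    list = list ::: other.value
  override def value: List[Long] = list
}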
I want to run the Spark Streaming example DirectKafkaWordCount.
This is my directory structure:
root#sandbox:/usr/local/spark/test# find
.
./src
./src/main
./src/main/scala
./src/main/scala/DirectKafkaWordCount.scala
./simple.sbt
sbt package completes and everything is OK:
........
[info] Done updating.
[info] Compiling 1 Scala source to /usr/local/spark-1.6.0-bin-hadoop2.6/test/target/scala-2.10.5/classes...
[info] Packaging /usr/local/spark-1.6.0-bin-hadoop2.6/test/target/scala-2.10.5/direct-kafka-word-count_2.10.5-1.0.jar ...
[info] Done packaging.
[success] Total time: 60 s, completed May 12, 2016 1:34:04 AM
but an error occurs when I run spark-submit:
root#sandbox:/usr/local/spark# bin/spark-submit --class DirectKafkaWordCount --master local[4] test/target/scala-2.10.5/direct-kafka-word-count_2.10.5-1.0.jar
java.lang.ClassNotFoundException: DirectKafkaWordCount
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:278)
at org.apache.spark.util.Utils$.classForName(Utils.scala:174)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:689)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I am new to Spark; I hope someone can help me.
Actually, I think you should include the fully qualified name of your class (its package path, not just the class name) when you run the spark-submit command, like:
spark-submit --class com.balabala.spark.DirectKafkaWordCount --master local[4] test/target/scala-2.10.5/direct-kafka-word-count_2.10.5-1.0.jar
Then Spark should be able to find your class.
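That assumes the source file itself declares the matching package; a minimal sketch (the package name com.balabala.spark is taken from the command above and is illustrative):
// DirectKafkaWordCount.scala -- the package clause must match the
// fully qualified name passed to --class
package com.balabala.spark

object DirectKafkaWordCount {
  def main(args: Array[String]): Unit = {
    println("resolved as: " + getClass.getName)
  }
}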
I'm playing a bit with Scala and the Play Framework, but when I try to run my test project using Activator/sbt on a Mac, I keep getting:
[info] Caused by: java.security.PrivilegedActionException: null
[info] at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_60]
[info] at play.runsupport.Reloader$.play$runsupport$Reloader$$withReloaderContextClassLoader(Reloader.scala:39) ~[na:na]
[info] at play.runsupport.Reloader.reload(Reloader.scala:321) ~[na:na]
[info] at play.core.server.DevServerStart$$anonfun$mainDev$1$$anon$1$$anonfun$get$1.apply(DevServerStart.scala:113) ~[play-server_2.11-2.4.3.jar:2.4.3]
[info] at play.core.server.DevServerStart$$anonfun$mainDev$1$$anon$1$$anonfun$get$1.apply(DevServerStart.scala:111) ~[play-server_2.11-2.4.3.jar:2.4.3]
[info] at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) ~[scala-library-2.11.7.jar:na]
[info] at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) ~[scala-library-2.11.7.jar:na]
[info] at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:40) ~[akka-actor_2.11-2.3.13.jar:na]
[info] at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:397) ~[akka-actor_2.11-2.3.13.jar:na]
[info] at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) ~[scala-library-2.11.7.jar:na]
[info] at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) ~[scala-library-2.11.7.jar:na]
[info] at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) ~[scala-library-2.11.7.jar:na]
[info] at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) ~[scala-library-2.11.7.jar:na]
[info] Caused by: java.util.concurrent.TimeoutException: Futures timed out after [300000 milliseconds]
[info] at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219) ~[scala-library-2.11.7.jar:na]
[info] at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223) ~[scala-library-2.11.7.jar:na]
[info] at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190) ~[scala-library-2.11.7.jar:na]
[info] at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53) ~[scala-library-2.11.7.jar:na]
[info] at scala.concurrent.Await$.result(package.scala:190) ~[scala-library-2.11.7.jar:na]
[info] at play.forkrun.ForkRun$$anonfun$askForReload$1.apply(ForkRun.scala:127) ~[na:na]
[info] at play.forkrun.ForkRun$$anonfun$askForReload$1.apply(ForkRun.scala:125) ~[na:na]
[info] at play.runsupport.Reloader$$anonfun$reload$1.apply(Reloader.scala:323) ~[na:na]
[info] at play.runsupport.Reloader$$anon$3.run(Reloader.scala:43) ~[na:na]
I'm not really sure what I am doing wrong.
I had the same problem and found the solution here: https://support.pivotal.io/hc/en-us/articles/201821186-Namenode-fails-to-start-with-error-jurisdiction-policy-files-are-not-signed-by-a-trusted-signer-
I already had the Java 8 JDK installed on my machine, so I had to download the JCE policy files from this website:
http://www.oracle.com/technetwork/java/javase/downloads/jce8-download-2133166.html
Then I just pasted the files into the location given in the "readme.txt".
The location depends on the OS that you are using.
And everything works fine :)
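A quick way to verify that the unlimited-strength policy files were picked up (a sketch, assuming Java 8, where the default policy caps AES at 128 bits):
import javax.crypto.Cipher

// Prints 2147483647 once the unlimited JCE policy is active, 128 otherwise
println(Cipher.getMaxAllowedKeyLength("AES"))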