I am running the official Akka sample: https://github.com/akka/akka-samples/tree/2.5/akka-sample-cluster-scala.
My OS is Linux Mint 19 with the latest kernel.
For the Worker Dial-in Example (Transformation Example), I cannot fully run the sample because there is not enough space in /dev/shm, even though I have more than 2 GB of available space.
The problem is that when I launch the first frontend node, it eats a few KB of space. When I launch the second one, it eats a few MB. When I launch the third one, it eats a few hundred MB. And I cannot launch the fourth one at all; it throws an error that brings the whole cluster down:
[info] Warning: space is running low in /dev/shm (tmpfs) threshold=167,772,160 usable=95,424,512
[info] Warning: space is running low in /dev/shm (tmpfs) threshold=167,772,160 usable=45,088,768
[info] [ERROR] [11/05/2018 21:03:56.156] [ClusterSystem-akka.actor.default-dispatcher-12] [akka://ClusterSystem@127.0.0.1:57246/] swallowing exception during message send
[info] io.aeron.exceptions.RegistrationException: IllegalStateException : Insufficient usable storage for new log of length=50335744 in /dev/shm (tmpfs)
[info] at io.aeron.ClientConductor.onError(ClientConductor.java:174)
[info] at io.aeron.DriverEventsAdapter.onMessage(DriverEventsAdapter.java:81)
[info] at org.agrona.concurrent.broadcast.CopyBroadcastReceiver.receive(CopyBroadcastReceiver.java:100)
[info] at io.aeron.DriverEventsAdapter.receive(DriverEventsAdapter.java:56)
[info] at io.aeron.ClientConductor.service(ClientConductor.java:660)
[info] at io.aeron.ClientConductor.awaitResponse(ClientConductor.java:696)
[info] at io.aeron.ClientConductor.addPublication(ClientConductor.java:371)
[info] at io.aeron.Aeron.addPublication(Aeron.java:259)
[info] at akka.remote.artery.aeron.AeronSink$$anon$1.<init>(AeronSink.scala:103)
[info] at akka.remote.artery.aeron.AeronSink.createLogicAndMaterializedValue(AeronSink.scala:100)
[info] at akka.stream.impl.GraphStageIsland.materializeAtomic(PhasedFusingActorMaterializer.scala:630)
[info] at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:450)
[info] at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:415)
[info] at akka.stream.impl.PhasedFusingActorMaterializer.materialize(PhasedFusingActorMaterializer.scala:406)
[info] at akka.stream.scaladsl.RunnableGraph.run(Flow.scala:588)
[info] at akka.remote.artery.Association.runOutboundOrdinaryMessagesStream(Association.scala:726)
[info] at akka.remote.artery.Association.runOutboundStreams(Association.scala:657)
[info] at akka.remote.artery.Association.associate(Association.scala:649)
[info] at akka.remote.artery.AssociationRegistry.association(Association.scala:989)
[info] at akka.remote.artery.ArteryTransport.association(ArteryTransport.scala:724)
[info] at akka.remote.artery.ArteryTransport.send(ArteryTransport.scala:710)
[info] at akka.remote.RemoteActorRef.$bang(RemoteActorRefProvider.scala:591)
[info] at akka.actor.ActorRef.tell(ActorRef.scala:124)
[info] at akka.actor.ActorSelection$.rec$1(ActorSelection.scala:265)
[info] at akka.actor.ActorSelection$.deliverSelection(ActorSelection.scala:269)
[info] at akka.actor.ActorSelection.tell(ActorSelection.scala:46)
[info] at akka.actor.ScalaActorSelection.$bang(ActorSelection.scala:280)
[info] at akka.actor.ScalaActorSelection.$bang$(ActorSelection.scala:280)
[info] at akka.actor.ActorSelection$$anon$1.$bang(ActorSelection.scala:198)
[info] at akka.cluster.ClusterCoreDaemon.gossipTo(ClusterDaemon.scala:1330)
[info] at akka.cluster.ClusterCoreDaemon.gossip(ClusterDaemon.scala:1047)
[info] at akka.cluster.ClusterCoreDaemon.gossipTick(ClusterDaemon.scala:1010)
[info] at akka.cluster.ClusterCoreDaemon$$anonfun$initialized$1.applyOrElse(ClusterDaemon.scala:496)
[info] at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
[info] at akka.actor.Actor.aroundReceive(Actor.scala:517)
[info] at akka.actor.Actor.aroundReceive$(Actor.scala:515)
[info] at akka.cluster.ClusterCoreDaemon.aroundReceive(ClusterDaemon.scala:295)
[info] at akka.actor.ActorCell.receiveMessage(ActorCell.scala:588)
[info] at akka.actor.ActorCell.invoke(ActorCell.scala:557)
[info] at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
[info] at akka.dispatch.Mailbox.run(Mailbox.scala:225)
[info] at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
[info] at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
[info] at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
[info] at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
[info] at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
It seems it is sending a huge message (48 MB+?) to every node.
So what is going on here? What is the root cause, and how can I fix it?
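For what it's worth, a minimal sketch of one common workaround, assuming the root cause is simply that Aeron's embedded media driver cannot fit its ~48 MB log buffers into a too-small tmpfs (the 2G size below is only an example):
# check how much space /dev/shm actually has
df -h /dev/shm
# temporarily enlarge it (it must be backed by enough RAM)
sudo mount -o remount,size=2G /dev/shm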
If you use sbt docker:publishLocal to build a Docker image from your Scala project, you will see lines like the following printed out:
[info] Packaging /home/user123/myUser/repos/my_job/target/scala-2.12/app_internal_2.12-0.1.jar ...
[info] Done packaging.
[info] Sending build context to Docker daemon 129.7MB
[info] Step 1/7 : FROM openjdk:11-jre
[info] ---> 8c8b7f0ab84c
[info] Step 2/7 : LABEL MAINTAINER="no_name@my.org"
[info] ---> Using cache
[info] ---> d5caf9a92999
[info] Step 3/7 : WORKDIR /opt/docker
[info] ---> Using cache
[info] ---> d887eeb10e8e
[info] Step 4/7 : ADD --chown=root:root opt /opt
[info] ---> 1b43a84a5e32
[info] Step 5/7 : USER root
[info] ---> Running in 282c7f7de8ad
[info] Removing intermediate container 282c7f7de8ad
[info] ---> 11fed4892683
[info] Step 6/7 : ENTRYPOINT ["/opt/docker/bin/my_job"]
[info] ---> Running in 1d297dd1e960
[info] Removing intermediate container 1d297dd1e960
[info] ---> 1923a8df3fcf
[info] Step 7/7 : CMD []
[info] ---> Running in 3d9f7a4a262b
[info] Removing intermediate container 3d9f7a4a262b
[info] ---> d67ed46fd3fe
[info] Successfully built d67ed46fd3fe
[info] Successfully tagged docker_app_internal:0.1
[info] Built image docker_app_internal with tags [0.1]
[success] Total time: 25 s, completed Mar 27, 2019 10:23:35 AM
You may get confused by the error. Here is why:
This works:
docker run -it --entrypoint=/bin/bash docker_app_internal:0.1 -i
This does not work:
docker run docker_app_internal:0.1
Thanks to @Muki for creating this helpful project.
See: https://github.com/sbt/sbt-native-packager
If your project root folder name differs from your main class name, the entrypoint generated by sbt docker:publishLocal becomes /your/linuxpath/bin/rootFolder, but the actual file created in the Docker image is /your/linuxpath/bin/main-class (if your main class is named MainClass).
To fix this, explicitly set the entrypoint in build.sbt as below:
dockerEntrypoint := Seq("/opt/docker/bin/main-class")
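For context, a minimal build.sbt sketch, assuming sbt-native-packager with its Docker plugin enabled (the path mirrors the example above; main-class is the hyphenated script name generated for a class called MainClass):
// build.sbt
enablePlugins(JavaAppPackaging, DockerPlugin)
// point the entrypoint at the script sbt-native-packager actually generates
dockerEntrypoint := Seq("/opt/docker/bin/main-class")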
I have started kafka-manager on a CentOS VM, and below are its logs.
[info] o.a.z.ZooKeeper - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
[info] o.a.z.ZooKeeper - Client environment:java.io.tmpdir=/tmp
[info] o.a.z.ZooKeeper - Client environment:java.compiler=<NA>
[info] o.a.z.ZooKeeper - Client environment:os.name=Linux
[info] o.a.z.ZooKeeper - Client environment:os.arch=amd64
[info] o.a.z.ZooKeeper - Client environment:os.version=3.10.0-862.el7.x86_64
[info] o.a.z.ZooKeeper - Client environment:user.name=root
[info] o.a.z.ZooKeeper - Client environment:user.home=/root
[info] o.a.z.ZooKeeper - Client environment:user.dir=/root/Confluent_kafka/kafka-manager-1.3.3.21
[info] o.a.z.ZooKeeper - Initiating client connection, connectString=localhost:3181 sessionTimeout=60000 watcher=org.apache.curator.ConnectionState@73687e45
[info] o.a.z.ClientCnxn - Opening socket connection to server localhost/127.0.0.1:3181. Will not attempt to authenticate using SASL (unknown error)
[info] o.a.z.ClientCnxn - Socket connection established to localhost/127.0.0.1:3181, initiating session
[info] k.m.a.KafkaManagerActor - zk=localhost:3181
[info] k.m.a.KafkaManagerActor - baseZkPath=/kafka-manager
[info] o.a.z.ClientCnxn - Session establishment complete on server localhost/127.0.0.1:3181, sessionid = 0x16565ff95660000, negotiated timeout = 60000
[info] k.m.a.KafkaManagerActor - Started actor akka://kafka-manager-system/user/kafka-manager
[info] k.m.a.KafkaManagerActor - Starting delete clusters path cache...
[info] k.m.a.DeleteClusterActor - Started actor akka://kafka-manager-system/user/kafka-manager/delete-cluster
[info] k.m.a.DeleteClusterActor - Starting delete clusters path cache...
[info] k.m.a.KafkaManagerActor - Starting kafka manager path cache...
[info] k.m.a.DeleteClusterActor - Adding kafka manager path cache listener...
[info] k.m.a.DeleteClusterActor - Scheduling updater for 10 seconds
[info] k.m.a.KafkaManagerActor - Adding kafka manager path cache listener...
[info] play.api.Play - Application started (Prod)
[info] p.c.s.NettyServer - Listening for HTTP on /0.0.0.0:9000
[info] k.m.a.KafkaManagerActor - Updating internal state...
[info] k.m.a.KafkaManagerActor - Updating internal state...
[info] k.m.a.KafkaManagerActor - Updating internal state...
[info] k.m.a.KafkaManagerActor - Shutting down kafka manager
The kafka-manager starts perfectly, but the web UI does not load at all.
IPv6 is disabled, and netstat shows this:
tcp 0 0 0.0.0.0:9000 0.0.0.0:* LISTEN 2670/java
Can someone help with this?
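One quick diagnostic, assuming you can run commands on the VM itself: if the request below succeeds locally but the UI is unreachable from a browser, a firewall (firewalld on CentOS 7) blocking port 9000 is a likely culprit.
curl -v http://127.0.0.1:9000/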
When I try to build Hive from source by following these commands:
$git clone https://git-wip-us.apache.org/repos/asf/hive.git
$cd hive
$mvn clean package -Pdist
I receive these errors after I run "mvn clean package -Pdist":
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.18.1:test (default-test) on project hive-exec: There are test failures.
[ERROR]
[ERROR] Please refer to /home/elbasir/hive/ql/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR] mvn <goals> -rf :hive-exec
Can anybody help solve this? I have been searching the internet for a week, but with no success so far.
Here is the failure I got:
Running org.apache.hadoop.hive.ql.parse.repl.load.message.PrimaryToReplicaResourceFunctionTest
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 4.573 sec <<< FAILURE! - in org.apache.hadoop.hive.ql.parse.repl.load.message.PrimaryToReplicaResourceFunctionTest
createDestinationPath(org.apache.hadoop.hive.ql.parse.repl.load.message.PrimaryToReplicaResourceFunctionTest) Time elapsed: 1.919 sec <<< FAILURE!
java.lang.AssertionError:
Expected: is "hdfs://somehost:9000/someBasePath/withADir/replicaDbName/somefunctionname/9223372036854775807/ab.jar"
but: was "hdfs://somehost:9000/someBasePath/withADir/replicadbname/somefunctionname/0/ab.jar"
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.junit.Assert.assertThat(Assert.java:865)
at org.junit.Assert.assertThat(Assert.java:832)
at org.apache.hadoop.hive.ql.parse.repl.load.message.PrimaryToReplicaResourceFunctionTest.createDestinationPath(PrimaryToReplicaResourceFunctionTest.java:100)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.internal.runners.TestMethod.invoke(TestMethod.java:68)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:316)
at org.junit.internal.runners.MethodRoadie$2.run(MethodRoadie.java:88)
at org.junit.internal.runners.MethodRoadie.runBeforesThenTestThenAfters(MethodRoadie.java:96)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.executeTest(PowerMockJUnit44RunnerDelegateImpl.java:300)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTestInSuper(PowerMockJUnit47RunnerDelegateImpl.java:131)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.access$100(PowerMockJUnit47RunnerDelegateImpl.java:59)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner$TestExecutorStatement.evaluate(PowerMockJUnit47RunnerDelegateImpl.java:147)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.evaluateStatement(PowerMockJUnit47RunnerDelegateImpl.java:107)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit47RunnerDelegateImpl$PowerMockJUnit47MethodRunner.executeTest(PowerMockJUnit47RunnerDelegateImpl.java:82)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$PowerMockJUnit44MethodRunner.runBeforesThenTestThenAfters(PowerMockJUnit44RunnerDelegateImpl.java:288)
at org.junit.internal.runners.MethodRoadie.runTest(MethodRoadie.java:86)
at org.junit.internal.runners.MethodRoadie.run(MethodRoadie.java:49)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.invokeTestMethod(PowerMockJUnit44RunnerDelegateImpl.java:208)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.runMethods(PowerMockJUnit44RunnerDelegateImpl.java:147)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl$1.run(PowerMockJUnit44RunnerDelegateImpl.java:121)
at org.junit.internal.runners.ClassRoadie.runUnprotected(ClassRoadie.java:33)
at org.junit.internal.runners.ClassRoadie.runProtected(ClassRoadie.java:45)
at org.powermock.modules.junit4.internal.impl.PowerMockJUnit44RunnerDelegateImpl.run(PowerMockJUnit44RunnerDelegateImpl.java:123)
at org.powermock.modules.junit4.common.internal.impl.JUnit4TestSuiteChunkerImpl.run(JUnit4TestSuiteChunkerImpl.java:121)
at org.powermock.modules.junit4.common.internal.impl.AbstractCommonPowerMockRunner.run(AbstractCommonPowerMockRunner.java:53)
at org.powermock.modules.junit4.PowerMockRunner.run(PowerMockRunner.java:59)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:283)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:173)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:128)
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
Results :
Failed tests:
PrimaryToReplicaResourceFunctionTest.createDestinationPath:100 Expected: is "hdfs://somehost:9000/someBasePath/withADir/replicaDbName/somefunctionname/9223372036854775807/ab.jar"
but: was "hdfs://somehost:9000/someBasePath/withADir/replicadbname/somefunctionname/0/ab.jar"
Tests run: 3305, Failures: 1, Errors: 0, Skipped: 14
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Hive ............................................... SUCCESS [ 3.223 s]
[INFO] Hive Shims Common .................................. SUCCESS [ 10.496 s]
[INFO] Hive Shims 0.23 .................................... SUCCESS [ 7.779 s]
[INFO] Hive Shims Scheduler ............................... SUCCESS [ 3.569 s]
[INFO] Hive Shims ......................................... SUCCESS [ 1.963 s]
[INFO] Hive Storage API ................................... SUCCESS [01:32 min]
[INFO] Hive Common ........................................ SUCCESS [01:07 min]
[INFO] Hive Service RPC ................................... SUCCESS [ 5.318 s]
[INFO] Hive Serde ......................................... SUCCESS [03:32 min]
[INFO] Hive Standalone Metastore .......................... SUCCESS [ 37.830 s]
[INFO] Hive Metastore ..................................... SUCCESS [02:29 min]
[INFO] Hive Vector-Code-Gen Utilities ..................... SUCCESS [ 0.298 s]
[INFO] Hive Llap Common ................................... SUCCESS [ 7.068 s]
[INFO] Hive Llap Client ................................... SUCCESS [ 22.578 s]
[INFO] Hive Llap Tez ...................................... SUCCESS [ 16.250 s]
[INFO] Spark Remote Client ................................ SUCCESS [ 26.667 s]
[INFO] Hive Query Language ................................ FAILURE [ 01:09 h]
[INFO] Hive Llap Server ................................... SKIPPED
[INFO] Hive Service ....................................... SKIPPED
[INFO] Hive Accumulo Handler .............................. SKIPPED
[INFO] Hive JDBC .......................................... SKIPPED
[INFO] Hive Beeline ....................................... SKIPPED
[INFO] Hive CLI ........................................... SKIPPED
[INFO] Hive Contrib ....................................... SKIPPED
[INFO] Hive Druid Handler ................................. SKIPPED
[INFO] Hive HBase Handler ................................. SKIPPED
[INFO] Hive JDBC Handler .................................. SKIPPED
[INFO] Hive HCatalog ...................................... SKIPPED
[INFO] Hive HCatalog Core ................................. SKIPPED
[INFO] Hive HCatalog Pig Adapter .......................... SKIPPED
[INFO] Hive HCatalog Server Extensions .................... SKIPPED
[INFO] Hive HCatalog Webhcat Java Client .................. SKIPPED
[INFO] Hive HCatalog Webhcat .............................. SKIPPED
[INFO] Hive HCatalog Streaming ............................ SKIPPED
[INFO] Hive HPL/SQL ....................................... SKIPPED
[INFO] Hive Llap External Client .......................... SKIPPED
[INFO] Hive Shims Aggregator .............................. SKIPPED
[INFO] Hive TestUtils ..................................... SKIPPED
[INFO] Hive Packaging ..................................... SKIPPED
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
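If the goal is only to produce the distribution, a common workaround (assuming you can do without the test run; the failure above looks environment-sensitive rather than a code problem) is to skip the tests:
$mvn clean package -Pdist -DskipTests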
I want to save and read a Spark DataFrame from AWS S3. I have googled a lot but found nothing of much use.
The code that I have written is like this:
val spark = SparkSession.builder().master("local").appName("test").getOrCreate()
spark.sparkContext.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", "**********")
spark.sparkContext.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", "********************")
import spark.implicits._
spark.read.textFile("s3n://myBucket/testFile").show(false)
List(1,2,3,4).toDF.write.parquet("s3n://myBucket/test/abc.parquet")
But when I run it I get the following error:
org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: S3 HEAD request failed for '/myBucket/testFile' - ResponseCode=403, ResponseMessage=Forbidden
[info] at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleServiceException(Jets3tNativeFileSystemStore.java:245)
[info] at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:119)
[info] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[info] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[info] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[info] at java.lang.reflect.Method.invoke(Method.java:498)
[info] at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186)
[info] at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
[info] at org.apache.hadoop.fs.s3native.$Proxy15.retrieveMetadata(Unknown Source)
[info] at org.apache.hadoop.fs.s3native.NativeS3FileSystem.getFileStatus(NativeS3FileSystem.java:414)
[info] ...
[info] Cause: org.jets3t.service.S3ServiceException: S3 HEAD request failed for '/myBucket/testFile' - ResponseCode=403, ResponseMessage=Forbidden
[info] at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:477)
[info] at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRestHead(RestS3Service.java:718)
[info] at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1599)
[info] at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectDetailsImpl(RestS3Service.java:1535)
[info] at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1987)
[info] at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1332)
[info] at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:111)
[info] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[info] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[info] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[info] ...
[info] Cause: org.jets3t.service.impl.rest.HttpException:
[info] at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:475)
[info] at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRestHead(RestS3Service.java:718)
[info] at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1599)
[info] at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectDetailsImpl(RestS3Service.java:1535)
[info] at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1987)
[info] at org.jets3t.service.S3Service.getObjectDetails(S3Service.java:1332)
[info] at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieveMetadata(Jets3tNativeFileSystemStore.java:111)
[info] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[info] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[info] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[info] ...
I am using
Spark: 2.1.0
Scala: 2.11.2
AWS Java SDK: 1.11.126
Any help is appreciated!
I have tried the following on Spark 2.1.1 and it worked fine for me.
Step 1: Download the following jars:
-- hadoop-aws-2.7.3.jar
-- aws-java-sdk-1.7.4.jar
Note:
If you are not able to find these jars, you can get them from the hadoop-2.7.3 distribution.
Step 2: Place the above jars into $SPARK_HOME/jars/
Step 3: Code:
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

val conf = new SparkConf().setMaster("local").setAppName("My App")
val sc = new SparkContext(conf)
// configure the s3a connector with your AWS credentials
sc.hadoopConfiguration.set("fs.s3a.access.key", "***********")
sc.hadoopConfiguration.set("fs.s3a.secret.key", "******************")
val input = sc.textFile("s3a://mybucket/*.txt")
// toDF needs a SparkSession and its implicits in scope
val spark = SparkSession.builder().config(conf).getOrCreate()
import spark.implicits._
List(1, 2, 3, 4).toDF.write.parquet("s3a://mybucket/abc.parquet")
Alternatively, set the secrets in the Spark conf itself with an option like "spark.hadoop.fs.s3n...", so that Spark propagates them around with the work.
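A minimal sketch of that approach, assuming the s3a connector (Spark copies any spark.hadoop.* option into the Hadoop configuration on every node; the environment variable names are illustrative):
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local")
  .appName("test")
  // credentials come from the environment instead of being hard-coded
  .config("spark.hadoop.fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
  .config("spark.hadoop.fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))
  .getOrCreate()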
After upgrading Spark from 1.6 to 2.1, running my tests results in many occurrences of the following error:
[info] java.lang.AssertionError: assertion failed: copyAndReset must return a zero value copy
[info] at scala.Predef$.assert(Predef.scala:170)
[info] at org.apache.spark.util.AccumulatorV2.writeReplace(AccumulatorV2.scala:163)
[info] at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
[info] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[info] at java.lang.reflect.Method.invoke(Method.java:497)
[info] at java.io.ObjectStreamClass.invokeWriteReplace(ObjectStreamClass.java:1075)
[info] at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1136)
[info] at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
[info] at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
[info] at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
[info] ...
Any ideas what's happening and how I can fix this?
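For context, this assertion is thrown by AccumulatorV2.writeReplace, which requires copyAndReset to return a copy whose isZero is true. If a custom accumulator is involved, a hedged sketch of one that satisfies the contract looks like this (the class and its internals are illustrative, not taken from the question):
import org.apache.spark.util.AccumulatorV2

class ListAccumulator[T] extends AccumulatorV2[T, List[T]] {
  private var buf: List[T] = Nil
  // must be true right after reset(), or the writeReplace assertion fails
  override def isZero: Boolean = buf.isEmpty
  override def copy(): ListAccumulator[T] = {
    val acc = new ListAccumulator[T]
    acc.buf = buf
    acc
  }
  override def reset(): Unit = buf = Nil
  override def add(v: T): Unit = buf = v :: buf
  override def merge(other: AccumulatorV2[T, List[T]]): Unit =
    buf = other.value ::: buf
  override def value: List[T] = buf.reverse
}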