Getting errors while I'm connecting to Hbase from Spark using Amazon EMR? - scala

I'm trying to connect Hbase tables from Spark using Amazon EMR. I'm using the below versions of the drivers.
Hbase : 1.1.2.2.3.4.0-3485
Phoenix driver : 4.2.0.2.2.0.0-2041
When i'm running my fat jar on EMR getting below errors. I tried to resolve but got struck.
java.util.concurrent.ExecutionException: java.lang.IllegalAccessError: class com.google.protobuf.HBaseZeroCopyByteString cannot access its superclass com.google.protobuf.LiteralByteString
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1658)
at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1613)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:924)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.getTable(ConnectionQueryServicesImpl.java:1168)
at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:349)
org.apache.phoenix.compile.FromCompiler$SingleTableColumnResolver.<init>(FromCompiler.java:215)
at org.apache.phoenix.compile.FromCompiler.getResolverForQuery(FromCompiler.java:159)
at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:304)
at org.apache.phoenix.jdbc.PhoenixStatement$ExecutableSelectStatement.compilePlan(PhoenixStatement.java:294)
at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:215)
at org.apache.phoenix.jdbc.PhoenixStatement$1.call(PhoenixStatement.java:211)
at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
at org.apache.phoenix.jdbc.PhoenixStatement.executeQuery(PhoenixStatement.java:210)
at org.apache.phoenix.jdbc.PhoenixPreparedStatement.executeQuery(PhoenixPreparedStatement.java:183)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:127)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:117)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:53)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:345)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:122)
at spid_part1$.main(spid_part1.scala:71)
at spid_part1.main(spid_part1.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.IllegalAccessError: class com.google.protobuf.HBaseZeroCopyByteString cannot access its superclass com.google.protobuf.LiteralByteString
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$5.call(ConnectionQueryServicesImpl.java:1176)
at org.apache.phoenix.query.ConnectionQueryServicesImpl$5.call(ConnectionQueryServicesImpl.java:1169)
at org.apache.hadoop.hbase.client.HTable$16.call(HTable.java:1646)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Any help on this, Thanks

Related

Scala Compilter Interface Error - Build does not generate - IntelliJ

I'm trying to build a calculator in Scala without using Control Flow elements and I am getting this error, maybe a possible solution?
java.lang.ClassNotFoundException: xsbt.CompilerInterface
at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at sbt.internal.inc.AnalyzingCompiler.getBridgeClass(AnalyzingCompiler.scala:372)
at sbt.internal.inc.AnalyzingCompiler.bridgeInstance(AnalyzingCompiler.scala:321)
at sbt.internal.inc.AnalyzingCompiler.compile(AnalyzingCompiler.scala:95)
at org.jetbrains.jps.incremental.scala.local.IdeaIncrementalCompiler.compile(IdeaIncrementalCompiler.scala:57)
at org.jetbrains.jps.incremental.scala.local.LocalServer.doCompile(LocalServer.scala:52)
at org.jetbrains.jps.incremental.scala.local.LocalServer.compile(LocalServer.scala:30)
at org.jetbrains.jps.incremental.scala.remote.Main$.compileLogic(Main.scala:207)
at org.jetbrains.jps.incremental.scala.remote.Main$.$anonfun$handleCommand$1(Main.scala:190)
at org.jetbrains.jps.incremental.scala.remote.Main$.decorated$1(Main.scala:180)
at org.jetbrains.jps.incremental.scala.remote.Main$.handleCommand(Main.scala:187)
at org.jetbrains.jps.incremental.scala.remote.Main$.serverLogic(Main.scala:163)
at org.jetbrains.jps.incremental.scala.remote.Main$.nailMain(Main.scala:103)
at org.jetbrains.jps.incremental.scala.remote.Main.nailMain(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.martiansoftware.nailgun.NGSession.run(NGSession.java:319)

How to use s3a with Apache spark 2.2(hadoop 2.8) in the Spark Submit?

I am trying to access the S3 data from spark using the spark 2.2.0 built using hadoop 2.8 version, I am using the /jars/hadoop-aws-2.8.3.jar, /jars/aws-java-sdk-s3-1.10.6.jar and /jars/aws-java-sdk-core-1.10.6.jar in the classpath
I get the following exception
java.lang.NoClassDefFoundError: org/apache/hadoop/fs/StorageStatistics
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.hadoop.conf.Configuration.getClassByNameOrNull(Configuration.java:2134)
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2099)
at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2654)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2667)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.spark.sql.execution.datasources.DataSource.hasMetadata(DataSource.scala:301)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:344)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:441)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.StorageStatistics
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 27 more
Then I added the hadoop-common jar to the classpath from spark installation directory /sparkinstallation/jars/hadoop-common-2.8.3.jar, now I get the following error:
java.lang.IllegalAccessError: tried to access method org.apache.hadoop.metrics2.lib.MutableCounterLong.<init>(Lorg/apache/hadoop/metrics2/MetricsInfo;J)V from class org.apache.hadoop.fs.s3a.S3AInstrumentation
at org.apache.hadoop.fs.s3a.S3AInstrumentation.streamCounter(S3AInstrumentation.java:194)
at org.apache.hadoop.fs.s3a.S3AInstrumentation.streamCounter(S3AInstrumentation.java:216)
at org.apache.hadoop.fs.s3a.S3AInstrumentation.<init>(S3AInstrumentation.java:139)
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:174)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:295)
at org.apache.spark.sql.execution.datasources.DataSource.hasMetadata(DataSource.scala:301)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:344)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:441)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:280)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:214)
at java.lang.Thread.run(Thread.java:745)
Can somebody help if I am missing something ?
I have referred to the link - How to use s3 with Apache spark 2.2 in the Spark shell, but didn't help
I would suggest to add the dependency to your spark-submit command as below, which will downloads all the dependencies required. If you just add a jar, you may still have some other dependencies missing:
spark-shell --packages "org.apache.hadoop:hadoop-aws:2.7.3"
spark-submit --packages "org.apache.hadoop:hadoop-aws:2.7.3"
Another way is to bundle the dependencies into your job jar file, then use normal spark-sbumit

java.lang.OutOfMemoryError embedding HDBSQL in Play

I'm using HSQLDB embedded in-memory in a Play for Scala application server.
I configure the driver like so:
driver = org.hsqldb.jdbc.JDBCDriver
url = "jdbc:hsqldb:mem:inmemory"
Also, when Play restarts I issue a SHUTDOWN statement in the HSQLDB connection.
This seems to work fine, however when I restart Play around 20 times in the development environment I get the following out of memory exception. Is HDBSQL shut down correctly? Maybe the database is closed but the engine memory itself is not released?
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError:
Metaspace
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192) Caused by: java.lang.OutOfMemoryError: Metaspace
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
at java.net.URLClassLoader.access$100(URLClassLoader.java:73)
at java.net.URLClassLoader$1.run(URLClassLoader.java:368)
at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at scala.collection.Iterator$class.toStream(Iterator.scala:1181)
at scala.collection.AbstractIterator.toStream(Iterator.scala:1194)
at scala.collection.TraversableOnce$class.toSeq(TraversableOnce.scala:296)
at scala.collection.AbstractIterator.toSeq(Iterator.scala:1194)
at scala.tools.nsc.backend.jvm.GenASM$newNormal$.computeDetour$1(GenASM.scala:3094)
at scala.tools.nsc.backend.jvm.GenASM$newNormal$.collapseJumpOnlyBlocks(GenASM.scala:3126)
at scala.tools.nsc.backend.jvm.GenASM$newNormal$.normalize(GenASM.scala:3195)
at scala.tools.nsc.backend.jvm.GenASM$JPlainBuilder.genCode(GenASM.scala:1861)
at scala.tools.nsc.backend.jvm.GenASM$JPlainBuilder.genMethod(GenASM.scala:1467)
at scala.tools.nsc.backend.jvm.GenASM$JPlainBuilder.genClass(GenASM.scala:1337)
HSQLDB has a way to automatically shutdown the database when the last connection is closed. You need to add shutdown=true as a URL property:
url = "jdbc:hsqldb:mem:inmemory;shutdown=true"
See more details in the docs:
http://hsqldb.org/doc/2.0/guide/dbproperties-chapt.html#N15641

spring kafka exception on startup

I followed this kafka example :https://www.codenotfound.com/spring-kafka-consumer-producer-example.html
but got exception:
java.lang.IllegalStateException: Error processing condition on org.springframework.boot.autoconfigure.kafka.KafkaAutoConfiguration.kafkaProducerListener
at org.springframework.boot.autoconfigure.condition.SpringBootCondition.matches(SpringBootCondition.java:64) ~[spring-boot-autoconfigure-2.0.0.M3.jar:2.0.0.M3]
at org.springframework.context.annotation.ConditionEvaluator.shouldSkip(ConditionEvaluator.java:109) ~[spring-context-5.0.0.RC3.jar:5.0.0.RC3]
java.lang.IllegalStateException: Failed to introspect Class [org.springframework.boot.autoconfigure.kafka.KafkaAutoConfiguration] from ClassLoader [sun.misc.Launcher$AppClassLoader#18b4aac2]
at org.springframework.util.ReflectionUtils.getDeclaredMethods(ReflectionUtils.java:659) ~[spring-core-5.0.0.RC3.jar:5.0.0.RC3]
at org.springframework.util.ReflectionUtils.doWithMethods(ReflectionUtils.java:556) ~[spring-core-5.0.0.RC3.jar:5.0.0.RC3]
at org.springframework.util.ReflectionUtils.doWithMethods(ReflectionUtils.java:541) ~[spring-core-5.0.0.RC3.jar:5.0.0.RC3]
Caused by: java.lang.NoClassDefFoundError: org/springframework/kafka/security/jaas/KafkaJaasLoginModuleInitializer
at java.lang.Class.getDeclaredMethods0(Native Method) ~[na:1.8.0_121]
at java.lang.Class.privateGetDeclaredMethods(Class.java:2701) ~[na:1.8.0_121]
at java.lang.Class.getDeclaredMethods(Class.java:1975) ~[na:1.8.0_121]
at org.springframework.util.ReflectionUtils.getDeclaredMethods(ReflectionUtils.java:641) ~[spring-core-5.0.0.RC3.jar:5.0.0.RC3]
... 33 common frames omitted
Caused by: java.lang.ClassNotFoundException: org.springframework.kafka.security.jaas.KafkaJaasLoginModuleInitializer
at java.net.URLClassLoader.findClass(URLClassLoader.java:381) ~[na:1.8.0_121]
at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[na:1.8.0_121]
I've google but can not resolve this issue , is anyone know this?

MapReduce (Hadoop-2.6.0)+ HBase-1.0.1.1 class not found exception

I have written a Map-Reduce program to fetch data from an input file and output it to a HBase table. But I am not able to execute. I am getting the following error
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
at beginners.VisitorSort.main(VisitorSort.java:123)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/mapreduce/TableReducer
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at visitor.HitTimeGmt.main(HitTimeGmt.java:142)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.mapreduce.TableReducer
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
I am not sharing the code, as I know this is an classpath issue. The same code ran on Hadoop-1.3.1 and Hbase-0.94.8 versions. Have updated the jars in the build classpath in eclipse, in bashrc file, in Hadoop-env.sh and also in hbase-env.sh.
But still I am facing this issue. I am out of options now. Any help is appreciated. Thanks in advance.
Finally, solved the issue. Needed to add the following line in hadoop-env.sh(all nodes)
HADOOP_CLASSPATH=$HBASE_HOME/lib/*