GCS Connector Class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem not found - google-cloud-storage

We are trying to run Hive queries on HDP 2.1 using GCS Connector, it was working fine until yesterday but since today morning our jobs are randomly started failing. When we restart them manually they just work fine. I suspect it's something to do with number of parallel Hive jobs running at a given point of time.
Below is the error message:
vertexId=vertex_1407434664593_37527_2_00, diagnostics=[Vertex Input: audience_history initializer failed., java.lang.ClassNotFoundException: Class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem not found]
DAG failed due to vertex failure. failedVertices:1 killedVertices:0
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask
Any help will be highly appreciated.
Thanks!

Related

Zeppelin fails after loading DeltaTable with "Could not find active SparkSession"

I'm running into an issue when trying to do repeat queries with Delta Lake tables in Zeppelin. This code snippet runs without any problems the first time through:
import io.delta.tables._
val deltaTable = DeltaTable.forPath("s3://bucket/path")
deltaTable.toDF.show()
But when I try to run it a second time, it fails with this error:
java.lang.IllegalArgumentException: Could not find active SparkSession
at io.delta.tables.DeltaTable$$anonfun$1.apply(DeltaTable.scala:620)
at io.delta.tables.DeltaTable$$anonfun$1.apply(DeltaTable.scala:620)
at scala.Option.getOrElse(Option.scala:121)
at io.delta.tables.DeltaTable$.forPath(DeltaTable.scala:619)
... 51 elided
I can restart the Spark interpreter and run the query again, but this is a huge impediment to development. Does anyone know why this is happening and whether there is a workaround that doesn't involve restarting the interpreter every time I want to run a new query?

Remote debug an apache beam job on flink cluster

I am running a streaming beam job on a flink cluster where I am getting the following exception.
Caused by: org.apache.beam.sdk.util.UserCodeException: org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: Could not forward element to next operator
at org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:34)
at org.apache.beam.sdk.transforms.MapElements$1$DoFnInvoker.invokeProcessElement(Unknown Source)
at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:218)
at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:183)
at org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.processElement(DoFnRunnerWithMetricsUpdate.java:62)
at org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.processElement(DoFnOperator.java:544)
at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:202)
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:302)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: Could not forward element to next operator
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:596)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:554)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:534)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
at org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator$BufferedOutputManager.emit(DoFnOperator.java:941)
at org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator$BufferedOutputManager.output(DoFnOperator.java:895)
at org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:252)
at org.apache.beam.runners.core.SimpleDoFnRunner.access$700(SimpleDoFnRunner.java:74)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:576)
at org.apache.beam.sdk.transforms.DoFnOutputReceivers$WindowedContextOutputReceiver.output(DoFnOutputReceivers.java:71)
at org.apache.beam.sdk.transforms.MapElements$1.processElement(MapElements.java:139)
Caused by: org.apache.beam.sdk.util.UserCodeException: java.lang.IllegalArgumentException: Expect srcResourceIds and destResourceIds have the same scheme, but received alluxio, file.
at org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:34)
at org.apache.beam.sdk.io.WriteFiles$FinalizeTempFileBundles$FinalizeFn$DoFnInvoker.invokeProcessElement(Unknown Source)
at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:218)
at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:183)
at org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.processElement(DoFnRunnerWithMetricsUpdate.java:62)
at org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.processElement(DoFnOperator.java:544)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:579)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:554)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:534)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
at org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator$BufferedOutputManager.emit(DoFnOperator.java:941)
at org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator$BufferedOutputManager.output(DoFnOperator.java:895)
at org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:252)
at org.apache.beam.runners.core.SimpleDoFnRunner.access$700(SimpleDoFnRunner.java:74)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:576)
at org.apache.beam.sdk.transforms.DoFnOutputReceivers$WindowedContextOutputReceiver.output(DoFnOutputReceivers.java:71)
at org.apache.beam.sdk.transforms.MapElements$1.processElement(MapElements.java:139)
at org.apache.beam.sdk.transforms.MapElements$1$DoFnInvoker.invokeProcessElement(Unknown Source)
at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:218)
at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:183)
at org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.processElement(DoFnRunnerWithMetricsUpdate.java:62)
at org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.processElement(DoFnOperator.java:544)
at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:202)
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:302)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Expect srcResourceIds and destResourceIds have the same scheme, but received alluxio, file.
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument(Preconditions.java:141)
at org.apache.beam.sdk.io.FileSystems.validateSrcDestLists(FileSystems.java:428)
at org.apache.beam.sdk.io.FileSystems.rename(FileSystems.java:308)
at org.apache.beam.sdk.io.FileBasedSink$WriteOperation.moveToOutputFiles(FileBasedSink.java:755)
at org.apache.beam.sdk.io.WriteFiles$FinalizeTempFileBundles$FinalizeFn.process(WriteFiles.java:850)
The streaming job is getting data from the apache pulsar source and writing output data onto an Alluxio data lake in parquet file format. I am using Spotify's scio for writing this job in Scala. A little code chunk to emphasize what I am trying to achieve:
pulsarSource
.open(sc)
.withFixedWindows(Duration.standardSeconds(windowDuration))
.toSinkTap(sink)
From the exception, I can see that source and output paths should have the same URI scheme but I don't know how it is happening because I am using an alluxio path as an output directory. There are some temp directories that are being created on alluxio output directory but after the WindowDuration, when the output file is being created, this exception happens.
I had a doubt that temp location might be configured by default to the local filesystem, so I did set that to output directory path (alluxio dir path) but it didn't change anything.
sc.options.setTempLocation(outputDir)
I want to do remote debugging in order to figure out the issue. I have tried this document to do remote debugging on the task executor node, but once my IntelliJ IDE connects with the node, I don't get hit on my breakpoint.
Can someone suggest, how can I debug or get more information about this issue.
Thanks
Remote debugging might be quite hard, but let's try this first: Make sure you connect to the task manager and not job manager (easy to verify with thread names). Then make sure to have a high number of retries, such that you don't miss the task execution, as attaching the debugging might take a while.
It's also helpful to double check that line numbers of the stack trace match your code version in the IDE. If Flink/Beam is preinstalled, they might run a slightly different version and your break point is void. Just paste the stack trace in your IDE and check if each line matches the expectation. Finally, add a few more breakpoint at central places like org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:202) to make sure if the setup is working at all.
However, remote debugging is usually not the recommended option for big data systems. You'd first ensure locally that most of the things work on their own with some IT tests and local runners. Then, you might want to add e2e tests with docker containers and a local mini cluster. Additionally, you'd add a ton of logging statements, which you can turn on and off with your logging configuration. Similarly, if you set logging level to debug, the existing log statements of the frameworks might already be enough to gain some insights. One important thing that you should always look at is the generated topology that you can see in Web UI. Maybe it already tells you the paths in question.

Scala Spark : (org.apache.spark.repl.ExecutorClassLoader) Failed to check existence of class org on REPL class server at path

Running basic df.show() post spark notebook installation
I am getting the following error when running scala - spark code on spark-notebook. Any idea when this occurs and how to avoid?
[org.apache.spark.repl.ExecutorClassLoader] Failed to check existence of class org.apache.spark.sql.catalyst.expressions.Object on REPL class server at spark://192.168.10.194:50935/classes
[org.apache.spark.util.Utils] Aborting task
[org.apache.spark.repl.ExecutorClassLoader] Failed to check existence of class org on REPL class server at spark://192.168.10.194:50935/classes
[org.apache.spark.util.Utils] Aborting task
[org.apache.spark.repl.ExecutorClassLoader] Failed to check existence of class
I installed the spark on local, and when I was using following code it was giving me the same error.
spark.read.format("json").load("Downloads/test.json")
I think the issue was, it was trying to find some master node and taking some random or default IP. I specified the mode and then provided the IP as 127.0.0.1 and it resolved my issue.
Solution
Run the spark using local master
usr/local/bin/spark-shell --master "local[4]" --conf spark.driver.host=127.0.0.1'

The initialization of the DataSource's outputs caused an error: The UDF class is not a proper subclass

I have this issue
The initialization of the DataSource's outputs caused an error: The UDF class is not a proper subclass of org.apache.flink.api.common.functions.MapFunction
generated by this code:
val probes: DataSet[Probe] = env.createInput[InputProbe](new ProbesInputFormat).map { i =>
new Probe(
i.rssi,
0,
i.macHash,
i.deviceId,
0,
i.timeStamp)
}
I'm using scala 2.11 on flink 1.4.0 with IDEA.
On Dev machine i have no issue and the job runs properly, while on a Flink Standalone Cluster of 3 nodes i encountered the above error.
Can you help me please ;(
UPDATE:
I resolved implementing a class that extends from RichMapFunction, i don't know why but seems that lambda function => are not supported properly.
Now i have a new issue:
java.lang.ClassCastException: hk.huko.aps2.entities.Registry cannot be cast to scala.Product
Should i open a new POST?
I resolved the issue. It happened because flink load my job JAR many times (classloader) and in somehow it produced that error.
The solution is to not creare a JAR including all external JARs dependencies, but to copy into flink/lib folder those libraries plus your job JAR.

Streamsets class not found exception

I built a pipeline in streamsets to read data from my sql and do change data capture. When i started the execution of the pipeline i get the following error.
Pipeline Status: START_ERROR: java.lang.NoClassDefFoundError: org/apache/kafka/common/KafkaException
The most likely problem is that you need to follow the MapR-specific instructions at https://streamsets.com/documentation/datacollector/latest/help/#Installation/MapR-Prerequisites.html