Remote debug an Apache Beam job on a Flink cluster - Scala

I am running a streaming Beam job on a Flink cluster where I am getting the following exception.
Caused by: org.apache.beam.sdk.util.UserCodeException: org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: Could not forward element to next operator
at org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:34)
at org.apache.beam.sdk.transforms.MapElements$1$DoFnInvoker.invokeProcessElement(Unknown Source)
at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:218)
at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:183)
at org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.processElement(DoFnRunnerWithMetricsUpdate.java:62)
at org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.processElement(DoFnOperator.java:544)
at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:202)
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:302)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.streaming.runtime.tasks.ExceptionInChainedOperatorException: Could not forward element to next operator
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:596)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:554)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:534)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
at org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator$BufferedOutputManager.emit(DoFnOperator.java:941)
at org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator$BufferedOutputManager.output(DoFnOperator.java:895)
at org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:252)
at org.apache.beam.runners.core.SimpleDoFnRunner.access$700(SimpleDoFnRunner.java:74)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:576)
at org.apache.beam.sdk.transforms.DoFnOutputReceivers$WindowedContextOutputReceiver.output(DoFnOutputReceivers.java:71)
at org.apache.beam.sdk.transforms.MapElements$1.processElement(MapElements.java:139)
Caused by: org.apache.beam.sdk.util.UserCodeException: java.lang.IllegalArgumentException: Expect srcResourceIds and destResourceIds have the same scheme, but received alluxio, file.
at org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:34)
at org.apache.beam.sdk.io.WriteFiles$FinalizeTempFileBundles$FinalizeFn$DoFnInvoker.invokeProcessElement(Unknown Source)
at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:218)
at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:183)
at org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.processElement(DoFnRunnerWithMetricsUpdate.java:62)
at org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.processElement(DoFnOperator.java:544)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:579)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:554)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:534)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:718)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:696)
at org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator$BufferedOutputManager.emit(DoFnOperator.java:941)
at org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator$BufferedOutputManager.output(DoFnOperator.java:895)
at org.apache.beam.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:252)
at org.apache.beam.runners.core.SimpleDoFnRunner.access$700(SimpleDoFnRunner.java:74)
at org.apache.beam.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:576)
at org.apache.beam.sdk.transforms.DoFnOutputReceivers$WindowedContextOutputReceiver.output(DoFnOutputReceivers.java:71)
at org.apache.beam.sdk.transforms.MapElements$1.processElement(MapElements.java:139)
at org.apache.beam.sdk.transforms.MapElements$1$DoFnInvoker.invokeProcessElement(Unknown Source)
at org.apache.beam.runners.core.SimpleDoFnRunner.invokeProcessElement(SimpleDoFnRunner.java:218)
at org.apache.beam.runners.core.SimpleDoFnRunner.processElement(SimpleDoFnRunner.java:183)
at org.apache.beam.runners.flink.metrics.DoFnRunnerWithMetricsUpdate.processElement(DoFnRunnerWithMetricsUpdate.java:62)
at org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator.processElement(DoFnOperator.java:544)
at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:202)
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:302)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Expect srcResourceIds and destResourceIds have the same scheme, but received alluxio, file.
at org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument(Preconditions.java:141)
at org.apache.beam.sdk.io.FileSystems.validateSrcDestLists(FileSystems.java:428)
at org.apache.beam.sdk.io.FileSystems.rename(FileSystems.java:308)
at org.apache.beam.sdk.io.FileBasedSink$WriteOperation.moveToOutputFiles(FileBasedSink.java:755)
at org.apache.beam.sdk.io.WriteFiles$FinalizeTempFileBundles$FinalizeFn.process(WriteFiles.java:850)
The streaming job reads data from an Apache Pulsar source and writes the output to an Alluxio data lake in Parquet file format. I am using Spotify's Scio to write this job in Scala. A small code chunk to show what I am trying to achieve:
pulsarSource
  .open(sc)
  .withFixedWindows(Duration.standardSeconds(windowDuration))
  .toSinkTap(sink)
From the exception, I can see that the source and destination paths should have the same URI scheme, but I don't know how this is happening because I am using an Alluxio path as the output directory. Some temp directories are created under the Alluxio output directory, but after the window duration, when the final output file is being written, this exception occurs.
I suspected that the temp location might default to the local filesystem, so I set it to the output directory path (the Alluxio directory path), but it didn't change anything.
sc.options.setTempLocation(outputDir)
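One quick way to narrow this down is to check which scheme Beam resolves each path to (a sketch, assuming the Alluxio filesystem is registered with Beam's FileSystems; outputDir is the same value as above). For the rename in FileSystems.rename to succeed, the temp location and the final destination must resolve to the same scheme:
import org.apache.beam.sdk.io.FileSystems

// Sketch: make Beam aware of the pipeline options, then inspect the resolved schemes.
// Both should print "alluxio"; if one prints "file", that path falls back to the local FS.
// (If no registrar is found for the alluxio scheme, this call itself fails, which is equally telling.)
FileSystems.setDefaultPipelineOptions(sc.options)
println(FileSystems.matchNewResource(outputDir, true).getScheme)
println(FileSystems.matchNewResource(sc.options.getTempLocation, true).getScheme)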
I want to do remote debugging in order to figure out the issue. I have followed this document to remote debug the task executor node, but once my IntelliJ IDE connects to the node, my breakpoint is never hit.
Can someone suggest how I can debug this or get more information about the issue?
Thanks

Remote debugging might be quite hard, but let's try this first: make sure you connect to the task manager and not the job manager (easy to verify from the thread names). Then make sure you have a high number of retries, so that you don't miss the task execution, as attaching the debugger might take a while.
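For reference, the usual way to expose the task manager JVM (rather than the job manager) to the IDE is the standard JDWP agent flag on the task manager; as a sketch (the exact config key can vary slightly between Flink versions), something like this in flink-conf.yaml, followed by a task manager restart:
env.java.opts.taskmanager: "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005"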
It is also helpful to double-check that the line numbers in the stack trace match the code version in your IDE. If Flink/Beam is preinstalled, the cluster might run a slightly different version and your breakpoint would be void. Just paste the stack trace into your IDE and check whether each line matches the expectation. Finally, add a few more breakpoints at central places like org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:202) to check whether the setup is working at all.
However, remote debugging is usually not the recommended option for big data systems. You would first make sure locally that most things work on their own, using integration tests and local runners. Then you might want to add end-to-end tests with Docker containers and a local mini cluster. Additionally, you would add plenty of logging statements that you can turn on and off through your logging configuration. Similarly, if you set the logging level to debug, the frameworks' existing log statements might already be enough to gain some insight. One important thing you should always look at is the generated topology, which you can see in the Web UI; maybe it already tells you the paths in question.
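For example, a minimal local-run sketch with Scio (assuming the Beam DirectRunner is on the test classpath and a small bounded test input is used instead of Pulsar) could look like this:
import com.spotify.scio.ContextAndArgs

// Sketch: run the same windowing/sink logic on the DirectRunner with local paths,
// so breakpoints and step-through work directly in the IDE.
val (sc, args) = ContextAndArgs(Array(
  "--runner=DirectRunner",
  "--tempLocation=/tmp/beam-test"
))
// ... build the same pipeline here against a small bounded test input instead of Pulsar ...
sc.run().waitUntilFinish()  // on older Scio versions this is sc.close() instead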

Related

Java program is not run when invoked via Perl using crontab

I am trying to create a single cron-scheduled Perl script that calls a few other Perl files to complete a daily routine job, as below. The files are located in multiple directories.
Script_A.pl (in directory A) is scheduled to run every day at 3 am.
-> calls script_1.pl (in directory B)
-> calls script_2.pl (in directory B)
-> calls a java program (in directory B)
My problem is that even though cron is acting as expected, i.e. the Perl scripts are executed in order, the Java program is not running properly. I managed to capture its output in a log:
Exception in thread "main" java.lang.NoClassDefFoundError: com/sun/xml/tree/XmlDocument
at InputISBN.main(InputISBN.java:43)
Caused by: java.lang.ClassNotFoundException: com.sun.xml.tree.XmlDocument
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
... 1 more
I have checked a few related tickets on this forum as well as in others and tried using -classpath with no luck. Any help is highly appreciated. This works fine when I manually invoke Script_A.pl (./Script_A.pl).

Write karate gatling data to different databases listening to different ports using influx config [duplicate]

One question as I am just starting to use karate-gatling: is it possible to aggregate the generated reports, so that after multiple runs I get one single report? It would be nice to be able to compare performance somehow, i.e. to automatically get the information whether there is a performance degradation or not. What I tried, but did not work, was to copy the simulation logs and afterwards only generate the reports ("gatling.bat -ro simulations"). The error that I got was:
gatling.bat -ro simulations/catskaratesimulation-1544015145031
GATLING_HOME is set to "D:\AutomationTeam\gatling-charts-highcharts-bundle-3.0.1.1"
JAVA = ""C:\Program Files\Java\jdk1.8.0_131\bin\java.exe""
Parsing log file(s)...
Exception in thread "main" java.lang.NumberFormatException: For input string: "catskaratesimulation"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:589)
at java.lang.Long.parseLong(Long.java:631)
at scala.collection.immutable.StringLike.toLong(StringLike.scala:305)
at scala.collection.immutable.StringLike.toLong$(StringLike.scala:305)
at scala.collection.immutable.StringOps.toLong(StringOps.scala:29)
at io.gatling.charts.stats.LogFileReader.$anonfun$firstPass$1(LogFileReader.scala:102)
at scala.collection.Iterator.foreach(Iterator.scala:937)
at scala.collection.Iterator.foreach$(Iterator.scala:937)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1425)
at io.gatling.charts.stats.LogFileReader.firstPass(LogFileReader.scala:86)
at io.gatling.charts.stats.LogFileReader.$anonfun$x$4$1(LogFileReader.scala:125)
at io.gatling.charts.stats.LogFileReader.parseInputFiles(LogFileReader.scala:63)
at io.gatling.charts.stats.LogFileReader.<init>(LogFileReader.scala:125)
at io.gatling.app.RunResultProcessor.initLogFileReader(RunResultProcessor.scala:67)
at io.gatling.app.RunResultProcessor.processRunResult(RunResultProcessor.scala:49)
at io.gatling.app.Gatling$.start(Gatling.scala:81)
at io.gatling.app.Gatling$.fromArgs(Gatling.scala:46)
at io.gatling.app.Gatling$.main(Gatling.scala:38)
at io.gatling.app.Gatling.main(Gatling.scala)
Is there another way to do it? Should I somehow reconfigure Gatling? Thanks!
It worked when using the same version (2.2.4) via gatling.bat -ro folder_with_simulations.

Issue "Exception in thread "main"# START NON-TRANSLATABLEjava.lang.NoClassDefFoundError: G€“Xmx3072m" (MyEclipse 12.0 Blue & Websphere7)

I am using MyEclipse 14.0 Blue & WebSphere 7. I am trying to deploy into WebSphere via the Servers tab, and when it starts to deploy I get the issue below.
Exception in thread "main" # START NON-TRANSLATABLE java.lang.NoClassDefFoundError: –Xmx3072m
Caused by: java.lang.ClassNotFoundException: –Xmx3072m
at java.net.URLClassLoader.findClass(URLClassLoader.java:434)
at java.lang.ClassLoader.loadClass(ClassLoader.java:660)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:358)
at java.lang.ClassLoader.loadClass(ClassLoader.java:626)
Could not find the main class: –Xmx3072m. Program will exit.
Any suggestions to resolve this issue?
I think you have a typo in your JVM configuration, so it treats the max heap size param (-Xmx3072m) as a class name; please check the configuration.
I had a similar problem and fixed it by fixing the global PATH variable.
One variable had two quotes around it, which messed up the path for WebSphere...
A second thing would be to check for any characters that would be invalid in batch files or scripts on the operating system (because your problem seems to be encoding related).
You may find some more hints here.
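To make the encoding point concrete, compare the two arguments below; the first is what the JVM expects, the second is what a copy-pasted en dash turns into, which the JVM then tries to load as a main class (a sketch of the mismatch, not your exact configuration):
-Xmx3072m     <- plain ASCII hyphen-minus: parsed as a JVM option
–Xmx3072m     <- en dash (e.g. pasted from a document): treated as the main class name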

Problems with Chronicle Map on Windows

I am trying to use ChronicleMap for my index structure. This seems to work fine on Linux, but when I run my JUnit tests on Windows (which is my development environment), I keep getting an error: java.io.IOException: Unable to wait until the file is ready, likely the process which created the file crashed or hung for more than 1 minute.
Here's the code snippet that is problematic:
File file = new File(idxFullPath);
ChronicleMap<Integer, int[]> idx =
    ChronicleMapBuilder.of(Integer.class, int[].class)
        .averageValue(getSampleIdxList())
        .entries(IDX_MAX_SIZE)
        .createPersistedTo(file);
The following exception is thrown:
[2016-06-17 14:32:47.779] ERROR main com.mcm.op.persistence.Persistence ERR java.io.IOException: Unable to wait until the file is ready, likely the process which created the file crashed or hung for more than 1 minute
at net.openhft.chronicle.map.ChronicleMapBuilder.waitUntilReady(ChronicleMapBuilder.java:1520)
at net.openhft.chronicle.map.ChronicleMapBuilder.openWithExistingFile(ChronicleMapBuilder.java:1583)
at net.openhft.chronicle.map.ChronicleMapBuilder.createWithFile(ChronicleMapBuilder.java:1444)
at net.openhft.chronicle.map.ChronicleMapBuilder.createPersistedTo(ChronicleMapBuilder.java:1405)
at com.mcm.op.persistence.Persistence.initIdx(Persistence.java:131)
at com.mcm.op.persistence.Persistence.init(Persistence.java:177)
at com.mcm.op.persistence.PersistenceTest.initPersist(PersistenceTest.java:47)
at com.mcm.op.persistence.PersistenceTest.setUp(PersistenceTest.java:29)
Indeed, it is likely that the process which created the file crashed, or was terminated while debugging, or something like that.
If it's OK to have a fresh index from one unit test run to the next, I recommend either deleting the file at idxFullPath before creating a Chronicle Map, or randomizing the mapping file via something like File.createTempFile(). In either case, File.deleteOnExit() could prove helpful.
If you want to keep the index between unit test runs and always use the same file at idxFullPath for persistence, you could try using builder.createOrRecoverPersistedTo() instead of the plain createPersistedTo() map creation method. However, this might slow down map creation.
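As an illustration, a minimal sketch of the fresh-file variant (written in Scala here for consistency with the rest of this page; the same builder calls apply from Java, and getSampleIdxList/IDX_MAX_SIZE are the ones from the question):
import java.io.File
import net.openhft.chronicle.map.ChronicleMapBuilder

// Sketch: use a throwaway file per test run, so a file left behind by a crashed
// or killed process is never reused.
val file = File.createTempFile("idx", ".dat")
file.deleteOnExit()
val idx = ChronicleMapBuilder
  .of(classOf[Integer], classOf[Array[Int]])
  .averageValue(getSampleIdxList())
  .entries(IDX_MAX_SIZE)
  .createPersistedTo(file)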

Error running hadoop application in Eclipse on Windows

I'm trying to set up an Eclipse environment for developing and debugging Hadoop. I'm following Tom White's Hadoop: The Definitive Guide, 3rd ed. What I would like to do is get the MaxTemperature app working locally on Windows within Eclipse before moving it to my Hortonworks sandbox VM. The comment on page 158 about using the local job runner seems to be what I want. I don't want to set up a full Hadoop installation on Windows; I'm hoping that with the right config params I can convince it to run as a Java application inside Eclipse.
Windows: 7
Eclipse: Luna
Hadoop: 2.4.0
JDK: 7
When I set the Run configuration for MaxTemperatureDriver (source code on page 157) to
inputfile outputdir foo (deliberately bogus 3rd parameter)
I get the usage message, so I know I'm running my program with those params.
If I remove the bogus third param I get
Exception in thread "main" java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1255)
at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1251)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapreduce.Job.connect(Job.java:1250)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:1279)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
at mark.MaxTemperatureDriver.run(MaxTemperatureDriver.java:52)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at mark.MaxTemperatureDriver.main(MaxTemperatureDriver.java:56)
I've tried inserting -conf but it seems to be ignored. There is no error message if I specify a nonexistent path.
I've tried inserting -fs file:/// -jt local, but it makes no difference
I've tried inserting -D mapreduce.framework.name=local
I've tried specifying the input and output with the file: format
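For reference, the programmatic equivalent of those flags inside the driver would be along these lines (a sketch; shown in Scala syntax to match the rest of this page, but the same conf.set(...) calls apply in the Java driver, and whether they are sufficient on Hadoop 2.4.0 under Windows is exactly what is in question here):
// Sketch: force the local job runner and the local filesystem programmatically,
// mirroring -fs file:/// -jt local -D mapreduce.framework.name=local
val conf = getConf()
conf.set("fs.defaultFS", "file:///")
conf.set("mapreduce.framework.name", "local")
conf.set("mapreduce.jobtracker.address", "local")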
Note: I'm not asking how to configure Eclipse to connect to a remote Hadoop installation. I want the application to run within Eclipse.
Is this possible? Any ideas?
Additional info:
I turned on debugging. I saw:
582 [main] DEBUG org.apache.hadoop.mapreduce.Cluster - Trying ClientProtocolProvider : org.apache.hadoop.mapred.YarnClientProtocolProvider
583 [main] DEBUG org.apache.hadoop.mapreduce.Cluster - Cannot pick org.apache.hadoop.mapred.YarnClientProtocolProvider as the ClientProtocolProvider - returned null protocol
I'm wondering not why YarnClientProtocolProvider failed, but why it didn't try LocalClientProtocolProvider.
New info:
It seems that this is an issue with Hadoop 2.4.0. I recreated my environment with Hadoop 1.2.1, followed the instructions in
http://gerrymcnicol.com/index.php/2014/01/02/hadoop-and-cassandra-part-4-writing-your-first-mapreduce-job/
added the Windows hack from
http://bigdatanerd.wordpress.com/2013/11/14/mapreduce-running-mapreduce-in-windows-file-system-debug-mapreduce-in-eclipse
and it all started working.
The following blog will be useful:
Running mapreduce in Windows filesystem