How to run a Spark example program in IntelliJ IDEA - Scala

First, from the root of the downloaded Spark project, I ran the following on the command line:
mvn package
It completed successfully.
Then an IntelliJ project was created by importing the Spark pom.xml.
In the IDE the example class appears fine: all of the libraries are resolved, as can be seen in the screenshot.
However, when attempting to run main(), a ClassNotFoundException on SparkContext occurs.
Why can IntelliJ not simply load and run this Maven-based Scala program? And what can be done as a workaround?
As one can see below, SparkContext resolves fine in the IDE, but it is not found when attempting to run.
The test was run by right-clicking inside main() and selecting Run GroupByTest.
It gives:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/SparkContext
at org.apache.spark.examples.GroupByTest$.main(GroupByTest.scala:36)
at org.apache.spark.examples.GroupByTest.main(GroupByTest.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:120)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.SparkContext
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 7 more
Here is the run configuration:

The Spark libraries are not on your classpath.
Run sbt/sbt assembly,
and afterwards add "/assembly/target/scala-$SCALA_VERSION/spark-assembly*hadoop*-deps.jar" to your project.

This may help: IntelliJ-Runtime-error-tt11383. Change the module dependencies' scope from "provided" to "compile". That works for me.

You need to add the Spark dependency. If you are using Maven, just add these lines to your pom.xml:
<dependencies>
...
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.binary.version}</artifactId>
<version>${spark.version}</version>
<scope>provided</scope>
</dependency>
...
</dependencies>
This way you'll have the dependency for compiling and testing purposes but not in the "jar-with-dependencies" artifact.
But if you want to execute the whole application in a standalone cluster running from your IntelliJ, you can add a Maven profile that adds the dependency with compile scope, like this:
<properties>
<scala.binary.version>2.11</scala.binary.version>
<spark.version>1.2.1</spark.version>
<spark.scope>provided</spark.scope>
</properties>
<profiles>
<profile>
<id>local</id>
<properties>
<spark.scope>compile</spark.scope>
</properties>
<dependencies>
<!--<dependency>-->
<!--<groupId>org.apache.hadoop</groupId>-->
<!--<artifactId>hadoop-common</artifactId>-->
<!--<version>2.6.0</version>-->
<!--</dependency>-->
<!--<dependency>-->
<!--<groupId>com.hadoop.gplcompression</groupId>-->
<!--<artifactId>hadoop-gpl-compression</artifactId>-->
<!--<version>0.1.0</version>-->
<!--</dependency>-->
<dependency>
<groupId>com.hadoop.gplcompression</groupId>
<artifactId>hadoop-lzo</artifactId>
<version>0.4.19</version>
</dependency>
</dependencies>
<activation>
<activeByDefault>false</activeByDefault>
<property>
<name>env</name>
<value>local</value>
</property>
</activation>
</profile>
</profiles>
<dependencies>
<!-- SPARK DEPENDENCIES -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_${scala.binary.version}</artifactId>
<version>${spark.version}</version>
<scope>${spark.scope}</scope>
</dependency>
</dependencies>
I also added an option to my application to start a local cluster if --local is passed:
private def sparkContext(appName: String, isLocal: Boolean): SparkContext = {
  val sparkConf = new SparkConf().setAppName(appName)
  if (isLocal) {
    sparkConf.setMaster("local")
  }
  new SparkContext(sparkConf)
}
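For completeness, here is a sketch of how the helper might be wired into main(); the object name, the "--local" flag and the way it is parsed are assumptions for illustration, not part of the original code:
import org.apache.spark.{SparkConf, SparkContext}
// Hypothetical application object showing one way to pass the flag through.
object MyApp {
  def main(args: Array[String]): Unit = {
    val isLocal = args.contains("--local") // assumed flag handling
    val sc = sparkContext("MyApp", isLocal)
    try {
      println(sc.parallelize(1 to 10).count()) // placeholder job
    } finally {
      sc.stop()
    }
  }
  // helper from above, unchanged
  private def sparkContext(appName: String, isLocal: Boolean): SparkContext = {
    val sparkConf = new SparkConf().setAppName(appName)
    if (isLocal) sparkConf.setMaster("local")
    new SparkContext(sparkConf)
  }
}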
Finally, you have to enable the "local" profile in IntelliJ in order to get the proper dependencies. Just go to the "Maven Projects" tab and enable the profile (or activate it on the command line with -Denv=local, matching the activation property defined above).

Related

Hadoop 3 gcs-connector doesn't work properly with latest version of spark 3 standalone mode

I wrote a simple Scala application which reads a Parquet file from a GCS bucket. The application uses:
JDK 17
Scala 2.12.17
Spark SQL 3.3.1
gcs-connector hadoop3-2.2.7
The connector is taken from Maven and imported via sbt (the Scala build tool). I'm not using the latest version, 2.2.9, because of this issue.
The application works perfectly in local mode, so I tried to switch to standalone mode.
These are the steps I took:
Downloaded Spark 3.3.1 from here
Started the cluster manually as described here
Then I tried to run the application again and faced this error:
[error] Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem not found
[error] at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2688)
[error] at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3431)
[error] at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3466)
[error] at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
[error] at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
[error] at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
[error] at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
[error] at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
[error] at org.apache.parquet.hadoop.util.HadoopInputFile.fromStatus(HadoopInputFile.java:44)
[error] at org.apache.spark.sql.execution.datasources.parquet.ParquetFooterReader.readFooter(ParquetFooterReader.java:44)
[error] at org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat$.$anonfun$readParquetFootersInParallel$1(ParquetFileFormat.scala:484)
[error] ... 14 more
[error] Caused by: java.lang.ClassNotFoundException: Class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem not found
[error] at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2592)
[error] at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2686)
[error] ... 24 more
Somehow it cannot find the connector's file system class: java.lang.ClassNotFoundException: Class com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem not found
My Spark configuration is pretty basic:
spark.app.name = "Example app"
spark.master = "spark://YOUR_SPARK_MASTER_HOST:7077"
spark.hadoop.fs.defaultFS = "gs://YOUR_GCP_BUCKET"
spark.hadoop.fs.gs.impl = "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem"
spark.hadoop.fs.AbstractFileSystem.gs.impl = "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS"
spark.hadoop.google.cloud.auth.service.account.enable = true
spark.hadoop.google.cloud.auth.service.account.json.keyfile = "src/main/resources/gcp_key.json"
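For reference, the same settings can also be applied when the session is built; this is just a sketch restating the configuration above programmatically (the placeholder values still need to be filled in):
import org.apache.spark.sql.SparkSession
// Mirrors the configuration file above; nothing new is introduced here.
val spark = SparkSession.builder()
  .appName("Example app")
  .master("spark://YOUR_SPARK_MASTER_HOST:7077")
  .config("spark.hadoop.fs.defaultFS", "gs://YOUR_GCP_BUCKET")
  .config("spark.hadoop.fs.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
  .config("spark.hadoop.fs.AbstractFileSystem.gs.impl", "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS")
  .config("spark.hadoop.google.cloud.auth.service.account.enable", "true")
  .config("spark.hadoop.google.cloud.auth.service.account.json.keyfile", "src/main/resources/gcp_key.json")
  .getOrCreate()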
I've found out that the Maven version of the GCS Hadoop connector is missing dependencies internally.
I've fixed it by either:
downloading the connector from https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage and providing it to the Spark configuration on startup (but this is not recommended for production use, as the site clearly states), or
providing the missing dependencies for the connector.
To resolve the second option, I unpacked the GCS Hadoop connector jar file, looked for its pom.xml, copied the dependencies to a new standalone pom file, and downloaded them using the mvn dependency:copy-dependencies -DoutputDirectory=/path/to/pyspark/jars/ command.
Here is the example pom.xml that I've created; please note that I am using the 2.2.9 version of the connector:
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
<modelVersion>4.0.0</modelVersion>
<name>TMP_PACKAGE_NAME</name>
<description>
jar dependencies of gcs hadoop connector
</description>
<!--'com.google.oauth-client:google-oauth-client:jar:1.34.1'
-->
<groupId>TMP_PACKAGE_GROUP</groupId>
<artifactId>TMP_PACKAGE_NAME</artifactId>
<version>0.0.1</version>
<dependencies>
<dependency>
<groupId>com.google.cloud.bigdataoss</groupId>
<artifactId>gcs-connector</artifactId>
<version>hadoop3-2.2.9</version>
</dependency>
<dependency>
<groupId>com.google.api-client</groupId>
<artifactId>google-api-client-jackson2</artifactId>
<version>2.1.0</version>
</dependency>
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>31.1-jre</version>
</dependency>
<dependency>
<groupId>com.google.oauth-client</groupId>
<artifactId>google-oauth-client</artifactId>
<version>1.34.1</version>
</dependency>
<dependency>
<groupId>com.google.cloud.bigdataoss</groupId>
<artifactId>util</artifactId>
<version>2.2.9</version>
</dependency>
<dependency>
<groupId>com.google.cloud.bigdataoss</groupId>
<artifactId>util-hadoop</artifactId>
<version>hadoop3-2.2.9</version>
</dependency>
<dependency>
<groupId>com.google.cloud.bigdataoss</groupId>
<artifactId>gcsio</artifactId>
<version>2.2.9</version>
</dependency>
<dependency>
<groupId>com.google.auto.value</groupId>
<artifactId>auto-value-annotations</artifactId>
<version>1.10.1</version>
<scope>runtime</scope>
</dependency>
<dependency>
<groupId>com.google.flogger</groupId>
<artifactId>flogger</artifactId>
<version>0.7.4</version>
</dependency>
<dependency>
<groupId>com.google.flogger</groupId>
<artifactId>google-extensions</artifactId>
<version>0.7.4</version>
</dependency>
<dependency>
<groupId>com.google.flogger</groupId>
<artifactId>flogger-system-backend</artifactId>
<version>0.7.4</version>
</dependency>
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
<version>2.10</version>
</dependency>
</dependencies>
</project>
hope this helps
This is caused by the fact that Spark uses an old Guava library version and you used a non-shaded GCS connector jar. To make it work, you just need to use the shaded GCS connector jar from Maven, for example: https://repo1.maven.org/maven2/com/google/cloud/bigdataoss/gcs-connector/hadoop3-2.2.9/gcs-connector-hadoop3-2.2.9-shaded.jar
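Since the project in the question is built with sbt, here is a minimal sketch of depending on the shaded artifact instead of the default one (the surrounding build.sbt is assumed; only this line matters):
// build.sbt: pull the shaded classifier of the GCS connector
libraryDependencies += "com.google.cloud.bigdataoss" % "gcs-connector" % "hadoop3-2.2.9" classifier "shaded"
Alternatively, the shaded jar linked above can be downloaded and placed on the driver and executor classpaths directly.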

Exception in thread "main" java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$

Hi, I am trying to run Spark on my local laptop.
I created a Maven project in IntelliJ IDEA, and in my main class I have the single line below; when I try to run the project I get the error shown below:
val spark = SparkSession.builder().master("local").getOrCreate()
21/11/02 18:02:35 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
Exception in thread "main" java.lang.IllegalAccessError: class org.apache.spark.storage.StorageUtils$ (in unnamed module #0x34e9fd99) cannot access class sun.nio.ch.DirectBuffer (in module java.base) because module java.base does not export sun.nio.ch to unnamed module #0x34e9fd99
at org.apache.spark.storage.StorageUtils$.(StorageUtils.scala:213)
at org.apache.spark.storage.BlockManagerMasterEndpoint.(BlockManagerMasterEndpoint.scala:110)
at org.apache.spark.SparkEnv$.$anonfun$create$9(SparkEnv.scala:348)
at org.apache.spark.SparkEnv$.registerOrLookupEndpoint$1(SparkEnv.scala:287)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:336)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:191)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:277)
at org.apache.spark.SparkContext.(SparkContext.scala:460)
at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2690)
at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$2(SparkSession.scala:949)
at scala.Option.getOrElse(Option.scala:201)
at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:943)
at Main$.main(Main.scala:8)
at Main.main(Main.scala)
My dependencies in the pom:
<dependencies>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.13</artifactId>
<version>3.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql_2.13</artifactId>
<version>3.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-hive -->
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_2.11</artifactId>
<version>2.1.3</version>
<scope>provided</scope>
</dependency>
Any idea how to resolve this problem?
At the time of writing this answer, Spark does not support Java 17 - only Java 8/11 (source: https://spark.apache.org/docs/latest/).
In my case uninstalling Java 17 and installing Java 8 (e.g. OpenJDK 8) fixed the problem and I started using Spark on my laptop.
UPDATE:
Spark now runs on Java 8/11/17, Scala 2.12/2.13, Python 3.7+ and R 3.5+ (Java 17 support arrived with Spark 3.3.0, so the Spark 3.2.0 used in the question still requires Java 8 or 11).
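Either way, it is worth confirming which JVM the run configuration actually uses; a minimal sketch of such a check (not part of the original setup):
// Prints the JVM the application really runs on; with Spark 3.2.0 this should
// report 1.8.x or 11.x, since the IllegalAccessError above is expected on Java 17.
object JavaVersionCheck {
  def main(args: Array[String]): Unit = {
    println(s"java.version = ${System.getProperty("java.version")}")
    println(s"java.home    = ${System.getProperty("java.home")}")
  }
}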

Apache Flink: How to sink stream to Google Cloud Storage File System

I was trying to write some data streams into a file on the Google Cloud Storage file system as follows (using Flink 1.8 and Scala 2.11):
data.addSink(new BucketingSink[(String, Int)]("gs://url-try/try.txt"))
But I'm facing the following error:
Exception in thread "main" org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
at org.apache.flink.runtime.minicluster.MiniCluster.executeJobBlocking(MiniCluster.java:638)
at org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:123)
at org.apache.flink.streaming.api.scala.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.scala:654)
Caused by: java.lang.RuntimeException: Error while creating FileSystem when initializing the state of the BucketingSink.
at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:379)
at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:278)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:738)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:289)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 'gs'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.
at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:403)
at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.createHadoopFileSystem(BucketingSink.java:1227)
at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initFileSystem(BucketingSink.java:432)
at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:376)
... 8 more
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Cannot support file system for 'gs' via Hadoop, because Hadoop is not in the classpath, or some classes are missing from the classpath.
at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:179)
at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:399)
... 11 more
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/HdfsConfiguration
at org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:85)
... 12 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hdfs.HdfsConfiguration
at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 13 more
I have seen several questions about this, and I have also done the following:
env variables:
- FLINK_CONF_DIR
file flink-conf.yaml:
- fs.hdfs.hadoopconf: src/main/resources/core-site.xml
core-site.xml:
<property>
<name>fs.gs.impl</name>
<value>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem</value>
<description>The FileSystem for gs: (GCS) uris.</description>
</property>
<property>
<name>fs.AbstractFileSystem.gs.impl</name>
<value>com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS</value>
<description>The AbstractFileSystem for gs: (GCS) uris.</description>
</property>
These are my pom dependencies:
<dependencies>
<!-- https://mvnrepository.com/artifact/org.apache.flink/flink-scala -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-scala_2.11</artifactId>
<version>1.8.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.flink/flink-streaming-scala -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-scala_2.11</artifactId>
<version>1.8.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/commons-lang/commons-lang -->
<dependency>
<groupId>commons-lang</groupId>
<artifactId>commons-lang</artifactId>
<version>2.6</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.google.cloud.bigdataoss/gcs-connector -->
<dependency>
<groupId>com.google.cloud.bigdataoss</groupId>
<artifactId>gcs-connector</artifactId>
<version>hadoop3-1.9.16</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.flink/flink-hadoop-fs -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-hadoop-fs</artifactId>
<version>1.8.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-core -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-core</artifactId>
<version>1.2.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.flink/flink-connector-filesystem -->
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-connector-filesystem_2.11</artifactId>
<version>1.8.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-hdfs -->
<dependency>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
<version>3.2.0</version>
</dependency>
<dependency>
<groupId>com.google.cloud</groupId>
<artifactId>google-cloud-storage</artifactId>
<version>1.35.0</version>
</dependency>
</dependencies>
Any help?
According to the stack trace you posted, you are having issues trying to write into the GCS bucket using Flink and Scala.
There is a similar post where the issue is resolved, please check it.
Do not hesitate to come back if you have further questions.

Maven dependency given but still one class not found on execution

I have added Maven dependencies in pom.xml for the CDK, but I still get a class-not-found error when executing the jar file.
dyna218-128:spark4vs laeeqahmed$ java -cp target/spark4vs-1.0-SNAPSHOT.jar se.uu.farmbio.spark4vs.RunPrediction
Exception in thread "main" java.lang.NoClassDefFoundError: org/openscience/cdk/interfaces/IAtomContainer
Caused by: java.lang.ClassNotFoundException: org.openscience.cdk.interfaces.IAtomContainer
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
The pom.xml is as follows:
<dependency>
<groupId>log4j</groupId>
<artifactId>log4j</artifactId>
<version>1.2.16</version>
</dependency>
<dependency><!-- SVM depedency -->
<groupId>tw.edu.ntu.csie</groupId>
<artifactId>libsvm</artifactId>
<version>3.1</version>
</dependency>
<dependency>
<groupId>org.openscience.cdk</groupId>
<artifactId>cdk</artifactId>
<version>1.4.7</version>
</dependency>
<repositories>
<repository>
<id>3rdparty</id>
<url>https://maven.ch.cam.ac.uk/content/repositories/thirdparty/</url>
</repository>
</repositories>
Maven dependencies are used for building the project; the Maven Jar Plugin does not include them when the JAR is packaged, so you can't run it without additional work.
However, there are several solutions for this. For example, you can use the Maven One Jar Plugin to package all dependencies into the JAR (though this is not always usable):
http://onejar-maven-plugin.googlecode.com/svn/mavensite/usage.html
You can create an archive with the jar-with-dependencies descriptor of the Maven Assembly Plugin:
http://maven.apache.org/plugins/maven-assembly-plugin/descriptor-refs.html#jar-with-dependencies
Or you can merge all jars into one with the Maven Shade Plugin:
http://maven.apache.org/plugins/maven-shade-plugin/
With the assembly approach, for example, the run command then points at the assembled artifact: java -cp target/spark4vs-1.0-SNAPSHOT-jar-with-dependencies.jar se.uu.farmbio.spark4vs.RunPrediction

Arquillian/JUnit tests run from console but not inside Eclipse

I've set up our project with some JUnit tests that are run by Arquillian inside the full JBoss Server (inside a profile called jboss-remote-6). I pretty much did everything as in the manual at http://docs.jboss.org/arquillian/reference/latest/en-US/html/gettingstarted.html.
If I execute mvn test in the console, everything is properly executed and the assertions are checked.
But when I try to run the JUnit test case inside Eclipse, it fails with the following exception:
org.jboss.arquillian.impl.client.deployment.ValidationException: DeploymentScenario contains targets not maching any defined Container in the registry. _DEFAULT_
at org.jboss.arquillian.impl.client.deployment.DeploymentGenerator.validate(DeploymentGenerator.java:95)
at org.jboss.arquillian.impl.client.deployment.DeploymentGenerator.generateDeployment(DeploymentGenerator.java:77)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
(...)
I set up the Maven profile for this project correctly to "jbossas-remote-6" as stated in the pom.xml. What am I doing wrong? Google couldn't help on this specific one.
Best regards,
Sebastian
There are various things I did to make this work. My role model was the jboss-javaee6 Maven archetype, which also uses Arquillian for unit testing code in a remote JBoss 6 server. These are the steps I took:
Add arquillian.xml
I added the arquillian.xml in src/test/resources:
<?xml version="1.0" encoding="UTF-8"?>
<arquillian xmlns="http://jboss.com/arquillian" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="
http://jboss.org/schema/arquillian
http://jboss.org/schema/arquillian/arquillian-1.0.xsd">
<container qualifier="jbossas-remote" default="true">
<property name="httpPort">8080</property>
</container>
</arquillian>
ShrinkWrap a WebArchive instead of a JavaArchive
Using return ShrinkWrap.create(WebArchive.class, "test.war") instead of JavaArchive.class made the addAsWebInfResource() method available, which I could use to add the empty generated beans.xml.
Adjust pom.xml to reduce CLASSPATH length
Eclipse was constantly breaking with javaw.exe giving a CreateProcess error=87 message. This was caused by the CLASSPATH being too long for the console command. Since the jboss-as-client dependency added bazillions of dependencies, I changed it to jboss-as-profileservice-client, which works fine and has far fewer dependencies.
Another important thing is to have a jndi.properties file in the src/test/resources directory, as stated in the Arquillian docs. That was already the case here, though. I guess the arquillian.xml made the difference; this file was never mentioned in the docs, I just saw it in the archetype.
This is my Maven profile for remote JBoss testing:
<profile>
<id>jbossas-remote-6</id>
<dependencies>
<dependency>
<groupId>org.jboss.arquillian.container</groupId>
<artifactId>arquillian-jbossas-remote-6</artifactId>
<version>1.0.0.Alpha5</version>
</dependency>
<dependency>
<groupId>org.jboss.spec</groupId>
<artifactId>jboss-javaee-6.0</artifactId>
<version>2.0.0.Final</version>
<type>pom</type>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.jboss.jbossas</groupId>
<artifactId>jboss-as-profileservice-client</artifactId>
<version>6.0.0.Final</version>
<type>pom</type>
</dependency>
</dependencies>
<build>
<testResources>
<testResource>
<directory>src/test/resources</directory>
</testResource>
</testResources>
</build>
I hope my answer will be useful to somebody. :)
Note there is also an open issue related to running tests in Eclipse:
https://issues.jboss.org/browse/ARQ-1037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel