Dependency error while setting up Spark with Java in Eclipse

I am new to Spark and Java. I was trying to set up the Spark environment in Eclipse using a Maven dependency. I am using Java 1.8 with Scala 2.11.7. I have created a Scala project in Eclipse and added a Maven dependency:
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.1.0</version>
</dependency>
Now I am getting this error: "Failure to transfer org.spark-project.spark:unused:jar:1.0.0 from https://repo.maven.apache.org/maven2 was cached in the local repository, resolution will not be reattempted until the update interval of central has elapsed or updates are forced."

I am new too. These are my steps for working with Eclipse, Scala 2.11, Maven and Spark:
1. Create a Maven project.
2. Use this POM as a basic orientation: https://github.com/entradajuan/Spark0/blob/master/Spark0/pom.xml
3. Convert the project to Scala: right button > Configure > Convert to Scala.
4. Create a package in src/main/java.
5. Create a Scala object with its def main in the new package.
So far I always start the same way. I don't use Java in my projects.
It also works fine when running the Maven build "clean install" to get the jar.
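The "was cached in the local repository" message in the question usually means an earlier download attempt failed and Maven recorded that failure in a `*.lastUpdated` marker file. Forcing an update (`mvn -U clean install`, or Maven > Update Project with "Force Update of Snapshots/Releases" checked in Eclipse) normally clears it. As an alternative, the minimal sketch below lists the marker files so they can be removed by hand. The class name `LastUpdatedFinder` and its helper are hypothetical, and the sketch assumes the local repository is at the default location `~/.m2/repository`.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Maven or Eclipse.
class LastUpdatedFinder {

    // Returns every *.lastUpdated marker file below the given repository root.
    static List<Path> findMarkers(Path repoRoot) throws IOException {
        try (Stream<Path> paths = Files.walk(repoRoot)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".lastUpdated"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: default local repository location.
        Path repo = Paths.get(System.getProperty("user.home"), ".m2", "repository");
        for (Path marker : findMarkers(repo)) {
            // Only prints the files; delete them manually once you have checked the list.
            System.out.println(marker);
        }
    }
}
```

After removing the listed files, run Maven > Update Project again so the artifacts are downloaded afresh.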

Related

Not able to compile library extend from mllib

I am working on an ML project using Apache Spark and Maven. I created two libraries for the project: one called "rmml", which extends the Spark mllib library by adding a new FactorizationMachine algorithm to "org.apache.spark.mllib.regression", and another called "dataprocess", which uses the new algorithm I added.
In IntelliJ on my laptop, I am able to call and run the FM algorithm fine in "dataprocess", and I am able to compile "rmml". However, I hit an error when I try to compile the "dataprocess" library: "error: object FactorizationMachine is not a member of package org.apache.spark.mllib.regression". I am not a Java developer, so I am having a hard time figuring this out. Any help would be great, thanks!
This is the part of the pom of the "dataprocess" library that imports "rmml":
<dependency>
<groupId>com.something</groupId>
<artifactId>rmml</artifactId>
<version>1.0-SNAPSHOT</version>
</dependency>
This is pom of "rmml" project
<groupId>com.something</groupId>
<artifactId>rmml</artifactId>
<version>1.0-SNAPSHOT</version>
And here is class path and file path of "rmml" project
As Luis suggested, I needed to publish "rmml" to the local repository (for example by running mvn install in the "rmml" project).

Eclipse maven not adding dependencies

I am new to maven and am experiencing difficulties while trying to mavenise a Java project.
Setting:
IDE: Eclipse Oxygen.2 Release (4.7.2)
Java: 8
m2e: 1.8.2
What I did:
- copy-pasted the entire original Java project and renamed it
- right-clicked in Eclipse: Configure > Convert to Maven project
- in the Java build path, deleted the libraries imported from the original local lib folder. The build path now shows the Maven Dependencies folder, containing only the junit library.
- ran Maven install, which downloaded things into user/.m2/repository/, but not everything.
What does not work:
When I try to add a dependency right from a file, nothing pops up in the artifact selection window, even though there is a commons-logging/ folder in .m2/repository.
When I try to add the dependency manually in the pom.xml:
<dependency>
<groupId>org.kie.modules</groupId>
<artifactId>org-apache-commons-configuration-main</artifactId>
<version>6.5.0.Final</version>
<type>pom</type>
</dependency>
but the package resolution error still appears in the java file, and I get this warning after Maven install
`[WARNING] The POM for org.kie.modules:org-apache-commons-configuration-main:pom:6.5.0.Final is invalid, transitive dependencies (if any) will not be available, enable debug logging for more details`
I ran Maven > Update Project and cleaned the Eclipse project, but nothing changed.
My goal for now is just to get Eclipse to understand (at least for one library) that it has to take it from the Maven repository. I still have many other dependencies to resolve (intra-project), but that will be the next step.
Thanks for your help.
The cause of the issue is stated in the warning message:
[WARNING] The POM for
org.kie.modules:org-apache-commons-configuration-main:pom:6.5.0.Final
is invalid, transitive dependencies (if any) will not be available,
enable debug logging for more details
It means that the pom.xml downloaded to your local Maven repository exists but is not valid.
Delete the folder of that dependency in your local Maven repository and try again.
If you still have the same problem, check that the remote repository serving the dependency also serves a correct pom.xml for it.
You can do this by browsing the dependency's directory in a web browser.
For example, we can see that the Maven Central repository provides a valid POM for the sibling artifact org-apache-commons-collections-main:
http://central.maven.org/maven2/org/kie/modules/org-apache-commons-collections-main/6.5.0.Final/org-apache-commons-collections-main-6.5.0.Final.pom
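The folder to delete follows Maven's standard local repository layout: the groupId with its dots turned into directory separators, then the artifactId, then the version. The sketch below computes that path for the dependency in the question. The class name `LocalRepoPath` and its helper are hypothetical, and it assumes the repository is at the default location `~/.m2/repository` (a `localRepository` setting in settings.xml would change that).

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Maven.
class LocalRepoPath {

    // Builds <repoRoot>/<groupId as directories>/<artifactId>/<version>.
    static Path artifactDir(Path repoRoot, String groupId, String artifactId, String version) {
        Path dir = repoRoot;
        for (String part : groupId.split("\\.")) {
            dir = dir.resolve(part);
        }
        return dir.resolve(artifactId).resolve(version);
    }

    public static void main(String[] args) {
        // Assumption: default local repository location.
        Path repo = Paths.get(System.getProperty("user.home"), ".m2", "repository");
        System.out.println(artifactDir(repo,
                "org.kie.modules",
                "org-apache-commons-configuration-main",
                "6.5.0.Final"));
    }
}
```

Deleting the printed directory and then running Maven > Update Project makes Maven download the POM again.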

How to build spark application using Scala IDE and Maven?

I'm new to Scala, Spark and Maven and would like to build the Spark application described here. It uses the Mahout library.
I have Scala IDE installed and would like to use Maven to manage the dependencies (the Mahout library as well as the Spark libraries). I couldn't find a good tutorial to start with. Could someone help me figure it out?
First, try compiling a simple application with Maven in Scala IDE. The key parts of a Maven project are the directory structure and pom.xml. Although I don't use Scala IDE, this document seems helpful:
http://scala-ide.org/docs/tutorials/m2eclipse/
The next step is to add a dependency on Spark in pom.xml. You can follow this document:
http://blog.cloudera.com/blog/2014/04/how-to-run-a-simple-apache-spark-app-in-cdh-5/
For the latest versions of the Spark and Mahout artifacts, you can check here:
http://mvnrepository.com/artifact/org.apache.spark
http://mvnrepository.com/artifact/org.apache.mahout
Hope this helps.
You need the following tools to get started (based on recent availability):
- Scala IDE for Eclipse: download the latest version of Scala IDE from here.
- Scala version 2.11 (make sure the Scala compiler is set to this version as well)
- Spark version 2.2 (provided as a Maven dependency)
- winutils.exe
To run in a Windows environment, you need the Hadoop binaries in Windows format. winutils provides them, and you need to set the hadoop.home.dir system property to the directory whose bin folder contains winutils.exe. You can download winutils.exe here and place it at a path like c:/hadoop/bin/winutils.exe
To get started, you can define the Spark Core dependency in your project's pom.xml:
<dependency> <!-- Spark dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.11</artifactId>
<version>2.2.0</version>
<scope>provided</scope>
</dependency>
And in your Java/Scala class, set this property to run in your local environment on Windows:
System.setProperty("hadoop.home.dir", "c:/hadoop");
More details and the full setup can be found here.
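Because a wrong hadoop.home.dir only shows up later as an obscure winutils error, it can help to check the path before setting the property. A minimal sketch of that check follows. The class name `HadoopHome` and its helper are hypothetical, and `c:/hadoop` is just the example location used above.

```java
import java.io.File;

// Hypothetical helper, not part of Spark or Hadoop.
class HadoopHome {

    // Sets hadoop.home.dir only if <hadoopHome>/bin/winutils.exe exists.
    // Returns true when the property was set.
    static boolean configure(File hadoopHome) {
        File winutils = new File(new File(hadoopHome, "bin"), "winutils.exe");
        if (!winutils.isFile()) {
            return false;
        }
        System.setProperty("hadoop.home.dir", hadoopHome.getAbsolutePath());
        return true;
    }

    public static void main(String[] args) {
        // Assumption: winutils.exe was placed at c:/hadoop/bin/winutils.exe.
        if (!configure(new File("c:/hadoop"))) {
            System.err.println("winutils.exe not found under c:/hadoop/bin");
        }
    }
}
```

Call `configure` at the start of your own main method, before the SparkContext is created.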

Eclipse cannot find class com.google.common.reflect.TypeToken?

My project which uses Dataflow compiles just fine using
mvn compile
However, when I import my project into Eclipse, Eclipse is unable to build the project and gives the following error:
The project was not built since its build path is incomplete.
Cannot find the class file for com.google.common.reflect.TypeToken.
Fix the build path then try building this project
Adding an explicit dependency on Guava to my pom file appears to have fixed the problem.
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>[18.0,)</version>
</dependency>
By running
mvn dependency:tree -Dverbose -Dincludes=com.google.guava
I learned that several of my dependencies were pulling in Guava, so adding an explicit dependency forced Maven to pull in a newer version.
However, I don't know why running 'mvn compile' on the command line worked.
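One way to see which Guava jar a given classpath actually resolves is to ask the JVM where it loaded the class from. The sketch below does that. The class name `WhichJar` and its helper are hypothetical, and it has to be run with the same classpath you are diagnosing, for example as a Java application inside the Eclipse project.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper.
class WhichJar {

    // Returns the jar or directory a class was loaded from,
    // or a short note when the location cannot be determined.
    static String locationOf(String className) {
        try {
            Class<?> type = Class.forName(className);
            CodeSource source = type.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                return "(no code source, e.g. a JDK class)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on the classpath)";
        }
    }

    public static void main(String[] args) {
        // Class name taken from the Eclipse error message above.
        System.out.println(locationOf("com.google.common.reflect.TypeToken"));
    }
}
```

Comparing the printed location with the output of mvn dependency:tree shows whether Eclipse and the command line agree on the Guava version.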

Maven not updating/downloading dependencies

I'm using Eclipse EE and Apache Wicket.
I have my dependencies written down at pom.xml.
For some reason, Maven is not updating, even when I clean the project and run Maven > Update Project.
When I check the dependencies, the jar is there, but I still get an error (ClassNotFoundException) when I try to run.
Why is this happening?
Dependency in question:
<dependency>
<groupId>org.tuckey</groupId>
<artifactId>urlrewritefilter</artifactId>
<version>4.0.3</version>
</dependency>
Also already tried to reinstall maven.
Try running mvn eclipse:eclipse to update your Eclipse project.
Also, refresh the project after the command completes.
You shouldn't mix "mvn eclipse:eclipse" with the m2eclipse Maven plugin for Eclipse; doing so prevents the plugin from operating correctly. If you have done so, you need to remove your .project/.classpath/.settings files and re-import with "Import Existing Maven Projects".