Spark with IntelliJ or Eclipse

I am trying to set up IntelliJ for Spark with Scala 2.11, but it is very daunting and after days I have not been able to compile even a simple instruction such as spark.read.format, which is not found in the main core and SQL Spark libraries.
I have seen a few posts on the subject, but none of them were resolved. Does anyone have experience with this, or perhaps a working sample program I can start with?
Would it perhaps be easier with Eclipse?
Many thanks in advance for your answers,
EZ

Build the project in IntelliJ using Scala 2.11 and sbt 0.13, then make sure your plugins.sbt contains the following:
logLevel := Level.Warn
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.0")
Then your build.sbt must contain the following:
scalaVersion := "2.11.8"
val sparkVersion = "2.1.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion %"provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion %"provided"
Then write your code, open the Terminal in IntelliJ and type sbt assembly. You can ship the resulting jar to a remote cluster, or run it locally from IntelliJ. Let me know how it goes.
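For reference, here is a minimal sketch of the kind of program that setup should compile; the object name and CSV path are just placeholders:
import org.apache.spark.sql.SparkSession

object ReadExample {
  def main(args: Array[String]): Unit = {
    // local session for testing; on a cluster the master is supplied by spark-submit
    val spark = SparkSession.builder()
      .appName("ReadExample")
      .master("local[*]")
      .getOrCreate()

    // spark.read.format comes from spark-sql, which is why that dependency is needed
    val df = spark.read.format("csv")
      .option("header", "true")
      .load("data/sample.csv") // placeholder path

    df.show()
    spark.stop()
  }
}
Since the Spark dependencies are marked "provided", they will not be on the classpath when running directly from IntelliJ; either drop the "provided" qualifier for local runs or configure the run configuration to include provided-scope dependencies.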

Related

Nats Spark connector: Error: Failed to load class

Good afternoon!
I'm a newbie to the NATS/Spark world and I've been stuck for a few days. I would be grateful for any tip.
I'm using the https://github.com/Logimethods/nats-connector-spark-scala connector to read messages from a NATS server.
I'm using IntelliJ with SBT to run it and it works. However, when I try to build the jar file, it fails.
I've checked that the jar file has the MANIFEST.MF.
I'm thinking that maybe I'm missing some dependency, or that there is an incompatibility issue, so I'll attach my build.sbt file:
name := "brokerNatsSparkSBT"
version := "0.1"
scalaVersion := "2.11.12"
resolvers += "Sonatype OSS Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots"
resolvers += "Sonatype OSS Release" at "https://oss.sonatype.org/content/groups/public/"
libraryDependencies += "com.logimethods" % "nats-connector-spark-scala_2.11" % "1.0.0"
val sparkVersion = "2.3.1"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % sparkVersion,
"org.apache.spark" %% "spark-streaming" % sparkVersion
)
I'm using JDK 1.8 and, according to build.properties, SBT 1.5.4.
Thanks in advance!
After a few days of struggling, in the end I made it work thanks to this article. By including the sbt-assembly plugin and building the jar file with it, I managed to build the jar correctly.
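For reference, a minimal sketch of that change, assuming an sbt 1.x build (the plugin version below is simply one that works with sbt 1.x, not necessarily the one from the article). In project/plugins.sbt:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")
then from the sbt shell:
sbt assembly
If the Spark dependencies are not marked "provided", the resulting fat jar will be large and may need merge strategies for conflicting META-INF entries.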

Unresolved dependency generating jar with SBT

I'm developing a Spark process in Scala (Eclipse IDE) and it runs fine on my local cluster, but when I try to compile it with the SBT I installed on my PC I get an error (see picture).
My first doubt is why SBT tries to compile with Scala 2.12 when I explicitly set scalaVersion to 2.11.11 in my build.sbt. I tried installing other SBT versions with the same results, and also on other PCs, but it does not work. I need help to fix it.
scala_version (Spark): 2.11.11
sbt_version: 1.0.2
spark: 2.2
build.sbt
name := "Comple"
version := "1.0"
organization := "com.antonio.spark"
scalaVersion := "2.11.11"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % "2.2.0" % "provided",
"org.apache.spark" %% "spark-sql" % "2.2.0" % "provided"
)
assembly.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "1.0.2")
Error:
ResolveException: unresolved dependency: sbt_assembly;1.0.2: not found
The 1.0.2 in your assembly.sbt looks like your sbt version rather than a published sbt-assembly release, which is why resolution fails. Try changing your assembly.sbt file to:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")
as stated in the documentation here: https://github.com/sbt/sbt-assembly
I recently used that with spark-core_2.11 version 2.2.0 and it worked.
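For reference, plugin files like assembly.sbt have to live under the project/ directory next to build.sbt, roughly like this:
<project root>/
  build.sbt
  project/
    build.properties
    assembly.sbt   (contains the addSbtPlugin line above)
With that in place, sbt assembly writes the fat jar under target/scala-2.11/.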

How to add Java dependencies to a Scala project's sbt file

I have a Spark Streaming Scala project which uses the Apache NiFi receiver. The project runs fine under Eclipse/Scala IDE and now I want to package it for deployment.
When I add it as
libraryDependencies += "org.apache.nifi" %% "nifi-spark-receiver" % "0.3.0"
sbt assumes it's a Scala library and tries to resolve it.
How do I add the NiFi receiver and all its dependencies to the project's SBT file?
Also, is it possible to point dependencies to local directories instead of having sbt try to resolve them?
Thanks in advance.
Here is my sbt file contents:
name := "NiFi Spark Test"
version := "1.0"
scalaVersion := "2.10.5"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.2" % "provided"
libraryDependencies += "org.apache.nifi" %% "nifi-spark-receiver" % "0.3.0"
libraryDependencies += "org.apache.nifi" %% "nifi-spark-receiver" % "0.3.0"
A double % is used to append the Scala version as a suffix to the Maven artifact name. This is required because different Scala compiler versions produce incompatible bytecode. If you would like to use a Java library from Maven, then you should use a single % character:
libraryDependencies += "org.apache.nifi" % "nifi-spark-receiver" % "0.3.0"
I also found that I can put libraries the project depends on into the lib folder and they will be picked up during assembly.
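As a sketch of the unmanaged route: by default sbt picks up any jar dropped into a lib/ directory at the project root, so the setting below is only that default spelled out, and the jar name is just an illustration:
// build.sbt (optional; this is already the default location)
unmanagedBase := baseDirectory.value / "lib"
Copy nifi-spark-receiver-0.3.0.jar and its transitive dependencies into lib/ and they will be on the classpath for compilation and assembly.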

IntelliJ Idea 14: cannot resolve symbol spark

I added a Spark dependency which worked in my first project. But when I try to make a new project with Spark, my SBT does not import the external jars of org.apache.spark. Therefore IntelliJ IDEA gives the error that it "cannot resolve symbol".
I already tried making a new project from scratch and using auto-import, but neither works. When I try to compile I get the message that "object apache is not a member of package org". My build.sbt looks like this:
name := "hello"
version := "1.0"
scalaVersion := "2.11.7"
libraryDependencies += "org.apache.spark" % "spark-parent_2.10" % "1.4.1"
I have the impression that there might be something wrong with my SBT settings, although it already worked once. And except for the external libraries, everything is the same...
I also tried to import the pom.xml file of my spark dependency but that also doesn't work.
Thank you in advance!
This worked for me:
name := "ProjectName"
version := "0.1"
scalaVersion := "2.11.11"
libraryDependencies ++= Seq(
"org.apache.spark" % "spark-core_2.11" % "2.2.0",
"org.apache.spark" % "spark-sql_2.11" % "2.2.0",
"org.apache.spark" % "spark-mllib_2.10" % "1.1.0"
)
I use
scalaVersion := "2.11.7"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1"
in my build.sbt and it works for me.
I had a similar problem. It seems the reason was that the build.sbt file was specifying the wrong version of scala.
If you run spark-shell it'll say at some point the scala version used by Spark, e.g.
Using Scala version 2.11.8
Then I edited the line in the build.sbt file to point to that version and it worked.
Currently the spark-cassandra-connector is compatible with Scala 2.10 and 2.11.
It worked for me when I updated the Scala version of my project like below:
ThisBuild / scalaVersion := "2.11.12"
and I updated my dependency like:
libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" % "2.4.0"
If you use "%%", sbt will add your project’s binary Scala version to the artifact name.
From sbt run:
sbt> reload
sbt> compile
Your library dependency conflicts with the Scala version you're using; you need to use 2.11 for it to work. The correct dependency would be:
scalaVersion := "2.11.7"
libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "1.4.1"
Note that you need to change spark-parent to spark-core.
name := "SparkLearning"
version := "0.1"
scalaVersion := "2.12.3"
// additional libraries
libraryDependencies += "org.apache.spark" % "spark-streaming_2.10" % "1.4.1"

Can't install Scaladoc with SBT and Intellij

I am new to Scala and am currently trying to set up IntelliJ IDEA 13.1 with the Scala plugin, which has support for SBT. I have simply followed the basic tutorial for creating a new SBT project here: http://confluence.jetbrains.com/display/IntelliJIDEA/Getting+Started+with+SBT
Currently my build.sbt file is:
name := "scalasandpit"
version := "1.0"
scalaVersion := "2.10"
libraryDependencies += "org.scalatest" % "scalatest_2.10" % "2.1.0" % "test"
autoAPIMappings := true
This pulls down various jar binaries, but no sources and no javadoc. I wondered if there is a way to have both sources and javadoc work with IntelliJ and SBT. I think I'm missing something.
There seem to be two issues: getting sbt to pull down sources and docs, and then getting IDEA to show them to you. To solve the former problem, see the sbt documentation: about halfway down there's a section called "Download Sources" which tells you what to add to your build.sbt:
libraryDependencies +=
"org.scalatest" % "scalatest_2.10" % "2.1.0" % "test" withSources() withJavadoc()