Scala SBT elasticsearch-hadoop unresolved dependency

When adding the dependency libraryDependencies += "org.elasticsearch" % "elasticsearch-hadoop" % "5.1.1" and refreshing the project, I get many unresolved dependencies (cascading, org.pentaho, ...).
However, if I add another dependency, like libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.1.0", it works and I can use the library in my Scala files.
So, is the problem coming from elasticsearch-hadoop? I'm using SBT 0.13.13, but I also tried with 0.13.8.
I took the dependency from https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch-hadoop/5.1.1. I know that for some dependencies you need to add the repository as well (resolvers += ...), but this one doesn't seem to need a repo.

Add the following to your build.sbt file:
resolvers += "conjars.org" at "http://conjars.org/repo"

You can update your .sbt file:
name := "HelloSparkApp"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.2"
And execute the commands below from the project directory:
sbt clean
sbt package
sbt eclipse
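Note that the eclipse command is not built into sbt; it comes from the sbteclipse plugin, so project/plugins.sbt (or the global plugins file) needs a line like the following (the version here is just an example):
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "2.1.0")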

Related

SBT package to include only a few external jars but not all

Is there a way that I can use sbt package to include a few external jars but not all? I am aware of sbt-assembly, but that will include all the jars configured in build.sbt.
In that case, you have to declare the dependencies you want bundled without the "provided" option, and add "provided" to the dependencies you want excluded, in your build.sbt file.
Example:
Here I have given "provided" for the spark-streaming dependency, so it will not be included in my fat jar:
"org.apache.spark" %% "spark-streaming" % sparkVersion % "provided"
But the pureconfig dependency will be included in my fat jar, as I have not marked it "provided":
"com.github.pureconfig" %% "pureconfig" % "0.12.3"
You can use the assemblyExcludedJars setting in the assembly plugin. There's an example in the sbt-assembly README:
assemblyExcludedJars in assembly := {
  val cp = (fullClasspath in assembly).value
  cp filter { _.data.getName == "compile-0.1.0.jar" }
}
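Note that assemblyExcludedJars only filters jars out of the assembled fat jar by file name; the excluded jar remains on the compile classpath, so the project still builds against it.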
Alternatively, you can add a jar as an unmanaged dependency, which sbt picks up without trying to resolve it:
unmanagedJars in Compile += file("lib/my.jar")

How to add Java dependencies to a Scala project's sbt file

I have a Spark Streaming Scala project which uses the Apache NiFi receiver. The project runs fine under Eclipse/Scala IDE, and now I want to package it for deployment.
When I add it as
libraryDependencies += "org.apache.nifi" %% "nifi-spark-receiver" % "0.3.0"
sbt assumes it's a Scala library and tries to resolve it.
How do I add the NiFi receiver and all its dependencies to the project's SBT file?
Also, is it possible to point dependencies to local directories instead of having sbt resolve them?
Thanks in advance.
Here is the content of my sbt file:
name := "NiFi Spark Test"
version := "1.0"
scalaVersion := "2.10.5"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.2" % "provided"
libraryDependencies += "org.apache.nifi" %% "nifi-spark-receiver" % "0.3.0"
libraryDependencies += "org.apache.nifi" %% "nifi-spark-receiver" % "0.3.0"
The double % adds the Scala version as a suffix to the Maven artifact name. It is required because different Scala compiler versions produce incompatible bytecode. If you would like to use a Java library from Maven, you should use a single % character:
libraryDependencies += "org.apache.nifi" % "nifi-spark-receiver" % "0.3.0"
I also found that I can put libraries the project depends on into the lib folder and they will be picked up during assembly.

How to add dependency files to Scala?

I'm new to Scala and Spark and started writing a simple Apache Spark program in Scala IDE (in Eclipse). I added the dependency jar files to my project as I usually do in my Java projects, but it can't recognize them and gives me the following error message: object apache is not a member of package org. How should I add the dependency jar files?
The jar files I'm adding are the ones that exist under the 'lib' directory where Spark is installed.
For Scala you use SBT as the dependency manager and code compiler.
More information on how to set it up here:
http://www.scala-sbt.org/release/tutorial/Setup.html
Your build file will look something like this:
name := "Test"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "1.3.0"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0"

Where to place downloaded ScalaTest jar so sbt uses it across projects?

I've downloaded the ScalaTest jar and have used it as in the example, but now I would like to start using it with sbt. Where do I place the downloaded jar so I can use it with sbt across multiple projects?
You don't download dependencies like ScalaTest manually. The point of using sbt is to declare your project's dependencies and let sbt download them for you automatically.
Add this line to your build.sbt file:
libraryDependencies += "org.scalatest" %% "scalatest" % "2.2.0" % "test"
For more details, see the official documentation on setting this up.
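With that in place, sbt fetches ScalaTest on the next update, and sbt test runs any suites it finds under src/test/scala. A minimal sketch of such a suite (the class name and assertion are made up):
import org.scalatest.FlatSpec

class HelloSpec extends FlatSpec {
  "A String" should "report its length" in {
    assert("hello".length == 5)
  }
}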

How does one get sbt-idea to work in scala-2.10 project?

I had a lot of trouble getting sbt-idea to work in my Scala 2.10 project.
I tried compiling sbt-idea from its git repo, making sure to set
scalaVersion := "2.10.0-RC5"
in build/Build.scala, and using the publish-local command to publish it locally. But I nevertheless keep getting
[error] sbt.IncompatiblePluginsException: Binary incompatibility in plugins detected.
when I then use that published version, say by simply adding
addSbtPlugin("com.github.mpeltonen" % "sbt-idea" % "1.3.0-SNAPSHOT")
to the project/plugins.sbt file.
I don't think you need to build sbt-idea for Scala 2.10. I keep my gen-idea and eclipse project generators in the global build.sbt file and it works for all my projects (or so it seems ;-)
I'm using Ubuntu, so where the SBT config files are saved on your computer may be different.
Create a folder called plugins under the hidden sbt directory. On Linux this is located at ~/.sbt (where tilde is an alias for your home directory). So now you should have ~/.sbt/plugins
Then create a file called build.sbt under this directory and add the following to it:
resolvers += "Sonatype snapshots" at "http://oss.sonatype.org/content/repositories/snapshots/"
resolvers += "Sonatype releases" at "https://oss.sonatype.org/content/repositories/releases/"
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "2.1.0")
addSbtPlugin("com.github.mpeltonen" % "sbt-idea" % "1.2.0-SNAPSHOT")
To test, I just generated a Scala 2.10 project with it, and it seems fine.
Oh, the file above also adds support for the eclipse command in SBT if you want to generate Scala-IDE projects.
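With those plugins on the global classpath, the generator commands become available in every project, for example:
sbt gen-idea
sbt eclipse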
I was able to use an older version of gen-idea by adding the following to project/plugins.sbt in the project itself:
import sbt._
import Defaults._
libraryDependencies += sbtPluginExtra(
  m = "com.github.mpeltonen" % "sbt-idea" % "1.2.0", // plugin module name and version
  sbtV = "0.12", // sbt version
  scalaV = "2.9.2" // Scala version the plugin was compiled against
)
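For what it's worth, sbtPluginExtra is what addSbtPlugin uses under the hood: it attaches the sbt and Scala version attributes to the dependency, which is what allows pinning a plugin built against a different sbt version than the one currently running.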