Setting up Scala project - scala

Is there a standard in place for setting up a Scala project where the build.sbt is contained in a subdirectory?
I've cloned https://github.com/lightbend/cloudflow and opened it in IntelliJ. The core directory contains build.sbt.
If I open core on its own in a new project window, IntelliJ recognises the Scala project.
How can I compile the Scala project in core while keeping the other folders available in the same IntelliJ window?

EDIT:
If you do want to play around with the project, it should suffice to import it as an SBT project and select core as the root. IntelliJ should also detect the build.sbt if you open core as the root.
Here is the SBT Reference Manual
Traditionally, build.sbt will be at the root of the project.
If you are looking to use their libraries, declare them as dependencies in your sbt file. You shouldn't clone the repo unless you intend to modify or fork it.
For importing libraries into your project take a look at the Maven Repository for Cloudflow, select the project(s), click on the version you want, and select the SBT tab. Just copy and paste those dependencies into your build.sbt. Once you build the project with SBT, you should have all those packages available to you.
So in [ProjectRoot]/build.sbt something along the lines of
// Not named scalaVersion: that would shadow the sbt key of the same name
val scalaVer = "2.13.4"
lazy val root = (project in file("."))
  .settings(
    name := "Your_Project_Name",
    scalaVersion := scalaVer,
    libraryDependencies += "com.lightbend.cloudflow" %% "cloudflow-streamlets" % "2.0.26-RC12",
    // Either one at a time
    libraryDependencies += "com.lightbend.cloudflow" %% "cloudflow-akka" % "2.0.26-RC12",
    // Or as a sequence (note that you can have a trailing comma for these)
    libraryDependencies ++= Seq(
      "com.lightbend.cloudflow" %% "cloudflow-blueprint" % "2.0.26-RC12",
      "com.lightbend.cloudflow" %% "cloudflow-flink" % "2.0.26-RC12", // No next element
    )
    // Or a combination of both, as above; just don't forget the commas in between
  )

Related

sbt assembly, including my jar

I want to build a 'fat' jar of my code. I mostly understand how to do this, but all the examples I have assume the dependency jar is not local, and I am not sure how to include in my assembled jar another JAR that I built and that the Scala code uses. In which folder should the JAR I have to include reside?
Normally when I run my current code as a test using spark-shell it looks like this:
spark-shell --jars magellan_2.11-1.0.6-SNAPSHOT.jar -i st_magellan_abby2.scala
(the jar file is right in the same path as the .scala file)
So now I want to build a build.sbt file that does the same and includes that SNAPSHOT.jar file?
name := "PSGApp"
version := "1.0"
scalaVersion := "2.11.8"
resolvers += "Spark Packages Repo" at "http://dl.bintray.com/spark-packages/maven"
// "provided" means don't include it because it is already on the cluster?
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.2.0" % "provided",
  "org.apache.spark" %% "spark-sql" % "2.2.0" % "provided",
  "org.apache.spark" %% "spark-streaming" % "2.2.0" % "provided",
  // add magellan here somehow?
)
So where would I put the jar in the SBT project folder structure so it gets picked up when I run sbt assembly? Is that in the main/resources folder? Which the reference manual says is where 'files to include in the main jar' go?
What would I put in the libraryDependencies here so it knows to add that specific jar and not go out into the internet to get it?
One last thing: I was also doing some imports in my test code that don't seem to work now that I have put this code in an object with a def main attached to it.
I had things like:
import sqlContext.implicits._ which was right in the code above where it was about to be used like so:
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._
import org.apache.spark.sql.functions.udf
val distance =udf {(a: Point, b: Point) =>
a.withinCircle(b, .001f); //current radius set to .0001
}
I am not sure whether I can just keep these imports inside the def main, or whether I have to move them elsewhere somehow. (Still learning Scala and wrangling the scoping, I guess.)
One way is to build your fat jar locally using the assembly plugin (https://github.com/sbt/sbt-assembly) and run publishLocal to store the resulting jar in your local ivy2 cache.
This makes it available for inclusion in your other project, under coordinates taken from the build.sbt settings of this project, e.g.:
name := "My Project"
organization := "org.me"
version := "0.1-SNAPSHOT"
This will be locally available as "org.me" %% "my-project" % "0.1-SNAPSHOT".
SBT searches the local cache before trying to download from an external repo.
However, this is considered bad practice, because only the final project should ever be a fat jar. You should never include one as a dependency (many headaches).
There is no reason to make the magellan project a fat jar if the library is included in PSGApp. Just run publishLocal without assembly.
Another way is to make the projects depend on each other as code, not as a library.
lazy val projMagellan = RootProject(file("../magellan"))
lazy val projPSGApp = project.in(file(".")).dependsOn(projMagellan)
This makes compilation in projPSGApp trigger compilation in projMagellan.
It depends on your use case, though.
Just don't get into a situation where you have to manage your .jar manually.
The other question:
import sqlContext.implicits._ should always be placed in the scope where DataFrame actions are required, after the sqlContext value has been created, so you shouldn't put that import near the other ones in the header.
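As a minimal sketch of that scoping, assuming Spark 2.2.0 as in the question's build.sbt: the import sits inside main, right after sqlContext is created, because it imports members of that value. The object name, app name, and sample data are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical application object; the name is an assumption.
object PSGApp {
  def main(args: Array[String]): Unit = {
    // The master URL is expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("PSGApp"))
    val sqlContext = new SQLContext(sc)

    // Importing from a value: only possible once sqlContext exists,
    // so this cannot live in the file header.
    import sqlContext.implicits._

    // toDF is one of the conversions brought in by the import above.
    val df = Seq((1, "a"), (2, "b")).toDF("id", "label")
    df.show()

    sc.stop()
  }
}
```

Ordinary imports such as org.apache.spark.sql.functions.udf do not depend on a value and can stay at the top of the file.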
Update
Based on the discussion in the comments, my advice would be:
Get the magellan repo
git clone git@github.com:harsha2010/magellan.git
Create a branch to work on, eg.
git checkout -b new-stuff
Change the code you want
Then update the versioning number, eg.
version := "1.0.7-SNAPSHOT"
Publish locally
sbt publishLocal
You'll see something like (after a while):
[info] published ivy to /Users/tomlous/.ivy2/local/harsha2010/magellan_2.11/1.0.7-SNAPSHOT/ivys/ivy.xml
Go to your other project
Change build.sbt to include
"harsha2010" %% "magellan" % "1.0.7-SNAPSHOT" in your libraryDependencies
Now you have a good (temp) reference to your library.
Your PSGApp should be built as a fat jar assembly to pass to Spark
sbt clean assembly
This will pull in the custom-built jar.
If the change in the magellan project is useful for the rest of the world, you should push your changes and create a pull request, so that in the future you can just include the latest build of this library.
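Put together, the PSGApp build might look roughly like the sketch below. The Spark lines and the magellan coordinates come from the question and the steps above; the sbt-assembly version is an assumption, so pick one that matches your sbt release.

```scala
// project/plugins.sbt (assumed plugin version, shown here as a comment)
// addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.6")

// build.sbt
name := "PSGApp"
version := "1.0"
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  // "provided": needed to compile, but left out of the fat jar
  "org.apache.spark" %% "spark-core" % "2.2.0" % "provided",
  "org.apache.spark" %% "spark-sql" % "2.2.0" % "provided",
  "org.apache.spark" %% "spark-streaming" % "2.2.0" % "provided",
  // Resolved from the local ivy2 cache after `sbt publishLocal` in magellan
  "harsha2010" %% "magellan" % "1.0.7-SNAPSHOT"
)
```

Because magellan is not marked "provided", sbt clean assembly should bundle it into the resulting jar while leaving Spark itself out.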

sbt: set the base-directory of a remote RootProject

Disclaimer: I am new to sbt and Scala so I might be missing obvious things.
My objective here is to use the Scala compiler as a library from my main project. I was initially doing that by manually placing the scala jars in a libs directory in my project and then including that dir in my classpath. Note that at the time I wasn't using sbt. Now, I want to use sbt and also download the scala sources from github, build the scala jars and then build my project. I start by creating 2 directories: myProject and myProject/project. I then create the following 4 files:
The sbt version file:
// File 1: project/build.properties
sbt.version=0.13.17
The plugins file (not relevant to this question):
// File 2: project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-buildinfo" % "0.7.0")
The build.sbt file:
// File 3: build.sbt
lazy val root = (project in file(".")).
settings(
inThisBuild(List(
organization := "me",
scalaVersion := "2.11.12",
version := "0.1.0-SNAPSHOT"
)),
name := "a name"
).dependsOn(ScalaDep)
lazy val ScalaDep = RootProject(uri("https://github.com/scala/scala.git"))
My source file:
// File 4: Test.scala
import scala.tools.nsc.MainClass
object Test extends App {
println("Hello World !")
}
If I run sbt inside myProject then sbt will download the scala sources from github and then try to compile them. The problem is that the base-directory is still myProject. This means that if the scala sbt source files refer to something that is in the scala base-directory they won't find it. For example, the scala/project/VersionUtil.scala file tries to open the scala/versions.properties file that lies in the scala base-directory.
Question: How can I set up sbt to download a GitHub repo and then build it using that project's base directory instead of mine (by that I mean the base directory of myProject in the above example)?
Hope that makes sense.
I would really appreciate any feedback on this.
Thanks in advance !
In the Scala ecosystem you usually depend on binary artifacts (libraries) that are published in Maven or Ivy repositories. Virtually all Scala projects publish binaries, including the compiler. So all you have to do is add the line below to your project settings:
libraryDependencies += "org.scala-lang" % "scala-compiler" % scalaVersion.value
dependsOn is used for dependencies between sub-projects in the same build.
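For contrast, here is a rough sketch of where dependsOn does fit: two sub-projects defined in the same build.sbt. The project names and directories (core, app) are hypothetical; the compiler dependency is the line given above.

```scala
// build.sbt: both sub-projects live in this one build
lazy val core = (project in file("core"))
  .settings(
    scalaVersion := "2.11.12"
  )

lazy val app = (project in file("app"))
  .dependsOn(core) // source-level dependency on a sibling sub-project
  .settings(
    scalaVersion := "2.11.12",
    // Binary dependency on the published compiler artifact
    libraryDependencies += "org.scala-lang" % "scala-compiler" % scalaVersion.value
  )
```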
For browsing sources you could use an IDE. IntelliJ IDEA can readily import sbt projects and download/attach sources for library dependencies. Eclipse has an sbt plugin that does the same, as does Ensime. Or just git clone the repository.

How do I call a dependent library function from an sbt task?

I have a CLI tool written in Java which can modify some source with the added params. For example, it can rename an enum value across a whole project.
I want to write an sbt task that can run this tool from my project dir with the given params, like sbt 'enums -rename A B'. My tool can be injected to the project through the sbt dependencies.
I skimmed through the book sbt in Action looking for an answer, but those examples are not this specific.
My build.sbt (far from working):
name := """toolTestWithActivator"""
version := "1.0-SNAPSHOT"
resolvers += "Local Repository" at "file://C:/Users/torcsi/.ivy2/local"
lazy val root = (project in file(".")).enablePlugins(PlayJava)
scalaVersion := "2.11.6"
libraryDependencies ++= Seq(
"tool" % "tool_2.11" % "1.0",
javaJdbc,
javaEbean,
cache,
javaWs
)
val mytool = taskKey[String]("mytool")
mytool := {
com.my.tool.Main
}
Can sbt handle this type of task/dependency structure, or do I need to do this another way?
SBT is recursive: it compiles the .sbt files and the .scala files under the project folder and uses those to execute your build (in fact, you can see sbt as a library that helps you produce builds).
So, as you need your library to define a task, it is a dependency of your build.sbt file (and not a dependency of your project).
To declare that the build.sbt file depends on your library, just create a ".sbt" file in the project folder, for example:
project/dependencies.sbt
libraryDependencies += "tool" %% "tool" % "1.0"
and in build.sbt add:
val mytool = taskKey[Unit]("mytool")
mytool := {
  // Main is the tool's entry-point class, as named in the question
  com.my.tool.Main.main(Array.empty[String])
}
Some comments:
Be careful with the Scala version used: sbt 0.13 is compiled with Scala 2.10, so your library should also be compiled for Scala 2.10 (the artifact should be tool_2.10). The newer sbt 1.0 is compiled with Scala 2.12.
I used the %% notation so that sbt appends the expected Scala version by itself.
I assumed your CLI tool defines a classic Java main method (or the Scala equivalent). So the argument should be an Array of String (here an empty one) and it returns Unit (void in Java).
Some reference to understand the solution:
http://www.scala-sbt.org/0.13/docs/Organizing-Build.html
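To pass arguments as in sbt 'enums -rename A B', an input task can be used instead of a plain task. This is only a sketch: it assumes the tool's entry point is com.my.tool.Main with a standard main method, and that the tool is on the build classpath via project/dependencies.sbt as described above.

```scala
// build.sbt
import sbt.complete.DefaultParsers._

val enums = inputKey[Unit]("Runs the CLI tool with the given arguments")

enums := {
  // Everything typed after the task name, split on whitespace
  val args: Seq[String] = spaceDelimited("<arg>").parsed
  // Assumed entry point; adjust to the tool's real main class
  com.my.tool.Main.main(args.toArray)
}
```

With this in place, sbt "enums -rename A B" would hand Array("-rename", "A", "B") to the tool.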

How to install library with SBT libraryDependencies in an Intellij project

I am very new to SBT, Breeze and IntelliJ, though I have a decent grasp of Scala and I am trying to install the Breeze library, which I think is managed.
What I've done:
I followed the instructions on this page and added this script to the build.sbt file in my project:
libraryDependencies ++= Seq(
// other dependencies here
"org.scalanlp" %% "breeze" % "0.10",
// native libraries are not included by default. add this if you want them (as of 0.7)
// native libraries greatly improve performance, but increase jar sizes.
"org.scalanlp" %% "breeze-natives" % "0.10"
)
resolvers ++= Seq(
// other resolvers here
"Sonatype Releases" at "https://oss.sonatype.org/content/repositories/releases/"
)
// Scala 2.9.2 is still supported for 0.2.1, but is dropped afterwards.
scalaVersion := "2.11.1" // or 2.10.3 or later
I then ran sbt update in the project directory (via the terminal), and saw that all the pieces of Breeze downloaded.
I then tried re-running sbt update, but this did not trigger another download.
Issue:
The problem is that I cannot access the library via IntelliJ. import breeze._ gives the standard Cannot resolve symbol breeze and I couldn't find any mention of Breeze in "Project Structure." It isn't in the lib directory of the project either.
Am I missing a step?
Sounds like the IntelliJ project is in a bad state. Try removing the .idea directory from the project directory and then re-importing the project into IntelliJ using the wizard.

Play Scala SBT not showing dependency in Reference Library in Eclipse

I created a new project using Play Scala and Eclipse. I added the Squeryl dependency and can see that it is pulled in at compile time. I confirmed it is present in the .ivy2/cache/org.squeryl directory, but the Eclipse project does not pick it up, which causes a compilation error on the import.
build.sbt
name := """registration"""
version := "1.0-SNAPSHOT"
lazy val root = (project in file(".")).enablePlugins(PlayScala)
scalaVersion := "2.11.1"
libraryDependencies ++= Seq(
jdbc,
anorm,
cache,
ws,
"org.squeryl" % "squeryl_2.10" % "0.9.6-RC2"
)
It looks like Squeryl doesn't have a binary readily available for Scala 2.11 yet, according to http://www.squeryl.org/getting-started.html
So if you want to use a pre-compiled version of this library, you must change your Scala version to 2.10.4.
All versions of squeryl available can be found at: http://mvnrepository.com/artifact/org.squeryl
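A sketch of the corresponding change in build.sbt, assuming the rest of the file stays as in the question. Using %% lets sbt append the Scala binary version, so the artifact suffix can no longer drift from scalaVersion:

```scala
// build.sbt (only the lines that change)
scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  jdbc,
  anorm,
  cache,
  ws,
  // %% resolves to squeryl_2.10 under the scalaVersion above
  "org.squeryl" %% "squeryl" % "0.9.6-RC2"
)
```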
I had a similar case using Eclipse.
Select Project --> Clean to clean your workspace, and build it again if you have not checked "Build Automatically".
If it's still not visible, refresh the package explorer (or just the 'Referenced Libraries' node) with F5.