Building Apache Kafka - Scala

I am trying to build Kafka with Scala 2.10.1. I followed the steps given on GitHub. At the end it generates a JAR in the target directory, but that JAR is empty and only about 5 KB in size. Am I missing something here? I am totally new to SBT.
1) ./sbt update
2) ./sbt package
3) ./sbt assembly-package-dependency
To build for a particular version of Scala (either 2.8.0, 2.8.2, 2.9.1, 2.9.2 or 2.10.1), change step 2 above to: ./sbt "++2.8.0 package"

Actually, the Kafka jar is located in the core/target/scala-2.10/ directory, and the dependencies are in the Ivy cache.
Execute ./sbt release-zip to get an archive in target/RELEASE/ with all dependencies and shell scripts packaged.
To build a release for a particular Scala version, add the version parameter to the build command:
./sbt "++2.10.1 release-zip"

Related

Install sbt from source

My operating system distribution does not provide a default sbt package, so I am trying to compile it from source and install the sbt package locally (https://github.com/sbt/sbt). Unfortunately I cannot do it, and I cannot find any guide for doing this without sbt already installed. Is sbt needed to compile sbt from source? What can I do to install sbt from source?
The answer is yes: to build sbt from source, you need to install sbt first (obviously, not from source). There are instructions for Building sbt from source and they start with
Install the current stable binary release of sbt (see Setup), which will be used to build sbt from source.
If you have a concrete problem installing sbt from binaries, you should solve it first. You can ask for help with it in a new question.
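For reference, a minimal sketch of what those instructions boil down to, assuming a binary sbt launcher is already on the PATH (the publishLocal step reflects the usual sbt developer workflow and is my assumption, not something stated in this thread):
git clone https://github.com/sbt/sbt.git
cd sbt
sbt publishLocal
This builds sbt from source using the binary sbt you installed first and publishes the resulting artifacts to your local Ivy repository.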

Installing sbt on Windows 10 for Scala course

Instructions for the course say to use version 0.13.x.
I installed the latest msi from the sbt site, but when I type "sbt about", I get:
Microsoft Windows [Version 10.0.15063]
(c) 2017 Microsoft Corporation. All rights reserved.
C:\Users\reall>sbt about
Error: Unable to access jarfile
Copying runtime jar.
The filename, directory name, or volume label syntax is incorrect.
Error: Unable to access jarfile
"C:\Users\reall\.sbt\preloaded\org.scala-sbt\sbt\"1.0.2"\jars\sbt.jar"
Java HotSpot(TM) 64-Bit Server VM warning: Ignoring option MaxPermSize; support was removed in 8.0
[info] Loading project definition from C:\Users\reall\project
[info] Set current project to reall (in build file:/C:/Users/reall/)
[info] This is sbt 1.0.2
[info] The current project is {file:/C:/Users/reall/}reall 0.1-SNAPSHOT
[info] The current project is built against Scala 2.12.3
[info] Available Plugins: sbt.plugins.IvyPlugin, sbt.plugins.JvmPlugin, sbt.plugins.CorePlugin, sbt.plugins.JUnitXmlReportPlugin, sbt.plugins.Giter8TemplatePlugin
[info] sbt, sbt plugins, and build definitions are using Scala 2.12.3
i.e., a jar file error and the sbt version 1.0.2.
Any idea what I'm doing wrong?
You are not doing anything wrong; it's just that the required version 0.13.x is not the latest anymore. So you can either follow Dmytro Mitin's answer and reinstall sbt, or you can keep using the one you already have: what you installed is the sbt launcher, which can be used to run different versions of sbt depending on the project. So it's not important which launcher version you are using (unless you're working on something very sbt-specific).
Normally, every sbt project has a project/build.properties file with the sbt version that is needed to work with it:
sbt.version=0.13.16
So you can change (or create) this file and when you run sbt in the project root folder, it will launch sbt version 0.13.16.
Another way to launch a specific version of sbt is to run it with the -sbt-version option:
sbt -sbt-version 0.13.16
or using the -D flag:
sbt -Dsbt.version=0.13.16
which has exactly the same effect as editing project/build.properties.
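If you go this route, a quick way to check which sbt version a project will actually pick up (my addition, just a common check) is:
sbt sbtVersion
which prints the sbt version resolved from project/build.properties (or the launcher's default if that file is absent).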
Don't install the latest version (the latest one is 1.0.2); install 0.13.16 instead.
You can download it here: http://www.scala-sbt.org/download.html
There are msi and zip files.
See also the official guide: Installing sbt on Windows.

Building Customized Spark

We are creating a customized version of Spark, since we are changing some lines of code in ALS.scala. We build the customized Spark version using the mvn-based command:
./make-distribution.sh --name custom-spark --tgz -Psparkr -Phadoop-2.6 -Phive -Phive-thriftserver -Pyarn
However, upon using the customized version of Spark, we run into this error:
Do you have any idea what causes the error and how we might solve the issue?
I am actually using a jar file on the local machine, built with sbt (sbt compile, then sbt clean package) and placed here: /Users/user/local/kernel/kernel-0.1.5-SNAPSHOT/lib.
However, in the Hadoop environment the installation is different, so I use Maven to build Spark, and that's where the error comes in. I am thinking that this error might depend on using Maven to build Spark, as there are some reports like this:
https://issues.apache.org/jira/browse/SPARK-2075
or possibly on how the Spark assembly files are built.
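For comparison, the local workflow described above amounts to roughly the following; the jar file name is only an illustration matching the path given earlier, and the Scala version is an assumption:
sbt clean package
cp target/scala-2.10/kernel-0.1.5-SNAPSHOT.jar /Users/user/local/kernel/kernel-0.1.5-SNAPSHOT/lib/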

How to build a bundled sbt from source for offline use?

My goal is to have an sbt jar file with all dependencies in order to create a Debian package, so it can be installed on a machine without having to check for or install packages on the first run.
Is sbt-assembly the right choice for building an sbt jar with all dependencies?
The sbt binary distribution doesn't come with dependencies; sbt downloads them on the first run.
I don't fully understand your use case, but would sbt-native-packager .deb format be a good fit?
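For what it's worth, here is a minimal sketch of what the sbt-native-packager route could look like; the plugin version, maintainer, and descriptions are placeholder assumptions rather than values from the question, and this uses sbt 0.13-style syntax:
project/plugins.sbt:
addSbtPlugin("com.typesafe.sbt" % "sbt-native-packager" % "1.3.4")
build.sbt:
enablePlugins(JavaAppPackaging, DebianPlugin)
maintainer := "you@example.com"
packageSummary := "my-app"
packageDescription := "Application packaged with all of its dependency jars"
Running sbt debian:packageBin should then produce a .deb under target/ that bundles the application's dependency jars, so nothing has to be downloaded at first run (building the native .deb does require the dpkg tooling to be installed).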

Running Spark sbt project without sbt?

I have a Spark project which I can run from sbt console. However, when I try to run it from the command line, I get Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/SparkContext. This is expected, because the Spark libs are listed as provided in the build.sbt.
How do I configure things so that I can run the JAR from the command line, without having to use sbt console?
To run Spark stand-alone you need to build a Spark assembly.
Run sbt/sbt assembly in the Spark root dir. This will create: assembly/target/scala-2.10/spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4.jar
Then you build your job jar with dependencies (either with sbt assembly or maven-shade-plugin)
You can use the resulting binaries to run your spark job from the command line:
ADD_JARS=job-jar-with-dependencies.jar SPARK_LOCAL_IP=<IP> java -cp spark-assembly-1.0.0-SNAPSHOT-hadoop1.0.4.jar:job-jar-with-dependencies.jar com.example.jobs.SparkJob
Note: If you need a different HDFS version, you need to follow additional steps before building the assembly. See About Hadoop Versions.
Using the sbt assembly plugin we can create a single jar. After doing that you can simply run it using the java -jar command.
For more details, refer to the sbt-assembly documentation.
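As a rough illustration of that approach, the job's build could look something like the following; the Spark, Scala, and plugin versions are assumptions for the sketch, while the main class matches the command shown above:
project/assembly.sbt:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")
build.sbt:
name := "spark-job"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0" % "provided"
mainClass in assembly := Some("com.example.jobs.SparkJob")
Running sbt assembly then produces target/scala-2.10/spark-job-assembly-0.1-SNAPSHOT.jar with your code and the non-provided dependencies bundled, while Spark itself is still supplied at runtime by the spark-assembly jar on the classpath.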