I'm writing a script to get Cassandra and Spark working together, but I can't even get the program to compile. I am using sbt as the build tool and have declared all the dependencies the program requires. The first time I ran sbt run it downloaded the dependencies, but compiling the Scala code failed with the errors shown below:
[info] Compiling 1 Scala source to /home/vagrant/ScalaTest/target/scala-2.10/classes...
[error] /home/vagrant/ScalaTest/src/main/scala/ScalaTest.scala:6: not found: type SparkConf
[error] val conf = new SparkConf(true)
[error] ^
[error] /home/vagrant/ScalaTest/src/main/scala/ScalaTest.scala:9: not found: type SparkContext
[error] val sc = new SparkContext("spark://", "test", conf)
[error] ^
[error] two errors found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 3 s, completed Jun 5, 2015 2:40:09 PM
This is the sbt build file:
lazy val root = (project in file(".")).
  settings(
    name := "ScalaTest",
    version := "1.0"
  )

libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" % "1.3.0-M1"
and this is the actual Scala program:
import com.datastax.spark.connector._
object ScalaTest {
  def main(args: Array[String]) {
    val conf = new SparkConf(true)
      .set("spark.cassandra.connection.host", "")
    val sc = new SparkContext("spark://", "test", conf)
  }
}
Here is my directory structure:
- ScalaTest
  - build.sbt
  - project
  - src
    - main
      - scala
        - ScalaTest.scala
  - target
I don't know if this is the whole problem, but you're not importing the SparkConf and SparkContext classes. Try adding the following to your Scala file:
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
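For reference, here is a minimal sketch of what the file might look like with those imports in place. The empty host string and the bare "spark://" master URL are kept exactly as they appear in the question and need to be filled in with real values. The sc.stop() call is an addition for tidiness. It also assumes spark-core ends up on the compile classpath (for example, pulled in transitively by the connector); if the new imports fail with "object apache is not a member of package org", adding spark-core to libraryDependencies explicitly would be the next thing to try.

```scala
import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

object ScalaTest {
  def main(args: Array[String]): Unit = {
    // Host and master URL are left as elided in the question; supply your own.
    val conf = new SparkConf(true)
      .set("spark.cassandra.connection.host", "")
    val sc = new SparkContext("spark://", "test", conf)

    // Not in the original program: release the context when done.
    sc.stop()
  }
}
```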
The build.sbt is as follows:
name := "ScalaVertxTest"
version := "0.1"
scalaVersion := "2.12.8"
libraryDependencies += "io.vertx" %% "vertx-lang-scala" % "3.6.3"
In the Scala file, I am just trying to create a Vertx instance as follows:
package com.example
import io.vertx.scala.core._
object Main {
  def main(args: Array[String]): Unit = {
    println("Hello Vertx Scala")
    var vertx = Vertx.vertx()
  }
}
The sbt compile command generates the following error message:
com/example/Main.scala:3:11: object vertx is not a member of package io
[error] import io.vertx.scala.core._
[error] ^
[error] com/example/Main.scala:12:17: not found: value Vertx
[error] var vertx = Vertx.vertx()
[error] ^
[error] two errors found
How do I create a Vertx instance in Scala?
I was having a problem building in the IntelliJ IDE.
When compiled with sbt from the console, the program builds correctly.
I want to deploy and submit a Spark program using sbt, but it is throwing an error.
package in.goai.spark
import org.apache.spark.{SparkContext, SparkConf}
object SparkMeApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("First Spark")
    val sc = new SparkContext(conf)
    val fileName = args(0)
    val lines = sc.textFile(fileName).cache
    val c = lines.count
    println(s"There are $c lines in $fileName")
  }
}
name := "First Spark"
version := "1.0"
organization := "in.goai"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.1"
resolvers += Resolver.mavenLocal
Under first/project directory
When I try to run sbt package, it throws the error given below.
[root@hadoop first]# sbt package
[info] Loading project definition from /home/training/workspace_spark/first/project
[info] Set current project to First Spark (in build file:/home/training/workspace_spark/first/)
[info] Compiling 1 Scala source to /home/training/workspace_spark/first/target/scala-2.11/classes...
[error] /home/training/workspace_spark/first/src/main/scala/LineCount.scala:3: object apache is not a member of package org
[error] import org.apache.spark.{SparkContext, SparkConf}
[error] ^
[error] /home/training/workspace_spark/first/src/main/scala/LineCount.scala:9: not found: type SparkConf
[error] val conf = new SparkConf().setAppName("First Spark")
[error] ^
[error] /home/training/workspace_spark/first/src/main/scala/LineCount.scala:11: not found: type SparkContext
[error] val sc = new SparkContext(conf)
[error] ^
[error] three errors found
[error] (compile:compile) Compilation failed
[error] Total time: 4 s, completed May 10, 2018 4:05:10 PM
I have also tried extending App instead of defining main, but nothing changed.
Please remove resolvers += Resolver.mavenLocal from build.sbt. Since spark-core is available from Maven Central, there is no need for a local resolver.
After that, try sbt clean package.
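As a sketch, the build.sbt from the question with that one line removed would look like this. Everything else is copied from the question unchanged; the blank lines between settings are only there because older sbt versions require them.

```scala
// build.sbt (sketch): the question's settings without the local Maven resolver
name := "First Spark"

version := "1.0"

organization := "in.goai"

scalaVersion := "2.11.8"

// %% appends the Scala binary version, so this resolves spark-core_2.11
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.1"
```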
Any idea why we get these errors?
ubuntu@group-3-vm1:~/software/sbt/bin$ ./sbt package
[info] Set current project to hello (in build file:/home/ubuntu/software/sbt/bin/)
[info] Compiling 1 Scala source to /home/ubuntu/software/sbt/bin/target/scala-2.11/classes...
[error] /home/ubuntu/software/sbt/bin/hi.scala:1: object apache is not a member of package org
[error] import org.apache.spark.SparkContext
[error] ^
[error] /home/ubuntu/software/sbt/bin/hi.scala:2: object apache is not a member of package org
[error] import org.apache.spark.SparkContext._
[error] ^
The code is:
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.api.java._
import org.apache.spark.api.java.function.Function_
import org.apache.spark.graphx._
import org.apache.spark.graphx.lib._
import org.apache.spark.graphx.PartitionStrategy._
import org.apache.spark.SparkConf       // added: SparkConf is used below
import org.apache.spark.sql.SQLContext  // added: SQLContext is used below
//class PartBQ1{
object PartBQ1{
  val conf = new SparkConf().setMaster("spark://")
    .set("spark.driver.memory", "1g")
    .set("spark.eventLog.enabled", "true")
    .set("spark.eventLog.dir", "/home/ubuntu/storage/logs")
    .set("spark.executor.memory", "21g")
    .set("spark.executor.cores", "4")
    .set("spark.cores.max", "4")
    .set("spark.task.cpus", "1")
  val sc = new SparkContext(conf=conf)
  val sql_ctx = new SQLContext(sc)
  val graph = GraphLoader.edgeListFile(sc, "data2.txt")
}
It seems you are missing an sbt build file that declares Spark as a dependency. Something like:
name := "Simple Project"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.1"
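Since the code in the question also imports org.apache.spark.graphx._ and uses SQLContext, spark-core alone is probably not enough. Below is a hedged sketch of a fuller build file. The spark-graphx and spark-sql entries are assumptions inferred from those imports, and the version numbers simply follow the example above; scalaVersion should match the Scala version your Spark build uses.

```scala
// build.sbt (sketch): place it in the project root and run sbt from that directory
name := "Simple Project"

version := "1.0"

scalaVersion := "2.10.4"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"   % "1.5.1",
  "org.apache.spark" %% "spark-graphx" % "1.5.1", // assumption: needed for org.apache.spark.graphx._
  "org.apache.spark" %% "spark-sql"    % "1.5.1"  // assumption: needed for SQLContext
)
```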
I have Scala code like the following:
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark._
object RecipeIO {
  val sc = new SparkContext(new SparkConf().setAppName("Recipe_Extraction"))
  def read(INPUT_PATH: String): org.apache.spark.rdd.RDD[(String)]= {
    val data = sc.wholeTextFiles("INPUT_PATH")
    val files = data.map { case (filename, content) => filename}
  }
}
When I compile this code using sbt, it gives me the error:
value wholeTextFiles is not a member of org.apache.spark.SparkContext.
I am importing everything that is required, but it still gives me this error.
When I replace wholeTextFiles with textFile, however, the code compiles.
What might be the problem here, and how do I resolve it?
Thanks in advance!
Scala compiler version 2.10.2
[info] Set current project to RecipeIO (in build file:/home/akshat/RecipeIO/)
[info] Compiling 1 Scala source to /home/akshat/RecipeIO/target/scala-2.10.4/classes...
[error] /home/akshat/RecipeIO/src/main/scala/RecipeIO.scala:14: value wholeTexFiles is not a member of org.apache.spark.SparkContext
[error] val data = sc.wholeTexFiles(INPUT_PATH)
[error] ^
[error] one error found
[error] {file:/home/akshat/RecipeIO/}default-55aff3/compile:compile: Compilation failed
[error] Total time: 16 s, completed Jun 15, 2015 11:07:04 PM
My build.sbt file looks like this:
name := "RecipeIO"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "0.9.0-incubating"
libraryDependencies += "org.eclipse.jetty" % "jetty-server" % "8.1.2.v20120308"
ivyXML :=
<dependency org="org.eclipse.jetty.orbit" name="javax.servlet" rev="3.0.0.v201112011016">
<artifact name="javax.servlet" type="orbit" ext="jar"/>
</dependency>
You have a typo: it should be wholeTextFiles instead of wholeTexFiles.
As a side note, I think you want sc.wholeTextFiles(INPUT_PATH) and not sc.wholeTextFiles("INPUT_PATH") if you really want to use the INPUT_PATH variable.
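Putting both points together, a corrected version might look roughly like the sketch below. The parameter name inputPath and the explicit return of the mapped RDD are my own choices, not taken from the question. One more thing worth checking: as far as I know, wholeTextFiles was only added to SparkContext in Spark 1.0.0, so with the 0.9.0-incubating dependency from your build.sbt it may still be reported as missing even after the typo is fixed; the API docs for your version should confirm this.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object RecipeIO {
  val sc = new SparkContext(new SparkConf().setAppName("Recipe_Extraction"))

  // Returns the path of every file found under inputPath.
  def read(inputPath: String): RDD[String] = {
    // wholeTextFiles yields (path, content) pairs, one per file.
    val data = sc.wholeTextFiles(inputPath)
    // Keep only the path; this last expression is the return value.
    data.map { case (filename, _) => filename }
  }
}
```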
I installed Spark on Ubuntu 14.04 following this tutorial: http://blog.prabeeshk.com/blog/2014/10/31/install-apache-spark-on-ubuntu-14-dot-04/
I am able to run the examples shipped with Spark, and they seem to work.
The problem is that I cannot write a Scala file and execute it with Spark. This is what I have done, following the guidelines at https://spark.apache.org/docs/latest/quick-start.html
My standalone app is:
/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.commons.math3.random.RandomDataGenerator
object SimpleApp {
  def main(args: Array[String]) {
    val logFile = "/home/donbeo/Applications/spark/spark-1.1.0/README.md" // Should be some file on your system
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
    println("A random number")
    val randomData = new RandomDataGenerator()
    println(randomData.nextLong(0, 100))
  }
}
and my sbt file is :
name := "Simple Project"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.1.0"
libraryDependencies += "org.apache.commons" % "commons-math3" % "3.3"
My project structure is:
donbeo@donbeo-HP-EliteBook-Folio-9470m:~/Documents/scala_code/simpleApp$ find .
and then I run
donbeo@donbeo-HP-EliteBook-Folio-9470m:~/Documents/scala_code/simpleApp$ sbt package
[info] Set current project to Simple Project (in build file:/home/donbeo/Documents/scala_code/simpleApp/)
[info] Updating {file:/home/donbeo/Documents/scala_code/simpleApp/}simpleapp...
[info] Resolving org.eclipse.jetty.orbit#javax.transaction;1.1.1.v201105210645 .
[info] Resolving org.eclipse.jetty.orbit#javax.mail.glassfish;1.4.1.v20100508202
[info] Resolving org.eclipse.jetty.orbit#javax.activation;1.1.0.v201105071233 ..
[info] Resolving org.spark-project.akka#akka-remote_2.10;2.2.3-shaded-protobuf .
[info] Resolving org.spark-project.akka#akka-actor_2.10;2.2.3-shaded-protobuf ..
[info] Resolving org.spark-project.akka#akka-slf4j_2.10;2.2.3-shaded-protobuf ..
[info] Resolving org.fusesource.jansi#jansi;1.4 ...
[info] Done updating.
[info] Compiling 1 Scala source to /home/donbeo/Documents/scala_code/simpleApp/target/scala-2.10/classes...
[info] Packaging /home/donbeo/Documents/scala_code/simpleApp/target/scala-2.10/simple-project_2.10-1.0.jar ...
[info] Done packaging.
[success] Total time: 8 s, completed 04-Feb-2015 15:20:09
and at the final step I get an error
donbeo@donbeo-HP-EliteBook-Folio-9470m:~/Applications/spark/spark-1.1.0$ ./bin/spark-submit \ --class "SimpleApp" \ --master local[4] \ /home/donbeo/Documents/scala_code/simpleApp/target/scala-2.10/simple-project_2.10-1.0.jar
Exception in thread "main" java.net.URISyntaxException: Illegal character in path at index 0: --class
at java.net.URI$Parser.fail(URI.java:2829)
at java.net.URI$Parser.checkChars(URI.java:3002)
at java.net.URI$Parser.parseHierarchical(URI.java:3086)
at java.net.URI$Parser.parse(URI.java:3044)
at java.net.URI.<init>(URI.java:595)
at org.apache.spark.util.Utils$.resolveURI(Utils.scala:1343)
at org.apache.spark.deploy.SparkSubmitArguments.parse$1(SparkSubmitArguments.scala:338)
at org.apache.spark.deploy.SparkSubmitArguments.parseOpts(SparkSubmitArguments.scala:225)
at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:60)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:70)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Am I doing something wrong? How can I solve this?
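You need to remove all the \ characters from their command-line example. They are line-continuation markers that only make sense at the very end of a line. Typed on a single line, each \ escapes the space that follows it, so spark-submit receives " --class" (with a leading space) as one argument and tries to parse it as the application path, which is why the URISyntaxException points at index 0. On one line the command is: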
./bin/spark-submit --class "SimpleApp" --master local[4] /home/donbeo/Documents/scala_code/simpleApp/target/scala-2.10/simple-project_2.10-1.0.jar