I want to deploy and submit a spark program using sbt but its throwing error.
Code:
package in.goai.spark
import org.apache.spark.{SparkContext, SparkConf}
object SparkMeApp {
def main(args: Array[String]) {
val conf = new SparkConf().setAppName("First Spark")
val sc = new SparkContext(conf)
val fileName = args(0)
val lines = sc.textFile(fileName).cache
val c = lines.count
println(s"There are $c lines in $fileName")
}
}
build.sbt
name := "First Spark"
version := "1.0"
organization := "in.goai"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.1"
resolvers += Resolver.mavenLocal
Under first/project directory
build.properties
bt.version=0.13.9
When I am trying to run sbt package its throwing error given below.
[root#hadoop first]# sbt package
[info] Loading project definition from /home/training/workspace_spark/first/project
[info] Set current project to First Spark (in build file:/home/training/workspace_spark/first/)
[info] Compiling 1 Scala source to /home/training/workspace_spark/first/target/scala-2.11/classes...
[error] /home/training/workspace_spark/first/src/main/scala/LineCount.scala:3: object apache is not a member of package org
[error] import org.apache.spark.{SparkContext, SparkConf}
[error] ^
[error] /home/training/workspace_spark/first/src/main/scala/LineCount.scala:9: not found: type SparkConf
[error] val conf = new SparkConf().setAppName("First Spark")
[error] ^
[error] /home/training/workspace_spark/first/src/main/scala/LineCount.scala:11: not found: type SparkContext
[error] val sc = new SparkContext(conf)
[error] ^
[error] three errors found
[error] (compile:compile) Compilation failed
[error] Total time: 4 s, completed May 10, 2018 4:05:10 PM
I have tried with extends to App too but no change.
Please remove resolvers += Resolver.mavenLocal from build.sbt. Since spark-core is available on Maven, we don't need to use local resolvers.
After that, you can try sbt clean package.
Related
Hi i am trying to set up a small spark application in SBT,
My build.sbt is
import Dependencies._
name := "hello"
version := "1.0"
scalaVersion := "2.11.8"
val sparkVersion = "1.6.1"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % sparkVersion,
"org.apache.spark" %% "spark-streaming" % sparkVersion,
"org.apache.spark" %% "spark-streaming-twitter" % sparkVersion
)
libraryDependencies += scalaTest % Test
Everything works fine i get all dependencies resolved by SBT, but when i try importing spark in my hello.scala project file i get this error
not found: value spark
my hello.scala file is
package example
import org.apache.spark._
import org.apache.spark.SparkContext._
object Hello extends fileImport with App {
println(greeting)
anime.select("*").orderBy($"rating".desc).limit(10).show()
}
trait fileImport {
lazy val greeting: String = "hello"
var anime = spark.read.option("header", true).csv("C:/anime.csv")
var ratings = spark.read.option("header", true).csv("C:/rating.csv")
}
here is error file i get
[info] Compiling 1 Scala source to C:\Users\haftab\Downloads\sbt-0.13.16\sbt\alfutaim\target\scala-2.11\classes...
[error] C:\Users\haftab\Downloads\sbt-0.13.16\sbt\alfutaim\src\main\scala\example\Hello.scala:12: not found: value spark
[error] var anime = spark.read.option("header", true).csv("C:/anime.csv")
[error] ^
[error] C:\Users\haftab\Downloads\sbt-0.13.16\sbt\alfutaim\src\main\scala\example\Hello.scala:13: not found: value spark
[error] var ratings = spark.read.option("header", true).csv("C:/rating.csv")
[error] ^
[error] two errors found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 3 s, completed Sep 10, 2017 1:44:47 PM
spark is initialized in spark-shell only
but for the code you need to initialize the spark variable by yourself
import org.apache.spark.sql.SparkSession
val spark = SparkSession.builder().appName("testings").master("local").getOrCreate
you can change the testings name to your desired name .master option is optional if you want to run the code using spark-submit
I am using the spark-cassandra-connector with Scala and I want to read data from cassandra and display it via the method toArray.
However, I get an error message that it is not member of a class, but it is indicated in the API. Could somebody help me in finding my error?
Here are my files:
build.sbt:
name := "Simple_Project"
version := "1.0"
scalaVersion := "2.11.8"
assemblyMergeStrategy in assembly := {
case PathList("META-INF", xs # _*) => MergeStrategy.discard
case x => MergeStrategy.first
}
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0-preview"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0-preview"
resolvers += "Spark Packages Repo" at "https://dl.bintray.com/spark-packages/maven"
libraryDependencies += "datastax" % "spark-cassandra-connector" % "2.0.0-M2-s_2.11"
SimpleScala.scala:
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark.sql._
import org.apache.spark.sql.functions._
import com.datastax.spark.connector._
import com.datastax.spark.connector.rdd._
import org.apache.spark.sql.cassandra._
import org.apache.spark.sql.SQLContext
import com.datastax.spark.connector.cql.CassandraConnector._
object SimpleApp {
def main(args: Array[String]) {
val conf = new SparkConf().setAppName("Simple Application")
conf.set("spark.cassandra.connection.host", "127.0.0.1")
val sc = new SparkContext(conf)
val rdd_2 = sc.cassandraTable("test_2", "words")
rdd_2.toArray.foreach(println)
}
}
Functions for cqlsh:
CREATE KEYSPACE test_2 WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 1 };
CREATE TABLE test_2.words (word text PRIMARY KEY, count int);
INSERT INTO test_2.words (word, count) VALUES ('foo', 20);
INSERT INTO test_2.words (word, count) VALUES ('bar', 20);
Error message:
[info] Loading global plugins from /home/andi/.sbt/0.13/plugins
[info] Resolving org.scala-sbt.ivy#ivy;2.3.0-sbt-2cc8d2761242b072cedb0a04cb39435[info] Resolving org.fusesource.jansi#jansi;1.4 ...
[info] Done updating.
[info] Loading project definition from /home/andi/test_spark/project
[info] Updating {file:/home/andi/test_spark/project/}test_spark-build...
[info] Resolving org.scala-sbt.ivy#ivy;2.3.0-sbt-2cc8d2761242b072cedb0a04cb39435[info] Resolving org.fusesource.jansi#jansi;1.4 ...
[info] Done updating.
[info] Set current project to Simple_Project (in build file:/home/andi/test_spark/)
[info] Compiling 1 Scala source to /home/andi/test_spark/target/scala-2.11/classes...
[error] /home/andi/test_spark/src/main/scala/SimpleApp.scala:50: value toArray is not a member of com.datastax.spark.connector.rdd.CassandraTableScanRDD[com.datastax.spark.connector.CassandraRow]
[error] rdd_2.toArray.foreach(println)
[error] ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed
Many thanks in advance,
Andi
CassandraTableScanRDD.toArray method has been deprecated and removed since 2.0.0 release of Spark Cassandra Connector. This method was there until 1.6.0 release. You can use collect method instead.
Unfortunately, the document Spark Cassandra Connector still uses toArray. Anyway, here is that will work
rdd_2.collect.foreach(println)
Any idea why we get these errors?
ubuntu#group-3-vm1:~/software/sbt/bin$ ./sbt package
[info] Set current project to hello (in build file:/home/ubuntu/software/sbt/bin/)
[info] Compiling 1 Scala source to /home/ubuntu/software/sbt/bin/target/scala-2.11/classes...
[error] /home/ubuntu/software/sbt/bin/hi.scala:1: object apache is not a member of package org
[error] import org.apache.spark.SparkContext
[error] ^
[error] /home/ubuntu/software/sbt/bin/hi.scala:2: object apache is not a member of package org
[error] import org.apache.spark.SparkContext._
[error] ^
the code is:
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.api.java._
import org.apache.spark.api.java.function.Function_
import org.apache.spark.graphx._
import org.apache.spark.graphx.lib._
import org.apache.spark.graphx.PartitionStrategy._
//class PartBQ1{
object PartBQ1{
val conf = new SparkConf().setMaster("spark://10.0.1.31:7077")
.setAppName("CS-838-Assignment2-Question2")
.set("spark.driver.memory", "1g")
.set("spark.eventLog.enabled", "true")
.set("spark.eventLog.dir", "/home/ubuntu/storage/logs")
.set("spark.executor.memory", "21g")
.set("spark.executor.cores", "4")
.set("spark.cores.max", "4")
.set("spark.task.cpus", "1")
val sc = new SparkContext(conf=conf)
sql_ctx = new SQLContext(sc)
graph = GraphLoader.edgeListFile(sc, "data2.txt")
}
Seems to be missing a sbt file. Like:
simple.sbt
name := "Simple Project"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.5.1"
I'm writing a script to try to get Cassandra and Spark working together but I can't even get the program to compile. I am using SBT as the build tool and I have all the dependencies required for the program declared. The first time I ran sbt run it downloaded the dependencies but I would get an error when it started compiling the scala code shown below:
[info] Compiling 1 Scala source to /home/vagrant/ScalaTest/target/scala-2.10/classes...
[error] /home/vagrant/ScalaTest/src/main/scala/ScalaTest.scala:6: not found: type SparkConf
[error] val conf = new SparkConf(true)
[error] ^
[error] /home/vagrant/ScalaTest/src/main/scala/ScalaTest.scala:9: not found: type SparkContext
[error] val sc = new SparkContext("spark://192.168.10.11:7077", "test", conf)
[error] ^
[error] two errors found
[error] (compile:compileIncremental) Compilation failed
[error] Total time: 3 s, completed Jun 5, 2015 2:40:09 PM
This is the SBT build file
lazy val root = (project in file(".")).
settings(
name := "ScalaTest",
version := "1.0"
)
libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" % "1.3.0-M1"
and this is the actual Scala program
import com.datastax.spark.connector._
object ScalaTest {
def main(args: Array[String]) {
val conf = new SparkConf(true)
.set("spark.cassandra.connection.host", "127.0.0.1")
val sc = new SparkContext("spark://192.168.10.11:7077", "test", conf)
}
}
Here is my directory structure
- ScalaTest
- build.sbt
- project
- src
- main
- scala
- ScalaTest.scala
- target
I don't know if this is the problem, but you're not importing the SparkConf and SparkContext classes definition. Thus try adding to your scala file:
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
I have a Scala code like below :-
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark._
object RecipeIO {
val sc = new SparkContext(new SparkConf().setAppName("Recipe_Extraction"))
def read(INPUT_PATH: String): org.apache.spark.rdd.RDD[(String)]= {
val data = sc.wholeTextFiles("INPUT_PATH")
val files = data.map { case (filename, content) => filename}
(files)
}
}
When I compile this code using sbt it gives me the error :
value wholeTextFiles is not a member of org.apache.spark.SparkContext.
I am importing all of which is required but it's still giving me this errror.
But when I compile this code by replacing wholeTextFiles with textFile, the code gets compiled.
What might be the problem here and how do I resolve that?
Thanks in advance!
Environment:
Scala compiler version 2.10.2
spark-1.2.0
Error:
[info] Set current project to RecipeIO (in build file:/home/akshat/RecipeIO/)
[info] Compiling 1 Scala source to /home/akshat/RecipeIO/target/scala-2.10.4/classes...
[error] /home/akshat/RecipeIO/src/main/scala/RecipeIO.scala:14: value wholeTexFiles is not a member of org.apache.spark.SparkContext
[error] val data = sc.wholeTexFiles(INPUT_PATH)
[error] ^
[error] one error found
[error] {file:/home/akshat/RecipeIO/}default-55aff3/compile:compile: Compilation failed
[error] Total time: 16 s, completed Jun 15, 2015 11:07:04 PM
My build.sbt file looks like this :
name := "RecipeIO"
version := "1.0"
scalaVersion := "2.10.4"
libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "0.9.0-incubating"
libraryDependencies += "org.eclipse.jetty" % "jetty-server" % "8.1.2.v20120308"
ivyXML :=
<dependency org="org.eclipse.jetty.orbit" name="javax.servlet" rev="3.0.0.v201112011016">
<artifact name="javax.servlet" type="orbit" ext="jar"/>
</dependency>
You have a typo: it should be wholeTextFiles instead of wholeTexFiles.
As a side note, I think you want sc.wholeTextFiles(INPUT_PATH) and not sc.wholeTextFiles("INPUT_PATH") if you really want to use the INPUT_PATH variable.