44: error: value read is not a member of object org.apache.spark.sql.SQLContext - scala

I am using Spark 1.6.1, and Scala 2.10.5. I am trying to read the csv file through com.databricks.
While launching the spark-shell, I use below lines as well
spark-shell --packages com.databricks:spark-csv_2.10:1.5.0 --driver-class-path path to/sqljdbc4.jar, and below is the whole code
import java.util.Properties
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
val conf = new SparkConf().setAppName("test").setMaster("local").set("spark.driver.allowMultipleContexts", "true");
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._
val df = SQLContext.read().format("com.databricks.spark.csv").option("inferScheme","true").option("header","true").load("path_to/data.csv");
I am getting below error:-
error: value read is not a member of object org.apache.spark.sql.SQLContext,
and the "^" is pointing toward "SQLContext.read().format" in the error message.
I did try the suggestions available in stackoverflow, as well as other sites as well. but nothing seems to be working.

SQLContext means object access - static methods in class.
You should use sqlContext variable, as methods are not static, but are in class
So code should be:
val df = sqlContext.read.format("com.databricks.spark.csv").option("inferScheme","true").option("header","true").load("path_to/data.csv");

Related

import sqlContext cannot be resolved although defined with SQLContext instance

I followed the solutions in here, however, I am still getting the "cannot resolve symbol SQLContext" error. ".implicits._" cannot be resolved either. What would be the reason for it?
Spark/Scala versions I use:
Scala 2.12.13
Spark 3.0.1 (without bundled Hadoop)
Here is my related code part:
import org.apache.log4j.LogManager
import org.apache.spark.{SparkConf, SparkContext}
object Count {
def main(args: Array[String]) {
...
...
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._
}}
You didn't import SQLContext at all:
import org.apache.spark.sql.SQLContext
You should probably not use SQLContext anymore in the first place though:
As of Spark 2.0, this is replaced by SparkSession. However, we are keeping the class here for backward compatibility.
https://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/SQLContext.html
See how to use a SparkSession from SparkContext at How to create SparkSession from existing SparkContext and then import sparkSession.implicits._.

How to fix 22: error: not found: value SparkSession in Scala?

I am new to Spark and I would like to read a CSV-file to a Dataframe.
Spark 1.3.0 / Scala 2.3.0
This is what I have so far:
# Start Scala with CSV Package Module
spark-shell --packages com.databricks:spark-csv_2.10:1.3.0
# Import Spark Classes
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.sql.SQLContext
import sqlCtx ._
# Create SparkConf
val conf = new SparkConf().setAppName("local").setMaster("master")
val sc = new SparkContext(conf)
# Create SQLContext
val sqlCtx = new SQLContext(sc)
# Create SparkSession and use it for all purposes:
val session = SparkSession.builder().appName("local").master("master").getOrCreate()
# Read CSV-File and turn it into Dataframe.
val df_fc = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").load("/home/Desktop/test.csv")
However at SparkSession.builder() it gives the following error:
^
How can I fix this error?
SparkSession is available in spark 2. No need to create sparkcontext in spark version 2. sparksession itself provides the gateway to all .
Try below as you are using version 1.x:
val df_fc = sqlCtx.read.format("com.databricks.spark.csv").option("header", "true").load("/home/Desktop/test.csv")

SQLContext.gerorCreate is not a value

I am getting error SQLContext.gerorCreate is not a value of object org.apache.spark.SQLContext. This is my code
import org.apache.spark.SparkConf
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.streaming.Seconds
import org.apache.spark.streaming.kafka.KafkaUtils
import org.apache.spark.sql.functions
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.types
import org.apache.spark.SparkContext
import java.io.Serializable
case class Sensor(id:String,date:String,temp:String,press:String)
object consum {
def main(args: Array[String]) {
val sparkConf = new SparkConf().setAppName("KafkaWordCount").setMaster("local[2]")
val ssc = new StreamingContext(sparkConf, Seconds(2))
val sc=new SparkContext(sparkConf)
val lines = KafkaUtils.createStream(ssc, "localhost:2181", "spark-streaming-consumer-group", Map("hello" -> 5))
def parseSensor(str:String): Sensor={
val p=str.split(",")
Sensor(p(0),p(1),p(2),p(3))
}
val data=lines.map(_._2).map(parseSensor)
val sqlcontext=new SQLContext(sc)
import sqlcontext.implicits._
data.foreachRDD { rdd=>
val sensedata=sqlcontext.getOrCreate(rdd.sparkContext)
}
I have tried with SQLContext.getOrCreate as well but same error.
There is no such getOrCreate function defined for neither SparkContext nor SQLContext.
getOrCreate function is defined for SparkSession instances from which SparkSession instances are created. And we get sparkContext instance or sqlContext instance from the SparkSession instance created using getOrCreate method call.
I hope the explanation is clear.
Updated
The explanation I did above is suitable for higher versions of spark. In the blog as the OP is referencing, the author is using spark 1.6 and the api doc of 1.6.3 clearly states
Get the singleton SQLContext if it exists or create a new one using the given SparkContext

SparkConf not found running spark neo4j connector example

I execute the spark neo4j example code like so:
spark-shell --conf spark.neo4j.bolt.password=TestNeo4j --packages neo4j-contrib:neo4j-spark-connector:1.0.0-RC1,graphframes:graphframes:0.1.0-spark1.6 -i neo4jspark.scala
My Scalafile:
import org.neo4j.spark._
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming._
import org.apache.spark.streaming.StreamingContext._
import org.apache.spark.SparkConf
val conf = new SparkConf.setMaster("local").setAppName("neo4jspark")
val sc = new SparkContext(conf)
val neo = Neo4j(sc)
val rdd = neo.cypher("MATCH (p:POINT) RETURN p").loadRowRdd
rdd.count
The error:
Loading neo4jspark.scala...
import org.neo4j.spark._
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming._
import org.apache.spark.streaming.StreamingContext._
import org.apache.spark.SparkConf
<console>:38: error: not found: value SparkConf
val conf = new SparkConf.setMaster("local").setAppName("neo4jspark")
^
<console>:38: error: not found: value conf
val sc = new SparkContext(conf)
^
<console>:39: error: not found: value Neo4j
val neo = Neo4j(sc)
^
<console>:38: error: not found: value neo
val rdd = neo.cypher("MATCH (p:POINT) RETURN p").loadRowRdd
^
<console>:39: error: object count is not a member of package org.apache.spark.streaming.rdd
rdd.count
^
I am importing the SparkConf, no idea whay it says there is no value for that, am I missing something (simple I hope)??
EDIT: This seems to be a version error:
I ran it with this start up:
spark-shell --conf spark.neo4j.bolt.password=TestNeo4j --packages neo4j-contrib:neo4j-spark-connector:2.0.0-M2,graphframes:graphframes:0.2.0-spark2.0-s_2.11 -i neo4jspark.scala
Still get the conf error but it does run. I just need to now figure out why anything i do with the returned RDD breaks :D This is on mac, just tested same versions of everything on my windows and it breaks like it showed her mostly because it couldn't get the import.org.neo4j.spark._ here is the error:
<console>:23: error: object neo4j is not a member of package org
import org.neo4j.spark._
^
No idea what is different about windows than the mac I am deving on :/

Error found when importing spark.implicits

I am using spark 1.4.0
When I tried to import spark.implicits using this command:
import spark.implicits._, this error appear:
<console>:19: error: not found: value spark
import spark.implicits._
^
Can anyone help me to resolve this problem ?
It's because SparkSession is avialable from Spark 2.0 and spark value is an object of type SparkSession in Spark REPL.
In Spark 1.4 use
import sqlContext.implicits._
Value sqlContext is automatically created in Spark REPL for Spark 1.x
To make it complete, first you have to create a sqlContext
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
val conf = new SparkConf().setMaster("local").setAppName("my app")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._