error: not found: value sc - scala

I am new to Scala and am trying to read a file using the following code:
scala> val textFile = sc.textFile("README.md")
scala> textFile.count()
But I keep getting the following error
error: not found: value sc
I have tried everything, but nothing seems to work. I am using Scala version 2.10.4 and Spark 1.1.0 (I have even tried Spark 1.2.0, but it doesn't work either). I have sbt installed and the code compiles, yet I am not able to run sbt/sbt assembly. Is the error because of this?

You should run this code using ./spark-shell. It is the Scala REPL with a SparkContext already provided as sc. You can find it in your Apache Spark distribution, in the folder spark-1.4.1/bin.
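For example, assuming the distribution is unpacked under spark-1.4.1 (the exact version in the path will vary), the whole session looks like this:
cd spark-1.4.1/bin
./spark-shell
scala> val textFile = sc.textFile("README.md")
scala> textFile.count()
spark-shell creates the SparkContext on startup and binds it to the name sc, which is why the same two lines fail in a plain Scala REPL.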

Related

Play Scala Slick: error with TableQuery: macro implementation not found: apply

I have an old Play 2.6 Scala project that works in production. I'm trying to run it locally again, but now I get the following Slick error for every TableQuery: macro implementation not found: apply
For instance with this line:
val usersCars: TableQuery[UsersCars] = TableQuery[UsersCars]
throws the error.
I tried changing the file where it is defined, but without success.
The Scala version is 2.12.5
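For context, the question does not show the table definition itself; a typical Slick 3 mapping behind the failing line has roughly this shape (the profile and column names below are hypothetical):
import slick.jdbc.PostgresProfile.api._

// Hypothetical mapping for a users_cars join table
class UsersCars(tag: Tag) extends Table[(Long, Long)](tag, "users_cars") {
  def userId = column[Long]("user_id")
  def carId  = column[Long]("car_id")
  def * = (userId, carId)
}

// This is the line that triggers "macro implementation not found: apply"
val usersCars: TableQuery[UsersCars] = TableQuery[UsersCars]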

Run spark-shell from sbt

The default way of getting the Spark shell seems to be to download the distribution from the website. Yet this Spark issue mentions that it can be installed via sbt. I could not find any documentation on this. In an sbt project that uses spark-sql and spark-core, no spark-shell binary was found.
How do you run spark-shell from sbt?
From the following URL:
https://bzhangusc.wordpress.com/2015/11/20/use-sbt-console-as-spark-shell/
If you are already using sbt for your project, it's very simple to set up the sbt console to replace the spark-shell command.
Let's start from the basic case. When you set up the project with sbt, you can simply start the console with sbt console.
Within the console, you just need to instantiate a SparkContext and SQLContext to make it behave like the Spark shell:
scala> val sc = new org.apache.spark.SparkContext("local[*]", "sbt-console")
scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
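If you want those two values available automatically every time, sbt's initialCommands setting can run them when the console starts; a minimal sketch for build.sbt (the Spark version is an assumption and must match your project and scalaVersion):
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1"
libraryDependencies += "org.apache.spark" %% "spark-sql"  % "1.4.1"

initialCommands in console := """
  val sc = new org.apache.spark.SparkContext("local[*]", "sbt-console")
  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
"""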

Neo4j Spark connector error: import org.neo4j.spark._ object neo4j is not found in package org

I have my Scala code running in Spark and connecting to Neo4j on my Mac. I wanted to test it on my Windows machine, but I cannot seem to get it to run; I keep getting this error:
Spark context Web UI available at http://192.168.43.4:4040
Spark context available as 'sc' (master = local[*], app id = local-1508360735468).
Spark session available as 'spark'.
Loading neo4jspark.scala...
<console>:23: error: object neo4j is not a member of package org
import org.neo4j.spark._
^
Which gives subsequent errors of:
changeScoreList: java.util.List[Double] = []
<console>:87: error: not found: value neo
val initialDf2 = neo.cypher(noBbox).partitions(5).batch(10000).loadDataFrame
^
<console>:120: error: not found: value neo
I am not sure what I am doing wrong. I am executing it like this:
spark-shell --conf spark.neo4j.bolt.password=TestNeo4j --packages neo4j-contrib:neo4j-spark-connector:2.0.0-M2,graphframes:graphframes:0.2.0-spark2.0-s_2.11 -i neo4jspark.scala
It says it finds all the dependencies, yet the code throws the error when using neo. I am not sure what else to try, or why this works on my Mac but not on my Windows box. Spark is the same version (2.2), Neo4j is up and running with the same version, and so are Scala and even Java (save for a few minor revision differences).
This is a known issue (with a related one here), the fix for which is part of the Spark 2.2.1 release.
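For reference, the neo value in the script is presumably the connector's entry point; once the import resolves (e.g. after moving to Spark 2.2.1), the usual setup looks roughly like this (the Cypher query is a placeholder):
import org.neo4j.spark._

val neo = Neo4j(sc)  // sc is the SparkContext that spark-shell provides
val df = neo.cypher("MATCH (n) RETURN n.name AS name LIMIT 10")
  .partitions(5)
  .batch(10000)
  .loadDataFrame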

Spark Scala API: No typeTag available in spark.createDataFrame in official example

I just started working with the MLib for Spark and tried to run the provided examples, more specifically https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/ml/DCTExample.scala
However, compilation using the IntelliJ IDE fails with the message
Error:(41, 35) No TypeTag available for (org.apache.spark.ml.linalg.Vector,)
val df = spark.createDataFrame(data.map(Tuple1.apply)).toDF("features")
The project setup uses jdk1.8.0_121, spark2.11-2.1.0 and scala 2.10.6.
Any ideas on why the example fails to run? I followed the following tutorial during installation: https://www.supergloo.com/fieldnotes/intellij-scala-spark/
You can't use Spark built for Scala 2.11 (that's what the _2.11 in the artifact name means) with Scala 2.10, though this specific error looks quite strange. Switch to Scala 2.11.8.
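In practice that means keeping scalaVersion and the Spark artifacts in sync in build.sbt; a minimal sketch (versions taken from the question):
scalaVersion := "2.11.8"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"   % "2.1.0",
  "org.apache.spark" %% "spark-mllib" % "2.1.0"
)
The %% operator appends the _2.11 suffix automatically, so the compiler and the Spark binaries can no longer drift apart.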

GraphFrames SLF4J not available

I am running Scala 2.10.4 with Spark 1.5.0-cdh5.5.2 and am getting the following error when running a GraphFrames job:
scala> val g = GraphFrame(v, e)
error: bad symbolic reference. A signature in Logging.class refers to type LazyLogging
in package com.typesafe.scalalogging.slf4j which is not available.
It may be completely missing from the current classpath, or the version on
the classpath might be incompatible with the version used when compiling Logging.class.
I am starting my spark-shell with the following command:
spark-shell --jars /data/spark-jars/scalalogging-slf4j_2.10-1.1.0.jar,/data/spark-jars/graphframes-0.2.0-spark1.5-s_2.10.jar
I have tried different versions of scalalogging, but nothing seems to work.
Thanks for the help.
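One thing worth trying, assuming the machine can reach the Spark Packages repository: launching with --packages instead of --jars lets Spark resolve GraphFrames together with its transitive dependencies, which may pull in a scala-logging version matching the one Logging.class was compiled against (coordinates taken from the jar name above):
spark-shell --packages graphframes:graphframes:0.2.0-spark1.5-s_2.10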