What do WARN messages mean when starting spark-shell? - scala

When starting spark-shell, I get a bunch of WARN messages, but I cannot understand them. Are there any important problems I should take care of? Is there some configuration I missed? Or are these WARN messages normal?
cliu@cliu-ubuntu:Apache-Spark$ spark-shell
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's repl log4j profile: org/apache/spark/log4j-defaults-repl.properties
To adjust logging level use sc.setLogLevel("INFO")
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/___/ .__/\_,_/_/ /_/\_\ version 1.5.2
/_/
Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_66)
Type in expressions to have them evaluated.
Type :help for more information.
15/11/30 11:43:54 WARN Utils: Your hostname, cliu-ubuntu resolves to a loopback address: 127.0.1.1; using xxx.xxx.xxx.xx (`here I hide my IP`) instead (on interface wlan0)
15/11/30 11:43:54 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/11/30 11:43:55 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
Spark context available as sc.
15/11/30 11:43:58 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/30 11:43:58 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/30 11:44:11 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/11/30 11:44:11 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
15/11/30 11:44:14 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/11/30 11:44:14 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/30 11:44:14 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
15/11/30 11:44:27 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
15/11/30 11:44:27 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
SQL context available as sqlContext.
scala>

This one:
15/11/30 11:43:54 WARN Utils: Your hostname, cliu-ubuntu resolves to a loopback address: 127.0.1.1; using xxx.xxx.xxx.xx (`here I hide my IP`) instead (on interface wlan0)
15/11/30 11:43:54 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
means that the hostname the driver managed to figure out for itself is not routable, and hence no remote connections are allowed. In your local environment it is not an issue, but if you go for a multi-machine configuration, Spark won't work properly. Hence the WARN message, as it may or may not be an issue. Just a heads-up.

These log messages are absolutely normal. The BoneCP warnings appear because the Hive metastore setup tries to use BoneCP as its JDBC connection pool and falls back when the library is not on the classpath. In any case, if you would like to manage the log output, you can set the logging level by copying the <spark-path>/conf/log4j.properties.template
file to <spark-path>/conf/log4j.properties and making your configurations there.
Lastly, a similar answer for logging level can be found here:
How to stop messages displaying on spark console?
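As a concrete illustration of that advice, this is roughly what a quieter log4j.properties could look like. The appender layout below mirrors Spark's shipped template, but treat the exact path and values as assumptions and adapt them to your install (`/tmp/spark` here is only a fallback stand-in for `$SPARK_HOME`):

```shell
# Sketch: write a log4j.properties that raises the root level to WARN.
# SPARK_HOME/conf is the conventional location; /tmp/spark is just a fallback.
SPARK_CONF="${SPARK_HOME:-/tmp/spark}/conf"
mkdir -p "$SPARK_CONF"
cat > "$SPARK_CONF/log4j.properties" <<'EOF'
# Quieter root logger: only WARN and above reach the console
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
EOF
```

After restarting spark-shell, the INFO chatter is suppressed while the warnings discussed above still show.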

Adding to @Jacek Laskowski's answer, with respect to the SPARK_LOCAL_IP warning:
15/11/30 11:43:54 WARN Utils: Your hostname, cliu-ubuntu resolves to a loopback address: 127.0.1.1; using xxx.xxx.xxx.xx (`here I hide my IP`) instead (on interface wlan0)
15/11/30 11:43:54 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
I encountered the same warning running spark-shell against a standalone Spark cluster on an Ubuntu 20.04 server. As expected, setting the SPARK_LOCAL_IP environment variable to $(hostname) made the warning go away, but while the application ran without issues, the worker GUI was not reachable on port 4040.
To fix this, we had to set SPARK_LOCAL_HOSTNAME instead of SPARK_LOCAL_IP. With that, the warning was gone and the worker GUI became accessible through port 4040.
I couldn't find information about this variable in Spark documentation, but according to Spark's source code it is used for setting a custom local machine URI: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/Utils.scala#L1058
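For reference, the workaround described above amounts to something like the following in the shell that launches spark-shell. The use of $(hostname) is an assumption about your environment, namely that the machine's hostname resolves to a routable address:

```shell
# Workaround sketch: export SPARK_LOCAL_HOSTNAME (not SPARK_LOCAL_IP) before
# starting spark-shell; $(hostname) assumes your hostname resolves sensibly.
export SPARK_LOCAL_HOSTNAME="$(hostname)"
# spark-shell    # launched afterwards, picks the variable up
```

Putting the export in spark-env.sh instead of your interactive shell makes it apply to every Spark process on that box.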

Related

System memory 259522560 must be at least 4.718592E8. Please use a larger heap size

I get this error when I run my Spark scripts with version 1.6 of Spark.
My scripts work with version 1.5.
Java version: 1.8
Scala version: 2.11.7
I tried changing the system environment variable JAVA_OPTS=-Xms128m -Xmx512m many times, with different values of Xms and Xmx, but it didn't change anything.
I also tried to modify the memory settings in IntelliJ:
Help / Change memory settings...
File / Settings / Scala compiler...
Nothing worked.
I have different users on the computer; Java is installed at the root of the computer while IntelliJ is installed in the folder of one of the users. Can that have an impact?
Here are the logs of the error:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
20/04/30 17:06:54 INFO SparkContext: Running Spark version 1.6.0
20/04/30 17:06:55 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/04/30 17:06:55 INFO SecurityManager: Changing view acls to:
20/04/30 17:06:55 INFO SecurityManager: Changing modify acls to:
20/04/30 17:06:55 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(); users with modify permissions: Set()
20/04/30 17:06:56 INFO Utils: Successfully started service 'sparkDriver' on port 57698.
20/04/30 17:06:57 INFO Slf4jLogger: Slf4jLogger started
20/04/30 17:06:57 INFO Remoting: Starting remoting
20/04/30 17:06:57 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem#10.1.5.175:57711]
20/04/30 17:06:57 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 57711.
20/04/30 17:06:57 INFO SparkEnv: Registering MapOutputTracker
20/04/30 17:06:57 ERROR SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: System memory 259522560 must be at least 4.718592E8. Please use a larger heap size.
at org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory(UnifiedMemoryManager.scala:193)
at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:175)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:354)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:288)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:457)
at batch.BatchJob$.main(BatchJob.scala:23)
at batch.BatchJob.main(BatchJob.scala)
20/04/30 17:06:57 INFO SparkContext: Successfully stopped SparkContext
Exception in thread "main" java.lang.IllegalArgumentException: System memory 259522560 must be at least 4.718592E8. Please use a larger heap size.
at org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory(UnifiedMemoryManager.scala:193)
at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:175)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:354)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:288)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:457)
at batch.BatchJob$.main(BatchJob.scala:23)
at batch.BatchJob.main(BatchJob.scala)
And the beginning of the code:
package batch

import java.lang.management.ManagementFactory
import org.apache.spark.{SparkContext, SparkConf}
import org.apache.spark.sql.{SaveMode, SQLContext}

object BatchJob {
  def main(args: Array[String]): Unit = {
    // get spark configuration
    val conf = new SparkConf()
      .setAppName("Lambda with Spark")
    // check if running from IDE
    if (ManagementFactory.getRuntimeMXBean.getInputArguments.toString.contains("IntelliJ IDEA")) {
      System.setProperty("hadoop.home.dir", "C:\\Libraries\\WinUtils") // required for winutils
      conf.setMaster("local[*]")
    }
    // set up spark context
    val sc = new SparkContext(conf)
    implicit val sqlContext = new SQLContext(sc)
    ...
I finally found a solution:
add -Xms2g -Xmx4g in the VM options directly in the IntelliJ Scala Console settings.
That's the only thing that worked for me.
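For what it's worth, the odd-looking 4.718592E8 threshold is not arbitrary: in Spark 1.6's UnifiedMemoryManager the reserved system memory is 300 MB, and getMaxMemory requires the available heap to be at least 1.5 times that. A quick sketch of the arithmetic shows where the number comes from, and why an -Xmx below roughly 450 MB (like the default IntelliJ console heap) trips the check:

```shell
# Spark 1.6 UnifiedMemoryManager: reservedMemory = 300 MB, and the check
# requires systemMemory >= 1.5 * reservedMemory.
RESERVED=$((300 * 1024 * 1024))   # 314572800 bytes
MINIMUM=$((RESERVED * 3 / 2))     # 471859200 bytes, i.e. 4.718592E8
echo "$MINIMUM"
```

Any -Xmx comfortably above that minimum (such as the -Xmx4g above) satisfies the check.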

Kafka to Snowflake connecting issue

(Submitting on behalf of a Snowflake client...)
I am trying to connect Kafka to Snowflake using the Snowflake Connector for Kafka.
Referring to this document: https://docs.snowflake.net/manuals/user-guide/kafka-connector.html
When I run Kafka, it initializes the Snowflake plugins.
eg:
[2019-08-31 21:52:09,448] INFO Added aliases 'SnowflakeSinkConnector' and 'SnowflakeSink' to plugin 'com.snowflake.kafka.connector.SnowflakeSinkConnector' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:396)
[2019-08-31 21:52:09,456] INFO Added aliases 'SnowflakeJsonConverter' and 'SnowflakeJson' to plugin 'com.snowflake.kafka.connector.records.SnowflakeJsonConverter' (org.apache.kafka.connect.runtime.isolation.DelegatingClassLoader:396)
But after that it is unable to read other worker config attributes.
[2019-08-31 21:52:10,373] WARN The configuration 'connector.class' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,373] WARN The configuration 'snowflake.topic2table.map' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,375] WARN The configuration 'tasks.max' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,378] WARN The configuration 'topics' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,381] WARN The configuration 'snowflake.private.key.passphrase' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,385] WARN The configuration 'plugin.path' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,386] WARN The configuration 'buffer.flush.time' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,386] WARN The configuration 'snowflake.url.name' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,387] WARN The configuration 'value.converter.basic.auth.credentials.source' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,387] WARN The configuration 'snowflake.database.name' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,387] WARN The configuration 'snowflake.schema.name' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,387] WARN The configuration 'value.converter.schema.registry.url' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,389] WARN The configuration 'offset.storage.file.filename' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,392] WARN The configuration 'value.converter.basic.auth.user.info' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,392] WARN The configuration 'buffer.count.records' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,393] WARN The configuration 'snowflake.private.key' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,393] WARN The configuration 'snowflake.user.name' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,393] WARN The configuration 'name' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,394] WARN The configuration 'value.converter' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,394] WARN The configuration 'key.converter' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
[2019-08-31 21:52:10,394] WARN The configuration 'buffer.size.bytes' was supplied but isn't a known config. (org.apache.kafka.clients.admin.AdminClientConfig:287)
I realize these are warnings, but after that we're getting failures, so I assume it is failing because it cannot initialize the above config values.
WARNING: A provider org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime. Due to constraint configuration problems the provider org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource will be ignored.
Sep 04, 2019 11:55:52 AM org.glassfish.jersey.internal.inject.Providers checkProviderRuntime
WARNING: A provider org.apache.kafka.connect.runtime.rest.resources.ConnectorPluginsResource registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime. Due to constraint configuration problems the provider org.apache.kafka.connect.runtime.rest.resources.ConnectorPluginsResource will be ignored.
Sep 04, 2019 11:55:52 AM org.glassfish.jersey.internal.inject.Providers checkProviderRuntime
WARNING: A provider org.apache.kafka.connect.runtime.rest.resources.RootResource registered in SERVER runtime does not implement any provider interfaces applicable in the SERVER runtime. Due to constraint configuration problems the provider org.apache.kafka.connect.runtime.rest.resources.RootResource will be ignored.
Sep 04, 2019 11:55:52 AM org.glassfish.jersey.internal.Errors logErrors
WARNING: The following warnings have been detected: WARNING: The (sub)resource method listConnectors in org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource contains empty path annotation.
WARNING: The (sub)resource method createConnector in org.apache.kafka.connect.runtime.rest.resources.ConnectorsResource contains empty path annotation.
WARNING: The (sub)resource method listConnectorPlugins in org.apache.kafka.connect.runtime.rest.resources.ConnectorPluginsResource contains empty path annotation.
WARNING: The (sub)resource method serverInfo in org.apache.kafka.connect.runtime.rest.resources.RootResource contains empty path annotation.
[2019-09-04 11:55:52,788] INFO Started o.e.j.s.ServletContextHandler#2be818da{/,null,AVAILABLE} (org.eclipse.jetty.server.handler.ContextHandler:850)
[2019-09-04 11:55:52,800] INFO Started http_8083#798deee8{HTTP/1.1,[http/1.1]}{0.0.0.0:8083} (org.eclipse.jetty.server.AbstractConnector:292)
[2019-09-04 11:55:52,801] INFO Started #9514ms (org.eclipse.jetty.server.Server:408)
[2019-09-04 11:55:52,802] INFO Advertised URI: http://10.10.25.86:8083/ (org.apache.kafka.connect.runtime.rest.RestServer:267)
[2019-09-04 11:55:52,802] INFO REST server listening at http://10.10.25.86:8083/, advertising URL http://10.10.25.86:8083/ (org.apache.kafka.connect.runtime.rest.RestServer:217)
[2019-09-04 11:55:52,802] INFO Kafka Connect started (org.apache.kafka.connect.runtime.Connect:55)
[2019-09-04 11:55:52,807] ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone:113)
org.apache.kafka.common.config.ConfigException: Must configure one of topics or topics.regex
at org.apache.kafka.connect.runtime.SinkConnectorConfig.validate(SinkConnectorConfig.java:96)
at org.apache.kafka.connect.runtime.AbstractHerder.validateConnectorConfig(AbstractHerder.java:269)
at org.apache.kafka.connect.runtime.standalone.StandaloneHerder.putConnectorConfig(StandaloneHerder.java:189)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:107)
[2019-09-04 11:55:52,808] INFO Kafka Connect stopping (org.apache.kafka.connect.runtime.Connect:65)
[2019-09-04 11:55:52,808] INFO Stopping REST server (org.apache.kafka.connect.runtime.rest.RestServer:223)
[2019-09-04 11:55:52,820] INFO Stopped http_8083#798deee8{HTTP/1.1,[http/1.1]}{0.0.0.0:8083} (org.eclipse.jetty.server.AbstractConnector:341)
[2019-09-04 11:55:52,821] INFO node0 Stopped scavenging (org.eclipse.jetty.server.session:167)
[2019-09-04 11:55:52,827] INFO Stopped o.e.j.s.ServletContextHandler#2be818da{/,null,UNAVAILABLE} (org.eclipse.jetty.server.handler.ContextHandler:1040)
[2019-09-04 11:55:52,829] INFO REST server stopped (org.apache.kafka.connect.runtime.rest.RestServer:241)
[2019-09-04 11:55:52,829] INFO Herder stopping (org.apache.kafka.connect.runtime.standalone.StandaloneHerder:95)
[2019-09-04 11:55:52,829] INFO Worker stopping (org.apache.kafka.connect.runtime.Worker:184)
[2019-09-04 11:55:52,829] INFO Stopped FileOffsetBackingStore (org.apache.kafka.connect.storage.FileOffsetBackingStore:66)
[2019-09-04 11:55:52,830] INFO Worker stopped (org.apache.kafka.connect.runtime.Worker:205)
[2019-09-04 11:55:52,830] INFO Herder stopped (org.apache.kafka.connect.runtime.standalone.StandaloneHerder:112)
[2019-09-04 11:55:52,830] INFO Kafka Connect stopped (org.apache.kafka.connect.runtime.Connect:70)
I understand that this may be a configuration issue in that in the newer version of Kafka, the configuration for "topic" was updated to "topics", but are there any other/additional explanations, corrective actions or recommended work-arounds?
Thank you!
You can ignore all those config warnings; they are just that: warnings (albeit noisy and confusing ones!).
The reason it's failed is as you've identified:
Must configure one of topics or topics.regex
You have to specify one of these in your configuration.
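To make that concrete, a standalone sink config for this connector needs a topics (or topics.regex) entry alongside the other properties. The sketch below uses placeholder names throughout (connector name, topic, file path), so substitute your own values:

```shell
# Hypothetical connector properties file; every value here is a placeholder.
cat > /tmp/snowflake-sink.properties <<'EOF'
name=snowflake-sink
connector.class=com.snowflake.kafka.connector.SnowflakeSinkConnector
tasks.max=1
# Exactly one of `topics` or `topics.regex` must be set, otherwise the
# "Must configure one of topics or topics.regex" error above is thrown
topics=my_topic
EOF
```

With a topics line present, the standalone herder gets past the validation step that aborted the run above.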

Can't run spark shell ! java.lang.NoSuchMethodError: org.apache.spark.repl.SparkILoop.mumly

hadoop@youngv-VirtualBox:/usr/local/spark$ ./bin/spark-shell
18/11/30 23:32:38 WARN Utils: Your hostname, youngv-VirtualBox resolves to a loopback address: 127.0.0.1; using 10.0.2.15 instead (on interface enp0s3)
18/11/30 23:32:38 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
18/11/30 23:32:40 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Exception in thread "main" java.lang.NoSuchMethodError:
org.apache.spark.repl.SparkILoop.mumly(Lscala/Function0;)Ljava/lang/Object;
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.org$apache$spark$repl$SparkILoop$$anonfun$$loopPostInit$1(SparkILoop.scala:199)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$startup$1$1.apply(SparkILoop.scala:267)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1$$anonfun$startup$1$1.apply(SparkILoop.scala:247)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.withSuppressedSettings$1(SparkILoop.scala:235)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.startup$1(SparkILoop.scala:247)
at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:282)
at org.apache.spark.repl.SparkILoop.runClosure(SparkILoop.scala:159)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:182)
at org.apache.spark.repl.Main$.doMain(Main.scala:78)
at org.apache.spark.repl.Main$.main(Main.scala:58)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
I get this error when I try to run spark-shell,
with: spark-2.4.0, scala-2.11.12, jdk-1.8.
Could anyone tell me how to solve this problem? I would be very grateful.
There might be a different jar version in the assembly classpath; remove it and try building it again.

Unable to connect to Spark master

I start my DataStax cassandra instance with Spark:
dse cassandra -k
I then run this program (from within Eclipse):
import org.apache.spark.sql.SQLContext
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object Start {
  def main(args: Array[String]): Unit = {
    println("***** 1 *****")
    val sparkConf = new SparkConf().setAppName("Start").setMaster("spark://127.0.0.1:7077")
    println("***** 2 *****")
    val sparkContext = new SparkContext(sparkConf)
    println("***** 3 *****")
  }
}
And I get the following output
***** 1 *****
***** 2 *****
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/12/29 15:27:50 INFO SparkContext: Running Spark version 1.5.2
15/12/29 15:27:51 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/12/29 15:27:51 INFO SecurityManager: Changing view acls to: nayan
15/12/29 15:27:51 INFO SecurityManager: Changing modify acls to: nayan
15/12/29 15:27:51 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(nayan); users with modify permissions: Set(nayan)
15/12/29 15:27:52 INFO Slf4jLogger: Slf4jLogger started
15/12/29 15:27:52 INFO Remoting: Starting remoting
15/12/29 15:27:53 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver#10.0.1.88:55126]
15/12/29 15:27:53 INFO Utils: Successfully started service 'sparkDriver' on port 55126.
15/12/29 15:27:53 INFO SparkEnv: Registering MapOutputTracker
15/12/29 15:27:53 INFO SparkEnv: Registering BlockManagerMaster
15/12/29 15:27:53 INFO DiskBlockManager: Created local directory at /private/var/folders/pd/6rxlm2js10gg6xys5wm90qpm0000gn/T/blockmgr-21a96671-c33e-498c-83a4-bb5c57edbbfb
15/12/29 15:27:53 INFO MemoryStore: MemoryStore started with capacity 983.1 MB
15/12/29 15:27:53 INFO HttpFileServer: HTTP File server directory is /private/var/folders/pd/6rxlm2js10gg6xys5wm90qpm0000gn/T/spark-fce0a058-9264-4f2c-8220-c32d90f11bd8/httpd-2a0efcac-2426-49c5-982a-941cfbb48c88
15/12/29 15:27:53 INFO HttpServer: Starting HTTP Server
15/12/29 15:27:53 INFO Utils: Successfully started service 'HTTP file server' on port 55127.
15/12/29 15:27:53 INFO SparkEnv: Registering OutputCommitCoordinator
15/12/29 15:27:53 INFO Utils: Successfully started service 'SparkUI' on port 4040.
15/12/29 15:27:53 INFO SparkUI: Started SparkUI at http://10.0.1.88:4040
15/12/29 15:27:54 WARN MetricsSystem: Using default name DAGScheduler for source because spark.app.id is not set.
15/12/29 15:27:54 INFO AppClient$ClientEndpoint: Connecting to master spark://127.0.0.1:7077...
15/12/29 15:27:54 WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkMaster#127.0.0.1:7077] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
15/12/29 15:28:14 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[appclient-registration-retry-thread,5,main]
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.FutureTask#1f22aef0 rejected from java.util.concurrent.ThreadPoolExecutor#176cb4af[Running, pool size = 1, active threads = 1, queued tasks = 0, completed tasks = 0]
So something is happening during the creation of the spark context.
When I look in $DSE_HOME/logs/spark, it is empty. Not sure where else to look.
It turns out that the problem was the spark library version AND the Scala version. DataStax was running Spark 1.4.1 and Scala 2.10.5, while my eclipse project was using 1.5.2 & 2.11.7 respectively.
Note that BOTH the Spark library and Scala appear to have to match. I tried other combinations, but it only worked when both matched.
I am getting pretty familiar with this part of your posted error:
WARN ReliableDeliverySupervisor: Association with remote system [akka.tcp://...
It can have numerous causes, pretty much all related to misconfigured IPs. First I would do whatever zero323 says, then here's my two cents: I have solved my own problems recently by using IP addresses, not hostnames, and the only config I use in a simple standalone cluster is SPARK_MASTER_IP.
SPARK_MASTER_IP in the $SPARK_HOME/conf/spark-env.sh on your master then should lead the master webui to show the IP address you set:
spark://your.ip.address.numbers:7077
And your SparkConf setup can refer to that.
Having said that, I am not familiar with your specific implementation but I notice in the error two occurrences containing:
/private/var/folders/pd/6rxlm2js10gg6xys5wm90qpm0000gn/T/
Have you looked there to see if there's a logs directory? Is that where $DSE_HOME points? Alternatively, connect to the driver where it creates its web UI:
INFO SparkUI: Started SparkUI at http://10.0.1.88:4040
and you should see a link to an error log there somewhere.
More on the IP vs. hostname thing, this very old bug is marked as Resolved but I have not figured out what they mean by Resolved, so I just tend toward IP addresses.
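As a sketch, the spark-env.sh change described above could look like the following. The address is a documentation-only placeholder and /tmp/spark is a stand-in for $SPARK_HOME; use your master's actual IP and install path:

```shell
# Append SPARK_MASTER_IP to spark-env.sh on the master node.
# 192.0.2.10 is a placeholder (TEST-NET) address; substitute your own.
SPARK_ENV="${SPARK_HOME:-/tmp/spark}/conf/spark-env.sh"
mkdir -p "$(dirname "$SPARK_ENV")"
echo 'SPARK_MASTER_IP=192.0.2.10' >> "$SPARK_ENV"
# After restarting the master, its web UI should advertise
# spark://192.0.2.10:7077, which your SparkConf setMaster can then refer to.
```
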

Using Scala IDE and Apache Spark on Windows

I want to start working on a project that uses Spark with Scala on Windows 7.
I downloaded the Apache Spark pre-built for Hadoop 2.4 (download page) and I can run it from the command prompt (cmd). I can run all of the code on the Spark quick start page up to the self-contained applications section.
Then I downloaded Scala IDE 4.0.0 from its download page (sorry, it's not possible to post more than 2 links).
I then created a new Scala project and imported the Spark assembly jar file into the project. When I try to run the example from the self-contained applications section of the Spark quick start page, I get the following errors:
15/03/26 11:59:55 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster#myhost:7077/user/Master...
15/03/26 11:59:58 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster#myhost:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster#myhost:7077
15/03/26 11:59:58 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster#myhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: myhost
15/03/26 12:00:15 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster#myhost:7077/user/Master...
15/03/26 12:00:17 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster#myhost:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster#myhost:7077
15/03/26 12:00:17 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster#myhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: myhost
15/03/26 12:00:35 INFO AppClient$ClientActor: Connecting to master akka.tcp://sparkMaster#myhost:7077/user/Master...
15/03/26 12:00:37 WARN AppClient$ClientActor: Could not connect to akka.tcp://sparkMaster#myhost:7077: akka.remote.InvalidAssociation: Invalid address: akka.tcp://sparkMaster#myhost:7077
15/03/26 12:00:37 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster#myhost:7077]. Address is now gated for 5000 ms, all messages to this address will be delivered to dead letters. Reason: myhost
15/03/26 12:00:55 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: All masters are unresponsive! Giving up.
15/03/26 12:00:55 ERROR TaskSchedulerImpl: Exiting due to error from cluster scheduler: All masters are unresponsive! Giving up.
15/03/26 12:00:55 WARN SparkDeploySchedulerBackend: Application ID is not initialized yet.
The only line of code I added to the example is .setMaster("spark://myhost:7077") in the SparkConf definition. I think I need to configure Scala IDE to use the pre-built Spark on my computer, but I don't know how and I couldn't find anything by googling.
Could you help me get Scala IDE working with Spark on Windows 7?
Thanks in advance.
I found the answer:
I should correct the master definition in my code as follows:
replace:
.setMaster("spark://myhost:7077")
with:
.setMaster("local[*]")
Hope that it helps you as well.