spark-shell on multinode spark cluster fails to spawn executor on remote worker node - scala

I installed a Spark cluster in standalone mode with 2 nodes: the Spark master runs on the first node and a Spark worker on the second. When I run spark-shell on the worker node with word-count code it works fine, but when I run spark-shell on the master node it gives the following output:
WARN scheduler.TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
No executor is started to run the job, even though a worker is registered with the Spark master. Any help is appreciated, thanks.

You use client deploy mode, so the most likely cause is that the executor nodes cannot connect to the driver port on the local machine. It could be a firewall issue or a problem with the advertised IP / hostname. Please make sure that:
spark.driver.bindAddress
spark.driver.host
spark.driver.port
are set to the expected values. Please refer to the networking section of the Spark documentation.
A less likely cause is a lack of resources; check that you are not requesting more resources than the workers provide. A configuration sketch follows below.
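For reference, a minimal sketch of pinning those three properties from code; the host names, port and master URL are placeholders, so substitute an address the workers can actually reach and a port that is open in the firewall:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setMaster("spark://master-host:7077")
  .setAppName("driver-network-check")
  .set("spark.driver.host", "driver-host.example.com") // address advertised to the executors
  .set("spark.driver.bindAddress", "0.0.0.0")          // local interface the driver binds to
  .set("spark.driver.port", "36000")                   // fixed port so it can be opened in the firewall

val sc = new SparkContext(conf)
println(sc.parallelize(1 to 100).count()) // should now get scheduled on the remote worker
sc.stop()

The same properties can also be passed to spark-shell with --conf (for example --conf spark.driver.host=...) if you are not building the conf in code.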

Related

Initial job has not accepted any resources; Error with spark in VMs

I have three Ubuntu VMs (clones) on my local machine which I want to use to make a simple cluster: one VM as the master and the other two as slaves. I can ssh into every VM from every other one successfully, I have the IPs of the two slaves in the conf/slaves file of the master, and the master's IP in the spark-env.sh of every VM. When I run
start-slave.sh spark://master-ip:7077
from the slaves, they appear in the Spark UI. But when I try to run things in parallel I always get the message about the resources. For test code I use the Scala shell:
spark-shell --master spark://master-ip:7077 and sc.parallelize(1 until 10000).count.
Do you mean this warning: WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory?
This message will pop up any time an application is requesting more resources from the cluster than the cluster can currently provide.
Spark is only looking for two things: cores and RAM. Cores represent the number of open executor slots that your cluster provides for execution. RAM refers to the amount of free RAM required on any worker running your application.
Note that for both of these resources the maximum value is not your system's max; it is the max as set by your Spark configuration.
If you need to run multiple Spark apps simultaneously, then you'll need to adjust the number of cores used by each app.
If you are working with applications on the same node, you need to assign cores to each application so they can run in parallel: ResourceScheduling
If you use VMs (as in your situation): assign only one core to each VM when you first create it, or whatever fits your system's resource capacity, because right now Spark requests 4 cores for each of the 2 VMs = 8 cores, which you don't have (see the configuration sketch after this answer).
This is a tutorial I found that could help you: Install Spark on Ubuntu: Standalone Cluster Mode
Further Reading: common-spark-troubleshooting
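As a rough illustration of that capping, here is a sketch of a configuration that limits what one application may claim from the standalone cluster; the master URL, the core count and the memory figure are placeholders to adjust to what your VMs actually offer:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setMaster("spark://master-ip:7077")  // placeholder master URL
  .setAppName("small-footprint-app")
  .set("spark.cores.max", "2")          // total cores this app may take across the cluster
  .set("spark.executor.memory", "512m") // must fit into what each worker advertises

val sc = new SparkContext(conf)
println(sc.parallelize(1 until 10000).count()) // the test job from the question
sc.stop()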

Spark Standalone Cluster deployMode = "cluster": Where is my Driver?

I have researched this for a significant amount of time and find answers that seem to be for a slightly different question than mine.
UPDATE: Spark docs say the Driver runs on a cluster Worker in deployMode: cluster. This does not seem to be true when you don't use spark-submit
My Spark 2.3.3 cluster is running fine. I see the GUI on "http://master-address:8080", and there are 2 idle workers, as configured.
I have a Scala application that creates a context and starts a Job. I do not use spark-submit, I start the Job programmatically and this is where many answers diverge from my question.
In "my-app" I create a new SparkConf, with the following code (slightly abbreviated):
conf.setAppName("my-job")
conf.setMaster("spark://master-address:7077")
conf.set("deployMode", "cluster")
// other settings like driver and executor memory requests
// the driver and executor memory requests are for all mem on the slaves, more than
// mem available on the launching machine with "my-app"
val jars = listJars("/path/to/lib")
conf.setJars(jars)
…
When I launch the job I see 2 executors running on the 2 nodes/workers/slaves. The logs show their IP addresses and call them executor 0 and 1.
With a YARN cluster I would expect the "Driver" to run on/in the YARN master, but I am using the Spark Standalone master. Where is the Driver part of the job running? If it runs on a random worker or elsewhere, is there a way to find it from the logs?
Where is my Spark Driver executing? Does deployMode = cluster work when not using spark-submit? The evidence shows a cluster with one master (on the same machine as executor 0) and 2 Workers. It also shows identical memory usage on both Workers during the job. From the logs I know both Workers are running Executors. Where is the Driver?
The "Driver" creates and broadcasts some large data structures, so the need for an answer is more critical than with more typical tiny Drivers.
Where is the driver running? How do I find it given the logs and monitoring? I can't reconcile what I see with the docs; they contradict each other.
This is answered by the official documentation:
In cluster mode, however, the driver is launched from one of the Worker processes inside the cluster, and the client process exits as soon as it fulfills its responsibility of submitting the application without waiting for the application to finish.
In other words, the driver uses an arbitrary worker node, hence it is likely to be co-located with one of the executors on such a small cluster. And to anticipate the follow-up question: this behavior is not configurable; you just have to make sure that the cluster has capacity to start both the required executors and the driver with its requested memory and cores.
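To make that capacity point concrete, a sizing sketch (the numbers are made up; the property names are standard Spark configuration keys, and whether cluster deploy mode is honored without spark-submit is exactly the open question above):

import org.apache.spark.SparkConf

// In cluster deploy mode the driver itself occupies a worker's memory and cores,
// so everything requested below has to fit on the two workers together.
val conf = new SparkConf()
  .setAppName("my-job")
  .setMaster("spark://master-address:7077")
  .set("spark.submit.deployMode", "cluster") // documented property name for the deploy mode
  .set("spark.driver.memory", "2g")
  .set("spark.driver.cores", "1")
  .set("spark.executor.memory", "4g")
  .set("spark.executor.cores", "2")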

Spark fails with too many open files on HDInsight YARN cluster

I am running into the same issue as in this thread with my Scala Spark Streaming application: Why does Spark job fail with "too many open files"?
But I am using Azure HDInsight to deploy my YARN cluster, and I don't think I can log into those machines and update the ulimit on all of them.
Is there any other way to solve this problem? I cannot reduce the number of reducers by too much either, or my job will become much slower.
You can ssh into all nodes from the head node (the Ambari UI shows the FQDN of all nodes).
ssh sshuser@nameofthecluster.azurehdinsight.net
You can then write a custom action that alters the settings on the necessary nodes if you want to automate this.

Spark driver node and worker node for a Spark application in Standalone cluster

I want to understand, when a Spark application is submitted, which node will act as the driver node and which nodes will act as worker nodes?
For example, if I have a Standalone cluster of 3 nodes:
When the first Spark application (app1) is submitted, the Spark framework will randomly choose one of the nodes as the driver node and the other nodes as worker nodes. This holds only for app1. During its execution, if another Spark application (app2) is submitted, Spark can randomly choose one node as the driver node and the other nodes as worker nodes. This holds only for app2. So while both Spark applications are executing, there can be a situation where two different nodes are master nodes. Please correct me if I misunderstand.
You're on the right track. Spark has a notion of a Worker node, which is used for computation. Each such worker can have N Executor processes running on it. If Spark assigns a driver to run on an arbitrary Worker, that doesn't mean the Worker can't also run additional Executor processes which perform the computation.
As for your example, Spark doesn't select a Master node. A master node is fixed in the environment. What it does choose is where to run the driver, which is where the SparkContext will live for the lifetime of the app. Basically if you interchange Master and Driver, your answer is correct.
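A minimal sketch to make the distinction concrete (the master URL is a placeholder): the JVM that constructs the SparkContext is the driver, so in client deploy mode it is simply whatever machine you launch this from, independent of which node hosts the standalone Master process.

import java.net.InetAddress
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setMaster("spark://master-ip:7077")
  .setAppName("who-is-the-driver")

val sc = new SparkContext(conf)
println(s"Driver JVM host: ${InetAddress.getLocalHost.getHostName}")
println(s"Driver host advertised to executors: ${sc.getConf.get("spark.driver.host", "unset")}")
sc.stop()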

Apache Spark in cluster mode: where do the jobs run, on the master or on the worker nodes?

I have installed Spark in cluster mode with 1 master and 2 workers. When I start spark-shell on the master node it just keeps running without ever giving me the Scala shell.
But when I run spark-shell on a worker node I get the Scala shell and I am able to run jobs:
val file = sc.textFile("hdfs://192.168.1.20:9000/user/1gbdata")
file.count()
And for this I got the output.
So My doubt is actually where to run the spark jobs.
Is it in worker nodes?
Based on the documentation, you need to connect your spark-shell to the master node with the following command: spark-shell --master spark://IP:PORT. This URL can be retrieved from the master's UI or log file.
You should be able to launch spark-shell on the master node (machine); make sure to check the UI to see whether the spark-shell application is actually running and that the prompt is shown (you might need to press Enter after issuing spark-shell).
Please note that when you are using spark-submit in cluster mode, the driver will be submitted directly from one of the worker nodes, contrary to client mode where it will run as a client process. Refer to the documentation for more details.
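For a quick sanity check once the shell is connected, a small sketch reusing the HDFS path from the question:

// run inside a shell started with: spark-shell --master spark://IP:PORT
println(sc.master) // should echo the spark://... URL, confirming the shell is attached to the cluster
val file = sc.textFile("hdfs://192.168.1.20:9000/user/1gbdata") // path from the question
println(file.count()) // the tasks for this count run in executors on the worker nodes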