When Spark is deployed in YARN cluster mode, how should I issue the Spark monitoring REST API calls http://spark.apache.org/docs/latest/monitoring.html ?
Does YARN have an API that takes the REST call for example (I already know the app-id)
http://localhost:4040/api/v1/applications/[app-id]/jobs
, proxies it to the correct driver port, and returns the JSON back to me? By "me" I mean my client.
Assume (or already by design) I cannot directly talk to the driver machine due to security reasons.
pls have a look at spark docs
- REST API
Yes with the latest api its available.
By this article
It turns out there is a third surprisingly easy option which is not documented. Spark has a hidden REST API which handles application submission, status checking and cancellation.
In addition to viewing the metrics in the UI, they are also available as JSON. This gives developers an easy way to create new visualizations and monitoring tools for Spark. The JSON is available for both running applications, and in the history server. The endpoints are mounted at /api/v1. Eg., for the history server, they would typically be accessible at http://:18080/api/v1, and for a running application, at http://localhost:4040/api/v1.
These are the other options available..
Livy jobserver
Submit Spark jobs remotely to an Apache Spark cluster Linux using Livy
Other options include
Triggering spark jobs with REST
This is what worked for me,
In yarn resource manager UI, click on link of the "application manager" for the running application and note the URL that it directs to
For me the link was something like
http://RM:20888/proxy/application_1547506848892_0002/
Append "api/v1/applications/application_1547506848892_0002" to the URL for the api.
For above case the api url is
curl "http://RM:20888/proxy/application_1547506848892_0002/api/v1/applications/application_1547506848892_0002"
Related
I have created a sample play framework api which has one endpoint.
http://play-demo-broker.cfapps.io/say?number=20
Which just return me number that have passed.
I am able successfully deploy the service. Next want this service to Act like service broker
For same want to register this as by using below command
cf create-service-broker play-demo-broker admin admin http://play-demo-broker.cfapps.io --space-scoped
This command it giving me below error -
The service broker rejected the request. Status Code: 404 Not Found
Not sure what is causing this issue as there not much information available for Play Framework Service broker Setup.
The play framework is implemented above the akka packages. Akka rejects paths that are not implemented.
If I an not mistaken, cf create-service-broker command access the / endpoint. If you implemented only say?number=20 endpoint, then be default all other paths, such as the empty path, are rejected by Akka.
In order to open that endpoint you need to add it into the routes.
For example you can add:
GET / controllers.ControllerName.GetEmptyPath
And implement the GetEmptyPath method in ControllerName
I run Spark in both client and cluster mode. Is there any rest url that can be used to kill running spark apps and drivers?
At the moment Spark has a hidden REST API. It's likely that in the future it will be public (see issue SPARK-12528). However, at the moment it's still "private", so you should use it at your own risk - meaning that if something changes in the API of the next Spark version, you need to update your code.
Otherwise, you can use Spark-server, but this will bring along more packages/dependencies, which you might not need.
curl -X PUT 'http://localhost:8088/ws/v1/cluster/apps/application_1524528223375_0082/state' -d '{"state": "KILLED"}'
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Application_State_API
If running on yarn, you can use "yarn application -kill application_XXXX_ID" to kill a application.
This command can also be issued using YARN REST APIs, with an decent description of calls listed here or in the official docs
The blog post apache-spark-hidden-rest-api uses actually the YARN REST API.
Thus said, the above is possible only on YARN.
Please try this if you have submissionId:-
curl -X POST http://spark-cluster-ip:6066/v1/submissions/kill/driver-20151008145126-0000
I'm trying to retrieve projects metrics using the REST Api. Therefore I first query the projects using "/api/projects/index". Afterwards I retrieve the metrics using "/api/metrics/search". Both works fine. And I result with:
[id:35476, k:com.test:TestProject, nm:TestProject, qu:TRK, sc:PRJ]
[custom:false, description:Cyclomatic complexity, direction:-1, domain:Complexity, hidden:false, id:10019, key:complexity, name:Complexity, qualitative:false, type:INT]
Now I wanted to retrieve a projects metrics. Therefore I use the following URL:
https://MYHOST/sonarqube/api/timemachine/index?resource=35476&metric=10019&fromDateTime=2010-12-25T23:59:59+0100&toDateTime=2018-12-25T23:59:59+0100
There the server retruns only: [{"cols":[],"cells":[]}]
This surprices me, because when I enter the WebInterface of sonar for the project, I can see numbers. I tried some other metrics, however all ended with the same result. What am I doing wrong?
You didn't mention server version, so I'll assume the latest: 5.2.
I got the same result for a bare query (http://nemo.sonarqube.org/api/timemachine/index), and for a query which specified resource but not metrics (http://nemo.sonarqube.org/api/timemachine/index?resource=org.sonarsource.sonarqube%3Asonarqube).
So I'm guessing there's a problem with either your resource or metric id. Try using the keys (com.test&%3ATestProject, and complexity) instead.
And yes, the ids you got back from the other web services should work here, but what's meant by "id" can be a little... ah... variable from service to service to service.
I have a use case to integrate and Import Ambari alerts that's getting generated in Ambari Web interface , to the centralized monitoring environment we are using for managing clusters.I am using HDP . Do we have any detailed documentation/Steps/ about how to do this. Here are some example that I want to accomplish
How to make a REST API call to see if HDFS file system if filled and uses is more than 90 % or how to check if if one of service is down like HDFS/HBASE is not working and have raised alarm in Ambari GUI .
Checkout this page for links to the Ambari REST API for alerts:
https://cwiki.apache.org/confluence/display/AMBARI/Alerts
Also checkout slides 4-20 in this SlideShare, particularly slide 13 highlights the Alerts REST API:
http://www.slideshare.net/hortonworks/apache-ambari-whats-new-in-200
Hadoop 2.5.1 added a new Rest api to submit an application:
http://hadoop.apache.org/docs/r2.5.1/hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html#Cluster_Applications_APISubmit_Application
"Cluster Applications API(Submit Application)
The Submit Applications API can be used to submit applications. In case of submitting applications, you must first obtain an application-id using the Cluster New Application API. "
Until Hadoop 2.4 to run a mapreduce example from command line we must execute the hadoop command line shell:
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar grep input output 'dfs[a-z.]+'
Now in Hadoop 2.5.1 can run the same mapreduce sample using the above Rest api but I was not able (I didn't understand) how the http Request Body should be written.
I read the doc above and the example is about a YARN application but I was not able to create a body for a mapreduce application.
Specifically is not clear to me how to fill the Elements of the am-container-spec object (specifically local-resources and commands) to let the application to run the hadoop-mapreduce-examples-2.5.1.jar grep example.
Can someone send me the JSON or XML about the Request Body to run the above mapreduce example?