Spark Job SUBMITTED but not RUNNING after submit via REST API

Following the instructions on this website, I'm trying to submit a job to Spark through the REST API endpoint /v1/submissions.
I tried to submit SparkPi in the example:
$ ./create.sh
{
  "action" : "CreateSubmissionResponse",
  "message" : "Driver successfully submitted as driver-20211212044718-0003",
  "serverSparkVersion" : "3.1.2",
  "submissionId" : "driver-20211212044718-0003",
  "success" : true
}
$ ./status.sh driver-20211212044718-0003
{
  "action" : "SubmissionStatusResponse",
  "driverState" : "SUBMITTED",
  "serverSparkVersion" : "3.1.2",
  "submissionId" : "driver-20211212044718-0003",
  "success" : true
}
create.sh:
curl -X POST http://172.17.197.143:6066/v1/submissions/create --header "Content-Type:application/json;charset=UTF-8" --data '{
  "appResource": "/home/ruc/spark-3.1.2/examples/jars/spark-examples_2.12-3.1.2.jar",
  "sparkProperties": {
    "spark.master": "spark://172.17.197.143:7077",
    "spark.driver.memory": "1g",
    "spark.driver.cores": "1",
    "spark.app.name": "REST API - PI",
    "spark.jars": "/home/ruc/spark-3.1.2/examples/jars/spark-examples_2.12-3.1.2.jar",
    "spark.driver.supervise": "true"
  },
  "clientSparkVersion": "3.1.2",
  "mainClass": "org.apache.spark.examples.SparkPi",
  "action": "CreateSubmissionRequest",
  "environmentVariables": {
    "SPARK_ENV_LOADED": "1"
  },
  "appArgs": [
    "400"
  ]
}'
status.sh:
export DRIVER_ID=$1
curl http://172.17.197.143:6066/v1/submissions/status/$DRIVER_ID
But when I query the status of the job (even after a few minutes), I keep getting "SUBMITTED" rather than "RUNNING" or "FINISHED".
Then I looked at the master log and found:
21/12/12 04:47:18 INFO master.Master: Driver submitted org.apache.spark.deploy.worker.DriverWrapper
21/12/12 04:47:18 WARN master.Master: Driver driver-20211212044718-0003 requires more resource than any of Workers could have.
# ...
21/12/12 04:49:02 WARN master.Master: Driver driver-20211212044718-0003 requires more resource than any of Workers could have.
However, in my spark-env.sh, I have
export SPARK_WORKER_MEMORY=10g
export SPARK_WORKER_CORES=2
I have no idea what happened. How can I make it run normally?

Since you've checked the worker resources and they are sufficient, it might be a network issue: the executors may not be able to connect back to the driver program. Allow traffic on the relevant ports on both the master and the workers.
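As a first check, you can confirm that the workers are actually registered with the master and advertise the expected memory and cores; the standalone master serves a JSON view of its state on the web UI port. A minimal sketch, assuming default ports and firewalld (adjust for your firewall):
$ curl http://172.17.197.143:8080/json/
# Open the standalone-cluster ports on the master and all workers:
# 7077 = master RPC, 6066 = REST submission server, 8080/8081 = web UIs
$ sudo firewall-cmd --permanent --add-port=7077/tcp --add-port=6066/tcp --add-port=8080-8081/tcp
$ sudo firewall-cmd --reload
The driver and executors also communicate over ephemeral ports by default; pinning them with spark.driver.port and spark.blockManager.port and opening those as well is another option.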

Related

OpsManager mongodb deployment issue adding PLAIN auth

I'm trying to enable PLAIN authentication over a MongoDB sharded replica set managed with OpsManager, following their documentation https://docs.opsmanager.mongodb.com/v4.0/tutorial/enable-ldap-authentication-for-group/ .
The issue I'm facing is that the automation-agent fails to get the mongos status while restarting after enabling security. Please see the error output below:
<mongos_5> [09:18:19.711] Failed to compute states :
<mongos_5> [09:18:19.711] Error calling ComputeState : <mongos_5> [09:18:19.632] Error getting current config from running mongo using conn params = mongos01:27017 (local=false) :
<mongos_5> [09:18:19.632] Error getting pid for mongos01:27017 (local=false) :
<mongos_5> [09:18:19.632] Error running command for runCommandWithTimeout(dbName=admin, cmd=[{serverStatus 1} {locks false} {recordStats false}]) :
result={"$clusterTime":{"clusterTime":6808443558471663617,"signature": {"hash":"e44BxV30B7dTpampo4VZsVuio7E=","keyId":6808441655801151517}},"code":13,"codeName":"Unauthorized",
"errmsg":"command serverStatus requires authentication","ok":0,"operationTime":6808443558471663617} connection=&{mongos01:27017 (local=false) 2 true 0xc4207b21a0 2020-03-26 09:18:19.627337419 +0000 UTC 0xc4207bdef0 <nil> }
identityUsed= : command serverStatus requires authentication
I noticed that even though OpsManager is not able to get the status, security was enabled successfully and the PLAIN authentication mechanism works, but the status hangs at
Start the process ... Start MongoDB process
I tried this over the API following the mongodb-labs repo https://github.com/mongodb-labs/mms-api-examples/blob/master/automation/api_usage_example/configs/security_ldap_cluster.json and also manually following the MongoDB docs, but every time I face the same error.
In the end I enabled LDAP (PLAIN) only for mongod in the mongo config file (see the Ops Manager API snippet below) and avoided enabling it in OpsManager for the agents as well.
{
  "args2_6": {
    "net": {
      "port": 28001
    },
    "replication": {
      "replSetName": "rs0"
    },
    "storage": {
      "dbPath": "/data/mongo"
    },
    "systemLog": {
      "destination": "file",
      "path": "/data/mongo/mongodb.log"
    },
    "security": {
      "authorization": "enabled"
    },
    "setParameter": {
      "saslauthdPath": "",
      "authenticationMechanisms": "PLAIN,MONGO-CR,SCRAM-SHA-256"
    }
  }, ...
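A per-process config like this is pushed as part of the automation configuration. A minimal sketch of such a call, assuming the standard Ops Manager public API; the host, group ID, and credentials are placeholders:
curl --user "{USER}:{API-KEY}" --digest \
  -H "Content-Type: application/json" \
  -X PUT "https://opsmanager.example.com/api/public/v1.0/groups/{GROUP-ID}/automationConfig" \
  --data @automation_config.json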

Jfrog REST API Jenkins Groovy status code: 404, reason phrase: Not Found

I have a Groovy script which is run in the Jenkins script console. The script uses the JFrog REST API to run some queries, one of which returns: status code: 404, reason phrase: Not Found
CURL:
$ curl -X GET -H "X-JFrog-Art-Api:APIKey" https://OU.jfrog.io/OU/api/storage/test-repository/docker-log-gen/1.12/manifest.json?properties
{
  "properties" : { ... },
  "uri" : "https://OU.jfrog.io/artifactory/api/storage/test-repository/docker-log-gen/1.12/manifest.json"
}
WGET:
$ wget --header="X-JFrog-Art-Api:APIKey" https://OU.jfrog.io/OU/api/storage/test-repository/docker-log-gen/1.12/manifest.json?properties
--2020-01-14 13:12:16-- https://OU.jfrog.io/OU/api/storage/test-repository/docker-log-gen/1.12/manifest.json?properties
HTTP request sent, awaiting response... 200 OK
Jenkins Groovy:
def restClient = new RESTClient('https://OU.jfrog.io')
restClient.headers['X-JFrog-Art-Api'] = 'APIKey'
println(restClient.get(path: '/OU/api/storage/test-repository/docker-log-gen/1.12/manifest.json?properties', requestContentType: 'text/plain') )
groovyx.net.http.HttpResponseException: status code: 404, reason phrase: Not Found
Other REST calls (api/docker) are made prior to this one in the script and return successfully. I am unable to identify a cause for this response, since, as shown, the command-line calls return the expected JSON.
Please help.
The part after the first question mark is not part of the URI path. RESTClient URL-encodes the whole path value, so the ? is escaped instead of starting a query string; pass the query parameters separately:
println(restClient.get(path: '/OU/api/storage/test-repository/docker-log-gen/1.12/manifest.json', query: ['properties': ''] , requestContentType: 'text/plain').data.text )
{
  "properties" : { ... },
  "uri" : "https://OU.jfrog.io/artifactory/api/storage/test-repository/docker-log-gen/1.12/manifest.json"
}

How to convert Livy curl call to Livy Rest API call

I am getting started with Livy. In my setup the Livy server is running on a Unix machine and I am able to curl to it and execute jobs. I have created a fat jar, uploaded it to HDFS, and I am simply calling its main method from Livy. My JSON payload for Livy looks like this:
{
  "file" : "hdfs:///user/data/restcheck/spark_job_2.11-3.0.0-RC1-SNAPSHOT.jar",
  "proxyUser" : "test_user",
  "className" : "com.local.test.spark.pipeline.path.LivyTest",
  "files" : ["hdfs:///user/data/restcheck/hivesite.xml", "hdfs:///user/data/restcheck/log4j.properties"],
  "driverMemory" : "5G",
  "executorMemory" : "10G",
  "executorCores" : 5,
  "numExecutors" : 10,
  "queue" : "user.queue",
  "name" : "LivySampleTest2",
  "conf" : {
    "spark.master" : "yarn",
    "spark.executor.extraClassPath" : "/etc/hbase/conf/",
    "spark.executor.extraJavaOptions" : "-Dlog4j.configuration=file:log4j.properties",
    "spark.driver.extraJavaOptions" : "-Dlog4j.configuration=file:log4j.properties",
    "spark.ui.port" : 4100,
    "spark.port.maxRetries" : 100,
    "JAVA_HOME" : "/usr/java/jdk1.8.0_60",
    "HADOOP_CONF_DIR" : "/etc/hadoop/conf:/etc/hive/conf:/etc/hbase/conf",
    "HIVE_CONF_DIR" : "/etc/hive/conf"
  }
}
and below is my curl call to it:
curl -X POST --negotiate -u:"test_user" --data @/user/data/Livy/SampleFile.json -H "Content-Type: application/json" https://livyhost:8998/batches
I am trying to convert this to a REST API call, following the WordCount example provided by Cloudera, but I am not able to convert my curl call to the REST API. I have all the jars already added in HDFS, so I don't think I need to do the upload-jar call.
It should work with curl as well. Please try the JSON below:
curl -H "Content-Type: application/json" https://livyhost:8998/batches \
  -X POST --data '{
    "name" : "LivyREST",
    "className" : "com.local.test.spark.pipeline.path.LivyTest",
    "file" : "/user/data/restcheck/spark_job_2.11-3.0.0-RC1-SNAPSHOT.jar"
  }'
Also, here is a further reference:
http://gethue.com/how-to-use-the-livy-spark-rest-job-server-api-for-submitting-batch-jar-python-and-streaming-spark-jobs/
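To check on a submitted batch afterwards, Livy also exposes status endpoints. A minimal sketch (the batch id 0 is a placeholder for the id returned by the POST):
curl https://livyhost:8998/batches           # list all batches
curl https://livyhost:8998/batches/0         # details for one batch
curl https://livyhost:8998/batches/0/state   # just its state, e.g. "running"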

How to enable Javascript in Druid

I have been using Druid for the past week and wanted to enable javascript for some postAggregations.
I think I followed the outlined steps and updated the common.runtime.properties file in ../conf/druid/_common/ to include druid.javascript.enabled=true. I then stopped the current processes and re-ran the Quickstart procedures, but it still says that JavaScript is disabled:
{
  "error" : "Unknown exception",
  "errorMessage" : "Instantiation of [simple type, class io.druid.query.aggregation.post.JavaScriptPostAggregator] value failed: JavaScript is disabled. (through reference chain: java.util.ArrayList[0])",
  "errorClass" : "com.fasterxml.jackson.databind.JsonMappingException",
  "host" : null
}
I am currently running it in the 'Quickstart' configuration - single local machine. Any pointers? Thanks!
This is a JavaScript query for a Druid aggregation. Save the query as query.body and hit the curl request; this is a sample query for an average value:
curl -X POST "http://localhost:8082/druid/v2/?pretty" \
  -H 'content-type: application/json' -d @query.body
{
  "queryType": "groupBy",
  "dataSource": "whirldata",
  "granularity": "all",
  "dimensions": [],
  "aggregations": [
    {"name": "rows", "type": "count", "fieldName": "rows"},
    {"name": "TargetDOS", "type": "doubleSum", "fieldName": "Target DOS"}
  ],
  "postAggregations": [
    {
      "type": "javascript",
      "name": "Target DOS Average",
      "fieldNames": ["TargetDOS", "rows"],
      "function": "function(TargetDOS, rows) { return Math.abs(TargetDOS) / rows; }"
    }
  ],
  "intervals": ["2006-01-01T00:00:00.000Z/2020-01-01T00:00:00.000Z"]
}
The part you are missing is likely that the quickstart reads configs from conf-quickstart rather than conf. So try editing conf-quickstart/druid/_common/common.runtime.properties.
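As a minimal sketch, assuming the stock quickstart layout, the line to add is:
# conf-quickstart/druid/_common/common.runtime.properties
druid.javascript.enabled=true
Then restart the quickstart processes so every service picks up the change.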

wso2am API manager 2.1 publisher change-lifecycle issue

I deployed API Manager 2.1.0 and set up the api-import-export-2.1.0 WAR file described here. After importing my API endpoint by uploading a zip file, the status is CREATED.
To actually publish the API I am calling the Publisher's change-lifecycle API, but I am getting this exception:
TID: [-1234] [] [2017-07-06 11:11:57,289] ERROR {org.wso2.carbon.apimgt.rest.api.util.exception.GlobalThrowableMapper} - An Unknown exception has been captured by global exception mapper. {org.wso2.carbon.apimgt.rest.api.util.exception.GlobalThrowableMapper}
java.lang.NoSuchMethodError: org.wso2.carbon.apimgt.api.APIProvider.changeLifeCycleStatus(Lorg/wso2/carbon/apimgt/api/model/APIIdentifier;Ljava/lang/String;)Z
Any ideas on why?
I can get an access token (scope apim:api_view) and call this
:9443/api/am/publisher/v0.10/apis
to list the api's just fine.
I get a different access_token (for scope: apim:api_publish) and then call
:9443/api/am/publisher/v0.10/apis/change-lifecycle
but get the above Exception. Here's the example:
[root@localhost] ./publish.sh
View APIs (token dc0c1497-6c27-3a10-87d7-b2abc7190da5 scope: apim:api_view)
curl -k -s -H "Authorization: Bearer dc0c1497-6c27-3a10-87d7-b2abc7190da5" https://gw-node:9443/api/am/publisher/v0.10/apis
{
  "count": 1,
  "next": "",
  "previous": "",
  "list": [
    {
      "id": "d214f784-ee16-4067-9588-0898a948bb17",
      "name": "Health",
      "description": "health check",
      "context": "/api",
      "version": "v1",
      "provider": "admin",
      "status": "CREATED"
    }
  ]
}
Publish API (token b9a31369-8ea3-3bf2-ba3c-7f2a4883de7d scope: apim:api_publish)
curl -k -H "Authorization: Bearer b9a31369-8ea3-3bf2-ba3c-7f2a4883de7d" -X POST "https://gw-node:9443/api/am/publisher/v0.10/apis/change-lifecycle?apiId=d214f784-ee16-4067-9588-0898a948bb17&action=Publish"
{
  "code": 500,
  "message": "Internal server error",
  "description": "The server encountered an internal error. Please contact administrator.",
  "moreInfo": "",
  "error": []
}
Issue resolved. In APIM 2.1 the publisher and store API versions changed.
In APIM 2.0 I was using:
:9443/api/am/publisher/v0.10/apis
:9443/api/am/store/v0.10/apis
but in APIM 2.1 they are:
:9443/api/am/publisher/v0.11/apis
:9443/api/am/store/v0.11/apis
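With the corrected version in the path, the publish call from above becomes (same placeholder token and IDs):
curl -k -H "Authorization: Bearer b9a31369-8ea3-3bf2-ba3c-7f2a4883de7d" -X POST "https://gw-node:9443/api/am/publisher/v0.11/apis/change-lifecycle?apiId=d214f784-ee16-4067-9588-0898a948bb17&action=Publish"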