passing python modules on HDFS through livy - pyspark

On the /user/usr1/ path in HDFS, I placed two scripts pySparkScript.py and relatedModule.py. relatedModule.py is a python module which will be imported into pySparkScript.py.
I can run the scripts with spark-submit pySparkScript.py
However, I need to run these scripts through Livy. Normally, I run single scripts successfully as follows:
curl -H "Content-Type:application/json" -X POST -d '{"file": "/user/usr1/pySparkScript.py"}' livyNodeAddress/batches
However, when I run the above command, the script fails as soon as it reaches the import of relatedModule. I realize I should also give Livy the path to relatedModule.py in the request parameters. I tried the following option:
curl -H "Content-Type:application/json" -X POST -d '{"file": "/user/usr1/pySparkScript.py", "files": ["/user/usr1/relatedModule.py"]}' livyNodeAddress/batches
How should I pass both files to Livy?

Use the pyFiles property instead of files; it takes a list of Python files to make available to the job.
Please refer to the Livy REST API docs.
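As a rough sketch of what that request body could look like, the snippet below builds the JSON for POST /batches with the dependency listed under pyFiles rather than files. The helper name build_batch_payload is only illustrative and is not part of Livy; the paths are the ones from the question.

```python
import json


def build_batch_payload(main_script, py_modules):
    """Return the JSON body for a Livy POST /batches request.

    main_script: HDFS path of the PySpark entry script.
    py_modules:  HDFS paths of Python files the script imports.
    """
    return {"file": main_script, "pyFiles": list(py_modules)}


# Paths taken from the question; the helper name is hypothetical.
payload = build_batch_payload(
    "/user/usr1/pySparkScript.py",
    ["/user/usr1/relatedModule.py"],
)
body = json.dumps(payload)
print(body)
```

The printed string can be passed to curl's -d option in place of the body used in the question.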


How to associate a project with a quality profile in Sonarqube

I am using the Sonar web API to associate a project with a quality profile, but I am not able to do it. On every run of sonar-scanner, the default quality profile is associated instead. Below is the code snippet.
Updated the code snippet
curl -k -X POST --insecure -H "X-Auth-Token:XXX" -d "language=py" -d "qualityProfile=test_profile" -d "project=test_1.0" https://sonartest.xxx.com/api/qualityprofiles/add_project
I am not sure what I am doing wrong. I have administrative access and followed the webapi of Version 6.7.3 (build 38370)
I finally got some help from the SonarQube community. I needed to remove the X-Auth-Token header and pass the token with -u instead. It should look something like this:
curl -u ur_token: -X POST -d language=py -d qualityProfile=test_profile -d projectKey=${params.ProjectName} https://sonar-url.com/api/qualityprofiles/add_project
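For anyone scripting this outside of curl, here is a minimal sketch of the same call in Python. It assumes what curl -u ur_token: does, namely HTTP basic auth with the token as the username and an empty password. The function name and the example project key are made up for illustration, and the request is only built, not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_add_project_request(base_url, token, language, profile, project_key):
    """Build (without sending) the POST for api/qualityprofiles/add_project."""
    form = urllib.parse.urlencode(
        {"language": language, "qualityProfile": profile, "projectKey": project_key}
    ).encode("ascii")
    # curl -u token: sends the token as the username with an empty password.
    credentials = base64.b64encode(f"{token}:".encode("ascii")).decode("ascii")
    request = urllib.request.Request(
        f"{base_url}/api/qualityprofiles/add_project", data=form, method="POST"
    )
    request.add_header("Authorization", f"Basic {credentials}")
    return request


# Placeholder values; "test_1.0" is a hypothetical project key.
request = build_add_project_request(
    "https://sonar-url.com", "ur_token", "py", "test_profile", "test_1.0"
)
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would actually send it.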

How to execute a jar-packaged scala program via Apache Livy on Spark that responds with a result directly to a client request?

What I intend to achieve is having a Scala Spark program (in a jar) receive a POST message from a client e.g. curl, take some argument values, do some Spark processing and then return a result value to the calling client.
From the available Apache Livy documentation, I cannot find a way to invoke a compiled and packaged Spark program from a client (e.g. curl) via Livy in interactive (session) mode. Such a request/reply scenario can be done via Livy with Scala code passed as plain text to the Spark shell. But how can I do it with a Scala class in a packaged jar?
curl -k --user "admin:mypassword" -v \
-H "Content-Type: application/json" -X POST \
-d @Curl-KindSpark_ScalaCode01.json \
"https://myHDI-Spark-Clustername.azurehdinsight.net/livy/sessions/0/statements" \
-H "X-Requested-By: admin"
Instead of Scala source code as data (-d @Curl-KindSpark_ScalaCode01.json), I would rather pass the path and filename of the jar file, a class name, and argument values. But how?
Build an uber jar of your Spark app with the sbt-assembly plugin.
Upload the jar file from the previous step to HDFS on your cluster:
hdfs dfs -put /home/hduser/PiNumber.jar /user/hduser
Execute your job via livy:
curl -X POST -d '{"kind": "spark", "jars": ["hdfs://localhost:8020/user/hduser/PiNumber.jar"]}' -H "Content-Type: application/json" -H "X-Requested-By: user" localhost:8998/sessions
Check the result:
curl localhost:8998/sessions/0/statements/3
{"id":3,"state":"available","output":{"status":"ok","execution_count":3,"data":{"text/plain":"Pi is roughly 3.14256"}}}
p.s.
The Livy API for Scala/Java requires an uber jar file. sbt-assembly doesn't build a fat jar instantly, which annoys me.
Usually, I use the Python API of Livy for smoke tests and tweaking.
Sanity checks with Python:
curl localhost:8998/sessions/0/statements -X POST -H 'Content-Type: application/json' -d '{"code":"print(\"Sanity check for Livy\")"}'
You can put more complicated logic in the code field.
BTW, this is how popular notebooks for Spark work: they send the source code to the cluster via Livy.
Thanks, I will try this out. In the meantime I found another solution:
$ curl -k --user "admin:" -v -H "Content-Type: application/json" -X POST -d @Curl-KindSpark_BrandSampleModel_SessionSetup.json "https://mycluster.azurehdinsight.net/livy/sessions"
with a JSON file containing
{
"kind": "spark",
"jars": ["adl://skylytics1.azuredatalakestore.net/skylytics11/azuresparklivy_2.11-0.1.jar"]
}
and with the uploaded jar in the Azure Data Lake Gen1 account containing the Scala object and then post the statement
$ curl -k --user "admin:myPassword" -v -H "Content-Type: application/json" -X POST -d @Curl-KindSpark_BrandSampleModel_CodeSubmit.json "https://mycluster.azurehdinsight.net/livy/sessions/4/statements" -H "X-Requested-By: admin"
with the content
{
"code": "import AzureSparkLivy_GL01._; val brandModelSamplesFirstModel = AzureSparkLivyFunction.SampleModelOfBrand(sc, \"Honda\"); brandModelSamplesFirstModel"
}.
So I told Livy to start an interactive Spark session and load the specified jar and passed some code to invoke a member of the object in the jar. It works. Will check your advice too.
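To make the flow in this thread a bit more concrete, here is a small sketch of the pieces a client would need: the body that starts a session with a jar, the body that submits a statement, and a helper that pulls the text result out of a statement response. All three function names are hypothetical, nothing is sent over the network, and the sample response simply reuses the shape shown in the answer above.

```python
import json


def session_payload(jar_uri):
    """Body for POST /sessions: a Scala session with the jar on its classpath."""
    return {"kind": "spark", "jars": [jar_uri]}


def statement_payload(code):
    """Body for POST /sessions/{id}/statements."""
    return {"code": code}


def extract_result(statement):
    """Return the text/plain output of a finished statement, or None if not ready."""
    if statement.get("state") != "available":
        return None
    output = statement.get("output") or {}
    if output.get("status") != "ok":
        raise RuntimeError(f"statement failed: {output}")
    return output["data"]["text/plain"]


# Sample response in the shape shown in the answer above.
sample = json.loads(
    '{"id":3,"state":"available","output":{"status":"ok",'
    '"execution_count":3,"data":{"text/plain":"Pi is roughly 3.14256"}}}'
)
print(extract_result(sample))
```

A client would typically poll the statement URL until extract_result returns something other than None.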

POST request with Powershell 2.0 using cURL

Scenario
Among other things, Powershell 2.0 doesn't have the useful cmdlet Invoke-RestMethod.
I can't upgrade to version 3 and most examples I've found use version 3.
I have found this article, which seems, however, too complicated for my simple scenario.
I need to write a Powershell script that POSTs data in Json format, e.g.
{"Id":5,"Email":"test#com","DataFields":null,"Status":0}
What I've tried
I am able to GET data. This is one of the scripts I have tried.
curl -v --user username:password https://api.dotmailer.com/v2/account-info
But, when I try to POST, I can't figure out where to put the body of the message in the script. This is what I've got so far:
curl -v -X POST -H "Accept: application/json" -H "Content-Type: application/json" -u username:password -d '{"Id":5,"Email":"test#com","OptInType":0,"EmailType":0, "DataFields":null,"Status":0}' https://api.dotmailer.com/v2/contacts
which returns the following error:
{"message":"Could not parse the body of the request based on the content type \"application/json\" ERROR_BODY_DOES_NOT_MATCH_CONTENT_TYPE"}
Question
Can anyone advise on how to POST Json data from Powershell using cURL?
Any pointers as to why I get the error I mentioned in the What I've tried section would be much appreciated.
Thanks.
Note that the question is about the curl.exe external program, not about PowerShell's Invoke-WebRequest cmdlet (which, unfortunately, is aliased to curl in later PowerShell versions, preempting calls to the external program unless the .exe extension is explicitly specified, as in curl.exe ...).
Unfortunately and unexpectedly, you have to \-escape embedded " instances in a string you pass as an argument to an external program.
Therefore, even though:
'{"Id":5,"Email":"test#com","DataFields":null,"Status":0}'
should work, it doesn't, due to a long-standing bug; instead, you must use:
'{\"Id\":5,\"Email\":\"test#com\",\"DataFields\":null,\"Status\":0}'
See this answer for more information.
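If the body is generated rather than typed by hand, the escaped form can be produced mechanically. The snippet below is only a sketch of that idea in Python (not PowerShell): it serializes the example object from the question and backslash-escapes each embedded double quote. The variable names are arbitrary.

```python
import json

payload = {"Id": 5, "Email": "test#com", "DataFields": None, "Status": 0}

# Compact JSON, matching the body in the question.
body = json.dumps(payload, separators=(",", ":"))

# Backslash-escape every embedded double quote, as the workaround requires.
escaped = body.replace('"', '\\"')

print(escaped)
```

The printed text is what would go between the single quotes on the PowerShell command line.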
From curl's man page, it appears you need to use the -d switch:
curl -v --user username:password -H "Content-Type: application/json" -d '{"Id":5,"Email":"test#com","DataFields":null,"Status":0}' https://api.dotmailer.com/v2/contacts

How do I deploy a file to Artifactory using the command line?

I've spent far more time on this than I care to admit. I am trying to just deploy one file into my Artifactory server from the command line. I'm doing this using gradle because that is how we manage our java builds. However, this artifact is an NDK/JNI build artifact, and does not use gradle.
So I just need the simplest gradle script to do the deploy. Something equivalent to:
scp <file> <remote>
I am currently trying to use the artifactory plugin, and am having little luck in locating a reference for the plugin.
curl POST did not work for me; PUT worked correctly. The usage is:
curl -X PUT $SERVER/$PATH/$FILE --data-binary @localfile
Example:
$ curl -v --user username:password --data-binary @local-file -X PUT "http://<artifactory server>/artifactory/abc-snapshot-local/remotepath/remotefile"
Instead of using the curl command, I recommend using the jfrog CLI.
Download from here - https://www.jfrog.com/getcli/ and use the following command (make sure the file is executable) -
./jfrog rt u <file-name> <upload-path>
Here is a simple example:
./jfrog rt u sample-service-1.0.0.jar libs-release-local/com/sample-service/1.0.0/
You will be prompted for credentials and the repo URL the first time.
You can do lots of other stuff with this CLI tool. Check out the detailed instructions here - https://www.jfrog.com/confluence/display/RTF/JFrog+CLI.
The documentation for the artifactory plugin can be found, as expected, in Artifactory User Guide.
Please note that it is advised to use the newer plugin, artifactory-publish, which supports the new Gradle publishing model.
Regarding uploading from the command line, you really don't need Gradle for that. You can execute a simple PUT request using curl or any other tool.
And of course if you just want to get your file into Artifactory, you can always deploy it via the UI.
Take a look at the Artifactory REST API. In most cases you can't use the scp command; use curl against the REST API instead.
$ curl -X POST $SERVER/$PATH/$FILE --data @localfile
It mostly looks like this:
$ curl -X POST http://localhost:8081/artifactory/abc-snapshot-local/remotepath/remotefile --data @localfile
The scp command is only used if you really want to access the internal folder which is managed by artifactory
$ curl -v -X PUT \
--user username:password \
--upload-file <path to your file> \
http://localhost:8080/artifactory/libs-release-local/my/jar/1.0/jar-1.0.jar
Ironically, I'm answering my own question. After a couple more hours working on the problem, I found a sample project on github: https://github.com/JFrogDev/project-examples
The project even includes a straightforward bash script for doing the exact deploy/copy from the command line that I was looking for, as well as a couple of less straightforward gradle scripts.
As per the official docs, you can upload any file using the following command:
curl -u username:password -T <PATH_TO_FILE> "https://<ARTIFACTORY_SERVER>/<REPOSITORY_PATH>/<TARGET_FILE>"
Note: The user should have write access to this path.
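For scripted deploys without curl, the PUT-based upload described in the answers above could be sketched in Python roughly as follows. This assumes plain HTTP basic auth, as with --user and -u in the curl examples; the function name is made up, and the request is built but not sent.

```python
import base64
import urllib.request


def build_upload_request(base_url, target_path, content, username, password):
    """Build (without sending) an HTTP PUT that deploys `content` to Artifactory."""
    url = f"{base_url.rstrip('/')}/{target_path.lstrip('/')}"
    request = urllib.request.Request(url, data=content, method="PUT")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical content; in practice read the artifact with open(path, "rb").
request = build_upload_request(
    "http://localhost:8080/artifactory",
    "libs-release-local/my/jar/1.0/jar-1.0.jar",
    b"example bytes",
    "username",
    "password",
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would perform the upload.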

Create jobs and execute them in jenkins using REST

I am trying to create a WCF REST client that will communicate to Jenkins and create a job from an XML file and then build the job. My understanding is that you can do that with Jenkins.
Can someone please provide some commands that you can type in a browser's address bar to create and build jobs? For example, http://localhost/jenkins/createItem?name=TESTJOB or something along those lines.
Usually, when parsing through the documentation, it can take one or two days. It is helpful to be able to access code or curl commands to get you up and running in one hour. That is my objective with a lot of third party software.
See the post at http://scottizu.wordpress.com/2014/04/30/getting-started-with-the-jenkins-api/ which lists several of the curl commands. You will have to replace my.jenkins.com (i.e. JENKINS_HOST) with your own URL.
To create a job, for instance, try:
curl -X POST -H "Content-Type:application/xml" -d "<project><builders/><publishers/><buildWrappers/></project>" "http://JENKINS_HOST/createItem?name=AA_TEST_JOB2"
This uses a generic config. You can also download a config from a manually created job and then use that as a template.
curl "http://JENKINS_HOST/job/MY_JOB_NAME/config.xml" > config.xml
curl -X POST -H "Content-Type:application/xml" -d @config.xml "http://JENKINS_HOST/createItem?name=AA_TEST_JOB3"
To execute the job (and set string parameters), use:
curl "http://JENKINS_HOST/job/MY_JOB_NAME/build"
curl "http://JENKINS_HOST/job/MY_JOB_NAME/buildWithParameters?PARAMETER0=VALUE0&PARAMETER1=VALUE1"
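When parameter values contain spaces or ampersands, building the buildWithParameters URL by hand gets error-prone. Here is a small sketch that assembles both URL forms from the commands above; the function name build_url and the second example value are made up for illustration.

```python
from urllib.parse import quote, urlencode


def build_url(jenkins_host, job_name, parameters=None):
    """Return the URL that triggers a build of `job_name`.

    With parameters, the buildWithParameters endpoint is used and the
    values are percent-encoded into the query string.
    """
    base = f"http://{jenkins_host}/job/{quote(job_name, safe='')}"
    if not parameters:
        return f"{base}/build"
    return f"{base}/buildWithParameters?{urlencode(parameters)}"


plain = build_url("JENKINS_HOST", "MY_JOB_NAME")
with_params = build_url(
    "JENKINS_HOST", "MY_JOB_NAME", {"PARAMETER0": "VALUE0", "PARAMETER1": "a b&c"}
)
print(plain)
print(with_params)
```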
See the Jenkins API Wiki page (including the comments at the end). You can fill in the gaps using the documentation provided by Jenkins itself; for example, http://JENKINS_HOST/api will give you the URL for creating a job and http://JENKINS_HOST/job/JOBNAME/api will give you the URL to trigger a build.
I highly recommend avoiding the custom creation of job configuration XML files and looking at something like the Job DSL plugin instead. This gives you a nice Groovy-based DSL to create jobs programmatically - much more concise and less error-prone.
Thanks to a GIST - https://gist.github.com/stuart-warren/7786892
Check if job exists
curl -XGET 'http://jenkins/checkJobName?value=yourJobFolderName' --user user.name:YourAPIToken
With folder plugin
curl -s -XPOST 'http://jenkins/job/FolderName/createItem?name=yourJobName' --data-binary @config.xml -H "Content-Type:text/xml" --user user.name:YourAPIToken
Without folder plugin
curl -s -XPOST 'http://jenkins/createItem?name=yourJobName' --data-binary @config.xml -H "Content-Type:text/xml" --user user.name:YourAPIToken
Create folder
curl -XPOST 'http://jenkins/createItem?name=FolderName&mode=com.cloudbees.hudson.plugins.folder.Folder&from=&json=%7B%22name%22%3A%22FolderName%22%2C%22mode%22%3A%22com.cloudbees.hudson.plugins.folder.Folder%22%2C%22from%22%3A%22%22%2C%22Submit%22%3A%22OK%22%7D&Submit=OK' --user user.name:YourAPIToken -H "Content-Type:application/x-www-form-urlencoded"
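The long query string in the folder command is just the form fields plus a json parameter that repeats them as a percent-encoded JSON object. As a sketch of how it could be generated for any folder name (the function name folder_query is hypothetical):

```python
import json
from urllib.parse import urlencode

FOLDER_MODE = "com.cloudbees.hudson.plugins.folder.Folder"


def folder_query(folder_name):
    """Return the percent-encoded query string for creating a folder via createItem."""
    fields = {"name": folder_name, "mode": FOLDER_MODE, "from": "", "Submit": "OK"}
    # The json parameter repeats the same fields as a compact JSON object.
    form_json = json.dumps(fields, separators=(",", ":"))
    return urlencode(
        {
            "name": folder_name,
            "mode": FOLDER_MODE,
            "from": "",
            "json": form_json,
            "Submit": "OK",
        }
    )


print("http://jenkins/createItem?" + folder_query("FolderName"))
```

For FolderName this reproduces the query string used in the curl command above.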
To create a job inside a view, provided the view already exists:
curl -X POST -H "Content-Type:application/xml" -d @build.xml "http://jenkins_host/view/viewName/createItem?name=itemName"
The build.xml file template can be found in the root directory of a job's workspace.
To create a view:
curl -X POST -H "Content-Type:application/xml" -d @view.xml "http://jenkins_host/createView?name=viewName"
The content of the file view.xml could be:
<?xml version="1.0" encoding="UTF-8"?>
<hudson.model.ListView>
  <name>viewName</name>
  <filterExecutors>false</filterExecutors>
  <filterQueue>false</filterQueue>
  <properties class="hudson.model.View$PropertyList"/>
  <jobNames>
    <comparator class="hudson.util.CaseInsensitiveComparator"/>
  </jobNames>
  <jobFilters/>
  <columns>
    <hudson.views.StatusColumn/>
    <hudson.views.WeatherColumn/>
    <hudson.views.JobColumn/>
    <hudson.views.LastSuccessColumn/>
    <hudson.views.LastFailureColumn/>
    <hudson.views.LastDurationColumn/>
    <hudson.views.BuildButtonColumn/>
  </columns>
</hudson.model.ListView>
and to check if a view exists:
curl -X POST -H "Content-Type:application/xml" "http://jenkins_host/checkViewName?value=viewName"
to check if a job exists:
curl -X POST -H "Content-Type:application/xml" "http://jenkins_host/checkJobName?value=jobName"
To create a job:
curl -X POST -H "Content-Type:application/xml" -d "<project><builders/><publishers/><buildWrappers/></project>" -u username:API_TOKEN "http://JENKINS_HOST/createItem?name=AA_TEST_JOB2"
To build a job:
curl -X POST -u username:API_TOKEN http://JENKINS_HOST/job/MY_JOB_NAME/build
In case you need to make the same HTTP calls using the Python requests library instead of curl:
Download a job config:
import requests
# Placeholders: replace with your own Jenkins host, job name, and credentials.
JENKINS_HOST = "my.jenkins.com"
JOB_NAME = "MY_JOB_NAME"
auth = ("username", "api_token")
url = "http://" + JENKINS_HOST + "/job/" + JOB_NAME + "/config.xml"
response = requests.get(url, auth=auth)
with open("config.xml", "wt") as f:
    f.write(response.text)
Create a new job using the same config:
NEW_JOB_NAME = "AA_TEST_JOB3"  # placeholder name for the new job
url = "http://" + JENKINS_HOST + "/createItem?name=" + NEW_JOB_NAME
headers = {"content-type": "text/xml"}
data = response.text  # config.xml content downloaded above
response = requests.post(url, auth=auth, headers=headers, data=data)
Omit the auth parameter when it is not needed.