I'm using the Jenkins REST API to build and schedule jobs.
The problem is that I scheduled one job for the weekend, but it is executed several times (the same job runs every minute).
For the rest of the week the job is executed only once. Is there a GUI option to empty the weekend job list?
You can use the following Groovy script to clean all (or part) of your queue.
This example cancels all queued jobs whose name starts with a specific branch name:
import jenkins.model.*
def branchName = build.environment.get("GIT_BRANCH_NAME")
println "=========before clean the queue ... =="
def q = Jenkins.instance.queue
q.items.each {
    println("${it.task.name}:")
}
q.items.findAll { it.task.name.startsWith(branchName) }.each { q.cancel(it.task) }
println "=========after clean the queue ... =="
q = Jenkins.instance.queue
q.items.each {
    println("${it.task.name}:")
}
We have this inside celery beat scheduler:
"task_name": {
    "task": "utils.tasks.task_name",
    "schedule": crontab(minute="*/05"),
}
This runs every five minutes, but I want task_name to run only once. I cannot use:
"schedule": crontab(minute="1", hour="1")
because task_name may need to run at any time.
I have added a condition to the task so it will not continue after 10 minutes. Do you think it is possible for it to run only once instead of checking the task constantly?
I would say you need to add some kind of locking to your code. Here's an example:
LOCKS = {}

@task(bind=True)
def task_name(self):
    lock_id = self.name
    # run the real work only the first time this task name is seen
    if LOCKS.get(lock_id) is None:
        LOCKS[lock_id] = True
        return your_function()
    logger.debug("You can run this only once")
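The locking idea above can be tried without Celery. The following is a minimal sketch, assuming a single process: `run_once`, `_done` and `_guard` are hypothetical names that are not part of Celery. A module-level structure like this is only visible inside one worker process, so a setup with several worker processes would need a shared store instead.

```python
import threading

_done = set()              # names of tasks that have already run (per process)
_guard = threading.Lock()  # protects _done against concurrent callers


def run_once(name, func):
    """Call func() the first time `name` is seen; return None afterwards."""
    with _guard:
        if name in _done:
            return None
        _done.add(name)
    return func()


calls = []
first = run_once("task_name", lambda: calls.append("ran") or "result")
second = run_once("task_name", lambda: calls.append("ran") or "result")
```

Here the second call is skipped, so `calls` holds a single entry and `second` is `None`.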
For example, I have the following class. How can I prevent execution of the get_entity task if the create_entity task failed?
class MyTaskSequence(TaskSequence):
    @seq_task(1)
    def create_entity(self):
        self.round += 1
        with self.client.post('/entities', json={}, catch_response=True) as resp:
            if resp.status_code != HTTPStatus.CREATED:
                resp.failure()
                # how to stop other tasks for that run?
            self.entity_id = resp.json()['data']['entity_id']

    @seq_task(2)
    def get_entity(self):
        # It is being always executed,
        # but it should not be run if create_entity task failed
        resp = self.client.get(f'/entities/{self.entity_id}')
        ...
I found the TaskSet.interrupt method in the documentation, but it does not allow cancelling the root TaskSet. I tried making a parent TaskSet for my task sequence so that TaskSet.interrupt works:
class MyTaskSet(TaskSet):
    tasks = {MyTaskSequence: 10}
But now I see that all results in the UI are cleared after I call interrupt!
I just need to skip the dependent tasks in this sequence. I need the results.
The easiest way to solve this is to use a single @task with multiple requests inside it. Then, if a request fails, just return after resp.failure().
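The control flow of that suggestion can be sketched without Locust. This is a rough illustration only: `FakeClient` and `FakeResponse` are hypothetical stand-ins for the real HTTP client, and `create_then_get` plays the role of the single task body.

```python
from http import HTTPStatus


class FakeResponse:
    """Hypothetical stand-in for an HTTP response."""
    def __init__(self, status_code, payload=None):
        self.status_code = status_code
        self._payload = payload or {}

    def json(self):
        return self._payload


class FakeClient:
    """Hypothetical client: POST succeeds or fails depending on `healthy`."""
    def __init__(self, healthy):
        self.healthy = healthy
        self.calls = []

    def post(self, path, json=None):
        self.calls.append(("POST", path))
        if self.healthy:
            return FakeResponse(HTTPStatus.CREATED, {"data": {"entity_id": 7}})
        return FakeResponse(HTTPStatus.INTERNAL_SERVER_ERROR)

    def get(self, path):
        self.calls.append(("GET", path))
        return FakeResponse(HTTPStatus.OK)


def create_then_get(client):
    """One task body: the dependent GET only runs if the POST succeeded."""
    resp = client.post("/entities", json={})
    if resp.status_code != HTTPStatus.CREATED:
        return False  # skip the dependent request
    entity_id = resp.json()["data"]["entity_id"]
    client.get(f"/entities/{entity_id}")
    return True


ok_client, bad_client = FakeClient(healthy=True), FakeClient(healthy=False)
ok_result = create_then_get(ok_client)
bad_result = create_then_get(bad_client)
```

With the failing client only the POST is recorded, which is the "skip dependent requests but keep the results" behaviour asked for.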
Might self.interrupt() be what you are looking for?
See https://docs.locust.io/en/latest/writing-a-locustfile.html#interrupting-a-taskset for reference.
Why not use on_start(self), which runs once when a locust is created? It can set a flag that the tasks check before they execute:
class MyTaskSequence(TaskSequence):
    entity_created = False

    def on_start(self):
        self.round += 1
        with self.client.post('/entities', json={}, catch_response=True) as resp:
            if resp.status_code != HTTPStatus.CREATED:
                resp.failure()
                return
            # set the flag only after the entity was really created
            self.entity_created = True
            self.entity_id = resp.json()['data']['entity_id']

    @seq_task(2)
    def get_entity(self):
        if self.entity_created:
            resp = self.client.get(f'/entities/{self.entity_id}')
            ...
We're in the process of writing a .NET Cadence client and ran into an issue while unit testing workflows. When we start a worker, execute a workflow, stop the worker, start it again, and then try and execute another workflow, the first workflow completes, but any workflow after the first hangs during the client.ExecuteWorkflow() call, eventually failing with a START_TO_CLOSE timeout. I replicated this behavior by munging the greetings cadence-samples workflow. See the loop in func main():
package main

import (
	"context"
	"time"

	"go.uber.org/cadence/client"
	"go.uber.org/cadence/worker"
	"go.uber.org/zap"

	"github.com/pborman/uuid"
	"github.com/samarabbas/cadence-samples/cmd/samples/common"
)

// This needs to be done as part of a bootstrap step when the process starts.
// The workers are supposed to be long running.
func startWorkers(h *common.SampleHelper) worker.Worker {
	// Configure worker options.
	workerOptions := worker.Options{
		MetricsScope: h.Scope,
		Logger:       h.Logger,
	}
	return h.StartWorkers(h.Config.DomainName, ApplicationName, workerOptions)
}

func startWorkflow(h *common.SampleHelper) client.WorkflowRun {
	workflowOptions := client.StartWorkflowOptions{
		ID:                              "greetings_" + uuid.New(),
		TaskList:                        ApplicationName,
		ExecutionStartToCloseTimeout:    time.Minute,
		DecisionTaskStartToCloseTimeout: time.Minute,
	}
	return h.StartWorkflow(workflowOptions, SampleGreetingsWorkflow)
}

func main() {
	// setup the SampleHelper
	var h common.SampleHelper
	h.SetupServiceConfig()

	// Loop:
	// - start a worker
	// - start a workflow
	// - block and wait for workflow result
	// - stop the worker
	for i := 0; i < 3; i++ {
		// start the worker
		// execute the workflow
		workflowWorker := startWorkers(&h)
		workflowRun := startWorkflow(&h)

		// create context
		// get workflow result
		var result string
		ctx, cancel := context.WithCancel(context.Background())
		err := workflowRun.Get(ctx, &result)
		if err != nil {
			panic(err)
		}

		// log the result
		h.Logger.Info("Workflow Completed", zap.String("Result", result))

		// stop the worker
		// cancel the context
		workflowWorker.Stop()
		cancel()
	}
}
This is not a blocking issue and will probably not come up in production.
Background:
We (Jeff Lill and I) noticed this issue during unit testing workflows in our .NET Cadence client. When we run our workflow tests individually they all pass, but when we run multiple at a time (sequentially, not in parallel), we see the behavior described above. This is because of the cleanup done in the .NET Cadence client dispose() method called after a test completes (pass or fail). One of the dispose behaviors is to stop workers created during a test. When the next test runs, new workers are created using the same workflow service client, and this is where the issue arises.
I am submitting multiple spark jobs in the following manner -
someCollection.foreach(m => {
..some code
sparkSubmitClass.run(m.name)
.. some code
})
where sparkSubmitClass.run() method basically uses a shell script which has $SPARK_HOME/bin/spark-submit and other related parameters.
The problem is that this code submits all the spark jobs in one go. What I want to achieve is - submit a job, then submit another job only when the earlier one finishes. This is because someCollection is ordered and the next job depends on data created by the previous job(s).
sparkSubmitClass.run() is along the following lines -
def run(appName: String)(implicit executionContext: ExecutionContext) = {
  val command = s"sparkJob.sh $appName"
  val processBuilder = Process(command)
  val pio = new ProcessIO(_ => (),
    stdout => {
      scala.io.Source.fromInputStream(stdout)
        .getLines.foreach(str => log.info(s"spark-submit: Application Name=$appName stdout='${str.replace("'", "\\'")}'"))
    },
    stderr => {
      val lines = scala.io.Source.fromInputStream(stderr).getLines().toBuffer
      lines.foreach(str => log.info(s"spark-submit: Application Name=$appName stderr='${str.replace("'", "\\'")}'"))
      lines.flatMap(parseLineForApplicationUrl).headOption.foreach(appId => appId)
    })
  val process = processBuilder.run(pio)
  val exitVal = process.exitValue() // returns 0 as soon as application is submitted
}
And sparkJob.sh is basically -
MAIN_CLASS="com.SomeClassHavingRDDAndHiveOperations"
APPNAME=$1
JAVA_OPTS="-XX:MaxDirectMemorySize=$WORKER_DIRECT_MEM_SIZE -XX:+HeapDumpOnOutOfMemoryError -Djava.net.preferIPv4Stack=true"
SPARK_HOME="/usr/lib/spark"
cmd='$SPARK_HOME/bin/spark-submit --class $MAIN_CLASS
--name ${APPNAME}
--conf "spark.yarn.submit.waitAppCompletion=false"
--conf "spark.io.compression.codec=snappy"
--conf "spark.kryo.unsafe=true"
--conf "spark.kryoserializer.buffer.max=1024m"
--conf "spark.serializer=org.apache.spark.serializer.KryoSerializer"
--driver-java-options "-XX:MaxMetaspaceSize=$WORKER_PERM_SIZE $JAVA_OPTS"
$appdir/SomeJar.jar $APPNAME'
eval $cmd
Any thoughts on how to build this kind of ordering?
Instead of writing bash scripts to call each job, and wasting time on the I/O read/write phase, why not loop over the jobs in the order you need inside your code? Here are some hints to follow:
First, make sure you have an interface and that every class you want to process in order implements it, so that each job has a common method to start it. (In this example the method is process and the interface is JobInterface.)
Then write the fully qualified class names into one file, in the order you want. Let's say that file is orderedJobs (it does not need an extension):
package1.Class1
package1.Class2
package2.Class3
....
Read and parse that file. I am assuming it is in the resources folder; you can filter out the lines you don't want:
val classCall = Source.fromInputStream(getClass.getResourceAsStream(<locationOforderedJobs>)).getLines().filter(!_.startsWith("#"))
Loop over the class names with foreach and call the common method (process) on each:
classCall.foreach(job => {
  processJob(job).process(<you can pass arguments>)
})
processJob is a function that instantiates each class:
def processJob(name: String): JobInterface = {
  val action = Class.forName("<package path from source root>" + name).newInstance()
  action.asInstanceOf[JobInterface]
}
This way you reduce the time wasted on I/O, improve Spark processing efficiency by keeping data that other jobs need in memory, and shorten the overall processing time.
I hope it helps.
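The ordering scheme described in this answer (a common interface, an ordered list of class names, one loop that instantiates and runs each job) can be sketched in a few lines of Python. Everything here is illustrative: `LoadJob`, `SumJob`, `REGISTRY` and `run_in_order` are hypothetical names, and the job list is an inline string rather than a resource file.

```python
class JobInterface:
    """Common entry point every job implements (mirrors the answer's interface)."""
    def process(self, shared):
        raise NotImplementedError


class LoadJob(JobInterface):      # hypothetical job
    def process(self, shared):
        shared["rows"] = [1, 2, 3]


class SumJob(JobInterface):       # hypothetical job that depends on LoadJob
    def process(self, shared):
        shared["total"] = sum(shared["rows"])


# Maps the names used in the listing to the classes that implement them.
REGISTRY = {cls.__name__: cls for cls in (LoadJob, SumJob)}

ORDERED_JOBS = """\
# lines starting with '#' are skipped
LoadJob
SumJob
"""


def run_in_order(listing, shared):
    """Instantiate and run each named job, one after another."""
    for line in listing.splitlines():
        name = line.strip()
        if not name or name.startswith("#"):
            continue
        REGISTRY[name]().process(shared)
    return shared


state = run_in_order(ORDERED_JOBS, {})
```

Because each job runs to completion before the next one starts, `SumJob` can rely on the data that `LoadJob` left in the shared dictionary.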
I am running a Spark application (version 1.6.0) on a Hadoop cluster with Yarn (version 2.6.0) in client mode. I have a piece of code that runs a long computation, and I want to kill it if it takes too long (and then run some other function instead).
Here is an example:
val conf = new SparkConf().setAppName("TIMEOUT_TEST")
val sc = new SparkContext(conf)
val lst = List(1,2,3)
// setting up an infinite action
val future = sc.parallelize(lst).map(while (true) _).collectAsync()
try {
  Await.result(future, Duration(30, TimeUnit.SECONDS))
  println("success!")
} catch {
  case _:Throwable =>
    future.cancel()
    println("timeout")
}
// sleep for 1 hour to allow inspecting the application in yarn
Thread.sleep(60*60*1000)
sc.stop()
The timeout is set for 30 seconds, but of course the computation is infinite, and so Awaiting on the result of the future will throw an Exception, which will be caught and then the future will be canceled and the backup function will execute.
This all works perfectly well, except that the canceled job doesn't terminate completely: when looking at the web UI for the application, the job is marked as failed, but I can see there are still running tasks inside.
The same thing happens when I use SparkContext.cancelAllJobs or SparkContext.cancelJobGroup. The problem is that even though I manage to get on with my program, the running tasks of the canceled job are still hogging valuable resources (which will eventually slow me down to a near stop).
To sum things up: How do I kill a Spark job in a way that will also terminate all running tasks of that job? (as opposed to what happens now, which is stopping the job from running new tasks, but letting the currently running tasks finish)
UPDATE:
After a long time ignoring this problem, we found a messy but efficient little workaround. Instead of trying to kill the appropriate Spark Job/Stage from within the Spark application, we simply logged the stage ID of all active stages when the timeout occurred, and issued an HTTP GET request to the URL presented by the Spark Web UI used for killing said stages.
I don't know if this answers your question.
My need was to kill jobs that hang for too long (my jobs extract data from Oracle tables, but for some unknown reason the connection occasionally hangs forever).
After some study, I came to this solution:
val MAX_JOB_SECONDS = 100
val statusTracker = sc.statusTracker;
val sparkListener = new SparkListener()
{
override def onJobStart(jobStart : SparkListenerJobStart)
{
val jobId = jobStart.jobId
val f = Future
{
var c = MAX_JOB_SECONDS;
var mustCancel = false;
var running = true;
while(!mustCancel && running)
{
Thread.sleep(1000);
c = c - 1;
mustCancel = c <= 0;
val jobInfo = statusTracker.getJobInfo(jobId);
// getJobInfo returns an Option, so test for presence instead of comparing with null
running = jobInfo.exists(_.status() == JobExecutionStatus.RUNNING)
}
if(mustCancel)
{
sc.cancelJob(jobId)
}
}
}
}
sc.addSparkListener(sparkListener)
try
{
val df = spark.sql("SELECT * FROM VERY_BIG_TABLE") //just an example of long-running-job
println(df.count)
}
catch
{
case exc: org.apache.spark.SparkException =>
{
if(exc.getMessage.contains("cancelled"))
throw new Exception("Job forcibly cancelled")
else
throw exc
}
case ex : Throwable =>
{
println(s"Another exception: $ex")
}
}
finally
{
sc.removeSparkListener(sparkListener)
}
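The watchdog logic in the listener above (poll the job status at a fixed interval, cancel once a time budget is used up) can be sketched independently of Spark. This is only an illustration under stated assumptions: `FakeJob` and `watch` are hypothetical, and the status strings merely imitate a job status enum.

```python
import time


class FakeJob:
    """Hypothetical job handle with a status and a cancel() method."""
    def __init__(self, finishes_after_polls=None):
        self.status = "RUNNING"
        self._polls_left = finishes_after_polls

    def poll(self):
        # Pretend the job completes after a fixed number of status checks.
        if self._polls_left is not None and self.status == "RUNNING":
            self._polls_left -= 1
            if self._polls_left <= 0:
                self.status = "SUCCEEDED"
        return self.status

    def cancel(self):
        self.status = "CANCELLED"


def watch(job, max_seconds, poll_interval=0.01):
    """Cancel `job` if it is still running after `max_seconds`; return True if cancelled."""
    deadline = time.monotonic() + max_seconds
    while job.poll() == "RUNNING":
        if time.monotonic() >= deadline:
            job.cancel()
            return True
        time.sleep(poll_interval)
    return False


hung = FakeJob()                          # never finishes on its own
quick = FakeJob(finishes_after_polls=2)   # finishes after two polls
hung_cancelled = watch(hung, max_seconds=0.05)
quick_cancelled = watch(quick, max_seconds=5)
```

The watcher stops polling as soon as the job leaves the running state, so a job that finishes in time is never cancelled.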
For the sake of future visitors: Spark has shipped a task reaper since 2.0.3 (enabled with spark.task.reaper.enabled), which addresses this scenario (more or less) and is a built-in solution.
Note that it can eventually kill an executor if the task does not respond.
Moreover, some built-in Spark sources of data have been refactored to be more responsive to spark:
For the 1.6.0 version, Zohar's solution is a "messy but efficient" one.
According to setJobGroup:
"If interruptOnCancel is set to true for the job group, then job cancellation will result in Thread.interrupt() being called on the job's executor threads."
So the anonymous function in your map must be interruptible, like this:
val future = sc.parallelize(lst).map(while (!Thread.interrupted) _).collectAsync()
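The same cooperative-cancellation idea can be sketched in plain Python, assuming nothing about Spark: the long-running loop checks a flag on every pass, so a cancel request only takes effect because the loop looks for it. The names `stop`, `long_running` and `worker` are hypothetical.

```python
import threading
import time

stop = threading.Event()
iterations = []


def long_running():
    """Loop until asked to stop, checking the flag on every pass."""
    while not stop.is_set():
        iterations.append(1)
        time.sleep(0.01)


worker = threading.Thread(target=long_running)
worker.start()
worker.join(timeout=0.1)   # wait up to the "timeout" for normal completion
timed_out = worker.is_alive()
stop.set()                 # request cancellation
worker.join(timeout=2)     # the loop notices the flag and exits
```

A loop that never checked the flag would keep running after `stop.set()`, which mirrors the tasks that stay alive after a job is cancelled.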