HTCondor job submission tags

I want to run different batches of jobs on our HTCondor pool. Let's say 10 jobs of Type1, 20 jobs of Type2 and so on. Each of these job types should get new jobs when the current jobs are finished.
With just one type I use a simple query to check whether all jobs are finished or whether the time limit for the whole batch has passed. If either condition is met, the next iteration of x jobs is submitted to the cluster.
This is done by a small function (written in Lua, which is not really important for the question):
function WaitForSims(CheckupDelay)
  -- keep polling while condor_q still lists any jobs for the current user
  while io.popen([[condor_q -format "%d\n" clusterid]]):read('*all'):len() ~= 0 do
    os.execute("echo Checkup timestamp: " .. os.date("%x %X"))
    -- wait CheckupDelay seconds before the next check
    os.execute(string.format("timeout %d 1>nul", CheckupDelay))
  end
end
Is there a possibility to separate the jobs of Type1, Type2 and Type3 and check them independently? Currently the query checks all jobs belonging to my current user.
Adding a tag or something similar to the jobs would be ideal, as I could then simply change the checkup call. I couldn't find anything in the documentation that is easy to add. I could remember the JobIDs, but then I'd have to store them, which adds more complexity.

Linked Answer
The solution can be found in another answer; I didn't find where it is described in the documentation, though.
In the job.sub file add:
+YourCustomVarName = 1
+YourCustomStringName = "String"
For checking against it use:
condor_q -constraint 'YourCustomVarName == 1' -f "%s" JobStatus
or
condor_q -constraint "YourCustomStringName == \"String\"" -f "%s" JobStatus
(handling of the quotation marks may vary)
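For example, a minimal polling sketch in shell (hypothetical: it assumes each submit file adds a custom attribute such as +JobType = "Type1", and the helper name wait_for_type is made up for illustration):
# poll until no queued job carrying the given custom JobType attribute is left (sketch, untested)
wait_for_type() {
    jobtype="$1"
    while [ -n "$(condor_q -constraint "JobType == \"$jobtype\"" -format "%d\n" ClusterId)" ]; do
        echo "Checkup timestamp: $(date)"
        sleep 60
    done
}
wait_for_type Type1    # wait for the Type1 batch before submitting its next iteration
Each job type can then be checked independently by calling the function with its own tag.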

Related

Tekton: How to delete successful pipelineruns?

My desired Tekton use case is simple:
successful pipelineruns should be removed after x days
failed pipelineruns shouldn't be removed automatically.
I plan to do the cleanup in an initial cleanup-task. That seems better to me than annotation- or cronjob-approaches. As long as nothing new is built, nothing has to be deleted.
Direct approaches:
Failed: tkn delete doesn't seem very helpful because it doesn't discriminate between successful and failed runs.
Failed: oc delete --field-selector ... doesn't support the well-hidden but highly expressive field status.conditions[0].type==Succeeded
Indirect approaches (first filtering a list of pod names and then deleting them - not elegant at all):
Failed: Filtering the output with -o=jsonpath... seems costly, and the condition array seems to break the statement, so that (for whatever reason) everything is returned... not viable.
My last attempt is tkn pipelineruns list --show-managed-fields, parsing the output with sed/awk... which is gross... but at least it does what I want, and quite efficiently at that. It might turn out to be brittle, though, if the format of the output changes in future releases...
Do you have any better more elegant approaches?
Thanks a lot!
Until a better solution comes along, I'll post my current solution (and its drawbacks):
Our cleanup-task is now built around the following solution, evaluating the table returned by tkn pipelineruns list:
tkn pipelineruns list --show-managed-fields -n e-dodo-tmgr --label tekton.dev/pipeline=deploy-pipeline | awk '$6~/Succeeded/ && $3~/day|week|month/ {print $1}'
Advantages:
It does what it should without extensive calls or additional calculation.
Disadvantages:
Time granularity is limited to "older than an hour / a day / a week ...". But that's acceptable, since only successful builds are concerned.
I guess the design is quite brittle: if a change in the tkn client alters the table format, awk will pick the wrong columns or run into similar pattern problems.
All in all I hope the solution will hold until there are some more helpful client-features that make the desired info directly filterable. Actually I'd hope for something like tkn pipelineruns delete --state successful --period P1D.
The notation for the time period is from ISO 8601.
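Building on that, here is a sketch of the full cleanup step (assumptions: GNU xargs for the -r flag, and a tkn client whose pipelinerun delete supports -f/--force to skip the confirmation prompt):
# feed the filtered run names into tkn for deletion (same namespace and label as above)
tkn pipelineruns list --show-managed-fields -n e-dodo-tmgr --label tekton.dev/pipeline=deploy-pipeline \
  | awk '$6~/Succeeded/ && $3~/day|week|month/ {print $1}' \
  | xargs -r tkn pipelinerun delete -n e-dodo-tmgr -f
The -r flag keeps xargs from running the delete when nothing matches; drop it on non-GNU xargs.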

Rundeck [passing parameters between jobs]

I have a job which calls another Rundeck job, and I want to pass parameters between the jobs. I can only pass options from the 1st job to the 2nd job; I want the results (output) of the 1st job to be passed to the 2nd job as an argument.
Example: 1st job content.
set -x;
sql_file=/apps/$env/test${buildnumber}.sql;
echo $sql_file
I want to call the 2nd job and pass the sql_file location as a variable.
I can reference the 2nd job and pass the 1st job's options as arguments, but I cannot find a way to pass the output of the 1st job as an argument to the 2nd job.
Rundeck 2.9 will support this ability. It is not released yet, but you can try the beta version here http://rundeck.org/news/2017/06/20/rundeck-2.9.0-beta1.html
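For reference, the mechanism introduced with log filters works roughly like this (a sketch, not the exact method from the linked release notes: it assumes a "Key Value Data" log filter is attached to the first job's step and configured to capture lines of the form RUNDECK:DATA:key=value; the exact filter name and variable syntax may differ between Rundeck versions):
set -x;
sql_file=/apps/$env/test${buildnumber}.sql;
# print the value in the pattern the log filter captures (assumption: filter configured on this step)
echo "RUNDECK:DATA:sql_file=${sql_file}"
The captured value can then be referenced in a later step or in the job-reference arguments as ${data.sql_file} (again, assuming such a log filter is in place).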

When and how does Optaplanner evaluate rules?

We are just getting our heads around using OptaPlanner for a project. We have a very simple solution set up as follows:
Job -> PlanningEntity, PlanningVariable=Resource from resourcesList
Resource -> POJO
Solution
- List<Job> PlanningEntityCollectionProperty
- List<Resource> ProblemFactCollectionProperty, resourcesList
We have set up some rules for testing. The first rule simply says: don't assign more than three Jobs to a Resource:
rule "noMoreThan3JobsPerResource"
when
$resource : Resource()
$totalJobsOnResource : Number(intValue > 3) from accumulate (
Job(
resource == $resource,
$count : 1),
sum($count)
)
then
scoreHolder.addHardConstraintMatch(kcontext, 3 - $totalJobsOnResource.intValue());
end
What we want to understand is HOW and WHEN the drools rules are evaluated. For example, if we add these two rules:
rule "logWhenResource"
when
$resource: Resource()
then
System.out.println("RESOURCE encountered");
end
rule "logWhenJob"
when
$job : Job()
then
System.out.println("JOB encountered");
end
We get "JOB encountered" in the log, but never "RESOURCE encountered". And yet, our first rule has $resource : Resource() in the when? Does optaplanner fire a rule when a job is placed (in our example)? We are just a bit unclear on why logWhenResource doesn't fire, but noMoreThan3JobsPerResource does (when they both try and 'match' a Resource object? Is Resource the resource that a job has been moved to?
Thanks in advance
After some discussion on IRC (and a lot of patient help from Geoffrey!), hopefully the following will serve as a help to other people.
1. Turn Logging on
First off, make sure you turn on trace logging for the OptaPlanner package (and maybe turn it off for Drools). This really helps, as it shows exactly when OptaPlanner triggers score calculations. It also shows the candidate score calculation:
Move index (0), score (-3init/-2hard/0medium/0soft), move (Job 7 {null -> Resource 1}).
in addition to the final step selection:
CH step (6), time spent (110), score (-3init/-2hard/0medium/0soft), selected move count (2), picked move (Job 7 {null -> Resource 1}).
You can also log in your "then" part of Rules, by doing something like:
LoggerFactory.getLogger("org.optaplanner").debug("...);
This makes sure it gets logged in the right order, since logging vs. println can be asynchronous and lines may not appear in ascending time order.
2. Understand when Optaplanner calculates scores, and when it doesn't
This is a pretty useful summary of the 'event loop' of optaplanner:
doMove()
fireAllRules()
undoMove()
doMove()
fireAllRules()
undoMove()
doStep()
doMove()
fireAllRules()
undoMove() ...
etc. One thing that is interesting, as per our chat on IRC, is the following:
"Notice that it doesn't do fireAllRules() after an undoMove or after doStep() because it can predict the score". Neat.
3. FULL_ASSERT
To check whether you are corrupting the score, turn on FULL_ASSERT.
<environmentMode>FULL_ASSERT</environmentMode>
This is useful to determine if your score calculation isn't right (ours wasn't).
Turn on TRACE logging and you can see that it fires all the rules (only those affected by changes since the last calculation, because the score calculation is incremental) every time there is a move line in that log.

MVS OS-390 - How do I Capture Job Information from CA-JOBTRAC programmatically

I am using REXX to invoke JOBTRAC programmatically, which works; however, I am unable to pass JOBNAME arguments using this approach. Can this be done using REXX?
The idea is to find the history of the job run using the program jobtrac. We use jobtrac's schedule to find the history of when job runs happened. We invoke jobtrac using
'TSO JOBTRAC' and supply the history command 'H XXXXXX' on the command line (XXXXXX = jobname).
I was thinking of routing the JOBTRAC info to a flat file and parsing it so that I can do some real-time reporting during the job run. The above problem is also linked to the following scenario:
If I give the DSLIST command 'DSLIST A.B.C.*' in the ISPF panel
It gives the series of datasets ...
A.B.C.A,
A.B.C.D
A.B.C.E
When I give
"SAVE ORANGE"
it stores this list under
MYUSERID.ORANGE.DATASETS.
I know this can be automated programmatically and I have seen that done, but I don't have the code base to do it right now. This is very similar to the JOBTRAC issue I have.
Here is some REXX code to help with understanding. I know this code is wrong... we cannot use outtrap for this, as it is used to capture console output.
say 'No. of month end jobs considered for history :' jobnames.0
if jobnames.0 > 0 then do
  do i = 1 to jobnames.0
    say jobnames.i
    jobname = Word(jobnames.i,1);
    say 'jobname under consideration is ' jobname;
    tsocmd = "JOBTRAC;ADDLOC=000;H " || strip(jobname);
    say 'tso command is ' tsocmd;
    y = outtrap(jobdetails.)
    Address TSO "'tsocmd'"    /* ------> wrong... I believe I have to use ispexec */
    say 'job details are ' jobdetails.6;
  end;

Can a list of strings be supplied to Capistrano tasks?

I have a task whose command in 'run' is the same except for a single value. This value would come from a list of potential values. What I would like to do is use this list of values to define the tasks, and then use each value in the command defined in 'run'. The point is that it would be great to define the tasks in such a way that I don't have to repeat nearly identical task definitions for each value.
For example: I want a task that will get the status of a single program from a list of programs that I have defined in an array. I would like to define the task to be something like this:
set programs = %w["postfix", "nginx", "pgpool"]
programs.each do |program|
  desc "#{program} status"
  task :#{program} do
    run "/etc/init.d/#{program} status"
  end
end
This obviously doesn't work, but hopefully it shows what I am attempting here.
Thoughts?
Well, I answered my own question... with a little trial and error. I also did the same thing with namespaces, so the control of services is nice and elegant. It works quite nicely!
set :programs, %w[postfix nginx pgpool]
set :init_commands, %w[status start stop]

# init.d service control
init_commands.each do |init_command|
  namespace :"#{init_command}" do
    programs.each do |program|
      desc "#{program} #{init_command}"
      task :"#{program}" do
        run "/etc/init.d/#{program} #{init_command}"
      end
    end
  end
end
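With these definitions, each command/program pair becomes its own task, invoked as namespace:task, for example:
cap status:nginx
cap start:postfix
cap stop:pgpool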