Limit the number of single processes in Nextflow workflows

I have the following simple workflow:
workflow {
    Channel.fromPath(params.file_list)
        .splitText(){it.trim()}
        .set { file_list }
    data = GetFromHPSS(file_list)
    data_pairs = CoupleDETXToFile(data, file(params.detx_path))
    SingleDUTimeResFit(data_pairs)
}
Here, file_list is a list of paths on a tape-drive system. GetFromHPSS is the process that retrieves files from the tape system, and I need to limit its parallel executions to a fairly low number.
Currently, I am using
executor {
    queueSize = 100
}
in the configuration file, but there are two problems:
1- It limits the overall maximum number of parallel jobs, even though I could run thousands of SingleDUTimeResFit processes in parallel.
2- It always waits until everything from GetFromHPSS has been processed instead of continuing with the subsequent processes.
Here is an example:
N E X T F L O W ~ version 21.04.3
Launching `workflows/singledu_timeresfit.nf` [wise_galileo] - revision: 8084ac1482
executor > sge (502)
[13/ca3e8a] process > GetFromHPSS (426) [ 18%] 402 of 22840
[- ] process > CoupleDETXToFile [ 0%] 0 of 402
[- ] process > SingleDUTimeResFit -
Is there a way to limit GetFromHPSS to a specific number of parallel executions and let the remaining processes run with another queue-limit set?
EDIT: This is one of my best tries I guess, but it does not accept the configuration:
process {
    executor {
        queueSize = 100
        submitRateLimit = "10sec"
    }
    withName: GetFromHPSS {
        executor.queueSize = 10
    }
}
With this process top-level configuration, I get:
N E X T F L O W ~ version 21.04.3
Launching `workflows/singledu_timeresfit.nf` [confident_pasteur] - revision: 8084ac1482
Unknown config attribute `process.withName:GetFromHPSS` -- check config file: /sps/km3net/users/tgal/dev/PhD/workflows/nextflow.config

I think what you're looking for here is the maxForks directive, which can be applied to just the 'GetFromHPSS' process without the need to change the executor's queueSize:
process 'GetFromHPSS' {
    maxForks 1
    """
    <your script here>
    """
}
You could even parameterize it, if you think it makes sense:
params.hpss_forks = 5
process 'GetFromHPSS' {
    maxForks params.hpss_forks
    """
    <your script here>
    """
}
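The idea behind a per-process cap can be illustrated outside Nextflow. The following is only an analogy in plain Python with asyncio, not Nextflow code: a semaphore plays the role of maxForks for the retrieval step, while the downstream step has no limit of its own, so finished items move on without waiting for the whole retrieval stage. The names run_pipeline, fetch, fit and fetch_limit are made up for this sketch.

```python
import asyncio


async def run_pipeline(n_items, fetch_limit):
    # The semaphore plays the role of maxForks for the retrieval step only.
    fetch_slots = asyncio.Semaphore(fetch_limit)
    active = 0
    peak = 0

    async def fetch(item):
        nonlocal active, peak
        async with fetch_slots:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stand-in for the tape retrieval
            active -= 1
        return item

    async def fit(item):
        # No semaphore here: the downstream step is not capped.
        await asyncio.sleep(0.01)
        return item * 2

    async def handle(item):
        # Each item continues downstream as soon as its own fetch is done.
        return await fit(await fetch(item))

    results = await asyncio.gather(*(handle(i) for i in range(n_items)))
    return results, peak


results, peak = asyncio.run(run_pipeline(n_items=20, fetch_limit=3))
print("peak concurrent fetches:", peak)
```

The recorded peak never exceeds fetch_limit, which is the behaviour the question asks for from GetFromHPSS.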


Add a constraint for a renewable resource

I am new to these optimisation problems. I just found the or-tools library and saw that cp_model can solve problems that are close to mine.
I have printers and some tasks that I want to schedule so that production finishes as early as possible. The tasks use machine time and raw material, which I must refill when a coil runs out. For the moment, I don't consider changing a plastic coil before all of its material has been used.
Here is some information about my situation:
1- The printers are all the same; they can do every task with the same efficiency.
2- A printer can only print one task at a time.
3- A printer cannot start without a human around, so tasks can start only at certain hours (in the code below, from 0 AM to 10 AM).
4- A task can finish at any time.
5- If a printer has no more material, the coil needs to be changed, which can happen only during opening hours.
6- If a printer has no more material, the task is paused until new material is loaded.
7- I assume an unlimited supply of material coils.
Thanks to examples and some searching in the documentation, I have been able to solve all the issues that are not related to material. I have been able to set a maximum quantity per machine, but that is not my issue.
I don't understand how I can pause and resume my intervals (for the moment I set the duration to a fixed value).
from ortools.sat.python import cp_model
from ortools.util.python import sorted_interval_list
import random
Domain = sorted_interval_list.Domain
def main():
    random.seed(0)
    nb_jobs = 10
    nb_machine = 2
    horizon = 30000000
    job_list = [] #Job : (id,time,quantity)
    listOfEnds = []
    quantityPerMachine = []
    maxQuantity = 5
    #create the jobs
    for i in range(nb_jobs):
        time = random.randrange(1,24)
        quantity = random.randrange(1,4)
        job_list.append([i,time,quantity])
    print([job[1] for job in job_list])
    print("total time to print = ",horizon)
    print("quantity")
    print([job[2] for job in job_list])
    print("total quantity = ",sum([job[2] for job in job_list]))
    model = cp_model.CpModel()
    makespan = model.NewIntVar(0, horizon, 'makespan')
    machineForJob = {}
    #boolean variable for each combination of machine and job. if True then machine works on this job
    for machine in range(nb_machine):
        for job in job_list:
            j = job[0]
            machineForJob[(machine,j)]=(model.NewBoolVar(f'x[{machine},{j}]'))
    #for each job, apply sum()==1 to ensure one machine works on each job only
    for j in range(nb_jobs):
        model.Add(sum([machineForJob[(machine,j)] for machine in range(nb_machine)])==1)
    #set the affectation of the jobs
    time_intervals={}
    starts = {}
    ends = {}
    #Time domain represents working hours when someone can start the task
    timeDomain = []#[[0],[1],[2],[3],[4],[5],[6],[7],[8],[9],[10],[24],[25],[26]]
    for i in range(20):
        for j in range(10):
            t= [j +i*24]
            timeDomain.append(t)
    for machine in range(nb_machine):
        time_intervals[machine] = []
        for job in job_list:
            j = job[0]
            duration = job[1]
            starts[(machine,j)] = model.NewIntVarFromDomain(Domain.FromIntervals(timeDomain),f'start {machine},{j}')
            ends[(machine,j)] = (model.NewIntVar(0, horizon, f'end {machine},{j}'))
            time_intervals[machine].append(model.NewOptionalIntervalVar(starts[(machine,j)], duration, ends[(machine,j)],
                                                                        machineForJob[(machine,j)],f'interval {machine},{j} '))
    #time should not overlap, quantity of raw material is limited,
    for machine in range(nb_machine):
        #model.Add(quantityPerMachine[machine] <= maxQuantity) Not working as expected as the raw material cannot be refilled
        model.AddNoOverlap(time_intervals[machine])
    #calculate time per machine
    time_per_machine = []
    for machine in range(nb_machine):
        q = 0
        s = 0
        for job in job_list:
            s+= job[1]*machineForJob[(machine,job[0])]
            listOfEnds.append(ends[(machine,job[0])])
            q+= job[2]*machineForJob[(machine,job[0])]
        time_per_machine.append(s)
        quantityPerMachine.append(q)
    #Goal is to finish all tasks earliest
    model.AddMaxEquality(makespan,listOfEnds)
    model.Minimize(makespan)
    solver = cp_model.CpSolver()
    # Sets a time limit of 600 seconds.
    solver.parameters.max_time_in_seconds = 600.0
    #Solve and prints
    status = solver.Solve(model)
    if status == cp_model.OPTIMAL or status == cp_model.FEASIBLE:
        print("optimal =",status == cp_model.OPTIMAL )
        print(f'Total cost = {solver.ObjectiveValue()}')
        for i in range(nb_machine):
            for j in range(nb_jobs):
                if solver.BooleanValue(machineForJob[(i,j)]):
                    print(
                        f'Machine {i} assigned to Job {j} Time = {job_list[j][1]},Quantity = {job_list[j][2]}')
                    print(f"[{solver.Value(starts[(i,j)])} ,{solver.Value(ends[(i,j)])}]")
    else:
        print('No solution found.')
    # Statistics.
    print('\nStatistics')
    print(' - conflicts: %i' % solver.NumConflicts())
    print(' - branches : %i' % solver.NumBranches())
    print(' - wall time: %f s' % solver.WallTime())
if __name__ == '__main__':
    main()
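Rules 3, 5 and 6 can be worked through by hand before trying to encode them in the model. The sketch below is plain Python, not CP-SAT, and does not answer how to express the pause with interval variables. It assumes time is counted in hours, that staff are present during hours 0 to 9 of each 24-hour day (as in timeDomain above), that the remaining coil is expressed as hours of printing left, and that one refill is enough to finish the job. The names next_opening, finish_time and hours_until_empty are hypothetical.

```python
DAY = 24         # hours per day
OPEN_HOURS = 10  # staff are present during hours 0..9 of each day (assumption from timeDomain)


def next_opening(t):
    """Return the earliest hour >= t at which someone is present."""
    hour = t % DAY
    if hour < OPEN_HOURS:
        return t
    return t + (DAY - hour)  # wait until hour 0 of the next day


def finish_time(start, duration, hours_until_empty):
    """End time of a job that pauses when the coil empties and resumes after a refill.

    Assumes a single refill is enough to finish the job.
    """
    if duration <= hours_until_empty:
        return start + duration  # the coil lasts for the whole job
    empty_at = start + hours_until_empty
    resume_at = next_opening(empty_at)  # rule 5: refill only during opening hours
    return resume_at + (duration - hours_until_empty)


print(finish_time(start=8, duration=6, hours_until_empty=4))
```

With these numbers the coil empties at hour 12, outside opening hours, so the job resumes at hour 24 and the remaining 2 hours end at hour 26.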

How to pass the number of total users to simulate and spawn rate in the locust script itself

How do I set the total number of users to simulate and the spawn rate as variables in the script itself, instead of entering them in the Web UI when I run the locust file?
import time
from locust import HttpUser, between, task
class QuickstartUser(HttpUser):
    wait_time = between(1, 2.5)
    users = 10
    spawn_rate = 1
    @task
    def on_start(self):
        filenumber="ABC"
        # Get file info
        response = self.client.get(f"/files/" + filenumber)
        json_var = response.json()
        print("response Json: ", json_var)
        time.sleep(1)
You could probably do it by accessing the Runner in code, but it would be much easier if you used a Load Shape.
from locust import LoadTestShape
class MyCustomShape(LoadTestShape):
    time_limit = 600
    spawn_rate = 20
    def tick(self):
        run_time = self.get_run_time()
        if run_time < self.time_limit:
            # User count rounded to nearest hundred.
            user_count = round(run_time, -2)
            # spawn_rate is a class attribute, so it must be read through self
            return (user_count, self.spawn_rate)
        return None
tick is called automatically; you just have to return a tuple of the user count and spawn rate you want. You can do whatever work you want to calculate what the users and rate should be. There are more examples in the GitHub repo.
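As a rough illustration of the kind of calculation tick could delegate to, here is a minimal sketch in plain Python. It deliberately does not import locust so that it runs on its own; the STAGES table and the pick_stage helper are hypothetical names, not part of Locust. Inside a LoadTestShape subclass, tick would then simply return pick_stage(self.get_run_time()).

```python
# Hypothetical stage table: each stage applies until `until` seconds of run time.
STAGES = [
    {"until": 60, "users": 10, "spawn_rate": 1},
    {"until": 180, "users": 50, "spawn_rate": 5},
]


def pick_stage(run_time, stages=STAGES):
    """Return (user_count, spawn_rate) for the current run time, or None to stop."""
    for stage in stages:
        if run_time < stage["until"]:
            return (stage["users"], stage["spawn_rate"])
    return None  # same convention as tick: None ends the test


print(pick_stage(30))
print(pick_stage(200))
```

Keeping the numbers in a table means the user count and spawn rate live in the script, which is what the question asks for.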

Simpy: How can I represent failures in a train subway simulation?

New python user here and first post on this great website. I haven't been able to find an answer to my question so hopefully it is unique.
Using simpy I am trying to create a train subway/metro simulation with failures and repairs periodically built into the system. These failures happen to the train but also to signals on sections of track and on platforms. I have read and applied the official Machine Shop example (which you can see a resemblance to in the attached code) and have thus managed to model random failures and repairs to the train by interrupting its 'journey time'.
However I have not figured out how to model failures of signals on the routes which the trains follow. I am currently just specifying a time for a trip from A to B, which does get interrupted but only due to train failure.
Is it possible to define each trip as its own process i.e. a separate process for sections A_to_B and B_to_C, and separate platforms as pA, pB and pC. Each one with a single resource (to allow only one train on it at a time) and to incorporate random failures and repairs for these section and platform processes? I would also need to perhaps have several sections between two platforms, any of which could experience a failure.
Any help would be greatly appreciated.
Here's my code so far:
import random
import simpy
import numpy
RANDOM_SEED = 1234
T_MEAN_A = 240.0 # mean journey time
T_MEAN_EXPO_A = 1/T_MEAN_A # for exponential distribution
T_MEAN_B = 240.0 # mean journey time
T_MEAN_EXPO_B = 1/T_MEAN_B # for exponential distribution
DWELL_TIME = 30.0 # amount of time train sits at platform for passengers
DWELL_TIME_EXPO = 1/DWELL_TIME
MTTF = 3600.0 # mean time to failure (seconds)
TTF_MEAN = 1/MTTF # for exponential distribution
REPAIR_TIME = 240.0
REPAIR_TIME_EXPO = 1/REPAIR_TIME
NUM_TRAINS = 1
SIM_TIME_DAYS = 100
SIM_TIME = 3600 * 18 * SIM_TIME_DAYS
SIM_TIME_HOURS = SIM_TIME/3600
# Defining the times for processes
def A_B(): # returns processing time for journey A to B
    return random.expovariate(T_MEAN_EXPO_A) + random.expovariate(DWELL_TIME_EXPO)
def B_C(): # returns processing time for journey B to C
    return random.expovariate(T_MEAN_EXPO_B) + random.expovariate(DWELL_TIME_EXPO)
def time_to_failure(): # returns time until next failure
    return random.expovariate(TTF_MEAN)
# Defining the train
class Train(object):
    def __init__(self, env, name, repair):
        self.env = env
        self.name = name
        self.trips_complete = 0
        self.broken = False
        # Start "travelling" and "break_train" processes for the train
        self.process = env.process(self.running(repair))
        env.process(self.break_train())
    def running(self, repair):
        while True:
            # start trip A_B
            done_in = A_B()
            while done_in:
                try:
                    # going on the trip
                    start = self.env.now
                    yield self.env.timeout(done_in)
                    done_in = 0 # Set to 0 to exit while loop
                except simpy.Interrupt:
                    self.broken = True
                    done_in -= self.env.now - start # How much time left?
                    with repair.request(priority = 1) as req:
                        yield req
                        yield self.env.timeout(random.expovariate(REPAIR_TIME_EXPO))
                    self.broken = False
            # Trip is finished
            self.trips_complete += 1
            # start trip B_C
            done_in = B_C()
            while done_in:
                try:
                    # going on the trip
                    start = self.env.now
                    yield self.env.timeout(done_in)
                    done_in = 0 # Set to 0 to exit while loop
                except simpy.Interrupt:
                    self.broken = True
                    done_in -= self.env.now - start # How much time left?
                    with repair.request(priority = 1) as req:
                        yield req
                        yield self.env.timeout(random.expovariate(REPAIR_TIME_EXPO))
                    self.broken = False
            # Trip is finished
            self.trips_complete += 1
    # Defining the failure
    def break_train(self):
        while True:
            yield self.env.timeout(time_to_failure())
            if not self.broken:
                # Only break the train if it is currently working
                self.process.interrupt()
# Setup and start the simulation
print('Train trip simulator')
random.seed(RANDOM_SEED) # Helps with reproduction
# Create an environment and start setup process
env = simpy.Environment()
repair = simpy.PreemptiveResource(env, capacity = 1)
trains = [Train(env, 'Train %d' % i, repair)
          for i in range(NUM_TRAINS)]
# Execute
env.run(until = SIM_TIME)
# Analysis
trips = []
print('Train trips after %s hours of simulation' % SIM_TIME_HOURS)
for train in trains:
    print('%s completed %d trips.' % (train.name, train.trips_complete))
    trips.append(train.trips_complete)
mean_trips = numpy.mean(trips)
std_trips = numpy.std(trips)
print "mean trips: %d" % mean_trips
print "standard deviation trips: %d" % std_trips
It looks like you are using Python 2, which is a bit unfortunate, because
Python 3.3 and above give you some more flexibility with Python generators. But
your problem should be solvable in Python 2 nonetheless.
You can use sub-processes within a process:
def sub(env):
    print('I am a sub process')
    yield env.timeout(1)
    # return 23 # Only works in py3.3 and above
    env.exit(23) # Workaround for older python versions
def main(env):
    print('I am the main process')
    retval = yield env.process(sub(env))
    print('Sub returned', retval)
As you can see, you can use Process instances returned by Environment.process()
like normal events. You can even use return values in your sub-processes.
If you use Python 3.3 or newer, you don’t have to explicitly start a new
sub-process but can use sub() as a subroutine instead and just forward the
events it yields:
def sub(env):
    print('I am a sub routine')
    yield env.timeout(1)
    return 23
def main(env):
    print('I am the main process')
    retval = yield from sub(env)
    print('Sub returned', retval)
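The forwarding that yield from performs can be seen without SimPy at all. In this minimal sketch, plain strings stand in for the events a real process would yield, and list() plays the role of whatever consumes them; both are assumptions made only so the example runs on its own.

```python
def sub():
    # Stand-in for `yield env.timeout(1)`: whatever is yielded here is passed upward.
    yield 'timeout(1)'
    return 23  # becomes the value of the `yield from` expression in the caller


def main():
    retval = yield from sub()  # forwards sub()'s yields, then captures its return value
    yield 'sub returned %d' % retval


# Consuming main() shows that the caller sees sub()'s event first.
events = list(main())
print(events)
```

The consumer receives the inner event exactly as if main() had yielded it directly, and main() still gets the return value 23.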
You may also be able to model signals as resources that may be used either
by a failure process or by a train. If the failure process requests the signal
first, the train has to wait in front of the signal until the failure
process releases the signal resource. If the train is already passing the
signal (and thus has the resource), the signal cannot break. I don’t think
that’s a problem because the train can’t stop anyway. If it should be
a problem, just use a PreemptiveResource.
I hope this helps. Please feel welcome to join our mailing list for more
discussions.

Simpy subway simulation: how to fix interrupt failure of class train while queueing for a resource?

I am working on a train simulation in simpy and have had success so far with a single train entity following the code below.
The train's processes are sections followed by platforms. Each section and platform has a resource with a capacity of 1 to ensure that only one train uses it at a time.
However, I can't find a way to get around the error below:
When I add a second train to the simulation, there is occasionally a situation where one train waits for an unavailable resource and a failure then occurs on that train while it is waiting.
I end up with an Interrupt: Interrupt() error.
Is there a way around these failing queues for resources?
Any help is much appreciated.
import random
import simpy
import numpy
# Configure parameters for the model
RANDOM_SEED = random.seed() # makes results repeatable
T_MEAN_SECTION = 200.0 # journey time (seconds)
DWELL_TIME = 30.0 # dwell time mean (seconds)
DWELL_TIME_EXPO = 1/DWELL_TIME # for exponential distribution
MTTF = 600.0 # mean time to failure (seconds)
TTF_MEAN = 1/MTTF # for exponential distribution
REPAIR_TIME = 120.0 # mean repair time for when failure occurs (seconds)
REPAIR_TIME_EXPO = 1/REPAIR_TIME # for exponential distribution
NUM_TRAINS = 2 # number of trains to simulate
SIM_TIME_HOURS = 1 # sim time in hours
SIM_TIME_DAYS = SIM_TIME_HOURS/18.0 # number of days to simulate
SIM_TIME = 3600 * 18 * SIM_TIME_DAYS # sim time in seconds (this is used in the code below)
# Defining the times for processes
def Section(): # returns processing time for platform 7 Waterloo to 26 Bank
    return T_MEAN_SECTION
def Dwell(): # returns processing time for platform 25 Bank to platform 7 Waterloo
    return random.expovariate(DWELL_TIME_EXPO)
def time_to_failure(): # returns time until next failure
    return random.expovariate(TTF_MEAN)
# Defining the train
class Train(object):
    def __init__(self, env, name, repair):
        self.env = env
        self.name = name
        self.trips_complete = 0
        self.num_saf = 0
        self.sum_saf = 0
        self.broken = False
        # Start "running" and "downtime_train" processes for the train
        self.process = env.process(self.running(repair))
        env.process(self.downtime_train())
    def running(self, repair):
        while True:
            # request section A
            request_SA = sectionA.request()
            ########## SIM ERROR IF FAILURE OCCURS HERE ###########
            yield request_SA
            done_in_SA = Section()
            while done_in_SA:
                try:
                    # going on the trip
                    start = self.env.now
                    print('%s leaving platform at time %d') % (self.name, env.now)
                    # processing time
                    yield self.env.timeout(done_in_SA)
                    # releasing the section resource
                    sectionA.release(request_SA)
                    done_in_SA = 0 # Set to 0 to exit while loop
                except simpy.Interrupt:
                    self.broken = True
                    delay = random.expovariate(REPAIR_TIME_EXPO)
                    print('Oh no! Something has caused a delay of %d seconds to %s at time %d') % (delay, self.name, env.now)
                    done_in_SA -= self.env.now - start # How much time left?
                    with repair.request(priority = 1) as request_D_SA:
                        yield request_D_SA
                        yield self.env.timeout(delay)
                    self.broken = False
                    print('Okay all good now, failure fixed on %s at time %d') % (self.name, env.now)
                    self.num_saf += 1
                    self.sum_saf += delay
            # request platform A
            request_PA = platformA.request()
            ########## SIM ERROR IF FAILURE OCCURS HERE ###########
            yield request_PA
            done_in_PA = Dwell()
            while done_in_PA:
                try:
                    # platform process
                    start = self.env.now
                    print('%s arriving to platform A and opening doors at time %d') % (self.name, env.now)
                    yield self.env.timeout(done_in_PA)
                    print('%s closing doors, ready to depart platform A at %d\n') % (self.name, env.now)
                    # releasing the platform resource
                    platformA.release(request_PA)
                    done_in_PA = 0 # Set to 0 to exit while loop
                except simpy.Interrupt:
                    self.broken = True
                    delay = random.expovariate(REPAIR_TIME_EXPO)
                    print('Oh no! Something has caused a delay of %d seconds to %s at time %d') % (delay, self.name, env.now)
                    done_in_PA -= self.env.now - start # How much time left?
                    with repair.request(priority = 1) as request_D_PA:
                        yield request_D_PA
                        yield self.env.timeout(delay)
                    self.broken = False
                    print('Okay all good now, failure fixed on %s at time %d') % (self.name, env.now)
                    self.num_saf += 1
                    self.sum_saf += delay
            # Round trip is finished
            self.trips_complete += 1
    # Defining the failure event
    def downtime_train(self):
        while True:
            yield self.env.timeout(time_to_failure())
            if not self.broken:
                # Only break the train if it is currently working
                self.process.interrupt()
# Setup and start the simulation
print('Train trip simulator')
random.seed(RANDOM_SEED) # Helps with reproduction
# Create an environment and start setup process
env = simpy.Environment()
# Defining resources
platformA = simpy.Resource(env, capacity = 1)
sectionA = simpy.Resource(env, capacity = 1)
repair = simpy.PreemptiveResource(env, capacity = 10)
trains = [Train(env, 'Train %d' % i, repair)
          for i in range(NUM_TRAINS)]
# Execute
env.run(until = SIM_TIME)
Your processes request a resource and never release it. That’s why the second train waits forever for its request to succeed. While it is waiting, the failure process seems to interrupt the process. That’s why you get an error. Please read the guide to resources to understand how SimPy’s resources work and, especially, how to release a resource when you are done.

Retrying celery failed tasks that are part of a chain

I have a celery chain that runs some tasks. Each of the tasks can fail and be retried. Please see below for a quick example:
from celery import task
@task(ignore_result=True)
def add(x, y, fail=True):
    try:
        if fail:
            raise Exception('Ugly exception.')
        print '%d + %d = %d' % (x, y, x+y)
    except Exception as e:
        raise add.retry(args=(x, y, False), exc=e, countdown=10)
@task(ignore_result=True)
def mul(x, y):
    print '%d * %d = %d' % (x, y, x*y)
and the chain:
from celery.canvas import chain
chain(add.si(1, 2), mul.si(3, 4)).apply_async()
Running the two tasks (and assuming that nothing fails), you would see printed:
1 + 2 = 3
3 * 4 = 12
However, when the add task fails the first time and succeeds on a subsequent retry, the rest of the tasks in the chain do not run. That is, the add task fails, no other task in the chain runs, and after a few seconds the add task runs again and succeeds, but the rest of the chain (in this case mul.si(3, 4)) still does not run.
Does celery provide a way to continue failed chains from the task that failed, onwards? If not, what would be the best approach to accomplishing this and making sure that a chain's tasks run in the order specified and only after the previous task has executed successfully even if the task is retried a few times?
Note 1: The issue can be solved by doing
add.delay(1, 2).get()
mul.delay(3, 4).get()
but I am interested in understanding why chains do not work with failed tasks.
You've found a bug :)
Fixed in https://github.com/celery/celery/commit/b2b9d922fdaed5571cf685249bdc46f28acacde3
will be part of 3.0.4.
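The behaviour the question expects can be modelled in a few lines of plain Python. This is a toy sketch of the intended semantics only, not Celery's implementation and not the content of the commit above: run_chain and the attempt argument are hypothetical, and retries happen immediately instead of after a countdown.

```python
def run_chain(tasks, max_retries=3):
    """Run tasks in order; retry a failing task before moving on to the next one."""
    results = []
    for task in tasks:
        attempt = 0
        while True:
            try:
                results.append(task(attempt))
                break  # success: continue with the next task in the chain
            except Exception:
                attempt += 1
                if attempt > max_retries:
                    raise  # give up and stop the chain


    return results


def add(attempt, x=1, y=2):
    if attempt == 0:
        raise Exception('Ugly exception.')  # fail on the first try, as in the question
    return x + y


def mul(attempt, x=3, y=4):
    return x * y


print(run_chain([add, mul]))
```

Here add fails once, succeeds on the retry, and mul still runs afterwards, which is the ordering the chain in the question was expected to preserve.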
I'm also interested in understanding why chains do not work with failed tasks.
I dug into the Celery code, and this is what I've found so far:
The implementation is in app.builtins.py
@shared_task
def add_chain_task(app):
    from celery.canvas import chord, group, maybe_subtask
    _app = app
    class Chain(app.Task):
        app = _app
        name = 'celery.chain'
        accept_magic_kwargs = False
        def prepare_steps(self, args, tasks):
            steps = deque(tasks)
            next_step = prev_task = prev_res = None
            tasks, results = [], []
            i = 0
            while steps:
                # First task get partial args from chain.
                task = maybe_subtask(steps.popleft())
                task = task.clone() if i else task.clone(args)
                i += 1
                tid = task.options.get('task_id')
                if tid is None:
                    tid = task.options['task_id'] = uuid()
                res = task.type.AsyncResult(tid)
                # automatically upgrade group(..) | s to chord(group, s)
                if isinstance(task, group):
                    try:
                        next_step = steps.popleft()
                    except IndexError:
                        next_step = None
                if next_step is not None:
                    task = chord(task, body=next_step, task_id=tid)
                if prev_task:
                    # link previous task to this task.
                    prev_task.link(task)
                    # set the results parent attribute.
                    res.parent = prev_res
                results.append(res)
                tasks.append(task)
                prev_task, prev_res = task, res
            return tasks, results
        def apply_async(self, args=(), kwargs={}, group_id=None, chord=None,
                task_id=None, **options):
            if self.app.conf.CELERY_ALWAYS_EAGER:
                return self.apply(args, kwargs, **options)
            options.pop('publisher', None)
            tasks, results = self.prepare_steps(args, kwargs['tasks'])
            result = results[-1]
            if group_id:
                tasks[-1].set(group_id=group_id)
            if chord:
                tasks[-1].set(chord=chord)
            if task_id:
                tasks[-1].set(task_id=task_id)
                result = tasks[-1].type.AsyncResult(task_id)
            tasks[0].apply_async()
            return result
        def apply(self, args=(), kwargs={}, **options):
            tasks = [maybe_subtask(task).clone() for task in kwargs['tasks']]
            res = prev = None
            for task in tasks:
                res = task.apply((prev.get(), ) if prev else ())
                res.parent, prev = prev, res
            return res
    return Chain
You can see that at the end of prepare_steps, prev_task is linked to the next task.
When prev_task fails, the next task is not called.
I'm testing adding link_error from the previous task to the next one:
if prev_task:
    # link and link_error previous task to this task.
    prev_task.link(task)
    prev_task.link_error(task)
    # set the results parent attribute.
    res.parent = prev_res
But then the next task must handle both cases (except, perhaps, when it is configured to be immutable, i.e. it does not accept more arguments).
I think chain could support that by allowing syntax like this:
c = chain(t1, (t2, t1e), (t3, t2e))
which means:
t1 link to t2 and link_error to t1e
t2 link to t3 and link_error to t2e
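To make the proposed wiring concrete, here is a toy model in plain Python. It is not Celery code: Step, build_chain and the log list are hypothetical names, and the methods link and link_error only imitate the roles of the Celery callbacks with the same names.

```python
class Step:
    """Toy stand-in for a task signature with success and error callbacks."""

    def __init__(self, func):
        self.func = func
        self.on_success = []  # plays the role of link
        self.on_error = []    # plays the role of link_error

    def link(self, step):
        self.on_success.append(step)

    def link_error(self, step):
        self.on_error.append(step)

    def run(self, log):
        try:
            self.func()
            log.append('%s: ok' % self.func.__name__)
            followers = self.on_success
        except Exception:
            log.append('%s: failed' % self.func.__name__)
            followers = self.on_error
        for step in followers:
            step.run(log)


def build_chain(first, *pairs):
    """Wire chain(t1, (t2, t1e), (t3, t2e)): each pair is (next step, error handler of the previous step)."""
    prev = first
    for nxt, err in pairs:
        prev.link(nxt)
        prev.link_error(err)
        prev = nxt
    return first


def t1(): pass
def t2(): raise ValueError('boom')
def t3(): pass
def t1e(): pass
def t2e(): pass


log = []
build_chain(Step(t1), (Step(t2), Step(t1e)), (Step(t3), Step(t2e))).run(log)
print(log)
```

In this run t1 succeeds and hands over to t2, t2 fails, and only its error handler t2e runs; t3 and t1e are never called.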