How to get Locust to spread execution evenly among tasks? - locust

I have created a locustfile with 28 tasks.
When I run Locust, it only runs a subset of the tasks.
Here is the command I am using:
locust -f $locustfile.py --headless -u 5 -r 1 --run-time 15m --logfile locust.log
I am running it for 15 minutes and each task takes just a few seconds to a minute to run.
Upon completion, it reports that it ran tasks 187 times, yet only 8 of the 28 tasks were run.
Of the 8 it did run, each was executed anywhere from 17 to 31 times.
I'm decorating all of the tasks with a bare "@task" decorator, so they should all be weighted the same.
Can anyone tell me why the task selection is so limited and how to make it spread out more?
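For reference, here is a minimal sketch of how equally weighted tasks are usually declared (the class name, host, and endpoints below are hypothetical). With a bare @task decorator every task gets a weight of 1, and Locust picks each simulated user's next task at random from that weighted set, so over a long enough run all 28 tasks should eventually be selected:

from locust import HttpUser, task, between

class MyUser(HttpUser):
    host = "https://example.com"  # hypothetical target
    wait_time = between(1, 2)

    # Each bare @task decorator assigns a weight of 1, so all tasks are
    # equally likely to be chosen each time a user picks its next task.
    @task
    def task_one(self):
        self.client.get("/endpoint-one")  # hypothetical endpoint

    @task
    def task_two(self):
        self.client.get("/endpoint-two")  # hypothetical endpoint

    # ...the remaining tasks are declared the same way...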

It turns out that I had a bug in my error event handling that was causing the failures not to be reported. I fixed this and am now getting all of the test results. It was Solowalker's comment that led me to this discovery.
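The original event-handling code isn't shown here, but as an illustration, a request listener along these lines (this uses the unified events.request hook from Locust 2.x; the logging is only a sketch) surfaces request failures explicitly instead of letting a buggy handler swallow them:

import logging
from locust import events

@events.request.add_listener
def report_failures(request_type, name, response_time, response_length, exception, **kwargs):
    # Log any request that carries an exception so failing tasks show up
    # in the log as well as in Locust's statistics.
    if exception:
        logging.error("FAILED %s %s: %s", request_type, name, exception)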

Related

How to find process by dmesg message in Solaris?

There is a "Oracle Solaris 11.4" system, where we have a flood messages in dmesg.
genunix: [ID 200113 kern.warning] WARNING: symlink creation failed,
error 2
This message appears every 15 minutes, but I didn't find any crontab job that starts at that interval.
Is there a way to find out which process runs every 15 minutes?
Can I use dtrace or something else?
Thanks

GitHub workflow job timeout-minutes is ignored. Why?

According to https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions#jobsjob_idtimeout-minutes
The timeout-minutes parameter defaults to 360 minutes (6 hours).
I parallelized my mutation testing so that my workflow takes around 6.5 hours to run (mutation testing with Stryker of ~1600 mutants on just 2 cores - 9 jobs in parallel). Thus, I’ve set the timeout-minutes to 420 minutes (7 hours) for the mutation job just in case: https://github.com/lbragile/TabMerger/blob/b53a668678b7dcde0dd8f8b06ae23ee668ff8f9e/.github/workflows/testing.yml#L53
This seems to be ignored as the workflow still ends in 6 hours 23min (without warnings/errors): https://github.com/lbragile/TabMerger/runs/2035483167?check_suite_focus=true
Why is my value being ignored?
Also, is there anything I can do to use more CPUs on the workflow's virtual machine?
GitHub-hosted runners are limited to a maximum of 6 hours per job.
Usage limits
There are some limits on GitHub Actions usage when using GitHub-hosted runners. These limits are subject to change.
[...]
Job execution time - Each job in a workflow can run for up to 6 hours of execution time. If a job reaches this limit, the job is terminated and fails to complete.
https://docs.github.com/en/actions/reference/usage-limits-billing-and-administration#usage-limits

Airflow tasks stuck in queued state

We're running Airflow 1.10.12, with KubernetesExecutor and KubernetesPodOperator.
In the past few days, we've been seeing tasks get stuck in the queued state for a long time (in fact, unless we restart the scheduler, they remain stuck in that state), while new tasks of the same DAG are scheduled properly.
The only thing that helps is either clearing the task manually or restarting the scheduler service.
We usually see it happen when we run our E2E tests, which spawn ~20 DAG runs for every one of our 3 DAGs; due to limited parallelism, some will be queued (which is fine by us).
These are our parallelism params in airflow.cfg:
parallelism = 32
dag_concurrency = 16
max_active_runs_per_dag = 16
Two of our DAGs override max_active_runs and set it to 10.
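For reference, a minimal sketch of what that per-DAG override looks like in a DAG definition (the dag_id and dates here are hypothetical; the DAG-level max_active_runs takes precedence over max_active_runs_per_dag from airflow.cfg):

from datetime import datetime
from airflow import DAG

dag = DAG(
    dag_id="example_e2e_dag",          # hypothetical DAG id
    start_date=datetime(2021, 1, 1),   # hypothetical start date
    schedule_interval=None,
    max_active_runs=10,                # per-DAG override of max_active_runs_per_dag
)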
Any idea what could be causing it?

Set iteration count

I'm building a Locust script to be integrated into our CI/CD pipeline as a synthetic monitoring solution. It will run one iteration every 15 minutes. If the application fails, alerts will be enabled and sent to the appropriate personnel.
Currently, I don't see an iteration-count option in the Locust command line help. I do see a --run-time option, but that specifies how long to run rather than how many times.
If you add locust-plugins, there is now a way to do this using the command line parameter -i. See https://github.com/SvenskaSpel/locust-plugins#command-line-options
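If pulling in the plugin isn't an option, a rough plugin-free sketch of the same idea is to shut the runner down from inside the task after a single pass (the host and endpoint below are hypothetical, and this assumes a single-user headless run):

from locust import HttpUser, task

class MonitorUser(HttpUser):
    host = "https://example.com"  # hypothetical target

    @task
    def single_check(self):
        self.client.get("/health")  # hypothetical health-check endpoint
        # Stop the whole test after this one pass so a headless run with
        # one user performs exactly one iteration and then exits.
        self.environment.runner.quit()

Something like locust -f monitor.py --headless -u 1 -r 1 would then execute a single iteration and exit.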

Incorrect failure notification from Rundeck during fall time change

Last night was the "fall back" time change for most locations in the US. I woke up this morning to find dozens of job failure notifications. Almost all of them, though, were incorrect: the jobs showed as having completed normally, yet Rundeck sent a failure notification for them.
Interestingly, this happened in two completely separate Rundeck installations (v2.10.8-1 and v3.1.2-20190927). The commonality is that they're both on CentOS 7 (separate servers). They're both using MariaDB, although different versions of MariaDB.
The failure emails for the jobs that finished successfully showed a negative time in the "Scheduled after" line:
#1,811,391
by admin Scheduled after 59m at 1:19 AM
• Scheduled after -33s - View Output »
• Download Output
Execution
User: admin
Time: 59m
Started: in 59m 2019-11-03 01:19:01.0
Finished: 1s ago Sun Nov 03 01:19:28 EDT 2019
Executions    Success rate    Average duration
              100%            -45s
That job actually ran in 27s at 01:19 EDT (during the first 1 AM hour; it is now EST). Looking at the email headers, I believe I received the message at 1:19 EST, an hour after the job ran.
So that would seem to imply to me that it's just a notification problem (somehow).
But a couple of jobs that run after other job executions failed as well, apparently because the successfully finished job returned a return code of 2. I'm not sure what to make of this.
We've been running Rundeck for a few years now, and this is the first time I remember seeing this problem. Of course my memory may be faulty; maybe we did see it previously, only with fewer jobs affected or some such.
The fact that it impacted two different versions of Rundeck on two different servers implies either it's a fundamental issue with Rundeck that's been around for a while or it is something else in the operating system that's somehow causing problems for Rundeck. (Although time change isn't new, so that would seem to be somewhat surprising too.)
Any thoughts about what might have gone on (and how to prevent it next year, short of the obvious run on UTC) would be appreciated.
You can define a specific time zone in Rundeck; check this and this.