I'm using spm_jobman() to run a large number of SPM jobs. I have a machine with a large number of processors, so I would like to be able to run them in parallel. Can spm_jobman() run jobs in parallel? If so, how can this be done?
We are running into the following issue:
We have a job in our pipeline that runs tests. The tests need to be distributed over 4 agents to run optimally. It can happen that only one agent is available, so the job starts running the entire load on that agent, which can then time out because other agents do not become available in time to share the load.
In essence, if we run with 4 agents, the job will run with optimal efficiency.
My question: is it possible to let a job wait for a specific number of agents to become available before starting the tasks in the job?
That's not possible with out-of-the-box features, but you can create a simple PowerShell script that queries your agents' statuses: https://learn.microsoft.com/en-us/rest/api/azure/devops/distributedtask/agents/list?view=azure-devops-rest-7.1
and use includeAssignedRequest:
GET https://dev.azure.com/{organization}/_apis/distributedtask/pools/{poolId}/agents?includeAssignedRequest={includeAssignedRequest}&api-version=7.1-preview.1
If you see assignedRequest on an agent, that agent is busy.
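For illustration, here is a rough sketch of that agent-status check in Python rather than PowerShell; the requests package, organization name, pool id, PAT and required-agent count are placeholders, and the response fields follow the REST API documentation linked above:

# Minimal sketch: count idle agents in a pool before kicking off the test job.
# Assumes `requests` and a PAT with "Agent Pools (Read)" scope; ORG, POOL_ID
# and the PAT value are placeholders.
import time
import requests

ORG = "my-organization"        # placeholder
POOL_ID = 1                    # placeholder
PAT = "xxxxxxxx"               # placeholder personal access token
REQUIRED_AGENTS = 4

URL = (f"https://dev.azure.com/{ORG}/_apis/distributedtask/pools/{POOL_ID}/agents"
       "?includeAssignedRequest=true&api-version=7.1-preview.1")

def idle_agent_count() -> int:
    resp = requests.get(URL, auth=("", PAT))
    resp.raise_for_status()
    agents = resp.json()["value"]
    # An agent is free if it is online, enabled, and has no assignedRequest.
    return sum(
        1 for a in agents
        if a.get("status") == "online"
        and a.get("enabled", True)
        and "assignedRequest" not in a
    )

# Poll until enough agents are free, e.g. as a gating step before the test job.
while idle_agent_count() < REQUIRED_AGENTS:
    time.sleep(30)
print("Enough agents are idle - proceed with the test job.")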
I am using Spring Batch to create a workflow of batch jobs. A single batch job takes 2 hours to complete (~1 million records to process), so I decided to run it in a distributed way, where one task is spread across multiple worker nodes so it can finish in a shorter time. The other jobs in the workflow (all of them run in a distributed manner) need to run sequentially, one after the other. In short, the jobs are multi-node distributed jobs (master/slave architecture) that must run one after another.
Now, I am considering deploying the workflow on Airflow. While exploring it, I could not find any way to run a single task distributed across multiple machines. Is that possible in Airflow?
Yes, you can create a task using the Spark framework. Spark allows you to process the data on multiple nodes in a distributed fashion.
You can then use SparkSubmitOperator to place that task in your DAG.
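As a rough illustration (not the question's actual pipeline), a DAG along these lines chains two Spark steps sequentially while each step itself runs distributed; the application paths, connection id and resource settings are placeholders, and the apache-airflow-providers-apache-spark package is assumed:

# Sketch: two sequential Airflow tasks, each a Spark job that fans out over the
# cluster. Paths, connection id and sizes below are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="distributed_batch_workflow",
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,   # trigger manually, or set a cron expression
    catchup=False,
) as dag:

    step_one = SparkSubmitOperator(
        task_id="step_one",
        application="/jobs/step_one.py",   # placeholder Spark application
        conn_id="spark_default",           # points at your Spark master / YARN
        num_executors=8,                   # the work is spread over executors
        executor_cores=4,
        executor_memory="4g",
    )

    step_two = SparkSubmitOperator(
        task_id="step_two",
        application="/jobs/step_two.py",   # placeholder
        conn_id="spark_default",
    )

    # The jobs run one after another; each job itself runs distributed on Spark.
    step_one >> step_two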
I know that UI tests can be run in parallel on multiple machines using selenium grid. How about API tests?
I looked at the pytest-xdist plugin and it can run tests in parallel on the local machine using py.test -n NUM, which sends tests to multiple CPUs and runs them in parallel. This may not be effective or fast enough if the number of tests we would like to run in parallel is much larger than the number of CPUs on the machine. For example: the machine has 4 CPUs and we would like to run 50 tests in parallel.
And it seems that to run the tests on a remote machine we need to do something like
py.test -d --tx socket=192.168.1.102:8888 --rsyncdir mypkg mypkg
I am wondering if there is a way to distribute the tests to multiple remote machines and run them in parallel. For example: if I have 1000 tests and 50 remote machines, I would like each remote machine to run one or more tests at the same time so that the tests complete faster. In other words, all 1000 tests would complete in the time it takes to run 20 tests, or less.
Thanks.
It looks like you want the load distribution mode, followed by multiple invocations of the --tx argument:
py.test --dist=load --tx socket=192.168.1.110:8888 --tx socket=192.168.1.111:8888 --tx socket=192.168.1.112:8888 --rsyncdir mypkg mypkg
I'm sure you've looked at CPU usage of the Python processes when running the tests. If you are doing what I expect you are doing (running an integration test suite against a single instance of a network service with high response times), your test suite isn't CPU bound but is actually I/O bound. For this type of workload, CPU usage may appear high, but it actually includes the time the test runner spent waiting for a response from the system under test.
The biggest problem I've encountered when parallelizing that type of test suite is that the order in which tests complete sometimes matters, and when run in parallel, tests finish in a different order than they do in series, simply due to variation in response times, causing intermittent and difficult-to-troubleshoot test failures.
If that doesn't happen with multiple cores on a single machine, that's a good sign your plan will work. That said, because there is operational overhead involved in keeping any pool of hosts around (patching, configuration, provisioning, and networking, not to mention other unexpected issues), I suggest you try something different.
I think you should consider refactoring your test code to use asynchronous IO instead of setting up the test grid. When you do this correctly, multiple tests will be able to run on one core at the same time. Your sysadmin (which may be you!) will thank you.
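As a rough sketch of what that refactoring can look like (not the poster's actual suite), the following uses asyncio with the httpx package; the base URL, endpoints and assertion are placeholders:

# Sketch of the async-I/O approach: many API calls in flight at once on a
# single core, instead of one worker process per test. Assumes httpx;
# BASE_URL and the endpoints are placeholders.
import asyncio
import httpx

BASE_URL = "http://sut.example.com"   # placeholder system under test

async def check_endpoint(client: httpx.AsyncClient, path: str) -> None:
    resp = await client.get(path)
    # While this request waits on the network, the other checks keep running.
    assert resp.status_code == 200, f"{path} returned {resp.status_code}"

async def main() -> None:
    paths = [f"/api/items/{i}" for i in range(50)]   # stand-ins for 50 tests
    async with httpx.AsyncClient(base_url=BASE_URL) as client:
        await asyncio.gather(*(check_endpoint(client, p) for p in paths))

if __name__ == "__main__":
    asyncio.run(main())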
I have a few questions about running tasks in parallel in Azure Batch. Per the official documentation, "Azure Batch allows you to set maximum tasks per node up to four times (4x) the number of node cores."
Is there any setup, other than specifying the max tasks per node when creating a pool, that needs to be done (in the code) to be able to run parallel tasks with Batch?
So if I am understanding this correctly, if I have a Standard_D1_v2 machine with 1 core, I can run up to 4 concurrent tasks in parallel on it. Is that right? If so, I ran some tests and I am not quite sure about the behavior I got. In a pool of D1_v2 machines set up to run 1 task per node, my job execution time is about 16 minutes. Then, using the same applications and the same parameters, with the only change being a new pool with the same setup (also D1_v2) except running 4 tasks per node, I still get a job execution time of about 15 minutes. There was no improvement in job execution time from running tasks in parallel. What could be happening? What am I missing here?
I also ran a test with a pool of D3_v2 machines with 4 cores, set up to run 2 tasks per core for a total of 8 tasks per node, and another test with a pool (same number of machines as the previous one) of D2_v2 machines with 2 cores, set up to run 2 tasks per core for a total of 4 parallel tasks per node. The run time / job execution time for both tests was the same. Isn't there supposed to be an improvement, considering that 8 tasks run per node in the first test versus 4 tasks per node in the second? If so, what could be the reason I'm not seeing it?
No. However, you may want to look into the task scheduling policy (compute node fill type) to control how your tasks are distributed amongst the nodes in your pool.
How many tasks are in your job? Are your tasks compute-bound? If so, you won't see any improvement (perhaps even end-to-end performance degradation).
Batch merely schedules the tasks concurrently on the node. If the command/process that you're running utilizes all of the cores on the machine and is compute-bound, you won't see an improvement. You should double-check your tasks' start and end times within the job, and the node execution info, to see if they are actually being scheduled concurrently on the same node.
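For illustration, the tasks-per-node setting and the fill type live on the pool definition. The sketch below uses the azure-batch Python SDK; the account details, image and sizes are placeholders, and older SDK versions name the slot setting max_tasks_per_node rather than task_slots_per_node:

# Illustrative sketch only; account name/key, image and pool id are placeholders.
from azure.batch import BatchServiceClient
from azure.batch import models as batchmodels
from azure.batch.batch_auth import SharedKeyCredentials

credentials = SharedKeyCredentials("mybatchaccount", "<account-key>")  # placeholders
client = BatchServiceClient(
    credentials, batch_url="https://mybatchaccount.westus.batch.azure.com")

pool = batchmodels.PoolAddParameter(
    id="parallel-pool",          # placeholder pool id
    vm_size="STANDARD_D2_V3",
    virtual_machine_configuration=batchmodels.VirtualMachineConfiguration(
        image_reference=batchmodels.ImageReference(
            publisher="canonical", offer="0001-com-ubuntu-server-focal",
            sku="20_04-lts", version="latest"),
        node_agent_sku_id="batch.node.ubuntu 20.04"),
    target_dedicated_nodes=2,
    # Several task slots per node only help if each task leaves CPU/IO headroom
    # for the others; compute-bound tasks will just contend for the cores.
    task_slots_per_node=4,
    # "spread" distributes tasks evenly across nodes; "pack" fills a node first.
    task_scheduling_policy=batchmodels.TaskSchedulingPolicy(
        node_fill_type=batchmodels.ComputeNodeFillType.spread),
)
client.pool.add(pool)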
I'm using Jenkins for my builds, and I wrote some test scripts that I need to run after the compilation of the build.
I want to save some time, so I have to run the test scripts in parallel. How can I do that?
EDIT: OK, I understand now that I need a separate job for each test (for 4 tests I need 4 jobs, right?).
So I did that, and ran these jobs from the parent job (using the "build other projects" plugin).
But I didn't manage to aggregate the results (using "aggregate downstream test results"). The parent job exits before the downstream jobs are finished.
What shall I do?
Thanks.
You can use the Multijob plugin. It allows you to run multiple jobs in parallel, and the parent job waits for the sub-jobs to complete. The parent job's status can be determined from the sub-jobs' statuses.
https://wiki.jenkins-ci.org/display/JENKINS/Multijob+Plugin
Jenkins doesn't really allow you to run things in parallel within a single job. You can, however, split your build into different jobs to achieve this. It would look like this:
Job to compile the source runs.
Jobs that run the tests are triggered by the completion of the compilation and start running. They copy compilation results from the previous job into their workspaces.
This is a bit kludgy, though. The better alternative would be to parallelise within the scripts that run the tests, i.e. you run a single script and it then runs the tests in parallel. If that is not possible, you'll have to split into different jobs.
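As a rough sketch of that "single script that runs the tests in parallel" idea (the script names and the choice of Python for the wrapper are assumptions, not part of the original answer):

# Sketch: one Jenkins job runs this wrapper, which launches the individual test
# scripts concurrently. The script names are placeholders.
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

TEST_SCRIPTS = ["./test_api.sh", "./test_ui.sh", "./test_db.sh", "./test_perf.sh"]  # placeholders

def run(script: str) -> int:
    # Threads are fine here: each one just waits on a child process.
    return subprocess.call(script, shell=True)

with ThreadPoolExecutor(max_workers=len(TEST_SCRIPTS)) as pool:
    exit_codes = list(pool.map(run, TEST_SCRIPTS))

# Fail the Jenkins build if any script failed.
sys.exit(max(exit_codes, default=0))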
Have you looked at the Jenkins Join Plugin? I have not used it, but I believe it does what you are attempting to accomplish.
Actually you can, but you will need some coding behind it.
In my case, I have parallel test execution on Jenkins.
1) Create a small job with parameters that is supposed to do a test run with a small suite
2) Edit this job to run on a list of slaves (where you have the proper environment)
3) Edit this build to allow concurrent builds
And now the hard part.
4) Create a small Java program that computes the list of parameters for each job run.
5) Iterate through the list and launch a new Jenkins job on a new thread. Put a Thread.sleep(5000) between runs in order to avoid communication errors.
6) Join the threads
At the end of each job, I send the results to a shared location in order to perform some reporting at the end of all tests.
To start a Jenkins job with parameters, use the CLI.
I intend to make my code as generic as possible and publish it if anyone else needs it.
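The answer above describes a Java program; purely as an illustration of the same idea, here is a sketch in Python that drives the Jenkins CLI, with the server URL, job name and parameter sets as placeholders:

# Sketch: compute one parameter set per run, then start the parameterized
# Jenkins job once per set via the Jenkins CLI, each on its own thread.
import subprocess
import threading
import time

JENKINS_URL = "http://jenkins.example.com"   # placeholder
JOB_NAME = "parallel-test-run"               # placeholder parameterized job
PARAM_SETS = [{"SUITE": f"suite_{i}"} for i in range(4)]   # placeholders

def launch(params: dict) -> None:
    cmd = ["java", "-jar", "jenkins-cli.jar", "-s", JENKINS_URL,
           "build", JOB_NAME, "-s"]   # the second -s waits for the build to finish
    for key, value in params.items():
        cmd += ["-p", f"{key}={value}"]
    subprocess.call(cmd)

threads = []
for params in PARAM_SETS:
    t = threading.Thread(target=launch, args=(params,))
    t.start()
    threads.append(t)
    time.sleep(5)      # small delay between launches, as suggested above

# "Join the threads": wait for every triggered build to complete.
for t in threads:
    t.join()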
You can use https://wiki.jenkins-ci.org/display/JENKINS/Build+Flow+Plugin with code like this:
parallel (
    // job 1, 2 and 3 will be scheduled in parallel.
    { build("job1") },
    { build("job2") },
    { build("job3") }
)
You can use any one of the following:
Multijob plugin
Build Flow plugin