The problem
Say I have 20 processors available. I want to pass arguments from IPython to an external program that runs best with 4 threads at a time, and use map_async to keep adding jobs until all of the jobs are finished. Below is example code where I believe just one process would be assigned to each job at a time. Is this a case where you would use the chunksize argument? It seems that would do the opposite, i.e., send multiple jobs to one processor.
Start engines outside of IPython
ipcluster start -n 20 --daemon
IPython code
import ipyparallel as ipp
import subprocess
def func(args):
    """ function that calls an external program with 4 threads """
    # the exact flag is program-specific; "--nthreads" is just an illustration
    subprocess.call([some_external_program, args, "--nthreads", "4"])
args = [...]
ipyclient = ipp.Client().load_balanced_view()
results = ipyclient.map_async(func, args)
results.get()
If a task is multithreaded, you don't want to be running it on too many engines. If this is the bulk of your work, it is probably best to start n_cpus/n_threads engines instead of n_cpus (5 engines in your case of 20 CPUs and 4 threads). If only a subset of your work is multithreaded like this, you may instead want to restrict the assignment of those tasks to n_cpus/n_threads engines. You can do this with the targets argument when creating a view, which restricts task assignment to a subset of engines:
n_threads = 4
client = ipp.Client()
all_view = client.load_balanced_view() # uses all engines
threaded_view = client.load_balanced_view(targets=client.ids[::n_threads])  # every n_threads-th engine
This assumes that you have one engine per CPU on a single machine. If you are using multiple machines or the engine count has a different relationship to the number of CPUs, you will have to work out the correct subset of engines to use. Targets can be any manually specified list of engine IDs.
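Putting it together, here is a minimal sketch of how the restricted view could be used with map_async. It assumes one engine per CPU on a single machine; the program name, its --nthreads flag, and the argument list are placeholders, not real names:

import ipyparallel as ipp

n_threads = 4

def threaded_func(arg):
    """ calls a hypothetical external program that uses 4 threads """
    import subprocess
    return subprocess.call(["some_external_program", arg, "--nthreads", "4"])

client = ipp.Client()
# every 4th engine, so at most 5 of the 20 engines run 4-thread tasks at once
threaded_view = client.load_balanced_view(targets=client.ids[::n_threads])

args = ["input-a", "input-b", "input-c"]  # placeholder arguments
results = threaded_view.map_async(threaded_func, args)
results.get()  # block until every task has finished

With this arrangement, at most 5 of the 4-thread tasks run concurrently, which keeps the total thread count at roughly the 20 available CPUs.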
Related
I have a Playwright test which I'm running via the following command -
dotnet test -- NUnit.NumberOfTestWorkers=2
From what I can gather, this will execute the same test in parallel with 2 workers. I'm curious if there's any way to have each worker go down a separate logical path, perhaps depending upon a worker id or something similar? Eg:
if (workerId == 1)
    //do something
else if (workerId == 2)
    //do something else
What would be the best way to do this?
As to why I want this, I have a Blazor Server app which is a chat room, and I want to test the text updating from separate users (which would be represented by different worker ids, for example). I'd also like the solution to be scalable, eg: I can enter 5000 or so workers to test large scalability in the chat room.
You appear to have misunderstood what the NumberOfTestWorkers setting does. It simply tells NUnit how many separate test workers to set up. It has no impact on how NUnit allocates tests among its workers when running in parallel, and it does not cause an individual test to run more than once.
In general, the kind of load testing you are trying to do isn't directly supported by NUnit. You would have to build something of your own, possibly using NUnit, or try a framework intended for that kind of testing.
A long time ago, there was something called pnunit, but I don't believe it is kept up to date any longer. See https://docs.plasticscm.com/technical-articles/pnunit-parallel-nunit
When I create a ParametersVariation simulation, the main model does not run. All I see is the default UI with iterations completed and replications. My end goal (as with most people) is to have a model go through a certain number of replications, but nothing is even running. There is limited documentation available on this. Please advise.
This is how Parameters Variation is intended to work. If you're running 1000 runs and multiple replications with parallel runs, how can you see what's happening in Main in each?
Typically, the best way to benefit from such an experiment is to track the results of each run using elements from the Analysis palette or even better to export results to Excel or similar.
To collect data, you need to write Java code in the experiment's action fields, using root. to access elements in Main (or whichever agent is your top-level agent).
For example, after each run a variable from Main can be added to a dataset in the Parameters Variation experiment. At the end of 100 runs, the dataset will hold 100 values of that variable, one for each run.
I had a quick question about the workflow plugin. I'm trying to see if the plugin will be able to satisfy my use case:
1. We have a Jenkins job that will build our app.
2. We want to spin off a suite of test jobs that will perform various tests on the newly built app (unit, integration, etc.). These need to run in parallel, and we want to run them on more than one Jenkins node for performance reasons.
3. We'll take the aggregated output from all our test processes in step 2 and decide whether or not we should deploy (everything passed).
I was curious as to whether I'd be able to accomplish this with the plugin and, if so, whether you had any tips/pointers for getting started.
Thanks!
You can certainly run nodes inside parallel branches. If one branch fails, the parallel step as a whole fails. If you want the build to succeed, but behave differently depending on test results, you can capture them directly as Groovy variables in various ways.
If you are using JUnitArchiver, it currently does not provide a simple means of exposing the test results directly to the Pipeline script (JENKINS-26276), though if you just want to tell whether there are some failures or none, you can inspect currentBuild.result.
If you have JUnit-format test results and wish to automatically split them amongst various nodes (especially helpful in case you have a large pool of machines and it would be unmaintainable to manually divide your tests), see this demo of the Parallel Test Executor plugin’s splitTests step.
My team is new to developing these things, and I came into a project that defines an over-arching workflow using separate processes that all live under the same project. So it appears that right now the processes are all discrete units, and the plan was to connect these units together using inputs and outputs.
Based on the documentation, it looks like the best-practice way of doing this would be to define the entire over-arching workflow using sub-process tasks.
So I wonder:
Is the implementation we've started workable?
or
Should I have only one process per workflow, which defines sub-processes if the workflow is too complicated and has discrete parts?
It's fine to separate certain parts of the process into their own processes, and then call those from some sort of parent process. The task to use in the parent process is called a reusable sub-process, or call activity. It's absolutely fine to have multiple processes in the same project.
I need to know how we can run a single job in parallel with different parameters in Talend.
The answer is straightforward, but rather depends on what you want to do and whether you are using the free or the commercial version of Talend.
As far as parameters go, make sure that your jobs are using context variables - this is the preferred way of passing parameters in.
As for running in parallel, there are a few options.
Talend Studio is a Java code generator, so you can export your job (it's just Java code) and run it wherever you want. How you invoke it is up to you: schedule it, invoke it N times manually, your call. Obviously, if your job touches shared resources, then making it safe to run in parallel is up to you; the usual concurrency issues apply.
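As a minimal sketch of that first option, the launcher script that Talend exports alongside the job can simply be started several times in parallel, each time with a different context parameter. The script name (myjob_run.sh) and the context variable (source) are assumptions here, and the launcher below just happens to be written in Python; any shell or scheduler would do:

import subprocess

# hypothetical context values, one per parallel run
params = ["customers", "orders", "invoices"]

# start one instance of the exported job per context value
procs = [
    subprocess.Popen(["./myjob_run.sh", "--context_param", "source=%s" % p])
    for p in params
]

for proc in procs:
    proc.wait()  # wait for all parallel runs to finish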
If you have the commercial product, you can use the Talend Administration Center (TAC). The TAC allows you to schedule a job more than once with different contexts. Or, if you want to keep the parallelization logic inside your job, consider using the tParallelize component in one job to run another job N times.