How can I run tests in the same script in parallel? [duplicate] - perl

In all the tutorials I've read for Test::Class, there seems to be one runner script that loads all of the classes. And I think from the perspective of Test::Harness this is just one giant test. I don't think it can parallelize the tests inside the runner.
My underlying (X) problem is that I am trying to factor out superclass behaviors when testing subclasses. Each subclass should have its own subclass test (that can be parallelized), but also exercise behaviors inherited from the superclass. How does one do that? (A sketch of that structure follows the links below.)
Edit: I found these two posts from 2007 that seem to imply that what I'm asking for is incompatible/not possible. Any update since then?
http://www.hexten.net/pipermail/tapx-dev/2007-October/001756.html (speculation about Test::Class supporting parallelism)
http://perlbuzz.com/2007/08/organizing-tests-with-testclass.html (implying that Test::Class and Test::Harness are ideologically exclusive)
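For reference, the usual way to get that structure with Test::Class is to put the shared test methods in a parent test class and inherit them in each subclass's test class; inherited test methods run again for each subclass. A minimal sketch, with hypothetical class names (MyApp::Animal and MyApp::Dog are assumed to exist elsewhere):

# t/lib/MyApp/Test/Animal.pm -- shared behaviors for every subclass test class
package MyApp::Test::Animal;
use base qw(Test::Class);
use Test::More;

# Each subclass test class overrides this to point at its own class under test.
sub class_under_test { 'MyApp::Animal' }

# Inherited by every subclass test class and re-run against its class_under_test.
sub speaks : Test(1) {
    my $self = shift;
    ok( $self->class_under_test->can('speak'), 'speak() is implemented' );
}

1;

# t/lib/MyApp/Test/Dog.pm -- inherits the shared tests and adds its own
package MyApp::Test::Dog;
use base qw(MyApp::Test::Animal);
use Test::More;

sub class_under_test { 'MyApp::Dog' }

sub fetches : Test(1) {
    ok( MyApp::Dog->can('fetch'), 'fetch() is implemented' );
}

1;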

Test::Class doesn't support parallelisation on its own. Probably the easiest solution would be to have separate .t runners for each of your test classes (or for logical groups of test classes), and run using e.g. prove -j9.
If you really want to run all of the tests in parallel you could write a simple script to auto-generate a .t file for each test class. You'd lose the performance benefit of running multiple test classes within a single perl interpreter, but the parallelisation might compensate for the additional startup overhead. And I'd argue that no matter how much Test::Class tries to provide test isolation, it's impossible to guarantee that in Perl. Once you start taking advantage of modifying the symbol table for mocking purposes etc, your tests will start to interfere with each other unless you always get the cleanup right. Running each test in a separate perl interpreter is a better way to provide guaranteed isolation.
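For example, each per-class runner can be a tiny .t file (class and path names here are hypothetical), and prove -j then runs the runners, and hence the test classes, in parallel, each in its own perl interpreter:

# t/dog.t -- one small runner per test class
use strict;
use warnings;
use lib 't/lib';
use MyApp::Test::Dog;

MyApp::Test::Dog->runtests;

# then, from the shell:
#   prove -j9 t/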

To make Test::Class tests run in parallel, I used the following mechanism; I hope it helps.
I used the Parallel::ForkManager module to invoke the tests, and parameterized the TEST_METHOD environment variable so that each forked process runs only the required test methods, in parallel.
This gives isolation from the other tests because each test is invoked independently, and the parent process waits until all of the child processes have completed.
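A rough sketch of that mechanism is below; the method-name patterns and the t/runner.t path are made up for illustration. TEST_METHOD is the environment variable Test::Class reads as a regex selecting which test methods to run.

use strict;
use warnings;
use Parallel::ForkManager;

# Each child process runs only the test methods whose names match its pattern.
my @patterns = ( 'test_create.*', 'test_update.*', 'test_delete.*' );

my $pm = Parallel::ForkManager->new( scalar @patterns );

for my $pattern (@patterns) {
    $pm->start and next;                  # parent: fork a child, then continue the loop
    $ENV{TEST_METHOD} = $pattern;         # Test::Class filters test methods by this regex
    my $status = system( $^X, 't/runner.t' );
    $pm->finish( $status == 0 ? 0 : 1 );  # child exits, reporting failure if any
}

$pm->wait_all_children;                   # parent waits for all child processes to finish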

Related

How to get NUnit filters at runtime?

Does anybody know how to get the list of categories (provided with the 'where' filter to nunit-console) at runtime?
Depending on this, I need to differently initialize the test assembly.
Is there something static like TestExecutionContext that may contain such information?
The engine doesn't pass information on to the framework about "why" it's running a particular test, i.e. whether it's running all tests or whether the test was selected by name or category. That's deliberately kept as something the test doesn't know about, the underlying philosophy being that tests should just run based on the data provided to them.
On some platforms it's possible to get the command line that ran the test. With that info you could decode the various options and draw some conclusions, but it seems it would be easier to restructure the tests so they didn't need this information.
As a secondary reason, it would also be somewhat complicated to supply the info you want and to use it. A test may have multiple categories. Imagine a test selected because two categories matched, for example!
Is it possible that what you really want to do is to pass some parameters to your tests? There is a facility for doing that of course.
I think this is a bit of an XY problem. Depending on what you are actually trying to accomplish, the best approach is likely to be different. Can you edit to tell us what you are trying to do?
UPDATE:
Based on your comment, I gather that some of your initialization is both time-consuming and not needed unless certain tests are run.
Two approaches to this (or combine them):
1. Do less work in all your initialization (i.e. TestCase, TestCaseSource, SetUpFixture). It's generally best not to create your classes under test or initialize databases there. Instead, simply leave strings, ints, etc., which allow the actual test to do the work if (and only if) it is run.
2. Use a SetUpFixture in some namespace containing all the tests which require that particular initialization. If you don't run any tests from that namespace, then the initialization won't be done.
Of course both of the above may entail a large refactoring of your tests, but the legacy app won't have to be changed.

Dynamically control order of tests with pytest

I would like to control the order of my tests using logic which will reorder them on the fly, while they are already running.
My use case is this: I am parallelizing my tests with xdist, and each test uses external resources from a common and limited pool. Some tests use more resources than others, so at any given time when only a fraction of the resources are available, some of the tests have the resources they need to run and others don't.
I want to optimize the usage of the resources, so I would like to dynamically choose which test will run next, based on the resources currently available. I would calculate an optimal ordering during the collection stage, but I don't know in advance how long each test will take, so I can't predict which resources will be available when.
I haven't found a way to implement this behavior with a plugin because the collection stage seems to be distinct from the running stage, and I don't know another way to modify the list of tests to run other than the collection hooks.
Would very much appreciate any suggestions, whether a way to implement this or an alternative idea which would optimize resource utilization. My alternative is to write my own simplistic test runner, but I don't want to give up on the rest of what pytest offers.
Thanks!
Not really a full answer to your problem, but an answer to your question about pytest, and a few hints.
To change the order of the tests in pytest (just pytest, not pytest-xdist), you can just alter the order of the test items on the go by installing this hook wrapper:
conftest.py:
import pytest
import random
random.seed()
@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_protocol(item, nextitem):
    yield
    idx = item.session.items.index(item)
    remaining = item.session.items[idx+1:]
    random.shuffle(remaining)
    item.session.items[idx+1:] = remaining
It makes no sense to change what was already executed, only what remains, hence [idx+1:]. In this example I just shuffle the items, but you can do whatever you want with the list of remaining tests.
Keep in mind: this can affect how fixtures are set up and torn down (those with a scope above 'function'). Normally, pytest orders the tests so that they can use the fixtures in the most efficient way, and the nextitem argument is used internally to decide when a fixture should be finalised. Since you effectively change it every time, the effects can be unpredictable.
With pytest-xdist, things work quite differently. You have to read up on how pytest-xdist distributes the tests across the slaves.
First, every slave collects the same tests, in exactly the same order; if the order differs, pytest-xdist will fail. So you cannot reorder them at collection time.
Second, the master process sends the test indexes in that collected list as the next tests to execute. So, the collection must be unchanged all the time.
Third, you can redefine the pytest_xdist_make_scheduler hook. There are a few sample implementations in pytest-xdist itself. You can define your own test-scheduling logic in the schedule() method, using the nodes added/removed and the tests collected via the corresponding methods.
Fourth, that would be too easy. The .schedule() method is called only on the slave_collectionfinish event sent from a slave. I'm sad to say it, but you would have to kill and restart the slave processes all the time to trigger this event and re-schedule the remaining tests.
As you can see, a pytest-xdist implementation would be big and complex. But I hope this gives you some hints on where to look.

Celery - running a set of tasks with complex dependencies

In the application I'm working on, a user can perform a "transition" which consists of "steps". A step can have an arbitrary number of dependencies on other steps. I'd like to be able to call a transition and have the steps execute in parallel as separate Celery tasks.
Ideally, I'd like something along the lines of celery-tasktree, except for directed acyclic graphs in general, rather than only trees, but it doesn't appear that such a library exists as yet.
The first solution that comes to mind is a parallel adaptation of a standard topological sort - rather than determining a linear ordering of steps which satisfy the dependency relation, we determine the entire set of steps that can be executed in parallel at the beginning, followed by the entire set of steps that can be executed in round 2, and so on.
However, this is not optimal when tasks take a variable amount of time and workers have to idle waiting for a longer running task while there are tasks that are now ready to run. (For my specific application, this solution is probably fine for now, but I'd still like to figure out how to optimise this.)
As noted in https://cs.stackexchange.com/questions/2524/getting-parallel-items-in-dependency-resolution, a better way is to operate directly off the DAG - after each task finishes, check whether any of its dependent tasks are now able to run, and if so, run them.
What would be the best way to go about implementing something like this? It's not clear to me that there's an easy way to do this.
From what I can tell, Celery's group/chain/chord primitives aren't flexible enough to allow me to express a full DAG - though I might be wrong here?
I'm thinking I could create a wrapper for tasks which notifies dependent tasks once the current task finishes - I'm not sure what the best way to handle such a notification would be though. Accessing the application's Django database isn't particularly neat, and would make it hard to spin this out into a generic library, but Celery itself doesn't provide obvious mechanisms for this.
I also faced this problem, but I couldn't really find a better solution or library except for one. For anyone still interested, you can check out https://github.com/selinon/selinon. Although it's Python 3 only, it seems to be the only thing that does exactly what you want.
Airflow is another option, but Airflow is aimed at a more static environment, just like other DAG libraries.

Process-level parallelism in Scala

I'd like to use reflection in combination with parallel processing in Scala, but I'm getting bitten by reflection's lack of thread safety.
So, I'm considering just running each task in its own process (not thread).
Is there any easy way to do this?
For example, is there a way to configure .par so it spawns processes, not threads? Or is there some function fork that takes a closure and runs it in a new process?
EDIT: Futures are apparently a good way to go.
However, I still need to figure out how to run them in separate processes.
EDIT 2: I'm still having concurrency issues, even when using Akka's "fork-join-executor" dispatcher, which sure sounds like it should be forking processes. However, when I run ManagementFactory.getRuntimeMXBean().getName() inside the Futures, it seems everything still lives in the same process.
Is this the right way to check for actual process-level parallelism?
Am I using the correct Akka dispatcher?
EDIT 3: I realize reflection sucks. Unfortunately it is used in a library I need.
Have you looked into Scala Actors or Akka? There may be no more compelling reason to use Scala than for parallel and asynchronous programming. It's baked into the language. Check out these facilities. I'm pretty sure you'll find what you need.
There's little information here about the problem you're trying to solve. The previous answers are pretty much on the ball: look at Actors and Akka, and you may find that you don't need to do anything too complicated. Needing introspection/reflection in a multi-threaded environment usually points to a messy, not well-thought-out decomposition of the problem at hand.
