I would like to control the order of my tests using logic which will reorder them on the fly, while they are already running.
My use case is this: I am parallelizing my tests with xdist, and each test uses external resources from a common and limited pool. Some tests use more resources than others, so at any given time when only a fraction of the resources are available, some of the tests have the resources they need to run and others don't.
I want to optimize the usage of the resources, so I would like to dynamically choose which test will run next, based on the resources currently available. I would calculate an optimal ordering during the collection stage, but I don't know in advance how long each test will take, so I can't predict which resources will be available when.
I haven't found a way to implement this behavior with a plugin because the collection stage seems to be distinct from the running stage, and I don't know another way to modify the list of tests to run other than the collection hooks.
I would very much appreciate any suggestions, whether a way to implement this or an alternative idea that would optimize resource utilization. My fallback is to write my own simplistic test runner, but I don't want to give up on the rest of what pytest offers.
Thanks!
Not really a full answer to your problem, but an answer to your question on pytest, plus a few hints.
To change the order of the tests in pytest (just pytest, not pytest-xdist), you can just alter the order of the test items on the go by installing this hook wrapper:
conftest.py:
import random

import pytest

random.seed()


@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_protocol(item, nextitem):
    # Let pytest run the current item first.
    yield
    # Reorder only the items that have not run yet.
    idx = item.session.items.index(item)
    remaining = item.session.items[idx + 1:]
    random.shuffle(remaining)
    item.session.items[idx + 1:] = remaining
It makes no sense to change what was already executed, only what remains; hence [idx+1:]. In this example I just shuffle the items, but you can do whatever you want with the list of remaining items.
Keep in mind: this can affect how fixtures with a scope above 'function' are set up and torn down. Originally, pytest orders the tests so they can use the fixtures in the most efficient way. Specifically, the nextitem argument is used internally to decide whether a fixture should be finalized. Since you effectively change it every time, the effects can be unpredictable.
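For the resource-pool use case in the question, the shuffle could be swapped for something like the sketch below. It is only an illustration: the `required` mapping and the `available` count are assumptions you would have to supply yourself, since pytest knows nothing about your resources.

```python
def prioritize(remaining, required, available):
    """Return `remaining` reordered so that a test that fits runs next.

    remaining: list of test identifiers that have not run yet
    required:  hypothetical mapping of test identifier -> resource units needed
    available: hypothetical number of resource units free right now
    """
    fitting = [t for t in remaining if required[t] <= available]
    if not fitting:
        # Nothing fits at the moment; keep the current order.
        return list(remaining)
    # Prefer the test that uses the most of what is currently free.
    best = max(fitting, key=lambda t: required[t])
    return [best] + [t for t in remaining if t != best]
```

Inside the hook, the result would be assigned back with `item.session.items[idx + 1:] = ...`, in place of the `random.shuffle` call. The fixture caveat above still applies.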
With pytest-xdist, it is a completely different story. You have to read up on how pytest-xdist distributes the tests across the slaves.
First, every slave collects the same tests in exactly the same order. If the order is different, pytest-xdist will fail. So you cannot reorder them on collection.
Second, the master process sends the indexes of tests in that collected list as the next tests to execute. So the collection must stay unchanged the whole time.
Third, you can redefine the pytest_xdist_make_scheduler hook. There are a few sample implementations in pytest-xdist itself. You can define your own logic for scheduling the tests in the schedule() method, using the nodes added/removed and the tests collected via the corresponding methods.
Fourth, that would be too easy. The .schedule() method is called only on the slave_collectionfinish event sent from the slave. I'm sad to say this, but you will have to kill and restart the slave processes all the time to trigger this event and re-schedule the remaining tests.
As you can see, an implementation on top of pytest-xdist will be large and complex. But I hope this gives you some hints on where to look.
Does anybody know how to get list of categories (provided with 'where' filter to nunit-console) at runtime?
Depending on this, I need to differently initialize the test assembly.
Is there something static like TestExecutionContext that may contain such information?
The engine doesn't pass information on to the framework about "why" it's running a particular test... i.e. if it's running all tests or if it was selected by name or category. That's deliberately kept as something the test doesn't know about with the underlying philosophy being that tests should just run based on the data provided to them.
On some platforms, it's possible to get the command line that ran the test. With that information you could decode the various options and draw some conclusions, but it seems it would be easier to restructure the tests so they don't need this information.
As a secondary reason, it would also be somewhat complicated to supply the info you want and to use it. A test may have multiple categories. Imagine a test selected because two categories matched, for example!
Is it possible that what you really want to do is to pass some parameters to your tests? There is a facility for doing that of course.
I think this is a bit of an XY problem. Depending on what you are actually trying to accomplish, the best approach is likely to be different. Can you edit to tell us what you are trying to do?
UPDATE:
Based on your comment, I gather that some of your initialization is both time-consuming and not needed unless certain tests are run.
Two approaches to this (or combine them):
1. Do less work in all your initialization (i.e. TestCase, TestCaseSource, SetUpFixture). It's generally best not to create your classes under test or initialize databases there. Instead, simply leave strings, ints, etc., which allow the actual test to do the work if and only if it is run.
2. Use a SetUpFixture in some namespace containing all the tests that require that particular initialization. If you don't run any tests from that namespace, then the initialization won't be done.
Of course both of the above may entail a large refactoring of your tests, but the legacy app won't have to be changed.
I'm using the Asana REST API to iterate over workspaces, projects, and tasks. After I achieved the initial crawl over the data, I was surprised to see that I only retrieved the top-level tasks. Since I am required to provide the workspace and project information, I was hoping not to have to recurse any deeper. It appears that I can recurse on a single task with the /subtasks endpoint and re-query... wash/rinse/repeat... but that amounts to a potentially massive number of REST calls (one for each subtask to see if it, in turn, has subtasks to query, and so on).
I can partially mitigate this by adding to the opt_fields query parameter something like:
&opt_fields=subtasks,subtasks.subtasks
However, this doesn't scale well. It means I have to elongate the query for each layer of depth. I suppose I could say "don't put tasks deeper than x layers deep" - but that seems to fly in the face of Asana's functionality and design. Also, since I need lots of other properties, it requires me to make a secondary query for each node in the hierarchy to gather those. Ugh.
I can use the path method to try to mitigate this a bit:
&opt_fields=(this|subtasks).(id|name|etc...)
but again, I have to do this for every layer of depth. That's impractical.
There's documentation about this great REPEATER + operator. Supposedly it would work like this:
&opt_fields=this.subtasks+.name
That is supposed to apply to ALL subtasks anywhere in the hierarchy. In practice, this is completely broken, and the REST API chokes and returns only the ids of the top-level tasks. :( Apparently their documentation is just wrong here.
The only method that seems remotely functional (if not practical) is to iterate first on the top-level tasks, being sure to include opt_fields=subtasks. Whenever this is a non-empty array, I would need to recurse on that task, query for its subtasks, and continue in that manner, until I reach a null subtasks array. This could be of arbitrary depth. In practice, the first REST call yields me (hopefully) the largest number of tasks, so the individual recursion may be mitigated by real data... but it's a heck of an assumption.
I also noticed that the limit parameter applies ONLY to the top-level tasks. If I choose to expand the subtasks, say, I could get a thousand tasks back instead of 100. The call could time out if the data is too large. The safest thing to do would be to request only the ids of subtasks until recursion and, as always, ask for all the desired top-level properties at that time.
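To make the recursion concrete, here is a minimal sketch of the traversal I have in mind, written as a queue rather than true recursion so depth is not a problem. It makes no HTTP calls: `fetch_subtask_ids` is a hypothetical callable standing in for one request to the subtasks endpoint that returns only ids, and the dictionary shape of the result is my own choice.

```python
from collections import deque


def flatten_tasks(top_level_ids, fetch_subtask_ids):
    """Walk the task hierarchy and return a flat list with parent ids.

    top_level_ids:     ids returned by the initial project query
    fetch_subtask_ids: hypothetical callable, task id -> list of subtask ids
                       (one REST call per task in a real client)
    """
    flat = []
    queue = deque((task_id, None) for task_id in top_level_ids)
    while queue:
        task_id, parent_id = queue.popleft()
        flat.append({"id": task_id, "parent": parent_id})
        # An empty list ends the descent for this branch.
        for child_id in fetch_subtask_ids(task_id):
            queue.append((child_id, task_id))
    return flat
```

This still costs one call per task, which is exactly the waste I'd like to avoid.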
All of this seems incredibly wasteful - what I really want is a flat list of tasks which include the parent.id and possibly a list of subtasks.id - but I don't want to query for them hierarchically. I also want to page my queries with rational data sizes in mind. I'd like to get 100 tasks at a time until Asana runs out - but that doesn't seem possible, since the limit only applies to top-level items.
Unfortunately the repeater didn't solve my problem, since it just doesn't work. What are other people doing to solve this problem? And, secondarily, can anyone with intimate Asana insight provide any hope of getting a better way to query?
While I'm at it, a suggested way to design this: the task endpoint should not require workspace or project predicate. I should be able to filter by them, but not be required to. I am limited to 100 objects already, why force me to filter unnecessarily? In the same vein - navigating the hierarchy of Asana seems an unnecessary tax for clients who are not Asana (and possibly even the Asana UI itself).
Any ideas or insights out there?
Have you ensured that the + you send is URL-encoded? Whatever library you are using should usually handle this (which language are you using, btw? We have some first-party client libraries available)
Try &opt_fields=this.subtasks%2B.name if you're creating the URL manually, or (better yet) use a library that correctly encodes URL query parameters.
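As an illustration of letting a library do the encoding, here is a small sketch using only Python's standard library. Python is just an assumption here, since the question doesn't say which language is in use.

```python
from urllib.parse import urlencode

# The raw value contains a literal "+", which would otherwise be read as a space.
params = {"opt_fields": "this.subtasks+.name"}
query = urlencode(params)
print(query)  # opt_fields=this.subtasks%2B.name
```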
In the application I'm working on, a user can perform a "transition" which consists of "steps". A step can have an arbitrary number of dependencies on other steps. I'd like to be able to call a transition and have the steps execute in parallel as separate Celery tasks.
Ideally, I'd like something along the lines of celery-tasktree, except for directed acyclic graphs in general, rather than only trees, but it doesn't appear that such a library exists as yet.
The first solution that comes to mind is a parallel adaptation of a standard topological sort - rather than determining a linear ordering of steps which satisfy the dependency relation, we determine the entire set of steps that can be executed in parallel at the beginning, followed by the entire set of steps that can be executed in round 2, and so on.
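A minimal sketch of that round-based approach, in plain Python with no Celery involved. The `deps` format is an assumption: a mapping from each step to the set of steps it depends on, with every step present as a key.

```python
def parallel_rounds(deps):
    """Group steps into rounds; all steps in a round can run in parallel.

    deps: mapping step -> set of steps it depends on (every step is a key)
    """
    pending = {step: set(d) for step, d in deps.items()}
    rounds = []
    while pending:
        # A step is ready once none of its dependencies are still pending.
        ready = sorted(s for s, d in pending.items() if not d)
        if not ready:
            raise ValueError("dependency cycle detected")
        rounds.append(ready)
        for s in ready:
            del pending[s]
        for d in pending.values():
            d.difference_update(ready)
    return rounds
```

Each round could then be dispatched as one batch, with the next round starting only after the whole batch has finished.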
However, this is not optimal when tasks take a variable amount of time and workers have to idle waiting for a longer running task while there are tasks that are now ready to run. (For my specific application, this solution is probably fine for now, but I'd still like to figure out how to optimise this.)
As noted in https://cs.stackexchange.com/questions/2524/getting-parallel-items-in-dependency-resolution, a better way is to operate directly off the DAG: after each task finishes, check whether any of its dependent tasks are now able to run, and if so, run them.
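To illustrate the idea outside of Celery, here is a rough sketch that uses a thread pool from the standard library as a stand-in for the workers. The `deps` format (step mapped to the set of steps it depends on, every step present as a key) and the `run_step` callable are assumptions for the example.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_dag(deps, run_step, max_workers=4):
    """Run each step as soon as all of its dependencies have finished.

    deps:     mapping step -> set of steps it depends on (every step is a key)
    run_step: callable executed once per step (stand-in for a Celery task)
    Returns the steps in the order they were observed to complete.
    """
    remaining = {step: set(d) for step, d in deps.items()}
    dependents = {step: [] for step in deps}
    for step, d in deps.items():
        for dep in d:
            dependents[dep].append(step)

    finished = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Start everything that has no dependencies at all.
        running = {pool.submit(run_step, s): s
                   for s, d in remaining.items() if not d}
        while running:
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                step = running.pop(future)
                future.result()  # re-raise any exception from the step
                finished.append(step)
                # Notify dependents; submit those that just became ready.
                for child in dependents[step]:
                    remaining[child].discard(step)
                    if not remaining[child]:
                        running[pool.submit(run_step, child)] = child
    if len(finished) != len(deps):
        raise ValueError("dependency cycle detected")
    return finished
```

Unlike the round-based approach, no worker waits for an entire round to finish; a step is submitted the moment its last dependency completes.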
What would be the best way to go about implementing something like this? It's not clear to me that there's an easy way to do this.
From what I can tell, Celery's group/chain/chord primitives aren't flexible enough to allow me to express a full DAG - though I might be wrong here?
I'm thinking I could create a wrapper for tasks which notifies dependent tasks once the current task finishes - I'm not sure what the best way to handle such a notification would be though. Accessing the application's Django database isn't particularly neat, and would make it hard to spin this out into a generic library, but Celery itself doesn't provide obvious mechanisms for this.
I also faced this problem, and I couldn't find a better solution or library except for one. For anyone still interested, you can check out
https://github.com/selinon/selinon. Although it's only for Python 3, it seems to be the only thing that does exactly what you want.
Airflow is another option, but Airflow is used in a more static environment, just like other DAG libraries.
In all the tutorials I've read for Test::Class, there seems to be one runner script that loads all of the classes. And I think from the perspective of Test::Harness this is just one giant test. I don't think it can parallelize the tests inside the runner.
My X problem is that I am trying to factor out superclass behaviors when testing subclasses. Each subclass should have its own subclass test (that can be parallelized), but also exercise behaviors inherited from the superclass. How does one do that?
Edit: I found these two posts from 2007 that seem to imply that what I'm asking for is incompatible/not possible. Any update since then?
http://www.hexten.net/pipermail/tapx-dev/2007-October/001756.html (speculation for Test::Class to support parallelism)
http://perlbuzz.com/2007/08/organizing-tests-with-testclass.html (implying that Test::Class and Test::Harness are ideologically exclusive)
Test::Class doesn't support parallelisation on its own. Probably the easiest solution would be to have separate .t runners for each of your test classes (or for logical groups of test classes), and run using e.g. prove -j9.
If you really want to run all of the tests in parallel you could write a simple script to auto-generate a .t file for each test class. You'd lose the performance benefit of running multiple test classes within a single perl interpreter, but the parallelisation might compensate for the additional startup overhead. And I'd argue that no matter how much Test::Class tries to provide test isolation, it's impossible to guarantee that in Perl. Once you start taking advantage of modifying the symbol table for mocking purposes etc, your tests will start to interfere with each other unless you always get the cleanup right. Running each test in a separate perl interpreter is a better way to provide guaranteed isolation.
To make Test::Class run in parallel, I used the following mechanism. I hope it helps.
I used the Parallel::ForkManager module to invoke the tests, and parameterized the TEST_METHOD environment variable so that only the required tests run in each child process, in parallel.
This isolates the tests from one another because each test is invoked independently, and the parent process waits until all the child processes have completed.