Pytest live logging with parallel execution - possible? - pytest

I have a test suite that I run with
python3 -mpytest --log-cli-level=DEBUG ...
on the build server. The live logs are useful to troubleshoot if the tests get stuck or are slow for some reason (the tests use external resources).
To speed things up, it is possible to run them with e.g.
python3 -mpytest -n 4 --log-cli-level=DEBUG ...
to have four parallel test runners. The speedup is almost linear with the number of processes, which is great, but unfortunately the parent process swallows all the live logs. I get the captured logs in case of a test failure, but I need the live logs as well to understand what is going on in real time. I understand that the output from the four parallel runs will be intermixed, and that is fine. The purpose is for the committer to just check the build server output and know roughly what is going on.
I am currently using pytest-xdist, but I don't use any of its more advanced features (just the parallel execution).
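One possible workaround is to have each xdist worker write its live log to its own file and tail those files on the build server. A minimal sketch, assuming a conftest.py at the test root (PYTEST_XDIST_WORKER is the environment variable pytest-xdist sets in each worker process, e.g. "gw0"):

# conftest.py -- sketch: duplicate live logs into one file per xdist worker
import logging
import os

def pytest_configure(config):
    worker = os.environ.get("PYTEST_XDIST_WORKER", "main")
    handler = logging.FileHandler(f"live-{worker}.log", mode="w")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
    root = logging.getLogger()
    root.addHandler(handler)
    root.setLevel(logging.DEBUG)

On the build server, tail -f live-gw*.log then shows the interleaved live output while the run is in progress.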

Related

Does ReactTestingLibrary take up a lot of CPU/RAM, causing tests to timeout & fail?

I've had some issues with tests timing out randomly, usually on CircleCI but sometimes locally. Based on Kent Dodds's suggestion to write fewer, longer tests, I now have more tests with multiple clicks and multiple network requests (mocking fetch too). These tests seem to time out. Just recently CircleCI added a Resources tab to the pipeline with some interesting metrics. When the tests time out, the 4 GB of RAM clearly sits at 100% for an extended time, and the test fails. On a passing test, RAM stays mostly below 100%.
Failed test (4GB):
Passed test (4GB):
Updated resource_class to 8GB
I tried a single experiment where I updated my CircleCI config so that the resource_class is set to large (8 GB). The test passed, and CPU usage was even better.
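For reference, a minimal sketch of that config change (the job name, image, and test command are placeholders, not from the original post):

# .circleci/config.yml (sketch)
jobs:
  test:
    docker:
      - image: cimg/node:lts   # placeholder image
    resource_class: large      # 8 GB RAM / 4 vCPUs; the default "medium" is 4 GB / 2 vCPUs
    steps:
      - checkout
      - run: yarn test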
So, does React Testing Library take up a lot of horsepower?
Is our default 4GB RAM docker image ok?

Protractor - Run an instance for each spec file

So, I'm new to the testing world, haha.
I have some spec files, and I'm running 4 instances for about 10 spec files.
I would like to know if it is a good idea to create an instance to run each file.
I know that if I have 10 files, doing that is OK.
But what if I have 30 files?
Set up 30 instances, one for each?
Is that a good idea?
Thanks, guys!
By instance I assume you are talking about running your tests in parallel. Running tests in parallel is meant to speed up the time it takes to execute your test suite. How many tests you can reliably run in parallel depends on your setup. At some point, if you have too many tests running in parallel, tests will begin to time out. If 30 instances for 30 tests will reliably run on your machine, then that is what you should do. But it defeats the purpose if tests are timing out from too much stuff going on.

Running tests in parallel on multiple machines using py.test

I know that UI tests can be run in parallel on multiple machines using selenium grid. How about API tests?
I looked at the pytest-xdist plugin, and it can run tests in parallel on the local machine using py.test -n NUM, which will send tests to multiple CPUs and run them in parallel. This may not be as effective or fast if the number of tests we would like to run in parallel is much greater than the number of CPUs on the machine. For example: the machine has 4 CPUs and we would like to run 50 tests in parallel.
And it seems that to run the tests on a remote machine we need to do something like
py.test -d --tx socket=192.168.1.102:8888 --rsyncdir mypkg mypkg
I am wondering if there is a way to distribute the tests to multiple remote machines and run them in parallel. For example: if I have 1000 tests and 50 remote machines, then I would like each remote machine to run one or more tests at the same time so that the tests complete faster. That means all 1000 tests would complete in the time it takes to run 20 tests or fewer.
Thanks.
It looks like you want the load distribution mode, followed by multiple invocations of the --tx argument:
py.test --dist=load --tx socket=192.168.1.110:8888 --tx socket=192.168.1.111:8888 --tx socket=192.168.1.112:8888 --rsyncdir mypkg mypkg
I'm sure you've looked at the CPU usage of the Python processes when running the tests. If you are doing what I expect you are doing (running an integration test suite against a single instance of a network service with high response times), your test suite isn't CPU bound but is actually I/O bound. For this type of workload, most of the elapsed time is spent waiting for a response from the system under test rather than doing work on the CPU.
The biggest problem I've encountered when parallelizing that type of test suite is that the order in which tests complete sometimes matters, and when run in parallel, tests finish in a different order than when they run in series, just due to variation in response times, causing intermittent and difficult-to-troubleshoot test failures.
If that doesn't happen with multiple cores on a single machine, that's a good sign your plan will work. That having been said, because there is operational overhead involved in keeping any pool of hosts around (patching, configuration, provisioning, and networking, not to mention other unexpected issues), I suggest you try something different.
I think you should consider refactoring your test code to use asynchronous I/O instead of setting up the test grid. When you do this correctly, multiple tests will be able to run on one core at the same time. Your sysadmin (who may be you!) will thank you.
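A minimal sketch of that idea, using only the Python standard library, with asyncio.sleep standing in for the slow network calls (not tied to any particular test framework):

import asyncio

async def check_endpoint(name):
    # Placeholder for an HTTP call to the system under test; while this
    # coroutine waits, the event loop runs the other checks.
    await asyncio.sleep(1.0)
    return f"{name}: ok"

async def main():
    # Fifty I/O-bound checks finish in roughly the time of one, on a single core.
    results = await asyncio.gather(*(check_endpoint(f"test_{i}") for i in range(50)))
    print(len(results), "checks finished")

asyncio.run(main())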

why salt-cloud is so slow comparing to terraform?

I'm comparing salt-cloud and terraform as tools to manage our infrastructure at GCE. We use salt stack to manage VM configurations, so I would naturally prefer to use salt-cloud as an integral part of the stack and phase out terraform as a legacy thing.
However, my use case is sensitive to VM deployment time, because we offer a PaaS solution with VMs deployed on customer request, so we need to deliver ready VMs at the click of a button within seconds.
And what puzzles me is why salt-cloud takes so long to deploy basic machines.
I created a simple head-to-head test deploying three VMs based on the default CentOS 7 image using both terraform and salt-cloud (both in parallel mode). The time difference is stunning: where terraform needs around 30 seconds to deploy the requested machines (which is similar to the time needed to deploy through the GCE GUI), salt-cloud takes around 220 seconds to deploy exactly the same machines under the same account in the same zone. Especially strange is that for the first 130 seconds salt-cloud does not start deploying and seemingly does nothing at all; only after around 130 seconds does it show the message that it is deploying VMs, and those VMs appear in the GUI as being deployed.
Is there something obvious that I'm missing about salt-cloud that makes it so slow? Can it be sped up somehow?
I would prefer to use the full Salt stack, but with its current speed issues I cannot really afford that.
Note that this answer is speculation based on what I understood about terraform and salt-cloud; I haven't verified it with an experiment!
I think the reason is that Terraform keeps state of the previous run (either locally or remotely), while salt-cloud doesn't keep state and so queries the cloud before actually provisioning anything.
One of these two approaches (keeping state, or querying the cloud before doing anything) is needed, since both tools are idempotent (you can run them multiple times safely).
For example, I think that if you remove Terraform's state file and re-run it, it will assume there is nothing in the cloud and will actually instantiate a duplicate. This is not to imply that terraform does it wrong; it is to show that state is important, and the Terraform docs say clearly that when operating in a team the state should be stored remotely, exactly to avoid this kind of problem.
Following my line of thought, this should also mean that if you either run salt-cloud in verbose debug mode or look at the network traffic it generates, in the first 130 seconds you mention (before it says "deploying VMs") you should see queries from salt-cloud to the cloud provider to dynamically reconstruct the state.
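For example, re-running the deployment with debug logging should make that activity visible (the map file path here is just a placeholder):
salt-cloud -m /etc/salt/cloud.map -P -l debug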
One last point: the fact that salt-cloud doesn't store the state of a previous run doesn't mean that it is automatically safe to use in a team environment. It is safe to use as long as no two team members run it at the same time. On the other hand, terraform with remote state on Consul allows locking, for example, so that concurrent team usage will always be safe.

Matlab/Simulink: run batch of simulations in parallel?

I have to run a series of simulations and save the results. Since by default Matlab only uses one core, I wonder if it is possible to open multiple worker tasks and assign different simulation runs to them?
You could run each simulation in a separate MATLAB instance and let the OS handle the process to core assignment.
One master MATLAB instance could synchronize the child instances, for example by checking whether the simulation result files exist.
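A minimal sketch of that approach from a shell, where run_sim.m is a hypothetical simulation script taking a run index (-nodisplay, -nosplash, and -r are standard MATLAB command-line options):

matlab -nodisplay -nosplash -r "run_sim(1); exit" &
matlab -nodisplay -nosplash -r "run_sim(2); exit" &
matlab -nodisplay -nosplash -r "run_sim(3); exit" &
matlab -nodisplay -nosplash -r "run_sim(4); exit" &
wait

wait blocks until all four background MATLAB processes have finished; the master MATLAB instance could instead poll for the result files as described above.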
I also had the same problem, but I did not manage to really understand how to do it in MATLAB; the MATLAB documentation is too advanced for me to figure it out from.
Since I am working on Ubuntu, I found a way to do the work by calling the unix command from MATLAB and using GNU parallel.
So I managed to run my simulations in parallel on 4 cores.
unix('parallel --progress -j4 flow > /dev/null :::: Pool.txt','-echo')
You can find more info in this link:
Shell, run four processes parallel
Details of the syntax can be found at https://www.gnu.org/software/parallel/
but briefly, I can tell you:
--progress shows the progress status
-j4 sets the number of jobs you want to run in parallel
flow is the name of my simulator
/dev/null was just to keep the simulator's screen output from showing up
Pool.txt is a file I made with the required simulator input, which is basically the path and the main simulator file
-echo: I do not remember now what it was for :D