Why do I need to know how many tests I will be running with Test::More? - perl

Am I a bad person if I use use Test::More qw(no_plan)?
The Test::More POD says
Before anything else, you need a testing plan. This basically declares how many tests your script is going to run to protect against premature failure...
use Test::More tests => 23;
There are rare cases when you will not know beforehand how many tests your script is going to run. In this case, you can declare that you have no plan. (Try to avoid using this as it weakens your test.)
use Test::More qw(no_plan);
But premature failure can be easily seen when there are no results printed at the end of a test run. It just doesn't seem that helpful.
So I have 3 questions:
What is the reasoning behind requiring a test plan by default?
Has anyone found this a useful and time saving feature in the long run?
Do other test suites for other languages support this kind of thing?

What is the reason for requiring a test plan by default?
ysth's answer links to a great discussion of this issue which includes comments by Michael Schwern and Ovid who are the Test::More and Test::Most maintainers respectively. Apparently this comes up every once in a while on the perl-qa list and is a bit of a contentious issue. Here are the highlights:
Reasons to not use a test plan
It's annoying and takes time.
It's not worth the time because test scripts won't die without the test harness noticing except in some rare cases.
Test::More can count tests as they happen
If you use a test plan and need to skip tests, then you have the additional pain of needing a SKIP{} block.
Reasons to use a test plan
It only takes a few seconds to do. If it takes longer, your test logic is too complex.
If there is an exit(0) in the code somewhere, your test will complete successfully without running the remaining test cases. An observant human may notice the screen output doesn't look right, but in an automated test suite it could go unnoticed.
A developer might accidentally write test logic so that some tests never run.
You can't really have a progress bar without knowing ahead of time how many tests will be run. This is difficult to do through introspection alone.
The alternative
Test::Simple, Test::More, and Test::Most have a done_testing() method which should be called at the end of the test script. This is the approach I take currently.
This fixes the problem where code has an exit(0) in it. It doesn't fix the problem of logic which unintentionally skips tests though.
In short, it's safer to use a plan, but the chances of this actually saving the day are low unless your test suites are complicated (and they should not be complicated).
So using done_testing() is a middle ground. It's probably not a huge deal whatever your preference.
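As a minimal sketch of the done_testing() approach, the script below declares no plan up front and lets Test::More emit the plan at the end. The add() function is a hypothetical stand-in for the code under test, and a Test::More recent enough to provide done_testing() is assumed.

```perl
use strict;
use warnings;
use Test::More;    # no plan declared up front

# Hypothetical function under test, defined inline so the sketch is self-contained.
sub add { my ($x, $y) = @_; return $x + $y }

is( add(2, 3),  5, 'adds two positive numbers' );
is( add(-1, 1), 0, 'adds a negative and a positive number' );

# Emits the plan after the tests have run. If the script exits before
# reaching this line, no plan is printed and the run is reported as incomplete.
done_testing();
```

Note that this still cannot detect a test that was silently skipped by faulty logic earlier in the script; only a numeric plan (or done_testing($n)) catches that.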
Has this feature been useful to anyone in the real world?
A few people mention that this feature has been useful to them in the real world. This includes Larry Wall. Michael Schwern says the feature originates with Larry, more than 20 years ago.
Do other languages have this feature?
None of the xUnit type testing suites has the test plan feature. I haven't come across any examples of this feature being used in any other programming language.

I'm not sure what you are really asking because the documentation extract seems to answer it. I want to know if all my tests ran. However, I don't find that useful until the test suite stabilizes.
While developing, I use no_plan because I'm constantly adding to the test suite. As things stabilize, I verify the number of tests that should run and update the plan. Some people mention the "test harness" catching that already, but there is no such thing as "the test harness". There's the one that most modules use by default because that's what MakeMaker or Module::Build specify, but the TAP output is independent of any particular TAP consumer.
A couple of people have mentioned situations where the number of tests might vary. I figure out the tests however I need to compute the number then use that in the plan. It also helps to have small test files that target very specific functionality so the number of tests is low.
use vars qw( $tests );
BEGIN {
    $tests = ...; # figure it out
}
# Load after the BEGIN block: a "use" nested inside it would run before the assignment
use Test::More tests => $tests;
You can also separate the count from the loading:
use Test::More;
plan tests => $tests;
The latest TAP lets you put the plan at the end too.
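One way to compute the count before declaring the plan, sketched under the assumption that the tests are driven by a data table. The @cases list is hypothetical; the point is only that the plan follows the data, so adding a case does not require editing a number by hand.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical test cases: [ input string, expected length ].
my @cases = (
    [ 'a',   1 ],
    [ 'abc', 3 ],
    [ '',    0 ],
);

# One test per case, so the plan is derived from the data.
plan tests => scalar @cases;

for my $case (@cases) {
    my ($input, $expected) = @$case;
    is( length($input), $expected, "length of '$input'" );
}
```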

In one comment, you seem to think prematurely exiting will count as a failure, since the plan won't be output at the end, but this isn't the case - the plan will be output unless you terminate with POSIX::_exit or a fatal signal or the like. In particular, die() and exit() will result in the plan being output (though the test harness should detect anything other than an exit(0) as a prematurely terminated test).
You may want to look at Test::Most's deferred plan option, soon to be in Test::More (if it's not already).
There's also been discussion of this on the perl-qa list recently. One thread: http://www.nntp.perl.org/group/perl.qa/2009/03/msg12121.html

Doing any testing is better than doing no testing, but testing is about being deliberate. Stating the number of tests expected gives you the ability to see if there is a bug in the test script that is preventing a test from executing (or making it execute too many times). If you don't run tests under specific conditions, you can use the skip function to declare this:
SKIP: {
    skip $why, $how_many if $condition;
    # ... normal testing code goes here ...
}

I think it's ok to bend the rules and use no_plan when the human cost of figuring out the plan is too high, but this cost is a good indication that the test suite has not been well designed.
Another case where it's useful to have the test plan explicitly defined is when you are writing this kind of test:
$coderef = sub { my $arg = shift; isa_ok $arg, 'MyClass' };
do(@args, $coderef);
and
## hijack our interface to test it's called.
local *MyClass::do = $coderef;
If you don't specify a plan, it's easy to miss out that your test failed and that some assertions weren't run as you expected.

Stating the number of tests explicitly in the plan is a good idea, unless it is too expensive to work out this number. The question has been properly answered already but I wanted to stress two points:
Better than no_plan is to use done_testing()
use Test::More;
... run your tests ...;
done_testing( $number_of_tests_run );
# or done_testing() if the number of tests is not known
This Matt Trout blog entry is interesting, and rants about adding a plan vs cvs conflicts and other issues that make the plan problematic: Why numeric test plans are bad, wrong, and don't actually help anyway

I find it annoying, too, and I usually ignore the number at the very beginning until the test suite stabilizes. Then I just keep it up to date manually. I do like the idea of knowing how many total tests there are as the seconds tick by, as a kind of a progress indicator.
To make counting easier I put the following before each test:
#----- load nonexistent record -----
....
#----- add a new record -----
....
#----- load the new record (by name) -----
....
#----- verify the name -----
etc.
Then I can quickly scan the file and easily count the tests, just looking for the #----- lines. I suppose I could even write something up in Emacs to do it for me, but it's honestly not that much of a chore.
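Counting those marker lines could also be automated. The following is a rough sketch, not the poster's tooling: count_markers() is a hypothetical helper, and the sample text is inlined so the snippet runs on its own.

```perl
use strict;
use warnings;

# Count the lines that start with the "#-----" marker in a test script's text.
sub count_markers {
    my ($text) = @_;
    # The empty-list assignment forces list context so all matches are counted.
    my $count = () = $text =~ /^#-----/mg;
    return $count;
}

# Hypothetical test file contents, inlined for illustration.
my $sample = <<'END';
#----- load nonexistent record -----
ok(1);
#----- add a new record -----
ok(1);
END

print count_markers($sample), "\n";    # prints 2
```

In practice the text would be read from the test file, and the result compared with the number given in the plan.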

It is a pain when doing TDD, because you are writing new tests opportunistically. When I was teaching TDD and the shop used Perl, we decided to use our test suite the no plan way. I guess we could have changed from no_plan to lock down the number of tests. At the time I saw it as more hindrance than help.

Eric Johnson's answer is exactly correct. I just wanted to add that done_testing, a much better replacement for no_plan, was released in Test-Simple 0.87_1 recently. It's an experimental release, but you can download it directly from the previous link.
done_testing allows you to declare the number of tests you think you've run at the end of your testing script, rather than trying to guess it before your script starts. You can read the documentation here.

Related

Which is better in PHP: suppress warnings with '@' or run extra checks with isset()?

For example, if I implement some simple object caching, which method is faster?
1. return isset($cache[$cls]) ? $cache[$cls] : $cache[$cls] = new $cls;
2. return @$cache[$cls] ?: $cache[$cls] = new $cls;
I read somewhere @ takes significant time to execute (and I wonder why), especially when warnings/notices are actually being issued and suppressed. isset() on the other hand means an extra hash lookup. So which is better and why?
I do want to keep E_NOTICE on globally, both on dev and production servers.
I wouldn't worry about which method is FASTER. That is a micro-optimization. I would worry more about which is more readable code and better coding practice.
I would certainly prefer your first option over the second, as your intent is much clearer. Also, best to keep away edge condition problems by always explicitly testing variables to make sure you are getting what you are expecting to get. For example, what if the class stored in $cache[$cls] is not of type $cls?
Personally, if I typically would not expect the index on $cache to be unset, then I would also put error handling in there rather than using ternary operations. If I could reasonably expect that that index would be unset on a regular basis, then I would make class $cls behave as a singleton and have your code be something like
return $cls::get_instance();
The isset() approach is better. It is code that explicitly states the index may be undefined. Suppressing the error is sloppy coding.
According to the article 10 Performance Tips to Speed Up PHP, warnings take additional execution time, and the article also claims the @ operator is "expensive":
Cleaning up warnings and errors beforehand can also keep you from using @ error suppression, which is expensive.
Additionally, the @ will not suppress the errors with respect to custom error handlers:
http://www.php.net/manual/en/language.operators.errorcontrol.php
If you have set a custom error handler function with set_error_handler() then it will still get called, but this custom error handler can (and should) call error_reporting() which will return 0 when the call that triggered the error was preceded by an @. If the track_errors feature is enabled, any error message generated by the expression will be saved in the variable $php_errormsg. This variable will be overwritten on each error, so check early if you want to use it.
@ temporarily changes the error_reporting state, that's why it is said to take time.
If you expect a certain value, the first thing to do to validate it, is to check that it is defined. If you have notices, it's probably because you're missing something. Using isset() is, in my opinion, a good practice.
I ran timing tests for both cases, using hash keys of various lengths, also using various hit/miss ratios for the hash table, plus with and without E_NOTICE.
The results were: with error_reporting(E_ALL) the isset() variant was faster than the @ variant by some 20-30%. Platform used: command line PHP 5.4.7 on OS X 10.8.
However, with error_reporting(E_ALL & ~E_NOTICE) the difference was within 1-2% for short hash keys, and up to 10% for longer ones (16 chars).
Note that the first variant executes 2 hash table lookups, whereas the variant with @ does only one lookup.
Thus, @ is inferior in all scenarios and I wonder if there are any plans to optimize it.
I think you have your priorities a little mixed up here.
First of all, if you want to get a real world test of which is faster - load test them. As stated though suppressing will probably be slower.
The problem here is if you have performance issues with regular code, you should be upgrading your hardware, or optimize the grand logic of your code rather than preventing proper execution and error checking.
Suppressing errors to steal the tiniest fraction of a speed gain won't do you any favours in the long run. Especially if you think that this error may keep happening time and time again, and cause your app to run more slowly than if the error was caught and fixed.

pytest: are pytest_sessionstart() and pytest_sessionfinish() valid hooks?

Are pytest_sessionstart(session) and pytest_sessionfinish(session) valid hooks? They are not described in the dev hook docs or the latest hook docs.
What is the difference between them and pytest_configure(config)/pytest_unconfigure(config)?
In docs it is said:
pytest_configure(config) called after command line options have been parsed and all plugins and initial conftest files been loaded.
and
pytest_unconfigure(config) called before test process is exited.
Session is the same, right?
Thanks!
The bad news is that the situation with sessionstart/configure is not very well specified. Sessionstart in particular is not much documented because the semantics differ if one is in the xdist/distribution case or not. One can distinguish these situations but it's all a bit too complicated.
The good news is that pytest-2.3 should make things easier. If you define a @fixture with scope="session" you can implement a fixture that is called once per process within which tests execute.
For distributed testing, this means once per test slave. For single-process testing, it means once for the whole test run. In either case, if you do a "--collectonly" run, or "-h" or other options that do not involve the running of tests, then fixture functions will not execute at all.
Hope this clarifies.

How should I deal with failing tests for bugs that will not be fixed

I have a complex set of integration tests that uses Perl's WWW::Mechanize to drive a web app and check the results based on specific combinations of data. There are over 20 subroutines that make up the logic of the tests, loop through data, etc. Each test runs several of the test subroutines on a different dataset.
The web app is not perfect, so sometimes bugs cause the tests to fail with very specific combinations of data. But these combinations are rare enough that our team will not bother to fix the bugs for a long time; building many other new features takes priority.
So what should I do with the failing tests? It's just a few tests out of several dozen per combination of data.
1) I can't let it fail because then the whole test suite would fail.
2) If we comment them out, that means we miss out on making that test for all the other datasets.
3) I could add a flag in the specific dataset that fails, and have the test not run if that flag is set, but then I'm passing extra flags all over the place in my test subroutines.
What's the cleanest and easiest way to do this?
Or are clean and easy mutually exclusive?
That's what TODO is for.
With a todo block, the tests inside are expected to fail. Test::More will run the tests normally, but print out special flags indicating they are "todo". Test::Harness will interpret failures as being ok. Should anything succeed, it will report it as an unexpected success. You then know the thing you had todo is done and can remove the TODO flag.
The nice part about todo tests, as opposed to simply commenting out a block of tests, is it's like having a programmatic todo list. You know how much work is left to be done, you're aware of what bugs there are, and you'll know immediately when they're fixed.
Once a todo test starts succeeding, simply move it outside the block. When the block is empty, delete it.
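A minimal sketch of such a TODO block, assuming Test::More. The absolute() function and its bug are hypothetical, chosen only so that the second test fails in a known way.

```perl
use strict;
use warnings;
use Test::More tests => 2;

# Hypothetical function with a known, unfixed bug: negative input is returned as-is.
sub absolute { my ($n) = @_; return $n }

is( absolute(5), 5, 'positive input is unchanged' );

TODO: {
    # $TODO is exported by Test::More; the reason text here is illustrative.
    local $TODO = 'negative input not handled yet';

    # Fails today, but is reported as a TODO failure and does not fail the script.
    is( absolute(-5), 5, 'negative input is made positive' );
}
```

Because $TODO is localized, tests after the closing brace are treated as ordinary tests again.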
I see two major options
disable the test (commenting it out), with a reference to your bug tracking system (i.e. a bug ID), possibly keeping a note in the bug as well that there is a test ready for this bug
move the failing tests into a separate test suite. You could even reverse the failing assertion so you can run the suite, and while it is green the bug is still there, and if it becomes red either the bug is gone or something else is fishy. Of course a link to the bug tracking system and the bug is still a good thing to have.
If you actually use Test::More in conjunction with WWW::Mechanize, case closed (see comment from @daxim). If not, think of a similar approach:
# In your testing module
our $TODO;
# ...
if (defined $TODO) {
# only print warnings
};
# in a test script
local $My::Test::TODO = "This bug is delayed until iteration 42";

Speed improvements for Perl's chameneos-redux in the Computer Language Benchmarks Game

Ever looked at the Computer Language Benchmarks Game (formerly known as the Great Language Shootout)?
Perl has some pretty healthy competition there at the moment. It also occurs to me that there's probably some places that Perl's scores could be improved. The biggest one is in the chameneos-redux script right now—the Perl version runs the worst out of any language: 1,626 times slower than the C baseline solution!
There are some restrictions on how the programs can be made and optimized, and there is Perl's interpreted runtime penalty, but 1,626 times? There's got to be something that can get the runtime of this program way down.
Taking a look at the source code and the challenge, how can the speed be improved?
I ran the source code through the Devel::SmallProf profiler. The profile output is a little too verbose to post here, but you can see the results yourself using $ perl -d:SmallProf chameneos.pl 10000 (no need to run it for 6000000 meetings unless you really want to!) See perlperf for more details on some profiling tools in Perl.
It turns out that using semaphores is the major bottleneck. The lion's share of total CPU time is spent on checking whether a semaphore is locked or not. Although I haven't had enough time to look at why the source code uses semaphores, it may be that you can work around having to use semaphores altogether. That's probably your best shot at improving the code's performance.
As Zaid posted, Thread::Semaphore is rather slow. One optimization could be to use the implicit locks on shared variables instead of them. It should be faster, though I suspect it won't be faster by much.
In general, Perl's threading implementation sucks for any kind of usage that requires a lot of interthread communication. It's very suitable for tasks with little communication (as unlike CPython's threads and CRuby's threads they are actually preemptive).
It may be possible to improve that situation, but we need better primitives.
I have a version based on another version from Jesse Millikian, which I think was never published.
I think it may run ~ 7x faster than the current entry, and uses standard modules all around. I'm not sure if it actually complies with all the rules though.
I've tried the forks module on it, but I think it slows it down a bit.
Anyone tried s/threads/forks/ on the Perl entry? Or Coro / Coro::MP, though the latter would probably trigger the 'interesting alternative implementations' clause.

Is my Rose::DB::Object compile-time too slow?

I'm planning to move from Class::DBI to Rose::DB::Object due to its nice structure and the claim that RDBO is faster than CDBI and DBIC.
However, on my machine (linux 2.6.9-89, perl 5.8.9) RDBO's compile time is much slower than CDBI's:
$ time perl -MClass::DBI -e0
real 0m0.233s
user 0m0.208s
sys 0m0.024s
$ time perl -MRose::DB::Object -e0
real 0m1.178s
user 0m1.097s
sys 0m0.078s
That's a big difference...
Has anyone experienced similar behaviour?
Cheers.
@manni and @john: thanks for the explanation about the modules referenced by RDBO, it surely answers why the compile-time is slower than CDBI.
The application is not running on a persistent environment. In fact it's invoked by several simultaneous cron jobs that run at 2 mins, 5 mins, and x mins interval - so yes, compile-time is crucial here...
Jonathan Rockway's App::Persistent seems interesting, however its (current) limitation to allow only one application running at a time is not suitable for my purpose. Also it has issue when we kill the client, the server process is still running...
Rose::DB::Object simply contains (or references from other modules) much more code than Class::DBI. On the bright side, it also has many more features and is much faster at runtime than Class::DBI. If compile time is a concern for you, then your best bet is to load as little code as possible (or get faster disks).
Another option is to set auto_load_related_classes to false in your Metadata objects. To do this early enough and globally will probably require you to make a Metadata subclass and then set that as the meta_class in your common Rose::DB::Object base class.
Turning auto_load_related_classes off means that you'd have to manually load related classes that you actually want to use in your script. That's a bit of a pain, but it lets you control how many classes get loaded. (If you have heavily interrelated classes, loading a single one can end up pulling all the other ones in.)
You could, perhaps, have an environment variable to control the behavior. Example metadata class:
package My::DB::Object::Metadata;
use base 'Rose::DB::Object::Metadata';
# New class method to handle default
sub default_auto_load_related_classes
{
return $ENV{'RDBO_AUTO_LOAD_RELATED_CLASSES'} ? 1 : 0
}
# Override existing object method, honoring new class-defined default
sub auto_load_related_classes
{
my($self) = shift;
return $self->SUPER::auto_load_related_classes(@_) if(@_);
if(defined(my $value = $self->SUPER::auto_load_related_classes))
{
return $value;
}
# Initialize to default
return $self->SUPER::auto_load_related_classes(ref($self)->default_auto_load_related_classes);
}
And here's how it's tied to your common object base class:
package My::DB::Object;
use base 'Rose::DB::Object';
use My::DB::Object::Metadata;
sub meta_class { 'My::DB::Object::Metadata' }
Then set RDBO_AUTO_LOAD_RELATED_CLASSES to true when you're running in a persistent environment, and leave it false (and don't forget to explicitly load related classes) for command-line scripts.
Again, this will only help if you're currently loading more classes than you strictly need in a particular script due to the default true value of the auto_load_related_classes Metadata attribute.
If compile time is an issue, there are methods to lessen the impact. One is PPerl which makes a normal Perl script into a daemon that is compiled once. The only change you need to make (after installing it, of course) is to the shebang line:
#!/usr/bin/pperl
Another option is to write a client/server model program where the bulk of the work is done by a server that loads the expensive modules, and a thin script just interacts with the server over sockets or pipes.
You should also look at App::Persistent and this article, both of which were written by Jonathan Rockway (aka jrockway).
This looks almost as dramatic over here:
time perl -MClass::DBI -e0
real 0m0.084s
user 0m0.080s
sys 0m0.004s
time perl -MRose::DB::Object -e0
real 0m0.391s
user 0m0.356s
sys 0m0.036s
I'm afraid part of the difference can simply be explained by the number of dependencies in each module:
perl -MClass::DBI -le 'print scalar keys %INC'
46
perl -MRose::DB::Object -le 'print scalar keys %INC'
95
Of course, you should ask yourself how much compilation time really matters for your particular problem. And what source code would be easier to maintain for you.