ScalaCheck collection generator - scalacheck

I'm using ScalaCheck and want to generate a collection of a given size. There is a special function for that in ScalaCheck, Gen.listOfN(size, Gen[T]).
When I print the size of the generated collection inside forAll, it does not always have the defined size. In fact, it only has the given size on the first attempt. For example, with size 6, the collection has 6 elements on the first attempt but only 3 on the second. What am I doing wrong?

It sounds like you might be using an old (pre-1.11.0) version of ScalaCheck. In these versions, the generator boundaries weren't always respected.
When ScalaCheck finds a failing test case for your property, it tries to simplify that test case (make it "smaller"). Nowadays (version >= 1.11.0), ScalaCheck tries to respect for example listOfN when doing this simplification, and not test lists with fewer than n items. However, in some cases it is still not possible for ScalaCheck to know what boundaries a generator had from the start, for example when you use the Gen.map method.
For more information on the cases when ScalaCheck still might simplify test cases in unexpected ways (and what you can do to mitigate it), see: Scalacheck won't properly report the failing case
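As a rough illustration of the point about simplification, the sketch below builds a fixed-size list generator and checks it with Prop.forAllNoShrink, which skips the simplification step entirely. This is one possible mitigation, not necessarily the one the linked answer recommends; the names FixedSizeLists, sixInts and sizeIsSix are made up for this example.

```scala
import org.scalacheck.{Gen, Prop, Test}

object FixedSizeLists {
  // Generator for lists of exactly six integers (element range chosen arbitrarily).
  val sixInts: Gen[List[Int]] = Gen.listOfN(6, Gen.choose(0, 100))

  // forAllNoShrink never simplifies a failing case, so the property body
  // only ever sees lists produced directly by the generator.
  val sizeIsSix: Prop = Prop.forAllNoShrink(sixInts) { xs => xs.size == 6 }

  // Runs the property with ScalaCheck's default parameters.
  def holds: Boolean = Test.check(Test.Parameters.default, sizeIsSix).passed
}
```

The trade-off is that a failing test then reports the original, possibly large, input rather than a simplified one.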

Related

ScalaTest: where Checkers are used and assertions are used

I am going through the Coursera functional programming course and have an assignment where the ScalaTest tests are written using FunSuite and Checkers.
This test framework is new to me, but I have a basic idea of how to use assertions, as I have developed PigUnit tests for a user-defined function using assert.
As Google didn't give me a clear picture of how Checkers is used and how it differs from assert, could anyone clarify where Checkers can be used and why assert would not be used instead?
Thanks
As you know, an assertion is a way of testing that a certain condition holds. These are pretty simple in ScalaTest, as you only need to use assert. For example:
assert(List(1, 2, 3).length == 3)
"Checkers," or, as they are more often called, properties, are a bit different. They are a way to assert that a condition holds for all possible inputs instead of for a single case. For example, here is a property that tests that a list always has a nonnegative length:
check((ls: List[Int]) => ls.length >= 0)
At this point, ScalaTest defers to ScalaCheck to do the heavy lifting. ScalaCheck generates random values for ls in an effort to find one that fails the test. This concept is called property-based testing. You can read more about how to use it in ScalaTest here.
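To make the contrast concrete, here is a minimal sketch that uses ScalaCheck directly (the library ScalaTest delegates to) rather than the ScalaTest Checkers trait, so that it does not depend on a particular ScalaTest version. The object and member names are hypothetical.

```scala
import org.scalacheck.{Prop, Test}

object AssertVersusProperty {
  // A plain assertion checks one hand-picked case.
  def singleCase(): Unit = assert(List(1, 2, 3).length == 3)

  // A property states the same kind of condition for every generated list.
  val nonNegativeLength: Prop = Prop.forAll { (ls: List[Int]) => ls.length >= 0 }

  // Runs the property against randomly generated lists with default parameters.
  def propertyHolds: Boolean =
    Test.check(Test.Parameters.default, nonNegativeLength).passed
}
```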

How should I test "isEqual" method of value object types in BDD?

I'm new to BDD, and even to the whole testing world.
I'm trying to apply BDD practices while writing a simple linear algebra library in Swift. So there will be many value object types like Matrix, Vector, etc. When writing code, I suppose I still need to stick to the TDD principle (am I right?):
Not writing any single line of code without a failing test
To implement a value object type, I need to make it conform to the Equatable protocol and implement its == operator. This adds code, so I need a failing test. How do I write a spec for this kind of scenario?
One may suggest some approach like:
describe("Matrix") {
    it("should be value object") {
        let aMatrix = Matrix<Double>(rows: 3, cols: 2)
        let sameMatrix = Matrix<Double>(rows: 3, cols: 2)
        expect(sameMatrix) == aMatrix
        let differentMatrix = Matrix<Double>(rows: 4, cols: 2)
        expect(differentMatrix) != aMatrix
    }
}
This would be ugly boilerplate for two reasons:
There may be plenty of value object types, and I would need to repeat it for all of them.
There may be plenty of cases that make two objects unequal. Taking the spec above as an example, an implementation of == like return lhs.rows == rhs.rows would pass the test. To reveal this "bug", I need to add another expectation like expect(matrixWithDifferentColumnCount) != aMatrix. And again, this kind of repetition happens for all value object types.
So, how should I test this "isEqual" (or operator==) method elegantly? Or shouldn't I test it at all?
I'm using Swift, with Quick as the testing framework. Quick provides a mechanism called SharedExample to reduce boilerplate. But since Swift is a statically typed language and Quick's shared examples don't support generics, I can't directly use a shared example to test value objects.
I came up with a workaround but don't consider it an elegant one.
Elegance is the enemy of test suites. Test suites are boring. Test suites are repetitive. Test suites are not "DRY." The more clever you make your test suites, the more you try to avoid boilerplate, the more you are testing your test infrastructure instead of your code.
The rule of BDD is to write the test before you write the code. It's a small amount of test code because it's testing a small amount of live code; you just write it.
Yes, that can be taken too far, and I'm not saying you never use a helper function. But when you find yourself refactoring your test suite, you need to ask yourself what the test suite was for.
As a side note, your test doesn't test what it says it does. You're testing that identical constructors create Equal objects and non-identical constructors create non-Equal objects (which in principle is two tests). This doesn't test at all that it's a value object (though it's a perfectly fine thing to test). This isn't a huge deal. Don't let testing philosophy get in the way of useful testing; but it's good to have your titles match your test intent.

how to use forAll in scalatest to generate only one object of a generator?

I'm working with ScalaTest and ScalaCheck, and also with FeatureSpec.
I have a generator class that generates objects for me and looks something like this:
import org.scalacheck.{Arbitrary, Gen}

object InvoiceGen {
  def myObj = for {
    country <- Gen.oneOf(Seq("France", "Germany", "United Kingdom", "Austria"))
    // type is a reserved word in Scala, so it needs backticks to be used as an identifier
    `type` <- Gen.oneOf(Seq("Communication", "Restaurants", "Parking"))
    amount <- Gen.choose(100, 4999)
    number <- Gen.choose(1, 10000)
    valid <- Arbitrary.arbitrary[Boolean]
  } yield SomeObject(country, `type`, "1/1/2014", amount, number.toString, 35, "something", documentTypeValid, valid, "")
}
Now I have the test class, which works with FeatureSpec and everything else I need to run the tests.
In this class I have scenarios, and in each scenario I want to generate a different object.
From what I understand, it is better to use forAll to generate an object, but forAll is not guaranteed to give you an object, so you can add minSuccessful(1) to make sure you get at least one object.
I did it like this and it works:
scenario("some scenario") {
  forAll(MyGen.myObj, minSuccessful(1)) { someObject =>
    Given("A connection to the system")
    loginActions shouldBe 'Connected
    When("something")
    //blabla
    Then("something should happened")
    //blabla
  }
}
but I'm not sure exactly what it means.
What I want is to generate an invoice in each scenario and perform some actions on it.
I'm not sure why I should care whether the generation worked or not; I just want a generated object to work with.
TL;DR: To get one object, and only one, use myObj.sample.get. Unless your generator is doing something fancy, that's perfectly safe and won't blow up.
I presume that your intention is to run some kind of integration/acceptance test with some randomly generated domain object (in other words, to (ab)use scalacheck as a simple data generator), and that you hope minSuccessful(1) will ensure the test only runs once.
Be aware that this is not the case! scalacheck will run your test multiple times if it fails, to try to shrink the input data to a minimal counterexample.
If you'd like to ensure that your test runs only once you must use sample.
However, if running the test multiple times is fine, prefer minSuccessful(1) to "succeed fast" but still profit from minimized counterexamples in case the test fails.
Gen.sample returns an option because generators can fail:
ScalaCheck generators can fail, for instance if you're adding a filter (listingGen.suchThat(...)), and that failure is modeled with the Option type.
But:
[…] if you're sure that your generator never will fail, you can simply call Option.get like you do in your example above. Or you can use Option.getOrElse to replace None with a default value.
Generally if your generator is simple, i.e. does not use generators that could fail and does not use any filters on its own, it's perfectly safe to just call .get on the option returned by .sample. I've been doing that in the past and never had problems with it. If your generators frequently return None from .sample they'd likely make scalacheck fail to successfully generate values as well.
If all that you want is a single object use Gen.sample.get.
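A minimal sketch of drawing a single value with sample, assuming a simplified, hypothetical Invoice case class in place of the asker's SomeObject. It uses getOrElse with an arbitrary fallback rather than get, so it cannot throw even if sample returns None.

```scala
import org.scalacheck.Gen

object SingleSample {
  // Hypothetical, simplified stand-in for the domain object in the question.
  final case class Invoice(country: String, amount: Int)

  val invoiceGen: Gen[Invoice] = for {
    country <- Gen.oneOf("France", "Germany", "United Kingdom", "Austria")
    amount  <- Gen.choose(100, 4999)
  } yield Invoice(country, amount)

  // sample returns Option[Invoice]; the fallback value is an arbitrary choice
  // for this sketch and is only used if the generator fails to produce a value.
  def oneInvoice(): Invoice =
    invoiceGen.sample.getOrElse(Invoice("France", 100))
}
```

Each call to oneInvoice() draws exactly one value and involves no property checking or shrinking.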
minSuccessful has a very different meaning: It's the minimal number of successful tests that scalacheck runs—which by no means implies
that scalacheck takes only a single value out of the generator, or
that the test runs only once.
With minSuccessful(1) scalacheck wants one successful test. It'll take samples out of the generator until the test runs at least once—i.e. if you filter the generated values with whenever in your test body scalacheck will take samples as long as whenever discards them.
If the test passes scalacheck is happy and won't run the test a second time.
However if the test fails scalacheck will try and produce a minimal example to fail the test. It'll shrink the input data and run the test as long as it fails and then provides you with the minimized counter example rather than the actual input that triggered the initial failure.
That's an important property of property testing as it helps you to discover bugs: The original data is frequently too large to lend itself for debugging. Minimizing it helps you discover the piece of input data that actually triggers the failure, i.e. corner cases like empty strings that you didn't think of.
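The behaviour described above can be observed by counting how often a property body is evaluated. The following is a sketch using plain ScalaCheck (Test.Parameters.withMinSuccessfulTests is the ScalaCheck counterpart of minSuccessful); RunCounter and run are hypothetical names, and the exact counts depend on the ScalaCheck version and the random input.

```scala
import java.util.concurrent.atomic.AtomicInteger
import org.scalacheck.{Gen, Prop, Test}

object RunCounter {
  // Only ask for a single successful evaluation of the property.
  private val params = Test.Parameters.default.withMinSuccessfulTests(1)

  // Checks a property whose body always returns `outcome` and reports
  // (did the property pass?, how many times was the body evaluated?).
  def run(outcome: Boolean): (Boolean, Int) = {
    val evaluations = new AtomicInteger(0)
    val prop = Prop.forAll(Gen.choose(100, 4999)) { _ =>
      evaluations.incrementAndGet()
      outcome
    }
    val result = Test.check(params, prop)
    (result.passed, evaluations.get)
  }
}
```

Calling RunCounter.run(true) should report a passing property after very few evaluations, whereas RunCounter.run(false) will generally report more evaluations, because the body is re-run on shrunk inputs after the first failure.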
I think the way you want to use ScalaCheck (generate only one object and execute the test for it) defeats the purpose of property-based testing. Let me explain in a bit more detail:
In classical unit-testing, you would generate your system under test, be it an object or a system of dependent objects, with some fixed data. This could e.g. be strings like "foo" and "bar" or, if you needed a name, you would use something like "John Doe". For integers and other data, you can also randomly choose some values.
The main advantage is that these are "plain" values—you can directly see them in the code and correlate them with the output of a failed test. The big disadvantage is that the tests will only ever run with the values you specified, which in turn means that your code is also only tested with these values.
In contrast, property-based testing allows you to just describe what the data should look like (e.g. "a positive integer", "a string of at most 20 characters"). The testing framework will then, with the help of generators, generate a number of matching objects and execute the test for all of them. This way, you can be more confident that your code is actually correct for different inputs, which after all is the purpose of testing: to check whether your code does what it should for the possible inputs.
I have never really worked with ScalaCheck, but a colleague explained to me that it also tries to cover edge cases, e.g. trying 0 and MAX_INT for a positive integer, or an empty string for the aforementioned string of at most 20 characters.
So, to sum it up: Running a property-based test only once for one generic object is the wrong thing to do. Instead, once you have the generator infrastructure in place, embrace the advantage you then have and let your code be checked a lot more times!

Avoid testing duplicate values with ScalaTest forAll

I'm playing with property-based testing on ScalaTest and I had the following code:
val myStrings = Gen.oneOf("hi", "hello")
forAll(myStrings) { s: String =>
println(s"String tested: $s")
}
When I run the forAll code, I've noticed that the same value is tried more than once, e.g.
String tested: hi
String tested: hello
String tested: hi
String tested: hello
String tested: hi
String tested: hello
...
I was wondering if there is a way for, given the code above, for each value in oneOf to be tried only once. In other words, to get ScalaTest not to use the same value twice.
Even if I used other generators, such as Gen.alphaStr, I'd like to find a way to avoid testing the same String twice. The reason I'm interested in doing this is because each test runs against a server running in a different process, and hence there's a bit of cost involved, so I'd like to avoid testing the same thing twice.
What you're trying to do seems to go against the scalacheck ideology (see Note1); however, it is kind of possible (with high probability) by reducing the number of samples:
scala> forAll(oneOf("a", "b")){i => println(i); true}.check(Test.Parameters.default.withMinSuccessfulTests(2))
a
b
+ OK, passed 2 tests.
Note that you can still get aa/bb sometimes, as scalacheck is built on randomness and a statistical approach. If you always need to check all combinations, you probably don't need scalacheck:
scala> assert(Set("a", "b").forall(_ => true))
Basically Gen allows you to create an infinite collection that represents a distribution of input values. The more values you generate - the better sampling you get. So if you have N possible states, you can't guarantee that they won't repeat in an infinite collection.
The only way to do exactly what you want is to explicitly check for duplicates before calling the service. You can use something like Option(ConcurrentHashMap.putIfAbsent(value, value)).isEmpty for that. Keep in mind that this carries a risk of OOM, so keep an eye on the number of generated values and maybe even add an explicit check.
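The duplicate check mentioned above could be sketched as follows. SeenOnce and firstTime are hypothetical names; the map is never cleared, which is exactly the memory risk the answer warns about.

```scala
import java.util.concurrent.ConcurrentHashMap

object SeenOnce {
  // Every value ever tested is kept here, so memory use grows with the
  // number of distinct generated values.
  private val seen = new ConcurrentHashMap[String, String]()

  // putIfAbsent returns null when the key was not present yet, so wrapping
  // the result in Option gives None exactly for first-time values.
  def firstTime(value: String): Boolean =
    Option(seen.putIfAbsent(value, value)).isEmpty
}
```

Inside a property body, the expensive server call would then only be made when SeenOnce.firstTime(s) returns true.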
Note1) scalacheck is meant for reducing the number of combinations from the maximum (which is more than 100) to some value that still gives you a good check. So scalacheck is useful when the set of possible inputs is really huge, and in that case the probability of repetitions is really small.
P.S.
Talking about oneOf (from scaladoc):
def oneOf[T](t0: T, t1: T, tn: T*): Gen[T]
Picks a random value from a list
See also (examples are a bit outdated): How can I reduce the number of test cases ScalaCheck generates?
I would aim to increase the entropy of the values. Using random sentences will increase it a lot, although it does not (theoretically) fix the issue.
val genWord = Gen.oneOf("hi", "hello")
def sentenceOf(words: Int): Gen[String] = {
  Gen.listOfN(words, genWord).map(_.mkString(" "))
}

How do I set the default number of threads for Scala 2.10 parallel collections?

In Scala before 2.10, I could set the parallelism in the defaultForkJoinPool (as in this answer: scala parallel collections degree of parallelism). In Scala 2.10, that API no longer exists. It is well documented that we can set the parallelism on a single collection (http://docs.scala-lang.org/overviews/parallel-collections/configuration.html) by assigning to its tasksupport property.
However, I use parallel collections all over my codebase and would not like to add an extra two lines to every single collection instantiation. Is there some way to configure the global default thread pool size so that someCollection.par.map(f(_)) automatically uses the default number of threads?
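For reference, the per-collection configuration that the question wants to avoid repeating looks roughly like the sketch below. It assumes Scala 2.10 or 2.11, where the pool class is scala.concurrent.forkjoin.ForkJoinPool (on newer versions it is java.util.concurrent.ForkJoinPool, and from 2.13 parallel collections live in a separate module). PerCollectionParallelism and doubled are hypothetical names.

```scala
import scala.collection.parallel.ForkJoinTaskSupport
import scala.concurrent.forkjoin.ForkJoinPool

object PerCollectionParallelism {
  // Doubles every element in parallel on a pool with the given parallelism.
  def doubled(xs: Vector[Int], numThreads: Int): Vector[Int] = {
    val par = xs.par
    // This assignment is the extra step required for each collection.
    par.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(numThreads))
    par.map(_ * 2).seq
  }
}
```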
I know that the question is over a month old, but I've just had exactly the same question. Googling wasn't helpful and I couldn't find anything that looked halfway sane in the new API.
Setting -Dscala.concurrent.context.maxThreads=n as suggested here: Set the parallelism level for all collections in Scala 2.10? seemingly had no effect at all, but I'm not sure if I used it correctly (I run my application with 'java' in an environment without 'scala' installed explicitly, it might be the cause).
I don't know why scala-people removed this essential setter from the appropriate package object.
However, it's often possible to use reflection to work around an incomplete/weird interface:
def setParallelismGlobally(numThreads: Int): Unit = {
  val parPkgObj = scala.collection.parallel.`package`
  val defaultTaskSupportField = parPkgObj.getClass.getDeclaredFields.find {
    _.getName == "defaultTaskSupport"
  }.get
  defaultTaskSupportField.setAccessible(true)
  defaultTaskSupportField.set(
    parPkgObj,
    new scala.collection.parallel.ForkJoinTaskSupport(
      new scala.concurrent.forkjoin.ForkJoinPool(numThreads)
    )
  )
}
For those not familiar with the more obscure features of Scala, here is a short explanation:
scala.collection.parallel.`package`
accesses the package object with the defaultTaskSupport variable (it looks somewhat like Java's static variable, but it's actually a member variable of the package object). The backticks are required for the identifier, because package is a reserved keyword. Then we get the private final field that we want (getField("defaultTaskSupport") didn't work for some reason?...), make it accessible so that we can modify it, and then replace its value with our own ForkJoinTaskSupport.
I don't yet understand the exact mechanism of the creation of parallel collections, but the source code of the Combiner trait suggests that the value of defaultTaskSupport should percolate to the parallel collections somehow.
Notice that the question is qualitatively of the same sort as a much older question: "I have Math.random() all over my codebase, how can I set the seed to a fixed number for debugging purposes?" (See e.g. : Set seed on Math.random() ). In both cases, we have some sort of global "static" variable that we implicitly use in a million different places, we want to change it, but there are no setters for this variable => we use reflection.
Ugly as hell, but seems to work just fine. If you need to limit the total number of threads, don't forget that the garbage collector runs on a separate thread.