How to get ScalaCheck to work on a class with a Seq?

I have a case class that I am trying to test via ScalaCheck. The case class contains other classes.
Here are the classes:
case class Shop(name: String = "", colors: Seq[Color] = Nil)
case class Color(colorName: String = "", shades: Seq[Shade] = Nil)
case class Shade(shadeName: String, value: Int)
I have generators for each one
implicit def shopGen: Gen[Shop] =
for {
name <- Gen.alphaStr.suchThat(_.length > 0)
colors <- Gen.listOf(colorsGen)
} yield Shop(name, colors)
implicit def colorsGen: Gen[Color] =
for {
colorName <- Gen.alphaStr.suchThat(_.length > 0)
shades <- Gen.listOf(shadesGen)
} yield Color(colorName, shades)
implicit def shadesGen: Gen[Shade] =
for {
shadeName <- Gen.alphaStr.suchThat(_.length > 0) //**Note this**
value <- Gen.choose(1, Int.MaxValue)
} yield Shade(shadeName, value)
When I write my test and simply do the following:
property("Shops must encode/decode to/from JSON") {
"test" mustBe "test"
}
The test hangs and then gives up after 51 tries. The error I get is: Gave up after 1 successful property evaluation. 51 evaluations were discarded.
If I remove Gen.alphaStr.suchThat(_.length > 0) from shadesGen and just replace it with Gen.alphaStr then it works.
Question
Why does having Gen.alphaStr work for shadesGen, however, Gen.alphaStr.suchThat(_.length > 0) does not?
Also, when I run the test multiple times (with Gen.alphaStr), some runs pass while others don't. Why is this?

You probably see this behavior because of the way listOf is implemented. Internally it is based on buildableOf, which is in turn based on buildableOfN, which has the following comment:
... If the given generator fails generating a value, the
complete container generator will also fail.
Your data structure is essentially a list of lists, so even one failed generation causes the whole data structure to be discarded. Most of the failures happen at the bottom level, where the most values are generated, which is why removing the filter for shadeName helps. To make it work you should generate only valid strings in the first place. You can replace Gen.alphaStr with a custom generator based on nonEmptyListOf, such as:
def nonemptyAlphaStr: Gen[String] = Gen.nonEmptyListOf(Gen.alphaChar).map(_.mkString)
Another simple workaround is to use retryUntil instead of suchThat:
implicit def shadesGen: Gen[Shade] =
for {
//shadeName <- Gen.alphaStr.suchThat(_.length > 0) //**Note this**
shadeName <- Gen.alphaStr.retryUntil(_.length > 0)
value <- Gen.choose(1, Int.MaxValue)
} yield Shade(shadeName, value)
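For illustration, here is a minimal sketch that applies the nonEmptyListOf idea to all three generators, so that no generator in the chain relies on a suchThat filter. It assumes ScalaCheck is on the classpath. The names NonEmptyGens, nonEmptyAlphaStr, shadeGen, colorGen and shopGen are made up for this example and are not from the question.

```scala
import org.scalacheck.Gen

case class Shade(shadeName: String, value: Int)
case class Color(colorName: String = "", shades: Seq[Shade] = Nil)
case class Shop(name: String = "", colors: Seq[Color] = Nil)

object NonEmptyGens {
  // Builds the string from at least one character, so no length filter is needed.
  val nonEmptyAlphaStr: Gen[String] =
    Gen.nonEmptyListOf(Gen.alphaChar).map(_.mkString)

  val shadeGen: Gen[Shade] = for {
    shadeName <- nonEmptyAlphaStr
    value     <- Gen.choose(1, Int.MaxValue)
  } yield Shade(shadeName, value)

  // listOf fails if any element fails, so the element generator should not filter.
  val colorGen: Gen[Color] = for {
    colorName <- nonEmptyAlphaStr
    shades    <- Gen.listOf(shadeGen)
  } yield Color(colorName, shades)

  val shopGen: Gen[Shop] = for {
    name   <- nonEmptyAlphaStr
    colors <- Gen.listOf(colorGen)
  } yield Shop(name, colors)
}
```

Calling NonEmptyGens.shopGen.sample a few times is a quick way to check that values are produced without being discarded.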

Related

Generator Ordering Causing Infinite Recursion in For Comprehension in Scala [duplicate]

I'm seeing what seems to be a very obvious bug with scalacheck, such that if it's really there I can't see how people use it for recursive data structures.
This program fails with a StackOverflowError before scalacheck takes over, while constructing the Arbitrary value. Note that the Tree type and the generator for Trees is taken verbatim from this scalacheck tutorial.
package treegen
import org.scalacheck._
import Prop._
class TreeProperties extends Properties("Tree") {
trait Tree
case class Node(left: Tree, right: Tree) extends Tree
case class Leaf(x: Int) extends Tree
val ints = Gen.choose(-100, 100)
def leafs: Gen[Leaf] = for {
x <- ints
} yield Leaf(x)
def nodes: Gen[Node] = for {
left <- trees
right <- trees
} yield Node(left, right)
def trees: Gen[Tree] = Gen.oneOf(leafs, nodes)
implicit lazy val arbTree: Arbitrary[Tree] = Arbitrary(trees)
property("vacuous") = forAll { t: Tree => true }
}
object Main extends App {
(new TreeProperties).check
}
What's stranger is that changes that shouldn't affect anything seem to alter the program so that it works. For example, if you change the definition of trees to this, it passes without any problem:
def trees: Gen[Tree] = for {
x <- Gen.oneOf(0, 1)
t <- if (x == 0) {leafs} else {nodes}
} yield t
Even stranger, if you alter the binary tree structure so that the value is stored on Nodes and not on Leafs, and alter the leafs and nodes definition to be:
def leafs: Gen[Leaf] = Gen.value(Leaf())
def nodes: Gen[Node] = for {
x <- ints // Note: be sure to ask for x first, or it'll StackOverflow later, inside scalacheck code!
left <- trees
right <- trees
} yield Node(left, right, x)
It also then works fine.
What's going on here? Why is constructing the Arbitrary value initially causing a stack overflow? Why does it seem that scalacheck generators are so sensitive to minor changes that shouldn't affect the control flow of the generators?
Why isn't my expression above with the oneOf(0, 1) exactly equivalent to the original oneOf(leafs, nodes) ?
The problem is that when Scala evaluates trees, it ends up in an endless recursion since trees is defined in terms of itself (via nodes). However, when you put some other expression than trees as the first part of your for-expression in nodes, Scala will delay the evaluation of the rest of the for-expression (wrapped up in chains of map and flatMap calls), and the infinite recursion will not happen.
Just as pedrofurla says, if oneOf were non-strict this would probably not happen (since Scala wouldn't evaluate the arguments immediately). However, you can use Gen.lzy to be explicit about the laziness. lzy takes any generator and delays its evaluation until it is actually used. So the following change solves your problem:
def trees: Gen[Tree] = Gen.lzy(Gen.oneOf(leafs, nodes))
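To make the evaluation-order argument concrete, here is a small sketch that uses only the standard library. Toy is a made-up stand-in for a generator (it is not ScalaCheck's Gen), and first, second and built are hypothetical names. It shows that the first generator of a for-expression is evaluated as soon as the expression is built, while later ones sit inside a closure until the result is actually run. That is why a nodes definition that starts with left <- trees recurses immediately.

```scala
object ForDesugarDemo {
  // Toy stand-in for a generator: wraps a function that produces a value.
  final case class Toy[A](run: () => A) {
    def map[B](f: A => B): Toy[B] = Toy(() => f(run()))
    def flatMap[B](f: A => Toy[B]): Toy[B] = Toy(() => f(run()).run())
  }

  // Records which generator definitions have been evaluated so far.
  var built: Vector[String] = Vector.empty

  def first: Toy[Int]  = { built :+= "first";  Toy(() => 1) }
  def second: Toy[Int] = { built :+= "second"; Toy(() => 2) }

  // Desugars to first.flatMap(a => second.map(b => a + b)):
  // `first` is called right here, `second` only inside the closure.
  val combined: Toy[Int] = for { a <- first; b <- second } yield a + b

  // Snapshot taken right after construction, before anything is run.
  val builtAtConstruction: Vector[String] = built
}
```

After construction only "first" has been recorded. "second" is recorded once combined.run() is called, which returns 3.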
Even though following Rickard Nilsson's answer above got rid of the constant StackOverflowError on program startup, I'd still hit a StackOverflowError about one time out of three once I actually asked scalacheck to check the properties. (I changed Main above to run .check 40 times, and would see it succeed twice, then fail with a stack overflow, then succeed twice, etc.)
Eventually I had to put in a hard block to the depth of the recursion and this is what I guess I'll be doing when using scalacheck on recursive data structures in the future:
def leafs: Gen[Leaf] = for {
x <- ints
} yield Leaf(x)
def genNode(level: Int): Gen[Node] = for {
left <- genTree(level)
right <- genTree(level)
} yield Node(left, right)
def genTree(level: Int): Gen[Tree] = if (level >= 100) {leafs}
else {leafs | genNode(level + 1)}
lazy val trees: Gen[Tree] = genTree(0)
With this change, scalacheck never runs into a StackOverflowError.
A slight generalization of the approach in Daniel Martin's own answer is to use sized. Something like this (untested; the explicit result types are needed because the methods are mutually recursive):
def genTree(): Gen[Tree] = Gen.sized { size => genTree0(size) }
def genTree0(maxDepth: Int): Gen[Tree] =
if (maxDepth == 0) leafs else Gen.oneOf(leafs, genNode(maxDepth))
def genNode(maxDepth: Int): Gen[Node] = for {
depthL <- Gen.choose(0, maxDepth - 1)
depthR <- Gen.choose(0, maxDepth - 1)
left <- genTree0(depthL)
right <- genTree0(depthR)
} yield Node(left, right)
def leafs: Gen[Leaf] = for {
x <- ints
} yield Leaf(x)
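As a self-contained variant of the depth-bounded idea, the sketch below threads a decreasing depth budget through the recursion and derives the starting budget from Gen.sized. It assumes ScalaCheck is on the classpath. The cap of 8 and the names SizedTrees and depth are arbitrary choices for this example, not something prescribed by ScalaCheck.

```scala
import org.scalacheck.Gen

sealed trait Tree
final case class Node(left: Tree, right: Tree) extends Tree
final case class Leaf(x: Int) extends Tree

object SizedTrees {
  val leafs: Gen[Tree] = Gen.choose(-100, 100).map(x => Leaf(x))

  // The budget shrinks on every recursive call, so building the generator
  // and running it both terminate.
  def genTree(maxDepth: Int): Gen[Tree] =
    if (maxDepth <= 0) leafs
    else Gen.oneOf(leafs, genNode(maxDepth))

  def genNode(maxDepth: Int): Gen[Tree] = for {
    left  <- genTree(maxDepth - 1)
    right <- genTree(maxDepth - 1)
  } yield Node(left, right)

  // Assumption: a depth cap of 8 is plenty for tests; adjust as needed.
  val trees: Gen[Tree] = Gen.sized(size => genTree(math.min(size, 8)))

  // Helper used only to check the bound.
  def depth(t: Tree): Int = t match {
    case Leaf(_)    => 0
    case Node(l, r) => 1 + math.max(depth(l), depth(r))
  }
}
```

Every tree produced by SizedTrees.trees has depth at most 8, which keeps both the recursion and the tree size bounded.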

How do I shrink a list but guarantee it isn't empty?

In ScalaCheck, I have written a generator of non-empty lists of strings,
val nonEmptyListsOfString: Gen[List[String]] =
Gen.nonEmptyListOf(Arbitrary.arbitrary[String])
And then, assume I wrote a property using Prop.forAll,
Prop.forAll(nonEmptyListsOfString) { strs: List[String] =>
strs == Nil
}
This is just a simple example that is meant to fail, so that it can show how the shrinking is done by Scalacheck to find the smallest example.
However, the default shrinker in ScalaCheck doesn't respect the generator, and will still shrink to an empty list, ignoring the generator's constraints.
sbt> test
[info] ! Prop.isEmpty: Falsified after 1 passed tests.
[info] > ARG_0: List()
[info] > ARG_0_ORIGINAL: List("")
[info] Failed: Total 1, Failed 1, Errors 0, Passed 0
[error] Failed tests:
[error] example.Prop
As mentioned in the comment, and re-using the example from the github issue you posted:
import cats.data.NonEmptyList
import org.scalacheck.{Arbitrary, Gen}
import org.scalatest.{FreeSpec, Matchers}
import org.scalatest.prop.PropertyChecks
class ScalaCheckTest extends FreeSpec with PropertyChecks with Matchers{
"Test scalacheck (failing)" in {
val gen: Gen[List[Int]] = for {
n <- Gen.choose(1, 3)
list <- Gen.listOfN(n, Gen.choose(0, 9))
} yield list
forAll(gen) { list =>
list.nonEmpty shouldBe true
if (list.sum < 18) throw new IllegalArgumentException("ups")
}
}
"Test scalacheck" in {
val gen1 = for{
first <- Arbitrary.arbInt.arbitrary
rest <- Gen.nonEmptyListOf(Arbitrary.arbInt.arbitrary)
} yield {
NonEmptyList(first, rest)
}
forAll(gen1) { list =>
val normalList = list.toList
normalList.nonEmpty shouldBe true
if (normalList.sum < 18) throw new IllegalArgumentException("ups")
}
}
}
The first test does fail showing an empty list being used, but the second one does indeed throw the exception.
UPDATE: Cats is obviously not really needed, here I use a simple (and local) version of a non-empty list for the sake of this test.
"Test scalacheck 2" in {
case class FakeNonEmptyList[A](first : A, tail : List[A]){
def toList : List[A] = first :: tail
}
val gen1 = for{
first <- Arbitrary.arbInt.arbitrary
rest <- Gen.nonEmptyListOf(Arbitrary.arbInt.arbitrary)
} yield {
FakeNonEmptyList(first, rest)
}
forAll(gen1) { list =>
val normalList = list.toList
normalList.nonEmpty shouldBe true
if (normalList.sum < 18) throw new IllegalArgumentException("ups")
}
}
There is a way to define your own Shrink class in ScalaCheck. However, it is not common nor very easy to do.
Overview
A Shrink requires defining an implicit definition in scope of your property test. Then Prop.forAll will find your Shrink class if it is in scope and has the appropriate type signature for the value that failed a test.
Fundamentally, a Shrink instance is a function that converts the failing value, x, to a stream of "shrunken" values. Its type signature is roughly:
trait Shrink[T] {
def shrink(x: T): Stream[T]
}
You can define a Shrink with the companion object's apply method, which is roughly this:
object Shrink {
def apply[T](s: T => Stream[T]): Shrink[T] = {
new Shrink[T] {
def shrink(x: T): Stream[T] = s(x)
}
}
}
Example: Shrinking integers
If you know how to work with a Stream collection in Scala, then it's easy to define a shrinker for Int that shrinks by halving the value:
implicit val intShrinker: Shrink[Int] = Shrink {
case 0 => Stream.empty
case x => Stream.iterate(x / 2)(_ / 2).takeWhile(_ != 0) :+ 0
}
We want to avoid returning the original value to ScalaCheck, so that's why zero is a special case.
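To see which candidates the halving strategy offers, the following sketch computes the same sequence with plain standard-library code. It uses Iterator rather than Stream only so that it does not depend on a particular collections version. HalvingDemo and halvings are names invented for this example.

```scala
object HalvingDemo {
  // Candidates a halving shrinker would offer for x: repeated halves, then 0.
  // The original value x itself is never part of the result.
  def halvings(x: Int): List[Int] =
    if (x == 0) Nil
    else (Iterator.iterate(x / 2)(_ / 2).takeWhile(_ != 0) ++ Iterator.single(0)).toList
}
```

For example, HalvingDemo.halvings(20) is List(10, 5, 2, 1, 0), and HalvingDemo.halvings(0) is empty, which matches the special case for zero.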
Answer: Non-empty lists
In the case of a non-empty list of strings, you want to re-use the container shrinking of ScalaCheck, but avoid empty containers. Unfortunately, that's not easy to do, but it is possible:
implicit def shrinkListString(implicit s: Shrink[String]): Shrink[List[String]] = Shrink {
case Nil => Stream.empty[List[String]]
case strs => Shrink.shrink(strs)(Shrink.shrinkContainer).filter(!_.isEmpty)
}
Rather than being a generic container shrinker that avoids empty containers, the one above is specific to List[String]. It could probably be generalized to List[T].
The first pattern match against Nil is probably unnecessary.

Generating ScalaCheck tests with Cucumber JVM - Generic Functions

To avoid X & Y problems, a little background:
I'm trying to set up a web project where I'm going to be duplicating business logic server and client side, client obviously in Javascript and the server in Scala. I plan to write business logic in Cucumber so I can make sure the tests and functionality line up on both sides. Finally, I'd like to have a crack at bringing ScalaCheck and JSCheck into this, generated input data rather than specified.
Basically, the statements would work like this:
Given statements select and add generators.
When statements specify functions to act upon those values in sequence.
Then statements take the input data and the final result data and run a property.
The objective is to make this sort of thing composable so you could specify several generators, a set of actions to run on each of them, and then a set of properties that would each get run on the inputs and result.
I've done this already in JavaScript (technically CoffeeScript), and of course with a dynamic language it is straightforward to do. Basically, what I want to be able to do in my Scala step definitions is this (excuse the arbitrary test data):
class CucumberSteps extends ScalaDsl with EN
with ShouldMatchers with QuickCheckCucumberSteps {
Given("""^an list of integer between 0 and 100$""") {
addGenerator(Gen.containerOf[List, Int](Gen.choose(0,100)))
}
Given("""^an list of random string int 500 and 700$""") {
addGenerator(Gen.containerOf[List, Int](Gen.choose(500,700)))
}
When("""^we concatenate the two lists$""") {
addAction {(l1: List[Int], l2: List[Int]) => l1 ::: l2 }
}
Then("""^then the size of the result should equal the sum of the input sizes$""") {
runProperty { (inputs: (List[Int], List[Int]), result: (List[Int])) =>
inputs._1.size + inputs._2.size == result.size
}
}
}
So the key thing I want to do is create a trait QuickCheckCucumberSteps that will be the API, implementing addGenerator, addAction and runProperty.
Here's what I've roughed out so far, and where I get stuck:
trait QuickCheckCucumberSteps extends ShouldMatchers {
private var generators = ArrayBuffer[Gen[Any]]()
private var actions = ArrayBuffer[""AnyFunction""]()
def addGenerator(newGen: Gen[Any]): Unit =
generators += newGen
def addAction(newFun: => ""AnyFunction""): Unit =
actions += newFun
def buildPartialProp = {
val li = generators
generators.length match {
case 1 => forAll(li(0))_
case 2 => forAll(li(0), li(1))_
case 3 => forAll(li(0), li(1), li(2))_
case 4 => forAll(li(0), li(1), li(2), li(3))_
case _ => forAll(li(0), li(1), li(2), li(3), li(4))_
}
}
def runProperty(propertyFunc: => Any): Prop = {
val partial = buildPartialProp
val property = partial {
??? // Need a function that takes x number of generator inputs,
// applies each action in sequence
// and then applies the `propertyFunc` to the
// inputs and results.
}
val result = Test.check(new Test.Parameters.Default {},
property)
result.status match {
case Passed => println("passed all tests")
case Failed(a, l) => fail(format(pretty(result), "", "", 75))
case _ => println("other cases")
}
}
}
My key issue is this, I want to have the commented block become a function that takes all the added actions, apply them in order and then run and return the result of the property function. Is this possible to express with Scala's type system, and if so, how do I get started? Happy to do reading and earn this one, but I need at least a way forward as I don't know how to express it at this point. Happy to drop in my Javascript code if what I'm trying to make here isn't clear.
If I were you, I wouldn't put ScalaCheck generator code within your Cucumber Given/When/Then statements :). The ScalaCheck api calls are part of the "test rig" - so not under test. Try this (not compiled/tested):
class CucumberSteps extends ScalaDsl with EN with ShouldMatchers {
forAll(Gen.containerOf[List, Int](Gen.choose(0,100)),
Gen.containerOf[List, Int](Gen.choose(500,700)))
((l1: List[Int], l2: List[Int]) => {
var result: List[Int] = Nil
Given(s"""^a list of integer between 0 and 100: $l1 $""") { }
Given(s"""^a list of integer between 500 and 700: $l2 $""") { }
When("""^we concatenate the two lists$""") { result = l1 ::: l2 }
Then("""^the size of the result should equal the sum of the input sizes$""") {
l1.size + l2.size == result.size }
})
}

How to yield a single element from for loop in scala?

Much like this question:
Functional code for looping with early exit
Say the code is
def findFirst[T](objects: List[T]):T = {
for (obj <- objects) {
if (expensiveFunc(obj) != null) return /*???*/ Some(obj)
}
None
}
How to yield a single element from a for loop like this in scala?
I do not want to use find, as proposed in the original question, i am curious about if and how it could be implemented using the for loop.
* UPDATE *
First, thanks for all the comments, but i guess i was not clear in the question. I am shooting for something like this:
val seven = for {
x <- 1 to 10
if x == 7
} return x
And that does not compile. The two errors are:
- return outside method definition
- method main has return statement; needs result type
I know find() would be better in this case; I am just learning and exploring the language. And in a more complex case with several iterators, I think finding with for can actually be useful.
Thanks commenters, i'll start a bounty to make up for the bad posing of the question :)
If you want to use a for loop, which uses a nicer syntax than chained invocations of .find, .filter, etc., there is a neat trick. Instead of iterating over strict collections like list, iterate over lazy ones like iterators or streams. If you're starting with a strict collection, make it lazy with, e.g. .toIterator.
Let's see an example.
First let's define a "noisy" int, that will show us when it is invoked
def noisyInt(i : Int) = () => { println("Getting %d!".format(i)); i }
Now let's fill a list with some of these:
val l = List(1, 2, 3, 4).map(noisyInt)
We want to look for the first element which is even.
val r1 = for(e <- l; v = e() ; if v % 2 == 0) yield v
The above line results in:
Getting 1!
Getting 2!
Getting 3!
Getting 4!
r1: List[Int] = List(2, 4)
...meaning that all elements were accessed. That makes sense, given that the resulting list contains all even numbers. Let's iterate over an iterator this time:
val r2 = (for(e <- l.toIterator; v = e() ; if v % 2 == 0) yield v)
This results in:
Getting 1!
Getting 2!
r2: Iterator[Int] = non-empty iterator
Notice that the loop was executed only up to the point where it could figure out whether the result was an empty or non-empty iterator.
To get the first result, you can now simply call r2.next.
If you want a result of an Option type, use:
if(r2.hasNext) Some(r2.next) else None
Edit: Your second example in this encoding is just:
val seven = (for {
x <- (1 to 10).toIterator
if x == 7
} yield x).next
...of course, you should be sure that there is always at least one solution if you're going to use .next. Alternatively, use headOption, defined for all Traversables, to get an Option[Int].
You can turn your list into a stream, so that any filters that the for-loop contains are only evaluated on demand. However, yielding from the stream will always return a stream, and what you want is, I suppose, an option. So, as a final step, you can check whether the resulting stream has at least one element, and return its head as an option. The headOption function does exactly that.
def findFirst[T](objects: List[T], expensiveFunc: T => Boolean): Option[T] =
(for (obj <- objects.toStream if expensiveFunc(obj)) yield obj).headOption
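To check that the lazy version really stops early, here is a small sketch using only the standard library. It iterates over an iterator instead of a stream and counts how often the predicate is called. LazyFindFirst and the calls counter are illustrative names, and returning the count alongside the result is only done for demonstration.

```scala
object LazyFindFirst {
  // Returns the first match plus how many times the predicate was called.
  def findFirst[T](objects: List[T])(p: T => Boolean): (Option[T], Int) = {
    var calls = 0
    // Iterating over an iterator keeps the filter lazy: nothing runs yet.
    val matches = for {
      obj <- objects.iterator
      if { calls += 1; p(obj) }
    } yield obj
    // hasNext advances only until the first element that passes the filter.
    val first = if (matches.hasNext) Some(matches.next()) else None
    (first, calls)
  }
}
```

For the list 1 to 10 and the predicate _ % 4 == 0, this returns Some(4) after exactly four predicate calls. If nothing matches, every element is tested once and the result is None.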
Why not do exactly what you sketched above, that is, return from the loop early? If you are interested in what Scala actually does under the hood, compile your code with -print. Scala desugars the loop into a foreach and then uses an exception to leave the foreach prematurely.
So what you are trying to do is break out of a loop once your condition is satisfied. The answers to "How do I break out of a loop in Scala?" might be what you are looking for.
Overall, a for comprehension in Scala is translated into map, flatMap and filter operations, so it is not possible to break out of these functions unless you throw an exception.
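One way to do the exception-based exit without writing the exception yourself is the standard library's scala.util.control.Breaks, which throws and catches a control exception for you. The sketch below is only an illustration. BreakDemo is a made-up name, and the predicate parameter p replaces the question's expensiveFunc.

```scala
import scala.util.control.Breaks.{break, breakable}

object BreakDemo {
  def findFirst[T](objects: List[T])(p: T => Boolean): Option[T] = {
    var found: Option[T] = None
    breakable {
      for (obj <- objects) {
        if (p(obj)) {
          found = Some(obj)
          break() // throws a control exception that breakable catches
        }
      }
    }
    found
  }
}
```

BreakDemo.findFirst(List(1, 2, 3, 4))(_ % 2 == 0) yields Some(2), and elements after the first match are never tested.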
If you are wondering, this is how find is implemented in LinearSeqOptimized.scala, which List inherits:
override /*IterableLike*/
def find(p: A => Boolean): Option[A] = {
var these = this
while (!these.isEmpty) {
if (p(these.head)) return Some(these.head)
these = these.tail
}
None
}
This is a horrible hack. But it would get you the result you wished for.
Idiomatically you'd use a Stream or View and just compute the parts you need.
def findFirst[T](objects: List[T]): T = {
def expensiveFunc(o : T) = // unclear what should be returned here
case class MissusedException(val data: T) extends Exception
try {
(for (obj <- objects) {
if (expensiveFunc(obj) != null) throw new MissusedException(obj)
})
objects.head // T must be returned from loop, dummy
} catch {
case MissusedException(obj) => obj
}
}
Why not something like
object Main {
def main(args: Array[String]): Unit = {
val seven = (for (
x <- 1 to 10
if x == 7
) yield x).headOption
}
}
The variable seven will be an Option holding Some(value) if a value satisfies the condition, and None otherwise.
I hope this helps.
Here are a few implementations that do not use 'return':
object TakeWhileLoop extends App {
println("first non-null: " + func(Seq(null, null, "x", "y", "z")))
def func[T](seq: Seq[T]): T = if (seq.isEmpty) null.asInstanceOf[T] else
seq(seq.takeWhile(_ == null).size)
}
object OptionLoop extends App {
println("first non-null: " + func(Seq(null, null, "x", "y", "z")))
def func[T](seq: Seq[T], index: Int = 0): T = if (seq.isEmpty) null.asInstanceOf[T] else
Option(seq(index)) getOrElse func(seq, index + 1)
}
object WhileLoop extends App {
println("first non-null: " + func(Seq(null, null, "x", "y", "z")))
def func[T](seq: Seq[T]): T = if (seq.isEmpty) null.asInstanceOf[T] else {
var i = 0
def obj = seq(i)
while (obj == null)
i += 1
obj
}
}
objects.iterator.filter(obj => expensiveFunc(obj) != null).next()
The trick is to get a lazily evaluated view of the collection: an iterator, a Stream, or objects.view. The filter will only execute as far as needed.