How to make Scalacheck arbString generate only readable Strings? - scala

Postgres doesn't accept all kind of symbols that Scalacheck arbString generates. Is there a way to generate human readable strings with Scalacheck?

If you take a look at the Gen object you can see a few generators, including alphaChar and identifier.
scala> import org.scalacheck.Gen._
import org.scalacheck.Gen._
scala> identifier.sample
res0: Option[String] = Some(vxlgvihQeknhe4PolpsJas1s0gx3dmci7z9i2pkYlxhO2vdrkqpspcaUmzrxnnb)
scala> alphaChar.sample
res1: Option[Char] = Some(f)
scala> listOf(alphaChar).sample
res2: Option[List[Char]] = Some(List(g, n, x, Y, h, a, c, e, a, j, B, d, m, a, r, r, Z, a, z, G, e, i, i, v, n, Z, x, z, t))
scala> listOf(alphaChar).map(_.mkString).sample
res3: Option[String] = Some(oupwJfqmmqebcsqbtRxzmgnJvdjzskywZiwsqnkzXttLqydbaahsfrjqdyyHhdaNpinvnxinhxhjyzvehKmbuejaeozytjyoyvb)

You can do so by adding a case class ReadableChar(c: Char), and defining an instance of arbitrary for it. Maybe something like
case class ReadableChar(c: Char)
implicit val arbReadable: Arbitrary[ReadableChar] = Arbitrary {
val legalChars = 'a' to 'z' // inclusive range; Range('a', 'z') would leave out 'z'
for {
c <- Gen.oneOf(legalChars)
} yield ReadableChar(c)
}
Then you can use the instance for Arbitrary[Array[ReadableChar]] to generate an array of readable chars and turn it into a string via .map(_.c).mkString (calling toString on an Array would print the array reference, not its contents).
This works if you want to define "human readable strings" by the chars they are allowed to contain. If you need additional restrictions you can write a second case class ReadableString(s: String) and define an instance of Arbitrary for it, too.
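As a minimal sketch of the ReadableString idea, assuming ScalaCheck is on the classpath: the wrapper type and the generator names below are illustrative, not part of ScalaCheck itself.

```scala
import org.scalacheck.{Arbitrary, Gen}

object ReadableStrings {
  // Hypothetical wrapper type: marks a String as restricted to 'a'..'z'.
  final case class ReadableString(s: String)

  // ('a' to 'z') is inclusive on both ends, so 'z' can be generated too.
  val readableChar: Gen[Char] = Gen.oneOf('a' to 'z')

  // Build a list of readable chars and join it with mkString.
  val readableString: Gen[ReadableString] =
    Gen.listOf(readableChar).map(cs => ReadableString(cs.mkString))

  // With this instance in scope, Prop.forAll can take ReadableString arguments.
  implicit val arbReadableString: Arbitrary[ReadableString] =
    Arbitrary(readableString)
}
```

A property can then take a ReadableString parameter and unwrap it with .s before handing the text to the database layer. If digits are acceptable as well, the built-in Gen.alphaNumStr may already be enough.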

Related

Create an Arbitrary instance for a case class that holds a `Numeric` in ScalaCheck?

I'm specifically trying to define Semigroup and a Sum type which 'is a' Semigroup and check the Associative property of Semigroup generically using ScalaCheck.
I first wrote this out in Haskell because I find it easier to think of these things first in Haskell syntax and then translate them to Scala.
So in Haskell, I wrote the following which works in GHCi:
newtype Sum a = Sum a deriving (Show, Eq)
instance Num a => Num (Sum a) where
(+) (Sum x) (Sum y) = Sum (x + y)
class Semigroup a where
(<>) :: a -> a -> a
instance Num a => Semigroup (Sum a) where
(<>) = (+)
instance Arbitrary a => Arbitrary (Sum a) where
arbitrary = fmap Sum arbitrary
semigroupAssocProp x y z = (x <> (y <> z)) == ((x <> y) <> z)
quickCheck (semigroupAssocProp :: Num a => Sum a -> Sum a -> Sum a -> Bool)
I'm trying to create something roughly equivalent in Scala. So far, I have what you see below:
trait Semigroup[A] {
def |+|(b: A): A
}
case class Sum[A: Numeric](n: A) extends Semigroup[Sum[A]] {
def |+|(x: Sum[A]): Sum[A] = Sum[A](implicitly[Numeric[A]].plus(n, x.n))
}
val semigroupAssocProp = Prop.forAll { (x: Sum[Int], y: Sum[Int], z: Sum[Int]) =>
(x |+| (y |+| z)) == ((x |+| y) |+| z)
}
val chooseSum = for { n <- Gen.chooseNum(-10000, 10000) } yield Sum(n)
// => val chooseSum Gen[Sum[Int]] = org.scalacheck.Gen$$anon$<some hash>
I'm lost on how to create an Arbitrary instance for a more generic Sum[Numeric], or at least a Gen[Sum[Numeric]] and how to create a more generic semigroupAssocProp that could take an x, y, and z of type S where S extends Semigroup[T], with T being any concrete type.
I'm really trying to get as close in functionality to the Haskell version I wrote as possible in Scala.
Part of the issue is that your Scala code is not a direct translation of the Haskell version. This is a more direct translation of your Haskell code:
trait Semigroup[A] {
def add(a: A, b: A): A
}
case class Sum[A](n: A)
object Sum {
implicit def sumSemigroup[A: Numeric]: Semigroup[Sum[A]] =
new Semigroup[Sum[A]] {
def add(a: Sum[A], b: Sum[A]): Sum[A] =
Sum(implicitly[Numeric[A]].plus(a.n, b.n))
}
}
It's not a literal translation, since we don't supply a Numeric instance for Sum[A] (which would be more of a pain, given Numeric's interface), but it does represent the standard encoding of type classes in Scala.
Now you provide an Arbitrary instance for Sum[A] in exactly the same way as in Haskell:
import org.scalacheck.Arbitrary
implicit def arbitrarySum[A](implicit A: Arbitrary[A]): Arbitrary[Sum[A]] =
Arbitrary(A.arbitrary.map(Sum(_)))
And then you can define your property:
import org.scalacheck.Prop
def semigroupAssocProp[A: Arbitrary: Semigroup]: Prop =
Prop.forAll { (x: A, y: A, z: A) =>
val semigroup = implicitly[Semigroup[A]]
semigroup.add(x, semigroup.add(y, z)) == semigroup.add(semigroup.add(x, y), z)
}
And then check it:
scala> semigroupAssocProp[Sum[Int]].check
+ OK, passed 100 tests.
The key point is that Scala doesn't encode type classes using subtyping in the way that your implementation tries to do. Instead you define your type classes as traits (or classes) that look very similar to the way you use class in Haskell. My Semigroup's add, for example, takes two arguments, just like the <> in the Haskell Semigroup. Instead of a separate instance-like language-level mechanism, though, you define your type class instances by instantiating these traits (or classes) and putting the instances into implicit scope.
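If you miss the infix |+| from the question, one common way to get it back on top of the type class encoding is a small implicit wrapper. The sketch below is standard-library only; SemigroupOps and associative are hypothetical names chosen for illustration.

```scala
object SemigroupSyntax {
  trait Semigroup[A] {
    def add(a: A, b: A): A
  }

  final case class Sum[A](n: A)

  object Sum {
    // Instance lives in the companion, so it is found without an import.
    implicit def sumSemigroup[A](implicit num: Numeric[A]): Semigroup[Sum[A]] =
      new Semigroup[Sum[A]] {
        def add(a: Sum[A], b: Sum[A]): Sum[A] = Sum(num.plus(a.n, b.n))
      }
  }

  // Hypothetical syntax wrapper: adds `|+|` to any type with a Semigroup instance.
  implicit class SemigroupOps[A](lhs: A) {
    def |+|(rhs: A)(implicit sg: Semigroup[A]): A = sg.add(lhs, rhs)
  }

  // Associativity check on concrete values, written with the infix operator.
  def associative[A: Semigroup](x: A, y: A, z: A): Boolean =
    (x |+| (y |+| z)) == ((x |+| y) |+| z)
}
```

The body of associative can be dropped into Prop.forAll unchanged, so the property reads much like the Haskell original.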

Map versus FlatMap on String

Listening to the Collections lecture from Functional Programming Principles in Scala, I saw this example:
scala> val s = "Hello World"
scala> s.flatMap(c => ("." + c)) // prepend each element with a period
res5: String = .H.e.l.l.o. .W.o.r.l.d
Then, I was curious why Mr. Odersky didn't use a map here. But, when I tried map, I got a different result than I expected.
scala> s.map(c => ("." + c))
res8: scala.collection.immutable.IndexedSeq[String] = Vector(.H, .e, .l, .l, .o,
". ", .W, .o, .r, .l,
I expected that above call to return a String, since I'm map-ing, i.e. applying a function to each item in the "sequence," and then returning a new "sequence."
However, I could perform a map rather than flatmap for a List[String]:
scala> val sList = s.toList
sList: List[Char] = List(H, e, l, l, o, , W, o, r, l, d)
scala> sList.map(c => "." + c)
res9: List[String] = List(.H, .e, .l, .l, .o, ". ", .W, .o, .r, .l, .d)
Why was an IndexedSeq[String] the return type of calling map on the String?
The reason for this behavior is that, in order to apply map to a String, Scala treats the string as a sequence of chars (IndexedSeq[Char]). The function is applied to each element of that sequence, and because it returns a String for every Char, the result can no longer be represented as a String, so map returns an IndexedSeq[String].
flatMap then simply invokes flatten on that sequence afterwards, which "converts" it back to a String.
You also have an interesting "collection of Scala flatMap examples", the first of which illustrates that difference between flatMap and map:
scala> val fruits = Seq("apple", "banana", "orange")
fruits: Seq[java.lang.String] = List(apple, banana, orange)
scala> fruits.map(_.toUpperCase)
res0: Seq[java.lang.String] = List(APPLE, BANANA, ORANGE)
scala> fruits.flatMap(_.toUpperCase)
res1: Seq[Char] = List(A, P, P, L, E, B, A, N, A, N, A, O, R, A, N, G, E)
Quite a difference, right?
Because flatMap treats a String as a sequence of Char, it flattens the resulting list of strings into a sequence of characters (Seq[Char]).
flatMap is a combination of map and flatten, so it first runs map on the sequence, then runs flatten, giving the result shown.
You can see this by running map and then flatten yourself:
scala> val mapResult = fruits.map(_.toUpperCase)
mapResult: Seq[String] = List(APPLE, BANANA, ORANGE)
scala> val flattenResult = mapResult.flatten
flattenResult: Seq[Char] = List(A, P, P, L, E, B, A, N, A, N, A, O, R, A, N, G, E)
Your map function c => ("." + c) takes a char and returns a String. It's like taking a List and returning a List of Lists. flatMap flattens that back.
If you would return a char instead of a String you wouldn't need the result flattened, e.g. "abc".map(c => (c + 1).toChar) returns "bcd".
With map you are taking a list of characters and turning it into a list of strings. That's the result you see. A map never changes the length of a list – the list of strings has as many elements as the original string has characters.
With flatMap you are taking a list of characters and turning it into a list of strings and then you mush those strings together into a single string again. flatMap is useful when you want to turn one element in a list into multiple elements, without creating a list of lists. (This of course also means that the resulting list can have any length, including 0 – this is not possible with map unless you start out with the empty list.)
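The points above can be checked side by side with a short standard-library sketch (the object and value names are only for illustration):

```scala
object MapVsFlatMap {
  val s = "Hello"

  // Char => String: the result can no longer be a String, so map
  // produces a sequence of Strings, one per input character.
  val mapped: IndexedSeq[String] = s.map(c => "." + c)

  // flatMap concatenates the per-character Strings back into one String.
  val flatMapped: String = s.flatMap(c => "." + c)

  // Char => Char: map can rebuild a String directly.
  val shifted: String = "abc".map(c => (c + 1).toChar)

  // flatMap may change the length: drop digits, double everything else.
  val resized: String = "a1b".flatMap(c => if (c.isDigit) "" else c.toString * 2)
}
```

Joining mapped with mkString gives the same text as flatMapped, which is the "map, then flatten" relationship described above.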
Use flatMap in situations where you run map followed by flatten. The specific situation is this:
• You’re using map (or a for/yield expression) to create a new collection from an existing collection.
• The resulting collection is a List of Lists.
• You call flatten immediately after map (or a for/yield expression).
When you’re in this situation, you can use flatMap instead.
Example: Add all the Integers from the bag
val bag = List("1", "2", "three", "4", "one hundred seventy five")
def toInt(in: String): Option[Int] = {
try {
Some(Integer.parseInt(in.trim))
} catch {
case e: Exception => None
}
}
Using a flatMap method
> bag.flatMap(toInt).sum
Using map method (3 steps needed)
bag.map(toInt) // List[Option[Int]] = List(Some(1), Some(2), None, Some(4), None)
bag.map(toInt).flatten //List[Int] = List(1, 2, 4)
bag.map(toInt).flatten.sum //Int = 7
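A slightly shorter variant of the same example, sketched with scala.util.Try instead of the explicit try/catch; the behaviour is intended to be the same for this input.

```scala
import scala.util.Try

object SafeParse {
  // Try captures the NumberFormatException; toOption turns a failure into None.
  def toInt(in: String): Option[Int] = Try(in.trim.toInt).toOption

  val bag = List("1", "2", "three", "4", "one hundred seventy five")

  // An Option behaves like a collection of zero or one elements,
  // so flatMap drops the Nones and unwraps the Somes in one pass.
  val total: Int = bag.flatMap(toInt).sum
}
```

Here total evaluates to 7, matching the three-step map/flatten/sum version.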

Generic column access for scala matrix

I have code that stores matrices of different types, e.g. m1: Array[Array[Double]], m2: List[List[Int]]. As seen, these matrices are all stored as a sequence of rows. Any row is easy to retrieve but columns seem to me to require traversal of the matrix. I'd like to write a very generic function that returns a column from a matrix of either of these types. I've written this in many ways, the latest of which is:
/* Get a column of any matrix stored in rows */
private def column(M: Seq[Seq[Any]], n: Int, c: Seq[Any] = List(),
i: Int = 0): List[Any] = {
if (i != M.size) column(M, n, c :+ M(i)(n), i+1) else c.toList
}
This compiles however it doesn't work: I get a type mismatch when I try to pass in an Array[Array[Double]]. I've tried to write this with some view bounds as well i.e.
private def column[T1 <% Seq[Any], T2 <% Seq[T1]] ...
But this wasn't fruitful either. How come the first code segment I wrote doesn't work? What is the best way to do this?
import collection.generic.CanBuildFrom
def column[T, M[_]](xss: M[M[T]], c: Int)(
implicit cbf: CanBuildFrom[Nothing, T, M[T]],
mm2s: M[M[T]] => Seq[M[T]],
m2s: M[T] => Seq[T]
): M[T] = {
val bf = cbf()
for (xs <- mm2s(xss)) { bf += m2s(xs).apply(c) }
bf.result
}
If you don't care about the return type, this is a really simple way to do it:
def column[A, M[_]](matrix: M[M[A]], colIdx: Int)
(implicit v1: M[M[A]] => Seq[M[A]], v2: M[A] => Seq[A]): Seq[A] =
matrix.map(_(colIdx))
I suggest you represent a Matrix as an underlying single-dimensional Array (the only kind of Array there is!) and separately represent its structure in terms of rows and columns.
This gives you more flexibility both in representation and access. E.g., you can provide both row-major and column-major organizations. Producing row iterators is just as easy as producing column iterators, regardless of whether it's a row-major or column-major organization.
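A rough sketch of that flat representation, using an immutable Vector instead of an Array for simplicity. FlatMatrix and its methods are hypothetical names, and only the row-major layout is shown.

```scala
object FlatMatrixExample {
  // Hypothetical row-major matrix backed by a single flat Vector.
  final case class FlatMatrix[A](rows: Int, cols: Int, data: Vector[A]) {
    require(data.length == rows * cols, "data must hold rows * cols elements")

    // Element (r, c) sits at offset r * cols + c in row-major order.
    def apply(r: Int, c: Int): A = data(r * cols + c)

    // A row is a contiguous slice of the backing vector.
    def row(r: Int): Vector[A] = data.slice(r * cols, (r + 1) * cols)

    // A column is every cols-th element, starting at offset c.
    def column(c: Int): Vector[A] = Vector.tabulate(rows)(r => data(r * cols + c))
  }

  val m = FlatMatrix(2, 3, Vector(1, 2, 3, 4, 5, 6))
}
```

A column-major variant would only swap the index arithmetic (c * rows + r), leaving the row and column accessors with the same signatures.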
Try this one:
private def column[T](
M: Seq[Seq[T]], n: Int, c: Seq[T] = List(), i: Int = 0): List[T] =
if (i != M.size) column(M, n, c :+ M(i)(n), i+1) else c.toList
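As far as I can tell, the original fails for Array[Array[Double]] because an implicit conversion can wrap only the outer Array as a Seq; the elements are still Array[Double], which is not a Seq. A non-recursive sketch that makes the conversion explicit (names are illustrative):

```scala
object ColumnAccess {
  // Non-recursive variant: take element n from every row.
  def column[T](m: Seq[Seq[T]], n: Int): Seq[T] = m.map(row => row(n))

  val lists: List[List[Int]] = List(List(1, 2), List(3, 4))
  val arrays: Array[Array[Double]] = Array(Array(1.0, 2.0), Array(3.0, 4.0))

  // A List[List[Int]] already is a Seq[Seq[Int]].
  val fromLists: Seq[Int] = column(lists, 1)

  // Arrays are not Seqs, so convert the outer and the inner arrays explicitly.
  val fromArrays: Seq[Double] = column(arrays.toSeq.map(_.toSeq), 1)
}
```

The price is that the result is always a Seq[T]; the CanBuildFrom version above is what you need if the result must have the same collection type as the rows.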

Scala; Can you define many parameters as one

Can you define a set of variables for later use?
Here is some pseudocode highlighting my intent:
def coordinates = x1, y1, x2, y2
log("Drawing from (%4.1f, %4.1f) to (%4.1f, %4.1f)".format(coordinates))
canvas.drawLine(coordinates, linePaint)
Here is a working example that contains duplicated code.
log("Drawing from (%4.1f, %4.1f) to (%4.1f, %4.1f)".format(x1, y1, x2, y2))
canvas.drawLine(x1, y1, x2, y2, linePaint)
Yes, you can, although the syntax is arguably horribly clunky, and there are some limitations that may seem a little arbitrary at first. The trick is to convert the method to a function (called "eta expansion"), and then to use that function's tupled method to get something you can apply to a tuple.
Suppose you have a class like this:
class Foo {
def f(a: String, b: String) = "%s, %s".format(b, a)
def g(x: Int, y: Int, z: Int) = x + y * z
}
And an instance:
val foo = new Foo
And some data you'd like to use Foo's methods on:
val names = ("John", "Doe")
val nums = (42, 3, 37)
You can't just write foo.f(names) or foo.g(nums), because the types don't line up—argument lists and tuples are different things in Scala. But you can write the following:
scala> (foo.f _).tupled(names)
res0: String = Doe, John
scala> (foo.g _).tupled(nums)
res1: Int = 153
Sticking the underscore after the method turns it into a function (this is in my opinion the most confusing little quirk of Scala's syntax), and tupled converts it from a function with two (or three) arguments to a function with a single tuple argument.
You could clean the code up a little by defining the following helper functions, for example:
scala> val myF = (foo.f _).tupled
myF: ((String, String)) => String = <function1>
scala> val myG = (foo.g _).tupled
myG: ((Int, Int, Int)) => Int = <function1>
scala> myF(names)
res2: String = Doe, John
scala> myG(nums)
res3: Int = 153
I'm not sure that's much better, though.
Lastly, you can't (conveniently) use this approach on a varargs method—you can't for example write the following:
val coordsTupleToString = ("(%4.1f, %4.1f) to (%4.1f, %4.1f)".format _).tupled
Or even just:
val coordsToString = "(%4.1f, %4.1f) to (%4.1f, %4.1f)".format _
Which is yet another reason to avoid varargs in Scala.
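For the varargs case specifically, one workaround is to spread the tuple's fields into the argument list by hand. This is only a sketch: drawLine below is a hypothetical stand-in for canvas.drawLine, and Locale.ROOT is used so the decimal separator does not depend on the machine's locale.

```scala
import java.util.Locale

object TupleFormat {
  // productIterator walks the tuple's fields as Any; `: _*` spreads them
  // into the varargs parameter of formatLocal.
  def coordsToString(coords: (Float, Float, Float, Float)): String =
    "(%4.1f, %4.1f) to (%4.1f, %4.1f)"
      .formatLocal(Locale.ROOT, coords.productIterator.toSeq: _*)

  // Hypothetical stand-in for canvas.drawLine, just to show tupled.
  def drawLine(x1: Float, y1: Float, x2: Float, y2: Float): String =
    s"line $x1,$y1 -> $x2,$y2"

  val coordinates = (1.0f, 2.0f, 3.0f, 4.5f)
  val logged: String = coordsToString(coordinates)
  val drawn: String = (drawLine _).tupled(coordinates)
}
```

The trade-off is that productIterator is untyped, so a tuple with the wrong number or kind of fields only fails at run time, when the format string is applied.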
Looks like you need a tuple:
val coordinates = (x1, y1, x2, y2)
or maybe a full-blown object?
Now, this may be obvious, but if it annoys you in only a few cases, you can always enhance:
implicit def enhancedCanvas(canvas: Canvas) = new {
// using bad and slow syntax. please change this in Scala 2.10.
def drawLineC(coordinates: (Float, Float, Float, Float), paint: Paint) = {
val (x1, y1, x2, y2) = coordinates
canvas.drawLine(x1, y1, x2, y2, paint)
}
}
Another possibility, if you’re crazy enough. (Might be that an enhancement like this is already in Scalaz or Shapeless.)
implicit def enhTuple4[A,B,C,D](t: Tuple4[A,B,C,D]) = new {
def |<[E] (f: (A, B, C, D) => E) = f(t._1, t._2, t._3, t._4)
}
// to be used as
val coordinates = (x1, y1, x2, y2)
coordinates |< (canvas.drawLine(_, _, _, _, linePaint))

generating permutations with scalacheck

I have some generators like this:
val fooRepr = oneOf(a, b, c, d, e)
val foo = for (s <- choose(1, 5); c <- listOfN(s, fooRepr)) yield c.mkString("$")
This leads to duplicates ... I might get two a's, etc. What I really want is to generate a random permutation with exactly 0 or 1 of each of a, b, c, d, or e (with at least one of something), in any order.
I was thinking there must be an easy way, but I'm struggling to even find a hard way. :)
Edited: Ok, this seems to work:
val foo = for (s <- choose(1, 5);
c <- permute(s, a, b, c, d, e)) yield c.mkString("$")
def permute[T](n: Int, gs: Gen[T]*): Gen[Seq[T]] = {
val perm = Random.shuffle(gs.toList)
for {
is <- pick(n, 1 until gs.size)
xs <- sequence[List,T](is.toList.map(perm(_)))
} yield xs
}
...borrowing heavily from Gen.pick.
Thanks for your help, -Eric
Rex, thanks for clarifying exactly what I'm trying to do, and that's useful code, but perhaps not so nice with scalacheck, particularly if the generators in question are quite complex. In my particular case the generators a, b, c, etc. are generating huge strings.
Anyhow, there was a bug in my solution above; what worked for me is below. I put a tiny project demonstrating how to do this at github
The guts of it is below. If there's a better way, I'd love to know it...
package powerset
import org.scalacheck._
import org.scalacheck.Gen._
import org.scalacheck.Gen
import scala.util.Random
object PowersetPermutations extends Properties("PowersetPermutations") {
def a: Gen[String] = value("a")
def b: Gen[String] = value("b")
def c: Gen[String] = value("c")
def d: Gen[String] = value("d")
def e: Gen[String] = value("e")
val foo = for (s <- choose(1, 5);
c <- permute(s, a, b, c, d, e)) yield c.mkString
def permute[T](n: Int, gs: Gen[T]*): Gen[Seq[T]] = {
val perm = Random.shuffle(gs.toList)
for {
is <- pick(n, 0 until gs.size)
xs <- sequence[List, T](is.toList.map(perm(_)))
} yield xs
}
implicit def arbString: Arbitrary[String] = Arbitrary(foo)
property("powerset") = Prop.forAll {
a: String => println(a); true
}
}
Thanks,
Eric
You're not describing a permutation, but the power set (minus the empty set). Edit: you're describing a combination of a power set and a permutation. The power set of an indexed set N is isomorphic to 2^N, so we can simply enumerate bitmasks (in Scala alone; maybe you want to alter this for use with ScalaCheck):
def powerSet[X](xs: List[X]) = {
val xis = xs.zipWithIndex
(for (j <- 1 until (1<<xs.length)) yield {
for ((x,i) <- xis if ((j & (1<<i)) != 0)) yield x
}).toList
}
to generate all possible non-empty subsets of a given set. Of course, explicit generation of power sets is unwise if the original set contains more than a handful of elements. If you don't want to generate all of them, just pass in a random number from 1 until (1<<xs.length) and run the inner loop. (Switch to Long if there are 33-64 elements, and to BitSet if there are more yet.) You can then permute the result to switch the order around if you wish.
Edit: there's another way to do this if you can generate permutations easily and you can add a dummy argument: make your list one longer, with a Stop token. Then permute and .takeWhile(_ != Stop). Ta-da! Permutations of arbitrary length. (Filter out the zero-length answer if need be.)
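The Stop-token idea can be sketched in plain Scala as follows. Here None plays the role of the Stop token, and randomSubPermutation is a hypothetical name. The sketch uses scala.util.Random directly, so it would need adapting if the randomness should come from ScalaCheck's own generators.

```scala
import scala.util.Random

object StopTokenPermutation {
  // Shuffle the items together with one stop marker (None), then keep
  // everything that lands before the marker. Retry on the empty result.
  @annotation.tailrec
  def randomSubPermutation[A](items: List[A], rng: Random): List[A] = {
    val withStop: List[Option[A]] = None :: items.map(Some(_))
    val picked = rng.shuffle(withStop).takeWhile(_.isDefined).flatten
    if (picked.nonEmpty || items.isEmpty) picked
    else randomSubPermutation(items, rng)
  }
}
```

Each call returns between one and all of the items, each at most once, in a random order. Retrying on the empty result is a simple way to enforce "at least one of something".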