How can I generate a list of n unique elements picked from a set? - scala

How to generate a list of n unique values (Gen[List[T]]) from a set of values (not generators) using ScalaCheck? This post uses Gen[T]* instead of a set of values, and I can't seem to rewrite it to make it work.
EDIT
At @Jubobs' request I now shamefully display what I have tried so far, revealing my utter novice status at using ScalaCheck :-)
I simply tried to replace the gs: Gen[T] repeated parameter with a Set in the solution @Eric wrote here:
def permute[T](n: Int, gs: Set[T]): Gen[Seq[T]] = {
val perm = Random.shuffle(gs.toList)
for {
is <- Gen.pick(n, 1 until gs.size)
xs <- Gen.sequence[List[T], T](is.toList.map(perm(_)))
} yield xs
}
but is.toList.map(perm(_)) got underlined with red, with IntelliJ IDEA telling me that "You should read ScalaCheck API first before blind (although intuitive) trial and error", or maybe "Type mismatch, expected: Traversable[Gen[T]], actual List[T]", I can't remember.
I also tried several other ways, most of which I find ridiculous (and thus not worthy of posting) in hindsight, the most naive being to use @Eric's (otherwise useful and neat) solution as-is:
val g = for (i1 <- Gen.choose(0, myList1.length - 1);
i2 <- Gen.choose(0, myList2.length - 1))
yield new MyObject(myList1(i1), myList2(i2))
val pleaseNoDuplicatesPlease = permute(4, g, g, g, g, g)
After some testing I saw that pleaseNoDuplicatesPlease in fact contained duplicates, at which point I weighed my options of having to read through ScalaCheck API and understand a whole lot more than I do now (which will inevitably, and gradually come), or posting my question at StackOverflow (after carefully searching whether similar questions exist).

Gen.pick is right up your alley:
scala> import org.scalacheck.Gen
import org.scalacheck.Gen
scala> val set = Set(1,2,3,4,5,6)
set: scala.collection.immutable.Set[Int] = Set(5, 1, 6, 2, 3, 4)
scala> val myGen = Gen.pick(5, set).map { _.toList }
myGen: org.scalacheck.Gen[scala.collection.immutable.List[Int]] = org.scalacheck.Gen$$anon$3@78693eee
scala> myGen.sample
res0: Option[scala.collection.immutable.List[Int]] = Some(List(5, 6, 2, 3, 4))
scala> myGen.sample
res1: Option[scala.collection.immutable.List[Int]] = Some(List(1, 6, 2, 3, 4))
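If you want this as a reusable helper, a minimal sketch could look like the following. The name uniqueListOf and the require guard are my own additions for illustration, not part of ScalaCheck, and the snippet assumes ScalaCheck is on the classpath. The elements are unique because a Set holds no duplicates and Gen.pick draws without replacement.

```scala
import org.scalacheck.Gen

// Hypothetical helper: a generator of n distinct values drawn from `values`.
def uniqueListOf[T](n: Int, values: Set[T]): Gen[List[T]] = {
  // Fail early with a clear message instead of relying on how Gen.pick
  // reports an out-of-range n.
  require(n >= 0 && n <= values.size, s"cannot pick $n of ${values.size} values")
  Gen.pick(n, values).map(_.toList)
}

val gen = uniqueListOf(4, Set("a", "b", "c", "d", "e", "f"))

// Each sample, when present, holds 4 different elements of the set.
val sample: Option[List[String]] = gen.sample
```

For a set of non-primitive values such as MyObject, the same helper should apply unchanged, provided equals and hashCode are defined so that the Set really contains no duplicates.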

Related

Functional Programming In Scala - reversing lists exercise

While working through the exercises and concepts of the book Functional Programming in Scala by Paul Chiusano and Rúnar Bjarnason, I stumbled upon the exercise of writing my own function to reverse a list.
The "suggestion" the authors gave to motivate readers to learn more, was to see if we can write such a function using a fold.
My "non-fold" version is the following:
def myrev(arr:List[Int]):List[Int] = if (arr.length > 0) { myrev(arr.tail) :+ arr.head } else arr
However, can someone give me some pointers on how to at least start the logic to reverse a list via a fold?
I know that:
List(1,2,3,4).foldLeft(0)({case(x,y)=>x})
gives me the initial "seed" value, which is 0 above, and that:
List(1,2,3,4).foldLeft(0)({case(x,y)=>y})
gives me 4 which is the last element of the list.
So I would need to pass foldLeft a sort of "identity" function that can give me an element in a certain position and maybe use it to reverse the list somehow, but I feel at a loss.
I didn't fully understand some Haskell code I found online, and I want to "struggle" and try to get there on my own with only some pointers, instead of blindly copying a solution, so any help would be greatly appreciated.
scala> List(1, 2, 3, 4).foldLeft(List.empty[Int])((result, currentElem) => currentElem :: result)
res2: List[Int] = List(4, 3, 2, 1)
An implementation of a function that takes a list and reverses it using foldLeft:
def reverse (l: List[String]) =
l.foldLeft(List[String]())( (x: List[String], y: String) => y :: x )
Create a list and call the reverse function
val l = List("1", "2", "3")
reverse(l)
Result
res2: List[String] = List(3, 2, 1)
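To see why prepending inside foldLeft reverses the list, it may help to look at every intermediate accumulator. The sketch below uses only the standard library; reverseViaFold is just an illustrative name for a generic version of the answers above.

```scala
// Generic version of the fold-based reverse: prepend each element to the accumulator.
def reverseViaFold[A](xs: List[A]): List[A] =
  xs.foldLeft(List.empty[A])((acc, x) => x :: acc)

// scanLeft applies the same function but keeps every intermediate accumulator.
val steps = List(1, 2, 3, 4).scanLeft(List.empty[Int])((acc, x) => x :: acc)
// steps == List(List(), List(1), List(2, 1), List(3, 2, 1), List(4, 3, 2, 1))
```

Each step is a constant-time prepend, whereas the :+ in myrev has to copy the list built so far on every call.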

Are chained maps optimized by compiler?

Scala has an amazing way of converting one collection into another using the map construct.
val l = List(1, 2, 3, 4)
l.map(_*_)
will return the squares of the elements in list l
I have come across various instances where multiple maps are chained together, say,
val l = List(1, 2, 3, 4)
val res = l.map(_ * _).map(_ + 1).filter(_ < 3)
What I believe happens underneath is equivalent to the following.
val l = List(1, 2, 3, 4)
val l1 = l.map(_*_)
val l2 = l1.map(_ + 1)
val res = l2.filter(_ < 3)
Creating l1 and l2 might cause memory issues if the collection is too big.
To tackle this problem, does the Scala compiler perform any optimizations, such as fusing the chain like this?
val l = List(1, 2, 3, 4)
val res = l1.map( _*_ + 1).filter(_ < 3)
In general, if f, g, and h are functions,
val l = List(/*something*/)
val res = l.map(f(_)).map(g(_)).map(h(_))
can be converted into
val res = l.map(f _ andThen g _ andThen h _)
Scala offers Stream, which is a lazy ordered collection.
val s = Stream(1, 2, 3, 4)
// note i've changed your sequence of transformations
// a bit, so that it compiles and yields more than one result
val res = s.map(i => i * i).map(_ + 1).filter(_ < 11)
res is now a Stream. No actual evaluation has been performed yet, no blocks of memory related to the size of s have been used.
If you intend to use the elements of res one at a time, no more work is required. You can use res in a for statement or comprehension directly, for example.
for ( elem <- res ) println( s"A value is ${elem}" )
If you want res as a List, you can just call .toList at the end of the sequence of transformations. Instead of the above, use
val res = s.map(i => i * i).map(_ + 1).filter(_ < 11).toList
s will only be traversed once in creating the new List.
No, because this would require the compiler to know about the semantics of map and treat the standard library classes which implement it specially (since nobody stops you from writing a class where this doesn't hold). There is a research proposal which might end up implementing this... eventually.
There is also Scala-Blitz which optimizes some collection operations, but fusion and deforestation are listed as future work in this presentation and I don't think they are implemented yet.
As Steve Waldman's answer says, using Stream (or, better yet, Iterator) can help, but it won't eliminate the intermediate collections completely.
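To make the options discussed above concrete, here is a small standard-library sketch comparing the strict chain, a hand-fused version using andThen, and an Iterator pipeline. The names square and addOne are made up for the example, and the transformations follow the compiling variant from the Stream answer rather than the original question. Note that on Scala 2.13 and later Stream is deprecated in favour of LazyList.

```scala
val l = List(1, 2, 3, 4)
val square: Int => Int = i => i * i
val addOne: Int => Int = _ + 1

// Strict: each map builds a full intermediate List before the next step runs.
val strict = l.map(square).map(addOne).filter(_ < 11)

// Fused by hand: one composed function, so map builds a single intermediate List.
val composed = l.map(square andThen addOne).filter(_ < 11)

// Iterator: elements flow through the pipeline one at a time. No intermediate
// List is built, though each stage still allocates a small wrapper iterator.
val viaIterator = l.iterator.map(square).map(addOne).filter(_ < 11).toList
```

All three produce List(2, 5, 10). They differ only in what gets allocated along the way, which is the programmer's choice here and not something the compiler decides.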

Why the variation in operators?

Long time lurker, first time poster.
In Scala, I'm looking for advantages as to why it was preferred to vary operators depending on type. For example, why was this:
Vector(1, 2, 3) :+ 4
determined to be an advantage over:
Vector(1, 2, 3) + 4
Or:
4 +: Vector(1,2,3)
over:
Vector(4) + Vector(1,2,3)
Or:
Vector(1,2,3) ++ Vector(4,5,6)
over:
Vector(1,2,3) + Vector(4,5,6)
So, here we have :+, +:, and ++ when + alone could have sufficed. I'm new at Scala, and I'll succumb. But, this seems unnecessary and obfuscated for a language that tries to be clean with its syntax.
I've done quite a few google and stack overflow searches and have only found questions about specific operators, and operator overloading in general. But, no background on why it was necessary to split +, for example, into multiple variations.
FWIW, I could overload the operators using implicit classes, such as below, but I imagine that would only cause confusion (and tsk-tsks) from experienced Scala programmers using or reading my code.
object AddVectorDemo {
implicit class AddVector(vector : Vector[Any]) {
def +(that : Vector[Any]) = vector ++ that
def +(that : Any) = vector :+ that
}
def main(args : Array[String]) : Unit = {
val u = Vector(1,2,3)
val v = Vector(4,5,6)
println(u + v)
println(u + v + 7)
}
}
Outputs:
Vector(1, 2, 3, 4, 5, 6)
Vector(1, 2, 3, 4, 5, 6, 7)
The answer requires a surprisingly long detour through variance. I'll try to make it as short as possible.
First, note that you can add anything to an existing Vector:
scala> Vector(1)
res0: scala.collection.immutable.Vector[Int] = Vector(1)
scala> res0 :+ "fish"
res1: scala.collection.immutable.Vector[Any] = Vector(1, fish)
Why can you do this? Well, if B extends A and we want to be able to use Vector[B] where Vector[A] is called for, we need to allow Vector[B] to add the same sorts of things that Vector[A] can add. But everything extends Any, so we need to allow addition of anything that Vector[Any] can add, which is everything.
Making Vector and most other non-Set collections covariant is a design decision, but it's what most people expect.
Now, let's try adding a vector to a vector.
scala> res0 :+ Vector("fish")
res2: scala.collection.immutable.Vector[Any] = Vector(1, Vector(fish))
scala> res0 ++ Vector("fish")
res3: scala.collection.immutable.Vector[Any] = Vector(1, fish)
If we only had one operation, +, we wouldn't be able to specify which one of these things we meant. And we really might mean to do either. They're both perfectly sensible things to try. We could try to guess based on types, but in practice it's better to just ask the programmer to explicitly say what they mean. And since there are two different things to mean, there need to be two ways to ask.
Does this come up in practice? With collections of collections, yes, all the time. For example, using your +:
scala> Vector(Vector(1), Vector(2))
res4: Vector[Vector[Int]] = Vector(Vector(1), Vector(2))
scala> res4 + Vector(3)
res5: Vector[Any] = Vector(Vector(1), Vector(2), 3)
That's probably not what I wanted.
It's a fair question, and I think it has a lot to do with legacy code and Java compatibility. Scala copied Java's + for String concatenation, which has complicated things.
This + allows us to do:
(new Object) + "foobar" //"java.lang.Object@5bb90b89foobar"
So what should we expect if we had + for List and we did List(1) + "foobar"? One might expect List(1, "foobar") (of type List[Any]), just like we get if we use :+, but the Java-inspired String-concatenation overload would complicate this, since the compiler would fail to resolve the overload.
Odersky even once commented:
One should never have a + method on collections that are covariant in their element type. Sets and maps are non-variant, that's why they can have a + method. It's all rather delicate and messy. We'd be better off if we did not try to duplicate Java's + for String concatenation. But when Scala got designed the idea was to keep essentially all of Java's expression syntax, including String +. And it's too late to change that now.
There is some discussion (although in a different context) on the answers to this similar question.
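As a compact summary of the distinctions discussed in these answers, the following standard-library sketch puts the operators side by side. All value names are made up for the example.

```scala
val v = Vector(1, 2, 3)

val appended     = v :+ 4              // element added on the right
val prepended    = 0 +: v              // element added on the left
val concatenated = v ++ Vector(4, 5)   // collection joined to collection

// With nested collections the operator is what disambiguates the intent.
val nested     = Vector(Vector(1), Vector(2))
val asElement  = nested :+ Vector(3)   // Vector(3) becomes one more inner vector
val asElements = nested ++ Vector(3)   // 3 itself is added; the element type widens

// Set is invariant in its element type, so it can offer a plain +.
val s = Set(1, 2) + 3
```

A common mnemonic for :+ and +: is that the colon always sits on the side of the collection.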

How to explain that "Set(someList : _*)" results the same as "Set(someList).flatten"

I found a piece of code I wrote some time ago that uses _* to create a flattened set from a list of objects.
The real line of code is a bit more complex, and as I didn't remember exactly why it was there, it took a bit of experimentation to understand the effect, which is actually very simple, as seen in the following REPL session:
scala> val someList = List("a","a","b")
someList: List[java.lang.String] = List(a, a, b)
scala> val x = Set(someList: _*)
x: scala.collection.immutable.Set[java.lang.String] = Set(a, b)
scala> val y = Set(someList).flatten
y: scala.collection.immutable.Set[java.lang.String] = Set(a, b)
scala> x == y
res0: Boolean = true
Just as a reference of what happens without flatten:
scala> val z = Set(someList)
z: scala.collection.immutable.Set[List[java.lang.String]] = Set(List(a, a, b))
As I can't remember where I got that idiom from, I'd like to hear what is actually happening there and whether there is any consequence in going one way or the other (besides the readability impact).
P.S.: Maybe as an effect of the overuse of the underscore in the Scala language (IMHO), it is kind of difficult to find documentation about some of its use cases, especially when it comes together with a symbol commonly used as a wildcard in most search engines.
_* tells the compiler to expand the collection as if its elements had been written out literally as arguments, so
val x = Set(Seq(1,2,3,4): _*)
is the same as
val x = Set(1,2,3,4)
Set(someList), by contrast, treats someList as a single argument.
To look up funky symbols, you could use symbolhound.
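The same expansion works for any method that takes a repeated parameter, not just Set. A small sketch, where total is a made-up method used only for illustration:

```scala
// A hypothetical varargs method, declared the same way Set.apply takes its elements.
def total(xs: Int*): Int = xs.sum

val nums = List(1, 2, 2, 3)

val direct   = total(1, 2, 2, 3)  // arguments written out
val expanded = total(nums: _*)    // the list is passed as the argument sequence

val a = Set(nums: _*)       // Set[Int] built from the elements of nums
val b = Set(nums).flatten   // a one-element Set[List[Int]], then flattened
val c = nums.toSet          // the same result without going through varargs
```

All three sets are equal. Set(nums).flatten takes the detour of first building a set that contains the whole list as its only element, so nums.toSet or the _* form is probably the clearer choice.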

Scalacheck: Generate list corresponding to list of generators

I want to generate a list of integers corresponding to a list of generators in ScalaCheck.
import org.scalacheck._
import Arbitrary.arbitrary
val smallInt = Gen.choose(0,10)
val bigInt = Gen.choose(1000, 1000000)
val zeroOrOneInt = Gen.choose(0, 1)
val smallEvenInt = smallInt suchThat (_ % 2 == 0)
val gens = List(smallInt, bigInt, zeroOrOneInt, smallEvenInt)
//val listGen: Gen[List[Int]] = ???
//println(listGen.sample) //should print something like List(2, 2000, 0, 6)
For the given gens, I would like to create a generator listGen whose valid sample can be List(2, 2000, 0, 6).
Here is my first attempt using tuples.
val gensTuple = (smallInt, bigInt, zeroOrOneInt, smallEvenInt)
val tupleGen = for {
a <- gensTuple._1
b <- gensTuple._2
c <- gensTuple._3
d <- gensTuple._4
} yield (a, b, c, d)
println(tupleGen.sample) // prints Some((1,318091,0,6))
This works, but I don't want to use tuples, since the list of generators (gens) is created dynamically and its size is not fixed. Is there a way to do it with Lists?
I want to use the list generator (listGen) in ScalaCheck forAll property checking.
This looks like a toy problem, but it is the best I could do to create a standalone snippet reproducing the actual issue I am facing.
How about using the Gen.sequence method? It transforms an Iterable[Gen[T]] into a Gen[C[T]], where C can be List:
def sequence[C[_],T](gs: Iterable[Gen[T]])(implicit b: Buildable[T,C]): Gen[C[T]] =
...
Just use Gen.sequence, but be careful as it will try to return a java.util.ArrayList[T] if you don't fully parameterize it (bug).
Full working example:
def genIntList(): Gen[List[Int]] = {
val gens = List(Gen.chooseNum(1, 2), Gen.chooseNum(3, 4))
Gen.sequence[List[Int], Int](gens)
}
println(genIntList.sample.get) // prints: List(1,4)
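If the type parameters of Gen.sequence turn out to be awkward in your ScalaCheck version, the same combination can be written by hand with a fold. This is only a sketch: sequenceGens is a hypothetical helper name, and the snippet assumes ScalaCheck is on the classpath. I left out the suchThat generator from the question, since a filtering generator can make sample return None.

```scala
import org.scalacheck.{Gen, Prop}

// Hypothetical helper: run each generator in order and collect the results.
def sequenceGens[T](gens: List[Gen[T]]): Gen[List[T]] =
  gens.foldRight(Gen.const(List.empty[T])) { (g, acc) =>
    for {
      x  <- g
      xs <- acc
    } yield x :: xs
  }

val smallInt     = Gen.choose(0, 10)
val bigInt       = Gen.choose(1000, 1000000)
val zeroOrOneInt = Gen.choose(0, 1)

val listGen: Gen[List[Int]] = sequenceGens(List(smallInt, bigInt, zeroOrOneInt))

// The generated list has one element per generator, in the same order.
val prop = Prop.forAll(listGen) { xs =>
  xs.length == 3 && xs(0) <= 10 && xs(1) >= 1000 && xs(2) <= 1
}
```

Because the helper takes a plain List[Gen[T]], it works the same way when the list of generators is built dynamically.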
EDIT: Please disregard, this doesn't answer the asker's question
I can't comment on posts yet, so I'll have to venture a guess here. I presume the function 'sample' applies to the generators
Any reason why you can't do:
gens map (t=>t.sample)
For a more theoretical answer: the method you want is traverse, which is equivalent to sequence compose map although it might be more efficient. It is of the general form:
def traverse[C[_]: Traverse, F[_]: Applicative, A, B](f: A => F[B], t: C[A]): F[C[B]]
It behaves like map but allows you to carry around some extra Applicative structure during the traversal, sequencing it along the way.
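To illustrate the relationship between traverse, sequence, and map without pulling in a type-class library, here is a sketch specialised to List and Option. Option is only a stand-in for Gen so that the snippet runs with the standard library alone; parse is a made-up function, and the list is taken as the first argument (unlike the general signature above) because that helps Scala's type inference.

```scala
// traverse specialised to List and Option; Option is only a stand-in for Gen here.
def traverse[A, B](xs: List[A])(f: A => Option[B]): Option[List[B]] =
  xs.foldRight(Option(List.empty[B])) { (a, acc) =>
    for {
      b  <- f(a)
      bs <- acc
    } yield b :: bs
  }

// sequence is traverse with the identity function.
def sequence[A](xs: List[Option[A]]): Option[List[A]] =
  traverse(xs)(x => x)

// Hypothetical effectful function: parsing may fail.
def parse(s: String): Option[Int] = scala.util.Try(s.toInt).toOption

val ok   = traverse(List("1", "2", "3"))(parse)      // every element parses
val bad  = traverse(List("1", "x", "3"))(parse)      // one failure fails the whole result
val same = sequence(List("1", "2", "3").map(parse))  // map first, then sequence
```

Here ok and same are equal, which is the "sequence compose map" equivalence; traverse just does it in a single pass.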