Scala : simulate Lotto number generator in Range 1 to 45 - scala

The following is an imperative solution to a simulation of a Number Lottery from the Range 1 to 45, each time we generate a number n1 the number is removed from the set of possible numbers.
Is it possible to achieve the same in a more functional way ? i.e using map,filter etc
def getNumbers :Array[Int] = {
val range = 1 to 45
var set = range.toSet
var resultSet:Set[Int] = Set()
var current: Int = 0
while(resultSet.size < 5 ){
current = Random.shuffle(set).head // pick the head of the shuffled set
set -= current
resultSet += current
}
resultSet.toArray
}
"edit"
An example pick 3 numbers from the Range 1 to 5
Original Set is {1,2,3,4,5}
{1,2,3,4,5} shuffle(1) picked at random 3
{1,2,4,5} shuffle(2) picked at random 2
{1,4,5} shuffle(3) picked at random 4
original Set becomes {1,5}
numbers picked {3,2,4}
each shuffle randomizes a different SET ! => different probabilities
I would like to see a functional "method" with 5 shuffles not 1 shuffle !

Sure, it's possible. The collections API has everything you need. What you're looking for is take, which will take the first n elements of the collection, or as many elements as the collection has if there are less than n.
Random.shuffle(1 to 45).take(5).toArray

I am with m-z on this, however if you REALLY want the functional form of a constant reshuffle, then you want something like this:
import scala.util.Random
import scala.annotation.tailrec
val initialSet = 1 to 45
def lottery(initialSet: Seq[Int], numbersPicked: Int): Set[Int] = {
#tailrec
def internalTailRec(setToUse: Seq[Int], picksLeft: Int, selection: Set[Int]):Set[Int]= {
if(picksLeft == 0) selection
else {
val selected = Random.shuffle(setToUse).head
internalTailRec(setToUse.filter(_ != selected), picksLeft - 1, selection ++ Set(selected))
}
}
internalTailRec(initialSet, numbersPicked, Set())
}
lottery(initialSet, 5)

This would simulate the behaviour you wanted:
Draw one, reshuffle data left, draw one and so on.
import scala.util.Random
import scala.annotation.tailrec
def draw(count: Int, data: Set[Int]): Set[Int] = {
#tailrec
def drawRec( accum: Set[Int] ) : Set[Int] =
if (accum.size == count )
accum
else
drawRec( accum + Random.shuffle( (data -- accum).toList ).head )
drawRec( Set() )
}
Example
scala> draw(5, (1 to 45).toSet)
res15: Set[Int] = Set(1, 41, 45, 17, 22)
scala> draw(5, (1 to 45).toSet)
res16: Set[Int] = Set(5, 24, 1, 6, 28)

I prefer #m-z's solution as well and agree with his reasoning about the probability. Reading the source for Random.shuffle is probably a worth while exercise.
Each iteration removes an element from range and adds it to the accumulator acc which is similar to your imperative approach except that we are not mutating collections and counters.
import scala.util.Random._
import scala.annotation.tailrec
def getNumbers(): Array[Int] = {
val range = (1 to 45).toSeq
#tailrec
def getNumbersR(range: Seq[Int], acc: Array[Int], i: Int): Array[Int] = (i, range(nextInt(range.size))) match{
case (i, x) if i < 5 => getNumbersR(range.filter(_ != x), x +: acc, i + 1)
case (i, x) => acc
}
getNumbersR(range, Array[Int](), 0)
}
scala> getNumbers
res78: Array[Int] = Array(4, 36, 41, 20, 14)

Related

Scala: partitioning number into n almost equal-length ranges

I am trying to implement a function:
def NumberPartition(InputNum:Int,outputListSize:Int):List[Range]
such that:
NumberPartition(8,3)=List(Range(0,3),Range(3,6),Range(6,8))
ie. it creates n-1 equal-length ranges(length=ceil(InputNum/outputListSize)) plus the last/first one being slightly smaller.
I want to use this function for agglomeration of an embarrassingly-parallel program consisting of n subroutines that are going to be batch-handled by n tasks/threads.
What is the most idiomatic way of doing this in Scala?
I think using Range steps could be helpful:
def rangeHeads(n:Int,len:Int):Range=Range(0,n,ceil(n/len))//type conversion for ceil() omitted here.
rangeHeads(8,3)//Range(0, 3, 6)
I just need a function that does (1,2,3,4)->((1,2),(2,3),(3,4))
While this isn't the exact output you are seeking, perhaps this will be good guidance:
scala> def numberPartition(inputNum: Int, outputListSize: Int): List[List[Int]] = {
(0 to inputNum).toList.grouped(outputListSize).toList
}
numberPartition: numberPartition[](val inputNum: Int,val outputListSize: Int) => List[List[Int]]
scala> numberPartition(8, 3)
res0: List[List[Int]] = List(List(0, 1, 2), List(3, 4, 5), List(6, 7, 8))
def roundedUpIntDivide(a:Int,b:Int):Int=a/b + (if(a%b==0) 0 else 1)
def partitionToRanges(n:Int,len:Int): List[(Int, Int)] ={
(Range(0,n,roundedUpIntDivide(n,len)):+n)
.sliding(2)
.map(x => (x(0),x(1)))
.toList
}
thanks to #jwvh for suggesting sliding()
def numberPartition(inputNum: Int, outputListSize: Int): List[(Int, Int)] = {
val range = 0.until(inputNum).by(inputNum/outputListSize + 1).:+(inputNum)
range.zip(range.tail).toList
}
Note:
0.until(inputNum) is an exclusive range [0, inputNum);
.by(inputNum/outputListSize + 1) is the step;
.:+(inputNum) it adds back the range upper bound;
range.zip(range.tail) build couples from list, equals to .sliding(2).map { case Seq(x,y) => (x,y) }

Scala List Operation

Given a List of Int and variable X of Int type . What is the best in Scala functional way to retain only those values in the List (starting from beginning of list) such that sum of list values is less than equal to variable.
This is pretty close to a one-liner:
def takeWhileLessThan(x: Int)(l: List[Int]): List[Int] =
l.scan(0)(_ + _).tail.zip(l).takeWhile(_._1 <= x).map(_._2)
Let's break that into smaller pieces.
First you use scan to create a list of cumulative sums. Here's how it works on a small example:
scala> List(1, 2, 3, 4).scan(0)(_ + _)
res0: List[Int] = List(0, 1, 3, 6, 10)
Note that the result includes the initial value, which is why we take the tail in our implementation.
scala> List(1, 2, 3, 4).scan(0)(_ + _).tail
res1: List[Int] = List(1, 3, 6, 10)
Now we zip the entire thing against the original list. Taking our example again, this looks like the following:
scala> List(1, 2, 3, 4).scan(0)(_ + _).tail.zip(List(1, 2, 3, 4))
res2: List[(Int, Int)] = List((1,1), (3,2), (6,3), (10,4))
Now we can use takeWhile to take as many values as we can from this list before the cumulative sum is greater than our target. Let's say our target is 5 in our example:
scala> res2.takeWhile(_._1 <= 5)
res3: List[(Int, Int)] = List((1,1), (3,2))
This is almost what we want—we just need to get rid of the cumulative sums:
scala> res2.takeWhile(_._1 <= 5).map(_._2)
res4: List[Int] = List(1, 2)
And we're done. It's worth noting that this isn't very efficient, since it computes the cumulative sums for the entire list, etc. The implementation could be optimized in various ways, but as it stands it's probably the simplest purely functional way to do this in Scala (and in most cases the performance won't be a problem, anyway).
In addition to Travis' answer (and for the sake of completeness), you can always implement these type of operations as a foldLeft:
def takeWhileLessThanOrEqualTo(maxSum: Int)(list: Seq[Int]): Seq[Int] = {
// Tuple3: the sum of elements so far; the accumulated list; have we went over x, or in other words are we finished yet
val startingState = (0, Seq.empty[Int], false)
val (_, accumulatedNumbers, _) = list.foldLeft(startingState) {
case ((sum, accumulator, finished), nextNumber) =>
if(!finished) {
if (sum + nextNumber > maxSum) (sum, accumulator, true) // We are over the sum limit, finish
else (sum + nextNumber, accumulator :+ nextNumber, false) // We are still under the limit, add it to the list and sum
} else (sum, accumulator, finished) // We are in a finished state, just keep iterating over the list
}
accumulatedNumbers
}
This only iterates over the list once, so it should be more efficient, but is more complicated and requires a bit of reading code to understand.
I will go with something like this, which is more functional and should be efficient.
def takeSumLessThan(x:Int,l:List[Int]): List[Int] = (x,l) match {
case (_ , List()) => List()
case (x, _) if x<= 0 => List()
case (x, lh :: lt) => lh :: takeSumLessThan(x-lh,lt)
}
Edit 1 : Adding tail recursion and implicit for shorter call notation
import scala.annotation.tailrec
implicit class MyList(l:List[Int]) {
def takeSumLessThan(x:Int) = {
#tailrec
def f(x:Int,l:List[Int],acc:List[Int]) : List[Int] = (x,l) match {
case (_,List()) => acc
case (x, _ ) if x <= 0 => acc
case (x, lh :: lt ) => f(x-lh,lt,acc ++ List(lh))
}
f(x,l,Nil)
}
}
Now you can use this like
List(1,2,3,4,5,6,7,8).takeSumLessThan(10)

How can I find repeated items in a Scala List?

I have a Scala List that contains some repeated numbers. I want to count the number of times a specific number will repeat itself. For example:
val list = List(1,2,3,3,4,2,8,4,3,3,5)
val repeats = list.takeWhile(_ == List(3,3)).size
And the val repeats would equal 2.
Obviously the above is pseudo-code and takeWhile will not find two repeated 3s since _ represents an integer. I tried mixing both takeWhile and take(2) but with little success. I also referred code from How to find count of repeatable elements in scala list but it appears the author is looking to achieve something different.
Thanks for your help.
This will work in this case:
val repeats = list.sliding(2).count(_.forall(_ == 3))
The sliding(2) method gives you an iterator of lists of elements and successors and then we just count where these two are equal to 3.
Question is if it creates the correct result to List(3, 3, 3)? Do you want that to be 2 or just 1 repeat.
val repeats = list.sliding(2).toList.count(_==List(3,3))
and more generally the following code returns tuples of element and repeats value for all elements:
scala> list.distinct.map(x=>(x,list.sliding(2).toList.count(_.forall(_==x))))
res27: List[(Int, Int)] = List((1,0), (2,0), (3,2), (4,0), (8,0), (5,0))
which means that the element '3' repeats 2 times consecutively at 2 places and all others 0 times.
and also if we want element repeats 3 times consecutively we just need to modify the code as follows:
list.distinct.map(x=>(x,list.sliding(3).toList.count(_.forall(_==x))))
in SCALA REPL:
scala> val list = List(1,2,3,3,3,4,2,8,4,3,3,3,5)
list: List[Int] = List(1, 2, 3, 3, 3, 4, 2, 8, 4, 3, 3, 3, 5)
scala> list.distinct.map(x=>(x,list.sliding(3).toList.count(_==List(x,x,x))))
res29: List[(Int, Int)] = List((1,0), (2,0), (3,2), (4,0), (8,0), (5,0))
Even sliding value can be varied by defining a function as:
def repeatsByTimes(list:List[Int],n:Int) =
list.distinct.map(x=>(x,list.sliding(n).toList.count(_.forall(_==x))))
Now in REPL:
scala> val list = List(1,2,3,3,4,2,8,4,3,3,5)
list: List[Int] = List(1, 2, 3, 3, 4, 2, 8, 4, 3, 3, 5)
scala> repeatsByTimes(list,2)
res33: List[(Int, Int)] = List((1,0), (2,0), (3,2), (4,0), (8,0), (5,0))
scala> val list = List(1,2,3,3,3,4,2,8,4,3,3,3,2,4,3,3,3,5)
list: List[Int] = List(1, 2, 3, 3, 3, 4, 2, 8, 4, 3, 3, 3, 2, 4, 3, 3, 3, 5)
scala> repeatsByTimes(list,3)
res34: List[(Int, Int)] = List((1,0), (2,0), (3,3), (4,0), (8,0), (5,0))
scala>
We can go still further like given a list of integers and given a maximum number
of consecutive repetitions that any of the element can occur in the list, we may need a list of 3-tuples representing (the element, number of repetitions of this element, at how many places this repetition occurred). this is more exhaustive information than the above. Can be achieved by writing a function like this:
def repeats(list:List[Int],maxRep:Int) =
{ var v:List[(Int,Int,Int)] = List();
for(i<- 1 to maxRep)
v = v ++ list.distinct.map(x=>
(x,i,list.sliding(i).toList.count(_.forall(_==x))))
v.sortBy(_._1) }
in SCALA REPL:
scala> val list = List(1,2,3,3,3,4,2,8,4,3,3,3,2,4,3,3,3,5)
list: List[Int] = List(1, 2, 3, 3, 3, 4, 2, 8, 4, 3, 3, 3, 2, 4, 3, 3, 3, 5)
scala> repeats(list,3)
res38: List[(Int, Int, Int)] = List((1,1,1), (1,2,0), (1,3,0), (2,1,3),
(2,2,0), (2,3,0), (3,1,9), (3,2,6), (3,3,3), (4,1,3), (4,2,0), (4,3,0),
(5,1,1), (5,2,0), (5,3,0), (8,1,1), (8,2,0), (8,3,0))
scala>
These results can be understood as follows:
1 times the element '1' occurred at 1 places.
2 times the element '1' occurred at 0 places.
............................................
............................................
.............................................
2 times the element '3' occurred at 6 places..
.............................................
3 times the element '3' occurred at 3 places...
............................................and so on.
Thanks to Luigi Plinge I was able to use methods in run-length encoding to group together items in a list that repeat. I used some snippets from this page here: http://aperiodic.net/phil/scala/s-99/
var n = 0
runLengthEncode(totalFrequencies).foreach{ o =>
if(o._1 > 1 && o._2==subjectNumber) n+=1
}
n
The method runLengthEncode is as follows:
private def pack[A](ls: List[A]): List[List[A]] = {
if (ls.isEmpty) List(List())
else {
val (packed, next) = ls span { _ == ls.head }
if (next == Nil) List(packed)
else packed :: pack(next)
}
}
private def runLengthEncode[A](ls: List[A]): List[(Int, A)] =
pack(ls) map { e => (e.length, e.head) }
I'm not entirely satisfied that I needed to use the mutable var n to count the number of occurrences but it did the trick. This will count the number of times a number repeats itself no matter how many times it is repeated.
If you knew your list was not very long you could do it with Strings.
val list = List(1,2,3,3,4,2,8,4,3,3,5)
val matchList = List(3,3)
(matchList.mkString(",")).r.findAllMatchIn(list.mkString(",")).length
From you pseudocode I got this working:
val pairs = list.sliding(2).toList //create pairs of consecutive elements
val result = pairs.groupBy(x => x).map{ case(x,y) => (x,y.size); //group pairs and retain the size, which is the number of occurrences.
result will be a Map[List[Int], Int] so you can the count number like:
result(List(3,3)) // will return 2
I couldn't understand if you also want to check lists of several sizes, then you would need to change the parameter to sliding to the desired size.
def pack[A](ls: List[A]): List[List[A]] = {
if (ls.isEmpty) List(List())
else {
val (packed, next) = ls span { _ == ls.head }
if (next == Nil) List(packed)
else packed :: pack(next)
}
}
def encode[A](ls: List[A]): List[(Int, A)] = pack(ls) map { e => (e.length, e.head) }
val numberOfNs = list.distinct.map{ n =>
(n -> list.count(_ == n))
}.toMap
val runLengthPerN = runLengthEncode(list).map{ t => t._2 -> t._1}.toMap
val nRepeatedMostInSuccession = runLengthPerN.toList.sortWith(_._2 <= _._2).head._1
Where runLength is defined as below from scala's 99 problems problem 9 and scala's 99 problems problem 10.
Since numberOfNs and runLengthPerN are Maps, you can get the population count of any number in the list with numberOfNs(number) and the length of the longest repitition in succession with runLengthPerN(number). To get the runLength, just compute as above with runLength(list).map{ t => t._2 -> t._1 }.

Count occurrences of each item in a Scala parallel collection

My question is very similar to Count occurrences of each element in a List[List[T]] in Scala, except that I would like to have an efficient solution involving parallel collections.
Specifically, I have a large (~10^7) vector vec of short (~10) lists of Ints, and I would like to get for each Int x the number of times x occurs, for example as a Map[Int,Int]. The number of distinct integers is of the order 10^6.
Since the machine this needs to be done on has a fair amount of memory (150GB) and number of cores (>100) it seems like parallel collections would be a good choice for this. Is the code below a good approach?
val flatpvec = vec.par.flatten
val flatvec = flatpvec.seq
val unique = flatpvec.distinct
val counts = unique map (x => (x -> flatvec.count(_ == x)))
counts.toMap
Or are there better solutions? In case you are wondering about the .seq conversion: for some reason the following code doesn't seem to terminate, even for small examples:
val flatpvec = vec.par.flatten
val unique = flatpvec.distinct
val counts = unique map (x => (x -> flatpvec.count(_ == x)))
counts.toMap
This does something. aggregate is like fold except you also combine the results of the sequential folds.
Update: It's not surprising that there is overhead in .par.groupBy, but I was surprised by the constant factor. By these numbers, you would never count that way. Also, I had to bump the memory way up.
The interesting technique used to build the result map is described in this paper linked from the overview. (It cleverly saves the intermediate results and then coalesces them in parallel at the end.)
But copying around the intermediate results of the groupBy turns out to be expensive, if all you really want is a count.
The numbers are comparing sequential groupBy, parallel, and finally aggregate.
apm#mara:~/tmp$ scalacm countints.scala ; scalam -J-Xms8g -J-Xmx8g -J-Xss1m countints.Test
GroupBy: Starting...
Finished in 12695
GroupBy: List((233,10078), (237,20041), (268,9939), (279,9958), (315,10141), (387,9917), (462,9937), (680,9932), (848,10139), (858,10000))
Par GroupBy: Starting...
Finished in 51481
Par GroupBy: List((233,10078), (237,20041), (268,9939), (279,9958), (315,10141), (387,9917), (462,9937), (680,9932), (848,10139), (858,10000))
Aggregate: Starting...
Finished in 2672
Aggregate: List((233,10078), (237,20041), (268,9939), (279,9958), (315,10141), (387,9917), (462,9937), (680,9932), (848,10139), (858,10000))
Nothing magical in the test code.
import collection.GenTraversableOnce
import collection.concurrent.TrieMap
import collection.mutable
import concurrent.duration._
trait Timed {
def now = System.nanoTime
def timed[A](op: =>A): A = {
val start = now
val res = op
val end = now
val lapsed = (end - start).nanos.toMillis
Console println s"Finished in $lapsed"
res
}
def showtime(title: String, op: =>GenTraversableOnce[(Int,Int)]): Unit = {
Console println s"$title: Starting..."
val res = timed(op)
//val showable = res.toIterator.min //(res.toIterator take 10).toList
val showable = res.toList.sorted take 10
Console println s"$title: $showable"
}
}
It generates some random data for interest.
object Test extends App with Timed {
val upto = math.pow(10,6).toInt
val ran = new java.util.Random
val ten = (1 to 10).toList
val maxSamples = 1000
// samples of ten random numbers in the desired range
val samples = (1 to maxSamples).toList map (_ => ten map (_ => ran nextInt upto))
// pick a sample at random
def anyten = samples(ran nextInt maxSamples)
def mag = 7
val data: Vector[List[Int]] = Vector.fill(math.pow(10,mag).toInt)(anyten)
The sequential operation and the combining operation of aggregate are invoked from a task, and the result is assigned to a volatile var.
def z: mutable.Map[Int,Int] = mutable.Map.empty[Int,Int]
def so(m: mutable.Map[Int,Int], is: List[Int]) = {
for (i <- is) {
val v = m.getOrElse(i, 0)
m(i) = v + 1
}
m
}
def co(m: mutable.Map[Int,Int], n: mutable.Map[Int,Int]) = {
for ((i, count) <- n) {
val v = m.getOrElse(i, 0)
m(i) = v + count
}
m
}
showtime("GroupBy", data.flatten groupBy identity map { case (k, vs) => (k, vs.size) })
showtime("Par GroupBy", data.flatten.par groupBy identity map { case (k, vs) => (k, vs.size) })
showtime("Aggregate", data.par.aggregate(z)(so, co))
}
If you want to make use of parallel collections and Scala standard tools, you could do it like that. Group your collection by the identity and then map it to (Value, Count):
scala> val longList = List(1, 5, 2, 3, 7, 4, 2, 3, 7, 3, 2, 1, 7)
longList: List[Int] = List(1, 5, 2, 3, 7, 4, 2, 3, 7, 3, 2, 1, 7)
scala> longList.par.groupBy(x => x)
res0: scala.collection.parallel.immutable.ParMap[Int,scala.collection.parallel.immutable.ParSeq[Int]] = ParMap(5 -> ParVector(5), 1 -> ParVector(1, 1), 2 -> ParVector(2, 2, 2), 7 -> ParVector(7, 7, 7), 3 -> ParVector(3, 3, 3), 4 -> ParVector(4))
scala> longList.par.groupBy(x => x).map(x => (x._1, x._2.size))
res1: scala.collection.parallel.immutable.ParMap[Int,Int] = ParMap(5 -> 1, 1 -> 2, 2 -> 3, 7 -> 3, 3 -> 3, 4 -> 1)
Or even nicer like pagoda_5b suggested in the comments:
scala> longList.par.groupBy(identity).mapValues(_.size)
res1: scala.collection.parallel.ParMap[Int,Int] = ParMap(5 -> 1, 1 -> 2, 2 -> 3, 7 -> 3, 3 -> 3, 4 -> 1)

How can I find the index of the maximum value in a List in Scala?

For a Scala List[Int] I can call the method max to find the maximum element value.
How can I find the index of the maximum element?
This is what I am doing now:
val max = list.max
val index = list.indexOf(max)
One way to do this is to zip the list with its indices, find the resulting pair with the largest first element, and return the second element of that pair:
scala> List(0, 43, 1, 34, 10).zipWithIndex.maxBy(_._1)._2
res0: Int = 1
This isn't the most efficient way to solve the problem, but it's idiomatic and clear.
Since Seq is a function in Scala, the following code works:
list.indices.maxBy(list)
even easier to read would be:
val g = List(0, 43, 1, 34, 10)
val g_index=g.indexOf(g.max)
def maxIndex[ T <% Ordered[T] ] (list : List[T]) : Option[Int] = list match {
case Nil => None
case head::tail => Some(
tail.foldLeft((0, head, 1)){
case ((indexOfMaximum, maximum, index), elem) =>
if(elem > maximum) (index, elem, index + 1)
else (indexOfMaximum, maximum, index + 1)
}._1
)
} //> maxIndex: [T](list: List[T])(implicit evidence$2: T => Ordered[T])Option[Int]
maxIndex(Nil) //> res0: Option[Int] = None
maxIndex(List(1,2,3,4,3)) //> res1: Option[Int] = Some(3)
maxIndex(List("a","x","c","d","e")) //> res2: Option[Int] = Some(1)
maxIndex(Nil).getOrElse(-1) //> res3: Int = -1
maxIndex(List(1,2,3,4,3)).getOrElse(-1) //> res4: Int = 3
maxIndex(List(1,2,2,1)).getOrElse(-1) //> res5: Int = 1
In case there are multiple maximums, it returns the first one's index.
Pros:You can use this with multiple types, it goes through the list only once, you can supply a default index instead of getting exception for empty lists.
Cons:Maybe you prefer exceptions :) Not a one-liner.
I think most of the solutions presented here go thru the list twice (or average 1.5 times) -- Once for max and the other for the max position. Perhaps a lot of focus is on what looks pretty?
In order to go thru a non empty list just once, the following can be tried:
list.foldLeft((0, Int.MinValue, -1)) {
case ((i, max, maxloc), v) =>
if (v > max) (i + 1, v, i)
else (i + 1, max, maxloc)}._3
Pimp my library! :)
class AwesomeList(list: List[Int]) {
def getMaxIndex: Int = {
val max = list.max
list.indexOf(max)
}
}
implicit def makeAwesomeList(xs: List[Int]) = new AwesomeList(xs)
//> makeAwesomeList: (xs: List[Int])scalaconsole.scratchie1.AwesomeList
//Now we can do this:
List(4,2,7,1,5,6) getMaxIndex //> res0: Int = 2
//And also this:
val myList = List(4,2,7,1,5,6) //> myList : List[Int] = List(4, 2, 7, 1, 5, 6)
myList getMaxIndex //> res1: Int = 2
//Regular list methods also work
myList filter (_%2==0) //> res2: List[Int] = List(4, 2, 6)
More details about this pattern here: http://www.artima.com/weblogs/viewpost.jsp?thread=179766