Scalacheck number generator between 0 <= x < 2^64 - scala

I'm trying to write a good number generator that covers the range of uint64_t in C. Here is what I have so far.
def uInt64s : Gen[BigInt] = Gen.choose(0,64).map(pow2(_) - 1)
It is a good start, but it only generates numbers of the form 2^n - 1. Is there a more effective way to generate random BigInts while preserving the number range 0 <= n < 2^64?

Okay, maybe I am missing something here, but isn't it as simple as this?
def uInt64s : Gen[BigInt] = Gen.chooseNum(Long.MinValue,Long.MaxValue)
.map(x => BigInt(x) + BigInt(2).pow(63))
Longs already have the correct number of bits; adding 2^63 maps Long.MinValue to 0 and Long.MaxValue to 2^64 - 1. The addition is done with BigInts, of course.
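The offset arithmetic can be checked without ScalaCheck. The following is only a minimal sketch in plain Scala: the names UInt64Offset, toUInt64 and sample are hypothetical, and scala.util.Random stands in for the generator.

```scala
import scala.util.Random

object UInt64Offset {
  // 2^63: the distance between Long.MinValue and 0.
  val Offset: BigInt = BigInt(2).pow(63)

  // Exclusive upper bound of the unsigned 64-bit range.
  val Limit: BigInt = BigInt(2).pow(64)

  // Shifts a signed 64-bit value into the range 0 <= x < 2^64.
  def toUInt64(x: Long): BigInt = BigInt(x) + Offset

  // Draws one value from the unsigned 64-bit range (hypothetical helper).
  def sample(rnd: Random): BigInt = toUInt64(rnd.nextLong())
}
```

Because every Long maps to exactly one value in the target range, the mapping itself adds no bias; the distribution of the results is whatever the underlying Long source provides.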
I was curious about the distribution of generated values. Apparently the distribution of chooseNum is not uniform, since it prefers special values, but the edge cases for Longs are probably also interesting for UInt64s:
/** Generates numbers within the given inclusive range, with
* extra weight on zero, +/- unity, both extremities, and any special
* numbers provided. The special numbers must lie within the given range,
* otherwise they won't be included. */
def chooseNum[T](minT: T, maxT: T, specials: T*)(

With ScalaCheck...
Generating a number from 0..Long.MaxValue is easy.
Generating an unsigned long from 0..Long.MaxValue..2^64-1 is not so easy.
Tried:
❌ Gen.chooseNum(BigInt(0),BigInt(2).pow(64)-1) Does not work: at this time there is no implicit defined for BigInt.
❌ Arbitrary.arbBigInt.arbitrary Does not work: its type is BigInt, but it is still limited to the range of a signed Long.
✔ Generate a Long as BigInt and shift left arbitrarily to make a UINT64 Works: taking Rickard Nilsson's ScalaCheck code as a guide, this passed the test.
This is what I came up with:
// Generate a long and map to type BigInt
def genBigInt : Gen[BigInt] = Gen.chooseNum(0,Long.MaxValue) map (x => BigInt(x))
// Take genBigInt and shift-left a chooseNum(0,64) of positions
def genUInt64 : Gen[BigInt] = for { bi <- genBigInt; n <- Gen.chooseNum(0,64); x = (bi << n) if x >= 0 && x < BigInt(2).pow(64) } yield x
...
// Use the generator, genUInt64()
As noted in another answer to this question (Scalacheck number generator between 0 <= x < 2^64), the distribution of the BigInts generated this way is not even. The preferred generator is @stholzm's solution:
def genUInt64b : Gen[BigInt] =
Gen.chooseNum(Long.MinValue,Long.MaxValue) map (x =>
BigInt(x) + BigInt(2).pow(63))
It is simpler, the numbers fed to ScalaCheck will be more evenly distributed, it is faster, and it passes the tests.

A simpler and more efficient alternative to stholzm's answer is as follows:
val myGen = {
val offset = -BigInt(Long.MinValue)
Arbitrary.arbitrary[Long].map { BigInt(_) + offset }
}
1. Generate an arbitrary Long;
2. Convert it to a BigInt;
3. Add the appropriate offset, i.e. -BigInt(Long.MinValue).
Tests in the REPL:
scala> myGen.sample
res0: Option[scala.math.BigInt] = Some(9223372036854775807)
scala> myGen.sample
res1: Option[scala.math.BigInt] = Some(12628207908230674671)
scala> myGen.sample
res2: Option[scala.math.BigInt] = Some(845964316914833060)
scala> myGen.sample
res3: Option[scala.math.BigInt] = Some(15120039215775627454)
scala> myGen.sample
res4: Option[scala.math.BigInt] = Some(0)
scala> myGen.sample
res5: Option[scala.math.BigInt] = Some(13652951502631572419)

Here is what I have so far; I'm not entirely happy with it:
/**
* Chooses a BigInt in the ranges of 0 <= bigInt < 2^^64
* @return
*/
def bigInts : Gen[BigInt] = for {
bigInt <- Arbitrary.arbBigInt.arbitrary
exponent <- Gen.choose(1,2)
} yield bigInt.pow(exponent)
def positiveBigInts : Gen[BigInt] = bigInts.filter(_ >= 0)
def bigIntsUInt64Range : Gen[BigInt] = positiveBigInts.filter(_ < (BigInt(1) << 64))
/**
* Generates a number in the range 0 <= x < 2^^64
* then wraps it in a UInt64
* @return
*/
def uInt64s : Gen[UInt64] = for {
bigInt <- bigIntsUInt64Range
} yield UInt64(bigInt)
Since it appears that Arbitrary.arbBigInt.arbitrary only covers the range -2^63 <= x <= 2^63, I take x^2 some of the time to get a number larger than 2^63.
Feel free to comment if you see a place where improvements can be made or a bug fixed.

Related

How to find the largest multiple of n that fits in a 32 bit integer

I am reading Functional Programming in Scala and am having trouble understanding a piece of code. I have checked the errata for the book and the passage in question does not have a misprint. (Actually, it does have a misprint, but the misprint does not affect the code that I have a question about.)
The code in question calculates a pseudo-random, non-negative integer that is less than some upper bound. The function that does this is called nonNegativeLessThan.
trait RNG {
def nextInt: (Int, RNG) // Should generate a random `Int`.
}
case class Simple(seed: Long) extends RNG {
def nextInt: (Int, RNG) = {
val newSeed = (seed * 0x5DEECE66DL + 0xBL) & 0xFFFFFFFFFFFFL // `&` is bitwise AND. We use the current seed to generate a new seed.
val nextRNG = Simple(newSeed) // The next state, which is an `RNG` instance created from the new seed.
val n = (newSeed >>> 16).toInt // `>>>` is right binary shift with zero fill. The value `n` is our new pseudo-random integer.
(n, nextRNG) // The return value is a tuple containing both a pseudo-random integer and the next `RNG` state.
}
}
type Rand[+A] = RNG => (A, RNG)
def nonNegativeInt(rng: RNG): (Int, RNG) = {
val (i, r) = rng.nextInt
(if (i < 0) -(i + 1) else i, r)
}
def nonNegativeLessThan(n: Int): Rand[Int] = { rng =>
val (i, rng2) = nonNegativeInt(rng)
val mod = i % n
if (i + (n-1) - mod >= 0) (mod, rng2)
else nonNegativeLessThan(n)(rng2)
}
I have trouble understanding the following code in nonNegativeLessThan that looks like this: if (i + (n-1) - mod >= 0) (mod, rng2), etc.
The book explains that this entire if-else expression is necessary because a naive implementation that simply takes the mod of the result of nonNegativeInt would be slightly skewed toward lower values since Int.MaxValue is not guaranteed to be a multiple of n. Therefore, this code is meant to check if the generated output of nonNegativeInt would be larger than the largest multiple of n that fits inside a 32 bit value. If the generated number is larger than the largest multiple of n that fits inside a 32 bit value, the function recalculates the pseudo-random number.
To elaborate, the naive implementation would look like this:
def naiveNonNegativeLessThan(n: Int): Rand[Int] = map(nonNegativeInt){_ % n}
where map is defined as follows
def map[A,B](s: Rand[A])(f: A => B): Rand[B] = {
rng =>
val (a, rng2) = s(rng)
(f(a), rng2)
}
To repeat, this naive implementation is not desirable because of a slight skew towards lower values when Int.MaxValue is not a perfect multiple of n.
So, to reiterate the question: what does the following code do, and how does it help us determine whether a number is smaller than the largest multiple of n that fits inside a 32 bit integer? I am talking about this code inside nonNegativeLessThan:
if (i + (n-1) - mod >= 0) (mod, rng2)
else nonNegativeLessThan(n)(rng2)
I had exactly the same confusion about this passage from Functional Programming in Scala, and I agree with jwvh's analysis: the condition if (i + (n-1) - mod >= 0) will always be true.
In fact, if one tries the same example in Rust, the compiler warns about this (an interesting comparison of how much static checking is being done). Of course, jwvh's pencil-and-paper approach is absolutely the right one.
We first define some type aliases so the code matches the Scala code more closely (forgive my Rust if it's not quite idiomatic).
pub type RNGType = Box<dyn RNG>;
pub type Rand<A> = Box<dyn Fn(RNGType) -> (A, RNGType)>;
pub fn non_negative_less_than(n: u32) -> Rand<u32> {
let t = move |rng: RNGType| {
let (i, rng2) = non_negative_int(rng);
let rem = i % n;
if i + (n - 1) - rem >= 0 {
(rem, rng2)
} else {
non_negative_less_than(n)(rng2)
}
};
Box::new(t)
}
The compiler warning regarding if i + (n - 1) - rem >= 0 is:
warning: comparison is useless due to type limits

find out if a number is a good number in scala

Hi, I am new to Scala and its functional programming methodology. I want to pass a number to my function and check whether it is a good number or not.
A number is a good number if every digit is larger than the sum of the digits to its right.
For example:
9620 is good as (2 > 0, 6 > 2+0, 9 > 6+2+0)
The steps I am using to solve this are:
1. converting a number to string and reversing it
2. storing all digits of the reversed number as elements of a list
3. applying for loop from i equals 1 to length of number - 1
4. calculating sum of first i digits as num2
5. extracting the ith digit from the list as digit1, which is the digit just to the left of the first i digits whose sum we calculated (list indices start from zero).
6. comparing the outputs of steps 4 and 5: if num2 is greater than or equal to digit1, we break out of the for loop and print that it is not a good number.
Please find my code below:
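import scala.util.control.Breaks.{break, breakable}

val num1 = 9521.toString.reverse
val list1 = num1.map(_.asDigit).toList
breakable {
  for (i <- 1 to num1.length - 1) {
    // Sum of the i digits to the right of the current digit.
    val num2 = num1.take(i).map(_.asDigit).sum
    val digit1 = list1(i)
    // A good number needs digit1 to be strictly larger than num2.
    if (num2 >= digit1) {
      print("number is not a good number")
      break()
    }
  }
}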
I know this is not the most optimized way to solve this problem. Also I am looking for a way to code this using tail recursion where I pass two numbers and get all the good numbers falling in between those two numbers.
Can this be done in more optimized way?
Thanks in advance!
No String conversions required.
val n = 9620
val isGood = Stream.iterate(n)(_/10)
.takeWhile(_>0)
.map(_%10)
.foldLeft((true,-1)){ case ((bool,sum),digit) =>
(bool && digit > sum, sum+digit)
}._1
Here is a purely numeric version using a recursive function.
import scala.annotation.tailrec

def isGood(n: Int): Boolean = {
  @tailrec
  def loop(n: Int, sum: Int): Boolean =
    (n == 0) || (n%10 > sum && loop(n/10, sum + n%10))
  loop(n/10, n%10)
}
This should compile into an efficient loop.
Use the function below. It is efficient because forall does not traverse the entire list of digits: it stops as soon as the condition v(i) > v.drop(i+1).sum is false for some digit while moving from left to right through the vector v.
def isGood(n: Int)= {
val v1 = n.toString.map(_.asDigit)
val v = if(v1.last!=0) v1 else v1.dropRight(1)
(0 to v.size-1).forall(i=>v(i)>v.drop(i+1).sum)
}
If we want to find good numbers in an interval of integers ranging from n1 to n2 we can use this function:
def goodNums(n1:Int,n2:Int) = (n1 to n2).filter(isGood(_))
In Scala REPL:
scala> isGood(9620)
res51: Boolean = true
scala> isGood(9600)
res52: Boolean = false
scala> isGood(9641)
res53: Boolean = false
scala> isGood(9521)
res54: Boolean = true
scala> goodNums(412,534)
res66: scala.collection.immutable.IndexedSeq[Int] = Vector(420, 421, 430, 510, 520, 521, 530, 531)
scala> goodNums(3412,5334)
res67: scala.collection.immutable.IndexedSeq[Int] = Vector(4210, 5210, 5310)
This is a more functional way. pairs is a list of tuples between a digit and the sum of the following digits. It is easy to create these tuples with drop, take and slice (a combination of drop and take) methods.
Finally I can represent my condition in an expressive way with forall method.
val n = 9620
val str = n.toString
val pairs = for { x <- 1 until str.length } yield (str.slice(x - 1, x).toInt, str.drop(x).map(_.asDigit).sum)
pairs.forall { case (a, b) => a > b }
If you want to be functional and expressive, avoid using break. If you need to check a condition for each element, it is a good idea to express your problem in terms of collections, so you can use forall.
It does not matter here, but if you want performance (that is, you don't want to build the entire pairs collection when the condition already fails for the first element), you can change the collection in your for expression from a Range to a Stream.
(1 until str.length).toStream
Functional style tends to prefer monadic type things, such as maps and reduces. To make this look functional and clear, I'd do something like:
def isGood(value: Int) =
value.toString.reverse.map(digit=>Some(digit.asDigit)).
reduceLeft[Option[Int]]
{
case(sum, Some(digit)) => sum.collectFirst{case sum if sum < digit => sum+digit}
}.isDefined
Instead of using tail recursion to calculate this for ranges, just generate the range and then filter over it:
def goodInRange(low: Int, high: Int) = (low to high).filter(isGood(_))

Why does scala return an out of range value in this modulo operation?

This is a piece of code to generate random Long values within a given range, simplified for clarity:
def getLong(min: Long, max: Long): Long = {
if(min > max) {
throw new IncorrectBoundsException
}
val rangeSize = (max - min + 1L)
val randValue = math.abs(Random.nextLong())
val result = (randValue % (rangeSize)) + min
result
}
I know the results of this aren't uniform and this wouldn't work correctly for some values of min and max, but that's beside the point.
In the tests it turned out, that the following assertion isn't always true:
getLong(-1L, 1L) >= -1L
More specifically the returned value is -3. How is that even possible?
As it turns out, math.abs(x: Long): Long isn't guaranteed to always return non-negative values! There is no Long value that could represent math.abs(Long.MinValue), so instead of throwing an exception, math.abs returns Long.MinValue:
scala> Long.MinValue
res27: Long = -9223372036854775808
scala> math.abs(Long.MinValue)
res28: Long = -9223372036854775808
scala> math.abs(Long.MinValue) % 3
res29: Long = -2
scala> math.abs(Long.MinValue) % 3 + (-1)
res30: Long = -3
Which is, in my opinion, a very good example of why one should be using ScalaCheck to test at least parts of their codebase.
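One possible way to avoid the negative remainder is to skip math.abs altogether and use a floor-based modulo. This is only a sketch: SafeRange and toRange are hypothetical names, the raw random value is passed in as a parameter to keep the example deterministic, and it still assumes min <= max and that max - min + 1 does not overflow.

```scala
object SafeRange {
  // Maps an arbitrary raw Long into [min, max].
  // Assumes min <= max and that max - min + 1 does not overflow.
  def toRange(raw: Long, min: Long, max: Long): Long = {
    val rangeSize = max - min + 1L
    // floorMod returns a value in [0, rangeSize) for a positive rangeSize,
    // even when raw is Long.MinValue.
    java.lang.Math.floorMod(raw, rangeSize) + min
  }
}
```

For example, SafeRange.toRange(Long.MinValue, -1L, 1L) returns 0, which lies inside the requested range. As with the original, the results are not perfectly uniform.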

Scala - What type are the numbers in the List using x.toString.toList?

I have written a function in Scala that should calculate the sum of the squares of the digits of a number. Eg: 44 -> 32 (4^2 + 4^2 = 16 + 16 = 32)
Here it is:
def digitSum(x:BigInt) : BigInt = {
var sum = 0
val leng = x.toString.toList.length
var y = x.toString.toList
for (i<-0 until leng ) {
sum += y(i).toInt * y(i).toInt
}
return sum
}
However when I call the function let's say with digitSum(44) instead of 32 I get 5408.
Why is this happening? Does it have to do with the fact that the list contains Strings? If so, why does the .toInt method not work?
Thanks!
The answer to your question has already been covered in "Scala int value of String characters". Have a good read through it and you will have more information than required ;)
Also looking at your code, it can benefit more from Scala expressiveness and functional features. The same function can be written in the following manner:
def digitSum(x: BigInt) = x.toString
.map(_.asDigit)
.map(a => a * a)
.sum
In the future, try to avoid mutable variables and standard looping techniques where you can.
When you call toString.toList you get a list of Chars, not Ints, and calling toInt on a Char returns its character code rather than the digit it represents. This is what it looks like in the REPL:
scala> "1".toList.map(_.toInt)
res0: List[Int] = List(49)
What you want is probably something like this:
def digitSum(x:BigInt) : BigInt = {
var sum = 0
val leng = x.toString.toList.length
var y = x.toString.toList
for (i<-0 until leng ) {
sum += (y(i).toInt - 48) * (y(i).toInt - 48) //Subtract out char base
}
sum
}
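The value 5408 from the question can be reproduced by hand: the character '4' has code 52, and 52 * 52 + 52 * 52 = 5408. A small sketch of the difference between toInt and asDigit follows; the object name DigitSquares is hypothetical.

```scala
object DigitSquares {
  // Character code of a digit character, e.g. '4' -> 52.
  def charCode(c: Char): Int = c.toInt

  // Numeric value of a digit character, e.g. '4' -> 4.
  def digitValue(c: Char): Int = c.asDigit

  // Sum of the squares of the decimal digits of a non-negative number.
  def digitSum(x: BigInt): BigInt =
    x.toString.map(c => BigInt(c.asDigit).pow(2)).sum
}
```

With this definition, DigitSquares.digitSum(44) gives 32.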

Is there a Scala-way to get the length of a number?

I would like to know if there is a built-in Scala method to get the length of the decimal representation of an integer.
Example: 45 has length 2; 10321 has length 5.
One could get the length with 10321.toString.length, but this smells a bit because of the overhead of creating a String object. Is there a nicer way or a built-in method?
UPDATE:
With 'nicer' I mean a faster solution
I am only interested in positive integers
This is definitely personal preference, but I think the logarithm method looks nicer without a branch. For positive values only, the abs can be omitted of course.
def digits(x: Int) = {
import math._
ceil(log(abs(x)+1)/log(10)).toInt
}
Calling toString and taking the length will not work for negative integers, because the minus sign is counted. This code works for negative numbers as well as positive ones.
def digits(n:Int) = if (n==0) 1 else math.log10(math.abs(n)).toInt + 1;
If you want speed then something like the following is pretty good, assuming random distribution:
def lengthBase10(x: Int) =
if (x >= 1000000000) 10
else if (x >= 100000000) 9
else if (x >= 10000000) 8
else if (x >= 1000000) 7
else if (x >= 100000) 6
else if (x >= 10000) 5
else if (x >= 1000) 4
else if (x >= 100) 3
else if (x >= 10) 2
else 1
Calculating logarithms to double precision isn't efficient if all you want is the floor.
The conventional recursive way would be:
def len(x: Int, i: Int = 1): Int =
if (x < 10) i
else len(x / 10, i + 1)
which is faster than taking logs for integers in the range 0 to 10e8.
lengthBase10 above is about 4x faster than everything else though.
Something like this should do the job:
def numericLength(n: Int): Int = BigDecimal(n).precision
Take log to the base 10, take the floor and add 1.
The easiest way is:
def numberLength(i : Int): Int = i.toString.length
You might add a guarding-condition because negative Int will have the length of their abs + 1.
Another possibility can be:
private lazy val lengthList = (1 until 19).map(i => i -> math.pow(10, i).toLong)
def numberSize(x: Long): Int =
if (x >= 0) positiveNumberSize(x)
else positiveNumberSize(-x) + 1
private def positiveNumberSize(x: Long): Int =
lengthList
.collectFirst {
case (l, p) if x < p => l
}
.getOrElse(19)
Most people gave the most efficient answer of (int) log(number)+1
But I want to get a bit deeper into understanding why this works.
Let N be a 3-digit number. This means N can be any number from 100 up to, but not including, 1000:
100 <= N < 1000
=> 10^2 <= N < 10^3
The base-10 logarithm is monotonically increasing, therefore:
log10(10^2) <= log10(N) < log10(10^3)
=> 2 <= log10(N) < 3
We can conclude that the base-10 logarithm of any 3-digit number is at least 2 and less than 3.
So if we take only the integer part of a number's logarithm (e.g. the integer part of 2.567 is 2) and add 1, we get the digit length of the number.
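The argument above translates into a short function. This is a minimal sketch for positive Ints only; DigitLength and digits are hypothetical names, and the require guard is an added assumption since the logarithm is undefined for 0 and negative inputs.

```scala
object DigitLength {
  // Number of decimal digits of a positive Int: floor(log10(n)) + 1.
  def digits(n: Int): Int = {
    require(n > 0, "only positive integers are handled in this sketch")
    math.floor(math.log10(n.toDouble)).toInt + 1
  }
}
```

For the examples from the question, DigitLength.digits(45) gives 2 and DigitLength.digits(10321) gives 5.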
Here is the solution:
number.toString.toCharArray.size
input - output
45 - 2
100 - 3