Generating a random List() of List(List(Doubles))

I'm trying to figure out how to generate a list of random doubles in the range -50 to 50, with the list having a length of 20 (so 20 random doubles ranging from -50 to 50).
I then want to create a fixed number (could be any number, we'll say 3 for now) of List[List[Double]] with that randomized double list. I read up on the Random doc but it is still very confusing to me. This is what I currently have:
val length: Int = 20
val doubles: List[Double] = List()
val listOf: List[List[Double]] = List(List())
val rand = new Random()
Essentially, let's say I do generate a list of 20 elements with random doubles ranging from -50 to 50. I then want to generate a random number of lists that include the randomized list of doubles.
Ex:
val doubles: List[Double] = List(-29.3,46.8,-17.0,9.2,1.4) // in this case, doubles has a length of 5
val listOf: List[List[Double]] = List(List(-29.3,46.8,-17.0,9.2,1.4),List(-5.0,3.4,31.5,29.0,-41.3)) // in this case, the inner lists have a length of 5, and the fixed number is 2 because listOf has a length of 2
I am also looking to approach this problem with no mutability. How can I generate a random list of doubles with the above specs, and then generate a list of random lists?

The straightforward answer (Scala 2.13+, where Random.between is available) is simply:
import scala.util.Random
List.fill(3)(List.fill(20)(Random.between(-50.0, 50.0)))
The likelihood of repeating any of the random Doubles is extremely small, but if you absolutely must guarantee uniqueness, without mutation, then here's one rather inefficient solution.
import scala.util.Random
def isDistinct(lld: List[List[Double]]): Boolean =
  lld.flatten.foldLeft((true, Set.empty[Double])) {
    case ((res, seen), dbl) => (res && !seen(dbl), seen + dbl)
  }._1
LazyList.continually {
  val llr = List.fill(3)(List.fill(20)(Random.between(-50.0, 50.0)))
  Option.when(isDistinct(llr))(llr)
}.flatten.head
Also worth noting: between() is inclusive at the bottom (so -50.0 is unlikely but possible) and exclusive at the top (so exactly 50.0 shouldn't be possible).
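To make the bounds concrete, here is a minimal sketch that wraps the one-liner in a helper and uses a seeded generator so that runs are repeatable. The name randomLists, its parameters and the seed 42 are illustrative assumptions, not part of the answer above; it relies on between, so it assumes Scala 2.13 or later.

```scala
import scala.util.Random

// Hypothetical helper: `outer` lists of `inner` doubles each, drawn from [lo, hi).
def randomLists(outer: Int, inner: Int, lo: Double, hi: Double, rng: Random): List[List[Double]] =
  List.fill(outer)(List.fill(inner)(rng.between(lo, hi)))

// A fixed seed (assumed here) gives the same lists on every run.
val sample = randomLists(3, 20, -50.0, 50.0, new Random(42L))
```

Passing the generator in as a parameter keeps the helper free of mutable state of its own, which fits the "no mutability" requirement as far as the calling code is concerned.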
Scala 2.12.x translation
def isDistinct(. . . //same
val rng = new scala.util.Random
Stream.continually {
  val llr = List.fill(3)(List.fill(20)(rng.nextDouble * 100 - 50))
  if (isDistinct(llr)) Some(llr) else None
}.flatten.head

Related

How to sum number of Ints and Number of Floats within a List - Scala

I need to calculate the number of integers and floats I have in a Map which is like Map[String, List[(Int, String, Float)]]
The data comes from reading a file. The data inside looks something like this (although there are a few more routes):
Cycle Route (City),1:City Centre :0.75f,2:Main Park :3.8f,3:Central Station:2.7f,4:Modern Art Museum,5:Garden Centre:2.4f,6:Music Centre:3.4f
The map is split so that the String is the name of the route and the List is the rest of the data.
I want it to calculate the number of 'checkpoints' per route and total distance of each route (which is the float) then print out e.g. Oor Wullie Route has 6 checkpoints and total distance of 18.45km
I am guessing I need to use a foldLeft, but I am unsure how to do so.
Here is an example of a simple fold I have done before, but I'm not sure how to apply one to the above scenario:
val list1 = List.range(1,20)
def sum(ls:List[Int]):Int = {
ls.foldLeft(0) { _ + _}
}

You could do this with a fold, but IMO it is unnecessary.
You know the number of checkpoints by simply taking the size of the list (assuming each entry in the list is a checkpoint).
To compute the total distance, you could do:
def getDistance(list: List[(Int, String, Float)]): Float =
  list
    .iterator // mapping the iterator to avoid building an intermediate List instance
    .map(_._3) // get the distance Float from the tuple
    .sum // built-in method for collections of Numeric elements (e.g. Float)
And then get your printout like:
def summarize(routes: Map[String, List[(Int, String, Float)]]): Unit =
  for { (name, stops) <- routes } {
    val numStops = stops.size
    val distance = getDistance(stops)
    println(s"$name has $numStops stops and total distance of $distance km")
  }
If you really wanted to compute both numStops and distance via foldLeft, Luis's comment on your question is the way to do it.
edit - per Luis's request, putting his comment in here and cleaning it up a bit:
stops.foldLeft(0 -> 0.0f) {
// note: "acc" is short for "accumulated"
case ((accCount, accDistance), (_, _, distance)) =>
(accCount + 1) -> (accDistance + distance)
}
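As a self-contained illustration of the fold above, here is a small sketch that runs it on made-up data. The route name, the stops and the helper name countAndDistance are hypothetical; only the shape Map[String, List[(Int, String, Float)]] comes from the question.

```scala
// Hypothetical sample data in the question's Map[String, List[(Int, String, Float)]] shape.
val routes: Map[String, List[(Int, String, Float)]] = Map(
  "Sample Route" -> List(
    (1, "City Centre", 0.75f),
    (2, "Main Park", 3.8f),
    (3, "Central Station", 2.7f)
  )
)

// One pass over the stops, accumulating (count, total distance).
def countAndDistance(stops: List[(Int, String, Float)]): (Int, Float) =
  stops.foldLeft((0, 0.0f)) { case ((accCount, accDistance), (_, _, distance)) =>
    (accCount + 1, accDistance + distance)
  }

val summary = countAndDistance(routes("Sample Route"))
```

The count in summary._1 should agree with stops.size and the distance in summary._2 with the mapped sum, so the fold is mainly worthwhile if you want both in a single traversal.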

find out if a number is a good number in scala

Hi I am new to scala functional programming methodology. I want to input a number to my function and check if it is a good number or not.
A number is a good number if its every digit is larger than the sum of digits which are on the right side of that digit. 
For example:
9620  is good as (2 > 0, 6 > 2+0, 9 > 6+2+0)
The steps I am using to solve this are:
1. converting the number to a string and reversing it
2. storing all digits of the reversed number as elements of a list
3. looping with i from 1 to the length of the number - 1
4. calculating the sum of the first i digits as num2
5. extracting the ith digit from the list as digit1, which is the digit just after the first i digits we summed, because list indices start from zero
6. comparing the outputs of steps 4 and 5: if digit1 is not greater than num2, we break out of the for loop and print that it is not a good number.
please find my code below
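import scala.util.control.Breaks.{break, breakable}

val num1 = 9521.toString.reverse
val list1 = num1.map(_.asDigit).toList
breakable { // break() only works inside a breakable block
  for (i <- 1 to num1.length - 1) {
    val num2 = list1.take(i).sum // sum of the digits to the right
    val digit1 = list1(i)
    if (num2 >= digit1) { // the digit must be strictly larger than the sum
      print("number is not a good number")
      break()
    }
  }
}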
I know this is not the most optimized way to solve this problem. Also I am looking for a way to code this using tail recursion where I pass two numbers and get all the good numbers falling in between those two numbers.
Can this be done in more optimized way?
Thanks in advance!

No String conversions required.
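val n = 9620
// Start the running sum at the last digit and check the remaining digits,
// so that each digit is compared strictly against the sum to its right.
val isGood = Stream.iterate(n / 10)(_ / 10)
  .takeWhile(_ > 0)
  .map(_ % 10)
  .foldLeft((true, n % 10)) { case ((bool, sum), digit) =>
    (bool && digit > sum, sum + digit)
  }._1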

Here is a purely numeric version using a recursive function.
def isGood(n: Int): Boolean = {
  @annotation.tailrec
  def loop(n: Int, sum: Int): Boolean =
    (n == 0) || (n % 10 > sum && loop(n / 10, sum + n % 10))
  loop(n / 10, n % 10)
}
This should compile into an efficient loop.
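The question also asked for a tail-recursive way to collect the good numbers between two bounds. Here is a minimal sketch under that reading; it repeats the digit check so the block is self-contained. The name goodBetween and its parameters are hypothetical, and non-negative inputs are assumed.

```scala
import scala.annotation.tailrec

// Same numeric check as above, written with explicit branches.
def isGood(n: Int): Boolean = {
  @tailrec
  def loop(rest: Int, sum: Int): Boolean =
    if (rest == 0) true
    else if (rest % 10 <= sum) false
    else loop(rest / 10, sum + rest % 10)
  loop(n / 10, n % 10)
}

// Hypothetical helper: walks down from hi to lo, prepending matches,
// so the accumulated list ends up in ascending order without a reverse.
def goodBetween(lo: Int, hi: Int): List[Int] = {
  @tailrec
  def go(current: Int, acc: List[Int]): List[Int] =
    if (current < lo) acc
    else go(current - 1, if (isGood(current)) current :: acc else acc)
  go(hi, Nil)
}
```

For example, goodBetween(412, 534) should give List(420, 421, 430, 510, 520, 521, 530, 531).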

Use the function below. It can stop early because forall does not have to traverse the entire list of digits: scanning the vector v from left to right, it stops as soon as v(i) > v.drop(i+1).sum is false.
def isGood(n: Int)= {
val v1 = n.toString.map(_.asDigit)
val v = if(v1.last!=0) v1 else v1.dropRight(1)
(0 to v.size-1).forall(i=>v(i)>v.drop(i+1).sum)
}
If we want to find good numbers in an interval of integers ranging from n1 to n2 we can use this function:
def goodNums(n1:Int,n2:Int) = (n1 to n2).filter(isGood(_))
In Scala REPL:
scala> isGood(9620)
res51: Boolean = true
scala> isGood(9600)
res52: Boolean = false
scala> isGood(9641)
res53: Boolean = false
scala> isGood(9521)
res54: Boolean = true
scala> goodNums(412,534)
res66: scala.collection.immutable.IndexedSeq[Int] = Vector(420, 421, 430, 510, 520, 521, 530, 531)
scala> goodNums(3412,5334)
res67: scala.collection.immutable.IndexedSeq[Int] = Vector(4210, 5210, 5310)

This is a more functional way. pairs is a list of tuples, each holding a digit and the sum of the digits that follow it. These tuples are easy to create with the drop, take and slice (a combination of drop and take) methods.
Finally, the condition can be expressed clearly with the forall method.
val n = 9620
val str = n.toString
val pairs = for { x <- 1 until str.length } yield (str.slice(x - 1, x).toInt, str.drop(x).map(_.asDigit).sum)
pairs.forall { case (a, b) => a > b }
If you want to be functional and expressive, avoid using break. If you need to check a condition for each element, it is a good idea to move the problem to collections so you can use forall.
It is not needed here, but if you care about performance (you don't want to build the entire pairs collection when the condition already fails for the first element), you can change the collection the for comprehension iterates over from a Range to a Stream:
(1 until str.length).toStream
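A lazy view gives the same early exit without Stream. This is a sketch of that substitution, not the code from the answer; the name isGoodLazy is hypothetical and a non-negative input is assumed.

```scala
// Hypothetical variant: the view defers building each (digit, sum-of-rest) pair
// until forall asks for it, so evaluation stops at the first failing pair.
def isGoodLazy(n: Int): Boolean = {
  val digits = n.toString.map(_.asDigit)
  (1 until digits.length).view
    .map(x => (digits(x - 1), digits.drop(x).sum))
    .forall { case (digit, rest) => digit > rest }
}
```

With 9641, for instance, only the first pair (9, 11) is computed before forall returns false.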

Functional style tends to prefer monadic type things, such as maps and reduces. To make this look functional and clear, I'd do something like:
def isGood(value: Int) =
  value.toString.reverse.map(digit => Some(digit.asDigit))
    .reduceLeft[Option[Int]] {
      case (sum, Some(digit)) => sum.collectFirst { case sum if sum < digit => sum + digit }
    }.isDefined
Instead of using tail recursion to calculate this for ranges, just generate the range and then filter over it:
def goodInRange(low: Int, high: Int) = (low to high).filter(isGood(_))

Averaging a very long List[Double] Without getting infinity in Scala

I have a very long list of doubles that I need to average but I can't sum them within the double data type so when I go to divide I still get Infinity.
def applyToMap(list: Map[String, List[Map[String, String]]], f: Map[String, String]=>Double): Map[String,Double]={
val mSLD = list.mapValues(lm=>lm.map(f))
mSLD.mapValues(ld=> ld.sum/ld.size)
}
This leaves me with a Map[String, Double] that are all Key -> Infinity

You could use fold to compute an average as you go. Rather than doing sum / size you should count your way through the items with n, and for each one adjust your accumulator with acc = (acc * n/(n+1)) + (item * 1/(n+1))
Here’s the general scala code:
val average = seq.foldLeft((0.0, 1)) ((acc, i) => ((acc._1 + (i - acc._1) / acc._2), acc._2 + 1))._1
Taken from here.
You’d probably still have precision difficulty if the list is really long, as you’d be dividing by a gradually very large number. To be really safe you should break the list into sublists, and compute the average of averages of the sublists. Make sure the sublists are all the same length though, or do a weighted average based on their size.
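To see the difference on values that overflow a plain sum, here is a small sketch of the running-mean fold next to the naive version. The name runningMean and the test values are assumptions for illustration; note that x - mean can itself overflow if the data mixes huge values of opposite sign, and an empty input returns 0.0 here.

```scala
// Running mean: each step moves the current mean toward the next item
// by 1/count of the gap, so no large intermediate sum is ever built.
def runningMean(xs: Seq[Double]): Double =
  xs.foldLeft((0.0, 1)) { case ((mean, count), x) =>
    (mean + (x - mean) / count, count + 1)
  }._1

val huge = List.fill(4)(Double.MaxValue)
val naive = huge.sum / huge.size // the sum overflows before the division
val stable = runningMean(huge)   // stays finite
```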

Interested in implementing gandaliters solution, I came up with the following. (Since I'm not a great friend of Doubles, I tried to find an easy-to-follow numeric sequence with Bytes.) First, I generate 10 Bytes in the range 75..124, so that every value is close to but below Byte.MaxValue, with an average of about 100 for simple checking:
val rnd = util.Random
val is=(1 to 10).map (i => (rnd.nextInt (50)+75).toByte)
// = Vector(99, 122, 99, 105, 102, 104, 122, 99, 87, 114)
The 1st algo multiplies before division (which increases the danger to exceed MaxByte), the 2nd divides before multiplication, which leads to rounding errors.
def slidingAvg0 (sofar: Byte, x: Byte, cnt: Byte): (Byte, Byte) = {
val acc : Byte = ((sofar * cnt).toByte / (cnt + 1).toByte + (x/(cnt + 1).toByte).toByte).toByte
println (acc)
(acc.toByte, (cnt + 1).toByte)
}
def slidingAvg1 (sofar: Byte, x: Byte, cnt: Byte): (Byte, Byte) = {
val acc : Byte = (((sofar / (cnt + 1).toByte).toByte * cnt).toByte + (x/(cnt + 1).toByte).toByte).toByte
println (acc)
(acc.toByte, (cnt + 1).toByte)
}
This is foldLeft in scala:
((is.head, 1.toByte) /: is.tail) { case ((sofar, cnt), x) => slidingAvg0 (sofar, x, cnt)}
110
21
41
2
18
32
8
16
0
scala> ((is.head, 1.toByte) /: is.tail) { case ((sofar, cnt), x) => slidingAvg1 (sofar, x, cnt)}
110
105
104
100
97
95
89
81
83
Since 10 values are far too few to rely on the average being close to 100, let's look at the sum as an Int:
is.map (_.toInt).sum
res65: Int = 1053
The drift is pretty significant (should be 105, is 0/83)
Whether the findings are transferable from Bytes/Int to Doubles is the other question. And I'm not 100% confident, that my braces mirror the evaluation order, but imho, for multiplication/division of same precedence it is left to right.
So the original formulas were:
acc = (acc * n/(n+1)) + (item * 1/(n+1))
acc = (acc /(n+1) *n) + (item/(n+1))

If I understand the OP correctly, the amount of data doesn't seem to be the problem; otherwise it wouldn't fit into memory.
So I concentrate on the data types only.
Summary
My suggestion is to go with BigDecimal instead of Double.
Especially if you are adding reasonably high values.
The only significant drawback is the performance and a small amount of cluttered syntax.
Alternatively you must rescale your input upfront but this will degrade precision and requires special care with post processing.
Double breaks at some scale
scala> :paste
// Entering paste mode (ctrl-D to finish)
val res0 = (Double.MaxValue + 1) == Double.MaxValue
val res1 = Double.MaxValue/10 == Double.MaxValue
val res2 = List.fill(11)(Double.MaxValue/10).sum
val res3 = List.fill(10)(Double.MaxValue/10).sum == Double.MaxValue
val res4 = (List.fill(10)(Double.MaxValue/10).sum + 1) == Double.MaxValue
// Exiting paste mode, now interpreting.
res0: Boolean = true
res1: Boolean = false
res2: Double = Infinity
res3: Boolean = true
res4: Boolean = true
Take a look at these simple Double arithmetic examples in your Scala REPL:
Double.MaxValue + 1 cancels out numerically: nothing is added, so the result is still equal to Double.MaxValue
Double.MaxValue/10 behaves as expected and does not equal Double.MaxValue
Adding Double.MaxValue/10 11 times overflows to Infinity
Adding Double.MaxValue/10 10 times doesn't break the arithmetic and evaluates to Double.MaxValue again
The summed Double.MaxValue/10 behaves exactly like Double.MaxValue
BigDecimal works on all scales but is slower
scala> :paste
// Entering paste mode (ctrl-D to finish)
val res0 = (BigDecimal(Double.MaxValue) + 1) == BigDecimal(Double.MaxValue)
val res1 = BigDecimal(Double.MaxValue)/10 == BigDecimal(Double.MaxValue)
val res2 = List.fill(11)(BigDecimal(Double.MaxValue)/10).sum
val res3 = List.fill(10)(BigDecimal(Double.MaxValue)/10).sum == BigDecimal(Double.MaxValue)
val res4 = (List.fill(10)(BigDecimal(Double.MaxValue)/10).sum + 1) == BigDecimal(Double.MaxValue)
// Exiting paste mode, now interpreting.
res0: Boolean = false
res1: Boolean = false
res2: scala.math.BigDecimal = 197746244834854727000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
res3: Boolean = true
res4: Boolean = false
Now compare these results with the ones above from Double.
As you can see everything works as expected.
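Applied to the original averaging problem, the idea could look like the following sketch. The helper name meanViaBigDecimal and the sample values are assumptions; it expects a non-empty input, since dividing by a size of zero would throw.

```scala
// Sum in BigDecimal so the intermediate total may exceed Double.MaxValue,
// then convert the (much smaller) mean back to Double.
def meanViaBigDecimal(xs: Seq[Double]): Double =
  (xs.map(x => BigDecimal(x)).sum / BigDecimal(xs.size)).toDouble

val values = List.fill(11)(Double.MaxValue / 10)
val naiveMean = values.sum / values.size // Infinity: the Double sum overflows
val safeMean = meanViaBigDecimal(values) // finite, close to Double.MaxValue / 10
```

The conversion back to Double is only safe because the mean itself fits in a Double; the cost is the slower BigDecimal arithmetic mentioned above.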
Rescaling reduces precision and can be tedious
When working with astronomic or microscopic scales it is likely to happen that numbers will overflow or underflow quickly.
Then it is appropriate to work with other units than the base units to compensate this.
E.g. with km instead of m.
However, you then have to take special care when multiplying those numbers in formulas, because the prefix is squared along with the unit.
10 km * 10 km = 100 km^2
which is not 100,000 m^2 but rather
10,000 m * 10,000 m = 100,000,000 m^2
So keep this in mind.
Another trap is when dealing with very diverse datasets where numbers exist in all kinds of scales and quantities.
When scaling down your input domain you will lose precision and small numbers may be cancelled out.
In some scenarios these numbers don't need to be considered because of their small impact.
However, when these small numbers exist in a high frequency and are ignored all the time you will introduce a large error in the end.
So keep this in mind as well ;)
Hope this helps

Make a set of random Integers Scala

Is there a way in Scala to have a set of Ints be random without duplicates?
For example, I have set of Ints currently set to zero by default; a,b,c,d,e. And I want to assign a random int to each one from 1-100 while never assigning the same number to any of the variables. Thanks.

I can see two ways how this can be done.
The first is the simplest one. If the range (1-100) is small enough, you can just generate every value in this range, shuffle them and take the first m:
import scala.util.Random
Random.shuffle((0 until 100).toList).take(4)
result:
res0: List[Int] = List(54, 11, 35, 15)
But if the range is large, this won't be very efficient (as the whole range must be materialized in memory). So when the number of picked values (m) is much smaller than the range (n), it's more efficient to generate random values until you pick one that wasn't used before.
Here is how:
import scala.util.Random
def distinctRandomMOutOfN(m:Int, n:Int):Set[Int] = {
require(m <= n)
Stream.continually(Random.nextInt(n)).scanLeft(Set[Int]()) {
(accum, el) => accum + el
}.dropWhile(_.size < m).head
}
distinctRandomMOutOfN(4, 100)
result:
res1: Set[Int] = Set(99, 28, 82, 87)
The downside of the second approach is that if m is close to n, most draws near the end are repeats, so the expected number of draws grows to roughly O(n log n) (the coupon collector problem).
UPD.
So if you want a general solution that works efficiently in any case, you can use a hybrid variant: use the first approach if m is of the same order of magnitude as n (i.e. m * 2 >= n) and the second variant otherwise.
This implementation will use O(m) memory and will have an average running time of O(m).
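A minimal sketch of that hybrid, assuming the m * 2 >= n threshold from the text. The function name distinctRandom and the explicit rng parameter are illustrative choices; like the code above it draws from 0 until n, so add 1 to each value if you need 1 to n.

```scala
import scala.util.Random

// Hypothetical hybrid: shuffle when m is a large share of n,
// otherwise keep drawing until m distinct values have been seen.
def distinctRandom(m: Int, n: Int, rng: Random): Set[Int] = {
  require(m >= 0 && m <= n)
  if (m.toLong * 2 >= n)
    rng.shuffle((0 until n).toList).take(m).toSet
  else
    Iterator.continually(rng.nextInt(n))
      .scanLeft(Set.empty[Int])(_ + _)
      .dropWhile(_.size < m)
      .next()
}

val few = distinctRandom(4, 100, new Random(1L))   // takes the draw-until-distinct branch
val many = distinctRandom(90, 100, new Random(1L)) // takes the shuffle branch
```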

You can be confident that there are no duplicates if you simply shuffle all the possible elements and then take what you need.
import scala.util.Random
Random.shuffle(1 to 100).take(5) // res0: Vector(74, 82, 68, 24, 15)

reduce list of integers/range of integers in scala

Total newbie question here... Today, while trying to calculate the sum of a list of integers (actually a BitSet), I ran into overflow and noticed that the return type of sum/product is Int. Are there any methods in Range/List to sum up or, say, multiply all values into a Long?
val x = 1 to Integer.MaxValue
println(x.sum) //prints -1453759936
thanks

Convert the elements to Long (or BigInt should that go that far) while summing:
x.view.map(_.toLong).sum
You can also go back to fold
x.foldLeft(0L)(_ + _)
(Note: if you are summing over a range, it may be better to do a little math, but I understand that is not what you were actually doing.)

Compare:
>> val x = 1 to Int.MaxValue
x: scala.collection.immutable.Range.Inclusive with scala.collection.immutable.Range.ByOne = Range(...)
With:
>> val x = 1L to Int.MaxValue
x: scala.collection.immutable.NumericRange.Inclusive[Long] = NumericRange(...)
Note that the first uses Int.to, and the latter uses Long.to (where Int.MaxValue is up-converted automatically). Of course, the sum of a consecutive integer sequence has a very nice discrete formula :)
Happy coding.
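The formula alluded to is n(n+1)/2 for the sum 1 + 2 + ... + n. A short sketch, with sumTo as a hypothetical name; the operands are widened to Long before multiplying so the intermediate product does not overflow Int.

```scala
// Closed-form sum of 1 to n, computed in Long; assumes n >= 0.
def sumTo(n: Int): Long = n.toLong * (n.toLong + 1) / 2

val total = sumTo(Int.MaxValue) // 2305843008139952128
```

This avoids iterating over roughly two billion elements at all.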

This isn't very efficient, but the easiest way:
val x = 1L to Int.MaxValue
println(x.sum) //prints 2305843008139952128
If you need x to contain Ints rather than Longs, you can do
val x = 1 to Int.MaxValue
println(x.foldLeft(0L)(_+_))