Cumulative sum without var scala [duplicate] - scala

I've got a List of days in the month:
val days = List(31, 28, 31, ...)
I need to return a List with the cumulative sum of days:
val cumDays = List(31, 59, 90)
I've thought of using the fold operator:
(0 /: days)(_ + _)
but this will only return the final result (365), whereas I need the list of intermediate results.
Is there any way I can do that elegantly?

Scala 2.8 has the methods scanLeft and scanRight which do exactly that.
For 2.7 you can define your own scanLeft like this:
def scanLeft[a,b](xs:Iterable[a])(s:b)(f : (b,a) => b) =
xs.foldLeft(List(s))( (acc,x) => f(acc(0), x) :: acc).reverse
And then use it like this:
scala> scanLeft(List(1,2,3))(0)(_+_)
res1: List[Int] = List(0, 1, 3, 6)
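On Scala 2.8 or later, the built-in scanLeft mentioned above can be applied directly to the list from the question. The following is a minimal sketch; the object name CumulativeDays is only an illustrative wrapper, not something from the question. Because scanLeft emits the seed value first, tail drops the leading 0.

```scala
object CumulativeDays {
  val days = List(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)

  // scanLeft yields the seed followed by every running total,
  // so tail removes the initial 0 and leaves one total per month.
  val cumDays: List[Int] = days.scanLeft(0)(_ + _).tail
  // cumDays == List(31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365)
}
```

The same expression also behaves sensibly on an empty list: the scan then contains only the seed, and tail leaves Nil.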

I'm not sure why everybody seems to insist on using some kind of folding, while you basically want to map the values to the cumulated values...
val daysInMonths = List(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
val cumulated = daysInMonths.map{var s = 0; d => {s += d; s}}
//--> List[Int] = List(31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365)

You can do it simply with a fold:
daysInMonths.foldLeft((0, List[Int]()))
{(acu,i)=>(i+acu._1, i+acu._1 :: acu._2)}._2.reverse

Fold into a list instead of an integer. Use a pair (the partial list of accumulated values, plus an accumulator holding the last sum) as the state of the fold.

Fold your list into a new list. On each iteration, prepend a value which is the sum of the current head and the next input. Then reverse the entire thing.
scala> val daysInMonths = List(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
daysInMonths: List[Int] = List(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
scala> daysInMonths.foldLeft(Nil: List[Int]) { (acc,next) =>
| acc.firstOption.map(_+next).getOrElse(next) :: acc
| }.reverse
res1: List[Int] = List(31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365)

You can also create a monoid class that concatenates two lists while adding to the second one the last value from the first. No mutables and no folds involved:
scala> case class CumSum(v: List[Int]) { def +(o: CumSum) = CumSum(v ::: (o.v map (_ + v.last))) }
defined class CumSum
scala> List(1,2,3,4,5,6) map {v => CumSum(List(v))} reduce (_ + _)
res27: CumSum = CumSum(List(1, 3, 6, 10, 15, 21))

For any:
val s:Seq[Int] = ...
You can use one of these:
s.tail.scanLeft(s.head)(_ + _)
s.scanLeft(0)(_ + _).tail
or the folds proposed in other answers, but...
be aware that Landei's solution is tricky and you should avoid it.
BE AWARE
s.map { var s = 0; d => {s += d; s}}
//works as long as `s` is a strict collection
val s2:Seq[Int] = s.view //still seen as Seq[Int]
s2.map { var s = 0; d => {s += d; s}}
//does really weird things!
//Each value will be different every time you access it!
I would have posted this warning as a comment under Landei's answer, but I couldn't :(.

Works on 2.7.7:
def stepSum (sums: List [Int], steps: List [Int]) : List [Int] = steps match {
case Nil => sums.reverse.tail
case x :: xs => stepSum (sums.head + x :: sums, steps.tail) }
scala> days
res10: List[Int] = List(31, 28, 31, 30, 31)
scala> stepSum (List (0), days)
res11: List[Int] = List(31, 59, 90, 120, 151)

Related

Create groups of sequences whose element-wise difference is 1 in Scala?

We have the following sequence of numbers: [19, 23, 24, 31, 126, 127, 155, 159, 160, 161]. We need to group this sequence according to difference between the neighboring values such that difference of values in each group would be equal to 1 if group size is > 1.
In Python, I would write something like:
import itertools
from operator import itemgetter
outliers = [19, 23, 24, 31, 126, 127, 155, 159, 160, 161]
chains = [list(map(itemgetter(1), g))
          for _, g in itertools.groupby(enumerate(outliers),
                                        lambda x: x[0]-x[1])]
# [[19], [23, 24], [31], [126, 127], [155], [159, 160, 161]]
Pretty neat. But how can this be done in Scala without falling back to loops with conditions? I have been trying to do something with the zipWithIndex and groupBy methods, but so far to no avail :(
You can fold over the sequence, building the result as you go.
outliers.foldRight(List.empty[List[Int]]) {case (n, acc) =>
if (acc.isEmpty) List(List(n))
else if (acc(0)(0) == n+1) (n :: acc.head) :: acc.tail
else List(n) :: acc
}
//res0: List[List[Int]] = List(List(19), List(23, 24), List(31), List(126, 127), List(155), List(159, 160, 161))
I'm not sure whether recursion meets your conditions, but at least it traverses the input only once (note that :+ on a List is itself linear in the list's length).
def rec(source: List[Int], temp: List[Int], acc: List[List[Int]]): List[List[Int]] = source match {
case Nil => acc
case x :: xs => {
if (xs.nonEmpty && xs.head - x == 1) rec(xs, temp :+ x, acc)
else rec(xs, List(), acc :+ (temp :+ x))
}
}
val outliers = List(19, 23, 24, 31, 126, 127, 155, 159, 160, 161)
rec(outliers, List[Int](), List[List[Int]]())
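The recursive idea above can also be written so that every step prepends to a list and the groups are reversed once at the end, which avoids the repeated :+ appends. This is only a sketch under that assumption; groupConsecutive and loop are hypothetical names introduced here, not part of the answers above.

```scala
import scala.annotation.tailrec

object ConsecutiveGroups {
  // Groups runs of values that increase by exactly 1.
  // `current` holds the run being built (in reverse order),
  // `done` holds the finished runs (also in reverse order).
  def groupConsecutive(xs: List[Int]): List[List[Int]] = {
    @tailrec
    def loop(rest: List[Int], current: List[Int], done: List[List[Int]]): List[List[Int]] =
      rest match {
        case Nil =>
          if (current.isEmpty) done.reverse
          else (current.reverse :: done).reverse
        case x :: tail =>
          current match {
            case prev :: _ if x == prev + 1 => loop(tail, x :: current, done) // extend the run
            case Nil                        => loop(tail, List(x), done)      // very first element
            case _                          => loop(tail, List(x), current.reverse :: done) // start a new run
          }
      }
    loop(xs, Nil, Nil)
  }
}
```

Called with the sequence from the question, this should return the same grouping as the foldRight answer, and it returns Nil for an empty input.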

printing elements in list using stream

Why does the following code print only 1 and not the rest of the list elements?
scala> val l: List [Int] = List(1,2,3)
l: List[Int] = List(1, 2, 3)
scala> l.toStream.map(x => print(x))
1res3: scala.collection.immutable.Stream[Unit] = Stream((), ?)
What is the correct way to write this code?
I'll divide my answer into two parts:
1. The map method in Scala:
You're using map, which expects a function with no side effects (printing is a side effect). What you're looking for is:
l.toStream.foreach(x => print(x))
Basically, the general idea is that map takes something and converts it to something else (for example, increasing its value), while foreach performs some action on that value that isn't supposed to have a return value.
scala> l.toStream.foreach(x => print(x))
123
2. Stream in Scala:
Streams are lazy, so Scala only computes the values it needs. Try this:
scala> l.toStream.map(x => x+1)
res2: scala.collection.immutable.Stream[Int] = Stream(2, ?)
You can see that it computed the first value, and the question mark indicates that it has no idea what comes after it, because it hasn't computed it yet. In your example the first value is (), as print returns no meaningful value.
A Stream is an on-demand data structure, which means that values are not evaluated until you need them.
For example,
scala> val stream = (1 to 10000).toStream
stream: scala.collection.immutable.Stream[Int] = Stream(1, ?)
Now if you access head and tail, the stream will be evaluated up to the 2nd element.
scala> stream.head
res13: Int = 1
scala> stream.tail
res14: scala.collection.immutable.Stream[Int] = Stream(2, ?)
scala> stream
res15: scala.collection.immutable.Stream[Int] = Stream(1, 2, ?)
If you access index 99,
scala> stream(99)
res16: Int = 100
Now if you print the stream, you can see that it has been evaluated up to index 99:
scala> stream
res17: scala.collection.immutable.Stream[Int] = Stream(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, ?)
It's always good to process only those elements of the stream that you need. You can use take() for that.
scala> stream.take(50).foreach(x => print(x + " "))
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
So the answer to your question can be:
scala> List(1,2,3).toStream.take(3).foreach(x => print(x + " "))
1 2 3
Reference
https://www.coursera.org/learn/progfun2/home/week/2
To print the complete stream, use
l.toStream.print
Output: 1, 2, 3, empty
To print the first n values, you may use take(n)
l.toStream.take(2).print
Output: 1, 2, empty
You can print it with
l.toStream.foreach(println)
But generally speaking, it is not a good idea to print or otherwise process a whole Stream without care, since it may be infinite and cause an error while doing so.
More info about Streams here
Streams in Scala are lazy data structures, which means that they tend to perform only the work that is needed.
scala> val stream1 = Stream.range(1, 10)
// stream1: scala.collection.immutable.Stream[Int] = Stream(1, ?)
In this case, only the first element is computed. The stream knows how to compute the rest of the elements and will compute them only when it actually needs them, for example when it meets "consumers":
scala> val list1 = stream1.toList
// list1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)
scala> stream1.foreach(print)
// 123456789
But when faced with "transformers", Streams just keep the new transformation with them and do not apply it to the whole Stream. The map method is supposed to provide transformations:
scala> val stream2 = stream1.map(i => i + 5)
// stream2: scala.collection.immutable.Stream[Int] = Stream(6, ?)
So it just knows that it has to apply this i => i + 5 function to the respective elements of stream1 to get the elements of stream2, and it will do that when required (on facing any consumer).
Let's consider something similar to your example:
scala> val stream3 = stream1.map(i => println("element :: " + i))
// element :: 1
// stream3: scala.collection.immutable.Stream[Unit] = Stream((), ?)
Here your "transform" function takes an Int element, prints it on a line and returns nothing, which is called Unit or () in Scala. Our lazy stream here will compute this transform for the first element and not for the rest. This computation results in element :: 1 being printed.
Now, let's see what happens when we apply some consumer to it:
scala> val list3 = stream3.toList
// element :: 2
// element :: 3
// element :: 4
// element :: 5
// element :: 6
// element :: 7
// element :: 8
// element :: 9
// list3: List[Unit] = List((), (), (), (), (), (), (), (), ())
This will look wrong to most people: all I wanted was to convert my stream to a list, so why are all these lines being printed?
Which is why, when you are using map, you should provide a pure function.
What is a pure function? The simple answer is that a pure function only does the things it is supposed to do and nothing else. It does not cause any change outside its scope. It just takes something and gives something else back.
All of the following are pure functions,
scala> val pf1 = (i: Int) => i + 1
// pf1: Int => Int = $$Lambda$1485/1538411140@6fdc53db
scala> val pf2 = (i: Int) => {
| val x = 100
| val xy = 200
| xy + i
| }
// pf2: Int => Int = $$Lambda$1487/14070792@7bf770ba
scala> val pf3 = (i: Int) => ()
// pf3: Int => Unit = $$Lambda$1486/1145379385@336cd7d5
Whereas the following is not a pure function,
val npf1 = (i: Int) => println(i)
// npf1: Int => Unit = $$Lambda$1488/1736134005@7ac97ba6
Because it causes a "magical" change in the environment.
Why "magical"? Because it claims to be a function of type Int => Unit, which means it should just be transforming an Int to a Unit. But it also printed something on our console, which was outside of its environment.
A real-world example of this magic would be that whenever you put bread in your toaster, it causes a rainstorm at the Hulk's current location. And nobody wants the Hulk to come looking for their toaster.
In general, the bottom line is that you should not use side effects in .map. When you do foo.map(bar), that just returns another collection containing the elements generated by applying bar to the elements of the original collection. It may or may not be lazy. The point is, you should treat the elements of any collection as undefined until something looks at them.
If you want side effects, use foreach: Seq(1,2,3).toStream.foreach(println)
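The behaviour described in these answers can be observed without printing, by recording which elements a side-effecting map function has actually touched. This is a small sketch only: StreamMapDemo and seen are hypothetical names, and it assumes the classic Stream used in the question (on Scala 2.13 and later Stream still works but is deprecated in favour of LazyList, which is lazy in its head as well).

```scala
import scala.collection.mutable.ListBuffer

object StreamMapDemo {
  // Records every element the mapping function has been applied to.
  val seen = ListBuffer.empty[Int]

  // The side effect (recording x) runs only when an element is evaluated.
  val mapped = List(1, 2, 3).toStream.map { x => seen += x; x * 10 }

  // Stream computes its head eagerly, so only the first element is recorded here.
  val afterMap: List[Int] = seen.toList

  // A consumer such as toList forces the remaining elements.
  val forced: List[Int] = mapped.toList
  val afterForce: List[Int] = seen.toList
}
```

The side effect therefore runs at whatever moment a consumer happens to force the stream, which is why foreach is the safer choice when the side effect is the point.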

Drop every Nth element from a Scala Array

My requirement is to drop every Nth element from a Scala Array (please note: every Nth element). I wrote the method below, which does the job. Since I am new to Scala, I couldn't avoid the Java hangover. Is there a simpler or more efficient alternative?
def DropNthItem(a: Array[String], n: Int): Array[String] = {
val in = a.indices.filter(_ % n != 0)
val ab: ArrayBuffer[String] = ArrayBuffer()
for ( i <- in)
ab += a(i-1)
return ab.toArray
}
You made a good start. Consider this simplification.
def DropNthItem(a: Array[String], n: Int): Array[String] =
a.indices.filter(x => (x+1) % n != 0).map(a).toArray
How about something like this?
arr.grouped(n).flatMap(_.take(n-1)).toArray
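To make the grouped approach concrete, here is a short sketch that wraps it in a generic method and runs it on a small array. The object name DropNthDemo and the sample data are made up for illustration, and the sketch assumes n is positive (grouped rejects non-positive sizes).

```scala
import scala.reflect.ClassTag

object DropNthDemo {
  // Split the array into chunks of n elements and keep the first n - 1 of each,
  // which discards the element at every nth position. The last, possibly shorter,
  // chunk is kept whole unless it happens to contain exactly n elements.
  def dropNth[A: ClassTag](arr: Array[A], n: Int): Array[A] =
    arr.grouped(n).flatMap(_.take(n - 1)).toArray

  val letters = Array("a", "b", "c", "d", "e", "f", "g")
  val result: List[String] = dropNth(letters, 3).toList
  // result == List("a", "b", "d", "e", "g")
}
```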
You can do this in two steps functionally: use zipWithIndex to get an array of elements tupled with their indices, and then collect to build a new array consisting of only the elements whose 1-based position (i + 1) is not a multiple of n.
def dropNth[A: reflect.ClassTag](arr: Array[A], n: Int): Array[A] =
arr.zipWithIndex.collect { case (a, i) if (i + 1) % n != 0 => a }
This will do it:
def DropNthItem(a: Array[String], n: Int): Array[String] =
a.zipWithIndex.filter { case (_, i) => (i + 1) % n != 0 }.map(_._1)
If you're looking for performance (since you're using an ArrayBuffer), you might as well track the index with a var, manually increment it, and check it with an if to filter out n-multiple-indexed values.
def dropNth[A: reflect.ClassTag](arr: Array[A], n: Int): Array[A] = {
val buf = new scala.collection.mutable.ArrayBuffer[A]
var i = 0
for(a <- arr) {
if((i + 1) % n != 0) buf += a
i += 1
}
buf.toArray
}
It's faster still if we traverse the original array as an iterator using a while loop.
def dropNth[A: reflect.ClassTag](arr: Array[A], n: Int): Array[A] = {
val buf = new scala.collection.mutable.ArrayBuffer[A]
val it = arr.iterator
var i = 0
while(it.hasNext) {
val a = it.next
if((i + 1) % n != 0) buf += a
i += 1
}
buf.toArray
}
I'd go with something like this;
def dropEvery[A](arr: Seq[A], n: Int) = arr.foldLeft((Seq.empty[A], 0)) {
case ((acc, idx), _) if idx == n - 1 => (acc, 0)
case ((acc, idx), el) => (acc :+ el, idx + 1)
}._1
// example: dropEvery(1 to 100, 3)
// res0: Seq[Int] = List(1, 2, 4, 5, 7, 8, 10, 11, 13, 14, 16, 17, 19, 20, 22, 23, 25, 26, 28, 29, 31, 32, 34, 35, 37, 38, 40, 41, 43, 44, 46, 47, 49, 50, 52, 53, 55, 56, 58, 59, 61, 62, 64, 65, 67, 68, 70, 71, 73, 74, 76, 77, 79, 80, 82, 83, 85, 86, 88, 89, 91, 92, 94, 95, 97, 98, 100)
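This makes a single pass over the input and removes every nth element from it; I believe that is easy to see. Note that acc :+ el on the default Seq (a List) is linear in the accumulator's length, so a Vector accumulator may be a better fit for large inputs.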
The first case matches when idx == n - 1 and ignores the element at that index, and passes over the acc and resets the count to 0 for the next element.
If the first case doesn't match, it adds the element to the end of the acc and increments the count by 1.
Since you're willing to get rid of the Java hangover, you might want to use implicit classes to use this in a very nice way:
implicit class MoreFuncs[A](arr: Seq[A]) {
def dropEvery(n: Int) = arr.foldLeft((Seq.empty[A], 0)) {
case ((acc, idx), _) if idx == n - 1 => (acc, 0)
case ((acc, idx), el) => (acc :+ el, idx + 1)
}._1
}
// example: (1 to 100 dropEvery 1) == Nil // true

Concatenating multiple lists in Scala

I have a function called generateList and concat function as follows. It is essentially concatenating lists returned by the generateList with i starting at 24 and ending at 1
def concat(i: Int, l: List[(String, Int)]) : List[(String, Int)] = {
if (i==1) l else l ::: concat(i-1, generateList(signs, i))
}
val all = concat(23, generateList(signs, 24))
I can convert this to tail recursion, but I am curious whether there is a Scala way of doing this.
There are many ways to do this with Scala's built in methods available to Lists.
Here is one approach that uses foldRight
(1 to 24).foldRight(List.empty[(String, Int)])( (i, l) => l ::: generateList(signs, i))
Starting from the range of ints used to build the separate lists, it appends the result of generateList(signs, i) to the accumulated list, beginning with the empty list and the highest i.
Here is one way to do this:
val signs = ""
def generateList(s: String, n: Int) = n :: n * 2 :: Nil
scala> (24 to 1 by -1) flatMap (generateList(signs, _))
res2: scala.collection.immutable.IndexedSeq[Int] = Vector(24, 48, 23, 46, 22, 44, 21, 42, 20, 40, 19, 38, 18, 36, 17, 34, 16, 32, 15, 30, 14, 28, 13, 26, 12, 24, 11, 22, 10, 20, 9, 18, 8, 16, 7, 14, 6, 12, 5, 10, 4, 8, 3, 6, 2, 4, 1, 2)
What you want to do is to map the list with x => generateList(signs, x) function and then concatenate the results, i.e. flatten the list. This is just what flatMap does.

ScalaCheck: choose an integer with custom probability distribution

I want to create a generator in ScalaCheck that generates numbers between say 1 and 100, but with a bell-like bias towards numbers closer to 1.
Gen.choose() distributes numbers uniformly between the min and max value:
scala> (1 to 10).flatMap(_ => Gen.choose(1,100).sample).toList.sorted
res14: List[Int] = List(7, 21, 30, 46, 52, 64, 66, 68, 86, 86)
And Gen.chooseNum() has an added bias for the upper and lower bounds:
scala> (1 to 10).flatMap(_ => Gen.chooseNum(1,100).sample).toList.sorted
res15: List[Int] = List(1, 1, 1, 61, 85, 86, 91, 92, 100, 100)
I'd like a choose() function that would give me a result that looks something like this:
scala> (1 to 10).flatMap(_ => choose(1,100).sample).toList.sorted
res15: List[Int] = List(1, 1, 1, 2, 5, 11, 18, 35, 49, 100)
I see that choose() and chooseNum() take an implicit Choose trait as an argument. Should I use that?
You could use Gen.frequency():
val frequencies = List(
(50000, Gen.choose(0, 9)),
(38209, Gen.choose(10, 19)),
(27425, Gen.choose(20, 29)),
(18406, Gen.choose(30, 39)),
(11507, Gen.choose(40, 49)),
( 6681, Gen.choose(50, 59)),
( 3593, Gen.choose(60, 69)),
( 1786, Gen.choose(70, 79)),
( 820, Gen.choose(80, 89)),
( 347, Gen.choose(90, 100))
)
(1 to 10).flatMap(_ => Gen.frequency(frequencies:_*).sample).toList
res209: List[Int] = List(27, 21, 31, 1, 21, 18, 9, 29, 69, 29)
I got the frequencies from https://en.wikipedia.org/wiki/Standard_normal_table#Complementary_cumulative. The code is just a sample of the table (% 3 or mod 3), but I think you can get the idea.
I can't take much credit for this, and will point you to this excellent page:
http://www.javamex.com/tutorials/random_numbers/gaussian_distribution_2.shtml
A lot of this depends what you mean by "bell-like". Your example doesn't show any negative numbers but the number "1" can't be in the middle of the bell and not produce any negative numbers unless it was a very, very tiny bell!
Forgive the mutable loop but I use them sometimes when I have to reject values in a collection build:
object Test_Stack extends App {
val r = new java.util.Random()
val maxBellAttempt = 102
val stdv = maxBellAttempt / 3 //this number * 3 will happen about 99% of the time
val collectSize = 100000
var filled = false
val l = scala.collection.mutable.Buffer[Int]()
//ref article above "What are the minimum and maximum values with nextGaussian()?"
while(l.size < collectSize){
val temp = (r.nextGaussian() * stdv + 1).abs.round.toInt //the +1 is the mean(avg) offset. can be whatever
//the abs is clipping the curve in half you could remove it but you'd need to move the +1 over more
if (temp <= maxBellAttempt) l+= temp
}
val res = l.to[scala.collection.immutable.Seq]
//println(res.mkString("\n"))
}
To see the distribution, I pasted the output into Excel and did a "countif" to show the frequency of each value.
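The same rejection idea can be sketched without a mutable loop by drawing from an Iterator and filtering out values that fall outside the range. This is a variation rather than the code above: HalfBell and sample are hypothetical names, the Gaussian is folded with abs first and shifted by 1 afterwards so the smallest result is 1, and the standard deviation of max / 3 is carried over as an assumption.

```scala
object HalfBell {
  // Draws n integers in 1..max whose frequencies fall off like the right
  // half of a bell curve centred near 1. Values above max are rejected
  // and redrawn rather than clamped.
  def sample(rng: java.util.Random, max: Int, n: Int): Vector[Int] = {
    val stdDev = max / 3.0 // assumption: roughly 99% of draws land within max
    Iterator
      .continually((rng.nextGaussian() * stdDev).abs.round.toInt + 1)
      .filter(_ <= max)
      .take(n)
      .toVector
  }
}
```

For example, HalfBell.sample(new java.util.Random(), 100, 10).sorted gives ten values that tend to cluster towards the low end, in the spirit of the output requested in the question.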