Transform a Scala Stream to a new Stream which is the sum of the current element and the previous element - scala

How to transform a Scala Stream of integers so that we have a new Stream where the elements are the sum of this element and the previous element.
By example if the input stream is 1, 2, 3, 4 ... then the output stream is 1, 3, 5, 7.
Also a second question, how would you make the sum use the previous one in the output stream so that the output would be 1, (2+(1)), (3+(2+1)), (4+(3+(2+1))).

Just zip your stream with a shifted version of itself and sum the two elements.
val s1 = Stream.from(0) // 0, 1, 2, 3, ...
val s2 = Stream.from(1) // 1, 2, 3, 4, ...
val sumOfTwo = s1.zip(s2).map{ case (a,b) => a+b } // 1, 3, 5, 7, ...
To compute the total sum, just use the scan function that acts like a fold but returning elements at each step.
val totalSum = s1.scan(0)((ctr, el) => ctr + el) // 0, 1, 3, 6, 10, ...

This answer computes the cumulative sum by using a variable for the accumulated result instead of scan(). Example program:
import scala.collection.immutable.Stream
object Main extends App {
// 1, 2, 3, ...
val naturals = Stream.from(1)
// cumulative sum (see https://stackoverflow.com/a/8567134/1071311)
def sumUp(s : Stream[Int], acc : Int = 0) : Stream[Int] =
Stream.cons(s.head + acc, sumUp(s.tail, s.head + acc))
val firstFive = sumUp(naturals, 0).take(5)
firstFive.foreach(println _)
}
Output:
1
3
6
10
15

Related

Getting maximum value in remaining sub-list for each element in a list

val input = List(16, 17, 4, 5, 3, 0)
We have to sort in a way where starting from the first element we need to return the maximum element in the remaining list.
I want output like 17,5,5,3,0.
I've Tried Below code
val v1 = input.scanRight(Int.MinValue)(math.max).dropRight(1)
println("variation 1, with scan")
println(v1)
If I understand your requirement correctly, you could use foldRight to traverse the list from right to left, and at each iteration store the maximum value of the traversed elements in the accumulator, which is a Tuple of (List[Int], Int):
val input = List(16, 17, 4, 5, 3, 0)
input.foldRight((List[Int](), Int.MinValue)){ case (i, (ls, j)) =>
val m = i max j
(m :: ls, m)
}._1
// res1: List[Int] = List(17, 17, 5, 5, 3, 0)
Reversing will work in above case
var max = Int.MinValue
val buffer = new scala.collection.mutable.ArrayBuffer[Int](input.length)
for (i <- input.reverse if i >= max) {
max = i
i +=: buffer
}
println(buffer.toList)

Operate on neighbor elements in RDD in Spark

As I have a collection:
List(1, 3,-1, 0, 2, -4, 6)
It's easy to make it sorted as:
List(-4, -1, 0, 1, 2, 3, 6)
Then I can construct a new collection by compute 6 - 3, 3 - 2, 2 - 1, 1 - 0, and so on like this:
for(i <- 0 to list.length -2) yield {
list(i + 1) - list(i)
}
and get a vector:
Vector(3, 1, 1, 1, 1, 3)
That is, I want to make the next element minus the current element.
But how to implement this in RDD on Spark?
I know for the collection:
List(-4, -1, 0, 1, 2, 3, 6)
There will be some partitions of the collection, each partition is ordered, can I do the similar operation on each partition and collect results on each partition together?
The most efficient solution is to use sliding method:
import org.apache.spark.mllib.rdd.RDDFunctions._
val rdd = sc.parallelize(Seq(1, 3,-1, 0, 2, -4, 6))
.sortBy(identity)
.sliding(2)
.map{case Array(x, y) => y - x}
Suppose you have something like
val seq = sc.parallelize(List(1, 3, -1, 0, 2, -4, 6)).sortBy(identity)
Let's create first collection with index as key like Ton Torres suggested
val original = seq.zipWithIndex.map(_.swap)
Now we can build collection shifted by one element.
val shifted = original.map { case (idx, v) => (idx - 1, v) }.filter(_._1 >= 0)
Next we can calculate needed differences ordered by index descending
val diffs = original.join(shifted)
.sortBy(_._1, ascending = false)
.map { case (idx, (v1, v2)) => v2 - v1 }
So
println(diffs.collect.toSeq)
shows
WrappedArray(3, 1, 1, 1, 1, 3)
Note that you can skip the sortBy step if reversing is not critical.
Also note that for local collection this could be computed much more simple like:
val elems = List(1, 3, -1, 0, 2, -4, 6).sorted
(elems.tail, elems).zipped.map(_ - _).reverse
But in case of RDD the zip method requires each collection should contain equal element count for each partition. So if you would implement tail like
val tail = seq.zipWithIndex().filter(_._2 > 0).map(_._1)
tail.zip(seq) would not work since both collection needs equal count of elements for each partition and we have one element for each partition that should travel to previous partition.

Scala trying to count instances of a digit in a number

This is my first day using scala. I am trying to make a string of the number of times each digit is represented in a string. For instance, the number 4310227 would return "1121100100" because 0 appears once, 1 appears once, 2 appears twice and so on...
def pow(n:Int) : String = {
val cubed = (n * n * n).toString
val digits = 0 to 9
val str = ""
for (a <- digits) {
println(a)
val b = cubed.count(_==a.toString)
println(b)
}
return cubed
}
and it doesn't seem to work. would like some scalay reasons why and to know whether I should even be going about it in this manner. Thanks!
When you iterate over strings, which is what you are doing when you call String#count(), you are working with Chars, not Strings. You don't want to compare these two with ==, since they aren't the same type of object.
One way to solve this problem is to call Char#toString() before performing the comparison, e.g., amend your code to read cubed.count(_.toString==a.toString).
As Rado and cheeken said, you're comparing a Char with a String, which will never be be equal. An alternative to cheekin's answer of converting each character to a string is to create a range from chars, ie '0' to '9':
val digits = '0' to '9'
...
val b = cubed.count(_ == a)
Note that if you want the Int that a Char represents, you can call char.asDigit.
Aleksey's, Ren's and Randall's answers are something you will want to strive towards as they separate out the pure solution to the problem. However, given that it's your first day with Scala, depending on what background you have, you might need a bit more context before understanding them.
Fairly simple:
scala> ("122333abc456xyz" filter (_.isDigit)).foldLeft(Map.empty[Char, Int]) ((histo, c) => histo + (c -> (histo.getOrElse(c, 0) + 1)))
res1: scala.collection.immutable.Map[Char,Int] = Map(4 -> 1, 5 -> 1, 6 -> 1, 1 -> 1, 2 -> 2, 3 -> 3)
This is perhaps not the fastest approach because intermediate datatype like String and Char are used but one of the most simplest:
def countDigits(n: Int): Map[Int, Int] =
n.toString.groupBy(x => x) map { case (n, c) => (n.asDigit, c.size) }
Example:
scala> def countDigits(n: Int): Map[Int, Int] = n.toString.groupBy(x => x) map { case (n, c) => (n.asDigit, c.size) }
countDigits: (n: Int)Map[Int,Int]
scala> countDigits(12345135)
res0: Map[Int,Int] = Map(5 -> 2, 1 -> 2, 2 -> 1, 3 -> 2, 4 -> 1)
Where myNumAsString is a String, eg "15625"
myNumAsString.groupBy(x => x).map(x => (x._1, x._2.length))
Result = Map(2 -> 1, 5 -> 2, 1 -> 1, 6 -> 1)
ie. A map containing the digit with its corresponding count.
What this is doing is taking your list, grouping the values by value (So for the initial string of "15625", it produces a map of 1 -> 1, 2 -> 2, 6 -> 6, and 5 -> 55.). The second bit just creates a map of the value to the count of how many times it occurs.
The counts for these hundred digits happen to fit into a hex digit.
scala> val is = for (_ <- (1 to 100).toList) yield r.nextInt(10)
is: List[Int] = List(8, 3, 9, 8, 0, 2, 0, 7, 8, 1, 6, 9, 9, 0, 3, 6, 8, 6, 3, 1, 8, 7, 0, 4, 4, 8, 4, 6, 9, 7, 4, 6, 6, 0, 3, 0, 4, 1, 5, 8, 9, 1, 2, 0, 8, 8, 2, 3, 8, 6, 4, 7, 1, 0, 2, 2, 6, 9, 3, 8, 6, 7, 9, 5, 0, 7, 6, 8, 7, 5, 8, 2, 2, 2, 4, 1, 2, 2, 6, 8, 1, 7, 0, 7, 6, 9, 5, 5, 5, 3, 5, 8, 2, 5, 1, 9, 5, 7, 2, 3)
scala> (new Array[Int](10) /: is) { case (a, i) => a(i) += 1 ; a } map ("%x" format _) mkString
warning: there were 1 feature warning(s); re-run with -feature for details
res7: String = a8c879caf9
scala> (new Array[Int](10) /: is) { case (a, i) => a(i) += 1 ; a } sum
warning: there were 1 feature warning(s); re-run with -feature for details
res8: Int = 100
I was going to point out that no one used a char range, but now I see Kristian did.
def pow(n:Int) : String = {
val cubed = (n * n * n).toString
val cnts = for (a <- '0' to '9') yield cubed.count(_ == a)
(cnts map (c => ('0' + c).toChar)).mkString
}

How to take the first distinct (until the moment) elements of a list?

I am sure there is an elegant/funny way of doing it,
but I can only think of a more or less complicated recursive solution.
Rephrasing:
Is there any standard lib (collections) method nor simple combination of them to take the first distinct elements of a list?
scala> val s = Seq(3, 5, 4, 1, 5, 7, 1, 2)
s: Seq[Int] = List(3, 5, 4, 1, 5, 7, 1, 2)
scala> s.takeWhileDistinct //Would return Seq(3,5,4,1), it should preserve the original order and ignore posterior occurrences of distinct values like 7 and 2.
If you want it to be fast-ish, then
{ val hs = scala.collection.mutable.HashSet[Int]()
s.takeWhile{ hs.add } }
will do the trick. (Extra braces prevent leaking the temp value hs.)
This is a short approach in a maximum of O(2logN).
implicit class ListOps[T](val s: Seq[T]) {
def takeWhileDistinct: Seq[T] = {
s.indexWhere(x => { s.count(x==) > 1 }) match {
case ind if (ind > 0) => s.take(
s.indexWhere(x => { s.count(x==) > 1 }, ind + 1) + ind).distinct
case _ => s
}
}
}
val ex = Seq(3, 5, 4, 5, 7, 1)
val ex2 = Seq(3, 5, 4, 1, 5, 7, 1, 5)
println(ex.takeWhileDistinct.mkString(", ")) // 3, 4, 5
println(ex2.takeWhileDistinct.mkString(", ")) // 3, 4, 5, 1
Look here for live results.
Interesting problem. Here's an alternative. First, let's get the stream of s so we can avoid unnecessary work (though the overhead is likely to be greater than the saved work, sadly).
val s = Seq(3, 5, 4, 5, 7, 1)
val ss = s.toStream
Now we can build s again, but keeping track of whether there are repetitions or not, and stopping at the first one:
val newS = ss.scanLeft(Seq[Int]() -> false) {
case ((seen, stop), current) =>
if (stop || (seen contains current)) (seen, true)
else ((seen :+ current, false))
}
Now all that's left is take the last element without repetition, and drop the flag:
val noRepetitionsS = newS.takeWhile(!_._2).last._1
A variation on Rex's (though I prefer his...)
This one is functional throughout, using the little-seen scanLeft method.
val prevs = xs.scanLeft(Set.empty[Int])(_ + _)
(xs zip prevs) takeWhile { case (x,prev) => !prev(x) } map {_._1}
UPDATE
And a lazy version (using iterators, for moar efficiency):
val prevs = xs.iterator.scanLeft(Set.empty[Int])(_ + _)
(prevs zip xs.iterator) takeWhile { case (prev,x) => !prev(x) } map {_._2}
Turn the resulting iterator back to a sequence if you want, but this'll also work nicely with iterators on both the input AND the output.
The problem is simpler than the std lib function I was looking for (takeWhileConditionOverListOfAllAlreadyTraversedItems):
scala> val s = Seq(3, 5, 4, 1, 5, 7, 1, 2)
scala> s.zip(s.distinct).takeWhile{case(a,b)=>a==b}.map(_._1)
res20: Seq[Int] = List(3, 5, 4, 1)

Shuffling part of an ArrayBuffer in-place

I need to shuffle part of an ArrayBuffer, preferably in-place so no copies are required. For example, if an ArrayBuffer has 10 elements, and I want to shuffle elements 3-7:
// Unshuffled ArrayBuffer of ints numbered 0-9
0, 1, 2, 3, 4, 5, 6, 7, 8, 9
// Region I want to shuffle is between the pipe symbols (3-7)
0, 1, 2 | 3, 4, 5, 6, 7 | 8, 9
// Example of how it might look after shuffling
0, 1, 2 | 6, 3, 5, 7, 4 | 8, 9
// Leaving us with a partially shuffled ArrayBuffer
0, 1, 2, 6, 3, 5, 7, 4, 8, 9
I've written something like shown below, but it requires copies and iterating over loops a couple of times. It seems like there should be a more efficient way of doing this.
def shufflePart(startIndex: Int, endIndex: Int) {
val part: ArrayBuffer[Int] = ArrayBuffer[Int]()
for (i <- startIndex to endIndex ) {
part += this.children(i)
}
// Shuffle the part of the array we copied
val shuffled = this.random.shuffle(part)
var count: Int = 0
// Overwrite the part of the array we chose with the shuffled version of it
for (i <- startIndex to endIndex ) {
this.children(i) = shuffled(count)
count += 1
}
}
I could find nothing about partially shuffling an ArrayBuffer with Google. I assume I will have to write my own method, but in doing so I would like to prevent copies.
If you can use a subtype of ArrayBuffer you can access the underlying array directly, since ResizableArray has a protected member array:
import java.util.Collections
import java.util.Arrays
import collection.mutable.ArrayBuffer
val xs = new ArrayBuffer[Int]() {
def shuffle(a: Int, b: Int) {
Collections.shuffle(Arrays.asList(array: _*).subList(a, b))
}
}
xs ++= (0 to 9) // xs = ArrayBuffer(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
xs.shuffle(3, 8) // xs = ArrayBuffer(0, 1, 2, 4, 6, 5, 7, 3, 8, 9)
Notes:
The upper bound for java.util.List#subList is exclusive
It's reasonably efficient because Arrays#asList doesn't create a new set of elements: it's backed by the array itself (hence no add or remove methods)
If using for real, you probably want to add bounds-checks on a and b
I'm not entirely sure why it must be in place - you might want to think that over. But, assuming it's the right thing to do, this could do it:
import scala.collection.mutable.ArrayBuffer
implicit def array2Shufflable[T](a: ArrayBuffer[T]) = new {
def shufflePart(start: Int, end: Int) = {
val seq = (start.max(0) until end.min(a.size - 1)).toSeq
seq.zip(scala.util.Random.shuffle(seq)) foreach { t =>
val x = a(t._1)
a.update(t._1, a(t._2))
a(t._2) = x
}
a
}
}
You can use it like:
val a = ArrayBuffer(1,2,3,4,5,6,7,8,9)
println(a)
println(a.shufflePart(2, 7))
edit: If you're willing to pay the extra cost of an intermediate sequence, this is more reasonable, algorithmically speaking:
def shufflePart(start: Int, end: Int) = {
val seq = (start.max(0) until end.min(a.size - 1)).toSeq
seq.zip(scala.util.Random.shuffle(seq) map { i =>
a(i)
}) foreach { t =>
a.update(t._1, t._2)
}
a
}
}