Weird Scala behavior when defining simple Stream - scala

Why 6? I'd like to define sequence (5, 6, ...). How to do it correctly using "val" not "def"?
scala> val f: Stream[Int] = 5 #:: f map { _ + 1 }
f: Stream[Int] = Stream(6, ?)
I'm using scala 2.9.2

You need some parentheses to tell it to apply the map to f, but not to the 5:
scala> val f: Stream[Int] = 5 #:: (f map { _ + 1 })
f: Stream[Int] = Stream(5, ?)
scala> f.take(5).toList
res2: List[Int] = List(5, 6, 7, 8, 9)

Related

Error: type mismatch flatMap

I am new to spark programming and scala and i am not able to understand the difference between map and flatMap.
I tried below code as i was expecting both to work but got error.
scala> val b = List("1","2", "4", "5")
b: List[String] = List(1, 2, 4, 5)
scala> b.map(x => (x,1))
res2: List[(String, Int)] = List((1,1), (2,1), (4,1), (5,1))
scala> b.flatMap(x => (x,1))
<console>:28: error: type mismatch;
found : (String, Int)
required: scala.collection.GenTraversableOnce[?]
b.flatMap(x => (x,1))
As per my understanding flatmap make Rdd in to collection for String/Int Rdd.
I was thinking that in this case both should work without any error.Please let me know where i am making the mistake.
Thanks
You need to look at how the signatures defined these methods:
def map[U: ClassTag](f: T => U): RDD[U]
map takes a function from type T to type U and returns an RDD[U].
On the other hand, flatMap:
def flatMap[U: ClassTag](f: T => TraversableOnce[U]): RDD[U]
Expects a function taking type T to a TraversableOnce[U], which is a trait Tuple2 doesn't implement, and returns an RDD[U]. Generally, you use flatMap when you want to flatten a collection of collections, i.e. if you had an RDD[List[List[Int]] and you want to produce a RDD[List[Int]] you can flatMap it using identity.
map(func) Return a new distributed dataset formed by passing each element of the source through a function func.
flatMap(func) Similar to map, but each input item can be mapped to 0 or more output items (so func should return a Seq rather than a single item).
The following example might be helpful.
scala> val b = List("1", "2", "4", "5")
b: List[String] = List(1, 2, 4, 5)
scala> b.map(x=>Set(x,1))
res69: List[scala.collection.immutable.Set[Any]] =
List(Set(1, 1), Set(2, 1), Set(4, 1), Set(5, 1))
scala> b.flatMap(x=>Set(x,1))
res70: List[Any] = List(1, 1, 2, 1, 4, 1, 5, 1)
scala> b.flatMap(x=>List(x,1))
res71: List[Any] = List(1, 1, 2, 1, 4, 1, 5, 1)
scala> b.flatMap(x=>List(x+1))
res75: scala.collection.immutable.Set[String] = List(11, 21, 41, 51) // concat
scala> val x = sc.parallelize(List("aa bb cc dd", "ee ff gg hh"), 2)
scala> val y = x.map(x => x.split(" ")) // split(" ") returns an array of words
scala> y.collect
res0: Array[Array[String]] = Array(Array(aa, bb, cc, dd), Array(ee, ff, gg, hh))
scala> val y = x.flatMap(x => x.split(" "))
scala> y.collect
res1: Array[String] = Array(aa, bb, cc, dd, ee, ff, gg, hh)
Map operation return type is U where as flatMap return type is TraversableOnce[U](means collections)
val b = List("1", "2", "4", "5")
val mapRDD = b.map { input => (input, 1) }
mapRDD.foreach(f => println(f._1 + " " + f._2))
val flatmapRDD = b.flatMap { input => List((input, 1)) }
flatmapRDD.foreach(f => println(f._1 + " " + f._2))
map does a 1-to-1 transformation, while flatMap converts a list of lists to a single list:
scala> val b = List(List(1,2,3), List(4,5,6), List(7,8,90))
b: List[List[Int]] = List(List(1, 2, 3), List(4, 5, 6), List(7, 8, 90))
scala> b.map(x => (x,1))
res1: List[(List[Int], Int)] = List((List(1, 2, 3),1), (List(4, 5, 6),1), (List(7, 8, 90),1))
scala> b.flatMap(x => x)
res2: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 90)
Also, flatMap is useful for filtering out None values if you have a list of Options:
scala> val c = List(Some(1), Some(2), None, Some(3), Some(4), None)
c: List[Option[Int]] = List(Some(1), Some(2), None, Some(3), Some(4), None)
scala> c.flatMap(x => x)
res3: List[Int] = List(1, 2, 3, 4)

scala merge option sequences

Want to merge val A = Option(Seq(1,2)) and val B = Option(Seq(3,4)) to yield a new option sequence
val C = Option(Seq(1,2,3,4))
This
val C = Option(A.getOrElse(Nil) ++ B.getOrElse(Nil)),
seems faster and more idiomatic than
val C = Option(A.toList.flatten ++ B.toList.flatten)
But is there a better way? And am I right that getOrElse is faster and lighter than toList.flatten?
What about a neat for comprehension:
val Empty = Some(Nil)
val C = for {
a <- A orElse Empty
b <- B orElse Empty
} yield a ++ b
Creates less intermediate options.
Or, you could just do a somewhat cumbersome pattern matching:
(A, B) match {
case (None, None) => Nil
case (None, sb#Some(b)) => sb
case (sa#Some(a), None) => sa
case (Some(a), Some(b)) => Some(a ++ b)
}
I think this at least creates less intermediate collections than the double flatten.
Your first case:
// In this case getOrElse is not needed as the option is clearly not `None`.
// So, you can replace the following:
val C = Option(A.getOrElse(Nil) ++ B.getOrElse(Nil))
// By this:
val C = Option(A.get ++ B.get) // A simple concatenation of two sequences.
C: Option[Seq[Int]] = Some(List(1, 2, 3, 4))
Your second case/option is wrong for multiple reasons.
val C = Option(A.toList.flatten ++ B.toList.flatten)
Option[List[Int]] = Some(List(1, 2, 3, 4))
It returns the incorrect type Option[List[Int]] instead of Option[Seq[Int]]
It needlessly invokes toList on A & B. You could simply add the options and invoke flatten on them.
It is not DRY and redundantly calls flatten on both A.toList & B.toList whereas it could call flatten on (A ++ B)
Instead of this, you could do this more efficiently:
val E = Option((A ++ B).flatten.toSeq)
E: Option[Seq[Int]] = Some(List(1, 2, 3, 4))
Using foldLeft
Seq(Some(List(1, 2)), None).foldLeft(List.empty[Int])(_ ++ _.getOrElse(List.empty[Int]))
result: List[Int] = List(1, 2)
Using flatten twice
Seq(Some(Seq(1, 2, 3)), Some(4, 5, 6), None).flatten.flatten
result: Seq(1, 2, 3, 4, 5, 6)
Scala REPL
scala> val a = Some(Seq(1, 2, 3))
a: Some[Seq[Int]] = Some(List(1, 2, 3))
scala> val b = Some(Seq(4, 5, 6))
b: Some[Seq[Int]] = Some(List(4, 5, 6))
scala> val c = None
c: None.type = None
scala> val d = Seq(a, b, c).flatten.flatten
d: Seq[Int] = List(1, 2, 3, 4, 5, 6)

How do you rotate (circular shift) of a Scala collection

I can do this quite easily, and cleanly, using a for loop. For instance, if I wanted to traverse a Seq from every element back to itself I would do the following:
val seq = Seq(1,2,3,4,5)
for (i <- seq.indices) {
for (j <- seq.indices) {
print(seq(i + j % seq.length))
}
}
But as I'm looking to fold over the collection, I'm wondering if there is a more idiomatic approach. A recursive approach would allow me to avoid any vars. But basically, I'm wondering if something like the following is possible:
seq.rotatedView(i)
Which would create a rotated view, like rotating bits (or circular shift).
Is it like below:
scala> def rotatedView(i:Int)=Seq(1,2,3,4,5).drop(i)++Seq(1,2,3,4,5).take(i)
rotatedView: (i: Int)Seq[Int]
scala> rotatedView(1)
res48: Seq[Int] = List(2, 3, 4, 5, 1)
scala> rotatedView(2)
res49: Seq[Int] = List(3, 4, 5, 1, 2)
This ought to do it in a fairly generic way, and allow for arbitrary rotations:
def rotateLeft[A](seq: Seq[A], i: Int): Seq[A] = {
val size = seq.size
seq.drop(i % size) ++ seq.take(i % size)
}
def rotateRight[A](seq: Seq[A], i: Int): Seq[A] = {
val size = seq.size
seq.drop(size - (i % size)) ++ seq.take(size - (i % size))
}
The idea is simple enough, to rotate left, drop the first i elements from the left, and take them again from the left to concatenate them in the opposite order. If you don't mind calculating the size of the collection, you can do your operations modulo the size, to allow i to be arbitrary.
scala> rotateRight(seq, 1)
res34: Seq[Int] = List(5, 1, 2, 3, 4)
scala> rotateRight(seq, 7)
res35: Seq[Int] = List(4, 5, 1, 2, 3)
scala> rotateRight(seq, 70)
res36: Seq[Int] = List(1, 2, 3, 4, 5)
Similarly, you can use splitAt:
def rotateLeft[A](seq: Seq[A], i: Int): Seq[A] = {
val size = seq.size
val (first, last) = seq.splitAt(i % size)
last ++ first
}
def rotateRight[A](seq: Seq[A], i: Int): Seq[A] = {
val size = seq.size
val (first, last) = seq.splitAt(size - (i % size))
last ++ first
}
To make it even more generic, using the enrich my library pattern:
import scala.collection.TraversableLike
import scala.collection.generic.CanBuildFrom
implicit class TraversableExt[A, Repr <: TraversableLike[A, Repr]](xs: TraversableLike[A, Repr]) {
def rotateLeft(i: Int)(implicit cbf: CanBuildFrom[Repr, A, Repr]): Repr = {
val size = xs.size
val (first, last) = xs.splitAt(i % size)
last ++ first
}
def rotateRight(i: Int)(implicit cbf: CanBuildFrom[Repr, A, Repr]): Repr = {
val size = xs.size
val (first, last) = xs.splitAt(size - (i % size))
last ++ first
}
}
scala> Seq(1, 2, 3, 4, 5).rotateRight(2)
res0: Seq[Int] = List(4, 5, 1, 2, 3)
scala> List(1, 2, 3, 4, 5).rotateLeft(2)
res1: List[Int] = List(3, 4, 5, 1, 2)
scala> Stream(1, 2, 3, 4, 5).rotateRight(1)
res2: scala.collection.immutable.Stream[Int] = Stream(5, ?)
Keep in mind these are not all necessarily the most tuned for performance, and they also can't work with infinite collections (none can).
Following the OP's comment that they want to fold over it, here's a slightly different take on it that avoids calculating the length of the sequence first.
Define an iterator that will iterate over the rotated sequence
class RotatedIterator[A](seq: Seq[A], start: Int) extends Iterator[A] {
var (before, after) = seq.splitAt(start)
def next = after match {
case Seq() =>
val (h :: t) = before; before = t; h
case h :: t => after = t; h
}
def hasNext = after.nonEmpty || before.nonEmpty
}
And use it like this:
val seq = List(1, 2, 3, 4, 5)
val xs = new RotatedIterator(seq, 2)
println(xs.toList) //> List(3, 4, 5, 1, 2)
A simple method is to concatenate the sequence with itself and then take the slice that is required:
(seq ++ seq).slice(start, start + seq.length)
This is just a variant of the drop/take version but perhaps a little clearer.
Given:
val seq = Seq(1,2,3,4,5)
Solution:
seq.zipWithIndex.groupBy(_._2<3).values.flatMap(_.map(_._1))
or
seq.zipWithIndex.groupBy(_._2<3).values.flatten.map(_._1)
Result:
List(4, 5, 1, 2, 3)
If rotation is more than length of collection - we need to use rotation%length, if negative than formula (rotation+1)%length and take absolute value.
It's not efficient
Another tail-recursive approach. When I benchmarked it with JMH it was about 2 times faster than solution based on drop/take:
def rotate[A](list: List[A], by: Int): List[A] = {
#tailrec
def go(list: List[A], n: Int, acc: List[A]): List[A] = {
if(n > 0) {
list match {
case x :: xs => go(xs, n-1, x :: acc)
}
} else {
list ++ acc.reverse
}
}
if (by < 0) {
go(list, -by % list.length, Nil)
} else {
go(list, list.length - by % list.length, Nil)
}
}
//rotate right
rotate(List(1,2,3,4,5,6,7,8,9,10), 3) // List(8, 9, 10, 1, 2, 3, 4, 5, 6, 7)
//use negative number to rotate left
rotate(List(1,2,3,4,5,6,7,8,9,10), -3) // List(4, 5, 6, 7, 8, 9, 10, 1, 2, 3)
Here is one liner solution
def rotateRight(A: Array[Int], K: Int): Array[Int] = {
if (null == A || A.size == 0) A else (A drop A.size - (K % A.size)) ++ (A take A.size - (K % A.size))
}
rotateRight(Array(1,2,3,4,5), 3)
Here's a fairly simple and idiomatic Scala collections way to write it:
def rotateSeq[A](seq: Seq[A], isLeft: Boolean = false, count: Int = 1): Seq[A] =
if (isLeft)
seq.drop(count) ++ seq.take(count)
else
seq.takeRight(count) ++ seq.dropRight(count)
We can simply use foldLeft to reverse a list as below.
val input = List(1,2,3,4,5)
val res = input.foldLeft(List[Int]())((s, a) => { List(a) ++: s})
println(res) // List(5, 4, 3, 2, 1)
Another one line solution if you don't need to validate the "offset":
def rotate[T](seq: Seq[T], offset: Int): Seq[T] = Seq(seq, seq).flatten.slice(offset, offset + seq.size)
This is a simple piece of code
object tesing_it extends App
{
val one = ArrayBuffer(1,2,3,4,5,6)
val i = 2 //the number of index you want to move
for(z<-0 to i){
val y = 0
var x = one += one(y)
x = x -= x(y)
println("for seq after process " +z +" " + x)
}
println(one)
}
Result:
for seq after process 0 ArrayBuffer(2, 3, 4, 5, 6, 1)
for seq after process 1 ArrayBuffer(3, 4, 5, 6, 1, 2)
for seq after process 2 ArrayBuffer(4, 5, 6, 1, 2, 3)
ArrayBuffer(4, 5, 6, 1, 2, 3)

Merging streams in scala

I need help with merging two streams into one. The output has to be as follows:
(elem1list1#elem1list2, elem2list1#elem2list2...)
and the function breaks if any of streams is empty
def mergeStream(a: Stream[A], b: Stream[A]):Stream[A] =
if (a.isEmpty || b.isEmpty) Nil
else (a,b) match {
case(x#::xs, y#::ys) => x#::y
}
Any clue how to fix it?
You can also zip two Streams together, which will truncate the longer Stream, and flatMap them out of the tuples:
a.zip(b).flatMap { case (a, b) => Stream(a, b) }
Though I cannot speak to it's efficiency.
scala> val a = Stream(1,2,3,4)
a: scala.collection.immutable.Stream[Int] = Stream(1, ?)
scala> val b = Stream.from(3)
b: scala.collection.immutable.Stream[Int] = Stream(3, ?)
scala> val c = a.zip(b).flatMap { case (a, b) => Stream(a, b) }.take(10).toList
c: List[Int] = List(1, 3, 2, 4, 3, 5, 4, 6)
def mergeStream(s1: Stream[Int], s2: Stream[Int]): Stream[Int] = (s1, s2) match {
case (x#::xs, y#::ys) => x #:: y #:: mergeStream(xs, ys)
case _ => Stream.empty
}
scala> mergeStream(Stream.from(1), Stream.from(100)).take(10).toList
res0: List[Int] = List(1, 100, 2, 101, 3, 102, 4, 103, 5, 104)
You can use interleave from scalaz:
scala> (Stream(1,2) interleave Stream.from(10)).take(10).force
res1: scala.collection.immutable.Stream[Int] = Stream(1, 10, 2, 11, 12, 13, 14, 15, 16, 17)

What is the #:: operator in a scala Stream? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Searching the Scala documentation for #::
I'm looking through the docs of Stream
The filter method has this code:
def naturalsFrom(i: Int): Stream[Int] = i #:: naturalsFrom(i + 1)
naturalsFrom(1) 10 } filter { _ % 5 == 0 } take 10 mkString(", ")
What is the #:: operator? Does this map to a function call somewhere?
As SHildebrandt says, #:: is the cons operator for Streams.
In other words, #:: is to streams what :: is to Lists
val x = Stream(1,2,3,4) //> x : scala.collection.immutable.Stream[Int] = Stream(1, ?)
10#::x //> res0: scala.collection.immutable.Stream[Int] = Stream(10, ?)
val y = List(1,2,3,4) //> y : List[Int] = List(1, 2, 3, 4)
10::y //> res1: List[Int] = List(10, 1, 2, 3, 4)
x #:: xs
returns
Stream.cons(x, xs)
which returns a Stream of an element x followed by a Stream xs.