I need to combine values from several (possibly infinite) streams; the number of streams may vary. Sometimes I want to "draw one from each" and handle them as a tuple, sometimes to interleave the values.
Sample input could be like this:
val as = Stream.from(0)
val bs = Stream.from(10)
val cs = Stream.from(100)
val ds = Stream.from(1000)
val list = List(as, bs, cs, ds)
For the first use case, I would like to end up with something like
Seq(0, 10, 100, 1000), Seq(1, 11, 101, 1001), ...
and for the second
Seq(0, 10, 100, 1000, 1, 11, 101, 1001, ...
Is there a standard, or even built-in, solution for combining Streams?
My solution does the same thing as Eastsun's, but is easier to understand:
def combine[A](s: Seq[Stream[A]]): Stream[Seq[A]] = s.map(_.head) #:: combine(s.map(_.tail))
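For illustration, a quick usage sketch against the streams defined in the question:

combine(list).take(3).foreach(println)
// List(0, 10, 100, 1000)
// List(1, 11, 101, 1001)
// List(2, 12, 102, 1002)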
Here it is:
scala> val coms = Stream.iterate(list)(_ map (_.tail)) map (_ map (_.head))
coms: scala.collection.immutable.Stream[List[Int]] = Stream(List(0, 10, 100, 1000), ?)
scala> coms take 5 foreach println
List(0, 10, 100, 1000)
List(1, 11, 101, 1001)
List(2, 12, 102, 1002)
List(3, 13, 103, 1003)
List(4, 14, 104, 1004)
scala> val flat = coms.flatten
flat: scala.collection.immutable.Stream[Int] = Stream(0, ?)
scala> flat take 12 toList
res1: List[Int] = List(0, 10, 100, 1000, 1, 11, 101, 1001, 2, 12, 102, 1002)
The best I have come up with so far looks a bit "crowded", as if I'm trying to write a textbook example of stream operations...
def combine[A](list: List[Stream[A]]): Stream[Seq[A]] = {
  val listOfSeqs = list.map(_.map(Seq(_))) // easier to reduce when everything is a Seq...
  listOfSeqs.reduceLeft((stream1, stream2) => stream1 zip stream2 map {
    case (seq1, seq2) => seq1 ++ seq2
  })
}
def interleave[A](list: List[Stream[A]]): Stream[A] = combine(list).flatten
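For completeness, a usage sketch with the streams from the question; interleave produces the second desired output:

interleave(list).take(8).toList
// List(0, 10, 100, 1000, 1, 11, 101, 1001)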
I have to find out the maximum/minimum from this list:
val data= List(List(1,2), List(3,4,91,9,10),11,211,456,345)
From some Stack Overflow example, I can see the solution below:
val flatdata = data.collect{case i:Int => List(i); case l @ a :: b => l}.flatten
But this is giving an error. Can someone please help?
I want a solution using pure Scala, not Spark.
Let's go through your code in more detail:
scala> val data= List(List(1,2), List(3,4,91,9,10),11,211,456,345)
data: List[Any] = List(List(1, 2), List(3, 4, 91, 9, 10), 11, 211, 456, 345)
The type of data is a List[Any] because the list is not one specific type. The compiler tries to infer the type, but since Int and List[Int] aren't compatible, it resolves to Any.
scala> data.collect{case i:Int => List(i); case l @ a :: b => l}
res0: List[List[Any]] = List(List(1, 2), List(3, 4, 91, 9, 10), List(11), List(211), List(456), List(345))
This second part tries to consolidate the entries into a List of Lists: it matches an Int and wraps it in a List, and keeps any existing list as it is.
However, you see the type here is still List[List[Any]].
Now the last part, the .flatten
scala> res0.flatten
res1: List[Any] = List(1, 2, 3, 4, 91, 9, 10, 11, 211, 456, 345)
This takes you from a List[List[Any]] to a List[Any].
Now the key part: if you try to call .max or .min on this list, it won't work, since there is no Ordering for Any.
<console>:13: error: No implicit Ordering defined for Any.
The fix is to force the type in the original collect call.
scala> data.collect{case i:Int => List(i); case l : List[Int] => l}
res6: List[List[Int]] = List(List(1, 2), List(3, 4, 91, 9, 10), List(11), List(211), List(456), List(345))
scala> .flatten
res7: List[Int] = List(1, 2, 3, 4, 91, 9, 10, 11, 211, 456, 345)
scala> .max
res8: Int = 456
scala> res7.min
res9: Int = 1
One thing wrong with your code is that the result is List[Any], which isn't going to be terribly useful.
This gives a compiler warning but produces a List[Int] result.
data.flatMap{case li:List[Int] => li; case i:Int => List(i)}
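A quick check of that approach (a sketch; the List[Int] match is unchecked, so expect an erasure warning):

val flat = data.flatMap{ case li: List[Int] => li; case i: Int => List(i) }
// List(1, 2, 3, 4, 91, 9, 10, 11, 211, 456, 345)
flat.max  // 456
flat.min  // 1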
There's nothing wrong with your code; it evaluates fine. I'm not sure what error you're getting.
scala> val data= List(List(1,2), List(3,4,91,9,10),11,211,456,345)
data: List[Any] = List(List(1, 2), List(3, 4, 91, 9, 10), 11, 211, 456, 345)
scala> data.collect{case i:Int => List(i); case l @ a :: b => l}
res0: List[List[Any]] = List(List(1, 2), List(3, 4, 91, 9, 10), List(11), List(211), List(456), List(345))
scala> data.collect{case i:Int => List(i); case l @ a :: b => l}.flatten
res1: List[Any] = List(1, 2, 3, 4, 91, 9, 10, 11, 211, 456, 345)
Suppose I have two Streams which could be finite or infinite:
val a = Stream(1, 2, 3)
val b = Stream(95, 96, 97, 98, 99)
I can zip them together like so:
a.zip(b).flatMap { case (x, y) => Stream(x, y) }
However, the end result would merge three elements from a and three from b (1, 95, 2, 96, 3, 97). What I'd like to achieve is to zip those two Streams and if one's bigger in size, append the remainder. So the output would be 1, 95, 2, 96, 3, 97, 98, 99.
Is there a nice functional way to achieve this?
You can use zipAll + Option to do that.
def join[A](s1: Stream[A], s2: Stream[A]): Stream[A] =
  s1.map(a => Some(a)).zipAll(s2.map(a => Some(a)), None, None).flatMap {
    case (Some(a1), Some(a2)) => Stream(a1, a2)
    case (Some(a1), None)     => Stream(a1)
    case (None, Some(a2))     => Stream(a2)
    case (None, None)         => Stream.empty
  }
join(Stream(1, 2, 3), Stream(95, 96, 97, 98, 99))
// res: Stream[Int] = Stream(1, 95, 2, 96, 3, 97, 98, 99)
(PS: If you are on Scala 2.13, use LazyList instead of Stream.)
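For reference, the same join on Scala 2.13's LazyList would look roughly like this (a sketch, assuming only the standard library):

def join[A](s1: LazyList[A], s2: LazyList[A]): LazyList[A] =
  s1.map(Option(_)).zipAll(s2.map(Option(_)), None, None).flatMap {
    case (Some(a1), Some(a2)) => LazyList(a1, a2)  // both streams still have elements
    case (Some(a1), None)     => LazyList(a1)      // only the left stream remains
    case (None, Some(a2))     => LazyList(a2)      // only the right stream remains
    case (None, None)         => LazyList.empty    // unreachable with zipAll, kept for exhaustiveness
  }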
I want to filter a list of lists based on a few of the elements in it.
println("dataBind = " + dataBind)
dataBind = List(List(3,60,90,T3,T6), List(3,90,89,T32,T5), List(3,60,90,T5,T6), List(3,120,89,T32,T5))
I want to filter this List[List[String]] based on the first elements of each inner list: if the first three elements are repeated, I don't want the duplicates.
My expected output:
List(List(3,60,90,T3,T6), List(3,90,89,T32,T5), List(3,120,89,T32,T5))
When I checked some similar questions, they used this for a list of tuples:
dataBind.groupBy(v => (v._1, v._2, v._3)).keys.toList
How can I do this for the above-mentioned list?
Assuming your lists always contain more than 3 elements:
scala> val l = List(List(1,2,3,4), List(2,3,4,5), List(1,2,3,5))
l: List[List[Int]] = List(List(1, 2, 3, 4), List(2, 3, 4, 5), List(1, 2, 3, 5))
scala> l.groupBy(_.take(3))
res1: scala.collection.immutable.Map[List[Int],List[List[Int]]] = Map(List(1, 2, 3) -> List(List(1, 2, 3, 4), List(1, 2, 3, 5)), List(2, 3, 4) -> List(List(2, 3, 4, 5)))
It is then up to you what you do with the groups. For example, if you only want the Lists whose first 3 elements are unique:
scala> res1.collect{ case (_, List(l)) => l}
res2: scala.collection.immutable.Iterable[List[Int]] = List(List(2, 3, 4, 5))
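If you instead want to keep one list per group while preserving the input order (groupBy does not guarantee any order), Scala 2.13+ has distinctBy, which is a shorter sketch of the same idea:

l.distinctBy(_.take(3))
// List(List(1, 2, 3, 4), List(2, 3, 4, 5))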
Since you want to group the lists based on the first few elements of each item, you can do the following:
scala> val dataBind = List(List(3,60,90,"T3","T6"),List(3,90,89,"T32","T5"),List(3,60,90,"T5","T6"), List(3,120,89,"T32","T5"))
dataBind: List[List[Any]] = List(List(3, 60, 90, T3, T6), List(3, 90, 89, T32, T5), List(3, 60, 90, T5, T6), List(3, 120, 89, T32, T5))
scala> dataBind.groupBy(_.take(3)).mapValues(_.head).values.toList
res8: List[List[Any]] = List(List(3, 120, 89, T32, T5), List(3, 60, 90, T3, T6), List(3, 90, 89, T32, T5))
You can specify the transformation of your choice inside the mapValues method to derive the desired result.
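For instance, a sketch of a different transformation: counting how many lists share the same first three elements instead of keeping one of them:

dataBind.groupBy(_.take(3)).mapValues(_.size).toList
// e.g. List((List(3, 60, 90), 2), (List(3, 90, 89), 1), (List(3, 120, 89), 1)) -- order not guaranteed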
I would like to randomly select a certain number of elements from a list and make another list out of it. For example out of a list containing 100 elements I would like to randomly select 20 of the elements and store it in another list.
The easiest way to do this is a one-liner:
scala> util.Random.shuffle((1 to 100).toList).take(10)
res0: List[Int] = List(63, 21, 49, 70, 73, 14, 23, 88, 28, 97)
You could try to get clever and avoid shuffling the entire list, but it's almost definitely not necessary, and it'll be very easy to get it wrong.
Use util.Random to shuffle the list and then take the first 20 elements:
scala> import scala.util.Random
import scala.util.Random
scala> val l = List.range(1,100)
l: List[Int] = List(1, 2, 3, ...., 98, 99)
scala> Random.shuffle(l).take(20)
res2: List[Int] = List(11, 32, 95, 56, 90, ..., 45, 20)
I want to sort only a range of elements within an array, and I have a comparison function "compr" already in the code to compare two values.
I want something like this:
Sorting.stableSort(arr[i,j], compr)
where arr[i,j] is a range of elements in the array.
Take the slice as a view, sort and copy it back (or take a slice as a working buffer).
scala> val vs = Array(3,2,8,5,4,9,1,10,6,7)
vs: Array[Int] = Array(3, 2, 8, 5, 4, 9, 1, 10, 6, 7)
scala> vs.view(2,5).toSeq.sorted.copyToArray(vs,2)
scala> vs
res31: Array[Int] = Array(3, 2, 4, 5, 8, 9, 1, 10, 6, 7)
Outside the REPL, the extra .toSeq isn't needed:
vs.view(2,5).sorted.copyToArray(vs,2)
Updated:
scala 2.13.8> val vs = Array(3, 2, 8, 5, 4, 9, 1, 10, 6, 7)
val vs: Array[Int] = Array(3, 2, 8, 5, 4, 9, 1, 10, 6, 7)
scala 2.13.8> vs.view.slice(2,5).sorted.copyToArray(vs,2)
val res0: Int = 3
scala 2.13.8> vs
val res1: Array[Int] = Array(3, 2, 4, 5, 8, 9, 1, 10, 6, 7)
Split the array into three parts, sort the middle part, and then concatenate them. Not the most efficient way, but this is FP, who cares about performance =)
// split, sort the middle part with your sorter of choice, then concatenate
val firstPart   = l.take(FROM)
val sortingPart = l.slice(FROM, UNTIL)
val lastPart    = l.drop(UNTIL)

val sorted = firstPart ++ Sorter.sort(sortingPart) ++ lastPart
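A concrete sketch of the same idea with placeholder bounds (FROM = 2, UNTIL = 5) and the standard sorted in place of the hypothetical Sorter.sort:

val l = List(3, 2, 8, 5, 4, 9, 1, 10, 6, 7)
val sorted = l.take(2) ++ l.slice(2, 5).sorted ++ l.drop(5)
// List(3, 2, 4, 5, 8, 9, 1, 10, 6, 7)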
Something like this:
def stableSort[T](x: Seq[T], i: Int, j: Int, comp: (T, T) => Boolean): Seq[T] = {
  x.take(i) ++ x.slice(i, j).sortWith(comp) ++ x.drop(j)
}
def comp: (Int, Int) => Boolean = { case (x1, x2) => x1 < x2 }
val x = Array(1,9,5,6,3)
stableSort(x,1,4, comp)
// > res0: Seq[Int] = ArrayBuffer(1, 5, 6, 9, 3)
If your class implements Ordering it would be less cumbersome.
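A sketch of what that could look like with an implicit Ordering in scope (an assumed variant, not part of the original answer):

def stableSortRange[T: Ordering](x: Seq[T], i: Int, j: Int): Seq[T] =
  x.take(i) ++ x.slice(i, j).sorted ++ x.drop(j)

stableSortRange(Seq(1, 9, 5, 6, 3), 1, 4)
// Seq(1, 5, 6, 9, 3)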
This should be as good as you can get without reimplementing the sort. It creates just one extra array, the size of the slice to be sorted.
def stableSort[K: reflect.ClassTag](xs: Array[K], from: Int, to: Int, comp: (K, K) => Boolean): Unit = {
  val tmp = xs.slice(from, to)             // copy the slice into a working buffer
  scala.util.Sorting.stableSort(tmp, comp) // sort the buffer
  tmp.copyToArray(xs, from)                // copy the sorted slice back in place
}
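A usage sketch, reusing the array and comparison from the earlier answers:

val vs = Array(3, 2, 8, 5, 4, 9, 1, 10, 6, 7)
stableSort(vs, 2, 5, (a: Int, b: Int) => a < b)
// vs is now Array(3, 2, 4, 5, 8, 9, 1, 10, 6, 7)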