printing elements in list using stream - scala

Why does the following code print only 1 and not the rest of the list elements?
scala> val l: List [Int] = List(1,2,3)
l: List[Int] = List(1, 2, 3)
scala> l.toStream.map(x => print(x))
1res3: scala.collection.immutable.Stream[Unit] = Stream((), ?)
What is the correct way to write this code?

I'll divide my answer into two parts:
1. The map method in Scala:
You're using map, which expects a function with no side effects (printing is a side effect). What you're looking for is:
l.toStream.foreach(x => print(x))
Basically, the general idea is that map takes something and converts it to something else (for example, increasing its value), while foreach performs some action on that value and isn't supposed to have a meaningful return value.
scala> l.toStream.foreach(x => print(x))
123
2. Stream in Scala:
Streams are lazy, so Scala only computes the values it needs. Try this:
scala> l.toStream.map(x => x+1)
res2: scala.collection.immutable.Stream[Int] = Stream(2, ?)
You can see it computed the first value, and the question mark indicates that it has no idea what comes after, because it hasn't computed it yet. In your example the first value is nothing (Unit), since print returns no value.
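If you really did want to keep the side-effecting map (not recommended), you would have to force the whole stream yourself, for example with toList or force. A rough sketch of what the REPL would show (the res numbering is illustrative):
scala> l.toStream.map(x => print(x)).toList
123res4: List[Unit] = List((), (), ())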

Stream is an on-demand data structure, which means its values are not evaluated until you need them.
For example:
scala> val stream = (1 to 10000).toStream
stream: scala.collection.immutable.Stream[Int] = Stream(1, ?)
Now if you access head and tail, the stream will be evaluated up to the second element.
scala> stream.head
res13: Int = 1
scala> stream.tail
res14: scala.collection.immutable.Stream[Int] = Stream(2, ?)
scala> stream
res15: scala.collection.immutable.Stream[Int] = Stream(1, 2, ?)
If you access index 99,
scala> stream(99)
res16: Int = 100
Now if you print the stream, you can see it has been evaluated up to index 99:
scala> stream
res17: scala.collection.immutable.Stream[Int] = Stream(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, ?)
It's always good to process only the elements of the stream that you actually need; you can use take() for that.
scala> stream.take(50).foreach(x => print(x + " "))
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50
So, the answer to your question can be:
scala> List(1,2,3).toStream.take(3).foreach(x => print(x + " "))
1 2 3
Reference
https://www.coursera.org/learn/progfun2/home/week/2

To print the complete stream, use
l.toStream.print
Output: 1, 2, 3, empty
To print the first n values, you may use take(n):
l.toStream.take(2).print
Output: 1, 2, empty

You can print it with
l.toStream.foreach(println)
But generally speaking, it is not a good idea to print, or even carelessly process, a whole Stream, since it may be infinite and doing so may cause an error.
More info about Streams here
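For example, an infinite Stream is safe to work with as long as you only force a finite prefix (a small illustration, not from the linked material):
scala> Stream.from(1).take(5).foreach(println)
1
2
3
4
5
// forcing the whole thing, e.g. Stream.from(1).foreach(println), would never terminate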

Streams in Scala are lazy data structures, which means they perform only as much work as needed.
scala> val stream1 = Stream.range(1, 10)
// stream1: scala.collection.immutable.Stream[Int] = Stream(1, ?)
In this case, only the first element is computed. The stream knows how to compute the rest of the elements and will compute them only when it actually needs them, for example when it meets a "consumer":
scala> val list1 = stream1.toList
// list1: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9)
scala> stream1.foreach(print)
// 123456789
But when faced with "transformers", Streams just keep the new transformation with them and do not apply it to the whole Stream. The map method is supposed to provide such transforms:
scala> val stream2 = stream1.map(i => i + 5)
// stream2: scala.collection.immutable.Stream[Int] = Stream(6, ?)
So, it just knows that it has to apply this i => i + 5 function to the respective elements of stream1 to get the elements of stream2, and it will do that when required (when facing a consumer).
Let's consider something similar to your example:
scala> val stream3 = stream1.map(i => println("element :: " + i))
// element :: 1
// stream3: scala.collection.immutable.Stream[Unit] = Stream((), ?)
Here your "transform" function takes an Int element, prints it and returns nothing, which is called Unit or () in Scala. Our lazy stream will compute this transform for the first element and not for the rest, and that computation is what prints element :: 1.
Now, let's see what happens when we apply some consumer to it:
scala> val list3 = stream3.toList
// element :: 2
// element :: 3
// element :: 4
// element :: 5
// element :: 6
// element :: 7
// element :: 8
// element :: 9
// list3: List[Unit] = List((), (), (), (), (), (), (), (), ())
This will look wrong to most people: all I wanted was to convert my stream to a list, so why are all these lines getting printed?
This is why, when you are using map, you should provide a pure function.
What is a pure function? The simple answer is that a pure function only does the things it is supposed to do and nothing else. It does not cause any change outside of its scope. It just takes something and gives something else back.
All of the following are pure functions,
scala> val pf1 = (i: Int) => i + 1
// pf1: Int => Int = $$Lambda$1485/1538411140#6fdc53db
scala> val pf2 = (i: Int) => {
| val x = 100
| val xy = 200
| xy + i
| }
// pf2: Int => Int = $$Lambda$1487/14070792#7bf770ba
scala> val pf3 = (i: Int) => ()
// pf3: Int => Unit = $$Lambda$1486/1145379385#336cd7d5
Whereas the following is not a pure function:
val npf1 = (i: Int) => println(i)
// npf1: Int => Unit = $$Lambda$1488/1736134005#7ac97ba6
Because it causes a "magical" change in the environment.
Why "magical"? Because, it claims to be a function of type Int => Unit which means it should just be transforming an Int to an Unit. But it also printed something on our console, which was outside of its environment.
A real world example of this magic will be that - whenever you put a bread in your toaster it causes a rain storm on the Hulk's current location. And nobody wants the Hulk to come looking for their toaster.

In general, the bottom line is that you should not use side effects in .map. When you do foo.map(bar), it just returns another collection, containing the elements generated by applying bar to the elements of the original collection. It may or may not be lazy. The point is, you should treat the elements of any collection as unevaluated until something looks at them.
If you want side effects, use foreach: Seq(1,2,3).toStream.foreach(println)

Related

How can I split a list into multiple other lists?

I only recently started working with Scala and I came face to face with a problem I can't seem to find a solution to. So basically, I'm given an input text file by the name of "in.txt", which includes lines of coordinates that I have to work with, as shown below.
2 1
6 6
4 2
2 5
2 6
2 7
3 4
6 1
6 2
2 3
6 3
6 4
6 5
6 7
I decided to use a List to store all the values so I could use built in functions to do calculations with the values afterwards.
val lines = io.Source.fromFile("in.txt").getLines
val coordinates =
  lines
    .drop(0)
    .toList
    .sortWith(_ < _)
    .mkString
    .replaceAll("\\s", "")
    .grouped(2)
    .toList
Everything works as it should, as the output of println(coordinates) is
List(21, 23, 25, 26, 27, 34, 42, 61, 62, 63, 64, 65, 66, 67)
But what I want to do next is to create multiple lists out of this one. For example, a new list should be created when a value starts with "2", and all the values that start with "2" would be placed in that new list like this:
List(21, 23, 25, 26, 27)
Then the same would be done with "3", then "4" and so on.
Using functions such as .partition and .groupBy works, but taking into account the fact that the values in the coordinates can also reach 4-digit numbers, and that they can change if the input file is edited, it would be a pain to write all those conditions manually. So basically my question is this: is it possible to achieve this by making use of Scala's functionality, some form of iteration?
Thanks in advance!
I am assuming your file can take a mixture of 2, 3, 4, ... digit strings.
scala> val l = List("12", "13", "123", "1234")
l: List[String] = List(12, 13, 123, 1234)
scala> val grouped = l.groupBy(s => s.take(s.length - 1)).values
grouped: Iterable[List[String]] = MapLike(List(123), List(12, 13), List(1234))
If you want this sorted:
val grouped = l.groupBy(s => s.take(s.length - 1)).toSeq.sortBy(_._1).map{ case (_, l) => l.sorted}
grouped: Seq[List[String]] = ArrayBuffer(List(12, 13), List(123), List(1234))
You can generate all your input conditions with a range:
val conditions = 1 to 9999
And then foldLeft over them, filtering your original list by each of its elements:
conditions.foldLeft(List(): List[List[Int]]) { (acc, elem) =>
  l.filter(_.toString.startsWith(elem.toString)) :: acc
}.filterNot(_.isEmpty)
Output
res28: List[List[Int]] = List(List(67), List(66), List(65), List(64), List(63), List(62), List(61), List(42), List(34), List(27), List(26), List(25), List(23), List(21), List(61, 62, 63, 64, 65, 66, 67), List(42), List(34), List(21, 23, 25, 26, 27))
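For comparison, here is a minimal groupBy sketch applied directly to the coordinate strings from the question, keyed on everything except the last character so it also works for 3- and 4-digit values (the val names are illustrative):
val coordinates = List("21", "23", "25", "26", "27", "34", "42", "61", "62", "63", "64", "65", "66", "67")
// group by all but the last character, then sort the groups by that prefix
val byPrefix = coordinates.groupBy(_.init).toList.sortBy(_._1).map(_._2)
// List(List(21, 23, 25, 26, 27), List(34), List(42), List(61, 62, 63, 64, 65, 66, 67))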

Cumulative sum without var scala [duplicate]

I've got a List of days in the month:
val days = List(31, 28, 31, ...)
I need to return a List with the cumulative sum of days:
val cumDays = List(31, 59, 90)
I've thought of using the fold operator:
(0 /: days)(_ + _)
but this will only return the final result (365), whereas I need the list of intermediate results.
Any way I can do that elegantly?
Scala 2.8 has the methods scanLeft and scanRight which do exactly that.
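For example, a quick illustration with a shortened days list (the seed 0 is dropped with tail; the res numbering is illustrative):
scala> val days = List(31, 28, 31, 30)
days: List[Int] = List(31, 28, 31, 30)
scala> days.scanLeft(0)(_ + _).tail
res0: List[Int] = List(31, 59, 90, 120)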
For 2.7 you can define your own scanLeft like this:
def scanLeft[a, b](xs: Iterable[a])(s: b)(f: (b, a) => b) =
  xs.foldLeft(List(s))((acc, x) => f(acc(0), x) :: acc).reverse
And then use it like this:
scala> scanLeft(List(1,2,3))(0)(_+_)
res1: List[Int] = List(0, 1, 3, 6)
I'm not sure why everybody seems to insist on using some kind of folding, while you basically want to map the values to the cumulated values...
val daysInMonths = List(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
val cumulated = daysInMonths.map{var s = 0; d => {s += d; s}}
//--> List[Int] = List(31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365)
You can simply do it with a fold:
daysInMonths.foldLeft((0, List[Int]())) { (acu, i) =>
  (i + acu._1, i + acu._1 :: acu._2)
}._2.reverse
Fold into a list instead of an integer. Use a pair (partial list with the accumulated values, accumulator with the last sum) as the state in the fold.
Fold your list into a new list. On each iteration, prepend a value which is the sum of the accumulator's head and the next input. Then reverse the entire thing.
scala> val daysInMonths = List(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
daysInMonths: List[Int] = List(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
scala> daysInMonths.foldLeft(Nil: List[Int]) { (acc,next) =>
| acc.firstOption.map(_+next).getOrElse(next) :: acc
| }.reverse
res1: List[Int] = List(31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365)
You can also create a monoid class that concatenates two lists while adding to the second one the last value from the first. No mutables and no folds involved:
case class CumSum(v: List[Int]) { def +(o: CumSum) = CumSum(v ::: (o.v map (_ + v.last))) }
defined class CumSum
scala> List(1,2,3,4,5,6) map {v => CumSum(List(v))} reduce (_ + _)
res27: CumSum = CumSum(List(1, 3, 6, 10, 15, 21))
For any:
val s:Seq[Int] = ...
You can use one of those:
s.tail.scanLeft(s.head)(_ + _)
s.scanLeft(0)(_ + _).tail
or folds proposed in other answers but...
be aware that Landei's solution is tricky and you should avoid it.
BE AWARE
s.map { var s = 0; d => {s += d; s}}
// works as long as `s` is a strict collection
val s2: Seq[Int] = s.view // still seen as Seq[Int]
s2.map { var s = 0; d => {s += d; s}}
// does really weird things!
// Each value will be different whenever you access it!
I would have warned about this in a comment below Landei's answer, but I couldn't :(.
Works on 2.7.7:
def stepSum(sums: List[Int], steps: List[Int]): List[Int] = steps match {
  case Nil => sums.reverse.tail
  case x :: xs => stepSum(sums.head + x :: sums, xs)
}
days
res10: List[Int] = List(31, 28, 31, 30, 31)
stepSum (List (0), days)
res11: List[Int] = List(31, 59, 90, 120, 151)

ScalaCheck: choose an integer with custom probability distribution

I want to create a generator in ScalaCheck that generates numbers between say 1 and 100, but with a bell-like bias towards numbers closer to 1.
Gen.choose() distributes numbers randomly between the min and max value:
scala> (1 to 10).flatMap(_ => Gen.choose(1,100).sample).toList.sorted
res14: List[Int] = List(7, 21, 30, 46, 52, 64, 66, 68, 86, 86)
And Gen.chooseNum() has an added bias for the upper and lower bounds:
scala> (1 to 10).flatMap(_ => Gen.chooseNum(1,100).sample).toList.sorted
res15: List[Int] = List(1, 1, 1, 61, 85, 86, 91, 92, 100, 100)
I'd like a choose() function that would give me a result that looks something like this:
scala> (1 to 10).flatMap(_ => choose(1,100).sample).toList.sorted
res15: List[Int] = List(1, 1, 1, 2, 5, 11, 18, 35, 49, 100)
I see that choose() and chooseNum() take an implicit Choose trait as an argument. Should I use that?
You could use Gen.frequency() (1):
val frequencies = List(
(50000, Gen.choose(0, 9)),
(38209, Gen.choose(10, 19)),
(27425, Gen.choose(20, 29)),
(18406, Gen.choose(30, 39)),
(11507, Gen.choose(40, 49)),
( 6681, Gen.choose(50, 59)),
( 3593, Gen.choose(60, 69)),
( 1786, Gen.choose(70, 79)),
( 820, Gen.choose(80, 89)),
( 347, Gen.choose(90, 100))
)
(1 to 10).flatMap(_ => Gen.frequency(frequencies:_*).sample).toList
res209: List[Int] = List(27, 21, 31, 1, 21, 18, 9, 29, 69, 29)
I got the frequencies from https://en.wikipedia.org/wiki/Standard_normal_table#Complementary_cumulative. The code just uses a sample of the table (every third row, i.e. mod 3), but I think you can get the idea.
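If hard-coding the weights feels awkward, the same idea can be expressed by building the frequency list programmatically. A rough sketch, assuming a made-up decay factor rather than the exact table values (the val names are illustrative):
import org.scalacheck.Gen

val buckets: Seq[(Int, Gen[Int])] =
  (0 until 10).map { i =>
    // 0.55 is an assumed decay factor, only roughly approximating the normal table above
    val weight = math.round(50000 * math.pow(0.55, i)).toInt
    (weight, Gen.choose(i * 10 + 1, math.min((i + 1) * 10, 100)))
  }

val biasedGen: Gen[Int] = Gen.frequency(buckets: _*)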
I can't take much credit for this, and will point you to this excellent page:
http://www.javamex.com/tutorials/random_numbers/gaussian_distribution_2.shtml
A lot of this depends on what you mean by "bell-like". Your example doesn't show any negative numbers, but the number "1" can't be in the middle of the bell and not produce any negative numbers, unless it's a very, very tiny bell!
Forgive the mutable loop, but I sometimes use one when I have to reject values while building a collection:
object Test_Stack extends App {
  val r = new java.util.Random()
  val maxBellAttempt = 102
  val stdv = maxBellAttempt / 3 // this number * 3 will happen about 99% of the time
  val collectSize = 100000
  var filled = false
  val l = scala.collection.mutable.Buffer[Int]()
  // ref article above "What are the minimum and maximum values with nextGaussian()?"
  while (l.size < collectSize) {
    val temp = (r.nextGaussian() * stdv + 1).abs.round.toInt // the +1 is the mean (avg) offset, can be whatever
    // the abs is clipping the curve in half; you could remove it but you'd need to move the +1 over more
    if (temp <= maxBellAttempt) l += temp
  }
  val res = l.to[scala.collection.immutable.Seq]
  // println(res.mkString("\n"))
}
Here's the distribution: I pasted the output into Excel and did a COUNTIF to show the frequency of each value.

how to randomly select a certain number of elements from a list

I would like to randomly select a certain number of elements from a list and make another list out of it. For example out of a list containing 100 elements I would like to randomly select 20 of the elements and store it in another list.
The easiest way to do this is a one-liner:
scala> util.Random.shuffle((1 to 100).toList).take(10)
res0: List[Int] = List(63, 21, 49, 70, 73, 14, 23, 88, 28, 97)
You could try to get clever and avoid shuffling the entire list, but it's almost definitely not necessary, and it'll be very easy to get it wrong.
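For the record, the "clever" alternative is reservoir sampling, which picks k elements in a single pass without shuffling the whole collection. A hedged sketch (the helper name reservoirSample is made up, not a library method):
import scala.util.Random

def reservoirSample[A](xs: TraversableOnce[A], k: Int, rng: Random = new Random): Vector[A] = {
  val reservoir = scala.collection.mutable.ArrayBuffer.empty[A]
  var seen = 0
  xs.foreach { x =>
    seen += 1
    if (reservoir.size < k) reservoir += x // fill the reservoir first
    else {
      val j = rng.nextInt(seen)            // uniform index in [0, seen)
      if (j < k) reservoir(j) = x          // replace with probability k/seen
    }
  }
  reservoir.toVector
}

// reservoirSample((1 to 100).iterator, 20)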
Use util.Random to shuffle the list and then take the first 20 elements:
scala> import scala.util.Random
import scala.util.Random
scala> val l = List.range(1,100)
l: List[Int] = List(1, 2, 3, ...., 98, 99)
scala> Random.shuffle(l).take(20)
res2: List[Int] = List(11, 32, 95, 56, 90, ..., 45, 20)

Combining Scala streams

I need to combine values from several (possibly infinite) streams; the number of streams may vary. Sometimes I want to "draw one from each" and handle them as a tuple, sometimes to interleave the values.
Sample input could be like this:
val as= Stream.from(0)
val bs= Stream.from(10)
val cs= Stream.from(100)
val ds= Stream.from(1000)
val list= List(as, bs, cs, ds)
For the first use case, I would like to end up with something like
Seq(0, 10, 100, 1000), Seq(1, 11, 101, 1001), ...
and for the second
Seq(0, 10, 100, 1000, 1, 11, 101, 1001, ...
Is there a standard, or even built-in, solution for combining Streams?
My solution is identical to the solution from Eastsun but easier to understand:
def combine[A](s: Seq[Stream[A]]): Stream[Seq[A]] = s.map(_.head) #:: combine(s.map(_.tail))
Here it is:
scala> val coms = Stream.iterate(list)(_ map (_.tail)) map (_ map (_.head))
coms: scala.collection.immutable.Stream[List[Int]] = Stream(List(0, 10, 100, 1000), ?)
scala> coms take 5 foreach println
List(0, 10, 100, 1000)
List(1, 11, 101, 1001)
List(2, 12, 102, 1002)
List(3, 13, 103, 1003)
List(4, 14, 104, 1004)
scala> val flat = coms.flatten
flat: scala.collection.immutable.Stream[Int] = Stream(0, ?)
scala> flat take 12 toList
res1: List[Int] = List(0, 10, 100, 1000, 1, 11, 101, 1001, 2, 12, 102, 1002)
The best I have come up with yet looks a bit "crowded", as if I'm trying to write a textbook example of stream operations...
def combine[A](list: List[Stream[A]]): Stream[Seq[A]] = {
  val listOfSeqs = list.map(_.map(Seq(_))) // easier to reduce when everything is a Seq...
  listOfSeqs.reduceLeft((stream1, stream2) => stream1 zip stream2 map {
    case (seq1, seq2) => seq1 ++ seq2
  })
}
def interleave[A](list: List[Stream[A]]): Stream[A] = combine(list).flatten
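A quick sanity check against the sample streams from the question (the REPL res numbering is illustrative):
scala> combine(list).take(2).toList
res0: List[Seq[Int]] = List(List(0, 10, 100, 1000), List(1, 11, 101, 1001))
scala> interleave(list).take(8).toList
res1: List[Int] = List(0, 10, 100, 1000, 1, 11, 101, 1001)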