Scala bufferedIterator incrementing head in inner function - scala

Hi, I'm seeing what I believe is odd behaviour in Scala. Calling head on a BufferedIterator seems to advance the head inside an inner function. Are my expectations wrong (and if so, why is the outer output correct), or is the output wrong?
given:
import scala.io.Source
val source = Source.fromString("abcdef")
val buff1 = source.buffered;
println("outer head 1: " +buff1.head)
println("outer head 2: " +buff1.head)
def readLine():List[String] = {
def buffered = source.buffered
def readLine(tokens:List[String] , partialToken:String):List[String] = {
println("head1 " + buffered.head)
println("head2 " + buffered.head)
return Nil;
}
return (readLine(Nil, ""));
}
readLine();
The expected output of this to me is
outer head 1: a
outer head 2: a
head1: a
head2: a
actual output is as follows.
outer head 1: a
outer head 2: a
head1 b
head2 c

scala.io.Source is, and behaves like, an Iterator[Char]. So you must make sure not to use it in several places at once: in your example, Iterator.next is called 3 times on the same source, from 3 different buffered iterators, hence the different values you get out of it:
buff1.head: the buffered source has not buffered anything yet, so asking for head here calls next on the inner source, hence the first a.
buff1.head again: here the head has already been buffered, so you get a and the inner source isn't changed.
buffered.head: since buffered is a def, this is equivalent to source.buffered.head. This new buffered source has not buffered anything yet, so asking for head retrieves an element from the inner source, hence the b.
buffered.head: this creates yet another buffered source, same as above, and you get c.
The bottom line is: if you call source.buffered, never use source again directly, and do not call it several times either.
Your example can be fixed by calling buffered immediately:
val source = Source.fromString("abcdef").buffered
You could also turn def buffered = into val buffered = to make sure source.buffered is not called several times.
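To make the difference concrete, here is a minimal sketch (the object and method names are made up for illustration) that contrasts one shared buffered iterator with a def that wraps the source anew on every call:

```scala
import scala.io.Source

object BufferedDemo {
  // One buffered iterator, created once: head peeks without advancing.
  def sharedHeads(): List[Char] = {
    val buffered = Source.fromString("abcdef").buffered
    List(buffered.head, buffered.head, buffered.next(), buffered.head)
  }

  // A def creates a new wrapper on every access; each new wrapper has an
  // empty buffer, so its first head pulls one more element from the source.
  def freshHeads(): List[Char] = {
    val source = Source.fromString("abcdef")
    def buffered = source.buffered
    List(buffered.head, buffered.head)
  }
}
```

Here sharedHeads() gives List('a', 'a', 'a', 'b'), while freshHeads() gives List('a', 'b'), which is the same effect as in the question.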

Calling head on a bufferedIterator seems to be incrementing the head in a inner function.
Note (July 2016, three years later):
Commit 11688eb shows:
SI-9691 BufferedIterator should expose a headOption
This exposes a new API to the BufferedIterator trait.
It will return the next element of an iterator as an Option.
The return will be Some(value) if there is a next value, and None if there is not a next element.
That should help avoid any kind of increment.
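A small sketch of how headOption behaves, assuming Scala 2.12 or later (the object and method names are illustrative only). Note that headOption still buffers one element of the underlying iterator, so it does not by itself prevent the problem of wrapping the same source several times:

```scala
object HeadOptionDemo {
  // Peeking twice returns the same element; next() then returns it as well.
  def peekTwice(): (Option[Int], Option[Int], Int) = {
    val it = Iterator(1, 2, 3).buffered
    val first = it.headOption
    val second = it.headOption
    (first, second, it.next())
  }

  // On an empty iterator headOption is None instead of throwing.
  def peekEmpty(): Option[Int] =
    Iterator.empty[Int].buffered.headOption
}
```

peekTwice() gives (Some(1), Some(1), 1) and peekEmpty() gives None.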

You are right, except that what it increments is not a function but a simple field: see IndexedSeqLike, line 66. You can check this yourself by stepping through the execution in an IDE debugger.

Related

Filling in desired lines in Scala

I currently have a value of result that is a string which represents cycles in a graph
scala> result
String =
0:0->52->22;
5:5->70->77;
8:8->66->24;8->42->32;
. //
. // trimmed to get the point across
. //
71:71->40->45;
77:77->34->28;77->5->70;
84:84->22->29
However, I want the output to also include the numbers in between, up to and including a certain value. In this example the value is 90:
0:0->52->22;
1:
2:
3:
4:
5:5->70->77;
6:
7:
8:8->66->24;8->42->32;
. //
. // trimmed
. //
83:
84:84->22->29;
85:
86:
87:
88:
89:
90:
If it helps or makes any difference, this value is later converted to a list, like so:
list_result = result.split("\n").toList
List[String] = List(0:0->52->22;, 5:5->70->77;, 8:8->66->24;8->42->32;, 11:11->26->66;11->17->66;
My initial thought was to insert the missing numbers into the list and then sort it, but I had trouble with the sorting, so I am looking here for a better method.
Turn your list_result into a Map with default values. Then walk through the desired number range, exchanging each for its Map value.
val map_result: Map[String,List[String]] =
list_result.groupBy("\\d+:".r.findFirstIn(_).getOrElse("bad"))
.withDefault(List(_))
val full_result: String =
(0 to 90).flatMap(n => map_result(s"$n:")).mkString("\n")
Here's a Scastie session to see it in action.
One option would be to use a Map as an intermediate data structure:
val l: List[String] = List("0:0->52->22;", "5:5->70->77;", "8:8->66->24;8->42->32;", "11:11->26->66;11->17->66;")
val byKey: List[Array[String]] = l.map(_.split(":"))
val stop = 90
val mapOfValues = (0 to stop).map(_ -> "").toMap
val output = byKey.foldLeft(mapOfValues)((acc, nxt) => acc + (nxt.head.toInt -> nxt.tail.head))
output.toList.sorted.foreach { case (key, value) => println(s"$key:$value") }
This will give you the output you are after. It breaks your input strings into pseudo key-value pairs, creates a map to hold the results, inserts the elements of byKey into the map, then prints the results in sorted order.
Note: If you are using this in anything like production code, you'd need to check that each Array in byKey really has two elements, to prevent exceptions (such as a NoSuchElementException) from the later calls to head and tail.head.
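One possible way to do that check, sketched under the assumption of Scala 2.13+ (for toIntOption); SafeFill, parse and fill are hypothetical names, not part of the answer above. Lines that do not look like "number:rest" are simply skipped:

```scala
object SafeFill {
  // Split on the first ':' only; limit 2 keeps an empty value such as "1:".
  def parse(line: String): Option[(Int, String)] =
    line.split(":", 2) match {
      case Array(key, value) => key.trim.toIntOption.map(_ -> value)
      case _                 => None // no ':' at all
    }

  // Emit one "n:value" line for every n in 0..stop, empty when unknown.
  def fill(lines: List[String], stop: Int): List[String] = {
    val known = lines.flatMap(parse).toMap
    (0 to stop).map(n => s"$n:${known.getOrElse(n, "")}").toList
  }
}
```

For example, SafeFill.fill(List("0:0->52->22;", "2:2->1->0;", "oops"), 3) gives List("0:0->52->22;", "1:", "2:2->1->0;", "3:").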
The provided solutions are fine, but I would like to suggest one that can process the data lazily and doesn't need to keep all data in memory at once.
It uses a nice function called unfold, which lets you "unfold" a collection from a starting state, up to a point where you deem the collection to be over (docs).
It's not perfectly polished but I hope it may help:
def readLines(s: String): Iterator[String] =
util.Using.resource(io.Source.fromString(s))(_.getLines)
def emptyLines(from: Int, until: Int): Iterator[String] =
Iterator.range(from, until).map(n => s"$n:")
def indexOf(line: String): Int =
Integer.parseInt(line.substring(0, line.indexOf(':')))
def withDefaults(from: Int, to: Int, it: Iterator[String]): Iterator[String] = {
Iterator.unfold((from, it)) { case (n, lines) =>
if (lines.hasNext) {
val next = lines.next()
val i = indexOf(next)
Some((emptyLines(n, i) ++ Iterator.single(next), (i + 1, lines)))
} else if (n <= to) { // <= so that the line for `to` itself is emitted even when n == to
Some((emptyLines(n, to + 1), (to + 1, lines)))
} else {
None
}
}.flatten
}
You can see this in action here on Scastie.
What unfold does is start from a state (in this case, the line number from and the iterator with the lines) and at every iteration:
if there are still elements in the iterator it gets the next item, identifies its index and returns:
as the next item an Iterator with empty lines up to the latest line number followed by the actual line
e.g. when 5 is reached the empty lines between 1 and 4 are emitted, terminated by the line starting with 5
as the next state, the index of the line after the last in the emitted item and the iterator itself (which, being stateful, is consumed by the repeated calls to unfold at each iteration)
e.g. after processing 5, the next state is 6 and the iterator
if there are no elements in the iterator anymore but the to index has not been reached, it emits another Iterator with the remaining items to be printed (in your example, those after 84)
if both conditions are false we don't need to emit anything anymore and we can close the "unfolding" collection, signalling this by returning a None instead of Some[(Item, State)]
This returns an Iterator[Iterator[String]] where every nested iterator is a range of values from one line to the next, with the default empty lines "sandwiched" in between. The call to flatten turns it into the desired result.
I used an Iterator to make sure that only the essential state is kept in memory at any time and only when it's actually used.
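As a much smaller illustration of the (item, next state) contract of unfold described above, here is a sketch assuming Scala 2.13+ (UnfoldDemo and powersOfTwo are names invented for this example):

```scala
object UnfoldDemo {
  // State: the current power of two. Returning None ends the iteration;
  // Some((item, nextState)) emits an item and carries the state forward.
  def powersOfTwo(limit: Int): List[Int] =
    Iterator.unfold(1) { n =>
      if (n > limit) None else Some((n, n * 2))
    }.toList
}
```

UnfoldDemo.powersOfTwo(100) gives List(1, 2, 4, 8, 16, 32, 64).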

Scala, user input till only newline is given

I am trying to read multiple user inputs and print them in Scala IDE.
I have tried this piece of code
println(scala.io.StdIn.readLine())
which works, as the IDE takes my input and then prints it, but this works only for a single input.
I want the code to take multiple inputs until only a newline is entered. For example,
1
2
3
So I decided I needed an iterator for the input, which led me to try the following two lines of code separately:
var in = Iterator.continually{ scala.io.StdIn.readLine() }.takeWhile { x => x != null}
and
var in = io.Source.stdin.getLines().takeWhile { x => x != null}
Unfortunately neither of them worked, as the IDE is not taking my input at all.
You're really close.
val in = Iterator.continually(io.StdIn.readLine).takeWhile(_.nonEmpty).toList
This will read input until an empty string is entered and save the input in a List[String]. The toList is needed because an Iterator element isn't computed until next is called on it, so readLine won't be called until the next element is required. Converting to a List forces all the elements of the Iterator.
update
As @vossad01 has pointed out, this can be made safer for unexpected input.
val in = Iterator.continually(io.StdIn.readLine)
.takeWhile(Option(_).fold(false)(_.nonEmpty))
.toList
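To try the same logic without a console, the reader function can be passed in as a parameter. This is only a sketch: ReadUntilBlank and its readLine parameter are made up for illustration, and in real use one would pass () => scala.io.StdIn.readLine().

```scala
object ReadUntilBlank {
  // Keeps calling readLine until it yields null (end of input) or "".
  def readUntilBlank(readLine: () => String): List[String] =
    Iterator.continually(readLine())
      .takeWhile(line => line != null && line.nonEmpty)
      .toList
}
```

With a fake input such as val input = Iterator("1", "2", "3", "", "4"), calling ReadUntilBlank.readUntilBlank(() => input.next()) gives List("1", "2", "3"); the line after the blank one is never read.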

Reason for Stream class can be used for 'first class loop'

Stream(1,2,3,4).map(_+10).filter(_%2==0).toList
I'm curious why the above expression is evaluated one element at a time, without building an intermediate collection (a 'first-class loop'). For example,
cons(11, Stream(2,3,4).map(_+10)).filter(_%2==0).toList
cons(12, Stream(3,4).map(_+10)).filter(_%2==0).toList
12 :: cons(13, Stream(4).map(_+10)).filter(_%2==0).toList
12 :: 14 :: List()
Since there is no extra instruction that changes the order of execution,
I thought the order of execution would be like this:
cons(11, Stream(2,3,4).map(_+10)).filter(_%2==0).toList
cons(11, cons(12, Stream(3,4).map(_+10))).filter(_%2==0).toList
cons(11, cons(12, cons(13, Stream(4).map(_+10)))).filter(_%2==0).toList
cons(11, cons(12, cons(13, cons(14, Empty)))).filter(_%2==0).toList
.
.
12 :: 14 :: List()
Because the map call is to the left of the filter call.
... While writing this, I realized that there may be another rule:
'outer call first, inner call later'
and this 'outer -> inner' rule takes precedence over the 'left -> right' rule.
So the inner map call below is evaluated later than the outer filter call.
cons(11, Stream(2,3,4).map(_+10)).filter(_%2==0).toList
Is my thinking right?
Because streams are lazy, each element is evaluated on an "as needed" basis. Consider the following example stream:
val es = Stream(2,3,4).map(x=>{println("add");x+10})
.filter(x=>{println("filt");x%2==0})
The first element is evaluated with the definition of the stream, but nothing else until you ask for it.
scala> es(0)
res314: Int = 12
scala> es(1)
add
filt
add
filt
res315: Int = 14
Think of it this way: when I asked for es(1), it "pulled" 3 through the map (adding 10), but the result failed to get through the filter. Since we still didn't have the next es() element, we had to pull 4 through the map, and this time it passed the filter step.
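The interleaving can also be recorded instead of printed. The sketch below assumes Scala 2.13+, where LazyList replaces the deprecated Stream (unlike Stream, LazyList evaluates even its first element lazily, but the per-element order is the same); LazyTrace and trace are illustrative names.

```scala
import scala.collection.mutable.ListBuffer

object LazyTrace {
  // Returns the final list together with the order in which map and
  // filter were applied to the individual elements.
  def trace(): (List[Int], List[String]) = {
    val events = ListBuffer.empty[String]
    val result = LazyList(1, 2, 3, 4)
      .map { x => events += s"add $x"; x + 10 }
      .filter { x => events += s"filt $x"; x % 2 == 0 }
      .toList
    (result, events.toList)
  }
}
```

The result is List(12, 14), and the recorded events alternate: add 1, filt 11, add 2, filt 12, add 3, filt 13, add 4, filt 14. Each element goes through map and then filter before the next element is touched.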

.pop() equivalent in scala

I have worked with Python.
In Python there is a function .pop() which deletes the last value in a list and returns that
deleted value,
e.g. x=[1,2,3,4]
x.pop() will return 4
I was wondering whether there is a Scala equivalent of this function?
If you just wish to retrieve the last value, you can call x.last. This won't remove the last element from the list, however, which is immutable. Instead, you can call x.init to obtain a list consisting of all elements in x except the last one - again, without actually changing x. So:
val lastEl = x.last
val rest = x.init
will give you the last element (lastEl), the list of all bar the last element (rest), and you still also have the original list (x).
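Both last and init throw on an empty list, so a small helper can bundle the two calls and return an Option instead. This is just a sketch; Pop and pop are hypothetical names, not standard library methods:

```scala
object Pop {
  // Returns the last element and the remaining elements,
  // or None when the sequence is empty. The input is not modified.
  def pop[A](xs: Seq[A]): Option[(A, Seq[A])] =
    xs.lastOption.map(last => (last, xs.init))
}
```

Pop.pop(List(1, 2, 3, 4)) gives Some((4, List(1, 2, 3))), and Pop.pop(List.empty[Int]) gives None.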
There are a lot of different collection types in Scala, each with its own set of supported and/or well performing operations.
In Scala, a List is an immutable cons-cell sequence like in Lisp. Getting the last element is not a well-optimised operation (accessing the head element is fast). Similarly, Queue and Stack are optimised for retrieving an element and the rest of the structure from one particular end. You could use either of them if your order is reversed.
Otherwise, Vector is a good performing general structure which is fast both for head and last calls:
val v = Vector(1, 2, 3, 4)
val init :+ last = v // uses pattern matching extractor `:+` to get both init and last
Where last would be the equivalent of your pop operation, and init is the sequence with the last element removed (you can also use dropRight(1) as suggested in the other answers). To just retrieve the last element, use v.last.
I tend to use
val popped :: newList = list
which assigns the first element of the list to popped and the remaining list to newList
The first answer is correct but you can achieve the same doing:
val last = x.last
val rest = x.dropRight(1)
If you're willing to relax your need for immutable structures, there's always Stack and Queue:
val poppable = scala.collection.mutable.Stack[String]("hi", "ho")
val popped = poppable.pop
Similar to Python's ability to pop multiple elements, Queue handles that:
val multiPoppable = scala.collection.mutable.Queue[String]("hi", "ho")
val allPopped = multiPoppable.dequeueAll(_ => true)
If it is mutable.Queue, use dequeue function
/** Returns the first element in the queue, and removes this element
* from the queue.
*
* @throws java.util.NoSuchElementException
* @return the first element of the queue.
*/
def dequeue(): A =
if (isEmpty)
throw new NoSuchElementException("queue empty")
else {
val res = first0.elem
first0 = first0.next
decrementLength()
res
}

Why does the length function seems to delete the list?

I have this code snippet:
for (f <- file_list){
val file_name = path + "\\" + f + ".txt"
val line_list = Source.fromFile(file_name).getLines()
println (file_name + ": " + line_list.length)
println (file_name + ": " + line_list.length)
total_number_lines += line_list.size
}
I have a list of files; for each of them I open it, load it as a list of its lines and then count the number of lines in the list.
The first call to line_list.length gives the correct number of lines, but the second one always returns zero. In fact, after the length function is executed, line_list seems to be empty.
I really cannot understand why that is.
What am I missing?
Source.getLines() returns an Iterator[String], not a collection, so calling .length on it will completely consume it.
You can use Source.fromFile(file_name).getLines().toList if you want to go through it several times.
getLines() returns an Iterator[String] and you can only traverse an iterator once. Calling length exhausts the iterator, so subsequent calls to length and size are made after the end has been reached, hence it appears empty:
It is of particular importance to note that, unless stated otherwise,
one should never use an iterator after calling a method on it. The two
most important exceptions are also the sole abstract methods: next and
hasNext.
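The effect can be reproduced without any files. The sketch below uses Source.fromString in place of Source.fromFile (IteratorOnce and its methods are names invented for this example):

```scala
import scala.io.Source

object IteratorOnce {
  // Counting on the iterator itself: the first length consumes every line,
  // so the second call finds nothing left.
  def countTwice(text: String): (Int, Int) = {
    val lines = Source.fromString(text).getLines()
    (lines.length, lines.length)
  }

  // Materialising into a List first makes repeated traversal safe.
  def countTwiceFromList(text: String): (Int, Int) = {
    val lines = Source.fromString(text).getLines().toList
    (lines.length, lines.length)
  }
}
```

IteratorOnce.countTwice("a\nb\nc") gives (3, 0), whereas IteratorOnce.countTwiceFromList("a\nb\nc") gives (3, 3).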