I have this code snippet:
for (f <- file_list){
val file_name = path + "\\" + f + ".txt"
val line_list = Source.fromFile(file_name).getLines()
println (file_name + ": " + line_list.length)
println (file_name + ": " + line_list.length)
total_number_lines += line_list.size
}
I have a list of files. For each of them I open it, load it as a list of its lines and then count the number of lines in the list.
The first call to line_list.length gives the right number of lines, but the next one always returns zero. Actually, after the length function is executed, the line_list list seems to be empty.
I really cannot understand why that is.
What am I missing?
Source.getLines() returns an Iterator[String], not a collection, so calling .length on it will completely consume it.
You can use Source.fromFile(file_name).getLines().toList if you want to go through it several times.
getLines() returns an Iterator[String] and you can only traverse an iterator once. Calling length exhausts the iterator, so subsequent calls to length and size are made after the end has been reached, hence it appears empty:
It is of particular importance to note that, unless stated otherwise,
one should never use an iterator after calling a method on it. The two
most important exceptions are also the sole abstract methods: next and
hasNext.
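A minimal sketch of the behaviour described above, using an in-memory iterator as a stand-in for getLines() so that it runs without a file. The value names are made up for illustration.

```scala
// Stand-in for Source.fromFile(...).getLines(), which is also an Iterator[String].
val lines = Iterator("first", "second", "third")

val firstCount = lines.length   // walks the iterator to its end
val secondCount = lines.length  // in practice nothing is left to count

// Materialising the iterator once gives a collection that can be traversed repeatedly.
val lineList = Iterator("first", "second", "third").toList
val listLength = lineList.length
val listSize = lineList.size

println(s"iterator: $firstCount then $secondCount, list: $listLength and $listSize")
```

With a real file, the same idea would be to call toList on getLines() once and reuse the resulting list.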
val args = "To now was far back saw the *$# giant planet itself, het a won"
Find and sort distinct anagram pairs from "args":
now won
was saw
the het
First I clean up the args and put them in an array.
val argsArray = args.replaceAll("[^a-zA-Z0-9\\s]", "").toLowerCase.split(" ").distinct.sorted
argsArray: Array[String] = Array("", a, back, far, giant, het, itself, now, planet, saw, the, to, was, won)
My idea is to reduce each word to an array of char, then sort, then compare. But I get stuck because the following returns the wrong data type ---- String = [C@2736f24a
for (i <- 0 until argsArray.length - 1){
val j = i + 1
if(argsArray(i).toCharArray.sorted == argsArray(j).toCharArray.sorted) {
println(argsArray(i).toCharArray + " " + argsArray(j).toCharArray)
}
}
I assume there are better ways to solve this, but what I really want to learn is how to deal with this data type problem, so please help me solve that and then I will refactor later. Thank you.
[C@<whatever> is just how an Array[Char] is converted to a String on the JVM. Remove the calls to toCharArray from println and it'll print the strings you want. The second error, with the current code in the question, is the equality check: == on arrays checks that they are the same object, and since sorted will always create a new array, the left and right sides are always different objects even if they have the same elements.
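A minimal sketch of both points, using the words now and won from the question. sameElements and the sorted method on String are standard library calls; the value names are made up for illustration.

```scala
val left = "now".toCharArray.sorted
val right = "won".toCharArray.sorted

// == on arrays compares references, so two freshly sorted arrays are never equal.
val byReference = left == right

// sameElements compares the contents element by element.
val byContent = left.sameElements(right)

// Sorting the String directly yields a String, which == compares by content.
val asStrings = "now".sorted == "won".sorted

// mkString turns the chars back into readable text instead of [C@...
val readable = left.mkString

println(s"$byReference $byContent $asStrings $readable")
```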
I have the following code snippets. The code reads the system (Linux) English dictionary file and keeps it in an in-memory List.
Code 1 : (With mutable List)
val word = scala.collection.mutable.LinkedList[String]("init");
for(line <- Source.fromFile("/usr/share/dict/words").getLines()){
val s : String = line.trim()
if( // some checks
){
word append scala.collection.mutable.LinkedList[String](s)
}
}
Code 2 : (With Immutable List)
var word = List[String]()
for(line <- Source.fromFile("/usr/share/dict/words").getLines()){
val s : String = line.trim()
if( // some checks
){
word ::= s
}
}
Code 2 returns almost immediately, but
Code 1 takes forever.
Can anyone help me understand why the mutable list takes so much time? Should we use mutable collections at all, or am I doing something wrong?
Scala version used : 2.10.3
Thanks in Advance for your help.
word append scala.collection.mutable.LinkedList[String](s)
This traverses the word list to its end and then appends the items from the other list.
word ::= s
This prepends s to the front of the word list and assigns the new list to the word variable.
Appending to the end of a linked list is always expensive compared to adding an item to the front.
In the first example, you are adding to the end of a list repeatedly (append). This takes time on the order of the length of the list. In the second example, you are adding to the beginning of a list (::). This takes constant time. So the first example has an execution time that increases with the square of the number of lines in the file, and the second has an execution time that increases linearly with the length of the file.
This is due to the nature of linked lists, which are the data structure underlying both immutable List and mutable LinkedList. Linked lists are fast to access at the front and slow to access at the back.
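If the words need to stay in file order, two common options are sketched below: a ListBuffer, which supports constant-time append, or prepending to an immutable List and reversing once at the end. This is only a sketch: the sample lines stand in for the dictionary file, and the nonEmpty check is a placeholder for the elided "some checks".

```scala
import scala.collection.mutable.ListBuffer

// Stand-in for the lines of the dictionary file.
val sampleLines = List(" apple ", "", " banana", "cherry ")

// Option 1: ListBuffer appends in constant time and converts to a List at the end.
val buffer = ListBuffer.empty[String]
for (line <- sampleLines) {
  val s = line.trim
  if (s.nonEmpty) buffer += s  // placeholder check
}
val viaBuffer = buffer.toList

// Option 2: prepend (constant time) and reverse once when done.
var acc = List.empty[String]
for (line <- sampleLines) {
  val s = line.trim
  if (s.nonEmpty) acc ::= s
}
val viaPrepend = acc.reverse

println(viaBuffer)
println(viaPrepend)
```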
I have worked with Python.
In Python there is a function .pop() which deletes the last value in a list and returns that deleted value.
e.g. x=[1,2,3,4]
x.pop() will return 4
I was wondering if there is a Scala equivalent for this function?
If you just wish to retrieve the last value, you can call x.last. This won't remove the last element from the list, however, which is immutable. Instead, you can call x.init to obtain a list consisting of all elements in x except the last one - again, without actually changing x. So:
val lastEl = x.last
val rest = x.init
will give you the last element (lastEl), the list of all bar the last element (rest), and you still also have the original list (x).
There are a lot of different collection types in Scala, each with its own set of supported and/or well performing operations.
In Scala, a List is an immutable cons-cell sequence like in Lisp. Getting the last element is not well optimised, since it has to walk the whole list (getting the head element is fast). Similarly, Queue and Stack are optimised for retrieving an element and the rest of the structure from one particular end. You could use either of them if your order is reversed.
Otherwise, Vector is a well-performing general-purpose structure which is fast for both head and last calls:
val v = Vector(1, 2, 3, 4)
val init :+ last = v // uses pattern matching extractor `:+` to get both init and last
Where last would be the equivalent of your pop operation, and init is the sequence with the last element removed (you can also use dropRight(1) as suggested in the other answers). To just retrieve the last element, use v.last.
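Note that the val pattern above fails with a MatchError on an empty Vector. A hedged sketch of a safer variant follows; popLast is a hypothetical helper name, not a library method.

```scala
// Returns the last element together with the remaining Vector,
// or None when there is nothing to pop.
def popLast[A](v: Vector[A]): Option[(A, Vector[A])] = v match {
  case init :+ last => Some((last, init))
  case _            => None
}

println(popLast(Vector(1, 2, 3, 4)))
println(popLast(Vector.empty[Int]))
```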
I tend to use
val popped :: newList = list
which assigns the first element of the list to popped and the remaining list to newList
The first answer is correct, but you can achieve the same by doing:
val last = x.last
val rest = x.dropRight(1)
If you're willing to relax your need for immutable structures, there's always Stack and Queue:
val poppable = scala.collection.mutable.Stack[String]("hi", "ho")
val popped = poppable.pop()
Similar to Python's ability to pop multiple elements, Queue handles that:
val multiPoppable = scala.collection.mutable.Queue[String]("hi", "ho")
val allPopped = multiPoppable.dequeueAll(_ => true)
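For a mutable structure that removes from the end, as Python's pop() does, one option is an ArrayBuffer, whose remove(index) returns the removed element. A small sketch with made-up value names:

```scala
import scala.collection.mutable.ArrayBuffer

val buf = ArrayBuffer(1, 2, 3, 4)

// remove(index) deletes the element in place and returns it.
val lastValue = buf.remove(buf.length - 1)

println(s"popped $lastValue, remaining $buf")
```

Calling this on an empty buffer throws an exception, so check buf.nonEmpty first if that can happen.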
If it is a mutable.Queue, use the dequeue function:
/** Returns the first element in the queue, and removes this element
* from the queue.
*
* @throws java.util.NoSuchElementException
* @return the first element of the queue.
*/
def dequeue(): A =
if (isEmpty)
throw new NoSuchElementException("queue empty")
else {
val res = first0.elem
first0 = first0.next
decrementLength()
res
}
There is some misunderstanding between me and Scala
0 or 1?
object Fun extends App {
def foo(list:List[Int], count:Int = 0): Int = {
if (list.isEmpty) { // when this is true
return 1 // and we are about to return 1, the code goes to the next line
}
foo(list.tail, count + 1) // I know I do not use "return here" ...
count
}
val result = foo( List(1,2,3) )
println ( result ) // 0
}
Why does it print 0?
Why does recursion work even without "return"
(when it is in the middle of function, but not in the end)?
Why doesn't it return 1 when I use "return" explicitly?
--- EDIT:
It will work if I use return here: "return foo(list.tail, count + 1)".
But it does NOT explain (for me) why "return 1" does not work above.
If you read my full explanation below then the answers to your three questions should all be clear, but here's a short, explicit summary for everyone's convenience:
Why does it print 0? This is because the method call was returning count, which had a default value of 0—so it returns 0 and you print 0. If you called it with count=5 then it would print 5 instead. (See the example using println below.)
Why does recursion work even without "return" (when it is in the middle of function, but not in the end)? You're making a recursive call, so the recursion happens, but you weren't returning the result of the recursive call.
Why doesn't it return 1 when I use "return" explicitly? It does, but only in the case when list is empty. If list is non-empty then it returns count instead. (Again, see the example using println below.)
Here's a quote from Programming in Scala by Odersky (the first edition is available online):
The recommended style for methods is in fact to avoid having explicit, and especially multiple, return statements. Instead, think of each method as an expression that yields one value, which is returned. This philosophy will encourage you to make methods quite small, to factor larger methods into multiple smaller ones. On the other hand, design choices depend on the design context, and Scala makes it easy to write methods that have multiple, explicit returns if that's what you desire. [link]
In Scala you very rarely use the return keyword. Instead, you take advantage of the fact that everything is an expression to propagate the return value back up to the top-level expression of the method, and that result is then used as the return value. You can think of return as something more like break or goto, which disrupts the normal control flow and might make your code harder to reason about.
Scala doesn't have statements like Java, but instead everything is an expression, meaning that everything returns a value. That's one of the reasons why Scala has Unit instead of void—because even things that would have been void in Java need to return a value in Scala. Here are a few examples about how expressions work that are relevant to your code:
Things that are expressions in Java act the same in Scala. That means the result of 1+1 is 2, and the result of x.y() is the return value of the method call.
Java has if statements, but Scala has if expressions. This means that the Scala if/else construct acts more like the Java ternary operator. Therefore, if (x) y else z is equivalent to x ? y : z in Java. A lone if like you used is the same as if (x) y else (), where () is the Unit value.
A code block in Java is a statement made up of a group of statements, but in Scala it's an expression made up of a group of expressions. A code block's result is the result of the last expression in the block. Therefore, the result of { o.a(); o.b(); o.c() } is whatever o.c() returned. You can make similar constructs with the comma operator in C/C++: (o.a(), o.b(), o.c()). Java doesn't really have anything like this.
The return keyword breaks the normal control flow in an expression, causing the current method to immediately return the given value. You can think of it kind of like throwing an exception, both because it's an exception to the normal control flow, and because (like the throw keyword) the resulting expression has type Nothing. The Nothing type is used to indicate an expression that never returns a value, and thus can essentially be ignored during type inference. Here's a simple example showing that return has the result type of Nothing:
def f(x: Int): Int = {
val nothing: Nothing = { return x }
throw new RuntimeException("Can't reach here.")
}
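As a small illustration of the expression rules listed above (if/else and blocks both yield values), here is a sketch with made-up names (sign, blockResult, lonely):

```scala
// if/else is an expression: it yields a value, like Java's ternary operator.
def sign(x: Int): String = if (x < 0) "negative" else "non-negative"

// A block is an expression whose value is that of its last expression.
val blockResult = {
  val a = 1
  val b = 2
  a + b
}

// A lone if without else yields the Unit value when the condition is false,
// so the whole expression is typed as Unit here.
val lonely: Unit = if (blockResult > 10) println("big")

println(sign(-1) + ", " + blockResult)
```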
Based on all that, we can look at your method and see what's going on:
def foo(list:List[Int], count:Int = 0): Int = {
// This block (started by the curly brace on the previous line)
// is the top-level expression of this method, therefore its result
// will be used as the result/return value of this method.
if (list.isEmpty) {
return 1 // explicit return (yuck)
}
foo(list.tail, count + 1) // recursive call
count // last statement in block is the result
}
Now you should be able to see that count is being used as the result of your method, except in the case when you break the normal control flow by using return. You can see that the return is working because foo(List(), 5) returns 1. In contrast, foo(List(0), 5) returns 5 because it's using the result of the block, count, as the return value. You can see this clearly if you try it:
println(foo(List())) // prints 1 because list is empty
println(foo(List(), 5)) // prints 1 because list is empty
println(foo(List(0))) // prints 0 because count is 0 (default)
println(foo(List(0), 5)) // prints 5 because count is 5
You should restructure your method so that the body is an expression, and the return value is just the result of that expression. It looks like you're trying to write a method that returns the number of items in the list. If that's the case, this is how I'd change it:
def foo(list:List[Int], count:Int = 0): Int = {
if (list.isEmpty) count
else foo(list.tail, count + 1)
}
When written this way, in the base case (list is empty) it returns the current item count, otherwise it returns the result of the recursive call on the list's tail with count+1.
If you really want it to always return 1 you can change it to if (list.isEmpty) 1 instead, and it will always return 1 because the base case will always return 1.
You're returning the value of count from the first call (that is, 0), not the value from the recursive call of foo.
To be more precise, in your code, you don't use the returned value of the recursive call to foo.
Here is how you can fix it:
def foo(list:List[Int], count:Int = 0): Int = {
if (list.isEmpty) {
1
} else {
foo(list.tail, count + 1)
}
}
This way, you get 1.
By the way, don't use return. It doesn't always do what you would expect.
In Scala, a function implicitly returns the value of its last expression. You don't need to explicitly write return.
Your return works, just not the way you expect because you're ignoring its value. If you were to pass an empty list, you'd get 1 as you expect.
Because you're not passing an empty list, your original code works like this:
foo called with List of 3 elements and count 0 (call this recursion 1)
list is not empty, so we don't get into the block with return
we recursively enter foo, now with 2 elements and count 1 (recursion level 2)
list is not empty, so we don't get into the block with return
we recursively enter foo, now with 1 element and count 2 (recursion level 3)
list is not empty, so we don't get into the block with return
we now enter foo with no elements and count 3 (recursion level 4)
we enter the block with return and return 1
we're back to recursion level 3. The result of the call to foo from which we just came back is neither assigned nor returned, so it's ignored. We proceed to the next line and return count, which is the same value that was passed in, 2
the same thing happens on recursion levels 2 and 1 - we ignore the return value of foo and instead return the original count
the value of count on the recursion level 1 was 0, which is the end result
The fact that you do not have a return in front of foo(list.tail, count + 1) means that, after you return from the recursion, execution is falling through and returning count. Since 0 is passed as a default value for count, once you return from all of the recursed calls, your function is returning the original value of count.
You can see this happening if you add the following println to your code:
def foo(list:List[Int], count:Int = 0): Int = {
if (list.isEmpty) { // when this is true
return 1 // and we are about to return 1, the code goes to the next line
}
foo(list.tail, count + 1) // I know I do not use "return here" ...
println ("returned from foo " + count)
count
}
To fix this you should add a return in front of foo(list.tail.....).
You return count in your program, which is a constant initialized with 0, so that is what you are returning at the top level of your recursion.
Hi, I'm seeing what I believe is odd behaviour in Scala. Calling head on a BufferedIterator seems to be incrementing the head in an inner function. Either my expectations are wrong, in which case why is the output correct? Or is the output wrong?
given:
import scala.io.Source
val source = Source.fromString("abcdef")
val buff1 = source.buffered;
println("outer head 1: " +buff1.head)
println("outer head 2: " +buff1.head)
def readLine():List[String] = {
def buffered = source.buffered
def readLine(tokens:List[String] , partialToken:String):List[String] = {
println("head1 " + buffered.head)
println("head2 " + buffered.head)
return Nil;
}
return (readLine(Nil, ""));
}
readLine();
The expected output of this to me is
outer head 1: a
outer head 2: a
head1: a
head2: a
actual output is as follows.
outer head 1: a
outer head 2: a
head1 b
head2 c
scala.io.Source is and behaves like an Iterator[Char]. So you must make sure not to use it in several places at once: Iterator.next is called 3 times through 3 different buffered iterators in your example, hence the different values you get out of it:
buff1.head: the buffered source has not buffered anything yet, so asking for head here calls next on the inner source, hence the first a.
buff1.head again: here the head has already been buffered, so you get a and the inner source isn't changed.
buffered.head: since buffered is a def, this is equivalent to source.buffered.head. This new buffered source has not buffered anything yet, so asking for head retrieves an element from the inner source, hence the b.
buffered.head: this creates yet another buffered source, same as above, and you get c.
The bottom line is: if you call source.buffered, never use source again directly, and do not call it several times either.
Your example can be fixed by calling buffered immediately:
val source = Source.fromString("abcdef").buffered
You could also turn def buffered = into val buffered = to make sure source.buffered is not called several times.
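The difference between the def and the val can be sketched as follows. The value names are made up for illustration, and each half uses its own fresh Source so the two do not interfere.

```scala
import scala.io.Source

// def: every reference creates a new buffered wrapper, and each wrapper's
// first head pulls one more character from the shared underlying source.
val sharedSource = Source.fromString("abcdef")
def rebuffered = sharedSource.buffered
val viaDef1 = rebuffered.head
val viaDef2 = rebuffered.head

// val: a single wrapper is created once, so head keeps peeking at the same element.
val once = Source.fromString("abcdef").buffered
val viaVal1 = once.head
val viaVal2 = once.head
val consumed = once.next()   // actually advances
val afterNext = once.head

println(s"def: $viaDef1 $viaDef2, val: $viaVal1 $viaVal2, then $consumed, $afterNext")
```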
Calling head on a bufferedIterator seems to be incrementing the head in a inner function.
Note (July 2016, 3 years later):
Commit 11688eb shows:
SI-9691 BufferedIterator should expose a headOption
This exposes a new API to the BufferedIterator trait.
It will return the next element of an iterator as an Option.
The return will be Some(value) if there is a next value, and None if there is not a next element.
That should help avoid any kind of increment.
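A short sketch of headOption, assuming a Scala version that includes this change (2.12 or later). The value names are made up for illustration.

```scala
val it = Iterator(1).buffered

// Peeks at the next element without consuming it.
val before = it.headOption

val taken = it.next()       // consumes the only element

// Nothing left, so headOption reports None instead of throwing.
val after = it.headOption

println(s"$before, $taken, $after")
```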
You are right, except that it increments not a function but a simple field: see IndexedSeqLike on line 66. You can check it yourself by using an IDE debugger and following the execution step by step.