File I/O loop breaking prematurely? - scala

I'm reading a file line by line using this loop:
for(line <- s.getLines()){
mylist += otherFunction(line);
}
where the variable mylist is an ArrayBuffer that stores a collection of custom datatypes. otherFunction(line) does something like this...
def otherFunction(list:String)={
val line = s.getLine(index);
val t = new CustomType(0,1,line(0));
t
}
and CustomType is defined as...
class CustomType(name:String,id:Int,num:Int){}
I've omitted much of the code, as you can see, because it's not relevant. I can run the rest of my functions and it'll read the file line by line until EOF as long as I comment out the last line of otherFunction(). Why is returning a value from this function to my list causing my for loop to stop?

It's not clear exactly what you're trying to do here. I assume s is a scala.io.Source object. Why does otherFunction take a string argument that it doesn't use? getLine is deprecated, and you don't say where index comes from. Do you really want to refer to the first character in the line String with index 0, and is it really supposed to be an Int? Assuming that this is actually what you want to do, why not just use a map on the iterator?
val list = s.getLines.map(i => new CustomType("0", 1, i(0).asDigit)).toIndexedSeq
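For illustration, here is a minimal self-contained sketch of the map-over-iterator approach. It assumes an in-memory source built with Source.fromString in place of the real file, and a hypothetical case-class version of CustomType so the fields can be inspected; the names MapLinesDemo and parse are made up for this example.

```scala
import scala.io.Source

object MapLinesDemo {
  // Hypothetical stand-in for the question's CustomType (a case class so fields are readable).
  final case class CustomType(name: String, id: Int, num: Int)

  // Builds one CustomType per non-empty line, using the first character as a digit.
  def parse(source: Source): IndexedSeq[CustomType] =
    source.getLines()
      .filter(_.nonEmpty) // line(0) would throw on an empty line
      .map(line => CustomType("0", 1, line(0).asDigit))
      .toIndexedSeq

  // Assumption: an in-memory source stands in for the real file.
  val parsed: IndexedSeq[CustomType] = parse(Source.fromString("1abc\n2def\n"))
}
```

Because the iterator is consumed exactly once by map, nothing else reads from the source while the lines are being traversed.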

Related

Scala foldLeft with a List

I have the following code snippet:
import scala.io.Source
object test extends App {
val lineIterator = Source.fromFile("test1.txt").getLines()
val fileContent = lineIterator.foldLeft(List[String]())((list, currentLine) => {
currentLine :: list
list
})
fileContent foreach println
}
Let's assume the test1.txt file is not empty and has some values in it.
So my question about the foldLeft function is, why does this example here return an empty list, and when I remove the list at the end of the foldLeft function it works?
Why is it returning an empty list under the value fileContent?
The line currentLine :: list does not mutate the original list. It creates a new list with currentLine prepended, and returns that new list. When this expression is not the last one in your block, it will just be discarded, and the (still) empty list will be returned.
When you remove the list at the end, you will actually return currentLine :: list.
You call foldLeft with some start value (in your case an empty list) and a function, which takes an accumulator and the current value. This function returns the new list then. In your implementation the empty list from the first call will propagate to the end of function execution. This is why you get an empty list as a result.
Please look at this example: https://twitter.github.io/scala_school/collections.html#fold
list is immutable, so it is still empty after currentLine :: list. Thus the code within brackets returns an empty List, which is then folded with the next item, still returning an empty List.
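The difference described above can be sketched side by side. This is only an illustration: the object and method names are invented, and an in-memory iterator replaces the file.

```scala
object FoldLeftDemo {
  // The last expression of the function is the new accumulator, so the prepended list is kept.
  def keepsResult(lines: Iterator[String]): List[String] =
    lines.foldLeft(List.empty[String])((acc, line) => line :: acc)

  // Here the prepended list is built and then thrown away; the unchanged accumulator is returned.
  def dropsResult(lines: Iterator[String]): List[String] =
    lines.foldLeft(List.empty[String]) { (acc, line) =>
      val ignored = line :: acc // a new list; acc itself is not modified
      acc
    }
}
```

Note that prepending with :: yields the lines in reverse order; call reverse on the result if the original order matters.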

Scala code: reverse operation

I wanted to ensure I understand some scala code correctly. I have a method in a class as:
def getNodes(): IndexedSeq[Node] = allNodes
Then somewhere this method gets called as:
val nodes = graph.getNodes()
and then there is a line
val orderedNodes = nodes ++ nodes.reverse
Does this make another sequence where the original sequence and the reversed get concatenated or is there some other subtlety to it as well?
Yes, the result is a new IndexedSeq containing items just like you wrote. You're calling methods ++ and reverse that are well documented here:
http://www.scala-lang.org/api/2.10.3/index.html#scala.collection.IndexedSeq
Your code can be written like this:
val orderedNodes = nodes.++(nodes.reverse)
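A small sketch of the same expression, assuming a Vector of Ints in place of the Node values from the question:

```scala
object ReverseDemo {
  // Assumption: Ints stand in for the question's Node type.
  val nodes: IndexedSeq[Int] = Vector(1, 2, 3)

  // A new sequence: the original elements followed by the same elements in reverse order.
  val orderedNodes: IndexedSeq[Int] = nodes ++ nodes.reverse
}
```

Neither ++ nor reverse modifies nodes; both return new collections.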

Scala: value split is not a member of char

I am trying to write word count program in Scala. I'm using a string "file" :
file.map( _.split(" ")).flatMap(word => (word, 1)).reduceByKey( _ + _ )
It keeps saying:
value split is not a member of Char
Can't figure out how to solve it!
When you call map on a String it is wrapped with WrappedString which extends AbstractSeq[Char]. Therefore, when you call map it is as if you are doing so on a Seq of Char not a Seq of String.
See the link below for the code https://github.com/scala/scala/blob/v2.10.2/src/library/scala/collection/immutable/WrappedString.scala
The code below splits by whitespace and returns the size, a word counter.
val file = "Some test data"
file.split("\\s+").size
To count how many times each word appears in the string:
val file = "Some test data test"
println(file.split("\\s+").toList.groupBy(w => w).mapValues(_.length))
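To make the Char-versus-String distinction concrete, here is a plain-Scala sketch (no Spark involved); the object and value names are made up for illustration.

```scala
object WordCountDemo {
  val text = "Some test data test"

  // map on a String visits each Char, which is why _.split is rejected there.
  val upper: String = text.map(_.toUpper)

  // Splitting first yields Strings, which can then be grouped and counted.
  val words: Array[String] = text.split("\\s+")
  val counts: Map[String, Int] =
    words.groupBy(identity).map { case (word, occurrences) => word -> occurrences.length }
}
```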
I found out that the code was fine. Because I was running it on Spark, the result was held in a lazy RDD that I still needed to materialize somehow. Saving it to a text file solved the problem. Here is the code:
file.flatMap(line => line.split(" ")).map(w => (w, 1)).reduceByKey(_ + _).saveAsTextFile("OUT.txt")
Thanks.

type mismatch in string concatenation

I'm really new to Scala and I'm not even able to concatenate Strings. Here is my code:
object RandomData {
private[this] val bag = new scala.util.Random
def apply(sensorId: String, stamp: Long, size: Int): String = {
var cpt: Int = 0
var data: String = "test"
repeat(10) {
data += "_test"
}
return data
}
}
I got the error:
type mismatch;
found : Unit
required: com.excilys.ebi.gatling.core.structure.ChainBuilder
What am I doing wrong?
repeat is offered by Gatling in order to repeat Gatling tasks, e.g., query a website. If you have a look at the documentation (I wasn't able to find a link to the API doc of repeat), you'll see that repeat expects a chain, which is why your error message says "required: com.excilys.ebi.gatling.core.structure.ChainBuilder". However, all you do is to append to a string - which will not return a value of type ChainBuilder.
Moreover, appending to a string is nothing that should be done via Gatling. It looks to me as if you are confusing Gatling's repeat with a Scala for loop. If you only want to append "_test" to data 10 times, use one of Scala's loops (for, while) or a functional approach with e.g. foldLeft. Here are two examples:
/* Imperative style loop */
for(i <- 1 to 10) {
data += "_test"
}
/* Functional style with lazy streams */
data += Stream.continually("_test").take(10).mkString("")
Your problem is that the block
{
data += "_test"
}
evaluates to Unit, whereas the repeat method seems to want it to evaluate to a ChainBuilder.
Check out the documentation for the repeat method. I was unable to find it, but it's probably reasonable to assume that it looks something like
def repeat(numTimes: Int)(thunk: => ChainBuilder): Unit
I'm not sure if the repeat method does anything special, but with your usage, you could just use this block instead of the repeat(10){...}
for(i <- 1 to 10) data += "_test"
Also, as a side note, you don't need the return keyword in Scala. You can just write data instead of return data.
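As a sketch of the loop-free alternatives, the same string can be built without a var. The object and value names here are invented for the example.

```scala
object RepeatDemo {
  // String repetition via the * operator that Scala adds to String.
  val viaMultiply: String = "test" + "_test" * 10

  // The same result with foldLeft: start from "test" and append "_test" ten times.
  val viaFold: String = (1 to 10).foldLeft("test")((acc, _) => acc + "_test")
}
```

Both values are plain expressions, so they could serve directly as the last line of apply, with no return keyword.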

Scala DSL: method chaining with parameterless methods

I am creating a small Scala DSL and have run into the following problem, for which I don't really have a solution. A small conceptual example of what I want to achieve:
(Compute
write "hello"
read 'name
calc()
calc()
write "hello" + 'name
)
The code defining this DSL is roughly this:
object Compute extends Compute {
...
// an implicit conversion needs the source type as a parameter
implicit def str2Message(s: String): Message = ...
}
class Compute {
def write(msg: Message): Compute = ...
def read(s: Symbol): Compute = ...
def calc(): Compute = { ... }
}
Now the question: how can I get rid of the parentheses after calc? Is it possible? If so, how? Just omitting them in the definition does not help because of compilation errors.
OK, I think I found an acceptable solution. I have now achieved this syntax:
| write "hello"
| read 'name
| calc
| calc
| write "hello " + 'name
Using an object named "|", I am able to write nearly the DSL I wanted. Normally, a ";" is needed after calc if it is parameterless. The trick here is to have calc accept the DSL object itself (here, the "|" on the next line). Making this parameter implicit also allows calc as the last statement in this code.
Well, it looks like it is definitely not possible to have it the way I want, but this is OK too.
It's not possible to get rid of the parentheses, but you can replace them. For example:
object it
class Compute {
def calc(x: it.type): Compute = { ... }
}
(Compute
write "hello"
read 'name
calc it
calc it
write "hello" + 'name
)
To expand a bit, whenever Scala sees something like this:
object method
non-reserved-word
It assumes it means object.method(non-reserved-word). Conversely, whenever it sees something like this:
object method object
method2 object2
It assumes these are two independent statements, as in object.method(object); method2.object2, expecting method2 to be a new object, and object2 a method.
These assumptions are part of Scala grammar: it is meant to be this way on purpose.
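The "calc it" trick can be sketched as a small runnable example. Everything beyond the names it, Compute, write and calc is an assumption made for illustration: Message is replaced by String, and each call records itself in a log so the chaining can be observed.

```scala
object DslDemo {
  // Marker object whose only job is to give calc an argument.
  object it

  // Assumption: a log of calls stands in for whatever the real DSL does.
  class Compute(val log: Vector[String]) {
    def write(msg: String): Compute = new Compute(log :+ s"write $msg")
    def calc(x: it.type): Compute = new Compute(log :+ "calc")
  }
  object Compute extends Compute(Vector.empty)

  // Inside parentheses newlines do not end statements, so this parses as
  // Compute.write("hello").calc(it).calc(it).write("bye").
  val result: Compute = (Compute
    write "hello"
    calc it
    calc it
    write "bye"
  )
}
```

Because every method now takes exactly one argument, each line is an ordinary infix call and the chain stays unambiguous.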
First, try removing the parentheses from the definition of calc. Second, try using curly braces around the whole expression instead of parentheses. Curly braces and parentheses don't mean the same thing, and I find that parentheses work best in single-line code (unless you use semicolons). See also "What is the formal difference in Scala between braces and parentheses, and when should they be used?"