Is there an idiomatic Scala solution to obtain only the file names from File.listFiles?
Perhaps something like:
val names = new File(dir).listFiles.somemagicTrait(_.getName)
and have names become a List[String]?
I know I can just loop and add them to a mutable list.
How about:
new File(dir).listFiles.map(_.getName).toList
I'm always wary of answering the wrong part of the question, but as Jean-Phillipe commented, you can get an array of the names from
new File(dir).list
and if you really need a list, call toList on that.
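One caveat worth sketching: File.list (like File.listFiles) returns null when the path is not a readable directory, so wrapping it in Option avoids a NullPointerException. The helper name fileNames below is made up for illustration.

```scala
import java.io.File

// Hypothetical helper: the entry names of `dir` as a List[String].
// File.list returns null if `dir` does not exist or is not a directory,
// so Option(...) turns that case into an empty list.
def fileNames(dir: String): List[String] =
  Option(new File(dir).list).map(_.toList).getOrElse(Nil)
```

Calling fileNames(".") would then give the names in the current directory, in whatever order the file system reports them.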
I have a code where I wanted to update an RDD as below:
val xRDD = xRDD.zip(tempRDD)
This gave me the error: recursive value x needs type
I want to maintain xRDD across iterations and modify it with tempRDD in each iteration. How can I achieve this?
Thanks in advance.
The compiler is telling you that you're attempting to define a value in terms of itself: xRDD appears in its own definition. To say this another way, you're attempting to use something that doesn't exist yet in order to define it.
Edit:
If you have a list of actions that each produce a new RDD that you'd like to zip together, perhaps you should look at a fold:
listMyActions.foldLeft(origRDD) { (rdd, f) =>
  val tempRDD = f(rdd)
  rdd.zip(tempRDD)
}
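To make the fold idea concrete without a Spark cluster, here is a minimal sketch in which plain Vectors stand in for RDDs. The names steps and orig are made up, and the map after zip (summing each pair) is my own assumption to keep the element type the same from one iteration to the next; a bare zip would change the type on every step.

```scala
// Plain Scala collections stand in for RDDs (assumption for illustration).
// Each step computes a "tempRDD" from the current value.
val steps: List[Vector[Int] => Vector[Int]] = List(
  (v: Vector[Int]) => v.map(_ + 1),
  (v: Vector[Int]) => v.map(_ * 2)
)

val orig = Vector(1, 2, 3)

// The accumulator plays the role of xRDD: no var and no self-referencing val.
val result = steps.foldLeft(orig) { (acc, f) =>
  val temp = f(acc)
  acc.zip(temp).map { case (a, b) => a + b } // combine, keeping Vector[Int]
}
// result == Vector(9, 15, 21)
```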
Don't forget that vals are immutable: you can't reassign a previously defined val. If you really want to reassign, you can use a var instead, although that is not recommended. This question is more about Scala itself than about Apache Spark. For more information, see the post Use of def val and vars in scala.
In Hadoop it is easy to use .replace(), for example:
String[] valArray = value.toString().replace("\\N", "")
But it doesn't work in Spark. I wrote this Scala in spark-shell:
val outFile=inFile.map(x=>x.replace("\\N",""))
So, how do I deal with it?
For some reason your x is an Array[String]. How did you get it like that? You can .toString.replace it if you like, but that will probably not get you what you want (and would give the wrong output in Java anyway); you probably want another layer of map: inFile.map(x => x.map(_.replace("\\N","")))
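A small sketch of the nested map, with a plain Seq standing in for the RDD and made-up sample rows. Note that the backslash has to be escaped as "\\N" in a Scala string literal for it to mean a literal backslash followed by N.

```scala
// A Seq of rows stands in for the RDD[Array[String]] (assumption).
val inFile: Seq[Array[String]] = Seq(
  Array("a", "\\N", "b\\N"),
  Array("\\N")
)

// Outer map walks the rows, inner map walks the fields of each row.
val outFile: Seq[Array[String]] =
  inFile.map(row => row.map(_.replace("\\N", "")))
// first row becomes Array("a", "", "b"), second row becomes Array("")
```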
I have a val it:Iterator[(A,B)] and I want to create a SortedMap[A,B] with the elements I get out of the Iterator. The way I do it now is:
val map = SortedMap[A,B]() ++ it
It works fine but feels a little bit awkward to use. I checked the SortedMap doc but couldn't find anything more elegant. Is there something like:
it.toSortedMap
or
SortedMap.from(it)
in the standard Scala library that maybe I've missed?
Edit: mixing both ideas from Rex's answer I came up with this:
SortedMap(it.to[Seq]:_*)
Which works just fine and avoids having to specify the type signature of SortedMap. Still looks funny though, so further answers are welcome.
The feature you are looking for does exist for other combinations, but not the one you want. If your collection requires just a single parameter, you can use .to[NewColl]. So, for example,
import collection.immutable._
Iterator(1,2,3).to[SortedSet]
Also, the SortedMap companion object has a varargs apply that can be used to create sorted maps like so:
SortedMap( List((1,"salmon"), (2,"herring")): _* )
(note the : _* which means use the contents as the arguments). Unfortunately this requires a Seq, not an Iterator.
So your best bet is the way you're doing it already.
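For readers on a newer compiler: Scala 2.13 reworked the collections and added SortedMap.from, which accepts an Iterator directly (the .to[SortedSet] form also became .to(SortedSet) there). The sketch below compares it with the ++ approach from the question; the pairs helper and its sample data are made up.

```scala
import scala.collection.immutable.SortedMap

// Hypothetical data source; a def so each call yields a fresh iterator.
def pairs: Iterator[(Int, String)] = Iterator(2 -> "herring", 1 -> "salmon")

// The approach from the question: start empty and append the iterator.
val viaConcat = SortedMap.empty[Int, String] ++ pairs

// Scala 2.13 and later only: build directly from the iterator.
val viaFrom = SortedMap.from(pairs)
```

Both values hold the same entries, ordered by key.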
Suppose I have a text file named "input.txt" and I want to read it in with Scala. The dimensions of the file are not known in advance.
So, how do I construct an Array[Array[Float]] from it? What I want is a simple and neat way, rather than writing Java-style code that iterates over the lines and parses each number. I think functional programming should be quite good at this, but I cannot think of a solution so far.
Best Regards
If your input is well-formed, you can do it like this:
val source = io.Source.fromFile("input.txt")
val data = source.getLines().map(line => line.split(" ").map(_.toFloat)).toArray
source.close()
Update: for additional information about using Source check this thread
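A slightly more defensive variant, as a sketch: it closes the source even if parsing throws, splits on any run of whitespace, and skips blank lines. The function name readMatrix is made up, and the whitespace handling is an assumption about what the input might look like.

```scala
import scala.io.Source

// Hypothetical helper: reads whitespace-separated floats, one row per line.
def readMatrix(path: String): Array[Array[Float]] = {
  val source = Source.fromFile(path)
  try {
    source.getLines()
      .map(_.trim)
      .filter(_.nonEmpty)                  // ignore blank lines
      .map(_.split("\\s+").map(_.toFloat)) // one Array[Float] per line
      .toArray                             // force reading before close()
  } finally {
    source.close()
  }
}
```

Rows may have different lengths; a malformed number still throws a NumberFormatException.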
I'm pretty new to scala and I am not able to solve this (pretty) trivial problem.
I know I can instantiate a List with predefined values like this:
val myList = List(1,2)
I want to fill a List with all integers from 1 to 100000. My goal is to avoid using a var for the List and a loop to fill it.
Is there any "functional" way of doing this?
Either of these will do the trick. (If you try them in the REPL, though, be advised that it's going to try to print all hundred thousand entries, which is generally not going to work.)
List.range(1,100001)
(1 to 100000).toList
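A quick sketch showing that the two forms agree; the only trap is that List.range excludes its end while to includes it. The value names are made up.

```scala
val viaRange = List.range(1, 100001) // end is exclusive, hence 100001
val viaTo = (1 to 100000).toList     // `to` is inclusive

// Both contain 1, 2, ..., 100000.
val same = viaRange == viaTo
```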
I am also very new to Scala. It's pretty awesome, isn't it?
Rex has the absolutely correct answer, but as food for thought: if you want a list that is not evaluated up front (perhaps the computations involved in evaluating the items in the list are expensive, or you just want to make things lazy), you can use a Stream.
Stream.from(1).takeWhile(_<=100000)
This can be used in most situations where you'd use a List.
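A small sketch of the laziness, starting from 1 to match the question. The value names are made up. Note that Scala 2.13 deprecates Stream in favour of LazyList, which offers the same from and takeWhile methods.

```scala
// Nothing beyond the head is computed when this is defined.
val lazyNums = Stream.from(1).takeWhile(_ <= 100000)

// Only the first five elements are forced here.
val firstFive = lazyNums.take(5).toList // List(1, 2, 3, 4, 5)

// Summing forces the whole stream; Long avoids Int overflow.
val total = lazyNums.map(_.toLong).sum // 5000050000
```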