Reading files from a directory in Scala - scala

How do I get the list of files (or all *.txt files, for example) in a directory in Scala?
The Source class does not seem to help.

new java.io.File(dirName).listFiles.filter(_.getName.endsWith(".txt"))
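One caveat with the one-liner above: listFiles returns null when the path does not exist or is not a directory. A minimal sketch that guards against that, assuming a hypothetical helper named txtFiles (not part of any library):

```scala
import java.io.File

object ListTxtFiles {
  // Returns the .txt files directly inside `dir`, or an empty list when
  // `dir` is missing or not a directory (listFiles returns null then).
  def txtFiles(dir: File): List[File] =
    Option(dir.listFiles)
      .map(_.toList)
      .getOrElse(Nil)
      .filter(f => f.isFile && f.getName.endsWith(".txt"))

  def main(args: Array[String]): Unit = {
    // Defaults to the current directory when no argument is given.
    val dir = new File(args.headOption.getOrElse("."))
    txtFiles(dir).foreach(f => println(f.getPath))
  }
}
```

Wrapping the result in Option keeps the null check in one place, so callers always get a plain List.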

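The JDK 7 version, using the DirectoryStream class, is:
import java.nio.file.{Files, Paths}
import scala.collection.JavaConversions._
val path = Paths.get(dirName) // dirName: the directory to list
Files.newDirectoryStream(path)
  .filter(_.getFileName.toString.endsWith(".txt"))
  .map(_.toAbsolutePath)
Instead of strings, this yields Path values, which have handy methods such as relativize and subpath.
Note that the scala.collection.JavaConversions._ import is what enables interop with Java collections here. JavaConversions was deprecated in Scala 2.12 and removed in 2.13; on newer versions, import scala.jdk.CollectionConverters._ and call .asScala on the stream explicitly.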
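The DirectoryStream approach can also be sketched without any collection conversions, which sidesteps the question of which converter import a given Scala version needs. This is only a sketch: txtPaths is a hypothetical helper name, and it relies on the newDirectoryStream overload that takes a glob pattern. It also closes the stream, since a DirectoryStream holds an open directory handle.

```scala
import java.nio.file.{Files, Path, Paths}
import scala.collection.mutable.ListBuffer

object ListTxtPaths {
  // Collects the absolute paths of entries matching "*.txt" directly inside `dir`.
  // Throws an IOException subtype if `dir` does not exist or is not a directory.
  def txtPaths(dir: Path): List[Path] = {
    val stream = Files.newDirectoryStream(dir, "*.txt") // glob filtering is done by the JDK
    try {
      val found = ListBuffer.empty[Path]
      val it = stream.iterator()
      while (it.hasNext) found += it.next().toAbsolutePath
      found.toList
    } finally stream.close() // release the directory handle
  }

  def main(args: Array[String]): Unit =
    txtPaths(Paths.get(args.headOption.getOrElse("."))).foreach(println)
}
```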

The Java File class is really all you need, although it's easy enough to add some Scala goodness to make iteration over directories easier.
// listFiles returns an Array, which Scala iterates directly; no JavaConversions import is needed
for (file <- myDirectory.listFiles if file.getName.endsWith(".txt")) {
  // process the file
}

For now, you should use Java libraries to do so.

Related

Pass json file into JAR or read from spark session

I have a Spark UDF written in Scala. I'd like to use my function with some additional files.
import scala.io.Source
import org.json4s.jackson.JsonMethods.parse
import org.json4s.DefaultFormats
object consts {
implicit val formats = DefaultFormats
val my_map = parse(Source.fromFile("src/main/resources/map.json").mkString).extract[Map[String, Map[String, List[String]]]]
}
Now I want to use the my_map object inside a UDF, so I basically do this:
import package.consts.my_map
object myUDFs{
*bla-bla and use my_map*
}
I've already tested my function locally, and it works well.
Now I want to understand how to package a JAR file so that the .json file is included in it.
Thank you.
If you manage your project with Maven, you can place your .json file(s) under src/main/resources as it's the default place where Maven looks for your project's resources.
You can also define a custom path for your resources, as described here: https://maven.apache.org/plugins/maven-resources-plugin/examples/resource-directory.html
UPD: I managed to do so by building a fat JAR and reading my resource file this way:
val my_map = parse(
  Source
    .fromInputStream(
      getClass.getClassLoader.getResourceAsStream("map.json")
    )
    .mkString
).extract[Map[String, Map[String, List[String]]]]
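One thing to watch with the classpath approach: getResourceAsStream returns null when the resource is not on the classpath, which otherwise surfaces as a NullPointerException far from the cause. A minimal sketch of a guarded read, using only the standard library; readResource is a hypothetical helper name, and the JSON parsing step is left out on purpose:

```scala
import scala.io.Source

object ResourceReader {
  // Reads a classpath resource as a string; None when the resource is absent,
  // because getResourceAsStream returns null in that case.
  def readResource(name: String): Option[String] =
    Option(getClass.getClassLoader.getResourceAsStream(name)).map { in =>
      val source = Source.fromInputStream(in, "UTF-8")
      try source.mkString finally source.close()
    }

  def main(args: Array[String]): Unit =
    readResource("map.json") match {
      case Some(json) => println(s"read ${json.length} characters")
      case None       => println("map.json is not on the classpath")
    }
}
```

The returned string can then be handed to whatever JSON parser the project already uses.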

importing a library that contains a same "namespaces" as Scala Option

I'm going to use the Args4j Java library to parse command-line arguments, but importing it with import org.kohsuke.args4j.{CmdLineException, CmdLineParser, Option} shadows Scala's Option.
As I use Option in the same file, Scala's Option ends up being resolved as the Args4j Option.
I solved it by importing the Args4j library inside the object where I actually use it, but I was wondering whether there is a better way that lets me keep all imports grouped at the top.
Thanks!
You can "rename" a type during import to avoid clashes:
import org.kohsuke.args4j.{CmdLineException, CmdLineParser, Option => ArgOption}
Option[A]   // Scala's Option
ArgOption   // Args4j's Option
Another approach is to import the package instead of its members:
import org.kohsuke.args4j
Option[A]       // Scala's Option
args4j.Option   // Args4j's Option
Or keep the original import and use the fully qualified name for Scala's Option:
scala.Option[A] // Scala's Option
Option          // Args4j's Option

Scala - Check if a class is imported or not

In the Scala console, the command:
import testPackage._
gives the output below:
scala> import testPackage._
import testPackage._
But after importing, how can I check which classes were imported, or list the classes of testPackage in the console (just for verification)? Please help.
Within the REPL, I'm not sure there's a command for listing all imported classes. However, you can use tab completion. Just type:
scala> val tmp : testPackage.
and then hit TAB. You should get a list of the types available within that scope.
HTH

What does "=>" mean in import in scala?

I am new to Scala. I was looking through some code and came across an import of com.infinite.usermanagement.controllers.{ SecurityService => BaseSecurityService }. What does the => sign mean in an import?
This line means you import the class SecurityService and rename it to BaseSecurityService. You can use this to prevent name conflicts, etc. You can use this class by using BaseSecurityService instead of the original class name.
A very common example is the following (to prevent mixing up Scala and Java classes):
import java.util.{Map => JMap, List => JList}
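To make the rename concrete, here is a small sketch that uses Scala's List and java.util.List side by side. The object and method names (RenameImportDemo, toJava) are made up for illustration:

```scala
import java.util.{ArrayList => JArrayList, List => JList}

object RenameImportDemo {
  // Copies a Scala List into a java.util.List. Both List types are usable
  // side by side because the Java one is visible only under the alias JList.
  def toJava(xs: List[String]): JList[String] = {
    val out = new JArrayList[String]()
    xs.foreach(x => out.add(x))
    out
  }

  def main(args: Array[String]): Unit =
    println(toJava(List("a", "b")))
}
```

Inside this file, plain List still means Scala's List; the Java type is only reachable as JList.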
As others have mentioned, it's an import rename. There is, however, one further feature that proves astoundingly useful on occasion and that I would like to highlight: if you "rename" to _, the symbol is no longer imported.
This is useful in a few cases. The simplest is that you'd like to do a wildcard import from two packages, but there's a name that's defined in both and you're only interested in one of them:
import java.io.{ File=>_, _ }
import somelibrary._
Now when you reference File, it will unambiguously use the somelibrary.File without having to fully-qualify it.
In that case, you could have also renamed java.io.File to another name to get it out of the way, but sometimes you really do not want a name visible at all. This is the case for packages that contain implicits. If you do not want a particular implicit conversion (e.g. if you'd rather have a compile error) then you have to delete its name completely:
import somelibrary.{RichFile => _, _}
// Files now won't become surprise RichFiles
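The hiding form can be tried with standard library names only. In this sketch (HideImportDemo and toArrayList are illustrative names), java.util.List is hidden, so List keeps meaning Scala's List even though the rest of java.util is imported by wildcard:

```scala
object HideImportDemo {
  // Import everything from java.util except List, which is hidden by `=> _`.
  import java.util.{List => _, _}

  // `List` here is Scala's immutable List, since java.util.List is not in scope.
  val scalaList: List[Int] = List(1, 2, 3)

  // ArrayList still arrives through the trailing wildcard.
  def toArrayList(xs: List[Int]): ArrayList[Int] = {
    val out = new ArrayList[Int]()
    xs.foreach(x => out.add(x))
    out
  }

  def main(args: Array[String]): Unit =
    println(toArrayList(scalaList).size)
}
```

Without the `List => _` part, the wildcard would bring java.util.List into scope and List(1, 2, 3) would no longer compile.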

How to import several implicit at once?

I have several implicit contexts in my application, like
import scala.collection.JavaConversions._
import HadoopConversion._
etc.
Right now I have to copy and paste all those imports into each file. Is it possible to combine them in one file and make only one import?
A good technique that some libraries provide by default is bundling up implicits into a trait. This way you can compose sets of implicits by defining a trait that extends other implicit bundling traits. And then you can use it at the top of your scala file with the following.
import MyBundleOfImplicits._
Or be more selective by mixing it in only where you need it.
object Main extends App with MyBundleOfImplicits {
// ...
}
Unfortunately, with something like JavaConversions, using this method means redefining all the implicits you want inside a trait.
trait JavaConversionsImplicits {
import java.{lang => jl}
import java.{util => ju}
import scala.collection.JavaConversions
implicit def asJavaIterable[A](i : Iterable[A]): jl.Iterable[A] = JavaConversions.asJavaIterable(i)
implicit def asJavaIterator[A](i : Iterator[A]): ju.Iterator[A] = JavaConversions.asJavaIterator(i)
}
trait MyBundleOfImplicits extends JavaConversionsImplicits with OtherImplicits
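For the import MyBundleOfImplicits._ form to work, there also has to be an object to import from. A minimal, self-contained sketch of the whole pattern follows; StringImplicits, IntImplicits and their extension methods are hypothetical stand-ins, not taken from any real library:

```scala
// Hypothetical bundles; the names are illustrative, not from a real library.
trait StringImplicits {
  implicit class RichWord(s: String) {
    def shout: String = s.toUpperCase + "!"
  }
}

trait IntImplicits {
  implicit class RichCount(n: Int) {
    def doubled: Int = n * 2
  }
}

// One trait that composes the others, plus a companion object to import from.
trait MyBundleOfImplicits extends StringImplicits with IntImplicits
object MyBundleOfImplicits extends MyBundleOfImplicits

object BundleDemo {
  import MyBundleOfImplicits._ // a single import brings both extensions into scope

  def main(args: Array[String]): Unit = {
    println("hello".shout)
    println(21.doubled)
  }
}
```

Keeping both the trait and the object gives callers the choice between mixing the bundle in and importing it.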
Scala does not have first-class imports. So the answer to your question is no. But there is an exception for the scala REPL. You can put all your imports in a file and then just tell the REPL where it is located. See this question.
The other answers/comments are already comprehensive. But if you just want to reduce COPY/PASTEs, all mainstream IDE/text-editors support text templating ('live template' in IntelliJ IDEA, 'template' in Eclipse, 'snippets' in TextMate ...) that will definitely make your life easier.