How to read files from test resources with scalatest? [duplicate] - scala

I have a folder structure like below:
- main
-- java
-- resources
-- scalaresources
--- commandFiles
and in those folders I have the files that I have to read.
Here is the code:
def readData(runtype: String, snmphost: String, comstring: String, specificType: String): Unit = {
  val realOrInvFile = "/commandFiles/snmpcmds." + runtype.trim // these files are under the commandFiles folder, which I have to read
  try {
    if (specificType.equalsIgnoreCase("Cisco")) {
      val specificDeviceFile: String = "/commandFiles/snmpcmds." + runtype.trim + ".cisco"
      val realOrInvCmdsList = scala.io.Source.fromFile(realOrInvFile).getLines().toList.filterNot(line => line.startsWith("#")).map {
        //some code
      }
      val specificCmdsList = scala.io.Source.fromFile(specificDeviceFile).getLines().toList.filterNot(line => line.startsWith("#")).map {
        //some code
      }
    }
  } catch {
    case e: Exception => e.printStackTrace
  }
}

Resources in Scala work exactly as they do in Java.
It is best to follow the Java best practices and put all resources in src/main/resources and src/test/resources.
Example folder structure:
testing_styles/
├── build.sbt
├── src
│   └── main
│       ├── resources
│       │   └── readme.txt
Scala 2.12.x && 2.13.x reading a resource
To read resources the object Source provides the method fromResource.
import scala.io.Source
val readmeText : Iterator[String] = Source.fromResource("readme.txt").getLines
Reading resources prior to 2.12 (still my favourite due to jar compatibility)
To read resources you can use getClass.getResource and getClass.getResourceAsStream.
import java.io.InputStream

val stream: InputStream = getClass.getResourceAsStream("/readme.txt")
val lines: Iterator[String] = scala.io.Source.fromInputStream(stream).getLines
nicer error feedback (2.12.x && 2.13.x)
To avoid undebuggable Java NPEs, consider:
import scala.util.Try
import scala.io.Source
import java.io.FileNotFoundException
object Example {
  def readResourceWithNiceError(resourcePath: String): Try[Iterator[String]] =
    Try(Source.fromResource(resourcePath).getLines())
      .recover { case _: NullPointerException => throw new FileNotFoundException(resourcePath) }
}
good to know
Keep in mind that getResourceAsStream also works fine when the resources are part of a jar; getResource, which returns a URL that is often used to create a file, can lead to problems there.
in Production
In production code, I suggest making sure that the source is closed again.
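A minimal sketch of one way to do that (assuming Scala 2.13, where scala.util.Using is available; on 2.12 a try/finally achieves the same):
import scala.io.Source
import scala.util.{Try, Using}

// Using closes the source after the block runs, even if reading throws.
val readmeLines: Try[List[String]] = Using(Source.fromResource("readme.txt")) { source =>
  source.getLines().toList
}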

For Scala >= 2.12, use Source.fromResource:
scala.io.Source.fromResource("located_in_resouces.any")

One-liner solution for Scala >= 2.12
val source_html = Source.fromResource("file.html").mkString
Important, taken from the comments (thanks to @anentropic): with Source.fromResource you do not put the initial forward slash.

import scala.io.Source

object Demo {
  def main(args: Array[String]): Unit = {
    val ipfileStream = getClass.getResourceAsStream("/folder/a-words.txt")
    val readLines = Source.fromInputStream(ipfileStream).getLines
    readLines.foreach(line => println(line))
  }
}

The required file can be accessed from the resources folder as shown below. Note that this uses the relative path src/main/resources, so it only works when the application is run from the project directory, not from a packaged jar.
val file = scala.io.Source.fromFile("src/main/resources/app.config").getLines().mkString

For Scala 2.11, if getLines doesn't do exactly what you want, you can also copy a file out of the jar to the local file system.
Here's a snippet that reads a binary Google .p12-format API key from /resources, writes it to /tmp, and then uses the file path string as an input to a spark-google-spreadsheets write.
In the world of sbt-native-packager and sbt-assembly, copying to local is also useful for scalatest binary file tests. Just pop them out of resources to local, run the tests, and then delete.
import java.io.FileOutputStream
import java.nio.file.{Files, Paths}

def resourceToLocal(resourcePath: String): String = {
  val outPath = "/tmp/" + resourcePath
  if (!Files.exists(Paths.get(outPath))) {
    val resourceFileStream = getClass.getResourceAsStream(s"/${resourcePath}")
    val fos = new FileOutputStream(outPath)
    fos.write(
      Stream.continually(resourceFileStream.read).takeWhile(-1 !=).map(_.toByte).toArray
    )
    fos.close()
    resourceFileStream.close()
  }
  outPath
}
val filePathFromResourcesDirectory = "google-docs-key.p12"
val serviceAccountId = "[something]#drive-integration-[something].iam.gserviceaccount.com"
val googleSheetId = "1nC8Y3a8cvtXhhrpZCNAsP4MBHRm5Uee4xX-rCW3CW_4"
val tabName = "Favorite Cities"

import spark.implicits._

val df = Seq(
  ("Brooklyn", "New York"),
  ("New York City", "New York"),
  ("San Francisco", "California")
).toDF("City", "State")

df.write
  .format("com.github.potix2.spark.google.spreadsheets")
  .option("serviceAccountId", serviceAccountId)
  .option("credentialPath", resourceToLocal(filePathFromResourcesDirectory))
  .save(s"${googleSheetId}/${tabName}")

The "resources" folder must be under the source root. if using intellj check for the blue folder in the project folders on the left side. eg AppName/src/main/scala or Project/scala/../main/ etc.
If using val stream: InputStream = getClass.getResourceAsStream("/readme.txt") don't forget the "/" (forward slash), given readme.txt is the file inside resources
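A short illustration of why the slash matters (using the same file name as above):
import java.io.InputStream

// With a leading slash the lookup is absolute on the classpath,
// so "/readme.txt" resolves to src/main/resources/readme.txt after the build.
val absolute: InputStream = getClass.getResourceAsStream("/readme.txt")

// Without the slash, Class#getResourceAsStream resolves the name relative to
// this class's package, which is usually not what you want for a resource
// placed at the top of src/main/resources.
val relative: InputStream = getClass.getResourceAsStream("readme.txt")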

Related

Autofix directory structure based on package in scala

I have a file src/main/scala/foo.scala which needs to be inside package bar. Ideally the file should be inside src/main/scala/bar/foo.scala.
// src/main/scala/foo.scala
package bar
// ...
How can I auto-fix this issue throughout my project such that the folder structure matches the package structure?
Is there any SBT plugin etc that can help me fix this issue?
As far as I am aware there are no such tools, though AFAIR IntelliJ can warn about package-directory mismatches.
The best I can think of is a custom scalafix (https://scalacenter.github.io/scalafix/) rule - scalafix/scalameta would be used to check a file's actual package, translate it to an expected directory, and if they differ, move the file.
I suggest scalafix/scalameta because there are corner cases like:
you are allowed to write your packages like:
package a
package b
package c
and it is almost like package a.b.c, except that it automatically imports everything from a and b
you can have a package object in your file, and then if you have
package a.b
package object c
this file should be in the a/b/c directory
so I would prefer to check that a file doesn't fall under any of those cases using some existing tooling.
If you are certain that you don't have such cases (I wouldn't without checking) you could:
match the first line with regexp (^package (.*))
translate a.b.c into a/b/c (matched.split('.').map(_.trim).mkString(File.separator))
compare the generated location to the actual location (I suggest resolving absolute file locations)
move the file if necessary
If there is a possibility of a more complex case than that, I would replace the first step by querying scalafix/scalameta utilities. A sketch of the simple approach is shown below.
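A minimal, hypothetical sketch of those steps (assumptions: one simple package clause per file, no package objects, and the helper name fixLocation is made up for illustration):
import java.io.File
import java.nio.file.{Files, Paths, StandardCopyOption}
import scala.io.Source

// Derive the expected directory from the first `package` line and
// move the file there if it currently sits somewhere else.
def fixLocation(sourceRoot: File, scalaFile: File): Unit = {
  val packageLine = """^package\s+(.+)\s*$""".r
  val source = Source.fromFile(scalaFile)
  val firstPackage =
    try source.getLines().collectFirst { case packageLine(p) => p.trim }
    finally source.close()

  firstPackage.foreach { pkg =>
    val expectedDir = Paths.get(sourceRoot.getPath, pkg.split('.').map(_.trim): _*)
    val actualDir = scalaFile.getParentFile.toPath.toAbsolutePath
    if (actualDir != expectedDir.toAbsolutePath) {
      Files.createDirectories(expectedDir)
      Files.move(
        scalaFile.toPath,
        expectedDir.resolve(scalaFile.getName),
        StandardCopyOption.REPLACE_EXISTING
      )
    }
  }
}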
Here is an sbt plugin providing a packageStructureToDirectoryStructure task that reads package statements from source files, creates the corresponding directories, and then moves the files to them:
import sbt._
import sbt.Keys._
import better.files._

object PackagesToDirectories extends AutoPlugin {
  object autoImport {
    val packageStructureToDirectoryStructure =
      taskKey[Unit]("Make directory structure match package structure")
  }
  import autoImport._

  override def trigger = allRequirements

  override lazy val projectSettings = Seq(
    packageStructureToDirectoryStructure := {
      val log = streams.value.log
      log.info(s"Refactoring directory structure to match package structure...")

      val sourceFiles = (Compile / sources).value
      val sourceBase = (Compile / scalaSource).value

      def packageStructure(lines: Traversable[String]): String = {
        val packageObjectRegex = """package object\s(.+)\s\{""".r
        val packageNestingRegex = """package\s(.+)\s\{""".r
        val packageRegex = """package\s(.+)""".r

        lines
          .collect {
            case packageObjectRegex(name) => name
            case packageNestingRegex(name) => name
            case packageRegex(name) => name
          }
          .flatMap(_.split('.'))
          .mkString("/")
      }

      sourceFiles.foreach { sourceFile =>
        val packagePath = packageStructure(sourceFile.toScala.lines)
        val destination = file"$sourceBase/$packagePath"
        destination.createDirectoryIfNotExists(createParents = true)
        val result = sourceFile.toScala.moveToDirectory(destination)
        log.info(s"$sourceFile moved to $result")
      }
    }
  )
}
WARNING: Make sure to backup the project before running it.

list files recursively and filter by extension

I am trying to recursively go through a given directory and all its subdirectories, and list all the files that end with .json. Thanks to this answer, I've got the recursive bit working. Now I'm trying to figure out the filtering part.
Here's my current attempt:
import org.apache.hadoop.fs.{FileSystem, Path, PathFilter}
import org.apache.hadoop.fs.Path

val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)

def listJsonFiles(hdfsPath: String): List[String] = {
  fs
    .listStatus(new Path(hdfsPath)).filter(_.getPath.getName.endsWith(".json"))
    //.listStatus(new Path(hdfsPath))
    .flatMap { obj =>
      if (obj.isFile)
        List(hdfsPath + "/" + obj.getPath.getName)
      // If it's a dir and we're in a recursive option:
      else
        listJsonFiles(hdfsPath + "/" + obj.getPath.getName)
    }
    .toList
}

val files = listJsonFiles("/path/to/some/stuff")
That returns an empty list. If I use the listStatus line without the filter, it returns a list of all the files in the path I pass in, including all the *.json files. So I know they exist; I just can't get the filter working.
The problem is that the filter is applied first, which filters out the subdirectories. Move the filtering down into the flatMap.
import org.apache.hadoop.fs.{FileSystem, Path}

val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)

def listJsonFiles(hdfsPath: String): List[String] = {
  fs
    .listStatus(new Path(hdfsPath))
    .flatMap { obj =>
      if (obj.isDirectory) listJsonFiles(obj.getPath.toString)
      else if (obj.getPath.getName.endsWith(".json")) List(obj.getPath.toString)
      else Nil
    }
    .toList
}

val files = listJsonFiles("/var/tmp/stuff")
files.foreach(println)
There is a native method on FileSystem to recursively scan an HDFS directory.
FileSystem.listFiles takes an argument to do a recursive search and returns a RemoteIterator, which we can use to filter the desired files. Because the listing is streamed rather than materialized, this also guards the running JVM from OOM.
import org.apache.hadoop.fs.{FileSystem, Path, RemoteIterator}

val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)

// implicit for converting a Hadoop RemoteIterator to a Scala Iterator
implicit def remoteIteratorToIterator[A](ri: RemoteIterator[A]): Iterator[A] = new Iterator[A] {
  override def hasNext: Boolean = ri.hasNext
  override def next(): A = ri.next()
}

def listJsonFiles(hdfsPath: String): List[String] = {
  fs
    .listFiles(new Path(hdfsPath), true)
    .map(_.getPath.toString)
    .filter(_.endsWith(".json"))
    .toList
}

val files = listJsonFiles("/var/tmp/stuff")
files.foreach(println)
Input directory
$ tree stuff/
stuff/
├── dir1
│   ├── dir1.json
│   └── dir2
│       └── dir2.json
└── stuff.json
2 directories, 3 files
Output
file:/var/tmp/stuff/stuff.json
file:/var/tmp/stuff/dir1/dir2/dir2.json
file:/var/tmp/stuff/dir1/dir1.json

Scala Eclipse project - unable to read text file from resources directory

I'm trying to read a text file located in the resources directory using Scala version 2.12.3.
However, I'm getting a file-not-found error.
(screenshot of the Eclipse project layout omitted)
My Scala code:
package main.scala

import scala.io.Source
import scala.io.Codec

object Application {
  def main(args: Array[String]) {
    try {
      val source = Source.fromFile("sample.txt")(Codec.UTF8)
      for (line <- source.getLines) {
        println(line.toUpperCase)
      }
      source.close
    } catch {
      case e: Throwable => e.printStackTrace()
    }
  }
}
I also tried using
val source = Source.fromFile("sample.txt")(Codec.UTF8)
but got the same error.
If you want to read a file from the src/main/resources directory, you should use the Source.fromResource method, so try this:
Source.fromResource("sample.txt")(Codec.UTF8)
Update
In your case you have to use either Source.fromFile("src/main/resources/sample.txt"), or
Source.fromFile("sample.txt") if you put your file in the project root directory.

Not able to read Configuration file Using scala typesafe API

I have a Spark/Scala project named Omega.
I have a conf file at Omega/conf/omega.config.
I use the Typesafe Config API to load the config file from conf/omega.config.
It was working fine and I was able to read the respective value for each key.
Now today, for the first time, I added some more key-value pairs to my omega.config file and tried to retrieve them from my Scala code. It throws
Exception in thread "main" com.typesafe.config.ConfigException$Missing: No configuration setting found for key 'job_name'
This issue started happening after adding a new value for the key job_name in my omega.config file.
Also, I am not able to read the newly added key-value pairs; I am still able to read all the old values using the config.getString method.
I am building my Spark/Scala application using Maven.
Omega.config
input_path="/user/cloudera/data
user_name="surender"
job_name="SAMPLE"
It is only the recently added key "job_name" that I am not able to access.
package com.pack1

import com.pack2.ApplicationUtil

object OmegaMain {
  val config_loc = "conf/omega.config"

  def main(args: Array[String]): Unit = {
    val config = ApplicationUtil.loadConfig(config_loc)
    val jobName = ApplicationUtil.getFromConfig(config, "job_name")
  }
}

package com.pack2

import java.io.File
import com.typesafe.config.{Config, ConfigFactory}

object ApplicationUtil {
  def loadConfig(filePath: String): Config = {
    val config = ConfigFactory.parseFile(new File(filePath))
    config
  }

  def getFromConfig(config: Config, jobName: String): String = {
    config.getString(jobName)
  }
}
Could some one help me what went wrong?
You can try something like:
import java.io.{File, FileOutputStream, InputStream}
import com.typesafe.config.{Config, ConfigFactory, ConfigParseOptions, ConfigSyntax}

def loadConfig(filename: String, syntax: ConfigSyntax): Config = {
  val in: InputStream = getClass.getResourceAsStream(filename)
  if (in == null) return null
  val file: File = File.createTempFile(String.valueOf(in.hashCode()), ".conf")
  file.deleteOnExit()
  val out: FileOutputStream = new FileOutputStream(file)
  val buffer: Array[Byte] = new Array(1024)
  var bytesRead: Int = in.read(buffer)
  while (bytesRead != -1) {
    out.write(buffer, 0, bytesRead)
    bytesRead = in.read(buffer)
  }
  out.close()
  val conf: Config = ConfigFactory.parseFile(
    file,
    ConfigParseOptions.defaults()
      .setSyntax(syntax)
      .setAllowMissing(false)
      .setOriginDescription("Merged with " + filename)
  )
  conf
}
Here filename is some file path on the classpath. If you want to update this method to take an external file into account, replace the temp-file line with val file: File = new File("absolute path of the file").
I am guessing the file isn't on the classpath after you build with Maven.
Since you are using Maven to build a jar, you need omega.config to be on the classpath. This means you either have to put it into src/main/resources (picked up by default) or explicitly tell Maven to add the conf directory as a resources directory.
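A minimal sketch of that idea (my assumption, not part of the original answers): try the external conf/ file first, and fall back to the classpath copy when running from a packaged jar.
import java.io.File
import com.typesafe.config.{Config, ConfigFactory}

def loadOmegaConfig(): Config = {
  val external = new File("conf/omega.config")
  if (external.exists()) ConfigFactory.parseFile(external) // running from the project directory
  else ConfigFactory.parseResources("omega.config")        // packaged into the jar via src/main/resources
}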

Writing /public directory in PlayFramework after deployment to production

In Play Framework 2.2.x, after using play dist I am having issues reading and writing to the /public directory. Is that a known problem? Is the only solution to read/write to another directory with an absolute path?
This is my sample code:
val imageDirectory = "images/twitpics/"
val localPrefix = "/public/"
val publicPrefix = "/assets/"
val files = Play.getFile(localPrefix + imageDirectory)
  .listFiles.filter(_.getName.takeRight(3) == "jpg")
val randomIndex = _rand.nextInt(files.length)
val imageFile = files(randomIndex)
Also
private val _jsonConfigFile = "/public/data/data.json"

def writeJsonToFile(content: String) = {
  import java.io._
  val pw = new PrintWriter(Play.getFile(_jsonConfigFile))
  pw.write(content)
  pw.close
}
After dist the public/ directory is packaged into the application jar, which is put on the classpath, so you cannot access it through the filesystem or write to it.
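A minimal sketch of the usual workaround (my assumption, not part of the original answer): write to a directory outside the packaged jar, for example one supplied via a system property or configuration. The property name "app.dataDir" and the default path here are illustrative only.
import java.io.{File, PrintWriter}

// Hypothetical external data directory, configurable at deploy time.
val dataDir = sys.props.getOrElse("app.dataDir", "/var/myapp/data")

def writeJsonToFile(content: String): Unit = {
  val target = new File(dataDir, "data.json")
  target.getParentFile.mkdirs()
  val pw = new PrintWriter(target)
  try pw.write(content) finally pw.close()
}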