Issue with try-finally in Scala - scala

I have the following Scala code:
val file = new FileReader("myfile.txt")
try {
  // do operations on file
} finally {
  file.close() // close the file
}
How do I handle the FileNotFoundException that is thrown when the file is opened? If I move that line inside the try block, I can no longer access the file variable inside finally.

For Scala 2.13:
You can use Using to acquire a resource and release it automatically, without explicit error handling, as long as the resource is an AutoCloseable:
import java.io.FileReader
import scala.util.{Try, Using}
val newStyle: Try[String] = Using(new FileReader("myfile.txt")) {
  reader: FileReader =>
    // do something with reader
    "something"
}
newStyle
// will be
// Failure(java.io.FileNotFoundException: myfile.txt (No such file or directory))
// if the file is not found, or Success with some value otherwise; it will not throw
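To act on the result, you can pattern match on the returned Try. Below is a minimal sketch, assuming Scala 2.13; the names FirstLine, firstLine and describe are made up for illustration and are not part of any library.

```scala
import java.io.{BufferedReader, FileNotFoundException, FileReader}
import scala.util.{Failure, Success, Try, Using}

object FirstLine {
  // Opens the file, reads its first line, and always closes the reader.
  // A missing file shows up as Failure(FileNotFoundException) instead of a thrown exception.
  def firstLine(path: String): Try[String] =
    Using(new BufferedReader(new FileReader(path))) { reader =>
      Option(reader.readLine()).getOrElse("") // readLine returns null for an empty file
    }

  // Turns the Try into a human-readable message.
  def describe(path: String): String =
    firstLine(path) match {
      case Success(line)                     => s"first line: $line"
      case Failure(_: FileNotFoundException) => s"file not found: $path"
      case Failure(other)                    => s"read failed: ${other.getMessage}"
    }
}
```

Calling FirstLine.describe("myfile.txt") then gives you a message for each case without any try-catch at the call site.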
For Scala 2.12:
You can wrap the reader creation in scala.util.Try; if the creation fails, you get a Failure with the FileNotFoundException inside.
import java.io.FileReader
import scala.util.Try
val oldStyle: Try[String] = Try {
  val file = new FileReader("myfile.txt")
  try {
    // do operations on file
    "something"
  } finally {
    file.close() // close the file
  }
}
oldStyle
// will be
// Failure(java.io.FileNotFoundException: myfile.txt (No such file or directory))
// or Success with the result of reading the file inside
I recommend not using try ... catch blocks in Scala code. They are not type-safe in some cases and can lead to non-obvious results. For releasing a resource in older Scala versions, however, try-finally is the only built-in way to do it.
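If you need this in several places on Scala 2.12, the Try plus try-finally combination above can be factored into a small loan-pattern helper. This is only a sketch under that assumption; Loan, withResource and firstChar are hypothetical names, not a standard API.

```scala
import java.io.FileReader
import scala.util.Try

object Loan {
  // Opens the resource, passes it to `use`, and closes it afterwards.
  // Exceptions from opening, using, or closing all end up in the Failure.
  def withResource[R <: AutoCloseable, A](open: => R)(use: R => A): Try[A] =
    Try {
      val resource = open
      try use(resource)
      finally resource.close()
    }

  // Example: read the first character code of a file,
  // or get a Failure with FileNotFoundException if the file is missing.
  def firstChar(path: String): Try[Int] =
    withResource(new FileReader(path))(_.read())
}
```

Because `open` is a by-name parameter, the resource is created inside the Try, so a failure during creation is captured as well.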

Related

Write/Read/Delete binary data in Spark Databricks (scala)

I'm quite new to Spark on Databricks (Scala) and I would like to know how to write the content of a variable of type Array[Byte] to a temporary file data.bin in mounted storage mtn/somewhere/tmp/ (Azure Data Lake) or in file:/tmp/. I would then like to know how to read it back as an InputStream and delete it when I'm done with it.
None of the methods I've read about so far work, or they do not apply to binary data.
Thank you.
It turns out this code works fine:
import java.io._
import org.apache.commons.io.FileUtils
// Create or collect the data
val bytes: Array[Byte] = <some_data>
try {
  // Write data to temp file
  // Note: Here I use a GRIB2 file as I manipulate forecast data,
  // but you can use .bin or .png/.jpg (if it's image data)
  // extensions or no extension at all. It doesn't matter.
  val path: String = "mf-data.grib"
  val file: File = new File(path)
  FileUtils.writeByteArrayToFile(file, bytes)
  // Read the temp file
  val input = new FileInputStream(path)
  ////////// Do something with it //////////
  input.close() // release the file handle before deleting
  // Remove the temp file
  if (!file.delete()) {
    println("Cannot delete temporary file!")
  }
} catch {
  case _: Throwable => println("An I/O error occurred")
}
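The same write, read, delete cycle can also be sketched with only the JDK's java.nio.file API, with the cleanup in finally so the temporary file is removed even when the processing step throws. This assumes a local file system path (such as the driver's temp directory); TempBytes and roundTrip are hypothetical names.

```scala
import java.io.InputStream
import java.nio.file.{Files, Path}

object TempBytes {
  // Writes `bytes` to a temporary file, hands an InputStream over it to `use`,
  // then closes the stream and deletes the file, even if `use` throws.
  def roundTrip[A](bytes: Array[Byte])(use: InputStream => A): A = {
    val tmp: Path = Files.createTempFile("data", ".bin")
    try {
      Files.write(tmp, bytes)
      val in = Files.newInputStream(tmp)
      try use(in)
      finally in.close()
    } finally {
      Files.deleteIfExists(tmp)
    }
  }
}
```

For example, TempBytes.roundTrip(bytes)(in => in.read()) returns the first byte value read back from the temporary file.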

Scala Eclipse project - unable to read text file from resources directory

I'm trying to read a text file located in the resources directory using Scala 2.12.3.
However, I'm getting a file-not-found error.
My Scala code:
package main.scala
import scala.io.Source
import scala.io.Codec
object Application {
  def main(args: Array[String]): Unit = {
    try {
      val source = Source.fromFile("sample.txt")(Codec.UTF8)
      for (line <- source.getLines()) {
        println(line.toUpperCase)
      }
      source.close()
    } catch {
      case e: Throwable => e.printStackTrace()
    }
  }
}
I also tried using
val source = Source.fromFile("sample.txt")(Codec.UTF8)
but got the same error.
If you want to read a file from the src/main/resources directory, you should use the Source.fromResource method, so try this:
Source.fromResource("sample.txt")(Codec.UTF8)
Update
In your case you have to use either Source.fromFile("src/main/resources/sample.txt"), or Source.fromFile("sample.txt") if you put the file in the project root directory.
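The difference is where each call looks: a relative path passed to Source.fromFile is resolved against the current working directory, while a resource is looked up on the classpath. A rough sketch that tries the classpath first and falls back to the file system could look like this; ResourceOrFile and readLines are hypothetical names, and it assumes your build copies src/main/resources onto the classpath.

```scala
import scala.io.{Codec, Source}

object ResourceOrFile {
  // Looks for `name` on the classpath first (where src/main/resources usually ends up after a build),
  // then falls back to a path relative to the current working directory.
  def readLines(name: String): List[String] = {
    val stream = Option(getClass.getClassLoader.getResourceAsStream(name))
    val source = stream match {
      case Some(in) => Source.fromInputStream(in)(Codec.UTF8)
      case None     => Source.fromFile(name)(Codec.UTF8)
    }
    try source.getLines().toList
    finally source.close()
  }
}
```

If the fallback still fails, printing new java.io.File(".").getAbsolutePath shows the directory that relative paths are resolved against.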

Writing to a local file on hdfs in Spark from a non-Spark datastructure

I have the following code:
import java.io.PrintWriter
import scala.collection.mutable.ListBuffer
def writeCSV(indexing: ListBuffer[Array[Int]], outputpath: String): Unit = {
  new PrintWriter(outputpath + "out.csv") {
    write("col1,col2,col3\n")
    for (entry <- indexing) {
      // join by position; comparing each value against entry(2) drops commas when values repeat
      write(entry.mkString(","))
      write("\n")
    }
    close()
  }
}
This does not work: Spark complains that the output path cannot be found. How can I write a regular data structure (ListBuffer[Array[Int]]) to just a regular file from my Spark program? Do I need to map the ListBuffer to some Spark data structure?
I understand this is not what you would normally do, but it is for debugging only and will not be used in production code.
I am new to Spark and I am using Spark 1.6.0.
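If you want to write a file on HDFS, create an OutputStream through FileSystem in the org.apache.hadoop.fs package and pass it to the PrintWriter constructor.
Example code
import java.io.PrintWriter
import org.apache.hadoop.fs.{FileSystem, Path}
// On Spark 1.6 there is no SparkSession; use sc.hadoopConfiguration instead
val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
new PrintWriter(fs.create(new Path(""))) { // put the target HDFS path here
  write(...)
  close() // flush and close the stream
}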
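Since fs.create returns an output stream, the CSV writing itself can be kept independent of Hadoop and tested locally. Here is a small sketch of that idea; CsvOut and writeCsv are hypothetical names, and the header line is the one from the question.

```scala
import java.io.{OutputStream, OutputStreamWriter, PrintWriter}
import java.nio.charset.StandardCharsets

object CsvOut {
  // Writes a header plus one comma-separated line per row to any OutputStream,
  // then closes the writer.
  def writeCsv(rows: Iterable[Array[Int]], out: OutputStream): Unit = {
    val writer = new PrintWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8))
    try {
      writer.write("col1,col2,col3\n")
      rows.foreach(row => writer.write(row.mkString(",") + "\n"))
    } finally {
      writer.close() // also flushes and closes the underlying stream
    }
  }
}
```

In the Spark program you would pass the stream returned by fs.create for your target path as the second argument; for a quick local check a java.io.ByteArrayOutputStream works as well.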

How to download and save a file from the internet using Scala?

Basically I have a url/link to a text file online and I am trying to download it locally. For some reason, the text file that gets created/downloaded is blank. Open to any suggestions. Thanks!
def downloadFile(token: String, fileToDownload: String) {
val url = new URL("http://randomwebsite.com/docs?t=" + token + "&p=tsr%2F" + fileToDownload)
val connection = url.openConnection().asInstanceOf[HttpURLConnection]
connection.setRequestMethod("GET")
val in: InputStream = connection.getInputStream
val fileToDownloadAs = new java.io.File("src/test/resources/testingUpload1.txt")
val out: OutputStream = new BufferedOutputStream(new FileOutputStream(fileToDownloadAs))
val byteArray = Stream.continually(in.read).takeWhile(-1 !=).map(_.toByte).toArray
out.write(byteArray)
}
I know this is an old question, but I just came across a really nice way of doing this:
import sys.process._
import java.net.URL
import java.io.File
def fileDownloader(url: String, filename: String) = {
new URL(url) #> new File(filename) !!
}
Hope this helps.
You can now simply use the fileDownloader function to download files:
fileDownloader("http://ir.dcs.gla.ac.uk/resources/linguistic_utils/stop_words", "stop-words-en.txt")
Here is a naive implementation using scala.io.Source.fromURL and java.io.FileWriter:
def downloadFile(token: String, fileToDownload: String): Unit = {
  try {
    val src = scala.io.Source.fromURL("http://randomwebsite.com/docs?t=" + token + "&p=tsr%2F" + fileToDownload)
    val out = new java.io.FileWriter("src/test/resources/testingUpload1.txt")
    try out.write(src.mkString)
    finally { out.close(); src.close() } // close flushes the writer
  } catch {
    case e: java.io.IOException => println("error occurred: " + e.getMessage)
  }
}
Your code works for me... There may be other causes of the empty file.
Here is a safer alternative to new URL(url) #> new File(filename) !!:
val url = new URL(urlOfFileToDownload)
val connection = url.openConnection().asInstanceOf[HttpURLConnection]
connection.setConnectTimeout(5000)
connection.setReadTimeout(5000)
connection.connect()
if (connection.getResponseCode >= 400)
println("error")
else
url #> new File(fileName) !!
Two things:
When downloading from a URL object, if an error (404 for instance) is returned, then the URL object will throw a FileNotFoundException. And since this exception is generated from another thread (as URL happens to run on a separate thread), a simple Try or try/catch won't be able to catch the exception. Thus the preliminary check for the response code: if (connection.getResponseCode >= 400).
As a consequence of checking the response code, the connection might sometimes get stuck open indefinitely for improper pages (as explained here). This can be avoided by setting a timeout on the connection: connection.setReadTimeout(5000).
Flush the buffer and then close your output stream.
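A sketch of what that could look like, assuming Scala 2.13 so that Using.Manager is available; StreamCopy and copyToFile are hypothetical names. Closing the BufferedOutputStream flushes it, which is the step missing in the question's code.

```scala
import java.io.{BufferedOutputStream, FileOutputStream, InputStream}
import scala.util.{Try, Using}

object StreamCopy {
  // Copies everything from `in` to the file at `target` and returns the number of bytes written.
  // Using.Manager closes (and therefore flushes) both streams before returning.
  def copyToFile(in: InputStream, target: String): Try[Long] =
    Using.Manager { use =>
      val source = use(in)
      val out = use(new BufferedOutputStream(new FileOutputStream(target)))
      val buffer = new Array[Byte](8192)
      var total = 0L
      var n = source.read(buffer)
      while (n != -1) {
        out.write(buffer, 0, n)
        total += n
        n = source.read(buffer)
      }
      total
    }
}
```

In the question's function this would be called as StreamCopy.copyToFile(connection.getInputStream, "src/test/resources/testingUpload1.txt").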

Downloading Image file using scala

I am trying to download an image file for a LaTeX formula. The following is the code I am using:
var out: OutputStream = null;
var in: InputStream = null;
try {
val url = new URL("http://latex.codecogs.com/png.download?$$I=\frac{dQ}{dt}$$")
val connection = url.openConnection().asInstanceOf[HttpURLConnection]
connection.setRequestMethod("GET")
in = connection.getInputStream
val localfile = "sample2.png"
out = new BufferedOutputStream(new FileOutputStream(localfile))
val byteArray = Stream.continually(in.read).takeWhile(-1 !=).map(_.toByte).toArray
out.write(byteArray)
} catch {
case e: Exception => println(e.printStackTrace())
} finally {
out.close
in.close
}
I am able to download the file, but the image is incomplete: the expected size is around 517 bytes, but only 275 bytes are downloaded. What might be going wrong? I attached the incomplete and complete images. Please help me. I have used the same code to download files larger than 1 MB and it worked properly.
You're passing a bad string: the "\f" is interpreted as an escape sequence and gives you a single "form feed" character.
Better:
val url = new URL("http://latex.codecogs.com/png.download?$$I=\\frac{dQ}{dt}$$")
or
val url = new URL("""http://latex.codecogs.com/png.download?$$I=\frac{dQ}{dt}$$""")
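You can see the difference by comparing the three spellings directly. This is just an illustrative snippet using the formula part of the URL; EscapeDemo is a made-up name.

```scala
object EscapeDemo {
  val escaped = "I=\frac{dQ}{dt}"     // "\f" becomes a single form feed character
  val doubled = "I=\\frac{dQ}{dt}"    // backslash written as an escape
  val raw     = """I=\frac{dQ}{dt}""" // triple quotes: the backslash is kept as-is

  def main(args: Array[String]): Unit = {
    println(escaped.length)   // one character shorter than the other two
    println(doubled == raw)   // both contain a literal backslash followed by "frac"
  }
}
```

The first string silently loses the backslash and the "f", so the server receives a different formula than the one you intended.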
An alternative is to use sys.process, which is much cleaner:
import sys.process._
import java.net.URL
import java.io.File
new URL("""http://latex.codecogs.com/png.download?$$I=\frac{dQ}{dt}$$""") #> new File("sample2.png") !!
An example using the standard Java API, with resource release handled by Using:
import java.nio.file.Files
import java.nio.file.Paths
import java.net.URL
import scala.util.Using
@main def main() =
  val url = URL("http://webcode.me/favicon.ico")
  Using(url.openStream) { in =>
    Files.copy(in, Paths.get("favicon.ico"))
  }