Catching MalformedInputException during scala.io.Source.fromFile - scala

I'm using the scala.io.Source.fromFile method to read a CSV file. Sometimes the file will be in a different encoding. I'll allow the user to specify the file encoding, but if the user doesn't specify the proper encoding I'd like to catch the MalformedInputException and have my method return None (instead of Some[Iterator[String]]).
I'm using the onCodingException method of the Codec, but it doesn't seem to be applied. See my code below:
def readFileAsIterator(fileName: String,
encoding: Option[String] = Some(defaultEncoding)): Option[Iterator[String]] = {
try {
val codecType = encoding.getOrElse(defaultEncoding)
implicit val codec = Codec(codecType)
codec.onCodingException {
case e: CharacterCodingException => {
throw (new MalformedInputException(2))
}
}
val fileLines = io.Source.fromFile(fileName)(codec).getLines()
Some(fileLines)
} catch {
case e: Exception => {
None
}
}
}
Has anyone played around with this method and managed to make it work?

This
io.Source.fromFile(fileName)(codec).getLines()
returns an Iterator[String], which is lazy. So the exception happens while iterating, not when the iterator is created.
I think that in the general case it is not possible to detect a wrong encoding without decoding the data first. So you either need to parse the file first to check that the encoding is right and then return a newly created iterator (not the one used for parsing!), or leave the exception handling to the calling code, which parses the data.
Or, as a kind of trade-off, read the first several lines and, if there are no coding exceptions, create a new iterator for the caller. Be aware that in some cases the caller will still get an exception on a later, wrongly encoded line.
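The "parse first" option can be sketched as follows. This is only a minimal illustration, not the asker's method: readLinesStrict and the EagerRead wrapper are hypothetical names, and it assumes the file is small enough to hold in memory, because it trades the lazy iterator for a fully decoded List.

```scala
import java.nio.charset.CodingErrorAction
import scala.io.{Codec, Source}
import scala.util.Try

object EagerRead {
  // Hypothetical helper: reads and decodes the whole file eagerly, so a wrong
  // encoding is detected here rather than later in the caller.
  def readLinesStrict(fileName: String, encoding: String = "UTF-8"): Option[List[String]] =
    Try {
      // REPORT makes the decoder throw on bad input instead of replacing it
      val codec = Codec(encoding)
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT)
      val source = Source.fromFile(fileName)(codec)
      try source.getLines().toList // toList forces the lazy iterator
      finally source.close()
    }.toOption
}
```

Because the decoding happens inside the Try, an unknown charset name, a missing file, or a malformed byte sequence all end up as None.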
Update
Response to your comment to me under another answer.
Check this:
def readFileAsIterator(fileName: String,
                       encoding: Option[String] = Some("IBM1098"),
                       touchIterator: Boolean = false): Option[Iterator[String]] = {
  try {
    val codecType = encoding.getOrElse("IBM1098")
    implicit val codec = Codec(codecType)
    codec.onCodingException {
      // The more specific subclass must come first, otherwise its case is unreachable
      case e: java.nio.charset.UnmappableCharacterException =>
        throw new MalformedInputException(3)
      case e: CharacterCodingException =>
        throw new MalformedInputException(2)
    }
    if (!touchIterator) {
      Some(scala.io.Source.fromFile(fileName)(codec).getLines())
    } else {
      val i = scala.io.Source.fromFile(fileName)(codec).getLines()
      // hasNext forces the first part of the file to be read and decoded
      if (i.hasNext) {
        Some(i)
      } else {
        None
      }
    }
  } catch {
    case e: Exception =>
      log.info("Handled exception in func", e)
      None
  }
}
Call it twice on a file that causes an exception (in my case it was UnmappableCharacterException): once with touchIterator = true and once without, and compare the results.
Under the hood you have an iterator, as I said. It is a lazy buffered iterator, so it is initialized on the first call (in the modified method I force the initialization with hasNext).
I do not think it reads the whole file; it only buffers part of it (so it is an automated implementation of my "trade-off" case).

There are two things you should consider changing here:
1 - Return a Try[Iterator[String]] instead of an Option[Iterator[String]].
2 - Make encoding a plain String with a default value.
def readFileAsIterator(fileName: String, encoding: String = "UTF-8"): Try[Iterator[String]] = Try({
implicit val codec = Codec(encoding)
codec.onCodingException({
case e: CharacterCodingException =>
throw (new MalformedInputException(2))
})
io.Source.fromFile(fileName)(codec).getLines()
})

I had the same error. I handled it using onMalformedInput() as shown below:
implicit val codec = Codec("UTF-8")
codec.onMalformedInput(CodingErrorAction.REPLACE)
codec.onUnmappableCharacter(CodingErrorAction.REPLACE)
for(line <- Source.fromFile("..").getLines()) {
...
}

Related

How to "override" an exception in scala?

So I have a method which already has a try block that throws ExceptionA. Now I need to put another try block around the place where this method is called, and it needs to throw an exception with some added details. Something like this:
method inner():
try{
//some logic
} catch {
throw new ExceptionA("exceptionA occurred")
}
method outer():
identifier = fromSomeDBCallPrivateToOuter()
try{
inner()
} catch {
// now either
// throw new Exception("Error with identifier" + identifier)
// or
// append identifier to thrown error from inner()
}
Can someone please provide any insight or suggestion on how to do this in Scala? Thanks in advance!
What you have in your snippet would work as written (if you correct the syntax), with one caveat: exceptions are immutable (and even if they weren't, it's still not a good idea to mutate them). So instead of "appending" to the exception, you need to create a new one and set the original as its cause.
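A minimal sketch of that idea with plain try/catch is shown here. ExceptionA is re-declared only so the example is self-contained, OuterException and the WrapWithCause wrapper are hypothetical names, and the identifier is passed in as a parameter rather than fetched from a database.

```scala
object WrapWithCause {
  // Hypothetical exception types standing in for the ones in the question
  class ExceptionA(message: String) extends Exception(message)
  class OuterException(message: String, cause: Throwable) extends Exception(message, cause)

  def inner(): Int = throw new ExceptionA("exceptionA occurred")

  def outer(identifier: String): Int =
    try inner()
    catch {
      // Build a new exception; the original stays reachable through getCause
      case e: ExceptionA =>
        throw new OuterException(s"Error with identifier $identifier: ${e.getMessage}", e)
    }
}
```

Passing the original exception as the cause keeps its stack trace available to whoever logs or inspects the outer one.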
It is more idiomatic in Scala, though, to use the Try monad instead of "procedural" try/catch blocks. Something like this:
case class ExceptionB(id: String, original: ExceptionA)
  extends Exception(s"Badness happened with id $id", original)
def outer(): Try[ReturnType] = {
  val id = getId()
  Try {
    inner()
  } recover {
    // An exception thrown inside recover is caught and becomes the resulting Failure
    case e: ExceptionA if iWannaNewException => throw new Exception(s"Id: $id")
    case e: ExceptionA => throw ExceptionB(id, e)
  }
}
You can also use Either. It returns Right(value) if the function completes without an error, or Left(message) containing information about the error. You can adapt your code like this:
def inner(): Either[String, Int] = {
  if (checkSomeStuff()) Left("Cannot assign identifier")
  else Right(doSomeStuff())
}
def outer(): Either[String, Int] = {
  inner() match {
    case Left(error) =>
      println("There is an error: " + error)
      // you can also throw new Exception(s"Some info about $error") here
      Left(error) // pass the error on so both branches return an Either
    case Right(identifier) =>
      println("Identifier : " + identifier)
      Right(doSomeStuffWithId(identifier)) // do some stuff with the id
  }
}
If you want to use Exceptions you need to choose who will handle the error case (in the inner or in the outer function).

Scala: Expression of type None.type doesn't conform to expected type Document [duplicate]

I am new to Scala. I have the code snippet below, which builds a document using documentBuilder. My input is XML. Whenever I input malformed XML, the code below fails to parse it and raises a SAXException.
def parse_xml(xmlString: String)(implicit invocationTs: Date) : Either [None, Document] = {
try {
println(s"Parse xmlString invoked")
val document = documentBuilder(false).parse(new InputSource(new StringReader(xmlString)))
document.getDocumentElement.normalize()
//Right(document)
document
} catch {
case e: Exception => None
}
}
The SAXException is raised by the built-in implementation of the parse function. Please see the code below, where SAXException is declared:
public abstract Document parse(InputSource is)
throws SAXException, IOException;
Now I am trying to bypass this SAXException, as I don't want my job to fail just because of one malformed XML. So I have added a try/catch block that handles the exception like this:
case e: Exception => None
But it shows the error "Expression of type None.type doesn't conform to expected type Document" here, as my return type is Document.
How can I get rid of this issue?
If you want to use wrappers like Either or Option, you always have to wrap the returned value.
If you want to pass the exception on, Try might be a better choice than Either:
def parse_xml(xmlString: String)(implicit invocationTs: Date) : Try[Document] = {
try {
println(s"Parse xmlString invoked")
val document = documentBuilder(false).parse(new InputSource(new StringReader(xmlString)))
document.getDocumentElement.normalize()
Success(document)
} catch {
case e: Exception => Failure(e)
}
}
You could even simplify it by wrapping block inside Try.apply:
Try{
println(s"Parse xmlString invoked")
val document = documentBuilder(false).parse(new InputSource(new StringReader(xmlString)))
document.getDocumentElement.normalize()
document
}
If you don't care about exception, just about result, use Option:
def parse_xml(xmlString: String)(implicit invocationTs: Date) : Option[Document] = {
try {
println(s"Parse xmlString invoked")
val document = documentBuilder(false).parse(new InputSource(new StringReader(xmlString)))
document.getDocumentElement.normalize()
Some(document)
} catch {
case e: Exception => None
}
}
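If the Either from the question should be kept, one possible sketch is to wrap the block in Try and convert it. This assumes Scala 2.12 or later (for Try.toEither); parseXml and the ParseXmlEither wrapper are hypothetical names, and a default DocumentBuilder stands in for the question's documentBuilder(false), whose definition is not shown.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.w3c.dom.Document
import org.xml.sax.InputSource
import scala.util.Try

object ParseXmlEither {
  // Assumption: a default DocumentBuilder stands in for documentBuilder(false)
  def parseXml(xmlString: String): Either[Throwable, Document] =
    Try {
      val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
      val document = builder.parse(new InputSource(new StringReader(xmlString)))
      document.getDocumentElement.normalize()
      document
    }.toEither // Success becomes Right(document), Failure becomes Left(exception)
}
```

Here the Left side carries the exception itself, so the caller can still log or inspect it instead of silently skipping the malformed input.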

Scala Error - Expression of type Unit doesn't conform to expected type File

I have the following code:
var tempLastFileCreated: File = try {
files(0)
} catch {
case e: ArrayIndexOutOfBoundsException => ???
}
where files is val files: Array[File] = dir.listFiles()
Now whatever I give in case e I get the message Expression of type Unit doesn't conform to expected type File
I understand that the right hand part of the => has to be something which is of type File.
Can anyone tell me what to put there?
You are promising that tempLastFileCreated is a File, so it cannot also be a Unit or a String, etc. You have a couple of options. You could use an Option[File] instead:
val tempLastFileCreated: Option[File] = try {
Some(files(0))
}
catch {
case e: ArrayIndexOutOfBoundsException => None
}
Or if you wanted to store an error message, for example, another option is to use Either:
val tempLastFileCreated: Either[String, File] = try {
Right(files(0))
}
catch {
case e: ArrayIndexOutOfBoundsException => Left("index out of bounds!")
}
Use whatever best suits your needs. You might also want to take a look at Scala's scala.util.Try data type, which is safer. For example,
val tempLastFileCreated: Option[File] = Try(files(0)) match {
case Success(file) => Some(file)
case Failure(throwable) => None //or whatever
}
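As a further sketch, the exception can be avoided altogether by asking the array for its first element as an Option. firstFile and the FirstFile wrapper are hypothetical names; the Option(...) wrapper is there because java.io.File.listFiles returns null when the path is not a readable directory.

```scala
import java.io.File

object FirstFile {
  // Option(...) turns a null result into None; headOption handles the empty array
  def firstFile(dir: File): Option[File] =
    Option(dir.listFiles()).flatMap(_.headOption)
}
```

With this shape there is no ArrayIndexOutOfBoundsException to catch, and both the "no such directory" and the "empty directory" cases yield None.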

How to manage DB connection in Scala using functional programming style?

I have a piece of Scala code using DB connection:
def getAllProviderCodes()(implicit conf : Configuration) : List[String] = {
var conn: java.sql.Connection = null
try {
conn = DriverManager.getConnection(DBInfo.dbUrl(conf), DBInfo.dbUserName(conf), DBInfo.dbPassword(conf))
return ResultSetIterator.create(
conn.prepareStatement("SELECT pcode FROM providers").executeQuery()
){_.getString("pcode")}.toList
} catch {
case e: Exception =>
logger.warn("Something went wrong with creating the connection: " + e.getStackTrace)
} finally {
if (conn != null) {
conn.close()
}
}
List()
}
It's a very OOP, Java-like style, so I'd like to know whether there is a way to write it in a more functional way. I tried to apply the Try monad but failed: my biggest concern is that we have state here, as well as a finally block. Maybe there's some kind of pattern for such cases?
Thank you in advance.
UPD: Here's the example from here of what IMHO the solution will look like:
val connection = database.getConnection()
val data: Seq[Data] = Try{
val results = connection.query("select whatever")
results.map(convertToWhatIneed)
} recover {
case t: Throwable =>
Seq.empty[Data]
} get
connection.close()
But as I've mentioned in the comment, I have to close the connection, so I have to place everything related to the connection inside the Try to keep it pure... and then I end up with the "try-catch-finally" variant inside the Try block.
I've never played around with the Java SQL Connection library so the syntax of my answer has been written as pseudocode, but if I understand your question correctly here is how I would implement what you have done:
def getAllProviderCodes()(implicit conf : Configuration): List[String] = {
val conn: Connection = DriverManager.getConnection(???) // replace ??? with parameters
val result: List[String] = Try {
??? // ResultSetIterator stuff
} match {
case Success(output) => output // or whatever .toList thing
case Failure(_) => List.empty // add logging here
}
if(conn != null) conn.close()
result // will be whatever List you make (or an empty List if Try fails)
}
Instead of a Java-like try-catch-finally block, one Scala-like way of doing things is to put the code that could explode in a Try block and assign the result to a value using case Success(out) and case Failure(ex).
Just pull the connection outside of the try:
val conn = getConnection()
try {
doStuff(conn)
} finally {
conn.close
}
If you want the result of whole thing to be a Try, just wrap it into a Try:
def doDBStuff = Try {
val conn = getConnection()
try {
doStuff(conn)
} finally {
conn.close
}
}
Or with a bit less nesting (but this will throw connection exceptions):
def doDBStuff = {
val conn = getConnection()
val result = Try { doStuff(conn) }
conn.close
result
}
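On Scala 2.13 or later, the standard library's scala.util.Using covers the same pattern: it closes the resource whether or not the body throws and returns the outcome as a Try. The sketch below is only an illustration; FakeConnection, providerCodes and the UsingSketch wrapper are hypothetical and replace java.sql.Connection so that the example runs without a database.

```scala
import scala.util.{Try, Using}

object UsingSketch {
  // Hypothetical stand-in for java.sql.Connection so the sketch runs without a database
  final class FakeConnection extends AutoCloseable {
    var closed = false
    def query(sql: String): List[String] = List("p1", "p2")
    override def close(): Unit = { closed = true }
  }

  // Using closes the resource whether the body succeeds or throws,
  // and returns the outcome as a Try (a failure to open is captured too)
  def providerCodes(open: () => FakeConnection): Try[List[String]] =
    Using(open()) { conn => conn.query("SELECT pcode FROM providers") }
}
```

A caller that wants the original List[String] signature could finish with getOrElse(List.empty) after logging the failure.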

Scala finally block closing/flushing resource

Is there a better way to ensure resources are properly released - a better way to write the following code ?
val out: Option[FileOutputStream] = try {
Option(new FileOutputStream(path))
} catch {
case _ => None
}
if (out.isDefined) {
try {
Iterator.continually(in.read).takeWhile(-1 != _).foreach(out.get.write)
} catch {
case e => println(e.getMessage)
} finally {
in.close
out.get.flush()
out.get.close()
}
}
Something like that is a good idea, but I'd make it a method:
def cleanly[A,B](resource: => A)(cleanup: A => Unit)(code: A => B): Option[B] = {
try {
val r = resource
try { Some(code(r)) }
finally { cleanup(r) }
} catch {
case e: Exception => None
}
}
(note that we only catch once; if you really want a message printed in one case and not the other, then you do have to catch both like you did). (Also note that I only catch exceptions; catching Error also is usually unwise, since it's almost impossible to recover from.) The method is used like so:
cleanly(new FileOutputStream(path))(_.close){ fos =>
Iterator.continually(in.read).takeWhile(_ != -1).foreach(fos.write)
}
Since it returns a value, you'll get a Some(()) if it succeeded here (which you can ignore).
Edit: to make it more general, I'd really have it return an Either instead, so you get the exception. Like so:
def cleanly[A,B](resource: => A)(cleanup: A => Unit)(code: A => B): Either[Exception,B] = {
try {
val r = resource
try { Right(code(r)) } finally { cleanup(r) }
}
catch { case e: Exception => Left(e) }
}
Now if you get a Right, all went okay. If you get a Left, you can pick out your exception. If you don't care about the exception, you can use .right.toOption to map it into an option, or just use .right.map or whatever to operate on the correct result only if it is there (just like with Option). (Pattern matching is a useful way to deal with Eithers.)
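As a rough illustration of the pattern-matching approach, the sketch below repeats the Either-returning cleanly so that it is self-contained and matches on its result. The copy method, its return strings, and the CleanlyUsage wrapper are hypothetical.

```scala
import java.io.{FileOutputStream, InputStream}

object CleanlyUsage {
  // Same shape as the Either-returning cleanly above, repeated to keep this self-contained
  def cleanly[A, B](resource: => A)(cleanup: A => Unit)(code: A => B): Either[Exception, B] =
    try {
      val r = resource
      try Right(code(r)) finally cleanup(r)
    } catch { case e: Exception => Left(e) }

  // Hypothetical caller: copies a stream to a file and reports the outcome as text
  def copy(in: InputStream, path: String): String =
    cleanly(new FileOutputStream(path))(_.close()) { fos =>
      Iterator.continually(in.read()).takeWhile(_ != -1).foreach(b => fos.write(b))
    } match {
      case Right(_) => "copied"
      case Left(e)  => s"failed: ${e.getMessage}"
    }
}
```

The Left branch receives the exception from either opening the stream or writing to it, so a single match covers both failure points.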
Have a look at Scala-ARM
This project aims to be the Scala Incubator project for Automatic-Resource-Management in the scala library ...
... The Scala ARM library allows users to ensure opening closing of resources within blocks of code using the "managed" method. The "managed" method essentially takes an argument of "anything that has a close or dispose method" and constructs a new ManagedResource object.
Alternatively you can do this with Choppy's Lazy TryClose monad.
val output = for {
fin <- TryClose(in)
fout <- TryClose.wrapWithCloser(new FileOutputStream(path))(out => {out.flush(); out.close();})
} yield wrap(Iterator.continually(fin.read).takeWhile(-1 != _).foreach(fout.get.write))
// Then execute it like this:
output.resolve
More info here: https://github.com/choppythelumberjack/tryclose
(just be sure to import tryclose._ and tryclose.JavaImplicits._)