Deal with Java NIO Iterator in Scala with Try - scala

I recently learned how to use Scala's native Try type to handle errors. One good thing with Try is that I'm able to use for-comprehension and silently ignore the error.
However, this becomes a slight problem with Java's NIO package (which I really want to use).
val p = Paths.get("Some File Path")
for {
stream <- Try(Files.newDirectoryStream(p))
file:Path <- stream.iterator()
} yield file.getFileName
This would have been perfect. I intend to get all file names from a directory, and using a DirectoryStream[Path] is the best way because it scales really well. The NIO page says DirectoryStream has an iterator() method that returns an iterator. For Java's for loop, it's enough and can be used like this:
try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
for (Path file: stream) {
System.out.println(file.getFileName());
}
}
However, Scala does not accept this. I was greeted with the error:
[error] /.../DAL.scala:42: value filter is not a member of java.util.Iterator[java.nio.file.Path]
[error] file:Path <- stream.iterator
I tried to use JavaConverters, whose documentation lists the conversion scala.collection.Iterator <=> java.util.Iterator, but when I call stream.iterator().asScala, the method is not found.
What should I do? How do I write nice Scala code while still using the NIO package?

I don't quite get why filter is being invoked in this for comprehension, but note that stream.iterator() returns an Iterator[Path], not a Path. My IDE seems to think otherwise, probably because it assumes map can be applied to the iterator, but in truth these methods are not defined on java.util.Iterator[java.nio.file.Path], as the compiler confirms:
scala> for {
| stream <- Try(Files.newDirectoryStream(p))
| file <- stream.iterator()
| } yield file
<console>:13: error: value map is not a member of java.util.Iterator[java.nio.file.Path]
file <- stream.iterator()
This for comprehension translates to:
Try(Files.newDirectoryStream(p)).flatMap(stream => stream.iterator().map(...))
Here the inner map is not defined. One solution can be found in this SO question, but I can't tell you how to use the iterator in the for comprehension here, since a Java iterator cannot be mapped over and I'm not sure it can be converted inside the comprehension.
Edit:
I managed to find out more about the problem. I tried this for comprehension:
for {
stream <- Try(Files.newDirectoryStream(p))
file <- stream.iterator().toIterator
} yield file
This doesn't compile because:
found : Iterator[java.nio.file.Path]
required: scala.util.Try[?]
file <- stream.iterator().toIterator
It translates to:
Try(Files.newDirectoryStream(p)).flatMap(stream => stream.iterator().toIterator.map(...))
But flatMap on a Try expects the function to return a Try, not an Iterator. In fact this works (note the inner Try):
Try(Files.newDirectoryStream(p)).flatMap(stream => Try(stream.iterator().map(...)))
What I came up with:
import java.nio.file.{Paths, Files}
import scala.util.Try
// Implicit Java -> Scala collection conversions (deprecated in later Scala versions)
import scala.collection.JavaConversions._
val p = Paths.get("Some File Path")
Try(Files.newDirectoryStream(p))
  .map(stream =>
    stream
      .iterator()
      .toIterator
      .toList
      .map(path => path.getFileName.toString)
  ).getOrElse(List())
This returns a List[String]. Unfortunately, it is far from being as pretty as your for comprehension; maybe somebody else has a better idea.

I really like what Ende Neu wrote, and it is hard to work with NIO in Scala. I want to preserve the efficiency of Java's stream, so I decided to write this function instead. It still uses Try, and I only need to deal with the Success and Failure cases :)
It's not as smooth as I'd hoped, and without Java 7's great try-with-resources feature I have to close the stream myself (which is terrible...), but this works.
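import java.nio.file.{DirectoryStream, Files, Path, Paths}
import scala.collection.mutable.ListBuffer
import scala.util.{Failure, Success, Try}

def readFileNames(filePath: String): Option[List[Path]] = {
  val p = Paths.get(filePath)
  val stream: Try[DirectoryStream[Path]] = Try(Files.newDirectoryStream(p))
  // A mutable buffer is needed here: `list :+ x` on an immutable List
  // returns a new list and leaves the original untouched.
  val listOfFiles = ListBuffer[Path]()
  stream match {
    case Success(st) =>
      try {
        val iterator = st.iterator()
        while (iterator.hasNext) {
          listOfFiles += iterator.next()
        }
      } finally st.close() // close the stream even if iteration throws
    case Failure(ex) => println(s"The file path is incorrect: ${ex.getMessage}")
  }
  if (listOfFiles.isEmpty) None else Some(listOfFiles.toList)
}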
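For readers on Scala 2.13 or later, the manual close can probably be avoided. The following is a minimal sketch, assuming scala.util.Using and scala.jdk.CollectionConverters are available (both were added in 2.13, so this would not have compiled on the version used in this thread). DirListing and fileNames are hypothetical names chosen for illustration.

```scala
import java.nio.file.{Files, Path}
import scala.jdk.CollectionConverters._
import scala.util.{Try, Using}

object DirListing {
  // Hypothetical helper: lists the file names in `dir`.
  // Using closes the DirectoryStream whether or not the body throws,
  // and a failure to open the directory ends up in the returned Try.
  def fileNames(dir: Path): Try[List[String]] =
    Using(Files.newDirectoryStream(dir)) { stream =>
      // asScala wraps the java.util.Iterator as a Scala Iterator;
      // toList forces it while the stream is still open.
      stream.iterator().asScala.map(_.getFileName.toString).toList
    }
}
```

A call such as DirListing.fileNames(Paths.get("Some File Path")) then gives a Success with the names or a Failure with the exception, so the caller still only deals with the two Try cases. The price is that the whole listing is materialised into a List before the stream is closed.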

Related

How to return successfully parsed rows that converted into my case class

I have a file in which each row is a JSON array.
I am reading each line of the file and trying to convert it into a JSON array; then I convert each element to a case class using spray-json.
I have this so far:
for (line <- source.getLines().take(10)) {
val jsonArr = line.parseJson.convertTo[JsArray]
for (ele <- jsonArr.elements) {
val tryUser = Try(ele.convertTo[User])
}
}
How could I convert this entire process into a single line statement?
val users: Seq[User] = source.getLines.take(10).map(line => line.parseJson.convertTo[JsonArray].elements.map(ele => Try(ele.convertTo[User])
The error is:
found : Iterator[Nothing]
Note: I used Scala 2.13.6 for all my examples.
There is a lot to unpack in these few lines of code. First of all, I'll share some code that we can use to generate some meaningful input to play around with.
object User {
import scala.util.Random
private def randomId: Int = Random.nextInt(900000) + 100000
private def randomName: String = Iterator
.continually(Random.nextInt(26) + 'a')
.map(_.toChar)
.take(6)
.mkString
def randomJson(): String = s"""{"id":$randomId,"name":"$randomName"}"""
def randomJsonArray(size: Int): String =
Iterator.continually(randomJson()).take(size).mkString("[", ",", "]")
}
final case class User(id: Int, name: String)
import scala.util.{Try, Success, Failure}
import spray.json._
import DefaultJsonProtocol._
implicit val UserFormat = jsonFormat2(User.apply)
This is just some scaffolding to define some User domain object and come up with a way to generate a JSON representation of an array of such objects so that we can then use a JSON library (spray-json in this case) to parse it back into what we want.
Now, going back to your question. This is a possible way to massage your data into its parsed representation. It may not fit 100% what you are trying to do, but there's some nuance in the data types involved and how they work:
val parsedUsers: Iterator[Try[User]] =
for {
line <- Iterator.continually(User.randomJsonArray(4)).take(10)
element <- line.parseJson.convertTo[JsArray].elements
} yield Try(element.convertTo[User])
First difference: notice that I use the for comprehension in a form in which the "outcome" of an iteration is not a side effect (for (something) { do something }) but an actual value (for (something) yield { return a value }).
Second difference: I explicitly asked for an Iterator[Try[User]] rather than a Seq[User]. We could go far down a rabbit hole on the topic of why the types are what they are here, but the simple explanation is that a for ... yield expression:
returns the same type as the one in the first line of the generation -- if you start with a val ns: Iterator[Int]; for (n<- ns) ... you'll get an iterator at the end
if you nest generators, they need to be of the same type as the "outermost" one
You can read more on for comprehensions on the Tour of Scala and the Scala Book.
One possible way of consuming this is the following:
for (user <- parsedUsers) {
  user match {
    case Success(user) => println(s"parsed object $user")
    case Failure(error) => println(s"ERROR: '${error.getMessage}'")
  }
}
As for how to turn this into a "one liner", for comprehensions are syntactic sugar applied by the compiler which turns every nested call into a flatMap and the final one into map, as in the following example (which yields an equivalent result as the for comprehension above and very close to what the compiler does automatically):
val parsedUsers: Iterator[Try[User]] = Iterator
.continually(User.randomJsonArray(4))
.take(10)
.flatMap(line =>
line.parseJson
.convertTo[JsArray]
.elements
.map(element => Try(element.convertTo[User]))
)
One note that I would like to add is that you should be mindful of readability. Some teams prefer for comprehensions, others prefer manually rolling out their own flatMap/map chains. Coder's discretion is advised.
You can play around with this code here on Scastie (and here is the version with the flatMap/map calls).
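If only the successfully parsed values are wanted in the end, as the question title suggests, one option is to filter the Iterator[Try[...]] with collect. This is a minimal sketch under assumptions: to stay free of spray-json it parses integers instead of users, with Try(s.trim.toInt) standing in for Try(element.convertTo[User]); KeepSuccesses, parseAll and successes are hypothetical names.

```scala
import scala.util.{Success, Try}

object KeepSuccesses {
  // Stand-in for Try(element.convertTo[User]): parsing may throw, so wrap it in Try.
  def parseAll(raw: Iterator[String]): Iterator[Try[Int]] =
    raw.map(s => Try(s.trim.toInt))

  // collect keeps only the Success cases and unwraps them; failures are silently dropped.
  def successes[A](parsed: Iterator[Try[A]]): Seq[A] =
    parsed.collect { case Success(value) => value }.toList
}
```

Applied to the answer's parsedUsers, KeepSuccesses.successes(parsedUsers) would have type Seq[User]. Dropping failures silently is a design choice; if the errors matter, matching on Failure as shown earlier keeps them visible.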

Functional scala log accumulator

I'm working on a Scala project using cats library, mainly. In there, we have calls like
for {
_ <- initSomeServiceAndLog("something from a far away service")
_ <- initSomeOtherServiceAndLog("something from another far away service")
a <- b()
c <- d(a)
} yield c
Imagine that b also logs something or might throw a business error (I know, we avoid to throw in Scala, but it's not the case right now). I'm looking for a solution to accumulate logs and print them all in the end, in a single message.
For a happy path, I saw that Writer Monad from Cats might be an acceptable solution.
But what if the b method throws? The requirement is to log everything (all previous logs and the error message) in a single message, with some kind of unique trace ID.
Any thoughts? Thanks in advance
Implementing functional logging (in a way that preserves logs even if an error happened) using monad transformers like Writer (WriterT) or State (StateT) is hard. However, if we are not too strict about the FP approach, we could do the following:
use some IO monad
with it, create something like in-memory storage for logs
but implement it in a functional way
Personally I would pick either cats.effect.concurrent.Ref or monix.eval.TaskLocal.
Example using Ref (and Task):
import cats.data.Chain
import cats.effect.concurrent.Ref
import monix.eval.Task

type Log = Ref[Task, Chain[String]]
type FunctionalLogger = String => Task[Unit]
// Creating the Ref is itself an effect, so it is wrapped in Task
val createLog: Task[Log] = Ref.of[Task, Chain[String]](Chain.empty)
def createAppender(log: Log): FunctionalLogger =
  entry => log.update(chain => chain.append(entry))
def outputLog(log: Log): Task[Chain[String]] = log.get
With helpers like that I could:
def doOperations(logger: FunctionalLogger) = for {
_ <- operation1(logger) // logging is a side effect managed by IO monad
_ <- operation2(logger) // so it is referentially transparent
} yield result
createLog.flatMap { log =>
doOperations(createAppender(log))
.recoverWith(...)
.flatMap { result =>
outputLog(log)
...
}
}
However, making sure that the log is output is a bit of a pain, so we could use some form of Bracket or Resource to handle it:
val loggerResource: Resource[Task, FunctionalLogger] = Resource.make {
  createLog // acquiring resource - IO operation that accesses something
} { log =>
  outputLog(log) // releasing resource - works like finally in try-catch, so it should
    .flatMap(... /* log entries or sth */) // be called no matter if an error occurred
}.map(createAppender)
loggerResource.use { logger =>
doSomething(logger)
}
If you don't like passing this appender around explicitly you could use Kleisli to inject it:
type WithLogger[A] = Kleisli[Task, FunctionalLogger, A]
// def operation1: WithLogger[A]
// def operation2: WithLogger[B]
def doSomething: WithLogger[C] = for {
a <- operation1
b <- operation2
} yield c
loggerResource.use { logger =>
doSomething(logger)
}
TaskLocal would be used in a very similar way.
At the end of the day you would end up with:
type that says that it is logging
mutability managed through IO, so referential transparency would not be lost
certainty that even if IO fails, log will be preserved and the results sent
I believe some purists would not like this solution, but it has all the benefits of FP, so I would personally use it.

How do I create a seq of string from a file that is opened with managed?

Tried this to create a seq from file:
def getFileAsList(bufferedReader: BufferedReader): Seq[String] ={
import resource._
for(source <- managed(bufferedReader)){
for(line<-source.lines())
yield line
}
}
I don't think you are using Scala-ARM the way it was designed to be used. Unless you use the imperative style, i.e. consume your managed resource in place, you are using the monadic style, so what you get is a result wrapped in an ExtractableManagedResource, which is a delayed (lazy) computation rather than an immediate result. So this is not a direct substitute for Java's try-with-resources construct. The monadic style is more useful if you have a method that wants to return some lazy resource that also happens to be managed, i.e. requires some kind of explicit close after usage. But this means that the managed resource is created inside the method rather than passed in from the outside, as in your case.
Still you probably can achieve something similar to what you want with a construction like
def getFileAsList(bufferedReader: BufferedReader): java.util.stream.Stream[String] = {
import resource._
val managedWrapper = for (source <- managed(bufferedReader))
yield for (line <- source.lines())
yield line
managedWrapper.tried.get
}
The tried method converts ExtractableManagedResource into a Try and get on that will either get you the result or (re-)throw the exception that happened during result calculation.
Please also note that java.util.stream.Stream is a beast quite different from scala.collection.Seq or scala.collection.immutable.Stream. If you want to get a Scala-specific Stream, you should use some Scala-specific code such as
def getFileAsList(bufferedReader: BufferedReader): scala.collection.immutable.Stream[String] = {
  import resource._
  val managedWrapper = for (source <- managed(bufferedReader))
    // Stream is lazy: force it so that every line is read before the reader is closed
    yield Stream.continually(source.readLine()).takeWhile(_ != null).force
  managedWrapper.tried.get
}
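On Scala 2.13 or later, a similar result can probably be had without Scala-ARM. This is a minimal sketch, assuming scala.util.Using from the standard library (not part of the original answer); ReadLines and readAllLines are hypothetical names. It returns a strict List so that nothing is read after the reader has been closed.

```scala
import java.io.BufferedReader
import scala.util.{Try, Using}

object ReadLines {
  // `open` is passed by name, so a failure while opening the reader
  // is captured in the returned Try as well.
  def readAllLines(open: => BufferedReader): Try[List[String]] =
    Using(open) { reader =>
      // readLine returns null at end of input; toList reads every line
      // while the reader is still open.
      Iterator.continually(reader.readLine()).takeWhile(_ != null).toList
    }
}
```

Passing the reader by name keeps creation and closing in one place, which matches the point above that a managed resource is best created inside the method that manages it.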

Await.result on HttpService

I've a scala project with http4s 0.15.16a and slick 3.2.1 with these steps:
Receive a ID by rest call
passing ID to MySlickDAO that responds with a Future
Call Await.result(res, Duration.Inf) on Future returned by MySlickDAO
Create the json
The problem is that I use Await.result, which is bad practice.
Is there a better solution?
Here is the code:
val service = HttpService {
//http://localhost:8080/rest/id/9008E75A-F112-396B-E050-A8C08D26075F
case GET -> Root / "rest" / "id" / id =>
val res = MySlickDAO.load(id)
Await.result(res, Duration.Inf)
val ll = res.value.get.get
ll match {
case Failure(x) =>
InternalServerError(x)
case Success(record) =>
val r = record.map(x => MyEntity(x._1, x._2, x._3))
jsonOK(r.asJson)
}
case ....
}
Instead of awaiting, you can chain the result of one Future into another:
val resFut = MySlickDAO.load(id)
resFut.map { record =>
  val r = record.map(x => MyEntity(x._1, x._2, x._3))
  jsonOK(r.asJson)
} recover { case x => // recover takes a partial function, hence the case
  InternalServerError(x)
}
The result of this will be a Future of a common supertype of jsonOK and InternalServerError (not familiar with the libraries you're using; so I may have the type of load wrong: it's not a Future[Try[_]] is it?).
BTW: your original code has a very problematic line:
val ll = res.value.get.get
res.value is an Option[Try[T]]. Calling get on an Option or a Try is generally a bad idea (even though in this case because of the Await, the Option should never be None, so the get is technically safe) because it can throw an exception. You're much better off using map, flatMap, and friends.
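To make the map/recover chaining concrete without depending on http4s or Slick, here is a minimal sketch using only the standard library. Everything in it is an assumption made for illustration: Reply is a hypothetical stand-in for an HTTP response, and the load parameter stands in for MySlickDAO.load.

```scala
import scala.concurrent.{ExecutionContext, Future}
import scala.util.control.NonFatal

object NonBlockingHandler {
  // Hypothetical stand-in for an HTTP response.
  final case class Reply(status: Int, body: String)

  // `load` plays the role of MySlickDAO.load: it returns a Future of rows.
  def handle(load: String => Future[Seq[(String, Int)]])(id: String)(
      implicit ec: ExecutionContext): Future[Reply] =
    load(id)
      // Success path: turn the rows into a reply without blocking.
      .map(rows => Reply(200, rows.map { case (name, n) => s"$name:$n" }.mkString(",")))
      // Failure path: a failed Future becomes an error reply instead of an exception.
      .recover { case NonFatal(ex) => Reply(500, ex.getMessage) }
}
```

No thread waits for the database here: both branches are attached as callbacks, and the caller receives a single Future[Reply] covering the success and failure cases.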
The issue is that http4s 0.15 uses the Scalaz concurrency constructs, while Slick uses the native Scala ones, and the two aren't designed to work with each other. My understanding is that http4s 0.17+ has switched from Scalaz to Cats, which might entail using native Scala Futures, so if you can upgrade that might be worth a shot. If not, you can handle the conversion by manually creating a Task that wraps your future:
def scalaFutureRes = MySlickDAO.load(id)
val scalazTaskRes = Task.async { register =>
scalaFutureRes.onComplete {
case Success(success) => register(success.right)
case Failure(ex) => register(ex.left)
}
}
At this point you've got a Task[ResultType] from the Future[ResultType] which you can map/flatMap with the rest of your logic like in Levi's answer.
You can also use the delorean library for this which has this logic and the opposite direction defined on the classes in question via implicit conversions, so that you can just call .toTask on a Future to get it in a compatible form. Their readme also has a lot of useful information on the conversion and what pitfalls there are.

Scala: How to return a Some or Option

I have the following piece of code that I am trying to enhance:
I am using the java.nio.file package to represent a directory or a file as a Path.
So here goes:
import java.nio.file.{Paths,DirectoryStream,Files,
Path,DirectoryIteratorException}
val path: Path = Paths.get(directoryPath)
var directoryStream: Option[DirectoryStream[Path]] = None
// so far so good
try {
directoryStream = Some(Files.newDirectoryStream(pathO))
// this is where i get into trouble
def getMeDirStream: DirectoryStream[Path] =
if (!directoryStream.isEmpty && directoryStream.isDefined)
getMeDirStream.get
else
None
// invoke the iterator() method of dstream here
}
The above piece of code will not compile because I do not know what to return in the else, and right now, for the life of me, I can only come up with None, which the compiler simply does not like and I would like to learn what should be its replacement.
I want this example to be a learning lesson of Option and Some for me.
Okay, this is where I choke. I would like to check if the directoryStream is not empty and is defined, and then if this is the case, I would like to invoke getMeDirStream.get to invoke the iterator() method on it.
The API for Option tells me that invoking the get() method could result in a java.util.NoSuchElementException if the option is empty.
If the directoryStream is empty I want to return something and not None, because IntelliJ is telling me that "Expression of type None.type doesn't conform to expected type DirectoryStream[Path]".
Now, I am being all naive about this.
I would like to know the following:
What should I return in the else other than None?
Should I wrap the getMeDirStream.get in a try-catch with a java.util.NoSuchElementException, even though I am checking if the directoryStream is empty or not.?
What is the purpose of a try-catch in the getMeDirStream.get, if there is indeed such a need?
How can I clean up the above piece of code to incorporate correct checks for being isDefined and for catching appropriate exceptions?
Once I know what to return in the else (and after putting in the appropriate try-catch block if necessary), I would like to invoke the iterator() method on getMeDirStream to do some downstream operations.
Some and None are subtypes of Option, but to be more correct, they are actually two different cases of Option or data constructors. In other words, even though Scala allows you to directly invoke a Some or a None you should still regard their type to be Option. The more important thing to take from this is that you should never under any circumstance invoke Option#get as it is unsafe.
The intention of Option is to indicate the possibility that a value does not exist. If you care about the errors, then you should probably look at using Either instead (or Scalaz's Either called \/).
You can keep the computation within the Option context and then only extract the value later, or provide a default.
def fromTryCatch[A](a: => A): Either[Throwable, A] = try { Right(a) } catch { case e: Throwable => Left(e) }
val getMeDirStream: Option[java.util.Iterator[Path]] =
  for {
    path <- fromTryCatch(Paths.get(directoryPath)).toOption
    // use the path bound on the previous line
    directoryStream <- fromTryCatch(Files.newDirectoryStream(path)).toOption
  } yield directoryStream.iterator()
Later, or right after, you can get the iterator, or provide a default value:
val iterator = getMeDirStream.getOrElse(java.util.Collections.emptyIterator[Path])
Your specific questions are difficult to address because it's unclear exactly what you're trying to achieve. In particular, when you ask what the purpose of the try block is... Well, you wrote it, so only you can answer that.
In general, you never call get on an Option. You either use pattern matching:
option match {
case Some(value) => /* ... */
case None => /* ... */
}
or you use methods like map, flatMap, and foreach (or the equivalent comprehension syntax that gpampara's code uses).
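As a small illustration of those alternatives to get, here is a sketch using only the standard library. OptionUsage and its method names are hypothetical and chosen for this example.

```scala
object OptionUsage {
  // Pattern matching handles both cases explicitly.
  def describe(maybeName: Option[String]): String = maybeName match {
    case Some(name) => s"found $name"
    case None       => "nothing found"
  }

  // map transforms the value if present; getOrElse supplies a default at the very end.
  def lengthOrZero(maybeName: Option[String]): Int =
    maybeName.map(_.length).getOrElse(0)

  // A for comprehension yields Some only if every generator is defined.
  def combined(a: Option[Int], b: Option[Int]): Option[Int] =
    for {
      x <- a
      y <- b
    } yield x + y
}
```

None of these can throw NoSuchElementException, which is the main reason to prefer them over get.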
My revision of gpampara's answer:
import scala.collection.convert.wrapAll._
import scala.util.Try
import java.nio.file.{Paths, Files, Path}
val getMeDirStream: Option[Iterator[Path]] =
for {
path <- Try(Paths.get("")).toOption
directoryStream <- Try(Files.newDirectoryStream(path)).toOption
} yield directoryStream.iterator
Changes:
Using Try(...).toOption instead of Either
Using implicits in scala.collection.convert to return the result as a Scala Iterator.
Try is similar to Option. Instead of Some and None, it has Success and Failure subtypes, and the failure case includes a Throwable, whereas None is just a singleton with no additional information.
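The difference can be seen in a minimal sketch that uses only the standard library; TryVsOption and its method names are hypothetical.

```scala
import scala.util.{Failure, Success, Try}

object TryVsOption {
  // Try captures an exception thrown by the body as a Failure.
  def parse(s: String): Try[Int] = Try(s.toInt)

  // Failure carries the Throwable, so the reason for the error is still available.
  def explain(s: String): String = parse(s) match {
    case Success(n)  => s"parsed $n"
    case Failure(ex) => s"failed with ${ex.getClass.getSimpleName}"
  }

  // toOption keeps the value but discards the Throwable.
  def parseOption(s: String): Option[Int] = parse(s).toOption
}
```

So Try(...).toOption is convenient when only the presence of a value matters, while keeping the Try is preferable when the caller needs to report what went wrong.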