What Automatic Resource Management alternatives exist for Scala? - scala

I have seen many examples of ARM (automatic resource management) on the web for Scala. It seems to be a rite-of-passage to write one, though most look pretty much like one another. I did see a pretty cool example using continuations, though.
At any rate, a lot of that code has flaws of one type or another, so I figured it would be a good idea to have a reference here on Stack Overflow, where we can vote up the most correct and appropriate versions.

Chris Hansen's blog entry 'ARM Blocks in Scala: Revisited' from 3/26/09 talks about about slide 21 of Martin Odersky's FOSDEM presentation. This next block is taken straight from slide 21 (with permission):
def using[T <: { def close() }]
(resource: T)
(block: T => Unit)
{
try {
block(resource)
} finally {
if (resource != null) resource.close()
}
}
--end quote--
Then we can call like this:
using(new BufferedReader(new FileReader("file"))) { r =>
var count = 0
while (r.readLine != null) count += 1
println(count)
}
What are the drawbacks of this approach? That pattern would seem to address 95% of where I would need automatic resource management...
Edit: added code snippet
Edit2: extending the design pattern - taking inspiration from python with statement and addressing:
statements to run before the block
re-throwing exception depending on the managed resource
handling two resources with one single using statement
resource-specific handling by providing an implicit conversion and a Managed class
This is with Scala 2.8.
trait Managed[T] {
def onEnter(): T
def onExit(t:Throwable = null): Unit
def attempt(block: => Unit): Unit = {
try { block } finally {}
}
}
def using[T <: Any](managed: Managed[T])(block: T => Unit) {
val resource = managed.onEnter()
var exception = false
try { block(resource) } catch {
case t:Throwable => exception = true; managed.onExit(t)
} finally {
if (!exception) managed.onExit()
}
}
def using[T <: Any, U <: Any]
(managed1: Managed[T], managed2: Managed[U])
(block: T => U => Unit) {
using[T](managed1) { r =>
using[U](managed2) { s => block(r)(s) }
}
}
class ManagedOS(out:OutputStream) extends Managed[OutputStream] {
def onEnter(): OutputStream = out
def onExit(t:Throwable = null): Unit = {
attempt(out.close())
if (t != null) throw t
}
}
class ManagedIS(in:InputStream) extends Managed[InputStream] {
def onEnter(): InputStream = in
def onExit(t:Throwable = null): Unit = {
attempt(in.close())
if (t != null) throw t
}
}
implicit def os2managed(out:OutputStream): Managed[OutputStream] = {
return new ManagedOS(out)
}
implicit def is2managed(in:InputStream): Managed[InputStream] = {
return new ManagedIS(in)
}
def main(args:Array[String]): Unit = {
using(new FileInputStream("foo.txt"), new FileOutputStream("bar.txt")) {
in => out =>
Iterator continually { in.read() } takeWhile( _ != -1) foreach {
out.write(_)
}
}
}

Daniel,
I've just recently deployed the scala-arm library for automatic resource management. You can find the documentation here: https://github.com/jsuereth/scala-arm/wiki
This library supports three styles of usage (currently):
1) Imperative/for-expression:
import resource._
for(input <- managed(new FileInputStream("test.txt")) {
// Code that uses the input as a FileInputStream
}
2) Monadic-style
import resource._
import java.io._
val lines = for { input <- managed(new FileInputStream("test.txt"))
val bufferedReader = new BufferedReader(new InputStreamReader(input))
line <- makeBufferedReaderLineIterator(bufferedReader)
} yield line.trim()
lines foreach println
3) Delimited Continuations-style
Here's an "echo" tcp server:
import java.io._
import util.continuations._
import resource._
def each_line_from(r : BufferedReader) : String #suspendable =
shift { k =>
var line = r.readLine
while(line != null) {
k(line)
line = r.readLine
}
}
reset {
val server = managed(new ServerSocket(8007)) !
while(true) {
// This reset is not needed, however the below denotes a "flow" of execution that can be deferred.
// One can envision an asynchronous execuction model that would support the exact same semantics as below.
reset {
val connection = managed(server.accept) !
val output = managed(connection.getOutputStream) !
val input = managed(connection.getInputStream) !
val writer = new PrintWriter(new BufferedWriter(new OutputStreamWriter(output)))
val reader = new BufferedReader(new InputStreamReader(input))
writer.println(each_line_from(reader))
writer.flush()
}
}
}
The code makes uses of a Resource type-trait, so it's able to adapt to most resource types. It has a fallback to use structural typing against classes with either a close or dispose method. Please check out the documentation and let me know if you think of any handy features to add.

Here's James Iry solution using continuations:
// standard using block definition
def using[X <: {def close()}, A](resource : X)(f : X => A) = {
try {
f(resource)
} finally {
resource.close()
}
}
// A DC version of 'using'
def resource[X <: {def close()}, B](res : X) = shift(using[X, B](res))
// some sugar for reset
def withResources[A, C](x : => A #cps[A, C]) = reset{x}
Here are the solutions with and without continuations for comparison:
def copyFileCPS = using(new BufferedReader(new FileReader("test.txt"))) {
reader => {
using(new BufferedWriter(new FileWriter("test_copy.txt"))) {
writer => {
var line = reader.readLine
var count = 0
while (line != null) {
count += 1
writer.write(line)
writer.newLine
line = reader.readLine
}
count
}
}
}
}
def copyFileDC = withResources {
val reader = resource[BufferedReader,Int](new BufferedReader(new FileReader("test.txt")))
val writer = resource[BufferedWriter,Int](new BufferedWriter(new FileWriter("test_copy.txt")))
var line = reader.readLine
var count = 0
while(line != null) {
count += 1
writer write line
writer.newLine
line = reader.readLine
}
count
}
And here's Tiark Rompf's suggestion of improvement:
trait ContextType[B]
def forceContextType[B]: ContextType[B] = null
// A DC version of 'using'
def resource[X <: {def close()}, B: ContextType](res : X): X #cps[B,B] = shift(using[X, B](res))
// some sugar for reset
def withResources[A](x : => A #cps[A, A]) = reset{x}
// and now use our new lib
def copyFileDC = withResources {
implicit val _ = forceContextType[Int]
val reader = resource(new BufferedReader(new FileReader("test.txt")))
val writer = resource(new BufferedWriter(new FileWriter("test_copy.txt")))
var line = reader.readLine
var count = 0
while(line != null) {
count += 1
writer write line
writer.newLine
line = reader.readLine
}
count
}

For now Scala 2.13 has finally supported: try with resources by using Using :), Example:
val lines: Try[Seq[String]] =
Using(new BufferedReader(new FileReader("file.txt"))) { reader =>
Iterator.unfold(())(_ => Option(reader.readLine()).map(_ -> ())).toList
}
or using Using.resource avoid Try
val lines: Seq[String] =
Using.resource(new BufferedReader(new FileReader("file.txt"))) { reader =>
Iterator.unfold(())(_ => Option(reader.readLine()).map(_ -> ())).toList
}
You can find more examples from Using doc.
A utility for performing automatic resource management. It can be used to perform an operation using resources, after which it releases the resources in reverse order of their creation.

I see a gradual 4 step evolution for doing ARM in Scala:
No ARM: Dirt
Only closures: Better, but multiple nested blocks
Continuation Monad: Use For to flatten the nesting, but unnatural separation in 2 blocks
Direct style continuations: Nirava, aha! This is also the most type-safe alternative: a resource outside withResource block will be type error.

There is light-weight (10 lines of code) ARM included with better-files. See: https://github.com/pathikrit/better-files#lightweight-arm
import better.files._
for {
in <- inputStream.autoClosed
out <- outputStream.autoClosed
} in.pipeTo(out)
// The input and output streams are auto-closed once out of scope
Here is how it is implemented if you don't want the whole library:
type Closeable = {
def close(): Unit
}
type ManagedResource[A <: Closeable] = Traversable[A]
implicit class CloseableOps[A <: Closeable](resource: A) {
def autoClosed: ManagedResource[A] = new Traversable[A] {
override def foreach[U](f: A => U) = try {
f(resource)
} finally {
resource.close()
}
}
}

How about using Type classes
trait GenericDisposable[-T] {
def dispose(v:T):Unit
}
...
def using[T,U](r:T)(block:T => U)(implicit disp:GenericDisposable[T]):U = try {
block(r)
} finally {
Option(r).foreach { r => disp.dispose(r) }
}

Another alternative is Choppy's Lazy TryClose monad. It's pretty good with database connections:
val ds = new JdbcDataSource()
val output = for {
conn <- TryClose(ds.getConnection())
ps <- TryClose(conn.prepareStatement("select * from MyTable"))
rs <- TryClose.wrap(ps.executeQuery())
} yield wrap(extractResult(rs))
// Note that Nothing will actually be done until 'resolve' is called
output.resolve match {
case Success(result) => // Do something
case Failure(e) => // Handle Stuff
}
And with streams:
val output = for {
outputStream <- TryClose(new ByteArrayOutputStream())
gzipOutputStream <- TryClose(new GZIPOutputStream(outputStream))
_ <- TryClose.wrap(gzipOutputStream.write(content))
} yield wrap({gzipOutputStream.flush(); outputStream.toByteArray})
output.resolve.unwrap match {
case Success(bytes) => // process result
case Failure(e) => // handle exception
}
More info here: https://github.com/choppythelumberjack/tryclose

Here is #chengpohi's answer, modified so it works with Scala 2.8+, instead of just Scala 2.13 (yes, it works with Scala 2.13 also):
def unfold[A, S](start: S)(op: S => Option[(A, S)]): List[A] =
Iterator
.iterate(op(start))(_.flatMap{ case (_, s) => op(s) })
.map(_.map(_._1))
.takeWhile(_.isDefined)
.flatten
.toList
def using[A <: AutoCloseable, B](resource: A)
(block: A => B): B =
try block(resource) finally resource.close()
val lines: Seq[String] =
using(new BufferedReader(new FileReader("file.txt"))) { reader =>
unfold(())(_ => Option(reader.readLine()).map(_ -> ())).toList
}

While Using is OK, I prefer the monadic style of resource composition. Twitter Util's Managed is pretty nice, except for its dependencies and its not-very-polished API.
To that end, I've published https://github.com/dvgica/managerial for Scala 2.12, 2.13, and 3.0.0. Largely based on the Twitter Util Managed code, no dependencies, with some API improvements inspired by cats-effect Resource.
The simple example:
import ca.dvgi.managerial._
val fileContents = Managed.from(scala.io.Source.fromFile("file.txt")).use(_.mkString)
But the real strength of the library is composing resources via for comprehensions.
Let me know what you think!

Related

Scala stream and ExecutionContext issue

I'm new in Scala and i'm facing a few problems in my assignment :
I want to build a stream class that can do 3 main tasks : filter,map,and forEach.
My streams data is an array of elements. Each of the 3 main tasks should run in 2 different threads on my streams array.
In addition, I need to divde the logic of the action and its actual run to two different parts. First declare all tasks in stream and only when I run stream.run() I want the actual actions to happen.
My code :
class LearningStream[A]() {
val es: ExecutorService = Executors.newFixedThreadPool(2)
val ec = ExecutionContext.fromExecutorService(es)
var streamValues: ArrayBuffer[A] = ArrayBuffer[A]()
var r: Runnable = () => "";
def setValues(streamv: ArrayBuffer[A]) = {
streamValues = streamv;
}
def filter(p: A => Boolean): LearningStream[A] = {
var ls_filtered: LearningStream[A] = new LearningStream[A]()
r = () => {
println("running real filter..")
val (l,r) = streamValues.splitAt(streamValues.length/2)
val a:ArrayBuffer[A]=es.submit(()=>l.filter(p)).get()
val b:ArrayBuffer[A]=es.submit(()=>r.filter(p)).get()
ms_filtered.setValues(a++b)
}
return ls_filtered
}
def map[B](f: A => B): LearningStream[B] = {
var ls_map: LearningStream[B] = new LearningStream[B]()
r = () => {
println("running real map..")
val (l,r) = streamValues.splitAt(streamValues.length/2)
val a:ArrayBuffer[B]=es.submit(()=>l.map(f)).get()
val b:ArrayBuffer[B]=es.submit(()=>r.map(f)).get()
ls_map.setValues(a++b)
}
return ls_map
}
def forEach(c: A => Unit): Unit = {
r=()=>{
println("running real forEach")
streamValues.foreach(c)}
}
def insert(a: A): Unit = {
streamValues += a
}
def start(): Unit = {
ec.submit(r)
}
def shutdown(): Unit = {
ec.shutdown()
}
}
my main :
def main(args: Array[String]): Unit = {
var factorial=0
val s = new LearningStream[String]
s.filter(str=>str.startsWith("-")).map(s=>s.toInt*(-1)).forEach(i=>factorial=factorial*i)
for(i <- -5 to 5){
s.insert(i.toString)
}
println(s.streamValues)
s.start()
println(factorial)
}
The main prints only the filter`s output and the factorial isnt changed (still 1).
What am I missing here ?
My solution: #Levi Ramsey left a few good hints in the comments if you want to get hints and not the real solution.
First problem: Only one command (filter) run and the other didn't. solution: insert to the runnable of each command a call for the next stream via:
ec.submit(ms_map.r)
In order to be able to close all sessions, we need to add another LearningStream data member to the class. However we can't add just a regular LearningStream object because it depends on parameter [A]. Therefore, I implemented a trait that has the close function and my data member was of that trait type.

Iterate data source asynchronously in batch and stop while remote return no data in Scala

Let's say we have a fake data source which will return data it holds in batch
class DataSource(size: Int) {
private var s = 0
implicit val g = scala.concurrent.ExecutionContext.global
def getData(): Future[List[Int]] = {
s = s + 1
Future {
Thread.sleep(Random.nextInt(s * 100))
if (s <= size) {
List.fill(100)(s)
} else {
List()
}
}
}
object Test extends App {
val source = new DataSource(100)
implicit val g = scala.concurrent.ExecutionContext.global
def process(v: List[Int]): Unit = {
println(v)
}
def next(f: (List[Int]) => Unit): Unit = {
val fut = source.getData()
fut.onComplete {
case Success(v) => {
f(v)
v match {
case h :: t => next(f)
}
}
}
}
next(process)
Thread.sleep(1000000000)
}
I have mine, the problem here is some portion is more not pure. Ideally, I would like to wrap the Future for each batch into a big future, and the wrapper future success when last batch returned 0 size list? My situation is a little from this post, the next() there is synchronous call while my is also async.
Or is it ever possible to do what I want? Next batch will only be fetched when the previous one is resolved in the end whether to fetch the next batch depends on the size returned?
What's the best way to walk through this type of data sources? Are there any existing Scala frameworks that provide the feature I am looking for? Is play's Iteratee, Enumerator, Enumeratee the right tool? If so, can anyone provide an example on how to use those facilities to implement what I am looking for?
Edit----
With help from chunjef, I had just tried out. And it actually did work out for me. However, there was some small change I made based on his answer.
Source.fromIterator(()=>Iterator.continually(source.getData())).mapAsync(1) (f=>f.filter(_.size > 0))
.via(Flow[List[Int]].takeWhile(_.nonEmpty))
.runForeach(println)
However, can someone give comparison between Akka Stream and Play Iteratee? Does it worth me also try out Iteratee?
Code snip 1:
Source.fromIterator(() => Iterator.continually(ds.getData)) // line 1
.mapAsync(1)(identity) // line 2
.takeWhile(_.nonEmpty) // line 3
.runForeach(println) // line 4
Code snip 2: Assuming the getData depends on some other output of another flow, and I would like to concat it with the below flow. However, it yield too many files open error. Not sure what would cause this error, the mapAsync has been limited to 1 as its throughput if I understood correctly.
Flow[Int].mapConcat[Future[List[Int]]](c => {
Iterator.continually(ds.getData(c)).to[collection.immutable.Iterable]
}).mapAsync(1)(identity).takeWhile(_.nonEmpty).runForeach(println)
The following is one way to achieve the same behavior with Akka Streams, using your DataSource class:
import scala.concurrent.Future
import scala.util.Random
import akka.actor.ActorSystem
import akka.stream._
import akka.stream.scaladsl._
object StreamsExample extends App {
implicit val system = ActorSystem("Sandbox")
implicit val materializer = ActorMaterializer()
val ds = new DataSource(100)
Source.fromIterator(() => Iterator.continually(ds.getData)) // line 1
.mapAsync(1)(identity) // line 2
.takeWhile(_.nonEmpty) // line 3
.runForeach(println) // line 4
}
class DataSource(size: Int) {
...
}
A simplified line-by-line overview:
line 1: Creates a stream source that continually calls ds.getData if there is downstream demand.
line 2: mapAsync is a way to deal with stream elements that are Futures. In this case, the stream elements are of type Future[List[Int]]. The argument 1 is the level of parallelism: we specify 1 here because DataSource internally uses a mutable variable, and a parallelism level greater than one could produce unexpected results. identity is shorthand for x => x, which basically means that for each Future, we pass its result downstream without transforming it.
line 3: Essentially, ds.getData is called as long as the result of the Future is a non-empty List[Int]. If an empty List is encountered, processing is terminated.
line 4: runForeach here takes a function List[Int] => Unit and invokes that function for each stream element.
Ideally, I would like to wrap the Future for each batch into a big future, and the wrapper future success when last batch returned 0 size list?
I think you are looking for a Promise.
You would set up a Promise before you start the first iteration.
This gives you promise.future, a Future that you can then use to follow the completion of everything.
In your onComplete, you add a case _ => promise.success().
Something like
def loopUntilDone(f: (List[Int]) => Unit): Future[Unit] = {
val promise = Promise[Unit]
def next(): Unit = source.getData().onComplete {
case Success(v) =>
f(v)
v match {
case h :: t => next()
case _ => promise.success()
}
case Failure(e) => promise.failure(e)
}
// get going
next(f)
// return the Future for everything
promise.future
}
// future for everything, this is a `Future[Unit]`
// its `onComplete` will be triggered when there is no more data
val everything = loopUntilDone(process)
You are probably looking for a reactive streams library. My personal favorite (and one I'm most familiar with) is Monix. This is how it will work with DataSource unchanged
import scala.concurrent.duration.Duration
import scala.concurrent.Await
import monix.reactive.Observable
import monix.execution.Scheduler.Implicits.global
object Test extends App {
val source = new DataSource(100)
val completed = // <- this is Future[Unit], completes when foreach is done
Observable.repeat(Observable.fromFuture(source.getData()))
.flatten // <- Here it's Observable[List[Int]], it has collection-like methods
.takeWhile(_.nonEmpty)
.foreach(println)
Await.result(completed, Duration.Inf)
}
I just figured out that by using flatMapConcat can achieve what I wanted to achieve. There is no point to start another question as I have had the answer already. Put my sample code here just in case someone is looking for similar answer.
This type of API is very common for some integration between traditional Enterprise applications. The DataSource is to mock the API while the object App is to demonstrate how the client code can utilize Akka Stream to consume the APIs.
In my small project the API was provided in SOAP, and I used scalaxb to transform the SOAP to Scala async style. And with the client calls demonstrated in the object App, we can consume the API with AKKA Stream. Thanks for all for the help.
class DataSource(size: Int) {
private var transactionId: Long = 0
private val transactionCursorMap: mutable.HashMap[TransactionId, Set[ReadCursorId]] = mutable.HashMap.empty
private val cursorIteratorMap: mutable.HashMap[ReadCursorId, Iterator[List[Int]]] = mutable.HashMap.empty
implicit val g = scala.concurrent.ExecutionContext.global
case class TransactionId(id: Long)
case class ReadCursorId(id: Long)
def startTransaction(): Future[TransactionId] = {
Future {
synchronized {
transactionId += transactionId
}
val t = TransactionId(transactionId)
transactionCursorMap.update(t, Set(ReadCursorId(0)))
t
}
}
def createCursorId(t: TransactionId): ReadCursorId = {
synchronized {
val c = transactionCursorMap.getOrElseUpdate(t, Set(ReadCursorId(0)))
val currentId = c.foldLeft(0l) { (acc, a) => acc.max(a.id) }
val cId = ReadCursorId(currentId + 1)
transactionCursorMap.update(t, c + cId)
cursorIteratorMap.put(cId, createIterator)
cId
}
}
def createIterator(): Iterator[List[Int]] = {
(for {i <- 1 to 100} yield List.fill(100)(i)).toIterator
}
def startRead(t: TransactionId): Future[ReadCursorId] = {
Future {
createCursorId(t)
}
}
def getData(cursorId: ReadCursorId): Future[List[Int]] = {
synchronized {
Future {
Thread.sleep(Random.nextInt(100))
cursorIteratorMap.get(cursorId) match {
case Some(i) => i.next()
case _ => List()
}
}
}
}
}
object Test extends App {
val source = new DataSource(10)
implicit val system = ActorSystem("Sandbox")
implicit val materializer = ActorMaterializer()
implicit val g = scala.concurrent.ExecutionContext.global
//
// def process(v: List[Int]): Unit = {
// println(v)
// }
//
// def next(f: (List[Int]) => Unit): Unit = {
// val fut = source.getData()
// fut.onComplete {
// case Success(v) => {
// f(v)
// v match {
//
// case h :: t => next(f)
//
// }
// }
//
// }
//
// }
//
// next(process)
//
// Thread.sleep(1000000000)
val s = Source.fromFuture(source.startTransaction())
.map { e =>
source.startRead(e)
}
.mapAsync(1)(identity)
.flatMapConcat(
e => {
Source.fromIterator(() => Iterator.continually(source.getData(e)))
})
.mapAsync(5)(identity)
.via(Flow[List[Int]].takeWhile(_.nonEmpty))
.runForeach(println)
/*
val done = Source.fromIterator(() => Iterator.continually(source.getData())).mapAsync(1)(identity)
.via(Flow[List[Int]].takeWhile(_.nonEmpty))
.runFold(List[List[Int]]()) { (acc, r) =>
// println("=======" + acc + r)
r :: acc
}
done.onSuccess {
case e => {
e.foreach(println)
}
}
done.onComplete(_ => system.terminate())
*/
}

Choosing between two Action.async blocks, why can't I specify Action.async in the caller?

On the first line below, I want to know if there's anything I can or should put to the left of the = that will future-proof against bad editing. I had one Action.async, but now I want to have two choices (run a single MongoDB query or run multiple MongoDB queries) to arrive at the answer to the web query q. My code compiles as-is, but I think I should have to put something before the = to make the code more type-safe.
def q(arg: String) =
if (wantMultipleQueriesByDataSource)
runMultipleQueriesByDataSource(arg)
else
runSingleQuery(arg)
def runSingleQuery(arg: String) = Action.async {
if (oDb.isDefined) {
val collection: MongoCollection[Document] = oDb.get.getCollection(collectionName)
val fut = getMapData(collection, arg)
fut.map { docs: Seq[Document] =>
val docsSources = List(DocsSource(docs, "*"))
val pickedDocs = pickDocs(args, docsSources)
Ok(buildQueryAnswer(pickedDocs))
} recover {
case e => BadRequest("FAIL(rect): " + e.getMessage)
}
}
else Future.successful(Ok(buildQueryAnswer(Nil)))
}
def runMultipleQueriesByDataSource(arg: String) = Action.async {
if (oDb.isDefined) {
val collection: MongoCollection[Document] = oDb.get.getCollection(collectionName)
val dataSources = List("apples", "bananas", "cherries")
val futureCollections = dataSources map { getMapDataByDataSource(collection, _, arg) }
Future.sequence(futureCollections) map {
case docsGroupedByDataSource: Seq[Seq[Document]] =>
val docsSources = (docsGroupedByDataSource zip dataSources) map {x => DocsSource(x._1, x._2)}
val pickedDocs = pickDocs(args, docsSources)
Ok(buildQueryAnswer(pickedDocs))
case _ => Ok(buildQueryAnswer(Nil))
} recover {
case e => BadRequest("FAIL(rect): " + e.getMessage)
}
}
else Future.successful(Ok(buildQueryAnswer(Nil)))
}
You don't need to because Scala has type inference in this case. But if you want, you can change the method signature to the following:
def q(arg: String): Action[AnyContent]
I'm assuming here that you are using Playframework.
Edit: In fact, in this case, Action.async return an Action[AnyContent] and it receives a block that returns a Future[Result].

Finally closing stream using Scala exception catching

Anybody know a solution to this problem ? I rewrote try catch finally construct to a functional way of doing things, but I can't close the stream now :-)
import scala.util.control.Exception._
def gunzip() = {
logger.info(s"Gunziping file ${f.getAbsolutePath}")
catching(classOf[IOException], classOf[FileNotFoundException]).
andFinally(println("how can I close the stream ?")).
either ({
val is = new GZIPInputStream(new FileInputStream(f))
Stream.continually(is.read()).takeWhile(-1 !=).map(_.toByte).toArray
}) match {
case Left(e) =>
val msg = s"IO error reading file ${f.getAbsolutePath} ! on host ${Setup.smtpHost}"
logger.error(msg, e)
MailClient.send(msg, msg)
new Array[Byte](0)
case Right(v) => v
}
}
I rewrote it based on Senia's solution like this :
def gunzip() = {
logger.info(s"Gunziping file ${file.getAbsolutePath}")
def closeAfterReading(c: InputStream)(f: InputStream => Array[Byte]) = {
catching(classOf[IOException], classOf[FileNotFoundException])
.andFinally(c.close())
.either(f(c)) match {
case Left(e) => {
val msg = s"IO error reading file ${file.getAbsolutePath} ! on host ${Setup.smtpHost}"
logger.error(msg, e)
new Array[Byte](0)
}
case Right(v) => v
}
}
closeAfterReading(new GZIPInputStream(new FileInputStream(file))) { is =>
Stream.continually(is.read()).takeWhile(-1 !=).map(_.toByte).toArray
}
}
I prefer this construction for such cases:
def withCloseable[T <: Closeable, R](t: T)(f: T => R): R = {
allCatch.andFinally{t.close} apply { f(t) }
}
def read(f: File) =
withCloseable(new GZIPInputStream(new FileInputStream(f))) { is =>
Stream.continually(is.read()).takeWhile(-1 !=).map(_.toByte).toArray
}
Now you could wrap it with Try and recover on some exceptions:
val result =
Try { read(f) }.recover{
case e: IOException => recover(e) // logging, default value
case e: FileNotFoundException => recover(e)
}
val array = result.get // Exception here!
take "scala-arm"
take Apache "commons-io"
then do the following
val result =
for {fis <- resource.managed(new FileInputStream(f))
gis <- resource.managed(new GZIPInputStream(fis))}
yield IOUtils.toString(gis, "UTF-8")
result.acquireFor(identity) fold (reportExceptions _, v => v)
One way how to handle it would be to use a mutable list of things that are opened and need to be closed later:
val cs: Buffer[Closeable] = new ArrayBuffer();
def addClose[C <: Closeable](c: C) = { cs += c; c; }
catching(classOf[IOException], classOf[FileNotFoundException]).
andFinally({ cs.foreach(_.close()) }).
either ({
val is = addClose(new GZIPInputStream(new FileInputStream(f)))
Stream.continually(is.read()).takeWhile(-1 !=).map(_.toByte).toArray
}) // ...
Update: You could use scala-conduit library (I'm the author) for this purpose. (The library is currently not considered production ready.) The main aim of pipes (AKA conduids) is to construct composable components with well defined resource handling. Each pipe repeatedly receives input and produces input. Optionally, it also produces a final result when it finishes. Pips has finalizers that are run after a pipe finishes - either on its own or when its downstream pipe finishes. Your example could be reworked (using Java NIO) as follows:
/**
* Filters buffers until a given character is found. The last buffer
* (truncated up to the character) is also included.
*/
def untilPipe(c: Byte): Pipe[ByteBuffer,ByteBuffer,Unit] = ...
// Create a new source that chunks a file as ByteBuffer's.
// (Note that the buffer changes on every step.)
val source: Source[ByteBuffer,Unit] = ...
// Sink that prints bytes to the standard output.
// You would create your own sink doing whatever you want.
val sink: Sink[ByteBuffer,Unit]
= NIO.writeChannel(Channels.newChannel(System.out));
runPipe(source >-> untilPipe(-1) >-> sink);
As soon as untilPipe(-1) finds -1 and finishes, its upstream source pipe's finalizer is run and the input is closed. If an exception occurs anywhere in the pipeline, the input is closed as well.
The full example can be found here.
I have another one proposition for cases when a closeble object like java.io.Socket may run for a long time, so one have to wrap it in Future. This is handful when you also control a timeout, when Socket is not responding.
object CloseableFuture {
type Closeable = {
def close(): Unit
}
private def withClose[T, F1 <: Closeable](f: => F1, andThen: F1 => Future[T]): Future[T] = future(f).flatMap(closeable => {
val internal = andThen(closeable)
internal.onComplete(_ => closeable.close())
internal
})
def apply[T, F1 <: Closeable](f: => F1, andThen: F1 => T): Future[T] =
withClose(f, {c: F1 => future(andThen(c))})
def apply[T, F1 <: Closeable, F2 <: Closeable](f1: => F1, thenF2: F1 => F2, andThen: (F1,F2) => T): Future [T] =
withClose(f1, {c1:F1 => CloseableFuture(thenF2(c1), {c2:F2 => andThen(c1,c2)})})
}
After I open a java.io.Socket and java.io.InputStream for it, and then execute code which reads from WhoisServer, I got both of them closed finally. Full code:
CloseableFuture(
{new Socket(server.address, WhoisPort)},
(s: Socket) => s.getInputStream,
(socket: Socket, inputStream: InputStream) => {
val streamReader = new InputStreamReader(inputStream)
val bufferReader = new BufferedReader(streamReader)
val outputStream = socket.getOutputStream
val writer = new OutputStreamWriter(outputStream)
val bufferWriter = new BufferedWriter(writer)
bufferWriter.write(urlToAsk+System.getProperty("line.separator"))
bufferWriter.flush()
def readBuffer(acc: List[String]): List[String] = bufferReader.readLine() match {
case null => acc
case str => {
readBuffer(str :: acc)
}
}
val result = readBuffer(Nil).reverse.mkString("\r\n")
WhoisResult(urlToAsk, result)
}
)

"using" function

I've defined 'using' function as following:
def using[A, B <: {def close(): Unit}] (closeable: B) (f: B => A): A =
try { f(closeable) } finally { closeable.close() }
I can use it like that:
using(new PrintWriter("sample.txt")){ out =>
out.println("hellow world!")
}
now I'm curious how to define 'using' function to take any number of parameters, and be able to access them separately:
using(new BufferedReader(new FileReader("in.txt")), new PrintWriter("out.txt")){ (in, out) =>
out.println(in.readLIne)
}
Starting Scala 2.13, the standard library provides a dedicated resource management utility: Using.
More specifically, the Using#Manager can be used when dealing with several resources.
In our case, we can manage different resources such as your PrintWriter or BufferedReader as they both implement AutoCloseable, in order to read and write from a file to another and, no matter what, close both the input and the output resource afterwards:
import scala.util.Using
import java.io.{PrintWriter, BufferedReader, FileReader}
Using.Manager { use =>
val in = use(new BufferedReader(new FileReader("input.txt")))
val out = use(new PrintWriter("output.txt"))
out.println(in.readLine)
}
// scala.util.Try[Unit] = Success(())
Someone has already done this—it's called Scala ARM.
From the readme:
import resource._
for(input <- managed(new FileInputStream("test.txt")) {
// Code that uses the input as a FileInputStream
}
I've been thinking about this and I thought maybe there was an other way to address this. Here is my take on supporting "any number" of parameters (limited by what tuples provide):
object UsingTest {
type Closeable = {def close():Unit }
final class CloseAfter[A<:Product](val x: A) {
def closeAfter[B](block: A=>B): B = {
try {
block(x);
} finally {
for (i <- 0 until x.productArity) {
x.productElement(i) match {
case c:Closeable => println("closing " + c); c.close()
case _ =>
}
}
}
}
}
implicit def any2CloseAfter[A<:Product](x: A): CloseAfter[A] =
new CloseAfter(x)
def main(args:Array[String]): Unit = {
import java.io._
(new BufferedReader(new FileReader("in.txt")),
new PrintWriter("out.txt"),
new PrintWriter("sample.txt")) closeAfter {case (in, out, other) =>
out.println(in.readLine)
other.println("hello world!")
}
}
}
I think I'm reusing the fact that 22 tuple/product classes have been written in the library... I don't think this syntax is clearer than using nested using (no pun intended), but it was an interesting puzzle.
using structural typing seems like a little overkill since java.lang.AutoCloseable is predestined for usage:
def using[A <: AutoCloseable, B](resource: A)(block: A => B): B =
try block(resource) finally resource.close()
or, if you prefer extension methods:
implicit class UsingExtension[A <: AutoCloseable](val resource: A) extends AnyVal {
def using[B](block: A => B): B = try block(resource) finally resource.close()
}
using2 is possible:
def using2[R1 <: AutoCloseable, R2 <: AutoCloseable, B](resource1: R1, resource2: R2)(block: (R1, R2) => B): B =
using(resource1) { _ =>
using(resource2) { _ =>
block(resource1, resource2)
}
}
but imho quite ugly - I would prefer to simply nest these using statements in the client code.
Unfortunately, there isn't support for arbitrary-length parameter lists with arbitrary types in standard Scala.
You might be able to do something like this with a couple of language changes (to allow variable parameter lists to be passed as HLists; see here for about 1/3 of what's needed).
Right now, the best thing to do is just do what Tuple and Function do: implement usingN for as many N as you need.
Two is easy enough, of course:
def using2[A, B <: {def close(): Unit}, C <: { def close(): Unit}](closeB: B, closeC: C)(f: (B,C) => A): A = {
try { f(closeB,closeC) } finally { closeB.close(); closeC.close() }
}
If you need more, it's probably worth writing something that'll generate the source code.
Here is an example that allows you to use the scala for comprehension as an automatic resource management block for any item that is a java.io.Closeable, but it could easily be expanded to work for any object with a close method.
This usage seems pretty close to the using statement and allows you to easily have as many resources defined in one block as you want.
object ResourceTest{
import CloseableResource._
import java.io._
def test(){
for( input <- new BufferedReader(new FileReader("/tmp/input.txt")); output <- new FileWriter("/tmp/output.txt") ){
output.write(input.readLine)
}
}
}
class CloseableResource[T](resource: =>T,onClose: T=>Unit){
def foreach(f: T=>Unit){
val r = resource
try{
f(r)
}
finally{
try{
onClose(r)
}
catch{
case e =>
println("error closing resource")
e.printStackTrace
}
}
}
}
object CloseableResource{
implicit def javaCloseableToCloseableResource[T <: java.io.Closeable](resource:T):CloseableResource[T] = new CloseableResource[T](resource,{_.close})
}
It is a good idea to detatch the cleanup algorithm from the program path.
This solution lets you accumulate closeables in a scope.
The scope cleanup will happen on after the block is executed, or the scope can be detached. The cleaning of the scope can then be done later.
This way we get the same convenience whitout being limited to single thread programming.
The utility class:
import java.io.Closeable
object ManagedScope {
val scope=new ThreadLocal[Scope]();
def managedScope[T](inner: =>T):T={
val previous=scope.get();
val thisScope=new Scope();
scope.set(thisScope);
try{
inner
} finally {
scope.set(previous);
if(!thisScope.detatched) thisScope.close();
}
}
def closeLater[T <: Closeable](what:T): T = {
val theScope=scope.get();
if(!(theScope eq null)){
theScope.closeables=theScope.closeables.:+(what);
}
what;
}
def detatchScope(): Scope={
val theScope=scope.get();
if(theScope eq null) null;
else {
theScope.detatched=true;
theScope;
}
}
}
class Scope{
var detatched=false;
var closeables:List[Closeable]=List();
def close():Unit={
for(c<-closeables){
try{
if(!(c eq null))c.close();
} catch{
case e:Throwable=>{};
}
}
}
}
The usage:
def checkSocketConnect(host:String, portNumber:Int):Unit = managedScope {
// The close later function tags the closeable to be closed later
val socket = closeLater( new Socket(host, portNumber) );
doWork(socket);
}
def checkFutureConnect(host:String, portNumber:Int):Unit = managedScope {
// The close later function tags the closeable to be closed later
val socket = closeLater( new Socket(host, portNumber) );
val future:Future[Boolean]=doAsyncWork(socket);
// Detatch the scope and use it in the future.
val scope=detatchScope();
future.onComplete(v=>scope.close());
}
This solution doesn't quite have the syntax you desire, but I think it's close enough :)
def using[A <: {def close(): Unit}, B](resources: List[A])(f: List[A] => B): B =
try f(resources) finally resources.foreach(_.close())
using(List(new BufferedReader(new FileReader("in.txt")), new PrintWriter("out.txt"))) {
case List(in: BufferedReader, out: PrintWriter) => out.println(in.readLine())
}
Of course the down side is you have to type out the types BufferedReader and PrintWrter in the using block. You might be able to add some magic so that you just need List(in, out) by using multiple ORed type bounds for type A in using.
By defining some pretty hacky and dangerous implicit conversions you can get around having to type List (and another way to get around specifying types for the resources), but I haven't documented the detail as it's too dangerous IMO.
here is my solution to the resource management in Scala:
def withResources[T <: AutoCloseable, V](r: => T)(f: T => V): V = {
val resource: T = r
require(resource != null, "resource is null")
var exception: Throwable = null
try {
f(resource)
} catch {
case NonFatal(e) =>
exception = e
throw e
} finally {
closeAndAddSuppressed(exception, resource)
}
}
private def closeAndAddSuppressed(e: Throwable,
resource: AutoCloseable): Unit = {
if (e != null) {
try {
resource.close()
} catch {
case NonFatal(suppressed) =>
e.addSuppressed(suppressed)
}
} else {
resource.close()
}
}
I used this in multiple Scala apps including managing resources in Spark executors. and one should be aware that we are other even better ways to manage resource like in CatsIO: https://typelevel.org/cats-effect/datatypes/resource.html. if you are ok with pure FP in Scala.
to answer your last question, you can definitely nest the resource like this:
withResource(r: File)(
r => {
withResource(a: File)(
anotherR => {
withResource(...)(...)
}
)
}
)
this way, not just that those resources are protected from leaking, they will also be released in the correct order(like stack). same behaviour like the Resource Monad from CatsIO.