Matching cardinality of varargs when passing parameters - scala

A simple function that accepts a File and a function that will be passed a PrintWriter for that file:
def printToFile(f: java.io.File)(op: java.io.PrintWriter => Unit) {
val p = new java.io.PrintWriter(f)
try { op(p) } finally { p.close() }
}
How to generalise this to any number of Files, while just passing the resulting PrintWriters to one function? I want to make the decision as to which PrintWriter to use in the client function.
I want a signature similar to (pseudocode):
def printToFile(f: java.io.File*)(op: (java.io.PrintWriter*) => Unit)
Here's how I'd like to write my client function:
printToFile(new File("file1.txt"), new File("file2.txt"), new File("file3.txt")) {
(file1PrintWriter, file2PrintWriter, file3PrintWriter) =>
// do stuff, decide which PrintWriter to write to
}
The cardinality of both starred (*) parameter types should be the same.
Importantly, I want the client function to be able to declare the PrintWriter variables it receives and not just have a Seq[PrintWriter] or similar to deal with.

It looks like you need to create as many PrintWriter instances as there are arguments. The following should work.
import java.io._
object Foo {
def printToFile(files: File*)(op: (PrintWriter*) => Unit) {
val printers: Seq[PrintWriter] = files.map(file => new PrintWriter(file))
try {
op(printers :_*)
} finally {
printers.foreach(_.close)
}
}
}
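The answer above hands the client a sequence of writers. As a minimal sketch of how a client might still bind each writer to its own name, assuming a Seq-based callback instead of the varargs function type, the caller can destructure the Seq with a pattern-matching function literal. The names MultiPrint, printToFiles and readAll are hypothetical, and the arity is only checked at run time (a mismatch throws a MatchError).

```scala
import java.io.{File, PrintWriter}
import scala.io.Source

object MultiPrint {
  // Hypothetical variant: the callback receives a Seq, which the caller destructures.
  def printToFiles(files: File*)(op: Seq[PrintWriter] => Unit): Unit = {
    val writers = files.map(f => new PrintWriter(f))
    try op(writers) finally writers.foreach(_.close())
  }

  // Helper for the demo: read a whole file back as a String.
  def readAll(f: File): String = {
    val src = Source.fromFile(f)
    try src.mkString finally src.close()
  }

  def demo(): (String, String) = {
    val f1 = File.createTempFile("file1", ".txt")
    val f2 = File.createTempFile("file2", ".txt")
    // Each writer gets its own name; the pattern must match the number of files.
    printToFiles(f1, f2) { case Seq(w1, w2) =>
      w1.print("one")
      w2.print("two")
    }
    (readAll(f1), readAll(f2))
  }
}
```

This keeps the one-function-for-all-writers shape from the question, at the cost of moving the cardinality check from compile time to run time.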

Related

Read file in Scala : Stream closed

I try to read a file in scala like this:
def parseFile(filename: String) = {
val source = scala.io.Source.fromFile(filename)
try {
val lines = source.getLines().map(line => line.trim.toDouble)
return lines
} catch {
// re-throw the exception, but make sure the source is closed
case t: Throwable => {
println("error during parsing of file")
throw t
}
} finally {
source.close()
}
}
When I access the result later, I get an
java.io.IOException: Stream Closed
I understand that this arises because source.getLines() only returns an (lazy) Iterator[String], and I already close the BufferedSource in the finally clause.
How can I avoid this error, i.e. how can I "evaluate" the Stream before closing the source?
EDIT: I tried to call source.getLines().toSeq which did not help.
You could try the following solution, which makes the code more functional and takes advantage of lazy evaluation.
First, define a helper function, using, which takes care of opening and closing the file.
def using[A <: {def close() : Unit}, B](param: A)(f: A => B): B =
try f(param) finally param.close()
Then, you can refactor your code in functional programming style:
using(Source.fromFile(filename)) {
source =>
val lines = Try(source.getLines().map(line => line.trim.toDouble))
val result = lines.flatMap(l => Try(processOrDoWhatYouWantForLines(l)))
result.get
}
Actually, the using function can be used for handling all resources which need to be closed at the end of the operation.
A List is strict (not lazy), so change:
val lines = source.getLines().map(line => line.trim.toDouble)
to
val lines = source.getLines().toList.map(line => line.trim.toDouble)
in order to force computing.
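A minimal sketch of the complete function with that change applied, assuming the file holds one number per line. The object name ParseDoubles is hypothetical. The call to toList runs inside the try, so every line is read and parsed before the finally block closes the source. As far as I know, in Scala versions before 2.13 calling toSeq on an Iterator produced a lazy Stream, which would explain why toSeq did not help in the question.

```scala
import scala.io.Source

object ParseDoubles {
  // Reads and parses every line eagerly, then closes the source.
  def parseFile(filename: String): List[Double] = {
    val source = Source.fromFile(filename)
    try source.getLines().map(_.trim.toDouble).toList
    finally source.close()
  }
}
```

A parse failure still propagates as a NumberFormatException, and the source is closed either way.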

Is it possible to implement (without macros) an implicit zip/unzip in Scala to achieve a sneaky fluent or lazy pattern?

Edit2
Okay, so maybe I should parse out two desires here.
I had in mind that when it came time to get to the setSendTimeout(0) part, I would be using something like implicitly[Socket].
new ZContext(1) {
createSocket(ZMQ.PUB).setSendTimeout(0).//RATS!
}
I also had in mind a more generic approach to it, that would be (in pseudo code terms):
This is how you wrap a reference of T at a point in time without copying it, so that moving forward, you can tease out state of the reference after potential state changes from the value of whatever expression used it.
If it could be thought of as a chain of map map map s from T to where ever it ended up, then it is easy to append / apply a value onto it - just map again...
This is the motivating example.
override def createLogic(inheritedAttributes: Attributes): GraphStageLogic = new GraphStageLogic(shape) {
logger.info("Initializing ZMQ context.")
val context = new ZContext(1)
logger.info(s"Binding PUB socket to ${endpoint}")
val socket = {
val s = context.createSocket(ZMQ.PUB)
s.setSendTimeOut(0)
s.bind(endpoint)
s
}
Look at socket down there. For some reason that feels uglier than it needs to be to me, but is a consequence of the fact that setters don't return stuff like setSendTimeOut.
I would normally try to improve it as follows:
new ZContext(1) {
createSocket(ZMQ.PUB).setSendTimeout(0).//RATS!
}
Here is a version of @Dima's answer. Again the setup:
trait Instance {
def createPort(): Port
}
trait Port {
def makeSpecial(): Unit
def bindTo(address: Any): Unit
}
trait Provider {
def getTheInstance(i: Int): Instance
}
Now the trick:
implicit class InstanceOps(i: Instance) {
def withCreatePort(fun: (Unit => Port) => Any): Port = {
val res = i.createPort()
fun(_ => res)
res
}
}
And if you add an implicit modifier to the argument of the function passed into withCreatePort, you "import" the implicit conversion:
trait ConnectTest extends Provider {
getTheInstance(2).withCreatePort { implicit p =>
().makeSpecial().bindTo("foo")
}
}
This is potentially more dangerous, because you have an implicit conversion from Unit to Port, although it is locally encapsulated. This is generic because Connect is generic.
This trick is perhaps too clever, and may be difficult to understand for an outsider reading your code.
Yes, you can create two wrappers, one giving you withCreatePort, the other giving you variants of the port method that return this:
trait Instance {
def createPort(): Port
}
trait Port {
def makeSpecial(): Unit
def bindTo(address: Any): Unit
}
class PortOps(p: Port) {
def makeSpecial() : this.type = { p.makeSpecial() ; this }
def bindTo(address: Any): this.type = { p.bindTo(address); this }
}
implicit class InstanceOps(i: Instance) {
def withCreatePort[A](fun: PortOps => A): A = fun(new PortOps(i.createPort()))
}
Example:
trait Provider {
def getTheInstance(i: Int): Instance
}
trait Plain extends Provider {
val instance = getTheInstance(2)
val port = instance.createPort()
port.makeSpecial()
port.bindTo("foo")
}
trait Rich extends Provider {
getTheInstance(2).withCreatePort { p =>
p.makeSpecial().bindTo("foo")
}
}
The question is if the effort is worth it. You can also experiment with import:
trait Import extends Provider {
val instance = getTheInstance(2)
val port = instance.createPort()
locally {
import port._
makeSpecial(); bindTo("foo")
}
}
I am not sure where you are going with this Zipped thingy ... But what you described in the beginning of your question (assuming that port in the end of that snippet is a typo, and you really meant to return instance) can be done with something like this:
object Taps {
implicit class Tap[T](t: T) extends AnyVal {
def tap(f: T => Unit) = { f(t); t }
}
}
Then you can write:
import Taps._
val instance = getTheInstance(2).tap {
_.createPort
.makeSpecial
.bindTo(...)
}
Is this what you are looking for?
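Because the setters in the question return Unit, they cannot be chained directly after one another; a sketch that calls tap once per setter sidesteps this. The Port class below is a hypothetical stand-in for a socket-like object, and TapDemo and configured are illustrative names only.

```scala
object TapDemo {
  // Hypothetical stand-in for a socket-like class whose setters return Unit.
  final class Port {
    var special: Boolean = false
    var address: Option[String] = None
    def makeSpecial(): Unit = { special = true }
    def bindTo(a: String): Unit = { address = Some(a) }
  }

  // Same idea as the Tap class above: run a side effect, then return the receiver.
  implicit class Tap[T](private val t: T) extends AnyVal {
    def tap(f: T => Unit): T = { f(t); t }
  }

  // Each tap returns the Port itself, so the Unit-returning setters can be chained.
  def configured(): Port =
    new Port().tap(_.makeSpecial()).tap(_.bindTo("foo"))
}
```

From Scala 2.13 on, scala.util.chaining provides a comparable tap method in the standard library.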

sys.process to wrap a process as a function

I have an external process that I would like to treat as a
function from String=>String. Given a line of input, it will respond with a single line of output. It seems that I should use
scala.sys.process, which is clearly an elegant library that makes many
shell operations easily accessible from within scala. However, I
can't figure out how to perform this simple use case.
If I write a single line to the process' stdin, it prints the result
in a single line. How can I use sys.process to create a wrapper so I
can use the process interactively? For example, if I had an
implementation for ProcessWrapper, here is a program and its output:
// abstract definition
class ProcessWrapper(executable: String) {
def apply(line: String): String
}
// program using an implementation
val process = new ProcessWrapper("cat -b")
println(process("foo"))
println(process("bar"))
println(process("baz"))
Output:
1 foo
2 bar
3 baz
It is important that the process is not reloaded for each call to process because there is a significant initialization step.
So - after my comment - this would be my solution
import java.io.BufferedReader
import java.io.File
import java.io.InputStream
import java.io.InputStreamReader
import scala.annotation.tailrec
class ProcessWrapper(cmdLine: String, lineListenerOut: String => Unit, lineListenerErr: String => Unit,
finishHandler: => Unit,
lineMode: Boolean = true, envp: Array[String] = null, dir: File = null) {
class StreamRunnable(val stream: InputStream, listener: String => Unit) extends Runnable {
def run() {
try {
val in = new BufferedReader(new InputStreamReader(this.stream));
@tailrec
def readLines {
val line = in.readLine
if (line != null) {
listener(line)
readLines
}
}
readLines
}
finally {
this.stream.close
finishHandler
}
}
}
val process = Runtime.getRuntime().exec(cmdLine, envp, dir);
val outThread = new Thread(new StreamRunnable(process.getInputStream, lineListenerOut), "StreamHandlerOut")
val errThread = new Thread(new StreamRunnable(process.getErrorStream, lineListenerErr), "StreamHandlerErr")
val sendToProcess = process.getOutputStream
outThread.start
errThread.start
def apply(txt: String) {
sendToProcess.write(txt.getBytes)
if (lineMode)
sendToProcess.write('\n')
sendToProcess.flush
}
}
object ProcessWrapper {
def main(args: Array[String]) {
val process = new ProcessWrapper("python -i", txt => println("py> " + txt),
err => System.err.println("py err> " + err), System.exit(0))
while (true) {
process(readLine)
}
}
}
The main part is the StreamRunnable, where the process is read in a thread and the received line is passed on to a "LineListener" (a simple String => Unit - function).
The main is just a sample implementation - calling python ;)
I'm not sure, but do you want something like this?
case class ProcessWrapper(executable: String) {
import java.io.ByteArrayOutputStream
import scala.concurrent.duration.Duration
import java.util.concurrent.TimeUnit
lazy val process = sys.runtime.exec(executable)
def apply(line: String, blockedRead: Boolean = true): String = {
process.getOutputStream().write(line.getBytes())
process.getOutputStream().flush()
val r = new ByteArrayOutputStream
if (blockedRead) {
r.write(process.getInputStream().read())
}
while (process.getInputStream().available() > 0) {
r.write(process.getInputStream().read())
}
r.toString()
}
def close() = process.destroy()
}
val process = ProcessWrapper("cat -b")
println(process("foo\n"))
println(process("bar\n"))
println(process("baz\n"))
println(process("buz\n"))
println(process("puz\n"))
process.close
Result :
1 foo
2 bar
3 baz
4 buz
5 puz
I think that PlayCLI is a better way.
I came across http://blog.greweb.fr/2013/01/playcli-play-iteratees-unix-pipe/ today, and it looks like exactly what you want.
How about using an Akka actor. The actor can have state and thus a reference to an open program (in a thread). You can send messages to that actor.
ProcessWrapper might be a typed actor itself or just something that converts the calls of a function to a call of an actor. If you only have 'process' as method name, then wrapper ! "message" would be enough.
Having a program open and ready to receive commands sounds like an actor that receives messages.
Edit: Probably I got the requirements wrong. You want to send multiple lines to the same process. That's not possible with the below solution.
One possibility would be to add an extension method to the ProcessBuilder that allows for taking the input from a string:
implicit class ProcessBuilderWithStringInput(val builder: ProcessBuilder) extends AnyVal {
// TODO: could use an implicit for the character set
def #<<(s: String) = builder.#<(new ByteArrayInputStream(s.getBytes))
}
You can now use the method like this:
scala> ("bc":ProcessBuilder).#<<("3 + 4\n").!!
res9: String =
"7
"
Note that the type annotation is necessary because we need two conversions (String -> ProcessBuilder -> ProcessBuilderWithStringInput), and Scala will only apply one conversion automatically.

Why is a return statement required to allow this while statement to be evaluated properly?

Why is a return statement required to allow this while statement to be
evaluated properly? The following statement allows
import java.io.File
import java.io.FileInputStream
import java.io.InputStream
import java.io.BufferedReader
import java.io.InputStreamReader
trait Closeable {
def close ()
}
trait ManagedCloseable extends Closeable {
def use (code: () => Unit) {
try {
code()
}
finally {
this.close()
}
}
}
class CloseableInputStream (stream: InputStream)
extends InputStream with ManagedCloseable {
def read = stream.read
}
object autoclose extends App {
implicit def inputStreamToClosable (stream: InputStream):
CloseableInputStream = new CloseableInputStream(stream)
override
def main (args: Array[String]) {
val test = new FileInputStream(new File("test.txt"))
test use {
val reader = new BufferedReader(new InputStreamReader(test))
var input: String = reader.readLine
while (input != null) {
println(input)
input = reader.readLine
}
}
}
}
This produces the following error from scalac:
autoclose.scala:40: error: type mismatch;
found : Unit
required: () => Unit
while (input != null) {
^
one error found
It appears that it's attempting to treat the block following the use as an
inline statement rather than a lambda, but I'm not exactly sure why. Adding
return after the while alleviates the error:
test use {
val reader = new BufferedReader(new InputStreamReader(test))
var input: String = reader.readLine
while (input != null) {
println(input)
input = reader.readLine
}
return
}
And the application runs as expected. Can anyone describe to me what is going
on there exactly? This seems as though it should be a bug. It's been
persistent across many versions of Scala though (tested 2.8.0, 2.9.0, 2.9.1)
That's because use is declared as taking a () => Unit, so the compiler expects the block you pass to use to evaluate to something of that type.
It seems that what you want is to turn the entire block into a by-name parameter, to do so change def use (code : () => Unit) to def use (code : => Unit).
() => Unit is the type of a Function0 object, and you've required the use expression to be of that type, which it obviously isn't. => Unit is a by name parameter, which you should use instead.
You might find my answer to this question useful.
To get to the heart of the matter: blocks are not lambdas. A block in Scala is a scope delimiter, nothing more.
If you had written
test use { () =>
val reader = new BufferedReader(new InputStreamReader(test))
var input: String = reader.readLine
while (input != null) {
println(input)
input = reader.readLine
}
}
Then you'd have a function (indicated by () =>) which is delimited by the block.
If use had been declared as
def use (code: => Unit) {
Then the syntax you used would work, but not because of any lambda thingy. That syntax indicates the parameter is passed by name, which, roughly speaking, means you'd take the whole expression passed as parameter (ie, the whole block), and substitute it for code inside the body of use. The type of code would be Unit, not a function, but the parameter would not be passed by value.
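The difference between the two declarations can be sketched side by side. The names ByNameDemo, useFn and useByName are hypothetical; the point is only that the first needs an explicit function literal, while the second accepts a bare block.

```scala
import scala.collection.mutable.ListBuffer

object ByNameDemo {
  // Takes a function value: callers must pass something of type () => Unit.
  def useFn(code: () => Unit): Unit = code()

  // Takes a by-name argument: the block is evaluated when `code` is referenced.
  def useByName(code: => Unit): Unit = code

  def demo(): List[String] = {
    val log = ListBuffer.empty[String]
    useFn { () => log += "fn" }     // explicit lambda required
    useByName { log += "by-name" }  // bare block is fine
    log.toList
  }
}
```

Passing a bare block to useFn fails to compile with the same "found: Unit, required: () => Unit" mismatch shown in the question.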
return or return expr has the type Nothing. You can substitute this for any type, as it never yields a value to the surrounding expression, instead it returns control to the caller.
In your program, it masquerades as the required type () => Unit.
Here's an occasionally convenient use for that (although you might be tarnished as unidiomatic if you use it too often, don't tell anyone you heard this from me!)
def foo(a: Option[Int]): Int = {
val aa: Int = a.getOrElse(return 0)
aa * 2
}
For the record, you should probably write:
def foo(a: Option[Int]): Int =
a.map(_ * 2).getOrElse(0)
You can get an insight into the mind of the compiler by checking the output of scala -Xprint:typer -e <one-liner>. Add -Ytyper-debug if you like sifting through the reams of output!
scala210 -Ytyper-debug -Xprint:typer -e 'def foo: Any = {val x: () => Any = { return }}'
... elided ...
typed return (): Nothing
adapted return (): Nothing to () => Any,

"using" function

I've defined 'using' function as following:
def using[A, B <: {def close(): Unit}] (closeable: B) (f: B => A): A =
try { f(closeable) } finally { closeable.close() }
I can use it like that:
using(new PrintWriter("sample.txt")){ out =>
out.println("hello world!")
}
Now I'm curious how to define the 'using' function so that it takes any number of parameters and lets me access them separately:
using(new BufferedReader(new FileReader("in.txt")), new PrintWriter("out.txt")){ (in, out) =>
out.println(in.readLine)
}
Starting Scala 2.13, the standard library provides a dedicated resource management utility: Using.
More specifically, Using.Manager can be used when dealing with several resources.
In our case, we can manage different resources such as your PrintWriter or BufferedReader as they both implement AutoCloseable, in order to read and write from a file to another and, no matter what, close both the input and the output resource afterwards:
import scala.util.Using
import java.io.{PrintWriter, BufferedReader, FileReader}
Using.Manager { use =>
val in = use(new BufferedReader(new FileReader("input.txt")))
val out = use(new PrintWriter("output.txt"))
out.println(in.readLine)
}
// scala.util.Try[Unit] = Success(())
Someone has already done this—it's called Scala ARM.
From the readme:
import resource._
for (input <- managed(new FileInputStream("test.txt"))) {
// Code that uses the input as a FileInputStream
}
I've been thinking about this and I thought maybe there was another way to address it. Here is my take on supporting "any number" of parameters (limited by what tuples provide):
object UsingTest {
type Closeable = {def close():Unit }
final class CloseAfter[A<:Product](val x: A) {
def closeAfter[B](block: A=>B): B = {
try {
block(x);
} finally {
for (i <- 0 until x.productArity) {
x.productElement(i) match {
case c:Closeable => println("closing " + c); c.close()
case _ =>
}
}
}
}
}
implicit def any2CloseAfter[A<:Product](x: A): CloseAfter[A] =
new CloseAfter(x)
def main(args:Array[String]): Unit = {
import java.io._
(new BufferedReader(new FileReader("in.txt")),
new PrintWriter("out.txt"),
new PrintWriter("sample.txt")) closeAfter {case (in, out, other) =>
out.println(in.readLine)
other.println("hello world!")
}
}
}
I think I'm reusing the fact that 22 tuple/product classes have been written in the library... I don't think this syntax is clearer than using nested using (no pun intended), but it was an interesting puzzle.
Using structural typing seems like overkill, since java.lang.AutoCloseable is intended for exactly this purpose:
def using[A <: AutoCloseable, B](resource: A)(block: A => B): B =
try block(resource) finally resource.close()
or, if you prefer extension methods:
implicit class UsingExtension[A <: AutoCloseable](val resource: A) extends AnyVal {
def using[B](block: A => B): B = try block(resource) finally resource.close()
}
using2 is possible:
def using2[R1 <: AutoCloseable, R2 <: AutoCloseable, B](resource1: R1, resource2: R2)(block: (R1, R2) => B): B =
using(resource1) { _ =>
using(resource2) { _ =>
block(resource1, resource2)
}
}
but it is, IMHO, quite ugly; I would prefer to simply nest these using statements in the client code.
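A minimal sketch of what nesting in client code looks like, reusing the AutoCloseable-based using from above. The Tracked class is a hypothetical resource that only records when it is closed, so the release order becomes visible.

```scala
import scala.collection.mutable.ListBuffer

object NestedUsing {
  def using[A <: AutoCloseable, B](resource: A)(block: A => B): B =
    try block(resource) finally resource.close()

  // Hypothetical resource that records its own close() call in a shared log.
  final class Tracked(name: String, log: ListBuffer[String]) extends AutoCloseable {
    def close(): Unit = { log += s"close $name" }
  }

  def demo(): List[String] = {
    val log = ListBuffer.empty[String]
    using(new Tracked("in", log)) { in =>
      using(new Tracked("out", log)) { out =>
        log += "work" // both resources are open here
      }
    }
    log.toList
  }
}
```

The inner resource is closed first and the outer one last, so resources are released in the reverse order of acquisition.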
Unfortunately, there isn't support for arbitrary-length parameter lists with arbitrary types in standard Scala.
You might be able to do something like this with a couple of language changes (to allow variable parameter lists to be passed as HLists; see here for about 1/3 of what's needed).
Right now, the best thing to do is just do what Tuple and Function do: implement usingN for as many N as you need.
Two is easy enough, of course:
def using2[A, B <: {def close(): Unit}, C <: { def close(): Unit}](closeB: B, closeC: C)(f: (B,C) => A): A = {
try { f(closeB,closeC) } finally { closeB.close(); closeC.close() }
}
If you need more, it's probably worth writing something that'll generate the source code.
Here is an example that allows you to use the scala for comprehension as an automatic resource management block for any item that is a java.io.Closeable, but it could easily be expanded to work for any object with a close method.
This usage seems pretty close to the using statement and allows you to easily have as many resources defined in one block as you want.
object ResourceTest{
import CloseableResource._
import java.io._
def test(){
for( input <- new BufferedReader(new FileReader("/tmp/input.txt")); output <- new FileWriter("/tmp/output.txt") ){
output.write(input.readLine)
}
}
}
class CloseableResource[T](resource: =>T,onClose: T=>Unit){
def foreach(f: T=>Unit){
val r = resource
try{
f(r)
}
finally{
try{
onClose(r)
}
catch{
case e =>
println("error closing resource")
e.printStackTrace
}
}
}
}
object CloseableResource{
implicit def javaCloseableToCloseableResource[T <: java.io.Closeable](resource:T):CloseableResource[T] = new CloseableResource[T](resource,{_.close})
}
It is a good idea to detach the cleanup algorithm from the program path.
This solution lets you accumulate closeables in a scope.
The scope cleanup happens after the block is executed, or the scope can be detached, in which case the cleanup can be done later.
This way we get the same convenience without being limited to single-threaded programming.
The utility class:
import java.io.Closeable
object ManagedScope {
val scope=new ThreadLocal[Scope]();
def managedScope[T](inner: =>T):T={
val previous=scope.get();
val thisScope=new Scope();
scope.set(thisScope);
try{
inner
} finally {
scope.set(previous);
if(!thisScope.detatched) thisScope.close();
}
}
def closeLater[T <: Closeable](what:T): T = {
val theScope=scope.get();
if(!(theScope eq null)){
theScope.closeables=theScope.closeables.:+(what);
}
what;
}
def detatchScope(): Scope={
val theScope=scope.get();
if(theScope eq null) null;
else {
theScope.detatched=true;
theScope;
}
}
}
class Scope{
var detatched=false;
var closeables:List[Closeable]=List();
def close():Unit={
for(c<-closeables){
try{
if(!(c eq null))c.close();
} catch{
case e:Throwable=>{};
}
}
}
}
The usage:
def checkSocketConnect(host:String, portNumber:Int):Unit = managedScope {
// The close later function tags the closeable to be closed later
val socket = closeLater( new Socket(host, portNumber) );
doWork(socket);
}
def checkFutureConnect(host:String, portNumber:Int):Unit = managedScope {
// The close later function tags the closeable to be closed later
val socket = closeLater( new Socket(host, portNumber) );
val future:Future[Boolean]=doAsyncWork(socket);
// Detatch the scope and use it in the future.
val scope=detatchScope();
future.onComplete(v=>scope.close());
}
This solution doesn't quite have the syntax you desire, but I think it's close enough :)
def using[A <: {def close(): Unit}, B](resources: List[A])(f: List[A] => B): B =
try f(resources) finally resources.foreach(_.close())
using(List(new BufferedReader(new FileReader("in.txt")), new PrintWriter("out.txt"))) {
case List(in: BufferedReader, out: PrintWriter) => out.println(in.readLine())
}
Of course, the downside is that you have to type out the types BufferedReader and PrintWriter in the using block. You might be able to add some magic so that you just need List(in, out) by using multiple ORed type bounds for type A in using.
By defining some pretty hacky and dangerous implicit conversions you can get around having to type List (and another way to get around specifying types for the resources), but I haven't documented the detail as it's too dangerous IMO.
Here is my solution to resource management in Scala:
import scala.util.control.NonFatal
def withResources[T <: AutoCloseable, V](r: => T)(f: T => V): V = {
val resource: T = r
require(resource != null, "resource is null")
var exception: Throwable = null
try {
f(resource)
} catch {
case NonFatal(e) =>
exception = e
throw e
} finally {
closeAndAddSuppressed(exception, resource)
}
}
private def closeAndAddSuppressed(e: Throwable,
resource: AutoCloseable): Unit = {
if (e != null) {
try {
resource.close()
} catch {
case NonFatal(suppressed) =>
e.addSuppressed(suppressed)
}
} else {
resource.close()
}
}
I have used this in multiple Scala apps, including for managing resources in Spark executors. Be aware that there are other, arguably better ways to manage resources, such as Resource in Cats Effect (https://typelevel.org/cats-effect/datatypes/resource.html), if you are OK with pure FP in Scala.
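To illustrate what the suppressed-exception handling buys you, here is a compact sketch under stated assumptions: withResources is restated in shortened form (without the null check), and FailingClose is a hypothetical resource whose close() always throws. When both the body and close() fail, the body's exception is the one that propagates, with the close() failure attached to it as a suppressed exception.

```scala
import scala.util.control.NonFatal

object SuppressedDemo {
  // Shortened restatement of withResources from the answer above.
  def withResources[T <: AutoCloseable, V](r: => T)(f: T => V): V = {
    val resource: T = r
    var primary: Throwable = null
    try f(resource)
    catch { case NonFatal(e) => primary = e; throw e }
    finally {
      if (primary == null) resource.close()
      else try resource.close() catch { case NonFatal(s) => primary.addSuppressed(s) }
    }
  }

  // Hypothetical resource whose close() always fails.
  final class FailingClose extends AutoCloseable {
    def close(): Unit = throw new IllegalStateException("close failed")
  }

  // Returns the primary message and the messages of any suppressed exceptions.
  def demo(): (String, List[String]) =
    try withResources(new FailingClose)(_ => throw new RuntimeException("body failed"))
    catch {
      case NonFatal(e) => (e.getMessage, e.getSuppressed.toList.map(_.getMessage))
    }
}
```

With a plain try/finally, the exception thrown by close() would replace the one from the body and the original failure would be lost.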
To answer your last question, you can definitely nest the resources like this:
withResources(r: File)(
r => {
withResources(a: File)(
anotherR => {
withResources(...)(...)
}
)
}
)
This way, the resources are not only protected from leaking but also released in the correct order (like a stack), which is the same behaviour as the Resource monad from Cats Effect.