How to close enumerated file? - scala

Say, in an action I have:
val linesEnu = {
  val is = new java.io.FileInputStream(path)
  val isr = new java.io.InputStreamReader(is, "UTF-8")
  val br = new java.io.BufferedReader(isr)
  import scala.collection.JavaConversions._
  val rows: scala.collection.Iterator[String] = br.lines.iterator
  Enumerator.enumerate(rows)
}
Ok.feed(linesEnu).as(HTML)
How to close readers/streams?

There is an onDoneEnumerating callback that functions like finally (it will always be called, whether or not the Enumerator fails). You can close the streams there.
val linesEnu = {
  val is = new java.io.FileInputStream(path)
  val isr = new java.io.InputStreamReader(is, "UTF-8")
  val br = new java.io.BufferedReader(isr)
  import scala.collection.JavaConversions._
  val rows: scala.collection.Iterator[String] = br.lines.iterator
  Enumerator.enumerate(rows).onDoneEnumerating {
    is.close()
    // ... anything else you want to execute when the Enumerator finishes.
  }
}

The IO tools provided by Enumerator give you this kind of resource management out of the box. For example, if you create an enumerator with fromStream, the stream is guaranteed to be closed after running (even if you only read a single line, etc.).
So for example you could write the following:
import play.api.libs.iteratee._

val splitByNl = Enumeratee.grouped(
  Traversable.splitOnceAt[Array[Byte], Byte](_ != '\n'.toByte) &>>
    Iteratee.consume()
) compose Enumeratee.map(new String(_, "UTF-8"))

def fileLines(path: String): Enumerator[String] =
  Enumerator.fromStream(new java.io.FileInputStream(path)).through(splitByNl)
It's a shame that the library doesn't provide a linesFromStream out of the box, but I personally would still prefer to use fromStream with hand-rolled splitting, etc. over using an iterator and providing my own resource management.
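Outside of Play, the same always-close guarantee can be sketched with a plain loan pattern over scala.io.Source (an illustrative stand-in, not part of the iteratee API):

```scala
import scala.io.Source

// Loan pattern: the caller only sees the line iterator while the source is
// open; the source is closed even if the caller's function throws.
def withLines[A](path: String)(f: Iterator[String] => A): A = {
  val src = Source.fromFile(path, "UTF-8")
  try f(src.getLines()) finally src.close()
}
```

The file handle is released no matter how the caller's function exits, which is the same guarantee onDoneEnumerating and fromStream give you inside Play.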

Related

Process Interaction through stdin/stdout

I am trying to build a class that starts a system process which waits for stdin. The class should have another method that takes a string, feeds it to the system process, and returns the process's output.
The reason is that starting the process involves loading a lot of data and hence takes a while.
I am trying to dummy-test this with bc, so that bc is started and waits for input. I would envision an interface like this:
case class BcWrapper(executable: File) {
  var bc: Option[???] = None
  def startBc(): Unit = bc = Some(???)
  def calc(input: String): String = bc.get.???
  def stopBc(): Unit = bc.get.???
}
I would like to be able to use it like this:
val wrapper = BcWrapper(new File("/usr/bin/bc"))
wrapper.startBc()
val result1 = wrapper.calc("1 + 1") // should be "2"
val result2 = wrapper.calc(???)
[...]
wrapper.stopBc()
This topic has been touched on in multiple questions, but never fully answered for a use case like this one. This question or this one seems to come close. However, I am not sure how to implement the ProcessLogger, nor whether to use one in the first place.
Unfortunately, the Scala documentation is not very elaborate either.
Note that I do not want to read from stdin, but want to call a function.
The background is that I want to read a large file, read it line by line, preprocess the lines, pass them to the external process, and post-process the output.
You can get something similar, but simpler, like so.
import sys.process._
import util.Try

class StdInReader(val reader: String) {
  def send(input: String): Try[String] =
    Try(s"/bin/echo $input".#|(reader).!!.trim)
}
usage:
val bc = new StdInReader("/usr/bin/bc")
bc.send("2 * 8") //res0: scala.util.Try[String] = Success(16)
bc.send("12 + 8") //res1: scala.util.Try[String] = Success(20)
bc.send("22 - 8") //res2: scala.util.Try[String] = Success(14)
Programs that exit with a non-zero code (bc doesn't) will result in a Failure().
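For example, piping into grep (chosen here only because it exits non-zero when nothing matches) shows both outcomes:

```scala
import sys.process._
import util.Try

class StdInReader(val reader: String) {
  def send(input: String): Try[String] =
    Try(s"/bin/echo $input".#|(reader).!!.trim)
}

val matcher = new StdInReader("grep hello")
matcher.send("hello world") // grep matched, exit code 0: Success("hello world")
matcher.send("goodbye")     // grep exits 1 on no match: Failure(...)
```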
If you need more fine-grained control you might start with something like this and expand on it.
import sys.process._

class ProcHandler(val cmnd: String) {
  private val resbuf = collection.mutable.Buffer.empty[String]

  def run(data: Seq[String]): Unit = {
    cmnd.run(new ProcessIO(
      in => {
        val writer = new java.io.PrintWriter(in)
        data.foreach(writer.println)
        writer.close()
      },
      out => {
        val src = io.Source.fromInputStream(out)
        src.getLines().foreach(resbuf += _)
        src.close()
      },
      _.close() // maybe create a separate buffer for stderr?
    )).exitValue()
  }

  def results(): Seq[String] = {
    val rs = collection.mutable.Buffer.empty[String]
    resbuf.copyToBuffer(rs)
    resbuf.clear()
    rs
  }
}
usage:
val bc = new ProcHandler("/usr/bin/bc")
bc.run(List("4+5","6-2","2*5"))
bc.run(List("99/3","11*77"))
bc.results() //res0: Seq[String] = ArrayBuffer(9, 4, 10, 33, 847)
OK, I did some more research and found this. It appears to get at what you want, but there are limitations. In particular, the process stays open for input until you want to get output. At that point the IO streams are closed to ensure all buffers are flushed.
import sys.process._
import util.Try

class ProcHandler(val cmnd: String) {
  private val procInput = new java.io.PipedOutputStream()
  private val procOutput = new java.io.PipedInputStream()

  private val proc = cmnd.run(new ProcessIO(
    { in => // attach to the process's internal input stream
      val istream = new java.io.PipedInputStream(procInput)
      val buf = Array.fill(100)(0.toByte)
      Iterator.iterate(istream.read(buf)) { br =>
        in.write(buf, 0, br)
        istream.read(buf)
      }.takeWhile(_ >= 0).toList
      in.close()
    },
    { out => // attach to the process's internal output stream
      val ostream = new java.io.PipedOutputStream(procOutput)
      val buf = Array.fill(100)(0.toByte)
      Iterator.iterate(out.read(buf)) { br =>
        ostream.write(buf, 0, br)
        out.read(buf)
      }.takeWhile(_ >= 0).toList
      out.close()
    },
    _ => () // ignore stderr
  ))

  private val procO = new java.io.BufferedReader(new java.io.InputStreamReader(procOutput))
  private val procI = new java.io.PrintWriter(procInput, true)

  def feed(str: String): Unit = procI.println(str)
  def feed(ss: Seq[String]): Unit = ss.foreach(procI.println)

  def read(): List[String] = {
    procI.close() // close input before reading output
    val lines = Stream.iterate(Try(procO.readLine)) { _ =>
      Try(procO.readLine)
    }.takeWhile(_.isSuccess).map(_.get).toList
    procO.close()
    lines
  }
}
usage:
val bc = new ProcHandler("/usr/bin/bc")
bc.feed(List("9*3","4+11")) //res0: Unit = ()
bc.feed("4*13") //res1: Unit = ()
bc.read() //res2: List[String] = List(27, 15, 52)
bc.read() //res3: List[String] = List()
OK, this is my final word on the subject. I think this ticks every item on your wish list: the process is started only once, it stays alive until actively closed, and it allows alternating writing and reading.
import sys.process._

class ProcHandler(val cmnd: Seq[String]) {
  private var os: java.io.OutputStream = null
  private var is: java.io.InputStream = null
  private val pio = new ProcessIO(os = _, is = _, _.close())
  private val proc = cmnd.run(pio)

  def feed(ss: String*): Unit = {
    ss.foreach(_.foreach(os.write(_)))
    os.flush()
  }

  def ready: Boolean = is.available() > 0

  def read(): String =
    Seq.fill[Char](is.available())(is.read().toChar).mkString

  def close(): Unit = {
    os.close()       // close stdin first, so the process can terminate
    proc.exitValue() // then wait for it to exit
    is.close()
  }
}
There are still issues and much room for improvement. IO is handled at a basic level (raw streams), and I'm not sure what I'm doing here is completely safe and correct. The input to feed() is required to supply the necessary newline terminations, and the output of read() is just a raw String, not separated into a nice collection of string results.
Note that this will bleed system resources if the client code fails to close() all processes.
Note also that reading doesn't wait for content (i.e. there is no blocking): after writing, the response might not be immediately available.
usage:
val bc = new ProcHandler(Seq("/usr/bin/bc","-q"))
bc.feed("44-21\n", "21*4\n")
bc.feed("67+11\n")
if (bc.ready) bc.read() else "not ready" // "23\n84\n78\n"
bc.feed("67-11\n")
if (bc.ready) bc.read() else "not ready" // "56\n"
bc.feed("67*11\n", "1+2\n")
if (bc.ready) bc.read() else "not ready" // "737\n3\n"
if (bc.ready) bc.read() else "not ready" // "not ready"
bc.close()
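Since read() doesn't block, one way to bridge the gap is a small polling helper (a hypothetical addition, not part of the class above) that retries the readiness check until output appears or a deadline passes:

```scala
// Hypothetical helper: poll a readiness check until output is available
// or the deadline passes; returns None on timeout.
def awaitOutput(ready: () => Boolean,
                read: () => String,
                timeoutMs: Long = 1000): Option[String] = {
  val deadline = System.currentTimeMillis + timeoutMs
  while (!ready() && System.currentTimeMillis < deadline) Thread.sleep(10)
  if (ready()) Some(read()) else None
}
```

Usage would be `awaitOutput(() => bc.ready, () => bc.read())` instead of the `if (bc.ready) ... else "not ready"` pattern above.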

FS2 stream run till the end of InputStream

I'm very new to FS2 and need some help with the design. I'm trying to design a stream that will pull chunks from the underlying InputStream until it is exhausted. Here is what I tried:
import java.io.{File, FileInputStream, InputStream}
import cats.effect.IO
import cats.effect.IO._

object Fs2 {
  def main(args: Array[String]): Unit = {
    val is = new FileInputStream(new File("/tmp/my-file.mf"))
    val stream = fs2.Stream.eval(read(is))
    stream.compile.drain.unsafeRunSync()
  }

  def read(is: InputStream): IO[Array[Byte]] = IO {
    val buf = new Array[Byte](4096)
    is.read(buf)
    println(new String(buf))
    buf
  }
}
And the program prints only the first chunk. This is reasonable. But I want a way to "signal" where to stop reading and where not to; that is, to keep calling read(is) until the end of the stream. Is there a way to achieve that?
I also tried repeatEval(read(is)), but it keeps reading forever... I need something in between.
Use fs2.io.readInputStream or fs2.io.readInputStreamAsync. The former blocks the current thread; the latter blocks a thread in the ExecutionContext. For example:
val is: InputStream = new FileInputStream(new File("/tmp/my-file.mf"))
val stream = fs2.io.readInputStreamAsync(IO(is), 128)
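Under the hood, the termination signal is InputStream.read returning -1 at end of stream, which is exactly what fs2.io.readInputStream checks for you. A plain-Scala sketch of the same loop (illustrative only, without the fs2 machinery):

```scala
import java.io.{ByteArrayOutputStream, InputStream}

// Read an InputStream to the end: read() returns -1 once the stream
// is exhausted, which is the stop condition repeatEval was missing.
def readAll(is: InputStream, bufSize: Int = 4096): Array[Byte] = {
  val out = new ByteArrayOutputStream()
  val buf = new Array[Byte](bufSize)
  var n = is.read(buf)
  while (n >= 0) {
    out.write(buf, 0, n)
    n = is.read(buf)
  }
  out.toByteArray
}
```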

File Upload and processing using akka-http websockets

I'm using some sample Scala code to make a server that receives a file over a websocket, stores the file temporarily, runs a bash script on it, and then returns stdout as a TextMessage.
Sample code was taken from this github project.
I edited the code slightly within echoService so that it runs another function that processes the temporary file.
object WebServer {
  def main(args: Array[String]) {
    implicit val actorSystem = ActorSystem("akka-system")
    implicit val flowMaterializer = ActorMaterializer()

    val interface = "localhost"
    val port = 3000

    import Directives._

    val route = get {
      pathEndOrSingleSlash {
        complete("Welcome to websocket server")
      }
    } ~
      path("upload") {
        handleWebSocketMessages(echoService)
      }

    val binding = Http().bindAndHandle(route, interface, port)
    println(s"Server is now online at http://$interface:$port\nPress RETURN to stop...")
    StdIn.readLine()

    binding.flatMap(_.unbind()).onComplete(_ => actorSystem.shutdown())
    println("Server is down...")
  }

  implicit val actorSystem = ActorSystem("akka-system")
  implicit val flowMaterializer = ActorMaterializer()

  val echoService: Flow[Message, Message, _] = Flow[Message].mapConcat {
    case BinaryMessage.Strict(msg) => {
      val decoded: Array[Byte] = msg.toArray
      val imgOutFile = new File("/tmp/" + "filename")
      val fileOuputStream = new FileOutputStream(imgOutFile)
      fileOuputStream.write(decoded)
      fileOuputStream.close()
      TextMessage(analyze(imgOutFile))
    }
    case BinaryMessage.Streamed(stream) => {
      stream
        .limit(Int.MaxValue)             // Max frames we are willing to wait for
        .completionTimeout(50 seconds)   // Max time until last frame
        .runFold(ByteString(""))(_ ++ _) // Merges the frames
        .flatMap { (msg: ByteString) =>
          val decoded: Array[Byte] = msg.toArray
          val imgOutFile = new File("/tmp/" + "filename")
          val fileOuputStream = new FileOutputStream(imgOutFile)
          fileOuputStream.write(decoded)
          fileOuputStream.close()
          Future(Source.single(""))
        }
      TextMessage(analyze(imgOutFile))
    }

    private def analyze(imgfile: File): String = {
      val p = Runtime.getRuntime.exec(Array("./run-vision.sh", imgfile.toString))
      val br = new BufferedReader(new InputStreamReader(p.getInputStream, StandardCharsets.UTF_8))
      try {
        val result = Stream
          .continually(br.readLine())
          .takeWhile(_ ne null)
          .mkString
        result
      } finally {
        br.close()
      }
    }
  }
}
During testing using Dark WebSocket Terminal, the case BinaryMessage.Strict works fine.
Problem: However, the case BinaryMessage.Streamed doesn't finish writing the file before running the analyze function, resulting in a blank response from the server.
I'm trying to wrap my head around how Futures are being used here with the Flows in Akka-HTTP, but I'm not having much luck outside trying to get through all the official documentation.
Currently, .mapAsync seems promising, or basically finding a way to chain futures.
I'd really appreciate some insight.
Yes, mapAsync will help you on this occasion. It is a combinator that executes Futures (potentially in parallel) in your stream and presents their results on the output side.
In your case, to make things homogeneous and keep the type checker happy, you'll need to wrap the result of the Strict case into a Future.successful.
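The shape of that requirement can be sketched without any Akka dependency (an analogy only: Either stands in for the Strict/Streamed split, and reverse for the real work):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// Both branches must yield the same type, Future[String], for a
// mapAsync-style combinator to accept them: the already-available
// value is wrapped with Future.successful, the streamed one is mapped.
def handle(msg: Either[String, Future[String]]): Future[String] = msg match {
  case Left(strict)    => Future.successful(strict.reverse) // cheap, already computed
  case Right(streamed) => streamed.map(_.reverse)           // genuinely asynchronous
}

// e.g. Await.result(handle(Left("abc")), 1.second) yields "cba"
```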
A quick fix for your code could be:
val echoService: Flow[Message, Message, _] = Flow[Message].mapAsync(parallelism = 5) {
  case BinaryMessage.Strict(msg) => {
    val decoded: Array[Byte] = msg.toArray
    val imgOutFile = new File("/tmp/" + "filename")
    val fileOuputStream = new FileOutputStream(imgOutFile)
    fileOuputStream.write(decoded)
    fileOuputStream.close()
    Future.successful(TextMessage(analyze(imgOutFile)))
  }
  case BinaryMessage.Streamed(stream) =>
    stream
      .limit(Int.MaxValue)             // Max frames we are willing to wait for
      .completionTimeout(50 seconds)   // Max time until last frame
      .runFold(ByteString(""))(_ ++ _) // Merges the frames
      .flatMap { (msg: ByteString) =>
        val decoded: Array[Byte] = msg.toArray
        val imgOutFile = new File("/tmp/" + "filename")
        val fileOuputStream = new FileOutputStream(imgOutFile)
        fileOuputStream.write(decoded)
        fileOuputStream.close()
        Future.successful(TextMessage(analyze(imgOutFile)))
      }
}

How to write a string to Scala Process?

I start a Scala process and keep it running.
val dir = "/path/to/working/dir/"
val stockfish = Process(Seq("wine", dir + "stockfish_8_x32.exe"))
val logger = ProcessLogger(printf("Stdout: %s%n", _))
val stockfishProcess = stockfish.run(logger, connectInput = true)
The process reads from and writes to standard IO (console). How can I send a string command to the process after it has already started?
The Scala process API has ProcessBuilder, which in turn has a bunch of useful methods. But ProcessBuilder is used before a process starts, to compose complex shell commands. Scala also has ProcessIO to handle input or output, but I don't need it either. I just need to send a message to my process.
In Java I would do something like this.
String dir = "/path/to/working/dir/";
ProcessBuilder builder = new ProcessBuilder("wine", dir + "stockfish_8_x32.exe");
Process process = builder.start();

OutputStream stdin = process.getOutputStream();
InputStream stdout = process.getInputStream();

BufferedReader reader = new BufferedReader(new InputStreamReader(stdout));
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(stdin));

new Thread(() -> {
    try {
        String line;
        while ((line = reader.readLine()) != null) {
            System.out.println("Stdout: " + line);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}).start();

Thread.sleep(5000); // it's just for example
writer.write("quit"); // send the process a command to stop working
writer.newLine();
writer.flush();
It works quite well. I start my process, get InputStream and OutputStream from it, and use the streams to interact with the process.
It appears the Scala Process trait provides no way to write to it. ProcessBuilder is useless after the process has been run. And ProcessIO is just for catching and handling IO.
Are there any ways to write to Scala running process?
UPDATE:
I don't see how I can use ProcessIO to pass a string to a running process.
I did the following.
import scala.io.Source
import scala.sys.process._

object Sample extends App {
  def out = (output: java.io.OutputStream) => {
    output.flush()
    output.close()
  }

  def in = (input: java.io.InputStream) => {
    println("Stdout: " + Source.fromInputStream(input).mkString)
    input.close()
  }

  def go = {
    val dir = "/path/to/working/dir/"
    val stockfishSeq = Seq("wine", dir + "/stockfish_8_x32.exe")
    val pio = new ProcessIO(out, in, err => {})
    val stockfish = Process(stockfishSeq)
    stockfish.run(pio)
    Thread.sleep(5000)
    System.out.write("quit\n".getBytes)
    pio.writeInput(System.out) // "writeInput" is the function "out" which I passed to the ProcessIO instance.
                               // I can invoke it from here. It takes an OutputStream, but where can I obtain one?
                               // Here I just pass System.out for example.
  }
  go
}
Of course it does not work, and I have failed to understand how to implement the functionality of my Java snippet above. It would be great to have advice or a snippet of Scala code clearing up my issue.
I think the documentation around Scala processes (specifically the usage and semantics of ProcessIO) could use some improvement. The first time I tried using this API, I also found it very confusing, and it took some trial and error to get my subprocess i/o working correctly.
I think seeing a simple example is probably all you really need. I'll do something really simple: invoking bc as a subprocess to do some trivial computations, and then printing the answers to my stdout. My goal is to do something like this (but from Scala rather than from my shell):
$ printf "1+2\n3+4\n" | bc
3
7
Here's how I'd do it in Scala:
import scala.io.Source
import scala.sys.process._

object SimpleProcessExample extends App {
  def out = (output: java.io.OutputStream) => {
    output.flush()
    output.close()
  }

  def in = (input: java.io.InputStream) => {
    println("Stdout: " + Source.fromInputStream(input).mkString)
    input.close()
  }

  // limit scope of any temporary variables
  locally {
    val calcCommand = "bc"
    // strings are implicitly converted to ProcessBuilder
    // via scala.sys.process.ProcessImplicits.stringToProcess(_)
    val calcProc = calcCommand.run(new ProcessIO(
      // Handle subprocess's stdin
      // (which we write via an OutputStream)
      in => {
        val writer = new java.io.PrintWriter(in)
        writer.println("1 + 2")
        writer.println("3 + 4")
        writer.close()
      },
      // Handle subprocess's stdout
      // (which we read via an InputStream)
      out => {
        val src = scala.io.Source.fromInputStream(out)
        for (line <- src.getLines()) {
          println("Answer: " + line)
        }
        src.close()
      },
      // We don't want to use stderr, so just close it.
      _.close()
    ))

    // Using ProcessBuilder.run() will automatically launch
    // a new thread for the input/output routines passed to ProcessIO.
    // We just need to wait for it to finish.
    val code = calcProc.exitValue()
    println(s"Subprocess exited with code $code.")
  }
}
Notice that you don't actually call any of the methods of the ProcessIO object directly because they're automatically called by the ProcessBuilder.
Here's the result:
$ scala SimpleProcessExample
Answer: 3
Answer: 7
Subprocess exited with code 0.
If you wanted interaction between the input and output handlers to the subprocess, you can use standard thread communication tools (e.g., have both close over an instance of BlockingQueue).
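A sketch of that idea (using cat so the subprocess simply echoes its stdin, and an arbitrary "EOF" string as the end-of-input sentinel): the stdin handler drains one queue while the stdout handler fills another.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.sys.process._

val toProc   = new LinkedBlockingQueue[String]() // lines destined for the subprocess
val fromProc = new LinkedBlockingQueue[String]() // lines produced by the subprocess

val proc = "cat".run(new ProcessIO(
  in => { // runs on its own thread: drain the queue until the sentinel
    val w = new java.io.PrintWriter(in)
    Iterator.continually(toProc.take()).takeWhile(_ != "EOF").foreach(w.println)
    w.close()
  },
  out => // runs on its own thread: push every output line into the queue
    scala.io.Source.fromInputStream(out).getLines().foreach(fromProc.put),
  _.close() // ignore stderr
))

toProc.put("hello"); toProc.put("world"); toProc.put("EOF")
proc.exitValue()
// fromProc now holds the echoed lines "hello" and "world"
```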
Here is an example of obtaining input and output streams from a process, which you can write to and read from after the process starts:
object demo {
  import scala.sys.process._

  def getIO = {
    // create piped streams that can attach to process streams:
    val procInput = new java.io.PipedOutputStream()
    val procOutput = new java.io.PipedInputStream()

    val io = new ProcessIO(
      // attach to the process's internal input stream
      { in =>
        val istream = new java.io.PipedInputStream(procInput)
        val buf = Array.fill(100)(0.toByte)
        var br = 0
        while (br >= 0) {
          br = istream.read(buf)
          if (br > 0) { in.write(buf, 0, br) }
        }
        in.close()
      },
      // attach to the process's internal output stream
      { out =>
        val ostream = new java.io.PipedOutputStream(procOutput)
        val buf = Array.fill(100)(0.toByte)
        var br = 0
        while (br >= 0) {
          br = out.read(buf)
          if (br > 0) { ostream.write(buf, 0, br) }
        }
        out.close()
      },
      // ignore stderr
      { err => () }
    )

    // run the command with the IO object:
    val cmd = List("awk", "{ print $1 + $2 }")
    val proc = cmd.run(io)

    // wrap the raw streams in formatted IO objects:
    val procO = new java.io.BufferedReader(new java.io.InputStreamReader(procOutput))
    val procI = new java.io.PrintWriter(procInput, true)
    (procI, procO)
  }
}
Here's a short example of using the input and output objects. Note that it's hard to guarantee that the process will receive its input until you close the input streams/objects, since everything is piped, buffered, etc.
scala> :load /home/eje/scala/input2proc.scala
Loading /home/eje/scala/input2proc.scala...
defined module demo
scala> val (procI, procO) = demo.getIO
procI: java.io.PrintWriter = java.io.PrintWriter@7e809b79
procO: java.io.BufferedReader = java.io.BufferedReader@5cc126dc
scala> procI.println("1 2")
scala> procI.println("3 4")
scala> procI.println("5 6")
scala> procI.close()
scala> procO.readLine
res4: String = 3
scala> procO.readLine
res5: String = 7
scala> procO.readLine
res6: String = 11
scala>
In general, if you are managing both input and output simultaneously in the same thread, there is the potential for deadlock, since either read or write can block while waiting for the other. It is safest to run the input logic and the output logic in their own threads. With these threading concerns in mind, it is also possible to put the input and output logic directly into the definitions { in => ... } and { out => ... }, as these are both run in separate threads automatically.
I haven't actually tried this, but the documentation says that you can use an instance of ProcessIO to handle the Process's input and output in a manner similar to what you would do in Java.
var outPutStream: Option[OutputStream] = None

val io = new ProcessIO(
  { outputStream =>
    outPutStream = Some(outputStream)
  },
  Source.fromInputStream(_).getLines().foreach(println),
  Source.fromInputStream(_).getLines().foreach(println)
)

command run io

// Caution: the handlers above run on separate threads, so outPutStream may
// still be None immediately after run(); synchronize (e.g. with a
// CountDownLatch) before calling .get in real code.
val out = outPutStream.get
out.write("test".getBytes())
You can get an InputStream in the same way.

simple scala socket program - talks to one client only?

I'm trying to make a very simple Scala socket program that will "echo" out any input it receives from multiple clients.
This program does work, but only for a single client. I think this is because execution is always stuck inside the while(true) loop.
import java.net._
import java.io._
import scala.io._

//println(util.Properties.versionString)
val server = new ServerSocket(9999)
println("initialized server")
val client = server.accept

while (true) {
  val in = new BufferedReader(new InputStreamReader(client.getInputStream)).readLine
  val out = new PrintStream(client.getOutputStream)
  println("Server received:" + in) // print out the input message
  out.println("Message received")
  out.flush
}
I've tried making this modification:
while (true) {
  val client = server.accept
  val in = new BufferedReader(new InputStreamReader(client.getInputStream)).readLine
  val out = new PrintStream(client.getOutputStream)
  println("Server received:" + in)
}
But this doesn't work beyond echoing out a single message per client.
I'd like multiple clients to connect to the socket and constantly receive the output of whatever they type in.
Basically you should accept the connection and create a new Future for each client. Beware that the implicit global ExecutionContext might be limited; you might need to find a different one that better fits your use case.
You can use Scala async if you need more complex tasks with futures, but I think this is probably fine.
Disclaimer, I have not tried this, but something similar might work (based on your code and the docs):
import scala.concurrent._
import ExecutionContext.Implicits.global
...
while (true) {
  val client = server.accept
  Future {
    val inReader = new BufferedReader(new InputStreamReader(client.getInputStream))
    val out = new PrintStream(client.getOutputStream)
    Iterator.continually(inReader.readLine)
      .takeWhile(_ != null) // readLine returns null once the client disconnects
      .foreach { in =>
        println("Server received:" + in)
        out.println("Message received") // echo a reply back to this client
        out.flush()
      }
    client.close
  }
}
Here you can find an example for the Scala language:
http://www.scala-lang.org/old/node/55
And this is also a good example from Twitter's Scala School that works with Java libraries:
import java.net.{Socket, ServerSocket}
import java.util.concurrent.{Executors, ExecutorService}
import java.util.Date

class NetworkService(port: Int, poolSize: Int) extends Runnable {
  val serverSocket = new ServerSocket(port)
  val pool: ExecutorService = Executors.newFixedThreadPool(poolSize)

  def run() {
    try {
      while (true) {
        // This will block until a connection comes in.
        val socket = serverSocket.accept()
        pool.execute(new Handler(socket))
      }
    } finally {
      pool.shutdown()
    }
  }
}

class Handler(socket: Socket) extends Runnable {
  def message = (Thread.currentThread.getName() + "\n").getBytes

  def run() {
    socket.getOutputStream.write(message)
    socket.getOutputStream.close()
  }
}
(new NetworkService(2020, 2)).run
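To exercise any of these servers from the other side, here is a minimal client sketch (the host, port, and message are arbitrary choices for illustration):

```scala
import java.net.Socket
import java.io.{BufferedReader, InputStreamReader, PrintStream}

// Connect, send one line, and return the server's one-line reply.
def ping(host: String, port: Int, msg: String): String = {
  val sock = new Socket(host, port)
  try {
    val out = new PrintStream(sock.getOutputStream)
    val in  = new BufferedReader(new InputStreamReader(sock.getInputStream))
    out.println(msg)
    out.flush()
    in.readLine() // the server's reply line
  } finally sock.close()
}
```

Running several of these concurrently (one per thread) is a quick way to check whether a server really handles more than one client.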