I start and have running a Scala process.
val dir = "/path/to/working/dir/"
val stockfish = Process(Seq("wine", dir + "stockfish_8_x32.exe"))
val logger = ProcessLogger(printf("Stdout: %s%n", _))
val stockfishProcess = stockfish.run(logger, connectInput = true)
The process reads from and writes to standard IO (console). How can I send a string command to the process if it's been already started?
Scala process API has ProcessBuilder which has in turn bunch of useful methods. But ProcessBuilder is used before a process starts to compose complex shell commands. Also Scala has ProcessIO to handle input or output. I don't need it too. I just need to send message to my process.
In Java I would do something like this.
String dir = "/path/to/working/dir/";
ProcessBuilder builder = new ProcessBuilder("wine", dir + "stockfish_8_x32.exe");
Process process = builder.start();
OutputStream stdin = process.getOutputStream();
InputStream stdout = process.getInputStream();
BufferedReader reader = new BufferedReader(new InputStreamReader(stdout));
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(stdin));
new Thread(() -> {
try {
String line;
while ((line = reader.readLine()) != null) {
System.out.println("Stdout: " + line);
}
} catch (IOException e) {
e.printStackTrace();
}
}).start();
Thread.sleep(5000); // it's just for example
writer.write("quit"); // send to the process command to stop working
writer.newLine();
writer.flush();
It works quite well. I start my process, get InputStream and OutputStream from it, and use the streams to interact with the process.
It appears Scala Process trait provides no ways to write to it. ProcessBuilder is useless after process run. And ProcessIO is just for IO catching and handling.
Are there any ways to write to Scala running process?
UPDATE:
I don't see how I may use ProcessIO to pass a string to running process.
I did the following.
import scala.io.Source
import scala.sys.process._
object Sample extends App {
def out = (output: java.io.OutputStream) => {
output.flush()
output.close()
}
def in = (input: java.io.InputStream) => {
println("Stdout: " + Source.fromInputStream(input).mkString)
input.close()
}
def go = {
val dir = "/path/to/working/dir/"
val stockfishSeq = Seq("wine", dir + "/stockfish_8_x32.exe")
val pio = new ProcessIO(out, in, err => {})
val stockfish = Process(stockfishSeq)
stockfish.run(pio)
Thread.sleep(5000)
System.out.write("quit\n".getBytes)
pio.writeInput(System.out) // "writeInput" is function "out" which I have passed to conforming ProcessIO instance. I can invoke it from here. It takes OutputStream but where can I obtain it? Here I just pass System.out for example.
}
go
}
Of course it does not work and I failed to understand how to implement functionality as in my Java snippet above. It would be great to have advice or snippet of Scala code clearing my issue.
I think the documentation around Scala processes (specifically the usage and semantics of ProcessIO) could use some improvement. The first time I tried using this API, I also found it very confusing, and it took some trial and error to get my subprocess i/o working correctly.
I think seeing a simple example is probably all you really need. I'll do something really simple: invoking bc as a subprocess to do some trivial computations, and then printing the answers to my stdout. My goal is to do something like this (but from Scala rather than from my shell):
$ printf "1+2\n3+4\n" | bc
3
7
Here's how I'd do it in Scala:
import scala.io.Source
import scala.sys.process._
object SimpleProcessExample extends App {
def out = (output: java.io.OutputStream) => {
output.flush()
output.close()
}
def in = (input: java.io.InputStream) => {
println("Stdout: " + Source.fromInputStream(input).mkString)
input.close()
}
// limit scope of any temporary variables
locally {
val calcCommand = "bc"
// strings are implicitly converted to ProcessBuilder
// via scala.sys.process.ProcessImplicits.stringToProcess(_)
val calcProc = calcCommand.run(new ProcessIO(
// Handle subprocess's stdin
// (which we write via an OutputStream)
in => {
val writer = new java.io.PrintWriter(in)
writer.println("1 + 2")
writer.println("3 + 4")
writer.close()
},
// Handle subprocess's stdout
// (which we read via an InputStream)
out => {
val src = scala.io.Source.fromInputStream(out)
for (line <- src.getLines()) {
println("Answer: " + line)
}
src.close()
},
// We don't want to use stderr, so just close it.
_.close()
))
// Using ProcessBuilder.run() will automatically launch
// a new thread for the input/output routines passed to ProcessIO.
// We just need to wait for it to finish.
val code = calcProc.exitValue()
println(s"Subprocess exited with code $code.")
}
}
Notice that you don't actually call any of the methods of the ProcessIO object directly because they're automatically called by the ProcessBuilder.
Here's the result:
$ scala SimpleProcessExample
Answer: 3
Answer: 7
Subprocess exited with code 0.
If you wanted interaction between the input and output handlers to the subprocess, you can use standard thread communication tools (e.g., have both close over an instance of BlockingQueue).
Here is an example of obtaining input and output streams from a process, which you can write to and read from after the process starts:
object demo {
import scala.sys.process._
def getIO = {
// create piped streams that can attach to process streams:
val procInput = new java.io.PipedOutputStream()
val procOutput = new java.io.PipedInputStream()
val io = new ProcessIO(
// attach to the process's internal input stream
{ in =>
val istream = new java.io.PipedInputStream(procInput)
val buf = Array.fill(100)(0.toByte)
var br = 0
while (br >= 0) {
br = istream.read(buf)
if (br > 0) { in.write(buf, 0, br) }
}
in.close()
},
// attach to the process's internal output stream
{ out =>
val ostream = new java.io.PipedOutputStream(procOutput)
val buf = Array.fill(100)(0.toByte)
var br = 0
while (br >= 0) {
br = out.read(buf)
if (br > 0) { ostream.write(buf, 0, br) }
}
out.close()
},
// ignore stderr
{ err => () }
)
// run the command with the IO object:
val cmd = List("awk", "{ print $1 + $2 }")
val proc = cmd.run(io)
// wrap the raw streams in formatted IO objects:
val procO = new java.io.BufferedReader(new java.io.InputStreamReader(procOutput))
val procI = new java.io.PrintWriter(procInput, true)
(procI, procO)
}
}
Here's a short example of using the input and output objects. Note that it's hard to guarantee that the process will receive it's input until you close the input streams/objects, since everything is piped, buffered, etc.
scala> :load /home/eje/scala/input2proc.scala
Loading /home/eje/scala/input2proc.scala...
defined module demo
scala> val (procI, procO) = demo.getIO
procI: java.io.PrintWriter = java.io.PrintWriter#7e809b79
procO: java.io.BufferedReader = java.io.BufferedReader#5cc126dc
scala> procI.println("1 2")
scala> procI.println("3 4")
scala> procI.println("5 6")
scala> procI.close()
scala> procO.readLine
res4: String = 3
scala> procO.readLine
res5: String = 7
scala> procO.readLine
res6: String = 11
scala>
In general, if you are managing both input and output simultaneously in the same thread, there is the potential for deadlock, since either read or write can block waiting for the other. It is safest to run input logic and output logic in their own threads. With these threading concerns in mind, it is also possible to just put the input and output logic directly into the definitions { in => ... } and { out => ... }, as these are both run in separate threads automatically
I haven't actually tried this, but the documentation says that you can use a instance of ProcessIO to handle the Process's input and output in a manner similar to what you would do in Java.
var outPutStream: Option[OutputStream] = None
val io = new ProcessIO(
{ outputStream =>
outPutStream = Some(outputStream)
},
Source.fromInputStream(_).getLines().foreach(println),
Source.fromInputStream(_).getLines().foreach(println)
)
command run io
val out = outPutStream.get
out.write("test" getBytes())
You can get an InputStream in the same way.
Related
How can I get some kind of writeable stream connected to stdin (and also readable streams connected to stdout and stderr) when launching a process via scala.sys.process library? Here's the code that doesn't work (doesn't even print debug messages)
val p = Process("wc -l")
val io = BasicIO.standard(true)
val lines = Seq("a", "b", "c") mkString "\n"
val buf = lines.getBytes(StandardCharsets.UTF_8)
io withInput { w =>
println("Writing")
w.write(buf)
}
io withOutput { i =>
val s = new BufferedReader(new InputStreamReader(i)).readLine()
println(s"Output is $s")
}
You have a couple of problems.
First in your snippet you never connect your process with the io and never run it.
That can be done like this: p run io.
Second, the withInput & withOutput methods return a NEW ProcessIO they DON'T mutate the actual, and since you don't assign the return of those calls to a variable, you are doing nothing.
The following snippet fixes both problems, hope it works for you.
import scala.io.Source
import scala.sys.process._
import java.nio.charset.StandardCharsets
val p = Process("wc -l")
val io =
BasicIO.standard(true)
.withInput { w =>
val lines = Seq("a", "b", "c").mkString("", "\n", "\n")
val buf = lines.getBytes(StandardCharsets.UTF_8)
println("Writing")
w.write(buf)
w.close()
}
.withOutput { i =>
val s = Source.fromInputStream(i)
println(s"Output is ${s.getLines.mkString(",")}")
i.close()
}
p run io
Don't doubt to ask for clarification.
PS: it prints "Output is 3" - (Thanks to Dima for pointing the mistake).
I am trying to build a class that starts a system process which waits for stdin. The class should have another method which takes a string, inputs that into the system process, and return the process' output.
The reason is that starting the process involves loading a lot of data and hence takes a while.
I am trying to dummy-test this with bc, so that bc is started and waits for input. I would envision an interface like this:
case class BcWrapper(executable: File) {
var bc: Option[???] = None
def startBc(): Unit = bc = Some(???)
def calc(input: String): String = bc.get.???
def stopBc(): Unit = bc.get.???
}
I would like to be able to use it like this:
val wrapper = BcWrapper(new File("/usr/bin/bc"))
wrapper.startBc()
val result1 = wrapper.calc("1 + 1") // should be "2"
val result2 = wrapper.calc(???)
[...]
wrapper.stopBc()
This topic has been touched in multiple questions, but never fully answered for a use case like this one. This question or this one seems to come close. However, I am not sure how to implement the ProcessLogger, nor whether to use one in the first place.
Unfortunately, the Scala documentation is not very elaborate either.
Note that I do not want to read from stdin, but want to call a function.
The background is that I want to read a large file, read it line by line, preprocess the lines, pass them to the external process, and post-process the output.
You can get something similar, but simpler, like so.
import sys.process._
import util.Try
class StdInReader(val reader :String) {
def send(input :String) :Try[String] =
Try(s"/bin/echo $input".#|(reader).!!.trim)
}
usage:
val bc = new StdInReader("/usr/bin/bc")
bc.send("2 * 8") //res0: scala.util.Try[String] = Success(16)
bc.send("12 + 8") //res1: scala.util.Try[String] = Success(20)
bc.send("22 - 8") //res2: scala.util.Try[String] = Success(14)
Programs that send a non-zero exit-code (bc doesn't) will result with a Failure().
If you need more fine-grained control you might start with something like this and expand on it.
import sys.process._
class ProcHandler(val cmnd :String) {
private val resbuf = collection.mutable.Buffer.empty[String]
def run(data :Seq[String]) :Unit = {
cmnd.run(new ProcessIO(
in => {
val writer = new java.io.PrintWriter(in)
data.foreach(writer.println)
writer.close()
},
out => {
val src = io.Source.fromInputStream(out)
src.getLines().foreach(resbuf += _)
src.close()
},
_.close() //maybe create separate buffer for stderr?
)).exitValue()
}
def results() :Seq[String] = {
val rs = collection.mutable.Buffer.empty[String]
resbuf.copyToBuffer(rs)
resbuf.clear()
rs
}
}
usage:
val bc = new ProcHandler("/usr/bin/bc")
bc.run(List("4+5","6-2","2*5"))
bc.run(List("99/3","11*77"))
bc.results() //res0: Seq[String] = ArrayBuffer(9, 4, 10, 33, 847)
OK, I did some more research and found this. It appears to get at what you want but there are limitations. In particular, the process stays open for input until you want to get output. At that point IO streams are closed to insure all buffers are flushed.
import sys.process._
import util.Try
class ProcHandler(val cmnd :String) {
private val procInput = new java.io.PipedOutputStream()
private val procOutput = new java.io.PipedInputStream()
private val proc = cmnd.run( new ProcessIO(
{ in => // attach to the process's internal input stream
val istream = new java.io.PipedInputStream(procInput)
val buf = Array.fill(100)(0.toByte)
Iterator.iterate(istream.read(buf)){ br =>
in.write(buf, 0, br)
istream.read(buf)
}.takeWhile(_>=0).toList
in.close()
},
{ out => // attach to the process's internal output stream
val ostream = new java.io.PipedOutputStream(procOutput)
val buf = Array.fill(100)(0.toByte)
Iterator.iterate(out.read(buf)){ br =>
ostream.write(buf, 0, br)
out.read(buf)
}.takeWhile(_>=0).toList
out.close()
},
_ => () // ignore stderr
))
private val procO = new java.io.BufferedReader(new java.io.InputStreamReader(procOutput))
private val procI = new java.io.PrintWriter(procInput, true)
def feed(str :String) :Unit = procI.println(str)
def feed(ss :Seq[String]) :Unit = ss.foreach(procI.println)
def read() :List[String] = {
procI.close() //close input before reading output
val lines = Stream.iterate(Try(procO.readLine)){_ =>
Try(procO.readLine)
}.takeWhile(_.isSuccess).map(_.get).toList
procO.close()
lines
}
}
usage:
val bc = new ProcHandler("/usr/bin/bc")
bc.feed(List("9*3","4+11")) //res0: Unit = ()
bc.feed("4*13") //res1: Unit = ()
bc.read() //res2: List[String] = List(27, 15, 52)
bc.read() //res3: List[String] = List()
OK, this is my final word on the subject. I think this ticks every item on your wish list: start the process only once, it stays alive until actively closed, allows alternating the writing and reading.
import sys.process._
class ProcHandler(val cmnd :Seq[String]) {
private var os: java.io.OutputStream = null
private var is: java.io.InputStream = null
private val pio = new ProcessIO(os = _, is = _, _.close())
private val proc = cmnd.run(pio)
def feed(ss :String*) :Unit = {
ss.foreach(_.foreach(os.write(_)))
os.flush()
}
def ready :Boolean = is.available() > 0
def read() :String = {
Seq.fill[Char](is.available())(is.read().toChar).mkString
}
def close() :Unit = {
proc.exitValue()
os.close()
is.close()
}
}
There are still issues and much room for improvement. IO is handled at a basic level (streams) and I'm not sure what I'm doing here is completely safe and correct. The input, feed(), is required to supply the necessary NewLine terminations, and the output, read(), is just a raw String and not separated into a nice collection of string results.
Note that this will bleed system resources if the client code fails to close() all processes.
Note also that reading doesn't wait for content (i.e. no blocking). After writing the response might not be immediately available.
usage:
val bc = new ProcHandler(Seq("/usr/bin/bc","-q"))
bc.feed("44-21\n", "21*4\n")
bc.feed("67+11\n")
if (bc.ready) bc.read() else "not ready" // "23\n84\n78\n"
bc.feed("67-11\n")
if (bc.ready) bc.read() else "not ready" // "56\n"
bc.feed("67*11\n", "1+2\n")
if (bc.ready) bc.read() else "not ready" // "737\n3\n"
if (bc.ready) bc.read() else "not ready" // "not ready"
bc.close()
I want to stream some files and zip them on the fly, so users can download multiple files into a single zipped file without writing anything to the local disk. However, my current implementation holds everything in the memory, and will no work for large files. Is there any way to fix it?
I was looking at this implementation: https://gist.github.com/kirked/03c7f111de0e9a1f74377bf95d3f0f60, but couldn't figure out how to use it.
import java.io.{BufferedOutputStream, ByteArrayInputStream, ByteArrayOutputStream}
import java.util.zip.{ZipEntry, ZipOutputStream}
import akka.stream.scaladsl.{StreamConverters}
import org.apache.commons.io.FileUtils
import play.api.mvc.{Action, Controller}
class HomeController extends Controller {
def single() = Action {
Ok.sendFile(
content = new java.io.File("C:\\Users\\a.csv"),
fileName = _ => "a.csv"
)
}
def zip() = Action {
Ok.chunked(StreamConverters.fromInputStream(fileByteData)).withHeaders(
CONTENT_TYPE -> "application/zip",
CONTENT_DISPOSITION -> s"attachment; filename = test.zip"
)
}
def fileByteData(): ByteArrayInputStream = {
val fileList = List(
new java.io.File("C:\\Users\\a.csv"),
new java.io.File("C:\\Users\\b.csv")
)
val baos = new ByteArrayOutputStream()
val zos = new ZipOutputStream(new BufferedOutputStream(baos))
try {
fileList.map(file => {
zos.putNextEntry(new ZipEntry(file.toPath.getFileName.toString))
zos.write(FileUtils.readFileToByteArray(file))
zos.closeEntry()
})
} finally {
zos.close()
}
new ByteArrayInputStream(baos.toByteArray)
}
}
Instead of using a ByteArrayOutputStream to buffer the contents in an array then putting them into a ByteArrayInputStream you could use Java's piping mechanism.
Here's a sketch solution:
def zip() = Action {
// Create Source that listens to an OutputStream
// and pass it to `fileByteData` method.
val zipSource: Source[ByteString, Unit] =
StreamConverters
.asOutputStream()
.mapMaterializedValue(fileByteData)
Ok.chunked(zipSource).withHeaders(
CONTENT_TYPE -> "application/zip",
CONTENT_DISPOSITION -> s"attachment; filename = test.zip")
}
// Send the file data, given an OutputStream to write to.
def fileByteData(os: OutputStream): Unit = {
val fileList = List(
new java.io.File("C:\\Users\\a.csv"),
new java.io.File("C:\\Users\\b.csv")
)
val zos = new ZipOutputStream(os)
val buffer: Array[Byte] = new Array[Byte](2048)
try {
for (file <- fileList) {
zos.putNextEntry(new ZipEntry(file.toPath.getFileName.toString))
val fis = new Files.newInputStream(file.toPath)
try {
#tailrec
def zipFile(): Unit = {
val bytesRead = fis.read(buffer)
if (bytesRead == -1) () else {
zos.write(buffer, 0, bytesRead)
zipFile()
}
}
zipFile()
} finally fis.close()
zos.closeEntry()
}
} finally {
zos.close()
}
}
This is just an outline of an approach. You'll also want to make sure:
- the threading is OK - the fileByteData will hopefully run on a different thread to the sending thread
- the error handling is OK - e.g. all streams are closed properly if there's an error on either the server (e.g. file not found) or client side (early disconnect)
I'm using some sample Scala code to make a server that receives a file over websocket, stores the file temporarily, runs a bash script on it, and then returns stdout by TextMessage.
Sample code was taken from this github project.
I edited the code slightly within echoService so that it runs another function that processes the temporary file.
object WebServer {
def main(args: Array[String]) {
implicit val actorSystem = ActorSystem("akka-system")
implicit val flowMaterializer = ActorMaterializer()
val interface = "localhost"
val port = 3000
import Directives._
val route = get {
pathEndOrSingleSlash {
complete("Welcome to websocket server")
}
} ~
path("upload") {
handleWebSocketMessages(echoService)
}
val binding = Http().bindAndHandle(route, interface, port)
println(s"Server is now online at http://$interface:$port\nPress RETURN to stop...")
StdIn.readLine()
binding.flatMap(_.unbind()).onComplete(_ => actorSystem.shutdown())
println("Server is down...")
}
implicit val actorSystem = ActorSystem("akka-system")
implicit val flowMaterializer = ActorMaterializer()
val echoService: Flow[Message, Message, _] = Flow[Message].mapConcat {
case BinaryMessage.Strict(msg) => {
val decoded: Array[Byte] = msg.toArray
val imgOutFile = new File("/tmp/" + "filename")
val fileOuputStream = new FileOutputStream(imgOutFile)
fileOuputStream.write(decoded)
fileOuputStream.close()
TextMessage(analyze(imgOutFile))
}
case BinaryMessage.Streamed(stream) => {
stream
.limit(Int.MaxValue) // Max frames we are willing to wait for
.completionTimeout(50 seconds) // Max time until last frame
.runFold(ByteString(""))(_ ++ _) // Merges the frames
.flatMap { (msg: ByteString) =>
val decoded: Array[Byte] = msg.toArray
val imgOutFile = new File("/tmp/" + "filename")
val fileOuputStream = new FileOutputStream(imgOutFile)
fileOuputStream.write(decoded)
fileOuputStream.close()
Future(Source.single(""))
}
TextMessage(analyze(imgOutFile))
}
private def analyze(imgfile: File): String = {
val p = Runtime.getRuntime.exec(Array("./run-vision.sh", imgfile.toString))
val br = new BufferedReader(new InputStreamReader(p.getInputStream, StandardCharsets.UTF_8))
try {
val result = Stream
.continually(br.readLine())
.takeWhile(_ ne null)
.mkString
result
} finally {
br.close()
}
}
}
}
During testing using Dark WebSocket Terminal, case BinaryMessage.Strict works fine.
Problem: However, case BinaryMessage.Streaming doesn't finish writing the file before running the analyze function, resulting in a blank response from the server.
I'm trying to wrap my head around how Futures are being used here with the Flows in Akka-HTTP, but I'm not having much luck outside trying to get through all the official documentation.
Currently, .mapAsync seems promising, or basically finding a way to chain futures.
I'd really appreciate some insight.
Yes, mapAsync will help you in this occasion. It is a combinator to execute Futures (potentially in parallel) in your stream, and present their results on the output side.
In your case to make things homogenous and make the type checker happy, you'll need to wrap the result of the Strict case into a Future.successful.
A quick fix for your code could be:
val echoService: Flow[Message, Message, _] = Flow[Message].mapAsync(parallelism = 5) {
case BinaryMessage.Strict(msg) => {
val decoded: Array[Byte] = msg.toArray
val imgOutFile = new File("/tmp/" + "filename")
val fileOuputStream = new FileOutputStream(imgOutFile)
fileOuputStream.write(decoded)
fileOuputStream.close()
Future.successful(TextMessage(analyze(imgOutFile)))
}
case BinaryMessage.Streamed(stream) =>
stream
.limit(Int.MaxValue) // Max frames we are willing to wait for
.completionTimeout(50 seconds) // Max time until last frame
.runFold(ByteString(""))(_ ++ _) // Merges the frames
.flatMap { (msg: ByteString) =>
val decoded: Array[Byte] = msg.toArray
val imgOutFile = new File("/tmp/" + "filename")
val fileOuputStream = new FileOutputStream(imgOutFile)
fileOuputStream.write(decoded)
fileOuputStream.close()
Future.successful(TextMessage(analyze(imgOutFile)))
}
}
Say, in an action I have:
val linesEnu = {
val is = new java.io.FileInputStream(path)
val isr = new java.io.InputStreamReader(is, "UTF-8")
val br = new java.io.BufferedReader(isr)
import scala.collection.JavaConversions._
val rows: scala.collection.Iterator[String] = br.lines.iterator
Enumerator.enumerate(rows)
}
Ok.feed(linesEnu).as(HTML)
How to close readers/streams?
There is a onDoneEnumerating callback that functions like finally (will always be called whether or not the Enumerator fails). You can close the streams there.
val linesEnu = {
val is = new java.io.FileInputStream(path)
val isr = new java.io.InputStreamReader(is, "UTF-8")
val br = new java.io.BufferedReader(isr)
import scala.collection.JavaConversions._
val rows: scala.collection.Iterator[String] = br.lines.iterator
Enumerator.enumerate(rows).onDoneEnumerating {
is.close()
// ... Anything else you want to execute when the Enumerator finishes.
}
}
The IO tools provided by Enumerator give you this kind of resource management out of the box—e.g. if you create an enumerator with fromStream, the stream is guaranteed to get closed after running (even if you only read a single line, etc.).
So for example you could write the following:
import play.api.libs.iteratee._
val splitByNl = Enumeratee.grouped(
Traversable.splitOnceAt[Array[Byte], Byte](_ != '\n'.toByte) &>>
Iteratee.consume()
) compose Enumeratee.map(new String(_, "UTF-8"))
def fileLines(path: String): Enumerator[String] =
Enumerator.fromStream(new java.io.FileInputStream(path)).through(splitByNl)
It's a shame that the library doesn't provide a linesFromStream out of the box, but I personally would still prefer to use fromStream with hand-rolled splitting, etc. over using an iterator and providing my own resource management.