Stream input to external process in Scala - scala

I have an Iterable[String] and I want to stream that to an external Process and return an Iterable[String] for the output.
I feel like this should work as it compiles
import scala.sys.process._
object PipeUtils {
implicit class IteratorStream(s: TraversableOnce[String]) {
def pipe(cmd: String) = s.toStream.#>(cmd).lines
def run(cmd: String) = s.toStream.#>(cmd).!
}
}
However, Scala tries to execute the contents of s instead of pass them in to standard in. Can anyone please tell me what I'm doing wrong?
UPDATE:
I think that my original problem was that the s.toStream was being implicity converted to a ProcessBuilder and then executed. This is incorrect as it's the input to the process.
I have come up with the following solution. This feels very hacky and wrong but it seems to work for now. I'm not writing this as an answer because I feel like the answer should be one line and not this gigantic thing.
object PipeUtils {
/**
* This class feels wrong. I think that for the pipe command it actually loads all of the output
* into memory. This could blow up the machine if used wrong, however, I cannot figure out how to get it to
* work properly. Hopefully http://stackoverflow.com/questions/28095469/stream-input-to-external-process-in-scala
* will get some good responses.
* #param s
*/
implicit class IteratorStream(s: TraversableOnce[String]) {
val in = (in: OutputStream) => {
s.foreach(x => in.write((x + "\n").getBytes))
in.close
}
def pipe(cmd: String) = {
val output = ListBuffer[String]()
val io = new ProcessIO(in,
out => {Source.fromInputStream(out).getLines.foreach(output += _)},
err => {Source.fromInputStream(err).getLines.foreach(println)})
cmd.run(io).exitValue
output.toIterable
}
def run(cmd: String) = {
cmd.run(BasicIO.standard(in)).exitValue
}
}
}
EDIT
The motivation for this comes from using Spark's .pipe function on an RDD. I want this exact same functionality on my local code.

Assuming scala 2.11+, you should use lineStream as suggested by #edi. The reason is that you get a streaming response as it becomes available instead of a batched response. Let's say I have a shell script echo-sleep.sh:
#/usr/bin/env bash
# echo-sleep.sh
while read line; do echo $line; sleep 1; done
and we want to call it from scala using code like the following:
import scala.sys.process._
import scala.language.postfixOps
import java.io.ByteArrayInputStream
implicit class X(in: TraversableOnce[String]) {
// Don't do the BAOS construction in real code. Just for illustration.
def pipe(cmd: String) =
cmd #< new ByteArrayInputStream(in.mkString("\n").getBytes) lineStream
}
Then if we do a final call like:
1 to 10 map (_.toString) pipe "echo-sleep.sh" foreach println
a number in the sequence appears on STDOUT every 1 second. If you buffer, and convert to an Iterable as in your example, you will lose this responsiveness.

Here's a solution demonstrating how to write the process code so that it streams both the input and output. The key is to produce a java.io.PipedInputStream that is passed to the input of the process. This stream is filled from the iterator asynchronously via a java.io.PipedOutputStream. Obviously, feel free to change the input type of the implicit class to an Iterable.
Here's an iterator used to show this works.
/**
* An iterator with pauses used to illustrate data streaming to the process to be run.
*/
class PausingIterator[A](zero: A, until: A, pauseMs: Int)(subsequent: A => A)
extends Iterator[A] {
private[this] var current = zero
def hasNext = current != until
def next(): A = {
if (!hasNext) throw new NoSuchElementException
val r = current
current = subsequent(current)
Thread.sleep(pauseMs)
r
}
}
Here's the actual code you want
import java.io.PipedOutputStream
import java.io.PipedInputStream
import java.io.InputStream
import java.io.PrintWriter
// For process stuff
import scala.sys.process._
import scala.language.postfixOps
// For asynchronous stream writing.
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Future
/**
* A streaming version of the original class. This does not block to wait for the entire
* input or output to be constructed. This allows the process to get data ASAP and allows
* the process to return information back to the scala environment ASAP.
*
* NOTE: Don't forget about error handling in the final production code.
*/
implicit class X(it: Iterator[String]) {
def pipe(cmd: String) = cmd #< iter2is(it) lineStream
/**
* Convert an iterator to an InputStream for use in the pipe function.
* #param it an iterator to convert
*/
private[this] def iter2is[A](it: Iterator[A]): InputStream = {
// What is written to the output stream will appear in the input stream.
val pos = new PipedOutputStream
val pis = new PipedInputStream(pos)
val w = new PrintWriter(pos, true)
// Scala 2.11 (scala 2.10, use 'future'). Executes asynchrously.
// Fill the stream, then close.
Future {
it foreach w.println
w.close
}
// Return possibly before pis is fully written to.
pis
}
}
The final call will show display 0 through 9 and will pause for 3 seconds in between the displaying of each number (second pause on the scala side, 1 second pause on the shell script side).
// echo-sleep.sh is the same script as in my previous post
new PausingIterator(0, 10, 2000)(_ + 1)
.map(_.toString)
.pipe("echo-sleep.sh")
.foreach(println)
Output
0 [ pause 3 secs ]
1 [ pause 3 secs ]
...
8 [ pause 3 secs ]
9 [ pause 3 secs ]

Related

What's the best way to wrap a monix Task with a start time and end time print the difference?

This is what I'm trying right now but it only prints "hey" and not metrics.
I don't want to add metric related stuff in the main function.
import java.util.Date
import monix.eval.Task
import monix.execution.Scheduler.Implicits.global
import scala.concurrent.Await
import scala.concurrent.duration.Duration
class A {
def fellow(): Task[Unit] = {
val result = Task {
println("hey")
Thread.sleep(1000)
}
result
}
}
trait AA extends A {
override def fellow(): Task[Unit] = {
println("AA")
val result = super.fellow()
val start = new Date()
result.foreach(e => {
println("AA", new Date().getTime - start.getTime)
})
result
}
}
val a = new A with AA
val res: Task[Unit] = a.fellow()
Await.result(res.runAsync, Duration.Inf)
You can describe a function such as this:
def measure[A](task: Task[A], logMillis: Long => Task[Unit]): Task[A] =
Task.deferAction { sc =>
val start = sc.clockMonotonic(TimeUnit.MILLISECONDS)
val stopTimer = Task.suspend {
val end = sc.clockMonotonic(TimeUnit.MILLISECONDS)
logMillis(end - start)
}
task.redeemWith(
a => stopTimer.map(_ => a)
e => stopTimer.flatMap(_ => Task.raiseError(e))
)
}
Some piece of advice:
Task values should be pure, along with the functions returning Tasks — functions that trigger side effects and return Task as results are broken
Task is not a 1:1 replacement for Future; when describing a Task, all side effects should be suspended (wrapped) in Task
foreach triggers the Task's evaluation and that's not good, because it triggers the side effects; I was thinking of deprecating and removing it, since its presence is tempting
stop using trait inheritance and just use plain functions — unless you deeply understand OOP and subtyping, it's best to avoid it if possible; and if you're into the Cake pattern, stop doing it and maybe join a support group 🙂
never measure time duration via new Date(), you need a monotonic clock for that and on top of the JVM that's System.nanoTime, which can be accessed via Monix's Scheduler by clockMonotonic, as exemplified above, the Scheduler being given to you via deferAction
stop blocking threads, because that's error prone — instead of doing Thread.sleep, do Task.sleep and all Await.result calls are problematic, unless they are in main or in some other place where dealing with asynchrony is not possible
Hope this helps.
Cheers,
Like #Pierre mentioned, latest version of Monix Task has Task.timed, you can do
timed <- task.timed
(duration, t) = timed

Akka stream - Splitting a stream of ByteString into multiple files

I'm trying to split an incoming Akka stream of bytes (from the body of an http request, but it could also be from a file) into multiple files of a defined size.
For example, if I'm uploading a 10Gb file, it would create something like 10 files of 1Gb. The files would have randomly generated names. My issue is that I don't really know where to start, because all the responses and examples I've read are either storing the whole chunk into memory, or using a delimiter based on a string. Except I can't really have "chunks" of 1Gb, and then just write them to the disk..
Is there any easy way to perform that kind of operation ? My only idea would be to use something like this http://doc.akka.io/docs/akka/2.4/scala/stream/stream-cookbook.html#Chunking_up_a_stream_of_ByteStrings_into_limited_size_ByteStrings but transformed to something like FlowShape[ByteString, File], writting myself into a file the chunks until the max file size is reached, then creating a new file, etc.., and streaming back the created files. Which looks like an atrocious idea not using properly Akka..
Thanks in advance
I often revert to purely functional, non-akka, techniques for problems such as this and then "lift" those functions into akka constructs. By this I mean I try to use only scala "stuff" and then try to wrap that stuff inside of akka later on...
File Creation
Starting with the FileOutputStream creation based on "randomly generated names":
def randomFileNameGenerator : String = ??? //not specified in question
import java.io.FileOutputStream
val randomFileOutGenerator : () => FileOutputStream =
() => new FileOutputStream(randomFileNameGenerator)
State
There needs to be some way of storing the "state" of the current file, e.g. the number of bytes already written:
case class FileState(byteCount : Int = 0,
fileOut : FileOutputStream = randomFileOutGenerator())
File Writing
First we determine if we'd breach the maximum file size threshold with the given ByteString:
import akka.util.ByteString
val isEndOfChunk : (FileState, ByteString, Int) => Boolean =
(state, byteString, maxBytes) =>
state.byteCount + byteString.length > maxBytes
We then have to write the function that creates a new FileState if we've maxed out the capacity of the current one or returns the current state if it is still below capacity:
val closeFileInState : FileState => Unit =
(_ : FileState).fileOut.close()
val getCurrentFileState(FileState, ByteString, Int) => FileState =
(state, byteString, maxBytes) =>
if(isEndOfChunk(maxBytes, state, byteString)) {
closeFileInState(state)
FileState()
}
else
state
The only thing left is to write to the FileOutputStream:
val writeToFileAndReturn(FileState, ByteString) => FileState =
(fileState, byteString) => {
fileState.fileOut write byteString.toArray
fileState copy (byteCount = fileState.byteCount + byteString.size)
}
//the signature ordering will become useful
def writeToChunkedFile(maxBytes : Int)(fileState : FileState, byteString : ByteString) : FileState =
writeToFileAndReturn(getCurrentFileState(maxBytes, fileState, byteString), byteString)
Fold On Any GenTraversableOnce
In scala a GenTraversableOnce is any collection, parallel or not, that has the fold operator. These include Iterator, Vector, Array, Seq, scala stream, ... Th final writeToChunkedFile function perfectly matches the signature of GenTraversableOnce#fold:
val anyIterable : Iterable = ???
val finalFileState = anyIterable.fold(FileState())(writetochunkedFile(maxBytes))
One final loose end; the last FileOutputStream needs to be closed as well. Since the fold will only emit that last FileState we can close that one:
closeFileInState(finalFileState)
Akka Streams
Akka Flow gets its fold from FlowOps#fold which happens to match the GenTraversableOnce signature. Therefore we can "lift" our regular functions into stream values similar to the way we used Iterable fold:
import akka.stream.scaladsl.Flow
def chunkerFlow(maxBytes : Int) : Flow[ByteString, FileState, _] =
Flow[ByteString].fold(FileState())(writeToChunkedFile(maxBytes))
The nice part about handling the problem with regular functions is that they can be used within other asynchronous frameworks beyond streams, e.g. Futures or Actors. You also don't need an akka ActorSystem in unit testing, just regular language data structures.
import akka.stream.scaladsl.Sink
import scala.concurrent.Future
def byteStringSink(maxBytes : Int) : Sink[ByteString, _] =
chunkerFlow(maxBytes) to (Sink foreach closeFileInState)
You can then use this Sink to drain HttpEntity coming from HttpRequest.
You could write a custom graph stage.
Your issue is similar to the one faced in alpakka during upload to amazon S3. ( google alpakka s3 connector.. they wont let me post more than 2 links)
For some reason the s3 connector DiskBuffer however writes the entire incoming source of bytestrings to a file, before emitting out the chunk to do further stream processing..
What we want is something similar to limit a source of byte strings to specific length. In the example, they have limited the incoming Source[ByteString, _] to a source of fixed sized byteStrings by maintaining a memory buffer. I adopted it to work with Files.
The advantage of this is that you can use a dedicated thread pool for this stage to do blocking IO. For good reactive stream you want to keep blocking IO in separate thread pool in actor system.
PS: this does not try to make files of exact size.. so if we read 2KB extra in a 100MB file.. we write those extra bytes to the current file rather than trying to achieve exact size.
import java.io.{FileOutputStream, RandomAccessFile}
import java.nio.channels.FileChannel
import java.nio.file.Path
import akka.stream.stage.{GraphStage, GraphStageLogic, InHandler, OutHandler}
import akka.stream._
import akka.util.ByteString
case class MultipartUploadChunk(path: Path, size: Int, partNumber: Int)
//Starts writing the byteStrings received from upstream to a file. Emits a path after writing a partSize number of bytes. Does not attemtp to write exact number of bytes.
class FileChunker(maxSize: Int, tempDir: Path, partSize: Int)
extends GraphStage[FlowShape[ByteString, MultipartUploadChunk]] {
assert(maxSize > partSize, "Max size should be larger than part size. ")
val in: Inlet[ByteString] = Inlet[ByteString]("PartsMaker.in")
val out: Outlet[MultipartUploadChunk] = Outlet[MultipartUploadChunk]("PartsMaker.out")
override val shape: FlowShape[ByteString, MultipartUploadChunk] = FlowShape.of(in, out)
override def createLogic(inheritedAttributes: Attributes): GraphStageLogic =
new GraphStageLogic(shape) with OutHandler with InHandler {
var partNumber: Int = 0
var length: Int = 0
var currentBuffer: Option[PartBuffer] = None
override def onPull(): Unit =
if (isClosed(in)) {
emitPart(currentBuffer, length)
} else {
pull(in)
}
override def onPush(): Unit = {
val elem = grab(in)
length += elem.size
val currentPart: PartBuffer = currentBuffer match {
case Some(part) => part
case None =>
val newPart = createPart(partNumber)
currentBuffer = Some(newPart)
newPart
}
currentPart.fileChannel.write(elem.asByteBuffer)
if (length > partSize) {
emitPart(currentBuffer, length)
//3. .increment part number, reset length.
partNumber += 1
length = 0
} else {
pull(in)
}
}
override def onUpstreamFinish(): Unit =
if (length > 0) emitPart(currentBuffer, length) // emit part only if something is still left in current buffer.
private def emitPart(maybePart: Option[PartBuffer], size: Int): Unit = maybePart match {
case Some(part) =>
//1. flush the part buffer and truncate the file.
part.fileChannel.force(false)
// not sure why we do this truncate.. but was being done in alpakka. also maybe safe to do.
// val ch = new FileOutputStream(part.path.toFile).getChannel
// try {
// println(s"truncating to size $size")
// ch.truncate(size)
// } finally {
// ch.close()
// }
//2emit the part
val chunk = MultipartUploadChunk(path = part.path, size = length, partNumber = partNumber)
push(out, chunk)
part.fileChannel.close() // TODO: probably close elsewhere.
currentBuffer = None
//complete stage if in is closed.
if (isClosed(in)) completeStage()
case None => if (isClosed(in)) completeStage()
}
private def createPart(partNum: Int): PartBuffer = {
val path: Path = partFile(partNum)
//currentPart.deleteOnExit() //TODO: Enable in prod. requests that the file be deleted when VM dies.
PartBuffer(path, new RandomAccessFile(path.toFile, "rw").getChannel)
}
/**
* Creates a file in the temp directory with name bmcs-buffer-part-$partNumber
* #param partNumber the part number in multipart upload.
* #return
* TODO:add unique id to the file name. for multiple
*/
private def partFile(partNumber: Int): Path =
tempDir.resolve(s"bmcs-buffer-part-$partNumber.bin")
setHandlers(in, out, this)
}
case class PartBuffer(path: Path, fileChannel: FileChannel) //TODO: see if you need mapped byte buffer. might be ok with just output stream / channel.
}
The idiomatic way to split a ByteString stream to multiple files is to use Alpakka's LogRotatorSink. From the documentation:
This sink will takes a function as parameter which returns a Bytestring => Option[Path] function. If the generated function returns a path the sink will rotate the file output to this new path and the actual ByteString will be written to this new file too. With this approach the user can define a custom stateful file generation implementation.
The following fileSizeRotationFunction is also from the documentation:
val fileSizeRotationFunction = () => {
val max = 10 * 1024 * 1024
var size: Long = max
(element: ByteString) =>
{
if (size + element.size > max) {
val path = Files.createTempFile("out-", ".log")
size = element.size
Some(path)
} else {
size += element.size
None
}
}
}
An example of its use:
val source: Source[ByteString, _] = ???
source.runWith(LogRotatorSink(fileSizeRotationFunction))

How to pass input in scala through command line

import scala.io._
object Sum {
def main(args :Array[String]):Unit = {
println("Enter some numbers and press ctrl-c")
val input = Source.fromInputStream(System.in)
val lines = input.getLines.toList
println("Sum "+sum(lines))
}
def toInt(in:String):Option[Int] =
try{
Some(Integer.parseInt(in.trim))
}
catch {
case e: NumberFormatException => None
}
def sum(in :Seq[String]) = {
val ints = in.flatMap(s=>toInt(s))
ints.foldLeft(0) ((a,b) => a +b)
} }
I am trying to run this program after passing input I have press
ctrl + c but
It gives this message E:\Scala>scala HelloWord.scala Enter some
numbers and press ctrl-c 1 2 3 Terminate batch job (Y/N)?
Additional observations, note trait App to make an object executable, hence not having to declare a main(...) function, for instance like this,
object Sum extends App {
import scala.io._
import scala.util._
val nums = Source.stdin.getLines.flatMap(v => Try(v.toInt).toOption)
println(s"Sum: ${nums.sum}")
}
Using Try, non successful conversions from String to Int are turned to None and flattened out.
Also note objects and classes are capitalized, hence instead of object sum by convention we write object Sum.
You can also use an external API. I really like scallop API
Try this piece of code. It should work as intended.
object Sum {
def main(args: Array[String]) {
val lines = io.Source.stdin.getLines
val numbers = lines.map(_.toInt)
println(s"Sum: ${numbers.sum}")
}
}
Plus, the correct shortcut to end the input stream is Ctrl + D.

Interact (i/o) with an external process in Scala

I'm looking for a simple way to start an external process and then write strings to its input and read its output.
In Python, this works:
mosesProcess = subprocess.Popen([mosesBinPath, '-f', mosesModelPath], stdin = subprocess.PIPE, stdout = subprocess.PIPE, stderr = subprocess.PIPE);
# ...
mosesProcess.stdin.write(aRequest);
mosesAnswer = mosesProcess.stdout.readline().rstrip();
# ...
mosesProcess.stdin.write(anotherRequest);
mosesAnswer = mosesProcess.stdout.readline().rstrip();
# ...
mosesProcess.stdin.close();
I think in Scala this should be done with scala.sys.process.ProcessBuilder and scala.sys.process.ProcessIO but I don't get how they work (especially the latter).
EDIT:
I have tried things like:
val inputStream = new scala.concurrent.SyncVar[java.io.OutputStream];
val outputStream = new scala.concurrent.SyncVar[java.io.InputStream];
val errStream = new scala.concurrent.SyncVar[java.io.InputStream];
val cmd = "myProc";
val pb = process.Process(cmd);
val pio = new process.ProcessIO(stdin => inputStream.put(stdin),
stdout => outputStream.put(stdout),
stderr => errStream.put(stderr));
pb.run(pio);
inputStream.get.write(("request1" + "\n").getBytes);
println(outputStream.get.read); // It is blocked here
inputStream.get.write(("request2" + "\n").getBytes);
println(outputStream.get.read);
inputStream.get.close()
But the execution gets stuck.
Granted, attrib below is not a great example on the write side of things. I have an EchoServer that would input/output
import scala.sys.process._
import java.io._
object EchoClient{
def main(args: Array[String]) {
var bContinue=true
var cmd="C:\\\\windows\\system32\\attrib.exe"
println(cmd)
val process = Process (cmd)
val io = new ProcessIO (
writer,
out => {scala.io.Source.fromInputStream(out).getLines.foreach(println)},
err => {scala.io.Source.fromInputStream(err).getLines.foreach(println)})
while (bContinue) {
process run io
var answer = readLine("Run again? (y/n)? ")
if (answer=="n" || answer=="N")
bContinue=false
}
}
def reader(input: java.io.InputStream) = {
// read here
}
def writer(output: java.io.OutputStream) = {
// write here
//
}
// TODO: implement an error logger
}
output below :
C:\\windows\system32\attrib.exe
A C:\dev\EchoClient$$anonfun$1.class
A C:\dev\EchoClient$$anonfun$2$$anonfun$apply$1.class
A C:\dev\EchoClient$$anonfun$2.class
A C:\dev\EchoClient$$anonfun$3$$anonfun$apply$2.class
A C:\dev\EchoClient$$anonfun$3.class
A C:\dev\EchoClient$.class
A C:\dev\EchoClient.class
A C:\dev\EchoClient.scala
A C:\dev\echoServer.bat
A C:\dev\EchoServerChg$$anonfun$main$1.class
A C:\dev\EchoServerChg$.class
A C:\dev\EchoServerChg.class
A C:\dev\EchoServerChg.scala
A C:\dev\ScannerTest$$anonfun$main$1.class
A C:\dev\ScannerTest$.class
A C:\dev\ScannerTest.class
A C:\dev\ScannerTest.scala
Run again? (y/n)?
Scala API for ProcessIO:
new ProcessIO(in: (OutputStream) ⇒ Unit, out: (InputStream) ⇒ Unit, err: (InputStream) ⇒ Unit)
I suppose you should provide at least two arguments, 1 outputStream function (writing to the process), 1 inputStream function (reading from the process).
For instance:
def readJob(in: InputStream) {
// do smthing with in
}
def writeJob(out: OutputStream) {
// do somthing with out
}
def errJob(err: InputStream) {
// do smthing with err
}
val process = new ProcessIO(writeJob, readJob, errJob)
Please keep in mind that the streams are Java streams so you will have to check Java API.
Edit: the package page provides examples, maybe you could take a look at them.
ProcessIO is the way to go for low level control and input and output interaction. There even is an often overlooked helper object BasicIO that assists with creating common ProcessIO instances for reading, connecting in/out streams with utility functions. You can look at the source for BasicIO.scala to see what it is doing internally in creating the ProcessIO Instances.
You can sometimes find inspiration from test cases or tools created for the class itself by the project. In the case of Scala, have a look at the source on GitHub. We are fortunate in that there is a detailed example of ProcessIO being used for the scala GraphViz Dot process runner DotRunner.scala!

sys.process to wrap a process as a function

I have an external process that I would like to treat as a
function from String=>String. Given a line of input, it will respond with a single line of output. It seems that I should use
scala.sys.process, which is clearly an elegant library that makes many
shell operations easily accessible from within scala. However, I
can't figure out how to perform this simple use case.
If I write a single line to the process' stdin, it prints the result
in a single line. How can I use sys.process to create a wrapper so I
can use the process interactively? For example, if I had an
implementation for ProcessWrapper, here is a program and it's output:
// abstract definition
class ProcessWrapper(executable: String) {
def apply(line: String): String
}
// program using an implementation
val process = new ProcessWrapper("cat -b")
println(process("foo"))
println(process("bar"))
println(process("baz"))
Output:
1 foo
2 bar
3 baz
It is important that the process is not reloaded for each call to process because there is a significant initialization step.
So - after my comment - this would be my solution
import java.io.BufferedReader
import java.io.File
import java.io.InputStream
import java.io.InputStreamReader
import scala.annotation.tailrec
class ProcessWrapper(cmdLine: String, lineListenerOut: String => Unit, lineListenerErr: String => Unit,
finishHandler: => Unit,
lineMode: Boolean = true, envp: Array[String] = null, dir: File = null) {
class StreamRunnable(val stream: InputStream, listener: String => Unit) extends Runnable {
def run() {
try {
val in = new BufferedReader(new InputStreamReader(this.stream));
#tailrec
def readLines {
val line = in.readLine
if (line != null) {
listener(line)
readLines
}
}
readLines
}
finally {
this.stream.close
finishHandler
}
}
}
val process = Runtime.getRuntime().exec(cmdLine, envp, dir);
val outThread = new Thread(new StreamRunnable(process.getInputStream, lineListenerOut), "StreamHandlerOut")
val errThread = new Thread(new StreamRunnable(process.getErrorStream, lineListenerErr), "StreamHandlerErr")
val sendToProcess = process.getOutputStream
outThread.start
errThread.start
def apply(txt: String) {
sendToProcess.write(txt.getBytes)
if (lineMode)
sendToProcess.write('\n')
sendToProcess.flush
}
}
object ProcessWrapper {
def main(args: Array[String]) {
val process = new ProcessWrapper("python -i", txt => println("py> " + txt),
err => System.err.println("py err> " + err), System.exit(0))
while (true) {
process(readLine)
}
}
}
The main part is the StreamRunnable, where the process is read in a thread and the received line is passed on to a "LineListener" (a simple String => Unit - function).
The main is just a sample implementation - calling python ;)
I'm not sure, but you want somethings like that ?
case class ProcessWrapper(executable: String) {
import java.io.ByteArrayOutputStream
import scala.concurrent.duration.Duration
import java.util.concurrent.TimeUnit
lazy val process = sys.runtime.exec(executable)
def apply(line: String, blockedRead: Boolean = true): String = {
process.getOutputStream().write(line.getBytes())
process.getOutputStream().flush()
val r = new ByteArrayOutputStream
if (blockedRead) {
r.write(process.getInputStream().read())
}
while (process.getInputStream().available() > 0) {
r.write(process.getInputStream().read())
}
r.toString()
}
def close() = process.destroy()
}
val process = ProcessWrapper("cat -b")
println(process("foo\n"))
println(process("bar\n"))
println(process("baz\n"))
println(process("buz\n"))
println(process("puz\n"))
process.close
Result :
1 foo
2 bar
3 baz
4 buz
5 puz
I think that PlayCLI is a better way.
http://blog.greweb.fr/2013/01/playcli-play-iteratees-unix-pipe/ came across this today and looks exactly like what you want
How about using an Akka actor. The actor can have state and thus a reference to an open program (in a thread). You can send messages to that actor.
ProcessWrapper might be a typed actor itself or just something that converts the calls of a function to a call of an actor. If you only have 'process' as method name, then wrapper ! "message" would be enough.
Having a program open and ready to receive commands sounds like an actor that receives messages.
Edit: Probably I got the requirements wrong. You want to send multiple lines to the same process. That's not possible with the below solution.
One possibility would be to add an extension method to the ProcessBuilder that allows for taking the input from a string:
implicit class ProcessBuilderWithStringInput(val builder: ProcessBuilder) extends AnyVal {
// TODO: could use an implicit for the character set
def #<<(s: String) = builder.#<(new ByteArrayInputStream(s.getBytes))
}
You can now use the method like this:
scala> ("bc":ProcessBuilder).#<<("3 + 4\n").!!
res9: String =
"7
"
Note that the type annotation is necessary, because we need two conversions (String -> ProcessBuilder -> ProcessBuilderWithStringInput, and Scala will only apply one conversion automatically.