Grabbing process stderr in Scala

When using the scala 2.9 process API, I can do things like
"ls -l"!
which will send the process stdout and stderr into my own.
Or:
val output = "ls -1"!!
which will return whatever was sent to stdout into the val output.
How can I similarly grab stderr?

You can create your own ProcessLogger:
import sys.process._
val logger = ProcessLogger(
(o: String) => println("out " + o),
(e: String) => println("err " + e))
scala> "ls" ! logger
out bin
out doc
out lib
out meta
out misc
out src
res15: Int = 0
scala> "ls -e" ! logger
err ls: invalid option -- e
err Try `ls --help' for more information.
res16: Int = 2
Edit: The previous example simply prints, but it could easily store the output in some structure:
val out = new StringBuilder
val err = new StringBuilder
val logger = ProcessLogger(
(o: String) => out.append(o),
(e: String) => err.append(e))
scala> "ls" ! logger
res22: Int = 0
scala> out
res23: StringBuilder = bindoclibmetamiscsrc
scala> "ls -e" ! logger
res27: Int = 2
scala> out
res28: StringBuilder =
scala> err
res29: StringBuilder = ls: invalid option -- eTry `ls --help' for more information.
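Note that the StringBuilder version above concatenates lines without separators. A minimal sketch that keeps each line separate, assuming a Unix-like system where sh is available. The names CaptureStreams and run are hypothetical helpers made up for illustration; only ProcessLogger and ! come from sys.process.

```scala
import scala.sys.process._
import scala.collection.mutable.ListBuffer

object CaptureStreams {
  // Hypothetical helper: runs a command and returns
  // (exit code, stdout lines, stderr lines).
  def run(cmd: Seq[String]): (Int, List[String], List[String]) = {
    val out = ListBuffer.empty[String]
    val err = ListBuffer.empty[String]
    // The logger receives one callback per line on each stream.
    val logger = ProcessLogger(
      (o: String) => out += o,
      (e: String) => err += e)
    val exitCode = cmd ! logger // blocks until the process exits
    (exitCode, out.toList, err.toList)
  }
}
```

For example, CaptureStreams.run(Seq("sh", "-c", "echo hello; echo oops 1>&2")) should return the exit code together with one list per stream.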

Related

scalaz, read and map the lines of a file

The following code to read and map the lines of a file works ok:
def readLines(fileName: String) = scala.io.Source.fromFile(fileName).getLines
def toInt(line: String) = line.toInt
val numbers: Iterator[Int] = readLines("/tmp/file.txt").map(toInt).map(_ * 2)
println(numbers.toList)
I get an iterator of Ints if execution goes well. But the program throws an exception if the file is not found, or if some line contains letters.
How can I transform the program to use scalaz monads and get a Disjunction[Exception, List[Int]]?
I tried this on scalaz 7.2.6, but it does not compile:
import scalaz.Scalaz._
import scalaz._
def readLines(fileName: String): Disjunction[Any, List[String]] =
try { scala.io.Source.fromFile(fileName).getLines.toList.right }
catch { case e: java.io.IOException => e.left}
def toInt(line: String): Disjunction[Any, Int] =
try { line.toInt.right }
catch { case e: NumberFormatException => e.left}
val numbers: Disjunction[Any, Int] = for {
lines: List[String] <- readLines("/tmp/file.txt")
line: String <- lines
n: Int <- toInt(line)
} yield (n * 2)
it fails to compile with these errors:
Error:(89, 37) could not find implicit value for parameter M: scalaz.Monoid[Any]
lines: List[String] <- readLines("/tmp/file.txt")
Error:(89, 37) not enough arguments for method filter: (implicit M: scalaz.Monoid[Any])scalaz.\/[Any,List[String]].
Unspecified value parameter M.
lines: List[String] <- readLines("/tmp/file.txt")
Error:(91, 20) could not find implicit value for parameter M: scalaz.Monoid[Any]
n: Int <- toInt(line)
Error:(91, 20) not enough arguments for method filter: (implicit M: scalaz.Monoid[Any])scalaz.\/[Any,Int].
Unspecified value parameter M.
n: Int <- toInt(line)
I don't understand the errors. What is the problem?
And how can I improve this code so that it does not read the whole file into memory, but reads and maps one line at a time?
Update: Answer from Filippo
import scalaz._
def readLines(fileName: String) = \/.fromTryCatchThrowable[List[String], Exception] {
scala.io.Source.fromFile(fileName).getLines.toList
}
def toInt(line: String) = \/.fromTryCatchThrowable[Int, NumberFormatException](line.toInt)
type λ[+A] = Exception \/ A
val numbers = for {
line: String <- ListT[λ, String](readLines("/tmp/file.txt"))
n: Int <- ListT[λ, Int](toInt(line).map(List(_)))
} yield n * 2
println(numbers)
To answer the second part of your question, I would simply use the Iterator out of the fromFile method:
val lines: Iterator[String] = scala.io.Source.fromFile(fileName).getLines
If you want to use toInt to convert String to Int:
import scala.util.Try
def toInt(line: String): Iterator[Int] =
Try(line.toInt).map(Iterator(_)).getOrElse(Iterator.empty)
Then numbers could look like:
val numbers = readLines("/tmp/file.txt").flatMap(toInt).map(_ * 2)
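To see what this version does with a bad line, here is a small self-contained sketch that feeds it in-memory lines instead of a file. The object name SkipBadLines and the helper doubled are hypothetical; the point to notice is that unparsable lines are dropped silently rather than reported as errors.

```scala
import scala.util.Try

object SkipBadLines {
  // A line that parses yields a one-element iterator; a bad line yields none.
  def toInt(line: String): Iterator[Int] =
    Try(line.trim.toInt).toOption.iterator

  // Lazily parses and doubles each line; nothing runs until the
  // resulting iterator is consumed.
  def doubled(lines: Iterator[String]): Iterator[Int] =
    lines.flatMap(line => toInt(line)).map(_ * 2)
}
```

With the input Iterator("1", "x", "10"), consuming SkipBadLines.doubled gives 2 and 20, and the "x" line simply disappears. If the caller needs to know about the failure, this is not the right shape.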
EDIT
Given all these try and catch blocks, if you want to keep using the monadic for-comprehension I would suggest looking at a scalaz helper such as .fromTryCatchThrowable on Disjunction:
import scalaz._, Scalaz._
def readLines(fileName: String): Disjunction[Exception, List[String]] =
Disjunction.fromTryCatchThrowable[List[String], Exception](scala.io.Source.fromFile(fileName).getLines.toList)
def toInt(line: String): Disjunction[Exception, Int] =
Disjunction.fromTryCatchThrowable[Int, Exception](line.toInt)
Now we also have Exception instead of Any as the left type.
val numbers = for {
lines: List[String] <- readLines("/tmp/file.txt")
line: String <- lines // The problem is here
n: Int <- toInt(line)
} yield n * 2
The problem with this monadic for-comprehension is that the first and third lines use the Disjunction context, but the second one uses the List monad. Using a monad transformer like ListT or DisjunctionT here is possible but probably overkill.
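One way to avoid mixing the two contexts without a transformer is to fold over the list and stay inside a single error-carrying type. The sketch below swaps scalaz's \/ for the standard library's Either and Try, so it is not the scalaz API; it assumes Scala 2.12 or later (right-biased Either, Try.toEither), and the names ParseAll and parseAll are hypothetical.

```scala
import scala.util.Try

object ParseAll {
  // Parses and doubles every line, returning the first (leftmost) failure
  // if any line is not a valid Int.
  def parseAll(lines: List[String]): Either[Throwable, List[Int]] =
    lines.foldRight[Either[Throwable, List[Int]]](Right(Nil)) { (line, acc) =>
      for {
        n    <- Try(line.trim.toInt).toEither // Left on NumberFormatException
        rest <- acc                           // Left if a later line failed
      } yield (n * 2) :: rest
    }
}
```

Every generator in this for-comprehension is an Either, so the desugared flatMap and map calls all agree on the container type.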
EDIT - in reply to the comment
As mentioned, if we want a single monadic for-comprehension, we need a monad transformer, in this case ListT. Disjunction has two type parameters, while a Monad M[_] takes only one. We need to handle this "extra type parameter", for instance using a type lambda:
def readLines(fileName: String) = \/.fromTryCatchThrowable[List[String], Exception] {
scala.io.Source.fromFile(fileName).getLines.toList
}
val listTLines = ListT[({type λ[+a] = Exception \/ a})#λ, String](readLines("/tmp/file.txt"))
What is the type of listTLines? The ListT transformer: ListT[\/[Exception, +?], String]
The last step in the original for-comprehension was toInt:
def toInt(line: String) = \/.fromTryCatchThrowable[Int, NumberFormatException](line.toInt)
val listTNumber = ListT[\/[Exception, +?], Int](toInt("line"))
What is the type of listTNumber? It doesn't even compile, because toInt wraps an Int and not a List[Int]. We need a ListT to join that for-comprehension; one trick is to change listTNumber to:
val listTNumber = ListT[\/[Exception, +?], Int](toInt("line").map(List(_)))
Now we have both steps:
val numbers = for {
line: String <- ListT[\/[Exception, +?], String](readLines("/tmp/file.txt"))
n: Int <- ListT[\/[Exception, +?], Int](toInt(line).map(List(_)))
} yield n * 2
scala> numbers.run.getOrElse(List.empty) foreach println
2
20
200
If you are wondering why all this unwrapping:
scala> val unwrap1 = numbers.run
unwrap1: scalaz.\/[Exception,List[Int]] = \/-(List(2, 20, 200))
scala> val unwrap2 = unwrap1.getOrElse(List())
unwrap2: List[Int] = List(2, 20, 200)
scala> unwrap2 foreach println
2
20
200
(assuming that the sample file contains the lines: 1, 10, 100)
EDIT - comment about compilation issues
The code above compiles thanks to the Kind Projector plugin:
addCompilerPlugin("org.spire-math" % "kind-projector_2.11" % "0.5.2")
With Kind Projector we can have anonymous types like:
Either[Int, +?] // equivalent to: type R[+A] = Either[Int, A]
Instead of:
type IntOrA[A] = Either[Int, A]
// or
({type L[A] = Either[Int, A]})#L
First, the compiler is telling you that your for-comprehension mixes types. The compiler rewrites your code roughly like this:
readLines("/tmp/file.txt") flatMap { lines => lines } map { line => toInt(line) }
The definition of flatMap is:
def flatMap[A,B](ma: F[A])(f: A => F[B]): F[B]
In your case F is \/, so flatMap { lines => lines } is wrong: the function returns a List where a \/ is required. The compiler reports a message like "List[Nothing] required: scalaz.\/[Any,Int]" because it treats the list as a function with no parameters and List[Nothing] as the result type. Change your code like this:
import scalaz.Scalaz._
import scalaz._
def readLines(fileName: String): Disjunction[Any, List[String]] =
try { scala.io.Source.fromFile(fileName).getLines.toList.right }
catch { case e: java.io.IOException => e.left}
def toInt(line: List[String]): Disjunction[Any, List[Int]] =
try { (line map { _ toInt }).right }
catch { case e: NumberFormatException => e.left}
val numbers = for {
lines <- readLines("/tmp/file.txt")
n <- toInt(lines)
} yield (n map (_ * 2))
That works.
To read line by line, a FileInputStream (in Java) may be easier:
FileInputStream fis = new FileInputStream("/tmp/file.txt");
BufferedReader reader = new BufferedReader(new InputStreamReader(fis));
String line = reader.readLine();
while(line != null){
System.out.println(line);
line = reader.readLine();
}
Or you can try the getLines method of the Source class.
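A rough Scala counterpart of the Java loop, as a sketch only: it reads through scala.io.Source one line at a time and closes the file afterwards. The names LineByLine and sumDoubled are hypothetical, and the sketch assumes every line is a valid integer (a bad line throws NumberFormatException).

```scala
import scala.io.Source

object LineByLine {
  // Streams the file line by line and sums the doubled values.
  // Only one line is held in memory at a time.
  def sumDoubled(path: String): Int = {
    val source = Source.fromFile(path)
    try source.getLines().map(_.trim.toInt * 2).sum
    finally source.close() // release the file handle even on failure
  }
}
```

The iterator must be fully consumed inside the try block, because it cannot be read once the source is closed.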

Loading companion objects in spark-shell

When I use :load in the spark-shell it appears as though lines are read separately and thus companion objects are not read in the same "source" file. :paste does not appear to take arguments.
Previously I was building and loading a jar with my code into the spark-shell but was hoping to run it as a script for simplicity. Does anyone have a favorite workaround?
A sufficiently recent shell will have :paste file.
Or, as a workaround, link the templates this way to :load them:
class C(i: Int) {
def c = { println("C..."); i }
}; object C {
def apply(i: Int = 42) = new C(i)
}
Or,
scala> (new $intp.global.Run) compile List("C.scala")
scala> new C().c
C...
res1: Int = 42
More API:
scala> import reflect.io._
import reflect.io._
scala> import reflect.internal.util._
import reflect.internal.util._
scala> val code = File("C.scala").slurp
code: String =
"
class C(i: Int) { def c = { println("C..."); i } }
object C { def apply(i: Int = 42) = new C(i) }
"
scala> $intp interpret code
defined class C
defined object C
res0: scala.tools.nsc.interpreter.IR.Result = Success
scala> C()
res1: C = C@f2f2cc1
Similarly,
scala> $intp interpret s"object X { $code }"
defined object X
res0: scala.tools.nsc.interpreter.IR.Result = Success
scala> X.C()
res1: X.C = X$C@7d322cad
My startup script defines:
implicit class `interpreter interpolator`(val sc: StringContext) { def i(args: Any*) = $intp interpret sc.s(args: _*) }
for
scala> i"val x = 42"
x: Int = 42
res0: scala.tools.nsc.interpreter.IR.Result = Success
This compile trick doesn't appear to work with "script" files. It expects a source file compilable by scalac, where all val and def declarations are inside a type.
So, an alternative hack that will work with :load is to write the case class and companion object inside another object. Here I just pasted the code, without using :paste, but it works with :load, too.
scala> object O {
| case class C(s: String)
| object C {
| def apply() = new C("<no string>")
| }
| }
defined module O
scala> O.C()
res0: O.C = C(<no string>)

get first 2 values in a comma separated string

I am trying to get the first 2 values of a comma separated string in scala. For example
a,b,this is a test
How do i store the values a,b in 2 separate variables?
To keep it easy and clean.
KISS solution:
1. Use split for separation. Then use take, which is defined on all ordered sequences, to get the elements as needed:
scala> val res = "a,b,this is a test" split ',' take 2
res: Array[String] = Array(a, b)
2. Use pattern matching to set the variables:
scala> val Array(x,y) = res
x: String = a
y: String = b
Another solution, using a sequence pattern match in Scala:
Welcome to Scala version 2.11.2 (OpenJDK 64-Bit Server VM, Java 1.7.0_65).
Type in expressions to have them evaluated.
Type :help for more information.
scala> val str = "a,b,this is a test"
str: String = a,b,this is a test
scala> val Array(x, y, _*) = str.split(",")
x: String = a
y: String = b
scala> println(s"x = $x, y = $y")
x = a, y = b
Are you looking for the method split?
"a,b,this is a test".split(',')
res0: Array[String] = Array(a, b, this is a test)
If you want only the first two values you'll need to do something like:
val splitted = "a,b,this is a test".split(',')
val (first, second) = (splitted(0), splitted(1))
There should be some regex options here.
scala> val s = "a,b,this is a test"
s: String = a,b,this is a test
scala> val r = "[^,]+".r
r: scala.util.matching.Regex = [^,]+
scala> r findAllIn s
res0: scala.util.matching.Regex.MatchIterator = non-empty iterator
scala> .toList
res1: List[String] = List(a, b, this is a test)
scala> .take(2)
res2: List[String] = List(a, b)
scala> val a :: b :: _ = res2
a: String = a
b: String = b
but
scala> val a :: b :: _ = (r findAllIn "a" take 2).toList
scala.MatchError: List(a) (of class scala.collection.immutable.$colon$colon)
... 33 elided
or if you're not sure there is a second item, for instance:
scala> val r2 = "([^,]+)(?:,([^,]*))?".r.unanchored
r2: scala.util.matching.UnanchoredRegex = ([^,]+)(?:,([^,]*))?
scala> val (a,b) = "a" match { case r2(x,y) => (x, Option(y)) }
a: String = a
b: Option[String] = None
scala> val (a,b) = s match { case r2(x,y) => (x, Option(y)) }
a: String = a
b: Option[String] = Some(b)
This is a bit nicer if records are long strings.
Footnote: the Option cases look nicer with a regex interpolator.
If your string is short, you may as well just use String.split and take the first two elements.
val myString = "a,b,this is a test"
val splitString = myString.split(',') // Scala adds a split-by-character method in addition to Java's split-by-regex
val a = splitString(0)
val b = splitString(1)
Another solution would be to use a regex to extract the first two elements. I think it's quite elegant.
val myString = "a,b,this is a test"
val regex = """(.*),(.*),.*""".r // all groups (in parenthesis) will be extracted.
val regex(a, b) = myString // a="a", b="b"
Of course, you can tweak the regex to only allow non-empty tokens (or anything else you might need to validate):
val regex = """(.+),(.+),.+""".r
Note that in my examples I assumed that the string always had at least two tokens. In the first example, you can test the length of the array if needed. The second one will throw a MatchError if the regex doesn't match the string.
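If the string may have fewer than two tokens, one option is to match on the split result and return an Option instead of throwing. This is a minimal sketch; FirstTwo and firstTwo are hypothetical names, and the limit argument 3 is passed to Java's String.split so that the remainder of the string is not split further.

```scala
object FirstTwo {
  // Returns the first two comma-separated tokens, or None when the
  // input contains fewer than two tokens.
  def firstTwo(s: String): Option[(String, String)] =
    s.split(",", 3) match {
      case Array(a, b, _*) => Some((a, b))
      case _               => None
    }
}
```

Calling FirstTwo.firstTwo("a,b,this is a test") yields Some((a,b)), while FirstTwo.firstTwo("a") yields None instead of a MatchError or an ArrayIndexOutOfBoundsException.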
I had originally proposed the following solution. I will leave it because it works and doesn't use any class formally marked as deprecated, but the Javadoc for StringTokenizer mentions that it is a legacy class and should no longer be used.
import java.util.StringTokenizer
val myString = "a,b,this is a test"
val st = new StringTokenizer(myString, ",")
val a = st.nextToken()
val b = st.nextToken()
// You could keep calling st.nextToken(), as long as st.hasMoreTokens is true

Simple destructuring extractor for command line args

The preferred approach would be to use something similar to the commented out line below.
def main(args: Array[String]) {
// val (dbPropsFile, tsvFile, dbTable) = args
val dbPropsFile = args(0)
val tsvFile = args(1)
val dbTable = args(2)
However I am having a little quarrel with the compiler over it:
Error:(13, 9) constructor cannot be instantiated to expected type;
found : (T1, T2, T3)
required: Array[String]
val (dbPropsFile, tsvFile, dbTable) = args
^
So all told this should be an easy few points for someone out there.
Use
val Array(dbPropsFile, tsvFile, dbTable) = args
scala> val Array(a,b,c) = Array(1,2,3)
a: Int = 1
b: Int = 2
c: Int = 3
scala> a
res0: Int = 1
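Keep in mind that val Array(...) = args throws a MatchError when the number of arguments is not exactly three. A hedged sketch of a variant that reports the problem instead; the names ArgsParser and parse are hypothetical, and the error message text is made up for illustration.

```scala
object ArgsParser {
  // Destructures exactly three arguments, or explains what went wrong.
  def parse(args: Array[String]): Either[String, (String, String, String)] =
    args match {
      case Array(dbPropsFile, tsvFile, dbTable) =>
        Right((dbPropsFile, tsvFile, dbTable))
      case _ =>
        Left(s"expected 3 arguments, got ${args.length}")
    }
}
```

In main, the Left case could print a usage message and exit, while the Right case carries on with the three values.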

Skip initialization of hash of hash (of hash) in scala?

How do I avoid the initialization (lines 5 and 6) here?
import scala.collection._
def newHash = mutable.Map[String,String]()
def newHoH = mutable.Map[String,mutable.Map[String,String]]()
var foo = mutable.Map[String,mutable.Map[String,mutable.Map[String,String]]]()
foo("bar") = newHoH //line 5
foo("bar")("baz") = newHash //line 6
foo("bar")("baz")("whee") = "duh"
I tried withDefaultValue with a simpler example but obviously I did it wrong:
/***
scala> var foo = mutable.Map[String,mutable.Map[String,String]]().withDefaultValue(mutable.Map(""->""))
foo: scala.collection.mutable.Map[String,scala.collection.mutable.Map[String,String]] = Map()
scala> foo("bar")("baz") = "duh"
scala> foo("b")("baz") = "der"
scala> foo("bar")("baz")
res7: String = der
*/
The withDefault method won't work here. A Map created this way returns a new map every time a key is missing, so calling mymap("foo")("bar") = "ok" assigns "ok" into a temporarily created map. The next time you call mymap("foo")("bar"), the still non-existent "foo" key on mymap results in yet another new map, which does not contain the mapping "bar" -> "ok".
Instead, consider creating an anonymous map. I show a solution with only one level of nesting:
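The behaviour described here can be checked with a small sketch. DefaultDemo and demo are hypothetical names; the only library calls are mutable.Map.empty and withDefault.

```scala
import scala.collection.mutable

object DefaultDemo {
  // Returns (is "foo" stored in the outer map?, value found under "foo" -> "bar").
  def demo(): (Boolean, Option[String]) = {
    val m = mutable.Map.empty[String, mutable.Map[String, String]]
      .withDefault(_ => mutable.Map.empty[String, String])
    m("foo")("bar") = "ok" // writes into a throwaway inner map
    (m.contains("foo"), m("foo").get("bar"))
  }
}
```

The default function produces a value for a missing key but never stores it in the outer map, which is why the assignment is lost.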
‡ scala-version 2.10.1
Welcome to Scala version 2.10.1 (Java HotSpot(TM) Server VM, Java 1.7.0_21).
Type in expressions to have them evaluated.
Type :help for more information.
scala> import collection._
import collection._
scala> :paste
// Entering paste mode (ctrl-D to finish)
def newHash = mutable.Map[String,String]().withDefault(_ => "")
def newHoH = new mutable.Map[String,mutable.Map[String,String]]() {
val m = mutable.Map[String, mutable.Map[String, String]]()
def +=(kv: (String, mutable.Map[String, String])) = { m += kv; this }
def -=(k: String) = { m -= k; this }
def get(k: String) = m.get(k) match {
case opt @ Some(v) => opt
case None =>
val v = newHash
m(k) = v
Some(v)
}
def iterator = m.iterator
}
// Exiting paste mode, now interpreting.
newHash: scala.collection.mutable.Map[String,String]
newHoH: scala.collection.mutable.Map[String,scala.collection.mutable.Map[String,String]]{val m: scala.collection.mutable.Map[String,scala.collection.mutable.Map[String,String]]}
scala> val m = newHoH
m: scala.collection.mutable.Map[String,scala.collection.mutable.Map[String,String]]{val m: scala.collection.mutable.Map[String,scala.collection.mutable.Map[String,String]]} = Map()
scala> m("foo")("bar") = "ok"
scala> m("foo")("bar")
res1: String = ok