Run a SBT task with arguments from command line - scala

I want to have an SBT task that takes a comma-separated list of test classes, given by their fully qualified names, as input from the command line. Right now I run the task with a hard-coded value, but I want to get it from the command line. Can someone help me write a task like this?
lazy val runTask = inputKey[Unit]("custom run")
runTask := {
val one = (runMain in Compile).fullInput(" org.scalatest.tools.Runner -P1 -C reporter.TestReporter -o -s testcase.GetAccountInfo -s testcase.GetProfileInfo").evaluated
}
Something like this,
sbt runTask testcase.GetProfileInfo,testcase.GetAccountInfo
Thanks in advance.

You need a Parser to parse the input given to the task. Once you have the input, you can turn the input task into a plain task with (runMain in Compile).toTask and feed the input to it.
TL;DR; build.sbt
import sbt.complete._
import complete.DefaultParsers._
lazy val myRunTask = inputKey[Unit]("Runs actual tests")
lazy val FullQualifiedClassName =
(charClass(c => isScalaIDChar(c) || (c == '.'), "class name")).+.string
def commaDelimited(display: String) =
token(Space) ~> repsep(token(FullQualifiedClassName, display), token(","))
lazy val testClassArgs: Parser[Seq[String]] =
commaDelimited("<full qualified test class name>").map {
xs: Seq[String] => xs.map(e => s" -s $e ")
}
myRunTask := Def.inputTaskDyn {
val classes = testClassArgs.parsed
runMainInCompile(classes)
}.evaluated
def runMainInCompile(classes: Seq[String]) = Def.taskDyn {
(runMain in Compile).toTask(s" org.scalatest.tools.Runner -P1 -C reporter.TestReporter -o ${classes.mkString}")
}
Parser
Let's start with a parser. The parser must take a space, followed by your classes separated by comma.
Let's first define a parser that parses a fully qualified class name:
lazy val FullQualifiedClassName =
(charClass(c => isScalaIDChar(c) || (c == '.'), "class name")).+.string
Once we have the parser, we can combine it together with another parser. We need to create a parser, which takes comma separated full qualified class names:
def commaDelimited(display: String) =
token(Space) ~> repsep(token(FullQualifiedClassName, display), token(","))
The ~> operator runs both parsers in sequence but discards the result of the one on its left (here, the leading space). The value returned from the parser is a Seq[String] of the fully qualified class names.
Judging from your question, you want your classes to be prefixed with -s. You could do it later, but just to show one more feature of parsers, I'll just do it here.
You can take an output of a parser and convert it to another output, using map.
lazy val testClassArgs: Parser[Seq[String]] =
commaDelimited("<full qualified test class name>").map {
xs: Seq[String] => xs.map(e => s" -s $e ")
}
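To see what the parser plus map step hands to the runner without starting sbt, the same transformation can be sketched in plain Scala. This is only an illustration: toRunnerArgs is a hypothetical helper name, and it uses String.split instead of sbt's parser combinators, so it does no validation or tab completion.

```scala
// Plain-Scala sketch of the parse-and-prefix step (no sbt required).
// `toRunnerArgs` is a hypothetical name, not part of sbt or ScalaTest.
def toRunnerArgs(input: String): String = {
  // Split the comma-separated class names, dropping stray whitespace.
  val classes = input.trim.split(",").map(_.trim).filter(_.nonEmpty).toSeq
  // Same mapping as testClassArgs: prefix every class with -s.
  val suiteArgs = classes.map(e => s" -s $e ")
  s" org.scalatest.tools.Runner -P1 -C reporter.TestReporter -o ${suiteArgs.mkString}"
}

val runnerArgs = toRunnerArgs(" testcase.GetProfileInfo,testcase.GetAccountInfo")
println(runnerArgs)
```

The resulting string has the same shape as the one passed to toTask in the build definition.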
OK, we're almost there.
Run InputTask with arguments combined with a static string
We can define a new input task key. I'll choose myRunTask, because otherwise it will collide with runTask, which already exists.
Let's define a method which takes a sequence of classes (already prefixed with -s) as an argument, and which returns a Task obtained from an InputTask.
def runMainInCompile(classes: Seq[String]) = Def.taskDyn {
(runMain in Compile).toTask(s" org.scalatest.tools.Runner -P1 -C reporter.TestReporter -o ${classes.mkString}")
}
Now let's combine all elements in one task:
myRunTask := Def.inputTaskDyn {
val classes = testClassArgs.parsed
runMainInCompile(classes)
}.evaluated

Related

Casting a variable to a method in Scala "Runtime Evaluation"

I want to evaluate a function whose name is passed as a string variable in Scala (sorry, but I'm new to Scala).
def concate(a:String,b:String): String ={
a+" "+b
}
var func="concate" //i'll get this function name from config as string
I want to perform something like
eval(func("hello","world")) //Like in Python
so output will be like
hello world
Eventually I want to execute a few built-in functions whose names come from my config as strings, and I don't want to hard-code the function names in the code.
EDIT
To Be More clear with my exact usecase
I have a Config file which has multiple functions defined in it that are Spark inbuilt functions on Data frame
application.conf looks like
transformations = [
{
"table" : "users",
"function" : "from_unixtime",
"column" : "epoch"
},
{
"table" : "users",
"function" : "yearofweek",
"column" : "epoch"
}
]
yearofweek and from_unixtime are Spark built-in functions, and I want to transform my DataFrame with the functions defined in the config. Each function is applied to the column named in its entry.
The obvious way is to write an if/else chain that compares strings and calls the matching built-in function, but that is way too much.
I am looking for a better solution.
This is indeed possible in Scala, as Scala is a JSR 223 compliant scripting language. Here is an example (running with Scala 2.11.8) that uses the runtime reflection ToolBox, which needs scala-compiler on the classpath. Note that you need to import your method, because otherwise the interpreter will not find it:
package my.example
object EvalDemo {
// evaluates Scala code and returns the result as T
def evalAs[T](code: String) = {
import scala.reflect.runtime.currentMirror
import scala.tools.reflect.ToolBox
val toolbox = currentMirror.mkToolBox()
import toolbox.{eval, parse}
eval(parse(code)).asInstanceOf[T]
}
def concate(a: String, b: String): String = a + " " + b
def main(args: Array[String]): Unit = {
var func = "concate" //i'll get this function name from config as string
val code =
s"""
|import my.example.EvalDemo._
|${func}("hello","world")
|""".stripMargin
val result: String = evalAs[String](code)
println(result) // "hello world"
}
}
Keep a mapping from function name to function in the code:
def foo(str: String) = str + ", foo"
def bar(str: String) = str + ", bar"
val fmap = Map("foo" -> foo _, "bar" -> bar _)
fmap("foo")("hello")
Now, based on the function name you get from the config, look up the corresponding function in the map and apply it to the arguments.
Scala repl
scala> :paste
// Entering paste mode (ctrl-D to finish)
def foo(str: String) = str + ", foo"
def bar(str: String) = str + ", bar"
val fmap = Map("foo" -> foo _, "bar" -> bar _)
fmap("foo")("hello")
// Exiting paste mode, now interpreting.
foo: (str: String)String
bar: (str: String)String
fmap: scala.collection.immutable.Map[String,String => String] = Map(foo -> $$Lambda$1104/1335082762@778a1250, bar -> $$Lambda$1105/841090268@55acec99)
res0: String = hello, foo
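A name that is missing from the map makes fmap(name) throw a NoSuchElementException, so for names coming from a config file it may be safer to look them up with get. A minimal sketch, assuming string-to-string functions; the registry contents and the applyByName helper are hypothetical:

```scala
// Hypothetical registry of functions that may be named in the config.
val registry: Map[String, String => String] = Map(
  "upper"   -> ((s: String) => s.toUpperCase),
  "trimmed" -> ((s: String) => s.trim)
)

// Look the function up by name; an unknown name becomes a Left instead of an exception.
def applyByName(name: String, input: String): Either[String, String] =
  registry.get(name).map(f => f(input)).toRight(s"Unknown function: $name")

println(applyByName("upper", "hello"))
println(applyByName("missing", "hello"))
```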
Spark offers you a way to write your transformations or queries using SQL. So, you really don't have to worry about Scala functions, casting and evaluation in this case. You just have to parse your config to generate the SQL query.
Let's say you have registered a table users with Spark and want to do a select and transform based on provided config,
// your generated query will look like this,
val query = "SELECT from_unixtime(epoch) as time, weekofyear(epoch) FROM users"
val result = spark.sql(query)
So, all you need to do is - build that query from your config.
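Building the query string itself needs nothing from Spark. The following is a rough sketch, assuming the config entries have already been read into a hypothetical Transformation case class; the allow-list is also an assumption, added so that arbitrary config text is not pasted into SQL.

```scala
// Hypothetical in-memory form of one config entry.
final case class Transformation(table: String, function: String, column: String)

// Assumed allow-list of function names that may appear in generated SQL.
val allowedFunctions = Set("from_unixtime", "weekofyear")

// Build one SELECT statement per table from the allowed transformations.
def buildQueries(ts: Seq[Transformation]): Map[String, String] =
  ts.filter(t => allowedFunctions(t.function))
    .groupBy(_.table)
    .map { case (table, group) =>
      val exprs = group.map(t => s"${t.function}(${t.column})").mkString(", ")
      table -> s"SELECT $exprs FROM $table"
    }

val config = Seq(
  Transformation("users", "from_unixtime", "epoch"),
  Transformation("users", "weekofyear", "epoch")
)
val queries = buildQueries(config)
println(queries("users"))
```

Each generated string can then be handed to spark.sql, as in the snippet above.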

How can I use Stream to traverse a tree in Scala?

I have a simple file system abstraction:
trait PathItem { val label: String }
case class PathEnd(label: String, uri: String) extends PathItem
case class PathDirectory(
label: String = "",
contents: List[PathItem] = List.empty[PathItem]
) extends PathItem
With this structure I can build up an arbitrarily complex tree of subdirectories (PathDirectory) and files (PathEnd).
How could I use Scala Streams to extract a list of the "files" something like this:
getFileStream( rootDir ).foreach( f => println(f.uri) )
getFileStream( rootDir ).find( _.uri == "someTargetURI" )
// where getFileStream creates a Stream[PathEnd] given a starting rootDir
Passing through the tree like this would be kinda cool, but I'm not understanding how to create a Stream for this from the scaladoc.
(I know I can just write a simple recursive function, but I'm trying to grok Streams here.)
As mentioned in comments, you can essentially treat a Stream the same as you would a List and you'll get the desired lazily evaluated sequence. Your solution:
def fileStream(p: PathItem): Stream[PathEnd] = {
p match {
case pe: PathEnd => Stream(pe)
case pd: PathDirectory => pd.contents.toStream.flatMap(fileStream)
}
}
Note the flatMap to avoid creating a Stream of Stream instances.
Test:
scala> val pd = PathDirectory("root",List(
PathDirectory("src",List(PathDirectory("main",List(PathEnd("file.scala","file.uri"))))),
PathDirectory("test",List(PathDirectory("main",List(PathEnd("test.scala","test.uri")))))))
scala> fileStream(pd).foreach(println)
PathEnd(file.scala,file.uri)
PathEnd(test.scala,test.uri)
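For completeness, here is a self-contained sketch that also covers the find call from the question. The types are restated (with a sealed trait, which is my own choice) so the snippet runs on its own. Note that Stream is deprecated since Scala 2.13 in favour of LazyList, so expect a deprecation warning on newer versions.

```scala
sealed trait PathItem { def label: String }
final case class PathEnd(label: String, uri: String) extends PathItem
final case class PathDirectory(label: String = "", contents: List[PathItem] = Nil) extends PathItem

// Lazily flatten the tree into a stream of files.
def fileStream(p: PathItem): Stream[PathEnd] = p match {
  case pe: PathEnd       => Stream(pe)
  case pd: PathDirectory => pd.contents.toStream.flatMap(fileStream)
}

val root = PathDirectory("root", List(
  PathDirectory("src", List(PathEnd("file.scala", "file.uri"))),
  PathDirectory("test", List(PathEnd("test.scala", "test.uri")))
))

// find walks the stream only as far as it needs to.
val hit = fileStream(root).find(_.uri == "test.uri")
println(hit)
```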

Internal DSL in Scala: Lists without ","

I'm trying to build an internal DSL in Scala to represent algebraic definitions. Let's consider this simplified data model:
case class Var(name:String)
case class Eq(head:Var, body:Var*)
case class Definition(name:String, body:Eq*)
For example a simple definition would be:
val x = Var("x")
val y = Var("y")
val z = Var("z")
val eq1 = Eq(x, y, z)
val eq2 = Eq(y, x, z)
val defn = Definition("Dummy", eq1, eq2)
I would like to have an internal DSL to represent such an equation in the form:
Dummy {
x = y z
y = x z
}
The closest I could get is the following:
Definition("Dummy") := (
"x" -> ("y", "z")
"y" -> ("x", "z")
)
The first problem I encountered is that I cannot have two implicit conversions for Definition and Var, hence Definition("Dummy"). The main problem, however, are the lists. I don't want to surround them by any thing, e.g. (), and I also don't want their elements be separated by commas.
Is what I want possible using Scala? If yes, can anyone show me an easy way of achieving it?
While Scala's syntax is powerful, it is not flexible enough to create arbitrary delimiters for symbols. Thus, there is no way to drop the commas and separate the elements with spaces alone.
Nevertheless, it is possible to use macros and parse a string with arbitrary content at compile time. It is not an "easy" solution, but one that works:
object AlgDefDSL {
import language.experimental.macros
import scala.reflect.macros.Context
implicit class DefDSL(sc: StringContext) {
def dsl(): Definition = macro __dsl_impl
}
def __dsl_impl(c: Context)(): c.Expr[Definition] = {
import c.universe._
val defn = c.prefix.tree match {
case Apply(_, List(Apply(_, List(Literal(Constant(s: String)))))) =>
def toAST[A : TypeTag](xs: Tree*): Tree =
Apply(
Select(Ident(typeOf[A].typeSymbol.companionSymbol), newTermName("apply")),
xs.toList
)
def toVarAST(varObj: Var) =
toAST[Var](c.literal(varObj.name).tree)
def toEqAST(eqObj: Eq) =
toAST[Eq]((eqObj.head +: eqObj.body).map(toVarAST(_)): _*)
def toDefAST(defObj: Definition) =
toAST[Definition](c.literal(defObj.name).tree +: defObj.body.map(toEqAST(_)): _*)
parsers.parse(s) match {
case parsers.Success(defn, _) => toDefAST(defn)
case parsers.NoSuccess(msg, _) => c.abort(c.enclosingPosition, msg)
}
}
c.Expr(defn)
}
import scala.util.parsing.combinator.JavaTokenParsers
private object parsers extends JavaTokenParsers {
override val whiteSpace = "[ \t]*".r
lazy val newlines =
opt(rep("\n"))
lazy val varP =
"[a-z]+".r ^^ Var
lazy val eqP =
(varP <~ "=") ~ rep(varP) ^^ {
case lhs ~ rhs => Eq(lhs, rhs: _*)
}
lazy val defHead =
newlines ~> ("[a-zA-Z]+".r <~ "{") <~ newlines
lazy val defBody =
rep(eqP <~ rep("\n"))
lazy val defEnd =
"}" ~ newlines
lazy val defP =
defHead ~ defBody <~ defEnd ^^ {
case name ~ eqs => Definition(name, eqs: _*)
}
def parse(s: String) = parseAll(defP, s)
}
case class Var(name: String)
case class Eq(head: Var, body: Var*)
case class Definition(name: String, body: Eq*)
}
It can be used with something like this:
scala> import AlgDefDSL._
import AlgDefDSL._
scala> dsl"""
| Dummy {
| x = y z
| y = x z
| }
| """
res12: AlgDefDSL.Definition = Definition(Dummy,WrappedArray(Eq(Var(x),WrappedArray(Var(y), Var(z))), Eq(Var(y),WrappedArray(Var(x), Var(z)))))
In addition to sschaef's nice solution I want to mention a few possibilities that are commonly used to get rid of commas in list construction for a DSL.
Colons
This might be trivial, but it is sometimes overlooked as a solution.
line1 ::
line2 ::
line3 ::
Nil
For a DSL it is often desirable that every line containing an instruction or data is terminated the same way (as opposed to Lists, where all but the last line get a comma). With this solution, swapping lines can no longer mess up a trailing comma. Unfortunately, the Nil looks a bit ugly.
Fluent API
Another alternative that might be interesting for a DSL is something like this:
BuildDefinition()
.line1
.line2
.line3
.build
where each line is a member function of the builder (and returns a modified builder). This solution requires to eventually convert the builder to a list (which might be done as an implicit conversion). Note that for some APIs it might be possible to pass around the builder instances themselves, and only extract the data wherever needed.
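A minimal sketch of such a builder for the data model from the question. DefinitionBuilder and its line method are hypothetical names, and I use an explicit build call rather than an implicit conversion:

```scala
final case class Var(name: String)
final case class Eq(head: Var, body: Var*)
final case class Definition(name: String, body: Eq*)

// Hypothetical immutable builder: every `line` call returns a new builder.
final case class DefinitionBuilder(name: String, eqs: Vector[Eq] = Vector.empty) {
  def line(head: String, body: String*): DefinitionBuilder =
    copy(eqs = eqs :+ Eq(Var(head), body.map(Var(_)): _*))
  def build: Definition = Definition(name, eqs: _*)
}

val defn = DefinitionBuilder("Dummy")
  .line("x", "y", "z")
  .line("y", "x", "z")
  .build
println(defn)
```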
Constructor API
Similarly another possibility is to exploit constructors.
new BuildInterface {
line1
line2
line3
}
Here, BuildInterface is a trait and we simply instantiate an anonymous class from the interface. The line functions call some member functions of this trait. Each invocation can internally update the state of the build interface. Note that this commonly results in a mutable design (but only during construction). To extract the list, an implicit conversion could be used.
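A rough sketch of this idea, with plain strings standing in for the DSL lines. The line and lines members are assumptions made for illustration, and the list is read through an explicit accessor rather than an implicit conversion:

```scala
import scala.collection.mutable.ListBuffer

trait BuildInterface {
  // Mutable state that is only written while the anonymous class body runs.
  private val buffer = ListBuffer.empty[String]
  protected def line(text: String): Unit = buffer += text
  def lines: List[String] = buffer.toList
}

// The body of the anonymous class is the constructor, so each call is recorded in order.
val built = new BuildInterface {
  line("x = y z")
  line("y = x z")
}
println(built.lines)
```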
Since I don't understand the actual purpose of your DSL, I'm not really sure if any of these techniques is interesting for your scenario. I just wanted to add them since they are common ways to get rid of ",".
Here is another solution which is relatively simple and enables a syntax that is pretty close to your ideal (as others have pointed out, the exact syntax you asked for is not possible, in particular because you cannot redefine delimiter symbols).
My solution stretches a bit what is reasonable to do, because it adds an operator right on scala.Symbol, but if you're going to use this DSL in a constrained scope then this should be OK.
object VarOps {
val currentEqs = new util.DynamicVariable( Vector.empty[Eq] )
}
implicit class VarOps( val variable: Var ) extends AnyVal {
import VarOps._
def :=[T]( body: Var* ) = {
val eq = Eq( variable, body:_* )
currentEqs.value = currentEqs.value :+ eq
}
}
implicit class SymbolOps( val sym: Symbol ) extends AnyVal {
def apply[T]( body: => Unit ): Definition = {
import VarOps._
currentEqs.withValue( Vector.empty[Eq] ) {
body
Definition( sym.name, currentEqs.value:_* )
}
}
}
Now you can do:
'Dummy {
x := (y, z)
y := (x, z)
}
Which builds the following definition (as printed in the REPL):
Definition(Dummy,Vector(Eq(Var(x),WrappedArray(Var(y), Var(z))), Eq(Var(y),WrappedArray(Var(x), Var(z)))))

Best way to parse command-line parameters? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
What's the best way to parse command-line parameters in Scala?
I personally prefer something lightweight that does not require an external jar.
Related:
How do I parse command line arguments in Java?
What parameter parser libraries are there for C++?
Best way to parse command line arguments in C#
For most cases you do not need an external parser. Scala's pattern matching allows consuming args in a functional style. For example:
object MmlAlnApp {
val usage = """
Usage: mmlaln [--min-size num] [--max-size num] filename
"""
def main(args: Array[String]) {
if (args.length == 0) println(usage)
val arglist = args.toList
type OptionMap = Map[Symbol, Any]
def nextOption(map : OptionMap, list: List[String]) : OptionMap = {
def isSwitch(s : String) = (s(0) == '-')
list match {
case Nil => map
case "--max-size" :: value :: tail =>
nextOption(map ++ Map('maxsize -> value.toInt), tail)
case "--min-size" :: value :: tail =>
nextOption(map ++ Map('minsize -> value.toInt), tail)
case string :: opt2 :: tail if isSwitch(opt2) =>
nextOption(map ++ Map('infile -> string), list.tail)
case string :: Nil => nextOption(map ++ Map('infile -> string), list.tail)
case option :: tail => println("Unknown option "+option)
sys.exit(1) // the bare exit(1) is not available in current Scala versions
}
}
val options = nextOption(Map(),arglist)
println(options)
}
}
will print, for example:
Map('infile -> test/data/paml-aln1.phy, 'maxsize -> 4, 'minsize -> 2)
This version only takes one infile. Easy to improve on (by using a List).
Note also that this approach allows for concatenation of multiple command line arguments - even more than two!
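As a sketch of the suggested improvement (collecting several input files in a List), here is a variant that uses a case class instead of a Map[Symbol, Any] and returns an Either instead of exiting. The names Options, IntArg and parseArgs are my own, not part of the answer above.

```scala
import scala.annotation.tailrec
import scala.util.Try

// Hypothetical option holder; field names are illustrative only.
final case class Options(
  minSize: Option[Int] = None,
  maxSize: Option[Int] = None,
  files: List[String] = Nil
)

// Extractor that matches only strings that parse as an Int.
object IntArg {
  def unapply(s: String): Option[Int] = Try(s.toInt).toOption
}

@tailrec
def parseArgs(args: List[String], acc: Options = Options()): Either[String, Options] =
  args match {
    case Nil                               => Right(acc)
    case "--min-size" :: IntArg(n) :: tail => parseArgs(tail, acc.copy(minSize = Some(n)))
    case "--max-size" :: IntArg(n) :: tail => parseArgs(tail, acc.copy(maxSize = Some(n)))
    case opt :: _ if opt.startsWith("-")   => Left(s"Unknown or malformed option: $opt")
    case file :: tail                      => parseArgs(tail, acc.copy(files = acc.files :+ file))
  }

println(parseArgs(List("--min-size", "2", "--max-size", "4", "a.phy", "b.phy")))
```

Returning an Either leaves the decision to print the usage text and exit to the caller.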
scopt/scopt
val parser = new scopt.OptionParser[Config]("scopt") {
head("scopt", "3.x")
opt[Int]('f', "foo") action { (x, c) =>
c.copy(foo = x) } text("foo is an integer property")
opt[File]('o', "out") required() valueName("<file>") action { (x, c) =>
c.copy(out = x) } text("out is a required file property")
opt[(String, Int)]("max") action { case ((k, v), c) =>
c.copy(libName = k, maxCount = v) } validate { x =>
if (x._2 > 0) success
else failure("Value <max> must be >0")
} keyValueName("<libname>", "<max>") text("maximum count for <libname>")
opt[Unit]("verbose") action { (_, c) =>
c.copy(verbose = true) } text("verbose is a flag")
note("some notes.\n")
help("help") text("prints this usage text")
arg[File]("<file>...") unbounded() optional() action { (x, c) =>
c.copy(files = c.files :+ x) } text("optional unbounded args")
cmd("update") action { (_, c) =>
c.copy(mode = "update") } text("update is a command.") children(
opt[Unit]("not-keepalive") abbr("nk") action { (_, c) =>
c.copy(keepalive = false) } text("disable keepalive"),
opt[Boolean]("xyz") action { (x, c) =>
c.copy(xyz = x) } text("xyz is a boolean property")
)
}
// parser.parse returns Option[C]
parser.parse(args, Config()) map { config =>
// do stuff
} getOrElse {
// arguments are bad, usage message will have been displayed
}
The above generates the following usage text:
scopt 3.x
Usage: scopt [update] [options] [<file>...]
-f <value> | --foo <value>
foo is an integer property
-o <file> | --out <file>
out is a required file property
--max:<libname>=<max>
maximum count for <libname>
--verbose
verbose is a flag
some notes.
--help
prints this usage text
<file>...
optional unbounded args
Command: update
update is a command.
-nk | --not-keepalive
disable keepalive
--xyz <value>
xyz is a boolean property
This is what I currently use. Clean usage without too much baggage.
(Disclaimer: I now maintain this project)
I realize that the question was asked some time ago, but I thought it might help some people, who are googling around (like me), and hit this page.
Scallop looks quite promising as well.
Features (quote from the linked github page):
flag, single-value and multiple value options
POSIX-style short option names (-a) with grouping (-abc)
GNU-style long option names (--opt)
Property arguments (-Dkey=value, -D key1=value key2=value)
Non-string types of options and properties values (with extendable converters)
Powerful matching on trailing args
Subcommands
And some example code (also from that Github page):
import org.rogach.scallop._;
object Conf extends ScallopConf(List("-c","3","-E","fruit=apple","7.2")) {
// all options that are applicable to builder (like description, default, etc)
// are applicable here as well
val count:ScallopOption[Int] = opt[Int]("count", descr = "count the trees", required = true)
.map(1+) // also here work all standard Option methods -
// evaluation is deferred to after option construction
val properties = props[String]('E')
// types (:ScallopOption[Double]) can be omitted, here just for clarity
val size:ScallopOption[Double] = trailArg[Double](required = false)
}
// that's it. Completely type-safe and convenient.
Conf.count() should equal (4)
Conf.properties("fruit") should equal (Some("apple"))
Conf.size.get should equal (Some(7.2))
// passing into other functions
def someInternalFunc(conf:Conf.type) {
conf.count() should equal (4)
}
someInternalFunc(Conf)
I like sliding over arguments for relatively simple configurations.
var name = ""
var port = 0
var ip = ""
args.sliding(2, 2).toList.collect {
case Array("--ip", argIP: String) => ip = argIP
case Array("--port", argPort: String) => port = argPort.toInt
case Array("--name", argName: String) => name = argName
}
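One thing to keep in mind with this approach: sliding(2, 2) cuts the arguments into consecutive pairs, so a flag without a value ends up in a shorter last window and is silently skipped by collect. A small sketch with made-up arguments that shows the windows and an immutable variant of the same idea:

```scala
// Made-up arguments; note the trailing flag without a value.
val demoArgs = Array("--ip", "10.0.0.1", "--port", "8080", "--name")

// Consecutive, non-overlapping windows of size 2 (the last one may be shorter).
val windows: List[List[String]] = demoArgs.sliding(2, 2).map(_.toList).toList

// Keep only complete flag/value pairs and strip the leading dashes.
val parsed: Map[String, String] = windows.collect {
  case List(flag, value) if flag.startsWith("--") => flag.drop(2) -> value
}.toMap

println(windows)
println(parsed)
```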
Command Line Interface Scala Toolkit (CLIST)
here is mine too! (a bit late in the game though)
https://github.com/backuity/clist
As opposed to scopt it is entirely mutable... but wait! That gives us a pretty nice syntax:
class Cat extends Command(description = "concatenate files and print on the standard output") {
// type-safety: members are typed! so showAll is a Boolean
var showAll = opt[Boolean](abbrev = "A", description = "equivalent to -vET")
var numberNonblank = opt[Boolean](abbrev = "b", description = "number nonempty output lines, overrides -n")
// files is a Seq[File]
var files = args[Seq[File]](description = "files to concat")
}
And a simple way to run it:
Cli.parse(args).withCommand(new Cat) { case cat =>
println(cat.files)
}
You can do a lot more of course (multi-commands, many configuration options, ...), and it has no dependencies.
I'll finish with a kind of distinctive feature, the default usage (quite often neglected for multi commands):
How to parse parameters without an external dependency. Great question! You may be interested in picocli.
Picocli is specifically designed to solve the problem asked in the question: it is a command line parsing framework in a single file, so you can include it in source form. This lets users run picocli-based applications without requiring picocli as an external dependency.
It works by annotating fields so you write very little code. Quick summary:
Strongly typed everything - command line options as well as positional parameters
Support for POSIX clustered short options (so it handles <command> -xvfInputFile as well as <command> -x -v -f InputFile)
An arity model that allows a minimum, maximum and variable number of parameters, e.g, "1..*", "3..5"
Fluent and compact API to minimize boilerplate client code
Subcommands
Usage help with ANSI colors
The usage help message is easy to customize with annotations (without programming). For example:
I couldn't resist adding one more screenshot to show what kind of usage help messages are possible. Usage help is the face of your application, so be creative and have fun!
Disclaimer: I created picocli. Feedback or questions very welcome. It is written in java, but let me know if there is any issue using it in scala and I'll try to address it.
This is largely a shameless clone of my answer to the Java question of the same topic. It turns out that JewelCLI is Scala-friendly in that it doesn't require JavaBean style methods to get automatic argument naming.
JewelCLI is a Scala-friendly Java library for command-line parsing that yields clean code. It uses Proxied Interfaces Configured with Annotations to dynamically build a type-safe API for your command-line parameters.
An example parameter interface Person.scala:
import uk.co.flamingpenguin.jewel.cli.Option
trait Person {
@Option def name: String
@Option def times: Int
}
An example usage of the parameter interface Hello.scala:
import uk.co.flamingpenguin.jewel.cli.CliFactory.parseArguments
import uk.co.flamingpenguin.jewel.cli.ArgumentValidationException
object Hello {
def main(args: Array[String]) {
try {
val person = parseArguments(classOf[Person], args:_*)
for (i <- 1 to (person times))
println("Hello " + (person name))
} catch {
case e: ArgumentValidationException => println(e getMessage)
}
}
}
Save copies of the files above to a single directory and download the JewelCLI 0.6 JAR to that directory as well.
Compile and run the example in Bash on Linux/Mac OS X/etc.:
scalac -cp jewelcli-0.6.jar:. Person.scala Hello.scala
scala -cp jewelcli-0.6.jar:. Hello --name="John Doe" --times=3
Compile and run the example in the Windows Command Prompt:
scalac -cp jewelcli-0.6.jar;. Person.scala Hello.scala
scala -cp jewelcli-0.6.jar;. Hello --name="John Doe" --times=3
Running the example should yield the following output:
Hello John Doe
Hello John Doe
Hello John Doe
I come from the Java world. I like args4j because it is simple, its specification is more readable (thanks to annotations), and it produces nicely formatted output.
Here is my example snippet:
Specification
import org.kohsuke.args4j.{CmdLineException, CmdLineParser, Option}
object CliArgs {
@Option(name = "-list", required = true,
usage = "List of Nutch Segment(s) Part(s)")
var pathsList: String = null
@Option(name = "-workdir", required = true,
usage = "Work directory.")
var workDir: String = null
@Option(name = "-master",
usage = "Spark master url")
var masterUrl: String = "local[2]"
}
Parse
//var args = "-listt in.txt -workdir out-2".split(" ")
val parser = new CmdLineParser(CliArgs)
try {
parser.parseArgument(args.toList.asJava)
} catch {
case e: CmdLineException =>
print(s"Error:${e.getMessage}\n Usage:\n")
parser.printUsage(System.out)
System.exit(1)
}
println("workDir :" + CliArgs.workDir)
println("listFile :" + CliArgs.pathsList)
println("master :" + CliArgs.masterUrl)
On invalid arguments
Error:Option "-list" is required
Usage:
-list VAL : List of Nutch Segment(s) Part(s)
-master VAL : Spark master url (default: local[2])
-workdir VAL : Work directory.
scala-optparse-applicative
I think scala-optparse-applicative is the most functional command line parser library in Scala.
https://github.com/bmjames/scala-optparse-applicative
I liked the sliding() approach of joslinm, just not the mutable vars ;) So here's an immutable take on that approach:
case class AppArgs(
seed1: String,
seed2: String,
ip: String,
port: Int
)
object AppArgs {
def empty = new AppArgs("", "", "", 0)
}
val args = Array[String](
"--seed1", "akka.tcp://seed1",
"--seed2", "akka.tcp://seed2",
"--nodeip", "192.167.1.1",
"--nodeport", "2551"
)
val argsInstance = args.sliding(2, 1).toList.foldLeft(AppArgs.empty) { case (accumArgs, currArgs) => currArgs match {
case Array("--seed1", seed1) => accumArgs.copy(seed1 = seed1)
case Array("--seed2", seed2) => accumArgs.copy(seed2 = seed2)
case Array("--nodeip", ip) => accumArgs.copy(ip = ip)
case Array("--nodeport", port) => accumArgs.copy(port = port.toInt)
case unknownArg => accumArgs // Do whatever you want for this case
}
}
There's also JCommander (disclaimer: I created it):
object Main {
object Args {
@Parameter(
names = Array("-f", "--file"),
description = "File to load. Can be specified multiple times.")
var file: java.util.List[String] = null
}
def main(args: Array[String]): Unit = {
new JCommander(Args, args.toArray: _*)
for (filename <- Args.file) {
val f = new File(filename)
printf("file: %s\n", f.getName)
}
}
}
I've just found an extensive command line parsing library in scalac's scala.tools.cmd package.
See http://www.assembla.com/code/scala-eclipse-toolchain/git/nodes/src/compiler/scala/tools/cmd?rev=f59940622e32384b1e08939effd24e924a8ba8db
I've attempted to generalize @pjotrp's solution by taking in a list of required positional key symbols, a map of flag -> key symbol and default options:
def parseOptions(args: List[String], required: List[Symbol], optional: Map[String, Symbol], options: Map[Symbol, String]): Map[Symbol, String] = {
args match {
// Empty list
case Nil => options
// Keyword arguments
case key :: value :: tail if optional.get(key) != None =>
parseOptions(tail, required, optional, options ++ Map(optional(key) -> value))
// Positional arguments
case value :: tail if required != Nil =>
parseOptions(tail, required.tail, optional, options ++ Map(required.head -> value))
// Exit if an unknown argument is received
case _ =>
printf("unknown argument(s): %s\n", args.mkString(", "))
sys.exit(1)
}
}
def main(sysargs: Array[String]) {
// Required positional arguments by key in options
val required = List('arg1, 'arg2)
// Optional arguments by flag which map to a key in options
val optional = Map("--flag1" -> 'flag1, "--flag2" -> 'flag2)
// Default options that are passed in
val defaultOptions = Map.empty[Symbol, String] // a bare Map() would not type-check as Map[Symbol, String]
// Parse options based on the command line args
val options = parseOptions(sysargs.toList, required, optional, defaultOptions)
}
I have never liked ruby like option parsers. Most developers that used them never write a proper man page for their scripts and end up with pages long options not organized in a proper way because of their parser.
I have always preferred Perl's way of doing things with Perl's Getopt::Long.
I am working on a scala implementation of it. The early API looks something like this:
def print_version() = () => println("version is 0.2")
def main(args: Array[String]) {
val (options, remaining) = OptionParser.getOptions(args,
Map(
"-f|--flag" -> 'flag,
"-s|--string=s" -> 'string,
"-i|--int=i" -> 'int,
"-f|--float=f" -> 'double,
"-p|-procedure=p" -> { () => println("higher order function") },
"-h=p" -> { () => print_synopsis() },
"--help|--man=p" -> { () => launch_manpage() },
"--version=p" -> print_version
))
}
So calling script like this:
$ script hello -f --string=mystring -i 7 --float 3.14 --p --version world -- --nothing
Would print:
higher order function
version is 0.2
And return:
remaining = Array("hello", "world", "--nothing")
options = Map('flag -> true,
'string -> "mystring",
'int -> 7,
'double -> 3.14)
The project is hosted in github scala-getoptions.
I'd suggest to use http://docopt.org/. There's a scala-port but the Java implementation https://github.com/docopt/docopt.java works just fine and seems to be better maintained. Here's an example:
import org.docopt.Docopt
import scala.collection.JavaConversions._
import scala.collection.JavaConverters._
val doc =
"""
Usage: my_program [options] <input>
Options:
--sorted fancy sorting
""".stripMargin.trim
//def args = "--sorted test.dat".split(" ").toList
var results = new Docopt(doc).
parse(args()).
map {case(key, value)=>key ->value.toString}
val inputFile = new File(results("<input>"))
val sorted = results("--sorted").toBoolean
This is what I cooked. It returns a tuple of a map and a list. List is for input, like input file names. Map is for switches/options.
val args = "--sw1 1 input_1 --sw2 --sw3 2 input_2 --sw4".split(" ")
val (options, inputs) = OptParser.parse(args)
will return
options: Map[Symbol,Any] = Map('sw1 -> 1, 'sw2 -> true, 'sw3 -> 2, 'sw4 -> true)
inputs: List[Symbol] = List('input_1, 'input_2)
Switches can be flags such as "--t", in which case t is set to true, or take a value such as "--x 10", in which case x is set to "10". Everything else ends up in the list.
object OptParser {
val map: Map[Symbol, Any] = Map()
val list: List[Symbol] = List()
def parse(args: Array[String]): (Map[Symbol, Any], List[Symbol]) = _parse(map, list, args.toList)
private [this] def _parse(map: Map[Symbol, Any], list: List[Symbol], args: List[String]): (Map[Symbol, Any], List[Symbol]) = {
args match {
case Nil => (map, list)
case arg :: value :: tail if (arg.startsWith("--") && !value.startsWith("--")) => _parse(map ++ Map(Symbol(arg.substring(2)) -> value), list, tail)
case arg :: tail if (arg.startsWith("--")) => _parse(map ++ Map(Symbol(arg.substring(2)) -> true), list, tail)
case opt :: tail => _parse(map, list :+ Symbol(opt), tail)
}
}
}
I based my approach on the top answer (from dave4420), and tried to improve it by making it more general-purpose.
It returns a Map[String,String] of all command line parameters
You can query this for the specific parameters you want (eg using .contains) or convert the values into the types you want (eg using toInt).
def argsToOptionMap(args:Array[String]):Map[String,String]= {
def nextOption(
argList:List[String],
map:Map[String, String]
) : Map[String, String] = {
val pattern = "--(\\w+)".r // Selects Arg from --Arg
val patternSwitch = "-(\\w+)".r // Selects Arg from -Arg
argList match {
case Nil => map
case pattern(opt) :: value :: tail => nextOption( tail, map ++ Map(opt->value) )
case patternSwitch(opt) :: tail => nextOption( tail, map ++ Map(opt->null) )
case string :: Nil => map ++ Map(string->null)
case option :: tail => {
println("Unknown option:"+option)
sys.exit(1)
}
}
}
nextOption(args.toList,Map())
}
Example:
val args=Array("--testing1","testing1","-a","-b","--c","d","test2")
argsToOptionMap( args )
Gives:
res0: Map[String,String] = Map(testing1 -> testing1, a -> null, b -> null, c -> d, test2 -> null)
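As a rough sketch of the querying step described above: the map below is hand-written to have the same shape as a result of argsToOptionMap, and the option names (verbose, samples, maxgen) and the helper intOpt are made up for illustration. Switches are stored with a null value, so their presence is checked with contains rather than by reading the value.

```scala
object OptionMapUsage {
  // Hand-written stand-in for a result of argsToOptionMap; the names are hypothetical.
  val opts: Map[String, String] = Map("samples" -> "100", "verbose" -> null)

  // A switch counts as "on" when its key is present; its value is null.
  val verbose: Boolean = opts.contains("verbose")

  // Valued options are converted on lookup, with a fallback when the key is absent.
  def intOpt(name: String, default: Int): Int =
    opts.get(name).map(_.toInt).getOrElse(default)

  val samples: Int = intOpt("samples", 10)
  val maxgen: Int = intOpt("maxgen", -1)
}
```

Note that toInt throws a NumberFormatException on non-numeric input, so real code would probably want to handle that case explicitly.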
Another library: scarg
Here's a Scala command line parser that is easy to use. It automatically formats help text, and it converts switch arguments to your desired type. Both short POSIX-style and long GNU-style switches are supported, including switches with required arguments, optional arguments, and multiple-value arguments. You can even specify the finite list of acceptable values for a particular switch. Long switch names can be abbreviated on the command line for convenience. It is similar to the option parser in the Ruby standard library.
I like the clean look of this code... gleaned from a discussion here:
http://www.scala-lang.org/old/node/4380
object ArgParser {
val usage = """
Usage: parser [-v] [-f file] [-s sopt] ...
Where: -v Run verbosely
-f F Set input file to F
-s S Set Show option to S
"""
var filename: String = ""
var showme: String = ""
var debug: Boolean = false
val unknown = "(-\\S+)".r // regex extractors match the whole string, so accept switches of any length, e.g. -x or -foo
val pf: PartialFunction[List[String], List[String]] = {
case "-v" :: tail => debug = true; tail
case "-f" :: (arg: String) :: tail => filename = arg; tail
case "-s" :: (arg: String) :: tail => showme = arg; tail
case unknown(bad) :: tail => die("unknown argument " + bad + "\n" + usage)
}
def main(args: Array[String]): Unit = {
// if there are required args:
if (args.length == 0) die()
val arglist = args.toList
val remainingopts = parseArgs(arglist,pf)
println("debug=" + debug)
println("showme=" + showme)
println("filename=" + filename)
println("remainingopts=" + remainingopts)
}
def parseArgs(args: List[String], pf: PartialFunction[List[String], List[String]]): List[String] = args match {
case Nil => Nil
case _ => if (pf isDefinedAt args) parseArgs(pf(args),pf) else args.head :: parseArgs(args.tail,pf)
}
def die(msg: String = usage) = {
println(msg)
sys.exit(1)
}
}
I just created my simple enumeration
val args: Array[String] = "-silent -samples 100 -silent".split(" +")
//> args : Array[String] = Array(-silent, -samples, 100, -silent)
object Opts extends Enumeration {
class OptVal extends Val {
override def toString = "-" + super.toString
}
val nopar, silent = new OptVal() { // boolean options
def apply(): Boolean = args.contains(toString)
}
val samples, maxgen = new OptVal() { // integer options
def apply(default: Int) = { val i = args.indexOf(toString) ; if (i == -1) default else args(i+1).toInt}
def apply(): Int = apply(-1)
}
}
Opts.nopar() //> res0: Boolean = false
Opts.silent() //> res1: Boolean = true
Opts.samples() //> res2: Int = 100
Opts.maxgen() //> res3: Int = -1
I understand that this solution has two major flaws that may put you off: it eliminates the freedom (i.e. the dependence on other libraries that you value so much) and the redundancy (it follows the DRY principle: you type each option name only once, as a Scala value, instead of typing it a second time as command line text).
Since everyone has posted their own solution, here is mine, because I wanted something easier for the user to write: https://gist.github.com/gwenzek/78355526e476e08bb34d
The gist contains a code file, a test file, and a short example, copied here:
import ***.ArgsOps._
object Example {
val parser = ArgsOpsParser("--someInt|-i" -> 4, "--someFlag|-f", "--someWord" -> "hello")
def main(args: Array[String]): Unit = {
val argsOps = parser <<| args
val someInt : Int = argsOps("--someInt")
val someFlag : Boolean = argsOps("--someFlag")
val someWord : String = argsOps("--someWord")
val otherArgs = argsOps.args
foo(someWord, someInt, someFlag)
}
}
There are no fancy options to force a variable to stay within some bounds, because I don't feel that the parser is the best place to do so.
Note: you can have as many aliases as you want for a given variable.
I'm going to pile on. I solved this with a simple line of code. My command line arguments look like this:
input--hdfs:/path/to/myData/part-00199.avro output--hdfs:/path/toWrite/Data fileFormat--avro option1--5
This creates an array via Scala's native command line functionality (from either App or a main method):
Array("input--hdfs:/path/to/myData/part-00199.avro", "output--hdfs:/path/toWrite/Data","fileFormat--avro","option1--5")
I can then use this line to parse out the default args array:
val nArgs = args.map(x=>x.split("--")).map(y=>(y(0),y(1))).toMap
Which creates a map with names associated with the command line values:
Map(input -> hdfs:/path/to/myData/part-00199.avro, output -> hdfs:/path/toWrite/Data, fileFormat -> avro, option1 -> 5)
I can then access the values of named parameters in my code and the order they appear on the command line is no longer relevant. I realize this is fairly simple and doesn't have all the advanced functionality mentioned above but seems to be sufficient in most cases, only needs one line of code, and doesn't involve external dependencies.
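One caveat with the one-liner above is that an argument without "--" makes y(1) throw, and a value that itself contains "--" gets cut short. A slightly more defensive sketch, assuming you are happy to silently skip malformed arguments (the object and method names here are made up):

```scala
object KeyValueArgs {
  // Splits each "name--value" argument at the first "--" only,
  // and ignores arguments that do not contain the separator.
  def parse(args: Array[String]): Map[String, String] =
    args
      .map(_.split("--", 2))
      .collect { case Array(name, value) => name -> value }
      .toMap
}
```

For example, KeyValueArgs.parse(Array("fileFormat--avro", "option1--5")) gives a map from fileFormat to avro and from option1 to 5. It is still a few lines with no external dependencies.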
Here is my one-liner:
def optArg(prefix: String) = args.drop(3).find { _.startsWith(prefix) }.map { _.stripPrefix(prefix) } // stripPrefix treats the prefix literally, not as a regex
def optSpecified(prefix: String) = optArg(prefix) != None
def optInt(prefix: String, default: Int) = optArg(prefix).map(_.toInt).getOrElse(default)
It drops the 3 mandatory arguments and returns the options. Integers are written directly after the prefix, like the well-known -Xmx<size> Java option. You can parse boolean flags and integers as simply as
val cacheEnabled = optSpecified("cacheOff")
val memSize = optInt("-Xmx", 1000)
No need to import anything.
Poor man's quick-and-dirty one-liner for parsing key=value pairs:
def main(args: Array[String]): Unit = {
val cli = args.map(_.split("=") match { case Array(k, v) => k->v } ).toMap
val saveAs = cli("saveAs")
println(saveAs)
}
freecli
package freecli
package examples
package command
import java.io.File
import freecli.core.all._
import freecli.config.all._
import freecli.command.all._
object Git extends App {
case class CommitConfig(all: Boolean, message: String)
val commitCommand =
cmd("commit") {
takesG[CommitConfig] {
O.help --"help" ::
flag --"all" -'a' -~ des("Add changes from all known files") ::
O.string -'m' -~ req -~ des("Commit message")
} ::
runs[CommitConfig] { config =>
if (config.all) {
println(s"Committed all ${config.message}!")
} else {
println(s"Committed ${config.message}!")
}
}
}
val rmCommand =
cmd("rm") {
takesG[File] {
O.help --"help" ::
file -~ des("File to remove from git")
} ::
runs[File] { f =>
println(s"Removed file ${f.getAbsolutePath} from git")
}
}
val remoteCommand =
cmd("remote") {
takes(O.help --"help") ::
cmd("add") {
takesT {
O.help --"help" ::
string -~ des("Remote name") ::
string -~ des("Remote url")
} ::
runs[(String, String)] {
case (s, u) => println(s"Remote $s $u added")
}
} ::
cmd("rm") {
takesG[String] {
O.help --"help" ::
string -~ des("Remote name")
} ::
runs[String] { s =>
println(s"Remote $s removed")
}
}
}
val git =
cmd("git", des("Version control system")) {
takes(help --"help" :: version --"version" -~ value("v1.0")) ::
commitCommand ::
rmCommand ::
remoteCommand
}
val res = runCommandOrFail(git)(args).run
}
This will generate the following usage:
Usage

Scala Newb Question - about scoping and variables

I'm parsing XML, and keep finding myself writing code like:
val xml = <outertag>
<dog>val1</dog>
<cat>val2</cat>
</outertag>
var cat = ""
var dog = ""
for (inner <- xml \ "_") {
inner match {
case <dog>{ dg @ _* }</dog> => dog = dg(0).toString()
case <cat>{ ct @ _* }</cat> => cat = ct(0).toString()
}
}
/* do something with dog and cat */
It annoys me because I should be able to declare cat and dog as val (immutable), since I only need to set them once, but I have to make them mutable. And besides that it just seems like there must be a better way to do this in scala. Any ideas?
Here are two (now make it three) possible solutions. The first one is pretty quick and dirty. You can run the whole bit in the Scala interpreter.
val xmlData = <outertag>
<dog>val1</dog>
<cat>val2</cat>
</outertag>
// A very simple way to do this mapping.
def simpleGetNodeValue(x:scala.xml.NodeSeq, tag:String) = (x \\ tag).text
val cat = simpleGetNodeValue(xmlData, "cat")
val dog = simpleGetNodeValue(xmlData, "dog")
cat will be "val2", and dog will be "val1".
Note that if either node is not found, an empty string will be returned. You can work around this, or you could write it in a slightly more idiomatic way:
// A more idiomatic Scala way, even though Scala wouldn't give us nulls.
// This returns an Option[String].
def getNodeValue(x:scala.xml.NodeSeq, tag:String) = {
(x \\ tag).text match {
case "" => None
case x:String => Some(x)
}
}
val cat1 = getNodeValue(xmlData, "cat") getOrElse "No cat found."
val dog1 = getNodeValue(xmlData, "dog") getOrElse "No dog found."
val goat = getNodeValue(xmlData, "goat") getOrElse "No goat found."
cat1 will be "val2", dog1 will be "val1", and goat will be "No goat found."
UPDATE: Here's one more convenience method to take a list of tag names and return their matches as a Map[String, String].
// Searches for all tags in the List and returns a Map[String, String].
def getNodeValues(x:scala.xml.NodeSeq, tags:List[String]) = {
tags.foldLeft(Map[String, String]()) { (a, b) => a + (b -> simpleGetNodeValue(x, b)) }
}
val tagsToMatch = List("dog", "cat")
val matchedValues = getNodeValues(xmlData, tagsToMatch)
If you run that, matchedValues will be Map(dog -> val1, cat -> val2).
Hope that helps!
UPDATE 2: Per Daniel's suggestion, I'm using the double-backslash operator, which will descend into child elements, which may be better as your XML dataset evolves.
scala> val xml = <outertag><dog>val1</dog><cat>val2</cat></outertag>
xml: scala.xml.Elem = <outertag><dog>val1</dog><cat>val2</cat></outertag>
scala> val cat = xml \\ "cat" text
cat: String = val2
scala> val dog = xml \\ "dog" text
dog: String = val1
Consider wrapping up the XML inspection and pattern matching in a function that returns the multiple values you need as a tuple (Tuple2[String, String]). But stop and consider: it looks like it's possible to match no dog or cat element at all, which would leave you returning null for one or both of the tuple components. Perhaps you could return a tuple of Option[String], or throw if either of the element patterns fails to bind.
In any case, you can generally solve these initialization problems by wrapping up the constituent statements into a function to yield an expression. Once you have an expression in hand, you can initialize a constant with the result of its evaluation.
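A minimal sketch of that idea, assuming the scala-xml module is on the classpath (the object name TupleInit and the helper dogAndCat are made up for illustration):

```scala
import scala.xml.Elem

object TupleInit {
  // Hypothetical helper: looks up the direct <dog> and <cat> children once
  // and returns both as Options, so a missing element is visible in the type.
  def dogAndCat(xml: Elem): (Option[String], Option[String]) = {
    def textOf(tag: String): Option[String] =
      (xml \ tag).headOption.map(_.text)
    (textOf("dog"), textOf("cat"))
  }

  val xml: Elem = <outertag><dog>val1</dog><cat>val2</cat></outertag>

  // Both values are initialized exactly once, so they can be vals.
  val (dog, cat) = dogAndCat(xml)
}
```

Here dog and cat are immutable, and the caller decides what to do about a missing element (getOrElse, pattern match, or throw) instead of starting from an empty-string var.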