Is it possible to use unix wildcards with Scala's ProcessBuilder? - scala

Process( Seq( "rm", "-r", "*" ), new File( "/some/dir" ) ) !
This code fails (rm: *: No such file or directory) because the * is passed to rm as a literal string rather than expanded as a Unix wildcard. Is it possible to change this behavior?

The glob pattern is a shell feature. You can either glob for files yourself, or execute a shell, passing your arguments to it:
Process( Seq( "/bin/sh", "-c", "rm -r *" ), new File( "/some/dir" ) ) !
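For the "glob for files yourself" option, here is a minimal sketch using java.nio.file.Files.newDirectoryStream, which accepts a glob pattern. The names GlobSketch, globIn and rmCommand are made up for illustration, and the scala.jdk.CollectionConverters import assumes Scala 2.13 or later.

```scala
import java.nio.file.{Files, Path}
import scala.jdk.CollectionConverters._

object GlobSketch {
  // Entries directly inside `dir` whose file name matches `glob`, e.g. "*" or "*.log".
  def globIn(dir: Path, glob: String): List[Path] = {
    val stream = Files.newDirectoryStream(dir, glob)
    try stream.iterator().asScala.toList
    finally stream.close()
  }

  // Builds (but does not run) an rm command with the matches as explicit arguments.
  def rmCommand(dir: Path, glob: String): Seq[String] =
    Seq("rm", "-r") ++ globIn(dir, glob).map(_.toString)
}
```

You would then run it with something like Process(GlobSketch.rmCommand(dir, "*")).! after importing scala.sys.process._. Two differences from the shell to keep in mind: the Java glob "*" also matches dot files, and if nothing matches, rm is called with no operands.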

Related

Scala: Listing files that match a regular expression within a directory

I'm trying to list files within a directory that match a regular expression, e.g. ".csv$". This is very similar to Scala & DataBricks: Getting a list of Files.
I've been running in circles for hours trying to figure out how Scala can list a directory of files and filter by regex.
import java.io.File
def getListOfFiles(dir: String):List[File] = {
  val d = new File(dir)
  if (d.exists && d.isDirectory) {
    d.listFiles.filter(_.isFile).toList
  } else {
    List[File]()
  }
}
val name : String = ".csv"
val files = getListOfFiles("/home/con/Scripts").map(_.path).filter(_.matches(name))
println(files)
gives the error
/home/con/Scripts/scala/find_files.scala:13: error: value path is not a member of java.io.File
val files = getListOfFiles("/home/con/Scripts").map(_.path).filter(_.matches(name))
I'm trying to figure out the regular Scala equivalent of dbutils.fs.ls, which eludes me.
How can I list files in a regular directory in Scala?
The error reports that path is not a member of java.io.File, and indeed it isn't.
If you want to match by name, why don't you get file names? Also, your regex is a bit off if you want to match based on file extension.
Fixing these two problems:
val name : String = ".+\\.csv"
val files = getListOfFiles("/path/to/files/location")
  .map(f => f.getName)
  .filter(_.matches(name))
will give you the names of the .csv files in the /path/to/files/location folder.
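The reason the original ".csv" pattern finds nothing is that String.matches has to match the whole string, not just a part of it. Below is a self-contained sketch of the same idea; ListFilesSketch and fileNamesMatching are hypothetical names, and the result is sorted only to make the output predictable.

```scala
import java.io.File

object ListFilesSketch {
  // Names of the regular files directly inside `dir` whose full name matches `pattern`.
  def fileNamesMatching(dir: File, pattern: String): List[String] = {
    // listFiles returns null when `dir` is not a directory or cannot be read.
    val entries = Option(dir.listFiles).map(_.toList).getOrElse(Nil)
    entries
      .filter(_.isFile)
      .map(_.getName)
      .filter(_.matches(pattern)) // matches() must cover the entire name
      .sorted
  }
}
```

For example, ListFilesSketch.fileNamesMatching(new File("/path/to/files/location"), ".+\\.csv") returns only names ending in .csv, and a missing directory gives an empty list instead of an exception.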

Execute shell script from scala application

I want to execute an sh file from a Scala application.
Let's say I have a createPassword.sh file, and I need to invoke it from the Scala application and get the output back.
How can I achieve this from a Scala application?
This should do the trick if the script is in the current working directory (otherwise specify the full path of the script)
import sys.process._
val result = "./createPassword.sh" !!
result is then a String containing the standard output. Standard error is sent to the console, and !! throws an exception if the exit code is non-zero.
EDIT: If you want to use ProcessBuilder from Java SE 7, you can also use this in Scala:
import java.io.{BufferedReader, InputStreamReader}
val p = new ProcessBuilder("/bin/bash","createPassword.sh")
val p2 = p.start()
val br = new BufferedReader(new InputStreamReader(p2.getInputStream()))
var line:String = ""
while ({line = br.readLine(); line!= null}) {
  println(line)
}
Given your dir has a script,
import java.io.File
import scala.sys.process._
val path = "./src/test/tests/Integration/"
val output = Process("sh test.sh", new File(path)).!!
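Putting the two answers together, here is a small sketch that runs a script with a given working directory and returns its standard output. ScriptRunSketch and runScript are hypothetical names, and it assumes sh is on the PATH.

```scala
import java.io.File
import scala.sys.process._

object ScriptRunSketch {
  // Runs `sh <script>` with `dir` as the working directory and returns the trimmed stdout.
  // `!!` throws a RuntimeException if the script exits with a non-zero code.
  def runScript(dir: File, script: String): String =
    Process(Seq("sh", script), dir).!!.trim
}
```

Passing the command as a Seq rather than a single string avoids surprises when the script name contains spaces.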

Pass a TXT file as stopword list in Scala

I am using the Stanford Topic Modeling Toolbox (TMT) http://nlp.stanford.edu/software/tmt/tmt-0.4/, and I want to prepare my text data set.
I have a txt file of stopwords.
However,
TermStopListFilter()
Which filters out stop words from my CSV data set, only accepts a list within the script, such as:
TermStopListFilter(List("positively","scrumptious"))
How do I import my stopwords.txt file and use it as my stopword list?
A full snippet of the code I use:
val source = CSVFile("filtered.csv");
val text = {
  source ~>
  Column(1) ~>
  TokenizeWith(tokenizer) ~>
  TermCounter() ~>
  TermMinimumDocumentCountFilter(100) ~>
  TermStopListFilter(TXTFile("stopwords.txt")) ~>
  TermDynamicStopListFilter(10) ~>
  DocumentMinimumLengthFilter(5)
}
Well, if your stopwords are separated by ",", you can try this (it needs import scala.io.Source):
.
.
TermStopListFilter(Source.fromFile("stopwords.txt").getLines().toList.flatMap(_.split(",")))
.
.
If the stopwords in stopwords.txt are delimited by some other character, change split(",") accordingly. Most likely you should also remove the line TermStopListFilter(List("positively","scrumptious")).
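If you want the file reading separated from the pipeline, something like the following sketch could work. StopwordsSketch, parseStopwords and loadStopwords are hypothetical helper names (not part of TMT), and trimming whitespace and dropping empty entries is my own assumption about what you want.

```scala
import scala.io.Source

object StopwordsSketch {
  // Splits every line on `delimiter` (treated as a regex by String.split),
  // trims whitespace and drops empty entries.
  def parseStopwords(lines: Iterator[String], delimiter: String = ","): List[String] =
    lines.flatMap(_.split(delimiter).toList).map(_.trim).filter(_.nonEmpty).toList

  // Reads the file at `path` and closes it once the words are collected.
  def loadStopwords(path: String, delimiter: String = ","): List[String] = {
    val source = Source.fromFile(path)
    try parseStopwords(source.getLines(), delimiter)
    finally source.close()
  }
}
```

The result is a plain List[String], so it can be passed as TermStopListFilter(StopwordsSketch.loadStopwords("stopwords.txt")), the same way as the inline list in the question.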

Scala - shell commands with pipe

I'm a Scala beginner and I'm writing a wrapper for invoking shell commands. Currently I'm trying to invoke shell commands with pipes from a specified directory.
To achieve this I wrote a simple utility:
def runCommand(command: String, directory: File): (Int, String, String) = {
  val errbuffer = new StringBuffer();
  val outbuffer = new StringBuffer();
  //run the command
  val ret = sys.process.Process(command, directory) !
  //log output and err
  ProcessLogger(outbuffer append _ + "\n", outbuffer append _ + "\n");
  return (ret, outbuffer.toString(), errbuffer.toString());
}
However with this utility I can't use pipes, for example:
runCommand("ps -eF | grep -i foo", new File("."));
At first I thought that pipes are a shell feature, so I tried "/bin/sh -c ps -eF | grep -i foo", but it seems that the expression to the right of the pipe was ignored.
I also tried running commands with the ! syntax (sys.process._ package), but I couldn't figure out how to call a command from a specified directory (without using "cd").
Could you please advise me how to do this correctly?
Change
val ret = sys.process.Process(command, directory) !
to
val ret = sys.process.stringSeqToProcess(Seq("/bin/bash", "-c", "cd " + directory.getAbsolutePath + ";" + command)).!
Or you could directly use the magic provided by Scala:
import scala.sys.process._
val ret = ("ps -ef" #| "grep -i foo").!
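For completeness, here is a sketch of how the runCommand utility from the question might look with the output actually captured. It passes the whole command line to /bin/sh -c as one argument (so the shell handles the pipe), sets the working directory through Process instead of cd, and hands the ProcessLogger to ! so it receives the output. RunCommandSketch is a made-up wrapper name, and the sketch assumes /bin/sh exists.

```scala
import java.io.File
import scala.sys.process._

object RunCommandSketch {
  // Runs `command` through /bin/sh in `directory`; returns (exit code, stdout, stderr).
  def runCommand(command: String, directory: File): (Int, String, String) = {
    val outBuffer = new StringBuffer
    val errBuffer = new StringBuffer
    // The logger only sees output if it is passed to `!`; creating it afterwards has no effect.
    val logger = ProcessLogger(
      line => { outBuffer.append(line).append('\n'); () },
      line => { errBuffer.append(line).append('\n'); () }
    )
    // The command line is a single argument to `sh -c`, so pipes and redirects work.
    val exitCode = Process(Seq("/bin/sh", "-c", command), directory).!(logger)
    (exitCode, outBuffer.toString, errBuffer.toString)
  }
}
```

With this, runCommand("ps -eF | grep -i foo", new File(".")) should run the full pipeline. As far as I can tell, the single string "/bin/sh -c ps -eF | grep -i foo" did not work because it is split on whitespace without any shell parsing, so sh -c only receives ps as its command.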