Avoid Printing of code when running scripts in spark shell - scala

I am trying to connect to a database from the Spark shell using a script in a Scala file.
When connecting, the script takes the password from another location, but the password gets printed in the Spark shell console.
I just want to avoid that.
The Scala code is as below:
val config=Map("driver"->"drivername","url"->"dburl","user"->"username","password"->"741852963");
When I load this code in the Spark shell, the shell echoes the code as well. I want just this part not to be printed in the Spark console.
How can I achieve this?

You have several ways to achieve this:
You can wrap your config definition in an object. Spark shell will just output that an object is defined
scala> object ConfigHolder {
| val config=Map("secret"->"value")
| }
defined object ConfigHolder
You can then simply reference your config as ConfigHolder.config instead of config
You can disable/re-enable output printing in the shell with the :silent command
scala> :silent
scala> val config=Map("secret"->"value")
scala> :silent
Of course, neither of these approaches prevents anyone with access to the Spark shell from reading your credentials. They only keep casual onlookers from seeing them.
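As a rough sketch of the object-wrapping approach, the holder below also reads the password from the environment instead of hardcoding it. The variable name DB_PASSWORD and the redacted helper are assumptions made for illustration, not part of the original answer.

```scala
// Sketch: keep the secret out of the script and out of the REPL echo.
// DB_PASSWORD is a hypothetical environment variable name.
object ConfigHolder {
  private val password: String = sys.env.getOrElse("DB_PASSWORD", "")

  val config: Map[String, String] = Map(
    "driver"   -> "drivername",
    "url"      -> "dburl",
    "user"     -> "username",
    "password" -> password
  )

  // A copy with the password masked, safe to print or log.
  def redacted: Map[String, String] = config.updated("password", "****")
}
```

Note that evaluating ConfigHolder.config on its own line in the shell would still echo the map, so it is safer to pass it directly to the call that needs it and to print ConfigHolder.redacted when debugging.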

Related

Is it possible to run a Spark Scala script without going inside spark-shell?

The only two ways I know to run Scala-based Spark code are to either compile a Scala program into a jar file and run it with spark-submit, or run a Scala script by using :load inside the spark-shell. My question is: is it possible to run a Scala file directly on the command line, without first going inside spark-shell and then issuing :load?
You can simply use the stdin redirection with spark-shell:
spark-shell < YourSparkCode.scala
This command starts a spark-shell, interprets your YourSparkCode.scala line by line and quits at the end.
Another option is to use -I <file> option of spark-shell command:
spark-shell -I YourSparkCode.scala
The only difference is that the latter command leaves you inside the shell and you must issue :quit command to close the session.
[UPD]
Passing parameters
Since spark-shell does not execute your source as an application but just interprets your source file line by line, you cannot pass any parameters directly as application arguments.
Fortunately, there are many ways to achieve the same result (e.g., externalizing the parameters in another file and reading it at the very beginning of your script).
But I personally find the Spark configuration the most clean and convenient way.
You pass your parameters via the --conf option:
spark-shell --conf spark.myscript.arg1=val1 --conf spark.myscript.arg2=val2 < YourSparkCode.scala
(please note that spark. prefix in your property name is mandatory, otherwise Spark will discard your property as invalid)
And read these arguments in your Spark code as below:
val arg1: String = spark.conf.get("spark.myscript.arg1")
val arg2: String = spark.conf.get("spark.myscript.arg2")
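Since spark.conf.get(key) throws when the property was not passed, a small helper with a fallback value can make the script more forgiving. This is only a sketch: readArg and fakeConf are hypothetical names, and the Map stands in for the Spark configuration so the snippet runs without Spark.

```scala
// `lookup` is any function from a property name to an optional value.
// In spark-shell it could be: key => spark.conf.getOption(key)
def readArg(lookup: String => Option[String], name: String, default: String): String =
  lookup(s"spark.myscript.$name").getOrElse(default)

// Stand-in for the Spark configuration (assumption, for illustration only).
val fakeConf = Map("spark.myscript.arg1" -> "val1")

val arg1 = readArg(fakeConf.get, "arg1", "none") // property present
val arg2 = readArg(fakeConf.get, "arg2", "none") // property missing, falls back
```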
It is possible via spark-submit.
https://spark.apache.org/docs/latest/submitting-applications.html
You can even wrap it in a bash script or create an sbt task to run your code:
https://www.scala-sbt.org/1.x/docs/Tasks.html

Execute the scala script through spark-shell in silent mode

I need to execute a Scala script through spark-shell in silent mode. When I use spark-shell -i "file.scala", I end up in the Scala interactive mode after the execution. I don't want that.
I have tried spark-shell -i "file.scala", but I don't know how to execute the script in silent mode.
spark-shell -i "file.scala"
after execution, I get into
scala>
I don't want to get into the scala> mode
Updating (October 2019) for a script that terminates
This question is also about running a script that terminates, that is, a "Scala script" run with spark-shell -i script.scala > output.txt that stops by itself (an internal System.exit(0) instruction terminates the script). See this question with a good example.
It also needs a "silent mode", it is expected to not pollute the output.txt.
Suppose Spark v2.2+.
PS: there are a lot of cases (typically small tools and module/algorithm tests) where Spark interpreter can be better than compiler... Please, "let's compile!" is not an answer here.
spark-shell -i file.scala keeps the interpreter open at the end, so System.exit(0) is required at the end of your script. The most appropriate solution is to place your code in try {} and put System.exit(0) in the finally {} section.
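The try/finally pattern could be sketched as below. The wrapper name runThenExit and its injectable exit parameter are assumptions added here so the behaviour can be checked without killing the JVM; in a real script the default (sys.exit) would be used.

```scala
// Hypothetical wrapper: run the script body, then always leave the interpreter.
// `exit` defaults to sys.exit and is a parameter only to make the sketch testable.
def runThenExit(body: => Unit, exit: Int => Unit = (code: Int) => sys.exit(code)): Unit =
  try {
    body
  } finally {
    // Runs whether or not `body` threw, so the shell never drops to scala>
    exit(0)
  }

// In file.scala this would be used as:
// runThenExit { println("script work goes here") }
```

Because the exit code is always 0, a failure inside the body is not visible to the calling shell. A variant that catches the exception and exits with a non-zero code may be preferable when the script runs in a pipeline.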
If logging is required, you can use something like this:
spark-shell < file.scala > test.log 2>&1 &
If you have limitations on editing file and you can't add System.exit(0), use:
echo :quit | spark-shell -i file.scala
UPD
If you want to suppress everything in the output except printlns, you have to turn off logging for spark-shell. The sample of configs is here. Disabling any kind of logging in $SPARK_HOME/conf/log4j.properties should allow you to see only printlns. But I would not follow this approach with printlns. General logging with log4j should be used instead of printlns. You can configure it to obtain the same results as with printlns. It boils down to configuring a pattern. This answer provides an example of a pattern that solves your issue.
The best way is definitely to compile your Scala code to a jar and use spark-submit, but if you're simply looking for a quick iteration loop, you can issue a :quit after parsing your Scala code:
echo :quit | spark-shell -i yourfile.scala
Adding onto @rluta's answer: you can place the call to the spark-shell command inside a shell script. Say, put the line below in a shell script:
spark-shell < yourfile.scala
But this requires you to keep each statement on a single line, even if it was originally written across several lines.
OR
echo :quit | spark-shell -i yourfile.scala
This should

Load spark scala script into spark shell

I am trying to load a Spark Scala script into the Spark shell using the :load command, where the location of the script is passed in a variable. It's not working:
val scriptLoc="/abc/spark"
:load ${scriptLoc}/scriptName.scala
I also tried it like this, which didn't work either:
:load scriptLoc/scriptName.scala
Any help would be appreciated.
You can try
spark-shell -i /path/to/file.scala

Is there a configuration file for Scala REPL / SBT Console?

I've been trying to find some sort of a dotfile to put Scala REPL settings and custom function in.
In particular I'm interested in passing it flags like -Dscala.color (enables syntax highlighting), as well as overriding settings like result string truncation:
scala> :power
scala> vals.isettings.maxPrintString = 10000
It would be nice to have these settings apply to both the simple Scala REPL sessions as well as sbt console sessions.
Does such a central configuration place exist for Scala?
Maybe you can use a modernized Scala REPL:
https://lihaoyi.github.io/Ammonite/
Poor man's solution: Set yourself an alias
alias myScala='scala -Dscala.repl.maxprintstring=10000'
As mentioned here, ~/.sbt/0.13/global.sbt is the global configuration file for sbt. You can change your global settings there. This is probably not going to affect the plain REPL, but it should work with the sbt console.
You mainly asked about property settings, this goes a little beyond that to consider loading a definitions file as well—and isn't much help for Windows—but I thought I'd share in case it's useful:
I've resorted to using a wrapper script saved as ~/bin/scala, to set config properties and load some utility functions:
#!/bin/sh
# The scala REPL doesn't have any config file, so this wrapper serves to set
# some property values and load an init file of utilities when run without
# arguments to enter REPL mode.
#
# If there are arguments, just assume we're running a .scala file in script
# mode, a class or jar, etc., and execute normally.
SCALA=${SCALA:-/usr/local/bin/scala}
if [ "$#" -eq 0 ] && [ -r ~/.config/scala/replinit.scala ]; then
    exec "$SCALA" -i ~/.config/scala/replinit.scala -Dscala.color
else
    exec "$SCALA" "$@"
fi
If you sometimes use Ammonite REPL, as another answer suggests, the utility definitions can be shared by loading them from ~/.ammonite/predef.scala:
try load.exec(ammonite.ops.home/".config"/'scala/"replinit.scala")
catch { case _: Exception => println("=== replrc not loaded! ===") }
I'm not sure about a way to load the init file for sbt console automatically, though—Seth Tisue's comment about the initialize setting is helpful for properties, but using a :load command in a value for initialCommands in console doesn't appear to work.

Execute external command

I do not know whether it is a Scala or Play! question. I want to execute some external command from my Play application, get the output from the command and show a report to user based on the command output. Can anyone help?
For example, when I enter my-command from shell it shows output like below, which I want to capture and show in web:
Id Name IP
====================
1 A x.y.z.a
2 B p.q.r.s
Please do not worry about the format and parsing of the output. Functionally, I am looking for something like PHP exec. I know about Java's Runtime.getRuntime().exec("command"), but is there a Scala/Play version that serves the purpose?
The method !! of the Scala process package does what you need: it executes the statement and captures the text output. For example:
import scala.sys.process._
val cmd = "uname -a" // Your command
val output = cmd.!! // Captures the output
scala> import scala.sys.process._
scala> Process("cat temp.txt")!
This assumes there is a temp.txt file in the current working directory. ! actually executes the command and returns its exit code, with the command's output going to the console. See scala.sys.process for more info.
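To make the difference between ! and !! concrete, here is a small sketch. It assumes a Unix-like system where sh and echo are available; the value names are made up for illustration.

```scala
import scala.sys.process._
import scala.util.Try

// `!` runs the command and returns its exit code.
val okCode: Int = Seq("sh", "-c", "exit 0").!
val failCode: Int = Seq("sh", "-c", "exit 3").!

// `!!` returns the command's standard output as a String...
val captured: String = Seq("echo", "hello").!!

// ...and throws if the command exits with a non-zero code,
// so wrap it in Try when failure is an expected outcome.
val failed: Try[String] = Try(Seq("sh", "-c", "exit 3").!!)
```

Passing the command as a Seq rather than a single String avoids surprises with how arguments containing spaces are split.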
You can use the Process library: for instance
import scala.sys.process.Process
Process("ls").!!
to get the list of files in the folder as a string. !! returns the output of the command.
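For a report page it may help to have the exit code, standard output and standard error all at once. The sketch below does that with ProcessLogger. The helper name runCommand is hypothetical, and the usage line assumes echo exists on the system.

```scala
import scala.sys.process._

// Hypothetical helper: run a command and return (exit code, stdout, stderr).
def runCommand(cmd: Seq[String]): (Int, String, String) = {
  val out = new StringBuilder
  val err = new StringBuilder
  // ProcessLogger receives stdout and stderr one line at a time.
  val logger = ProcessLogger(
    (line: String) => { out.append(line).append('\n'); () },
    (line: String) => { err.append(line).append('\n'); () }
  )
  // `!` with a logger blocks until the command finishes and returns its exit code.
  val exitCode = Process(cmd).!(logger)
  (exitCode, out.toString, err.toString)
}

val (code, output, errors) = runCommand(Seq("echo", "hello"))
```

Unlike !!, this variant does not throw on a non-zero exit code, so the caller can decide how to present a failed command to the user.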