I am running spark-shell with Scala and I want to set an environment variable to load data into Google BigQuery. The environment variable is GOOGLE_APPLICATION_CREDENTIALS and it contains /path/to/service/account.json
In a Python environment I can easily do:
import os
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = "path/to/service/account.json"
However, I cannot do this in Scala. I can print out the system environment variables using,
scala> sys.env
or
scala> System.getenv()
which returns a map of String key/value pairs. However,
scala> System.getenv("GOOGLE_APPLICATION_CREDENTIALS") = "path/to/service/account.json"
returns an error
<console>:26: error: value update is not a member of java.util.Map[String,String]
I found a workaround for this problem, though I don't think it's best practice. Here is the two-step solution:
From the terminal/cmd, first create the environment variable:
export GOOGLE_APPLICATION_CREDENTIALS=path/to/service/account.json
From the same terminal window, open spark-shell and run:
System.getenv("GOOGLE_APPLICATION_CREDENTIALS")
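For comparison with the Python snippet in the question, the reason this workaround works is that a child process inherits whatever is in its parent's environment at launch time. A minimal sketch of that inheritance, assuming a child Python interpreter as a stand-in for spark-shell (the `code` string and variable names are illustrative only):

```python
import os
import subprocess
import sys

# Set the variable in this (parent) process, as in the question's Python snippet.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/service/account.json"

# The child is started without an explicit env, so it inherits the parent's environment.
code = "import os, sys; sys.stdout.write(os.environ.get('GOOGLE_APPLICATION_CREDENTIALS', ''))"
result = subprocess.run(
    [sys.executable, "-c", code],
    capture_output=True, text=True, check=True,
)
seen_by_child = result.stdout
print(seen_by_child)
```

Exporting the variable in the terminal before starting spark-shell relies on the same mechanism: the shell is the parent and spark-shell is the child.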
I have a gatling project where I make use of environment variables:
val Feeder: String = scala.util.Properties.envOrElse("FEEDER", "sts")
When I run it using Maven I try to set the value of that variable, but I always get the default value:
$ mvn gatling:test -Dgatling.simulationClass=simulations.MySimulation -DFEEDER=something
I've also tried to just set the value in the terminal before executing maven:
$ FEEDER=something
$ mvn gatling:test -Dgatling.simulationClass=simulations.MySimulation
but I always get the default value "sts"
Am I missing something? How can I set the value I want to get?
You're confusing env variables and Java System Properties.
scala.util.Properties.envOrElse is for the former while -DFEEDER=something is for the latter.
Please use scala.util.Properties.propOrElse or sys.props.getOrElse.
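A separate detail in the second attempt: `FEEDER=something` on a line of its own creates a shell variable but does not export it, so a child process would not see it as an environment variable either. A small sketch of that behaviour, assuming a POSIX `sh` is available on the PATH; the `run_script` helper is a made-up name and a trivial `sh -c` child stands in for Maven:

```python
import os
import subprocess

# Child command standing in for mvn: prints FEEDER, or "sts" when it is unset.
CHILD = "sh -c 'echo ${FEEDER:-sts}'"

def run_script(script):
    """Run a POSIX sh script in an environment without FEEDER and return its output."""
    env = {k: v for k, v in os.environ.items() if k != "FEEDER"}
    result = subprocess.run(
        ["sh", "-c", script],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

# Plain assignment on its own line: the child does not see it.
not_exported = run_script("FEEDER=something\n" + CHILD)
# Exported assignment: the child sees it.
exported = run_script("export FEEDER=something\n" + CHILD)
# Assignment prefixed to the command: placed in that command's environment only.
inline = run_script("FEEDER=something " + CHILD)

print(not_exported, exported, inline)
```

So if the simulation keeps reading an environment variable, either `export FEEDER=something` or `FEEDER=something mvn ...` on one line would be needed; with `-DFEEDER=something`, the property-based lookup from the answer applies instead.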
I exported a variable called DBURL='postgresql://string' and I want to use it in my configuration ini file, e.g.:
[app:kotti]
sqlalchemy.url = %(DBURL)s
That's not working.
Put this in your __init__.py and apply it to the settings dictionary your main function receives:
import os

def expandvars_dict(settings):
    """Expands all environment variables in a settings dictionary."""
    return dict((key, os.path.expandvars(value)) for
                key, value in settings.items())
Then, once you have exported an environment variable in your shell, reference it in the .ini file with this syntax:
sqlalchemy.url = ${DBURL}
Once that setting is defined in your .ini, you can reference it from other settings with the configparser syntax:
sqlalchemy.connection = %(sqlalchemy.url)s%(user:pass and other stuff)s
Idea stolen from https://stackoverflow.com/a/16446566/2214933
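As a quick check of what the helper does, here is a minimal sketch that runs it on a hand-built settings dictionary. The dictionary contents are made up for illustration, and the sketch assumes every value is a string, since os.path.expandvars does not accept other types:

```python
import os

def expandvars_dict(settings):
    """Expand environment variables in every (string) value of a settings dict."""
    return {key: os.path.expandvars(value) for key, value in settings.items()}

# Stand-in for `export DBURL='postgresql://string'` in the shell.
os.environ["DBURL"] = "postgresql://string"

# Stand-in for the settings parsed from the .ini file.
settings = {
    "sqlalchemy.url": "${DBURL}",
    "pyramid.reload_templates": "false",
}
expanded = expandvars_dict(settings)
print(expanded["sqlalchemy.url"])
```

Values without a `$` reference pass through unchanged, and os.path.expandvars leaves references to unset variables as they are rather than raising an error.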
PasteDeploy (the ini format Pyramid is using here) does not support reading directly from environment variables. A couple of common options are:
1) Set that option yourself in your main.
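import os
from pyramid.config import Configurator

def main(global_config, **settings):
    # Take the database URL from the environment (raises KeyError if DBURL is unset).
    settings['sqlalchemy.url'] = os.environ['DBURL']
    config = Configurator(settings=settings)
    ...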
2) Define your ini file as a jinja2 template and have a command to render it out to ini format, and just run that as part of your deploy process.
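Option 2 can be sketched without extra dependencies. Note that this uses the standard library's string.Template in place of jinja2, so it only illustrates the render-then-deploy idea; the template text and the `render_ini` helper are hypothetical:

```python
import configparser
from string import Template

# Hypothetical template; ${DBURL} is filled in at render time.
INI_TEMPLATE = """\
[app:kotti]
sqlalchemy.url = ${DBURL}
"""

def render_ini(template_text, values):
    """Substitute ${NAME} placeholders; raises KeyError if a value is missing."""
    return Template(template_text).substitute(values)

rendered = render_ini(INI_TEMPLATE, {"DBURL": "postgresql://string"})

# Parse the result to confirm it is a usable ini file.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(rendered)
url = parser["app:kotti"]["sqlalchemy.url"]
print(url)
```

In a deploy script the values mapping would typically be os.environ, and the rendered text would be written to the .ini file that the application is started with.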
Friends,
I'm trying to set the environment variable "asdf" in my Scala shell, as described here
These are my commands:
scala> import scala.sys.process.Process
import scala.sys.process.Process
scala> Process(Seq("bash", "-c", "echo $asdf"), None, "asdf" -> "Hello, world!").!
Hello, world!
res18: Int = 0
But when I try to read the environment variable back:
scala> sys.env.get("asdf")
res19: Option[String] = None
The output says "None". How do I properly set my environment variable in the current session?
PS - Please do not downvote this; I'm trying really hard but am unable to get past my issue.
It has nothing to do with Scala; you are just misunderstanding the situation. The map at the end of the line
scala> Process(Seq("bash", "-c", "echo $asdf"), None, "asdf" -> "Hello, world!").!
doesn't change the environment of this process, the one you are typing into; it changes the environment of the child process that the Process() function creates.
A Scala/Java process cannot modify its own environment through the standard API. You can use the scala.util.Properties object to inspect environment variables and system properties. (Docs are here.) System properties can be added, removed, or changed, but environment variables cannot.
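The parent/child separation is not specific to the JVM. A minimal Python sketch of the same situation, using a child Python interpreter instead of bash (unlike the JVM, Python does allow assigning to os.environ, so this only illustrates that an environment handed to a child does not flow back to the parent):

```python
import os
import subprocess
import sys

before = os.environ.get("asdf")

# Build an environment for the child only; the parent's os.environ is not touched.
child_env = {**os.environ, "asdf": "Hello, world!"}
code = "import os; print(os.environ['asdf'])"
child_output = subprocess.run(
    [sys.executable, "-c", code],
    env=child_env, capture_output=True, text=True, check=True,
).stdout.strip()

after = os.environ.get("asdf")
print(child_output, before == after)
```

The child prints the value it was given, while the parent's view of `asdf` is the same before and after the call.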
I am using this version of Spark: spark-1.4.0-bin-hadoop2.6. I want to check a few default properties, so I entered the following statement in spark-shell:
scala> sqlContext.getConf("spark.sql.hive.metastore.version")
I was expecting the call to getConf to return a value of 0.13.1, as described in this link. But I got the exception below:
java.util.NoSuchElementException: spark.sql.hive.metastore.version
at org.apache.spark.sql.SQLConf$$anonfun$getConf$1.apply(SQLConf.scala:283)
at org.apache.spark.sql.SQLConf$$anonfun$getConf$1.apply(SQLConf.scala:283)
Am I retrieving the properties in the right way?
You can use
sc.getConf.toDebugString
OR
sqlContext.getAllConfs
which will return all values that have been set explicitly. Some defaults, however, live only in the code. In your specific example, the lookup is:
getConf(HIVE_METASTORE_VERSION, hiveExecutionVersion)
where the default is defined in the code as:
val hiveExecutionVersion: String = "0.13.1"
So getConf will attempt to pull the metastore version from the config, falling back to the default, but the default is not listed in the conf itself.
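The distinction between "values that were set" and "defaults supplied by the caller" can be shown with a toy model. This is plain Python, not Spark's SQLConf; the ConfModel class and its method names are invented for illustration:

```python
class ConfModel:
    """Toy stand-in for a config object; not Spark's actual implementation."""

    def __init__(self):
        self._settings = {}

    def set(self, key, value):
        self._settings[key] = value

    def get_all(self):
        """Return only the values that were explicitly set."""
        return dict(self._settings)

    def get(self, key, default=None):
        """Return the set value, else the caller's default, else fail."""
        if key in self._settings:
            return self._settings[key]
        if default is not None:
            return default
        raise KeyError(key)

KEY = "spark.sql.hive.metastore.version"
conf = ConfModel()

listed = conf.get_all()                  # nothing was set explicitly
with_default = conf.get(KEY, "0.13.1")   # the caller supplies the default

try:
    conf.get(KEY)                        # no default supplied
    lookup_failed = False
except KeyError:
    lookup_failed = True
```

In this model the single-argument lookup fails in the same way as the call in the question, while code that passes its own default still gets a value, even though that default never shows up in the listing.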
In Spark 2.x, to find the value of a Spark conf I would do the following.
The command below returns a Scala Map in spark-shell:
spark.sqlContext.getAllConfs
To look up the value of a single conf property, e.g. the default warehouse directory stored under spark.sql.warehouse.dir:
spark.sqlContext.getAllConfs.get("spark.sql.warehouse.dir")
I am new to the hosting world (cloudControl), and I have a problem with application credentials, such as those for database administration (MongoHQ) or Google authentication.
So, should I put those values in the code as variables with some kind of syntax (something like $variable), and then run a command line that passes key-value pairs mapping each variable to its value?
If you are using Tornado, it makes it even simpler. Use tornado.options and pass environment variables while running the code.
Use the following in your Tornado code:
from tornado.options import define, options
define("mysql_host", default="127.0.0.1:3306", help="Main user DB")
define("google_oauth_key", help="Client key for Google Oauth")
Then you can access these values in the rest of your code as:
options.mysql_host
options.google_oauth_key
When you are running your Tornado script, pass the environment variables:
python main.py --mysql_host=$MYSQL_HOST --google_oauth_key=$OAUTH_KEY
assuming both $MYSQL_HOST and $OAUTH_KEY are environment variables. Let me know if you need a full working example or any further help.
Example:
First set an environment variable:
$ export mongo_uri_env=mongodb://alien:12345#kahana.mongohq.com:10067/essog
and make changes in your Tornado code:
define("mongo_uri", default="127.0.0.1:28017", help="MongoDB URI")
...
...
uri = options.mongo_uri
and you would run your code as
python main.py --mongo_uri=$mongo_uri_env
If you don't want to pass it while running, then you have to read that environment variable directly in your script. For that:
import os
...
...
uri = os.environ['mongo_uri_env']
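The two approaches can also be combined so that the flag is optional. The sketch below uses the standard library's argparse rather than tornado.options, purely to keep it dependency-free; the `parse_mongo_uri` helper is a made-up name, and the environment is passed in as a mapping so the behaviour is easy to check:

```python
import argparse
import os

def parse_mongo_uri(argv, environ=os.environ):
    """Return the URI from --mongo_uri, else from mongo_uri_env, else a local default."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--mongo_uri",
        # .get() avoids the KeyError that environ['mongo_uri_env'] raises when unset.
        default=environ.get("mongo_uri_env", "127.0.0.1:28017"),
        help="MongoDB URI",
    )
    return parser.parse_args(argv).mongo_uri

from_default = parse_mongo_uri([], {})
from_env = parse_mongo_uri([], {"mongo_uri_env": "mongodb://example-host/db"})
from_flag = parse_mongo_uri(
    ["--mongo_uri=mongodb://flag-host/db"],
    {"mongo_uri_env": "mongodb://example-host/db"},
)
print(from_default, from_env, from_flag)
```

An explicit flag wins over the environment variable, which in turn wins over the hard-coded default. The host names here are placeholders.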