How to force right resource file to be used when calling from another module - scala

Here is the scenario (newbie to Spark/Scala, so kindly bear with me):
1) I have module A with a config file under its resources folder. Class C in module A reads this config to get information about file paths.
2) I am trying to call Class C (module A) from module B (after importing module A's dependencies into module B).
3) The issue I am facing is that the Class C (module A) code, when invoked from module B, uses the config from module B instead of its own config in module A.
Note: the code works perfectly when I call it from within module A, but once I move the call to module B it uses the resource file in module B instead of the one in module A.
Both configs have the same name.

From the discussion around my original answer, which assumed Lightbend Config (commonly used in the Scala world), it's been discovered that some sort of config.xml sits in src/main/resources in each of the respective modules. These files both end up on the classpath, and each module attempts (by means unspecified at this point) to load the config.xml resource.
When asked to load a resource, the JVM always loads the first match it finds on the classpath.
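If it helps to confirm the collision, here is a minimal sketch (assuming the file is named config.xml and is loaded through the classloader) that prints which copy wins and every copy visible on the classpath:
import scala.collection.JavaConverters._

object ResourceCollisionCheck {
  def main(args: Array[String]): Unit = {
    val cl = getClass.getClassLoader
    // The copy that "wins" -- the first config.xml found on the classpath:
    println(cl.getResource("config.xml"))
    // Every config.xml on the classpath, in classpath order:
    cl.getResources("config.xml").asScala.foreach(println)
  }
}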
In a small set of projects, the easiest way to address this collision is to not collide at all: give the config in each project a different name.
An alternative which is viable in a larger set of projects is to use Lightbend Config which allows config file inclusion out of the box, as well as the ability to use environment variables to easily override configurations at runtime.
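As a rough sketch of that approach (the file name module-a.conf, the key names, and the MODULE_A_INPUT_PATH environment variable are all placeholders made up for illustration):
// In module A's src/main/resources/module-a.conf (distinct placeholder name chosen to avoid the collision):
//   moduleA.input.path = "/data/in"
//   moduleA.input.path = ${?MODULE_A_INPUT_PATH}   // optional override from an environment variable
import com.typesafe.config.ConfigFactory

object ModuleAConfig {
  // Loads module-a.conf from the classpath; a distinct name means module B's config can never shadow it.
  private val conf = ConfigFactory.load("module-a")
  val inputPath: String = conf.getString("moduleA.input.path")
}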
An elaborate strategy for a larger set of projects, depending on how compatible the XML schemas of the various modules' config.xml files are (if they're being read against a schema), is to define a custom Maven build step which embeds the config.xmls inside one another, so that code in module A and module B can share a single config.xml: A only cares about the portion of the config which came from A, and B only cares about the portion from B. I'm not particularly familiar with how one would do this in Maven, but I can't think of a reason why it wouldn't be possible.

Related

Editing Spark Module in Spark-kernel

We are currently editing a specific module in Spark. We are using spark-kernel https://github.com/ibm-et/spark-kernel to run all our Spark jobs. So what we did was recompile the code we had edited. This produces a jar file. However, we do not know how to point the code to the jar file.
It looks like it is still referencing the old script and not the newly edited and newly compiled one. Do you have some idea how to modify some Spark packages/modules and have the changes reflected in spark-kernel? If we're not going to use spark-kernel, is there a way we can edit a particular module in Spark, for example the ALS module: https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala. Thanks!
You likely edited a Scala or Java file and recompiled (even though you call them scripts, they are not scripts in the strict sense because they are not interpreted). Assuming that's what you did...
You probably then don't have a clean replacement of the resulting JAR file in the deployment you are testing. Odds are your newly compiled JAR file is somewhere, just not in the place you are observing. To get it there properly, you will have to build more than the JAR file: you will have to repackage your installable and reinstall.
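If it's unclear which JAR a class is actually coming from at runtime, a quick hedged check (standard JVM API, nothing spark-kernel specific) is to print the code source of the class you modified, for example the ALS class mentioned above:
import org.apache.spark.ml.recommendation.ALS

object WhichJar {
  def main(args: Array[String]): Unit = {
    // Prints the JAR (or directory) the ALS class was loaded from, so you can see
    // whether your rebuilt artifact is the one actually on the classpath.
    val source = Option(classOf[ALS].getProtectionDomain.getCodeSource).map(_.getLocation)
    println(source.getOrElse("unknown (no code source available)"))
  }
}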
Other techniques exist: if you can identify the unpacked item in an installation, you can sometimes copy it into place; however, such a technique is inherently unmaintainable, so I recommend it only for throwaway verification of the change and not on any system that will actually be used.
Keep in mind that with Spark, the worker nodes are sometimes deployed dynamically. If that is so, you might have to locate the installable used by the dynamic deployment system and ensure you have the right packaging there too.

sbt run - how to specify working directory? [duplicate]

I would like to be able to run the Java program in a specific directory. I think it is quite convenient to parametrize the working directory, because it makes it easy to manage configurations.
For example, in one folder you could have the configuration for tests; in another, the resources needed for production. You probably think there is the option of manipulating the classpath to include/exclude resources, but such a solution only works if you are interested in resources stored on the classpath and referenced using ClassLoader.getResource(r). But what if you have some external configuration and you want to access it using a simple instruction like File file = new File("app.properties");?
Let's look at an ordinary example.
Your application uses an app.properties file, where you store credentials for an external service. The application looks for this file in the working directory, because you use the aforementioned File file = new File("app.properties"); instruction to access it. In your tests you want to use an app.properties specific to your tests. In your integration tests you want to use an app.properties specific to another environment. And finally, when you build and release the application, you want to provide yet another app.properties file. You want to access all these resources in the same way, just by typing File file = new File("app.properties"); instead of (pseudo code):
if (configTest)
    file = new File("testWorkDir/app.properties");
else if (config2)
    file = new File("config2WorkDir/app.properties");
else
    file = new File("app.properties");
or instead of using resources on the classpath:
this.getClass.getClassLoader.getResource("app.properties");
Of course you are a clever programmer and you use a build tool such as Maven, Gradle, or sbt :)
Enough of the introduction. At last, the question:
Is there a way to set the working directory in Java, and if so, how do you configure it in build tools (especially in sbt)?
Additional info:
Changing the 'user.dir' system property does not work (I've tried to change it programmatically).
In sbt, changing the 'working directory' via the baseDirectory setting for tests changes baseDirectory, which is not the base dir in my understanding, and it is not equal to new java.io.File(".").getAbsolutePath.
Providing an environment variable like YOUR_APP_HOME and referencing resources from this path is feasible, but it requires you to remember about it in your code.
In sbt, changing the 'working directory' via the baseDirectory setting for tests changes baseDirectory, which is not the base dir in my understanding, and it is not equal to new java.io.File(".").getAbsolutePath.
I'm not sure what the above statement means, but with sbt you need to fork to change your working directory during the run or test. This is documented in Enable forking and Change working directory.
If you fork, you can control everything, including the working directory.
http://www.scala-sbt.org/0.13.5/docs/Detailed-Topics/Forking.html
Example code:
fork in run := true
baseDirectory in run := file("/path/to/working/directory/")
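The same idea applies to tests; here is a hedged sketch for an sbt 0.13 build (the path is a placeholder, and forking is what makes the working-directory change take effect):
fork in Test := true
baseDirectory in Test := file("/path/to/test/working/directory/")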

Configuration centralization for Play2 Scala; or how to stop hardcoding variables

At the moment, I'm hardcoding several variables like resource names and ports. I would like to move them out of my code.
What are the recommended means of implementing a central configuration outside the actual code? A file, maybe. That way, while production and development use the same git repository, the configurations would be separate. I am using the Play 2 Framework with Scala.
I would suggest using the Typesafe Config library. It loads and parses files that can be a mix of .properties style, JSON, or extended JSON (called HOCON - "Human-Optimized Config Object Notation"), and is the configuration style used by Play 2 itself (and Akka, Spray, and a quickly growing list of other libraries).
In Play's standard application.conf file you can include files like so:
include "file:///..."
In this file you could override what properties you need to.
Additionally (as documented in the excellent Play docs), one can specify conf files during app startup like so:
Using -Dconfig.file
You can also specify another local configuration file not packaged into the application artifacts:
$ start -Dconfig.file=/opt/conf/prod.conf
Using -Dconfig.url
You can also specify a configuration file to be loaded from any URL:
$ start -Dconfig.url=http://conf.mycompany.com/conf/prod.conf
Note that you can always reference the original configuration file in a new prod.conf file using the include directive, such as:
include "application.conf"
key.to.override=blah
Configuration is largely a matter of taste, but Typesafe Config is one of the common libraries to use in the Scala/Play ecosystem (e.g. it is used in Akka).
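To tie this back to the original question (hardcoded resource names and ports), here is a minimal sketch of reading such values from application.conf with Typesafe Config; the keys below are placeholders, not anything Play defines:
// In conf/application.conf (placeholder keys):
//   service.port = 9000
//   service.resourceName = "widgets"
import com.typesafe.config.ConfigFactory

object AppSettings {
  private val conf = ConfigFactory.load()  // loads application.conf from the classpath
  val port: Int = conf.getInt("service.port")
  val resourceName: String = conf.getString("service.resourceName")
}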

Using a variable like $(ProjectDir) in Integration Services Package (.dtsx)

Is this possible?
I have a package that needs to be copied to three different servers. Each server is used for a different testing environment. All three servers have the same directory layout. The layout is as follows:
\SERVER\ConfigFiles <- Here go the .dtsConfig files.
\SERVER\Packages <- Here go the .dtsx files.
I want to be able to use the same package across the three different servers without any modification. The only difference amongst the three servers would be the content of the .dtsConfig file. The config files contain the directories for the Excel files, logs, and the SQL Server connection for each environment.
For example, let's say I have a package called Cars.dtsx. This package is EXACTLY the same on all three servers. The package points to a .dtsConfig file in the ConfigFiles folder (which exists on all three servers). I want a way for the package to point to the ConfigFiles\Cars.dtsConfig file on each server, but without having to provide the name of the server in the directory.
The way I tried it is using "$(ProjectDir)..\ConfigFiles\Cars.dtsConfig", which seems to work if I run the package through the .sln file rather than the .dtsx file.
I hope that wasn't too confusing. Let me know if you need any more info. Thanks.
Unless I'm missing some nuance, you don't need to do anything special.
Your package is going to have a hard-coded reference to D:\ConfigFiles\Cars.dtsConfig. It won't matter whether that package is run from ServerA, ServerB, or ServerZ (as long as you have the same file structure on those servers).
By virtue of your asking the question, are you experiencing something different?

Is it Common to use MEF with a config file for specifying a plugin path?

I have heard that MEF reduces the need for creating config files, but if I have a few different plugin paths that vary depending on the client running the app, is it common and a good idea to have a config file that specifies the correct path? I want to avoid looping through all the DLLs.
Generally, people have a well-known plugin directory under where the application is running from, i.e. \Extensions. That said, there isn't any particular reason you cannot use a configuration file for directories or for exact extension assemblies.