Presto Plugins: Single JAR vs Multiple JARs

My Presto plugin has two components: some UDFs (for basic MD5/SHA1 hashing) and an EventListener (for logging queries using the Fluentd logger).
During development (single-node Presto cluster), I added them under a single Plugin class, bundled a single JAR, and faced no problems.
During deployment I found a pitfall: the UDFs must be registered with all nodes, whereas (my particular) EventListener must be registered only with the master node.
Now I have two options:
1. Bundle them together in a single JAR
We can control registration of the UDFs / EventListeners via an external config file (different configs for master and slave nodes). As more UDFs, EventListeners, and other SPI implementations are added, a single JAR paired with a tweaked config file will achieve the desired result.
2. Bundle them as separate JARs
We can create different Plugin classes for the UDFs / EventListener and list the corresponding class names in the META-INF/services/com.facebook.presto.spi.Plugin file through Jenkins. We'll then have different JARs for different components: one JAR for all UDFs, one JAR for all EventListeners, etc. However, as more functionality is added in the future, we might end up with lots of different JARs.
My questions are:
What are the pros and cons of both techniques?
Is there an alternate approach?
I'm currently on Presto 0.194 but will soon be upgrading to Presto 0.206

Either way works; you can do whichever is easiest for you. There's actually a third option in the middle, which is to have multiple Plugin implementations in a single JAR (you would list all the implementations in the META-INF/services file).
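For illustration, a minimal sketch of that third option. The class names (UdfPlugin, EventListenerPlugin, HashingFunctions, FluentdEventListenerFactory) are made up for the example; only the Presto SPI types are real.

// src/main/resources/META-INF/services/com.facebook.presto.spi.Plugin
//   com.example.presto.UdfPlugin
//   com.example.presto.EventListenerPlugin

// UdfPlugin.java -- registers only the scalar functions
package com.example.presto;

import com.facebook.presto.spi.Plugin;

import java.util.Collections;
import java.util.Set;

public class UdfPlugin implements Plugin
{
    @Override
    public Set<Class<?>> getFunctions()
    {
        // HashingFunctions is assumed to hold the MD5/SHA1 UDFs
        return Collections.<Class<?>>singleton(HashingFunctions.class);
    }
}

// EventListenerPlugin.java -- registers only the event listener factory
package com.example.presto;

import com.facebook.presto.spi.Plugin;
import com.facebook.presto.spi.eventlistener.EventListenerFactory;

import java.util.Collections;

public class EventListenerPlugin implements Plugin
{
    @Override
    public Iterable<EventListenerFactory> getEventListenerFactories()
    {
        return Collections.<EventListenerFactory>singletonList(new FluentdEventListenerFactory());
    }
}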
EventListener is actually used on both the coordinator and workers. Query events happen on the coordinator and split events happen on the workers. However, if you only care about query events, you only need it on the coordinator.
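For reference, a minimal sketch of a listener that only cares about query events; the FluentdEventListener name and the logging body are assumptions, only the SPI types come from Presto.

// FluentdEventListener.java
package com.example.presto;

import com.facebook.presto.spi.eventlistener.EventListener;
import com.facebook.presto.spi.eventlistener.QueryCompletedEvent;

import java.util.Map;

public class FluentdEventListener implements EventListener
{
    private final Map<String, String> config;

    public FluentdEventListener(Map<String, String> config)
    {
        this.config = config;
    }

    @Override
    public void queryCompleted(QueryCompletedEvent event)
    {
        // Only query events are handled, so this listener does useful work only on
        // the coordinator; splitCompleted is left at its default no-op.
        String queryId = event.getMetadata().getQueryId();
        String sql = event.getMetadata().getQuery();
        // Forward queryId/sql to FluentD here (the actual logger call is omitted in this sketch).
        System.out.println("completed query " + queryId + ": " + sql);
    }
}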
You can deploy the event plugin on both the coordinator and the workers but only configure it on the coordinator. The code will only be used if you configure it by adding an event-listener.properties file with an event-listener.name property that matches the name returned by your EventListenerFactory.getName() method.
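A minimal sketch of the factory side, under the same assumptions as above (the name "fluentd-logger" and the class names are made up for the example):

// etc/event-listener.properties (coordinator only)
//   event-listener.name=fluentd-logger

// FluentdEventListenerFactory.java
package com.example.presto;

import com.facebook.presto.spi.eventlistener.EventListener;
import com.facebook.presto.spi.eventlistener.EventListenerFactory;

import java.util.Map;

public class FluentdEventListenerFactory implements EventListenerFactory
{
    @Override
    public String getName()
    {
        // must match event-listener.name in event-listener.properties
        return "fluentd-logger";
    }

    @Override
    public EventListener create(Map<String, String> config)
    {
        return new FluentdEventListener(config);
    }
}

On nodes that have no event-listener.properties file the factory is simply never invoked, so shipping it in the JAR everywhere is harmless.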

Related

How to read an external config file when starting a bundle (JBoss Fuse / Karaf)

The problem is simple: I want to print all topics from Apache Kafka after installing the Kafka module on Karaf. I need to get properties from a cfg file located in jbossfuse/etc and create a KafkaConsumer object. I want to implement BundleActivator so I can run a method at the moment the module is installed.
The question is: how can I get properties from the config file?
I found a suggested solution which said "you can use the ConfigAdmin service from the OSGi spec." How can I use it? All examples with code are welcome.
Karaf uses Felix FileInstall to read config files: http://felix.apache.org/documentation/subprojects/apache-felix-file-install.html
So if there is a file named kafka.cfg, it will pick it up and register a configuration with the ConfigAdmin service under the PID 'kafka'.
You can fetch the ConfigAdmin service using an Activator and read the config from there, but I strongly recommend using Declarative Services or Blueprint instead to interact with the OSGi framework; both support injection of the configuration when it is available (see the sketch after the list below).
Otherwise you have to deal with the following issues yourself:
there is no ConfigAdmin service yet (maybe because your bundle starts earlier)
the ConfigAdmin service changes (for example due to a package refresh or update)
the configuration is not yet registered (because Felix has not read it yet)
the configuration gets updated (for example, someone changes the file)
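Here is a minimal Declarative Services sketch, assuming the file is named etc/kafka.cfg (so the PID is "kafka") and that it contains a bootstrap.servers entry; the class and property names are illustrative only.

// KafkaTopicPrinter.java
package com.example.kafka;

import java.util.Map;
import java.util.Properties;

import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.osgi.service.component.annotations.Activate;
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.ConfigurationPolicy;
import org.osgi.service.component.annotations.Deactivate;

// REQUIRE makes DS wait until the 'kafka' configuration actually exists,
// which sidesteps the "not yet registered" problems listed above.
@Component(configurationPid = "kafka", configurationPolicy = ConfigurationPolicy.REQUIRE)
public class KafkaTopicPrinter
{
    private KafkaConsumer<String, String> consumer;

    @Activate
    void activate(Map<String, Object> config)
    {
        // The map is populated from etc/kafka.cfg via ConfigAdmin; no Activator needed.
        Properties props = new Properties();
        props.put("bootstrap.servers", config.get("bootstrap.servers")); // assumed to be present in kafka.cfg
        props.put("group.id", config.getOrDefault("group.id", "karaf-demo"));
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        consumer = new KafkaConsumer<>(props);
        consumer.listTopics().keySet().forEach(System.out::println);
    }

    @Deactivate
    void deactivate()
    {
        if (consumer != null) {
            consumer.close();
        }
    }
}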

How to represent a dependency relationship between node and artifact in UML Deployment Diagram?

I have a web application that reads from a file. My application is represented as a node and the file as an artifact. Can I use a dashed arrow to represent their relationship?
I have never used artifacts the way you want to, but it seems legal.
Artifact (p. 654): An Artifact represents some (usually reifiable) item of information that is used or produced by a software development process or by operation of a system. Examples of Artifacts include model files, source files, scripts, executable files, database tables, development deliverables, word-processing documents, and mail messages.
A log file is produced by an operation of the system, I guess. Guys, what do you think?
And there is a stereotype in the standard profile: «Create»: A usage dependency denoting that the client classifier creates instances of the supplier classifier. (p. 678)
So if you want to model that your web server creates an instance of LogFile, the following schema should do the job.
More remarks:
the web server is an execution environment deployed on a node.
your web application is an artifact running on the execution environment.
That's perfectly ok. You could optionally stereotype the dependency with <<create>>.

How to exclude development.conf from the Docker image built for a Play Framework application artifact

Using Scala Play Framework 2.5,
I build the app into a JAR using the sbt plugin PlayScala,
and then build and push a Docker image out of it using the sbt plugin DockerPlugin.
The file conf/development.conf resides in the source code repository (in the same place as application.conf).
The last line in application.conf says include development, which means that if development.conf exists, its entries override some of the entries in application.conf in a way that provides all the default values necessary to make the application runnable locally right out of the box after the source is cloned from source control, with zero extra configuration. This technique lets every new developer slip right into a working application without wasting time on configuration.
The only missing piece to make that architectural design complete is finding a way to exclude development.conf from the final runtime of the app - otherwise the overrides leak into the production runtime and, obviously, the application fails to run.
That can be achieved in various different ways.
One way could be to somehow inject logic into the build task (provided as part of the sbt plugin PlayScala, I assume) to exclude the file from the JAR artifact.
Another way could be to inject logic into the Docker image creation process; this logic could manually delete development.conf from the existing JAR prior to executing it (assuming that's possible).
If you have ever implemented one of the ideas offered,
or maybe a different architectural approach that gives the same "works out of the box" feature, please be kind enough to share :)
I usually have the inverse logic:
I use the application.conf file (which Play uses by default) with all the things needed to run locally. I then have a production.conf file that starts by including application.conf and then overrides the necessary settings.
For deploying to production (or staging) I specify the production/staging.conf file to be used.
This is how I eventually solved it.
conf/application.conf is the production-ready configuration; it contains placeholders for environment variables whose values will be injected at runtime by k8s, as defined in the service's deployment.yaml file.
Right next to it sits conf/development.conf - its first line is include application.conf and the rest of it consists of overrides that make the application run out of the box right after git clone with a simple sbt run.
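A minimal sketch of what the two files could look like under that scheme (the db.url key and the DATABASE_URL variable are made up for the example; HOCON expects the include target in quotes):

# conf/application.conf -- production ready, values come from the environment
db.url = ${?DATABASE_URL}

# conf/development.conf -- local defaults, only picked up through PlayKeys.devSettings
include "application.conf"
db.url = "jdbc:h2:mem:local"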
What makes the above work is the addition of the following to build.sbt:
PlayKeys.devSettings := Seq(
  "config.resource" -> "development.conf"
)
Works like a charm :)
This can be done via the mappings config key of sbt-native-packager:
mappings in Universal ~= (_.filterNot(_._1.name == "development.conf"))

SpringXD/Spring Integration: Using different versions of spring-integration-kafka for producer and consumer

I have the following configuration:
Spring-integration-kafka 1.3.1.RELEASE
I have a custom kafka-sink and a custom kafka-source
The configuration I want to have:
I'd like to keep using Spring-integration-kafka 1.3.1.RELEASE with my custom kafka-sink.
I'm changing my kafka-source logic to use Spring-integration-kafka 2.1.0.RELEASE. I noticed that the way a consumer/producer is implemented is quite different from prior versions of Spring-integration-kafka.
My question is: could I face some compatibility issues?
I'm using Rabbit.
You should be OK then; it would probably work with the newer Kafka jars in the source's /lib directory, since each module is loaded in its own classloader, so there should be no clashes with the xd/lib jars.
However, you might have to remove the old Kafka jars from the xd/lib directory (which is why I asked about the message bus).

NServiceBus Command location

All,
A quick question, if you will, related to the location of commands. We have two hosts: the first will issue commands, the second will receive those commands.
The hosts exist in different ecosystems/bounded contexts, and therefore I'm trying to determine the best location for the commands.
Do you think the commands project should reside with the sender (in the sender's solution) or with the receiver?
They could be kept entirely independent in a separate solution, but that doesn't solve the location issue, as they're hosted in an internal NuGet instance.
Thoughts?
With either commands or events, we tend to place them outside of the consuming projects in a common area and build them separately after initial development. We have the build generate the NuGet packages and then reference those from the consuming projects. Enabling package restore ensures the consumers' builds work correctly.
As Adam stated, messages (commands and events) are contracts and should be located in a common project; the two consuming projects have a dependency on the messages they send/publish and handle. You can put the messages in separate projects (and/or namespaces) based on the service that owns them.