Creating and using a custom Kafka Connect configuration provider - apache-kafka

I have installed and tested Kafka Connect in distributed mode; it works and connects to the configured sink and reads from the configured source.
That being the case, I moved on to enhancing my installation. The one area I think needs immediate attention is the fact that the only available means of creating a connector is through REST calls, which means I need to send my information over the wire, unprotected.
In order to secure this, Kafka introduced the new ConfigProvider seen here.
This is helpful, as it allows you to set properties on the server and then reference them in the REST call, like so:
{
  ...
  "property": "${file:/path/to/file:nameOfThePropertyInFile}"
  ...
}
This works really well, just by adding the property file on the server and adding the following config in the distributed.properties file:
config.providers=file # multiple comma-separated provider types can be specified here
config.providers.file.class=org.apache.kafka.common.config.provider.FileConfigProvider
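For example (the file path and property name here are only illustrative), if a file /opt/connect/secrets.properties on the worker contains:
nameOfThePropertyInFile=mySecretValue
then the worker substitutes mySecretValue wherever a connector config contains ${file:/opt/connect/secrets.properties:nameOfThePropertyInFile}, so the secret value itself never travels in the REST call.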
While this solution works, it does not really ease my concerns regarding security, as the information has gone from being sent over the wire to sitting in a repository, in plain text for everyone to see.
The Kafka team foresaw this issue and allows clients to provide their own configuration providers by implementing the ConfigProvider interface.
I have created my own implementation and packaged it in a jar, giving it the suggested final name:
META-INF/services/org.apache.kafka.common.config.provider.ConfigProvider
and added the following entries to the distributed.properties file:
config.providers=cust
config.providers.cust.class=com.somename.configproviders.CustConfigProvider
However, I am getting an error from Connect stating that a class implementing ConfigProvider with the name:
com.somename.configproviders.CustConfigProvider
could not be found.
I am at a loss now, because the documentation on their site does not explain very well how to configure custom config providers.
Has someone worked on a similar issue and could provide some insight into this? Any help would be appreciated.

I just went through these steps to set up a custom ConfigProvider recently. The official doc is ambiguous and confusing.
I have created my own implementation and packaged it in a jar, giving it the suggested final name:
META-INF/services/org.apache.kafka.common.config.provider.ConfigProvider
You can name the jar file whatever you like, but it needs to be packed in jar format, with a .jar suffix.
Here is the complete step by step. Suppose your custom ConfigProvider's fully qualified name is com.my.CustomConfigProvider.MyClass.
1. Create a file at the path META-INF/services/org.apache.kafka.common.config.provider.ConfigProvider inside the jar. The file content is the fully qualified class name:
com.my.CustomConfigProvider.MyClass
2. Include your source code and the above META-INF folder when generating the jar package. If you are using Maven, put the META-INF folder under src/main/resources so that it ends up at the root of the jar.
3. Put your final jar file, say custom-config-provider-1.0.jar, under the Kafka worker plugin folder. The default is /usr/share/java; it is whatever the plugin path (PLUGIN_PATH) is set to in the Kafka worker config file.
4. Upload all the dependency jars to the plugin path as well. Use the META-INF/MANIFEST.MF file inside your jar to configure the Class-Path of the dependent jars that your code will use.
5. In the Kafka worker config file, create two additional properties (shown here in the environment-variable form used by containerized workers; in a plain properties file they correspond to config.providers and config.providers.mycustom.class):
CONNECT_CONFIG_PROVIDERS: 'mycustom'  // alias name of your ConfigProvider
CONNECT_CONFIG_PROVIDERS_MYCUSTOM_CLASS: 'com.my.CustomConfigProvider.MyClass'
6. Restart the workers.
7. Update your connector config file by POSTing it to the Kafka Connect REST API with curl. In the connector config file, you can reference a value inside the ConfigData returned from ConfigProvider.get(path, keys) by using syntax like:
database.password=${mycustom:/path/pass/to/get/method:password}
Here ConfigData wraps a map that contains {password: 123}.
If you are still seeing a ClassNotFoundException, your Class-Path is probably not set up correctly.
Note:
• If you are using AWS ECS/EC2, you need to set the worker config by setting environment variables.
• The worker config file and the connector config file are different things.
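For reference, here is a minimal sketch of what the provider class itself could look like. Only the ConfigProvider interface and the ConfigData class come from Kafka; the secret-resolution logic is a placeholder you would replace with your own lookup/decryption code:

package com.my.CustomConfigProvider;

import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import org.apache.kafka.common.config.ConfigData;
import org.apache.kafka.common.config.provider.ConfigProvider;

public class MyClass implements ConfigProvider {

    @Override
    public void configure(Map<String, ?> configs) {
        // receives any config.providers.mycustom.param.* settings from the worker config
    }

    @Override
    public ConfigData get(String path) {
        // return all key/value pairs available at this path
        Map<String, String> data = new HashMap<>();
        data.put("password", resolveSecret(path, "password")); // placeholder key
        return new ConfigData(data);
    }

    @Override
    public ConfigData get(String path, Set<String> keys) {
        // return only the requested keys, e.g. 'password' from ${mycustom:/some/path:password}
        Map<String, String> data = new HashMap<>();
        for (String key : keys) {
            data.put(key, resolveSecret(path, key));
        }
        return new ConfigData(data);
    }

    private String resolveSecret(String path, String key) {
        // placeholder: look up and decrypt the secret however you store it
        return "decrypted-" + key;
    }

    @Override
    public void close() {
        // release any resources held by the provider
    }
}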

Related

How to read multiple config file from Spring Cloud Config Server

Spring Cloud Config Server supports reading property files named ${spring.application.name}.properties. However, I have two properties files in my application:
a.properties
b.properties
Can I get the config server to read both these properties files?
Rename your properties files in git or on the file system where your config server is looking:
a.properties -> <your_application_name>.properties
b.properties -> <your_application_name>-<profile-name>.properties
For example, if your application name is test and you are running your application with the dev profile, the two properties files below will be used together:
test.properties
test-dev.properties
You can also specify additional profiles in the bootstrap.yml of your config client to retrieve more properties files, like below. For example:
spring:
  profiles: dev
  cloud:
    config:
      uri: http://yourconfigserver.com:8888
      profile: dev,dev-db,dev-mq
If you specify it like the above, all of the files below will be used together:
test.properties
test-dev.properties
test-dev-db.properties
test-dev-mq.properties
Note that the provided answer assumes your property files address different execution profiles. If they don't, i.e., your properties are split into different files for some other reason (e.g. maintenance purposes, division by business/functional domain, or any other reason that suits your needs), then by defining a profile for each such file you are just "abusing" the profile feature to achieve your goal (multiple property files per app).
You could then ask, "OK, so what is the problem with that?" The problem is that you restrain yourself from various possibilities that you would otherwise have. If you actually want to customize your application configuration by profile, you will have to create pseudo sub-profiles, since the file name is already a profile. Example:
Your application configuration could be customized by different profiles, which you use inside your Spring Boot application (e.g. in @Profile() annotations); let them be dev, uat, prod. You can boot your application with different profiles active, e.g. 'dev' vs 'uat', and get the group of properties that you desire. For your a.properties, b.properties and c.properties files, if different file names were supported, you would have a-dev.properties, b-dev.properties and c-dev.properties vs a-uat.properties, b-uat.properties and c-uat.properties for the 'dev' and 'uat' profiles.
Nevertheless, with the provided solution, you have already defined three profiles, one for each file (appname-a.properties, appname-b.properties and appname-c.properties): a, b, and c. Now imagine you have to create a different profile for each... profile (it already shows that something goes wrong here)! You would end up with a lot of profile permutations (which would get worse as files increase): the files would be appname-a-dev.properties, appname-b-dev.properties, appname-c-dev.properties vs appname-a-uat.properties, appname-b-uat.properties, appname-c-uat.properties, but the profiles would have grown from ['dev', 'uat'] to ['a-dev', 'b-dev', 'c-dev', 'a-uat', 'b-uat', 'c-uat']!
Even worse, how are you going to cope with all these profiles inside your code, and more specifically in your @Profile() annotations? Will you clutter the code space with "artificial" profiles just because you want to add one or two more property files (see the sketch below)? It should be sufficient to define your dev or uat profiles where applicable, and define somewhere else the applicable property file names (which could then be further varied by profile, without any other configuration action), just as happens in the externalized properties configuration for individual Spring Boot apps.
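To make the clutter concrete, here is a hypothetical sketch (the profile and class names are made up):

import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Profile;

// With real execution profiles, a bean group is gated cleanly:
@Configuration
@Profile("dev")
class DevOnlyConfig {
    // dev-only beans go here
}

// With file-name pseudo-profiles, the same gate has to enumerate permutations:
@Configuration
@Profile({"a-dev", "b-dev", "c-dev"})
class DevOnlyConfigWithPseudoProfiles {
    // the same beans, behind a cluttered activation condition
}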
For argument completeness, I will just add that if you want to switch to .yml property files one day, with the provided profile-based naming solution you also lose the ability to define different "yaml document sections per profile" inside the same .yml file. (Yes, in .yml you can have one property file yet define multiple logical yaml documents inside it, which is usually done to customize the properties for different profiles while keeping all related properties in one place.) You lose this ability because you have already used the profile in the file name (appname-profile.yml).
I have issued a pull request with a minor fix for spring-cloud-config-server 1.4.x, which allows defining additionally supported file names (apart from "application[-profile]" and "{appname}[-profile]", which are currently supported) by providing a spring.cloud.config.server.searchNames environment property, analogous to spring.config.name for Spring Boot apps. I hope it gets reviewed and accepted.
I came across the same requirement lately, with the additional constraint that I was not allowed to play around with the environment profiles, so I couldn't do what the accepted answer suggests. I'm sharing how I did it as an alternative for those who might have the same case as me.
In my application, I have properties such as:
appxyz-data-soures.properties
appxyz-data-soures-staging.properties
appxyz-data-soures-production.properties
appxyz-interfaces.properties
appxyz-interfaces-staging.properties
appxyz-interfaces-production.properties
appxyz-feature.properties
appxyz-feature-staging.properties
appxyz-feature-production.properties
application.properties // for my use, contains local properties only
bootstrap.properties // for my use, contains management properties only
In my application, I have these particular properties set that allow me to achieve what I needed. Note that I have the rest of the needed config as well (enabling cloud config, actuator refresh, eureka service discovery and so on); I'm just highlighting these for emphasis:
spring.application.name=appxyz
spring.cloud.config.name=appxyz-data-soures,appxyz-interfaces,appxyz-feature
You can observe that I didn't want to play around with my application name, but instead used it as a prefix for my config property files.
In my configuration server, I configured application.yml to capture the pattern 'appxyz-*':
spring:
  cloud:
    config:
      server:
        git:
          uri: <git repo default>
          repos:
            appxyz:
              pattern: 'appxyz-*'
              uri: <another git repo if you have 1 repo per app>
              private-key: ${git.appxyz.pk}
              strict-host-key-checking: false
              ignore-local-ssh-settings: true
          private-key: ${git.default.pk}
In my Git repository I have the following. There is no application.properties or bootstrap file, because I didn't want those to be published and overridden/refreshed externally, but you can include them if you want:
appxyz-data-soures.properties
appxyz-data-soures-staging.properties
appxyz-data-soures-production.properties
appxyz-interfaces.properties
appxyz-interfaces-staging.properties
appxyz-interfaces-production.properties
appxyz-feature.properties
appxyz-feature-staging.properties
appxyz-feature-production.properties
It is the pattern 'appxyz-*' that will capture and return the matching files from my git repository. The active profile will also apply and fetch the correct property files accordingly. The prioritization of values is also preserved.
Furthermore, if you wish to add another file to your application (say appxyz-circuit-breaker.properties), you only need to:
1. Add the name to spring.cloud.config.name=...,appxyz-circuit-breaker
2. Add copies of the file locally and also externally (in the git repo).
No need to add or modify anything more, or to restart your configuration server later on. For a new application, it's like a one-time registration to add an entry under the repos section of application.yml.
Hope it helps in one way or another!
In your application's bootstrap.properties, you have to specify it like below:
spring.application.name=a,b

IBM Integration Bus: The PIF data could not be found for the specified application

I'm using IBM Integration Bus v10 (previously called IBM Message Broker) to expose COBOL routines as SOAP web services.
The COBOL routines are integrated into IIB through MQ queues.
We have imported some COBOL copybooks as DFDL schemas in IIB, and the mapping between SOAP messages and DFDL messages is working fine.
However, when the message reaches a node where a serialization of the message tree has to take place (for example, a FileOutput or an MQ request), it fails with the following error:
"The PIF data could not be found for the specified application"
This is the last part of the stack trace of the exception:
RecoverableException
File:CHARACTER:F:\build\slot1\S000_P\src\DataFlowEngine\TemplateNodes\ImbOutputTemplateNode.cpp
Line:INTEGER:303
Function:CHARACTER:ImbOutputTemplateNode::processMessageAssemblyToFailure
Type:CHARACTER:ComIbmFileOutputNode
Name:CHARACTER:MyCustomFlow#FCMComposite_1_5
Label:CHARACTER:MyCustomFlow.File Output
Catalog:CHARACTER:BIPmsgs
Severity:INTEGER:3
Number:INTEGER:2230
Text:CHARACTER:Caught exception and rethrowing
Insert
Type:INTEGER:14
Text:CHARACTER:Kcilmw20Flow.File Output
ParserException
File:CHARACTER:F:\build\slot1\S000_P\src\MTI\MTIforBroker\DfdlParser\ImbDFDLWriter.cpp
Line:INTEGER:315
Function:CHARACTER:ImbDFDLWriter::getDFDLSerializer
Type:CHARACTER:ComIbmSOAPInputNode
Name:CHARACTER:MyCustomFlow#FCMComposite_1_7
Label:CHARACTER:MyCustomFlow.SOAP Input
Catalog:CHARACTER:BIPmsgs
Severity:INTEGER:3
Number:INTEGER:5828
Text:CHARACTER:The PIF data could not be found for the specified application
Insert
Type:INTEGER:5
Text:CHARACTER:MyCustomProject
It seems like something is missing from my deployable BAR file. It's important to say that my application contains the message flow, and it depends on a shared library that has all the .xsd files (DFDLs).
I suppose the schemas are OK, as I generated them using the Toolkit wizard, and message parsing works well. The problem is only with serialization.
Does anybody know what may be missing here?
OutputRoot.Properties.MessageType must contain the name of the message in the DFDL schema. Additionally, when the DFDL schema is in a shared library, OutputRoot.Properties.MessageSet must contain the name of the library; a sketch of setting both fields follows.
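One way to set both fields is in a Compute node wired in before the output node; here is a minimal Java Compute node sketch (it assumes the Properties fields already exist in the tree, and the library and message names are placeholders):

import com.ibm.broker.javacompute.MbJavaComputeNode;
import com.ibm.broker.plugin.MbElement;
import com.ibm.broker.plugin.MbException;
import com.ibm.broker.plugin.MbMessage;
import com.ibm.broker.plugin.MbMessageAssembly;

public class SetDfdlTargetJavaCompute extends MbJavaComputeNode {
    @Override
    public void evaluate(MbMessageAssembly inAssembly) throws MbException {
        // copy the input message so we can modify its Properties folder
        MbMessage outMessage = new MbMessage(inAssembly.getMessage());
        MbElement props = outMessage.getRootElement().getFirstElementByPath("Properties");
        // name of the shared library containing the DFDL schema (placeholder)
        props.getFirstElementByPath("MessageSet").setValue("MySharedLibrary");
        // name of the message in the DFDL schema (placeholder)
        props.getFirstElementByPath("MessageType").setValue("{}:MyCobolMessage");
        // propagate the modified assembly to the next node
        getOutputTerminal("out").propagate(new MbMessageAssembly(inAssembly, outMessage));
    }
}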
It sounds as if OutputRoot.Properties is not pointing at the shared library. I cannot remember which subfield does that job; it is either OutputRoot.Properties.MessageType or OutputRoot.Properties.MessageSet.
You can easily check: just inspect the contents of InputRoot.Properties after an input node that has used the same shared library.
I faced a similar problem. In my case, a message flow with an HttpRequest node using a DFDL domain parser/format to parse an HTTP response from the remote system threw this error ("The PIF data could not be found for the specified application"). Re-selecting the same parser domain and message type on the node, followed by a build and redeploy, solved the problem. It seemed to be a project-reference issue within the IIB Toolkit.
You need to create static libraries and reference them from the application.
In the Compute node, your coding is based on the DFDL body.

Two Configuration files in Scala-Spray framework

I have a REST API that is developed using Scala and the Spray framework. I am able to run and launch my API from localhost. The API is connected to a database; the IP address (localhost) and port of the database are read from the application.conf file under the resources.
Everything works fine until I start using Docker. In Docker I have:
1. One Docker container for the REST API
2. One Docker container for the database.
The IP address of the database changes for each Docker instance, therefore I need to update my application.conf file, although I can use the hostname of the DB instance, which remains the same.
My question is: can I have two application.conf files, one for localhost and one for the Docker instance? Is there a way to change the application.conf file at runtime?
P.S. I am using sbt run to run the application, and as per the documentation it does not support Java system properties or environment variables.
Yes, you can choose the config at runtime. Spray and Akka use the Typesafe Config library, which allows setting single settings or the whole configuration using JVM properties.
From the documentation of config:
For applications using application.{conf,json,properties}, system properties can be used to force a different config source:
• config.resource specifies a resource name - not a basename, i.e. application.conf not application
• config.file specifies a filesystem path; again, it should include the extension, not be a basename
• config.url specifies a URL
These system properties specify a replacement for application.{conf,json,properties}, not an addition. They only affect apps using the default ConfigFactory.load() configuration. In the replacement config file, you can use include "application" to include the original default config file; after the include statement you could go on to override certain settings.
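For example, you could ship an alternative config file for Docker that reuses the defaults (the file name docker.conf and the db.host key are made up for illustration):

// docker.conf - an alternative config packaged next to application.conf
include "application"   // start from the default application.conf
db.host = "database"    // override the DB address with the Docker hostname

and select it when starting the packaged app:

java -Dconfig.resource=docker.conf -jar my-api.jar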

Multiple sling configs with same name

If there are multiple Sling configuration nodes with the same name in CRX and I invoke configAdmin.getConfiguration in my OSGi service, which config value will it pick? I have multiple config directories under apps, like config.qa, config.local and config, which have the same config node. How do I make CQ5 pick config.qa instead of config? I did add the property sling.run.mode=publish,qa in the sling.properties file, but it is still picking up the properties defined under the config folder instead of config.qa. Why isn't it picking the props from the config.qa folder, as described in the documentation at http://docs.adobe.com/docs/en/cq/5-4/deploying/configuring_osgi.html?
It should always pick the one with the most specific match. For example, if your run modes are author,dev,intranet and you have configs like config.author, config.dev, config.intranet and config.dev.intranet, then config.dev.intranet will be chosen. Make sure that you override your common config across these folders to make this work. Please check http://www.wemblog.com/2012/10/how-to-work-with-configurations-in-cq.html for more detail; an illustration follows below.
Yogesh
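To illustrate the resolution described above (the folder paths and service PID here are hypothetical): with run modes publish,qa active, a config node at
/apps/myapp/config.qa/com.myapp.MyService
(a sling:OsgiConfig node named after the service PID) wins over
/apps/myapp/config/com.myapp.MyService
because the config.qa folder name matches an active run mode, and run-mode-qualified folders take precedence over the plain config folder.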

Change log file name during runtime - ent lib

I have a WCF service that will serve multiple clients. I'm using Enterprise Library for the logging.
I'd like to have a different log file for each client. Is there a way to change the file name back and forth?
I found a few threads, but they all talk about editing the config file at runtime.
Also found this: Enterprise Library Logging, but it talks about environment variables. I will set the log name according to the client id.
Thanks
Avi
You can have distinct categories linked to individually configured FlatFile or RollingFile trace listeners for each client.
If the filenames are unknown until runtime, consider using the fluent API for configuration, like so:
http://msdn.microsoft.com/en-us/library/ff664363(PandP.50).aspx#fluent_api_logging